@honnibal , I finally got this working. Please look at the code below and let me know if it's OK. I added `deepcopy(vectors)` because the vectors were getting reset when restoring `best_model`. If this is OK, I have a path forward now.

One clarification, though: since we save the vectors and put them back afterwards, do the saved vectors contain what was learned during training, or are they still the static vectors from the word2vec model? Or should we move the
`nlp.vocab.vectors = vectors` to before `nlp.vocab.vectors = Vectors()`
so that the updated vectors are the ones that get saved?
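For context on why the `deepcopy` was needed: if deserialization clears the vectors table in place, a plain reference to it sees the cleared table too, while a deep copy taken beforehand is an independent snapshot. A minimal pure-Python sketch of this aliasing behavior, using a dict as a hypothetical stand-in for the vectors table:

```python
from copy import deepcopy

# Hypothetical stand-in for nlp.vocab.vectors: a mutable table of word vectors.
vectors = {"apple": [0.1, 0.2], "pear": [0.3, 0.4]}

# A plain reference is not enough: if deserialization empties the table
# in place, the reference sees the emptied table as well.
reference = vectors
vectors.clear()            # simulates the table being reset to length 0
print(len(reference))      # 0 -- the reference was emptied too

# A deepcopy taken beforehand is an independent snapshot and survives.
vectors = {"apple": [0.1, 0.2], "pear": [0.3, 0.4]}
snapshot = deepcopy(vectors)
vectors.clear()
print(len(snapshot))       # 2 -- the snapshot is unaffected
```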
```python
# Save the vectors before training so they can be restored around serialization
vectors = nlp.vocab.vectors
print("length of vectors is", len(vectors))
for i in range(n_iter):
    loss = 0.
    random.shuffle(examples)
    for batch in cytoolz.partition_all(batch_size,
                                       tqdm.tqdm(examples, leave=False)):
        batch = list(batch)
        loss += model.update(batch, revise=False, drop=dropout)
    if len(evals) > 0:
        with nlp.use_params(model.optimizer.averages):
            acc = model.evaluate(tqdm.tqdm(evals, leave=False))
            if acc['accuracy'] > best_acc['accuracy']:
                best_acc = dict(acc)
                # Swap in an empty table so the vectors aren't serialized,
                # then put the real table back.
                nlp.vocab.vectors = Vectors()
                best_model = nlp.to_bytes()
                nlp.vocab.vectors = vectors
        print_(printers.tc_update(i, loss, acc))
if len(evals) > 0:
    print_(printers.tc_result(best_acc))
if output_model is not None:
    if best_model is not None:
        # I had to do this, as nlp.from_bytes() was resetting the vectors
        # to length 0. This works OK now.
        vectors_save = deepcopy(vectors)
        nlp = nlp.from_bytes(best_model)
        nlp.vocab.vectors = vectors_save
    msg = export_model_data(output_model, nlp, examples, evals)
    print_(msg)
return best_acc['accuracy']
```
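The swap-out/restore around `to_bytes()` in the loop above could also be factored into a context manager so the restore always happens, even if serialization raises. This is just a sketch of that pattern, not spaCy API; `detached_vectors` is a hypothetical helper, and the `SimpleNamespace` object below merely stands in for an object with a `.vocab.vectors` attribute:

```python
from contextlib import contextmanager
from types import SimpleNamespace

@contextmanager
def detached_vectors(nlp, placeholder):
    """Temporarily replace nlp.vocab.vectors while the body runs,
    then restore the original table, even on error."""
    saved = nlp.vocab.vectors
    nlp.vocab.vectors = placeholder
    try:
        yield
    finally:
        nlp.vocab.vectors = saved

# Usage with a dummy object standing in for the pipeline:
nlp = SimpleNamespace(vocab=SimpleNamespace(vectors={"apple": [0.1]}))
with detached_vectors(nlp, {}):
    # inside the block the vectors are detached, so a to_bytes()-style
    # serialization here would skip them
    best_model = repr(nlp.vocab.vectors)
print(len(nlp.vocab.vectors))   # 1 -- original vectors restored afterwards
```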