@honnibal , I finally got this working. Please look at the code below and let me know if it's OK. I added `deepcopy(vectors)` because the vectors were getting reset when restoring `best_model`. If this is OK, I have a path forward now.

One clarification, though: since we save the vectors and put them back afterwards, do the saved vectors contain what was learned during training, or are they still the static vectors from the word2vec model? Or should we move the
`nlp.vocab.vectors = vectors` to before `nlp.vocab.vectors = Vectors()`
so that the updated vectors are the ones that get saved?
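For context on why the `deepcopy` was needed: if deserialization clears the vectors table in place, a plain reference to it sees the cleared table too, while a deep copy taken beforehand is an independent snapshot. A minimal pure-Python sketch of this aliasing behavior, using a dict as a hypothetical stand-in for the vectors table:

```python
from copy import deepcopy

# Hypothetical stand-in for nlp.vocab.vectors: a mutable table of word vectors.
vectors = {"apple": [0.1, 0.2], "pear": [0.3, 0.4]}

# A plain reference is not enough: if deserialization empties the table
# in place, the reference sees the emptied table as well.
reference = vectors
vectors.clear()            # simulates the table being reset to length 0
print(len(reference))      # 0 -- the reference was emptied too

# A deepcopy taken beforehand is an independent snapshot and survives.
vectors = {"apple": [0.1, 0.2], "pear": [0.3, 0.4]}
snapshot = deepcopy(vectors)
vectors.clear()
print(len(snapshot))       # 2 -- the snapshot is unaffected
```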
```python
# Save the vectors before training so they can be restored around serialization
vectors = nlp.vocab.vectors
print("length of vectors is", len(vectors))
for i in range(n_iter):
    loss = 0.
    random.shuffle(examples)
    for batch in cytoolz.partition_all(batch_size,
                                       tqdm.tqdm(examples, leave=False)):
        batch = list(batch)
        loss += model.update(batch, revise=False, drop=dropout)
    if len(evals) > 0:
        with nlp.use_params(model.optimizer.averages):
            acc = model.evaluate(tqdm.tqdm(evals, leave=False))
            if acc['accuracy'] > best_acc['accuracy']:
                best_acc = dict(acc)
                # Swap in an empty table so the vectors aren't serialized,
                # then put the real table back.
                nlp.vocab.vectors = Vectors()
                best_model = nlp.to_bytes()
                nlp.vocab.vectors = vectors
        print_(printers.tc_update(i, loss, acc))
if len(evals) > 0:
    print_(printers.tc_result(best_acc))
if output_model is not None:
    if best_model is not None:
        # I had to do this, as nlp.from_bytes() was resetting the vectors
        # to length 0. This works OK now.
        vectors_save = deepcopy(vectors)
        nlp = nlp.from_bytes(best_model)
        nlp.vocab.vectors = vectors_save
    msg = export_model_data(output_model, nlp, examples, evals)
    print_(msg)
return best_acc['accuracy']
```
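The swap-out/restore around `to_bytes()` in the loop above could also be factored into a context manager so the restore always happens, even if serialization raises. This is just a sketch of that pattern, not spaCy API; `detached_vectors` is a hypothetical helper, and the `SimpleNamespace` object below merely stands in for an object with a `.vocab.vectors` attribute:

```python
from contextlib import contextmanager
from types import SimpleNamespace

@contextmanager
def detached_vectors(nlp, placeholder):
    """Temporarily replace nlp.vocab.vectors while the body runs,
    then restore the original table, even on error."""
    saved = nlp.vocab.vectors
    nlp.vocab.vectors = placeholder
    try:
        yield
    finally:
        nlp.vocab.vectors = saved

# Usage with a dummy object standing in for the pipeline:
nlp = SimpleNamespace(vocab=SimpleNamespace(vectors={"apple": [0.1]}))
with detached_vectors(nlp, {}):
    # inside the block the vectors are detached, so a to_bytes()-style
    # serialization here would skip them
    best_model = repr(nlp.vocab.vectors)
print(len(nlp.vocab.vectors))   # 1 -- original vectors restored afterwards
```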