@ryanwesslen: Just to let you know, I am now working on a different approach, as described below.
Header: How to reuse prodigy.db to retrain an older (spaCy v2) custom NER model
Hi,
Last year, in June 2021, I created a custom NER model with Prodigy (and spaCy 2.x) on my Windows laptop:
python -m prodigy train ner dataset,dataset_correct,dataset_correct1,dataset_correct3 en_vectors_web_lg --output C:\Users\myname\Documents\tmp_model --eval-split 0.2 --n-iter 40
I tried to upload this model to huggingface.co, but could not, because the model was built with spaCy v2 while spacy-huggingface-hub requires spaCy v3. I have therefore decided to install Prodigy on my Ubuntu 22.04 laptop and either retrain the old model or rebuild it, whichever is possible.
I still have the .prodigy folder from my Windows laptop from last year. It contains two files: prodigy.db (168 MB, 9 datasets) and prodigy.json (6 B). I want to reuse this prodigy.db database to retrain or rebuild the old model.
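As a first sanity check on the copied file, I am thinking of something like the sketch below. It assumes that prodigy.db is an ordinary SQLite file (as far as I know that is Prodigy's default backend) and that I copy it to ~/.prodigy/prodigy.db on the Ubuntu laptop. The helper name summarize_sqlite is my own and not part of Prodigy. It makes no assumption about the table layout, only lists the tables with their row counts, and opens the file read-only so the original data cannot be changed by accident.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    # Build a file: URI with mode=ro so the database is opened read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    # Assumed location of the copied database; adjust if it lives elsewhere.
    db = Path.home() / ".prodigy" / "prodigy.db"
    if db.exists():
        for table, count in summarize_sqlite(db).items():
            print(table, count)
```

If the tables and counts look plausible, the file at least survived the move from Windows intact, and the remaining question is how to get the annotations back into a training run.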
Could you please suggest how to retrain or rebuild the model from this database, with links to the relevant code?
Regards,
Rahul