hi @jiebei!
Yes, you can move to `ner.teach` (i.e., active learning) if you want to improve the model.
However, if you're happy with your model's overall performance, you can add extensions like `spacy-streamlit`. This is a Streamlit app that you can use to demo your model, which can be extremely helpful for showing it to non-data scientists. You can run it locally or deploy it to a cloud environment.
You'll want to use the model that you previously trained from the manual (and/or corrected) annotations. In `prodigy train`, you need to specify the `output_path` for your model. Within that directory, it saves a `model-last` (the last version of your model) and a `model-best` (the best-performing version of your model). You can then use one of those paths, e.g. `output_path/model-last` if you want your last run, as the model you'll use.
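As a rough sketch of picking one of those two folders from Python: the helper names `resolve_model_dir` and `load_trained_model` are made up for illustration, and `output_path` stands for whatever directory you passed to `prodigy train`.

```python
from pathlib import Path


def resolve_model_dir(output_path, prefer="best"):
    """Return output_path/model-best or output_path/model-last, whichever exists.

    `prefer` decides which of the two folders is tried first.
    """
    root = Path(output_path)
    if prefer == "best":
        order = ["model-best", "model-last"]
    else:
        order = ["model-last", "model-best"]
    for name in order:
        candidate = root / name
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"No model-best or model-last found in {root}")


def load_trained_model(output_path, prefer="best"):
    """Load the trained pipeline with spaCy (imported lazily, so spaCy must be installed)."""
    import spacy

    return spacy.load(resolve_model_dir(output_path, prefer))
```

For example, `load_trained_model("output_path", prefer="last")` would load your last run, assuming that folder exists.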
The example is a case where you want to use spaCy's built-in `ner`, which has multiple entity types like `PERSON` or `ORG`. Since you have custom entities, you need a model that has been trained on those entities.
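A minimal sketch of what the demo app could look like when it points at your own trained model instead of a built-in pipeline. The path, the placeholder text, and the `run_demo` wrapper are assumptions for illustration; I'm also assuming `spacy_streamlit.visualize` accepts a local model path the same way `spacy.load` does.

```python
# app.py: start it with `streamlit run app.py`

# Assumed location; replace with wherever `prodigy train` wrote your model.
MODEL_PATH = "output_path/model-best"
# Placeholder text shown in the app's input box.
DEFAULT_TEXT = "Paste a sentence from your own domain here."


def run_demo(model_path=MODEL_PATH, text=DEFAULT_TEXT):
    """Render the NER visualizer for the custom model."""
    import spacy_streamlit  # requires: pip install spacy-streamlit

    # Only show the NER visualizer, since that is the component that was trained.
    spacy_streamlit.visualize([model_path], text, visualizers=["ner"])
```

To actually render the app, add a final `run_demo()` call at the bottom of the script.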
You'd likely want to use new text data. If your model has done a good job, it has likely already absorbed the information from the manual recipes and has thus already "learned" from those examples. What you want from `ner.teach` is to find your model's blind spots. The `ner.teach` recipe uses active learning: it reorders the examples you're shown so that you label the ones the model is most uncertain about.
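To make the "most uncertain first" idea concrete, here is a simplified illustration. This is not Prodigy's actual implementation (its own sorters work on a stream of examples and are more involved); the scores and texts below are invented, and `uncertainty` and `most_uncertain_first` are hypothetical helper names.

```python
def uncertainty(score):
    """Map a model score in [0, 1] to an uncertainty value in [0, 1].

    Scores near 0.5 mean the model can't decide, so they get the
    highest uncertainty; scores near 0 or 1 get the lowest.
    """
    return 1.0 - abs(score - 0.5) * 2.0


def most_uncertain_first(scored_examples):
    """Sort (score, text) pairs so the least confident predictions come first."""
    return sorted(scored_examples, key=lambda pair: uncertainty(pair[0]), reverse=True)


# Invented scores for illustration only.
scored = [
    (0.98, "clearly an entity"),
    (0.52, "model is on the fence"),
    (0.07, "clearly not an entity"),
    (0.35, "somewhat unsure"),
]
ordered_texts = [text for _, text in most_uncertain_first(scored)]
```

With this ordering, the examples the model is already confident about end up at the back of the queue, which is why new, unseen text is more useful here than the examples you have already annotated.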
It's important to note that while `ner.teach` (active learning) makes sense in theory, it doesn't always work in practice. As an alternative, you could keep using the `ner.correct` recipe. It's like `ner.teach` in that the model makes predictions on the new text for you to review, but it doesn't reorder the examples.
Here's a great discussion from Matt about active learning:
Hope this helps!