Hi, my task annotates 4 NERs at once in a sentence as shown below. After annotation, the 4 NERs were trained with one command, prodigy train. Therefore I am wondering:
1 How does prodigy train actually train the 4 different NERs simultaneously?
2 Will training them all at once influence the final result, for better or for worse?
3 How can I check the evaluation for each NER?
4 Do you recommend doing so, or training each NER separately? If separately, which command should I use with the existing annotated dataset (I don't want to annotate a second time)?
The annotation command: prodigy ner.manual data_5_1.0 en_core_web_lg ./input/data_5_ground_truth_1.0.jsonl --label RIGHTV,RIGHTN,ACCESSV,ACCESSN
The training command: prodigy train --ner data_5_1.0 ./tmp_model --eval-split 0.2 --config config.cfg
The training result is as follows:
This is excellent. We're happy to help coach you along the way. I'll provide a lot of detailed links that can point you in the right direction. Some you may already know, but hopefully it can help other community members for context. Also, keep searching in the wealth of information in our spaCy documentation, Prodigy documentation, and in this community and spaCy GitHub community.
Can I rephrase this to align with Prodigy's terminology?
I would suggest your goal is to train one NER model with four entity types: RIGHTV, RIGHTN, ACCESSV, ACCESSN, not "4 NERs". I would recommend looking over Prodigy's glossary of terms.
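To make the "one model, four entity types" framing concrete, here is a minimal sketch that counts how many spans of each label a Prodigy-style JSONL export contains. It assumes records with a "text", a list of "spans" (each with "start", "end", "label") and an "answer" field; the sample sentence is made up for illustration and is not from your dataset.

```python
import json
from collections import Counter


def count_labels(jsonl_lines):
    """Count annotated spans per entity label in Prodigy-style JSONL records."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)
        # Only accepted answers are counted; rejected/ignored ones are skipped.
        if example.get("answer", "accept") != "accept":
            continue
        for span in example.get("spans", []):
            counts[span["label"]] += 1
    return counts


# Hypothetical record, invented purely for illustration.
sample = [
    json.dumps({
        "text": "Users may access the data.",
        "spans": [
            {"start": 10, "end": 16, "label": "ACCESSV"},
            {"start": 21, "end": 25, "label": "ACCESSN"},
        ],
        "answer": "accept",
    })
]
label_counts = count_labels(sample)
print(dict(label_counts))
```

All four labels live in the same "spans" list of the same examples, which is why a single NER component can learn them together. With a real export (for example from prodigy db-out), you could pass the open file object to count_labels instead of the sample list.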
The ner.manual recipe produces manual annotations that highlight the full entity span, as opposed to binary recipes that collect annotations as Yes/No decisions. Model-assisted recipes like ner.teach (binary) or ner.correct (where you edit the model's suggested spans) are usually used when you already have a model that you want to improve incrementally. This is annotating with a model in the loop: the model either affects the order of annotation or makes predictions that the annotator corrects.
prodigy train is simply a wrapper for spacy train. spacy train is defined by its config file. To make things easier, prodigy train will create a default config file if you don't provide one, which is identical to what spacy init config generates.
I see you used your own config.cfg file. This is excellent. If you're interested in how training is done, see spaCy training docs or look over the config docs.
Should I start with a blank model or update an existing one?
spaCy’s NER architecture was designed to support continuous updates with more examples and even adding new labels to existing trained models. Updating an existing model makes sense if you want to keep using the same label scheme and only want to fine-tune it on more data or more specific data. However, it can easily lead to inconsistent behavior if you’re adding new entity types and/or annotations that conflict with the data the model was trained on. For instance, if you suddenly want to predict all cities as CITY instead of GPE. Instead of trying to “fight” the existing weights trained on millions of words, it often makes more sense to train a new model from scratch.
Even if you’re training from scratch, you can still use a trained model to help you create training data more efficiently. Prodigy’s ner.correct will stream in the model’s predictions for the given labels and lets you manually correct the entity spans. This way, you can let a model label the entity types you want to keep, add your new types on top, and make corrections along the way. This is a very effective method for bootstrapping large gold-standard training corpora without having to do all the labelling from scratch.
Here's another discussion on the differences.
Note this was in 2019 when prodigy train was called prodigy batch-train for ner.
Very important: make sure to create a dedicated hold-out (evaluation) dataset early on if you're experimenting. It's easy to use the --eval-split but that will mean your evaluation dataset will change in each run. Without a very large dataset, your model's evaluation may change wildly because of different evaluation sets. This will confuse your results. If you do this, you can specify your evaluation dataset with the eval: prefix in prodigy train like:
prodigy train --ner train_data,eval:eval_data ...
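One way to get a fixed hold-out set is to split an export of your annotations once, with a fixed random seed, and load the two halves back as separate datasets. The following is only a sketch under that assumption; the function names, file names and the 0.2 fraction are illustrative choices, not part of Prodigy.

```python
import json
import random


def split_examples(examples, eval_fraction=0.2, seed=0):
    """Return (train, eval) lists; the same seed always gives the same split."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one example for evaluation.
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def read_jsonl(path):
    """Load one JSON record per non-empty line."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(path, examples):
    """Write one JSON record per line."""
    with open(path, "w", encoding="utf8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")
```

A possible workflow: export with prodigy db-out data_5_1.0 > all.jsonl, call split_examples(read_jsonl("all.jsonl")), write the two lists to train.jsonl and eval.jsonl, then import them with prodigy db-in train_data train.jsonl and prodigy db-in eval_data eval.jsonl. From then on, every training run is scored against the same evaluation examples.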
Also, this post explains more:
See this post:
But you may want to check out the NER workflow (fyi we're planning to update this very soon with improved names!):
Training separately isn't needed. However, here's a little background if you want to exclude some examples from annotation.
If you want to keep certain examples out of a stream, use the --exclude argument, which takes the names of datasets whose examples should be skipped. By default, Prodigy excludes by task_hash, and this happens automatically. The task hash is a unique code that identifies every record by its input text plus the annotation task (e.g., ner.manual). You can exclude by input_hash instead by changing exclude_by in your prodigy.json (config file). There are 50+ Prodigy Support issues that tackle the problem of exclude. See them for examples of workflows and other questions.
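The difference between excluding by input and excluding by task can be illustrated with a small sketch. This is not Prodigy's actual hashing code: input_key and task_key are hypothetical stand-ins that hash the text only, or the text plus its spans, just to show why the two settings filter differently.

```python
import hashlib
import json


def input_key(example):
    """Stand-in for an input hash: depends on the raw text only."""
    return hashlib.sha1(example["text"].encode("utf8")).hexdigest()


def task_key(example):
    """Stand-in for a task hash: depends on the text and what is annotated on it."""
    payload = json.dumps(
        {"text": example["text"], "spans": example.get("spans", [])},
        sort_keys=True,
    )
    return hashlib.sha1(payload.encode("utf8")).hexdigest()


def exclude_seen(stream, annotated, key_fn):
    """Drop stream examples whose key already occurs among the annotated ones."""
    seen = {key_fn(example) for example in annotated}
    return [example for example in stream if key_fn(example) not in seen]
```

With key_fn=input_key, a sentence is skipped as soon as its text has been annotated once in any form. With key_fn=task_key, the same text can come back if the question asked about it differs, which roughly mirrors the trade-off behind the exclude_by setting.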
Hey, thanks a lot for your detailed reply, the helpful links, and for offering to coach. I will reply every night so that you can check my questions and updates the next day. How do you like this style of coaching?
The current training result is as follows. It makes sense from my point of view, since RIGHTV, ACCESSV and RIGHTN come from a dictionary of words, while ACCESSN varies a lot. However, my dataset is only around 100 - 150 sentences. What do you think of an NER model trained on 100 sentences? Do you have any suggestions for the training and for a solid project, like data augmentation, or setting a smaller batch size?
I can't guarantee any response turn-around but feel free to keep posting on this chain. We'll answer as we can.
Typically, we'd advocate for many more sentences. For example, our NER workflow recommends starting with at least 1,000 unlabeled sentences at the very beginning.
While there's no set rule, we typically recommend training models with about 500 sentences at minimum, with evaluation datasets of at least a few hundred. Therefore, if you followed that, you wouldn't have enough for training, even with data augmentation. Is there any way to get more sentences?
Hopefully you can, as I would be skeptical about how far you can go with only 100 sentences.
If you did want to use data augmentation, augmenty could work.
If you can overcome the few sentences, down the road you should consider running train-curve. This will give you an idea of how your accuracy changes with more annotations. Look for documentation on the recommendations.
Hi @ryanwesslen , thanks a lot for your detailed reply. Now I have two questions from prodigy train to spacy train.
1 I used the command to train the model. Do you know what the equivalent of --label-stats is in spacy train? How can I check the performance for each NER with spacy train?
spacy evaluate is the command you're looking for. It can evaluate trained models. Since spaCy is open source, you can see the code for it:
What's important is the function handle_scores_per_type. This is what produces the per-label breakdown you get with --label-stats. As you can see, it's called by default in spacy evaluate.
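If you save the evaluation metrics to a JSON file (spacy evaluate has an --output option for this), you can also pull out the per-label numbers yourself. The sketch below assumes the saved metrics contain an "ents_per_type" mapping from label to "p", "r" and "f" values, as the spaCy scorer reports for NER; the function names are my own and the file path is up to you.

```python
import json


def load_metrics(path):
    """Read a metrics JSON file, e.g. one written via spacy evaluate --output."""
    with open(path, encoding="utf8") as f:
        return json.load(f)


def per_label_scores(metrics):
    """Return (label, precision, recall, f-score) tuples sorted by label."""
    # Assumption: per-label NER scores are stored under "ents_per_type".
    per_type = metrics.get("ents_per_type") or {}
    rows = []
    for label in sorted(per_type):
        scores = per_type[label]
        rows.append((label, scores["p"], scores["r"], scores["f"]))
    return rows


def print_scores(rows):
    """Print one aligned line per entity label."""
    for label, p, r, f in rows:
        print(f"{label:<10} P={p:.3f} R={r:.3f} F={f:.3f}")
```

For example, print_scores(per_label_scores(load_metrics("metrics.json"))) would give one line each for RIGHTV, RIGHTN, ACCESSV and ACCESSN, provided every label occurs in the evaluation data.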
The one thing you may want to do is create a dedicated hold-out (evaluation) dataset. By default, Prodigy can use --eval-split 0.2, which will do the splitting for you. The problem is that on each run you may get a different set of evaluation data. Ideally, you should split the data yourself once and reuse the same split for every run.
If you're going to use spacy train, the data-to-spacy recipe can help. It'll do a partition of your data and then convert it to .spacy binary format (which is ideal for using spacy).
There isn't a spacy version of train-curve unfortunately. You can write your own version. See this for details:
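Separately from the linked discussion, the core idea of a hand-rolled train curve is simple: train on growing portions of the training data and score each model on the same fixed evaluation set. Here is a minimal sketch of that loop. The train_and_score callback is hypothetical; you would have to implement it yourself, for instance by converting the subset to .spacy format, running spacy train, and returning the NER F-score.

```python
import random


def train_curve(examples, eval_examples, train_and_score, n_samples=4, seed=0):
    """Score models trained on increasing portions of the training data.

    train_and_score(train_subset, eval_examples) is a user-supplied callback
    (hypothetical here) that trains a model and returns a single score.
    """
    shuffled = list(examples)
    # Shuffle once so every portion is a prefix of the same random order.
    random.Random(seed).shuffle(shuffled)
    results = []
    for i in range(1, n_samples + 1):
        size = max(1, round(len(shuffled) * i / n_samples))
        subset = shuffled[:size]
        score = train_and_score(subset, eval_examples)
        results.append((size, score))
    return results
```

With n_samples=4 this trains on 25%, 50%, 75% and 100% of the data. If the score is still climbing at the last step, more annotations are likely to help; if it has flattened, more data of the same kind probably won't change much.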
Also, you can export your prodigy train-curve results to a .txt file by adding in "> train_curve.txt" so: