Questions on multi-NER annotation & training at once in a sentence

Hi, my task is to annotate 4 NERs at once in a sentence, as shown below. After annotation, the 4 NERs were trained with a single prodigy train command. So I am wondering:
1 How does prodigy train actually train the 4 different NERs simultaneously?
2 Will training them at once influence the final result, for better or for worse?
3 How can I check the evaluation for each NER?
4 Do you recommend doing so, or training each NER separately? If separately, which command should I use with the existing annotated dataset (I don't want to annotate one more time :frowning: )?

The annotation command: prodigy ner.manual data_5_1.0 en_core_web_lg ./input/data_5_ground_truth_1.0.jsonl --label RIGHTV,RIGHTN,ACCESSV,ACCESSN
The training command: prodigy train --ner data_5_1.0 ./tmp_model --eval-split 0.2 --config config.cfg
The training result is as follows:

Since I will probably turn this project into a publication, I need to know exactly how the training works.
Looking forward to discussing this with you.

hi @ruiye!

Thanks for your message!

This is excellent. We're happy to help coach you along the way. I'll provide a lot of detailed links that can point you in the right direction. Some you may already know, but hopefully they'll also give other community members helpful context. Also, keep searching the wealth of information in the spaCy documentation, the Prodigy documentation, this forum, and the spaCy GitHub discussions.

Can I rephrase this to align with Prodigy's terminology?

I would suggest that your goal is to train one NER model with four entity types (RIGHTV, RIGHTN, ACCESSV, ACCESSN), not "4 NERs". I would recommend looking over Prodigy's glossary of terms.
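You can see this in the trained pipeline itself. Here's a minimal sketch, assuming the ./tmp_model output path from your prodigy train command above:

```python
import spacy

# Load the pipeline that prodigy train saved; "model-best" is the
# best-scoring checkpoint written into the output directory.
nlp = spacy.load("./tmp_model/model-best")

print(nlp.pipe_names)              # a single "ner" component, e.g. ['tok2vec', 'ner']
print(nlp.get_pipe("ner").labels)  # all four entity types live in that one component
```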

The ner.manual recipe produces manual annotations that highlight the full entity span, as opposed to binary recipes that frame each annotation as a yes/no decision. A binary recipe like ner.teach is usually for when you already have a model that you want to improve incrementally, while ner.correct streams a model's predictions for you to accept or fix. Both annotate with a model in the loop: the model affects the order of annotation, or is at least used to predict spans that the annotator then corrects.

prodigy train is simply a wrapper for spacy train, and spacy train is defined by its config file. To make things easier, prodigy train will create a default config file for you, identical to what spacy init config produces.
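If you want to peek at what that config actually defines, here's a small sketch (assuming the config.cfg you passed via --config):

```python
from spacy import util

# Load and interpolate the training config that spacy train
# (and therefore prodigy train) runs from.
config = util.load_config("config.cfg", interpolate=True)

print(config["nlp"]["pipeline"])          # the components in the pipeline
print(list(config["components"].keys()))  # per-component settings, including "ner"
```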

I see you used your own config.cfg file. This is excellent. If you're interested in how training is done, see spaCy training docs or look over the config docs.

Also, you may find the NER Prodigy docs page's section on training strategies helpful. Here's an excerpt:

Should I start with a blank model or update an existing one?

spaCy’s NER architecture was designed to support continuous updates with more examples and even adding new labels to existing trained models. Updating an existing model makes sense if you want to keep using the same label scheme and only want to fine-tune it on more data or more specific data. However, it can easily lead to inconsistent behavior if you’re adding new entity types and/or annotations that conflict with the data the model was trained on. For instance, if you suddenly want to predict all cities as CITY instead of GPE. Instead of trying to “fight” the existing weights trained on millions of words, it often makes more sense to train a new model from scratch.

Even if you’re training from scratch, you can still use a trained model to help you create training data more efficiently. Prodigy’s ner.correct will stream in the model’s predictions for the given labels and lets you manually correct the entity spans. This way, you can let a model label the entity types you want to keep, add your new types on top, and make corrections along the way. This is a very effective method for bootstrapping large gold-standard training corpora without having to do all the labelling from scratch.
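In your case, that bootstrapping step would look something like this (the dataset and input file names are hypothetical):

prodigy ner.correct data_5_2.0 ./tmp_model/model-best ./input/more_sentences.jsonl --label RIGHTV,RIGHTN,ACCESSV,ACCESSN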

Here's another discussion on the differences.

Note this discussion is from 2019, when the equivalent of prodigy train for NER was the ner.batch-train recipe.

Also, the post below details some differences between Prodigy and spaCy. Just note that spaCy 3.0 has come out since then and changed a lot. If you want a strong, reproducible project, I would encourage learning spaCy projects. You can find a great template that integrates with Prodigy as part of the spacy projects repo.

Ideally, training them together is better. See this post:

See this post:

Very important: make sure to create a dedicated hold-out (evaluation) dataset early on if you're experimenting. It's easy to rely on --eval-split, but that means your evaluation dataset will change on every run. Without a very large dataset, your model's evaluation may swing wildly because of the different evaluation sets, and that will confuse your results. Once you have a dedicated set, you can specify it with the eval: prefix in prodigy train, like:

prodigy train --ner train_data,eval:eval_data ...

Also, this post explains more:

See this post:

But you may want to check out the NER workflow (fyi we're planning to update this very soon with improved names!):

https://prodi.gy/36f76cffd9cb4ef653a21ee78659d366/prodigy_flowchart_ner.pdf

Training each entity type separately isn't needed. However, here's a little background in case you want to exclude some examples from annotation.

Be sure to use the --exclude argument, which lets you pass datasets whose examples should be excluded from a stream you don't want to annotate again. By default, Prodigy excludes by task_hash, which is computed automatically: a unique code (hash) that identifies every record by its input text plus the annotation task (e.g., ner.manual). You can switch to excluding by input_hash by changing exclude_by in your prodigy.json (config file). There are 50+ Prodigy Support issues that tackle the problem of excludes; see them for example workflows and other questions.
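To make the hashing concrete, here's a minimal sketch using Prodigy's Python API (the example task is made up):

```python
from prodigy import set_hashes

# A made-up annotation task; hashing normally happens automatically
# when Prodigy loads a stream.
task = {"text": "The admin may modify the access rights."}
task = set_hashes(task)

print(task["_input_hash"])  # derived from the input text only
print(task["_task_hash"])   # derived from the input plus task properties (e.g. labels)
```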

Last, an incredibly powerful design philosophy of Prodigy is that no one knows in advance the right way to build your model. Instead, Prodigy is designed to let you rapidly experiment and iterate on your unique problem. On the Named Entity Recognition documentation page, there's a great section on how to choose the right recipe and workflow.

Hey, thanks a lot for your detailed reply, the helpful links, and for offering to coach. I'll reply every night so that you can check my questions and updates the next day. How does this style of coaching sound to you?

The current training result is as follows. It makes sense from my point of view, since RIGHTV, ACCESSV, and RIGHTN draw from a dictionary of words, while ACCESSN varies a lot. However, my dataset is only around 100-150 sentences. What do you think of an NER model trained on 100 sentences? Do you have any suggestions for the training and for a solid project, like data augmentation or setting a smaller batch size?

hi @ruiyeNLP!

I can't guarantee any response turn-around time, but feel free to keep posting in this thread. We'll answer as we can.

Typically, we'd advocate for many more sentences. For example, our NER workflow recommends starting with at least 1,000 unlabeled sentences at the very beginning :slight_smile:

While there's no set rule, we typically recommend training models with at least about 500 sentences, plus evaluation datasets of at least a few hundred. By that yardstick, you wouldn't have enough for training, or even for data augmentation. Is there any way to get more sentences?

Hopefully you can, as I'd be skeptical of how far you can get with only 100 sentences.

If you do want to try data augmentation, augmenty could work.

If you can overcome the shortage of sentences, down the road you should consider running train-curve. This will give you an idea of how your accuracy changes with more annotations; see the train-curve documentation for recommendations on interpreting the results.
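For example (train-curve accepts the same arguments as prodigy train; double-check the exact flags with prodigy train-curve --help):

prodigy train-curve --ner data_5_1.0 --config config.cfg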

Hi @ryanwesslen, thanks a lot for your detailed reply. Now I have two questions about moving from prodigy train to spacy train.
1 I used the command below to train the model. Do you know what the equivalent of --label-stats is in spacy train? How can I check the performance for each NER label with spacy train?

prodigy train --ner rights_training_1 ./tmp_model_all_1 --eval-split 0.2 --config config.cfg --label-stats

2 In Prodigy, I can use prodigy train-curve, and it is quite useful. Do you know how I can do the equivalent of train-curve with spaCy? Or how to save the prodigy train-curve results as a PDF?

Looking forward to your reply.

hi @ruiyeNLP!

The command you want is spacy evaluate, which evaluates trained pipelines. Since spaCy is open source, you can see the code for it:

What's important is the function handle_scores_per_type. This is what's called when using --label-stats, and as you can see, it runs by default for spacy evaluate.
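If you'd rather compute the per-label scores from Python, here's a minimal sketch. The paths are assumptions: a trained model in ./tmp_model and a dev set exported to ./corpus/dev.spacy (see data-to-spacy below):

```python
import spacy
from spacy.tokens import DocBin
from spacy.training import Example

# Load the trained pipeline and the gold-standard evaluation docs.
nlp = spacy.load("./tmp_model/model-best")
gold_docs = DocBin().from_disk("./corpus/dev.spacy").get_docs(nlp.vocab)

# Pair each gold doc with an unannotated copy for the model to predict on.
examples = [Example(nlp.make_doc(doc.text), doc) for doc in gold_docs]
scores = nlp.evaluate(examples)

# ents_per_type holds precision/recall/F for each entity type,
# the same numbers that --label-stats reports.
for label, prf in scores["ents_per_type"].items():
    print(label, prf)
```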

The one thing you may want to do is create a dedicated hold-out (evaluation) dataset. By default, Prodigy can use --eval-split 0.2, which does the splitting for you. The problem is that each run may produce a different split. Ideally, you should partition the data once and keep it fixed.

If you're going to use spacy train, the data-to-spacy recipe can help. It will partition your data and convert it to the .spacy binary format (which is what spacy train expects).
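A sketch of that workflow, assuming a ./corpus output folder:

```python
# First export the annotations (run on the command line):
#   prodigy data-to-spacy ./corpus --ner rights_training_1 --eval-split 0.2
# This writes train.spacy, dev.spacy and a config into ./corpus.
from spacy.cli.train import train

train(
    "./corpus/config.cfg",
    output_path="./tmp_model_all_1",
    overrides={
        "paths.train": "./corpus/train.spacy",
        "paths.dev": "./corpus/dev.spacy",
    },
)
```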

There isn't a spaCy version of train-curve, unfortunately, but you can write your own. See this for details:
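As a starting point, the idea looks roughly like this: train on growing fractions of your data and compare the scores. A rough sketch, reusing the ./corpus files from data-to-spacy above:

```python
import random

import spacy
from spacy.cli.train import train
from spacy.tokens import DocBin

# Load the full training corpus once.
nlp = spacy.blank("en")
docs = list(DocBin().from_disk("./corpus/train.spacy").get_docs(nlp.vocab))

for frac in (0.25, 0.5, 0.75, 1.0):
    # Train on a random subsample; compare the scores in each
    # run's meta.json afterwards.
    sample = random.sample(docs, int(len(docs) * frac))
    DocBin(docs=sample).to_disk("./corpus/train_sample.spacy")
    train(
        "./corpus/config.cfg",
        output_path=f"./curve/{int(frac * 100)}",
        overrides={
            "paths.train": "./corpus/train_sample.spacy",
            "paths.dev": "./corpus/dev.spacy",
        },
    )
```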

Also, you can export your prodigy train-curve results to a .txt file by redirecting the output with > train_curve.txt, so:

prodigy train-curve ... > train_curve.txt
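And if you need the curve as a PDF for your publication, a small matplotlib sketch works; the numbers below are placeholders you'd copy out of train_curve.txt, not real results:

```python
import matplotlib.pyplot as plt

# Placeholder values: replace with the percentages and scores
# printed in train_curve.txt.
pct_of_data = [25, 50, 75, 100]
ner_f_score = [0.61, 0.68, 0.71, 0.73]

plt.plot(pct_of_data, ner_f_score, marker="o")
plt.xlabel("% of training examples")
plt.ylabel("NER F-score")
plt.title("Prodigy train-curve")
plt.savefig("train_curve.pdf")
```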