User-defined classes for manual NER

I'm considering acquiring Prodigy in order to tag a large dataset of documents. Can I manually define the classes I’m trying to tag?

In the live demo I can manually tag with ORG, PERSON, etc. However, for my dataset I’d like to tag other classes (e.g. diseases, occupations). Is this something I can do?

Sure! Training custom categories is super important 🙂

The ner.manual recipe (which uses the manual annotation interface that's also shown in the demo) lets you specify a label set that's then used to provide the options. For example:

prodigy ner.manual your_dataset en_core_web_sm your_input_data.jsonl --label DISEASE,OCCUPATION

This will load up Prodigy, stream in data from your_input_data.jsonl, give you the label options DISEASE and OCCUPATION and save the collected annotations in the dataset your_dataset.
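
In case it helps, here's a minimal sketch of how you could produce the input file. It assumes your raw texts are already available as Python strings, and that each line of the JSONL file should be a JSON object with a "text" key. The helper names write_jsonl and read_jsonl are hypothetical and not part of Prodigy.

```python
import json


def write_jsonl(texts, path):
    """Write one JSON object per line, each holding a single "text" key.

    Returns the number of lines written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            # ensure_ascii=False keeps accented characters readable in the file
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical usage:
# write_jsonl(
#     ["The patient was diagnosed with asthma.", "She works as a nurse."],
#     "your_input_data.jsonl",
# )
```

Reading the file back with read_jsonl is a quick way to check that every line parses before you start annotating.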


Seems very intuitive and simple to use.

Thank you very much!
