I’m considering acquiring Prodigy in order to tag a large dataset of documents. Can I manually define the classes I’m trying to tag?
In the live demo I can manually tag with ORG, PERSON, etc. However, for my dataset I’d like to tag with other classes (e.g. diseases, occupations, etc.). Is this something I can do?
Sure! Training custom categories is super important.
The ner.manual recipe (which uses the manual annotation interface that’s also shown in the demo) lets you specify a label set that’s then used to provide the options. For example:
prodigy ner.manual your_dataset en_core_web_sm your_input_data.jsonl --label DISEASE,OCCUPATION
This will load up Prodigy, stream in data from your_input_data.jsonl, give you the label options DISEASE and OCCUPATION, and save the collected annotations in the dataset your_dataset.
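In case it helps, here is a minimal sketch of how you might produce your_input_data.jsonl from a list of raw texts. It assumes the input is newline-delimited JSON with one object per line and the text under a "text" key. The example sentences and the write_jsonl helper are hypothetical and only there for illustration.

```python
import json

# Hypothetical example texts; replace these with your own documents.
texts = [
    "The patient, a retired nurse, was diagnosed with type 2 diabetes.",
    "A 45-year-old carpenter presented with chronic bronchitis.",
]


def write_jsonl(path, texts):
    """Write one JSON object per line, each with a "text" key (assumed input format)."""
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


# File name matches the one used in the command above.
write_jsonl("your_input_data.jsonl", texts)
```

Once the file exists, the ner.manual command above can be pointed at it unchanged.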
Seems very intuitive and simple to use.
Thank you very much!