I'm considering acquiring Prodigy to tag a large document dataset. Can I manually define the classes I'm trying to tag?
On the live demo I can manually tag with ORG, PERSON, etc. However, for my dataset I'd like to tag with other classes (e.g. diseases, occupations). Is this something I can do?
Sure! Training custom categories is super important.
The ner.manual recipe (which uses the manual annotation interface that's also shown in the demo) lets you specify a label set that's then used to provide the options. For example:
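A command along these lines should do it. The dataset name, input file, and labels are the ones described in this answer; en_core_web_sm is only an assumed placeholder for the spaCy model used for tokenization, so swap in whichever model you actually use.

```shell
# Start the manual NER interface with a custom label set.
# Assumption: en_core_web_sm is a placeholder spaCy model (used for tokenization).
prodigy ner.manual your_dataset en_core_web_sm your_input_data.jsonl --label DISEASE,OCCUPATION
```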
This will load up Prodigy, stream in data from your_input_data.jsonl, give you the label options DISEASE and OCCUPATION, and save the collected annotations in the dataset your_dataset.
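If you don't have an input file yet, here is a minimal sketch of what your_input_data.jsonl could look like, assuming the usual layout of one JSON object per line with a "text" key. The two sentences are made-up placeholders, not real data.

```shell
# Write two hypothetical sample records, one JSON object per line.
cat > your_input_data.jsonl <<'EOF'
{"text": "The nurse treated three patients for malaria."}
{"text": "A retired teacher was diagnosed with diabetes."}
EOF
```

Each line becomes one annotation task in the interface, so you can start small and add more lines later.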