Custom recipe w/o model

pronks · April 13, 2018, 6:28pm

Hello,

I’ve been trying to create custom recipes for collecting single and multi-label annotations for NER without a spaCy model (w/o model-in-the-loop), but have been running into some issues. I haven’t been able to locate examples of annotation-gathering being done w/o models, which is making this difficult. It seems like it should be possible, but I’m not sure what the config arguments should be in this case (esp. concerning labels) and how the data stream should be formatted.

For example, given the code for ner.manual (from which I’ve removed the arguments pertaining to the spaCy model), how would I format config and stream? I have config as {'labels': label} and stream as [{'text': 'text'}…], but I’m getting an “Oops something went wrong” page when I try to run it.

Thanks for your help.

ines · April 14, 2018, 2:15am

Hi! I think the solution might actually be easier than you think.

One thing that's important to note here is that ner.manual doesn't actually update a model in the loop – it only uses the spaCy model for tokenization. Pre-tokenizing the text allows you to annotate faster because the highlighted selection can "snap" to the token boundaries. So if you run the ner.manual recipe out-of-the-box, it will stream in the text so you can annotate it manually and save the annotations to your dataset (which you can then use however you like).

If you don't want to use spaCy for tokenization, you can also implement your own logic. The manual NER interface expects each example to have an additional "tokens" property. You can find more details on this in the "Annotation task formats" section of your PRODIGY_README.html. Here's a simple example:

{
    "text": "Hello Apple",
    "tokens": [
        {"text": "Hello", "start": 0, "end": 5, "id": 0},
        {"text": "Apple", "start": 6, "end": 11, "id": 1}
    ]
}
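As a rough illustration of what "your own logic" could look like, here is a minimal sketch that splits on whitespace and records character offsets in the format shown above. The names whitespace_tokenize and add_tokens are hypothetical helpers, not part of Prodigy, and whitespace splitting is only an assumption: punctuation stays attached to words, so a real tokenizer would probably need more rules.

```python
import re


def whitespace_tokenize(text):
    """Hypothetical helper: split on whitespace, keeping character offsets."""
    tokens = []
    for i, match in enumerate(re.finditer(r"\S+", text)):
        tokens.append({
            "text": match.group(),
            "start": match.start(),  # offset of the token's first character
            "end": match.end(),      # offset just past the token's last character
            "id": i,                 # running token index within this text
        })
    return tokens


def add_tokens(stream):
    """Hypothetical helper: attach a "tokens" list to every task in the stream."""
    for task in stream:
        task = dict(task)  # copy so the incoming example is not mutated
        task["tokens"] = whitespace_tokenize(task["text"])
        yield task


examples = [{"text": "Hello Apple"}]
tasks = list(add_tokens(examples))
```

Because add_tokens is a generator, it can wrap whatever iterable of {"text": ...} dicts the recipe already loads, and the resulting tasks carry the same "tokens" structure as the example above.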

pronks · April 18, 2018, 8:42pm

I ended up implementing my own logic for tokenization, and it worked. Thanks for the quick and thorough response!
