Re-visiting a partially annotated document

I am using the Prodigy ner.manual recipe with the blank:en model to annotate text documents for an NER task. Is it possible to perform the annotation task across multiple Prodigy sessions?

Say I annotate E1, E2, E3, E4 and E5 in round one and save the dataset. In a subsequent session, can I add entities E6, E7, ... EN to the same document (without having to redo E1...E5)?

One could think of the above as a way to "review" a previously annotated document and add more entity annotations in the review session.

If this is possible, what would be the command for the "review" session so that the saved dataset from the previous session is read into Prodigy?


Hi @nlpfan,
you can use the dataset used for annotating the first batch of entities as input for the second one. Let's say you have annotated your data using the command:

prodigy ner.manual ner_session_1 blank:en ./data.jsonl --label E1,E2

After saving your annotations you could then use the command

prodigy ner.manual ner_session_2 blank:en dataset:ner_session_1 --label E3,E4

to add more entities to your data. However, you have to use two different datasets for saving the two different sessions. Otherwise, Prodigy might not show you new examples, because examples that already exist in the target dataset are filtered out to prevent duplicates.
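
If you later want one combined set of spans per document from both datasets, a minimal sketch could look like the following. It assumes each dataset has been exported to a list of records (for example with `prodigy db-out`) and that each record carries the `_input_hash` and `spans` fields with `start`, `end` and `label` keys. The function name `merge_spans` and the toy records are hypothetical, for illustration only. Whether the second dataset already contains the earlier spans depends on how they were pre-loaded, so the sketch simply de-duplicates.

```python
from typing import Dict, List


def merge_spans(first: List[dict], second: List[dict]) -> List[dict]:
    """Combine spans from two annotation rounds, matching examples by _input_hash.

    Hypothetical helper, not part of Prodigy.
    """
    merged: Dict[int, dict] = {}
    for example in first + second:
        key = example["_input_hash"]
        if key not in merged:
            # Copy the example so the input records are not modified.
            merged[key] = {**example, "spans": []}
        # Spans already collected for this example, identified by offsets and label.
        seen = {(s["start"], s["end"], s["label"]) for s in merged[key]["spans"]}
        for span in example.get("spans", []):
            ident = (span["start"], span["end"], span["label"])
            if ident not in seen:
                merged[key]["spans"].append(dict(span))
                seen.add(ident)
    for example in merged.values():
        # Keep spans in document order.
        example["spans"].sort(key=lambda s: (s["start"], s["end"]))
    return list(merged.values())


# Toy records standing in for two exported sessions (not real data).
round_one = [
    {"_input_hash": 1, "text": "Acme hired Ann.",
     "spans": [{"start": 0, "end": 4, "label": "E1"}]},
]
round_two = [
    {"_input_hash": 1, "text": "Acme hired Ann.",
     "spans": [{"start": 0, "end": 4, "label": "E1"},
               {"start": 11, "end": 14, "label": "E3"}]},
]
combined = merge_spans(round_one, round_two)
```

Matching on `_input_hash` keeps the merge tied to the input text rather than to the order of the records, so the two exports do not need to be sorted the same way.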

I hope that answers your question. If you have any follow-up questions, please feel free to ask.

Thank you @Jette16 for the clarification above. It solves my problem fully.