Yes, if you're labelling manually, you usually want to label every instance of the term "gigigi", every time it occurs. Named entity recognition is context-dependent, so you want your data to include the entities in a variety of contexts. This is especially important for ambiguous entities.
Because labelling everything manually can be pretty tedious, Prodigy tries to make this easier with semi-automated recipes like ner.teach (which uses active learning) or ner.match (without active learning) that suggest candidates and let you say yes or no.
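For reference, a typical ner.teach call looks roughly like this – the dataset name, model, source file and label below are just placeholders for your own setup:

```
prodigy ner.teach your_dataset en_core_web_sm ./examples.jsonl --label GIGIGI
```

Prodigy will then stream in the model's suggestions for that label and you only have to accept or reject them.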
No, the default ner.manual recipe should stream in all examples as they come in and not skip any. By "received", do you mean that they annotated everything, but you only have 120 tasks in the dataset? Some possible explanations could be:
- Does your data contain any duplicate sentences? If so, Prodigy will filter those out by default (see the snippet after this list for a quick way to check).
- If the annotator refreshes the browser, the Prodigy app will request the next batch of tasks – and until you've received all answers and the session is over, Prodigy can't know whether a task needs to be sent out again. This thread has more details on this and suggestions for a solution.
- Always make sure to save your progress in the web app after you're done annotating. Otherwise, you might lose the last batch of annotations when you close the browser.
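If you want to double-check what's going on, you could export the dataset and count the annotations, and look for duplicate texts in your source data. Something along these lines should work – the dataset and file names are placeholders, and the duplicate check assumes a JSONL source with one {"text": ...} object per line and jq installed:

```
# Export the collected annotations and count them
prodigy db-out your_dataset > annotations.jsonl
wc -l annotations.jsonl

# List input texts that occur more than once in the source data
jq -r '.text' ./examples.jsonl | sort | uniq -d
```

If the number of exported annotations matches what the annotator saw minus the duplicates, that would explain the difference.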