By default, yes. Prodigy tries to make as few assumptions as possible about your input stream, the dataset you’re using and the annotations you’ve already collected. You won’t be asked the same question twice within a session, but by default each new session starts fresh, with no pre-existing state.
However, once the fix to this bug is pushed, you’ll be able to specify the --exclude argument on the ner.teach command and others to exclude annotations that are already present in one or more datasets. The task hash (based on the input data and the features you annotate, e.g. spans or labels) is used to determine whether a question has been asked before. So you could set it to exclude the current dataset, and you won’t be asked again about things you’ve already annotated in that set. You can also use it to make sure that evaluation examples don’t end up in your training set, and vice versa.
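
To illustrate the idea of hash-based exclusion, here is a minimal sketch in plain Python. It is not Prodigy's actual implementation: the `task_hash` and `filter_seen` helpers, the SHA-1 hashing and the choice of fields that go into the hash are all assumptions made for the example.

```python
import hashlib
import json


def task_hash(example):
    """Hypothetical task hash: input text plus the features being annotated.

    The annotator's answer is deliberately left out, so an accepted and a
    rejected version of the same question hash to the same value.
    """
    key = {
        "text": example.get("text"),
        "spans": example.get("spans", []),
        "label": example.get("label"),
    }
    payload = json.dumps(key, sort_keys=True).encode("utf8")
    return hashlib.sha1(payload).hexdigest()


def filter_seen(stream, excluded_datasets):
    """Yield only the examples whose task hash is not in the excluded datasets."""
    seen = {task_hash(eg) for dataset in excluded_datasets for eg in dataset}
    for eg in stream:
        h = task_hash(eg)
        if h not in seen:
            seen.add(h)  # also skips repeats within the same session
            yield eg
```

For example, if `annotated` is a list of examples already saved in a dataset, `list(filter_seen(stream, [annotated]))` returns only the questions that have not been asked yet. Passing an evaluation set in the same list would keep those examples out of the training annotations as well.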