I am trying to annotate some text with the custom recipe bert.ner.manual. After I make a couple of annotations, save them, stop the Prodigy server and restart it, it does not resume from where I left off but starts from scratch again.
I looked at the ner.manual recipe and can't find anything special about how it resumes, so how could I make the custom recipe "bert.ner.manual" behave the same way?
Thanks for your message and welcome to the Prodigy community!
It's a bit hard to know without seeing the recipe. However, it sounds like hashing isn't working correctly.
One simple way to test this would be to set "exclude_by": "input" in your configuration (either in prodigy.json, as an override, or in the config returned by your recipe). Already-annotated examples will then be excluded by their input hash instead of their task hash.
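To make the input vs. task distinction a bit more concrete, here is a minimal sketch in plain Python. It is not Prodigy's own hashing code: toy_hash and add_toy_hashes are hypothetical helpers, and the choice of "text" as the input key and "spans"/"label" as the task keys is an assumption made for illustration. The idea is only that two tasks over the same text share an input hash, while their task hashes differ as soon as what you're asking about the text differs.

```python
import hashlib
import json


def toy_hash(obj):
    """Stable short hash of a JSON-serialisable object (illustrative only)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:8]


def add_toy_hashes(task, input_keys=("text",), task_keys=("spans", "label")):
    """Attach a hash of the raw input and a hash of input plus task details.

    Hypothetical stand-in for a real hashing step; key names are assumptions.
    """
    task = dict(task)
    input_part = {k: task[k] for k in input_keys if k in task}
    task["_input_hash"] = toy_hash(input_part)
    task_part = {k: task[k] for k in task_keys if k in task}
    task["_task_hash"] = toy_hash([task["_input_hash"], task_part])
    return task


# Same text, but one task comes with a pre-set span and the other does not.
with_span = add_toy_hashes(
    {"text": "Berlin is nice", "spans": [{"start": 0, "end": 6, "label": "LOC"}]}
)
plain = add_toy_hashes({"text": "Berlin is nice"})

same_input = with_span["_input_hash"] == plain["_input_hash"]
same_task = with_span["_task_hash"] == plain["_task_hash"]
print(same_input, same_task)  # True False
```

So if the two ways of excluding give you different results, that usually points to the task-level details changing between runs while the text stays the same.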
But another possibility is that you're not even hashing to begin with.
In your bert.ner.manual recipe, are you loading your input source with the rehash=True and dedup=True options set?
The recipe did not use those rehash=True and dedup=True options, and adding them did indeed solve the issue. It now correctly starts where I left off.
I did not see those options in the default ner.manual recipe on GitHub, since it uses stream = JSONL(source), which I am guessing already does the filtering itself.
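For anyone landing here later, this is roughly the behaviour that hashing plus deduplication buys you, sketched in plain Python rather than with Prodigy itself. hashed_stream and text_hash are hypothetical names, hashing only the "text" field is an assumption, and the set of saved hashes stands in for whatever is already stored in the dataset.

```python
import hashlib


def text_hash(text):
    """Short stable hash of the raw text (illustrative only)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]


def hashed_stream(examples, already_annotated=frozenset()):
    """Yield examples with a hash attached, skipping duplicates and
    anything whose hash is already in the saved set."""
    seen = set(already_annotated)
    for eg in examples:
        h = text_hash(eg["text"])
        if h in seen:
            continue  # duplicate in the source, or annotated in an earlier session
        seen.add(h)
        yield {**eg, "_input_hash": h}


source = [{"text": "first"}, {"text": "second"}, {"text": "first"}, {"text": "third"}]

# First session: the repeated "first" is dropped.
session_one = list(hashed_stream(source))

# Pretend the first two examples were annotated and saved before stopping the server.
saved_hashes = {eg["_input_hash"] for eg in session_one[:2]}

# After a restart, only the example that was never saved comes back.
session_two = list(hashed_stream(source, already_annotated=saved_hashes))
print([eg["text"] for eg in session_two])  # ['third']
```

Without hashes on the incoming examples there is nothing to compare against the saved annotations, which matches the "starts from scratch" symptom described at the top of the thread.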