I'm trying to use the review recipe to look through and correct past annotations for a dataset. This works fine if I'm using a destination dataset that is separate from my source, but I want to save the review results back to the source. I tried running review with the same source and destination dataset, as shown below:
python -m prodigy review NER_titles_2 NER_titles_2
This runs ok the first time, but after saving a few review examples and running again, the following error pops up:
Conflicting view_id values in datasets Can't review annotations of 'review' (in dataset 'NER_titles_2') and 'ner_manual' (in previous examples)
Is what I'm trying to perform doable? If so how can I fix this error?
Hi! I'm not sure what your end goal is, but in general, you should always use a different dataset to save your final reviewed corpus – otherwise, you end up with duplicate and inconsistent data. Datasets in Prodigy are append-only by design so you never lose any data points (because overwriting your annotations by accident would be bad).
The review recipe will create a final copy of the examples with the versions it was based on and your final decision. So you typically want to have that in a separate dataset that you can then train from, not mixed in with your original annotations. (If you really want to, you can always remove your original annotations later – although I'm not sure that's really necessary.)
If you ended up with one dataset containing mixed annotations of different types, the easiest solution would be to export the data using db-out, remove the lines added by the review, and re-upload the data with db-in. You can then start again with a separate review dataset.
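The filtering step could be sketched roughly like this in Python. It assumes the export is a JSONL file (one JSON object per line) and that every saved example carries a "_view_id" key, with "review" on the lines added by the review recipe and "ner_manual" on the originals, as the error message above suggests. That key and its values are an assumption worth checking against your own export first, and the function names and file names here are made up for illustration.

```python
import json


def split_review_lines(lines, review_view_id="review"):
    """Split exported JSONL lines into (kept, removed) lists of examples.

    Assumption: examples saved by the review recipe have
    "_view_id" == review_view_id. Examples without that key are kept.
    """
    kept, removed = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        example = json.loads(line)
        if example.get("_view_id") == review_view_id:
            removed.append(example)
        else:
            kept.append(example)
    return kept, removed


def filter_export(in_path, out_path, review_view_id="review"):
    """Read a JSONL export, drop the review lines, write the rest back out.

    Returns (number kept, number removed) so the result can be sanity-checked.
    """
    with open(in_path, encoding="utf8") as f:
        kept, removed = split_review_lines(f, review_view_id)
    with open(out_path, "w", encoding="utf8") as f:
        for example in kept:
            f.write(json.dumps(example) + "\n")
    return len(kept), len(removed)
```

For example, after exporting with python -m prodigy db-out NER_titles_2 > export.jsonl, calling filter_export("export.jsonl", "cleaned.jsonl") would write only the non-review lines to cleaned.jsonl. Because datasets are append-only, the cleaned file should probably go into a fresh dataset with db-in (the dataset name is up to you) rather than back on top of the mixed one.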
I got as far as the "remove the lines added by the review and re-upload the data" step, but I couldn't tell which lines in the exported data were added by the review. What do these review lines look like?