Correction of manually labeled relation

MoJo · October 21, 2021, 12:58pm

Hi!

I have the following problem:
I started to manually label a dataset w.r.t named entities and relations. My dataset consists of ~5000 sentences, of which I have labeled 350 up to this point.

Unfortunately, I made a mistake during the labeling process:
In one document, I labeled a relation between two words but forgot to label one of the words as an entity. This is a problem for downstream tasks. Is it possible to correct this? I loaded the already labeled data in Python, so I know, among other things, the _task_hash of this document/sentence. I am looking for a recipe that allows me to correct the labels of a specific document given its _task_hash.

Thanks a lot for your help!

MoJo · October 21, 2021, 2:11pm

I found a workaround:

1. Export the dataset as a JSONL file.
2. Delete the specific row from the file.
3. Delete the dataset and create it again with "db-in" and the new JSONL file.
4. Start the browser application; the specific document should then appear again.

However, I think this is not a nice way to do it. A recipe which makes it possible to adjust existing labels would be great!
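
The "delete the specific row" step of this workaround can be scripted rather than done by hand. The following is only a minimal sketch: it assumes the dataset was exported to a JSONL file (for example with `prodigy db-out`), that each row carries a `_task_hash` field as described above, and the function name, file names and hash value are hypothetical placeholders.

```python
import json


def drop_tasks(in_path, out_path, bad_hashes):
    """Copy a JSONL export, skipping rows whose _task_hash is in bad_hashes.

    Returns a (kept, dropped) tuple so the result can be sanity-checked
    before the filtered file is imported again.
    """
    bad = set(bad_hashes)
    kept = dropped = 0
    with open(in_path, encoding="utf8") as src, \
            open(out_path, "w", encoding="utf8") as dst:
        for line in src:
            if not line.strip():
                continue  # ignore blank lines in the export
            task = json.loads(line)
            if task.get("_task_hash") in bad:
                dropped += 1
                continue
            dst.write(json.dumps(task, ensure_ascii=False) + "\n")
            kept += 1
    return kept, dropped


# Hypothetical usage (file names and hash are placeholders):
# kept, dropped = drop_tasks("export.jsonl", "export_fixed.jsonl", [1234567890])
```

If `dropped` matches the number of hashes passed in, the filtered file can be loaded into a fresh dataset with `db-in` as in step 3.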

ines · October 22, 2021, 11:31am

Hi! Yes, this definitely works.

Alternatively, if you want to go through all of your annotations and adjust them later (e.g. if you changed your label scheme or just want to re-annotate), you can also set dataset:name_of_your_dataset as the input source instead of a file, and the dataset will be queued up again. For example:

prodigy rel.manual new_dataset blank:en dataset:your_previous_dataset ...

Just make sure to save the result to a new dataset so you don't end up with duplicates in the same set.
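
Before re-queuing everything, it can help to check which tasks actually contain a relation that points at an unlabeled word. The sketch below works on exported tasks loaded as Python dicts. It rests on an assumption about the data layout that should be verified against your own export: that each entry in `relations` has `head_span` and `child_span` dicts whose `label` is empty when the token was not marked as an entity. The function name is hypothetical.

```python
def relations_missing_entities(task):
    """Return the relations in a task whose head or child has no entity label.

    Assumes (unverified) that every relation dict holds "head_span" and
    "child_span" dicts, each with a "label" key that is None or missing
    when the word was not annotated as an entity.
    """
    missing = []
    for rel in task.get("relations", []):
        for side in ("head_span", "child_span"):
            span = rel.get(side) or {}
            if not span.get("label"):
                missing.append(rel)
                break  # report each relation at most once
    return missing
```

Collecting the `_task_hash` of every task for which this returns a non-empty list gives the set of examples that need another pass.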

MoJo · October 25, 2021, 1:28pm

Hi @ines ,

thank you very much for your answer!
That's a really useful hint.
