when to use db-in vs ner.manual

ning · October 1, 2020, 10:53pm

Hi,

I am a little confused about which scenarios call for "db-in" and which call for "ner.manual". Can you please help?

1. Start annotations from scratch with a patterns file.
--> I ran prodigy db-in dev ./raw_dev.jsonl, then prodigy ner.manual dev en_custom ./raw_dev.jsonl --label LOC --patterns patterns.jsonl, and I got the message "No tasks available".
2. Increase the number of examples to be annotated.
--> Do I use db-in or ner.manual with the same dataset name?
3. Edit annotated examples.
--> After I use db-out to export the annotated examples, if I want to edit or review them, do I use db-in or ner.manual with a new dataset name?

Thank you.

ines · October 2, 2020, 9:17am

Hi! The db-in command is only intended to import existing annotations into your Prodigy datasets – for example, if you've already labelled data with some other process and want to combine it with new annotations or if you want to re-import annotations to a new dataset.

If you just want to annotate data, you do not have to import anything upfront – you can just start the server with your input data and Prodigy will stream it in, let you annotate and save the collected annotations to the database.

The reason you're seeing "No tasks available" after importing the data is that Prodigy will skip questions that are already in the dataset. Since you've already imported the raw data to the dataset of annotations, there's nothing new in the data because it's all in the database already.
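To illustrate the skipping behaviour described above, here is a minimal sketch in plain Python. It does not use Prodigy's actual code or API: the helper names input_hash and filter_seen and the sample texts are hypothetical, and the hashing is a simplified stand-in for however Prodigy identifies examples internally.

```python
import hashlib


def input_hash(example):
    """Stable hash of the example text (a stand-in, not Prodigy's own hashing)."""
    return hashlib.sha256(example["text"].encode("utf-8")).hexdigest()


def filter_seen(stream, seen_hashes):
    """Yield only the examples whose hash is not already stored in the dataset."""
    for example in stream:
        if input_hash(example) not in seen_hashes:
            yield example


# Hypothetical raw input, standing in for the contents of raw_dev.jsonl.
raw = [{"text": "Berlin is in Germany."}, {"text": "I flew to Lisbon."}]

# Simulates importing the raw file with db-in first: every example is now "in the dataset".
dataset_after_db_in = {input_hash(eg) for eg in raw}
queue_after_import = list(filter_seen(raw, dataset_after_db_in))

# Simulates starting the annotation recipe on an empty dataset instead.
queue_fresh = list(filter_seen(raw, set()))

print(len(queue_after_import))  # 0
print(len(queue_fresh))  # 2
```

With the raw data imported first, the queue is empty, which corresponds to the "No tasks available" message. With an empty dataset, both examples are queued for annotation.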

For editing exported annotations, it depends on what you want to do with the data: if you want to re-annotate the exported JSON examples to correct them, you can load them back into ner.manual. If you just want to add them to a new dataset so you can use Prodigy to train from them, or so you can add more examples to them later, you can use db-in with a new dataset name.
