entity labeling

ines · December 20, 2017, 1:25pm

Thanks for your questions!

For the first release, we've mostly been focusing on the capabilities of Prodigy as a developer tool and reengineering traditional annotation processes to help developers iterate on the data and run experiments faster. This means that the current workflows aim at making the annotator do as little as possible and using the interface to focus on one decision at a time to move as quickly as possible. (You can see our latest NER video tutorial for an example of a development workflow like this. It also shows the use of word vectors and terminology lists to pre-label entities.)

I totally understand your process, though – in some cases, it definitely makes sense to work through an entire document manually and at once, and label everything that needs to be labelled. This is currently not possible in Prodigy. However, we are working on new interfaces for those types of use cases, including text, images (object detection and segmentation), as well as potentially audio files.

I'm not sure I understand the question correctly – do you mean importing raw text data to the database, but from a directory? Currently, prodigy db-in only works for single files. But you can easily process a whole directory using a simple shell script, or run the function from Python:

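from pathlib import Path
from prodigy.__main__ import db_in
# 'data_dir' is a placeholder for the directory that holds your files
for path in sorted(Path('data_dir').iterdir()):
    if path.is_file():
        db_in('my_dataset', str(path))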
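
If the directory also contains files you don't want to import, the loop above can be wrapped in a small helper that filters by file extension. This is only a rough sketch: the helper name `import_directory`, its parameters and the list of extensions are made up for illustration and are not part of Prodigy. The import function is passed in as an argument, so you can try the helper with a stand-in before pointing it at your real dataset.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Extensions to pick up; this set is an assumption, adjust it to your data.
DEFAULT_SUFFIXES = (".jsonl", ".json", ".csv", ".txt")


def import_directory(
    dataset: str,
    directory: str,
    import_file: Callable[[str, str], None],
    suffixes: Iterable[str] = DEFAULT_SUFFIXES,
) -> List[str]:
    """Call import_file(dataset, path) for each matching file in directory.

    Returns the paths that were passed to import_file, in sorted order.
    """
    allowed = {s.lower() for s in suffixes}
    imported = []
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and files with other extensions.
        if path.is_file() and path.suffix.lower() in allowed:
            import_file(dataset, str(path))
            imported.append(str(path))
    return imported
```

With Prodigy installed, you would pass in the same function as in the snippet above, e.g. `import_directory('my_dataset', 'data_dir', db_in)` after `from prodigy.__main__ import db_in`.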

Yes, this is exactly what we had in mind for the Prodigy Annotation Manager. We're currently planning this as a Prodigy add-on, i.e. a separate package you can plug into your Prodigy workflow. It would extend the app with more functionality and an annotation management console that lets you orchestrate larger annotation projects, handle quality control etc. We don't have a timeline for this yet, but it's definitely something we've been thinking about a lot and experimenting with.
