Thanks for your questions!
For the first release, we've mostly been focusing on the capabilities of Prodigy as a developer tool and reengineering traditional annotation processes to help developers iterate on the data and run experiments faster. This means that the current workflows aim at making the annotator do as little as possible and using the interface to focus on one decision at a time to move as quickly as possible. (You can see our latest NER video tutorial for an example of a development workflow like this. It also shows the use of word vectors and terminology lists to pre-label entities.)
I totally understand your process, though – in some cases, it definitely makes sense to work through an entire document manually and at once, and label everything that needs to be labelled. This is currently not possible in Prodigy. However, we are working on new interfaces for those types of use cases, including text, images (object detection and segmentation), as well as potentially audio files.
I'm not sure I understand the question correctly – do you mean importing raw text data to the database, but from a directory? Currently, prodigy db-in only works for single files. But you can easily process a whole directory using a simple shell script, or run the function from Python:
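from pathlib import Path
from prodigy.__main__ import db_in

# Import every file in the directory (placeholder path) into the same dataset
for path in Path('/path/to/directory').iterdir():
    db_in('my_dataset', str(path))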
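If the directory also contains subfolders or unrelated files, a slightly more defensive version of the loop above might look like the sketch below. Note that import_directory is a hypothetical helper, not part of Prodigy, and the list of file extensions is only an assumption, so adjust it to whatever your files actually are. The import function is passed in as an argument, so you can try the helper out with a dummy function before pointing it at db_in.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def import_directory(
    dataset: str,
    directory: str,
    importer: Callable[[str, str], None],
    suffixes: Iterable[str] = (".jsonl", ".json", ".txt"),  # assumed extensions
) -> List[str]:
    """Call importer(dataset, file_path) for each matching file, in sorted order.

    Returns the names of the files that were passed to the importer.
    """
    allowed = {s.lower() for s in suffixes}
    imported = []
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and files with other extensions
        if not path.is_file() or path.suffix.lower() not in allowed:
            continue
        importer(dataset, str(path))
        imported.append(path.name)
    return imported


# Usage with Prodigy (import path as in the snippet above, placeholder directory):
# from prodigy.__main__ import db_in
# import_directory("my_dataset", "/path/to/directory", db_in)
```

Sorting the files just makes the import order predictable between runs, and the returned list gives you a quick way to check which files were actually picked up.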
Yes, this is exactly what we had in mind for the Prodigy Annotation Manager. We're currently planning this as a Prodigy add-on, i.e. a separate package that plugs into your Prodigy workflow and extends the app with more functionality, including an annotation management console that lets you orchestrate larger annotation projects, handle quality control etc. We don't have a timeline for this yet, but it's definitely something we've been thinking about a lot and experimenting with.