This roadmap should give you an overview of what’s next for Prodigy and other ideas worth exploring. Feel free to ask questions, or submit requests and suggestions in the comments.
v1.5.0 allow plugging in recipes, databases and loaders via Python entry points
v1.5.0 validate stream using JSON schemas before starting the server and while Prodigy is running, and output detailed messages if tasks don’t have the expected format
add built-in solutions for pseudo-rehearsal and data augmentation
add more features to manual image annotation interface: editing shapes, undo/redo, fully tested touch screen support and more
built-in wrappers for scikit-learn, PyTorch and TensorFlow / Keras: those will be available via spaCy’s machine learning library Thinc. You can already test the experimental PyTorch integration in the latest release.
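To make the stream validation item above more concrete, here is a minimal sketch of checking tasks against a schema as they are consumed. It is not Prodigy's implementation: `TEXT_TASK_SCHEMA`, `find_errors` and `validate_stream` are hypothetical names, and the checker only understands a tiny subset of JSON Schema (`required` and a `type` per property). A full validator, such as the third-party `jsonschema` package, would take its place in a real setup.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical schema for a text task; the schemas Prodigy ships may differ.
TEXT_TASK_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["text"],
    "properties": {
        "text": {"type": "string"},
        "meta": {"type": "object"},
    },
}

# The only JSON Schema type names this sketch handles, mapped to Python types.
JSON_TYPES = {"string": str, "object": dict, "array": list}


def find_errors(task: Any, schema: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one task (empty list if it is valid)."""
    if not isinstance(task, dict):
        return [f"task must be an object, got {type(task).__name__}"]
    errors = []
    for key in schema.get("required", []):
        if key not in task:
            errors.append(f"missing required field '{key}'")
    for key, rules in schema.get("properties", {}).items():
        if key in task and not isinstance(task[key], JSON_TYPES[rules["type"]]):
            found = type(task[key]).__name__
            errors.append(f"field '{key}' should be {rules['type']}, got {found}")
    return errors


def validate_stream(stream: Iterable[Any], schema: Dict[str, Any]) -> Iterator[Any]:
    """Yield tasks unchanged, raising a detailed error at the first invalid one."""
    for index, task in enumerate(stream):
        errors = find_errors(task, schema)
        if errors:
            raise ValueError(f"Invalid task at position {index}: " + "; ".join(errors))
        yield task
```

Because `validate_stream` is a generator, wrapping a stream in it checks each task lazily while annotation is running. Checking before the server starts would mean running `find_errors` on the first few tasks up front.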
Prodigy Annotation Manager
Status: active development
Public beta: September 2018 (earlier for existing Prodigy users)
set up large annotation projects with multiple annotators, perform quality control and check concordance
add-on product and library that integrates with Prodigy, orchestrates recipes and tasks, and manages your cluster
admin console for settings, statistics and annotator management
full data privacy, support for internal and external networks
Database and Corpus Management
Status: planning
separate open-source library for managing and reconciling annotation layers
integrates seamlessly with Prodigy – but can also be used standalone!
out-of-the-box support for training spaCy models and adapters for other libraries
possible add-on: web UI for viewing data and annotations
Documentation
v1.4.0 “Prodigy Cookbook” with quick solutions for various problems
In a recent release of Thinc, we quietly shipped the first version of a PyTorch wrapper so we can start testing it. The wrappers will be open-source so they can evolve and be updated quickly. It also means that Prodigy won't have to ship with a bunch of super specific code that may have to change often, and it allows you to reuse the wrappers across applications without having to depend on Prodigy.
Do you need help with the “Prodigy Annotation Manager”, for example beta testing or developing some parts? I am asking because we are launching an annotation project for our medical data. I have tested Prodigy by adding two entities, and it all worked out. I can generate my models etc., so I am ready to move to the next step.
@idealley @plusepsilon Thanks a lot! We’re not quite at a stage yet where it’s ready to be tested by others, but there’ll definitely be an alpha/beta program exclusively for existing users.
@ines Sorry to keep bugging you about the Annotation Manager (really excited!), but do you have a sense of how close you are to a private beta (a week, a month)? I’ve been putting off building an internal solution, but we have a use case that likely can’t wait until the end of September.
@steve Ah, sorry, I totally missed your comment! We actually just started our Prodigy Annotation Manager sprint to finish up the app, and we’d obviously love to get it out to people as soon as possible. We’ll hopefully have an update and an announcement to make about the results this week!
However, I still wouldn’t be comfortable making any promises at this point, especially if you have to meet your deadlines. So in the meantime, you might want to check out @andy’s open-source multi-user extension. The annotation manager will be fully compatible with Prodigy’s existing database formats, so you’ll always be able to make the switch later.