Prodigy Roadmap


(Ines Montani) #1

This roadmap should give you an overview of what’s next for Prodigy and other ideas worth exploring. Feel free to ask questions, or submit requests and suggestions in the comments.

:clipboard: See here for the changelog.

Recipes, Components and Interfaces

  • :white_check_mark: v1.5.0 manual image annotation: image segments (square and polygon shapes)
  • :white_check_mark: v1.5.0 allow plugging in recipes, databases and loaders via Python entry points
  • :white_check_mark: v1.5.0 validate stream using JSON schemas before starting the server and while Prodigy is running, and output detailed messages if tasks don’t have the expected format
  • add built-in solutions for pseudo-rehearsal and data augmentation
  • add more features to manual image annotation interface: editing shapes, undo/redo, fully tested touch screen support and more
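The entry-point mechanism mentioned above means an installed package can register its own recipes without Prodigy having to import it explicitly. As a rough sketch, such a package's `setup.py` might declare something like the following (the group name `prodigy_recipes` and the module/recipe names are illustrative assumptions, not guaranteed to match Prodigy's actual conventions):

```python
from setuptools import setup

setup(
    name="prodigy-custom-recipes",
    version="0.1.0",
    py_modules=["my_recipes"],
    entry_points={
        # Hypothetical group name: the idea is that Prodigy scans this
        # group at startup and exposes each entry as a loadable recipe.
        "prodigy_recipes": [
            "my_recipe = my_recipes:my_recipe",
        ],
    },
)
```

Once such a package is installed, the entries are discoverable via standard Python packaging metadata, so no Prodigy-specific plugin registry is needed.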


  • add support for sense2vec models
  • :soon: built-in wrappers for scikit-learn, PyTorch and TensorFlow / Keras: these will be available via spaCy’s machine learning library Thinc. You can already test the experimental PyTorch integration in the latest release.
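The stream validation described above can be illustrated with a minimal, hand-rolled check. Prodigy's real validation uses JSON schemas internally; the `validate_tasks` helper and the required field shown here are assumptions for illustration only:

```python
import json

def validate_tasks(stream, required_fields=("text",)):
    """Yield tasks from a JSONL-style stream, raising a descriptive
    error as soon as a task is missing an expected field."""
    for i, line in enumerate(stream):
        task = json.loads(line)
        for field in required_fields:
            if field not in task:
                raise ValueError(
                    f"Task {i} is missing required field '{field}': {task}"
                )
        yield task

# Example: the second task lacks a "text" field, so validation fails early
# with a message pointing at the offending task.
stream = ['{"text": "Hello world", "label": "GREETING"}',
          '{"label": "GREETING"}']
tasks = validate_tasks(stream)
print(next(tasks)["text"])   # the first task passes validation
try:
    next(tasks)
except ValueError as e:
    print("Invalid task:", e)
```

Because the stream is a generator, well-formed tasks still flow through lazily; a bad task stops the stream with a detailed message instead of failing silently in the UI.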

Prodigy Annotation Manager

Status: :eight_spoked_asterisk:️ active development
Public beta: September 2018 (earlier for existing Prodigy users)

  • set up large annotation projects with multiple annotators, perform quality control and check concordance
  • add-on product and library that integrates with Prodigy, orchestrates recipes and tasks, and manages your cluster
  • admin console for settings, statistics and annotator management
  • full data privacy, support for internal and external networks

Database and Corpus Management

Status: :eight_pointed_black_star:️ planning

  • separate open-source library for managing and reconciling annotation layers
  • integrates seamlessly with Prodigy – but can also be used standalone!
  • out-of-the-box support for training spaCy models and adapters for other libraries
  • possible add-on: web UI for viewing data and annotations


Documentation and Tutorials

  • :white_check_mark: v1.4.0 “Prodigy Cookbook” with quick solutions for various problems
  • produce more end-to-end video tutorials like our text classification with Prodigy video
    • Improving spaCy’s NER model on your data
    • Manual NER annotation
    • Image segmentation and object detection
    • A/B evaluation
    • Data curation and manual annotation (e.g. image selection and preference)


When could we expect the release of the built-in wrappers for PyTorch and TensorFlow/Keras?

(Ines Montani) #5

In a recent release of Thinc, we quietly shipped the first version of a PyTorch wrapper so we can start testing it :slightly_smiling_face: The wrappers will be open-source so they can evolve and be updated quickly. It also means that Prodigy won’t have to ship with a bunch of super specific code that may have to change often, and it allows you to reuse them across applications, without having to depend on Prodigy.
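The design goal here is that a wrapper exposes a framework-specific model through a small, uniform interface, so the host application never needs framework-specific code. A toy sketch of that pattern (the class and method names are illustrative, not Thinc's actual API):

```python
class FrameworkWrapper:
    """Minimal sketch of a model wrapper: the host application only
    sees predict() and update(), never the underlying framework."""

    def __init__(self, model, backward_fn):
        self._model = model            # e.g. a PyTorch nn.Module
        self._backward = backward_fn   # framework-specific update logic

    def predict(self, inputs):
        return self._model(inputs)

    def update(self, inputs, gradients):
        # Delegate to the framework's own backprop/optimizer step.
        return self._backward(inputs, gradients)

# Toy "model" that doubles its input; the "backward" step just records
# the gradient it was given.
updates = []
wrapped = FrameworkWrapper(lambda x: [2 * v for v in x],
                           lambda x, g: updates.append((x, g)))
print(wrapped.predict([1, 2, 3]))  # [2, 4, 6]
wrapped.update([1, 2, 3], [0.1, 0.1, 0.1])
```

Because the interface is framework-agnostic, swapping PyTorch for TensorFlow/Keras only means supplying a different model and backward function, which is why the wrappers can live in Thinc and be reused outside Prodigy.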

(Samuel Pouyt) #6

For the “Prodigy Annotation Manager”, do you need help – beta testing, developing some parts? I’m asking because we’re launching an annotation project for our medical data. I have tested Prodigy by adding two entities, and it all worked out; I can generate my models etc. Therefore I am ready to move to the next step :wink:


(Motoki Wu) #7

I’d be happy to be an early tester of the annotation manager as well.

We have a multi-instance Prodigy set-up using pm2, but it’s very clunky :slight_smile:

(Ines Montani) #8

@idealley @plusepsilon Thanks a lot! We’re not yet at a stage where it’s ready to be tested by others – but there’ll definitely be an alpha/beta program exclusively for existing users :blush: