Prodigy Roadmap

enhancement
meta

(Ines Montani) #1

This roadmap should give you an overview of what’s next for Prodigy and other ideas worth exploring. Feel free to ask questions, or submit requests and suggestions in the comments.

:clipboard: See here for the changelog.


Recipes, Components and Interfaces

  • :white_check_mark: v1.5.0 manual image annotation: image segments (square and polygon shapes)
  • :white_check_mark: v1.5.0 allow plugging in recipes, databases and loaders via Python entry points
  • :white_check_mark: v1.5.0 validate stream using JSON schemas before starting the server and while Prodigy is running, and output detailed messages if tasks don’t have the expected format
  • add built-in solutions for pseudo-rehearsal and data augmentation
  • add more features to manual image annotation interface: editing shapes, undo/redo, fully tested touch screen support and more

Models

  • add support for sense2vec models
  • :soon: built-in wrappers for scikit-learn, PyTorch and TensorFlow / Keras: those will be available via spaCy’s machine learning library Thinc. You can already test the experimental PyTorch integration in the latest release.

Prodigy Annotation Manager

Status: :eight_spoked_asterisk:️ active development
Public beta: September 2018 (earlier for existing Prodigy users)

  • set up large annotation projects with multiple annotators, perform quality control and check concordance
  • add-on product and library that integrates with Prodigy, orchestrates recipes and tasks, and manages your cluster
  • admin console for settings, statistics and annotator management
  • full data privacy, support for internal and external networks

Database and Corpus Management

Status: :eight_pointed_black_star:️ planning

  • separate open-source library for managing and reconciling annotation layers
  • integrates seamlessly with Prodigy – but can also be used standalone!
  • out-of-the-box support for training spaCy models and adapters for other libraries
  • possible add-on: web UI for viewing data and annotations

Documentation

  • :white_check_mark: v1.4.0 “Prodigy Cookbook” with quick solutions for various problems
  • produce more end-to-end video tutorials like our text classification with Prodigy video
    • Improving spaCy’s NER model on your data
    • Manual NER annotation
    • Image segmentation and object detection
    • A/B evaluation
    • Data curation and manual annotation (e.g. image selection and preference)

#4

When could we expect the release of the built-in wrappers for PyTorch and TensorFlow/Keras?


(Ines Montani) #5

In a recent release of Thinc, we quietly shipped the first version of a PyTorch wrapper so we can start testing it :slightly_smiling_face: The wrappers will be open-source so they can evolve and be updated quickly. It also means that Prodigy won’t have to ship with a bunch of super specific code that may have to change often, and it allows you to reuse them across applications, without having to depend on Prodigy.


(Samuel Pouyt) #6

Do you need help with the “Prodigy Annotation Manager”, for example beta testing or developing some parts? I am asking because we are launching an annotation project for our medical data. I have tested Prodigy by adding two entity types, and it all worked out. I can generate my models, etc., so I am ready to move to the next step :wink:

Sam


(Motoki Wu) #7

I’d be happy to test the annotation manager early as well.

We have a multi-instance Prodigy set-up using pm2, but it’s very clunky :slight_smile:


(Ines Montani) #8

@idealley @plusepsilon Thanks a lot! We’re not quite yet at a stage where it’s ready to be tested by others – but there’ll definitely be an alpha/beta program exclusively for existing users :blush:


(Steve Brown) #9

@ines Sorry to keep bugging you about the Annotation Manager (really excited!), but do you have a sense of how close you are to a private beta (a week, a month)? I’ve been putting off building an internal solution, but we have a use case that likely can’t wait until the end of September.


(Ines Montani) #10

@steve Ah, sorry, I totally missed your comment! We actually just started our Prodigy Annotation Manager sprint to finish up the app, and we’d obviously love to get it out to people as soon as possible. We’ll hopefully have an update and an announcement to make about the results this week!

However, I still wouldn’t be comfortable making any promises at this point, especially if you have to meet your deadlines. So in the meantime, you might want to check out @andy’s open-source multi-user extension. The annotation manager will be fully compatible with Prodigy’s existing database formats, so you’ll always be able to make the switch later.


(Ines Montani) #11

@idealley @plusepsilon @steve Just posted an update here :tada: