Best way to annotate data with Prodigy for a joint intent–slot filling (IOB format) TensorFlow model

Background: We are currently developing a chatbot for basic goal-oriented dialogue. We found a good TensorFlow model that does joint slot filling and intent detection from one of the recent conferences, which we are planning to use. We have a nice training/deployment pipeline set up for this, which we tested on the ATIS dataset and which overall works smoothly. Now we just need data to train the model for our specific use case. As our chatbot is not released, we are unfortunately in a cold-start situation regarding data. We are getting people to write out examples of queries that users might write, but we still need to annotate these in IOB format and assign an intent, similar to ATIS. I thought Prodigy might be a good fit for this.

Question(s):
(1) I’m somewhat confused about what I should call to start Prodigy on this task. Assuming we export these examples to a .jsonl file, how would we start annotating it? Is there a way to annotate for both slots and intent at the same time? If not, what is the best way to approach this?
(2) Secondly, is it at all possible to do active learning with Prodigy on the TensorFlow model itself? That would be ideal, as it would enable faster annotation. We currently do have a function to restore the model from a saved checkpoint.
(3)

Thanks for the help.

Hi Isaac,

To answer your specific questions:

Is there a way to annotate for both slot and intent at the same time? If not what is the best way to approach this?

There isn’t really a way to do both at the same time, although you could make the slot labels intent-specific, so that given the slots, the intents are unambiguous.
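As a concrete sketch of that idea, you could prefix each IOB tag with the intent name, so the intent is recoverable from the slots alone. The intent and slot names below are made up for illustration, not taken from your schema:

```python
def intent_specific_tags(intent, iob_tags):
    """Prefix each non-O IOB tag with the intent name, so that e.g.
    B-city under a book_flight intent becomes B-book_flight.city.
    Given tags in this scheme, the intent is unambiguous."""
    out = []
    for tag in iob_tags:
        if tag == "O":
            out.append(tag)
        else:
            prefix, label = tag.split("-", 1)  # "B" or "I", then the slot name
            out.append(f"{prefix}-{intent}.{label}")
    return out

print(intent_specific_tags("book_flight", ["O", "B-city", "I-city", "O"]))
# ['O', 'B-book_flight.city', 'I-book_flight.city', 'O']
```

The downside is a larger tag set for the model, which is why tagging intents first (below) is usually the easier route.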

Why not tag the intents first though? Applying the intent tags will be very quick, and then you can apply some automated logic to pre-fill some of the slots, based on what the intent is. At a minimum, it will be very helpful to limit the labels during annotation to only the ones available for that intent. It’s much easier to code this sort of logic offline, rather than trying to put it into the interface.
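That offline logic can be very simple. Here's a minimal sketch, where the intent names, label sets, and task keys are all hypothetical, of a generator that attaches a per-intent label set to each task dict before it reaches the annotation interface:

```python
# Hypothetical mapping from intent to the slot labels valid for it.
LABELS_BY_INTENT = {
    "book_flight": ["from_city", "to_city", "depart_date"],
    "flight_status": ["airline", "flight_number"],
}

def add_label_config(examples):
    """Attach the per-intent label set to each task dict, so the
    annotation UI only offers labels relevant to that intent."""
    for eg in examples:
        intent = eg["intent"]  # assumes intents were tagged in a first pass
        eg["config"] = {"labels": LABELS_BY_INTENT.get(intent, [])}
        yield eg

tasks = [{"text": "when does AA100 land", "intent": "flight_status"}]
out = list(add_label_config(tasks))
print(out[0]["config"]["labels"])  # ['airline', 'flight_number']
```

The same loop is a natural place to pre-fill "spans" on the task for any slots you can guess deterministically (dates, airport codes, etc.), so the annotator only corrects rather than tags from scratch.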

(2) Secondly, is it at all possible to do active learning using prodigy on the Tensorflow model itself?

In theory, yes: you just need to return an update callback from the recipe. You can see examples of the recipe scripts here: https://github.com/explosion/prodigy-recipes

Are you sure active learning can help in your context, though? It sounds like you’ll always be annotating and training from all of your data.

I think you should