Background: We are currently developing a chatbot for basic goal-oriented dialogue. We found a good TensorFlow model for joint slot filling and intent detection from one of the recent conferences, and we plan to use it. We have a nice training/deployment pipeline set up for it, which we tested on the ATIS dataset and which works smoothly overall. Now we just need data to train the model for our specific use case. As our chatbot is not released yet, we are unfortunately in a cold-start situation regarding data. We are getting people to write out examples of queries that users might write, but we still need to annotate these in IOB format and assign an intent to each, similar to ATIS. I thought Prodigy might be good at doing this.
Question(s):
(1) I’m somewhat confused about what I should call to start Prodigy on this task. Assuming we export these examples to a .jsonl file, how would we start annotating it? Is there a way to annotate for both slot and intent at the same time? If not, what is the best way to approach this?
(2) Secondly, is it at all possible to do active learning using Prodigy on the TensorFlow model itself? That would be ideal, as it would enable faster annotation. We currently have a function to restore the model from the saved checkpoint.
Is there a way to annotate for both slot and intent at the same time? If not, what is the best way to approach this?
There isn't really a way to do both at the same time, although you could make the slot labels intent-specific, so that given the slots, the intents are unambiguous.
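To illustrate the intent-specific labels idea, here is a minimal sketch that recovers the intent from the slot tags alone. The `intent.slot` naming scheme (for example `flight.fromloc`) and the function name are assumptions made up for this example, not anything Prodigy or ATIS prescribes.

```python
def intent_from_slots(tags):
    """Infer the intent from intent-prefixed slot tags.

    Assumes a hypothetical naming scheme such as 'B-flight.fromloc',
    where the part before the dot names the intent.
    """
    intents = set()
    for tag in tags:
        # Drop the IOB prefix if there is one.
        if tag[:2] in ("B-", "I-"):
            tag = tag[2:]
        # 'O' and any label without a dot carry no intent information.
        if "." in tag:
            intents.add(tag.split(".", 1)[0])
    # Only return an intent when the slots agree on exactly one.
    if len(intents) == 1:
        return intents.pop()
    return None
```

If the function returns `None`, the example either has no slots or mixes slots from several intents, so it would still need an explicit intent label.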
Why not tag the intents first though? Applying the intent tags will be very quick, and then you can apply some automated logic to pre-fill some of the slots, based on what the intent is. At a minimum, it will be very helpful to limit the labels during annotation to only the ones available for that intent. It's much easier to code this sort of logic offline, rather than trying to put it into the interface.
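As a rough sketch of that offline logic, the snippet below groups intent-annotated examples and attaches the slot labels allowed for each intent. The `intent` field, the `SLOTS_BY_INTENT` mapping, and the label names are all hypothetical; substitute whatever your exported annotations and label scheme actually look like.

```python
from collections import defaultdict

# Hypothetical mapping from each intent to the slot labels that apply to it.
SLOTS_BY_INTENT = {
    "flight": ["fromloc", "toloc", "depart_date"],
    "airfare": ["fromloc", "toloc", "class_type"],
}


def group_by_intent(records):
    """Group intent-annotated records into one slot-annotation batch per intent.

    Each record is assumed to be a dict with a 'text' key and an
    'intent' key (the latter is an assumption about your export format).
    """
    texts_by_intent = defaultdict(list)
    for record in records:
        intent = record["intent"]
        # Skip intents that have no slot labels defined.
        if intent in SLOTS_BY_INTENT:
            texts_by_intent[intent].append(record["text"])
    return {
        intent: {"labels": SLOTS_BY_INTENT[intent], "texts": texts}
        for intent, texts in texts_by_intent.items()
    }
```

Each batch could then be written to its own .jsonl file and annotated for slots separately, with only that intent's labels on offer.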
(2) Secondly, is it at all possible to do active learning using Prodigy on the TensorFlow model itself?