I would like to use the choice interface with pre-defined “accepted” values. Is this possible? I thought it was, but I cannot find it in the docs.
The idea is to have a sort of “make-gold” recipe for a multi-label classification task.
What exactly do you mean by “pre-defined accepted values”? Do you have an example?
You can set "choice_auto_accept": true in your prodigy.json (or the recipe config) to automatically accept the selected answer. But I’m not sure if that’s what you mean?
What I mean is that when using the choice interface, I would like to display some choices already “clicked”.
The use case would be similar to ner.make-gold but for classification: a trained model predicts labels (choices) and the user just has to correct the predicted labels.
For instance, for a task with 3 choices (A, B, C), if my model predicts label A, I’d like to send something like this:
Ah, okay, got it. In general, the best strategy to achieve these kinds of things in Prodigy is to just feed it input of the same format as the expected output. That’s also how it works for NER and other interfaces. For the choice interface, this means including an "accept" key with a list of accepted options, just like the annotation format you expect it to produce in the database. For example:
"accept": [1, 4]
The "choice" interface currently doesn’t “officially” support this, so it likely won’t update the very first task when you load the app. However, from the second task on, it should work! (This sounds strange, but it’s just because Prodigy doesn’t check for pre-defined choice options and only keeps a record of them once the interface is updated.)
Supporting this workflow “officially” won’t be a problem to implement, so I’d be happy to include this in a future release.
Nice, I just tried it and it works great, except on the first example, as you said.
I don’t know yet if this workflow will be the best, but if it’s not too much work, that would be great :). We are trying different workflows to better annotate multi-label examples. The thing is that if the number of labels is too big, the choice interface could become less efficient. We are also experimenting with a recipe that, for every label, goes over the examples that are missing this label and then simply uses the classification interface. We'll see what works best.
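To make the idea above a bit more concrete, here is a minimal sketch of how a stream of tasks could be pre-populated from model predictions. The names `add_predicted_choices` and `dummy_predict` and the 0.5 threshold are just assumptions for illustration, and the stand-in model returns fixed scores; only the "options" and "accept" keys come from the task format described above.

```python
def add_predicted_choices(stream, labels, predict, threshold=0.5):
    """Yield choice tasks with the model's predicted labels pre-selected.

    `predict` is any callable mapping a text to a {label: score} dict
    (hypothetical interface, not part of Prodigy).
    """
    for task in stream:
        task = dict(task)  # shallow copy so the incoming task is not mutated
        scores = predict(task["text"])
        # One option per label; the label itself is used as the option ID.
        task["options"] = [{"id": label, "text": label} for label in labels]
        # Same format as the annotations: a list of selected option IDs.
        task["accept"] = [
            label for label in labels if scores.get(label, 0.0) >= threshold
        ]
        yield task


def dummy_predict(text):
    # Stand-in for a trained model: always predicts label A.
    return {"A": 0.9, "B": 0.2, "C": 0.1}


tasks = list(
    add_predicted_choices([{"text": "Some example text"}], ["A", "B", "C"], dummy_predict)
)
print(tasks[0]["accept"])  # ['A']
```

The annotator then only needs to fix the selection where the model got it wrong.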
The numbers here are the "id" values of the options! Sorry if this was confusing. The option IDs can be integers or strings. So if your option is {"id": "FOO", "text": "bar"}, you can set "accept": ["FOO"] to pre-select it.
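Since the values in "accept" have to match the option IDs exactly, a small sanity check before streaming the tasks in can save some confusion. This is only a sketch; the helper name `preselected_options` is made up for illustration and is not part of Prodigy.

```python
def preselected_options(task):
    """Return the texts of the options that "accept" refers to.

    Raises ValueError if "accept" contains an ID that no option has.
    """
    options_by_id = {opt["id"]: opt for opt in task.get("options", [])}
    accepted = task.get("accept", [])
    unknown = [opt_id for opt_id in accepted if opt_id not in options_by_id]
    if unknown:
        raise ValueError(f"'accept' refers to unknown option IDs: {unknown}")
    return [options_by_id[opt_id]["text"] for opt_id in accepted]


task = {
    "text": "Some example text",
    # IDs can be integers or strings, as long as "accept" uses the same values.
    "options": [{"id": 1, "text": "one"}, {"id": "FOO", "text": "bar"}],
    "accept": ["FOO"],
}
print(preselected_options(task))  # ['bar']
```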
Emoji should definitely be no problem – I use them in options all the time
And wait, you're using the textcat.manual recipe, right? The problem here is that the recipe already takes care of adding the "options", so whatever you have in there will be overwritten by the actual labels you're annotating. Everything else would be kinda confusing, because you could end up with mismatched labels.
If you look at the source of prodigy/recipes/textcat.py, you'll see that the recipe is pretty simple, though – essentially, it's very similar to this:
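As a rough sketch of that idea (this is not the actual source of the recipe, and the helper name `add_label_options` is an assumption for illustration): every incoming task gets its "options" replaced with one option per label, so anything you put there yourself is lost.

```python
def add_label_options(stream, labels):
    """Attach one choice option per label to every task in the stream."""
    options = [{"id": label, "text": label} for label in labels]
    for task in stream:
        task = dict(task)  # shallow copy so the incoming task is not mutated
        # Replaces any "options" that were already present in the input data.
        task["options"] = [dict(opt) for opt in options]
        yield task


incoming = [
    {
        "text": "hello",
        "options": [{"id": "FOO", "text": "bar"}],
        "accept": ["FOO"],
    }
]
tasks = list(add_label_options(incoming, ["A", "B", "C"]))
print([opt["id"] for opt in tasks[0]["options"]])  # ['A', 'B', 'C']
print(tasks[0]["accept"])  # ['FOO'], which no longer matches any option
```

In this sketch the pre-selected "FOO" survives, but there is no option with that ID anymore, which is exactly the kind of mismatch mentioned above.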
So if you want to customise the labels and pre-populate the data, it might make more sense to just do it in a custom recipe instead. If you want this to be really elegant, you could even add a command-line argument that lets you pass in those label aliases, or load the options from a file or something like that.
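Here is a minimal sketch of what the pieces of such a custom recipe could look like. The function names, the `LABEL=alias` argument format and the dataset name are all assumptions for illustration. In a real recipe, the dictionary built at the end would be returned from a function registered with the `@prodigy.recipe` decorator, which is left out here so the snippet runs without Prodigy installed.

```python
def parse_label_aliases(spec):
    """Parse 'POS=😀 Positive,NEG' into {'POS': '😀 Positive', 'NEG': 'NEG'}."""
    aliases = {}
    for item in spec.split(","):
        label, _, alias = item.partition("=")
        label = label.strip()
        if not label:
            continue  # skip empty entries, e.g. from a trailing comma
        # Fall back to the label itself if no alias was given.
        aliases[label] = alias.strip() or label
    return aliases


def with_aliased_options(stream, aliases):
    """Add options (ID = label, text = alias) and keep valid pre-selections."""
    for task in stream:
        task = dict(task)  # shallow copy so the incoming task is not mutated
        task["options"] = [
            {"id": label, "text": text} for label, text in aliases.items()
        ]
        # Drop pre-selected IDs that don't correspond to any option.
        task["accept"] = [i for i in task.get("accept", []) if i in aliases]
        yield task


def build_components(dataset, stream, aliases):
    """Build the dictionary of components a recipe function would return."""
    return {
        "dataset": dataset,
        "stream": with_aliased_options(stream, aliases),
        "view_id": "choice",
        "config": {"choice_style": "multiple"},  # allow several labels per task
    }


aliases = parse_label_aliases("POS=😀 Positive,NEG")
components = build_components(
    "my_dataset", [{"text": "Great!", "accept": ["POS", "OTHER"]}], aliases
)
first_task = next(components["stream"])
print(first_task["accept"])  # ['POS']
```

Keeping the label as the option ID and only using the alias as the display text means the annotations in the database still refer to the real labels.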