Bug in textcat recipes when using blank textcat component in the loop

We have built a workflow that uses custom spaCy model configurations to initialise models from the very start of an annotation project (for example, to include custom tokenisation). For text classification projects we initialise the models with blank textcat components, and we are running into a problem.
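For context, a pipeline like ours can be reproduced with something like the sketch below (the labels `foo`/`bar` are placeholders, and this omits our custom tokenisation):

```python
import spacy

# Blank English pipeline with an untrained, mutually exclusive textcat component.
nlp = spacy.blank("en")
textcat = nlp.add_pipe("textcat")
textcat.add_label("foo")
textcat.add_label("bar")
nlp.initialize()

doc = nlp("set a 4 minute timer")
# With no training, both classes come back with scores close to 0.5,
# and because the component is exclusive the scores sum to 1.
print(doc.cats)
```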

If you use `textcat.correct` with two exclusive classes and a blank model in the loop, the model suggests both labels with a score of 0.5. The UI only displays one radio button as selected, but the `getChoices` function in the React code actually returns both labels, so if the user makes no change and simply clicks accept, the saved annotation contains both labels. This of course causes errors down the line when trying to train a model.
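A rough sketch of the pre-selection logic we suspect is at play (this is not Prodigy's actual code, just an illustration of why a >= comparison against the default threshold of 0.5 selects both labels, and why bumping it to 0.51 works around it):

```python
def preselect(cats, threshold):
    """Return the labels whose score meets or exceeds the threshold."""
    return [label for label, score in cats.items() if score >= threshold]

# What a blank, exclusive textcat returns for every example:
blank_model_scores = {"foo": 0.5, "bar": 0.5}

print(preselect(blank_model_scores, 0.5))   # both labels pre-selected
print(preselect(blank_model_scores, 0.51))  # workaround: neither pre-selected
```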

Currently we are working around this by setting the threshold to 0.51, but I wanted to report it anyway, as it's clearly a bug, even if it's perhaps a relatively unlikely edge case for most users.

I tried reproducing this, but I seem to be getting a different view.

I used this `data.csv` file to train a model:

```
set a 4 minute timer,foo
set a 2 minute timer,bar
set a 3 minute timer,foo
set a 5 minute timer,bar
set a 1 minute timer,foo
```

I then proceeded to annotate, alternating between the "Foo" and "Bar" label.

```
python -m prodigy textcat.manual issue5716 issue-5716/data.csv --exclusive --label foo,bar
```

Next, I trained a textcat model, with exclusive classes.

```
python -m prodigy train --textcat issue5716 model-out
```

When I now run `textcat.correct` with another source file, I see this interface:

```
python -m prodigy textcat.correct issue5716 model-out/model-best progress/clinc.csv
```

I only hit "accept" on everything. Here's what one of the final rows in `db-out` looks like:

```
"text": "how do you say please in french",
...
```

Notice: only one item in `accept`!

So I don't see both labels in the saved annotation.
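If it helps debugging on your end, here's a quick way to scan an export for tasks with more than one accepted label (the two sample tasks below are made up; in practice you'd read the lines from the output of `prodigy db-out issue5716`):

```python
import json

def multi_accept(jsonl_lines):
    """Yield tasks whose `accept` list holds more than one label."""
    for line in jsonl_lines:
        task = json.loads(line)
        if len(task.get("accept", [])) > 1:
            yield task

# Hypothetical JSONL lines standing in for a real db-out export:
sample = [
    '{"text": "set a 4 minute timer", "accept": ["foo"]}',
    '{"text": "set a 2 minute timer", "accept": ["foo", "bar"]}',
]
bad = list(multi_accept(sample))
print(len(bad))  # → 1
```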

There might be something else happening, though. Under the hood, the `textcat.correct` recipe uses `choice` as its `view_id`. That means you can add extra settings to your `prodigy.json` file to make the choice interface behave differently. Just to double-check: do you have such a settings file around that might be causing this behaviour?
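For example, a `prodigy.json` containing choice-related settings like these would change how the interface selects and saves answers (the values shown are just for illustration):

```json
{
  "choice_style": "single",
  "choice_auto_accept": false
}
```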