Bootstrapping using rule-based matching - handling conflicting patterns within single text

Hi @janp,

This question has come up before, so we've been thinking about how to add some extra options to the built-in recipe to control this. However, one of the ideas behind Prodigy is that everyone wants slightly different behaviours, and the easiest way to get what you want is to put the pieces together yourself into a custom recipe.

You can find a discussion of how to filter the stream to prevent duplicate texts in this thread: "textcat.teach presents same annotation task if text snippet contains multiple patterns". I think if you add the stream filter Ines suggests there, it should ensure that you're not asked redundant questions.

One thing to keep in mind: since you're working on a multilabel problem, the recipe still needs to be able to ask you about different text/label combinations. So make sure you're keying the filter by both the text and the label, not by the text alone.
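
To illustrate keying by both text and label, here's a minimal sketch of such a filter. It isn't the exact code from the other thread, and the function name `filter_seen_text_label` is just something I made up for this example. It assumes each task in the stream is a dict with a "text" key and (optionally) a "label" key, and that you'd wrap your stream with it inside your custom recipe before returning it.

```python
def filter_seen_text_label(stream):
    """Yield each task only the first time its (text, label) pair is seen.

    Hypothetical helper: assumes tasks are dicts with a "text" key and an
    optional "label" key.
    """
    seen = set()
    for task in stream:
        # Key on both fields so the same text can still be asked about
        # with a different label (needed for multilabel annotation).
        key = (task["text"], task.get("label"))
        if key in seen:
            continue  # same text and same label: redundant question
        seen.add(key)
        yield task


# Small usage example with made-up tasks
example_stream = [
    {"text": "The battery died after a week", "label": "HARDWARE"},
    {"text": "The battery died after a week", "label": "HARDWARE"},
    {"text": "The battery died after a week", "label": "COMPLAINT"},
    {"text": "Great screen", "label": "HARDWARE"},
]
filtered = list(filter_seen_text_label(example_stream))
```

Here the second task is dropped because it repeats both the text and the label, while the third is kept since it asks about a different label for the same text. Note that the `seen` set grows with the stream, so for a very large stream you might prefer to store hashes of the pair instead.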