Dynamically defining subset of labels to use in SpanCat

Hi @darinkishore,

Thanks for the thoughtful post and welcome to the Prodigy community :wave:

This is an interesting use case. Let me discuss this with the team next week.

We just had a related post, and this has come up before too. But yes, the challenge is that Prodigy was designed around a "one-label-set-per-session" model.

I like this direction, but it now looks like you're hitting a second problem: nested or hierarchical categorization. That's partly a UI challenge, but it's also a question of how you design the categorization scheme (e.g., why 12 of 50? How do you define the subsets?).

Have you seen the `validate_answer` callback? It's an easy way to check each answer as it's submitted in the UI.
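Here's a minimal sketch of what such a callback could look like for the label-subset case. It assumes the task dict carries its annotations under "spans" with a "label" on each span. The `ALLOWED_LABELS` set and its contents are hypothetical placeholders for whatever subset applies to your session, and using `ValueError` is my choice here, on the understanding that Prodigy surfaces errors raised by the callback in the UI.

```python
# Hypothetical subset of labels that is valid for this session.
ALLOWED_LABELS = {"PERSON", "ORG", "PRODUCT"}


def validate_answer(eg):
    """Raise an error if an accepted task uses a label outside the subset."""
    # Only check accepted answers; rejected or ignored tasks pass through.
    if eg.get("answer") != "accept":
        return
    used = {span["label"] for span in eg.get("spans", [])}
    unexpected = sorted(used - ALLOWED_LABELS)
    if unexpected:
        raise ValueError(
            "Labels not allowed in this session: " + ", ".join(unexpected)
        )
```

In a custom recipe, you'd return it alongside the other components, e.g. `"validate_answer": validate_answer` in the dict the recipe returns.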

I don't think that's possible, but you could open a discussion post on GitHub, or I can check with the team next week.

Update: I checked with a spaCy core teammate. It is possible if you're using different components: the key is to give each one a unique name while still calling the "llm" factory.

```ini
[nlp]
pipeline = ["llm_textcat", "llm_ner"]
...

[components]

[components.llm_textcat]
factory = "llm"

[components.llm_textcat.model]
@llm_models = ...

[components.llm_textcat.task]
@llm_tasks = "spacy.TextCat.v3"

...

[components.llm_ner]
factory = "llm"

[components.llm_ner.model]
@llm_models = ...

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v3"
...
```
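If it helps, here's a rough standard-library sketch for sanity-checking that naming pattern before you run anything. It is not how spaCy loads configs (spaCy has its own config system); it just parses a trimmed copy of the config above with `configparser` and confirms that every pipeline name is unique and points at the "llm" factory. The `check_llm_components` helper is hypothetical, and the model blocks are left out because their values are elided above.

```python
import configparser
import json

# Trimmed copy of the config above, without the elided model blocks.
CONFIG_TEXT = """
[nlp]
pipeline = ["llm_textcat", "llm_ner"]

[components]

[components.llm_textcat]
factory = "llm"

[components.llm_textcat.task]
@llm_tasks = "spacy.TextCat.v3"

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v3"
"""


def check_llm_components(config_text):
    """Return {component name: factory} after checking names are unique."""
    # Disable interpolation so config values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Values are JSON-like (quoted strings, lists), so decode them with json.
    names = json.loads(parser["nlp"]["pipeline"])
    if len(names) != len(set(names)):
        raise ValueError("Component names in the pipeline must be unique")
    factories = {}
    for name in names:
        section = "components." + name
        if section not in parser:
            raise ValueError("Missing config block: [" + section + "]")
        factories[name] = json.loads(parser[section]["factory"])
    return factories


print(check_llm_components(CONFIG_TEXT))
```

Both names should map to "llm": the component names differ, the factory is shared.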

No worries at all! We really appreciate your post. As I mentioned, we've been thinking about several of these items for a while. We'll reach back out next week. Thank you!