hi @padejumo,
I'm glad you found the issue.
That's hard to say because it likely depends on the context and your own preferences. One important point: if you annotated with a spans recipe, make sure you didn't create any overlapping spans. They're permissible with spans recipes but not with ner, so if you try to train ner, you'll get an error if there are overlapping spans.
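If it helps, here's a rough sketch of how you could check your annotations for overlapping spans before trying ner. It assumes each example is a dict with a "spans" list whose entries carry character offsets in "start" and "end" (with "end" exclusive), which is how I'd expect your exported annotations to look. The function name find_overlaps and the sample text are just made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Span = Dict[str, Any]


def find_overlaps(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return every pair of spans that share at least one character."""
    # Sort by start offset so overlapping spans end up next to each other.
    ordered = sorted(spans, key=lambda s: (s["start"], s["end"]))
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # "end" is treated as exclusive, so touching spans do not overlap.
            if second["start"] >= first["end"]:
                break  # later spans start even further right
            overlaps.append((first, second))
    return overlaps


# Hypothetical example, shaped like an annotated task with character offsets.
example = {
    "text": "New York City Council",
    "spans": [
        {"start": 0, "end": 13, "label": "GPE"},
        {"start": 0, "end": 21, "label": "ORG"},
    ],
}

for a, b in find_overlaps(example["spans"]):
    text = example["text"]
    print("Overlap:", text[a["start"]:a["end"]], "<->", text[b["start"]:b["end"]])
```

You could run this over each line of your exported JSONL and only keep (or fix) the examples where it returns an empty list.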
Not sure if you've seen it, but there's also a spaCy template project that compares spancat vs. ner.
You could also try modifying your suggester function. Also, are you using data-to-spacy so you can run spacy train instead of prodigy train? I doubt that'll do much for memory, but it does give you the ability to inspect span characteristics via spacy debug data (see the ner_spancat_compare project for more details).
It may not be an option, but is there any way you could reframe your problem by splitting it up somehow and using textcat instead? This post mentions it and a few related ideas.
Hope this helps!