spancat out of memory

ines · March 25, 2022, 10:47am

It's possible that this is related to the suggester function, which by default uses an n-gram range covering all span lengths found in the data. So if you have really long spans, you'll end up with a lot of potential candidates (e.g. every possible span between 1 and 60 tokens, for every document). If you run prodigy train with the --verbose flag, it should show you more detailed information on the suggester function used. See the "Span Categorization" page in the Prodigy docs.
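To get a feel for how quickly the candidates add up, here's a rough back-of-the-envelope sketch in plain Python. It doesn't call spaCy or Prodigy at all: count_ngram_candidates is a made-up helper name, and it simply counts the contiguous token spans an n-gram suggester would propose for a single document.

```python
def count_ngram_candidates(n_tokens, min_size, max_size):
    """Count contiguous spans whose length is between min_size and max_size
    (inclusive) in a document of n_tokens tokens.

    Hypothetical helper for illustration only, not part of spaCy or Prodigy.
    """
    total = 0
    for size in range(min_size, max_size + 1):
        if size > n_tokens:
            break  # spans longer than the document cannot exist
        # A span of this size can start at any of these positions.
        total += n_tokens - size + 1
    return total


if __name__ == "__main__":
    # Compare a few maximum span lengths for one 100-token document.
    for max_size in (3, 10, 60):
        print(max_size, count_ngram_candidates(100, 1, max_size))
```

For a 100-token document, capping spans at 3 tokens gives 297 candidates, while allowing up to 60 tokens gives 4230, and that number applies to every document in the batch.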

One option to prevent this would be to use a config that defines different logic for generating span candidates via the suggester function. See the "SpanCategorizer" page in the spaCy API documentation. How you set this up depends on the data, but there might be common patterns that you can use instead of considering every possible combination.
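As one example of what such a pattern could look like, the sketch below only proposes spans that don't cross a punctuation token. This is plain Python to illustrate the idea, under the assumption that your spans never contain punctuation; candidates_within_segments and the boundary set are invented for this example. A real spaCy suggester would additionally need to be registered and return its candidates as a Ragged array, as described in the spaCy docs.

```python
DEFAULT_BOUNDARIES = frozenset({".", ",", ";", ":", "!", "?"})


def candidates_within_segments(tokens, max_size, boundaries=DEFAULT_BOUNDARIES):
    """Return (start, end) token offsets (end exclusive) for all spans of up
    to max_size tokens that do not contain a boundary token.

    Hypothetical helper for illustration only, not a spaCy suggester.
    """
    spans = []
    seg_start = 0
    # Walk one position past the end so the final segment is flushed too.
    for i in range(len(tokens) + 1):
        if i == len(tokens) or tokens[i] in boundaries:
            # Tokens seg_start..i-1 form one punctuation-free segment.
            for start in range(seg_start, i):
                last_end = min(i, start + max_size)
                for end in range(start + 1, last_end + 1):
                    spans.append((start, end))
            seg_start = i + 1
    return spans


if __name__ == "__main__":
    tokens = "The patient has a fever , but no cough .".split()
    print(len(candidates_within_segments(tokens, max_size=60)))
```

On this 10-token sentence the restricted logic yields 21 candidates, compared to 55 if every contiguous span were considered. Whether a rule like this is safe depends entirely on what your annotated spans look like.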

The suggester functions can also integrate with Prodigy during annotation, so you can ensure that only spans matching the suggester can be selected. This is also covered on the "Span Categorization" page in the Prodigy docs.
