Token indices sequence length is longer than the specified maximum sequence length for this model

I am using the falcon model as shown in this config

[components.llm.model]
@llm_models = "spacy.Falcon.v1"
name = "falcon-rw-1b"

I am getting the following error on some texts. I split the data into batches of 100 rows, reduced each row's text to fewer than 1,000 tokens, and used a BERT tokenizer to truncate to 512 tokens, but the error still occurs: some batches work, others don't. How can I fix this?
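For reference, this is roughly the truncation step I applied before running the recipe (a minimal sketch; the exact BERT checkpoint and the column handling are illustrative):

from transformers import AutoTokenizer

# Truncate each row's text to at most 512 tokens before it reaches the recipe.
# "bert-base-uncased" stands in for whichever BERT checkpoint was used.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def truncate(text, max_tokens=512):
    ids = tokenizer.encode(text, add_special_tokens=False, truncation=True, max_length=max_tokens)
    return tokenizer.decode(ids)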

python -m prodigy ner.llm.fetch  config.cfg latitude_preprocessed_data.csv latitude_annotations/batch1.jsonl
Getting labels from the 'llm' component
Using 4 labels: ['COORDINATES', 'DEPTH', 'METHODS', 'UNDERWATER']
ℹ RECIPE: Writing fetched data to local file:
latitude_annotations/batch1.jsonl
  0%|          | 0/3238 [00:00<?, ?it/s]Token indices sequence length is longer than the specified maximum sequence length for this model (2716 > 1024). Running this sequence through the model will result in indexing errors
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
  0%|          | 0/3238 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/prodigy/__main__.py", line 63, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 883, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/prodigy/recipes/llm/ner.py", line 179, in llm_fetch_ner
    srsly.write_jsonl(output, stream)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/srsly/_json_api.py", line 175, in write_jsonl
    for line in lines:
  File "cython_src/prodigy/components/stream.pyx", line 160, in prodigy.components.stream.Stream.__next__
  File "cython_src/prodigy/components/stream.pyx", line 183, in prodigy.components.stream.Stream.is_empty
  File "cython_src/prodigy/components/stream.pyx", line 198, in prodigy.components.stream.Stream.peek
  File "cython_src/prodigy/components/stream.pyx", line 311, in prodigy.components.stream.Stream._get_from_iterator
  File "cython_src/prodigy/components/preprocess.pyx", line 568, in make_ner_suggestions
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy/language.py", line 1567, in pipe
    for doc in docs:
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy/language.py", line 1611, in pipe
    for doc in docs:
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy/util.py", line 1705, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 175, in pipe
    error_handler(self._name, self, doc_batch, e)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy/util.py", line 1724, in raise_error
    raise e
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 173, in pipe
    yield from iter(self._process_docs(doc_batch))
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 199, in _process_docs
    responses_iters = tee(self._model(prompts_iters[0]), n_iters)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy_llm/models/hf/falcon.py", line 49, in __call__
    return [
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/spacy_llm/models/hf/falcon.py", line 50, in <listcomp>
    self._model(pr, generation_config=self._hf_config_run)[0]["generated_text"]
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 205, in __call__
    return super().__call__(text_inputs, **kwargs)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1140, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1147, in run_single
    model_outputs = self.forward(model_inputs, **forward_params)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1046, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 268, in _forward
    generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 1602, in generate
    return self.greedy_search(
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 2450, in greedy_search
    outputs = self(
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/f0scraft/.cache/huggingface/modules/transformers_modules/tiiuae/falcon-rw-1b/e4b9872bb803165eb22f0a867d4e6a64d34fce19/modeling_falcon.py", line 900, in forward
    transformer_outputs = self.transformer(
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/f0scraft/.cache/huggingface/modules/transformers_modules/tiiuae/falcon-rw-1b/e4b9872bb803165eb22f0a867d4e6a64d34fce19/modeling_falcon.py", line 797, in forward
    outputs = block(
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/f0scraft/.cache/huggingface/modules/transformers_modules/tiiuae/falcon-rw-1b/e4b9872bb803165eb22f0a867d4e6a64d34fce19/modeling_falcon.py", line 453, in forward
    attn_outputs = self.self_attention(
  File "/home/f0scraft/Documents/OWA/LANG/ML36-SPACY/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/f0scraft/.cache/huggingface/modules/transformers_modules/tiiuae/falcon-rw-1b/e4b9872bb803165eb22f0a867d4e6a64d34fce19/modeling_falcon.py", line 374, in forward
    attention_probs = F.softmax(attention_logits + attention_mask_float, dim=-1, dtype=hidden_states.dtype)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 902.00 MiB (GPU 0; 5.80 GiB total capacity; 4.72 GiB already allocated; 151.44 MiB free; 4.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

hi @foscraft!

Does this help? See also the linked spaCy discussions post for more details.

You may also want to search the spaCy and spacy-llm GitHub discussions for tips and similar examples.

For example, for your out-of-memory issue: since the problem is really on the spaCy / spacy-llm side, you'll likely find a lot more helpful posts there (not to mention that the spaCy core dev team answers questions there).
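As a quick first step, the error message itself points at an allocator setting you can try before rerunning the recipe (the value here is illustrative):

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python -m prodigy ner.llm.fetch config.cfg latitude_preprocessed_data.csv latitude_annotations/batch1.jsonl

Note that this only mitigates fragmentation; it won't fix the underlying long-input problem.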

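Also worth checking: falcon-rw-1b has a 1024-token context window (hence the "2716 > 1024" warning), and that budget covers the whole rendered prompt in Falcon's own tokenizer (task instructions, label definitions, and your text), so truncating the source text to 512 BERT tokens doesn't guarantee the final prompt fits. Separately, you can pass Hugging Face settings through the model's config_init / config_run options, something like this (a sketch; the values are illustrative, and device_map requires accelerate to be installed):

[components.llm.model]
@llm_models = "spacy.Falcon.v1"
name = "falcon-rw-1b"
config_init = {"device_map": "auto"}
config_run = {"max_new_tokens": 128}

Capping max_new_tokens bounds the memory used during generation, but the out-of-memory error above is driven mostly by the 2716-token input, so splitting or truncating the long rows is still the main fix.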
Hope this helps!

I decided to move to Databricks, but I am hitting this error again: here

Hi @koaning !
Please look at this conversation when you have a minute.
Thank you.

It's possible that your Databricks machine also does not have the expected GPU resources. Are you able to verify?
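For example, a quick sanity check from a notebook cell on the Databricks driver (standard PyTorch calls):

import torch

# Confirm a GPU is visible and how much memory it actually has.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(round(torch.cuda.get_device_properties(0).total_memory / 1024**3, 2), "GiB total")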