Error when running ner.teach on a blank Japanese model

While running the Prodigy command ner.teach on a blank model, the following error came up:

    /usr/local/conda3/lib/python3.6/site-packages/toolz/itertoolz.py:368: RuntimeWarning: Mean of empty slice.
      return next(iter(seq))
    /usr/local/conda3/lib/python3.6/site-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    /usr/local/conda3/lib/python3.6/site-packages/toolz/itertoolz.py:368: RuntimeWarning: Degrees of freedom <= 0 for slice
      return next(iter(seq))
    /usr/local/conda3/lib/python3.6/site-packages/numpy/core/_methods.py:110: RuntimeWarning: invalid value encountered in true_divide
      arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
    /usr/local/conda3/lib/python3.6/site-packages/numpy/core/_methods.py:132: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    Traceback (most recent call last):
      File "cython_src/prodigy/core.pyx", line 55, in prodigy.core.Controller.__init__
      File "/usr/local/conda3/lib/python3.6/site-packages/toolz/itertoolz.py", line 368, in first
        return next(iter(seq))
    StopIteration

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "cython_src/prodigy/core.pyx", line 178, in prodigy.core.recipe.recipe_decorator.recipe_proxy
      File "cython_src/prodigy/core.pyx", line 60, in prodigy.core.Controller.__init__
    ValueError: Error while validating stream: no first batch. This likely means that your stream is empty.

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/usr/local/conda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/usr/local/conda3/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/usr/local/conda3/lib/python3.6/site-packages/prodigy/__main__.py", line 259, in <module>
        controller = recipe(*args, use_plac=True)
    SystemError: <built-in function delete_Tagger> returned a result with an error set

We created a blank Japanese model with MeCab:

import spacy

# Create a blank Japanese pipeline (no trained components) and save it to disk
nlp = spacy.blank('ja')
nlp.to_disk('path/to/model')

Prodigy command used:

prodigy ner.teach new_dataset japanese_blank_model path/to/train/file

Can you help us through this?

I think the problem here is that you’re starting off with a blank model that doesn’t have an entity recognizer and doesn’t predict anything yet – but you’re asking it to make suggestions. ner.teach takes an existing model and optional patterns, and will suggest entities. But your model doesn’t suggest anything, which is why there’s no stream of examples to annotate. So when validating your stream, the recipe fails with the following message:

 ValueError: Error while validating stream: no first batch. This likely means that your stream is empty.
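To make the failure mode concrete, here is a simplified sketch in plain Python of why an empty stream ends in this error. It is not Prodigy's actual implementation: `blank_model_predict`, `suggest_entities` and `first_batch` are hypothetical names used only for illustration. The assumption is that a teach-style stream only yields a task when the model proposes at least one entity, so a model that proposes nothing produces an empty generator, and fetching the first item raises `StopIteration`.

```python
def blank_model_predict(text):
    """Hypothetical stand-in for a model with no trained entity recognizer."""
    return []  # a blank model proposes no entity spans


def suggest_entities(texts, predict):
    """Yield one task per suggested span; texts without suggestions are skipped."""
    for text in texts:
        for start, end, label in predict(text):
            yield {
                "text": text,
                "spans": [{"start": start, "end": end, "label": label}],
            }


def first_batch(stream):
    """Fetch the first task, turning an empty stream into a readable error."""
    try:
        return next(iter(stream))
    except StopIteration as err:
        raise ValueError("no first batch: the stream is empty") from err


texts = ["東京は日本の首都です。", "彼はソニーで働いています。"]
stream = suggest_entities(texts, blank_model_predict)

try:
    first_batch(stream)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

The input texts are non-empty here, yet the stream still is, because nothing was suggested for any of them.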

To allow the model to make suggestions, you need to give it examples or pre-train it on a small annotated set. There are two ways to do this:

  • Run ner.manual with the labels you’re looking to annotate and create a small gold-standard set. Then pre-train the model with ner.batch-train and the --no-missing flag and export the pre-trained model. You can then use that model as the base model and improve it from there.
  • Create a patterns.jsonl file with examples of entities. You can either supply exact matches, or descriptions of individual tokens. If you’re using token-based patterns, make sure that the tokenization matches your model’s tokenization. For example:
{"label": "PERSON", "pattern": "David Bowie"}
{"label": "PERSON", "pattern": [{"lower": "barack"}, {"lower": "obama"}]}

This video tutorial shows this process in more detail:

Hi Ines,

I was following this thread for the Japanese model and tried using ner.manual:

prodigy ner.manual new_dataset jap_model_1/ out.txt --label ORG,NORP

but I encountered this error:
Using 2 labels: ORG, NORP
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/conda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/conda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/conda3/lib/python3.6/site-packages/prodigy/__main__.py", line 259, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 178, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "cython_src/prodigy/core.pyx", line 55, in prodigy.core.Controller.__init__
  File "/usr/local/conda3/lib/python3.6/site-packages/toolz/itertoolz.py", line 368, in first
    return next(iter(seq))
  File "cython_src/prodigy/core.pyx", line 84, in iter_tasks
SystemError: <built-in function delete_Tagger> returned a result with an error set

According to one of the GitHub issues, it's a problem with SWIG.

SWIG version = 2.0.10
mecab-python3 version = 0.996.1

Thanks :)