Unable to pass many labels on the command line

When I try to run manual NER with the following command, I get the error below:

python -m prodigy ner.manual my_set en_core_web_sm /Users/Development/BigData/RS/annotation/Prodigy/news_headlines.jsonl --label Artifact,Contrast_opacification,Equipment_malfunction,Field_of_View,Identification_error,Image_quality,Incomplete_Study,Motion,Patient_motion,Physician_Protocoling,Physiologic_motion,Positioning,Positioning_Error,Radiation_Dose,Radiation_Dose_Too_Low,Reconstruction

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/philips/Development/BigData/RS/annotation/Prodigy/prodigy/__main__.py", line 248, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 150, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "/Users/philips/Development/BigData/RS/annotation/venv/lib/python3.5/site-packages/plac_core.py", line 328, in __call__
cmd, result = parser.consume(arglist)
File "/Users/philips/Development/BigData/RS/annotation/venv/lib/python3.5/site-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/Users/philips/Development/BigData/RS/annotation/Prodigy/prodigy/recipes/ner.py", line 140, in manual
labels = get_labels(label, nlp)
File "cython_src/prodigy/util.pyx", line 96, in prodigy.util.get_labels
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 1306, in exists
self.stat()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 1126, in stat
return self._accessor.stat(self)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 371, in wrapped
return strfunc(str(pathobj), *args)
OSError: [Errno 63] File name too long: 'Artifact,Contrast_opacification,Equipment_malfunction,Field_of_View,Identification_error,Image_quality,Incomplete_Study,Motion,Patient_motion,Physician_Protocoling,Physiologic_motion,Positioning,Positioning_Error,Radiation_Dose,Radiation_Dose_Too_Low,Reconstruction'

Ah, this is interesting: when a user passes in a --label argument, Prodigy checks whether it's a path to a file or a string of comma-separated labels. Apparently, if the string is too long, the file path check itself fails, because the operating system rejects the string as a file name and raises the OSError shown in your traceback. I'll add a condition to prevent this problem.
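For illustration, here is a minimal sketch of how such a check could be guarded. This is not Prodigy's actual implementation: the function name parse_labels is hypothetical, and the sketch only assumes the behaviour described above (a file with one label per line, or a comma-separated string).

```python
from pathlib import Path


def parse_labels(value):
    """Return a list of labels from a file path or a comma-separated string.

    Hypothetical helper for illustration only, not part of Prodigy.
    """
    try:
        path = Path(value)
        is_file = path.is_file()
    except OSError:
        # The OS can reject the string outright (e.g. "File name too long"),
        # in which case it cannot be a file path: treat it as a label string.
        is_file = False
    if is_file:
        # One label per line; skip blank lines.
        with path.open(encoding="utf8") as f:
            return [line.strip() for line in f if line.strip()]
    return [label.strip() for label in value.split(",") if label.strip()]
```

With a guard like this, a very long comma-separated string falls through to the string branch instead of crashing on the file system check.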

To work around this for now, simply add your labels to a labels.txt or any other plain text file, with one label per line, and pass the path to that file as the --label argument. Considering the number of labels you have, this is probably better anyway: it keeps the command tidy and easier to read.
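If you'd rather not retype the labels, a small script along these lines could convert the comma-separated string into such a file. It's only a sketch: write_labels_file is a hypothetical helper name, not a Prodigy function, and the output path is whatever you choose.

```python
from pathlib import Path


def write_labels_file(labels, path):
    """Write comma-separated labels to a text file, one label per line.

    Hypothetical helper for illustration only, not part of Prodigy.
    Returns the list of labels that were written.
    """
    names = [name.strip() for name in labels.split(",") if name.strip()]
    Path(path).write_text("\n".join(names) + "\n", encoding="utf8")
    return names


# Example usage (shortened label list, path is an assumption):
# write_labels_file("Artifact,Motion,Positioning", "labels.txt")
```

The resulting file can then be passed to --label in place of the long string.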

I added them to a text file and ran the command again, as shown below:

python -m prodigy ner.manual my_set en_core_web_sm /Users/Development/BigData/RS/annotation/Prodigy/news_headlines.jsonl --label /Users/Development/BigData/RS/annotation/Prodigy/labels.txt.

But now it treats the argument as a string instead of a file. Do I need to pass a different flag?

That's weird. Are you sure the path is correct? In the example you posted, there's a . after labels.txt. Maybe you accidentally added it on the command line as well?

If the following holds true for your path, Prodigy should read the labels line by line:

from pathlib import Path
labels_path = Path('/Users/Development/BigData/RS/annotation/Prodigy/labels.txt')
assert labels_path.exists()

Sorry, my file path was wrong. It works now with one label per line. Thanks.
