Hi, I'm trying to annotate examples from another dataset and I'm getting this error:
RecipeError: ("Can't find path dataset:ESmodelAnotado", PosixPath('/app/dataset:ESmodelAnotado'))
These are my Prodigy parameters:
python -m prodigy my.teach 2humanES /app/data/modelos/enWandb/ES_tier2/model_best/ dataset:ESmodelAnotado--label /app/data/constantes/iabtier2,/app/data/constantes/iabtier1 -e 2humanES -F /app/recipes/my_teachbinaryT1yT2.py
The source dataset exists.
How can I fix this?
One thing I notice immediately is that there's no space between the source and the --label argument (dataset:ESmodelAnotado--label).
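To make the effect of the missing space concrete, here is a small sketch using Python's standard shlex module, which splits a string the way a POSIX shell would. It only illustrates the tokenization; it does not involve Prodigy's own argument parsing, and the shortened argument strings are taken from the command above.

```python
import shlex

# Source and label arguments as pasted in the question (no space before --label).
pasted = "dataset:ESmodelAnotado--label /app/data/constantes/iabtier2"
# The same arguments with the missing space restored.
fixed = "dataset:ESmodelAnotado --label /app/data/constantes/iabtier2"

# shlex.split tokenizes the string the way a POSIX shell would.
pasted_tokens = shlex.split(pasted)
fixed_tokens = shlex.split(fixed)

print(pasted_tokens)  # source and flag are fused into one token
print(fixed_tokens)   # source and --label are separate tokens
```

With the fused token, the source name reported in an error would include the trailing --label, which is not what the posted error message shows.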
In any case, I think it's just a copy/paste error, because the actual error message would be different if you had passed this command as is. The real issue is probably how the source is being read.
Could you share which Prodigy version you are using and how your custom recipe reads the source? I suspect you might be using the legacy JSONL loader directly (just like the example ner.teach recipe in the Prodigy Recipes repo does), which is meant for file paths and not for datasets.
If you'd like to use a dataset name instead, you should use the get_stream helper, which does the source resolution for you.
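As a rough illustration of what "source resolution" means here, the sketch below uses only the standard library. The describe_source helper and the /app working directory are hypothetical stand-ins chosen to match the error message above; this is not Prodigy's actual implementation, just the general idea of checking for the dataset: prefix before treating the string as a file path.

```python
from pathlib import PurePosixPath

DATASET_PREFIX = "dataset:"


def describe_source(source: str, cwd: str = "/app") -> str:
    """Hypothetical helper: report how a source string would be interpreted."""
    if source.startswith(DATASET_PREFIX):
        # Strip the prefix and treat the remainder as a dataset name.
        name = source[len(DATASET_PREFIX):]
        return f"dataset named {name!r}"
    # Otherwise treat the string as a file path relative to the working directory.
    return f"file at {PurePosixPath(cwd) / source}"


# A plain file loader treats the whole string, prefix included, as a path.
naive_path = PurePosixPath("/app") / "dataset:ESmodelAnotado"
print(naive_path)
print(describe_source("dataset:ESmodelAnotado"))
print(describe_source("data.jsonl"))
```

The first printed path matches the one in the RecipeError, which is consistent with the source string being handed straight to a file loader.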
Also, it's worth noting that apart from the legacy get_stream (which is still a valid way to load the data), we also have a new, reimplemented version (with improved source resolution and handling) that returns a Stream object. Please see the documentation for the recommended way of loading the data source: Components and Functions · Prodigy
You found the issue: I was using the JSONL loader instead of the new get_stream.