Hello,
I have defined a basic custom pipeline in spaCy that removes accents, special characters, stop words, etc., and tokenizes the text.
I then exported it with nlp.to_disk(...), moved all the custom code into a separate Python file, and tried to create a package using the spaCy CLI (python -m spacy package ...).
The output of that command is a folder that contains, among other things, a dist folder, a folder named after the pipeline, a meta.json file and a setup.py file.
I am doing this because the next step is to annotate my text with Prodigy, and I would like my custom pipeline to be applied before the NER step.
I saw that recipes such as ner.correct let you specify the spaCy pipeline to use. I tried to use the one I exported, but I always get an error such as "OSError: [E053] Could not read meta.json from en_myP-3.1.0.tar.gz".
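To narrow this down, here is a small standard-library sketch of the distinction I suspect is involved. It assumes that loading a pipeline from a path expects a directory with meta.json directly inside it, whereas the .tar.gz in dist is a single archive file. The helper names (describe_pipeline_path, build_demo) and the en_demo-0.0.0 paths are made up for illustration and are not part of spaCy or of my actual package.

```python
import tarfile
import tempfile
from pathlib import Path


def describe_pipeline_path(path):
    """Classify a path: a directory holding meta.json, a tar archive, or neither.

    Hypothetical helper for illustration only; it does not call spaCy.
    """
    path = Path(path)
    if path.is_dir() and (path / "meta.json").is_file():
        return "pipeline directory"
    if path.is_file() and tarfile.is_tarfile(path):
        return "archive"
    return "unknown"


def build_demo(root):
    """Create a placeholder pipeline folder and a .tar.gz of it under root."""
    root = Path(root)
    pipeline_dir = root / "en_demo-0.0.0"  # placeholder name
    pipeline_dir.mkdir()
    (pipeline_dir / "meta.json").write_text(
        '{"lang": "en", "name": "demo"}', encoding="utf8"
    )
    archive = root / "en_demo-0.0.0.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(pipeline_dir, arcname=pipeline_dir.name)
    return pipeline_dir, archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pipeline_dir, archive = build_demo(tmp)
        print(describe_pipeline_path(pipeline_dir))
        print(describe_pipeline_path(archive))
```

If this assumption is right, passing the .tar.gz path to the recipe would explain why meta.json cannot be read from it. My guess is that the archive is meant to be installed with pip and then referenced by its package name, but I have not confirmed this.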
Can you please help me understand what I am doing wrong and how I can fix it?
Thank you for your help, it's much appreciated.
Regards,
Mauro