I have some patterns and want to pre-annotate my dataset with them. I know ner.manual will highlight the matches so that I can mark each one as correct or wrong, but I don't want to check them manually. I want to treat all the pre-highlighted text as correct, convert it to a trainable format, and train on it. What would the procedure be?
Hi @ta13, welcome to Prodigy!
If you don't want to annotate, you can just use spaCy instead: use the EntityRuler with your patterns, and save the annotated samples.
Once that's done, you can either:
- Save it in Prodigy's JSONL format, then import it into the database,
- Add the Docs into a DocBin and save it in the .spacy format.
From either of those formats, you can start training right away.
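To make the steps above concrete, here is a minimal sketch assuming spaCy v3. The patterns, texts, and file names are hypothetical placeholders, and the JSONL records assume Prodigy's usual "text" / "spans" / "answer" task keys:

```python
import json

import spacy
from spacy.tokens import DocBin

# Hypothetical patterns and texts, for illustration only.
patterns = [
    {"label": "ORG", "pattern": "Explosion"},
    {"label": "PRODUCT", "pattern": [{"LOWER": "prodigy"}]},
]
texts = ["Explosion builds Prodigy.", "I annotate with prodigy every day."]

# A blank pipeline with only the entity ruler: no trained model is needed.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(patterns)

# Every pattern match becomes an entity on the Doc.
docs = list(nlp.pipe(texts))

# Option 1: write one JSON record per line, with character-offset spans.
with open("annotations.jsonl", "w", encoding="utf8") as f:
    for doc in docs:
        spans = [
            {"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
            for ent in doc.ents
        ]
        record = {"text": doc.text, "spans": spans, "answer": "accept"}
        f.write(json.dumps(record) + "\n")

# Option 2: collect the Docs in a DocBin and save it in the .spacy format.
doc_bin = DocBin(docs=docs)
doc_bin.to_disk("train.spacy")
```

The JSONL file could then be imported with something like `prodigy db-in my_dataset annotations.jsonl` (the dataset name is a placeholder), while the .spacy file can be passed to `python -m spacy train` via `--paths.train`. Keep in mind that a model trained only on pattern matches will mostly learn to reproduce those patterns, so it is worth holding out some manually checked examples for evaluation.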
When I work with the entity ruler, I get this error:
After using create_pipe(), even though it's a built-in component, I get this:
What should I do?
This sounds like perhaps you're using spaCy v2? Could you upgrade to a more recent version of spaCy v3?
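A quick way to check is to print the installed version. The sketch below also shows the v3 way of adding the component, assuming that is what the error is about: in v3, `nlp.add_pipe` takes the component's string name and returns the component, so `create_pipe()` isn't needed.

```python
import spacy

# Shows which spaCy version is installed in the current environment.
print(spacy.__version__)

nlp = spacy.blank("en")

# spaCy v3 style: pass the registered name of the built-in component.
# add_pipe creates the component, adds it to the pipeline, and returns it.
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical pattern, for illustration only.
ruler.add_patterns([{"label": "ORG", "pattern": "Explosion"}])
print(nlp.pipe_names)
```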