Hi there,
I'm using ner.manual with patterns to recognize the company names in the text.
prodigy ner.manual ner_company_names nl_core_news_lg ./assets/raw_text.jsonl --label ORG,PERSON --patterns ./assets/company_name_patterns.jsonl
I have a few questions:
- Patterns: Can I edit patterns during the ner.manual annotation while the server is running? If I make changes to the file with patterns, should I restart the server and refresh the browser? What effect will it have on the dataset? (Currently, I'm just restarting the server and refreshing the browser.)
- Evaluation dataset: The evaluation file contains a few thousand samples. Do I need to perform any annotation, run training, or should it remain as raw text?
prodigy ner.manual ner_company_names_eval nl_core_news_lg ./assets/raw_text_eval.jsonl --label ORG,PERSON
A line of the raw text looks like this:
{"text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam faucibus eros aliquam, laoreet magna et, tincidunt arcu.", "meta": {"source": "lipsun", "id": 4450141}}
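For reference, the format above can be sanity-checked with a small script along these lines. This is only a sketch: `check_jsonl_line` is a hypothetical helper (not part of Prodigy), and it assumes each line is a standalone JSON object with a non-empty "text" string and an optional "meta" object.

```python
import json


def check_jsonl_line(line):
    """Parse one JSONL line and return (text, meta).

    Raises ValueError if the line is not valid JSON, is not a JSON object,
    or has no non-empty "text" string.
    """
    record = json.loads(line)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(record, dict):
        raise ValueError("each line must be a JSON object")
    text = record.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError('missing or empty "text" field')
    # "meta" is treated as optional here (an assumption for this sketch)
    return text, record.get("meta", {})


# Shortened version of the sample line from the post
sample = '{"text": "Lorem ipsum dolor sit amet.", "meta": {"source": "lipsun", "id": 4450141}}'
text, meta = check_jsonl_line(sample)
print(text, meta)
```

To check a whole file, the same helper could be called on every non-blank line of ./assets/raw_text.jsonl.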
Thanks in advance.