I have job.jsonl data. I want to annotate it with ner.manual for named entity recognition, but I'm having a problem running it. It shows this error message:

prodigy ner.manual ner_job blank:en ./job.jsonl --label SUMMARY,COMPANYNAME
Using 2 label(s): SUMMARY, COMPANYNAME

✘ Error while validating stream: no first example
This likely means that your stream is empty. This can also mean all the examples
in your stream have been annotated in datasets included in your --exclude recipe
parameter.

Waiting for a solution.

@ines can you help me :sweat_smile:

Hi @kushal_pythonist ,

Just to sanity check, what's inside the job.jsonl file? Usually this error shows up if it cannot find an example in your JSONL file or if it's empty. You can double-check the JSONL formats in the documentation.

This is a sample from the job.jsonl file:

{"0":"Designation","1":"CompanyName","2":"CompanyLocation","3":"JobSummary","4":"PostedDate","5":"Salary"}
{"0":"newFull Stack Developer","1":"RED TECHNOLOGIES","2":"Charlotte, NC 28203 (Dilworth area)","3":"Work with other developers, designers, and product managers to develop new features consistent with the product roadmap.\nCompany-paid STD and Life insurance.","4":"PostedToday","5":"None"}
{"0":"newEntry Level Software Engineer","1":"Revature","2":"Charlotte, NC+7 locations","3":"As a Revature Entry Level Software Engineer you will receive on-the-job-training to become an experienced software engineer.","4":"Posted2 days ago","5":"None"}

Hi @kushal_pythonist, your rows use the keys "0" to "5" and have no "text" key, so Prodigy can't find an example to show. You need to format your entities / text into something that's compatible with Prodigy:

  • If you are annotating from scratch, you need to pass the whole text in the "text" field.
  • If you already have pre-annotated NER data, you need to format your spans and entities similar to the ner_manual task format.
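For the first case (annotating from scratch), here is a minimal sketch of a conversion script. It assumes the text you want to annotate is the company name (key "1") followed by the job summary (key "3"), and that the first row is a header to skip. The function names `convert_rows` and `convert_file`, the `text_keys` parameter, and the output file name are made up for illustration, so adjust them to your data.

```python
import json


def convert_rows(lines, text_keys=("1", "3")):
    """Yield Prodigy-style tasks ({"text": ..., "meta": ...}) from raw JSONL lines.

    Assumption: keys "1" and "3" hold the company name and job summary,
    as in the sample rows posted above.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        row = json.loads(line)
        # Skip the header row, which stores column names instead of data.
        if row.get("0") == "Designation":
            continue
        # Join the chosen columns into one text to annotate.
        parts = [row[key] for key in text_keys if row.get(key)]
        if not parts:
            continue  # nothing to annotate in this row
        yield {
            "text": "\n".join(parts),
            # Keep the job title as metadata so it stays visible in the UI.
            "meta": {"designation": row.get("0")},
        }


def convert_file(src, dst):
    """Read the raw JSONL file and write one task per line to a new file."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for task in convert_rows(fin):
            fout.write(json.dumps(task, ensure_ascii=False) + "\n")


# Example usage (file names are placeholders):
# convert_file("job.jsonl", "job_prodigy.jsonl")
```

After converting, you would point the same command at the new file, e.g. `prodigy ner.manual ner_job blank:en ./job_prodigy.jsonl --label SUMMARY,COMPANYNAME`.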

It's just confusing :sweat_smile: and I'm still stuck on the JSONL data format.