Is there any way to train spaCy on the annotated data that I get from Prodigy in JSONL format? My Prodigy version is quite old, and I don't want to resubscribe because the old version is enough for me. However, some functionality is now only compatible with newer spaCy versions. So I wanted to know: is there any way to train on my data from the Prodigy JSONL file with spaCy directly, instead of using ner.batch-train?
Hi, there's a built-in spaCy converter that's intended for use with Prodigy NER data:
spacy convert --lang en data.jsonl .
This should create data.json in spaCy's training format. You need to specify the language so that the converter can tokenize the texts.
See a more detailed example here: Unable to use Prodigy annotations with SpaCy CLI train
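If you would rather build the training examples yourself in a custom script instead of going through the converter, the following is a minimal sketch. It assumes each JSONL line is a Prodigy NER record with a "text" field, an optional "spans" list of character offsets ("start", "end", "label"), and an "answer" field. The function name `prodigy_to_spacy_examples` and the sample records are made up for illustration. The output is the `(text, {"entities": [...]})` tuple style used by spaCy v2 training scripts, not the JSON file that the train command reads.

```python
import json


def prodigy_to_spacy_examples(lines):
    """Turn Prodigy JSONL lines into (text, {"entities": [...]}) tuples.

    Hypothetical helper: assumes the usual Prodigy NER record layout
    with "text", "spans" and "answer" keys.
    """
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Keep accepted annotations only; records with no "answer" key are kept.
        if record.get("answer", "accept") != "accept":
            continue
        entities = [
            (span["start"], span["end"], span["label"])
            for span in record.get("spans") or []
        ]
        examples.append((record["text"], {"entities": entities}))
    return examples


# Illustrative records, not taken from the data in this thread.
sample = [
    '{"text": "Apple is based in Cupertino", '
    '"spans": [{"start": 0, "end": 5, "label": "ORG"}], "answer": "accept"}',
    '{"text": "Really very sad.", "spans": [], "answer": "reject"}',
]
examples = prodigy_to_spacy_examples(sample)
print(examples)
```

For a real file you would pass an open file handle, for example `prodigy_to_spacy_examples(open("data.jsonl", encoding="utf8"))`.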
              
              
I have converted the JSONL file from Prodigy to JSON format, and now I want to use that JSON file to train an NER model in spaCy. A sample of the JSON file is below:
[
{
"id":0,
"paragraphs":[
{
"raw":"Really very sad. Allah rehem kere Ameen",
"cats":[
    ],
    "sentences":[
      {
        "brackets":[
        ],
        "tokens":[
          {
            "ner":"U-IGNORE",
            "id":0,
            "orth":"Really"
          },
          {
            "ner":"U-IGNORE",
            "id":1,
            "orth":"very"
          },
          {
            "ner":"U-IGNORE",
            "id":2,
            "orth":"sad"
          },
          {
            "ner":"O",
            "id":3,
            "orth":"."
          }
        ]
      },
      {
        "brackets":[
        ],
        "tokens":[
          {
            "ner":"U-IGNORE",
            "id":4,
            "orth":"Allah"
          },
          {
            "ner":"U-IGNORE",
            "id":5,
            "orth":"rehem"
          },
          {
            "ner":"U-IGNORE",
            "id":6,
            "orth":"kere"
          },
          {
            "ner":"U-IGNORE",
            "id":7,
            "orth":"Ameen"
          }
        ]
      }
    ]
  }
]
}
]
How can I train on this type of data in spaCy for NER?
You can find documentation about training on the command line here:
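Before training, it can also help to check which NER tags the converted file actually contains, since the sample above shows only `U-IGNORE` and `O`. The following is a minimal sketch that assumes the nested layout from that sample (documents, then "paragraphs", "sentences", and "tokens", each token carrying a "ner" tag). The function name `count_ner_tags` and the shortened inline sample are only for illustration.

```python
import json
from collections import Counter


def count_ner_tags(docs):
    """Count the "ner" tag of every token in converted training JSON.

    Hypothetical helper: assumes the docs -> paragraphs -> sentences
    -> tokens nesting shown in the sample above.
    """
    counts = Counter()
    for doc in docs:
        for paragraph in doc.get("paragraphs", []):
            for sentence in paragraph.get("sentences", []):
                for token in sentence.get("tokens", []):
                    counts[token.get("ner", "")] += 1
    return counts


# Shortened version of the sample above, inlined so the sketch is self-contained.
sample_docs = json.loads("""
[{"id": 0, "paragraphs": [{"raw": "Really very sad.", "sentences": [
    {"tokens": [
        {"ner": "U-IGNORE", "id": 0, "orth": "Really"},
        {"ner": "O", "id": 3, "orth": "."}
    ]}
]}]}]
""")
tag_counts = count_ner_tags(sample_docs)
print(tag_counts)
```

On a real file, load it with `json.load(open("data.json", encoding="utf8"))` and pass the result to `count_ner_tags`. If none of the labels you annotated show up in the counts, the file contains no examples of those entities to learn from.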