Combined labelling for NER and classification purposes

I used ner.manual with 10 labels. A couple of them are for classification and the rest are for NER. I have several questions after running ner.manual.

  1. First of all, is it okay to combine labelling for both purposes? If so, I have the questions below.
  2. Do I need to delete classificationlabel1 and classificationlabel2 from the "spans" section before running ner.batch-train? Or should I just leave those two labels out of the --label option of ner.batch-train?
  3. Is there an easy way to convert Prodigy JSONL to IOB format to feed to non-spaCy models?

You can use the same source data, but it's not required. I'd also recommend using separate datasets for your NER and text classification annotations, to avoid conflicts and make it easier to run separate experiments. You'll still be training separate model components and you might want to run them differently. Or maybe it turns out that you need slightly different annotations for the components to achieve better results.

When you labelled the text categories, did you use spans to do that? If so, yes – the NER model will be trained using the "spans" and expects them to be named entities. If you want to train a text classifier, you typically want one text with a top-level "label", and no labelled spans in the text.
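To illustrate the difference, here's a minimal sketch of what the two record types look like (the texts and label names are made up for the example):

```python
# NER annotation: entities are marked as character-offset spans in the text
ner_example = {
    "text": "Apple hired Tim Cook",
    "spans": [
        {"start": 0, "end": 5, "label": "ORG"},
        {"start": 12, "end": 20, "label": "PERSON"},
    ],
}

# Text classification annotation: one top-level label for the whole text,
# no spans – the category applies to the text as a unit
textcat_example = {
    "text": "Apple hired Tim Cook",
    "label": "BUSINESS",
}
```

If your classification labels currently live in "spans" alongside the entity labels, you'd want to move them out into their own dataset in the second shape before training the text classifier.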

Prodigy's output gives you the original text and the character offsets into the text. This should let you write converters for any common format you need. You can also use spaCy's biluo_tags_from_offsets helper to convert character offsets to token-based BILUO tags.
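As a rough sketch of such a converter, here's one way to turn a Prodigy-style record into token-level IOB tags. It uses naive whitespace tokenization purely for illustration – in practice you'd want to tokenize the same way your target model does, or use the biluo_tags_from_offsets helper mentioned above:

```python
import json

def spans_to_iob(text, spans):
    """Convert character-offset spans to (token, IOB tag) pairs.

    Whitespace tokenization is a simplification for this sketch; spans that
    don't align with token boundaries are silently tagged "O".
    """
    # Compute (token, start, end) offsets for each whitespace token
    tokens = []
    offset = 0
    for tok in text.split():
        start = text.index(tok, offset)
        tokens.append((tok, start, start + len(tok)))
        offset = start + len(tok)

    tagged = []
    for tok, start, end in tokens:
        tag = "O"
        for span in spans:
            if start >= span["start"] and end <= span["end"]:
                # First token of the span gets B-, continuation tokens get I-
                prefix = "B" if start == span["start"] else "I"
                tag = f"{prefix}-{span['label']}"
                break
        tagged.append((tok, tag))
    return tagged

# Example Prodigy-style JSONL line (made-up data)
line = ('{"text": "Apple hired Tim Cook", "spans": '
        '[{"start": 0, "end": 5, "label": "ORG"}, '
        '{"start": 12, "end": 20, "label": "PERSON"}]}')
record = json.loads(line)
print(spans_to_iob(record["text"], record["spans"]))
# → [('Apple', 'B-ORG'), ('hired', 'O'), ('Tim', 'B-PERSON'), ('Cook', 'I-PERSON')]
```

To go via BILUO instead, you can map U- to B- and L- to I- after calling spaCy's helper.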

[INES] Thanks a lot for clarifying NER vs. classification labelling.
I have another doubt. My input is a document with a lot of paragraphs, and sometimes I missed labelling entities. I'm now editing the dataset to cover the missing data. Will it affect the model if I accidentally miss tagging a lot of entities in the document?

Yes, if your data is inconsistent, your model may produce significantly worse results. During training, you're essentially asking it to come up with a strategy that's consistent with the training data – and a strategy based on wrong or inconsistent annotations may not generalise that well.