Exporting NER annotations for HF datasets

ChrisM · September 27, 2021, 9:07am

Just a quick question in case someone has already done this: I have exported my Prodigy annotations to JSONL so I can train an NER model with the transformers library, and I plan to load them with the HF datasets package. Has anybody converted the Prodigy span format into something that loads seamlessly into a dataset and then into a transformers NER model?

ChrisM · September 30, 2021, 3:17am

I've done it! With some very slow pandas along the way, unfortunately. I'll try to optimise and then share at some point.

ines · October 1, 2021, 8:57am

Awesome! If there's something you want to share, that'd be great, and we'd be happy to help you tidy it up so we can maybe make it a proper integration. That'd definitely be super cool to have.

ChrisM · December 9, 2021, 6:19am

If anyone is interested, this is part of an upcoming paper using transformers for adverse drug reaction detection. We're using Prodigy for all our annotation. The repo (AustinMOS/adr-nlp on GitHub) trains an NER model from a JSONL file of annotations exported from Prodigy.
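
For readers who just want a starting point, here is a minimal sketch of the kind of conversion discussed in this thread. It is not the code from the repo above. It assumes each exported record carries a `tokens` list and `spans` with inclusive `token_start` / `token_end` indices, plus an `answer` field, which is my understanding of Prodigy's NER export; check your own file before relying on it. The function names, the example sentence, and the `ADR` / `DRUG` labels are made up for illustration.

```python
import json


def prodigy_to_bio(record):
    """Turn one Prodigy-style NER record into tokens plus BIO tags.

    Assumes record["tokens"] is a list of {"text": ...} dicts and each span
    has "token_start", "token_end" (inclusive) and "label".
    """
    tokens = [tok["text"] for tok in record["tokens"]]
    tags = ["O"] * len(tokens)
    for span in record.get("spans", []):
        start, end = span["token_start"], span["token_end"]
        tags[start] = "B-" + span["label"]
        # token_end is treated as inclusive, hence end + 1
        for i in range(start + 1, end + 1):
            tags[i] = "I-" + span["label"]
    return {"tokens": tokens, "ner_tags": tags}


def load_prodigy_jsonl(path):
    """Read a JSONL export and keep only accepted examples (hypothetical helper)."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Records without an "answer" field are kept by default
            if record.get("answer", "accept") != "accept":
                continue
            examples.append(prodigy_to_bio(record))
    return examples


# Hand-written record for illustration only
example = {
    "text": "Patient developed severe nausea after ibuprofen.",
    "tokens": [
        {"text": "Patient", "id": 0},
        {"text": "developed", "id": 1},
        {"text": "severe", "id": 2},
        {"text": "nausea", "id": 3},
        {"text": "after", "id": 4},
        {"text": "ibuprofen", "id": 5},
        {"text": ".", "id": 6},
    ],
    "spans": [
        {"token_start": 2, "token_end": 3, "label": "ADR"},
        {"token_start": 5, "token_end": 5, "label": "DRUG"},
    ],
    "answer": "accept",
}

converted = prodigy_to_bio(example)
print(converted["ner_tags"])
```

The resulting list of dicts is plain Python, so it can be handed to the datasets package in whatever way suits your version (for instance by building a dict of columns for `Dataset.from_dict`). You would still need to map the string tags to integer ids and align them with the model's subword tokens before training.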
