Sequence labelling with Prodigy?

tba · February 26, 2018, 10:18pm

Hi,
I’ve got a list of phrases with entities separated by commas, as shown below.
Phrase 1 : “Entity 1, Entity 2, Entity 3, Entity 4”
Phrase 2 : “Entity 5, Entity 6, Entity 7, Entity 8, Entity 9”

I have 5 categories, and each entity belongs to exactly 1 category. I’d use a text classifier, but the position of the entity and the labels of the previous entities are important, so it’s more like a sequence labelling problem.
Can I use Prodigy’s NER feature for this? I’ve tried to create a recipe similar to ner.teach with a custom tokenizer, but it doesn’t seem to do the trick.

honnibal · February 26, 2018, 11:34pm

Do you already have the data labelled? If so, you might want to work with spaCy directly. I would guess the NER model would be able to learn your data. Check that it can memorise a small sample first: train it on a few examples, then evaluate it on the same ones to make sure it’s learning them.
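
As a rough illustration of preparing such a small sample, the sketch below turns a comma-separated phrase into character-offset entity spans. It is plain Python and does not call spaCy. The function names `comma_spans` and `to_training_example`, and the labels `CAT_A` to `CAT_C`, are hypothetical. The output layout, a `(text, {"entities": [(start, end, label)]})` tuple, is assumed here to match the training format spaCy accepted at the time of this thread, so check it against the version you use.

```python
def comma_spans(text):
    """Return (start, end) character offsets of each comma-separated chunk,
    with surrounding whitespace excluded from the span."""
    spans = []
    pos = 0  # offset of the current chunk within the full text
    for chunk in text.split(","):
        stripped = chunk.strip()
        if stripped:
            start = pos + chunk.index(stripped)
            spans.append((start, start + len(stripped)))
        pos += len(chunk) + 1  # +1 skips the comma itself
    return spans


def to_training_example(text, labels):
    """Pair each comma-delimited span with its label (one label per entity)."""
    spans = comma_spans(text)
    if len(spans) != len(labels):
        raise ValueError("expected one label per comma-separated entity")
    entities = [(start, end, label) for (start, end), label in zip(spans, labels)]
    return (text, {"entities": entities})


# Hypothetical labels, for illustration only.
example = to_training_example(
    "Entity 1, Entity 2, Entity 3", ["CAT_A", "CAT_B", "CAT_C"]
)
```

Training on a handful of examples built this way and evaluating on the same ones is the memorisation check described above.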

Actually I think the parser would be able to learn your data as well, if you make a little tree out of your phrases instead of a flat list. You could make each phrase depend on the one immediately after it. This might perform better, as the parser takes more care to condition on the current state than the NER model does.
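
One possible reading of the "little tree" idea, sketched at the level of whole entities rather than tokens: each entity's head is the entity immediately after it, and the last entity is the root. The names `chain_heads` and `phrase_to_arcs` are hypothetical. Two further points are assumptions of this sketch and are not stated in the thread: putting the category on the arc label, and marking the root by pointing it at its own index. Mapping these entity-level arcs onto real tokens (including the commas) is left out.

```python
def chain_heads(n_items):
    """Head index for each item in a left-to-right chain: item i attaches to
    item i + 1, and the last item points to itself to mark it as the root."""
    if n_items < 1:
        raise ValueError("need at least one item")
    return [min(i + 1, n_items - 1) for i in range(n_items)]


def phrase_to_arcs(text, labels):
    """Split a phrase on commas and attach each entity to the next one.
    Using the category as the arc label is an assumption of this sketch."""
    entities = [part.strip() for part in text.split(",") if part.strip()]
    if len(entities) != len(labels):
        raise ValueError("expected one label per comma-separated entity")
    heads = chain_heads(len(entities))
    return [
        {"text": entity, "head": head, "dep": label}
        for entity, head, label in zip(entities, heads, labels)
    ]


# Hypothetical labels, for illustration only.
arcs = phrase_to_arcs("Entity 1, Entity 2, Entity 3", ["CAT_A", "CAT_B", "CAT_C"])
```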

tba · February 27, 2018, 10:05am

Thanks for the quick reply.
The data is not labelled yet. I was planning on using Prodigy for the labelling (since Prodigy helps minimize the volume of data to be labelled), but I was looking for a simple way to force Prodigy to understand that the boundary between entities is always a comma. If that’s not possible, I’ll label the data another way.
I get the spaCy NER option, but I don’t get the parser option. Could you elaborate?
