Forcing NER to ignore stopwords

claire · June 8, 2018, 10:28am

This is great, thank you Ines! Annotating is already much faster without these obvious rejections. As suggested, I mark those examples as rejected on exit; however, I am a little uncertain how to save them back to my dataset. The relevant part of my recipe is:

    def reject_stopwords(removed_examples):
        for eg in removed_examples:
            eg['answer'] = 'reject'
            print(eg)

    def on_exit(ctrl):
        nonlocal removed_examples
        # this is called when you exit the Prodigy server
        dataset_name = ctrl.dataset
        database = ctrl.db
        # do something here
        reject_stopwords(removed_examples)

I am not sure whether I should collect the updated rejected examples into a list and pass that back, or whether I can update the dataset on the fly.
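
To make the first option concrete, here is a minimal sketch of what I have in mind. It assumes the database object exposes an `add_examples(examples, datasets=[...])` method; that is my assumption and I have not verified it, nor whether the examples would also need hashes set before saving. `FakeDB`, `FakeController` and `make_on_exit` are hypothetical stand-ins so the snippet runs without Prodigy.

```python
class FakeDB:
    """Hypothetical stand-in for the database object (ctrl.db)."""

    def __init__(self):
        self.datasets = {}

    def add_examples(self, examples, datasets):
        # Assumed signature: store the examples under every named dataset.
        for name in datasets:
            self.datasets.setdefault(name, []).extend(examples)


class FakeController:
    """Hypothetical stand-in for the controller passed to on_exit."""

    def __init__(self, dataset, db):
        self.dataset = dataset
        self.db = db


def reject_stopwords(removed_examples):
    # Return copies marked as rejected instead of only printing them,
    # so the caller has a list it can pass to the database.
    return [dict(eg, answer="reject") for eg in removed_examples]


def make_on_exit(removed_examples):
    # Closure over removed_examples, as in the recipe above.
    def on_exit(ctrl):
        rejected = reject_stopwords(removed_examples)
        ctrl.db.add_examples(rejected, datasets=[ctrl.dataset])

    return on_exit


removed = [{"text": "the"}, {"text": "and"}]
ctrl = FakeController("my_dataset", FakeDB())
make_on_exit(removed)(ctrl)
saved = ctrl.db.datasets["my_dataset"]
```

Here the rejected examples are built as a list and saved in one call when the server exits, rather than updating the dataset while annotating.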

Thanks.
