get_seeds_from_set function

madhujahagirdar · March 20, 2018, 1:38am

Any idea whether get_seeds_from_set is deprecated (I upgraded to Prodigy 1.4), or is there something else I can use instead?

  File "", line 1, in <module>
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/__init__.py", line 4, in <module>
    from . import recipes # noqa
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/recipes/__init__.py", line 4, in <module>
    from . import dep, ner, textcat, pos, compare, terms, generic, textcat_attention_weights # noqa
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/recipes/textcat_attention_weights.py", line 12, in <module>
    from prodigy.util import get_seeds, get_seeds_from_set, log
ImportError: cannot import name 'get_seeds_from_set'

ines · March 20, 2018, 10:20am

Yes, in v1.4.0, the textcat.teach recipe now also takes a match patterns file instead of just a list or dataset of single seed terms (just like the NER recipes). So all internal helper functions that specifically handled seed terms from lists and sets are deprecated.

See the recipe source for the implementation. If you want to use the new pattern matching and you already have an existing seeds dataset, you can use the terms.to-patterns recipe to convert it to a patterns file.
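To give a rough idea of what a patterns file contains, here is a minimal sketch that turns a list of seed terms into token-based match patterns, one JSON object per line. This is not the terms.to-patterns implementation: the function names seeds_to_patterns and to_jsonl are made up for illustration, and splitting on whitespace is a simplifying assumption that may not match how your spaCy model tokenizes the text.

```python
import json


def seeds_to_patterns(seeds, label):
    """Convert seed terms into token-based match patterns (illustrative sketch).

    Assumption: terms are split on whitespace, which may differ from the
    tokenization of the spaCy model you use.
    """
    # Lowercase, collapse whitespace and drop duplicates and empty terms
    normalized = {" ".join(term.lower().split()) for term in seeds}
    normalized.discard("")
    patterns = []
    for term in sorted(normalized):
        # One {"lower": ...} token description per whitespace-separated token
        tokens = [{"lower": token} for token in term.split(" ")]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


if __name__ == "__main__":
    print(to_jsonl(seeds_to_patterns(["Idiot", "loser face"], "INSULT")))
```

The resulting lines could be saved to a .jsonl file and passed to the recipe as its patterns file. For an existing seeds dataset, the built-in recipe is still the safer route.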

Alternatively, you can also write a simple get_seeds_from_set function yourself. All it did was take a dataset name and extract the text of all examples that were marked as accepted. (Prodigy’s helper function also raised some useful errors and printed details, but you can leave this out if you don’t need it.)

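from prodigy.components.db import connect

db = connect()  # uses the database settings from your Prodigy config

def get_seeds_from_set(dataset):
    # Collect the text of every example in the dataset that was accepted
    examples = db.get_dataset(dataset)
    seeds = [eg['text'] for eg in examples if eg.get('answer') == 'accept']
    return set(seeds)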
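If you do want some of the error handling back, one option is to separate the filtering from the database call so it can be checked without a Prodigy installation. The sketch below is an assumption about what useful checks might look like, not Prodigy's original helper: seeds_from_examples and its error messages are hypothetical, and it assumes the database lookup gives you None when the dataset doesn't exist.

```python
def seeds_from_examples(examples, dataset="(unnamed)"):
    """Extract the text of all accepted examples (illustrative sketch).

    `examples` is a list of annotation dicts, e.g. the result of
    db.get_dataset(dataset). Assumption: it is None if the dataset is missing.
    """
    if examples is None:
        raise ValueError("Can't find seeds dataset '{}'".format(dataset))
    # Keep only accepted examples that actually have a non-empty text
    seeds = {
        eg["text"]
        for eg in examples
        if eg.get("answer") == "accept" and eg.get("text")
    }
    if not seeds:
        raise ValueError("No accepted seed terms in dataset '{}'".format(dataset))
    return seeds


if __name__ == "__main__":
    examples = [
        {"text": "idiot", "answer": "accept"},
        {"text": "hello", "answer": "reject"},
        {"text": "loser", "answer": "accept"},
    ]
    print(sorted(seeds_from_examples(examples, "insult_seeds")))
```

With a database connection you would call it as seeds_from_examples(db.get_dataset(name), name).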
