get_seeds_from_set function

Any idea whether get_seeds_from_set is deprecated (I upgraded to Prodigy 1.4), or whether something else can be used instead?

  File "", line 1, in <module>
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/__init__.py", line 4, in <module>
    from . import recipes # noqa
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/recipes/__init__.py", line 4, in <module>
    from . import dep, ner, textcat, pos, compare, terms, generic, textcat_attention_weights # noqa
  File "/home/madhujahagirdar/cnn-annotation/venv/lib/python3.5/site-packages/prodigy/recipes/textcat_attention_weights.py", line 12, in <module>
    from prodigy.util import get_seeds, get_seeds_from_set, log
ImportError: cannot import name 'get_seeds_from_set'

Yes. As of v1.4.0, the textcat.teach recipe takes a match patterns file instead of just a list or dataset of single seed terms (just like the NER recipes). So all internal helper functions for the specific handling of seed terms from lists and sets are deprecated, which is why the import in your custom textcat_attention_weights recipe fails.

See the recipe source for the implementation. If you want to use the new pattern matching and you already have an existing seeds dataset, you can use the terms.to-patterns recipe to convert it to a patterns file.
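
If you would rather convert a handful of seed terms by hand, here is a rough sketch of what that conversion could look like. It assumes the patterns file is JSONL with one {"label": ..., "pattern": [...]} entry per line and one case-insensitive token pattern per word; the function name seeds_to_pattern_lines and the FRUIT label are made up for illustration, so check the output of terms.to-patterns for the exact format your version expects.

```python
import json


def seeds_to_pattern_lines(seeds, label):
    """Return one JSON string per seed term, in the assumed patterns format."""
    lines = []
    for term in sorted(seeds):
        # Assumption: one token pattern per whitespace-separated word,
        # matched on the lowercase form of the token.
        pattern = [{"lower": word.lower()} for word in term.split()]
        lines.append(json.dumps({"label": label, "pattern": pattern}))
    return lines


# Hypothetical seed terms and label, for illustration only.
lines = seeds_to_pattern_lines({"apple", "blood orange"}, "FRUIT")
print("\n".join(lines))
```

Writing those lines to a .jsonl file, one per line, should give you something you can compare against what the recipe produces.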

Alternatively, you can also write a simple get_seeds_from_set function yourself. All it did was take a dataset name and extract the text of all examples that were marked as accepted. (Prodigy’s helper function also raised some useful errors and printed details, but you can leave this out if you don’t need it.)

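from prodigy.components.db import connect

# Connect to the database configured in your prodigy.json
db = connect()

def get_seeds_from_set(dataset):
    # Load all annotated examples stored under the given dataset name
    examples = db.get_dataset(dataset)
    # Keep only the text of examples that were accepted
    seeds = [eg['text'] for eg in examples if eg['answer'] == 'accept']
    return set(seeds)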
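
If you do want some of the error handling back, one option is to keep the filtering separate from the database lookup, so it is easy to check on plain dictionaries. This is only a sketch: seeds_from_examples is a hypothetical name, the error messages are not the ones Prodigy's original helper printed, and it assumes that a missing or empty dataset comes back as None or an empty list.

```python
def seeds_from_examples(examples, dataset="(unnamed)"):
    """Collect the text of all accepted examples, with basic error checks."""
    if not examples:
        # Assumption: a missing or empty dataset is None or an empty list.
        raise ValueError("No examples found in dataset '{}'".format(dataset))
    # Skip examples without a text or an answer instead of raising KeyError.
    seeds = {
        eg["text"]
        for eg in examples
        if eg.get("answer") == "accept" and "text" in eg
    }
    if not seeds:
        raise ValueError("No accepted seed terms in dataset '{}'".format(dataset))
    return seeds


# Plain dictionaries standing in for stored annotations.
demo = [
    {"text": "apple", "answer": "accept"},
    {"text": "table", "answer": "reject"},
    {"text": "pear", "answer": "accept"},
]
print(sorted(seeds_from_examples(demo, "demo_seeds")))
```

With Prodigy installed, you would pass in the result of db.get_dataset(dataset) instead of the demo list.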