I’m trying to write a custom loader for my text data. The text is stored in a CSV file, and my loader function is below:
import csv

def load_data(path):
    # Yield one task dict per CSV row, using the description column as the text.
    with open(path) as f:
        reader = csv.DictReader(f)
        for row in reader:
            text = row.get('Description of the Event')
            task = {'text': text}
            yield task
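To rule out the loader itself, it can be checked in isolation. The following is only a sketch: the temporary CSV file and its two rows are made up for illustration, and only the 'Description of the Event' column name comes from my real data.

```python
import csv
import os
import tempfile

def load_data(path):
    # Same loader as above: one task dict per CSV row.
    with open(path, newline='') as f:
        reader = csv.DictReader(f)
        for row in reader:
            yield {'text': row.get('Description of the Event')}

# Hypothetical sample file, written only for this check.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='', delete=False) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(['id', 'Description of the Event'])
    writer.writerow(['1', 'First event'])
    writer.writerow(['2', 'Second event'])
    tmp_path = tmp.name

# Consume the generator into a list so the tasks can be inspected.
tasks = list(load_data(tmp_path))
os.remove(tmp_path)
print(tasks)
```

If this prints a list of {'text': ...} dicts, the loader is producing tasks in the shape I expect, and the problem is more likely in how the stream is passed on afterwards.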
The issue I’m running into is that when I run the recipe, I get an error stating that object of type ‘generator’ has no length. The recipe is below:
@recipe('sent.teach',
        dataset=recipe_args['dataset'],
        spacy_model=recipe_args['spacy_model'],
        source=recipe_args['source'],
        label=recipe_args['label_set'],
        long_text=("Long text", "flag", "L", bool))
def teach(dataset, spacy_model, source=None, label='', long_text=False):
    log('RECIPE: Starting recipe sent.teach', locals())
    DB = connect()
    path = os.getcwd()
    if spacy_model:
        print("loading model")
        nlp = spacy.load(spacy_model)
    stream = load_data(os.path.join(path, 'data', 'sims_2018.csv'))
    # Rank the stream. Note this is continuous, as model() is a generator.
    # As we call model.update(), the ranking of examples changes.
    stream = prefer_uncertain(nlp(stream))
    return {
        'view_id': 'classification',
        'dataset': dataset,
        'stream': stream,
        'update': nlp.update,
        'config': {'lang': nlp.lang, 'labels': nlp.textcat.labels}
    }
I used textcat.teach as a template for creating my recipe. I know the error arises on this line:
    stream = prefer_uncertain(nlp(stream))
However, I thought the input to the predict function was supposed to be a generator. I’m not sure what I’m doing wrong here.
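For reference, the same kind of error can be reproduced without Prodigy or spaCy at all. This is only a sketch, and it rests on an assumption: that something inside the failing call takes len() of whatever it is given. The make_stream helper is hypothetical and stands in for my loader.

```python
def make_stream():
    # Hypothetical stand-in for load_data(): a generator of task dicts.
    yield {'text': 'example one'}
    yield {'text': 'example two'}

stream = make_stream()

# Generators produce items lazily and do not support len().
try:
    len(stream)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# A list built from a fresh generator does support len().
examples = list(make_stream())
print(len(examples))
```

So the error itself just means that a generator ended up somewhere that expects an object with a length, such as a string or a list. What I can’t tell is which call in my recipe should receive the whole stream and which should receive a single text.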