Error with A/B testing recipe

Has anyone successfully used the A/B testing recipe? I created two small model files with 2 examples and got the following error:


Traceback (most recent call last):
  File "cython_src\prodigy\core.pyx", line 55, in prodigy.core.Controller.__init__
  File "C:\Users\svajjala\Python36\lib\site-packages\toolz\itertoolz.py", line 368, in first
    return next(iter(seq))
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\svajjala\Python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\svajjala\Python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\svajjala\Python36\lib\site-packages\prodigy\__main__.py", line 259, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 178, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "cython_src\prodigy\core.pyx", line 60, in prodigy.core.Controller.__init__
ValueError: Error while validating stream: no first batch. This likely means that your stream is empty.


So I tried using just the two examples from the docs page (https://prodi.gy/docs/workflow-evaluation), i.e. a modelA.jsonl and a modelB.jsonl with one example in each, and got exactly the same error. Is there a minimum number of required examples? I didn’t find anything about this in earlier forum posts.

Thanks for the report. It should definitely be possible to run Prodigy with only one example in the stream, so maybe you actually came across a bug here. (If so, sorry about that!)

Could you share the exact command you ran and an example of your two JSONL files? (Or did you use the exact one from the docs?)

prodigy dataset abtestingtrial

prodigy compare abtestingtrial modelA.jsonl modelB.jsonl

These are the commands I used. The two JSONL files had one example each, which I copied from the demo. (I also tried with my own examples, but got the same error.)

@ines - any ideas on this issue are appreciated! I used the same example as in the README. The commands I used are mentioned in the previous comment.

Thanks for your patience – I tested it and couldn’t reproduce it at first, but I think I figured it out. Sorry, the docs were probably a bit misleading here.

JSONL (newline-delimited JSON) is read line by line, so each example has to be on a single line. This way, we avoid having to parse the entire file upfront. The examples in the docs are spread over several lines to make them easier to read, but the actual files have to look like this:

{"id": 0, "input": {"image": "https://images.unsplash.com/photo-1433162653888-a571db5ccccf?ixlib=rb-0.3.5&q=80&fm=jpg&crop=entropy&cs=tinysrgb&w=400&fit=max&s=cb3099ba9dc50a500db3b298c6d7c156"}, "output": {"text": "A pug in a blanket on the grass"}}
{"id": 0, "input": {"image": "https://images.unsplash.com/photo-1433162653888-a571db5ccccf?ixlib=rb-0.3.5&q=80&fm=jpg&crop=entropy&cs=tinysrgb&w=400&fit=max&s=cb3099ba9dc50a500db3b298c6d7c156"}, "output": {"text": "An unhappy child in the garden"}}

We should probably add a note to the example that explains this, or format it differently. If you load a JSONL file in which a line doesn’t contain a full, valid JSON object, that line is skipped because it can’t be parsed, so you end up with a stream of 0 examples.
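To check a file before passing it to a recipe, a quick test along these lines might help. It is a minimal sketch that mimics the behaviour described above (each line parsed on its own, unparseable lines not counted); it is not Prodigy's actual loader, and `count_valid_lines` is a hypothetical name.

```python
import json


def count_valid_lines(lines):
    """Return (number of lines holding a full JSON object, bad line numbers).

    Hypothetical helper for illustration; not Prodigy's own loading code.
    """
    valid = 0
    invalid = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            invalid.append(number)
            continue
        # An example is expected to be a JSON object, i.e. a dict in Python.
        if isinstance(obj, dict):
            valid += 1
        else:
            invalid.append(number)
    return valid, invalid


# Example usage (assumes modelA.jsonl exists in the working directory):
# with open("modelA.jsonl", encoding="utf8") as f:
#     print(count_valid_lines(f))
```

If the first number is 0, as it would be for an example pasted across several lines, the stream will be empty and you'd expect to see the "no first batch" error.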

Thanks Ines!
