A/B Evaluation not working.

Hi,

I have been trying to use the compare recipe for A/B evaluation, but it isn't working. I first tried it with a custom recipe and then without one, but in both cases I got the same error response.
The error is as follows:

I am unable to locate the error. Could you please help me locate the error?

hi @PrithaSarkar!

Thanks for your question and welcome to the Prodigy community :wave:

From a quick look at your error, I think there's an issue with either your data or your custom recipe.

Since your Project info doesn't even show the recipe name or dataset, I suspect you may have misspecified your custom recipe.

Can you provide reproducible examples of your data and your custom recipe? If you didn't use a custom recipe, can you provide the command so I can see how you ran your recipe in case you missed anything?

Hi! Thanks for your reply.

So let me start by providing examples of the datasets I have been using.
file_a:
{"id": 0, "input": {"text": "Declaration of competing interest There is no conflict to declare."}, "output": {"text": "competing interest"}}
{"id": 1, "input": {"text": "Competing interests: The authors have declared that no competing interests exist."}, "output": {"text": "no competing interests"}}
file_b:
{"id": 0, "input": {"text": "Declaration of competing interest There is no conflict to declare."}, "output": {"text": ["of competing interest There is no conflict to declare "]}}
{"id": 1, "input": {"text": "Competing interests: The authors have declared that no competing interests exist."}, "output": {"text": ["The authors have declared that no competing interests exist "]}}

The command I used:
python3 -m prodigy $PRODIGY_RECIPE -NR -D $PRODIGY_DATASET $CANDIDATE_ANNOTATIONS_FILE_A $CANDIDATE_ANNOTATIONS_FILE_B;
where,
PRODIGY_RECIPE = compare
PRODIGY_DATASET= test
CANDIDATE_ANNOTATIONS_FILE_A = file_a
CANDIDATE_ANNOTATIONS_FILE_B = file_b
This was the last command I ran before raising the issue; it did not include the custom recipe path.

Previously, I was using a custom recipe with the following command:
python3 -m prodigy $PRODIGY_RECIPE -NR -D $PRODIGY_DATASET $CANDIDATE_ANNOTATIONS_FILE_A $CANDIDATE_ANNOTATIONS_FILE_B -F $PRODIGY_CUSTOM_RECIPE;
where the PRODIGY_CUSTOM_RECIPE variable points to the recipe file. The recipe was inspired by the one here.

Hi @PrithaSarkar!

Yes, it looks like there are small typos in your input format, specifically in file_b.jsonl.

You have unnecessary [ and ] brackets around the "text" value in:

"output": {"text": ["of competing interest There is no conflict to declare "]}

and

"output": {"text": ["The authors have declared that no competing interests exist "]}

Removing those brackets yields this new file_b.jsonl:

{"id": 0, "input": {"text": "Declaration of competing interest There is no conflict to declare."}, "output": {"text": "of competing interest There is no conflict to declare "}}
{"id": 1, "input": {"text": "Competing interests: The authors have declared that no competing interests exist."}, "output": {"text": "The authors have declared that no competing interests exist "}}

Which then ran with:

python3.9 -m prodigy compare -NR -D issue-6490 file_a.jsonl file_b.jsonl 
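
For larger files, the bracket removal above could be scripted rather than done by hand. The following is a minimal sketch, not part of Prodigy: the helper names `unwrap_output_text` and `fix_jsonl` are hypothetical, and it assumes the only problem is that "output" → "text" is a list of strings where a single string is expected.

```python
import json


def unwrap_output_text(record):
    """Return a copy of the record whose output text is a plain string.

    Assumption: a list under output["text"] should be collapsed into one
    string (a single-element list simply becomes that element).
    """
    output = dict(record.get("output", {}))
    text = output.get("text")
    if isinstance(text, list):
        output["text"] = " ".join(str(item) for item in text)
    return {**record, "output": output}


def fix_jsonl(src_path, dst_path):
    """Rewrite a JSONL file line by line, unwrapping list-valued output text."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                fixed = unwrap_output_text(json.loads(line))
                dst.write(json.dumps(fixed) + "\n")
```

Calling `fix_jsonl("file_b.jsonl", "file_b_fixed.jsonl")` would write a cleaned copy and leave the original file untouched; the output file name here is only an example.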

Also, depending on your task, make sure your two input files conform to the specification for the compare recipe:

Expects two JSONL files where each entry has an "id" (to match up the outputs on the same input), and an "input" and "output" object with the content to render, e.g. the "text"
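
A quick pre-flight check along the lines of that description might look like the sketch below. It is not Prodigy code: the function names are hypothetical, and it assumes the content to render lives under a "text" key holding a string, as in the files in this thread (the recipe description only gives "text" as an example).

```python
import json


def check_compare_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    if "id" not in record:
        problems.append("missing 'id'")
    for key in ("input", "output"):
        value = record.get(key)
        if not isinstance(value, dict):
            problems.append(f"'{key}' should be an object")
        elif not isinstance(value.get("text"), str):
            # Assumption: the rendered content is a string under "text".
            problems.append(f"'{key}.text' should be a string")
    return problems


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def check_compare_files(path_a, path_b):
    """Check both files and report ids that appear in only one of them."""
    records_a, records_b = load_jsonl(path_a), load_jsonl(path_b)
    report = {}
    for name, records in (("a", records_a), ("b", records_b)):
        for index, record in enumerate(records):
            problems = check_compare_record(record)
            if problems:
                report[(name, index)] = problems
    ids_a = {r.get("id") for r in records_a}
    ids_b = {r.get("id") for r in records_b}
    return report, ids_a ^ ids_b  # symmetric difference: unmatched ids
```

Running `check_compare_files("file_a.jsonl", "file_b.jsonl")` on the original files from this thread should flag the list-valued "text" in every line of file_b.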

At a quick glance, your example files looked a little confusing. For example, for file_b.jsonl, the input and output text were not complete sentences, so I'm wondering if this was an accidental error. No worries if not.

Hope this helps!

Hi @ryanwesslen! Thanks for getting back.

The incomplete sentences were not accidental, but I see what you mean about the extra [ and ] brackets causing problems. Thanks for taking a look at it. I'll apply your advice.