Tournament Recipe Examples?

Question: Where can I find an example tournament recipe? The recipe repo doesn’t have any.

Use Case: I have pre-generated documents that I want to evaluate with Prodigy. Tournaments seem like a great way to do this, but the included recipes assume I'm generating documents on the fly. I already have 3 variations of each document, and I want to run through a bunch of pairwise comparisons to measure which variation is best.

I’d hoped to start with a simple example and build up, but I’ll try starting with site-packages\prodigy\recipes\openai\ab.py and slimming it down.

Welcome to the forum @pbronez! :waving_hand:

Looking at the source code was exactly the right thing to do. Since v1.16.0 we've removed all the Cython-compiled code, making the source available to users, which is also why we deprioritized adding new examples to the recipe repo.

You should definitely be able to adapt the existing ab.llm.tournament or ab.openai.tournament to your use case: strip out the LLM generation layer and read the pre-computed text variants from the input instead.
Assuming the following input format:

 {"text": "optional context", "variants": {"gpt4": "...", "claude": "...", "llama": "..."}}
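If you still need to assemble that JSONL from your pre-generated documents, here's a minimal stdlib-only sketch (the file name, context strings, and variant keys are all placeholders):

```python
import json

# Hypothetical pre-generated variants; replace with your own documents
docs = [
    {
        "text": "Summarize the quarterly report.",
        "variants": {
            "gpt4": "Summary from variation 1...",
            "claude": "Summary from variation 2...",
            "llama": "Summary from variation 3...",
        },
    },
]

# Write one JSON object per line, matching the format above
with open("variants.jsonl", "w", encoding="utf8") as f:
    for record in docs:
        f.write(json.dumps(record) + "\n")
```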

The skeleton of it would be something like this:

import itertools
import random
from pathlib import Path
from statistics import NormalDist
from typing import Dict, List

import srsly

from prodigy.components.db import connect
from prodigy.components.stream import Stream
from prodigy.components.tournament import GlickoTournament
from prodigy.core import Arg, recipe
from prodigy.util import log, msg, set_hashes


# Just copy print_prob_table from prodigy/recipes/llm/ab.py — it's a small
# standalone helper (~20 lines) that uses tournament.top_k(), ._ratings, and
# .match_log to display a probability table via msg.table().
def print_prob_table(t: GlickoTournament):
    ...


# A tiny validator (same idea as in ab.llm.tournament) — ensures the user
# actually selected one of the two options before accepting. Raising
# ValueError makes Prodigy show the message in the UI.
def ensure_option_selected_when_accept(eg):
    if eg.get("answer") == "accept" and not eg.get("accept"):
        raise ValueError("Please select one of the two options before accepting.")


@recipe(
    "ab.tournament",
    dataset=Arg(help="Dataset to save answers to"),
    inputs_path=Arg(help="Path to JSONL file with variants"),
    no_random=Arg("--no-random", "-NR", help="Don't randomize option order"),
    resume=Arg("--resume", "-r", help="Resume from dataset, replaying existing matches"),
    nometa=Arg("--no-meta", "-nm", help="Don't display meta information"),
)
def ab_tournament(
    dataset: str,
    inputs_path: Path,
    no_random: bool = False,
    resume: bool = False,
    nometa: bool = False,
) -> Dict:
    db = connect()

    # Step 1: Load the JSONL and extract variant names from the first record.
    # The keys of record["variants"] are your player names (e.g. "gpt4", "claude").
    # Validate that there are at least 2 variants.
    records = list(srsly.read_jsonl(inputs_path))
    variant_names = sorted(records[0]["variants"].keys())
    if len(variant_names) < 2:
        msg.fail("Need at least two variants per record to run a tournament", exits=1)

    # Step 2: Initialize the tournament — same as ab.llm.tournament but using
    # variant names instead of prompt/config combo names.
    tournament = GlickoTournament(options=variant_names)

    # Step 3: The update callback. Called after each annotation. Look at how
    # ab.llm.tournament does it — extract the winner from ex["accept"][0],
    # compare with ex["options"][0]["id"] to determine the outcome, then call
    # tournament.update(name_i, name_j, outcome).
    def update(examples: List[Dict], show_table: bool = True) -> None:
        for ex in examples:
            if ex["answer"] == "accept" and ex.get("accept"):
                option_a = ex["options"][0]["id"]
                option_b = ex["options"][1]["id"]
                winner_id = ex["accept"][0]
                tournament.update(
                    name_i=option_a,
                    name_j=option_b,
                    outcome=1.0 if winner_id == option_a else 0.0,
                )
        if show_table:
            print_prob_table(t=tournament)

    # Step 4: Resume logic — replay matches from the existing dataset.
    # Same pattern as ab.llm.tournament: load examples from db, check that
    # both players are in variant_names, call update() silently.
    if resume and dataset in db.datasets:
        prev_examples = db.get_dataset_examples(dataset)
        # Only replay matches between players that exist in this tournament
        prev_examples = [
            ex
            for ex in prev_examples
            if ex.get("options")
            and all(opt["id"] in variant_names for opt in ex["options"])
        ]
        update(prev_examples, show_table=False)

    # Step 5: Cycle through inputs indefinitely (tournaments need repetition
    # for statistical confidence). For each record, use tournament.top_k(2) to
    # pick the two most informative variants to compare next.
    inputs_cycle = itertools.cycle(records)

    def make_stream(cycle):
        for record in cycle:
            variants = record["variants"]
            # Pick the two candidates the tournament most wants to compare next
            top2 = tournament.top_k(2)

            # Build options — each option needs an "id" (the variant name) and
            # "text" (that variant's pre-generated text from the record)
            options = [{"id": name, "text": variants[name]} for name in top2]

            # Randomize display order unless --no-random is set
            if not no_random:
                random.shuffle(options)

            # Build the example dict with "text" (context) and "options",
            # then yield with set_hashes()
            eg = {"text": record.get("text", ""), "options": options}
            if not nometa:
                # e.g. show which variants are competing in this match
                eg["meta"] = {"match": " vs ".join(opt["id"] for opt in options)}
            yield set_hashes(eg, input_keys=("text",))

    # Step 6 (optional): on_exit callback to print final rankings.
    # Count wins per variant from the dataset examples, display with msg.table().
    def on_exit(ctrl):
        examples = ctrl.db.get_dataset_examples(dataset)
        wins: Dict[str, int] = {}
        for ex in examples:
            if ex.get("answer") == "accept" and ex.get("accept"):
                winner = ex["accept"][0]
                wins[winner] = wins.get(winner, 0) + 1
        rows = [(name, wins.get(name, 0)) for name in variant_names]
        msg.table(rows, header=("Variant", "Wins"), divider=True)

    # Step 7: Return the controller components dict — same shape as
    # ab.llm.tournament. Key settings: batch_size=1, choice_auto_accept=True.
    return {
        "dataset": dataset,
        "view_id": "choice",
        "stream": Stream.from_iterable(make_stream(inputs_cycle)),
        "update": update,
        "on_exit": on_exit,
        "validate_answer": ensure_option_selected_when_accept,
        "config": {
            "batch_size": 1,
            "choice_auto_accept": True,
            "exclude_by": "input",
        },
    }
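The win counting in Step 6 is plain Python, so here's a standalone sketch of that part (function name and example data are made up; no Prodigy imports needed):

```python
from typing import Dict, List


def count_wins(examples: List[Dict]) -> Dict[str, int]:
    """Tally how often each variant was picked in accepted choice answers."""
    wins: Dict[str, int] = {}
    for ex in examples:
        if ex.get("answer") == "accept" and ex.get("accept"):
            winner = ex["accept"][0]
            wins[winner] = wins.get(winner, 0) + 1
    return wins


# Toy annotations in the shape the choice interface saves: the selected
# option id ends up in the "accept" list
examples = [
    {"answer": "accept", "accept": ["gpt4"]},
    {"answer": "accept", "accept": ["claude"]},
    {"answer": "accept", "accept": ["gpt4"]},
    {"answer": "reject", "accept": []},
]
print(count_wins(examples))  # {'gpt4': 2, 'claude': 1}
```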

Btw, ab.llm.tournament is the newer and more flexible version; the original ab.openai.tournament can only be used with the OpenAI backend.