Compare view problem with progress

Hi,

When trying to use the compare view, I see that the progress never updates.
I tried to take ner.eval-ab and just add a constant progress, but I can’t see it reflected in the UI.
Can you please check this out?

Thanks,
Beka

This is the (very slightly modified, just adding a constant progress) version of ner.eval-ab I tried:

@recipe(
    "ner.eval-ab",
    dataset=recipe_args["dataset"],
    before_model=recipe_args["spacy_model"],
    after_model=recipe_args["spacy_model"],
    source=recipe_args["source"],
    api=recipe_args["api"],
    loader=recipe_args["loader"],
    label=recipe_args["label_set"],
    exclude=recipe_args["exclude"],
    unsegmented=recipe_args["unsegmented"],
)
def ab_evaluate(
    dataset,
    before_model,
    after_model,
    source=None,
    api=None,
    loader=None,
    label=None,
    exclude=None,
    unsegmented=False,
):
    """
    Evaluate an NER model and build an evaluation set from a stream.
    """
    print("RECIPE: Starting CUSTOM recipe ner.eval-ab", locals())

    def get_task(i, text, ents, name):
        spans = [{"start": s, "end": e, "label": L} for s, e, L in ents]
        task = {
            "id": i,
            "input": {"text": text},
            "output": {"text": text, "spans": spans},
        }
        task[INPUT_HASH_ATTR] = murmurhash.hash(name + str(i))
        task[TASK_HASH_ATTR] = murmurhash.hash(name + str(i))
        return task

    def get_tasks(model, stream, name):
        tuples = ((eg["text"], eg) for eg in stream)
        for i, (doc, eg) in enumerate(model.nlp.pipe(tuples, as_tuples=True)):
            ents = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
            if model.labels:
                ents = [seL for seL in ents if seL[2] in model.labels]
            task = get_task(i, eg["text"], ents, name)
            yield task

    before_model = EntityRecognizer(spacy.load(before_model), label=label)
    after_model = EntityRecognizer(spacy.load(after_model), label=label)
    stream = list(
        get_stream(
            source, api=api, loader=loader, rehash=True, dedup=True, input_key="text"
        )
    )
    if not unsegmented:
        stream = list(split_sentences(before_model.nlp, stream))
    before_stream = list(get_tasks(before_model, stream, "before"))
    after_stream = list(get_tasks(after_model, stream, "after"))
    stream = list(get_compare_questions(before_stream, after_stream, True))
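    # Constant progress for testing: with this in use, the UI should always show 50%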
    progress = lambda session, total: 0.5

    return {
        "view_id": "compare",
        "dataset": dataset,
        "stream": stream,
        "on_exit": printers.get_compare_printer("Before", "After"),
        "exclude": exclude,
        "progress": progress,
    }

So just to make sure I understand this correctly: When you just run the recipe normally (without a custom progress), the progress just stays at 0?

When observing the progress, one thing to keep in mind is that it's calculated on the server and updated whenever new answers are sent back. This is so that it can be based on things like the loss reported as the model is updated. It will always take at least one batch of answers for the progress to update. If you want the updates to be sent more quickly, you can set a lower batch size.
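For example, a minimal sketch of that, assuming everything else in the recipe above stays the same (the batch size of 5 is just an arbitrary example value), would be to return a "config" entry alongside the other components:

    return {
        "view_id": "compare",
        "dataset": dataset,
        "stream": stream,
        "on_exit": printers.get_compare_printer("Before", "After"),
        "exclude": exclude,
        "progress": progress,
        # Smaller batches mean answers are sent back (and progress is
        # recalculated) more often; 5 is just an example value.
        "config": {"batch_size": 5},
    }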

Ah, I think there's an interesting edge case here: If a stream exposes a __len__ attribute (e.g. if it's a list that has a length, as opposed to a generator), Prodigy will use that to calculate the progress based on the stream length vs. the number of annotations. Otherwise, it falls back to the progress function. This is probably not ideal, because in your case, it looks like it's using the stream length and not the custom function.

A workaround for now could be to explicitly return a generator instead of a list, e.g. by adding stream = (eg for eg in stream).
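Applied to the recipe above, that could look roughly like this (only the lines around the return statement change):

    # Build the compare questions as before...
    stream = list(get_compare_questions(before_stream, after_stream, True))
    # ...but hand Prodigy a plain generator: it has no __len__, so Prodigy
    # falls back to the custom progress function instead of the stream length.
    stream = (eg for eg in stream)
    progress = lambda session, total: 0.5

    return {
        "view_id": "compare",
        "dataset": dataset,
        "stream": stream,
        "on_exit": printers.get_compare_printer("Before", "After"),
        "exclude": exclude,
        "progress": progress,
    }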

Thanks a lot Ines! I didn't realize the progress is only updated when hitting save, so I thought it never updates :man_facepalming:
Regarding __len__ vs. progress, I can just rely on __len__ for now, but your trick is useful nonetheless.
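For reference, relying on __len__ just means keeping the stream as a list and dropping the custom progress entry (a sketch based on the recipe above):

    stream = list(get_compare_questions(before_stream, after_stream, True))

    return {
        "view_id": "compare",
        "dataset": dataset,
        "stream": stream,  # a list with __len__, so progress = annotations / len(stream)
        "on_exit": printers.get_compare_printer("Before", "After"),
        "exclude": exclude,
        # no "progress" key needed in this case
    }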
