Pattern IDs disappear from spans upon modification

[NER manual]
The "pattern" fields disappear on the answers' spans when an annotation is changed!
This must be a bug?! It doesn't just disappear from the modified span it disappears from all of them.
When I accept as-is I can see the pattern ids set for each span.
Further debugging shows patterns never make it out of the UI!
Can see it easily by looking at the /give_answers & /validate_answer request payloads

'spans': [{'start': 0, 'end': 16, 'token_start': 0, 'token_end': 1, 'label': 'KEYWORD'}]

vs

'spans': [{'start': 0, 'end': 16, 'token_start': 0, 'token_end': 1, 'label': 'KEYWORD', 'pattern':'-723433933'}]

This has the side effect of making PatternMatcher.update moot, as the documentation says it only updates if the pattern ID is set:

Update the pattern matcher from annotation and update its scores. Typically called as part of a recipe’s update callback and with answers received from the web app. Expects the examples to have an "answer" key ("accept", "reject" or "ignore") and will use all "spans" that have a "pattern" key, which is the ID of the pattern assigned by PatternMatcher.__call__.
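So for the update to do anything, each example would need to look roughly like this, going by that description (values here are just illustrative):

examples = [
    {
        'text': 'Some annotated text',
        'answer': 'accept',  # or 'reject' / 'ignore'
        'spans': [
            # only spans with a 'pattern' key (the ID assigned by PatternMatcher.__call__)
            # contribute to the score updates
            {'start': 0, 'end': 16, 'token_start': 0, 'token_end': 1,
             'label': 'KEYWORD', 'pattern': '-723433933'}
        ]
    }
]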

Help :sob:

version 1.10.2

Hi! The problem here is that the PatternMatcher and its update mechanism (which generates pattern-based scores) was really mostly designed for binary accept/reject workflows where you collect feedback on individual matches, and less for manual workflows where you edit the result.

For manual UIs, the matcher is mostly used to pre-set spans – but there's not really a good answer for how to handle and score the results if what comes back is significantly different. That's why the built-in manual recipes don't use the pattern scoring.

When editing and updating existing spans, Prodigy should probably preserve any arbitrary properties on them that were pre-defined – at least until you delete a span. I'll add that to the list of enhancements for the next version :+1:

(If you delete a span, it's reasonable for the meta to be deleted, too, even if you previously had a span over the same tokens with some meta. Preserving and re-adding partial meta data wouldn't make sense here and could easily lead to unexpected behaviour.)

I think it makes sense for the pattern and other meta properties to be deleted on new or modified spans, and to preserve them for the unmodified ones.
That way, the ones without properties can be processed on the backend if needed.
Right now I have no way of knowing (without implementing some complicated comparison logic) which spans are new or modified or deleted and which ones are accepted -> maybe accept/reject should be a per-span property? :thinking:

A quick solution in the meantime could be to just keep a different property on the task that stores information about the pre-defined spans and where they came from – either as a list of spans and their meta, or a dict keyed by "start|end|label" or something similar. For each span, you could then look up its meta and use that to decide how you want to score the result.
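For example, here's a rough sketch of that idea (the "span_meta" field name and the helper functions are just made up for illustration, not built-in Prodigy features):

def add_span_meta(stream):
    # Before sending tasks out: remember each pre-set span and where it came from,
    # keyed by "start|end|label" so the info survives even if the "pattern" field
    # doesn't make it back from the UI.
    for eg in stream:
        eg["span_meta"] = {
            f"{s['start']}|{s['end']}|{s['label']}": {"pattern": s.get("pattern")}
            for s in eg.get("spans", [])
        }
        yield eg

def get_span_meta(eg, span):
    # In the update callback: look up the original meta for an annotated span, if any.
    # Returns None for spans the annotator added or whose boundaries/label changed.
    return eg.get("span_meta", {}).get(f"{span['start']}|{span['end']}|{span['label']}")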

Accept/reject as a span property is what the binary annotations are translated to, and it's what inspired the binary active learning workflow: even without knowing the full correct parse, we can generate enough information to update the model towards the correct prediction, which is typically one of the top X parses.

Detecting a modified span is always going to be tricky because it depends on what "modified" means and how that should be scored. In the traditional NER evaluation metrics, partial matches are typically discarded and counted as false positives, because they're not considered desirable (and scoring them would get pretty messy). So it probably makes sense to use the same approach if you're comparing original spans predicted by the model (or added by some other logic) to final gold-standard annotations.

I think first we need to keep track of the states of the (entity) spans and return them (of course along with the pattern ids) from the frontend:

There are 3 states/actions: accept, reject, new.
(user deletes an entity span) span -> reject
(user creates an entity span) span -> new (check if it was previously rejected, because users might accidentally delete an entity and re-add it with the exact same span boundaries) (no pattern ID set, since the frontend doesn't know it)
(user doesn't modify the span) span -> accept
I think this should be easy to implement, but I don't feel like going JavaScript hacking... I hope this makes it in as a new feature soon :pray:
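Just to illustrate, the spans coming back from the UI could then look something like this (purely hypothetical, not what Prodigy currently sends):

'spans': [
    {'start': 0, 'end': 16, 'label': 'KEYWORD', 'pattern': '-723433933', 'state': 'accept'},
    {'start': 20, 'end': 25, 'label': 'KEYWORD', 'pattern': '-98211402', 'state': 'reject'},
    {'start': 30, 'end': 38, 'label': 'KEYWORD', 'state': 'new'}
]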

Then it is up to the recipe to decide how to update the scores based on the states of the spans (which shouldn't be that difficult if it is just a per-pattern_id counting/scoring method, as I understood it to be from your reply in the other thread).
Edit: On the server side, to make life easier, it would be nice to have something like PatternMatcher.has_pattern(pattern_id) or PatternMatcher.get_pattern(pattern_id) -> score.

Can I get an ETA and details on this feature request?

There are a few other things we want to implement, but we'll probably publish another patch release later this month. In the meantime, if you just want to be able to look up meta data for spans, you could add it as a separate property. If you key it as "{start}|{end}", it should be pretty easy to look up spans and their additional info.

Ah, that's an interesting idea and the 3 states would make it internally consistent. But this would be too much of a breaking change at this point, because a lot of workflows people have could rely on the presence or absence of "spans" (rather than a status toggled on the span). So this would break all workflows where the user expectation is that annotating produces the final gold-standard data.

But we could easily have a helper that merges the before/after spans according to that definition. Untested but something along these lines should work and preserve any data that was present on the original spans (assuming an example has "orig_spans" of the original state and "spans" of the produced annotations):

# Index both sets of spans by their exact (start, end, label) combination
orig_spans = {(s["start"], s["end"], s["label"]): s for s in eg["orig_spans"]}
annotated_spans = {(s["start"], s["end"], s["label"]): s for s in eg["spans"]}
accepted_spans = []  # unchanged spans, still carrying their original meta (e.g. "pattern")
new_spans = []       # spans added or modified by the annotator
deleted_spans = []   # original spans the annotator removed or modified
# Original spans that survived unchanged are accepted; the rest were deleted.
for key, span in orig_spans.items():
    if key in annotated_spans:
        accepted_spans.append(span)
    else:
        deleted_spans.append(span)
# Annotated spans with no exact counterpart in the original are new.
for key, span in annotated_spans.items():
    if key not in orig_spans:
        new_spans.append(span)
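Note that the lookup key is the exact (start, end, label) combination, so a span only counts as accepted if its boundaries and label are unchanged; a boundary or label edit shows up as one deleted span plus one new span, which matches the strict-match evaluation logic described above.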

Yes, exactly, and it would also allow people to re-use their own evaluation and scoring logic. Ultimately, comparing two different annotation states is really just an evaluation, and it's reasonable to use the same metrics and logic that are also used for the model.

Round-tripping the orig_spans to and from the UI in a metadata field works!
Then I do a two-step PatternMatcher.update: one for the rejected spans and one for the accepted ones,
and (optionally) add the new patterns.
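Roughly like this (untested simplification; matcher is the recipe's PatternMatcher, and merge_spans is a hypothetical helper wrapping the orig_spans/spans comparison from above):

def update(answers):
    for eg in answers:
        # Hypothetical helper wrapping the merge logic shown earlier
        accepted_spans, new_spans, deleted_spans = merge_spans(eg)
        # Accepted spans still carry their original "pattern" IDs, so the matcher can score them
        matcher.update([dict(eg, spans=accepted_spans, answer="accept")])
        # Deleted spans count as rejected matches for their patterns
        matcher.update([dict(eg, spans=deleted_spans, answer="reject")])
        # new_spans have no "pattern" ID, so the matcher ignores them; they could be
        # turned into new patterns separately if needed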
Thanks


Just released v1.10.14, which will preserve any custom meta data on existing "spans" in ner_manual (if they're not removed). For use cases like yours, reconciling the results in the recipe still makes sense so you can compare and evaluate the differences – but at least the UI is more consistent now :slightly_smiling_face:
