Review recipe - how to review?

I have annotations completed by two annotators, and I'm trying to review them. I tried the following command, but was unable to see the annotations in the interface:

screen -r python3 -m prodigy review Med_EM_Problem_detection_dataset_v23.jsonl Med_test_schema_v0.4

Error:

The running recipe is configured for multiple annotators using named
sessions with feed_overlap=True, or via a task router setting, but a client is
requesting questions using the default session. For this recipe, open the app
with ?session=name added to the URL or set feed_overlap to False in your
configuration.

I received the error above. I understand feed_overlap is set because we want annotators to double-annotate. How do I maintain that ability and still review?
What should the URL look like? ?session=review or something else? There is nothing to see if I don't have a session ID set.

Hi @tlingren,

The typical workflow for the review recipe is to have just one person do the revision: a single, final adjudicator of the annotation conflicts. This is the recommended way to ensure that conflicts are resolved consistently. Would you like different reviewers to review the same examples or different ones?

In any case, with feed_overlap set to true, Prodigy expects to be accessed with a so-called "named session". The name in the ?session=name pattern should be the name (or an identifier) of the reviewer, e.g. sam, so the full URL could be http://localhost:8080/?session=sam. This name or identifier will be used to uniquely identify the annotations done by this reviewer (it will be stored in the _session_id and _annotator_id attributes of the task, and a copy of all annotations will also be stored in a separate dataset with this identifier in the name). If you don't provide it and feed_overlap is set to true, Prodigy will error out.
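
For example, a minimal sketch (the dataset names here are just placeholders, sam is a hypothetical reviewer name, and the port depends on your configuration):

python3 -m prodigy review reviewed_dataset annotated_dataset
# then open the app with a named session, e.g. http://localhost:8080/?session=sam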

Thanks for the reply.
To clarify: I can have one of the annotators set as the 'adjudicator' and turn feed_overlap to true, and when I run the review recipe it will display all annotators' output overlaid on the same text?

Is feed_overlap a command-line parameter, or do I need to edit some Python code? And which one?

I have created a prodigy.json in the main home directory and added {feed_overlap=True}
That eliminates the error message; however, I'm getting three boxes of annotations (one presumably for the master, and one for each of the 2 annotators).
Could this be because we are using both spans and rel in the annotation? I can make rel decisions here, but I only see the output of spans when looking at the individual annotators' output.


Hi @tlingren,

Yes, you can have one adjudicator with feed_overlap set to true; you just need to access the Prodigy server with the "named session" pattern we talked about before.
feed_overlap can be set in the global Prodigy config file prodigy.json (by default located in your home_directory/.prodigy) or in a local Prodigy config file located in your current working directory (just like you did). For completeness, you can also set it via the CLI or environment variables. The settings from all these sources are merged, so it's just a matter of convenience where you set it.
Btw. prodigy.json follows JSON format (double-quoted keys and lowercase true/false), so for the setting to take effect it should be:

{"feed_overlap": true}

Spelling errors won't raise an error; they will be silently ignored instead.
(You can find more details on what else can be set via the config here.)
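
For completeness, here's a quick sketch of the environment-variable route as well; if I remember the variable name correctly, Prodigy reads JSON overrides from PRODIGY_CONFIG_OVERRIDES (dataset names below are placeholders):

PRODIGY_CONFIG_OVERRIDES='{"feed_overlap": true}' python3 -m prodigy review reviewed_dataset annotated_dataset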

Now, as for the review interface: you're right, it will have the "adjudicated" or "master" view and then it will show the alternative annotations from the different annotators. The adjudicator can go with the majority vote (which is what shows up as the default), select a different alternative (by clicking on it), or manually edit the "adjudicated" view. So the number of boxes will depend on how many alternatives there are (not on a particular "view id"). You can also automatically accept the examples that all annotators agreed on with the -A flag.
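
For example (a sketch with placeholder dataset names; -A is the short form of the auto-accept option, if I recall correctly):

python3 -m prodigy review reviewed_dataset annotated_dataset -A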

Do you want to adjudicate both relation annotations and span annotations? If so, you should also add "relations_span_labels": [] to your prodigy.json, where you would list the target span labels inside the square brackets (this is just a workaround; we are planning to support a --span-labels argument in review but we currently don't, which is why it's not documented).
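
For example, your prodigy.json could then look roughly like this (the label names are placeholders; use the span labels from your own schema):

{
  "feed_overlap": true,
  "relations_span_labels": ["PROBLEM", "MEDICATION"]
}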

Or is the problem that you're not seeing the relation annotations in the "annotator boxes"? :thinking: It might be that this particular example did not receive any relation annotation. Double-checking the source jsonl file would be a good first step.

In this annotation we are using --labels and --span_labels. The --labels are relationships; the --span_labels are entities.
The issue was that, above, it said "no_label", but there was clearly at least one annotation (for span_labels) in the transcript shown (second image).

If we have to do 2 separate reviews (one for relations and one for spans), that's OK. But I did the above: relationship labels show up, but only the spans show up in the annotation.

And the wrap button does not work with these multiple windows. It only seems to work for the top box. It looks like I can check it in the other boxes, but nothing happens and the screen refreshes. And if the top box is wrapped, it will unwrap.
If I click on an annotated text, the screen refreshes and it jumps to the front of the text.
It's a frustrating UI to try to work through.

@magdaaniol I completed the review session and saved each snippet. But when I exported the db, the annotations in the jsonl file were the same.

python3 -m prodigy db-out Med_test_schema_v0.4 > ./med_test_v04_consensus

Is there a different way to export a review session? There wasn't anything in the documentation about a review export.

Hi @tlingren,

Sorry to hear your experience with the review was not exactly smooth. I admit that there are a few rough edges to polish when reviewing with the relations UI, and the fact that the wrap checkbox doesn't work is one of them. It's definitely on our TODO list to fix.

In the meantime, just to summarize the workarounds to the issues you've reported:

  1. The lack of span labels in the review UI
    You can add them via the Prodigy config by specifying:
"relations_span_labels": ["PER","ORG","LOC"] # example list of labels to add

So there's no need to do the revision of spans and relations in two passes.

  2. Dysfunctional wrap checkbox in review

Again, this setting can be effectively added via prodigy config by specifying

"wrap_relations": true

These are all settings of the relations UI documented here. The review API does not handle them all yet, but they can definitely be added via the config file.
Again, we are aware of these shortcomings and will fix them as soon as possible.
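
Putting both workarounds together, your prodigy.json could look roughly like this (the span labels here are just example values; keep the ones from your own schema):

{
  "feed_overlap": true,
  "relations_span_labels": ["PER", "ORG", "LOC"],
  "wrap_relations": true
}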

As to your last question about the export: there's nothing specific about exporting the review dataset.
Is Med_test_schema_v0.4 the name of the dataset that you used as the dataset parameter to the review recipe?
In other words, when calling the review recipe, you used a command that looked something like:

python3 -m prodigy review my_reviewed_dataset annotated_dataset --label X,Y 

What you want to export with db-out is my_reviewed_dataset. Each example will contain copies of the particular versions under the versions key, but the top-level annotation is the adjudicated one. You can also check that the _session_id corresponds to the adjudicator's session ID (either a timestamp or a name, if a named review session was used).
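
So, sticking with the placeholder names from the example command above, the export would look like:

python3 -m prodigy db-out my_reviewed_dataset > my_reviewed_dataset.jsonl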

Thank you for the detailed response.
WRT the db-out, there was no difference between the original and the reviewed output. So perhaps my review command was incorrect.
What I did was:
my_reviewed_dataset = original source file
annotated_dataset = DATASET (as listed in the file below)
[screenshot]

I specified the labels in the .prodigy/prodigy.json config and not on the command line.

Right, just to be clear: my_reviewed_dataset (in the example command I provided) should be a fresh, empty dataset in which to store your revisions. And that's the one you should db-out. If you used an existing dataset as the target of the review, your revisions would be stored together with what is already in there.
I think the most efficient way here would probably be for you to check the review command you used against the documented review API here, where all parameters are defined. That hopefully clears up any doubts about the meaning of each dataset name.
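
As a rough end-to-end sketch (med_review_consensus is just a hypothetical name for a fresh output dataset; Med_test_schema_v0.4 is the annotated dataset from your earlier command):

python3 -m prodigy review med_review_consensus Med_test_schema_v0.4
python3 -m prodigy db-out med_review_consensus > med_review_consensus.jsonl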