Hi,
My team is working on annotating both NER and RE with the rel.manual recipe. For this, I am using the following config (prodigy.json):
{
  "feed_overlap": true,
  "custom_theme": {
    "cardMaxWidth": 1500,
    "smallText": 16,
    "relationHeightWrap": 40
  }
}
I'm specifying the sessions with PRODIGY_ALLOWED_SESSIONS=jane,joe,sarah,ale. That is for 3 annotators and me (ale). My session is just for testing.
We started with one database, let's call it test_v1, and an input jsonl file, let's say data.jsonl, which contains all texts we want to annotate. Let's say that ~300 out of 1000 got annotated for the db test_v1.
After a while we modified our annotation rules, so I decided to create a second version of the database (test_v2) in a new Prodigy instance. For the input texts this time, I pulled a subset of texts from data.jsonl to create data_v2.jsonl. This subset may have some sentences overlapping with the 300 that were previously annotated (I selected texts from line 300 onwards of data.jsonl).
When the annotators started seeing repeated sentences, I thought they were the ones overlapping between data_v2.jsonl and the 300 they had annotated in test_v1. However, after close examination I see this is not the case. I exported test_v2 using pgy db-out. The repeated sentences reported by one annotator had 4 annotations, which is worrying given there are only 3 annotators and my session is not being used. When looking at the annotator_id and session_id, something weird shows up:
3 annotations look like this (correct annotator and session ids):
"_annotator_id":"test_v2-jane","_session_id":"test_v2-jane"
"_annotator_id":"test_v2-joe","_session_id":"test_v2-joe"
"_annotator_id":"test_v2-sarah","_session_id":"test_v2-sarah"
But a fourth annotation has test_v1 (the incorrect database) in the annotator and session ids:
"_annotator_id":"test_v1-jane","_session_id":"test_v1-jane"
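In case it helps to reproduce the check, here is a minimal sketch of how such records could be found in the exported JSONL. It assumes each exported line is a JSON object carrying the _session_id key shown above plus Prodigy's _input_hash; the sample lines and hash values are made up for illustration, and the function names are my own, not part of Prodigy.

```python
import json
from collections import defaultdict

# Made-up sample mimicking a db-out export of test_v2.
# The _input_hash values are placeholders, not real hashes.
SAMPLE_EXPORT = """\
{"_input_hash": 111, "_annotator_id": "test_v2-jane", "_session_id": "test_v2-jane"}
{"_input_hash": 111, "_annotator_id": "test_v2-joe", "_session_id": "test_v2-joe"}
{"_input_hash": 111, "_annotator_id": "test_v2-sarah", "_session_id": "test_v2-sarah"}
{"_input_hash": 111, "_annotator_id": "test_v1-jane", "_session_id": "test_v1-jane"}
"""


def load_records(lines):
    """Parse non-empty JSONL lines into dicts."""
    return [json.loads(line) for line in lines if line.strip()]


def find_foreign_sessions(records, dataset):
    """Return records whose _session_id does not start with '<dataset>-'."""
    prefix = dataset + "-"
    return [
        r for r in records
        if not str(r.get("_session_id", "")).startswith(prefix)
    ]


def sessions_per_input(records):
    """Group session ids by _input_hash to see how often each text was annotated."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r.get("_input_hash")].append(r.get("_session_id"))
    return dict(grouped)


if __name__ == "__main__":
    records = load_records(SAMPLE_EXPORT.splitlines())
    for r in find_foreign_sessions(records, "test_v2"):
        print("unexpected session:", r["_session_id"])
    for input_hash, sessions in sessions_per_input(records).items():
        print(input_hash, len(sessions), sessions)
```

To run it against a real export, the sample string would be replaced by the lines of the exported file, for example by passing an open file handle to load_records.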
Why could this be happening?
Update: Only one of the three annotators is experiencing the issue. One thing to note is that this annotator (jane) was working on test_v1 in a browser session when I asked her to save her progress, as I was going to restart the server to set up a new instance for test_v2. Once the new instance was running, she may have kept annotating in the same browser window she had been using for test_v1.
Thanks