I am working on a sentence annotation task to identify whether a sentence is about climate change or not.
I am hosting Prodigy (behind nginx) on my home computer so that my master's student can also access the UI and annotate sentences. However, we are seeing some weird behaviour and I don't understand where it is coming from. Currently the UI says that we have annotated a total of 1940 sentences in the dataset climate_klimatonly_v1. This number is the same for both him and me.
However, when I run:
    from prodigy.components.db import connect
    db = connect()
    all_dataset_names = db.datasets
    examples = db.get_dataset("climate_klimatonly_v1")
    print(len(examples))
I get 1612 examples. Hence, some of our annotation progress has disappeared. I have looked through prodigy.db.
The strange behaviour is this:
- When I annotate, both the total in the UI and the number of examples in the DB increase.
- When my master's student annotates, we can both see the total in the UI increase, but the number of examples in the DB does not.
Note that we are using sessions so he annotates with his name and I use my own. I have double checked with him that he is saving the progress. I have also double checked that the dataset in the UI is
Any ideas what might be causing this? At first I thought it might be a network connectivity issue. Perhaps he loses the connection when trying to save?
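To narrow this down, one check I could run is counting the stored examples per session, to see whether any of his annotations reach the DB at all. This is only a rough sketch: it assumes each saved example carries a "_session_id" key (which I believe Prodigy adds when named sessions are used), and count_by_session and report are helper names I made up for illustration.

```python
from collections import Counter


def count_by_session(examples):
    """Count saved examples per value of the "_session_id" key.

    Assumption: examples are dicts and annotated ones carry "_session_id".
    Examples without that key are grouped under "(no session)".
    """
    return Counter(eg.get("_session_id", "(no session)") for eg in examples)


def report(dataset_name):
    """Print per-session counts for a dataset in the Prodigy database."""
    # Imported here so the counting helper above works without Prodigy installed.
    from prodigy.components.db import connect

    db = connect()
    # Same call as in my snippet above; fall back to an empty list if nothing comes back.
    examples = db.get_dataset(dataset_name) or []
    for session, n in count_by_session(examples).most_common():
        print(session, n)


# Usage (requires access to the same prodigy.db the server writes to):
# report("climate_klimatonly_v1")
```

If only my session shows up in the output, that would suggest his answers never get written to the DB, rather than being saved somewhere else.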