Check out this:
Right now it separates by metadata (document), but you could modify it to use _session_id. I really like the idea from the post of creating a Streamlit app like this. (FYI, this uses a different approach for tracking annotators: each annotator gets their own port and Prodigy dataset.)
Also, you may want to check out the progress recipe:
prodigy progress news_headlines_person,news_headlines_org
================================== Legend ==================================

New      New annotations collected in interval
Total    Total annotations collected
Unique   Unique examples (not counting multiple annotations of same example)

================================= Progress =================================

               New   Unique   Total   Unique
-----------   ----   ------   -----   ------
10 Jul 2021   1123      733    1123      733
12 Jul 2021    200      200    1323      933
13 Jul 2021    831      711    2154     1644
14 Jul 2021    157      150    2311     1790
15 Jul 2021   1464     1401    3775     3191
It doesn't break down by "_session_id"; however, you can view the underlying recipe and modify it (e.g., replace time with "_session_id" when it prints out the table, or something like it). To find the location of the recipe, run python -m prodigy stats and find the Location: path. Open that folder, then look for /recipes/commands.py. You can then use that as a custom recipe and run it with -F new_recipe.py. If you get either to work, please post it back so other members of the community can use it!
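As a starting point for that kind of modification, here is a minimal sketch that counts annotations per "_session_id" instead of per day. It is not the code of the built-in progress recipe. The function names are hypothetical, and counting "unique" by "_task_hash" is an assumption about what you want to treat as the same example.

```python
from collections import defaultdict


def progress_by_session(examples):
    """Count total and unique annotations per `_session_id`."""
    totals = defaultdict(int)
    seen = defaultdict(set)
    for eg in examples:
        # Assumption: examples saved without a session carry no `_session_id`.
        session = eg.get("_session_id") or "unknown"
        totals[session] += 1
        # Assumption: `_task_hash` identifies "the same example".
        seen[session].add(eg.get("_task_hash"))
    return {s: {"total": totals[s], "unique": len(seen[s])} for s in totals}


def print_progress(dataset_name):
    """Print a per-session table for one Prodigy dataset."""
    # Imported lazily so the counting helper also works on plain dicts.
    from prodigy.components.db import connect

    db = connect()
    stats = progress_by_session(db.get_dataset(dataset_name) or [])
    print(f"{'Session':<30} {'Total':>6} {'Unique':>6}")
    for session, row in sorted(stats.items()):
        print(f"{session:<30} {row['total']:>6} {row['unique']:>6}")
```

Calling print_progress("news_headlines_person") would then print one row per session rather than one row per date.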
Did you annotate more than 10 examples and/or make sure to click "Save"? By default, the first 10 examples (batch_size is 10 by default) are not saved to the database until you either click save or get through those first 10, at which point Prodigy saves them automatically and retrieves a new batch.
Are you using named multi-user sessions? You likely want to, as there's no other way to identify your data by annotator. Also, be aware of the feed_overlap setting (that is, whether you want overlapping or non-overlapping annotations).
No. If you saved the annotations to the database, then it's not "deleting" anything. I suspect the problem was that you made fewer than 10 annotations but didn't save them to the database by clicking save.
Yes, use get_dataset and filter by "_session_id".
from prodigy.components.db import connect
db = connect()
examples = db.get_dataset("my_dataset")
# filter examples by "_session_id"
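To make the filtering step concrete, here is a rough sketch that groups a dataset's examples by annotator. The helper names are hypothetical, and it assumes every annotation you care about was made in a named session so that it carries a "_session_id" key.

```python
from collections import defaultdict


def group_by_session(examples):
    """Map each `_session_id` to the examples annotated in that session."""
    grouped = defaultdict(list)
    for eg in examples:
        # Examples without a session ID end up under the key None.
        grouped[eg.get("_session_id")].append(eg)
    return dict(grouped)


def load_by_session(dataset_name):
    """Load a Prodigy dataset and group its examples by session."""
    # Imported lazily so group_by_session also works on plain dicts.
    from prodigy.components.db import connect

    db = connect()
    return group_by_session(db.get_dataset(dataset_name) or [])
```

For example, load_by_session("my_dataset").get("my_dataset-alice", []) would return one annotator's examples. The session name here is made up; with named sessions the ID is usually the dataset name followed by the session name, so check the keys of the returned dict to see what your data actually contains.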
This is similar to the earlier post.
Since you're annotating on your phone, you may also like @koaning's Prodigy Short on tips on running Prodigy on a mobile device: