I purchased more than 5 Prodigy licenses to work on an NER project. I don't know how to work collaboratively using these licensed Prodigy accounts, and I just want to understand the process. I have 5 people working with me. What should we do, and how should we carry out the task? Can anyone please explain the process and help me complete the project?
Thanks for your question.
First, it's important to determine the "what" you'll be labeling:
Here's more of the "how" to implement your project.
First, if your annotators are on the same network (e.g., server), check out our documentation on named multi-user sessions. You can start a Prodigy session as you would normally:
python -m prodigy ner.manual ner_news_headlines blank:en ./news_headlines.jsonl --label PERSON,ORG,PRODUCT,LOCATION
✨ Starting the web server at http://localhost:8080 ...
Open the app in your browser and start annotating!
To create a custom named session, add ?session=xxx to the annotation app URL. For example, annotator Alex may access a running Prodigy project via http://localhost:8080/?session=alex. Internally, this will request and send back annotations with a session identifier consisting of the current dataset name and the session ID – for example, ner_person-alex. Every time annotator Alex labels examples for this dataset, their annotations will be associated with this session identifier.
The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. If true, each example in the dataset will be sent out once for each session, so you’ll end up with overlapping annotations (e.g. one per example per annotator). Setting "feed_overlap" to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
As of v1.8.0, the PRODIGY_ALLOWED_SESSIONS environment variable lets you define comma-separated string names of sessions that are allowed to be set via the app. For instance, PRODIGY_ALLOWED_SESSIONS=alex,jo would only allow ?session=alex and ?session=jo, and other parameters would raise an error.
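As a rough illustration of the named-session setup described above, here is a small Python sketch that builds one URL per annotator and the matching PRODIGY_ALLOWED_SESSIONS value. The annotator names are hypothetical, and the base URL is assumed to be the default address from the startup message; adjust both to your own setup.

```python
from urllib.parse import urlencode

# Assumption: the server runs at the default address from the startup message.
BASE_URL = "http://localhost:8080/"
# Hypothetical annotator names, used only for illustration.
ANNOTATORS = ["alex", "jo"]


def session_url(base_url: str, name: str) -> str:
    """Return the app URL with a ?session=<name> query parameter appended."""
    return f"{base_url}?{urlencode({'session': name})}"


# One URL per annotator, plus the value for PRODIGY_ALLOWED_SESSIONS.
urls = {name: session_url(BASE_URL, name) for name in ANNOTATORS}
allowed_sessions = ",".join(ANNOTATORS)

for name, url in urls.items():
    print(f"{name}: {url}")
print(f"PRODIGY_ALLOWED_SESSIONS={allowed_sessions}")
```

Each annotator would then open only their own URL, so their annotations are stored under their own session identifier.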
An alternative approach is where you have one session with a unique port and dataset for each annotator:
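A minimal sketch of that per-annotator setup, assuming the PRODIGY_PORT environment variable is used to pick the port: the Python snippet below only generates the command strings, it does not start any servers. The annotator names and the dataset naming scheme (ner_news_headlines_<name>) are hypothetical.

```python
import shlex


def annotator_commands(annotators, base_port=8080):
    """Build one shell command per annotator, each with its own port and dataset."""
    commands = {}
    for offset, name in enumerate(annotators):
        port = base_port + offset  # consecutive ports: 8080, 8081, ...
        recipe = [
            "prodigy",
            "ner.manual",
            f"ner_news_headlines_{name}",  # hypothetical per-annotator dataset name
            "blank:en",
            "./news_headlines.jsonl",
            "--label",
            "PERSON,ORG,PRODUCT,LOCATION",
        ]
        # Prefix the command with the port setting for this annotator.
        commands[name] = f"PRODIGY_PORT={port} {shlex.join(recipe)}"
    return commands


# Hypothetical annotator names, used only for illustration.
for name, command in annotator_commands(["alex", "jo"]).items():
    print(f"{name}: {command}")
```

Each command would run in its own terminal, and each annotator opens the port assigned to them.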
This is harder to manage manually when you go beyond a few annotators. Some community members have found tools like tmux helpful:
Second, if you're running this on a different machine than the one your annotators are on, you'll need to change the Prodigy host from localhost to "0.0.0.0".
To do this, either modify the configuration in the prodigy.json file or set the PRODIGY_HOST environment variable.
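For the prodigy.json route, a small Python helper like the one below could merge the host setting into an existing config file without dropping other settings. This is only a sketch: the function name is made up for illustration, and the path to your prodigy.json is an assumption you would need to check against your own installation.

```python
import json
from pathlib import Path


def update_prodigy_config(config_path: Path, overrides: dict) -> dict:
    """Merge overrides into a JSON config file, keeping any existing settings."""
    # Start from the current file contents if the file already exists.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.update(overrides)
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Example call (assumes a prodigy.json in the current working directory):
# update_prodigy_config(Path("prodigy.json"), {"host": "0.0.0.0"})
```

Setting PRODIGY_HOST=0.0.0.0 in the environment before starting Prodigy is the other option mentioned above and avoids editing the file at all.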
As mentioned, for production setups, you may want to consider reverse proxies. Also, be aware that you may have to handle firewalls or other issues depending on your server setup.
Developing comprehensive guidelines for working collaboratively is hard, because a lot of design decisions depend on the unique network setup. There are many server posts and cloud posts, like aws or google-cloud, that may help depending on your setup.
Hope this helps!
Let's take an example here:
I want to annotate NER. I have 5 people lined up.
I use this command:
prodigy ner.manual resume blank:en /home/random/Documents/anno/outputfiles/randompersonresume.jsonl --label PERSON,LOCATION,UNIVERSITY,ROLE,ORGANIZATION,DATE,DOMAIN,PROJECT,EXPERIENCE,HARDSKILLS,SOFTSKILLS,DEGREE
to annotate. On the first day, they complete their annotations. On the second day, do I start the same way again, or is there another way?
I know these may be silly questions, but I need to solve this issue ASAP. Thank you, much appreciated.
Unfortunately, the answer is "it depends" on your use case and Ines' post is the best advice for how to start your project.
A lot of the decisions you need to make are based on what your use case is. Matt discussed this as the base of the "ML Hierarchy of Needs":
I suspect you're in a rush, so you may not have time to step back and focus on these problems.
If you need something concrete, the closest thing to step-by-step instructions I can provide is our NER flowchart:
Hope this helps!
Thank you so much @ryanwesslen. I will surely watch the video you provided. And thanks again for your constant help and guidance.