By default, this is expected: Prodigy sends out a single batch to one person and waits for it to come back. So if there's only one batch of examples, it goes out to the first user, and when the next user accesses the link, there's no second batch to send out – Prodigy will show that no tasks are available anymore.
There are several ways to change this, depending on your use case:
- Implement an infinite stream as described here. If examples aren't in the database yet, they'll get sent out again until they're annotated. This works well if you only have one annotator and want to make sure no batch is lost when they refresh (without having to restart the server). With multiple annotators and little data (only a few batches), this approach works less well, because you might get overlaps: while person A is still annotating a batch, person B might receive the same batch, because person A's answers aren't in the database yet.
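An infinite stream along those lines could look like this. This is a minimal sketch: `load_examples` is a placeholder for your own loader, and the membership check stands in for a lookup against Prodigy's database (in a real recipe you'd compare Prodigy's task hashes, e.g. via `set_hashes`):

```python
# Minimal sketch of an infinite stream: keep re-sending examples
# until they show up in the dataset. `load_examples` and the
# text-based check are placeholders for your own loader and a
# hash-based lookup against Prodigy's database.

def load_examples():
    # Placeholder for your real source (JSONL file, API, etc.)
    return [{"text": "example 1"}, {"text": "example 2"}]

def infinite_stream(annotated):
    while True:
        sent_any = False
        for eg in load_examples():
            if eg["text"] not in annotated:  # not annotated yet → send again
                sent_any = True
                yield eg
        if not sent_any:
            break  # everything is annotated – stop the stream
```

The overlap problem described above is visible here: an example only leaves the stream once it's in `annotated`, which happens *after* an annotator submits their batch.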
- Use the new named multi-user sessions by appending `?session=xxx` (annotator name or ID) to the URL. By default, all data will then be sent out once to each annotator, so everyone annotates the same questions and you can use something like the `review` interface later on to resolve conflicts and review the decisions.
Hmm, this is strange, because it really does look like there's something in the `prodigy.json` that makes `ujson` struggle to open it. Can you double-check the file encoding? And do you have any escaped strings in there that might confuse it?
The good news is that this isn't specific to Prodigy, so you should be able to test it with your own script that loads the JSON – if that works, Prodigy should be able to load it as well.
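A quick standalone check could look like this (a sketch: it falls back to the stdlib `json` module if `ujson` isn't installed, and the `prodigy.json` path is just an example):

```python
# Standalone sanity check: try to load the config with ujson directly,
# outside of Prodigy. If this fails too, the problem is in the file itself.
try:
    import ujson as json_lib  # the parser Prodigy uses
except ImportError:
    import json as json_lib   # stdlib fallback for the test

def load_config(path):
    with open(path, encoding="utf-8") as f:
        return json_lib.load(f)
```

If `load_config("prodigy.json")` raises here as well, inspect the file's encoding (e.g. a BOM or non-UTF-8 bytes) and any escaped strings in the values.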
Yes, you can modify the UI pretty freely via the `global_css` setting. I wrote up different approaches for this in my comment here.
Are you using the `choice` interface? I haven't tried this yet, but in that case, you could probably check `window.prodigy.content` in a custom script and see if `accept` is an empty list or includes a selected option. If it's empty, you disable the accept button; if it's not empty, you re-enable it.
Alternatively, if it's a single-choice interface (e.g. radio inputs with only one possible answer), you could also hide the accept button completely and set `"choice_auto_accept": True` in your recipe config. This will automatically submit the answer when the user selects an option.
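In a custom recipe, that setting would go into the returned components dict – a sketch, with the dataset name and stream as placeholders:

```python
# Sketch of the components dict a custom recipe would return.
# "my_dataset" and the stream are placeholders for your own setup.
components = {
    "dataset": "my_dataset",
    "view_id": "choice",
    "stream": iter([{"text": "example", "options": [{"id": "A", "text": "A"}]}]),
    "config": {"choice_auto_accept": True},  # auto-submit once an option is selected
}
```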
You can probably also do that with a custom script: add the countdown and, when it's up, call `window.prodigy.answer('ignore')` automatically. This will still add the answer to the dataset with `"answer": "ignore"`, though – so in your stream generator, you'd have to check what's in the dataset and send out not only examples that aren't in the dataset yet, but also examples that are in the dataset but were ignored.
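That dataset check could be sketched like this (assuming you've already loaded the existing annotations into a lookup; the text-based key is a stand-in for Prodigy's task hashes):

```python
# Sketch: send out examples that are either not in the dataset yet,
# or are in the dataset but were ignored (e.g. auto-skipped by the
# countdown). `existing` maps a task key – here, the text, standing
# in for a task hash – to its recorded answer.

def filter_stream(stream, existing):
    for eg in stream:
        answer = existing.get(eg["text"])
        if answer is None or answer == "ignore":
            yield eg
```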
You might also want to consider adding meta information to the task indicating that it was skipped automatically. For example, something like this:
window.prodigy.update({ skipped: true, skipped_after: 10 })
window.prodigy.answer('ignore')
This will add `"skipped"` and `"skipped_after"` to the task, so you'll always know where it came from, that it was auto-skipped and that it was skipped after 10 seconds.