Hi, I am new to the tool. Where are the annotation files stored by default, and how do I access them?
Hi @Hyanam ,
Annotations are stored by default in the SQLite database prodigy.db that is created in your Prodigy Home directory. You can inspect the path to the default Prodigy Home by running python -m prodigy stats.
To export the annotations to .jsonl format, you can use the prodigy db-out command. To export the annotations to spaCy's DocBin format, you can use the data-to-spacy command. Finally, you can pretty-print the annotations in the terminal for a quick look with the print-dataset command.
Thank you for the quick response.
I am getting an error when accessing the dataset from SQLite. Below are the dataset and the SQL command. Can you give me a step-by-step process to access and download the file, and is there an easy UI to help with this?
How do we download annotations for BERT or other LLMs?
Hi Hyanam - I am learning Prodigy as well, and can partly help you.
To download annotation files, in the terminal the syntax is:
python -m prodigy db-out dataset_name > path/your_jsonl_file_with_annotations.jsonl
see: Built-in Recipes · Prodigy · An annotation tool for AI, Machine Learning & NLP
Good luck - it's a big learning curve, but when you get your first model predictions out, it's a huge high!
Alphie
Hi @Hyanam ,
You don't need to use any SQL to get the data out in the .jsonl format. Please see my previous response for the commands to use, together with links to the documentation.
In your case it would be:
python -m prodigy db-out news_headlines > ./news_headlines.jsonl
These commands are terminal-based; they just export the dataset, so there's no UI for this process.
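Once the file is exported, a quick sanity check in Python can confirm what it contains. The following is only a minimal sketch using the standard library: it assumes the export was saved as news_headlines.jsonl in the current directory and that each record carries an "answer" key (typically "accept", "reject" or "ignore"). The helper names load_jsonl and count_answers are made up for illustration and are not part of Prodigy.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def count_answers(records):
    """Count records per value of the "answer" key; "missing" if the key is absent."""
    return Counter(rec.get("answer", "missing") for rec in records)


if __name__ == "__main__":
    # Assumed file name: the output of the db-out command shown above.
    path = Path("news_headlines.jsonl")
    if path.exists():
        examples = load_jsonl(path)
        print(len(examples), "examples")
        print(count_answers(examples))
    else:
        print("Export file not found:", path)
```

If the counts look off, print-dataset in the terminal is still the quickest way to eyeball individual examples.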
How do we download annotations for BERT or other LLMs?
I'm not sure what you mean here. The download or export process is the same for all kinds of datasets, whether they were annotated manually or in bulk with a model. You can find more information on annotating with models here.
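If the goal is to feed the exported annotations into another training framework, the .jsonl export can be post-processed with a few lines of Python. This is only a sketch under assumptions: it assumes a text classification style dataset where each record has "text", "label" and "answer" keys, which will not hold for every recipe (span or multiple-choice tasks store their annotations differently). The function name accepted_text_label_pairs is hypothetical.

```python
import json


def accepted_text_label_pairs(jsonl_lines):
    """Yield (text, label) for each record whose "answer" is "accept".

    Records that lack a "text" or "label" key are skipped.
    """
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec.get("answer") == "accept" and "text" in rec and "label" in rec:
            yield rec["text"], rec["label"]


if __name__ == "__main__":
    # Made-up sample lines standing in for an exported file.
    sample = [
        '{"text": "Stocks rally after earnings", "label": "BUSINESS", "answer": "accept"}',
        '{"text": "Local team wins final", "label": "BUSINESS", "answer": "reject"}',
    ]
    for text, label in accepted_text_label_pairs(sample):
        print(label, "|", text)
```

With a real export, the open file object can be passed in place of the sample list, since the function only iterates over lines.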