'Can't find db' error for db-out and db-merge for spancat

I keep getting a "can't find db" error when trying both db-out and db-merge, even though prodigy train and prodigy spans.correct work with the same dataset.

Example of training working:

[Screenshot: MicrosoftTeams-image (4)]

Error from the same dataset when trying db-out:

I even successfully imported a new dataset with db-in, and it still shows as not found when trying to merge or train (I know the screenshot has a typo in the dataset name, but I tried running again with the correct name and got the same error):

Hi @mettsj,

Thanks for your question and the details.

For db-out, it looks like you're running:

prodigy db-out mgmt-sample-large ./annotations.json

Can you try to run instead:

prodigy db-out mgmt-sample-large > ./annotations.jsonl

You'll need to add > to redirect the output to a file, and make sure to save it with a .jsonl extension. This is shown in the docs.

I'm not completely sure about the db-merge. It's a bit tough to read the images. In fact, can you post the code as markdown next time? Images aren't searchable, so markdown is easier to work with.

But from your image, it looks like you may have an unnecessary space, or be missing something, between mgmt-KSAs and mgmt_all:

prodigy db-merge mgmt_sample_large,mgmt-KSAs mgmt_all

Can you confirm those are the right names and add a comma between those datasets?

If not, can you run prodigy stats -l and share the output? This prints the Prodigy version, and the -l flag lists all of the datasets you have. It's a way to confirm that the datasets you expect are actually in your database.

Hi @ryanwesslen,

Thanks for the quick response and sorry about the screenshots. My coworker is the one having issues on his end, so I was going off what he had sent me. I've asked him to run all of this in a markdown file, and I can send it if needed once I have it from him.

For the db-merge, mgmt_sample_large,mgmt-KSAs refers to the in_sets datasets and mgmt_all is the desired out_set dataset. According to the documentation on the site, what we tried should be correct, with a space between the in_sets value (a comma-separated list) and the out_set value.

As for the annotation export, I believe the '>' made a difference and the export was successful. When running prodigy stats -l, he got the output below (again, I'm working on getting it as markdown and will send it once I have it). The output shows that the two datasets we are trying to merge (mgmt_sample_large, mgmt-KSAs) exist, so we don't understand why we are getting the "can't find" error for both of them.

[Screenshot: MicrosoftTeams-image (7)]

Hi @ryanwesslen

Just want to follow up, as I am now experiencing the db-merge issue on my computer as well. See the attached markdown file. I'm not sure why it says it can't find the two datasets when it lists them after running prodigy stats -l.

db-merge error.jsonl (3.1 KB)


For other users: we found a fix for this by adding double quotes "" around the dataset names. This seems to come down to a difference in how the command line parses strings, so if you run into the same issue, try adding double quotes.
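
For anyone who wants to check how their shell splits the arguments before blaming the database, here is a rough sketch of the quoting fix. The thread does not say which shell was in use, so treat this as an illustration only: some shells (PowerShell, for example) treat an unquoted comma as a list separator, which would split the in_sets value into separate arguments. count_args is a hypothetical helper written for this example and is not part of Prodigy; the dataset names are the ones from this thread.

```shell
# count_args is a hypothetical helper (not part of Prodigy): it prints how many
# arguments the shell actually passed to it.
count_args() {
  echo "$#"
}

in_sets="mgmt_sample_large,mgmt-KSAs"   # comma-separated input datasets
out_set="mgmt_all"                      # dataset to merge into

# Quoted, the comma-separated list stays a single argument, so a command
# called this way receives exactly two arguments. This prints 2.
count_args "$in_sets" "$out_set"

# With the same quoting, the real command from this thread would be:
# prodigy db-merge "mgmt_sample_large,mgmt-KSAs" "mgmt_all"
```

If the count is anything other than 2 in your own shell, the dataset names are probably being split or altered before Prodigy ever sees them, which would explain a "can't find" error for datasets that prodigy stats -l lists.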