I want to use Prodigy for named entity recognition and assigning dependency relations, and have my training data file ready to go in a .jsonl. In my VS Code, I've installed Prodigy in a virtual environment, but when I run Prodigy with even a blank pipeline with "prodigy blank:en", I get a popup window saying "How do you want to open this file" instead of taking me to a localhost so I can start annotations. The same happens when I run Prodigy with the pipeline components and span labels that I want to include. Can you please help me solve this issue? Thanks in advance.
I can't seem to replicate your setup issue. A few things to sanity-check:
How do you open the Prodigy tab in VSCode? Are you using the CTRL+SHIFT+P and "Open Prodigy" command?
Is it possible to send screenshots of this error message? Does this appear in the browser? Usually, vscode prodigy just mimics a browser inside your VS Code interface, so no files are being opened.
Can you also send me the logs in the Prodigy command whenever this error shows up? You can prepend the command with PRODIGY_LOGGING=verbose like so: PRODIGY_LOGGING=verbose prodigy ...
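If you're on Windows, note that the inline VAR=value prefix is not understood by cmd or PowerShell. One possible workaround, as a rough sketch, is to set the variable from a small Python launcher instead. The helper names below (verbose_env, run_prodigy_verbose) are made up for illustration, and the sketch assumes Prodigy is installed for the same interpreter that runs the script:

```python
import os
import subprocess
import sys


def verbose_env(base=None):
    """Return a copy of the environment with PRODIGY_LOGGING set to verbose."""
    env = dict(os.environ if base is None else base)
    env["PRODIGY_LOGGING"] = "verbose"
    return env


def run_prodigy_verbose(args):
    """Run `python -m prodigy <args>` with verbose logging enabled.

    Output goes straight to the terminal; the call returns once Prodigy exits.
    """
    cmd = [sys.executable, "-m", "prodigy", *args]
    return subprocess.run(cmd, env=verbose_env())
```

You would then call run_prodigy_verbose with the same arguments you normally type after "prodigy", passed as a list of strings, and copy the terminal output into your reply.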
I've recreated my process for installing Prodigy. I open VS Code, then create a new virtual environment, and within that virtual environment, install Prodigy according to the documentation's installation instructions. As far as I can tell, Prodigy is installed properly.
However, when I execute "prodigy blank:en" in my terminal, just to see if I can start annotating, nothing really happens (as you may be able to see from my terminal output). I get a popup saying "How do you want to open this file?" The notepad file looks like this:
Is there a better way for me to open Prodigy in VS Code so I can start annotating in a localhost in a browser? Is there some intermediate step I'm missing?
Are you trying to run the command prodigy blank:en specifically? This won't work because prodigy requires you to pass a recipe. To check if your installation is correct, perhaps you can try running the example recipe in the vscode-prodigy repo?
First you need to clone the repo, then enter the cloned directory:
git clone git@github.com:explosion/vscode-prodigy.git
cd vscode-prodigy
code .
Then from there you can run a Prodigy session (the example files are already provided in the repo):
Note that you have to pass a recipe (in this case ner.manual) and a set of example data in JSONL for it to work. To view Prodigy in another VS Code tab, open the VS Code Command Palette (CTRL+SHIFT+P) and select the "Open Prodigy" command.
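Since the recipe reads its examples from a JSONL file, it can also be worth ruling out a malformed file before starting a session. Here's a minimal sketch that checks every non-empty line is a JSON object with a "text" key. The function name is made up, and the assumption that "text" is the key your recipe needs is mine, so adjust required_key if your data differs:

```python
import json


def check_jsonl_lines(lines, required_key="text"):
    """Return a list of (line_number, problem) for lines that look unusable."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
        elif required_key not in record:
            problems.append((number, f"missing '{required_key}' key"))
    return problems
```

To use it on a file, open the file with encoding="utf-8" and pass the file object in directly. An empty list means no problems were found by this particular check.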
I wasn't able to clone your github repository (error message attached below).
However, I was able to clone an NLP model from one of my own repositories, but even when passing in a recipe and jsonl file with "prodigy rel.manual...", I still only get a popup window with a file.
This assumes you have already installed Visual Studio Code (VS Code). While Prodigy is running somewhere, you can press CTRL+SHIFT+P to open the Command Palette. It will then show a drop-down box where you can find the "Open Prodigy" command:
I encourage you to clone the example repository again and try it out first. Also, when looking at your logs, I'm not sure if Prodigy is running correctly?
I was able to clone an NLP model from one of my own repositories, but even when passing in a recipe and jsonl file with "prodigy rel.manual...", I still only get a popup window with a file.
Ensure that the model you're passing is a model that was trained in spaCy.
Does it matter whether I'm running on Windows or Mac? When I try to run "python3 -m prodigy..." I get an error saying "No module named prodigy." Also, what do you mean when you say "when Prodigy is running somewhere?" Don't I have to start Prodigy from the terminal?
Good news is I'm able to clone the https version of the repository. Thanks in advance.
It shouldn't depend on your operating system, so both Windows and Mac should be OK. My guess is that Prodigy was installed in a different path. You can check with the following:
which prodigy
From there you can see the path of your Prodigy installation. A common catch is that Prodigy was installed for the interpreter that python points to but not for the one that python3 points to. Try running the same command with python instead of python3.
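The which command may not be available in a Windows terminal (cmd has where instead), so here's a small cross-platform sketch that reports the same information from inside Python. The function name locate is made up for illustration. Run it once with python and once with python3 to see whether the two interpreters differ:

```python
import importlib.util
import shutil
import sys


def locate(module_name="prodigy"):
    """Report which interpreter is running and where the module and CLI live."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,  # the Python binary running this script
        "module_path": spec.origin if spec else None,  # None: not importable here
        "executable": shutil.which(module_name),  # None: no such command on PATH
    }


if __name__ == "__main__":
    for key, value in locate().items():
        print(f"{key}: {value}")
```

If module_path comes back as None, that interpreter can't import Prodigy, which would match the "No module named prodigy" error.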
Also, what do you mean when you say "when Prodigy is running somewhere?" Don't I have to start Prodigy from the terminal?
You got it right. It means that once you have a Prodigy session running (started from the terminal), you can open Visual Studio Code, open the Command Palette, and look for the "Open Prodigy" command.
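To confirm that a session is actually up before trying the VS Code command, you could probe the port it serves on. This is only a sketch: the function name is made up, and the localhost host and 8080 port are assumptions on my part, so use whatever address Prodigy prints when it starts:

```python
import socket


def is_listening(host="localhost", port=8080, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable
        return False
```

If this returns False right after you start Prodigy in another terminal, the session most likely never came up, and there would be nothing for VS Code to display.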
I've been looking into installing via setup wheel...do you think that would help because it'd be based on my specific system? Also, the installation command on your documentation seems to be for mac users--if I have python 3.10.2 and Windows 10 64-bit processor what would the new installation command be?
Hi Kyle, can you try rerunning the installation process for Prodigy again? And then for good measure, copy paste the installation logs so that we can verify if Prodigy was installed correctly. My hunch is that Prodigy didn't install properly in your system, hence why you can't start a Prodigy session, and consequently why VS Code won't show anything (as it depends on the running Prodigy session).
if I have python 3.10.2 and Windows 10 64-bit processor what would the new installation command be?
Next time you can copy-paste the text rather than posting the whole image. I think there's one mistake in your command. If you look at the rel.manual recipe, it expects a "dataset" as its first positional argument. You might need to update it as:
This assumes that you are working inside a virtual environment and that your Prodigy installation happened inside that environment. If you encounter errors such as "prodigy cannot be found", it might be because you're "outside" the environment and forgot to activate it.
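A quick way to tell whether the interpreter you're running belongs to a virtual environment is to compare sys.prefix with sys.base_prefix. A minimal sketch, with a made-up function name:

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the Python installation it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Keep in mind that on Windows a venv puts its interpreter and scripts under a Scripts folder rather than bin, so a path like venv/bin/python would not exist there.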
rel_dataset is the missing argument. This can be anything, and Prodigy will save your annotations into this dataset.
That said, I wonder why it didn't give an error when you entered the incorrect command. If possible, can you try running the same command in a Linux-based environment, say, Windows Terminal on WSL2? I'll also check with our team why this is the case.
Thank you for that. When I tried including "venv/bin/python" in my command, I got an error saying 'venv/bin/python is not recognized'. I tried the command "prodigy rel.manual rel_dataset /data.jsonl --label PERSON,GPE" again (this time with the rel_dataset), but I still only got the popup to open a file instead of being taken to the annotation tool. I'll try running on WSL2. Thanks