first steps - loading the correct dataset - troubleshooting

Hi everybody,
and nice to meet you. I am a new user and I am trying to take my first steps with Prodigy and NER.
I have been having a problem with the annotation interface. I run Prodigy in JupyterLab and I think I have installed everything correctly, but I can't figure out how to solve this.
I start with this

import os
import prodigy
!python -m prodigy stats -l

and I can see two databases, which is correct.

However, when I try and start a new annotation task with this

!python -m prodigy ner.manual GP_set blank:en ./test_data.jsonl --label LABEL_A

and I get to the prodigy tab, the annotation interface is set to a different dataset, which I loaded in a previous session (new_set instead of GP_set which I am trying to load, see pic below).

What am I doing wrong?

Sorry if this is very trivial, any help would be greatly appreciated.
Best, g.

Hi! Just to make sure I understand the question correctly: you mean the name of the dataset that the annotations are saved to, right? And that it says new_set in the sidebar, even though you started a command with GP_set?

From the command you posted, it looks like you're in a Jupyter Notebook, right? Maybe what happened is that the previous server you started wasn't terminated properly? So what you're seeing here in the app is still the first command/process running on port 8080, not the second one (although you should have seen an error in that case).

Try killing the process that runs on that port and restart Prodigy. How you kill a process running on a port depends on your operating system, but you'll find lots of instructions on StackOverflow as it's a common thing people don't remember how to do (I have to google for the command every time :sweat_smile:).
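Before hunting for the process, it can help to confirm that something is still listening on the port. The sketch below uses only Python's standard `socket` module and is not part of Prodigy; the helper name `port_in_use` is made up for illustration, and 8080 is simply the port mentioned above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


# True here before starting a new server suggests an old one is still running.
print(port_in_use(8080))
```

If this prints `True` when you have not started anything yet, the app you see in the browser is probably still being served by the earlier process.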


Hi Ines!

Thanks for chiming in. Yes, you understood the problem perfectly: I am indeed in a Jupyter notebook (on Windows 10), and killing the process running on the Prodigy port did the trick.

I am posting the commands below for Windows users who might need them:

!netstat -ano | findstr :<YOURPORT>

In my case the port number is 8095, and the PID of the process listening on it is 8392. You then pass that PID to the following command to kill the process:

!Taskkill /PID 8392 /F 
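For anyone who would rather do the lookup in Python, here is a rough sketch that pulls the listening PID out of `netstat -ano` output. It assumes the usual five-column layout of English-language Windows (Proto, Local Address, Foreign Address, State, PID); the function name `pids_listening_on` and the `SAMPLE` text are invented for illustration, reusing the port and PID from this thread.

```python
from typing import List

# Illustrative sample only, not real output from my machine.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:8095         0.0.0.0:0              LISTENING       8392
  TCP    127.0.0.1:50123        127.0.0.1:8095         ESTABLISHED     4120
"""


def pids_listening_on(netstat_output: str, port: int) -> List[int]:
    """Return sorted PIDs of TCP sockets in LISTENING state on the given local port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # The local address looks like 127.0.0.1:8095 or [::]:8095
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port) and fields[4].isdigit():
            pids.add(int(fields[4]))
    return sorted(pids)


# In a notebook the text could come from:
#   subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout
print(pids_listening_on(SAMPLE, 8095))
```

Filtering on the local address and the LISTENING state skips rows like the second one in the sample, where 8095 only appears as the foreign port of a browser connection; a plain `findstr :8095` would match both.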

This worked fine, thanks!
I was wondering however, is there a "prodigy" way to do this? (you mentioned the problem might be due to the server not being terminated properly).

Also, I am still facing a minor bug (idk if a new thread is needed). Whenever I try to start a new annotation process, and I type for example:

 !python -m prodigy ner.manual a_new_dataset blank:en ./new_data.jsonl --label LABEL1,LABEL2

the Jupyter interface always freezes, and I am forced to interrupt the kernel. Only when I run the command again do I get the message:

Using 2 label(s): LABEL1, LABEL2
[*] Starting the web server at http://localhost:8095 ...
Open the app in your browser and start annotating!

and I can start annotating.

Any ideas why this happens? Thanks in advance!

Thanks for the update, glad it worked! :slightly_smiling_face:

I think this mostly comes down to the way command line commands work in Jupyter environments. If you're running the prodigy command in a terminal, you'd typically press Ctrl+C to exit the server/process it's running. The equivalent of this in a notebook is interrupting the kernel; see the Stack Overflow question "Is there an equivalent to CTRL+C in IPython Notebook in Firefox to break cells that are running?"
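If interrupting the kernel feels clunky, one possible workaround is to start the server as a child process from Python instead of using `!`, so the cell returns immediately and you keep a handle for stopping it later. This is only a sketch using the standard `subprocess` module, not a Prodigy feature; `start_background` and `stop_background` are hypothetical helper names.

```python
import subprocess
import sys


def start_background(cmd):
    """Start a command as a child process without blocking the notebook cell."""
    return subprocess.Popen(cmd)


def stop_background(proc, timeout=5.0):
    """Ask the child to exit; force-kill it if it is still alive after `timeout` seconds."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode


# Example with the command from this thread (commented out so the cell
# does not start a server by accident):
# server = start_background([sys.executable, "-m", "prodigy", "ner.manual",
#                            "GP_set", "blank:en", "./test_data.jsonl",
#                            "--label", "LABEL_A"])
# ... annotate in the browser, then:
# stop_background(server)
```

Keep in mind that `terminate()` is harsher than Ctrl+C, especially on Windows, so the server may not get a chance to run its normal shutdown steps. It is probably safest to save your annotations in the app before stopping it this way.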