Prodigy with spacy-llm ner.llm.correct - not showing the text to be annotated in the UI

Hello all,
I'm back with an issue I couldn't sort out using Prodigy. I'm annotating a very large set of texts for NER with 50 long and complex labels. I run this command:
dotenv run -- python3 -m prodigy ner.llm.correct annotated-xxx config.cfg examples.jsonl
It loaded the labels and opened the Prodigy UI. In the UI I can see the labels, the "Show prompt sent to the LLM" and "Show response from the LLM" panels, and the Accept and Reject buttons, but I don't see the text to be annotated, so I can't manually correct the annotation when I find a mistake. It looks like I'm missing something here. With a very small dataset it works fine as expected, but with a large dataset like mine the text to be annotated doesn't show up in the UI. I appreciate any help on this.
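For reference, my config.cfg follows the standard spacy-llm layout, roughly along these lines (the label names here are just placeholders for the real 50, and the exact task/model versions may differ):

[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
# keep prompts/responses so the UI can display them
save_io = true

[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["LABEL_A", "LABEL_B", "LABEL_C"]

[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "ner_example.yaml"

[components.llm.model]
@llm_models = "spacy.GPT-4.v2"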

Thank you

Hi @Fantahun ,

We haven't seen that issue before. Could you share how many examples your dataset has so that I can try to reproduce the issue?
Thanks!

I had the same problem with any other model, but it works perfectly fine with OpenAI's models.

Thanks for your reply @magdaaniol. I'm using 262 texts (lines of text) in my examples.jsonl and fifty labels, most of which are multi-word, and my ner_example.yaml is almost empty. I'm planning to use a couple of tens of thousands of texts in examples.jsonl. BTW, I'm using OpenAI's GPT-4 LLM in the background. If possible and required, I can share screenshots as well.
Thank you again.

Hi @Fantahun

Thanks for the extra info. Are you sure the only factor that changes how things are rendered is the length of the input file? That in principle shouldn't be the case as the input is processed in batches.

Could you try the same input and the same number of labels with the regular ner.manual recipe and see if you experience the same issue? (Just to exclude potential issues with the input being corrupted or empty due to the LLM API.)
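For example, something along these lines (the dataset name is just an example; --label accepts either a comma-separated list of labels or a path to a text file with one label per line):

python3 -m prodigy ner.manual ner-manual-test blank:en examples.jsonl --label labels.txt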
I suspect the labels might be covering up most of the UI, but you should still be able to scroll to see all the elements. It would be helpful if you shared a screenshot of what you're seeing - thanks!
Also, which Prodigy version are you on? In v1.13.2 we introduced a dedicated front-end component for handling LLMs, so knowing the version would help me figure out how your UI is being rendered.

I'm using Prodigy v1.14.12

I need help with this please!!

I'm talking about a bug in the software I purchased. The company is responsible for fixing this. I appreciate any help. Thank you.

Hi @Fantahun,

In order to be able to help you, we really need you to try to answer the questions I asked before (the Prodigy version was only one of them). Otherwise, it will be much harder and take much more time to reproduce your problem and work out whether it is a bug.
If you are not able to engage in this process, the only thing we can offer is a refund.

We try our best to answer questions as quickly and in as much detail as possible, but we're a small team and we're not able to get to everyone's questions immediately, especially not on a weekend. You sent your follow-ups within less than 24 hours, and on a Saturday. This isn't very helpful and makes it a lot harder for us to answer everyone's questions on the forum.

Sorry if my request came across as a little harsh, and I'm not looking for a refund either. I feel there is a bug in the tool, or maybe in the way I'm using it. My back-to-back questions were just by chance. I'm one of the early advocates of spaCy and want to test the Prodigy + spacy-llm combo to its limits. I hope this will benefit the project.
Getting back to the question, I don't think my dataset is suited for ner.manual. I guess I'll have to dig a little deeper to see where the problem really originates. Thanks

Hello again, and sorry to bother you.
I think I've identified the problem; I guess it's a bug in the Prodigy UI. I reduced the number of labels to 12, tried it, and it worked fine as expected, showing the labels, the text and the other parts. Please see the screenshot. Previously I used 50 labels - maybe the UI is not able to accommodate that many together with the text to be annotated. I tried scrolling in case the text was hidden, but that didn't work either.
Thank you

Hi @Fantahun,

The truth is that the UI was not really designed to handle this many labels. But there's a reason for that: it is likely not the best idea to annotate this many labels at the same time.
This would be really taxing for the annotators, as they would need to keep a big data model in mind with every annotation task, and it's not the easiest task for the model either.
This thread by @ines explains very well why you might consider splitting your annotation in steps.

Additionally, such a high number of labels is even less recommended in the context of LLM annotation.
It makes the prompt much bigger and more difficult for the model, and it also slows down inference. The official prompt engineering guide from OpenAI explicitly recommends splitting complex tasks into simpler ones and having one task per prompt.

Finally, if you do need to show a high number of labels in the UI, you could make the label area scrollable with custom CSS via the global_css setting in .prodigy.json:

# .prodigy.json
{
  "global_css": ".prodigy-labels { max-height: 150px; overflow-y: auto; } .prodigy-container { max-width: 950px; }"
}

I'd like to reiterate that this would not be our recommended way of dealing with a high number of labels.

If you're interested in some more NER annotation best-practice tips, this thread has plenty of relevant references on dealing with a high number of labels.

Great. Thank you very much @magdaaniol for your suggestions. I will try to simplify my annotation tasks to a manageable level. As always, keep up the good work at Explosion.
