Progress bar & Score

Hi,

I have two queries regarding the Prodigy tool. Mind the text in the snapshot; it's just for reference purposes.

  1. What does the score below the text signify? (marked in red below the text in the image)

  2. How do I get the stats (Accept, Reject, Ignore) of the dataset on the interface, similar to the ones shown in the insults classifier video?

Thanks

This is the confidence of the prediction assigned by the model – for example, the category label predicted by the text classifier, or the entity label predicted by the entity recognizer. In your example, the score of 0.5 is essentially the "perfect" uncertain prediction – by default, Prodigy prioritises examples with a prediction closest to 0.5, i.e. the ones it's most uncertain about and which will give you the most relevant gradient for training. (Whether you click accept or reject, the model will always have a gradient of 0.5 to learn from.)
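To illustrate the idea of prioritising by uncertainty, here's a small standalone sketch. This is not Prodigy's internal implementation (the built-in sorters like prefer_uncertain are more sophisticated, e.g. they work on infinite streams), just the core intuition:

```python
def uncertainty(score):
    """Map a model score to an uncertainty value:
    1.0 for a score of 0.5, 0.0 for a score of 0.0 or 1.0."""
    return 1.0 - abs(score - 0.5) * 2.0

examples = [
    {"text": "a", "score": 0.97},
    {"text": "b", "score": 0.51},
    {"text": "c", "score": 0.12},
]
# Most uncertain predictions first – these are the ones most worth annotating
ranked = sorted(examples, key=lambda eg: uncertainty(eg["score"]), reverse=True)
print([eg["text"] for eg in ranked])  # → ['b', 'c', 'a']
```

The example with score 0.51 comes out on top, because answering it gives the model the most information either way.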

Just set "show_stats": true in your prodigy.json – see here for more details! :blush:

Btw, some more background on the progress bar (also in case others come across this issue later): Prodigy's active learning recipes will use the loss returned by the model's update method to calculate an estimated annotation progress, based on how the model is improving. It does a simple regression to predict how many more examples are needed until the loss reaches zero.
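As a rough illustration of that idea (a simplified sketch, not Prodigy's actual formula), you could fit a straight line to the recent per-batch losses and extrapolate where it hits zero:

```python
def estimate_progress(losses):
    """Estimate annotation progress from a list of per-batch losses.

    Fits a straight line to the losses and extrapolates the batch index
    at which the fitted loss reaches zero. Simplified sketch only –
    not Prodigy's actual implementation.
    """
    n = len(losses)
    if n < 2:
        return 0.0
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(losses) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    if slope >= 0:  # loss isn't decreasing, so no sensible estimate
        return 0.0
    intercept = mean_y - slope * mean_x
    zero_at = -intercept / slope  # batch index where the fitted loss hits 0
    return min(1.0, max(0.0, n / zero_at))

# Loss dropping linearly from 1.0: after 3 of a predicted 4 batches → 0.75
print(estimate_progress([1.0, 0.75, 0.5]))  # → 0.75
```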

Recipes that don't use a model in the loop will check whether the stream has a length and calculate the progress based on the total number of examples and the examples already annotated in this session. This is usually not possible if the stream is a generator, so the progress bar will show the infinity symbol (like in the screenshot above). In your custom recipes, you can also define your own progress function as the 'progress' component returned by the recipe, for example:

def get_progress(session=0, total=0, loss=0):
    # session: annotations in this session, total: total annotations so far,
    # loss: last loss reported by the model's update method.
    # Should return a float between 0 and 1.
    progress = compute_something_here()
    return progress

Hi, thanks for the answer.

I got the progress bar query correctly. Thanks for the detailed answer.

Just to clarify on the above part: if the score is 0.01 or quite low (between 0.0 and 0.1), does it mean that the model is certain about these sentences/entities? And vice versa for higher values (i.e. more than 0.6)?

P.S. I encountered these values in the real dataset.

Thanks

A score of 0.01 means that the model has assigned a very low probability to the suggested annotation (or, phrased differently, is very confident that it's wrong). Vice versa, higher scores signal higher probability.

That’s great.

Thanks a lot :smiley:


How can I update the progress for the "textcat.manual" recipe so that instead of the infinity symbol, the number of answered tasks out of the total will appear? Thank you.

@Yuri You could edit the recipe function (or wrap it – see the README for an example) and overwrite the stream with a list instead of a generator. So basically, stream = list(stream). This means that the stream has a __len__, and Prodigy will be able to calculate the progress based on the total number of examples.
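To see why this works: a generator yields lazily and has no __len__, so the total is unknown, whereas materialising it into a list makes the length available. A quick standalone illustration:

```python
def stream_gen():
    # Generators yield lazily, so the total number of tasks is unknown
    yield from ({"text": text} for text in ["one", "two", "three"])

stream = stream_gen()
print(hasattr(stream, "__len__"))  # False – progress bar shows infinity

stream = list(stream)              # materialise the stream
print(len(stream))                 # 3 – progress can now be calculated
```

The trade-off is that the whole stream is loaded into memory up front, so this only makes sense for datasets that fit comfortably in RAM.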

Hi, I have a doubt regarding the progress bar.
As you stated earlier, it updates the model and calculates the loss every time the "update" function is called. But sometimes, even if I only ignore some garbage sentences and mark them as ignore, the progress bar still updates its value. (In such a scenario I am not accepting or rejecting any sentences in between; I am ignoring all of them, and still the progress bar updates.)
I am not sure what happens in this case. My only concern is that the model should not learn anything from ignored sentences.

Hi! Are you sure the batch of updates sent back to the server only includes the ignored examples? The progress bar requires an update from the server, so it's refreshed whenever a batch of answers is sent to the server and the model is updated and reports a new loss. So it's possible that the progress updates after you ignore a bunch of texts, but that this progress is based on previously accepted/rejected examples.