Annotating classifier results - Random sort, classes disorder, random scores

Just to clarify up front: I'm really new to this amazing tool.

I have a multi-label classifier and the task is to annotate the output of this model.

The documents are as follows:
{'text': 'some text', 'label': 'x1', 'score': 0.45}
{'text': 'some other text', 'label': 'x1', 'score': 0.3}
{'text': 'other other text', 'label': 'x2', 'score': 0.5}

  1. I want the order of the documents to be static; I don't want the order to change at all. I want the data tagger to focus first on the two documents with label 'x1' and only then move on to label 'x2'.

  2. For some reason (again, I'm really new to this tool), when I go to the UI I see the first text with some random score. I even deleted the scores from the JSONL file, and it still shows scores in the UI. How can I fix this? I want the data tagger to see the score each text received from the model.

  3. Another question, but it's less urgent: is there a way to attach a link to each document that will be shown in the UI, so the data tagger can simply click on it?

Hi Lior,

First, sorry for the delay replying to you on this, and thanks for the kind words.

I think going label-by-label is a great approach; it makes the annotation much easier. If you want to do this, the easiest thing is to start the task with one label, and then have another run where you do the second label. You can always merge the datasets later.
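For example, here's a minimal sketch of how you could do the split up front (the file names are just placeholders):

```python
import json

# Split data_to_tag.jsonl into one file per label, so each
# annotation session only sees documents for a single label.
# The input/output file names here are placeholders.
with open("data_to_tag.jsonl", encoding="utf8") as f:
    tasks = [json.loads(line) for line in f]

for label in sorted({task["label"] for task in tasks}):
    with open(f"data_to_tag_{label}.jsonl", "w", encoding="utf8") as out:
        for task in tasks:
            if task["label"] == label:
                out.write(json.dumps(task) + "\n")
```

You'd then start one annotation session per file, each saving to its own dataset, and combine the results afterwards (newer versions of Prodigy also ship a db-merge command for merging datasets).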

Regarding your second question: the scores are produced by the model when it's ranking the examples for active learning. You can disable this, depending on the recipe settings you're using. Could you share the command you're using to start the server?

Finally, yes, you can include a link. Anything you put in the meta section of the tasks will be displayed to the user, so you can include links there. The UI will detect URLs and make them clickable.
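For example, a task with a clickable link could look like this (the URL and meta key are just placeholders):

{'text': 'some text', 'label': 'x1', 'score': 0.45, 'meta': {'link': 'https://example.com/doc/123'}}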


Thank you for your answer!

The command I'm using is:

python3.6 -m prodigy textcat.manual tagging_task en_core_web_sm data_to_tag.jsonl --label "label1","label2","label3"

That's strange – textcat.manual shouldn't be adding any scores, and there's really no place they could come from, except the original data that's loaded in :thinking: If there are scores or other properties in the task's "meta", Prodigy will just pass them through and show them in the UI, so you can display custom meta with the tasks (like links, internal IDs and so on).
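If you want to double-check what's actually in your input, a quick sketch like this will print the keys of each task (including anything nested under "meta"), which should show where any stray scores are coming from:

```python
import json

# Print the top-level keys of each task, plus any keys nested
# under "meta", to track down where stray "score" values come from.
with open("data_to_tag.jsonl", encoding="utf8") as f:
    for i, line in enumerate(f):
        task = json.loads(line)
        print(i, sorted(task), "meta:", sorted(task.get("meta", {})))
```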

If I run your command with the data from the first post, the result in the UI looks like this for me: