Pre-annotation does not work

I have implemented a custom recipe for my model. The text and the labels appear in the UI, but the pre-annotations are missing: the text is not highlighted.

I used the pseudocode here: https://prodi.gy/docs/named-entity-recognition#custom-model
and followed the same format.

Hi @aisha_harbi ,

If the text appears but the pre-annotated spans aren't highlighted in the UI, then the usual problem is that the tokens and spans are not aligned correctly. To clarify:

  • For a token, start and end are character offsets into the text.
  • For a span, token_start and token_end are token indices, while start and end are, again, character offsets.

What you can do is first check a few samples from your JSONL file, then work your way backwards in case there's a missing piece of logic (e.g. alignment, missing keys, etc.).
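To make that check concrete, here is a rough sketch of a validator you could run over your JSONL. The names find_problems and check_jsonl are hypothetical helpers, not part of Prodigy, and the sketch assumes token_end is the index of the last token in the span (inclusive), so please verify that against your own data.

```python
import json

REQUIRED_TOKEN_KEYS = {"text", "start", "end", "id"}
REQUIRED_SPAN_KEYS = {"start", "end", "token_start", "token_end", "label"}


def find_problems(task):
    """Return a list of human-readable problems found in one task dict."""
    problems = []
    text = task.get("text", "")
    tokens = task.get("tokens") or []
    if not tokens:
        problems.append("task has no 'tokens' list")
    for i, token in enumerate(tokens):
        missing = REQUIRED_TOKEN_KEYS - set(token)
        if missing:
            problems.append(f"token {i} is missing keys: {sorted(missing)}")
        elif text[token["start"]:token["end"]] != token["text"]:
            problems.append(f"token {i} offsets do not match its text")
    for i, span in enumerate(task.get("spans", [])):
        missing = REQUIRED_SPAN_KEYS - set(span)
        if missing:
            problems.append(f"span {i} is missing keys: {sorted(missing)}")
            continue
        # Assumption: token_end is inclusive (index of the span's last token).
        if not (0 <= span["token_start"] <= span["token_end"] < len(tokens)):
            problems.append(f"span {i} token indices are out of range")
            continue
        first = tokens[span["token_start"]]
        last = tokens[span["token_end"]]
        if first.get("start") != span["start"] or last.get("end") != span["end"]:
            problems.append(f"span {i} character offsets do not line up with its tokens")
    return problems


def check_jsonl(path):
    """Print every problem found in a JSONL file, one task per line."""
    with open(path, encoding="utf8") as f:
        for line_number, line in enumerate(f, start=1):
            if line.strip():
                for problem in find_problems(json.loads(line)):
                    print(f"line {line_number}: {problem}")
```

Calling check_jsonl on the file your recipe streams from should point at the first task where the tokens and spans disagree.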

Isn't this the expected format?

{
  "text": "Apple updates its analytics service with new metrics",
  "spans": [{"start": 0, "end": 5, "label": "ORG"}]
}

You're missing token_start and token_end in the spans. You also need another key, tokens, which should contain a list of this data structure:

... "tokens": [{"text": str, "id": int, "start": int, "end": int},...]

A minimal structure looks like this:

{
   "text":"Welcome to Prodigy!",
   "tokens":[
      {
         "text":"str",
         "start":"int",
         "end":"int",
         "id":"int"
      },
      {
         "text":"str",
         "start":"int",
         "end":"int",
         "id":"int"
      }
   ],
   "spans":[
      {
         "token_end":"int",
         "token_start":"int",
         "label":"str",
         "start":"int",
         "end":"int"
      }
   ],
   "meta":{
      "ids":[
         "str",
         "str"
      ],
      "start_indices":[
         "int",
         "int"
      ]
   }
}
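
If your model only gives you character offsets, one way to fill in the rest is sketched below. This is only an illustration: make_task is a hypothetical helper, the whitespace tokenizer is a stand-in for whatever tokenizer your recipe really uses, and token_end is assumed to be the index of the last token in the span. The meta block is left out because the sketch only covers tokens and spans.

```python
import json
import re


def make_task(text, char_spans):
    """Build a task dict with tokens and token-aligned spans."""
    # Stand-in tokenizer: split on whitespace and keep character offsets.
    tokens = [
        {"text": m.group(), "start": m.start(), "end": m.end(), "id": i}
        for i, m in enumerate(re.finditer(r"\S+", text))
    ]
    # Look-up tables from character offset to token index.
    starts = {t["start"]: t["id"] for t in tokens}
    ends = {t["end"]: t["id"] for t in tokens}
    spans = []
    for span in char_spans:
        if span["start"] not in starts or span["end"] not in ends:
            # The span cuts through a token, so it cannot be aligned.
            raise ValueError(f"span {span} does not align with token boundaries")
        spans.append({
            **span,
            "token_start": starts[span["start"]],
            "token_end": ends[span["end"]],  # assumed inclusive
        })
    return {"text": text, "tokens": tokens, "spans": spans}


task = make_task(
    "Apple updates its analytics service with new metrics",
    [{"start": 0, "end": 5, "label": "ORG"}],
)
print(json.dumps(task["spans"]))
# [{"start": 0, "end": 5, "label": "ORG", "token_start": 0, "token_end": 0}]
```

Raising on a misaligned span is deliberate: a span that silently fails to match a token boundary is exactly the kind of thing that shows up later as missing highlights.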

Do the keys within each dictionary in the list have to be in a particular order? And how do I create the meta and start_indices lists? I already have all the keys within the spans and tokens lists, as well as the text. Manual annotation works fine; it's just the pre-annotation that's not working. I think what I'm trying to ask is: what does the pre-annotation depend on? The spans, right?

Thanks so much,

Never mind, it works. Thanks again!