Text does not exist in spans after NER labelling

Hi @bev.manz!

By design, annotated spans do not include the raw text.

You mention that it seems "random" that you sometimes see the span text and sometimes don't. Could you simply be seeing the text of the tokens rather than the text of the spans?

Let me illustrate. For example, you can open the ner_manual interface and see what the annotated spans are meant to look like:

Produces:

{
  "text": "First look at the new MacBook Pro",
  "spans": [
    {"start": 22, "end": 33, "label": "PRODUCT", "token_start": 5, "token_end": 6}
  ],
  "tokens": [
    {"text": "First", "start": 0, "end": 5, "id": 0},
    {"text": "look", "start": 6, "end": 10, "id": 1},
    {"text": "at", "start": 11, "end": 13, "id": 2},
    {"text": "the", "start": 14, "end": 17, "id": 3},
    {"text": "new", "start": 18, "end": 21, "id": 4},
    {"text": "MacBook", "start": 22, "end": 29, "id": 5},
    {"text": "Pro", "start": 30, "end": 33, "id": 6}
  ]
}

Notice that the tokens have a "text" key, but the spans, which are the actual annotations, do not.
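Even without a "text" key, a span still points at its text in two ways: through character offsets ("start"/"end") and through token indices ("token_start"/"token_end"). Here is a minimal sketch using the example above. It assumes "token_end" is inclusive, which is what the data above suggests, since token 6 is "Pro".

```python
example = {
    "text": "First look at the new MacBook Pro",
    "spans": [
        {"start": 22, "end": 33, "label": "PRODUCT", "token_start": 5, "token_end": 6}
    ],
    "tokens": [
        {"text": "First", "start": 0, "end": 5, "id": 0},
        {"text": "look", "start": 6, "end": 10, "id": 1},
        {"text": "at", "start": 11, "end": 13, "id": 2},
        {"text": "the", "start": 14, "end": 17, "id": 3},
        {"text": "new", "start": 18, "end": 21, "id": 4},
        {"text": "MacBook", "start": 22, "end": 29, "id": 5},
        {"text": "Pro", "start": 30, "end": 33, "id": 6},
    ],
}

span = example["spans"][0]

# Look the text up via the span's character offsets.
by_chars = example["text"][span["start"]:span["end"]]

# Look the text up via the span's token indices (token_end assumed inclusive).
first_token = example["tokens"][span["token_start"]]
last_token = example["tokens"][span["token_end"]]
by_tokens = example["text"][first_token["start"]:last_token["end"]]

print(by_chars)
print(by_tokens)
```

Both lookups should give the same string, so nothing is lost by leaving the text out of the span.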

Please confirm that this is consistent with what you're seeing.

The reason is that spaCy only needs the start and end offsets for training, not the text itself.

If you do need to add the text, you can add it with something like this:
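A minimal sketch, assuming each example is a dict in the format shown above (for instance, one line of an exported JSONL file). The helper name `add_span_text` is hypothetical and not part of Prodigy or spaCy. It only slices the example's text with each span's character offsets.

```python
import copy


def add_span_text(example):
    """Return a copy of the example in which every span has a "text" key."""
    example = copy.deepcopy(example)  # leave the original annotation untouched
    for span in example.get("spans", []):
        # Slice the raw text using the span's character offsets.
        span["text"] = example["text"][span["start"]:span["end"]]
    return example


example = {
    "text": "First look at the new MacBook Pro",
    "spans": [
        {"start": 22, "end": 33, "label": "PRODUCT", "token_start": 5, "token_end": 6}
    ],
}

annotated = add_span_text(example)
print(annotated["spans"][0])
```

You would run this over your exported annotations as a post-processing step. Since the extra key is not needed for training, it is probably cleanest to keep it out of the dataset you train from.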