Different metadata in ner.correct output - 'spans'


I noticed that the ner.correct output contains two different types of span dictionaries.
Most of the spans have five keys: 'start', 'end', 'token_start', 'token_end', 'label'.
Some of them have three additional keys: 'text', 'source', 'input_hash'.

Why is there a difference? What spans get the additional meta data?

Hi! Prodigy's JSON format allows attaching arbitrary metadata to the objects. The extra keys aren't required, but the recipe adds them to the pre-labelled predictions by default. The most relevant one is probably 'source': this tells you where the prediction is coming from, e.g. the spaCy model used to pre-annotate the data. Annotations you add manually in the UI won't have this key. If you don't care about this info, it's also safe to just ignore it 🙂
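
If you do want to tell the two kinds of spans apart in your exported data, here is a minimal sketch in plain Python. It assumes each example is a dict with a "spans" list, and that the presence of the 'source' key marks a span that started out as a model suggestion, as described above. The helper names (split_spans_by_origin, strip_extra_keys) and the sample example, including the model name and hash value, are made up for illustration and are not part of Prodigy's API.

```python
# Keys that every span in the ner.correct output carries.
CORE_KEYS = ("start", "end", "token_start", "token_end", "label")


def split_spans_by_origin(example):
    """Return (pre_labelled, manual) spans for one annotated example.

    Assumption: a span with a 'source' key was suggested by the model,
    a span without it was added by hand in the UI.
    """
    pre_labelled, manual = [], []
    for span in example.get("spans", []):
        if "source" in span:
            pre_labelled.append(span)
        else:
            manual.append(span)
    return pre_labelled, manual


def strip_extra_keys(span):
    """Return a copy of the span that keeps only the five core keys."""
    return {key: span[key] for key in CORE_KEYS if key in span}


# Hypothetical example; the values are invented for illustration only.
example = {
    "text": "Apple hired Jane Doe.",
    "spans": [
        {
            "start": 0, "end": 5, "token_start": 0, "token_end": 0,
            "label": "ORG", "text": "Apple",
            "source": "en_core_web_sm", "input_hash": 123,
        },
        {
            "start": 12, "end": 20, "token_start": 2, "token_end": 3,
            "label": "PERSON",
        },
    ],
}

pre_labelled, manual = split_spans_by_origin(example)
normalized = [strip_extra_keys(span) for span in example["spans"]]
```

After stripping the extra keys, both kinds of spans have the same shape, which can be handy if a downstream script expects uniform dictionaries.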