Esteemed Prodigy experts,
I have been experimenting with get_dataset_examples and I am receiving examples whose spans come in two different formats. Could someone help me understand why that is?
My code is very simple: I pass in the name of a dataset and collect the examples in a list.
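from prodigy.components.db import connect

# Connect to the Prodigy database
db = connect()
# Set up the result list
lst_examples = []
# dataset_name holds the name of the dataset to read
examples = db.get_dataset_examples(dataset_name)
for example in examples:
    lst_examples.append(example)
db.close()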
This all works as expected, but when I inspect the spans in the examples, I get two different formats of spans. Two spans have the actual text of the span, a source and an input hash; the third one does not.
I am using ner.correct in this case. My hunch is that the "longer" format is created by the prediction model while the "shorter" format is created by a manual action. Could that be true and if so, why is that?
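To make the difference concrete, here is a minimal sketch of how the spans can be grouped by the set of keys they carry. It runs on plain dictionaries, so it does not need a database connection. The helper name group_spans_by_keys and the sample data are made up for illustration; the key names "text", "source" and "input_hash" follow the description above, and the values are placeholders rather than real output.

```python
from collections import defaultdict


def group_spans_by_keys(examples):
    """Group all spans by the sorted tuple of keys each span contains."""
    groups = defaultdict(list)
    for example in examples:
        # Examples without a "spans" entry are skipped.
        for span in example.get("spans", []):
            groups[tuple(sorted(span))].append(span)
    return dict(groups)


# Hypothetical sample data: one "longer" span and one "shorter" span.
sample_examples = [
    {
        "text": "Apple hired Kai.",
        "spans": [
            {
                "start": 0,
                "end": 5,
                "label": "ORG",
                "text": "Apple",
                "source": "some_model",  # placeholder value
                "input_hash": 123,  # placeholder value
            },
            {"start": 12, "end": 15, "label": "PERSON"},
        ],
    }
]

# Print how many spans share each key layout.
for keys, spans in group_spans_by_keys(sample_examples).items():
    print(len(spans), keys)
```

Passing lst_examples from the snippet above instead of sample_examples should show which key layouts occur in the real dataset and how often.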
I've tried peeking into the database itself, but I don't think that will clear this up.
Many thanks for any help!
Kai