dep.teach doesn't use the same tokenization as the pretrained model

@kak-to-tak How is your custom tokenizer implemented? Prodigy will use the model's nlp.make_doc method to create a tokenized Doc from the string of text. By default, this will call into nlp.tokenizer. So your custom tokenization should be implemented via the model's tokenizer.
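For instance, a custom tokenizer can be attached by overwriting `nlp.tokenizer`, so that `nlp.make_doc` (and therefore Prodigy) picks it up. This is just a minimal sketch using a hypothetical whitespace-only tokenizer, not your actual implementation:

```python
import spacy
from spacy.tokens import Doc

class WhitespaceTokenizer:
    """Hypothetical example tokenizer: splits on single spaces only."""

    def __init__(self, vocab):
        self.vocab = vocab

    def __call__(self, text):
        words = text.split(" ")
        # Build a Doc directly from the pre-split words
        return Doc(self.vocab, words=words)

nlp = spacy.blank("en")
# Overwrite the model's tokenizer so nlp.make_doc uses it
nlp.tokenizer = WhitespaceTokenizer(nlp.vocab)

doc = nlp.make_doc("hello world")
print([t.text for t in doc])
```

If your tokenizer is attached like this, any recipe that calls `nlp.make_doc` will produce the same token boundaries you trained with.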

Alternatively, you can feed in pre-tokenized data that has a "tokens" property. See here for an example of the format: https://prodi.gy/docs/api-interfaces#dep
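To illustrate, a pre-tokenized task in that format might look like the following. The exact field set is documented at the link above; the token texts and offsets here are made up for the example:

```python
# A minimal pre-tokenized task: each token carries its text,
# character offsets into "text", and an index ("id").
task = {
    "text": "hello world",
    "tokens": [
        {"text": "hello", "start": 0, "end": 5, "id": 0},
        {"text": "world", "start": 6, "end": 11, "id": 1},
    ],
}

# The offsets must line up with the raw text
for token in task["tokens"]:
    assert task["text"][token["start"]:token["end"]] == token["text"]
```

Since the tokens are provided explicitly, Prodigy will respect these boundaries instead of re-tokenizing the text.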