I use Prodigy 1.10.3 (I previously used 1.9.5 and 1.10.2) with Python 3.7 on Ubuntu 18.04. I have a problem with the output and the coloring feature, and it happens in every version I have tried.
A first example, with the print-dataset recipe: I manually annotated a corpus with the ner.manual recipe, then ran the print-dataset recipe on the annotated dataset. Sentences which contain more than one named entity are not displayed correctly. A raw sentence like this:
La voix d’Ariadne se fit entendre ; elle appelait Olga dans le jardin.
... becomes, with the print-dataset recipe:
A second example, with the print-stream recipe: I annotated 8000 sentences and then trained a model (model_8000). I applied the model to a new corpus, like this:
prodigy print-stream model_8000/ new.jsonl
I have the same problem with the output. A raw sentence like this:
Vers la fin de l’été, le lendemain du jour où le petit Louis est rentré au lycée, mademoiselle Zozo, au retour d’une réunion chez les Chaduis, se met à table, un soir, toute brillante de plaisir.
becomes, with the print-stream recipe:
So it’s a big problem for me. I can apply my model, but I can’t use the result, even though it seems to be good. So I’m frustrated.
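In the meantime, here is a workaround I could imagine for the annotated dataset. It is only a rough sketch and it does not fix the coloring itself. I am assuming the dataset is exported to JSONL with prodigy db-out, and that each line has a "text" field and a "spans" list with "start", "end" and "label". The function names render_spans and render_jsonl and the PER label are made up for this illustration.

```python
import json


def render_spans(task):
    """Return the task text with each span wrapped as [text|LABEL]."""
    text = task["text"]
    spans = task.get("spans") or []  # tasks without entities may have no spans
    parts = []
    cursor = 0
    # Walk the spans from left to right (assumes they do not overlap).
    for span in sorted(spans, key=lambda s: s["start"]):
        parts.append(text[cursor:span["start"]])
        entity = text[span["start"]:span["end"]]
        parts.append("[{}|{}]".format(entity, span["label"]))
        cursor = span["end"]
    parts.append(text[cursor:])
    return "".join(parts)


def render_jsonl(lines):
    """Render every non-empty JSONL line, e.g. from an exported dataset."""
    for line in lines:
        line = line.strip()
        if line:
            yield render_spans(json.loads(line))


# Hypothetical example built from the sentence above; "PER" is an assumed label.
text = "La voix d’Ariadne se fit entendre ; elle appelait Olga dans le jardin."
spans = []
for name in ("Ariadne", "Olga"):
    start = text.index(name)
    spans.append({"start": start, "end": start + len(name), "label": "PER"})
example = {"text": text, "spans": spans}

print(render_spans(example))
```

This marks the entities with plain brackets instead of terminal colors, so the result can be read or saved to a file whatever the terminal does with the colors.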
I saw a similar discussion on the support forum (***/prodigy-print-dataset-shows-weird-format-output-no-coloring/2586), but I didn’t manage to solve my problem with it. Sorry if the answer is somewhere else… I couldn’t find it.
Thanks a lot for your help.