French NER

Hi Ines,

I’m already using msgpack==0.5.6. For now I’ve just truncated the fastText vector files to about 1500k vectors, which doesn’t trigger the error. It’s not ideal, but since the last vectors in the file belong to less common words, it should still work well enough while I build the data set.
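
A minimal sketch of that kind of truncation, assuming the vectors are in fastText's plain-text .vec format (a header line with the row count and dimensions, then one word per line, most frequent first). The function name and file paths are hypothetical, just for illustration:

```python
from itertools import islice


def truncate_vectors(src_path, dst_path, max_vectors):
    """Copy only the first `max_vectors` rows of a text .vec file (hypothetical helper)."""
    with open(src_path, encoding="utf-8", newline="\n") as src, \
         open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        # Assumed header format: "<number of rows> <number of dimensions>"
        n_rows, n_dims = src.readline().split()
        kept = min(int(n_rows), max_vectors)
        # Write a new header so the row count matches what is actually kept
        dst.write(f"{kept} {n_dims}\n")
        # Stream the rows one by one so the full file never sits in memory
        for line in islice(src, kept):
            dst.write(line)
    return kept
```

Something like `truncate_vectors("vectors.vec", "vectors.small.vec", 1_500_000)` would then produce the smaller file (both paths are placeholders). Rewriting the header matters because loaders generally trust the row count it declares.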

As for the count of entities, I think I see your point, but my intuition (and the fact that I ran batch-train several times and always got similar results) is that something in there isn’t quite right, unless pretrained models count “entities” differently.