How is the support for languages other than English?


First of all, really nice work!

I am curious about support for languages other than English, especially CJK languages.

I couldn’t find anything about that in the online demo.

Thanks in advance!


Prodigy uses spaCy for NLP by default, although you can change this and write recipes that use any other NLP library instead.
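As a rough illustration of what a recipe without spaCy could look like, here is a minimal sketch. It assumes a custom recipe is a Python function that returns a dictionary of components (`dataset`, `stream`, `view_id`) and that manual NER tasks carry pre-tokenized text as `tokens` with character offsets. The names `char_tokenize`, `make_stream` and `custom_ner_recipe` are hypothetical, and the character-level tokenizer is only a stand-in for whatever NLP library you prefer. Registering the function with Prodigy's recipe decorator is left out here.

```python
def char_tokenize(text):
    """Hypothetical stand-in tokenizer: one token per non-space character."""
    tokens = []
    for offset, char in enumerate(text):
        if char.isspace():
            continue
        tokens.append(
            {"text": char, "start": offset, "end": offset + 1, "id": len(tokens)}
        )
    return tokens


def make_stream(texts):
    """Yield annotation tasks with pre-computed tokens (assumed task format)."""
    for text in texts:
        yield {"text": text, "tokens": char_tokenize(text)}


def custom_ner_recipe(dataset, texts):
    """Return the components a recipe function is assumed to hand back."""
    return {
        "dataset": dataset,            # where annotations are saved
        "stream": make_stream(texts),  # generator of tasks
        "view_id": "ner_manual",       # annotation interface to use
    }
```

Swapping `char_tokenize` for a real segmenter only requires that it keeps producing the same token dictionaries with correct `start` and `end` offsets.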

We don’t have pre-trained NER models for CJK languages in spaCy yet, but we have segmentation for Chinese and Japanese based on third-party libraries. For text classification, I would expect everything to work fine.

I would suggest giving the CJK support in spaCy a try. If you find that works OK, you’ll probably find Prodigy works well too.
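A quick way to try this is to build a blank pipeline for the language and look at how it segments a sample sentence. The sketch below assumes spaCy is installed and uses `spacy.blank` with the language codes `zh` and `ja`. Which third-party segmenter is needed depends on your spaCy version, so the hypothetical helper `try_segment` simply reports the error and returns `None` if the pipeline can't be created or run.

```python
def try_segment(lang_code, text):
    """Return the tokens from a blank spaCy pipeline, or None on failure."""
    try:
        import spacy
    except ImportError:
        print("spaCy is not installed")
        return None
    try:
        # May fail if the segmentation library for this language is missing
        nlp = spacy.blank(lang_code)
        doc = nlp(text)
    except Exception as err:
        print(f"Could not segment '{lang_code}' text: {err}")
        return None
    return [token.text for token in doc]


if __name__ == "__main__":
    print(try_segment("zh", "我喜欢自然语言处理"))
    print(try_segment("ja", "自然言語処理が好きです"))
```

If the token boundaries look reasonable for your data, that is a good sign the same text will be usable in Prodigy.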

To add to @honnibal’s comment above, here’s a thread that shows an example of using Prodigy with languages that spaCy doesn’t yet provide pre-trained models for (in this case, to train a Norwegian text classifier):

And this thread discusses using Prodigy to add NER and text classification capabilities to a Chinese spaCy model (which, according to the user, seems to have worked well):

