Can I use Prodigy with CoreML?

Howdy, and please accept my apologies for bumbling about with naive questions. I'm planning my first NLP project and am looking for a tool to make annotating my large collection of text samples easier.

From what I can tell so far, Prodigy seems like a great tool for this but given the cost, I figured I'd verify my understanding of how it fits into the toolchain before committing.

I'm planning to do some NLP work for Apple OS apps, a bit of NER and classification. My current limited understanding is that CoreML provides some existing models which may be of use, but more likely I'll want to supply my own models from other sources, probably not spaCy based on my googling, but maybe TensorFlow. I am NOT asking if I can create those models with Prodigy, but rather whether I can use it in a framework-agnostic way to generate annotated data that I can feed into whatever tool I end up using to train models. From reading the docs, it seems like exporting annotated data is generally the whole point, and the connection to spaCy is a side benefit rather than a locked-in workflow. So, is my understanding correct?

Thanks for not mocking me too hard for asking what I assume in six months will feel like a very obvious question :grimacing:.


Hi @cameronmcefee, welcome to Prodigy!

Prodigy can definitely fit into your workflow in many ways:

  • Prepare your training data for CoreML training. For this use case, you'd annotate your texts using Prodigy, export the annotations to a JSONL file (db-out), and then feed that file to the ML framework of your choice. Under the hood, your Prodigy annotations are saved in a SQLite database by default (which you can configure), giving you more freedom in what you do with them afterwards.
  • Perform QA on the predictions of your CoreML model. Let's say you have a v1 of your CoreML model and have made some predictions that you want to QA: you can convert the model's output into one of the formats Prodigy accepts, then import it into Prodigy and correct the predictions there.
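To make the first option a bit more concrete, here is a minimal sketch of turning exported annotations into generic training pairs. It assumes NER-style records with `text`, `spans` (each with `start`, `end`, `label`) and `answer` fields; the function name `load_accepted_ner` and the sample records are made up for illustration, and your export may carry extra fields depending on the recipe you used.

```python
import json


def load_accepted_ner(lines):
    """Turn JSONL lines into (text, [(start, end, label), ...]) pairs.

    Only records whose "answer" is "accept" are kept.
    """
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("answer") != "accept":
            continue  # drop rejected or ignored examples
        spans = [
            (span["start"], span["end"], span["label"])
            for span in record.get("spans", [])
        ]
        examples.append((record["text"], spans))
    return examples


# Hypothetical sample records standing in for an exported file.
sample = [
    '{"text": "Apple is based in Cupertino", "spans": ['
    '{"start": 0, "end": 5, "label": "ORG"}, '
    '{"start": 18, "end": 27, "label": "GPE"}], "answer": "accept"}',
    '{"text": "Hello", "spans": [], "answer": "reject"}',
]
examples = load_accepted_ner(sample)
```

With a real export you would pass an open file handle instead of `sample`, for example after running something like `prodigy db-out my_dataset > annotations.jsonl` (the dataset name here is a placeholder). The resulting character-offset tuples can then be reshaped into whatever your training tool expects.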

So to answer your question: yes, you can use Prodigy even if you're not going to train a spaCy model. You can even write a custom recipe to integrate your CoreML model, if you so choose.
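For the QA route, a rough sketch of the conversion step could look like the following. It assumes your model's predictions are already available in Python as `(text, [(start, end, label), ...])` tuples, which is a hypothetical shape and not something CoreML produces directly; the helper names are illustrative too. The output is one JSON object per line with `text` and `spans` keys.

```python
import json


def predictions_to_tasks(predictions):
    """Yield one dict per prediction with "text" and "spans" keys."""
    for text, entities in predictions:
        yield {
            "text": text,
            "spans": [
                {"start": start, "end": end, "label": label}
                for start, end, label in entities
            ],
        }


def to_jsonl(tasks):
    """Serialize tasks as newline-delimited JSON, one task per line."""
    return "\n".join(json.dumps(task) for task in tasks)


# Hypothetical model output for a single sentence.
predictions = [("Apple is based in Cupertino", [(0, 5, "ORG")])]
jsonl_text = to_jsonl(predictions_to_tasks(predictions))
```

Writing `jsonl_text` to a `.jsonl` file should give you a source file you can load for review, where the predicted spans can be accepted, rejected, or corrected.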

Awesome, thanks @ljvmiranda921!