We're crowdsourcing ideas for open-source Prodigy plugins! As a reminder, plugins are recipes that are separated out into their own packages because they require a third-party library. We've built plugins like:
What labelling use cases do you have that would benefit from a Prodigy integration with a third-party Python library? What would be your dream Prodigy plugin?
Have been meaning to write a Prodigy integration for ZenML for a while. Would be a nice addition to our supported annotators. But that's the other way round. Not sure if that's what you were asking.
I joined recently and am quite new to Prodigy. Great tool.
I would love to see native support for information retrieval, entity resolution, and similar tasks where we annotate pairs of records rather than classify single records.
We have some similar-ish plugins like Prodigy-ann and Prodigy-lunr that let you query your examples to find the most relevant subset for annotation, but they don't fully satisfy the use case you're describing. I've added an issue on this for the team to discuss.
many more indexing techniques than just HNSW, including good old exact KNN with different metrics (which should be the preference when the number of documents is small, e.g., <10k)
comes with GPU support (CUDA, on Linux only)
probably the most established package in this domain (27.7k GitHub stars as of this writing)
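
To illustrate the point about exact KNN on small collections, here is a minimal sketch in plain Python. It is not taken from any Prodigy plugin or from the package discussed above; the names `exact_knn` and `METRICS` are made up for this example, and it assumes the documents are already embedded as non-zero vectors of equal length.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical registry of metrics for this sketch only.
METRICS = {"cosine": cosine_distance, "euclidean": euclidean_distance}


def exact_knn(query, docs, k=3, metric="cosine"):
    """Brute-force search: score every document, keep the k closest.

    Returns a list of (doc_index, distance) pairs, nearest first.
    The cost is linear in the number of documents, which is usually
    acceptable for a few thousand vectors.
    """
    dist_fn = METRICS[metric]
    scored = [(dist_fn(query, doc), i) for i, doc in enumerate(docs)]
    scored.sort()  # ascending distance; ties broken by index
    return [(i, dist) for dist, i in scored[:k]]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    query = [1.0, 0.1]
    for idx, dist in exact_knn(query, docs, k=2, metric="cosine"):
        print(idx, round(dist, 3))
```

Because every document is scored, the result is exact rather than approximate, and swapping the metric is a one-line change. An approximate index such as HNSW only starts to pay off once the collection is large enough that a full scan per query becomes too slow.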