Hello,
I was wondering, if we buy the Prodigy software, is there a way to add a custom spaCy pipeline that we made, or other spaCy-based packages like medspaCy?
Thanks,
Rouzbeh
Welcome to the forum @rouzbehf!
Yes, Prodigy is fully compatible with custom spaCy pipelines and spaCy-based packages like medspaCy! This is actually one of Prodigy's strengths - it's designed to work seamlessly with the spaCy ecosystem.
Here's an overview of how you can use your custom spaCy pipelines with Prodigy:
- Loading custom pipelines into built-in workflows: Simply pass the path to your custom pipeline when starting a built-in recipe (there's a short sketch on saving a pipeline to disk after this list):
```
prodigy ner.manual your-dataset path-to-custom-pipeline /path/to/texts.jsonl --label CONDITION,MEDICATION
```
- Using medspaCy as a Python package: You can load medspaCy components directly in custom Prodigy recipes (there's a usage example after this list):
```python
# Example pseudocode: make_ner_suggestions and load_my_custom_stream are
# stand-ins for your own pre-annotation helper and data loader
import prodigy
from prodigy.core import Arg
from prodigy.preprocess import make_ner_suggestions
import medspacy


@prodigy.recipe(
    "medspacy-ent",
    dataset=Arg(help="Dataset to save answers to."),
    view_id=Arg("--view-id", "-v", help="Annotation interface"),
)
def medspacy_ent(dataset: str, view_id: str = "ner_manual"):
    # Load medspaCy with custom components
    nlp = medspacy.load()
    # Example labels; in practice you might expose these as a recipe argument
    labels = ["CONDITION", "MEDICATION"]
    # Load your own input data stream from anywhere you want
    stream = load_my_custom_stream()
    # Pre-highlight entity suggestions from medspaCy in each task
    stream.apply(make_ner_suggestions, nlp=nlp, component="ner", labels=labels)
    return {
        "dataset": dataset,
        "view_id": view_id,
        "stream": stream,
    }
```
- Training with custom data: You can train your custom spaCy pipeline directly from Prodigy annotations (either train from scratch or fine-tune an existing model):
```
prodigy train ./models --ner your-dataset --base-model path-to-custom-spacy-pipeline --config path-to-spacy-train-config
```
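A quick note on the first point: the custom pipeline path can be any directory created with spaCy's nlp.to_disk() (or the name of an installed spaCy package). Here's a minimal sketch, assuming a blank English pipeline with an entity_ruler component; substitute whatever components your own pipeline uses:
```python
import spacy

# Build a small custom pipeline (stand-in for your real pipeline)
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "MEDICATION", "pattern": "aspirin"}])

# Save it so Prodigy can load it by path, e.g. in the ner.manual command above
nlp.to_disk("./path-to-custom-pipeline")
```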
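And to run the custom recipe from the second point, save it to a Python file (recipe.py here is just a hypothetical name) and point Prodigy at it with the -F flag:
```
prodigy medspacy-ent your-dataset -F recipe.py
```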
Prodigy also supports exporting annotated datasets to spaCy DocBin format for training directly with spaCy, so your custom pipeline development workflow remains flexible.
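For example, data-to-spacy exports your annotations together with a training config, which you can then train on with spaCy directly (a sketch; adjust the dataset name and output paths to your setup):
```
prodigy data-to-spacy ./corpus --ner your-dataset
python -m spacy train ./corpus/config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy --output ./models
```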
Hope this helps! Let us know if you need more specific examples for your use case.