Hi, I followed the recipe instructions to run ner.manual and then ner.textcat on the headlines dataset, with a project id of news_headlines for both. I get the following output from the model when I run
prodigy textcat.print-dataset news_headlines | less -r:
I'm wondering if this is the expected behavior. My use case is that I am interested in annotating dialog data for which I need to classify intent and extract slots.
I figured that I could run these two on the same dataset and the backend could handle the merging instead of needing to do this manually. Is that the case?
Hi! From the screenshot you posted, it looks like the examples you're looking at don't have a
"label", which is why you're seeing the
Yes, you can always merge annotations on the same text later on. At the moment, it looks like you still have two versions of each example, though: one with the NER annotations and one with the text classification annotations. When you export the data with db-out, you'll see that each example contains an _input_hash, which is a hash representing the original input data (in this case, the raw text). Annotations on the same text all receive the same input hash, so for each example annotated with named entities, you can find the corresponding example annotated in textcat mode and merge the label (or convert the examples however else you need them for your next steps).
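To make the merging step more concrete, here is a minimal sketch of how you could group exported examples by their input hash. It assumes you've exported the dataset to a JSONL file with db-out and that each record has "text", "answer" and "_input_hash" keys, with NER annotations under "spans" and the text classification label under a top-level "label". The function names and the merged output format are made up for illustration, and the sketch only keeps accepted answers, which may not be what you want for rejected textcat labels.

```python
import json


def read_jsonl(path):
    """Load one JSON record per line, e.g. from a db-out export."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def merge_by_input_hash(examples):
    """Combine NER spans and textcat labels annotated on the same text.

    Assumption: records sharing an "_input_hash" refer to the same raw text.
    Only records with answer == "accept" are merged (a simplification).
    """
    merged = {}
    for eg in examples:
        if eg.get("answer") != "accept":
            continue
        entry = merged.setdefault(
            eg["_input_hash"],
            {"text": eg["text"], "spans": [], "labels": []},
        )
        # Collect entity spans (slots), skipping exact duplicates
        for span in eg.get("spans", []):
            if span not in entry["spans"]:
                entry["spans"].append(span)
        # Collect the text classification label (intent), if present
        label = eg.get("label")
        if label and label not in entry["labels"]:
            entry["labels"].append(label)
    return list(merged.values())


if __name__ == "__main__":
    # Hypothetical records in the shape described above
    examples = [
        {"text": "Book a flight to Paris", "_input_hash": 1, "answer": "accept",
         "spans": [{"start": 17, "end": 22, "label": "CITY"}]},
        {"text": "Book a flight to Paris", "_input_hash": 1, "answer": "accept",
         "label": "BOOK_FLIGHT"},
    ]
    for merged_eg in merge_by_input_hash(examples):
        print(merged_eg)
```

For your use case, the "labels" list would hold the intent and the "spans" list the slots for each utterance. If you need to keep rejected labels as negative examples, you'd want to store the answer alongside each label instead of filtering on it.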