I used the silver_to_gold recipe to convert my binary annotations to gold data. One example is given below: a single text (same input hash) that has both accept and reject answers in the binary annotations. However, I cannot find this text in the gold dataset.
{"text":"POOH to 30 '' casing shoe.","_input_hash":-1258106350,"_task_hash":808499138,"tokens":[{"text":"POOH","start":0,"end":4,"id":0},{"text":"to","start":5,"end":7,"id":1},{"text":"30","start":8,"end":10,"id":2},{"text":"''","start":11,"end":13,"id":3},{"text":"casing","start":14,"end":20,"id":4},{"text":"shoe","start":21,"end":25,"id":5},{"text":".","start":25,"end":26,"id":6}],"spans":[{"start":0,"end":4,"text":"POOH","rank":0,"label":"Action","score":0.6760335891,"source":"xx_model","input_hash":-1258106350}],"meta":{"score":0.6760335891},"answer":"accept"}
{"text":"POOH to 30 '' casing shoe.","_input_hash":-1258106350,"_task_hash":484451321,"tokens":[{"text":"POOH","start":0,"end":4,"id":0},{"text":"to","start":5,"end":7,"id":1},{"text":"30","start":8,"end":10,"id":2},{"text":"''","start":11,"end":13,"id":3},{"text":"casing","start":14,"end":20,"id":4},{"text":"shoe","start":21,"end":25,"id":5},{"text":".","start":25,"end":26,"id":6}],"spans":[{"text":"shoe","start":21,"end":25,"priority":0.5,"score":0.5,"pattern":521189801,"label":"Equipment"}],"meta":{"score":0.5,"pattern":3189},"answer":"reject"}
{"text":"POOH to 30 '' casing shoe.","_input_hash":-1258106350,"_task_hash":1080809320,"tokens":[{"text":"POOH","start":0,"end":4,"id":0},{"text":"to","start":5,"end":7,"id":1},{"text":"30","start":8,"end":10,"id":2},{"text":"''","start":11,"end":13,"id":3},{"text":"casing","start":14,"end":20,"id":4},{"text":"shoe","start":21,"end":25,"id":5},{"text":".","start":25,"end":26,"id":6}],"spans":[{"start":14,"end":20,"text":"casing","rank":0,"label":"Fluid Additive","score":0.6164347514,"source":"xx_model","input_hash":-1258106350}],"meta":{"score":0.6164347514},"answer":"reject"}
{"text":"POOH to 30 '' casing shoe.","_input_hash":-1258106350,"_task_hash":-571148332,"tokens":[{"text":"POOH","start":0,"end":4,"id":0},{"text":"to","start":5,"end":7,"id":1},{"text":"30","start":8,"end":10,"id":2},{"text":"''","start":11,"end":13,"id":3},{"text":"casing","start":14,"end":20,"id":4},{"text":"shoe","start":21,"end":25,"id":5},{"text":".","start":25,"end":26,"id":6}],"spans":[{"start":14,"end":20,"text":"casing","rank":0,"label":"Action","score":0.6226858742,"source":"xx_model","input_hash":-1258106350}],"meta":{"score":0.6226858742},"answer":"reject"}
{"text":"POOH to 30 '' casing shoe.","_input_hash":-1258106350,"_task_hash":-885625635,"tokens":[{"text":"POOH","start":0,"end":4,"id":0},{"text":"to","start":5,"end":7,"id":1},{"text":"30","start":8,"end":10,"id":2},{"text":"''","start":11,"end":13,"id":3},{"text":"casing","start":14,"end":20,"id":4},{"text":"shoe","start":21,"end":25,"id":5},{"text":".","start":25,"end":26,"id":6}],"spans":[{"start":14,"end":20,"text":"casing","rank":0,"label":"Organization","score":0.5286895079,"source":"xx_model","input_hash":-1258106350}],"meta":{"score":0.5286895079},"answer":"reject"}
I do not understand why this text was not included in the gold data.
In the same way, 100+ of the 600+ annotated texts are missing from the gold dataset. The interface shows "no tasks available" for this dataset. I don't know if I missed anything here.
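To pin down exactly which texts are missing, something like the following sketch could compare the two datasets by `_input_hash`. It assumes both datasets have been exported to JSONL (for example with `prodigy db-out`); the helper names and the second demo record are hypothetical and only for illustration.

```python
import json
from collections import Counter, defaultdict


def read_jsonl(path):
    """Load one JSON object per non-empty line of a JSONL export."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def answers_by_input(examples):
    """Map each _input_hash to a Counter of its answers (accept/reject/ignore)."""
    counts = defaultdict(Counter)
    for eg in examples:
        counts[eg["_input_hash"]][eg.get("answer", "unknown")] += 1
    return counts


def missing_from_gold(silver, gold):
    """Return the silver _input_hash values that never appear in gold."""
    gold_hashes = {eg["_input_hash"] for eg in gold}
    return sorted(set(answers_by_input(silver)) - gold_hashes)


# Small inline demo; the second text is a made-up placeholder.
silver_demo = [
    {"text": "POOH to 30 '' casing shoe.", "_input_hash": -1258106350, "answer": "accept"},
    {"text": "POOH to 30 '' casing shoe.", "_input_hash": -1258106350, "answer": "reject"},
    {"text": "hypothetical second text", "_input_hash": 1, "answer": "accept"},
]
gold_demo = [
    {"text": "hypothetical second text", "_input_hash": 1, "answer": "accept"},
]

missing = missing_from_gold(silver_demo, gold_demo)
print(missing)
```

With real exports, the demo lists would be replaced by `read_jsonl("silver.jsonl")` and `read_jsonl("gold.jsonl")` (hypothetical file names), and `answers_by_input` shows how many accept and reject answers each missing text had.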
Another question:
In the command below, should I add
--exclude gold_dataset
so that it excludes the annotations that already exist in the gold dataset?
prodigy ner.silver-to-gold silver_dataset gold_dataset model -F ner_silver_to_gold.py