I am trying to reduce an existing labelling scheme to 20 labels, and I suspect that this could give me a warm start but that there will also be a lot of errors. My planned workflow is:
- try my best to distil the existing labelled categories into the 20 I've determined to be independent
- train a model on this dataset to get an idea of initial performance (which labels perform worst, etc.)
- correct incorrect labels using the trained model (or a blank model?) and
I am unsure of the best workflow generally, and also whether I can just use my custom manual recipe to correct/extend the dataset, or whether I should try to customize the make gold recipe for a multilabel task.
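
For the first step, distilling the old categories into the new ones, this is roughly what I have in mind. It is only a minimal sketch in plain Python: it assumes each example is a dict with a `text` field and a list of old `labels`, and the mapping entries and label names are made up for illustration (none of this is a Prodigy API).

```python
from collections import Counter

# Hypothetical mapping from old labels to the reduced scheme (only a few shown).
OLD_TO_NEW = {
    "billing_error": "BILLING",
    "refund_request": "BILLING",
    "login_failure": "ACCOUNT",
}


def remap_example(example, mapping):
    """Return a copy of the example with old labels mapped to the new scheme.

    Old labels that have no entry in the mapping are collected under
    "unmapped" so they can be reviewed by hand instead of silently dropped.
    """
    new_labels = []
    unmapped = []
    for label in example["labels"]:
        if label in mapping:
            target = mapping[label]
            # Several old labels can collapse into one new label; keep it once.
            if target not in new_labels:
                new_labels.append(target)
        else:
            unmapped.append(label)
    return {"text": example["text"], "labels": new_labels, "unmapped": unmapped}


def label_counts(examples):
    """Count how often each new label occurs across the remapped examples."""
    return Counter(label for eg in examples for label in eg["labels"])
```

Examples that end up with a non-empty `unmapped` list, or with no labels at all, would be the first ones I queue for manual correction.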