I am currently training a binary classification model. I used 20% of the data for validation and 80% for training (eval-split 0.2), and the evaluation results already look decent. Now I want to ask whether cross-validation is possible within Prodigy, so that Prodigy doesn't always use the same data when I set the eval-split to 20%?
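To make the question concrete: what I imagined, if Prodigy doesn't support this directly, is exporting the annotations (with `prodigy db-out my_dataset > annotations.jsonl`) and building the folds myself in Python. A rough sketch of that idea — the file names and the 5-fold setup are just examples, and I'm assuming the JSONL format that db-out produces:

```python
import json
from sklearn.model_selection import KFold

# Exported beforehand with: prodigy db-out my_dataset > annotations.jsonl
with open("annotations.jsonl", encoding="utf8") as f:
    examples = [json.loads(line) for line in f]

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for i, (train_idx, eval_idx) in enumerate(kfold.split(examples)):
    for name, indices in (("train", train_idx), ("eval", eval_idx)):
        with open(f"fold{i}_{name}.jsonl", "w", encoding="utf8") as f:
            for j in indices:
                f.write(json.dumps(examples[j]) + "\n")

# Each fold could then be imported again with db-in and used as a fixed
# evaluation set (instead of eval-split), and the accuracies averaged.
```

Is that the intended way to do it, or is there something built in?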
To get an even better model, I then first looked at which eval-split gives me the best model. Do you think this approach makes sense?
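This is roughly how I ran that comparison — just a sketch of my setup, and the recipe name, base model, and flag names below are from my version of Prodigy and may differ in yours:

```python
import subprocess

# Re-run training with different eval splits and compare the accuracies
# Prodigy prints for each run. Dataset and model names are placeholders.
for split in (0.1, 0.2, 0.3):
    subprocess.run([
        "prodigy", "textcat.batch-train", "my_dataset", "en_core_web_sm",
        "--output", f"model_split{split}",
        "--eval-split", str(split),
    ], check=True)
```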
After determining the best split, I want to experiment a bit with the parameters "batch-size" and "n-iter" to further improve the model. Once I have the best model, I want to export it and test it in a Python environment on new data (data Prodigy hasn't seen) to check whether the model is overfitted. Does this make sense in your eyes?
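For the testing step, my plan was simply to load the exported model with spaCy and look at the scores it assigns to unseen texts — something like this (the path is a placeholder for wherever the trained model was written):

```python
import spacy

# Directory that training wrote the exported model to (placeholder path)
nlp = spacy.load("/path/to/exported-model")

texts = [
    "An unseen example the model was never trained or validated on.",
    "Another new example.",
]
for doc in nlp.pipe(texts):
    # doc.cats holds a score between 0 and 1 for each label
    print(doc.text[:40], doc.cats)
```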
When testing the exported model on the new data, I have to set a threshold score so that the program knows at which score an example should be considered relevant. How do I determine such a threshold? And how does Prodigy set such a threshold internally during validation?
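What I tried so far is to score a held-out set and pick the threshold from a precision/recall curve. A sketch of that, assuming the accept/reject answers from db-out as gold labels and "RELEVANT" as a placeholder for my label name:

```python
import json
import numpy as np
import spacy
from sklearn.metrics import precision_recall_curve

nlp = spacy.load("/path/to/exported-model")  # placeholder path
LABEL = "RELEVANT"                           # placeholder label name

# Held-out examples in the db-out format: {"text": ..., "answer": "accept"/"reject"}
with open("heldout.jsonl", encoding="utf8") as f:
    examples = [json.loads(line) for line in f]

y_true = np.array([1 if eg["answer"] == "accept" else 0 for eg in examples])
y_score = np.array([nlp(eg["text"]).cats[LABEL] for eg in examples])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = int(np.argmax(f1[:-1]))  # the last P/R pair has no threshold attached
print(f"threshold={thresholds[best]:.2f}  P={precision[best]:.2f}  R={recall[best]:.2f}")
```

Is that a reasonable way to pick the threshold, or does Prodigy use a fixed cutoff during validation that I should stick to?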
Thanks in advance!!!