what to do if train-curve shows slight decrease in last sample

Hi @dave-espinosa!

Excellent questions!

This comes from spaCy's scorer. You can see the code and formulas here:
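
For a rough idea of what those formulas look like, here is a minimal sketch of span-level precision, recall and F-score in plain Python. This is not spaCy's actual code: the `prf` helper, the `(start, end, label)` span format and the toy data are all assumptions made for illustration, and it assumes a predicted entity only counts as correct if its boundaries and label both match exactly.

```python
from typing import Dict, Set, Tuple

# Hypothetical span representation: (start_token, end_token, label)
Span = Tuple[int, int, str]


def prf(gold: Set[Span], pred: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall and F-score over entity spans."""
    tp = len(gold & pred)   # predicted spans that are also in the gold data
    fp = len(pred - gold)   # predicted spans that are not in the gold data
    fn = len(gold - pred)   # gold spans the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"p": precision, "r": recall, "f": f_score}


# Toy data (made up): one exact match, one wrong label, one missed entity
gold = {(0, 2, "PERSON"), (5, 6, "ORG"), (9, 10, "GPE")}
pred = {(0, 2, "PERSON"), (5, 6, "GPE")}
scores = prf(gold, pred)
print(scores)
```

Because the evaluation set is usually small, a single extra mistake can move these numbers noticeably, which is worth keeping in mind when reading small changes in a score.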

FYI, if you're interested, there are ways to add custom evaluation metrics (this example uses textcat, but it should be similar for NER):

Excellent question! I searched the Prodigy and spaCy documentation and found that there isn't any documentation on creating annotation schemes. I suspect @SofieVL meant annotation schemes in general, in the sense of carefully defining what each entity means. This made me realize there's a lot of potential for guidelines here to help users.

The closest documentation I know of is Matt's 2018 PyData talk. Around 8 minutes in, he goes through the "Applied NLP Pyramid of Greatness", where he discusses the role of defining an annotation scheme. He then walks through an example of an inadequate annotation scheme for NER. Hopefully this will give you a bit of an idea of why annotation schemes can be important.

Let me know if you have any thoughts or questions! We greatly appreciate your feedback/questions so please keep them coming :slight_smile:
