Problem with creating a new entity in Swedish

I'm trying to create a new entity for broadband-related products and started out by creating a pattern file.
The problem occurs at the end, when I run the new model on new or existing phrases: it identifies every word in the sentence as a product entity. I start from a completely new, empty model.

Step 1: Create the pattern file and save it as service_pattern.jsonl

{"label":"product","pattern":[{"lower":"telefoni"}]}
{"label":"product","pattern":[{"lower":"telefoni"}]}
{"label":"product","pattern":[{"lower":"bredband"}]}
{"label":"product","pattern":[{"lower":"bredbandsabonnemang"}]}
{"label":"product","pattern":[{"lower":"bredbandsanslutning"}]}
{"label":"product","pattern":[{"lower":"internet"}]}

Step 2: I import a file of suitable phrases and start the annotation tool. In this case I use a file with 50 sentences, but I have also used a file with thousands of rows, with the same result.

  prodigy ner.teach ner_swedish_products sv-model service-phrases.txt --patterns service_pattern.jsonl

Step 3: Batch-train and output to a new model

  prodigy ner.batch-train ner_swedish_products sv-model --output product_bootstrap_model

Step 4: The training seems to perform well, with an accuracy of 1.0

Accuracy 1.000

Problem: The problem comes when I try to use the new model as input when I run ner.teach on new phrases.
Prodigy identifies every word inside the phrase as a product.

Even when I'm using the same service-phrases.txt as above, it mismatches.
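
One way to confirm this outside the Prodigy web app is to load the output model directly and print the entities it predicts (a small sketch; it assumes the directory written by --output loads as a regular spaCy model):

    import spacy

    # Assumption: product_bootstrap_model is the directory written by --output
    nlp = spacy.load("product_bootstrap_model")
    doc = nlp("Jag vill ha bredband eller telefoni.")  # "I want broadband or telephony."
    for ent in doc.ents:
        print(ent.text, ent.label_)
    # If every token prints as a product entity here, the trained weights
    # themselves are overgeneralizing, not just the ner.teach suggestions.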

What am I doing wrong?

Yes, an accuracy of 1.0 is always suspicious! Your workflow sounds alright. Could you post the full results after training, including all the statistics?

I can't reproduce the scenario with the same 1.0 accuracy, but I started over with a new dataset and a fresh model. The result is the same, only with lower accuracy. I also added a new batch of training after the first round, but it makes no difference.

  1. Created a new dataset named ner_product and started annotating with the same patterns file.

    prodigy dataset ner_product
    Successfully added 'ner_product' to database SQLite
    
    prodigy ner.teach ner_product sv-empty-product-model service-phrases.txt --patterns service_pattern.jsonl
    
    Saved 41 annotations to database SQLite
    Dataset: ner_product
    Session ID: 2018-11-26_10-36-03
    
  2. Result of the first training round

    prodigy ner.batch-train ner_product sv-empty-product-model --output product_model
    Loaded model sv-empty-product-model
    Using 50% of accept/reject examples (12) for evaluation
    Using 100% of remaining examples (12) for training
    Dropout: 0.2  Batch size: 4  Iterations: 10  
    
    BEFORE     0.222     
    Correct    2
    Incorrect  7
    Entities   47        
    Unknown    44   
    
    Correct    7
    Incorrect  4
    Baseline   0.222     
    Accuracy   0.636                                                                                      
    
  3. Annotating more phrases with the new model

    prodigy ner.teach ner_product product_model service-phrases3.txt
    Saved 103 annotations to database SQLite
    Dataset: ner_product
    Session ID: 2018-11-26_10-40-04
    
  4. Run batch training again from the empty model

    prodigy ner.batch-train ner_product sv-empty-product-model --output product_model_ver2
    
    Loaded model sv-empty-product-model
    Using 50% of accept/reject examples (23) for evaluation
    Using 100% of remaining examples (24) for training
    Dropout: 0.2  Batch size: 4  Iterations: 10  
    
    BEFORE     0.000     
    Correct    0
    Incorrect  20
    Entities   74        
    Unknown    64    
    
    Correct    6
    Incorrect  7
    Baseline   0.000     
    Accuracy   0.462  
    

The result is almost the same whether I use a few examples or thousands of them.

When I evaluate the model, it still thinks that punctuation and words like “jag” (I), “och” (and), “eller” (or) and so on are products.

Do you have any full results and/or examples you can share from when you used the larger dataset with thousands of examples? The problem here is that it’s super difficult to draw any conclusions from the results, because you’re only training from 24 examples. Even if the result is similar to what you see when you train with thousands of examples, it might be for completely different reasons.