POS-tags messed up after ner.batch-train

spaCy: 2.0.11
Prodigy: 1.4.2

Hi,

I started a new Dutch model:

  • init-model with freqs, pruned vectors, and clusters
  • trained the model for the tagger and parser (not yet for NER!)

If I use this model, it gives good results:
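
(For context, the doc in the snippets below comes from loading the model on a test sentence, roughly like this; the path is a placeholder:)

import spacy

nlp = spacy.load('/home/prodigy/nl_model')  # placeholder path to the base model
doc = nlp('In de aanpak van de wachttijden in de ggz')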

>>> nlp.pipe_names
['tagger', 'parser']

>>> for x in doc:
...     print('%8s %12s %30s %30s' % (x.pos_, x.dep_, x.tag_, x.text))
... 
     ADP         case                        VZ|init                             In
     DET          det              LID|bep|stan|rest                             de
    NOUN          obl     N|soort|ev|basis|zijd|stan                         aanpak
     ADP         case                        VZ|init                            van
     DET          det              LID|bep|stan|rest                             de
    NOUN         nmod               N|soort|mv|basis                    wachttijden
     ADP         case                        VZ|init                             in
     DET          det              LID|bep|stan|rest                             de
    NOUN         nmod     N|soort|ev|basis|zijd|stan                            ggz

So far, so good. After this, I wanted to train the NER pipe:

prodigy ner.batch-train NER_TOTAL_001 nl_md --output /home/prodigy/trained_20180416/  --n-iter 4 --eval-split 0.2 --label "PER,ORG,NORP,ORG_C,PER_C,GPE,LOC"


#          LOSS       RIGHT      WRONG      ENTS       SKIP       ACCURACY  
01         30.583     1923       769        2848       0          0.714                                                         
02         21.804     2205       487        3069       0          0.819                                                         
03         19.325     2246       446        3047       0          0.834                                                         
04         19.151     2267       425        3076       0          0.842 

If I now print the POS tags, every token comes out as ADJ
(the NER itself is working fine now):

>>> nlp.pipe_names
['sbd', 'tagger', 'parser', 'ner']


>>> for x in doc:
...     print('%8s %12s %30s %30s' % (x.pos_, x.dep_, x.tag_, x.text))

 ADJ         case    ADJ|prenom|basis|met-e|stan                             In
 ADJ          det    ADJ|prenom|basis|met-e|stan                             de
 ADJ          obl    ADJ|prenom|basis|met-e|stan                         aanpak
 ADJ         case    ADJ|prenom|basis|met-e|stan                            van
 ADJ          det    ADJ|prenom|basis|met-e|stan                             de
 ADJ         nmod    ADJ|prenom|basis|met-e|stan                    wachttijden
 ADJ         case    ADJ|prenom|basis|met-e|stan                             in
 ADJ          det    ADJ|prenom|basis|met-e|stan                             de
 ADJ         nmod    ADJ|prenom|basis|met-e|stan                            ggz
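
A quick sanity check, to rule out the sbd pipe that ner.batch-train added: remove it and run the tagger again. A minimal sketch (nlp.remove_pipe is spaCy 2.0's API for dropping a component; the path is a placeholder):

import spacy

nlp = spacy.load('/home/prodigy/trained_20180416')
if 'sbd' in nlp.pipe_names:
    nlp.remove_pipe('sbd')  # drop the sentence boundary detector
doc = nlp('In de aanpak van de wachttijden in de ggz')
print([(x.text, x.pos_) for x in doc])  # are the tags still all ADJ?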

I found a difference in the cfg files in vocab/parser and vocab/tagger. I don't know if this is significant? This text was added after ner.batch-train:

  "deprecation_fixes":{
    "vectors_name":"nl_model.vectors"
  },
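
(The cfg files are plain JSON, so the comparison is easy to reproduce; a sketch that scans a model directory for them, with a placeholder path:)

import json
from pathlib import Path

model_dir = Path('/home/prodigy/trained_20180416')  # placeholder path
for cfg_path in sorted(model_dir.glob('**/cfg')):
    try:
        cfg = json.loads(cfg_path.read_text())
    except ValueError:
        continue  # skip non-JSON files
    if 'deprecation_fixes' in cfg:
        print(cfg_path, cfg['deprecation_fixes'])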

My questions:

  • What can I do to keep the tagger returning the right POS tags (the pos_ and tag_ fields)?
  • During ner.batch-train the sbd pipe was added. Can I add it to the model at an earlier stage (see the sketch below)? Does this influence the tagger/parser?
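
What I have in mind for adding the sbd pipe up front is roughly this (a sketch, assuming spaCy 2.0's 'sbd'/'sentencizer' factory; the paths are placeholders):

import spacy

nlp = spacy.load('/home/prodigy/nl_model')  # placeholder path
sbd = nlp.create_pipe('sbd')  # sentence boundary detector
nlp.add_pipe(sbd, first=True)  # put it before the tagger and parser
nlp.to_disk('/home/prodigy/nl_model_sbd')  # placeholder output path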

Thanks,

Rob

After trying a few things, I came to the following outcome:

  • keeping n-iter lower than 4 and increasing eval-split from 0.2 to 0.4 gives good results for the pos_ and tag_ fields, and also for NER (see the command below)
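
For example, the same command as above with the adjusted settings (n-iter 3 as an example of "lower than 4"):

prodigy ner.batch-train NER_TOTAL_001 nl_md --output /home/prodigy/trained_20180416/ --n-iter 3 --eval-split 0.4 --label "PER,ORG,NORP,ORG_C,PER_C,GPE,LOC"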

What I don't understand yet:

  • How does training the NER influence the POS tags? (see the check below)
  • What triggers ner.batch-train to add the sbd pipe?
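
One way to check whether the tagger weights themselves were overwritten is to compare the serialized tagger before and after training. A minimal sketch, assuming the spaCy 2 layout where each pipe directory holds a binary model file (paths are placeholders):

from pathlib import Path

before = Path('/home/prodigy/nl_model/tagger/model').read_bytes()  # base model
after = Path('/home/prodigy/trained_20180416/tagger/model').read_bytes()  # Prodigy output
print('tagger weights identical:', before == after)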