Retraining a dependency parser or just changing the definition of noun chunks?

holl · April 25, 2019, 3:18pm

Hi,

I’ve been using the dependency parser quite extensively on English text. I wanted to extend my code to handle French, but I noticed that noun chunks behave quite differently:
“He gave an apple to Paul” gives 2 noun chunks: “an apple” and “Paul”.
“Il donna une pomme à Paul” gives a single noun chunk: “une pomme à Paul”. We should be getting 2 (“une pomme” and “Paul”).

I suspect this comes from the dependency tree: in English, “to” sits between “apple” and “Paul” in the tree, while in French “à” is a child of “Paul”.

Does this mean that the English ClearNLP dependency scheme and UD are defined so differently that the definition of a noun chunk should be changed? Or is it just a training problem (i.e. this dependency is not parsed properly, and more training would solve it)?

Any insight appreciated.

honnibal · April 29, 2019, 11:11pm

Hey,

I don’t know French grammar very well, so I’m not sure what the target UD parse would be. I would expect this to be an issue with the noun chunk rules though, rather than with the parse quality. That should mean it’s pretty easy to fix. You can set a different function for the noun chunk iteration, so you can customise the rules.

Here are the current rules for French: https://github.com/explosion/spaCy/blob/master/spacy/lang/fr/syntax_iterators.py . You can see the FrenchDefaults language class here: https://github.com/explosion/spaCy/blob/master/spacy/lang/fr/__init__.py

The simplest way to customise the noun chunks logic would be to write to the French.Defaults.syntax_iterators class variable before loading your model.
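
As a rough sketch of what such a replacement function could look like: the rule below is an illustrative assumption, not spaCy's actual French rule set. It keeps each noun or proper noun together with the determiners and adjectives directly to its left, and leaves out prepositions (UD `case`) and right-hand dependents, so “une pomme à Paul” would split into “une pomme” and “Paul”. The names `french_noun_chunks`, `LEFT_MODIFIERS` and `HEAD_POS` are made up for this example.

```python
# Illustrative assumption: which labels may extend a chunk to the left,
# and which part-of-speech tags may head a chunk.
LEFT_MODIFIERS = {"det", "amod", "nummod"}
HEAD_POS = {"NOUN", "PROPN"}


def french_noun_chunks(doclike):
    """Yield (start, end, label) for each nominal head plus its left modifiers.

    `doclike` is expected to be a parsed Doc or Span; only the token
    attributes `i`, `pos_`, `dep_` and `lefts` are used.
    """
    for token in doclike:
        if token.pos_ not in HEAD_POS:
            continue
        start = token.i
        # Walk the left children from nearest to farthest and extend the
        # chunk only while they are adjacent, allowed modifiers. A `case`
        # child such as "à" stops the walk, so it stays outside the chunk.
        for child in reversed(list(token.lefts)):
            if child.dep_ in LEFT_MODIFIERS and child.i == start - 1:
                start = child.i
            else:
                break
        # spaCy's built-in iterators yield a label hash from
        # doc.vocab.strings; a plain string is used here on the assumption
        # that your spaCy version accepts one when building the Span.
        yield start, token.i + 1, "NP"


# Hooking it in (needs spaCy and a French model, so left commented out):
# import spacy
# from spacy.lang.fr import French
# French.Defaults.syntax_iterators = {"noun_chunks": french_noun_chunks}
# nlp = spacy.load("fr_core_news_sm")  # model name is an assumption
```

This deliberately ignores things like modifiers of adjectives or postposed adjectives, so treat it as a starting point and compare its output against the existing rules in syntax_iterators.py before relying on it.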
