Hopefully a quick query to resolve: is there a way to identify whether a token is the head of a conjunction?
I'm seeking to refactor the `noun_chunks` iterator for customisation. Where the existing `noun_chunks` iterator uses `token.left_edge.i`, I'm looking to expand under certain conditions using
I'd like the following sentence:
"Both Americans and Muslim friends and citizens, tax-paying citizens, and Muslims in nations were just appalled and could not believe what -- what we saw on our TV screens."
...to return the following
`poss` modifier Muslim to be added as a hidden element
Muslims in nations,
This sentence contains a sub-conjunction within a main conjunction, which makes the task more complicated:
"Americans" : Conjuncts(friends, citizens, Muslims, citizens)
"Americans" : Children(Both, and, friends, ,, citizens, ,, and, Muslims)
"friends" : Conjuncts(Americans, citizens, Muslims, citizens)
"friends" : Children(Muslim, and, citizens) # the `children` attribute correctly identifies the sub-conjunction
Presently, I'm using the following code that produces the desired answer:
    # If the word has conjuncts but does not have a `conj` dependency,
    # it is the head of the main conjunction.
    if word.conjuncts and word.dep != conj:
        # prev_end is the current word index
        prev_end = word.i
        yield word.left_edge.i, word.i + 1, cc_label
    # If the word has a `conj` dependency and one of its right children also
    # has a `conj` dependency, it is the head of a sub-conjunction within
    # the main conjunction.
    elif word.dep == conj and list(word.rights) and conj in [t.dep for t in word.rights]:
        # prev_end is the current word index
        prev_end = word.i
        yield word.left_edge.i, word.i + 1, cc_label
    # For when the word is not the head of a conjunction
    elif word.dep in np_deps:  # `conj` added to np_deps for other tokens of a conjunction
        # prev_end marks the right edge of the token subtree
        prev_end = word.right_edge.i
        yield word.left_edge.i, word.right_edge.i + 1, cc_label
The `elif` statement that identifies the sub-conjunction head feels somewhat hacky. Is there a more affirmative way to identify whether a token is a conjunction head, or could such an attribute be requested?
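For illustration, the kind of check I have in mind might look like the sketch below. It assumes that a conjunction head is any token with at least one direct child labelled `conj`, which matches the `Children(...)` output above. `is_conj_head` and `StubToken` are hypothetical names, not part of spaCy. The stub only mimics the `text`, `dep_` and `children` attributes so the example runs without a model, and every dependency label other than `conj` is an assumption rather than real parser output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StubToken:
    """Hypothetical stand-in exposing only `text`, `dep_` and `children`."""
    text: str
    dep_: str
    children: List["StubToken"] = field(default_factory=list)


def is_conj_head(token) -> bool:
    """Return True if any direct child of `token` has the `conj` label."""
    return any(child.dep_ == "conj" for child in token.children)


# Sub-conjunction: "Muslim friends and citizens"
citizens_inner = StubToken("citizens", "conj")
friends = StubToken(
    "friends",
    "conj",
    [StubToken("Muslim", "amod"), StubToken("and", "cc"), citizens_inner],
)

# Main conjunction headed by "Americans"; children mirror the output above.
muslims = StubToken("Muslims", "conj")
americans = StubToken(
    "Americans",
    "nsubj",  # assumed label for the head of the main conjunction
    [
        StubToken("Both", "preconj"),
        StubToken("and", "cc"),
        friends,
        StubToken(",", "punct"),
        StubToken("citizens", "conj"),
        StubToken(",", "punct"),
        StubToken("and", "cc"),
        muslims,
    ],
)

tokens = [americans, friends, citizens_inner, muslims]
conj_heads = [t.text for t in tokens if is_conj_head(t)]
print(conj_heads)  # ['Americans', 'friends']
```

Because the function only reads `children` and `dep_`, it should accept real spaCy `Token` objects unchanged. It flags both the main head and the sub-conjunction head with a single test, without needing to distinguish the two cases by the token's own dependency label.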