I am trying to use Prodigy + spaCy for information retrieval on Spanish texts. The texts are internal notes written by Customer Service Agents, who follow some kind of annotation guidelines. Some of the agents, when summarizing the final offer to the customer, use a + sign instead of a space. Something like:
FINAL OFFER: PRODUCTA+PRODUCTB+20% DISCOUNT+12 MONTHS ADDITIONAL SERVICE FINAL PRICE:23,20€
The challenge I am facing, as a newbie, is that PRODUCTA+PRODUCTB is treated as one single token, and I would like to be able to select PRODUCTA and PRODUCTB separately.
I have been checking the documentation and, if my understanding is right, I should somehow change how spaCy tokenizes by adding '+' as a character to split on. I want to be sure this is the right approach, and that it will behave consistently.
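To make the question concrete, here is a minimal sketch of what I think the change would look like. It assumes a blank Spanish pipeline (`spacy.blank("es")`) rather than my real model, and it adds `+` to the tokenizer's infix patterns so that it is split off wherever it appears inside a token. I have not confirmed this is the recommended way, so please treat it as an illustration only:

```python
import spacy
from spacy.util import compile_infix_regex

# Assumption: a blank Spanish pipeline stands in for the real model.
nlp = spacy.blank("es")

# Extend the default infix patterns with a literal "+" so the tokenizer
# splits on it even when it sits between letters.
infixes = list(nlp.Defaults.infixes) + [r"\+"]
nlp.tokenizer.infix_finditer = compile_infix_regex(infixes).finditer

doc = nlp("FINAL OFFER: PRODUCTA+PRODUCTB+20% DISCOUNT+12 MONTHS ADDITIONAL SERVICE")
tokens = [token.text for token in doc]
print(tokens)
```

With this, PRODUCTA and PRODUCTB should come out as separate tokens, with each + as its own token in between. My assumption is that I would then save the pipeline with `nlp.to_disk(...)` and point Prodigy at that saved pipeline, so that annotation and training use the same tokenization.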
Thanks in advance