I must hold the "memory" of the previous sentences.
As per your suggestion, I split my long documents into paragraphs by simply splitting on the \n character.
At the moment I need to tag two custom entities that have the same structure as a date (dd-mm-yyyy) but, in a specific context, I use the labels EXP_START and EXP_END instead of DATE.
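To make the labelling concrete, here is a minimal sketch of what I mean. It assumes character-offset annotations in the `(text, {"entities": [(start, end, label)]})` style; the helper `annotate` and both sentences are made up for illustration only.

```python
import re

# Matches the dd-mm-yyyy shape shared by DATE, EXP_START and EXP_END.
DATE_RE = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")


def annotate(text, labels):
    """Pair each dd-mm-yyyy match in `text` with the label at the same position.

    Hypothetical helper: the labels are supplied by hand, in order of appearance.
    """
    matches = list(DATE_RE.finditer(text))
    if len(matches) != len(labels):
        raise ValueError("expected one label per date in the text")
    entities = [(m.start(), m.end(), label) for m, label in zip(matches, labels)]
    return text, {"entities": entities}


# Same surface form, different label depending on the surrounding words.
plain = annotate("something happened on 05-03-2011", ["DATE"])
in_context = annotate(
    "word1 word2 word3 from 01-01-2010 to 01-01-2012",
    ["EXP_START", "EXP_END"],
)
```

The only thing that distinguishes the two cases is the text around the dates, which is why the context matters so much here.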
When I train the model on the long documents, it works well; I have also changed conv_depth to 8.
However, to speed up the training process, I will split the documents as described above.
The problem is that EXP_START and EXP_END can be split off from their context by my Sentencizer, for example:
....word1 word2 word3 \n
from 01-01-2010 to 01-01-2012 \n
word1 word2 word3...
As you can see, if I train the model on the isolated sentence "from 01-01-2010 to 01-01-2012", it can never tell whether each date is a "simple" DATE entity or an EXP_START / EXP_END, because it has no context.
So my question is: how can I keep the memory of the previous sentence so that the model understands the context better (especially for very short sentences)?
In this (stupid) example, I need the preceding "word1 word2 word3" to decide whether 01-01-2010 is an EXP_START, and so on.
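One thing I could try, as a rough sketch rather than a confirmed solution, is to keep the previous line attached to each chunk when splitting, and shift the entity offsets accordingly. The function `split_with_context` and its `n_prev` parameter are hypothetical names of mine, not part of any library, and I am assuming the entities are given as `(start, end, label)` character offsets into the full document.

```python
def split_with_context(text, entities, n_prev=1):
    """Split `text` on "\n", keeping the previous `n_prev` lines as context.

    `entities` is a list of (start, end, label) character offsets into `text`.
    Returns a list of (chunk_text, chunk_entities) pairs, where the offsets in
    chunk_entities are relative to chunk_text.
    """
    # Character span (start, end) of every line in the original text.
    spans = []
    pos = 0
    for line in text.split("\n"):
        spans.append((pos, pos + len(line)))
        pos += len(line) + 1  # +1 skips the "\n" separator

    examples = []
    for i, (line_start, line_end) in enumerate(spans):
        if line_start == line_end:
            continue  # skip empty lines
        # The chunk starts at the beginning of the earliest context line.
        ctx_start = spans[max(0, i - n_prev)][0]
        chunk = text[ctx_start:line_end]
        # Keep every entity that lies fully inside the chunk (context lines
        # included), so nothing in the chunk is left unlabelled.
        chunk_entities = [
            (start - ctx_start, end - ctx_start, label)
            for start, end, label in entities
            if start >= ctx_start and end <= line_end
        ]
        examples.append((chunk, chunk_entities))
    return examples


# Usage with the example above (offsets computed, not typed by hand).
doc = "word1 word2 word3 \nfrom 01-01-2010 to 01-01-2012 \nword1 word2 word3"
start_1 = doc.index("01-01-2010")
start_2 = doc.index("01-01-2012")
doc_entities = [
    (start_1, start_1 + 10, "EXP_START"),
    (start_2, start_2 + 10, "EXP_END"),
]
chunks = split_with_context(doc, doc_entities, n_prev=1)
```

With this, the chunk containing "from 01-01-2010 to 01-01-2012" also contains the preceding "word1 word2 word3" line, so the chunks stay short but are not completely without context. The trade-off is that each line appears in more than one training example, and the chunks still contain a "\n" character.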