If you skip these words during processing, you won't be able to extract any information from them. If you believe the information you're after also appears correctly tokenized elsewhere in the text, and that you would still get enough training examples despite ignoring the mistokenized words, then it should be OK to ignore them.
This thread is an in-depth discussion of such "agglutinations" - it might be of interest to you as well.
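
One rough way to check whether ignoring the mistokenized words leaves you with enough examples is to count how often the words you care about still show up after skipping everything outside a known vocabulary. The sketch below is only an illustration: the function name `count_usable_examples`, the toy sentences, and the vocabulary are all made up for this example, and treating "not in the vocabulary" as "mistokenized" is an assumption that will also skip genuinely rare words.

```python
from collections import Counter

def count_usable_examples(sentences, vocabulary, targets):
    """Count target-word occurrences that remain after skipping
    out-of-vocabulary (possibly mistokenized) tokens.

    Hypothetical helper: assumes each sentence is already a list of tokens.
    """
    counts = Counter()
    skipped = 0
    for tokens in sentences:
        for token in tokens:
            word = token.lower()
            if word not in vocabulary:
                # Assumption: unknown tokens are treated as mistokenized and ignored.
                skipped += 1
                continue
            if word in targets:
                counts[word] += 1
    return counts, skipped

# Toy data: "pricewas" and "againtoday" stand in for agglutinated tokens.
sentences = [
    ["the", "pricewas", "low"],
    ["the", "price", "rose"],
    ["price", "dropped", "againtoday"],
]
vocabulary = {"the", "price", "was", "low", "rose", "dropped", "again", "today"}
targets = {"price"}

counts, skipped = count_usable_examples(sentences, vocabulary, targets)
print(f"usable 'price' examples: {counts['price']}, skipped tokens: {skipped}")
```

If the remaining counts for your target words are comfortably high, skipping is probably fine; if they drop sharply, it may be worth trying to split the agglutinated tokens instead.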