Hi! This sounds similar to the antipatterns request here:
We don't currently have that implemented out of the box, but you could add a filter to the stream at the very end of the recipe that explicitly doesn't send out an example if the span text is part of an exclude list. For example:
def filter_stream(stream):
    exclude_list = ("with", ".", ",")  # etc.
    for eg in stream:
        span = eg["spans"][0]
        if eg["text"][span["start"]:span["end"]] not in exclude_list:
            yield eg

# End of the recipe
stream = filter_stream(stream)
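If you want to try the filter outside of a recipe first, here's a rough, slightly more defensive sketch you can run on toy data. The `exclude` argument, the lowercase matching and the decision to pass through examples without any spans are my own assumptions for illustration, not built-in behaviour, and the example dicts are made up:

```python
def filter_stream(stream, exclude=("with", ".", ",")):
    """Skip examples whose first suggested span is an excluded string."""
    # Assumption: match case-insensitively, so "With" is excluded too
    exclude_set = {text.lower() for text in exclude}
    for eg in stream:
        spans = eg.get("spans") or []
        if spans:
            span = spans[0]
            span_text = eg["text"][span["start"]:span["end"]]
            if span_text.lower() in exclude_set:
                continue  # don't send this suggestion out
        # Assumption: examples without spans are passed through unchanged
        yield eg


# Made-up examples in the usual "text" + "spans" task format
examples = [
    {"text": "Pay with PayPal", "spans": [{"start": 4, "end": 8, "label": "ORG"}]},
    {"text": "Pay with PayPal", "spans": [{"start": 9, "end": 15, "label": "ORG"}]},
    {"text": "No suggestions here", "spans": []},
]

kept = list(filter_stream(examples))
for eg in kept:
    print(eg["text"], eg["spans"])
```

Only the first example (where the suggested span is "with") should be dropped here.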
However, this also means that you won't get to annotate those examples at all. From what you describe, it sounds like your model is a bit "lost" and possibly doesn't get to see enough positive examples, so it starts suggesting a lot of very random tokens over and over again. Are you able to add more patterns to help bootstrap the suggestions?

Alternatively, it's also possible that your use case just needs the model to be pre-trained more before you can start annotating with the model in the loop. So you might want to experiment with doing some manual annotation first, so the model knows at least something about the entity type.
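In case it helps with adding more patterns, here's a small sketch that turns a list of seed terms into token-based match patterns, one JSON object per line with a "label" and a "pattern" key. The label `PRODUCT` and the terms are placeholders I made up, and splitting on whitespace is only an approximation of how the tokenizer will split your text, so double-check multi-word or punctuated terms:

```python
import json

LABEL = "PRODUCT"                 # hypothetical label, use your own entity type
TERMS = ["PayPal", "Apple Pay"]   # hypothetical seed terms


def make_patterns(label, terms):
    """Build one case-insensitive token pattern per seed term."""
    patterns = []
    for term in terms:
        # One {"lower": ...} dict per whitespace-separated word
        token_pattern = [{"lower": word.lower()} for word in term.split()]
        patterns.append({"label": label, "pattern": token_pattern})
    return patterns


patterns = make_patterns(LABEL, TERMS)
# One JSON object per line, ready to be saved as a .jsonl file
jsonl = "\n".join(json.dumps(p) for p in patterns)
print(jsonl)
```

You'd then save the output to a file and pass it to the recipe as the patterns file.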