ner.print-stream for patterns?

wpm · December 30, 2017, 4:23pm

Is there a utility to show all the spans highlighted by a set of patterns in a corpus? Like ner.print-stream except for a set of patterns instead of a model?

(I think the answer is “no” and it’s fairly easy to write this, but I wanted to double-check.)

ines · December 30, 2017, 4:43pm

Not currently, no – but this is a good idea, and adding a --patterns argument to ner.print-stream should definitely be no problem. Will put this on my list for the next release!

In the meantime, I think the easiest way to write this yourself would be to take inspiration from ner.match (i.e. use the PatternMatcher) and add that logic to ner.print-stream.

wpm · December 30, 2017, 7:08pm

Here’s my standalone recipe to do this.

import spacy
from prodigy.components import printers
from prodigy.components.loaders import get_stream
from prodigy.core import recipe, recipe_args
from prodigy.models.matcher import PatternMatcher
from prodigy.util import log


@recipe('ner.print-pattern-stream',
        spacy_model=recipe_args['spacy_model'],
        patterns=('Path to match patterns file', 'positional'),
        source=recipe_args['source'],
        api=recipe_args['api'],
        loader=recipe_args['loader'])
def print_pattern_stream(spacy_model, patterns, source=None, api=None, loader=None):
    """
    Pretty print spans matched in a stream.
    """
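    log("RECIPE: Starting recipe ner.print-pattern-stream", locals())
    # Build a pattern matcher on top of the spaCy model and load the patterns file
    model = PatternMatcher(spacy.load(spacy_model)).from_disk(patterns)
    # Load the incoming examples, keyed on their "text" field
    stream = get_stream(source, api, loader, rehash=True, input_key='text')
    # The matcher yields (score, example) tuples; only the example is printed
    printers.pretty_print_ner(eg for _, eg in model(stream))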

Pretty easy. Nice.

wpm · December 30, 2017, 7:33pm

No wait, that only highlights the first match of every document. Have to look at this some more…
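
For illustration, here is a minimal standard-library sketch of the behaviour being asked for: every match of every pattern in a document is collected into a single list of spans. It does not use Prodigy's PatternMatcher and is not a fix for the recipe above. The helpers `find_all_spans` and `add_spans` are hypothetical names, the patterns are simplified to literal (label, phrase) pairs rather than Prodigy's patterns file format, and the only Prodigy convention assumed is that a task is a dict with a "text" key and a "spans" list of start/end/label entries.

```python
import re


def find_all_spans(text, patterns):
    """Return every match of every pattern as a span dict.

    `patterns` is a list of (label, phrase) pairs. This is a simplified
    stand-in for a patterns file, not Prodigy's own format.
    """
    spans = []
    for label, phrase in patterns:
        # Word boundaries keep "Apple" from matching inside "Pineapple".
        regex = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for m in regex.finditer(text):
            spans.append({"start": m.start(), "end": m.end(), "label": label})
    # Sort by position so the spans read left to right.
    return sorted(spans, key=lambda s: (s["start"], s["end"]))


def add_spans(stream, patterns):
    """Yield a copy of each example with all of its matches attached."""
    for eg in stream:
        task = dict(eg)
        task["spans"] = find_all_spans(eg["text"], patterns)
        yield task


examples = [{"text": "Apple hired a former Google engineer; Apple declined to comment."}]
patterns = [("ORG", "Apple"), ("ORG", "Google")]

for task in add_spans(examples, patterns):
    for span in task["spans"]:
        print(span["label"], task["text"][span["start"]:span["end"]])
```

The point of the sketch is that all matches for a document end up on one task, so a printer that reads the "spans" list would highlight each of them rather than only the first.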
