terms.to-patterns looks strange

Sorry if this was confusing – the terms.to-patterns recipe is designed to convert a dataset of single terms into a patterns file – for example, a dataset created with terms.teach, which contains examples like "text": "Apple". That patterns file can then be used to bootstrap training with ner.teach and make sure the model sees enough positive suggestions.
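For example, assuming a dataset of company names and the label ORG (both placeholders), the conversion could look roughly like this – the exact pattern format may differ slightly between versions:

prodigy terms.to-patterns my_terms_dataset /path/to/patterns.jsonl --label ORG

# a row in the terms dataset created with terms.teach
{"text": "Apple", "answer": "accept"}

# roughly the corresponding line in patterns.jsonl
{"label": "ORG", "pattern": [{"lower": "apple"}]}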

Creating patterns from existing annotations is a good idea, though – you could even use ner.manual to label a few texts by hand and then convert the highlighted spans to patterns. There's no built-in recipe for this, but writing your own converter is pretty straightforward. Essentially, all you have to do is load the dataset, get the accepted annotations and use the "spans" property (the highlighted text) to extract the entity text and add it to the list of patterns:

from prodigy.components.db import connect
from prodigy.util import write_jsonl

db = connect()  # connect to the DB with settings from prodigy.json
examples = db.get_dataset('my_set')  # load the dataset

patterns = []
for eg in examples:  # iterate over the annotations
    if eg['answer'] == 'accept':  # we only want accepted entities
        spans = eg.get('spans', [])  # get the annotated spans
        for span in spans:
            # get the highlighted text and create a pattern
            text = eg['text'][span['start']:span['end']]
            patterns.append({'pattern': text, 'label': span['label']})

write_jsonl('/path/to/patterns.jsonl', patterns)

The above example only creates patterns for exact string matches, e.g. "pattern": "Apple". If you want case-insensitive token-based matching, you can use spaCy to tokenize the text for you and create a pattern this way:

import spacy

nlp = spacy.load('en_core_web_sm')  # load a model once, outside the loop

text = eg['text'][span['start']:span['end']]
doc = nlp(text)  # tokenize the highlighted span
tokens = [{'lower': token.lower_} for token in doc]
patterns.append({'pattern': tokens, 'label': span['label']})
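Once the patterns file is written, you can pass it to ner.teach via the --patterns argument so the matcher suggests candidates alongside the model – the dataset, model and file names below are placeholders:

prodigy ner.teach my_new_dataset en_core_web_sm /path/to/data.jsonl --label ORG --patterns /path/to/patterns.jsonl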

You can also check out this thread, which discusses a similar approach to creating patterns from existing annotations: