Bool flag not accepted in the jsonl pattern file

I'm using a spaCy config.cfg to train my model:

...
[nlp]
lang = "xx"
pipeline = ["ner","entity_ruler"]
disabled = []

...
[initialize.components.entity_ruler.patterns]
@readers = "srsly.read_jsonl.v1"
path = "my_patterns.jsonl"
skip = false

I wrote all the patterns in my_patterns.jsonl using srsly.write_jsonl. The problem is that the output file turns every True flag into true, which is correct for the JSON format but is not recognized by Python.
One example pattern:
{"label": "my_label", "pattern": [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": True, "OP": "?"}, {"IS_PUNCT": True, "OP": "?"}, {"IS_DIGIT": True}]}
srsly writes this to the file as:
{"label": "my_label", "pattern": [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": true, "OP": "?"}, {"IS_PUNCT": true, "OP": "?"}, {"IS_DIGIT": true}]}
When I train the model, the label is not recognized.
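
One way to check whether the lowercase true is really the problem is to write a pattern out and read it back. The following is only a minimal sketch: it uses the standard json module instead of srsly, on the assumption that srsly's JSONL helpers follow the same JSON conventions, and the keywords "invoice" and "order" are hypothetical stand-ins for the real keyword list.

```python
import json

# Hypothetical keywords stand in for the real keyword list.
pattern = {
    "label": "my_label",
    "pattern": [
        {"LOWER": {"IN": ["invoice", "order"]}},
        {"IS_SPACE": True, "OP": "?"},
        {"IS_PUNCT": True, "OP": "?"},
        {"IS_DIGIT": True},
    ],
}

# Serialize one JSONL line: Python True is written as JSON true.
line = json.dumps(pattern)
print(line)

# Parse the line back: JSON true is converted to Python True again.
restored = json.loads(line)
print(type(restored["pattern"][3]["IS_DIGIT"]))
print(restored == pattern)
```

If the restored dictionary compares equal to the original, the true spelling in the file is only the on-disk representation, and the cause of the unrecognized label may lie elsewhere.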

On the other hand, I tried a simple script:

import spacy

# Import the Matcher
from spacy.matcher import Matcher

# Load a model and create the nlp object
nlp = spacy.load("xx_ent_wiki_sm")

# Initialize the matcher with the shared vocab
matcher = Matcher(nlp.vocab)

# Add the pattern to the matcher
pattern = [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": True, "OP": "?"}, {"IS_PUNCT": True, "OP": "?"}, {"IS_DIGIT": True}]
matcher.add("my_label", [pattern])

# Process some text
doc = nlp("Hello, this is a keyword from the keyword list")

# Call the matcher on the doc and print each matched span
matches = matcher(doc)
for match_id, start, end in matches:
    print(doc[start:end])

This code works and detects the keyword in the doc. But when I train the model using config.cfg and the my_patterns.jsonl file, the patterns do not work. How can I fix this?
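
To narrow this down, the same pattern could be tried in an entity ruler outside of training. This is only a sketch under assumptions: it uses a blank "xx" pipeline rather than the trained one, adds the patterns in code instead of through the config, and the keyword "invoice" and the sample sentence are hypothetical.

```python
import spacy

# Blank multi-language pipeline, matching lang = "xx" in the config.
nlp = spacy.blank("xx")
ruler = nlp.add_pipe("entity_ruler")

# Same structure as one parsed line of my_patterns.jsonl;
# the keyword "invoice" is a hypothetical placeholder.
patterns = [
    {
        "label": "my_label",
        "pattern": [
            {"LOWER": {"IN": ["invoice"]}},
            {"IS_SPACE": True, "OP": "?"},
            {"IS_PUNCT": True, "OP": "?"},
            {"IS_DIGIT": True},
        ],
    }
]
ruler.add_patterns(patterns)

# Run the pipeline and collect the entities set by the ruler.
doc = nlp("Invoice 42 was paid")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(ents)
```

To test the actual file, the in-code list could be swapped for list(srsly.read_jsonl("my_patterns.jsonl")). If the entity shows up here but not after training, the difference is more likely in how the pipeline is configured than in the pattern file itself.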

Hi @HadhemiDS,

Thanks for your question.

Your question looks specific to spaCy, not Prodigy. Could you post it on spaCy's discussion forum? This forum is for Prodigy-specific questions.

Hi,
Done! Thanks
