Bool flag not accepted in the jsonl pattern file

HadhemiDS · August 3, 2023, 4:47pm

I'm using a spaCy config.cfg to train my model:

...
[nlp]
lang = "xx"
pipeline = ["ner","entity_ruler"]
disabled = []

...
[initialize.components.entity_ruler.patterns]
@readers = "srsly.read_jsonl.v1"
path = "my_patterns.jsonl"
skip = false

I wrote all the patterns to my_patterns.jsonl using srsly.write_jsonl. The problem is that the output file converts every True flag to true, which is correct for JSON but is not recognized by Python.
One example pattern: {"label": "my_label", "pattern": [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": True, "OP": "?"}, {"IS_PUNCT": True, "OP": "?"}, {"IS_DIGIT": True}]}
srsly writes this as:
{"label": "my_label", "pattern": [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": true, "OP": "?"}, {"IS_PUNCT": true, "OP": "?"}, {"IS_DIGIT": true}]}
When I train the model, the label is not recognized.
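For reference, here is a minimal sketch of how a boolean flag round-trips through JSON. It uses the standard-library json module as a stand-in for srsly, on the assumption that srsly's JSONL helpers treat booleans the same way; the keyword list is hypothetical.

```python
import json

# One pattern as it would be built in Python, with Python booleans.
# The keyword list is a hypothetical stand-in for the real keywords.
pattern = {
    "label": "my_label",
    "pattern": [
        {"LOWER": {"IN": ["keyword", "keywords"]}},
        {"IS_SPACE": True, "OP": "?"},
        {"IS_PUNCT": True, "OP": "?"},
        {"IS_DIGIT": True},
    ],
}

# Serialising writes JSON's lowercase `true`; this is one JSONL line.
line = json.dumps(pattern)

# Parsing that line turns `true` back into Python's True.
restored = json.loads(line)

print('"IS_DIGIT": true' in line)                  # lowercase in the file
print(restored == pattern)                         # same data after reading
print(restored["pattern"][3]["IS_DIGIT"] is True)  # a real Python bool again
```

If the same holds for the real file, the lowercase true in my_patterns.jsonl would not by itself explain why the label is missing, and the cause may lie elsewhere in the pipeline setup.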

On the other hand, I tried a simple script:

import spacy
# Import the Matcher
from spacy.matcher import Matcher

# Load a model and create the nlp object
nlp = spacy.load("xx_ent_wiki_sm")

# Initialize the matcher with the shared vocab
matcher = Matcher(nlp.vocab)

# Add the pattern to the matcher
pattern = [{"LOWER": {"IN": [list of keywords]}}, {"IS_SPACE": True, "OP": "?"}, {"IS_PUNCT": True, "OP": "?"}, {"IS_DIGIT": True}]
matcher.add("my_label", [pattern])

# Process some text
doc = nlp("Hello, this is a keyword from the keyword list")

# Call the matcher on the doc
matches = matcher(doc)
for match_id, start, end in matches:
    print(doc[start:end])

This code works and detects the keyword in the doc. But when I train the model using config.cfg and the my_patterns.jsonl file, it does not work. How can I fix this?

ryanwesslen · August 4, 2023, 2:07am

Hi @HadhemiDS,

Thanks for your question.

Your question looks specific to spaCy, not Prodigy. Could you post it on spaCy's discussion forum? This forum is for Prodigy-specific questions.

HadhemiDS · August 4, 2023, 12:58pm

Hi,
Done! Thanks
