Hello,
I am updating a custom NER model from spaCy v3.6.1 to v3.8.14 (the latest release at the time of writing), reusing some samples originally labeled with Prodigy.
To do so, I am using a previously built GCP pipeline (formerly Vertex AI Pipeline), which at some point requires me to create a custom logger. My current version looks like this:
import json
import sys
from pathlib import Path
from typing import IO, Any, Callable, Dict, Optional, Tuple

import spacy
from spacy.language import Language


@spacy.registry.loggers("spacy_history_logger.v1")
def custom_logger(log_path):
    def setup_logger(
        nlp: Language,
        stdout: IO = sys.stdout,
        stderr: IO = sys.stderr
    ) -> Tuple[Callable, Callable]:
        stdout.write(f"Logging to {log_path}\n")
        log_file = Path(log_path).open("w", encoding="utf8")

        # Called after each evaluation step; info is None on other steps.
        def log_step(info: Optional[Dict[str, Any]]):
            if info:
                to_write = {
                    'epoch': info['epoch'],
                    'step': info['step'],
                    'score': info['score'],
                    'loss_ner': info['losses']['ner'],
                    'f1_score': info['other_scores']['ents_f']
                }
                log_file.write(json.dumps(to_write))
                log_file.write("\n")

        # Called once when training ends.
        def finalize():
            log_file.close()

        return log_step, finalize

    return setup_logger
The last time I used this pipeline (about 2 years ago) it worked fine, but now it raises an error when serializing the 'to_write' values:
TypeError: Object of type float32 is not JSON serializable
  File "/logger.py", line 30, in log_step
    log_file.write(json.dumps(to_write))
So my question is simple but important: why does this logger work fine with spaCy v3.6.1 but fail with v3.8.14? What changed regarding loggers?
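In the meantime, here is a minimal sketch of the workaround I am considering. It assumes (I have not confirmed this) that one or more of the values in 'info' now arrives as a NumPy float32 scalar rather than a plain Python float. The helper names to_builtin and dump_row are my own and are not part of spaCy.

```python
import json
from typing import Any


def to_builtin(obj: Any) -> Any:
    """Fallback for json.dumps: unwrap NumPy-style scalars into built-in values."""
    # Assumption: the offending values are NumPy scalars (e.g. numpy.float32),
    # which expose .item() to return the equivalent built-in Python object.
    item = getattr(obj, "item", None)
    if callable(item):
        return item()
    # Anything else is still reported as unserializable, as json would do.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dump_row(row: dict) -> str:
    """Serialize one log row; json only calls to_builtin for values it cannot encode."""
    return json.dumps(row, default=to_builtin)
```

Inside log_step, this would mean replacing json.dumps(to_write) with dump_row(to_write), or passing default=to_builtin directly. It only works around the symptom, though, so I would still like to understand what changed.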
Thank you