nlp.rehearse does not work

I am trying to do dynamic model training. For this I need to retrain my model every time, for example, employees add new data to it. Basically, my app extracts data from documents, and if something is wrong, people check what went wrong and the model is updated.

I have three solutions in mind:

  1. Keep one JSON file or database of examples, always add new examples to it, and retrain the whole model from scratch every time (I guess this is the worst solution).
  2. Create a new NER pipeline every time I train the model (but I think the model will run slower, because it is possible to end up with 100-200 new pipelines).
  3. The best solution I have found: use pseudo-rehearsal.
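
A minimal sketch of the example store described in option 1, assuming one JSON object per line (JSONL). The file layout and the helper names `append_examples` and `load_examples` are hypothetical and not part of spaCy or Prodigy:

```python
import json
from pathlib import Path


def append_examples(path, items):
    """Append annotated examples to a JSONL file, one JSON object per line."""
    path = Path(path)
    with path.open("a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def load_examples(path):
    """Load every stored example; return an empty list if the file does not exist yet."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in f if line.strip()]
```

Each reviewer correction would be appended with `append_examples`, and a full retrain would start from `load_examples`.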

Here is my code to retrain the model:

import random

from spacy.training import Example

# `model` is the loaded spaCy pipeline; `data` is a list of dicts with 'text' and 'annotations'
optimizer = model.resume_training()

for itn in range(1000):
    random.shuffle(data)
    losses = {}
    for item in data:
        doc = model.make_doc(item['text'])
        ents = []
        for annotation in item['annotations']:
            start = annotation.get('start')
            end = annotation.get('end')
            label = annotation.get('label')
            if start is not None and end is not None and label is not None:
                # char_span returns None when the offsets do not align with token boundaries
                span = doc.char_span(start, end, label=label)
                if span is not None:
                    ents.append(span)
        doc.ents = ents
        example = Example.from_dict(doc, {"entities": ents})
        model.rehearse([example], sgd=optimizer, losses=losses)
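
Because `doc.char_span` returns `None` for offsets that do not line up with token boundaries, and incomplete annotations are skipped, the loop above can silently drop entities. A small sketch of a pre-check that reports incomplete, out-of-range, or overlapping annotations before training. `find_bad_annotations` is a hypothetical helper that assumes the same `item` layout as the code above and integer offsets; it cannot detect token misalignment, which needs the tokenizer:

```python
def find_bad_annotations(item):
    """Return (annotation, reason) pairs for annotations likely to be dropped or rejected."""
    text_len = len(item["text"])
    problems = []
    valid = []
    for ann in item["annotations"]:
        start, end, label = ann.get("start"), ann.get("end"), ann.get("label")
        if start is None or end is None or label is None:
            problems.append((ann, "missing start, end, or label"))
        elif not (0 <= start < end <= text_len):
            problems.append((ann, "offsets outside the text or empty span"))
        else:
            valid.append(ann)
    # Overlapping entity spans cannot both be assigned to doc.ents,
    # so flag any span that starts before an earlier span has ended.
    max_end = 0
    for ann in sorted(valid, key=lambda a: (a["start"], a["end"])):
        if ann["start"] < max_end:
            problems.append((ann, "overlaps a previous annotation"))
        max_end = max(max_end, ann["end"])
    return problems
```

Running this over `data` before the training loop gives a list of annotations to inspect by hand.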

If I use model.rehearse, my model does not update at all, although the code runs without errors.

When I use model.update, everything works, but then I run into the problem called "catastrophic forgetting".
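
spaCy's documentation describes `rehearse` as an experimental update that teaches the current model to make predictions similar to an initial model, so on its own it is not expected to learn new annotations. The usual pattern is to interleave it with `update`. A minimal sketch of such a schedule, kept independent of spaCy so that the two steps are passed in as callables; `interleave_training`, its parameters, and the default batch size are hypothetical:

```python
import random


def interleave_training(new_examples, rehearsal_examples, update_step,
                        rehearse_step, n_iter=10, batch_size=8, seed=0):
    """Alternate one update batch (new annotations) with one rehearsal batch."""
    rng = random.Random(seed)
    new_examples = list(new_examples)
    rehearsal_examples = list(rehearsal_examples)
    for _ in range(n_iter):
        rng.shuffle(new_examples)
        for i in range(0, len(new_examples), batch_size):
            # Learn from the newly annotated examples.
            update_step(new_examples[i:i + batch_size])
            if rehearsal_examples:
                # Pull the model back toward its earlier predictions.
                k = min(batch_size, len(rehearsal_examples))
                rehearse_step(rng.sample(rehearsal_examples, k))
```

With spaCy, `update_step` could be `lambda batch: model.update(batch, sgd=optimizer, losses=losses)` and `rehearse_step` could be `lambda batch: model.rehearse(batch, sgd=optimizer, losses=losses)`, with the rehearsal examples built from raw, unannotated text of the kind the original model already handles well. This is an assumption about how the pieces fit together, not a tested recipe.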

Am I doing something wrong, or can this feature not do what I need? Thank you!

hi @apparat!

Thanks for your question and welcome to the Prodigy community :wave:

Could you post your message on spaCy's GitHub discussions forum?

Your question is specific to spaCy (nlp.rehearse), so you are best off posting there. That's where the spaCy core team answers questions, and you'll get a much faster response by posting there. This forum is for Prodigy-specific questions.

Thanks for your understanding!