how to write a model.update() function

My team wants to load and re-train some Chinese models through Prodigy, e.g. an NER model and a text-classification model. Since spaCy doesn't provide any basic Chinese models, we are trying to implement these recipes ourselves.

'update': model.update,  # update model with annotations

I am still confused about the input, output and main processing logic of this bound method model.update. Is there a pattern or template that would help me write a correct one?

Really glad to help you do this. I do hope we’ll be able to add some Chinese models for spaCy soon too.

You should actually be able to use the built-in recipes for NER and textcat, even with Chinese. But to answer your question about the update() function: there’s some documentation in the PRODIGY_README.html file that you might want to look at. The signature of the function is very simple. Example:


examples = [{"text": "some text", "spans": [{"start": 0, "end": 4, "label": "DT"}], "answer": "accept"}]

def update(answered_examples):
    # update the model weights from the answered examples and return the loss
    loss = 0.0
    return loss

update(examples)

The update() function must take a minibatch of dict objects, where each dict should have a key "answer" whose value is one of "accept", "reject" or "ignore". For the NER recipe, each example should also have a key "spans", which should be a list of dicts. Each span dict should have the keys "start", "end" and "label", where start and end are character offsets and label is a string.

To make the update function work well, there are a few things to consider. First, in the NER update, you’re not going to have complete annotations for the inputs. You might only have one entity for the sentence. You also need a way to learn from "reject" examples. If the answer is "reject", it’s easy to calculate the gradient of the error for the class you got wrong, but for other classes you probably want to zero the gradient. I’m not sure what the neatest way to express this in Tensorflow or PyTorch would be. Personally I wouldn’t bother trying to express it as a loss — I would just calculate the gradient and pass that in.
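To make that last point concrete, here is a rough PyTorch sketch of the "calculate the gradient and pass it in" idea for a single rejected span. This is just an illustration, not Prodigy or spaCy internals; the function and variable names are made up. The rejected class's probability gets pushed down and every other class receives a zero gradient.

import torch

def update_from_reject(probs, rejected_class):
    # probs: 1D tensor of predicted class probabilities for one span,
    # produced by the model so that backward() reaches its parameters
    grad = torch.zeros_like(probs)
    grad[rejected_class] = 1.0  # gradient only flows through the rejected class
    probs.backward(grad)        # stepping against this lowers its probability

After calling this for each rejected span in a batch, optimizer.step() applies the accumulated gradients as usual.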

Many thanks for your help.
I am curious about “You should actually be able to use the built-in recipes for NER and textcat, even with Chinese”.
Since Prodigy takes a spaCy model as the default NER model to handle the cold-start problem, this works well with English (German, Spanish, etc.) texts. However, when we use a spaCy model as the default one for Chinese NER, I'm afraid all text spans would get the same probability of 0. As a consequence, the prefer_* sorter functions will not recommend valuable questions.

Yes, to get over the cold start problem, you'll have to start off with examples of the entity first to give the model something to learn from. The ner.teach recipe supports passing in a JSONL file containing match patterns (like the patterns used by spaCy's Matcher). Prodigy will then start showing you matches of those patterns in your texts. As you annotate those examples, the model is updated and will eventually start suggesting examples as well, based on the updated weights. We actually just recorded a video tutorial that shows this workflow for training a new entity type.
There's also more information in this thread and this comment.

In this example, we use the terms.teach recipe to bootstrap a terminology list from word vectors and then convert it to a patterns file using terms.to-patterns. But you could also generate the list of patterns manually – see the PRODIGY_README.html for an example of what's possible.
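For illustration, here's a rough sketch of generating such a patterns file by hand. The labels and pattern tokens below are made-up examples, not ones from your data:

import json

# each pattern pairs a label with either an exact phrase or a list of
# token-attribute dicts, in the same format spaCy's Matcher uses
patterns = [
    {"label": "ORG", "pattern": [{"lower": "explosion"}]},  # case-insensitive token match
    {"label": "PRODUCT", "pattern": "Prodigy"},             # exact phrase match
]

with open("patterns.jsonl", "w", encoding="utf8") as f:
    for pattern in patterns:
        f.write(json.dumps(pattern, ensure_ascii=False) + "\n")

The resulting patterns.jsonl can then be passed to ner.teach via the --patterns argument.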

Many thanks!
I have another question. We are trying to train a Chinese NER model through spaCy. Will Prodigy be able to load this model seamlessly?

Yes! Any model you export from spaCy, e.g. via .to_disk(), can be loaded directly into Prodigy. The model you specify on the command line can either be a path to a data directory containing the exported model, or a model Python package created with spacy package.
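For example, a rough sketch of that workflow (the paths, dataset name and labels here are placeholders, and nlp is your trained spaCy pipeline):

# export the trained Chinese pipeline to a directory
nlp.to_disk("/path/to/zh_ner_model")

# the exported directory can then be passed to Prodigy on the command line, e.g.:
#   prodigy ner.teach zh_ner_dataset /path/to/zh_ner_model news.jsonl --label PERSON,ORG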

(If you've made modifications to spaCy, for example, to the Chinese language data or other parts of the library, those will have to be available to Prodigy as well. So you can either use a custom recipe, run Prodigy with a fork of spaCy containing your modifications, or create a Python package for your model and include your custom code in the model's __init__.py.)

Bravo!
I just tested our spaCy Chinese NER model, and it works well with Prodigy.
Now I'm moving on to training a relation extractor based on that spaCy Chinese NER model. Any suggestions?
Again, thank you all.

Cool, that's nice to hear! :+1:

Here's a thread with some ideas for how to use Prodigy for relationship and dependency annotation – this might be helpful to figure out the best interface to use:

Hi, I have an NER model in PyTorch. I used the model to predict and score the entities (steps 1 and 2), but I don't know how to update my model with the answers.
Can you help me with the "update your model" function in step 3?

Hi @elazzouzi1080, the update_your_model function is where you put your PyTorch-specific update step. This might differ based on your implementation, but you can check this documentation for more info. Usually, when I train in PyTorch, my training loop looks like this:


import torch
import torch.optim as optim

loss_fn = torch.nn.CrossEntropyLoss()  # or the appropriate loss for your task
optimizer = optim.Adam(model.parameters())  # `model` is your already-defined PyTorch model

for t in range(num_steps):
    y_pred = model(X_t)  # forward pass on the training batch
    loss = loss_fn(y_pred, y_t)
    
    # Zero the gradients before running the backward pass
    optimizer.zero_grad()

    # Backpropagation
    loss.backward()

    # Update weights using optimizer
    optimizer.step()

Now, what you should probably care about is X_t and y_t. In our case:

  • X_t: your training text data
  • y_t: the labeled data; you can obtain it from the spans or doc.ents

You might also want to check this tutorial on using Pytorch for your NER model.
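As a rough illustration of how y_t could be derived from the spans, here is a small sketch that assumes character-level BIO tags; your tokenization and label scheme may differ:

def example_to_bio(example):
    # turn one Prodigy example into a character-level BIO tag sequence
    text = example["text"]
    tags = ["O"] * len(text)
    for span in example.get("spans", []):
        start, end, label = span["start"], span["end"], span["label"]
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

example = {"text": "some text", "spans": [{"start": 0, "end": 4, "label": "DT"}], "answer": "accept"}
print(example_to_bio(example))
# ['B-DT', 'I-DT', 'I-DT', 'I-DT', 'O', 'O', 'O', 'O', 'O']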


Hi @ljvmiranda921 thank you for your feedback :slight_smile:
I'm working on active learning with a custom model.
To explain what I have done:
Step 1: I used the model to predict and score entities:

stream = TXT('Prodigy_data.txt')
labels is a list of lists of BIO tags, one per token, e.g.: [['O','O','BPATIENT','O','O', ...], ['O','B-DOCTOR','I-DOCTOR','I-DOCTOR','B-DOCTOR','I-DOCTOR','O', ..., 'B-STR','I-STR','I-STR','I-STR','O', ..., 'BZIP','BVILLE','O', ..., 'B-VILLE'], ...]

This generated the predictions and scores.

In spans I have the y_pred, not y_t (the labels).

Step 2: I sort the stream by score using prefer_uncertain (resorting the stream to prefer uncertain scores).

Step 3: for the update, I have my training function:


The iterator is my train_dataloader (see the screenshot).

But in Prodigy we use a stream.

I don't know how to do the update based on the answers: accept, reject and ignore.

**The update method should receive the examples with an added "answer" key that either maps to "accept", "reject" or "ignore".**

Another question: I used my model to highlight suggestions for me and I got this result.

My model didn't detect one entity (X). In this case, can I use active learning by correcting this entity (X) to Doctor and then pressing ACCEPT? Will the model be able to learn from these changes? And what if I have more entities to add or correct?

This depends on your task. A naive way might be to just filter all the answers that you have accepted, and then perform some data transformation step so that you can feed them directly to the model.update function.
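As a rough sketch of what that could look like (convert_to_tensors is a hypothetical helper standing in for whatever transformation your PyTorch model expects, and model, loss_fn and optimizer are the objects from the training loop above):

def update(answers):
    # keep only the examples that were accepted in the UI
    accepted = [eg for eg in answers if eg["answer"] == "accept"]
    if not accepted:
        return 0.0
    X_t, y_t = convert_to_tensors(accepted)  # hypothetical helper
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()
    return loss.item()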

If you're using active learning, the model will update in the loop. However, in the end, you'd still want to collect all your annotations together and train a new model from them. You will then use that final model in your downstream / production tasks.

So whenever you are in a model-in-the-loop scenario, my recommendation is to annotate all of your dataset first. Then, export your annotations using db-out and train a model on those. With that, you get the benefit of customizing your training method and applying some of the common deep learning tricks (dropout, etc.).
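For example, a small sketch of that export-then-train workflow, assuming the dataset is called my_dataset and was exported on the command line with prodigy db-out my_dataset > annotations.jsonl:

import json

examples = []
with open("annotations.jsonl", encoding="utf8") as f:
    for line in f:
        eg = json.loads(line)
        if eg.get("answer") == "accept":
            examples.append(eg)

# `examples` can now be converted to tensors and fed into your usual
# PyTorch training loop, with dropout and other tricks applied as needed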

Hi everybody,

I'm a colleague of @elazzouzi1080

We are indeed confused about how to adapt our custom PyTorch model so that it can take advantage of the binary decisions produced via the ner.teach UI.
If we understood everything correctly, ner.teach produces sparse annotations, as described in @ines' very interesting presentation: Belgium NLP Meetup: Rapid NLP Annotation Through Binary Decisions, Pattern Bootstrapping and Active Learning - Speaker Deck
The spaCy-based models in the loop can take advantage of such annotations. Also, via the --binary argument it is possible to update a model with these annotations once the teaching step has ended.

We don't have a clear picture of how to handle such sparse annotations in our custom model:

  1. for the model-in-the-loop step
  2. to update a final model, taking the sparse annotations into account.

Any advice on this?

Thank you very much.

Guillaume