Prodigy 1.3 Segmentation Fault

I’ve installed the new Prodigy 1.3 (Linux) to use the new ner.make-gold recipe, and every time the recipe saves annotations to the database (either manually or when the batch is full), I get a segmentation fault and the server stops. I tried it with several recipes (ner.manual, ner.make-gold, ner.teach) and several databases (the default SQLite and my own Amazon RDS PostgreSQL database), and the segmentation fault always occurs.

I downgraded to 1.2 and everything seems to work just fine.

The new 1.3 version seems to have a small bug in the saving step.

Thanks for the report – that’s very strange! So just to be clear: is the segfault happening with the ner.make-gold recipe? That recipe doesn’t use an update callback, so I’m puzzled as to what could be going on here.

Yes, ner.make-gold produces a segfault when the annotations are saved (so no annotations end up in the database).

Thanks for the report – this is indeed very strange. Did the traceback provide more details on where the segfault occurred, or could you run it again with PRODIGY_LOGGING=basic and post the last entries in the log?
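If the log ends without a Python traceback, as it usually does with a segfault, Python's built-in faulthandler module can dump the Python-level stack at the moment of the crash. Below is a minimal sketch, assuming you start Prodigy from a small launcher script. The helper name enable_crash_diagnostics and the log path are hypothetical, and it is an assumption that Prodigy picks up PRODIGY_LOGGING when it is set from inside the process before prodigy is imported, rather than in the shell.

```python
import faulthandler
import os


def enable_crash_diagnostics(log_path="segfault_traceback.log"):
    """Turn on basic Prodigy logging and crash-time stack dumps.

    Hypothetical helper, not part of Prodigy. Call it before importing prodigy.
    """
    # Same variable as in `PRODIGY_LOGGING=basic prodigy ...` on the command line.
    os.environ["PRODIGY_LOGGING"] = "basic"
    # On SIGSEGV and similar fatal signals, faulthandler writes the Python
    # stack of every thread to this file. The file has to stay open.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

The same stack dump is available without any code by starting Python with -X faulthandler or by setting PYTHONFAULTHANDLER=1, in which case it is written to stderr.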

Also, here’s the standalone, bare-bones version of what happens internally when you save a batch of answers (e.g. by making a request to /give_answers). Does the script below also cause the error for you?

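from prodigy.core import Controller

# Bare-bones controller that writes to the dataset 'test_dataset'
ctrl = Controller('test_dataset', 'text', [], None, True, None, None, None, None, None, {})
answers = [{'text': 'hello world', 'answer': 'accept', '_input_hash': 1, '_task_hash': 2}]
# Save the batch, as a request to /give_answers would
ctrl.receive_answers(answers)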

I have no details beyond the segmentation fault message itself. These are the last log entries:

10:25:39 - GET: /project
10:25:40 - GET: /get_questions
10:25:40 - CONTROLLER: Iterating over stream
10:25:40 - PREPROCESS: Tokenizing examples
10:25:40 - PREPROCESS: Splitting sentences
10:25:40 - FILTER: Filtering duplicates from stream
10:25:44 - CONTROLLER: Returning a batch of tasks from the queue
10:25:44 - RESPONSE: /get_questions (10 examples)
Segmentation fault

It occurs when I press the save icon in the web interface.

I tried executing your hello world code and it worked just fine: the annotation was saved in the database.

I don’t suppose you could share the examples you’re working with?

The error is likely to be occurring within spaCy, not Prodigy. Even in the ner.manual recipe, we still pass the text to spaCy to tokenize it.

Is it possible that you have an example with extremely long text? Empty text fields shouldn’t crash it, but that’s another possibility that comes to mind.

OK, I just found the cause of the error. I was working in a Docker image built on a lightweight Python image, python:3.6.2-alpine. I updated my image to use a more complete environment (python:3.6.4-stretch). The issue must have been that the C libraries in the lightweight image no longer work with Prodigy 1.3.

Thank you for your time and sorry for the inconvenience.


Thanks for the update!

Running on a light-weight Python image sounds very reasonable, so it’s good to know there was a problem. It might well come up again.
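For anyone who hits a similar crash, a quick look at the runtime can show whether the process is running on an Alpine-based image. Alpine uses musl rather than glibc as its C library, which may be the difference at play here, although this thread does not confirm the exact cause. Below is a minimal sketch. The function name is hypothetical, and the /etc/alpine-release check is a heuristic.

```python
import os
import platform
import sys


def describe_runtime():
    """Collect basic facts about the interpreter and C library in use."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": sys.version.split()[0],
        # May be an empty string when glibc is not detected
        "libc": (libc_name + " " + libc_version).strip(),
        # Alpine images ship this file; other distributions normally do not
        "alpine": os.path.exists("/etc/alpine-release"),
    }
```

Printing describe_runtime() inside the container and including the output in a bug report makes this kind of environment difference easier to spot.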