peewee.IntegrityError on update

I started seeing this error message.
The MySQL database has no custom modifications.
Could you please help me?

Prodigy Version: 1.12.4
Database: MySQL
Error Message on save:
[image: screenshot of the error message]
Error message in the server:

Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/cors.py", line 147, in simple_response
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     await self.app(scope, receive, send)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     raise exc
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     await self.app(scope, receive, sender)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     raise e
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     await self.app(scope, receive, send)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 718, in __call__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     await route.handle(scope, receive, send)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 276, in handle
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     await self.app(scope, receive, send)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 66, in app
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     response = await func(request)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/fastapi/routing.py", line 237, in app
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     raw_response = await run_endpoint_function(
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/fastapi/routing.py", line 165, in run_endpoint_function
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return await run_in_threadpool(dependant.call, **values)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/starlette/concurrency.py", line 41, in run_in_threadpool
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return await anyio.to_thread.run_sync(func, *args)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 33, in run_sync
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return await get_asynclib().run_sync_in_worker_thread(
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return await future
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     result = context.run(func, *args)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/prodigy/app.py", line 572, in give_answers
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     controller.receive_answers(
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "cython_src/prodigy/core.pyx", line 540, in prodigy.core.Controller.receive_answers
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "cython_src/prodigy/core.pyx", line 557, in prodigy.core.Controller.receive_answers
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "cython_src/prodigy/core.pyx", line 657, in prodigy.core.Controller._db_add_examples
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/prodigy/components/db.py", line 770, in add_examples
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     self.link(dataset, ids)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/prodigy/components/db.py", line 786, in link
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     Link.bulk_create(links, batch_size=batch_size)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 6609, in bulk_create
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     res = cls.insert_many(accum, fields=fields).execute()
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1966, in inner
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return method(self, database, *args, **kwargs)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2037, in execute
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return self._execute(database)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2842, in _execute
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return super(Insert, self)._execute(database)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2555, in _execute
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     cursor = database.execute(self)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3254, in execute
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     return self.execute_sql(sql, params)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3246, in execute_sql
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     cursor.execute(sql, params or ())
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3014, in __exit__
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     reraise(new_type, new_type(exc_value, *exc_args), traceback)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 192, in reraise
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     raise value.with_traceback(tb)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3246, in execute_sql
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     cursor.execute(sql, params or ())
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 153, in execute
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     result = self._query(query)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 322, in _query
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     conn.query(q)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 558, in query
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 822, in _read_query_result
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     result.read()
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 1200, in read
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     first_packet = self.connection._read_packet()
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 772, in _read_packet
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     packet.raise_for_error()
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/protocol.py", line 221, in raise_for_error
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     err.raise_mysql_exception(self._data)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:   File "/usr/local/lib/python3.9/dist-packages/pymysql/err.py", line 143, in raise_mysql_exception
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]:     raise errorclass(errno, errval)
Jul 28 13:10:41 ip-172-31-85-61 prodigy_stt[4093767]: peewee.IntegrityError: (1452, 'Cannot add or update a child row: a foreign key constraint fails (`prodigy_stt`.`link`, CONSTRAINT `link_ibfk_1` FOREIGN KEY (`example_id`) REFERENCES `example` (`id`))')
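
For context on the last line of the log: MySQL error 1452 is raised when a child row (here a row in `link`) points at a parent row (in `example`) that does not exist. Below is a minimal sketch of that failure mode. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the two-table schema is a simplified assumption based on the names in the error message, not Prodigy's actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL. Table and column names
# follow the error message; everything else is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute(
    "CREATE TABLE link ("
    " id INTEGER PRIMARY KEY,"
    " example_id INTEGER NOT NULL REFERENCES example (id),"
    " dataset_id INTEGER NOT NULL)"
)

conn.execute("INSERT INTO example (id, content) VALUES (1, '{}')")

# Linking to an example that exists is accepted.
conn.execute("INSERT INTO link (example_id, dataset_id) VALUES (1, 10)")

# Linking to an example id that was never stored violates the constraint.
try:
    conn.execute("INSERT INTO link (example_id, dataset_id) VALUES (999, 10)")
    failed = False
except sqlite3.IntegrityError as err:
    failed = True
    print("rejected:", err)
```

SQLite words the error differently from MySQL, but the condition is the same: the insert into `link` is rejected because `example` has no row with the referenced id.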

Hi @ngawangtrinley ,

Sorry to hear you're experiencing problems with the audio data. Could you also tell us 1) whether you're saving full audio files in the DB (keeping the base64 data) or just paths to the audio files (with a built-in recipe it would be just paths), and 2) whether you use instant_submit: true for this task in your config file? Thank you!

Hi @magdaaniol,

  1. I'm using the built-in recipe for this audio transcription. The database holds the path to the file on disk.
  2. instant_submit is false for some instances and true for others.

Hi @magdaaniol ,
Super weird, but it seems to work fine now. I did not change anything.

OK! Thanks for reporting back. Do let us know if the issue reappears - thanks!

@magdaaniol
The same issue seems to happen when I run drop:

➜  staging sudo -u prodigy PRODIGY_CONFIG="./config_mysql.json" /usr/bin/python3.9 -m prodigy drop stt_review_merged

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3246, in execute_sql
    cursor.execute(sql, params or ())
  File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 558, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 822, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 1200, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 772, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.IntegrityError: (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`prodigy_stt`.`link`, CONSTRAINT `link_ibfk_1` FOREIGN KEY (`example_id`) REFERENCES `example` (`id`))')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.9/dist-packages/prodigy/__main__.py", line 63, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 862, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/usr/local/lib/python3.9/dist-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/usr/local/lib/python3.9/dist-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/prodigy/recipes/commands.py", line 122, in drop
    dropped = DB.drop_dataset(set_id, batch_size)
  File "/usr/local/lib/python3.9/dist-packages/prodigy/components/db.py", line 902, in drop_dataset
    delete_refs(original_example_ids, Link, Example)
  File "/usr/local/lib/python3.9/dist-packages/prodigy/components/db.py", line 894, in delete_refs
    ExampleType.delete().where(
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1966, in inner
    return method(self, database, *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2037, in execute
    return self._execute(database)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2555, in _execute
    cursor = database.execute(self)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3254, in execute
    return self.execute_sql(sql, params)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3246, in execute_sql
    cursor.execute(sql, params or ())
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3014, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 192, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3246, in execute_sql
    cursor.execute(sql, params or ())
  File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 558, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 822, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 1200, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/connections.py", line 772, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/dist-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/dist-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
peewee.IntegrityError: (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`prodigy_stt`.`link`, CONSTRAINT `link_ibfk_1` FOREIGN KEY (`example_id`) REFERENCES `example` (`id`))')
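
For context: MySQL error 1451 is the mirror image of 1452. It is raised when a parent row (in `example`) is deleted while a child row (in `link`) still references it. The sketch below reproduces that condition with Python's built-in sqlite3 module as a stand-in for MySQL. The schema is a simplified assumption based on the error message, and the sketch is not meant to explain why `drop` hit this state in Prodigy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE link ("
    " id INTEGER PRIMARY KEY,"
    " example_id INTEGER NOT NULL REFERENCES example (id))"
)
conn.execute("INSERT INTO example (id) VALUES (1)")
conn.execute("INSERT INTO link (example_id) VALUES (1)")
conn.commit()

# Deleting the parent while a link still points at it is rejected.
try:
    conn.execute("DELETE FROM example WHERE id = 1")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

# Deleting the referencing rows first, then the parent, is accepted.
with conn:  # commits on success, rolls back on error
    conn.execute("DELETE FROM link WHERE example_id = 1")
    conn.execute("DELETE FROM example WHERE id = 1")

remaining_examples = conn.execute("SELECT COUNT(*) FROM example").fetchone()[0]
```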

Thanks! Looking into it and we'll update asap.

We have been using MySQL to store annotations. I have a new recipe to apply against an existing dataset with a little filtering. We have tested the recipe and used it successfully with SQLite, and we have been using MySQL for other recipes without any problem. I did not expect to run into this problem, so we have people ready to annotate against this recipe tomorrow.

I created the new dataset using Python:

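from prodigy.components.db import connect

db = connect()  # assumption: uses the database settings from prodigy.json
examples = db.get_dataset_examples(ds)  # ds: name of the existing source dataset
events = [e for e in examples if len(e.get('spans', [])) > 0]  # keep examples with spans
db.add_dataset('msa_date')
db.add_examples(events, datasets=['msa_date'])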

I tried to use my new recipe with the dataset created above. I also used db-out to export the JSON and re-upload it with db-in, which failed to load the data with the same error that I got before.

The recipe loads the view correctly; it's only when you click save that you see the error.

peewee.IntegrityError: (1216, 'Cannot add or update a child row: a foreign key constraint fails')

I have updated packages for peewee, mysql and prodigy to the latest version.

The stack trace goes from pymysql back through peewee to these lines in Prodigy:

  File "/srv/data/loretta/venv3.10/lib/python3.10/site-packages/prodigy/recipes/commands.py", line 192, in db_in
    DB.add_examples(
  File "/srv/data/loretta/venv3.10/lib/python3.10/site-packages/prodigy/components/db.py", line 790, in add_examples
    self.link(dataset, ids)
  File "/srv/data/loretta/venv3.10/lib/python3.10/site-packages/prodigy/components/db.py", line 806, in link
    Link.bulk_create(links, batch_size=batch_size)

Thoughts?
-Loretta

Thanks for all the details @lauvil. We're looking into it; it appears to be an issue on our end. We'll get back to you as soon as possible.

I did find that I had duplicates in this dataset, although it came from a review recipe, so I didn't expect any. I was able to create this dataset once, but I deleted it, thinking the duplicates might be causing the issue, and now I can't save it again even without the duplicates.
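
For anyone wanting to rule duplicates out in the same way, here is a rough sketch of filtering a list of examples before adding it to a new dataset. It assumes each example is a dict carrying Prodigy's `_task_hash` key. The helper name `drop_duplicate_tasks` and the sample data are made up for illustration, and nothing here establishes that duplicates cause the error.

```python
def drop_duplicate_tasks(examples, key="_task_hash"):
    """Keep the first example seen for each hash value.

    Examples without the hash key are kept as they are, since there is
    nothing to compare them on.
    """
    seen = set()
    unique = []
    for eg in examples:
        task_hash = eg.get(key)
        if task_hash is not None:
            if task_hash in seen:
                continue  # already have an example with this hash
            seen.add(task_hash)
        unique.append(eg)
    return unique


# Hypothetical sample data, for illustration only.
sample = [
    {"text": "a", "_task_hash": 1},
    {"text": "a", "_task_hash": 1},
    {"text": "b", "_task_hash": 2},
    {"text": "c"},
]
deduplicated = drop_duplicate_tasks(sample)
```

The filtered list could then be passed to `db.add_examples` in place of the original one.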

As @magdaaniol mentioned, we're looking at this issue. I've combined this with another post so we can track both together, as they seem to be dealing with the same problem.

Hi @ngawangtrinley & @lauvil,

We have found that our DB bulk inserts could lead to intermittent instability in the operations, resulting in the issues you've been experiencing. We have fixed this by removing bulk inserts, which makes DB operations stable, although somewhat slower.
Prodigy 1.12.6 with this fix is already available on PyPi.
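
To illustrate the general trade-off (this is not the actual Prodigy change, whose code isn't shown in this thread), here is a small sketch of inserting link rows one statement at a time inside a single transaction, rather than as one bulk statement. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the schema and the helper name `link_one_by_one` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE link ("
    " id INTEGER PRIMARY KEY,"
    " example_id INTEGER NOT NULL REFERENCES example (id),"
    " dataset_id INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO example (id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()


def link_one_by_one(conn, dataset_id, example_ids):
    """Insert one link row per example id, one statement per row.

    All rows share one transaction: if any insert fails, none are kept.
    """
    with conn:  # commits on success, rolls back on error
        for example_id in example_ids:
            conn.execute(
                "INSERT INTO link (example_id, dataset_id) VALUES (?, ?)",
                (example_id, dataset_id),
            )


link_one_by_one(conn, 10, [1, 2, 3])
link_count = conn.execute("SELECT COUNT(*) FROM link").fetchone()[0]
```

Row-by-row inserts cost more round trips to the database than a bulk insert, which matches the "somewhat slower" note above.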

To upgrade an existing installation:

pip install --upgrade prodigy -f https://xxxx-xxxx-xxxx-xxxx@download.prodi.gy
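
After upgrading, one way to confirm that the environment picked up the fixed release is to compare the installed version against 1.12.6. The sketch below uses only the standard library; the helper names `parse_version` and `has_fix` are made up for this example, and the comparison is deliberately simple (numeric parts only).

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (1, 12, 6)  # release mentioned above as containing the fix


def parse_version(text):
    """Turn a version string such as '1.12.6' into a tuple like (1, 12, 6)."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def has_fix(installed):
    """Return True if the given version string is at least 1.12.6."""
    return parse_version(installed) >= FIXED_IN


try:
    installed = version("prodigy")
    print(installed, "includes the fix" if has_fix(installed) else "needs upgrading")
except PackageNotFoundError:
    print("prodigy is not installed in this environment")
```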

Thank you both again for detailed reports that helped us with the fix!