500 Internal Server Error

Hello,
I was annotating a 4,000-line JSONL file with Prodigy's spans.manual recipe, using the command below. When I reached the 120th line, I got the error "Oops, something went wrong :slightly_frowning_face:. Check the console for possible errors."

!python -m prodigy spans.manual dataset_name en_core_web_lg ./jsonl_file.jsonl --label Label_1, Label_2, Label_3, Label_4, Label_5

The jsonl data follows the format
{"text":"statement"}
{"text":"statement"}
{"text":"statement"}
{"text":"statement"}
{"text":"statement"}
...
{"text":"statement"}

  • I tried restarting the local server to no effect.
  • With PRODIGY_LOGGING=verbose, the log is giving the following error
    INFO: 127.0.0.1:51092 - "POST /get_session_questions HTTP/1.1" 500 Internal Server Error
  • Above this error line there are many lines like the following sample:
    23:22:03: ROUTER: Routing item with _input_hash=-2065928483 -> ['2024-05-29_23-21-54']
    23:22:03: ROUTER: Routing item with _input_hash=-844096250 -> ['2024-05-29_23-21-54']
    23:22:03: ROUTER: Routing item with _input_hash=-1393159783 -> ['2024-05-29_23-21-54']
    23:22:03: ROUTER: Routing item with _input_hash=708647469 -> ['2024-05-29_23-21-54']
    23:22:03: ROUTER: Routing item with _input_hash=1132993060 -> ['2024-05-29_23-21-54']
    23:22:03: ROUTER: Routing item with _input_hash=-1145188808 -> ['2024-05-29_23-21-54']

Note that I'm running Python 3.10.13 on Ubuntu 20.04 on a local machine, and the issue occurs in both Firefox and Chrome.

I need to resume annotating this JSONL file using the same database that holds the annotations I made before this glitch occurred. I'd appreciate your help with this.

Thank you

Hi @adixxov,

If you run into the error and Ctrl-C out of the Prodigy process, does any traceback show up in the terminal?

Also, are you running Prodigy locally or behind a proxy/load balancer?

Thanks!

No, no traceback shows up in the terminal when I Ctrl+C Prodigy. For comparison, I tried Ctrl+C on another session that works normally, and I got the following output in the terminal.

INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [3324932]

I'm running Prodigy locally.

Hi @adixxov,

Looking at your command - it looks like you're running Prodigy in a Jupyter notebook? Would it be possible to try running it directly in your terminal to see if we can get the traceback if you ctrl-c?
Thanks!

OK, got it. Here is the traceback after Ctrl+C:

^CINFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [99524]
Task exception was never retrieved
future: <Task finished name='Task-15' coro=<RequestResponseCycle.run_asgi() done, defined at /home/......../lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py:402> exception=ValueError('Unmatched ''"' when when decoding 'string'')>
Traceback (most recent call last):
File "/home/......../lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 409, in run_asgi
self.logger.error(msg, exc_info=exc)
File "/usr/lib/python3.10/logging/__init__.py", line 1506, in error
self._log(ERROR, msg, args, **kwargs)
File "/usr/lib/python3.10/logging/__init__.py", line 1624, in _log
self.handle(record)
File "/usr/lib/python3.10/logging/__init__.py", line 1633, in handle
if (not self.disabled) and self.filter(record):
File "/usr/lib/python3.10/logging/__init__.py", line 821, in filter
result = f.filter(record)
File "/home/......../lib/python3.10/site-packages/prodigy/__init__.py", line 21, in filter
raise rec.exc_info[1]
File "/home/......../lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 404, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/home/......../lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
return await self.app(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/home/......../lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/home/......../lib/python3.10/site-packages/starlette/middleware/cors.py", line 93, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "/home/......../lib/python3.10/site-packages/starlette/middleware/cors.py", line 148, in simple_response
await self.app(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/......../lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/......../lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/home/......../lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/......../lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/......../lib/python3.10/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/home/......../lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/home/......../lib/python3.10/site-packages/fastapi/routing.py", line 193, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/home/......../lib/python3.10/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/home/......../lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/home/......../lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/home/......../lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/home/......../lib/python3.10/site-packages/prodigy/app.py", line 439, in get_session_questions
return _shared_get_questions(controller, session_id, excludes=req.excludes)
File "/home/......../lib/python3.10/site-packages/prodigy/app.py", line 387, in _shared_get_questions
tasks = controller.get_questions(session_id=session_id, excludes=excludes)
File "cython_src/prodigy/core.pyx", line 528, in prodigy.core.Controller.get_questions
File "cython_src/prodigy/core.pyx", line 529, in prodigy.core.Controller.get_questions
File "cython_src/prodigy/components/session.pyx", line 133, in prodigy.components.session.Session.get_questions
File "cython_src/prodigy/components/stream.pyx", line 324, in iter_queue
File "cython_src/prodigy/components/stream.pyx", line 304, in prodigy.components.stream.Stream.get_next
File "cython_src/prodigy/components/stream.pyx", line 343, in prodigy.components.stream.Stream._get_from_iterator
File "cython_src/prodigy/components/decorators.pyx", line 165, in inner
File "cython_src/prodigy/components/preprocess.pyx", line 200, in add_tokens
File "/home/......../lib/python3.10/site-packages/spacy/language.py", line 1574, in pipe
for doc in docs:
File "/home/......../lib/python3.10/site-packages/spacy/language.py", line 1618, in pipe
for doc in docs:
File "/home/......../lib/python3.10/site-packages/spacy/language.py", line 1615, in <genexpr>
docs = (self._ensure_doc(text) for text in texts)
File "/home/......../lib/python3.10/site-packages/spacy/language.py", line 1564, in <genexpr>
docs_with_contexts = (
File "cython_src/prodigy/components/preprocess.pyx", line 193, in genexpr
File "cython_src/prodigy/components/loaders.pyx", line 35, in _add_attrs
File "cython_src/prodigy/components/filters.pyx", line 54, in filter_duplicates
File "cython_src/prodigy/components/filters.pyx", line 25, in filter_empty
File "cython_src/prodigy/components/loaders.pyx", line 29, in _rehash_stream
File "cython_src/prodigy/components/source.pyx", line 755, in load_noop
File "cython_src/prodigy/components/source.pyx", line 109, in __iter__
File "cython_src/prodigy/components/source.pyx", line 110, in prodigy.components.source.Source.__iter__
File "cython_src/prodigy/components/source.pyx", line 595, in read
File "/home/......../lib/python3.10/site-packages/srsly/_json_api.py", line 39, in json_loads
return ujson.loads(data)
ValueError: Unmatched ''"' when when decoding 'string'

Thanks @adixxov. It looks like there is a malformed example in your input dataset that can't be parsed into a JSON object. You can see that at the very end of the traceback:

return ujson.loads(data)
ValueError: Unmatched ''"' when when decoding 'string'

In order to find the malformed example, it would probably be easiest to just try parsing the input file line by line and see which line causes issues:

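from pathlib import Path
import json

with Path("your_dataset.jsonl").open("r", encoding="utf-8") as f:
    # Parse each line on its own and report the ones that are not valid JSON.
    for line_number, line in enumerate(f, start=1):
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            print(f"Problem on line {line_number}: {err}")
            print(line)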
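If it helps, here is a slightly more defensive sketch of the same line-by-line check. It is not part of Prodigy: the function names `split_jsonl_lines` and `check_jsonl_file` are made up for illustration, and the check for a "text" key is only an assumption based on the sample data posted above. It uses the standard-library `json` module, which may be stricter or more lenient than the parser Prodigy uses internally, so treat the result as a pointer to suspicious lines, not as a guarantee.

```python
import json
from pathlib import Path


def split_jsonl_lines(lines):
    """Separate parseable JSONL lines from malformed ones.

    Returns (valid, invalid): `valid` holds the parsed records,
    `invalid` holds (line_number, raw_line, reason) tuples.
    Hypothetical helper, not a Prodigy function.
    """
    valid, invalid = [], []
    for line_number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        try:
            record = json.loads(stripped)
        except json.JSONDecodeError as err:
            invalid.append((line_number, stripped, str(err)))
            continue
        # Assumption: every record should be an object with a "text" key,
        # as in the sample data shown earlier in this thread.
        if not isinstance(record, dict) or "text" not in record:
            invalid.append((line_number, stripped, 'missing "text" key'))
        else:
            valid.append(record)
    return valid, invalid


def check_jsonl_file(path):
    """Run split_jsonl_lines over a file on disk (hypothetical helper)."""
    with Path(path).open("r", encoding="utf-8") as f:
        return split_jsonl_lines(f)
```

You could call `check_jsonl_file("your_dataset.jsonl")` and print each entry of the second return value to get the line number and the reason for every line that needs fixing.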

Apart from that, there's also the problem of the traceback being swallowed like that in the Jupyter environment, which we'll of course look into.

@magdaaniol that got it working properly again. Using your super helpful script, I found the lines that were not valid JSON and fixed them. The recipe worked fine from there. Thanks so much!

Yes, it would be great to be able to trace execution from Jupyter.

Yes, we agree of course. FWIW we are planning to improve error propagation for Prodigy v2.