Prodigy process freezing on start-up in Docker

Hello,

I followed the documentation in my attempt to Dockerize Prodigy and came up with the following configuration / scripts:

Dockerfile

FROM python:3.8-slim-buster

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

RUN apt-get update && apt-get install -y \
    build-essential \
    libssl-dev \
    curl

COPY prodigy*.whl /usr/src/app/
COPY prodigy.json /usr/src/app/

RUN pip install spacy==2.3.5 && python -m spacy download en_core_web_sm
RUN pip install ./prodigy*.whl

EXPOSE 8080

COPY scripts /usr/src/app/scripts

COPY project_ner /usr/src/app/project_ner

CMD ["./scripts/start.sh"]

prodigy.json

{
  "host":"0.0.0.0",
  "port":8080,
  "db": "sqlite",
  "db_settings": {
    "sqlite": {
      "name": "prodigy.db",
      "path": "/proj"
    }
  }
}

start.sh

#! /bin/bash
export PYTHONPATH=$PWD

case "$START" in
    "manual")
        prodigy ner.manual project project_ner /proj/data.jsonl --label DOCUMENT,PERSON,ORG,ISSUE,ACTIVITY
    ;;
    "binary")
        prodigy ner.teach project project_ner /proj/data.jsonl --label DOCUMENT,PERSON,ORG,ISSUE,ACTIVITY
    ;;
    "active")
        prodigy ner.correct project project_ner /proj/data.jsonl  --label DOCUMENT,PERSON,ORG,ISSUE,ACTIVITY
    ;;
    *)
        echo "Unknown start setting: \"$START\"" >&2
        exit 1
    ;;
esac

Unfortunately, the container freezes when I attempt to run it with the following command:

docker run -v proj:/proj -p 8080:8080 -e START=manual <image_id>

There are no logs, errors or anything else that indicates a configuration issue.

Removing "host":"0.0.0.0" from prodigy.json produces the expected error:

[Errno 99] error while attempting to bind on address ('::1', 8080, 0, 0): cannot assign requested address

I went through most Docker-related posts here, but couldn't find anything mentioning an issue quite like that...

P.S. I've tried building the container with Python 3.6, 3.7 & 3.8 (slim-buster) as a base. Same result.

Any help or nudge in the right direction would be greatly appreciated!

To set the host/port, I use environment variables in the Dockerfile, and it works for me:

ENV PRODIGY_HOST=0.0.0.0
ENV PRODIGY_PORT=8080

Unfortunately, we've already tried that and it makes no difference whatsoever. Thank you for your suggestion though!

A few ideas...

Looking at the command you use to run the container: with "-v proj:/proj" you are mounting a named Docker volume called proj, not a folder from the host. Are you sure that volume contains the file data.jsonl? If you want to map a folder from the host instead, you should use "-v /absolute/path/to/folder:/proj".
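In case it helps, here is a rough sketch of how the two forms of "-v" differ. The mount_kind function is a made-up helper for illustration, not a Docker command, and it assumes the simple rule that a source starting with "/" is a host path while anything else is taken as a volume name:

```shell
#!/bin/bash
# Hypothetical helper: classify the SOURCE half of a `-v SOURCE:TARGET` argument.
# Assumption: a source that starts with "/" is a host path (bind mount);
# anything else is treated as the name of a Docker volume.
mount_kind() {
    local source="${1%%:*}"   # keep everything before the first ":"
    case "$source" in
        /*) echo "bind mount" ;;
        *)  echo "named volume" ;;
    esac
}

mount_kind "proj:/proj"                       # prints: named volume
mount_kind "/absolute/path/to/folder:/proj"   # prints: bind mount
```

To see what is actually inside the volume, something like "docker run --rm -v proj:/proj <image_id> ls -l /proj" should list its contents, and "docker volume inspect proj" shows where Docker stores it.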

If the file is there, I would try running the command manually:

docker run -it -v proj:/proj -p 8080:8080 -e START=manual <image_id> /bin/bash

Once you are inside the container, run the command and see what happens:

prodigy ner.manual project project_ner /proj/data.jsonl --label DOCUMENT,PERSON,ORG,ISSUE,ACTIVITY
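
While you are in there, a small pre-flight check can rule out a missing or empty input file before Prodigy is started. This is only a sketch: check_input is a hypothetical helper (not part of Prodigy), and the path is the one from your start.sh:

```shell
#!/bin/bash
# Hypothetical helper (not part of Prodigy): fail loudly if the input
# file is missing or empty, instead of starting the server without data.
check_input() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "Input file not found: $file" >&2
        return 1
    fi
    if [ ! -s "$file" ]; then
        echo "Input file is empty: $file" >&2
        return 1
    fi
    echo "Found $file ($(wc -l < "$file") lines)"
}

# Example: check the path used in start.sh before launching a recipe.
check_input /proj/data.jsonl || echo "Fix the volume mapping before starting Prodigy"
```

The same function could be called at the top of start.sh so the container exits with a readable message when the volume does not contain the data.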

Thanks! Well, apart from not seeing the usual Prodigy start-up messages, after reinstalling Docker (version 20.10.2) everything seems to be working fine.

I've no idea what might've gone wrong there, but the issue is resolved. Thank you for your input!