@snd507 and @Kairine, because of your reports I was able to reproduce the problem and come up with fixes for the duplicate entries you see when using force_stream_order. The fixes will be included in the next release. There were a few problems:
The Python app recently transitioned to FastAPI, which runs requests in threads, and this exposed a data race when the server was trying to receive answers and return new tasks at the same time. To fix this, I moved the thread lock that we use up from the Feed class to the Controller class, where it can lock both reads and writes to the state.
The frontend had an off-by-one error while using force_stream_order and asking the server for new questions. Specifically, it didn't include the example you were currently answering in the list of examples to exclude from the next batch. This resulted in the server sending back a duplicate of the current example when asked for new questions.
The frontend didn't prevent you from asking the server for more questions while an earlier request for questions was still in progress. This meant that if you answered questions very quickly (e.g. holding down a shortcut key) you could cause the client to call get_session_questions multiple times in a row. With force_stream_order this meant you could get duplicates.
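To illustrate the first fix, here is a minimal sketch of holding one lock at the controller level so that handing out tasks and receiving answers can't interleave. It is not the actual implementation: the Feed and Controller classes below are simplified, hypothetical stand-ins, and every method name on them is made up for the example.

```python
import threading


class Feed:
    """Hypothetical stand-in for the task stream (not the real class)."""

    def __init__(self, examples):
        self._examples = list(examples)
        self._position = 0

    def next_batch(self, size):
        # Not thread-safe on its own: the caller is expected to hold a lock.
        batch = self._examples[self._position:self._position + size]
        self._position += len(batch)
        return batch


class Controller:
    """Hypothetical controller that owns a single lock for all shared state."""

    def __init__(self, examples):
        self._lock = threading.Lock()
        self._feed = Feed(examples)
        self._answers = []

    def get_questions(self, size):
        # Reads of the stream state happen under the controller's lock.
        with self._lock:
            return self._feed.next_batch(size)

    def receive_answers(self, answers):
        # Writes happen under the same lock, so they can't interleave with reads.
        with self._lock:
            self._answers.extend(answers)

    def answers(self):
        # Return a copy so callers never see the list while it is being modified.
        with self._lock:
            return list(self._answers)
```

Because both methods take the same lock, several request threads can call get_questions and receive_answers at once without two of them being handed the same batch.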
To confirm the fixes, I used the dataset of 60 examples above and held down a shortcut key to answer the questions as quickly as possible until there were no more questions. Before the fixes were applied, this resulted in a total number of examples that varied from run to run but was always greater than 60. After applying the fixes I ran multiple tests with named and unnamed sessions, and they all resulted in a total of 60 annotations.
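The off-by-one problem and the 60-example check can be modeled with a small toy simulation. This is only a sketch under simplifying assumptions: serve_batch and annotate_all are hypothetical helpers written for this example, and the real client and server do considerably more than this.

```python
def serve_batch(stream, exclude, size):
    """Hypothetical server: return the next `size` examples not in `exclude`."""
    return [ex for ex in stream if ex not in exclude][:size]


def annotate_all(stream, batch_size, include_current):
    """Simulate a client that requests more questions while one is still open."""
    answered = []
    queue = serve_batch(stream, set(), batch_size)
    while queue:
        current = queue.pop(0)
        if not queue:
            # The queue ran dry, so ask for more before `current` is answered.
            exclude = set(answered)
            if include_current:
                # The fix: also exclude the example currently on screen.
                exclude.add(current)
            queue = serve_batch(stream, exclude, batch_size)
        answered.append(current)
    return answered


stream = list(range(60))
before_fix = annotate_all(stream, batch_size=5, include_current=False)
after_fix = annotate_all(stream, batch_size=5, include_current=True)

print(len(before_fix) > 60)  # duplicates push the total above 60
print(len(after_fix) == 60)  # every example is answered exactly once
```

In this model, leaving the current example out of the exclude list makes the server resend it at the start of each new batch, which is the same kind of duplicate described above.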
Thanks so much for your help tracking down these issues. We'll update this thread when the next version is released.