Example in Components and Functions documentation doesn't work as expected

Hi.

I am going through the documentation, and it seems that this example doesn't work (at least not on my computer). My Prodigy version is 1.10.5.

This is the code you posted in the documentation: link to code

from prodigy.components.filters import filter_duplicates
stream = [{"text": "foo", "label": "bar"}, {"text": "foo", "label": "bar"}, {"text": "foo"}]
stream = filter_duplicates(stream, by_input=False, by_task=True)
# [{'text': 'foo', 'label': 'bar'}, {'text': 'foo'}]
stream = filter_duplicates(stream, by_input=True, by_task=True)
# [{'text': 'foo', 'label': 'bar'}]

To iterate over the stream and see its elements, I added:

list(stream)

This raises the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-159-36d117afad89> in <module>
      5 
      6 stream = filter_duplicates(stream, by_input=False, by_task=True)
----> 7 list(stream)

cython_src/prodigy/components/filters.pyx in filter_duplicates()

KeyError: '_task_hash'

I don't know if this is the expected output. If it is, I think a reproducible example that runs without errors would be better.

Thanks in advance.

Sergio M.

Thanks for the heads-up! It looks like the example is actually missing the line that sets the hashes (which are then used in the filtering). So it should look like this:

from prodigy.components.filters import filter_duplicates
from prodigy import set_hashes

stream = [{"text": "foo", "label": "bar"}, {"text": "foo", "label": "bar"}, {"text": "foo"}]
stream = [set_hashes(eg) for eg in stream]
stream = filter_duplicates(stream, by_input=False, by_task=True)
# [{'text': 'foo', 'label': 'bar', '_input_hash': ..., '_task_hash': ...}, {'text': 'foo', '_input_hash': ..., '_task_hash': ...}]
stream = filter_duplicates(stream, by_input=True, by_task=True)
# [{'text': 'foo', 'label': 'bar', '_input_hash': ..., '_task_hash': ...}]
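
To make the mechanics concrete, here is a rough pure-Python sketch of why the hashes matter and why the error only showed up at list(stream). The helpers toy_set_hashes and toy_filter_duplicates are hypothetical stand-ins written for illustration, not Prodigy's implementation. The md5-over-JSON hashing and the choice of "text" and "label" as hash keys are assumptions as well.

```python
import hashlib
import json


def toy_set_hashes(task, input_keys=("text",), task_keys=("label",)):
    """Hypothetical stand-in for set_hashes: returns a copy with two hash fields."""
    def digest(keys):
        # Assumption: hash a JSON dump of the selected keys (not Prodigy's real scheme).
        payload = json.dumps({k: task.get(k) for k in keys}, sort_keys=True)
        return hashlib.md5(payload.encode("utf8")).hexdigest()

    hashed = dict(task)
    hashed["_input_hash"] = digest(tuple(input_keys))
    # The task hash covers the input keys plus the task-specific keys.
    hashed["_task_hash"] = digest(tuple(input_keys) + tuple(task_keys))
    return hashed


def toy_filter_duplicates(stream, by_input=False, by_task=True):
    """Hypothetical stand-in for filter_duplicates: a lazy generator keyed on hashes."""
    seen_input, seen_task = set(), set()
    for eg in stream:
        # Direct key access: a task without hashes raises KeyError here,
        # but only once the generator is actually consumed.
        task_hash = eg["_task_hash"]
        input_hash = eg["_input_hash"]
        if (by_task and task_hash in seen_task) or (by_input and input_hash in seen_input):
            continue
        seen_task.add(task_hash)
        seen_input.add(input_hash)
        yield eg


examples = [{"text": "foo", "label": "bar"}, {"text": "foo", "label": "bar"}, {"text": "foo"}]

# Without hashes: creating the generator is fine, consuming it fails.
lazy = toy_filter_duplicates(examples)
try:
    list(lazy)
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

# With hashes: same two-step filtering as in the example above.
hashed = [toy_set_hashes(eg) for eg in examples]
by_task = list(toy_filter_duplicates(hashed, by_input=False, by_task=True))
by_both = list(toy_filter_duplicates(by_task, by_input=True, by_task=True))
```

In this sketch the filter is a generator, so nothing runs until list() consumes it. That matches the traceback above, where the KeyError points at the list(stream) line rather than at the filter_duplicates call.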

I'll update the docs example accordingly :slightly_smiling_face:

Okay.

Now it works. Thanks :slight_smile: