Hi @ale,
Some answers inline:
- In the documentation, where are task_keys extracted from? The default is ("spans", "label", "options"). Are these from the recipe dictionary or attributes of each example or somewhere else?
These are extracted from the attributes of each example, yes. The built-in recipes create certain task structures (dictionaries) specific to each recipe. Thus, if you want to add a custom task key for the hashing function to use, it should be a first-level key on the task dictionary.
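To make the "first-level key" point concrete, here is a minimal sketch. It is a toy stand-in and not Prodigy's actual hashing code: the function toy_task_hash, the "category" key and the example tasks are all made up for illustration, and this toy version only looks at the listed task keys.

```python
import hashlib
import json


def toy_task_hash(task, task_keys=("spans", "label", "options")):
    """Toy stand-in for a task hash (NOT Prodigy's implementation)."""
    # Only first-level keys of the task dictionary are looked up.
    relevant = {key: task[key] for key in task_keys if key in task}
    payload = json.dumps(relevant, sort_keys=True)
    return hashlib.sha1(payload.encode("utf8")).hexdigest()


# Hypothetical tasks that differ only in a custom "category" attribute.
task_a = {"text": "Cats sleep a lot.", "category": "health"}
task_b = {"text": "Cats sleep a lot.", "category": "behaviour"}
# Same attribute, but nested one level down instead of at the top level.
nested = {"text": "Cats sleep a lot.", "meta": {"category": "behaviour"}}

custom_keys = ("spans", "label", "options", "category")

# With the default keys, the custom attribute does not affect the hash.
default_same = toy_task_hash(task_a) == toy_task_hash(task_b)
# Once "category" is listed as a task key, the hashes differ.
custom_differ = toy_task_hash(task_a, custom_keys) != toy_task_hash(task_b, custom_keys)
# A nested "category" is not found, so it contributes nothing to the hash.
nested_ignored = toy_task_hash(nested, custom_keys) == toy_task_hash({"text": "x"}, custom_keys)
```

In this sketch the nested attribute is invisible to the hash because only top-level keys are consulted, which is why a custom key has to sit directly on the task dictionary.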
- Using the custom recipe cat-facts example from above, I ran a small test with two annotators: jane and joe. First, I annotated sentences with "labels": ["RELEVANT"] with jane. Then I changed the recipe's labels to "labels": ["CAT"] and annotated with joe. For both annotators, the same sentences have equal input hashes (expected) but also same task hashes. Shouldn't the task hashes be different because I'm using different labels?
The default keys used for computing the task_hash are: spans, label, options, arcs. If you look closely, there's no label attribute on the custom task here. The label attribute is stored for binary classification tasks. In this case, the config attribute labels is used for determining the UI, and the labels will be stored under spans if there are any. Thus, for NER, the task hash is affected by pre-existing spans, not by the set of labels available. The idea is to distinguish between the "kinds" of annotation or what is being annotated, not particular label sets.
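The jane/joe experiment can be replayed with a small sketch. Again, sketch_task_hash is an illustrative stand-in rather than Prodigy's real hashing, and the option values and example text are assumptions. The point it shows is that the label set lives in the recipe config, so it never reaches the task that gets hashed.

```python
import hashlib
import json

TASK_KEYS = ("spans", "label", "options", "arcs")


def sketch_task_hash(task):
    """Illustrative stand-in: hash only the task keys present on the task."""
    relevant = {key: task[key] for key in TASK_KEYS if key in task}
    payload = json.dumps(relevant, sort_keys=True)
    return hashlib.sha1(payload.encode("utf8")).hexdigest()


# Hypothetical options and text, standing in for the cat-facts stream.
options = [{"id": 1, "text": "fact"}, {"id": 2, "text": "myth"}]
text = "Cats have five toes on their front paws."

# The label sets are part of the recipe config, not of the task itself.
jane_config = {"labels": ["RELEVANT"]}
joe_config = {"labels": ["CAT"]}

# Both annotators therefore receive identical task dictionaries.
jane_task = {"text": text, "options": options}
joe_task = {"text": text, "options": options}

# A task that arrives with a pre-existing span does look different.
pre_annotated = {
    "text": text,
    "options": options,
    "spans": [{"start": 0, "end": 4, "label": "CAT"}],
}

same_for_both = sketch_task_hash(jane_task) == sketch_task_hash(joe_task)
spans_change_hash = sketch_task_hash(pre_annotated) != sketch_task_hash(jane_task)
```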
- On a separate topic, is it possible to have more than 1 interface of the same interface type? For example, a custom recipe with two choice interfaces, each with different options.
Technically, you could define multiple choice blocks. You would need to add the respective options as the value of the "options" key in each block definition:
blocks = [
    {"view_id": "ner_manual"},
    {"view_id": "choice", "text": None, "options": options},
    {"view_id": "choice", "text": None, "options": options2},
    {"view_id": "text_input", "field_rows": 3, "field_label": "Explain your decision"},
]
Please note that all answers are written under the same accept key. To be able to select options from both blocks, you would need to switch to the "multiple" choice style; with the "single" style, only one answer is permitted across both blocks. Also, by default, the keyboard shortcuts are the same for both blocks, so you might want to modify them or disable them entirely via custom JavaScript.
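Since both blocks share the accept key, the selections need to be separated again after annotation. The following is a minimal sketch of one way to do that, assuming the option IDs are unique across the two blocks. The option values, the split_accept helper and the annotated example are hypothetical; "choice_style" is the config setting that switches between single and multiple choice.

```python
# Hypothetical option sets; the IDs must not overlap between the two blocks.
options = [{"id": "fact", "text": "Fact"}, {"id": "myth", "text": "Myth"}]
options2 = [{"id": "funny", "text": "Funny"}, {"id": "boring", "text": "Boring"}]

# Allow more than one selected option, so each block can get an answer.
config = {"choice_style": "multiple"}


def split_accept(task, *option_groups):
    """Split the shared "accept" list back into one list per choice block."""
    accepted = set(task.get("accept", []))
    return [
        [opt["id"] for opt in group if opt["id"] in accepted]
        for group in option_groups
    ]


# Assumed shape of a saved annotation: selected option IDs under "accept".
annotated = {"text": "Cats purr.", "accept": ["fact", "funny"], "answer": "accept"}

first_block, second_block = split_accept(annotated, options, options2)
```

If the two blocks reused the same IDs, the selections could not be told apart afterwards, so distinct IDs are the simplest safeguard.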
If you want more flexibility/control over the final UI, you can always use custom HTML and JavaScript and build your own form with multiple checkboxes / radio button groups. The window.prodigy.update callback lets you update the current task with any custom data, like information about the checkbox that was selected. Here's a straightforward example of a custom checkbox:
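As a minimal sketch of such a checkbox, assuming the markup is passed to an html block via html_template and the script via the "javascript" config setting: the field name needs_review, the element id and the function name updateFromCheckbox are made up for illustration.

```python
# Markup for the custom checkbox (element id and label text are hypothetical).
HTML_TEMPLATE = """
<label>
  <input type="checkbox" id="needs-review" onchange="updateFromCheckbox()" />
  Needs review
</label>
"""

# Script that copies the checkbox state onto the current task.
JAVASCRIPT = """
function updateFromCheckbox() {
  const checked = document.querySelector('#needs-review').checked
  // Store the state under a custom, made-up key on the task.
  window.prodigy.update({ needs_review: checked })
}
"""

blocks = [
    {"view_id": "ner_manual"},
    {"view_id": "html", "html_template": HTML_TEMPLATE},
]

# These entries would go into the "config" returned by the recipe.
config = {"blocks": blocks, "javascript": JAVASCRIPT}
```

A real recipe may also need to reset the checkbox when the next task loads; that part is left out of this sketch.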
- Is it possible to add a static question (or title) above the choice interface in the cat-facts recipe?
Yes, you can achieve that by adding an html block on top of the existing choice blocks:
blocks = [
    {"view_id": "ner_manual"},
    {"view_id": "html"},
    {"view_id": "choice", "text": None, "html": None, "options": options},
    {"view_id": "text_input", "field_rows": 3, "field_label": "Explain your decision"},
]
Note that, similarly to text, html has to be set to None in the choice block definition to prevent the content from appearing twice.
The html view_id expects an html field on the task, so that will have to be added while you're creating the tasks:
import requests

def get_stream():
    res = requests.get("https://cat-fact.herokuapp.com/facts").json()
    for fact in res:
        # "html" feeds the html block; options is defined elsewhere in the recipe
        yield {"text": fact["text"], "options": options, "html": "<h2>This is my static question</h2>"}
You can also add extra styling, of course. Please check the custom interfaces section on HTML and CSS for examples.
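If the stream comes from somewhere other than the request above, the same field can be attached in a small wrapper. This is only a sketch: the helper name with_static_question and the sample facts are hypothetical, and it runs without any network access.

```python
def with_static_question(stream, question_html):
    """Yield copies of the incoming tasks with an "html" field attached."""
    for task in stream:
        enriched = dict(task)  # copy, so the original task is left untouched
        enriched["html"] = question_html
        yield enriched


# Made-up sample tasks standing in for the cat facts.
facts = [{"text": "Cats sleep most of the day."}, {"text": "Cats purr."}]
question = "<h2>This is my static question</h2>"

tasks = list(with_static_question(facts, question))
```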
- Is it possible to add theme options to the recipe so that it is not necessary to specify them in prodigy.json? For example relationHeight and relationHeightWrap from the documentation.
Yes, Prodigy merges the configuration from the global and the local prodigy.json, CLI overrides and the config key returned from the recipe. So you can return a custom_theme dictionary under the config key of the dictionary returned from the recipe:
return {
    "dataset": dataset,  # the dataset to save annotations to
    "view_id": "blocks",  # set the view_id to "blocks"
    "stream": stream,  # the stream of incoming examples
    "config": {
        "labels": ["CAT"],  # the labels for the manual NER interface
        "blocks": blocks,  # add the blocks to the config
        "custom_theme": {"buttonSize": 500},  # set custom theme options
    },
}
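For the two relation settings mentioned in the question, the same pattern might look like the sketch below. The helper build_components, the dataset name and the numeric values are assumptions chosen only for illustration; check the documentation for the values that suit your interface.

```python
def build_components(dataset, stream, blocks):
    """Assemble the dictionary a recipe returns, including theme overrides."""
    return {
        "dataset": dataset,
        "view_id": "blocks",
        "stream": stream,
        "config": {
            "labels": ["CAT"],
            "blocks": blocks,
            # Arbitrary example values, not recommended settings.
            "custom_theme": {"relationHeight": 200, "relationHeightWrap": 60},
        },
    }


# Hypothetical dataset name, an empty stream and a single block.
components = build_components("cat_facts", iter([]), [{"view_id": "ner_manual"}])
theme = components["config"]["custom_theme"]
```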