I've trained a model for Luxembourgish using a tag_map based on the German TIGER tagset:
TAG_MAP = {
    "$(": {POS: PUNCT, "PunctType": "brck"},
    "$,": {POS: PUNCT, "PunctType": "comm"},
    "$.": {POS: PUNCT, "PunctType": "peri"},
    "ADJA": {POS: ADJ},
    "ADJD": {POS: ADJ},
    "ADV": {POS: ADV},
    "APPO": {POS: ADP, "AdpType": "post"},
    "APPR": {POS: ADP, "AdpType": "prep"},
    "APPRART": {POS: ADP, "AdpType": "prep", "PronType": "art"},
    "APZR": {POS: ADP, "AdpType": "circ"},
    "ART": {POS: DET, "PronType": "art"},
    "CARD": {POS: NUM, "NumType": "card"},
    ...
}
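To make the two tag levels concrete, here is a minimal, self-contained sketch that groups the fine-grained tags by the coarse-grained tag they map to. It is only an illustration: plain strings stand in for the spaCy POS symbols so that it runs without spaCy, the helper name `group_by_coarse_tag` is made up for this example, and only the entries listed above are included (the elided rest of the map is left out).

```python
from collections import defaultdict

# Assumption: a plain string key stands in for spaCy's POS symbol,
# and the coarse tags are plain strings rather than spaCy symbols.
POS = "pos"

# Only the entries shown in the post; the elided part is omitted.
TAG_MAP = {
    "$(": {POS: "PUNCT", "PunctType": "brck"},
    "$,": {POS: "PUNCT", "PunctType": "comm"},
    "$.": {POS: "PUNCT", "PunctType": "peri"},
    "ADJA": {POS: "ADJ"},
    "ADJD": {POS: "ADJ"},
    "ADV": {POS: "ADV"},
    "APPO": {POS: "ADP", "AdpType": "post"},
    "APPR": {POS: "ADP", "AdpType": "prep"},
    "APPRART": {POS: "ADP", "AdpType": "prep", "PronType": "art"},
    "APZR": {POS: "ADP", "AdpType": "circ"},
    "ART": {POS: "DET", "PronType": "art"},
    "CARD": {POS: "NUM", "NumType": "card"},
}


def group_by_coarse_tag(tag_map):
    """Return a dict mapping each coarse tag to its fine-grained tags."""
    groups = defaultdict(list)
    for fine_tag, features in tag_map.items():
        groups[features[POS]].append(fine_tag)
    return dict(groups)


coarse_to_fine = group_by_coarse_tag(TAG_MAP)
for coarse, fine in coarse_to_fine.items():
    print(coarse, "<-", ", ".join(fine))
```

Several fine-grained tags collapse into one coarse tag (for example, four adposition tags all map to ADP), so a label set built from the coarse tags alone cannot distinguish them.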
Now, when I use Prodigy to further annotate and correct the POS tags, I am only offered the universal coarse-grained tag set:
python3 -m prodigy pos.correct mypos_new model-december2020/model-best/ rtl_radio.jsonl
ℹ Using universal coarse-grained POS tags: ADJ, ADP, ADV, AUX, CONJ,
CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X,
SPACE
Is this the intended behaviour, or is it also possible to use the custom fine-grained POS tags in Prodigy?
Thanks!