Welcome to the forum @it176131! 
Where do I find current source code for a Prodigy recipe seeing that the public repo is not actively maintained?
You're absolutely right, and we appreciate you pointing it out. Our public recipes repo does lag behind our current releases, and that's something we need to improve. We're a small team stretched across core development and user support, so keeping the open source recipes repo perfectly in sync is challenging, which is why we really appreciate posts like this one.
In the meantime, the source code is the best place to look for the current implementation. This is actually why we removed Cython encryption from v1.16 onwards: to make the code more transparent and easier to debug and understand directly.
The source code for the ner.silver-to-gold recipe can be found at recipes/ner.py inside your Prodigy installation directory. If you need to double-check where Prodigy is installed on your machine, you can run prodigy stats, which prints the path to stdout under the Location key.
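If you'd rather look the file up from Python, here's a minimal sketch that only uses the standard library. The helper name locate_module_file is made up for illustration, and the recipes/ner.py location is taken from the description above, so it's worth double-checking against your installed version.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def locate_module_file(package: str, relative_path: str) -> Optional[Path]:
    """Return the path to `relative_path` inside an installed package, or None.

    Hypothetical helper: it only inspects where the package lives on disk
    and does not import the package itself.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    # Plain modules (as opposed to packages) have no search locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumes Prodigy is installed in the active environment;
    # prints None if the package or the file cannot be found.
    print(locate_module_file("prodigy", "recipes/ner.py"))
```

The same call with "models/ner.py" should point you to the file that contains make_best, assuming the layout described in this post.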
How do I get the metadata in my “silver” data to show up in the ner.silver-to-gold UI?
The recipe's main job is to take all the different "silver" annotations you've made for the same piece of text and merge them into one new, combined example. When it creates this new merged example, the code was designed to only build the most essential parts: the text and the new, merged spans. It doesn't have a built-in mechanism to know which of the potentially conflicting meta fields from your different silver annotations to keep, so it simply doesn't copy any of them over to the new example. It's really hard to know upfront what logic to apply.
That said, since you have access to the source code, you can definitely persist this information by modifying the function that builds the new example, i.e. the make_best method called on line 390 of the recipe:
stream = model.make_best(data)
This method is defined in models/ner.py. Here's an example modification that takes the meta from the first example in the group, but you can of course customize it if you need to merge the values of the meta field in a different way. Note that it calls copy.deepcopy, so make sure copy is imported at the top of the file:
def make_best(self, examples: Iterable[TaskType]) -> Iterable[TaskType]:
    """Add spans to a dataset for the best predictions, using the model and
    previous annotation decisions.
    """
    log("MODEL: Get best predictions for examples")
    golds = merge_spans(examples)
    for batch in partition_all(32, golds):
        batch = list(batch)
        batch_texts = [eg["text"] for eg in batch]
        batch_annots = [eg["spans"] for eg in batch]
        beam = _BatchBeam(self.nlp, batch_texts, w=16, b=NER_DEFAULT_BEAM_DENSITY)
        for i, parse in enumerate(
            beam.predict_best(batch_annots, max_wrong=None, min_right=None)
        ):
            original_eg = batch[i]
            # copy the original example
            eg = copy.deepcopy(original_eg)
            # overwrite spans rather than create the example from scratch
            eg["spans"] = parse
            eg[BINARY_ATTR] = False
            eg = set_hashes(eg)
            yield eg
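If taking the meta from a single example isn't enough, one option is to combine the meta dicts of all silver examples for the same text yourself. The following is a rough sketch in plain Python and not part of Prodigy: merge_metas is a hypothetical helper, and collecting conflicting values into a list is just one possible policy.

```python
from typing import Any, Dict, Iterable, List


def merge_metas(metas: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine several meta dicts into one (hypothetical helper).

    Keys with a single distinct value keep that value. Keys with
    conflicting values get a list of the distinct values in first-seen order.
    """
    collected: Dict[str, List[Any]] = {}
    for meta in metas:
        for key, value in meta.items():
            values = collected.setdefault(key, [])
            if value not in values:
                values.append(value)
    return {
        key: values[0] if len(values) == 1 else values
        for key, values in collected.items()
    }


# Example: two silver annotations of the same text with partly conflicting meta
silver_metas = [
    {"source": "wiki", "annotator": "alice"},
    {"source": "wiki", "annotator": "bob"},
]
merged = merge_metas(silver_metas)
# merged == {"source": "wiki", "annotator": ["alice", "bob"]}
```

Because a conflict turns the value into a list, this policy is ambiguous if your meta values can already be lists. In that case a different rule, such as always keeping the first or the most recent value, may be a better fit.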
Alternatively, you can do something similar directly at the recipe level by storing the meta values keyed by input hash and then re-adding them to the stream:
# Store meta information keyed by input hash, so we can add it back later
metas_by_hash = {}
for eg in data:
    if "meta" in eg and INPUT_HASH_ATTR in eg:
        metas_by_hash[eg[INPUT_HASH_ATTR]] = eg["meta"]

def add_meta_back(stream: StreamType) -> StreamType:
    for eg in stream:
        eg_copy = copy.deepcopy(eg)
        if INPUT_HASH_ATTR in eg_copy and eg_copy[INPUT_HASH_ATTR] in metas_by_hash:
            eg_copy["meta"] = metas_by_hash[eg_copy[INPUT_HASH_ATTR]]
        yield eg_copy

stream = Stream(GeneratorSource(iter(stream)), loader=load_noop, wrappers=[])
stream.apply(add_meta_back)
stream.apply(filter_stream)
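One thing to keep in mind with the lookup above: if several silver examples share the same input hash, each one overwrites the previous entry, so the meta of the last one wins. Here's a small self-contained sketch that uses plain dicts instead of a Prodigy stream and keeps the first meta instead. The "_input_hash" key is an assumption about what INPUT_HASH_ATTR resolves to, and collect_first_metas is a hypothetical name.

```python
import copy
from typing import Any, Dict, Iterable, Iterator

# Assumption: this is the key that INPUT_HASH_ATTR refers to.
INPUT_HASH_KEY = "_input_hash"


def collect_first_metas(examples: Iterable[Dict[str, Any]]) -> Dict[Any, Any]:
    """Map each input hash to the meta of the first example that carries one."""
    metas: Dict[Any, Any] = {}
    for eg in examples:
        if "meta" in eg and INPUT_HASH_KEY in eg:
            # setdefault leaves an existing entry alone, so the first meta wins
            metas.setdefault(eg[INPUT_HASH_KEY], eg["meta"])
    return metas


def add_meta_back(
    stream: Iterable[Dict[str, Any]], metas_by_hash: Dict[Any, Any]
) -> Iterator[Dict[str, Any]]:
    """Yield copies of the examples with the stored meta re-attached."""
    for eg in stream:
        eg_copy = copy.deepcopy(eg)
        input_hash = eg_copy.get(INPUT_HASH_KEY)
        if input_hash in metas_by_hash:
            # Copy the meta too, so examples don't share one mutable dict
            eg_copy["meta"] = copy.deepcopy(metas_by_hash[input_hash])
        yield eg_copy
```

Swapping the setdefault call for a plain assignment gives you the last-wins behaviour of the snippet above, so the choice comes down to which silver annotation you trust more.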