I am getting the error below when using spacy-llm to get entity labels for a text:
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
My Python file:
from spacy_llm.util import assemble
from dotenv import load_dotenv
load_dotenv()
nlp = assemble('config.cfg')
doc = nlp("""
In the depths of the ocean, an intrepid explorer named Jane Smith.
a dedicated member of the Marine Conservation Society.
embarked on a remarkable journey to study the diverse marine life that inhabits the underwater world.
Her expedition led her to a hidden reef teeming with exotic fish species and surrounded by a stunning coral garden.
Jane's findings contributed to our understanding of the delicate ecosystems in these remote aquatic environments.
shedding light on the importance of protecting these fragile habitats.
As she continued her exploration, Jane also encountered a local fishing community that depended on
the ocean for their livelihoods, highlighting the intricate relationship between humans and the underwater world.
""")
#print(doc)
# Print each entity the LLM component found, with its label
for ent in doc.ents:
    print(ent, ent.label_)
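As far as I can tell, the message above is logged at warning level by the Hugging Face transformers library during text generation rather than raised as a Python exception. A minimal sketch, assuming that is the case, of how that logger's level could be raised so the line is not printed. `quiet_transformers` is a hypothetical helper name, and the logger name "transformers" is an assumption about how the library names its loggers:

```python
import logging


def quiet_transformers(level: int = logging.ERROR) -> None:
    """Raise the level of the "transformers" logger so warnings are not shown."""
    # Assumption: the message is emitted through a logger under the
    # "transformers" namespace, so child loggers inherit this level.
    logging.getLogger("transformers").setLevel(level)


quiet_transformers()
```

In the script above this would probably need to be called after `nlp = assemble('config.cfg')`, since the library may set its own logger level when it is first imported. It only hides the message and does not change what the model returns.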
My config.cfg:
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["UNDERWATER","COORDINATES","DEPTH","METHODS"]
description = "Entities are the names water pysical features as oceans and ll, coordinates of places, depth in metres or any scale and methods used in collecting data and samples"
[components.llm.task.label_definitions]
UNDERWATER = "Extract names of known underwater places and features e.g Seamount, Seamounts"
COORDINATES = "Extract geographic coordinates for latitude and longitude"
DEPTH = "Extract references to depth in feet or metres"
METHODS = "Extract references to collection methods e.g trawling, dredging, sampling, collecting"
[components.llm.model]
@llm_models = "spacy.Falcon.v1"
# For better performance, use dolly-v2-12b instead
name = "falcon-rw-1b"
#"Mistral-7B-v0.1"
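To rule out a mismatch between `labels` and `label_definitions` in the config above, here is a rough sanity check. It is only a sketch: spaCy loads this file with its own config system, and the standard-library `configparser` is used here as an approximation. `undefined_labels` is a hypothetical helper, and the relevant sections are copied inline so the snippet runs on its own:

```python
import configparser
import json

# Inline copy of the task sections of config.cfg, so the sketch runs standalone.
CONFIG_TEXT = """
[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["UNDERWATER","COORDINATES","DEPTH","METHODS"]

[components.llm.task.label_definitions]
UNDERWATER = "Extract names of known underwater places and features e.g Seamount, Seamounts"
COORDINATES = "Extract geographic coordinates for latitude and longitude"
DEPTH = "Extract references to depth in feet or metres"
METHODS = "Extract references to collection methods e.g trawling, dredging, sampling, collecting"
"""


def undefined_labels(config_text: str) -> list:
    """Return the labels in the task section that have no label definition."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case; configparser lowercases keys by default
    parser.read_string(config_text)
    # The labels value is written as a JSON-style list in the config
    labels = json.loads(parser["components.llm.task"]["labels"])
    definitions = parser["components.llm.task.label_definitions"]
    return [label for label in labels if label not in definitions]


missing = undefined_labels(CONFIG_TEXT)
print("labels without a definition:", missing)
```

For the config as posted, every label has a matching definition, so the list comes back empty.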
Using:
spacy==3.7.2
prodigy==1.14.10