span categorization

kushal_pythonist · March 23, 2023, 2:38pm

I have resume data and I want to segment the profile, name, contact, and email from each resume. I performed the spancat annotations in Prodigy and followed the tutorial in this video.

I'm confused about how to run inference with the spancat model and what to do next, as that is not covered in the videos. Can you guide me on what to do next? Yes, I have the train.spacy and dev.spacy files.

NOTE: My resume data looks like this:

{"text": "Name: helen helen E-Mail: helen.helen@gmail.com Address: Hong Kong, Hong Kong Github: https://github.com/helen LinkedIn: https://linkedin.com/helen Phone No. 192094070156"}

This is the data. Here is how I annotated it: the whole text above is labelled PROFILE, and the parts inside it are given the labels PROFILE_NAME, PROFILE_EMAIL, PROFILE_ADDRESS, PROFILE_PHONE.

Am I doing something wrong here? Please guide me. Any help is appreciated.

ryanwesslen · March 23, 2023, 2:56pm

hi @kushal_pythonist,

By inference, do you mean training?

Since you have the spacy binary files, I presume you ran data-to-spacy like in the docs:

$ prodigy data-to-spacy ./corpus --spancat covid_articles

To use this data for training with spaCy, you can run:
python -m spacy train ./corpus/config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy

Like the output states, the next step is to run spacy train. Did you do that?

Can you describe what steps are confusing?

kushal_pythonist · March 23, 2023, 4:16pm

Well, I need to check whether the trained spancat model is performing well or badly.

Yes.

Yes, I have done that, but I'm confused about what to do next and have no insight into how to go further. Can you enlighten me?

sherpa.codes · March 24, 2023, 10:33am

ryanwesslen:

$ prodigy data-to-spacy ./corpus --spancat covid_articles

To use this data for training with spaCy, you can run:
python -m spacy train ./corpus/config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy

I have trained the model using this command, but I got stuck and am confused about what to do next. Can @kushal_pythonist and @ryanwesslen help me through this and provide guidance? At the moment I have no idea how to proceed. Thanks.
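
For readers stuck at the same point, here is a minimal sketch of loading a trained spancat pipeline and reading its predictions. It rests on assumptions that are not stated in this thread: that `spacy train` was run with `--output ./output` (without an output path, no model is saved to disk), so a pipeline exists at `./output/model-best`, and that the spancat component uses the default spans key `"sc"`. The names `extract_spans` and `run_inference` are hypothetical helpers for illustration only.

```python
def extract_spans(doc, spans_key="sc"):
    """Return (label, text, start_char, end_char) for each predicted span.

    Assumes the spancat component wrote its predictions to
    doc.spans[spans_key]; "sc" is spaCy's default key.
    """
    group = doc.spans[spans_key] if spans_key in doc.spans else []
    return [
        (span.label_, span.text, span.start_char, span.end_char)
        for span in group
    ]


def run_inference(model_path, text, spans_key="sc"):
    """Load a trained pipeline from disk and print its predicted spans."""
    # Imported here so extract_spans can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model_path)  # e.g. "./output/model-best" (assumed path)
    doc = nlp(text)
    for label, span_text, start, end in extract_spans(doc, spans_key):
        print(f"{label}\t{start}-{end}\t{span_text}")
    return doc
```

Calling `run_inference("./output/model-best", "Name: helen helen E-Mail: helen.helen@gmail.com")` would print one line per predicted span; spans may overlap, which is expected for a PROFILE span that contains PROFILE_NAME and similar labels. To get precision, recall, and F-score on the held-out data instead of eyeballing single examples, `python -m spacy evaluate ./output/model-best ./corpus/dev.spacy` can be run against the same assumed model path.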
