Extracting skills from job postings

Hi Hines, thank you very much for the great answer!

I've successfully labeled data and trained an empty model following your guidelines.

Do you think it makes sense to use ner.manual for some examples and then use ner.teach, starting from that partially pre-trained model?

I am trying to extract skills from job postings on LinkedIn. Should I keep my examples at the sentence level, or is it better to use the whole job posting as a single example during the labeling phase?

An example of job text is the following:
<<
DATA ENGINEER
Looking for qualified Data Engineer’s to join an innovative team in Charlotte, NC, right outside of uptown. This engineer will be supporting the company’s rapidly expanding Digital Transformation products. The Select Group is looking for someone who experience working within Big Data and has strong knowledge of Hadoop ecosystem. MUST be able to work on a W2 basis to be considered.

DATA ENGINEER REQUIREMENTS

  • Data engineer utilizing big data (specifically Hadoop)
  • Strong knowledge of Hadoop ecosystem; working with Hive, HBase, PySpark, Spark
  • Ability to build data frameworks, data ingestion
  • Experience writing ETL
  • AWS knowledge, specifically with EMR (cloud native bid data platform)

DATA ENGINEER RESPONSIBILITIES
The Data Engineers will join the company’s Data Engineer Practice to support several products that are in production. Their Big Data environment is mainly within Hadoop and they are in the process of implementing AWS as well. The company’s environment is purely agile and they look for innovators who have a passion for growth and technology.
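
To make the two options concrete, here is a minimal sketch of how each kind of example could be produced from a posting like the one above. It assumes the annotation input is JSONL with one `{"text": ...}` object per line. The function names and the regex-based sentence splitter are hypothetical and deliberately naive; a proper sentence segmenter could be swapped in.

```python
import json
import re

# Hypothetical sample: a shortened excerpt of the posting above.
POSTING = """DATA ENGINEER REQUIREMENTS

  • Strong knowledge of Hadoop ecosystem; working with Hive, HBase, PySpark, Spark
  • Experience writing ETL

Their Big Data environment is mainly within Hadoop. The company's environment is purely agile."""


def document_level(posting):
    """One task per posting: the annotator sees the full context."""
    return [{"text": posting.strip()}]


def sentence_level(posting):
    """One task per heading, bullet, or sentence (naive splitter, an assumption)."""
    tasks = []
    for line in posting.splitlines():
        # Drop surrounding whitespace and a leading bullet character.
        line = line.strip().lstrip("•").strip()
        if not line:
            continue
        # Split after ., ! or ? when followed by whitespace and a capital letter.
        for sentence in re.split(r"(?<=[.!?])\s+(?=[A-Z])", line):
            tasks.append({"text": sentence})
    return tasks


def to_jsonl(tasks):
    """Serialize tasks as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(task, ensure_ascii=False) for task in tasks)


if __name__ == "__main__":
    print(to_jsonl(sentence_level(POSTING)))
```

With the sentence-level variant, each bullet in the requirements list becomes its own short example, while the document-level variant keeps the headings and surrounding context in a single, much longer example.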

Thank you for any advice!
