Can recognised named entities be used as features for price prediction using ML? (Named entities treated as categorical data converted into integer data using one-hot encoding for prediction)

Stellen · August 1, 2020, 7:37am

Hi.

We run a third-party marketplace for used cars. We would like to use ML to estimate the prices based on both structured attributes and unstructured descriptions provided by the car owners to identify cars that are under-priced. If a car is under-priced, we can purchase and resell.

We wonder whether we can use Prodigy to annotate the unstructured descriptions for entity recognition and use the extracted entities as features for price prediction.

Thank you.

honnibal · August 4, 2020, 9:54am

Hi Stellen,

I think what you're interested in should be possible, but it will likely be a combination of a relatively simple regression model for the financials, with some features predicted from the unstructured text. So there's an interplay between two models there: someone will need to be designing the trade-off between what you can get out of the text accurately and easily, and what information is actually useful in the pricing model. I think the project will require a lot of domain expertise combined with some amount of prior NLP experience.

For example, you might find that sellers with certain demographic features often have underpriced listings ("a little old lady who just drove to church on Sundays"), but simultaneously, savvy sellers try to pose as these groups. Maybe something in the text gives this away, and you can figure out how to remove the misleading demographic information for those listings. Maybe. But maybe there are no cases like this where the text features help, or you're unable to predict the feature accurately even if you can think of it.

I really can't say whether the project will be successful, but I think Prodigy would be a good tool for the NLP component, as it's well suited for rapid development.

Stellen · August 5, 2020, 6:59am

Hi Honnibal.

Many thanks for your advice. It reinforced our idea of how to go about implementing this project and gave us an idea or two.

My apologies for not making this clear in the question. Rather than "reading between the lines" using NLP, we have a dataset of listings labelled with ground-truth prices. We will model the relationships between the prices and the corresponding unstructured and structured data using ML.

We hope our machine will eventually be intelligent enough to price a listed car with a certain level of accuracy. The difference between the price estimated by the machine and the price listed by the user then suggests under- or over-pricing.
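A minimal sketch of that comparison, assuming each listing already carries a model estimate. The field names (`id`, `predicted`, `listed`) and the 15% threshold are hypothetical placeholders; in practice the threshold would need to be wider than the model's typical error.

```python
def pricing_gap(predicted_price, listed_price):
    """Relative gap between estimate and asking price.

    Positive: listed below the estimate (possibly under-priced).
    Negative: listed above the estimate (possibly over-priced).
    """
    return (predicted_price - listed_price) / predicted_price


def flag_underpriced(listings, threshold=0.15):
    """Return the ids of listings priced at least `threshold` below the estimate."""
    return [
        listing["id"]
        for listing in listings
        if pricing_gap(listing["predicted"], listing["listed"]) >= threshold
    ]


# Made-up example data, not real listings.
listings = [
    {"id": "a", "predicted": 10000, "listed": 8000},   # 20% below the estimate
    {"id": "b", "predicted": 10000, "listed": 9800},   # 2% below the estimate
    {"id": "c", "predicted": 5000, "listed": 6000},    # 20% above the estimate
]

flagged = flag_underpriced(listings)
```

A gap inside the model's own error margin says little, so only listings that clear the threshold are flagged.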

After further research, I believe we can use Prodigy and NER to extract certain features from the unstructured descriptions and use them as structured predictor variables when training the price prediction model.
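As a rough sketch of that idea, the snippet below turns entities into 0/1 columns in plain Python. It assumes the NER model has already produced `(text, label)` pairs for each listing; the labels `FEATURE` and `CONDITION` and the example entities are invented for illustration and are not from a real model. Since a description can mention several entities, each row is strictly a multi-hot vector: one column per distinct entity, set to 1 if the listing mentions it.

```python
# Hypothetical NER output: one list of (text, label) pairs per listing.
extracted = [
    [("sunroof", "FEATURE"), ("one owner", "CONDITION")],
    [("leather seats", "FEATURE")],
    [("Sunroof", "FEATURE"), ("leather seats", "FEATURE")],
]


def build_vocabulary(listings):
    """Map each distinct (label, lower-cased text) pair to a column index."""
    keys = sorted(
        {(label, text.lower()) for entities in listings for text, label in entities}
    )
    return {key: index for index, key in enumerate(keys)}


def encode(entities, vocab):
    """Return a 0/1 vector with a 1 for every known entity in the listing."""
    row = [0] * len(vocab)
    for text, label in entities:
        index = vocab.get((label, text.lower()))
        if index is not None:  # entities unseen at training time are ignored
            row[index] = 1
    return row


vocab = build_vocabulary(extracted)
features = [encode(entities, vocab) for entities in extracted]
```

These rows can then be concatenated with the structured attributes (mileage, year, and so on) before fitting the price model. Lower-casing is only the simplest normalisation; variants such as "sun roof" and "sunroof" would still land in separate columns unless they are mapped to a common form first.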

Noted about Prodigy being a tool for rapid development. We do need such a tool to iterate through different methods.

Thank you.
