I have an interesting case that I have not been able to solve. I have multiple Excel sheets, each of which contains several tables. ETL doesn't work on these types of sheets because the layout varies so much, so I am looking for a better way to get this done.
Take a look at the sample here.
Some points to keep in mind:
- I always get this data in Excel sheets.
- I want to extract "Dishes" against the restaurants, which can offer "Take away" or "Dine-in".
- At times the dish is the same, but its name may be written slightly differently.
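To make the last point concrete, here is a rough sketch of how near-duplicate dish names could be grouped. It uses only Python's standard-library `difflib`; the helper names (`normalize`, `group_similar`), the 0.85 threshold, and the sample dish names are all hypothetical and not taken from my actual sheets.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Lowercase, treat hyphens as spaces, and collapse repeated whitespace.
    return " ".join(name.lower().replace("-", " ").split())


def group_similar(names, threshold=0.85):
    """Greedily group names whose similarity to a group's first member
    is at least `threshold` (an assumed cut-off, tune it on real data)."""
    groups = []
    for name in names:
        key = normalize(name)
        for group in groups:
            score = SequenceMatcher(None, key, normalize(group[0])).ratio()
            if score >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups


# Hypothetical dish names, for illustration only.
dishes = ["Chicken Biryani", "chicken biriyani", "Paneer Tikka",
          "Panner Tikka", "Masala Dosa"]
for group in group_similar(dishes):
    print(group)
```

This is a greedy, order-dependent grouping, so it is only a starting point; a human review step or a proper fuzzy-matching library would likely still be needed for real data.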
As mentioned earlier, ETL doesn't work on this because of the complex sheets I get. The headers are messy, and the sheets often contain a lot of extra information, which makes it hard to form a structure that any ETL tool can ingest.
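As an illustration of the structural problem, the sketch below splits one sheet into separate blocks wherever a fully blank row appears. It assumes the sheet has already been read into a list of rows (for example with openpyxl's `ws.iter_rows(values_only=True)`) and that tables are separated by blank rows, which is an assumption and will not hold for every sheet. The function names and the sample rows are hypothetical.

```python
def is_blank(row):
    # A row counts as blank if every cell is None or only whitespace.
    return all(cell is None or str(cell).strip() == "" for cell in row)


def split_into_blocks(rows):
    """Split a sheet (a list of rows) into blocks of consecutive
    non-blank rows. Each block is a candidate table."""
    blocks, current = [], []
    for row in rows:
        if is_blank(row):
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(list(row))
    if current:
        blocks.append(current)
    return blocks


# Hypothetical sheet content, for illustration only.
sheet = [
    ("Restaurant A", "Take away", None),
    ("Dish", "Price", None),
    ("Masala Dosa", 80, None),
    (None, None, None),
    ("Restaurant B", "Dine-in", None),
    ("Dish", "Price", None),
    ("Paneer Tikka", 150, None),
]
for block in split_into_blocks(sheet):
    print(len(block), "rows, starting with", block[0][0])
```

Each block would still need its header row identified, which is where the messy headers become the real difficulty.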
What are your thoughts on this? How can I annotate it?
I was wondering whether I could train a model on Excel sheets, the way we can for text or PDFs.