Hi,
I'm trying to use Entity Linking to distinguish between entities that share the same name. As a base, I've used the "Emerson" example that Sofie explained so well (great video lesson).
My difficulty comes from the specific structure of the sentences I have. Here are some examples:
Starship Enterprise 77 5766
Starship Enterprise TOG 77 NN 5766
Starship Enterprise 55783 ML 09
Starship Enterprise ML 09 55799
Starship Enterprise TOG 1977 NN 5766
Starship Enterprise 55783 ML 2009
Starship Enterprise - the vessel name
77, 1977, 09, 2009 - the vessel's build year
55783, 5766 - the vessel's deadweight tonnage
I've added "Starship Enterprise" as an entity pattern in the Entity Ruler and placed the Entity Ruler before NER in the pipeline, so the entity is detected correctly by the model.
As you can see, apart from "Starship Enterprise", which is the entity I want to disambiguate, the sentences I have contain only numbers and abbreviations.
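For reference, the Entity Ruler setup described above looks roughly like the sketch below. It is a simplified reconstruction, not my exact code: the blank English pipeline and the label name VESSEL are assumptions made to keep the example self-contained. In a pipeline that has an NER component, the ruler is added with before="ner".

```python
import spacy

# Assumption: a blank English pipeline, so no trained model has to be downloaded.
nlp = spacy.blank("en")

# Put the ruler before NER when the pipeline has an "ner" component;
# a blank pipeline has none, so the ruler is simply appended.
if "ner" in nlp.pipe_names:
    ruler = nlp.add_pipe("entity_ruler", before="ner")
else:
    ruler = nlp.add_pipe("entity_ruler")

# "VESSEL" is a hypothetical label chosen for this illustration.
ruler.add_patterns([{"label": "VESSEL", "pattern": "Starship Enterprise"}])

doc = nlp("Starship Enterprise TOG 77 NN 5766")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(ents)
```

With this in place only the vessel name becomes an entity; the numbers and abbreviations stay as plain tokens in the surrounding context.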
Questions:
- Is this achievable with the Entity Linker, given the structure of the sentences I have?
- Is there a good approach when the text examples are imbalanced, with many numbers and few words?
- Should I pre-process the numbers and replace them with tokens that are more meaningful to the model?
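To make the last question concrete, here is a rough sketch of the kind of pre-processing I have in mind. Everything in it is an assumption for illustration: the token prefixes built_ and dwt_, the rule that two-digit numbers are build years, the pivot used to expand them to four digits, and the rule that every other number is a deadweight tonnage. The idea is that "77" and "1977" end up as the same token, so two spellings of the same vessel share context.

```python
import re

NUMBER = re.compile(r"\b\d+\b")


def tag_number(match):
    """Rewrite one numeric token as a hypothetical typed token."""
    digits = match.group(0)
    value = int(digits)
    if len(digits) == 2:
        # Assumed pivot: 00-30 are read as 2000-2030, 31-99 as 1931-1999.
        century = 2000 if value <= 30 else 1900
        return f"built_{century + value}"
    if len(digits) == 4 and 1900 <= value <= 2099:
        # Assumed: four-digit values in this range are build years.
        # A tonnage that happens to fall in this range would be mislabeled.
        return f"built_{value}"
    # Assumed: any other number is a deadweight tonnage.
    return f"dwt_{value}"


def preprocess(text):
    """Replace every standalone number in the text with a typed token."""
    return NUMBER.sub(tag_number, text)


examples = [
    "Starship Enterprise 77 5766",
    "Starship Enterprise TOG 1977 NN 5766",
    "Starship Enterprise 55783 ML 09",
    "Starship Enterprise ML 2009 55799",
]
for sentence in examples:
    print(preprocess(sentence))
```

The first two examples both come out with built_1977 and dwt_5766, which is the kind of consistency I hope would help the linker, but I don't know whether this is the recommended direction.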
Thanks!