I am new to Prodigy and spaCy and I am amazed by the possibilities I have seen so far. Great work!
I am starting a new project that analyzes R&I projects. I would like to extract the relevant entities and label each project with its related subjects, but I have several questions before organizing the work:
Should I start by labeling generic entities that apply to every field of R&I and then, once I have those, identify more specific entities within the generic ones? (I have seen in the docs that it is easier to train the models if the labels are more generic.)
To label the documents with their research subjects, would it be useful to assign the category at the paragraph level? I have used LDA for topic modelling, but I would like to improve the results and I don’t know whether text categorization could be used for that.
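To make the paragraph-level idea concrete, here is a rough sketch of what I have in mind. It assumes that paragraphs are separated by blank lines and that each paragraph becomes one JSONL record with a "text" key and a "meta" dict, which is how I understand Prodigy expects its input. The helper names, the project id `P-001` and the sample text are made up for illustration.

```python
import json
import re


def split_paragraphs(text):
    """Split a document on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def to_tasks(project_id, text):
    """Build one annotation task per paragraph, keeping the source in "meta"."""
    return [
        {"text": para, "meta": {"project": project_id, "paragraph": i}}
        for i, para in enumerate(split_paragraphs(text))
    ]


def to_jsonl(tasks):
    """Serialize tasks as newline-delimited JSON, one task per line."""
    return "\n".join(json.dumps(task, ensure_ascii=False) for task in tasks)


# Hypothetical project description, only for illustration.
example = (
    "Novel battery chemistries for grid storage.\n\n"
    "Field trials in three municipalities."
)
tasks = to_tasks("P-001", example)
print(to_jsonl(tasks))
```

The paragraph-level categories could then be aggregated back to the project through the "project" value in "meta", but I am not sure whether that is the recommended approach.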