Fact extraction for earnings news

nix411 · December 11, 2018, 2:07pm

Cool I’ll take care of the preprocessing. However should I include a whole earnings report for each row in JSONL? I’ve just noticed that you recommend only giving it small phrases so I’m not entirely sure how to chunk my earnings report into a training JSONL dataset. The reports also include some markup tables and I suppose I should handle these without spaCy. I imagine having a pipeline where I first do some “document layout analysis”. Send some of the document to spaCy and the rest to a table parser.

Topic		Replies	Views
Model and label strategy for information extraction task ner , textcat , spacy , finance	1	564	December 12, 2019
Extracting numeric token for several entities in order using Spacy usage , spacy , off-topic	0	717	October 1, 2020
Information Extraction for long, semi-unique documents ner	1	538	October 16, 2019
NER for Financial Text ner	14	1563	October 25, 2023
Parsing/Identifying sections in job descriptions usage , ner , custom	3	3246	June 16, 2022

Fact extraction for earnings news

Related topics