I hadn't considered this yet, but it's a good idea! I'll put this on my list of enhancements for the upcoming version. In the meantime, it should hopefully be pretty straightforward to implement your own loader for this.
For the built-in loader, we might have to make this a separate command that outputs the JSON you can pipe forward, or come up with some special syntax so you can specify which annotations to extract from the `Doc` objects. For example, in `ner.manual` you'll (likely) want the `doc.ents` to be added to the `"spans"`, but in another recipe, you might want to use the part-of-speech tags instead, etc.
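In case it helps in the meantime, here's a rough sketch of what a custom conversion step could look like. It's not a built-in feature: `doc_to_task`, `docs_to_jsonl` and the `annotation` argument are hypothetical names I made up for this example. It assumes the objects behave like spaCy `Doc`s (`doc.text`, `doc.ents` with `start_char` / `end_char` / `label_`, and tokens with `idx` / `text` / `pos_`) and that the recipe expects tasks with a `"text"` and a list of `"spans"` with character offsets.

```python
import json


def doc_to_task(doc, annotation="ents"):
    """Convert one Doc-like object into a task dict with "text" and "spans".

    `annotation` is a hypothetical switch for this sketch:
    "ents" uses the named entities, "pos" uses the part-of-speech tags.
    """
    if annotation == "ents":
        spans = [
            {"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
            for ent in doc.ents
        ]
    elif annotation == "pos":
        spans = [
            {"start": tok.idx, "end": tok.idx + len(tok.text), "label": tok.pos_}
            for tok in doc
        ]
    else:
        raise ValueError(f"Unsupported annotation type: {annotation}")
    return {"text": doc.text, "spans": spans}


def docs_to_jsonl(docs, annotation="ents"):
    """Yield one JSON line per Doc-like object."""
    for doc in docs:
        yield json.dumps(doc_to_task(doc, annotation))


# Example usage with spaCy (assumes spaCy and a trained pipeline are installed):
#
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   for line in docs_to_jsonl(nlp.pipe(["Apple is opening an office in Berlin."])):
#       print(line)
```

You could print those lines to stdout from a small script and pipe them forward, or wrap the generator in your own loader. Which attributes end up in the `"spans"` is really up to the recipe you're using, so treat the two options above as placeholders.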