I am using displacy.render to render manual NER annotations in HTML, and the output looks wrong. I am running the following code:
displacy.render(document_html, style="ent", page=True, manual=True, options={"colors": {"AUTHOR": "Salmon"}})
The contents of document_html are in the attached file document-1.entities.jsonl (798 bytes). (Actually, that file's contents are incorrect. See my reply below.) The original text is:
Pynchon and Nabokov
SECTION 1.0 Thomas Pynchon
Thomas Pynchon's greatest novel is "Gravity's Rainbow".
It tells the story of an American army officer in occupied
Germany pursuing a mystical V-2 rocket.
SECTION 2.0 Vladimir Nabokov
Vladimir Nabokov's greatest novel is "Lolita".
"Lolita"'s main character, Humbert Humbert, is one of the
most famous unreliable narrators in all of literature.
I've verified that the "text" value in document_html is correct, as are all the entity character offsets. I expect all the "Thomas Pynchon" and "Vladimir Nabokov" spans to be highlighted in salmon with the label "AUTHOR". Instead I see this.
(I see the same problem if I change the command option to page=False.)
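A minimal sketch of the kind of offset check described above, assuming the manual "ent" input is a dict with "text", "ents" and "title" keys. The shortened text and the helper name covered_strings are illustrative only and are not the actual contents of the attached file:

```python
import re

# Shortened, illustrative stand-in for the document text (not the real file).
text = (
    "SECTION 1.0 Thomas Pynchon\n"
    "Thomas Pynchon's greatest novel is \"Gravity's Rainbow\"."
)

# Entity dicts in the shape used for manual "ent" rendering:
# character offsets into `text` plus a label.
ents = [
    {"start": m.start(), "end": m.end(), "label": "AUTHOR"}
    for m in re.finditer("Thomas Pynchon", text)
]
document_html = {"text": text, "ents": ents, "title": None}


def covered_strings(doc):
    """Return the substring that each entity's offsets actually cover."""
    return [doc["text"][e["start"]:e["end"]] for e in doc["ents"]]


print(covered_strings(document_html))
# → ['Thomas Pynchon', 'Thomas Pynchon']
```

If every printed substring matches the intended span, the offsets themselves are consistent with the text.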
Am I doing something wrong or is this a bug?
Here's the HTML that was generated:
<!DOCTYPE html>
<html>
<head>
<title>displaCy</title>
</head>
<body style="font-size: 16px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol'; padding: 4rem 2rem;">
<figure style="margin-bottom: 6rem">
<h2 style="margin: 0">document-1.txt</h2>
<div class="entities" style="line-height: 2.5">Pynchon and Nabokov</br></br>SECTION 1.0 Thomas Pynchon</br>Thomas Pynchon's greatest novel is "Gravity's Rainbow".</br></br>It tells the story of an American army officer in occupied</br>Germany pursuing a mystical V-2 rocket.</br></br></br>SECTION 2.0
<mark class="entity" style="background: Salmon; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em; box-decoration-break: clone; -webkit-box-decoration-break: clone">
Vladimir Nabokov
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem">AUTHOR</span>
</mark>
</br>
<mark class="entity" style="background: Salmon; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em; box-decoration-break: clone; -webkit-box-decoration-break: clone">
Vladimir Nabokov
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem">AUTHOR</span>
</mark>
<mark class="entity" style="background: Salmon; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em; box-decoration-break: clone; -webkit-box-decoration-break: clone">
Thomas Pynchon
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem">AUTHOR</span>
</mark>
</br>
<mark class="entity" style="background: Salmon; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em; box-decoration-break: clone; -webkit-box-decoration-break: clone">
Thomas Pynchon
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem">AUTHOR</span>
</mark>
's greatest novel is "Gravity's Rainbow".
It tells the story of an American army officer in occupied
Germany pursuing a mystical V-2 rocket.
SECTION 2.0 Vladimir Nabokov
Vladimir Nabokov's greatest novel is "Lolita".
"Lolita"'s main character, Humbert Humbert, is one of the
most famous unreliable narrators in all of literature.
</div>
</figure>
</body>
</html>
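One thing the HTML above suggests checking: both "Vladimir Nabokov" marks are emitted before the "Thomas Pynchon" marks, even though Pynchon comes first in the text. That pattern would be consistent with the "ents" list not being ordered by start offset. This is only an assumption, since the actual list isn't shown here, and I'm not certain whether every spaCy version sorts manual entities itself. A minimal sketch of sorting the entities before rendering, using an illustrative text and a hypothetical helper named sort_ents:

```python
def sort_ents(doc):
    """Return a copy of a manual-mode doc with ents ordered by offset."""
    ordered = sorted(doc["ents"], key=lambda e: (e["start"], e["end"]))
    return {**doc, "ents": ordered}


# Illustrative input with the entities deliberately out of order.
doc = {
    "text": "Thomas Pynchon and Vladimir Nabokov",
    "ents": [
        {"start": 19, "end": 35, "label": "AUTHOR"},
        {"start": 0, "end": 14, "label": "AUTHOR"},
    ],
    "title": None,
}

fixed = sort_ents(doc)
print([doc["text"][e["start"]:e["end"]] for e in fixed["ents"]])
# → ['Thomas Pynchon', 'Vladimir Nabokov']

# With spaCy installed, the sorted dict could then be rendered:
# from spacy import displacy
# html = displacy.render(fixed, style="ent", manual=True, page=True,
#                        options={"colors": {"AUTHOR": "Salmon"}})
```

The helper returns a new dict rather than mutating its input, so the original list order stays available for comparison.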