Hi. Thank you for creating and sharing this tool; it's truly impressive, and I was amazed when I discovered it recently.
I am playing around with quotations(), and when I use .out('text') and doc.html(), some punctuation marks seem to be missing (the results differ from doc.json(), where the punctuation marks are included). Is this intended or a bug?
<script src="https://unpkg.com/compromise"></script>
<script>
  var doc = nlp(`This is a "test". Hello "World."`);
  var hold = doc.quotations();
  console.log(hold.out('text'));
  console.log(hold.json());
  document.body.innerHTML = doc.html({
    '.red': hold
  });
</script>
hey @beholdbible - thank you for the good issue and kind words.
It's clear, seeing this, that we should shuffle the pre- and post-whitespace characters around a bit, to try and avoid this weirdness. Same as #1144, for any paired punctuation symbols.
Happy to look at this. It's tricky because the tokenizer doesn't know very much, so it will have to guess about some of the classification.
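To make the pre/post idea concrete, here is a rough sketch of how the mismatch can come about. It does not use compromise itself and is not its actual implementation: the `{ pre, text, post }` term shape is only modelled on the per-term fields that show up in json() output, and `renderTerms` and `trimEdges` are hypothetical names made up for illustration. The assumption is that a text-style output drops the leading `pre` and trailing `post` of a match, while json() keeps them on each term.

```javascript
// Hypothetical sketch, not compromise's code.
// Each term carries its own leading (pre) and trailing (post) characters.
function renderTerms(terms, { trimEdges = false } = {}) {
  return terms
    .map((t, i) => {
      const isFirst = i === 0;
      const isLast = i === terms.length - 1;
      // When trimming, drop only the outer edges of the whole match.
      const pre = trimEdges && isFirst ? '' : t.pre;
      const post = trimEdges && isLast ? '' : t.post;
      return pre + t.text + post;
    })
    .join('');
}

// Assumed term data for the quotation in `This is a "test".`
const first = [{ pre: '"', text: 'test', post: '". ' }];

console.log(renderTerms(first));                      // "test". (with a trailing space)
console.log(renderTerms(first, { trimEdges: true })); // test
```

If the quote marks sit in the outermost pre/post of the match, any output that trims those edges loses them, which would match what the report describes.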
Will move this to plans for the next release.
cheers