This repository has been archived by the owner on Jul 5, 2018. It is now read-only.
Apostrophes (ʼ) are not parsed correctly. They sometimes appear in pairs to mark quotations; the second apostrophe of a pair usually gets assigned to the following sentence, and if there is none (i.e. at the end of a chapter), it is assigned its own sentence with length=1. You can find those locations by searching for `"1".*\n\s{3}</`.
I am not familiar with the sentence id schema here - how can we fix a bug that affects sentence segmentation?
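For anyone who wants to script the search described above, here is a minimal sketch in Python. It only applies the regular expression quoted in this issue; the function name `find_single_token_sentences` and the sample XML (element names, attributes, indentation) are made up for illustration and may not match the actual treebank files.

```python
import re

# Pattern quoted in the issue: a line containing "1", immediately followed
# by a line that begins with three whitespace characters and a closing tag.
PATTERN = re.compile(r'"1".*\n\s{3}</')


def find_single_token_sentences(text):
    """Return the 1-based line numbers on which the pattern starts matching."""
    return [text.count("\n", 0, m.start()) + 1 for m in PATTERN.finditer(text)]


# Hypothetical sample: the element and attribute names are assumptions,
# not the real schema of the affected files.
SAMPLE = (
    '<treebank>\n'
    '   <sentence id="7">\n'
    '      <word id="1" form="a"/>\n'
    '      <word id="2" form="b"/>\n'
    '   </sentence>\n'
    '   <sentence id="8">\n'
    '      <word id="1" form="\u02bc"/>\n'
    '   </sentence>\n'
    '</treebank>\n'
)

if __name__ == "__main__":
    for line_no in find_single_token_sentences(SAMPLE):
        print(f"possible one-token sentence at line {line_no}")
```

This only locates candidate spots. Merging the stray apostrophe back into the preceding sentence would still require renumbering, which depends on the sentence id schema asked about above.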
This is a bug in the old Perseus segmentation code and something that should be noted as a requirement for the Annotation Service and any tokenization services we use in Perseids.