Sentence segmentation with apostrophes #1

fbaumgardt · 2013-10-21T16:23:15Z

Apostrophes (ʼ) are not parsed correctly. They sometimes appear in pairs to mark quotations, and the second apostrophe of a pair usually gets assigned to the following sentence. If there is no following sentence (i.e. at the end of a chapter), the apostrophe is assigned its own sentence with length=1. You can find those locations by searching for "1".*\n\s{3}</.

I am not familiar with the sentence ID scheme here. How can we fix a bug that affects sentence segmentation?
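
One possible post-processing repair, as a minimal sketch only: it assumes each sentence is available as a list of token strings and that quotation apostrophes are standalone tokens, neither of which is confirmed for the actual data. The function name `reattach_closing_apostrophes` is hypothetical. It moves a leading apostrophe back to the previous sentence when that sentence has an unmatched opening apostrophe, and drops any sentence left empty (the length=1 case described above). Renumbering sentence IDs afterwards is not covered.

```python
APOSTROPHE = "\u02bc"  # MODIFIER LETTER APOSTROPHE, the character quoted in this report


def reattach_closing_apostrophes(sentences):
    """Move a stray closing apostrophe back to the sentence that opened the quote.

    `sentences` is a list of token lists. This layout is an assumption made
    for illustration, not the treebank's actual data model.
    """
    fixed = []
    for tokens in sentences:
        tokens = list(tokens)  # copy so the caller's data is not mutated
        # A leading apostrophe is treated as a closing quote only if the
        # previous sentence holds an odd number of standalone apostrophes.
        if (fixed and tokens and tokens[0] == APOSTROPHE
                and fixed[-1].count(APOSTROPHE) % 2 == 1):
            fixed[-1].append(tokens.pop(0))
        if tokens:  # skip sentences emptied by the move
            fixed.append(tokens)
    return fixed
```

The odd-count check is a heuristic: it would misfire if an elision apostrophe were tokenized as a standalone token, so it should be verified against the real tokenization before use.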


balmas · 2013-11-15T00:55:03Z

This is a bug in the old Perseus segmentation code. It should be noted as a requirement for the Annotation Service and for any tokenization services we use in Perseids.

ghost assigned balmas Nov 15, 2013
