From 1d2909fa899fce9b3818315fc2742b29cd635aac Mon Sep 17 00:00:00 2001
From: "Hugh A. Cayless"
Date: Sun, 3 Sep 2023 17:31:07 +0200
Subject: [PATCH] Some updates to TEI Pointers.

---
 .../en/SA-LinkingSegmentationAlignment.xml | 44 +++++++++----------
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/P5/Source/Guidelines/en/SA-LinkingSegmentationAlignment.xml b/P5/Source/Guidelines/en/SA-LinkingSegmentationAlignment.xml
index 4bdeaa15d4..f369c17d6d 100644
--- a/P5/Source/Guidelines/en/SA-LinkingSegmentationAlignment.xml
+++ b/P5/Source/Guidelines/en/SA-LinkingSegmentationAlignment.xml
@@ -746,13 +746,12 @@ kinds of object:
A node is an instance of one of the node kinds defined in
-the XQuery
-1.0 and XPath 2.0 Data Model (Second Edition). It represents
+the XQuery and XPath Data Model 3.1. It represents
a single item in the XML information set for a document. For pointing purposes, the only nodes that are of interest are Text Nodes, Element Nodes, and Attribute Nodes.
-A Sequence follows the definition in the XPath 2.0 Data
+A Sequence follows the definition in the XPath 3.1 Data
Model, with one alteration. A Sequence is an ordered collection of zero or more items, where an item is either a node or a partial text node.
@@ -829,7 +828,7 @@ scheme.

xpath()

Sequence xpath(XPATH)

The xpath() scheme locates zero or more nodes within an XML
- Information Set. The single argument XPATH is an XPath selection pattern, as
+ Information Set. The single argument XPATH is an XPath selection pattern, as
defined in XSLT 3.0, that returns a node or sequence of nodes. XPaths returning atomic values (e.g. substring()) are illegal in the xpath()
@@ -882,11 +881,11 @@ recommended when possible.

left() -

Point left( IDREF | XPATH )

+

Point left( IDREF | XPATH )

The left() scheme locates the point immediately preceding the node addressed by its argument,
-which is either an XPATH as defined above or an
-IDREF, the value of an xml:id
+which is either an XPATH as defined above or an
+IDREF, the value of an xml:id
occurring in the document addressed by the base URI in effect for the pointer.

Example: the pointer #left(//supplied[1])
@@ -895,11 +894,11 @@ indicates the point between the first lb and the first

Example: #left(//gap[1]) indicates the point immediately before the first gap element in line two and the string si.

Example: #left(line1) indicates the point immediately before -the ]]> element.

+the lb n="1" element.

right() -

Point right( IDREF | XPATH )

+

Point right( IDREF | XPATH )

The right() scheme locates the point immediately following the node addressed by its argument.

Example: the pointer #right(//lb[@n='3'])
@@ -910,12 +909,12 @@ in the example.

string-index() -

Point string-index( IDREF | XPATH, OFFSET )

+

Point string-index( IDREF | XPATH, OFFSET )

The string-index() scheme locates a point based on character positions in a text stream relative
- to the node identified by the IDREF or XPATH parameter. The OFFSET
+ to the node identified by the IDREF or XPATH parameter. The OFFSET
parameter is a positive, negative, or zero integer which determines
-the position of the point. An offset of 0 represents the
+the position of the point. An offset of 0 represents the
position immediately before the first character in either the first text node descendant of the node addressed in the first parameter or the first following text node, if the addressed element contains
@@ -925,8 +924,7 @@ between the s and the i in the word si in line 2.

Note: The OFFSET parameter (and similarly the LENGTH parameter found below in the string-range() scheme) are measured in characters. What is considered a single character will
-depend (assuming the document being evaluated is in Unicode) on the
-Normalization Form in use (see
+depend on the Normalization Form in use (see
UNICODE NORMALIZATION FORMS). A letter followed by a combining diacritic counts as two characters, but the same diacritic precombined with a letter would count
@@ -942,15 +940,15 @@ counting.
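The point of the note above about Normalization Forms can be illustrated with a small sketch. It assumes that a "character" is a Unicode code point, which matches the letter-plus-combining-diacritic case the note describes; the helper name char_count is hypothetical and is not part of the Guidelines or of this patch.

```python
import unicodedata


def char_count(text: str, form: str) -> int:
    """Count code points in `text` after normalizing it to `form`.

    Assumption: one character == one Unicode code point.
    """
    return len(unicodedata.normalize(form, text))


# "e" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"

# The same text is one character under NFC and two under NFD,
# so an OFFSET or LENGTH computed under one form can be off
# by one (or more) under the other.
nfc_length = char_count(decomposed, "NFC")
nfd_length = char_count(decomposed, "NFD")
```

An OFFSET or LENGTH value is therefore only meaningful together with the normalization form under which it was counted.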

range() -

Sequence range( POINTER, POINTER[, POINTER, POINTER ...])

+

Sequence range( POINTER, POINTER[, POINTER, POINTER ...])

The range() scheme takes as parameters one
-or more pairs of POINTERs, which are each members of the set IDREF,
-XPATH, left(),
+or more pairs of POINTERs, which are each members of the set IDREF,
+XPATH, left(),
right(), or string-index(). A range() locates a (possibly non-contiguous) sequence beginning at the first POINTER parameter and ending at the
-last. If the POINTER locates a node (i.e. is an XPATH or IDREF), then
+last. If a POINTER locates a node (i.e. is an XPATH or IDREF), then
that node is a member of the addressed sequence. If a sequence addressed by a range pointer overlaps, but does not wholly contain, an element (i.e. it contains only the start but not the end tag or vice-versa),
@@ -971,17 +969,17 @@ the non-contiguous sequence in mentem.

string-range() -

Sequence string-range(IDREF | XPATH, OFFSET, LENGTH[, OFFSET, LENGTH ...])

+

Sequence string-range(IDREF | XPATH, OFFSET, LENGTH[, OFFSET, LENGTH ...])

The string-range() scheme locates a sequence based on character positions in a text stream relative to the node identified by the first parameter. The location of the beginning of the addressed sequence is determined precisely
-as for string-index(). The OFFSET
+as for string-index(). The OFFSET
parameter is defined as above in string-index().
-The LENGTH parameter is a positive integer that denotes
+The LENGTH parameter is a positive integer that denotes
the length of the text stream captured by the sequence. As with range(), the addressed sequence may
-contain text nodes and/or elements. The
+contain text nodes and elements. The
string-range() scheme can accept multiple OFFSET, LENGTH pairs to address a non-contiguous sequence in much the same way that range() can accept multiple pairs of pointers.
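The OFFSET, LENGTH pairs described above can be sketched over a plain string. This is a simplified illustration under stated assumptions, not a TEI Pointer implementation: the text stream is already flattened to a Python string, only non-negative offsets are handled (negative offsets, which the scheme allows, are left out), and the function name string_range and the sample text are hypothetical.

```python
from typing import List, Tuple


def string_range(stream: str, *pairs: Tuple[int, int]) -> List[str]:
    """Return the substrings addressed by each (OFFSET, LENGTH) pair.

    Offset 0 is the position immediately before the first character.
    Assumption: offsets are >= 0 and the stream is a flat string.
    """
    pieces = []
    for offset, length in pairs:
        if offset < 0:
            raise ValueError("negative OFFSET is not handled in this sketch")
        if length <= 0:
            raise ValueError("LENGTH must be a positive integer")
        pieces.append(stream[offset:offset + length])
    return pieces


# Hypothetical text stream; two pairs address a non-contiguous sequence.
sample = "in mentem venit"
selected = string_range(sample, (0, 2), (3, 6))
```

Here the two pairs pick out "in" and "mentem" while skipping the space between them, which is the sense in which the addressed sequence is non-contiguous.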

@@ -1002,7 +1000,7 @@ the non-contiguous sequence in mentem.

match() -

Sequence match(IDREF | XPATH, 'REGEX' [, INDEX])

+

Sequence match(IDREF | XPATH, 'REGEX' [, INDEX])

The match scheme locates a sequence based on matching the REGEX parameter against a text stream relative to the reference node identified by the first parameter. REGEX is a regular expression as defined by