<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Windows (vers 12 April 2005), see www.w3.org" />
<title>XmlPull v1 API Features</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="Author" content="Aleksander Slominski" />
</head>
<body bgcolor="white">
<h1>XmlPull v1 API Features</h1>
<p>A feature is a discoverable, and possibly changeable, characteristic of an XmlPull parser.</p>
<p>Please note: the key words "<b>MUST</b>", "<b>MUST NOT</b>", "<b>REQUIRED</b>", "<b>SHALL</b>", "<b>SHALL
NOT</b>", "<b>SHOULD</b>", "<b>SHOULD NOT</b>", "<b>RECOMMENDED</b>", "<b>MAY</b>", and "<b>OPTIONAL</b>" in this
document are to be interpreted as described in <a href="http://www.ietf.org/rfc/rfc2119.txt">RFC 2119</a>.</p>
<h2>Standard features</h2>
<p>The semantics of all standard features <b>MUST</b> be followed by every XmlPull v1 API implementation.</p>
<p>Standard feature values <b>MUST NOT</b> change unless setFeature() is called, not even when setInput() is
called.</p>
<p> </p>
<h3><a name="process-namespaces" id="process-namespaces"></a>Standard feature: PROCESS NAMESPACES</h3>
<p>The feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#process-namespaces">http://xmlpull.org/v1/doc/features.html#process-namespaces</a></p>
<p>Namespace processing in an XMLPULL V1 parser <b>MUST</b> be set to false by default.</p>
<p>This feature <b>MUST</b> be recognized and allowed to be set both to true and false by every XMLPULL V1 API
compliant parser implementation.</p>
<p>If set to false, XML namespaces <b>MUST NOT</b> be processed; namespace attributes (those starting with xmlns)
are instead treated as normal attributes.</p>
<p>If set to true, XML namespaces <b>MUST</b> be processed according to the <a href=
"http://www.w3.org/TR/1998/PR-xml-names-19981117">Namespaces in XML specification</a>.</p>
<p>The feature <b>MUST</b> be allowed to be changed only when the parser is in the START_DOCUMENT state, i.e. this
feature cannot be changed during parsing.</p>
<p> </p>
<h3><a name="process-docdecl" id="process-docdecl"></a>Standard feature: PROCESS DOCDECL</h3>
<p>Identified by <a href=
"http://xmlpull.org/v1/doc/features.html#process-docdecl">http://xmlpull.org/v1/doc/features.html#process-docdecl</a></p>
<p>The default value of the feature is undefined.</p>
<p>If the VALIDATION feature is true then this feature <b>MUST NOT</b> be modified and <b>MUST</b> be reported as
true.</p>
<p>If the feature is set to false and an <a href="http://www.w3.org/TR/2000/REC-xml-20001006#NT-doctypedecl">XML
document type declaration</a> (DOCDECL for short) is encountered, it <b>MUST</b> be reported by nextToken() and
<b>MUST</b> be ignored by next().</p>
<p>Because the DOCDECL is ignored, references to entities cannot be expanded: they produce a fatal error if you call
next(), but are reported as ENTITY_REF if you call nextToken() instead.</p>
<p>If the feature is true and the VALIDATION feature is false then the parser <b>MUST</b> be non-validating as defined
in the <a href="http://www.w3.org/TR/2000/REC-xml-20001006">XML 1.0 specification</a> (the DOCDECL <b>MUST</b> be
parsed and processed by the parser).</p>
<p>If the feature is true and the VALIDATION feature is true then the parser <b>MUST</b> be validating as defined in
the <a href="http://www.w3.org/TR/2000/REC-xml-20001006">XML 1.0 specification</a>.</p>
<p>The feature <b>MUST</b> be allowed to be changed only when the parser is in the START_DOCUMENT state, i.e. this
feature cannot be changed during parsing.</p>
<p> </p>
<h3><a name="validation" id="validation"></a>Standard feature: VALIDATION</h3>
<p>Identified by <a href=
"http://xmlpull.org/v1/doc/features.html#validation">http://xmlpull.org/v1/doc/features.html#validation</a></p>
<p>The feature <b>MUST</b> be false by default.</p>
<p>See the description of the <a href="#process-docdecl">PROCESS DOCDECL</a> feature for the required behavior.</p>
<p>If the feature is set to true then PROCESS DOCDECL <b>MUST</b> be automatically set to true as well.</p>
<p>The feature <b>MUST</b> be allowed to be changed only when the parser is in the START_DOCUMENT state, i.e. this
feature cannot be changed during parsing.</p>
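<p>The interplay of the three standard features can be sketched as a small state model. This is only an illustration:
the class StandardFeatures, its startParsing() method and the use of unchecked exceptions are hypothetical and are not
part of the XmlPull API, and the initial value false for PROCESS DOCDECL is an assumption, since its default is
undefined.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the standard-feature rules; not part of the XmlPull API.
class StandardFeatures {
    static final String BASE = "http://xmlpull.org/v1/doc/features.html#";
    static final String PROCESS_NAMESPACES = BASE + "process-namespaces";
    static final String PROCESS_DOCDECL = BASE + "process-docdecl";
    static final String VALIDATION = BASE + "validation";

    private final Map<String, Boolean> values = new HashMap<>();
    private boolean parsingStarted = false; // false while in START_DOCUMENT

    StandardFeatures() {
        values.put(PROCESS_NAMESPACES, false); // MUST default to false
        values.put(PROCESS_DOCDECL, false);    // default is undefined; false is an assumption
        values.put(VALIDATION, false);         // MUST default to false
    }

    // Simulates the parser leaving the START_DOCUMENT state.
    void startParsing() {
        parsingStarted = true;
    }

    boolean getFeature(String name) {
        return values.getOrDefault(name, false);
    }

    void setFeature(String name, boolean state) {
        if (!values.containsKey(name)) {
            throw new IllegalArgumentException("unknown feature: " + name);
        }
        if (parsingStarted) {
            throw new IllegalStateException("features can only change in START_DOCUMENT");
        }
        // While VALIDATION is true, PROCESS DOCDECL must stay true.
        if (name.equals(PROCESS_DOCDECL) && values.get(VALIDATION) && !state) {
            throw new IllegalStateException("PROCESS DOCDECL is fixed while VALIDATION is true");
        }
        values.put(name, state);
        if (name.equals(VALIDATION) && state) {
            values.put(PROCESS_DOCDECL, true); // VALIDATION implies PROCESS DOCDECL
        }
    }
}
```

<p>In this sketch, enabling VALIDATION switches PROCESS DOCDECL on as a side effect, and any setFeature() call after
startParsing() is rejected.</p>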
<p> </p>
<hr />
<h2>Optional features</h2>
<p>Optional features <b>MAY</b> be supported but are not part of the XmlPull API.</p>
<p> </p>
<p> </p>
<h3><a name="report-namespace-prefixes" id="report-namespace-prefixes"></a>Optional feature: REPORT NAMESPACE
ATTRIBUTES</h3>
<p>The feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#report-namespace-prefixes">http://xmlpull.org/v1/doc/features.html#report-namespace-prefixes</a></p>
<p>The feature <b>MUST</b> be false by default and is only meaningful when the PROCESS NAMESPACES feature is on.</p>
<p>This feature <b>MAY</b> be recognized and allowed to be set to true (like other optional features, it is
false by default).</p>
<p>When set to true, the XMLPULL parser <b>MUST</b> also report namespace attributes. They can be distinguished
by checking for prefix == "xmlns", or for prefix == "" and name == "xmlns".</p>
<p>The feature <b>MUST</b> be allowed to be changed only when the parser is in the START_DOCUMENT state, i.e. this
feature cannot be changed during parsing.</p>
<p> </p>
<h3><a name="names-interned" id="names-interned"></a>Optional feature: NAMES INTERNED<br />
</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#names-interned">http://xmlpull.org/v1/doc/features.html#names-interned</a></p>
<p>If set to true, the XMLPULL parser <b>MUST</b> intern all returned names using java.lang.String.intern(). This
covers the following methods: getName(), getPrefix(), getNamespace(), getNamespace(prefix),
getNamespacePrefix(int), getNamespaceUri(int pos), getAttributeName(), getAttributeNamespace(),
getAttributePrefix(), getAttributeType(). However, this feature provides no assurance about text content (such as
getText()) or attribute values (getAttributeValue()).</p>
<p>NOTE: when enabled, this feature allows fast equality tests against string constants using == (no need to call
String.equals()).</p>
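<p>The effect of interning can be shown in plain Java without a parser. In the following minimal sketch the class
InternedNames and its isItem() helper are hypothetical names chosen for illustration; the run-time constructed string
stands in for a name returned by a parser.</p>

```java
// Demonstrates why interned names can be compared with == instead of equals().
class InternedNames {
    static final String ITEM = "item"; // string constants are interned by the JVM

    // Reference comparison: only reliable when 'name' is known to be interned.
    static boolean isItem(String name) {
        return name == ITEM;
    }

    public static void main(String[] args) {
        // Simulates a name built at run time from parser input (a distinct String object).
        String parsed = new String(new char[] {'i', 't', 'e', 'm'});
        System.out.println(isItem(parsed));          // false
        System.out.println(isItem(parsed.intern())); // true
        System.out.println(parsed.equals(ITEM));     // true
    }
}
```

<p>Without the NAMES INTERNED feature, only the equals() comparison is safe for names obtained from the parser.</p>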
<p> </p>
<h3><a name="expand-entity-ref" id="expand-entity-ref"></a>Optional feature: EXPAND ENTITY REF</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#expand-entity-ref">http://xmlpull.org/v1/doc/features.html#expand-entity-ref</a></p>
<p>This optional feature changes the behavior of nextToken(). When it is enabled, the parser <b>MUST</b> report
expanded entity reference content as tokens and <b>MUST</b> report the end of the expanded entity reference as an
ENTITY_REF whose name is null (getText() returns null).</p>
<p>For example, if entity foo is defined to have the replacement text bar, then the parser will report the following
tokens: ENTITY_REF with 'foo' as the entity name, TEXT with the value 'bar', and ENTITY_REF with no name (to signal the
end of the expansion). If PROCESS_DOCDECL is true then the reported tokens may include START_TAG/END_TAG if the entity
has embedded markup, and ENTITY_REF for recursively expanded entity references (subsequent calls to nextToken() will
report the entity content as possibly multiple TEXT, START_TAG, END_TAG, ENTITY_REF etc. tokens).</p>
<p>NOTE: embedded markup and nested entity references can only be supported when PROCESS_DOCDECL is true, because
defineEntityReplacementText() always escapes markup and entity references.</p>
<p>NOTE: next() will always expand entity references and report a combined TEXT event; this feature does not affect
the behavior of next().<br />
</p>
<h3><a name="xml-roundtrip" id="xml-roundtrip"></a>Optional feature: XML ROUNDTRIP</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#xml-roundtrip">http://xmlpull.org/v1/doc/features.html#xml-roundtrip</a></p>
<p>If ROUNDTRIP is on, it affects getText()/getTextCharacters() when nextToken() is used, to make possible an
exact roundtrip of XML 1.0 input. The constraints on values returned by getText()/getTextCharacters() are:</p>
<ul>
<li>for all tokens, exactly what was in the input <b>MUST</b> be returned - in particular, returned content <b>MUST
NOT</b> be end-of-line normalized (the algorithm described in <a href=
"http://www.w3.org/TR/REC-xml#sec-line-ends">XML 1.0 End-of-Line Handling</a> is not applied)</li>
<li>for START_TAG and END_TAG events, the original XML value of the start/end tag <b>MUST</b> be returned (for
example "tag" for &lt;tag&gt;)</li>
<li>for a PROCESSING_INSTRUCTION event, the exact content of the PI <b>MUST</b> be returned: in &lt;?target
data?&gt; white space between target and data must be preserved</li>
<li>the content of XMLDecl must be reported as not null by the <a href=
"http://xmlpull.org/v1/doc/features.html#xmldecl-content">XMLDECL CONTENT</a> property</li>
<li>DOCDECL <b>MUST</b> be reported exactly as in the input</li>
</ul>
<ul>
<li>additionally for nextToken(): ignorable white space outside the root element <b>MUST</b> be reported as
IGNORABLE_WHITESPACE</li>
</ul>
<p>By default this feature is off, which implies the unchanged behavior of XmlPullParser:</p>
<ul>
<li>for all tokens, a call to getText()/getTextCharacters() returns end-of-line normalized content<br />
(besides TEXT and IGNORABLE_WHITESPACE, that also includes COMMENT, PROCESSING_INSTRUCTION, ENTITY_REF etc.)</li>
<li>getText() for START_TAG and END_TAG events <b>MUST</b> return null</li>
<li>PROCESSING_INSTRUCTION <b>MUST</b> be reported as target + ' ' + data or exactly as it was in the input</li>
<li>DOCDECL <b>MUST</b> be reported exactly if PROCESS DOCDECL is true; otherwise it <b>MAY</b> be reported (but does
not have to be, and may even be null)</li>
<li>ignorable white space (event IGNORABLE_WHITESPACE) outside the root element <b>MAY</b> be reported by
nextToken()</li>
</ul>
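<p>To make the difference concrete, the end-of-line normalization that is applied when ROUNDTRIP is off (and skipped
when it is on) can be sketched as follows. The class LineEnds is a hypothetical helper written for illustration; it
follows the XML 1.0 End-of-Line Handling rule and is not taken from any XmlPull implementation.</p>

```java
class LineEnds {
    // XML 1.0 end-of-line handling: CR LF pairs and lone CRs both become a single LF.
    static String normalize(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\r') {
                out.append('\n');
                if (i + 1 < raw.length() && raw.charAt(i + 1) == '\n') {
                    i++; // skip the LF of a CR LF pair
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "a\r\nb\rc\n";
        // With ROUNDTRIP off, returned text corresponds to the normalized form;
        // with ROUNDTRIP on, it corresponds to 'raw' unchanged.
        System.out.println(normalize(raw).equals("a\nb\nc\n")); // true
    }
}
```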
<p>Note that content outside of the root element is typically not reported by parsers, to make it consistent with the
XML infoset <a href="http://www.w3.org/TR/xml-infoset/#infoitem.document">Document Information Item properties</a>, in
which white space in the prolog and after the root element is ignored. If reported, the IGNORABLE_WHITESPACE event may
be used to preserve the formatting of the XML input when serializing it.</p>
<p>However, as the IGNORABLE_WHITESPACE event <b>MAY</b> be reported, make sure that the application is prepared to
ignore IGNORABLE_WHITESPACE events from nextToken() when processing content outside of the root element
(getDepth() == 0).</p>
<p> </p>
<h3><a name="detect-encoding" id="detect-encoding"></a>Optional feature: DETECT ENCODING<br />
</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#detect-encoding">http://xmlpull.org/v1/doc/features.html#detect-encoding</a></p>
<p>If this feature is true, the XmlPull parser <b>MUST</b> detect the encoding from the input stream when
inputEncoding is null in the call setInput(InputStream inputStream, String inputEncoding).</p>
<p>The inputStream parameter of setInput(InputStream inputStream, String inputEncoding) contains a raw byte input
stream of possibly unknown encoding (when inputEncoding is null). In that case the parser must derive the encoding from
the &lt;?xml declaration or assume UTF-8 or UTF-16 as described in <a href=
"http://www.w3.org/TR/REC-xml#sec-guessing-no-ext-info">XML 1.0 Appendix F.1 Detection Without External Encoding
Information</a>. Otherwise, if inputEncoding is present, it must be used (this is consistent with <a href=
"http://www.w3.org/TR/REC-xml#sec-guessing-with-ext-info">XML 1.0 Appendix F.2 Priorities in the Presence of External
Encoding Information</a>, which allows for an exception only for files; in such cases inputEncoding should be null to
trigger auto-detection).</p>
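<p>A simplified sketch of such detection is shown below. The class EncodingSniffer and its methods are hypothetical and
do not come from any XmlPull implementation. It only covers the UTF-8 and UTF-16 cases of Appendix F.1 plus an
ASCII-compatible XML declaration; UCS-4 and EBCDIC input are deliberately not handled, so a UCS-4 little-endian byte
order mark would be misreported as UTF-16LE.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a simplification of XML 1.0 Appendix F.1.
class EncodingSniffer {
    private static final Pattern DECL_ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Returns an encoding name guessed from the first bytes of the input.
    static String detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";          // UTF-8 byte order mark
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";             // UTF-16 BOM, big-endian
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";             // UTF-16 BOM, little-endian
        if (startsWith(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE"; // "<?" without BOM
        if (startsWith(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE"; // "<?" without BOM
        // ASCII-compatible start: look for an encoding name in the XML declaration.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = DECL_ENCODING.matcher(text);
        return m.find() ? m.group(1) : "UTF-8"; // no declaration: assume UTF-8
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // Mirrors the rule in the text: an explicit inputEncoding always wins.
    static String resolve(byte[] head, String inputEncoding) {
        return inputEncoding != null ? inputEncoding : detect(head);
    }
}
```

<p>In this sketch, resolve(head, null) triggers detection, while resolve(head, "UTF-16") returns the supplied
encoding unchanged.</p>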
<p> </p>
<h3><a name="serializer-attvalue-use-apostrophe" id="serializer-attvalue-use-apostrophe"></a>Optional feature:
SERIALIZER ATTVALUE USE APOSTROPHE<br />
</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#serializer-attvalue-use-apostrophe">http://xmlpull.org/v1/doc/features.html#serializer-attvalue-use-apostrophe</a></p>
<p>If this feature is supported, the XmlSerializer output for attribute value quotation can be controlled
(when serializing XML 1.0 it allows choosing one of the <a href="http://www.w3.org/TR/REC-xml#NT-AttValue">alternative
representations for AttValue</a>).</p>
<p>If the feature is set to true, the XmlSerializer <b>MUST</b> use an apostrophe (') to quote attribute
values.</p>
<p>If the feature is set to false, the XmlSerializer <b>MUST</b> use a quotation mark (") to quote
attribute values.</p>
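<p>The two quoting styles can be illustrated with a small stand-alone helper. The class AttValueWriter below is
hypothetical and only sketches what a serializer might emit for each setting; the escaping choices (escaping only the
ampersand, the less-than sign and the active quote character) are an assumption, not a requirement stated here.</p>

```java
// Hypothetical helper; not part of the XmlSerializer API.
class AttValueWriter {
    // Writes name="value" or name='value', escaping characters that would break the literal.
    static String attribute(String name, String value, boolean useApostrophe) {
        char quote = useApostrophe ? '\'' : '"';
        StringBuilder out = new StringBuilder();
        out.append(name).append('=').append(quote);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == quote) {
                out.append(useApostrophe ? "&apos;" : "&quot;");
            } else {
                out.append(c);
            }
        }
        return out.append(quote).toString();
    }

    public static void main(String[] args) {
        System.out.println(attribute("title", "Tom's \"book\"", true));  // apostrophe-quoted
        System.out.println(attribute("title", "Tom's \"book\"", false)); // quotation-mark-quoted
    }
}
```

<p>Only the quote character actually used as the delimiter needs escaping, which is why the choice of delimiter can
make output shorter for values that contain many quotation marks or many apostrophes.</p>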
<p> </p>
<h3><a name="relaxed" id="relaxed"></a>Optional feature: RELAXED<br />
</h3>
<p>This feature is identified by <a href=
"http://xmlpull.org/v1/doc/features.html#relaxed">http://xmlpull.org/v1/doc/features.html#relaxed</a></p>
<p>If this feature is supported, the XmlPull parser will be lenient when checking XML well-formedness.</p>
<p><b>NOTE</b>: use it only if the XML input is not well-formed; in general, use of this feature is
<b>discouraged</b>.</p>
<p><b>NOTE</b>: as there is no definition of what relaxed XML parsing is, what the parser will do depends completely
on the implementation used.</p>
<p> </p>
<p> </p>
<p> </p>
<hr />
<address>
<a href="http://www.extreme.indiana.edu/~aslom/">Aleksander Slominski</a>
</address>
</body>
</html>