Implementation of an XML 1.0 parser using pest.
- Syntax Parser
  - write the grammar file(s)
    - Test: ExternalID (everything under ###External Entities+Expl in #XML Overview)
    - Test:
  - add data structures
  - (add methods for manipulating data structures and writing the data back)
- Markup and programming languages are, in the end, just sequences of strings/bytes
- A specification and its implementation can differ
- Writing grammar rules in PEG
- Unicode encoding
-> see XML 1.0 Spec, 2.4 Character Data and Markup
Markup = defines the document's structure (or contains metadata)
- STag (start-tag), ETag (end-tag), EmptyElemTag
- EntityRef, CharRef
- Comment
- CDStart, CDEnd
- doctypedecl (DTD)
- PIs (Processing Instructions)
- XMLDecl (EncodingDecl, VersionInfo, SDDecl: standalone declaration)
- TextDecl (for external parsed entities)
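Most of the markup kinds above can be told apart by their opening delimiter alone. The following is a minimal sketch in plain Rust (std only, not the pest grammar itself); `MarkupKind` and `classify_markup` are hypothetical names introduced for illustration. It only looks at the prefix, so it cannot distinguish STag from EmptyElemTag, or XMLDecl from TextDecl.

```rust
/// Kinds of markup, named loosely after the XML 1.0 grammar productions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MarkupKind {
    XmlOrTextDecl,         // XMLDecl or TextDecl
    ProcessingInstruction, // PI
    Comment,
    CDataStart,            // CDStart
    CDataEnd,              // CDEnd
    DoctypeDecl,           // doctypedecl
    OtherDecl,             // any other `<!...` declaration (e.g. inside the DTD)
    EndTag,                // ETag
    StartOrEmptyTag,       // STag or EmptyElemTag
    CharRef,
    EntityRef,
}

/// Guess which kind of markup begins at the start of `input`.
/// Returns `None` when `input` starts with character data.
pub fn classify_markup(input: &str) -> Option<MarkupKind> {
    // `<?xml` only starts a declaration when white space follows;
    // `<?xml-stylesheet ...?>`, for example, is an ordinary PI.
    if let Some(rest) = input.strip_prefix("<?xml") {
        if rest.starts_with(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n')) {
            return Some(MarkupKind::XmlOrTextDecl);
        }
    }
    // Order matters: longer, more specific prefixes are tested first.
    const PREFIXES: &[(&str, MarkupKind)] = &[
        ("<?", MarkupKind::ProcessingInstruction),
        ("<!--", MarkupKind::Comment),
        ("<![CDATA[", MarkupKind::CDataStart),
        ("]]>", MarkupKind::CDataEnd),
        ("<!DOCTYPE", MarkupKind::DoctypeDecl),
        ("<!", MarkupKind::OtherDecl),
        ("</", MarkupKind::EndTag),
        ("<", MarkupKind::StartOrEmptyTag),
        ("&#", MarkupKind::CharRef),
        ("&", MarkupKind::EntityRef),
    ];
    PREFIXES
        .iter()
        .find(|(prefix, _)| input.starts_with(*prefix))
        .map(|&(_, kind)| kind)
}
```

For instance, `classify_markup("<!-- note -->")` yields `Some(MarkupKind::Comment)`, while plain text yields `None`.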
Character Data = the document's content
- CDATA = for escaping blocks of text that contain markup strings
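To illustrate the CDATA point for the later "writing the data back" step: the only sequence that may not appear inside a CDATA section is CDEnd (`]]>`), so a writer has to split it across two sections. This is a small sketch in plain Rust; `to_cdata` is a hypothetical helper name, not part of pest or of this project.

```rust
/// Wrap `text` in a CDATA section so that `<` and `&` need no escaping.
/// Each `]]>` in the text is split across two adjacent sections,
/// because CDEnd would otherwise terminate the section early.
pub fn to_cdata(text: &str) -> String {
    format!("<![CDATA[{}]]>", text.replace("]]>", "]]]]><![CDATA[>"))
}
```

With this sketch, `to_cdata("<a>&</a>")` produces `<![CDATA[<a>&</a>]]>`, and the input `a]]>b` becomes two sections whose concatenated content is again `a]]>b`.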
Entities = storage units
- Document Entity = starting point for the XML processor; every XML document has one
- Parsed Entity = its contents are called its "replacement text"
- Unparsed Entity = may or may not be text, and if text, may be something other than XML (e.g. a GIF image)
  - -> no restrictions on the contents of unparsed entities!
- Parameter Entities = parsed entities for use within the DTD
System Identifier (SystemLiteral) = mostly a URI reference
- no fragment identifier (#) within the URI!!
- relative URI = relative to the location of the resource in which the entity declaration occurs; this is not necessarily the document entity (see Spec 4.2.2)
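The two rules for system identifiers can be checked after the grammar has matched a SystemLiteral. Below is a rough sketch in plain Rust, assuming the quotes have already been stripped; `SystemIdCheck`, `check_system_id` and `has_scheme` are hypothetical names, and the scheme test follows the RFC 3986 `scheme` syntax rather than a full URI parser.

```rust
/// Outcome of checking a system identifier (the content of a SystemLiteral).
#[derive(Debug, PartialEq, Eq)]
pub enum SystemIdCheck {
    /// Has a URI scheme, so it can be used as-is.
    Absolute,
    /// No scheme: must be resolved against the resource
    /// that contains the entity declaration.
    Relative,
    /// Contains `#`, which is an error in a system identifier.
    HasFragment,
}

pub fn check_system_id(system_id: &str) -> SystemIdCheck {
    if system_id.contains('#') {
        SystemIdCheck::HasFragment
    } else if has_scheme(system_id) {
        SystemIdCheck::Absolute
    } else {
        SystemIdCheck::Relative
    }
}

/// RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":".
fn has_scheme(s: &str) -> bool {
    match s.split_once(':') {
        Some((scheme, _)) => {
            let mut chars = scheme.chars();
            matches!(chars.next(), Some(c) if c.is_ascii_alphabetic())
                && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}
```

Actually resolving a relative identifier against its base is left out here; that would need a URI library or a hand-written resolver.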
Example
<!ENTITY open-hatch
SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch
PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
"http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic
SYSTEM "../grafix/OpenHatch.gif"
NDATA gif >
- taken from 4.2.2 External Entities
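As a reference point for testing the ExternalID rule, the declarations above can be read with a small hand-written parser. This is only a sketch in plain Rust, not the pest grammar: `ExternalId`, `parse_external_id`, `parse_ndata` and the helpers are hypothetical names, PubidLiteral characters are not validated against PubidChar, and the notation name is not checked against the Name production.

```rust
/// Parsed `ExternalID`: SYSTEM has only a system identifier,
/// PUBLIC has a public identifier followed by a system identifier.
#[derive(Debug, PartialEq, Eq)]
pub struct ExternalId {
    pub public_id: Option<String>,
    pub system_id: String,
}

/// Skip XML white space (production S); `None` if there is none to skip.
fn skip_s(input: &str) -> Option<&str> {
    let rest = input.trim_start_matches(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n'));
    if rest.len() < input.len() { Some(rest) } else { None }
}

/// Parse a single- or double-quoted literal; returns (content, rest).
fn quoted(input: &str) -> Option<(&str, &str)> {
    let quote = input.chars().next().filter(|c| *c == '"' || *c == '\'')?;
    let body = &input[1..]; // the quote is one byte, so this is a char boundary
    let end = body.find(quote)?;
    Some((&body[..end], &body[end + 1..]))
}

/// ExternalID ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
pub fn parse_external_id(input: &str) -> Option<(ExternalId, &str)> {
    if let Some(rest) = input.strip_prefix("SYSTEM") {
        let (system_id, rest) = quoted(skip_s(rest)?)?;
        Some((ExternalId { public_id: None, system_id: system_id.to_string() }, rest))
    } else if let Some(rest) = input.strip_prefix("PUBLIC") {
        let (public_id, rest) = quoted(skip_s(rest)?)?;
        let (system_id, rest) = quoted(skip_s(rest)?)?;
        let id = ExternalId {
            public_id: Some(public_id.to_string()),
            system_id: system_id.to_string(),
        };
        Some((id, rest))
    } else {
        None
    }
}

/// NDataDecl ::= S 'NDATA' S Name; returns the notation name.
/// Simplification: the name runs up to the next white space or `>`.
pub fn parse_ndata(input: &str) -> Option<&str> {
    let rest = skip_s(skip_s(input)?.strip_prefix("NDATA")?)?;
    let end = rest
        .find(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n' | '>'))
        .unwrap_or(rest.len());
    if end == 0 { None } else { Some(&rest[..end]) }
}
```

Feeding it the text after the entity name, e.g. `SYSTEM "../grafix/OpenHatch.gif"` followed by `NDATA gif >`, gives the system identifier first and then, via `parse_ndata` on the remainder, the notation name `gif`. The same inputs could later serve as test cases for the pest rule.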