TEI @xmlns #2621

Open
KAKDH opened this issue Nov 29, 2024 · 6 comments · May be fixed by #2632
Comments

@KAKDH

KAKDH commented Nov 29, 2024

Hello,

Not sure this is an issue, but at least a quick consistency question about xmlns on TEI:

In the current guidelines, the TEI element has highlighted the version attribute & some att.classes that one can also use. Further in the notes, it adds that "It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it." (See: [https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-TEI.html] )

The current tei_all.rng that gets automatically associated with new tei_all type documents in Oxygen has version optional and xmlns required, which confuses my students substantially.

Would it make sense to either update the guidelines to reflect the best practice, or update the schema to make both of them optional?

All the best,
K.

@sydb
Member

sydb commented Nov 30, 2024

Hi @KAKDH !
Hmmm … that “remarks” section could probably be worded somewhat better.
The <TEI> element, like all other TEI elements except <egXML>, is in the TEI namespace.[1] So the namespace has to be declared on the outermost element of a TEI document, whether that element is <TEI> or <teiCorpus> or something else (which would be weird and probably non-conformant). But whether that namespace is declared with or without a prefix, and what that prefix is if there is one, is entirely up to the user. That is, you can use any one of the following; the choice is yours.

<TEI xmlns="http://www.tei-c.org/ns/1.0">
<t:TEI xmlns:t="http://www.tei-c.org/ns/1.0">
<tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0">
<ns03:TEI xmlns:ns03="http://www.tei-c.org/ns/1.0">
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">

I think (but am not at all sure) the point here is that the vast majority of TEI users use the first one, above, and all the examples in the Guidelines are designed to be copied and pasted into documents that are set up that way.
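The equivalence of the five start-tags above can be checked mechanically. What follows is a minimal sketch, assuming Python and its standard-library `xml.etree.ElementTree` parser (neither is mentioned in this thread). Each start-tag is turned into an empty element purely for illustration, which is not a valid TEI document, and the names `DOCS` and `expanded_names` are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# The five start-tags from the comment, closed as empty elements so that
# each one is a well-formed (but not TEI-valid) document on its own.
DOCS = [
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"/>',
    '<t:TEI xmlns:t="http://www.tei-c.org/ns/1.0"/>',
    '<tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0"/>',
    '<ns03:TEI xmlns:ns03="http://www.tei-c.org/ns/1.0"/>',
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" '
    'xmlns:tei="http://www.tei-c.org/ns/1.0"/>',
]


def expanded_names(docs):
    """Return the namespace-qualified root tag of each document."""
    return [ET.fromstring(doc).tag for doc in docs]


# ElementTree writes qualified names as "{namespace-uri}local-name",
# so the prefix chosen in the source text does not appear at all.
for name in expanded_names(DOCS):
    print(name)

# For contrast: a <TEI> element with no namespace declaration at all.
print(ET.fromstring("<TEI/>").tag)
```

All five documents should report the same qualified name, while the undeclared `<TEI/>` reports a bare `TEI`, which is a different element as far as namespace-aware tools are concerned.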

Anyone want to take a crack at better wording? I have assigned @GusRiva to implement (as I am on ticket assignment duty with @trishaoconnor until tomorrow :-) ), but anyone can propose better wording. It currently says:

    <p>This element is required. It is customary to specify the TEI
    namespace <ident type="ns">http://www.tei-c.org/ns/1.0</ident> on
    it, for example: <tag type="start">TEI version="4.4.0"
    xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0"</tag>.</p>

Notes
[1] http://www.tei-c.org/ns/1.0

@sydb
Member

sydb commented Nov 30, 2024

Oh, forgot to mention that @version is explicitly optional. Some folks (including me) consider it a really good idea to use it, but it is up to you. (I think it is less confusing years later, when you come back to a document, for it to be explicit about which version of TEI it uses. In truth, you can do that with an xml-model PI or a <schemaRef> element instead, but @version on <TEI> (or <teiCorpus>) seems like a fast and easy way to do this.)

@KAKDH
Author

KAKDH commented Dec 2, 2024

Hi @sydb,

Sure, and I get your point and agree.

The teiCorpus entry, however, isn’t structured in the same way ([https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-teiCorpus.html]), i.e. there is no note that your teiCorpus should have ns declaration, even though the tei_all.rng also gives you an error if there isn’t one. The note reads “Should contain one TEI header for the corpus, and a series of TEI elements, one for each text.”

I might be wrong, but I think the confusion for my students comes from the fact that TEI XMLs are the very first XMLs they are working with, so xmlns not being included in the list of attributes is confusing. In class, when we look for attributes allowed on a certain element, we go to the REF section of the guidelines and look for the attributes in the table and ns isn’t there, so we're lost. :)

All the best,
Katarzyna

@sydb
Member

sydb commented Dec 2, 2024

Response to @KAKDH

Oh, good point: the <teiCorpus> reference page does not mention the namespace declaration, and it probably should.

So I agree wholeheartedly that the “notes” section for both should be more forceful. But there is not much more we can do. The reason @xmlns is not listed along with the other attributes is that it is not an attribute;[1] it is a namespace declaration.

To prove this to yourself, enter this wee snippet into a new XML document in oXygen:

<?xml version="1.0" encoding="UTF-8"?>
<greeting xmlns="http://www.example.edu/ns" xml:lang="en" type="secular">Happy Holidays!</greeting>

Then, using the XPath box in the upper left, search for all attributes with //@*. You will find only 2 items: the @type attribute and the @xml:lang attribute. Then search for all namespaces with //namespace::*. You will again find 2 items: the XML namespace and the one I made up for the example.[2]

(Oxygen, at least for me, is inconsistent about whether it colors namespace declarations the same as attributes or not; but in any case, the non-prefixed namespace has always been the same color as an attribute, IIRC.)
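The same experiment can be approximated outside oXygen. Below is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`, which is not the tool used in this thread. ElementTree has no XPath namespace axis, so the sketch lists namespace declarations through the parser's `start-ns` events instead. Because only declarations actually written in the document are reported, the implicit XML namespace does not show up, and the count here is one rather than the two that oXygen displays. The variable names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# The greeting element from the snippet above (XML declaration omitted).
GREETING = (
    '<greeting xmlns="http://www.example.edu/ns" xml:lang="en" '
    'type="secular">Happy Holidays!</greeting>'
)

# Attributes as seen by a namespace-aware parser: xmlns is not among them.
root = ET.fromstring(GREETING)
attribute_names = set(root.attrib)

# Namespace declarations are reported separately, as (prefix, uri) pairs;
# an unprefixed (default) declaration has the empty string as its prefix.
events = ET.iterparse(io.BytesIO(GREETING.encode("utf-8")), events=("start-ns",))
declared_namespaces = [ns for _event, ns in events]

print(attribute_names)
print(declared_namespaces)
```

The attribute set should contain only `type` and the qualified form of `xml:lang`, which matches the point that the namespace declaration is not an attribute in the data model these tools use.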

The good news, of course, is that all the examples of <TEI> or <teiCorpus> in the Guidelines show the namespace declaration, and oXygen puts it there without being told to. On those occasions when I teach TEI before we have covered namespaces I just explain it (the namespace declaration) away as being some magic stuff, like the PIs immediately above it, to be ignored until later. Not sure if that is satisfactory in your circumstances or not.

For implementer(s)
I am not convinced it is necessary to note here the fact that a <TEI> element is required in a TEI document. Thus here is a crack at a rewrite of the <remarks> for both <TEI> and <teiCorpus>:

  <remarks versionDate="2024-12-01" xml:lang="en">
    <p>As with all elements in the TEI scheme (except <gi>egXML</gi>) this element is
     in the TEI namespace. (See <ptr target="#SGname"/>.) Thus, when it is used as the
     outermost element of a TEI document it is necessary to specify the TEI namespace
     on it. This is customarily achieved by specifying the TEI namespace
     <ident type="ns">http://www.tei-c.org/ns/1.0</ident> without indicating a prefix,
     and then not using a prefix on TEI elements in the rest of the document. For example:
     <tag type="start">TEI version="4.8.1" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0"</tag>.</p>
   </remarks>
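To illustrate what "not using a prefix on TEI elements in the rest of the document" means in practice, here is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`. The tiny document is a hypothetical fragment built around the start-tag in the proposed remarks; it is far too incomplete to be valid TEI. The sketch shows that every unprefixed descendant inherits the TEI namespace from the declaration on the outermost element, and that a query must therefore name that namespace.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, deliberately incomplete document: only the root carries
# the namespace declaration, and no element uses a prefix.
DOC = """<TEI version="4.8.1" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Esempio</title></titleStmt>
    </fileDesc>
  </teiHeader>
</TEI>"""

root = ET.fromstring(DOC)

# Every element, not just the root, ends up in the TEI namespace.
all_tags = [el.tag for el in root.iter()]
in_tei_ns = all(tag.startswith("{" + TEI_NS + "}") for tag in all_tags)

# A query has to name the namespace; the prefix used in the query ("tei")
# is local to the query and need not appear in the document.
title = root.find(".//tei:title", {"tei": TEI_NS})

# Searching for an unqualified "title" finds nothing.
unqualified = root.find(".//title")

print(in_tei_ns, title.text, unqualified)
```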

Notes
[1] This is not entirely true. Modern XML tools (i.e., XPath, and thus XSLT, XQuery, Schematron, etc.) use the XDM (the “XQuery and XPath Data Model”), for which this is true. But when parsing an XML document and validating against a DTD, what we now call a namespace declaration has to be treated as an attribute. This is because DTDs were settled long before namespaces were invented.
[2] Interesting to note that the XML namespace shows up in the search results even if @xml:lang is not there. The XML namespace is defined (in the XML specification) as being available without declaration, so (I guess) oXygen finds it easier to consider it present on all documents.

@KAKDH
Author

KAKDH commented Dec 3, 2024

Hi @sydb,

Thanks a lot for your thorough response. This is very helpful – as always.

A quick off-topic, just to give you some context where my question is coming from:
The confusion indeed came up with the assignment for which my students were supposed to write their own short DTD and validate their basic encoding against it. Some of them used simply XML as their root and had no issues, while others used TEI and then Oxygen forced the namespace, which in DTD looked like an attribute, but an attribute not listed in the TEI guidelines. You know, it’s just like with Google these days: if something isn’t in the guidelines it doesn’t exist. Later this term we will talk about ODD & Schematron, for which they will have to master namespaces, so I am sure it will become more clear. End of off-topic.

All the best,
Katarzyna

@GusRiva
Member

GusRiva commented Dec 3, 2024

This discussion was a very interesting read!
I like @sydb 's rewriting of the remark. Will make a pull request with it.
