schema for corpus.xml #44

Open
ingoboerner opened this issue Feb 8, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@ingoboerner
Collaborator

The file corpus.xml seems to be required in each (all?) GitHub repository. It is undocumented and does not validate against tei-all. We should look into that.

Error level: error
Description: element "fileDesc" incomplete; missing required element "sourceDesc"

Error level: error
Description: element "teiCorpus" incomplete; expected element "TEI", "facsimile", "fsdDecl", "sourceDoc", "standOff", "teiCorpus" or "text"

We don't need the XInclude namespace declared on the root element.
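
To make the two validation errors easier to reproduce without a full schema validator, here is a minimal sketch that checks for the same two structural gaps. It is not a replacement for validating against tei-all; the sample document and the helper `find_problems` are hypothetical and only mirror what the error messages above describe.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, simplified corpus.xml containing only a header (not a real file).
SAMPLE = f"""<teiCorpus xmlns="{TEI_NS}">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Example Corpus</title></titleStmt>
      <publicationStmt><p>Example</p></publicationStmt>
    </fileDesc>
  </teiHeader>
</teiCorpus>"""

# Element names the validator message lists as expected inside teiCorpus.
CONTENT_ELEMENTS = {
    "TEI", "facsimile", "fsdDecl", "sourceDoc", "standOff", "teiCorpus", "text",
}


def find_problems(xml_text):
    """Return a list of the structural gaps reported in this issue."""
    root = ET.fromstring(xml_text)
    problems = []

    # First error: fileDesc without a sourceDesc child.
    file_desc = root.find(f"{{{TEI_NS}}}teiHeader/{{{TEI_NS}}}fileDesc")
    if file_desc is not None and file_desc.find(f"{{{TEI_NS}}}sourceDesc") is None:
        problems.append("fileDesc is missing sourceDesc")

    # Second error: no content element among the children of the root.
    local_names = {child.tag.split("}")[-1] for child in root}
    if not local_names & CONTENT_ELEMENTS:
        problems.append("teiCorpus has no content element after teiHeader")

    return problems


print(find_problems(SAMPLE))
```

This only tests for presence, not for order or for the rest of the content model, so a file that passes this check can still fail real validation.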

@ingoboerner ingoboerner added the enhancement New feature or request label Feb 8, 2023
@ingoboerner
Collaborator Author

See related issue #68 (easy to fix) BUT:

We would have to rework the corpus.xml:

Currently, it includes only a <teiHeader>, but to be valid it MUST also include one of the elements <TEI>, <standOff>, <teiCorpus> or <text>. I could adapt the content model, but then we would have a customization under which the file still does not validate against tei-all.

One option would be to include references to the individual TEI files of the plays. We have XInclude for that (which I don't really like because it causes problems with eXist, or at least it used to).
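
For illustration, a rough sketch of what generating such references could look like. The file paths and the helper `build_corpus` are made up, the header content is left out, and the result would only become a complete <teiCorpus> once the includes are resolved by an XInclude-aware processor.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XI_NS = "http://www.w3.org/2001/XInclude"

# Serialize TEI as the default namespace and XInclude with the "xi" prefix.
ET.register_namespace("", TEI_NS)
ET.register_namespace("xi", XI_NS)


def build_corpus(play_paths):
    """Build a teiCorpus element with one xi:include per play file."""
    corpus = ET.Element(f"{{{TEI_NS}}}teiCorpus")
    # Placeholder header; a real one needs fileDesc, sourceDesc, and so on.
    ET.SubElement(corpus, f"{{{TEI_NS}}}teiHeader")
    for path in play_paths:
        ET.SubElement(corpus, f"{{{XI_NS}}}include", {"href": path})
    return corpus


# Hypothetical paths, not taken from any actual repository.
corpus = build_corpus(["tei/play-one.xml", "tei/play-two.xml"])
print(ET.tostring(corpus, encoding="unicode"))
```

Note that this approach keeps the XInclude namespace in corpus.xml, so it conflicts with dropping that declaration from the root.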

The second major problem:
The <teiHeader> in the <teiCorpus> needs to include a <sourceDesc>. We could list the sources of the individual corpus here, e.g. TextGrid, ... I looked into this for the CLSINFRA deliverable D7.3; we could aggregate the sources from the individual TEI files. See https://versioning-living-corpora.clsinfra.io/3-2_gerdracor_corpus_archeology.html
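
A minimal sketch of that aggregation idea, assuming each play records its source as a <name> inside <sourceDesc>/<bibl>. That layout, the sample data and the helper names are assumptions for illustration and would need to be checked against the actual files.

```python
from collections import Counter
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def play(source_name):
    """Build a tiny, hypothetical TEI document naming one source."""
    return (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>'
        '<sourceDesc><bibl type="digitalSource">'
        f"<name>{source_name}</name>"
        "</bibl></sourceDesc></fileDesc></teiHeader></TEI>"
    )


# Made-up sample data standing in for the individual play files.
PLAYS = [
    play("TextGrid Repository"),
    play("TextGrid Repository"),
    play("Example Archive"),
]


def aggregate_sources(tei_documents):
    """Count how often each source name occurs across the given documents."""
    counts = Counter()
    for doc in tei_documents:
        root = ET.fromstring(doc)
        for name in root.findall(".//tei:sourceDesc/tei:bibl/tei:name", NS):
            if name.text:
                counts[name.text.strip()] += 1
    return counts


print(aggregate_sources(PLAYS).most_common())
```

The distinct names collected this way could then be written as entries into the <sourceDesc> of corpus.xml.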

The other things, e.g. additional idno type values, I can fix (allow) at the schema level.
