Allow PT Registry to use 2.1 xhtml schema #24

Open
ericpyle opened this issue Jan 16, 2018 · 8 comments

Comments

@ericpyle
Contributor

  • backport 2.1 xhtml schema into text metadata 1.5 schema
  • update schema on DBL
  • update schema in PT8.0
  • create xsd for Registry to validate xhtml
@ericpyle ericpyle self-assigned this Jan 16, 2018
@ericpyle ericpyle added the Epic label Jan 16, 2018
@mvahowe
Contributor

mvahowe commented Jan 16, 2018

@ericpyle If this means changes to metadata 1.5, I think the resulting schema needs to be called something other than metadata 1.5, because a lot of things in a lot of organizations depend on the current definition of 1.5.

@ericpyle
Contributor Author

@mvahowe valid consideration. Do you think that 1.5 metadata publisher clients would not be able to handle 2.1 xhtml? Perhaps if we could get their input we may not need to bump the revision?

Otherwise, I'd just rather say PT Registry is going to need to stay put until all PT archivists are using PT 8.1.

@mvahowe
Contributor

mvahowe commented Jan 16, 2018

I think I need to understand the problem we are trying to solve here. What is going to use the new schema, and what changes are planned? If it's about making the schema stricter, we could produce a variant of the existing 1.5 schema that still works with 1.5 documents. But if we're talking about breaking changes, i.e. documents that validate with the new schema may not validate with the old one, I can't see how we can do that without bumping the version number.
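
One way to tell the two cases apart would be to run a set of existing documents through both sets of rules and list the ones that only one side accepts. A minimal sketch, assuming each schema can be reduced to a map of permitted child elements. The rule tables, element names, and function names below are hypothetical stand-ins, not the real 1.5 or 2.1 definitions:

```python
import xml.etree.ElementTree as ET

# Hypothetical nesting rules: parent tag -> set of permitted child tags.
# Illustrative only; these are NOT the real metadata 1.5 / 2.1 xhtml schemas.
OLD_RULES = {"div": {"p"}, "p": {"b", "i"}, "b": set(), "i": set()}
NEW_RULES = {"div": {"p"}, "p": {"b", "i"}, "b": {"i"}, "i": {"b"}}


def is_valid(xml_text, rules):
    """Return True if the root is known and every child sits under a permitted parent."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    if root.tag not in rules:
        return False
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node:
            if child.tag not in rules.get(node.tag, set()):
                return False
            stack.append(child)
    return True


def compare(documents, old_rules, new_rules):
    """Split documents into those only the old rules accept and those only the new rules accept."""
    result = {"old_only": [], "new_only": []}
    for doc in documents:
        old_ok = is_valid(doc, old_rules)
        new_ok = is_valid(doc, new_rules)
        if old_ok and not new_ok:
            result["old_only"].append(doc)
        elif new_ok and not old_ok:
            result["new_only"].append(doc)
    return result
```

An empty `old_only` list would suggest existing documents keep validating, and a non-empty `new_only` list would point at documents that a consumer still on the old rules might reject.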

@mvahowe
Contributor

mvahowe commented Jan 16, 2018

Even with making the schema stricter, there would be an issue if we changed that on DBL, because documents that are being validated locally, e.g. by a Biblica toolchain, could suddenly start failing on upload.

@ericpyle
Contributor Author

@mvahowe I'm looking to you to help me understand the possible impact of these changes. This is not a new request. It's an old one (that I believe was prompted by Biblica) that was under discussion as you were developing metadata 2.0.

My understanding is that the main change is to allow proper xhtml nesting with the existing 1.5 xhtml elements. So the current 1.5 is "too strict" in that regard compared to 2.1.

As for Biblica toolchain, I believe we're only talking about PT8 which would need to be updated to take advantage of any new nesting. But that's no big deal.

The changes are "breaking" if publishing metadata processors expect the broken 1.5 xhtml format. But considering that we didn't actually add new elements, I would guess that they just trust that the xhtml we supply can be passed through as html?

@mvahowe
Contributor

mvahowe commented Jan 16, 2018

@ericpyle I'd need to look again at the details, but I thought that the 2.1 HTML schema both allows things the 1.5 one didn't and disallows things the 1.5 one did. If that's the case, I'd expect to start getting schema mismatch problems if PT and DBL don't agree. My possibly faulty understanding was that Kent was using some sort of HTML generator at his end (and then maybe pasting into PT).

@ericpyle
Contributor Author

ericpyle commented Jan 16, 2018

@mvahowe thanks for looking into this more. Dan didn't make it sound like it was urgent, so if we just want to say, "we're not going to do this until everyone is using a 2.1 uploader", I think that's okay with me!

I have yet to receive any xhtml incompatibility issues with metadata 1.5 being upgraded to 2.1. I'm guessing the transforms do their best to fix things?

In any case, however/whenever we want to allow pure 2.1 xhtml data to upload, it sounds like we'd need to make some script that can upgrade all of the PT Registry entries to match the metadata 2.1 xhtml (however it got fixed by the transforms).

fwiw, Kent has to go through the PT Registry for xhtml fields for text metadata. He may use some tool for audio metadata, but even then he'd need to copy and paste it into the PT audio uploader form and have it validate before it gets uploaded to DBL.

@mvahowe
Contributor

mvahowe commented Jan 16, 2018

The XHTML pruning part of the process is in a separate file so it would be relatively easy to run somewhere else, if that was useful.
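
To give a rough idea of what a standalone pruning pass could look like (this is not the actual file, just a sketch under stated assumptions): disallowed elements are unwrapped so that their text and children survive, and everything else is copied across. The `ALLOWED` set and the `prune` function are hypothetical, and attributes are dropped for simplicity:

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list; the real set of permitted xhtml elements may differ.
ALLOWED = {"div", "p", "b", "i"}


def _append_text(parent, text):
    """Attach text at the current end of parent's content."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _copy_into(src, dest):
    """Copy src's content into dest, unwrapping elements not in ALLOWED."""
    _append_text(dest, src.text)
    for child in src:
        if child.tag in ALLOWED:
            kept = ET.SubElement(dest, child.tag)  # attributes are not copied
            _copy_into(child, kept)
        else:
            _copy_into(child, dest)  # drop the tag, keep its content
        _append_text(dest, child.tail)


def prune(xml_text):
    """Return xml_text with disallowed elements unwrapped (root assumed allowed)."""
    root = ET.fromstring(xml_text)
    out = ET.Element(root.tag)
    _copy_into(root, out)
    return ET.tostring(out, encoding="unicode")
```

For example, `prune("<div><p>a <span>b <b>c</b> d</span> e</p></div>")` returns `<div><p>a b <b>c</b> d e</p></div>`. Something along these lines could be run over the existing Registry entries as a one-off upgrade.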


2 participants