The XMPBox library is nice, but there are PDF files out in the wild that fail to parse. For example, dc.creator is sometimes a Bag instead of a Seq.
Ideally the parser would have a mode where it tries to read as many properties as possible and simply discards the unreadable ones. That is not good if you want to write the metadata back to a PDF, but if you just want to extract metadata, such a mode would be nice. In that mode the invalid dc.creator value would be dropped. Implementing it would require some more work.
I've seen that there is a non-strict parsing mode. I don't think it should be confused with the lenient mode proposed above, but as the name suggests, it should be less strict. So in this mode, Sequences can be read from Bags and vice versa. I left Alt cardinality as an error because it doesn't really fit in.
Maybe in one of the modes an element that should be an array but isn't could automagically be wrapped into one...
(I also believe that a Bag could always be read from a Sequence...)
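To illustrate the non-strict behaviour described above, here is a minimal standalone sketch that uses only the JDK's DOM parser. It is not the XMPBox implementation and not the code in this PR; the class `LenientArraySketch`, the method `readArray`, and the `SAMPLE` packet are hypothetical names made up for the example. In strict mode the container type must match exactly. In non-strict mode Seq and Bag are accepted for each other, while Alt is still rejected.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration only; none of these names exist in XMPBox.
class LenientArraySketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // A dc:creator written as a Bag, although a Seq is expected.
    static final String SAMPLE =
        "<rdf:Description xmlns:rdf=\"" + RDF_NS + "\" xmlns:dc=\"" + DC_NS + "\">"
        + "<dc:creator><rdf:Bag><rdf:li>Alice</rdf:li><rdf:li>Bob</rdf:li></rdf:Bag></dc:creator>"
        + "</rdf:Description>";

    // Reads the rdf:li values of the first ns:name property.
    // expectedType is "Seq", "Bag" or "Alt".
    static List<String> readArray(String xml, String ns, String name,
                                  String expectedType, boolean strict) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element root = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();

        Element property = (Element) root.getElementsByTagNameNS(ns, name).item(0);
        if (property == null) {
            return new ArrayList<>(); // property absent: nothing to read
        }

        Element container = firstChildElement(property);
        if (container == null || !RDF_NS.equals(container.getNamespaceURI())) {
            throw new IllegalArgumentException(name + " is not an RDF array");
        }

        String actualType = container.getLocalName();
        boolean exactMatch = actualType.equals(expectedType);
        // Non-strict: Seq and Bag may stand in for each other; Alt never does.
        boolean interchangeable = !strict && isSeqOrBag(actualType) && isSeqOrBag(expectedType);
        if (!exactMatch && !interchangeable) {
            throw new IllegalArgumentException(
                name + ": expected " + expectedType + " but found " + actualType);
        }

        List<String> values = new ArrayList<>();
        for (Node n = container.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && RDF_NS.equals(n.getNamespaceURI())
                    && "li".equals(n.getLocalName())) {
                values.add(n.getTextContent().trim());
            }
        }
        return values;
    }

    private static boolean isSeqOrBag(String type) {
        return "Seq".equals(type) || "Bag".equals(type);
    }

    private static Element firstChildElement(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Non-strict: the Bag is accepted where a Seq is expected.
        System.out.println(readArray(SAMPLE, DC_NS, "creator", "Seq", false));
    }
}
```

Calling `readArray(SAMPLE, DC_NS, "creator", "Seq", true)` throws instead, which corresponds to the parsing failure described at the top. A bare value that is not wrapped in any array also throws in both modes here; wrapping it automatically is left out of the sketch.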