UTF8 coding problems OAI-PMH? #9987
Comments
I suspect this is a duplicate of #9910. An upstream lib fix is in preparation (gdcc/xoai#192), but needs approval from @landreev or @pdurbin. Further feedback on that PR is very welcome!
Yes, this is definitely a duplicate of #9910.
Yes, we have a fix for this in the xoai library - thanks to @poikilotherm! - and we should be able to incorporate it into Dataverse shortly. However, it will only become part of a Dataverse release as of 6.1, which is still a couple of months away.
Well, those two guys would know! Closing as a duplicate of this issue:
#9910 has been fixed and closed. Also, I'd like to point out that what I said earlier - "Unfortunately, the current versions of xoai are not going to work with pre-6.0 versions of Dataverse" - was not true (see the linked comment below):
Thanks
What steps does it take to reproduce the issue?
The OAI request
https://heidata.uni-heidelberg.de/oai?verb=ListRecords&resumptionToken=b2Zmc2V0OjozMHxwcmVmaXg6Om9haV9kZGk=
results in invalid XML. Apparently the apostrophe (’) in the author name for https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/10034 (Siang, Ch’ng Kean) is not encoded correctly.
Firefox reports that the XML is not well-formed, and jhove also reports an error:
If I export the metadata directly to an XML format (for example https://heidata.uni-heidelberg.de/api/datasets/export?exporter=Datacite&persistentId=doi%3A10.11588/data/10034), the encoding is apparently correct.
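For anyone who wants to check a saved response offline, here is a minimal sketch of a well-formedness test using only Python's standard library. The helper name `is_well_formed` and the sample `<author>` element are made up for illustration. The "broken" sample simply truncates the apostrophe's UTF-8 byte sequence to produce invalid input; it is not a claim about what the OAI endpoint actually emits.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Illustrative document containing the author name from the report.
good = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<author>Siang, Ch’ng Kean</author>"
).encode("utf-8")

# Simulate a damaged encoding: keep only the first byte of the
# three-byte UTF-8 sequence for the apostrophe (assumption, for demo only).
apostrophe = "’".encode("utf-8")
bad = good.replace(apostrophe, apostrophe[:1])

print(is_well_formed(good))
print(is_well_formed(bad))
```

The same helper could be pointed at the body of the `ListRecords` response and at the direct export to compare the two.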
OAI-PMH requests. Maybe for all code points that take more than two bytes in UTF-8?
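To make the "more than two bytes" guess concrete, this small sketch prints how many bytes a few characters occupy in UTF-8. The helper `utf8_len` is hypothetical; the sample characters are an ASCII apostrophe, an accented letter, and the typographic apostrophe (U+2019) from the author name. It only shows the byte lengths and does not confirm that the byte length is the actual cause.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


samples = ["'", "é", "’"]  # U+0027, U+00E9, U+2019
for ch in samples:
    print(f"U+{ord(ch):04X} {ch!r}: {utf8_len(ch)} byte(s)")
```

The typographic apostrophe is the only one of the three that needs three bytes, which is consistent with the guess above.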
https://heidata.uni-heidelberg.de/oai?verb=ListRecords&resumptionToken=b2Zmc2V0OjozMHxwcmVmaXg6Om9haV9kZGk=
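As a side note for reproducing this, the resumption token in the URL above is plain base64. A short sketch to decode it (the variable names are arbitrary):

```python
import base64

# Token copied verbatim from the request URL above.
token = "b2Zmc2V0OjozMHxwcmVmaXg6Om9haV9kZGk="
decoded = base64.b64decode(token).decode("ascii")
print(decoded)
```

For this token the decoded text names an offset of 30 and the `oai_ddi` prefix, so the failing page is apparently the `oai_ddi` listing starting at record 30.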
see above
All OAI-PMH users
The same encoding as in the frontend and the other exports.
Which version of Dataverse are you using?
5.13