POL_1 not always reported #161

Open
maria-messerschmidt opened this issue Jun 26, 2024 · 2 comments · Fixed by #168

@maria-messerschmidt

This is somewhat related to #160, since POL_1 should catch all encrypted files, not just the ones that aren't stored. The examples in #160 show that this is not the case. Below are a few more examples that relate specifically to POL_1.

Scenario 1: Working scenario -> POL_1
POL_1 is correctly reported when a file is saved with a password (encryption) from, for example, LibreOffice:

E002b.ods

Which produces this error log:
C:\odf\odf-validator-main>odf-validator.bat -p "filer\testfiler\E002b.ods"
APP-1: [INFO] Validating filer\testfiler\E002b.ods.
APP-5: [INFO] DNA ODF Spreadsheets Preservation Specification Profile report for filer\testfiler\E002b.ods.
POL_1: Object 1\styles.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: Object 1\content.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: settings.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: Object 1\settings.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: manifest.rdf [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: meta.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: ObjectReplacements\Object 1 [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: Pictures\10000000000001AE000002009161D160.png [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: styles.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
POL_1: content.xml [ERROR] Encryption | The package MUST NOT contain any encrypted entries.
NOT VALID, 10 errors, 0 warnings and 0 info messages.

However, this is the only scenario in which encryption is caught.

Also, the zip entries themselves are not flagged as encrypted at the zip level (although the document is password-protected, so reporting POL_1 is valid).

For example content.xml from this file:
Path = content.xml
Folder = -
Size = 4656
Packed Size = 4656
Modified = 2024-06-26 09:12:38
Created =
Accessed =
Attributes =
Encrypted = -
Comment =
CRC = DB07EAF9
Method = Store
Characteristics = Descriptor UTF8
Host OS = FAT
Version = 20
Volume Index = 0
Offset = 18854
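
To illustrate why the listing above can show `Encrypted = -` while the document is still password-protected: in an ODF package saved with a password, encryption is declared per entry in `META-INF/manifest.xml` through `manifest:encryption-data` elements, not through the zip headers. The following is a minimal sketch of reading that declaration. The function name `manifest_encrypted_entries` is hypothetical, and this is not how odf-validator itself implements POL_1.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF package manifest.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_encrypted_entries(package) -> list[str]:
    """Return the full-path of every manifest entry that has encryption-data.

    `package` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    encrypted = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        # A child encryption-data element marks the entry as encrypted.
        if entry.find(f"{{{MANIFEST_NS}}}encryption-data") is not None:
            encrypted.append(entry.get(f"{{{MANIFEST_NS}}}full-path"))
    return encrypted
```

Under these assumptions, calling `manifest_encrypted_entries("E002b.ods")` would be expected to list the same entries that POL_1 reports, even though none of them carries the zip-level encryption flag.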

Scenario 2: encrypting a full ODS-package
I tried encrypting the ODS-file itself and checked that all zip entries were encrypted.

E002c.ods

Example of attributes for content.xml

Path = content.xml
Folder = -
Size = 23922
Packed Size = 3990
Modified = 2024-06-26 09:28:02.0000000
Created =
Accessed =
Attributes =
Encrypted = +
Comment =
CRC = 64587F76
Method = pkAES-256 Deflate
Characteristics = NTFS StrongCrypto : Encrypt StrongCrypto UTF8
Host OS = FAT
Version = 51
Volume Index = 0
Offset = 31176

Then ran validation which produced the following:

C:\odf\odf-validator-main>odf-validator.bat -p "filer\testfiler\E002c.ods"
APP-1: [INFO] Validating filer\testfiler\E002c.ods.
APP-2: [ERROR] Unsupported feature encryption used in entry settings.xml

So here we are back to the APP-2 error (and, when the validator is run without a profile, the various errors documented in #160 that appear instead of APP-2).

C:\odf\odf-validator-main>odf-validator.bat "filer\testfiler\E002c.ods"
APP-1: [INFO] Validating filer\testfiler\E002c.ods.
org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException: Unsupported feature encryption used in entry settings.xml
at org.apache.commons.compress.archivers.zip.ZipUtil.checkRequestedFeatures(ZipUtil.java:147)
at org.apache.commons.compress.archivers.zip.ZipFile.getInputStream(ZipFile.java:953)
at org.openpreservation.format.zip.ZipFileProcessor.getEntryInputStream(ZipFileProcessor.java:116)
at org.openpreservation.odf.pkg.PackageParserImpl.processEntry(PackageParserImpl.java:129)
at org.openpreservation.odf.pkg.PackageParserImpl.processZipEntries(PackageParserImpl.java:109)
at org.openpreservation.odf.pkg.PackageParserImpl.parsePackage(PackageParserImpl.java:100)
at org.openpreservation.odf.pkg.PackageParserImpl.parsePackage(PackageParserImpl.java:70)
at org.openpreservation.odf.validation.ValidatingParserImpl.parsePackage(ValidatingParserImpl.java:74)
at org.openpreservation.odf.validation.Validator.validatePackage(Validator.java:107)
at org.openpreservation.odf.validation.Validator.validate(Validator.java:83)
at org.openpreservation.odf.apps.CliValidator.validatePath(CliValidator.java:68)
at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:60)
at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:35)
at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
at picocli.CommandLine.access$1500(CommandLine.java:148)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
at picocli.CommandLine.execute(CommandLine.java:2170)
at org.openpreservation.odf.apps.CliValidator.main(CliValidator.java:87)
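
The zip-level encryption seen in Scenario 2 (`Encrypted = +`, `StrongCrypto`) is recorded in each entry's general purpose bit flag: bit 0 marks the entry as encrypted and bit 6 marks strong encryption. These flags live in the zip headers, so they can be read without decrypting anything. A minimal sketch using Python's standard `zipfile` module follows. The function names are hypothetical and this is not the validator's code.

```python
import zipfile

ENCRYPTED_FLAG = 0x0001          # general purpose bit 0: entry is encrypted
STRONG_ENCRYPTION_FLAG = 0x0040  # general purpose bit 6: strong encryption


def zip_level_encryption(info: zipfile.ZipInfo) -> str:
    """Classify one entry from its header flags alone (hypothetical helper)."""
    if not info.flag_bits & ENCRYPTED_FLAG:
        return "none"
    if info.flag_bits & STRONG_ENCRYPTION_FLAG:
        return "strong"
    return "encrypted"


def encrypted_entries(infos) -> list[str]:
    """Return the names of all entries whose zip header says 'encrypted'."""
    return [i.filename for i in infos if zip_level_encryption(i) != "none"]
```

With a real package this could be called as `encrypted_entries(zipfile.ZipFile("E002c.ods").infolist())`. Because only the headers are inspected, a check like this could report every encrypted entry instead of stopping at the first one that cannot be opened.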

Scenario 3: working POL_1 scenario without profile
I wanted to see what happens when the file saved with a password from LibreOffice is validated without a profile. I assume this should be a valid file, since the content is encrypted but also stored.

E002b.ods

However, the output has a number of errors:
C:\odf\odf-validator-main>odf-validator.bat "filer\testfiler\E002b.ods"
APP-1: [INFO] Validating filer\testfiler\E002b.ods.
APP-4: [INFO] Validation report for filer\testfiler\E002b.ods.
XML-3: settings.xml [ERROR] Not a well formed XML document. XML parsing exception at line 1 and column 1: Invalid byte 2 of 4-byte UTF-8 sequence..
DOC-3: mimetype [INFO] OpenDocument MIMETYPE application/vnd.oasis.opendocument.spreadsheet detected
XML-3: manifest.rdf [ERROR] Not a well formed XML document. XML parsing exception at line 1 and column 1: Invalid byte 2 of 2-byte UTF-8 sequence..
XML-3: meta.xml [ERROR] Not a well formed XML document. XML parsing exception at line 1 and column 1: Invalid byte 1 of 1-byte UTF-8 sequence..
PKG-7: Thumbnails\thumbnail.png [WARNING] An OpenDocument Package SHOULD contain a preview image Thumbnails/thumbnail.png.
XML-3: content.xml [ERROR] Not a well formed XML document. XML parsing exception at line 1 and column 1: Invalid byte 1 of 1-byte UTF-8 sequence..
XML-3: styles.xml [ERROR] Not a well formed XML document. XML parsing exception at line 1 and column 1: Invalid byte 2 of 2-byte UTF-8 sequence..
NOT VALID, 5 errors, 1 warnings and 1 info messages.

PKG-7 is expected since the thumbnail isn't generated in this scenario, but I am not sure about the rest.
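
A likely reading of the XML-3 errors is that the encrypted entry content is being handed to an XML parser as if it were plain text, and ciphertext is not well-formed XML. The sketch below shows that effect and one possible way to avoid it by skipping entries known to be encrypted. The function names, the `encrypted_names` parameter, and the returned strings are all hypothetical. The validator's actual behaviour may differ.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if `data` parses as a well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def xml_check(name: str, data: bytes, encrypted_names: set) -> str:
    """Hypothetical check that skips entries declared as encrypted."""
    if name in encrypted_names:
        return "skipped (encrypted)"
    return "ok" if is_well_formed(data) else "XML-3"
```

Here `encrypted_names` stands for whatever set of entry names the package declares as encrypted. How that set is obtained is outside this sketch.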

@carlwilson carlwilson linked a pull request Jul 25, 2024 that will close this issue
@maria-messerschmidt

I have tested the fix for this and am getting the following logs:

C:\odf\odf-validator-main>odf-validator.bat -p "C:\Users\maria\Desktop\2024-07\AT068\AT068.ods"
APP-1: [INFO] Validating C:\Users\maria\Desktop\2024-07\AT068\AT068.ods.
SYS-1: [ERROR] Package could not be parsed, due to an exception. | The following zip entries could not be read: settings.xml: Unsupported Zip feature: compression method
META-INF/manifest.xml: Unsupported Zip feature: compression method
manifest.rdf: Unsupported Zip feature: compression method
mimetype: Unsupported Zip feature: compression method

C:\odf\odf-validator-main>odf-validator.bat -p "C:\Users\maria\Desktop\2024-07\AT040\AT040tmp.ods"
APP-1: [INFO] Validating C:\Users\maria\Desktop\2024-07\AT040\AT040tmp.ods.
SYS-1: [ERROR] Package could not be parsed, due to an exception. | The following zip entries could not be read: settings.xml: Unsupported Zip feature: compression method
manifest.rdf: Unsupported Zip feature: compression method
meta.xml: Unsupported Zip feature: compression method
styles.xml: Unsupported Zip feature: compression method

It would be good to (1) ensure there is still a policy error of some sort (POL_1 or POL_2) and (2) distinguish between compression and encryption, if possible.
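
One way to separate the two cases is to look at the zip header of each unreadable entry: check the encryption flag first, and only then the compression method. The sketch below assumes that only stored and deflated entries are acceptable in a package. The function name and the returned labels are hypothetical, and nothing here is taken from the fix in #168.

```python
import zipfile

ENCRYPTED_FLAG = 0x0001  # general purpose bit 0: entry is encrypted
# Assumption: only these two methods are acceptable in a package.
SUPPORTED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def unreadable_reason(info: zipfile.ZipInfo):
    """Return why an entry cannot be read, or None if it looks readable."""
    # Encryption is tested first: some encryption schemes also change the
    # method field (WinZip AES uses method 99), which would otherwise be
    # reported as an unsupported compression method.
    if info.flag_bits & ENCRYPTED_FLAG:
        return "encryption"
    if info.compress_type not in SUPPORTED_METHODS:
        return "compression method"
    return None
```

A report built on this could then map "encryption" to POL_1 and "compression method" to a separate message, so that the policy error is not lost when the package cannot be fully parsed.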

When running the validator without a profile, XML-3 errors are still generated for encrypted files, since these cannot be parsed.

@carlwilson carlwilson added the bug Something isn't working label Sep 9, 2024
@maria-messerschmidt

We also need to check how the new error for this is surfaced in the API. I will update the issue with more details once I have been able to test this but, as discussed, it likely does not currently work with the API.

@carlwilson carlwilson self-assigned this Oct 3, 2024