Some DOCX files are not properly recognized. #38

Open
majkel89 opened this issue Nov 9, 2024 · 0 comments

Describe the bug
I noticed this issue on one of my production services.
I was able to reproduce it using the issue311docx.testfile file from the https://github.com/file/file repository.

To Reproduce
Steps to reproduce the behavior:

  1. Go to #37 ("test: added example of docx that does not match current magic numbers").
  2. See that the tests fail on 365-issue311.docx.

Expected behavior
365-issue311.docx is properly recognized as DOCX

Additional context
DOCX is hard to recognize because it is a ZIP archive with a different extension.
A DOCX file can be verified by searching for common file names within its contents.
Moreover, a ZIP file is hard to verify because a reader should start checking it from the very end, where the central directory is located.
I am not sure whether this bug should be fixed here, or whether you just need to let others know that for DOCX they should use something else, e.g. https://github.com/hey-red/Mime/tree/master
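
For illustration, the content-based check described above could look roughly like the sketch below. It is written in Python using only the standard `zipfile` module and is not this library's code. The names `looks_like_docx` and `make_zip` are hypothetical, and the rule "has `[Content_Types].xml` plus at least one entry under `word/`" is an assumption about what a typical DOCX package contains, not a complete validation.

```python
import io
import zipfile

# Entry names assumed to appear in a typical DOCX package (assumption of this sketch).
CONTENT_TYPES = "[Content_Types].xml"
WORD_PREFIX = "word/"


def looks_like_docx(data: bytes) -> bool:
    """Return True if data is a ZIP archive that contains typical DOCX entries."""
    try:
        # zipfile locates the central directory from the end of the data,
        # so leading bytes alone are not what decides the result.
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return False
    return CONTENT_TYPES in names and any(n.startswith(WORD_PREFIX) for n in names)


def make_zip(entries: dict) -> bytes:
    """Build a small in-memory ZIP archive, used here only to demonstrate the check."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

A plain ZIP without those entries, or data that is not a ZIP at all, is rejected, while an archive containing `[Content_Types].xml` and `word/document.xml` is accepted. Whether a check like this belongs in this library or in a separate tool is the open question of this issue.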
