Reading .bz2 files fails to decompress or segfaults #116
Comments
Some interesting findings when comparing files created with pbzip2 against files created with regular bzip2 (see the quoted analysis below).
This might be what is causing the problem.
Also: that block size value matches the default bzip2 block size of 900k.
See GitHub issue #116 for additional info
Commit c214d22 adds a compressed test file.
Thanks! Fortunately, this fails on Travis too, so we have a test.
Some other tests are now wrong because they all shared the same test data.
Oops, I'll fix that.
Actually, I was fixing it on my side, so give me a few minutes.
OK.
Split the testing of mocat directory parsing from the bzip2 issue. See discussion at #116
The other tests are fixed by keeping them as they were before and moving this issue to a new test. For efficiency, it's good to have tests that cover a bunch of issues simultaneously, but this was the simplest way.
Originally reported as a bug in NGLess (see ngless-toolkit/ngless#116). After the original report, @unode provided the following analysis:

> If using pbzip2, the parallel version of bzip2, to create the files, ngless is able to consume the files up to a certain size. In the test case I set up locally, a FastQ file with 9724 lines (266413 bytes compressed, 900170 uncompressed) causes ngless to fail with BZ2_bzDecompress: -1. Regular unix bzip2 is able to decompress the file without problems.
>
> On the other hand, if using regular bzip2, I tried as many as 90000 lines and ngless is still able to consume the files without error.
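For anyone trying to reproduce this outside of NGLess, here is a minimal sketch, assuming bzlib-conduit's `Data.Conduit.BZlib.bunzip2` conduit (the library NGLess uses for `.bz2` input); it is an illustration, not the actual NGLess code, and the module and function names are as I recall them from bzlib-conduit. It streams a file through `bunzip2` and counts the decompressed bytes; on the affected versions, a multi-stream file produced by pbzip2 should fail (or stop after the first stream), while a single-stream file from regular bzip2 should decompress fully.

```haskell
-- Hypothetical reproduction sketch (not NGLess code): stream a .bz2 file
-- through bzlib-conduit's bunzip2 and count the decompressed bytes.
import           Conduit            (runConduitRes, sourceFile, (.|))
import qualified Data.ByteString    as BS
import           Data.Conduit.BZlib (bunzip2)
import qualified Data.Conduit.List  as CL
import           System.Environment (getArgs)

main :: IO ()
main = do
  [path] <- getArgs
  n <- runConduitRes $
         sourceFile path
      .| bunzip2                                         -- expected to fail on multi-stream input (affected versions)
      .| CL.fold (\acc chunk -> acc + BS.length chunk) 0 -- total decompressed bytes
  putStrLn $ "decompressed bytes: " ++ show n
```

Running this against the same FastQ file compressed once with `bzip2` and once with `pbzip2` should show the difference.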
This is an upstream issue; I reported it there.
This has been merged upstream (snoyberg/bzlib-conduit#7). Once there is a new release and it makes it into the Stackage LTS, we can just bump the version that NGLess uses and close this issue.
Hi,
Can you perhaps share one such file?
Yes, I will send you a link.
This was tested using the 1.0.0 conda build (is this one just the wrapped static build?) as well as with several different 'static' and containerized versions from 0.9 to 1.0.1.
In all cases, loading of the data failed at the same step, but depending on the version and how it was compiled one of two errors was seen: a failure to decompress or a segfault.
We didn't try the Docker containers, but those also make use of the static builds, so they should be equally affected.
I also tried using the same binary on the .bz2 files in our test suite and all worked fine, which hints at some buffer- or file-size-related issue.
I'm currently in the process of creating a .bz2 file that is big enough to trigger the error locally. If it is not too big, I'll add it to the test suite.
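For illustration, a sketch of one way to generate such a file (the file name, read names and record layout here are made up, not the actual test data): anything just over bzip2's default 900k block size should do, since pbzip2 compresses each block as a separate stream.

```haskell
import System.IO (IOMode (WriteMode), hPutStr, withFile)

-- One synthetic FASTQ record of roughly 140 bytes; the read name is made up.
record :: Int -> String
record i = unlines
  [ "@synthetic_read_" ++ show i
  , replicate 60 'A'   -- sequence line
  , "+"
  , replicate 60 'I'   -- quality line
  ]

main :: IO ()
main = withFile "big.fq" WriteMode $ \h ->
  -- 10000 records of ~140 bytes each is well over the 900k block size.
  mapM_ (hPutStr h . record) [1 .. 10000 :: Int]
```

Compressing the result with `pbzip2 big.fq` should then give a multi-stream `big.fq.bz2` of the kind that fails, while `bzip2 big.fq` gives a single-stream file that works.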
Credits to @jakob-wirbel for finding this bug.