Zstd: ZSTD_getDecompressedSize is obsolete and used incorrectly. #499
Comments
We might borrow from python-zstandard here:
what if we just added python-zstandard as a dependency
We probably should. My suggestion is that we break numcodecs up into smaller packages focusing on individual upstream codecs and then combine them via some metapackage. It would also be good if we had a mechanism to manage multiple implementations of the same codec.
I have mentioned this before, but
cramjam does not do blosc though. I suppose we could just depend on https://github.com/Blosc/python-blosc2 . I think we may need a new repository or GitHub organization for this. Could someone sketch out an optional dependency package structure and how we make that work both in pip and conda or prefix?
Correct, but it does deal with issues like this specific one. And indeed, I'd be happy to delegate blosc to someone else too (or push on milesgranger/cramjam#110).
This is sounding a bit as if a WG like the zarr-python-refactoring WG might also be appropriate here.
numcodecs currently treats a return value of 0 from ZSTD_getDecompressedSize as an input error. However, a value of zero could mean any one of the following:
- the frame is empty
- the decompressed size is unknown
- an error occurred

numcodecs/numcodecs/zstd.pyx
Lines 151 to 153 in 366318f
Rather, numcodecs should use ZSTD_getFrameContentSize, whose return value differentiates the cases:
- 0 means empty
- 0xffffffffffffffff, ZSTD_CONTENTSIZE_UNKNOWN, means unknown
- 0xfffffffffffffffe, ZSTD_CONTENTSIZE_ERROR, means error

See zstd.h or the manual for a reference.
https://github.com/facebook/zstd/blob/7cf62bc274105f5332bf2d28c57cb6e5669da4d8/lib/zstd.h#L195-L203
https://facebook.github.io/zstd/zstd_manual.html
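To make the three-way distinction concrete, here is a minimal pure-Python sketch that reads the content size from a frame header and reports it with the same sentinel values. It is not the numcodecs or libzstd implementation: the names frame_content_size and describe are hypothetical, the constants only mirror the zstd.h macros, and the header layout is taken from the Zstandard frame format (RFC 8878). Skippable frames and reserved bits are deliberately not handled.

```python
ZSTD_MAGIC = 0xFD2FB528
CONTENTSIZE_UNKNOWN = 0xFFFFFFFFFFFFFFFF  # mirrors ZSTD_CONTENTSIZE_UNKNOWN
CONTENTSIZE_ERROR = 0xFFFFFFFFFFFFFFFE    # mirrors ZSTD_CONTENTSIZE_ERROR


def frame_content_size(src: bytes) -> int:
    """Hypothetical helper: read the content size from a zstd frame header."""
    if len(src) < 5 or int.from_bytes(src[:4], "little") != ZSTD_MAGIC:
        return CONTENTSIZE_ERROR
    descriptor = src[4]
    fcs_flag = descriptor >> 6              # Frame_Content_Size_flag
    single_segment = (descriptor >> 5) & 1  # Single_Segment_flag
    dict_id_flag = descriptor & 3           # Dictionary_ID_flag
    pos = 5
    if not single_segment:
        pos += 1                            # skip Window_Descriptor
    pos += (0, 1, 2, 4)[dict_id_flag]       # skip Dictionary_ID
    # The size field is absent only when the flag is 0 and the frame is
    # not single-segment; that is the "unknown" case.
    if fcs_flag == 0 and not single_segment:
        width = 0
    else:
        width = (1, 2, 4, 8)[fcs_flag]
    if len(src) < pos + width:
        return CONTENTSIZE_ERROR            # header is truncated
    if width == 0:
        return CONTENTSIZE_UNKNOWN
    value = int.from_bytes(src[pos:pos + width], "little")
    return value + 256 if width == 2 else value  # 2-byte field is offset by 256


def describe(src: bytes) -> str:
    """Classify a buffer the way a caller of ZSTD_getFrameContentSize could."""
    size = frame_content_size(src)
    if size == CONTENTSIZE_ERROR:
        return "error"
    if size == CONTENTSIZE_UNKNOWN:
        return "unknown"
    if size == 0:
        return "empty"
    return f"{size} bytes"


print(describe(bytes.fromhex("28b52ffd2000")))  # single-segment, size field 0
print(describe(bytes.fromhex("28b52ffd0058")))  # no size field in the header
print(describe(b"not zstd"))                    # wrong magic number
```

The point of the sketch is only that empty, unknown, and error are separate outcomes, so a decoder can reject the third without also rejecting the first two.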
This error arose during the implementation of Zstandard in n5-zarr:
saalfeldlab/n5-zarr#35
There the compressor was producing blocks which would return ZSTD_CONTENTSIZE_UNKNOWN. ZSTD_getDecompressedSize would return 0, and numcodecs would incorrectly interpret this as an error.

Handling ZSTD_CONTENTSIZE_UNKNOWN may be difficult.
- If a dest buffer is provided, then perhaps its size should be taken as the expected decompressed size, and an error should occur if the decompressed size does not match.
- If a dest buffer is not provided, we may need to either use a default or use the streaming API to build a growing buffer until all the data is decompressed.
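The two options above could look roughly like the sketch below. To keep it runnable with only the standard library, zlib's streaming decompressor is used purely as a stand-in for a zstd streaming decompressor; with zstd the loop would instead drive the streaming API (ZSTD_decompressStream). The function name decompress_unknown_size and the chunk_size parameter are hypothetical, not part of numcodecs.

```python
import zlib


def decompress_unknown_size(src, dest=None, chunk_size=65536):
    """Decompress src when the content size is not known up front."""
    # Stand-in only: zlib is used here in place of a zstd stream.
    stream = zlib.decompressobj()
    out = bytearray()
    pending = bytes(src)
    while not stream.eof:
        # Produce at most chunk_size bytes per call and grow the buffer.
        chunk = stream.decompress(pending, chunk_size)
        pending = stream.unconsumed_tail
        if not chunk and not pending:
            # No output, no input left, and no end-of-stream marker seen.
            raise ValueError("input is truncated or corrupt")
        out += chunk
        if dest is not None and len(out) > len(dest):
            raise ValueError("decompressed data is larger than dest")
    if dest is None:
        return bytes(out)
    # With a dest buffer, its length is treated as the expected size.
    if len(out) != len(dest):
        raise ValueError("decompressed size does not match dest")
    dest[:] = out
    return dest


payload = b"zarr chunk " * 50000
compressed = zlib.compress(payload)
assert decompress_unknown_size(compressed) == payload
```

A real implementation would likely write directly into dest rather than copying at the end; the intermediate buffer is kept here only to keep the control flow short.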