fix(handlers): improve gzip reader according to recent changes in cpython.

The gzip._GzipReader class we inherit from was changed upstream to keep
up with changes in the zlib library and to optimize reads.

Specifically:

- GzipFile.read has been optimized. There is no longer an unconsumed_tail
  member to write back to the padded file. Leftover input is instead
  handled by the ZlibDecompressor itself, which has an internal buffer.
- _add_read_data has been inlined, as it was just two calls.
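
For context on the first point, here is a minimal sketch of the older behaviour, using only the public zlib.decompressobj API (not the ZlibDecompressor mentioned above). The payload, the trailer bytes, and the 64-byte cap are illustrative assumptions. When output is capped with max_length, compressed input that has not been processed yet is exposed through unconsumed_tail, and bytes past the end of the stream end up in unused_data.

```python
import zlib

# Illustrative data: a raw deflate stream followed by unrelated trailing bytes.
payload = b"unblob " * 4096
trailer = b"TRAILER"
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
stream = compressor.compress(payload) + compressor.flush()

decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS)

# Cap the output at 64 bytes; unprocessed input is kept in unconsumed_tail.
first_chunk = decompressor.decompress(stream + trailer, 64)
tail_after_first_call = decompressor.unconsumed_tail

# Feed the tail back until the end of the deflate stream is reached.
chunks = [first_chunk]
while not decompressor.eof:
    chunks.append(decompressor.decompress(decompressor.unconsumed_tail, 64))
result = b"".join(chunks)

# Bytes after the end of the stream are reported separately.
leftover = decompressor.unused_data
```

A decompressor that buffers leftover input internally would need no such feed-back loop, which is why the attribute may be absent on newer Python versions.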

We've adapted our own code to reflect these changes.
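
The two calls behind _add_read_data are a running CRC-32 and a running byte count. The sketch below is not the unblob code: the sample data and the 512-byte chunk size are assumptions. It accumulates both values chunk by chunk and compares them with the trailer of a single-member gzip stream, which stores CRC32 and ISIZE as two little-endian 32-bit integers.

```python
import gzip
import struct
import zlib

data = b"hello gzip " * 1000
blob = gzip.compress(data)

# Same bookkeeping as _add_read_data: update the CRC and the stream size.
crc = zlib.crc32(b"")
stream_size = 0
for start in range(0, len(data), 512):
    chunk = data[start:start + 512]
    crc = zlib.crc32(chunk, crc)
    stream_size = stream_size + len(chunk)

# The last 8 bytes of a gzip member are CRC32 and ISIZE (size modulo 2**32).
expected_crc, expected_size = struct.unpack("<II", blob[-8:])
```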

More info: python/cpython#97664
qkaiser committed Oct 13, 2023
1 parent 43dc0c7 commit c09b4a0
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion unblob/handlers/compression/_gzip_reader.py
@@ -1,4 +1,5 @@
 import gzip
+import zlib

 from ...file_utils import DEFAULT_BUFSIZE

@@ -10,14 +11,19 @@ def read_header(self):
         self._init_read()
         return self._read_gzip_header()

+    def _add_read_data(self, data):
+        self._crc = zlib.crc32(data, self._crc)
+        self._stream_size = self._stream_size + len(data)
+
     def read(self):
         uncompress = b""

         while True:
             buf = self._fp.read(DEFAULT_BUFSIZE)

             uncompress = self._decompressor.decompress(buf, DEFAULT_BUFSIZE)
-            self._fp.prepend(self._decompressor.unconsumed_tail)
+            if hasattr(self._decompressor, "unconsumed_tail"):
+                self._fp.prepend(self._decompressor.unconsumed_tail)
             self._fp.prepend(self._decompressor.unused_data)

             if uncompress != b"":
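
The hasattr guard in the diff can be illustrated in isolation. The helper below, pushback_bytes, is hypothetical and not part of unblob. It is a sketch that assumes a decompressor object always has unused_data and only sometimes has unconsumed_tail. Unlike the diff, which prepends the two values separately, it returns them concatenated for simplicity.

```python
import types
import zlib


def pushback_bytes(decompressor) -> bytes:
    """Return input bytes the decompressor did not process (hypothetical helper)."""
    # Older decompressor objects expose unconsumed_tail; newer ones may not.
    tail = getattr(decompressor, "unconsumed_tail", b"")
    return tail + decompressor.unused_data


# Stand-in for a decompressor without unconsumed_tail (an assumption for this demo).
newer_style = types.SimpleNamespace(unused_data=b"next-chunk")
from_newer = pushback_bytes(newer_style)

# A zlib.decompressobj() instance has both attributes.
older_style = zlib.decompressobj()
older_style.decompress(zlib.compress(b"abc" * 100) + b"extra")
from_older = pushback_bytes(older_style)
```

Using getattr with a default keeps a single code path for both kinds of object, at the cost of hiding which Python version is in use.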
