Supply parameters for compression algorithms #22

Open
yoavweiss opened this issue Dec 4, 2019 · 7 comments
Labels
addition/proposal (New features or enhancements)
needs concrete proposal (Moving the issue forward requires someone to figure out a detailed plan)

Comments

@yoavweiss
Contributor

Even for the built-in algorithms (gzip and deflate), there are various parameters that users could supply to unlock some use cases. Examples:

  • Compression level - determines the tradeoff between CPU usage and compression ratio.
  • Flushing strategy - flushing strategies can have a significant impact on the tradeoff between added latency and compression ratio when compressing streams. Also, some protocols (e.g. SSH) rely on very specific flushing strategies, so implementing them on top of this API may depend on being able to set them.
@ricea
Collaborator

ricea commented Dec 4, 2019

I am expecting to add an options bundle as the second argument to the constructor. So, for example, we'd have something like:

new CompressionStream('deflate', {
    level: 0.1,
    flush: 'always'
});

How the parameters work will be difficult to fix if we get it wrong, which is why they aren't in the initial version of the standard.
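
For illustration, such a stream would presumably still be used through the existing streams plumbing (e.g. pipeThrough); the options bag below is hypothetical, and only the one-argument form like new CompressionStream('gzip') is specified today:

// Usage sketch only: the second argument is hypothetical and not part of the shipped API.
const compressed = new Blob(['hello hello hello']).stream()
    .pipeThrough(new CompressionStream('deflate', { level: 0.1, flush: 'always' }));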

@yoavweiss
Contributor Author

That makes perfect sense, thanks! :)

@chris-morgan

I imagine that will also include the use of a custom dictionary? At present, for Fastmail, we use pako in a worker to compress our API request bodies, with a custom dictionary because it makes the compression much more effective. I had hoped that we could switch to a standard API that would compress faster with less code loaded.
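
A minimal sketch of that current workaround, assuming pako's level and dictionary options; the dictionary contents and function name here are made up:

// Run inside a worker. The dictionary is illustrative only; in practice it would
// contain strings that commonly appear in the API request bodies being compressed.
import { deflate } from 'pako';

const dictionary = new TextEncoder().encode('{"methodCalls":[["Email/query",{"accountId":');

function compressRequestBody(bodyText) {
    // level and dictionary are pako (zlib) options; CompressionStream currently
    // has no equivalent.
    return deflate(bodyText, { level: 9, dictionary });
}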

Given the already-niche status of manual compression in JavaScript (for web systems specifically, I personally can’t think of having heard of even one other user, though doubtless some exist), I was a little surprised to hear of this shipping in Chrome without support for varying the compression level or providing a custom dictionary. Manual compression is rare enough that I’d guess a fair fraction of those who do use it have tuned things carefully, and so will not be able to adopt this new API for now without altering that balance.

My surprise is probably because I believe supporting at least those two parameters (level and dictionary) is quite straightforward, with the broad approach (an options object to the constructor) being obvious, leaving only individual options decisions that should just be made, where discussion is unlikely to affect matters. For starters, I take it as given that the available options depend wholly on the compression method selected.

level could reasonably be an enum, an integer, or a float in the range 0–1; given JavaScript and the conventions of extant compression software, probably an integer. Its reasonable range could be 0–3 (matching FLEVEL) or 1–9 (matching most software). The default also varies: for FLEVEL’s 0–3, 2 is defined as the default; for compression tools with a level of 1–9, some default to 8 (e.g. the zlib library) and others to 6 (e.g. gzip(1)). These numbers are, of course, fairly arbitrary anyway. You could then either leave the default unspecified, or pick 6 or 8 and run with it.

For compression, dictionary should probably be a String containing only ASCII, an ArrayBuffer, or a Uint8Array. For decompression, you might wish to provide more than one dictionary, so perhaps dictionary (or dictionaries?) would be an object mapping Adler-32 checksums to dictionaries.
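
To make that shape concrete, here is one hypothetical (entirely unspecified) reading of the above; the option names and the Adler-32 key are placeholders:

const enc = new TextEncoder();

// Compression: a single dictionary plus an integer level.
new CompressionStream('deflate', {
    level: 6,
    dictionary: enc.encode('boilerplate shared by the payloads')
});

// Decompression: candidate dictionaries keyed by Adler-32, so the right one can be
// matched against the DICTID field of the zlib header. The key below is a
// placeholder, not the real checksum of that string.
new DecompressionStream('deflate', {
    dictionaries: {
        0x12345678: enc.encode('boilerplate shared by the payloads')
    }
});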

@ricea
Collaborator

ricea commented Feb 6, 2020

I imagine that will also include the use of a custom dictionary?

Yes, that's on the roadmap, although I don't think I've specifically mentioned it here. I filed issue #27 to make it explicit.

I was a little surprised to hear of this shipping in Chrome without support for varying the compression level or providing a custom dictionary

I believe in shipping the most uncontroversial parts of an API first. We need to assess demand to set the priority for shipping more advanced features.

individual options decisions that should just be made, where discussion is unlikely to affect matters.

Discussion does affect matters. You gave four different approaches to level yourself. Someone else may have another suggestion which copes well with libdeflate having levels all the way up to 12, or with zopfli providing a higher level of compression that is extraordinarily expensive.

@noell

noell commented May 8, 2020

The parameter should be a float in [0..1], similar to toDataURL and toBlob, imho.

Mapping to internal compression levels (say, libdeflate's 12) should be an unspecified internal implementation detail [1].

[1] toDataURL and toBlob are spec'd that way. How their parameters are mapped to the internal details of a codec is not in the spec. That was intentional: it allows browser vendors some wiggle-room to choose what's best for their underlying implementations.
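
For reference, the precedent being cited looks like this; the CompressionStream line is hypothetical and only illustrates the analogous shape:

// Existing precedent: quality is a unitless float in [0, 1]; how it maps onto the
// codec's internal settings is left to the implementation.
const canvas = document.createElement('canvas');
canvas.toBlob(blob => console.log(blob.size), 'image/jpeg', 0.8);
const dataUrl = canvas.toDataURL('image/webp', 0.8);

// Hypothetical, unspecified equivalent for compression:
new CompressionStream('gzip', { level: 0.8 });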

@ricea
Collaborator

ricea commented May 8, 2020

@noell I didn't know about the quality argument to toDataURL and toBlob. That's a good precedent.

I feel there should be some kind of restriction on implementations. For example, level: 0.1 should use less CPU than level: 0.9, and the difference between 0.8 and 1.0 shouldn't be more than a factor of 2. The reason is that code that performs well in one browser shouldn't perform badly in another.

@jasnell

jasnell commented Feb 8, 2022

In addition to options for the compression algorithm, it would be good to be able to set the queuing strategies for these as well, following the same approach as the TransformStream constructor.
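
For comparison, the TransformStream constructor already accepts writable and readable queuing strategies as its second and third arguments; a CompressionStream could hypothetically take the same pair after the format (and any options bag), though no such signature is specified:

// Existing pattern: TransformStream(transformer, writableStrategy, readableStrategy).
new TransformStream(
    { transform(chunk, controller) { controller.enqueue(chunk); } },
    new ByteLengthQueuingStrategy({ highWaterMark: 16384 }),
    new CountQueuingStrategy({ highWaterMark: 4 })
);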

@ricea mentioned this issue Apr 9, 2024
@annevk added the addition/proposal and needs concrete proposal labels Oct 4, 2024