Client-Decompression support #123

MahdiBM · 2022-11-16T20:47:27Z

This PR adds client-decompression support for web-socket connections using Zlib.
This also partially resolves #55.

Zlib

The zlib stuff are copy-pasted from NIO-extras, but i've also removed some stuff that we don't need and added some that we do.
The reason for the copy paste was that i basically asked the NIO team if we can make NIOHTTPDecompression.Decompressor of NIO-extras public so we can use it in this package, and Cory answered:

We very deliberately don’t want those to be API. Copy-pasting them to your repo is fine 🙂

Then i decided to go full on copy-pasting and don't pull NIO-extras only because of the small Zlib c-introp target.

Manual Tests

I've tested this PR on Discord Gateway connections using my Discord library (this branch) and it was working like its not even using compression.
Discord uses deflate but i've also left the option of gzip (like NIO-extras).

FrameSequence

I also simplified the WebSocket's FrameSequence.
Previously it used to always turn a text's buffer into a string, which is wasteful if you need the actual data for for-example decoding JSON anyway.

TODO

~~This PR is still not final.~~ I need to:

Write tests.
Decide how to allocate decompression-buffer's initial capacity.

codecov-commenter · 2022-11-16T20:53:19Z

Codecov Report

Merging #123 (32b517b) into main (2d9d218) will decrease coverage by 4.17%.
The diff coverage is 84.28%.

❗ Current head 32b517b differs from pull request most recent head 8e3a62a. Consider uploading reports for the commit 8e3a62a to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #123      +/-   ##
==========================================
- Coverage   73.37%   69.20%   -4.17%     
==========================================
  Files           5        8       +3     
  Lines         507      721     +214     
==========================================
+ Hits          372      499     +127     
- Misses        135      222      +87

Impacted Files	Coverage Δ
Sources/WebSocketKit/WebSocketClient.swift	`75.00% <66.66%> (-1.80%)`	⬇️
Sources/WebSocketKit/WebSocketHandler.swift	`50.00% <75.00%> (-5.32%)`	⬇️
...ketKit/Concurrency/Compression/Decompression.swift	`84.00% <84.00%> (ø)`
Sources/WebSocketKit/WebSocket.swift	`74.90% <90.90%> (+2.79%)`	⬆️
...ocketKit/Concurrency/Compression/Compression.swift	`100.00% <100.00%> (ø)`
...urces/WebSocketKit/HTTPInitialRequestHandler.swift	`68.88% <100.00%> (+0.70%)`	⬆️

... and 1 file with indirect coverage changes

MahdiBM · 2022-11-16T21:02:18Z

Codecov seems to making problems for some of the tests (I haven't even touched tests yet)

MahdiBM · 2022-11-18T14:56:41Z

it appears this will resolve #79 too

gwynne · 2022-12-21T19:59:41Z

Sources/WebSocketKit/WebSocket.swift

@@ -51,6 +62,11 @@ public final class WebSocket {
        self.onTextCallback = callback
    }

+    /// The same as `onText`, but with raw data instead of the decoded `String`.


What purpose does this callback serve? I don't see any usage of it anywhere, including in the tests.

Currently we only have onTextCallback. What is sent to the ws is, of course, Data, not string. but when using onTextCallback, ws-kit turns the data into a string and passes the string to the users of the package. The problem is that if the text is for example in JSON format, ws-kit users need to turn the string into Data again to pass it to somewhere like JSONDecoder. so we have Data -> String -> Data instead of just Data which is wasteful. this new callback solves that problem.

like it is mentioned in this issue: #79

i should add some tests, never-the-less.

added a test to assert both text callbacks have the same behavior.
I should also add that onBinary is not the same as onTextBuffer because onBinary only is activated if the ws frame is an actual binary frame. onTextBuffer is for when the ws frame is a text frame, but users might still prefer to access the string's data directly. I did try to mix the two, but it could cause problems.

MahdiBM · 2022-12-21T20:48:45Z

the linux 5.5 test was taking too long and i re-ran it. It was due to an untouched test taking too long (AsyncWebSocketKitTests.testWebSocketEcho)
I don't see how it can be related to this PR at all. Probably swift's concurrency didn't do something on-time, if i were to guess. I have seen these kind of problems with async/await before.
EDIT: Gwynne kind-of confirmed that this is not related to this PR:

The problem is quite obviously a race condition

fumoboy007

Hmm it seems like in this proposal, the API user needs to specify the compression algorithm to use during decompression. Shouldn’t the library instead automatically negotiate the compression algorithm with the server? I only quickly glanced at RFC 7692 but it looks like there should be some sort of client–server negotiation going on?

fumoboy007 · 2023-01-07T07:40:46Z

Sources/WebSocketKit/WebSocket.swift

            switch frameSequence.type {
            case .binary:
-                self.onBinaryCallback(self, frameSequence.binaryBuffer)
+                if decompressor != nil {


if let decompressor?

We still need to support swifts older than 5.7.

Other than that, iirc we can't trigger a compiler copy of the decompressor.

We still need to support swifts older than 5.7.

Oops, I forgot about that. 😆

Other than that, iirc we can't trigger a compiler copy of the decompressor.

Hmm interesting. Then should Decompressor be a class since it manages a resource?

Then, we will have ARC overhead, which we currently don't have.
This is also not part of the API, so we don't really need to worry about it being friendly to developers, too much.
Swift NIO also had it as a struct, not class.

MahdiBM · 2023-01-07T07:56:19Z

I'll need to check on that later in some days/weeks 🤔

I'm not sure if that blocks this pr though. The current implementation surely works, although it might be suboptimal.

fumoboy007 · 2023-01-07T08:05:19Z

I'll need to check on that later in some days/weeks 🤔

I'm not sure if that blocks this pr though. The current implementation surely works, although it might be suboptimal.

I’m sure we could reuse a lot of this that you have implemented! But without the negotiation, I don’t think the server would even do compression/decompression since the client did not tell it that it supports that.

MahdiBM · 2023-01-07T08:14:36Z

RFCs are good, but you should also try not to diverge from the reality of how it actually works in practice.

I don't think this will work ....

That's not the case with Discord Gateway connections at the very least. You specify that you want zlib compression using a query parameter in your URL, then you enable the decompression in the library. Everything works nicely thereafter.

fumoboy007 · 2023-01-07T08:20:39Z

That's not the case with Discord Gateway connections at the very least. You specify that you want zlib compression using a query parameter in your URL, then you enable the decompression in the library. Everything works nicely thereafter.

Oh, interesting! Today I learned. 😊

Is this a common pattern or is it Discord-specific?

MahdiBM · 2023-01-07T08:22:36Z

I'm not sure. I will have to check later when i have time.

MahdiBM · 2023-01-11T03:57:24Z

Thank you @fumoboy007 for pointing me to the RFC. I'm working on this now ...

RFC 7692 doesn't mention gzip. It appears deflate is the main algorithm and implementing deflate should suffice for now

MahdiBM · 2023-01-11T05:57:53Z

The approach this library has taken, is that you let the websocket library know that you'd prefer compression enabled than not, then the library would try to do the negotiation with the ws server, and if successful, compression will be enabled. Otherwise compression won't be enabled but the connection will still be established with the ws server.

MahdiBM · 2023-04-05T11:36:00Z

I dug a little bit more into this, and it appears that Discord is actually not using an standard implementation of Websocket decompression, so I'll need to make sure the PR supports at least some standards before trying to merge it.

MahdiBM · 2023-04-05T11:39:24Z

Specifically, I played a little bit with "permessage-deflate no-context-takeover" mode and it seems the current implementation does not work well with that. The PR is generally fine/ok, structure-wise, but i'll probably need to make sure correct values are passed to the zlib decompressor so it can work with permessage-deflate no-context-takeover mode.

koraykoska · 2023-04-07T12:43:32Z

Will this support compression for WebSocket client as well or just as a server?

MahdiBM · 2023-04-07T12:48:54Z

This PR, whenever I actually finish it, will only support decompression as a websocket client. That's the plan at least.
We can then move forward from there.

Do you have any use cases for compression as a client though? I was looking for some test subjects to test the PR against in real world, and Google wasn't too helpful.

koraykoska · 2023-04-07T12:55:28Z

@MahdiBM I have a very specific use case. My Vapor application is a high throughput proxy that forwards HTTP or ws RPC requests from the client to a pool of backends through WebSocket connections. Right now we have many TBs of traffic per day between the proxies and the backend pool over WebSocket. As the requests and responses are JSON and we have data around gzip over HTTP we believe that we can reduce bandwidth at least 5x (probably even 7-8x).

So yes, we would need this mainly for WebSocket clients. Very happy to test this out ahead of the official release. What would be the best way to contact you so we can coordinate it?

MahdiBM · 2023-04-07T13:29:45Z

@koraykoska In that case, your best chance is probably to fork websocket-kit, and use compress-nio to compress/decompress the data sent (e.g. this is where you send text, where you receive any data).

This PR is not ready so there is not much to test. The current state is that it supports only decompressing and only when client-side (or that's what I recall), which misses the compress-on-server-side part you need.

If you're sending texts (or text-frames more specifically) there is also this issue which could help you optimize (and is fairly easy to do).

koraykoska · 2023-04-07T13:57:23Z

@MahdiBM I am not in a rush with this. And implementing it myself instead of waiting for the PR to be ready seems like a waste of time. Unless you believe it's going to take rather long until it gets merged. If not I'll simply wait. But thanks for the heads up about the separate callback.

MahdiBM · 2023-04-07T14:11:13Z

@koraykoska I'm honestly not sure how much time this is going to take. Could be a week or two or 6 months.

add full client-decompression support

81785a3

MahdiBM marked this pull request as draft November 16, 2022 20:47

MahdiBM added 3 commits November 17, 2022 13:40

add Tests

49bc530

fixes and improvements

7239cae

fix linux compile

9fc1e19

MahdiBM marked this pull request as ready for review November 17, 2022 15:09

MahdiBM added 9 commits November 17, 2022 18:41

fix api breakage?

8045aef

fix api breakage

a43978a

better reserve capacity

ac9c4e2

better reserve capacity

90051db

remove unneeded import

d5cc050

better configuration

07ec835

better namings for future development

d0f66cf

cleanup

da6fd79

remove testing-left-over

0954614

MahdiBM linked an issue Nov 18, 2022 that may be closed by this pull request

onBinary Usage? #79

Closed

gwynne reviewed Dec 21, 2022

View reviewed changes

add tests for onTextBuffer

7c87942

fumoboy007 reviewed Jan 7, 2023

View reviewed changes

fumoboy007 mentioned this pull request Jan 7, 2023

Feature Request: WebSocket Compression Support apple/swift-nio-extras#189

Open

MahdiBM added 2 commits January 11, 2023 07:30

remove gzip support for now

f1cd76e

RFC 7692 doesn't mention gzip. It appears deflate is the main algorithm and implementing deflate should suffice for now

remove decompression limit, as maxFrameSize already covers that

e90c3d7

remove unused decompression error

32b517b

minor changes

8e3a62a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Client-Decompression support #123

Client-Decompression support #123

MahdiBM commented Nov 16, 2022 •

edited

Loading

codecov-commenter commented Nov 16, 2022 •

edited

Loading

MahdiBM commented Nov 16, 2022

MahdiBM commented Nov 18, 2022

gwynne Dec 21, 2022

MahdiBM Dec 21, 2022

MahdiBM Dec 21, 2022

MahdiBM Dec 21, 2022

MahdiBM Dec 21, 2022 •

edited

Loading

MahdiBM commented Dec 21, 2022 •

edited

Loading

fumoboy007 left a comment •

edited

Loading

fumoboy007 Jan 7, 2023

MahdiBM Jan 7, 2023

fumoboy007 Jan 7, 2023

MahdiBM Jan 7, 2023

MahdiBM commented Jan 7, 2023

fumoboy007 commented Jan 7, 2023

MahdiBM commented Jan 7, 2023

fumoboy007 commented Jan 7, 2023

MahdiBM commented Jan 7, 2023

MahdiBM commented Jan 11, 2023

MahdiBM commented Jan 11, 2023

MahdiBM commented Apr 5, 2023

MahdiBM commented Apr 5, 2023

koraykoska commented Apr 7, 2023

MahdiBM commented Apr 7, 2023

koraykoska commented Apr 7, 2023 •

edited

Loading

MahdiBM commented Apr 7, 2023 •

edited

Loading

koraykoska commented Apr 7, 2023

MahdiBM commented Apr 7, 2023

Client-Decompression support #123

Are you sure you want to change the base?

Client-Decompression support #123

Conversation

MahdiBM commented Nov 16, 2022 • edited Loading

Zlib

Manual Tests

FrameSequence

TODO

codecov-commenter commented Nov 16, 2022 • edited Loading

Codecov Report

MahdiBM commented Nov 16, 2022

MahdiBM commented Nov 18, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MahdiBM Dec 21, 2022 • edited Loading

Choose a reason for hiding this comment

MahdiBM commented Dec 21, 2022 • edited Loading

fumoboy007 left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MahdiBM commented Jan 7, 2023

fumoboy007 commented Jan 7, 2023

MahdiBM commented Jan 7, 2023

fumoboy007 commented Jan 7, 2023

MahdiBM commented Jan 7, 2023

MahdiBM commented Jan 11, 2023

MahdiBM commented Jan 11, 2023

MahdiBM commented Apr 5, 2023

MahdiBM commented Apr 5, 2023

koraykoska commented Apr 7, 2023

MahdiBM commented Apr 7, 2023

koraykoska commented Apr 7, 2023 • edited Loading

MahdiBM commented Apr 7, 2023 • edited Loading

koraykoska commented Apr 7, 2023

MahdiBM commented Apr 7, 2023

MahdiBM commented Nov 16, 2022 •

edited

Loading

codecov-commenter commented Nov 16, 2022 •

edited

Loading

MahdiBM Dec 21, 2022 •

edited

Loading

MahdiBM commented Dec 21, 2022 •

edited

Loading

fumoboy007 left a comment •

edited

Loading

koraykoska commented Apr 7, 2023 •

edited

Loading

MahdiBM commented Apr 7, 2023 •

edited

Loading