Allow custom options for transferable headers and non-cached headers #16

Open
wants to merge 2 commits into
base: main
from

Conversation

Matt-Yorkley
Contributor

@Matt-Yorkley Matt-Yorkley commented Aug 20, 2023

Hey there, I've recently been experimenting with Rack::Cache and some maximalist caching strategies in a Rails app, and exploring some different ideas and problems. I ended up working a couple of new configuration options into a fork of Rack::Cache to enable a little more flexibility in the way that headers are dealt with in different scenarios, in particular in specifying how custom headers should be handled during validation requests. I thought I'd submit them here as a PR/proposal in case anyone's interested. The two new options are entirely opt-in, and the defaults don't change the current Rack::Cache behaviour unless values are explicitly specified.

Non-cached headers

This option allows certain headers to be passed to the client (not removed) without being stored in the cache. This is similar to, but different from, the existing ignore_headers config option, which doesn't cache the specified headers but also strips them out of responses.
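
To make the difference concrete, here is a minimal sketch in plain Ruby. It is a toy model rather than Rack::Cache's actual implementation: `without_headers` and `split_headers` are hypothetical helpers, and the assumption is simply that ignore_headers affects both the stored and the sent headers, while non_cached_headers affects only the stored ones.

```ruby
# Toy model of the two options; hypothetical helpers, not Rack::Cache code.
# Header names are compared case-insensitively.
def without_headers(headers, names)
  lowered = names.map(&:downcase)
  headers.reject { |name, _value| lowered.include?(name.downcase) }
end

# Returns [headers_to_store, headers_to_send] for a fresh backend response.
def split_headers(backend_headers, ignore_headers: [], non_cached_headers: [])
  to_send  = without_headers(backend_headers, ignore_headers)
  to_store = without_headers(to_send, non_cached_headers)
  [to_store, to_send]
end

backend = { "Content-Type" => "text/html", "flash-notice" => "Post updated!" }

# ignore_headers: the header is dropped from the cache AND from the response.
ignored_stored, ignored_sent = split_headers(backend, ignore_headers: ["flash-notice"])

# non_cached_headers: the header is dropped from the cache only.
stored, sent = split_headers(backend, non_cached_headers: ["flash-notice"])
```

In this model `ignored_sent` no longer contains `flash-notice`, whereas `sent` still does; in both cases the stored copy omits it.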

Transfer headers

This option allows certain headers to be specified as being "transferred" from the backend (original response) to the client (actual response), even in cases where Rack::Cache is serving a cached response.

So for example, suppose Rack::Cache is processing a request for which it holds a cached entity. It validates the entity by passing a modified request with ETags to the backend, receives a 304 response, and then passes the cached response to the client. With this option, the backend can set custom headers on the 304 response, and Rack::Cache will transfer them onto the cached response sent to the client.
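
As a rough illustration of that flow, the sketch below copies the configured headers from a backend validation response onto the cached response before it is served. `apply_transfer_headers` is a hypothetical helper written for this example, not a method from Rack::Cache, and the header hashes are made-up sample data.

```ruby
# Hypothetical helper, not part of Rack::Cache's API: copies the headers named
# in transfer_headers from the backend's validation response (e.g. a 304) onto
# the cached response headers, leaving everything else untouched.
def apply_transfer_headers(cached_headers, backend_headers, transfer_headers)
  wanted = transfer_headers.map(&:downcase)
  transferred = backend_headers.select { |name, _value| wanted.include?(name.downcase) }
  cached_headers.merge(transferred)
end

cached_headers = { "Content-Type" => "text/html", "ETag" => '"abc123"' }
not_modified   = { "ETag" => '"abc123"', "flash-notice" => "Post updated!", "X-Other" => "1" }

served_headers = apply_transfer_headers(cached_headers, not_modified, ["flash-notice"])
```

Only `flash-notice` is carried over here; `X-Other` is not listed in the option, so it stays behind with the 304.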

Why?

The combination of the above two options opens up a lot of interesting possibilities for using custom headers as a mechanism for transmitting unique bits of data from the backend/application to the client side (and particularly client-side JS) even when there are fully cached responses in-between. I guess "a lot of interesting possibilities" sounds a bit vague, so I'll try to give some concrete scenarios:

Scenario 1: Flash messages

With HTTP caching and a caching proxy, the standard flash messages feature in Rails can become hilariously broken in interesting ways, as the messages are displayed in the body but play no part in the ETag generation process. If a response with a flash message gets "stuck" in the cache, all users who subsequently view the page will see the same message popping up repeatedly.

As a side note: I tried a little experiment in moving flash messages out of the <body> and into a little client-side call over ActionCable that listens for navigation events and then checks for flash messages (over a websocket), allowing the client side to display them dynamically outside of the HTTP request cycle (and independent of page caching). It works but it seems sub-optimal.

Anyway, an alternative option here is to extract the transmission of flash messages out into custom headers. This might sound a bit weird, but bear with me. If we set flash messages in headers like ["flash-notice"] = "Post updated!" and use the following configuration for Rack::Cache, the server can set these headers even if it's responding with a 304, and Rack::Cache will transfer them over to the cached response sent to the client, but without storing them in the cache (entitystore):

config.action_dispatch.rack_cache = {
  metastore: "redis://localhost:6379/1/metastore",
  entitystore: "redis://localhost:6379/1/entitystore",
  transfer_headers: ["flash-notice", "flash-error"],
  non_cached_headers: ["flash-notice", "flash-error"]
}

If the server is setting those headers, then the client side can then potentially pluck those headers out and dynamically inject a flash notice or error based on their content. In this example I'm using Hotwire/Turbo for navigation and it's easy for the client side JS to inspect the response headers on all requests and act on their content. The custom headers then become a separate mechanism for transmitting unique server->client data which runs "over the top" of the response-caching process.
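
On the backend side, building those headers could look something like the sketch below. `flash_headers` is a hypothetical helper made up for this example, and the `flash-<type>` naming simply follows the header names used in the config above; it takes any hash-like object keyed by flash type.

```ruby
# Hypothetical helper: turns flash entries into "flash-<type>" response headers.
# Only the listed types are exposed, and missing or blank messages are skipped.
def flash_headers(flash, types: %w[notice error])
  types.each_with_object({}) do |type, headers|
    message = flash[type] || flash[type.to_sym]
    next if message.nil? || message.to_s.strip.empty?

    headers["flash-#{type}"] = message.to_s
  end
end

example_headers = flash_headers({ notice: "Post updated!", error: "" })
```

In a Rails controller this might be called from a callback, with each resulting pair passed to `response.set_header`. That wiring is an assumption on my part and is not shown in the PR.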

Scenario 2: Set-Cookie

There's a bunch of very old background discussion around the way Rack::Cache handles the Set-Cookie header in this old PR: rtomayko/rack-cache#52, but to summarise: Rack::Cache completely removes the Set-Cookie header both from the cache and from all responses to the client.

In some cases this can lead to pretty annoying / app-breaking behaviour. For example, with login/logout requests that use a POST or DELETE which redirects to (for example) the homepage with a GET, if the GET response is cached then the new session cookie will not necessarily get updated on the client side, as the header is stripped out. There's some extra complexity here in the case of Fetch requests (used by Hotwire/Turbo) due to the way the Fetch specification handles following redirects. Anyway, the upshot is that the default behaviour of Rack::Cache can mess with those session-changing requests.

This behaviour can optionally be modified by using the new transfer_headers and non_cached_headers configurations and overriding the ignore_headers option (which defaults to ['Set-Cookie']) with an empty list, like this:

config.action_dispatch.rack_cache = {
  metastore: "redis://localhost:6379/1/metastore",
  entitystore: "redis://localhost:6379/1/entitystore",
  ignore_headers: [], # Override the default (["Set-Cookie"]) so cookie headers aren't stripped from responses
  transfer_headers: ["Set-Cookie"],
  non_cached_headers: ["Set-Cookie"]
}

Now the server can potentially set or change cookies as part of the validation request between Rack::Cache and the backend, and Rack::Cache will transfer the Set-Cookie header onto the (cached) response to the client, but not save that header in the cache store. This means totally cached responses can still adjust the session (or set cookies generally). Yay!

Having played around with this, it seems like preferable behaviour over the current defaults, but I guess it depends on the dev and what they're trying to do 🤷‍♂️

Scenario 3: Client side session-change awareness

This one might be a bit esoteric, but I thought I'd write it up anyway. I've been experimenting with some ways to enable client-side JS to be a little more aware of session-change events and potentially act on them in different ways. One of the issues here is that the session cookie spec is designed to very explicitly not allow JavaScript to inspect anything to do with the session cookie (for security reasons).

One way to enable JS on the client side to "see" session changes is to throw a unique token somewhere into the response (like in a <meta> tag) that's derived from the current session, for example a SHA256 digest of the session ID. If the JS grabs and records that token and then sees that it changes, then a session-change event has occurred and the client-side code can act on that. Simple!

The problem with this is that HTTP caching totally breaks this mechanism, and essentially precludes the possibility of putting any unique request-specific content anywhere in the <html>. So one option is to pass that unique session-derived token in a custom header and have the JS interact with that instead, like this:

config.action_dispatch.rack_cache = {
  metastore: "redis://localhost:6379/1/metastore",
  entitystore: "redis://localhost:6379/1/entitystore",
  transfer_headers: ["unique-id-token"],
  non_cached_headers: ["unique-id-token"]
}

...

# (in a callback)
response.set_header(
  "unique-id-token",
  OpenSSL::Digest::SHA256.hexdigest(session.options[:id].to_s)
)
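
The token derivation itself can be tried out in isolation with Ruby's standard library. This is only a sketch: `session_token` is a hypothetical helper, and the session IDs are made-up strings.

```ruby
require "digest"

# Hypothetical helper: derives an opaque token from a session ID.
# The client only ever sees the digest, never the session ID itself.
def session_token(session_id)
  Digest::SHA256.hexdigest(session_id.to_s)
end

before_login = session_token("session-id-before-login")
after_login  = session_token("session-id-after-login")

# The token is stable for one session and changes when the session ID changes,
# which is all the client needs in order to detect a session-change event.
session_changed = before_login != after_login
```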

There are various use cases here for things that the client side might want to do whenever a session-change event is detected, for example immediately re-opening an active websocket (because they're long-lived but are tied to the session of the original request that opened them). Depending on what kind of other session-centric data is being held in JS, there could be various other actions that might be taken (cleanup/resetting etc). I'm not sure I've explained this well enough, but hopefully you get the gist.

TLDR

The examples I've given here are pretty Rails-centric and Turbo-centric (assuming page navigations and form submissions are happening asynchronously without a full-page reload reinitialising all JavaScript), but there are probably quite a few other uses for this mechanism.

Essentially it provides generically useful ways to modify the headers-handling behaviour of Rack::Cache and opens up the possibility of allowing unique request-specific bits of data to be transmitted from the server to the client (and JS code in particular) even when the responses are fully cached, and the various ways of taking advantage of that are limited only by the dev's imagination. ❤️

What do you think?

@ioquatix
Member

It seems reasonable to me.

I need to review the code in more detail.

@Matt-Yorkley
Contributor Author

Matt-Yorkley commented Aug 21, 2023

There's a test failing with an uninitialised constant error for MiniTest, in relation to this line:

mock = MiniTest::Mock.new

Seems like that constant should have a lowercase t for the Minitest namespace? It's defined here: https://github.com/minitest/minitest/blob/ed88d196bc5dde30d48026ef7b338997b640e799/lib/minitest/mock.rb#L3-L10

It passes with this change:

-- mock = MiniTest::Mock.new 
++ mock = Minitest::Mock.new 

It's weird that it doesn't seem to have failed in previous test suite runs though, right...?

Edit: interestingly, I get this error when I run the tests on main as well, so it seems unrelated to these changes?

@ioquatix
Member

I believe it's due to a change in Minitest. Do you mind submitting a separate PR to fix it?

Headers specified in the :transfer_headers option are passed (transferred) from responses from the backend through to responses sent to the client (including cached entries).
Headers specified in the :non_cached_headers option are included in responses but not stored in cached entities.
@ioquatix
Member

ioquatix commented Dec 9, 2023

@Matt-Yorkley are you happy if we give this one final review before merging?

2 participants