Initial commit
We explain the purpose of this repository, the history behind it, where
it's going & how others can help. We also captured all contributors to
date.

We also captured the Varnish + Fly.io config as it existed for:
- Let's build a CDN - Part 1 - https://www.youtube.com/watch?v=8bDgWvyglno
- Kaizen 15 - NOT a pipe dream - https://changelog.com/friends/50

There is a lot more context here: thechangelog/changelog.com#518

The most interesting part is the `run` script. To run it, you will need to:
1. Have a Fly.io account
2. Have a back-end app deployed 💡 https://fly.io/speedrun/
3. Change the name of the backend app (i.e. `changelog-2024-01-12`)
4. Launch the app in this repository

Signed-off-by: Gerhard Lazu <[email protected]>
gerhard committed Aug 16, 2024
0 parents commit 0224c72
Showing 7 changed files with 585 additions and 0 deletions.
4 changes: 4 additions & 0 deletions Dockerfile
@@ -0,0 +1,4 @@
# https://hub.docker.com/_/varnish
FROM varnish:7.4.3
ENV VARNISH_HTTP_PORT=9000
COPY default.vcl /etc/varnish/default.vcl
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
Content and Design Copyright (c) Changelog Media LLC. All rights reserved.

Code Copyright (c) Changelog Media LLC and licensed under the following conditions:

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
41 changes: 41 additions & 0 deletions README.md
@@ -0,0 +1,41 @@
# The Pipe Dream™️

A single-purpose, single-tenant CDN for [changelog.com](https://changelog.com).
Runs [Varnish Cache](https://varnish-cache.org/releases/index.html) (open
source) on [Fly.io](https://fly.io/changelog).

This repository exists for a single reason: to build the simplest CDN on Fly.io.

## How it started

> I like the idea of having like this 20-line Varnish config that we deploy
> around the world, and it’s like “Look at our CDN, guys.”
>
> It’s so simple, and it can do exactly what we want it to do, and nothing
> more.
>
> But I understand that that’s a <strong>pipe dream</strong>, because that
> Varnish config will be slightly longer than 20 lines, and we’d run into all
> sorts of issues that we end up sinking all kinds of time into.
>
> Jerod Santo - March 29, 2024 - <a href="https://changelog.com/friends/38#transcript-208" target="_blank">Changelog & Friends #38</a>

## How is it going

- [x] Static backend, 1 day stale, stale on error, x-headers - `46` lines of VCL
- [ ] Dynamic backend, cache-status header - `55` lines of VCL
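
The first item above (1 day stale, stale on error) can be pictured as a small state model. The sketch below is only an illustration in Python and is not part of this repository: the 60-second TTL and 24-hour grace mirror `default.vcl`, while the function names `classify` and `on_backend_response` are hypothetical. It assumes `beresp.keep` is left at zero and ignores the built-in VCL rules that run afterwards.

```python
from datetime import timedelta

# Values mirrored from default.vcl; everything else here is illustrative.
TTL = timedelta(seconds=60)
GRACE = timedelta(hours=24)


def classify(age: timedelta) -> str:
    """Classify a cached object by its age, assuming beresp.keep is 0."""
    if age < TTL:
        return "fresh"    # served straight from cache
    if age < TTL + GRACE:
        return "stale"    # served from cache while a background fetch runs
    return "expired"      # no longer eligible; the backend must answer


def on_backend_response(status: int, is_bgfetch: bool) -> str:
    """Mirror the 5xx branch of vcl_backend_response (simplified)."""
    if status >= 500:
        # A failed background refresh is abandoned so the stale copy survives.
        return "abandon" if is_bgfetch else "deliver uncacheable"
    return "cache"
```

For example, `classify(timedelta(hours=1))` returns `"stale"`, and `on_backend_response(503, True)` returns `"abandon"`, which is the stale-on-error behaviour described above.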

## How can you help

If you have any ideas on how to improve this, please open an issue or go
straight for a pull request. We make this as easy as possible:
- All commits emphasize [good commit messages](https://cbea.ms/git-commit/) (more text for humans than code for machines)
- This repository is kept small & simple (the only purpose is to build the simplest CDN on Fly.io)
- We are taking a slow & thoughtful approach - join our journey via [audio with transcripts](https://changelog.com/topic/kaizen) or [written](https://github.com/thechangelog/changelog.com/discussions/categories/kaizen)

Hope to see you in our Slack: <https://changelog.slack.com> 👋

## Contributors

- [James A Rosen](https://www.jamesarosen.com/), Staff Engineer
- [Matt Johnson](https://github.com/mttjohnson), Sr Site Reliability Engineer
117 changes: 117 additions & 0 deletions default.vcl
@@ -0,0 +1,117 @@
# https://varnish-cache.org/docs/7.4/reference/vcl.html#versioning
vcl 4.1;

import std;

# Thanks Matt Johnson! 👋
# - https://github.com/magento/magento2/blob/03621bbcd75cbac4ffa8266a51aa2606980f4830/app/code/Magento/PageCache/etc/varnish6.vcl
# - https://abhishekjakhotiya.medium.com/magento-internals-cache-purging-and-cache-tags-bf7772e60797

backend default {
  .host = "top1.nearest.of.changelog-2024-01-12.internal";
  .host_header = "changelog-2024-01-12.fly.dev";
  .port = "4000";
  .first_byte_timeout = 5s;
  .probe = {
    .url = "/health";
    .timeout = 2s;
    .interval = 5s;
    .window = 10;
    .threshold = 5;
  }
}

# https://varnish-cache.org/docs/7.4/users-guide/vcl-grace.html
# https://docs.varnish-software.com/tutorials/object-lifetime/
# https://www.varnish-software.com/developers/tutorials/http-caching-basics/
# https://blog.markvincze.com/how-to-gracefully-fall-back-to-cache-on-5xx-responses-with-varnish/
sub vcl_backend_response {
  # Objects within ttl are considered fresh.
  set beresp.ttl = 60s;

  # Objects within grace are considered stale.
  # Serve stale content while refreshing in the background.
  # 🤔 QUESTION: should we vary this based on backend health?
  set beresp.grace = 24h;

  if (beresp.status >= 500) {
    # Don't cache a 5xx response
    set beresp.uncacheable = true;

    # If is_bgfetch is true, it means that we've found and returned the cached
    # object to the client, and triggered an asynchronous background update. In
    # that case, since the backend returned a 5xx, we have to abandon, otherwise
    # the previously cached object would be erased from the cache (even if we
    # set uncacheable to true).
    if (bereq.is_bgfetch) {
      return (abandon);
    }
  }

  # 🤔 QUESTION: Should we configure beresp.keep?
}

# NOTE: vcl_recv is called at the beginning of a request, after the complete
# request has been received and parsed. Its purpose is to decide whether or not
# to serve the request, how to do it, and, if applicable, which backend to use.
sub vcl_recv {
  # https://varnish-cache.org/docs/7.4/users-guide/purging.html
  if (req.method == "PURGE") {
    return (purge);
  }

  # Implement a Varnish health-check
  if (req.method == "GET" && req.url == "/varnish_status") {
    return (synth(204));
  }
}

# https://gist.github.com/leotsem/1246511/824cb9027a0a65d717c83e678850021dad84688d#file-default-vcl-pl
# https://varnish-cache.org/docs/7.4/reference/vcl-var.html#obj
sub vcl_deliver {
  # What is the remaining TTL for this object?
  set resp.http.x-ttl = obj.ttl;
  # What is the max object staleness permitted?
  set resp.http.x-grace = obj.grace;

  # Did the response come from Varnish or from the backend?
  if (obj.hits > 0) {
    set resp.http.x-cache = "HIT";
  } else {
    set resp.http.x-cache = "MISS";
  }

  # Is this object stale?
  if (obj.ttl < std.duration(integer=0)) {
    set resp.http.x-cache = "STALE";
  }

  # How many times has this response been served from Varnish?
  set resp.http.x-cache-hits = obj.hits;
}

# TODOS:
# - ✅ Run in debug mode (locally)
# - ✅ Connect directly to app - not Fly.io Proxy 🤦
# - ✅ Serve stale content + background refresh
# - QUESTION: Should the app control this via Surrogate-Control? Should we remove this header?
# - EXPLORE: varnishstat
# - EXPLORE: varnishtop
# - EXPLORE: varnishncsa -c -F '%m %U %H %{x-cache}o %{x-cache-hits}o'
# - ✅ Serve stale content on backend error
# - https://varnish-cache.org/docs/7.4/users-guide/vcl-grace.html#misbehaving-servers
# - If the backend gets restarted (e.g. new deploy), backend remains sick in Varnish
# - https://info.varnish-software.com/blog/two-minute-tech-tuesdays-backend-health
# - EXPLORE: varnishlog -g raw -i backend_health
# - Implement If-Modified-Since? keep
# - Expose FLY_REGION=sjc env var as a custom header
# - https://varnish-cache.org/lists/pipermail/varnish-misc/2019-September/026656.html
# - Add Feeds backend: /feed -> https://feeds.changelog.place/feed.xml
# - Store cache on disk? A pre-requisite for static backend
# - https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#file
# - Add Static backend: cdn.changelog.com requests
#
# FOLLOW-UPs:
# - Run varnishncsa as a separate process (will need a supervisor + log drain)
# - https://info.varnish-software.com/blog/varnish-and-json-logging
# - How to cache purge across all varnish instances?
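
As a reading aid for `vcl_deliver` in the file above, the `x-cache` decision can be restated as a tiny Python function. This is a hypothetical sketch, not code from the repository: the name `x_cache` and its parameters are assumptions, and `ttl_remaining` stands in for `obj.ttl` expressed in seconds.

```python
def x_cache(hits: int, ttl_remaining: float) -> str:
    """Return the x-cache value that vcl_deliver in default.vcl would set."""
    # First decision: did Varnish already hold this object?
    status = "HIT" if hits > 0 else "MISS"
    # Second decision overrides the first: a negative remaining TTL means
    # the object is being served from grace, so it is reported as STALE.
    if ttl_remaining < 0:
        status = "STALE"
    return status
```

The order matters: because the staleness check runs last, an object with previous hits and a negative remaining TTL is reported as `STALE` rather than `HIT`.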
43 changes: 43 additions & 0 deletions fly.toml
@@ -0,0 +1,43 @@
# Full app config reference: https://fly.io/docs/reference/configuration/
app = "cdn-2024-01-26"
# Closest to James
primary_region = "sjc"
# Secondary region will be "lhr", closest to Gerhard

kill_signal = "SIGTERM"
kill_timeout = 30

[env]
VARNISH_SIZE="500M"

[[vm]]
size = "shared-cpu-1x"
memory = "256MB"

[deploy]
strategy = "bluegreen"

[[services]]
internal_port = 9000
protocol = "tcp"

[[services.http_checks]]
grace_period = "5s"
interval = "5s"
method = "get"
path = "/varnish_status"
protocol = "http"
timeout = "4s"

[[services.ports]]
handlers = ["tls", "http"]
port = 443

[[services.ports]]
handlers = ["http"]
port = 80

[services.concurrency]
hard_limit = 2500
soft_limit = 2000
type = "connections"
36 changes: 36 additions & 0 deletions regions.txt
@@ -0,0 +1,36 @@
NAME CODE GATEWAY LAUNCH PLAN + ONLY GPUS
Amsterdam, Netherlands ams ✓ ✓
Ashburn, Virginia (US) iad ✓ ✓
Atlanta, Georgia (US) atl
Bogotá, Colombia bog
Boston, Massachusetts (US) bos
Bucharest, Romania otp
Chicago, Illinois (US) ord ✓
Dallas, Texas (US) dfw ✓
Denver, Colorado (US) den
Ezeiza, Argentina eze
Frankfurt, Germany fra ✓ ✓
Guadalajara, Mexico gdl
Hong Kong, Hong Kong hkg ✓
Johannesburg, South Africa jnb
London, United Kingdom lhr ✓
Los Angeles, California (US) lax ✓
Madrid, Spain mad
Miami, Florida (US) mia
Montreal, Canada yul
Mumbai, India bom ✓
Paris, France cdg ✓
Phoenix, Arizona (US) phx
Querétaro, Mexico qro ✓
Rio de Janeiro, Brazil gig
San Jose, California (US) sjc ✓ ✓
Santiago, Chile scl ✓
Sao Paulo, Brazil gru
Seattle, Washington (US) sea ✓
Secaucus, NJ (US) ewr ✓
Singapore, Singapore sin ✓
Stockholm, Sweden arn
Sydney, Australia syd ✓ ✓
Tokyo, Japan nrt ✓
Toronto, Canada yyz ✓
Warsaw, Poland waw