A deep look at how Fastly Varnish works internally


<aside> 👉 Heads up: The following content was written by Mark McDonnell, and is mirrored in this wiki for the sake of completeness and long-term archival. It has been lightly touched up to fit the format of Notion, but is otherwise unmodified.

Read the original post by following this link!

All content © 2017-2019, Mark McDonnell.

</aside>

In this post I’m going to be explaining how the Fastly CDN works, with regards to their ‘programmatic edge’ feature (i.e. the ability to execute code on cache servers nearest to your users).

Fastly builds on free software and extends it to fit their purposes, but this extension of existing software can make it confusing to understand which underlying features are available and how they actually behave.

Introduction

Varnish is an open-source HTTP accelerator. More concretely, it is a web application that acts as an HTTP reverse proxy.

You place Varnish in front of your application servers (those that are serving HTTP content) and it will cache that content for you. If you want more information on what Varnish cache can do for you, then I recommend reading through their introduction article (and watching the video linked there as well).
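As a minimal sketch of that setup, open-source Varnish is typically configured with a backend definition pointing at your application server (the host and port below are illustrative assumptions, not anything specific to this post):

```
# Minimal open-source Varnish VCL sketch.
# Varnish accepts plain HTTP requests and, on a cache miss,
# fetches the content from this backend and caches the response.
backend default {
    .host = "127.0.0.1";   # hypothetical application server
    .port = "8080";        # serving plain HTTP
}
```

With just this in place, Varnish will serve cacheable responses from memory and only forward cache misses to the origin.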

Fastly is many things, but for most people they are a CDN provider who utilise a highly customised version of Varnish. This post is about Varnish and explaining a couple of specific features (such as hit-for-pass and serving stale) and how they work in relation to Fastly’s implementation of Varnish.

One stumbling block for Varnish is the fact that it only accelerates HTTP, not HTTPS. In order to handle HTTPS you would need a TLS/SSL termination process sitting in front of Varnish to convert HTTPS to HTTP. Alternatively you could use a termination process (such as nginx) behind Varnish to fetch the content from your origins over HTTPS and to return it as HTTP for Varnish to then process and cache.
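To make the first option concrete, here is a hedged sketch of what a TLS terminator in front of Varnish might look like using nginx (the certificate paths and ports are placeholders, not a recommended production configuration):

```
# Hypothetical nginx TLS termination in front of Varnish:
# clients speak HTTPS to nginx; nginx speaks plain HTTP to Varnish.
server {
    listen 443 ssl;
    ssl_certificate     /etc/ssl/example.crt;  # illustrative paths
    ssl_certificate_key /etc/ssl/example.key;

    location / {
        proxy_pass http://127.0.0.1:80;        # Varnish listening on HTTP
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```

The `X-Forwarded-Proto` header lets the application behind Varnish know the original request arrived over HTTPS, even though Varnish itself only ever sees HTTP.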

Note: Fastly helps both with the HTTPS problem, and also with scaling Varnish in general.

The reason for this post is that, when dealing with Varnish and VCL, it gets very confusing having to jump between the official documentation for VCL and Fastly’s specific implementation of it. Even more so because the version of Varnish Fastly uses is now quite old, and yet they have also implemented some features from more recent Varnish versions. This means you can end up in a muddle about what should and should not be the expected behaviour (especially around the general request flow cycle).

Ultimately this is not a “VCL 101”. If you need help understanding anything mentioned in this post, then I recommend reading:

Fastly has a couple of excellent articles on utilising the Vary HTTP header (highly recommended reading).

Varnish Basics

Varnish is a ‘state machine’, and it switches between states via calls to a return function (where you tell the return function which state to move to). The various states are: