How to save bandwidth and beat latency
Given the previous information, we can think of POP shielding as a kind of "clustering" but at the inter-POP level, rather than the intra-POP level.
Shielding is vital over long hauls for origin services like S3, which keep all objects in one central location and impose general rate limits on request "burstiness" and throughput. A shield is a POP that is solely responsible for contacting the origin in the event of a miss, much as a cluster node (CNode) is within a single POP. All other POPs route to the shield POP instead and let it handle origin communication. Both the edge POP and the shield POP still do intra-POP clustering and caching, as described before.
Shielding adds a "two level" caching structure, and ideally the shield POP is located close to the actual origin server to reduce latency. A request that misses at a normal POP may still hit at the shield POP, which lets you skip the origin entirely in many cases at a global level. Two users in two different parts of the world can talk to two different POPs and still get a hit from the shield POP if they ask for the same object. Without shielding, each of those two POPs would have to make its own origin request on a miss.
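The two-level structure can be illustrated with a toy model. This is only a sketch: the `Origin` and `Pop` classes, the POP names, and the `/logo.png` key are all made up for illustration, and real POPs also do intra-POP clustering, TTLs, and eviction, none of which is modelled here.

```python
class Origin:
    """Stand-in for the origin (e.g. an S3 bucket); counts how often it is contacted."""

    def __init__(self):
        self.requests = 0

    def fetch(self, key):
        self.requests += 1
        return f"body-of-{key}"


class Pop:
    """Hypothetical POP with one local cache; on a miss it asks its parent."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent  # either a shield Pop or the Origin
        self.cache = {}

    def fetch(self, key):
        if key in self.cache:
            return self.cache[key]        # hit: the parent is never contacted
        body = self.parent.fetch(key)     # miss: go one level up
        self.cache[key] = body
        return body


def origin_requests(shielded):
    """Two edge POPs each request the same object once; return origin request count."""
    origin = Origin()
    parent = Pop("shield", origin) if shielded else origin
    edges = [Pop("edge-a", parent), Pop("edge-b", parent)]
    for edge in edges:
        edge.fetch("/logo.png")
    return origin.requests


print("without shielding:", origin_requests(shielded=False))  # 2
print("with shielding:   ", origin_requests(shielded=True))   # 1
```

In this model the second edge POP still misses locally, but its miss is absorbed by the shield POP, so the origin sees one request instead of two.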
Below is a diagram that tries to succinctly outline how POP shielding works in our setup. It also gives a high-level view of the inside of each POP, and how nodes in a POP cooperate.
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌────────────────────────────────────────┐
│Client│ │Client│ │Client│ │Client│ │Client│ │LEGEND │
└──────┘ └──────┘ └──────┘ └──────┘ └──────┘ │ │
▲ ▲ ▲ ▲ ▲ │ "Clustering": Cooperation of an ENode │
│ │ │ │ │ │ and a CNode inside a POP, to handle │
└──┐ └────┬───┘ ┌─┘ │ │ a request with only 2 hops. │
┌──────┼──────────┼──────────┼──────────┼──────┐ │ │
│ ▼ ▼ ▼ ▼ │ │ All nodes in a POP can be either an │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ ENode or a CNode for many objects; │
│ │ ENode │ │ ENode │ │ ENode │ │ ENode │ │ │ roles are not pre-assigned. │
│ └────────┘ └────────┘ └────────┘ └────────┘ │ │ │
│ ▲ ▲ │ │ "Shielding": Routing all requests to │
│ ┏ ━ ━ ━ ━ ━ │ │ a single POP, which is the only POP │
│ ┏ ━ ━ ━ ━ ━ ━ ━ ━ ━ ━ ┛ │ │ responsible for origin requests. │
│ │ │ Acts like "clustering" at a global │
│ ▼ │ │ level. │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ │
│ │ CNode │ │ Node │ │ Node │ │ Node │ │ │ │
│ └────────┘ └────────┘ └────────┘ └────────┘ │ │ ENode: "Edge Node" │
│ ▲ Edge POP │ │ - Uses MEMORY cache │
└──────╬───────────────────────────────────────┘ │ - Random client pick │
║ │ - Clusters with CNode, based on │
╚═══POP-to-POP link═══╗ │ object hash │
║ │ - One object exists in many memory │
┌────────────────────────────╬─────────────────┐ │ caches on several ENodes │
│ ▼ │ │ │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ │
│ │ Node │ │ Node │ │ ENode │ │ Node │ │ │ CNode: "Cache Node" │
│ └────────┘ └────────┘ └────────┘ └────────┘ │ │ - Uses DISK cache │
│ ▲ │ │ - All ENodes map a single object │
│ │ │ to the same CNode │
│ ┗ ━ Hop ━ ━ │ │ - One object exists on a single │
│ ┃ │ │ disk cache in a single CNode │
│ ▼ │ │ │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ │
│ │ Node │ │ Node │ │ Node │ │ CNode │ │ │ POP-to-POP link: inter-POP routes, │
│ └────────┘ └────────┘ └────────┘ └────────┘ │ │ often pre-established connections │
│ Shield POP ▲ │ │ (avoids common 3way TCP handshake on │
└───────────────────────────────────────┼──────┘ │ long haul) over open internet link │
┌──Origin Hop───┘ │ │
│ │ │
▼ │ │
┌──────────────────────────────────────────────┐ │ │
│ Origin │ │ │
└──────────────────────────────────────────────┘ └────────────────────────────────────────┘
With shielding, we're able to soak up a substantially larger number of requests through the CDN, because they will all be routed through one shield POP, which is responsible for globally caching everything.
<aside> 💡 Insight: POP shielding is not an extra feature added onto Fastly's version of Varnish — it is built completely in VCL itself, and is conceptually just like any double reverse-proxy setup with any caching server!
Because POP shielding is built purely "on top of" VCL, it interferes with hit ratio metrics: the hit ratio is calculated over all hits and misses for all POPs, not just the shield POP. The shield POP could have an effective hit ratio of 100% (i.e. the origin S3 bucket never gets talked to), but every other POP could have an effective hit ratio of 10% (i.e. 9 in 10 requests go on to the shield POP), skewing the global average.
Because POP shielding is implemented in VCL, it can also have non-trivial interactions with the caching system and your own VCL.
</aside>
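The hit-ratio skew described in the aside can be made concrete with a little arithmetic. The traffic volumes below (nine edge POPs with 1000 requests each) are invented for illustration, and the aggregation is simply "all hits over all hits plus misses" as described above, not necessarily the exact formula used by any dashboard.

```python
def hit_ratio(hits, misses):
    """Fraction of cache lookups that were hits."""
    return hits / (hits + misses)


# Hypothetical traffic: 9 edge POPs, 1000 client requests each, 10% edge hit ratio.
edge_requests = 9 * 1000
edge_hits = edge_requests // 10
edge_misses = edge_requests - edge_hits

# Every edge miss becomes a lookup at the shield POP, which here always hits.
shield_hits = edge_misses
shield_misses = 0

# The global ratio counts edge and shield lookups together.
global_ratio = hit_ratio(edge_hits + shield_hits, edge_misses + shield_misses)

# Share of client requests that never reached the origin.
origin_offload = 1 - shield_misses / edge_requests

print(f"global hit ratio: {global_ratio:.1%}")   # 52.6%
print(f"origin offload:   {origin_offload:.1%}") # 100.0%
```

Under these assumptions the origin is never contacted, yet the combined hit ratio reads as roughly 53%, because each edge miss is counted as a miss even though the shield POP then serves it.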
Another useful aspect of shielding is outlined by the (somewhat mangled) diagram below. Intuitively, shielding lets us use "pre-warmed" connections between POPs, avoiding a high-latency 3-way TCP handshake over the long haul on the client's behalf. This effectively reduces the overall TTFB for clients, especially as the distance between the origin and the "edge" POP grows.
In the diagram below, we assume the user connects to the SIN datacenter in Singapore, and the origin is in the US, for example near Washington, AKA IAD. The flow on the top shows what happens without shielding: a short client-to-POP connection followed by an expensive TCP connection to the origin. The flow on the bottom shows how introducing a POP in the middle allows us to skip a handshake by using pre-established flows.
This reduces the overall client latency from 15 + 600 = 615ms to 15 + 200 + 5 = 220ms, which is roughly a third of the original (a reduction of about 64%).
┌───────────────┐ ┌────────┐
┌──────┐ 15ms │Singapore (SIN)│ 600ms │ │
│Client│◀───────▶│ POP │◀━━━━━━━━(3x 200ms ━━┓ │ │
└──────┘ │ │ for TCP) ┃ │ │
└───────────────┘ ┃ │ │
┃ │ │
┗━━━━━━━━━━━━━━━━━━━━━▶│ │
│ Origin │
┌─────▶│ │
│ │ │
┌───────────────┐ ┌────────────────┐ 5ms │ │
┌──────┐ 15ms │Singapore (SIN)│ │Washington (IAD)│ │ │ │
│Client│◀───────▶│ POP │◀═══200ms══▶│ POP │◀─────┘ │ │
└──────┘ │ │ │ │ │ │
└───────────────┘ └────────────────┘ └────────┘
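The latency figures in the diagram can be checked with a few lines of arithmetic. The constant names are made up for this sketch, and the millisecond values are the illustrative ones from the diagram, not measurements.

```python
# Illustrative figures taken from the diagram (milliseconds).
CLIENT_TO_EDGE_MS = 15      # client <-> SIN POP
LONG_HAUL_MS = 200          # one traversal of the SIN <-> US link
COLD_TCP_TRAVERSALS = 3     # the diagram charges 3x 200ms for a cold TCP connection
SHIELD_TO_ORIGIN_MS = 5     # IAD shield POP <-> nearby origin

# Without shielding: the edge POP opens a fresh connection across the long haul.
unshielded_ms = CLIENT_TO_EDGE_MS + COLD_TCP_TRAVERSALS * LONG_HAUL_MS

# With shielding: one traversal over a pre-established POP-to-POP link,
# then a short hop from the shield POP to the origin.
shielded_ms = CLIENT_TO_EDGE_MS + LONG_HAUL_MS + SHIELD_TO_ORIGIN_MS

reduction = 1 - shielded_ms / unshielded_ms

print(unshielded_ms, shielded_ms)   # 615 220
print(f"{reduction:.0%} lower")     # 64% lower
```

The saving comes entirely from the long-haul term: the 200ms cost is paid once instead of three times, so the benefit grows as the edge POP gets farther from the origin.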