Server-side content caching

What is content caching

A static cache layer in front of dynamically generated server-side pages, such as the catalogue or product pages of an e-commerce site, is a common pattern.

This architecture provides better performance for the client and decreases the load on the server.

How it works

The usual way to implement server-side caching is to place a piece of software in front of the application server. It saves server-rendered pages in its own cache, using a specific caching mechanism (for example, a hashed cache key and an expiration time).
The cache is usually common to all visitors and the caching flow follows these steps:

The first visitor to arrive on the website reaches the cache server but misses the cache.
The cache server then calls the actual web server to get the content, which triggers a rendering of the page.
The cache server stores the page in its cache (in memory or in a file) and returns it to the visitor.
Subsequent visitors arrive on the website and reach the cache server.
The cache server finds the page in its cache (a cache hit) and returns it directly, without calling the web server.
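
The flow above can be sketched in a few lines of Python. This is only a minimal illustration, not a production cache: the PageCache class, the render_page function and the 60-second expiration are all hypothetical names and values chosen for the example, and the render function simply stands in for the application server.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Toy in-memory page cache placed in front of a render function."""

    def __init__(self, render: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render    # stands in for the application server
        self._ttl = ttl_seconds  # how long a stored page stays valid
        self._clock = clock      # injectable so expiry can be simulated
        self._store: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, page)
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            # Cache hit: serve the stored page without calling the backend.
            self.hits += 1
            return entry[1]
        # Cache miss (or expired entry): render, store, then return.
        self.misses += 1
        page = self._render(url)
        self._store[url] = (now + self._ttl, page)
        return page


render_calls = []


def render_page(url: str) -> str:
    """Hypothetical application server: records each rendering it performs."""
    render_calls.append(url)
    return f"<html>page for {url}</html>"


cache = PageCache(render_page, ttl_seconds=60.0)
first = cache.get("/products/42")   # first visitor: miss, backend renders
second = cache.get("/products/42")  # later visitor: hit, backend not called
```

Both visitors receive the same page, but render_page runs only once. Note that the cache key here is the URL alone, so the cached page is shared by every visitor.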

An example architecture is displayed here, with a cache system on top of an application server backend:

In this setup, far fewer calls reach the application server than the cache server, which improves performance and reduces load.

Feature management & experimentation with content caching

Although content caching greatly increases performance and reduces load on the infrastructure, it has an immediate impact on the experimentation & personalization possibilities of a web application.
Most of the visitors will get the content from the cache server, and thus never reach the application server.

Traditional feature API calls or SDK implementations in the application server will not work with a caching mechanism, because every visitor would see the page that was personalized for the first visitor who requested the resource.
Therefore, any personalization logic, such as feature flagging or experimentation, must run either on the cache server or on a lightweight server dedicated to the Flagship API that sends its response to the cache server rather than to the application server. The cache server is the only component that every visitor reaches.
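
One possible way to apply this, sketched below under stated assumptions, is to take the decision at the cache layer and include the assigned variation in the cache key, so that each variation is rendered and cached once. Everything named here is hypothetical: assign_variation is a stand-in that buckets visitors by hashing their ID and is not the Flagship API, and VariationAwareCache and render_page are illustrative names only.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def assign_variation(visitor_id: str) -> str:
    """Hypothetical decision call: deterministically buckets a visitor.

    A real setup would query the feature management service instead.
    """
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return "variation" if digest[0] % 2 else "original"


class VariationAwareCache:
    """Toy cache whose key combines the URL and the visitor's variation."""

    def __init__(self, render: Callable[[str, str], str],
                 decide: Callable[[str], str]) -> None:
        self._render = render  # stands in for the application server
        self._decide = decide  # decision taken at the cache layer
        self._store: Dict[Tuple[str, str], str] = {}

    def get(self, url: str, visitor_id: str) -> str:
        # The decision runs for every visitor, even on a cache hit.
        variation = self._decide(visitor_id)
        key = (url, variation)
        if key not in self._store:
            # Miss for this (url, variation) pair: render it once and store it.
            self._store[key] = self._render(url, variation)
        return self._store[key]


rendered: List[Tuple[str, str]] = []


def render_page(url: str, variation: str) -> str:
    """Hypothetical application server rendering one variation of a page."""
    rendered.append((url, variation))
    return f"<html>{url} [{variation}]</html>"


cache = VariationAwareCache(render_page, assign_variation)
for visitor in ("visitor-1", "visitor-2", "visitor-3"):
    cache.get("/products/42", visitor)
```

With this design the application server renders each variation of a page at most once, while every visitor still goes through the decision step. The trade-off is that the number of cached entries grows with the number of variations per page.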