Caching complete pages in Sitecore XM

This blog post showcases a proof-of-concept I did yesterday. The code is only a sketch, but it shows how we can get access to the final rendered Sitecore page via a response filter and cache the result.

While Sitecore's HTML cache works on each rendering, the idea behind the proof-of-concept was to cache a complete rendered page in Sitecore for 5 minutes, reusing the same output again and again. As the implementation hooks into the mvc.requestBegin pipeline, this will only work for pages based on MVC layouts, but the idea should be pretty generic.

The core challenge is that the HttpResponse is a stream-like object that is written to in a piecemeal fashion and flushed to the client either when its buffer is full or when explicitly flushed.

To get access to the complete response as a string, we need to inject a filter stream (in my implementation called a PageCacheFilterStream). The filter (being a stream) exposes Write, Flush and Close methods, which are called when the HttpResponse is written to, flushed or closed.

There are many ways to make this work while making sure that everything is called in the correct order and only once, but in essence the filter needs to do its “work” (e.g. update the cache) at some point and make sure that the filtered content is also written to the HttpResponse at some point.

My PageCacheFilterStream simply keeps the response in memory and does its “work” when the filter stream is closed. This means that I turn off the piecemeal updating of the HttpResponse when the filter is applied, which makes everything a lot simpler. If you have multiple filters added to the HttpResponse, make sure that this logic plays well with the other filters.

To start out, let us look at the PageCacheFilterStream implementation. As you can see, I use a MemoryStream which is updated when content is written to the HttpResponse. When the stream is closed, I convert the content into a string and execute the onClose action on it. Finally, I write the content to the responseStream (the HttpResponse):

namespace sc10xm.Foundation.PageCache.Models
{
    using System;
    using System.IO;
    using System.Text;

    public class PageCacheFilterStream : MemoryStream
    {
        private readonly Stream responseStream;
        private readonly Encoding encoding;
        private readonly Action<string> onClose;

        public PageCacheFilterStream(Stream responseStream, Encoding encoding, Action<string> onClose) : base(5000)
        {
            this.responseStream = responseStream;
            this.encoding = encoding;
            this.onClose = onClose;
            IsClosing = false;
        }

        public bool IsClosing { get; set; }

        public override void Close()
        {
            // Guard against a second Close() call, which would cache and write the content twice
            if (IsClosing)
                return;

            IsClosing = true;
            var buffer = ToArray();

            // Execute the onClose on the complete content
            onClose(encoding.GetString(buffer));

            // Write the complete content to the response
            responseStream.Write(buffer, 0, buffer.Length);
        }

        public override void Flush()
        {
            // Intentionally a no-op: the content is held back until Close()
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            if (!IsClosing)
                base.Write(buffer, offset, count);
        }
    }
}

A twist is that I need to set IsClosing before writing to the HttpResponse, because I sometimes see the filter's Write method being triggered (again) by that write, which adds the same content twice.
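To see the buffering behaviour in isolation, here is a minimal console sketch that runs without ASP.NET or Sitecore. BufferingFilterDemo is a hypothetical, stripped-down stand-in for PageCacheFilterStream, and a plain MemoryStream plays the role of the response stream; both are assumptions made for illustration only.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for PageCacheFilterStream, reduced to the buffering logic
public class BufferingFilterDemo : MemoryStream
{
    private readonly Stream inner;
    private readonly Encoding encoding;
    private readonly Action<string> onClose;
    private bool isClosing;

    public BufferingFilterDemo(Stream inner, Encoding encoding, Action<string> onClose)
    {
        this.inner = inner;
        this.encoding = encoding;
        this.onClose = onClose;
    }

    // Collect everything in memory until the stream is closed
    public override void Write(byte[] buffer, int offset, int count)
    {
        if (!isClosing)
            base.Write(buffer, offset, count);
    }

    // Nothing is passed on before Close()
    public override void Flush() { }

    public override void Close()
    {
        if (isClosing) return;
        isClosing = true;

        var bytes = ToArray();
        onClose(encoding.GetString(bytes));   // e.g. update a cache
        inner.Write(bytes, 0, bytes.Length);  // forward the complete content once
    }
}

public static class Program
{
    public static void Main()
    {
        var response = new MemoryStream();    // plays the role of the response stream
        var captured = "";
        var filter = new BufferingFilterDemo(response, Encoding.UTF8, html => captured = html);

        var part1 = Encoding.UTF8.GetBytes("<html>");
        var part2 = Encoding.UTF8.GetBytes("</html>");

        filter.Write(part1, 0, part1.Length);
        filter.Flush();
        Console.WriteLine(response.Length);   // 0: nothing forwarded yet

        filter.Write(part2, 0, part2.Length);
        filter.Close();

        Console.WriteLine(captured);                                     // <html></html>
        Console.WriteLine(Encoding.UTF8.GetString(response.ToArray()));  // <html></html>
    }
}
```

The response stays empty until Close() is called, at which point the callback and the response each receive the complete content exactly once.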

With this filter, I am able to grab the complete rendered page and store it in the Sitecore HTML cache. This is handled in a RequestBeginProcessor added to the top of the mvc.requestBegin pipeline:

namespace sc10xm.Foundation.PageCache.Pipelines.MvcRequestBegin
{
    using sc10xm.Foundation.PageCache.Models;
    using Sitecore.Caching;
    using Sitecore.Mvc.Pipelines.Request.RequestBegin;
    using System;
    using System.Web;

    public class PageCacheProcessor : RequestBeginProcessor
    {
        public override void Process(RequestBeginArgs args)
        {
            if (!Sitecore.Context.PageMode.IsNormal)
                return;

            var item = Sitecore.Context.Item;
            var siteContext = Sitecore.Context.Site;
            var request = HttpContext.Current.Request;
            var response = HttpContext.Current.Response;

            if (item == null || siteContext == null || request == null || response == null)
                return;

            var htmlCache = CacheManager.GetHtmlCache(siteContext);

            if (htmlCache == null)
                return;

            var cacheKey = string.Join(":", nameof(PageCacheProcessor), item.ID, item.Language, request.Url);

            var cachedHtml = htmlCache.GetHtml(cacheKey);

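            if (!string.IsNullOrEmpty(cachedHtml))
            {
                // Abort first: Response.End() typically throws a ThreadAbortException,
                // so statements placed after it would not run
                args.AbortPipeline();
                response.Output.Write(cachedHtml);
                response.End();
            }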
            else
            {
                response.Filter = new PageCacheFilterStream(response.Filter, response.ContentEncoding, 
                    (html) => htmlCache.SetHtml(cacheKey, html, TimeSpan.FromMinutes(5)));
            }
        }
    }
}

As you can see, the processor checks for a cached page. If none is found, it adds a PageCacheFilterStream to the response, together with an action that adds the rendered page to the cache when the stream is closed.
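Because the cache key includes the full request URL, two requests for the same item that differ only in the query string end up as separate cache entries. The sketch below mirrors the string.Join call from the processor in a standalone console program; BuildKey, the placeholder item ID and the example.com URLs are made up for illustration, and the item ID and language are passed as plain strings rather than Sitecore types.

```csharp
using System;

public static class CacheKeyDemo
{
    // Mirrors the string.Join call in PageCacheProcessor (hypothetical helper;
    // itemId and language are plain strings here instead of Sitecore types)
    public static string BuildKey(string itemId, string language, Uri url) =>
        string.Join(":", "PageCacheProcessor", itemId, language, url);

    public static void Main()
    {
        var id = "{00000000-0000-0000-0000-000000000001}"; // placeholder item ID

        var first = BuildKey(id, "en", new Uri("https://example.com/news?page=1"));
        var second = BuildKey(id, "en", new Uri("https://example.com/news?page=2"));

        Console.WriteLine(first);
        Console.WriteLine(first == second); // False: each query string gets its own entry
    }
}
```

This is worth keeping in mind for pages with many query string variations, since each variation occupies its own slot in the HTML cache.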

The only step left is to patch in the processor:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
    <sitecore>
        <pipelines>
            <mvc.requestBegin>
                <processor type="sc10xm.Foundation.PageCache.Pipelines.MvcRequestBegin.PageCacheProcessor, sc10xm" patch:before="processor[1]" />
            </mvc.requestBegin>
        </pipelines>
    </sitecore>
</configuration>

This causes the processor to run on each request to an MVC page, allowing the rendered result to be reused for 5 minutes.