Client side caching

4 October 2009

One thing that gets neglected a lot these days is the optimisation of websites.  I never bothered with it much until fairly recently.

In this post I will cover client side caching.  That is when the page gets stored by the client's browser and is only fetched again when that page actually changes.  This can all be set up in IIS, however we do not always have access to the server or, more often, we have dynamic pages that IIS cannot cache for us.

Note: IIS7 is needed in order to access the Response headers, so this only works with IIS7.

The HTTP 1.1 specification sets out the rules by which servers and clients must abide.  Basically: who can cache content, for how long, and when it was cached.  The client tells the server which version of the content it has cached, and the server then either returns an HTTP status 304 ("Not Modified") or processes the request again and sends the full page.
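To see that exchange from the client's side, here is a minimal sketch that sends the two validators by hand and reports the status the server answers with.  The class name, method name and parameters are my own for illustration; only the HttpWebRequest calls are part of .NET.  Note that HttpWebRequest reports a 304 as a WebException, so the sketch catches that case.

```csharp
using System;
using System.Net;

public static class ConditionalGetClient
{
    // Hypothetical helper: replays what a browser does when it revalidates
    // a cached page, and returns the status code the server sends back.
    public static HttpStatusCode Revalidate(string url, string cachedETag, DateTime cachedLastModified)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        // If-None-Match carries the ETag of the cached copy (quotes included)
        request.Headers[HttpRequestHeader.IfNoneMatch] = cachedETag;
        // If-Modified-Since carries the Last-Modified date of the cached copy
        request.IfModifiedSince = cachedLastModified;

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                // the server sent the full content again
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // HttpWebRequest surfaces a 304 as a protocol error
            HttpWebResponse response = ex.Response as HttpWebResponse;
            if (response != null && response.StatusCode == HttpStatusCode.NotModified)
            {
                return HttpStatusCode.NotModified;
            }
            throw;
        }
    }
}
```

Calling this twice, first with made-up validators and then with the ETag and Last-Modified values from a real response, is a quick way to check that a page answers with 304 only when it should.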

First we will create an ETag, which uniquely identifies each version of a page.  The HTTP 1.1 specification expects double quotes around the ETag.  Also, the SetETag method can only be called once; a second call will cause an exception.  If “cacheability” is set to “private”, the header will not be added and you will have to set it yourself using the HttpResponse’s AppendHeader method.
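As a rough sketch of those three points, the helper below quotes the value, sets the ETag exactly once, and falls back to AppendHeader when the response is private.  The ETagHeader class and its two methods are hypothetical names of mine, not part of ASP.NET, and the private branch simply follows the behaviour described above, so test it on your own server.

```csharp
using System.Web;

public static class ETagHeader
{
    // Wraps a raw value in the double quotes the HTTP 1.1 specification
    // expects; existing surrounding quotes are stripped first.
    public static string Quote(string rawValue)
    {
        return "\"" + rawValue.Trim('"') + "\"";
    }

    // Call this once per response: a second SetETag call throws.
    public static void Apply(HttpResponse response, string rawValue, bool isPrivate)
    {
        string eTag = Quote(rawValue);
        if (isPrivate)
        {
            // the cache policy does not emit the ETag for private responses,
            // so write the header directly
            response.Cache.SetCacheability(HttpCacheability.Private);
            response.AppendHeader("ETag", eTag);
        }
        else
        {
            response.Cache.SetCacheability(HttpCacheability.Public);
            response.Cache.SetETag(eTag);
        }
    }
}
```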

If the content is cached, the client will send a conditional GET request, meaning the request will contain the If-Modified-Since header and, if the response carried an ETag, the If-None-Match header.  If-Modified-Since corresponds to the Last-Modified header of the response and If-None-Match to its ETag header.  The HTTP 1.1 specification says that the server must process both the ETag and the last modification date if both are contained in the request.
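Stripped of everything ASP.NET specific, that rule can be sketched as a small function: a 304 is only allowed when it agrees with every validator the client actually sent.  The class, method and parameter names here are made up for illustration.

```csharp
using System;

public static class ConditionalGet
{
    // Returns true when answering 304 is consistent with all validators
    // present in the request. Pass null for a header the client did not send.
    public static bool CanReturnNotModified(
        string ifNoneMatch, DateTime? ifModifiedSince,
        string currentETag, DateTime lastModified)
    {
        bool hasETag = !string.IsNullOrEmpty(ifNoneMatch);
        bool hasDate = ifModifiedSince.HasValue;

        // no validators at all: this is an ordinary GET, send the content
        if (!hasETag && !hasDate)
        {
            return false;
        }

        // a validator that was not sent cannot contradict a 304
        bool eTagMatches = !hasETag || ifNoneMatch == currentETag;
        // HTTP dates only carry whole seconds, so allow one second of slack
        bool dateMatches = !hasDate ||
            lastModified - ifModifiedSince.Value <= TimeSpan.FromSeconds(1);

        return eTagMatches && dateMatches;
    }
}
```

For example, ConditionalGet.CanReturnNotModified("\"abc\"", null, "\"abc\"", DateTime.UtcNow) returns true because the only validator sent matches, while passing null for both request values returns false.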

Here is the code that I have.  First is the method for creating the ETag, which as you can see is an MD5 hash of the filename and last modified date.  This code might not be the most efficient, so please review it before using it in a live environment.  I translated it from a Delphi blog and only took it to the point where it works.

//needs: using System; using System.Globalization; using System.Security.Cryptography; using System.Text;
private string GetFileETag(string fileName, DateTime modifyDate)
{
    //use file name and modify date as the unique identifier;
    //the round-trip format "o" keeps the time of day, so two changes on the same day get different ETags
    string fileString = fileName + modifyDate.ToString("o", CultureInfo.InvariantCulture);
    //get string bytes
    byte[] stringBytes = Encoding.UTF8.GetBytes(fileString);
    //hash string using MD5 and return the hex-encoded hash, wrapped in double quotes
    using (MD5CryptoServiceProvider md5Enc = new MD5CryptoServiceProvider())
    {
        return "\"" + BitConverter.ToString(md5Enc.ComputeHash(stringBytes)).Replace("-", string.Empty) + "\"";
    }
}

Here we check the validators sent with the request and then take the appropriate action.  Please note the hardcoded date and filename; replace these with the last updated date and name of your own page.

//REPLACE THIS WITH ACTUAL DATE AND FILENAME
//1 October 2009, built from its parts so it does not depend on the server's culture
DateTime updated = new DateTime(2009, 10, 1);
string filename = "Default.aspx";

//the current etag always comes from our own data, never from the request,
//otherwise we would only be comparing the request with itself
string eTag = GetFileETag(filename, updated);

//check if the file had been modified since the client cached it
if (!IsFileModified(filename, updated, eTag))
{
    //no change, return 304
    Response.StatusCode = 304;
    Response.StatusDescription = "Not Modified";
    //set to 0 to prevent client waiting for data
    Response.AddHeader("Content-Length", "0");
    //has to be not Private
    Response.Cache.SetCacheability(HttpCacheability.Public);
    Response.Cache.SetLastModified(updated);
    Response.Cache.SetETag(eTag);
    Response.End();
    return;
}
else
{
    //make sure the client caches it
    Response.Cache.SetAllowResponseInBrowserHistory(true);
    Response.Cache.SetCacheability(HttpCacheability.Public);
    Response.Cache.SetLastModified(updated);
    Response.Cache.SetETag(eTag);
}

The above code was added to my Page_Load event.  The 304 should be returned before any content is written to the response.

private bool IsFileModified(string filename, DateTime modifyDate, string eTag)
{
    DateTime modifiedSince;
    string ifModifiedSince = Request.Headers["If-Modified-Since"];
    string ifNoneMatch = Request.Headers["If-None-Match"];
    bool hasDate = !string.IsNullOrEmpty(ifModifiedSince);
    bool hasETag = !string.IsNullOrEmpty(ifNoneMatch);

    //no validators at all means this is not a conditional request, so assume modified
    if (!hasDate && !hasETag)
    {
        return true;
    }

    //check If-Modified-Since request header, if it exists
    bool fileDateModified = false;
    if (hasDate)
    {
        //a date we cannot parse counts as modified
        fileDateModified = true;
        if (DateTime.TryParse(ifModifiedSince, out modifiedSince))
        {
            //ignore time difference of up to one second to compensate for date encoding
            fileDateModified = (modifyDate - modifiedSince) > TimeSpan.FromSeconds(1);
        }
    }

    //check the If-None-Match header, if it exists; this header is used by Firefox to validate entities based on the etag response header
    bool eTagChanged = hasETag && ifNoneMatch != eTag;

    return (eTagChanged || fileDateModified);
}

It might be a good idea to put this into an HttpHandler for your dynamic pages/images, or maybe an HttpModule.
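I have not built this myself, but a module version might look roughly like the sketch below.  ConditionalGetModule and LookUpETag are hypothetical names, the lookup is a placeholder you would have to fill in with your own data, and the module still needs registering in web.config.  To stay short it only checks If-None-Match, so the If-Modified-Since check from IsFileModified would need adding for real use.

```csharp
using System;
using System.Web;

public class ConditionalGetModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        // run before any page or handler starts producing content
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        HttpRequest request = application.Context.Request;
        HttpResponse response = application.Context.Response;

        string currentETag = LookUpETag(request.Path);
        if (currentETag == null)
        {
            // nothing known about this path, let the request run normally
            return;
        }

        // set the caching headers once here; the page must not call SetETag again
        response.Cache.SetCacheability(HttpCacheability.Public);
        response.Cache.SetETag(currentETag);

        if (request.Headers["If-None-Match"] == currentETag)
        {
            response.StatusCode = 304;
            response.StatusDescription = "Not Modified";
            // skip the page handler and jump to the end of the pipeline
            application.CompleteRequest();
        }
    }

    // Placeholder (assumption): return the quoted ETag for a path, or null
    // when the path should not be cached.
    private static string LookUpETag(string path)
    {
        return null;
    }

    public void Dispose()
    {
    }
}
```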

Note: IE does not send the caching headers when you access http://localhost/, not IE7 anyway.  It took me about an hour to find the problem.

I hope this is clear enough.  If not, comment and I will try and answer all questions.