Thursday, April 27, 2006

Speed up your AJAX based webapps

UPDATE: I guess most people are not getting what this technique does in the first place. It sets the expiry of the JavaScript to years, not days. Once a JavaScript file is downloaded it is never downloaded again, unless of course you force it by removing the file from the cache. If you visit the site often, the JavaScript will not be removed from the cache. If you make any changes to the JavaScript, you only need to change the version of the file and the new file will be downloaded. The older file is automatically removed from the cache once it is no longer requested. One more point: this can be done on the web server itself without using this technique, but that has its own drawbacks. To further speed up the download you can gzip the JavaScript.


If you have developed an AJAX-based web application, you know how many JavaScript files are required per web page. If you use the Prototype library or the Dojo toolkit, you also know how big those JavaScript files can turn out to be.

I am currently developing a website, fefoo.com, and I learned a few things about caching and how you can speed up your website for users who visit it often. A website like digg takes more than a minute to load on my dial-up connection even though the main page is no more than 27-32 KB. The real time is taken up by the JavaScript files. The solution to this problem is to cache the JavaScript files. Caching improves the speed, but it causes a problem when you have to update the JavaScript files, since the browser will not look for updated files once they have been cached.

Since your files are cached, the browser will not request them again and you will not be able to send out updated JavaScript files. The solution is to use a different name for your JavaScript file every time it changes, or to version your directory. So version 0.1 of your project would use http://testserver.com/javascript/0.1/test.js, and version 0.2 would use http://testserver.com/javascript/0.2/test.js.
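A minimal sketch of the versioned-directory idea, written in Python only for illustration. The helper name versioned_url is hypothetical; the URLs are the ones from the example above.

```python
def versioned_url(base, version, filename):
    """Build a URL whose path changes whenever the version changes."""
    # A new version yields a URL the browser has never cached,
    # so the updated file is fetched even if the old one never expires.
    return "{}/{}/{}".format(base.rstrip("/"), version, filename)


old_url = versioned_url("http://testserver.com/javascript", "0.1", "test.js")
new_url = versioned_url("http://testserver.com/javascript", "0.2", "test.js")
```

The browser treats old_url and new_url as unrelated resources, which is the whole trick.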

Though this solution is good, it is still difficult to implement: soon you will have multiple directories to take care of, and you will face problems when only one file needs to be changed.

After facing the trouble of a slow server and the JavaScript files being downloaded every time, I came up with this solution for PHP and .NET based web applications. You need to download getjs.php or getjs.aspx, depending on your server. To load a script file, use:
<script src="getjs.php?file=test1&version=0.1"></script>
<script src="getjs.aspx?file=test2&version=0.2"></script>

In this example we load two files, test1.js and test2.js. This solves two problems: first, it works even if you are on virtual hosting, where you may not have the rights to change the web server's settings; second, it solves the problem of multiple files. So if you wish to change only one file, change its version from 0.1 to 0.2 and the new file will be downloaded and cached. In the next article I will try to explain how this has been implemented, though if you know PHP or VB.NET and have some idea about HTTP you won't face too many problems.
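The request handling can be sketched in a framework-free way. The Python below is an illustration, not the actual getjs script: the function name serve_script, the script_dir parameter and the dictionary-based request and response are all assumptions. It also departs from the listings in two places: it answers a missing file with 404 instead of an alert, and it sets Expires one year ahead instead of using a fixed date.

```python
import os
import time
from email.utils import formatdate

ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def serve_script(params, request_headers, script_dir="."):
    """Return (status, headers, body) for a getjs-style request."""
    # basename() keeps the lookup inside script_dir, and ".js" is always
    # appended, so no other kind of file can be requested.
    name = os.path.basename(params.get("file", "")) + ".js"
    path = os.path.join(script_dir, name)
    headers = {"Content-Type": "text/javascript"}

    if not os.path.isfile(path):
        return 404, headers, b"/* script not found */"

    # The "version" parameter is never read here: it only makes the URL
    # unique. A given URL never changes, so every revalidation request
    # can be answered with 304 and an empty body.
    if request_headers.get("If-Modified-Since"):
        return 304, headers, b""

    now = time.time()
    headers["Last-Modified"] = formatdate(now, usegmt=True)
    headers["Expires"] = formatdate(now + ONE_YEAR, usegmt=True)
    headers["Cache-Control"] = "public"
    with open(path, "rb") as handle:
        return 200, headers, handle.read()
```

Calling serve_script({"file": "test1", "version": "0.1"}, {}) returns the file with the long-lived headers; the same call with an If-Modified-Since header returns 304.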

You can view a demo to see how you can improve your own web applications using prontoCache.
<?php
/*
* Author: Vivek Jishtu
* Copyright (c) 2006 Viamatic Softwares
*/

/**************************************************
This acts like a security measure also. So no
other extension except JavaScript can be downloaded.
**************************************************/

$filename = basename($_GET["file"]) . ".js"; // basename() blocks ../ path traversal
header("Content-Type: text/javascript");
if(!file_exists($filename)) {
echo "alert('The file [" .htmlspecialchars($_GET["file"], ENT_QUOTES). "] does not exist. Please inform webmaster.');";
exit;
}

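// Read the header from $_SERVER; the old global only exists with register_globals on.
$if_modified_since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? preg_replace('/;.*$/', '', $_SERVER['HTTP_IF_MODIFIED_SINCE']) : "";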
/**************************************************
The javascript never expires so if we get any
Request for a modified page we send back that
javascript has not been modified.
**************************************************/
if ($if_modified_since != "") {
header("HTTP/1.0 304 Not Modified");
exit;
}

/**************************************************
Set the cache such that it does not expire. In
this example we set it till 22nd Feb 2011.
You can change the date to whatever year you want,
any year in the future.
**************************************************/
header("Last-Modified: " . gmdate('D, d M Y H:i:s', time()) . ' GMT');
header("Expires: Tue, 22 Feb 2011 05:00:00 GMT");
header("Cache-Control: public");

echo "/* prontoCached on ". gmdate('D, d M Y H:i:s', time()) . " */\r\n";
readfile($filename); // send the file as-is; require_once would execute any PHP tags inside it
?>
The PHP Code
<%@ Page Language="VB" %>
<script runat="server">
'
' Author: Vivek Jishtu
' Copyright (c) Viamatic Softwares
'
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs)
'**************************************************
' This acts like a security measure also. So no
' other extension except JavaScript can be downloaded.
'**************************************************

Dim FileName As String = Me.MapPath(Request.QueryString("file") & ".js")
Response.ContentType = "text/javascript"

If Not My.Computer.FileSystem.FileExists(FileName) Then
Response.Write("alert('The file does not exist. Please inform webmaster.');")
Response.End()
Return
End If
'**************************************************
' The javascript never expires so if we get any
' Request for a modified page we send back that
' javascript has not been modified.
'**************************************************
If Request.Headers("If-Modified-Since") <> "" Then
Response.StatusCode = 304
Response.StatusDescription = "Not Modified"
Response.End()
End If

'**************************************************
' Set the cache such that it does not expire. In
' this example we set it till 22nd Feb 2011.
' You can change the date to whatever year you want,
' any year in the future.
'**************************************************
Response.AddHeader("Last-Modified", DateToHTTPDate(Date.Now))
Response.AddHeader("Expires", "Tue, 22 Feb 2011 05:00:00 GMT")
Response.AddHeader("Cache-Control", "public")
Response.Write("/* prontoCached on " & Date.Now & " */" & vbCrLf)
Response.Write(My.Computer.FileSystem.ReadAllText(FileName))
Response.End()

End Sub

'Source for DateToHTTPDate from http://www.motobit.com/tips/detpg_net-last-modified/
Function DateToHTTPDate(ByVal OleDATE As Date) As String
On Error Resume Next
OleDATE = OleDATE.ToUniversalTime
Return engWeekDayName(OleDATE) & _
", " & Right("0" & Day(OleDATE), 2) & " " & engMonthName(OleDATE) & _
" " & Year(OleDATE) & " " & Right("0" & Hour(OleDATE), 2) & _
":" & Right("0" & Minute(OleDATE), 2) & ":" & Right("0" & Second(OleDATE), 2) & " GMT"
End Function

Function engWeekDayName(ByVal dt As Date) As String
Dim Out As String = ""
Select Case Weekday(dt, 1)
Case 1 : Out = "Sun"
Case 2 : Out = "Mon"
Case 3 : Out = "Tue"
Case 4 : Out = "Wed"
Case 5 : Out = "Thu"
Case 6 : Out = "Fri"
Case 7 : Out = "Sat"
End Select
Return Out
End Function

Function engMonthName(ByVal dt As Date) As String
Dim Out As String = ""
Select Case Month(dt)
Case 1 : Out = "Jan"
Case 2 : Out = "Feb"
Case 3 : Out = "Mar"
Case 4 : Out = "Apr"
Case 5 : Out = "May"
Case 6 : Out = "Jun"
Case 7 : Out = "Jul"
Case 8 : Out = "Aug"
Case 9 : Out = "Sep"
Case 10 : Out = "Oct"
Case 11 : Out = "Nov"
Case 12 : Out = "Dec"
End Select
Return Out
End Function

Public Function DateFromHTTP(ByVal HTTPDate As String) As Date
Dim Swd As String, d As String, Sm As String, y As String, h As String
Dim m As String, s As String, g As String, Out As Date
HTTPDate = LCase$(HTTPDate)

If Mid$(HTTPDate, 27, 3) = "gmt" Then
Swd = Left$(HTTPDate, 3)
d = Mid$(HTTPDate, 6, 2)
Sm = Mid$(HTTPDate, 9, 3)
y = Mid$(HTTPDate, 13, 4)
h = Mid$(HTTPDate, 18, 2)
m = Mid$(HTTPDate, 21, 2)
s = Mid$(HTTPDate, 24, 2)
Out = New Date(y, mFromSm(Sm), d, h, m, s)
Out = Out.ToLocalTime
End If

Return Out
End Function

Function wdFromSwd(ByVal Swd As String) As Integer
Dim Out As Integer
Select Case LCase$(Swd)
Case "sun" : Out = 1
Case "mon" : Out = 2
Case "tue" : Out = 3
Case "wed" : Out = 4
Case "thu" : Out = 5
Case "fri" : Out = 6
Case "sat" : Out = 7
End Select
Return Out
End Function

Function mFromSm(ByVal Sm As String) As Integer
Dim Out As Integer
Select Case LCase$(Sm)
Case "jan" : Out = 1 : Case "feb" : Out = 2
Case "mar" : Out = 3 : Case "apr" : Out = 4
Case "may" : Out = 5 : Case "jun" : Out = 6
Case "jul" : Out = 7 : Case "aug" : Out = 8
Case "sep" : Out = 9 : Case "oct" : Out = 10
Case "nov" : Out = 11 : Case "dec" : Out = 12
End Select
Return Out
End Function
</script>
The VB.net Code

In case you have the rights to change the expiry of JavaScript on the web server itself, you can still use the version technique to send out new versions of files using <script src="test1.js?version=0.1"></script>. The parameter is named version only to make its purpose clear; you can also use any other value, as long as it changes whenever the file does.


After looking at suggestions from people, I guess the best option is to use <script src="test1.js?timestamp={timestamp('test1.js');}"></script>. This is the simplest way of doing it. How you get the timestamp of a file depends on the language and platform, but any time the file is modified the timestamp changes. Using this method, you do not have to change the version by hand either.
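As one possible reading of the timestamp('test1.js') placeholder, here is a short Python sketch. The helper name script_tag and its url parameter are hypothetical; os.path.getmtime is the standard-library call that returns a file's modification time.

```python
import os


def script_tag(path, url=None):
    """Return a script tag whose query string is the file's modification time."""
    # The timestamp changes whenever the file is saved, so the URL changes
    # too and the browser fetches the new copy without any manual versioning.
    stamp = int(os.path.getmtime(path))
    return '<script src="{}?timestamp={}"></script>'.format(url or path, stamp)
```

The page template would call script_tag once per script when rendering the HTML, at the cost of one file-system lookup per script.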


23 comments:

Anonymous said...

You could have also just appended the timestamp of the JS file when you call it... implementing that would take a lot less logic.

somescripts.js?1146185994

Anonymous said...

Bad form outputting possibly dangerous output, e.g., alert( ... filename. A specially crafted URL could allow an attacker to execute JS on the client's machine as if it came from your domain, for example, stealing or deleting cookies and/or session ids.

Anonymous said...

I agree with the first anonymous. I make AJAX applications almost every day; appending a timestamp or some other random number is the easiest way to keep it from getting cached.

Archimedes Trajano said...

Or you can just rename the javascript file to something else.

However, I did post the AjaxQueue pattern on my blog, which in theory speeds up part of the AJAX equation, namely the response, since it can group responses together in one chunk rather than dealing with a lot of small requests.

AeH said...

I believe the point of this post was to SELECTIVELY cache javascript files.

Seems to me like a good solution to this problem.

Anonymous said...

His point is not to stop caching of js, but indeed to allow js to be cached by the browser, but to immediately invalidate browser cache if and when the js changes. Can be handled by server etags, though that still means the browser will need to ping the server to see if the js has changed. I got around this by using rewrite rules, and a perl script, which added cvs version info to my urls. Set a 4 week freshness setting on all the js, images, css, etc. and change the url only when cvs version changes (which immediately invalidates the browser cache). Since I am using mod_rewrite, don't ever need to actually maintain multiple dirs. Works well.

Anonymous said...

In agreement with, and expanding upon, the 1st and 3rd anonymous posters' comments: why couldn't the call to the js file be appended with a query var every time a new version is posted? It appears that the js call is being edited anyway in this implementation (the version number in the call changes), but I believe this additional hit on the server to call the PHP file could be avoided.

From my understanding, and the previous posters' comments will validate this, the browser caches based on unique URLs. So test.js?a=1 will not be seen as the same as test.js?a=2. Why then do we need this additional call to the server when, instead of modifying the file, we could merely modify the src reference of the script tag (like the timestamp method mentioned earlier)?

Vivek Jishtu said...

Thanks Anonymous (2) I removed the filename from the JavaScript. That should take care of the security hole.

Anonymous said...

Or how about leaving it big to encourage dial-up users to switch to broadband and get out of the stone age. Or petition their local phone company for broadband.. Costs is no matter, broadband dsl as cheap as dialup now =P

Austin said...

Don't forget that in ASP.NET you can use generic handlers (.ashx files) for times you don't need the full overhead of a page. Here is a sample:

<%@ WebHandler Language="VB" Class="HelloWorld" %>

Imports System
Imports System.Web

Public Class HelloWorld : Implements IHttpHandler

Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest
context.Response.ContentType = "text/plain"
context.Response.Write("Hello World")
End Sub

Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
Get
Return True
End Get
End Property

End Class

Anonymous said...

If you use Ruby on Rails it automatically appends the timestamp to anything when you use the asset helpers. Just in case anyone thought about implementing something like this in Rails.

Anonymous said...

This is exactly the way ASP.NET 2.0 webresources work : It appends (among other things) the assembly timestamp in the querystring when requesting the resource.

It may be worth mentioning that you can apply the same technique to CSS files and images (link and img tags), the server will ignore the querystring when serving the resource, but the browser will keep it in the cached resource name.

Corey Tisdale said...

Here is an idea: keep your file named the same thing, i.e. myfile.js. Then reference it as myfile.1.js or myfile.2.js each time you change the content. Then set up your server rewrites to ignore any numbers after that second period, using a regex like

([0-9a-zA-Z]+)(\.[\d]*)?(\.js)

And then use match 1 and 3, who cares if 2 is blank or has stuff in it.

PS - I know this will work with ISAPI rewrite; I don't really have any experience with Apache mod_rewrite, but I imagine it can use regexes and that this one or a slightly modified one should work there.
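For illustration only, the mapping that such a rewrite rule performs can be sketched in Python rather than as an actual ISAPI or mod_rewrite rule. The function name strip_version is made up for this example, and the pattern is the regex above with ^ and $ anchors added as an assumption.

```python
import re

# Group 1 is the base name, group 2 the optional ".<digits>" version,
# group 3 the ".js" extension.
VERSIONED = re.compile(r"^([0-9a-zA-Z]+)(\.\d*)?(\.js)$")


def strip_version(name):
    """Map a versioned name such as myfile.2.js back to the real file name."""
    match = VERSIONED.match(name)
    if match is None:
        return name  # not a name this rule understands; leave it alone
    return match.group(1) + match.group(3)
```

Both myfile.1.js and myfile.2.js resolve to the same file on disk, while the browser caches them as separate URLs.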

Anonymous said...

I simply use jsmin (which removes comments and whitespace) and append all javascript files into one file named all-VERSION.js, where VERSION is the svn revision number. The goal here was twofold: Enable versioning, plus keep the javascript parsing and downloading to a minimum. The effect of using jsmin really speeds up client side interpretation of the javascript. I suppose I could gzip the all-VERSION.js file as well.

Here's the script:

#!/bin/bash

rm -f all-*.js

ver=$(svn info js | grep Revision | awk '{ print $2 }')

oldver=`cat VERSION`

echo "Old version is $oldver, version is $ver"

file="all-$ver.js"

cat > $file <<EOF
/*

In order to keep this file as small as possible, all license texts have been
removed. For a copy of the license text for all javascript code used throughout
this site, please see:

http://www.lapdonline.org/crimemap/license.txt

*/
EOF

if [ ! -f jsmin ]
then
gcc -Wall -o jsmin ../assets/jsmin/jsmin.c
fi

for i in js/* # Oversimplified, as the files must be in order, but you get the idea
do
echo $i
./jsmin < $i >> $file
done

echo -n ${ver} > VERSION

Anonymous said...

Dude, throw away all that old ASP 3.0 crap that spawns from the function DateToHTTPDate to format the date in RFC 1123. .NET has Date.UtcNow.ToString("r"), which formats the date for you automatically (use UtcNow rather than Now, because "r" labels the time as GMT without converting it).

That is, replace this:
Response.AddHeader("Last-Modified", DateToHTTPDate(Date.Now))
with this:
Response.AddHeader("Last-Modified", Date.UtcNow.ToString("r"))

-Gianni

Anonymous said...

The question that needs to be answered here is:

will /scripts/script.js?version=1 be cached assuming the expiry is set into the future?


if so, the simple answer is to append the version to the request so that once the version is cached, it stays cached. when the script is updated, ensure the version key/value is changed.

Anonymous said...

To answer the questions: yes, all browsers do cache file.js?a=1 differently from file.js?a=2. The Rails team tested it extensively.

Also, author, rewrite that damn .NET code! .NET (finally) gives full date formatting that is tied into the OS. Native code can format a real date much faster than you can, and much more accurately.

Daniel Gomes Silveira said...

The HTTP/1.1 specification says this about the Expires field:

To mark a response as "never expires," an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future.

So, you SHOULD NOT set the Expires date to 2011.