»Governing a great state is like cooking small fish.«
Tao Te Ching
The Scenario

Imagine: You run a popular website on an AMP stack (Apache, MySQL, PHP), but your hardware is awfully old, and over the years, as your website became more and more popular, it got slower and slower and slower. Today it runs with a load of 42 and smoke is starting to pour out of the TCP ports. And because you're a silly idealistic fool, you have no money for new hardware or for a reasonable hosting service. What now?

The answer is in the book, the book of caches.

The Example

Let me set up a simple example:

You have a PHP-based (or whatever) system for your website, and your requests usually look like:

    http://domain/index.php?page=welcome

To get human-readable and more SEOish addresses, you're already using Apache's mod_rewrite in your .htaccess:

    RewriteEngine on
    RewriteRule ^([^/]*).html$ /index.php?page=$1 [L]

Now all your website's URLs look like static HTML pages:

    http://domain/welcome.html

Perfecto.

Now let's focus on the backend. For this example I'll use the following very(!) simple(!) PHP code:
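A minimal sketch of what such a script might look like, reconstructed from the surrounding description rather than taken from the original listing. The PDO connection string, the database name website, the credentials, and the 404 handling are all assumptions; only the table pages and the three included files come from the text.

```php
<?php
// index.php -- hypothetical sketch, not the original listing.
// DSN, database name and credentials below are placeholders.

// Page name as passed in by the rewrite rule, e.g. "welcome".
$page = isset($_REQUEST['page']) ? $_REQUEST['page'] : 'welcome';

$db = new PDO('mysql:host=localhost;dbname=website;charset=utf8', 'user', 'secret');

// Prepared statement, so the page name is never pasted into the SQL string.
$stmt = $db->prepare('SELECT content FROM pages WHERE name = ?');
$stmt->execute(array($page));
$content = $stmt->fetchColumn();

// Assumption: unknown pages end here with a 404, before any layout is built.
if ($content === false) {
    header('HTTP/1.0 404 Not Found');
    exit('Page not found.');
}

include("header.php");
include("navigation.php");
echo "<p>" . htmlspecialchars($content) . "</p>";
include("footer.php");
```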

The MySQL table pages looks like this:

    +----+---------+-----------------------------------------+
    | id | name    | content                                 |
    +----+---------+-----------------------------------------+
    |  1 | welcome | Dear Traveler, welcome to my AMP world! |
    |  2 | about   | This is about AMP!                      |
    |  3 | team    | Apache, MySQL, and PHP.                 |
    +----+---------+-----------------------------------------+

The files header.php, navigation.php and footer.php contain some mix of PHP and HTML to build the navigation and some basic page layout.

Everything put together may look like this in a browser:

The Benchmark

Now I'm using ApacheBench (included in every Apache installation) and fire 1000 sequential requests at my website:

    % ab -n 1000 http://demo/welcome.html
    ...
    Requests per second: 397.25 [#/sec] (mean)
    ...

In this case my AMP system was able to serve 397 requests per second. Not bad, but it's also a very(!) simple(!) PHP script.

Setting up the cache

First, I add some lines of code to my previous PHP script.

One line just before the include("header.php") statement:

    ob_start();

And these four lines at the end, just after the include("footer.php") statement:

    $output=ob_get_contents();
    file_put_contents("cache/".basename($_REQUEST['page']).".html", $output);
    ob_end_clean();
    echo $output;

ob_start() instructs PHP to keep the generated output in an internal buffer. The last four lines tell PHP to save this buffer to a file, for example cache/welcome.html, and then send it to the browser as usual.
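The four lines above write straight into the final file, so a request arriving in the middle of a write could, in principle, be handed a half-written page. A slightly more defensive sketch follows. The helper names cache_file_for() and write_cache_atomically() and the allowed-character rule are my own assumptions, not part of the recipe. basename() already strips directory parts; the whitelist just adds a second safety net.

```php
<?php
// Hypothetical helpers; names and the allowed-character rule are assumptions.

/**
 * Map a page name to its cache file, or return null if the name
 * contains anything other than letters, digits, "_" or "-".
 */
function cache_file_for($cacheDir, $page)
{
    if (!is_string($page) || !preg_match('/^[A-Za-z0-9_-]+\z/', $page)) {
        return null;
    }
    return rtrim($cacheDir, '/') . '/' . $page . '.html';
}

/**
 * Write to a temporary file in the same directory, then rename it into place.
 * Within one filesystem, rename() replaces the target atomically on POSIX systems,
 * so the web server sees either the old file or the complete new one.
 */
function write_cache_atomically($file, $output)
{
    $tmp = tempnam(dirname($file), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $output) === false) {
        unlink($tmp);
        return false;
    }
    chmod($tmp, 0644); // tempnam() creates the file with mode 0600
    return rename($tmp, $file);
}

// Possible use at the end of index.php, replacing the file_put_contents() line:
//
//   $file = cache_file_for(__DIR__ . '/cache', $_REQUEST['page']);
//   if ($file !== null) {
//       write_cache_atomically($file, $output);
//   }
```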

Now I create a directory called cache next to my index.php file and make sure my Apache is able to write to and read from that directory:

    % mkdir cache
    % chmod a+rwx cache

If I now reload my welcome page in my browser, a file named welcome.html gets created in this cache directory:

    % ls -l cache
    total 4
    -rw-r--r-- 1 www-data www-data 732 2009-09-02 13:02 welcome.html

Now I add these lines to my mod_rewrite configuration (the three lines in the middle are new):

    RewriteEngine on
    RewriteCond %{REQUEST_URI} \.html$
    RewriteCond %{DOCUMENT_ROOT}/cache/%{REQUEST_URI} -s
    RewriteRule . /cache/%{REQUEST_URI} [L]
    RewriteRule ^([^/]*).html$ /index.php?page=$1 [L]

These three lines read like this: (first line) for all requests ending in ".html", (second line) if there is a non-empty file in the cache directory, named exactly like the resource my web server was asked for, then (third line) send this file to the browser. If there is no such file, continue with calling the PHP script.

The Rerun of the Benchmark

That's all. Now I rerun my benchmark from earlier:

    % ab -n 1000 http://demo/welcome.html
    ...
    Requests per second: 1287.57 [#/sec] (mean)
    ...

Wow, that's about three times as fast as the regular PHP version. And in this example I'm using a very(!) simple(!) PHP script. On a more complex system, the boost will usually be much higher. For example, on www.apachefriends.org we're using a cache based on this recipe and we got a 300-fold performance win. (That's because we have a very complex - some may say crappy - CMS running.)

Pros & Cons

Pros:
  • quite easy to set up
  • no additional software is needed, just an Apache with mod_rewrite
  • very high performance win on slow systems
  • works with every web programming language, not only PHP
Cons:
  • the cache will never refresh
  • the system doesn't work with user sessions
But both of these drawbacks can be solved relatively easily by adding some more lines of program code or mod_rewrite configuration.
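As one possible take on the refresh problem, here is a sketch with two hypothetical helpers: invalidate_page() to drop a single cached page right after its content changes, and purge_stale_cache() to be run from a cron job. The function names, the one-file-per-page layout in cache/, and the idea of a maximum age are assumptions for illustration. Once a cached file is gone, the -s condition fails and the next request falls through to the PHP script, which writes a fresh copy.

```php
<?php
// Hypothetical helpers; names, cache layout and the age limit are assumptions.

/** Delete the cached copy of one page, e.g. right after its row in "pages" was updated. */
function invalidate_page($cacheDir, $page)
{
    $file = rtrim($cacheDir, '/') . '/' . basename($page) . '.html';
    return is_file($file) ? unlink($file) : true;
}

/** Delete every cached *.html file older than $maxAgeSeconds; returns the number removed. */
function purge_stale_cache($cacheDir, $maxAgeSeconds, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $removed = 0;
    $files = glob(rtrim($cacheDir, '/') . '/*.html');
    foreach ($files ?: array() as $file) {
        // filemtime() is the time the cache file was last written.
        if ($now - filemtime($file) > $maxAgeSeconds && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Possible use:
//   invalidate_page(__DIR__ . '/cache', 'welcome');   // after editing the welcome page
//   purge_stale_cache(__DIR__ . '/cache', 3600);      // hourly, from cron
```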
