Linux and PHP web application support and development (Bromsgrove, UK)

Improving site performance using asynchronous database writes and memcache in PHP

Web applications often have a requirement to log information about a request to a database. In a contrived example this could just be a global counter which attempts to amaze visitors to your site about how many requests you’ve handled in the last hour/day/week/month.

One approach to doing this is to update a database each time a page of interest is accessed, however this isn’t very good for performance reasons. The article below illustrates the problem and how it could be done using an asynchronous background task.

Initial implementation

Starting with code like :

$todaysDate = date('Ymd');
$db->query("INSERT INTO site_hits (date, count)
    VALUES (?, ?) ON DUPLICATE KEY UPDATE count = count + 1", array($todaysDate, 1));

We could use the MySQL ‘ON DUPLICATE KEY’ check, and having a unique key constraint on the ‘date’ field we can ensure we have an incrementing number for each day. This number is being updated in real time (hopefully good). The above can be enhanced slightly by changing it to be ‘INSERT DELAYED’, which should stop MySQL waiting for the query to execute before returning control back to your PHP code.

The problem

The drawback to the above is that every page hit now results in a write to the database – so although your users will see that figure incremented each time they refresh a page, the data will not be cached by MySQL and will result in the server having to read/write from disk all the time. Under normal operations this may not be a problem – but if the site experiences a sudden spike in traffic, you’ll probably wish your I/O bandwidth wasn’t being wasted by the stat counter.

A solution – asynchronous writing to the database

A (hopefully) better approach is to separate out the writing of the counter to the database from that of the page load by using a “global variable” which resides in memory. Periodically this variable is written to disk by a scheduled task/cron job. Pseudo code to provide an example of this is shown below :

On each page load, we instead do the equivalent of :

$counter = memcache_connect('MemcacheServerHost');
$cacheKey = 'my-global-counter';
$count = $counter->get($cacheKey);
if(empty($count)) {
    $count = 0;
}
$count++;
$counter->set($cacheKey, $count);

The above could also be done using APC or XCache variables instead of Memcache, depending on circumstance.

Next, create a cron job which runs every few minutes and does the equivalent of :

$counter = memcache_connect('MemcacheServerHost');
$count = (int) $counter->get($cacheKey);
if($count > 0) {
    $db->query("INSERT INTO site_hits (date, count) VALUES (?, ?) ON DUPLICATE KEY UPDATE count = count + " . $count, array('20130305', $count));
    $counter->set($cacheKey, 0);
}

The above can also be modified slightly again, so we can delay writing to the database if the server load is above a given threshold or we otherwise think the server is too busy. At this point we’d probably want to wrap up our procedural style proof of concept code into a reasonable object – like the following :

// ....
// In each page request :
SiteStatCounter::lazyRecordHit();

Cron job :

// ....
// In the cron job :
if(Server::isNotBusy()) {
    SiteStatCounter::saveHitsToDb();
}

, , , ,

Leave a Reply

Your email address will not be published. Required fields are marked *