How to improve WordPress/Apache performance on a relatively heavily loaded server using Varnish and more
Everything’s turning mobile – a talk given to BCS Birmingham in April 2012
Last Tuesday I [DG] presented a talk to BCS Birmingham, titled “Everything is turning mobile” (See the Birmingham.BCS.org website here).
I’ve also been “volunteered” into presenting a similar talk to the Wolverhampton branch of BCS sometime in the next month or two.
The talk covered a little bit of history (i.e. what was happening 5-10 years ago) and where we are today. It then showed how various technologies can be used to create an optimised website or mobile application, and gave some examples of what we [Pale Purple] have created and the issues we’ve encountered.
Thanks to BCS for arranging a great meeting which was well attended.
Interviewing time again
It appears to be that time of the year when I arrange to take a trip to Aberystwyth to interview students for a possible Industrial Year placement with Pale Purple. We’ve had 13 applications this year – some names I remember from the Gregynog interviews back in Autumn 2011. They all look good – so it’s going to be difficult to narrow them down to 5-6 for interviewing purposes.
PHP UK Conference 2012
On Friday and Saturday, last week, the annual London PHP conference took place.
We were there on Friday – and I saw the talks on :
- “The Journey towards Continuous Integration” (which has prompted me to upgrade our internal Jenkins based CI infrastructure and introduce Sonar for long term statistics capture)
- “Security Audits as an Integral part of PHP application development” (the presenter unfortunately wasn’t aware of any tools which could be easily integrated with a build environment like Jenkins, so the only approach seems to still be manual review or penetration testing on deliverables).
- “Profiling PHP Applications” (which was good, but could have been improved if the presenter had a “real” application to demo with which had performance problems).
- “Try { getting people to come to a talk about exceptions}” – which was an interesting look at exception and error reporting within software.
The conference was (as with previous years) well attended and contained a number of useful nuggets of information which made attending worthwhile.
As with previous years, videos of all the talks should be online – so catch-up should be possible.
Fight the bulge – page load time matters
One often overlooked area ripe for improvement on websites is that of page weight – namely how much data needs to be downloaded by the web browser before the page is rendered. Most web pages will be constructed from a mixture of Javascript, Stylesheets (CSS), Images and HTML. A BBC news article states that the average page ‘weight’ is approaching 1MB.
Why does it matter? Well, although it’s tempting to think everyone must have pretty quick >4Mbit ADSL, it’s not always the case -
- Mobile users may be on a GPRS (2G) or 3G connection – which is normally far less than 1Mbit/s (3G minimum should be 200Kbit/s). At this speed our average page would take about 40 seconds to load.
- Some people (often in rural areas) will be limited to around 512Kbit/s or 1Mbit/s – due to the long distances between themselves and the nearest telephone exchange (so roughly 8 to 16 seconds to load).
- If you’re lucky enough to have around a 10Mbit/s ADSL connection, then the download would take about 1 second.
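The load times quoted in the list above come from simple arithmetic: page weight in bits divided by link speed. Below is a minimal sketch of that calculation, assuming a 1MB page means 1,000,000 bytes and ignoring latency and protocol overhead (so real-world figures will be somewhat worse). The function name download_seconds is made up for illustration.

```shell
#!/bin/bash
# Estimate the time needed to download a page of a given weight.
# Usage: download_seconds <page_size_bytes> <link_speed_bits_per_second>
# Assumption: pure transfer time only (no latency, no TCP/HTTP overhead).
download_seconds() {
    awk -v bytes="$1" -v bps="$2" 'BEGIN { printf "%.1f\n", (bytes * 8) / bps }'
}

# A 1MB page over the three kinds of connection discussed above.
download_seconds 1000000 200000      # 3G minimum (200Kbit/s) -> 40.0
download_seconds 1000000 1000000     # rural ADSL (1Mbit/s)   -> 8.0
download_seconds 1000000 10000000    # 10Mbit/s ADSL          -> 0.8
```

Halving the page weight halves each of these figures, which is why the savings matter most on the slowest links.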
We seem to be seeing mobile users accounting for around 10-20% of the traffic to websites at the moment – and this figure is likely to grow as tablet computing becomes more popular. Having a mobile theme on your website is likely to help with this – so resources are loaded via AJAX (see e.g. jQueryMobile).
From a developer’s point of view, there are a number of things we can do to help reduce the ‘weight’ of a web page – for example :
- Enable compression on textual data being sent back by the web server using deflate or gzip (trading off some CPU time against network bandwidth) (Example below)
- Resize images so they are not being resized in a stylesheet / by the browser – when using PHP, something like the phpimageresize plugin could be used (Example below)
- Ensure static assets have appropriate / correct expiry times/headers to encourage client side caching (or caching by upstream proxy servers) (see mod_expires) (Example below)
- Try and use ‘common’ URLs for JS libraries (e.g. Google’s API hosting) – as there’s a reasonable chance the user will already have the script cached in their browser before visiting your site.
- Compact or minimise stylesheets and javascript resources – removing unnecessary spaces and comments (see e.g. the YUI compressor )
- Delay loading/fetching of Javascript (see e.g. the contentLoader script from jsclasses.org) until after the initial page has loaded.
- Merge multiple resources together (the Google mod_pagespeed Apache plugin can do this) – so rather than your browser making multiple requests to the server for multiple javascript files, it sees only one.
- Appropriate use of AJAX to load content (saving the user from having to download unchanged HTML between ‘pages’).
Server side data compression
If you’re using Apache, enabling the ‘deflate’ module and then having something like the following in a .htaccess file should work well -
AddOutputFilterByType DEFLATE text/css application/x-javascript text/x-component text/html
PHP can also perform the compression for you, but handling the compression through the web server will give better results, since it also covers static files such as stylesheets and Javascript rather than only PHP-generated output.
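To get a feel for how much textual content shrinks, the sketch below compresses a sample stylesheet with the gzip command line tool, which uses the same underlying DEFLATE algorithm. The CSS rules are invented for the demonstration, and real stylesheets are less repetitive, so expect a smaller (though usually still worthwhile) saving in practice.

```shell
#!/bin/bash
# Build a sample stylesheet: 200 similar, made-up CSS rules.
sample_css=$(for ((i = 1; i <= 200; i++)); do
    echo ".item-$i { margin: 0; padding: 4px 8px; color: #333333; }"
done)

# wc -c counts bytes; $(( )) strips any padding some wc versions add.
raw_bytes=$(( $(printf '%s' "$sample_css" | wc -c) ))
gzip_bytes=$(( $(printf '%s' "$sample_css" | gzip -c | wc -c) ))

echo "uncompressed: $raw_bytes bytes"
echo "gzipped:      $gzip_bytes bytes"
```

Running the same two pipelines against one of your own CSS or Javascript files gives a quick estimate of what enabling compression would save.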
Image Resizing
If a graphic designer has uploaded an image to the website, and it then gets resized through CSS, your browser is technically downloading more data than it needs to. Instances of this can be easily discovered using the Google PageSpeed tool.
Dynamic resizing of images can be done using the phpImageResize tool (and I’m sure there are many other alternatives) as follows – on the assumption we want to only show a 20px x 20px image:
<img src="<?php echo resize('images/whatever.jpg', array('h' => 20, 'w' => 20, 'scale' => true)); ?>">
For one customer we found correctly resizing the images reduced page ‘weight’ from around 5MB originally to 1MB (it is a very image heavy news site/blog). Such a drastic reduction will make the site feel ‘snappier’ and more responsive to all users, save on server resources (bandwidth) and allow the server to handle more traffic at once. This should lead to more page impressions and therefore greater advertising revenue.
Expiry Times
Adding something like the following to a ‘.htaccess’ file should work well, assuming Apache as your web server:
ExpiresActive On
ExpiresByType text/css A7200
ExpiresByType application/x-javascript A7200
This tells the browser to cache the javascript and CSS files for 2 hours after access. The expiry value can often be far higher – days, or even years.
It’s also normally useful to turn off ETag support at this point (“FileETag None” in .htaccess) – to try and stop browsers trying to validate whether their cache is up to date on each request.
Common URLs for JS Libraries
For example, rather than hosting jQuery from your own server (which is probably identical to jQuery on many other servers) you could use e.g.
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
This copy is minified, and there’s a reasonable chance the end user already has it cached due to other websites using it. See the Google code site for more information.
The Circuit Breaker Design Pattern
This post is based upon some content within our PHP OO training course (which covers design patterns).
The circuit breaker design pattern is a fairly simple, and handy approach to dealing with remote services which may be offline. To explain the pattern, here’s a semi-true story -
So, imagine the front page of a website includes some output from talking to twitter. However, one day twitter is offline – and all visitors to the site start to experience a 10-15 second delay in page load time. You investigate the problem and discover it’s due to Twitter being offline – in that all requests to it are timing out. What would be nice at this point in time is if your code could have known in advance that Twitter was offline and skipped making a request to it – therefore saving the end users from experiencing a page load delay.
In order to implement the above, we need some sort of shared state between requests – within a PHP context, we could use something like memcache or APC (both are fast and pretty simple to use).
So, before your code checks a remote service, it checks the circuit breaker’s status. If the circuit breaker says “ok” then your code talks to the remote service and notifies the circuit breaker of the outcome (success or failure). If more than a set number of failures occur within a set time period then the circuit breaker changes state and your code can therefore avoid experiencing whatever timeouts may have otherwise occurred.
Pseudo code to illustrate the above could look like :
<?php
// CircuitBreaker is a hypothetical class exposing isOk(), fail() and success().
$cb = new CircuitBreaker('talking_to_twitter');
if ($cb->isOk()) {
    $url = 'http://search.twitter.com/search.json?q=cake';
    $json = @file_get_contents($url);
    if (false === $json) {
        // immediate problem with twitter -
        // tell the circuit breaker.
        $cb->fail();
    } else {
        // let's try and decode the data (as an array) - check it's valid.
        // json_decode() returns null, not false, when decoding fails.
        $data = json_decode($json, true);
        if (!is_array($data) || !isset($data['completed_in'])) {
            $cb->fail();
        } else {
            // everything looks good... tell the circuit breaker
            // and print something trivial out.
            $cb->success();
            echo $data['results'][0]['text'];
        }
    }
} else {
    // circuit breaker thinks twitter is ill, so don't bother trying
    echo "Sorry, twitter is down\n";
}
Unfortunately there don’t seem to be many PHP CircuitBreakers around – the best examples I can find (outside of what I think is proprietary code belonging to a customer) are this post (which uses a database) and this proposal for one to go into the Zend Framework.
A good circuit breaker would need to :
- Support multiple instances (e.g. one for Twitter, one for talking to Google or whatever)
- Support variable timeouts and thresholds
- Support different storage ‘backends’ (e.g. memcache, APC, MySQL (perhaps) or fileSystem) – at which point it might be best for it to just use Zend Cache.
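The bookkeeping behind those requirements is small. Below is a minimal file-backed sketch in shell, for illustration rather than as a replacement for a PHP class. It keeps one state file per named breaker (so multiple instances work) and has a configurable failure threshold and time window. All function and variable names (cb_is_ok, CB_THRESHOLD and so on) are made up here, and there is no locking, which a real implementation shared between many requests would probably want.

```shell
#!/bin/bash
# Illustrative settings; override them in the environment as needed.
CB_DIR="${CB_DIR:-${TMPDIR:-/tmp}/circuit_breaker}"  # one state file per service
CB_THRESHOLD="${CB_THRESHOLD:-3}"   # failures within the window that open the breaker
CB_WINDOW="${CB_WINDOW:-60}"        # window length in seconds

# Print the number of failures recorded for service $1 within the window.
cb_recent_failures() {
    local file="$CB_DIR/$1" now
    now=$(date +%s)
    [ -f "$file" ] || { echo 0; return; }
    awk -v now="$now" -v window="$CB_WINDOW" \
        'now - $1 < window { n++ } END { print n + 0 }' "$file"
}

# Exit 0 while the breaker is closed, i.e. the service may be called.
cb_is_ok() {
    [ "$(cb_recent_failures "$1")" -lt "$CB_THRESHOLD" ]
}

# Record a failure: append the current timestamp to the state file.
cb_fail() {
    mkdir -p "$CB_DIR"
    date +%s >> "$CB_DIR/$1"
}

# Record a success: clear the failure history for this service.
cb_success() {
    rm -f "$CB_DIR/$1"
}

# Usage (fetch_tweets stands in for whatever talks to the remote service):
#   if cb_is_ok twitter; then
#       if fetch_tweets; then cb_success twitter; else cb_fail twitter; fi
#   else
#       echo "Sorry, twitter is down"
#   fi
```

Because old failures age out of the window, the breaker closes again by itself after CB_WINDOW seconds, at which point the next request acts as a probe of the remote service.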
Help! Developer needed! (PHP/Bromsgrove) (Jan 2012)
We seem to have too much work at the moment, and will need to either hire a new developer within the next few months or use a freelancer/contractor. So if you’re looking for work (or a change) here are some details:
Your primary role will involve building and testing PHP based web applications.
- PHP – you should know about design patterns (MVC, Factory, CircuitBreaker and so on), be able to talk to a database and be aware of security considerations. You should have knowledge/experience with PHP Frameworks (e.g. Zend).
- CSS
- Javascript – AJAX, jQuery …
- Unit testing – PHPUnit, Selenium and Jenkins.
If anyone is interested, please contact david at palepurple.co.uk with a CV and salary expectations.
Please no agencies.
Simple load checking shell script
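Below is a simple shell script which can be used to control execution of tasks on a Linux system based on the system’s current load value. If the 5 minute load average is at or above a given limit the script exits with an error return code (1); otherwise it completes without error (0).
In this case saved in a file called /usr/local/bin/load_check,
#!/bin/bash
# Usage: load_check LIMIT
# Exits 0 if the 5 minute load average is below LIMIT, 1 otherwise.
if [ -z "$1" ]; then
    echo "Incorrect usage .... " >&2
    exit 1
fi
LOADLIMIT=$1
# uptime prints "... load average: 0.52, 0.58, 0.59"; field 2 is the
# 5 minute average (use -f1 for the 1 minute average instead).
load_avg=$(uptime | awk -F 'load average:' '{print $2}' | cut -d, -f2)
# Compare numerically in awk: inside [[ ]] the < operator compares strings,
# so a load of 10.5 would wrongly count as less than a limit of 4.
if awk -v load="$load_avg" -v limit="$LOADLIMIT" 'BEGIN { exit !(load + 0 < limit + 0) }'; then
    exit 0
fi
exit 1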
And usage would look like :
/usr/local/bin/load_check 3 && run/whatever/command
It’s now possible to modify non-essential cron jobs (for example /etc/cron.d/munin) so that they do not run if the system is deemed too busy – so changing :
*/5 * * * * munin if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi
to
*/5 * * * * munin if [ -x /usr/bin/munin-cron ]; then /usr/local/bin/load_check 4 && /usr/bin/munin-cron; fi
Will result in munin-cron not running if the load is over 4. Rinse and repeat for other cron jobs which aren’t critical.
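As a variation, the check can read the load figures from /proc/loadavg rather than parsing uptime, and let the caller choose which average to test. The sketch below assumes the usual Linux format of that file (three load averages followed by process counts). The function name load_below and its argument order are made up for illustration. The loadavg line is passed in as an argument so the logic can be tried out with sample values.

```shell
#!/bin/bash
# Exit 0 when the chosen load average is below the limit, 1 otherwise.
#   $1 = limit
#   $2 = field to test (1 = 1 minute, 2 = 5 minute, 3 = 15 minute)
#   $3 = a line in /proc/loadavg format
load_below() {
    awk -v limit="$1" -v field="$2" -v line="$3" \
        'BEGIN { split(line, f, " "); exit !(f[field] + 0 < limit + 0) }'
}

# Sample line: 1, 5 and 15 minute averages of 0.52, 3.10 and 12.75.
sample="0.52 3.10 12.75 2/345 6789"
load_below 4 2 "$sample" && echo "5 minute load is below 4"
load_below 4 3 "$sample" || echo "15 minute load is 4 or more"

# In a cron job this might look like:
#   load_below 4 2 "$(cat /proc/loadavg)" && /usr/bin/munin-cron
```

The `+ 0` on both sides forces awk to compare numbers, so 12.75 is correctly treated as greater than 4.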
Gregynog 2011 – Interviewing feedback
Today, I interviewed a number of second year Aberystwyth Computer Science students at Gregynog. The aim of the exercise is to help prepare them for upcoming industrial placement interviews.
As a whole, the students were better this year – their CVs and covering letters had fewer obvious mistakes and appeared better prepared. The majority of students were also smartly dressed which gave a good impression.
However, many students undersell themselves – CVs were often missing reference to work they’d done outside their degree scheme – or the extent of the reference was “Perl” or “Python”. Yet, in one example a student had written a Python/MySQL GUI application and others had experimented with jQuery, CSS3 or HTML5.
Most students expressed a deep interest or passion in a specific field – yet they would often lack supporting “evidence” of self directed research. Being able to mention a mailing list / google group / relevant website / conference or hot topic within that field would lend credibility to the claim – employers want employees who are genuinely motivated and interested.
Finally – it was common to see students saying something like “It will help me in my degree to work for you as I’ll learn to do X and Y”. Unsurprisingly an employer is not likely to be interested in what the student will get from the employment period – they are interested in what benefits the student will bring to them and their team.
See also - http://www.palepurple.co.uk/interviewing-students-some-findings
PHP Barcelona conference, November 2011.
This is a quick summary of the PHP Barcelona conference – this is the second year one of us attended, and again it was well organised and had stimulating presentations/talks, such as:
- The Pomodoro Technique for time keeping
- Cloud hosting (managed vs unmanaged with Microsoft’s Azure+ and Orchestra.io as examples). This included live demos and a good coverage of the various options available (and why you might pick (for example) EC2 over Orchestra or vice versa).
- Solr – document indexing (which we are already using, but there was some useful content within the marathon 2 hour talk/workshop)
- Unit testing as an afterthought – Marco Tabini (phparch.com) spoke about his belief that code coverage on its own is not enough when it comes to unit testing (with e.g. phpunit) – it’s also necessary to cover how components interact with each other, and therefore effectively test an entire application and not just isolated components (something we’d already discovered).
- Doctrine2 – an overview of why you should be using an ORM and not lower level SQL everywhere – followed by a quick overview of why Doctrine2 is better/faster/superior to Doctrine1.
- PHP on the CLI – a few useful titbits were hidden within this talk covering using PHP for command line scripts (options parsing via the getopt library being the most useful).
