Job Queues – Gearman and PHP
When developing applications, an initial implementation often performs multiple tasks in real time and in sequence. While this may work well at first, performance issues often crop up as the system grows.
The main problem is that if the tasks take a long time to complete and run sequentially in the foreground, the response to the browser is delayed, so the end user has to wait. If the browser times out, the user is likely to be confused and to click ‘refresh’, which sends multiple identical requests.
Assuming the tasks can be executed in the background, one solution is to use a job queue, which allows the time-consuming tasks to be executed (possibly in parallel) in the background without blocking the end user.
A number of job queues exist – Gearman being just one.
Gearman integrates with multiple languages – such as Python, C, PHP and Perl. From a PHP perspective we’ve found the PECL extension to work well.
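As a rough illustration of handing work to the queue from PHP, here is a minimal sketch using the PECL extension's GearmanClient class. It assumes a gearmand server on 127.0.0.1:4730 (the default port); the 'send_email' function name, the payload fields and the two helper functions are invented for the example.

```php
<?php
// Sketch only: assumes the PECL gearman extension is installed and a
// gearmand server is listening on 127.0.0.1:4730 (the default port).
// The 'send_email' function name and the payload fields are made up.

/** Encode the job's input; Gearman workloads are plain strings. */
function buildEmailWorkload(int $userId, string $template): string
{
    return json_encode(
        ['user_id' => $userId, 'template' => $template],
        JSON_THROW_ON_ERROR
    );
}

/** Queue the job in the background; returns the job handle, or null on failure. */
function queueEmail(GearmanClient $client, int $userId, string $template): ?string
{
    $handle = $client->doBackground('send_email', buildEmailWorkload($userId, $template));

    if ($client->returnCode() !== GEARMAN_SUCCESS) {
        error_log('Could not queue job: ' . $client->error());
        return null;
    }

    return $handle;
}

// In the web request: queue the work, then respond without waiting for it.
if (extension_loaded('gearman')) {
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    queueEmail($client, 42, 'welcome');
}
```

The call returns as soon as the job server has accepted the job, so the browser gets its response while a worker sends the mail later.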
Gearman provides a number of useful features:
- background or foreground jobs
- a priority for each job, allowing higher priority tasks to skip ahead of lower priority ones (e.g. a password reset mail being processed before a marketing mail)
- an optional unique job ID, which stops duplicate tasks being queued
- the ability to use persistent queues (e.g. backed by MySQL)
- an admin protocol, allowing queue sizes, the number of running workers and so on to be queried
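A few of the features above can be sketched in PHP as follows. This is an illustration under assumptions rather than a definitive implementation: the function names, workloads and unique key are invented, and the status parser assumes the admin protocol's usual reply of tab-separated lines (function name, total jobs, running jobs, available workers) ending with a line containing a single dot.

```php
<?php
// Sketch only: assumes the PECL gearman extension and a gearmand server on
// 127.0.0.1:4730. Function names, workloads and the unique key are made up.

/** Queue a password reset ahead of normal and low priority jobs. */
function queuePasswordReset(GearmanClient $client, int $userId): void
{
    // The third argument is the optional unique ID: while a job with this
    // key is still queued, submitting it again does not add a duplicate.
    $client->doHighBackground('send_email', "reset:$userId", "password-reset-$userId");
}

/** Queue bulk mail at low priority so it does not hold up urgent jobs. */
function queueMarketingMail(GearmanClient $client, int $userId): void
{
    $client->doLowBackground('send_email', "newsletter:$userId");
}

/** Parse the reply to the admin protocol's "status" command. */
function parseStatus(string $reply): array
{
    $queues = [];
    foreach (preg_split('/\r?\n/', trim($reply)) as $line) {
        if ($line === '.') {
            break; // a lone dot marks the end of the reply
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 4) {
            continue; // skip anything that is not a status line
        }
        [$name, $total, $running, $workers] = $fields;
        $queues[$name] = [
            'total'   => (int) $total,
            'running' => (int) $running,
            'workers' => (int) $workers,
        ];
    }

    return $queues;
}

/** Ask gearmand for its queue sizes over a plain TCP connection. */
function fetchStatus(string $host = '127.0.0.1', int $port = 4730): array
{
    $socket = fsockopen($host, $port, $errno, $errstr, 2.0);
    if ($socket === false) {
        throw new RuntimeException("Cannot reach gearmand: $errstr");
    }

    fwrite($socket, "status\n");
    $reply = '';
    while (($line = fgets($socket)) !== false) {
        $reply .= $line;
        if (rtrim($line) === '.') {
            break;
        }
    }
    fclose($socket);

    return parseStatus($reply);
}
```

Calling fetchStatus() from a cron job or monitoring script is one way to keep an eye on how far behind each queue is.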
Using a job queue can help an application scale, as queued tasks can be executed by workers on other server nodes. Additionally, decoupling web requests from task execution gives flexibility: it may not be a problem if the queue grows quickly in the short term, as long as it is emptied eventually, which allows the application to cope with short-term increases in traffic. Our favourite aspect is that if the server becomes busier, we can reduce the number of queue workers, or delay creating new ones, which helps prioritise resources.
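To make the worker side concrete, here is a minimal sketch of a worker process, plus one possible way of deciding how many workers a node should run. The load thresholds, the desiredWorkerCount() helper and the 'send_email' function name are all assumptions for illustration; the worker itself needs the PECL extension and a reachable gearmand server.

```php
<?php
// Sketch only: the thresholds below and the 'send_email' function are made
// up; the worker needs the PECL gearman extension and a gearmand server.

/** Decide how many workers this node should run for a given load average. */
function desiredWorkerCount(float $load, int $cpuCores, int $maxWorkers): int
{
    $loadPerCore = $load / max(1, $cpuCores);

    if ($loadPerCore >= 1.0) {
        return 1; // busy: keep one worker so the queue still drains
    }
    if ($loadPerCore >= 0.5) {
        return max(1, intdiv($maxWorkers, 2)); // getting busy: run half
    }

    return $maxWorkers; // quiet: run the full complement
}

/** Register the job handler and process jobs until an error occurs. */
function runWorker(GearmanWorker $worker): void
{
    $worker->addFunction('send_email', function (GearmanJob $job) {
        $data = json_decode($job->workload(), true);
        // ... send the mail described by $data ...
        return 'sent';
    });

    // work() blocks until a job arrives, runs the handler, then returns.
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}
```

A supervisor script could, for example, call desiredWorkerCount(sys_getloadavg()[0], 4, 8) once a minute and start or stop worker processes to match, while each worker process creates a GearmanWorker, calls addServer() and then runWorker().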