A short howto for configuring a two-node highly available server cluster. This will be an active-standby configuration whereby a local filesystem is mirrored to the standby server in real time (by DRBD).
The article below shows how to configure the underlying components (DRBD, a filesystem to go on top of it, and drbdlinks), with a simple integration into the system init scripts and manual failover.
Failover to the other node can be automated (e.g. using Heartbeat / Pacemaker) or performed manually by the administrator; automating it is beyond the scope of this relatively short article.
Two Dell SC1435 servers. There’s nothing overly special about these – they have two network ports each, and each has dual mirrored SATA hard disks.
The first server has dual Western Digital 1TB disks; the second has dual Seagate 1TB disks (mixing manufacturers avoids the single point of failure of both nodes’ disks sharing the same flaw).
The second network card (eth2) in each server is connected to the other through a cross-over ethernet cable and has a 10.0.0.x network address; this is to reduce external dependencies (switches etc!).
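As a sketch, the cross-over interface might be configured via /etc/network/interfaces like this (shown for the first node; the second would use 10.0.0.2 – these addresses reappear in the DRBD resource file further down):

# /etc/network/interfaces fragment for the cross-over link
auto eth2
iface eth2 inet static
    address 10.0.0.1
    netmask 255.255.255.0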
192.168.0.6 is to be used as a ‘shared’ IP address which could be in service on either server node (but not both).
Debian Wheezy (7).
Additional packages : drbd8-utils (the DRBD userland tools) and drbdlinks.
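Assuming the stock Debian package names, both can be installed straight from the archive:

apt-get install drbd8-utils drbdlinks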
We want : two software RAID1 (md) arrays on each server – a smaller one (md0) for the operating system, and a larger one (md1) to act as the backing device for the DRBD replicated volume.
Some of this partitioning / filesystem configuration can be done within the Debian Installer at installation time.
After installation, check the contents of /proc/mdstat on both nodes – the number of blocks (i.e. the volume size) for what will be the DRBD replicated volume needs to be IDENTICAL on each server (else DRBD will moan – “The peer’s disk size is too small!”).
Personalities : [raid1]
md1 : active raid1 sda2[0] sdb2[1]
      878974784 blocks super 1.2 [2/2] [UU]
      bitmap: 1/7 pages [4KB], 65536KB chunk

md0 : active raid1 sda1[0] sdb1[1]
      97589248 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
If the device (i.e. md1) you wish to mirror over DRBD is not the same size on both servers, you’ll need to resize the larger one to match the smaller – using something like :

# remove the write-intent bitmap (required before shrinking)
mdadm --grow /dev/md1 --bitmap none
# shrink the array to the same number of blocks as the other node
mdadm --grow /dev/md1 --size=878974784
On each node, create /etc/drbd.d/srv.res which should look something like the below (the names after ‘on’ must match each server’s hostname, i.e. the output of uname -n) :

resource srv {
    on node01 {
        device    /dev/drbd1;
        disk      /dev/md1;
        address   10.0.0.1:7789;
        meta-disk internal;
    }
    on node02 {
        device    /dev/drbd1;
        disk      /dev/md1;
        address   10.0.0.2:7789;
        meta-disk internal;
    }
}
It’s probably a good idea to increase the default synchronisation rate from 1M/s to something more realistic like 80M/s – so, edit /etc/drbd.d/global_common.conf and add ‘rate 80M’ into the syncer { … } section, so it looks a bit like :
..... other stuff ....
syncer {
    rate 80M;
}
.... other stuff ...
startup {
    wfc-timeout       10;
    degr-wfc-timeout  10;
    become-primary-on node01;
}
.... other stuff ....
(The ‘startup’ items above control DRBD’s behaviour on system bootup – to stop it hanging and waiting for input and also to make it favour/fall back to ‘node01’).
Next, initialise DRBD on the primary node (node01) using something like :
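(These are the standard drbdadm invocations for DRBD 8.3; ‘srv’ is the resource name from above.)

# create the DRBD metadata and bring the resource up
drbdadm create-md srv
drbdadm up srv
# make this node primary, overwriting whatever is on the peer
drbdadm -- --overwrite-data-of-peer primary srv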
And on the secondary node, just do :
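# create the metadata and bring the resource up – but don't promote it
drbdadm create-md srv
drbdadm up srv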
You should now see the underlying volume syncing –
root@node01:/# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:39998396 nr:0 dw:0 dr:40008188 al:0 bm:2440 lo:1 pe:182 ua:128 ap:0 ep:1 wo:f oos:838961108
    [>....................] sync'ed:  4.6% (819296/858344)M
    finish: 2:29:09 speed: 93,728 (22,104) K/sec
So, in our case, it’s going to take about 2.5 hours to mirror an ~800GB volume (~838,961,108 KB out of sync at ~93,728 KB/s is roughly 9,000 seconds – which matches the ‘finish: 2:29:09’ estimate above).
While DRBD is syncing, the device is usable on the primary node (albeit perhaps a little slower than normal).
So – we can now put a filesystem on top of /dev/drbd1 and mount it on the primary node (the drbdlinks configuration later on assumes /srv as the mountpoint; ext4 is just my choice here) :
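mkfs.ext4 /dev/drbd1
mkdir -p /srv
mount /dev/drbd1 /srv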
A common problem with active-standby clusters is that software updates on the standby node become more difficult – the services aren’t actually running there, and their configuration files need to be kept identical across both nodes.
Because of the configuration issue, it’s common to store the configuration files on the shared/replicated volume. However, this then breaks any system init scripts on the standby node (as they won’t be able to access said configuration files).
One solution to this is the ‘drbdlinks’ program.
Once installed, edit /etc/drbdlinks.conf – mine looks a bit like the following :
.... some stuff ...
mountpoint('/srv')
link('/etc/apache2', '/srv/etc-apache2')
link('/var/www', '/srv/var-www')
Now, we can use /usr/sbin/drbdlinks start to get :
root@node01:/srv# ls -ald /etc/apache2*
lrwxrwxrwx 1 root root   16 Dec 30 14:36 /etc/apache2 -> /srv/etc-apache2
drwxr-xr-x 7 root root 4096 Dec 30 12:48 /etc/apache2.drbdlinks
(In English: it detects that the DRBD volume is mounted (on /srv), and if so it “fixes up” the configuration files to use those stored on the shared volume through the use of symlinks. Hence /etc/apache2 now points to /srv/etc-apache2.)
If drbdlinks is stopped (/usr/sbin/drbdlinks stop), then the symlinks are removed, and the affected directories renamed back to their original names – i.e. /etc/apache2.drbdlinks becomes /etc/apache2. This now allows ‘apache2’ to be upgraded through apt-get/dpkg as normal on the standby node without any awkward dependencies coming into play.
In my case, I’ve created the following script (/srv/service.sh) to live on the shared/mirrored volume, and modified /etc/rc.local on the primary (node01) server so it tries to mount the DRBD filesystem and execute the service script :
#!/bin/bash
# Start/stop necessary services.
# Note, service startup order normally matters.
# This could be called from /etc/rc.local on the primary node.

if [ "$#" -eq 0 ]; then
    echo "Usage: $0 start|stop" >&2
    exit 1
fi

set -e
set -x

if [ "$1" = 'start' ]; then
    /usr/sbin/drbdlinks start
    # bring up our shared IP address (this will be on the active cluster node)
    ifconfig eth0:1 192.168.0.6 netmask 255.255.255.0 up
    for thing in postgresql apache2; do
        [ -x /etc/init.d/$thing ] && /etc/init.d/$thing start
    done
fi

if [ "$1" = 'stop' ]; then
    # stop in the reverse order to startup
    for thing in apache2 postgresql; do
        [ -x /etc/init.d/$thing ] && /etc/init.d/$thing stop
    done
    ifconfig eth0:1 down || true
    /usr/sbin/drbdlinks stop
fi
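For completeness, the rc.local addition might look a bit like this (a sketch using the resource name, device and mountpoint from earlier; it goes before rc.local’s final ‘exit 0’) :

# try to become DRBD primary, mount the volume and start services;
# if any step fails (e.g. the other node is primary), just skip the rest
drbdadm primary srv && \
    mount /dev/drbd1 /srv && \
    /srv/service.sh start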
The operating systems need to be kept in sync (packages, package versions etc) to ensure that if a failover does take place then the standby node is capable of running all services. Using something like Ansible may be a good idea at this point.
Having a manual failover process tends to simplify things (split brain is no longer an issue, hardware fencing/STONITH aren’t required etc) at the cost of failover not being automatic/seamless.
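For the record, with the pieces above a manual failover boils down to something like the following (first block on the currently active node, second on the standby) :

# on the active node (node01):
/srv/service.sh stop
umount /srv
drbdadm secondary srv

# on the standby node (node02):
drbdadm primary srv
mount /dev/drbd1 /srv
/srv/service.sh start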