Pacemaker and Corosync HA

In this guide we will set up an HA failover solution using Corosync and Pacemaker in an Active/Passive configuration.

Installation and Setup


  • Hosts file entries or DNS resolution for all nodes
  • NTP must be installed and configured on all nodes

[code]cat /etc/hosts
10.0.1 10 ha1 server01 ha2 server02[/code]

Install Pacemaker. It should pull in Corosync as a dependency; if it does not, install Corosync as well.

[code]apt-get install pacemaker[/code]

Edit corosync.conf. The bind address (bindnetaddr) must be the network address of the subnet, NOT the IP address of the host. The default mcastaddr is fine.

[code]cat /etc/corosync/corosync.conf
interface {
    # The following values need to be set based on your environment
    ringnumber: 0
    mcastport: 5405
}[/code]
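
As a rough illustration of what "network address" means for the bind address, the sketch below derives it from a host IP and a prefix length. The address 10.0.1.10 and the /24 prefix are assumptions made for the example, and network_address is a hypothetical helper, not something shipped with Corosync.

```shell
# Hypothetical helper: print the network address for a host IP and prefix length.
# Usage: network_address <ipv4-address> <prefix-length>
network_address() (
    ip=$1
    prefix=$2
    # Split the dotted quad into its four octets ($1..$4).
    IFS=.
    set -- $ip
    # Pack the octets into one 32-bit integer.
    ip_int=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    # Build the netmask from the prefix length and apply it.
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( ip_int & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
)

# Assumed example: a node at 10.0.1.10 on a /24 network.
network_address 10.0.1.10 24   # prints 10.0.1.0
```

Under these assumptions the value to use as the bind address would be 10.0.1.0, not 10.0.1.10.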

We also want Corosync to start Pacemaker automatically; otherwise Pacemaker has to be started manually.
Setting ver: 0 tells Corosync to start Pacemaker itself. Setting it to 1 means Pacemaker must be started manually!

[code]cat /etc/corosync/corosync.conf
service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 0
    name: pacemaker
}[/code]

Copy and paste the content of corosync.conf to the second node, or scp the file over.
[code]scp /etc/corosync/corosync.conf server02:/etc/corosync/[/code]

Make sure Corosync starts at boot time.
[code]cat /etc/default/corosync
# start corosync at boot [yes|no]
START=yes[/code]

Start Corosync:
[code]/etc/init.d/corosync start[/code]

Check the status of the cluster:

[code]crm status
Last updated: Fri Jun 9 11:02:55 2017 Last change: Wed Jun 7 14:26:06 2017 by root via cibadmin on server01
Stack: corosync
Current DC: server01 (version 1.1.14-70404b0) - partition with quorum
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ server01 ][/code]

If you have not already done so, copy the config file to the second node:
[code]scp /etc/corosync/corosync.conf server02:/etc/corosync/[/code]

Now, on the second node, start Corosync:
[code]/etc/init.d/corosync start[/code]

Check the status again. The second node should now have joined the cluster. If it has not, check the firewall settings and the hosts file; each node must be able to resolve the other's name.
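
The two checks above can be sketched as small shell helpers. Both function names are hypothetical, and the sketch assumes getent is available and that the status output contains an "Online: [ ... ]" line like the one shown earlier.

```shell
# Hypothetical helper: report whether each given node name resolves
# (through /etc/hosts or DNS, whichever the system is configured to use).
check_nodes_resolve() {
    rc=0
    for node in "$@"; do
        if getent hosts "$node" >/dev/null 2>&1; then
            echo "$node: resolves"
        else
            echo "$node: does NOT resolve"
            rc=1
        fi
    done
    return "$rc"
}

# Hypothetical helper: succeed if the node named in $1 is listed on the
# "Online:" line of cluster status text read from stdin.
node_is_online() {
    grep '^Online:' | tr -d '[]' | tr ' ' '\n' | grep -qxF "$1"
}

# Usage on a cluster node (not executed here):
#   check_nodes_resolve server01 server02
#   crm status | node_is_online server02 && echo "server02 has joined"
```

Run the resolution check on both nodes, since a missing hosts entry on either side is enough to keep the second node from joining.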

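At this point the cluster reports some warnings: STONITH is enabled by default but no fencing devices are configured, and a two-node cluster loses quorum as soon as one node goes down. For this simple setup we disable STONITH and ignore loss of quorum, then verify the configuration: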

[code]crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
crm_verify -L[/code]

Now add a virtual IP to the cluster.

[code]crm configure primitive VIP ocf:heartbeat:IPaddr2 params ip= nic=eth0 op monitor interval=10s[/code]

We have now added a virtual (floating) IP. Test it with a simple ping; it should respond when pinged from either node.

Adding Resources: Services

Now we are ready to add a service to our cluster. In this example we use a Postfix service (SMTP) that we want to fail over. Postfix must be installed on both nodes.

[code]crm configure primitive HA-postfix lsb:postfix op monitor interval=15s[/code]

Check the status.
[code]crm status[/code]

Because we have not yet linked the IP to the service, Postfix could be running on server02 while the IP is on server01. To keep them together, we put both in one HA group.

[code]crm configure group HA-Group VIP HA-postfix[/code]

If we check the status again, we can see that the two resources are now running on the same server.

[code]Online: [ server01 server02 ]

 Resource Group: HA-Group
     VIP (ocf::heartbeat:IPaddr2): Started server01
     HA-postfix (lsb:postfix): Started server01[/code]

Looks good!

If a resource fails, for example because Postfix crashes and cannot be restarted, we want it to migrate to the other server.
By default, migration-threshold is undefined (effectively infinity), so the resource is never moved away because of its fail count.

With the settings below, the resource migrates to the other node after 3 failures, and the recorded failures expire after 60 seconds, which allows the resource to run on this node again.

[code]primitive HA-postfix lsb:postfix \
    op monitor interval="15s" \
    meta target-role="Started" migration-threshold="3" failure-timeout="60s"[/code]
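
One possible way to apply and inspect these settings from the command line is sketched below, assuming the crm shell (crmsh) is installed. The three function names are hypothetical wrappers; HA-postfix and server01 are the names used in this guide.

```shell
# Hypothetical wrappers around crmsh commands. Nothing runs until a
# function is called on a cluster node.

# Set the failover policy: $1 = resource, $2 = threshold, $3 = timeout
set_failover_policy() {
    crm resource meta "$1" set migration-threshold "$2"
    crm resource meta "$1" set failure-timeout "$3"
}

# Show the current fail count: $1 = resource, $2 = node
show_failcount() {
    crm resource failcount "$1" show "$2"
}

# Clear recorded failures without waiting for the timeout: $1 = resource
clear_failures() {
    crm resource cleanup "$1"
}

# Usage on a cluster node (not executed here):
#   set_failover_policy HA-postfix 3 60s
#   show_failcount HA-postfix server01
#   clear_failures HA-postfix
```

Editing the resource with crm configure edit should give the same result; the wrappers only make each step explicit.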

Now we are DONE!

Some extra commands that might be useful when managing the cluster:

Deleting a resource
[code]crm resource stop HA-XXXX
crm configure delete HA-XXXX[/code]
Here HA-XXXX is the name of the resource.

Migrate / Move Resource
[code]crm_resource --resource HA-Group --move --node server02[/code]

View configuration
[code]crm configure show[/code]

View status and fail counts
[code]crm_mon -1 --failcounts[/code]

