= Convert Cluster to Active/Active =

The primary requirement for an Active/Active cluster is that the data
required for your services is available, simultaneously, on both
machines. Pacemaker makes no requirement on how this is achieved; you
could use a SAN if you had one available, but since DRBD supports
multiple Primaries, we can continue to use it here.

== Install Cluster Filesystem Software ==

The only hitch is that we need to use a cluster-aware filesystem. The
one we used earlier with DRBD, xfs, is not one of those. Both OCFS2
and GFS2 are supported; here, we will use GFS2.

On both nodes, install the GFS2 command-line utilities and the
Distributed Lock Manager (DLM) required by cluster filesystems:
----
# yum install -y gfs2-utils dlm
----

== Configure the Cluster for the DLM ==

The DLM needs to run on both nodes, so we'll start by creating a resource for
it (using the *ocf:pacemaker:controld* resource script), and clone it:
----
[root@pcmk-1 ~]# pcs cluster cib dlm_cfg
[root@pcmk-1 ~]# pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=60s
[root@pcmk-1 ~]# pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1
[root@pcmk-1 ~]# pcs -f dlm_cfg resource show
 ClusterIP	(ocf::heartbeat:IPaddr2):	Started 
 WebSite	(ocf::heartbeat:apache):	Started 
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ pcmk-2 ]
     Slaves: [ pcmk-1 ]
 WebFS	(ocf::heartbeat:Filesystem):	Started 
 Clone Set: dlm-clone [dlm]
     Stopped: [ pcmk-1 pcmk-2 ]
----

Activate our new configuration, and see how the cluster responds:
----
[root@pcmk-1 ~]# pcs cluster cib-push dlm_cfg
CIB updated
[root@pcmk-1 ~]# pcs status
Cluster name: mycluster
Last updated: Fri Aug 14 11:19:36 2015
Last change: Fri Aug 14 11:19:28 2015
Stack: corosync
Current DC: pcmk-1 (1) - partition with quorum
Version: 1.1.12-a14efad
2 Nodes configured
8 Resources configured


Online: [ pcmk-1 pcmk-2 ]

Full list of resources:

 ClusterIP	(ocf::heartbeat:IPaddr2):	Started pcmk-2 
 WebSite	(ocf::heartbeat:apache):	Started pcmk-2 
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ pcmk-2 ]
     Slaves: [ pcmk-1 ]
 WebFS	(ocf::heartbeat:Filesystem):	Started pcmk-2 
 ipmi-fencing   (stonith:fence_ipmilan):        Started pcmk-1 
 Clone Set: dlm-clone [dlm]
     Started: [ pcmk-1 pcmk-2 ]

PCSD Status:
  pcmk-1: Online
  pcmk-2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
----

[[GFS2_prep]]
== Create and Populate GFS2 Filesystem ==

Before we do anything to the existing partition, we need to make sure it
is unmounted. We do this by telling the cluster to stop the WebFS resource.
This will ensure that other resources (in our case, Apache) using WebFS
are not only stopped, but stopped in the correct order.

----
[root@pcmk-1 ~]# pcs resource disable WebFS
[root@pcmk-1 ~]# pcs resource
 ClusterIP	(ocf::heartbeat:IPaddr2):	Started 
 WebSite	(ocf::heartbeat:apache):	Stopped 
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ pcmk-2 ]
     Slaves: [ pcmk-1 ]
 WebFS	(ocf::heartbeat:Filesystem):	Stopped 
 Clone Set: dlm-clone [dlm]
     Started: [ pcmk-1 pcmk-2 ]
----

You can see that both Apache and WebFS have been stopped,
and that *pcmk-2* is the current master for the DRBD device.

Now we can create a new GFS2 filesystem on the DRBD device.

[WARNING]
=========
This will erase all previous content stored on the DRBD device. Ensure
you have a copy of any important data.
=========

[IMPORTANT]
===========
Run the next command on whichever node has the DRBD Primary role.
Otherwise, you will receive the message:
-----
/dev/drbd1: Read-only file system
-----
===========

-----
[root@pcmk-2 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:web /dev/drbd1
It appears to contain an existing filesystem (xfs)
This will destroy any data on /dev/drbd1
Are you sure you want to proceed? [y/n]y
Device:                    /dev/drbd1
Block size:                4096
Device size:               1.00 GB (262127 blocks)
Filesystem size:           1.00 GB (262126 blocks)
Journals:                  2
Resource groups:           5
Locking protocol:          "lock_dlm"
Lock table:                "mycluster:web"
UUID:                      9a72c488-d8a7-24c9-ceee-add7a8ca52c2
-----

The `mkfs.gfs2` command required a number of additional parameters:

* `-p lock_dlm` specifies that we want to use the
kernel's DLM.

* `-j 2` indicates that the filesystem should reserve enough
space for two journals (one for each node that will access the filesystem).

* `-t mycluster:web` specifies the lock table name. The format for
this field is +pass:[<replaceable>clustername:fsname</replaceable>]+. For
+pass:[<replaceable>clustername</replaceable>]+, we need to use the same
value we specified originally with `pcs cluster setup --name` (which is also
the value of *cluster_name* in +/etc/corosync/corosync.conf+).
If you are unsure what your cluster name is, you can look in
+/etc/corosync/corosync.conf+ or execute the command
`pcs cluster corosync pcmk-1 | grep cluster_name`.

Now we can (re-)populate the new filesystem with data
(web pages). We'll create yet another variation on our home page.

-----
[root@pcmk-2 ~]# mount /dev/drbd1 /mnt
[root@pcmk-2 ~]# cat <<-END >/mnt/index.html
<html>
<body>My Test Site - GFS2</body>
</html>
END
[root@pcmk-2 ~]# chcon -R --reference=/var/www/html /mnt
[root@pcmk-2 ~]# umount /dev/drbd1
[root@pcmk-2 ~]# drbdadm verify wwwdata
-----

== Reconfigure the Cluster for GFS2 ==

With the WebFS resource stopped, let's update the configuration.

----
[root@pcmk-1 ~]# pcs resource show WebFS
 Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs
  Meta Attrs: target-role=Stopped
  Operations: start interval=0s timeout=60 (WebFS-start-timeout-60)
              stop interval=0s timeout=60 (WebFS-stop-timeout-60)
              monitor interval=20 timeout=40 (WebFS-monitor-interval-20)
----

The fstype option needs to be updated to *gfs2* instead of *xfs*.

----
[root@pcmk-1 ~]# pcs resource update WebFS fstype=gfs2
[root@pcmk-1 ~]# pcs resource show WebFS
 Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2 
  Meta Attrs: target-role=Stopped 
  Operations: start interval=0s timeout=60 (WebFS-start-timeout-60)
              stop interval=0s timeout=60 (WebFS-stop-timeout-60)
              monitor interval=20 timeout=40 (WebFS-monitor-interval-20)
----

GFS2 requires that DLM be running, so we also need to set up new colocation
and ordering constraints for it:
----
[root@pcmk-1 ~]# pcs constraint colocation add WebFS with dlm-clone INFINITY
[root@pcmk-1 ~]# pcs constraint order dlm-clone then WebFS
Adding dlm-clone WebFS (kind: Mandatory) (Options: first-action=start then-action=start)
----

== Clone the IP address ==

There's no point making the services active on both locations if we can't
reach them both, so let's clone the IP address.

The *IPaddr2* resource agent has built-in intelligence for when it is configured
as a clone. It will utilize a multicast MAC address to have the local switch
send the relevant packets to all nodes in the cluster, together with *iptables
clusterip* rules on the nodes so that any given packet will be grabbed by
exactly one node. This will give us a simple but effective form of
load-balancing requests between our two nodes.

Let's start a new config, and clone our IP:
----
[root@pcmk-1 ~]# pcs cluster cib loadbalance_cfg
[root@pcmk-1 ~]# pcs -f loadbalance_cfg resource clone ClusterIP \
     clone-max=2 clone-node-max=2 globally-unique=true
----

* `clone-max=2` tells the resource agent to split packets this many ways. This
should equal the number of nodes that can host the IP.
* `clone-node-max=2` says that one node can run up to 2 instances
of the clone. This should also equal the number of nodes that can
host the IP, so that if any node goes down, another node can take over
the failed node's "request bucket". Otherwise, requests intended for
the failed node would be discarded.
* `globally-unique=true` tells the cluster that one clone isn't identical
to another (each handles a different "bucket"). This also tells the resource
agent to insert *iptables* rules so each host only processes packets in its
bucket(s).

Notice that when the ClusterIP becomes a clone, the constraints
referencing ClusterIP now reference the clone.  This is
done automatically by pcs.
----
[root@pcmk-1 ~]# pcs -f loadbalance_cfg constraint
Location Constraints:
Ordering Constraints:
  start ClusterIP-clone then start WebSite (kind:Mandatory)
  promote WebDataClone then start WebFS (kind:Mandatory)
  start WebFS then start WebSite (kind:Mandatory)
  start dlm-clone then start WebFS (kind:Mandatory)
Colocation Constraints:
  WebSite with ClusterIP-clone (score:INFINITY)
  WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master)
  WebSite with WebFS (score:INFINITY)
  WebFS with dlm-clone (score:INFINITY)
----

Now we must tell the resource how to decide which requests are
processed by which hosts. To do this, we specify the *clusterip_hash* parameter.
The value of *sourceip* means that the source IP address of incoming packets
will be hashed; each node will process a certain range of hashes.

----
[root@pcmk-1 ~]# pcs -f loadbalance_cfg resource update ClusterIP clusterip_hash=sourceip
----

Load our configuration to the cluster, and see how it responds.
-----
[root@pcmk-1 ~]# pcs cluster cib-push loadbalance_cfg
CIB updated
[root@pcmk-1 ~]# pcs status
Cluster name: mycluster
Last updated: Fri Aug 14 11:32:07 2015
Last change: Fri Aug 14 11:32:04 2015
Stack: corosync
Current DC: pcmk-1 (1) - partition with quorum
Version: 1.1.12-a14efad
2 Nodes configured
9 Resources configured


Online: [ pcmk-1 pcmk-2 ]

Full list of resources:

 WebSite	(ocf::heartbeat:apache):	Stopped 
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ pcmk-1 ]
     Slaves: [ pcmk-2 ]
 WebFS	(ocf::heartbeat:Filesystem):	Stopped 
 ipmi-fencing   (stonith:fence_ipmilan):        Started pcmk-1 
 Clone Set: dlm-clone [dlm]
     Started: [ pcmk-1 pcmk-2 ]
 Clone Set: ClusterIP-clone [ClusterIP] (unique)
     ClusterIP:0	(ocf::heartbeat:IPaddr2):	Started pcmk-1 
     ClusterIP:1	(ocf::heartbeat:IPaddr2):	Started pcmk-2 

PCSD Status:
  pcmk-1: Online
  pcmk-2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
-----

If desired, you can demonstrate that all request buckets are working
by using a tool such as `arping` from several source hosts
to see which host responds to each.

== Clone the Filesystem and Apache Resources ==

Now that we have a cluster filesystem ready to go,
and our nodes can load-balance requests to a shared IP address,
we can configure the cluster so both nodes mount the filesystem
and respond to web requests.

Clone the filesystem and Apache resources in a new configuration.
Notice how pcs automatically updates the relevant constraints again.
----
[root@pcmk-1 ~]# pcs cluster cib active_cfg
[root@pcmk-1 ~]# pcs -f active_cfg resource clone WebFS
[root@pcmk-1 ~]# pcs -f active_cfg resource clone WebSite
[root@pcmk-1 ~]# pcs -f active_cfg constraint
Location Constraints:
Ordering Constraints:
  start ClusterIP-clone then start WebSite-clone (kind:Mandatory)
  promote WebDataClone then start WebFS-clone (kind:Mandatory)
  start WebFS-clone then start WebSite-clone (kind:Mandatory)
  start dlm-clone then start WebFS-clone (kind:Mandatory)
Colocation Constraints:
  WebSite-clone with ClusterIP-clone (score:INFINITY)
  WebFS-clone with WebDataClone (score:INFINITY) (with-rsc-role:Master)
  WebSite-clone with WebFS-clone (score:INFINITY)
  WebFS-clone with dlm-clone (score:INFINITY)
----

Tell the cluster that it is now allowed to promote both instances to be DRBD
Primary (aka. master).

-----
[root@pcmk-1 ~]# pcs -f active_cfg resource update WebDataClone master-max=2 
-----

Finally, load our configuration to the cluster, and re-enable the WebFS resource
(which we disabled earlier).
-----
[root@pcmk-1 ~]# pcs cluster cib-push active_cfg
CIB updated
[root@pcmk-1 ~]# pcs resource enable WebFS
-----

After all the processes are started, the status should look similar to this.
-----
[root@pcmk-1 ~]# pcs resource
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ pcmk-1 pcmk-2 ]
 Clone Set: dlm-clone [dlm]
     Started: [ pcmk-1 pcmk-2 ]
 Clone Set: ClusterIP-clone [ClusterIP] (unique)
     ClusterIP:0	(ocf::heartbeat:IPaddr2):	Started 
     ClusterIP:1	(ocf::heartbeat:IPaddr2):	Started 
 Clone Set: WebFS-clone [WebFS]
     Started: [ pcmk-1 pcmk-2 ]
 Clone Set: WebSite-clone [WebSite]
     Started: [ pcmk-1 pcmk-2 ]
-----

== Test Failover ==

Testing failover is left as an exercise for the reader.
For example, you can put one node into standby mode,
use `pcs status` to confirm that its ClusterIP clone was
moved to the other node, and use `arping` to verify that
packets are not being lost from any source host.

[NOTE]
====
You may find that when a failed node rejoins the cluster,
both ClusterIP clones stay on one node, due to the
resource stickiness. While this works fine, it effectively eliminates
load-balancing and returns the cluster to an active-passive setup again.
You can avoid this by disabling stickiness for the IP address resource:
----
[root@pcmk-1 ~]# pcs resource meta ClusterIP resource-stickiness=0
----
====
