1. Environment Preparation
Select two servers with identical hardware configurations as load machines, and install the same OS, Red Hat Enterprise Linux 6.5, on both nodes.
- Disable the firewall on each node with the following commands:
# chkconfig iptables off
# chkconfig ip6tables off
After restarting, you can verify that the firewall is disabled with:
# chkconfig --list iptables
iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off
# chkconfig --list ip6tables
ip6tables       0:off   1:off   2:off   3:off   4:off   5:off   6:off
- Disable SELinux on each node by modifying the /etc/sysconfig/selinux file:
# vi /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
After restarting, confirm SELinux is disabled with:
# sestatus
SELinux status:                 disabled
2. Allocate Physical Partitions
Allocate a physical partition on each load machine, ensuring both partitions have the same capacity (e.g., /dev/sdb1).
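If the partition does not yet exist, it can be created with fdisk; the following is a minimal sketch, assuming the spare disk is /dev/sdb on both machines (the disk name and size are illustrative, and the size must match on both nodes):
# Inside fdisk, create one primary partition: n, p, 1, identical size on both nodes, then w to write
[root@OCSJZ13 ~]# fdisk /dev/sdb
[root@OCSJZ13 ~]# partprobe /dev/sdb
# Verify the new partition
[root@OCSJZ13 ~]# fdisk -l /dev/sdb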
3. Configure Hostnames
Configure the hostnames for both machines (IPs 132.121.12.173 and 132.121.12.174) in /etc/hosts on each node:
# On 132.121.12.173
[root@132.121.12.173 ~]# vim /etc/hosts
132.121.12.173 OCSJZ13
132.121.12.174 OCSJZ14
# On 132.121.12.174
[root@132.121.12.174 ~]# vim /etc/hosts
132.121.12.173 OCSJZ13
132.121.12.174 OCSJZ14
4. Set Up SSH Trust
Establish SSH trust between the root users of the two load machines:
# On OCSJZ13
[root@OCSJZ13 ~]# ssh-keygen -t rsa (press Enter for all prompts)
[root@OCSJZ13 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@OCSJZ14 (enter the root password of OCSJZ14)
# On OCSJZ14
[root@OCSJZ14 ~]# ssh-keygen -t rsa (press Enter for all prompts)
[root@OCSJZ14 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@OCSJZ13 (enter the root password of OCSJZ13)
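To confirm the trust is in place, each machine should be able to run a remote command on the other without a password prompt, for example:
[root@OCSJZ13 ~]# ssh OCSJZ14 date
[root@OCSJZ14 ~]# ssh OCSJZ13 date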
5. Set Up SSH Trust for Gbase User
Establish SSH trust from the gbase user on the load machines to all cluster nodes. This must be done on both load machines, but only one-way: the load machines must be able to log in to every cluster node without a password; the reverse direction is not required.
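The procedure mirrors step 4; the following is a minimal sketch, where node1 is a stand-in for each cluster node's hostname (the real node list depends on the cluster):
[gbase@OCSJZ13 ~]$ ssh-keygen -t rsa (press Enter for all prompts)
# node1 is an illustrative cluster node; repeat for every node in the cluster
[gbase@OCSJZ13 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub gbase@node1
Repeat the same steps as the gbase user on OCSJZ14.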
6. Prepare Installation Packages
On both load machines:
- Extract pkgs.tar.bz2 and install all RPM packages.
- Extract toolkit.tar.bz2, copy all programs from the sbin directory to /usr/sbin, and copy all files from the service directory to /etc/init.d (see the sketch below).
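A minimal sketch of these steps, assuming both archives sit in /root and unpack into top-level sbin and service directories (the extraction paths are illustrative):
[root@OCSJZ13 ~]# mkdir -p /root/pkgs /root/toolkit
[root@OCSJZ13 ~]# tar xjf pkgs.tar.bz2 -C /root/pkgs
[root@OCSJZ13 ~]# rpm -ivh /root/pkgs/*.rpm
[root@OCSJZ13 ~]# tar xjf toolkit.tar.bz2 -C /root/toolkit
[root@OCSJZ13 ~]# cp /root/toolkit/sbin/* /usr/sbin/
[root@OCSJZ13 ~]# cp /root/toolkit/service/* /etc/init.d/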
7. Configure DRBD
Configure DRBD on both load machines with the same configuration file, /etc/drbd.d/global_common.conf:
[root@OCSJZ13 ~]# vim /etc/drbd.d/global_common.conf
global {
usage-count yes;
}
common {
handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
}
net {
protocol C;
}
syncer {
rate 100M;
}
}
Create the resource file /etc/drbd.d/dispdrbd.res on both load machines:
[root@OCSJZ13 ~]# vim /etc/drbd.d/dispdrbd.res
resource dispdrbd {
on OCSJZ13 {
device /dev/drbd0;
disk /dev/sdb1;
address 192.168.0.173:7789;
meta-disk internal;
}
on OCSJZ14 {
device /dev/drbd0;
disk /dev/sdb1;
address 192.168.0.174:7789;
meta-disk internal;
}
}
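Before creating the metadata, the resource definition can be sanity-checked on each node; drbdadm dump parses the configuration and prints it back, so syntax errors surface immediately:
[root@OCSJZ13 ~]# drbdadm dump dispdrbd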
Create the DRBD resources on both load machines:
[root@OCSJZ13 ~]# dd if=/dev/zero bs=1M count=1 of=/dev/sdb1
[root@OCSJZ13 ~]# drbdadm create-md dispdrbd
# Start DRBD service on both machines
[root@OCSJZ13 ~]# service drbd start
# If synchronization does not start, run the following on one of the two machines (its local data will be resynchronized from the peer)
[root@OCSJZ14 ~]# drbdadm invalidate dispdrbd
After synchronization completes (when cat /proc/drbd shows ds:UpToDate/UpToDate), set the primary node (assuming 132.121.12.173 is the primary):
[root@OCSJZ13 ~]# drbdsetup /dev/drbd0 primary
# If setting primary fails
[root@OCSJZ13 ~]# drbdadm -- --overwrite-data-of-peer primary all
# Format the shared device
[root@OCSJZ13 ~]# mkfs.ext4 /dev/drbd0
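As an optional sanity check (not required), the new filesystem can be mounted once by hand on the primary and unmounted again, so that Pacemaker can manage the mount later:
[root@OCSJZ13 ~]# mount /dev/drbd0 /mnt
[root@OCSJZ13 ~]# df -h /mnt
[root@OCSJZ13 ~]# umount /mnt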
8. Configure NFS
On both load machines:
[root@OCSJZ13 ~]# mkdir /nfsshare
[root@OCSJZ13 ~]# vim /etc/exports
/nfsshare *(rw,sync,no_root_squash)
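Because Pacemaker will start rpcbind and nfs itself (see step 11), it is usually advisable to remove them from the normal boot sequence on both machines, assuming the stock RHEL 6 init scripts:
[root@OCSJZ13 ~]# chkconfig nfs off
[root@OCSJZ13 ~]# chkconfig rpcbind off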
9. Configure Corosync
On the primary node:
[root@OCSJZ13 ~]# corosync-keygen
[root@OCSJZ13 ~]# chmod 0400 /etc/corosync/authkey
[root@OCSJZ13 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@OCSJZ13 ~]# vim /etc/corosync/corosync.conf
compatibility: whitetank
totem {
version: 2
secauth: off
threads: 0
interface {
member {
memberaddr: 132.121.12.173
}
member {
memberaddr: 132.121.12.174
}
ringnumber: 0
bindnetaddr: 132.121.12.1
mcastport: 5422
ttl: 1
}
transport: udpu
}
logging {
fileline: off
to_stderr: no
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
}
service {
ver: 0
name: pacemaker
use_mgmtd: yes
}
Copy the configuration to the secondary node:
[root@OCSJZ13 ~]# scp /etc/corosync/authkey /etc/corosync/corosync.conf root@OCSJZ14:/etc/corosync
# On the secondary node
[root@OCSJZ14 ~]# chmod 0400 /etc/corosync/authkey
[root@OCSJZ14 ~]# vim /etc/corosync/corosync.conf (verify that the copied settings are correct)
10. Start Services
Start the corosync service on both load machines:
[root@OCSJZ13 ~]# service corosync start
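To confirm that corosync came up cleanly on both nodes, check the service status and scan the cluster log for TOTEM membership messages (an illustrative check):
[root@OCSJZ13 ~]# service corosync status
[root@OCSJZ13 ~]# grep -i totem /var/log/cluster/corosync.log | tail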
11. Configure Pacemaker
On one of the load machines, configure CRM:
[root@OCSJZ13 ~]# crm configure
Disable STONITH:
crm(live)configure# property stonith-enabled=false
Set the cluster to keep running resources when quorum is lost, which is required for a two-node cluster (a single surviving node can never have quorum):
crm(live)configure# property no-quorum-policy=ignore
Specify the default stickiness value for resources:
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# commit
Configure DRBD (primary/secondary mode):
crm(live)configure# primitive dispdrbd ocf:linbit:drbd params drbd_resource=dispdrbd op monitor role=Master interval=50s timeout=30s op monitor role=Slave interval=60s timeout=30s
crm(live)configure# master ms_dispdrbd dispdrbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
Configure the mount point:
crm(live)configure# primitive webfs ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/nfsshare" fstype="ext4"
crm(live)configure# colocation webfs_on_ms_dispdrbd inf: webfs ms_dispdrbd:Master
crm(live)configure# order webfs_after_ms_dispdrbd inf: ms_dispdrbd:promote webfs:start
Configure the virtual IP (this IP is used to provide external services and must not conflict with other IPs in the system; the user-provided IP for this setup is 132.121.12.175):
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip="132.121.12.175" cidr_netmask="24" op monitor interval="5s"
crm(live)configure# colocation vip_on_ms_dispdrbd inf: vip ms_dispdrbd:Master
Configure NFS:
crm(live)configure# primitive rpcbind lsb:rpcbind op monitor interval="10s"
crm(live)configure# colocation rpcbind_on_ms_dispdrbd inf: rpcbind ms_dispdrbd:Master
crm(live)configure# primitive nfsshare lsb:nfs op monitor interval="30s"
crm(live)configure# colocation nfsshare_on_ms_dispdrbd inf: nfsshare ms_dispdrbd:Master
crm(live)configure# order nfsshare_after_rpcbind mandatory: rpcbind nfsshare:start
crm(live)configure# order nfsshare_after_vip mandatory: vip nfsshare:start
crm(live)configure# order nfsshare_after_webfs mandatory: webfs nfsshare:start
Check the status of the Pacemaker service using the crm status command:
[root@OCSJZ13 ~]# crm status
============
Last updated: Thu Feb 7 21:20:35 2013
Last change: Thu Feb 7 20:36:54 2013 via cibadmin on OCSJZ13
Stack: openais
Current DC: OCSJZ13 - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
5 Resources configured.
============
Online: [ OCSJZ13 OCSJZ14 ]
Master/Slave Set: ms_dispdrbd [dispdrbd]
Masters: [ OCSJZ13 ]
Slaves: [ OCSJZ14 ]
webfs (ocf::heartbeat:Filesystem): Started OCSJZ13
vip (ocf::heartbeat:IPaddr): Started OCSJZ13
nfsshare (lsb:nfs): Started OCSJZ13
12. Configure Related Configuration Files (All Configuration Files Must Be Consistent on Both Load Machines)
Configure gciplist.conf:
[root@OCSJZ13 ~]# gciplist HOSTNAME >/etc/gciplist.conf
(Note: HOSTNAME is the IP address of any node in the cluster. This file must be regenerated each time the cluster topology changes, i.e., whenever nodes are added or removed.)
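For example (132.121.12.176 below is an illustrative cluster-node IP):
# 132.121.12.176 stands in for the IP of any cluster node
[root@OCSJZ13 ~]# gciplist 132.121.12.176 >/etc/gciplist.conf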
Configure dispmon.conf:
The load-monitoring service can run in one of two modes, background service (daemon) mode or cron job mode, and the two modes are configured differently.
1) Background Service Mode
[root@OCSJZ13 ~]# vim /etc/dispmon.conf
mode=daemon
exec=fetch_load.sh
(Note: fetch_load.sh is the filename of the user's load-control executable, which must be deployed in the /usr/sbin directory.)
2) Cron Job Mode
[root@OCSJZ13 ~]# vim /etc/dispmon.conf
mode=crontab
Configure dispcron.conf (only needed for cron job mode):
This file must be placed in the /etc directory and uses the same format as a crontab file.
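A sketch of a possible entry (the schedule is illustrative; fetch_load.sh is the user's load-control script mentioned above):
[root@OCSJZ13 ~]# vim /etc/dispcron.conf
# Illustrative: run the load-control script every 5 minutes
*/5 * * * * /usr/sbin/fetch_load.sh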
Note: The dispmon.conf and dispcron.conf configuration files must be edited while the dispmon service is stopped on both load machines.
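For example, using the init scripts deployed in step 6:
[root@OCSJZ13 ~]# service dispmon stop
[root@OCSJZ14 ~]# service dispmon stop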
13. Complete the Configuration of Related Services
[root@OCSJZ13 ~]# crm configure
Configure the dispserver service:
crm(live)configure# primitive dispserver lsb:dispsvr op monitor interval="10s"
crm(live)configure# colocation dispsvr_on_ms_dispdrbd inf: dispserver ms_dispdrbd:Master
crm(live)configure# order dispsvr_after_nfsshare mandatory: nfsshare dispserver:start
Configure the dispmon monitoring service:
crm(live)configure# primitive dispmon lsb:dispmon op monitor interval="10s"
crm(live)configure# colocation dispmon_on_ms_dispdrbd inf: dispmon ms_dispdrbd:Master
crm(live)configure# order dispmon_after_dispserver mandatory: dispserver dispmon:start
crm(live)configure# commit
crm(live)configure# quit