Previous posts in this series:
Part 1: Setting up the servers
RedHat Cluster and GFS2 Setup
- Required reading: https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial Seriously. Don’t do anything until you read his tutorial a couple of times. Might as well read a bunch of his other tutorials, too.
- Unless otherwise noted, run the following commands on each server.
Set the date/time on both servers so they are accurate and within a few minutes of each other.
- Install the ntp program and update to current time.
yum install ntp
ntpdate time.nist.gov
- Set time servers and start ntpd
service ntpd start
- Edit the /etc/ntp.conf file to use the following servers:
server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
server 3.pool.ntp.org
- Restart ntpd and set it to start on boot
service ntpd restart
chkconfig ntpd on
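- To confirm ntpd is actually syncing, you can list its peers; after a minute or two at least one of the pool servers should show a nonzero "reach" value.
ntpq -p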
Cluster setup
RedHat Cluster must be set up before the GFS2 file systems can be created and mounted.
- Install the necessary programs
yum install openais cman rgmanager lvm2-cluster gfs2-utils ccs
- Create a /etc/cluster/cluster.conf file. REMEMBER: Always increment the "config_version" parameter in the cluster tag!
<?xml version="1.0"?>
<cluster config_version="24" name="web-production">
  <cman expected_votes="1" two_node="1"/>
  <fence_daemon clean_start="1" post_fail_delay="6" post_join_delay="3"/>
  <totem rrp_mode="none" secauth="off"/>
  <clusternodes>
    <clusternode name="bill" nodeid="1">
      <fence>
        <method name="ipmi">
          <device action="reboot" name="ipmi_bill"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="ted" nodeid="2">
      <fence>
        <method name="ipmi">
          <device action="reboot" name="ipmi_ted"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="billsp" login="root" name="ipmi_bill" passwd="PASSWORD-HERE"/>
    <fencedevice agent="fence_ipmilan" ipaddr="tedsp" login="root" name="ipmi_ted" passwd="PASSWORD-HERE"/>
  </fencedevices>
  <rm log_level="5">
    <resources>
      <clusterfs device="/dev/mapper/StorageTek2530-sites" fstype="gfs2" mountpoint="/sites" name="sites"/>
      <clusterfs device="/dev/mapper/StorageTek2530-databases" fstype="gfs2" mountpoint="/databases" name="databases"/>
      <clusterfs device="/dev/mapper/StorageTek2530-logs" fstype="gfs2" mountpoint="/logs" name="logs"/>
    </resources>
    <failoverdomains>
      <failoverdomain name="bill-only" nofailback="1" ordered="0" restricted="1">
        <failoverdomainnode name="bill"/>
      </failoverdomain>
      <failoverdomain name="ted-only" nofailback="1" ordered="0" restricted="1">
        <failoverdomainnode name="ted"/>
      </failoverdomain>
    </failoverdomains>
  </rm>
</cluster>
- We’ll be adding more to this later, but this will work for now.
- Validate the config file
ccs_config_validate
- Set a password for the ricci user
passwd ricci
- Start ricci, and set to start on boot
service ricci start
chkconfig ricci on
- Start modclusterd and set to start on boot
service modclusterd start
chkconfig modclusterd on
- Sync the cluster.conf file to the other node
ccs -f /etc/cluster/cluster.conf -h ted --setconf
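- For later changes to cluster.conf (once cman and ricci are running on both nodes), one way to push and activate the new config is sketched below: increment config_version, validate, then ask cman to activate it. Check cman_tool(8) on your version if -r behaves differently.
ccs_config_validate
cman_tool version -r    # distribute and activate the updated cluster.conf (uses ricci; see cman_tool(8))
cman_tool version       # confirm both nodes report the new config version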
- Start cman on both servers at the same time
service cman start
- Set cman to start on boot
chkconfig cman on
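- Before relying on the cluster, check that both nodes can see each other and that the cluster is quorate:
cman_tool nodes      # bill and ted should both be listed as members
cman_tool status     # shows votes and whether the cluster is quorate
clustat              # overall view of members (and, later, services)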
- Check the tutorial on testing the fencing before going any further; a quick sanity check is sketched below.
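- A basic fencing sanity check (the tutorial covers this in far more depth). The addresses and credentials are the ones from cluster.conf above, and the fence_ipmilan options can differ slightly between versions, so check its man page:
fence_ipmilan -a billsp -l root -p PASSWORD-HERE -o status   # query bill's IPMI interface; should report power on
fence_node ted    # run from bill: ted should be power-cycled and then rejoin the cluster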
Create GFS2 partitions
Create a partition on the new SCSI device /dev/mapper/mpatha using parted. NOTE: This part only needs to be done once, on one server.
parted /dev/mapper/mpatha
mklabel gpt
mkpart primary 1 -1
set 1 lvm on
quit
- Now you can see a partition for the storage array.
parted -l
Edit the /etc/lvm/lvm.conf file and set locking_type = 3 to allow cluster locking.
In order to enable the LVM volumes you are creating in a cluster, the cluster infrastructure must be running and the cluster must be quorate.
service clvmd start
chkconfig clvmd on
chkconfig gfs2 on
Create LVM partitions on the raw drive available from the StorageTek. NOTE: This part only needs to be done once on one server.
pvcreate /dev/mapper/mpatha1
vgcreate -c y StorageTek2530 /dev/mapper/mpatha1
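- To confirm that the physical volume and the clustered volume group (the -c y flag) were created:
pvs    # /dev/mapper/mpatha1 should be listed, assigned to StorageTek2530
vgs    # StorageTek2530 should show a "c" at the end of its Attr column (clustered)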
Now create the different partitions for the system: sites, databases, logs, home, root
lvcreate --name sites --size 350GB StorageTek2530
lvcreate --name databases --size 100GB StorageTek2530
lvcreate --name logs --size 50GB StorageTek2530
lvcreate --name root --size 50GB StorageTek2530
Make a temporary directory /root-b and copy everything from root’s home directory to there, because it will be erased when we make the GFS2 file system.
Copy /root/.ssh/known_hosts to /etc/ssh/root_known_hosts so the file is different for both servers.
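- A minimal sketch of those two copy steps, assuming rsync is installed (a plain cp -a works as well):
mkdir /root-b
rsync -a /root/ /root-b/    # preserve permissions and ownership
cp /root/.ssh/known_hosts /etc/ssh/root_known_hosts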
Before doing the home directory, we have to remove it from the local LVM.
umount /home
lvremove bill_local/home
and on ted:
lvremove ted_local/home
- Remove the line from /etc/fstab referring to the /home directory on the local LVM.
- Then add the clustered LV.
lvcreate --name home --size 50GB StorageTek2530
Create GFS2 file systems on the LVM partitions created on the StorageTek. Make sure they are unmounted first. NOTE: This part only needs to be done once, on one server.
mkfs.gfs2 -p lock_dlm -j 2 -t web-production:sites /dev/mapper/StorageTek2530-sites
mkfs.gfs2 -p lock_dlm -j 2 -t web-production:databases /dev/mapper/StorageTek2530-databases
mkfs.gfs2 -p lock_dlm -j 2 -t web-production:logs /dev/mapper/StorageTek2530-logs
mkfs.gfs2 -p lock_dlm -j 2 -t web-production:root /dev/mapper/StorageTek2530-root
mkfs.gfs2 -p lock_dlm -j 2 -t web-production:home /dev/mapper/StorageTek2530-home
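- To double-check that each file system got the right lock table (the -t argument must match the cluster name in cluster.conf), something like the following works with gfs2-utils; see gfs2_tool(8) if the subcommand differs on your version:
gfs2_tool sb /dev/mapper/StorageTek2530-sites table    # should report web-production:sites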
Mount the GFS2 partitions
- NOTE: GFS2 file systems that have been mounted manually rather than automatically through an entry in the fstab file will not be known to the system when file systems are unmounted at system shutdown. As a result, the GFS2 script will not unmount the GFS2 file system. After the GFS2 shutdown script is run, the standard shutdown process kills off all remaining user processes, including the cluster infrastructure, and tries to unmount the file system. This unmount will fail without the cluster infrastructure and the system will hang.
- To prevent the system from hanging when the GFS2 file systems are unmounted, you should do one of the following:
- Always use an entry in the fstab file to mount the GFS2 file system.
- If a GFS2 file system has been mounted manually with the mount command, be sure to unmount the file system manually with the umount command before rebooting or shutting down the system.
- If your file system hangs while it is being unmounted during system shutdown under these circumstances, perform a hardware reboot. It is unlikely that any data will be lost since the file system is synced earlier in the shutdown process.
Make the appropriate folders on each node (/home is already there).
mkdir /sites /logs /databases
Make sure the appropriate lines are in /etc/fstab
#GFS2 partitions shared in the cluster
/dev/mapper/StorageTek2530-root       /root       gfs2  defaults,acl  0 0
/dev/mapper/StorageTek2530-home       /home       gfs2  defaults,acl  0 0
/dev/mapper/StorageTek2530-databases  /databases  gfs2  defaults,acl  0 0
/dev/mapper/StorageTek2530-logs       /logs       gfs2  defaults,acl  0 0
/dev/mapper/StorageTek2530-sites      /sites      gfs2  defaults,acl  0 0
Once the GFS2 partitions are set up and in /etc/fstab, rgmanager can be started. This will mount the GFS2 partitions.
service rgmanager start
chkconfig rgmanager on
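- To verify that all five GFS2 file systems are mounted on each node:
mount -t gfs2    # list mounted gfs2 file systems
df -h /sites /databases /logs /home /root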
Starting Cluster Software
To start the cluster software on a node, type the following commands in this order:
service cman start
service clvmd start
service gfs2 start
service rgmanager start
Stopping Cluster Software
To stop the cluster software on a node, type the following commands in this order:
service ossec-hids stop
- (OSSEC monitors the Apache log files, so the /logs partition will not be unmounted unless OSSEC is stopped first.)
service rgmanager stop
service gfs2 stop
umount -at gfs2
service clvmd stop
service cman stop
Cluster tips
If a service shows as ‘failed’ when checking on services with clustat
- Disable the service first:
clusvcadm -d service-name
- Then re-enable it:
clusvcadm -e service-name
Have Shorewall start sooner in the boot process.
- It was necessary to move Shorewall up in the boot process; otherwise cman had no open connection through which to detect the other node.
- Edit /etc/init.d/shorewall and change the line near the top from:
# chkconfig: - 28 90
to:
# chkconfig: - 18 90
- Then use chkconfig to turn Shorewall off and back on, so the rc.d symlinks are recreated with the new start priority.
chkconfig shorewall off
chkconfig shorewall on
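- To confirm the new start priority took effect, the rc.d symlink should now begin with S18 (runlevel 3 shown here as an example):
ls /etc/rc3.d/ | grep -i shorewall    # should show S18shorewall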