Tech Blog

These are blog entries written by the UNIX Health Check development team. Our team has extensive technical experience on both AIX and Red Hat systems, and we like to share our knowledge with our visitors.

Topics: PowerHA / HACMP

HACMP MAC Address take-over

If you wish to enable MAC Address take-over on an HACMP cluster, you need a virtual MAC address. You can do a couple of things to make sure you have a unique MAC Address on your network:

  • Use the MAC address of an old system that you know has been destroyed.
  • Buy a new network card, use its MAC address, then destroy the card.
  • Use a DEADBEEF address (e.g. 0xdeadbeef1234). This prefix does not belong to any existing hardware vendor. Someone else might make up the same deadbeef address, though, so use this option with caution.
Whichever option you choose, register the MAC addresses you use for your HACMP clusters.
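As an aside, a made-up address like 0xdeadbeef1234 happens to have the "locally administered" bit set in its first octet, which is exactly what you want for an address assigned by an administrator rather than a vendor. A quick generic shell check (this is not an HACMP utility, just a sketch):

```shell
# The second-lowest bit (0x02) of the first octet of a MAC address marks
# it as "locally administered", i.e. assigned by an administrator rather
# than burned in by the hardware vendor.
first_octet=0xde            # first octet of 0xdeadbeef1234
if [ $(( first_octet & 0x02 )) -ne 0 ]; then
    echo "locally administered MAC"
else
    echo "vendor-assigned MAC"
fi
```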

Topics: Monitoring, PowerHA / HACMP

HACMP Event generation

HACMP provides events, which can be used to monitor the cluster status accurately, for example via the Tivoli Enterprise Console. Each change in the cluster status is the result of an HACMP event, and each HACMP event has an accompanying notify method that can be used to generate the kind of notification we want.

Interesting Cluster Events to monitor are:

  • node_up
  • node_down
  • network_up
  • network_down
  • join_standby
  • fail_standby
  • swap_adapter
  • config_too_long
  • event_error
You can set the notify method via:
# smitty hacmp
Cluster Configuration
Cluster Resources
Cluster Events
Change/Show Cluster Events
You can also query the ODM:
# odmget HACMPevent
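A notify method is simply an executable that HACMP calls with the event name and its arguments. A minimal sketch could look like the following; the function name and log path are our own choices, not HACMP defaults:

```shell
# Hypothetical notify method: HACMP invokes the configured notify command
# with the event name and its arguments. This sketch appends a timestamped
# record to a log file, from which a monitoring agent could pick it up and
# forward it (e.g. to the Tivoli Enterprise Console).
notify_event() {
    event="$1"; shift
    log="${NOTIFY_LOG:-/var/hacmp/event_notify.log}"   # example path
    printf '%s event=%s args=%s\n' "$(date)" "$event" "$*" >> "$log"
}
```

Configured as the notify method for, say, node_down, such a script gives you a persistent trail of cluster events in addition to the regular HACMP logs.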

Topics: PowerHA / HACMP

A few HACMP rules

With HACMP clusters, documentation is probably the most important issue. You cannot properly manage an HACMP cluster if you do not document it. Document the precise configuration of the complete cluster, and document any changes you carry out. Also document all management procedures and stick to them! The cluster snapshot facility is an excellent way of documenting your cluster.

Next step: get educated. You have to know exactly what you're doing on an HACMP cluster. If you have to manage a production cluster, getting certified is a necessity. Don't ever let non-HACMP-educated UNIX administrators onto your HACMP cluster nodes. They don't have a clue what's going on and will probably destroy your carefully laid-out configuration.

Geographically separated nodes are important! Too many cluster nodes just sit on top of each other in the same rack. What if there's a fire? Or a power outage? Having an HACMP cluster won't help you if both nodes are in a single location, use the same power source, or share the same network switches.

Put your HACMP logs in a sensible location. Don't put them in /tmp when you know that /tmp gets purged every night.

Test, test, and test your cluster over and over again. Doing take-over tests every half year is best practice. Document your tests and your test results.

Don't assume that your cluster is highly available just because you've installed the cluster software. There are a lot of other things to consider in your infrastructure to avoid single points of failure, such as: no two nodes sharing the same I/O drawer; power redundancy; no two storage or network adapters on the same SCSI backplane or bus; redundancy in SAN HBAs; application monitoring in place.

Topics: PowerHA / HACMP

PowerHA / HACMP links

Official IBM sites:

Other PowerHA / HACMP related sites:
  • IBM's Redbooks on HACMP
  • lpar.co.uk (Alex Abderrazag)

    Topics: Fun, PowerHA / HACMP

    HACMP humor


    This is NOT a cluster snapshot, but a snapshot of a cluster.
    (This is a very inside joke. Just ignore it if you don't get it.)

    Topics: PowerHA / HACMP

    PowerHA / HACMP support matrix

    Support matrix / life cycle for IBM PowerHA (with a typical 3 year lifecycle):

Version       | AIX 5.1 | AIX 5.2 | AIX 5.3 | AIX 6.1  | AIX 7.1             | AIX 7.2  | Release Date | End of Support
HACMP 5.1     | Yes     | Yes     | Yes     | No       | No                  | No       | 7/11/2003    | 9/1/2006
HACMP 5.2     | Yes     | Yes     | Yes     | No       | No                  | No       | 7/16/2004    | 9/30/2007
HACMP 5.3     | No      | ML4+    | ML2+    | Yes      | No                  | No       | 8/12/2005    | 9/30/2009
HACMP 5.4.0   | No      | TL8+    | TL4+    | No       | No                  | No       | 7/28/2006    | 9/30/2011
HACMP 5.4.1   | No      | TL8+    | TL4+    | Yes      | Yes                 | No       | 9/11/2007    | 9/30/2011
PowerHA 5.5   | No      | No      | TL7+    | TL2 SP1+ | Yes                 | No       | 11/14/2008   | 4/30/2012
PowerHA 6.1   | No      | No      | TL9+    | TL2 SP1+ | Yes                 | No       | 10/20/2009   | 4/30/2015
PowerHA 7.1.0 | No      | No      | No      | TL6+     | Yes                 | No       | 9/10/2010    | 9/30/2014
PowerHA 7.1.1 | No      | No      | No      | TL7 SP2+ | TL1 SP2+            | No       | 9/10/2010    | 4/30/2015
PowerHA 7.1.2 | No      | No      | No      | TL8 SP1+ | TL2 SP1+            | No       | 10/3/2012    | 4/30/2016
PowerHA 7.1.3 | No      | No      | No      | TL9 SP1+ | TL3 SP1+            | No       | 10/7/2013    | 4/30/2018
PowerHA 7.2.0 | No      | No      | No      | TL9 SP5+ | TL3 SP5+ / TL4 SP1+ | TL0 SP1+ | 12/4/2015    | 4/30/2019
PowerHA 7.2.1 | No      | No      | No      | No       | TL3+                | TL0 SP1+ | 12/16/2016   | 4/30/2020
PowerHA 7.2.2 | No      | No      | No      | No       | TL4+                | TL0 SP1+ | 12/15/2017   | tbd

    Source: PowerHA for AIX Version Compatibility Matrix

    Topics: PowerHA / HACMP

    PowerHA / HACMP Introduction

PowerHA is the new name for HACMP, which is short for High Availability Cluster Multi-Processing, a product of IBM. PowerHA / HACMP runs on AIX (and also on Linux), and its purpose is to provide high availability to systems, mainly in the case of hardware failures. It can automatically detect system or network failures and provides the capability to recover system hardware, applications, data and users while keeping recovery time to an absolute minimum. This is useful for systems that need to be online 24 hours a day, 365 days a year, and for organizations that can't afford to have systems down for longer than 15 minutes. It's not completely fault-tolerant, but it is highly available.

Compared to other cluster software, PowerHA / HACMP is highly robust, allows for large distances between nodes of a single cluster, and supports up to 32 nodes in a cluster. Previous versions of PowerHA / HACMP had a reputation of having a lot of "bugs". From version 5.4 onward, PowerHA / HACMP has seen a lot of improvements.

IBM's HACMP has existed for over 15 years. It wasn't originally an IBM product; IBM bought it from CLAM, which was later renamed to Availant, then to LakeViewTech, and nowadays is called Vision Solutions. Until August 2006, all development of HACMP was done by CLAM. Nowadays, IBM does its own development of PowerHA / HACMP in Austin, Poughkeepsie and Bangalore.

Competitors of PowerHA / HACMP are Veritas Cluster and Echo Cluster. The latter, Echo Cluster, is a product of the aforementioned Vision Solutions; it tends to be easier to set up and is meant for simpler clusters. Veritas is mostly used by customers that already run it on other operating systems, like Sun Solaris and Windows Server environments, and don't want to invest in yet another clustering technology.

    Topics: GPFS, Oracle, PowerHA / HACMP

    Oracle RAC introduction

The traditional method for making an Oracle database capable of 7*24 operation is to create an HACMP cluster in an active-standby configuration. In case of a failure of the active system, HACMP lets the standby system take over the resources and start Oracle, thus resuming operation. This takeover is done within a downtime period of approximately 5 to 15 minutes; however, the impact on the business applications can be more severe and may lead to interruptions of up to one hour.

Another way to achieve high availability of databases is to use a special version of the Oracle database software called Real Application Clusters, also called RAC. In a RAC cluster, multiple systems (instances) are active, sharing the workload, and provide near always-on database operation. The Oracle RAC software relies on IBM's HACMP software to achieve high availability for the hardware and the AIX operating system platform. For storage it utilizes a concurrent file system called GPFS (General Parallel File System), a product of IBM. Oracle RAC 9 uses GPFS and HACMP; with RAC 10 you no longer need HACMP and GPFS.

HACMP is used for network down notifications. Put all network adapters of one node on a single switch, and put every node on a different switch. HACMP only manages the public and private network service adapters. There are no standby, boot or management adapters in a RAC HACMP cluster. It just uses a single hostname; Oracle RAC and GPFS do not support hostname take-over or IPAT (IP Address Take-over). There are no disks, volume groups or resource groups defined in an HACMP RAC cluster. In fact, HACMP is only necessary for event handling for Oracle RAC.

Name your HACMP RAC clusters in such a way that you can easily recognize a cluster as a RAC cluster, for example by using a naming convention that starts with RAC_.

    On every GPFS node of an Oracle RAC cluster a GPFS daemon (mmfs) is active. These daemons need to communicate with each other. This is done via the public network, not via the private network.

    Cache Fusion

Via SQL*Net, an Oracle block is read into memory. If a second node in an HACMP RAC cluster requests the same block, it will first check whether it already has it stored locally in its own cache. If not, it will use a private dedicated network to ask whether another node has the block in its cache. If no node has it, the block will be read from disk. This mechanism is called Cache Fusion, and the network is known as the Oracle RAC interconnect.
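The lookup order described above can be sketched as follows. This is a conceptual shell illustration only, not Oracle code; the two files simply stand in for the local and remote caches:

```shell
# Conceptual illustration of the Cache Fusion lookup order: local cache
# first, then the other nodes' caches via the private interconnect, and
# only then a physical read from disk.
read_block() {
    block="$1"
    if grep -q "^$block\$" "$LOCAL_CACHE" 2>/dev/null; then
        echo "served from local cache"
    elif grep -q "^$block\$" "$REMOTE_CACHE" 2>/dev/null; then
        echo "served via Cache Fusion (remote cache, private network)"
    else
        echo "physical read from disk"
    fi
}
```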

    This is why on RAC HACMP clusters, each node uses an extra private network adapter to communicate with the other nodes, for Cache Fusion purposes only. All other communication, including the communication between the GPFS daemons on every node and the communication from Oracle clients, is done via the public network adapter. The throughput on the private network adapter can be twice as high as on the public network adapter.

Oracle RAC will use its own private network for Cache Fusion. If this network is not available, or if one node is unable to access the private network, then the public network will be used instead. Once the private network returns to normal operation, a fall-back to the private network occurs. Oracle RAC uses the cllsif utility of HACMP for this purpose.

    Topics: Monitoring, PowerHA / HACMP, Security

    HACMP 5.4: How to change SNMP community name from default "public" and keep clstat working

HACMP 5.4 supports changing the default community name from "public" to something else. SNMP is used for clstat / clinfoES communications. Using the "public" SNMP community name can be a security vulnerability, so changing it is advisable.

    First, find out what version of SNMP you are using:

    # ls -l /usr/sbin/snmpd
    lrwxrwxrwx 1 root system 9 Sep 08 2008 /usr/sbin/snmpd -> snmpdv3ne
    (In this case, it is using version 3).

Make a copy of your configuration file. It is located in /etc:
/etc/snmpd.conf   <- version 1
/etc/snmpdv3.conf <- version 3
Edit the file and replace every occurrence of "public" with your new community name. Make sure to use no more than 8 characters for the new community name.
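The replacement itself can be scripted. The sketch below runs against a scratch copy so it is safe to try; on the cluster node the file would be /etc/snmpdv3.conf, "newcomm" is just an example name, and the COMMUNITY line mimics the shape of a default AIX snmpdv3.conf entry:

```shell
# Sketch: replace the "public" community name in an SNMP config file.
# We work on a scratch file here; on a real node you would operate on
# /etc/snmpdv3.conf (or /etc/snmpd.conf for SNMP v1).
CONF=$(mktemp)
echo 'COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0 -' > "$CONF"
cp "$CONF" "$CONF.orig"                           # keep a backout copy
sed 's/public/newcomm/g' "$CONF.orig" > "$CONF"   # "newcomm": max 8 chars
grep newcomm "$CONF"
```

Keeping the .orig copy makes the backout procedure described further down a simple file restore.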

    Change subsystems and restart them:
    # chssys -s snmpmibd -a "-c new"
    # chssys -s hostmibd -a "-c new"
    # chssys -s aixmibd -a "-c new"
    # stopsrc -s snmpd
    # stopsrc -s aixmibd
    # stopsrc -s snmpmibd
    # stopsrc -s hostmibd
    # startsrc -s snmpd
    # startsrc -s hostmibd
    # startsrc -s snmpmibd
    # startsrc -s aixmibd
Test using your localhost:
    # snmpinfo -m dump -v -h localhost -c new -o /usr/es/sbin/cluster/hacmp.defs nodeTable
    If the command hangs, something is wrong. Check the changes you made.

    If everything works fine, perform the same change in the other node and test again. Now you can test from one server to the other using the snmpinfo command above.

If you need to back out, restore the original configuration file and restart the subsystems. Note that in this case we use empty double quotes, with no space between them:
    # chssys -s snmpmibd -a ""
    # chssys -s hostmibd -a ""
    # chssys -s aixmibd -a ""
    # stopsrc -s snmpd
    # stopsrc -s aixmibd
    # stopsrc -s snmpmibd
    # stopsrc -s hostmibd
    # startsrc -s snmpd
    # startsrc -s hostmibd
    # startsrc -s snmpmibd
    # startsrc -s aixmibd
Okay, now make the change to clinfoES and restart it on both nodes:
    # chssys -s clinfoES -a "-c new"
    # stopsrc -s clinfoES
    # startsrc -s clinfoES
    Wait a few minutes and you should be able to use clstat again with the new community name.

Disclaimer: If you have any application other than clinfoES that uses snmpd with the default community name, you should change it as well. Check with your application team or software vendor.

    Topics: PowerHA / HACMP

    Tweaking the deadman switch

    You can tweak the Dead Man Switch settings for HACMP. First have a look at the current setting by running:

    # lssrc -ls topsvcs
A system usually has at least two heartbeats. One goes through the network (net_ether_01), with a sensitivity of 10 missed beats x 1-second interval x 2 = 20 seconds for it to fail. The other is usually the disk heartbeat (diskhb_0), with a sensitivity of 4 missed beats x 2-second interval x 2 = 16 seconds.

Basically, if the other node has failed, HACMP will know once all heartbeating has failed, thus after 20 seconds.
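The detection times above follow from a simple formula, using the sensitivity values reported by lssrc -ls topsvcs:

```shell
# Failure-detection time per heartbeat network:
#   missed beats x beat interval x 2
net_detect=$(( 10 * 1 * 2 ))    # net_ether_01: 20 seconds
disk_detect=$(( 4 * 2 * 2 ))    # diskhb_0: 16 seconds
echo "network: ${net_detect}s, disk: ${disk_detect}s"
```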

You can play around with the HACMP failure detection rates. Set it to normal:
    # /usr/es/sbin/cluster/utilities/claddnim -oether -r2
With the normal rate, Ethernet heartbeating fails after 20 seconds. If you want to set it to slow, use "-r3" instead of "-r2", and it fails after 48 seconds. To set it to fast, use "-r1", which makes it fail after 10 seconds.

    To give you some more time, you can use a grace period:
    # claddnim -oether -g 15
This will give you 15 seconds of grace time, which is the time within which a network fallover must be taken care of.

    You will have to synchronize the cluster after making any changes using claddnim:
    # /usr/es/sbin/cluster/utilities/cldare -rt -V 'normal'
