Here are some color codes you can use in the Korn Shell:
## Reset to normal: \033[0m
NORM="\033[0m"
## Colors:
BLACK="\033[0;30m"
GRAY="\033[1;30m"
RED="\033[0;31m"
LRED="\033[1;31m"
GREEN="\033[0;32m"
LGREEN="\033[1;32m"
YELLOW="\033[0;33m"
LYELLOW="\033[1;33m"
BLUE="\033[0;34m"
LBLUE="\033[1;34m"
PURPLE="\033[0;35m"
PINK="\033[1;35m"
CYAN="\033[0;36m"
LCYAN="\033[1;36m"
LGRAY="\033[0;37m"
WHITE="\033[1;37m"
## Backgrounds
BLACKB="\033[0;40m"
REDB="\033[0;41m"
GREENB="\033[0;42m"
YELLOWB="\033[0;43m"
BLUEB="\033[0;44m"
PURPLEB="\033[0;45m"
CYANB="\033[0;46m"
GREYB="\033[0;47m"
## Attributes:
UNDERLINE="\033[4m"
BOLD="\033[1m"
INVERT="\033[7m"
## Cursor movements
CUR_UP="\033[1A"
CUR_DN="\033[1B"
CUR_LEFT="\033[1D"
CUR_RIGHT="\033[1C"
## Start of display (top left)
SOD="\033[1;1f"
Just copy everything above and paste it into your shell or into a script. Then you can use the defined variables:
## Example - Red underlined
echo "${RED}${UNDERLINE}This is a test!${NORM}"
## Example - different colors
echo "${RED}This ${YELLOW}is ${LBLUE}a ${INVERT}test!${NORM}"
## Example - cursor movement
# echo " ${CUR_LEFT}Test"
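The examples above rely on echo interpreting the \033 escapes, which not every shell does (bash's built-in echo, for example, needs -e). A minimal sketch of a wrapper that uses printf instead and always resets the attributes afterwards; the function name colorize is an assumption for illustration, not part of the original list:

```shell
# Copies of three variables from the list above, so the sketch is self-contained.
NORM='\033[0m'
RED='\033[0;31m'
UNDERLINE='\033[4m'

# colorize (hypothetical helper): $1 = escape sequence(s), remaining args = text.
colorize() {
    _seq=$1
    shift
    # The sequences go into the printf format, where \033 is interpreted;
    # the text itself is passed through %s untouched.
    printf "${_seq}%s${NORM}\n" "$*"
}

colorize "${RED}${UNDERLINE}" "This is a test!"
```

Because the reset sequence is appended inside the function, a forgotten ${NORM} can no longer leave the terminal in red.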
## Create a rotating thingy
while true ; do
  printf "${CUR_LEFT}/"
  perl -e "use Time::HiRes qw(usleep); usleep(100000)"
  printf "${CUR_LEFT}-"
  perl -e "use Time::HiRes qw(usleep); usleep(100000)"
  printf "${CUR_LEFT}\\"
  perl -e "use Time::HiRes qw(usleep); usleep(100000)"
  printf "${CUR_LEFT}|"
  perl -e "use Time::HiRes qw(usleep); usleep(100000)"
done
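The loop above runs until it is interrupted. As a rough sketch, the same idea can be wrapped in a function that spins for a fixed number of steps and then cleans up after itself; the names spinner_frame and spin are assumptions for illustration:

```shell
CUR_LEFT='\033[1D'

# spinner_frame (hypothetical helper): print the frame for step $1.
spinner_frame() {
    case $(( $1 % 4 )) in
        0) printf '/' ;;
        1) printf '-' ;;
        2) printf '\\' ;;
        3) printf '|' ;;
    esac
}

# spin (hypothetical helper): animate for $1 steps of about 0.1 seconds each.
spin() {
    _step=0
    printf ' '
    while [ "$_step" -lt "$1" ]; do
        printf "${CUR_LEFT}%s" "$(spinner_frame "$_step")"
        perl -e 'use Time::HiRes qw(usleep); usleep(100000)'
        _step=$(( _step + 1 ))
    done
    # Overwrite the last frame with a space and step back again.
    printf "${CUR_LEFT} ${CUR_LEFT}"
}
```

Calling `spin 20` would animate for roughly two seconds, plus the start-up time of perl on each step.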
Note that the perl command used above sleeps for 0.1 seconds. Perl is used here because the sleep command can't be used to sleep for less than 1 second.
Topics: AIX, System Admin
FIRMWARE_EVENT
If FIRMWARE_EVENT entries appear in the AIX error log without a FRU or location code callout, these events are likely attributable to an AIX memory page deconfiguration event, which is the result of a single memory cell being marked as unusable by the system firmware. The actual error is, and will continue to be, handled by ECC; however, notification of the unusable bit is also passed up to AIX. AIX in turn migrates the data and deallocates the memory page associated with this event from its memory map. This process is an AIX RAS feature which became available in AIX 5.3; it provides extra memory resilience and is no cause for alarm. Since the failure represents a single bit, a hardware action is NOT warranted.
To suppress logging, enter the following command and reboot the partition to make the change effective:
# chdev -l sys0 -a log_pg_dealloc=false
Check the current status:
# lsattr -El sys0 -a log_pg_dealloc
More information about this function can be found in the "Highly Available POWER Servers for Business-Critical Applications" document, which is available at the following link:
ftp://ftp.software.ibm.com/common/ssi/rep_wh/n/POW03003USEN/POW03003USEN.PDF (see pages 17-22 specifically).
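To check the setting from a script, the value can be picked out of the lsattr output. This is a minimal sketch that assumes the usual lsattr -El layout of attribute name, value, description and user-settable flag; the function name pg_dealloc_logging is hypothetical:

```shell
# pg_dealloc_logging (hypothetical helper): read lsattr output on stdin and
# print the value column for log_pg_dealloc; exit status 1 if it is absent.
pg_dealloc_logging() {
    awk '$1 == "log_pg_dealloc" { print $2; found = 1 }
         END { exit(found ? 0 : 1) }'
}

# On AIX (not run here):
#   lsattr -El sys0 -a log_pg_dealloc | pg_dealloc_logging
```

A result of "false" would indicate that logging of these events has been suppressed.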
Topics: AIX, Installation, System Admin
Compare_report
The compare_report command is a very useful utility to compare the software installed on two systems, for example for making sure the same software is installed on two nodes of a PowerHA cluster.
First, create the necessary reports:
# ssh node2 "lslpp -Lc" > /tmp/node2
# lslpp -Lc > /tmp/node1
Next, generate the report. There are four interesting options: -l, -h, -m and -n:
- -l Generates a report of base system installed software that is at a lower level.
- -h Generates a report of base system installed software that is at a higher level.
- -m Generates a report of filesets not installed on the other system.
- -n Generates a report of filesets not installed on the base system.
# compare_report -b /tmp/node1 -o /tmp/node2 -l
#(baselower.rpt)
#Base System Installed Software that is at a lower level
#Fileset_Name:Base_Level:Other_Level
bos.msg.en_US.net.ipsec:6.1.3.0:6.1.4.0
bos.msg.en_US.net.tcp.client:6.1.1.1:6.1.4.0
bos.msg.en_US.rte:6.1.3.0:6.1.4.0
bos.msg.en_US.txt.tfs:6.1.1.0:6.1.4.0
xlsmp.msg.en_US.rte:1.8.0.1:1.8.0.3
# compare_report -b /tmp/node1 -o /tmp/node2 -h
#(basehigher.rpt)
#Base System Installed Software that is at a higher level
#Fileset_Name:Base_Level:Other_Level
idsldap.clt64bit62.rte:6.2.0.5:6.2.0.4
idsldap.clt_max_crypto64bit62.rte:6.2.0.5:6.2.0.4
idsldap.cltbase62.adt:6.2.0.5:6.2.0.4
idsldap.cltbase62.rte:6.2.0.5:6.2.0.4
idsldap.cltjava62.rte:6.2.0.5:6.2.0.4
idsldap.msg62.en_US:6.2.0.5:6.2.0.4
idsldap.srv64bit62.rte:6.2.0.5:6.2.0.4
idsldap.srv_max_cryptobase64bit62.rte:6.2.0.5:6.2.0.4
idsldap.srvbase64bit62.rte:6.2.0.5:6.2.0.4
idsldap.srvproxy64bit62.rte:6.2.0.5:6.2.0.4
idsldap.webadmin62.rte:6.2.0.5:6.2.0.4
idsldap.webadmin_max_crypto62.rte:6.2.0.5:6.2.0.4
AIX-rpm:6.1.3.0-6:6.1.3.0-4
# compare_report -b /tmp/node1 -o /tmp/node2 -m
#(baseonly.rpt)
#Filesets not installed on the Other System
#Fileset_Name:Base_Level
Java6.sdk:6.0.0.75
Java6.source:6.0.0.75
Java6_64.samples.demo:6.0.0.75
Java6_64.samples.jnlp:6.0.0.75
Java6_64.source:6.0.0.75
WSBAA70:7.0.0.0
WSIHS70:7.0.0.0
# compare_report -b /tmp/node1 -o /tmp/node2 -n
#(otheronly.rpt)
#Filesets not installed on the Base System
#Fileset_Name:Other_Level
xlC.sup.aix50.rte:9.0.0.1
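The reports are colon-separated with '#' header lines, so they are easy to post-process. A minimal sketch that reduces a report to a plain list of fileset names, for example as input for a later update; the function name rpt_filesets is an assumption, and it presumes the report arrives on standard input as in the terminal output above:

```shell
# rpt_filesets (hypothetical helper): print the fileset names from a
# compare_report report read on stdin, skipping the '#' header lines.
rpt_filesets() {
    awk -F: '!/^#/ && NF >= 2 { print $1 }'
}

# Sample lines taken from the -l report shown above.
rpt_filesets <<'EOF'
#(baselower.rpt)
#Base System Installed Software that is at a lower level
#Fileset_Name:Base_Level:Other_Level
bos.msg.en_US.rte:6.1.3.0:6.1.4.0
xlsmp.msg.en_US.rte:1.8.0.1:1.8.0.3
EOF
```

For the sample this prints the two fileset names bos.msg.en_US.rte and xlsmp.msg.en_US.rte, one per line.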
Topics: AIX, Networking, System Admin
Using iptrace
The iptrace command can be very useful to find out what network traffic flows to and from an AIX system.
You can use any combination of these options, but you do not need to use them all:
- -a Do NOT print out ARP packets.
- -s [source IP] Limit trace to source/client IP address, if known.
- -d [destination IP] Limit trace to destination IP, if known.
- -b Capture bidirectional network traffic (send and receive packets).
- -p [port] Specify the port to be traced.
- -i [interface] Only trace for network traffic on a specific interface.
Run iptrace on AIX interface en1 to capture port 80 traffic to file trace.out from a single client IP to a server IP:
# iptrace -a -i en1 -s clientip -b -d serverip -p 80 trace.out
This trace captures both directions of the port 80 traffic on interface en1 between clientip and serverip, and writes it to the raw file trace.out.
To stop the trace:
# ps -ef|grep iptrace
# kill
The ipreport command can be used to transform the trace file generated by iptrace into a human-readable format:
# ipreport trace.out > trace.report
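Since the options can be combined freely, a small helper that assembles the command line from whatever is known can save some typing. This is only a sketch: build_iptrace is a hypothetical name, it always adds -a, and it prints the command instead of running it:

```shell
# build_iptrace (hypothetical helper): print an iptrace command line.
# Arguments: interface, source IP, destination IP, port, output file.
# An empty argument leaves the matching option out.
build_iptrace() {
    _cmd="iptrace -a"
    [ -n "$1" ] && _cmd="$_cmd -i $1"
    [ -n "$2" ] && _cmd="$_cmd -s $2"
    [ -n "$3" ] && _cmd="$_cmd -d $3"
    # -b modifies -s/-d, so add it only when an endpoint filter is present.
    if [ -n "$2" ] || [ -n "$3" ]; then _cmd="$_cmd -b"; fi
    [ -n "$4" ] && _cmd="$_cmd -p $4"
    printf '%s %s\n' "$_cmd" "${5:-trace.out}"
}

build_iptrace en1 clientip serverip 80 trace.out
```

The printed line uses the same options as the example above, in a slightly different order; review it before running it on the AIX system.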
AIX-rpm is a "virtual" package which reflects what has been installed on the system by installp. It is created by the /usr/sbin/updtvpkg script when the rpm.rte is installed, and can be run anytime the administrator chooses (usually after installing something with installp that is required to satisfy some dependency by an RPM package).
Since AIX-rpm has to have some sort of version number, it simply reflects the level of bos.rte on the system where /usr/sbin/updtvpkg is being run. It's just informational - nothing should be checking the level of AIX-rpm.
AIX doesn't just automatically run /usr/sbin/updtvpkg every time that something gets installed or deinstalled because on some slower systems with lots of software installed, /usr/sbin/updtvpkg can take a LONG time.
If you want to run the command manually:
# /usr/sbin/updtvpkg
If you get an error similar to "cannot read header at 20760 for lookup" when running updtvpkg, rebuild the RPM database:
# rpm --rebuilddb
Once you have run updtvpkg, you can run rpm -qa to see your new AIX-rpm package.
Topics: AIX, System Admin
PRNG is not SEEDED
If you get a message "PRNG is not SEEDED" when trying to run ssh, you probably have an issue with the /dev/random and/or /dev/urandom devices on your system. These devices are created during system installation, but may sometimes be missing after an AIX upgrade.
Check the permissions on the random number generators; "others" must have "read" access to these devices:
# ls -l /dev/random /dev/urandom
crw-r--r-- 1 root system 39, 0 Jan 22 10:48 /dev/random
crw-r--r-- 1 root system 39, 1 Jan 22 10:48 /dev/urandom
If the permissions are not set correctly, change them as follows:
# chmod o+r /dev/random /dev/urandom
Now stop and start the SSH daemon again, and check whether ssh works:
# stopsrc -s sshd
# startsrc -s sshd
If this still doesn't allow users to use ssh and the same message is produced, or if devices /dev/random and/or /dev/urandom are missing:
# stopsrc -s sshd
# rm -rf /dev/random
# rm -rf /dev/urandom
# mknod /dev/random c 39 0
# mknod /dev/urandom c 39 1
# randomctl -l
# ls -ald /dev/random /dev/urandom
# startsrc -s sshd
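The permission check can also be scripted. A minimal sketch, assuming the standard ten-character mode string of ls -l, in which the eighth character is the read bit for "others"; the function name others_can_read is hypothetical:

```shell
# others_can_read (hypothetical helper): succeed when the mode string from
# `ls -l` (for example crw-r--r--) grants read access to "others".
others_can_read() {
    case $1 in
        ???????r*) return 0 ;;
        *)         return 1 ;;
    esac
}

for dev in /dev/random /dev/urandom; do
    if [ ! -c "$dev" ]; then
        echo "$dev is missing or not a character device"
    elif others_can_read "$(ls -l "$dev" | awk '{ print $1 }')"; then
        echo "$dev: others can read"
    else
        echo "$dev: others cannot read, run: chmod o+r $dev"
    fi
done
```

The loop only reports; it leaves the chmod and the device re-creation steps to the administrator.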
Topics: AIX, Backup & restore, LVM, Performance, Storage, System Admin
Using lvmstat
One of the best tools for looking at LVM usage is lvmstat. It reports the amount of data read from and written to logical volumes. Using that information, you can determine which logical volumes are used the most.
Gathering LVM statistics is not enabled by default:
# lvmstat -v data2vg
0516-1309 lvmstat: Statistics collection is not enabled for
this logical device. Use -e option to enable.
As the output shows, it is not enabled, so you need to enable it for each volume group before running the tool:
# lvmstat -v data2vg -e
The following command takes a snapshot of LVM information every second for 10 intervals:
# lvmstat -v data2vg 1 10
This view shows the most utilized logical volumes on your system since you started the data collection. This is very helpful when drilling down to the logical volume layer when tuning your systems.
# lvmstat -v data2vg
Logical Volume       iocnt     Kb_read    Kb_wrtn    Kbps
appdatalv           306653    47493022     383822   103.2
loglv00                 34           0       3340     2.8
data2lv                453      234543     234343    89.3
What are you looking at here?
- iocnt: Reports back the number of read and write requests.
- Kb_read: Reports back the total data (kilobytes) from your measured interval that is read.
- Kb_wrtn: Reports back the amount of data (kilobytes) from your measured interval that is written.
- Kbps: Reports back the amount of data transferred in kilobytes per second.
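To rank the logical volumes by activity, the output can be sorted on the iocnt column. A minimal sketch, assuming the five-column layout shown above; the function name busiest_lvs is hypothetical:

```shell
# busiest_lvs (hypothetical helper): read `lvmstat -v <vg>` output on stdin
# and list logical volumes with their iocnt, highest first.
busiest_lvs() {
    # Keep only data rows: five fields with a numeric second column.
    awk 'NF == 5 && $2 ~ /^[0-9]+$/ { print $1, $2 }' | sort -k2,2nr
}

# Sample rows from the output shown above.
busiest_lvs <<'EOF'
Logical Volume iocnt Kb_read Kb_wrtn Kbps
appdatalv 306653 47493022 383822 103.2
loglv00 34 0 3340 2.8
data2lv 453 234543 234343 89.3
EOF
```

On a live system the input would come from `lvmstat -v data2vg | busiest_lvs`; for the sample, appdatalv comes out on top.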
Topics: AIX, Backup & restore, LVM, Performance, Storage, System Admin
Spreading logical volumes over multiple disks
A common issue on AIX servers is that logical volumes are configured on a single disk, sometimes causing high disk utilization on a small number of disks in the system and impacting the performance of the application running on the server.
If you suspect that this might be the case, first try to determine which disks are saturated on the server. Any disk that is in use more than 60% of the time should be considered. You can use commands such as iostat, sar -d, nmon and topas to determine which disks show high utilization. If they do, check which logical volumes are defined on that disk, for example on an IBM SAN disk:
# lspv -l vpath23
It is always a good idea to spread the logical volumes on a disk over multiple disks. That way, the logical volume manager will spread the disk I/O over all the disks that are part of the logical volume, utilizing the queue_depth of all disks and greatly improving performance where disk I/O is concerned.
Let's say you have a logical volume called prodlv of 128 LPs, which is sitting on one disk, vpath408. To see the allocation of the LPs of logical volume prodlv, run:
# lslv -m prodlv
Let's also assume that you have a large number of disks in the volume group in which prodlv is configured. Disk I/O usually works best if you have a large number of disks in a volume group. For example, if you need to have 500 GB in a volume group, it is usually a far better idea to assign 10 disks of 50 GB to the volume group, instead of only one disk of 512 GB. That gives you the possibility of spreading the I/O over 10 disks instead of only one.
To spread the disk I/O of prodlv over 8 disks instead of just one, you can create an extra logical volume copy on these 8 disks and later, when the logical volume is synchronized, remove the original logical volume copy (the one on the single disk vpath408). So, divide 128 LPs by 8, which gives you 16 LPs. You can assign 16 LPs for logical volume prodlv on each of the 8 disks, giving it a total of 128 LPs.
First, check if the upper bound of the logical volume is set to at least 9. Check this by running:
# lslv prodlv
The upper bound limit determines on how many disks a logical volume can be created. You'll need the 1 disk, vpath408, on which the logical volume is already located, plus the 8 other disks that you're creating a new copy on. Never create a copy on the same disk: if that single disk fails, both copies of your logical volume will fail as well. It is usually a good idea to set the upper bound of the logical volume a lot higher, for example to 32:
# chlv -u 32 prodlv
The next thing you need to determine is that you actually have 8 disks with at least 16 free LPs in the volume group. You can do this by running:
# lsvg -p prodvg | sort -nk4 | grep -v vpath408 | tail -8
vpath188   active   959   40    00..00..00..00..40
vpath163   active   959   42    00..00..00..00..42
vpath208   active   959   96    00..00..96..00..00
vpath205   active   959   192   102..00..00..90..00
vpath194   active   959   240   00..00..00..48..192
vpath24    active   959   243   00..00..00..51..192
vpath304   active   959   340   00..89..152..99..00
vpath161   active   959   413   14..00..82..125..192
Note how in the command above the original disk, vpath408, was excluded from the list.
Any of the disks listed, using the command above, should have at least 1/8th of the size of the logical volume free, before you can make a logical volume copy on it for prodlv.
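The arithmetic and the free-space check can be sketched in a few lines. This assumes the lsvg -p row layout shown above, where the fourth column is the number of free PPs (each LP of the new copy needs one PP); the function name disks_with_free is hypothetical:

```shell
# disks_with_free (hypothetical helper): read `lsvg -p <vg>` rows on stdin and
# print the active disks that have at least $1 free PPs (column 4).
disks_with_free() {
    awk -v need="$1" '$2 == "active" && $4 + 0 >= need + 0 { print $1 }'
}

lps=128      # size of prodlv in LPs
disks=8      # number of target disks
per_disk=$(( (lps + disks - 1) / disks ))   # rounded up: 16 LPs per disk
echo "LPs needed per disk: $per_disk"

# Two sample rows from the listing above.
disks_with_free "$per_disk" <<'EOF'
vpath188 active 959 40 00..00..00..00..40
vpath163 active 959 42 00..00..00..00..42
EOF
```

Both sample disks qualify, since 40 and 42 free PPs are more than the 16 needed. Rounding up covers logical volumes whose size is not an exact multiple of the number of disks.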
Now create the logical volume copy. The magical option you need to use is "-e x" for the logical volume commands. That will spread the logical volume over all available disks. If you want to make sure that the logical volume is spread over only 8 available disks, and not all the available disks in a volume group, make sure you specify the 8 available disks:
# mklvcopy -e x prodlv 2 vpath188 vpath163 vpath208 \
  vpath205 vpath194 vpath24 vpath304 vpath161
Now check again with "lslv -m prodlv" whether the new copy was created correctly:
# lslv -m prodlv | awk '{print $5}' | grep vpath | sort -dfu | \
  while read pv ; do
    result=`lspv -l $pv | grep prodlv`
    echo "$pv $result"
  done
The output should look similar to this:
vpath161 prodlv 16 16 00..00..16..00..00 N/A
vpath163 prodlv 16 16 00..00..00..00..16 N/A
vpath188 prodlv 16 16 00..00..00..00..16 N/A
vpath194 prodlv 16 16 00..00..00..16..00 N/A
vpath205 prodlv 16 16 16..00..00..00..00 N/A
vpath208 prodlv 16 16 00..00..16..00..00 N/A
vpath24 prodlv 16 16 00..00..00..16..00 N/A
vpath304 prodlv 16 16 00..16..00..00..00 N/A
Now synchronize the logical volume:
# syncvg -l prodlv
And remove the original logical volume copy:
# rmlvcopy prodlv 1 vpath408
Then check again:
# lslv -m prodlv
Now, what if you have to extend the logical volume prodlv later on with another 128 LPs, and you still want to maintain the spreading of the LPs over the 8 disks? Again, you can use the "-e x" option when running the logical volume commands:
# extendlv -e x prodlv 128 vpath188 vpath163 vpath208 \
  vpath205 vpath194 vpath24 vpath304 vpath161
You can also use the "-e x" option with the mklv command to create a new logical volume from the start with the correct spreading over disks.
Shown below is a script that can be used to create a simple comma-separated values (CSV) file from NMON data.
If you wish to create a CSV file of the CPU usage on your system, you can grep for "CPU_ALL," in the nmon file. If you want to create a CSV file of the memory usage, grep for "MEM," in the nmon file. The script below creates a CSV file for the CPU usage.
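#!/bin/ksh
# Build a CSV file of the CPU usage from all NMON files in /var/msgs/nmon.
node=`hostname`
rm -f /tmp/cpu_all.tmp /tmp/zzzz.tmp /tmp/${node}_nmon_cpu.csv
for nmon_file in `ls /var/msgs/nmon/*nmon`
do
  datestamp=`echo ${nmon_file} | cut -f2 -d"_"`
  # CPU_ALL lines hold the CPU samples; ZZZZ lines map a snapshot ID to time and date
  grep CPU_ALL, $nmon_file > /tmp/cpu_all.tmp
  grep ZZZZ $nmon_file > /tmp/zzzz.tmp
  # Skip the CPU_ALL header line and turn the commas into spaces for read
  grep -v "CPU Total " /tmp/cpu_all.tmp | sed "s/,/ /g" | \
  while read NAME TS USER SYS WAIT IDLE rest
  do
    # Look up the date and time of this snapshot; -F, sets the field
    # separator before the first line is split
    timestamp=`grep "${TS}," /tmp/zzzz.tmp | awk -F, '{print $4" "$3}'`
    TOTAL=`echo "scale=1;${USER}+${SYS}" | bc`
    echo $timestamp,$USER,$SYS,$WAIT,$IDLE,$TOTAL >> \
      /tmp/${node}_nmon_cpu.csv
  done
  rm -f /tmp/cpu_all.tmp /tmp/zzzz.tmp
done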
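As an alternative sketch of the same conversion for a single file, awk can do the timestamp lookup and the addition itself, which avoids starting grep, awk and bc for every sample. The function name nmon_cpu_csv is an assumption, as is the nmon line layout of ZZZZ,ID,time,date and CPU_ALL,ID,user,sys,wait,idle that the script above relies on:

```shell
# nmon_cpu_csv (hypothetical helper): print "date time,user,sys,wait,idle,total"
# for every CPU_ALL sample in the nmon file $1.
nmon_cpu_csv() {
    # The file is read twice: pass 1 (NR == FNR) collects the ZZZZ timestamps,
    # pass 2 prints the CPU_ALL samples that have a known timestamp.
    awk -F, '
        NR == FNR { if ($1 == "ZZZZ") stamp[$2] = $4 " " $3; next }
        $1 == "CPU_ALL" && ($2 in stamp) {
            printf "%s,%s,%s,%s,%s,%.1f\n", stamp[$2], $3, $4, $5, $6, $3 + $4
        }
    ' "$1" "$1"
}

# Possible use, mirroring the loop above (not run here):
#   for f in /var/msgs/nmon/*nmon; do nmon_cpu_csv "$f"; done > /tmp/nmon_cpu.csv
```

The CPU_ALL header line is skipped automatically, because its second field is not a snapshot ID found in the ZZZZ lines. The total is rounded to one decimal place, which may differ slightly from the bc result if the input carries more decimals.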
Note: the script assumes that you've stored the NMON output files in /var/msgs/nmon. Update the script to the folder you're using to store NMON files.
A major number refers to a type of device, and a minor number specifies a particular device of that type or sometimes the operation mode of that device type.
Example:
# lsdev -Cc tape
rmt0 Available 3F-08-02 IBM 3580 Ultrium Tape Drive (FCP)
rmt1 Available 3F-08-02 IBM 3592 Tape Drive (FCP)
smc0 Available 3F-08-02 IBM 3576 Library Medium Changer (FCP)
In the list above:
rmt1 is a standalone IBM 3592 tape drive;
rmt0 is an LTO4 drive of a library;
smc0 is the medium changer (or robotic part) of the above tape library.
Now look at their major and minor numbers:
# ls -l /dev/rmt* /dev/smc*
crw-rw-rwT 1 root system 38, 0 Nov 13 17:40 /dev/rmt0
crw-rw-rwT 1 root system 38,128 Nov 13 17:40 /dev/rmt1
crw-rw-rwT 1 root system 38, 1 Nov 13 17:40 /dev/rmt0.1
crw-rw-rwT 1 root system 38, 66 Nov 13 17:40 /dev/smc0
All use the IBM tape device driver (and so have the same major number of 38), but they are actually different entities (with minor numbers of 0, 128 and 66 respectively). Also, compare rmt0 and rmt0.1: it is the same device, but with a different mode of operation.
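Extracting the two numbers in a script takes a little care, because ls prints them as "38, 0" for short minor numbers but as "38,128" without a space for longer ones. A minimal sketch that handles both forms; the function name dev_numbers is hypothetical:

```shell
# dev_numbers (hypothetical helper): read `ls -l /dev/...` lines on stdin and
# print "device major minor" for each character or block device.
dev_numbers() {
    awk '/^[cb]/ {
        sub(/,/, ", ")          # make "38,128" split like "38, 0"
        major = $5
        sub(/,$/, "", major)    # drop the trailing comma
        print $NF, major, $6
    }'
}

# Sample lines from the listing above.
dev_numbers <<'EOF'
crw-rw-rwT 1 root system 38, 0 Nov 13 17:40 /dev/rmt0
crw-rw-rwT 1 root system 38,128 Nov 13 17:40 /dev/rmt1
EOF
```

For the sample this reports major 38 with minor 0 for /dev/rmt0 and major 38 with minor 128 for /dev/rmt1.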


