AIX administration questions.

Why is my root directory full?
Why is the /var directory full?
Why is my system slow?
Real memory?
Paging space?
External SCSI device cables
Why can only some of my PCs access NFS exported file systems
Is my RS6000 Y2K ready
I want to try a Y2K rollover
Max members in a group
How do I set the clock accurately
Customized start conditions for print queue
Tuning network settings
File Permissions

Covers various topics to do with system administration.

The root directory on AIX systems contains relatively few files and typically occupies from 6MB (AIX 3) to 10MB (AIX 4). This increases by about 0.5MB per additional volume group or disk drive.

If it is found to be full, check the following before even thinking about extending the logical volume.

/smit.log

History file for smit. It can be deleted.

/smit.script

History file for smit. It can be deleted.

/etc/security/failedlogin

This is a history of failed login attempts. If no one ever mistypes their password it does not even exist. However, if your users are very forgetful, or you have some faulty or badly terminated serial connections, or someone is trying to hack the system, it can get quite large.

The following command can be used to display the file in printable format.

cat /etc/security/failedlogin |/usr/sbin/acct/fwtmp

If the file is of substantial size, print it or otherwise satisfy yourself that there is not an underlying problem before removing it.

Root directory filled up suddenly

This usually means that someone has redirected some output to a misspelt device name. Try the following commands, looking for anything unusual.

find / -xdev -ls|sort -nr +6

This lists files in the root file system, with the largest files displayed first.

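find / -xdev -mtime -30 -ls

This lists files in the root file system that have been modified in the last 30 days. Note the minus sign: -mtime 30 on its own matches only files modified exactly 30 days ago.

find / -xdev -ctime -30 -ls

This lists files in the root file system whose status (inode) has changed in the last 30 days, which includes files created in that time.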
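The size-sorted listing can be wrapped in a small function so that the same check is easy to repeat on other file systems. This is only a sketch: the name largest_files and its arguments are illustrative, and it assumes the size is the seventh field of the find -ls output (sort -k7 is the POSIX spelling of the older +6).

```shell
# largest_files DIR [COUNT]
# List the COUNT largest regular files under DIR without crossing into
# other mounted file systems (-xdev). Hypothetical helper, not an AIX command.
largest_files() {
    dir=$1
    count=${2:-10}
    # Field 7 of "find -ls" output is the size in bytes; sort on it, largest first.
    find "$dir" -xdev -type f -ls | sort -nr -k7 | head -n "$count"
}
```

For example, largest_files / 20 shows the twenty largest files in the root file system.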

Why is the /var directory full?

The /var directory on AIX systems contains files that are expected to change a lot during normal system operation. For a small system it will typically occupy 16MB. It contains a number of subdirectories that are used for different purposes.

/var/tmp

Used for temporary workspace by applications, and for log files.

If you find any large log files in this directory, do not delete them straight away. If the application that is or was using the file still has it open, deleting the file will remove the directory entry so that you can no longer see it, but the file will still be there occupying disk space and probably still growing!

So check whether the file is open.
fuser /var/tmp/named.run
This will return a list of one or more PIDs if the file is open.
ps -fp 12345
where 12345 is a PID returned by the previous command, will show the program that has the file open.
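Once you know which program holds the file, the space can usually be reclaimed by truncating the file in place rather than deleting it, so the directory entry and the open file both stay valid. This is an assumption on our part rather than part of the procedure above, and the name empty_log is illustrative. A program that did not open its log in append mode may carry on writing at its old offset, so check the result afterwards.

```shell
# empty_log FILE
# Truncate FILE to zero length without removing it, so a process that
# still has it open keeps writing to the same (now empty) file.
# Hypothetical helper; run the fuser check first to see who has it open.
empty_log() {
    if [ ! -f "$1" ]; then
        echo "empty_log: $1 is not a regular file" >&2
        return 1
    fi
    : > "$1"
}
```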

/var/adm

Used for system accounting files. These should be cleared up automatically by the nightly accounting runs.

With AIX 4.x the subdirectory /var/adm/ras is used as a save area for system dumps during system boot following a system crash.

/var/spool

Queues of various sorts.

/var/spool/mail

Mail queue. If mail is properly configured and users are collecting their mail, this should not occupy much space.

/var/spool/qdaemon

Local print jobs waiting to print

/var/spool/lpd

Inter machine print jobs.

/var/preserve

Save area used by vi when an editor session is killed.

Why is my system slow?

A system can appear slow for lots of reasons. One is a hardware fault: not all faults will stop a system, sometimes they will just slow it down. Try looking at the system error and event reports.

vmstat 1 10

Check for activity in the group of columns headed page. If the machine has adequate memory these columns will contain mostly 0s. If many values over 10 are present, a more in-depth investigation of memory usage is advised.

Check the group of columns headed faults. These are not faults in the sense of problems. They are counts of events that take the processor away from the running program: interrupts (in), system calls (sy) and context switches (cs).

in is the number of program switches that have occurred due to an interrupt. It is usually in the range 120-250. Higher values may be indicative of problems with a network or with serial wiring.

sy is a count of system calls. This column is normally less than 2000. High values, often 20,000+, are indicative of rogue programs. The offending program may well show up in the output of the ps command below.

In the group of columns headed cpu, high values for wa indicate that the processor is having to wait for the disk system to catch up. This may be helped by spreading the I/O load across more drives or controllers, or by adding more memory. Have a look at the iostat command below.

In the group of columns headed cpu, high values for us and/or sy usually indicate either a rogue program or excessive workload.
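The two numeric rules of thumb above (paging values over 10, system calls of 20,000 or more) can be applied to vmstat output automatically. The following is a rough sketch only: the name check_vmstat is illustrative, and it assumes the classic 17-column AIX layout, so the field numbers will need adjusting if your vmstat prints different columns.

```shell
# check_vmstat
# Read vmstat output on standard input and flag samples that exceed the
# rough thresholds described above. Assumes the 17-column layout:
#   r b avm fre re pi po fr sr cy in sy cs us sy id wa
# Usage: vmstat 1 10 | check_vmstat
check_vmstat() {
    awk '
        # Skip header lines: only process rows of 17 fields starting with a number.
        NF == 17 && $1 ~ /^[0-9]+$/ {
            n++
            if ($6 > 10 || $7 > 10)
                printf "sample %d: paging activity pi=%d po=%d\n", n, $6, $7
            if ($12 >= 20000)
                printf "sample %d: high system call rate sy=%d\n", n, $12
        }
    '
}
```

No output means none of the samples crossed either threshold. Remember that the first sample vmstat prints is an average since boot.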

You may also run topas if it is present on the system. This will display a continuously updated overview of current system activity.

Another possibility is monitor, which is available for most versions of AIX.


ps -eal|sort -nr +5|head

This lists current processes in order of decreasing CPU usage. The 6th column in the output of the above command indicates the CPU usage of the process over the last few seconds.

The thirteenth column is the total CPU time used by the process. If you repeat this several times and one process is persistently near the top of the list and accounting for a significant amount of CPU time, investigate that program.


iostat -d 10 20

This command gives 20 snapshots of disk activity at intervals of 10 seconds. The first line is an average since the system was started.

Ideally all drives should show similar values for % tm_act, and these values should be less than 25%.

If the values for Kb_read are significantly greater than those for Kb_wrtn, adding more memory is likely to increase the cache hit rate and decrease disk accesses. Conversely, spreading the accesses across more drives will help where there is a high level of disk writes.
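The 25% guideline can be checked mechanically. This sketch assumes the iostat -d data rows consist of a disk name followed by five numeric columns with % tm_act first. The name busy_disks is illustrative.

```shell
# busy_disks
# Read "iostat -d" output on standard input and print the name and
# % tm_act of every sample where a disk was more than 25% active.
# Usage: iostat -d 10 20 | busy_disks
busy_disks() {
    awk '
        # Data rows: disk name followed by five numeric columns.
        NF == 6 && $2 ~ /^[0-9.]+$/ && $2 + 0 > 25 {
            printf "%s %.1f\n", $1, $2
        }
    '
}
```

A disk that appears repeatedly while the others never do is a candidate for having some of its load moved elsewhere.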

acctcom

Real memory?

The command

lsattr -El sys0

displays basic details of the current system. The attribute realmem reflects the amount of memory the system could see when last rebooted.

Recommended minimum memory

The minimum amount of memory recommended for using Roadrunner on AIX depends on the version of AIX, the version of Roadrunner, the number of users, the amount of data held, and how it is used.

The following formula gives a rough guide. It is based on a "typical" setup with :-

1 year's history
20 jobs per day per traffic operator.
1.1 consignments per job.
1 item per consignment.
Jobs posted live
Invoicing to roadrunner accounts.

The formulas give a system where typically a third of the memory is used for program data, with the remaining two thirds being used to cache Roadrunner data/index segments. Performance improves up to about twice this amount of memory, and deteriorates slowly down to about half, below which point the system slows down drastically.

AIX 3, Version 3 Roadrunner: allow 8MB for AIX, 0.5MB per user, and 10% of data area or 1MB per 10,000 jobs.

AIX 4, Version 4 Roadrunner: allow 16MB for AIX, 2MB per user, and 10% of data area.
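As a worked example of the AIX 4 / Version 4 figures, the allowances can be added up in the shell. This sketch assumes the three allowances are simply summed, which is our reading of the guide rather than something it states outright. The function name is illustrative.

```shell
# roadrunner_mem_mb USERS DATA_MB
# Rough minimum memory in MB for AIX 4 with Roadrunner version 4:
# 16MB for AIX + 2MB per user + 10% of the data area.
# Assumes the three allowances are simply added together.
roadrunner_mem_mb() {
    users=$1
    data_mb=$2
    echo $(( 16 + 2 * users + data_mb / 10 ))
}
```

For 10 users and a 500MB data area, roadrunner_mem_mb 10 500 gives 16 + 20 + 50 = 86.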

Paging space?

Paging space is used to store program data when it does not need to be instantly available in main memory. AIX preallocates paging space as a program requests memory. It only writes to it, however, if it decides to page the data out of main memory. Preallocating it in this way avoids the problem of the system finding itself out of both real memory and paging space.

The command

lsps -a

will display the currently available paging spaces.

How many paging areas should there be? This is open ended. If more than one is available, space will be allocated on a round-robin basis until the smallest fills up.

How big should the paging area(s) be?

Except in very special circumstances, the total allocated paging space should be at least as big as the real memory.

AIX 3, Roadrunner version 3: 12MB for AIX, 1MB per user.

AIX 4, Roadrunner version 4: 16MB for AIX, 2MB per user.

AIX 4, Roadrunner version 6 {preliminary}: 16MB for AIX, 3MB per user.
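The two sizing rules can be combined in a short calculation. This sketch uses the AIX 4 / Roadrunner version 4 figures and assumes the intended result is the larger of real memory and the per-user allowance. That combination is our interpretation, and the function name is illustrative.

```shell
# paging_space_mb REALMEM_MB USERS
# Suggested total paging space in MB for AIX 4 with Roadrunner version 4:
# the larger of real memory and (16MB for AIX + 2MB per user).
paging_space_mb() {
    realmem_mb=$1
    users=$2
    by_users=$(( 16 + 2 * users ))
    if [ "$by_users" -gt "$realmem_mb" ]; then
        echo "$by_users"
    else
        echo "$realmem_mb"
    fi
}
```

With 64MB of real memory and 10 users the per-user figure is only 36MB, so the result is 64. With 32MB and 20 users it is 56.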

Extending a paging area

The command

lsvg $(lsvg -o)

will list a summary of volume groups and the free space in them. It also lists the PP SIZE (physical partition size).

chps -s1 hd6

will add one PP's worth of space to the paging area hd6.
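To grow a paging area by a given number of megabytes, divide by the PP size reported by lsvg and round up. A minimal sketch of that arithmetic follows. The function name pps_to_add is illustrative.

```shell
# pps_to_add EXTRA_MB PP_SIZE_MB
# Number of partitions to pass to "chps -s" to grow a paging space by at
# least EXTRA_MB, rounding up to whole partitions.
pps_to_add() {
    extra_mb=$1
    pp_size_mb=$2
    echo $(( (extra_mb + pp_size_mb - 1) / pp_size_mb ))
}
```

For example, with a 4MB PP size, chps -s "$(pps_to_add 32 4)" hd6 would add 8 partitions to hd6.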

External SCSI device cables

IBM has used several different SCSI connectors. The following summarizes the connectors used, with a basic description of each.

SCSI screw narrow
Plug screws in and has a divider running across the width of the connector with contacts on both sides. There are 30 contacts on each side of the finger, for a total of 60. Known to be used on the MCA SCSI card shipped with 3xx/5xx.
To single Centronics Length 1m P/N 33F4606 FC-_____
To double recessed Centronics Length 1m P/N FC-_____

SCSI screw wide
Plug screws in and has a divider running across the width of the connector with contacts on both sides. There are 34 contacts on each side of the finger, for a total of 68.
To single Centronics Length 1m P/N 92F2559 FC-_____
To double recessed Centronics Length 1m P/N 5264231 FC-_____

SCSI Alt-E spring grips
Plug clips on to the socket and has spring-loaded grips in the edges of the plug. The connector has two rows of 25 pins, for a total of 50. Built in interface on 220.
To single Centronics Length 1m P/N FC-_____
To double recessed Centronics Length 1m P/N 32G0397 FC-_____

SCSI screw wide pins
Plug screws in and has two rows of 34 pins, for a total of 68. Used for the built in interface on the 43P-140 and the Fast Wide PCI SCSI card {4A}.
To single Centronics Length 1.0m P/N 06H0637 FC-2111
To double recessed Centronics Length 1.5m P/N 52G0174 FC-2113

SCSI Centronics
This is used for 8 bit device to device links.
To single Centronics Length 0.7m P/N 33F4607 FC-2840
To double recessed Centronics Length 0.66m P/N 31F4222 FC-3130


Internal SCSI device attachment

Internal SCSI/SCSI-2 50 way cables only support 8 bit SCSI devices.

With 68 way Wide SCSI internal cables, 16 bit devices are supported, along with 8 bit devices linked via a 68/50 interposer, part number 92F0324 (ASM P/N 92F2565).

Why can only some of my PCs access NFS exported file systems

NFS requires that the majority of TCP/IP be set up and working. The following list of checks is a good place to start.

Can you ping the client?
Does the output from the last command show the host name you expect? If it shows an IP number or an unexpected name, you need to fix a problem with the /etc/hosts file or the setup of the DNS server.
If the last command reported more than 1 day's activity, fix accounting!
Have a look at the /etc/exports file and check that the options are as expected. If not, use smit nfs to fix them.

Is my RS6000 Y2K ready

This depends on both the operating system and, for some models, the firmware level installed.

Useful links
IBM Raleigh
IBM RS6000 microcode page
IBM software year 2000 page

Checking firmware

Not all models require firmware updates or checks. Where checks are necessary, the following table gives a summary of levels and how to check. For a more definitive listing see the IBM microcode link above.

Firmware checklist

Model	Name	Firmware	Check method
7024	E20	VIC97276	Boot SMS; AIX 4.1 lscfg -vr|grep VIC; AIX 4.2 lscfg -pv|grep VIC
7024	E30
7025	F40
7043-140	43P140	TIG99187	Boot SMS
7043-150	43P150	TCP99187	Boot SMS; AIX 4.2 lscfg -pv|grep -p openprom
7043-240	43P240	DOR98153	Boot SMS

Boot SMS

For systems with this option the firmware version may be determined by rebooting the system and pressing the appropriate key on the console during the boot sequence.
Text console press 4
Graphics console press F4
Firmware level displays at top of screen.

I want to try a Y2K rollover

Can I set the date on my system forward to the year 2000 and then back again?

In general there are no problems with setting a system clock forward, as this is its normal direction. Setting it back, however, will mean that records in system log files used by commands such as sar or last contain invalid or out of range values. Any such problems are normally temporary and are cleared either on a reboot or when the files are cleared by the normal accounting procedures.

It is obviously useless to try a Y2K rollover unless all known Y2K issues have been addressed first.
All available application software updates loaded
Operating system updates
Firmware updates if applicable

Max members in a group
There is no fixed limit to the number of users who may belong to a group, but there are two practical problems. First, I have seen a reference giving a maximum line length of 4096 bytes for the group file. Second, vi, sed, ed, ... all have a maximum edit line buffer of only 2048 bytes.
If you are experiencing problems due to line length, edit the group file to create dummy groups with the same GID on the following lines of the group file.
staff:!:1:moe,larry
staff2:!:1:shemp,curly
If you have allowed the line length to grow beyond that supported by the available editors, the offending line may be chopped out by careful use of the head and tail commands. Having obtained the offending line on its own, use awk to split out the member list and save it as a separate file with one member per line. Use the head and tail commands to split this into blocks of 100 to 200 members, and reassemble.
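The split-and-reassemble step can also be done in a single awk pass. The following is a sketch under stated assumptions: the function name split_group is illustrative, and the dummy group names are formed by appending 2, 3, ... to the original name, following the staff/staff2 example.

```shell
# split_group MAX
# Read group-file lines on standard input and write them back out with at
# most MAX members per line. Extra members go to dummy groups with the same
# GID, named by appending 2, 3, ... to the group name (as in staff, staff2).
split_group() {
    awk -F: -v max="$1" '
        {
            n = split($4, member, ",")
            if (n == 0) { print; next }      # no members: copy line unchanged
            for (i = 1; i <= n; i += max) {
                part = (i - 1) / max + 1
                name = (part == 1) ? $1 : ($1 part)
                list = member[i]
                for (j = i + 1; j < i + max && j <= n; j++)
                    list = list "," member[j]
                printf "%s:%s:%s:%s\n", name, $2, $3, list
            }
        }
    '
}
```

Run it on a copy first, for example split_group 150 < /etc/group > /tmp/group.new, and check that the generated names do not clash with existing groups before installing the result.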

How do I set the clock accurately
If the customer has network access for support from Roadtech, the following command will set the time on any AIX 4.2 or 4.3 system.
ntpdate -s -t 4 -p 8 193.133.123.33 193.133.123.34
If the customer has permanent access via leased line or Frame Relay {not ISDN}, xntpd may be set up using the same IP addresses in server mode. See FAQ.
Note :- The command may be used from the command line as root, from rc.tcpip {after network routing is working}, or from cron. If working via ISDN, each invocation will incur the cost of a phone call.

Customized start conditions for print queue
It is possible to configure multiple queues for a printer. Each queue can then be configured to pre-initialize the printer to slightly different start settings.
The examples below are for an IBM Proprinter connected to a print server.
Queue name :- myqueue
printserver :- print2

Start in condensed print
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -k '+'
Start in normal print
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -k '-'
Start printing at 10 cpi
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -p '10'
Start printing at 12 cpi
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -p '12'
Start printing at 12 cpi fastfont
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -q '0'
Start printing using draft mode
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -q '1'
Start printing using NLQ mode
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -q '2'
Start printing using NLQ2 mode
/usr/lib/lpd/pio/etc/piochpq -q myqueue -d '@print2' -q '3'
Tuning network settings
The network protocols themselves can be "tuned" for a variety of purposes.

Minimize response times
Maximize single connection throughput
Performance over poor link.
In general it is not worth investing a lot of effort in network tuning, as optimizing for one usage pattern will usually make it worse for another.

/etc/hosts
The file /etc/hosts should contain, as a minimum, an entry for the loopback interface.
127.0.0.1 loopback localhost
Entries for the primary IP address of each network card. These should be in the form
IP full.domain.name nickname
Entries for any additional IP addresses to be configured on the network interfaces by use of the ifconfig command.
Any odd IP addresses that are not resolvable via DNS.
The file should be kept as short as possible, as searching a long unindexed text file is much slower than an indexed local DNS lookup.
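A quick way to spot entries that do not follow the "IP full.domain.name nickname" form is to look for lines whose first name has no dot in it. This is a rough sketch only: the function name check_hosts is illustrative, and treating every 127.x address as an allowed exception is our assumption based on the loopback example.

```shell
# check_hosts FILE
# Report host entries whose first name is not fully qualified.
# Comment lines, blank lines and loopback (127.x) entries are skipped.
check_hosts() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }
        $1 ~ /^127\./        { next }
        $2 !~ /\./           { printf "line %d: %s has no domain part\n", NR, $2 }
    ' "$1"
}
```

Run it as check_hosts /etc/hosts. No output means every non-loopback entry starts with a dotted name.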
/etc/netsvc.conf
The default order for resolving hostnames is

bind6 (named for AAAA record){AIX 4.3+}
bind4 (named for A record){AIX 4.3+}
bind (named for A record, or AAAA then A if AIX 4.3+)
nis
local (local /etc/hosts lookup)

For most of our customers, creating the file /etc/netsvc.conf containing the line
hosts=local,bind
or for AIX 4.3 customers
hosts=local,bind4
will be beneficial.
If necessary this order can be overridden by setting the environment variable NSORDER, which takes the same options.
export NSORDER=local
Setting NSORDER to local only at the start of /etc/rc.net may speed system startup, as rc.net is executed by cfgmgr during phase 2 of system startup, at which point the other resolver options are not available, causing the startup process to wait for them to time out.
netsvc.conf is only read when a program starts. Changes will not affect programs that have already started.
/etc/rc.net
This script is executed during phase 2 of system startup by cfgmgr. Various options to do with how the network is configured can be set by calling the no command from rc.net. The options available fall into two groups.
Load time options must be set before the calls that configure the kernel extensions.
/usr/lib/methods/definet >>$LOGFILE 2>&1
/usr/lib/methods/cfginet >>$LOGFILE 2>&1
The other, runtime, options are usually set at the bottom of rc.net, and can be changed at any time from the command line.
The TCP transport layer accepts data from an application for transport over an open connection. This data is sent in blocks of up to size MTU. The TCP send buffer for a connection holds blocks that are waiting to be sent, or have been sent but not acknowledged. The receive buffer holds packets that have been received. For most purposes it makes sense to have the send and receive buffers the same size.
###################################################
# The socket default buffer size (initial advertized TCP window) is being
# set to a default value of 16k (16384). This improves the performance
# for ethernet and token ring networks. Networks with lower bandwidth
# such as SLIP (Serial Line Internet Protocol) and X.25 or higher bandwidth
# such as Serial Optical Link and FDDI would have a different optimum
# buffer size.
# ( OPTIMUM WINDOW = Bandwidth * Round Trip Time )
###################################################
if [ -f /usr/sbin/no ] ; then
    /usr/sbin/no -o tcp_sendspace=16384
    /usr/sbin/no -o tcp_recvspace=16384
fi
##################################################################
# Disable pmtu discovery as this can cause spurious calls
# on ISDN dialup links.
##################################################################
if [ -f /usr/sbin/no ] ; then
    /usr/sbin/no -o udp_pmtu_discover=0
    /usr/sbin/no -o tcp_pmtu_discover=0
fi
/etc/rc.tcpip
Executed from inittab during the final phase of system startup. This script configures all of the network services.
Generally it is better to configure these through smit rather than edit the file directly.
NFS settings and status
If you are using NFS or any other system resources that make use of the services of the Berkeley statd daemon, it is very important that the host names as provided by the client software match the host names listed by DNS. A major source of trouble can be that many Windows PC applications use the computer name of the PC from the registry, and Windows will let you enter any old junk including control characters. This can cause all sorts of problems ranging from errors during system backups to name resolution problems.
statd records the status of any hosts it is or has been talking to in files stored in the directories
/etc/sm {current}
/etc/sm.bak {past}
If you suspect problems, do an ls -b in these directories and check for spurious control characters, and that the names are resolvable via host/nslookup to an IP address and back from the IP address to the same hostname. After fixing problems, delete all files in the directory /etc/sm.bak, then shut down and restart. After restarting, recheck periodically and repeat this fix as necessary.
Diagnostic commands
netstat -rn
netstat -s
netstat -Ien0 10
entstat ent0
snmp
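The ls -b check on the statd directories can be narrowed down so that only the suspicious names are shown. This is a minimal sketch: the function name suspect_names is illustrative, and the set of characters treated as acceptable in a host name (letters, digits, dot, hyphen, underscore) is our assumption.

```shell
# suspect_names DIR
# List entries in DIR whose names contain anything other than letters,
# digits, dot, hyphen or underscore. ls -b shows non-printable characters
# as backslash escapes, so those names fail the pattern and are printed.
suspect_names() {
    ls -b "$1" | grep -v '^[A-Za-z0-9._-]*$'
}
```

Run it against both directories, for example suspect_names /etc/sm and suspect_names /etc/sm.bak. It prints nothing (and returns a non-zero status from grep) when every name looks clean.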
File Permissions
