Wednesday, October 10, 2012


Setting Up the Flash Recovery Area - Oracle

(Excerpted from Robert Freeman)

To set up the flash recovery area, you need to configure the following parameters (which are new in Oracle Database 10g):
db_recovery_file_dest_size
Example:
Alter system set db_recovery_file_dest_size=20G scope=both;
Purpose:
This parameter sets the allocated size of the flash recovery area, and must be defined in order to enable the flash recovery area. This allows you to control how much disk space will be allocated to the flash recovery area.
You should not set this value greater than the total amount of disk space actually available to you. Otherwise, backups will fail.
db_recovery_file_dest
Example:
Alter system set db_recovery_file_dest= '/u01/oracle/flash_recovery' scope=both;
Purpose:
This is the location of the flash recovery area. The parameter can be set to any valid file system, or you can use Oracle Database 10g Automatic Storage Management (ASM) disk group.
Note that you must specify the db_recovery_file_dest_size parameter before you specify the db_recovery_file_dest parameter. Failure to do so will result in an ORA-32001 error message. In a similar fashion, you must disable the db_recovery_file_dest parameter before you reset the db_recovery_file_dest_size parameter. Leaving db_recovery_file_dest empty disables the flash recovery area. Here is an example of disabling the flash recovery area by resetting the db_recovery_file_dest parameter:
Alter system set db_recovery_file_dest='' scope=both;
Finally, in an Oracle Real Application Clusters environment, you cannot specify these settings for a specific instance; they must be consistent throughout the whole cluster.
Flash Recovery Area Views
The V$RECOVERY_FILE_DEST view, new in Oracle Database 10g, provides an overview of the recovery area that is defined in your database. It provides the size that the flash recovery area is configured for, the amount of space used, how much space can be reclaimed, and the number of files in the flash recovery area.
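For example, a quick illustrative query against V$RECOVERY_FILE_DEST shows the configured limit, current usage, reclaimable space, and file count:

select
   name,
   space_limit/1024/1024       limit_mb,
   space_used/1024/1024        used_mb,
   space_reclaimable/1024/1024 reclaimable_mb,
   number_of_files
from
   v$recovery_file_dest;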
A new column, IS_RECOVERY_DEST_FILE, can be found in a number of Oracle Database 10g's V$ views, such as V$CONTROLFILE, V$LOGFILE, V$ARCHIVED_LOG, V$DATAFILE_COPY, and V$BACKUP_PIECE. This column is a Boolean that indicates whether or not the file is in a flash recovery area.
Another new column, BYTES, can be found in the views V$BACKUP_PIECE and RC_BACKUP_PIECE (an RMAN recovery catalog view). This column indicates the size, in bytes, of the backup-set piece. This can be used to help you determine how much of the flash recovery area your backups are already consuming.
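For example, here is a simple sketch of a query that lists the backup pieces sitting in the flash recovery area along with their sizes:

select
   handle,
   bytes/1024/1024 size_mb,
   is_recovery_dest_file
from
   v$backup_piece
where
   is_recovery_dest_file = 'YES';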
NOTE - Manually removing files from the flash recovery area can have unexpected consequences. Oracle Database 10g does not immediately detect the removal of these files, and thus the space is not reclaimed. If you end up manually removing files (or lose a disk, perhaps), use the RMAN crosscheck command along with the delete command to cause Oracle Database 10g to update the current control file information on the flash recovery area.
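For example, after files have gone missing at the operating system level, an RMAN cleanup session along these lines (an illustrative sketch) marks the missing pieces as expired and then removes their records:

RMAN> crosscheck backup;
RMAN> crosscheck archivelog all;
RMAN> delete expired backup;
RMAN> delete expired archivelog all;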
RMAN Commands Related to Flash Recovery Areas
RMAN has been enhanced with new commands that allow you to back up and restore the flash recovery area. The RMAN command backup recovery area allows you to back up all files required to restore the database via RMAN from a recovery area to an sbt (tape) device. The following types of files are backed up with this command:
  • Full and incremental backup sets
  • Control file autobackups
  • Archive logs
  • Datafile copies
Note that this command does not back up the following:
  • Flashback logs
  • Incremental bitmaps
  • The current control file
  • Online redo logs
As you have seen, the RMAN command backup recovery area backs up all files needed for recovery in the flash recovery area. There is a second command, backup recovery files, that backs up all recovery files that are on the disk, wherever they may be (in flash recovery areas or otherwise). The backup recovery files command must also go to an sbt device and cannot go to disk.
NOTE -- The backup recovery area and backup recovery files commands are nice commands to have available when you do your primary backups to disk but want to later back up those backup sets to tape!
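As a sketch, assuming your media manager is configured so that an sbt channel can be allocated, the commands can be run like this:

RMAN> run {
  allocate channel t1 device type sbt;
  backup recovery area;       # back up the flash recovery area to tape
  # backup recovery files;    # or: back up all recovery files on disk
}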
RMAN Backup and Restore to a Flash Recovery Area
When a flash recovery area is defined (via the db_recovery_file_dest parameter), RMAN sends backups directly to the flash recovery area. If you are using a local or CFS file system, you will find that RMAN creates a directory structure for the flash recovery area. Typically, the structure includes a directory for the database being backed up and, underneath that directory, another directory for the type of backup.
Recoveries also use the flash recovery area if the appropriate backup set is within the flash recovery area. Also, you can specify a recovery area to use when restoring a control file or SPFILE from an autobackup by using the new recovery area clause, as shown in this example:
RMAN> Restore controlfile from autobackup using recovery area 'c:\recovery';
Other Flash Recovery Area Features
The alter database add logfile and alter database add standby logfile commands, by default, now create an online redo log member in the flash recovery area if the OMF-related parameter db_create_online_log_dest_n is not set. The alter database drop logfile and alter database rename file commands also support files in the flash recovery area.
During database creation, Oracle Database 10g can use the flash recovery area to store the database control file and online redo logs. If the OMF-related parameter db_create_online_log_dest_n is defined, then the control file and redo logs will be created in those locations, but will not be created in the flash recovery area, even if the flash recovery area is defined. If db_create_online_log_dest_n is not defined but db_create_file_dest is defined, then the control file and online redo logs will be created in the location defined by db_create_file_dest. If the parameter db_recovery_file_dest is also defined, then a copy of the control file and online redo logs will be created there as well. Finally, if only db_recovery_file_dest is defined, then the control file will be created in that location. If none of these parameters is defined, then the control file and online redo logs will be created in a default, OS-specific location.
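For illustration only (the directory paths below are hypothetical), the OMF parameters mentioned above could be set as follows, which then steers where the control file and online redo logs are created:

Alter system set db_create_file_dest='/u02/oradata' scope=both;
Alter system set db_create_online_log_dest_1='/u03/oradata/onlinelog' scope=both;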

Wednesday, July 25, 2012

Oracle Automatic undo management and transactions_per_rollback_segment

 


Oracle Tips by Burleson Consulting

 
I’ve been doing a lot of database health checks on 9i and now 10g databases recently. Most are using the automatic undo management feature and on the whole it does a pretty good job of managing the undo segments (for you other old timers, rollback segments).

However, I have been noticing, shall I say, some rather retro behavior as a result of the automatic undo management. In the bad old days when we managed the undo segments manually we would tune to reduce extends and the subsequent shrinks which resulted. The shrinks would cause waits on the next session to use the segment as the segment shrank back to the “optimal” setting.

The usual method to set the initial, next and optimal was to examine the rollback segment views and determine such values as average transaction size, max transaction size and also, determine the number of active DML and DDL statements (SELECT didn’t and doesn’t really count for much in rollback/undo activity, generally speaking). From these values we could set initial, next and optimal to reduce over extending the segments and reduce the subsequent shrinks and waits as well as the needed number of segments.

What seems to be happening is that Oracle looks at two basic parameters, TRANSACTIONS (based on 1.1*SESSIONS) and TRANSACTIONS_PER_ROLLBACK_SEGMENT, and then uses an internal algorithm to determine the number of undo segments to create in the undo tablespace. The size seems to be determined by the number created and the overall size of the tablespace. So, if you set up for 300 SESSIONS, this usually means about 330 TRANSACTIONS; TRANSACTIONS_PER_ROLLBACK_SEGMENT defaults to 5, so Oracle right from the gate assumes you will ultimately need 66 undo segments. Seems they forgot that, generally speaking, only about 1 in 10 "transactions" in most databases actually do DML/DDL and that 90% are usually SELECT. In almost every Oracle database I have seen that uses automatic undo and reaches close to the SESSIONS setting in actual connected users, Oracle over-allocates the number of undo segments, leaving sometimes dozens offline and never used.

The other thing I see a great deal of is the old extends, shrinks and waits we used to spend so much time tuning away. In many cases I also see the old ORA-01555 (snapshot too old) errors coming back. If the undo segment tablespace is too small and Oracle creates too many small segments, then it is quite easy to see why.

So, am I saying don’t use automatic undo? No, not at all. I say use the automatic undo, but understand how to use it wisely. Essentially, utilize the TRANSACTIONS_PER_ROLLBACK_SEGMENT to control the number of segments created, and size the undo tablespace large enough that the segments are sized appropriately. In addition, if you are not going to use 300 sessions, don’t set the SESSIONS to 300! Make sure to align the SESSIONS parameter to the proper number of expected sessions.

If you need to change the undo segment configuration in your environment (look at the v$rollstat view to see if you have excessive waits, shrinks and extends) you will need to alter the parameters, configure a second undo segment tablespace, and then restart the database (if you changed SESSIONS or TRANSACTIONS_PER_ROLLBACK_SEGMENT) to utilize the new settings.
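A quick illustrative look at the undo segment statistics can be taken from v$rollstat, for example:

select
   usn,
   extends,
   shrinks,
   waits,
   wraps,
   gets
from
   v$rollstat
order by
   usn;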

What seems to be happening is that, as a start, the Oracle algorithm creates 10 active undo segments and sets the MAX_ROLLBACK_SEGMENTS parameter equal to TRANSACTIONS/TRANSACTIONS_PER_ROLLBACK_SEGMENT. As the number of sessions increases, Oracle adds a new segment at each increment of TRANSACTIONS_PER_ROLLBACK_SEGMENT above 10*TRANSACTIONS_PER_ROLLBACK_SEGMENT that your user count reaches. It doesn't seem to care whether the session is doing anything; it just has to be present. Oracle leaves the new segment offline, just taking up space, unless the user does DML or DDL. The minimum setting Oracle seems to utilize is 30 for MAX_ROLLBACK_SEGMENTS. For example, a SESSIONS setting of 300 resulted in a TRANSACTIONS setting of 330; with the default TRANSACTIONS_PER_ROLLBACK_SEGMENT of 5, the MAX_ROLLBACK_SEGMENTS parameter was set to 66. With a setting of 20, instead of a new setting of 17 (330/20 rounded up) we get a setting of 30. If we set it to 10, we get a setting of 33. Note that even if you manually set the MAX_ROLLBACK_SEGMENTS parameter, your setting will be overridden with the calculated one when automatic UNDO management is turned on.
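To see how many undo segments Oracle has actually created, and how many of them sit offline and unused, a simple illustrative query against dba_rollback_segs will do:

select
   status,
   count(*)
from
   dba_rollback_segs
group by
   status;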

So watch the settings of SESSIONS, TRANSACTIONS_PER_ROLLBACK_SEGMENT and the size of the undo tablespace to properly use the automatic undo feature in Oracle9i and 10g.

Oracle log_buffer sizing tips


Overview of redo log tuning

Important note for Oracle 10gR2 and beyond:  Per MOSC note 351857.1, starting in release 10.2 and beyond, Oracle will automatically size the log_buffer on your behalf, and log_buffer cannot be changed dynamically.  The automatic log_buffer sizing is based on the granule size (as determined by _ksmg_granule_size):
select
   a.ksppinm name,
   b.ksppstvl value,
   a.ksppdesc description
from
   x$ksppi a,
   x$ksppcv b
where
   a.indx = b.indx
and
   a.ksppinm = '_ksmg_granule_size';

NAME                           VALUE                DESCRIPTION
------------------------------ -------------------- --------------------
_ksmg_granule_size             16777216             granule size in bytes
Also note that if you are using Oracle's Automatic Memory Management (AMM), which is not recommended for some databases, the log_buffer is part of the memory_target algorithm.

Tuning the redo log in Oracle

The steps for tuning redo log performance are straightforward:
1 - Determine the optimal sizing of the log_buffer.
 
2 - Size online redo logs to control the frequency of log switches and minimize system waits.

3 - Optimize the redo log disk to prevent bottlenecks.  In high-update databases, no amount of disk tuning may relieve redo log bottlenecks, because Oracle must push all updates, for all disks, into a single redo location.
Once you have optimized your redo and I/O sub-system, you have few options to relieve redo-induced contention.  This can be overcome by employing super-fast solid-state disk for your online redo log files, since SSD has far greater bandwidth than platter disk.  For complete details on Oracle redo tuning and redo diagnostic scripts, see my book "Oracle Tuning: The Definitive Reference". 

Optimizing the log_buffer region

The log_buffer is one of the most complex of the Oracle RAM region parameters to optimize, but it's a low-resource parameter (only using a few meg of RAM), so the goal in sizing log_buffer is to set a value that results in the least overall amount of log-related wait events. 
The big issue with the log buffer is determining the optimal size for the log_buffer in a busy, high-DML database. A too-small log_buffer is commonly associated with high "redo log space requests" waits, while a too-large log_buffer may result in high "log file sync" waits.
For more details on log_buffer sizing, see Bug 4930608 and MOSC Note 604351.1.
Per MOSC note 216205.1, Database Initialization Parameters for Oracle Applications 11i, a log_buffer size of 10 megabytes is recommended for Oracle Applications, a typical online database:
A value of 10MB for the log buffer is a reasonable value for Oracle Applications and it represents a balance between concurrent programs and online users.
The value of log_buffer must be a multiple of redo block size, normally 512 bytes.
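In releases where you set log_buffer yourself (pre-10.2, or when overriding the automatic sizing), the parameter is static, so it is set in the spfile and takes effect after a restart. An illustrative example using the 10 megabyte value mentioned above (10485760 bytes, a multiple of the 512-byte redo block size):

Alter system set log_buffer=10485760 scope=spfile;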

Obsolete advice - Use a small log_buffer

Oracle has traditionally suggested keeping the log_buffer small, no greater than one meg. However, I have seen numerous shops where increasing log_buffer beyond one meg greatly improved throughput and relieved undo contention.
The "keep it small" advice is perpetuated by MOSC notes that have become somewhat obsolete:
"In a busy system, a value 65536 or higher is reasonable [for log_buffer]. It has been noted previously that values larger than 5M may not make a difference."
MOSC notes that in 10gR2 we see a case where a customer cannot reduce the log_buffer size below 16 meg:
"In 10G R2, Oracle combines fixed SGA area and redo buffer [log buffer] together. If there is a free space after Oracle puts the combined buffers into a granule, that space is added to the redo buffer. Thus you see redo buffer has more space than expected. This is an expected behavior...
In 10.2 the log buffer is rounded up to use the rest of the granule. The granule size can be found from the hidden parameter "_ksmg_granule_size" and in your case is probably 16Mb. The calculation for the granule size is a little convoluted but it depends on the number of datafiles."
If the log_buffer has been set too high (e.g. greater than 20 meg), it can also cause performance problems, because the large log buffer size means writes are effectively performed synchronously, evidenced by high "log file sync" waits.  Oracle consultant Steve Adams notes details on how Oracle processes log file sync waits:
"Before writing a batch of database blocks, DBWn finds the highest high redo block address that needs to be synced before the batch can be written.

DBWn then takes the redo allocation latch to ensure that the required redo block address has already been written by LGWR, and if not, it posts LGWR and sleeps on a log file sync wait."
Detecting an undersized log_buffer
Here is an AWR report showing a database with an undersized log_buffer, in this case because the DBA did not set the log_buffer parameter in their init.ora file:
                                                     Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file sequential read            4,275          0        229     54      0.0
log buffer space                       12          0          3    235      0.0
 
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                      163,182    88.23
db file sequential read                         1,541,854       8,551     4.62
log file sync                                   1,824,469       8,402     4.54
log file parallel write                         1,810,628       2,413     1.30
SQL*Net more data to client                    15,421,202         687      .37
 
It's important to note that log buffer shortages do not always manifest in the top-5 timed events, especially if there are other SGA pool shortages.  Here is an example of an Oracle 10g database with an undersized log buffer, in this example 512k (this is the database as I found it, and there was a serious data buffer shortage causing excessive disk I/O):
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                        % Total
Event                                 Waits    Time (s)   DB Time     Wait Class
------------------------------ ------------ ----------- --------- -----------
log file parallel write               9,670         291     55.67 System I/O
log file sync                         9,293         278     53.12 Commit
CPU time                                            225     43.12
db file parallel write                4,922         201     38.53 System I/O
control file parallel write           1,282          65     12.42 System I/O
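Between AWR snapshots, a quick cumulative check of the log-related wait events can be taken from v$system_event, for example:

select
   event,
   total_waits,
   time_waited
from
   v$system_event
where
   event in ('log buffer space','log file sync','log file parallel write')
order by
   time_waited desc;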
 
Log buffer related parameter issues
In addition to re-sizing log_buffer, you can also adjust the hidden Oracle 10g parameter _log_io_size (but only at the direction of Oracle technical support) and adjust your transactions_per_rollback_segment parameter.  In 10g, the _log_io_size parameter governs the offload threshold, and it defaults to log_buffer/3.
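The current value of _log_io_size can be inspected (as SYS) with the same style of x$ksppi query shown earlier for the granule size:

select
   a.ksppinm name,
   b.ksppstvl value
from
   x$ksppi a,
   x$ksppcv b
where
   a.indx = b.indx
and
   a.ksppinm = '_log_io_size';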
The transactions_per_rollback_segment parameter specifies the number of concurrent transactions you expect each rollback segment to have to handle.   The Oracle 10g documentation notes:
"TRANSACTIONS_PER_ROLLBACK_SEGMENT specifies the number of concurrent transactions you expect each rollback segment to have to handle. The minimum number of rollback segments acquired at startup is TRANSACTIONS divided by the value for this parameter.
For example, if TRANSACTIONS is 101 and this parameter is 10, then the minimum number of rollback segments acquired would be the ratio 101/10, rounded up to 11."
At startup time, Oracle divides transactions by transactions_per_rollback_segment to have enough rollback space, and Oracle guru Osamu Kobayashi has a great test of transactions_per_rollback_segment:
"I set transactions_per_rollback_segment to 4. The result of 2-4. in fact suggests that twenty-one transactions are specified. Twenty-one transactions also match the number of transactions_per_rollback_segment that can be specified at the maximum.

As far as I analyze, one rollback segment can handle up to twenty-one transactions at a time, regardless of transactions_per_rollback_segment value.
This transactions_per_rollback_segment parameter therefore is used to determine the number of public rollback segments to start."

 

CPUs, log_buffer sizing and multiple log writer processes

The number of CPUs is also indirectly related to the value of log_buffer, and MOSC discusses multiple LGWR slaves that are used to asynchronously offload the redo information.
The hidden parameter _lgwr_io_slaves appears to govern the appearance of multiple log writer slaves, and a MOSC note clearly states that multiple LGWR processes will only appear under high activity.  The Oracle docs are very clear on this:
“Prior to Oracle8i you could configure multiple log writers using the LGWR_IO_SLAVES parameter.”
In Oracle10g it becomes a hidden parameter (_lgwr_io_slaves).  MOSC note 109582.1 says that log I/O factotum processes started way-back in Oracle8 and that they will only appear as DML activity increases:
“Starting with Oracle8, I/O slaves are provided.  These slaves can perform asynchronous I/O even if the underlying OS does not support Asynchronous I/O. These slaves can be deployed by DBWR, LGWR, ARCH and the process doing Backup. . . 
In Oracle8i, the DBWR_IO_SLAVES parameter determines the number of IO slaves for LGWR and ARCH. . .
As there may not be substantial log writing taking place, only one LGWR IO slave has been started initially.  This may change when the activity increases.”
The Oracle8 docs note that the value for the parameter log_simultaneous_copies is dependent on the number of CPUs on the server: 
“On multiple-CPU computers, multiple redo copy latches allow multiple processes to copy entries to the redo log buffer concurrently. The default value of LOG_SIMULTANEOUS_COPIES is the number of CPUs available to your Oracle instance”
Starting in Oracle8i, it's a hidden parameter (_log_simultaneous_copies). From MOSC note 147471.1, "Tuning the Redo log Buffer Cache and Resolving Redo Latch Contention", we see that the default is set to cpu_count * 2.  Also, it notes that multiple redo allocation latches become possible by setting the parameter _log_parallelism, and that the log buffer is split into multiple _log_parallelism areas that each have a size equal to the log_buffer.  Further, MOSC discusses the relationship of log_buffer to the number of CPUs:
“The number of redo allocation latches is determined by init.ora LOG_PARALLELISM.   The redo allocation latch allocates space in the log buffer cache for each transaction entry. 
If transactions are small, or if there is only one CPU on the server, then the redo allocation latch also copies the transaction data into the log buffer cache.”
We also see that log file parallel writes are related to the number of CPUs.  MOSC note 34583.1, "WAITEVENT: 'log file parallel write' Reference Note", shows that the log_buffer size is related to parallel writes (i.e. the number of CPUs), and discusses how LGWR must wait until all parallel writes are complete.  It notes that solutions to high "log file parallel write" waits are directly related to I/O speed, recommending that redo log members be placed on high-speed disks and that redo logs be segregated onto disks with little or no I/O activity from other sources (including low activity from other sources against the same disk controller).
This is a strong argument for using super-fast solid-state disk.  
Here are some great tips by Steve Adams for sizing your log_buffer:
"If the log buffer is too small, then log buffer space waits will be seen during bursts of redo generation. LGWR may not begin to write redo until the _log_io_size threshold (by default, 1/3 of the log buffer or 1M whichever is less) has been exceeded, and the remainder of the log buffer may be filled before LGWR can complete its writes and free some space in the log buffer.
Ideally, the log buffer should be large enough to cope with all bursts of redo generation, without any log buffer space waits.
Commonly, the most severe bursts of redo generation occur immediately after a log switch, when redo generation has been disabled for some time, and there is a backlog of demand for log buffer space"
Thanks,
Sreedhar.

Tuesday, July 24, 2012

OS Watcher Black Box User's Guide



Please Note: OSW has been renamed to OSWbb (OSWatcher Black Box) to prevent confusion, as there are several tools now within Oracle that share this same name.
OSWbb now provides an analysis tool, OSWbba, which analyzes the log files produced by OSWbb. This tool allows OSWbb to be self-analyzing. It also provides a graphing capability to graph the data and to produce an HTML profile.
To collect database metrics in addition to OS metrics, consider running LTOM.

Contents
  • Introduction
  • Overview
  • Supported Platforms
  • Gathering Diagnostic Data
    • Installing OSWbb
    • Uninstalling OSWbb
    • Setting up OSWbb
    • Starting OSWbb
    • Stopping OSWbb
  • Diagnostic Data Output
    • oswiostat
    • oswmpstat
    • oswnetstat
    • oswprvtnet
    • oswps
    • oswtop
    • oswvmstat
  • Graphing the Output
  • Known Issues
  • Download
  • Reporting Feedback
  • Sending Files To Support

Introduction
 OS Watcher Black Box (OSWbb) is a collection of UNIX shell scripts intended to collect and archive operating system and network metrics to aid support in diagnosing performance issues. OSWbb operates as a set of background processes on the server and gathers OS data on a regular basis, invoking such Unix utilities as vmstat, netstat and iostat. OSWbb can be downloaded from this note. OSWbb is also included in the RAC-DDT script file, but is not installed by RAC-DDT. For more information on RAC-DDT see <>. OSWbb is installed on each node where data is to be collected. Installation instructions for OSWbb are provided in this user guide.

Overview

OSWbb consists of a series of shell scripts. OSWatcher.sh is the main controlling executive, which spawns individual shell processes to collect specific kinds of data, using Unix operating system diagnostic utilities. Control is passed to individually spawned operating system data collector processes, which in turn collect specific data, timestamp the data output, and append the data to pre-generated and named files. Each data collector will have its own file, created and named by the File Manager process.
Data collection intervals are configurable by the user, but will be uniform for all data collector processes for a single instance of the OSWbb tool. For example, if OSWbb is configured to collect data once per minute, each spawned data collector process will generate output for its respective metric, write data to its corresponding data file, then sleep for one minute (or other configured interval) and repeat. Because we are collecting data every minute, the files generated by each spawned process will contain 60 entries, one for each minute during the previous hour. Each file will contain, at most, one hour of data. At the end of each hour, File Manager will wake up and copy the existing current hour file to an archive location, then create a new current hour file.
The File Manager ensures only the last N hours of information are retained, where N is a configurable integer defaulting to 48. File Manager will wake up once per hour to delete files older than N hours. At any time, the entire output file set will consist of one current hour file, plus N archive files for each data collector process.
stopOSWbb.sh will terminate all processes associated with OSWbb, and is the normal, graceful mechanism for stopping the tool's operation.
OSWbb invokes these distinct operating system utilities, each as a distinct background process, as data collectors. These utilities will be supported, or their equivalents, as available for each supported target platform.
  • ps
  • top
  • mpstat
  • iostat
  • netstat
  • traceroute
  • vmstat

Supported Platforms

OSWbb is certified to run on the following platforms:
  • AIX
  • Tru64
  • Solaris
  • HP-UX
  • Linux

Gathering Diagnostic Data

Installing OSWbb
 OSWbb needs to be installed on each node, one installation per node. OSWbb should be installed manually by using the following procedure:
NOTE: OSWbb is available through MOS and can be downloaded as a tar file. The user then copies the file oswbb.tar to the directory where OSWbb is to be installed and issues the following commands.




tar xvf oswbb.tar
A directory named oswbb is created which houses all the files associated with OSWbb. OSWbb is now installed.


Uninstalling OSWbb
To de-install OSWbb issue the following command on the oswbb directory.




rm -rf oswbb


Setting up OSWbb
New in this release is the ability to control the archive directory location where OSWbb stores the data it collects. By default this directory is created under the oswbb directory where OSWbb is installed. To change this location, set the UNIX environment variable OSWBB_ARCHIVE_DEST to the desired location before starting the tool. Once OSWbb is installed, scripts are provided to start and stop the OSWbb utility. When OSWbb is started for the first time it creates the archive subdirectory. The archive directory contains 7 subdirectories, one for each data collector. Data collectors exist for top, vmstat, iostat, mpstat, netstat, ps and an optional collector for tracing private networks. To turn on data collection for private networks, the user must create an executable file in the oswbb directory named private.net. An example of what this file should look like, with samples for each operating system (Solaris, Linux, AIX, HP-UX, etc.), is provided as the file Example private.net in the oswbb directory. This file can be edited and renamed private.net, or a new file named private.net can be created. This file contains entries for running the traceroute command to verify RAC private networks.
Example private.net entry on Solaris:




traceroute -r -F node1 
traceroute -r -F node2
Where node1 and node2 are the other 2 nodes (in addition to the host node) of a 3-node RAC cluster. If the file private.net does not exist or is not executable, then no data will be collected and stored under the oswprvtnet directory.
OSWbb will need access to the OS utilities: top, vmstat, iostat, mpstat, netstat, and traceroute. These OS utilities need to be installed on the system prior to running OSWbb.  Execute permission on these utilities needs to be granted to the OSWbb user.


Starting OSWbb
To start the OSWbb utility execute the startOSWbb.sh shell script from the directory where OSWbb was installed. This script has 2 arguments which control the frequency that data is collected and the number of hours' worth of data to archive.
ARG1 = snapshot interval in seconds.
ARG2 = the number of hours of archive data to store.
If you do not enter any arguments the script runs with default values of 30 and 48 meaning collect data every 30 seconds and store the last 48 hours of data in archive files.
Example 1:




./startOSWbb.sh 60 10

This would start the tool and collect data at 60 second intervals and log the last 10 hours of data to archive files.
Example 2:




./startOSWbb.sh
NOTE: This would use the default values of 30, 48 and collect data at 30 second intervals and log the last 48 hours of data to archive files.
Example 3:




nohup ./startOSWbb.sh 60 10 &
This would start the tool, put the process in the background, enable the tool to continue running after the session has been terminated, collect data at 60 second intervals, and log the last 10 hours of data to archive files.


Stopping OSWbb
To stop the OSWbb utility execute the stopOSWbb.sh command from the directory where OSWbb was installed. This terminates all the processes associated with the tool.
Example:




./stopOSWbb.sh


Diagnostic Data Output
As stated above, when OSWbb is started for the first time it creates the archive subdirectory under the OSWbb installation directory. The archive directory contains 7 subdirectories, one for each data collector. These directories are named oswiostat, oswmpstat, oswnetstat, oswprvtnet, oswps, oswtop, and oswvmstat. One file per hour will be generated in each of the 7 OS utility subdirectories with the exception of oswprvtnet which is dependent on having private networks tracing configured. A new file is created at the top of each hour during the time that OSWbb is running. The file will be in the following format:




<node name>_<OS utility>_YY.MM.DD.HH24.dat

Details about each type of data file can be viewed below:

oswiostat
oswmpstat
oswnetstat
oswprvtnet
oswps
oswtop
oswvmstat


oswiostat
<node name>_iostat_YY.MM.DD.HH24.dat
These files will contain output from the 'iostat' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'iostat' is installed on the OS and if the OSWbb user has privileges to run the utility.
The iostat command is used for monitoring system input/output device loading by observing the time the physical disks are active in relation to their average transfer rates. This information can be used to change system configuration to better balance the input/output load between physical disks and adapters.
The iostat utility is fairly standard across UNIX platforms, but it is really only useful for those platforms that support extended disk statistics: AIX, Solaris and Linux. Also, each platform will have a slightly different version of the iostat utility. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the iostat utility at the specified interval and stores the data in the oswiostat subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the iostat output. Notice there is one entry for each timestamp.




Sample iostat file produced by OSWbb




extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.3 0.0 2.1 0.0 0.0 3.4 0.8 0 0 c0t0d0
0.0 2.1 0.1 12.9 0.0 0.0 0.6 0.4 0 0 c0t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
2.9 1.2 240.8 1.5 0.0 0.1 0.0 13.3 0 5 c1t0d0
1.1 0.8 18.0 8.8 0.0 0.0 0.1 5.9 0 1 c1t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t1d0
Field Descriptions
The iostat output contains summary information for all devices.




Field Description
r/s Shows the number of reads/second
w/s Shows the number of writes/second
kr/s Shows the number of kilobytes read/second
kw/s Shows the number of kilobytes written/second
wait Average number of transactions waiting for service (queue length)
actv Average number of transactions actively being serviced
wsvc_t Average service time in wait queue, in milliseconds
asvc_t Average service time of active transactions, in milliseconds
%w Percent of time there are transactions waiting for service
%b Percent of time the disk is busy
device Device name
What to look for
  • Average service times greater than 20msec for long duration.
  • High average wait times.


oswmpstat 
<node name>_mpstat_YY.MM.DD.HH24.dat
These files will contain output from the 'mpstat' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'mpstat' is installed on the OS and if the OSWbb user has privileges to run the utility.
The mpstat command collects and displays performance statistics for all logical CPUs in the system.
The mpstat utility is fairly standard across UNIX platforms. Each platform will have a slightly different version of the mpstat utility. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the mpstat utility at the specified interval and stores the data in the oswmpstat subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the mpstat output. Notice there are 2 entries for each timestamp. You should always ignore the first entry as this entry is always invalid.




Sample mpstat file produced by OSWbb




***Fri Jan 28 12:50:36 EST 2005
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 0 483 383 118 1 0 0 0 64 0 0 0 100
0 1268 0 0 486 382 414 42 0 0 0 2902 8 24 0 68
0 4 0 0 479 379 144 3 0 0 0 96 0 0 0 100
Field Descriptions




Field Description
cpu Processor ID
minf Minor faults
mjf Major faults
xcal Processor cross-calls (when one CPU wakes up another by interrupting it).
intr Interrupts
ithr Interrupts as threads (except clock)
csw Context switches
icsw Involuntary context switches
migr Thread migrations to another processor
smtx Number of times a CPU failed to obtain a mutex
srw Number of times a CPU failed to obtain a read/write lock on the first try
syscl Number of system calls
usr Percentage of CPU cycles spent on user processes
sys Percentage of CPU cycles spent on system processes
wt Percentage of CPU cycles spent waiting on event
idl Percentage of unused CPU cycles or idle time when the CPU is basically doing nothing

What to look for
  • Involuntary context switches (this is probably the more relevant statistic when examining performance issues.)
  • Number of times a CPU failed to obtain a mutex. Values consistently greater than 200 per CPU causes system time to increase.
  • xcal is very important; it shows processor cross-calls (one CPU interrupting another)


oswnetstat 
<node name>_netstat_YY.MM.DD.HH24.dat
These files will contain output from the 'netstat' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'netstat' is installed on the OS and if the OSWbb user has privileges to run the utility.
The netstat command displays current TCP/IP network connections and protocol statistics.
The netstat utility is standard across UNIX platforms. Each platform will have a slightly different version of the netstat utility. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the netstat utility at the specified interval and stores the data in the oswnetstat subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the netstat output.
The netstat utility has many command line flags; the ones most commonly used to troubleshoot RAC are "-ain" for the interface level output and "-s" for the protocol level statistics. The following are examples for the two different command parameters.
The command line options "-ain" have these effects:




Option Description
-a The command output will use the logical names of the interface. It will also report the name of the IP address found through normal IP address resolution methods.
-i This triggers the interface-specific statistics, the columns of which are outlined in the table below
-n This causes the output to use IP addresses instead of the resolved names
Example netstat file produced by OSWbb:




Sample netstat file produced by OSWbb




***Fri Jan 28 12:50:36 EST 2005
Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
lo0 8232 127.0.0.0 127.0.0.1 296065 0 296065 0 0 0
eri0 1500 138.1.140.0 138.1.140.96 0 176244 2 191951 0
RAWIP
rawipInDatagrams = 0 rawipInErrors = 0
rawipInCksumErrs = 0 rawipOutDatagrams = 0
rawipOutErrors = 0
UDP
udpInDatagrams = 295719 udpInErrors = 0
udpOutDatagrams = 295671 udpOutErrors = 0
TCP
tcpRtoAlgorithm = 4 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens = 27 tcpPassiveOpens = 21
tcpAttemptFails = 6 tcpEstabResets = 0
tcpCurrEstab = 15 tcpOutSegs = 691
tcpOutDataSegs = 479 tcpOutDataBytes = 43028
tcpRetransSegs = 0 tcpRetransBytes = 0
tcpOutAck = 212 tcpOutAckDelayed = 83
tcpOutUrg = 0 tcpOutWinUpdate = 0
tcpOutWinProbe = 0 tcpOutControl = 85
tcpOutRsts = 10 tcpOutFastRetrans
tcpInSegs = 915 = 0
tcpInAckSegs = 489 tcpInAckBytes = 43023
tcpInDupAck = 42 tcpInAckUnsent = 0
tcpInInorderSegs = 477 tcpInInorderBytes = 40640
tcpInUnorderSegs = 0 tcpInUnorderBytes = 0
tcpInDupSegs = 0 tcpInDupBytes = 0
tcpInPartDupSegs = 0 tcpInPartDupBytes = 0
tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
tcpInWinProbe = 0 tcpInWinUpdate = 0
tcpInClosed = 0 tcpRttNoUpdate = 0
tcpRttUpdate = 462 tcpTimRetrans = 0
tcpTimRetransDrop = 0 tcpTimKeepalive = 80
tcpTimKeepaliveProbe = 0 tcpTimKeepaliveDrop = 0
tcpListenDrop = 0 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans = 0
IPv4
ipForwarding = 2 ipDefaultTTL = 255
ipInReceives = 17858585 ipInHdrErrors = 0
ipInAddrErrors = 0 ipInCksumErrs = 0
ipForwDatagrams = 0 ipForwProhibits = 0
ipInUnknownProtos = 0 ipInDiscards = 0
ipInDelivers = 296623 ipOutRequests = 17624403
ipOutDiscards = 0 ipOutNoRoutes = 827
ipReasmTimeout = 60 ipReasmReqds = 0
ipReasmOKs = 0 ipReasmFails = 0
ipReasmDuplicates = 0 ipReasmPartDups = 0
ipFragOKs = 0 ipFragFails = 0
ipFragCreates = 0 ipRoutingDiscards = 0
tcpInErrs = 0 udpNoPorts = 225722
udpInCksumErrs = 0 udpInOverflows = 0
rawipInOverflows = 0 ipsecInSucceeded = 0
ipsecInFailed = 0 ipInIPv6 = 0
ipOutIPv6 = 0 ipOutSwitchIPv6 = 5
IPv6
ipv6Forwarding = 2 ipv6DefaultHopLimit = 255
ipv6InReceives = 0 ipv6InHdrErrors = 0
ipv6InTooBigErrors = 0 ipv6InNoRoutes = 0
ipv6InAddrErrors = 0 ipv6InUnknownProtos = 0
ipv6InTruncatedPkts = 0 ipv6InDiscards = 0
ipv6InDelivers = 0 ipv6OutForwDatagrams = 0
ipv6OutRequests = 0 ipv6OutDiscards = 0
ipv6OutNoRoutes = 0 ipv6OutFragOKs = 0
ipv6OutFragFails = 0 ipv6OutFragCreates = 0
ipv6ReasmReqds = 0 ipv6ReasmOKs = 0
ipv6ReasmFails = 0 ipv6InMcastPkts = 0
ipv6OutMcastPkts = 0 ipv6ReasmDuplicates = 0
ipv6ReasmPartDups = 0 ipv6ForwProhibits = 0
udpInCksumErrs = 0 udpInOverflows = 0
rawipInOverflows = 0 ipv6InIPv4 = 0
ipv6OutIPv4 = 0 ipv6OutSwitchIPv4 = 0
ICMPv4
icmpInMsgs = 17624914 icmpInErrors = 0
icmpInCksumErrs = 0 icmpInUnknowns = 0
icmpInDestUnreachs = 72 icmpInTimeExcds = 0
icmpInParmProbs = 0 icmpInSrcQuenchs = 0
icmpInRedirects = 0 icmpInBadRedirects = 0
icmpInEchos = 17624842 icmpInEchoReps = 0
icmpInTimestamps = 0 icmpInTimestampReps = 0
icmpInAddrMasks = 0 icmpInAddrMaskReps = 0
icmpInFragNeeded = 0 icmpOutMsgs = 17624920
icmpOutDrops = 225716 icmpOutErrors = 0
icmpOutDestUnreachs = 78 icmpOutTimeExcds = 0
icmpOutParmProbs = 0 icmpOutSrcQuenchs = 0
icmpOutRedirects = 0 icmpOutEchos = 0
icmpOutEchoReps = 17624842 icmpOutTimestamps = 0
icmpOutTimestampReps = 0 icmpOutAddrMasks = 0
icmpOutAddrMaskReps = 0 icmpOutFragNeeded = 0
icmpInOverflows = 0
ICMPv6
icmp6InMsgs = 0 icmp6InErrors = 0
icmp6InDestUnreachs = 0 icmp6InAdminProhibs = 0
icmp6InTimeExcds = 0 icmp6InParmProblems = 0
icmp6InPktTooBigs = 0 icmp6InEchos = 0
icmp6InEchoReplies = 0 icmp6InRouterSols = 0
icmp6InRouterAds = 0 icmp6InNeighborSols = 0
icmp6InNeighborAds = 0 icmp6InRedirects = 0
icmp6InBadRedirects = 0 icmp6InGroupQueries = 0
icmp6InGroupResps = 0 icmp6InGroupReds = 0
icmp6InOverflows = 0
icmp6OutMsgs = 0 icmp6OutErrors = 0
icmp6OutDestUnreachs = 0 icmp6OutAdminProhibs = 0
icmp6OutTimeExcds = 0 icmp6OutParmProblems = 0
icmp6OutPktTooBigs = 0 icmp6OutEchos = 0
icmp6OutEchoReplies = 0 icmp6OutRouterSols = 0
icmp6OutRouterAds = 0 icmp6OutNeighborSols = 0
icmp6OutNeighborAds = 0 icmp6OutRedirects = 0
icmp6OutGroupQueries = 0 icmp6OutGroupResps = 0
icmp6OutGroupReds = 0
IGMP:  
2490 messages received
0 messages received with too few bytes
0 messages received with bad checksum
2490 membership queries received
0 membership queries received with invalid field(s)
0 membership reports received
0 membership reports received with invalid field(s)
0 membership reports received for groups to which we belong
0 membership reports sent
Field Descriptions:
The netstat output produced by OSWbb contains 2 sections. The first section contains information about all the network interfaces. The second section contains information about per-protocol statistics.
Section 1: Netstat -ain




Field Description
name Device name of interface
Mtu Maximum transmission unit
Net Network Segment Address
address Network address of the device
ipkts Input packets
Ierrs Input errors
opkts Output Packets
Oerrs Output errors
collis Collisions
queue Number in the Queue
Section 2: Protocol Statistics
The per-protocol statistics can be divided into several categories:
  • RAWIP (raw IP) packets
  • TCP packets
  • IPv4 packets
  • ICMPv4 packets
  • IPv6 packets
  • ICMPv6 packets
  • UDP packets
  • IGMP packet
Each protocol type has a specific set of measures associated with it. Network analysis requires evaluation of these measurements on an individual level and all together to examine the overall health of the network communications.
The TCP protocol is used the most in Oracle database and applications. Some implementations for RAC use UDP for the interconnect protocol instead of TCP. The statistics cannot be divided up on a per-interface basis, so these should be compared to the "-i" statistics above.

What to look for:
Section 1
The information in Section 1 will help diagnose network problems when there is connectivity but response is slow.
Values to look at:
  • Collisions (Collis)
  • Output packets (Opkts)
  • Input errors (Ierrs)
  • Input packets (Ipkts)
The above values will give information to work out network collision rates as follows:
Network collision rate = Output collision / Output packets
For a switched network, the collisions should be 0.1 percent or less of the output packets (see the Cisco web site as a reference). Excessive collisions could cause the switch port that the interface is plugged into to segment itself, or pull itself off-line, amongst other switch-related issues.
For the input error statistics:
Input Error Rate = Ierrs / Ipkts.
If the input error rate is high (over 0.25 percent), the host is excessively dropping packets. This could mean there is a mismatch of the duplex or speed  settings of the interface card and switch.  It could also imply a failed patch cable.
If ierrs or oerrs show an excessive amount of errors, more information can be found by examination of the netstat -s output.
For Sun systems, further information about a specific interface can be found by using the "-k" option for netstat. The output will give fuller statistics for the device, but this option is not mentioned in the netstat man page. More information can be found at http://sunsolve.sun.com.
Section 2
The information in Section 2 contains the protocol statistics.
Many performance problems associated with the network involve the retransmission of TCP packets. The retransmission rates can be calculated as follows.
To find the segment retransmission rate:
%segment-retrans=(tcpRetransSegs / tcpOutDataSegs) * 100
To find the byte retransmission rate:
%byte-retrans = ( tcpRetransBytes / tcpOutDataBytes ) * 100
Most network analyzers report TCP retransmissions as segments (frames) and not in bytes.


oswprvtnet
<node name>_prvtnet_YY.MM.DD.HH24.dat
These files will contain traceroute output for the RAC private networks, obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if private network tracing has been configured (via the private.net file described below) and if the OSWbb user has privileges to run the traceroute utility.
Information about the status of RAC private networks should be collected. This requires the user to manually add entries for these private networks into the private.net file located in the base oswbb directory. Instructions on how to do this are contained in the README file.
OSWbb uses the traceroute command to obtain the status of these private networks. Each operating system uses slightly different arguments to the traceroute command. Examples of the syntax to use for each operating system are contained in the sample Example private.net file located in the base oswbb directory. This will result in the output appearing differently across UNIX platforms. OSWbb runs the private.net file at the specified interval and stores the data in the oswprvtnet subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the traceroute output.




Sample file produced by OSWbb




***Fri Jan 28 12:50:36 EST 2005
traceroute to celdecclu2.us.oracle.com (138.2.71.112): 1-30 hops
(initial packetsize = 1500)
  1  celdecclu2.us.oracle.com (138.2.71.112) 1.95ms  2.92 ms 1.95 ms
What to Look For
  • Example 1:  Interface is up and responding:




traceroute to X.X.X.X, (X.X.X.X) 30 hops max, 1492 byte packets
1 X.X.X.X 1.015 ms 0.766 ms 0.755 ms

  • Example 2:  Target interface is not on a directly connected network, so validate that the address is correct or that the switch it is plugged into is on the same VLAN (or other issue):




traceroute to X.X.X.X, (X.X.X.X) 30 hops max, 40 byte packets
traceroute: host X.X.X.X is not on a directly-attached network

  • Example 3:  Network is unreachable:




traceroute to X.X.X.X, (X.X.X.X) 30 hops max, 40 byte packets
Network is unreachable


oswps
<node name>_ps_YY.MM.DD.HH24.dat
These files will contain output from the 'ps' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'ps' is installed on the OS and if the OSWbb user has privileges to run the utility.
The ps (process state) command lists all the processes currently running on the system and provides information about CPU consumption, process state, priority of the process, etc. The ps command has a number of options to control which processes are displayed and how the output is formatted. OSWbb runs the ps command with the -elf option.
The ps command is fairly standard across UNIX platforms. Each platform will have a slightly different version of the ps utility. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the ps command at the specified interval and stores the data in the oswps subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the ps output.




Sample ps file produced by OSWbb




***Wed Feb 2 09:26:54 EST 2005
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
19 T root 0 0 0 0 SY ? 0 Jan 31 ? 0:13 sched
8 S root 1 0 0 41 20 ? 107 ? Jan 31 ? 0:00 /etc
19 S root 2 0 0 0 SY ? 0 ? Jan 31 ? 0:00 page
19 S root 3 0 0 0 SY ? 0 ? Jan 31 ? 0:50 fsflu
8 S root 355 1 0 41 20 ? 232 ? Jan 31 ? 0:00 /usr/
8 S root 297 296 0 41 20 ? 379 ? Jan 31 ? 0:00 htt_s
8 S cedavis 391 381 0 89 20 ? 301 ? Jan 31 ? 0:00 /usr/
Field Descriptions




Field Description
f Flags
s State of the process
uid The effective user ID number of the process
pid The process ID of the process
ppid The process ID of the parent process.
c Processor utilization for scheduling (obsolete).
pri The priority of the process.
ni Nice value, used in priority computation.
addr The memory address of the process.
sz The total size of the process in virtual memory, including all mapped files and devices, in pages.
wchan The address of an event for which the process is sleeping (if blank, the process is running).
stime The starting time of the process, given in hours, minutes, and seconds.
tty The controlling terminal for the process (the message ?, is printed when there is no controlling terminal).
time The cumulative execution time for the process.
cmd The command name process is executing.
What to look for
  • The information in the ps command will primarily be used as supporting information for RAC diagnostics. For example, the status of a process prior to a system crash may be important for root cause analysis. The amount of memory a process is consuming is another example of how this data can be used.


oswtop
<node name>_top_YY.MM.DD.HH24.dat
These files will contain output from the 'top' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'top' is installed on the OS and if the OSWbb user has privileges to run the utility.
Top is a program that will give continual reports about the state of the system, including a list of the top CPU using processes. Top has three primary design goals:
  • provide an accurate snapshot of the system and process state,
  • not be one of the top processes itself,
  • be as portable as possible.
Each operating system uses a different version of the UNIX utility top. This will result in the top output appearing differently across UNIX platforms. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the top utility at the specified interval and stores the data in the oswtop subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the top output.




Sample top file produced by OSWbb




***Fri Jan 28 12:50:36 EST 2005
load averages: 0.11, 0.07, 0.06 12:50:36
136 processes: 133 sleeping, 2 running, 1 on cpu

Memory: 2048M real, 1061M free, 542M swap in use, 1605M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
704 cedavis 16 49 0 346M 276M   sleep 222:33 3.51% java
362 root 1 59 0 34M 75M   sleep 11:49 0.21% Xsun
20675 cedavis 1 0 0 1584K 1064K   cpu 0:00 19% top
20640 cedavis 1 0 0 1904K 1240K   sleep 0:00 0.14% OSWatcher.sh
20657 cedavis 1 20 0 1904K 1240K   sleep 0:00 0.14% oswsub.sh
16881 cedavis 1 59 0 199M 159K   sleep 23:04 0.10% oracle
20671 cedavis 1 0 0 1904K 1240K   run 0:00 0.09% oswsub.sh
20653 cedavis 1 0 0 1904K 1240K   sleep 0:00 0.09% OSWatcherFM.sh
20665 cedavis 1 0 0 1904K 1240K   sleep 0:00 0.09% oswsub.sh
20672 cedavis 1 0 0 1264K 1031K   sleep 0:00 0.09% iostat
20659 cedavis 1 10 0 1904K 1240K   sleep 0:00 0.09% oswsub.sh
20661 cedavis 1 30 0 1096K 880K sleep 0:00 0.09% vmstat
20668 cedavis 1 0 0 1904K 1240K run 0:00 0.05% oswsub.sh
20674 cedavis 1 0 0 968K 624K   sleep 0:00 0.05% sleep
20663 cedavis 1 20 0 1080K 864K sleep 0:00 0.05% mpstat
Field Descriptions
load averages: 0.11, 0.07, 0.06 12:50:36
This line displays the load averages over the last 1, 5 and 15 minutes as well as the system time. This is quite handy as top basically includes a timestamp along with the data capture.
Load average is defined as the average number of processes in the run queue. A runnable Unix process is one that is available right now to consume CPU resources and is not blocked on I/O or on a system call. The higher the load average, the more work your machine is doing.
The three numbers are the average of the depth of the run queue over the last 1, 5, and 15 minutes. In this example we can see that .11 processes were on the run queue on average over the last minute, .07 processes on average on the run queue over the last 5 minutes, etc. It is important to determine what the average load of the system is through benchmarking and then look for deviations. A dramatic rise in the load average can indicate a serious performance problem.
136 processes: 133 sleeping, 2 running, 1 on cpu
This line displays the total number of processes running at the time of the last update. It also indicates how many Unix processes exist, how many are sleeping (blocked on I/O or a system call), how many are stopped (someone in a shell has suspended it), and how many are actually assigned to a CPU. This last number will not be greater than the number of processors on the machine, and the value should also correlate to the machine's load average provided the load average is less than the number of CPUs. Like load average, the total number of processes on a healthy machine usually varies just a small amount over time. Suddenly having a significantly larger or smaller number of processes could be a warning sign.
Memory: 2048M real, 1061M free, 542M swap in use, 1605M swap free
The "Memory:" line is very important. It reflects how much real and swap memory a computer has, and how much is free. "Real" memory is the amount of RAM installed in the system, a.k.a. the "physical" memory. "Swap" is virtual memory stored on the machine's disk.
Once a computer runs out of physical memory, and starts using swap space, its performance deteriorates dramatically. If you run out of swap, you'll likely crash your programs or the OS.
Individual process fields




Field Description
PID Process ID of process
USERNAME Username of process
THR Number of threads in the process
PRI Priority of process
NICE Nice value of process
SIZE Total size of a process, including code and data, plus the stack space in kilobytes
RES Amount of physical memory used by the process
STATE Current CPU state of process. The states can be S for sleeping, D for uninterruptible sleep, R for running, T for stopped/traced, and Z for zombied
TIME The CPU time that a process has used since it started
%CPU The CPU time that a process has used since the last update
COMMAND The task's command name
What to Look For
  • Large run queue. Large number of processes waiting in the run queue may be an indication that your system does not have sufficient CPU capacity.
  • Process consuming lots of CPU. A process which is "hogging" CPU is always suspect. If this process is an oracle foreground process it's most likely running an expensive query that should be tuned. Oracle background processes should not hog CPU for long periods of time.
  • High load averages. Processes should not be backed up on the run queue for extended periods of time.
  • Low swap space. This is an indication you are running low on memory.


oswvmstat
<node name>_vmstat_YY.MM.DD.HH24.dat
These files will contain output from the 'vmstat' command that is obtained and archived by OSWatcher Black Box at specified intervals.  These files will only exist if 'vmstat' is installed on the OS and if the OSWbb user has privileges to run the utility.
The name vmstat comes from "report virtual memory statistics".  The vmstat utility does a bit more than this, though. In addition to reporting virtual memory, vmstat reports certain kernel statistics about processes, disk, trap, and CPU activity.
The vmstat utility is fairly standard across UNIX platforms. Each platform will have a slightly different version of the vmstat utility. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.
OSWbb runs the vmstat utility at the specified interval and stores the data in the oswvmstat subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the vmstat output.




Sample vmstat file produced by OSWbb




***Fri Jan 28 12:50:36 EST 2005
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr dd f0 s0 in sy cs us sy id
0 0 0 1761344 1246520 1 6 0 0 0 0 0 2 0 0 0 380 1364 900 4 1 95
0 0 0 1643920 1086776 331 1485 8 16 16 0 0 31 0 0 0 447 4966 1315 15 31 54
0 0 0 1643872 1086728 6 0 0 0 0 0 0 0 0 0 0 389 1472 932 0 0 100
Field Descriptions
The vmstat output is actually broken up into six sections: procs, memory, page, disk, faults and CPU. Each section is outlined in the following table.





Field Description
PROCS
r Number of processes that are in a wait state and basically not doing anything but waiting to run
b Number of processes that were in sleep mode and were interrupted since the last update
w Number of processes that have been swapped out by mm and vm subsystems and have yet to run
MEMORY
swap The amount of swap space currently available
free The size of the free list
PAGE
re page reclaims
mf minor faults
pi kilobytes paged in
po kilobytes paged out
fr kilobytes freed
de anticipated short-term memory shortfall (Kbytes)
sr pages scanned by clock algorithm
DISK
Bi Disk blocks sent to disk devices in blocks per second
FAULTS
In Interrupts per second, including the CPU clocks
Sy System calls
Cs Context switches per second within the kernel
CPU
Us Percentage of CPU cycles spent on user processes
Sy Percentage of CPU cycles spent on system processes
Id Percentage of unused CPU cycles or idle time when the CPU is basically doing nothing
What to look for
The following information should be used as a guideline and not considered hard and fast rules. The information documented below comes from Adrian Cockcroft's book, Sun Performance Tuning. Other operating systems like HP and Linux may have different thresholds.
  • Large run queue. Adrian Cockcroft defines anything over 4 processes per CPU on the run queue as the threshold for CPU saturation. This is certainly a problem if it lasts for any long period of time.
  • CPU utilization. The amount of time spent running system code should not exceed 30% especially if idle time is close to 0%.
  • A combination of large run queue with no idle CPU is an indication the system has insufficient CPU capacity.
  • Memory bottlenecks are determined by the scan rate (sr) . The scan rate is the pages scanned by the clock algorithm per second. If the scan rate (sr) is continuously over 200 pages per second then there is a memory shortage.
  • Disk problems may be identified if the number of processes blocked exceeds the number of processes on run queue.




Thanks,
SreedharD.