HPWorld 98 & ERP 98 Proceedings

Unix Performance Fundamentals.

By Jeff Kubler

Lund Performance Solutions


When investing money in the stock market, every investor is there for one reason -- TO MAKE MONEY! Who makes money? In today's market, everyone who diversifies and stays in for the long haul makes money. However, most people hope for a much larger and quicker return. Very few achieve it, and those who do make their money by understanding certain fundamentals, or leading indicators, that allow them to see something before it happens. In the same way, a good system manager can learn to understand certain fundamentals and anticipate future changes. That way the system manager can be prepared beforehand with the additional resources necessary to meet demand, or with the new configuration or tuning that will allow continued adequate performance.

INTRODUCTION

This paper, "Unix Performance Fundamentals", will cover the four basic global areas of system performance. These will be discussed in order to provide an understanding of how they work, which metrics best evaluate their behavior, what "rules of thumb" can be used to determine when they are being overused, and what tuning tips will help get the most out of the system.

Any successful information services department must evaluate performance. Companies that experience growth often find that demand now exceeds supply. When response time is no longer acceptable, it is often too late to avoid a period of user complaints before the appropriate resource can be upgraded. Even in companies where growth is not a big factor, there will be changes in the environment, normal degradation in data structures, and problem processes that need to be addressed. Performance management is the key to keeping the level of hardware resources appropriate to the demand. It is also the key to keeping performance acceptable on a system that is not bottlenecked on a global resource.

Performance management begins with understanding. Once the functionality of the four basic areas is understood, there is a readiness to apply this knowledge to the overall planning of resources, to the problem and crisis performance of specific applications and processes, and to the identification of global bottlenecks.

BASICS

The first concept of importance is Response Time. This is the time between when a user makes a request and when the computer (really, the entire system -- network, disks, etc.) is able to return the result. Response times are the bottom-line measurement of the job the computer or IT manager is doing. When response times are too slow, users complain, and those complaints work their way up or down the corporate food chain. Then questions of response lead to questions of management and resources. Everyone wants to avoid this feeding frenzy. That is why some type of system monitoring needs to take place; otherwise you may find that some resource is close to saturation or has already reached it. A common illustration of what can happen to a resource is the "Knee in the Curve": a resource can be fine even when highly utilized, until one more request is made and suddenly that resource is overtaxed. Response time increases sharply once the "Knee in the Curve" is reached. The following illustration helps point this out.

Figure 1 - Performance "Knee in the Curve"
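Although the paper itself gives no formula, a simple single-server queueing approximation (an illustrative sketch, not taken from the original text) shows why the knee appears:

Response time ≈ Service time / (1 - Utilization)

At 50 percent utilization a request takes roughly twice its service time, at 80 percent about five times, at 90 percent about ten times, and at 95 percent about twenty times. The curve stays nearly flat for a long while and then turns sharply upward -- the "knee".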

Monitoring Tools

Once you have decided to monitor performance, there are a number of ways to proceed. One option is to use the commands that are part of the UNIX operating system. Commands that ship with UNIX and are capable of providing helpful information are listed below (a few sample invocations follow the list):

top -- shows the top CPU-consuming processes, including a breakdown of utilization by user. Also includes total CPU usage, memory usage, and CPU queue length information.

ps -- (option -ef) shows individual process information.

nfsstat -- shows Network File System (NFS) statistics.

netstat -- gives a breakdown of network packet activity including ICMP, UDP, TCP, and IGMP activity.

vmstat -- displays CPU activity, free-memory information, paging, trap rate, and disk transfer data.

iostat -- provides information about disk usage.

sar -- provides general system activity information that can be collected into reports.
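As a quick sketch (exact options vary slightly among UNIX implementations; the forms below are typical of HP-UX and other System V derivatives), these commands might be invoked as follows:

$ vmstat 5 10      # CPU, memory, and paging activity every 5 seconds, 10 samples
$ iostat 5         # disk transfer statistics every 5 seconds
$ sar -u 5 12      # CPU utilization (usr/sys/wio/idle) every 5 seconds, 12 samples
$ netstat -i       # per-interface packet, error, and collision counts
$ nfsstat -c       # NFS client statistics
$ ps -ef           # one line per process on the system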

Here is a sample of the output from the top command:

Load averages: 2.31, 1.74, 1.54

102 processes: 100 sleeping, 2 running

Cpu states:

LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS

2.31 90.8% 0.0% 9.2% 0.0% 0.0% 0.0% 0.0% 0.0%

Memory: 8272K (5988K) real, 19100K (14528K) virtual, 2184K free Page# 1/8

TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND

? 1730 root 235 20 448K 184K run 420:18 65.89 65.78 _mprosrv

? 4615 root 168 20 1272K 520K sleep 245:01 25.25 25.20 _progres

? 1563 root 128 20 480K 216K sleep 432:17 2.04 2.04 _mprosrv

? 6187 root 48 0 1732K 1788K sleep 2:30 1.01 1.01 lpsmid

? 961 root 154 20 8K 16K sleep 188:53 0.93 0.93 nfsd

(Note: the nfsd processes shown here service NFS requests from remote systems.)

? 962 root 154 20 8K 16K sleep 185:05 0.91 0.91 nfsd

? 956 root 154 20 32K 40K sleep 185:28 0.91 0.90 nfsd

? 957 root 154 20 8K 16K sleep 187:26 0.90 0.90 nfsd

? 7 root -32 20 0K 0K sleep 67:26 0.32 0.32 ttisr

d1p0 6309 root 178 20 208K 308K run 0:00 0.33 0.29 _mprosrv

? 0 root 127 20 0K 0K sleep 71:35 0.21 0.21 swapper

? 1407 root 154 20 180K 0K sleep 71:35 0.22 0.22 swapper

? 2 root 128 20 0K 0K sleep 18:55 0.21 0.21 vhand

? 1411 root 156 20 180K 68K sleep 6:32 0.15 0.15 _mprshut

Third parties, as well as Hewlett-Packard, have developed programs to monitor performance. HP offers the following:

GlancePlus -- on-line performance monitoring.

PerfRX -- used to study historical data. Works with MeasureWare.

LaserRX -- like PerfRX but runs on a PC.

Lund Performance Solutions offers a suite of several products. Here is a rundown:

 

SOS/9000

SOS/9000 is an on-line performance monitoring and diagnostic utility for HP-UX systems. SOS/9000 uses either graphical or tabular (numeric) displays to depict how the CPU, disk, memory, network, and other resources are being utilized.

SOS/9000 obtains most of its information about system usage from the kernel, through a process called lpsmid. This process is initiated as part of the startup of SOS/9000.

 

SOS/X

SOS/X is the GUI front end for observing on-line HP-UX performance data. SOS/X presents the performance data collected via the lpsmid process using the X Window environment.

 

Performance Gallery

Performance Gallery is the historical analysis tool used to take collected data and present various graphs for analysis. Performance Gallery runs on Windows 3.1, Windows 95, or Windows NT. There are over 40 pre-created graphs, with the ability to create more.

 

Analysis using extra-UNIX tools

Once either SOS/9000 or GlancePlus is launched, a Global Screen appears. In SOS/9000 this screen has two modes: one shows the global information as a graphic view and the other shows it as a numeric or tabular view. Glance uses only the graphic form for global information. The screens are divided into sections, for example Global, Process, and Performance Advice sections. Both products have a Performance Advice section that seeks to interpret the measured values into plain English statements. The Process section breaks the active workload out into individual processes; in SOS these are sorted by CPU percentage. The Global section details CPU usage, queue lengths, the percentage of memory used, and disk I/O information. More screens can be accessed by pressing the "Screen Menu" function key in SOS or the function keys in either SOS or Glance.

Since I am more familiar with SOS and the Lund Products these will be used for most of the samples. Figure 2 shows a sample screen shot from within the SOS Global screen.

Figure 2 - Global sos/9000 Screen

Data Collection

The analysis and presentation of graphic images depicting the system's performance values cannot be overemphasized. It is greatly helpful both in observing immediate problems and in seeing trends over the long term.

Performance Gallery is the Lund tool that looks at historical data. Performance Gallery begins with collected data: a process called soslogd must be started to collect resource usage data. The collector comes with the on-line tool. Once data has been collected for a period of time, the program soslogx is used to extract it for use in Performance Gallery.

PerfRX and LaserRX are two HP products that create graphs of historical data. PerfRX replaces LaserRX.

Measurement Information

CPU ANALYSIS

The CPU (Central Processing Unit) performs all of the processing that occurs in the computing environment. Total CPU usage is a good place to start when evaluating system usage. Obviously, when all of the CPU is used you are out of that resource. However, if the work executing on the system runs at several different priorities, the system will be able to manage better. This assumes that some of the executing processes do not demand as fast a response time as others (these may be NICED -- that is, have their priorities lowered). You can check the CPU makeup by looking at the horizontal bar graph.

Processes executing with a nice value of 20 are executing without an adjustment to their priority. Processes with a value less than 20 are executing with a "Not Nice" value, and those with a value greater than 20 have been "Niced". "Not Nice" processes have had their priority increased to help their processing complete faster, while those with a "Niced" value have had their ability to compete reduced. Since the scheduler allows multiple processes to access the CPU by time-slicing among the contending processes, the priority of each process is a key element in determining when a process will complete.

Another important indicator when looking at a process priority is the "R" designation. There are two types of priorities, Real-Time and Time-Share. Most user processes run at time-share priorities. High-priority system processes are "Real-Time" and so are described by the letter "R"; these processes keep the CPU until they either complete or cannot proceed (that is, they are no longer ready). The different processes running at different priorities are an important point of understanding when looking at CPU usage. Within SOS the CPU states are classified as: U = User, S = System, I = Interrupt, N = Nice, X = NNice, C = Context Switching, and T = Trap. The figure below shows the breakdown. When a process must give up the CPU, the CPU saves that process's state in a context and switches to the other process. An Interrupt is a hardware event used to stop CPU processing; this can be simply to change priorities, or it can signal a more ominous problem. A Trap is a software event.

Figure 3 – Global CPU

The Run Queue is another top indicator of CPU problems. The count in the run queue shows how many processes are awaiting the attention of the CPU. The higher the number of processes awaiting the CPU, the worse the bottleneck. When the "5 minute run queue" maintains a consistent average in unacceptable ranges, your bottleneck is very significant and occurs most of the time.
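As a hedged example of watching the run queue outside of SOS or Glance, the standard uptime and sar commands report load averages and run-queue length (the sar options are typical of System V systems; output formats vary):

$ uptime           # prints the 1-, 5-, and 15-minute load averages
$ sar -q 60 5      # runq-sz = average run-queue length, %runocc = percent of time it was occupied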

In Figure 2 above, under the "Process Statistics" section, the "Pri" column shows the priority of the process. The "Nic" column shows the nice value, which is used by the scheduler to help determine the priority of a process. This screen is the main sos/9000 screen.

The priority of a process is a dynamically recalculated value, assigned initially into different groupings based on whether the process is a highly important system process or an ordinary time-share user process.
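If a long-running batch job is competing with interactive users, it can be started with a lowered priority, or its priority can be lowered after the fact. The script name and PID below are hypothetical, and renice syntax varies slightly by platform:

$ nice -n 10 nightly_report.sh &   # start a hypothetical batch job with a lowered priority
$ renice -n 10 -p 1234             # lower the priority of an already-running process (PID 1234 is hypothetical)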

Memory Analysis

Memory acts as the work area or "scratch pad" for all of the work done on behalf of a process. Because memory is limited, a number of strategies are used to try to limit the number of I/Os to disk, since disk access is much slower than memory access. Reads and writes are buffered, and paging and deactivations (or swapping) are used to move unused memory to disk. More memory is usually the cure for a shortage, but configuration issues such as the size of the buffer cache are also possible solutions.

Key indicators of memory performance help evaluate how many disk reads and disk writes are eliminated, the percentage of memory and virtual memory in use, the number of pages of memory being moved to virtual storage, and the number of processes that have been deactivated (swapped to disk). When a process begins to execute, its needed data must be moved into memory. Since memory is finite, there can come a time when no available memory exists. When this occurs, the memory manager acts to move pages that have not been accessed recently out to an area of disk reserved for virtual storage. When memory becomes very short, the memory pages of an entire process will be moved, or "swapped", out to virtual storage (the term used in 9.04). This means the entire process is "deactivated" as its memory pages are sent to disk and it is removed from the Ready Queue (this is what happens at 10.0 and above). When memory becomes very short, a process can actually spend most of its time "thrashing", or looking for available memory. These memory events begin to happen when the thresholds in the figure below are violated.

Figure 4 - Memory Explanation
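A quick, hedged way to watch for these memory events from the command line is vmstat for page-out activity and, on HP-UX, swapinfo for swap usage (column names and options vary by release):

$ vmstat 5 6       # watch the "po" (page-outs) and "free" (free memory) columns
$ swapinfo -tam    # HP-UX: device and file-system swap usage in MB, with a totals line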

Even though the measurements of buffer efficiency are listed under the disk I/O metrics, it is helpful to bring them up in the memory discussion: whatever space is configured for the buffer cache is removed from general memory availability.
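The buffer cache hit rates discussed under Disk I/O below can be sampled with sar; %rcache and %wcache correspond roughly to the Read Hit and Write Hit pulse points (a sketch, and output columns may vary by platform):

$ sar -b 5 5       # bread/s, lread/s, %rcache, bwrit/s, lwrit/s, %wcache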

Disk I/O

Disk provides long-term storage for user and system data. In order for a process to access data, a disk I/O request must be made to retrieve the needed data. Data that is inefficiently placed in files or data structures will cause the I/O to be much slower than it could be. Several disk I/O requests may arrive at the same disk drive, files may be fragmented, tables and indexes may have their own inherent inefficiencies, and disk hardware may be slow. When these problems exist, requests will wait in the disk queue. The net impact is that processes do not complete as quickly as they could and user response times suffer.

Disk also provides an area called virtual memory. This is an area that can be used to hold less active memory pages (moving them there is called paging). Without this capability, new processes might not be able to start until memory is freed up. Swap space can be defined as device swap space or within a specific file system (device swap is the faster of the two). The first step in this process is paging: memory is allocated in increments of pages, and when pages are brought into memory (a step in starting a process) or moved out of it (a step used to make room for new processes), it is called paging. Once the demand for memory becomes more serious, the pages assigned to an entire process can be moved out to virtual memory. At operating system version 9.04 this is called swapping; at 10.0 it is called deactivation.

A common issue we have observed is that many sites have a data "hot spot": one disk or file system holds the files where the majority of I/O hits. This is illustrated in the graphic below.

Figure 5 --- Disk I/O by Drive
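A hedged way to look for a hot spot without a third-party tool is to compare activity across drives with iostat or sar -d; a single device that is consistently far busier than the rest is the likely hot spot (column names here are typical of HP-UX):

$ iostat 5         # per-device throughput and average service time
$ sar -d 5 5       # %busy, avque, r+w/s, and blks/s for each disk device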

Network

Network issues are the sometimes-mysterious performance problem. Poorly planned and overtaxed networks, shared files on network file systems, and inadequate hardware can all play a part in poor network performance. Several different metrics help identify performance problems on a network. While packets in and packets out and errors in and errors out can be measured for several different types of transmissions, the clearest measurement is the percentage of collisions. A collision count of 10 percent or more of the total network packets transferred is considered high, and some examination of the network should be undertaken to identify and correct the problem.
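The collision percentage can be estimated from the interface counters reported by netstat; dividing the Coll column by the Opkts (output packets) column gives the collision rate for an interface (a sketch, and column names vary slightly between UNIX versions):

$ netstat -i       # shows Ipkts, Ierrs, Opkts, Oerrs, and Coll per interface

For example, a hypothetical interface showing 5,000 collisions against 40,000 output packets is at 12.5 percent, above the 10 percent rule of thumb given above.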

Analysis Points

Once there is a basic understanding of the measured values, some "rules of thumb" can be used to help evaluate global system issues. The next figure breaks these down and sets thresholds for easy evaluation of problems. "Pulse Points" is another way to refer to these "rules of thumb".

Pulse Points

Within each of the important areas, several indicators have emerged as the most indicative of possible problems. These have been organized by area, and a "rule of thumb" or pulse point has been arrived at for each. The first area is the CPU:

INDICATOR                    GREEN   YELLOW   RED    COMMENTS

CPU:
Total CPU (%)                <60     60-85    >85    (Wait + idle + nice)
High Priority CPU (%)        <60     60-85    >85    High priority processing only.
Run Queue Average            <5      5-10     >10    Processes waiting on CPU.
5 Minute Run Queue Average   <5      5-10     >10
Real Processing (%)          <5      5-10     >10    Fixed priority processing; too much can be bad.
System Processing (%)        <10     10-20    >20    CPU time spent on system calls.
Interrupt (%)                <10     10-15    >15    Occurs when hardware devices request CPU.
Context Switching (%)        <2      2-8      >8     Percent of time spent switching between processes.
Capture Ratio                >3.0    3-1      <1     High value indicates high user processing.

The second area is the Memory area:

INDICATOR                    GREEN   YELLOW   RED    COMMENTS

MEMORY:
Page Outs/Second             <5      5-10     >10    Pages written to swap space. Good indicator of memory shortage.
Swap Outs/Second             <2      2-5      >5     Processes swapped out to disk.
Memory (%)                   <80     80-90    >90    Percent of memory in use.
Virtual Memory (%)           <50     50-80    >80    Percent of virtual memory in use.

The third area is the Disk area:

INDICATOR                    GREEN   YELLOW   RED    COMMENTS

DISK I/O:
Wait (%)                     <10     10-20    >20    CPU time waiting for disk.
Read Hit (%)                 >90     80-90    <80    Measures effectiveness of buffer cache.
Write Hit (%)                >95     85-95    <85    Measures effectiveness of buffer cache.
Disk Queue Length            <1.0    1-3      >3     Overall average indicator of data locality.
Total Disk Rate/Second       <40     40-60    >60

Figure 6 – Pulse Points

Pulse Points Explained

Some of the defined items have changed as operating systems have changed. Here is a quick explanation of each item:

Total CPU (%) --This is the total usage on the system.

High Priority CPU (%) -- This can be viewed on the Global Screen, but you have the advantage of the Pulse Points interpretation here. High priority CPU % is the percentage of CPU time spent in user mode, in system mode, and servicing interrupts.

Interpretation Hints: If you suspect CPU saturation you should keep an eye on this number. It may point to batch jobs and processes that need to be "niced" to a lower priority.

Run Queue Average -- This statistic shows the average number of executable processes that are waiting to use the CPU during this interval.

5 Minute Run Queue Average -- This statistic shows the average number of executable processes that have been waiting for the CPU over the last 5 minutes.

Real Processing (%) -- Real is the CPU time spent servicing "real-time" user processes. These are processes that run at high (often fixed) priority.

System Processing (%) --This is the percentage of time the CPU spends executing system calls or operating in the kernel mode.

Interrupt (%) -- This is the percentage of time the CPU spent on interrupt and overhead.

interrupts -- allow the system to change from one activity to another. An interrupt causes the CPU to stop its current work and shift priorities.

Context Switching (%) -- This is the percentage of time the CPU spent managing process switching.

context switching -- a process runs for a quantum (by default 1/10th of a second) until it completes or is pre-empted to let another process run. The CPU saves the status of the first process in a context and switches to the next process. The first process drops to the bottom of the run queue to wait for its next turn.

Capture Ratio -- This statistic is a ratio of how much real, useful system work is being done compared to non-useful system work (overhead). The equation for this ratio is as follows: Capture Ratio = (User + Real + Nice) / (System + Interrupt + Context Switch).
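For example, with hypothetical values of User = 50%, Real = 5%, Nice = 10%, System = 15%, Interrupt = 5%, and Context Switch = 5%, the capture ratio would be (50 + 5 + 10) / (15 + 5 + 5) = 65 / 25 = 2.6, which falls into the yellow range of Figure 6.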

 

Page Outs/Second -- This statistic represents the number of times (per second) that a page out is performed to move the least needed pages from memory by writing them to swap space or to the file system. This happens when physical memory becomes scarce.

Deactivations or Swap Outs/Second -- This statistic represents the number of processes swapped out of memory to disk.

Memory (%) -- This is the percentage of memory used.

Virtual Memory (%) -- This is the percentage of virtual memory used (or swap space).

Read Hit (%) -- This value represents the percentage of time that disk read requests were satisfied in buffer cache, versus having to satisfy read-type I/Os physically from disk.

Write Hit (%) - This value represents the percentage of time that disk writes were performed in the buffer cache instead of resulting in a physical write to disk.

Disk Queue Length -- This statistic shows the average number of disk I/O requests waiting to be serviced during this interval. The number in brackets is this value since monitoring began or since the statistics were reset.

Total Disk Rate/Second -- This is the total number of disk I/Os occurring each second.

Within SOS this information has been used to create a screen. When one threshold is exceeded that value moves from the "Green" column to the "Yellow" column.

Figure 7 – Pulse Point Screen

Conclusion

The investment you have made in computer resources should be considered an important and protected asset. Sometimes our attitude toward these vital resources can best be described as "cavalier": the purchase is made, but no investment of time or software is made to ensure that the resource will always be more than adequate and that users' needs, in terms of response times, availability, and reasonability, will continue to be met.

When problems are encountered (usually unexpectedly), a whole process begins to address them. Many times this process is far more rushed than decisions of this nature should be, and usually much more money is spent to solve the problem than would have been spent had good long-range planning foreseen the event and planned for it. Worse, bad decisions sometimes result from this overly stressed decision-making brought on by poor planning (or lack of planning).

What needs to be done to avoid the pitfalls most data centers find themselves faced with? The first thing is a change of mindset: a realization that system performance is important, that someone needs to pay consistent attention to it, and that software resources need to be acquired (or already available software used) to help measure and understand performance. Once there is a mindset change, the rest will naturally follow. However, the mindset change is most effective when it starts toward the top of the organization chart. If you have experienced the needed mindset change yourself, you may need to begin some subtle political moves to try to move the change "up the ladder".

 

Performance Problems

When performance problems occur, the monitoring tool can be run to observe performance inhibitors. Sometimes observation alone is enough to identify a usage problem, which can lead to re-scheduling the activity, changing resource usage (perhaps with a NICE value), or identifying programs for re-engineering (for greater efficiency). At other times a global resource restriction may be the root of the problem; in these cases a larger resource may be needed.

To avoid running out of needed resources and to help gain a general knowledge of your HP-UX system usage you should do the following things:
