HPWorld 98 & ERP 98 Proceedings

Capacity Planning: How to Avoid Hitting the Knee in the Curve

Jeff Kubler

Lund Consulting Services
Albany, OR 97321
Phone: (541) 926-3800
Fax: (541) 926-7723

Introduction

Planning of any kind, in any venue, is hard to come by. Financial experts point out that nearly 90 percent of us reach the end of our working lives without the resources for a comfortable retirement. It is the same way in the realm of computers. As MIS professionals we have been given the task of supplying the computing needs of a user base. Learning to plan and foresee future problems is a difficult task, one shrouded by lack of knowledge and made more difficult by the daily demands of life. But it can be most rewarding if these difficulties are overcome. In this paper I will describe what capacity planning is, why it is important, its pitfalls, its steps, the tools of the capacity planner, and how a capacity planner makes effective presentations.

What is Capacity Planning?

"Capacity Planning is the science (art?) of relating business plans to data processing workloads and forecasting cost effective upgrade and tuning solutions to provide the continued required service levels for the larger organization in both the short and long term."

Quantitative System Performance, Edward D. Lazowska, John Zahorjan, G. Scott Graham, Kenneth C. Sevcik, Prentice Hall.

As this quote indicates, capacity planning is really both a science and an art. It relies heavily on someone who understands both the data processing side of the organization and the business side. The final steps in capacity planning would never be given to a junior technician. Rather, an experienced analyst should be the one to analyze system utilization, interpret feedback from the user community, learn from management where things are really headed, and verify that current and future hardware will meet future demands.

Capacity Planning helps make sure that hardware is on hand to meet the needs of users. It provides the ability to answer the tough "what if?" questions. It is the safeguard against unneeded expenditures and against unwanted response time slowdowns caused by unmet user demand.

Two definitions really help explain the views needed of a capacity planner. Acquired-objective-analysis means

"focusing on analyzing trends or patterns from the past with the expectation that these historical patterns will develop in the future as well."

Browning, Capacity Planning for Computer Systems

Intuitive-subjective-synthesis means

"the ability to understand the politics of experience within the organization and to discern which events are likely to transpire".

Browning, Capacity Planning for Computer Systems

Why Perform Capacity Planning?

Capacity Planning should be performed in order to avoid unneeded expenses, to build confidence among both management and users that you are ready for the future, and to ensure that user needs will always be met. It also makes sure that the initial investment is maximized. Making an upgrade either too early or too late is costly. You never want to run short on a resource. Conversely, it is expensive to have too much of a resource (additional purchase costs, support costs, etc.).

The final compelling reason is the realization that bottlenecks can and do occur. When they occur there can be very little warning. The curve for CPU growth might look like the figure below, with CPU usage on the Y axis and response time on the X. The point of this graph is that once you reach a certain point, the knee in the curve, response times suddenly become unacceptable.

Figure 1
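The shape of such a curve can be reproduced with the simplest queuing formula. The sketch below assumes a single-server M/M/1 queue, which is my modeling assumption and not something taken from the figure; in that model the mean response time is R = S / (1 - U) for service time S and utilization U. The function name and the half-second service time are illustrative only.

```python
def response_time(service_time: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue: R = S / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    service_time = 0.5  # seconds of CPU per transaction (illustrative value)
    for pct in (10, 50, 70, 80, 90, 95, 99):
        r = response_time(service_time, pct / 100.0)
        print(f"{pct:3d}% busy -> {r:7.2f} s response")
```

In this model, going from 50% busy to 90% busy multiplies the response time by five, and the last few percent before saturation cost far more than the first fifty. That steep final stretch is the knee.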

Scope of Capacity Planning

In every organization there are a number of important resources. CPU, memory, disk, and network are the key capacities important to the data processing professional. Because of the way memory is utilized, the best method for planning memory size is a "rule of thumb": each user needs x amount of memory, each batch job another x amount, and the operating system a further amount. This is because almost all available memory will be used regardless of how much you have. Disk has two components: transient space (sometimes referred to as "swap space") and permanent storage. Your organization may have a database administrator, who needs to make sure that the database files and tables are sufficiently sized. The system administrator must make sure that there is adequate available space in the total system, in the individual volume sets, and, on Unix systems, in the file systems. The network needs to be administered to make sure that no one leg is overused. The area of capacity planning this paper will focus on is CPU sizing. When sizing the CPU there are several questions asked of the capacity planner.

Questions asked of or by a Capacity Planner

A number of questions are helpful to think about before and during a capacity study. They are:

"How do we accurately size an upgrade to our current system?"

"When is improved performance not likely if we migrate to a larger system?"

"If we increase our accounts payable clerks from 30 to 60, what will be the net impact on the system?"

"When should a faster CPU be installed, and how long can we expect it to last?"

"What will happen to response times when we add 25% more workstations next month?"

"What will happen if we move order entry from system X to system Y while at the same time adding 10 more order entry clerks?"

"How will the average response time change if we upgrade the CPU?"

"Can I make my HP3000/HP9000 into a Web Server?"

"Should we add a new processor board?"

"What will happen if we move to a client/server or networked environment?"

These are the questions the capacity planner must prepare to answer. Knowing the types of questions before you begin helps direct the study to the areas of need. Once these questions have been considered, the capacity planner is ready to start. What are the steps a capacity planner should take?

What are the steps taken to perform a Capacity Study?

Outlining the steps of a capacity plan helps to define the needed methods. It also points out that a capacity plan is not a finished product, rather it is a process. When is the time to start capacity planning? NOW! Here are the steps you need to take to perform a good study:

Understand the current environment. What part are you chiefly responsible for? What are the methods and procedures in place to handle hardware and software purchases?

Measure the current system. Where are we now? Implement weekly/monthly/quarterly system "check-ups". Optimize, then take another baseline measurement (you don’t want to extrapolate from data gathered on a non-optimized system).

Create service level objectives. You must have concurrence and agreement of management and users. Involve supervisors of all DP "Customers". Be specific! Some examples:

Ninety percent of all first response times for order entry shall be less than two seconds

Batch jobs will be completed by 6 A.M. the next working day (A list of relevant jobs should be attached).

Characterize workloads. Who is using what? Who is the user? What is a business perspective of our resource usage?

Forecast and predict workloads. Choose appropriate forecast method and make predictions. Compare forecast with reality.

Create the capacity plan. Summarize findings in management report.

Re-evaluate periodically. Stay on top of the entire process - review and update.
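A service level objective like the order entry example in the steps above is only useful if it can be checked against measurements. The following is a minimal sketch of such a check, assuming response times have already been collected as a list of seconds; the function names and the nearest-rank percentile method are my choices, not part of any particular measurement product.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100.0)
    return ordered[max(rank, 1) - 1]


def meets_objective(response_times, pct=90.0, limit_seconds=2.0):
    """True if at least pct percent of the response times are below the limit."""
    return percentile(response_times, pct) < limit_seconds


if __name__ == "__main__":
    # Illustrative order entry first-response times, in seconds.
    samples = [0.4, 0.6, 0.8, 0.9, 1.1, 1.2, 1.4, 1.7, 1.9, 3.5]
    print("90th percentile:", percentile(samples, 90.0))
    print("objective met:", meets_objective(samples))
```

Stating the objective as a percentile rather than an average keeps one very slow transaction from hiding behind many fast ones, and keeps one outlier from failing an otherwise healthy system.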

Once you have begun to implement the steps of a capacity plan, the plan itself is ready for presentation. A number of sections are needed to support the findings at which you have arrived. Here are the important component parts needed in the study.

Components of the Plan

A statement of the purpose of a Capacity Plan

Make sure that this does the following:

Documents the entire process.

Documents all judgment calls and assumptions.

Provides communication between user and DP.

Is used to validate capacity plan for later re-evaluation of the results.

Provides a learning and training tool. If things are explained clearly enough, the plan will help others learn from your experience.

Contents of the Capacity Plan

Single Page Executive Summary for Top Management containing conclusions - Budget Implications - Space for Approval (Signature).

Resource predictions by system resource on behalf of user departments (Workloads).

Supporting details by workload and system resource including graphics, spreadsheets, tables, etc.

Summary of user forecasts such as transaction counts and user growth. These are used to anticipate future workloads.

Understanding the Environment

Understanding the environment is another key step. The environment includes the basic hardware used, the software in use (production applications, programming languages, utilities, etc.), the nature of the business and any business cycles, and the management environment.

Under the system hardware and software area, it is important to know the CPU type, the amount of memory installed, and the total amount of physical disk space available and in use. Figure 2 shows the configuration parameters observed via the LPS product SOS. The uname command shows the following info on a Unix machine:

$ uname -a

HP-UX Scratchy B.10.10 A 9000/807 1749501181 two-user license

MPE users can use the SHOWVAR HPCPUNAME command to show the CPU type.

When a CPU bottleneck is encountered an upgrade can be the answer, so knowing the current CPU size along with potential upgrade paths is essential. CPUs have been released in families with an easy upgrade path from one system in the family to another. Understanding the available systems is another key point in the environment. Hewlett-Packard provides extensive information on the internet that can be found starting at the following URL: http://www.hp.com/.

Configuration issues can be of key concern as well. Poor configuration parameters, particularly in the Unix environment (there are not as many of these to worry about in MPE), can be detrimental to the performance and capacity plans of any institution.

It is also important to understand the software in use for both production and utilities. Tiered pricing is often used. Upgrades to the CPU size can lead to large increases in user licensing. Sometimes it is these costs that most influence the upgrade decision.

SOS/9000 F.05(c) LPS TUE, JUN 24, 1997, 3:11 PM E: 00:24:33 I: 01:32

---------------------------System Configuration Info ---------------------------

| System Name: cure HP-UX Version: A.09.04 Total Mem: 524288 KB |

| CPU Type: 9000/887 Run Level: 2 Total VM: 262144 KB |

| Serial Number: 1989810201 Boot Time: 18 JUN 97 23:15 |

------ File System Configuration --------------- Process Configuration ---------

| bufpages 26214 nbuf 26214 | Pages |

| fs_async ON nfile 1982 | timeslice 100 maxdsiz 16384 |

| maxfiles 60 ninode 1500 | maxuprc 256 maxssiz 2048 |

| maxfiles_lim 1024 ntext 0 | nproc 1044 maxtsiz 16384 |

-- Swap Configuration ---------------------- Miscellaneous ---------------------

| nswapdev 10 | ncallout 1108 unlockable mem 2288 pages |

| nswapfs 10 | npty 128 iomemsize 40960 bytes |

------------------------------ IPC Configuration -------------------------------

| Messages msgmap 258 | Semaphores semvmx 32767 | Shared mem |

| msgmax 32768 bytes msgmni 50 | semmap 36 semaem 16384 | shmmax 256M |

| msgmnb 32768 bytes msgseg 7168 | semmni 75 semmnu 120 | shmmni 100 |

| msgsz 8 bytes msgtql 256 | semmns 350 semume 72 | shmseg 100 |

--------------------------------------------------------------------------------

Figure 2

Another important piece of information is the nature of the business. The capacity planner should know about the day to day, week to week, month to month and quarter to quarter production levels of the business. Some businesses are very seasonal. Producing capacity plans during periods of low demand, without some adjustments, could lead to a very poor plan. Figure 3 shows the CPU utilization for a business that peaks toward the end of the summer.

Figure 3

Capacity Planning Methods

Once the nature of a plan is understood, the science aspect of capacity planning needs to be understood. Capacity planning relies upon data. From this data an extrapolation is made into the future using information gathered from various sources about future growth and change. This process relies upon a statistical basis. Therefore a good understanding of certain basic statistical terms is important. Why? One reason is that some of the tools a capacity planner will need for evaluation purposes are statistics packages. Another reason is that statistical data provides keen insight into the nature of the environment. Questions such as "We reach 100% utilization, but how often and for how long?" can be analyzed using statistical data. Here are the four most used statistical terms:

Mean (the sum of the values divided by the number of observations), midrange (halfway between the highest value and the lowest), mode (the value or values that occur most frequently), and median (the middle value of the sorted list of numbers). Another important definition is the STANDARD DEVIATION. This is a number that measures the variation of the observations from the mean. It gives you an idea of how far from the mean the values range. Why is this important? A proper understanding of these statistical values will help warn you when statistics are deceiving.
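The terms just defined can be computed with a few lines of code. This is a small sketch using Python's standard statistics module; the CPU busy samples are made up for illustration, and I use the population standard deviation, which is one of two common conventions.

```python
import statistics


def summarize(samples):
    """Return the basic descriptive statistics for a list of measurements."""
    low, high = min(samples), max(samples)
    return {
        "mean": statistics.mean(samples),
        "midrange": (low + high) / 2,
        "mode": statistics.multimode(samples),  # a list, since ties are possible
        "median": statistics.median(samples),
        "std_dev": statistics.pstdev(samples),  # population standard deviation
    }


if __name__ == "__main__":
    cpu_busy = [20, 25, 25, 30, 100]  # illustrative CPU busy percentages
    for name, value in summarize(cpu_busy).items():
        print(f"{name:>8}: {value}")
```

With these samples the mean is 40 while the median and mode are both 25, and the standard deviation is about 30. A single interval at 100% pulls the mean well above what the system looks like most of the time, which is the kind of deception the large standard deviation warns about.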

Here is a sample of some statistical data available from Lund Performance Solutions products:

01/08/96                   SOS/3000 Report Card                   Page 1

Statistics for 12/05/95 07:00 AM to 12/05/95 06:00 PM

Item                   ---Mean--- ----SD---- Conf. ---High--- ---Low----

*** Global CPU Statistics ***

Total Busy                   40.9      27.97     2      100.0       13.2
AQ CPU %                       .1        .14     0         .7         .0
BQ CPU %                      5.4       1.27     4        9.3        3.2
CQ CPU %                     10.1       9.44     1       50.7        1.0
DQ CPU %                     21.7      21.83     0       83.8        7.5
EQ CPU %                       .0        .00     0         .0         .0
Memory Manager %               .9       1.56     0        6.8         .0
Dispatcher %                  1.0       1.03     0        4.8         .3
ICS/OH %                      1.7       1.72     0        6.2         .1
Pause for Disc I/O %          4.8       6.69     0       40.7         .0
Idle %                       54.3      30.00     3       86.2         .0

Figure 4

There are four basic methods of Forecasting used by the capacity planner. These are Benchmarking, Straight Line Statistical Forecasting, Educated Guessing, and Analytic Modeling. But before we launch into these we need to talk about the important step of defining a workload.

Workload Definition

This is an important concept. A workload is a combination of similar types of activity grouped by program usage or logon. Having a well thought out workload is important because, from the beginning, it is best to collect data in the same units in which general business activity is measured. If this is possible, the growth rates used to measure general business growth will be most easily translated to the workloads that have been defined. If there is no direct translation between the units the business uses to plan future growth and CPU utilization, the job is a little more complicated.

The way you define and use workloads depends upon your method of measuring the different aspects of utilization. This presupposes that some type of measurement tool is in use. There are a number of products available for measuring utilization. Most of these offer the ability to track by workload. In the Unix environment there are several commands that measure and collect data but none allow for workload definitions.
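Where the measurement tool does not support workload definitions, the grouping can be done after the fact on collected process data. The following is a rough sketch of the idea, assuming each record carries a logon, a program name, and CPU seconds; the workload names, program names, and logons are all hypothetical.

```python
from collections import defaultdict

# Hypothetical workload definitions: each workload lists the programs
# and logons whose activity belongs to it.
WORKLOADS = {
    "order_entry": {"programs": {"OE100", "OE200"}, "logons": set()},
    "accounts_payable": {"programs": {"AP010"}, "logons": {"MGR.AP"}},
}


def classify(program, logon):
    """Return the first workload whose program or logon list matches."""
    for name, rule in WORKLOADS.items():
        if program in rule["programs"] or logon in rule["logons"]:
            return name
    return "other"


def cpu_by_workload(records):
    """Sum CPU seconds per workload from (logon, program, cpu_seconds) records."""
    totals = defaultdict(float)
    for logon, program, cpu_seconds in records:
        totals[classify(program, logon)] += cpu_seconds
    return dict(totals)


if __name__ == "__main__":
    records = [
        ("CLERK1.SALES", "OE100", 12.0),
        ("MGR.AP", "QUERY", 3.0),
        ("CLERK2.SALES", "OE200", 8.0),
        ("OPER.SYS", "BACKUP", 30.0),
    ]
    print(cpu_by_workload(records))
```

Keeping an explicit "other" bucket makes it obvious how much activity the definitions fail to capture, which is a useful check on whether the workloads are well thought out.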

Overall view versus testing the high CPU usage.

Once the workloads have been established and some type of data collection is in practice, there is another type of data collection that is vitally important. Collecting the units of production that best reflect the business of the company is also recommended. This allows computer usage metrics to be correlated with the amount of production that causes them, and it allows a study of the type of production by month and quarter. Questions such as the impact of production changes on response times, throughput, and CPU percentage can then be addressed.
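One simple way to test whether a unit of production really drives computer usage is to compute a correlation coefficient between the two series. The sketch below computes the Pearson coefficient by hand; the monthly order counts and CPU figures are invented for illustration, and the choice of Pearson correlation is my assumption about what "correlation" should mean here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    orders_per_month = [1200, 1500, 1700, 2100, 2600, 3000]  # illustrative
    mean_cpu_busy = [31.0, 36.5, 40.0, 47.5, 57.0, 64.0]     # illustrative
    print(f"correlation: {pearson(orders_per_month, mean_cpu_busy):.3f}")
```

A coefficient near 1 suggests the business unit is a usable driver for forecasting CPU. A value near 0 suggests another unit of production should be tried.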

Methods and Approaches

The complete view required by a thorough capacity plan does not rely on any one method. Rather the picture needed for a wise plan consists of several pieces. These pieces are described as educated guessing, benchmarking, statistical analysis, and analytic modeling.

Educated Guessing

The overriding method needed during the entire process begins as "educated guessing". Without the use of additional methods it remains only educated "guessing". But the more education that is applied to the process the more "educated" the method and the less "guessing". Educated guessing consists initially of the general experience the capacity planner possesses and grows from there as the planner uses the other methods that will be discussed to become more and more "educated".

Benchmarking

The first method of potential use for the capacity planner is called benchmarking. In the purest sense this means taking your current CPU load, connectivity, etc. and modeling all of the pieces in a new environment. This is usually very expensive and difficult. The hardware vendor is not likely to have all of the pieces needed to replicate your system lying around, and getting your staff to spend any period of time in a "test" environment is usually impossible. There are some steps that amount to a partial benchmark and can be very helpful. If your software was purchased from a third party vendor, a great deal of understanding can be gained from checking the system sizing other users of the same software have. If you have a multiple processor system it may be possible to "rent" the additional processor you have been considering buying until you are able to determine that it really is helpful. Some general guidelines can be gathered by comparing in looser terms the number of users your system has and the type of software (in-house developed COGNOS, or C programming). This information can be gathered by talking to other users or by asking your vendor. Benchmarking, in the more general usage, can be very helpful in providing guidelines or a general feeling of when problems may be encountered. In its very specific use it can give you very exact knowledge of the potential capacity of a system.

Statistical Analysis

The next method is statistical analysis. This method uses historically collected data. This data is studied to derive average, low and high utilizations for any measurement. This requires an understanding of the "rules of thumb" that dictate when these values are becoming a problem. It also relies upon an understanding of which values are important and which are key indicators. The statistical analysis method also helps gain a thorough understanding of the seasonal ups and downs of the business. In certain environments this understanding can make or break a capacity plan. If your business is very cyclical, one month can make or break the company. If the capacity planner does not understand this and the plan comes from a season of low usage, the company can suffer financially. When that one month period arrives you are left with little time to add the additional or increased hardware that will be needed. Part of the statistical analysis can consist of a linear regression analysis on the data. A linear regression can take the historically collected data and in broad terms "forecast" where the current machine will be on CPU utilization in the future.
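The linear regression mentioned above can be sketched in a few lines. This is an ordinary least-squares fit of monthly mean CPU utilization against month number, followed by a projection of when a planning threshold would be crossed. The function names, the sample data, and the 80% threshold are illustrative assumptions, and a straight line by itself ignores the seasonal effects just discussed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until(threshold, slope, intercept, current_month):
    """Months from current_month until the fitted line reaches threshold.

    Returns None when utilization is flat or falling.
    """
    if slope <= 0:
        return None
    return max(0.0, (threshold - intercept) / slope - current_month)


if __name__ == "__main__":
    months = [1, 2, 3, 4, 5, 6]
    cpu_busy = [41.0, 44.5, 45.0, 49.0, 50.5, 54.0]  # illustrative monthly means
    slope, intercept = fit_line(months, cpu_busy)
    print(f"growth: {slope:.2f} points per month")
    print("months until 80% busy:", months_until(80.0, slope, intercept, months[-1]))
```

For a seasonal business, one hedge is to fit the line through each year's peak month rather than through every month, so the forecast tracks the load that actually breaks the system.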

Analytic Modeling

This method is the final piece needed in a capacity planner’s toolbox. Analytic Modeling is based on Queuing Network Modeling Theory. It is described in the following quote:

"A Queuing Network is a particular approach to computer system modeling in which the computer system is represented as a network of queues which is evaluated analytically. A network of queues is a collection of service centers, which represent system resources, and customers, which represent users or transactions. Analytic evaluation involves using software to solve efficiently a set of equations induced by the network of queues and its parameters."

Quantitative System Performance, Edward D. Lazowska, John Zahorjan, G. Scott Graham, Kenneth C. Sevcik, Prentice Hall.

In its simplest form, queuing network theory can be used to solve for changes in response times, CPU utilization, etc. just by knowing several of the needed parts of the equation. These have been expressed as mathematical rules such as Little’s Law, the Utilization Law and the Response Time Law. In this paper we are not going to get into the calculations involved, but the interested reader is invited to dig deeper. The book previously quoted is a good place to start. Here I will say only that there are products that solve the very complex collection of interdependencies represented in the computer system. Figure 5 represents the queues involved in a computer.

Analytic Modeling can be applied to a single system using the equations referred to in the previous paragraph. There are also tools that apply the mathematical equations to multiple service centers. One such application is called Forecast Capacity Planner from Lund Performance Solutions. Tools based upon Queuing Network Modeling Theory can offer some scientific support to future possibilities. They allow for the "what if" questioning the capacity planner must perform. BMC and BEST/1 offer tools intended to do that "what if-fing" also, but I am not sure if they are based on this theory.

Figure 5
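For readers who want a first taste of the laws named above, here is a minimal sketch of the three in their standard textbook forms: the Utilization Law (U = X * S), Little's Law (N = X * R), and the interactive Response Time Law (R = N / X - Z), where X is throughput, S is service demand per transaction, N is the number of customers, R is response time, and Z is think time. The function names and the numbers in the example are my own and purely illustrative; this is not how any commercial modeling product works internally.

```python
def utilization(throughput, service_demand):
    """Utilization Law: U = X * S."""
    return throughput * service_demand


def customers_in_system(throughput, response_time):
    """Little's Law: N = X * R."""
    return throughput * response_time


def interactive_response_time(users, throughput, think_time):
    """Response Time Law for interactive users: R = N / X - Z."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return users / throughput - think_time


if __name__ == "__main__":
    users = 60           # logged-on clerks (illustrative)
    throughput = 2.0     # transactions completed per second
    think_time = 25.0    # seconds a clerk spends between transactions
    cpu_demand = 0.3     # CPU seconds per transaction

    r = interactive_response_time(users, throughput, think_time)
    print(f"response time: {r:.1f} s")
    print(f"CPU utilization: {utilization(throughput, cpu_demand):.0%}")
    print(f"transactions in the system: {customers_in_system(throughput, r):.0f}")
```

With measurements for any two of the quantities in a law, the third follows directly, which is what makes simple "what if" questions answerable before a full model is built.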

Conclusion

Capacity Planning is one of the lost and ignored science/arts. It really is one of the undiscovered areas. It relies heavily upon its sister science of performance monitoring and evaluation. Both must be practiced in order to ensure success in the data processing department. Without proper evaluation of performance and preparation, poor performance and lack of capacity will catch up with almost any environment. Practice the art of Capacity Planning and avoid poor performance by anticipating needed upgrades and changes before problems occur.

©Copyright 1998 Interex. All rights reserved.