HPWorld 98 & ERP 98 Proceedings

POPS: HP-UX Performance Optimized Page Sizing

Eddie Berin

High Performance Systems Division
Hewlett-Packard Company
19111 Pruneridge Avenue MS 44M3
Cupertino, California 95014
Phone: (408) 447-1563
Fax: (408) 447-5730
E-mail: eddie_berin@hp.com



Performance Optimized Page Sizing

A White Paper

Audience: General

Introduction

Performance Optimized Page Sizing (POPS) is a capability in HP-UX 11.00 that allows the memory page size to be adjusted. It is also referred to as Variable Page Sizing or Large Pages in other industry documents. Traditionally, UNIX systems from most vendors, including Hewlett-Packard's HP-UX 10.x, supported only a fixed page size. The advantage of a fixed page size was simpler memory management and a good balance between application performance and efficient use of physical system memory.

However, many applications today require much more memory than in the past. UNIX has roots reaching back several decades, to a time when a fixed page size was adequate for a centralized server supporting multiple terminals. Today, systems support not only larger applications, but also a larger operating system (OS) and a graphical user interface (GUI). It is unrealistic to expect that a fixed 4 KB page size that works well for an editor would also be optimal for a large spreadsheet or CAD application. With a fixed page size, a system administrator must settle for a single page size for all of the applications that will execute on the server. The end result is that application performance is degraded by sub-optimal page sizes, to the point where a fixed page size becomes a performance bottleneck.

Hewlett-Packard introduced POPS in HP-UX 11.00. The 64-bit capability of HP-UX 11.00 and POPS allow applications to maximize performance by providing flexible memory utilization.

This paper discusses the concept, implementation, and use of Performance Optimized Page Sizing.

Concept

Page sizing affects performance in several ways. A hardware mechanism inside the CPU called a translation lookaside buffer (TLB) caches the mappings from the virtual addresses used by an executing program to pages of physical memory. If a memory reference cannot find the appropriate TLB entry, the system must evict an existing TLB entry and set up the required one. This situation is termed a TLB miss and requires extra cycles to process. Programs small enough to be covered by the available TLB entries avoid most TLB misses. Larger programs that span many pages are more susceptible to them.
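The effect of page size on TLB misses can be illustrated with a toy model. The following is a minimal sketch, not a model of any PA-RISC TLB: it assumes a fully associative TLB with least-recently-used replacement, and the 64-entry capacity, the function name count_tlb_misses, and the access trace are all hypothetical values chosen for illustration.

```python
from collections import OrderedDict

KB = 1024
MB = 1024 * KB


def count_tlb_misses(addresses, page_size, tlb_entries):
    """Count misses in a toy fully associative TLB with LRU replacement."""
    tlb = OrderedDict()  # virtual page number -> placeholder translation
    misses = 0
    for addr in addresses:
        vpn = addr // page_size          # virtual page number for this access
        if vpn in tlb:
            tlb.move_to_end(vpn)         # hit: mark as most recently used
        else:
            misses += 1                  # miss: translation must be loaded
            if len(tlb) >= tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used entry
            tlb[vpn] = True
    return misses


# Hypothetical workload: two sequential passes over 1 MB, one access per 4 KB.
trace = [offset for _ in range(2) for offset in range(0, 1 * MB, 4 * KB)]

small = count_tlb_misses(trace, page_size=4 * KB, tlb_entries=64)
large = count_tlb_misses(trace, page_size=64 * KB, tlb_entries=64)
print(small, large)
```

With 4 KB pages the 1 MB region spans 256 pages, more than the assumed TLB can hold, so all 512 accesses miss. With 64 KB pages the region spans 16 pages, which fit in the TLB, so only the first 16 accesses miss.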

For example, 4 GB split into 128 pages requires fewer TLB entries than 4 GB split into 256 pages, because the number of pages determines the number of TLB entries needed. However, the number of TLB entries is limited by CPU design constraints such as cost and available space on the chip. Large pages take load off the TLB since fewer entries are needed to manage them, but they risk wasting space if the application does not fully use the reserved area. Small pages allow a finer-grained fit for an application, but shift the load back to the TLB by requiring more entries. Clearly, a fixed page size is a compromise between memory efficiency and performance - any attempt to improve one comes at the expense of the other. Note that unless there are TLB misses, the use of variable page sizes has little impact on system or application performance. In heavily loaded systems, however, variable page sizes can significantly improve performance.

There are several approaches to minimizing TLB misses. A simple approach is to increase the number of TLB entries in the CPU, so that more translations can be held at once. The disadvantage is increased cost and complexity, and it ultimately only delays the point at which the TLB again becomes the limiting factor.
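The trade-off between TLB load and wasted memory can be made concrete with some simple arithmetic. The sketch below is illustrative only: the region size, the 64-entry TLB, and the helper names pages_needed, wasted_bytes, and tlb_reach are assumptions, not values taken from HP-UX or from any particular CPU.

```python
KB = 1024
MB = 1024 * KB


def pages_needed(region_bytes, page_size):
    """Pages (and so TLB entries) required to cover a region, rounded up."""
    return -(-region_bytes // page_size)


def wasted_bytes(region_bytes, page_size):
    """Bytes reserved but unused in the final page of the region."""
    return pages_needed(region_bytes, page_size) * page_size - region_bytes


def tlb_reach(tlb_entries, page_size):
    """Amount of memory that can be mapped by the TLB at one time."""
    return tlb_entries * page_size


region = 10 * MB + 100 * KB  # hypothetical application data size
for page_size in (4 * KB, 64 * KB, 4 * MB):
    print(page_size // KB, "KB pages:",
          pages_needed(region, page_size), "entries,",
          wasted_bytes(region, page_size) // KB, "KB wasted,",
          tlb_reach(64, page_size) // KB, "KB reach with 64 entries")
```

For this hypothetical region, 4 KB pages need 2585 entries and waste nothing, 64 KB pages need 162 entries and waste 28 KB, and 4 MB pages need only 3 entries but waste 1948 KB.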

HP-UX emphasizes performance and efficiency, and its POPS implementation increases memory efficiency compared to a fixed-page-size architecture. Performance Optimized Page Sizing minimizes the number of TLB entries an application needs, and needing fewer TLB entries increases overall system performance. Furthermore, applications that benefit from larger page sizes achieve higher performance simply because they do not incur the TLB misses inherent in a smaller page size.

Increasing the efficiency of the TLB is the primary way in which POPS increases overall system performance. A further key feature of POPS is that the optimal page size can be chosen on a per-application basis. It is no longer necessary for all applications to use the same page size, since each application can use its own. Another benefit is the ability to adjust the page size to suit the system configuration.

For example, if a system runs a low-to-moderate number of concurrent applications, TLB misses are less likely. This allows the application-specific page size to be reduced, providing higher memory utilization while keeping TLB misses low. If a system runs a high number of large applications, TLB misses are more likely and increasing the application-specific page size is appropriate. Both cases illustrate the inherent advantage of POPS: it allows an application to be fitted to its system environment for maximum performance and resource efficiency.

Implementation and Use of Performance Optimized Page Sizing

The HP-UX 11.00 implementation of POPS requires use of a PA 8x00-based system. Combined with the 64-bit capability of HP-UX 11.00, POPS adds another dimension to the performance delivered by HP 9000 servers.

POPS can be used in two ways.

1. System default page size set to pre-determined optimal page size

2. Applications can specify page size at run time via the command line

The first case is useful for a dedicated application server. For example, a server that executes only one application (a large database) or several copies of the same application (CAD programs) is best served by a single page size. Since the application either runs by itself or alongside identical copies, it is straightforward to determine the optimal page size by monitoring. The system can then be configured to always use that page size. For HP-UX 11.00, the page size ranges from 4 KB to 64 MB.

The second case demonstrates the maximum flexibility of POPS. For example, a system administrator or end user can specify a page size from the command line with the chatr command, even for a third-party application. In many cases, this is the easiest way to use POPS since recompiling a third-party application is rarely an option. POPS gives all applications some degree of control over their page size, and it would not be unusual to use both of the above methods simultaneously to maximize system performance.
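As an illustration of the second case, the sketch below builds (but does not run) a chatr command line. This paper does not name the options involved. The +pd (data page size) and +pi (instruction page size) options are taken from the chatr(1) manual page for HP-UX 11.x, and the list of accepted sizes is an assumption based on the 4 KB to 64 MB range stated above, so both should be checked against the manual page on the target system. The executable name ./dbserver and the helper chatr_page_size_args are hypothetical.

```python
# Sizes assumed to be accepted by chatr; verify with chatr(1) on the target.
ASSUMED_SIZES = ("4K", "16K", "64K", "256K", "1M", "4M", "16M", "64M")


def chatr_page_size_args(executable, data_size=None, text_size=None):
    """Build (but do not run) a chatr command line requesting page sizes."""
    args = ["chatr"]
    # +pd requests a data page size, +pi an instruction page size.
    for flag, size in (("+pd", data_size), ("+pi", text_size)):
        if size is None:
            continue  # leave this attribute of the executable unchanged
        if size not in ASSUMED_SIZES:
            raise ValueError(f"unsupported page size: {size}")
        args += [flag, size]
    return args + [executable]


command = chatr_page_size_args("./dbserver", data_size="64M", text_size="4M")
print(" ".join(command))
```

The resulting command line would be run on the HP-UX system itself. The requested size is a preference recorded in the executable, and the system may still allocate smaller pages when larger ones are not available.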

A system may reach a saturation point that prevents allocation of the optimal page size. At that point, HP-UX 11.00 simply allocates the default 4 KB page size until resources are available. The addition of POPS to the HP-UX architecture provides further system efficiencies that add to existing system performance.
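The fallback behavior can be pictured with a very small sketch. This is not the HP-UX allocator: it assumes, purely for illustration, that the decision depends only on whether a free contiguous block at least as large as the requested page exists, and the name choose_page_size is hypothetical.

```python
KB = 1024
MB = 1024 * KB
DEFAULT_PAGE_SIZE = 4 * KB


def choose_page_size(requested, largest_free_block):
    """Return the requested page size if it can be satisfied,
    otherwise fall back to the 4 KB default."""
    if largest_free_block >= requested:
        return requested
    return DEFAULT_PAGE_SIZE


# Plenty of contiguous memory: the request is honored.
print(choose_page_size(1 * MB, largest_free_block=16 * MB) // KB, "KB")
# Saturated system: only small free blocks remain, so 4 KB pages are used.
print(choose_page_size(1 * MB, largest_free_block=256 * KB) // KB, "KB")
```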

Summary

A fixed system page size is part of the chip design and forces a trade-off between performance and resource utilization. Depending on the application, the page size may help or hurt efficiency. This is an inherent limitation of a fixed page size.

Performance Optimized Page Sizing improves system resource utilization through efficient TLB management. Applications can automatically tailor their resource usage to maximize performance or to conserve resources as needed. The customer can use a default POPS setting to take advantage of POPS without additional work. For the customer seeking to squeeze the maximum performance from the system, POPS is the tool to use.

For more information:

http://www.hp.com/go/hp-ux

©Copyright 1998 Interex. All rights reserved.