5.2 Multiple Buses
6.1 SCSI Performance
8 Hardware vs. Software Compression
9.1 Concurrency
10 Networking
11 CPU Power
12 OmniBack II Performance Examples
12.1 712/80 Workstation DDS2
12.2 NetServer LX DDS3
12.3 DLT4000 Non-compressed
12.4 DLT4000 Compressed
12.5 2 ATL 4/52 DLT Autoloaders
12.6 K400 ATL 4/52 DLT Autoloader
12.7 K400 STK 9490 Timberline
12.8 K400 2 x STK 9490 Timberline
12.9 SAP 4 x HP DLT 4/48 Libraries
12.10 Oracle 4 x HP DLT 4/48 Libraries
12.11 Netserver LX - HP DLT 4/48 Libraries
12.12 Netserver LS DLT7000
12.13 Vectra DDS3
12.14 T600 STK Redwood JFS Filesystem
13 OmniBack II Configuration for Performance
13.1 Software Compression
13.2 Hardware Compression
13.3 CRC Checking
13.4 Raw Disk versus Filesystem
13.5 Concurrency
13.6 Load Balancing
13.7 Reconnects
13.8 Segment Size
13.9 Block Size
13.10 JFS/VXFS filesystems
13.11 Logging Files to the OmniBack II Database
13.12 4 drive DLT4000 Exchanger and SCSI buses
13.13 SAP Backup
This document is intended to explain the aspects of HP OmniBack II and system configuration that affect backup and restore performance, and what type of performance a user can expect with various platforms and backup devices.
People often ask questions such as, "If I buy three DDS2 drives for backup what type of performance can I expect with OmniBack II?". The correct answer depends on a variety of factors such as type of system, type and number of disks and tape drives, number and type of SCSI buses, system configuration and OmniBack II configuration.
This sounds like a long list of complicated variables to put into a "performance equation" but usually it is not as difficult as it first appears. Typically, with an understanding of the data flow during backup and the type of hardware available it is possible to make some reasonable judgements about what level of performance can be expected. One exception to this is when multiple areas are simultaneously being "pushed to the limits" of the available performance. In this case it is necessary to examine all of the possible variables or limitations in the system.
In this paper we will take the following approach:
Learning how OmniBack II moves the data around the system in order to perform a backup and what system resources are used to do this.
Looking at the practical, usable performance of some of these resources.
Looking at actual measured OmniBack II performance and the system configurations used when these measurements were made.
Giving some OmniBack II and system configuration tips.
With this information it should be possible to evaluate the level of performance a customer can expect.
In this paper, only backup performance is discussed. This is because much more data is backed up than is ever restored, and backup and restore operations normally require the same system tuning for optimal performance. Restore performance is typically very similar to backup performance unless some unusual device is being used that is capable of much faster writing than reading.
To understand how data flows during backup and restore sessions it is important to understand the basic architecture of OmniBack II. As the diagram shows, there are four main architectural blocks in the OmniBack II Architecture. Typically, for performance, the two most important blocks are the disk agent and the media agent because these are the two blocks that are responsible for handling the bulk data.
User Interface - The purpose of the user interface is to allow communication between OmniBack II users and the Cell Manager. The reason this is a separate block is that it allows the user and the backup manager to be on different machines. The User Interface must be installed on any machine from which users access OmniBack II. Note that with HP-UX it is also possible to access OmniBack II from remote machines using the X11 remote display features, but this offers reduced performance and requires that users have a login on the manager machine.
Cell Manager - The cell manager is the brains or intelligence of the system. The cell manager reads and writes to the OmniBack II database, where it tracks a variety of information such as information about the backed up data, backup media, logical devices and configured clients. The cell manager is responsible for starting and controlling the other two blocks of the architecture: Disk Agents and Media Agents. The cell manager is installed on only one machine within each backup environment.
Disk Agent - The disk agent is responsible for reading data from the disk during backup and writing data to the disk during restore. It sends/receives (backup/restore) this data to/from a media agent. A disk agent must be installed on every machine where there is data to be backed up. OmniBack II offers several special database integrations for on-line backup; architecturally, these agents can simply be thought of as a special type of disk agent.
Media Agent - The media agent writes data to the tape during a backup and reads data from a tape during a restore. OmniBack II has the capability to back up to devices other than tapes (i.e. files, optical disks), but for simplicity the word "tape" will be used rather than "backup media". The media agent receives/sends (backup/restore) this data from/to one or more disk agents. A media agent must be installed on every machine where there is a tape drive used as a backup device.
Some important things to note:
Each of the architectural blocks can be running on a separate machine. Normally, for the highest performance the disk and the media agents are configured on the same machine.
If an OmniBack II logical device is configured for concurrency there can be multiple disk agents writing to a single media agent.
Each media agent writes to only one tape drive.
If the user has multiple licenses, OmniBack II can run multiple media agents simultaneously.
The architectural blocks are a logical way of thinking about how OmniBack II works, not an actual description of the processes that run during a backup. Most of the blocks are composed of multiple processes.
To be able to estimate backup/restore performance it is necessary to understand the steps the data must flow through in order to get from a disk to a tape or vice versa. The data has to flow through multiple buses and/or networks, and it is important to realize where possible bottlenecks lie. The data is, in a sense, transferred twice: once into the system from the disk/filesystem and then out to a tape drive.
In the diagrams above it is possible to see the path the data flows through during a backup or restore operation. The simplest case to understand is a single disk drive and a single tape device. The data flows from a disk drive through a SCSI bus, then through some type of system bus, onto a system memory bus and into a memory buffer. This part of the transfer is controlled or performed by the OmniBack II disk agent. From this memory buffer the OmniBack II media agent takes the data over a system bus (or buses), out to a SCSI bus and finally to a tape drive.
5.2 Multiple Buses
In the figures above, all the buses are labeled as separate buses (i.e. SCSI bus A, SCSI bus B); however, often the same buses are used for bringing the data into the system and sending it back out to the tape drive. For example, if the disk being backed up is a fast-wide SCSI disk and the tape drive is on a single ended (SE) SCSI bus, these two buses are physically separate. However, these two separate buses might be connected to the same system bus. It is important to realize that in this case the system bus has to transfer the data twice.
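To make this concrete, the following minimal sketch estimates local backup throughput as the minimum of the stages in the chain, counting the shared system bus twice because the data crosses it once on the way in and once on the way out. The numbers are illustrative only, not measured values:

    # Minimal throughput sketch: the backup runs at the slowest stage in the
    # chain. All rates are in MB/s; the values below are illustrative.
    def local_backup_rate(disk_scsi, tape_scsi, system_bus):
        # The system bus carries every byte twice (disk -> memory,
        # memory -> tape), so only half of its usable bandwidth is
        # available to the backup stream.
        return min(disk_scsi, tape_scsi, system_bus / 2.0)

    # Example: FW-SCSI disk bus (~9), SE-SCSI tape bus (~4), one shared
    # system bus (~10 usable):
    print(local_backup_rate(9.0, 4.0, 10.0))  # -> 4.0, tape-side SCSI limits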
It is also possible that the single system bus shown above could, in fact, be multiple levels of buses. These system buses are the buses that the SCSI or networking cards are plugged into. On an HP T500 or H class server this would be the HP-PB (also called the NIO) bus. On an HP K class server this could be an HP-PB bus or the HSC bus. On a PC server this would be one of the industry standard EISA, ISA or PCI buses.
5.3 Multiple Disk Single Tape
With OmniBack II it is possible to back up multiple disk drives simultaneously to a single tape drive. In OmniBack II this concept is called concurrency and can be defined at the logical device level.
5.4 Network Backup
The networking case shown above is interesting because it shows that the data flows through a longer chain of interfaces when a network backup or restore is done. It is important to note that the first line of the network case shown above runs on the system where the disk drive is connected, while the second line uses the hardware on the machine where the tape drive is connected.
5.5 Multiple Disks Multiple Tapes - Combined
The cases shown are perhaps the most common, but there are endless combinations and variations of them. For example, it is very common to have the local backup cases combined with a network backup, or many machines backing up simultaneously over the network.
It has been shown above that the data has to flow through a number of system buses. It is now necessary to look at the actual performance capabilities of some of the elements involved. Here, actual performance values that have been observed will be discussed rather than listed specifications; the difference between the two can be very misleading. These numbers are not guaranteed, but every attempt has been made to give the most realistic numbers possible. Typically, there is a large range of performance for a particular class of devices, and specifics are given where possible.
It is also important to realize that the performance of a bus is typically limited by the slowest device active on the bus. A good analogy is a car that can drive 200 km/hour stuck behind a car that can only go 100 km/hour: the slower car determines the speed of all the cars until they get off the road.
6.1 SCSI Performance
It is important to understand the different types of SCSI-2 buses, such as single ended narrow, differential narrow, fast narrow single ended, single ended wide, fast differential wide, etc. For performance, there is no difference between a single ended bus and a differential bus. Differential wiring simply allows for longer connections and is generally more reliable at higher speeds.
On Windows systems there are many types of buses available but generally these fall into one of the following classes:
Narrow SCSI-2 - The fast narrow SCSI-2 bus (data path is 8 bits wide) is spec'd at 10 MB/s. This means that in practice the highest performance will be around 8 MB/s. Whether or not this type of performance can be achieved will depend on the actual SCSI card and the devices being used.
Wide SCSI-2 - The fast wide SCSI-2 bus (data path is 16 bits wide) is spec'd at 20 MB/s. This means that in practice the highest performance will be around 16 MB/s. Again, whether or not this type of performance can be achieved will depend on the actual SCSI card and the devices being used.
On HP-UX systems there are typically two kinds of buses available:
Single Ended Synchronous SCSI-2 - Often referred to as Single Ended Narrow SCSI. Synchronous narrow SCSI-2 (non-fast) is spec'd with a 5 MB/s transfer rate which equates to a real life performance of 2-4.5 MB/s (7.2-16.2 GB/hr).
One thing to note is that if HP DDS/DAT drives are being used on HP-UX systems, a maximum performance of 2-3 MB/s (7.2-10.8 GB/hour) is possible because of compatibility issues between this SCSI bus and the HP DDS drives. This would mean, for example, that a system with a single SCSI bus and a DDS/DAT drive would have a maximum backup performance of 1.5 MB/s (5.4 GB/hr), because the data must cross the bus twice.
Fast Wide Differential SCSI-2 - Typically referred to as Fast Wide SCSI in the HP systems world. This bus is spec'd at 20 MB/s. The FW-SCSI card for the HP-PB (NIO) bus has a performance of ~9 MB/s (32 GB/hr). The FW-SCSI interface for the HSC bus on the K-series servers has a performance of ~12 MB/s (43.2 GB/hr).
On Windows NT and HP-UX servers there are many different types of system buses in use, and their performance specifications vary widely. Normally it can be assumed that a usable bandwidth of 50-75% of the specification can be achieved. The most common buses are listed below, along with any known performance anomalies.
Buses on Windows NT Servers are:
PCI - Peripheral Component Interconnect - The most common high speed bus found in servers today. This bus is spec'd at a peak of 133 MB/s. Because of the excellent performance of this bus, it is possible to have multiple high speed SCSI and/or network cards installed on a single bus.
EISA - This bus is spec'd at 33 MB/s.
ISA - This bus offers little in the way of performance and is spec'd at less than 10 MB/s.
Buses on HP-UX Servers are:
PCI - This bus is spec'd at a peak of 120 MB/s. Because of the excellent performance of this bus it is possible to have multiple high speed SCSI and/or network cards installed on a single bus.
HSC - This bus is spec'd at 132 MB/s. Again, because of the speed of this bus it is possible to have multiple high speed SCSI and/or network cards installed on a single bus.
HP-PB (NIO) - The HP-PB (NIO) system bus found on many HP Servers including the T500 and H Class Servers is spec'd at 32 MB/s. Realistic performance numbers for this bus are ~10 MB/s. This means that only 1 FW-SCSI interface or 2-3 SE-SCSI interfaces should be active on this bus at one time.
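As a rough planning aid, the sketch below estimates how many SCSI interfaces can usefully be active on one system bus, using the realistic (not spec'd) figures from this section. All numbers are illustrative:

    # How many SCSI interfaces can be active on one system bus at a time?
    # Uses realistic (not spec'd) bandwidths; the figures are illustrative.
    def max_active_interfaces(bus_usable_mb_s, interface_mb_s):
        # Total traffic from all active interfaces should stay within the
        # bus's realistic bandwidth.
        return max(1, int(bus_usable_mb_s // interface_mb_s))

    # HP-PB (NIO) bus, ~10 MB/s realistic:
    print(max_active_interfaces(10.0, 9.0))  # FW-SCSI (~9 MB/s)  -> 1
    print(max_active_interfaces(10.0, 3.5))  # SE-SCSI (~3.5 MB/s) -> 2 (2-3 in practice)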
7.1 Measured Tape Drive Performance
Unlike most buses, tape drives normally function quite close to their specifications. The values given below are tested performance results using OmniBack II and not simply drive specifications.
Drive               Tested Performance, No Compression    Tested Performance, With Compression (tested ratio)
DDS1                180 KB/s, 0.65 GB/hour                360 KB/s, 1.3 GB/hour (2X)
DDS2                500 KB/s, 1.8 GB/hour                 1.0 MB/s, 3.6 GB/hour (2X)
DDS3                1.0 MB/s, 3.6 GB/hour                 2.0 MB/s, 7.2 GB/hour (2X)
Exabyte 8mm         490 KB/s, 1.7 GB/hour                 735 KB/s, 2.6 GB/hour (1.5X)
DLT4000             1.45 MB/s, 5.2 GB/hour                2.2 MB/s, 7.9 GB/hour (1.5X)
DLT7000             4.9 MB/s, 17.6 GB/hour                -
9490 - Timberline   5.0 MB/s, 18 GB/hour                  6.5 MB/s, 23.0 GB/hour (1.5X)
One difficulty with understanding tape drive performance is the non-linear throughput of many tape drives. The problem is that if a tape drive is not held in streaming mode (where the tape continues to move in the same direction), the performance drops off significantly, by up to 50%. What this means in practice is that if a system can only deliver data at 90% of the streaming rate of the tape drive, the actual throughput will only be about 50%. For example, if a tape drive has a 5.0 MB/s specification but the system configuration can only deliver data to the tape drive at a rate of 4.5 MB/s, the actual backup performance will be closer to 2.5 MB/s. The chart below shows the general relationship between data input and tape drive throughput (the rate at which data is written to the tape). Tape drives use buffering in an attempt to reduce this problem. Linear tape technologies (i.e. DLT) seem to suffer much worse from this problem than helical scan technologies; DDS/DAT drives, for example, do not show this characteristic at all.
It is important to keep this problem in mind when configuring a backup environment or trying to determine performance. It says that either the tape drives need to be the limit in performance, or the tape drive throughput needs to be at least twice the rate at which the system can deliver data.
This is also a reason why it sometimes makes sense to turn hardware compression off. If hardware compression is being used, this principle does not change, only the absolute numbers do. For example, for a DLT7000 drive, 1.0 in the chart below would represent 5.0 MB/s with hardware compression disabled. If hardware compression were enabled and the data were 1.5X compressible, 1.0 would represent 7.5 MB/s. If the system could only deliver data at 7 MB/s (data 1.5X compressible) to the drive, it would be better to have hardware compression turned off. With compression the throughput would be 3.75 MB/s (~50% of the 7.5 MB/s streaming rate), while without compression the throughput would be 5 MB/s (the streaming rate without compression).
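The following sketch models this decision, assuming, as described above, that a drive which cannot be kept streaming drops to roughly 50% of its streaming rate. The rates are illustrative:

    # Streaming model from the text: if the system cannot feed the drive at
    # its streaming rate, throughput drops to roughly half the streaming rate.
    # All rates are in MB/s of data arriving at the drive; values illustrative.
    def drive_throughput(system_rate, streaming_rate):
        if system_rate >= streaming_rate:
            return streaming_rate  # drive streams; the drive is the limit
        # Repositioning penalty: ~50% of streaming, and never more than
        # the system can deliver.
        return min(system_rate, 0.5 * streaming_rate)

    # DLT7000 example: 5.0 MB/s native, data 1.5X compressible, system
    # delivers 7.0 MB/s:
    print(drive_throughput(7.0, 5.0))        # compression off: 5.0 MB/s
    print(drive_throughput(7.0, 5.0 * 1.5))  # compression on: 3.75 MB/s -> turn it off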
Normally, either hardware or software compression is used when performing backups. Many people work on the assumption that most data is 2X compressible. Experience tells us that this is a bit optimistic and 1.5X is more realistic, but it should be noted that this is highly data dependent. Note: in order to maintain a compression ratio of 1.5X, the system must be capable of delivering data at a rate somewhat higher than this, normally at around 2 times the rate at which the drive can write non-compressed data to the tape. This is because data is normally not consistently 1.5X compressible: some data will be uncompressible while other data will be very compressible (2X or more). For example, in the graph below the average compression ratio is 1.5, but in order to maintain this the system must be able to deliver data at 2.0 times the non-compressed speed of the device. The fact that the data has a variable compression ratio means:
When calculating how many tape drives can be connected to a bus, a slightly higher compression ratio should be assumed (e.g. 2.0).
When calculating how fast data will be backed up in an environment, a conservative compression ratio of 1.5 should be used.
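A short worked example of these two rules follows; the drive rate is the DLT4000 tested value from the table above, and the 100 GB backup size is only an example (using 1 GB = 1000 MB, as elsewhere in this paper):

    # Two planning rules in one worked example (illustrative figures).
    native = 1.45                       # DLT4000 tested rate, MB/s, no compression
    bus_load_per_drive = native * 2.0   # size the SCSI bus assuming 2X data
    est_throughput = native * 1.5       # estimate duration assuming 1.5X data
    hours = (100 * 1000) / (est_throughput * 3600)
    print(f"{bus_load_per_drive:.1f} MB/s per drive on the bus; "
          f"~{hours:.1f} h for 100 GB")   # -> 2.9 MB/s per drive; ~12.8 h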
8 Hardware vs. Software Compression
For data compression, two different types are available: software compression and hardware compression. Hardware compression is done by the tape drive, while software compression is done by the CPU on the system where the data is being backed up. The best way to view hardware compression is simply as increasing the performance and capacity of the tape drive. Software compression is a very CPU-intensive and slow task and should almost never be used for local backups (tape drive connected to the same system as the disk). The one place where it does make sense to use software compression is when performing network backups. In the OmniBack II architecture, the data is compressed by the disk agent before being passed over the network. Because of this, it can make sense to use software compression either to effectively increase the throughput of a slow network or to reduce the amount of network traffic caused by backup.
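The sketch below illustrates when software compression pays off over a slow network. It assumes the Disk Agent host can compress at least as fast as the compressed stream must be produced; all figures are illustrative:

    # Software compression over a slow network: compressing on the Disk Agent
    # host multiplies the link's effective throughput by the compression
    # ratio, provided the client CPU can compress fast enough.
    def effective_network_rate(link_mb_s, ratio, client_compress_mb_s):
        # Both rates are expressed in MB/s of original (uncompressed) data.
        return min(link_mb_s * ratio, client_compress_mb_s)

    # 10 Mbit/s Ethernet (~1 MB/s usable), 1.5X data, client compresses
    # ~2 MB/s of original data:
    print(effective_network_rate(1.0, 1.5, 2.0))  # -> 1.5 MB/s of original data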
9 The Data Source
When analyzing backup/restore performance, often too much time is spent studying and evaluating the "back-end" of the system where the data is being written, e.g. tape drives. It is also important to look at the source of the data. In particular, OmniBack II performance testing on Windows NT has shown that the data source is often the limiting factor.
When analyzing what type of performance to expect from a disk or disk array, the actual hardware specifications of the disk cannot be used, because the layout of the filesystem on the disk will be the determining factor. The speed of the filesystem is also very dependent on the data on the disk: a filesystem with many small files will have slower performance than one with several very large files. An exact value cannot be given, but if the average file size is greater than approximately 1 MB this should not be a performance issue.
9.1 Concurrency
To help relieve the situation where the reading of the data is the limiting factor, OmniBack II has the concept of concurrency. Concurrency is where more than one filesystem can simultaneously write to a tape drive. Note that concurrency can hurt restore performance. When two filesystems are backed up to a tape simultaneously, the data for the two is mixed on the tape. This means that when a restore of one of the filesystems is done, it may take longer because more data (data from both filesystems) has to be read from the tape. Whether this actually hurts restore performance depends on the speed of the tape drive relative to the other elements involved, such as networks, filesystems, etc. This is not a problem if both of the filesystems are being restored simultaneously, because OmniBack II (version A.02.50 or later) allows concurrency during restores, meaning that one tape drive can send data to multiple filesystems simultaneously during a restore.
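A rough model of the single-filesystem restore cost, assuming N equally fast, equally sized streams interleaved on the tape:

    # With concurrency N, blocks from N disk agents are interleaved on the
    # tape, so restoring one filesystem reads (and skips) roughly N times
    # its own size. A rough model; assumes equal-sized, equal-rate streams.
    def single_restore_read_gb(fs_size_gb, concurrency):
        return fs_size_gb * concurrency

    print(single_restore_read_gb(2.0, 1))  # 2.0 GB read for a 2 GB filesystem
    print(single_restore_read_gb(2.0, 4))  # ~8.0 GB touched when 4 streams interleave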
9.2 HP-UX JFS/VXFS Filesystem
It is possible to greatly increase the reading speed of JFS filesystem backups with OmniBack II by setting a variable that enables a faster reading mode. The variable is OB2VXDIRECT=1 and should be set in the file /opt/omni/.omnirc on the system where the filesystem is located. If this variable is set, the following patches (or newer versions) must be installed to avoid data corruption:
PHKL_9404 on s700 HP-UX 10.01, PHKL_9405 on s800 HP-UX 10.01
PHKL_9413 on s700 HP-UX 10.10, PHKL_9414 on s800 HP-UX 10.10
PHKL_9415 on s700 HP-UX 10.20, PHKL_9416 on s800 HP-UX 10.20
Setting this variable will increase performance and decrease the amount of CPU resources needed during backup.
9.3 Windows NT NTFS Filesystem
Our testing has shown that the NTFS filesystem can use a lot of CPU resources. In our testing with high speed devices such as DLT7000 drives, the limiting factor was CPU resources for the reading processes.
10 Networking
When working with network backups it is very important to consider not just the capabilities of the networks being used, but also the speed of the various network interfaces. Having a 100 Mb/s FDDI network by no means says that each interface card is capable of 100 Mb/s. The OmniBack II architecture is flexible enough to allow the easy use of multiple networks between machines. All that is necessary to use multiple networks is to use the appropriate IP name when configuring the OmniBack II logical device. For example, suppose that on the machine OB there are two networks with the IP names OB and OB_fddi. To have the data go across the FDDI network, the OmniBack II logical device should be configured with the hostname OB_fddi.
On Windows NT 4.0 (at least up to SP3) multi-processor systems we have seen that only one processor handles all network traffic and therefore CPU power of a single processor can be the limit of network backup performance even if multiple network cards are installed.
11 CPU Power
The CPU is used for driving the various backup and restore processes. This leads to the possibility that CPU power can be a limitation to backup. On HP-UX systems, when doing JFS or RAW disk backups, CPU performance is typically not a problem. When doing Oracle 7 backups with Oracle's EBU, it has been seen that a T520 (8 way) has a limitation of ~100 GB/hour (~28 MB/s) caused by CPU limitations (see section 12.10 below). Note: SAP online backups are done using filesystem or RAW disk backup and so are typically not limited by the CPU.
For more information regarding CPU load and backup performance on HP-UX systems, refer to the paper "Backup/Restore Performance" by Paul Chu and Lily Fan, available on HP's ESP system.
As mentioned above, CPU performance on Windows NT systems is often the limiting factor in performance. This has been seen as the limitation when reading through the NTFS filesystem and also in trying to use multiple fast networks. In the performance examples below it is stated if CPU power was the limiting factor in the testing.
12 OmniBack II Performance Examples
12.1 712/80 Workstation DDS2
System:                  HP 712/80 Workstation
Configuration:           OBII filesystem backup (1.5X compressible data)
Tape Drives:             1 HP DDS2 drive, compression enabled
Disk Drives:             2 GB built-in drive
Backup Performance:      0.762 MB/s, 2.74 GB/hour
12.2 NetServer LX DDS3
System:                  HP NetServer LX 4 x 200 Pentium Pro
Configuration:           OBII filesystem backup (2X compressible data)
Tape Drives:             1 HP DDS3 drive, compression enabled
Disk Drives:             9 GB external SCSI disk
Backup Performance:      2.01 MB/s, 7.2 GB/hour
12.3 DLT4000 Non-compressed
System:                  HP H70 Server
Configuration:           OBII raw disk backup of a 2 GB external FW-SCSI disk; disk and tape drives on separate SCSI buses; non-compressed tape device
Tape Drives:             Single DLT4000 in ATL 4/52, connected to SE-SCSI bus, non-compressed device
Disk Drives:             Built-in 1 GB drive (not backed up); 2 GB FW-SCSI drive
Backup Performance:      1.4 MB/s, 5.04 GB/hr
12.4 DLT4000 Compressed
System:                  HP H70 Server
Configuration:           OBII raw disk backup of 2 x 2 GB external FW-SCSI disks; tape and disk drives on separate buses
Tape Drives:             Single DLT4000 in ATL 4/52 Autoloader, connected to SE-SCSI bus, compression enabled
Disk Drives:             Built-in 1 GB drive (not backed up); 2 x 2 GB FW-SCSI drives
Backup Performance:      2.8 MB/s, 10.1 GB/hr
12.5 2 ATL 4/52 DLT Autoloaders
System:                  T500/7, 2 GB memory, HP-UX 10.01
Configuration:           OBII concurrency=1, OBII filesystem backup
Tape Drives:             2 ATL 4/52 Autoloaders containing a total of 8 DLT4000 drives; 1 DLT per SE-SCSI bus; 2 HP-PB card cages used for the SE-SCSI cards; compression enabled
Disk Drives:             10-12 Model 20 RAID 5 arrays (M20), dual storage processors (64K cache); 3 M20s per FW-SCSI bus; 2 I/O cards used for the FW-SCSI cards
System Wide Performance: 12.1 MB/s, 43.6 GB/hr
Performance per Drive:   1.5 MB/s, 5.46 GB/hr
12.6 K400 ATL 4/52 DLT Autoloader
System:                  K400, HP-UX 10.01, 4 processors
Configuration:           OBII concurrency=2, OBII raw disk backup
Tape Drives:             ATL 4/52 Autoloader containing 4 DLT4000 drives; 1 DLT per SE-SCSI bus; 2 HP-PB card cages used for the SE-SCSI cards; compression enabled
Disk Drives:             8 F/W SCSI disks (1 and 2 GB) connected to 4 F/W HSC bus SCSI interfaces
System Wide Performance: 9.6 MB/s, 34.6 GB/hr
Performance per Drive:   2.4 MB/s, 8.6 GB/hr
12.7 K400 STK 9490 Timberline
System:                  K400, HP-UX 10.01, 4 processors
Configuration:           OBII concurrency=5, OBII raw disk backup
Tape Drives:             STK 9490 Timberline drive connected to its own F/W SCSI bus; compression device used
Disk Drives:             5 F/W SCSI disks (1 and 2 GB) connected to 4 F/W HSC bus SCSI interfaces
System Wide Performance: 6.6 MB/s, 23.7 GB/hr (does not include tape swapping time)
12.8 K400 2 x STK 9490 Timberline
System:                  K400, HP-UX 10.01, 4 processors
Configuration:           OBII concurrency=4, OBII raw disk backup
Tape Drives:             2 STK 9490 Timberline drives; each drive connected to its own F/W SCSI bus; each F/W SCSI interface connected to its own HP-PB bus; compression device used
Disk Drives:             8 F/W SCSI disks (1 and 2 GB) connected to 4 F/W HSC bus SCSI interfaces
System Wide Performance: 11.7 MB/s, 42 GB/hr (does not include tape swapping time)
Performance per Drive:   5.85 MB/s, 21 GB/hr (does not include tape swapping time)
12.9 SAP 4 x HP DLT 4/48 Libraries
System:                  T520 8 way, 8 NIO (HP-PB) buses, HP-UX 10.20
Configuration:           OBII SAP backup, SAP on JFS filesystem; logical device concurrency = 2; blocksize = 256K, segment size = 500 MB
Tape Drives:             4 x HP DLT 4/48 Library (DLT4000); total of 15 DLT4000 tape drives used; 2 drives connected to each HP-PB bus
Disk Drives:             1 EMC 3230 Disk Array, RAID 1 configured
System Wide Performance: 34 MB/s, 122 GB/hour
Performance per Drive:   2.2 MB/s, 7.6 GB/hour
12.10 Oracle 4 x HP DLT 4/48 Libraries
System:                  T520 8 way, 8 NIO (HP-PB) buses, HP-UX 10.20
Configuration:           Oracle (7.3.2.2) EBU backup; logical device concurrency = 2; blocksize = 256K, segment size = 500 MB
Tape Drives:             4 x HP DLT 4/48 Library (DLT4000); total of 16 DLT4000 tape drives; 2 drives connected to each HP-PB bus
Disk Drives:             1 EMC 3230 Disk Array, RAID 1 configured
System Wide Performance: 28 MB/s, 100 GB/hour
Performance per Drive:   1.85 MB/s, 6.7 GB/hour
Performance Limiter:     CPU 99% and system bus 92% utilized
12.11 Netserver LX - HP DLT 4/48 Libraries
System:                  HP NetServer LX 4 x 200 Pentium Pro
Configuration:           NTFS filesystem backup; logical device concurrency = 1; segment size = 500 MB
Tape Drives:             HP DLT 4/48 Library (DLT4000); total of 4 DLT4000 tape drives; 2 drives connected to each FW-SCSI bus
Disk Drives:             4 x 9 GB external SCSI disks; 2 drives per FW-SCSI bus
System Wide Performance: 10.25 MB/s, 36.9 GB/hour
Performance per Drive:   2.55 MB/s, 9.18 GB/hour
Performance Limiter:     CPU 99%
12.12 Netserver LS DLT7000
System:                  HP NetServer LS 4 x 166 Pentium
Configuration:           NTFS filesystem backup; logical device concurrency = 2; segment size = 500 MB
Tape Drives:             HP DLT7000 Exchanger, 15 slots; total of 1 DLT7000 tape drive
Disk Drives:             1 x 4 GB HP SCSI disk; DAC960 disk array with 2 x 2 GB striped disks
Backup Performance:      5.7 MB/s, 20.5 GB/hour
CPU Utilization:         Average CPU 70%
Performance Limiter:     Disk performance
12.13 Vectra DDS3
System:                  HP Vectra XU 2 x 166 Pentium Pro
Configuration:           NTFS filesystem backup; logical device concurrency = 2; segment size = 500 MB
Tape Drives:             HP DDS3 Exchanger, 6 slots
Disk Drives:             2 GB SCSI disk
Backup Performance:      2.1 MB/s, 7.6 GB/hour
CPU Utilization:         Average CPU 25%
Performance Limiter:     Tape drive (data ~2X compressible)
12.14 T600 STK Redwood JFS Filesystem
System:                             T600 8 way, HP-UX 10.20
Configuration:                      JFS filesystem backup; blocksize = 256K, segment size = 500 MB
Tape Drives:                        11 STK Redwood drives; each drive on a separate bus
Disk Drives:                        1 EMC 3430 Disk Array and 1 EMC 3130
System Wide Performance, Sustained: 152 MB/s, 536 GB/hour
System Wide Performance, Average:   135 MB/s, 476 GB/hour
CPU Utilization:                    CPU 44% and system bus 76% utilized
13 OmniBack II Configuration for Performance
At this point it has been shown which system configuration areas must be examined for performance; here, specific OmniBack II configuration areas will be covered.
13.1 Software Compression
By default, software compression should be disabled. Software compression should only be used for backup of many machines over a slower network, where the data can be compressed before being sent over the network. Even in this case, there should be several machines doing software compression simultaneously, so that the combined speed of the machines can compress data as fast as the network can transfer it. As mentioned earlier, software compression uses enormous amounts of system resources and is therefore a slow operation. If software compression is used, hardware compression should be disabled because trying to compress data twice can actually expand the data!
13.2 Hardware Compression
By default, hardware compression should be enabled. On HP-UX, hardware compression is enabled by selecting a hardware compression device file. On Windows NT, hardware compression can be selected in the device configuration. Hardware compression increases the speed at which a tape drive can receive data (because less data is written to the tape) and therefore should nearly always be enabled. NOTE: The compression button in the OmniBack II HP-UX Backup Editor GUI is for software compression, not hardware compression.
13.3 CRC Checking
By default, CRC checking should be disabled. CRC checking takes CPU power and runs at a speed determined by the power of the CPU. Typically, to achieve high backup performance it is necessary to have CRC checking disabled. CRC checking simply checks the data written to and read back from the tape drive. This same functionality is normally provided by the tape drive itself and by parity checking on the SCSI bus, which together cover most of the same data path.
13.4 Raw Disk versus Filesystem
If there is a filesystem on the disk, it is nearly always better to back up using filesystem backup. Filesystem backup allows for fast, easy single file restores and allows incremental backups. Filesystem backup only backs up the needed data, while raw disk backup backs up every byte on the disk, even if it is empty. However, there are cases where raw disk backup can provide better performance. For example, if there is an unusually large number of small files on a filesystem, accessing the disk through the filesystem can slow down access. Filesystem backup also takes more CPU cycles than raw disk backup, so if the system is very heavily loaded, raw disk backup may be a better choice.
13.5 Concurrency
By default, the concurrency should be set as low as possible. "As low as possible" is the minimum value at which the tape drives can be held in streaming mode. This will depend on the device type of the tape drives and the disk drives being backed up. With faster tape technology such as DLT7000 (1998), a concurrency value of 2-4 will probably be needed, as the sketch below illustrates. One case where it often makes sense to use a high concurrency factor is when backing up many machines over a network where software compression is being used; this allows many machines to compress data simultaneously.
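A minimal sketch of choosing the lowest workable concurrency, assuming each source filesystem delivers data at a known, roughly constant rate (the rates here are illustrative, not measured values):

    import math

    # Lowest concurrency that should keep the drive streaming.
    def min_concurrency(drive_streaming_mb_s, per_source_mb_s):
        return max(1, math.ceil(drive_streaming_mb_s / per_source_mb_s))

    # DLT7000 (~5 MB/s native) fed by filesystems that read at ~2 MB/s each:
    print(min_concurrency(5.0, 2.0))  # -> 3, within the 2-4 suggested above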
13.6 Load Balancing
With OmniBack II A.02.50 and later, load balancing is offered. This is where OmniBack II dynamically determines which filesystem should be backed up to which device. Normally, it is best to enable this feature. This is especially true when a large number of filesystems in a dynamic environment are being backed up. If the environment is relatively small and static, it may be possible to balance the data across the various tape drives by hand, but in most production environments, allowing OmniBack II to perform this automatically will give better results.
13.7 Reconnects
With OmniBack II A.02.50 and later, the option to have OmniBack II perform reconnections after network outages is offered. This is to recover from temporary WAN failures. In order to do this, OmniBack II must track all messages, and this can have negative effects on performance. This option should not be enabled by default.
13.8 Segment Size
The segment size for OmniBack II logical devices can be changed in the logical device editor. This changes the amount of data OmniBack II writes to a tape between writing filemarks and file information to the tape. Increasing this parameter will increase the importing speed of tapes and can in some cases improve backup performance. Note that because OmniBack II must hold the filename information in memory before writing it to the tape at the end of the segment, a larger segment size means that more memory will be used on the machine where the tape drive is connected. This is normally only an issue if a very large number of small files are being backed up. If there are any memory issues on the system with the tape drives, this value should be reduced.
For most tape technologies a segment size of 500 to 700MB should be more than sufficient.
13.9 Block Size
The block size for OmniBack II logical devices can be changed in the logical device editor. This changes the size of the blocks passed across the SCSI bus and written to the tape. Larger blocks typically make operations more efficient. The default size for most OmniBack II devices is 64 KB; increasing this to 128 KB or 256 KB will typically only help performance. If the block size is changed for a logical device, tapes to be used with this device need to be newly initialized or OmniBack II will issue a failure message.
13.10 JFS/VXFS filesystems
As mentioned earlier, setting the OB2VXDIRECT variable can greatly increase JFS filesystem backup performance.
13.11 Logging Files to the OmniBack II Database
Logging files to the OmniBack II database has not been seen as a performance limitation even in applications with many small files and very fast tape drives.
13.12 4 drive DLT4000 Exchanger and SCSI buses
Testing has shown that in order to get maximum performance from an HP 4-drive DLT4000 exchanger, 2 SCSI buses need to be used. This was also true when the fast wide differential bus was used, which would not be expected from looking at the bus specifications.
13.13 SAP Backup
SAP backups from OmniBack II are basically filesystem (or raw disk, depending on SAP configuration) backups, and the performance will not differ dramatically from normal filesystem backup. OmniBack II has a feature to balance the backup, and normally the best method is to enable balancing by time.