
IBM Information Management

IBM Informix on POWER7

Best Practices

A Technical White Paper


Contents

Introduction
What is an LPAR
PowerVM
Database workload
  Mission critical
  High throughput
  Low throughput
Simultaneous multithreading (SMT)
  Current SMT mode
  Enabling SMT
SMT1 vs SMT2 vs SMT4
  One core SMT testing
  Multiple core SMT testing
  Recommendation
Dedicated LPAR vs shared LPAR
  Dedicated LPAR
  Shared LPAR
    Capped or uncapped shared LPAR
    Virtual processors
    Processor folding
  Recommendation
Virtual I/O Server (VIOS) LPAR
  Recommendation
Additional LPAR recommendations
  Recommendation
Memory considerations
  RESIDENT parameter
  4 KB memory page size
  16 MB memory page size
  64 KB memory page size
  Recommendation
Feedback Directed Program Restructuring (FDPR) Tool
  Recommendation
I/O subsystem
  Read/Write access times
  KAIO/DIRECT_IO
  Queue depth
  AIO servers
  Recommendation
Network subsystem
  TCP traffic
  Local loopback
  Recommendation
Number of CPU virtual processors
  CPU-intensive workload
  I/O-intensive workload
  Recommendation
Affinity
  Recommendation
Understanding onstat -g glo
  Recommendation
Starting LPARs
  Recommendation
Appendix A: Recommendations summary
Appendix B: Useful commands
  amepat
  bosboot
  chdev
  ifconfig
  ioo
  iostat
  lparstat
  lsattr
  schedo
  smtctl
  vmo
  vmstat
Appendix C: Additional reading
  developerWorks: AIX Virtual Processor Folding is misunderstood
  IBM Systems: Understanding Micro-Partitioning®
  IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
  YouTube: Power7 Performance – Entitlement, VPs, Affinity, Memory
  Feedback Directed Program Restructuring (FDPR)
  developerWorks: VIOS Advisor
  IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
  IBM Redbooks Publication: AIX 5L Performance Tools Handbook
Appendix D: References


Introduction

This document describes best practices for using IBM® Informix® on AIX® POWER7® processor-based servers. Topics of discussion include logical partitions (LPARs), dedicated and shared resources, capped LPARs compared with uncapped configurations, and I/O configuration.

It is assumed that you have a working knowledge of Informix and are familiar with physical and logical database design for Informix. You also need to have skills in Informix server administration, and be familiar with configuration and tuning of the Informix server.

You should have basic skills in working with LPARs and system administration on AIX POWER7 systems.

What is an LPAR

An LPAR, short for logical partition, is the division of a computer’s processors, memory, and storage into smaller units. Each unit can run its own instance of the operating system and applications. This concept was introduced with the POWER5 processor.

PowerVM

IBM PowerVM® provides a secure, stable, and sophisticated virtualization environment for IBM Power Systems™. A single physical server can be divided into multiple virtual servers, each using anything from a fraction of a processor to all the processors on the physical machine. POWER7 systems support up to 1000 LPARs on a single server.

Businesses can deploy an appropriate mix of LPARs to meet their needs, sharing resources where applicable, or using fully dedicated resources as needed. With PowerVM, you have the flexibility of a heterogeneous environment with the LPARs running a combination of AIX and Linux operating systems.

Through the use of virtualization, PowerVM can respond to business needs faster through dynamic resource allocation. The Power® architecture also provides simultaneous multithreading (SMT), which allows increased throughput on your Power system.


Database workload

Before we address best practices for Informix on the POWER7 architecture, you must understand the type of workload that you expect to have. A mission critical database server that runs a consistent workload with a large number of users has different requirements than a database server that has a light workload where that work is sparse throughout the day.

[Figure: A POWER7 server divided by the Power Hypervisor into dedicated partitions (2 CPUs, 10 CPUs, and 1 CPU) and micro-partitions in two shared processor pools (pool 0: 2.5, 0.5, and 1.5 CPUs; pool 1: 2.4, 1.5, and 2 CPUs). The LPARs are labeled with example workloads: OLTP workload, primary database server (24 x 7 x 365), batch loads, secondary database server, sales reports, business analytics, and month-end processing.]


Mission critical

A mission critical database is one that your business relies on, where uptime and performance are most important. This type of database server typically requires dedicated resources with certain service level agreements (SLAs) in place regarding throughput (transactions per second) and uptime. This type of system is important to the business.

High throughput

If your system has a high throughput (transactions per second), it is important to understand how and when that workload uses the Informix database server. Is the work a consistent workload that uses the database server 24 hours a day, or does the work come in bursts or at specific times throughout the day? For instance, you might have a database server that has a consistent, high workload from 8:30 a.m. to 5:30 p.m. during the business day, and a different database server that experiences a heavy workload from midnight to 4 a.m. Using a dedicated system for each of these database servers wastes resources when one database server or the other is idle.

Low throughput

Some database servers have a very low throughput on a regular basis, requiring very little processing power and resources. Perhaps data is loaded into the database server once a day or at scheduled intervals, and otherwise the database server is idle. The data might be used to run end-of-day processing reports or possibly month-end processing reports. Placing a database server like that onto a large dedicated server with many CPUs would be a waste of processing power.

To configure a POWER7 system to handle the workload, it’s important to know which database servers will require more resources than others. It is important to understand when the workload will run on the database servers. Knowing this information will help you decide how to configure LPARs. For example, a dedicated LPAR might be a good choice for a mission critical database server with a consistent, high throughput. However, a shared-resource LPAR might be a good option for a database server where the workload occurs at a specific time of day or night, and the server is idle the rest of the time.

Simultaneous multithreading (SMT)

Simultaneous multithreading is the ability of a single processor to simultaneously dispatch instructions from more than one hardware thread context. The Power architecture uses SMT to provide multiple streams of hardware execution, and the POWER7 processor can be configured to run in SMT4, SMT2, or SMT1 (single-threaded mode). By using multiple SMT threads, a workload can take advantage of more of the hardware features that are provided on the Power system. POWER6® and POWER5 support SMT2 or SMT1.

IBM Informix has performed a series of benchmarks comparing SMT4 with SMT2 and SMT1. The results of these tests fall in line with industry benchmarks on POWER7 and with SMT testing.


Current SMT mode

To determine the current SMT mode, you can use one of the following AIX commands: lparstat, amepat, smtctl.

The following sample output from the lparstat command shows that SMT4 is being used, and that there are 128 logical CPUs. With four hardware threads per core, that means there are 32 physical CPUs (128 / 4 = 32).

# lparstat

System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB

%user %sys %wait %idle
----- ----- ------ ------
  1.0   0.5    0.0   98.5
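The core count quoted for the lparstat sample is simply the logical CPU count divided by the SMT thread count. As a rough sketch of that arithmetic, the Python below parses the "System configuration" line and performs the division. The function names are hypothetical, and the code assumes the smt field is numeric as in the sample.

```python
def parse_system_configuration(line: str) -> dict:
    """Split 'System configuration: k=v k=v ...' into a dict of strings."""
    _, _, rest = line.partition(":")
    return dict(pair.split("=", 1) for pair in rest.split())


def physical_cores(config: dict) -> int:
    """Logical CPUs divided by SMT threads per core (assumes a numeric smt field)."""
    return int(config["lcpu"]) // int(config["smt"])


# Sample line copied from the lparstat output in this section.
sample = "System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB"
config = parse_system_configuration(sample)
print(physical_cores(config))  # 32
```

The same division applies in the other modes: a 2-core LPAR exposes 2 logical CPUs in SMT1, 4 in SMT2, and 8 in SMT4.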

[Figure: Logical CPU (lcpu) layout of a 2-CPU LPAR (Core 1 and Core 2) in each SMT mode. In SMT1 the LPAR has lcpu 0 and 1, in SMT2 it has lcpu 0 through 3, and in SMT4 it has lcpu 0 through 7.]


# amepat

Command Invoked : amepat

Date/Time of invocation : Mon Sep 23 13:07:51 CDT 2013
Total Monitored time : NA
Total Samples Collected : NA

System Configuration:
---------------------
Partition Name : shake
Processor Implementation Mode : POWER7
Number of Logical CPUs : 128
Processor Entitled Capacity : 32.00
Processor Max. Capacity : 32.00
True Memory : 501.50 GB
SMT Threads : 4
Shared Processor Mode : Disabled
Active Memory Sharing : Disabled
Active Memory Expansion : Disabled

…..

# smtctl

This system is SMT capable.
This system supports up to 4 SMT threads per processor.
SMT is currently enabled.
SMT boot mode is not set.
SMT threads are bound to the same physical processor.

proc0 has 4 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0
Bind processor 2 is bound with proc0
Bind processor 3 is bound with proc0

proc4 has 4 SMT threads.
Bind processor 4 is bound with proc4
Bind processor 5 is bound with proc4
Bind processor 6 is bound with proc4
Bind processor 7 is bound with proc4

…….

proc124 has 4 SMT threads.
Bind processor 124 is bound with proc124
Bind processor 125 is bound with proc124
Bind processor 126 is bound with proc124
Bind processor 127 is bound with proc124
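The smtctl listing pairs every logical processor with the physical processor it runs on. The following Python is a minimal sketch of how that mapping could be extracted from captured output. The function name and the regular expression are illustrative assumptions based only on the sample text above, not part of any AIX tooling.

```python
import re
from collections import defaultdict

# Matches lines such as "Bind processor 5 is bound with proc4".
BIND_RE = re.compile(r"Bind processor (\d+) is bound with (proc\d+)")


def threads_by_core(smtctl_text: str) -> dict:
    """Map each physical processor name to its list of logical processor numbers."""
    cores = defaultdict(list)
    for logical, core in BIND_RE.findall(smtctl_text):
        cores[core].append(int(logical))
    return dict(cores)


# Excerpt of the smtctl sample shown in this section.
sample = """\
proc0 has 4 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0
Bind processor 2 is bound with proc0
Bind processor 3 is bound with proc0
proc4 has 4 SMT threads.
Bind processor 4 is bound with proc4
Bind processor 5 is bound with proc4
Bind processor 6 is bound with proc4
Bind processor 7 is bound with proc4
"""

mapping = threads_by_core(sample)
print(mapping["proc4"])  # [4, 5, 6, 7]
```

A core that reports fewer entries than the others would indicate that the LPAR is not running uniformly in the expected SMT mode.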


Enabling SMT

Simultaneous multithreading is set at the LPAR level. An SMT setting for a particular LPAR will not affect the settings for another LPAR.

SMT can be enabled or disabled with the following smtctl command.

smtctl -m {off|on}

To set the SMT threads to 4, the following command can be used. This command affects the current LPAR only, and the change is immediate.

smtctl -t 4

By default, the SMT change does not persist after the LPAR is rebooted. For an SMT change to persist after a reboot, the boot image must be remade with the bosboot command. See the full man pages for the bosboot and smtctl commands.


SMT1 vs SMT2 vs SMT4

IBM Informix has performed extensive benchmark tests comparing results when using multiple threads on POWER7. This type of testing is possible even when SMT4 is configured because the 1st thread is used for a processor, and when it nears full consumption, the 2nd thread is used, and so on. When all four threads are in use on a core, we see an increase in overall throughput by approximately 60%. While the overall throughput increases, it is important to note that single-thread response time does not scale linearly as more threads are used per core.

One core SMT testing

The following results show transaction throughput for a single core when 1, 2, 3, and 4 threads were used. The transactions per minute (TPM) increased when each additional thread was used.

H/W threads   Throughput (TPM)   Diff %
1             32893.67
2             46267.33           +40.7%
3             51181.67           +10.6%
4             53884.00           +5.3%
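The per-thread gains and the overall gain from one to four threads follow directly from the measured TPM values. The short Python sketch below recomputes them from the figures reported above. The function names are illustrative only.

```python
# Throughput in transactions per minute, keyed by hardware threads in use
# (values taken from the single-core results above).
tpm = {1: 32893.67, 2: 46267.33, 3: 51181.67, 4: 53884.00}


def step_gains(results: dict) -> dict:
    """Percentage gain of each thread count over the previous one."""
    threads = sorted(results)
    return {
        cur: round((results[cur] / results[prev] - 1) * 100, 1)
        for prev, cur in zip(threads, threads[1:])
    }


def total_gain(results: dict) -> float:
    """Percentage gain of the highest thread count over the lowest."""
    return round((results[max(results)] / results[min(results)] - 1) * 100, 1)


print(step_gains(tpm))  # {2: 40.7, 3: 10.6, 4: 5.3}
print(total_gain(tpm))  # 63.8
```

The cumulative gain of about 64% is consistent with the approximately 60% increase cited for four threads, and the shrinking step gains show why the fourth thread adds far less than the second.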


Multiple core SMT testing

IBM Informix performed additional tests to measure throughput for SMT on multiple cores. The following graph shows the transaction throughput of 1, 2, 3, and 4 threads on 1 through 8 cores.

Recommendation

If you are most concerned about overall throughput for your Informix server, use SMT4, because using SMT4 can more fully utilize the core. While tests showed a 60% increase in throughput, keep in mind that single-thread response time does not scale linearly as more SMT threads are used. If you want to optimize for response time, you can start with SMT4 for the increased throughput, but if you see single-thread response time suffer, move to SMT2.


Dedicated LPAR vs shared LPAR

Virtualized environments offer many choices for deployment, such as dedicated or non-dedicated processor cores and memory, and micro-partitioning, which uses a fraction of a physical processor core. There are pros and cons to each approach, and it is important to understand how the LPAR is to be used and its expected workload.

Dedicated LPAR

A dedicated LPAR is one that gets a specific set of resources. It will not grab additional resources or release any of its resources. The following graphic shows three dedicated LPARs: LPAR #1 has 2 CPUs, LPAR #2 has 10 CPUs, and LPAR #3 has 1 CPU.

One of the drawbacks of dedicated LPARs is that, if there is an over-allocation of resources, you can have a situation where the CPUs are idle. At the same time, there could be another LPAR that has used all of its resources and could benefit from increased resources. For example, in the following graphic, LPAR #1 might be running at full capacity with CPU utilization near 100%, while LPAR #2 is running at 10% utilization. In that situation, LPAR #1 would benefit from using resources that LPAR #2 is not using.

[Figure: The Power Hypervisor partition diagram with the dedicated partitions highlighted. LPAR #1 (2 CPUs) and LPAR #2 (10 CPUs) are labeled, alongside a third dedicated LPAR (1 CPU) and the micro-partitions in shared processor pools 0 and 1.]


Shared LPAR

A shared LPAR, sometimes referred to as a non-dedicated LPAR, is an LPAR that is assigned a minimum set of resources, and that may use more resources from a shared pool, as needed, if the additional resources are available. This method also has pros and cons.

For example, assume that you defined a set of shared LPARs as shown in the following graphic. If LPAR #1 consumes 100% of its 2.5 CPUs, it can use additional resources from the shared pool to allow increased throughput on that LPAR. However, if LPAR #2 pulls all the available resources from the shared pool, and LPAR #1 becomes 100% consumed, LPAR #1 will not be able to use additional resources, and its performance will suffer.

The question is: Should you use shared LPARs or dedicated LPARs? IBM Informix tests show that a properly configured shared LPAR, in ideal conditions, can perform nearly as well as a dedicated LPAR (see the following graph). However, one of the benefits that you get with a dedicated LPAR is that the LPAR is much easier to configure and monitor, and it will give you consistent results. A shared LPAR has factors that are out of your control, and that might cause variations in throughput results.

[Figure: The Power Hypervisor partition diagram with the micro-partitions highlighted. LPAR #1 (2.5 CPUs) and LPAR #2 are labeled among the micro-partitions in the shared processor pools.]


Capped or uncapped shared LPAR

For a capped shared LPAR, the entitled capacity is the maximum number of cycles that can be used. An example would be creating a capped shared LPAR with an entitled capacity of 8 CPUs. That LPAR will not use more than 8 CPUs. If that LPAR uses only 2, the other 6 CPUs can be used by other uncapped shared LPARs.

If an uncapped shared LPAR that has 8 CPUs entitled consumes all 8 CPUs, it can acquire more resources from the shared pool, and use more than its entitled capacity. It can use up to the number of online virtual processors that are defined for the LPAR.

There are obvious throughput benefits to using an uncapped shared LPAR that can access the shared pool of processors. The LPAR must have enough virtual processors defined to take advantage of the idle processors in the shared pool.
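The capped and uncapped rules above can be expressed as a simple upper bound. The following Python is a simplified sketch under stated assumptions, not a model of the hypervisor scheduler: the function and parameter names are hypothetical, and idle_pool_cpus stands for whatever capacity happens to be free in the shared pool at that moment.

```python
def max_consumable_cpus(mode: str, entitled: float, online_vps: int,
                        idle_pool_cpus: float = 0.0) -> float:
    """Upper bound on the physical CPUs a shared LPAR can consume.

    capped:   never more than the entitled capacity.
    uncapped: entitlement plus idle pool capacity, but never more than
              the number of online virtual processors.
    """
    if mode == "capped":
        return entitled
    if mode == "uncapped":
        return min(float(online_vps), entitled + idle_pool_cpus)
    raise ValueError(f"unknown mode: {mode!r}")


# Capped LPAR: the idle pool capacity is irrelevant.
print(max_consumable_cpus("capped", 8.0, 8, idle_pool_cpus=8.0))     # 8.0
# Uncapped LPAR with 16 virtual processors can absorb 8 idle pool CPUs.
print(max_consumable_cpus("uncapped", 8.0, 16, idle_pool_cpus=8.0))  # 16.0
# Uncapped LPAR with only 8 virtual processors cannot use the idle CPUs.
print(max_consumable_cpus("uncapped", 8.0, 8, idle_pool_cpus=8.0))   # 8.0
```

The last case illustrates why an uncapped LPAR needs more online virtual processors than its entitlement before it can benefit from the pool.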

The following graph shows test results of two shared LPARs. One is a capped shared LPAR with an entitlement of 8 CPUs. The other is an uncapped shared LPAR with an entitlement of 8 CPUs, and 16 virtual processors defined to take advantage of the 8 CPUs that are in the shared processor pool.


If you are using a dedicated LPAR, the simplest way to test a shared LPAR is to change the LPAR mode from dedicated to shared uncapped, and make the number of virtual processors and entitlement capacity equal to the number of CPUs that were allocated to the dedicated LPAR. This test provides the immediate advantage of allowing unused CPU cycles to be used by the shared processor pool.

Virtual processors

Virtual processors are similar to CPUs from an AIX operating system standpoint. That is, a virtual processor is a logical entity that is backed up by physical processor cycles. The number of online virtual processors dictates the absolute maximum CPU consumption that an LPAR can achieve. If an LPAR has an entitlement of 2 CPUs, and you set up 4 virtual processors, the LPAR could consume up to 4 physical processors, in which case it would report a 200% CPU utilization.
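The 200% figure comes from measuring consumption against entitlement rather than against the number of virtual processors. A one-function Python sketch of that calculation follows. The function name is hypothetical.

```python
def utilization_vs_entitlement(consumed_cpus: float, entitled_cpus: float) -> float:
    """CPU utilization expressed as a percentage of entitled capacity."""
    if entitled_cpus <= 0:
        raise ValueError("entitled capacity must be positive")
    return 100.0 * consumed_cpus / entitled_cpus


# Entitlement of 2 CPUs, 4 virtual processors all busy on physical processors.
print(utilization_vs_entitlement(4, 2))  # 200.0
```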

You can use the lparstat command to check the entitled capacity and the number of online virtual CPUs, as well as other parameters for an LPAR. The following examples show three lparstat command outputs. The first output is for a dedicated LPAR with 8 CPUs assigned to it. The second output is for a capped shared LPAR with 8 CPUs entitled, and the third output is for the shared LPAR moved to an uncapped shared LPAR with 8 CPUs entitled and 16 virtual processors.


Use the following command:

# lparstat -i

Dedicated (lparstat -i)

Node Name : v1009h01
Partition Name : v1009h01-b3p019-informix
Partition Number : 4
Type : Dedicated-SMT-4
Mode : Capped
Entitled Capacity : 8.00
Partition Group-ID : 32772
Shared Pool ID : -
Online Virtual CPUs : 8
Maximum Virtual CPUs : 12
Minimum Virtual CPUs : 1
Online Memory : 32768 MB
Maximum Memory : 49152 MB
Minimum Memory : 16384 MB
Variable Capacity Weight : -
Minimum Capacity : 1.00
Maximum Capacity : 12.00
Capacity Increment : 1.00
Maximum Physical CPUs in system : 128
Active Physical CPUs in system : 128
Active CPUs in Pool : -
Shared Physical CPUs in system : 0
Maximum Capacity of Pool : 0
Entitled Capacity of Pool : 0
Unallocated Capacity : -
Physical CPU Percentage : 100.00%
Unallocated Weight : -
Memory Mode : Dedicated

Capped Shared LPAR (lparstat -i)

Node Name : v1009h02
Partition Name : v1009h02-b3p019-informix
Partition Number : 6
Type : Shared-SMT-4
Mode : Capped
Entitled Capacity : 8.00
Partition Group-ID : 32774
Shared Pool ID : 1
Online Virtual CPUs : 8
Maximum Virtual CPUs : 12
Minimum Virtual CPUs : 1
Online Memory : 32768 MB
Maximum Memory : 49152 MB
Minimum Memory : 16384 MB
Variable Capacity Weight : 0
Minimum Capacity : 0.10
Maximum Capacity : 12.00
Capacity Increment : 0.01
Maximum Physical CPUs in system : 128
Active Physical CPUs in system : 128
Active CPUs in Pool : 16
Shared Physical CPUs in system : 32
Maximum Capacity of Pool : 1600
Entitled Capacity of Pool : 1600
Unallocated Capacity : 0.00
Physical CPU Percentage : 100.00%
Unallocated Weight : 0
Memory Mode : Dedicated

Uncapped Shared LPAR (lparstat -i)

Node Name : v1009h02
Partition Name : v1009h02-b3p019-informix
Partition Number : 6
Type : Shared-SMT-4
Mode : Uncapped
Entitled Capacity : 8.00
Partition Group-ID : 32774
Shared Pool ID : 1
Online Virtual CPUs : 16
Maximum Virtual CPUs : 16
Minimum Virtual CPUs : 1
Online Memory : 32768 MB
Maximum Memory : 49152 MB
Minimum Memory : 16384 MB
Variable Capacity Weight : 128
Minimum Capacity : 0.10
Maximum Capacity : 16.00
Capacity Increment : 0.01
Maximum Physical CPUs in system : 128
Active Physical CPUs in system : 128
Active CPUs in Pool : 20
Shared Physical CPUs in system : 41
Maximum Capacity of Pool : 2000
Entitled Capacity of Pool : 1600
Unallocated Capacity : 0.00
Physical CPU Percentage : 50.00%
Unallocated Weight : 0
Memory Mode : Dedicated
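When comparing several partitions, it can help to reduce each listing to the few fields that distinguish dedicated, capped, and uncapped LPARs. The Python below is a minimal sketch that assumes the output has been captured as text with one "Name : Value" pair per line. The function names are hypothetical, and the sample is an excerpt of the uncapped listing above.

```python
def parse_lpar_details(text: str) -> dict:
    """Parse 'Name : Value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields: dict) -> dict:
    """Keep only the fields that distinguish the three LPAR configurations."""
    return {
        "type": fields["Type"].split("-")[0],  # "Dedicated" or "Shared"
        "mode": fields["Mode"],
        "entitled": float(fields["Entitled Capacity"]),
        "online_vps": int(fields["Online Virtual CPUs"]),
    }


sample = """\
Type : Shared-SMT-4
Mode : Uncapped
Entitled Capacity : 8.00
Online Virtual CPUs : 16
Physical CPU Percentage : 50.00%
"""

summary = summarize(parse_lpar_details(sample))
print(summary)
```

For this excerpt the ratio of entitled capacity to online virtual CPUs is 8 / 16, which matches the reported Physical CPU Percentage of 50.00%.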

Processor folding

Processor folding is a method of “turning off” unused virtual processors so that they are not scheduled to run and consume CPU cycles. If an LPAR has 8 CPUs entitled and 10 virtual processors, but the LPAR only requires 2.5 CPUs for the current workload, it will run on just 3 virtual processors. The other 7 virtual processors are “folded” away, and when the workload dictates, the virtual processors are used again (“unfolded”).

To determine if processor folding is enabled, use the schedo command, and look for the current setting of the vpm_fold_policy parameter.
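The folding example reduces to rounding the CPU demand up to a whole number of virtual processors. The Python below is an illustrative sketch of that arithmetic only. The function name is hypothetical, and the real AIX folding decision also depends on tunables and on utilization sampled over time.

```python
import math


def folded_state(online_vps: int, cpu_demand: float) -> tuple:
    """Return (active, folded) virtual processor counts for a given CPU demand."""
    # At least one virtual processor stays active; never more than are online.
    active = min(online_vps, max(1, math.ceil(cpu_demand)))
    return active, online_vps - active


# 10 virtual processors, workload needs 2.5 CPUs.
print(folded_state(10, 2.5))  # (3, 7)
```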

Page 19: IBM Informix onPOWER7 - IBM - United States · IBM Informix on POWER7 Best Practices 6 Database workload Before we address best practices for Informix on the POWER7 architecture,

IBM Informix on POWER7 Best Practices 19

schedo -L

NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
     DEPENDENCIES
. . . .
--------------------------------------------------------------------------------
vpm_fold_policy           1      1      1      0      15                    D
--------------------------------------------------------------------------------
vpm_xvcpus                0      0      0      -1     2G-1   processors     D
--------------------------------------------------------------------------------

We see in the output that the vpm_fold_policy current value is set to 1. This is a bitmask setting, so a value of 1 indicates that folding is enabled if the LPAR is using shared processors. See the AIX documentation for all possible settings for this parameter.

To disable processor folding regardless of the type of LPAR or power saving mode, set the vpm_fold_policy to 4 as shown in the following example.

Example: schedo -p -o vpm_fold_policy=4

The vpm_xvcpus parameter determines the number of extra virtual processors to unfold when the system determines it needs to unfold a processor. For example, if vpm_xvcpus is set to 3 and the operating system needs to unfold a processor, it unfolds 4 virtual processors: the one that is needed plus 3 extra.

Example: schedo -p -o vpm_xvcpus=3

Recommendation

If the workload requires consistent performance with stringent latency requirements, then such a workload is best deployed on dedicated partitions rather than on a shared LPAR. In IBM Informix tests, a dedicated LPAR provided the most consistent performance.

The exception to using a dedicated LPAR would be when a shared processor pool is neither over-committed nor over-utilized.

Use processor folding for more efficient use of the cores. However, disable folding if you see a problem with processor folding, or if you see an excessive amount of folding. Processor folding is dynamically configurable, so test it in peak-load and low-load scenarios. If you choose to use processor folding, set the vpm_xvcpus parameter to 3. That setting helps avoid any penalties from unfolding one virtual processor at a time.


Virtual I/O Server (VIOS) LPAR

A Virtual I/O Server (VIOS) is a special LPAR that has additional software installed for the purpose of managing the I/O for other LPARs. Instead of the individual network and disk resources being carved out on an LPAR-by-LPAR basis, the VIOS manages the disk and network resources on behalf of the other LPARs. The size of the VIOS is important.

Recommendation

The VIOS must be a dedicated LPAR, and it is recommended that you disable processor folding for the VIOS. The size of the VIOS is important. Refer to AIX documentation for proper configuration requirements. There is also a VIOS advisor that you can use, which can provide recommendations for your VIOS configuration.

http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor


Additional LPAR recommendations

Shared resource LPARs are very dynamic in nature. There are various performance tools that can be used to help improve allocation and placement of resources on the physical machine and within the LPARs.

The Dynamic Platform Optimizer (DPO) is a PowerVM feature that you can use to improve partition memory and processor affinity across the logical partitions in a Power Server. DPO is a feature that can help you reap performance gains for the IBM Informix server.

Active System Optimizer (ASO) is a subsystem that is designed to automatically improve the performance of AIX workloads running on POWER7. Dynamic System Optimizer (DSO) is built on the ASO framework and provides additional optimizations.

Recommendation

We recommend that the System Administrator work with AIX and use DPO and ASO/DSO to optimize workloads for Informix. See the following IBM Redbooks® publication, IBM PowerVM Virtualization Managing and Monitoring, for details.

http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html


Memory considerations

Using larger virtual memory page sizes for an application’s memory can significantly improve an application’s performance and throughput. The improvement in system performance stems from the reduction of Translation Lookaside Buffer (TLB) misses due to the ability of the TLB to map to a larger virtual memory range. Starting with the POWER4 processor, support for 16 MB “large” pages was introduced in addition to the default 4 KB pages. To use large pages on hardware where multiple page sizes are supported, run AIX 5L™ Version 5.3 updated with the 5300-04 Maintenance Package (or later).

Starting with version 11.50.xC4, Informix supports 16 MB pages. AIX does not automatically configure “large” pages in the environment. The system administrator must configure AIX to use these page sizes, and must specify the number of pages to be reserved. The number of configured large pages will not be automatically changed by the operating system based on demand.

We will also look at the 64 KB pages, which are dynamically allocated by the operating system on an as-needed basis, making them simpler to use than the 16 MB large page size. (Starting with POWER5+™ hardware, “huge” 16 GB pages are also supported.)

IBM Informix performed tests to compare the results of 4 KB page sizes, 64 KB page sizes, and 16 MB page sizes. The results of these tests are discussed later in this section.
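To make the TLB argument concrete, the following Python sketch computes how much memory a TLB can map ("TLB reach") at each page size. The entry count of 512 is a hypothetical value chosen only for illustration; it is not a POWER7 specification, and the function name is ours.

```python
# Page sizes discussed in this section, in bytes.
PAGE_SIZES = {
    "4 KB": 4 * 1024,
    "64 KB": 64 * 1024,
    "16 MB": 16 * 1024 * 1024,
}

# Hypothetical TLB entry count, for illustration only (not a POWER7 figure).
TLB_ENTRIES = 512


def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Memory that can be mapped without a TLB miss: one page per TLB entry."""
    return entries * page_size


for label, size in PAGE_SIZES.items():
    reach_mb = tlb_reach_bytes(TLB_ENTRIES, size) / (1024 * 1024)
    print(f"{label:>5} pages: TLB reach = {reach_mb:g} MB")
```

Whatever the real entry count is, the reach grows in proportion to the page size, which is why larger pages reduce TLB misses for a database server with a large shared memory footprint.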

RESIDENT parameter

The RESIDENT parameter in the Informix configuration file ($ONCONFIG) needs to be considered with respect to memory considerations on AIX. For reference, here are the comments from the onconfig.std file.

###################################################################
# Shared Memory Configuration Parameters
###################################################################
# RESIDENT - Controls whether shared memory is resident.
#            Acceptable values are:
#            0 off (default)
#            1 lock the resident segment only
#            n lock the resident segment and the next n-1
#              virtual segments, where n < 100
#            -1 lock all resident and virtual segments

On AIX systems with a lot of allocated pinned “resident” memory, when Informix uses kernel asynchronous I/O (KAIO) or direct I/O, Informix might experience KAIO read or write failures with errno 22 (EINVAL).


Example:

04:30:40 KAIO: error in kaio_WRITE, kaiocbp = 0x22b620d0, errno = 22
04:30:40 fildes = 258 (gfd 3), buf = 0x700000122b64000, nbytes = 4096, offset = 130785280

Setting the RESIDENT configuration parameter to -1 is not recommended on AIX. As Informix allocates more memory, the database server attempts to pin that memory, and that results in a higher likelihood of seeing an error. With Informix 11.70.FC6 and later, a warning message is displayed in the database server message log (online.log) if resident memory is used with KAIO or direct I/O.

4 KB memory page size

The default page size on AIX is 4 KB. The results of the IBM Informix tests with 64 KB and 16 MB page sizes are compared against that default.

16 MB memory page size

Before Informix can start to use large pages, the pages must be allocated by the System Administrator. The following command example allocates 3072 large pages.

vmo -p -o lgpg_regions=3072 -o lgpg_size=16777216
vmo -p -o v_pinshm=1

This command can take a while to process. Monitor the number of allocated 16 MB pages with the vmstat command. In the following output, there are 0 (avm) 16 MB pages active.

vmstat -P ALL 5

System configuration: mem=513536MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre   re   pi   po   fr   sr   cy
   4K 54844752  9497936   799539    0    0    0    0    0    0
  64K   594475   526063    68412    0    0    0    0    0    0
  16M    16384        0    16384    0    0    0    0    0    0

   4K 54844752  9497948   789981    0    0    0    0    0    0
  64K   594475   526062    68413    0    0    0    0    0    0
  16M    16384        0    16384    0    0    0    0    0    0
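When sizing the large page pool, the lgpg_regions value follows directly from the amount of memory to be backed by 16 MB pages. The Python sketch below shows that arithmetic; the 48 GB target is an assumption chosen because it reproduces the 3072 regions used in the vmo example, and the function name is ours.

```python
LARGE_PAGE_BYTES = 16 * 1024 * 1024  # 16 MB, the lgpg_size value (16777216)


def large_page_regions(target_bytes: int) -> int:
    """Number of 16 MB pages needed to cover target_bytes, rounded up."""
    return -(-target_bytes // LARGE_PAGE_BYTES)  # integer ceiling division


# Assumed target: 48 GB of memory to be backed by large pages.
regions = large_page_regions(48 * 1024 ** 3)
print(f"vmo -p -o lgpg_regions={regions} -o lgpg_size={LARGE_PAGE_BYTES}")
```

Because the operating system does not grow or shrink this pool on demand, it is worth comparing the computed value with what Informix actually needs before reserving the memory.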

After the 16 MB pages are allocated, you must bring the Informix server offline, set the IFX_LARGE_PAGES environment variable, and then bring the instance back online.

export IFX_LARGE_PAGES=1


The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with 16 MB AIX page size. The 4 KB test resulted in 255,491 TPM. The 16 MB test resulted in 270,254 TPM. The use of 16 MB page size produced a 5.77% gain in performance.

64 KB memory page size

Starting with POWER5+ hardware, there is also support for 64 KB page sizes. The 64 KB pages are dynamically allocated by the operating system on an as-needed basis, making the use of 64 KB pages simpler because no pre-allocation has to occur.

Take the following steps to enable 64 KB page sizes for Informix.

1. Bring the Informix instance offline.

2. Set the LDR_CNTRL environment variable.

export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K

3. Bring the Informix instance online.

4. Unset the LDR_CNTRL environment variable.

unset LDR_CNTRL

The reason for unsetting the LDR_CNTRL environment variable is to avoid the unintended use of 64 KB pages for applications that might start from the same terminal.
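An alternative to setting and then unsetting the variable in the shell is to give LDR_CNTRL only to the process that starts the server. The Python sketch below builds the value and a private environment for such a child process. It is a sketch under stated assumptions: the helper names are ours, and the launch step is shown only as a comment because how oninit is started depends on the site.

```python
import os


def ldr_cntrl_value(page_size: str = "64K") -> str:
    """Build the LDR_CNTRL value requesting one page size for all four segment types."""
    segments = ("DATAPSIZE", "TEXTPSIZE", "STACKPSIZE", "SHMPSIZE")
    return "@".join(f"{name}={page_size}" for name in segments)


def child_environment(base_env=None) -> dict:
    """Copy an environment and add LDR_CNTRL to the copy only.

    The calling shell's environment is not modified, so other programs
    started from the same terminal do not inherit the setting.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LDR_CNTRL"] = ldr_cntrl_value()
    return env


env = child_environment()
print(env["LDR_CNTRL"])
# A launcher script could then pass this environment to the start command, e.g.:
#   subprocess.run(["oninit"], env=env, check=True)
```

The design choice here is scoping: the variable exists only in the environment handed to the server process, which achieves the same goal as the unset step.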


The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with 64 KB AIX page size. The 4 KB test resulted in 255,491 TPM; the 64 KB test resulted in 271,052 TPM. The use of 64 KB page size produced a 6.09% gain in performance.

The 64 KB test results show a better performance gain than the test results for 16 MB large pages. The benefits of the 16 MB large pages can become more evident as the size of the database and memory usage grows for that database.
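The gains quoted for the two page sizes are simple relative differences against the 4 KB baseline. The following Python sketch repeats that calculation with the TPM figures reported above; the function name is ours.

```python
def percent_gain(baseline_tpm: float, test_tpm: float) -> float:
    """Relative throughput gain of a test run over the baseline, in percent."""
    return (test_tpm - baseline_tpm) / baseline_tpm * 100.0


BASELINE_4K_TPM = 255_491  # 4 KB page size result from the tests above

for label, tpm in (("16 MB", 270_254), ("64 KB", 271_052)):
    print(f"{label} pages: {percent_gain(BASELINE_4K_TPM, tpm):.1f}% gain over 4 KB")
```

The same function can be applied to your own before and after measurements when you test page sizes in your environment.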

Recommendation

IBM Informix recommends using 64 KB page sizes. The simplicity of use, the dynamic nature, and results that are similar to that of 16 MB large pages drive this recommendation. In a very large database the larger 16 MB page sizes might produce better performance gains, but this needs to be tested on an individual basis.

When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Set it to 0. Setting it to 1 or 2 might also be in order, but the System Administrator will need to monitor the pinned memory to make sure that it does not exceed 80% of the physical memory on the computer or LPAR.


Feedback Directed Program Restructuring (FDPR) Tool

The FDPR® tool for AIX is included with AIX 5L operating system V5.2 and later. FDPR is used as a post-link utility for improving the performance of binaries that were compiled on the Power family platform. It optimizes the binary to achieve a better hit/miss i-cache ratio, reduces the number of branches, and reduces TLB misses and page faults. This tool is useful for very large programs or those that use dynamically linked libraries.

From the man page:

The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning utility that may help improve the execution time and the real memory utilization of user-level application programs. The fdpr program optimizes the executable image of a program by collecting information on the behavior of the program while the program is used for some typical workload, and then creating a new version of the program that is optimized for that workload. The new program generated by fdpr typically runs faster and uses less real memory. Attention: The fdpr command applies advanced optimization techniques to a program which may result in programs that do not behave as expected; programs which are optimized using this tool should be used with due caution and should be rigorously retested with, at a minimum, the same test suite used to test the original program in order to verify expected functionality. The optimized program is not supported.

The following steps outline how to determine if the FDPR tool can optimize the oninit executable program that runs Informix. These steps do not describe the complete optimization process; they only outline steps that you can use to test the optimization.

1. Create a script to set the correct environment, run the oninit command, and run the workload. FDPR expects to find the executable not running and in fact replaces it with an instrumented version before startup. Because oninit is a setuid executable, and the SUID information is not in the replacement executable, you must set the SUID and correct owner mask before the script starts oninit:

chown root:informix ${INFORMIXDIR}/bin/oninit
chmod 6755 ${INFORMIXDIR}/bin/oninit

2. Make sure that Informix is fully configured for the benchmark, and that the data is already loaded. The script should execute only the workload portion of the benchmark.

3. Keep in mind that immediately after Informix starts, no data is cached, and so I/O activity is higher than normal. Make sure that the run time is adjusted so that FDPR can see the product performing for a considerable amount of time after the cache is warmed up.

[Figure: Using FDPR. Throughput (tpm) of the baseline and FDPR-optimized oninit binaries, on an axis from 0 to 325,000 tpm.]

4. Run FDPR in the same directory where the oninit executable file resides, or else FDPR could experience problems renaming or re-linking the executable. If the workload script uses external drivers to run the benchmark, make sure that the drivers are in the execution path, and that all output goes to the specified location.

Next, run the following command, where "/tpcc/fdpr-workload" is the script that loads the RDBMS:

( cd $INFORMIXDIR/bin ; timex fdpr -p oninit -x /tpcc/fdpr-workload )

5. Some versions of FDPR might lose information about the optimization level that is used during product build. The workaround for this is simple: Pass the optimization level to FDPR on the command line (see man page for details).

Below are the results of our benchmark comparing the original Informix (baseline) with the results obtained after the oninit binary had been optimized by FDPR. The following steps were taken. Points were selected from two sets (baseline, optimized) where throughput was measured as a function of the number of user terminals. Both the baseline set and the one for the FDPR-optimized binary were collected in a configuration that used 64 KB pages. Maximum throughput in both sets was achieved with 32 active terminals.

The following data is from IBM Informix test results using 64 KB page size and comparing the non-optimized binary with the FDPR-optimized binary. The non-optimized binary produced 270,526 TPM and the FDPR-optimized binary produced throughput of over 300,000 TPM. This amounted to a 13% increase in performance.


Recommendation

IBM Informix recommends testing FDPR in a non-production environment using a production load. Using FDPR can reap performance gains, but the tool should be tested thoroughly before being used in a production environment.

For more information, see the following two documents: Feedback Directed Program Restructuring (FDPR) and AIX 5L Performance Tools Handbook (Redbook).

https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/

http://www.redbooks.ibm.com/abstracts/sg246039.html


I/O subsystem

The I/O subsystem is a key factor in a well-performing database server. A properly configured I/O subsystem will allow maximum I/O throughput by the database server. A poorly configured I/O subsystem can have major negative impacts on the database server, and it is not always obvious where the problem resides in a poorly performing database server.

Read/Write access times

A good read-and-write access time is determined in part by the technology of the storage that is being used. A typical I/O should take from 0 milliseconds (ms) to 15 ms. I/Os that take longer than 15 ms might indicate a problem with the I/O system or a busy device. When a solid-state drive (SSD) is used, the I/O will typically be less than 2-3 ms. Again, I/Os taking longer might indicate a problem or a busy device. SSD might even produce results less than 1 ms, moving into the microsecond range.
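These rules of thumb can be turned into a simple check for monitoring scripts. The Python sketch below applies the 15 ms guideline for conventional storage and the upper end of the 2-3 ms guideline for SSD; the function name, the return strings, and the sample values are illustrative assumptions.

```python
def io_latency_status(service_ms: float, ssd: bool = False) -> str:
    """Classify an average I/O service time against the guidelines in the text."""
    # 15 ms for conventional storage; 3 ms (upper end of "2-3 ms") for SSD.
    limit_ms = 3.0 if ssd else 15.0
    return "ok" if service_ms <= limit_ms else "investigate"


# Illustrative average service times in milliseconds.
for ms, ssd in ((9.8, False), (20.0, False), (0.8, True), (5.0, True)):
    kind = "SSD" if ssd else "disk"
    print(f"{kind} {ms} ms -> {io_latency_status(ms, ssd)}")
```

A result of "investigate" does not prove a fault; as noted above, it might simply indicate a busy device.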

You can monitor the access times from Informix or at the operating system level.

In Informix, you would review the onstat -g iof data and the onstat -g ioh data. The onstat -g iof data will show the chunks and a summary of the response times since the instance has been online. In the following output, we see the average read service time for KAIO is 9.8 ms.

onstat -g iof

IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 23:09:01 -- 590776 Kbytes

AIO global files:
gfd pathname     bytes read   page reads   bytes write   page writes   io/s
3   rootdbs.1    6187008      3021         4007526400    1956800       823.1

        op type       count     avg. time
        seeks         0         N/A
        reads         0         N/A
        writes        0         N/A
        kaio_reads    2360      0.0098
        kaio_writes   212009    0.0011
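For scripted monitoring, the kaio lines of the onstat -g iof output above can be converted to milliseconds. The Python sketch below parses a copy of those two lines; it assumes that the avg. time column is in seconds (consistent with 0.0098 being read as 9.8 ms) and that each line has exactly three whitespace-separated fields, and the function name is ours.

```python
# The two kaio lines from the sample output, copied as text.
SAMPLE = """\
kaio_reads 2360 0.0098
kaio_writes 212009 0.0011
"""


def avg_times_ms(text: str) -> dict:
    """Map each kaio operation type to its average service time in milliseconds."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: op type, count, avg. time (assumed to be in seconds).
        if len(fields) == 3 and fields[0].startswith("kaio_"):
            result[fields[0]] = float(fields[2]) * 1000.0
    return result


for op, ms in avg_times_ms(SAMPLE).items():
    print(f"{op}: {ms:.1f} ms")
```

In practice the text would come from running the onstat command; the hard-coded sample keeps the sketch self-contained.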

Informix also provides this output with a historical viewpoint going back one hour. This output is a better way to monitor I/O because it summarizes the data for the past hour on a per-minute basis.

onstat -g ioh

IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 00:03:56 -- 525240 Kbytes

AIO global files:
gfd pathname     bytes read   page reads   bytes write   page writes   io/s
3   rootdbs.1    1073152      524          386072576     188512        457.1

                  avg read                    avg write
time      reads   io/s   op time     writes   io/s   op time
13:54:14  6       0.1    0.02417     1772     29.5   0.00185
13:53:14  47      0.8    0.08274     2010     33.5   0.00141
13:52:14  216     3.6    0.00214     378      6.3    0.00097

From the operating system standpoint, you can use the iostat command to monitor I/O throughput. The following iostat command will take two samples 5 seconds apart.

Example: iostat -D 5 2

hdisk79   xfer:  %tm_act    bps      tps      bread    bwrtn
                 13.3       1.6K     0.4      1.6K     0.0
          read:  rps        avgserv  minserv  maxserv  timeouts  fails
                 0.4        331.6    27.0     836.7    0         0
          write: wps        avgserv  minserv  maxserv  timeouts  fails
                 0.0        0.0      0.0      0.0      0         0
          queue: avgtime    mintime  maxtime  avgwqsz  avgsqsz   sqfull
                 0.0        0.0      0.0      0.0      0.0       0.0

KAIO/DIRECT_IO

Kernel Asynchronous I/O (KAIO) is enabled by default and will be used for raw disk space. It provides performance gains over regular I/O. AIX also supports direct I/O and concurrent I/O for file system access. Informix supports those types of file system access with the DIRECT_IO configuration parameter.

AIX only supports concurrent I/O on JFS2 file systems. Direct I/O is similar to using KAIO for a file system. Concurrent I/O adds functionality by avoiding unnecessary write serialization. For reference, here are the comments for the DIRECT_IO configuration parameter from the onconfig.std file.

# DIRECT_IO - Specifies whether direct I/O is used for cooked
#             files used for dbspace chunks.
#             Acceptable values are:
#             0 Disable
#             1 Enable direct I/O
#             2 Enable concurrent I/O

To determine what type of I/O is being used, review the following output. To check on basic KAIO, run the onstat -g ath command, and look for kaio threads. For example, the following output shows 1 kaio thread.

onstat -g ath|grep kaio

18 60f0a360 0 3 IO Idle 1cpu* kaio


To check for direct I/O or concurrent I/O, run the onstat -d command and look at the flags listed in column 5 for each chunk. A value of ‘D’ represents direct I/O and ‘C’ represents concurrent I/O.

onstat -d

IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 22:55:45 -- 590776 Kbytes

….

Chunks

address   chunk/dbs  offset  size     free     bpages  flags   pathname
5ff081d0  1     1    0       5000000  4928731          PO-B-D  /chunks/IDSPERF/rootdbs.1

To disable KAIO completely, you can use the KAIOOFF environment variable. Prior to bringing the Informix instance online, set KAIOOFF to 1.

export KAIOOFF=1

Queue depth

From an application standpoint (database server), the length of time to do an I/O equals the time to service the I/O plus the time that the I/O waits in the hard disk (hdisk) wait queue. Each hdisk has an associated queue depth setting and, if this setting is poorly configured, it can have negative impacts on I/O throughput. Use the lsattr command to check the current setting for a device.

lsattr -El hdisk6 |grep queue

queue_depth 16 Queue DEPTH True

The faster the drive, the more I/O operations per second (IOPS) that a disk can handle. The maximum throughput will be limited by the queue depth divided by the average I/O service time. For example, a queue depth of 3 and an average I/O service time of 10 ms yield a maximum throughput of 300 IOPS.
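The relationship between queue depth, service time, and maximum throughput can be checked with a short Python sketch. The function name is ours; the first call reproduces the example above, and the second uses a queue depth of 16 with an assumed 5 ms service time.

```python
def max_iops(queue_depth: int, avg_service_ms: float) -> float:
    """Upper bound on IOPS for one device: queue depth / average service time.

    The service time is given in milliseconds, so multiply by 1000 to get
    operations per second.
    """
    return queue_depth * 1000.0 / avg_service_ms


print(max_iops(3, 10))    # the example from the text
print(max_iops(16, 5))    # queue depth 16, assumed 5 ms service time
```

The bound shows why a faster device with a shallow queue can still be throttled: halving the service time or doubling the queue depth each doubles the ceiling.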

You can use the iostat -D command to monitor the service times as well as the queue times. If you start to see time spent waiting in the queue, you might want to increase the queue depth for a specific device.

In the following iostat output, we see that we are spending an average of 3 ms in the queue, and had a max of 9 ms wait time in the queue.


iostat -D 2 2

System configuration: lcpu=128 drives=9 paths=8 vdisks=0

hdisk6    xfer:  %tm_act    bps      tps      bread    bwrtn
                 63.5       5.0M     236.5    0.0      5.0M
          read:  rps        avgserv  minserv  maxserv  timeouts  fails
                 0.0        8.0      0.0      18.0     0         0
          write: wps        avgserv  minserv  maxserv  timeouts  fails
                 236.5      5.3      1.1      71.0     0         0
          queue: avgtime    mintime  maxtime  avgwqsz  avgsqsz   sqfull
                 3.0        0.0      9.0      0.0      1.0       12.0

To change the queue depth, use the chdev command.

Example: chdev -l hdisk66 -a queue_depth=32

For more information regarding queue depth, monitoring, and configuration see the following article: AIX Disk Queue Depth Tuning for Performance.

https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105745

AIO servers

AIO is an AIX software subsystem that allows processes to issue I/O operations without waiting for I/O to complete. This feature is particularly important in a database environment.

A kernel process (kproc), called an AIO server (AIOS), is in charge of each request from the time that the request is taken off the queue until it completes. The number of servers limits the number of disk I/O operations that can be in progress in the system simultaneously. The default value of minservers is 3 and maxservers is 30. When more than 3 servers are needed, they will automatically be allocated, up to the maxservers value.

There is also an aio_server_inactivity tunable parameter that indicates the duration of inactivity before the inactive AIO servers are stopped. The stopping of these AIO servers can occur down to the minservers value. The default value for aio_server_inactivity is 300 seconds.

To view the current settings use the ioo command.

# ioo -a |grep aio_

aio_maxreqs = 8192
aio_maxservers = 30
aio_minservers = 3
aio_server_inactivity = 300

These defaults with AIX 6.1 can cause slow I/O, and the issue might not be easily identifiable. With these low defaults, a situation can occur where more than 3 AIO servers are needed to process a peak load of I/O activity. For example, during low I/O activity, the AIO servers can be taken down to the minimum of 3. But when a checkpoint occurs, more AIO servers are required. This situation can generate extra overhead each time the peak load of I/O activity occurs, because starting the AIO servers incurs an extra cost.

To change these parameters use the ioo command.

Example:

ioo -p -o aio_maxreqs=65536 -o aio_minservers=100 -o aio_server_inactivity=86400

Recommendation

IBM Informix recommends the use of KAIO for raw devices. For file system chunks, set the DIRECT_IO configuration parameter to 2. This setting permits direct I/O on file system chunks and allows for concurrent I/O on JFS2 file systems.

Informix also recommends monitoring the queue depth, with a minimum setting of 16. If monitoring shows wait times in the queue, increase the queue depth accordingly.

Set the aio_minservers and aio_maxservers parameters to 100. Set the aio_server_inactivity parameter to 86400, which represents a 24-hour period of inactivity before any extra servers are taken down. Also, set the aio_maxreqs parameter to 65536.


Network subsystem

TCP traffic

The network layer can have performance implications based on the amount of data that needs to be sent across the network and whether the network is poorly configured. If you suspect that a slow network is the cause of a performance problem, you can use the following methods to test network throughput.

Run the slow-performing test or SQL locally, and compare the results against the same test or SQL that is run remotely, where the result set has to be returned over the network.

Use scp or ftp commands to test throughput speed by sending a large file and monitoring its throughput. If the network is not producing the amount of throughput that is expected, make sure that the TCP window size is configured properly.

The following TCP parameters can affect network performance: tcp_recvspace, tcp_sendspace, and rfc1323. The tcp_recvspace parameter specifies the number of bytes that the receiving system can buffer in the kernel. The tcp_sendspace parameter specifies the number of bytes that the sending system can buffer in the kernel. The rfc1323 parameter enables the TCP window scaling option.
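The reason the window size matters is that a single TCP connection cannot have more than one window of unacknowledged data in flight, so its throughput is bounded by the window size divided by the round-trip time. The Python sketch below shows that bound. The round-trip time of 1 ms is an assumed value for illustration, and the function name is ours; 65,535 bytes is the largest window possible without the window scaling option.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection throughput in Mbit/s: window / RTT."""
    bits_per_second = window_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000


RTT_MS = 1.0  # assumed round-trip time; measure your own network

# 65,535 bytes is the maximum window without window scaling (rfc1323 off);
# 262,144 bytes is a 256 KB window, which requires window scaling.
for window in (65_535, 262_144):
    print(f"window={window} bytes -> at most "
          f"{max_tcp_throughput_mbps(window, RTT_MS):.1f} Mbit/s")
```

On a fast link with a non-trivial round-trip time, the unscaled window can become the bottleneck well before the link is saturated.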

Local loopback

The fastpath loopback option is used to achieve better performance for loopback traffic. The tcp_fastlo network parameter lets TCP loopback traffic take a shortcut instead of traversing the entire TCP/IP stack, which improves performance.

To display the current setting for tcp_fastlo:

no -a |grep tcp_fastlo

The tcp_fastlo parameter is disabled by default (value of 0). To set the parameter, use the no command. The -p option applies the changes to both current and reboot values.

no -p -o tcp_fastlo=1

IBM Informix tests have shown an increase from ~350,000 TPM to ~520,000 TPM for local loopback testing. This is an increase of ~50%.


Recommendation

IBM Informix recommends contacting AIX Software Support to discuss any throughput issues with the network. The following changes might increase throughput but should not be tried before discussing them with AIX Software Support.

If network throughput becomes an issue, consider increasing the TCP window size to 256 KB with the following commands. Consult AIX Software Support to discuss these changes.

ifconfig en11 rfc1323 1 tcp_nodelay 1 tcp_sendspace 262144 tcp_recvspace 262144
chdev -l en11 -a rfc1323=1 -a tcp_nodelay=1 -a tcp_sendspace=262144 -a tcp_recvspace=262144 -P

If using a local loopback connection, enable tcp_fastlo for performance improvements in the TCP loopback traffic.

no -p -o tcp_fastlo=1


Number of CPU virtual processors

The question of how many CPU virtual processors (VPs) should be configured has been an interesting problem on POWER7, specifically when SMT2 or SMT4 is used. The important thing to understand when sizing the number of CPU VPs on a system is that turning on SMT2 or SMT4 does not give you 2x or 4x CPU power, respectively. Configuring this number properly with Informix is important, and the number depends on the type of work that the Informix database will do. Is the workload more CPU intensive, or is the workload more I/O intensive? These are some of the questions that need to be answered to properly size the system.

CPU-intensive workload

If the Informix server is performing a more CPU-intensive workload, set the number of CPU VPs to 1.5x the number of physical CPUs allocated to the LPAR. On a POWER7 LPAR with 32 cores and SMT4 enabled (128 logical CPUs), a good starting point is 48 CPU VPs. Use the VPCLASS configuration parameter to specify the number of CPU VPs that Informix will use when first bringing the database server online. For reference, here are the comments for VPCLASS from the onconfig.std file.

# VPCLASS cpu - Configures the CPU VPs. The format is:
#   VPCLASS cpu, num=<number of CPU VPs>,

VPCLASS cpu,num=48,noage

Monitor the Informix engine to make sure that all 48 CPU VPs are needed, and decrease the number if necessary. To change the number of CPU VPs, you update the value of the VPCLASS configuration parameter, and the value takes effect after you stop and then start the Informix server. Use the onstat -g glo command to determine if there are CPU VPs that are not being used. In the following example, some of the latter CPU VPs have only clocked 5-6 minutes of CPU time over nearly 58 days of being online. In that case, consider decreasing the number of CPU VPs.

IBM Informix Dynamic Server Version 11.70.FC2 -- On-Line -- Up 57 days 23:42:02 --28512384 Kbytes

……

Virtual processor summary:class vps usercpu syscpu totalcpu 72 20609362.66 1415471.47 22024834.13

Page 37: IBM Informix onPOWER7 - IBM - United States · IBM Informix on POWER7 Best Practices 6 Database workload Before we address best practices for Informix on the POWER7 architecture,

IBM Informix on POWER7 Best Practices 37

Individual virtual processors:vp pid class usercpu syscpu total Thread Eff1 9044394 cpu 1318982.55 126036.12 1445018.67 3471009.49 41%2 4980998 adm 2021.36 1038.05 3059.41 0.00 0%

……

19 18808928 cpu 1513914.43 106858.67 1620773.10 3952966.80 41%
20 6750704 cpu 1479419.71 106683.61 1586103.32 3858124.02 41%

……

87 16515902 cpu 273.03 93.09 366.12 4967.90 7%

88 11075584 cpu 251.17 91.13 342.30 4876.44 7%
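The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `underused_cpu_vps` and the 1% threshold are assumptions made for the example, not anything Informix provides, and the parser expects rows in the column order shown in the sample output (vp, pid, class, usercpu, syscpu, total, Thread, Eff).

```python
def underused_cpu_vps(rows, threshold_fraction=0.01):
    """Return the numbers of CPU VPs whose total CPU time is tiny
    compared with the busiest CPU VP.

    rows: lines copied from the "Individual virtual processors" part of
    onstat -g glo. threshold_fraction is a hypothetical cut-off chosen
    for this sketch, not an Informix recommendation.
    """
    cpu_rows = []
    for line in rows:
        fields = line.split()
        # Keep only rows of class "cpu"; field 5 is the "total" column.
        if len(fields) >= 6 and fields[2] == "cpu":
            cpu_rows.append((int(fields[0]), float(fields[5])))
    if not cpu_rows:
        return []
    busiest = max(total for _, total in cpu_rows)
    return [vp for vp, total in cpu_rows if total < threshold_fraction * busiest]


sample = [
    "1 9044394 cpu 1318982.55 126036.12 1445018.67 3471009.49 41%",
    "2 4980998 adm 2021.36 1038.05 3059.41 0.00 0%",
    "19 18808928 cpu 1513914.43 106858.67 1620773.10 3952966.80 41%",
    "87 16515902 cpu 273.03 93.09 366.12 4967.90 7%",
    "88 11075584 cpu 251.17 91.13 342.30 4876.44 7%",
]
print(underused_cpu_vps(sample))  # [87, 88]
```

With the sample rows, only VPs 87 and 88 fall below the cut-off, which matches the observation that they have accumulated only a few minutes of CPU time.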

If, however, you determine that the Informix server is constantly seeing threads in the ready queue waiting to run, consider increasing the number of CPU VPs.

onstat -g rea

IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 3 days 01:55:42 -- 558008 Kbytes

Ready threads:
tid tcb rstcb prty status vp-class name
194655974 700000abab6b028 700000b7e0d9ae0 1 ready 32cpu sqlexec
195234123 700000a71e78568 700000af4e67920 1 ready 36cpu sqlexec
195317254 700000b2516e028 700000bd5045b70 1 ready 31cpu srvinfx
195372610 700000ac7f422a0 700000af4e90bb0 1 ready 34cpu sqlexec
195425354 700000aadeb4d20 700000a9015b8a0 1 ready 1cpu srvinfx
195426222 700000b53cae0d0 700000b7e0d3a80 1 ready 32cpu scan_3.0

Adding CPU VPs can be done dynamically with the onmode command. The following command adds 5 CPU VPs.

onmode -p +5 cpu

Keep in mind that threads 2-4 in SMT do not scale linearly, so although the total throughput will increase as threads 2-4 are used, single-thread response time might suffer. For this reason, when increasing the number of CPU VPs, test to find the best setting for your specific environment.

I/O-intensive workload

If the Informix server is performing a workload heavy on I/O, overload the CPU VPs a bit more. In the previous example with 32 cores (SMT4), a good place to start is 3x the number of physical cores, or 96 CPU VPs.

As described earlier, monitor the CPU clock time to determine if the VPs are over configured, and monitor the ready queue to see if more VPs are needed. Also, monitor the user threads, and check to see if there are a lot of threads consistently waiting on I/O.

onstat -g ath | grep "IO Wait"

194806496 700000aa4534568 700000af4e751f8 1 IO Wait 48cpu sqlexec
194812270 700000af6fe3c68 700000b88f07108 1 IO Wait 49cpu sqlexec
195228271 700000b21d4eb18 700000a9016d1b8 1 IO Wait 24cpu sqlexec
195234152 700000ac13eab20 700000b88f17a10 1 IO Wait 26cpu sqlexec
195288699 700000abfe7cb10 700000b7e0d3a80 1 IO Wait 39cpu sqlexec
195293668 700000adf34d930 700000b848189d0 1 IO Wait 37cpu sqlexec

If this is a consistent characteristic, increasing the number of CPU VPs might help performance throughput.
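To judge whether waiting threads are a consistent characteristic, it can help to tally thread states from repeated samples. The following Python sketch counts states in lines laid out like the sample output above (tid, tcb, rstcb, prty, status, vp-class, name). The function name `count_thread_states` is hypothetical, and the parser assumes the status is everything between the priority column and the last two columns, which holds for the rows shown here but might not hold for every status that onstat can print.

```python
from collections import Counter


def count_thread_states(lines):
    """Tally thread states from lines shaped like the samples in this
    section: tid tcb rstcb prty <status words> vp-class name."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip headers and blank lines
        # The status can span several words (for example "IO Wait").
        status = " ".join(fields[4:-2])
        counts[status] += 1
    return counts


sample = [
    "194806496 700000aa4534568 700000af4e751f8 1 IO Wait 48cpu sqlexec",
    "194812270 700000af6fe3c68 700000b88f07108 1 IO Wait 49cpu sqlexec",
    "194655974 700000abab6b028 700000b7e0d9ae0 1 ready 32cpu sqlexec",
]
states = count_thread_states(sample)
print(states["IO Wait"], states["ready"])  # 2 1
```

Collecting these counts at regular intervals shows whether I/O waits or ready-queue backlogs persist over time or appear only in short bursts.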

Recommendation

Monitor the system workload:

For CPU-intensive workloads, use a starting point for the number of CPU VPs of 1.5x the number of physical CPUs in the LPAR.

For I/O-intensive workloads, use a starting point for the number of CPU VPs of 3x the number of physical CPUs in the LPAR.
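The two starting points can be expressed as a small calculation. The Python sketch below is illustrative only: the helper names `starting_cpu_vps` and `vpclass_line` are invented for this example, and rounding down for core counts that do not multiply evenly is an assumption. The generated line follows the VPCLASS example shown earlier in this section.

```python
def starting_cpu_vps(physical_cores, workload):
    """Suggested starting number of CPU VPs for an LPAR.

    workload is "cpu" (1.5x the physical cores) or "io" (3x the
    physical cores), following the recommendation in this section.
    """
    multipliers = {"cpu": 1.5, "io": 3.0}
    if workload not in multipliers:
        raise ValueError("workload must be 'cpu' or 'io'")
    # Rounding down is an assumption of this sketch.
    return int(physical_cores * multipliers[workload])


def vpclass_line(num_vps):
    """Format the value in the same style as the earlier VPCLASS example."""
    return f"VPCLASS cpu,num={num_vps},noage"


print(starting_cpu_vps(32, "cpu"))                # 48
print(starting_cpu_vps(32, "io"))                 # 96
print(vpclass_line(starting_cpu_vps(32, "cpu")))  # VPCLASS cpu,num=48,noage
```

These values are starting points only. The monitoring described in this section determines whether the number should be raised or lowered.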

Affinity

The database server supports automatic binding of CPU virtual processors to a processor in a multiprocessor environment. The default behavior of affinity in Informix is to affinitize starting with cpu0, then cpu1, cpu2, etc. On the POWER7 architecture, this is not necessarily the most beneficial behavior because these numbers refer to logical cpu0, logical cpu1, logical cpu2, etc. With the POWER7 architecture and SMT, it is more beneficial to use the first thread of each physical CPU before using the second threads of each physical CPU.

By disabling affinity in Informix and allowing the operating system to schedule the CPU virtual processors, you get the behavior of using the first thread of each core first. This behavior is the most advantageous because the first thread of each core delivers the most throughput. As stated earlier in this paper, using threads 2-4 of a core gains only an additional 40% - 60% improvement in overall throughput.

Recommendation

On the POWER7 architecture, if SMT2 or SMT4 is used, disable affinity. Using affinity can degrade performance because threads 2-4 are used before all the first threads of each core in the LPAR are in use.

Understanding onstat -g glo

The onstat -g glo command is used to display information about the virtual processors within Informix. One of the values that the command displays is a virtual processor efficiency value. This value is the ratio of the total CPU time to the total time that the threads ran on the virtual processor. It shows how efficiently the CPU virtual processor is being used.

Example:

Individual virtual processors:
vp pid class usercpu syscpu total Thread Eff

19 18808928 cpu 1513914.43 106858.67 1620773.10 3952966.80 41%

Threads were scheduled to run on this CPU VP for 3,952,966 seconds, but the CPU VP only ran on the CPU for 1,620,773 seconds. The efficiency rating of 41% is derived by dividing 1620773 by 3952966.
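The division can be reproduced with a short Python sketch. The function name `vp_efficiency` is hypothetical. Truncating to a whole percent is an assumption made here because it reproduces the Eff values in the sample rows shown in this paper; the exact rounding that onstat applies is not stated in the source.

```python
def vp_efficiency(total_cpu_seconds, thread_seconds):
    """Efficiency as a whole percent: total CPU time divided by the time
    threads were scheduled on the VP (the "total" and "Thread" columns)."""
    if thread_seconds <= 0:
        return 0  # for example, the adm VP with a Thread value of 0.00
    # Truncation (not rounding) is an assumption of this sketch.
    return int(100.0 * total_cpu_seconds / thread_seconds)


print(vp_efficiency(1620773.10, 3952966.80))  # 41
print(vp_efficiency(366.12, 4967.90))         # 7
```

The second call uses the values for VP 87 from the earlier sample output and yields the 7% shown there.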

To understand the efficiency rating on POWER7, it is necessary to understand the load on the server and the number of physical and logical CPUs allocated to an LPAR. For example, an LPAR with 1 CPU allocated to it, running SMT4 under a load that would keep 4 CPU VPs maxed out, would show an efficiency rating of 25% for each of the 4 CPU VPs. Added together, the 4 CPU VPs would approach 100%.

This would not be a typical setup, and a 1-to-1 relation of CPU VPs to logical CPUs in an LPAR is not recommended. This measurement is more meaningful on systems that do not use SMT. The DBA would use the onstat -g glo command along with mpstat and lparstat data to understand CPU utilization.

Monitoring lparstat can give you a general idea of how busy or idle the LPAR is.

System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513279MB

%user %sys %wait %idle
----- ----- ------ ------
61.7 10.3 0.1 28.0
58.2 10.9 0.0 30.9
56.3 9.5 0.1 34.1

In this output, the system is roughly 30% idle. The output also contains other information about the LPAR. The LPAR is dedicated, using SMT4, and it has 128 logical CPUs.
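As a rough illustration, the idle figure can be averaged across the sampled intervals. The Python sketch below assumes that each data row has the four columns shown in the sample (%user, %sys, %wait, %idle). The function name `average_idle` is made up for this example.

```python
def average_idle(rows):
    """Average the %idle column (the fourth field) over lparstat data rows."""
    idle_values = [float(line.split()[3]) for line in rows if line.strip()]
    if not idle_values:
        raise ValueError("no data rows supplied")
    return sum(idle_values) / len(idle_values)


sample = [
    "61.7 10.3 0.1 28.0",
    "58.2 10.9 0.0 30.9",
    "56.3 9.5 0.1 34.1",
]
print(round(average_idle(sample), 1))  # 31.0
```

For the three intervals shown, the average is 31% idle.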

The mpstat data is not as straightforward to read and understand.

cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc
0 1510 0 0 607 1720 280 1 4518 100 12509 74 16 0 9 0.46
1 273 0 0 391 656 60 0 556 100 4988 65 10 0 25 0.31
2 5 0 0 309 62 4 0 68 100 432 12 6 0 82 0.12
3 8 0 0 258 21 1 0 27 100 216 5 5 0 90 0.11
4 1438 0 0 444 1459 262 1 3772 100 11519 72 18 0 10 0.38
5 539 0 0 352 342 44 0 344 100 4875 83 5 0 11 0.42
6 1 0 0 263 46 3 0 41 100 218 17 6 0 77 0.10
7 0 0 0 265 49 2 0 32 100 340 14 8 0 78 0.10

ALL 30648 0 0 49044 86261 12756 18 155654 100 641630 56 9 0 34 31.98

The summary in the ALL line at the end of the mpstat data looks more like the lparstat data. Monitoring this output might show that thread #1 of each core is pretty active. Logical CPU 0 is thread #1 for the first core in this LPAR. Logical CPU 4 is thread #1 for the second core in the LPAR.

The output also shows that threads 2-4 are in use, but that they are not as heavily utilized as thread #1 for each core. As the number of CPU VPs becomes greater than the number of cores in use, threads 2-4 will begin to show more utilization.
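The mapping from logical CPU to core and SMT thread can be sketched in Python. The sketch assumes that logical CPUs are numbered consecutively within each core, which is consistent with logical CPU 0 and logical CPU 4 being thread #1 of the first and second cores under SMT4. The function names are hypothetical, and busy time is approximated as 100 minus the id (idle) column, which is the second-to-last field in the rows shown above.

```python
def core_and_thread(logical_cpu, smt_threads=4):
    """Return (core index, thread number), with threads numbered from 1."""
    return logical_cpu // smt_threads, logical_cpu % smt_threads + 1


def busy_by_thread(rows, smt_threads=4):
    """Average busy percentage per SMT thread position from mpstat rows."""
    samples = {}
    for line in rows:
        fields = line.split()
        cpu = int(fields[0])
        idle = float(fields[-2])  # the "id" column
        _, thread = core_and_thread(cpu, smt_threads)
        samples.setdefault(thread, []).append(100.0 - idle)
    return {t: sum(v) / len(v) for t, v in sorted(samples.items())}


rows = [
    "0 1510 0 0 607 1720 280 1 4518 100 12509 74 16 0 9 0.46",
    "1 273 0 0 391 656 60 0 556 100 4988 65 10 0 25 0.31",
    "2 5 0 0 309 62 4 0 68 100 432 12 6 0 82 0.12",
    "3 8 0 0 258 21 1 0 27 100 216 5 5 0 90 0.11",
    "4 1438 0 0 444 1459 262 1 3772 100 11519 72 18 0 10 0.38",
    "5 539 0 0 352 342 44 0 344 100 4875 83 5 0 11 0.42",
    "6 1 0 0 263 46 3 0 41 100 218 17 6 0 77 0.10",
    "7 0 0 0 265 49 2 0 32 100 340 14 8 0 78 0.10",
]
print(core_and_thread(4))    # (1, 1)
print(busy_by_thread(rows))  # {1: 90.5, 2: 82.0, 3: 20.5, 4: 16.0}
```

For the two cores in the sample, thread #1 averages about 90% busy, while threads 3 and 4 stay near 20% or below.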

When threads 2-4 are being used, monitor throughput and response times to verify that the number of CPU VPs is properly configured with respect to the number of cores and logical CPUs for the LPAR.

Recommendation

If the number of CPU VPs is greater than the number of cores in an LPAR, monitor closely for total throughput compared to single-user response time. If response time degrades to unacceptable levels, test and monitor decreasing the number of CPU VPs to ensure better response times. As is noted in other areas of this document, threads 2-4 for a core do not scale linearly.

Starting LPARs

The order in which LPARs are started can affect the physical resource allocation within the LPARs and can have performance implications within a system. The first LPAR that is started gets the most optimal resources, the second LPAR started gets the next best resources, and so on. The last LPAR that is started gets what is left.

For example, assume that you have a 32-core system with four chips, each with eight cores. If five partitions are configured, each with six cores, each of the first four LPARs would be placed on its own chip, and the fifth LPAR would be spread across three chips.

This is especially true if the resources allocated to an LPAR do not exceed the cores on a single chip, in which case there is a better opportunity for the LPAR to obtain good affinity characteristics in its core and memory allocations.
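The 32-core example can be reproduced with a simplified placement model in Python. This is a sketch only: the function `place_lpars` is hypothetical, and its rule (use a single chip when the whole LPAR fits, otherwise spread over the chips with the most free cores) is an assumption chosen to mirror the example, not a description of how the hypervisor actually allocates resources.

```python
def place_lpars(chips, cores_per_chip, lpar_sizes):
    """Place LPARs in start order; return one {chip: cores} dict per LPAR."""
    free = [cores_per_chip] * chips
    placements = []
    for size in lpar_sizes:
        # Prefer a single chip that can hold the whole LPAR.
        fitting = [c for c in range(chips) if free[c] >= size]
        if fitting:
            chip = max(fitting, key=lambda c: free[c])
            free[chip] -= size
            placements.append({chip: size})
            continue
        # Otherwise take the leftover cores, fullest chips first.
        allocation = {}
        remaining = size
        for c in sorted(range(chips), key=lambda c: -free[c]):
            if remaining == 0:
                break
            take = min(free[c], remaining)
            if take:
                allocation[c] = take
                free[c] -= take
                remaining -= take
        if remaining:
            raise ValueError("not enough free cores for this LPAR")
        placements.append(allocation)
    return placements


result = place_lpars(chips=4, cores_per_chip=8, lpar_sizes=[6, 6, 6, 6, 6])
print([len(p) for p in result])  # [1, 1, 1, 1, 3]
```

Each of the first four LPARs lands on a single chip, and the fifth is split across three chips with two cores on each, matching the example above.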

Recommendation

The order in which LPARs are started should be considered in obtaining the best performance for high-priority workloads. Start the most important partitions first to obtain the best resources from a single chip.

Appendix A: Recommendations summary

Summary table of recommendations made throughout this white paper. For complete recommendations see the appropriate section in the white paper.

SMT settings: Use SMT4 for increased overall throughput. Use SMT2 when single-thread response time is more important. See the “SMT1 vs SMT2 vs SMT4” section for details.

LPAR type: Use a dedicated LPAR where possible. See the “Dedicated LPAR vs shared LPAR” section for details.

VIOS: Use a dedicated LPAR. See the “Virtual I/O Server (VIOS) LPAR” section for details.

Additional LPAR improvements: When possible, use DPO and ASO/DSO to optimize workloads for Informix. See the “Additional LPAR recommendations” section for details.

Memory Page Sizes: When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Use 64 KB large pages for performance improvements. See the “Memory considerations” section for details.

FDPR: Use FDPR in a non-production environment to test for performance improvements. See the “Feedback Directed Program Restructuring (FDPR)” section for more details.

IO Subsystem: Use KAIO or DIRECT IO where possible. For disks, set the queue depth to a minimum of 16. For AIO servers, set the min and max aio servers to 100 and aio_server_inactivity to 86400. See the “I/O subsystem” section for details.

Network Subsystem: Test network throughput and tune as needed. Increase the tcp_recvspace and tcp_sendspace AIX parameters up to 256 KB. See the “Network Subsystem” section for details.

Number of CPU VPs: For CPU-intensive workloads, set the CPU VPs to 1.5x the number of physical CPUs in the LPAR. For I/O-intensive workloads, set the CPU VPs to 3x the number of physical CPUs in the LPAR. See the “Number of CPU virtual processors” section for details.

Affinity: Disable affinity when SMT2 or SMT4 is being used. See the “Affinity” section for details.

Understanding onstat -g glo: If the number of CPU VPs is greater than the number of cores in an LPAR, monitor the throughput closely, and adjust the number of CPU VPs as appropriate. See the “Understanding onstat -g glo” section for details.

Starting LPARs: Start the most critical LPARs first. See the “Starting LPARs” section for details.

Appendix B: Useful commands

amepat

Active Memory™ Expansion Planning and Advisory Tool. The amepat command reports Active Memory Expansion information and statistics as well as provides an advisory report that assists in planning the use of Active Memory Expansion for an existing workload. This document used this tool to show statistics for an LPAR.

bosboot

Creates a boot image. This utility is used to reboot an LPAR with any setting changes that have been made.

chdev

Changes the characteristics of a device. This document used this tool to modify the queue depth for a hard disk, as well as to change TCP settings for a network interface.

ifconfig

Configures or displays network interface parameters for a network using TCP/IP. This document used this tool to make modifications to TCP parameters.

ioo

Manages Input/Output tunable parameters. This document used this tool to view the AIO server settings as well as to modify some of the parameters.

iostat

Reports Central Processing Unit (CPU) statistics, asynchronous input/output (AIO) and input/output statistics for the entire system, adapters, TTY devices, disks, CD-ROMs, tapes, and file systems. This document used this tool to monitor I/O statistics.

lparstat

Reports logical partition (LPAR) related information and statistics. This document used this tool to gather and show statistics for an LPAR.

lsattr

Displays attribute characteristics and possible values of attributes for devices in the system. This document used this tool to monitor the queue depth for a hard disk.

schedo

Manages processor scheduler tunable parameters. This document used this tool to check the processor folding settings and modify the settings as needed.

smtctl

Controls the enabling and disabling of processor simultaneous multithreading mode. This document used this tool to enable/disable SMT, and set the mode accordingly.

vmo

Manages Virtual Memory Manager tunable parameters. This document used this tool to set up and pre-allocate memory for 16 MB large pages.

vmstat

Reports virtual memory statistics. This document used vmstat to monitor the creation of 16 MB large pages.

Appendix C: Additional reading

The following list of links provides useful background information for PowerVM and POWER7 systems. Also listed is 3rd-party material, which is not an endorsement of the material by IBM, but is meant to offer the reader a variety of information and viewpoints.

DeveloperWorks: AIX Virtual Processor Folding is misunderstood
https://www.ibm.com/developerworks/community/blogs/aixpert/entry/aix_virtual_processor_folding_in_misunderstood110?lang=en

IBM Systems: Understanding Micro-Partitioning®
http://www.ibmsystemsmag.com/aix/tipstechniques/systemsmanagement/Understanding-Micro-Partitioning/?page=1

IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
http://www.ibmsystemsmag.com/aix/administrator/systemsmanagement/entitled_capacity/

YouTube: Power7 Performance – Entitlement, VPs, Affinity, Memory
http://www.youtube.com/watch?v=1W1M114ppHQ

Feedback Directed Program Restructuring (FDPR)
https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/

Developer Works: VIOS Advisor
http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor

IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html

IBM Redbooks Publication: AIX 5L Performance Tools Handbook
http://www.redbooks.ibm.com/abstracts/sg246039.html

Appendix D: References

IBM Informix database server 12.10.xC1: A Technical White Paper
Release Notes for IBM Informix 12.10.xC2
IBM Informix 12.10 .NET Provider Reference Guide
IBM Informix 12.10 Administrator’s Reference
IBM Informix 12.10 Backup and Restore Guide
IBM Informix 12.10 Database Extensions User's Guide
IBM Informix 12.10 Enterprise Replication
IBM Informix 12.10 GLS User's Guide
IBM Informix 12.10 Guide to SQL: Reference
IBM Informix 12.10 Guide to SQL: Syntax
IBM Informix 12.10 Migrating and upgrading
IBM Informix 12.10 Performance Guide
IBM Informix 12.10 Security
IBM Informix 12.10 TimeSeries Data User’s Guide
IBM Informix 12.10 Warehouse Accelerator Administration Guide

For more information

To learn more about the Informix features, contact your IBM representative or IBM Business Partner, or visit ibm.com/software/data/informix

IBM Informix on POWER7
Best Practices
A Technical White Paper

December 2013
Darin Tracy
Monish Gupta
Vladimir Kolobrodov

© Copyright 2013 IBM Corporation

IBM Corporation
Software Group
Route 100
Somers, NY 10589
U.S.A.

The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are subject to change by IBM without notice.

IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

References in this publication to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth, savings or other results.

IBM, the IBM logo, ibm.com, and Informix are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.