
百硕客户通讯 BAYSHORE ADVISOR (Issue 7, March 1, 2007)

New Ways to Obtain z/OS Learning Materials

With mainframe professionals becoming increasingly scarce, raising people's mainframe and z/OS skills has become a focal point in the international mainframe community, and as a result many technical articles that mainframe practitioners can consult and learn from have appeared online.

z/NextGen

In the United States, many mainframe system programmers are approaching retirement age, so cultivating a new generation of mainframe system engineers at universities and related institutions has become a hot topic.

In response, SHARE (the IBM user group) has created a program, called z/NextGen, that concentrates on developing the skills of the next generation of mainframe engineers. It is aimed directly at students and IT professionals encountering mainframe work for the first time. Attending SHARE sessions is a good way to get mainframe-related training.

(As a SHARE member, Bayshore has attended SHARE conferences held in the United States. All SHARE materials are burned to DVD and sent to SHARE members after each conference. If you are interested in the available conference materials, you can request them from your Bayshore contact.)

The following are some of SHARE's z/NextGen foundation sessions:

• Why CICS? Why the Mainframe? Why Now?

• z/OS Basics: Intermediate JCL

• Bit Bucket x'21'

• z/OS Basics: Building a 'Gee Whiz' z/OS Toolkit

• Understanding Network Acronyms - What Do All Those Letters Mean?

• TCP/IP Performance Management for Dummies

• Assembler Language as a Higher Level Language: Basic Conditional Assembly and Macro Concepts - Part 1 of 2

• Introduction to Sockets API Programming on z/OS

• z/OS Basics: z/OS Basic Skills Hands-on-Lab

• 'Are you talkin' to me?' sez Java to COBOL

• z/OS Basics: Introduction to JES2 for New Systems Programmers

• Storage Area Networking Concepts

• DB2 for z/OS 'Things' I Wish They'd Told Me Eight Years Ago - Part 9

• Revisiting the Basics of DASD I/O Performance

• The Great z/OS Information Adventure

• Understanding RACF (The Boot Camp) - Part 1 of 2

• Autonomic Workload Management

• Everything a z/OS Programmer Ever Wanted to Know about UNIX 'ld' but was Afraid to Ask

• Integrating Green Screen Applications in SOA Implementations using HATS: Case Studies

• ISPF Users Boot Camp - Part 1 of 2

• LE For Dummies

• Cheryl's Hot Flashes #17

• zNextGen Project Opening and Keynote

• zNextGen Project Wrap-Up

• Fulfilling the Challenge: A Perspective on Education

z/OS Hot Topics Newsletter

The z/OS Hot Topics newsletter is an excellent source of z/OS technical material. IBM publishes it twice a year for all customers. Current and past issues are available at:

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/Shelves/HOTOPICS?

It carries a wealth of up-to-date z/OS information, and its latest issue covers topics useful to every z/OS system programmer. The August 2006 issue of the z/OS Hot Topics newsletter includes the following:


• The Top 10 Best Practices for Continuous Availability

• Coupling Facility Structures and how they can be best organized

• An Update on the IBM Encryption Facility for z/OS

• DFSMShsm Encryption

• Learning z/OS (z/OS Basic Skills Information Center)

• The zIIP Processor (a DB2 specialty processor)

• zFS Fast Mount

• Some Problem Determination Tips for V1R7 Installation

• z/OS Ordering Using the ShopzSeries Website

• IBM Electronic Service Agent (a product which protects data access between a company’s network and the Internet)

• Shedding some light on IBM Customer Support

• Using a Faster and More Flexible NJE

• Policy Based Network Security of z/OS Communications Server

• Using Enterprise Extender

• Sysplex Network Enhancements in z/OS v1R8

• Using the C/C++ Compiler on zSeries

z/OS Basic Skills Information Center:

This is IBM's home base for developing mainframe engineers, called the z/OS Basic Skills Information Center; you can reach it at the following link:

http://publib.boulder.ibm.com/infocenter/zoslnctr/v1r7/index.jsp

The goals of the z/OS Basic Skills Information Center are to:

• provide free basic z/OS training and information;

• shorten the time it takes to become a z/OS professional;

• make z/OS easier for newcomers to learn.

The site currently offers the following online information:

• z/OS concepts

• z/OS application programming

• z/OS networking

• z/OS system programming

• z/OS online workloads

• z/OS problem management

• z/OS security

• interactive z/OS courses

• a z/OS glossary


From Non-threadsafe to Threadsafe: Take Full Advantage of Multiprocessors

By 马晓冬, Senior Engineer, Bayshore

In a single-CP environment, most applications running under CICS use QR mode; the QR TCB guarantees the consistency of the resources being accessed, while writing THREADSAFE programs is comparatively more work. In a multi-CP environment, however, QR mode limits the system's processing capacity: only threadsafe programs can exploit the parallel processing power of multiple CPs. This article describes the benefits of THREADSAFE in a CICS environment and how to keep access to resources consistent.

1 Reentrant

In an online environment, more than one task shares the program code and other shared resources (in this document, a shared resource means an application-level shared resource, for example the CWA or SHARED storage, not a resource managed by CICS such as a file). If these tasks do not change the program code, the programs are called reentrant. Language Environment (LE) programs can be guaranteed reentrant by compiling with the RENT option, and CICS TS loads such programs into read-only storage, the (E)RDSA, when the SIT parameter RENTPGM=PROTECT is specified.

An example of changing the program code is defining a variable in the CSECT of an assembler program.

2 Quasi-reentrant

Quasi-reentrant programs do not change their program code, but they do access (update) shared resources. Before CICS TS, all application code ran under the main CICS TCB, called the quasi-reentrant (QR) TCB. The CICS dispatcher sub-dispatches the QR TCB among the CICS tasks. Each task voluntarily gives up control when it issues a CICS service that causes a CICS dispatcher wait. Only one CICS task is ever active at any one time on the QR TCB, so the QR TCB serializes access to shared resources and there is no data-integrity problem. These programs have the CONCURRENCY(QUASIRENT) attribute in their program definition.

LABEL_A  EXEC CICS ADDRESS CWA SET(R4)
         L     R5,0(R4)
         LA    R5,1(R5)
         ST    R5,0(R4)
         ST    R5,WRKCOUNT
LABEL_B  EXEC CICS ....

The code above updates the first 4 bytes of the CWA. With CONCURRENCY(QUASIRENT), only one task at a time executes the code between LABEL_A and LABEL_B, so each task obtains a unique number from the CWA.

The benefit of the QR TCB is data integrity; the cost is performance, because it cannot exploit parallel processing on a multiprocessor system and sometimes needs extra processing for TCB switches.

3 Threadsafe

Compared to quasi-reentrant programs, threadsafe programs are 'truly reentrant' or 'fully reentrant'. Such programs:

1. are reentrant;

2. issue no non-threadsafe commands;


3. do not access shared resources, or use serialization techniques to guarantee data integrity.

Such programs can run in the OTE (Open Transaction Environment).

Threadsafe commands: when a CICS API or SPI command is executed, CICS may run code that updates shared CICS control blocks (for example, the CSA); such commands are considered non-threadsafe. CICS automatically switches back to the QR TCB when it is about to execute any API or SPI command that it knows to be non-threadsafe.

4 OTE (Open Transaction Environment)

OTE introduces a new class of TCB that applications can use, called an open TCB. An open TCB is characterized by the fact that it is assigned to a single CICS task for its sole use, and multiple open TCBs may run concurrently in CICS. CICS TS V2 introduced the OTE for DB2 V6 (and higher), with the L8 TCB handling DB2 work.

[Figure: the TCB switches involved in typical DB2 transactions running under CICS TS 2.2/2.3; both threadsafe and non-threadsafe tasks are shown.]

CICS TS 3.1 adds a new program API attribute (OPENAPI/CICSAPI) and a new L9 TCB.


If non-threadsafe commands are necessary in the program, move them before the first SQL call to avoid switching back to the QR TCB.

5 Serialization techniques for accessing shared resources

This part must be done by the application programmer.

CICS API enqueue/dequeue: The EXEC CICS ENQUEUE and DEQUEUE commands are ideally suited for CICS application programs to serialize access to shared resources. Both commands are threadsafe.

CICS XPI enqueue/dequeue: An enhancement to the exit programming interface (XPI) introduced with CICS Transaction Server 1.3 was the DFHNQEDX macro function call, which provides the same ENQUEUE and DEQUEUE capability as the CICS API. The XPI commands are threadsafe.

Compare and swap: Assembler applications and user exits can use one of the conditional swapping instructions, COMPARE AND SWAP (CS) or COMPARE DOUBLE AND SWAP (CDS), to serialize access to shared resources. Refer to the appropriate Principles of Operation manual for full details on coding these instructions.

Without serialization:

LABEL_A  EXEC CICS ADDRESS CWA SET(R4)
         L     R5,0(R4)
         LA    R5,1(R5)
         ST    R5,0(R4)
         ST    R5,WRKCOUNT
LABEL_B  EXEC CICS ....

With COMPARE AND SWAP:

LABEL_A  EXEC CICS ADDRESS CWA SET(R4)
RETRY    L     R5,0(R4)
         LR    R6,R5
         LA    R5,1(R5)
         CS    R6,R5,0(R4)
         BNZ   RETRY
         ST    R5,WRKCOUNT
LABEL_B  EXEC CICS ....

With the CS instruction, when the compare fails we must go back in the code, redo the update, and retry the store. Notice that this is actually coded as a potentially infinite loop, which could be dangerous. It might have been cleaner to put a loop counter in there and abend the transaction if it cannot serialize the data. However, because it was difficult to provoke contention, we felt there was a very low chance of ever going into an infinite loop.
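To make the bounded-retry variant concrete, here is a minimal sketch of the same compare-and-swap pattern in Python rather than HLASM. The Word class, its lock-based compare_and_swap, the retry limit, and the AbendError name are all illustrative assumptions standing in for the hardware CS instruction and a transaction abend; none of them are CICS or z/Architecture APIs.

import threading

class Word:
    """A shared fullword, standing in for the first 4 bytes of the CWA."""
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()   # stands in for the hardware's atomicity

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Atomically store `new` if the current value equals `expected` (like CS)."""
        with self._lock:
            if self.value == expected:
                self.value = new
                return True
            return False

class AbendError(RuntimeError):
    """Illustrative stand-in for abending the transaction."""

def next_unique_number(counter: Word, max_retries: int = 1000) -> int:
    """Bounded-retry CS loop: reload, increment, and attempt the swap."""
    for _ in range(max_retries):
        old = counter.value                         # L   R5,0(R4)
        if counter.compare_and_swap(old, old + 1):  # LA / CS ... BNZ RETRY
            return old + 1                          # ST  R5,WRKCOUNT
    raise AbendError("could not serialize the counter update")

counter = Word()
print(next_unique_number(counter))   # 1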


Experiments in Dynamic Mainframe Storage Allocation

By 陈银波, Senior Engineer, Bayshore

Experiment 1:

Test machine: XX04, model 9672-R54 (S/390); operating system: OS/390 2.10; total usable storage across all LPARs: 3 GB.

Storage allocation of each LPAR before the experiment:

LPAR NAME   INITIAL   RESERVED
XX0401      512M      0M
XX0402      1024M     0M
XX0403      512M      0M
XX0404      1024M     0M

After modification:

LPAR NAME   INITIAL   RESERVED
XX0401      512M      0M
XX0402      512M      512M
XX0403      512M      0M
XX0404      512M      512M

A standout advantage of IBM mainframe LPAR partitioning is that hardware resources can be allocated sensibly so they are used to the fullest. For example, with the familiar CP resources, defining the logical CPs of each LPAR appropriately lets CP capacity be shared dynamically and fully. In the same way, the machine's storage can be allocated dynamically among the LPARs for optimal use, without re-ACTIVATEing the machine.

This article describes experiments in dynamically reallocating storage among the LPARs of a single mainframe. All of the experiments were performed on S/390 machines; we have not tried them on z/Architecture machines. If you are interested, feel free to try them yourself, and we would be glad to discuss any related questions with you. 陈银波


Objective: modify the storage configuration of XX0402 and XX0404 to add RESERVED storage, and observe how the systems use it.

Procedure:

1. Configure the machine and IPL the systems.

On the HMC console, modify the storage configuration of each LPAR on machine XX04, then ACTIVATE and IPL with a new RSU parameter: RSU=512 (we accidentally omitted the M during the experiment).

Meaning of the RSU parameter:

It specifies how much of central storage is reconfigurable. If a bare number is given, the system treats the reconfigurable storage as that number times the storage granule (the smallest unit in which storage is allocated on the HMC). You can also specify xxxxM or xxxxG directly, or a percentage xx%. The default is 0, meaning no dynamic storage reconfiguration.

Here RSU is 512 and the minimum storage granule on machine XX04 is 16M, so by the definition of RSU the system should have 512 x 16 = 8192M of reconfigurable storage, far more than can actually be reconfigured. The following message therefore appeared during IPL:

IAR004I THE RSU PARAMETER WAS NOT COMPLETELY SATISFIED

This does not prevent the system from coming up; the system simply chooses a suitable RSU value for you.
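As a minimal sketch of the RSU arithmetic just described, the following Python function (the names and the percentage base are our assumptions) interprets an RSU value and caps it at what the LPAR can actually give back, mirroring message IAR004I:

def reconfigurable_storage_mb(rsu: str, granule_mb: int, max_reconfig_mb: int) -> int:
    """Interpret an IEASYSxx RSU value as megabytes of reconfigurable storage.

    rsu may be a bare granule count ('512'), an absolute size ('512M', '1G'),
    or a percentage ('25%', assumed here to be of the reconfigurable maximum);
    the result is capped at what the LPAR can actually reconfigure.
    """
    rsu = rsu.strip().upper()
    if rsu.endswith('%'):
        requested = max_reconfig_mb * int(rsu[:-1]) // 100
    elif rsu.endswith('G'):
        requested = int(rsu[:-1]) * 1024
    elif rsu.endswith('M'):
        requested = int(rsu[:-1])
    else:                                   # bare number = count of granules
        requested = int(rsu) * granule_mb
    return min(requested, max_reconfig_mb)

# The experiment's case: RSU=512 with 16M granules requests 8192M, far more
# than the 448M (16M-464M) this LPAR can actually reconfigure.
print(reconfigurable_storage_mb('512', granule_mb=16, max_reconfig_mb=448))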

2. After IPL, observe storage usage.

After XX0402 re-IPLed, issue the command:

D M=STOR

to display the system's real storage usage. The result:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-16M

464M-512M

ONLINE-RECONFIGURABLE

16M-464M

PENDING OFFLINE

NONE

512M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

The output shows that the 512M of OFFLINE RESERVED storage is not yet in use: the RESERVED storage defined for an LPAR defaults to OFFLINE after the system comes up.

Note also these lines:

ONLINE-NOT RECONFIGURABLE

0M-16M

464M-512M

The ranges 0M-16M and 464M-512M cannot be reconfigured, because this storage is occupied by the system and is therefore non-swappable.

Which address spaces are non-swappable? We can judge from the following:

If the program name is in the program properties table (PPT) and the appropriate flags are set.

If ADDRSPC=REAL is specified on the JOB or EXEC statement.


If the address space issues the TRANSWAP sysevent. The DONTSWAP and HOLD sysevents also make an address space non-swappable, but these specifications are considered to be of short duration, and the associated LSQA and private-area pages are not necessarily put into preferred storage.

3. Try to use the RESERVED storage.

On XX0402, try to use the RESERVED storage with the command:

CF STOR(E=1),ONLINE

Then display storage usage again; the result:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-16M

464M-512M

ONLINE-RECONFIGURABLE

16M-464M

512M-1024M

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

The output shows that XX0402's storage grew from 512M to 1024M: the RESERVED storage came ONLINE on XX0402 and is now in use.

The same operation on XX0404 likewise brought its 512M of RESERVED storage online.

Conclusion: storage defined as RESERVED is initially OFFLINE after the system is restarted. The sum of an LPAR's RESERVED and INITIAL storage determines the maximum storage that LPAR can use; the RESERVED amount determines how much can be added dynamically.

Experiment 2:

Test machine: YY04, model 9672-R66 (S/390); operating system: OS/390 2.10; total usable storage across all LPARs: 6 GB.

The experiment involves LPARs YY0401 and YY0402; the storage configurations of YY0403 and YY0404 are left unchanged.

Before modification:

LPAR NAME   INITIAL   RESERVED
YY0401      1024M     0M
YY0402      2048M     0M
YY0403      1024M     0M
YY0404      2048M     0M

After modification:

LPAR NAME   INITIAL   RESERVED
YY0401      1024M     960M
YY0402      1088M     960M
YY0403      1024M     0M
YY0404      2048M     0M


Objective: carve 960M out of YY0402's INITIAL storage to use as RESERVED storage, and also give YY0401 960M of RESERVED storage without reducing its INITIAL storage. Then observe whether YY0401 and YY0402 can use this 960M after they come up. Because the original allocation used the machine's storage exactly, once RESERVED storage is counted the total now exceeds the real storage, by exactly 960M.

Procedure:

1. Modify the configuration and IPL the systems.

Specify RSU=30 and IPL normally. (Machine YY04's storage granule is 32M, so the system can reconfigure 30 x 32 = 960M.)

2. Observe the systems' storage usage.

After YY0401 and YY0402 re-IPLed, issue this command on YY0402:

D M=STOR

The result:

得到结果如下:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1088M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

960M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

On YY0401, issue:

D M=STOR

The result:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

960M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

Both systems report 960M of reserved storage available in offline storage elements (even though the storage actually free is only 960M in total).

3. YY0401 tries to use the RESERVED storage.

On YY0401, issue:

CF STOR(E=1),ONLINE

to use the RESERVED storage. The result:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M


ONLINE-RECONFIGURABLE

1024M-1984M

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

The output shows that YY0401 grew from 1024M to 1984M: the 960M of reserved storage is now in use on YY0401.

Since the system's reserve is only 960M and YY0401 has already taken it, what does YY0402 now see? Displaying YY0402's storage gives:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1088M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

960M IN ANOTHER CONFIGURATION

The result shows that the 960M originally reserved for YY0402 has been taken by YY0401. Symmetrically, if we had brought YY0402's reserved storage online first, YY0401 would show the same result.

Conclusion: by defining RESERVED storage in each LPAR that may need it, whichever LPAR runs short of storage can claim the reserve first with an online command, promptly relieving that LPAR's storage shortage.

Experiment 3:

Test machine: YY04, model 9672-R66 (S/390); operating system: OS/390 2.10; total usable storage across all LPARs: 6 GB.

YY04 is configured as follows:

LPAR NAME   INITIAL   RESERVED
YY0401      1024M     960M
YY0402      1088M     960M
YY0403      1024M     0M
YY0404      2048M     0M

Objective: observe whether the reserved 960M can be allocated simultaneously, in different amounts, to several LPARs that need it.

Procedure: this experiment builds on experiment 2, at the end of which the system's 960M reserve was entirely in use by YY0401. The actual storage sizes of YY0401 and YY0402 are therefore:

YY0401: 1024 + 960 = 1984M

YY0402: 1088M


At this point the system has no free storage left. Under these conditions:

1. First, take 256M offline on YY0401 with the command:

CF STOR(256M),OFFLINE

After the command, display storage:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M

ONLINE-RECONFIGURABLE

1024M-1728M

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

256M UNASSIGNED STORAGE

The result shows 256M of unassigned storage.

2. Check YY0402's storage usage and try to use the free 256M.

Displaying YY0402's storage gives:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1088M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

960M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

Even though YY0401's 960M of RESERVED storage is not entirely offline, YY0402 still shows 960M IN OFFLINE STORAGE ELEMENT(S).

On YY0402, issue a command to try to use the storage shown as OFFLINE:

CF STOR(E=1),ONLINE

Then display YY0402's storage:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1088M

ONLINE-RECONFIGURABLE

1088M-1344M

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

YY0402's storage grew from 1088M to 1344M, confirming that it did use the 256M taken OFFLINE from YY0401. Evidently, as long as the actual free storage is not exceeded, storage can be shared out among multiple LPARs.


Conclusion: any free storage can be allocated at will to any LPAR that has RESERVED storage defined. Free storage here means: free storage = total machine storage minus the sum of all INITIAL storage.

Experiment 4:

Test machine: YY04, model 9672-R66 (S/390); operating system: OS/390 2.10; total usable LPAR storage: 6 GB.

YY04 is configured as follows:

LPAR NAME INITIAL RESERVED

YY0401 1024M 960M

YY0402 1088M 960M

YY0403 1024M 0M

YY0404 2048M 0M

Objective: test whether INITIAL storage taken OFFLINE in another LPAR can be used by YY0401.

Procedure:

1. First, on YY0402, bring the free 960M online, so that the system has no free storage left to allocate.

On YY0402, issue:

CF STOR(E=1),ONLINE

With the free 960M in use, the system now has no free storage at all.

On YY0401, D M=STOR gives:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

960M IN ANOTHER CONFIGURATION

The result shows that YY0401 has no storage available to claim.

2. On YY0404, try to take INITIAL storage OFFLINE.

On YY0404, issue:

CF STOR(256M),OFFLINE

Then display storage; the result:

IEE174I 10.12.06 DISPLAY M 121

REAL STORAGE STATUS


ONLINE-NOT RECONFIGURABLE

0M-1536M

1600M-1696M

1728M-1824M

1984M-2048M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

256M UNASSIGNED STORAGE

The output shows that 256M of YY0404's INITIAL storage is now OFFLINE. Now display storage on YY0401:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M

ONLINE-RECONFIGURABLE

NONE

PENDING OFFLINE

NONE

960M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

YY0401 now shows 960M of OFFLINE storage available (although only 256M is actually available).

3. YY0401 tries to use the free 256M.

Bring this storage online with:

CF STOR(E=1),ONLINE

YY0401's storage now looks like this:

REAL STORAGE STATUS

ONLINE-NOT RECONFIGURABLE

0M-1024M

ONLINE-RECONFIGURABLE

1024M-1280M

PENDING OFFLINE

NONE

0M IN OFFLINE STORAGE ELEMENT(S)

0M UNASSIGNED STORAGE

The result shows that YY0401 gained 1280 - 1024 = 256M of storage.

Conclusion: INITIAL storage that one LPAR takes OFFLINE can also be used by another LPAR that has RESERVED storage defined.

Notes:

E=0 refers to INITIAL storage.

E=1 refers to RESERVED storage.


Summary of the experiments:

To bring storage ONLINE and OFFLINE dynamically, the RSU parameter must be defined in IEASYSxx; it specifies how much storage can be reconfigured while the system is running. Naturally, RESERVED storage must also first be defined on the HMC for each LPAR that is to be reconfigured dynamically.

From the four experiments above we can draw the following conclusions (see the sketch after this list):

1. Any free storage in the machine can be used dynamically by the LPARs that have RESERVED storage defined, but the total used by all LPARs will not exceed the real storage, and a single LPAR will not use more than the sum of its INITIAL and RESERVED amounts.

2. INITIAL storage can also be changed dynamically: part of it can be taken OFFLINE and brought back ONLINE even when no RESERVED storage is specified, and such storage can then be used by another LPAR that does have RESERVED storage defined.

3. If some LPAR runs short of storage, a sensible configuration of INITIAL and RESERVED storage amounts allows storage to be reallocated dynamically while in use, optimizing the use of the machine's entire storage.
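A minimal Python sketch of the two constraints these conclusions describe; the class, the function names, and the reuse of experiment 2's figures are our own illustration:

from dataclasses import dataclass

@dataclass
class Lpar:
    name: str
    initial_mb: int
    reserved_mb: int
    online_mb: int = 0          # storage currently online in this LPAR

    @property
    def max_mb(self) -> int:
        # Conclusion 1: an LPAR can never use more than INITIAL + RESERVED.
        return self.initial_mb + self.reserved_mb

def free_mb(total_mb: int, lpars: list) -> int:
    # Free storage = total machine storage minus the sum of all INITIAL storage.
    return total_mb - sum(l.initial_mb for l in lpars)

def bring_online(lpar: Lpar, amount_mb: int, total_mb: int, lpars: list) -> bool:
    """Grant `amount_mb` to `lpar` only if its own cap and the machine allow it."""
    in_use = sum(l.online_mb for l in lpars)
    if lpar.online_mb + amount_mb > lpar.max_mb:
        return False            # would exceed INITIAL + RESERVED
    if in_use + amount_mb > total_mb:
        return False            # the machine has no storage left
    lpar.online_mb += amount_mb
    return True

# Experiment 2's situation: two LPARs share a single 960M reserve.
lpars = [Lpar('YY0401', 1024, 960, online_mb=1024),
         Lpar('YY0402', 1088, 960, online_mb=1088),
         Lpar('YY0403', 1024, 0, online_mb=1024),
         Lpar('YY0404', 2048, 0, online_mb=2048)]
print(free_mb(6144, lpars))                      # 960
print(bring_online(lpars[0], 960, 6144, lpars))  # True: YY0401 takes the reserve
print(bring_online(lpars[1], 960, 6144, lpars))  # False: nothing left for YY0402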

Attachment: for a description of RESERVED and INITIAL storage, see the figure below (a screenshot of the HMC help).


REBIND Magic with DB2 V8

By Bonnie Baker, Quarter 4, 2006 | DB2 Magazine

Introduction: more and more users are migrating to DB2 for z/OS V8. Note in particular that DB2 V8 chooses ACCESS PATHs quite differently from earlier versions, so access paths may change during migration. DB2 expert Bonnie Baker explains how the DB2 V8 optimizer differs from other versions. We recommend the article; see for yourself where the magic of the V8 optimizer lies. 高春霞, Senior Engineer, Bayshore


It takes some time and some nerve, but it’s worth it.

As more and more companies are moving to DB2 for z/OS V8, which became generally available on March 1, 2004 (a significant while ago), I’ve noticed a general reluctance to REBIND after migration. I can only speculate as to the reasoning behind this inertia. It’s true that a rock at rest tends to stay at rest, but we must consider other reasons, too.

A HUGE New Release

Word spread quickly that V8 was the largest release of DB2 for z/OS ever; in fact, that it’s larger than V1 plus V4 plus V6. DBAs were confronted with so much new

code, so many possibilities, and so little time. Because of its sheer magnitude and the conversion of the DB2 CATALOG (DSNDB06) to Unicode, the DB2 developers decided to allow companies migrating from V7 to V8 to crawl before they walked; migration to V8 is accomplished in two phases rather than one. DBAs can implement V8 first in Compatibility Mode with the option of FALLBACK to V7. Then, after becoming confident, they can commit to go forward and move to V8 New Function Mode, from which there is no FALLBACK.

DBAs who are time-constrained realize that a REBIND has consequences — mostly good consequences, but occasionally (very occasionally) horrid. Therefore, they may be a bit reluctant to do wide-scale REBINDs under any circumstance, especially a new release. It’s so much more preferable to do REBINDs one-at-a-time, in an orderly, controlled manner that allows a review of each resulting access path before a commitment to really REBIND into production, overlaying the existing package.

One cautious technique is to REBIND packages under V8 with EXPLAIN YES, one at a time, into a collection (not included in any plan) called V8EXPLAIN (or any other collection name that isn't included in any production plan). After the EXPLAIN information is reviewed, DBAs are much more comfortable overlaying the existing packages with the "good" results. And, in the odd case of an unexpected calamity, they can "buy time," do some research, and find a fix for the problem access path.

BIND vs. REBIND

BIND (not to be confused with REBIND) uses as input the SQL/DBRM in the DBRM Library. (See my series of articles, listed in Resources, on how plans, packages, collections, and versions = confusion.) The assumption with BIND is that the program source code has been changed and that a new DBRM exists.

REBIND uses the current copy of the DBRM in the DB2 Catalog. The assumption is that the program has not changed but that something external to the program


has. REBINDs are normal for situations where indexes are added or dropped, or where RUNSTATS has been run and the statistics have changed significantly. Probably the most dramatic external change is the installation of a new version of DB2.

You may also want to read my 2002 column called, “Just like Magic” (see Resources) that covered some of the optimizer changes in V4, V5, and V6.

Prior Magic with REBIND under V7

Prior to V7 the performance of a query selecting a maximum value of a column could be improved by creating an index that used the “descending” option. Consider the following SQL:

SELECT MAX(PONBR) FROM POMASTER WHERE CUSTNO = :HVCUSTNO

Before V7, the performance of this SQL could be improved by creating an index on CUSTNO, PONBR descending. The DB2 Optimizer could use this index, matching on one column and using I1 (one index leaf page access) to get the largest value of PONBR for the customer.

Likewise, the performance of a query selecting a minimum could be improved by creating the appropriate ascending index.

The V7 optimizer was improved so that a SELECT of either a maximum or a minimum of a column value could be accomplished using I1 access regardless of the ascending/descending index characteristic. REBINDs under V7 improved the performance of queries where the opposite index order existed, for example, an index with PONBR ASCENDING could improve the performance of a SELECT MAX(PONBR)… and an index with PONBR DESCENDING could improve the

performance of a SELECT MIN(PONBR)….

As a result, many DBAs were able to drop indexes that had been created to solve the dilemma of multiple queries needing both values. Dropping an index has the ripple effect of improving the performance of INSERT, DELETE, logging, ROLLBACK, RECOVERY, REORG, and BIND, as well as many other functions of DB2.

V8 MAGIC

With each release of DB2, more information is available to help the optimizer make the right decision when weighing the cost of each possible access path. And with each release, more and better access paths are available.

RUNSTATs

In V8, RUNSTATs can gather distribution statistics about nonindex columns. This information is especially important for SQL that applies WHERE clause predicates to table, rather than index, data. If DB2 only has the COLCARD statistics, which give the number of distinct values of a column on the table, then the optimizer will assume even distribution across all values. With highly skewed, unevenly distributed data, the Optimizer can use the new distribution statistics to make much better decisions about the cost of data sorts as well as the optimal join type and the order of the tables in the join sequence.

Bi-Directional Indexes

V7 solved the max/min ascending/descending issue. V8 solved a similar issue. The optimizer gained the knowledge to use the same index for sort avoidance whether the ORDER BY is ascending or descending on the index columns. In other words, the same index can be used to avoid both ascending and descending sorts. This means that

a REBIND may allow the optimizer to avoid many sorts where the ORDER BY is in the opposite order of an existing index. It also means that if you have created two indexes with the same columns, one with the columns in ascending order and the other with the columns in descending order, one of those indexes can be dropped.

Transitive Closure between an outer and inner SELECT

If A=B and A=1, transitive closure tells us that B=1. If COLX = COLY and COLX = :HV, the transitive closure provided by the optimizer allows the implicit predicate COLY = :HV. The optimizer has provided these implicit predicates for us for a long, long time. But there was a situation in which we humans had to intervene and explicitly code closure predicates. Consider the following SQL:

Select empno from T1
where workdept in
   (Select deptno from T2
    where T2.COLX = T1.COLX
    and T2.COLX = :HV)

In this SQL, the outer SELECT will do a table space scan, and the correlated subselect will be performed for each different value of workdept on T1. But, through transitive closure, if T2.COLX = T1.COLX and T2.COLX = :HV, then T1.COLX = :HV. Coding this predicate in the outer SELECT WHERE clause allows the optimizer to avoid a table space scan by using an index on T1.COLX.

In V8, the optimizer will code this implicit predicate for us and the executed SQL will look like this:

Select EMPNO from T1
where T1.COLX = :HV
and T1.WORKDEPT in
   (Select DEPTNO from T2
    where T2.COLX = T1.COLX
    and T2.COLX = :HV)   (this last predicate is now useless)


Transitive Closure between WHERE and ORDER BY

In V8 there is now transitive closure/substitution between a WHERE clause and an ORDER BY clause. Consider the following SQL:

Select M.ponbr, M.custno, L.lineno, L.style
from pomaster M, polineitem L
where M.ponbr = L.ponbr ...
order by M.ponbr, L.lineno

You can't use an index to avoid a data sort if the ORDER BY contains columns from more than one table. However, via transitive closure in V8 the M.ponbr in the ORDER BY clause can be replaced with L.ponbr. Now an existing index on the polineitem table on L.ponbr, L.lineno can be used to avoid the data sort.

STAGE 2 Predicates now STAGE 1

Some predicates that were Stage 2, and therefore nonindexable, have been converted to Stage 1 in V8. One example is the situation in which the literal or host variable on the right side of the operator is larger than the column on the left. Consider the following SQL:

Select ... from T1
Where char3col = :hvpicx4

The longer, 4-byte host variable on the right made this a Stage 2 nonindexable predicate in V7 and prior. In V8, the optimizer is more forgiving, and the predicate is now Stage 1 and indexable. A REBIND allows DB2 to consider the use of an index for these reformed predicates.

Automagic

Other improvements in access path selection are part of the V8 optimizer. With a few ALTERs and the use of new V8 functions, REBIND is even more magical. I will cover these improvements in my next column.

Bonnie Baker is a consultant and trainer specializing in applications performance issues on the DB2 for z/OS platform. She is an IBM DB2 Gold Consultant, a five-time winner of the IDUG Best Speaker award, and a member of the IDUG Speakers' Hall of Fame. She is best known for her series of seminars entitled "Things I Wish They'd Told Me 8 Years Ago" and the Programmers Only column. You can reach her by email or through Bonnie Baker Corporation at 813-837-3393.


Why the IBM mainframe is an effective choice for banks

by Morton Nygaard - IBM Program Director, Systems and Technology Group Global Financial Services Sector

by David Zimmerman -IBM Global Core System Transformation Executive

June 2006

Retail banks around the world are faced with a growing set of challenges. Competition is intense, managing risk is more challenging than ever and responding quickly to change is a necessity. This paper reflects IBM’s general view of various forces affecting the banking industry and their relationship with IT investment. It was produced and developed by members of IBM’s Global Banking community, and includes research conducted by the IBM Institute for Business Value and from a variety of non-IBM resources.

These challenges are helping to shape the business strategies of retail banks. Despite continued merger and acquisition activity and changing market conditions, banks are now focused on achieving organic growth as their primary objective.

To achieve organic growth, banks are expected to focus heavily on

customer retention and increased wallet share. Customer service, rather than products or price, may be the differentiator that lets most banks survive.

Banking executives may be limited in their ability to adapt to market conditions

Banking executives are potentially facing the following challenges:

• Difficulty supporting new business initiatives

• Inability to get new products to market as quickly as they would like

• Inability to support current or projected customer accounts without considerable investment

• Inability to support the functionality of an acquired bank

• Struggling to react quickly to customer transaction requests, which may result in response time delays that negatively impact customer service

• Inability to handle current and/or future regulatory and compliance mandates

• Struggling with back-office integration and maintenance costs that may be prohibitively high when new applications like CRM are added

As a result of these challenges, banks may be considering Core Banking application transformation in order to support organic growth. It is anticipated that transformation will occur most rapidly in Asia, where banks will struggle to operate effectively as the volume of transaction processing increases with economic growth. Transformation also is expected to occur rapidly in the Europe-Middle East-Africa (EMEA) region, as banks cope with regulatory compliance and risk management. Many banks in the Americas are


now beginning the planning process for transformation.

The mainframe can be a key component of Core Banking System transformation.

The benefits banks may achieve through transforming an environment to a Service Oriented Architecture (SOA) center on increased agility and the ability to respond to changing market conditions. Banks also have the goal of bringing new products to market more quickly, centralizing multiple Core Banking Systems into one system, adhering to regulatory and compliance mandates and reducing response time delays that negatively impact customer service. These are all benefits that may be able to be achieved through migration to a Service Oriented Architecture on the mainframe platform. By mainframe, we are referring to the IBM® System z9™ and eServer™ zSeries®.

In the banking industry, the mainframe can be a platform of choice. In fact, the number of MIPS (million instructions per second) installed increased dramatically between 2001 and 2004 according to Gartner (see Figure).

The mainframe can be viewed as the platform of choice because it brings many advantages, including:

• High availability
• Strong business continuity
• Deep levels of security
• High system utilization rates
• Strong performance

Banks are deploying mainframes in new and exciting ways:

Virtualization. A mainframe can support hundreds of servers in a virtual environment. This can help improve manageability and may enable more efficient use of system resources by allowing servers to be prioritized and allocated to the workloads that need them most at any specific moment in time.

Open Standards. Mainframes are able to support J2EE, Linux®, grid standards, SOA, Web services and other forms of open standards.

Collaboration. Increasingly, banks seek to collaborate with partners and even other banks. For example, while some banks leverage check image exchange networks like Viewpointe Archive Services, others are transmitting digital images directly between banks for settlement. The open standards, deep levels of security and real-time capabilities that the mainframe can provide can help simplify integration and may also facilitate collaboration.


In November 2006 we brought two senior consultants from the American firm EPS (Enterprise Performance Strategies, Inc.) to China to deliver advanced training for domestic mainframe users entitled "z/OS System Performance Tuning": Peter Enrico and Tom Beretvas. The training drew an enthusiastic response from mainframe users across the domestic financial industry; the feedback forms we received show that attendees rated it very highly and hope for more opportunities to take part in similar training and technical exchanges.

To give domestic mainframe users as many opportunities as possible to encounter high-level mainframe techniques and experience, we also invited Peter and Tom to write two technical feature articles based on the needs of domestic users, which are published here. We hope they provide every mainframe user with further help and technical guidance.

* EPS (Enterprise Performance Strategies, Inc.) is a Bayshore partner.

WLM: Understanding the Importance Control

By Peter Enrico

Synopsis One of the most common Workload Manager (WLM) mistakes I see is the improper use of WLM importance levels. This article provides a brief explanation of the WLM importance control and how WLM uses it, and outlines a simple exercise that will help you better understand your WLM importance settings.

Background

First, it’s important to understand the meaning of WLM's importance control. As you may know, all work in the system is assigned to a service class. Each service class is comprised of one or more service class periods. It’s at the period level that work is assigned a goal, an importance level, and optionally, a period duration of service. The latter is used to age work out of one period and into another period within the same service class.

Although there are some exceptions, a basic premise of

WLM is to manage work at the period level. That is, when WLM is dynamically managing the system to meet goals, it selects the work in a particular period and then checks to see how it can help the work achieve the period’s goals. When WLM helps work that isn’t meeting its goal, it’s really helping the work in a particular period. After all, it’s the period that’s assigned a goal, so WLM manipulates the controls for the work in that period (or relative donor periods) to help meet the period's goals.


But how does WLM know which period to try to help in the first place? This is where importance comes into play.

What Is Importance? Every period that’s assigned a velocity or response time goal is also assigned a relative importance value. The value indicates how vital it is to the installation that this performance goal be met relative to the goals of other periods.

Valid values for the importance parameter are:

• 1 for the highest-importance periods
• 2
• 3 for medium-importance periods
• 4
• 5 for the lowest-importance periods

The periods for system service classes SYSTEM and SYSSTC aren't assigned importance levels. Although SYSTEM and SYSSTC don't have goals, WLM considers these periods more important than periods assigned an importance level of 1.

Discretionary goal periods also aren’t assigned an importance level, but WLM considers them less important than periods assigned an importance level of 5.

When Is a Period's Importance Considered? A period's assigned importance is considered at various times by the WLM algorithms. Not all situations are described here, and those that are described are done so with brevity in mind. The following highlights some examples of primary situations when WLM considers importance.

1. Selecting a Service Class Period to Help

Every 10 seconds the WLM algorithms wake up to determine if there's any work to help. During this time, the WLM algorithms will attempt to help, at most, one period come closer to meeting its assigned goal. The selection algorithm WLM uses makes use of each period's assigned importance as one of its guides.

This process is known as selecting a receiver. The receiver selection algorithm is not overly complicated but more than just the importance settings are considered. For brevity, let me just say that WLM attempts to find the most important period that is missing its goal the most. If the period selected can’t be helped, then WLM chooses the next period, and so on. This is why some sites may see a service class period with a high importance getting a high performance index (PI), and vice versa.

In essence, WLM attempts to help all importance 1 work first, then importance 2 work, and so on. Periods assigned to resource groups and WLM running in a multisystem Sysplex make this selection algorithm more complicated, but essentially WLM attempts to help the most important goals first before trying to meet the less important goals.

2. Selecting a Service Class Period to Take Resources From

After WLM selects a receiver and determines what resources are needed to help this receiver period, WLM will begin a search for the necessary resources and determine which periods can donate them.

This process is known as selecting a donor. Like the select receiver algorithm, the WLM resource donor selection algorithm considers each period's assigned

importance. Although WLM will help, at most, one receiver during a 10-second window, it reserves the right to take from one or more donor periods until it has the resources needed to help the receiver.

In general, during the select donor algorithm, WLM will first try to find resources that aren’t being used. If not enough resources are found, WLM will attempt to take additional resources from discretionary periods. However, if both unallocated resources and discretionary periods can’t donate enough resources to help the receiver, then WLM will search for a goal period to take resources from. From this point forward, the process for selecting a donor period is essentially the opposite as selecting a receiver. WLM will look for the least important period that’s achieving its goal the best. If this period examined can’t donate enough resources, then WLM considers the next period, and so on, until it finds enough resources.

Periods assigned to resource groups make this much more complicated, but the key idea is that WLM will first attempt to take from less important goal periods before trying to take from more important ones.
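A much-simplified Python sketch of the receiver/donor ordering just described; the Period fields, the rule that a performance index above 1.0 means the goal is being missed, and the tie-breaking are our illustrative assumptions, since the real WLM algorithms weigh far more than this:

from dataclasses import dataclass

@dataclass
class Period:
    name: str
    importance: int   # 1 = most important ... 5 = least important
    pi: float         # performance index; > 1.0 means missing its goal

def select_receiver(periods):
    """Most important period that is missing its goal the worst."""
    missing = [p for p in periods if p.pi > 1.0]
    return min(missing, key=lambda p: (p.importance, -p.pi)) if missing else None

def select_donor(periods, receiver):
    """Roughly the opposite: least important period achieving its goal best."""
    candidates = [p for p in periods if p is not receiver and p.pi <= 1.0]
    return max(candidates, key=lambda p: (p.importance, -p.pi)) if candidates else None

periods = [Period('ONLINE.1', 1, 1.4), Period('BATCH.1', 3, 0.6),
           Period('TSO.1', 2, 2.0), Period('REPORTS.1', 5, 0.9)]
rcv = select_receiver(periods)                     # ONLINE.1: importance 1, missing goal
print(rcv.name, select_donor(periods, rcv).name)   # donor: REPORTS.1, least important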

3. Determining if an Action Being Considered Has Net Value

WLM may find a receiver period to help, and it may even find enough donor periods to donate the needed resources. However, WLM doesn't actually make any changes unless certain conditions are met. One of these is that the action to be taken must have net value.

The net value check is WLM's way of ensuring that any action taken is consistent with the service policy. WLM ensures that helping a receiver doesn’t hurt the donor more than it helps the receiver,


and thus negatively impact the system.

This net value check is quite complicated, but again each period's assigned importance is considered. For example, the net value check considers whether the receiver period to be helped is more important than the donor period. If so, then it's more likely the action to be taken will have a net positive effect on the system. Another consideration is whether the donor period is both more important than the receiver period and missing its goals. If so, then it's more likely the action won't be taken, since it could have a net negative effect on the system.

Also note that the net value check allows a donor period to be more important than a receiver period as long as the donor isn't hurt more than the receiver is helped. Naturally, the net value check has many other considerations, such as PIs, resource group minimums and maximums, etc.

4. Considerations When CPU and Storage Controls Are Used

Still relatively new to WLM are the CPU critical and storage critical controls. These are designed to help installations ensure that WLM doesn't hurt the most important workloads. The WLM algorithms consider each period's relative importance during the enforcement of these critical controls.

A period marked as CPU critical dictates to WLM that the period will never have its CPU dispatch priority equal to or less than the dispatch priority of periods assigned a lower importance. This helps ensure that work marked CPU critical will always have a dispatch priority greater than work of less importance.

A region marked as storage critical dictates to WLM that storage owned by a storage-critical address space shouldn't be donated to work of lesser importance.

Considerations When Using the WLM Importance Control

Like all WLM controls, the WLM importance control is powerful, and installations need to revisit it on a regular basis. When I evaluate WLM service definitions, I perform a series of importance verification tasks that you may want to consider:

Break the thought process that importance equates strictly to business importance: A common mistake I see is WLM administrators assigning importance based on political correctness rather than a combination of political and technical correctness.

Most departments think their workloads are the most important, and most businesses have multiple high-importance workloads. You have to think rationally that WLM shouldn’t

consider every workload equally. If you assign 10 periods the importance 1 setting, then WLM considers all periods equally important.

Think of importance not just in terms of the business, but also relative to the resources available and the relationship of the workloads to these resources.

Determine current importance control settings: I list all periods and sort them by importance. Just visually seeing such a list is a big help. I recommend that you obtain an easily viewable version of your WLM service definition. You can do this by using a free tool that’s available on my Website to convert your WLM service definition to HTML format. The resulting HTML file will contain a list of all your service class periods sorted by importance.

To access this tool, go to www.epstrategies.com, select the button titled "WLM to HTML," and you will be instructed on how to do the conversion. The resulting file will contain a hyperlink to a table of goals sorted by importance. Figure 1 shows a partial example of this table. From this list, you can visualize the approximate order in which WLM will select receivers and donors, as well as the relative importance values of all periods. Many performance monitors allow you to sort their reports by importance; these reports should be sorted by importance (highest to lowest), then by PI (highest to lowest).


I/O Initiation, DASD Response Time, Its Importance And Its Components

by Thomas Beretvas

Abstract.

The article discusses the concept, the importance and the components of DASD response time.

Introduction

Many installations still do not understand what good DASD response time is or what it can mean to them. Application response time is often very important to installations; however, when response time is not good enough, the installation typically concentrates its performance analysis and capacity planning on improving CPU responsiveness. Much effort goes into studying whether a CPU upgrade is required, and how large it must be. What is often overlooked is the ease and reduced cost of improving DASD I/O response time to improve application performance. This paper discusses how an I/O operation is initiated, introduces the concept of DASD response time, and indicates what its components are. Subsequent papers will give guidelines for good DASD response time values, show how improvements can be identified, and describe the process of identifying good candidates for DASD response time improvements.

DASD I/O requests

A DASD I/O request is initiated by an application program (Figure 1.) that issues Read/Write or Get/Put macros to an access method (such as VSAM). The access method prepares Channel Command Words, CCWs, and then invokes the I/O Supervisor (IOS) component of the Operating System, which issues a Start Subchannel, SSCH command to the Channel Subsystem. The Channel Subsystem then proceeds to initiate the I/O by finding a path (figure 2.) to the DASD volume. A path consists of a channel, a director (a switch, SWT) and a control unit, CU, to the DASD volume. In order for an I/O operation to proceed, a path has to be available.

Figure 1. An I/O request


Figure 2. A path

Once the path is found, the I/O is initiated. The I/O operation itself is guided by the CCWs discussed above, which are interpreted by the channel subsystem and the control unit. The way modern DASD subsystems, storage processors, and control units (figure 3.) work, most I/Os are satisfied from cache, meaning these I/O operations are hits. Cache in some cases is volatile storage, in contrast to a sometimes separate nonvolatile storage (NVS), which is battery-protected. All writes go to cache, meaning that the data is written to cache (and to NVS when it is separate from cache), to be destaged later asynchronously. Hits are typically very fast. In contrast, a large portion of reads, but not all, is satisfied from cache. Read I/Os that are not satisfied from cache are called (read) misses. Misses are much slower than hits. The ratio of misses to all I/Os is called the miss ratio, MR. In modern storage subsystems the cache is very large, tens or even hundreds of GBs, so the miss ratio is small, typically less than 10%. In other words, the hit ratio, HR (the counterpart of the miss ratio: the ratio of hits to all I/Os), is greater than 90%.

Figure 3. A Storage Processor (Control Unit)

The operating system assumes all data is stored on logical volumes, representing buckets of data defined at I/O generation time and represented as UCBs, discussed later. These logical volumes may or may not correspond to coherent physical entities. In reality, however, all data is stored on physical volumes, which are part and parcel of (physical) control units. The operating system is not aware of the physical volumes or the physical control units.

DASD (Direct Access Storage Device) response time is defined and measured as the average response time in milliseconds for the completion of a DASD I/O request during some finite (RMF) interval, typically lasting 15 minutes.

The components of DASD response time, RT, are shown in Figure 4. We can write an equation for DASD response time showing its four components. All four components are, of course, average values, expressed in ms.


Figure 4. The components of IO Response Time, RT

RT = IOSQ + PEND + DISC + CONN (Eq. 1)

A brief discussion of the four components of DASD response time follows:

IOSQ: The z/OS operating system and its ancestor, MVS, represent a DASD logical volume by a control block, the Unit Control Block, UCB. The UCB has the limitation, that it allows only one I/O operation outstanding against this UCB from this operating system. (Each LPAR, or system has its own operating system.) Later enhancements to the operating systems, the so-called PAV (Parallel Access Volumes) allow the generation of multiple (so-called) alias UCBs, which allow additional I/O operations addressing the same logical volume to proceed concurrently. If an I/O operation cannot find a free UCB for the logical volume, the I/O operation is queued in IOS. The duration of wait in IOS is called IOS Queuing, causing an IOSQ delay. If multiple I/O operations can proceed in parallel, then for example, a hit does not have to wait for the completion of a miss. As a concrete example, suppose there are two consecutive I/O requests to a logical volume, a request resulting in a miss, and one resulting in a hit. Assume the satisfaction of the miss takes 10 ms, while that of the hit (on its own) takes 1 ms. If both can proceed in parallel, there is no IOSQ wait time and the average completion time is 5.5 ms. If the miss comes first, and there is only one UCB, the miss completes in 10 ms. This represents an IOSQ wait for the second, hit request, which then completes in a total of 11 ms, for an average completion time of 10.5 ms, with an average IOSQ wait of 5 ms.
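A quick Python check of the arithmetic in that example (the helper function is our own illustration):

def average_completion_ms(requests_ms, parallel):
    """Average completion time for consecutive requests to one logical volume.

    parallel=True models PAV alias UCBs (every request proceeds at once);
    parallel=False models a single UCB, where each request also waits in
    IOSQ for all the requests ahead of it.
    """
    if parallel:
        completions = list(requests_ms)
    else:
        completions, elapsed = [], 0.0
        for service in requests_ms:      # FIFO queue behind the single UCB
            elapsed += service
            completions.append(elapsed)
    return sum(completions) / len(completions)

miss_then_hit = [10.0, 1.0]              # a 10 ms miss followed by a 1 ms hit
print(average_completion_ms(miss_then_hit, parallel=True))   # 5.5
print(average_completion_ms(miss_then_hit, parallel=False))  # 10.5 (avg IOSQ = 5 ms)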

PEND: Once the IOSQ wait (if any) is over, a path has to be found to the logical volume; this requires a free channel, a free director port, and a free path in the control unit. If any of these is unavailable, a PEND time wait occurs until all three become available. Once the path is clear, the I/O can be initiated, unless the logical volume itself is busy. The logical volume can be busy if another I/O operation has begun against it from another z/OS system or LPAR. If this is the case, an additional PEND time wait occurs unless the control unit is equipped with the multiple allegiance feature. Multiple allegiance is similar to the alias UCBs discussed above in that it allows operations to proceed in parallel, but it is internal to the control unit, not a feature of the operating system. The three delays corresponding to the path components are Channel Delay (CH DLY), Director Port Busy Delay (DPB DLY), and Control Unit Busy Delay (CUB DLY).

The delay caused by the logical volume being busy due to another operating system is called Device Busy Delay, DB DLY.

DISC: In an ESCON environment, initiating an I/O operation tied up the whole path, including the channel. In cases where a miss had to be resolved, which involves access to the physical volume and mechanical access delays, it was considered foolish and wasteful to keep the path, and therefore the channel, busy for the entire mechanical access time. So the path was freed up, the volume was disconnected from the path, and the duration of this disconnect is reported as DISC delay. In a FICON environment this DISC delay still exists, even though no physical disconnect of the path and the channel actually occurs. Misses occur because requests cannot be satisfied from cache, so an increase in cache size can usually reduce DISC delay. Upon completion of the mechanical operation, the volume has to be reconnected to a channel, which may not be possible if all the channels are busy, resulting in a potential reconnect-miss delay, also part of DISC time. (The likelihood of this occurring is very small.)

If synchronous remote copy operations are implemented, a write operation in the primary control unit completes before the (usually remote) secondary control unit completes its matching write. (Only writes are involved in remote copy access.) Thus, there may be a synchronous copy delay due to the delay in the secondary side


manifesting itself in additional DISC time. In addition, contention in a physical volume or the control unit itself can also cause DISC delays.

CONN: The last portion of the response time is the productive portion, CONN time. During this time the actual transfer takes place. In an ESCON environment, where only one I/O operation can take place at one time on the channel, CONN keeps the channel busy. In a FICON environment, there can be multiple I/O operations occurring concurrently, thus, this I/O will share FICON with other I/Os. CONN time also includes the time required for microcode execution in the control unit, the so-called protocol time, PR. In modern control units, the protocol time is negligible.

Summary. We discussed the initiation of DASD I/O operation highlighting the path components involved. This led to the discussion of DASD response time and its components. Subsequent papers will indicate guidelines for good DASD response time values, how improvements can be identified and the process of identifying good candidates for DASD response time improvements.


This issue's expert: Darryn Salt

A senior mainframe systems specialist with more than 15 years of mainframe experience. He has served several large financial customers in the United States, the United Kingdom, and New Zealand, and has deep technical mainframe experience. He also worked at IBM China, providing consulting services to domestic banking customers, and understands how China's banking industry uses the mainframe. He joined Bayshore in 2006, where he serves domestic banking customers as a senior mainframe consultant.

This issue's questions:

Question 1: Do I need to REORG my DB2 System Catalog and Directory tables?

Question 2: What should I consider in my DB2 buffer pool strategy?

Question 3: Is there an easier method to check PSP maintenance when upgrading to a new release of z/OS?

Question 1: Do I need to REORG my DB2 System Catalog and Directory tables?

Expert's answer:

These tables should definitely be reorganized on a regular basis in order to improve overall performance, reclaim unused space, and reduce dataset extents. Since DB2 Version 4 it has been possible to reorg all of the tables in the DSNDB06 catalog and the tables SCT02, SPT01, DBD01, and SYSLGRNX in the DSNDB01 directory.

If the DSNDB06 catalog tables are not reorganized, the performance of system (and user) look-ups on the catalog suffers because of the extra I/O required to retrieve information. In addition, the catalog and directory use internal linking and hashing structures that differ from normal tablespaces, and the longer an object goes without a reorg, the greater the possibility of links breaking, resulting in data corruption.

The performance of the DSNDB01 directory is critical as it contains information about DB2 objects (DBD01) and access paths for plans (SCT02) and packages (SPT01). This information is required in the EDM pool when a thread is allocated and executed. The SYSLGRNX directory table is worth special mention because of its performance implications. SYSLGRNX is used to minimize the time taken to recover DB2 objects and to restart DB2. A record is written to SYSLGRNX when a tablespace or partition pageset is opened or updated, and the record is updated when the pageset is closed or switched to read only. The MODIFY RECOVERY utility removes records from SYSLGRNX. This is a very volatile table that can grow very large and in addition to reclaiming space, regular reorgs will result in the faster execution of the


RECOVER, MODIFY RECOVERY, and REORG SHRLEVEL(CHANGE) utilities.

Recommendations:

• Use the normal statistics contained in the catalog to determine when to reorganize DSNDB06 catalog tables. Keep the statistics current using the RUNSTATS utility.

• When you reorganize DSNDB06.SYSDBASE also reorg DSNDB01.DBD01. When you reorganize DSNDB06.SYSPLAN also reorg DSNDB01.SCT02. When you reorganize DSNDB06.SYSPKAGE also reorg DSNDB01.SPT01.

• Reorganize DSNDB01.SYSLGRNX 2 to 4 times per year.

• To reduce the time to perform the reorgs increase the size of BP0 (try 10 times the current size) for the duration of the REORG. It may be necessary to temporarily reduce the size of other bufferpools.

Question 2: What should I consider in my DB2 buffer pool strategy?

Expert's answer:

Any buffer pool strategy should consider the number of buffer pools that will be allocated, the purpose for which they will be used, and how they will be configured. The old days of having just one large buffer pool to handle everything are long gone, and it is now important to have a good buffer pool strategy that is continuously evaluated. Applications run faster if the DB2 data is found in the buffer pools (no I/O to disk is required), which makes buffer pool tuning one of the most important aspects of improving DB2 performance.

The buffer pool requirements for every site are different, however most sites should consider having separate buffer pools for the following objects or processing characteristics:

• DB2 catalog and directory (BP0)

• Sort and work database

• Heavily accessed code and reference tables (keep them in memory)

• Tables that are predominantly accessed sequentially

• Tables that are predominantly accessed randomly

• Indexes

• All tables/indexes of a “special” application

• Tables/indexes for vendor products (keep separate from business data)

• Reserved for problem analysis of application performance

• LOBs

• BP8K, BP16K, BP32K buffer pools

There are many configuration options for a buffer pool and I will discuss the most important ones here:

• VPSIZE specifies the size in pages. The bigger the buffer pool the more chance DB2 will find the required data in it.

• VPSEQT specifies the percentage of the buffer pool that can be used for sequential (sequential, list, dynamic prefetch) processing.

• DWQT is the deferred write threshold and when reached DB2 will start writing changed pages out to disk.

• VDWQT is the vertical deferred write threshold and specifies when changes of an individual dataset will be written to disk.

Recommendations:

• Only use BP0 for the catalog and directory – don’t put anything else in there. Set VPSEQT = 45 to 50% for a good mix of random and sequential activities.

• Have a separate buffer pool for the sort/work objects. Try VPSEQT = 90% since most processing is sequential, and try VDWQT = 90% to keep the pages in the buffer pool.

• For buffer pools used for sequentially processed data set VDWQT to around 80% and for random processing try around 20%.

• For general table and index buffer pools try DWQT = 10% and VDWQT = 2% to encourage continuous writing of changed pages to disk and to avoid I/O spikes from occurring when there is a DB2 checkpoint.

• Consider changing buffer pool configurations for online and batch processing. For example during the online day when data access is mostly random you might use VPSIZE = 1000 and VPSEQT = 20 and for overnight batch processing where the data access is mainly sequential the buffer pool could be altered to VPSIZE = 5000 and VPSEQT = 80.


• Buffer pools should always be backed by real memory to avoid paging.

• Always monitor and keep tuning – don’t be afraid to try things!

The buffer pool topic is a very large one and I have only touched some aspects of it here, however I hope I have given you some insight into what you should consider in order to improve your buffer pool strategy.
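To make the VPSIZE/DWQT/VDWQT interplay above concrete, here is a small sketch (our own illustration, not a DB2 utility or command) that converts the percentage thresholds into the page counts at which deferred writes would begin:

def write_thresholds(vpsize_pages, dwqt_pct, vdwqt_pct):
    """Page counts at which deferred writes kick in for a buffer pool.

    DWQT applies across the whole pool; VDWQT applies to the changed
    pages belonging to a single dataset.
    """
    return {
        'pool_deferred_write_at': vpsize_pages * dwqt_pct // 100,
        'dataset_deferred_write_at': vpsize_pages * vdwqt_pct // 100,
    }

# The 'general table and index' recommendation above: DWQT = 10%, VDWQT = 2%.
print(write_thresholds(vpsize_pages=5000, dwqt_pct=10, vdwqt_pct=2))
# {'pool_deferred_write_at': 500, 'dataset_deferred_write_at': 100}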

Question 3: Is there an easier method to check PSP maintenance when upgrading to a new release of z/OS?

Expert's answer:

Yes, there is. You can use the Enhanced PSP Tool (EPSPT) to help determine programmatically which coexistence PTFs you must install on your current system in preparation for migration to a later z/OS release. When you retrieve the extract file from the ZOSGEN PSP bucket subset that is used as input to EPSPT, the file will contain the current list of coexistence PTFs for migrating to a later z/OS release. Coexistence PTFs are identified in the "Cross Product Dependencies" section of the ZOSGEN PSP bucket

subset. The steps to take to programmatically determine whether your current system has the complete list of required Coexistence PTFs for migration are:

1) Download and install the EPSPT, available from

http://www14.software.ibm.com/webapp/set2/sas/f/psp/download.html

2) Download the extract file from your current release's ZOSGEN PSP bucket subset. The list of "to" release coexistence PTFs is found in the "from" release ZOSGEN PSP bucket subset.

3) Using the extract file from your current release's ZOSGEN PSP bucket subset, run the EPSPT.

4) Resolve any outstanding discrepancies that the EPSPT has identified.

Periodically, you may want to download the extract file from your current release's ZOSGEN PSP bucket subset, and rerun EPSPT to ensure that any newly added coexistence PTFs are verified.


百硕客户通讯 BAYSHORE ADVISOR

A quarterly publication exclusively for mainframe users in China. Published March 1, 2007 (Issue 7 overall).

Sponsor: 百硕同兴科技(北京)有限公司 (Bayshore Tongxing Technology (Beijing) Co., Ltd.)

Published by the Bayshore Advisor Editorial Board: 吕宁, 郭庆雪, 李琰, 吴笳笳, 王晓兵, 徐卫华, 邹杰, 刘京平, 陈银波, 高春霞, 郑霞, 康会影, Darryn Salt, Martha Hall, James Smith, John Varendorff

Address: Room 209, Zhongchen Building, 1 Lize Zhong'er Road, Wangjing Science & Technology Park, Chaoyang District, Beijing. Tel: 010 64391733. Fax: 010 64391582. Email: [email protected]

If you have any comments or suggestions about the Bayshore Advisor, we welcome your feedback at any time!
