Performance Optimization with IDAA
TRANSCRIPT
Roberto Gioi
Manager Capacity & Optimization
XIX EPV UG, October 11 - 14 2021
IDAA in a nutshell
A high-performance appliance that integrates PureData, based on
IIAS (IBM Integrated Analytics System) technology, with
zEnterprise technology, with the aim of providing extremely
high performance for analytical workloads
Main features:
o DB2 is the data owner (for OLTP, Batch and DW) with all the
benefits of security, consistency and integrity
o Virtual component of DB2, transparent to the application
o DB2 Quality of Service extended to the analytical workload
o Mixed workload support: the DB2 optimizer decides whether
or not to speed up queries
o Access to the accelerator is exclusively through DB2
o No specific skills required, complete transparency for the end
user
[Figure: Transactional Workload and Analytics Workload]
Query execution flow
[Figure: query execution flow. Labels: Application; Application Interface; Optimizer; Accelerator DRDA Requestor; query execution run-time for queries that cannot be or should not be routed to Accelerator; Heartbeat (availability and performance indicators); queries executed with Accelerator; queries executed without Accelerator; Accelerator nodes node0101, node0102, node0103 with Partitions 0 to 10]
MPS Configuration: Active – Active on M4002-003 appliance and v.7.5.5.1
[Figure: Primary Site and DR Site]
Performance optimization process
IBM Query Monitor for query discovery and EPV for DB2 for statistical data analysis
Evaluation of IDAA eligibility through Data Studio and cost/benefit analysis (data snapshot + execution).
Eligible Batch load detection (no CDC)
1. Queries with high OPEN cost > 500 seconds
2. Queries with high Elapsed time > 500 seconds
3. Queries with similar SQL in the same application
Frequent batch types: downloading big sets of data from tables for processing by internal application services, or producing files for different uses such as DWH
Eligible OLTP load detection (with CDC)
1. Queries with high CPU cost > 2 seconds
2. Queries with high Elapsed time > 20 seconds
OLTP query types: low frequency, high impact queries (query parallelism in IDAA below 100 threads per appliance)
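The detection thresholds above can be sketched as a simple filter. This is only an illustration: the QueryStats record and its field names are hypothetical (they are not IBM Query Monitor or EPV fields), and the sketch assumes that exceeding any one threshold is enough to flag a query, which the slide does not state explicitly. Criterion 3 for batch (similar SQL in the same application) is not modelled.

```python
from dataclasses import dataclass

# Thresholds taken from the slide (seconds).
BATCH_OPEN_SECONDS = 500
BATCH_ELAPSED_SECONDS = 500
OLTP_CPU_SECONDS = 2
OLTP_ELAPSED_SECONDS = 20


@dataclass
class QueryStats:
    """Hypothetical per-query record; field names are assumptions."""
    name: str
    workload: str              # "BATCH" or "OLTP"
    open_seconds: float = 0.0
    cpu_seconds: float = 0.0
    elapsed_seconds: float = 0.0


def is_idaa_candidate(q: QueryStats) -> bool:
    """Flag a query for an eligibility check (assumes 'any threshold' logic)."""
    if q.workload == "BATCH":
        return (q.open_seconds > BATCH_OPEN_SECONDS
                or q.elapsed_seconds > BATCH_ELAPSED_SECONDS)
    if q.workload == "OLTP":
        return (q.cpu_seconds > OLTP_CPU_SECONDS
                or q.elapsed_seconds > OLTP_ELAPSED_SECONDS)
    return False


if __name__ == "__main__":
    # Made-up sample figures, for illustration only.
    samples = [
        QueryStats("batch_unload", "BATCH", open_seconds=900, elapsed_seconds=1200),
        QueryStats("oltp_lookup", "OLTP", cpu_seconds=0.1, elapsed_seconds=0.5),
        QueryStats("oltp_report", "OLTP", cpu_seconds=3.5, elapsed_seconds=25),
    ]
    for q in samples:
        print(q.name, is_idaa_candidate(q))
```

A flagged query is only a candidate: eligibility still has to be verified with Data Studio and a cost/benefit run, as described above.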
Query selection process - DDF
Search for DDF queries using dynamic SQL statements, using IBM DB2 Query Monitor and looking at high CPU/elapsed figures.
In this case: the C03020507A call accounts for 25% of CPU time
Query selection process - DDF
IDAA eligibility check:
1. Detect all tables within the scope of the query
2. Define the tables on IDAA using Data Studio
3. Perform a refresh on IDAA
4. Use DB2 Explain to verify the IDAA eligibility
5. Estimate the benefit of the optimization
Query selection process – CICS Transaction
Search for queries using static SQL statements in CICS transactions, using IBM DB2 Query Monitor and looking at packages showing SQL statements with high CPU or Elapsed time.
In this case: package B2C5005 shows high Elapsed time
Query selection process – CICS Transaction: Rebind
In the REBIND step, set the QUERYACCELERATION parameter to ‘EL’
Query selection process – CICS Transaction: simulation in Production
Snapshot from Production: program B2C5005
Run in Production: statement simulation using a Java program.
In this case: Elapsed time cut to 3 seconds
Query selection process: example of a mail for AM team engagement
Job LKY95111 has been selected as eligible for IDAA, requiring only minimal modifications
Current average consumption: 8.000 GP CPU seconds, 35.000 zIIP CPU seconds. These figures show a growing trend due to the type and goals of the job
Average elapsed time: more than 7 hours
What is IDAA and why use it?
IDAA is an IBM appliance connected to DB2 on z/OS, used to accelerate some types of complex queries (e.g.: analytical SQL)
How to use it:
1. On IDAA, define the tables in scope of the IDAA-eligible queries
2. On IDAA, enable and populate the tables just defined
3. On z/OS, run the query
• When DB2 notices that the query can be accelerated it executes it on IDAA (data at the timestamp of refresh - see point 1 below)
4. On IDAA, disable the tables
Cross-check run in Production showed Elapsed time under 60 seconds
The JCLs to carry out a cost-benefit analysis and deployment are listed below.
1. REFRESH on IDAA:
SYS5.DM.CNTL(IDAALKEN)
2. RUN JCL LKY95111
SYS5.DM.CNTL(IDAALKUN)
3. DISABLE on IDAA
SYS5.DM.CNTL(IDAALKDI)
Query selection process: standard run on DB2
1 J E S 2 J O B L O G -- S Y S T E M C I C S -- N O D E P R O D
0
23.54.49 JOB20085 ---- FRIDAY, 24 MAY 2019 ----
23.54.49 JOB20085 IRR010I USERID OPCACID IS ASSIGNED TO THIS JOB.
23.54.49 JOB20085 ICH70001I OPCACID LAST ACCESS AT 23:54:49 ON FRIDAY, MAY 24, 2019
23.54.49 JOB20085 $HASP373 LKY95111 STARTED - WLM INIT - SRVCLASS MPBATNOR - SYS CICS
23.54.49 JOB20085 IEF403I LKY95111 - STARTED - TIME=23.54.49
23.54.49 JOB20085 - --TIMINGS (MINS.)-- ----PAGING COUNTS---
23.54.49 JOB20085 -JOBNAME STEPNAME PROCSTEP RC EXCP CPU SRB CLOCK SERV PG PAGE SWAP VIO SWAPS
23.54.49 JOB20085 -LKY95111 ULKFAN DELETE 00 23 .00 .00 .0 925 0 0 0 0 0
23.54.49 JOB20085 -LKY95111 ULKFAN OCS0067L 00 65 .00 .00 .0 1845 0 0 0 0 0
09.01.46 JOB20085 ---- SATURDAY, 25 MAY 2019 ----
09.01.46 JOB20085 -LKY95111 ULKFAN UNLOAD 04 1181 200.70 .00 546.9 1976M 0 0 0 0 0
09.01.46 JOB20085 -LKY95111 ULKFAN HT200 FLUSH 0 .00 .00 .0 0 0 0 0 0 0
09.01.46 JOB20085 -LKY95111 ULKFAN CATFILES 00 17 .00 .00 .0 845 0 0 0 0 0
09.01.46 JOB20085 -LKY95111 CE£S010A 00 35 .00 .00 .0 2776 0 0 0 0 0
09.01.46 JOB20085 -LKY95111 ICEGENA 00 129 .00 .00 .0 6390 0 0 0 0 0
09.01.46 JOB20085 IEF404I LKY95111 - ENDED - TIME=09.01.46
09.01.46 JOB20085 -LKY95111 ENDED. NAME- TOTAL CPU TIME=200.70 TOTAL ELAPSED TIME= 546.9
09.01.46 JOB20085 $HASP395 LKY95111 ENDED - RC=0004
Query selection process: run using IDAA
1 J E S 2 J O B L O G -- S Y S T E M P R O D -- N O D E P R O D
0
09.26.30 JOB38290 ---- MONDAY, 27 MAY 2019 ----
09.26.30 JOB38290 IRR010I USERID S510981 IS ASSIGNED TO THIS JOB.
09.26.38 JOB38290 ICH70001I S510981 LAST ACCESS AT 09:26:12 ON MONDAY, MAY 27, 2019
09.26.38 JOB38290 $HASP373 S5109811 STARTED - WLM INIT - SRVCLASS BATNOR - SYS PROD
09.26.38 JOB38290 IEF403I S5109811 - STARTED - TIME=09.26.38
09.26.38 JOB38290 - --TIMINGS (MINS.)-- ----PAGING COUNTS---
09.26.38 JOB38290 -JOBNAME STEPNAME PROCSTEP RC EXCP CPU SRB CLOCK SERV PG PAGE SWAP VIO SWAPS
09.26.38 JOB38290 -S5109811 DELETE 00 4 .00 .00 .0 708 0 0 0 0 0
09.27.54 JOB38290 -S5109811 UNLOAD 04 138 .00 .00 1.2 34318 0 0 0 0 0
09.27.54 JOB38290 IEF404I S5109811 - ENDED - TIME=09.27.54
09.27.54 JOB38290 -S5109811 ENDED. NAME-DB2 UTILITY TOTAL CPU TIME= .00 TOTAL ELAPSED TIME= 1.2
09.27.54 JOB38290 $HASP395 S5109811 ENDED - RC=0004
*****
CPU: 0 HR 00 MIN 00.04 SEC SRB: 0 HR 00 MIN 00.00 SEC
1READY
DSN SYSTEM(DBP)
DSN
RUN PROGRAM(DSNTIAUL) PLAN(DSNTIAUL) PARM('SQL')
****
READY
END
1 DSNT490I SAMPLE DATA UNLOAD PROGRAM
0 DSNT505I DSNTIAUL OPTIONS USED: SQL
0 DSNT503I UNLOAD DATA SET SYSPUNCH RECORD LENGTH SET TO 80
0 DSNT504I UNLOAD DATA SET SYSPUNCH BLOCK SIZE SET TO 80
0
SET CURRENT QUERY ACCELERATION = ELIGIBLE
DSNT400I SQLCODE = 000, SUCCESSFUL EXECUTION
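The two job logs can be compared with a short calculation. The sketch below takes the UNLOAD step figures from the logs (CLOCK and CPU columns, in minutes) and computes the relative reduction; the function name is an assumption made for illustration. Note that the IDAA log reports CPU as .00 minutes at two-decimal resolution, so the CPU reduction can only be stated as approximately 100% at that precision.

```python
def reduction_pct(before: float, after: float) -> float:
    """Relative reduction, in percent, going from `before` to `after`."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return (before - after) / before * 100.0


# UNLOAD step figures from the two job logs (minutes).
DB2_ELAPSED_MIN = 546.9   # standard run on DB2, 24-25 May 2019
IDAA_ELAPSED_MIN = 1.2    # run using IDAA, 27 May 2019
DB2_CPU_MIN = 200.70
IDAA_CPU_MIN = 0.00       # as printed in the log, two-decimal resolution

if __name__ == "__main__":
    print(f"Elapsed reduction: {reduction_pct(DB2_ELAPSED_MIN, IDAA_ELAPSED_MIN):.1f}%")
    print(f"CPU reduction:     {reduction_pct(DB2_CPU_MIN, IDAA_CPU_MIN):.1f}%")
```

With the figures above, the elapsed time drops from roughly nine hours to just over one minute, i.e. a reduction of about 99.8%.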
Using IDAA: monitoring
Critical Open MPS_DB2_IDAA_ThreadElapsed CC=C03020058T|CAN=0 DBN:CICS:DB2 DBN:CICS:DB2 10/05/21 15:46 19 Minutes 10/05/21 15:45 Sampled MPS_DB2_IDAA_ThreadElapsed
FirstOccurrence LastOccurrence AlertGroup Node AlertKey Summary Notification Receiver Service Ack Count Severity AckOccurrence
05/10/21 15:45 05/10/21 15:45 AP:ITM_DB2 DBN:CICS:DB2 M_AP_DB2_AVA_0C_IdaaThreadElap
High Elapsed Time for a thread running on the Accelerator 0
DB2 System No 2 Critical 0
Db2ID MAX 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
DBA 4 1 0 0 1 4 3 3 3 3 2 2 3 3 3 2 3 3 3 2 3 3 3 3 3
DBMD 4 1 1 0 0 4 2 2 3 3 2 3 3 3 3 2 3 3 3 3 3 2 3 3 2
DBMP 4 1 0 0 1 4 3 3 3 3 2 2 3 3 3 3 3 3 2 3 3 3 3 3 2
DBN 4 2 0 0 1 3 3 3 4 3 2 3 3 3 3 3 3 3 3 2 3 3 3 3 3
DBP 4 1 0 0 0 4 2 2 3 3 2 3 3 3 3 2 3 3 3 3 3 3 3 3 2
Db2ID MAX 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
DBA 582.234 1.378 150 14.263 93.121 4.317 4.024 1.300 7.508 124.527 582.234 197.267 371.547 168.022 94.105 107.443 162.046 122.071 33.972 50.110 11.277 15.624 18.551 4.258 1.313
DBMD 570.277 1.840 184 16.238 90.782 4.670 6.407 1.049 7.368 122.775 570.277 215.962 375.557 165.177 94.107 102.480 160.139 139.482 50.808 53.677 12.844 19.474 15.671 4.299 1.066
DBMP 585.881 1.378 150 10.138 93.121 4.317 4.024 1.300 7.508 126.575 585.881 195.731 373.486 170.947 99.209 110.973 172.303 140.328 51.444 51.350 12.457 15.117 15.457 4.023 1.131
DBN 587.650 4.899 150 14.424 139.378 9.138 3.783 1.452 9.399 155.649 587.650 231.188 388.144 173.627 99.477 142.712 187.463 143.964 59.386 71.932 12.457 19.702 16.038 3.888 1.150
DBP 545.164 1.911 160 16.230 90.772 4.505 6.407 1.049 7.764 119.522 545.164 207.721 386.996 169.319 82.867 91.664 168.132 112.665 33.972 50.162 11.277 14.619 18.694 4.121 1.131
IDAA02FI REPLICATION - Mon, 03 May 2021 - ACCEL LATENCY - IDAA02FI
IDAA02FI REPLICATION - Mon, 03 May 2021 - ACCEL ROWS INSERTED - IDAA02FI
Using IDAA: monitoring
Business areas:
Improve response time and cut MIPS consumption for Batch and OLTP workloads
- Wire transfers (Daily Batch)
- Pre-authorizations (Daily Batch)
- Installment Loans Management (Daily Batch)
- Marketing (Daily Batch)
- Funds (Daily Batch)
- Credit Monitoring (Daily Batch)
- Risk Matrix (Monthly Batch)
- Centralized Contracts Management (Weekly Batch)
- Credit Monitoring (OLTP)
- Monitoring and Major Incident (OLTP)
- Counter area application authorizations (OLTP)
Using IDAA: Batch
Results from the first set of jobs (application DI) processed by IDAA v.5
• Test
• Elapsed time: -96% CPU GP: -99%
• Production
• Elapsed time: -95% CPU GP: -88%
Using IDAA for OLTP – InfoSphere CDC for z/OS
Incremental Update
• Syncs DB2 and Accelerator data in near real-time
• Scope: Row
• Based on the Change Data Capture (CDC) component of IBM InfoSphere Data Replication
• INSERT/UPDATE/DELETE statements captured from DB2 log data and replicated
to the Accelerator
- Default apply interval approx 10 seconds
- UPDATES are decomposed into DELETEs and INSERTs
• Tables enabled for incremental update require either enforced uniqueness (primary key, unique
index) or defined informational constraint (via ACCEL_ADD_TABLES stored procedure)
- Required for DELETEs
• Continuous replication
- Base table not locked while table initially loaded to the Accelerator
- Replication not stopped if replication subscription is changed (tables added, removed, loaded,
reloaded)
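The decomposition of UPDATEs and the uniqueness requirement can be illustrated with a toy apply loop. This is a conceptual sketch only, not the InfoSphere CDC implementation: the change-record tuple layout and the dictionary used as the target table are assumptions. Keying the target by a unique value shows why uniqueness is needed: without it, the DELETE half of an UPDATE could not identify which row to remove.

```python
def decompose(change):
    """Turn one captured change into the operations applied on the target.

    `change` is a hypothetical (operation, key, row) tuple. An UPDATE is
    split into a DELETE of the old row followed by an INSERT of the new one.
    """
    op, key, row = change
    if op == "UPDATE":
        return [("DELETE", key, None), ("INSERT", key, row)]
    return [change]


def apply_changes(target: dict, changes) -> dict:
    """Apply captured changes to a target table modelled as {unique_key: row}."""
    for change in changes:
        for op, key, row in decompose(change):
            if op == "DELETE":
                # The unique key is what lets the apply step locate the row.
                target.pop(key, None)
            elif op == "INSERT":
                target[key] = row
            else:
                raise ValueError(f"unsupported operation: {op}")
    return target


if __name__ == "__main__":
    table = {1: ("account-A", 100)}
    captured = [
        ("INSERT", 2, ("account-B", 50)),
        ("UPDATE", 1, ("account-A", 120)),
        ("DELETE", 2, None),
    ]
    print(apply_changes(table, captured))
```

In the real product the changes are read from the DB2 log and applied in batches at the apply interval mentioned above; the sketch ignores batching, ordering guarantees and error handling.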
Using IDAA for OLTP – InfoSphere CDC for z/OS
Source system CPU resources are required for these processes:
• Capturing, decoding, and staging the changed data stored in the database log files. The product must process all records from the DB2 log for Units of Recovery and records for tables having DATA CAPTURE CHANGES configured. The DB2 IFI interface must filter the entire database log at all times, even if the majority of the log contains out-of-scope data.
• Transmitting the captured change data to the target system using TCP/IP.
• Querying the database directly for %GETCOL or %SELECT functions.
Target system CPU resources are required for these processes:
• Receiving captured change data from the source system.
• Converting the captured change data to database operations.
• Committing the database operations to the target database.
• Execution of SQL operations by the target database during the apply process on the target system.
• Multiple indexes on tables in the target database often require additional CPU resources (n.a. on IDAA)
• Code page conversion.
Rule of thumb:
- 2 CPU seconds on z/OS for every 100 MB read from the DB2 log
- 5 CPU seconds on z/OS for every 100 MB sent to IDAA
Actual CPU usage in Production: 100/170 mips per hour
Please note: the cost of CDC log reading is one-time, i.e. it does not depend on the number of exploiters (but additional CPU is used for each exploiter sending data to the accelerator)
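The rule of thumb can be turned into a rough estimator. The sketch below is an illustration under stated assumptions: the function and parameter names are made up, and it models the note above by charging the log-read cost once while adding a send cost for each exploiter. The volumes in the example are invented, not measurements from the presentation.

```python
# Rule-of-thumb coefficients from the slide (CPU seconds on z/OS per 100 MB).
LOG_READ_CPU_SEC_PER_100MB = 2.0
SEND_CPU_SEC_PER_100MB = 5.0


def estimate_cdc_cpu_seconds(log_mb_read: float, mb_sent_per_exploiter) -> float:
    """Estimate z/OS CPU seconds for CDC.

    log_mb_read: MB read from the DB2 log (charged once, whatever the
        number of exploiters).
    mb_sent_per_exploiter: iterable with the MB sent to IDAA by each
        exploiter (charged per exploiter).
    """
    log_cost = log_mb_read / 100.0 * LOG_READ_CPU_SEC_PER_100MB
    send_cost = sum(mb / 100.0 * SEND_CPU_SEC_PER_100MB
                    for mb in mb_sent_per_exploiter)
    return log_cost + send_cost


if __name__ == "__main__":
    # Hypothetical volumes: 10,000 MB of log read, two exploiters
    # sending 1,000 MB and 500 MB respectively.
    print(estimate_cdc_cpu_seconds(10_000, [1_000, 500]))
```

Such an estimate is only a first approximation to compare against measured CPU usage; the actual cost depends on how much of the log is in scope and on the change rate of the replicated tables.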
Using IDAA for OLTP – Overall Report Class results
Query DDF APPMDNMC - Credit Monitoring: performance optimization of some Java calls
Using CDC
• From a z15 perspective:
-700 mips GP
-900 mips zIIP
• Avg. response time:
before ≈ 20 seconds
after ≈ 4 seconds
Using IDAA for OLTP – Java call detail
Query DDF APPMDNMC - Credit Monitoring: new complex and strategic Java call optimization
Using CDC
• From a z15 perspective:
-130 mips GP
-200 mips zIIP
• Avg. response time:
before ≈ 70 seconds
after ≈ 3 seconds
Business Areas Achievements
Goal: performance improvement, i.e. reduction of MIPS usage on z15 and DRAMATIC reduction of elapsed times of some Batch and OLTP processes
Business areas:
- Direct Debit (Batch)
- Wire Transfers (Batch)
- Elise - Installment Loans Management (Batch)
- EF - Centralized Contracts Management (Weekly Batch)
- MK - Marketing (Batch)
- FZ - Funds (Batch)
- GA - Credit Performance Management (Batch)
- D3 - Pre-authorizations (Batch)
- YU - Risk Matrix (Monthly Batch)
- MD - Credit Monitoring (DDF OLTP)
- MH - Major Incident (DDF OLTP)
Achievements:
- Batch daily: -41.000 seconds ≈ -1.500 mips/hour in batch window; elapsed time decreased by 90%
- Batch weekly: -7.000 seconds per exec
- Batch monthly: 55.000 seconds per exec
- OLTP Prime Time: -1.100 mips/hour; elapsed time decreased by 60% - 90%
- Total batch elapsed time from 30 to 1,5 hours
Takeaways - from z15 to IDAA:
- 1.500 mips/hour in daily batch window
- 1.100 mips/hour in OLTP Prime Time