
EPV Technologies

Is my z15 performing as expected? 1

Is my IBM z15 performing as expected?

Part 1

Fabio Massimo Ottaviani – EPV Technologies

March 2020

1 Introduction

Every time a new IBM machine is announced, the LSPR benchmarks are published. They provide an indication of the performance of the new machine compared to the existing ones.

Based on these numbers, on available tools such as IBM zPCR, and on their own capacity planning methodology, customers decide which model of the new machine best fits their application needs for the coming years.

Once the upgrade has been completed, some customers are happy, some are not satisfied, and others are simply unable to tell whether they got the expected performance benefits.

We regularly receive requests from customers to help them evaluate the new machine’s performance. This has also happened with upgrades to z15.

Their question is always the same: “Is my new machine performing as expected?”

In this paper we provide suggestions to help you answer this question. None of these suggestions is specific to a z15 upgrade; they apply to any machine upgrade.

In the final part we also discuss a real case of migrating from z13 to z15.

2 Choosing the right days to compare

This is probably the most important step.

These are the main rules to follow.

Rule 1

The workload of many companies is lighter on weekends and holidays. If this is true for you, these days should be excluded from the comparison.

Rule 2

You should compare days when all the LPARs were on the old machine with days when all the LPARs are on the new machine.

LPAR migration is normally done in steps, so for a certain number of days your LPARs are split across both machines. In such a situation, performance is normally better because there is less competition among the LPARs.

The example in Figure 1 shows the average CPI (cycles per instruction) in the peak hours of a production system running on an IBM z13 while the other LPARs were being migrated to a new IBM z15.


Figure 1

The migration started on 26th January 2020 and was completed on 22nd February. The PRD1 system was the last LPAR migrated.

Note that the CPI decreased (performance increased) continuously due to the reduced LPAR contention on the IBM z13. In the last days of the graph, PRD1 was the only LPAR on the machine.

Rule 3

Many companies have workload peaks at the end or at the beginning of the month and a quiet period in the middle.

Figure 2


In the example in Figure 2, note the end-of-month peaks (in red) which, in this case, are accentuated by some holidays (Easter and 1st of May).

You should never compare a peak with a quiet period.

Comparing peaks with peaks is normally the best solution because performance matters most during peak times.
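Rules 1 and 2 lend themselves to a simple filter. The following Python sketch is only an illustration: the function name `comparable_days` and its parameters are hypothetical, the holiday set is left for you to fill in, and Rule 3 (peaks versus quiet periods) still requires your own knowledge of the workload calendar.

```python
from datetime import date, timedelta


def comparable_days(first, last, holidays, migration_start, migration_end):
    """Return the days in [first, last] that pass Rule 1 and Rule 2."""
    selected = []
    current = first
    while current <= last:
        is_weekend = current.weekday() >= 5  # Saturday is 5, Sunday is 6
        is_holiday = current in holidays
        # Rule 2: skip days when the LPARs were split across both machines
        in_migration = migration_start <= current <= migration_end
        if not (is_weekend or is_holiday or in_migration):
            selected.append(current)
        current += timedelta(days=1)
    return selected


# Migration window from the Figure 1 example; the empty holiday set is an assumption.
days = comparable_days(
    date(2020, 1, 13), date(2020, 2, 28), set(),
    migration_start=date(2020, 1, 26), migration_end=date(2020, 2, 22),
)
print(len(days))
```

The resulting list can then be split into "old machine" and "new machine" days before comparing any of the metrics discussed in the next chapters.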

3 LPAR CP consumptions

What all customers want to know is whether the new machine provides the expected and planned processing power for both CPU and zIIP engines.

Depending on the workload characteristics and the adopted software licensing policies, you can decide whether to consider all the hours of the day or only some time shifts in your comparisons.

Please note that all the reports in this chapter are provided as examples of what you could use to check the effects of a new machine. They don’t refer to a real upgrade case.

The first check we suggest is comparing the total CEC consumption, including all the LPARs, before and after the upgrade.

In Figures 3 and 4 you will find an example of a daily trend report [1] showing the CPU consumed, in MIPS, by LPAR and globally for the CEC (in USED). [2]

A report like this makes it very easy to compare different time periods.

Figure 3

A second important check should be done on the production LPARs.

You should focus on them for the following reasons:

- they are more important for the company business,
- their workload is normally much more stable than that of development and test LPARs,
- they normally account for most of the CP usage.

[1] Only part of the report is presented. By default EPV shows the last 60 days.
[2] EPV for z/OS provides this report in Resources Daily Trends and allows you to select only the desired time shift.

CEC PARTITIONS

DATE        DAY  PHYSICAL  LPPRD1  LPPRD2  LPPRD3  LPPRD4  LPPRD5  LPTST1  LPTST2  LPTST3  LPTST4  LPTST5    USED    FREE
03/02/2020  Mon       179  11.569   2.435   1.052     813     805   3.736     489     813     275     257  22.423  12.012
04/02/2020  Tue       177  11.443   3.036     981     998     687   4.447     380     998     297     316  23.758  10.677
05/02/2020  Wed       174  10.792   2.859     983   1.018     763   4.151     344   1.018     280     284  22.666  11.769
06/02/2020  Thu       178  10.435   2.480     959     763     747   3.955     297     763     259     277  21.112  13.323
07/02/2020  Fri       170  10.772   2.849   1.418   1.028     656   4.597     336   1.028     288     305  23.446  10.989
10/02/2020  Mon       162  11.407   2.783     805     742     698   3.671     462     742     269     248  21.987  12.448
11/02/2020  Tue       170  11.167   3.474     894   1.110     647   3.783     411   1.110     290     281  23.336  11.099
12/02/2020  Wed       180  10.985   3.090     824   1.026     699   3.440     297   1.026     284     266  22.116  12.319
13/02/2020  Thu       181  10.414   2.718     856   1.057     510   4.362     325   1.057     298     270  22.047  12.388
14/02/2020  Fri       183  10.927   2.801     839     895     624   4.622     256     895     295     279  22.613  11.822


Figure 4

When planning the capacity of a new machine, it’s very important to minimize the probability of zIIP-eligible work overflowing to CPU.

A third check should therefore evaluate the variations in CPU usage due to such overflows.

Figure 5

4 Checking the overhead

After an upgrade it is always good practice to check the PR/SM and system overhead.

Please note that all the reports in this chapter are provided as examples of what you could use to check the effects of a new machine. They don’t refer to a real upgrade case.

In Figure 6 you will find an example of a daily trend report [3] showing the CPU consumed, in MIPS, by PR/SM (in Physical) and by each LPAR when talking to PR/SM.

If the number of LPARs and their configuration have not substantially changed, you should not find big variations in the measured overhead. [4]

[3] Only part of the report is presented. By default EPV shows the last 60 days.
[4] This is only part of the PR/SM overhead. The biggest part is included in application CPU consumption and depends on the contention on the processor cache.


SYSTEM IIPCP UTILIZATION - LPPRD1 - MIPS

SYSTEM  DATE        DAY     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23
LPPRD1  03/02/2020  Mon  46,0   1,6   5,7   2,3   0,5   0,4   1,4   4,0  14,6  21,6  20,3  21,7  20,1  10,8  10,8  16,9  14,8   8,0   4,8   5,4   2,1   2,4   2,0   2,4
LPPRD1  04/02/2020  Tue   2,7   1,7  13,1   4,9   0,4   0,7   2,7   3,7  13,2  19,4  18,0  18,1  17,4   9,4   9,3  15,8  13,7   6,8   4,5   3,1   2,1   1,8   1,2   1,6
LPPRD1  05/02/2020  Wed   1,4   4,8  10,4   2,1   1,4   1,5   2,6   4,6  13,3  19,2  19,1  18,0  16,6  10,4  10,2  13,7  13,4   7,5   5,7   4,2   2,3   2,0   2,4   3,5
LPPRD1  06/02/2020  Thu   2,1   2,0   9,6   1,5   1,3   1,6   3,8   3,5  12,4  17,6  17,2  17,6  16,2   9,9   8,8  14,5  12,6   7,4   6,7   5,0   1,6   1,6   2,1   3,2
LPPRD1  07/02/2020  Fri   2,9   1,6   8,3   1,8   1,2   0,9   2,4   3,3  12,7  18,0  17,8  18,2  16,7   9,8   9,8  14,7  13,1   6,5   4,3   3,6   1,7   1,7   2,4   3,9
LPPRD1  10/02/2020  Mon  66,6   2,4   5,0   2,1   0,3   0,3   1,7   3,0  13,4  14,4  17,7  20,1  18,3  10,5  11,0  15,6  13,5   8,4   4,6   5,0   1,9   1,9   2,0   2,9
LPPRD1  11/02/2020  Tue   2,3   1,6   7,7   9,6   0,7   0,5   2,7   3,8  12,8  18,6  17,8  17,8  16,6  10,1   9,8  14,4  13,1   6,9   4,5   7,3   2,0   2,0   3,2   7,0
LPPRD1  12/02/2020  Wed   2,1   1,4   8,9   0,9   0,2   0,6   2,6   3,1  12,6  16,9  16,2  16,5  15,6   9,2   8,6  13,8  12,8   6,5   4,3   6,9   2,1   2,5   2,8   5,3
LPPRD1  13/02/2020  Thu   2,4   3,3   5,5   1,2   0,2   0,6   2,2   2,8  11,7  16,6  16,4  17,6  16,2   9,5   9,3  13,9  12,4   6,5   3,9   5,3   1,8   1,8   2,5   6,2
LPPRD1  14/02/2020  Fri   2,2   7,2   2,0   0,9   0,2   0,6   2,6   3,0  12,7  17,1  17,4  17,6  16,5   9,3   8,8  14,3  11,5   5,8   3,5   5,2   1,8   1,4   4,4   4,7


Figure 6

By looking at each system you can also verify the internal system overhead.

Unfortunately, no direct measurement is available, but you can evaluate it by looking at the system capture ratio daily trends.

The system capture ratio is the ratio between the total CPU attributed to all the workloads and the total CPU used, measured at the system level. It is an inverse measure of the system overhead: the lower the value, the higher the overhead. [5]
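The definition above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample CPU seconds are hypothetical, and the 85% to 95% band is the Rule Of Thumb quoted in this paper's footnote.

```python
def capture_ratio(workload_cpu_seconds, system_cpu_seconds):
    """Capture ratio (%): CPU attributed to workloads over total CPU used by the system."""
    if system_cpu_seconds <= 0:
        raise ValueError("system_cpu_seconds must be positive")
    return sum(workload_cpu_seconds) / system_cpu_seconds * 100


def within_rule_of_thumb(ratio_percent, low=85.0, high=95.0):
    """True when the capture ratio falls inside the accepted band."""
    return low <= ratio_percent <= high


# Hypothetical values: CPU seconds per workload and total CPU seconds for one hour.
ratio = capture_ratio([1200.0, 450.0, 150.0], 2000.0)
print(round(ratio, 1), within_rule_of_thumb(ratio))
```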

Figure 7

5 Verifying the performance of the most important applications

Another important verification is the impact of the new machine on the throughput and performance of the most relevant applications.

Of course, you should focus on the more stable applications and choose an appropriate time shift.

Please note that all the reports in this chapter are provided as examples of what you could use to check the effects of a new machine. They don’t refer to a real upgrade case.

The following figure shows an example of a CICS application which is a good candidate for the comparison. Note that the CPU consumption per transaction (AVG CPU) is stable.

The reported data refers to the prime shift (from 8 am to 12 noon) on working days only.

[5] A widely accepted ROT (Rule Of Thumb) is to consider values between 85% and 95% acceptable.

PARTITIONS

DATE        DAY  PHYSICAL  LPPRD1  LPPRD2  LPPRD3  LPPRD4  LPPRD5  LPDTST1  LPDTST2  LPDTST3  LPDTST4  LPDTST5  TOTAL
03/02/2020  Mon     178,6    41,4    31,2     6,5     8,6     4,9     14,9      2,0      9,8      2,1      2,3  302,3
04/02/2020  Tue     176,7    40,8    32,8     6,2     9,6     4,7     11,8      2,3     10,3      2,3      2,4  299,9
05/02/2020  Wed     174,2    36,0    32,7     6,2     9,3     4,5     12,7      2,2     11,6      2,3      2,4  294,1
06/02/2020  Thu     178,4    35,9    31,5     5,8     9,4     4,5     13,8      2,2      9,9      2,2      2,3  295,9
07/02/2020  Fri     169,8    35,7    32,9     5,4     9,6     4,9      9,6      2,2     10,6      2,3      2,4  285,4
10/02/2020  Mon     162,0    35,0    31,3     7,3     8,2     4,7      9,1      2,0      9,5      2,2      2,3  273,6
11/02/2020  Tue     169,8    42,6    31,6     4,9     9,5     4,4     14,1      2,1      9,8      2,2      2,4  293,4
12/02/2020  Wed     180,0    48,5    32,6     4,8     8,8     4,0     14,4      2,1      9,3      2,1      2,4  309,0
13/02/2020  Thu     181,4    49,2    32,3     4,8    10,4     3,9     10,7      2,0      9,4      2,1      2,4  308,6
14/02/2020  Fri     182,8    51,0    33,4     4,6     9,3     4,4      9,0      2,1      9,5      2,2      2,4  310,7

SYSTEM CPU CAPTURE RATIO - LPPRD1

SYSTEM  DATE        DAY   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23
LPPRD1  03/02/2020  Mon  89  85  93  96  89  94  91  93  96  97  97  97  97  96  95  96  97  95  94  93  96  95  96  96
LPPRD1  04/02/2020  Tue  95  96  96  96  97  97  96  94  95  97  96  96  96  95  96  96  96  95  94  93  96  95  96  96
LPPRD1  05/02/2020  Wed  95  96  96  96  96  96  94  92  97  97  96  97  97  95  96  96  96  94  95  93  96  95  95  96
LPPRD1  06/02/2020  Thu  96  96  96  96  96  95  95  93  96  97  97  97  96  95  96  97  96  94  93  93  96  96  96  96
LPPRD1  07/02/2020  Fri  95  96  96  96  95  94  92  90  96  96  97  97  97  95  96  96  96  95  95  93  96  96  95  96
LPPRD1  10/02/2020  Mon  86  90  94  95  91  90  91  92  96  97  97  97  97  95  96  97  96  95  95  93  96  95  96  96
LPPRD1  11/02/2020  Tue  96  96  96  96  95  94  93  93  96  97  97  97  97  95  96  96  96  94  95  94  95  95  96  96
LPPRD1  12/02/2020  Wed  96  96  96  94  94  92  91  91  96  96  96  97  97  96  95  96  96  94  94  93  96  96  96  94
LPPRD1  13/02/2020  Thu  96  96  96  95  96  93  91  91  95  96  96  96  96  95  96  96  96  93  92  92  96  96  96  94
LPPRD1  14/02/2020  Fri  96  96  96  93  94  91  91  90  95  96  96  96  96  95  95  96  95  93  92  92  96  96  96  94


Figure 8

An important issue to consider when comparing CPU usage is CPU speed normalization.

The amount of work which can be processed in one CPU second depends on the CPU speed. When you upgrade to a new machine, the CPU seconds used by your application are normally reduced, giving the illusion of a reduction in CPU consumption.

To have a meaningful comparison you need to normalize the CPU seconds.

If you want to normalize the CPU seconds used on the new machine to the old machine CPU seconds, you can use the following formula:

Norm CPU sec = new CPU sec * new SU rate / old SU rate

Note that the SU rate in Figure 8 is always the same because the machine has not changed in the reported time frame. [6]
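As an illustration, the normalization formula can be written as a small Python function. The function name and all the numeric values below are hypothetical; real SU rates must be taken from your own measurements.

```python
def normalize_cpu_seconds(new_cpu_sec, new_su_rate, old_su_rate):
    """Express CPU seconds measured on the new machine as old-machine CPU seconds."""
    if old_su_rate <= 0:
        raise ValueError("old_su_rate must be positive")
    return new_cpu_sec * new_su_rate / old_su_rate


# Hypothetical example: 0.012 CPU seconds per transaction on a machine
# whose SU rate is 25% higher than the old one.
norm = normalize_cpu_seconds(0.012, new_su_rate=100_000, old_su_rate=80_000)
print(round(norm, 4))
```

With these made-up values, a transaction that looks cheaper on the new machine (0.012 instead of 0.015 CPU seconds) turns out to consume the same amount of normalized CPU.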

6 Using MF counters

Some useful indexes can be calculated from the MF counters provided in SMF 113.

In this chapter, for educational purposes, we also show some examples displaying the indexes measured on one day in the prime shift hours.

However, it’s very important that your analysis is not based on a small number of hours; it should consider values measured during the workload peaks and over a long period of time.

[6] The SU rate also depends on the number of logical processors assigned to the LPAR.

DATE        SU rate  AVG CPU      TRX  TOT CPU
03/02/2020   86.486    0,015  629.592    9.610
04/02/2020   86.486    0,015  521.838    7.610
05/02/2020   86.486    0,015  496.704    7.407
06/02/2020   86.486    0,015  462.122    6.769
07/02/2020   86.486    0,015  460.828    6.968
10/02/2020   86.486    0,015  569.858    8.604
11/02/2020   86.486    0,015  518.778    7.823
12/02/2020   86.486    0,015  482.355    7.277
13/02/2020   86.486    0,014  467.326    6.710
14/02/2020   86.486    0,014  465.820    6.745
17/02/2020   86.486    0,015  614.852    9.507
18/02/2020   86.486    0,015  491.496    7.382
19/02/2020   86.486    0,014  463.659    6.454
20/02/2020   86.486    0,014  463.814    6.550
21/02/2020   86.486    0,014  397.824    5.689
24/02/2020   86.486    0,014  495.452    7.113
25/02/2020   86.486    0,014  462.977    6.577
26/02/2020   86.486    0,014  476.467    6.842
27/02/2020   86.486    0,015  518.207    7.552
28/02/2020   86.486    0,015  589.269    8.890


6.1 CPI

The CPI index represents the average number of cycles needed per instruction. It can be calculated by using basic counters and the following simple formula (valid for all the IBM models): [7]

CPI = B0 / B1

As you can imagine, there is no Rule of Thumb for the ideal CPI value. However, it’s intuitive that, to exploit the processor power, the CPI value should be as low as possible.

You can use CPI to evaluate the performance benefits when moving to a new machine generation, but you need to normalize the CPI values to the processor speed to make a meaningful comparison.

normalized new machine CPI = (old machine cycle / new machine cycle) * new machine CPI
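A minimal Python sketch of the two formulas follows. It interprets "machine cycle" as the processor clock frequency, which is our reading of "normalize to the processor speed"; the function names and counter values are hypothetical, and the 5.0 GHz (z13) and 5.2 GHz (z15) frequencies are assumed here and should be checked against your machine documentation.

```python
def cpi(b0_cycles, b1_instructions):
    """CPI = B0 / B1 (basic counters: cycles over instructions)."""
    return b0_cycles / b1_instructions


def normalized_cpi(new_cpi, old_clock_ghz, new_clock_ghz):
    """Express the new machine CPI in cycles of the old machine clock."""
    return old_clock_ghz / new_clock_ghz * new_cpi


# Hypothetical counter values for one interval on the new machine.
new_machine_cpi = cpi(b0_cycles=2.6e12, b1_instructions=1.0e12)

# Assumed clock frequencies: 5.0 GHz for z13, 5.2 GHz for z15.
print(round(normalized_cpi(new_machine_cpi, 5.0, 5.2), 2))
```

The normalized value can then be compared directly with the CPI measured on the old machine.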

Normally, you expect a lower CPI when moving to a new machine.

If this is not the case, you can go into more detail by splitting the CPI into:

- finite_CPI: cycles needed because the L1 cache is not infinite; it indicates which portion of CPI is due to data and instructions coming from L2 and shared caches (Nest);
- instruction_complexity_CPI: cycles needed even with an infinite L1 cache; it indicates which portion of CPI is due to the effectiveness of the microprocessor design with your workload.

They can be estimated for z15 by using the following simple formulas: [8]

finite_CPI = E143 / B1

instruction_complexity_CPI = CPI - finite_CPI
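The split can be sketched in Python as follows. The function name and the counter values are hypothetical, and the E143 formula is the z15 estimate given above, so it should not be reused on other models without checking.

```python
def split_cpi(b0_cycles, b1_instructions, e143):
    """Return (CPI, finite_CPI, instruction_complexity_CPI) using the z15 formulas."""
    total = b0_cycles / b1_instructions
    finite = e143 / b1_instructions
    return total, finite, total - finite


# Hypothetical counters chosen so that the two components are evenly split.
total, finite, complexity = split_cpi(2.6e12, 1.0e12, 1.3e12)
print(round(total, 2), round(finite, 2), round(complexity, 2))
```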

In Figure 9 you can see the hourly profile of the CPI values in the prime shift of a production system on the 2nd of March.

Note that the values are between 2,3 and 2,7 and evenly split between CPI-F (finite CPI) and CPI-I (instruction complexity CPI).

[7] Bx counters are Basic Counters.
[8] Exx counters are Extended Counters.


Figure 9

The graph in Figure 10 shows the average CPI calculated across the prime shift hours during the March working days. You can see that the values are generally between 2,5 and 3,0.

Figure 10

6.2 %L1M and RNI

Workload capacity performance is quite sensitive to how deep into the memory hierarchy the processor must go to retrieve the workload’s instructions and data to be executed. The higher the %L1M (% Level 1 Miss) and RNI (Relative Nest Intensity), the worse the workload capacity performance will be.

The %L1M index represents the percentage of data and instructions which were not found in the Level 1 cache. It can be calculated by using basic counters and the following simple formula (valid for all the IBM models):

%L1M = ((B2 + B4) / B1) * 100
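The same formula as a small Python function; the function name and the counter values are hypothetical and serve only to show the arithmetic.

```python
def l1_miss_percent(b2, b4, b1_instructions):
    """%L1M = ((B2 + B4) / B1) * 100, from the basic counters."""
    return (b2 + b4) / b1_instructions * 100


# Hypothetical counter values for one interval.
print(round(l1_miss_percent(1.2e10, 1.6e10, 1.0e12), 2))
```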

The RNI index represents how deep into the memory hierarchy the processor must go to retrieve the instructions and data when a Level 1 cache miss occurs.

The RNI calculation is much more complex and depends on the machine model. [9]

By using the %L1M and RNI values, together with the rules in the next figure, you can understand which benchmark best represents the workload running in each system.

[9] See the “z15 Capacity Planning” white paper.


Figure 11

In practical terms, the machine will look less powerful on a workload represented by a HIGH RNI benchmark than on a workload represented by an AVG or LOW RNI benchmark.

If your workload has not changed, you shouldn’t expect the representative benchmark to change.

In Figure 12 you can see the hourly profile of the %L1M and RNI values in the prime shift of a production system on the 2nd of March. Based on those numbers, the benchmark that best represents the system workload is AVG RNI (%L1M < 3 and RNI > 0,75).

Figure 12

The graphs in Figures 13 and 14 show the average %L1M and RNI calculated across the prime shift hours during the March working days. Note that the situation is not as clear as it appeared when looking at just a few hours (see Figure 12).

On many days %L1M is close to 3 and, on one day, it exceeds that value. The RNI values are always higher than 0,75 and, on some days, higher than 1.

We can still assume AVG RNI as the benchmark that best represents the system workload, but we are very close to the classification limits.

%L1 Miss    RNI             Benchmark
< 3%        >= 0,75         AVG RNI
< 3%        < 0,75          LOW RNI
3% to 6%    > 1,00          HIGH RNI
3% to 6%    0,60 to 1,00    AVG RNI
3% to 6%    < 0,60          LOW RNI
> 6%        >= 0,75         HIGH RNI
> 6%        < 0,75          AVG RNI
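The classification rules in the table can be turned into a small Python function. This is only a sketch: the function name is hypothetical, and assigning the exact boundary values (3% and 6%) to the middle band is our assumption, since the table does not state it.

```python
def representative_benchmark(l1m_percent, rni):
    """Map %L1M and RNI to the LSPR benchmark category (LOW, AVG or HIGH RNI)."""
    if l1m_percent < 3.0:
        return "AVG RNI" if rni >= 0.75 else "LOW RNI"
    if l1m_percent <= 6.0:  # assumption: 3% and 6% belong to the middle band
        if rni > 1.0:
            return "HIGH RNI"
        return "AVG RNI" if rni >= 0.60 else "LOW RNI"
    return "HIGH RNI" if rni >= 0.75 else "AVG RNI"


# Hypothetical values close to the classification limits discussed in the text.
print(representative_benchmark(2.9, 0.9))
print(representative_benchmark(3.1, 1.05))
```

Running the function on values close to the limits shows how a small change in %L1M or RNI can move a workload from one benchmark category to another.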


Figure 13

Figure 14

Generally, if the representative benchmark becomes worse (from LOW to AVG or from AVG to HIGH) when migrating to a new machine, the machine performance will look degraded; you should investigate it.

The most likely reasons are:

a) a change in your workload, especially if it is already borderline as in the example in Figures 13 and 14,

b) some workload characteristics do not suit the technology of the new machine.


A possible reason for b) is discussed in the next chapter.

6.3 SIIS

The processor architecture of modern IBM machines expects to execute code which respects the following rules:

- data and instructions are separated; they should not be in the same cache line,
- storage references are localized,
- there is no self-modifying code.

When these rules are violated, a SIIS (Store Into the Instruction Stream) event happens and instructions have to be re-fetched from the Level 3 cache [10], with consequent performance degradation.

Modern compilers have been written with the processor architecture in mind, so the SIIS issue normally arises with old assembler programs written using poor programming practices.

IBM recently provided the formula to estimate the SIIS impact on z15 machines: [11]

%SIIS = E164 / B2 * 100

IBM also provided indications on the suggested actions depending on the %SIIS levels:

%SIIS              IMPACT       ACTIONS NEEDED
< 2%               negligible   none
>= 2% and < 5%     Low          low priority actions to get some MSU savings
>= 5% and < 10%    Medium       medium priority actions to get noteworthy MSU savings
>= 10%             High         high priority actions to get considerable MSU savings

Figure 15
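The %SIIS formula and the thresholds in Figure 15 can be combined in a short Python sketch. The function names and counter values are hypothetical; the E164 formula is the one quoted above for z15.

```python
def siis_percent(e164, b2):
    """%SIIS = E164 / B2 * 100 (formula quoted for z15)."""
    return e164 / b2 * 100


def siis_impact(percent):
    """Map %SIIS to the impact levels listed in Figure 15."""
    if percent < 2:
        return "negligible"
    if percent < 5:
        return "low"
    if percent < 10:
        return "medium"
    return "high"


# Hypothetical counter values for one interval.
pct = siis_percent(e164=3.0e8, b2=1.2e10)
print(round(pct, 2), siis_impact(pct))
```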

In Figure 16 you can see an example of the hourly profile of the %SIIS values in the prime shift on a production system hosting only applications developed in the last 10 years. You can see that the %SIIS impact can be considered negligible.

[10] In z13, z14 and z15.
[11] The same formula is also valid for z14.


Figure 16

7 From z13 to z15

One of our customers migrated from an IBM 2964-716 (z13) to an IBM 8561-714 (z15).

In this chapter we describe only a small part of the analysis performed.

We essentially focus on understanding whether the z15 performs as expected in terms of CPU consumption.

7.1 Time periods under analysis

We chose the days to compare based on the following criteria:

a) excluding weekends and holidays;

b) comparing days when all the LPARs were on the old machine with days when all the LPARs are on the new machine;

c) selecting days in the middle of the month.

The days we chose were:

- from 13th to 24th January 2020 for z13;
- from 9th to 20th March 2020 for z15.

All the hours in the prime shift (from 8 to 17) have been considered.

7.2 Overall CPU utilization

In the following graph we compare the average daily CPU utilization in the two time periods.


Figure 17

Note that the MIPS used values (blue bars) in the first (z13) and last (z15) weeks are similar. In the other weeks, values are slightly higher. In any case, we can say that CPU utilization with z15 is not higher than with z13.

The red lines represent the average percentage of utilization of the whole machine in the two time periods:

- 60,0% of the z13;
- 49,4% of the z15.

The IBM 2964-716 (z13) is rated at about 19665 MIPS while the IBM 8561-714 (z15) is rated at about 22374 MIPS. These ratings refer to an Average RNI workload.

The expected capacity increase is about 13,8%.

If we normalize the z13 utilization to the z15 capacity, the two values to compare are:

- 52,7%, which is the expected utilization with z15 based on the LSPR benchmarks;
- 49,4%, which is the measured z15 utilization.

It seems that the z15 is performing slightly better than expected. The improvement is about 3 percentage points of utilization.
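The arithmetic behind these numbers can be reproduced with a few lines of Python. The function name is hypothetical; the MIPS ratings and utilization values are the ones given in the text.

```python
def expected_utilization(old_util_percent, old_mips, new_mips):
    """Scale the old machine utilization to the capacity of the new machine."""
    return old_util_percent * old_mips / new_mips


Z13_MIPS = 19665   # IBM 2964-716, Average RNI rating from the text
Z15_MIPS = 22374   # IBM 8561-714, Average RNI rating from the text

capacity_increase = (Z15_MIPS / Z13_MIPS - 1) * 100
expected = expected_utilization(60.0, Z13_MIPS, Z15_MIPS)
measured = 49.4

print(round(capacity_increase, 1))    # expected capacity increase, in %
print(round(expected, 1))             # expected z15 utilization, in %
print(round(expected - measured, 1))  # difference, in percentage points
```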

This can be due to:

a) different workloads running in the analysed time periods;

b) reduced CPU contention because of a lower percentage of utilization.

Regarding b), it is interesting to note that, based on IBM studies and presentations, application CPU consumption increases as the percentage of utilization of the CEC (CEC busy) grows. This is mostly due to processor cache contention: the time spent waiting for data and instructions to be loaded into the Level 1 cache is charged to applications as CPU time.


The increase in application CPU consumption can be estimated at 3% to 5% for every 10% increase of CEC busy, depending on the workload RNI characteristics (LOW, AVG, HIGH).

In our case the CEC busy decreased by about 10% with z15, so this could explain a further 3% reduction.
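A rough way to apply this estimate is sketched below in Python. The function name is hypothetical, and scaling the 3% to 5% range linearly with the change in CEC busy is our simplifying assumption, not a rule stated by IBM.

```python
def cpu_change_range(cec_busy_delta_points, low_per_10=3.0, high_per_10=5.0):
    """Estimated change (%) in application CPU for a change in CEC busy.

    Assumes the 3% to 5% per 10 points estimate scales linearly.
    Returns a (min, max) tuple; negative values mean a reduction.
    """
    steps = cec_busy_delta_points / 10.0
    bounds = (steps * low_per_10, steps * high_per_10)
    return min(bounds), max(bounds)


# CEC busy went from 60,0% to 49,4%, i.e. roughly 10 points lower.
print(cpu_change_range(-10.0))
```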

7.3 Production systems CPU

We analysed the following production systems:

- PRD1
- PRD2
- PROD

Figure 18


Figure 19

Figure 20

As you can see, consumption with the z15 is similar or slightly lower for the PRD1 and PRD2 systems.

It appears more erratic for the PROD system. This is due to some very intensive batch activity related to company activities which normally start at the end of March but, unfortunately for our analysis, were brought forward this year.


7.4 Processor cache effectiveness

By exploiting the MF counters provided in SMF 113 you can get an insight into the processor cache effectiveness.

To evaluate the performance benefits of z15 versus z13, the most interesting index is the CPI (Cycles Per Instruction). The lower the CPI, the better the performance.

The z15 CPI values are normalized to the z13 clock speed.

Figure 21

Figure 22


Figure 23

Note that the processor cache effectiveness increased consistently for all the production LPARs in z15.

For PROD, this might seem to contradict the CPU utilization reported in Figure 20. There is no contradiction: the number of cycles needed to perform an instruction is lower in z15, but the number of instructions to execute is higher because of the increase in batch workload mentioned above.

7.5 Application CPU and throughput

We analysed the IMS transactions of the most important application in the PRD1 system.

The next graphs show a comparison of the average CPU time per transaction and the total number of transactions executed on z13 and on z15.

You can see a good performance improvement due to the higher speed of z15.


Figure 24

To understand the difference in terms of CPU consumption, the average CPU time for z15 needs to be normalized to the z13 CPU speed. [12]

Figure 25

Note that, even after the normalization, the average CPU consumption per transaction shows a consistent decrease.

[12] CPU has been normalized by using the service unit rate.


The next graph shows that the transaction throughput is comparable in the weeks analysed.

Figure 26

8 Summary

In this paper we discussed techniques and metrics which can be useful to evaluate whether the upgrade to a new machine, whatever the old and new machine models are, provided the expected benefits.

We also briefly discussed a customer experience where an upgrade from z13 to z15 was performed.