So You Want To Write Your Own Benchmark

Dror Bereznitsky, December 18th, 2008

DESCRIPTION

Performance has always been a major concern in software development and should not be taken lightly, even now that commodity computers have multicore CPUs and a few gigabytes of RAM. One of the handiest, simplest tools for performance testing is the microbenchmark. Unfortunately, developing correct Java microbenchmarks is a complex task with many pitfalls along the way. This presentation covers the do's and don'ts of Java microbenchmarking and the tools that are out there to help with this tricky task.

TRANSCRIPT

Page 1: So You Want To Write Your Own Benchmark

So you want to write your own microbenchmark

Dror Bereznitsky

December 18th, 2008

Page 2: So You Want To Write Your Own Benchmark

2

Agenda

• Introduction

• Java™ micro benchmarking pitfalls

• Writing your own benchmark

• Micro benchmarking tools

• Summary

Page 3: So You Want To Write Your Own Benchmark

3

Microbenchmark – simple definition

1. Start the clock
2. Run the code
3. Stop the clock
4. Report

Page 4: So You Want To Write Your Own Benchmark

4

Better microbenchmark definition

• Small program
• Goal: measure something about a few lines of code
• All other variables should be removed
• Returns some kind of a numeric result

Page 5: So You Want To Write Your Own Benchmark

5

Why do I need microbenchmarks?

• Discover something about my code:

• How fast is it

• Calculate throughput – TPS, KB/s

• Measure the result of changing my code:

• Should I replace a HashMap with a TreeMap?

• What is the cost of synchronizing a method?

Page 6: So You Want To Write Your Own Benchmark

6

Why are you talking about this?

• It's hard to write a robust microbenchmark
• It's even harder to do it in Java™
• There are not enough Java microbenchmarking tools
• There are too many flawed microbenchmarks out there

Page 7: So You Want To Write Your Own Benchmark

7

Agenda

• Introduction

• Java micro benchmarking pitfalls

• Writing your own benchmark

• Micro benchmarking tools

• Summary

Page 8: So You Want To Write Your Own Benchmark

8

A microbenchmark story: the problem

The boss asks you to solve a performance issue in one of the components

Blah, blah …

Page 9: So You Want To Write Your Own Benchmark

9

A microbenchmark story: the cause

You find out that the cause is excessive use of Math.sqrt()

Page 10: So You Want To Write Your Own Benchmark

10

A microbenchmark story: a solution?

• You decide to develop a state-of-the-art square root approximation
• After developing the square root approximation, you want to benchmark it against the java.lang.Math implementation

Page 11: So You Want To Write Your Own Benchmark

11

public static void main(String[] args) {
    long start = System.currentTimeMillis(); // start the clock
    for (double i = 0; i < 10 * 1000 * 1000; i++) {
        mySqrt(i); // little piece of code
    }
    long end = System.currentTimeMillis(); // stop the clock
    long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}

SQRT approximation microbenchmark

Let's run this little piece of code in a loop and see what happens …

Page 12: So You Want To Write Your Own Benchmark

12

SQRT microbenchmark results

Wow, this is really fast !

Test duration: 0 (ms)

Page 13: So You Want To Write Your Own Benchmark

13

Flawed microbenchmark

Page 14: So You Want To Write Your Own Benchmark

14

SQRT microbenchmark: what’s wrong?

The Java™ HotSpot virtual machine

Dynamic optimizations

On Stack Replacement

Dynamic Compilation

Dead code elimination

Classloading

Garbage collection

Page 15: So You Want To Write Your Own Benchmark

15

The HotSpot: a mixed mode system

[Diagram: the mixed-mode execution cycle: (1) code is interpreted, (2) profiling, (3) dynamic compilation, (4) stuff happens, (5) code is interpreted again or recompiled]

Page 16: So You Want To Write Your Own Benchmark

16

Dynamic compilation

• Dynamic compilation is unpredictable

• Don’t know when the compiler will run

• Don’t know how long the compiler will run

• Same code may be compiled more than once

• The JVM can switch to compiled code at will

Page 17: So You Want To Write Your Own Benchmark

17

• Dynamic compilation can seriously influence microbenchmark results

Dynamic compilation cont.

Interpreted execution + Dynamic compilation + Compiled code execution
(continuous recompilation)
≠
Compiled / Interpreted code execution
(steady-state)

Page 18: So You Want To Write Your Own Benchmark

18

Dynamic optimizations

• The HotSpot server compiler performs a large variety of optimizations:
• loop unrolling
• range check elimination
• dead-code elimination
• code hoisting …

Page 19: So You Want To Write Your Own Benchmark

19

Code hoisting ?

Did he just say "code hoisting"?

Page 20: So You Want To Write Your Own Benchmark

20

What the heck is code hoisting ?

• Hoist = to raise or lift
• Size optimization
• Eliminate duplicated pieces of code in method bodies by hoisting expressions or statements

Page 21: So You Want To Write Your Own Benchmark

21

Code hoisting example

Optimizing Java for Size: Compiler Techniques for Code Compaction, Samuli Heilala

Before: a + b is a busy expression.
After hoisting the expression a + b, a new local variable t has been introduced.
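To make the transformation concrete, here is a minimal sketch of hoisting in Java; the method and variable names are mine, not from the slides. Both branches originally evaluate a + b, and after hoisting the expression is computed once into a new local t:

```java
public class Hoisting {
    // Before hoisting: a + b is computed on both branches (a "busy expression").
    static int original(int a, int b, boolean cond) {
        int x;
        if (cond) {
            x = (a + b) * 2;
        } else {
            x = (a + b) * 3;
        }
        return x;
    }

    // After hoisting: the compiler lifts a + b into a new local t,
    // so the expression is evaluated only once.
    static int hoisted(int a, int b, boolean cond) {
        int t = a + b;
        int x;
        if (cond) {
            x = t * 2;
        } else {
            x = t * 3;
        }
        return x;
    }

    public static void main(String[] args) {
        // The transformation preserves behavior.
        System.out.println(original(2, 3, true) == hoisted(2, 3, true));
    }
}
```

The point for benchmarking is that the JIT may apply transformations like this to your measured loop, so the code that actually runs can differ from the code you wrote.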

Page 22: So You Want To Write Your Own Benchmark

22

Dynamic optimizations cont.

• Most of the optimizations are performed at runtime
• Profiling data is used by the compiler to improve optimization decisions
• You don't have access to the dynamically compiled code

Page 23: So You Want To Write Your Own Benchmark

23

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}

Example: Very fast square root?

10,000,000 calls to Math.sqrt() ~ 4 ms

Page 24: So You Want To Write Your Own Benchmark

24

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    System.out.format("Result: %d %n", result); // single line of code added
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}

Example: not so fast?

Now it takes ~ 2000 ms ?!?

Page 25: So You Want To Write Your Own Benchmark

25

DCE - Dead Code Elimination

• Dead code - code that has no effect on the outcome of the program execution

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) { // dead code: result is never used,
        result += Math.sqrt(i);                  // so the whole loop can be eliminated
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
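One way to keep the measured loop from being eliminated is to make its result observable: accumulate it and print it after the clock stops. A sketch under that assumption (the method name and iteration count are placeholders):

```java
public class DceSafeBenchmark {
    // The loop's result is returned and later printed, so the JIT
    // cannot treat the computation as dead code.
    static long runLoop(int iterations) {
        long result = 0;
        for (int i = 0; i < iterations; i++) {
            result += (long) Math.sqrt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = runLoop(10 * 1000 * 1000);
        long durationMs = (System.nanoTime() - start) / 1000000;
        // Printing AFTER the clock stops keeps the result observable
        // without adding I/O cost to the measured region.
        System.out.format("Result: %d, duration: %d ms%n", result, durationMs);
    }
}
```

Note the difference from the earlier slide: printing the result inside the timed region added ~2000 ms of real work to the measurement, whereas consuming it after the clock stops defeats DCE without distorting the timing.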

Page 26: So You Want To Write Your Own Benchmark

26

OSR - On Stack Replacement

• Methods are HOT if they cumulatively execute more than 10,000 loop iterations
• Older JVM versions did not switch to the compiled version until the method exited and was re-entered
• OSR - switch from interpretation to compiled code in the middle of a loop

Page 27: So You Want To Write Your Own Benchmark

27

OSR and microbenchmarking

• OSR'd code may be less performant
• Some optimizations are not performed
• OSR usually happens when you put everything into one long method
• Developers tend to write long main() methods when benchmarking
• Real life applications are hopefully divided into more fine grained methods
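The practical fix follows from the bullets above: keep the timed work in a small method and call it many times from the driver, so the JIT compiles the method as a whole and later calls enter compiled code normally rather than via OSR. A sketch (names and iteration counts are mine):

```java
public class OsrFriendly {
    // The measured work lives in its own short method. After enough
    // invocations the JIT compiles it, and subsequent calls run the
    // fully optimized version instead of an OSR'd long main() loop.
    static double work(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        double sink = 0;
        // Many short invocations rather than one long loop in main().
        for (int run = 0; run < 10000; run++) {
            sink += work(1000);
        }
        System.out.println("sink = " + sink); // consume the result (no DCE)
    }
}
```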

Page 28: So You Want To Write Your Own Benchmark

28

Classloading

• Classes are usually loaded only when they are first used
• Class loading takes time:
• I/O
• Parsing
• Verification
• May skew your benchmark results

Page 29: So You Want To Write Your Own Benchmark

29

Garbage Collection

• The JVM automatically reclaims resources by:
• Garbage collection
• Object finalization
• Outside of the developer's control
• Unpredictable
• Should be measured if invoked as a result of the benchmarked code
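Whether the benchmarked code triggered collections can be checked with the standard java.lang.management API: snapshot the collector counters before and after the run. A minimal sketch:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    // Sum GC run counts and accumulated GC time (ms) across all collectors.
    // The beans may report -1 when a value is undefined, hence the clamping.
    static long[] snapshot() {
        long count = 0, time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, time };
    }

    public static void main(String[] args) {
        long[] before = snapshot();
        // ... run the benchmarked code here ...
        long[] after = snapshot();
        System.out.format("GC runs: %d, GC time: %d ms%n",
                after[0] - before[0], after[1] - before[1]);
    }
}
```

If the delta is non-zero, GC time is part of your measurement and should be reported alongside the result rather than silently averaged in.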

Page 30: So You Want To Write Your Own Benchmark

30

Time measurement

public static void main(String[] args) throws InterruptedException {
    long start = System.currentTimeMillis();
    Thread.sleep(1);
    final long end = System.currentTimeMillis();
    final long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}

Test duration: 16 (ms)

How long is one millisecond?

Page 31: So You Want To Write Your Own Benchmark

31

System.currentTimeMillis()

• Accuracy varies with platform

Source          Platform                    Resolution
Markus Kobler   Linux (2.6 kernel)          1 ms
Java Glossary   Mac OS X                    1 ms
David Holmes    Windows NT, 2K, XP, 2003    10 - 15 ms
Java Glossary   Windows 95/98               55 ms
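The clock's granularity on a given platform can be measured directly: spin until the reported value changes and record the jump. A small sketch of that technique:

```java
public class TimerGranularity {
    // Spin until currentTimeMillis() advances; the observed delta
    // approximates the update granularity of the clock on this platform
    // (e.g. ~1 ms on Linux, ~10-15 ms on older Windows versions).
    static long measureGranularityMs() {
        long t0 = System.currentTimeMillis();
        long t1 = t0;
        while (t1 == t0) {
            t1 = System.currentTimeMillis();
        }
        return t1 - t0;
    }

    public static void main(String[] args) {
        System.out.println("Granularity: ~" + measureGranularityMs() + " ms");
    }
}
```

If your measured durations are of the same order as this granularity, the benchmark needs more iterations per timing sample.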

Page 32: So You Want To Write Your Own Benchmark

32

Wrong target platform

• Choosing the wrong platform for your microbenchmark:
• Benchmarking on Windows when your target platform is Linux
• Benchmarking a highly threaded application on a single core machine
• Benchmarking on a Sun JVM when the target platform is Oracle (BEA) JRockit

Page 33: So You Want To Write Your Own Benchmark

33

Caching

• Caching

• Hardware – CPU caching

• Operating System – File system caching

• Database – query caching

Page 34: So You Want To Write Your Own Benchmark

34

Caching: CPU L1 and L2 caches

• The more the data accessed are far from the CPU, the more the delays are high

• Size of dataset affects access cost

136.44657438128192K

9.82141345116k

Cost (ns)Time (us)Array size

Jcachev2 results for Intel® core™2 duo T8300, L1 = 32 KB, L2 = 3 MB

Page 35: So You Want To Write Your Own Benchmark

35

Busy environment

• Running in a busy environment – CPU, IO, Memory

Page 36: So You Want To Write Your Own Benchmark

36

Agenda

• Introduction

• Java micro benchmarking pitfalls

• Writing your own benchmark

• Micro benchmarking tools

• Summary

Page 37: So You Want To Write Your Own Benchmark

37

Warm-up your code

Page 38: So You Want To Write Your Own Benchmark

38

Warm up your code

• Let the JVM reach a steady state execution profile before you start benchmarking
• All classes should be loaded before benchmarking
• Usually executing your code for ~10 seconds should be enough

Page 39: So You Want To Write Your Own Benchmark

39

Warm up your code – cont.

• Detect JIT compilations by using:
• CompilationMXBean.getTotalCompilationTime()
• -XX:+PrintCompilation
• Measure classloading time
• Use the ClassLoadingMXBean
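The ClassLoadingMXBean exposes class counts rather than loading time, so one practical use is to verify between warm-up passes that the loaded-class count has stopped growing. A sketch of that check:

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassLoadCheck {
    public static void main(String[] args) {
        ClassLoadingMXBean clBean = ManagementFactory.getClassLoadingMXBean();
        long before = clBean.getTotalLoadedClassCount();
        // ... run a warm-up pass of the benchmarked code here ...
        long after = clBean.getTotalLoadedClassCount();
        // If the count still grows between warm-up passes, class loading has
        // not finished and measurements would include its I/O and parsing cost.
        System.out.format("Classes loaded during pass: %d%n", after - before);
    }
}
```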

Page 40: So You Want To Write Your Own Benchmark

40

CompilationMXBean usage

import java.lang.management.ManagementFactory;
import java.lang.management.CompilationMXBean;

long compilationTimeTotal;
CompilationMXBean compBean = ManagementFactory.getCompilationMXBean();
if (compBean.isCompilationTimeMonitoringSupported())
    compilationTimeTotal = compBean.getTotalCompilationTime();

Page 41: So You Want To Write Your Own Benchmark

41

Dynamic optimizations

• Avoid on stack replacement

• Don’t put all your benchmark code in one big main() method

• Avoid dead code elimination

• Print the final result

• Report unreasonable speedups

Page 42: So You Want To Write Your Own Benchmark

42

Garbage Collection

• Measure garbage collection time
• Force garbage collection and finalization before benchmarking
• Perform enough iterations to reach garbage collection steady state
• Gather GC stats: -XX:+PrintGCTimeStamps -XX:+PrintGCDetails

Page 43: So You Want To Write Your Own Benchmark

43

Time measurement

• Use System.nanoTime()
• Microsecond accuracy on modern operating systems and hardware
• Not worse than currentTimeMillis()
• Notice: Windows users
• Executes in microseconds
• Don't overuse it!

Page 44: So You Want To Write Your Own Benchmark

44

JVM configuration

• Use similar JVM options to your target environment:
• -server or -client JVM
• Enough heap space (-Xmx)
• Garbage collection options
• Thread stack size (-Xss)
• JIT compiling options

Page 45: So You Want To Write Your Own Benchmark

45

Other issues

• Use fixed size data sets
• Too large data sets can cause L1 cache blowout
• Notice system load
• Don't play GTA while benchmarking!

Page 46: So You Want To Write Your Own Benchmark

46

Agenda

• Introduction

• Java micro benchmarking pitfalls

• Writing your own benchmark

• Micro benchmarking tools

• Summary

Page 47: So You Want To Write Your Own Benchmark

47

• Various specialized benchmarks

• SPECjAppServer ®

• SPECjvm™

• CaffeineMark 3.0™

• SciMark 2.0

• Only a few benchmarking frameworks

Java™ benchmarking tools

Page 48: So You Want To Write Your Own Benchmark

48

Japex Micro-Benchmark framework

• Similar in spirit to JUnit

• Measures throughput – work over time

• Transactions Per Second (Default)

• KBs per second

• XML based configuration

• XML/HTML reports

Page 49: So You Want To Write Your Own Benchmark

49

Japex: Drivers

• Encapsulates knowledge about a specific algorithm implementation
• Must extend JapexDriverBase

public interface JapexDriver extends Runnable {
    public void initializeDriver();
    public void prepare(TestCase testCase);
    public void warmup(TestCase testCase);
    public void run(TestCase testCase);
    public void finish(TestCase testCase);
    public void terminateDriver();
}

Page 50: So You Want To Write Your Own Benchmark

50

public class SqrtNewtonApproxDriver extends JapexDriverBase {
    private long tmp;

    @Override
    public void warmup(TestCase testCase) {
        tmp += sqrt(getNextRandomNumber());
    }
}

Japex: Writing your own driver

Page 51: So You Want To Write Your Own Benchmark

51

<testSuite name="SQRT Test Suite"
           xmlns="http://www.sun.com/japex/testSuite" …>
  <param name="libraryDir" value="C:/java/japex/lib"/>
  <param name="japex.classPath" value="./target/classes"/>
  <param name="japex.runIterations" value="1000000"/>
  <driver name="SqrtApproxNewtonDriver">
    <param name="Description" value="Newton Driver"/>
    <param name="japex.driverClass"
           value="com.alphacsp.javaedge.benchmark.japex.driver.SqrtNewtonApproxDriver"/>
  </driver>
  <testCase name="testcase1"/>
</testSuite>

Japex: Test suite

Page 52: So You Want To Write Your Own Benchmark

52

Japex: HTML Reports

Page 53: So You Want To Write Your Own Benchmark

53

Japex: more chart types

Scatter chart

Line chart

Page 54: So You Want To Write Your Own Benchmark

54

Japex: pros and cons

• Pros

• Similar to JUnit

• Nice HTML reports

• Cons

• Last stable release in March 2007

• HotSpot issues are not handled

• XML configuration

Page 55: So You Want To Write Your Own Benchmark

55

Brent Boyer’s Benchmark framework

• Part of the "Robust Java benchmarking" article by Brent Boyer

• Automate as many aspects as possible:

• Resource reclamation

• Class loading

• Dead code elimination

• Statistics

Page 56: So You Want To Write Your Own Benchmark

56

Benchmark framework example

Benchmark.Params params = new Benchmark.Params(true);
params.setExecutionTimeGoal(0.5);
params.setNumberMeasurements(50);

Runnable task = new Runnable() {
    public void run() {
        sqrt(getNextRandomNumber());
    }
};

Benchmark benchmark = new Benchmark(task, params);
System.out.println(benchmark.toString());

Page 57: So You Want To Write Your Own Benchmark

57

Benchmark single line summary

Benchmark output:

first = 25.702 us, mean = 91.070 ns (CI deltas: -115.591 ps, +171.423 ps),
sd = 1.451 us (CI deltas: -461.523 ns, +676.964 ns)
WARNING: execution times have mild outliers, SD VALUES MAY BE INACCURATE

Page 58: So You Want To Write Your Own Benchmark

58

Outlier and serial correlation issues

• Records outlier and serial correlation issues

• Outliers indicate that a major measurement error happened

• Large outliers - some other activity started on the computer during measurement

• Small outliers might hint that DCE occurred

• Serial correlation indicates that the JVM has not reached its steady-state performance profile
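Outlier detection of this kind can be done with simple statistics: flag any sample far from the mean relative to the standard deviation. A minimal sketch (the threshold of two standard deviations is my choice, not the framework's):

```java
import java.util.Arrays;

public class OutlierCheck {
    // Flag samples more than two standard deviations from the mean.
    // Note a large outlier inflates the sd itself, so very strict
    // thresholds (e.g. 3 sd) can fail to flag it.
    static boolean[] flagOutliers(double[] samples) {
        double mean = Arrays.stream(samples).average().orElse(0);
        double var = Arrays.stream(samples)
                .map(s -> (s - mean) * (s - mean)).average().orElse(0);
        double sd = Math.sqrt(var);
        boolean[] flags = new boolean[samples.length];
        for (int i = 0; i < samples.length; i++) {
            flags[i] = Math.abs(samples[i] - mean) > 2 * sd;
        }
        return flags;
    }

    public static void main(String[] args) {
        // Hypothetical timings in ms; the last run hit outside interference.
        double[] times = { 10, 11, 10, 12, 11, 10, 95 };
        System.out.println(Arrays.toString(flagOutliers(times)));
    }
}
```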

Page 59: So You Want To Write Your Own Benchmark

59

Benchmark : pros and cons

• Pros

• Handles HotSpot related issues

• Detailed statistics

• Cons

• Each run takes a lot of time

• Not a formal project

• Lacks documentation

Page 60: So You Want To Write Your Own Benchmark

60

Agenda

• Introduction

• Java micro benchmarking pitfalls

• Writing your own benchmark

• Micro benchmarking tools

• Summary

Page 61: So You Want To Write Your Own Benchmark

61

Summary 1

• Microbenchmarking is hard when it comes to Java™
• Define what you want to measure and how you want to do it; pick your goals
• Know what you are doing
• Always warm up your code
• Handle DCE, OSR, and GC issues
• Use fixed size data sets and fixed work
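Pulling the checklist above together, a minimal harness might look like the following sketch; the structure, the Task interface, and the iteration counts are mine, not a formal framework:

```java
public class MiniHarness {
    interface Task { long run(); } // tasks return a value so results stay observable

    static long[] measure(Task task, int warmupRounds, int rounds) {
        long sink = 0;
        // Warm-up: let classes load and the JIT reach steady state.
        for (int i = 0; i < warmupRounds; i++) {
            sink += task.run();
        }
        System.gc(); // request collection before measuring (best effort only)
        long[] durations = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            sink += task.run();
            durations[i] = System.nanoTime() - start;
        }
        // Consume the accumulated result so the work cannot be eliminated as dead code.
        if (sink == 42) System.out.println(sink);
        return durations;
    }

    public static void main(String[] args) {
        Task sqrtTask = () -> {
            long acc = 0;
            for (int i = 0; i < 1000000; i++) acc += (long) Math.sqrt(i); // fixed work
            return acc;
        };
        long[] ns = measure(sqrtTask, 50, 20);
        for (long d : ns) System.out.format("%d ns%n", d); // report all samples
    }
}
```

Reporting every sample (rather than one average) lets you spot the outliers and serial correlation discussed earlier.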

Page 62: So You Want To Write Your Own Benchmark

62

Summary 2

• Do not rely solely on microbenchmark results
• Sanity check results
• Use a profiler
• Test your code in real life scenarios under realistic load (macro-benchmark)

Page 63: So You Want To Write Your Own Benchmark

63

Summary: resources

• http://www.ibm.com/developerworks/java/library/j-benchmark1.html
• http://www.azulsystems.com/events/javaone_2002/microbenchmarks.pdf
• https://japex.dev.java.net/
• http://www.ibm.com/developerworks/java/library/j-jtp12214/
• http://www.dei.unipd.it/~bertasi/jcache/

Page 64: So You Want To Write Your Own Benchmark

64

Thank You!