jvm internals by douglas hawkins

JVM Internals Douglas Q. Hawkins http://www.slideshare.net/dougqh http://www.dougqh.net [email protected] Monday, January 23, 12

Upload: zulujdk

Post on 15-Jul-2015


Page 2: JVM Internals by Douglas Hawkins

Topics

Java Byte Code

File Format

Byte Code Examples

How Java 5 & 7 Features Are Implemented

JVM Optimizations

Page 3: JVM Internals by Douglas Hawkins

Why?

Besides techie edification, why is this useful? A better understanding of the internals can help in deciphering some of the harder problems. Better still, you’ll know that the compiler and JVM are doing a lot for you, letting you focus on writing readable code.

Page 4: JVM Internals by Douglas Hawkins

File Format


Page 5: JVM Internals by Douglas Hawkins

Class File Format

CA FE BA BE | Minor Version | Major Version
Constant Pool
Flags | This Class | Super Class
Interfaces
Fields
Methods
Attributes

Every class file starts with the 4-byte magic number: CAFEBABE
Followed by the minor and major version - major indicates Java 5, 6, 7, etc.
Then a constant pool - which contains...
- constants: int, long, String, etc.
- references: method and field
- descriptors: method and field
Followed by flags: modifiers for this class/interface
Followed by a reference to this class/interface
Followed by the super class - which is an index into the constant pool
Followed by a list of interface references - which are indices into the constant pool
Followed by fields
Followed by methods
And, finally, attributes, which are extra meta-information about the class...
- the name of the original source file
- annotation information
- information on inner classes

Class File Spec: http://java.sun.com/docs/books/jvms/second_edition/ClassFileFormat-Java5.pdf
History of CAFEBABE: http://en.wikipedia.org/wiki/Java_class_file
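To make the layout above concrete, here is a minimal sketch that reads just the first few fields of a class file: the magic number, the two version numbers, and the constant pool count. The class name ClassFileHeader and its fields are made up for this illustration; it assumes the class file can be found as a resource through the system class loader, which is an assumption about the runtime setup rather than something the slides state.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of the talk.
class ClassFileHeader {
    final int magic;             // expected to be 0xCAFEBABE
    final int minorVersion;      // 2 bytes, comes before the major version
    final int majorVersion;      // 2 bytes, e.g. 49 = Java 5, 50 = Java 6, 51 = Java 7
    final int constantPoolCount; // 2 bytes, number of constant pool entries plus one

    private ClassFileHeader(int magic, int minor, int major, int cpCount) {
        this.magic = magic;
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    // Reads the header fields of the class file for the given binary class name.
    public static ClassFileHeader read(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalArgumentException("class file not found: " + resource);
            }
            // Class files are big-endian, which is what DataInputStream reads.
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int cpCount = in.readUnsignedShort();
            return new ClassFileHeader(magic, minor, major, cpCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = read("java.lang.Object");
        System.out.printf("magic=%08X minor=%d major=%d constant_pool_count=%d%n",
                h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount);
    }
}
```

The major version printed depends on the JDK that produced the class file, so no particular value is shown here.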

Page 6: JVM Internals by Douglas Hawkins

Class File Format

Flags: public, private, protected, static, final, abstract, strictfp, annotation, enum, interface

Page 7: JVM Internals by Douglas Hawkins

Field Format

Flags | Name | Descriptor
Attributes

Flags: public, private, protected, static, final, volatile, transient

Fields consist of...
- flags
- followed by name - actually an index to a string entry in the constant pool
- followed by descriptor - e.g. field type - also an index into the constant pool - the type is the raw type
- followed by attributes
  - constant value
  - specific type information - List<String>, etc.

Page 8: JVM Internals by Douglas Hawkins

Field Format

Name: “name”

Page 9: JVM Internals by Douglas Hawkins

Field Format

Descriptor: “Ljava/lang/String;”

Page 10: JVM Internals by Douglas Hawkins

Field Format

Attributes: ConstantValue

Page 11: JVM Internals by Douglas Hawkins

Method Format

Flags | Name | Descriptor
Attributes

Flags: public, private, protected, static, final, varargs, native, strictfp, synchronized

Methods consist of...
- flags
- followed by name - actually an index to a string entry in the constant pool
- followed by descriptor - e.g. raw parameter types and return type
- followed by attributes
  - exceptions & code
  - specific type information - List<String>, etc.
  - specific exception information
  - debugging information

Page 12: JVM Internals by Douglas Hawkins

Method Format

Name: “main”

Page 13: JVM Internals by Douglas Hawkins

Method Format

Descriptor: “([Ljava/lang/String;)V”

Page 14: JVM Internals by Douglas Hawkins

Method Format

Attributes: Exceptions, Code
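One way to get a feel for method descriptors is to let the JDK print them. The sketch below uses java.lang.invoke.MethodType (available since Java 7) to produce the descriptor strings for a main-style method and for the three-int volume method used as an example later in the talk. The class name DescriptorDemo is made up for illustration.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical demo class; only MethodType is real JDK API.
class DescriptorDemo {
    // void main(String[] args)
    public static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class)
                .toMethodDescriptorString();
    }

    // int volume(int width, int depth, int height)
    public static String volumeDescriptor() {
        return MethodType.methodType(int.class, int.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    // List<String> getStrings() - the descriptor only carries the raw type List
    public static String rawListDescriptor() {
        return MethodType.methodType(List.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());    // ([Ljava/lang/String;)V
        System.out.println(volumeDescriptor());  // (III)I
        System.out.println(rawListDescriptor()); // ()Ljava/util/List;
    }
}
```

Reading the output: “[” marks an array, “L...;” a class name, “I” an int, and “V” a void return.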

Page 15: JVM Internals by Douglas Hawkins

Constant Pool

#1 Class -> #2
#2 UTF (10) “HelloWorld”
#3 Class -> #4
#4 UTF (16) “java/lang/Object”
#5 UTF (6) “<init>”
#6 UTF (3) “()V”
#7 UTF (4) “Code”
#8 Method -> class #3, name & type #9
#9 Name & Type -> name #5, descriptor #6
#10 UTF (4) “main”
#11 UTF (22) “([Ljava/lang/String;)V”
#12 Field -> class #13, name & type #15
#13 Class -> #14
#14 UTF (16) “java/lang/System”

Dissect the “Hello World” example a little...
Entry 1 is a class entry - a 2-byte index to a UTF entry that contains the name
Entry 2 is the name of the class
Similarly...
Entry 3 is a class entry referring to the parent class - it refers to Entry 4, which is the full name of the parent class
Skip over the constructor “<init>” and focus on main
Entry 10 is the name “main” & Entry 11 is the raw type descriptor for “main”
The [Ljava/lang/String; indicates String[] - V indicates it returns void

Page 16: JVM Internals by Douglas Hawkins

Browsing Class File Format

JClassLib Viewer http://www.ej-technologies.com/products/jclasslib/overview.html


Page 17: JVM Internals by Douglas Hawkins

public final class HelloWorld {
    public static final String MESSAGE = "Hello, World!";

    public static final void main( final String... args ) {
        System.out.println( MESSAGE );
    }
}

ConstantValue

Here, we can see that because the “MESSAGE” field is “static final”, its value is stored in a “ConstantValue” attribute on the “MESSAGE” field.

Page 18: JVM Internals by Douglas Hawkins

Exceptions

public interface InputStreamProvider {
    public abstract InputStream open() throws IOException;
}

Exception information is also stored in an attribute.
As it turns out, the JVM makes no distinction between checked and unchecked exceptions, which has an interesting implication...

Page 19: JVM Internals by Douglas Hawkins

Exceptions

public final class NewInstance {
    public static void main(String... args) {
        try {
            Class.forName("net.dougqh.runtime.SomeClass").newInstance();
        } catch ( InstantiationException | IllegalAccessException | ClassNotFoundException e ) {
            e.printStackTrace();
        }
    }
}

public class SomeClass {
    public SomeClass() throws SomeException {
        throw new SomeException();
    }
}

Exception in thread "main" net.dougqh.runtime.SomeClass$SomeException
    at net.dougqh.runtime.SomeClass.<init>
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0
    at sun.reflect.NativeConstructorAccessorImpl.newInstance
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance
    at java.lang.reflect.Constructor.newInstance
    at java.lang.Class.newInstance0
    at java.lang.Class.newInstance
    at net.dougqh.runtime.NewInstance.main

www.javapuzzlers.com

Because of an oversight in the original reflection API, Class.newInstance can throw a checked exception that is not reported by the compiler.
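The puzzler can be reproduced in a few lines. The sketch below contrasts Class.newInstance, which lets the constructor's checked exception escape undeclared, with Constructor.newInstance, which wraps it in an InvocationTargetException. The names SneakyNewInstance, CheckedFailure, and Target are made up for this illustration. Class.newInstance has been deprecated since Java 9, so the example suppresses that warning.

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical demo class; not code from the talk.
class SneakyNewInstance {
    // A checked exception (extends Exception, not RuntimeException).
    static class CheckedFailure extends Exception {}

    public static class Target {
        public Target() throws CheckedFailure {
            throw new CheckedFailure();
        }
    }

    // Class.newInstance declares only InstantiationException and
    // IllegalAccessException, yet the constructor's checked exception
    // propagates unchanged. We catch Throwable to observe what arrives.
    @SuppressWarnings("deprecation")
    public static Throwable viaClassNewInstance() {
        try {
            Target.class.newInstance();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    // Constructor.newInstance wraps whatever the constructor throws,
    // so the compiler can see the possible failure in the signature.
    public static Throwable viaConstructorNewInstance() {
        try {
            Target.class.getDeclaredConstructor().newInstance();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaClassNewInstance().getClass().getName());
        Throwable wrapped = viaConstructorNewInstance();
        System.out.println(wrapped.getClass().getName()
                + " caused by " + wrapped.getCause().getClass().getName());
    }
}
```

This works at all only because, as noted above, the JVM itself does not distinguish checked from unchecked exceptions; the check exists in the compiler.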

Page 20: JVM Internals by Douglas Hawkins

Generics

public final class Generics {
    public static final List<String> getStrings() {
        return Collections.singletonList("foo");
    }
}

Here, we can see that getStrings(), which returns List<String>, has a descriptor with the raw type List.
However, the exact type information is stored in the “Signature” attribute.
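Both views of the return type are visible through reflection, which gives a quick way to check this without a class file viewer. In the sketch below, getReturnType reports the erased type that the descriptor carries, while getGenericReturnType reports the full generic type, which the runtime recovers from the Signature attribute. The class name SignatureDemo is made up for illustration.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class mirroring the Generics example from the slide.
class SignatureDemo {
    public static List<String> getStrings() {
        return Collections.singletonList("foo");
    }

    private static Method lookup() {
        try {
            return SignatureDemo.class.getMethod("getStrings");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The raw type, as recorded in the method descriptor.
    public static String erasedReturnType() {
        return lookup().getReturnType().getName();
    }

    // The full generic type, recovered from the Signature attribute.
    public static String genericReturnType() {
        return lookup().getGenericReturnType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType());  // java.util.List
        System.out.println(genericReturnType()); // java.util.List<java.lang.String>
    }
}
```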

Page 21: JVM Internals by Douglas Hawkins

Annotations

@Inherited
@Retention( RetentionPolicy.RUNTIME )
public @interface Annotation {
    public int foo() default 20;
    public String bar();
}

@Annotation( bar="quux" )
class Annotated {}

An annotation is just an interface.
The default value for each method is stored in an AnnotationDefault attribute on that method.
The annotation information on a class or method is also stored in an attribute.
In this case, since the annotation has a RUNTIME RetentionPolicy, it is stored in the RuntimeVisibleAnnotations attribute.
Values for the annotation’s elements are stored inside that attribute as element-value pairs.
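The same information can be read back through reflection. The sketch below declares an annotation like the one on the slide (renamed to the made-up name Sample to avoid clashing with java.lang.annotation.Annotation) and reads its default value, its explicit value, and the fact that it is an interface. The enclosing class AnnotationDemo is also made up for illustration.

```java
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical demo class; Sample stands in for the slide's annotation.
class AnnotationDemo {
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @interface Sample {
        int foo() default 20;
        String bar();
    }

    @Sample(bar = "quux")
    static class Annotated {}

    // An annotation type is an interface as far as the JVM is concerned.
    public static boolean isInterface() {
        return Sample.class.isInterface();
    }

    // The default value stored on the foo() method of the annotation type.
    public static Object defaultOfFoo() {
        try {
            return Sample.class.getMethod("foo").getDefaultValue();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The explicit value stored with the annotation on Annotated.
    public static String barOnAnnotated() {
        return Annotated.class.getAnnotation(Sample.class).bar();
    }

    // foo was not given on Annotated, so the default applies.
    public static int fooOnAnnotated() {
        return Annotated.class.getAnnotation(Sample.class).foo();
    }

    public static void main(String[] args) {
        System.out.println(isInterface());    // true
        System.out.println(defaultOfFoo());   // 20
        System.out.println(barOnAnnotated()); // quux
        System.out.println(fooOnAnnotated()); // 20
    }
}
```

With a retention policy other than RUNTIME, getAnnotation would return null here, since the annotation would not be visible to reflection.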

Page 22: JVM Internals by Douglas Hawkins

Byte Code


Page 23: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

0 iconst_1
1 iconst_2
2 iadd
3 istore_0
4 iload_0

Local variable slots: 0 1 2 3

The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python.
In this example, the green field is the heap, the bar above is the local variable slots, and the column to the left is the stack.

Let’s look at how to add 1 + 2 together and store the result into a local variable.
First, we use an iconst_1 instruction to load 1 onto the stack.
Java has special instructions for common numbers: -1 to 5.
Next, an iconst_2 to place 2 on the stack.
Next, we use iadd, which pops the 1 & 2 off the stack, adds them together, and stores the result back on the stack.
Next, we use an istore_0 to store into the first local variable slot.
To load the value back from the local variable slots, we use an iload_0.
Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3

Page 24: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

After 0 iconst_1 - stack: 1

Page 25: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

After 1 iconst_2 - stack: 1, 2

Page 26: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

During 2 iadd - stack: 1+2

Page 27: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

After 2 iadd - stack: 3

Page 28: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

After 3 istore_0 - stack: empty, local variable slot 0: 3

Page 29: JVM Internals by Douglas Hawkins

Stack Based Virtual Machine

After 4 iload_0 - stack: 3, local variable slot 0: 3
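The walk-through of 1 + 2 can be mimicked with a toy interpreter. The sketch below is not how a real JVM is implemented; it is a made-up class, TinyStackMachine, with one method per instruction used in the example, an operand stack, and four local variable slots, so each step can be checked against the slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model for illustration only; a real JVM works on typed frames and byte code.
class TinyStackMachine {
    final Deque<Integer> stack = new ArrayDeque<>(); // operand stack
    final int[] locals = new int[4];                 // local variable slots 0-3

    void iconst(int value) { stack.push(value); }    // push a constant

    void iadd() {                                    // pop two ints, push their sum
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
    }

    void istore(int slot) { locals[slot] = stack.pop(); } // pop into a slot

    void iload(int slot) { stack.push(locals[slot]); }    // push a copy of a slot

    // Runs the five instructions from the slide.
    public static TinyStackMachine runOnePlusTwo() {
        TinyStackMachine vm = new TinyStackMachine();
        vm.iconst(1);  // stack: 1
        vm.iconst(2);  // stack: 1, 2
        vm.iadd();     // stack: 3
        vm.istore(0);  // stack: empty, slot 0: 3
        vm.iload(0);   // stack: 3,     slot 0: 3
        return vm;
    }

    public static void main(String[] args) {
        TinyStackMachine vm = runOnePlusTwo();
        System.out.println("top of stack = " + vm.stack.peek()
                + ", slot 0 = " + vm.locals[0]);
    }
}
```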

Page 30: JVM Internals by Douglas Hawkins

Parameters and Local Variables

static int volume( int width, int depth, int height ) {
    int area = width * depth;
    int volume = area * height;
    return volume;
}

0 iload_0
1 iload_1
2 imul
3 istore_3
4 iload_3
5 iload_2
6 imul
7 istore 4
9 iload 4
11 ireturn

Local variable slots: 0 width, 1 depth, 2 height, 3 area, 4 volume

Trace through a slightly more complicated example: calculating volume
- arguments are passed into the low local variable slots - 0 - 2 in this case
- first, to calculate area, load width and depth from slots 0 & 1 respectively
- multiply the values on the stack, then store the result into slot 3: area
- reload area & height - slots 3 & 2 respectively
- multiply the values and store into slot 4: volume
- reload volume and return

Yes, the value is stored and then immediately reloaded in the byte code. Starting with Java 1.3, byte code is not optimized by javac; all optimizations are left to the JVM to perform.

Page 31: JVM Internals by Douglas Hawkins

Static vs Virtual Methods

int volume( int width, int depth, int height ) {
    int area = width * depth;
    int volume = area * height;
    return volume;
}

0 iload_1
1 iload_2
2 imul
3 istore 4
5 iload 4
7 iload_3
8 imul
9 istore 5
11 iload 5
13 ireturn

Local variable slots: 0 this, 1 width, 2 depth, 3 height, 4 area, 5 volume

In the prior example, you may have noticed that the method was static.
If the method isn’t static, then “this” is invisibly passed in the first slot.
So, our arguments start at 1, and the loads and stores all change accordingly.
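The slot numbering rule can be written down as a small function. This is a sketch under two assumptions: the helper name SlotLayout.parameterSlots is made up, and it also applies one rule from the JVM specification that the slides do not mention, namely that long and double values occupy two consecutive slots.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; it models slot assignment, nothing more.
class SlotLayout {
    // Returns the local variable slot of each parameter, in declaration order.
    public static int[] parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int[] slots = new int[parameterTypes.length];
        // For instance methods, slot 0 is taken by "this".
        int next = isStatic ? 0 : 1;
        for (int i = 0; i < parameterTypes.length; i++) {
            slots[i] = next;
            // Not covered in the slides: long and double take two slots each.
            boolean wide = parameterTypes[i] == long.class
                    || parameterTypes[i] == double.class;
            next += wide ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // static int volume(int, int, int): width, depth, height
        System.out.println(Arrays.toString(
                parameterSlots(true, int.class, int.class, int.class)));  // [0, 1, 2]
        // the same method as an instance method
        System.out.println(Arrays.toString(
                parameterSlots(false, int.class, int.class, int.class))); // [1, 2, 3]
    }
}
```

Locals declared in the method body, such as area and volume, take the next free slots after the parameters, which matches the two listings above.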

Page 32: JVM Internals by Douglas Hawkins

Hello World

System.out.println( "Hello World" );

0 getstatic System.out
3 ldc “Hello World”
5 invokevirtual PrintStream.println
8 return

Stack: System.out, “Hello World”
Local variable slots: 0 1 2 3

Now, we know enough to understand “Hello World”

The first operation is a getstatic to load the value of System.out onto the stack. We need this reference to invoke println.
Second, load the string “Hello World” onto the stack - the ldc indicates a load from the constant pool.
Now, since this is a non-static method on a class, use invokevirtual to invoke PrintStream.println.
This consumes the pointer to System.out (which is the “this” for PrintStream.println) and the reference to “Hello World”.
These values are then mapped to local slots for “this” and “msg” in the new stack frame.

Page 33: JVM Internals by Douglas Hawkins

Hello World

Page 34: JVM Internals by Douglas Hawkins

Hello World

Page 35: JVM Internals by Douglas Hawkins

Hello World

New stack frame for PrintStream.println - local variable slots: 0 this, 1 msg

Page 36: JVM Internals by Douglas Hawkins

Types of Method Invocations

invokestatic - invoke static methods
invokevirtual - invoke instance method from class
invokeinterface - invoke instance method from interface
invokespecial - invoke <init> / invoke super method
invokedynamic - optimized dynamic look-up (in Java 7)

We’ve seen a call to invokevirtual, which is used for instance methods invoked through a class reference, but there are other invocation types, too.
invokestatic - for static methods
invokeinterface - for methods invoked through an interface reference (rather than a class reference)
invokespecial - for direct targets - like constructors or invoking a super method where the call is not polymorphic
invokedynamic - used by script languages like JRuby in Java 7 for improved performance
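As a rough guide to which source construct produces which instruction, here is a small sketch with the expected instruction noted in a comment next to each call. The class InvokeKinds and its members are made up for illustration, and the comments reflect what javac typically emits; running javap -c on the compiled class is the way to confirm it. The lambda line goes beyond the talk: javac has compiled lambdas with invokedynamic since Java 8.

```java
import java.util.function.IntSupplier;

// Hypothetical demo class; compile it and inspect with "javap -c" to verify.
class InvokeKinds {
    interface Shape {
        int area();
    }

    static class Square implements Shape {
        private final int side;

        Square(int side) {
            this.side = side;
        }

        @Override
        public int area() {
            return side * side;
        }
    }

    static int twice(int x) {
        return 2 * x;
    }

    public static int demo() {
        Square square = new Square(3);           // new, dup, invokespecial Square.<init>
        int viaClass = square.area();            // invokevirtual Square.area
        Shape shape = square;
        int viaInterface = shape.area();         // invokeinterface Shape.area
        int viaStatic = twice(viaClass);         // invokestatic InvokeKinds.twice
        IntSupplier lambda = () -> viaInterface; // invokedynamic (javac 8 and later)
        return viaStatic + lambda.getAsInt();    // invokeinterface IntSupplier.getAsInt
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 27
    }
}
```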

Page 37: JVM Internals by Douglas Hawkins

New Object

BigDecimal num = new BigDecimal("2.0");

0 new BigDecimal
3 dup
4 ldc “2.0”
6 invokespecial BigDecimal.<init>
9 astore_0

Local variable slots: 0 num, 1 2 3

Now, let’s look at an object allocation.
The first step is to allocate an object with “new”; however, this step does not yet invoke the constructor. It just allocates space on the heap for the object and returns a pointer to uninitialized memory.
Unfortunately, since invoking the constructor will consume a reference to the newly allocated BigDecimal, we need to make a copy (“dup”) so that we’ll have a reference left to store into “num”.
Next, we push “2.0” onto the stack.
Then we invoke BigDecimal.<init>, which is the BigDecimal constructor. It consumes the pointer to “2.0” and the duplicate reference, leaving us with one reference to assign into “num”.

As you can see, construction is rather complicated. Some of the past security holes in the byte code verifier involved object construction because the sequence is non-trivial.
CLR learned from this and has a single “new” instruction that both allocates and invokes the constructor, thus making byte code verification easier.

From this example, you can also see why double-checked locking is broken in Java. Construction isn’t a single step, and with reordering it is possible for a pointer to an uninitialized object to be assigned to a field.
Since Java 5, the use of volatile guarantees a “happens-before”, so the field will never be visibly assigned before the constructor is done being invoked.

Page 38: JVM Internals by Douglas Hawkins

New Object

Page 39: JVM Internals by Douglas Hawkins

New Object

Page 40: JVM Internals by Douglas Hawkins

New Object

Page 41: JVM Internals by Douglas Hawkins

New Object
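For reference, this is roughly what the volatile fix for double-checked locking looks like in source form. It is a common textbook pattern rather than code from the talk; the class name LazyConstant and its value field are made up for illustration, and it relies on the Java 5 memory model.

```java
import java.math.BigDecimal;

// Hypothetical lazily initialized singleton showing the volatile-based fix.
class LazyConstant {
    // volatile: the write of the reference happens-before any later read of it,
    // so no thread can observe a reference to a partially constructed object.
    private static volatile LazyConstant instance;

    private final BigDecimal value;

    private LazyConstant() {
        // new + dup + invokespecial: allocation and construction are separate steps.
        this.value = new BigDecimal("2.0");
    }

    public static LazyConstant getInstance() {
        LazyConstant local = instance;      // first check, without locking
        if (local == null) {
            synchronized (LazyConstant.class) {
                local = instance;           // second check, under the lock
                if (local == null) {
                    local = new LazyConstant();
                    instance = local;       // published only after construction
                }
            }
        }
        return local;
    }

    public BigDecimal value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // true
        System.out.println(getInstance().value());          // 2.0
    }
}
```

Without the volatile modifier, the assignment to instance could become visible to another thread before the constructor's writes do, which is exactly the failure described in the notes above.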

Page 42: JVM Internals by Douglas Hawkins

Demo

javap -c

Page 43: JVM Internals by Douglas Hawkins

Conditionals

Original:
    if ( x > 0 ) {
        return true;
    } else {
        return false;
    }
Byte Code:
    0: iload_0
    1: ifle 6
    4: iconst_1
    5: ireturn
    6: iconst_0
    7: ireturn

Original:
    return x > 0 ? true : false;
Byte Code:
    0: iload_0
    1: ifle 8
    4: iconst_1
    5: goto 9
    8: iconst_0
    9: ireturn

Original:
    return ( x > 0 );
Byte Code:
    0: iload_0
    1: ifle 8
    4: iconst_1
    5: goto 9
    8: iconst_0
    9: ireturn

Here are three ways to write a method that checks whether a number is greater than 0.
The byte code is almost the same in all three cases.
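If you want to reproduce the comparison yourself, a small class like the following can be compiled and then disassembled with javap -c. The class and method names are assumptions chosen for this sketch.

```java
/** Illustrative only: the three conditional styles from the slide as separate methods. */
final class Conditionals {
    static boolean ifElse(int x) {
        if (x > 0) {
            return true;
        } else {
            return false;
        }
    }

    static boolean ternary(int x) {
        return x > 0 ? true : false;
    }

    static boolean direct(int x) {
        return (x > 0);
    }

    public static void main(String[] args) {
        // All three methods agree for negative, zero, and positive inputs.
        for (int x = -1; x <= 1; x++) {
            System.out.println(x + ": " + ifElse(x) + " " + ternary(x) + " " + direct(x));
        }
    }
}
```

After compiling, javap -c Conditionals lists the byte code of each method so the three can be compared side by side.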

Page 44: JVM Internals by Douglas Hawkins

Invoke Static

Original:
    Math.max(10, 20);
Decompiled:
    0: bipush 10
    2: bipush 20
    4: invokestatic Math.max
    7: pop
    8: return

Here, we see an extra pop after the invokestatic call.
That’s because the return value of max is left on the stack; since we don’t use it, the compiler generates a pop to discard it.
If we stored the value in a variable, the pop would be replaced with an istore.

Page 45: JVM Internals by Douglas Hawkins

Invocations

Original:
    FileInputStream in = new FileInputStream("foo");
    in.close();
Decompiled:
    0: new FileInputStream
    3: dup
    4: ldc "foo"
    6: invokespecial FileInputStream.<init>
    9: astore_0
    10: aload_0
    11: invokevirtual FileInputStream.close
    14: return

Original:
    Closeable in = new FileInputStream("foo");
    in.close();
Decompiled:
    0: new FileInputStream
    3: dup
    4: ldc "foo"
    6: invokespecial FileInputStream.<init>
    9: astore_0
    10: aload_0
    11: invokeinterface Closeable.close
    16: return

In one example, close is called on a class type, FileInputStream; in the other, it is called on an interface type, Closeable.
In the first case, the compiler generates an invokevirtual call.
In the second case, the compiler generates an invokeinterface call.

Page 46: JVM Internals by Douglas Hawkins

For Loop

    static int sum( int min, int max ) {
        int sum = 0;
        for ( int i = min; i < max; ++i ) {
            sum += i;
        }
        return sum;
    }

before loop
    0 iconst_0
    1 istore_2
init & test
    2 iload_0
    3 istore_3
    4 goto +10 //14
loop body
    7 iload_2
    8 iload_3
    9 iadd
    10 istore_2
inc
    11 iinc 3 by 1
test
    14 iload_3
    15 iload_1
    16 if_icmplt -9 //7
after loop
    19 iload_2
    20 ireturn

Let’s examine a for loop example.

The first 2 ops are the initialization of “sum”: load 0 and store it in “sum” (slot 2).
The next 3 ops are the loop initialization and the jump to the initial test...
- load the value of “min” (slot 0) into “i” (slot 3)
- then jump to the test
The test is placed at the end since it is generally performed after the body and step portions of the loop.
The test...
- loads “i” (slot 3) and “max” (slot 1)
- if “i” is less than “max”, it jumps back 9 bytes to the start of the loop body
The loop body...
- loads and adds “sum” and “i” (slots 2 and 3) and stores the result back into “sum” (slot 2)
Then the step / increment part of the loop happens...
- which just increments “i”
Then we flow straight into the test portion.
If the test fails, we flow through to the after-loop portion.
Here, we load “sum” (slot 2) and return the result.

Page 47: JVM Internals by Douglas Hawkins

Exception Handling

    static int read( InputStream in ) {
        try {
            return in.read();
        } catch ( IOException e ) {
            return -1;
        } finally {
            IoUtils.closeQuietly( in );
        }
    }

Exception Table
    start  end  handler  Exception
    0      5    11       IOException
    0      5    18       any
    11     12   18       any

try / finally
    0 aload_0
    1 invokevirtual InputStream.read
    4 istore_1
    5 aload_0
    6 invokestatic IoUtils.closeQuietly
    9 iload_1
    10 ireturn
catch / finally
    11 pop
    12 aload_0
    13 invokestatic IoUtils.closeQuietly
    16 iconst_m1
    17 ireturn
finally
    18 astore_2
    19 aload_0
    20 invokestatic IoUtils.closeQuietly
    23 aload_2
    24 athrow

Now, exception handling...
Exceptions are handled through extra meta-information that says how to handle different types of exceptions over a range of byte code instructions.

The finally portion is inlined in the try, catch, and finally portions of the generated byte code.
(Prior to Java 6, the regular javac compiler generated “jsr” and “ret” to jump to a single block of compiled “finally” code.)

The “try / finally” section represents the normal flow.
- invoke InputStream.read
- store the result into an unnamed temporary variable (slot 1) because we need to run the finally code
- run the finally code
- reload the temporary variable and return

The “catch / finally” section is the catching of the IOException...
The exception table says that if an IOException is raised between instructions 0 and 5 (the try), jump to 11, this catch section.
The first step is to “pop”. Pop what? In this case, the IOException, which was automatically placed on the stack. Since we don’t use it, we discard it. This implies that “e” is never assigned a local variable slot by the compiler.
Now, invoke IoUtils.closeQuietly (the finally block), then return -1.

Despite inlining the “finally” in the “try” and the “catch”, we still need to generate code for a stand-alone finally in the event of unchecked Throwables being raised in the try or the catch. See the two “any” exception table entries.
Here we see another store of a temporary variable, in this case the exception, which must still be propagated.
Do the finally portion, load the temporary, and rethrow.

http://cliffhacks.blogspot.com/2008/02/java-6-tryfinally-compilation-without.html
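The inlining described above can be approximated in plain Java by copying the finally body into each exit path by hand. This is a rough sketch, not the compiler's exact output: the class name InlinedFinally and the local closeQuietly helper (a stand-in for the slide's IoUtils.closeQuietly) are assumptions, and the hand-written version does not cover exceptions thrown from inside the catch block.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only: a try/catch/finally next to a hand-inlined equivalent. */
final class InlinedFinally {
    /** Hypothetical stand-in for the slide's IoUtils.closeQuietly. */
    static void closeQuietly(InputStream in) {
        try {
            in.close();
        } catch (IOException ignored) {
            // deliberately swallowed, as "quietly" implies
        }
    }

    /** Source form, as on the slide. */
    static int read(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            return -1;
        } finally {
            closeQuietly(in);
        }
    }

    /** Approximation of the generated code: the finally body is copied into each path. */
    static int readInlined(InputStream in) {
        int result;
        try {
            result = in.read();   // try body; the result is parked in a temporary
        } catch (IOException e) {
            closeQuietly(in);     // finally copy on the catch path
            return -1;
        } catch (Throwable t) {
            closeQuietly(in);     // stand-alone finally for anything else
            throw t;              // rethrow the saved exception
        }
        closeQuietly(in);         // finally copy on the normal path
        return result;
    }

    public static void main(String[] args) {
        System.out.println(read(new ByteArrayInputStream(new byte[] {42})));
        System.out.println(readInlined(new ByteArrayInputStream(new byte[] {42})));
    }
}
```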

Page 48: JVM Internals by Douglas Hawkins

Synchronization

    void inc() {
        synchronized ( this ) {
            ++this.num;
        }
    }

Exception Table
    start  end  handler  Exception
    4      16   19       any
    19     21   19       any

before try
    0 aload_0
    1 dup
    2 astore_1
    3 monitorenter
try / finally
    4 aload_0
    5 dup
    6 getfield Counter.num
    9 iconst_1
    10 iadd
    11 putfield Counter.num
    14 aload_1
    15 monitorexit
    16 goto +6 //22
finally
    19 aload_1
    20 monitorexit
    21 athrow

    22 return

Interestingly enough, synchronization works the same way.
To understand synchronization, it is better to look at it as a lock and unlock within a try / finally.
And that’s exactly how the byte code works.
And, just like a regular try / finally, the finally is inlined in both the normal path and the exception path.
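The lock / try / finally view can be written as runnable Java by using an explicit lock in place of the object's monitor. In this sketch the class name LockedCounter and the use of ReentrantLock are assumptions for illustration; the JVM uses monitorenter and monitorexit on the object itself rather than a separate lock object.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: a synchronized block next to its lock / try / finally shape. */
final class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock(); // stand-in for the monitor
    private int viaMonitor;
    private int viaLock;

    /** Source form: the compiler emits monitorenter / monitorexit around the body. */
    void incSynchronized() {
        synchronized (this) {
            ++viaMonitor;
        }
    }

    /** Same shape written out by hand. */
    void incExplicit() {
        lock.lock();            // corresponds to monitorenter
        try {
            ++viaLock;
        } finally {
            lock.unlock();      // corresponds to monitorexit, on both paths
        }
    }

    int viaMonitor() {
        synchronized (this) {
            return viaMonitor;
        }
    }

    int viaLock() {
        lock.lock();
        try {
            return viaLock;
        } finally {
            lock.unlock();
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        LockedCounter counter = new LockedCounter();
        for (int i = 0; i < 1000; i++) {
            counter.incSynchronized();
            counter.incExplicit();
        }
        System.out.println(counter.viaMonitor() + " " + counter.viaLock());
    }
}
```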

Page 49: JVM Internals by Douglas Hawkins

Synchronization

    void inc() {
        lock( this );
        try {
            ++this.num;
        } finally {
            unlock( this );
        }
    }

The exception table and byte code are the same as on the previous slide.

Page 50: JVM Internals by Douglas Hawkins

Demo

Java 5

Java 7

In these demos, I demonstrate new language features by showing Java 5 and Java 7 code and then showing what it looks like when it’s decompiled back into Java 4 code.
JAD - http://www.varaneckas.com/jad

Page 51: JVM Internals by Douglas Hawkins

Java 5

Page 52: JVM Internals by Douglas Hawkins

Auto-Boxing

Original:
    public class AutoBoxing {
        public static void main(String[] args) {
            Integer foo = 20;
            Integer bar = 30;
            int sum = foo + bar;
            System.out.println(sum);
        }
    }

Decompiled as Java 4:
    public class AutoBoxing {
        public static void main(String args[]) {
            Integer foo = Integer.valueOf(20);
            Integer bar = Integer.valueOf(30);
            int sum = foo.intValue() + bar.intValue();
            System.out.println(sum);
        }
    }

Here, we see how auto-boxing works.
The compiler injects the necessary calls to Integer.valueOf and Integer.intValue for us.
NOTE: Even if you don’t like auto-boxing, please call Integer.valueOf rather than calling new Integer.
Unlike new, Integer.valueOf returns cached instances of Integer for commonly used values.
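The caching behavior is easy to observe. The sketch below only checks values in the range -128 to 127, which is the range the Integer.valueOf documentation says is always cached; the class and method names are assumptions for illustration.

```java
/** Illustrative only: Integer.valueOf hands back cached instances for small values. */
final class BoxingCache {
    /** True when two valueOf calls for the same value return the same object. */
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    /** The slide's example: boxing on assignment, unboxing for the addition. */
    static int boxedSum() {
        Integer foo = 20;       // compiled as Integer.valueOf(20)
        Integer bar = 30;       // compiled as Integer.valueOf(30)
        return foo + bar;       // compiled as foo.intValue() + bar.intValue()
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(20));
        System.out.println(boxedSum());
    }
}
```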

Page 53: JVM Internals by Douglas Hawkins

Enhanced For

Original:
    public class EnhancedFor {
        static void array(String[] args) {
            for ( String arg : args ) {
                System.out.println(arg);
            }
        }

        static void iterable( Iterable<String> args) {
            for ( String arg : args ) {
                System.out.println(arg);
            }
        }
    }

Decompiled as Java 4:
    public class EnhancedFor {
        static void array(String args[]) {
            String arr$[] = args;
            int len$ = arr$.length;
            for (int i$ = 0; i$ < len$; i$++) {
                String arg = arr$[i$];
                System.out.println(arg);
            }
        }

        static void iterable(Iterable args) {
            String arg;
            for (Iterator i$ = args.iterator(); i$.hasNext(); ) {
                arg = (String) i$.next();
                System.out.println(arg);
            }
        }
    }

In this slide, we see how the enhanced for gets handled by the compiler.

The array for loop converts to the canonical C-style loop, with the one slight difference that the compiler hoists the invariant array length out of the loop. (This is a rather pointless optimization, though, because the JVM would do this at runtime anyway.)

For an Iterable, a loop that uses an iterator is generated. In this example, we can also see that the compiler injects a cast to the exact type, String.

Page 54: JVM Internals by Douglas Hawkins

Var-Args

Original:
    public final class VarArgs {
        public static void main(String... args) {
            System.out.printf( "Hello %s %s", "Jon", "Doe");
        }
    }

Decompiled as Java 4:
    public final class VarArgs {
        public static transient void main( String[] args) {
            System.out.printf( "Hello %s %s", new Object[] {"Jon", "Doe"});
        }
    }

In this example, we see var-args being used both in the signature and in the call to printf.
NOTE: I’ve declared a main method with var-args. Since on a byte code level this is still just a String[], this actually works just fine.

The “transient” modifier in the decompiled Java 4 is a bit amusing. This happens because Java ran out of flag bits to use in Java 5, so they overloaded the “transient” bit, which otherwise only applies to fields, to mean “var-args” when applied to methods.

In the call to printf, we can see that the compiler injects a construction of a new Object[] and passes it as the last arg to printf.
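Both points can be checked from plain Java. In this sketch the class and method names are assumptions; the reflection calls used (Method.isVarArgs and the Modifier.TRANSIENT constant) are standard library API.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Illustrative only: var-args is an array parameter plus a flag bit on the method. */
final class VarArgsDemo {
    static int count(String... args) {
        return args.length;
    }

    /** Asks reflection whether count was declared with var-args. */
    static boolean declaredAsVarArgs() {
        try {
            // At the byte code level the parameter type is simply String[].
            Method m = VarArgsDemo.class.getDeclaredMethod("count", String[].class);
            return m.isVarArgs();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The two calls are equivalent; the compiler builds the array in the first one.
        System.out.println(count("Jon", "Doe"));
        System.out.println(count(new String[] {"Jon", "Doe"}));
        System.out.println(declaredAsVarArgs());
        // The var-args method flag shares the bit value 0x0080 with transient.
        System.out.println(Integer.toHexString(Modifier.TRANSIENT));
    }
}
```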

Page 55: JVM Internals by Douglas Hawkins

Enum

Original:
    public enum AnEnum { FOO, BAR, QUUX }

Decompiled as Java 4:
    public static final class AnEnum extends Enum {
        public static final AnEnum FOO = new AnEnum("FOO", 0);
        public static final AnEnum BAR = new AnEnum("BAR", 1);
        public static final AnEnum QUUX = new AnEnum("QUUX", 2);
        private static final AnEnum[] $VALUES = new AnEnum[]{FOO, BAR, QUUX};

        public static AnEnum[] values() {
            return (AnEnum[]) $VALUES.clone();
        }

        public static AnEnum valueOf(String name) {
            return (AnEnum) Enum.valueOf(AnEnum.class, name);
        }

        private AnEnum(String s, int i) {
            super(s, i);
        }
    }

For enums, the compiler does a great deal of work on your behalf, even in the simplest case.
The compiler generates a constructor that takes a label and an ordinal for each entry.
It then initializes a static final field for each constant from the original file.
These constants are all placed in the $VALUES array.
Finally, the compiler generates a values() method and a valueOf() method for each enum class.
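One visible consequence of the generated values() method is that every call clones the array. The following sketch checks that with javac-compiled code; the wrapper class name EnumDemo is an assumption.

```java
/** Illustrative only: observing the compiler-generated enum members. */
final class EnumDemo {
    enum AnEnum { FOO, BAR, QUUX }

    /** True when values() returns a fresh array per call, as the clone() implies. */
    static boolean valuesIsCopied() {
        return AnEnum.values() != AnEnum.values();
    }

    public static void main(String[] args) {
        System.out.println(valuesIsCopied());
        // The generated valueOf delegates to Enum.valueOf with the enum's class.
        System.out.println(AnEnum.valueOf("BAR") == Enum.valueOf(AnEnum.class, "BAR"));
        System.out.println(AnEnum.QUUX.ordinal());
    }
}
```

Because of the clone, calling values() inside a hot loop allocates each time, so keeping a local copy of the array is a common habit.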

Page 56: JVM Internals by Douglas Hawkins

Covariance

Original:
    public interface Parent {
        Number calculate();
    }

    public class CovariantChild implements Parent {
        public Integer calculate() {
            return 10;
        }
    }

Decompiled as Java 4:
    public static interface Parent {
        public abstract Number calculate();
    }

    public class CovariantChild implements Parent {
        public Integer calculate() {
            return Integer.valueOf(10);
        }

        public volatile Number calculate() {
            return calculate();
        }
    }

A lesser known addition to Java 5 is the ability to have a covariant return type.
Here, the child type returns a more specific type of Number, namely Integer.

The generated code is interesting. We end up with two “calculate” methods: one that returns Integer and another that returns Number. The one that returns Number satisfies the contract of the parent and simply calls the more specific version that returns Integer.

Here, again, we see a curious modifier on a method: “volatile”. This is another situation where Java 5 overloaded an existing flag bit.

For more information on why this is type-safe, look up the Liskov Substitution Principle.
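The extra method can be seen without a decompiler by asking reflection. This is a small sketch with assumed class names; Method.isBridge is standard library API and reports the compiler-generated forwarding method.

```java
import java.lang.reflect.Method;

/** Illustrative only: finding the bridge method behind a covariant return type. */
final class BridgeDemo {
    interface Parent {
        Number calculate();
    }

    static class CovariantChild implements Parent {
        @Override
        public Integer calculate() {
            return 10;
        }
    }

    /** Counts the declared calculate() methods, optionally only the bridge ones. */
    static int countCalculate(boolean bridgesOnly) {
        int count = 0;
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            if (m.getName().equals("calculate") && (!bridgesOnly || m.isBridge())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countCalculate(false)); // both calculate() methods
        System.out.println(countCalculate(true));  // only the Number-returning bridge
    }
}
```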

Page 57: JVM Internals by Douglas Hawkins

Java 7

Page 58: JVM Internals by Douglas Hawkins

Multi-Catch

Original:
    public final class EnhancedCatch {
        public static void main(String[] args) {
            try {
                Class.forName("some.package.SomeClass").newInstance();
            } catch ( InstantiationException |
                      IllegalAccessException |
                      ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

Decompiled as Java 4:
    public final class EnhancedCatch {
        public static void main(String args[]) {
            try {
                Class.forName("some.package.SomeClass").newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

Java 7 adds the ability to handle multiple exception types in a single catch.
Great for ugly reflection code.
Here, in the decompiled output, the catch of all the reflection exceptions shows up as a single catch of their common parent, ReflectiveOperationException (a new base class for reflection exceptions also introduced in Java 7).

Page 59: JVM Internals by Douglas Hawkins

Try With Resources

Original:
    public class EnhancedTry {
        public static void main(String[] args) throws IOException {
            Properties properties = new Properties();
            try (InputStream in = new FileInputStream("my.properties")) {
                properties.load(in);
            }
        }
    }

Decompiled:
    public class EnhancedTry {
        public static void main(String args[]) throws IOException {
            Properties properties = new Properties();
            InputStream in = new FileInputStream("my.properties");
            Throwable throwable = null;
            try {
                properties.load(in);
            } catch (Throwable throwable1) {
                throwable = throwable1;
            } finally {
                if (in != null) {
                    try {
                        in.close();
                    } catch (Throwable x2) {
                        throwable.addSuppressed(x2);
                        throw throwable;
                    }
                }
            }
        }
    }

Java 7 also enhances try by allowing it to automatically close resources.
It generates a try / finally similar to what you’d write by hand, although it puts the resource acquisition outside the try (which is correct but uncommon among many Java programmers).
However, it does one more thing: it also adds code so that if an exception happens when closing, the original exception from the body is still propagated. And, even better, the exception raised by close is added to the suppressed list of the original exception using the new Java 7 method Throwable.addSuppressed.
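The suppressed-exception behavior is simple to demonstrate with a resource whose close always fails. The class names and messages in this sketch are assumptions for illustration.

```java
/** Illustrative only: the body's exception wins, the close failure is suppressed. */
final class SuppressedDemo {
    /** A resource whose close() always fails. */
    static final class Noisy implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    /** Runs a failing body inside try-with-resources and returns what propagated. */
    static RuntimeException run() {
        RuntimeException seen = null;
        try (Noisy resource = new Noisy()) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            seen = e;
        }
        return seen;
    }

    public static void main(String[] args) {
        RuntimeException e = run();
        System.out.println(e.getMessage());
        for (Throwable suppressed : e.getSuppressed()) {
            System.out.println("suppressed: " + suppressed.getMessage());
        }
    }
}
```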

Page 60: JVM Internals by Douglas Hawkins

String Switch

Original:
    switch (args[0]) {
        case "Hello":
            System.out.println("Hello, World!");
            break;
        case "Bye":
            System.out.println("Good Bye, World!");
            break;
        case "9\uffe7":
            System.out.println("Collision");
            break;
    }

Decompiled:
    String s = args[0];
    byte byte0 = -1;
    switch (s.hashCode()) {
        case 69609650:
            ...
            break;
        case 67278:
            if (s.equals("9\uFFE7")) {
                byte0 = 2;
            } else if (s.equals("Bye")) {
                byte0 = 1;
            }
            break;
    }
    switch (byte0) {
        case 0:
            System.out.println("Hello, World!");
            break;
        case 1:
            System.out.println("Good Bye, World!");
            break;
        case 2:
            System.out.println("Collision");
            break;
    }

One last example from Java 7: string switch.

String switch is implemented as a switch on the String’s hashCode.
However, hashCode is not unique, so the generated code must also perform an equals check.

To handle this, string switch actually generates two switch statements.
The first, on the hashCode, assigns a temporary variable an index that identifies the matching case from the original code.

Then the second switches on that index, with each case containing the code from the original Java 7 case.
Here, I’ve deliberately created a hash collision, so you can see how collisions are resolved.
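The collision on the slide can be verified directly: both strings hash to 67278 under String.hashCode, yet the switch still picks the right branch because of the equals check. The class and method names below are assumptions for illustration.

```java
/** Illustrative only: a string switch whose cases include a hash collision. */
final class StringSwitchDemo {
    static String respond(String arg) {
        switch (arg) {
            case "Hello":
                return "Hello, World!";
            case "Bye":
                return "Good Bye, World!";
            case "9\uffe7":                 // same hashCode as "Bye"
                return "Collision";
            default:
                return "?";
        }
    }

    public static void main(String[] args) {
        System.out.println("Bye".hashCode());
        System.out.println("9\uffe7".hashCode());
        System.out.println(respond("Bye"));
        System.out.println(respond("9\uffe7"));
    }
}
```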

Page 61: JVM Internals by Douglas Hawkins

Compiler Optimizations

In the next few examples, I show the original code and the code after it has been decompiled.
By doing this, we can see some of the optimizations performed by the compiler.
JAD - http://www.varaneckas.com/jad

Page 62: JVM Internals by Douglas Hawkins

Constant Folding

Original:
    public final class StaticInitializer {
        private static final String LOG_FORMAT = "Started at %d ms";

        private static final long START_TIME = System.currentTimeMillis();

        private static final long START_TIME_2;
        static {
            START_TIME_2 = System.currentTimeMillis();
        }
    }

Decompiled:
    public final class StaticInitializer {
        private static final String LOG_FORMAT = "Started at %d ms";

        private static final long START_TIME = System.currentTimeMillis();

        private static final long START_TIME_2 = System.currentTimeMillis();
    }

While modern Java compilers don’t do much optimization, they do some.
One example is constant folding: when possible, the compiler computes simple constant expressions at compile time.
This even includes string concatenation.
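String folding can be observed through identity comparison, since compile-time constant strings are interned while strings concatenated at runtime are newly created. The class, field, and method names in this sketch are assumptions, and the == comparisons are there only to expose the folding, not as a recommended way to compare strings.

```java
/** Illustrative only: compile-time constant folding, including string concatenation. */
final class FoldingDemo {
    static final String PREFIX = "Started at ";
    static final String LOG_FORMAT = PREFIX + "%d ms";   // folded into one literal by javac
    static final int SECONDS_PER_DAY = 60 * 60 * 24;     // folded into a single int constant

    /** True because the folded constant is the same interned literal. */
    static boolean foldedAtCompileTime() {
        return LOG_FORMAT == "Started at %d ms";
    }

    /** False because this concatenation happens at runtime and creates a new String. */
    static boolean sameWhenBuiltAtRuntime() {
        String prefix = new String("Started at ");       // not a constant expression
        return (prefix + "%d ms") == "Started at %d ms";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());
        System.out.println(sameWhenBuiltAtRuntime());
        System.out.println(SECONDS_PER_DAY);
    }
}
```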

Page 63: JVM Internals by Douglas Hawkins

Constant Inlining

Original:
    public class Inlining {
        public static final String INLINED_VERSION = "1.1.0";
        public static final String NOT_INLINED_VERSION = identity("1.2.0");

        private static String identity( String value) {
            return value;
        }

        public static void print() {
            System.out.println(INLINED_VERSION);
            System.out.println(NOT_INLINED_VERSION);
        }
    }

Decompiled:
    public class Inlining {
        public static final String INLINED_VERSION = "1.1.0";
        public static final String NOT_INLINED_VERSION = identity("1.2.0");

        private static String identity( String value) {
            return value;
        }

        public static void print() {
            System.out.println("1.1.0");
            System.out.println(NOT_INLINED_VERSION);
        }
    }

Constants can also be inlined by the compiler.
In this example, the compiler inlines INLINED_VERSION in the print method; however, it does not inline NOT_INLINED_VERSION.
The reason is that NOT_INLINED_VERSION is initialized by a complex expression, because a method is invoked.

This has implications in the byte code, too.
INLINED_VERSION will have its value set through a ConstantValue attribute.
NOT_INLINED_VERSION will be initialized in a <clinit> method generated by the compiler and called automatically when the class is first initialized.

Page 64: JVM Internals by Douglas Hawkins

Dead Code Elimination

Original:
    public class DeadCodeElimination {
        public static final boolean DEBUG_OFF = false;
        public static final boolean DEBUG_ON = true;

        public static void main(String[] args) {
            if ( DEBUG_OFF ) {
                System.out.println("never");
            }
            if ( DEBUG_ON ) {
                System.out.println("always");
            }
        }
    }

Decompiled:
    public class DeadCodeElimination {
        public static final boolean DEBUG_OFF = false;
        public static final boolean DEBUG_ON = true;

        public static void main(String args[]) {
            System.out.println("always");
        }
    }

Along with inlining, the compiler can perform dead code elimination.
In this case, DEBUG_OFF is never true, so the “never” printout is not generated by the compiler.
Even in the DEBUG_ON case, the compiler realizes the if is always true and simply includes an unconditional print of “always”.

Page 65: JVM Internals by Douglas Hawkins

Runtime Optimizations

Page 66: JVM Internals by Douglas Hawkins

HotSpot Lifecycle

1. Interpreted
2. Profiling
3. Dynamic Compilation
4. Dynamic Decompilation

Client compilation kicks in at invocation 3000.
Server compilation kicks in at invocation 10000.
Tiered compilation - C0, C1, C2
Method Replacement vs. On-Stack Replacement

http://java.sun.com/products/hotspot/whitepaper.html
http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
http://www.azulsystems.com/blog/cliff-click/2010-07-16-tiered-compilation
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation

Page 67: JVM Internals by Douglas Hawkins

Is This Optimized?

    double sumU = 0, sumV = 0;
    for ( int i = 0; i < 100; ++i ) {
        Vector2D vector = new Vector2D( i, i );
        synchronized ( vector ) {
            sumU += vector.getU();
            sumV += vector.getV();
        }
    }

How many...?
    Loop Iterations      100
    Heap Allocations     100
    Method Invocations   200
    Lock Acquisitions    100

Let’s start the runtime optimization discussion with a simple question.
Is this optimized?
How many loop iterations does it do? 100
How many heap allocations? 100
How many method invocations? 200
How many lock acquisitions? 100
Surprisingly enough, the answer to all of these may actually be zero.

Page 68: JVM Internals by Douglas Hawkins

Is This Optimized?

How many...?
    Loop Iterations      0
    Heap Allocations     0
    Method Invocations   0
    Lock Acquisitions    0

Page 69: JVM Internals by Douglas Hawkins

Common Sub-Expression Elimination

    int x = a + b;
    int y = a + b;

    int tmp = a + b;
    int x = tmp;
    int y = tmp;

Among the simplest optimizations is common sub-expression elimination.
Here the VM optimizes the code by only performing the calculation of “a+b” once.
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation

Page 70: JVM Internals by Douglas Hawkins

Array Bounds Check Elimination

    int[] nums = ...
    for ( int i = 0; i < nums.length; ++i ) {
        System.out.println( "nums[" + i + "]=" + nums[ i ] );
    }

    int[] nums = ...
    for ( int i = 0; i < nums.length; ++i ) {
        if ( i < 0 || i >= nums.length ) {
            throw new ArrayIndexOutOfBoundsException();
        }
        System.out.println( "nums[" + i + "]=" + nums[ i ] );
    }

One of the nice things about the VM is that we do not have to worry about buffer overruns because the VM checks array bounds for us, but how much is that costing us?
In short, nothing. The VM recognizes common patterns and realizes that it does not need to generate the bounds checking code.
http://www.cs.umd.edu/~vibha/330/array-bounds.pdf

Page 71: JVM Internals by Douglas Hawkins

Loop Invariant Hoisting

    for ( int i = 0; i < nums.length; ++i ) {
        ...
    }

    int length = nums.length;
    for ( int i = 0; i < length; ++i ) {
        ...
    }

The VM can also realize that the length of the array does not change, so it can replace looking up the length of the array on each test with a single store into a temporary variable and compare against that instead.
http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_4.html

Page 72: JVM Internals by Douglas Hawkins

Loop Unrolling

    int sum = 0;
    for ( int i = 0; i < 10; ++i ) {
        sum += i;
    }

    int sum = 0;
    sum += 1;
    ...
    sum += 9;

In some situations, the loop can even be unrolled into a simple linear code segment.

Page 73: JVM Internals by Douglas Hawkins

Method Inlining

    Vector vector = ...
    double magnitude = vector.magnitude();

    Vector vector = ...
    double magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v );

    Vector vector = ...
    double magnitude;
    if ( vector instanceof Vector2D ) {
        magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v );
    } else {
        magnitude = vector.magnitude();
    }

    static      always
    final       always
    private     always
    virtual     often
    reflective  sometimes
    dynamic     often

http://www.ibm.com/developerworks/library/j-jtp12214/
http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
http://blog.headius.com/2009/01/my-favorite-hotspot-jvm-flags.html
http://java.sun.com/developer/technicalArticles/Networking/HotSpot/inlining.html

Page 74: JVM Internals by Douglas Hawkins

Lock Coarsening

    StringBuffer buffer = ...
    buffer.append( "Hello" );
    buffer.append( name );
    buffer.append( "\n" );

    StringBuffer buffer = ...
    lock( buffer ); buffer.append( "Hello" ); unlock( buffer );
    lock( buffer ); buffer.append( name ); unlock( buffer );
    lock( buffer ); buffer.append( "\n" ); unlock( buffer );

    StringBuffer buffer = ...
    lock( buffer );
    buffer.append( "Hello" );
    buffer.append( name );
    buffer.append( "\n" );
    unlock( buffer );

Starting in Java 5, HotSpot optimizes locks by performing lock coarsening.
The VM realizes that constantly acquiring and releasing the same lock is not performant, so it may take a single larger lock instead.
http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1

Page 75: JVM Internals by Douglas Hawkins

Other Lock Optimizations

Biased Locking

Adaptive Locking - Thread sleep vs. Spin lock

And even more lock optimizations are possible...
- biased locking - makes it cheap for the last thread that acquired a lock to acquire it again
- adaptive locking - dynamically detects whether a lock is usually held for a short or long period
  - if it is long, the thread is put to sleep
  - if it is short, the thread will simply spin
http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1

Page 76: JVM Internals by Douglas Hawkins

Escape Analysis

    Point p1 = new Point( x1, y1 ),
          p2 = new Point( x2, y2 );

    double dx, dy;
    synchronized ( p2 ) {
        dx = p1.getX() - p2.getX();
        synchronized ( p1 ) {
            dy = p1.getY() - p2.getY();
        }
    }

    double distance = Math.sqrt( dx*dx + dy*dy );

In Java 7, escape analysis is finally on by default.
With escape analysis, the VM can realize that an object never escapes a stack frame, allowing it to...
- elide heap allocation
- elide locks

Page 77: JVM Internals by Douglas Hawkins

Escape Analysis

    Point p1 = new Point( x1, y1 ),
          p2 = new Point( x2, y2 );

    double dx = p1.getX() - p2.getX();
    double dy = p1.getY() - p2.getY();
    double distance = Math.sqrt( dx*dx + dy*dy );

Page 78: JVM Internals by Douglas Hawkins

Escape Analysis

    Point p1 = new Point( x1, y1 ),
          p2 = new Point( x2, y2 );

    double dx = p1.getX() - p2.getX();
    double dy = p1.getY() - p2.getY();
    double distance = Math.sqrt( dx*dx + dy*dy );

    double dx = x1 - x2;
    double dy = y1 - y2;
    double distance = Math.sqrt( dx*dx + dy*dy );

Page 79: JVM Internals by Douglas Hawkins

Runtime Demo

http://code.google.com/p/caliper/

To conclude the runtime optimization section, I’ll show some micro-benchmarks illustrating some of the optimizations.
Writing micro-benchmarks for a dynamically optimizing VM is devilishly hard. Fortunately, Google created a tool called Caliper to make it easier. You can write JUnit 3-like benchmark classes to compare various implementation options.
http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
http://code.google.com/p/caliper/

Page 80: JVM Internals by Douglas Hawkins

Loop Variable Placement

Inside
    for ( int i = 0; i < ints.length; ++i ) {
        int x = ints[i];
        sum += x;
    }

vs.

Outside
    int x;
    for ( int i = 0; i < ints.length; ++i ) {
        x = ints[i];
        sum += x;
    }

vs.

No Variable
    for ( int i = 0; i < ints.length; ++i ) {
        sum += ints[i];
    }

First, let’s look at loop variable placement: declaring the loop variable inside the loop vs. outside vs. using no variable at all.
All three take the same amount of time to run. In fact, declaring inside or outside produces the same byte code.

My recommendation...
For a one-line loop body, skip the variable.
For a complicated loop body, declare the variable inside to keep the code easier to read and refactor.

Page 81: JVM Internals by Douglas Hawkins

Loop Invariant Hoisting

Regular For
    for ( int i = 0; i < ints.length; ++i ) {
        sum += ints[i];
    }

vs.

Manual Hoisting
    for ( int i = 0, len = ints.length; i < len; ++i ) {
        sum += ints[i];
    }

vs.

Enhanced For
    for ( int x : ints ) {
        sum += x;
    }

Now, we’ll compare...
- the canonical loop, which checks i against array.length each time in the test
- manually hoisting the length into a len temporary variable
- using Java 5’s enhanced for
Once again, they all take the same amount of time because the VM performs the hoisting for us.
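For reference, here are the three loop forms as complete methods. This sketch only confirms that they compute the same result; it makes no attempt to time them, since a hand-rolled timing loop is easily misled by the optimizations discussed in this section. The class and method names are assumptions.

```java
/** Illustrative only: three equivalent ways to sum an int array. */
final class LoopForms {
    /** Canonical loop: reads ints.length in every test. */
    static long regular(int[] ints) {
        long sum = 0;
        for (int i = 0; i < ints.length; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    /** Length hoisted by hand into a temporary. */
    static long manualHoist(int[] ints) {
        long sum = 0;
        for (int i = 0, len = ints.length; i < len; ++i) {
            sum += ints[i];
        }
        return sum;
    }

    /** Java 5 enhanced for. */
    static long enhanced(int[] ints) {
        long sum = 0;
        for (int x : ints) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3, 4};
        System.out.println(regular(ints) + " " + manualHoist(ints) + " " + enhanced(ints));
    }
}
```

To compare running times, these methods would be the bodies to drop into a benchmarking harness rather than a loop around System.nanoTime.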

Page 82: JVM Internals by Douglas Hawkins

Field Access

Direct
    point.x
    point.y

vs.

Virtual Accessor
    point.getX()
    point.getY()

vs.

Interface Accessor
    point.getX()
    point.getY()

Next, we’ll look at direct field access vs. using a virtual accessor method vs. using an interface accessor method.
Once again, the VM can optimize all of these by performing method inlining, so all three take the same amount of time.

Page 83: JVM Internals by Douglas Hawkins

Lock Coarsening

StringBuilder - no locks
    StringBuilder builder = new StringBuilder();
    builder.append( "foo" );
    builder.append( "bar" );
    builder.append( "baz" );

vs.

StringBuffer - multiple locks
    StringBuffer buffer = new StringBuffer();
    buffer.append( "foo" );
    buffer.append( "bar" );
    buffer.append( "baz" );

vs.

StringBuffer - single lock
    StringBuffer buffer = new StringBuffer();
    synchronized( buffer ) {
        buffer.append( "foo" );
        buffer.append( "bar" );
        buffer.append( "baz" );
    }

Now, revisiting locking - compare...
- Java 5’s StringBuilder, which performs no locking
- plain StringBuffer code, with multiple separate appends
- StringBuffer with a manually added bigger lock

The no-lock version does come out slightly ahead, but it is close.
And the attempt to manually improve performance by taking a bigger single lock actually comes in last.

Page 84: JVM Internals by Douglas Hawkins

Heap Elision Benchmark

Primitive Array
    Arrays.sort(new int[]{...});

vs.

Boxed Array - no Comparator
    Arrays.sort(new Integer[]{...});

vs.

Boxed Array - singleton Comparator
    Arrays.sort( new Integer[]{...}, IntCompator.INSTANCE);

vs.

Boxed Array - anonymous Comparator
    Arrays.sort( new Integer[]{...}, new Comparator<Integer>() { ... });

Lastly, let’s look at heap elision by sorting some arrays.
No surprise, the primitive array is the most performant.
But the no-Comparator case, the singleton Comparator case, and the anonymous Comparator case all perform the same.
Even creating an anonymous Comparator every time does not impact performance much; in Java 7, no heap allocation may take place at all.

Page 85: JVM Internals by Douglas Hawkins

Is This Optimized?

So now, hopefully, you can see how this may truly be optimized already.
Just write clean code and trust the VM to make it fast.
If you must optimize, always profile first and use a micro-benchmarking tool like Caliper.

Page 86: JVM Internals by Douglas Hawkins

Recommended Reading

Java Puzzlers
By Joshua Bloch and Neal Gafter
http://www.javapuzzlers.com/

Java Specialist Newsletter
http://www.javaspecialists.eu

Brian Goetz’s Articles
http://www.ibm.com/developerworks/views/java/libraryview.jsp?contentarea_by=Java+technology&search_by=brian+goetz

Page 87: JVM Internals by Douglas Hawkins

Q&A