patterns for jvm languages - geecon 2014


Upload: jaroslaw-palka

Post on 15-Jun-2015


DESCRIPTION

In recent years we have seen an explosion of languages that run on the Java Virtual Machine. We have also seen existing languages have their implementations rewritten for the JVM. Along with this came rapid development of tools such as parsers and bytecode generators; even inside the JVM we saw initiatives like the Da Vinci Machine Project, which led to invokedynamic in JDK 7, and the more recent Graal and Truffle projects. Is it really hard to write a new programming language that runs on the JVM? Even if you are not going to write your own, I think it is worth understanding how your favorite language runs under the covers, how early decisions can impact language extensibility and performance, and what the JVM itself and the JVM ecosystem have to offer language implementors. During the session I will try to familiarize you with the options you have when choosing parsers and bytecode manipulation libraries, which kind of language implementation to consider, and how to test and tune your "new baby". Will you be able, after this session, to develop a new and shiny language packed with killer features? No. But you will certainly understand the difference between lexers and parsers, how bytecode works, and why invokedynamic, Graal and Truffle are so important to the future of the JVM platform. Will we have time to write a simple, compiled language? Yes, we will, just to show you that even a person who didn't study computer science or compiler theory, and for the majority of his life didn't know what an AST is, can do it :)

TRANSCRIPT

Page 1: Patterns for JVM languages - Geecon 2014

The following presentation is provided under DHMB license (Do Not Hurt My Brain).

Lecturer is not responsible for the financial and moral damages resulting from taking too seriously the content of the presentation.

Including permanent damage to neurons, reduction of neurotransmitter activity at the molecular level and group dismissal.

Page 2: Patterns for JVM languages - Geecon 2014

Patterns for JVM languages

Jarosław Pałka

Page 3: Patterns for JVM languages - Geecon 2014

about me

work://chief_architect@lumesse

owner://symentis.pl

twitter://j_palka

blog://geekyprimitives.wordpress.com

scm:bitbucket://kcrimson

scm:github://kcrimson

Page 4: Patterns for JVM languages - Geecon 2014

Pet project

Page 5: Patterns for JVM languages - Geecon 2014

Because having pet projects is like solving crosswords every day

Or going to a gym

Helps your brain to stay in shape

Page 6: Patterns for JVM languages - Geecon 2014
Page 7: Patterns for JVM languages - Geecon 2014
Page 8: Patterns for JVM languages - Geecon 2014

JVM is

the new

assembler

Page 9: Patterns for JVM languages - Geecon 2014

Translator

Page 10: Patterns for JVM languages - Geecon 2014

source code

↓

target language source code

↓

machine code || interpreter

Page 11: Patterns for JVM languages - Geecon 2014

Interpreter

Page 12: Patterns for JVM languages - Geecon 2014

source code

↓

execution results

Page 13: Patterns for JVM languages - Geecon 2014

Compiler

Page 14: Patterns for JVM languages - Geecon 2014

source code

↓

machine code

Page 15: Patterns for JVM languages - Geecon 2014

So how does it all work?

Page 16: Patterns for JVM languages - Geecon 2014

Diving into the stream

Page 17: Patterns for JVM languages - Geecon 2014

source code

↓ lexer

token stream

↓ parser

IR
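
The first arrow in this pipeline, source code to token stream, can be sketched by hand in a few lines of Java. This is only an illustration of what a lexer does, not the generated lexer used in the talk; the class name MiniLexer and its token kinds are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer: turns characters into NAME, NUMBER and OPERATOR tokens.
final class MiniLexer {

    enum Kind { NAME, NUMBER, OPERATOR }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(Kind.NUMBER, source.substring(start, i)));
            } else if (Character.isLetter(c) || c == '_') {
                int start = i;
                while (i < source.length()
                        && (Character.isLetterOrDigit(source.charAt(i)) || source.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(new Token(Kind.NAME, source.substring(start, i)));
            } else if ("=+-*/".indexOf(c) >= 0) {
                tokens.add(new Token(Kind.OPERATOR, String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : tokenize("x = 1 + y")) {
            System.out.println(token);
        }
    }
}
```

The parser then consumes this list and never looks at raw characters again.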

Page 18: Patterns for JVM languages - Geecon 2014

Intermediate Representation

visitor

listener

tree
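
A minimal sketch of the tree plus visitor combination, assuming a toy expression language with only numbers and addition. All names here (VisitorSketch, NumberNode, AddNode, Printer, Evaluator) are hypothetical; the point is that the tree stays dumb and each visitor adds one operation over it.

```java
// Hypothetical two-node tree with two visitors: one prints, one evaluates.
final class VisitorSketch {

    interface Visitor<R> {
        R visitNumber(NumberNode node);

        R visitAdd(AddNode node);
    }

    interface Node {
        <R> R accept(Visitor<R> visitor);
    }

    static final class NumberNode implements Node {
        final int value;

        NumberNode(int value) {
            this.value = value;
        }

        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.visitNumber(this);
        }
    }

    static final class AddNode implements Node {
        final Node left;
        final Node right;

        AddNode(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.visitAdd(this);
        }
    }

    // Renders the tree back to source-like text.
    static final class Printer implements Visitor<String> {
        @Override
        public String visitNumber(NumberNode node) {
            return Integer.toString(node.value);
        }

        @Override
        public String visitAdd(AddNode node) {
            return "(" + node.left.accept(this) + " + " + node.right.accept(this) + ")";
        }
    }

    // Computes the value of the tree.
    static final class Evaluator implements Visitor<Integer> {
        @Override
        public Integer visitNumber(NumberNode node) {
            return node.value;
        }

        @Override
        public Integer visitAdd(AddNode node) {
            return node.left.accept(this) + node.right.accept(this);
        }
    }

    public static void main(String[] args) {
        Node tree = new AddNode(new NumberNode(1), new NumberNode(2));
        System.out.println(tree.accept(new Printer()) + " = " + tree.accept(new Evaluator()));
    }
}
```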

Page 19: Patterns for JVM languages - Geecon 2014

Syntax tree

Page 20: Patterns for JVM languages - Geecon 2014

local y = 1

x = 1 + y

Page 21: Patterns for JVM languages - Geecon 2014
Page 22: Patterns for JVM languages - Geecon 2014
Page 23: Patterns for JVM languages - Geecon 2014
Page 24: Patterns for JVM languages - Geecon 2014

fragment DIGIT : [0-9];

fragment LETTER : [a-zA-Z];

String : '\'' .*? '\'' | '"' .*? '"';

Number : DIGIT+;

Name : ('_' | LETTER) ('_' | LETTER | DIGIT)*;

COMMENT : '--' .*? '\n' -> skip;

NL : '\r'? '\n' -> skip;

WS : [ \t\f]+ -> skip;

Page 25: Patterns for JVM languages - Geecon 2014

[@0,[0..1)='x',<49>]

[@1,[2..3)='=',<36>]

[@2,[4..5)='1',<48>]

[@3,[6..7)='+',<33>]

[@4,[8..15)='"Hello"',<47>]

[@5,[15..16)=';',<37>]

[@6,[17..17)='',<-1=EOF>]

Page 26: Patterns for JVM languages - Geecon 2014

varlist returns [Varlist result]
    : var { $result = createVarlist($var.result); }
      (',' var { $result = createVarlist($result, $var.result); })*
    ;

//

var returns [Var result]
    : Name { $result = createVariableName($Name.text); }
    | prefixexp '[' exp ']'
    | prefixexp '.' Name
    ;

Page 27: Patterns for JVM languages - Geecon 2014

beware of left recursion
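
Why beware: a rule such as expr : expr '+' term | term, translated naively into a top-down parser, calls itself before consuming any input and never returns. The usual fix is to rewrite it as expr : term ('+' term)* and loop. The sketch below is a hypothetical hand-written parser (not the grammar from the talk) for whitespace-separated integers with + and -. As far as I know, ANTLR 4 rewrites direct left recursion for you but still rejects indirect left recursion, so the warning matters for rules that refer to each other.

```java
// Hypothetical recursive-descent parser; the left-recursive rule is replaced by a loop.
final class LeftRecursionSketch {

    private final String[] tokens;
    private int pos = 0;

    LeftRecursionSketch(String source) {
        this.tokens = source.trim().split("\\s+");
    }

    // expr : term (('+' | '-') term)* ;
    // The loop keeps the operators left-associative without any left recursion.
    int expr() {
        int value = term();
        while (pos < tokens.length && (tokens[pos].equals("+") || tokens[pos].equals("-"))) {
            String operator = tokens[pos++];
            int right = term();
            value = operator.equals("+") ? value + right : value - right;
        }
        return value;
    }

    // term : Number ;
    int term() {
        if (pos >= tokens.length) {
            throw new IllegalStateException("Expected a number but reached end of input");
        }
        return Integer.parseInt(tokens[pos++]);
    }

    static int evaluate(String source) {
        LeftRecursionSketch parser = new LeftRecursionSketch(source);
        int value = parser.expr();
        if (parser.pos != parser.tokens.length) {
            throw new IllegalStateException("Unexpected token: " + parser.tokens[parser.pos]);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("10 - 3 - 2"));
    }
}
```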

Page 28: Patterns for JVM languages - Geecon 2014

translator

interpreter

compiler

Page 29: Patterns for JVM languages - Geecon 2014

Interpreter pattern

Non terminal nodes

Terminal nodes
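
A small sketch of the pattern, assuming an integer-only toy language: terminal nodes (a number, a variable reference) evaluate themselves directly, while non-terminal nodes (addition) first evaluate their children. The names Expr, Num, Ref and Plus are made up for the example and are not the classes from the talk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Interpreter pattern example: every node knows how to evaluate itself.
final class InterpreterPatternSketch {

    interface Expr {
        int evaluate(Map<String, Integer> env);
    }

    // Terminal node: a literal number.
    static final class Num implements Expr {
        final int value;

        Num(int value) {
            this.value = value;
        }

        @Override
        public int evaluate(Map<String, Integer> env) {
            return value;
        }
    }

    // Terminal node: a variable looked up in the environment.
    static final class Ref implements Expr {
        final String name;

        Ref(String name) {
            this.name = name;
        }

        @Override
        public int evaluate(Map<String, Integer> env) {
            Integer value = env.get(name);
            if (value == null) {
                throw new IllegalStateException("Unbound variable: " + name);
            }
            return value;
        }
    }

    // Non-terminal node: evaluates both children, then combines the results.
    static final class Plus implements Expr {
        final Expr left;
        final Expr right;

        Plus(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public int evaluate(Map<String, Integer> env) {
            return left.evaluate(env) + right.evaluate(env);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> env = new HashMap<>();
        env.put("y", 1);
        System.out.println(new Plus(new Num(1), new Ref("y")).evaluate(env));
    }
}
```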

Page 30: Patterns for JVM languages - Geecon 2014
Page 31: Patterns for JVM languages - Geecon 2014

package pl.symentis.lua.grammar;

import pl.symentis.lua.runtime.Context;

public interface Statement {

    void execute(Context context);

}

Page 32: Patterns for JVM languages - Geecon 2014

package pl.symentis.lua.grammar;

import pl.symentis.lua.runtime.Context;

public interface Expression<T> {

    T evaluate(Context ctx);

}

Page 33: Patterns for JVM languages - Geecon 2014

package pl.symentis.lua.runtime;

public interface Context {

    VariableRef resolveVariable(String name);

    Scope enterScope();

    Scope exitScope();

}

Page 34: Patterns for JVM languages - Geecon 2014

package pl.symentis.lua.runtime;

public interface Scope {

    Scope getOuterScope();

    VariableRef resolveVariable(String name);

    void bindVariable(String name, Object value);

}
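
One plausible way to back such a scope is a map of bindings plus a link to the outer scope, walking outwards on lookup. The sketch below is an assumption, not the implementation from the talk: ChainedScope is a made-up class, and it returns the bound value directly (null when nothing is bound) instead of a VariableRef.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scope chain: lookups fall through to the outer scope, bindings stay local.
final class ScopeSketch {

    static final class ChainedScope {
        private final ChainedScope outer;
        private final Map<String, Object> bindings = new HashMap<>();

        ChainedScope(ChainedScope outer) {
            this.outer = outer;
        }

        ChainedScope getOuterScope() {
            return outer;
        }

        void bindVariable(String name, Object value) {
            bindings.put(name, value);
        }

        // Walks from the innermost scope outwards; returns null when the name is unbound.
        Object resolveVariable(String name) {
            for (ChainedScope scope = this; scope != null; scope = scope.outer) {
                if (scope.bindings.containsKey(name)) {
                    return scope.bindings.get(name);
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        ChainedScope global = new ChainedScope(null);
        global.bindVariable("x", 1);

        ChainedScope inner = new ChainedScope(global);
        inner.bindVariable("x", 3); // shadows the global x inside this scope only

        System.out.println(inner.resolveVariable("x") + " " + global.resolveVariable("x"));
    }
}
```

Entering a scope then means creating a new ChainedScope whose outer link is the current one, and exiting means going back to getOuterScope().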

Page 35: Patterns for JVM languages - Geecon 2014

Into the forest of syntax tree

Page 36: Patterns for JVM languages - Geecon 2014

x = 1

Page 37: Patterns for JVM languages - Geecon 2014

public class Literal implements ExpressionNode {

    public final LuaType value;

    public Literal(LuaType value) {
        super();
        this.value = value;
    }

    @Override
    public void accept(ExpressionVisitor visitor) {
        visitor.visit(this);
    }

}

Page 38: Patterns for JVM languages - Geecon 2014

public class VariableName extends Var {

    public final String name;

    public VariableName(String name) {
        this.name = name;
    }

    @Override
    public void accept(ExpressionVisitor visitor) {
        visitor.visit(this);
    }

}

Page 39: Patterns for JVM languages - Geecon 2014

public class Assignment implements StatementNode {

    public final Varlist varlist;

    public final Explist explist;

    public Assignment(Varlist varlist, Explist explist) {
        super();
        this.varlist = varlist;
        this.explist = explist;
    }

    @Override
    public void accept(StatementVisitor visitor) {
        visitor.visit(this);
    }

}

Page 40: Patterns for JVM languages - Geecon 2014

public void visit(Assignment assignment) {
    // resolve the left-hand side into variable references
    InterpreterExpressionVisitor varlist = createExpressionVisitor();
    assignment.varlist.accept(varlist);

    InterpreterExpressionVisitor explist = createExpressionVisitor();
    assignment.explist.accept(explist);

    Iterator<ExpressionNode> iterator = ((List<ExpressionNode>) explist.result()).iterator();

    List<VariableRef> vars = (List<VariableRef>) varlist.result();
    for (VariableRef variableRef : vars) {
        if (iterator.hasNext()) {
            ExpressionNode node = iterator.next();

            InterpreterExpressionVisitor exp = createExpressionVisitor();
            node.accept(exp);

            variableRef.set(exp.result());
        } else {
            // more variables than expressions: the remaining ones become nil
            variableRef.set(LuaNil.Nil);
        }
    }
}

Page 41: Patterns for JVM languages - Geecon 2014

public void visit(Literal literal) {
    result = literal.value;
}

public void visit(VariableName variableName) {
    result = context.resolveVariable(variableName.name);
}

public void visit(Varlist varlist) {
    List<VariableRef> refs = new ArrayList<>();

    for (Var var : varlist.vars) {
        var.accept(this);
        refs.add((VariableRef) result());
    }

    result = refs;
}

Page 42: Patterns for JVM languages - Geecon 2014

translator

interpreter

compiler

Page 43: Patterns for JVM languages - Geecon 2014

compiler → byte code

org.ow2.asm:asm:jar

me.qmx.jitescript:jitescript:jar

Page 44: Patterns for JVM languages - Geecon 2014

JiteClass jiteClass = new JiteClass(classname, new String[] { p(Script.class) });

jiteClass.defineDefaultConstructor();

CodeBlock codeBlock = newCodeBlock();

// bind context into current runtime
codeBlock.aload(1);
codeBlock.invokestatic(p(RuntimeContext.class), "enterContext", sig(void.class, Context.class));

// parse file and start to walk through AST visitor
StatementNode chunk = parseScript(readFileToString(script));

// generate byte code
chunk.accept(new CompilerStatementVisitor(codeBlock, new HashSymbolTable()));

codeBlock.voidreturn();

jiteClass.defineMethod("execute", JiteClass.ACC_PUBLIC, sig(void.class, new Class[] { Context.class }), codeBlock);

return jiteClass.toBytes(JDKVersion.V1_8);
}

Page 45: Patterns for JVM languages - Geecon 2014


Page 46: Patterns for JVM languages - Geecon 2014


Page 47: Patterns for JVM languages - Geecon 2014

x = 1

Page 48: Patterns for JVM languages - Geecon 2014

public void visit(Assignment assignment) {

    Explist explist = assignment.explist;
    Varlist varlist = assignment.varlist;
    Iterator<Var> vars = varlist.vars.iterator();

    for (ExpressionNode exp : explist.expressions) {
        exp.accept(new CompilerExpressionVisitor(codeBlock, symbolTable));

        CompilerExpressionVisitor visitor = new CompilerExpressionVisitor(codeBlock, symbolTable);

        vars.next().accept(visitor);

        String varName = (String) visitor.result();
        codeBlock.astore(symbolTable.index(varName));
    }

}

Page 49: Patterns for JVM languages - Geecon 2014

public void visit(Literal literal) {

    Object object = literal.value.asObject();

    codeBlock.ldc(object);

    // box object into Lua type
    if (object instanceof Integer) {
        codeBlock.invokestatic(p(Integer.class), "valueOf", sig(Integer.class, int.class));
    } else if (object instanceof Double) {
        codeBlock.invokestatic(p(Double.class), "valueOf", sig(Double.class, double.class));
    }

    codeBlock.invokestatic(p(LuaTypes.class), "valueOf", sig(LuaType.class, Object.class));

}

public void visit(VariableName variableName) {
    result = variableName.name;
}

Page 50: Patterns for JVM languages - Geecon 2014

When writing a bytecode compiler, it helps to think in Java code

But in the end you need to think like a stack machine :)
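
To make "think like a stack machine" concrete, here is a toy machine with four made-up instructions (PUSH, LOAD, STORE, ADD), an operand stack and numbered local slots. It is a loose analogy to JVM bytecode, not a model of it. The sample program corresponds to local y = 1 followed by x = 1 + y, with y in slot 0 and x in slot 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical four-instruction stack machine; operands live on a stack, variables in slots.
final class StackMachineSketch {

    enum Op { PUSH, LOAD, STORE, ADD }

    static final class Insn {
        final Op op;
        final int arg;

        Insn(Op op, int arg) {
            this.op = op;
            this.arg = arg;
        }
    }

    // Runs the program and returns the final contents of the local slots.
    static int[] run(Insn[] program, int localCount) {
        Deque<Integer> stack = new ArrayDeque<>();
        int[] locals = new int[localCount];
        for (Insn insn : program) {
            switch (insn.op) {
                case PUSH:
                    stack.push(insn.arg); // constant onto the stack
                    break;
                case LOAD:
                    stack.push(locals[insn.arg]); // local slot onto the stack
                    break;
                case STORE:
                    locals[insn.arg] = stack.pop(); // top of stack into a local slot
                    break;
                case ADD: {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
            }
        }
        return locals;
    }

    // local y = 1 ; x = 1 + y   (y is slot 0, x is slot 1)
    static Insn[] sampleProgram() {
        return new Insn[] {
            new Insn(Op.PUSH, 1), new Insn(Op.STORE, 0),
            new Insn(Op.PUSH, 1), new Insn(Op.LOAD, 0), new Insn(Op.ADD, 0), new Insn(Op.STORE, 1)
        };
    }

    public static void main(String[] args) {
        int[] locals = run(sampleProgram(), 2);
        System.out.println("y=" + locals[0] + " x=" + locals[1]);
    }
}
```

Note how the expression 1 + y never names a temporary: both operands are pushed first and the operator comes last.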

Page 51: Patterns for JVM languages - Geecon 2014

.method public execute : (Lpl/symentis/lua4j/runtime/Context;)V
    ; method code size: 21 bytes
    .limit stack 2
    .limit locals 3
    aload_1
    invokestatic pl/symentis/lua4j/compiler/RuntimeContext enterContext (Lpl/symentis/lua4j/runtime/Context;)V
    ldc2_w 1.0
    invokestatic java/lang/Double valueOf (D)Ljava/lang/Double;
    invokestatic pl/symentis/lua4j/types/LuaTypes valueOf [_28]
    astore_2
    aload_2
    return
.end method

Page 52: Patterns for JVM languages - Geecon 2014

Symbol table

maps a variable's symbolic name to its local variable slot
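
A minimal sketch of such a table, assuming every value is an object reference and therefore takes exactly one slot. The class below is hypothetical and is not the HashSymbolTable from the talk; it only mirrors the index(name) call used by the compiler visitor. In an instance method, slot 0 holds this and slot 1 the first parameter, so the first free slot here is 2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical symbol table: hands out local variable slots on first use of a name.
final class SymbolTableSketch {

    private final Map<String, Integer> slots = new LinkedHashMap<>();
    private int nextSlot;

    SymbolTableSketch(int firstFreeSlot) {
        this.nextSlot = firstFreeSlot;
    }

    // Returns the slot for the name, allocating the next free one if the name is new.
    int index(String name) {
        Integer slot = slots.get(name);
        if (slot == null) {
            slot = nextSlot++;
            slots.put(name, slot);
        }
        return slot;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch(2);
        System.out.println("x -> " + table.index("x"));
        System.out.println("y -> " + table.index("y"));
        System.out.println("x -> " + table.index("x"));
    }
}
```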

Page 53: Patterns for JVM languages - Geecon 2014

0xBA

invokedynamic

Page 54: Patterns for JVM languages - Geecon 2014

print("Welcome","to","Geecon",2014)

Page 55: Patterns for JVM languages - Geecon 2014

Var var = functionCall.var;

CompilerExpressionVisitor visitor = new CompilerExpressionVisitor(codeBlock, symbolTable);
var.accept(visitor);
String functionName = (String) visitor.result();

Args args = functionCall.args;

visitor = new CompilerExpressionVisitor(codeBlock, symbolTable);
args.accept(visitor);
Integer argNumber = (Integer) visitor.result();

Class<?>[] paramTypeArr = new Class[argNumber];
fill(paramTypeArr, LuaType.class);

codeBlock.invokedynamic(functionName, sig(void.class, paramTypeArr), Lua4JDynamicBootstrap.HANDLE, new Object[] {});

Page 56: Patterns for JVM languages - Geecon 2014

static final Handle HANDLE = new Handle(
        CodeBlock.H_INVOKESTATIC,
        p(Lua4JDynamicBootstrap.class),
        "bootstrap",
        sig(CallSite.class, MethodHandles.Lookup.class, String.class, MethodType.class));

Page 57: Patterns for JVM languages - Geecon 2014

public static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type) throws Throwable {

    Context context = currentContext();

    MethodHandle target = lookup.findStatic(BasicFunctions.class, name,
            methodType(LuaType.class, Context.class, LuaType[].class));

    // bind the current context as the first argument,
    // then collect the call-site arguments into a LuaType[]
    target = MethodHandles.insertArguments(target, 0, context)
            .asCollector(LuaType[].class, type.parameterCount());

    return new ConstantCallSite(target.asType(type));
}
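
The same chain of method handle adaptations can be tried without generating any bytecode, using only java.lang.invoke. In the sketch below, the join function and the "print:" prefix are made-up stand-ins for a built-in function and the bound context; the call site is invoked through dynamicInvoker() rather than through a real invokedynamic instruction.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-alone version of the bootstrap logic: find, bind, collect, adapt.
final class BootstrapSketch {

    // Stand-in for a built-in function: a bound first argument plus an array of call arguments.
    static String join(String prefix, Object[] args) {
        StringBuilder sb = new StringBuilder(prefix);
        for (Object arg : args) {
            sb.append(' ').append(arg);
        }
        return sb.toString();
    }

    static CallSite link(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        // (String, Object[]) -> String
        MethodHandle target = lookup.findStatic(BootstrapSketch.class, name,
                MethodType.methodType(String.class, String.class, Object[].class));
        // bind the first argument, then gather the remaining call-site arguments into the array
        target = MethodHandles.insertArguments(target, 0, "print:")
                .asCollector(Object[].class, type.parameterCount());
        // adapt the handle to the exact type requested by the call site
        return new ConstantCallSite(target.asType(type));
    }

    static String demo() {
        try {
            MethodType callSiteType = MethodType.methodType(
                    String.class, String.class, String.class, String.class, Integer.class);
            CallSite site = link(MethodHandles.lookup(), "join", callSiteType);
            return (String) site.dynamicInvoker().invoke("Welcome", "to", "Geecon", 2014);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the call site is a ConstantCallSite, the target is linked once and never changes; a language that allows functions to be redefined at runtime would need a mutable call site instead.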

Page 58: Patterns for JVM languages - Geecon 2014

https://github.com/headius/invokebinder

A Java DSL for binding method handles forward, rather than backward

https://github.com/projectodd/rephract

Another Java DSL for invokedynamic

Page 59: Patterns for JVM languages - Geecon 2014

What's next?

graal + truffle

dynamic compiler + AST interpreter

Page 60: Patterns for JVM languages - Geecon 2014

Thanks,

stay cool,

dynamically linked

and...

Page 61: Patterns for JVM languages - Geecon 2014

Let the eternal stream of bytecode

flow through your body

Page 62: Patterns for JVM languages - Geecon 2014

Q&A