all about javascriptcore’s many compilers · (interpreter) baseline (template jit) dfg (less...

252
All About JavaScriptCore’s Many Compilers Filip Pizlo Apple Inc.

Upload: others

Post on 22-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

All About JavaScriptCore’s Many

CompilersFilip PizloApple Inc.

Page 2: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

webkit.org

https://svn.webkit.org/repository/webkit/trunk

Page 3: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JavaScriptCore.framework

Page 4: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Safari

Page 5: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 6: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 7: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Four Tiers

LLInt (interpreter)

Baseline (template JIT)

DFG (less optimizing JIT)

FTL (optimizing JIT)

latency throughput

Page 8: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Four Tiers

LLInt (interpreter)

Baseline (template JIT)

DFG (less optimizing JIT)

FTL (optimizing JIT)

latency throughput

profiling

Page 9: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Four Tiers

LLInt (interpreter)

Baseline (template JIT)

DFG (less optimizing JIT)

FTL (optimizing JIT)

latency throughput

profiling

speculation and OSR

Page 10: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

Page 11: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

Page 12: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

trigg

er

Page 13: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

Page 14: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt

Page 15: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt OSR

Page 16: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt OSR Baseline

DFG compiler

trigg

er

Baseline OSR DFG

trigg

er

FTL compiler

DFG OSR FTL

Page 17: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt OSR Baseline

DFG compiler

trigg

er

Baseline OSR DFG

trigg

er

FTL compiler

DFG OSR FTL

0.12ms

Page 18: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt OSR Baseline

DFG compiler

trigg

er

Baseline OSR DFG

trigg

er

FTL compiler

DFG OSR FTL

0.12ms 0.48ms

Page 19: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

time

"use strict";

let result = 0;for (let i = 0; i < 10000000; ++i) { let o = {f: i}; result += o.f;}

print(result);

conc

urre

ncy

LLInt

baseline compiler

trigg

er

LLInt OSR Baseline

DFG compiler

trigg

er

Baseline OSR DFG

trigg

er

FTL compiler

DFG OSR FTL

0.12ms 0.48ms 2.86ms

Page 20: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR
Page 21: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Page 22: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Page 23: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Page 24: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Bytecode Linker

Page 25: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt

Page 26: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

Page 27: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG

Page 28: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG

Page 29: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Optimizer

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG

Page 30: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Optimizer

DFG Backend

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG

Page 31: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Optimizer

DFG Backend

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG FTL

Page 32: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Optimizer

DFG Backend

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 33: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Optimizer

DFG Backend

Extended DFG Optimizer

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 34: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

Extended DFG Optimizer

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 35: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

Extended DFG Optimizer

B3 Optimizer

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 36: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

Extended DFG Optimizer

B3 Optimizer

Instruction Selection

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 37: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

Extended DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 38: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

Extended DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

Parser

Bytecompiler

Generatorification

Bytecode Linker

LLInt Bytecode Template JIT

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

Page 39: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

Page 40: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

>2× LLInt

Page 41: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

>2× LLInt

>2× Baseline

Page 42: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

>2× LLInt

>2× Baseline

~1.1× DFG

Page 43: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 “gaussian-blur”

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Milliseconds (lower is better)0 10000 20000 30000 40000 50000

1,450 ms

2,276 ms

14,091 ms

41,948 ms

on my computer one day

Page 44: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 “gaussian-blur”

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Milliseconds (lower is better)0 10000 20000 30000 40000 50000

1,450 ms

2,276 ms

14,091 ms

41,948 ms

on my computer one day

~1.6× DFG

Page 45: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 “raytrace”

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Milliseconds (lower is better)0 10000 20000 30000 40000 50000

1,328 ms

2,386 ms

13,740 ms

47,839 ms

on my computer one day

Page 46: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 “raytrace”

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Milliseconds (lower is better)0 10000 20000 30000 40000 50000

1,328 ms

2,386 ms

13,740 ms

47,839 ms

on my computer one day

~1.8× DFG

Page 47: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Two WebAssembly Tiers

BBQ (B3 -01)

OMG (B3 -O2)

latency throughput

Page 48: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

~9 JIT compilersLLInt

(interpreter)Baseline

(template JIT)DFG

(template JIT)FTL

(B3 JIT)

Wasm BBQ

(B3 JIT)

Wasm OMG (B3 JIT)

JavaScript execution engines:

WebAssembly execution engines:

PolymorphicAccess

(template JIT)

Snippet (template JIT)

Bonus JITs:YARR

(template regexp JIT)

CSS (template JIT)

Page 49: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 50: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 51: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 52: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 53: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 54: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG

Page 55: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JIT-friendly VM

• Conservative-on-the-stack GC

• Consistent, mostly C-like ABI

Page 56: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 57: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Template JIT

• Goal: decent throughput with low latency.

Page 58: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

function foo(a, b){ return a + b;}

Page 59: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 0] enter [ 1] get_scope loc3[ 3] mov loc4, loc3[ 6] check_traps [ 7] add loc6, arg1, arg2[ 12] ret loc6

Page 60: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 0] enter [ 1] get_scope loc3[ 3] mov loc4, loc3[ 6] check_traps [ 7] add loc6, arg1, arg2[ 12] ret loc6

Page 61: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2 0x2f8084601a65: mov 0x30(%rbp), %rsi 0x2f8084601a69: mov 0x38(%rbp), %rdx 0x2f8084601a6d: cmp %r14, %rsi 0x2f8084601a70: jb 0x2f8084601af2 0x2f8084601a76: cmp %r14, %rdx 0x2f8084601a79: jb 0x2f8084601af2 0x2f8084601a7f: mov %esi, %eax 0x2f8084601a81: add %edx, %eax 0x2f8084601a83: jo 0x2f8084601af2 0x2f8084601a89: or %r14, %rax 0x2f8084601a8c: mov %rax, -0x38(%rbp)

Page 62: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Template JIT

• Portable assembly meta-programming.

Page 63: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2 0x2f8084601a65: mov 0x30(%rbp), %rsi 0x2f8084601a69: mov 0x38(%rbp), %rdx 0x2f8084601a6d: cmp %r14, %rsi 0x2f8084601a70: jb 0x2f8084601af2 0x2f8084601a76: cmp %r14, %rdx 0x2f8084601a79: jb 0x2f8084601af2 0x2f8084601a7f: mov %esi, %eax 0x2f8084601a81: add %edx, %eax 0x2f8084601a83: jo 0x2f8084601af2 0x2f8084601a89: or %r14, %rax 0x2f8084601a8c: mov %rax, -0x38(%rbp)

Page 64: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2 0x2f8084601a65: mov 0x30(%rbp), %rsi 0x2f8084601a69: mov 0x38(%rbp), %rdx 0x2f8084601a6d: cmp %r14, %rsi 0x2f8084601a70: jb 0x2f8084601af2 0x2f8084601a76: cmp %r14, %rdx 0x2f8084601a79: jb 0x2f8084601af2 0x2f8084601a7f: mov %esi, %eax 0x2f8084601a81: add %edx, %eax 0x2f8084601a83: jo 0x2f8084601af2 0x2f8084601a89: or %r14, %rax 0x2f8084601a8c: mov %rax, -0x38(%rbp)

Page 65: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2 0x2f8084601a65: mov 0x30(%rbp), %rsi 0x2f8084601a69: mov 0x38(%rbp), %rdx 0x2f8084601a6d: cmp %r14, %rsi 0x2f8084601a70: jb 0x2f8084601af2 0x2f8084601a76: cmp %r14, %rdx 0x2f8084601a79: jb 0x2f8084601af2 0x2f8084601a7f: mov %esi, %eax 0x2f8084601a81: add %edx, %eax 0x2f8084601a83: jo 0x2f8084601af2 0x2f8084601a89: or %r14, %rax 0x2f8084601a8c: mov %rax, -0x38(%rbp)

if (!m_leftOperand.isConstInt32()) state.slowPathJumps.append( jit.branchIfNotInt32(m_left));

Page 66: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2 0x2f8084601a65: mov 0x30(%rbp), %rsi 0x2f8084601a69: mov 0x38(%rbp), %rdx 0x2f8084601a6d: cmp %r14, %rsi 0x2f8084601a70: jb 0x2f8084601af2 0x2f8084601a76: cmp %r14, %rdx 0x2f8084601a79: jb 0x2f8084601af2 0x2f8084601a7f: mov %esi, %eax 0x2f8084601a81: add %edx, %eax 0x2f8084601a83: jo 0x2f8084601af2 0x2f8084601a89: or %r14, %rax 0x2f8084601a8c: mov %rax, -0x38(%rbp)

if (!m_leftOperand.isConstInt32()) state.slowPathJumps.append( jit.branchIfNotInt32(m_left));

jit.branchAdd32( Overflow, var.payloadGPR(), Imm32(constValue), scratch);

Page 67: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.addPtr(Address(regT3, 48), regT5)regT5 += loadPtr(regT3, offset = 48)

Page 68: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.addPtr(Address(regT3, 48), regT5)regT5 += loadPtr(regT3, offset = 48)

addq 48(%rcx), %r10

Page 69: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.addPtr(Address(regT3, 48), regT5)regT5 += loadPtr(regT3, offset = 48)

addq 48(%rcx), %r10 ldr x16, [x3, #48]add x3, x3, x16

Page 70: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.emitFunctionPrologue()

Page 71: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.setupArguments<decltype( operationReallocateButterflyToGrowPropertyStorage)>( baseGPR, CCallHelpers::TrustedImm32(newSize / sizeof(JSValue))); CCallHelpers::Call operationCall = jit.call(OperationPtrTag);

jit.addLinkTask([=] (LinkBuffer& linkBuffer) { linkBuffer.link( operationCall, FunctionPtr<OperationPtrTag>( operationReallocateButterflyToGrowPropertyStorage));});

Page 72: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.setupArguments<decltype( operationReallocateButterflyToGrowPropertyStorage)>( baseGPR, CCallHelpers::TrustedImm32(newSize / sizeof(JSValue))); CCallHelpers::Call operationCall = jit.call(OperationPtrTag);

jit.addLinkTask([=] (LinkBuffer& linkBuffer) { linkBuffer.link( operationCall, FunctionPtr<OperationPtrTag>( operationReallocateButterflyToGrowPropertyStorage));});

Page 73: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.setupArguments<decltype( operationReallocateButterflyToGrowPropertyStorage)>( baseGPR, CCallHelpers::TrustedImm32(newSize / sizeof(JSValue))); CCallHelpers::Call operationCall = jit.call(OperationPtrTag);

jit.addLinkTask([=] (LinkBuffer& linkBuffer) { linkBuffer.link( operationCall, FunctionPtr<OperationPtrTag>( operationReallocateButterflyToGrowPropertyStorage));});

Page 74: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

jit.setupArguments<decltype( operationReallocateButterflyToGrowPropertyStorage)>( baseGPR, CCallHelpers::TrustedImm32(newSize / sizeof(JSValue))); CCallHelpers::Call operationCall = jit.call(OperationPtrTag);

jit.addLinkTask([=] (LinkBuffer& linkBuffer) { linkBuffer.link( operationCall, FunctionPtr<OperationPtrTag>( operationReallocateButterflyToGrowPropertyStorage));});

Page 75: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax0x46f8c30b9b4: test %rax, %r150x46f8c30b9b7: jnz 0x46f8c30ba2c0x46f8c30b9bd: jmp 0x46f8c30ba2c0x46f8c30b9c2: o16 nop %cs:0x200(%rax,%rax)0x46f8c30b9d1: nop (%rax)0x46f8c30b9d4: mov %rax, -0x38(%rbp)

Page 76: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax0x46f8c30b9b4: test %rax, %r150x46f8c30b9b7: jnz 0x46f8c30ba2c0x46f8c30b9bd: jmp 0x46f8c30ba2c0x46f8c30b9c2: o16 nop %cs:0x200(%rax,%rax)0x46f8c30b9d1: nop (%rax)0x46f8c30b9d4: mov %rax, -0x38(%rbp)

Page 77: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax0x46f8c30b9b4: test %rax, %r150x46f8c30b9b7: jnz 0x46f8c30ba2c0x46f8c30b9bd: jmp 0x46f8c30ba2c0x46f8c30b9c2: o16 nop %cs:0x200(%rax,%rax)0x46f8c30b9d1: nop (%rax)0x46f8c30b9d4: mov %rax, -0x38(%rbp)

Page 78: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JIT Inline Cache

0x46f8c30b9b0: mov 0x30(%rbp), %rax0x46f8c30b9b4: test %rax, %r150x46f8c30b9b7: jnz 0x46f8c30ba2c0x46f8c30b9bd: cmp $0x125, (%rax)0x46f8c30b9c3: jnz 0x46f8c30ba2c0x46f8c30b9c9: mov 0x18(%rax), %rax0x46f8c30b9cd: nop 0x200(%rax)0x46f8c30b9d4: mov %rax, -0x38(%rbp)

Page 79: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Baseline DFG PolymorphicAccess Snippet YARR CSS

MacroAssembler (template JIT toolkit)

Page 80: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 81: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 82: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

Page 83: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Source

function foo(a, b){ return a + b;}

Page 84: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Bytecode[ 0] enter [ 1] get_scope loc3[ 3] mov loc4, loc3[ 6] check_traps [ 7] add loc6, arg1, arg2[ 12] ret loc6

Page 85: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Bytecode[ 0] enter [ 1] get_scope loc3[ 3] mov loc4, loc3[ 6] check_traps [ 7] add loc6, arg1, arg2[ 12] ret loc6

Page 86: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

arg1

arg2

Return

Add

Page 87: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTLFast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

Page 88: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

Page 89: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

Page 90: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 91: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 92: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 93: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 94: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 95: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Goal

Remove lots of type checks quickly.

Page 96: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Goals

• Speculation

• Static Analysis

• Fast Compilation

Page 97: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Goals

• Speculation

• Static Analysis

• Fast Compilation

Page 98: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

Page 99: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

profiling

Page 100: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

profilingspeculation

Page 101: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

profilingspeculation

OSR

Page 102: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Why OSR?

Page 103: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Branch(Equal( @type, $T1))

Load(@object, offset = 0x10) Call(slow path)

… things …

Branch(Equal( @type, $T1))

Load(@object, offset = 0x18) Call(slow path)

… more things …

Type Checks w/o OSR

Page 104: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Branch(Equal( @type, $T1))

Load(@object, offset = 0x10) Call(slow path)

… things …

Branch(Equal( @type, $T1))

Load(@object, offset = 0x18) Call(slow path)

… more things …

not redundant!

Type Checks w/o OSR

Page 105: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Branch(Equal( @type, $T1))

Load(@object, offset = 0x10) Call(slow path)

… things …

Branch(Equal( @type, $T1))

Load(@object, offset = 0x18) Call(slow path)

… more things …

Branch(Equal( @type, $T1))

OSR exit

Load(@object, offset = 0x10)

… things …

Branch(Equal( @type, $T1))

Load(@object, offset = 0x18)

… more things …

OSR exit

Type Checks w/o OSR Type Checks with OSR

Page 106: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Branch(Equal( @type, $T1))

Load(@object, offset = 0x10) Call(slow path)

… things …

Branch(Equal( @type, $T1))

Load(@object, offset = 0x18) Call(slow path)

… more things …

Branch(Equal( @type, $T1))

OSR exit

Load(@object, offset = 0x10)

… things …

Load(@object, offset = 0x18)

… more things …

Type Checks w/o OSR Type Checks with OSR

Page 107: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

OSR flattens control flow

Page 108: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

OSR is hard

Page 109: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

int foo(int* ptr){ int w, x, y, z;

w = … // lots of stuff

x = is_ok(ptr) ? *ptr : slow_path(ptr);

y = … // lots of stuff

z = is_ok(ptr) ? *ptr : slow_path(ptr);

return w + x + y + z; }

Page 110: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

int foo(int* ptr){ int w, x, y, z;

w = … // lots of stuff

if (!is_ok(ptr)) return foo_base1(ptr, w); x = *ptr;

y = … // lots of stuff

z = *ptr;

return w + x + y + z; }

Page 111: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

int foo(int* ptr){ int w, x, y, z;

w = … // lots of stuff

if (!is_ok(ptr)) return foo_base1(ptr, w); x = *ptr;

y = … // lots of stuff

z = *ptr;

return w + x + y + z; }

Page 112: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

OSR IR Goals

• Must know where to exit.

• Must know what is live-at-exit.

• Must be malleable.

Page 113: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

Page 114: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 115: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 116: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 117: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 118: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 119: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 120: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 121: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 7] add loc6, arg1, arg2

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid)

Page 122: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

Page 123: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

Page 124: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

Page 125: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

MovHint

Page 126: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

MovHint

Page 127: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

MovHint

25:

Page 128: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG SSA state

arg1

arg2

Add

Return

DFG Exit state

loc0 loc1 loc3 loc4 loc5 loc6

MovHint

loc6 := @25

25:

Page 129: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 42] addMovHint(…)

ArithAdd(…)

Page 130: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 42] addMovHint(…)

ArithAdd(…) }exit ok

Page 131: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 42] addMovHint(…)

ArithAdd(…) }exit ok

}exit invalid

Page 132: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

Page 133: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

}exit ok

Page 134: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

}exit ok

}exit invalid

Page 135: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

ExitOK

}exit ok

}exit invalid

Page 136: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

ExitOK

ExitOK

}exit ok

}exit invalid

Page 137: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

[ 666] wat

… checks and readonly ops …

… more effects …

FirstEffect(…)

OSR exit state update

ExitOK

OSR exit state update

… more effects …

… checks …

FirstEffect(…)

ExitOK

}exit ok

}exit invalid

[ 661] foo

[ 683] bar

Page 138: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Watchpoints +

InvalidationPoint

Page 139: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Goals

• Speculation

• Static Analysis

• Fast Compilation

Page 140: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Remove type checks

Page 141: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR
Page 142: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Page 143: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Check(Int32:@foo)

Page 144: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Check(Int32:@foo)Check(Int32:@foo)

Page 145: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Page 146: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Check(Int32:@foo)

Page 147: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo)

Check(Int32:@foo) Check(Int32:@foo)

Page 148: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Check(Int32:@foo) Check(Int32:@foo)

Page 149: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Abstract Interpreter• “Global” (whole compilation unit)

• Flow sensitive

• Tracks:

- variable type

- object structure

- indexing type

- constants

Page 150: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Goals

• Speculation

• Static Analysis

• Fast Compilation

Page 151: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Fast Compile

• Emphasis on block-locality.

• Template code generation.

Page 152: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Get Local Get

Local

Add

Set Local

Page 153: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Get Local Get

Local

Add

Set Local

Branch

Page 154: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Get Local Get

Local

Add

Set Local

Branch

φ φ

Page 155: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Get Local Get

Local

Add

Set Local

Branch

φ φ

Set Local

Set Local

Page 156: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Get Local Get

Local

Add

Set Local

Branch

Get Local

Get Local

φ

φ φ

φ

Set Local

Set Local

Page 157: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

• Primary block-local data flow graph.

Get Local Get

Local

Add

Set Local

Branch

Get Local

Get Local

φ

φ φ

φ

Set Local

Set Local

Page 158: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

• Primary block-local data flow graph.

• Secondary global data flow graph.

Get Local Get

Local

Add

Set Local

Branch

Get Local

Get Local

φ

φ φ

φ

Set Local

Set Local

Page 159: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Template Codegen

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

Page 160: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Template Codegen

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

Page 161: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG Template Codegen

23: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 24: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 25: ArithAdd(Int32:@23, Int32:@24, CheckOverflow, Exits, bc#7) 26: MovHint(Untyped:@25, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 28: Return(Untyped:@25, W:SideState, Exits, bc#12)

add %esi, %eaxjo Lexit

Page 162: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 163: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 164: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 165: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 166: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 167: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 168: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG optimization pipelineDFG IR

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Abstract Interpreter

Local CSE

Simplify (CFG, etc.)

Varargs Forwarding

GC Barrier Scheduling

Template Codegen

Page 169: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

>2x LLInt

>2x Baseline

~1.1x DFG

Page 170: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

JetStream 2 Score

LLInt

LLInt+Baseline

LLInt+Baseline+DFG

LLInt+Baseline+DFG+FTL

Score (higher is better)0 15 30 45 60 75 90 105 120 135 150

on my computer one day

>2x LLInt

>2x Baseline

~1.1x DFG

Page 171: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 172: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 173: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL Goal

All the optimizations.

Page 174: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL IRsIR Style Example

Bytecode High LevelLoad/Store bitor dst, left, right

DFG Medium LevelExotic SSA

dst: BitOr(Int32:@left, Int32:@right, …)

B3 Low LevelNormal SSA

Int32 @dst = BitOr(@left, @right)

Air ArchitecturalCISC Or32 %src, %dest

Page 175: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL IRsIR Style Example

Bytecode High LevelLoad/Store bitor dst, left, right

DFG Medium LevelExotic SSA

dst: BitOr(Int32:@left, Int32:@right, …)

B3 Low LevelNormal SSA

Int32 @dst = BitOr(@left, @right)

Air ArchitecturalCISC Or32 %src, %dest

Page 176: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL IRsIR Style Example

Bytecode High LevelLoad/Store bitor dst, left, right

DFG Medium LevelExotic SSA

dst: BitOr(Int32:@left, Int32:@right, …)

B3 Low LevelNormal SSA

Int32 @dst = BitOr(@left, @right)

Air ArchitecturalCISC Or32 %src, %dest

Page 177: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL IRsIR Style Example

Bytecode High LevelLoad/Store bitor dst, left, right

DFG Medium LevelExotic SSA

dst: BitOr(Int32:@left, Int32:@right, …)

B3 Low LevelNormal SSA

Int32 @dst = BitOr(@left, @right)

Air ArchitecturalCISC Or32 %src, %dest

Page 178: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL IRsIR Style Example

Bytecode High LevelLoad/Store bitor dst, left, right

DFG Medium LevelExotic SSA

dst: BitOr(Int32:@left, Int32:@right, …)

B3 Low LevelNormal SSA

Int32 @dst = BitOr(@left, @right)

Air ArchitecturalCISC Or32 %src, %dest

Page 179: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 180: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 181: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 182: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 183: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 184: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 185: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 186: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

FTL optimization pipelineDFG IR B3 IR Air

Bytecode Parsing and Inlining

Type Inference

Check Scheduling

Simplify (CFG etc)

Abstract Interpretation

Global CSE

Escape Analysis

LICM

Integer Range Optimization

GC Barrier Scheduling

Lower to B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 187: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Source

function foo(a, b, c){ return a + b + c;}

Page 188: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Bytecode[ 0] enter [ 1] get_scope loc3[ 3] mov loc4, loc3[ 6] check_traps [ 7] add loc6, arg1, arg2[ 12] add loc6, loc6, arg3[ 17] ret loc6

Page 189: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

24: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 25: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7) 27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 29: GetLocal(Untyped:@3, arg3(D<Int32>/FlushedInt32), R:Stack(8), bc#12) 30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12) 31: MovHint(Untyped:@30, loc6, W:SideState, ClobbersExit, bc#12, ExitInvalid) 33: Return(Untyped:@3, W:SideState, Exits, bc#17)

Page 190: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

24: GetLocal(Untyped:@1, arg1(B<Int32>/FlushedInt32), R:Stack(6), bc#7) 25: GetLocal(Untyped:@2, arg2(C<BoolInt32>/FlushedInt32), R:Stack(7), bc#7) 26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7) 27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 29: GetLocal(Untyped:@3, arg3(D<Int32>/FlushedInt32), R:Stack(8), bc#12) 30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12) 31: MovHint(Untyped:@30, loc6, W:SideState, ClobbersExit, bc#12, ExitInvalid) 33: Return(Untyped:@3, W:SideState, Exits, bc#17)

Page 191: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

B3 IR Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

Page 192: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

B3 IR Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

Page 193: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

B3 IR Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

Page 194: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

Page 195: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

Page 196: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7) 27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12)

Page 197: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7) 27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12)

Page 198: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Int32 @42 = Trunc(@32, DFG:@26) Int32 @43 = Trunc(@27, DFG:@26) Int32 @44 = CheckAdd(@42:WarmAny, @43:WarmAny, generator = 0x1052c5cd0, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@26) Int32 @45 = Trunc(@22, DFG:@30) Int32 @46 = CheckAdd(@44:WarmAny, @45:WarmAny, @44:ColdAny, generator = 0x1052c5d70, earlyClobbered = [], lateClobbered = [], usedRegisters = [], ExitsSideways|Reads:Top, DFG:@30) Int64 @47 = ZExt32(@46, DFG:@32) Int64 @48 = Add(@47, $-281474976710656(@13), DFG:@32) Void @49 = Return(@48, Terminal, DFG:@32)

26: ArithAdd(Int32:@24, Int32:@25, CheckOverflow, Exits, bc#7) 27: MovHint(Untyped:@26, loc6, W:SideState, ClobbersExit, bc#7, ExitInvalid) 30: ArithAdd(Int32:@26, Int32:@29, CheckOverflow, Exits, bc#12)

Page 199: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

Page 200: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

ArithAdd

Page 201: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

ArithAdd

CheckAdd

Page 202: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

ArithAdd

CheckAdd

generator

λ

Page 203: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

ArithAdd

CheckAdd

OSR Exit

Descriptor

generator

λ

Page 204: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

ArithAdd

CheckAdd

OSR Exit

Descriptor

generator

λ

Page 205: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Page 206: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Page 207: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Page 208: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Page 209: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Page 210: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %arg0, %arg1, %arg2, …, generator = 0x…)

Page 211: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %arg0, %arg1, %arg2, …, generator = 0x…)

Page 212: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %arg0, %arg1, %arg2, …, generator = 0x…)

Page 213: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %arg0, %arg1, %arg2, …, generator = 0x…)

Page 214: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %rcx , %r11 , %rax , …, generator = 0x…)

Page 215: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

%rcx

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %rcx , %r11 , %rax , …, generator = 0x…)

Page 216: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

%rcx %r11

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %rcx , %r11 , %rax , …, generator = 0x…)

Page 217: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air backend

CheckAdd(@left, @right, @arg0, @arg1, @arg2, …, generator = 0x…)

Bytecode Variable: loc1 loc2 loc3 loc4Recovery Method: @arg2 Const: 42 @arg0 @arg1

%rcx %r11%rax

JSC::FTL::OSRExitDescriptor

Patch &BranchAdd32 Overflow, %left, %right, %dst, %rcx , %r11 , %rax , …, generator = 0x…)

Page 218: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR
Page 219: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

Page 220: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phase

Page 221: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Page 222: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

Page 223: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

Page 224: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

addl

jo

Page 225: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

addl

jo

Call

Page 226: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

addl

jo

Call

Page 227: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

addl

jo

Call

patchpoint

Page 228: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG IR

B3 IR

lowering

phaselots of stuff

Machine Code

Add

CheckAdd

Trunc Trunc

addl

jo

Call

λ

Page 229: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

inline void x86_cpuid(){ intptr_t a = 0, b, c, d; asm volatile( "cpuid" : "+a"(a), "=b"(b), "=c"(c), "=d"(d) : : "memory");}

Page 230: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 231: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 232: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 233: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 234: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 235: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 236: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

if (MacroAssemblerARM64:: supportsDoubleToInt32ConversionUsingJavaScriptSemantics()) { PatchpointValue* patchpoint = m_out.patchpoint(Int32); patchpoint->appendSomeRegister(doubleValue); patchpoint->setGenerator( [=] (CCallHelpers& jit, const StackmapGenerationParams& params) { jit.convertDoubleToInt32UsingJavaScriptSemantics( params[1].fpr(), params[0].gpr()); }); patchpoint->effects = Effects::none(); return patchpoint;}

Page 237: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

• Polymorphic inline caches

• Calls with interesting calling conventions

• Lazy slow paths

• Interesting instructions

Page 238: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

FTL

B3

Page 239: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

FTL

B3 B3

B3

Page 240: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

FTL

B3 B3

B3

PolymorphicAccess

Snippet

Page 241: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

Snippet

DFG

FTL

B3 B3

B3

PolymorphicAccess

Snippet

Page 242: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Patchpoint Use Cases

Snippet

DFG

Snippet

FTL

FTL

B3 B3

B3

PolymorphicAccess

Snippet

Page 243: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

DFG-to-B3 lowering

DFG Optimizer

DFG Backend

DFG Optimizer

B3 Optimizer

Instruction Selection

Air Optimizer

Air Backend

DFG Bytecode Parser

DFG Bytecode Parser

DFG FTL

DFG IR DFG IR

B3 IR

Assembly IR

Fast JIT Powerful JIT

DFG SSA Conversion

DFG SSA Optimizer

DFG SSA IR

Page 244: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 245: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Two WebAssembly Tiers

BBQ (B3 -01)

OMG (B3 -O2)

latency throughput

Page 246: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

OMGPowerful JIT

B3 IR Air

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Graph Coloring Reg Alloc

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 247: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air

Graph Coloring Reg Alloc

BBQFast JIT

B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

LICM

Global CSE

Switch Inference

Tail Duplication

Path Constants

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Spill CSE

Graph Coloring Stack Alloc

Report Used Registers

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 248: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air

Linear Scan Reg+Stack Alloc

BBQFast JIT

B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

Page 249: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air

Linear Scan Reg+Stack Alloc

BBQFast JIT

B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

5× faster compile than OMG

Page 250: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Air

Linear Scan Reg+Stack Alloc

BBQFast JIT

B3 IR

Double-to-Float

Simplify (folding, CFG, etc)

Macro Lowering

Legalization

Constant Motion

Lower to Air (isel)

Simplify CFG

Macro Lowering

DCE

Lower Multiple Entrypoints

Select Block Order

Emit Machine Code

Fix Partial Register Stalls

5× faster compile than OMG2× slower execution than OMG

Page 251: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

Agenda

• High Level Overview

• Template JITing

• Optimized JITing- DFG

- FTL

- BBQ

- OMG

Page 252: All About JavaScriptCore’s Many Compilers · (interpreter) Baseline (template JIT) DFG (less optimizing JIT) FTL (optimizing JIT) latency throughput profiling speculation and OSR

MacroAssembler (template JIT toolkit)

B3 JIT (optimizing JIT toolkit)

Baseline DFG PolymorphicAccess Snippet YARR CSS

FTL Wasm BBQ

Wasm OMG