COG: Back to the Future, Part II
Faster open source VMs for Croquet, Squeak & Newspeak. Eliot Miranda. ESUG 2008, Amsterdam.
![Page 1: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/1.jpg)
COG: Back to the Future, Part II
faster open source VMs for Croquet, Squeak & Newspeak
1
![Page 2: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/2.jpg)
Why?
What?
Where to & when?
2
![Page 3: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/3.jpg)
Qwaq Forums
SAS
Client experience
Why Cog?
3
![Page 4: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/4.jpg)
Small part of a larger whole
Why Cog?
cog•no•men |kägˈnōmən; ˈkägnəmən| noun
an extra personal name given to an ancient Roman citizen, functioning rather like a nickname and typically passed down from father to son.
• a name; a nickname.
4
![Page 5: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/5.jpg)
Success is 99% failure
Why Cog?
5
![Page 6: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/6.jpg)
MARKETING GIRL: When you have been in marketing as long as I have, you’ll know that before any new product can be developed, it has to be properly researched. I mean yes, yes we’ve got to find out what people want from fire, I mean how do they relate to it, the image -
FORD: Oh, stick it up your nose.
MARKETING GIRL: Yes which is precisely the sort of thing we need to know, I mean do people want fire that can be fitted nasally?
get the marketing out of the way early...
6
![Page 7: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/7.jpg)
a logo!
What’s Cog?
7
![Page 8: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/8.jpg)
What’s Cog?
a blog: http://www.mirandabanda.org/cogblog
Community Organised Graft
Contributions Over<->taken Gratefully
8
![Page 9: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/9.jpg)
What’s Cog?
a series of VM evolutions
bytecode sets
object representations
execution technology
DTSSTCPW/KISS/Evolve
not a new compiler
not a replacement VM
no new GC, new CompiledMethod, etc
9
![Page 10: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/10.jpg)
Where to?
Targets:
HPS style fast JIT
Self style quick JIT
Integrate with Hydra
Integrate with Spoon
10
![Page 11: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/11.jpg)
fast car
11
![Page 12: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/12.jpg)
quick car
12
![Page 13: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/13.jpg)
Where to?
Targets:
HPS style fast JIT
Self style quick JIT
Prerequisites
Closures
Internal stack organization
Polymorphic Inline Caches
13
![Page 14: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/14.jpg)
14
![Page 15: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/15.jpg)
baby steps
Closures + Closure VM
extend existing compiler front-end, new back-end
5 new bytecodes
sayonara BlockContext
has ANSI block syntax
nicely familiar
less work
Croquet 1.0, Squeak 3.9
www.mirandabanda.org/downloads
Deployed internally at Qwaq
15
![Page 16: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/16.jpg)
closures enable stack representation

inject: aValue into: binaryBlock
    | next |
    next := aValue.
    self do: [:each | next := binaryBlock value: next value: each].
    ^next

inject: aValue into: binaryBlock
    | tempVector |
    tempVector := Array new: 1.
    tempVector at: 1 put: aValue.
    self do: [:each |
        tempVector
            at: 1
            put: (binaryBlock value: (tempVector at: 1) value: each)].
    ^tempVector at: 1
16
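The temp vector rewrite of inject:into: on the slide above can be mimicked in Python as a rough sketch. It is not the Squeak compiler's output; the function names are illustrative. The first version assigns to a closed-over variable directly (Python needs `nonlocal` for that), the second boxes the variable in a one-element list so the inner block only ever reads a never-reassigned reference, which is what lets the enclosing activation live on a stack.

```python
def inject_into(collection, a_value, binary_block):
    """Closed-over temporary that the inner block assigns to."""
    next_value = a_value

    def each_block(each):
        nonlocal next_value  # the block writes to the outer activation's temp
        next_value = binary_block(next_value, each)

    for each in collection:
        each_block(each)
    return next_value


def inject_into_temp_vector(collection, a_value, binary_block):
    """Same method after the rewrite: the temp lives in a heap 'temp vector'."""
    temp_vector = [None]       # Array new: 1
    temp_vector[0] = a_value   # tempVector at: 1 put: aValue

    def each_block(each):
        # The block only reads the reference to temp_vector, which never changes,
        # so it can be copied into the closure when the closure is created.
        temp_vector[0] = binary_block(temp_vector[0], each)

    for each in collection:
        each_block(each)
    return temp_vector[0]
```

Both versions compute the same result; only the place where the mutable temporary is stored differs.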
![Page 17: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/17.jpg)
- BlockClosure - BlockContext + BlockClosure
Object variableSubclass: #BlockClosure instanceVariableNames: 'outerContext startpc numArgs'
BlockClosure methods for evaluating
value[:value:value:value:]
valueWithArguments:
ContextPart variableSubclass: #MethodContext instanceVariableNames: 'method receiverMap closureOrNil receiver'
baby steps
17
![Page 18: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/18.jpg)
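closure activation
[Diagram: the objects involved in a closure activation and their fields]
inject:into: (MethodContext): sender, pc, stackp, method, closureOrNil, receiver 'Hi!', aValue 0, binaryBlock, temp vector
[] in inject:into: (MethodContext): sender, pc, stackp, method, closureOrNil, receiver, $H, temp ...
[] in inject:into: (BlockClosure): outerContext, startpc, numArgs, binaryBlock, temp vector
do: (MethodContext): sender, pc, stackp, method, closureOrNil, receiver, arg ..., temp ...
an Array: nextValue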
18
![Page 19: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/19.jpg)
baby steps
5 new bytecodes
pushNewArrayOfSize:/pushConsArrayWithElements:
pushClosureCopyNumCopiedValues:numArgs:blockSize:
pushRemoteTemp:inVectorAt:
popRemoteTemp:inVectorAt:
storeRemoteTemp:inVectorAt:
change return bytecode to do non-local return if closureOrNil not nil
19
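As a hedged illustration of how the remote-temp bytecodes listed above relate to the temp vector, here is a toy frame in Python. The method names mirror the bytecode names on the slide, but the class, the helper `pop_into_temp`, and the exact operand order are assumptions made for this sketch, not the VM's implementation.

```python
class ToyFrame:
    """A toy activation: a list of temps plus an operand stack."""

    def __init__(self, num_temps):
        self.temps = [None] * num_temps
        self.stack = []

    def push_constant(self, value):
        self.stack.append(value)

    def pop_into_temp(self, index):
        # Ordinary temp store, used here only to install the temp vector.
        self.temps[index] = self.stack.pop()

    def push_new_array_of_size(self, size):
        # pushNewArrayOfSize: allocates the temp vector itself.
        self.stack.append([None] * size)

    def push_remote_temp(self, temp_index, vector_index):
        # pushRemoteTemp:inVectorAt: reads through the temp vector.
        self.stack.append(self.temps[vector_index][temp_index])

    def store_remote_temp(self, temp_index, vector_index):
        # storeRemoteTemp:inVectorAt: writes top of stack, leaving it in place.
        self.temps[vector_index][temp_index] = self.stack[-1]

    def pop_remote_temp(self, temp_index, vector_index):
        # popRemoteTemp:inVectorAt: writes top of stack and pops it.
        self.temps[vector_index][temp_index] = self.stack.pop()
```

A sequence such as `push_new_array_of_size(1)`, `pop_into_temp(0)`, `push_constant(0)`, `pop_remote_temp(0, 0)` corresponds to the opening of the rewritten inject:into: method: create the vector, then initialise nextValue inside it.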
![Page 20: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/20.jpg)
Stack Interpreter
Closure VM + Internal Stack Organization
activations are stack frames on stack pages
contexts on heap are proxies for stack frames
Streamlined GC, no pop/pushRemappableOop:
20
![Page 21: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/21.jpg)
stack frame
no argument copying
lazy context creation
slower argument access!! (in interpreter, not in JIT)
epsilon away from fast JIT organization (maybe coexist)
receiver/closure
arg...
caller saved ip/base frame caller context
framePointer ⇒ caller saved fp (0 for base frame)
method
flag bytes: numArgs, hasContext, isClosureActivation
thisContext (uninitialized garbage in new frame)
receiver
temp...
stackPointer ⇒ stack value...
21
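Lazy context creation, as listed on the stack frame slide, can be sketched roughly as follows. This is a Python model under assumptions: `ToyStackFrame`, `ContextProxy` and `ensure_context` are hypothetical names, and `None` stands in for the uninitialized thisContext slot.

```python
class ContextProxy:
    """Heap-side stand-in for a stack frame; reads are forwarded to the frame."""

    def __init__(self, frame):
        self.frame = frame

    @property
    def receiver(self):
        return self.frame.receiver

    @property
    def method(self):
        return self.frame.method


class ToyStackFrame:
    def __init__(self, method, receiver, args):
        self.method = method
        self.receiver = receiver
        self.args = list(args)
        self.has_context = False   # flag byte: no context object exists yet
        self.this_context = None   # slot is meaningless until has_context is set

    def ensure_context(self):
        # Only allocate the heap context when something actually asks for it,
        # e.g. thisContext is pushed or a closure needs its outerContext.
        if not self.has_context:
            self.this_context = ContextProxy(self)
            self.has_context = True
        return self.this_context
```

Most activations never have their context requested, so in this model they return without ever allocating a proxy.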
![Page 22: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/22.jpg)
stack activation

'Hi!' inject: 0 into: [:sum :char | sum + char asInteger]

inject: thisValue into: binaryBlock
    | nextValue |
    nextValue := thisValue.
    self do:
        [:each |
        nextValue := binaryBlock value: nextValue value: each].
    ^nextValue

rcvr: 'Hi!'
initialValue: 0
sp⇒ binaryBlock: [] in DoIt
22
![Page 23: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/23.jpg)
stack activation

'Hi!' inject: 0 into: [:sum :char | sum + char asInteger]

inject: thisValue into: binaryBlock
    | nextValue |
    nextValue := thisValue.
    self do:
        [:each |
        nextValue := binaryBlock value: nextValue value: each].
    ^nextValue

rcvr: 'Hi!'
initialValue: 0
binaryBlock: [] in DoIt
caller saved ip
fp⇒ caller saved fp
method (inject:into:)
2, false, false
%$&^*#@!
self: 'Hi!'
temp vector: nil
sp⇒ rcvr: 'Hi!'
23
![Page 24: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/24.jpg)
stack activation

'Hi!' inject: 0 into: [:sum :char | sum + char asInteger]

inject: thisValue into: binaryBlock
    | nextValue |
    nextValue := thisValue.
    self do:
        [:each |
        nextValue := binaryBlock value: nextValue value: each].
    ^nextValue

rcvr: 'Hi!'
initialValue: 0
binaryBlock: [] in DoIt
caller saved ip
fp⇒ caller saved fp
method (inject:into:)
2, false, false
%$&^*#@!
self: 'Hi!'
temp vector
sp⇒ rcvr: 'Hi!'

Array
    nextValue: 0
24
![Page 25: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/25.jpg)
stack activation

rcvr: 'Hi!'
initialValue: 0
binaryBlock: [] in DoIt
caller saved ip
fp⇒ caller saved fp
method (inject:into:)
2, false, true
thisContext
self: 'Hi!'
temp vector
rcvr: 'Hi!'
sp⇒ [] in inject:into:

BlockClosure
    outerContext
    startpc
    numArgs: 1
    binaryBlock: [] in DoIt
    temp vector

MethodContext
    sender: fp
    pc: caller saved fp
    stackp: 2
    method: inject:into:
    closureOrNil: nil
    self: 'Hi!'
    temp vector
    binaryBlock: [] in DoIt
    %$&^*#@!
    ...
    %$&^*#@!

Array
    nextValue: 0
25
![Page 26: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/26.jpg)
stack activation

rcvr: 'Hi!'
initialValue: 0
binaryBlock: [] in DoIt
caller saved ip
caller saved fp
method (inject:into:)
2, false, true
thisContext
self: 'Hi!'
temp vector
rcvr: 'Hi!'
[] in inject:into:
caller saved ip
fp⇒ caller saved fp
method (do:)
1, false, false
%$&^*#@!
self: 'Hi!'
sp⇒ index: 1

BlockClosure
    outerContext
    startpc
    numArgs: 1
    binaryBlock: [] in DoIt
    temp vector

MethodContext
    sender: fp
    pc: caller saved fp
    stackp: 2
    method: inject:into:
    closureOrNil: nil
    self: 'Hi!'
    temp vector
    binaryBlock: [] in DoIt
    %$&^*#@!
    ...
    %$&^*#@!

Array
    nextValue: 0
26
![Page 27: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/27.jpg)
stack activation

rcvr: 'Hi!'
initialValue: 0
binaryBlock: [] in DoIt
caller saved ip
caller saved fp
method (inject:into:)
2, false, true
thisContext
self: 'Hi!'
temp vector
rcvr: 'Hi!'
[] in inject:into:
caller saved ip
caller saved fp
method (do:)
1, false, false
%$&^*#@!
self: 'Hi!'
index: 1
clsr: [] in inject:into:
each: $H
caller saved ip
caller saved fp ⇐fp
method (inject:into:)
1, true, false
%$&^*#@!
self: 'Hi!'
[] in DoIt
temp vector ⇐sp

BlockClosure
    outerContext
    startpc
    numArgs: 1
    binaryBlock: [] in DoIt
    temp vector

MethodContext
    sender: fp
    pc: caller saved fp
    stackp: 2
    method: inject:into:
    closureOrNil: nil
    self: 'Hi!'
    temp vector
    binaryBlock: [] in DoIt
    %$&^*#@!
    ...
    %$&^*#@!

Array
    nextValue: 0
27
![Page 28: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/28.jpg)
Stack Interpreter
Nowish
Detailed blog posts + code real soon now™
Underwhelming performance: ≈ 10% to 68% faster on benchmarks, as yet no faster for client experience
bit of a mystery...
28
![Page 29: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/29.jpg)
the VM gardener’s spade
many a mickle ... makes a muckle
29
![Page 30: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/30.jpg)
fast JIT
April 2009 (for x86)
a la HPS
simple deferred code gen (stack => register)
three register (self + 2 args) calling convention
most kernel prims all in regs (at: at:put: + * etc)
in-line caches: open & closed PICs
two word object header
“open” translator
retain interpreter
30
![Page 31: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/31.jpg)
The life-cycle of the lesser spotted Inline Cache
egg -> monomorph -> polymorph -> megamorph
the egg is one of many instructions laid by the JIT when it “compiles” a bytecode method into a native code method
<B0> send #=

    movl %rCache,#=
    call linkSend1Args
31
![Page 32: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/32.jpg)
The life-cycle of the lesser spotted Inline Cache
if executed the egg hatches into a larval monomorphic inline cache. Typically only 70% of eggs will hatch

#(true false nil 0 0.0 #zero 'zero' $0 #()) select: [:each | each = 0]

    movl %rCache,#=        ⇒    movl %rCache,True
    call linkSend1Args     ⇒    call Object.=.entry
32
![Page 33: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/33.jpg)
The life-cycle of the lesser spotted Inline Cache
    movl %rCache,True
    call Object.=.entry

Object.=.entry:
    mov %rTemp,%rSelf
    and %rTemp,#3
    jnz L1
    mov %rTemp,%rSelf[#ClassOffset]
L1: cmp %rTemp,%rCache
    jnz LCallFixSendFailure
Object.=.noCheckEntry:
    rock and roll
33
![Page 34: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/34.jpg)
The life-cycle of the lesser spotted Inline Cache
if the monomorph encounters another kind it changes into a nymph polymorphic cache with 2 cases. Only 10% of monomorphs will metamorphose into Closed PICs
    movl %rCache,True        ⇒    movl %rCache,True
    call Object.=.entry      ⇒    call aClosedPIC.=.entry
34
![Page 35: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/35.jpg)
The life-cycle of the lesser spotted Inline Cache
aClosedPIC.=.entry:
    mov %rTemp,%rSelf
    and %rTemp,#3
    jnz L1
    mov %rTemp,%rSelf[#ClassOffset]
L1: cmp %rTemp,%rCache
    jz Object.=.noCheckEntry
    cmp %rCache, #False
    jz Object.=.noCheckEntry
    jmp extendClosedPIC
    ....
35
![Page 36: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/36.jpg)
The life-cycle of the lesser spotted Inline Cache
aClosedPIC.=.entry:
    mov %rTemp,%rSelf
    and %rTemp,#3
    jnz L1
    mov %rTemp,%rSelf[#ClassOffset]
L1: cmp %rTemp,%rCache
    jz Object.=.noCheckEntry
    cmp %rCache, #False
    jz Object.=.noCheckEntry
    cmp %rCache, #UndefinedObject
    jz Object.=.noCheckEntry
    cmp %rCache, #SmallInteger
    jz SmallInteger.=.noCheckEntry
    jmp extendClosedPIC
36
![Page 37: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/37.jpg)
The life-cycle of the lesser spotted Inline Cache
if the nymph polymorph has a rich enough life and encounters more than (say) 8 classes of self then it blossoms into a magnificent Open PIC
    movl %rCache,True            ⇒    movl %rCache,True
    call aClosedPIC.=.entry      ⇒    call anOpenPIC.=.entry
37
![Page 38: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/38.jpg)
The life-cycle of the lesser spotted Inline Cache
An adult Open PIC is adept at probing the first-level method lookup cache to find the target method for each self
Since the Open PIC started life for a single selector it knows the constant value of its selector and its selector’s hash
Only 1% of monomorphs will complete the arduous journey to Open PIC
38
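The egg, monomorph, closed PIC and open PIC stages described on the preceding slides can be modelled, very loosely, as a state machine attached to one send site. The Python below is a sketch under assumptions: `SendSite`, its state names and `MAX_CLOSED_PIC_CASES` are invented for illustration (the slides say "(say) 8"), and a plain lookup callback stands in for the first-level method lookup cache that a real Open PIC probes. No machine code is patched here.

```python
MAX_CLOSED_PIC_CASES = 8  # assumption, following the slide's "(say) 8"


class SendSite:
    """One send site and the inline cache that grows at it."""

    def __init__(self, selector, lookup):
        self.selector = selector
        self.lookup = lookup   # full lookup: (class, selector) -> callable
        self.state = "egg"     # unlinked send, as laid down by the JIT
        self.cases = {}        # class -> target method, for mono/closed PIC

    def send(self, receiver, *args):
        cls = type(receiver)
        target = None
        if self.state != "open_pic":
            target = self.cases.get(cls)       # the class compare(s)
        if target is None:
            target = self.lookup(cls, self.selector)
            if self.state != "open_pic":
                self._relink(cls, target)
        return target(receiver, *args)

    def _relink(self, cls, target):
        if len(self.cases) >= MAX_CLOSED_PIC_CASES:
            # Too polymorphic: stop adding cases, always go via lookup.
            self.state = "open_pic"
            self.cases.clear()
            return
        self.cases[cls] = target
        self.state = "monomorphic" if len(self.cases) == 1 else "closed_pic"
```

Once a site reaches a steady state in this model, `_relink` is no longer called, which corresponds to the absence of code modification in the steady state.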
![Page 39: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/39.jpg)
The epiphenomena of the lesser spotted Inline Cache
in the steady state there is no code modification
monomorphs and closed PICs eliminate indirect branches, allowing the processor’s prefetch logic to gorge itself on instructions beyond each branch
eggs, monomorphs and polymorphs record concrete type information that can be harvested by an adaptive optimizer that speculatively inlines target methods
39
![Page 40: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/40.jpg)
two-word object header
“common” 32-bit/64-bit object header
classTableIndex is lookup key in in-line caches & first-level method lookup caches. GC doesn’t move them => simplified inline cache mgmt
index sparse class table with classTableIndex only for class hierarchy search & class primitive
no table access to instantiate known classes
A class’s id hash is its classTableIndex => no lookup to classTableIndex for new:
e.g.
word 1: class table index | flags etc
word 2: identity hash field | slot size
40
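The class table idea on the two-word header slide can be pictured with a small Python model. Everything here is an assumption made for illustration: the names `ToyClass`, `ToyClassTable` and `instantiate`, the way indices are handed out, and the use of a dict for the sparse table. The point it tries to show is only that, when a class's identity hash is its class table index, creating an instance needs no table access, while going from an instance back to its class does.

```python
class ToyClass:
    def __init__(self, name):
        self.name = name
        self.identity_hash = None  # becomes the class table index on registration


class ToyClassTable:
    """Sparse table from class table index to class object."""

    def __init__(self):
        self.entries = {}
        self.next_index = 1

    def register(self, klass):
        klass.identity_hash = self.next_index
        self.entries[klass.identity_hash] = klass
        self.next_index += 1
        return klass.identity_hash

    def class_at(self, index):
        # Needed only when the class object itself is wanted,
        # e.g. for the class primitive or a hierarchy search.
        return self.entries[index]


def instantiate(klass, slot_count):
    # new: copies the class's own hash into the header; no table lookup.
    return {"class_table_index": klass.identity_hash,
            "slots": [None] * slot_count}
```

An inline cache in this model would compare the small integer `class_table_index` rather than a class pointer, so a moving GC would not need to update it.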
![Page 41: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/41.jpg)
“Open” fast JIT
bytecode set translation via a table of functions not a switch statement
object model is an ADT
should be able to configure JIT for different bytecode sets, different GCs, object representations
hence Croquet, Squeak and Newspeak and...?
41
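Translating bytecodes through a table of functions rather than a switch statement might look like the following Python sketch. The opcode numbers, generator names and emitted strings are all hypothetical; the only idea taken from the slide is that a different bytecode set is handled by supplying a different table, not by editing the translation loop.

```python
def gen_push_receiver(operand):
    return ["push receiver"]


def gen_push_constant(operand):
    return ["push constant %r" % (operand,)]


def gen_return_top(operand):
    return ["return top"]


# Two made-up bytecode sets that share the same generator functions.
BYTECODE_SET_A = {0: gen_push_receiver, 1: gen_push_constant, 2: gen_return_top}
BYTECODE_SET_B = {10: gen_push_constant, 11: gen_return_top, 12: gen_push_receiver}


def translate(bytecodes, table):
    """Translate (opcode, operand) pairs using the given generator table."""
    code = []
    for opcode, operand in bytecodes:
        code.extend(table[opcode](operand))
    return code
```

For example, `translate([(1, 42), (2, None)], BYTECODE_SET_A)` and `translate([(10, 42), (11, None)], BYTECODE_SET_B)` yield the same instruction list from differently encoded methods.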
![Page 42: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/42.jpg)
quick DT JIT
target: good floating-point performance
AOStA again... Adaptive Optimization => SIStA Speculative Inlining
reify PIC state
count conditional branches
6x less frequent than sends
taken & untaken counts => basic block frequency
image-level optimizer
42
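How taken and untaken counts on conditional branches yield basic block frequencies can be shown with a small Python sketch. `BranchProfile` and its method names are assumptions for illustration, not part of the VM. The block ending in a branch ran once per branch execution, and each successor block was entered as often as the corresponding outcome occurred.

```python
from collections import Counter


class BranchProfile:
    """Per-branch taken/untaken counters, keyed by the branch's bytecode pc."""

    def __init__(self):
        self.taken = Counter()
        self.untaken = Counter()

    def record(self, branch_pc, was_taken):
        if was_taken:
            self.taken[branch_pc] += 1
        else:
            self.untaken[branch_pc] += 1

    def block_frequencies(self, branch_pc):
        taken = self.taken[branch_pc]
        untaken = self.untaken[branch_pc]
        return {
            "block_ending_in_branch": taken + untaken,
            "taken_successor": taken,
            "fall_through_successor": untaken,
        }
```

An image-level optimizer could read such counts to decide which paths are hot enough to be worth inlining.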
![Page 43: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/43.jpg)
quick DT JIT
image-level optimizer (many benefits)
VM extended with go-faster no-check primitive bytecodes, e.g.
add known non-overflowing SmallIntegers
in-bounds pointer at: known SmallInteger index
could marry well with LLVM
43
![Page 44: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/44.jpg)
quick DT JIT
model for good floating-point performance:
OptimizedContext has two stacks: object & byte data
VM extended with unboxed float bytecodes, e.g.
bdsDoubleAt:putProductOfBdsDblAt:andBdsDblAt:
pushBoxDbsDoubleAt:
code gen maps byte data stack to floating-point regs
44
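The two-stack model for floating point can be approximated in Python as below. This is a sketch under assumptions: the class and method names are illustrative renderings of the bytecode names on the slide, and an `array('d')` stands in for the byte data stack. Arithmetic stays on raw doubles in the byte data stack, and a boxed Float object is created only when a value has to cross onto the object stack.

```python
from array import array


class OptimizedContextSketch:
    """An activation with an object stack and a separate unboxed byte data stack."""

    def __init__(self, num_doubles):
        self.object_stack = []
        self.byte_data = array("d", [0.0] * num_doubles)

    def bds_double_at_put_product(self, dest, left, right):
        # Unboxed multiply: reads and writes raw doubles, allocates nothing.
        self.byte_data[dest] = self.byte_data[left] * self.byte_data[right]

    def push_boxed_double_at(self, index):
        # Box only at the boundary, when the value becomes an ordinary object.
        self.object_stack.append(float(self.byte_data[index]))
```

A code generator following this model could keep the byte data stack slots in floating-point registers, since no object references ever live there.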
![Page 45: COG Back to the Future, Part II](https://reader038.vdocuments.net/reader038/viewer/2022103017/55855237d8b42ae15d8b54d9/html5/thumbnails/45.jpg)
quick DT JIT
2010, 2011?
Sooner if you join the Cognoscenti...
45