python opcodes


Upload: alexgolec

Post on 06-May-2015

DESCRIPTION

The Python interpreter converts programs to bytecode before beginning execution. Execution itself consists of looping over these bytecodes and performing a specific operation for each one. This talk gives a very brief overview of the main classes of bytecodes. This presentation was given as a lightning talk at the Boston Python Meetup group on July 24th, 2012.

TRANSCRIPT

Page 1: Python opcodes

The Python Interpreter is Fun and Not At All Terrifying: Opcodes

name: Alex Golec

twitter: @alexandergolec (not @alexgolec :( )

email: akg2136 (rhymes with cat) columbia dot (short for education)

this talk lives at: blog.alexgolec.com

Page 2: Python opcodes

Python is Bytecode-Interpreted

• Your Python program is compiled down to bytecode

• Sort of like assembly for the Python virtual machine

• The interpreter executes each of these bytecodes one by one
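To make this concrete, here is a minimal sketch (not from the talk) that peeks at the bytecode of a small function. It assumes only that functions expose a `__code__` attribute, which holds on CPython 2.6+ and 3.x. The byte values themselves change from version to version, so none are shown.

```python
def parabola(x):
    return x*x + 4*x + 4

# The compiled form of the function lives on its code object.
code = parabola.__code__

# co_code is the raw bytecode: a plain sequence of byte values.
raw = bytearray(code.co_code)

print("%d bytes of bytecode" % len(raw))
print(list(raw[:6]))  # the first few byte values; these vary by version
```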

Page 3: Python opcodes

Before we Begin

• This presentation was written using CPython 2.7.2, which ships with the Mac OS X Mountain Lion GM image

• The more adventurous among you will find that minor details will differ on PyPy / IronPython / Jython

Page 4: Python opcodes

The Interpreter is Responsible For:

• Issuing commands to objects and maintaining stack state

• Flow Control

• Managing namespaces

• Turning code objects into functions and classes

Page 5: Python opcodes

Issuing Commands to Objects and Maintaining Stack State

Page 6: Python opcodes

The dis Module

>>> import dis
>>> def parabola(x):
...     return x*x + 4*x + 4
...
>>> dis.dis(parabola)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (4)
             10 LOAD_FAST                0 (x)
             13 BINARY_MULTIPLY
             14 BINARY_ADD
             15 LOAD_CONST               1 (4)
             18 BINARY_ADD
             19 RETURN_VALUE

• Each instruction is one byte, plus two more bytes when it takes an argument (note the offsets: 0, 3, 6, 7, ...)

• Opcodes have friendly (ish) mnemonics
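The offsets in the listing can be reproduced by hand. The sketch below decodes a byte string in the CPython 2.7 layout, where an opcode is one byte and opcodes numbered 90 or higher are followed by a two-byte argument. The byte values were reconstructed from the listing and the 2.7 opcode table rather than captured from a real session, and `decode`, `OPNAME` and `BYTECODE` are illustrative names. Later CPython releases use a different layout, so this applies to the 2.7 format only.

```python
# Opcode numbers from CPython 2.7's opcode table (only the ones used here).
OPNAME = {
    20: "BINARY_MULTIPLY",
    23: "BINARY_ADD",
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    124: "LOAD_FAST",
}
HAVE_ARGUMENT = 90  # in 2.7, opcodes >= 90 carry a 2-byte argument

# Assumed bytes for parabola(), rebuilt by hand from the disassembly.
BYTECODE = bytearray([
    124, 0, 0,   # LOAD_FAST 0
    124, 0, 0,   # LOAD_FAST 0
    20,          # BINARY_MULTIPLY
    100, 1, 0,   # LOAD_CONST 1
    124, 0, 0,   # LOAD_FAST 0
    20,          # BINARY_MULTIPLY
    23,          # BINARY_ADD
    100, 1, 0,   # LOAD_CONST 1
    23,          # BINARY_ADD
    83,          # RETURN_VALUE
])

def decode(bytecode):
    """Yield (offset, mnemonic, argument-or-None) for 2.7-style bytecode."""
    offset = 0
    while offset < len(bytecode):
        op = bytecode[offset]
        if op >= HAVE_ARGUMENT:
            # The argument is stored little-endian in the next two bytes.
            arg = bytecode[offset + 1] | (bytecode[offset + 2] << 8)
            yield offset, OPNAME[op], arg
            offset += 3
        else:
            yield offset, OPNAME[op], None
            offset += 1

for offset, name, arg in decode(BYTECODE):
    print("%3d %-16s %s" % (offset, name, "" if arg is None else arg))
```

The printed offsets line up with the ones `dis` reports in the listing.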

Page 7: Python opcodes

Example: Arithmetic Operations

• We don’t know the type of x!

• How does BINARY_MULTIPLY know how to perform multiplication?

• What if I pass a string?

• Note the lack of registers; the Python virtual machine is stack-based

>>> def parabola(x):
...     return x*x + 4*x + 4
...
>>> dis.dis(parabola)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (4)
             10 LOAD_FAST                0 (x)
             13 BINARY_MULTIPLY
             14 BINARY_ADD
             15 LOAD_CONST               1 (4)
             18 BINARY_ADD
             19 RETURN_VALUE

Page 8: Python opcodes

Things the Interpreter Doesn’t Do: Typed Method Dispatch

• The Python interpreter does not know anything about how to add two numbers (or objects, for that matter)

• Instead, it simply maintains a stack of objects, and when it comes time to perform an operation, asks them to perform the operation

• The result gets pushed onto the stack
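A small experiment shows the objects doing the work. In this sketch `Loud` is a hypothetical class, not something from the talk; it records each operation it is asked to perform. The same `parabola` bytecode runs unchanged whether `x` is a number, a `Loud`, or a string, and it is the operand that either answers or refuses.

```python
def parabola(x):
    return x*x + 4*x + 4

class Loud(object):
    """Hypothetical wrapper that records which operations it is asked to do."""
    log = []

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        Loud.log.append("__mul__")
        return Loud(self.value * getattr(other, "value", other))

    def __rmul__(self, other):
        # Reached for 4*x: the int declines, so Python asks the right operand.
        Loud.log.append("__rmul__")
        return Loud(other * self.value)

    def __add__(self, other):
        Loud.log.append("__add__")
        return Loud(self.value + getattr(other, "value", other))

result = parabola(Loud(3))
print(result.value)
print(Loud.log)

# A string has no way to multiply itself by another string, so it refuses.
try:
    parabola("abc")
except TypeError as err:
    print("str refused: %s" % err)
```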

Page 9: Python opcodes

Flow Control

Page 10: Python opcodes

Flow Control

• Jumps can be relative or absolute

• Relevant opcodes:

• JUMP_FORWARD

• POP_JUMP_IF_[TRUE/FALSE]

• JUMP_IF_[TRUE/FALSE]_OR_POP

• JUMP_ABSOLUTE

• SETUP_LOOP

• [BREAK/CONTINUE]_LOOP

>>> def abs(x):
...     if x < 0:
...         x = -x
...     return x
...
>>> dis.dis(abs)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (0)
              6 COMPARE_OP               0 (<)
              9 POP_JUMP_IF_FALSE       22

  3          12 LOAD_FAST                0 (x)
             15 UNARY_NEGATIVE
             16 STORE_FAST               0 (x)
             19 JUMP_FORWARD             0 (to 22)

  4     >>   22 LOAD_FAST                0 (x)
             25 RETURN_VALUE
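The two kinds of jump can be traced with a toy stack machine. This is a rough sketch, not how CPython is implemented: `PROGRAM` transcribes the `abs` listing by byte offset with arguments already resolved to names and values, and `run` is an illustrative helper. `POP_JUMP_IF_FALSE` sets the program counter to an absolute offset, while `JUMP_FORWARD` adds its argument to the offset of the following instruction.

```python
import operator

# The abs() disassembly, keyed by byte offset (2.7 layout).
PROGRAM = {
    0:  ("LOAD_FAST", "x"),
    3:  ("LOAD_CONST", 0),
    6:  ("COMPARE_OP", operator.lt),
    9:  ("POP_JUMP_IF_FALSE", 22),   # absolute target
    12: ("LOAD_FAST", "x"),
    15: ("UNARY_NEGATIVE", None),
    16: ("STORE_FAST", "x"),
    19: ("JUMP_FORWARD", 0),         # relative to the next instruction
    22: ("LOAD_FAST", "x"),
    25: ("RETURN_VALUE", None),
}

def run(x):
    stack, local_vars, pc = [], {"x": x}, 0
    while True:
        name, arg = PROGRAM[pc]
        # One byte per opcode, plus two more when there is an argument.
        next_pc = pc + (1 if arg is None else 3)
        if name == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif name == "LOAD_CONST":
            stack.append(arg)
        elif name == "STORE_FAST":
            local_vars[arg] = stack.pop()
        elif name == "COMPARE_OP":
            right, left = stack.pop(), stack.pop()
            stack.append(arg(left, right))
        elif name == "UNARY_NEGATIVE":
            stack.append(-stack.pop())
        elif name == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                next_pc = arg
        elif name == "JUMP_FORWARD":
            next_pc = next_pc + arg
        elif name == "RETURN_VALUE":
            return stack.pop()
        pc = next_pc

print(run(-5))
print(run(7))
```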

Page 11: Python opcodes

Managing Namespaces

Page 12: Python opcodes

Simple Namespaces

• Variables, functions, etc. are all treated identically

>>> def example():
...     variable = 1
...     def function():
...         print 'function'
...     del variable
...     del function
...
>>> dis.dis(example)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (variable)

  3           6 LOAD_CONST               2 (<code object function at 0x10c545930, file "<stdin>", line 3>)
              9 MAKE_FUNCTION            0
             12 STORE_FAST               1 (function)

  5          15 DELETE_FAST              0 (variable)

  6          18 DELETE_FAST              1 (function)
             21 LOAD_CONST               0 (None)
             24 RETURN_VALUE

• Once the name is assigned to the object, the interpreter completely forgets everything about it except the name
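The uniform treatment of names is visible from ordinary Python. In this sketch, `rebinding` is a hypothetical helper added for illustration, and `example` returns a string instead of printing so that it runs on both Python 2 and 3. Both names land in the same table of local names, and a name that held an integer can be rebound to a function without complaint.

```python
def example():
    variable = 1
    def function():
        return 'function'
    del variable
    del function

# Both names sit side by side in the same table of local names; nothing
# marks one as "a variable" and the other as "a function".
print(example.__code__.co_varnames)

def rebinding():
    thing = 1            # the name is bound to an int
    def thing():         # the same name, now rebound to a function object
        return 'function'
    return thing()

print(rebinding())
```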

Page 13: Python opcodes

Turning Code Objects into Functions and Classes

Page 14: Python opcodes

Functions First!

>>> def square(inputfunc):
...     def f(x):
...         return inputfunc(x) * inputfunc(x)
...     return f
...
>>> dis.dis(square)
  2           0 LOAD_CLOSURE             0 (inputfunc)
              3 BUILD_TUPLE              1
              6 LOAD_CONST               1 (<code object f at 0x10c545a30, file "<stdin>", line 2>)
              9 MAKE_CLOSURE             0
             12 STORE_FAST               1 (f)

  4          15 LOAD_FAST                1 (f)
             18 RETURN_VALUE

• The compiler generates code objects and sticks them in memory
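Where those code objects end up can be checked directly. The sketch below assumes only standard attributes (`__code__`, `co_consts`, `__closure__`); `add_one` is a throwaway helper for the demonstration. The inner function's code object is stored among the constants of the outer one, and each call to `square` wraps that same code object in a fresh function whose closure cell holds `inputfunc`.

```python
import types

def square(inputfunc):
    def f(x):
        return inputfunc(x) * inputfunc(x)
    return f

# The body of f was compiled ahead of time and stored among the
# constants of the enclosing function's code object.
nested = [c for c in square.__code__.co_consts
          if isinstance(c, types.CodeType)]
print([c.co_name for c in nested])

def add_one(v):
    return v + 1

# Calling square() builds a function object around that code object,
# with a closure cell that remembers inputfunc.
g = square(add_one)
print(g.__code__ is nested[0])
print(g.__closure__[0].cell_contents is add_one)
print(g(2))
```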

Page 15: Python opcodes

Now Classes!

>>> def make_point(dimension, names):
...     class Point:
...         def __init__(self, *data):
...             pass
...         dimension = dimensions
...     return Point
...
>>> dis.dis(make_point)
  2           0 LOAD_CONST               1 ('Point')
              3 LOAD_CONST               3 (())
              6 LOAD_CONST               2 (<code object Point at 0x10c545c30, file "<stdin>", line 2>)
              9 MAKE_FUNCTION            0
             12 CALL_FUNCTION            0
             15 BUILD_CLASS
             16 STORE_FAST               2 (Point)

  6          19 LOAD_FAST                2 (Point)
             22 RETURN_VALUE

BUILD_CLASS()

Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of the names of the base classes, and TOS2 the class name.
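The three ingredients that BUILD_CLASS takes from the stack are the same three that the built-in `type()` accepts. The sketch below is an analogy rather than what the interpreter runs: it assumes a new-style class with `object` as its base, whereas the slide's `class Point:` is an old-style class in 2.7. This `make_point` is a simplified variant of the slide's function.

```python
def make_point(dimensions):
    def __init__(self, *data):
        self.data = data

    # The same three ingredients the opcode expects on the stack.
    name = 'Point'
    bases = (object,)
    methods = {'__init__': __init__, 'dimension': dimensions}
    return type(name, bases, methods)

Point = make_point(3)
p = Point(1, 2, 3)
print(Point.__name__)
print(Point.dimension)
print(p.data)
```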

Page 16: Python opcodes

Other Things

• Exceptions

• Loops

• Technically flow control, but they’re a little more involved

Page 17: Python opcodes

Now, We Have Some Fun

Page 18: Python opcodes

What to Do With Our Newly Acquired Knowledge of Dark Magic?

Page 19: Python opcodes

Write your own Python interpreter!

Page 20: Python opcodes

Static Code Analysis!

Page 21: Python opcodes

Understand How PyPy Does It!

Page 22: Python opcodes

Buy Me Alcohol!

Or at least provide me with pleasant conversation

Page 23: Python opcodes

Slideshare-only Bonus Slide: Exception Handling!

Page 24: Python opcodes

• The exception context is pushed by SETUP_EXCEPT

• If an exception is thrown, control jumps to the address recorded in the top exception context, in this case the instruction at offset 15

• If there is no top exception context, the interpreter halts and notifies you of the error

• The yellow opcodes (highlighted on the original slide; the handler begins at offset 15) check whether the exception thrown matches the type named in the except statement and, if it does, execute the except block

• At END_FINALLY, the interpreter is responsible for popping the exception context, and either re-raising the exception, in which case the next-topmost exception context will trigger, or returning from the function

• Notice that the red opcodes (also highlighted on the original slide) will never be executed

• The first: between a return and a jump target (by that description, POP_BLOCK and JUMP_FORWARD at offsets 11 and 12)

• The second: only reachable by jumping from dead code (by that description, the final LOAD_CONST and RETURN_VALUE at offsets 33 and 36)

• CPython’s philosophy of architectural and implementation simplicity tolerates such minor inefficiencies

>>> def list_get(lst, pos):
...     try:
...         return lst[pos]
...     except IndexError:
...         return None
...     # there is an invisible "return None" here
...
>>> dis.dis(list_get)
  2           0 SETUP_EXCEPT            12 (to 15)

  3           3 LOAD_FAST                0 (lst)
              6 LOAD_FAST                1 (pos)
              9 BINARY_SUBSCR
             10 RETURN_VALUE
             11 POP_BLOCK
             12 JUMP_FORWARD            18 (to 33)

  4     >>   15 DUP_TOP
             16 LOAD_GLOBAL              0 (IndexError)
             19 COMPARE_OP              10 (exception match)
             22 POP_JUMP_IF_FALSE       32
             25 POP_TOP
             26 POP_TOP
             27 POP_TOP

  5          28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
        >>   32 END_FINALLY
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE
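The three paths through this bytecode can be exercised from the outside. A minimal sketch, using the same `list_get` as the slide: a normal return, a matching `IndexError` handled by the except block, and a non-matching exception that falls through to the re-raise and propagates to the caller.

```python
def list_get(lst, pos):
    try:
        return lst[pos]
    except IndexError:
        return None

print(list_get([10, 20, 30], 1))   # normal path: the try body returns
print(list_get([], 0))             # IndexError matches: the handler returns None

try:
    list_get([10, 20, 30], 'a')    # TypeError does not match IndexError
except TypeError as err:
    print("propagated: %s" % err)
```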

Page 25: Python opcodes

Thanks!
