using cxx::types

Post on 02-Jul-2015

By Jordan DeLong. ©2012- Facebook. Do not redistribute.

using cxx::types;

Jordan DeLong, Software Engineer, Facebook

Overview

• C++ and its type system

• Weakly typed code

• Refactoring example

Preliminaries

Why does Facebook use C++?

• Performance

◦ At scale: operational costs > engineering costs

◦ C++ gives programmers low-level control

• Abstraction tools

◦ Lambdas and higher-order functions

◦ Type deduction (auto, template arguments)

◦ Powerful type system

C++: Powerful Type System

• template<class T>

• template<int I>

• template<template <class> class T>

• OO-style subtyping/polymorphism

Basics:

• New statically incompatible types can be created

• Function and operator overloading

C++: “Powerful” Type System?

• Error-prone standard conversions

◦ Nearly all primitive types convert to bool

◦ unsigned to signed

◦ narrowing conversions

◦ floating-integral conversions

• void*, unsigned char*, untyped memory

• typedef only makes type aliases

◦ Creating real new types is more verbose
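The last point can be made concrete with a small sketch. The names UserId, PostId, StrongUserId, StrongPostId and lookupKey are hypothetical and do not come from the talk; the static_asserts only show what the compiler treats as the same type.

```cpp
#include <cassert>
#include <type_traits>

// A typedef introduces only another name for an existing type.
typedef int UserId;
typedef int PostId;
static_assert(std::is_same<UserId, PostId>::value,
              "typedef aliases are interchangeable");

// A distinct type takes more code: wrap the value, make construction explicit.
// StrongUserId and StrongPostId are hypothetical names for this sketch.
struct StrongUserId {
  explicit constexpr StrongUserId(int v) : value(v) {}
  int value;
};
struct StrongPostId {
  explicit constexpr StrongPostId(int v) : value(v) {}
  int value;
};
static_assert(!std::is_convertible<StrongUserId, StrongPostId>::value,
              "distinct wrapper types do not convert to each other");
static_assert(!std::is_convertible<int, StrongUserId>::value,
              "a plain int does not silently become an id");

// Accepts only a StrongUserId; a StrongPostId or a bare int will not compile.
constexpr int lookupKey(StrongUserId id) { return id.value; }
```

A call such as lookupKey(StrongUserId(7)) compiles, while lookupKey(7) and lookupKey(StrongPostId(7)) are rejected.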

Strong vs. Weak

Usual definition:

A type system is “strong” if it disallows conversions between values of different types.

Many? Most? Unsafe? Implicit?

In the context of static type systems:

• Unclear as a language property

• Better: “strongly typed” is a property of code

• Weakly typed code can be written in a strongly typed language

What We Really Want

Goal: fewer bugs, guaranteed correctness.

• Move runtime errors to compile time

• One way to do this: strongly typed APIs

What this means:

• Types are a critical part of interface design

• Types should encode semantics

• Primitive types are just building blocks (especially in C++)

Weakly Typed APIs

Weak: High-arity Functions

public void DrawRectangle(

Color colorOutline,

int thicknessOutline,

int x,

int y,

int width,

int height,

int xCornerRadius,

int yCornerRadius,

Color colorGradientStart,

int xGradientStart,

int yGradientStart,

Color colorGradientEnd,

int xGradientEnd,

int yGradientEnd,

UInt16 opacity

)

• Semantics are primarily encoded by position

• Worse when arguments have compatible types

• Temptation to use or add defaulted arguments

◦ Hinders refactoring
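One common way to reduce arity is to group related arguments into small value types. The sketch below is only an illustration: Point, Size, Rect, area and contains are hypothetical names, not part of the API shown on the slide.

```cpp
#include <cassert>

// Hypothetical value types for this sketch; none of them come from the slides.
struct Point { int x; int y; };
struct Size  { int width; int height; };
struct Rect  { Point origin; Size size; };

// The parameter type says what the numbers mean, not just their position.
inline int area(Size s) { return s.width * s.height; }

// Returns true when p lies inside r (right and bottom edges excluded).
// Calling contains(p, r) with the arguments swapped does not compile.
inline bool contains(Rect r, Point p) {
  return p.x >= r.origin.x && p.x < r.origin.x + r.size.width &&
         p.y >= r.origin.y && p.y < r.origin.y + r.size.height;
}
```

With types like these, a drawing function could take a Rect and a few similar aggregates instead of fifteen positional ints.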

Weak: “Types” in Identifiers

void sleep(int seconds);

sleep(3600 * 60 * 2 /* two hours */);

• Semantics are encoded in the identifier

• Use types that encode units, e.g. std::chrono::duration<>

using namespace std::chrono;

void sleep(seconds s);

sleep(duration_cast<seconds>(hours(2)));
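The arithmetic behind this slide can be checked at compile time. In the sketch below, toSeconds is a hypothetical helper standing in for a function that takes std::chrono::seconds; everything else is standard <chrono>.

```cpp
#include <cassert>
#include <chrono>

using namespace std::chrono;

// hours converts to seconds implicitly because the conversion is lossless.
static_assert(seconds(hours(2)).count() == 7200, "two hours is 7200 seconds");

// The int-based call asks for 3600 * 60 * 2 = 432000 seconds.
static_assert(duration_cast<hours>(seconds(3600 * 60 * 2)).count() == 120,
              "that is 120 hours, not two");

// Hypothetical helper: callers may pass any coarser unit and get seconds back.
constexpr long long toSeconds(seconds s) { return s.count(); }
```

Because the unit is part of the type, a caller writes hours(2) and the library does the multiplication.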

Weak: Boolean Arguments

typedef int Id;

enum MetaKind { ... };

void addMeta(int pos,

MetaKind kind,

MetaData* mdata,

bool mIsVector,

Id id);

Particularly bad in C++: every other argument type here implicitly converts to bool.
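A scoped enum is one possible replacement for the bool parameter. MetaShape and isVector are hypothetical names for this sketch, and MetaData is an empty stand-in for the type on the slide.

```cpp
#include <cassert>
#include <type_traits>

struct MetaData {};  // stand-in for the slide's MetaData

// Hypothetical replacement for the bool parameter.
enum class MetaShape { Scalar, Vector };

static_assert(std::is_convertible<MetaData*, bool>::value,
              "a pointer silently converts to bool");
static_assert(std::is_convertible<int, bool>::value,
              "so does an int");
static_assert(!std::is_convertible<MetaData*, MetaShape>::value,
              "a pointer does not convert to a scoped enum");
static_assert(!std::is_convertible<int, MetaShape>::value,
              "neither does an int");
static_assert(!std::is_convertible<bool, MetaShape>::value,
              "nor a bool");

inline bool isVector(MetaShape s) { return s == MetaShape::Vector; }
```

A call site then reads MetaShape::Vector instead of a bare true, and a misplaced argument is a compile error.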

Refactoring: HHVM Assembler

An Assembler API

typedef int register_name_t;

const register_name_t rax = 0;

const register_name_t rbx = 1;

// ...

struct Asm {

// ...

void load_reg64_disp_reg32(int rbase, int disp,

int rdest);

void load_reg64_disp_reg64(int, int, int);

void sub_imm32_reg32(intptr_t, int);

void mov_imm64_reg(intptr_t, int);

void load_reg64_index_scale_disp_reg64(

int rbase, int rindex, int scale, int disp,

int rdest);

// ...

};

x64 Memory Operands

// *rax = rbx;

movq %rbx, (%rax) ; base + 0

// rax[0xc] = rbx;

movq %rbx, 0xc(%rax) ; base + disp

// rax[rcx*2+0xc] = 0x42;

movq $0x42, 0xc(%rax,%rcx,0x2) ; base + idx*2 + disp

General case: base + index * scale + displacement.
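As a worked version of the general case, the sketch below computes the address a memory operand refers to. effectiveAddress is a hypothetical helper; the only outside fact it relies on is that x64 encodes scale factors of 1, 2, 4 and 8.

```cpp
#include <cassert>
#include <cstdint>

// Computes base + index * scale + disp, the address form described above.
// x64 only encodes scale factors of 1, 2, 4 and 8.
inline std::uint64_t effectiveAddress(std::uint64_t base, std::uint64_t index,
                                      unsigned scale, std::int32_t disp) {
  assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
  // The displacement is signed, so widen it before adding.
  const std::int64_t wideDisp = disp;
  return base + index * scale + static_cast<std::uint64_t>(wideDisp);
}
```

For the third example on the slide, a base of 0x1000 and an index of 3 give 0x1000 + 3*2 + 0xc = 0x1012.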

Using the API

void (*g_destructors[4])(void*) =

{destructString, destructArray,

destructObject, destructRef};

// Dispatch to appropriate destructor function

a.load_reg64_disp_reg32(rbx, TVOFF(m_type), rsi);

a.sub_imm32_reg32(KindOfString, rsi);

a.mov_imm64_reg(uintptr_t(&g_destructors), rax);

a.load_reg64_index_scale_disp_reg64(

rax, rsi, 8, 0, rax);

a.call_reg(rax);

Using the API

movl 0xc(%rbx), %esi

subl $0xf,%esi

movq $0x6a28240,%rax

movq (%rax,%rsi,8),%rax

callq *%rax

Misusing the API

void load_reg64_index_scale_disp_reg64(

int rbase, int rindex, int scale, int disp,

int rdest);

Possible Errors #1

a.load_reg64_disp_reg64(rVmFp, AROFF(m_this), rax);

a.store_reg64_disp_reg64(0,

AROFF(m_this), rVmFp); // <--

a.shr_imm32_reg64(1, rax);

a.jcc(CC_NBE, decRefThisStub);

Possible Errors #1

a.load_reg64_disp_reg64(rVmFp, AROFF(m_this), rax);

a.store_imm64_disp_reg64(0, // reg -> imm

AROFF(m_this), rVmFp); // <--

a.shr_imm32_reg64(1, rax);

a.jcc(CC_NBE, decRefThisStub);

Possible Errors #2

a.mov_imm32_reg32(-1, rax);

a.store_reg64_disp_reg64(rax,

0, rbx); // not sign-extended

Possible Errors #2

a.mov_imm32_reg64(-1, rax); // 32 -> 64

a.store_reg64_disp_reg64(rax,

0, rbx); // not sign-extended
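The difference between the 32-bit and 64-bit moves can be modelled with plain integer casts. writeLow32 and writeSignExtended are hypothetical helpers that mimic the register contents after each kind of move; they do not emit any code.

```cpp
#include <cassert>
#include <cstdint>

// Writing a 32-bit register on x64 clears the upper 32 bits of the full
// 64-bit register, so the value is zero-extended.
inline std::uint64_t writeLow32(std::int32_t imm) {
  return static_cast<std::uint32_t>(imm);
}

// A 64-bit move of a 32-bit immediate sign-extends the immediate.
inline std::uint64_t writeSignExtended(std::int32_t imm) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(imm));
}
```

For -1 the first helper yields 0x00000000ffffffff and the second 0xffffffffffffffff, so a later 64-bit store sees different values. For non-negative immediates the two agree, which is why this kind of bug can go unnoticed.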

Improving on this

• Argument semantics we can encode in types:

◦ Registers vs. immediates

◦ How a register is used (value vs. part of memory operand)

◦ Operand sizes (eax vs. rax)

• Reduce function arity

• Function naming: closer to x64 opcode mnemonics

Refactored API

// Dispatch to appropriate destructor function

a.movl(rbx[TVOFF(m_type)], esi);

a.subl(KindOfString, esi);

a.movq(&g_destructors, rax);

a.movq(rax[rsi*8], rax);

a.call(rax);

New Register Types

/// Reg64, Reg32, RegXMM, RegRIP ...

constexpr Reg64 rax(0);

constexpr Reg64 rcx(1);

constexpr Reg32 eax(0);

constexpr Reg32 ecx(1);

constexpr Reg8 al(0);

constexpr RegXMM xmm0(0);

constexpr RegRIP rip;

// etc

Register Type Details

struct Reg64 {

explicit constexpr Reg64(int);

explicit constexpr operator int() const;

constexpr bool operator==(Reg64) const;

constexpr bool operator!=(Reg64) const;

// ...

};

• We’re using a struct instead of enum class because we want to define an operator[] later.
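One possible completion of the Reg64 declarations is sketched below. The member name m_id and the function bodies are assumptions; the slides only show the signatures.

```cpp
#include <cassert>
#include <type_traits>

// A sketch of how the declared members might be defined; m_id is an
// assumed member name, not taken from the slides.
struct Reg64 {
  explicit constexpr Reg64(int id) : m_id(id) {}
  explicit constexpr operator int() const { return m_id; }
  constexpr bool operator==(Reg64 o) const { return m_id == o.m_id; }
  constexpr bool operator!=(Reg64 o) const { return m_id != o.m_id; }

 private:
  int m_id;
};

constexpr Reg64 rax(0);
constexpr Reg64 rcx(1);

static_assert(rax != rcx, "different registers compare unequal");
static_assert(int(rcx) == 1, "explicit conversion recovers the encoding");
static_assert(!std::is_convertible<int, Reg64>::value,
              "an int is not silently treated as a register");
static_assert(!std::is_convertible<Reg64, int>::value,
              "a register is not silently treated as an int");
```

Both conversions are explicit, so a literal 0 can no longer be mistaken for rax at a call site.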

Register to Register mov

void movb(Reg8, Reg8); // 8-bit operands

void movl(Reg32, Reg32); // 32-bit operands

void movq(Reg64, Reg64); // 64-bit operands

a.movl(rax, rbx); // compile-time error

a.movl(eax, ebx); // ok
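The rejection can be observed without breaking the build by using a detection trait. In this sketch Asm is a toy that returns the operand width instead of emitting bytes, and CanMovl is a hypothetical trait; neither is part of the real assembler.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Reg32 { explicit constexpr Reg32(int i) : id(i) {} int id; };
struct Reg64 { explicit constexpr Reg64(int i) : id(i) {} int id; };

constexpr Reg64 rax(0);
constexpr Reg64 rcx(1);
constexpr Reg32 eax(0);
constexpr Reg32 ecx(1);

// Toy assembler: returns the operand width in bits instead of emitting bytes.
struct Asm {
  int movl(Reg32, Reg32) { return 32; }
  int movq(Reg64, Reg64) { return 64; }
};

// Detects whether Asm::movl(A, B) would compile, without triggering the error.
template <class A, class B, class = void>
struct CanMovl : std::false_type {};

template <class A, class B>
struct CanMovl<A, B,
               decltype(void(std::declval<Asm&>().movl(std::declval<A>(),
                                                       std::declval<B>())))>
    : std::true_type {};

static_assert(CanMovl<Reg32, Reg32>::value, "movl(eax, ecx) is accepted");
static_assert(!CanMovl<Reg64, Reg64>::value, "movl(rax, rcx) is rejected");
```

The second static_assert holds because Reg64 has no conversion to Reg32, so overload resolution finds no viable movl.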

Simple Memory Operands

// reg + offset

struct DispReg {

Reg64 rbase;

intptr_t disp;

};

DispReg operator+(Reg64 rbase, intptr_t disp);
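A minimal sketch of how the operator might be defined follows. Reg64 is reduced to a public id field here for brevity, which is an assumption of the sketch rather than the layout from the talk.

```cpp
#include <cassert>
#include <cstdint>

// Simplified register type for this sketch (public id is an assumption).
struct Reg64 {
  explicit constexpr Reg64(int i) : id(i) {}
  int id;
};

// reg + offset, as declared above.
struct DispReg {
  Reg64 rbase;
  std::intptr_t disp;
};

// Builds the operand from the natural-looking expression `reg + offset`.
constexpr DispReg operator+(Reg64 rbase, std::intptr_t disp) {
  return DispReg{rbase, disp};
}

constexpr Reg64 rax(0);

static_assert((rax + 0xc).disp == 0xc, "the offset is recorded");
static_assert((rax + 0xc).rbase.id == 0, "the base register is recorded");
```

The expression rax + 0xc performs no arithmetic on a register value; it only records the base and the displacement for the encoder.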

Indexed Memory Operands

// reg * scale

struct ScaledIndex {

Reg64 rindex;

int scale;

};

ScaledIndex operator*(Reg64 rindex, int scale);

// reg + reg*scale + disp

struct IndexedDispReg {

Reg64 rbase;

ScaledIndex index;

intptr_t disp;

};

IndexedDispReg operator+(Reg64 rbase, ScaledIndex);

IndexedDispReg operator+(IndexedDispReg, intptr_t);

Dereferenced Memory Operands

// *(reg + offset)

struct MemoryRef {

DispReg dr;

};

MemoryRef operator*(DispReg);

// *(reg + reg*scale + disp)

struct IndexedMemoryRef {

IndexedDispReg dr;

};

IndexedMemoryRef operator*(IndexedDispReg);
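Put together, the operand types form a small expression language. The sketch below supplies assumed bodies for the declared operators (the slides give only signatures) and uses a simplified Reg64 with a public id field.

```cpp
#include <cassert>
#include <cstdint>

// Simplified register type for this sketch (public id is an assumption).
struct Reg64 {
  explicit constexpr Reg64(int i) : id(i) {}
  int id;
};
struct DispReg          { Reg64 rbase; std::intptr_t disp; };
struct ScaledIndex      { Reg64 rindex; int scale; };
struct IndexedDispReg   { Reg64 rbase; ScaledIndex index; std::intptr_t disp; };
struct MemoryRef        { DispReg dr; };
struct IndexedMemoryRef { IndexedDispReg dr; };

constexpr DispReg operator+(Reg64 r, std::intptr_t d) { return DispReg{r, d}; }
constexpr ScaledIndex operator*(Reg64 r, int scale) {
  return ScaledIndex{r, scale};
}
constexpr IndexedDispReg operator+(Reg64 r, ScaledIndex si) {
  return IndexedDispReg{r, si, 0};
}
constexpr IndexedDispReg operator+(IndexedDispReg r, std::intptr_t d) {
  return IndexedDispReg{r.rbase, r.index, r.disp + d};
}

// Unary * turns an address expression into a memory operand.
constexpr MemoryRef operator*(DispReg dr) { return MemoryRef{dr}; }
constexpr IndexedMemoryRef operator*(IndexedDispReg dr) {
  return IndexedMemoryRef{dr};
}

constexpr Reg64 rax(0);
constexpr Reg64 rcx(1);

// Mirrors the operand 0xc(%rax,%rcx,2): base + index*scale + disp.
constexpr IndexedMemoryRef operand = *(rax + rcx * 2 + 0xc);
static_assert(operand.dr.rbase.id == 0 && operand.dr.index.rindex.id == 1 &&
                  operand.dr.index.scale == 2 && operand.dr.disp == 0xc,
              "all three parts of the address are captured");
```

Keeping DispReg and MemoryRef as separate types is what lets lea take an address expression while mov takes a dereferenced operand.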

Loads

void movq(MemoryRef, Reg64);

void movl(MemoryRef, Reg32);

void movq(IndexedMemoryRef, Reg64);

void movl(IndexedMemoryRef, Reg32);

a.movl(*(rbx + 0xc), rax); // compile-time error

a.movl(*(rbx + 0xc), eax); // ok

Stores

void movq(Reg64, MemoryRef);

void movl(Reg32, MemoryRef);

void movq(Reg64, IndexedMemoryRef);

void movl(Reg32, IndexedMemoryRef);

a.movq(rbx, rax[0xc]); // Reg64 gets an operator[]

a.movq(ebx, rax[0xc]); // compile-time error
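A possible shape for the subscript operators is sketched below. The two overloads and their bodies are assumptions based on how the operator is used in the slides (rax[0xc] and rax[rsi*8]); Reg64 again uses a public id field for brevity.

```cpp
#include <cassert>
#include <cstdint>

struct MemoryRef;
struct IndexedMemoryRef;
struct ScaledIndex;

struct Reg64 {
  explicit constexpr Reg64(int i) : id(i) {}
  // Declared here, defined below once the operand types are complete.
  MemoryRef operator[](std::intptr_t disp) const;
  IndexedMemoryRef operator[](ScaledIndex si) const;
  int id;
};

struct DispReg          { Reg64 rbase; std::intptr_t disp; };
struct ScaledIndex      { Reg64 rindex; int scale; };
struct IndexedDispReg   { Reg64 rbase; ScaledIndex index; std::intptr_t disp; };
struct MemoryRef        { DispReg dr; };
struct IndexedMemoryRef { IndexedDispReg dr; };

inline ScaledIndex operator*(Reg64 r, int scale) {
  return ScaledIndex{r, scale};
}

// r[disp] is shorthand for *(r + disp).
inline MemoryRef Reg64::operator[](std::intptr_t disp) const {
  return MemoryRef{DispReg{*this, disp}};
}

// r[index * scale] is shorthand for *(r + index * scale).
inline IndexedMemoryRef Reg64::operator[](ScaledIndex si) const {
  return IndexedMemoryRef{IndexedDispReg{*this, si, 0}};
}
```

An enum class cannot have member functions, which is the reason given earlier for making the register types structs.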

Other opcodes

// Load effective address:

void lea(IndexedDispReg, Reg64);

void lea(DispReg, Reg64);

// Push can take memory or registers:

void pushq(IndexedMemoryRef);

void pushq(MemoryRef);

void pushq(Reg64);

Discussion

• Registers ≢ Memory Operands ≢ Immediates ≢ Registers

• Memory Operands: encoded as an embedded expression language

• Immediates:

void movq(Immed, Reg64);

void movl(Immed, Reg32);

void movb(Immed, Reg8);

◦ Thin, runtime-checked wrapper around intptr_t

◦ Still potentially vulnerable to runtime integer-related issues

• Looks more like the assembly we’re trying to generate

Making immediates safer

void movimm(Immed immed, Reg64 reg);

struct Immed {

template<class T>

/* implicit */ Immed(

T i, typename std::enable_if<...>::type* = 0

) : m_int(/* ... */) {}

// various accessors q(), l(), w()

// fitsSigned(), fitsUnsigned()

};
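The slide elides the enable_if condition and the accessor bodies. The sketch below fills them in with assumptions: it restricts the constructor to integral types, stores the value as a 64-bit integer, and uses the hypothetical names fitsSigned32 and fitsUnsigned32 in place of the size-parameterised checks.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

struct Immed {
  // Assumed condition: accept integral types only, so a double or a pointer
  // does not convert to an immediate.
  template <class T>
  /* implicit */ Immed(
      T i,
      typename std::enable_if<std::is_integral<T>::value>::type* = 0)
      : m_int(static_cast<std::int64_t>(i)) {}

  std::int64_t q() const { return m_int; }  // full 64-bit value

  // 32-bit view, checked at runtime.
  std::int32_t l() const {
    assert(fitsSigned32());
    return static_cast<std::int32_t>(m_int);
  }

  bool fitsSigned32() const {
    return m_int >= std::numeric_limits<std::int32_t>::min() &&
           m_int <= std::numeric_limits<std::int32_t>::max();
  }
  bool fitsUnsigned32() const {
    return m_int >= 0 && m_int <= std::numeric_limits<std::uint32_t>::max();
  }

 private:
  std::int64_t m_int;
};

static_assert(std::is_convertible<int, Immed>::value,
              "integers convert implicitly");
static_assert(!std::is_convertible<double, Immed>::value,
              "floating-point values do not");
static_assert(!std::is_convertible<void*, Immed>::value,
              "pointers do not");
```

The type restriction is checked at compile time, while the range checks can only run once the value is known, which matches the "thin, runtime-checked wrapper" description.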

void movimm(Immed imm, Reg64 dest) {

if (imm.q() == 0) return xorl(r32(dest), r32(dest));

if (imm.q() > 0 && imm.fitsUnsigned(sz::dword)) {

return movl(imm, r32(dest)); // zeros top bits

}

movq(imm, dest);

}
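The selection logic in movimm can be isolated and tested on its own. Encoding and chooseMovImm are hypothetical names; the function mirrors the three branches above on a plain 64-bit integer instead of emitting instructions.

```cpp
#include <cassert>
#include <cstdint>

// Which instruction the movimm logic above would pick for a given immediate.
enum class Encoding { XorSelf, Mov32ZeroExtend, Mov64 };

inline Encoding chooseMovImm(std::int64_t imm) {
  if (imm == 0) return Encoding::XorSelf;    // xorl %r32, %r32
  if (imm > 0 && imm <= 0xffffffffLL) {
    return Encoding::Mov32ZeroExtend;        // movl $imm, %r32 (zeros top bits)
  }
  return Encoding::Mov64;                    // movq $imm, %r64
}
```

Negative values take the 64-bit path because a 32-bit move would zero the upper half of the register rather than sign-extend it.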
