hpca 2019 keynote towards secure high- performance ...people.csail.mit.edu/devadas/pubs/hpca.pdf ·...

Post on 11-Mar-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Srini DevadasMassachusetts Institute of Technology

HPCA 2019 Keynote

Towards Secure High-Performance Computer

Architectures

Architectural Isolationof Processes

2

Fundamental to maintaining correctness and privacy!

Performance Dictates Microarchitectural Optimization

3

Isolation Breaks Because of Shared Microarchitectural State!

Shared Last Level Cache

4

Line Offsetl-1…0

Address Tagn-1…s+l

Set Indexs+l-1…l

Memory Address

…Set S-1, Way 1 Set S-1, Way W-1Set S-1, Way 0⋮ ⋱ ⋮⋮

Set i, Way 1 Set i, Way W-1…Set i, Way 0

⋮⋮ ⋮ ⋱

Set 1, Way W-1Set 1, Way 0 …Set 1, Way 1

Set 0, Way W-1…Set 0, Way 1Set 0, Way 0

Way W-1…Way 1Way 0

Tag Line Tag Line Tag Line

Matched Line

Tag Comparator

Match? Matched Word

Process 1

Process 2

Control Flow Speculation for Performance

I: Compute

I+1: Compute

I+2: Compute

I+3: Compute

I: Control Flow

J: Compute

J+1: Compute

J+2: Compute

K: Compute

K+1: Compute

K+2: Compute

Correct direction

Mis-speculated direction

Sequential InstructionExecution

Non-Sequential InstructionExecution

5

Control Flow Speculation is insecure

Speculative execution does not affect architectural state → “correct”… but can be observed via some “side channels” (primarily cache tag state)

… and attacker can influence (mis)speculation (branch predictor inputs not authenticated)

A huge, complex attack surface!6

Domain of Victim

Transmitter

Secret

ReceiverChannelAccess

Secret

Attacker

Pre-existing (RSA conditional execution example)Written by attacker (Meltdown)Synthesized out of existing victim code by attacker (Spectre)

Building a Transmitter

7

8

• Real systems: large, complex, cyberphysical(not secure)

•Introducing a spy:

Hypervisor,Bios

CPU

Side Channels Gone Wild!

DR

AMChipset

Network

Thread L1 $L2 $

L3 $DR

AM C

trl.

Disk

Mainboard

OS

sharing!

sharing!sharing!

App

admins! users!vendors!

users!

admins!

Philosophy

Build enclaves on an enclave platform, not

just processes

9

Enclaves strengthenthe process abstraction

• Processes guarantee isolation of memory• Enclaves provide a stronger guarantee

– No other program can infer anything private from the enclave program through its use of shared resources or shared microarchitectural state

• Largely decouple performance considerations from security

• Minimally invasive hardware changes• Provable security under chosen threat model

10

A Typical ComputerTrusted Computing Base

TRUSTED!

TRUSTED!

CPU!!

DRAM

!Chipset!

Network!

Thread! L1 $!L2 $!

L3 $!DRAM

Ctrl

.!Disk!

Main!board!

Priv

eleg

e!

BIOS (SMM)!

Hypervisor (Ring 0, VMX root) !

OS Kernel (Ring 0) !

App! App!

(Ring 3) !

Software…!… Running on hardware!

Secure App!

Single-Chip Secure Processor:Shrink the TCB

Protected Environment

Memory

I/O

TrustedSoftware

Protect

Identify

• Enclave assumes trusted hardware + trusted software “monitor”

• Operating system is untrusted 12

Edward Suh’s ICS 2003 Talk on Aegis processor

Enclave Enclave

Enclave Lifecycle (simplified)

13

Enclave binary image

Untrusted software (OS)

Create enclave,Grant resources

Loadenclave

Sealenclave

Enclave

Enclave binary image

Enclave binary image

① ② ③

Enclave can no longer be modified from outside

Enclave has a measurement for attestation

Enclave Lifecycle (simplified)

14

Untrusted software (OS)

Create enclave,Grant resources

Loadenclave

Sealenclave

Enterenclave

Exitenclave

Enclave binary image

④① ② ③

(enclave executes in a strongly isolated

environment)

Platform erases (flushes)

sensitive state

Strong Isolation Goal

• Any attack by a privileged attacker on the same machine as the victim that can extract a secret inside the victim enclave, could also have been run successfully by an attacker on a different machine than the victim.– No protection against an enclave leaking its own

secrets through its public API.

• Three strategies for isolation: Spatial isolation, temporal isolation and cryptography 15

Sanctum Design

Victor Costan, Ilia LebedevSanctum: Minimal Hardware Extensions

for Strong Software Isolation

Software Stack

User

Supervisor

Hypervisor

Machine

Hypervisor

Host ApplicationEnclave

Security MonitorMeasurement Root

Enclave multiplexing

Operating System

Enclave management

Enclave syscall shims Sanctum-aware runtimeNon-sensitive code and data Sensitive code and data

Enclave setup

User

Supervisor

Hypervisor

Machine

Hypervisor

Host ApplicationEnclave

Security MonitorMeasurement Root

Enclave multiplexing

Operating System

Enclave management

Enclave syscall shims Sanctum-aware runtimeNon-sensitive code and data Sensitive code and data

Enclave setup

Software Stack

Target: multi-core processor(no hyperthreading, no speculation)

L3 C

ache

CBox

Core

L2 Cache

L3 CacheSlice

L3 CacheSlice

CBox

Core

L2 Cache

HomeAgent

CBox

Core

L2 Cache

L3 CacheSlice

L3 CacheSlice

CBox

Core

L2 Cache

QPIPacketizer

MemoryController

DDR3Channel

Ring toQPI

Ring toPCIeI/O Controller

UBox

QPI Link

PCIe Lanes

Microarchitectural StateIsolation in Sanctum Enclaves

• Resources exclusively granted to an enclave, and scheduled at the granularity of process context switches are isolated temporally– Register files, branch predictors, private caches,

and private TLBs

• Resources shared between processes on-demand, with arbitrarily small granularity are isolated spatially by partitioning– Shared caches and shared TLBs

20

Operating SystemManages Page Tables

VirtualAddress

Physical AddressMapping

PageTables

VirtualAddress Space

PhysicalAddress Space

Address Translation

Software DRAM

System bus

Practical Software Attackon SGX “Simulators”

• Microsoft Research, IEEE S&P 2015: Exploit no-noise side channel due to page faults

22

Page Table Isolation

Host applicationspace

Host applicationspace

EVRANGE A

Enclave A VirtualAddress Space

Physical Memory

Enclave A region

Enclave A page tables

Enclave A region

Enclave B region

Enclave B page tables

OS region

OS region

OS page tables

Host applicationspace

Host applicationspace

EVRANGE B

Enclave B VirtualAddress Space

Partitioning to Prevent Timing Attacks

24

Line Offsetl-1…0

Address Tagn-1…s+l

Set Indexs+l-1…l

Memory Address

…Set S-1, Way 1 Set S-1, Way W-1Set S-1, Way 0⋮ ⋱ ⋮⋮

Set i, Way 1 Set i, Way W-1…Set i, Way 0

⋮⋮ ⋮ ⋱

Set 1, Way W-1Set 1, Way 0 …Set 1, Way 1

Set 0, Way W-1…Set 0, Way 1Set 0, Way 0

Way W-1…Way 1Way 0

Tag Line Tag Line Tag Line

Matched Line

Tag Comparator

Match? Matched Word

Enclave

OS

Page Coloring

DRAM RegionIndex

CacheLine Offset

5 0611Page Offset

1214

Cache Set Index

DRAM StripeIndex

151720 18Cache Tag

Address bits used by 256 KB of DRAM

Address bits covering the maximum addressable physical space of 2 MB

Physical page number (PPN)

Page Colors =

DRAM Regions

0

MEMTOP

Normal DRAMpage colors

Sanctum DRAMpage colors / regions

0

MEMTOP

Region 0 Region 1 Region 2 Region 3Region 4 Region 5 Region 6 Region 7

LLC set colors

Page Colors =

DRAM Regions

0

MEMTOP

Normal DRAMpage colors

Sanctum DRAMpage colors / regions

0

MEMTOP

Region 0 Region 1 Region 2 Region 3Region 4 Region 5 Region 6 Region 7

LLC set colors

A little bit-shifting gets us a large contiguous DRAM region

Sanctum Secure ProcessorNo Speculation, No Hyperthreading

RISCV Rocket Core, Changes required by Sanctum (+ ~2% of core)

Also requires 9 new config registers

Sanctum Status andCurrent Limitations

• We have built an open-source Sanctum based on the RISC-V ISA– Low performance and area overhead to support

enclaves– Ongoing formal verification effort

• Sanctum is an academic, lightweight processor• Apply its design philosophy to speculative

out-of-order (OOO) processors, which need to protect against Spectre-style attacks

29

MI6 Design

Thomas Bourgeat, Ilia Lebedev,Andrew Wright, Sizhuo Zhang, Arvind

MI6: Secure Enclaves in aSpeculative Out-of-Order Processor

RiscyOO Processor

31

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

RiscyOO Processor

32

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Not speculating during entire program results in average 3X slowdown!

MI6 Processor

33

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

All modules with private state are flushed on enclave entry and exit.

Performance overhead ~ 5%.

MI6 Processor

34

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Last Level Cache is spatially partitioned.

Performance overhead ~7%.

Leaky Cache Hierarchy

35

Timing IndependentCache Hierarchy

36

Fair LLC Arbiter.Performance overhead ~8%.

MI6 Processor

37

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Stop speculating: Every memory instruction will be stalled at the rename stage until ROB is empty.

MI6 Processor

38

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Stop speculating ONLY during enclave copy operations (I/O) à

overhead is negligible.

MI6 Processor

39

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Area overhead for hardware modifications is ~2% not counting

L1 cache, FPU, LLC slice.

MI6 Processor

40

Rename

ROB

ALU IQ Issue RegRead Exec Reg

Write

MEM IQ Issue RegRead

AddrCalc

UpdateLSQ

Physical Reg File

L1 D TLB

LSQ (LQ + SQ)

Commit

IssueLd

Deq

StoreBuffer

L1 D$

RespLd

IssueSt

RespSt

RenameTable

SpeculationManager

EpochManager

Scoreboard

ALU pipeline

MEM pipeline

Load-Store Unit

Front-end

Fetch Bypass

Last Level Cache

Security Monitor virtually unchanged. Hardware can evolve

separately from software (and vice versa).

Challenge: Expressivity

• ~15% performance overhead for enclaves• Enclaves trade expressivity for security

– Cannot make system calls directly since OS can’t be trusted to restore an enclave’s execution state

– Enclave’s runtime must ask the host application to proxy file system and network I/O requests

– What syscall functionality should the enclave’s runtime provide?

41

Challenge: Adaptivity

• Runtime decisions based on sensitive data leak information through timing: completion time, resource usage

• Crypto to the rescue?– Secure demand paging using page-level memory

encryption, integrity verification and ORAM– Secure and efficient dynamic memory allocation

in enclaves an open problem

42

Challenge: Interaction

• Interaction with the outside may leak information– Public schedule for interaction does not leak

• Can we bound leakage of adaptive interactions with users, other programs?

43

Can ITrust You?

Challenge: (Formal) Verification

Open Source à Independent Verification

Properties of Enclaves:

Measurement := Different enclaves have different measurements (also inverse)

Integrity := Modelled attacker cannot affect enclave state

Confidentiality := Modelled attacker cannot observe enclave state

44

Modeling the Adversary

Adversary := set of ops an attacker can use to tamper with or observe enclave state. Any combination of these can be used at any time.

Threat model := ∪(observation function, tamper function, model initial state)

Specify non-interference properties or invariants that execution should satisfy

45

Invariants and Non-Interference

46

Check invariants

Do an attacker actionInit

operationsDo a victim action

The proof describes a CFG with “forks”. Search this graph for a path that violates an invariant.

Summary: Desiderata forSingle-Chip Secure Processor

• Open source• Formally verified (small) TCB• Secure against all practical software attacks• Secure against physical attacks on memory• Enhanced physical security against invasive

attacks• Minimal performance overhead

47

Thank you for your attention!48

Acknowledgements

• Edward Suh• Victor Costan• Ilia Lebedev• Chris Fletcher• Ling Ren• Albert Kwon• Sanjit Seshia• Pramod Subramanyan

• Arvind• Thomas Bourgeat• Andrew Wright• Sizhuo Zhang• Kyle Hogan• Jules Drean• Rohit Sinha• NSF, DARPA, ADI, Delta

top related