Enabling Trusted Software Integrity

Darko Kirovski
Microsoft Research

Milenko Drinić, Miodrag Potkonjak
Computer Science Department, University of California, Los Angeles



Problem Description

[Figure: two memory diagrams, each running from HIGH to LOW across a HIGH PRIORITY PROCESS and a LOW PRIORITY PROCESS, with regions RETURN ADDRESS, LOCAL VARIABLES and BUFFER. The second diagram adds ATTACK CODE and DATA.]

Buffer Overrun

Goal
– Exploit improperly implemented I/O
– Divert execution to attack code

Simplest variant
– Stack smashing
– “Smashing The Stack For Fun And Profit” by Aleph One, Phrack 49, 1996.

Numerous variants exploit different vulnerabilities
– Tutorials on the Web with bug descriptions
– setuid() – Chen, Wagner, Dean, 2002.

What Can Be Done?

StackGuard – Cowan et al., 1998
– Dummy value next to return address

Bounds checking for all pointers – Jones, Kelly, 1995
– Slow in pointer-intensive software

Static analysis – Wagner, 2000
– Verify all buffers – promising idea
– Too many false alarms
– Need to be resolved manually
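The "dummy value next to return address" idea behind StackGuard can be illustrated with a toy model. This is not Cowan et al.'s implementation: the sketch below simulates a stack frame as a Python list, and the names `make_frame`, `unsafe_copy` and `checked_return` as well as the frame layout are assumptions made for illustration only.

```python
import secrets

BUF_SIZE = 4  # toy buffer length (assumption)

def make_frame(return_address):
    """Toy frame layout: [buffer slots..., dummy value, return address]."""
    canary = secrets.randbits(32)
    frame = [0] * BUF_SIZE + [canary, return_address]
    return frame, canary

def unsafe_copy(frame, data):
    """Copy data into the buffer without a bounds check."""
    for i, value in enumerate(data):
        frame[i] = value  # may run past the buffer into the dummy value

def checked_return(frame, canary):
    """Return the saved address only if the dummy value is intact."""
    if frame[BUF_SIZE] != canary:
        raise RuntimeError("stack smashing detected")
    return frame[BUF_SIZE + 1]
```

An overrun that reaches the return address has to cross the dummy value first, so the check fails before the diverted address is used.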

Intrusion Prevention

Current approaches
– Intrusion detection

It is easier to PREVENT than to DETECT

Intrusion prevention system
– Adversary must solve a computationally difficult task to run programs in high priority

Two types of binaries
– Ordinary
– Touched with a security wand

Run-time verification

Outline

How does the system work?
Software installation
Example of constraint embedding
Run-time verification
How to break the system?
Effect on performance


An Intrusion Prevention System

PUBLIC MODE
Executes any code. Restricted access to resources. Script interpreters, distrusted programs, P2P networking, etc.

INSTALLATION MODE
Single process. Interrupts disabled. Input = software. Output = software with additional CPUID-dependent constraints. Atomic execution unit.

TRUSTED MODE
Runs only trusted processes: OS + user defined. Full or controlled access.

CPUID
Burnt-in. Not a privacy issue, because it is never revealed externally.

[Figure labels: Software, CPUID, Keyed MAC, Software trusted, Keyed MAC, Abort, Run.]


Software Installation

Installer is on-chip or on an EPROM with verified contents
Single process
I/O – memory mapped
Interrupts disabled
Used registers, memory overwritten
~ BOOT on PCs

GOAL: embed constraints w/o revealing CPUID.

[Figure: SPEF Installation. Labels: Software master-copy, I-block, TI hash, Encrypt (3DES), CPUID, Random bitstream, Domain ordering, Constraint embedding, I-block, Software working-copy.]
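The first stages of the installation figure (I-block, TI hash, keyed encryption, random bitstream) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPEF design: the transformation-invariant hash is modeled as a hash over the sorted instructions, HMAC-SHA256 stands in for the 3DES step, and the function names `ti_hash` and `random_bitstream` are hypothetical.

```python
import hashlib
import hmac

def ti_hash(i_block):
    """Stand-in transformation-invariant hash (assumption): hashing the
    sorted instructions gives the same digest for any reordering."""
    canonical = "\n".join(sorted(i_block)).encode()
    return hashlib.sha256(canonical).digest()

def random_bitstream(i_block, cpuid, n_bits=64):
    """Derive a CPUID-dependent bit string from the TI hash.
    HMAC-SHA256 replaces the 3DES encryption shown on the slide."""
    digest = hmac.new(cpuid, ti_hash(i_block), hashlib.sha256).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:n_bits]  # at most 256 bits are available

block = ["LDR r1,[r8,#0]", "LDR r0,[r9,#0]", "MOV r3,r5"]
stream = random_bitstream(block, b"example-cpuid")
```

Because the hash ignores instruction order, the installer and a later verifier derive the same bit string from the master copy and from the reordered working copy, and the CPUID itself never appears in the output.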


Example: Instruction Scheduling

[Figure: SPEF - Instruction Rescheduling. Labels: Software master-copy, TI hash, Encrypt (3DES), CPUID, Random bitstream, Domain ordering, Constraint embedding - verification. Three listings of one instruction block (SUB, MOV, MOV, DIV, MULT, XOR, SUB, JUMP, ADD, MOV) illustrate the rescheduling.]

How Does the Bitstream Reorder Ops?

a) Initial order of instructions and their possible positions
(1) 0x0080e0 LDR r1,[r8,#0]
(2) 0x0080e4 LDR r0,[r9,#0]
(3) 0x0080e8 MOV r3,r5
(4) 0x0080ec MUL r2,r0,r1
(5) 0x0080f0 MOV r1,#1
(6) 0x0080f4 LDR r0,[r6,#0]
[Figure: grid of possible positions (1) to (6) for each instruction. Legend: initial position, possible position, conditional possible position.]

b) Dependency graph
[Figure: dependency graph over instructions (1) to (6).]

c) Sample bit-stream
1010...0110

d) Instruction ordering procedure
Control step   Available instructions   Part of bit-stream used   Selected instruction
1              (1) (2) (3)              10                        (3)
2              (1) (2)                  1                         (2)
3              (1)                      -                         (1)
4              (4)                      -                         (4)
5              (5) (6)                  0                         (5)
Available instructions are encoded 00, 01, 10, 11 in the order listed; unused codes are marked "-". Some entries in the original table carry a * mark whose placement did not survive extraction.

e) Final order of instructions
[Figure: the instruction listing at addresses 0x0080e0 to 0x0080f4 after reordering.]

Constraint Embedding Techniques

Entropy of program representation is high
Reduce entropy w/ constraints for 50+ bits with preserved performance
Exact entropy reduction unique for each CPUID

Constraint types
– Requirements
• High entropy
• Functional transparency
• Transformation invariance
• Effective implementation
• Low performance overhead
– Examples
• Instruction rescheduling
• Register assignment
• Basic block reordering
• Conditional branch selection
• Filling unused opcode fields
• Toggling signs of operands
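The entropy argument can be made concrete on a small block: choosing one of N functionally equivalent variants can encode at most log2(N) bits. The sketch below counts the valid instruction orders of the six-instruction ARM example from the earlier slide. The dependency sets are an assumption derived from register usage, and `count_orderings` is a hypothetical helper, not part of the presented system.

```python
from functools import lru_cache
from math import log2

def count_orderings(deps):
    """Count instruction orders that respect deps (op -> predecessors)."""
    ops = tuple(sorted(deps))

    @lru_cache(maxsize=None)
    def count(done):
        if len(done) == len(ops):
            return 1
        total = 0
        for op in ops:
            # op is schedulable once all its predecessors are placed
            if op not in done and deps[op] <= done:
                total += count(done | {op})
        return total

    return count(frozenset())

# Assumed dependencies: MUL (4) needs both loads, (5) and (6) follow (4).
DEPS = {1: set(), 2: set(), 3: set(), 4: {1, 2}, 5: {4}, 6: {4}}
variants = count_orderings(DEPS)   # 24 valid orders
bits = log2(variants)              # about 4.58 bits
```

Six instructions therefore contribute only a few bits through rescheduling alone, which suggests why larger instruction blocks and several constraint types are combined to reach the 50+ bit target.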


CPU + SPEF Verifier

[Figure: Traditional CPU architecture extended with a CPU ID and an I-block buffer. Labels: Software working-copy, I-block, Cache line, TI hash, Encrypt (3DES), Random bitstream, Domain ordering, Constraint verification, ABORT or RUN.]

Run-time Code Verification

ARM instruction set and simulated system
50 cycles
20K gates
HW support?
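The ABORT or RUN decision can be sketched as a round trip: the verifier recomputes the keyed bit string from the block it is about to execute and checks that the block already satisfies the resulting constraint. This is a toy model under loud assumptions: HMAC-SHA256 replaces the TI hash and 3DES stages, all instructions are treated as independent so that any permutation is functionally valid, and the names `keyed_bits`, `constrained_order` and `verify` are hypothetical.

```python
import hashlib
import hmac

def keyed_bits(i_block, cpuid):
    """Keyed hash of the order-independent block contents, as a bit string.
    Stand-in for the TI hash + 3DES stages (assumption)."""
    canonical = "\n".join(sorted(i_block)).encode()
    digest = hmac.new(cpuid, canonical, hashlib.sha256).digest()
    return "".join(f"{byte:08b}" for byte in digest)

def constrained_order(i_block, cpuid):
    """Ordering selected by the bits. The 256 available bits are enough
    for the small blocks used here."""
    bits, pos = keyed_bits(i_block, cpuid), 0
    pool, order = sorted(i_block), []
    while pool:
        width = (len(pool) - 1).bit_length()
        index = int(bits[pos:pos + width], 2) % len(pool) if width else 0
        pos += width
        order.append(pool.pop(index))
    return order

def verify(i_block, cpuid):
    """RUN only if the block already satisfies the CPUID constraint."""
    ok = list(i_block) == constrained_order(i_block, cpuid)
    return "RUN" if ok else "ABORT"

block = ["ADD r0,r1,r2", "MOV r3,r5", "SUB r4,r4,#1", "XOR r6,r6,r7"]
installed = constrained_order(block, b"example-cpuid")
```

A block installed under one CPUID passes `verify` with that CPUID, while the same instructions in any other order are rejected. A verifier holding a different CPUID would most likely derive a different order and abort.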


How to Break the System?

Cryptographically secure keyed MAC
– Hard to extract CPUID from working-copies
– Hard to create an I-block with CPUID constraints satisfied w/o the CPUID

Patch low entropy instruction blocks
– I-block with low entropy? Example:
• I-block = one instruction and all other NOPs
– Hardware must detect I-blocks with low entropy
• Count and limit domain cardinality
• Done during domain ordering

Patch I-blocks from working copies
– Difficult? Hard to evaluate w/o a lot of software
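One possible reading of "count and limit domain cardinality" is to bound how many distinct variants a block admits and to reject blocks that fall below a threshold. The sketch below does this for reordering only, treating all instructions as independent. The 50-bit threshold is borrowed from the "50+ bits" target, and both it and the helper names are assumptions rather than the hardware check described in the talk.

```python
from collections import Counter
from math import factorial, log2

MIN_BITS = 50  # taken from the "50+ bits" target (assumption)

def ordering_bits(i_block):
    """Upper bound on bits encodable by reordering alone:
    log2 of the number of distinct permutations of the block."""
    variants = factorial(len(i_block))
    for count in Counter(i_block).values():
        variants //= factorial(count)  # identical instructions are interchangeable
    return log2(variants)

def has_enough_entropy(i_block, min_bits=MIN_BITS):
    return ordering_bits(i_block) >= min_bits

# The slide's example: one instruction and all other NOPs.
low = ["ADD r0,r1,r2"] + ["NOP"] * 63
low_bits = ordering_bits(low)  # 64 distinct arrangements, i.e. 6 bits
```

With only 6 bits of freedom, an adversary could try every arrangement of such a block, which is why it has to be rejected before constraints are embedded or verified.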


Performance

Embedded bits of entropy
[Figure: Cumulative Degrees of Freedom For All Constraint Types. One histogram per benchmark: Mpeg Encode, Mpeg Decode, Jpeg Encode, Jpeg Decode, Pegwit. Y-axis: 64-Instruction Block Count. X-axis ticks: 0 to 300. Panel annotations, in extraction order: 343 blocks, mean 136; 380 blocks, mean 146; 242 blocks, mean 142; 330 blocks, mean 152; 216 blocks, mean 140. Unlabeled values: 35 68 5621 22.]

Performance effect
– 13-25% overhead
– 7-17% with a cache that logs TI-hashes
[Figure: Effective CPI (y-axis ticks 0 to 18) vs. cache size (1K FA, 1K DM, 2K FA, 2K DM, 4K FA, 4K DM). Series: No Verification, Verification without TIH cache, Verification with TIH cache.]

Simulated w/ ARMulator
ARM instruction set
MediaBench suite

Summary

Intrusion prevention
On-line software verification for authenticity
Keyed message authentication code
– Stored as footer
– Stored as constraints
• 50% decrease in code size overhead

Public and trusted execution mode
Relatively hi/lo performance overhead
– No hardware acceleration
– 20% - sets back Moore’s Law 4.5 months