Security Seminar, Fall 2003

On the (Im)possibility of Obfuscating Programs

Boaz Barak, Oded Goldreich, Russell Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan and Ke Yang

Presented by Shai Rubin

Theory/Practice “Gap”

In practice
• Hackers successfully obfuscate viruses
• Researchers successfully obfuscate programs [2,4]
• Companies sell obfuscation products [3]

In theory [1]
• There is no good algorithm for obfuscating programs

Which side are you on?

Why Do I Give This Talk?

• Understand the theory/practice gap
• An example of a good paper
• An example of interesting research:
  – Shows how to model a practical problem in terms of complexity theory
  – Illustrates techniques used by theoreticians
• I did not understand the paper. I thought that explaining the paper to others would help me understand it
• To hear your opinion (free consulting)
• To learn how to pronounce ‘obfuscation’

Disclaimer

• This paper is mostly about complexity theory

• I’m not a complexity theory expert

• I present and discuss only the main result of the paper

• The paper describes extensions to the main result which I did not fully explore

• Hence, some of my interpretations/conclusions/suggestions may be wrong/incomplete

• You are welcome to catch me

Talk Structure

[Diagram: boxes for Motivation (Theory/Practice Gap), Obfuscation Model, Impossibility Proof, Obfuscation Model Analysis, Other Obfuscation Models, and Summary, arranged along a Theoretician Track and a Practitioner Track.]

Obfuscation Concept

A good obfuscator: a virtual black box

“Anything an adversary can compute from an obfuscated program O(P), it can compute given just oracle access to P”

The weakest notion of “compute”: a predicate, or a property of P.

[Diagram: Prog.c is obfuscated into O(Prog.c). From O(Prog.c) the adversary has code + analysis + input/output queries; from Prog.c as an oracle it has input/output queries only. Both sides compute p(Prog.c).]

Turing Machine Obfuscator

A Turing machine O is a Turing machine (TM) obfuscator if for any Turing machine M:

1. [Functionality property] O(M) computes the same function as M.
2. [Efficiency property] The running time¹ of O(M) is the same as that of M.
3. [Black box property] For any efficient algorithm² A (Analysis) that computes a predicate p(M) from O(M), there is an efficient (oRacle access) algorithm² $R^{M}$ that computes p(M) for all M:

$\Pr[A(O(M)) = p(M)] \approx \Pr[R^{M}(1^{|M|}) = p(M)]$

¹ Polynomial slowdown is permitted.
² Probabilistic polynomial-time Turing machine.

In words: For every M, there is no predicate that can be (efficiently) computed from the obfuscated version of M but cannot be computed by merely observing the input-output behavior of M.
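The notion of "oracle access" in the black box property can be pictured with a minimal Python sketch. The names `Oracle` and `r_is_zero_at_origin` and the fixed query budget are assumptions made for illustration only; they stand in for an algorithm that sees the size of M and the answers to its queries, and nothing else.

```python
from typing import Callable


class Oracle:
    """Wraps a program so that callers can only query it, never read it."""

    def __init__(self, program: Callable[[int], int], budget: int):
        self._program = program   # hidden from the caller by convention
        self._budget = budget     # stand-in for a polynomial bound on queries
        self.queries = 0

    def query(self, x: int) -> int:
        if self.queries >= self._budget:
            raise RuntimeError("query budget exhausted")
        self.queries += 1
        return self._program(x)


def r_is_zero_at_origin(size: int, oracle: Oracle) -> bool:
    """A toy oracle-access algorithm: decides 'M(0) == 0' with one query."""
    return oracle.query(0) == 0


double = lambda x: 2 * x
oracle = Oracle(double, budget=8)
print(r_is_zero_at_origin(size=1, oracle=oracle))  # True
print(oracle.queries)                              # 1
```

The predicate chosen here is one that input/output queries can decide; the rest of the talk is about predicates for which no such algorithm exists.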

Talk Structure

[Diagram: boxes for Motivation (Theory/Practice Gap), Obfuscation Model, Impossibility Proof, Obfuscation Model Analysis, Other Obfuscation Models, and Summary, arranged along a Theoretician Track and a Practitioner Track.]

Proof Outline

1. You say: “I have an obfuscator: for any machine M and any (analysis) algorithm A that computes a predicate p(M), there is an oracle-access algorithm $R^{M}$ that computes p(M) for all M.”
2. Really? Please provide O.
3. Given O and a Turing machine E of my choice, I compute O(E).
4. I show you a predicate p and an (analysis) algorithm A such that A(O(E)) = p(E). You must provide $R^{M}$ such that $\Pr[R^{E}(1^{|E|}) = p(E)] \approx \Pr[A(O(E)) = p(E)]$.
5. I choose another machine Z and obfuscate it using O. I show you that $\Pr[R^{Z}(1^{|Z|}) = p(Z)] \ll \Pr[A(O(Z)) = p(Z)]$.
6. Conclusion: please try another obfuscator (i.e., you do not have a good obfuscator).

Building E (1)

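• Combination machine. For any M, N:

$\mathrm{COMB}_{M,N}(b,x) = \begin{cases} M(x) & b = 1 \\ N(x) & b = 0 \end{cases}$

• That is, $\mathrm{COMB}_{M,N}(1,x) = M(x)$ and $\mathrm{COMB}_{M,N}(0,x) = N(x)$.
• Hence, $\mathrm{COMB}_{M,N}$ can be used to compute M(N).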
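The combination machine can be sketched in a few lines of Python. This is only an illustration: machines are modelled as Python callables, and the names `comb`, `square` and `negate` are hypothetical.

```python
from typing import Callable


def comb(m: Callable, n: Callable) -> Callable:
    """Return COMB_{M,N}: selector bit b = 1 runs M, b = 0 runs N."""
    def combined(b: int, x):
        if b == 1:
            return m(x)
        if b == 0:
            return n(x)
        raise ValueError("selector bit must be 0 or 1")
    return combined


square = lambda x: x * x
negate = lambda x: -x
c = comb(square, negate)
print(c(1, 7))  # 49
print(c(0, 7))  # -7
```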

Building E (2)

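• Let $\alpha, \beta \in \{0,1\}^k$
• Let

$C_{\alpha,\beta}(x) = \begin{cases} \beta & x = \alpha \\ 0 & \text{otherwise} \end{cases} \qquad D_{\alpha,\beta}(C) = \begin{cases} 1 & C(\alpha) = \beta \\ 0 & \text{otherwise} \end{cases}$

• Note: $D_{\alpha,\beta}$ can distinguish between $C_{\alpha,\beta}$ and $C_{\alpha',\beta'}$ when $(\alpha,\beta) \neq (\alpha',\beta')$
• $E_{\alpha,\beta} = \mathrm{COMB}_{D_{\alpha,\beta},C_{\alpha,\beta}}$
• Remember: $E_{\alpha,\beta}$ can be used to compute $D_{\alpha,\beta}(C_{\alpha,\beta})$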
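A small Python sketch of the two machines may help. It assumes that k-bit strings are modelled as integers and that programs are passed around as callables; the helper names `make_c` and `make_d` and the example values are hypothetical.

```python
def make_c(alpha: int, beta: int):
    """C_{alpha,beta}: returns beta on input alpha and 0 everywhere else."""
    return lambda x: beta if x == alpha else 0


def make_d(alpha: int, beta: int):
    """D_{alpha,beta}: returns 1 iff the given program maps alpha to beta."""
    return lambda program: 1 if program(alpha) == beta else 0


alpha, beta = 0b10110010, 0b01100111   # arbitrary 8-bit example values
c = make_c(alpha, beta)
d = make_d(alpha, beta)
c_other = make_c(alpha, beta ^ 1)      # same alpha, different beta

print(d(c))        # 1
print(d(c_other))  # 0
```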

Proof Outline

1. You say: “I have an obfuscator: for any machine M and any (analysis) algorithm A that computes a predicate p(M), there is an oracle-access algorithm $R^{M}$ that computes p(M) for all M.”
2. Really? Please provide O.
3. Given O and a Turing machine $E_{\alpha,\beta}$ of my choice, I compute $O(E_{\alpha,\beta})$.
4. I show you a predicate p and an (analysis) algorithm A such that $A(O(E_{\alpha,\beta})) = p(E_{\alpha,\beta})$. You must provide $R^{M}$ such that $\Pr[R^{E_{\alpha,\beta}}(1^{|E_{\alpha,\beta}|}) = p(E_{\alpha,\beta})] \approx \Pr[A(O(E_{\alpha,\beta})) = p(E_{\alpha,\beta})]$.
5. I choose another machine Z and obfuscate it using O. I show you that $\Pr[R^{Z}(1^{|Z|}) = p(Z)] \ll \Pr[A(O(Z)) = p(Z)]$.
6. Conclusion: please try another obfuscator (i.e., you do not have a good obfuscator).

The Analysis Algorithm

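Input: A combination machine $\mathrm{COMB}_{M,N}(b,x)$.

Algorithm:
1. Decompose $\mathrm{COMB}_{M,N}$ into M and N:
   a. $\mathrm{COMB}_{M,N}(1,x) = M(x)$
   b. $\mathrm{COMB}_{M,N}(0,x) = N(x)$
2. Return M(N).

Note: $A(O(E_{\alpha,\beta}))$ is a predicate that is always (i.e., with probability 1) true:

$A(O(E_{\alpha,\beta})) = A(O(\mathrm{COMB}_{D_{\alpha,\beta},C_{\alpha,\beta}})) = D_{\alpha,\beta}(C_{\alpha,\beta}) = 1$

You must provide an oracle-access algorithm $R^{M}$ such that $\Pr[R^{E_{\alpha,\beta}}(1^{|E_{\alpha,\beta}|}) = 1] \approx 1$.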
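The analysis algorithm can be sketched in Python under some loud assumptions: machines are callables, "passing the code of N to M" is modelled by passing a callable, and `toy_obfuscator` is a hypothetical stand-in for O that merely wraps its input. The point is only that A needs nothing beyond the ability to run both halves of the (obfuscated) combination machine and to hand one half to the other.

```python
def comb(m, n):
    """COMB_{M,N}: b = 1 runs M, b = 0 runs N."""
    return lambda b, x: m(x) if b == 1 else n(x)


def make_c(alpha, beta):
    return lambda x: beta if x == alpha else 0


def make_d(alpha, beta):
    return lambda program: 1 if program(alpha) == beta else 0


def analysis(combined):
    """A: split COMB_{M,N} into its two halves, then feed N to M."""
    m = lambda x: combined(1, x)
    n = lambda x: combined(0, x)
    return m(n)


def toy_obfuscator(program):
    """Hypothetical stand-in for O: preserves functionality, hides nothing."""
    return lambda b, x: program(b, x)


alpha, beta = 0xB2, 0x67   # arbitrary example values
e = comb(make_d(alpha, beta), make_c(alpha, beta))
print(analysis(toy_obfuscator(e)))  # 1
```

Because any obfuscator must preserve functionality, the same call returns 1 no matter how O rewrites the code of E.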

Proof Outline

1. You say: “I have an obfuscator: for any machine M and any (analysis) algorithm A that computes a predicate p(M), there is an oracle-access algorithm $R^{M}$ that computes p(M) for all M.”
2. Really? Please provide O.
3. Given O and a Turing machine E of my choice, I compute O(E).
4. I show you a predicate p and an (analysis) algorithm A such that $A(O(E_{\alpha,\beta})) = 1$. You must provide $R^{M}$ such that $\Pr[R^{E_{\alpha,\beta}}(1^{|E_{\alpha,\beta}|}) = 1] \approx \Pr[A(O(E_{\alpha,\beta})) = 1] = 1$.
5. I choose another machine Z and obfuscate it using O. I show you that $\Pr[R^{Z}(1^{|Z|}) = p(Z)] \ll \Pr[A(O(Z)) = p(Z)]$.
6. Conclusion: please try another obfuscator (i.e., you do not have a good obfuscator).

The Z machine

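• Let $Z_k$ be a machine that always returns $0^k$.
• Z is similar to $E_{\alpha,\beta}$ ($\mathrm{COMB}_{D_{\alpha,\beta},C_{\alpha,\beta}}$): replace $C_{\alpha,\beta}$ with $Z_k$:

$Z = \mathrm{COMB}_{D_{\alpha,\beta},Z_k}$

• Note: A(O(Z)) is a predicate that is always (i.e., with probability 1) false:

$A(O(Z)) = A(O(\mathrm{COMB}_{D_{\alpha,\beta},Z_k})) = D_{\alpha,\beta}(Z_k) = 0$

• Is $\Pr[R^{Z}(1^{|Z|}) = 0] \approx 1$? If we show that $\Pr[R^{Z}(1^{|Z|}) = 0] \ll 1$, we are done.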

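Why is $\Pr[R^{Z}(1^{|Z|}) = 0] \ll 1$?

Let us look at the execution of $R^{E_{\alpha,\beta}}$:

[Diagram: $R^{E_{\alpha,\beta}}$ runs from Start to End, querying $D_{\alpha,\beta}$ and $C_{\alpha,\beta}$ along the way, and outputs 1.]

When we replace the oracle to $C_{\alpha,\beta}$ with an oracle to $Z_k$, we get $R^{Z}$.

What will change in the execution?

[Diagram: $R^{Z}$ runs from Start to End with the same queries to $D_{\alpha,\beta}$, but the queries to $C_{\alpha,\beta}$ are now answered by $Z_k$; its output is Out'.]

$\Pr(\text{Out}' = 0) = \Pr(\text{a query to } C_{\alpha,\beta} \text{ returns non-zero}) = \Pr(\text{query} = \alpha) = 2^{-k}$ ³

³ Inaccurate, see paper.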
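The counting behind this estimate can be checked by brute force for a small k. The sketch below is a simplification and not the paper's argument: it models only a fixed (non-adaptive) list of q queries to the C side and ignores queries to D. Under those assumptions, the answers from $C_{\alpha,\beta}$ and from $Z_k$ differ only when one of the queries equals α, which happens for a q / 2^k fraction of the possible α. The names `transcript` and `z_k` are hypothetical.

```python
def make_c(alpha, beta):
    """C_{alpha,beta}: returns beta on input alpha and 0 everywhere else."""
    return lambda x: beta if x == alpha else 0


def z_k(x):
    """Z_k: always returns the all-zero string, modelled as 0."""
    return 0


def transcript(oracle, queries):
    """Answers an oracle-only algorithm sees for a fixed list of queries."""
    return [oracle(x) for x in queries]


k = 10
beta = 1                     # any non-zero k-bit value
queries = list(range(16))    # q = 16 distinct queries, chosen without knowing alpha

# Count the values of alpha for which the two oracles can be told apart.
differing = sum(
    transcript(make_c(alpha, beta), queries) != transcript(z_k, queries)
    for alpha in range(2 ** k)
)
fraction = differing / 2 ** k
print(fraction)  # 0.015625, i.e. q / 2**k
```

For a polynomial number of queries q and growing k, this fraction shrinks exponentially, so an oracle-only algorithm almost never notices that C was swapped for Z.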

Proof Outline

1. You say: “I have an obfuscator: for any machine M and any (analysis) algorithm A that computes a predicate p(M), there is an oracle-access algorithm $R^{M}$ that computes p(M) for all M.”
2. Really? Please provide O.
3. Given O and a Turing machine E of my choice, I compute O(E).
4. I show you a predicate p and an (analysis) algorithm A such that A(O(E)) = 1. You must provide $R^{M}$ such that $\Pr[R^{E}(1^{|E|}) = 1] \approx \Pr[A(O(E)) = 1] = 1$.
5. I choose another machine Z and obfuscate it using O. I show you that $\Pr[R^{Z}(1^{|Z|}) = 0] = 2^{-k} \ll \Pr[A(O(Z)) = 0] = 1$.
6. Conclusion: please try another obfuscator (i.e., you do not have a good obfuscator).

Talk Structure

[Diagram: boxes for Motivation (Theory/Practice Gap), Obfuscation Model, Impossibility Proof, Obfuscation Model Analysis, Other Obfuscation Models, and Summary, arranged along a Theoretician Track and a Practitioner Track.]

Modeling Obfuscation

A good obfuscator: a virtual black box

“Anything that an adversary can compute from an obfuscation O(P), it can also compute given just oracle access to P”

[Diagram: knowledge gained from O(Prog.c) via code + analysis + input/output queries, compared with knowledge gained from Prog.c via input/output queries only.]

• Barak shows: there are properties that cannot be efficiently learned from I/O queries, but can be learned from the code
• However, we already knew this informally: for example, whether a program is written in C or Pascal, or which data structure a program uses
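The informal observation can be made concrete with a toy Python example. Everything here is hypothetical illustration: two programs that agree on every tested input but differ in their code, and a predicate (`uses_loop`) that is trivial to compute from the source text yet has to receive the same answer from any algorithm that sees only input/output behavior.

```python
# Two programs with identical input/output behaviour but different code.
SOURCE_LOOP = """
def f(n):
    total = 0
    for i in range(n + 1):
        total += i
    return total
"""

SOURCE_FORMULA = """
def f(n):
    return n * (n + 1) // 2
"""


def load(source):
    """Turn source text into a callable, i.e. into oracle-style access."""
    namespace = {}
    exec(source, namespace)
    return namespace["f"]


def uses_loop(source):
    """Predicate computed from the code: does the program contain a for-loop?"""
    return "for " in source


f_loop, f_formula = load(SOURCE_LOOP), load(SOURCE_FORMULA)
same_io = all(f_loop(n) == f_formula(n) for n in range(200))
print(same_io)                                            # True
print(uses_loop(SOURCE_LOOP), uses_loop(SOURCE_FORMULA))  # True False
```

Since the two callables agree on every query, an oracle-only algorithm outputs the same guess for both and is therefore wrong on at least one of them.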

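Obfuscation Model Space

[Diagram: a two-axis model space. One axis: difficulty of gaining information from O(P), from efficient to inefficient. Other axis: information hidden by the obfuscator, from a specific predicate to all predicates. Barak’s model is marked on the diagram.]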

TM Obfuscator

A Turing machine O is a TM obfuscator if for any Turing machine M:

1. O(M) computes the same function as M.
2. The running time¹ of O(M) is the same as that of M.
3. For any efficient algorithm² A (Analysis) that computes a predicate p(M), there is an efficient (oRacle) algorithm² $R^{M}$ that computes p(M) for all M:

$\Pr[A(O(M)) = p(M)] \approx \Pr[R^{M}(1^{|M|}) = p(M)]$

¹ Polynomial slowdown is permitted.
² Probabilistic polynomial-time Turing machine.

Obfuscation Model Space

[Diagram: a three-axis model space. Difficulty of gaining information from O(P): efficient to inefficient. Information gained from O(P): specific predicate to all predicates. Programs: specific program to all programs. Barak’s model is marked on the diagram.]

Talk Structure

[Diagram: boxes for Motivation (Theory/Practice Gap), Obfuscation Model, Impossibility Proof, Obfuscation Model Analysis, Other Obfuscation Models, and Summary, arranged along a Theoretician Track and a Practitioner Track.]

Other Obfuscation Models

[Diagram: the obfuscation model space (difficulty of gaining information from O(P), efficient to inefficient; information gained from O(P), specific predicate to all predicates; programs, all programs) with Barak’s model and the two models below marked.]

Signature obfuscation:
1. Not all properties
2. Not virtual black box?

Static disassembly [2]:
1. Not all properties
2. Not difficult
3. Not virtual black box?

Barak’s Model Limitation

• Virtual black box:
  – Not surprising in some sense (but still excellent work)
  – Does not correspond to what attackers/researchers are doing: “the virtual black box paradigm for obfuscation is inherently flawed”
• Too general:
  – The obfuscator must work for all programs
  – The obfuscator must hide any property (Barak addresses this in the extensions)
• Too restrictive: does not allow the oracle algorithm to be tailored to each Turing machine (does it matter?).

Alternative Models

“Property Hiding Model”: for a given property q: (i) q can be computed from P, (ii) q cannot be computed (is more difficult to compute?) from O(P).

Given an algorithm A and a Turing machine M such that A(M) = q(M), obfuscate M such that:
1. [Property hiding] For every algorithm A, $A(O(M)) \neq q(M)$.
2. [Functionality] M and O(M) compute the same function.

Virus signature obfuscation
• A(M) = q(M) = a substring of instructions inside M
• O(M) does not contain this substring

Static disassembly
• A(M) = a (particular) disassembler
• q(M) = A(M)
• 90% of the instructions in A(M) differ from the instructions in A(O(M))
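The virus-signature case can be illustrated with a toy Python sketch. The instruction set, the `run` interpreter and the NOP-interleaving `obfuscate` function are all hypothetical; the sketch only shows that a fixed substring scanner can be defeated while functionality is preserved, which is much weaker than hiding q from every algorithm.

```python
def run(program, x):
    """Interpret a tiny accumulator machine (hypothetical instruction set)."""
    acc = x
    for op, arg in program:
        if op == "ADD":
            acc += arg
        elif op == "MUL":
            acc *= arg
        elif op == "NOP":
            pass
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return acc


def contains_signature(program, signature):
    """q(M): does the signature occur as a contiguous run of instructions?"""
    n = len(signature)
    return any(program[i:i + n] == signature
               for i in range(len(program) - n + 1))


def obfuscate(program):
    """Toy O: interleave NOPs so no two original instructions stay adjacent."""
    out = []
    for instruction in program:
        out.append(instruction)
        out.append(("NOP", 0))
    return out


virus = [("ADD", 3), ("MUL", 5), ("ADD", 1)]
signature = [("MUL", 5), ("ADD", 1)]
hidden = obfuscate(virus)

print(contains_signature(virus, signature))                      # True
print(contains_signature(hidden, signature))                     # False
print(all(run(virus, x) == run(hidden, x) for x in range(100)))  # True
```

A scanner that skips NOPs before matching would find the signature again, so this toy O hides the property only from the one particular A it was built against.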

Alternative Models (2)

Backdoor model: hide functionality for a single input, change functionality for most other inputs.

Given a Turing machine M and an input x:
1. [Obfuscated back door] There exists y such that M(x) = O(M)(y).
2. [Non-functionality] For every $z \neq y$, $\Pr[M(z) \neq O(M)(z)]$ is high.

Summary

What to take home:
• The gap is possible because:
  – The virtual black box paradigm is different from real-world obfuscation.
  – The obfuscation model space.
• Nice research: Concept → Formalism → Properties
• A lot remains to be done

Bibliography

1. B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan and K. Yang. "On the (Im)possibility of Obfuscating Programs". CRYPTO, Aug. 2001, Santa Barbara, CA.

2. C. Linn and S. Debray. "Obfuscation of Executable Code to Improve Resistance to Static Disassembly". CCS, Oct. 2003, Washington, DC.

3. www.cloakware.com

4. C. S. Collberg, C. D. Thomborson and D. Low. "Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs". POPL, 1998.