Symbolic Finite State Transducers: Algorithms and Applications

Post on 07-Feb-2016

DESCRIPTION

Symbolic Finite State Transducers: Algorithms and Applications. Margus Veanes, Pieter Hooimeijer, Benjamin Livshits, David Molnar, Nikolaj Bjørner.

TRANSCRIPT

2

Symbolic Finite State Transducers: Algorithms and Applications

Margus Veanes, Pieter Hooimeijer, Benjamin Livshits, David Molnar, Nikolaj Bjørner

4

Formal languages are well-studied.

5

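[Figure: automaton for a*b+ with states $q_0$ and $q_1$ and transitions labeled a, b, and b. It accepts abb (✔) and rejects aaaa (✘).]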

6

[Chart: number of papers, POPL (2001–2011); y-axis 0 to 120; “automata”: 103.]

7

What about transformation?

8

http://en.wikipedia.org/wiki/Osborne_1

9

10

Compute image: abb ↦ {baa} (✔); aaaa is rejected (✘)

Check properties: Equivalence, Composition

[Figure: transducer with states $q_0$ and $q_1$ and transitions labeled a/b, b/a, and b/a]
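A minimal Python sketch of how such a transducer computes an image. The arrow layout is an assumption, since the transcript keeps only the labels a/b, b/a, b/a and the states q0 and q1: here q0 loops on a/b and moves to q1 on b/a, and q1 (assumed final) loops on b/a. The names RULES, START, FINAL, and image are illustrative only.

```python
# Transition table: (state, input symbol) -> list of (next state, output string).
# ASSUMPTION: the topology below is a guess; only the edge labels survive
# in the transcript.
RULES = {
    ("q0", "a"): [("q0", "b")],
    ("q0", "b"): [("q1", "a")],
    ("q1", "b"): [("q1", "a")],
}
START = "q0"
FINAL = {"q1"}


def image(word: str) -> set[str]:
    """Return the set of outputs the transducer produces for `word`."""
    # A configuration is (current state, output produced so far).
    configs = {(START, "")}
    for symbol in word:
        configs = {
            (nxt, out + emitted)
            for state, out in configs
            for nxt, emitted in RULES.get((state, symbol), [])
        }
    # Only runs that end in a final state contribute to the image.
    return {out for state, out in configs if state in FINAL}


print(image("abb"))   # {'baa'}
print(image("aaaa"))  # set()
```

Under this assumed layout the sketch reproduces the slide: abb maps to {baa}, and aaaa has an empty image because its run never reaches a final state.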

11

[Chart: number of papers, POPL (2001–2011); y-axis 0 to 120; “automata”: 103, “transducers”: 8.]

12

Talk Outline

• Background
• Approach
• Case Studies

13

Background

“Fast and Precise Sanitizer Analysis with BEK”

Idea: Develop a language for commonly used string transformations. Prove properties about those transformations.

14

Code

    t := iter(c in s)[b := false;]{
      case (!b && c in "['\"\\]"):
        b := false;
        yield('\\', c);
      case (c == '\\'):
        b := !b;
        yield(c);
      case (true):
        b := false;
        yield(c);
    };

Gap

FSTs

[Figure: transducer with states $q_0$ and $q_1$ and transitions labeled a/b, b/a, and b/a]
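To make the Bek program on this slide concrete, here is a rough Python rendering of the same loop. It is a sketch under one stated assumption: the guard's character class is taken to mean the single and double quote characters, so that a backslash is handled by the second case. The function name escape_quotes is hypothetical and not part of Bek.

```python
def escape_quotes(s: str) -> str:
    """Backslash-escape quotes that are not already escaped."""
    out = []
    # b is True exactly when the previous character was an unpaired backslash.
    b = False
    for c in s:
        # ASSUMPTION: the character class covers ' and " only.
        if not b and c in "'\"":
            b = False
            out.append("\\" + c)  # yield('\\', c)
        elif c == "\\":
            b = not b
            out.append(c)         # yield(c)
        else:
            b = False
            out.append(c)         # yield(c)
    return "".join(out)


print(escape_quotes("it's"))     # it\'s
print(escape_quotes("it\\'s"))   # it\'s  (already escaped, left alone)
```

The single boolean b is the only state the loop carries, which is what lets a program of this shape be turned into a finite state transducer.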

17

From Code to FSTs:

1. domain-specific languages
2. more expressive transducers

19

20

Symbolic Finite State Transducers

Idea:
• Equip transitions with formulae
• Allow the use of any decidable theory

21

Definition

Symbolic Finite State Transducer (SFT):

24

Symbolic Finite State Transducer (SFT):

- states
- start state
- final states
- rules $R$ of the form $q \xrightarrow{\phi / \mathbf{f}} r$, where $\phi$ is a predicate and $\mathbf{f}$ is the output

25

Symbolic Finite State Transducer (SFT):

- states
- start state
- final states
- transition

Background Theory:

- predicates
- label theory

26

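Example

[Figure: SFT with states $q_0$ and $q_1$ and transitions labeled $(\lambda x.\, x = 0) / [\lambda x.\, 1]$ and $(\lambda x.\, \mathbf{t}) / [\lambda x.\, 2x]$]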

27

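[Figure: SFT with states $q_0$ and $q_1$, labels annotated. In $(\lambda x.\, x = 0) / [\lambda x.\, 1]$ and $(\lambda x.\, \mathbf{t}) / [\lambda x.\, 2x]$, the parts before the slash are guards and the bracketed parts after it are symbolic outputs.]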
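One way to picture the definition and the example is a small Python sketch in which guards and symbolic outputs are ordinary functions over integers. This is an illustration, not the authors' implementation: the class names Rule and SFT are hypothetical, and the topology (the x = 0 rule leading from q0 to q1, the always-true rule looping on q1, q1 final) is an assumption, because the transcript does not record which arrow carries which label.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    source: str
    guard: Callable[[int], bool]               # phi: predicate on one input element
    outputs: tuple[Callable[[int], int], ...]  # f: sequence of output functions
    target: str


@dataclass
class SFT:
    start: str
    final: set[str]
    rules: list[Rule]

    def run(self, word: list[int]) -> set[tuple[int, ...]]:
        """Return every output sequence produced by an accepting run."""
        configs = {(self.start, ())}
        for x in word:
            configs = {
                (rule.target, out + tuple(f(x) for f in rule.outputs))
                for state, out in configs
                for rule in self.rules
                if rule.source == state and rule.guard(x)
            }
        return {out for state, out in configs if state in self.final}


# ASSUMED topology for the two labels on the example slide.
example = SFT(
    start="q0",
    final={"q1"},
    rules=[
        Rule("q0", lambda x: x == 0, (lambda x: 1,), "q1"),
        Rule("q1", lambda x: True, (lambda x: 2 * x,), "q1"),
    ],
)

print(example.run([0, 3, 5]))  # {(1, 6, 10)}
print(example.run([4]))        # set()
```

Two rules cover every integer input here, where a classical transducer would need one transition per symbol. Plain Python functions can only be evaluated, though; the analyses in the talk additionally need guards expressed as formulae in a decidable theory.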

28

29

Closure under composition

[Figure: SFT A and SFT B, each mapping in to out, and a single SFT for the composition of A and B mapping in to out]

Requirement:

30

Single-valued equivalence

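Definition: $A \overset{1}{=} B$ iff

$$\forall a : \sigma^*.\ \forall b, c : \gamma^*.\ \bigl(b \in A(a) \wedge c \in B(a)\bigr) \Rightarrow b = c$$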

31

Algorithm:
• Construct 2-output product transducer
• Find conflicts (dft):
  – output length
  – output value

Complexity: $O(n^2 \cdot f(m))$, where $n$ is the number of rules and $f$ is the complexity of the decision procedure

32

Key restriction: single-valuedness

Transducer A is single-valued if, for all inputs, A has at most one output.

$A \overset{1}{=} A$

33

Note: This definition permits non-determinism, e.g.:

[Figure: two transitions, both labeled b/[]]
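The two definitions (single-valued equivalence and single-valuedness) can be tried out with a brute-force check over short inputs. This is not the product-transducer algorithm from the talk; it is a bounded enumeration over a finite alphabet, shown only to illustrate what the definitions say. All names (outputs, one_equal_up_to, NONDET, AMBIGUOUS) are hypothetical.

```python
from itertools import product


def outputs(t: dict, word: str) -> set[str]:
    """All outputs of a classical finite transducer `t` on `word`."""
    configs = {(t["start"], "")}
    for symbol in word:
        configs = {
            (nxt, out + emitted)
            for state, out in configs
            for nxt, emitted in t["rules"].get((state, symbol), [])
        }
    return {out for state, out in configs if state in t["final"]}


def one_equal_up_to(a: dict, b: dict, alphabet: str, max_len: int) -> bool:
    """Bounded check: on every input of length <= max_len, each output of
    `a` equals each output of `b`."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if any(x != y for x in outputs(a, word) for y in outputs(b, word)):
                return False
    return True


def single_valued_up_to(t: dict, alphabet: str, max_len: int) -> bool:
    # Single-valued means the transducer is 1-equal to itself.
    return one_equal_up_to(t, t, alphabet, max_len)


# Non-deterministic but single-valued: two b/[] transitions.
NONDET = {"start": "p", "final": {"q", "r"},
          "rules": {("p", "b"): [("q", ""), ("r", "")]}}
# Not single-valued: the same input yields two different outputs.
AMBIGUOUS = {"start": "p", "final": {"q", "r"},
             "rules": {("p", "b"): [("q", "x"), ("r", "y")]}}

print(single_valued_up_to(NONDET, "b", 3))     # True
print(single_valued_up_to(AMBIGUOUS, "b", 3))  # False
```

A bounded enumeration can only refute the property, never prove it for all inputs, and it does not apply to symbolic alphabets.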

34

35

algebra

36

algebra

interesting properties: subsumption, equivalence, idempotence, commutativity, ...

38

39

Case Studies

• HTMLdecode
• Malware Fingerprinting
• Image Blurring
• Location Privacy

41

HTMLdecode

Decode: "&lt;", "&#60;", and "&#0060;" all decode to "<".

42

The Task: Prove that HTMLdecode is not idempotent

The Metric: Running time
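A quick way to see why idempotence fails is to decode a doubly encoded string twice. The sketch below uses Python's standard html.unescape as a stand-in for HTMLdecode (an assumption: the talk analyzes a different implementation), and it tests a single witness input instead of proving the property with transducers. The helper is_idempotent_on is hypothetical.

```python
import html


def is_idempotent_on(decode, text: str) -> bool:
    """True if decoding `text` twice gives the same result as decoding once."""
    once = decode(text)
    return decode(once) == once


# The three encodings shown on the slide all decode to "<".
for encoded in ("&lt;", "&#60;", "&#0060;"):
    assert html.unescape(encoded) == "<"

# A doubly encoded "<" is a witness against idempotence.
witness = "&amp;lt;"
print(html.unescape(witness))                     # &lt;
print(html.unescape(html.unescape(witness)))      # <
print(is_idempotent_on(html.unescape, witness))   # False
```

The first decode turns "&amp;lt;" into "&lt;", and the second turns that into "<", so decoding twice differs from decoding once.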

43

The Problem: Unicode defines 1,114,112 code points.

44

Three Participating Representations

• SFT (Eager)
• SFT + Registers (Eager)
• SFT + Registers (Lazy)

45

[Chart: transducer size (log scale, 1 to 10,000,000) against maximum number of digits (2 to 6); labeled value: 6.6M.]

47

[Chart: transducer size (log scale, 1 to 10,000,000) against maximum number of digits (2 to 6), for SFT and SFT + Symbolic State Space; labeled values: 6.6M and 51.]

48

Idempotence Checking: Time

[Chart: time (seconds; log scale, 1 to 1,000,000) against maximum number of digits (2 to 6), for SFT, SFT + REG (lazy), and SFT + REG (eager).]

50

Conclusion

• Introduced Symbolic Finite State Transducers over any decidable background theory

• Presented decidability and complexity results

• Comes with a scalable and robust* implementation

51

Thank you! Please try our…

implementation

http://research.microsoft.com/automata/

online tutorial

http://www.rise4fun.com/Bek/tutorial

52

53

http://www.rise4fun.com/Bek
