PRESTO: Program Analyses and Software Tools Research Group, Ohio State University

Efficient Checkpointing of Java Software using Context-Sensitive Capture and Replay

Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin

Ohio State University

ESEC/FSE 07

Outline

Motivation
- Challenges for checkpointing/replaying Java software
- Summary of our approach

Contributions
- Static analyses
- Multiple execution regions
- Experimental evaluation

Conclusions


Motivation

Checkpointing/replaying has been used for a variety of purposes at system level
- Originally designed to support fault tolerance
- Debugging of OS and of parallel and distributed software

Checkpointing can benefit a number of software engineering tasks
- Reduce the cost of manual debugging and testing
- Support for automated techniques for debugging and testing: e.g., dynamic slicing and delta-debugging
- Inspired by both system-level checkpointing [Pan-PDD88, Dunlap-OSDI02, King-USENIX05] and “saving-and-restoring” software engineering techniques [Saff-ASE05, Orso-WODA05, Orso-WODA06, Elbaum-FSE06]


Challenges

Ease of use and deployment
- Application-level checkpointing: no JVM/runtime support, just code analysis and instrumentation
- Challenge: no direct access to the call stack; no control over thread scheduling or external resources (files, etc.)

Reduce the size of the recorded state
- Dumping the entire heap may be prohibitively expensive, especially for large programs
- Challenge: static analyses to prune redundant state

Static and dynamic overhead
- Static analysis cost is amortized over multiple runs
- Approach is intended for long-running applications


Summary of Our Approach

Tool input: program + checkpoint definition

Performs static analyses and code instrumentation

Tool output: two program versions

First, an augmented checkpointing version is executed once to record (parts of) the run-time program states
- At the checkpoint: heap objects, static fields, locals
- At certain points along the call chain leading to the checkpoint

Next, a pruned replaying version is executed multiple times
- Restore variables saved at the checkpoint
- Restore variables saved at points along the call chain

How do we resume execution from the checkpoint?
- Step 1: control flow quickly reaches the checkpoint
- Step 2: recover state at checkpoint
- Step 3: incrementally recover state after call sites along the call chain leading to the checkpoint


Definitions

Crosscut call chain (CC-chain)
- A programmer-specified call chain that leads to the method that contains the checkpoint
- E.g. main(44) -> run(28)

Decision points
- A call site on the CC-chain (e.g. m.run), due to polymorphism
- A predicate on which a decision point or the checkpoint is control-dependent

At a decision point, the checkpointing version records the control-flow outcome

The replaying version uses this info to force the control flow to reach the checkpoint
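
The record-then-force idea above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the `DecisionLog` class and its method names are hypothetical, and the log is kept in memory here, whereas a real checkpointing run would persist it between executions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not from the talk): keeps decision-point outcomes in order.
final class DecisionLog {
    private final Deque<Object> outcomes = new ArrayDeque<>();

    // Checkpointing version: record a predicate outcome and pass it through.
    boolean save(boolean outcome) {
        outcomes.addLast(outcome);
        return outcome;
    }

    // Checkpointing version: record the run-time type of a call-site receiver.
    void saveType(Object receiver) {
        outcomes.addLast(receiver.getClass().getName());
    }

    // Replaying version: consume the next recorded predicate outcome.
    boolean readBoolean() {
        return (Boolean) outcomes.removeFirst();
    }

    // Replaying version: consume the next recorded receiver type name.
    String readType() {
        return (String) outcomes.removeFirst();
    }

    public static void main(String[] args) {
        DecisionLog log = new DecisionLog();
        // Checkpointing run: outcomes are recorded in execution order.
        log.save(args.length != 0);
        log.saveType(new StringBuilder());
        // Replaying run: the same outcomes are read back in the same order.
        System.out.println(log.readBoolean());
        System.out.println(log.readType());
    }
}
```

Reading outcomes in recording order is what lets the replaying version skip the original predicate evaluation and still follow the CC-chain to the checkpoint.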


Replaying, Step 1: Recover the Call Stack

Predicate decision point: recover boolean value

Call site decision point o.m(a1, …, an)
- Recover the run-time type of the receiver object; instantiated during replaying using sun.misc.Unsafe
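
One way to instantiate a receiver of a recorded type without running its constructor is `sun.misc.Unsafe.allocateInstance`. The sketch below assumes an OpenJDK/HotSpot runtime where the `theUnsafe` field is reachable by reflection; the `ReceiverRecovery` class and the idea of storing the type as a class-name string are illustrative assumptions, not details from the talk.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper (not from the talk): rebuilds a receiver from its recorded type.
final class ReceiverRecovery {
    // Obtain the Unsafe singleton reflectively (assumes the "theUnsafe" field exists).
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Allocate an object of the recorded type; no constructor is executed,
    // so all fields keep their default values until state is restored later.
    static Object allocate(String recordedTypeName) {
        try {
            return unsafe().allocateInstance(Class.forName(recordedTypeName));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object receiver = allocate("java.util.ArrayList");
        System.out.println(receiver.getClass().getName());
    }
}
```

Because no constructor runs, the object is only good for dispatching the call on the CC-chain; any fields read after the checkpoint still have to be restored from the recorded state.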


Checkpointing Version

   void run(String[] args) {
       processCmdLine(args);
       loadNecessaryClasses();
       Set wp_packs = getWpacks();
       Set body_packs = getBpacks();
       boolean b = Options.v().whole_jimple();
=>     save(b);
       if (b) { // DP
           getPack("cg").apply();
           // --- checkpoint ---
=>         save(…);
           getPack("wjtp").apply();
           getPack("wjop").apply();
           getPack("wjap").apply();
       }
       retrieveAllBodies();
       …
   }
   ...

   static void main(String[] args) {
       Main m = new Main();
       boolean b = args.length != 0;
=>     save(b);
       if (b) { // DP
=>         save(type_of(m));
           m.run(args); // DP
       }
   }


Replaying Version

   void run(String[] args) {
       processCmdLine(args);
       loadNecessaryClasses();
       Set wp_packs = getWpacks();
       Set body_packs = getBpacks();
       boolean b = Options.v().whole_jimple();
=>     read(b);
       if (b) { // DP
           getPack("cg").apply();
           // --- checkpoint ---
=>         read(…);
           getPack("wjtp").apply();
           getPack("wjop").apply();
           getPack("wjap").apply();
       }
       retrieveAllBodies();
       …
   }

   static void main(String[] args) {
       Main m = new Main();
       boolean b = args.length != 0;
=>     read(b);
       if (b) { // DP
=>         read(type_of(m));
=>         unsafe.allocate(m);
=>         args = null;
           m.run(args); // DP
       }
   }


Step 2: Recover at the Checkpoint

Our static analysis selects locals for recording (for checkpointing) / recovering (for replaying) when
- They are written before the checkpoint
- They are read after the checkpoint

Record primitive-typed values or entire object graphs on the heap (all reachable objects)

Static fields are selected based on the same idea

   void run(String[] args) {
       processCmdLine(args);
       loadNecessaryClasses();
       Set wp_packs = getWpacks();
       Set body_packs = getBpacks();
       if (Options.v().whole_jimple()) {
           getPack("cg").apply();
           // --- checkpoint ---
           getPack("wjtp").apply();
           getPack("wjop").apply();
           getPack("wjap").apply();
       }
       retrieveAllBodies();
       for (Iterator i = body_packs.iterator(); i.hasNext();) { … }
       …
   }

Selected local: body_packs
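
The selection rule for locals amounts to intersecting two sets. The sketch below assumes the "written before" and "read after" sets are already known; here they are written by hand for the run() example, whereas the actual analysis computes them from the program. The `LocalSelection` class is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not from the talk): applies the written-before/read-after rule.
final class LocalSelection {
    // A local is selected when it is written before the checkpoint
    // and read after the checkpoint.
    static Set<String> select(List<String> writtenBefore, List<String> readAfter) {
        Set<String> selected = new LinkedHashSet<>(writtenBefore);
        selected.retainAll(readAfter);
        return selected;
    }

    public static void main(String[] args) {
        // Hand-written sets for the run() example (assumed, not computed).
        List<String> writtenBefore = Arrays.asList("wp_packs", "body_packs");
        List<String> readAfter = Arrays.asList("body_packs", "i");
        System.out.println(select(writtenBefore, readAfter)); // prints [body_packs]
    }
}
```

In this example wp_packs is dropped because nothing after the checkpoint reads it, which is how the analysis keeps the recorded state small.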


Selection of Static Fields

A whole program Mod/Use analysis
- A static field is “written” if its value is changed, or any heap object reachable from it is mutated
- A static field is “read” if its value is directly read

Analysis algorithm
- Context-sensitive and flow-insensitive; uses the points-to solution and the call graph from Spark [Lhotak CC-03]
- Bottom-up traversal of the SCC-DAG of the call graph
- For each method m, a set Cm is maintained to contain all objects from which a mutated object can be reached
- Propagate backwards the objects in Cm that escape a callee method to its callers
- Select a static field fld if PointsToSet(fld) ∩ Cm ≠ ∅
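
A heavily simplified sketch of the bottom-up propagation and the final intersection test is given below. It is not the analysis described above: it assumes an acyclic call graph already listed callees-first (so no SCC handling), treats every object in a callee's set as escaping, and is context-insensitive. Abstract objects and fields are plain strings, and the `StaticFieldSelection` class is hypothetical.

```java
import java.util.*;

// Hypothetical, simplified model (not the talk's algorithm).
final class StaticFieldSelection {
    // Compute Cm for each method by visiting callees before callers and
    // merging each callee's set into its caller (all objects assumed to escape).
    static Map<String, Set<String>> propagate(List<String> bottomUpOrder,
                                              Map<String, List<String>> callees,
                                              Map<String, Set<String>> localC) {
        Map<String, Set<String>> c = new HashMap<>();
        for (String m : bottomUpOrder) {
            Set<String> cm = new HashSet<>(localC.getOrDefault(m, Collections.emptySet()));
            for (String callee : callees.getOrDefault(m, Collections.emptyList())) {
                cm.addAll(c.getOrDefault(callee, Collections.emptySet()));
            }
            c.put(m, cm);
        }
        return c;
    }

    // Select a field when its points-to set intersects Cm.
    static Set<String> select(Map<String, Set<String>> pointsTo, Set<String> cm) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : pointsTo.entrySet()) {
            if (!Collections.disjoint(e.getValue(), cm)) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, List<String>> callees = new HashMap<>();
        callees.put("mid", Arrays.asList("leaf"));
        callees.put("main", Arrays.asList("mid"));
        Map<String, Set<String>> localC = new HashMap<>();
        localC.put("leaf", Collections.singleton("o1")); // leaf mutates something reachable from o1
        Map<String, Set<String>> c = propagate(Arrays.asList("leaf", "mid", "main"), callees, localC);

        Map<String, Set<String>> pointsTo = new HashMap<>();
        pointsTo.put("Main.cache", Collections.singleton("o1"));
        pointsTo.put("Main.name", Collections.singleton("o2"));
        System.out.println(select(pointsTo, c.get("main")));
    }
}
```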


Step 3: Recover after the Checkpoint

Replaying only at decision points and the checkpoint is not enough to guarantee correct execution after the checkpoint

Additionally record/recover local variables that will be read after each call site in CC-chain

   void main() {
       Set hs = new HashSet();
       B b = new B(hs);
       //-- reco/rest (type_of(b))
       b.m();
       //-- extra reco/rest (hs)
       if (hs == b.s) { … }
   }

   class B {
       Set s;
       void m() {
           B r0 = this;
           r0.s = new HashSet();
           //-- checkpoint
           //-- reco/rest (r0)
           r0.s.add("");
       }
   }

hs uninitialized
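
The extra record/restore step for a local such as hs can be sketched with standard Java serialization. This is an assumption made for illustration: the talk does not say which mechanism the tool uses to write and read object graphs, and the `CallSiteRestore` class is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashSet;

// Hypothetical helper (not from the talk): saves and restores one object graph.
final class CallSiteRestore {
    // Checkpointing version: record the object graph reachable from root.
    static byte[] record(Serializable root) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Replaying version: rebuild the recorded object graph.
    static Object restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Checkpointing run: hs is live after the call site, so it is recorded there.
        HashSet<String> hs = new HashSet<>();
        hs.add("x");
        byte[] saved = record(hs);

        // Replaying run: the code that created hs was skipped, so hs starts out unset
        // and is restored right after the call site, before it is read.
        Object restoredHs = restore(saved);
        System.out.println(restoredHs);
    }
}
```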


Additional Issues

A checkpoint can have multiple run-time instances

If a method in CC-chain has callers that are not in the chain, it has to be replicated

Currently do not support multi-threaded programs

Our technique does not guarantee the correctness of the execution, when the post-checkpoint part of the program
- Depends on external resources, such as files, databases
- Depends on unique-per-execution values, such as clock
- Is modified with new cross-checkpoint dependencies

Multiple execution regions
- Designated by a starting point and an ending point
- Specified by two CC-chains


Study 1: Static Analysis

Program       #R   #IP
jb-6.1         3     5
jlex-1.2.6     3     8
db             2     5
jtar-1.21      2     4
jflex          2     8
violet         4     9
jess           3     8
sablecc        4    11
javacup        4     9
soot-2.2.3    10    35
raytrace       3    10
socksecho      3    14
socksproxy     3    11
compress       1     6
muffin         3    20


Static Analysis: Locals Reduction

[Figure: chart comparing Total Locals and Selected Locals; y-axis from 0 to 1800]


Static Analysis: Static Fields Reduction

[Figure: chart comparing Total SF and Selected SF; y-axis from 0 to 3500]


Static Analysis: Removed/Inserted Statements

[Figure: chart of Stmts Left after Pruning (%) and Stmts Inserted (%); y-axis from 0 to 120]


Static Analysis Cost

Phase 1: Soot infrastructure cost
- Between 1.64ms and 30.6ms per thousand Jimple statements
- On average, 11.1ms/1000 statements

Phase 2: Our analysis cost
- Between 1.67ms and 26.6ms per thousand Jimple statements
- On average, 9.4ms/1000 statements

This should be amortized across multiple runs of the replaying version
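
As a rough worked example of the amortization, the sketch below combines the two reported averages (11.1ms and 9.4ms per 1000 Jimple statements) and divides the total by the number of replaying runs. The program size and run count in main are made-up inputs, and the `AnalysisCost` class is hypothetical.

```java
// Hypothetical helper (not from the talk): back-of-the-envelope cost estimate.
final class AnalysisCost {
    static final double SOOT_MS_PER_1000 = 11.1;     // Phase 1 average from the slide
    static final double ANALYSIS_MS_PER_1000 = 9.4;  // Phase 2 average from the slide

    // Estimated one-time static cost for a program of the given size.
    static double totalMs(long jimpleStatements) {
        return (SOOT_MS_PER_1000 + ANALYSIS_MS_PER_1000) * jimpleStatements / 1000.0;
    }

    // The one-time cost spread evenly over the replaying runs.
    static double amortizedMs(long jimpleStatements, int replayRuns) {
        return totalMs(jimpleStatements) / replayRuns;
    }

    public static void main(String[] args) {
        long statements = 100_000; // assumed program size, for illustration only
        int runs = 10;             // assumed number of replaying runs
        System.out.printf("total: %.1f ms, per run: %.1f ms%n",
                totalMs(statements), amortizedMs(statements, runs));
    }
}
```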


Study 2: Run-Time Performance (compress)

Original program: compressing and decompressing 5 big tar files several times

Evaluated for five checkpoint definitions
- One checkpoint, close to the beginning of the program
- Two regions of compression and decompression
- A region containing the process of compression
- A region containing the process of decompression
- One checkpoint, close to the end of the program


compress Performance

[Figure: normalized running time of the checkpointing version and the replaying version; x-axis 1 to 5, y-axis from 0 to 140]

[Figure: normalized size of captured program state, comparing Size of Heap and Size of Captured Program State; x-axis 1 to 5, y-axis from 0 to 100]


Study 2: Run-Time Performance (soot)

Input: soot-2.2.3 itself containing 2227333 methods

Phases
- Enabling cg.spark, wjtp, wjop.ji, wjap.uft, jtp, jop.cp

Evaluated for six checkpoint definitions
- Before whole-program packs
- After cg
- After wjtp
- After wjop
- After wjap
- After body packs


soot Performance

[Figure: normalized running time of the checkpointing version and the replaying version; x-axis 1 to 6, y-axis from 0 to 120]

[Figure: second chart; x-axis 1 to 6, y-axis from 0 to 100]



Study 2: Run-Time Performance (jflex-1.4.1)

Input: a .flex grammar file corresponding to a DFA containing 21769 states

Evaluated for four checkpoint definitions
- After NFA is generated
- After NFA is converted to DFA
- After minimization
- After emission


jflex Performance

[Figure: normalized running time of the replaying version and the checkpointing version; x-axis 1 to 4, y-axis from 0 to 150]

[Figure: second chart; x-axis 1 to 4, y-axis from 0 to 100]



Summary of Evaluation

Static analysis successfully reduces the size of program state recorded and recovered

It is more meaningful to checkpoint/replay long-running programs

Checkpoints are better taken after a phase of long computation with (relatively) small output state
- √ compress: small program state, short running time
- √ soot: large program state, but very long computation time
- X jflex: large program state, short running time


Conclusions

A static-analysis-based checkpointing/replaying technique

An implementation and an evaluation that shows our technique can be an interesting candidate for testing, debugging, and dynamic slicing of long-running programs

Future work
- Language-level checkpointing/replaying of multi-threaded programs
- More precise static analyses could be employed to reduce the size of program state to be captured
- The run-time support for object reading and writing could be improved


Questions?
