
Page 1: Trustless Grid Computing  in ConCert (Progress Report)

Trustless Grid Computing in ConCert (Progress Report)

Robert Harper
Carnegie Mellon University

Acknowledgements

• Co-PIs: Karl Crary, Frank Pfenning, Peter Lee.
• Support: NSF ITR program.
• Students (who do the real work): Chang, Delap, Dreyer, Kliger, Magill, Moody, Murphy, Petersen, Sarkar, Vanderwaart, Watkins.
• Thanks to the FGC organizers for the invitation!

Grid Computing

• “The network is a computer.”
  – Exploit idle resources on the network.
  – Many ad hoc grids.
    • SETI@HOME
    • FOLDING@HOME
• But what is a general grid model?
  – Trust model, programming model, participation model?

Application Model

• What is the (a?) grid computer?
  – Parallelism?
  – Dependencies?
  – Sharing resources?
  – Failures?
• Centralized vs. distributed.
  – Bottlenecks (e.g., SETI traffic at UCB).
  – Reliability, robustness.

Application Model

• Most grid apps are massively parallel.
  – Depth = 1, no dependencies.
  – Ray tracing, GIMPS, SETI.
• Is a grid useful for depth > 1?
  – Game-tree search.
  – Theorem proving.
• Is parallelism the only benefit?
  – What about data locality?

Host Model

• Active intervention required.
  – Must download code, apply upgrades.
  – Must decide which grids to participate in.
• Motivation to participate?
  – At scale, largely altruism, coolness.
  – Ad hoc grids on an intranet.
  – Economic models? (Cf. Lillibridge et al.)

Trust Relationships

• Hosts trust applications.
  – Denial of service attacks.
  – Privacy/secrecy attacks.
  – Accidental misbehavior (e.g., SETI).
• Applications trust hosts.
  – Spoofed answers.
  – Collusion among participants.
• Can we minimize these?

The ConCert Approach

• One computer, many keyboards.
  – Decentralized scheduling.
  – Emphasis on code mobility.
• Policy-based participation.
  – Declarative statement of participation criteria.
  – Applications must prove compliance.
• Dependency-based scheduling.
  – Arbitrary depth.
  – And/or dependencies.
  – Inspired by CILK/NOW.

The ConCert Network

[Figure: the ConCert network, showing a client and hosts.]

Host Setup

[Figure: the components running on each host.]

• Locator: peer-to-peer discovery protocol.
• Scheduler: distributed scheduler.
• Worker: loader/verifier/runner.

Scheduler

• Maintain ready and waiting queues.
  – Ready queue: available for “stealing”.
  – Wait queue: awaiting satisfying assignment.
• Work-stealing model.
  – Who has work to do?
  – Grab work, compute result, deliver to owner.
• Dependencies.
  – Supports depth > 1 parallelism.
  – Don’t care and don’t know parallelism.
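
The two queues on this slide can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the ConCert scheduler: the class name `Scheduler`, its methods, and the restriction to and-only dependencies (a cord needs all of the cords it lists) are all invented here for brevity.

```python
from collections import deque


class Scheduler:
    """Toy scheduler: cords wait until their dependencies are met, then become stealable."""

    def __init__(self):
        self.ready = deque()   # cords available for stealing
        self.waiting = []      # cords whose dependencies are not yet satisfied
        self.results = {}      # cord id -> delivered result

    def _is_ready(self, cord):
        _, needs = cord
        # Assumption: and-only dependencies, so every needed result must be present.
        return needs.issubset(self.results)

    def submit(self, cord_id, needs=()):
        cord = (cord_id, frozenset(needs))
        if self._is_ready(cord):
            self.ready.append(cord)
        else:
            self.waiting.append(cord)

    def steal(self):
        """Hand one ready cord to an idle worker, or None if there is no work."""
        return self.ready.popleft() if self.ready else None

    def deliver(self, cord_id, result):
        """Record a result and promote any waiting cords that it unblocks."""
        self.results[cord_id] = result
        still_waiting = []
        for cord in self.waiting:
            if self._is_ready(cord):
                self.ready.append(cord)
            else:
                still_waiting.append(cord)
        self.waiting = still_waiting
```

Submitting F, G, and an H that needs both leaves H on the wait queue until results for F and G have been delivered, at which point H moves to the ready queue.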

Scheduler

• The unit of work on the grid is a cord.

Scheduler

• Cord structure:
  – Code: cached using MD5 fingerprints.
  – Certificate of compliance (more later).
  – Dependencies: positive boolean formula.
• Assumptions:
  – Idempotent: can always be re-run.
  – Non-blocking: runs to completion (but may create more cords, often as continuations).
  – Communication only via dependencies. Satisfying assignment passed on activation.
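
To make the cord structure concrete, here is a minimal Python sketch of a cord whose dependencies form a positive boolean formula (and/or only, no negation). The tuple encoding of formulas, the `Cord` field names, and the `satisfy` helper are assumptions made for this example; the slides do not give the actual representation.

```python
import hashlib
from dataclasses import dataclass

# Assumed encoding of a positive boolean formula over cord ids:
#   "F"                  a variable, true once cord F has delivered a result
#   ("and", f, g, ...)   all sub-formulas must hold
#   ("or", f, g, ...)    at least one sub-formula must hold


def satisfy(formula, results):
    """Return a satisfying assignment (cord id -> result), or None if none exists yet."""
    if isinstance(formula, str):
        return {formula: results[formula]} if formula in results else None
    op, *parts = formula
    witnesses = [satisfy(part, results) for part in parts]
    if op == "and":
        if any(w is None for w in witnesses):
            return None
        merged = {}
        for w in witnesses:
            merged.update(w)
        return merged
    if op == "or":
        # The first satisfied alternative is enough.
        return next((w for w in witnesses if w is not None), None)
    raise ValueError(f"unknown connective: {op!r}")


@dataclass(frozen=True)
class Cord:
    code: bytes            # object code, shipped separately from the arguments
    argument: str          # marshalled argument
    dependencies: object   # positive boolean formula in the encoding above

    @property
    def fingerprint(self):
        # The slide names MD5 as the fingerprint used for caching code.
        return hashlib.md5(self.code).hexdigest()
```

With `("and", "F", ("or", "G", "G2"))` as the formula, the cord becomes runnable once F and either G or G2 have results, and the dictionary returned by `satisfy` is what would be passed to it on activation.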

Worker

• Steal work from (self or) neighbor.
• Obtain cord from host.
  – Typically arguments + dependencies.
  – Code shipped at most once.
• Verify certificate of compliance.
• Load and execute as a DLL.
  – Currently combined with verification.
  – Should verify at most once (cache result).
• Deliver result to owner.
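
The "shipped at most once" and "verify at most once" goals amount to a cache keyed by the code fingerprint. The sketch below shows that idea only. `fetch_code` and `verify` are hypothetical callbacks standing in for the network transfer and the certificate checker, and the `Worker` class is invented for this example.

```python
import hashlib


class Worker:
    """Toy worker that fetches and verifies each piece of code at most once."""

    def __init__(self, fetch_code, verify):
        self.fetch_code = fetch_code  # hypothetical: fingerprint -> code bytes
        self.verify = verify          # hypothetical: code bytes -> bool
        self.code_cache = {}          # fingerprint -> code that passed verification

    def obtain(self, fingerprint):
        """Return verified code for a fingerprint, using the cache when possible."""
        if fingerprint not in self.code_cache:
            code = self.fetch_code(fingerprint)
            if hashlib.md5(code).hexdigest() != fingerprint:
                raise ValueError("code does not match its fingerprint")
            if not self.verify(code):
                raise PermissionError("certificate of compliance rejected")
            # Only verified code enters the cache, so a hit needs no re-check.
            self.code_cache[fingerprint] = code
        return self.code_cache[fingerprint]
```

Because only the arguments and dependencies differ between cords that share code, a second cord with the same fingerprint costs neither a transfer nor a verification.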

Control

• Client.
  – Submit a job to the grid.
  – “One per keyboard.”
• Monitor.
  – Web server interface.
  – Displays cord status.
  – [Change policy.]

Moving Cords Around

A client submits work, broken into cords, to the local conductor.

Moving Cords Around

Idle peers steal cords to work on.

Cords have destinations for their answers, shown by color here.

Moving Cords Around

Some cords spawn new cords. They might depend on other cords before they can run.

The destination of F and G is the green node, since they will be used to fill H’s dependencies.

Moving Cords Around

When a cord finishes, the result is sent to its destination. The client interprets and displays the results.

Simultaneously, unfinished cords continue to be stolen...

Moving Cords Around

When the green node has answers for F and G, H is then ready to be stolen.

Popcorn/Grid Model

• my_cord : string × witness → string.
  – Marshals argument and result itself.
  – Witness is the satisfying assignment for its dependencies.
• Typical structure:
  – Input = entry point + arguments.
  – Dispatch on entry point.
• Cords as distributed continuations.
  – Perform some work, spawn new cords.
  – Supports various higher-level parallelism models.
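
The "entry point + arguments" structure can be illustrated with a Python function of the same shape as `my_cord`. Popcorn itself is a C-like language, so this is only an analogy. JSON as the marshalling format and the entry-point names `square` and `sum_children` are assumptions made for the example.

```python
import json


def my_cord(argument: str, witness: dict) -> str:
    """A cord of shape string × witness → string that dispatches on an entry point."""
    request = json.loads(argument)  # the cord unmarshals its own argument
    entry, args = request["entry"], request["args"]
    if entry == "square":
        return json.dumps(args[0] * args[0])
    if entry == "sum_children":
        # Results of the cords this one depended on arrive in the witness.
        return json.dumps(sum(json.loads(r) for r in witness.values()))
    raise ValueError(f"unknown entry point: {entry!r}")
```

A continuation cord such as `sum_children` does no work until its dependencies are satisfied. It then reads their marshalled results from the witness and produces its own marshalled result.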

ML/Grid Model

• One program for client and its cords.
  – Compiler separates client from cords.
• Compiler handles marshalling.
• Run-time checks enforce distinctions (more later).
  – Cord cannot perform I/O.
  – Client cannot submit itself as a cord.
• Compiles to TAL/Grid.

ML/Grid Model

• Primitives:
  – spawn : (unit → α) → α task
  – sync : α task → α
  – relax : α task list → α × α task list
• Must be provided as primitives.
  – Requires access to representations.
• Further higher-level libraries.
  – E.g., parallelism models.
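
The intended behaviour of the three primitives can be approximated on a single machine with Python's `concurrent.futures`. This is a local stand-in, not ML/Grid: a thread pool plays the role of the grid, and reading `relax` as "wait for the first task to finish, then return its result together with the remaining tasks" is an interpretation of the type on the slide.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Assumption: a small local thread pool stands in for the grid.
_pool = ThreadPoolExecutor(max_workers=4)


def spawn(thunk):
    """Start a computation and return a task handle for it."""
    return _pool.submit(thunk)


def sync(task):
    """Block until the task has finished and return its result."""
    return task.result()


def relax(tasks):
    """Wait for any one task to finish; return its result and the other tasks."""
    if not tasks:
        raise ValueError("relax needs at least one task")
    done, _ = wait(tasks, return_when=FIRST_COMPLETED)
    first = next(iter(done))
    return first.result(), [t for t in tasks if t is not first]
```

`sync` corresponds to an and-dependency on one task, while `relax` gives the or-style behaviour needed for game-tree search or proof search, where the first answer may make the remaining tasks unnecessary.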

Examples

• GML ray-tracer (ICFP01 Contest).
  – Depth = 1.
  – Written in Popcorn/Grid, compiles to TALx86/Grid.
• Chess player.
  – Depth > 1, and-or dependencies.
  – Written in Popcorn/Grid, compiles to TALx86/Grid.
• Theorem prover for MLL.
  – Depth > 1, and-or dependencies.
  – Written in SML, runs on simulator.
  – Being ported to ML/Grid.

Some Problems

• Failures.
  – Fail-stop model is easily supported.
  – Demonic failures require result certification.
• Abandoning cords.
  – Or-dependencies are satisfied by the first cord to deliver an answer.
  – Parent must be prepared to receive a result long after it is no longer needed.
• Sharing results.
  – Grid-wide cache of answers?

Result Certification

• Main idea: make host prove validity of answer.
  – Avoid need for application to trust hosts.
• Some applications admit native certification.
  – For theorem prover: the proof.
  – For factoring: the factors.
• Are there general result certification methods?
  – Work-stealing model precludes random allocation / redundancy methods (SETI, Bayanihan).
  – Centralized methods are not robust or scalable.
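
Factoring shows what native certification looks like: the answer is its own certificate, because the application can check it without trusting the host. The Python sketch below is illustrative only. The function names are invented, and trial division is used for the primality check purely to keep the example short (a real checker would use a faster primality test).

```python
def is_prime(n):
    """Trial-division primality test (adequate for small illustrative inputs)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def check_factoring(n, factors):
    """Accept a claimed factorization only if every factor is prime and the product is n."""
    product = 1
    for f in factors:
        if not is_prime(f):
            return False
        product *= f
    return product == n
```

A host that returns `[7, 13]` for 91 is accepted, while a host that returns `[91]` or `[2, 9, 5]` for 90 is rejected, with no need to repeat the search for factors.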

Result Certification

• A crazy idea: use the PCP theorem.
  – Use interactive dialog to spot-check a proof.
• Host proves that it ran given code on given data.
  – Execution trace is a proof that it did.
  – But traces can be huge!
• Engage in a dialog with O(1) rounds to check proof with high probability.
  – Avoids need to transmit trace itself.
  – But the representation is enormous!

Two Foundational Questions

• What is a type system for a GPL?
  – Enforce mobility constraints.
  – Clean type system to support development, compilation, certification.
• What policies can we support?
  – How to state policies?
  – How to prove compliance?
  – How to support multiple policies?

A Type System for GPL

• Main idea: modalities for mobility.
  – Cf. related ideas by Cardelli, Gordon, et al.
  – Cf. recent work by Walker.
  – Here: Curry-Howard applied to modal logic.
• Necessity (□A): a computation of A anywhere.
  – Classifies mobile code of type A.
  – Enforces marshalling and access restrictions.
• Possibility (◇A): a computation of A somewhere.
  – Classifies remote code of type A.
  – Ensures that access is limited to remote values.

Necessity for Mobility

• Truth (local) typing judgement:
  [Formula not recovered. Its annotations mark the valid (mobile) bindings and the true (local) bindings.]

Necessity for Mobility

• Validity (mobile) typing judgement:

• Mobile = does not use local resources.
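
The formulas for these judgements did not survive in the transcript. As a hedged reconstruction, the judgmental formulation of S4 due to Pfenning and Davies, which the "valid" and "true" labels appear to follow, uses two contexts. The symbols Δ, Γ, M, u, and x below are the conventional ones and are an assumption about what the slides showed.

```latex
% Truth (local) judgement: M has type A at the current location.
% \Delta holds valid (mobile) bindings u : A; \Gamma holds true (local) bindings x : A.
\[ \Delta;\ \Gamma \vdash M : A \]

% Hypothesis rules: both kinds of variable may be used locally.
\[ \Delta;\ \Gamma, x{:}A, \Gamma' \vdash x : A
   \qquad
   \Delta, u{:}A, \Delta';\ \Gamma \vdash u : A \]

% Validity (mobile) judgement: truth that depends on no local bindings.
\[ \Delta;\ \cdot \vdash M : A \]
```

Under this reading, "mobile = does not use local resources" is exactly the requirement that the local context be empty.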

Necessity for Mobility

• Box = marshal value and bindings.

• Values of boxed type are mobile.

Necessity for Mobility

• Unboxing = unbox and run mobile code.

• Implicit un-marshalling:
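
The rules themselves are missing from the transcript. In the standard Pfenning and Davies presentation they would read as follows. This is a sketch: the term syntax (`box`, `let box`) is the conventional one and may differ from the notation on the slides.

```latex
% Requires amssymb for \Box.
% Box introduction: a term that uses no local bindings can be marshalled.
% Box elimination: unboxing binds a mobile variable u that may be used anywhere.
\[
\frac{\Delta;\ \cdot \vdash M : A}
     {\Delta;\ \Gamma \vdash \mathtt{box}\ M : \Box A}\ (\Box I)
\qquad
\frac{\Delta;\ \Gamma \vdash M : \Box A \qquad \Delta, u{:}A;\ \Gamma \vdash N : C}
     {\Delta;\ \Gamma \vdash \mathtt{let\ box}\ u = M\ \mathtt{in}\ N : C}\ (\Box E)
\]
```

Using u inside N then goes through the valid-hypothesis rule, which is where the implicit unmarshalling on this slide would take place.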

Necessity for Mobility

• Marshalling = cast into network form.
  – Base types, structured types: fairly typical.
  – Function types: certified binary.
• Code mobility is a form of semantic linking.
  – Import object from the network.
  – Unmarshal, verify, load, execute.
  – (More later.)

Possibility for Locality

• Possible (somewhere) typing judgement:

• What is here is somewhere:

Possibility for Locality

• Create a local reference to something somewhere:

Possibility for Locality

• Move to remote entity:

• May be useful for managing data locality.
  – Return call has type ◇□(A → B).
  – Cf. “upcalls”.
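
The possibility rules are also missing from the transcript. A hedged reconstruction in the style of Pfenning and Davies uses a separate judgement, written with ÷, for "A holds somewhere". The term syntax (`dia`, `let dia`) and the pairing of rules with slide captions are assumptions.

```latex
% Requires amssymb for \Diamond.
% Possible (somewhere) judgement.
\[ \Delta;\ \Gamma \vdash E \div A \]

% "What is here is somewhere."
\[ \frac{\Delta;\ \Gamma \vdash M : A}{\Delta;\ \Gamma \vdash M \div A} \]

% "Create a local reference to something somewhere" (diamond introduction).
\[ \frac{\Delta;\ \Gamma \vdash E \div A}
        {\Delta;\ \Gamma \vdash \mathtt{dia}\ E : \Diamond A}\ (\Diamond I) \]

% "Move to remote entity" (diamond elimination): the body runs where x lives,
% so it sees the mobile bindings in \Delta but none of the local bindings in \Gamma.
\[ \frac{\Delta;\ \Gamma \vdash M : \Diamond A \qquad \Delta;\ x{:}A \vdash E \div C}
        {\Delta;\ \Gamma \vdash \mathtt{let\ dia}\ x = M\ \mathtt{in}\ E \div C}\ (\Diamond E) \]
```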

Modalities for Mobility

• These rules are for S4 modal logic.
  – Accessibility is reflexive and transitive.
• Is this the right notion of accessibility?
  – Symmetry = S5. “You can go home again.”
  – Judgmental form requires three contexts.
  – Explicit-world form uses a record of contexts.
• Other varieties of modal logic are also under consideration.

Policies and Certification

• Current certification methods are uniform.
  – ∃ sec. policy ∀ problems: safety is assured.
    • E.g., PCC for Java.
    • E.g., TAL for Popcorn.
  – Safety means memory and type safety.
    • Baseline requirement.
    • But not adequate for all applications.
• Recall: policies should be per-host.

Foundational Certification

• Non-uniform setup: ∀ problems ∃ type system.
  – Shift the type system for object code out of the TCB (untrusted, problem-specific).
  – Must provide a proof that the type system is safe.
• Compare Appel, et al.
  – Their goal: minimize TCB.
  – Our goal: support multiple safety policies.
  – Could be consolidated, but it’s a lot of work.

Foundational Safety

• Host specifies target architecture.
  – Fully realistic, e.g., IA-32 + OS + RTS.
  – No unsafe transitions.
• Safety policy: target does not get stuck.
  – Any type system must come with a proof of progress relative to the target machine.
  – Experience shows that progress proofs are readily mechanizable.

Foundational Certification (I)

• Object code is essentially a DLL.
• Type system is specified in LF.
  – Using typical LF representations.
• Safety proof: well-typed ⇒ safe.
  – Represented as an LF term.
  – Obtained with Twelf proof search engine.
• Derivation: type annotations for code.
  – Makes mechanical checking feasible.

Foundational Certification (I)

• May cache type system and safety proof.
  – Reduces certificate size.
  – Many cords for one type system is typical.
• May use oracle strings for derivation.
  – Relies on details of operational behavior of host-side checker.
  – Therefore not completely declarative.
  – But significantly reduces certificate size.

Foundational Certification (II)

• Object code is a DLL as before.
• Type checker is a program.
  – Currently, a Twelf logic program.
  – Could be ML code.
• Safety proof shows partial correctness of the checker.
  – Checking succeeds ⇒ safety.
• Annotations support mechanical checking.
• Time limit precludes looping.
  – Can refuse if limit is too large.

Examples

• TALT
  – Essentially TALx86 with a safety proof.
  – Proof is mechanically derived and checked.
  – Structured as a safety proof for an abstract machine plus a simulation lemma for target.
• TALT + Resource Bounds
  – Goal: ensure that object code yields processor at set intervals.
  – Precludes denial of CPU service.

Resource Bound Certification

• Type system enforces upper bound on yield interval.
  – Specified as a parameter of the type system.
• Basic method:
  – Conservative instruction counting (join points).
  – Yield processor at start of every basic block.
  – Prove that block can complete before next yield (else split block).
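
The basic method can be sketched as a small Python transformation over basic blocks. It is an illustration under strong simplifications, not the TALT analysis: every instruction is counted as one unit of cost, blocks are plain lists of instruction names, and the fall-through that a real block split would need is ignored. The function name `insert_yields` is invented for the example.

```python
def insert_yields(blocks, max_interval):
    """Put a yield at the start of every block, splitting blocks that are too long.

    blocks: list of basic blocks, each a list of instruction names.
    max_interval: upper bound on instructions executed between two yields.
    """
    if max_interval < 1:
        raise ValueError("max_interval must be at least 1")
    result = []
    for block in blocks:
        # Split the block into pieces that each fit within the interval.
        pieces = [block[i:i + max_interval]
                  for i in range(0, len(block), max_interval)] or [[]]
        for piece in pieces:
            result.append(["yield"] + piece)
    return result
```

With `max_interval = 2`, a five-instruction block becomes three blocks, each starting with a yield and containing at most two instructions, so no path runs more than two instructions without yielding. This is also why the method produces more yields than necessary.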

Resource Bound Certification

• Smarter techniques are under development.
  – Better analysis of code behavior across calls.
  – Fewer yields overall.
• Run-time checks reduce overhead.
  – Use static analysis to insert minor yields that check true interval.
  – Minor yields re-calibrate, possibly incurring a major yield (system call).

A Meta-Grid?

• ConCert Conductor represents one model of grid computing.
  – Compute-intensive, distributed scheduling.
  – Not much reason to believe this is canonical.
• Can we support a variety of models inside of a single meta-grid?
  – Applications choose grid model.
  – Hosts are indifferent to programming model.

A Meta-Grid?

• The ur-grid:
  – A TCP port.
  – Foundational code certification.
• A grid framework:
  – Scheduler, recovery model, host policy.
  – Runs application cords.

A Meta-Grid?

• Key capability: safe dynamic loading and linking.
  – Current ConCert framework must be certified against host safety policy.
  – It must be able to load application policies and application code.
• Requires a fairly sophisticated theory of safe linking.

Semantic Linking

• Marshalling is meta-programming.
  – Create values of a grid type system.
  – Cast grid values as local values.
• Certification is how we marshal code.
  – Functions are marshalled as closures plus proof of compliance with host type system.
  – Ensures that cast will succeed, safely.
• The ur-grid is just an unmarshaller.
• Grid frameworks are meta-programs.

Summary

• Declarative approach to safe grids.
  – Passive, policy-based participation model.
  – Logic and proof technology for specifying policies and proving compliance.
• Close interplay between systems building and foundational theory.
  – Type systems for mobile code.
  – Type systems for various safety policies.

Thanks!

• Web site: http://www.cs.cmu.edu/~concert.

• Demonstration available after talk.

• Questions or comments?