PeerReview: Practical Accountability for Distributed Systems
SOSP 07
Why have Accountability?
• Nodes can fail
• An attacker can compromise a node
• Accidental misconfiguration
• Multiple administrative domains
• Distributed state, incomplete information
• General case: multiple admins with different interests
www.sosp2007.org/talks/sosp118-haeberlen.ppt
What is Accountability?
Fault = Anything besides expected behavior
Ideal Accountability:
• Detect a fault
• Identify the faulty node (Completeness)
• Correct node can prove its correctness (Accuracy)
• Expose the faulty node
A few advantages:
• Deter faults
• Augment fault-tolerant systems
• Augment best-effort systems
Challenges:
• What can/cannot be detected?
• Unobservable faults:
  • Node’s internal state
  • CPU overheating, display failed
  • Need trusted probes!
• Observable faults:
  • Affect a correct node causally
  • No trusted entity required!
• How to verify if a node reports correctly?
• How to distinguish omission from long delays?
[Figure: five message-sequence diagrams among nodes A, B, and C, exchanging Request (REQ), Grant (GNT), and Release (REL) messages such as REQ_8, GNT_8, REL_8, REQ_5, GNT_5, REL_5.]
Accountability: How much can we do?
Completeness:
• Eventually suspected
• Eventually exposed
Accuracy:
• No correct node is forever suspected
• No correct node is ever exposed by a correct node
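The suspected/exposed distinction above can be sketched as a small per-node status tracker. This is only an illustration under stated assumptions: the TRUSTED state, the Detector class, and the rule that an unanswered challenge means "suspected" are hypothetical names and simplifications, not taken from the slides.

```python
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"      # assumed default state (not named in the slides)
    SUSPECTED = "suspected"  # reversible: node has not answered a challenge yet
    EXPOSED = "exposed"      # permanent: verifiable evidence of a fault exists


class Detector:
    """Hypothetical per-witness view of other nodes."""

    def __init__(self):
        self._status = {}
        self._open = {}  # node -> set of unanswered challenge ids

    def status_of(self, node):
        return self._status.get(node, Status.TRUSTED)

    def challenge_sent(self, node, challenge_id):
        self._open.setdefault(node, set()).add(challenge_id)
        self._refresh(node)

    def response_received(self, node, challenge_id):
        self._open.get(node, set()).discard(challenge_id)
        self._refresh(node)

    def evidence_verified(self, node):
        # Exposure is backed by evidence and is never revoked.
        self._status[node] = Status.EXPOSED

    def _refresh(self, node):
        if self.status_of(node) is Status.EXPOSED:
            return
        pending = self._open.get(node)
        self._status[node] = Status.SUSPECTED if pending else Status.TRUSTED
```

In this sketch a slow but correct node is suspected only until its response arrives, which mirrors "no correct node is forever suspected"; exposure requires evidence, so a delay alone never exposes anyone.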
FullReview Characteristics:
• A trusted entity exists
• All messages go through the trusted entity
• Each node maintains a log for every other node
• Check the log
• Suspect/Expose a deviant node
Complete? Accurate? Practical?
PeerReview: Practical Accountability
• No trusted entity
• Nodes only keep their own log
  • May retrieve others when needed
• Logs are tamper-evident
• Witness nodes: check correctness of a node
• Challenge/Response protocol
System Model
Each node modeled as:
• A state machine
• A detector
• An application
Assumptions:
• Deterministic state machine
• Correct nodes can communicate
• A reference implementation of the node software
• A secure signature mechanism is available
Overview
• Nodes maintain a log of I/O
• Witnesses of a node audit its log
• If faulty, gather evidence
• Make it known
Tamper-evident logs
• Append-only list of I/O
• Log entries connected in a hash chain
• Authenticator: a signed statement by a node
• If a node tampers with the log, it will be evident
Logs must be complete
• No entries missed
Logs must be correct
• No forged entries
• No multiple logs
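A minimal sketch of a hash-chained, append-only log with authenticators may help make the points above concrete. Everything named here (Entry, Log, chain, GENESIS, the exact fields fed into the hash) is an assumption for illustration. HMAC is used only as a stand-in because the Python standard library has no public-key signatures; a real deployment would need signatures that third parties can verify.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List, Tuple

GENESIS = "0" * 64  # assumed starting value for the chain


@dataclass(frozen=True)
class Entry:
    seq: int         # position in the log
    kind: str        # "SEND" or "RECV"
    content: bytes   # message payload
    chain_hash: str  # hash linking this entry to all earlier ones


def chain(prev_hash: str, seq: int, kind: str, content: bytes) -> str:
    """Hash of the previous chain hash plus this entry's fields (assumed layout)."""
    inner = hashlib.sha256(content).hexdigest()
    return hashlib.sha256(f"{prev_hash}|{seq}|{kind}|{inner}".encode()).hexdigest()


class Log:
    def __init__(self, key: bytes):
        self.key = key  # stand-in for the node's signing key
        self.entries: List[Entry] = []

    def append(self, kind: str, content: bytes) -> Entry:
        prev = self.entries[-1].chain_hash if self.entries else GENESIS
        seq = len(self.entries)
        entry = Entry(seq, kind, content, chain(prev, seq, kind, content))
        self.entries.append(entry)
        return entry

    def authenticator(self, entry: Entry) -> Tuple[int, str, str]:
        """Signed statement (seq, hash, tag) committing the node to its log so far."""
        msg = f"{entry.seq}|{entry.chain_hash}".encode()
        return entry.seq, entry.chain_hash, hmac.new(self.key, msg, hashlib.sha256).hexdigest()


def verify_chain(entries: List[Entry]) -> bool:
    """Recompute every link; any edited, dropped, or reordered entry breaks the chain."""
    prev = GENESIS
    for i, e in enumerate(entries):
        if e.seq != i or e.chain_hash != chain(prev, e.seq, e.kind, e.content):
            return False
        prev = e.chain_hash
    return True


def matches_authenticator(entries: List[Entry], auth: Tuple[int, str, str], key: bytes) -> bool:
    """Check that a presented log agrees with an authenticator issued earlier."""
    seq, chain_hash, tag = auth
    expected = hmac.new(key, f"{seq}|{chain_hash}".encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(tag, expected)
            and seq < len(entries)
            and entries[seq].chain_hash == chain_hash)
```

In this sketch, editing an entry in place fails verify_chain, while rewriting the whole log consistently passes verify_chain but no longer matches an authenticator the node handed out earlier, which is how "no multiple logs" can be checked.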
Fault Detection
• Audit
  • Replay the inputs to a reference implementation
  • Output == Log?
• Evidence Transfer
  • Fetch evidence from witnesses
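The audit step can be sketched as deterministic replay: feed the logged inputs to a fresh reference implementation and compare what it produces with the logged outputs. The LockServer below is a hypothetical reference implementation invented for this example (it reuses the REQ/GNT/REL message names and treats the suffix as an opaque requester id); the log format and the audit function are likewise assumptions.

```python
from typing import Callable, List, Optional, Tuple

LogEntry = Tuple[str, str]  # ("IN", message) or ("OUT", message)


class LockServer:
    """Hypothetical deterministic reference implementation: one holder at a time."""

    def __init__(self):
        self.holder = None
        self.waiting: List[str] = []

    def step(self, msg: str) -> List[str]:
        """Consume one input message and return the outputs it must produce."""
        op, _, who = msg.partition("_")
        if op == "REQ":
            if self.holder is None:
                self.holder = who
                return [f"GNT_{who}"]
            self.waiting.append(who)
            return []
        if op == "REL" and who == self.holder:
            self.holder = self.waiting.pop(0) if self.waiting else None
            return [f"GNT_{self.holder}"] if self.holder else []
        return []


def audit(log: List[LogEntry], make_reference: Callable[[], LockServer]) -> Optional[int]:
    """Replay logged inputs; return the index of the first deviation, or None."""
    ref = make_reference()
    pending: List[str] = []  # outputs the reference produced that the log has not shown yet
    for i, (direction, msg) in enumerate(log):
        if direction == "IN":
            if pending:
                return i  # an expected output was never logged
            pending = ref.step(msg)
        elif not pending or pending.pop(0) != msg:
            return i      # logged output that the reference would not have sent
    return len(log) if pending else None


good = [("IN", "REQ_8"), ("OUT", "GNT_8"), ("IN", "REQ_5"),
        ("IN", "REL_8"), ("OUT", "GNT_5"), ("IN", "REL_5")]
bad = [("IN", "REQ_8"), ("OUT", "GNT_8"), ("IN", "REQ_5"), ("OUT", "GNT_5")]
```

Here audit(good, LockServer) finds no deviation, while audit(bad, LockServer) points at the second grant, which was logged while the first requester still held the lock. This only works because the state machine is assumed to be deterministic.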
[Figure: audit by replay. Labels: Network, Input, Output, Log, State machine (Module A, Module B), second copy of Module A and Module B, comparison "=?", "if ≠".]
PeerReview: Applications
• Overlay Multicast
  • Large amounts of data
  • Freeloaders
• Network File System
  • Latency-sensitive
  • Data tampering
  • Message loss in the network
• Peer-to-peer email
  • DoS attack
Results: Multicast with Freeloader
Results: Throughput
Results:
Discussion
• What if all witnesses are faulty?
• How to choose T_trunc, T_audit, T_buf?