A High-Assurance Cloud Computing Agenda
Ken Birman
Professor, Dept. of Computer Science
Context
Today’s cloud computing platforms are best for building “apps” like YouTube and web search
  Highly elastic, pipelined (“asynchronous”) services
  But very weak guarantees and limited security
The cloud comes with its own mantra:
  Don’t use ACID! BASE is better…
  The CAP theorem proves it (or does it?)
Cornell Dept of Computer Science Colloquium 4
eBay’s Five Commandments
As described by Randy Shoup at LADIS 2008
Thou shalt…
1. Partition Everything
2. Use Asynchrony Everywhere
3. Automate Everything
4. Remember: Everything Fails
5. Embrace Inconsistency
Sept 24, 2009
Vogels at the Helm
Werner Vogels is CTO at Amazon.com…
His first act?
  Introduced a series of weak consistency options
  Replaced the older strongly consistent “pub/sub” infrastructure with a slower but more scalable one
In small systems, raw speed wins
In the cloud, weaker forms of guarantees often scale far better than strong ones
James Hamilton’s advice
Key to scalability is decoupling and the loosest possible synchronization
Any synchronized mechanism is a risk
His approach: create a committee
  Anyone who wants to deploy a highly consistent mechanism needs committee approval
  … They don’t meet very often
Consistency
Consistency technologies just don’t scale!
What’s consistency?
A consistent distributed system will often have many components, but users observe behavior indistinguishable from that of a single-component reference system
Figure labels: Reference Model, Implementation
Our Perspective?
We’re being too quick to give up on consistency and other assurance properties
  CAP and BASE are really about database consistency
  Other very strong forms of consistency can be the foundation for a new science of highly assured, high-speed, scalable cloud computing
We have the science to back our vision
  The new Isis2 system makes it real
Highly Assured Cloud: Isis2
Named for an old Cornell story
  In 1990 our first Isis Toolkit became the core of the NYSE, the French Air Traffic Control System and the US Navy AEGIS
Isis2: a completely new system, but the same idea
  Makes it easy to create high-assurance cloud apps
  Offers consistency, fault-tolerance, security
FreeBSD code release later this spring
Virtual synchrony meets Paxos (and they live happily ever after…)
Virtual synchrony is a “consistency” model:
  Synchronous runs: indistinguishable from a non-replicated object that saw the same updates (like Paxos)
  Virtually synchronous runs are indistinguishable from synchronous runs
Figure: two timelines of processes p, q, r, s and t (Time: 0 to 70), one panel labeled "Synchronous execution" and the other "Virtually synchronous execution"
Non-replicated reference execution
A=3
B=7
B = B-A
A=A+1
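The idea behind the reference execution can be sketched in a few lines of Python. This is only an illustration, not Isis2 code: the helper names (`apply_updates`, `set_a` and so on) are hypothetical, and the update order is assumed to be the order listed on the slide. Every replica that applies the same updates in the same order ends in the same state as the single non-replicated object, while a replica that reorders two updates diverges.

```python
def apply_updates(updates):
    """Apply each update, in order, to a fresh state and return the final state."""
    state = {}
    for update in updates:
        update(state)
    return state

# The four updates from the slide, written as small functions (names are hypothetical).
def set_a(s): s["A"] = 3
def set_b(s): s["B"] = 7
def sub_a_from_b(s): s["B"] = s["B"] - s["A"]
def inc_a(s): s["A"] = s["A"] + 1

UPDATES = [set_a, set_b, sub_a_from_b, inc_a]

# Non-replicated reference execution.
reference = apply_updates(UPDATES)

# Five replicas (p, q, r, s, t) that each see the same updates in the same order.
replicas = {name: apply_updates(UPDATES) for name in "pqrst"}
all_match = all(state == reference for state in replicas.values())

# A replica that swaps the last two updates ends in a different state.
reordered = apply_updates([set_a, set_b, inc_a, sub_a_from_b])

print(reference, all_match, reordered)
```

The divergent run is the reason update ordering has to be part of the consistency model: delivering the same set of updates is not enough on its own.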
Example: Parallel search
Group g = new Group("/amazon/something");
g.register(LOOKUP, myLookup);

Replies = g.query(LOOKUP, "Name=*Smith");
g.callback(myReplyHndlr, Replies, typeof(double));

public void myReplyHndlr(double[] fnd) {
    foreach(double d in fnd)
        avg += d;
    …
}

public void myLookup(string who) {
    // divide work into viewSize() chunks
    // this replica will search chunk # getMyRank();
    …..
    reply(myAnswer);
}
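The work split described in the handler above can be simulated without any group communication library. The following Python sketch is an assumption-laden stand-in, not the Isis2 API: `chunk_bounds`, `my_lookup`, `query` and the sample records are all hypothetical, the "group" is just a loop over ranks, and each member replies with the sum of a numeric field over its matches.

```python
def chunk_bounds(total, view_size, rank):
    """Return the [start, end) slice of `total` items owned by member `rank`."""
    base, extra = divmod(total, view_size)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end

def my_lookup(records, suffix, view_size, rank):
    """One member's handler: search only its own chunk and reply with one number."""
    start, end = chunk_bounds(len(records), view_size, rank)
    return sum(value for name, value in records[start:end] if name.endswith(suffix))

def query(records, suffix, view_size):
    """Simulated multicast query: every rank runs the handler and sends one reply."""
    return [my_lookup(records, suffix, view_size, rank) for rank in range(view_size)]

# Hypothetical data: (name, value) pairs held identically by every member.
RECORDS = [
    ("Alice Smith", 10.0),
    ("Bob Jones", 5.0),
    ("Carol Smith", 20.0),
    ("Dan Brown", 1.0),
    ("Eve Smith", 30.0),
]

replies = query(RECORDS, "Smith", view_size=3)
total = sum(replies)  # the caller combines one reply per member
print(replies, total)
```

Because every member computes its chunk from the same view size and its own rank, the chunks cover the data exactly once without any extra coordination.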
Scalable Aggregation
Used if group is really big
Request, updates: still via multicast
Response is aggregated within a tree
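Example: nodes {a,b,c,d} collaborate to perform a query
Figure: aggregation tree with Level 0, Level 1 and Level 2. The query goes to a, b, c and d, which contribute values va, vb, vc and vd; the partial results Agg(va vb) and Agg(vc vd) are combined into Agg(va vb vc vd) at Level 2, which forms the reply.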
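A minimal sketch of tree aggregation, again in plain Python rather than Isis2: `aggregate_tree` is a hypothetical helper that combines values pairwise, level by level, the way the four-node example combines va, vb, vc and vd. It assumes the aggregation function is associative (sum, max, min and the like), so that grouping does not change the result.

```python
def aggregate_tree(values, agg):
    """Combine values pairwise, level by level, until one result remains.

    Returns (result, levels), where levels[0] holds the leaf values and
    levels[-1] holds the single aggregated result.
    """
    if not values:
        raise ValueError("need at least one value to aggregate")
    level = list(values)
    levels = [level]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            # An unpaired value at the end of a level is passed up unchanged.
            next_level.append(agg(pair) if len(pair) == 2 else pair[0])
        level = next_level
        levels.append(level)
    return level[0], levels

# Hypothetical values for va, vb, vc, vd.
result, levels = aggregate_tree([3, 1, 4, 1], sum)
for depth, level in enumerate(levels):
    print("Level", depth, level)
```

With n members the caller receives a single combined value after about log2(n) levels, instead of handling n separate replies itself.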
Aggregated Parallel search
Group g = new Group("/amazon/something");
g.register(LOOKUP, myLookup);

Replies = g.query(LOOKUP, 27, "Name=*Smith");
g.callback(myReplyHndlr, Replies, typeof(double));

public void myReplyHndlr(double[] fnd) {
    // The answer is in fnd[0]….
}

public void myLookup(int rid, string who) {
    // divide work into viewSize() chunks
    // this replica will search chunk # getMyRank();
    …..
    SetAggregateValue(myAnswer);
}

Rval = GetAggregateResult(27);
Reply(Rval/DatabaseSize);
Our Early Users?
Partnering with Cisco to apply these ideas in core Internet routers (NEBULA/R3 projects)
  Creating a continuously available CRS-1 story
Close dialogs with Microsoft, IBM, Intel
Funding from the National Science Foundation and Air Force; talking to DARPA and ARPA-E
  Government, military and smart power grid will all need highly assured cloud options
Challenge of the week
Debugging a system that targets thousands of nodes with tens of cores each is hard!
  We benefit from our own strong model
  But physical access to non-virtualized large-scale systems is “difficult” today
  And many block IPMC and UDP
Better tools will need to be part of a better assurance property
  Else we know how it should work but not how it does work, or even whether it works correctly!