big problems


Upload: gary-berger

Post on 24-Jun-2015


DESCRIPTION

Just some presentation I gave a few years ago.

TRANSCRIPT

Page 1: Big problems

Gary Berger, Technical Leader, Engineering, Office of the CTO, DSSG

Biggest Problems in Cloud Design

Today

Source: http://visibleearth.nasa.gov

Page 2: Big problems

Internet being dominated by real-time entertainment

Source: Sandvine, 2010 Global Internet Phenomena Report

Page 3: Big problems

What is an Architect?

IMHOTEP: DOCTOR, ARCHITECT, HIGH PRIEST, SCRIBE AND VIZIER TO KING DJOSER

“An architect does not arrive at his finished product solely by a sequence of rationalizations, like a scientist, or through the workings of the Zeitgeist. Nor does he reach them by uninhibited intuition, like a musician or painter. He thinks of forms intuitively, and then tries to justify them rationally.” Peter Collins, 1966

“Good architecture has been seen largely as either working within a context or circumventing it, depending on which principles are adopted and where the cutting edge is perceived.” Theory of Architecture, Paul-Alan Johnson, 1994

Page 4: Big problems

Neolithic Architecture

6800 BCE – 3200 BCE

Stonehenge, circa 3000 BCE

Secrets of Stonehenge, Nova 2009

Page 5: Big problems

Neolithic Technology

The Lever

Ball Bearings

Page 6: Big problems

Why is Architecture hard to understand?

“Whereof one cannot speak, one must pass over in silence.” Wittgenstein

Page 7: Big problems

Tacit Knowledge (Informal Knowledge)

• Knowledge that is difficult to transfer to another person by means of writing it down or verbalizing it.

• Knowledge which cannot be codified, but can only be transmitted via training or gained through personal experience.

• Inherent “know-how” -- as opposed to “know-what” (facts), “know-why” (science), or “know-who” (networking). It involves learning and skill but not in a way that can be written down.

Source: http://en.wikipedia.org/wiki/Tacit_knowledge, adapted from The Tacit Dimension by philosopher-chemist Michael Polanyi

W.T. Wallington walks a 21,600lb stone

Page 8: Big problems

Explicit Knowledge

Page 9: Big problems

“Knowledge as the Competitive Resource”

• “Knowledge is not just another resource alongside the traditional factors of production -- labor, capital and land -- but the only meaningful resource today” - [Drucker, 1993]

• “Knowledge is the source of the highest quality power and is the key to the power-shift that lies ahead. Knowledge is not merely an adjunct of money power and muscle power but eventually will be the ultimate replacement of other resources” - [Toffler, 1990]

• “The economic and producing power of a modern corporation lies more in its intellectual and service capabilities than in its hard assets such as land, plant and equipment” - [Quinn, 1992]

Page 10: Big problems

Caution low flying cloud.. …into the fog

Page 11: Big problems

MESIF

CAP

NOSQL

PCM

TSV

REST

NFS4.1

Spring

DHT

DSL

DDD

Immutability

Actor Patterns

ORM

JPA

Adaptive consistency

idempotent

cardinality

Pipeline

HTTP

Vector clocks

SSL

CSS

ECMAScript

Graphs

Cassandra

JEE6

.NET

Gnutella

FaceBook

Skype

ChromeOS

VP8

Erlang

SCRUM

ALM

Polyglot programming

Polyglot persistence

Scala

vFabric

Exalogic

VCE

XML

ESX

KVM

Linux 2.7

iOS

Android

Set Theory

LISP

OneP

C++

GO

Plan9

Fusion

IBM

R

AMQP

JAVA

Flash

Hibernate

GPU

FPGA

Google

Amazon

Microsoft

Yahoo

IPV6 Crypto

Adaptive Routing

SDN

Websphere

HTML5

Websockets

ROP

CLOS

Butterfly

Hypercube

Cayley Tree

SpringSource

Congestion

PUE

Cisco

Barrelfish

RIAK

Page 12: Big problems

Awesome Ladder!

Von Neumann Architecture

John von Neumann, 1903-1957

Page 13: Big problems

Cloud Architecture

Adrian Colyer, CTO, SpringSource

Page 14: Big problems

Independent Compute POD

Data Network

Unified I/O 10GE

Data Snooping/Migration

Capacity Scaling

Block Store

Data Center Blueprint

I/O Scaling

POD Services Tier

Client Access Tier

HTTP

Compute/Data Grid

Page 15: Big problems

Things we are going to talk about

• Dealing with Scalability

• Dealing with Data

• Dealing with Security

Page 16: Big problems

Sumerian Architecture

3600 BCE – 2300 BCE

Pyramid of Djoser, 2630 BCE – 2611 BCE

Page 17: Big problems

Sumerian Technology

The Wheel, circa 3500 BCE

Adobe-brick

Page 18: Big problems

What is Scalability?

Mechanical and biological systems all have limits

Scaling Factors

• All systems reach a limit relative to their size.

• Understanding where these limitations arise gives us a clue where to look for performance bottlenecks

• Architects typically find limitations through trial and error.

• Concurrency = The interaction between processors

• Contention = The degree of serialization on shared writeable data

• Coherency = Penalty incurred for maintaining consistency of shared writable data

Page 19: Big problems

Processor Scalability

What happens when you break a bottleneck!

Page 20: Big problems

Nominal Computer Access Times

Source: Analyzing Computer System Performance with Perl::PDQ, Gunther

Source: Jeff Dean, Google

Page 21: Big problems

Scalability Can Be Measured

Guerrilla Capacity Planning, Gunther, 2007

Universal Scalability Law

• C(p) = scale-up | scale-out

• p = number of processors

• a = serialized fraction (contention)

• k = coherency, k >= 0

• Scalability is not infinite but a concave function

We are assuming here that interarrival and service times are exponentially distributed (i.e. arrivals follow a Poisson process)
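The formula itself did not survive on this slide. As a minimal sketch, assuming the standard form of Gunther's Universal Scalability Law, C(p) = p / (1 + a(p - 1) + k p (p - 1)), the function below evaluates relative capacity for a given processor count. The names `usl_capacity`, `alpha` (the slide's a) and `kappa` (the slide's k), and the sample coefficients, are illustrative assumptions, not values from the talk.

```python
def usl_capacity(p, alpha, kappa):
    """Relative capacity C(p) under the Universal Scalability Law.

    p     -- number of processors (p >= 1)
    alpha -- serialized fraction (contention), the slide's "a"
    kappa -- coherency penalty (kappa >= 0), the slide's "k"
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return p / (1 + alpha * (p - 1) + kappa * p * (p - 1))


# Hypothetical coefficients, chosen only to show the shape of the curve.
ALPHA, KAPPA = 0.05, 0.001
for p in (1, 2, 4, 8, 16, 32, 64, 128):
    print(f"p={p:4d}  C(p)={usl_capacity(p, ALPHA, KAPPA):6.2f}")
```

With these made-up coefficients capacity rises, peaks, and then falls again, which is the concave behaviour the last bullet describes.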

Page 22: Big problems

Why Scale-Up is Important

Beyond Wimpy Cores

Max Capacity p*

Asymptotic maximum (ceiling)

Coherency (k) starts to dominate

Amdahl: k = 0
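The chart labels above can be tied back to the Universal Scalability Law with a little calculus. As a sketch, assuming the standard form C(p) = p / (1 + a(p - 1) + k p (p - 1)): setting dC/dp = 0 gives the peak at p* = sqrt((1 - a) / k), and with k = 0 the law reduces to Amdahl's law, whose capacity approaches the ceiling 1 / a. The function names and sample coefficients below are illustrative assumptions.

```python
import math


def peak_processors(alpha, kappa):
    """Processor count p* at which USL capacity is highest.

    Derived from dC/dp = 0, which gives p* = sqrt((1 - alpha) / kappa).
    With kappa == 0 (the Amdahl case) there is no peak, only a ceiling.
    """
    if kappa <= 0:
        return math.inf
    return math.sqrt((1 - alpha) / kappa)


def amdahl_ceiling(alpha):
    """Asymptotic maximum of C(p) when kappa == 0: the limit is 1 / alpha."""
    return math.inf if alpha == 0 else 1 / alpha


# Hypothetical coefficients for illustration only.
print(peak_processors(0.05, 0.001))
print(amdahl_ceiling(0.05))
```

Beyond p*, adding processors lowers capacity because the coherency term grows quadratically, which is the region the chart labels as coherency starting to dominate.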

Page 23: Big problems

Conclusion: We Need Models

• Effectively modeling some of these characteristics is a top-of-mind problem for current application architects

• Eric Brewer’s CAP Theorem challenges architects to deal with latency as a proxy for strong consistency

• Much work going on in understanding these problems and building a balance between availability and consistency (i.e. adaptive consistency)

• Some patterns make it difficult to model mathematically

Moore’s Impact [1]

• Technologist’s Moore’s Law
o Double Transistors per Chip every 2 years
o Slows or stops: TBD

• Microarchitect’s Moore’s Law
o Double Performance per Core every 2 years
o Slowed or stopped: Early 2000s

• Multicore’s Moore’s Law
o Double Cores per Chip every 2 years
o Double Parallelism per Workload every 2 years
o Aided by Architectural Support for Parallelism
o Double Performance per Chip every 2 years

Or GAME OVER?

1. Amdahl’s Law in the Multicore Era, Hill, Marty, Wisconsin Multifacet Project

Page 24: Big problems

Ancient Egyptian Architecture

3000 BCE – 300 CE

Pyramids at Giza, 2575 BCE to 2150 BCE

Hatshepsut’s Temple, circa 1482 BCE

Page 25: Big problems

Ancient Egyptian Technology

Schematics: Denderah and the Temple of Hathor, being built by Cleopatra, circa 30 BCE – 14 CE

Process Documentation: Rope Making

Page 26: Big problems

Data Management

Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets

What are the two most important commands in the data center today?

(NFS Read/Write)

Source: Data Management International, dama.org

Page 27: Big problems

Data Management

Models

• Request level parallelism

• Data level parallelism

• Persistence model

• Durable, Volatile, Transient

• Caching Eviction Policies

• Synchronous/Asynchronous Updates

• Denormalization of data

Practices

• Caching Trees
o Anti-cache spoilers

• Distributed Hash Tables (NOSQL)
o Key/value
o Column
o Document
o Graph

• Messaging and Serialization (IPC)
o Lightweight interfaces (PB, Thrift, HC)

• Distributed transactions
o Opportunistic locking
o Vector Clocks
o Paxos protocols
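Of the practices listed, vector clocks are compact enough to sketch. The following is a minimal illustration, not taken from the talk, of how a vector clock can tell whether one update causally precedes another or whether the two are concurrent and need reconciliation. The function names and the replica names "A", "B", "C" are hypothetical.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum of two clocks, used when replicas exchange state."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def less_or_equal(a, b):
    """True if every counter in a is <= the matching counter in b."""
    return all(count <= b.get(node, 0) for node, count in a.items())


def happened_before(a, b):
    """True if a causally precedes b."""
    return less_or_equal(a, b) and not less_or_equal(b, a)


def concurrent(a, b):
    """True if neither clock precedes the other (a conflict to reconcile)."""
    return not less_or_equal(a, b) and not less_or_equal(b, a)


v1 = increment({}, "A")   # write accepted by replica A
v2 = increment(v1, "B")   # replica B updates on top of v1
v3 = increment(v1, "C")   # replica C updates on top of v1, unaware of v2
print(happened_before(v1, v2), concurrent(v2, v3), merge(v2, v3))
```

Here v2 and v3 both descend from v1 but not from each other, so a store using this scheme would have to keep both versions or merge them.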

Page 28: Big problems

Jason McHugh, Principal Engineer, Amazon

Flash Crowds

Demand spike on a singular resource

• 31K requests for a single object received in 69.6 seconds

• Cache spoilers

• Cache trees and a coherency protocol built in to relax consistency to protect availability

Page 29: Big problems

Data Structures

Set Theory

Source: Big Data in Real-Time at Twitter, Nick Kallen, QCONSF, 2010

Page 30: Big problems

Classical Architecture

850 BCE – 475 CE

Parthenon

Page 31: Big problems

Classical Technology

150 BCE – 100 BCE

Antikythera Machine

Page 32: Big problems

The “Illusion” of Security

• Perimeter defense seals off data center so attack surface moves to the client

• Attackers find path of least resistance
o Email Addresses
o Social Websites
o Standard naming practices (i.e. [email protected])

The Apple I, Recently sold for $210,000

“Simply keeping out bad code is not sufficient to keep out bad computation” Stefan Savage, UC San Diego

Page 33: Big problems

Modern Attacks

Easy to 0wn, normal processing leads to code execution

Examples

• Memory Trespass

• Rogue AV through mass mailings

• Injection Flaws (SQL, OS, LDAP)

• Cross Site Scripting

• Broken Authentication and Session Management

• Insecure Direct Object References

• Cross-site Request Forgery

Summary

• Normal processing leads to code execution
o Receive packet/request
o Parse display/data

Mitigation Strategies

• ASLR (Address Space Layout Randomization)

• DEP (Data Execution Prevention)

• Stack Cookies

• Sandboxing

• Need to understand strategy more than tactics
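As one concrete case of "normal processing leads to code execution", the SQL injection flaw named on this slide can be shown in a few lines. This is a minimal sketch, not from the talk, using Python's standard `sqlite3` module; the `users` table, the function names and the payload are hypothetical. It contrasts string-built SQL with parameter binding, which is a common defence against this class of flaw.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: attacker-controlled text becomes part of the SQL statement.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameter binding keeps the input as data; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Throwaway in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(find_user_unsafe(conn, payload))  # the injected OR clause matches every row
print(find_user_safe(conn, payload))    # no user has this literal name
```

The unsafe version simply parses its input as part of routine request handling, which is exactly how the attacker's text ends up being executed.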

Page 34: Big problems

Source: Dino A. Dai Zovi, Memory Corruption, Exploitation and You

Workstation Attack Surface

Page 35: Big problems

Zero Day Attacks

• The price of disclosure?

• There are 1419 Researchers working at ZDI?

• ZDI can be used to launch a new Aurora attack

Page 36: Big problems

Modern Browser Attack Graph

Source: Dino A. Dai Zovi, Memory Corruption, Exploitation and You

Page 37: Big problems

Architectural Ladders

3000 BCE 300 CE

Page 38: Big problems

Architecture

• Architecture is created to express some intent but is not the purpose itself, therefore architecture must serve a purpose

• Architectures must evolve or die, sometimes at the expense of the intent and function

• Architectures can be rediscovered, refactored and reused for a new purpose or function

• Architectures may not realize their full potential

• Architectures do not replace fundamentals in engineering and science but establish a pattern from which to describe its effectiveness

Foote, Yoder, 1999, The Big Ball of Mud

ZIGGURAT: Dubai’s Carbon Neutral Pyramid Will House 1 Million

Page 39: Big problems

Conclusion

• Some of today’s problems were recognized over a decade ago but lacked the economic justification for change

• History is repeating as we move to refactoring architectures of the past (“Engineered Solutions”), just at different scales

• New architectures are being proposed based on empirical evidence, prototyping and experimentation; others are just a horrible guess

• Architects need to quickly establish new patterns with the goal of pushing the bottlenecks to the least cost contributor (i.e. Energy Proportional Computing)

• Architecture should help us describe the intent of the product or function, not merely serve as a generalization

• Architectures today are agile

• Architecture for efficient computing, which maximizes processing power per joule of energy

Page 40: Big problems

Uggh.. Predictions?

• By 2012, 20 percent of businesses will own no IT assets

• By 2012, India-centric IT services companies will represent 20 percent of the leading cloud aggregators in the market (through cloud service offerings)

• By 2012, Facebook will become the hub for social network integration and Web socialization

• By 2013, mobile phones will overtake PCs as the most common Web access device worldwide

• By 2014, most IT business cases will include carbon remediation cost

• By 2014, over 3 billion of the world's adult population will be able to transact electronically via mobile or Internet technology

• By 2015, context will be as influential to mobile consumer services and relationships as search engines are to the Web

• By 2016, all Global 2000 companies will use public cloud services.

Page 41: Big problems

Thank You

Page 42: Big problems

Backup

Page 43: Big problems

Stonehenge – Woodhenge – Bluehenge

Around 3 miles

Page 44: Big problems

Meta Structures to scale

Page 45: Big problems

Persistency

pNFS RFC5661

• Parallel Opens by file handle

• Asynchronous notification on lock availability

• Commands linearized in slot table

• Support for File, Object and Block targets

HoneyComb 2

• Automated data management

• Extreme data mobility

• Ability to run 3rd party storage apps

• Highly Reliable with self healing

• Flat name space

• Single management entity

• Multi-cell architecture

• Programmatic APIs

• Immutable

• Automatic load balancing

• Transparent node upgrades

• Meta-data support

• Storage apps support

• Deferred maintenance model

• Open-Source Software only

Page 46: Big problems

Clustered Scalability

Guerrilla Capacity Planning, Gunther, 2007

Universal Scalability Law

• C(p) = intranode scalability

• n = nodes

• p,n = processors/node

• az = global internode contention

• kz = global internode coherency

Page 47: Big problems

Impact on application

Source: Big Data in Real-Time at Twitter, Nick Kallen, QCONSF, 2010