Diego: Re-envisioning the Elastic Runtime (Cloud Foundry Summit 2014)

Onsi Fakhouri DIEGO ElasticRuntime 2.0 TECHNICAL

Upload: pivotal

Post on 20-Aug-2015


Category:

Technology



TRANSCRIPT

Onsi Fakhouri

DIEGO Elastic Runtime 2.0 TECHNICAL

What?

Why?

Show me…

The future

DIEGO Elastic Runtime 2.0

Cloud Controller

What is being rewritten?

Stage App

Run n App Instances (and keep them running)

http://…

Push App

> cf

Route to App

DEA Pool (Droplet Execution Agent)

What is being rewritten?

Push App

> cf

http://…

Cloud Controller (API)

Router

Health Manager

NATS (message bus)

DEA Pool (Droplet Execution Agent)

DEA: Staging Apps, Running Apps

Warden: Containerization

What?

Why?

Show me…

The future

DIEGO Elastic Runtime 2.0

Why rewrite?

Hard to add new features

Hard to maintain existing features

Why?

Why rewrite?

Tight Coupling

Poor separation of concerns

Orchestration

> cf scale

Cloud Controller

“Make it so”

start/stop

DEA + Warden (×6)

start, start

start, fails

Too much responsibility

Why rewrite?

Tight Coupling

Poor separation of concerns

Triangular Dependencies

Health Manager

Cloud Controller

DEA + Warden (×4)

When it’s time to upgrade the DEAs we perform a rolling deploy

“bye!”

start!

start!

all clear!

Problematic

??

Why rewrite?

Tight Coupling

Poor separation of concerns

Orchestration

Triangular Dependencies

complex interactions

hard to test

hard to reason through

Why rewrite?

Domain Specific (app, app, app, app)

Hard to extend to new domains (e.g. cron-like jobs)

Why rewrite?

Platform Specific

DEA: Staging Apps, Running Apps

Warden: Containerization

hard to maintain

Why rewrite?

Long-lived processes

Tons of concurrency

Low-level OS interactions

Why rewrite?

Platform Specific

Domain Specific (app, app, app, app)

Tight Coupling

Poor separation of concerns

Orchestration

Triangular Dependencies

Hard to add new features

Hard to maintain existing features

What?

Why?

Show me…

The future

DIEGO Elastic Runtime 2.0

Show me Diego

Written in Golang

Strong concurrency support

Strongly typed

Explicit error handling

Promotes developer discipline

Strong low-level OS support

Show me Diego

Domain Specific (app, app, app, app)

One-off Tasks (guaranteed to only run once)

Long Running Processes (n monitored instances)

The Right(?) Abstraction
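The two abstractions named on this slide can be pictured as two small Go types. This is only a sketch: the type names, field names, and the `MissingInstances` helper are assumptions made for illustration and are not taken from Diego's code.

```go
package main

import "fmt"

// Task is a one-off unit of work: it is run once and then it is done.
// (Hypothetical type for illustration only.)
type Task struct {
	Guid   string
	Action string // what to run, e.g. a staging command
}

// LRP is a Long Running Process: the platform keeps Instances copies
// running and monitors them. (Hypothetical type for illustration only.)
type LRP struct {
	Guid      string
	Action    string
	Instances int
}

// MissingInstances reports how many more instances must be started
// given how many are currently running.
func (l LRP) MissingInstances(running int) int {
	if running >= l.Instances {
		return 0
	}
	return l.Instances - running
}

func main() {
	stage := Task{Guid: "stage-my-app", Action: "compile"}
	web := LRP{Guid: "my-app", Action: "start-web", Instances: 3}

	fmt.Println(stage.Guid, "is a one-off task")
	fmt.Println(web.Guid, "needs", web.MissingInstances(1), "more instances")
}
```

Nothing in either type mentions an "app", which is the point of the abstraction: staging becomes one kind of Task and a running app becomes one kind of LRP.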

Show me Diego

The Right(?) Abstraction

Cloud Controller

Stager: Stage App, expressed as Run Task

App-Manager: Run App, expressed as Launch LRP

Express specific domain in terms of generic recipes

Executor Pool

Rep: Run Tasks, Launch LRPs

Exec: Exec Recipes

Garden: Manage Containers

Linux Backend: Run Containers

Specific / Generic

New features go here! (e.g. cron-like tasks)

Flexibility
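"Express specific domain in terms of generic recipes" can be sketched as a translation step: an app-specific request goes in, a list of generic actions comes out. The action kinds (`download`, `run`, `upload`) and all function and type names below are assumptions chosen for the example, not Diego's actual recipe format.

```go
package main

import "fmt"

// Action is one generic step that an executor could perform without
// knowing anything about apps. (Illustrative set of kinds.)
type Action struct {
	Kind string // "download", "run" or "upload"
	Arg  string
}

// Recipe is an ordered list of generic actions.
type Recipe []Action

// StageApp plays the role of the Stager: it turns the app-specific
// request "stage this app" into a generic recipe for a Task.
func StageApp(appBitsURL, compileCommand, dropletURL string) Recipe {
	return Recipe{
		{Kind: "download", Arg: appBitsURL},
		{Kind: "run", Arg: compileCommand},
		{Kind: "upload", Arg: dropletURL},
	}
}

// RunApp plays the role of the App-Manager: it turns "run this app"
// into a generic recipe for an LRP.
func RunApp(dropletURL, startCommand string) Recipe {
	return Recipe{
		{Kind: "download", Arg: dropletURL},
		{Kind: "run", Arg: startCommand},
	}
}

func main() {
	for _, a := range StageApp("app.zip", "compile", "droplet.tgz") {
		fmt.Println(a.Kind, a.Arg)
	}
}
```

Under this reading, a new feature such as cron-like tasks would be one more translator that emits recipes, with no change to the generic layers underneath.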

Show me Diego

Platform Specific

Platform Independent ✓

✓ ✓ ✓ ✓ ✓

Just 2 Things:

Linux Backend: Run Containers

Win Backend: Run Containers
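One way to read "Just 2 Things" is that only the container backend differs per platform, while everything above it talks to a common interface. The sketch below assumes a hypothetical `Backend` interface and a simplified `Garden` type; neither mirrors Garden's real API.

```go
package main

import "fmt"

// Backend is the only platform-specific piece in this sketch.
// (Hypothetical interface for illustration.)
type Backend interface {
	RunContainer(handle string) string
}

// LinuxBackend is a stand-in for a Linux container implementation.
type LinuxBackend struct{}

func (LinuxBackend) RunContainer(handle string) string {
	return "linux container " + handle
}

// WindowsBackend is a stand-in for a Windows container implementation.
type WindowsBackend struct{}

func (WindowsBackend) RunContainer(handle string) string {
	return "windows container " + handle
}

// Garden manages containers through whichever backend it was given,
// so callers never see the platform.
type Garden struct {
	backend Backend
}

func (g Garden) Create(handle string) string {
	return g.backend.RunContainer(handle)
}

func main() {
	gardens := []Garden{
		{backend: LinuxBackend{}},
		{backend: WindowsBackend{}},
	}
	for _, g := range gardens {
		fmt.Println(g.Create("app-1"))
	}
}
```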

Tight Coupling

Poor separation of concerns

Orchestration

Triangular Dependencies

Show me Diego

Orchestration

Health Manager

Cloud Controller

Rep + Exec (×4)

Start! Start! Stop!

Want 3

Hold auctions… to distribute LRPs
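"Hold auctions to distribute LRPs" can be illustrated with a toy auction in which each Rep bids a score and the lowest bid wins one instance. This is a single-process simplification under stated assumptions: the scoring rule (fraction of capacity in use) and every name here are invented for the example, and the talk's auction is distributed.

```go
package main

import "fmt"

// Rep stands in for the representative of one cell.
// (Hypothetical fields for illustration.)
type Rep struct {
	Name     string
	Capacity int
	Running  int
}

// Bid returns a score (lower is better) and whether the rep has room
// for one more instance. The score is the fraction of capacity in use.
func (r Rep) Bid() (float64, bool) {
	if r.Running >= r.Capacity {
		return 0, false
	}
	return float64(r.Running) / float64(r.Capacity), true
}

// Auction places instances one at a time on the lowest bidder and
// returns the winners in order. The first rep wins ties.
func Auction(reps []*Rep, instances int) ([]string, error) {
	winners := make([]string, 0, instances)
	for i := 0; i < instances; i++ {
		var best *Rep
		bestScore := 0.0
		for _, r := range reps {
			score, ok := r.Bid()
			if ok && (best == nil || score < bestScore) {
				best, bestScore = r, score
			}
		}
		if best == nil {
			return winners, fmt.Errorf("no rep has room for instance %d", i)
		}
		best.Running++
		winners = append(winners, best.Name)
	}
	return winners, nil
}

func main() {
	reps := []*Rep{
		{Name: "rep-1", Capacity: 4, Running: 2},
		{Name: "rep-2", Capacity: 4},
		{Name: "rep-3", Capacity: 4, Running: 1},
	}
	winners, err := Auction(reps, 3) // "Want 3"
	fmt.Println(winners, err)
}
```

The requester only states how many instances it wants; which Rep runs each one is decided by the bids rather than by a central component that tracks every cell.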

Show me Diego

Triangular Dependencies

Health Manager

Cloud Controller: Want 3

Rep + Exec (×4)

self-managing, self-monitoring, self-healing

eventually consistent

Show me Diego

Cloud Controller: Want 3

Rep + Exec (×4)

self-managing, self-monitoring, self-healing

eventually consistent

robust
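"Want 3" plus "eventually consistent" suggests a reconcile step: compare the desired count with what is actually running and compute the difference. A minimal sketch follows, assuming a hypothetical `Converge` function and an index-based view of instances; the transcript does not describe how Diego performs this comparison.

```go
package main

import (
	"fmt"
	"sort"
)

// Converge compares the desired instance count with the set of running
// instance indices and returns which indices to start and which to stop.
// Running it repeatedly moves the system toward the desired state.
func Converge(desired int, running map[int]bool) (start, stop []int) {
	for i := 0; i < desired; i++ {
		if !running[i] {
			start = append(start, i)
		}
	}
	for i, up := range running {
		if up && i >= desired {
			stop = append(stop, i)
		}
	}
	sort.Ints(stop) // map iteration order is random, so sort for stable output
	return start, stop
}

func main() {
	// Want 3, but index 1 has died and a stray index 5 is still running.
	running := map[int]bool{0: true, 2: true, 5: true}
	start, stop := Converge(3, running)
	fmt.Println("start:", start, "stop:", stop)
}
```

Because the function only looks at desired versus actual state, a lost message is corrected on the next pass instead of leaving the system stuck.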

Show me Diego

Rep + Exec (×4)

but…

distributed auction is complex

emergent behavior

Simulation-Driven Development
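The idea of simulation-driven development can be shown in miniature: run a placement strategy against a simulated pool and measure the outcome before trusting it. The strategies, the spread metric, and all names below are assumptions for illustration; the transcript does not say what the actual simulation measured.

```go
package main

import "fmt"

// Strategy picks the index of the cell that receives the next instance.
type Strategy func(load []int) int

// LeastLoaded picks the cell with the fewest instances (first wins ties).
func LeastLoaded(load []int) int {
	best := 0
	for i, n := range load {
		if n < load[best] {
			best = i
		}
	}
	return best
}

// AlwaysFirst is a deliberately poor baseline strategy.
func AlwaysFirst(load []int) int { return 0 }

// Simulate places instances on cells one at a time and returns the
// spread (max load minus min load). Zero means perfectly balanced.
// It assumes cells is at least 1.
func Simulate(cells, instances int, pick Strategy) int {
	load := make([]int, cells)
	for i := 0; i < instances; i++ {
		load[pick(load)]++
	}
	lo, hi := load[0], load[0]
	for _, n := range load {
		if n < lo {
			lo = n
		}
		if n > hi {
			hi = n
		}
	}
	return hi - lo
}

func main() {
	fmt.Println("least loaded spread:", Simulate(4, 10, LeastLoaded))
	fmt.Println("always first spread:", Simulate(4, 10, AlwaysFirst))
}
```

A number like the spread turns emergent behavior into something that can be compared between strategies and checked in an automated test.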

Show me Diego

complex interactions, hard to test, hard to reason through

simulation driven

Show me Diego

14 small single-responsibility components!

executor, rep, stager, app-manager, auctioneer, converger, etcd-metrics-server, etcd, file-server, garden, linux-circus, metricz, route-emitter, tps

complex interactions, hard to test, hard to reason through

simulation driven

unit-tested ✓

Diego is a play

Actors

script

communication and role encoded via shared library

shared narrative

integration tests ✓

Show me Diego

complexity in a distributed system of this scope is real and necessary

Diego embraces this and tries to make its complexity:

explicit

transparent

∴ easier to reason about

simulation driven, unit-tested ✓, shared narrative, integration tests ✓

complex interactions, hard to test, hard to reason through

Show me Diego

flexible abstraction

extensible

robust

agile

Tasks/LRPs

Platform-Independent

Self-Managing

Handle on Complexity

What?

Why?

Show me…

The future

DIEGO Elastic Runtime 2.0

The future

staging

running

+ buildpacks

placement pools

.NET

process types

auto-rebalancing

0-downtime deploys

dockerfiles

custom health-checks

shell access

persistent disk

DIEGO Elastic Runtime 2.0

Rep + Exec (×4)