architecting for scale - cs.cmu.educkaestne/15313/2017/... · source: josh evans, mastering chaos...

14
10/16/17 1 Architecting for Scale Microservices at Netflix-Uber-Spotify Scale Pooyan Jamshidi Foundations of Software Engineering Learning Goals Understand the value of microservices for building complex applications that need to operate at higher scale Identify requirements that derive companies to migrate to microservices (contrast of requirements between companies) Understand strategies for reliability of microservice architecture either at micro-level using design patterns or at a larger level Build agile team structure that enable large-scale companies to move fast (organizational challenges) Understand challenges that Netflix-Uber-Spotify faced in realizing microservice based applications 2 Disclaimer I used materials from Netflix blog Spotify, Uber and Netflix’s architects GOTO talks And some other sources referenced in the slides I’m a postdoc in Christian’s group Software Engineering + Machine Learning I worked as a software practitioners for 7 years Pre-PhD 4 years as a developer 3 years as an architect Involved in migration to cloud and microservices | Microservices Architecture Enables DevOps Migration to a Cloud-Native Architecture Armin Balalaie and Abbas Heydarnoori, Sharif University of Technology Pooyan Jamshidi, Imperial College London // This article reports on experiences and lessons learned during incremental migration and architectural refactoring of a commercial mobile back end as a service to microservices architecture. It explains how the researchers adopted DevOps and how this facilitated a smooth migration. // A LOOK AT the searches related to the term “microservices” on Google Trends revealed that the top searches are now technology driven. This im- plies that the time of general search terms such as “What is microser- vices?” has now long passed. Not only are software vendors (for ex- ample, IBM and Microsoft) using microservices and DevOps practices, but also content providers (for exam- ple, Netix and the BBC) have ad- opted and are using them. In addition, Google Trends re- veals that both DevOps and mi- croservices are growing concepts, with an equal rate of growth after 2014 (see Figure 1). Although Dev- Ops can also be applied to mono- lithic software systems, microservices enable effective implementation of DevOps by promoting the impor- tance of small teams.(For more on DevOps and Microservices, see the related sidebar.) A microservices architecture is a cloud-native architecture that aims to realize software systems as a package of small services. Each ser- vice is independently deployable on a potentially different platform and technological stack. It can run in its own process while communicat- ing through lightweight mechanisms such as RESTful or RPC-based APIs—for example, Finagle. (REST stands for Representational State Transfer.) In this setting, each ser- vice is a business capability that can utilize various programming lan- guages and data stores and is devel- oped by a small team.Migrating monolithic architec- tures to microservices brings in many benets. In particular, it pro- vides adaptability to technological changes to avoid technology lock-in and, more important, reduced time- to-market and better development team structuring around services.Here we explain our experiences and lessons learned during incre- mental migration of Backtory (www. backtory.com), a commercial mo- bile back end as a service (MBaaS), to microservices in the context of DevOps. Microservices help Back- tory in various ways, especially in shipping new features more fre- quently and providing scalability for the collective set of users from differ- ent mobile-app developers. Furthermore, we report on migra- tion patterns we developed on the basis of our observations in migra- tion projects. Practitioners can use these patterns to migrate monolithic software systems to microservices. In addition, system consultants can use FOCUS: DEVOPS What is the most interesting aspect that you have learned from the Netflix talk? Tradeoff in software architecture Everything is tradeoff Try to make them intentionally Organ Systems Each organ has a purpose Organs form systems Systems form an organism

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

1

Architecting for ScaleMicroservices at Netflix-Uber-Spotify Scale

PooyanJamshidi

FoundationsofSoftwareEngineering LearningGoals

• Understandthevalueofmicroservices forbuildingcomplexapplicationsthatneedtooperateathigherscale• Identifyrequirementsthatderivecompaniestomigratetomicroservices (contrastofrequirementsbetweencompanies)• Understandstrategiesforreliability of microservice architectureeitheratmicro-levelusingdesignpatternsoratalargerlevel• Buildagileteamstructurethatenablelarge-scalecompaniestomovefast(organizationalchallenges)• UnderstandchallengesthatNetflix-Uber-Spotifyfacedinrealizingmicroservice basedapplications

2

Disclaimer

• Iusedmaterialsfrom• Netflixblog• Spotify,UberandNetflix’sarchitectsGOTOtalks• Andsomeothersourcesreferencedintheslides

• I’mapostdocinChristian’sgroup• SoftwareEngineering+MachineLearning

• Iworkedasasoftwarepractitionersfor7years• Pre-PhD• 4yearsasadeveloper• 3yearsasanarchitect• Involvedinmigrationtocloudandmicroservices

2 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY 0 7 4 0 - 7 4 5 9 / 1 6 / $ 3 3 . 0 0 © 2 0 1 6 I E E E

Microservices Architecture Enables DevOpsMigration to a Cloud-Native Architecture

Armin Balalaie and Abbas Heydarnoori, Sharif University of Technology

Pooyan Jamshidi, Imperial College London

// This article reports on experiences and lessons learned during incremental migration and architectural refactoring of a commercial mobile back end as a service to microservices architecture. It explains how the researchers adopted DevOps and how this facilitated a smooth migration. //

A LOOK AT the searches related to the term “microservices” on Google Trends revealed that the top searches are now technology driven. This im-plies that the time of general search terms such as “What is microser-vices?” has now long passed. Not only are software vendors (for ex-ample, IBM and Microsoft) using microservices and DevOps practices,

but also content providers (for exam-ple, Netflix and the BBC) have ad-opted and are using them.

In addition, Google Trends re-veals that both DevOps and mi-croservices are growing concepts, with an equal rate of growth after 2014 (see Figure 1). Although Dev-Ops can also be applied to mono-lithic software systems, microservices

enable effective implementation of DevOps by promoting the impor-tance of small teams.1 (For more on DevOps and Microservices, see the related sidebar.)

A microservices architecture is a cloud-native architecture that aims to realize software systems as a package of small services. Each ser-vice is independently deployable on a potentially different platform and technological stack. It can run in its own process while communicat-ing through lightweight mechanisms such as RESTful or RPC-based APIs—for example, Finagle. (REST stands for Representational State Transfer.) In this setting, each ser-vice is a business capability that can utilize various programming lan-guages and data stores and is devel-oped by a small team.2

Migrating monolithic architec-tures to microservices brings in many benefits. In particular, it pro-vides adaptability to technological changes to avoid technology lock-in and, more important, reduced time-to-market and better development team structuring around services.3

Here we explain our experiences and lessons learned during incre-mental migration of Backtory (www.backtory.com), a commercial mo-bile back end as a service (MBaaS), to microservices in the context of DevOps. Microservices help Back-tory in various ways, especially in shipping new features more fre-quently and providing scalability for the collective set of users from differ-ent mobile-app developers.

Furthermore, we report on migra-tion patterns we developed on the basis of our observations in migra-tion projects. Practitioners can use these patterns to migrate monolithic software systems to microservices. In addition, system consultants can use

FOCUS: DEVOPS

WhatisthemostinterestingaspectthatyouhavelearnedfromtheNetflixtalk?

Tradeoffinsoftwarearchitecture

• Everythingistradeoff• Trytomakethemintentionally

OrganSystemsEachorganhasapurposeOrgansformsystemsSystemsformanorganism

Page 2: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

2

ELB

andsoistakingtraffic

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

LargestInternetTVnetwork

86millionmembers~190countries,10soflanguages125mhourscontentperday

Microservices onAWS

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

NetflixDVDDataCenter- 2000

LinuxHostApache Tomcat

JavawebSTORE

Load

Balan

cer

BILLING

HTTP

JDBC

DBLinkHTTP/S

MonolithiccodebaseMonolithicdatabaseTightlycoupledarchitecture

Whatmicroservices arenot Whatmicroservices arenot

Service Oriented Architecture

Enterprise Service Bus (ESB)

PrivacyService (PS)

ServiceConsumer

ServiceConsumer

ServiceConsumer

ServiceProvider

ServiceProvider

ServiceProvider

OtherServices

Routing

Service Registry Transport

Replication

• Message Routing • Message Monitoring

• Service Replication • Language transformation

Edge

ELB

Zuul

NCCP

API

MiddleTier&Platform

Product• Buckettesting• Subscriber• Recommendations

Platform• Routing• Configuration• Crypto

Persistence• Cache• Database

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Architecturalpattern1:APIGateway

12Source:Kasun Indrasiri,Microservices inPractice:FromArchitecturetoDeployment

Page 3: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

3

Architecturalpattern2:Inter-processCommunicationinaMicroservices Architecture

13

Architecturalpattern3:ServiceDiscoveryinaMicroservices Architecture

14

Architecturalpattern4:Event-DrivenDataManagementforMicroservices

15

27Microservices – From Design to Deployment Ch. 3 – Inter-Process Communication

A message consists of headers (metadata such as the sender) and a message body. Messages are exchanged over channels. Any number of producers can send messages to a channel. Similarly, any number of consumers can receive messages from a channel. There are two kinds of channels, point-to-point and publish-subscribe:

• A point-to-point channel delivers a message to exactly one of the consumers that are reading from the channel. Services use point-to-point channels for the one-to-one interaction styles described earlier

• A publish-subscribe channel delivers each message to all of the attached consumers. Services use publish-subscribe channels for the one-to-many interaction styles described above

Figure 3-4 shows how the taxi-hailing application might use publish-subscribe channels

Figure 3-4. Using publish-subscribe channels in a taxi-hailing application.

The Trip Management service notifies interested services, such as the Dispatcher, about a new Trip by writing a Trip Created message to a publish-subscribe channel The Dispatcher finds an available driver and notifies other services by writing a Driver Proposed message to a publish-subscribe channel

DISPATCHER

TRIP CREATED

DRIVER PROPOSED

TRIPMANAGEMENT

PASSENGERMANAGEMENT

DRIVERMANAGEMENT

ChrisRichadson,Microservices fromDesigntoDeployment

Architecturalpattern5:DecentralizedDataManagement

ChristianPosta,TheHardestPartAboutMicroservices:YourData

Architecturalpattern6:ChoosingaMicroservices DeploymentStrategy

17

Architecturalpattern7:Security

18Source:Kasun Indrasiri,Microservices inPractice:FromArchitecturetoDeployment

Page 4: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

4

Whatarchitecturalpatternsyoucanidentifywithinthesearchitectures?

Edge

ELB

Zuul

NCCP

API

MiddleTier&Platform

Product• Buckettesting• Subscriber• Recommendations

Platform• Routing• Configuration• Crypto

Persistence• Cache• Database

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Microservices atUber

21

MicroservicesatSpotify

22

Whyreliabilitymattersinmicroservices world?

LinuxHostLinuxHost

LinuxHostLinuxHost

Intra-serviceRequests

LinuxHostApache Tomcat

LinuxHostApache Tomcat

Networklatency,congestion,failureLogicalorscalingfailure

ServiceA ServiceB

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Page 5: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

5

CrossingtheChasm

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

CascadingFailure

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Vaccination

Device ServiceB

ServiceC

Internet EdgeZuul

ServiceA

ELB

FITSynthetictransactionsOverridebydeviceoraccount%oflivetrafficupto100%

FaultInjectionTesting(FIT)

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Device ServiceB

ServiceC

Internet EdgeZuul

ServiceA

ELB

FIT

FaultInjectionTesting(FIT)

Enforcedthroughoutthecallpath

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Page 6: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

6

ELB

APIAPI

Gateway

App1

App2

App4

App5

App6

App3

App7

App8

99.99

99.99

99.99

99.99

99.99

99.99

99.99

99.99Proxy

99.99 99.99

CombinatorialMath

99.9910 =99.9

CriticalMicroservices

Persistence

Inthepresenceofanetworkpartition,youmustchoosebetweenconsistencyandavailability

CAPTheorem

DB

DB

DB

NetworkB

NetworkC

NetworkD

Service

NetworkA X ZoneA

ZoneB

ZoneC

ZoneB

ZoneC

Client

ZoneA

LocalQuorum(Typical)

100ms

EventualConsistency

Page 7: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

7

Infrastructure

December24th,2012

US-East-1

Canada

Noplacetogo

US

LatinAmerica

US-East-1US-West-2 EU-West-1

Regionalfailover

x x

Regional fail-over

Ruslan Meshenberg,Microservices atNetflixScale:Principles,Tradeoffs&LessonsLearned

RegionalfailoverRegional fail-over

Ruslan Meshenberg,Microservices atNetflixScale:Principles,Tradeoffs&LessonsLearned

Page 8: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

8

Whatisastatelessservice?

Whatisastatelessservice?

• Notacacheoradatabase• Frequentlyaccessedmetadata• Noinstanceaffinity• Lossanodeisanon-event

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Minimumsize

Desiredcapacity

Maximumsize

Scaleoutasneeded

S3AMIretrievedondemand

ComputeefficiencyNodefailureTrafficspikesPerformancebugs

AutoScalingGroups

ClusterA ClusterD

EdgeCluster

ClusterB

ClusterC

Surviving Instance Failure

Source:JoshEvans,MasteringChaos- ANetflixGuidetoMicroservices

Whatisastateful service?

Page 9: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

9

Whatisastatelessservice?

• Databasesandcaches• Customappswhichholddata• Lossofanodeisanotableevent

DedicatedShards– AnAntipattern

Squid1 Squid2 Squid3

ClientApplication

SubscriberClientLibrary

CacheClient ServiceClient

S S S S...

DB DB DB DB...

Squidn

HAProxy

Set1 Set2 Set3 Setn

X

Redundancyisfundamental

ZoneA ZoneB ZoneC

.........

EVCache Writes

ClientApplication

ClientLibrary

EVCacheClient

ClientApplication

ClientLibrary

EVCacheClient

ClientApplication

ClientLibrary

EVCacheClient

...

ZoneA ZoneB ZoneC

ClientApplication

ClientLibrary

EVCacheClient

.........

EVCache Reads

ClientApplication

ClientLibrary

EVCacheClient

ClientApplication

ClientLibrary

EVCacheClient

...

Whyautomation,inallsoftwaredev/opsstages,isimportant?

Page 10: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

10

AutonomicNervousSystem

Youdon’thavetothinkaboutdigestionorbreathing

PrioritiesOur Priorities

1. Innovation

3. Efficiency

2. Reliability

Ruslan Meshenberg,Microservices atNetflixScale:Principles,Tradeoffs&LessonsLearned

Innovation:Tightcouplingdoesn’twork Monolithicvsmicroservice-basedapplications:InterdependentvsIndependentteams

59LeeAtchison,ArchitectingforScale:HighAvailabilityforYourGrowingApplications

End-to-endownership

60

End-end ownership + velocity Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Architect

Design

Develop

Review Test

Deploy

Run

Support

Ruslan Meshenberg,Microservices atNetflixScale:Principles,Tradeoffs&LessonsLearned KevinGoldsmith,Microservices @Spotify

Page 11: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

11

Server

Core Library

Platform Platform Platform Platform

Infrastructure

KevinGoldsmith,Microservices @Spotify

ChallengesSynchronization

Client UX implementationCore Library Implementation

depends on depends on depends onServer Implementation

Infrastructure Implementation

KevinGoldsmith,Microservices @Spotify

KevinGoldsmith,Microservices @Spotify

Full-stack autonomous teamsRequires you to structure your application in loosely coupled parts

ArchitectureevolutionofSpotify

67KevinGoldsmith,Microservices @Spotify

Page 12: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

12

ArchitectureevolutionofSpotify

68 69

70 71

Load

Bal

lanc

er

72

Microservices:Yay!

• EasiertoScale• Easiertotest• Easiertodeploy• Easiertomonitor• Theycanversionedindependently

Page 13: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

13

Microservices:Boo!

• Monitoringlotsofservices• Documentations• Increasedlatency

75

What does this look like at Spotify?

‣ 810 active services

‣ ~10 Systems per squad

‣ ~1.7 Systems per person with access to production servers

‣ ~1.15 Systems per member of Technology

MicroservicesatSpotify

76

As of April 2016:

Uber Cities Worldwide: 400+ Countries: 70 Employees: 6,000+

Uber

MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services

Microservices atUber

79MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services

Page 14: Architecting for Scale - cs.cmu.educkaestne/15313/2017/... · Source: Josh Evans, Mastering Chaos -A Netflix Guide to Microservices Cascading Failure Source: Josh Evans, Mastering

10/16/17

14

pre-history PHP (outsourced)

Dispatch Node.JS, moving Go

Core Services Python, moving to Go

Maps Python and Java

Data Python and Java

Metrics Go

MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services

LANGUAGESHard to share code Hard to move between teams WIWIK: Fragments the culture

MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services

MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services MattRanney,WhatIWishIHadKnownBeforeScalingUberto1000Services

Summary

• Microservices maybearightsolutionforbuildingcomplexapplicationsthatneedtooperateathigherscale• Tradeoffsthatcompaniesmadetomigratetomicroservices (contrastofrequirementsbetweencompanies)• Makingreliablemicroservice architecturerequiresstrategiestodealwithfailureeitheratmicro-leveloratalargerlevel• Microservices architecturehelptobuildagileteamstructurethatenablelargescalecompaniestomovefast(organizationalchallenges)