cloud gaming architectures

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Mark Bate, AWS Solutions Architect Cloud Gaming Architectures From Social to Mobile to MMO

Upload: trannhan

Post on 13-Feb-2017


Page 1: Cloud Gaming Architectures

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Mark Bate, AWS Solutions Architect

Cloud Gaming Architectures: From Social to Mobile to MMO

Page 2: Cloud Gaming Architectures

Gratuitous logo slide

Page 3: Cloud Gaming Architectures

Traditional: Rigid vs. AWS: Elastic

[Figure: capacity vs. demand (axis: servers). Rigid capacity: excess capacity = wasted $$; unmet demand = upset players and missed revenue :(]

Scale to what you need, pay for what you use

Page 4: Cloud Gaming Architectures

11 regions

53 edge locations

Continuous expansion

Global is good

Page 5: Cloud Gaming Architectures

Common game back-end concepts

• Think in terms of APIs
• HTTP + JSON
• Get friends, leaderboard
• Binary asset data
• Multiplayer servers
• High availability
• Scalability

Page 6: Cloud Gaming Architectures

Core (HA) game back end

ELB

• Choose region
• >=2 Availability Zones
• Amazon EC2 for app
• Elastic Load Balancing
• Amazon RDS database
  • Multi-AZ

Region

Page 7: Cloud Gaming Architectures

Scale it way out

ELB

• Amazon S3 for game data
  • Assets
  • UGC
  • Analytics

Region

Page 8: Cloud Gaming Architectures

Scale it way out

ELB

• Amazon S3 for game data
  • Assets
  • UGC
  • Analytics
• ... with Amazon CloudFront!

Region

CloudFront CDN

Page 9: Cloud Gaming Architectures

Scale it way out

• Amazon S3 for game data
  • Assets
  • UGC
  • Analytics
• ... with CloudFront!

• Auto Scaling group
  • Capacity on demand
  • Respond to users
  • Automatic healing

ELB

Region

CloudFront CDN

Page 10: Cloud Gaming Architectures

Scale it way out

• Amazon S3 for game data
  • Assets
  • UGC
  • Analytics
• ... with CloudFront!

• Auto Scaling group
  • Capacity on demand
  • Respond to users
  • Automatic healing

• Amazon ElastiCache
  • Memcached
  • Redis

ELB

Region

CloudFront CDN
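
A cache tier such as ElastiCache is commonly used in a cache-aside (lazy loading) style: read from the cache first and fall back to the database on a miss. The sketch below is only an illustration under assumptions: a plain dict stands in for Memcached or Redis, and `load_profile_from_db` is a hypothetical placeholder for a real database read.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside with a TTL; a dict stands in for Memcached/Redis."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float = 60.0):
        self._loader = loader          # hypothetical database read function
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, dict]] = {}
        self.misses = 0                # counts how often the database was hit

    def get(self, key: str) -> dict:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]            # cache hit
        self.misses += 1               # miss or expired entry: read the database
        value = self._loader(key)
        self._store[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so the next read reloads fresh data."""
        self._store.pop(key, None)


def load_profile_from_db(user_id: str) -> dict:
    # Placeholder for a real query against the database tier.
    return {"user_id": user_id, "level": 1}


cache = CacheAside(load_profile_from_db, ttl_seconds=30)
first = cache.get("101")
second = cache.get("101")   # served from the cache, no second database read
```

Every write has to invalidate or update the cached entry, which is one reason a cache helps less when the workload is dominated by writes.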

Page 11: Cloud Gaming Architectures

Writing is painful

• Games are write heavy
• Caching of limited use
• Key value
• Binary structures
• Database = bottleneck

ELB

Region

CloudFront CDN

Page 12: Cloud Gaming Architectures

Sharding (not fun)
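
One way to see why hand-rolled sharding is not fun: with simple hash-modulo routing, adding a single shard remaps most keys. The sketch below is an illustration only; the shard names are hypothetical, and hash-modulo is just one common routing scheme, not one the slides prescribe.

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]  # hypothetical names


def shard_for(user_id: str, shards=SHARDS) -> str:
    """Map a user ID to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def moved_fraction(user_ids, old_shards, new_shards) -> float:
    """Fraction of users whose shard changes when the shard list changes."""
    moved = sum(shard_for(u, old_shards) != shard_for(u, new_shards) for u in user_ids)
    return moved / len(user_ids)


users = [str(n) for n in range(10_000)]
# Going from 4 to 5 shards: a key stays put only if hash % 4 == hash % 5,
# so most users would have to be migrated to a different database.
fraction = moved_fraction(users, SHARDS, SHARDS + ["db-shard-4"])
```

Resharding therefore means moving data while the game is live, on top of routing every query in application code.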

Page 13: Cloud Gaming Architectures

Amazon DynamoDB

• Fully managed
• NoSQL data store
• Provisioned throughput
• Secondary indexes
• PUT/GET keys
• Document support!

ELB

Region

CloudFront CDN

Page 14: Cloud Gaming Architectures

Example: Leaderboard in DynamoDB

• Hash key = Primary key
• Range key = Sub key
• Range key = Sort key
• Other attributes are undefined
• So… How to sort based on top score?

UserID (hash key)   BoardName (range key)   TopScore   TopScoreDate
"101"               "Galaxy Invaders"       5842       "2014-09-15T17:24:31"
"101"               "Meteor Blasters"       1000       "2014-10-22T23:18:01"
"101"               "Starship X"            24         "2014-08-31T13:14:21"
"102"               "Alien Adventure"       192        "2014-07-12T11:07:56"
"102"               "Galaxy Invaders"       0          "2014-09-18T07:33:42"
"103"               "Attack Ships"          3          "2014-10-19T01:13:24"
"103"               "Galaxy Invaders"       2317       "2014-09-11T06:53:00"
"103"               "Meteor Blasters"       723        "2014-10-19T01:14:24"
"103"               "Starship X"            42         "2014-07-11T06:53:03"

Page 15: Cloud Gaming Architectures

Leaderboard with secondary indexes

• Create a secondary index!
• Set hash key to BoardName
• Set range key to TopScore
• Project extra attributes as needed
• Can now query by BoardName, sorted by TopScore
• Handles many common gaming use cases

Secondary index:

BoardName (hash key)   TopScore (range key)   UserID
"Alien Adventure"      192                    "101"
"Attack Ships"         3                      "103"
"Galaxy Invaders"      0                      "102"
"Galaxy Invaders"      2317                   "103"
"Galaxy Invaders"      5842                   "101"
"Meteor Blasters"      723                    "103"
"Meteor Blasters"      1000                   "101"
"Starship X"           24                     "101"
"Starship X"           42                     "103"

Base table item:

UserID (hash key)   BoardName (range key)   TopScore   TopScoreDate
"101"               "Galaxy Invaders"       5842       "2014-09-15T17:24:31"

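The effect of re-keying the data by BoardName and TopScore can be modeled in a few lines of plain Python. This is a toy in-memory model, not DynamoDB itself; `build_index` and `query_top` are hypothetical helpers. Against the real service, the equivalent would be a Query on the index with `ScanIndexForward` set to false and a `Limit`.

```python
from collections import defaultdict
from operator import itemgetter

# Base table items, keyed by (UserID, BoardName) as in the leaderboard table.
ITEMS = [
    {"UserID": "101", "BoardName": "Galaxy Invaders", "TopScore": 5842},
    {"UserID": "101", "BoardName": "Meteor Blasters", "TopScore": 1000},
    {"UserID": "101", "BoardName": "Starship X", "TopScore": 24},
    {"UserID": "102", "BoardName": "Alien Adventure", "TopScore": 192},
    {"UserID": "102", "BoardName": "Galaxy Invaders", "TopScore": 0},
    {"UserID": "103", "BoardName": "Attack Ships", "TopScore": 3},
    {"UserID": "103", "BoardName": "Galaxy Invaders", "TopScore": 2317},
    {"UserID": "103", "BoardName": "Meteor Blasters", "TopScore": 723},
    {"UserID": "103", "BoardName": "Starship X", "TopScore": 42},
]


def build_index(items, hash_key, range_key):
    """Group items by hash key and keep each group sorted by range key."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_key]].append(item)
    for partition in index.values():
        partition.sort(key=itemgetter(range_key))
    return index


def query_top(index, board, limit):
    """Read one partition in descending range-key order, like a limited query."""
    return index[board][::-1][:limit]


board_index = build_index(ITEMS, "BoardName", "TopScore")
top_two = [(i["UserID"], i["TopScore"])
           for i in query_top(board_index, "Galaxy Invaders", 2)]
# top_two == [("101", 5842), ("103", 2317)]
```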
Page 16: Cloud Gaming Architectures

Documents in DynamoDB

• Scalar types: String, Number, Binary, Boolean, Null
• Multivalue types: String Set, Number Set, Binary Set
• Document types: List, Map
• Document content addressing

JSON document:

    {
      "name": "Mark",
      "games": ["Megablast", "Spacerace"],
      "score": {"Megablast": 123, "Spacerace": 41}
    }

The same document with DynamoDB type descriptors:

    {
      "name": {"S": "Mark"},
      "games": {"L": [{"S": "Megablast"}, {"S": "Spacerace"}]},
      "score": {"M": {"Megablast": {"N": "123"}, "Spacerace": {"N": "41"}}}
    }

Addressing a nested value:

    document.score.Megablast
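
The mapping from a plain JSON document to the typed form can be sketched as a small recursive function. This is a minimal illustration covering only strings, numbers, booleans, null, lists, and maps; `to_dynamodb` and `get_path` are hypothetical helper names, not part of any SDK.

```python
from functools import reduce


def to_dynamodb(value):
    """Wrap a plain Python value in DynamoDB-style type descriptors."""
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_dynamodb(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_dynamodb(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def get_path(document, path):
    """Resolve a dotted path such as 'score.Megablast' in a plain document."""
    return reduce(lambda node, key: node[key], path.split("."), document)


player = {
    "name": "Mark",
    "games": ["Megablast", "Spacerace"],
    "score": {"Megablast": 123, "Spacerace": 41},
}
item = to_dynamodb(player)["M"]          # top-level attributes of the item
megablast = get_path(player, "score.Megablast")
```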

Page 17: Cloud Gaming Architectures

Related sessions

DAT204 NoSQL? No Worries: Building Scalable Applications on AWS NoSQL Services

DAT401 Amazon DynamoDB Deep Dive: Schema Design, Indexing, JSON, Search, and More

GAM401 Serverless Mobile Game Development with Amazon Cognito, AWS Lambda, and Amazon DynamoDB

Page 18: Cloud Gaming Architectures

Customer Story: Devsisters

Page 19: Cloud Gaming Architectures

What to expect from the session

• How we started
• How we improved our design
• Tips and tricks
• Retrospect

Page 20: Cloud Gaming Architectures

Cookie Run

Page 21: Cloud Gaming Architectures

Cookie Run video

Page 22: Cloud Gaming Architectures

About Cookie Run

• 70M~ downloads
• 10M DAU
• Top free 1st in 10 countries
• Top free 10th in 38 countries

Page 23: Cloud Gaming Architectures

More about Devsisters and Cookie Run

Page 24: Cloud Gaming Architectures

How We Started

Page 25: Cloud Gaming Architectures

In early 2013…

Lack of infrastructure, lack of developers, no hope (1 server developer / 0 system engineers)

Only 1 game in service

Ovenbreak 2
- AWS US East

Cookie Run
- Only 1 person, 1 month left

Page 26: Cloud Gaming Architectures

Goal

• Highly reliable
• Quality assured
• Scalable design
• Auto configuring and scaling
• Real-time monitoring system
• Log system

Page 27: Cloud Gaming Architectures

First design

• Game server: Java, Spring MVC, MySQL 5.5
• Operation tool: Python, Django, Boto
• Monitoring: Amazon CloudWatch, Zabbix, StatsD, Graphite

Page 28: Cloud Gaming Architectures

First design

Page 29: Cloud Gaming Architectures

After 11 days

Page 30: Cloud Gaming Architectures

Design Improvements

Page 31: Cloud Gaming Architectures

Design improvements

• Improving the logging system
• Improving the game patch system
• Adding global user ranking system
• Redesigning the back end

Page 32: Cloud Gaming Architectures

Redesigning the back end

Situation

Players send game hearts to each other. The back end does the bookkeeping.

- Plan A: Used MySQL for storing data
  Trouble: MySQL can’t keep up; too many rows (100M ~)

- Plan B: Gave unlimited hearts to users! Disabled the feature
  Trouble: Not so bad, but we need to come up with a better solution

Page 33: Cloud Gaming Architectures

Redesigning the back end

Solution

MySQL → NoSQL (Couchbase)
• Use MySQL for game data (shop data, stage data, …)
• Use NoSQL for user data (user items, level, coin, …)

Page 34: Cloud Gaming Architectures

Before

Page 35: Cloud Gaming Architectures

After

Page 36: Cloud Gaming Architectures

Improving the logging system

Situation

We need real-time log querying capability

Solution

Real-time log viewing system based on ELK

Page 37: Cloud Gaming Architectures

Before

Page 38: Cloud Gaming Architectures

After

Page 39: Cloud Gaming Architectures

Real-time log viewing system

Page 40: Cloud Gaming Architectures

Improving the game patch system

Situation

• App Store binary size limit
• Some resources need to be downloaded on demand
• Wanted to distribute patches without App Store update

Solution

• Constructed a decent patch system
• Based on Amazon S3 and Amazon CloudFront
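
The slides do not say how the patch system decides what to download. One common approach, shown here purely as an assumption, is to publish a manifest of content hashes next to the assets and have the client fetch only the files whose hash differs. The file names and helper functions below are hypothetical.

```python
import hashlib


def build_manifest(files):
    """Map each asset path to the SHA-256 hex digest of its bytes."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def files_to_download(local_manifest, remote_manifest):
    """Assets that are new or changed compared with what the client has."""
    return sorted(path for path, digest in remote_manifest.items()
                  if local_manifest.get(path) != digest)


def files_to_delete(local_manifest, remote_manifest):
    """Assets the client holds that the latest manifest no longer lists."""
    return sorted(set(local_manifest) - set(remote_manifest))


# Hypothetical asset sets: what the device has vs. what is published.
on_device = build_manifest({"stage1.bin": b"v1", "cookie.png": b"old", "legacy.dat": b"x"})
published = build_manifest({"stage1.bin": b"v1", "cookie.png": b"new", "stage2.bin": b"v1"})

download = files_to_download(on_device, published)   # ["cookie.png", "stage2.bin"]
delete = files_to_delete(on_device, published)       # ["legacy.dat"]
```

Because unchanged files keep the same hash, a scheme like this also works well with CDN caching of the asset files themselves.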

Page 41: Cloud Gaming Architectures

Before

Page 42: Cloud Gaming Architectures

After

Page 43: Cloud Gaming Architectures

Improving the logging system

Situation

• Total log size >10 TB; want to analyze all logs
• Situation forced us to look for big data solutions

Solution

• Adopted big data platforms using Amazon EMR or Amazon EC2
• Eventually migrated to Spark and Spark SQL

Page 44: Cloud Gaming Architectures

Before

Page 45: Cloud Gaming Architectures

After

Page 46: Cloud Gaming Architectures

Spark

Page 47: Cloud Gaming Architectures

Log dashboard

Page 48: Cloud Gaming Architectures

Adding global user ranking system

Situation

Want to introduce a global user ranking system

Solution

Use an ordered set based on a skip list, with ElastiCache… with custom caching and a lot of optimization techniques
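
An ordered set keeps members sorted by score so that rank and top-N lookups avoid a full sort. The sketch below is a stand-in, not the team's implementation: it uses a sorted Python list with `bisect` instead of a skip list, so inserts cost O(n) rather than O(log n). The operations correspond roughly to the Redis sorted-set commands ZADD, ZREVRANK, and ZREVRANGE.

```python
import bisect


class Leaderboard:
    """Ordered set of (score, member), highest score first."""

    def __init__(self):
        self._scores = {}    # member -> current score
        self._ordered = []   # sorted list of (-score, member)

    def add(self, member, score):
        """Insert a member or replace its score (like ZADD)."""
        old = self._scores.get(member)
        if old is not None:
            idx = bisect.bisect_left(self._ordered, (-old, member))
            del self._ordered[idx]
        self._scores[member] = score
        bisect.insort(self._ordered, (-score, member))

    def rank(self, member):
        """0-based rank with the best score at 0 (like ZREVRANK); None if absent."""
        score = self._scores.get(member)
        if score is None:
            return None
        return bisect.bisect_left(self._ordered, (-score, member))

    def top(self, n):
        """The n best (member, score) pairs (like ZREVRANGE ... WITHSCORES)."""
        return [(member, -neg) for neg, member in self._ordered[:n]]


board = Leaderboard()
board.add("a", 10)
board.add("b", 30)
board.add("c", 20)
best_two = board.top(2)      # [("b", 30), ("c", 20)]
rank_before = board.rank("a")
board.add("a", 40)           # "a" posts a new high score
rank_after = board.rank("a")
```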

Page 49: Cloud Gaming Architectures

Before

Page 50: Cloud Gaming Architectures

After

Page 51: Cloud Gaming Architectures

Tips and Tricks

Page 52: Cloud Gaming Architectures

WARNING

THIS CAN HAPPEN TO YOU
BASED ON THE TRUE STORY OF OUR TEAM

Page 53: Cloud Gaming Architectures

Auto Scaling gotchas

Frequency: More than 10 times during 2 years

Many users connect to the game simultaneously
• During holiday seasons
• At the start of in-game events
• When bulk push notifications are sent
• Or for reasons unknown

Booting instances takes several minutes, which isn’t quick enough to handle spiky loads

We have to predict traffic surges and prepare beforehand

Page 54: Cloud Gaming Architectures

Our bulk push system

Page 55: Cloud Gaming Architectures

Auto Scaling gotchas

Don’t set the minimum instance count to 1 or 2
• If one machine dies, the service fails

Use multiple Availability Zones
• Sometimes a single AZ can run out of available instances
• Use multiple AZs with ELB cross-zone load balancing

Page 56: Cloud Gaming Architectures

Auto Scaling gotchas

Set scale-out (scale-in) policy meticulously
• scale-out: +4 when Latency >= 0.1 for 2 minutes
• scale-in: -2 when CPUUtilization < 10 for 2 minutes

Sometimes scale-up can be a useful option
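
The scale-out and scale-in rules above can be expressed as a small decision function. This is a sketch of the rule logic only, not how Auto Scaling evaluates alarms internally. Assumptions: one metric sample per minute, latency measured in seconds, and hypothetical group bounds of 3 and 40 instances.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One per-minute metric sample (units are assumptions)."""
    latency_seconds: float
    cpu_percent: float


def desired_change(samples: List[Sample], periods: int = 2) -> int:
    """Return the instance delta for the most recent `periods` samples."""
    recent = samples[-periods:]
    if len(recent) < periods:
        return 0                                   # not enough data yet
    if all(s.latency_seconds >= 0.1 for s in recent):
        return +4                                  # scale out aggressively
    if all(s.cpu_percent < 10 for s in recent):
        return -2                                  # scale in cautiously
    return 0


def apply_change(current: int, change: int, minimum: int = 3, maximum: int = 40) -> int:
    """Clamp the new group size to hypothetical min/max bounds."""
    return max(minimum, min(maximum, current + change))


busy = [Sample(0.12, 50.0), Sample(0.15, 60.0)]
idle = [Sample(0.02, 5.0), Sample(0.03, 8.0)]
new_size_busy = apply_change(6, desired_change(busy))   # 6 + 4 = 10
new_size_idle = apply_change(4, desired_change(idle))   # clamped at the minimum of 3
```

The asymmetry is the point: the group adds capacity in larger steps than it removes, so a brief lull does not strip servers right before the next spike.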

Page 57: Cloud Gaming Architectures

Chef server failure

An Auto Scaling group that relies on a Chef server is dangerous
• The Chef server is a single point of failure (SPOF)
• It may become unresponsive when too many servers start simultaneously

Errors happen in unexpected places!

Page 58: Cloud Gaming Architectures

Couchbase storage failure

Hardware problems can occur in EC2 instances
• The worst, most hopeless kind of system failure
• A front-end API server can crash; that’s OK
• But if you are maintaining a database on EC2, this can be a tragedy

It really happens

Page 59: Cloud Gaming Architectures

Couchbase storage failure

June 2015
• A monumental hell gate in our company history
• Server down for 12 consecutive hours because of a disk error in Couchbase
• Also, our daily backup script had not worked for 1 week prior to the shutdown
• Some data were restored via replication
• The other data were restored by adding the lost week’s logs to previously backed-up data

Lesson learned: Replicas are necessary. Confirm that backups work.
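
"Confirm backups" can be partly automated with a freshness check that alerts when the last successful backup is too old. The sketch below is a minimal illustration; the 26-hour threshold and the function name are assumptions, and a real check would also test that a backup can actually be restored.

```python
from datetime import datetime, timedelta


def backup_is_stale(last_success: datetime, now: datetime,
                    max_age: timedelta = timedelta(hours=26)) -> bool:
    """True when the newest successful backup is older than max_age.

    26 hours gives a daily job some slack before alerting (an assumed value).
    """
    return now - last_success > max_age


now = datetime(2015, 6, 15, 9, 0)
fresh = backup_is_stale(datetime(2015, 6, 15, 3, 0), now)   # backed up 6 hours ago
stale = backup_is_stale(datetime(2015, 6, 8, 3, 0), now)    # silent for a week
```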

Page 60: Cloud Gaming Architectures

Overseas network failure

Frequency: More than 5 times over 2 years
• This situation has really happened
• ISPs cut costs, leading to overseas packet loss

Just call AWS

Page 61: Cloud Gaming Architectures

Final Design Review

Page 62: Cloud Gaming Architectures

First design

Page 63: Cloud Gaming Architectures

Final

Page 64: Cloud Gaming Architectures

Future Plans

Page 65: Cloud Gaming Architectures

Future plans

• Transactional log system
• High latency / packet loss networks: QUIC

Entertain the world!

Page 66: Cloud Gaming Architectures

Thank you!

Page 67: Cloud Gaming Architectures

Amazon Cognito

Page 68: Cloud Gaming Architectures

[Diagram labels: Identity Providers; Your own Auth; Amazon Cognito; Unique Identities (Joe, Anna, Bob); Any Device, Any Platform; Any AWS Service (Mobile Analytics, S3, DynamoDB, Kinesis)]

Helps implement security best practices
• Securely access any AWS service from a mobile device; it simplifies the interaction with AWS Identity and Access Management

Support multiple login providers
• Easily integrate with major login providers for authentication, or use your own authentication system

Unique users vs. devices
• Manage unique identities; automatically recognize a unique user across devices and platforms

Page 69: Cloud Gaming Architectures

Synchronize data across devices with Amazon Cognito

• Sync game state across OS, devices
• State transition (link multiple accounts)
• Sync user profiles across OS, devices, web

Page 70: Cloud Gaming Architectures

Related sessions

GAM401 Serverless Mobile Game Development with Amazon Cognito, AWS Lambda, and Amazon DynamoDB

MBL402 Mobile Identity Management and Data Synchronization Using Amazon Cognito

WRK202 Build a Scalable Mobile App on Serverless, Event-Triggered, Back-End Logic

Page 71: Cloud Gaming Architectures

Player Two
Press Start

Page 72: Cloud Gaming Architectures

Multiplayer game servers

Region

• API back-end app
  • Core session
  • Matchmaking

• S3 + CloudFront
  • DLC, assets
  • Game saves
  • UGC

• Public server tier
  • Direct client socket
  • Scale on players

Page 73: Cloud Gaming Architectures

Multiplayer game servers

① Login via API
② Request matchmaking
③ Get game server IP

Region

Page 74: Cloud Gaming Architectures

Multiplayer game servers

① Login via API
② Request matchmaking
③ Get game server IP
④ Connect to server
⑤ Pull down assets
⑥ Other players join

Region
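
Steps ① to ③ of this flow can be sketched as a tiny in-memory matchmaker. Everything here is illustrative: the class names, the token format, the fill-the-fullest-server rule, and the documentation-range IP addresses are assumptions, not details from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GameServer:
    ip: str
    capacity: int
    players: List[str] = field(default_factory=list)

    @property
    def free_slots(self) -> int:
        return self.capacity - len(self.players)


class Matchmaker:
    def __init__(self, servers: List[GameServer]):
        self._servers = servers
        self._sessions: Dict[str, str] = {}   # token -> player ID

    def login(self, player_id: str) -> str:
        """Step 1: log in via the API and receive a session token."""
        token = f"token-{player_id}"          # placeholder; real APIs issue signed tokens
        self._sessions[token] = player_id
        return token

    def request_match(self, token: str) -> Optional[str]:
        """Steps 2 and 3: request matchmaking and get a game server IP."""
        player_id = self._sessions[token]
        candidates = [s for s in self._servers if s.free_slots > 0]
        if not candidates:
            return None                       # caller would start another game server
        # Fill the fullest server first so matches reach capacity quickly.
        server = min(candidates, key=lambda s: s.free_slots)
        server.players.append(player_id)
        return server.ip


fleet = [GameServer("203.0.113.10", capacity=2), GameServer("203.0.113.11", capacity=2)]
matchmaker = Matchmaker(fleet)
assigned = [matchmaker.request_match(matchmaker.login(p)) for p in ["joe", "anna", "bob"]]
```

The client then connects directly to the returned address (step ④), so the API tier only brokers the match and is not on the gameplay path.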

Page 75: Cloud Gaming Architectures

Multiregion game servers

Region A
Region B

Page 76: Cloud Gaming Architectures

Related sessions

GAM403 From 0 to 60 Million Player Hours in 400 Billion Star Systems

GAM404 Evolve: Hunting Monsters in a Low Latency Multiplayer Game on Amazon EC2

GAM407 Quiplash: The Multiscreen, Multidevice, Multiplayer Game for 10,000

Page 77: Cloud Gaming Architectures

Wrap it up already

• Use Auto Scaling to save money
• Amazon CloudFront + Amazon S3 for download and upload
• Painful DIY DB? No! Use Amazon DynamoDB
• Dynamically manage game servers using the APIs
  • Even multiregion!

Page 78: Cloud Gaming Architectures

Thank you!

Mark Bate
@markbate
Steam: thevegemiteavenger
PSN: nuclears0ck-de