Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger

Zero to 1 Billion Records Kiril Savino @holacrat

Upload: mongodb

Posted on 26-Jun-2015


TRANSCRIPT

Page 1: Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger

Zero to 1 Billion Records

Kiril Savino @holacrat

Page 2:

GC.com/about/product-team

Page 3:

• have a sense of humor

• know what use cases work best

• remember that databases are hard

• don’t understate the difficulty in scaling up

Page 4:

• 1,480,808,857 events

• 8 terabytes of primary data

• 35 nodes

• 420GB RAM on primaries

• 21TB SSD storage

• 14TB EBS storage

• 120,000 ops/s

Page 5:

• Model

• Scale

• Grow

• Extend


Page 6:

Model

Page 7:

November 2009 — MongoDB 1.2

• More indexes per collection

• Faster index creation

• Map/Reduce

• Stored JavaScript functions

• Configurable fsync time

• Several small features and fixes


{.}

Page 8:

{.?!?.}

Page 9:

[Diagram: REST API, Decoding/Unmarshalling, business logic, Django ORM, MySQL; a document {.} is shown inside the stack]

Page 10:

[Diagram: REST API, Decoding/Unmarshalling, business logic, Django ORM, MySQL; the document {.} is shown at the API]

Page 11:

Inning, Outs, Balls, Strikes, Pitcher, Batter

Page 12:

Inning, Outs, Balls, Strikes, Pitcher, Batter

Period, Minute, Location, Shooter, Rebounder, Assist

Page 13:

[play]

[participant][role]

[sport][play_property]

Page 14:

[play]

[participant][role]

[sport][play_property]

Page 15:

{
  _id: ObjectId(),
  code: "1B",
  participants: [
    {player_id: ObjectId(), roles: ["batter", "out"]},
    {player_id: ObjectId(), roles: ["pitcher"]}
  ],
  situation: {outs: 1, balls: 2, strikes: 0},
  properties: {location: [0.45, 0.721]}
}

Page 16:

{
  _id: ObjectId(),
  code: "shot",
  participants: [
    {player_id: ObjectId(), roles: ["shooter"]},
    {player_id: ObjectId(), roles: ["rebounder"]}
  ],
  situation: {period: 1, time: "5:29"},
  properties: {location: [0.45, 0.721]}
}
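The baseball and basketball plays share one document shape, so the same code can read both. Here is a minimal Python sketch of that idea. It assumes plain dicts stand in for stored documents and short strings such as "p1" stand in for ObjectId values; `players_with_role` is a hypothetical helper, not something shown in the talk.

```python
# Illustrative stand-ins for the two play documents; "p1".."p4" replace ObjectIds.
baseball_play = {
    "code": "1B",
    "participants": [
        {"player_id": "p1", "roles": ["batter", "out"]},
        {"player_id": "p2", "roles": ["pitcher"]},
    ],
    "situation": {"outs": 1, "balls": 2, "strikes": 0},
    "properties": {"location": [0.45, 0.721]},
}

basketball_play = {
    "code": "shot",
    "participants": [
        {"player_id": "p3", "roles": ["shooter"]},
        {"player_id": "p4", "roles": ["rebounder"]},
    ],
    "situation": {"period": 1, "time": "5:29"},
    "properties": {"location": [0.45, 0.721]},
}


def players_with_role(play, role):
    """Return the ids of every participant holding `role` in this play."""
    return [p["player_id"] for p in play["participants"] if role in p["roles"]]


# The same function works for either sport because the shape is shared.
pitchers = players_with_role(baseball_play, "pitcher")
shooters = players_with_role(basketball_play, "shooter")
```

Only the contents of `situation`, `properties`, and the role names vary by sport; the traversal code does not.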

Page 17:

[Diagram: REST API, Decoding/Unmarshalling, business logic, Django ORM, MongoDB; documents {.} are shown at the API and at MongoDB]

Page 18:

[Diagram: REST API, Decoding/Unmarshalling, business logic, Django ORM, MongoDB; documents {.} are shown at the API and at MongoDB]

👏

Page 19:

Page 20:

Modeling data in MongoDB

• JSON won the internet

• Don’t write your own JSON storage engine

• Flexible schemas promote app simplicity

• Validation is your responsibility

• Invest in schema design early
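With a flexible schema, checks that a relational database would enforce move into application code. The following is a small sketch of what such a check might look like for the play documents. The field rules in `REQUIRED_FIELDS` and the function `validate_play` are assumptions inferred from the example documents, not the validation GameChanger actually runs.

```python
# Hypothetical rules inferred from the example play documents.
REQUIRED_FIELDS = {
    "code": str,
    "participants": list,
    "situation": dict,
    "properties": dict,
}


def validate_play(doc):
    """Return a list of problems; an empty list means the document passed."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"{field} must be a {expected_type.__name__}")

    # Only inspect participants when the field has the expected type.
    participants = doc.get("participants")
    if isinstance(participants, list):
        for index, participant in enumerate(participants):
            if not isinstance(participant, dict) or "player_id" not in participant:
                errors.append(f"participants[{index}] needs a player_id")
                continue
            roles = participant.get("roles")
            if not isinstance(roles, list) or not roles:
                errors.append(f"participants[{index}] needs at least one role")
    return errors
```

Running a check like this before every write keeps malformed documents out of the collection, since the database itself will accept any shape.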

Page 21:

Scale

Page 22:

Page 23:

Page 24:

Page 25:

$$$

Page 26:

$$$

😱

Page 27:

[Chart: System Latency versus User Load]

Page 28:

[Chart: System Latency versus User Load]

Page 29:

[Chart: System Latency versus User Load]

Page 30:

Scaling is the process of decoupling load from latency.

Page 31:

Latency comes from

• Writing data to your database

• Reading data from your database

• Aggregating data from multiple locations

• Running complex calculations

Page 32:

{.}

This is a document.

Page 33:

[Diagram: documents {.} moving between API, MongoDB, and Browser]

Page 34:

[Diagram: documents {.} moving between API, MongoDB, and Browser]

Page 35:

[Diagram: documents {.} moving between API, MongoDB, and Browser, with arithmetic (+/-*) added]

Page 36:

[Chart: System Latency versus Read Load]

Page 37:

[Diagram: documents {.} moving between API, MongoDB, and Browser]

Page 38:

[Diagram: documents {.} moving between API, MongoDB, and Browser, with arithmetic (+/-*) added]

Page 39:

[Chart: System Latency versus Write Load]

Page 40:

[Diagram: documents {.} moving between API, MongoDB, and Browser, with arithmetic (+/-*) labeled "Background"]

Page 41:

[Diagram: documents {.} moving between API, MongoDB, and Browser, with arithmetic (+/-*) labeled "Background"]

Page 42:

[Chart: System Latency versus User Load]

Page 43:

[Diagram: three documents {.}{.}{.}]

Page 44:

[Diagram: three documents {.}{.}{.} bracketed into one document {.}]

Page 45:

[Diagram: three documents {.}{.}{.} bracketed into one document {.}]

Page 46:

Page 47:

Scaling data access

• Decouple load from latency

• Queries are expensive

• Aggregation is expensive

• Do calculation in the background

• Serve content from single* documents
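One way to read these points together: move aggregation out of the request path and into a background job that writes one ready-to-serve document per page. Below is a hedged Python sketch in which plain dicts stand in for collections. The names `build_player_summaries` and `serve_player_page`, and the summary fields, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_player_summaries(plays):
    """Background job: fold raw plays into one summary document per player."""
    summaries = defaultdict(lambda: {"plays": 0, "roles": defaultdict(int)})
    for play in plays:
        for participant in play["participants"]:
            summary = summaries[participant["player_id"]]
            summary["plays"] += 1
            for role in participant["roles"]:
                summary["roles"][role] += 1
    # Convert to plain dicts so each summary could be stored as a single document.
    return {
        player_id: {"_id": player_id, "plays": s["plays"], "roles": dict(s["roles"])}
        for player_id, s in summaries.items()
    }


def serve_player_page(summary_store, player_id):
    """Request path: one key lookup, no query planning and no aggregation."""
    return summary_store.get(player_id)
```

The cost of `build_player_summaries` grows with the number of plays, but it runs off the request path; `serve_player_page` stays a single lookup no matter how many plays exist, which is the decoupling of load from latency the deck describes.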

Page 48:

Grow

Page 49:

Page 50:

Page 51:

Page 52:

{.}

Page 53:

{.}

Page 54:

{.}

Page 55:

{.}

Page 56:

Page 57:

{.} {$addToSet: {a: 2}}

Page 58:

{.} {$addToSet: {a: 2}}

{.} {v: 2}, {$set: {v: 3}}
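The two fragments on this slide suggest two ways to update a document safely under concurrency: an atomic operator such as `$addToSet`, and a version-checked update that matches on `{v: 2}` and sets `{v: 3}`. The sketch below imitates both in plain Python against an in-memory dict. It is a single-threaded simulation under assumed names (`add_to_set`, `compare_and_set`, `update_with_retry`); in MongoDB the match and the write happen atomically inside one update on one document.

```python
class VersionConflict(Exception):
    """Raised when an update keeps losing the race to other writers."""


def add_to_set(doc, field, value):
    """Imitate $addToSet: append `value` only if it is not already present."""
    values = doc.setdefault(field, [])
    if value not in values:
        values.append(value)


def compare_and_set(store, doc_id, expected_version, changes):
    """Imitate update({_id, v: expected}, {$set: {..., v: expected + 1}})."""
    doc = store[doc_id]
    if doc["v"] != expected_version:
        return False  # someone else wrote first; nothing is modified
    doc.update(changes)
    doc["v"] = expected_version + 1
    return True


def update_with_retry(store, doc_id, mutate, max_attempts=5):
    """Optimistic locking loop: read, compute, write only if still unchanged."""
    for _ in range(max_attempts):
        snapshot = dict(store[doc_id])
        changes = mutate(snapshot)
        if compare_and_set(store, doc_id, snapshot["v"], changes):
            return store[doc_id]
    raise VersionConflict(f"gave up on {doc_id} after {max_attempts} attempts")
```

A stale writer that still believes the version is 2 gets `False` back from `compare_and_set` and must re-read before trying again, so no update is silently overwritten.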

Page 59:

{.}

Page 60:

Page 61:

{.} {.}

Page 62:

[Diagram: documents {a}, {b}, and {c} bracketed into one document {abc}]

Page 63:

{.}

Page 64:

{.}{.}

Page 65:

{.} {.}{.}

Page 66:

{.} {.}{.}

Page 67:

{.} {.}{.}

Page 68:

{.} {.}{.}

Page 69:

{.} {.}{.}

Page 70:

{.} {.} {.}

Page 71:

[Diagram: a "To Propagate" queue holding seven <id> entries]

Page 72:

[Diagram: the "To Propagate" queue of <id> entries, next to a "Propagating…" stage]

Page 73:

[Diagram: the "To Propagate" queue of <id> entries; one <id> in the "Propagating…" stage fans out to three documents {.}{.}{.}]

Page 74:

{$} {$} {$} {$} {$}

Page 75:

Growing load

• Denormalize for constant access time

• Use MongoDB atomic operators

• Check out optimistic locking and MVCC

• Leverage external concurrency control

• Watch your oplog
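Denormalizing means a change to one record has to be copied into every document that embeds it. The earlier slides show a "To Propagate" queue of ids for this; the following is one possible Python sketch of draining such a queue. The `roster` and `name` fields, the team documents, and the function `drain` are all hypothetical examples, and a `deque` stands in for whatever external queue is actually used.

```python
from collections import deque


def drain(to_propagate, players, team_docs):
    """Copy each changed player's name into every team document embedding it."""
    propagated = []
    seen = set()
    while to_propagate:
        player_id = to_propagate.popleft()
        if player_id in seen:
            continue  # collapse duplicate ids queued by repeated edits
        seen.add(player_id)
        for team in team_docs:
            for member in team["roster"]:
                if member["player_id"] == player_id:
                    member["name"] = players[player_id]["name"]
        propagated.append(player_id)
    return propagated


# A renamed player queued twice is propagated once.
players = {"p1": {"name": "Sam R."}}
team_docs = [{"_id": "t1", "roster": [{"player_id": "p1", "name": "Sam"}]}]
queue = deque(["p1", "p1"])
done = drain(queue, players, team_docs)
```

Collapsing duplicate ids keeps the number of writes down, which matters because every propagated write also lands in the oplog.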

Page 76:

Extend

Page 77:

{.} +

Page 78:

Page 79:

Page 80:

Page 81:

So there we have it

• Design your schema to MongoDB’s strengths

• Use monolithic documents

• Don’t do (live) querying

• You can still do transactional things

• You may need to denormalize & propagate

• Think about your overall architecture


Page 82:

• have a sense of humor

• know what use cases work best

• remember that databases are hard

• don’t understate the difficulty in scaling up

@holacrat