Transcript
Page 1: Capacity Planning For Your Growing MongoDB Cluster

Solution Architect, MongoDB

Sam Weaver

Capacity Planning:

Deploying MongoDB

#mongodb

Page 2: Capacity Planning For Your Growing MongoDB Cluster

Capacity Planning

• Why is it important?

• What is it?

• When is it important?

• How is it actually done?

Page 3: Capacity Planning For Your Growing MongoDB Cluster

Why?

Page 4: Capacity Planning For Your Growing MongoDB Cluster

Prepping for launch

• You’ve written your application

• The code is good

• You’re looking to launch soon

• How do I deploy?

Page 5: Capacity Planning For Your Growing MongoDB Cluster

Questions to ask yourself

• Instance types

– Standalone?

– Replica set?

– Sharded?

• Architecture

• Size of machines

– Machines cost money

– Size of machines may affect instance types required

Page 6: Capacity Planning For Your Growing MongoDB Cluster

Why does it matter?

• What are the consequences of not planning?

Page 7: Capacity Planning For Your Growing MongoDB Cluster

Why

• Once we launch, we don't want avoidable downtime due to poorly selected hardware

• As our success grows we want to stay in front of the demand curve

• We want to meet business' and users' expectations

• We want to keep our jobs

Page 8: Capacity Planning For Your Growing MongoDB Cluster

What?

Page 9: Capacity Planning For Your Growing MongoDB Cluster

What is Capacity Planning?

Requirements

Resources

Page 10: Capacity Planning For Your Growing MongoDB Cluster

Requirements

• Availability

– Planning for a crash

– Planning for binary upgrades

– Planning for hardware maintenance

• Throughput

– X many users at any one time

– Bulk loads vs. random access

• Responsiveness

– SLA of x ms per page load

– Amazon, Google study

Page 11: Capacity Planning For Your Growing MongoDB Cluster

How?

Page 12: Capacity Planning For Your Growing MongoDB Cluster

CPU

• Non-indexed Data

• Sorting

• Aggregation

– Map/Reduce

– Framework

• Data

– Fields

– Nesting

– Arrays/Embedded-Docs

Page 13: Capacity Planning For Your Growing MongoDB Cluster

Network

• Latency

– WriteConcern

– ReadPreference

– Batching

• Throughput

– Update/Write Patterns

– Reads/Queries

Page 14: Capacity Planning For Your Growing MongoDB Cluster

Understand memory usage for MongoDB

• Data & indexes are memory mapped into virtual address space

• Data accessed is paged into RAM

• OS evicts least recently used page

• More frequently used pages stay in RAM

Page 15: Capacity Planning For Your Growing MongoDB Cluster

Identify your working set

Number of active users on the system at any one time

Number of distinct pages accessed per second

=

Page 16: Capacity Planning For Your Growing MongoDB Cluster

Working Set

Page 17: Capacity Planning For Your Growing MongoDB Cluster

Working Set

4 distinct pages per second

RAM

Disk

Page 18: Capacity Planning For Your Growing MongoDB Cluster

Working Set

4 distinct pages per second

Page 19: Capacity Planning For Your Growing MongoDB Cluster

Working Set

4 distinct pages per second

Worst case 4 disk accesses

Page 20: Capacity Planning For Your Growing MongoDB Cluster

Working Set

6 distinct pages per second

Page 21: Capacity Planning For Your Growing MongoDB Cluster

Working Set

6 distinct pages per second

Page 22: Capacity Planning For Your Growing MongoDB Cluster

Working Set

6 distinct pages per second

Page 23: Capacity Planning For Your Growing MongoDB Cluster

Working Set

6 distinct pages per second

Worst case disk access on every op
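
The two cases above can be reproduced with a small simulation. This is a minimal sketch, not how MongoDB or the OS is implemented: it assumes RAM holds exactly 4 pages (RAM_PAGES is an assumed value, the slides do not state it), strict least-recently-used eviction, and pages accessed in a repeating cycle, which is the worst case for LRU. The function names are hypothetical.

```python
from collections import OrderedDict


def count_page_faults(accesses, ram_pages):
    """Count disk reads for a page access sequence under LRU eviction."""
    cache = OrderedDict()  # page ids, ordered from least to most recently used
    faults = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1  # miss: the page has to be read from disk
            if len(cache) >= ram_pages:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults


def cyclic_accesses(distinct_pages, rounds):
    """Touch pages 0..distinct_pages-1 in order, repeated `rounds` times."""
    return [page for _ in range(rounds) for page in range(distinct_pages)]


RAM_PAGES = 4  # assumption: RAM holds 4 pages, as in the diagram's small example

# Working set of 4 pages fits in RAM: only the first pass touches disk.
fits = count_page_faults(cyclic_accesses(4, rounds=10), RAM_PAGES)

# Working set of 6 pages does not fit: each access evicts a page needed soon.
spills = count_page_faults(cyclic_accesses(6, rounds=10), RAM_PAGES)

print(f"4 distinct pages: {fits} disk accesses in 40 operations")
print(f"6 distinct pages: {spills} disk accesses in 60 operations")
```

With a working set that fits, disk is touched once per page and never again. Once the working set exceeds RAM, a cyclic pattern turns every operation into a disk access.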

Page 24: Capacity Planning For Your Growing MongoDB Cluster

Memory & Storage

MOPs

PFs

Page 25: Capacity Planning For Your Growing MongoDB Cluster

Memory

• Working set affected by

–Sorting

–Aggregation

–Connections


Page 26: Capacity Planning For Your Growing MongoDB Cluster

Working Set Estimator

"workingSet" : {
    "note" : "thisIsAnEstimate",
    "pagesInMemory" : <num>,
    "computationTimeMicros" : <num>,
    "overSeconds" : <num>
}

Number of unique pages the server needed in the last 15 minutes. Use this to see if you are outgrowing RAM.
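
To turn the estimator output into something comparable with installed RAM, the page count can be converted to bytes. This is a rough sketch under stated assumptions: the 4 KiB page size is an assumption about the host OS, the sample values are made up for illustration, and the helper names are hypothetical. It works on a plain dict with the shape shown above, however that document is obtained.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB OS pages; check the actual host


def working_set_bytes(working_set_doc, page_size=PAGE_SIZE_BYTES):
    """Approximate working set size from the estimator's page count."""
    return working_set_doc["pagesInMemory"] * page_size


def ram_fraction_used(working_set_doc, ram_bytes, page_size=PAGE_SIZE_BYTES):
    """Fraction of RAM the estimated working set would occupy."""
    return working_set_bytes(working_set_doc, page_size) / ram_bytes


# Hypothetical sample values, not taken from a real server.
sample = {
    "note": "thisIsAnEstimate",
    "pagesInMemory": 1_500_000,
    "computationTimeMicros": 12_000,
    "overSeconds": 900,
}

ram = 8 * 1024**3  # assumed 8 GiB machine
fraction = ram_fraction_used(sample, ram)
print(f"estimated working set: {working_set_bytes(sample) / 1024**3:.2f} GiB")
print(f"share of RAM: {fraction:.0%}")
```

Tracking this fraction over time gives an early signal: if it keeps climbing toward 100%, the working set is about to outgrow RAM.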

Page 27: Capacity Planning For Your Growing MongoDB Cluster

Storage

• Different storage types have different IOPs

– Spinning disk

• 7,200 rpm SATA: 75-100 IOPs

– SSD

• 9,000-120,000 IOPs

– EBS

• 100 IOPs

– Provisioned EBS

• 2,000 IOPs

• Work out how much data you need to write per time frame.

• MongoDB writes to a journal, and data files are flushed to disk.

• Replication adds oplog considerations
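
A back-of-the-envelope comparison of the storage options above might look like the following. This is only a sketch: the per-device figures are the lower bounds quoted on this slide, the required rate of 1,500 IOPs is a hypothetical workload, and the calculation ignores RAID layout, journal and oplog overhead, and caching. The function name is hypothetical.

```python
import math

# Lower-bound IOPs per device, taken from the figures on the slide.
DEVICE_IOPS = {
    "spinning_sata": 75,
    "ssd": 9_000,
    "ebs": 100,
    "provisioned_ebs": 2_000,
}


def devices_needed(required_iops, device_iops):
    """Minimum number of devices whose combined IOPs cover the requirement."""
    return math.ceil(required_iops / device_iops)


required = 1_500  # hypothetical: disk operations per second at peak

for name, iops in DEVICE_IOPS.items():
    print(f"{name}: {devices_needed(required, iops)} device(s)")
```

Using the lower bound of each range keeps the estimate conservative; measured numbers from the actual hardware should replace these as soon as they are available.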

Page 28: Capacity Planning For Your Growing MongoDB Cluster

Using this information

• Plan hardware to hold the working set + indexes

• Allow room to grow

• If working set is larger than RAM and you can’t reasonably add more resources, then shard

– Don’t shard too early

– Lots of little instances vs. a few big instances

• Think about architecture

– Local disk or central storage

– Don’t be surprised by x copies of the data with x nodes
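
The sizing guidance above can be sketched as two small calculations. The assumptions are loud ones: the 30% headroom for growth is an arbitrary illustrative value, the input sizes are hypothetical, the working set is assumed to spread evenly across shards, and both function names are made up for this example.

```python
import math


def shards_needed(working_set_gb, index_gb, ram_per_node_gb, headroom=0.3):
    """Shards required so working set + indexes fit in RAM with room to grow.

    headroom is the share of RAM kept free for growth (assumed 30% here).
    """
    usable_gb = ram_per_node_gb * (1 - headroom)
    return max(1, math.ceil((working_set_gb + index_gb) / usable_gb))


def total_storage_gb(data_gb, nodes_per_replica_set):
    """Every replica set member holds a full copy of the data."""
    return data_gb * nodes_per_replica_set


# Hypothetical inputs for illustration only.
print(shards_needed(working_set_gb=30, index_gb=10, ram_per_node_gb=64))  # fits on one node
print(shards_needed(working_set_gb=40, index_gb=20, ram_per_node_gb=64))  # needs sharding
print(total_storage_gb(data_gb=500, nodes_per_replica_set=3))
```

A result of 1 means a single replica set is enough, which matches the advice not to shard too early. The storage figure makes the "x copies with x nodes" point concrete: three members means three times the disk.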

Page 29: Capacity Planning For Your Growing MongoDB Cluster

Development to production

• Don’t be surprised by:

– More data = more/larger indexes

– Indexes make your working set bigger

• Replication adds a network overhead

• Journal has different access patterns

Page 30: Capacity Planning For Your Growing MongoDB Cluster

What tools are there to help me?

Page 31: Capacity Planning For Your Growing MongoDB Cluster

IOStat

Page 32: Capacity Planning For Your Growing MongoDB Cluster

MongoStat

Page 33: Capacity Planning For Your Growing MongoDB Cluster

MongoPerf

• Measures the amount of data written to the device per second

Page 34: Capacity Planning For Your Growing MongoDB Cluster

MongoDB Management Service

• Free cloud-based or on-premises management tool

– Monitoring

– Automation

– Backup

Page 35: Capacity Planning For Your Growing MongoDB Cluster
Page 36: Capacity Planning For Your Growing MongoDB Cluster

Scaling for capacity – MMS automation

Page 37: Capacity Planning For Your Growing MongoDB Cluster

When?

Page 38: Capacity Planning For Your Growing MongoDB Cluster

Capacity Planning: When

• When?

– Before it's too late!

– Iterative process

Start → Launch → Version 2

Page 39: Capacity Planning For Your Growing MongoDB Cluster

Repeat (continuously)

• Repeat Testing

• Repeat Evaluations

• Repeat Deployment

Page 40: Capacity Planning For Your Growing MongoDB Cluster

What is failure?

• We have failed at Capacity Planning when our resources don’t meet our requirements

• Because our requirements can have many dimensions, we may exceed our requirements in one characteristic but not meet them in another

• This means that we can spend many $$$ and still fail!
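
The multi-dimensional nature of failure can be made concrete with a per-dimension check. This is an illustrative sketch only: the dimension names and all numbers are hypothetical, and a real plan would track more characteristics than these three.

```python
def unmet_requirements(required, provided):
    """Return the dimensions where provided resources fall short."""
    return [
        name
        for name, need in required.items()
        if provided.get(name, 0) < need
    ]


# Hypothetical plan: generous on RAM and network, short on disk throughput.
required = {"ram_gb": 64, "disk_iops": 2_000, "network_mbps": 500}
provided = {"ram_gb": 256, "disk_iops": 100, "network_mbps": 1_000}

unmet = unmet_requirements(required, provided)
print("unmet:", unmet)
```

Here four times the required RAM was purchased, yet the plan still fails because a single dimension, disk IOPs, is under-provisioned.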

Page 41: Capacity Planning For Your Growing MongoDB Cluster

Models

• Load/Users

– Response Time/TTFB

• System Performance

– Peak Usage

– Min Usage

Page 42: Capacity Planning For Your Growing MongoDB Cluster

Starter Questions

• What is the working set?

– How does that equate to memory

– How much disk access will that require

• How efficient are the queries?

• What is the rate of data change?

• How big are the highs and lows?

Page 43: Capacity Planning For Your Growing MongoDB Cluster

Questions?

Page 44: Capacity Planning For Your Growing MongoDB Cluster

Solution Architect, MongoDB

Sam Weaver

Thank You

#mongodb

