Letters from the Trenches: Lessons Learned Taking MongoDB to Production October 17, 2013 Rick Warren [email protected]

DESCRIPTION

eHarmony moved one family of business-critical back-end applications to MongoDB several months ago. In this presentation, I discuss some of the important lessons we learned along the way about how to provision, scale, manage, and troubleshoot MongoDB.

TRANSCRIPT

Page 1: Letters from the Trenches: Lessons Learned Taking MongoDB to Production

October 17, 2013

Rick Warren [email protected]

Page 2: Traditional Internet Dating Service

Unidirectional User-Defined Criteria

Page 3: eHarmony Matching

Bidirectional User-Defined Criteria

Page 4: eHarmony Matching: 3 Parts

1. Bidirectional User-Defined Criteria

2. Research-Based Compatibility Models

3. Machine-Learned Affinity Models

Photo credits:

Magnifying glass: andercismo @ http://www.flickr.com/photos/andercismo/

Machine learning: University of Maryland Press Releases @ http://www.flickr.com/photos/umdnews/

Page 5: Application: Find Potential Matches

(Part 1: Bidirectional User-Defined Criteria)

As fast as possible:

1. Find people who meet each other’s preferences

2. Discard combinations that violate Compatibility Models
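The two steps above can be sketched in a few lines of Python. This is only an illustration of what "bidirectional" means here: the `User` fields, the age and city criteria, and the always-false `violates_compatibility` placeholder are assumptions made for the example, not eHarmony's actual data model or models.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user record: a few attributes plus the user's own criteria."""
    name: str
    age: int
    city: str
    min_age: int       # youngest partner this user will accept
    max_age: int       # oldest partner this user will accept
    wanted_city: str   # city this user wants a partner to live in


def accepts(seeker: User, candidate: User) -> bool:
    """True if the candidate satisfies the seeker's criteria (one direction)."""
    return (seeker.min_age <= candidate.age <= seeker.max_age
            and candidate.city == seeker.wanted_city)


def violates_compatibility(a: User, b: User) -> bool:
    """Placeholder for the compatibility models, which the slides do not describe."""
    return False


def potential_matches(seeker: User, pool: list[User]) -> list[User]:
    """Step 1: keep candidates where the criteria hold in BOTH directions.
    Step 2: discard pairs rejected by the compatibility check."""
    return [c for c in pool
            if c.name != seeker.name
            and accepts(seeker, c) and accepts(c, seeker)
            and not violates_compatibility(seeker, c)]


alice = User("alice", 30, "LA", 28, 40, "LA")
bob = User("bob", 35, "LA", 25, 32, "LA")     # mutual fit with alice
carl = User("carl", 38, "LA", 20, 25, "LA")   # alice accepts carl, but not vice versa

print([u.name for u in potential_matches(alice, [bob, carl])])
```

A unidirectional service would stop after `accepts(seeker, c)`; the second call is what makes each candidate's own preferences part of the query.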

Page 6: Application: Find Potential Matches

(Part 1: Bidirectional User-Defined Criteria)

• User attributes in MongoDB

– Replicated

– Sharded

• Data access pattern:

– Read-heavy

– Complex queries

• Java application

Page 7: Application: Find Potential Matches

• In full production for more than 6 months

– Following several months of limited production

– Following several months of intensive dev + testing

• No production outages

• MongoDB is no longer the thing we worry about most

Page 8: Lesson: Provision for Success

• Fit all data & indexes in memory

– MongoDB storage is implemented using memory-mapped files

– Beware under-provisioned VMs

• Minimize field names to keep data as small as possible

– “Schema-less records” == “schema repeated millions of times”

– The Morphia Java library can help with mapping
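To put a rough number on "schema repeated millions of times", here is a small back-of-the-envelope sketch. It relies on one fact about BSON (each element stores its field name as a NUL-terminated string inside every document); the field names and the document count are invented for illustration.

```python
def name_overhead_bytes(field_names, doc_count):
    """Bytes spent storing field names across a collection.

    BSON stores each element's name as a NUL-terminated string inside every
    document, so a name costs len(name) + 1 bytes per document.
    """
    per_doc = sum(len(n.encode("utf-8")) + 1 for n in field_names)
    return per_doc * doc_count


# Hypothetical field names and document count, for illustration only.
LONG_NAMES = ["relationshipStatus", "preferredMinimumAge", "preferredMaximumAge"]
SHORT_NAMES = ["rs", "mn", "mx"]
DOCS = 10_000_000

long_cost = name_overhead_bytes(LONG_NAMES, DOCS)
short_cost = name_overhead_bytes(SHORT_NAMES, DOCS)
print(f"long names:  {long_cost / 1e6:.0f} MB")
print(f"short names: {short_cost / 1e6:.0f} MB")
print(f"saved:       {(long_cost - short_cost) / 1e6:.0f} MB")
```

The estimate covers field names only, not values or indexes. On the Java side, a mapper such as Morphia (for example via its `@Property` annotation) is one way to keep readable names in code while storing the short ones.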

Page 9: Lesson: Provision for Success

[Diagram: two shards, each a replica set (RS) with one primary and two secondaries]

Scale write ops & data volume by adding shards

Scale read ops by adding secondaries

Page 10: Lesson: Be Ready to Tinker

• Many processes:

– mongod on each node, primary or secondary

– 2 MMS agents

– Plus, if sharding:

• mongos for each app instance

• 3 config servers

• …Each configured separately & differently

– Configuration file

– Manual commands to set up

• Less likely to have DBA support

– …and relational Best Practices may not transfer

Use Puppet, Chef, or similar

– Helps with config files, command-line arguments

– Insufficient for adding secondaries, configuring indexes, etc.

If scripting, use a real client driver, not the mongo shell

– The shell doesn’t handle output or errors consistently

– Can’t wait in JavaScript

Train your DB/Ops team(s)

– And expect to do more yourself
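As one illustration of why a real driver helps with scripting, the sketch below waits for a newly added replica-set member to reach a given state and fails loudly on timeout. The helper name `wait_for_state` and the host name are made up; the only MongoDB-specific assumption is the shape of a `replSetGetStatus` reply (a `members` list whose entries carry `name` and `stateStr`). The status source is passed in as a callable so the sketch runs without a cluster.

```python
import time


def wait_for_state(get_status, member, wanted="SECONDARY",
                   timeout_s=60.0, poll_s=1.0, sleep=time.sleep):
    """Poll until `member` reports state `wanted`; raise on timeout.

    `get_status` is any callable returning a replSetGetStatus-style dict,
    i.e. {"members": [{"name": ..., "stateStr": ...}, ...]}.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        states = {m["name"]: m["stateStr"] for m in get_status()["members"]}
        if states.get(member) == wanted:
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{member} is {states.get(member)!r}, wanted {wanted!r}")
        sleep(poll_s)


# Stand-in for a real driver call so the sketch runs without a cluster.
# With PyMongo this could be: lambda: client.admin.command("replSetGetStatus")
_replies = iter(["STARTUP2", "RECOVERING", "SECONDARY"])


def fake_status():
    return {"members": [{"name": "db3:27017", "stateStr": next(_replies)}]}


print(wait_for_state(fake_status, "db3:27017", poll_s=0.01))
```

Because the helper raises an ordinary exception, a calling script or a Puppet/Chef wrapper can stop at the failed step instead of parsing shell output.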

Page 11: Lesson: Shadow Mode Is Your Friend

• Test with real production data, conditions, and queries

• Measure everything (MMS is a good start, but insufficient)

• Kill mongod instances to verify resiliency

[Diagram labels: Real Events & Requests; Real Application; “Shadow” Application; X]

Image credit: Primary school enrollment, Armenia: http://data.worldbank.org/country/armenia
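A minimal sketch of the shadow idea, assuming a request/response style application: every real request is also replayed against the shadow implementation, whose result is measured and compared but never returned to the caller. The function and handler names are hypothetical, and a production version would most likely run the shadow call asynchronously so it cannot slow the real path.

```python
import time


def handle_with_shadow(request, real_handler, shadow_handler, log):
    """Serve `request` from the real handler; replay it on the shadow too.

    The shadow's result is only recorded, never returned, and a shadow
    failure must not affect the real response.
    """
    real_result = real_handler(request)
    start = time.perf_counter()
    try:
        shadow_result = shadow_handler(request)
        log.append({"request": request,
                    "agrees": shadow_result == real_result,
                    "seconds": time.perf_counter() - start,
                    "error": None})
    except Exception as exc:  # the shadow may be broken; finding that out is the point
        log.append({"request": request, "agrees": False,
                    "seconds": time.perf_counter() - start,
                    "error": repr(exc)})
    return real_result


# Hypothetical handlers standing in for the existing system and the new one.
def real(req):
    return sorted(req)


def shadow(req):
    if not req:
        raise ValueError("empty request")
    return sorted(req)


log = []
print(handle_with_shadow([3, 1, 2], real, shadow, log))
print(handle_with_shadow([], real, shadow, log))
print([(e["agrees"], e["error"]) for e in log])
```

The log entries (agreement, latency, errors) are the kind of measurements the slide asks for beyond what a monitoring service reports.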

Page 12: Lesson: Be Ready to Restore Your Data

• Schemas will change

• Shard key(s) will change

– More on this later…

• You’ll experience MongoDB bugs

Maintain 2nd copy in another format

– Backing source of truth?

– Backup in standard format?

– Second cluster with different version of MongoDB?

Increment DB name with each reload

Automate reload process, and use it

Image credit: http://tutorialphotoshopcs-putradom.blogspot.com/2012/11/create-dramatic-meteor-and-burning-city.html
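One way to read "Increment DB name with each reload" is a numeric suffix on the database name that an automated reload job bumps each time. The naming scheme below (`name_N`) is an assumption for illustration; the slides do not say what convention was used.

```python
import re


def next_db_name(current):
    """Return the name for the next reload: 'matches_7' -> 'matches_8'.

    A name without a numeric suffix gets '_1' appended.
    """
    m = re.fullmatch(r"(.*?)_(\d+)", current)
    if m is None:
        return f"{current}_1"
    return f"{m.group(1)}_{int(m.group(2)) + 1}"


print(next_db_name("matches_7"))
print(next_db_name("user_attrs_12"))
print(next_db_name("matches"))
```

A reload job could load the second copy of the data into the new name, point the application at it, and keep the previous database around until the new one is verified.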

Page 13: Lesson: Pick a Good Shard Key

1. Distribute Data Volume Evenly

– This is what auto-balancing does for you.

2. Multiply Query Performance

– Isolate queries to 1 shard to multiply read capacity by # of shards.

3. Distribute Workload Evenly

– Conflicts with above!
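The tension between goals 2 and 3 can be seen in a toy routing model. Here a query that contains the shard key is sent to one shard, and any other query is sent to every shard. The two-shard layout, the `region` key, and the query mix are all invented for the example.

```python
from collections import Counter

SHARDS = ["shard1", "shard2"]


def shard_for(region):
    """Hypothetical range partitioning on a `region` shard key."""
    return "shard1" if region < "m" else "shard2"


def shards_hit(query):
    """A query containing the shard key targets one shard; any other query
    must be sent to every shard (scatter-gather)."""
    if "region" in query:
        return [shard_for(query["region"])]
    return list(SHARDS)


def load_per_shard(queries):
    """Count how many queries each shard has to serve."""
    load = Counter({s: 0 for s in SHARDS})
    for q in queries:
        load.update(shards_hit(q))
    return dict(load)


# Even if both regions hold the same amount of data (goal 1), one region may be
# far more popular with searchers, so targeted queries (goal 2) pile up on one
# shard and goal 3 suffers.
targeted = [{"region": "la"}] * 90 + [{"region": "nyc"}] * 10
scattered = [{"age": 30}] * 100

print(load_per_shard(targeted))
print(load_per_shard(scattered))
```

Targeted queries cost 100 shard operations in total but land unevenly; scatter-gather queries are perfectly even but cost 200, which is why neither extreme is automatically right.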

Page 14: Lesson: Pick a Good Shard Key

[Diagram labels: mongos; Shard 1; Shard 2]

Image credits:

Jessica Rabbit: http://disney.wikia.com/wiki/Jessica_Rabbit

Steve Urkel: http://celebratingtvandfilmgeeks.wordpress.com/2010/04/25/steve-urkel-the-ultimate-90s-nerd-and-life-lessons/4-steve-urkel/

Page 15: Lesson: Pick a Good Shard Key

DO These Things

• Use fields appearing in every query

• Choose combo that finely partitions data

• Measure relative load across shards

– Consider adding secondaries to loaded shard(s) ONLY

BEWARE These Things

• Including serial numbers (or similar)

• Hashing fields when reads might be a problem

• Mutable fields in shard key: changing one means removing and re-adding the document
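The first two warnings can be illustrated with a toy model, under stated assumptions: four shards, range chunks with made-up split points, and `md5` as a stand-in hash (not a claim about MongoDB's internal hash function). With range partitioning, ever-increasing serial numbers all land in the last chunk; hashing spreads them out, but then a read over a range of serials touches many shards.

```python
import bisect
import hashlib

N_SHARDS = 4
SPLIT_POINTS = [1000, 2000, 3000]  # hypothetical: chunk i lives on shard i


def range_shard(serial):
    """Range partitioning: anything past the last split point goes to the
    final chunk, so new, ever-larger serials all hit the same shard."""
    return bisect.bisect_right(SPLIT_POINTS, serial)


def hashed_shard(serial):
    """Hash partitioning: consecutive serials scatter across shards."""
    digest = hashlib.md5(str(serial).encode()).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS


new_inserts = range(5000, 6000)   # freshly issued serial numbers
range_read = range(5000, 5050)    # a query for 50 consecutive serials

print("insert shards, range key: ", sorted({range_shard(s) for s in new_inserts}))
print("insert shards, hashed key:", sorted({hashed_shard(s) for s in new_inserts}))
print("read shards, range key:   ", sorted({range_shard(s) for s in range_read}))
print("read shards, hashed key:  ", sorted({hashed_shard(s) for s in range_read}))
```

The model ignores chunk splitting and migration, which would eventually move data around but would not stop new inserts from concentrating on whichever shard holds the top of the range.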

Page 16: Summary

1. Provision for Success

2. Be Ready to Tinker

3. Shadow Mode Is Your Friend

4. Be Ready to Restore Your Data

5. Pick a Good Shard Key

Page 17: We’re Hiring

http://www.eharmony.com/about/careers

[email protected]