MongoDB Indexing Constraints & Creative Schemas Chris Winslett [email protected] Thursday, June 27, 13


Page 1: MongoDB Indexing Constraints and Creative Schemas

MongoDB Indexing Constraints & Creative Schemas

Chris Winslett
[email protected]

Thursday, June 27, 13

Page 2: MongoDB Indexing Constraints and Creative Schemas

My Background

• For the past year, I’ve looked at MongoDB logs at least once every day.

• We routinely answer the question “how can I improve performance?”

Page 3: MongoDB Indexing Constraints and Creative Schemas

Who’s this talk for?

• New to MongoDB

• Seeing some slow operations, and need help debugging

• Running database operations on a sizeable deploy

• I have a MongoDB deployment, and I’ve hit a performance wall

Page 4: MongoDB Indexing Constraints and Creative Schemas

What should you learn?

Know where to look on a running MongoDB to uncover slowness, and discuss solutions.

MongoDB has performance “patterns”.

How to think about improving performance.

And . . .

Page 5: MongoDB Indexing Constraints and Creative Schemas

Schema Design

Design with the end in mind.

Page 6: MongoDB Indexing Constraints and Creative Schemas

MongoDB Indexing Constraints

• One index per query *

• One range operator per query ($)

• Range operator must be last field in index

• Using RAM well

* except $or, but the sin with $or is appending a sort to the query.
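A minimal sketch of how the constraints above translate into a key order for a compound index: equality fields first, then sort fields, then the range field last. `suggestIndexKeys` and `isRangeCondition` are hypothetical helpers written for this illustration, not part of MongoDB, and the list of operators treated as "range" is an assumption.

```javascript
// Hypothetical helper: order index keys as equality fields, then sort
// fields, then range fields, so the range operator lands last.
const RANGE_OPS = ["$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$ne"];

// True when the condition is an operator document such as { $gt: 5 }.
function isRangeCondition(value) {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return false;
  }
  return Object.keys(value).some((op) => RANGE_OPS.includes(op));
}

function suggestIndexKeys(filter, sort = {}) {
  const keys = {};
  const rangeFields = [];
  for (const [field, condition] of Object.entries(filter)) {
    if (isRangeCondition(condition)) {
      rangeFields.push(field); // held back until the end
    } else {
      keys[field] = 1; // equality match: goes first
    }
  }
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in keys)) keys[field] = direction; // sort fields next
  }
  for (const field of rangeFields) {
    if (!(field in keys)) keys[field] = 1; // range field last
  }
  return keys;
}

console.log(suggestIndexKeys({ publisher: "US Weekly" }, { publishedAt: -1 }));
```

The returned object has the shape that `ensureIndex` takes as its first argument. Whether the suggested index is worth building still depends on the data and on how often the query runs.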

Page 7: MongoDB Indexing Constraints and Creative Schemas

The Tools

• `mongostat` for MongoDB Behavior

• `tail` the logs for current operations

• `iostat` for disk util

• `top -c` for CPU usage

Page 8: MongoDB Indexing Constraints and Creative Schemas

First, a Simple One

query getmore command  res faults  locked db ar|aw netIn netOut conn     time
  129       4       7 126m      2 my_db:0.0%   3|0   27k   445k   42 15:36:54
   64       4       3 126m      0 my_db:0.0%   5|0   12k   379k   42 15:36:55
   65       7       8 126m      0 my_db:0.1%   3|0   15k   230k   42 15:36:56
   65       3       3 126m      1 my_db:0.0%   3|0   13k   170k   42 15:36:57
   66       1       6 126m      1 my_db:0.0%   0|0   14k   262k   42 15:36:58
   32       8       5 126m      0 my_db:0.0%   5|0    5k   445k   42 15:36:59

a truncated mongostat

Alerted due to high CPU

Page 9: MongoDB Indexing Constraints and Creative Schemas

log

[conn73454] query my_db.my_collection query: { $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } } ntoreturn:5 ntoskip:0 nscanned:33236 scanAndOrder:1 keyUpdates:0 numYields: 21 locks(micros) r:317266 nreturned:5 reslen:3127 178ms
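The numbers to read in a line like this are `nscanned` against `nreturned`, plus the `scanAndOrder` flag. A small sketch that pulls them out of a log line; `summarizeSlowQuery` is a hypothetical helper, and the sample string is an abbreviated copy of the line above (the query body is replaced with `...`).

```javascript
// Pull the counters out of a slow-query log line of the shape shown above.
function summarizeSlowQuery(line) {
  const num = (re) => {
    const m = line.match(re);
    return m ? Number(m[1]) : null;
  };
  const nscanned = num(/nscanned:(\d+)/);
  const nreturned = num(/nreturned:(\d+)/);
  return {
    nscanned,
    nreturned,
    millis: num(/(\d+)ms\s*$/), // duration is the last token on the line
    scanAndOrder: /scanAndOrder:1/.test(line), // sort done outside an index
    // Documents examined per document returned; null when nothing returned.
    scannedPerReturned:
      nscanned !== null && nreturned ? nscanned / nreturned : null,
  };
}

const line =
  "[conn73454] query my_db.my_collection ... nscanned:33236 " +
  "scanAndOrder:1 ... nreturned:5 reslen:3127 178ms";
console.log(summarizeSlowQuery(line));
```

A ratio in the thousands, as here, suggests the query examines far more documents than it returns, which is the signature of an index that does not match the query.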

Page 10: MongoDB Indexing Constraints and Creative Schemas

Solution

We are fixing this query

{ $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } }

With this index

db.my_collection.ensureIndex({ publisher: 1, publishedAt: -1 }, { background: true })

I would show you the logs, but now they are silent.

Page 11: MongoDB Indexing Constraints and Creative Schemas

The Pattern

Inefficient read queries that do in-memory table scans cause high CPU load.

The cause: indexes that do not match the queries.

Page 12: MongoDB Indexing Constraints and Creative Schemas

Example 2

MongoDB Twitter-ish Feed

Customer was building a network graph of users.

Page 13: MongoDB Indexing Constraints and Creative Schemas

Naive Method

Statuses

{ creator_id: ObjectId(), status: "This is so awesome!" }

Users

{ _id: ObjectId(), friends: [array-o-friends] }

Query

db.statuses.find({ creator_id: { $in: [array-o-friends] } }).sort({ _id: -1 })

Page 14: MongoDB Indexing Constraints and Creative Schemas

Solution

Statuses

{ creator_id: ObjectId(), friends_of_creator: [array-of-viewers], status: "This is so awesome!" }

Users

{ _id: ObjectId(), friends: [array-o-friends] }

Query

db.statuses.find({ friends_of_creator: ObjectId() }).sort({ _id: -1 })

Page 15: MongoDB Indexing Constraints and Creative Schemas

The Pattern

With graphs, query on who can view a document ("viewable by").

What worked with minimal documents was not scaling.
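A rough sketch of the "viewable by" idea, using plain arrays in place of collections. The viewer list is copied onto each status at write time, so the read becomes a single equality match. `buildStatus`, `feedFor`, and the sample users are hypothetical, friendship is assumed to be symmetric, and the sort on `_id` is left out.

```javascript
// In-memory stand-ins for the users collection.
const users = {
  alice: { _id: "alice", friends: ["bob", "carol"] },
  bob: { _id: "bob", friends: ["alice"] },
  carol: { _id: "carol", friends: ["alice"] },
};

// Write path: copy the creator's friends onto the status document.
function buildStatus(creator, text) {
  return {
    creator_id: creator._id,
    friends_of_creator: creator.friends.slice(),
    status: text,
  };
}

// Read path: one equality match on the array field, mirroring
// find({ friends_of_creator: viewerId }).
function feedFor(statuses, viewerId) {
  return statuses.filter((s) => s.friends_of_creator.includes(viewerId));
}

const statuses = [
  buildStatus(users.alice, "This is so awesome!"),
  buildStatus(users.bob, "Hello"),
];
console.log(feedFor(statuses, "carol").map((s) => s.status));
```

The trade-off is larger documents and more work per write in exchange for a read that no longer needs a long `$in` list.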

Page 16: MongoDB Indexing Constraints and Creative Schemas

Similar Issues - Messages

Naive

{ sender_id: ObjectId(), recipient_id: ObjectId(), message: "This is so awesome!" }

Naive Query

db.messages.find({ $or: [{ sender_id: ObjectId() }, { recipient_id: ObjectId() }] }).sort({ _id: -1 })

Solution

{ sender_id: ObjectId(), recipient_id: ObjectId(), participants: [ObjectId(), ObjectId()], thread_id: ObjectId(), message: "This is so awesome!" }

Query

db.messages.find({ participants: ObjectId() }).sort({ _id: -1 })
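One way to check that the refactored message schema answers the same question as the naive `$or` query is to derive `participants` from the existing fields and compare both filters. This is a sketch on in-memory objects; `withParticipants` and the two matcher functions are hypothetical names, the ids are plain strings rather than ObjectIds, and `thread_id` is omitted.

```javascript
// Derive the participants array from the two existing fields.
function withParticipants(message) {
  return Object.assign({}, message, {
    participants: [message.sender_id, message.recipient_id],
  });
}

// Naive filter: two conditions joined by $or.
const matchesNaive = (m, userId) =>
  m.sender_id === userId || m.recipient_id === userId;

// Refactored filter: one equality match against the array field.
const matchesParticipants = (m, userId) => m.participants.includes(userId);

const messages = [
  { sender_id: "u1", recipient_id: "u2", message: "This is so awesome!" },
  { sender_id: "u3", recipient_id: "u1", message: "Agreed" },
  { sender_id: "u2", recipient_id: "u3", message: "Unrelated to u1" },
].map(withParticipants);

const inbox = messages.filter((m) => matchesParticipants(m, "u1"));
console.log(inbox.length);
```

Because both filters select the same messages, the single-field version can be served by one index on `participants` instead of relying on `$or`.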

Page 17: MongoDB Indexing Constraints and Creative Schemas

Example 3

insert query update delete getmore command faults locked % idx miss % qr|qw ar|aw
    *0    *0     *0     *0       0     1|0   1422        0          0   0|0  50|0
    *0     6     *0     *0       0     6|0    575        0          0   0|0  51|0
    *0     3     *0     *0       0     1|0   1047        0          0   0|0  50|0
    *0     2     *0     *0       0     3|0   1660        0          0   0|0  50|0

a truncated mongostat

Alerted on high CPU

Page 18: MongoDB Indexing Constraints and Creative Schemas

tail

[initandlisten] connection accepted from ....
[conn4229724] authenticate: { authenticate: ....
[initandlisten] connection accepted from ....
[conn4229725] authenticate: { authenticate: .....
[conn4229717] query ..... 102ms
[conn4229725] query ..... 140ms

amazingly quiet

Page 19: MongoDB Indexing Constraints and Creative Schemas

currentOp

> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 66178716,
      "lockType" : "read",
      "secs_running" : 760,
      "op" : "query",
      "ns" : "my_db.my_collection",
      "query" : {
        keywords: { $in: ["keyword1", "keyword2"] },
        tags: { $in: ["tags1", "tags2"] }
      },
      orderby: { "created_at": -1 },
      "numYields" : 21
    }
  ]
}

Page 20: MongoDB Indexing Constraints and Creative Schemas

Solution

> db.currentOp().inprog.filter(function(row) {
    return row.secs_running > 100 && row.op == "query"
  }).forEach(function(row) {
    db.killOp(row.opid)
  })

Return Stability to Database

Disable query, and refactor schema.
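Before killing anything on a live database, the selection logic can be tried against a saved copy of the `inprog` array. A minimal sketch: `longRunningQueryIds` is a hypothetical helper, the second and third sample entries are made up, and `killOp` here is a stub standing in for `db.killOp`.

```javascript
// Select opids of read queries that have run longer than a threshold.
function longRunningQueryIds(inprog, minSeconds) {
  return inprog
    .filter((op) => op.op === "query" && op.secs_running > minSeconds)
    .map((op) => op.opid);
}

// Sample shaped like db.currentOp().inprog; only the first entry
// comes from the slide, the other two are invented for contrast.
const inprog = [
  { opid: 66178716, op: "query", secs_running: 760, ns: "my_db.my_collection" },
  { opid: 1, op: "update", secs_running: 900, ns: "my_db.my_collection" },
  { opid: 2, op: "query", secs_running: 3, ns: "my_db.my_collection" },
];

const killed = [];
const killOp = (opid) => killed.push(opid); // stub standing in for db.killOp
longRunningQueryIds(inprog, 100).forEach(killOp);
console.log(killed);
```

Only long-running reads are selected, so writes and short queries are left alone.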

Page 21: MongoDB Indexing Constraints and Creative Schemas

Refactoring

I have one word for you, “Schema”

Page 22: MongoDB Indexing Constraints and Creative Schemas

Example 4

A map reduce has gradually run slower and slower.

Page 23: MongoDB Indexing Constraints and Creative Schemas

Finding Offenders

Find the time of the slowest query of the day:

grep '[0-9]\{3,100\}ms$' $MONGODB_LOG | awk '{print $NF}' | sort -n
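The same search can be sketched in JavaScript when the full log line is wanted alongside the duration, not just the time. `slowestOps` is a hypothetical helper, and the sample lines are illustrative stand-ins with their bodies elided.

```javascript
// Return the n slowest log lines that end in a duration of 100ms or more.
function slowestOps(lines, n) {
  return lines
    .map((line) => {
      const m = line.match(/(\d{3,})ms$/); // three or more digits, then "ms"
      return m ? { millis: Number(m[1]), line } : null;
    })
    .filter((entry) => entry !== null)
    .sort((a, b) => b.millis - a.millis) // slowest first
    .slice(0, n);
}

const sample = [
  "[conn4229717] query ..... 102ms",
  "[conn4229725] query ..... 140ms",
  "[conn73454] query ..... 178ms",
  "[conn1] query ..... 42ms",
];
console.log(slowestOps(sample, 2).map((e) => e.millis));
```

Keeping the whole line makes it possible to go straight from the slowest duration to the query that produced it.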

Page 24: MongoDB Indexing Constraints and Creative Schemas

Slowest Map Reduce

my_db.$cmd command: {
  mapreduce: "my_collection",
  map: function() {},
  query: {
    $or: [ { object.type: "this" }, { object.type: "that" } ],
    time: { $lt: new Date(1359025311290), $gt: new Date(1358420511290) },
    object.ver: 1,
    origin: "tnh"
  },
  out: "my_new_collection",
  reduce: function(keys, vals) { ....}
} ntoreturn:1 keyUpdates:0 numYields: 32696 locks(micros) W:143870 r:511858643 w:6279425 reslen:140 421185ms

Page 25: MongoDB Indexing Constraints and Creative Schemas

Solution

Problem

Query is slow because it has multiple multi-value operators: $or, $gt, and $lt.

Solution

Change schema to use an "hour_created" attribute:

hour_created: "%Y-%m-%d %H"

Create an index on "hour_created" followed by the "$or" values. Query using the new "hour_created".
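A sketch of how the "hour_created" value could be produced and how a time range maps onto hour buckets. `hourCreated` and `hourBuckets` are hypothetical helpers, and using UTC is an assumption; the slide does not say which time zone the attribute uses. The two timestamps are the ones from the slow map reduce.

```javascript
// Format a Date as "%Y-%m-%d %H" in UTC, e.g. "2013-01-17 11".
function hourCreated(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    date.getUTCFullYear() + "-" +
    pad(date.getUTCMonth() + 1) + "-" +
    pad(date.getUTCDate()) + " " +
    pad(date.getUTCHours())
  );
}

// List every hour bucket that overlaps the range [start, end].
function hourBuckets(start, end) {
  const HOUR_MS = 60 * 60 * 1000;
  const buckets = [];
  // Round the start down to the top of its hour.
  const first = Math.floor(start.getTime() / HOUR_MS) * HOUR_MS;
  for (let t = first; t <= end.getTime(); t += HOUR_MS) {
    buckets.push(hourCreated(new Date(t)));
  }
  return buckets;
}

const buckets = hourBuckets(new Date(1358420511290), new Date(1359025311290));
console.log(buckets.length, buckets[0], buckets[buckets.length - 1]);
```

Each bucket is an exact value, so a job could be run per bucket with an equality match on "hour_created" instead of a $gt and $lt range on "time". That per-bucket approach is one possible reading of the slide, not something it states.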

Page 26: MongoDB Indexing Constraints and Creative Schemas

Words of caution

2 / 4 solutions were to add an index.

Adding new indexes as a solution scales poorly.

Page 27: MongoDB Indexing Constraints and Creative Schemas

Sometimes . . .

It is best to do nothing, except add shards / add hardware.

Go back to the drawing board on the design.

Page 28: MongoDB Indexing Constraints and Creative Schemas

Bad things happen to good databases?

• ORMs

• Manage your indexes and queries.

• Constraints will set you free.

Page 29: MongoDB Indexing Constraints and Creative Schemas

Road Map for Refactoring

• Measure, measure, measure.

• Find your slowest queries and determine if they can be indexed

• Rephrase the problem you are solving by asking “How do I want to query my data?”

Page 30: MongoDB Indexing Constraints and Creative Schemas

Thank you!

• Questions?

• E-mail me: [email protected]
