mongophilly indexing-2011-04-26

Indexing, Query Optimization, the Query Optimizer — MongoPhilly Richard M Kreuter 10gen Inc. [email protected] April 26, 2011 MongoDB – Indexing and Query Optimiz(ation—er) — MongoPhilly


Indexing, Query Optimization, the Query Optimizer — MongoPhilly

Richard M Kreuter
10gen Inc.

[email protected]

April 26, 2011

MongoDB – Indexing and Query Optimiz(ation—er) — MongoPhilly


Indexing Basics

Indexes are tree-structured sets of references to your documents.

The query planner can employ indexes to efficiently enumerate and sort matching documents.


However, indexing strikes people as a gray art

As is the case with relational systems, schema design and indexing go hand in hand...

... but you also need to know about your actual (not just predicted) query patterns.


Some indexing generalities

A collection may have at most 64 indexes.

A query may only use 1 index (except for disjuncts of $or queries).

Indexes entail additional work on inserts, updates, deletes.


Creating Indexes

The _id attribute is always indexed. Additional indexes can be created with ensureIndex():

// Create an index on the user attribute
db.collection.ensureIndex({ user : 1 })
// Create a compound index on
// the user and email attributes
db.collection.ensureIndex({ user : 1, email : 1 })
// Create an index on the favorites
// attribute, will index all values in list
db.collection.ensureIndex({ favorites : 1 })
// Create a unique index on the user attribute
db.collection.ensureIndex({user:1}, {unique:true})
// Create an index in the background.
db.collection.ensureIndex({user:1}, {background:true})


Index maintenance

// Drops an index on x
db.collection.dropIndex({x:1})
// Drops all indexes
db.collection.dropIndexes()
// Rebuild indexes (need for this reduced in 1.6)
db.collection.reIndex()


Indexes are smart about data types and structures

Indexes on attributes whose values are of different types in different documents can speed up queries by skipping documents where the relevant attribute isn't of the appropriate type.

Indexes on attributes whose values are lists will index each element, speeding up queries that look into these attributes. (You really want to do this for querying on tags.)
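
To make the list-indexing behaviour concrete, here is a toy in-memory model in plain JavaScript. It is only a sketch under stated assumptions: the buildIndex and lookup helpers and the sample posts are hypothetical, and a Map stands in for the real tree structure. The point it illustrates is that a list value contributes one index entry per element.

```javascript
// Toy model of an index on a field that may hold a scalar or a list.
// Hypothetical helper names; a Map stands in for the real tree.
function buildIndex(docs, field) {
  const index = new Map(); // indexed value -> Set of document _ids
  for (const doc of docs) {
    const value = doc[field];
    // A list contributes one index entry per element.
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, new Set());
      index.get(key).add(doc._id);
    }
  }
  return index;
}

// Return the _ids recorded under one indexed value.
function lookup(index, key) {
  return [...(index.get(key) || [])];
}

const posts = [
  { _id: 1, tags: ["mongodb", "indexing"] },
  { _id: 2, tags: ["indexing"] },
  { _id: 3, tags: "mongodb" },
];
const tagIndex = buildIndex(posts, "tags");
console.log(lookup(tagIndex, "indexing")); // [ 1, 2 ]
console.log(lookup(tagIndex, "mongodb"));  // [ 1, 3 ]
```

A tag lookup then touches only the entries for that tag instead of every document, which is why the slide recommends indexing tag lists.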


When can indexes be used?

In short, if you can envision how the index might get used, it probably can be. These will all use an index on x:

db.collection.find( { x: 1 } )

db.collection.find( { x :{ $in : [1,2,3] } } )

db.collection.find( { x : { $gt : 1 } } )

db.collection.find( { x : /^a/ } )

db.collection.count( { x : 2 } )

db.collection.distinct( "x", { x : 2 } )

db.collection.find().sort( { x : 1 } )
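
The /^a/ case in the list above may be the least obvious one. A rough way to see why an anchored prefix can use an index is that every matching string falls in one contiguous range of a sorted structure. The sketch below illustrates that idea in plain JavaScript; the helper names and sample values are hypothetical, a sorted array stands in for the index on x, and it assumes a non-empty prefix whose last character is not the maximum code unit.

```javascript
// Sketch: an anchored prefix such as /^a/ maps to one contiguous index range.
// A sorted array stands in for the index on x; helper names are hypothetical.
function lowerBound(sorted, target) {
  // Binary search for the first position whose value is >= target.
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

function prefixScan(sorted, prefix) {
  // Upper bound: the prefix with its last character bumped by one code unit.
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  const start = lowerBound(sorted, prefix);
  const end = lowerBound(sorted, upper);
  return { matches: sorted.slice(start, end), scanned: end - start };
}

const indexOnX = ["zed", "al", "carl", "abe", "bob"].sort();
console.log(prefixScan(indexOnX, "a")); // matches "abe" and "al", scanned: 2
```

An unanchored pattern such as /a/ has no such range, since matching strings can start with any character.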


Trickier cases where indexes can be used

db.collection.find({ x : 1 }).sort({ y : 1 })
will use an index on y for sorting, if there's no index on x. (For this sort of case, use a compound index on both x and y in that order.)

db.collection.update( { x : 2 } , { x : 3 } )
will use an index on x (but older MongoDB versions didn't permit $inc and other modifiers on indexed fields).


Some array examples

The following queries will use an index on x, and will match documents whose x attribute is the array [2,10]:

db.collection.find({ x : 2 })
db.collection.find({ x : 10 })
db.collection.find({ x : { $gt : 5 } })
db.collection.find({ x : [2,10] })
db.collection.find({ x : { $in : [2,5] }})
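
The matching rule behind these five queries can be sketched as a small predicate: a condition on an array-valued attribute is satisfied when any element satisfies it, and an array literal is compared against the whole value. The matchesField helper below is hypothetical and handles only the operators used in the examples above; it is not MongoDB's matching code.

```javascript
// Toy matcher for a single field, covering only the operators in the
// examples above. Hypothetical helper, not MongoDB's matching code.
function matchesField(value, cond) {
  const elems = Array.isArray(value) ? value : [value];
  if (Array.isArray(cond)) {
    // Whole-array equality, e.g. { x : [2,10] }.
    return JSON.stringify(value) === JSON.stringify(cond);
  }
  if (cond !== null && typeof cond === "object") {
    if ("$gt" in cond) return elems.some((e) => e > cond.$gt);
    if ("$in" in cond) return elems.some((e) => cond.$in.includes(e));
    throw new Error("operator not handled in this sketch");
  }
  // Scalar equality matches if any element is equal.
  return elems.some((e) => e === cond);
}

const x = [2, 10];
const conditions = [2, 10, { $gt: 5 }, [2, 10], { $in: [2, 5] }];
console.log(conditions.map((c) => matchesField(x, c))); // all five are true
```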


Geospatial indexes

Geospatial indexes are a sort of special case; the operators that can take advantage of them can only be used if the relevant indexes have been created. Some examples:

db.collection.find({ a : [50, 50]}) finds a document with this point for a.

db.collection.find({a : {$near : [50, 50]}}) sorts results by distance.

db.collection.find({a:{$within:{$box:[[40,40],[60,60]]}}})
db.collection.find({a:{$within:{$center:[[50,50],10]}}})


When indexes cannot be used

Many sorts of negations, e.g., $ne, $not.

Tricky arithmetic, e.g., $mod.

Most regular expressions (e.g., /a/).

Expressions in $where clauses don't take advantage of indexes.

Of course $where clauses are mostly for complex queries that often can't be indexed anyway, e.g., "where a > b". (If these cases matter to you, and you can precompute the match, you can store the result as an additional attribute, index it, and skip the $where clause entirely.)

map/reduce can't take advantage of indexes (the mapping function is opaque to the query optimizer).

As a rule, if you can't imagine how an index might be used, it probably can't!
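
The precompute suggestion for the "where a > b" case can be sketched as follows. This is an illustration only: the aGtB field name and the withPrecomputedMatch helper are hypothetical, and the documents are plain JavaScript objects rather than a live collection.

```javascript
// Sketch: replace a "where a > b" test with a stored, indexable attribute.
// The field name aGtB and the helper are hypothetical.
function withPrecomputedMatch(doc) {
  return { ...doc, aGtB: doc.a > doc.b };
}

const docs = [
  { _id: 1, a: 5, b: 3 },
  { _id: 2, a: 1, b: 4 },
].map(withPrecomputedMatch);

// In the shell, the stored flag could then be indexed and queried with an
// ordinary selector, e.g. ensureIndex({ aGtB : 1 }) and find({ aGtB : true }).
console.log(docs.filter((d) => d.aGtB).map((d) => d._id)); // [ 1 ]
```

The cost of this design is that the stored flag has to be recomputed whenever a or b changes.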


Never forget about compound indexes

Whenever you're querying on multiple attributes, whether as part of the selector document or in a sort(), compound indexes can be used.


Schema/index relationships

Sometimes, the question isn't "given the shape of these documents, how do I index them?", but "how might I shape the data so I can take advantage of indexing?"

// Consider a schema that uses a list of
// attribute/value pairs:
db.c.insert({ product : "SuperDooHickey",
              manufacturer : "Foo Enterprises",
              catalog : [ { stock : 50,
                            modtime : '2010-09-02' },
                          { price : 29.95,
                            modtime : '2010-06-14' } ] });
db.c.ensureIndex({ catalog : 1 });
// All attribute queries can use one index.
db.c.find( { catalog : { stock : { $gt : 0 } } } )


Sparse Indexes

Sparse indexes are a new flavor of index that may be useful when you want to index on a field that is present in only a smallish subset of a collection. A sparse index is created by specifying { sparse : true } to the index constructor, and it only creates entries for documents that contain the field.
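
The difference in entry counts can be sketched with a toy model. The indexEntries helper, the users sample and the twitter field are hypothetical, and an array of [key, _id] pairs stands in for the index. The sketch relies on the fact that a regular index still records documents lacking the field, under a null key, while a sparse one skips them.

```javascript
// Toy comparison of a regular and a sparse index on an optional field.
// Hypothetical helper; an array of [key, _id] pairs stands in for the index.
function indexEntries(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    if (!present && sparse) continue; // sparse: no entry at all
    // A regular index records documents lacking the field under null.
    entries.push([present ? doc[field] : null, doc._id]);
  }
  return entries;
}

const users = [
  { _id: 1, name: "ann", twitter: "@ann" },
  { _id: 2, name: "bob" },
  { _id: 3, name: "cy" },
];
console.log(indexEntries(users, "twitter").length);                   // 3
console.log(indexEntries(users, "twitter", { sparse: true }).length); // 1
```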


Covered Indexes

A covered index is an index from which a query's results can be produced without needing to access full document records. So, for example, if you have an index on attributes foo and bar and you execute find({ bar : { $gt : 10 } }, { foo : 1 , _id : 0 }), the results can be computed just by examining the index.

Note that the _id attribute is not present in indexes by default, and so in order to take advantage of covered indexes, you'll need to exclude it from a query's projection argument or include it in the index explicitly.
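
A toy model may help show what "covered" buys you. In the sketch below, the collection, the payload field and all helper names are hypothetical, and a sorted array stands in for the index on foo and bar. A counter tracks how often a full document has to be fetched.

```javascript
// Toy model of a covered query. The collection, the payload field and the
// helper names are hypothetical; a sorted array stands in for the index.
const collection = new Map([
  [1, { _id: 1, foo: "a", bar: 5, payload: "large blob 1" }],
  [2, { _id: 2, foo: "b", bar: 20, payload: "large blob 2" }],
  [3, { _id: 3, foo: "c", bar: 15, payload: "large blob 3" }],
]);

let documentFetches = 0;
function fetchDocument(id) {
  documentFetches += 1;
  return collection.get(id);
}

// Index on { foo : 1, bar : 1 }: indexed values plus a record pointer.
const fooBarIndex = [...collection.values()]
  .map((d) => ({ foo: d.foo, bar: d.bar, id: d._id }))
  .sort((p, q) => (p.foo < q.foo ? -1 : p.foo > q.foo ? 1 : p.bar - q.bar));

// Find entries with bar > minBar and project the requested fields.
function findBarGreaterThan(minBar, fields) {
  const covered = fields.every((f) => f === "foo" || f === "bar");
  return fooBarIndex
    .filter((e) => e.bar > minBar)
    .map((e) => {
      // Only fall back to the full document when the index lacks a field.
      const source = covered ? e : fetchDocument(e.id);
      return Object.fromEntries(fields.map((f) => [f, source[f]]));
    });
}

console.log(findBarGreaterThan(10, ["foo"]), documentFetches);            // 0 fetches
console.log(findBarGreaterThan(10, ["foo", "payload"]), documentFetches); // 2 fetches
```

Asking for any field outside the index, including _id in the real system, forces the fetch path, which is why the projection has to exclude it.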


Index sizes

Of course, indexes take up space. For many interesting databases, real query performance will depend on index sizes; so it's useful to see these numbers.

db.collection.stats() shows indexSizes, the size of each index in the collection.

db.collection.totalIndexSize() displays the total size of all indexes in the collection.


explain()

It's useful to be able to ensure that your query is doing what you want it to do. For this, we have explain(). Query plans that use an index have cursor type BtreeCursor.

db.collection.find({x:{$gt:5}}).explain()
{
  "cursor" : "BtreeCursor x_1",
  ...
  "nscanned" : 12345,
  ...
  "n" : 100,
  "millis" : 4,
  ...
}


explain(), continued

If the query plan doesn't use the index, the cursor type will be BasicCursor.

db.collection.find({x:{$gt:5}}).explain()
{
  "cursor" : "BasicCursor",
  ...
  "nscanned" : 12345,
  ...
  "n" : 42,
  "millis" : 4,
  ...
}
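
One way to read these outputs is to compare nscanned with n. The helper below is a hypothetical convenience, not part of the shell; it takes an object with the cursor, nscanned, n and millis fields shown in the two examples and reports whether an index was used and how many entries were scanned per result returned.

```javascript
// Sketch: summarise the explain() fields shown in the examples.
// summarizeExplain is a hypothetical helper, not part of the shell.
function summarizeExplain(plan) {
  return {
    usesIndex: plan.cursor.startsWith("BtreeCursor"),
    // Entries scanned per result returned; a value far above 1 suggests
    // the plan examines much more than it returns.
    scannedPerResult: plan.n > 0 ? plan.nscanned / plan.n : Infinity,
    millis: plan.millis,
  };
}

const indexedPlan = { cursor: "BtreeCursor x_1", nscanned: 12345, n: 100, millis: 4 };
const unindexedPlan = { cursor: "BasicCursor", nscanned: 12345, n: 42, millis: 4 };
console.log(summarizeExplain(indexedPlan));
console.log(summarizeExplain(unindexedPlan));
```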


Really, compound indexes are important

Try this at home:

1 Create a collection with a few tens of thousands of documents having two attributes (let's call them a and b).

2 Create a compound index on {a : 1, b : 1}.

3 Do a db.collection.find({a : constant}).sort({b : 1}).explain().

4 Note the explain result's millis.

5 Drop the compound index.

6 Create another compound index with the attributes reversed. (This will be a suboptimal compound index.)

7 Explain the above query again.

8 The suboptimal index should produce a slower explain result.
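
For intuition about why the reversed index loses, the experiment can be mimicked with a toy in-memory model. Everything here is an assumption made for illustration: sorted arrays of key pairs stand in for the two compound indexes, the data distribution is invented, and the real planner may behave differently. The sketch counts how many index entries each ordering has to examine to answer the same query.

```javascript
// Toy model of the experiment: query { a : constant } sorted by b.
// Sorted arrays of [first, second] keys stand in for the compound indexes.
const docs = [];
for (let i = 0; i < 20000; i++) docs.push({ a: i % 100, b: i });

const byKeys = (k1, k2) =>
  docs.map((d) => [d[k1], d[k2]]).sort((p, q) => p[0] - q[0] || p[1] - q[1]);
const indexAB = byKeys("a", "b"); // like { a : 1, b : 1 }
const indexBA = byKeys("b", "a"); // like { b : 1, a : 1 }

function queryWithAB(a) {
  // Binary search to the first entry with this a; equal-a entries are
  // contiguous and already ordered by b.
  let lo = 0;
  let hi = indexAB.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (indexAB[mid][0] < a) lo = mid + 1;
    else hi = mid;
  }
  let scanned = 0;
  for (let i = lo; i < indexAB.length && indexAB[i][0] === a; i++) scanned++;
  return { scanned, n: scanned };
}

function queryWithBA(a) {
  // The b order comes for free, but every entry must be checked for a.
  let scanned = 0;
  let n = 0;
  for (const [, aValue] of indexBA) {
    scanned++;
    if (aValue === a) n++;
  }
  return { scanned, n };
}

console.log(queryWithAB(7)); // { scanned: 200, n: 200 }
console.log(queryWithBA(7)); // { scanned: 20000, n: 200 }
```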


The DB Profiler

MongoDB includes a database profiler that, when enabled, records the timing measurements and result counts in a collection within the database.

// Enable the profiler on this database.
> db.setProfilingLevel(1, 100)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.foo.find({a: { $mod : [3, 0] } });
...
// See the profiler info.
> db.system.profile.find()
{ "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
  "info" : "query test.$cmd ntoreturn:1 command: { count: \"foo\", query: { a: { $mod: [ 3.0, 0.0 ] } }, fields: {} } reslen:64 406ms",
  "millis" : 406 }


Query Optimizer

MongoDB's query optimizer is empirical, not cost-based.

To test query plans, it tries several in parallel, and records the plan that finishes fastest.

If a plan's performance changes over time (e.g., as data changes), the database will reoptimize (i.e., retry all possible plans).
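
The "try several, remember the winner" idea can be sketched with generators that are stepped in turn. This is a toy model under loud assumptions: the plan names, the cost figures, the query-shape string and the cache are all hypothetical, and nothing here reflects MongoDB's actual internals. Reoptimization is left out; in this model it would amount to deleting the cached entry and racing again.

```javascript
// Toy model of an empirical optimizer: step candidate plans in turn, keep
// the first to finish, and remember it for that query shape.
// All names and costs are hypothetical.
function* plan(cost, result) {
  for (let i = 0; i < cost; i++) yield; // one unit of work per step
  return result;
}

function racePlans(candidates) {
  const running = Object.entries(candidates).map(([name, start]) => ({
    name,
    it: start(),
  }));
  if (running.length === 0) throw new Error("no candidate plans");
  for (;;) {
    // Give every plan one step per round; the first to finish wins.
    for (const p of running) {
      const step = p.it.next();
      if (step.done) return { winner: p.name, result: step.value };
    }
  }
}

const planCache = new Map(); // query shape -> name of the winning plan

function runQuery(shape, candidates) {
  const cached = planCache.get(shape);
  if (cached !== undefined) {
    // Reuse the recorded winner without racing again.
    const it = candidates[cached]();
    let step = it.next();
    while (!step.done) step = it.next();
    return { winner: cached, result: step.value, raced: false };
  }
  const outcome = racePlans(candidates);
  planCache.set(shape, outcome.winner);
  return { ...outcome, raced: true };
}

const candidates = {
  "BasicCursor": () => plan(12345, "42 docs"),
  "BtreeCursor x_1": () => plan(100, "42 docs"),
};
console.log(runQuery("x:$gt", candidates)); // races, BtreeCursor x_1 wins
console.log(runQuery("x:$gt", candidates)); // served from the cache
```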


Hinting the query plan

Sometimes, you might want to force the query plan. For this, we have hint().

// Force the use of an index on attribute x:
db.collection.find({x: 1, ...}).hint({x:1})
// Force indexes to be avoided!
db.collection.find({x: 1, ...}).hint({$natural:1})


Going forward

www.mongodb.org — downloads, docs, community

[email protected] — mailing list

#mongodb on irc.freenode.net

try.mongodb.org — web-based shell

10gen is hiring. Email [email protected].

10gen offers support, training, and advising services for MongoDB
