Indexing and Query Optimization
![Page 2: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/2.jpg)
Agenda
• What are indexes?
• Why do I need them?
• Working with indexes in MongoDB
• Optimize your queries
• Avoiding common mistakes
![Page 3: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/3.jpg)
What are indexes?
![Page 4: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/4.jpg)
[Image: cookbook]
Find a recipe by name: “Currywurst”
![Page 5: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/5.jpg)
What are indexes?
• How would you find a recipe using chicken?
• How about a 300-500 calorie recipe using chicken?
Sorted by main ingredient, then calories:
Chicken
  88 kcal: Chicken Soup
  356 kcal: Chicken Caesar Salad
  480 kcal: Teriyaki Chicken
  680 kcal: Buffalo Chicken Wings
Fish
  412 kcal: Tuna Sandwich
Pork
  480 kcal: Curry Wurst

Sorted by calories, then main ingredient:
88 kcal
  Chicken: Chicken Soup
356 kcal
  Chicken: Chicken Caesar Salad
412 kcal
  Fish: Tuna Sandwich
480 kcal
  Pork: Curry Wurst
  Chicken: Teriyaki Chicken
680 kcal
  Chicken: Buffalo Chicken Wings
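The two cookbook orderings can be compared with a small sketch. It is plain JavaScript over an in-memory array, not MongoDB itself; the helper names (`orderBy`, `matchPositions`) are made up for illustration. It shows that when entries are ordered by ingredient first, the "300-500 kcal chicken" recipes sit next to each other, while ordering by calories first scatters them.

```javascript
// Hypothetical in-memory copy of the cookbook from the slide.
const recipes = [
  { name: 'Chicken Soup', main_ingredient: 'chicken', calories: 88 },
  { name: 'Chicken Caesar Salad', main_ingredient: 'chicken', calories: 356 },
  { name: 'Teriyaki Chicken', main_ingredient: 'chicken', calories: 480 },
  { name: 'Buffalo Chicken Wings', main_ingredient: 'chicken', calories: 680 },
  { name: 'Tuna Sandwich', main_ingredient: 'fish', calories: 412 },
  { name: 'Curry Wurst', main_ingredient: 'pork', calories: 480 },
];

// Plain comparison helper for strings and numbers.
const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);

// Return the recipes ordered by the given fields, like entries in an index.
function orderBy(docs, fields) {
  return [...docs].sort((a, b) => {
    for (const f of fields) {
      const c = cmp(a[f], b[f]);
      if (c !== 0) return c;
    }
    return 0;
  });
}

// Positions of the "300-500 kcal chicken" recipes within an ordering.
function matchPositions(ordered) {
  const positions = [];
  ordered.forEach((r, i) => {
    if (r.main_ingredient === 'chicken' && r.calories >= 300 && r.calories <= 500) {
      positions.push(i);
    }
  });
  return positions;
}

const byIngredient = orderBy(recipes, ['main_ingredient', 'calories']);
const byCalories = orderBy(recipes, ['calories', 'main_ingredient']);

console.log(matchPositions(byIngredient)); // [ 1, 2 ]: one contiguous run
console.log(matchPositions(byCalories));   // [ 1, 3 ]: a non-matching entry sits in between
```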
![Page 6: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/6.jpg)
Linked List
![Page 7: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/7.jpg)
Finding 7 in Linked List
![Page 8: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/8.jpg)
Finding 7 in Tree
![Page 9: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/9.jpg)
Indexes in MongoDB are B-trees
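To make the list-versus-tree comparison concrete, here is a rough sketch that counts how many elements each approach inspects. It uses binary search over a sorted array as a stand-in for a tree lookup; that is a simplification, not how MongoDB's B-trees are implemented, and the function names are invented for this example.

```javascript
// Count how many elements a linear scan inspects before finding the target.
function linearScanSteps(values, target) {
  let steps = 0;
  for (const v of values) {
    steps += 1;
    if (v === target) return steps;
  }
  return steps; // target absent: every element was inspected
}

// Count how many elements a binary search over a sorted array inspects.
function binarySearchSteps(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let steps = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    steps += 1;
    if (sorted[mid] === target) return steps;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return steps; // target absent
}

const values = Array.from({ length: 1000 }, (_, i) => i + 1); // 1..1000, sorted

console.log(linearScanSteps(values, 700));   // 700
console.log(binarySearchSteps(values, 700)); // at most 10 for 1000 elements
```

The scan grows linearly with the collection, while the ordered lookup grows roughly with the logarithm of its size.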
![Page 10: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/10.jpg)
Indexes are the single biggest tunable performance factor in MongoDB
![Page 11: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/11.jpg)
Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
![Page 12: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/12.jpg)
Why do I need indexes? A brief story
5 min
30 sec
![Page 13: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/13.jpg)
Why do I need indexes?
3h
![Page 14: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/14.jpg)
Working with Indexes in MongoDB
![Page 15: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/15.jpg)
// The client remembers the index and raises no errors
db.recipes.ensureIndex({ main_ingredient: 1 })
* 1 means ascending, -1 descending
How do I create indexes?
![Page 16: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/16.jpg)
// Multiple fields (compound indexes)
db.recipes.ensureIndex({ main_ingredient: 1, calories: -1 })
// Arrays of values (multikey indexes)
{ name: 'Curry Wurst mit Pommes', ingredients: ['pork', 'curry'] }
db.recipes.ensureIndex({ ingredients: 1 })
What can be indexed?
Image: http://www.marions-kochbuch.com/
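As a rough illustration of what "multikey" means, the sketch below builds one index entry per array element. It is an in-memory approximation with a hypothetical helper (`multikeyEntries`) and a made-up second recipe, not MongoDB's actual storage format.

```javascript
// Build sorted index entries for a field; arrays contribute one entry per element.
function multikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, id: doc._id });
    }
  }
  // Keep entries ordered by key, as an index would.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const docs = [
  { _id: 1, name: 'Curry Wurst mit Pommes', ingredients: ['pork', 'curry'] },
  { _id: 2, name: 'Schnitzel', ingredients: ['pork'] }, // hypothetical second recipe
];

const entries = multikeyEntries(docs, 'ingredients');
console.log(entries.map((e) => `${e.key} -> ${e.id}`));
// [ 'curry -> 1', 'pork -> 1', 'pork -> 2' ]
```

A query for a single ingredient can then match any document whose array contains it.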
![Page 17: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/17.jpg)
// Subdocuments
{ name: 'Curry Wurst mit Pommes', contributor: { name: 'Hans Wurst', id: 'hawu36' } }
db.recipes.ensureIndex({ 'contributor.id': 1 })
db.recipes.ensureIndex({ 'contributor': 1 })
What can be indexed?
Image: http://www.marions-kochbuch.com/
![Page 18: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/18.jpg)
// List a collection's indexes
db.recipes.getIndexes()
db.recipes.getIndexKeys()
// Drop a specific index
db.recipes.dropIndex({ ingredients: 1 })
// Drop all indexes and recreate them
db.recipes.reIndex()
// Default (unique) index on _id
How do I manage indexes?
![Page 19: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/19.jpg)
// Index creation is a blocking operation that can take a long time
// Background creation yields to other operations
db.recipes.ensureIndex(
{ ingredients: 1 },
{ background: true }
)
Background Index Builds
![Page 20: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/20.jpg)
Options
• Uniqueness constraints (unique, dropDups)
• Sparse Indexes
• Geospatial (2d) Indexes
• TTL Collections (expireAfterSeconds)
![Page 21: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/21.jpg)
// Only one recipe can have a given value for name
db.recipes.ensureIndex( { name: 1 }, { unique: true } )
// Force index on collection with duplicate recipe names – drop the duplicates
db.recipes.ensureIndex(
{ name: 1 },
{ unique: true, dropDups: true }
)
* dropDups is probably never what you want
Uniqueness Constraints
image: www.idownloadblog.com
![Page 22: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/22.jpg)
// Only documents with field calories will be indexed
db.recipes.ensureIndex(
{ calories: -1 },
{ sparse: true }
)
// Allow multiple documents to not have calories field
db.recipes.ensureIndex(
{ name: 1 , calories: -1 },
{ unique: true, sparse: true }
)
* Without the sparse option, missing fields are stored as null in the index
Sparse Indexes
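The difference between a sparse and a regular index can be sketched in a few lines. This is an in-memory approximation with an invented helper (`buildIndex`) and made-up documents; it only shows which documents get an entry.

```javascript
// Build index entries for one field. A regular index stores null for
// documents that lack the field; a sparse index skips them entirely.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue;
    entries.push({ key: hasField ? doc[field] : null, id: doc._id });
  }
  return entries;
}

// Hypothetical documents: the second one has no calories field.
const docs = [
  { _id: 1, name: 'Chicken Soup', calories: 88 },
  { _id: 2, name: 'Mystery Stew' },
  { _id: 3, name: 'Curry Wurst', calories: 480 },
];

const regular = buildIndex(docs, 'calories');
const sparse = buildIndex(docs, 'calories', { sparse: true });

console.log(regular.length); // 3 (one entry has key null)
console.log(sparse.length);  // 2 (the document without calories is skipped)
```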
![Page 23: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/23.jpg)
// Add longitude, latitude coordinates
{
name: 'Curry 36 am Mehringdamm',
loc: [ 13.387764, 52.493442]
}
// Index the coordinates
db.locations.ensureIndex( { loc : '2d' } )
// Query for locations 'near' a particular coordinate
db.locations.find({
loc: { $near: [ -122.3, 37.4 ] }
})
Geospatial Indexes
image: NASA
![Page 24: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/24.jpg)
// Documents must have a BSON UTC Date field
{ 'submitted_date': ISODate('2012-10-12T05:24:07.211Z'), … }
// Documents are removed 'expireAfterSeconds' seconds after that date
db.recipes.ensureIndex(
{ submitted_date: 1 },
{ expireAfterSeconds: 3600 }
)
TTL Collections
image: taylordsdn112.wordpress.com
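The expiry rule can be written down as a small check. This sketch only models the arithmetic; `isExpired` is a made-up helper, and in MongoDB the actual removal is done by a background task that runs periodically, so documents may linger a little past their expiry time.

```javascript
// Decide whether a document is past its TTL, given the indexed date field.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // only date values can expire
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const recipe = { submitted_date: new Date('2012-10-12T05:24:07.211Z') };

// About 36 minutes after submission: still within the 3600 second window.
console.log(isExpired(recipe, 'submitted_date', 3600, new Date('2012-10-12T06:00:00Z'))); // false

// More than an hour after submission: eligible for removal.
console.log(isExpired(recipe, 'submitted_date', 3600, new Date('2012-10-12T06:30:00Z'))); // true
```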
![Page 25: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/25.jpg)
Limitations
• Collections cannot have more than 64 indexes.
• Index keys cannot be larger than 1024 bytes (1 KB).
• The name of an index, including the namespace, must be shorter than 128 characters.
• Queries can only use 1 index*
• Indexes have storage requirements and slow down writes.
• In-memory sorts (no index) are limited to 32 MB of returned data.
![Page 26: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/26.jpg)
Optimize Your Queries
![Page 27: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/27.jpg)
db.setProfilingLevel(n, slowms)   // slowms defaults to 100 ms
n=0 profiler off
n=1 record operations longer than slowms
n=2 record all operations
db.system.profile.find()
* The profile collection is a capped collection, and fixed in size
Profiling Slow Ops
image: http://www.speareducation.com/
![Page 28: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/28.jpg)
db.recipes.find( { calories:
{ $lt : 40 } }
).explain( )
{
"cursor" : "BasicCursor",
"n" : 42,
"nscannedObjects" : 12345,
"nscanned" : 12345,
...
"millis" : 356,
...
}
* explain() doesn't use cached plans; it re-evaluates the candidate plans and resets the cache
The Explain Plan (without Index)
![Page 29: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/29.jpg)
db.recipes.find( { calories:
{ $lt : 40 } }
).explain( )
{
"cursor" : "BtreeCursor calories_-1",
"n" : 42,
"nscannedObjects" : 42,
"nscanned" : 42,
...
"millis" : 0,
...
}
* explain() doesn't use cached plans; it re-evaluates the candidate plans and resets the cache
The Explain Plan (with Index)
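One way to read the two explain outputs is to compare how many entries were scanned per document returned. The sketch below copies the numbers from the slides into plain objects; the helper names (`scannedPerResult`, `usesIndex`) are invented here and are not part of any MongoDB API.

```javascript
// Explain output from the slides, reduced to the fields shown there.
const withoutIndex = { cursor: 'BasicCursor', n: 42, nscannedObjects: 12345, nscanned: 12345, millis: 356 };
const withIndex = { cursor: 'BtreeCursor calories_-1', n: 42, nscannedObjects: 42, nscanned: 42, millis: 0 };

// Entries scanned for each document returned; 1 is the ideal.
function scannedPerResult(plan) {
  return plan.n === 0 ? plan.nscanned : plan.nscanned / plan.n;
}

// In this explain format, an index scan is reported as a BtreeCursor.
function usesIndex(plan) {
  return plan.cursor.startsWith('BtreeCursor');
}

console.log(usesIndex(withoutIndex), Math.round(scannedPerResult(withoutIndex))); // false 294
console.log(usesIndex(withIndex), scannedPerResult(withIndex));                   // true 1
```

A ratio far above 1 suggests the query is reading many documents it then throws away.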
![Page 30: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/30.jpg)
The Query Optimizer
• For each "type" of query, MongoDB periodically tries all useful indexes
• Aborts the rest as soon as one plan wins
• The winning plan is temporarily cached for each "type" of query
![Page 31: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/31.jpg)
// Tell the database what index to use
db.recipes.find({
calories: { $lt: 1000 } }
).hint({ _id: 1 })
// Tell the database to NOT use an index
db.recipes.find(
{ calories: { $lt: 1000 } }
).hint({ $natural: 1 })
Manually Select Index to Use
![Page 32: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/32.jpg)
// Given the following index
db.collection.ensureIndex({ a: 1, b: 1, c: 1, d: 1 })
// The following query and sort operations can use the index
db.collection.find().sort({ a: 1 })
db.collection.find().sort({ a: 1, b: 1 })
db.collection.find({ a: 4 }).sort({ a: 1, b: 1 })
db.collection.find({ b: 5 }).sort({ a: 1, b: 1 })
Use Indexes to Sort Query Results
![Page 33: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/33.jpg)
// Given the following index
db.collection.ensureIndex({ a:1, b:1, c:1, d:1 })
// These can not sort using the index
db.collection.find( ).sort({ b: 1 })
db.collection.find({ b: 5 }).sort({ b: 1 })
Indexes that won’t work for sorting query results
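The sorting rule from these two slides can be approximated as "the sort keys must be a leading prefix of the index keys". The sketch below encodes that, plus the case where every direction is reversed (an index can be walked backwards). It is deliberately simplified: `sortCanUseIndex` is a made-up helper, and it ignores the query filter, so it does not model cases where equality conditions on leading fields let a sort on later fields use the index.

```javascript
// Check whether a sort specification is a leading prefix of a compound index,
// with all directions either matching the index or all reversed.
function sortCanUseIndex(index, sort) {
  const indexFields = Object.entries(index);
  const sortFields = Object.entries(sort);
  if (sortFields.length === 0 || sortFields.length > indexFields.length) return false;

  let sameDirection = true;
  let reversed = true;
  for (let i = 0; i < sortFields.length; i++) {
    const [sortField, sortDir] = sortFields[i];
    const [indexField, indexDir] = indexFields[i];
    if (sortField !== indexField) return false; // not a prefix
    if (sortDir !== indexDir) sameDirection = false;
    if (sortDir !== -indexDir) reversed = false;
  }
  return sameDirection || reversed;
}

const index = { a: 1, b: 1, c: 1, d: 1 };

console.log(sortCanUseIndex(index, { a: 1 }));        // true
console.log(sortCanUseIndex(index, { a: 1, b: 1 }));  // true
console.log(sortCanUseIndex(index, { b: 1 }));        // false: skips the leading field
console.log(sortCanUseIndex(index, { a: 1, b: -1 })); // false: mixed directions
```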
![Page 34: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/34.jpg)
// MongoDB can return data from just the index
db.recipes.ensureIndex({ main_ingredient: 1, name: 1 })
// Return only the name field
db.recipes.find(
{ main_ingredient: 'chicken' },
{ _id: 0, name: 1 }
)
// indexOnly will be true in the explain plan
db.recipes.find(
{ main_ingredient: 'chicken' },
{ _id: 0, name: 1 }
).explain()
{
"indexOnly": true,
}
Index Covered Queries
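Whether a query can be answered from the index alone comes down to a simple check: every field in the filter and every field returned must be part of the index. The sketch below is a simplified model with an invented helper (`isCovered`); it only handles top-level fields and inclusion projections, and it ignores multikey indexes.

```javascript
// Check whether an index contains every field the query filters on and returns.
function isCovered(index, query, projection) {
  const indexed = new Set(Object.keys(index));
  const queryCovered = Object.keys(query).every((f) => indexed.has(f));

  // _id is returned by default unless the projection sets _id: 0.
  const returned = Object.keys(projection).filter((f) => f !== '_id' && projection[f] === 1);
  if (projection._id !== 0) returned.push('_id');

  return queryCovered && returned.length > 0 && returned.every((f) => indexed.has(f));
}

const index = { main_ingredient: 1, name: 1 };
const query = { main_ingredient: 'chicken' };

console.log(isCovered(index, query, { _id: 0, name: 1 }));     // true
console.log(isCovered(index, query, { name: 1 }));             // false: _id is returned but not indexed
console.log(isCovered(index, query, { _id: 0, calories: 1 })); // false: calories is not in the index
```

This is why the slide's projection excludes _id explicitly.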
![Page 35: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/35.jpg)
Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
![Page 36: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/36.jpg)
Avoiding Common Mistakes
![Page 37: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/37.jpg)
// MongoDB can only use one index for a query
db.collection.ensureIndex({ a: 1 })
db.collection.ensureIndex({ b: 1 })
// Only one of the above indexes is used
db.collection.find({ a: 3, b: 4 })
Trying to Use Multiple Indexes
![Page 38: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/38.jpg)
// Compound key indexes are very effective
db.collection.ensureIndex({ a: 1, b: 1, c: 1 })
// But only if the query fields form a prefix of the index
// This query can't effectively use the index
db.collection.find({ c: 2 })
// …but this query can
db.collection.find({ a: 3, b: 5 })
Compound Key Mistakes
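The prefix rule can be expressed as "count how many leading index fields the query constrains". The helper below (`usablePrefixLength`, a name made up for this sketch) returns that count; zero means the query cannot use the index effectively.

```javascript
// Count the leading fields of a compound index that the query constrains.
function usablePrefixLength(index, query) {
  let length = 0;
  for (const field of Object.keys(index)) {
    if (!(field in query)) break; // the first gap ends the usable prefix
    length += 1;
  }
  return length;
}

const index = { a: 1, b: 1, c: 1 };

console.log(usablePrefixLength(index, { c: 2 }));       // 0: no leading field constrained
console.log(usablePrefixLength(index, { a: 3, b: 5 })); // 2: uses the a, b prefix
console.log(usablePrefixLength(index, { a: 3, c: 2 })); // 1: only a narrows the scan
```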
![Page 39: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/39.jpg)
db.collection.distinct('status')
[ 'new', 'processed' ]
db.collection.ensureIndex({ status: 1 })
// Low selectivity indexes provide little benefit
db.collection.find({ status: 'new' })
// Better
db.collection.ensureIndex({ status: 1, created_at: -1 })
db.collection.find(
{ status: 'new' }
).sort({ created_at: -1 })
Low Selectivity Indexes
![Page 40: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/40.jpg)
db.users.ensureIndex({ username: 1 })
// Left anchored regex queries can use the index
db.users.find({ username: /^hans wurst/ })
// But not generic regexes
db.users.find({username: /wurst/ })
// Or case insensitive queries
db.users.find({ username: /Hans/i })
Regular Expressions
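A left-anchored regex can use the index because it is equivalent to a range over the sorted keys. The sketch below derives that range for a literal, non-empty prefix; `prefixRange` is a hypothetical helper, the usernames are made up, and a plain array filter stands in for the index scan.

```javascript
// Turn a literal prefix into a half-open key range [lower, upper).
// Assumes a non-empty prefix with no regex metacharacters.
function prefixRange(prefix) {
  const lastCode = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
  return { lower: prefix, upper };
}

// Hypothetical usernames, already in index (sorted) order.
const usernames = ['anna', 'hans wurst', 'hans wurst jr', 'hans zimmer', 'wurst fan'];

const { lower, upper } = prefixRange('hans wurst');
const matches = usernames.filter((u) => u >= lower && u < upper);

console.log(upper);   // hans wursu
console.log(matches); // [ 'hans wurst', 'hans wurst jr' ]
```

An unanchored pattern such as /wurst/ has no such range, so every key has to be examined.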
![Page 41: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/41.jpg)
// Indexes aren't helpful with negations
db.things.ensureIndex({ x: 1 })
// e.g. "not equal" queries
db.things.find({ x: { $ne: 3 } })
// …or "not in" queries
db.things.find({ x: { $nin: [2, 3, 4] } })
// …or the $not operator ($not takes a regex or an operator expression)
db.people.find({ name: { $not: /^Hans Wurst$/ } })
Negation
![Page 42: Indexing and Query Optimization](https://reader037.vdocuments.net/reader037/viewer/2022110308/5579bfdbd8b42aca7a8b5061/html5/thumbnails/42.jpg)
Choosing the right indexes is one of the most important things you can do as a MongoDB developer, so take the time to get your indexes right!