Map/Confused? A practical approach to Map/Reduce with MongoDB
DESCRIPTION
Talk given at MongoDB Munich on 16.10.2012 about the different ways to apply the Map/Reduce algorithm in MongoDB. Using a practical use case, the talk compares the performance of the built-in MongoDB Map/Reduce, group(), aggregate(), find() and the MongoDB-Hadoop Adapter.
TRANSCRIPT
{
  "_id" : ObjectId("4fb9fb91d066d657de8d6f36"),
  "text" : "MongoDB uses Map/Reduce #epic #win",
  …
  "user" : {
    "friends_count" : 73,
    …
    "followers_count" : 102,
    "id" : 53507833
  },
  …
}
![Page 20: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/20.jpg)
mongod --rest --shardsvr --port 27017 --dbpath /tmp/shard1/ --smallfiles
# second shard; port and dbpath are assumed here, they only need to differ from the first shard
mongod --rest --shardsvr --port 27018 --dbpath /tmp/shard2/ --smallfiles
mongod --configsvr --port 10000 --dbpath /tmp/config/ --smallfiles
mongos --port 22222 --configdb localhost:10000
1. db.tweets.mapReduce()
2. db.tweets.group()
3. db.tweets.aggregate()
4. MongoDB-Hadoop Adapter
5. db.tweets.find()
![Page 21: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/21.jpg)
// Runs the function c once and returns its result with the elapsed time in milliseconds.
var measure = function(c) {
  var a = Date.now();
  var results = c.apply();
  var d = Date.now() - a;
  return { results: results, duration: d };
};
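The same timing idea can be sketched in Python for readers who drive the comparison from a script instead of the mongo shell. This is a minimal sketch, not the talk's code: `time.perf_counter` and the stand-in workload `sum(range(1000))` are assumptions chosen for illustration.

```python
import time


def measure(fn):
    """Call fn() once and return its result plus the elapsed time in milliseconds."""
    start = time.perf_counter()
    results = fn()
    duration_ms = (time.perf_counter() - start) * 1000.0
    return {"results": results, "duration": duration_ms}


# Hypothetical stand-in workload; in practice fn would wrap a database call.
timing = measure(lambda: sum(range(1000)))
print(timing["results"], "computed in", round(timing["duration"], 3), "ms")
```

A single run like this is noisy, so repeating each measurement several times is probably worthwhile before comparing approaches.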
![Page 22: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/22.jpg)
![Page 23: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/23.jpg)
function() {
  if (this.user != null) {
    emit("user",
         {userName: this.user.name,
          followers: this.user.followers_count});
  }
}
![Page 24: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/24.jpg)
function(key, values) {
  var result = null;
  values.forEach( function(value) {
    if (result == null ||
        result.followers < value.followers) {
      result = value;
    }
  });
  return result;
}
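To see how the map and reduce functions above fit together, the two phases can be simulated in plain Python without a database. This is only a sketch: the sample tweets, `map_tweet` and `reduce_values` are hypothetical names introduced here. It also checks one property the reduce function needs, because MongoDB may call reduce again on partial results for the same key: reducing already reduced values must give the same answer.

```python
from collections import defaultdict

# Hypothetical sample tweets; only the fields the map function reads are shown.
tweets = [
    {"user": {"name": "alice", "followers_count": 102}},
    {"user": {"name": "bob", "followers_count": 7300}},
    {"user": None},
    {"user": {"name": "carol", "followers_count": 73}},
]


def map_tweet(tweet):
    """Emit one ("user", value) pair per tweet that has a user."""
    user = tweet.get("user")
    if user is not None:
        yield "user", {"userName": user["name"], "followers": user["followers_count"]}


def reduce_values(key, values):
    """Return the value with the most followers, like the reduce function above."""
    result = None
    for value in values:
        if result is None or result["followers"] < value["followers"]:
            result = value
    return result


# Shuffle phase: collect the emitted values by key.
grouped = defaultdict(list)
for tweet in tweets:
    for key, value in map_tweet(tweet):
        grouped[key].append(value)

top = {key: reduce_values(key, values) for key, values in grouped.items()}

# Re-reduce check: reduce two partial results and compare with the full reduce.
partial = [
    reduce_values("user", grouped["user"][:2]),
    reduce_values("user", grouped["user"][2:]),
]
rereduced = reduce_values("user", partial)
print(top["user"], rereduced == top["user"])
```

Because every tweet is emitted under the single key "user", all values end up in one group, which is what lets the reduce step pick one overall winner.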
![Page 25: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/25.jpg)
![Page 26: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/26.jpg)
db.tweets.group({
  key: {},
  initial: { name:'', followers_count:0 },
  reduce: function(obj,prev) {
    if (obj.user != null &&
        prev.followers_count < obj.user.followers_count)
    {
      prev.name = obj.user.name;
      prev.followers_count = obj.user.followers_count;
    }
  }
})
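The group() call above is essentially a fold: every document is passed to reduce together with an accumulator that starts as `initial`. A rough Python sketch of that behaviour, assuming an empty key (one single group); the helper names `group` and `keep_top_user` and the sample data are hypothetical.

```python
import copy

# Hypothetical sample tweets.
tweets = [
    {"user": {"name": "alice", "followers_count": 102}},
    {"user": None},
    {"user": {"name": "bob", "followers_count": 7300}},
]


def group(documents, initial, reduce_fn):
    """Fold all documents into one accumulator, as group() does with an empty key."""
    acc = copy.deepcopy(initial)
    for doc in documents:
        reduce_fn(doc, acc)  # mutates acc in place, like the JavaScript reduce
    return [acc]


def keep_top_user(obj, prev):
    """Overwrite the accumulator when the current tweet's user has more followers."""
    user = obj.get("user")
    if user is not None and prev["followers_count"] < user["followers_count"]:
        prev["name"] = user["name"]
        prev["followers_count"] = user["followers_count"]


result = group(tweets, {"name": "", "followers_count": 0}, keep_top_user)
print(result)
```

Unlike the map/reduce variant, the reduce function here returns nothing and works purely by mutating the accumulator.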
![Page 27: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/27.jpg)
![Page 28: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/28.jpg)
db.tweets.aggregate(
  {$group: {
    _id: {user_name: "$user.name"},
    followers_count: {$max: "$user.followers_count"}
  }},
  {$sort: {"followers_count" : -1}},
  {$limit : 1},
  {$project: {
    _id : 0,
    user_name : "$_id.user_name",
    followers_count : "$followers_count"
  }})
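Each stage of the pipeline above transforms the output of the previous one. The following Python sketch walks through the four stages on in-memory data to make that flow visible. It is an illustration under assumed sample tweets, not how the aggregation framework is implemented.

```python
# Hypothetical sample tweets; alice appears twice with different follower counts.
tweets = [
    {"user": {"name": "alice", "followers_count": 100}},
    {"user": {"name": "bob", "followers_count": 7300}},
    {"user": {"name": "alice", "followers_count": 102}},
]

# $group: one bucket per user name, keeping the maximum followers_count seen.
groups = {}
for tweet in tweets:
    name = tweet["user"]["name"]
    count = tweet["user"]["followers_count"]
    groups[name] = max(groups.get(name, count), count)

# $sort descending by followers_count, then $limit 1.
ranked = sorted(groups.items(), key=lambda item: item[1], reverse=True)[:1]

# $project: reshape into the output document, dropping _id.
result = [{"user_name": name, "followers_count": count} for name, count in ranked]
print(result)
```

Grouping by user name first means a user with many tweets is ranked once, by their highest follower count.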
![Page 29: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/29.jpg)
![Page 30: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/30.jpg)
#!/usr/bin/env python
# encoding: utf-8
import sys
sys.path.append(".")
from pymongo_hadoop import BSONMapper

def mapper(documents):
    # Emit one document per tweet, keyed by the user's name.
    for doc in documents:
        if doc.get('user') is not None:
            yield {'_id': doc['user']['name'].encode('utf-8'),
                   'followers': doc['user']['followers_count']}

BSONMapper(mapper)
print >> sys.stderr, "Done Mapping!"
![Page 31: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/31.jpg)
#!/usr/bin/env python
# encoding: utf-8
import sys
sys.path.append('.')
from pymongo_hadoop import BSONReducer

def reducer(key, values):
    print >> sys.stderr, "Processing key %s" % key.encode('utf-8')
    _count = 0
    for v in values:
        if _count < v['followers']:
            _count = v["followers"]
    return {"_id": key.encode('utf-8'), "count": _count}

BSONReducer(reducer)
print >> sys.stderr, "Done Reducing!"
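Before submitting a streaming job, the mapper and reducer logic can be tried locally. The sketch below is a Python 3 port of the two functions without `pymongo_hadoop` (so the `encode` calls and stderr logging are dropped), plus a hypothetical `run_local` helper that stands in for the Hadoop shuffle by sorting and grouping on `_id`. The sample tweets are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def mapper(documents):
    """Yield one {_id, followers} document per tweet that has a user."""
    for doc in documents:
        user = doc.get("user")
        if user is not None:
            yield {"_id": user["name"], "followers": user["followers_count"]}


def reducer(key, values):
    """Keep the highest follower count seen for this key."""
    count = 0
    for v in values:
        if count < v["followers"]:
            count = v["followers"]
    return {"_id": key, "count": count}


def run_local(documents):
    """Hypothetical stand-in for the shuffle: sort by _id, then reduce each group."""
    mapped = sorted(mapper(documents), key=itemgetter("_id"))
    return [
        reducer(key, list(group))
        for key, group in groupby(mapped, key=itemgetter("_id"))
    ]


# Hypothetical sample tweets.
tweets = [
    {"user": {"name": "alice", "followers_count": 100}},
    {"user": {"name": "bob", "followers_count": 7300}},
    {"user": {"name": "alice", "followers_count": 102}},
    {"user": None},
]

output = run_local(tweets)
print(output)
```

Note that this job yields one document per user rather than a single winner, so finding the overall top user would still need a sort on the output collection.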
![Page 32: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/32.jpg)
hadoop jar /usr/lib/hadoop/lib/mongo-hadoop-streaming-assembly-1.1.0-SNAPSHOT.jar \
  -files mapper.py,reducer.py \
  -inputURI mongodb://localhost:27017/twitter.tweets \
  -outputURI mongodb://localhost:27017/twitter.top_user \
  -mapper mapper.py \
  -reducer reducer.py
![Page 33: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/33.jpg)
![Page 34: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/34.jpg)
db.tweets.find().sort( {"user.followers_count": -1} ).limit(1)
![Page 35: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/35.jpg)
![Page 36: Map/Confused? A practical approach to Map/Reduce with MongoDB](https://reader033.vdocuments.net/reader033/viewer/2022052307/54b75ae84a7959bd138b45b7/html5/thumbnails/36.jpg)
db.tweets.mapReduce()
db.tweets.group()
db.tweets.aggregate()
MongoDB-Hadoop Adapter
db.tweets.find()