MongoDB @ Fliptop
Tech talk on how Fliptop leverages MongoDB in its infrastructure for better scalability, presented at TWJUG.
MongoDB @ Fliptop
2011/12/10
Agenda
• Fliptop
  o infrastructure
• MongoDB
  o architecture
  o sharding strategy
  o data schema
  o index and query
  o miscellaneous
What is Fliptop?
• Social profile lookup
  o Facebook, Twitter, LinkedIn
  o campaign analysis
  o API lookup
• Our problems
  o scalability
    Data: ~7 billion data points
    Infrastructure: ~1MM lookups/day
Fliptop Infrastructure
• Infrastructure
  o Amazon EC2
• NoSQL Database
  o MongoDB
• Indexing and full-text search
  o Apache Solr
• Distributed computing
  o AWS Elastic MapReduce (Hadoop)
Fliptop Databases
• Fliptop Data
  o ~50MM records
• without MongoDB
  o MySQL: AWS RDS x 1
  o Solr: AWS EC2 m1.large x 10
• with MongoDB
  o MySQL: AWS RDS x 1
  o Solr: AWS EC2 m1.large x 2 (master/slave)
  o MongoDB: AWS EC2 m2.xlarge x 10 (replica sets)
From Solr to MongoDB
• Our Storage Requirements
  o auto sharding
  o richness of queries
  o short insert latency
• Other Reasons
  o documentation
  o active community
  o word of mouth
• Migration Efforts
  o queries
  o db driver
  o performance tuning
MongoDB Features
• Auto-Sharding
  o scale out to 1000 nodes
• Replication & High Availability
  o master/slave and replica set
• Querying
  o covers most SQL-style query operations
• Document-oriented storage
  o JSON, schema-free
• Full Index Support
  o index any field
• Map/Reduce
  o JavaScript on the server side
MongoDB Servers
MongoDB Sharding
• Automatic balancing for changes in load and data distribution
• Easy addition of new machines
• Scaling out to one thousand nodes
• No single points of failure
• Automatic failover
MongoDB Replication
• master/slave
  o easy setup
  o manual fail-over
• replica set
  o slightly more complex setup
  o automatic fail-over
  o minimum nodes: 3 (1 arbiter)
  o maximum nodes: 12
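The minimum replica set layout above (3 nodes, one of them an arbiter) can be sketched as a configuration object. This is only an illustration: the set name `rs0` and the host names are hypothetical, and in the mongo shell an object of this shape would be passed to `rs.initiate(config)`.

```javascript
// Hypothetical set name and hosts, for illustration only.
// In the mongo shell this object would be passed to rs.initiate(config).
const config = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" },
    // An arbiter votes in elections but stores no data.
    { _id: 2, host: "arb1.example.com:27017", arbiterOnly: true },
  ],
};

// Split the members into data-bearing nodes and arbiters.
const arbiters = config.members.filter((m) => m.arbiterOnly === true);
const dataNodes = config.members.filter((m) => !m.arbiterOnly);
console.log(`${dataNodes.length} data nodes, ${arbiters.length} arbiter`);
```

Using an arbiter as the third member keeps an odd number of voters without paying for a third full copy of the data.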
MongoDB Failover
• Voting algorithm (replica set)
  o votes needed to elect a primary: floor(all nodes / 2) + 1
• Priority
  o if 0, the node never becomes primary
    e.g. a backup node on a small machine
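A minimal sketch of the voting rule, assuming a primary needs a strict majority of the set's voting members. The function names are hypothetical and this is not MongoDB's election code, only the arithmetic behind it.

```javascript
// Votes needed to elect a primary in a set with `totalMembers` voting members.
function votesNeeded(totalMembers) {
  return Math.floor(totalMembers / 2) + 1;
}

// An election can only succeed while a majority of members is reachable.
function canElectPrimary(totalMembers, reachableMembers) {
  return reachableMembers >= votesNeeded(totalMembers);
}

console.log(votesNeeded(3));        // 2
console.log(canElectPrimary(3, 2)); // true
console.log(canElectPrimary(3, 1)); // false
```

This is why a 3-node set survives the loss of one node, while a 2-node set without an arbiter cannot elect a primary after losing either node.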
Fliptop MongoDB Infrastructure
• Data
  o 10MM/replica set
• MongoDB servers
  o router x 1
  o config server x 1
  o shard servers x 10
    5 primary
    5 secondary
  o arbiter servers x 5
• AWS EC2 Instances
  o m2.xlarge x 10
MongoDB and AWS EC2
• Instance type
  o m2.xlarge
    17.1 GB of memory
    6.5 EC2 Compute Units
• Storage
  o Local Drive
    faster i/o
    not portable
  o EBS
    i/o = network + disk i/o
    portable
    easy backup
    RAID 1/0
MongoDB Sharding Strategy
• Sharding Key Strategy
  o Ascending shard key
    data locality
    hotspot for read/write
    ex. timestamp, auto-increment PK
  o Random shard key
    evenly distributes read/write
    no data locality
    ex. UUID, md5
  o Hybrid shard key
    ascending
    evenly distributed
    ex. timestamp + uuid
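The three key styles can be sketched as small key-building functions. This is an illustration under assumptions, not Fliptop's actual key format: the function names, the `recordId` input, and the month-sized time bucket are hypothetical, and an md5 digest stands in for the random part so the output is repeatable.

```javascript
const crypto = require("crypto");

// Ascending key: sorts by creation time, so new writes all target the
// highest key range (good locality, but a write hotspot).
function ascendingKey(date) {
  return date.toISOString();
}

// Random key: md5 hex digest of a hypothetical record id, which spreads
// keys evenly but gives no locality.
function randomKey(recordId) {
  return crypto.createHash("md5").update(String(recordId)).digest("hex");
}

// Hybrid key: a coarse time bucket (year-month) followed by a random suffix,
// so keys ascend by month but spread out within each month.
function hybridKey(date, recordId) {
  return date.toISOString().slice(0, 7) + "_" + randomKey(recordId);
}

const day = new Date("2011-12-10T00:00:00Z");
console.log(ascendingKey(day));
console.log(randomKey(42));
console.log(hybridKey(day, 42));
```

The size of the time bucket is the tuning knob: a coarser bucket spreads writes more evenly, a finer one keeps more locality.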
From timestamp to uuid
• Why timestamp?
  o same sharding key as our Solr setup
  o issues
    slowness of count (traverse) queries
    maintenance headache
      nodes must be added more frequently
    duplication of uuids
• From timestamp to uuid
  o performance gain with count
    2x faster
    ex. count of 1MM, from 10s to ~5s
  o less maintenance
    multiple nodes can be enabled at the same time
  o dedup
    uniqueness of a uuid is guaranteed only locally (within a shard)
MongoDB Balancer
• if the number of chunks is not evenly distributed, the balancer can fix it
  o stop criteria
    until the difference between nodes is <= 2
  o balancer window
    active time window
  o blocking if moving massive data
    e.g. while adding a brand new node
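The stop criterion can be illustrated with a toy simulation. This is a simplified model of the behavior the slide describes, not MongoDB's balancer implementation: the `balance` function is hypothetical, it moves one chunk at a time from the fullest shard to the emptiest, and it stops once the difference is at most 2.

```javascript
// chunkCounts: number of chunks held by each shard (hypothetical numbers).
// Moves one chunk at a time until max - min <= threshold.
function balance(chunkCounts, threshold = 2) {
  const counts = chunkCounts.slice();
  let moves = 0;
  for (;;) {
    const max = Math.max(...counts);
    const min = Math.min(...counts);
    if (max - min <= threshold) break;
    counts[counts.indexOf(max)] -= 1; // donor: the fullest shard
    counts[counts.indexOf(min)] += 1; // receiver: the emptiest shard
    moves += 1;
  }
  return { counts, moves };
}

// Two full shards plus one brand new, empty shard.
console.log(balance([20, 20, 0])); // { counts: [ 14, 14, 12 ], moves: 12 }
```

The brand new shard triggers 12 chunk migrations in this small example, which hints at why adding a node to a large cluster is best scheduled inside a balancer window.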
MongoDB Schema
• Document oriented
  o JSON
• Schema Free
  o pros
    no predefined schema is required
    save 'as is'
  o cons
    overhead of headers (field names are stored in every document)
    broken data is hard to detect
MongoDB Schema and Size
• Size matters
  o simple schema is better
    payment:[{"publisher_id": 176, "paid":true}] -> payment:["176_1"]
  o abbreviation of headers
    payment:["176_1"] -> pm:["176_1"]
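A rough sketch of the two size reductions, using the payment example from the slide. The helper names are hypothetical, and serialized JSON length is used only as a proxy: MongoDB stores BSON, so the real byte counts differ, but the ordering of the three forms is the point.

```javascript
// Original, verbose form from the slide.
const verbose = { payment: [{ publisher_id: 176, paid: true }] };

// Step 1: flatten each payment into "<publisher_id>_<1|0>".
function compactPayment(doc) {
  return {
    payment: doc.payment.map((p) => `${p.publisher_id}_${p.paid ? 1 : 0}`),
  };
}

// Step 2: abbreviate the field name ("header"), since it is stored per document.
function abbreviate(doc) {
  return { pm: doc.payment };
}

// Serialized JSON size in bytes, as a stand-in for stored size.
function jsonBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

const compact = compactPayment(verbose);
const short = abbreviate(compact);
console.log(jsonBytes(verbose), jsonBytes(compact), jsonBytes(short));
```

The trade-off is readability: the compact form has to be decoded by application code, so it pays off mainly for fields repeated across many millions of documents.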
MongoDB Queries
1) COLUMN = VALUE
2) COLUMN in RANGE
3) boolean operators AND, OR, NOT
4) pagination (start, rows)
5) sort
6) count (of query result)
7) COLUMN is non-existent
8) multiValued fields
9) dynamic fields
10) dynamic multiValued fields
11) stats queries (min, max)
12) faceted queries (aggregation of specific fields)
13) free text search (regular expression)
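As an illustration of how a few of these query types look as MongoDB query documents, here is a sketch using standard query operators. The field names (`city`, `age`, `job`, `tags`, `name`) and the `persons` collection in the comments are hypothetical.

```javascript
// Query documents as plain objects; field names are hypothetical.
const queries = {
  equals: { city: "Taipei" },                                   // 1) COLUMN = VALUE
  range: { age: { $gte: 18, $lt: 30 } },                        // 2) COLUMN in RANGE
  either: { $or: [{ city: "Taipei" }, { age: { $lt: 18 } }] },  // 3) boolean OR
  missing: { job: { $exists: false } },                         // 7) COLUMN is non-existent
  multiValued: { tags: { $in: ["java", "mongodb"] } },          // 8) multiValued fields
  freeText: { name: /^robbie/ },                                // 13) regular expression
};

// Pagination, sort, and count are cursor methods in the mongo shell, e.g.:
//   db.persons.find(queries.range).sort({ age: 1 }).skip(20).limit(10)
//   db.persons.find(queries.range).count()
console.log(Object.keys(queries));
```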
MongoDB Index
• Tree-structured index
• At most 64 indexes per collection (table)
• A query only leverages 1 index unless it is an $or query
• An index entails additional work on insert, delete, update
MongoDB Index Types
• Basic Index
  o db.persons.ensureIndex({name: 1});
• Embedded Index
  o db.persons.ensureIndex({"location.city": 1})
• Compound Index
  o db.persons.ensureIndex({name: 1, "location.city": 1})
• Sparse Index
  o db.persons.ensureIndex({job: 1}, {sparse: true})
MongoDB Index Limits
• negation operations
  o $ne, $not
  o ex. db.things.find( { x : { $ne : 3 } } );
• arithmetic operations
  o $mod
  o ex. db.things.find( { a : { $mod : [ 10, 1 ] } } )
• most regular expressions
  o yes (case-sensitive prefix match can use the index)
    db.persons.find({name: /^robbie/})
    db.persons.find({name: /^robbie.*/})
  o no
    db.persons.find({name: /^robbie.*/i})
    db.persons.find({name: /robbie/})
• $where
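A small sketch of why only some regular expressions can use an index, assuming (as MongoDB's documentation describes) that only case-sensitive prefix expressions are served efficiently. A prefix match is equivalent to a range scan over the index, which the hypothetical helpers below make explicit. The check is a heuristic: it looks only at the leading `^` and the flags, so a pattern such as `/^a|b/` would be misclassified.

```javascript
// Heuristic: index-friendly only if anchored at the start and neither
// case-insensitive nor multiline.
function isIndexFriendlyRegex(re) {
  return (
    re.source.startsWith("^") &&
    !re.flags.includes("i") &&
    !re.flags.includes("m")
  );
}

// A prefix match on "robbie" covers the same keys as the range
// ["robbie", "robbif"), which is what makes an index scan possible.
// Assumes a non-empty prefix.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { $gte: prefix, $lt: upper };
}

console.log(isIndexFriendlyRegex(/^robbie/));  // true
console.log(isIndexFriendlyRegex(/^robbie/i)); // false
console.log(isIndexFriendlyRegex(/robbie/));   // false
console.log(prefixRange("robbie"));            // { '$gte': 'robbie', '$lt': 'robbif' }
```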
MongoDB Index Optimization
• simple data type
  o ex. int is faster than string
• simple data schema
  o ex. {payment: "176_1"}
• sparse index
  o for optional fields
MongoDB Miscellaneous
• Monitoring
  o CPU
    high CPU usually implies an index problem
  o Drive size
    tells when it is time to add a new instance
• Backup
  o EBS: snapshot
  o mongo import/export tools
    mongodump/mongorestore, mongoexport/mongoimport
• Auto Deployment
  o Hudson + Fabric (Python)
What's Next?
• Further data and index weight loss
  o target: 20MM/instance
• Introduce Java POJO/DAO
  o Morphia
  o Spring Data MongoDB
• Watchdog mechanism
  o restart server automatically
Thank you!