MongoDB Tuning on AWS
Information
•Tweet: Hashtag #jawsdays #ijaws
•Please register for ijaws on Doorkeeper (next meetup in mid-April)
•There’s a job board behind the wall
Self-Introduction
•Ryuji Tamagawa@facebook
•tamagawa_ryuji@twitter
•Software developer working in Osaka
•Translator (for O’Reilly)
•Loves performance tuning
Introducing MongoDB
•Hybrid of NoSQL and RDB
•Scales easily (up to a certain point)
•Stores JSON documents as ‘BSON’
•Has secondary indexes (on any part of a JSON document) and a query optimizer
•Replication and sharding ready
To make MongoDB run fast on AWS
•You have to understand:
•Its memory management architecture
•The workload pattern of your application
•The size of your ‘HOT’ data
What’s the ‘HOT’ data?
•‘Hot’ data is the data that is accessed frequently
•Ex: If you simply write data such as access logs and transfer it somewhere else, the ‘hot’ spot can be very small
•If the collection has indexes, a single write can make many places hot
MongoDB does not manage memory
•Most DBMSs have a built-in memory management system (MMS), but MongoDB doesn’t.
•MongoDB accesses its database files through ‘memory-mapped files’ and lets the OS manage the buffer
(Diagram: a traditional RDB keeps its own buffer in memory in front of the DB files; MongoDB sits between the app and memory-mapped DB files managed by the OS)
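To make the memory-mapped idea concrete, here is a minimal sketch using Python's standard `mmap` module. It is not MongoDB code; the function name `mmap_roundtrip` and the temporary file are purely illustrative. The point is that the program reads and writes the file as if it were memory, while the OS decides which pages stay cached and when they reach disk.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes) -> bytes:
    """Write payload into a file through a memory map, then read it back.

    Illustrative only: the file layout is hypothetical and has nothing
    to do with MongoDB's actual data files.
    """
    if not payload:
        raise ValueError("payload must not be empty (cannot map an empty file)")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        # Pre-size the file; a mapping cannot grow the file by itself.
        with open(path, "wb") as f:
            f.write(b"\x00" * len(payload))
        with open(path, "r+b") as f:
            # Length 0 maps the whole file into the address space.
            with mmap.mmap(f.fileno(), 0) as mm:
                mm[: len(payload)] = payload  # looks like a memory write
                mm.flush()                    # ask the OS to write dirty pages
        # Read the file normally to confirm the data reached it.
        with open(path, "rb") as f:
            return f.read()
```

Nothing in this code manages a buffer pool: caching is left entirely to the OS page cache, which is why the amount of free RAM on the instance matters so much.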
The Rules of Thumb about Memory
•Give enough memory to the OS to hold ‘HOT’ data
•Don’t forget about the indexes
•Use dedicated EC2 instances
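The rule of thumb above can be sketched as a quick sizing check. This is only a rough estimate under stated assumptions: the function name, the 20% reserve for the OS and the mongod process, and the example sizes are all hypothetical. In practice the index size could be taken from a figure such as `totalIndexSize` in the `collStats` output, while the hot data size has to be estimated from your workload.

```python
GIB = 1024 ** 3


def fits_in_memory(hot_data_bytes, index_bytes, ram_bytes, os_reserve_fraction=0.2):
    """Return True if hot data plus indexes fit in the RAM left for the page cache.

    os_reserve_fraction is an assumed share of RAM kept for the OS and
    other processes; tune it for your own instance.
    """
    if not 0 <= os_reserve_fraction < 1:
        raise ValueError("os_reserve_fraction must be in [0, 1)")
    usable = ram_bytes * (1 - os_reserve_fraction)
    return hot_data_bytes + index_bytes <= usable


# Example: a 7.5 GiB instance, 4 GiB of hot data, 1 GiB of indexes.
print(fits_in_memory(4 * GIB, 1 * GIB, 7.5 * GIB))
```

If the check fails, the options are a larger instance, fewer or smaller indexes, or a smaller hot set.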
Keep your data safe with Replication
•With a replica set, you can easily distribute your data to many places
•You have several options for keeping your data safe from crashes
•EBS or instance store: a trade-off between cost, safety, and performance
(Diagram: one Primary replicating to two Secondaries)
Try a MongoDB replica set with:
https://bitbucket.org/tamagawa_ryuji/mongodb_replicaset_playground_on_vagrant
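As a small illustration of the one-primary, two-secondary layout, the sketch below builds a replica set configuration document of the kind passed to MongoDB's `replSetInitiate` command. The set name `rs0`, the host names, and the helper function are assumptions made for this example; they do not come from the playground repository.

```python
def build_replica_set_config(set_name, hosts):
    """Build a replica set config document: one member entry per host.

    set_name and hosts are hypothetical values supplied by the caller.
    """
    if not hosts:
        raise ValueError("a replica set needs at least one member")
    return {
        "_id": set_name,
        "members": [
            {"_id": member_id, "host": host}
            for member_id, host in enumerate(hosts)
        ],
    }


# Three members, e.g. spread across availability zones (hypothetical hosts).
config = build_replica_set_config(
    "rs0",
    [
        "mongo-a.example.internal:27017",
        "mongo-b.example.internal:27017",
        "mongo-c.example.internal:27017",
    ],
)
print(config)
```

With pymongo, a document like this could be sent from a client connected to one of the members, for example `client.admin.command("replSetInitiate", config)`; the members then elect a primary among themselves.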
Storage Performance Evaluated
• Converted Wikipedia-ja’s page data (about 1,700,000 documents) to JSON
• Wrote the documents to MongoDB on EC2 from another instance
• The data writer is a simple Python application using the pymongo driver, running 4 processes
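A writer of this kind might look roughly like the sketch below. It is not the script used for the measurements; the function names, the round-robin split, and the connection parameters are assumptions. Each worker process opens its own `MongoClient`, since a client should not be shared across processes, and pymongo is imported inside the worker so the rest of the module loads without it.

```python
from multiprocessing import Pool


def split_into_chunks(docs, n_workers):
    """Deal documents round-robin into n_workers lists."""
    return [docs[i::n_workers] for i in range(n_workers)]


def write_chunk(args):
    """Insert one chunk of documents; runs inside a worker process."""
    uri, db_name, coll_name, docs = args
    from pymongo import MongoClient  # imported lazily; requires pymongo

    client = MongoClient(uri)
    try:
        if docs:  # insert_many rejects an empty list
            client[db_name][coll_name].insert_many(docs, ordered=False)
    finally:
        client.close()
    return len(docs)


def load_parallel(uri, db_name, coll_name, docs, n_workers=4):
    """Write docs with n_workers processes and return the number written."""
    chunks = split_into_chunks(docs, n_workers)
    jobs = [(uri, db_name, coll_name, chunk) for chunk in chunks]
    with Pool(n_workers) as pool:
        return sum(pool.map(write_chunk, jobs))
```

A call such as `load_parallel("mongodb://db-host:27017", "wiki", "pages", docs)` (hypothetical host and names) would then drive the write load from the client instance. Call it under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.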
Storage Performance Evaluated
Storage / Time to finish:
ebs-normal   0:10:55
ephemeral0   0:07:36
PIOPS 1500   0:08:26
ephemeral0   0:10:22
PIOPS 1500   0:09:02
ephemeral0   0:05:19
Instance Type / Cost (Spot):
m3.large   $0.09
m3.xlarge (SSD instance store)   $0.16
hi1.4xlarge (Storage Optimized)   $0.50
Comparing Instance Types
Instance Type  CPU  ECU  Memory  Storage      Cost   Memory ($/GB)  CPU ($/ECU)  Storage ($/100GB)
m3.medium      1    3    3.75    1 x 4 SSD    $0.17  $0.05          $0.06        $4.28
m3.large       2    6.5  7.5     1 x 32 SSD   $0.34  $0.05          $0.05        $1.07
m3.xlarge      4    13   15      2 x 40 SSD   $0.68  $0.05          $0.05        $0.86
m3.2xlarge     8    26   30      2 x 80 SSD   $1.37  $0.05          $0.05        $0.86
m2.xlarge      2    6.5  17.1    1 x 420      $0.51  $0.03          $0.08        $0.12
m2.2xlarge     4    13   34.2    1 x 850      $1.01  $0.03          $0.08        $0.12
m2.4xlarge     8    26   68.4    2 x 840      $2.02  $0.03          $0.08        $0.12
cr1.8xlarge    32   88   244     2 x 120 SSD  $4.31  $0.02          $0.05        $1.80
i2.xlarge      4    14   30.5    1 x 800 SSD  $1.05  $0.03          $0.08        $0.13
i2.2xlarge     8    27   61      2 x 800 SSD  $2.10  $0.03          $0.08        $0.13
i2.4xlarge     16   53   122     4 x 800 SSD  $4.20  $0.03          $0.08        $0.13
i2.8xlarge     32   104  244     8 x 800 SSD  $8.40  $0.03          $0.08        $0.13
hs1.8xlarge    16   35   117     24 x 2048    $5.67  $0.05          $0.16        $0.01
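The three per-unit columns appear to be the Cost column divided by each resource. The sketch below reproduces that arithmetic for one row; the function name is made up for this example, and the interpretation of the columns is inferred from the numbers rather than stated on the slide.

```python
def unit_costs(cost, memory_gb, ecu, disk_count, disk_gb):
    """Derive per-unit costs from an instance's price and resources."""
    total_storage_gb = disk_count * disk_gb
    return {
        "memory_per_gb": cost / memory_gb,
        "cpu_per_ecu": cost / ecu,
        "storage_per_100gb": cost / total_storage_gb * 100,
    }


# cr1.8xlarge row: $4.31, 244 GB memory, 88 ECU, 2 x 120 GB SSD.
cr1 = unit_costs(4.31, 244, 88, 2, 120)
print({name: round(value, 2) for name, value in cr1.items()})
```

For cr1.8xlarge this gives about $0.02 per GB of memory, $0.05 per ECU, and $1.80 per 100 GB of storage, matching the table. Some other rows differ by a cent, presumably because the Cost column is itself rounded.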
THANK YOU! YOUR CONTACTS ARE WELCOME!