Introduction to new high performance storage engines in MongoDB 2.8 Henrik Ingo Solutions Architect, MongoDB 3.0

Upload: violet-robbins

Post on 19-Dec-2015



Introduction to new high performance storage engines in MongoDB 2.8

Henrik Ingo, Solutions Architect, MongoDB

3.0


Hi, I am Henrik Ingo

@h_ingo


Introduction to new high performance storage engines in MongoDB 2.8

Agenda:

- MongoDB and NoSQL
- Storage Engine API
- WiredTiger configuration + performance

3.0


Most popular NoSQL database


5 NoSQL categories

• Key Value: Redis, Riak
• Wide Column: Cassandra
• Document: MongoDB
• Graph: Neo4j
• Map Reduce: Hadoop


MongoDB is a Document Database

MongoDB

Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul's car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
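To make two of the example questions concrete, here is a minimal sketch in plain JavaScript. It runs without a server by filtering an in-memory array; the matching mongo shell queries are shown in comments. The collection name `people`, the second sample person, and the omission of the `location` field and the elided `…` fields are assumptions made for illustration only.

```javascript
// Sample documents in the shape shown on the slide.
// The second person is made up so that the filters have something to exclude.
const people = [
  { first_name: "Paul", surname: "Miller", city: "London",
    cars: [ { model: "Bentley", year: 1973, value: 100000 },
            { model: "Rolls Royce", year: 1965, value: 330000 } ] },
  { first_name: "Anna", surname: "Smith", city: "Paris",
    cars: [ { model: "Citroen", year: 1975, value: 20000 } ] },
];

// "Find everybody in London with a car built between 1970 and 1980"
// Mongo shell equivalent (assumes a collection named "people"):
//   db.people.find({ city: "London",
//                    cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } } })
const londonersWith70sCar = people.filter(
  (p) => p.city === "London" &&
         p.cars.some((c) => c.year >= 1970 && c.year <= 1980)
);

// "Calculate the average value of Paul's car collection"
// Mongo shell equivalent:
//   db.people.aggregate([ { $match: { first_name: "Paul" } },
//                         { $unwind: "$cars" },
//                         { $group: { _id: null, avg: { $avg: "$cars.value" } } } ])
const paulsCars = people
  .filter((p) => p.first_name === "Paul")
  .flatMap((p) => p.cars);
const averageValue =
  paulsCars.reduce((sum, c) => sum + c.value, 0) / paulsCars.length;

console.log(londonersWith70sCar.map((p) => p.surname), averageValue);
```

The `$elemMatch` operator matters in the first query: it requires a single car to satisfy both year bounds, which is what the `some()` call does in the in-memory version.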


Operational Database Landscape


MongoDB 3.0 & storage engines


Current state in MongoDB 2.6

Read-heavy apps
• Great performance
  – B-tree
  – Low overhead
• Good scale-out perf
  – Secondary reads
  – Sharding

Write-heavy apps
• Good scale-out perf
  – Sharding
• Per-node efficiency wish-list:
  – Doc level locking
  – Write-optimized data structures (LSM)
  – Compression

Other
• Complex transactions
• In-memory engine
• SSD optimized engine
• etc...


How to get all of the above?


MongoDB 3.0 Storage Engine API

• MMAP: read-heavy app
• WiredTiger: write-heavy app
• 3rd party: special app

MongoDB 3.0 Storage Engine API

• One at a time:
  – Many engines built into mongod
  – Choose 1 at startup
  – All data stored by the same engine
  – Incompatible on-disk data formats (obviously)
  – Compatible client API
• Compatible Oplog & Replication
  – Same replica set can mix different engines
  – No-downtime migration possible

Some existing engines

• MMAPv1
  – Improved MMAP (collection-level locking)
• WiredTiger
  – Discussed next
• RocksDB
  – LSM style engine developed by Facebook
  – Based on LevelDB
• TokuMXse
  – Fractal Tree indexing engine from Tokutek

Some rumored engines

• Heap
  – In-memory engine
• Devnull
  – Write all data to /dev/null
  – Based on idea from famous flash animation...
  – Oplog stored as normal
• SSD optimized engine (e.g. Fusion-IO)
• KV simple key-value engine

https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage


WiredTiger

What is WiredTiger

• Modern NoSQL database engine
  – Flexible schema
• Advanced database engine
  – Secondary indexes, MVCC, non-locking algorithms
  – Multi-statement transactions (not in MongoDB 3.0)
• Very modular, tunable
  – Btree, LSM and columnar indexes
  – Snappy, Zlib, 3rd-party compression
  – Index prefix compression, etc...
• Built by creators of BerkeleyDB
• Acquired by MongoDB in 2014
• source.wiredtiger.com


Choosing WiredTiger at server startup

mongod --storageEngine wiredTiger

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine


Main tunables exposed as MongoDB options

mongod --storageEngine wiredTiger \
  --wiredTigerCacheSizeGB 8 \
  --wiredTigerDirectoryForIndexes /data/indexes \
  --wiredTigerCollectionBlockCompressor zlib \
  --syncDelay 30

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine


All WiredTiger options via configString (hidden)

mongod --storageEngine wiredTiger \
  --wiredTigerEngineConfigString "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
  --wiredTigerCollectionConfigString "block_compressor=zlib" \
  --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
  --wiredTigerDirectoryForIndexes /data/indexes

See docs for wiredtiger_open() & WT_SESSION::create():
http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
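WiredTiger config strings are comma-separated `key=value` pairs, with nested groups written as `key=(...)`. Typing them by hand makes it easy to drop a parenthesis or an equals sign, so one option is to generate them from a plain object. The sketch below is a hypothetical helper, not part of MongoDB or WiredTiger; `toConfigString` is a name invented here.

```javascript
// Hypothetical helper: serialize a nested options object into
// WiredTiger's "key=value,key=(nested)" config string syntax.
function toConfigString(options) {
  return Object.entries(options)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})` // nested group
        : `${key}=${value}`)                  // scalar setting
    .join(",");
}

// The engine-level settings from the slide, expressed as an object.
const engineConfig = toConfigString({
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
});

console.log(engineConfig);
// prints "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)"
```

The resulting string can be passed as the value of `--wiredTigerEngineConfigString`. The helper does no validation, so option names still need to be checked against the WiredTiger documentation.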


Also via createCollection(), createIndex()

db.createCollection( "users", { storageEngine: { wiredTiger: { configString: "block_compressor=none" } } } )

http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex

More...

• db.serverStatus()
• db.collection.stats()
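As an example of what these commands are useful for, the sketch below computes how full the WiredTiger cache is from a `db.serverStatus()` result. It assumes the cache statistics appear under `wiredTiger.cache` with the keys `"bytes currently in the cache"` and `"maximum bytes configured"`; check the output of your own server, since statistic names can differ between versions. The helper name `cacheFillRatio` and the sample numbers are made up.

```javascript
// Hypothetical helper: fraction of the configured WiredTiger cache in use.
// Assumes the statistic names below are present in the serverStatus output.
function cacheFillRatio(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  return cache["bytes currently in the cache"] /
         cache["maximum bytes configured"];
}

// In the mongo shell you would call: cacheFillRatio(db.serverStatus())
// Stand-in object with made-up numbers so the sketch runs without a server.
const sampleStatus = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 6 * 1024 ** 3,
      "maximum bytes configured": 8 * 1024 ** 3,
    },
  },
};

console.log(cacheFillRatio(sampleStatus));
```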


Understanding and Optimizing WiredTiger

Understanding WiredTiger architecture

Layers, top to bottom:
• WiredTiger SE
  – Btree | LSM | Columnar
  – Cache (default: 50%)
  – None | Snappy | Zlib
• OS Disk Cache (Default: 50%)
• Physical disk

Covering 90% of your optimization needs

(WiredTiger architecture diagram repeated, annotated with:)
• Decompression time
• Disk seek time

Strategy 1: fit working set in Cache

(WiredTiger architecture diagram repeated, annotated with:)
• cache_size = 80%

Strategy 2: fit working set in OS Disk Cache

(WiredTiger architecture diagram repeated, annotated with:)
• cache_size = 10%
• OS Disk Cache (Remaining: 90%)
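Strategies 1 and 2 differ only in how RAM is split between the WiredTiger cache and the OS disk cache. The sketch below turns the slide percentages into a value for `--wiredTigerCacheSizeGB`, assuming that option takes a whole number of gigabytes. The 64 GB machine size and the helper name `cacheSizeGB` are assumptions for illustration.

```javascript
// Hypothetical helper: WiredTiger cache size in whole GB for a given share of RAM.
// Rounds down and never returns less than 1 GB.
function cacheSizeGB(totalRamGB, fraction) {
  return Math.max(1, Math.floor(totalRamGB * fraction));
}

const ramGB = 64; // assumed machine size, not from the slides

// Strategy 1: working set in the WiredTiger cache (cache_size = 80%).
const strategy1 = cacheSizeGB(ramGB, 0.8);
// Strategy 2: working set in the OS disk cache (cache_size = 10%).
const strategy2 = cacheSizeGB(ramGB, 0.1);

console.log(`mongod --storageEngine wiredTiger --wiredTigerCacheSizeGB ${strategy1}`);
console.log(`mongod --storageEngine wiredTiger --wiredTigerCacheSizeGB ${strategy2}`);
```

On the assumed 64 GB machine this gives 51 GB for strategy 1 and 6 GB for strategy 2, leaving the rest of RAM to the operating system.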

Strategy 3: SSD disk + compression to save €

(WiredTiger architecture diagram repeated, annotated with:)
• Physical disk: SSD

Strategy 4: SSD disk (no compression)

(WiredTiger architecture diagram repeated, annotated with:)
• Physical disk: SSD

What problem is solved by LSM indexes?

Performance goal:
• Fast reads. Easy: add indexes
• Fast writes. Easy: no indexes
• Both. Hard: smart schema design (hire a consultant), LSM index structures (or columnar)
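To show why an LSM structure helps with the "both" case, here is a toy log-structured merge tree in JavaScript. It is a teaching sketch only and not how WiredTiger implements LSM: there is no disk I/O, no compaction, and no bloom filters, and the class name `ToyLsm` and the flush threshold are invented here. Writes go to an in-memory table that is flushed as a sorted run, and reads check the newest data first.

```javascript
// Toy LSM tree: illustrative only, not WiredTiger's implementation.
class ToyLsm {
  constructor(memtableLimit = 4) {
    this.memtableLimit = memtableLimit; // flush threshold (arbitrary value)
    this.memtable = new Map();          // recent writes, held in memory
    this.runs = [];                     // immutable sorted runs, newest first
  }

  put(key, value) {
    // A write only touches the in-memory table; no existing run is modified.
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  flush() {
    // Sort once and store as a new run; on disk this would be a sequential write.
    const run = [...this.memtable.entries()]
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    this.runs.unshift(run);
    this.memtable = new Map();
  }

  get(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    // Reads may have to check several runs, newest first.
    for (const run of this.runs) {
      const hit = findInRun(run, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }
}

// Binary search over one sorted run of [key, value] pairs.
function findInRun(run, key) {
  let lo = 0;
  let hi = run.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [k, v] = run[mid];
    if (k === key) return v;
    if (k < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

const index = new ToyLsm(2);
index.put("a", 1);
index.put("b", 2); // memtable is full, flushed as the first run
index.put("a", 3); // newer value for "a" shadows the one in the older run
index.put("c", 4); // flushed as the second, newer run
console.log(index.get("a"), index.get("b"), index.get("z"));
```

The trade-off is visible in the code: `put` never searches or rewrites older data, while `get` may have to consult every run. Real engines limit that read cost by merging runs in the background.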


2B inserts (with 3 secondary indexes)

http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html
