Шардинг в mongodb, henrik ingo (mongodb)

49
Solutions Architect, MongoDB Henrik Ingo @h_ingo Introduction to Sharding

Upload: ontico

Post on 21-Jun-2015

556 views

Category:

Internet


3 download

DESCRIPTION

Доклад Хенрика Инго на HighLoad++ 2014.

TRANSCRIPT

Page 1: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Solutions Architect, MongoDB

Henrik Ingo

@h_ingo

Introduction to Sharding

Page 2: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Agenda

• Scaling Data

• MongoDB's Approach

• Architecture

• Configuration

• Mechanics

Page 3: Шардинг в MongoDB, Henrik Ingo (MongoDB)

MongoDB's Approach to Sharding

Page 4: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Partitioning

• User defines shard key

• Shard key defines range of data

• Key space is like points on a line

• Range is a segment of that line

Page 5: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Initially 1 chunk

Default max chunk size: 64mb

MongoDB automatically splits & migrates chunks when max reached

Data Distribution

Page 6: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Queries routed to specific shards

MongoDB balances cluster

MongoDB migrates data to new nodes

Routing and Balancing

Page 7: Шардинг в MongoDB, Henrik Ingo (MongoDB)

MongoDB Auto-Sharding

• Minimal effort required– Same interface as single mongod

• Two steps– Enable Sharding for a database

– Shard collection within database

Page 8: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Architecture

Page 9: Шардинг в MongoDB, Henrik Ingo (MongoDB)

What is a Shard?

• Shard is a node of the cluster

• Shard can be a single mongod or a replica set

Page 10: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Sharding infrastructure

Page 11: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Configuration

Page 12: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example Cluster

Page 13: Шардинг в MongoDB, Henrik Ingo (MongoDB)

mongod --configsvr

Starts a configuration server on the default port (27019)

Starting the Configuration Server

Page 14: Шардинг в MongoDB, Henrik Ingo (MongoDB)

mongos --configdb <hostname>:27019

For 3 configuration servers:

mongos --configdb <host1>:<port1>,<host2>:<port2>,<host3>:<port3>

This is always how to start a new mongos, even if the cluster is already running

Start the mongos Router

Page 15: Шардинг в MongoDB, Henrik Ingo (MongoDB)

mongod --shardsvr

Starts a mongodwith the default shard port (27018)

Shard is not yet connected to the rest of the cluster

Shard may have already been running in production

Start the shard database

Page 16: Шардинг в MongoDB, Henrik Ingo (MongoDB)

On mongos: – sh.addShard(‘<host>:27018’)

Adding a replica set: – sh.addShard(‘<rsname>/<seedlist>’)

Add the Shard

Page 17: Шардинг в MongoDB, Henrik Ingo (MongoDB)

db.runCommand({ listshards:1 })

{ "shards" :

[{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ],

"ok" : 1

}

Verify that the shard was added

Page 18: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Enabling Sharding

• Enable sharding on a database

sh.enableSharding(“<dbname>”)

• Shard a collection with the given key

sh.shardCollection(“<dbname>.people”,{“country”:1})

• Use a compound shard key to prevent duplicates

sh.shardCollection(“<dbname>.cars”,{“year”:1,

”uniqueid”:1})

Page 19: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Tag Aware Sharding

• Tag aware sharding allows you to control the distribution of your data

• Tag a range of shard keys– sh.addTagRange(<collection>,<min>,<max>,<tag>)

• Tag a shard– sh.addShardTag(<shard>,<tag>)

Page 20: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Routing Requests

Page 21: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Cluster Request Routing

• Targeted Queries

• Scatter Gather Queries

• Scatter Gather Queries with Sort

Page 22: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Cluster Request Routing: Targeted Query

Page 23: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Routable request received

Page 24: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Request routed to appropriate shard

Page 25: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shard returns results

Page 26: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Mongos returns results to client

Page 27: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Cluster Request Routing: Non-Targeted Query

Page 28: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Non-Targeted Request Received

Page 29: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Request sent to all shards

Page 30: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shards return results to mongos

Page 31: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Mongos returns results to client

Page 32: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Cluster Request Routing: Non-Targeted Query with Sort

Page 33: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Non-Targeted request with sort received

Page 34: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Request sent to all shards

Page 35: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Query and sort performed locally

Page 36: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shards return results to mongos

Page 37: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Mongos merges sorted results

Page 38: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Mongos returns results to client

Page 39: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shard Key

Page 40: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shard Key

• Shard key is immutable

• Shard key values are immutable

• Shard key must be indexed

• Shard key limited to 512 bytes in size

• Shard key used to route queries– Choose a field commonly used in queries

• Only shard key can be unique across shards– `_id` field is only unique within individual shard

Page 41: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Shard Key Considerations

• Cardinality

• Write Distribution

• Query Isolation

• Reliability

• Index Locality

Page 42: Шардинг в MongoDB, Henrik Ingo (MongoDB)

{

_id: ObjectId(),

user: 123,

time: Date(),

subject: “...”,

recipients: [],

body: “...”,

attachments: []

}

Example: Email Storage

Most common scenario, can be applied to 90% cases

Each document can be up to 16MB

Each user may have GBs of storage

Most common query: get user emails sorted by time

Indexes on {_id}, {user, time}, {recipients}

Page 43: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example: Email Storage

CardinalityWrite

Scaling

Query

IsolationReliability

Index

Locality

Page 44: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example: Email Storage

CardinalityWrite

Scaling

Query

IsolationReliability

Index

Locality

_id Doc level One shardScatter/gat

her

All users

affectedGood

Page 45: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example: Email Storage

CardinalityWrite

Scaling

Query

IsolationReliability

Index

Locality

_id Doc level One shardScatter/gat

her

All users

affectedGood

hash(_id) Hash level All ShardsScatter/gat

her

All users

affectedPoor

Page 46: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example: Email Storage

CardinalityWrite

Scaling

Query

IsolationReliability

Index

Locality

_id Doc level One shardScatter/gat

her

All users

affectedGood

hash(_id) Hash level All ShardsScatter/gat

her

All users

affectedPoor

user Many docs All Shards TargetedSome users

affectedGood

Page 47: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Example: Email Storage

CardinalityWrite

Scaling

Query

IsolationReliability

Index

Locality

_id Doc level One shardScatter/gat

her

All users

affectedGood

hash(_id) Hash level All ShardsScatter/gat

her

All users

affectedPoor

user Many docs All Shards TargetedSome users

affectedGood

user, time Doc level All Shards TargetedSome users

affectedGood

Page 48: Шардинг в MongoDB, Henrik Ingo (MongoDB)

@h_ingo

Questions?

Page 49: Шардинг в MongoDB, Henrik Ingo (MongoDB)

Solutions Architect, MongoDB

Henrik Ingo

@h_ingo

Thank you!