
Sharding: patterns and antipatterns Konstantin Osipov (Mail.Ru, Tarantool) Alexey Rybak (Badoo)

Upload: ontico

Post on 08-Jul-2015

DESCRIPTION

Talk by Konstantin Osipov (Mail.Ru, Tarantool) and Alexey Rybak (Badoo)

TRANSCRIPT

Page 1: Sharding - patterns & antipatterns, Konstantin Osipov, Alexey Rybak

Sharding: patterns and antipatterns

Konstantin Osipov (Mail.Ru, Tarantool)

Alexey Rybak (Badoo)

Page 2

Big picture: scalable databases

● replication

● sharding and re-sharding

● distributed queries & jobs, Map/Reduce

● DDL

● this talk focuses on sharding and re-sharding only

Page 3

Contents

I. sharding function

II. routing

III. re-sharding

Page 4

I. Sharding function

Page 5

Selecting a good shard key

● the identified object should be small

● some data you won’t be able to shard (and have to duplicate in each shard)

● don’t store the key if you don’t have to

Page 6

Good and bad shard keys

● good: user session, shopping order

● maybe: user (if user data isn’t too thick)

● bad: inventory item, order date

Page 7

Garage sharding: numbers

● replication-based doubling (2, 4, 8, out of cash)

● the magic number 48 (divisible by 2, 3 and 4)
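
The slide does not say why 48 is "magic". One plausible reading, assumed here, is that 48 logical partitions can be split into equal shares across many different server counts. A minimal sketch (the helper name `even_server_counts` is hypothetical):

```python
def even_server_counts(n_partitions: int) -> list[int]:
    """Return every server count that splits n_partitions into equal shares."""
    return [n for n in range(1, n_partitions + 1) if n_partitions % n == 0]


print(even_server_counts(48))
# [1, 2, 3, 4, 6, 8, 12, 16, 24, 48]
```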

Page 8

Garage sharding thru hashing

● good: remainders

o f(key) ≡ key % n_srv

o f(key) ≡ crc32(key) % n_srv

● bad: first login letter
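
The two remainder functions from this slide can be sketched in a few lines. The function names are hypothetical, and `zlib.crc32` stands in for whatever CRC32 implementation a given client language provides:

```python
import zlib


def shard_by_remainder(key: int, n_srv: int) -> int:
    """f(key) = key % n_srv, for numeric keys such as user ids."""
    return key % n_srv


def shard_by_crc32(key: str, n_srv: int) -> int:
    """f(key) = crc32(key) % n_srv, for string keys such as logins."""
    return zlib.crc32(key.encode("utf-8")) % n_srv


print(shard_by_remainder(10, 4))  # 2
```

Both functions depend on `n_srv`, so changing the number of servers changes the mapping for most keys.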

Page 9

Sharding for grown-ups

● table function

● consistent hashing

Page 10

Table functions

● virtual buckets: key -> bucket -> shard

o “key -> bucket” function, “bucket -> shard” table

o “key -> bucket” table, “bucket -> shard” table
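
A minimal sketch of the first variant (a "key -> bucket" function plus a "bucket -> shard" table). The bucket count, the shard names and all identifiers are assumptions made for illustration:

```python
import zlib

N_BUCKETS = 1024  # assumed: fixed for the lifetime of the cluster

# "bucket -> shard" table; the initial layout spreads buckets over 4 shards.
bucket_to_shard = {b: f"shard-{b % 4}" for b in range(N_BUCKETS)}


def key_to_bucket(key: str) -> int:
    """'key -> bucket' function: never changes, because N_BUCKETS is fixed."""
    return zlib.crc32(key.encode("utf-8")) % N_BUCKETS


def key_to_shard(key: str) -> str:
    """Full lookup: key -> bucket -> shard."""
    return bucket_to_shard[key_to_bucket(key)]


def move_bucket(bucket: int, new_shard: str) -> None:
    """Re-sharding touches only the table entry of the moved bucket."""
    bucket_to_shard[bucket] = new_shard
```

Because only the table changes, a bucket can be moved to another shard without recomputing the placement of any other key.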

Page 11

Consistent hashing

● Danny Lewin RIP

● Kinda ring and like... uhm... points, you know...

● Libraries: Ketama
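
The ring-and-points idea can be sketched as follows. This is not Ketama's actual code: the class name `HashRing`, the use of MD5 and the default of 100 points per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a position on a 2**32 ring (MD5 is an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, servers, points_per_server=100):
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server, points_per_server)

    def add(self, server, points_per_server=100):
        """Place several points for one server on the ring."""
        for i in range(points_per_server):
            p = _point(f"{server}#{i}")
            if p not in self._owner:  # on a collision the earlier point wins
                bisect.insort(self._points, p)
                self._owner[p] = server

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the next server point."""
        i = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[i]]
```

When a server is added, a key either keeps its old server or moves to the new one; keys never shuffle between the existing servers.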

Page 12

Guava/Sumbur

● f(key, n_servers) => server_id

● strictly uniform key-to-server mapping

● recurrence formula (15 lines of code)
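
To give a feel for how short such a function can be, here is a sketch with the same `f(key, n_servers) => server_id` shape. It is neither Guava's nor Sumbur's code: it is the jump consistent hash published by Lamping and Veach, swapped in as an illustration, and the function name is hypothetical.

```python
def jump_consistent_hash(key: int, n_servers: int) -> int:
    """Map a 64-bit integer key to a server id in [0, n_servers)."""
    if n_servers <= 0:
        raise ValueError("n_servers must be positive")
    b, j = -1, 0
    while j < n_servers:
        b = j
        # 64-bit linear congruential step, used as a per-key random source
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # jump forward to the next server count at which this key would move
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b
```

Growing the cluster from n to n + 1 servers either leaves a key where it was or moves it to the new server n. Note that this scheme only supports adding or removing the highest-numbered server.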

Page 13

II. Routing

Page 14

Routing types

● smart client

● coordinator

● proxy

● local proxy on every app server

● intra-database routing

Page 15

Smart Client

● no extra hops

● all clients (PHP/Python/C...) should implement it

● resharding is hard
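
A rough sketch of what "smart client" means in practice: the client library itself computes the shard and connects to it directly. The class name `SmartClient` and the `connect` callback are hypothetical; a real client would plug in its database driver there.

```python
import zlib


class SmartClient:
    """Client-side router: picks the shard itself, with no intermediate hop."""

    def __init__(self, shard_addrs, connect):
        self._addrs = list(shard_addrs)  # every client must know the full list
        self._connect = connect          # assumed: callable(addr) -> connection
        self._conns = {}

    def shard_for(self, key: str) -> str:
        """The sharding function is embedded in the client."""
        return self._addrs[zlib.crc32(key.encode("utf-8")) % len(self._addrs)]

    def connection_for(self, key: str):
        """Open (or reuse) a direct connection to the shard that owns the key."""
        addr = self.shard_for(key)
        if addr not in self._conns:
            self._conns[addr] = self._connect(addr)
        return self._conns[addr]
```

The shard list and the sharding function live inside every client, which is why each client language needs its own implementation and why re-sharding means updating all of them.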

Page 16

Proxy

● encapsulates routing logic

● extra hop, traffic

● +1 service

● SPOF

=> local proxy

Page 17

Coordinator

● centralized knowledge

● SPOF

Page 18

Intra-database routing

● too many nodes

● redundancy is high

● ad-hoc requests

Page 19

III. Re-sharding

Page 20

Re-sharding is a pain

● redistribution impacts:

o clients

o network performance

o consistency

=> maintenance time window

● forget about it at petabyte scale
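
To get a sense of the volume involved, the sketch below counts how many keys change servers when the remainder function `key % n_srv` from the hashing slide is applied with a new server count. The helper name `moved_fraction` and the use of sequential integer keys are assumptions.

```python
def moved_fraction(n_keys: int, old_n: int, new_n: int) -> float:
    """Share of keys 0..n_keys-1 whose server changes under key % n."""
    moved = sum(1 for k in range(n_keys) if k % old_n != k % new_n)
    return moved / n_keys


print(moved_fraction(100_000, 4, 5))  # 0.8
print(moved_fraction(100_000, 4, 8))  # 0.5
```

Adding one server to four relocates 80% of the keys; doubling from four to eight relocates half of them.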

Page 21

Best practice: no data redistribution

● update is a move

● data expiration (new data on new servers)

● new data on selected servers
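
One possible way to put new data on new servers without moving old data is sketched below. It assumes monotonically increasing object ids, which the slides do not state; the class name `EpochRouter` and its methods are hypothetical.

```python
import bisect


class EpochRouter:
    """Route by id range: each epoch of ids keeps the servers it was born on."""

    def __init__(self):
        self._starts = []   # first id of each epoch, ascending
        self._servers = []  # servers that hold ids of that epoch

    def add_epoch(self, first_id: int, servers) -> None:
        """From first_id onwards, new objects go to the given servers."""
        if self._starts and first_id <= self._starts[-1]:
            raise ValueError("epochs must be added in increasing id order")
        self._starts.append(first_id)
        self._servers.append(list(servers))

    def server_for(self, obj_id: int) -> str:
        """Find the epoch that contains obj_id, then pick a server within it."""
        i = bisect.bisect_right(self._starts, obj_id) - 1
        if i < 0:
            raise KeyError(obj_id)
        servers = self._servers[i]
        return servers[obj_id % len(servers)]
```

Adding an epoch never changes the answer for an existing id, so nothing has to be copied between servers.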

Page 22

DDL

● upgrade your app

● upgrade your database

● update your app and remove any trace of the old schema