Empowering Big Data with Cassandra

Upload: globant

Post on 07-Apr-2017

Category:

Education


TRANSCRIPT

Page 1: [Globant summer take over] Empowering Big Data with Cassandra

Empowering Big Data with Cassandra

Page 3: [Globant summer take over] Empowering Big Data with Cassandra

./me

> Renato Carelli
- DevOps + Infra @ Big Data
- Hardening Enthusiast
- Cloud evangelist
- Bitcoin speculator

Page 4: [Globant summer take over] Empowering Big Data with Cassandra

./intro

Page 5: [Globant summer take over] Empowering Big Data with Cassandra

./intro/CAP

[Diagram: CAP with Consistency, Availability, and Partition tolerance. Pairwise combinations: CA, CP, AP. All three together: N/A.]

Page 6: [Globant summer take over] Empowering Big Data with Cassandra

./intro/RDBMS

[Figure labels: Data, Performance, Complexity, Costs, Manag. time, Issues]

Page 7: [Globant summer take over] Empowering Big Data with Cassandra

./intro/NoSQL

Unstructured (not really a DB): general file storage, text files, log files

Key Value: complex models, flexible business logic, semi-structured data, high volumes

Column: OLAP, analytics, NOT FOR UPDATES

Graph: relations between entities (social graphs)

Document: agile development, flexible data models, too many types (e.g. corporate areas)

Data Store, BigQuery, GFS, BigTable, CloudStore

Page 8: [Globant summer take over] Empowering Big Data with Cassandra

./intro/BigData

Page 9: [Globant summer take over] Empowering Big Data with Cassandra

./intro/specs

Page 10: [Globant summer take over] Empowering Big Data with Cassandra

./intro/history

BigTable (2006) {data modeling}

Dynamo (2007) {design}

Open Source (2008)

Page 11: [Globant summer take over] Empowering Big Data with Cassandra

./intro/version_history

jul/08 (0.1)
jul/09 (0.3)
apr/10 (0.6)
jan/11 (0.7)
jun/11 (0.8)
oct/11 (1.0)
apr/12 (1.1)
jan/13 (1.2)
sept/13 (2.0)
sept/14 (2.1)
sept/15 (3.0.0-rc1)

Page 12: [Globant summer take over] Empowering Big Data with Cassandra

./infra

Page 13: [Globant summer take over] Empowering Big Data with Cassandra

./infra/features

[Ring diagram: nodes N1, N2, N3, N4]

> Masterless

> Distributed

> Decentralized [p2p]

> Elastically Scalable

> Highly Available

> Fault-Tolerant

> Tuneable Consistent

Page 14: [Globant summer take over] Empowering Big Data with Cassandra

./infra/benchmark

[Chart axes: Nodes, Ops/sec]

Page 15: [Globant summer take over] Empowering Big Data with Cassandra

./infra/benchmark

Page 16: [Globant summer take over] Empowering Big Data with Cassandra

./infra/references

N1 C* Node

Connection Failed

Connection Established

Updated Data

Outdated Data

ACK

Slow Connection Established

Page 17: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token

Murmur3Partitioner:

-2^63 to +2^63 -1

token(‘Globant’) = -6148914691517517206

Page 18: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token

DemoPartitioner:

1 to 100

token(‘Globant’) = 68

Page 19: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token_ring

Node 1 Node 2

Node 3 Node 4

Page 20: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token_ring

Node 1: 1 - 25
Node 2: 26 - 50
Node 3: 51 - 75
Node 4: 76 - 100

‘Glob’ = 17

‘ant’ = 94

‘Globant’ = 68

~/Images/pic.png = 69

~/media/movie.mkv = 34
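
The lookup on this slide can be sketched as follows. This is a minimal illustration, assuming the slide's DemoPartitioner (tokens 1 to 100) and assuming the four ranges belong to Node 1 through Node 4 in that order. The tokens are copied from the slide rather than computed, because DemoPartitioner is not a real hash function; `RANGE_ENDS`, `TOKENS`, and `owner` are hypothetical names.

```python
from bisect import bisect_left

# Upper bound of each node's token range (assumed: Node 1 owns 1-25, and so on).
RANGE_ENDS = [25, 50, 75, 100]
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]

# Tokens exactly as shown on the slide; they are given, not computed.
TOKENS = {
    "Glob": 17,
    "ant": 94,
    "Globant": 68,
    "~/Images/pic.png": 69,
    "~/media/movie.mkv": 34,
}


def owner(token: int) -> str:
    """Return the node whose token range contains the given token."""
    if not 1 <= token <= 100:
        raise ValueError("token outside the demo range 1..100")
    return NODES[bisect_left(RANGE_ENDS, token)]


for key, token in TOKENS.items():
    print(f"{key!r} -> token {token} -> {owner(token)}")
```

Note that 'Globant' (68) and ~/Images/pic.png (69) land on the same node even though the keys are unrelated: placement depends only on the token.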

Page 21: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token_ring/replication

Node 1: 1 - 25
Node 2: 26 - 50
Node 3: 51 - 75
Node 4: 76 - 100

‘Glob’ = 17

RF = 3
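
A rough sketch of what RF = 3 means for 'Glob' (token 17): with SimpleStrategy, the first replica goes to the node that owns the token and the remaining replicas go to the next nodes clockwise on the ring. The clockwise order Node 1, Node 2, Node 3, Node 4 and the names `RING`, `RANGE_SIZE`, and `replicas` are assumptions made for illustration.

```python
RING = ["Node 1", "Node 2", "Node 3", "Node 4"]  # assumed clockwise order
RANGE_SIZE = 25  # demo ranges: 1-25, 26-50, 51-75, 76-100


def replicas(token: int, rf: int) -> list:
    """Return the owner of the token plus the next rf - 1 nodes clockwise."""
    if not 1 <= rf <= len(RING):
        raise ValueError("RF must be between 1 and the number of nodes in this sketch")
    first = (token - 1) // RANGE_SIZE
    return [RING[(first + i) % len(RING)] for i in range(rf)]


print(replicas(17, 3))  # ['Node 1', 'Node 2', 'Node 3']
print(replicas(94, 3))  # ['Node 4', 'Node 1', 'Node 2']
```

The second call shows the wrap-around: a token owned by the last node continues at the start of the ring.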

Page 22: [Globant summer take over] Empowering Big Data with Cassandra

./infra/token_ring/vnodes

What about virtual nodes?

C* 1.2

Page 23: [Globant summer take over] Empowering Big Data with Cassandra

./infra/coordinator

[Ring diagram: nodes N1, N2, N3, N4]

Page 24: [Globant summer take over] Empowering Big Data with Cassandra

./infra/coordinator

[Ring diagram: nodes N1, N2, N3, N4]

> read
RF = 3
CL = TWO

Page 26: [Globant summer take over] Empowering Big Data with Cassandra

./infra/coordinator

[Ring diagram: nodes N1, N2, N3, N4, with one node labeled Coordinator]

> read
RF = 3
CL = TWO

Page 32: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

How many copies of each piece of data (partition) do we want in the system?

Page 33: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

> Replication Factor
> Replication Strategy

Keyspace-based!

Page 34: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

Page 35: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

CREATE KEYSPACE Globant WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

Page 36: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

[Diagram: two data centers, Data Center - East and Data Center - West, each with nodes N1, N2, N3, N4 on racks R1 and R2]

RF = {'w': 3, 'e': 2}

Page 37: [Globant summer take over] Empowering Big Data with Cassandra

./infra/replication

CREATE KEYSPACE Globant WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'w' : 3, 'e' : 2};

Page 38: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

How many replicas/nodes (based on RF) must respond to declare success?

Page 39: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

Query-based!

Page 40: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

[Ring diagram: nodes N1, N2, N3, N4]

> write
CL = QUORUM

CL = { ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL }

Page 41: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

[Ring diagram: nodes N1, N2, N3, N4]

> read
CL = ALL

CL = { ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL }

Page 42: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

[Ring diagram: nodes N1, N2, N3, N4]

> read
CL = QUORUM

CL = { ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL }
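
As a hedged sketch of how a consistency level translates into a number of replica responses, the function below covers only the single-data-center levels. QUORUM is computed as floor(RF / 2) + 1. ANY and the data-center-aware levels (LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM) are deliberately left out, and `required_acks` is a hypothetical helper, not a driver API.

```python
def required_acks(cl: str, rf: int) -> int:
    """Replica responses the coordinator waits for, single data center only."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if cl in fixed:
        needed = fixed[cl]
    elif cl == "QUORUM":
        needed = rf // 2 + 1
    elif cl == "ALL":
        needed = rf
    else:
        raise ValueError(f"{cl} is not covered by this sketch")
    if needed > rf:
        raise ValueError(f"CL {cl} cannot be satisfied with RF = {rf}")
    return needed


for cl in ("ONE", "TWO", "QUORUM", "ALL"):
    print(cl, required_acks(cl, rf=3))
```

With RF = 3, QUORUM and TWO both require two responses; they diverge as soon as RF changes (for example, QUORUM needs three responses at RF = 5).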

Page 43: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level

Latest timestamp wins!
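
A minimal sketch of the rule, assuming each replica answers with a value and the timestamp of the write that produced it. `Cell`, `reconcile`, and the sample values are hypothetical and only illustrate the comparison.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    value: str
    timestamp: int  # write timestamp; larger means more recent


def reconcile(responses):
    """Pick the response carrying the highest write timestamp."""
    return max(responses, key=lambda cell: cell.timestamp)


responses = [
    Cell("old value", timestamp=1_000),
    Cell("new value", timestamp=2_000),
]
print(reconcile(responses).value)  # new value
```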

Page 44: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ R + W > RF }

Page 45: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

+Reads
> Write CL: ALL
> Read CL: ONE

Page 46: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = ALL

Page 47: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = ALL

> read
CL = ONE

Page 50: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ R + W > RF }

Page 51: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ 1 + 3 > 3 }

Page 52: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

+Writes
> Write CL: ONE
> Read CL: ALL

Page 53: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = ONE

Page 54: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = ONE

> read
CL = ALL

Page 57: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ R + W > RF }

Page 58: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ 3 + 1 > 3 }

Page 59: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

Balanced
> Write CL: QUORUM
> Read CL: QUORUM

Page 60: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = QUORUM

Page 61: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

[Ring diagram: nodes N1, N2, N3, N4]

RF = 3

> write
CL = QUORUM

> read
CL = QUORUM

Page 64: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ R + W > RF }

Page 65: [Globant summer take over] Empowering Big Data with Cassandra

./infra/consistency_level/immediate

{ 2 + 2 > 3 }
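
The three profiles from these slides can be checked against the R + W > RF rule in a few lines. This is only a sketch: `PROFILES` and `is_immediately_consistent` are hypothetical names, and the ONE/ONE case is added here as a counterexample that does not appear on the slides.

```python
RF = 3
QUORUM = RF // 2 + 1  # 2 when RF = 3

# R and W are the number of replicas that must answer a read or a write.
PROFILES = {
    "+Reads (write ALL, read ONE)": {"W": RF, "R": 1},
    "+Writes (write ONE, read ALL)": {"W": 1, "R": RF},
    "Balanced (write QUORUM, read QUORUM)": {"W": QUORUM, "R": QUORUM},
    "Counterexample (write ONE, read ONE)": {"W": 1, "R": 1},
}


def is_immediately_consistent(r: int, w: int, rf: int) -> bool:
    """True when every read set must overlap every write set."""
    return r + w > rf


for name, p in PROFILES.items():
    ok = is_immediately_consistent(p["R"], p["W"], RF)
    print(f"{name}: {p['R']} + {p['W']} > {RF} is {ok}")
```

The rule holds because R + W > RF forces at least one replica to be in both the read set and the write set, so a read always reaches a replica that saw the latest write.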

Page 66: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

> Query ALL replicas when reading
- Data from one.
- Checksum + Timestamp from others.

Page 67: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

> If there is a mismatch:
- Pull all data and merge
- Write back to out of sync replicas
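
The two read repair steps can be sketched as a toy model, assuming each replica holds a single (value, timestamp) pair for the key being read. MD5 is used only as a stand-in checksum, and `digest` and `read_with_repair` are hypothetical names rather than Cassandra internals.

```python
import hashlib


def digest(cell):
    """Checksum over value and timestamp, standing in for a digest response."""
    value, timestamp = cell
    return hashlib.md5(f"{value}:{timestamp}".encode()).hexdigest()


def read_with_repair(replicas):
    """replicas maps node name -> (value, timestamp).

    Returns the winning cell and the list of nodes that were repaired.
    """
    nodes = list(replicas)
    data_node, digest_nodes = nodes[0], nodes[1:]
    data = replicas[data_node]  # full data from one replica
    if all(digest(replicas[n]) == digest(data) for n in digest_nodes):
        return data, []  # checksums agree, nothing to repair
    # Mismatch: pull all data and merge, latest timestamp wins.
    winner = max(replicas.values(), key=lambda cell: cell[1])
    repaired = [n for n in nodes if replicas[n] != winner]
    for n in repaired:
        replicas[n] = winner  # write back to out of sync replicas
    return winner, repaired


replicas = {"N1": ("v2", 200), "N2": ("v1", 100), "N3": ("v2", 200)}
winner, repaired = read_with_repair(replicas)
print(winner, repaired)  # ('v2', 200) ['N2']
```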

Page 68: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

Table-based!

Page 69: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

[Ring diagram: nodes N1, N2, N3, N4; replica responses labeled DATA, SUM, SUM]

Page 71: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

[Ring diagram: nodes N1, N2, N3, N4]

Page 72: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

ALTER TABLE Globant.foobar WITH read_repair_chance = 0.2;

Page 73: [Globant summer take over] Empowering Big Data with Cassandra

./infra/read_repair

> Weak Consistency
return results + repair

> Strong Consistency
repair + return results

Page 74: [Globant summer take over] Empowering Big Data with Cassandra

./infra/hinted_handoff

> Recovery mechanism
- Stored @ Coordinator's system.hints
- 3 h default TTL
- DataCenter-based!
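
A toy model of the mechanism as the slide words it: the coordinator keeps a hint for a replica that is down and hands it off once the replica is back, unless the hint is older than 3 hours. This simplifies the real behavior and is not Cassandra's implementation; the `Coordinator` class, the dict-based replica, and the mutation strings are all hypothetical.

```python
HINT_TTL_SECONDS = 3 * 60 * 60  # 3 h, the default quoted on the slide


class Coordinator:
    def __init__(self):
        self.hints = []  # stand-in for the coordinator's system.hints

    def write(self, replica, mutation, now):
        """Deliver the mutation, or store a hint if the replica is down."""
        if replica["up"]:
            replica["data"].append(mutation)
        else:
            self.hints.append((replica["name"], mutation, now))

    def replay(self, replica, now):
        """Hand off unexpired hints to a recovered replica and drop expired ones."""
        remaining = []
        for name, mutation, stored_at in self.hints:
            if name != replica["name"]:
                remaining.append((name, mutation, stored_at))
            elif now - stored_at <= HINT_TTL_SECONDS:
                replica["data"].append(mutation)
            # Expired hints for this replica are simply discarded.
        self.hints = remaining


n2 = {"name": "N2", "up": False, "data": []}
coord = Coordinator()
coord.write(n2, "k=1", now=0)
coord.write(n2, "k=2", now=3600)
n2["up"] = True
coord.replay(n2, now=3 * 3600 + 1)
print(n2["data"])  # ['k=2']
```

The first hint has expired by the time the node returns, which is why a full `nodetool repair` is still needed after longer outages.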

Page 75: [Globant summer take over] Empowering Big Data with Cassandra

./infra/nodetool

$ nodetool repair
> Recovering a failed node
> Infrequently read data (read repair chance)
> Tombstone gc period (gc_grace_seconds)

Page 76: [Globant summer take over] Empowering Big Data with Cassandra

./internals

Page 77: [Globant summer take over] Empowering Big Data with Cassandra

./internals/write_path

[Diagram: node N and C. MEMORY: Memtable (1 table). STORAGE: CommitLog, SSTables.]

partition key 3 | n: SasoConf | city: cur | year: 3
partition key 2 | n: EkoParty | city: caba | year: 11
partition key 1 | n: pwnConf | city: mdq | year: 2

Page 78: [Globant summer take over] Empowering Big Data with Cassandra

./internals/write_path

[Diagram: node N and C. MEMORY: Memtable (1 table). STORAGE: CommitLog, SSTables. (Flush) from the Memtable to a new SSTable.]

partition key 3 | n: SasoConf | city: cur | year: 3
partition key 2 | n: EkoParty | city: caba | year: 11
partition key 1 | n: pwnConf | city: mdq | year: 2

Page 79: [Globant summer take over] Empowering Big Data with Cassandra

./internals/write_path

[Diagram: node N and C. MEMORY: Memtable (1 table). STORAGE: CommitLog, SSTables, Compaction.]

partition key 3 | n: SasoConf | city: cur | year: 3
partition key 2 | n: EkoParty | city: caba | year: 11
partition key 1 | n: pwnConf | city: mdq | year: 2

Page 80: [Globant summer take over] Empowering Big Data with Cassandra

./internals/write_path

[Diagram: node N and C. MEMORY: Memtable (1 table). STORAGE: CommitLog, SSTables, Compaction. Labels: APPEND ONLY, APPEND ONLY, IMMUTABLE.]
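
The write path on these slides can be approximated with a small in-memory model: every write is appended to the commit log and placed in the memtable, a full memtable is flushed to a new immutable SSTable, and compaction merges SSTables keeping the newest version of each key. This is a sketch under simplifying assumptions (a tiny flush threshold, no commit log recycling, no tombstones); the `Node` class and the updated `year` value in the last write are hypothetical.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []  # STORAGE, append only
        self.memtable = {}    # MEMORY, one per table
        self.sstables = []    # STORAGE, each one immutable once written
        self.memtable_limit = memtable_limit

    def write(self, key, row, timestamp):
        self.commit_log.append((key, row, timestamp))  # durability first
        self.memtable[key] = (row, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a new immutable SSTable, sorted by key."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def compact(self):
        """Merge all SSTables into one, keeping the newest version of each key."""
        merged = {}
        for sstable in self.sstables:
            for key, (row, ts) in sstable:
                if key not in merged or ts > merged[key][1]:
                    merged[key] = (row, ts)
        self.sstables = [tuple(sorted(merged.items()))]


node = Node()
node.write(1, {"n": "pwnConf", "city": "mdq", "year": 2}, timestamp=1)
node.write(2, {"n": "EkoParty", "city": "caba", "year": 11}, timestamp=2)  # flush
node.write(3, {"n": "SasoConf", "city": "cur", "year": 3}, timestamp=3)
node.write(1, {"n": "pwnConf", "city": "mdq", "year": 3}, timestamp=4)     # flush
node.compact()
print(len(node.sstables), len(node.sstables[0]))  # 1 3
```

Nothing is ever updated in place: the second write to key 1 lands in a second SSTable, and only compaction discards the older version.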

Page 81: [Globant summer take over] Empowering Big Data with Cassandra

./hands-on

Page 82: [Globant summer take over] Empowering Big Data with Cassandra

./hands-on/stress

> 1.3M writes/sec (1.3 writes/µs)

> 160K reads/sec (160 reads/ms)

> Collisions?

Page 83: [Globant summer take over] Empowering Big Data with Cassandra

./hands-on/stress

Custom Apps

Page 84: [Globant summer take over] Empowering Big Data with Cassandra

./me/contact

> Renato Carelli
- mailto: [email protected]
- mailto: [email protected]
- telegram: @renato

Page 85: [Globant summer take over] Empowering Big Data with Cassandra

We are hiring DevOps!
> mailto: [email protected]

Page 86: [Globant summer take over] Empowering Big Data with Cassandra

Thanks!