The NoSQL World for Relational Specialists
Post on 08-Jan-2017
BÂLE BERNE BRUGG DUSSELDORF FRANCFORT S.M. FRIBOURG E.BR. GENÈVE
HAMBOURG COPENHAGUE LAUSANNE MUNICH STUTTGART VIENNE ZURICH
#SDF16
Cassandra for DBAs
Ulises Fasoli – Senior Consultant Trivadis Lausanne AD
Agenda
1. NoSQL Landscape
2. What is Apache Cassandra?
3. Cassandra Architecture
4. Data Distribution & Replication
5. Cassandra Data Model
6. Cassandra Write path
7. Cassandra Read path
8. Tools
9. Last thoughts and conclusion
NoSQL Landscape
NoSQL Landscape : Types of databases
Type          Description
Key-Value     Keys map to arbitrary values of any data type
Wide Column   Keys map to sets of n typed columns
Document      Document sets (JSON) queryable in whole or in part
Graph         Data elements each relate to n others in a graph/network
Brewer's CAP Theorem
Any networked shared-data system can have at most two of the three desirable properties:
• Consistency: do you get identical results, regardless of which node is queried?
• Availability: can the cluster respond to very high write and read volumes?
• Network Partition tolerance: is the cluster still available when part of it goes dark?
[Figure: CAP triangle with corners Consistency, Availability, and Network Partition Tolerance; the pairwise combinations are labeled CA, CP, and AP, and the combination of all three is marked n/a]
What is Apache Cassandra?
Cassandra
Fully distributed, with no single point of failure
Free and open source, with deep developer support
Highly performant, with near-linear horizontal scaling in proper use cases
[Figure: Bigtable and Dynamo, the two systems whose designs Cassandra builds on]
Use cases for Cassandra
Product Catalog / Playlists
Personalization
Ads / Recommendations
Fraud Detection
Time Series
IoT / Sensor Data
Graph / Network data
Cassandra Architecture
Architecture overview
Designed with the understanding that system/hardware failures can and do occur
Peer-to-peer, distributed system
All nodes are identical in the cluster
Data partitioned among all nodes in the cluster
Custom data replication to ensure fault tolerance
Read/Write-anywhere design
What is a cluster?
A peer-to-peer set of nodes
• Node – one Cassandra instance
• Rack – a logical set of nodes
• Data Center – a logical set of racks
• Cluster – the full set of nodes which map to a single complete token ring
[Figure: a Cassandra cluster spanning two data centers (East and West), each with nodes 1 to 4 spread over Rack 1 and Rack 2; the Murmur3 token range runs from -2^63 to +2^63 - 1]
What is a cluster?
[Figure: four nodes, 127.0.0.1 to 127.0.0.4, one of which is marked as the seed]
Nodes join a cluster based on the configuration of their own conf/cassandra.yaml file
Some key settings:
• cluster_name
• seeds
• listen_address
What is a coordinator?
The node chosen by the client to receive a particular read or write request to its cluster
Any node can coordinate any request
Each client request may be coordinated by a different node
No single point of failure
Fundamental principle of Cassandra's architecture
[Figure: a client driver connected to a ring of four nodes, any of which can act as coordinator]
Data Distribution & Replication
Data Partitioning & Distribution
Nodes are logically arranged in a ring topology
Each node is responsible for a part of the overall database
Data is assigned to a specific node based on a hash of its partition key
Lightly loaded nodes can move position on the ring to relieve heavily loaded nodes
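The hash-based assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 stands in for the Murmur3 partitioner (which is not in the standard library), each node owns exactly one token, and the names `token`, `TokenRing`, and `owner` are hypothetical, not part of Cassandra.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a partition key to a signed 64-bit token (MD5 stands in for Murmur3)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class TokenRing:
    """Toy ring: each node owns the token range that ends at its own token."""

    def __init__(self, node_tokens):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((t, n) for n, t in node_tokens.items())
        self._tokens = [t for t, _ in self._ring]

    def owner_for_token(self, key_token: int) -> str:
        # First node whose token is >= the key token; wrap around past the end.
        i = bisect.bisect_left(self._tokens, key_token)
        return self._ring[i % len(self._ring)][1]

    def owner(self, key: str) -> str:
        return self.owner_for_token(token(key))


# Four nodes spaced evenly over the signed 64-bit token range.
ring = TokenRing({f"node{i + 1}": -2**63 + (i + 1) * 2**62 - 1 for i in range(4)})
print(ring.owner("key1"), ring.owner("key2"))
```

Because the owner is derived from the key alone, any node or client that knows the ring layout can compute it without a central lookup.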
Data Partitioning
[Figure: hash ring running from 0 to 1 with nodes A to F placed around it; h(key1) and h(key2) mark where two keys land, with N=3 replicas per key]
Data Replication
Defined at keyspace level
• Replication factor: how many replicas to make
• Replication strategy: on which node each replica should be placed
All partitions are "replicas", there are no "originals"
First replica: placed on the node owning its token's primary range
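As a rough illustration of replica placement, the sketch below puts the first replica on the node owning the key's token and the remaining ones on the next nodes clockwise, which is the idea behind a simple, rack-unaware strategy. It assumes one token per node; the function name `replicas` and the ring values are hypothetical.

```python
import bisect


def replicas(ring, key_token, replication_factor):
    """Return the nodes holding a key: the token owner, then the next nodes clockwise.

    `ring` is a list of (token, node) pairs with one token per node (an assumption).
    """
    ring = sorted(ring)
    if replication_factor > len(ring):
        raise ValueError("replication factor exceeds the number of nodes")
    tokens = [t for t, _ in ring]
    # The first replica lives on the node owning the primary range of the token.
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


# Six nodes A to F with illustrative tokens, and a replication factor of 3.
nodes = [(0, "A"), (10, "B"), (20, "C"), (30, "D"), (40, "E"), (50, "F")]
print(replicas(nodes, key_token=25, replication_factor=3))
```

A key whose token falls past the last node wraps around to the start of the ring.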
Data Replication / Distribution
Native data replication / distribution support
Transparently handled by Cassandra
Multi-data center capable
Hybrid Cloud/On premise support
What is consistency?
[Figure: a client driver sends a request to a ring of four nodes. How many replicas must answer? Just one: CL=ONE. Two: CL=TWO. 51%: CL=QUORUM]
Partition key determines which nodes are sent any given request
• Consistency Level: how many nodes must acknowledge before a response is sent
The meaning varies by request type
• Write request – how many nodes must acknowledge the write?
• Read request – how many nodes must acknowledge by sending their most recent copy of the data?
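The number of acknowledgements behind each consistency level can be written down as a small function. This is a sketch covering only the levels named on the slides; the function name `required_acks` is hypothetical, and QUORUM is computed as a strict majority of the replication factor.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge a request at a given consistency level."""
    fixed = {"ONE": 1, "TWO": 2}
    if level in fixed:
        needed = fixed[level]
    elif level == "QUORUM":
        # A strict majority of the replicas.
        needed = replication_factor // 2 + 1
    elif level == "ALL":
        needed = replication_factor
    else:
        raise ValueError(f"unsupported consistency level: {level}")
    if needed > replication_factor:
        raise ValueError("not enough replicas for this consistency level")
    return needed


for level in ("ONE", "TWO", "QUORUM", "ALL"):
    print(level, required_acks(level, replication_factor=3))
```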
What is immediate vs. eventual consistency?
Immediate Consistency – reads always return the most recent data
• Immediate consistency is guaranteed with Consistency Level ALL
• Highest latency (all replicas are checked and compared)
Eventual Consistency – reads may return stale data
• Consistency Level ONE carries the highest risk of stale data
• Lowest latency (the first replica's response is returned immediately)
[Figure: consistency levels ANY, ONE, TWO, ... ALL placed on a scale of available replicas, from 0 to the total number of nodes (N)]
Read repairs are there to counter entropy between replicas
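A commonly used rule of thumb, not stated on the slide itself, is that a read is guaranteed to see the latest write when the read and write acknowledgement counts together exceed the replication factor, so that the two replica sets overlap. The sketch below checks that condition; the function name is hypothetical.

```python
def read_sees_latest_write(read_acks: int, write_acks: int, replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_acks + write_acks > replication_factor


rf = 3
quorum = rf // 2 + 1
print("ONE/ONE:", read_sees_latest_write(1, 1, rf))
print("QUORUM/QUORUM:", read_sees_latest_write(quorum, quorum, rf))
print("read ONE, write ALL:", read_sees_latest_write(1, rf, rf))
```

With a replication factor of 3, reading and writing at QUORUM already satisfies the rule, at a lower latency cost than ALL.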
Cassandra Data Model
The Cassandra data model defines
1. Column family as a way to store and organize data
2. Table as a two-dimensional view of a multi-dimensional column family
3. Cassandra Query Language (CQL): a language to perform operations on tables
[Figure: a keyspace containing several column families]
What is a column family?
            cola    colb    colc    cold
row key1    v1.a    v1.b    v1.c    v1.d
row key2    v2.a    v2.b    v2.c    v2.d
row key3    v3.a    v3.b    v3.c    v3.d
(rows run down, columns run across, and each value is a cell)
Column family – set of rows with a similar structure
• Sorted columns
• Multidimensional
• Distributed
• Sparse
What are row, row key, column key, and column value?
• Rows – individual rows constitute a column family
• Row key – uniquely identifies a row in a column family
• Row – stores pairs of column keys and column values
• Column key – uniquely identifies a column value in a row
• Column value – stores one value or a collection of values
[Figure: one row made of a row key followed by the column keys (or column names) cola, colb, colc, cold, which hold the column values (or cells) va, vb, vc, vd]
What are row, row key, column key, and column value? (example)
Row key        Column keys and column values
John Lennon    born: 1940, country: England, died: 1980, style: Rock, type: artist
The Beatles    country: England, founded: 1957, style: Rock, type: band
What is a wide row?
Rows may be described as “skinny” or “wide”
• Skinny row – fixed, relatively small number of column keys
• Wide row – relatively large number of column keys (hundreds or thousands)
  • For example, a row that stores all bands of the same style
  • The number of such bands will increase as new bands are formed
[Figure: a wide row with row key "Rock" and one column per band: The Animals, The Beatles, ...]
What are composite row key and composite column key?
Composite row key – multiple components separated by a colon
Composite column key – multiple components separated by a colon
• Composite column keys are sorted by each component
Row key Revolver:1966 with simple column keys:
  genre: Rock, performer: The Beatles, tracks: {1: 'Taxman', ..., 14: 'Tomorrow Never Knows'}
Row key Revolver:1966 with composite column keys:
  1:title: Taxman, 2:title: Eleanor Rigby, ..., 14:title: Tomorrow Never Knows
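Sorting "by each component" is different from sorting the colon-joined text. The sketch below models composite column keys as Python tuples to show the difference, using the track data from the slide; modeling the keys as tuples is an assumption made for illustration.

```python
# Composite column keys modeled as (track number, column name) tuples.
columns = {
    (14, "title"): "Tomorrow Never Knows",
    (1, "title"): "Taxman",
    (2, "title"): "Eleanor Rigby",
}

# Component-wise ordering: compare track numbers as integers, then names.
by_component = sorted(columns)

# Plain text ordering of the colon-joined form, shown for contrast.
as_text = sorted(f"{number}:{name}" for number, name in columns)

print(by_component)
print(as_text)
```

Component-wise ordering keeps track 14 after track 2, whereas the text form would place "14:title" first.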
What are partition, partition key, row, column, and cell?
[Figure: a table with single-row partitions, shown next to its column family view]
What are composite partition key and clustering column?
Table with multi-row partitions
album_title            year   number   track_title
Revolver               1966   1        Taxman
Revolver               1966   …        …
Revolver               1966   14       Tomorrow Never Knows
Let It Be              1970   1        Two Of Us
Let It Be              1970   …        …
Let It Be              1970   11       Get Back
Magical Mystery Tour   1967   1        Magical Mystery Tour
Magical Mystery Tour   1967   …        …
Magical Mystery Tour   1967   11       All You Need Is Love
Composite partition key: album_title, year
Clustering column: number
Each partition holds several rows; every row is made of columns, and each value is a cell
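To make the roles of the two key parts concrete, the sketch below groups a few of the rows above into partitions by (album_title, year) and orders the rows inside each partition by the clustering column. It is a toy in-memory model, not how Cassandra stores data on disk.

```python
from collections import defaultdict

# (album_title, year, number, track_title) rows taken from the table above.
rows = [
    ("Revolver", 1966, 14, "Tomorrow Never Knows"),
    ("Let It Be", 1970, 1, "Two Of Us"),
    ("Revolver", 1966, 1, "Taxman"),
    ("Let It Be", 1970, 11, "Get Back"),
]

# The composite partition key decides which partition a row belongs to.
partitions = defaultdict(dict)
for album_title, year, number, track_title in rows:
    partitions[(album_title, year)][number] = track_title

# Within a partition, rows are kept ordered by the clustering column.
partitions = {key: dict(sorted(tracks.items())) for key, tracks in partitions.items()}

for key, tracks in partitions.items():
    print(key, tracks)
```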
What are composite partition key and clustering column?
Table with multi-row partitions : Column family view
Revolver:1966               1:title: Taxman, ..., 11:title: Doctor Robert, ..., 14:title: Tomorrow Never Knows
Let It Be:1970              1:title: Two Of Us, ..., 11:title: Get Back
Magical Mystery Tour:1967   1:title: Magical Mystery Tour, ..., 11:title: All You Need Is Love
What are static columns?
Table with multi-row partitions and static columns
album_title   year   number   genre   performer     track_title
Revolver      1966   1        Rock    The Beatles   Taxman
Revolver      1966   …        Rock    The Beatles   …
Revolver      1966   14       Rock    The Beatles   Tomorrow Never Knows
Let It Be     1970   1        Rock    The Beatles   Two Of Us
Let It Be     1970   …        Rock    The Beatles   …
Let It Be     1970   11       Rock    The Beatles   Get Back
Static columns: genre, performer
What is a primary key?
Primary key uniquely identifies a row in a table
Simple or composite partition key and all clustering columns (if present)
Table with a single partition key (performer):
performer        born   country   died   founded   style   type
John Lennon      1940   England   1980             Rock    artist
Paul McCartney   1942   England                    Rock    artist
Table whose primary key is a composite partition key (album_title, year) plus a clustering column (number):
album_title   year   number   track_title
Revolver      1966   1        Taxman
Revolver      1966   …        …
Revolver      1966   14       Tomorrow Never Knows
Let It Be     1970   1        Two Of Us
Let It Be     1970   …        …
Let It Be     1970   11       Get Back
What is a table or CQL Table?
A CQL table is a column family
• CQL tables provide two-dimensional views of a column family, which contains potentially multi-dimensional data, due to composite keys and collections
CQL table and column family are largely interchangeable terms
Supported by a declarative language, the Cassandra Query Language (CQL)
Cassandra Query Language (CQL)
The Data Definition Language (DDL) is a subset of CQL
SQL-like syntax, but with somewhat different semantics
Cassandra Data Model differences from RDBMS
• Cassandra: deals with unstructured data. | RDBMS: deals with structured data.
• Cassandra: has a flexible schema. | RDBMS: has a fixed schema.
• Cassandra: a table is a list of “nested key-value pairs” (ROW x COLUMN key x COLUMN value). | RDBMS: a table is an array of arrays (ROW x COLUMN).
• Cassandra: the keyspace is the outermost container that holds the data corresponding to an application. | RDBMS: the database is the outermost container that holds the data corresponding to an application.
• Cassandra: tables or column families are the entities of a keyspace. | RDBMS: tables are the entities of a database.
• Cassandra: a row is a unit of replication. | RDBMS: a row is an individual record.
• Cassandra: a column is a unit of storage. | RDBMS: a column represents an attribute of a relation.
• Cassandra: relationships are represented using collections. | RDBMS: supports the concepts of foreign keys and joins.
Cassandra Write path
Write path: how is data written?
Cassandra is a log-structured storage engine
Data is sequentially appended, not placed in pre-set locations
• RDBMS: seeks and writes values to various pre-set locations
• Cassandra: continuously appends to a log
How does the write path flow on a node?
[Figure: write path on a node. The client sends the write to the coordinator. For each write request, the node appends to the append-only CommitLog (node file system) and updates the Memtable (node memory; corresponds to a CQL table). Periodically, the current state of the Memtable is flushed to an SSTable, and related SSTables are compacted.]
Example rows shown in the figure:
partition key1 first:Oscar last:Orange level:42
partition key2 first:martin last:Blue
partition key3 first:Ricky last:Red
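The flow above can be modeled in memory with a few lines of Python. This is a toy sketch, not Cassandra's implementation: the class name `Node`, the `flush_threshold` parameter, and flushing after a fixed number of partitions are all simplifying assumptions.

```python
class Node:
    """Toy model of the write path: commit log append, memtable update, flush."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only list standing in for the on-disk log
        self.memtable = {}     # partition key -> {column: (value, timestamp)}
        self.sstables = []     # immutable, sorted snapshots of flushed memtables
        self.flush_threshold = flush_threshold  # hypothetical flush trigger

    def write(self, key, columns, timestamp):
        # 1. Append the mutation to the commit log for durability.
        self.commit_log.append((key, dict(columns), timestamp))
        # 2. Apply the mutation to the in-memory partition.
        row = self.memtable.setdefault(key, {})
        for name, value in columns.items():
            row[name] = (value, timestamp)
        # 3. Flush once the memtable holds enough partitions (a simplification).
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out as an immutable table sorted by partition key.
        snapshot = sorted(self.memtable.items(), key=lambda item: item[0])
        self.sstables.append(tuple(snapshot))
        self.memtable = {}
        # The flushed entries no longer need to be replayed on restart.
        self.commit_log.clear()


node = Node(flush_threshold=2)
node.write("partition key1", {"first": "Oscar", "last": "Orange", "level": 42}, 1)
node.write("partition key2", {"first": "martin", "last": "Blue"}, 2)
node.write("partition key3", {"first": "Ricky", "last": "Red"}, 3)
print(len(node.sstables), sorted(node.memtable))
```

After the three writes, the first two partitions sit in one SSTable while the third is still only in the Memtable and the commit log.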
What is the Commit Log?
An append-only log used to automatically rebuild Memtables on restart of a downed node
Memtables flush to disk when the CommitLog size reaches its total allowed space
Entries are marked as flushed as the corresponding Memtable entries flush to disk as an SSTable
CommitLog options are configured in the cassandra.yaml file
What are Memtables and how are they flushed to disk?
Memtables are in-memory representations of a CQL table:
• Each node has a Memtable for each CQL table in the keyspace
• Each Memtable accrues writes and provides reads for data not yet flushed
• Updates to Memtables mutate the in-memory partition
Example Memtable contents:
partition key1 first:Oscar last:Orange level:42
partition key2 first:Ricky last:Red
What is an SSTable and what are its characteristics?
An SSTable ("sorted string table") is
• an immutable file of sorted partitions
• written to disk through fast, sequential I/O
• a snapshot of the state of a Memtable at the time it was flushed
The current data state of a CQL table comprises
• its corresponding Memtable plus
• all current SSTables flushed from that Memtable
SSTables are periodically compacted from many to one
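Compaction "from many to one" can be pictured as a merge that keeps, for every column, the cell with the newest timestamp. The sketch below is a simplified model under stated assumptions: SSTables are plain dictionaries, deletions (tombstones) are ignored, and the function name `compact` is hypothetical. The pk7 values are the ones used on the read path slides.

```python
def compact(sstables):
    """Merge several SSTables into one, keeping the newest cell for each column.

    Each SSTable is modeled as {partition key: {column: (value, timestamp)}}.
    """
    merged = {}
    for sstable in sstables:
        for key, row in sstable.items():
            target = merged.setdefault(key, {})
            for column, (value, timestamp) in row.items():
                # Last write wins: replace the cell only if this one is newer.
                if column not in target or timestamp > target[column][1]:
                    target[column] = (value, timestamp)
    # The result is again sorted by partition key.
    return dict(sorted(merged.items()))


older = {"pk7": {"first": ("Betty", 541), "last": ("Blue", 541), "level": (63, 541)}}
newer = {"pk7": {"first": ("Elizabeth", 994)}}
compacted = compact([older, newer])
print(compacted)
```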
Cassandra Read path
How does the read path flow on each node?
[Figure: read path, row cache hit. The coordinator sends a read for <pk7>. The optional row cache in node memory already holds the merged row, so the read is answered without consulting the Memtable or the SSTables.]
Example data for the "player" table shown in the figure:
• Memtable: pk7 level:42 (timestamp 1114)
• SSTable entry: pk7 first:Betty last:Blue level:63 (timestamp 541)
• SSTable entry: pk7 first:Elizabeth (timestamp 994)
• Row cache: pk7 first:Elizabeth last:Blue level:42
How does the read path flow on each node?
[Figure: read path, key cache hit. The read for <pk7> misses the row cache, so the node checks the Bloom filter of each SSTable. Where a filter reports a hit, the key cache supplies the location of pk7, and the data read from the SSTables (node file system) is merged with the Memtable (node memory).]
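The Bloom filters in the figure let a node skip SSTables that cannot contain the requested partition. The sketch below is a minimal, generic Bloom filter, not Cassandra's implementation; the class name, the bit-array size, and the use of SHA-256 are assumptions chosen to keep the example self-contained.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive several bit positions by salting the key with the hash index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> position) & 1 for position in self._positions(key))


# One filter per SSTable: only SSTables whose filter says "maybe" are read.
sstable_filter = BloomFilter()
for partition_key in ("pk1", "pk2"):
    sstable_filter.add(partition_key)
print(sstable_filter.might_contain("pk1"), sstable_filter.might_contain("pk2"))
```

A negative answer is definitive, which is what makes the filter useful for avoiding disk reads; a positive answer still has to be confirmed by looking in the SSTable.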
How does the read path flow on each node?
[Figure: read path, row cache and key cache miss. After a row cache miss and a Bloom filter hit, the key cache has no entry for pk7, so the node goes through the partition summary and the partition index of each candidate SSTable to locate the partition, reads it from the node file system, and merges it with the Memtable.]
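The merge step at the end of the read path can be sketched with the pk7 values from the figures: the newest timestamp wins for each column, and the merged row is then kept in the row cache. This is a toy model; the function name `read` and the dictionary layout are assumptions, and Bloom filters, the key cache, and the partition index are left out.

```python
def read(key, row_cache, memtable, sstables):
    """Return the merged row for a key, serving from the row cache when possible."""
    if key in row_cache:
        return row_cache[key]
    newest = {}
    # Consult the memtable and every SSTable; the newest timestamp wins per column.
    for source in [memtable, *sstables]:
        for column, (value, timestamp) in source.get(key, {}).items():
            if column not in newest or timestamp > newest[column][1]:
                newest[column] = (value, timestamp)
    row = {column: value for column, (value, _) in newest.items()}
    if row:
        row_cache[key] = row  # later reads of this key hit the cache
    return row


memtable = {"pk7": {"level": (42, 1114)}}
sstables = [
    {"pk7": {"first": ("Betty", 541), "last": ("Blue", 541), "level": (63, 541)}},
    {"pk7": {"first": ("Elizabeth", 994)}},
]
row_cache = {}
print(read("pk7", row_cache, memtable, sstables))
```

The result matches the row shown in the row cache on the slides: first:Elizabeth, last:Blue, level:42.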
Tools
Tools : CQLSH
Interactive command line CQL utility
Supports tab completion for commands
Think of it as SQL*Plus for Cassandra
Tools : Cassandra Cluster Manager (CCM)
Open source utility
Creates and manages multi-node clusters on a local machine
Not for production configuration
Useful for :
• Testing failure scenarios
• Development / Prototyping without the hardware
• Version migrations
• …
Tools : Nodetool
Command-line cluster management utility
Supports over 40 commands, such as:
• status
• info
• ring
Tools : DataStax DevCenter
Visually Create and Navigate Database Objects
View Query Results and Tune Queries for Faster Performance
Tools : DataStax OpsCenter
Web-based visual management and monitoring solution
Visual cluster management
Point-and-Click Provisioning and Administration
Secured administration
Visual monitoring and tuning
Customizable Dashboards
Last thoughts and conclusion
DBAs wanted
NoSQL and Cassandra will not replace RDBMS
Different tools for different jobs
Current situation :
• Community largely driven by developers and sysadmins
• Community needs insight from DBAs to make the database evolve
• Get involved!
Sources and additional information
https://academy.datastax.com/
https://academy.datastax.com/courses
https://academy.datastax.com/courses/ds201-cassandra-core-concepts
https://academy.datastax.com/courses/ds210-operations-and-performance-tuning
Ulises Fasoli
Senior Consultant AD Trivadis
Tel. +41 21 321 47 00
ulises.fasoli@trivadis.com
Feedback
Confirm your attendance and rate the session with this QR code.
A hot-air balloon flight to be won!