Cassandra 2.0 - Introduction. Use Cases
DESCRIPTION
Patrick McFadin from DataStax talks about Cassandra at Big Data Guru Meetup
TRANSCRIPT
©2013 DataStax Confidential. Do not distribute without consent.
@PatrickMcFadin
Patrick McFadin Chief Evangelist/Solution Architect - DataStax
Cassandra: Introduction
Who I am
• Patrick McFadin
• Solution Architect at DataStax
• Cassandra MVP
• User for years
• Follow me for more:
I talk about Cassandra and building scalable, resilient apps ALL THE TIME!
@PatrickMcFadin
Dude. Uptime == $$
Five Years of Cassandra
[Timeline: years 0 to 5, starting Jul-08. Releases: 0.1, 0.3, 0.6, 0.7, 1.0, 1.2..., 2.0, and DSE]
Cassandra - An introduction
Cassandra - Intro
• Based on the Amazon Dynamo and Google BigTable papers
• Shared nothing
• Data as safe as possible
• Predictable scaling
Dynamo
BigTable
Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor: How many copies of your data?
• RF = 3 here
Cassandra - Geographically Distributed
• Client writes locally
• Data syncs across WAN
• Replication Factor per DC
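A per-data-center replication factor is normally declared when the keyspace is created. The sketch below only builds the CQL text for such a statement; the data center names (us_east, eu_west) and the helper class are hypothetical, and in a real cluster the names have to match what the cluster's snitch reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class KeyspaceCqlSketch {
    // Builds a CREATE KEYSPACE statement with one replication factor per data center.
    static String createKeyspace(String keyspace, Map<String, Integer> rfPerDc) {
        StringJoiner options = new StringJoiner(", ", "{", "}");
        options.add("'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : rfPerDc.entrySet()) {
            options.add("'" + dc.getKey() + "': " + dc.getValue());
        }
        return "CREATE KEYSPACE " + keyspace + " WITH replication = " + options + ";";
    }

    public static void main(String[] args) {
        Map<String, Integer> rfPerDc = new LinkedHashMap<>();
        rfPerDc.put("us_east", 3); // hypothetical data center names
        rfPerDc.put("eu_west", 3);
        System.out.println(createKeyspace("videodb", rfPerDc));
    }
}
```

The resulting string could then be passed to the driver session shown later in the deck.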
Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write
• ALL = All replicas ack
• QUORUM = A majority (more than 50%) of replicas ack
• LOCAL_QUORUM = A majority (more than 50%) of replicas in the local DC ack
• ONE = Only one replica acks
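As a rough illustration of the levels above, the number of acknowledgements each one waits for can be computed from the replication factor. This is a minimal sketch with a made-up enum and class name, not the driver's own ConsistencyLevel type; for LOCAL_QUORUM the same majority formula would be applied to the local data center's replication factor.

```java
final class ConsistencySketch {
    enum Level { ALL, QUORUM, ONE }

    // Majority of replicas using integer division: RF=3 -> 2, RF=4 -> 3, RF=5 -> 3.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Number of replica acknowledgements the coordinator waits for.
    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ALL:
                return replicationFactor;
            case QUORUM:
                return quorum(replicationFactor);
            default:
                return 1; // ONE
        }
    }

    public static void main(String[] args) {
        int rf = 3;
        for (Level level : Level.values()) {
            System.out.println(level + " with RF=" + rf + " needs " + requiredAcks(level, rf) + " ack(s)");
        }
    }
}
```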
Cassandra - Transparent to the application
• A single node failure shouldn’t cause an application failure
• Replication Factor + Consistency Level = Success
• This example: RF = 3, CL = QUORUM
More than 50% of replicas acked, so we are good!
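The arithmetic behind this slide can be checked with a few lines. The sketch below (class and method names are illustrative only) asks whether a QUORUM request can still succeed when some replicas are down, and how many replica failures a given replication factor tolerates at QUORUM.

```java
final class AvailabilitySketch {
    // Majority of replicas, e.g. 2 of 3.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True if enough replicas are alive to satisfy QUORUM.
    static boolean quorumReachable(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorum(replicationFactor);
    }

    // How many replicas may be down while QUORUM still succeeds.
    static int toleratedFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3;
        System.out.println("RF=3, one replica down, QUORUM reachable: " + quorumReachable(rf, 2));
        System.out.println("RF=3, two replicas down, QUORUM reachable: " + quorumReachable(rf, 1));
        System.out.println("RF=3 tolerates " + toleratedFailures(rf) + " replica failure(s) at QUORUM");
    }
}
```

With RF = 3 and CL = QUORUM, one replica can fail and the request still gets its two acknowledgements, which is the case the slide describes.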
Cassandra Applications - Drivers
• DataStax Drivers for Cassandra
• Java
• C#
• Python
• More on the way
Cassandra Applications - Connecting
• Create a pool of local servers
• Client just uses session to interact with Cassandra
    contactPoints = {"10.0.0.1","10.0.0.2"}
    keyspace = "videodb"

    public VideoDbBasicImpl(List<String> contactPoints, String keyspace) {
        cluster = Cluster
            .builder()
            .addContactPoints(
                contactPoints.toArray(new String[contactPoints.size()]))
            .withLoadBalancingPolicy(Policies.defaultLoadBalancingPolicy())
            .withRetryPolicy(Policies.defaultRetryPolicy())
            .build();

        session = cluster.connect(keyspace);
    }
Cassandra Applications - Load balancing
• Token aware: request is sent to the primary node that holds the data
• Calls can be asynchronous and in parallel
[Diagram: client threads, driver, and nodes, with request steps numbered 1 to 6]
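Token-aware routing can be pictured as a lookup on a sorted ring of node tokens: hash the partition key, then pick the node that owns that token. The sketch below is a simplified stand-in for what a driver does internally. The class name, node addresses, and token values are made up, and CRC32 is used only as a placeholder hash, not Cassandra's actual partitioner.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class TokenRingSketch {
    // Node tokens in sorted order, mapped to node addresses.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String address) {
        ring.put(token, address);
    }

    // Placeholder hash (assumption): a real cluster uses its configured partitioner.
    static long tokenFor(String partitionKey) {
        CRC32 crc = new CRC32();
        crc.update(partitionKey.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Primary owner: first node whose token is >= the key's token, wrapping around the ring.
    String primaryFor(long keyToken) {
        Map.Entry<Long, String> owner = ring.ceilingEntry(keyToken);
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        TokenRingSketch sketch = new TokenRingSketch();
        sketch.addNode(1_000_000_000L, "10.0.0.1"); // hypothetical tokens and addresses
        sketch.addNode(2_500_000_000L, "10.0.0.2");
        sketch.addNode(4_000_000_000L, "10.0.0.3");
        String key = "user:42";
        System.out.println(key + " -> " + sketch.primaryFor(tokenFor(key)));
    }
}
```

Sending the request straight to the owning node saves a network hop compared with letting an arbitrary coordinator forward it.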
Cassandra Applications - Fault tolerance
• Try first with a Consistency Level of QUORUM
• If that fails, retry with Consistency Level ONE
[Diagram: client, nodes, and replicas]
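The retry strategy on this slide can be sketched without any driver code. In the example below, the request is a plain function of the consistency level, and UnavailableException is a hypothetical stand-in for whatever error signals that QUORUM could not be met; in a real application this logic would more likely live in the retry policy passed to withRetryPolicy.

```java
import java.util.function.Function;

final class DowngradeRetrySketch {
    enum Level { QUORUM, ONE }

    // Hypothetical stand-in for "not enough replicas answered".
    static final class UnavailableException extends RuntimeException {
        UnavailableException(String message) {
            super(message);
        }
    }

    // Try at QUORUM first; if that is unavailable, retry once at ONE.
    static <T> T executeWithFallback(Function<Level, T> request) {
        try {
            return request.apply(Level.QUORUM);
        } catch (UnavailableException e) {
            return request.apply(Level.ONE);
        }
    }

    public static void main(String[] args) {
        // Simulated cluster state: only one of three replicas is alive.
        int liveReplicas = 1;
        String result = executeWithFallback(level -> {
            int needed = (level == Level.QUORUM) ? 2 : 1;
            if (liveReplicas < needed) {
                throw new UnavailableException(level + " needs " + needed + " replicas");
            }
            return "served at " + level;
        });
        System.out.println(result);
    }
}
```

The trade-off is that a read served at ONE may return stale data, so the downgrade suits cases where availability matters more than strict consistency.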
Application Example - Layout
• Active-Active
• Service-based DNS routing
Cassandra Replication
Application Example - Uptime
• Normal server maintenance
• Application is unaware
Cassandra Replication
Application Example - Failure
• Data center failure
• Data is safe. Route traffic.
Another happy user!
Cassandra Users and Use Cases
Netflix
• If you haven’t heard their story… where have you been?
• 18B market cap, runs on Cassandra
• User accounts
• Playlists
• Payments
• Statistics
Spotify
• Millions of songs. Millions of users.
• Playlists
• 1 billion playlists
• 30+ Cassandra clusters
• 50+ TB of data
• 40k req/sec peak
http://www.slideshare.net/noaresare/cassandra-nyc
Instagram (Facebook)
• Loads and loads of photos. (Probably yours)
• All in AWS
• Security audits
• News feed
• 20k writes/sec. 15k reads/sec.