Cassandra and Clojure
DESCRIPTION
An introduction to Cassandra as well as an example of accessing Cassandra from Clojure. Includes an introduction to the cluster architecture and data model in Cassandra. The code for the examples is available at: https://github.com/nickmbailey/clojure-cassandra-demo
TRANSCRIPT
![Page 1: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/1.jpg)
©2013 DataStax. Do not distribute without consent.
Nick Bailey
OpsCenter Architect
Cassandra and Clojure
![Page 2: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/2.jpg)
Who am I?
• OpsCenter Architect
• Monitoring/management tool for Cassandra
• Organizer of Austin Cassandra Users
• http://www.meetup.com/Austin-Cassandra-Users/
• Third Thursday each month. Come join!
• Working with Cassandra for 4 years
![Page 3: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/3.jpg)
Cassandra - An introduction
![Page 4: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/4.jpg)
Cassandra - Intro
• Based on Amazon Dynamo and Google BigTable papers
• Shared nothing
• Distributed
• Predictable scaling
[Figure labels: Dynamo, BigTable]
![Page 5: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/5.jpg)
Users
![Page 6: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/6.jpg)
Cassandra - Architecture
![Page 7: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/7.jpg)
Cassandra - Cluster Architecture
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
![Page 8: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/8.jpg)
Cassandra - Data Distribution
[Figure: node tokens 0, 25, 50, 75]
• Each node owns 1 or more “tokens”
• Each piece of data has a “partition key”
• Partition key is hashed to determine token
• Hashes:
• Murmur3 (default)
• MD5
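The token lookup described on this slide can be sketched in a few lines of Clojure. This is an illustration under stated assumptions, not Cassandra's implementation: the node names and the 0-99 token space are made up to match the figure, and Clojure's built-in `hash` stands in for the real partitioner.

```clojure
(def ring
  ;; token -> node name; a hypothetical four-node cluster with the tokens from the figure
  (sorted-map 0 "node-a" 25 "node-b" 50 "node-c" 75 "node-d"))

(defn key->token
  "Stand-in for the partitioner: maps a partition key onto the 0-99 token space.
   Real clusters hash with Murmur3 over a far larger range."
  [partition-key]
  (mod (hash partition-key) 100))

(defn owner
  "Returns the node that owns `token`: the first node whose token is >= `token`,
   wrapping around to the lowest token when none is."
  [ring token]
  (val (or (first (subseq ring >= token))
           (first ring))))

(defn node-for
  "Hashes `partition-key` and looks up the owning node."
  [ring partition-key]
  (owner ring (key->token partition-key)))
```

With this ring, `(owner ring 10)` returns `"node-b"` and `(owner ring 80)` wraps around to `"node-a"`.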
![Page 9: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/9.jpg)
Cassandra - Replication
• Client writes to any node
• Node coordinates with replicas
• Data replicated in parallel
• Replication factor (RF): How many copies of your data?
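As a rough illustration of the replication factor, the sketch below picks replicas by walking the ring clockwise from the owning node, which is the placement Cassandra's SimpleStrategy uses. The node names are hypothetical, and the function ignores racks and datacenters.

```clojure
(def nodes
  ;; Hypothetical nodes, listed in ring (token) order.
  ["node-a" "node-b" "node-c" "node-d"])

(defn replicas
  "Returns the nodes holding a copy of data owned by the node at `owner-index`:
   the owner plus the next nodes clockwise, `rf` in total (capped at cluster size)."
  [nodes owner-index rf]
  (->> (cycle nodes)
       (drop owner-index)
       (take (min rf (count nodes)))
       vec))

(replicas nodes 3 3)
;; => ["node-d" "node-a" "node-b"]  (the walk wraps around the ring)
```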
![Page 10: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/10.jpg)
Cassandra - Failure Modes
• Consistency level
• How many nodes?
• ONE/QUORUM/ALL
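The link between consistency level and failure tolerance comes down to simple arithmetic: QUORUM needs a majority of replicas, floor(RF / 2) + 1. The sketch below uses hypothetical function names and keyword levels to show which requests still succeed when replicas are down.

```clojure
(defn required-acks
  "Number of replica acknowledgements a request needs at `level`,
   given replication factor `rf`."
  [level rf]
  (case level
    :one    1
    :quorum (inc (quot rf 2))
    :all    rf))

(defn survives-failures?
  "True when a request at `level` can still succeed with `down` replicas unavailable."
  [level rf down]
  (>= (- rf down) (required-acks level rf)))

(survives-failures? :quorum 3 1) ; => true, 2 of 3 replicas are enough
(survives-failures? :all 3 1)    ; => false, ALL needs every replica
```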
![Page 11: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/11.jpg)
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC
• Consistency Level
• LOCAL_QUORUM
[Figure: Datacenter East, Datacenter West]
![Page 12: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/12.jpg)
Data Modeling - Concepts
![Page 13: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/13.jpg)
CQL
• Cassandra Query Language
• SQL-like
• Not Relational
![Page 14: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/14.jpg)
Terminology
• Keyspace
• Table (Column Family)
• Row
• Column
• Partition Key
• Clustering Key
![Page 15: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/15.jpg)
Data Types
cqlsh:clojure_cassandra_demo> help types
CQL types recognized by this version of cqlsh:
ascii bigint blob boolean counter decimal double float inet int list map set text timestamp timeuuid uuid varchar varint
![Page 16: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/16.jpg)
Advanced Concepts
• Lightweight Transactions
• Atomic Batches
• User Defined Types (coming soon)
![Page 17: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/17.jpg)
Data Modeling - An Example
![Page 18: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/18.jpg)
Approaching Data Modeling
• Model your queries, not your data
• Generally, optimize for reads
• Denormalize!
• Iterate!
![Page 19: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/19.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• See user X’s favorite songs in a specific month
• See who has recently listened to artist Y
• See artist Y’s most popular songs in a specific week
![Page 20: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/20.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• One of the most common patterns/data models
• Time series
• Immutable (good fit for Clojure!)
![Page 21: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/21.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
SELECT song, artist, played_at
FROM user_history
WHERE username = 'nickmbailey'
ORDER BY played_at DESC;
• Partition key = ‘username’
• Clustering key = ‘played_at’
![Page 22: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/22.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
CREATE TABLE user_history (
    username text,
    played_at timestamp,
    album text,
    artist text,
    song text,
    PRIMARY KEY (username, played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 23: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/23.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• This table has a “bad” partition key: every play by a user lands in a single partition, which grows without bound
CREATE TABLE user_history (
    username text,
    played_at timestamp,
    album text,
    artist text,
    song text,
    PRIMARY KEY (username, played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 24: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/24.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• Much better partition key
CREATE TABLE user_history (
    username text,
    year_and_month text,
    played_at timestamp,
    album text,
    artist text,
    song text,
    PRIMARY KEY ((username, year_and_month), played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
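With a composite partition key, the client has to compute the `year_and_month` bucket before every read or write. The sketch below shows one way to derive it from a play timestamp. The function names are hypothetical, and bucketing in UTC is an assumption; the slides do not say which time zone the demo uses.

```clojure
(import '(java.time Instant ZoneOffset)
        '(java.time.format DateTimeFormatter))

(def bucket-format
  ;; "yyyy-MM" rendered in UTC (an assumption; pick one zone and use it everywhere)
  (.withZone (DateTimeFormatter/ofPattern "yyyy-MM") ZoneOffset/UTC))

(defn year-and-month
  "Formats an Instant as the year_and_month bucket, e.g. \"2014-06\"."
  [played-at]
  (.format bucket-format played-at))

(defn partition-key
  "Returns the [username year_and_month] pair that identifies one partition."
  [username played-at]
  [username (year-and-month played-at)])

(partition-key "nickmbailey" (Instant/parse "2014-06-30T22:13:54Z"))
;; => ["nickmbailey" "2014-06"]
```

Reading "recent" plays near the start of a month may require querying the previous bucket as well.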
![Page 25: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/25.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
cqlsh:clojure_cassandra_demo> select * from user_history limit 5;
 username    | year_and_month | played_at                | album                    | artist                  | song
-------------+----------------+--------------------------+--------------------------+-------------------------+--------------------------
 nickmbailey | 2014-06        | 2014-06-30 17:13:54-0500 | Once More 'Round The Sun | Mastodon                | Halloween
 nickmbailey | 2014-06        | 2014-06-30 17:08:53-0500 | Once More 'Round The Sun | Mastodon                | Ember City
 b_hastings  | 2014-06        | 2014-06-30 12:57:12-0500 | Buena Vista Social Club  | Buena Vista Social Club | Chan Chan
 zack_smith  | 2014-07        | 2014-07-30 12:49:35-0500 | Awake Remix              | Tycho                   | Awake (Com Truise Remix)
 zack_smith  | 2014-03        | 2014-03-30 12:44:50-0500 | Awake Remix              | Tycho                   | Awake
Partition Key - unordered
Clustering Key - ordered
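One way to picture "unordered partitions, ordered clustering keys" is a plain map of partitions, each holding a sorted map of rows. The sketch below is an in-memory analogy only, with hypothetical function names; it says nothing about Cassandra's actual storage format.

```clojure
(def newest-first
  ;; Reverses natural order, mirroring CLUSTERING ORDER BY (played_at DESC).
  #(compare %2 %1))

(defn record-play
  "Adds one play to `history`, a plain map of partition key -> sorted rows."
  [history {:keys [username year-and-month played-at] :as play}]
  (update history
          [username year-and-month]
          (fnil assoc (sorted-map-by newest-first))
          played-at
          (select-keys play [:artist :song])))

(defn recent-plays
  "Returns up to `n` of the most recent rows from a single partition."
  [history username year-and-month n]
  (->> (get history [username year-and-month])
       vals
       (take n)))

(def history
  ;; Plays are inserted oldest first; the sorted map keeps them newest first.
  (reduce record-play {}
          [{:username "nickmbailey" :year-and-month "2014-06"
            :played-at "2014-06-30 17:08:53" :artist "Mastodon" :song "Ember City"}
           {:username "nickmbailey" :year-and-month "2014-06"
            :played-at "2014-06-30 17:13:54" :artist "Mastodon" :song "Halloween"}
           {:username "b_hastings" :year-and-month "2014-06"
            :played-at "2014-06-30 12:57:12" :artist "Buena Vista Social Club" :song "Chan Chan"}]))

(map :song (recent-plays history "nickmbailey" "2014-06" 5))
;; => ("Halloween" "Ember City")
```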
![Page 26: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/26.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
SELECT song, artist, play_count
FROM user_history
WHERE username = 'nickmbailey' AND month = 'July'
ORDER BY play_count DESC;
• Partition key = ‘username’, ‘month’
• Clustering key = ‘play_count’?
• Counters are a special case
![Page 27: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/27.jpg)
Counters
• Counters cannot be part of the PRIMARY KEY
• No ordering based on counter value
• All non-counter columns must be part of the PRIMARY KEY
• Limitations due to the storage format
![Page 28: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/28.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
CREATE TABLE user_song_counts (
    username text,
    year_and_month text,
    artist text,
    song text,
    play_count counter,
    PRIMARY KEY ((username, year_and_month), artist, song)
);
![Page 29: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/29.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
• Results are not ordered by play count (rows come back in clustering order: artist, then song)
• Client will have to do the sorting
cqlsh:clojure_cassandra_demo> select * from user_song_counts where username = 'nickmbailey' and year_and_month = '2014-07';
 username    | year_and_month | artist   | song                              | count
-------------+----------------+----------+-----------------------------------+-------
 nickmbailey | 2014-07        | Amos Lee | Tricksters, Hucksters, And Scamps |    10
 nickmbailey | 2014-07        | Beck     | Blackbird Chain                   |     1
 nickmbailey | 2014-07        | Beck     | Blue Moon                         |     4
 nickmbailey | 2014-07        | Cherub   | <3                                |    12
 nickmbailey | 2014-07        | Cherub   | Chocolate Strawberries            |     6
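Because the counter cannot drive the ordering, ranking by play count happens on the client. A minimal sketch, assuming the rows arrive as Clojure maps with keyword keys (the `rows` literal copies the output above; `favorite-songs` is a hypothetical helper):

```clojure
(def rows
  ;; The five rows from the cqlsh output, as maps.
  [{:artist "Amos Lee" :song "Tricksters, Hucksters, And Scamps" :count 10}
   {:artist "Beck"     :song "Blackbird Chain"                   :count 1}
   {:artist "Beck"     :song "Blue Moon"                         :count 4}
   {:artist "Cherub"   :song "<3"                                :count 12}
   {:artist "Cherub"   :song "Chocolate Strawberries"            :count 6}])

(defn favorite-songs
  "Returns the titles of the `n` most played songs, highest count first."
  [rows n]
  (->> rows
       (sort-by :count >)
       (take n)
       (mapv :song)))

(favorite-songs rows 3)
;; => ["<3" "Tricksters, Hucksters, And Scamps" "Chocolate Strawberries"]
```

Sorting on the client is cheap here because the query reads a single partition (one user, one month).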
![Page 30: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/30.jpg)
Basic Last.fm Clone
• See who has recently listened to artist Y
CREATE TABLE artist_history (
    artist text,
    year_and_week text,
    played_at timestamp,
    album text,
    song text,
    username text,
    PRIMARY KEY ((artist, year_and_week), played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 31: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/31.jpg)
Basic Last.fm Clone
• See artist Y’s most popular songs in a specific week
CREATE TABLE artist_song_counts (
    artist text,
    year_and_week text,
    album text,
    song text,
    play_count counter,
    PRIMARY KEY ((artist, year_and_week), album, song)
);
![Page 32: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/32.jpg)
Cassandra from Clojure
![Page 33: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/33.jpg)
Building Blocks
• Java Driver
• Hayt
![Page 34: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/34.jpg)
Java Driver
• Fully featured
• Connection pooling
• Failover policies
• Retry policies
• Sync and Async interfaces
• Exposes client metrics
• https://github.com/datastax/java-driver
![Page 35: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/35.jpg)
Hayt
• CQL DSL
• Similar to Korma
• Solely for building CQL strings
• https://github.com/mpenet/hayt
(use 'qbits.hayt)

;; Builds a query as plain data
(select :foo (where {:bar 1
                     :baz 2}))

;; Renders the query to a CQL string
(->raw (select :foo (where {:bar 1 :baz 2})))
> "SELECT * FROM foo WHERE bar = 1 AND baz = 2;"
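To make "solely for building CQL strings" concrete, here is a toy string builder in plain Clojure. It is not Hayt's implementation and covers only `SELECT *` with equality conditions; `select-all` and `literal` are hypothetical names.

```clojure
(require '[clojure.string :as str])

(defn literal
  "Renders a value as a CQL literal: strings are single-quoted with embedded
   quotes doubled, everything else is printed as-is."
  [v]
  (if (string? v)
    (str "'" (str/replace v "'" "''") "'")
    (str v)))

(defn select-all
  "Builds a SELECT * statement for `table` with an AND-ed equality WHERE clause."
  [table conditions]
  (str "SELECT * FROM " (name table)
       (when (seq conditions)
         (str " WHERE "
              (str/join " AND "
                        (for [[column value] conditions]
                          (str (name column) " = " (literal value))))))
       ";"))

(select-all :foo {:bar 1 :baz 2})
;; => "SELECT * FROM foo WHERE bar = 1 AND baz = 2;"
```

A real DSL such as Hayt covers far more of the CQL grammar; for values that come from users, prepared statements are the safer route.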
![Page 36: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/36.jpg)
Clients
• Alia
• https://github.com/mpenet/alia
• Cassaforte
• https://github.com/clojurewerkz/cassaforte
• Both built on Java Driver and Hayt
• Not particularly different
![Page 37: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/37.jpg)
Alia vs. Cassaforte
Cassaforte
(let [conn (cc/connect ["127.0.0.1"])]
  (cql/create-keyspace conn "cassaforte_keyspace"
    (with {:replication {:class "SimpleStrategy"
                         :replication_factor 1}})))
Alia
(def cluster (alia/cluster {:contact-points ["localhost"]}))
(def session (alia/connect cluster))
(alia/execute session
  (create-keyspace :alia
    (if-exists false)
    (with {:replication {:class "SimpleStrategy"
                         :replication_factor 1}})))
![Page 38: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/38.jpg)
Learn by Example - Alia
![Page 39: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/39.jpg)
Cluster Object
• Entry point
• Configures relevant client options
• :contact-points
• :load-balancing-policy
• :reconnection-policy
• :retry-policy
• and more!
(def cluster (alia/cluster {:contact-points ["localhost"]}))
![Page 40: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/40.jpg)
Session Object
• A Session is associated with a keyspace
• Allows interacting with multiple keyspaces
(def cluster (alia/cluster {:contact-points ["localhost"]}))
;; Session with no default keyspace
(def session (alia/connect cluster))
;; Or: session bound to a keyspace
(def session (alia/connect cluster :my_keyspace))
![Page 41: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/41.jpg)
Querying
• Multiple ways to query
• alia/execute
• Synchronous, block on result
• alia/execute-async
• Returns a Lamina result-channel (basically, a promise)
• Optional success/error callbacks
• alia/execute-chan
• Returns a core.async channel
• We won’t dive into core.async now
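The difference between the blocking and callback styles can be shown without a cluster. In the sketch below, `run-query` is a hypothetical stand-in for a driver round trip, and `query-blocking` and `query-with-callbacks` are illustrative names, not Alia functions; a Clojure `future` plays the role of the asynchronous result.

```clojure
(defn run-query
  "Hypothetical stand-in for a driver round trip; returns canned rows."
  [query]
  [{:query query :song "Halloween"}])

(defn query-blocking
  "Synchronous style: the caller waits for the rows."
  [query]
  (run-query query))

(defn query-with-callbacks
  "Asynchronous style: returns a future immediately and invokes the optional
   :success or :error callback once the query finishes."
  [query {:keys [success error]}]
  (future
    (try
      (let [rows (run-query query)]
        (when success (success rows))
        rows)
      (catch Exception e
        (when error (error e))
        (throw e)))))

;; Blocking call
(query-blocking "select * from user_history limit 1;")

;; Callback style: the promise receives the rows when they arrive
(let [result (promise)]
  (query-with-callbacks "select * from user_history limit 1;"
                        {:success #(deliver result %)})
  (deref result 1000 :timeout))
```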
![Page 42: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/42.jpg)
Prepared Statements
• Statements can be prepared server side
• Better performance for common queries
(def prepared-statement
  (alia/prepare session "select * from users where user_name=?;"))
![Page 43: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/43.jpg)
What else?
• See github and docs
• https://github.com/mpenet/alia
• http://mpenet.github.io/alia/qbits.alia.html
![Page 44: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/44.jpg)
Demo
![Page 45: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/45.jpg)
Demo
• https://github.com/nickmbailey/clojure-cassandra-demo
• Built with
• CCM - https://github.com/pcmanus/ccm
• Alia - https://github.com/mpenet/alia
• ring - https://github.com/ring-clojure/ring
• compojure - https://github.com/weavejester/compojure
• hiccup - https://github.com/weavejester/hiccup
• least - https://github.com/Raynes/least
![Page 46: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/46.jpg)
More
Cassandra: http://cassandra.apache.org
DataStax Drivers: https://github.com/datastax
Documentation: http://www.datastax.com/docs
Getting Started: http://www.datastax.com/documentation/gettingstarted/index.html
Developer Blog: http://www.datastax.com/dev/blog
Cassandra Community Site: http://planetcassandra.org
Download: http://planetcassandra.org/Download/DataStaxCommunityEdition
Webinars: http://planetcassandra.org/Learn/CassandraCommunityWebinars
Cassandra Summit Talks: http://planetcassandra.org/Learn/CassandraSummit
![Page 47: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/47.jpg)
©2013 DataStax Confidential. Do not distribute without consent.