![Page 1: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/1.jpg)
©2013 DataStax. Do not distribute without consent.
Nick Bailey
OpsCenter Architect
Cassandra and Clojure
![Page 2: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/2.jpg)
Who am I?
• OpsCenter Architect
• Monitoring/management tool for Cassandra
• Organizer of Austin Cassandra Users
• http://www.meetup.com/Austin-Cassandra-Users/
• Third Thursday each month. Come join!
• Working with Cassandra for 4 years
![Page 3: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/3.jpg)
Cassandra - An introduction
![Page 4: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/4.jpg)
Cassandra - Intro
• Based on Amazon Dynamo and Google BigTable papers
• Shared nothing
• Distributed
• Predictable scaling
![Page 5: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/5.jpg)
Users
![Page 6: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/6.jpg)
Cassandra - Architecture
![Page 7: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/7.jpg)
Cassandra - Cluster Architecture
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
![Page 8: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/8.jpg)
Cassandra - Data Distribution
[Figure: token ring labeled 0, 25, 50, 75]
• Each node owns 1 or more “tokens”
• Each piece of data has a “partition key”
• Partition key is hashed to determine token
• Hashes:
• Murmur3 (default)
• MD5
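A minimal sketch of the lookup described above, assuming the four tokens 0, 25, 50 and 75 from the ring figure and hypothetical node names. `toy-token` reduces Clojure's built-in `hash` to the range 0..99 as a stand-in for Murmur3, so the tokens it yields are illustrative only. A node is taken to own the range that ends at its token, and keys past the highest token wrap around to the lowest.

```clojure
;; Hypothetical ring: token -> node name (node names are made up).
(def ring (sorted-map 0 "node-a" 25 "node-b" 50 "node-c" 75 "node-d"))

(defn toy-token
  "Maps a partition key to a token in 0..99.
  Stand-in for the real partitioner hash (Murmur3 by default)."
  [partition-key]
  (mod (hash partition-key) 100))

(defn owner-for-token
  "Returns the node whose token is the first one >= `token`,
  wrapping around to the lowest token when none is."
  [ring token]
  (let [[_ node] (or (first (subseq ring >= token))
                     (first ring))]
    node))

(defn owner-for-key [ring partition-key]
  (owner-for-token ring (toy-token partition-key)))

(owner-for-token ring 30) ; => "node-c"
(owner-for-token ring 80) ; => "node-a" (wraps around)
```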
![Page 9: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/9.jpg)
Cassandra - Replication
• Client writes to any node
• Node coordinates with replicas
• Data replicated in parallel
• Replication factor (RF): How many copies of your data?
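One way to picture the replication factor is as the number of distinct nodes that hold a given token. The sketch below assumes simple ring-order placement (the owner plus the next nodes clockwise, which is how SimpleStrategy behaves) and a hypothetical four-node ring; the node names are made up.

```clojure
;; Hypothetical ring: token -> node name.
(def ring (sorted-map 0 "node-a" 25 "node-b" 50 "node-c" 75 "node-d"))

(defn replicas-for-token
  "Returns `rf` distinct nodes for `token`: the owner first,
  then the following nodes in ring order, wrapping around."
  [ring token rf]
  (let [nodes-from-owner (concat (map val (subseq ring >= token))
                                 (vals ring))]
    (vec (take (min rf (count ring))
               (distinct nodes-from-owner)))))

(replicas-for-token ring 30 3) ; => ["node-c" "node-d" "node-a"]
```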
![Page 10: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/10.jpg)
Cassandra - Failure Modes
• Consistency level
• How many nodes?
• ONE/QUORUM/ALL
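The arithmetic behind the three levels can be sketched as follows. QUORUM is a majority of the replicas, floor(RF / 2) + 1; the function names are illustrative and not part of any driver API.

```clojure
(defn quorum
  "Majority of `rf` replicas."
  [rf]
  (inc (quot rf 2)))

(defn required-acks
  "Number of replicas that must respond at a consistency level."
  [level rf]
  (case level
    :one    1
    :quorum (quorum rf)
    :all    rf))

(defn tolerated-failures
  "Replicas that can be down while requests at `level` still succeed."
  [level rf]
  (- rf (required-acks level rf)))

(required-acks :quorum 3)      ; => 2
(tolerated-failures :quorum 3) ; => 1
(tolerated-failures :all 3)    ; => 0
```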
![Page 11: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/11.jpg)
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC
• Consistency Level
• LOCAL_QUORUM
[Figure: Datacenter East and Datacenter West]
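To see why LOCAL_QUORUM matters across a WAN, compare it with QUORUM, which counts replicas in every datacenter. The sketch assumes a replication factor of 3 in each of two datacenters; the map and function names are hypothetical.

```clojure
;; Hypothetical per-datacenter replication factors.
(def replication {:east 3 :west 3})

(defn quorum
  "Majority of `replicas`."
  [replicas]
  (inc (quot replicas 2)))

(defn local-quorum
  "Acknowledgements needed from the coordinator's own datacenter."
  [replication dc]
  (quorum (get replication dc)))

(defn global-quorum
  "Acknowledgements needed when replicas in all datacenters count."
  [replication]
  (quorum (reduce + (vals replication))))

(local-quorum replication :east) ; => 2, no cross-WAN round trip required
(global-quorum replication)      ; => 4, must include a remote replica
```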
![Page 12: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/12.jpg)
Data Modeling - Concepts
![Page 13: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/13.jpg)
CQL
• Cassandra Query Language
• SQL-like
• Not Relational
![Page 14: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/14.jpg)
Terminology
• Keyspace
• Table (Column Family)
• Row
• Column
• Partition Key
• Clustering Key
![Page 15: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/15.jpg)
Data Types
cqlsh:clojure_cassandra_demo> help types
CQL types recognized by this version of cqlsh:
ascii bigint blob boolean counter decimal double float inet int list map set text timestamp timeuuid uuid varchar varint
![Page 16: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/16.jpg)
Advanced Concepts
• Lightweight Transactions
• Atomic Batches
• User Defined Types (coming soon)
![Page 17: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/17.jpg)
Data Modeling - An Example
![Page 18: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/18.jpg)
Approaching Data Modeling
• Model your queries, not your data
• Generally, optimize for reads
• Denormalize!
• Iterate!
![Page 19: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/19.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• See user X’s favorite songs in a specific month
• See who has recently listened to artist Y
• See artist Y’s most popular songs in a specific week
![Page 20: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/20.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• One of the most common patterns/data models
• Time series
• Immutable (good fit for Clojure!)
![Page 21: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/21.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
SELECT song, artist, played_at
FROM user_history
WHERE username = 'nickmbailey'
ORDER BY played_at DESC;
• Partition key = ‘username’
• Clustering key = ‘played_at’
![Page 22: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/22.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
CREATE TABLE user_history (
  username text,
  played_at timestamp,
  album text,
  artist text,
  song text,
  PRIMARY KEY (username, played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 23: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/23.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• This table has a “bad” partition key: every play by a user lands in a single partition that grows without bound
CREATE TABLE user_history (
  username text,
  played_at timestamp,
  album text,
  artist text,
  song text,
  PRIMARY KEY (username, played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 24: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/24.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
• Much better partition key: bucketing by month bounds the size of each partition
CREATE TABLE user_history (
  username text,
  year_and_month text,
  played_at timestamp,
  album text,
  artist text,
  song text,
  PRIMARY KEY ((username, year_and_month), played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
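The application has to compute the `year_and_month` bucket itself before reading or writing. A minimal sketch, assuming the bucket is derived in UTC and written as `yyyy-MM` (the format seen in the demo data); the function names are hypothetical and the code needs Java 8 or later for `java.time`.

```clojure
(import '(java.time Instant YearMonth ZoneOffset))

(defn year-and-month
  "Formats the month containing `played-at` as \"yyyy-MM\" (UTC assumed)."
  [^Instant played-at]
  (str (YearMonth/from (.atOffset played-at ZoneOffset/UTC))))

(defn partition-key
  "Composite partition key for user_history: [username year_and_month]."
  [username played-at]
  [username (year-and-month played-at)])

(partition-key "nickmbailey" (Instant/parse "2014-06-30T22:13:54Z"))
;; => ["nickmbailey" "2014-06"]
```

Reading a user's recent history then means querying the current month's partition first and stepping back a month at a time if more rows are needed.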
![Page 25: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/25.jpg)
Basic Last.fm Clone
• See songs that user X has listened to recently
cqlsh:clojure_cassandra_demo> select * from user_history limit 5;
 username    | year_and_month | played_at                | album                    | artist                  | song
-------------+----------------+--------------------------+--------------------------+-------------------------+--------------------------
 nickmbailey | 2014-06        | 2014-06-30 17:13:54-0500 | Once More 'Round The Sun | Mastodon                | Halloween
 nickmbailey | 2014-06        | 2014-06-30 17:08:53-0500 | Once More 'Round The Sun | Mastodon                | Ember City
 b_hastings  | 2014-06        | 2014-06-30 12:57:12-0500 | Buena Vista Social Club  | Buena Vista Social Club | Chan Chan
 zack_smith  | 2014-07        | 2014-07-30 12:49:35-0500 | Awake Remix              | Tycho                   | Awake (Com Truise Remix)
 zack_smith  | 2014-03        | 2014-03-30 12:44:50-0500 | Awake Remix              | Tycho                   | Awake
• Partition key: unordered
• Clustering key: ordered
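A rough mental model for this layout is a hash map of partitions, each holding a sorted map keyed by the clustering column. The sketch below is an in-memory analogy only, not how Cassandra stores data; timestamps are kept as strings in one fixed format so that plain `compare` orders them.

```clojure
(def newest-first
  "Comparator that sorts larger (later) clustering keys first."
  #(compare %2 %1))

(defn add-play
  "Adds `row` to its partition, keeping rows ordered by played_at descending."
  [table {:keys [username year_and_month played_at] :as row}]
  (update table
          [username year_and_month]
          (fnil assoc (sorted-map-by newest-first))
          played_at row))

(defn recent-plays
  "Returns up to `n` rows from one partition, newest first."
  [table partition-key n]
  (take n (vals (get table partition-key))))

(def table
  (reduce add-play {}
          [{:username "nickmbailey" :year_and_month "2014-06"
            :played_at "2014-06-30 17:08:53" :song "Ember City"}
           {:username "nickmbailey" :year_and_month "2014-06"
            :played_at "2014-06-30 17:13:54" :song "Halloween"}
           {:username "zack_smith" :year_and_month "2014-07"
            :played_at "2014-07-30 12:49:35" :song "Awake (Com Truise Remix)"}]))

(map :song (recent-plays table ["nickmbailey" "2014-06"] 2))
;; => ("Halloween" "Ember City")
```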
![Page 26: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/26.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
SELECT song, artist, play_count
FROM user_history
WHERE username = 'nickmbailey' AND month = 'July'
ORDER BY play_count DESC;
• Partition key = ‘username’, ‘month’
• Clustering key = ‘play_count’?
• Counters are a special case
![Page 27: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/27.jpg)
Counters
• A counter cannot be part of the PRIMARY KEY
• No ordering based on counter value
• All non-counter columns must be part of the PRIMARY KEY
• Limitations due to the storage format
![Page 28: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/28.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
CREATE TABLE user_song_counts (
  username text,
  year_and_month text,
  artist text,
  song text,
  play_count counter,
  PRIMARY KEY ((username, year_and_month), artist, song)
);
![Page 29: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/29.jpg)
Basic Last.fm Clone
• See user X’s favorite songs in a specific month
• Results are not ordered by play count
• Client will have to do the sorting
cqlsh:clojure_cassandra_demo> select * from user_song_counts where username = 'nickmbailey' and year_and_month = '2014-07';
 username    | year_and_month | artist   | song                              | count
-------------+----------------+----------+-----------------------------------+-------
 nickmbailey | 2014-07        | Amos Lee | Tricksters, Hucksters, And Scamps | 10
 nickmbailey | 2014-07        | Beck     | Blackbird Chain                   | 1
 nickmbailey | 2014-07        | Beck     | Blue Moon                         | 4
 nickmbailey | 2014-07        | Cherub   | <3                                | 12
 nickmbailey | 2014-07        | Cherub   | Chocolate Strawberries            | 6
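The client-side sort is a one-liner in Clojure. A minimal sketch, assuming the rows arrive as maps with a `:play_count` key (the key name follows the table definition; the exact row shape depends on the client library).

```clojure
;; Rows as they might come back for one (username, year_and_month) partition.
(def rows
  [{:artist "Amos Lee" :song "Tricksters, Hucksters, And Scamps" :play_count 10}
   {:artist "Beck"     :song "Blackbird Chain"                   :play_count 1}
   {:artist "Beck"     :song "Blue Moon"                         :play_count 4}
   {:artist "Cherub"   :song "<3"                                :play_count 12}
   {:artist "Cherub"   :song "Chocolate Strawberries"            :play_count 6}])

(defn top-songs
  "Returns the `n` most played rows, highest play count first."
  [rows n]
  (take n (sort-by :play_count > rows)))

(map :song (top-songs rows 3))
;; => ("<3" "Tricksters, Hucksters, And Scamps" "Chocolate Strawberries")
```

Because the month bucket bounds the partition size, sorting one partition in memory stays cheap.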
![Page 30: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/30.jpg)
Basic Last.fm Clone
• See who has recently listened to artist Y
CREATE TABLE artist_history (
  artist text,
  year_and_week text,
  played_at timestamp,
  album text,
  song text,
  username text,
  PRIMARY KEY ((artist, year_and_week), played_at)
) WITH CLUSTERING ORDER BY (played_at DESC);
![Page 31: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/31.jpg)
Basic Last.fm Clone
• See artist Y’s most popular songs in a specific week
CREATE TABLE artist_song_counts (
  artist text,
  year_and_week text,
  album text,
  song text,
  play_count counter,
  PRIMARY KEY ((artist, year_and_week), album, song)
);
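Denormalizing means that a single play event fans out into one write per table. The sketch below only builds parameterized CQL strings with their bind values for the four tables above; it does not talk to a cluster. The function name and event map shape are hypothetical, and the two bucket values are assumed to be precomputed by the caller.

```clojure
(defn statements-for-play
  "Returns [cql values] pairs, one per table, for a single play event.
  Counter columns are incremented with UPDATE; history rows are inserted."
  [{:keys [username artist album song played_at year_and_month year_and_week]}]
  [["INSERT INTO user_history (username, year_and_month, played_at, album, artist, song) VALUES (?, ?, ?, ?, ?, ?);"
    [username year_and_month played_at album artist song]]
   ["UPDATE user_song_counts SET play_count = play_count + 1 WHERE username = ? AND year_and_month = ? AND artist = ? AND song = ?;"
    [username year_and_month artist song]]
   ["INSERT INTO artist_history (artist, year_and_week, played_at, album, song, username) VALUES (?, ?, ?, ?, ?, ?);"
    [artist year_and_week played_at album song username]]
   ["UPDATE artist_song_counts SET play_count = play_count + 1 WHERE artist = ? AND year_and_week = ? AND album = ? AND song = ?;"
    [artist year_and_week album song]]])

;; Example event; the year_and_week format here is an assumption.
(def example-play
  {:username "nickmbailey" :artist "Mastodon"
   :album "Once More 'Round The Sun" :song "Halloween"
   :played_at (java.util.Date.)
   :year_and_month "2014-06" :year_and_week "2014-27"})

(count (statements-for-play example-play)) ; => 4
```

Each pair could then be prepared and executed through whichever client is in use.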
![Page 32: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/32.jpg)
Cassandra from Clojure
![Page 33: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/33.jpg)
Building Blocks
• Java Driver
• Hayt
![Page 34: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/34.jpg)
Java Driver
• Fully featured
• Connection pooling
• Failover policies
• Retry policies
• Sync and Async interfaces
• Exposes client metrics
• https://github.com/datastax/java-driver
![Page 35: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/35.jpg)
Hayt
• CQL DSL
• Similar to Korma
• Solely for building CQL strings
• https://github.com/mpenet/hayt
(select :foo
        (where {:bar 1
                :baz 2}))

(->raw (select :foo (where {:bar 1 :baz 2})))
> "SELECT * FROM foo WHERE bar = 1 AND baz = 2;"
![Page 36: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/36.jpg)
Clients
• Alia
• https://github.com/mpenet/alia
• Cassaforte
• https://github.com/clojurewerkz/cassaforte
• Both built on Java Driver and Hayt
• Not particularly different
![Page 37: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/37.jpg)
Alia vs. Cassaforte
Cassaforte
(let [conn (cc/connect ["127.0.0.1"])]
  (cql/create-keyspace conn "cassaforte_keyspace"
    (with {:replication {:class "SimpleStrategy"
                         :replication_factor 1}})))
Alia
(def cluster (alia/cluster {:contact-points ["localhost"]}))
(def session (alia/connect cluster))
(alia/execute session
  (create-keyspace :alia
    (if-exists false)
    (with {:replication {:class "SimpleStrategy"
                         :replication_factor 1}})))
![Page 38: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/38.jpg)
Learn by Example - Alia
![Page 39: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/39.jpg)
Cluster Object
• Entry point
• Configures relevant client options
• :contact-points
• :load-balancing-policy
• :reconnection-policy
• :retry-policy
• and more!
(def cluster (alia/cluster {:contact-points ["localhost"]}))
![Page 40: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/40.jpg)
Session Object
• A Session is associated with a keyspace
• Allows interacting with multiple keyspaces
(def cluster (alia/cluster {:contact-points ["localhost"]}))
;; Session with no default keyspace
(def session (alia/connect cluster))
;; Or: session bound to a specific keyspace
(def session (alia/connect cluster :my_keyspace))
![Page 41: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/41.jpg)
Querying
• Multiple ways to query
• alia/execute
• Synchronous, block on result
• alia/execute-async
• Returns a Lamina result-channel (basically, a promise)
• Optional success/error callbacks
• alia/execute-chan
• Returns a core.async channel
• We won’t dive into core.async now
![Page 42: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/42.jpg)
Prepared Statements
• Statements can be prepared server side
• Better performance for common queries
(def prepared-statement (alia/prepare session "select * from users where user_name=?;"))
![Page 43: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/43.jpg)
What else?
• See GitHub and docs
• https://github.com/mpenet/alia
• http://mpenet.github.io/alia/qbits.alia.html
![Page 44: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/44.jpg)
Demo
![Page 45: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/45.jpg)
Demo
• https://github.com/nickmbailey/clojure-cassandra-demo
• Built with
• CCM - https://github.com/pcmanus/ccm
• Alia - https://github.com/mpenet/alia
• ring - https://github.com/ring-clojure/ring
• compojure - https://github.com/weavejester/compojure
• hiccup - https://github.com/weavejester/hiccup
• least - https://github.com/Raynes/least
![Page 46: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/46.jpg)
More
Cassandra: http://cassandra.apache.org
DataStax Drivers: https://github.com/datastax
Documentation: http://www.datastax.com/docs
Getting Started: http://www.datastax.com/documentation/gettingstarted/index.html
Developer Blog: http://www.datastax.com/dev/blog
Cassandra Community Site: http://planetcassandra.org
Download: http://planetcassandra.org/Download/DataStaxCommunityEdition
Webinars: http://planetcassandra.org/Learn/CassandraCommunityWebinars
Cassandra Summit Talks: http://planetcassandra.org/Learn/CassandraSummit
![Page 47: Cassandra and Clojure](https://reader035.vdocuments.net/reader035/viewer/2022081414/54b6f4524a7959d0658b45ba/html5/thumbnails/47.jpg)
©2013 DataStax Confidential. Do not distribute without consent.