Cassandra EU - State of CQL

The State of CQL Sylvain Lebresne (DataStax)

Upload: pcmanus

Post on 27-Dec-2014

DESCRIPTION

State of CQL talk at the Cassandra EU summit

TRANSCRIPT

Page 1: Cassandra EU - State of CQL

The State of CQL

Sylvain Lebresne (DataStax)

Page 2: Cassandra EU - State of CQL

A short CQL primer

New in Cassandra 2.0

Native protocol

What's next?


Page 3: Cassandra EU - State of CQL

A better API for Cassandra

Cassandra has often been regarded as hard to develop against.

Thrift is not satisfactory:

- Not user friendly, hard to use.
- Low level, very little abstraction.
- Hard to evolve (in a backward compatible way).
- Unreadable without driver abstraction.

It doesn't have to be that way!

Page 4: Cassandra EU - State of CQL

Quick historical notes

- CQL1 first introduced in Cassandra 0.8, became CQL2 in Cassandra 1.0
- Semantically, CQL1/CQL2 are closer to the Thrift API than to CQL3.
- "These aren't the CQL you are looking for"
- CQL3 (CQL for short thereafter) introduced in Cassandra 1.2
- CQL3 is the version that's here to stay: no plan for a CQL4 any time soon.

Page 5: Cassandra EU - State of CQL

A short CQL primer

Page 6: Cassandra EU - State of CQL

The Cassandra Query Language

- Syntactically, a subset of SQL (with a few extensions)

CQL:

    CREATE TABLE users (
        user_id uuid,
        name text,
        password text,
        email text,
        picture_profile blob,
        PRIMARY KEY (user_id)
    );

- INSERT and UPDATE are both upserts
- No joins, no sub-queries, no aggregation, ...
- Denormalization is the norm: do the work at write time, not read time
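
The "both upserts" point can be illustrated with a small in-memory analogy. This is only a sketch under stated assumptions: the class `UpsertSketch`, its methods, and the sample values are hypothetical and have nothing to do with Cassandra's storage engine. It shows the observable behaviour: an UPDATE on a missing row creates it, and an INSERT on an existing row overwrites only the columns supplied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a CQL table keyed by user_id.
final class UpsertSketch {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    // INSERT and UPDATE share this write path: create the row if it is
    // missing, then overwrite only the columns that were supplied.
    private void write(String userId, Map<String, String> columns) {
        rows.computeIfAbsent(userId, id -> new HashMap<>()).putAll(columns);
    }

    void insert(String userId, Map<String, String> columns) { write(userId, columns); }

    void update(String userId, Map<String, String> columns) { write(userId, columns); }

    // Returns the stored columns, or an empty map if the row does not exist.
    Map<String, String> select(String userId) {
        return rows.getOrDefault(userId, Map.of());
    }

    public static void main(String[] args) {
        UpsertSketch users = new UpsertSketch();
        users.update("u1", Map.of("name", "Tom"));          // no prior row: UPDATE creates it
        users.insert("u1", Map.of("email", "tom@example.org")); // row exists: INSERT adds/overwrites columns
        System.out.println(users.select("u1"));
    }
}
```

After both calls the row for "u1" holds the name written by the update and the email written by the insert.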

Page 7: Cassandra EU - State of CQL

Denormalization: Cassandra modeling 101

Efficient queries in Cassandra are based on 2 principles:

- the data queried is collocated on one replica set
- the data queried is collocated on disk on those replicas

Denormalization is the technique that makes this achievable in practice.

But this means CQL exposes:

- how to collocate data on the same replica set
- how to collocate data on disk (for a given replica)

Page 8: Cassandra EU - State of CQL

This is done in CQL through the primary key

CQL:

    CREATE TABLE inboxes (
        user_id uuid,
        email_id timeuuid,
        sender text,
        recipients set<text>,
        subject text,
        is_read boolean,
        PRIMARY KEY (user_id, email_id)
    );

CQL distinguishes 2 sub-parts in the PRIMARY KEY:

- partition key: decides the node on which the data is stored
- clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clustering columns

This is important, because CQL only allows queries for which an explicit index exists:

CQL:

    -- Get the last 50 emails in the inbox of user 51b-23-ab8
    SELECT * FROM inboxes WHERE user_id=51b-23-ab8 ORDER BY email_id DESC LIMIT 50;
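
The two roles of the primary key can be pictured with a small in-memory model. This is a sketch under stated assumptions, not Cassandra's actual storage layout: the class `InboxSketch` is hypothetical, the partition key is a plain string, and the clustering column `email_id` is simplified to a `long` instead of a timeuuid. A hash map stands in for "which partition", and a sorted map stands in for "ordered within the partition", which is why the "last N emails" query needs no extra index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the inboxes table: PRIMARY KEY (user_id, email_id).
final class InboxSketch {
    // partition key (user_id) -> rows kept sorted by clustering column (email_id)
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    void insert(String userId, long emailId, String subject) {
        partitions.computeIfAbsent(userId, k -> new TreeMap<>()).put(emailId, subject);
    }

    // Analogue of: SELECT ... WHERE user_id = ? ORDER BY email_id DESC LIMIT ?
    List<String> lastEmails(String userId, int limit) {
        List<String> out = new ArrayList<>();
        TreeMap<Long, String> partition = partitions.get(userId);
        if (partition == null) {
            return out; // unknown partition: nothing to read
        }
        // Walk the already-sorted partition backwards and stop at the limit.
        for (String subject : partition.descendingMap().values()) {
            if (out.size() == limit) {
                break;
            }
            out.add(subject);
        }
        return out;
    }

    public static void main(String[] args) {
        InboxSketch inboxes = new InboxSketch();
        inboxes.insert("alice", 1L, "first");
        inboxes.insert("alice", 3L, "third");
        inboxes.insert("alice", 2L, "second");
        inboxes.insert("bob", 9L, "other inbox");
        System.out.println(inboxes.lastEmails("alice", 2)); // prints [third, second]
    }
}
```

The query touches a single partition and reads a contiguous run of its sorted rows, which mirrors the two collocation principles from the previous slide.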

Page 9: Cassandra EU - State of CQL

CQL main features

- Collections (set, map and list)
- Secondary indexes
- Convenience functions (timeuuid, type conversions, ...)
- ...

For more details:

- http://cassandra.apache.org/doc/cql3/CQL.html
- http://www.datastax.com/documentation/cql/3.1/webhelp/index.html

Page 10: Cassandra EU - State of CQL

New in Cassandra 2.0

Page 11: Cassandra EU - State of CQL

New in Cassandra 2.0

Lightweight transactions (CQL):

    INSERT INTO test (id, name) VALUES (42, 'Tom') IF NOT EXISTS;
    UPDATE test SET password='newpass' WHERE id=42 IF password='oldpass';

Triggers (CQL):

    CREATE TRIGGER myTrigger ON test USING 'my.trigger.Class';

ALTER DROP (CQL):

    CREATE TABLE test (k int PRIMARY KEY, prop1 int, prop2 text, prop3 float);
    ALTER TABLE test DROP prop3;

Preparing TIMESTAMP, TTL and LIMIT (CQL):

    SELECT * FROM myTable LIMIT ?;
    UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';
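
The conditional writes on this slide (IF NOT EXISTS, IF password='oldpass') behave like compare-and-set operations. The following is a rough single-process analogy only: the class `ConditionalWriteSketch` and its method names are hypothetical, and the sketch says nothing about how Cassandra coordinates such writes across replicas. It only models the observable outcome, namely that the write is applied if and only if the condition held.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-process analogue of conditional writes on table "test".
final class ConditionalWriteSketch {
    // id -> password; stands in for the rows of the table
    private final ConcurrentHashMap<Integer, String> passwords = new ConcurrentHashMap<>();

    // Analogue of INSERT ... IF NOT EXISTS: true only if the row was created.
    boolean insertIfNotExists(int id, String password) {
        return passwords.putIfAbsent(id, password) == null;
    }

    // Analogue of UPDATE ... IF password = expected: applied only when the
    // current value matches the expected one.
    boolean updateIfEquals(int id, String expected, String replacement) {
        return passwords.replace(id, expected, replacement);
    }

    // Current value for the given id, or null if there is no such row.
    String current(int id) {
        return passwords.get(id);
    }

    public static void main(String[] args) {
        ConditionalWriteSketch test = new ConditionalWriteSketch();
        System.out.println(test.insertIfNotExists(42, "oldpass"));          // prints true
        System.out.println(test.insertIfNotExists(42, "other"));            // prints false
        System.out.println(test.updateIfEquals(42, "wrong", "newpass"));    // prints false
        System.out.println(test.updateIfEquals(42, "oldpass", "newpass"));  // prints true
    }
}
```

The boolean result plays the role of the "was it applied?" answer a client would inspect after a conditional statement.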

Page 12: Cassandra EU - State of CQL

New in Cassandra 2.0

Conditional DDL (CQL):

    CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
    DROP KEYSPACE IF EXISTS ks;

Secondary indexes everywhere (almost) (CQL):

    CREATE TABLE timeline (
        event_id uuid,
        created_at timeuuid,
        content blob,
        PRIMARY KEY (event_id, created_at)
    );
    CREATE INDEX ON timeline (created_at);

SELECT aliases (CQL):

    SELECT event_id, dateOf(created_at) AS creation_date FROM timeline;

Page 13: Cassandra EU - State of CQL

Coming in Cassandra 2.0.2

Named bind variables (CQL):

    SELECT * FROM timeline WHERE created_at > :tlow AND created_at <= :thigh AND key = :k;

Prepared IN (CQL):

    SELECT * FROM users WHERE user_id IN ?;

Limited SELECT DISTINCT (CQL):

    CREATE TABLE test (
        event_id int,
        created_at timestamp,
        content blob,
        PRIMARY KEY (event_id, created_at)
    );
    SELECT DISTINCT event_id FROM test;

Page 14: Cassandra EU - State of CQL

The native protocol

A binary transport protocol for CQL

Page 15: Cassandra EU - State of CQL

Native protocol

- Binary transport protocol for CQL
- Query execution, prepared statements, authentication, compression, ...
- Asynchronous (allows multiple concurrent queries per connection)
- Server notifications (only generic cluster events currently)
- Existing drivers for Java, C#, Python, C++, Golang, ...

Example usage of the Java driver (https://github.com/datastax/java-driver), in Java:

    import com.datastax.driver.core.*;

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect("myKeyspace");

    for (Row row : session.execute("SELECT * FROM myTable")) {
        // Do something ...
    }

Page 16: Cassandra EU - State of CQL

New in Cassandra 2.0: native protocol 2

Cursors (Java):

    for (Row row : session.execute("SELECT * FROM myTable")) {
        // Do something ...
    }

Batching prepared statements (Java):

    PreparedStatement ps = session.prepare("INSERT INTO myTable (p1, p2) VALUES (?, ?)");

    BatchStatement bs = new BatchStatement();
    bs.add(ps.bind(0, "v1"));
    bs.add(ps.bind(1, "v2"));
    bs.add(ps.bind(2, "v3"));
    session.execute(bs);

One-shot prepare and execute (Java):

    session.execute("INSERT INTO users (id, photo) VALUES (?, ?)", someId, photoBytes);

SASL for authentication
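
The cursor example on this slide looks identical to the plain query loop, presumably because paging is meant to be transparent to the caller. A minimal sketch of that idea follows, under loud assumptions: `PagingSketch`, `fetchPage`, and the offset-based paging are hypothetical and do not reproduce the driver's API or the protocol's wire format. The point is only that rows are pulled one page at a time while the caller iterates as usual.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical model of cursor-style paging behind a plain for-each loop.
final class PagingSketch implements Iterable<String> {
    private final List<String> serverRows; // stands in for the full result held by the server
    private final int pageSize;
    int pagesFetched = 0;                  // exposed only to make the behaviour observable

    PagingSketch(List<String> serverRows, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.serverRows = serverRows;
        this.pageSize = pageSize;
    }

    // Simulates one round trip returning at most pageSize rows from offset.
    private List<String> fetchPage(int offset) {
        pagesFetched++;
        int end = Math.min(offset + pageSize, serverRows.size());
        return new ArrayList<>(serverRows.subList(offset, end));
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int nextOffset = 0;           // where the next page starts
            private Iterator<String> page = null; // rows of the current page

            @Override
            public boolean hasNext() {
                // Fetch the next page lazily once the current one is exhausted.
                while ((page == null || !page.hasNext()) && nextOffset < serverRows.size()) {
                    List<String> rows = fetchPage(nextOffset);
                    nextOffset += rows.size();
                    page = rows.iterator();
                }
                return page != null && page.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.next();
            }
        };
    }

    public static void main(String[] args) {
        PagingSketch result = new PagingSketch(List.of("r1", "r2", "r3", "r4", "r5"), 2);
        for (String row : result) {
            System.out.println(row);
        }
        System.out.println(result.pagesFetched); // prints 3
    }
}
```

With five rows and a page size of two, the loop sees every row while only three simulated round trips are made, and none of the rows beyond the current page are held by the client.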

Page 17: Cassandra EU - State of CQL

What's next?

Cassandra 2.1 and beyond

Page 18: Cassandra EU - State of CQL

CQL: some ideas

- Storage engine optimizations for CQL
- Secondary index for collections
- Server side functions
- User defined types
- ...

Page 19: Cassandra EU - State of CQL

User defined types

CQL:

    CREATE TYPE address (
        street text,
        zip_code int,
        state text,
        phones set<text>
    );

    CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        addresses map<text, address>
    );

    INSERT INTO users (id, name) VALUES (234-4a-761, 'Sylvain Lebresne');
    UPDATE users SET addresses['work'] = {
        street: '777 Mariners Island Blvd #510',
        zip_code: 94404,
        state: 'CA',
        phones: { '650-389-6000' }
    } WHERE id = 234-4a-761;

Page 20: Cassandra EU - State of CQL

Thank You!

(Questions?)