A deep look at the CQL WHERE clause
© 2015. All Rights Reserved. 2
CQL WHERE clause
The WHERE clause restrictions are dependent on:
• The type of statement: SELECT, UPDATE or DELETE
• The type of column: partition key, clustering or regular column
• Whether a secondary index is used
SELECT statements
Partition key restrictions
CREATE TABLE numberOfRequests (cluster text, date text, time text, count int, PRIMARY KEY ((cluster, date)));
Partition key: (cluster, date)

Cluster      Date          Time     Count
'cluster 1'  '2015-09-21'  '12:00'  251
'cluster 1'  '2015-09-22'  '12:00'  342
'cluster 2'  '2015-09-21'  '12:00'  403
'cluster 2'  '2015-09-22'  '12:00'  451
Cluster      Date          Murmur3 hash
'cluster 1'  '2015-09-21'  -4782752162231423249
'cluster 1'  '2015-09-22'  4936127188075462704
'cluster 2'  '2015-09-21'  5822105674898716412
'cluster 2'  '2015-09-22'  2698159220916609751

[Figure: ring of four nodes (A, B, C, D) with the token ranges:]
-9223372036854775808 to -4611686018427387903
-4611686018427387904 to -1
-1 to 4611686018427387903
4611686018427387904 to 9223372036854775807
Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C
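
The mapping from hash to node can be sketched in a few lines of Python. This is only an illustration: the ring layout (four equal ranges, each identified by its inclusive upper bound) and the names RING, node_for_token and node_for_partition are assumptions made for this example, and the tokens are copied from the hash table above rather than recomputed.

```python
from bisect import bisect_left

# Assumed ring layout: (inclusive upper bound of the token range, node).
# The bounds split the signed 64-bit token space into four equal ranges.
RING = [(-2**62, "A"), (-1, "B"), (2**62 - 1, "C"), (2**63 - 1, "D")]
UPPER_BOUNDS = [bound for bound, _ in RING]

# Murmur3 tokens copied from the slides, not recomputed here.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}


def node_for_token(token):
    """Return the node whose range contains the token."""
    return RING[bisect_left(UPPER_BOUNDS, token)][1]


def node_for_partition(cluster, date):
    """Look up the token of a partition key and return its node."""
    return node_for_token(TOKENS[(cluster, date)])


for key in sorted(TOKENS):
    print(key, "->", node_for_partition(*key))
```

Both components of the partition key are needed to find the token, which is why a query has to restrict all of them or none of them.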
SELECT * FROM numberOfRequests;
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1';
InvalidRequest: code=2200 [Invalid query] message="Partition key parts: date must be restricted as other parts are"
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21';
With TokenAwarePolicy, the driver sends the query directly to a node that holds a replica of the partition.
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date IN ('2015-09-21', '2015-09-22');
With TokenAwarePolicy and asynchronous calls, the IN query can be replaced by one single-partition query per date:
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date = '2015-09-21';
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date = '2015-09-22';
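
The idea of one concurrent request per partition can be sketched without a cluster. The snippet below is a simulation only: PARTITIONS is an in-memory stand-in for the table, and fetch_partition and fetch_dates are hypothetical helpers, not driver calls. With a real driver, each call would be an asynchronous single-partition SELECT.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the cluster: partition key -> rows.
PARTITIONS = {
    ("cluster 2", "2015-09-21"): [{"time": "12:00", "count": 403}],
    ("cluster 2", "2015-09-22"): [{"time": "12:00", "count": 451}],
}


def fetch_partition(cluster, date):
    """Stand-in for one single-partition SELECT sent to a replica."""
    return PARTITIONS.get((cluster, date), [])


def fetch_dates(cluster, dates):
    """Issue one request per date concurrently and merge rows in date order."""
    with ThreadPoolExecutor(max_workers=max(1, len(dates))) as pool:
        futures = [pool.submit(fetch_partition, cluster, d) for d in dates]
        return [row for future in futures for row in future.result()]


rows = fetch_dates("cluster 2", ["2015-09-21", "2015-09-22"])
print([row["count"] for row in rows])
```

Each request targets a single partition, so a token-aware driver can route it straight to a replica instead of relying on one coordinator to fan out the IN.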
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date >= '2015-09-21';
InvalidRequest: code=2200 [Invalid query] message="Only EQ and IN relation are supported on the partition key (unless you use the token() function)"
The node holding a partition is determined by the hash of the partition key, not by the order of the dates, so the partitions matching date >= '2015-09-21' are not stored together:

Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C
• Murmur3Partitioner (default): uniformly distributes data across the cluster based on MurmurHash hash values.
• RandomPartitioner: uniformly distributes data across the cluster based on MD5 hash values.
• ByteOrderedPartitioner: keeps an ordered distribution of data lexically by key bytes.
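
A small sketch of why a hashing partitioner does not preserve key order. It reuses the Murmur3 tokens shown earlier in this deck (copied, not recomputed) and compares the order of the partitions by key with their order by token; the variable names are chosen for this example.

```python
# Murmur3 tokens copied from the slides, not recomputed here.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}

by_key = sorted(TOKENS)                     # order a byte-ordered partitioner would keep
by_token = sorted(TOKENS, key=TOKENS.get)   # order on the Murmur3 ring

print("by key:  ", by_key)
print("by token:", by_token)
print("same order:", by_key == by_token)
```

Because the two orders differ, a range such as date >= '2015-09-21' does not correspond to a contiguous range of tokens.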
SELECT * FROM numberOfRequests WHERE token(cluster, date) > token('cluster 1', '2015-09-21') AND token(cluster, date) < token('cluster 1', '2015-09-23');
Partition key restrictions (SELECT)
• Without a secondary index, either all partition key components must be restricted or none of them
• = restrictions are allowed on any partition key component
• IN restrictions are allowed on any partition key component since 2.2
• Prior to 2.2, IN restrictions were only allowed on the last partition key component
• =, >, >=, <= and < restrictions are allowed with the token function
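
The rules above can be written down as a small checker. This is a simplified sketch, not Cassandra's validation code: check_partition_key and its in_on_any_component flag (True for the 2.2 behaviour, False for earlier versions) are names invented for this example, each column carries a single operator, and the token function and secondary indexes are not modelled.

```python
def check_partition_key(partition_key, restrictions, in_on_any_component=True):
    """Return None if the restrictions are accepted, else a reason string.

    restrictions maps a column name to one operator such as "=", "IN" or ">=".
    """
    restricted = [c for c in partition_key if c in restrictions]
    if not restricted:
        return None  # no restriction at all: scan over every partition
    missing = [c for c in partition_key if c not in restrictions]
    if missing:
        return "partition key parts must all be restricted: " + ", ".join(missing)
    for position, column in enumerate(partition_key):
        op = restrictions[column]
        if op == "=":
            continue
        is_last = position == len(partition_key) - 1
        if op == "IN" and (in_on_any_component or is_last):
            continue
        return f"{op} is not supported on partition key column {column}"
    return None


key = ("cluster", "date")
print(check_partition_key(key, {"cluster": "="}))
print(check_partition_key(key, {"cluster": "=", "date": "IN"}))
print(check_partition_key(key, {"cluster": "=", "date": ">="}))
```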
Clustering column restrictions
CREATE TABLE numberOfRequests ( cluster text, date text, datacenter text, server inet, time text, count int, PRIMARY KEY((cluster, date), datacenter, server, time))
…
Datacenter  Server       Time   Count
Iowa        196.8.7.134  00:00  130
Iowa        196.8.7.134  00:01  125
Iowa        196.8.7.134  00:02  97
Iowa        196.8.7.135  00:00  178
Iowa        196.8.7.135  00:01  201

In the Memtables:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Legend: in the cell [Iowa, 196.8.7.134, 00:00, count] : 130, the cell name is [Iowa, 196.8.7.134, 00:00, count] and its last component, count, is the column name.
In the SSTables:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201
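
The cell layout can be modelled as a sorted list of (cell name, value) pairs. This is a sketch of the idea only: to_cells is a name invented here, cell names are plain tuples, and servers are compared as strings rather than as inet values.

```python
# Rows of one partition, deliberately listed out of order.
ROWS = [
    ("Iowa", "196.8.7.135", "00:01", 201),
    ("Iowa", "196.8.7.134", "00:02", 97),
    ("Iowa", "196.8.7.134", "00:00", 130),
    ("Iowa", "196.8.7.135", "00:00", 178),
    ("Iowa", "196.8.7.134", "00:01", 125),
]


def to_cells(rows, column="count"):
    """Flatten rows into (cell name, value) pairs sorted by cell name.

    The cell name is the clustering column values followed by the column name.
    """
    return sorted(((dc, server, time, column), value)
                  for dc, server, time, value in rows)


for name, value in to_cells(ROWS):
    print(list(name), ":", value)
```

Whatever the insertion order, the cells end up sorted by datacenter, then server, then time, which is what makes prefix and slice lookups cheap.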
SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135' AND time = '00:00';

Cell name prefix searched in the Memtables and in the SSTables: [Iowa, 196.8.7.135, 00:00]
Matching cell:
[Iowa, 196.8.7.135, 00:00, count] : 178
SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135';

Cell name prefix searched in the Memtables and in the SSTables: [Iowa, 196.8.7.135]
Matching cells:
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201
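
A prefix lookup over sorted cells can be sketched with a binary search. The code below is an illustration under simplifying assumptions: CELLS mirrors the cells shown in these slides, cell names are tuples of strings, and cells_with_prefix is a hypothetical helper, not a Cassandra API.

```python
from bisect import bisect_left

# Cells of one partition, already sorted by cell name.
CELLS = [
    (("Iowa", "196.8.7.134", "00:00", "count"), 130),
    (("Iowa", "196.8.7.134", "00:01", "count"), 125),
    (("Iowa", "196.8.7.134", "00:02", "count"), 97),
    (("Iowa", "196.8.7.135", "00:00", "count"), 178),
    (("Iowa", "196.8.7.135", "00:01", "count"), 201),
]
NAMES = [name for name, _ in CELLS]


def cells_with_prefix(prefix):
    """Return the contiguous run of cells whose name starts with prefix."""
    start = bisect_left(NAMES, prefix)  # a prefix sorts before its extensions
    end = start
    while end < len(NAMES) and NAMES[end][:len(prefix)] == prefix:
        end += 1
    return CELLS[start:end]


print(cells_with_prefix(("Iowa", "196.8.7.135", "00:00")))
print(cells_with_prefix(("Iowa", "196.8.7.135")))
```

The lookup only works when the restricted columns form a leading prefix of the cell name, which is the reason a clustering column cannot be restricted when a preceding one is not.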
SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND time = '00:00';

Cell name prefix: [?, ?, 00:00]
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"
… AND datacenter = 'Iowa' AND server IN ('196.8.7.134', '196.8.7.135') AND time = '00:00';

In 2.2, two cell name prefixes are searched in the SSTables: [Iowa, 196.8.7.134, 00:00] and [Iowa, 196.8.7.135, 00:00]
Matching cells:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.135, 00:00, count] : 178
In 2.1, the same query is rejected:
InvalidRequest: code=2200 [Invalid query] message="Clustering column "server" cannot be restricted by an IN relation"
= multi-column restriction:
(clustering1, clustering2, clustering3) = (?, ?, ?)

IN multi-column restriction:
(clustering1, clustering2, clustering3) IN ((?, ?, ?), (?, ?, ?))

Slice multi-column restriction:
(clustering1, clustering2, clustering3) > (?, ?, ?)
(clustering1, clustering2, clustering3) >= (?, ?, ?)
(clustering1, clustering2, clustering3) <= (?, ?, ?)
(clustering1, clustering2, clustering3) < (?, ?, ?)
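
A multi-column slice compares the columns as one tuple, from left to right, which is not the same as combining two single-column comparisons. Python tuples compare the same way, so the difference can be shown directly. The rows and the bound below are taken from the example data in this deck; servers are compared as strings here, which is a simplification.

```python
# (server, time) pairs of the example partition.
ROWS = [
    ("196.8.7.134", "00:00"),
    ("196.8.7.134", "00:01"),
    ("196.8.7.134", "00:02"),
    ("196.8.7.135", "00:00"),
    ("196.8.7.135", "00:01"),
]
LOWER = ("196.8.7.134", "00:01")

# Multi-column slice: (server, time) > ('196.8.7.134', '00:01')
multi_column = [row for row in ROWS if row > LOWER]

# Two independent comparisons: server > '196.8.7.134' AND time > '00:01'
column_by_column = [row for row in ROWS
                    if row[0] > LOWER[0] and row[1] > LOWER[1]]

print(multi_column)
print(column_by_column)
```

The multi-column form selects everything that sorts after the bound, including all rows of the next server, while the column-by-column form selects nothing for this data.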
… AND datacenter = 'Iowa' AND (server, time) IN (('196.8.7.134', '00:00'), ('196.8.7.135', '00:00'));

In 2.1, two cell name prefixes are searched in the SSTables: [Iowa, 196.8.7.134, 00:00] and [Iowa, 196.8.7.135, 00:00]
Matching cells:
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.135, 00:00, count] : 178
… AND datacenter = 'Iowa' AND server = '196.8.7.134' AND time > '00:00';

Slice read from the SSTables: from after [Iowa, 196.8.7.134, 00:00] to the end of [Iowa, 196.8.7.134]
Matching cells:
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
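
The slice "from after one prefix to the end of another" can be sketched as two binary searches over the sorted cell names. This is an illustration only: CELLS mirrors the cells in these slides, and slice_after is a hypothetical helper that handles a single strict lower bound on the column following the prefix.

```python
from bisect import bisect_right

# Cells of one partition, already sorted by cell name.
CELLS = [
    (("Iowa", "196.8.7.134", "00:00", "count"), 130),
    (("Iowa", "196.8.7.134", "00:01", "count"), 125),
    (("Iowa", "196.8.7.134", "00:02", "count"), 97),
    (("Iowa", "196.8.7.135", "00:00", "count"), 178),
    (("Iowa", "196.8.7.135", "00:01", "count"), 201),
]
NAMES = [name for name, _ in CELLS]


def slice_after(prefix, after):
    """Cells under prefix whose next name component is strictly greater than after."""
    n = len(prefix)
    # Truncated names stay sorted, so both searches remain binary searches.
    heads = [name[:n + 1] for name in NAMES]
    parents = [name[:n] for name in NAMES]
    start = bisect_right(heads, prefix + (after,))  # just after [prefix, after]
    end = bisect_right(parents, prefix)             # end of [prefix]
    return CELLS[start:end]


print(slice_after(("Iowa", "196.8.7.134"), "00:00"))
```

The result is one contiguous range, which only holds when the slice is on the last restricted clustering column.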
Clustering column restrictions (SELECT)
• Without a secondary index, a clustering column cannot be restricted if one of the previous ones is not
• = restrictions (single and multi) are allowed on any clustering column
• IN restrictions (single and multi) are allowed on any clustering column since 2.2
• Prior to 2.2, IN restrictions (single and multi) were only allowed on the last clustering column or set of clustering columns
• >, >=, <=, < restrictions (single and multi) are only allowed on the last restricted clustering column or set of clustering columns
• CONTAINS and CONTAINS KEY restrictions are only allowed on indexed collections
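
The first and last-but-one rules above can be sketched as a checker. It is a simplified model and not Cassandra's implementation: check_clustering is a name invented here, each column carries a single operator, only single-column restrictions are handled, and IN is accepted on any column (the 2.2 behaviour). Secondary indexes and CONTAINS are not modelled.

```python
SLICE_OPS = {">", ">=", "<=", "<"}


def check_clustering(clustering_columns, restrictions):
    """Return None if the restrictions are accepted, else a reason string.

    restrictions maps a column name to one operator ("=", "IN" or a slice).
    """
    gap = None          # first unrestricted column seen so far
    slice_seen = None   # column already restricted by a slice
    for column in clustering_columns:
        op = restrictions.get(column)
        if op is None:
            if gap is None:
                gap = column
            continue
        if gap is not None:
            return (f'column "{column}" cannot be restricted as preceding '
                    f'column "{gap}" is not restricted')
        if slice_seen is not None:
            return (f'column "{column}" cannot be restricted as preceding '
                    f'column "{slice_seen}" is restricted by a slice')
        if op in SLICE_OPS:
            slice_seen = column
    return None


columns = ("datacenter", "server", "time")
print(check_clustering(columns, {"time": "="}))
print(check_clustering(columns, {"datacenter": "=", "server": "=", "time": ">"}))
```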
Secondary index queries
CREATE TABLE numberOfRequests ( cluster text, date text, datacenter text, server inet, time text, count int, PRIMARY KEY((cluster, date), datacenter, server, time));
CREATE INDEX ON numberOfRequests (time);…
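Conceptually, the index behaves like a table local to each node (illustrative pseudo-CQL, not valid syntax):
CREATE LOCAL TABLE numberOfRequests_time_idx (time text, cluster text, date text, datacenter text, server inet, PRIMARY KEY (time, cluster, date, datacenter, server)); …
In the index primary key, cluster and date are the partition key of the indexed table, and datacenter and server are its remaining clustering columns.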
SELECT * FROM numberOfRequests WHERE time = '12:00';
• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00';
• Results (primary keys)
• [For each] table: SELECT with full PK
• Add to rows
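
The two-step lookup can be mimicked with dictionaries. Everything below is a sketch for illustration: the rows in TABLE are made up, and build_index and select_by_time are hypothetical helpers standing in for the index table and the read path on a single node.

```python
# Made-up rows: (cluster, date, datacenter, server, time) -> count
TABLE = {
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.134", "12:00"): 251,
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.135", "12:00"): 612,
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.134", "12:01"): 130,
}


def build_index(table):
    """Index on time: indexed value -> primary keys of the matching rows."""
    index = {}
    for primary_key in sorted(table):
        index.setdefault(primary_key[4], []).append(primary_key)
    return index


def select_by_time(table, index, time):
    rows = []
    for primary_key in index.get(time, []):            # idx: results are primary keys
        rows.append((primary_key, table[primary_key]))  # table: SELECT with full PK
    return rows


INDEX = build_index(TABLE)
for primary_key, count in select_by_time(TABLE, INDEX, "12:00"):
    print(primary_key, count)
```

Every hit in the index costs one more read in the table, so the cost of the query grows with the number of rows sharing the indexed value.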
SELECT * FROM numberOfRequests WHERE time >= '12:00';
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"
Direct queries on secondary index support only =, CONTAINS or CONTAINS KEY restrictions.
SELECT * FROM numberOfRequests WHERE time = '12:00' AND count >= 500 ALLOW FILTERING;

• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00';
• Results (primary keys)
• [For each] table: SELECT with full PK
• [if count >= 500] Add to rows
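
Filtering happens after the rows have been read, so some reads are wasted. The sketch below makes that visible by returning both the selected rows and the number of rows read. The data is made up and select_filtered is a hypothetical helper, not a Cassandra API.

```python
# Made-up rows: (cluster, date, datacenter, server, time) -> count
TABLE = {
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.134", "12:00"): 251,
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.135", "12:00"): 612,
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.134", "12:01"): 130,
}
# Index on time: indexed value -> primary keys of the matching rows.
INDEX = {}
for primary_key in sorted(TABLE):
    INDEX.setdefault(primary_key[4], []).append(primary_key)


def select_filtered(time, min_count):
    """Return (rows kept, number of rows read) for time = ? AND count >= ?."""
    rows, rows_read = [], 0
    for primary_key in INDEX.get(time, []):   # idx lookup
        count = TABLE[primary_key]            # table read with the full PK
        rows_read += 1
        if count >= min_count:                # filter applied after the read
            rows.append((primary_key, count))
    return rows, rows_read


rows, rows_read = select_filtered("12:00", 500)
print(len(rows), "row(s) returned,", rows_read, "row(s) read")
```

The gap between rows read and rows returned is the unpredictable cost that ALLOW FILTERING asks the user to accept explicitly.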
SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21' AND time = '12:00';
• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00' AND cluster = 'cluster 1' AND date = '2015-09-21';
• Results (primary keys)
• [For each] table: SELECT with full PK
• Add to rows
UPDATE/DELETE statements
UPDATE statements
In UPDATE statements, all the primary key columns must be restricted, and the only allowed restrictions are:
• Prior to 3.0:
  • Single-column = restriction on any partition key or clustering column
  • Single-column IN restriction on the last partition key column
• In 3.0:
  • = and IN single-column restrictions on any partition key column
  • = and IN single-column or multi-column restrictions on any clustering column
DELETE statements
Before 3.0, in DELETE statements, all the primary key columns had to be restricted, and the only allowed restrictions were:
• Single column = restriction on any partition key or clustering column
• Single column IN restriction on the last partition key column
Since 3.0:
• The partition key columns must be restricted by = or IN restrictions
• A clustering column may be left unrestricted if none of the following clustering columns is restricted
• Clustering columns can be restricted by:
  • Single-column or multi-column = restriction
  • Single-column or multi-column IN restriction
  • Single-column or multi-column >, >=, <=, < restriction
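
The DELETE rules since 3.0 can be sketched as a checker. This is a simplified model, not Cassandra's validation code: check_delete is a name invented here, each column carries a single operator, and multi-column restrictions are not modelled.

```python
def check_delete(partition_key, clustering_columns, restrictions):
    """Return None if a DELETE with these restrictions is accepted (3.0 rules).

    restrictions maps a column name to one operator ("=", "IN" or a slice).
    """
    for column in partition_key:
        if restrictions.get(column) not in ("=", "IN"):
            return f"partition key column {column} must be restricted by = or IN"
    gap = None  # first clustering column left unrestricted
    for column in clustering_columns:
        if column not in restrictions:
            if gap is None:
                gap = column
        elif gap is not None:
            return f"{column} cannot be restricted as {gap} is not restricted"
    return None


pk = ("cluster", "date")
ck = ("datacenter", "server", "time")
whole_partition = {"cluster": "=", "date": "="}
print(check_delete(pk, ck, whole_partition))
print(check_delete(pk, ck, {**whole_partition, "datacenter": "=", "server": ">"}))
print(check_delete(pk, ck, {**whole_partition, "time": "="}))
```

Leaving the trailing clustering columns unrestricted deletes every row under the restricted prefix, down to the whole partition when no clustering column is restricted.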
Design your tables for the queries you want to perform.
Thank you