A deep look at the CQL WHERE clause

Upload: benjamin-lerer

Post on 13-Apr-2017


© 2015. All Rights Reserved. 2

CQL WHERE clause

The WHERE clause restrictions depend on:

• The type of statement: SELECT, UPDATE or DELETE

• The type of column: partition key, clustering or regular column

• Whether a secondary index is used


SELECT statements

Partition key restrictions

Cluster      Date          Time     Count
'cluster 1'  '2015-09-21'  '12:00'  251
'cluster 1'  '2015-09-22'  '12:00'  342
'cluster 2'  '2015-09-21'  '12:00'  403
'cluster 2'  '2015-09-22'  '12:00'  451

CREATE TABLE numberOfRequests (cluster text, date text, time text, count int, PRIMARY KEY ((cluster, date)));

Partition key: (cluster, date)

Partition key restrictions

Cluster      Date          Murmur3 hash
'cluster 1'  '2015-09-21'  -4782752162231423249
'cluster 1'  '2015-09-22'  4936127188075462704
'cluster 2'  '2015-09-21'  5822105674898716412
'cluster 2'  '2015-09-22'  2698159220916609751

[Diagram: ring of four nodes A, B, C and D, each owning one of these token ranges]

-9223372036854775808 to -4611686018427387903
-4611686018427387904 to -1
-1 to 4611686018427387903
4611686018427387904 to 9223372036854775807

Partition key restrictions

Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C
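The placement shown in the table can be reproduced with a small lookup. This is only a sketch: the range boundaries are rounded to exact quarters of the Murmur3 token space, and the assignment of ranges to nodes A, B, C and D is inferred from the hashes and the node table on the slides, not stated by them.

```python
from bisect import bisect_right

# Lower bound of each node's token range (assumption: exact quarters of the
# Murmur3 token space, with owners inferred from the slide tables).
RANGE_STARTS = [-2**63, -2**62, 0, 2**62]
RANGE_OWNERS = ["A", "B", "C", "D"]


def owner_of(token: int) -> str:
    """Return the node whose token range contains the given token."""
    return RANGE_OWNERS[bisect_right(RANGE_STARTS, token) - 1]


# Murmur3 hashes copied from the slide.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}

placement = {key: owner_of(token) for key, token in TOKENS.items()}
for key, node in placement.items():
    print(key, "->", node)
```

Under these assumptions the lookup yields A, D, D and C for the four partitions, matching the table.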

Partition key restrictions

SELECT * FROM numberOfRequests;

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1';

InvalidRequest: code=2200 [Invalid query] message="Partition key parts: date must be restricted as other parts are"

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21';

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21';

…with TokenAwarePolicy

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date IN ('2015-09-21', '2015-09-22');

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date = '2015-09-21';

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 2' AND date = '2015-09-22';

…with TokenAwarePolicy and asynchronous calls
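Replacing one IN query with several single-partition queries issued concurrently could look roughly like the sketch below. The function run_query is a hypothetical stand-in that returns fake rows; a real application would call its driver's asynchronous API instead, and nothing here reflects a specific driver's interface.

```python
from concurrent.futures import ThreadPoolExecutor

QUERY = "SELECT * FROM numberOfRequests WHERE cluster = %s AND date = %s"


def run_query(query, params):
    """Hypothetical stand-in for a driver call; returns fake rows for the demo."""
    cluster, date = params
    return [{"cluster": cluster, "date": date}]


def fetch_partitions(cluster, dates, execute=run_query):
    """Issue one single-partition query per date and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(dates))) as pool:
        futures = [pool.submit(execute, QUERY, (cluster, d)) for d in dates]
        # Results are collected in submission order.
        return [row for future in futures for row in future.result()]


rows = fetch_partitions("cluster 2", ["2015-09-21", "2015-09-22"])
print(rows)
```

Each query restricts the full partition key with =, so a token-aware driver can route it to a node that owns the partition.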

Partition key restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date >= '2015-09-21';

InvalidRequest: code=2200 [Invalid query] message="Only EQ and IN relation are supported on the partition key (unless you use the token() function)"

Partition key restrictions

Cluster      Date          Node
'cluster 1'  '2015-09-21'  A
'cluster 1'  '2015-09-22'  D
'cluster 2'  '2015-09-21'  D
'cluster 2'  '2015-09-22'  C

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date >= '2015-09-21';

Partition key restrictions

• Murmur3Partitioner (default): uniformly distributes data across the cluster based on MurmurHash hash values.

• RandomPartitioner: uniformly distributes data across the cluster based on MD5 hash values.

• ByteOrderedPartitioner: keeps an ordered distribution of data lexically by key bytes

Partition key restrictions

SELECT * FROM numberOfRequests WHERE token(cluster, date) > token('cluster 1', '2015-09-21') AND token(cluster, date) < token('cluster 1', '2015-09-23');
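A token() range selects partitions by the order of their hashes, not by the order of their key values. The sketch below illustrates this with the four Murmur3 hashes listed earlier. The upper bound uses the known hash of ('cluster 1', '2015-09-22') rather than '2015-09-23', whose hash is not given in the slides.

```python
# Murmur3 hashes copied from the slide.
TOKENS = {
    ("cluster 1", "2015-09-21"): -4782752162231423249,
    ("cluster 1", "2015-09-22"): 4936127188075462704,
    ("cluster 2", "2015-09-21"): 5822105674898716412,
    ("cluster 2", "2015-09-22"): 2698159220916609751,
}


def token_range(lower: int, upper: int):
    """Partition keys whose token lies strictly between lower and upper."""
    matches = (key for key, tok in TOKENS.items() if lower < tok < upper)
    return sorted(matches, key=TOKENS.get)


# Partitions in token order, which is the order a token() scan follows.
by_token = sorted(TOKENS, key=TOKENS.get)

start = TOKENS[("cluster 1", "2015-09-21")]
end = TOKENS[("cluster 1", "2015-09-22")]
between = token_range(start, end)
print(by_token)
print(between)
```

The only partition between the two 'cluster 1' tokens belongs to 'cluster 2', which is why token ranges are suited to scanning the ring rather than to date ranges.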


Partition key restrictions (SELECT)

• Without secondary index, either all partition key components must be restricted or none of them

• = restrictions are allowed on any partition key component

• IN restrictions are allowed on any partition key component since 2.2

• Prior to 2.2, IN restrictions were only allowed on the last partition key component

• =, >, >=, <= and < restrictions are allowed with the token function
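The rules above for = and IN can be written down as a small checker. This is a simplified model for illustration, not Cassandra's validation code: token() restrictions and secondary indexes are left out, and the function name and error texts are made up.

```python
def check_partition_key(pk_columns, restrictions, version=(2, 2)):
    """Return None if the restrictions are accepted, else an error string.

    restrictions maps a partition key column name to its operator.
    token() queries and secondary indexes are outside this sketch.
    """
    if not any(column in restrictions for column in pk_columns):
        return None  # nothing restricted: the query scans every partition
    missing = [c for c in pk_columns if c not in restrictions]
    if missing:
        return "partition key parts must all be restricted: " + ", ".join(missing)
    last = len(pk_columns) - 1
    for position, column in enumerate(pk_columns):
        op = restrictions[column]
        if op == "=":
            continue
        if op != "IN":
            return f"only = and IN are supported on partition key column {column}"
        if version < (2, 2) and position != last:
            return f"IN on {column} requires 2.2 or later"
    return None


PK = ["cluster", "date"]
print(check_partition_key(PK, {"cluster": "="}))
print(check_partition_key(PK, {"cluster": "=", "date": "IN"}, version=(2, 1)))
```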

Clustering column restrictions

CREATE TABLE numberOfRequests (cluster text, date text, datacenter text, server inet, time text, count int, PRIMARY KEY ((cluster, date), datacenter, server, time));

Clustering column restrictions

Datacenter  Server       Time   Count
Iowa        196.8.7.134  00:00  130
Iowa        196.8.7.134  00:01  125
Iowa        196.8.7.134  00:02  97
Iowa        196.8.7.135  00:00  178
Iowa        196.8.7.135  00:01  201

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Cell: [Iowa, 196.8.7.134, 00:00, count] : 130
Cell name: [Iowa, 196.8.7.134, 00:00, count]
Column name: count

Clustering column restrictions

In the SSTables:

[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135' AND time = '00:00';

[Iowa, 196.8.7.135, 00:00]

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135' AND time = '00:00';

[Iowa, 196.8.7.135, 00:00]

In the SSTables:

…
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135';

[Iowa, 196.8.7.135]

In the Memtables:

[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND datacenter = 'Iowa' AND server = '196.8.7.135';

[Iowa, 196.8.7.135]

In the SSTables:

…
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

SELECT * FROM numberOfRequests WHERE cluster = 'cluster1' AND date = '2015-09-21' AND time = '00:00';

[?,?,00:00]

InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"
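The prefix lookups and the rejected query above can be modelled with a sorted list and a binary search. This is a toy model of the storage order, not Cassandra's storage engine. It also compares server addresses as plain strings, which is a simplifying assumption, since the real inet type has its own ordering.

```python
from bisect import bisect_left

# Cells from the slides, sorted by cell name (clustering values + column name).
CELLS = [
    (("Iowa", "196.8.7.134", "00:00", "count"), 130),
    (("Iowa", "196.8.7.134", "00:01", "count"), 125),
    (("Iowa", "196.8.7.134", "00:02", "count"), 97),
    (("Iowa", "196.8.7.135", "00:00", "count"), 178),
    (("Iowa", "196.8.7.135", "00:01", "count"), 201),
]
NAMES = [name for name, _ in CELLS]


def prefix_slice(prefix):
    """Return the contiguous run of cells whose name starts with prefix."""
    prefix = tuple(prefix)
    start = bisect_left(NAMES, prefix)  # first name >= prefix
    end = start
    while end < len(NAMES) and NAMES[end][:len(prefix)] == prefix:
        end += 1
    return CELLS[start:end]


full_key = prefix_slice(("Iowa", "196.8.7.135", "00:00"))
one_server = prefix_slice(("Iowa", "196.8.7.135"))

# Restricting only time gives no prefix: the matches are scattered.
midnight_positions = [i for i, name in enumerate(NAMES) if name[2] == "00:00"]
print(full_key, one_server, midnight_positions)
```

A prefix maps to one contiguous run, while the cells with time 00:00 sit at positions that are not adjacent, so no single seek can serve that query.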

Clustering column restrictions

… AND datacenter = 'Iowa' AND server IN ('196.8.7.134', '196.8.7.135') AND time = '00:00';

In 2.2:

[Iowa, 196.8.7.134, 00:00] [Iowa, 196.8.7.135, 00:00]

In the SSTables:

…
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

… AND datacenter = 'Iowa' AND server IN ('196.8.7.134', '196.8.7.135') AND time = '00:00';

In 2.1:

InvalidRequest: code=2200 [Invalid query] message="Clustering column "server" cannot be restricted by an IN relation"

Clustering column restrictions

= multi-column restriction:

(clustering1, clustering2, clustering3) = (?, ?, ?)

IN multi-column restriction:

(clustering1, clustering2, clustering3) IN ((?, ?, ?), (?, ?, ?))

Slice multi-column restriction:

(clustering1, clustering2, clustering3) > (?, ?, ?)
(clustering1, clustering2, clustering3) >= (?, ?, ?)
(clustering1, clustering2, clustering3) <= (?, ?, ?)
(clustering1, clustering2, clustering3) < (?, ?, ?)
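A multi-column slice compares the clustering values together, as one tuple, in clustering order. Python tuples compare the same way, so they make a convenient illustration. The rows are the (server, time) pairs from the earlier slides, compared here as plain strings for simplicity.

```python
ROWS = [
    ("196.8.7.134", "00:00"),
    ("196.8.7.134", "00:01"),
    ("196.8.7.134", "00:02"),
    ("196.8.7.135", "00:00"),
    ("196.8.7.135", "00:01"),
]

bound = ("196.8.7.134", "00:00")

# (server, time) > (?, ?): the columns are compared together, as one tuple.
multi = [row for row in ROWS if row > bound]

# Column-by-column comparison, shown only for contrast: it is a different
# predicate and selects different rows.
per_column = [row for row in ROWS if row[0] > bound[0] and row[1] > bound[1]]

print(multi)
print(per_column)
```

The tuple form selects everything that sorts after the bound, which is again one contiguous run of cells.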

Clustering column restrictions

… AND datacenter = 'Iowa' AND (server, time) IN (('196.8.7.134', '00:00'), ('196.8.7.135', '00:00'));

In 2.1:

[Iowa, 196.8.7.134, 00:00] [Iowa, 196.8.7.135, 00:00]

In the SSTables:

…
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201

Clustering column restrictions

… AND datacenter = 'Iowa' AND server = '196.8.7.134' AND time > '00:00';

from after [Iowa, 196.8.7.134, 00:00] to end of [Iowa, 196.8.7.134]

In the SSTables:

…
[Iowa, 196.8.7.134, 00:00, count] : 130
[Iowa, 196.8.7.134, 00:01, count] : 125
[Iowa, 196.8.7.134, 00:02, count] : 97
[Iowa, 196.8.7.135, 00:00, count] : 178
[Iowa, 196.8.7.135, 00:01, count] : 201
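The bounds "from after … to end of …" can be sketched as two positions in the sorted names. The helper slice_from and its inclusive flag are illustrative names: the flag switches between > and >=. As before, this is a toy model that compares values as strings.

```python
from bisect import bisect_left, bisect_right

# Clustering values of the rows, in storage order.
NAMES = [
    ("Iowa", "196.8.7.134", "00:00"),
    ("Iowa", "196.8.7.134", "00:01"),
    ("Iowa", "196.8.7.134", "00:02"),
    ("Iowa", "196.8.7.135", "00:00"),
    ("Iowa", "196.8.7.135", "00:01"),
]


def slice_from(prefix, lower, inclusive=False):
    """Rows under prefix whose next clustering value is > (or >=) lower."""
    find = bisect_left if inclusive else bisect_right
    start = find(NAMES, prefix + (lower,))
    end = start
    # The slice stops where the prefix stops matching.
    while end < len(NAMES) and NAMES[end][:len(prefix)] == prefix:
        end += 1
    return NAMES[start:end]


after_midnight = slice_from(("Iowa", "196.8.7.134"), "00:00")
from_midnight = slice_from(("Iowa", "196.8.7.134"), "00:00", inclusive=True)
print(after_midnight)
print(from_midnight)
```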


Clustering column restrictions (SELECT)

• Without secondary index, a clustering column cannot be restricted if one of the previous ones was not

• = restrictions (single and multi) are allowed on any clustering column

• IN restrictions (single and multi) are allowed on any clustering column since 2.2

• Prior to 2.2, IN restrictions (single and multi) were only allowed on the last clustering column or set of clustering columns

• >, >=, <=, < restrictions (single and multi) are only allowed on the last restricted clustering column or set of clustering columns

• CONTAINS and CONTAINS KEY restrictions are only allowed on indexed collections
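The single-column rules for 2.2 and later, without a secondary index, can be condensed into a checker like the one below. It is a simplified model with made-up error texts. Multi-column restrictions, CONTAINS and the pre-2.2 IN rule are not modelled.

```python
SLICE_OPS = {">", ">=", "<", "<="}


def check_clustering(clustering_columns, restrictions):
    """Return None if accepted, else an error string.

    restrictions maps a clustering column to "=", "IN" or a slice operator.
    """
    first_gap = None      # first unrestricted clustering column
    slice_column = None   # column restricted by a slice, if any
    for column in clustering_columns:
        op = restrictions.get(column)
        if op is None:
            if first_gap is None:
                first_gap = column
            continue
        if first_gap is not None:
            return (f'column "{column}" cannot be restricted as preceding '
                    f'column "{first_gap}" is not restricted')
        if slice_column is not None:
            return (f'column "{column}" cannot be restricted after the '
                    f'slice on "{slice_column}"')
        if op in SLICE_OPS:
            slice_column = column
        elif op not in ("=", "IN"):
            return f"unsupported operator {op!r}"
    return None


CK = ["datacenter", "server", "time"]
print(check_clustering(CK, {"time": "="}))
print(check_clustering(CK, {"datacenter": "=", "server": "=", "time": ">"}))
```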


Secondary index queries

CREATE TABLE numberOfRequests ( cluster text, date text, datacenter text, server inet, time text, count int, PRIMARY KEY((cluster, date), datacenter, server, time));

CREATE INDEX ON numberOfRequests (time);…

Secondary index queries

CREATE INDEX ON numberOfRequests (time);

CREATE LOCAL TABLE numberOfRequests_time_idx (time text, cluster text, date text, datacenter text, server inet, PRIMARY KEY (time, cluster, date, datacenter, server));…

Table partition key: cluster, date

Table remaining clustering columns: datacenter, server

Secondary index queries

[Diagram: each node (A, B, C, D) holds its own local index: IDX-A, IDX-B, IDX-C, IDX-D]

SELECT * FROM numberOfRequests WHERE time = '12:00';

Secondary index queries

SELECT * FROM numberOfRequests WHERE time = '12:00';

• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00';

• Results (primary keys)

• [For each] table: SELECT with full PK;

• Add to rows

Secondary index queries

SELECT * FROM numberOfRequests WHERE time >= '12:00';

InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "time" cannot be restricted as preceding column "datacenter" is not restricted"

Direct queries on secondary index support only =, CONTAINS or CONTAINS KEY restrictions.

Secondary index queries

SELECT * FROM numberOfRequests WHERE time = '12:00' AND count >= 500 ALLOW FILTERING;

• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00';

• Results (primary keys)

• [For each] table: SELECT with full PK;

• Add to rows [if count >= 500]
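The two-step flow above (index lookup, then a read by full primary key, then the optional filter) can be mimicked with dictionaries. The rows and counts are invented for the example, and the function names are illustrative. The real index is a node-local structure rather than a Python dict.

```python
from collections import defaultdict

# Rows keyed by full primary key: (cluster, date, datacenter, server, time).
# The values are made up for this sketch.
TABLE = {
    ("cluster 1", "2015-09-21", "Iowa", "196.8.7.134", "12:00"): {"count": 251},
    ("cluster 1", "2015-09-22", "Iowa", "196.8.7.134", "12:00"): {"count": 612},
    ("cluster 2", "2015-09-21", "Iowa", "196.8.7.135", "11:00"): {"count": 700},
}


def build_time_index(table):
    """Map each time value to the primary keys of the rows holding it."""
    index = defaultdict(list)
    for pk in table:
        index[pk[4]].append(pk)
    return index


def select_by_time(table, index, time, min_count=None):
    """Index lookup, then one read per primary key, then optional filtering."""
    rows = []
    for pk in index.get(time, []):          # step 1: query the index
        row = table[pk]                     # step 2: read with the full PK
        if min_count is None or row["count"] >= min_count:
            rows.append((pk, row))          # step 3: keep rows that pass
    return rows


INDEX = build_time_index(TABLE)
print(select_by_time(TABLE, INDEX, "12:00"))
print(select_by_time(TABLE, INDEX, "12:00", min_count=500))
```

The filter runs only after each row has been read, so rows that fail the count condition are still fetched. That read cost is what ALLOW FILTERING makes the caller accept.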

Secondary index queries

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21' AND time = '12:00';

Secondary index queries

SELECT * FROM numberOfRequests WHERE cluster = 'cluster 1' AND date = '2015-09-21' AND time = '12:00';

• idx: SELECT * FROM numberOfRequests_time_idx WHERE time = '12:00' AND cluster = 'cluster 1' AND date = '2015-09-21';

• Results (primary keys)

• [For each] table: SELECT with full PK

• Add to rows


UPDATE/DELETE statements

UPDATE statements

In UPDATE statements, all the primary key columns must be restricted, and the only allowed restrictions are:

• Prior to 3.0:

  • Single column = restriction on any partition key or clustering column

  • Single column IN restriction on the last partition key column

• In 3.0:

  • = and IN single column restrictions on any partition key column

  • = and IN single or multi column restrictions on any clustering column

DELETE statements

Before 3.0, all the primary key columns had to be restricted in DELETE statements, and the only allowed restrictions were:

• Single column = restriction on any partition key or clustering column

• Single column IN restriction on the last partition key column

DELETE statements

Since 3.0:

• The partition key columns must be restricted by = or IN restrictions

• A clustering column may be left unrestricted only if none of the clustering columns that follow it is restricted

• Clustering columns can be restricted by:

  • Single or multi column = restriction

  • Single or multi column IN restriction

  • Single or multi column >, >=, <=, < restriction
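The DELETE rules since 3.0 can be summarised as a checker along these lines. It is a simplified model with invented error texts: only single-column restrictions are handled, and whether one slice may be followed by further restrictions is not modelled.

```python
CLUSTERING_OPS = {"=", "IN", ">", ">=", "<", "<="}


def check_delete(partition_key, clustering, restrictions):
    """Return None if the DELETE restrictions are accepted, else an error string."""
    for column in partition_key:
        if restrictions.get(column) not in ("=", "IN"):
            return f"partition key column {column} must be restricted by = or IN"
    first_gap = None  # first clustering column left unrestricted
    for column in clustering:
        op = restrictions.get(column)
        if op is None:
            if first_gap is None:
                first_gap = column
        elif first_gap is not None:
            return f"{column} cannot be restricted because {first_gap} is not"
        elif op not in CLUSTERING_OPS:
            return f"unsupported operator {op!r} on {column}"
    return None


PK = ["cluster", "date"]
CK = ["datacenter", "server", "time"]

# Whole partition, then a range of rows inside it.
print(check_delete(PK, CK, {"cluster": "=", "date": "="}))
print(check_delete(PK, CK, {"cluster": "=", "date": "IN",
                            "datacenter": "=", "server": ">"}))
```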


Design your tables for the queries you want to perform.


Thank you