CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C* Summit 2016


Page 1: CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C* Summit 2016

CASSANDRA SUMMIT 2016

CQL PERFORMANCE WITH APACHE CASSANDRA 3.0

Aaron Morton

@aaronmorton

CEO

Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

Page 2:
Page 3:

How We Got Here

Storage Engine 3.0

Read Path

Page 4:

How We Got Here

Way back in 2011…

Page 5:

2011

Blog: Cassandra Query Plans

http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html

Page 6:

2012

Talk: Technical Deep Dive - Query Performance

https://www.youtube.com/watch?v=gomOKhMV0zc

Page 7:

2012

Explain Read & Write performance in 45 minutes.

Page 8:

Skip Forward to 2016

Blog: Introduction To The Apache Cassandra 3.x Storage Engine

http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html

Page 9:

Skip Forward to 2016

“Why don’t I do another talk about Cassandra performance.”

Page 10:

Skip Forward to 2016

It was a busy 4 years…

Page 11:

Skip Forward to 2016

CQL 3, Collection Types, UDTs, UDFs, UDAs, Materialised Views, Triggers, SASI,…

Page 12:

Skip Forward to 2016

Explain Read & Write performance in 45 minutes.

Page 13:

So Let's Avoid

CQL 3, Collection Types, UDTs, UDFs, UDAs, Materialised Views, Triggers, SASI,…

Page 14:

How We Got Here

Storage Engine 3.0

Read Path

Page 15:

High Level Storage Engine 3.0

Page 16:

Storage Engine 3.0 Files

Data.db Index.db Filter.db

Page 17:

Storage Engine 3.0 Files

CompressionInfo.db Statistics.db Digest.crc32

CRC.db Summary.db TOC.txt

Page 18:

CQL Recap

    create table my_table (
        partition_1 text,
        cluster_1 text,
        foo text,
        bar text,
        baz text,
        PRIMARY KEY (partition_1, cluster_1)
    );

Page 19:

CQL Recap

WARNING: FAKE DATA AHEAD

Page 20:

CQL With Thrift Pre 3.0

    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)

Page 21:

CQL Pre 3.0

Clustering Keys Repeated

Column Names Repeated

Timestamps Repeated

Fixed Width Encoding

No Knowledge Of Row Contents

Page 22:

Storage Engine 3.0 Improvements

Delta Encoding

Variable Int Encoding

Clustering Written Once

Aggregated Metadata

Cell Presence

Page 23:

SerializationHeader

For each SSTable*.

Stored in each SSTable.

Held in memory.

Page 24:

SerializationHeader

    public class SerializationHeader {
        private final AbstractType<?> keyType;
        private final List<AbstractType<?>> clusteringTypes;

        private final PartitionColumns columns;
        private final EncodingStats stats;
        …
    }

Page 25:

EncodingStats

Collected on the fly by the Memtable.

Page 26:

EncodingStats

    public class EncodingStats {
        public final long minTimestamp;
        public final int minLocalDeletionTime;
        public final int minTTL;
        …
    }

Page 27:

SerializationHeader

    public class SerializationHeader {
        public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException
        {
            out.writeUnsignedVInt(timestamp - stats.minTimestamp);
        }
        …
    }

Page 28:

VIntCoding

    public class VIntCoding {
        public static void writeUnsignedVInt(long value, DataOutput output) throws IOException {
            int size = VIntCoding.computeUnsignedVIntSize(value);
            if (size == 1) {
                output.write((int)value);
                return;
            }

            output.write(VIntCoding.encodeVInt(value, size), 0, size);
        }
    }
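The two slides above subtract the SSTable-wide minimum timestamp and then write the remainder as an unsigned variable-length integer. A minimal sketch of why that saves space is below. It assumes 7 payload bits per encoded byte with a 9-byte cap, which is an illustrative assumption and not a copy of Cassandra's VIntCoding; the class name, method names and the sample timestamp are all hypothetical.

```java
final class DeltaVIntSketch {
    /**
     * Bytes needed for an unsigned value under the assumed scheme:
     * each extra byte adds 7 payload bits, capped at 9 bytes in total.
     */
    static int unsignedVIntSize(long value) {
        int size = 1;
        // Unsigned comparison so values with the top bit set count as large.
        while (size < 9 && Long.compareUnsigned(value, 1L << (7 * size)) >= 0) {
            size++;
        }
        return size;
    }

    /** Size of a timestamp stored as a delta from the SSTable-wide minimum. */
    static int deltaTimestampSize(long timestamp, long minTimestamp) {
        return unsignedVIntSize(timestamp - minTimestamp);
    }

    public static void main(String[] args) {
        long minTimestamp = 1_357_000_000_739_000L; // made-up microsecond timestamp
        long timestamp = minTimestamp + 5_000L;     // written 5 ms after the minimum
        System.out.println("fixed width bytes: " + Long.BYTES);
        System.out.println("raw vint bytes:    " + unsignedVIntSize(timestamp));
        System.out.println("delta vint bytes:  " + deltaTimestampSize(timestamp, minTimestamp));
    }
}
```

The saving comes from the combination: a variable-length integer alone does little for a full microsecond timestamp, while the delta from the minimum is usually small enough to fit in one or two bytes.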

Page 29:

Storage Engine 3.0 Improvements

Delta Encoding

Variable Int Encoding

Clustering Written Once

Aggregated Metadata

Cell Presence

Page 30:

CQL With Thrift Pre 3.0

    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)

Page 31:

Storage Engine 3.0 Data.db

Page 32:

Storage Engine 3.0 Partition Header

Page 33:

Storage Engine 3.0 Row

Page 34:

Storage Engine 3.0 Clustering Block

Page 35:

Storage Engine 3.0 Improvements

Delta Encoding

Variable Int Encoding

Clustering Written Once

Aggregated Cell Metadata

Cell Presence

Page 36:

CQL With Thrift Pre 3.0

    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)

Page 37:

Aggregated Cell Metadata

Only store Cell Timestamp, TTL, and Local Deletion Time if different to the Row.

Page 38:

Aggregated Cell Metadata

Simple Cell Component                               Byte Size
Flags                                               1
Optional Cell Timestamp (delta) varint              1…n
Optional Cell Local Deletion Time (delta) varint    1…n
Optional Cell TTL (delta) varint                    1…n
Optional Cell Value                                 See Below

Fixed Width Cell Value                              Byte Size
Value                                               1…n

Variable Width Cell Value                           Byte Size
Value Length varint                                 1…n
Value                                               1…n

Apache Cassandra 3.0 Storage Engine
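The "only store if different to the Row" rule can be sketched as a flags byte that records which cell fields are inherited from the row. This is a minimal illustration, not the on-disk format: the flag bit values, the Liveness holder and the method names are all hypothetical, since the slides do not give the real flag layout.

```java
final class CellMetadataSketch {
    // Hypothetical flag bits; the real flag values are not given in the slides.
    static final int USE_ROW_TIMESTAMP = 0x01;
    static final int USE_ROW_TTL = 0x02;
    static final int USE_ROW_DELETION_TIME = 0x04;

    /** Timestamp, TTL and local deletion time for either a row or a cell. */
    static final class Liveness {
        final long timestamp;
        final int ttl;
        final int localDeletionTime;

        Liveness(long timestamp, int ttl, int localDeletionTime) {
            this.timestamp = timestamp;
            this.ttl = ttl;
            this.localDeletionTime = localDeletionTime;
        }
    }

    /** Sets one flag for every cell field that matches the row and can be omitted. */
    static int flagsFor(Liveness row, Liveness cell) {
        int flags = 0;
        if (cell.timestamp == row.timestamp) flags |= USE_ROW_TIMESTAMP;
        if (cell.ttl == row.ttl) flags |= USE_ROW_TTL;
        if (cell.localDeletionTime == row.localDeletionTime) flags |= USE_ROW_DELETION_TIME;
        return flags;
    }

    /** Number of optional metadata fields that still have to be written for the cell. */
    static int fieldsToWrite(int flags) {
        return 3 - Integer.bitCount(flags & 0x07);
    }
}
```

When every cell in a row was written by the same insert, all three flags are set and the cell carries only its flags byte and value.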

Page 39:

Storage Engine 3.0 Improvements

Delta Encoding

Variable Int Encoding

Clustering Written Once

Aggregated Cell Metadata

Cell Presence

Page 40:

Cell Presence

SSTable stores the list of Cells in this SSTable.

Each Row stores a bitmap of the Cells in this Row, with reference to the SSTable list.
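A rough sketch of the cell presence idea: the SSTable keeps one ordered list of names, and each row keeps a bitmap of positions in that list. The class and method names are hypothetical, and using a single long (so at most 64 entries) is a simplification for illustration rather than the actual encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class CellPresenceSketch {
    /** Bit i is set when the i-th name in the SSTable-wide list has a cell in this row. */
    static long encode(List<String> sstableColumns, Set<String> rowColumns) {
        if (sstableColumns.size() > Long.SIZE) {
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        }
        long bitmap = 0L;
        for (int i = 0; i < sstableColumns.size(); i++) {
            if (rowColumns.contains(sstableColumns.get(i))) {
                bitmap |= 1L << i;
            }
        }
        return bitmap;
    }

    /** Recovers the names present in a row from its bitmap and the SSTable-wide list. */
    static List<String> decode(List<String> sstableColumns, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++) {
            if ((bitmap & (1L << i)) != 0) {
                present.add(sstableColumns.get(i));
            }
        }
        return present;
    }
}
```

The row never repeats the names themselves, which is the contrast with the pre 3.0 layout where every cell carried its full column name.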

Page 41:

Storage Engine 3.0 Row

Page 42:

Remember Where We Came From

    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)

Page 43:

How We Got Here

Storage Engine 3.0

Read Path

Page 44:

Read Paths

Ignoring Index Read paths.

Page 45:

Read Commands

PartitionRangeReadCommand

SinglePartitionReadCommand

Page 46:

AbstractClusteringIndexFilter

ClusteringIndexNamesFilter (When we know the column names.)

ClusteringIndexSliceFilter (When we do not know the column names.)

Page 47:

ClusteringIndexNamesFilter

When we know what Columns to select, we know when the search is over.

Page 48:

ClusteringIndexNamesFilter

1. Get Partition From Memtables.
2. Filter named columns into a temporary result.
3. Select SSTables that may contain Partition Key.
4. Order in descending timestamp order.
5. Read from SSTables in order.

Page 49:

Names Filter Short Circuits

If result has a Partition Deletion newer than next SSTable max timestamp.

Stop Search.

Page 50:

Names Filter Short Circuits

If read all Columns and max timestamp of next SSTable less than selected Columns min timestamp.

Stop Search.
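The names filter read and this short circuit can be sketched as a loop over SSTables sorted by max timestamp, newest first. This is a simplified illustration under stated assumptions: the SSTable stand-in, the map-based result and the method names are hypothetical, and memtable data is assumed to be already merged into the result map passed in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NamesFilterSketch {
    /** Hypothetical stand-in for an SSTable: its max timestamp and the cells it holds for one row. */
    static final class SSTable {
        final long maxTimestamp;
        final Map<String, Long> cellTimestamps; // column name -> write timestamp

        SSTable(long maxTimestamp, Map<String, Long> cellTimestamps) {
            this.maxTimestamp = maxTimestamp;
            this.cellTimestamps = cellTimestamps;
        }
    }

    /** Merges wanted columns into result and returns how many SSTables were read. */
    static int read(List<SSTable> candidates, Set<String> wanted, Map<String, Long> result) {
        List<SSTable> ordered = new ArrayList<>(candidates);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        int sstablesRead = 0;
        for (SSTable sstable : ordered) {
            // Short circuit: every wanted column is found and this SSTable
            // cannot hold anything newer than the oldest selected column.
            if (result.keySet().containsAll(wanted) && sstable.maxTimestamp < minTimestamp(result)) {
                break;
            }
            sstablesRead++;
            for (String column : wanted) {
                Long timestamp = sstable.cellTimestamps.get(column);
                if (timestamp != null) {
                    result.merge(column, timestamp, Long::max); // keep the newest write
                }
            }
        }
        return sstablesRead;
    }

    private static long minTimestamp(Map<String, Long> result) {
        long min = Long.MAX_VALUE;
        for (long timestamp : result.values()) {
            min = Math.min(min, timestamp);
        }
        return min;
    }
}
```

Because the SSTables are visited newest first, the first one that fails the timestamp check means all remaining ones fail it too, so the loop can stop rather than skip.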

Page 51:

Names Filter Short Circuits

If search clustering value not within clustering range in the SSTable.

Skip SSTable.

Page 52:

Names Filter Short Circuits

If SSTable Cell not in search set.

Skip reading value.

Page 53:

ClusteringIndexSliceFilter

When we do not know which columns to select, the search ends when it is exhausted.

Page 54:

ClusteringIndexSliceFilter

Used with:

Distinct.

Not all clustering columns restricted.

Page 55:

ClusteringIndexSliceFilter

1. Get Partition From Memtables.
2. Create Iterators for Partitions.
3. Select SSTables that may contain Partition Key.
4. Order in reverse max timestamp order.
5. Create Iterators for SSTables in order.

Page 56:

Slice Filter Short Circuits

If SSTable max timestamp is before max seen Partition Deletion timestamp.

Stop Search.
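A minimal sketch of this short circuit: walk the SSTables in reverse max timestamp order, track the newest Partition Deletion seen so far, and stop once an SSTable is entirely older than it. The SSTable stand-in, the NO_DELETION sentinel and the method name are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SliceFilterSketch {
    /** Sentinel meaning "no partition deletion recorded". */
    static final long NO_DELETION = Long.MIN_VALUE;

    /** Hypothetical stand-in for an SSTable's metadata for one partition. */
    static final class SSTable {
        final long maxTimestamp;
        final long partitionDeletionTimestamp;

        SSTable(long maxTimestamp, long partitionDeletionTimestamp) {
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletionTimestamp = partitionDeletionTimestamp;
        }
    }

    /** Returns the SSTables that still need an iterator, newest first. */
    static List<SSTable> sstablesToIterate(List<SSTable> candidates, long memtablePartitionDeletion) {
        List<SSTable> ordered = new ArrayList<>(candidates);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        long maxSeenDeletion = memtablePartitionDeletion;
        List<SSTable> selected = new ArrayList<>();
        for (SSTable sstable : ordered) {
            // Everything in this and all older SSTables is shadowed by the deletion.
            if (sstable.maxTimestamp < maxSeenDeletion) {
                break;
            }
            selected.add(sstable);
            maxSeenDeletion = Math.max(maxSeenDeletion, sstable.partitionDeletionTimestamp);
        }
        return selected;
    }
}
```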

Page 57:

Slice Filter Short Circuits

If search clustering value not within clustering range in the SSTable.

Skip SSTable.

Page 58:

Thanks.

Page 59:

Aaron Morton

@aaronmorton

Co-Founder & Principal Consultant

www.thelastpickle.com