apachecon cassandra transport

Post on 12-Jan-2015

784 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Apache Cassandra client and transport architecture. Slides from ApacheCon NA 2013 in Portland, OR.

TRANSCRIPT

Apache Cassandra

Clients and Transports

Thursday, February 28, 13

Hi Folks!I’m Nate @zznate

Thursday, February 28, 13

API ManagementAPI AnalyticsAPI Tools

Thursday, February 28, 13

Clients and transportsfor Cassandra

Thursday, February 28, 13

But first... some questions

Thursday, February 28, 13

But first: Architectural stuff

Thursday, February 28, 13

Cassandra:“Sparsely Columnar”

Thursday, February 28, 13

An RDBMS table

Thursday, February 28, 13

An RDBMS table

Thursday, February 28, 13

Cassandra Style

Thursday, February 28, 13

Cassandra Style

Thursday, February 28, 13

Cassandra data modelling

Thursday, February 28, 13

Four common patternsSimple object to simple rowSparse object to rowsMaterialized viewManual index

Thursday, February 28, 13

simple objects to simple row

Thursday, February 28, 13

“static” column family

Thursday, February 28, 13

Sparse Objects

Thursday, February 28, 13

“dynamic” column family

Thursday, February 28, 13

Materialized Views

Thursday, February 28, 13

Materialized view

Thursday, February 28, 13

Regardless of the approached used, there are four overall goals

Thursday, February 28, 13

1. Denormalize2. Eliminate seeks3. Design for read4. Optimiza for blind writes

Thursday, February 28, 13

Now... let’s talk about protocols

Thursday, February 28, 13

Thrift

Thursday, February 28, 13

ThriftRPC-Based

Thursday, February 28, 13

ThriftRPC-BasedMature Apache Project

Thursday, February 28, 13

ThriftRPC-BasedMature Apache ProjectSupports lots of languages

Thursday, February 28, 13

ThriftRPC-BasedMature Apache ProjectSupports lots of languagesExtensible!

Thursday, February 28, 13

CQL

Thursday, February 28, 13

CQLWell defined protocol

Thursday, February 28, 13

CQLWell defined protocolSupports Compression

Thursday, February 28, 13

CQLWell defined protocolSupports CompressionNetty/NIO-based

Thursday, February 28, 13

Storage Mechanics(but quickly)

Thursday, February 28, 13

get_slice

Workhorse of Cassandra selection methods

Thursday, February 28, 13

get_slice: key

The row key

Thursday, February 28, 13

get_slice: ColumnParent

The column family (a.k.a table)

Thursday, February 28, 13

get_slice: SlicePredicate

defines the column range, or specifically named columns

Thursday, February 28, 13

get_slice:ConsistencyLevel

The level of consistency we want for this read

Thursday, February 28, 13

Obtuse at first glance, but nothing is hidden

Thursday, February 28, 13

So...

Thursday, February 28, 13

But one person’s abstraction leakage is another’s preffered model

Thursday, February 28, 13

How closely do you want to interact with the underlying storage engine?

Thursday, February 28, 13

Client APIs

Thursday, February 28, 13

Benefits of thrift

Thursday, February 28, 13

Benefits of thriftMature selection of clients

Thursday, February 28, 13

Benefits of thriftMature selection of clientsMultiple languages

Thursday, February 28, 13

Benefits of thriftMature selection of clientsMultiple languagesWell documented

Thursday, February 28, 13

Benefits of thriftMature selection of clientsMultiple languagesWell documentedCan be used in other places

Thursday, February 28, 13

Drawbacks of thrift

Thursday, February 28, 13

Drawbacks of thriftSeveral objects are required for any request

Thursday, February 28, 13

Drawbacks of thriftSeveral objects are required for any requestClients differs in implementation

Thursday, February 28, 13

Drawbacks of thriftSeveral objects are required for any requestClients differs in implementationUpstream dependency issues

Thursday, February 28, 13

Drawbacks of thriftSeveral objects are required for any requestClients differs in implementationUpstream dependency issuesSchema changes and cluster health done pro-actively

Thursday, February 28, 13

Benefits of cql api

Thursday, February 28, 13

Benefits of cql apiStored procedures

Thursday, February 28, 13

Benefits of cql apiStored proceduresCommon operations are straight forward

Thursday, February 28, 13

Benefits of cql apiStored proceduresCommon operations are straight forward Cluster health and schema change push-back

Thursday, February 28, 13

Benefits of cql apiStored proceduresCommon operations are straight forward Cluster health and schema change push-backAwesome client available

Thursday, February 28, 13

Drawbacks of CQL apis

Thursday, February 28, 13

Drawbacks of CQL apisStill have idiomatic clients

Thursday, February 28, 13

Drawbacks of CQL apisStill have idiomatic clientsStill a binary protocol

Thursday, February 28, 13

Drawbacks of CQL apisStill have idiomatic clientsStill a binary protocolDefault storage model emposes substantial restrictions** see gotchas section later

Thursday, February 28, 13

Considerations for your app

Thursday, February 28, 13

Stick with Thrift if...

Thursday, February 28, 13

Heavy update workloads

Thursday, February 28, 13

Large, dynamic batch insertions

Thursday, February 28, 13

Hadoop integration(CASSANDRA-4421)

Thursday, February 28, 13

Commonly deal with very wide rows(CASSANDRA-4176)

Thursday, February 28, 13

CASSANDRA-4176:“Pick your shard keys carefully”

Thursday, February 28, 13

Thursday, February 28, 13

Consider CQL if...

Thursday, February 28, 13

Static column family model:Take advantage of stored procedures for common reads

Thursday, February 28, 13

Despite the shard key jab, CQL makes good use of the storage model

Thursday, February 28, 13

You can replace some custom serialization with collections

Thursday, February 28, 13

Integration with JDBC and/or BI tools

Thursday, February 28, 13

Wire efficient:Does not return timestamp or TTL by default

Thursday, February 28, 13

Larger, potentially more transient evironments

Thursday, February 28, 13

But CQL is - new- an abstraction

Thursday, February 28, 13

In some cases, CQL might not do what you think

Thursday, February 28, 13

Most common CQL pitfalls

Thursday, February 28, 13

Collections can only be retrieved in their entirety

Thursday, February 28, 13

Can’t mix static and dynamic data in a column family

Thursday, February 28, 13

“keys only” range slices don’t work(CASSANDRA-4536)

Thursday, February 28, 13

Range ghosts will not be returned

Thursday, February 28, 13

Batch inserts are clunky(CASSANDRA-4693)

Thursday, February 28, 13

With non-compact storage the whole row must be read every time.

Thursday, February 28, 13

The take away is that you have options. Particularly good ones for Java.

Thursday, February 28, 13

Thursday, February 28, 13

BUT

Thursday, February 28, 13

there is a larger, more fundamental problem to discuss

Thursday, February 28, 13

“If [they] think that CQL is the answer to usability then I just won. We at least know where our problems are.”- 10gen exec.

Thursday, February 28, 13

The market has spoken and we missed the boat.

Thursday, February 28, 13

POST /endpoint {json}

Thursday, February 28, 13

A Cassandra-MVP actually maintains a REST front-end

Thursday, February 28, 13

So we’ve taken this and gone further

Thursday, February 28, 13

What if...

Thursday, February 28, 13

Coming soon...Intravert. Vert.x+Cassandra.ASF-licensed.Driven by real-world requirements.

Thursday, February 28, 13

top related