Introduction to Apache Cassandra

Posted on 08-Jul-2015

DESCRIPTION

Saratov Open IT Tech Talk. Damir Yaraev: Introduction to Apache Cassandra. In this presentation, Damir explains when and why it makes sense to move from time-tested relational databases to the NoSQL solutions that have recently become fashionable. Apache Cassandra, a column-oriented NoSQL database, serves as the example.

TRANSCRIPT

© 2014 Grid Dynamics

Created for BigData Community by Dmitry Yaraev

Apache Cassandra: When and Why

Agenda

1. When RDBMS Becomes a Bottleneck
2. Concepts of NoSQL Paradigm
3. Variety of NoSQL Databases
4. Why Apache Cassandra?
5. Essential Use Cases of Cassandra
6. Bad Usage Patterns

What Is Offered by RDBMS

● Mature technology with common standards
● Easy migration from one engine to another
● Data model corresponds to the real world
● Structured Query Language (SQL)
● ACID transactions

Bottlenecked by RDBMS

● Horizontal scalability
● Schema support and migration
● Server and maintenance cost

NoSQL :: History

● First mention in 1998
● Class of distributed databases
● Not Only SQL

NoSQL :: Features

● Simple schema without relations
● Good horizontal scalability
● Combination of two of the following:
    ○ Consistency
    ○ Availability
    ○ Partition Tolerance

NoSQL :: CAP Theorem

NoSQL :: Storage Types

Questions?

Cassandra :: What Is It?

● Wide-column distributed data store
● The latest version at the time of this talk is 2.1.2 (released the same month)
● Proven in production (Instagram, Spotify, eBay, and many other big players in the IT market)

Cassandra :: Origin

● Originally created at Facebook
● Open-sourced in 2008
● Apache Incubator project in early 2009
● Top-level Apache project in February 2010

Cassandra :: Features

● High scalability
● Tunable consistency
● Cross-datacenter replication
● Query language (CQL)
● Drivers for a variety of languages
● Lightweight transactions
● Indexing
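
As a rough illustration of the lightweight transactions bullet, the sketch below uses CQL's IF clauses for compare-and-set writes. The users table and its columns are hypothetical and do not come from the slides; a keyspace is assumed to be selected already.

```cql
-- Hypothetical table used only for this illustration
CREATE TABLE users (
    login text PRIMARY KEY,
    email text
);

-- Insert only if no row with this key exists yet
INSERT INTO users (login, email)
VALUES ('alice', 'alice@example.com')
IF NOT EXISTS;

-- Update only if the current value matches the expected one
UPDATE users
SET email = 'alice@new.example.com'
WHERE login = 'alice'
IF email = 'alice@example.com';
```

Each conditional statement returns an [applied] column that tells the client whether the condition held. These writes run a Paxos round between replicas, so they cost noticeably more than plain writes.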

Cassandra :: Data Types

● Primitive types
● Arbitrary bytes (blob)
● Collections (list, map, set)
● Tuples (tuple)
● User-defined types
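
The type families listed above can be sketched in a single table definition. All names here (address, profiles, and the column names) are hypothetical, a keyspace is assumed to be selected already, and tuples and user-defined types assume Cassandra 2.1 or later.

```cql
-- User-defined type (hypothetical)
CREATE TYPE address (
    street text,
    city text,
    zip_code text
);

CREATE TABLE profiles (
    id uuid PRIMARY KEY,                      -- primitive type
    name text,                                -- primitive type
    avatar blob,                              -- arbitrary bytes
    emails set<text>,                         -- collection: set
    recent_logins list<timestamp>,            -- collection: list
    settings map<text, text>,                 -- collection: map
    location frozen<tuple<double, double>>,   -- tuple
    home frozen<address>                      -- user-defined type
);
```

The frozen keyword means the value is written and read as a whole rather than field by field.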

Cassandra :: Data Model

● Keyspace
● ColumnFamily
● Row
● Column
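
A minimal sketch of how these four levels appear in CQL. The music keyspace and the albums table are hypothetical names chosen for illustration, and a replication factor of 1 is only suitable for a single-node test setup.

```cql
-- Keyspace: the top-level namespace, which also carries replication settings
CREATE KEYSPACE music
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

USE music;

-- Column family (called a table in CQL3)
CREATE TABLE albums (
    artist text,
    title text,
    year int,
    PRIMARY KEY (artist, title)
);

-- Row: one entry identified by its primary key
-- Columns: artist, title, year
INSERT INTO albums (artist, title, year)
VALUES ('Metallica', 'Master of Puppets', 1986);
```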

Cassandra :: ColumnFamily

Cassandra :: CQL3

● SQL-like syntax
● Three types of statements
    ○ data definition statements
    ○ data manipulation statements
    ○ data lookup statements
● Prepared statements
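
One statement of each kind, as a sketch against the songs table from this deck's example queries. The genre column and the literal values are assumptions made for illustration.

```cql
-- Data definition: change the schema
ALTER TABLE songs ADD genre text;

-- Data manipulation: write and change rows
INSERT INTO songs (id, title, artist)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'Battery', 'Metallica');
UPDATE songs SET genre = 'metal'
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- Data lookup: read a row by its primary key
SELECT title, artist, genre FROM songs
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- Data manipulation again: remove the row
DELETE FROM songs WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- Prepared statement text: ? is a bind marker filled in by a client driver,
-- so this line is not meant to be typed into cqlsh as is
SELECT title, artist FROM songs WHERE id = ?;
```

With a prepared statement the server parses the query once, and the driver then sends only the bound values on each execution.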

Cassandra :: Example Queries

CREATE TABLE songs (
    id uuid PRIMARY KEY,
    title text,
    album text,
    artist text,
    data blob
);

-- Returns an error: artist is neither part of the primary key nor indexed
SELECT * FROM songs WHERE artist = 'Metallica';

-- A secondary index on artist makes the same query valid
CREATE INDEX ON songs (artist);

SELECT * FROM songs WHERE artist = 'Metallica';
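
An alternative to the secondary index, sketched here as an assumption rather than something the slides show, is a second table whose partition key is the column being queried. The songs_by_artist table is hypothetical.

```cql
-- Hypothetical lookup table: artist is the partition key, id the clustering column
CREATE TABLE songs_by_artist (
    artist text,
    id uuid,
    title text,
    album text,
    PRIMARY KEY (artist, id)
);

-- Served from a single partition, no index needed
SELECT * FROM songs_by_artist WHERE artist = 'Metallica';
```

The application has to write every song to both tables. The trade-off is duplicated data in exchange for reads that touch one partition.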

Cassandra :: Data Distribution
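
As a rough illustration of data distribution: Cassandra assigns each row to a position on the ring using a token computed from its partition key, and the built-in token() function exposes that value. The sketch reuses the songs table from the example queries and assumes the default Murmur3Partitioner, whose tokens are 64-bit integers.

```cql
-- Show the token that the partitioner derives from each row's partition key
SELECT token(id), title FROM songs;

-- Tokens can also bound a scan over part of the ring
SELECT title FROM songs WHERE token(id) > 0;
```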

Cassandra :: Replication
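
Replication is configured per keyspace. The sketch below shows the two built-in strategies; the keyspace names and the data center names dc_east and dc_west are placeholders, and real data center names must match what the cluster's snitch reports.

```cql
-- Single data center: three copies of every row
CREATE KEYSPACE music_single_dc
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Multiple data centers: replica count set per data center
CREATE KEYSPACE music_multi_dc
WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_east': 3, 'dc_west': 2};
```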

Cassandra :: Eventual Consistency

Cassandra :: Tunable Consistency

Cassandra :: Consistency Levels

● Defines the condition for a successful read/write operation
● Multiple options
    ○ ONE
    ○ ALL
    ○ QUORUM
    ○ LOCAL_QUORUM
    ○ SERIAL
    ○ …
● Can be specified per request
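
A small sketch of switching levels in the cqlsh shell. CONSISTENCY is a cqlsh command rather than part of CQL itself; client drivers expose the same setting per statement through their own APIs. The songs table is the one from the example queries, and the literal values are made up.

```cql
-- Applies to the statements that follow in this cqlsh session
CONSISTENCY QUORUM;
SELECT * FROM songs WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- Relax the requirement for a write that may be acknowledged by a single replica
CONSISTENCY ONE;
INSERT INTO songs (id, title)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'Battery');
```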

Cassandra :: Consistency (Quorum)
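
The arithmetic behind quorum reads and writes can be written out as follows. The symbols are conventional notation and not taken from the slides: N is the replication factor, Q the quorum size, and R and W the number of replicas that must answer a read and a write.

```latex
\[
Q = \left\lfloor \frac{N}{2} \right\rfloor + 1,
\qquad
R + W > N \;\Rightarrow\; \text{every read overlaps the latest acknowledged write}
\]
\[
N = 3:\quad Q = 2,\qquad R = W = Q \;\Rightarrow\; 2 + 2 = 4 > 3
\]
\[
N = 3,\; R = W = 1 \text{ (level ONE)}:\quad 1 + 1 = 2 \le 3
\]
```

In the last case the read set and the write set need not share a replica, so a read at level ONE may return stale data until replicas converge.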

Cassandra :: Consistency (ONE)

Cassandra :: Use Cases

● Large data sets and simple scaling
● A good fit for semi-structured data
● Fault tolerance (no single point of failure)
● High write throughput

Cassandra :: Limitations

● Not a good fit for large blobs (> 64 MB)
● Not a good fit when reads outnumber writes and low read latency is critical
● No ACID transactions

Thanks!
