Apache Cassandra overview

Apache Cassandra overview by Taras Tymoshchuk, software developer at ElifTech


Posted on 12-Apr-2017


Introduction

What is Apache Cassandra?

Apache Cassandra™ is a free, distributed, high-performance, extremely scalable, fault-tolerant (i.e. no single point of failure) post-relational database solution. Cassandra can serve both as a real-time datastore (the “system of record”) for online/transactional applications and as a read-intensive database for business intelligence systems.

Top Use Cases

● Internet of things applications – Cassandra is perfect for consuming lots of fast incoming data from devices, sensors and similar mechanisms that exist in many different locations.

● Product catalogs and retail apps – Cassandra is the database of choice for many retailers that need durable shopping cart protection, fast product catalog input and lookups, and similar retail app support.

● User activity tracking and monitoring – many media and entertainment companies use Cassandra to track and monitor the activity of their users’ interactions with their movies, music, website and online applications.

● Messaging – Cassandra serves as the database backbone for numerous mobile phone and messaging providers’ applications.

● Social media analytics and recommendation engines – many online companies, websites, and social media providers use Cassandra to ingest, analyze, and provide analysis and recommendations to their customers.

Key Cassandra Features and Benefits

● Gigabyte to Petabyte scalability

● Linear performance

● No SPOF

● Easy replication / data distribution

● Multi datacenter and cloud capable

● No need for separate caching layer

● Tunable data consistency

● Flexible schema design

● Data compaction

● CQL language (like SQL)

● Support for key languages and platforms

● No need for special hardware or software
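
The "tunable data consistency" item above can be illustrated with a little arithmetic. The following is a minimal sketch, not Cassandra code: the class and method names are hypothetical, and it only models the commonly cited rule that a read is guaranteed to see the latest write when the number of replicas acknowledging the write plus the number consulted on the read exceeds the replication factor.

```java
// Sketch only: models the replica-count arithmetic behind tunable consistency.
// ConsistencySketch and its methods are illustrative names, not a Cassandra API.
final class ConsistencySketch {

    /** Replicas that must respond for a QUORUM operation: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read set overlaps every write set, i.e. W + R > RF. */
    static boolean readsSeeLatestWrite(int replicationFactor, int writeAcks, int readAcks) {
        return writeAcks + readAcks > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("QUORUM size for RF=3: " + q);
        System.out.println("QUORUM write + QUORUM read overlap: " + readsSeeLatestWrite(rf, q, q));
        System.out.println("ONE write + ONE read overlap: " + readsSeeLatestWrite(rf, 1, 1));
    }
}
```

Lower levels such as ONE trade that overlap guarantee for lower latency and higher availability, which is the tuning the feature list refers to.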

Architecture Overview

In Cassandra, all nodes play an identical role; there is no concept of a master node.

Cassandra’s built-for-scale architecture means that it is capable of handling large amounts of data and thousands of concurrent users.

Cassandra’s architecture also means that, unlike master-slave or sharded systems, it has no single point of failure and is therefore capable of offering true continuous availability and uptime.
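
One way to picture a cluster without a master is a token ring, where any node can work out which peers hold a key. The sketch below is a simplification under stated assumptions: one token per node, replicas chosen by walking the ring clockwise, and tokens passed in directly rather than computed by a partitioner. The class `TokenRingSketch` and its methods are hypothetical names, not part of Cassandra.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Sketch only: a token ring with one token per node (an assumption made here).
final class TokenRingSketch {
    // Token -> node name, kept sorted so we can walk the ring in order.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String name, long token) {
        ring.put(token, name);
    }

    /** Starts at the first node whose token is >= the key's token and walks clockwise. */
    List<String> replicasFor(long token, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        Long cursor = ring.ceilingKey(token);
        while (replicas.size() < wanted) {
            if (cursor == null) {
                cursor = ring.firstKey(); // wrap around the end of the ring
            }
            replicas.add(ring.get(cursor));
            cursor = ring.higherKey(cursor);
        }
        return replicas;
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode("A", 0L);
        ring.addNode("B", 100L);
        ring.addNode("C", 200L);
        System.out.println(ring.replicasFor(50L, 2));
        System.out.println(ring.replicasFor(250L, 2));
    }
}
```

Because every node can run the same lookup, no single coordinator has to be consulted, which is the property the slide describes.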

CQL

Astyanax / Hector API:

SliceQuery<String, String, String> query = ...
query.setKey("x");
query.setColumnFamily("y");

CQL:

SELECT a FROM y WHERE id = 'x';

Cassandra Data Objects Overview

Cassandra data model

KEY: COL1 = VAL1 (TS1), COL2 = VAL2 (TS2)
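
The KEY / COL / VAL / TS layout above can be sketched as a nested map in which each cell carries a write timestamp and the newest timestamp wins. This is an illustrative in-memory model only: `RowSketch` and `Cell` are hypothetical names, and ties between equal timestamps simply keep the existing cell here, which is a simplification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: row key -> sorted columns -> (value, timestamp).
final class RowSketch {

    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, TreeMap<String, Cell>> rows = new HashMap<>();

    /** Stores the cell unless a cell with the same or a newer timestamp already exists. */
    void write(String key, String column, String value, long timestamp) {
        TreeMap<String, Cell> row = rows.computeIfAbsent(key, k -> new TreeMap<>());
        Cell existing = row.get(column);
        if (existing == null || timestamp > existing.timestamp) {
            row.put(column, new Cell(value, timestamp));
        }
    }

    /** Returns the current value, or null if the row or column is absent. */
    String read(String key, String column) {
        TreeMap<String, Cell> row = rows.get(key);
        if (row == null) {
            return null;
        }
        Cell cell = row.get(column);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        RowSketch table = new RowSketch();
        table.write("KEY", "COL1", "VAL1", 1L);
        table.write("KEY", "COL1", "VAL1-updated", 2L);
        table.write("KEY", "COL1", "late-arriving", 1L); // older timestamp, ignored
        System.out.println(table.read("KEY", "COL1"));
    }
}
```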

Writing Data

Reading Data

Pitfalls

● Poorly implemented range scans; Cassandra cannot currently transfer data;

● Compaction can hold up requests;

● Many settings (type, storage strategy, etc.) are made at the cluster level;

● Counters.

Thank you for your attention!

Find us at eliftech.com

Have a question? Contact us: [email protected]