DataStax Enterprise: The Multi-Model Data Platform

Upload: datastax-academy

Post on 20-Mar-2017

TRANSCRIPT

Page 1: DataStax: Datastax Enterprise - The Multi-Model Platform

DataStax Enterprise: The Multi-Model Data Platform

Page 2: DataStax: Datastax Enterprise - The Multi-Model Platform

1 Multi-Model Defined

2 The evolution of Models

3 Graph Databases Overview

4 The DataStax Approach to Graph Databases

5 DataStax in the Open Source Graph Community

6 Conclusion

© 2015. All Rights Reserved.

Page 3: DataStax: Datastax Enterprise - The Multi-Model Platform

Multi-Model Defined

Most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. In contrast, a multi-model database is designed to support multiple data models against a single, integrated backend. Source - https://en.wikipedia.org/wiki/Multi-model_database

Page 4: DataStax: Datastax Enterprise - The Multi-Model Platform

The evolution of Models

[Diagram: three stages of evolution, from RDBMS / SQL to Polyglot Persistence to Multi Model. Box labels: Application; Application; Application (OLTP and OLAP); Data Access Abstraction.]

Page 5: DataStax: Datastax Enterprise - The Multi-Model Platform

DSE Multi Model

Multiple (polyglot) models exposed via cohesive interaction mechanisms (APIs) for OLTP and OLAP workloads (mixed workload), with a unified persistence layer (Apache Cassandra) providing GR, always-on characteristics within a TCO-efficient database platform for a variety of use cases.

Cassandra

Page 6: DataStax: Datastax Enterprise - The Multi-Model Platform

©2015 DataStax Confidential. Do not distribute without consent. 6

DataStax Enterprise – Wide Row / MVs

[Diagram: table columns C1, C2; materialized views MV1c1, MV1c2; aggregates Agg1c1, Agg2c1]

Page 7: DataStax: Datastax Enterprise - The Multi-Model Platform

DataStax Enterprise - JSON

Inserting JSON data is easy

Reading JSON data is easy

Finding JSON errors is easy
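
The screenshots for this slide did not survive in the transcript. As a rough sketch of what the three points refer to: CQL (Cassandra 2.2 and later) accepts `INSERT INTO ... JSON '<document>'` and `SELECT JSON ...`. The Python below only builds those statement strings and does a client-side syntax check; it does not talk to a cluster. The table name `users`, its fields, and the helper names are hypothetical.

```python
import json


def cql_insert_json(table, row):
    """Build a CQL `INSERT INTO <table> JSON '<doc>'` statement string."""
    doc = json.dumps(row, sort_keys=True)
    # CQL string literals escape a single quote by doubling it.
    return "INSERT INTO {} JSON '{}';".format(table, doc.replace("'", "''"))


def cql_select_json(table):
    """Build a CQL statement that returns each row as a JSON document."""
    return "SELECT JSON * FROM {};".format(table)


def find_json_error(text):
    """Client-side check (an assumption, not DSE behaviour): return None if
    the text parses as JSON, otherwise a short description of the problem."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        return "line {}, column {}: {}".format(err.lineno, err.colno, err.msg)


stmt = cql_insert_json("users", {"id": 1, "name": "O'Brien"})
print(stmt)
print(cql_select_json("users"))
print(find_json_error('{"id": }'))
```

The generated statements could then be passed to any CQL client; the server performs its own validation of the JSON document against the table schema.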

Page 8: DataStax: Datastax Enterprise - The Multi-Model Platform

Introduction to Graph Databases

Page 9: DataStax: Datastax Enterprise - The Multi-Model Platform

What is a Graph Database?

•  Store, manage and query highly connected data

•  Data stored as nodes (vertices), edges and properties to represent a domain model

•  By explicitly embedding relationships in the data model, you store a more logical business model using the natural data access language Gremlin

•  Think of a graph as a pre-joined database
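
To make the "pre-joined" idea concrete, here is a minimal in-memory sketch of a property graph in Python. It is not how DSE Graph stores data; the `PropertyGraph` class and the sample vertices are hypothetical. The point is that each vertex already holds its outgoing edges, so following a relationship is a lookup rather than a join computed at query time.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy property graph: vertices and edges both carry key/value properties."""

    def __init__(self):
        self.vertices = {}                  # id -> {"label": ..., "props": {...}}
        self.out_edges = defaultdict(list)  # source id -> [(label, target id, props)]

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, "props": props}

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src].append((label, dst, props))

    def out(self, vid, label):
        """Follow outgoing edges with the given label from one vertex."""
        return [dst for (lbl, dst, _) in self.out_edges[vid] if lbl == label]


g = PropertyGraph()
g.add_vertex("alice", "person", age=29)
g.add_vertex("bob", "person", age=32)
g.add_vertex("acme", "company")
g.add_edge("alice", "knows", "bob", since=2014)
g.add_edge("bob", "worksFor", "acme")

# Two hops: where do alice's acquaintances work?
friends_employers = [c for f in g.out("alice", "knows") for c in g.out(f, "worksFor")]
print(friends_employers)  # ['acme']
```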

Page 10: DataStax: Datastax Enterprise - The Multi-Model Platform

Choose the Right Model to Fit your Business Needs

© 2015 DataStax, All Rights Reserved. Company Confidential 10

DSE Wide Row

- Build and maintain models using CQL’s DDL features

- Super-fast CRUD at scale with CQL DML features

- Good option for denormalized schemas with high-throughput requirements

- Perfect fit for IoT applications that require consuming enormous amounts of data with specific data retrieval requirements: product catalogs, high-volume messaging systems, collecting and storing sensor data

DSE Graph

- Flexible schema that is easy to modify and maintain with Gremlin

- Clearly maps business semantics to a logical model for easy maintenance and understanding

- Ideal model for highly-connected data models

- Perfect fit for social-engagement models, recommendation engines and IT network / device management

- Update and query the graph in real-time with easy-to-learn open-source Gremlin language

- Good option for transitioning from slow RDBMS 3NF models with lots of JOINs

Page 11: DataStax: Datastax Enterprise - The Multi-Model Platform

What is DataStax Enterprise (DSE) Graph?

©2015 DataStax

•  Highly scalable graph database for modern web, mobile, and IoT applications that need to manage highly connected and heterogeneous data

•  Built-in support for real-time search, and analytic graph queries via tight integration with the DSE platform

•  A property graph model native inside the DataStax product, engineered specifically for Cassandra

•  Store and find relationships in data quickly and easily, even on huge graphs

Page 12: DataStax: Datastax Enterprise - The Multi-Model Platform

DSE Graph: Built-in Scalability

•  Scale out Graph vs. Scale up only

•  Graph partitioning built on Cassandra’s scale-out architecture

•  Graph index structures integrated into Cassandra

•  Domain model maps more naturally to data model, allowing for greater understanding between business and IT

•  Traverse millions of relationships in a short period of time, faster than modeling the data in RDBMS

•  Flexible data model that can be easily adapted to business changes

Page 13: DataStax: Datastax Enterprise - The Multi-Model Platform

DSE Graph: Integrated Search, Analytics and Ops

•  Real-time traversals over complex-structured graph data

•  Integrates with DSE Search to mix search with traversal queries

•  Integrates with DSE Analytics and Spark to support OLAP and breadth-first graph traversals

•  Iterative graph analytics like PageRank or other centrality measures

•  Reporting and aggregates over graph data

•  Integrated with DataStax OpsCenter
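
For readers unfamiliar with "iterative graph analytics", the following is a small, single-machine sketch of PageRank by power iteration in Python. It only illustrates the kind of computation meant; it is not the DSE Analytics or Spark implementation, and the function name, the damping factor of 0.85, and the four-vertex sample graph are assumptions chosen for the example.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    out_links maps every vertex to the list of vertices it links to;
    every link target must also appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every vertex receives the "random jump" share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v, targets in out_links.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling vertex: spread its rank evenly over all vertices.
                for t in nodes:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for vertex in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex, round(ranks[vertex], 3))
```

Each pass touches every edge once, which is why this style of analytics suits a batch (OLAP) engine better than a real-time traversal.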

Page 14: DataStax: Datastax Enterprise - The Multi-Model Platform

Graph Database Use Cases

Page 15: DataStax: Datastax Enterprise - The Multi-Model Platform

Additional Graph Use Cases

360 Degree View of Your Customer

•  Collect massive amounts of data points about your customers

•  Data collected from social networks, web analytics, digital ads, mobile devices, CRM

•  Bring heterogeneous customer data together into DSE Graph

•  Uncover buying patterns and customer behaviors

•  Graph becomes a master data hub for customer data

•  Use the graph customer hub to build better products for your target customers

•  Keep the customers you already have with customer intelligence

[Diagram: Customer 360 View, with sources Social, Store Sensors, Email, CRM, Mobile, Weblogs]

Page 16: DataStax: Datastax Enterprise - The Multi-Model Platform

Additional Graph Use Cases

IT Network and Device Management

•  Allows IT to monitor, manage and protect corporate networks and devices (laptops, iPads, mobile phones, etc.)

•  Requires understanding of network topologies and relationships between devices, interfaces, equipment, people, services …

•  A traditional RDBMS would require expensive query-time joins

•  A graph model intrinsically knows how to traverse the topology because the relationships are already stored

•  This makes for quick & easy recognition of problems, root cause analysis and event correlation
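
As an illustration of traversing a stored topology, the sketch below walks a small, made-up network breadth-first to list everything downstream of a failed device. It is plain Python rather than Gremlin, and the device names and the `impacted_by` helper are hypothetical; the intent is only to show that impact analysis becomes a walk over relationships that are already present.

```python
from collections import deque


def impacted_by(downstream, failed):
    """Breadth-first walk from a failed device over stored 'feeds' relationships.

    downstream maps a device to the devices directly connected below it.
    Returns the affected devices in the order they are reached.
    """
    seen = {failed}
    order = []
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


topology = {
    "core-switch": ["rack-switch-1", "rack-switch-2"],
    "rack-switch-1": ["laptop-17", "printer-3"],
    "rack-switch-2": ["ipad-4"],
}

print(impacted_by(topology, "rack-switch-1"))  # ['laptop-17', 'printer-3']
```

Running the same walk over reversed edges (from an alarming device upward) gives the candidate root causes.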

Page 17: DataStax: Datastax Enterprise - The Multi-Model Platform

DataStax in the Graph Open Source Community

TinkerPop / Gremlin

Page 18: DataStax: Datastax Enterprise - The Multi-Model Platform

DataStax Role in TinkerPop

•  DataStax utilizes the TinkerPop framework for the DSE Graph product

•  DataStax will contribute to the TinkerPop community and is heavily invested in the success of the Gremlin language

•  DataStax will provide resource guides, documentation, samples and training on building and querying graphs with Gremlin, using DSE Graph as the graph engine

Page 19: DataStax: Datastax Enterprise - The Multi-Model Platform

Gremlin: The open-source standard graph query language

•  DataStax contributes to and supports the Apache TinkerPop community, along with the Gremlin graph query language

•  g.V().hasLabel('person').as('a').out('knows').as('b').select('a','b').by('age').by('age')
   "For all people in the graph, give me the ages of the people on each end of a friendship relationship"

•  g.V().has('name','marko').out('knows').out('mother').outE('worksFor').has('time',between(2001,2002)).inV().values('name')
   "What are the names of the places that marko's friends' mothers worked for from 2001 to 2002?"

•  Deep traversal == multiple levels of query-time joins in RDBMS

Page 20: DataStax: Datastax Enterprise - The Multi-Model Platform

Recommendation Query – RDBMS vs. Graph

SELECT TOP (5) [t14].[ProductName]
FROM (
    SELECT COUNT(*) AS [value], [t13].[ProductName]
    FROM [customers] AS [t0]
    CROSS APPLY (
        SELECT [t9].[ProductName]
        FROM [orders] AS [t1]
        CROSS JOIN [order details] AS [t2]
        INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID]
        CROSS JOIN [order details] AS [t4]
        INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID]
        LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID]
        CROSS JOIN ([orders] AS [t7]
            CROSS JOIN [order details] AS [t8]
            INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID])
        WHERE NOT EXISTS(
                SELECT NULL AS [EMPTY]
                FROM [orders] AS [t10]
                CROSS JOIN [order details] AS [t11]
                INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID]
                WHERE [t9].[ProductID] = [t12].[ProductID]
                  AND [t10].[CustomerID] = [t0].[CustomerID]
                  AND [t11].[OrderID] = [t10].[OrderID])
          AND [t6].[CustomerID] <> [t0].[CustomerID]
          AND [t1].[CustomerID] = [t0].[CustomerID]
          AND [t2].[OrderID] = [t1].[OrderID]
          AND [t4].[ProductID] = [t3].[ProductID]
          AND [t7].[CustomerID] = [t6].[CustomerID]
          AND [t8].[OrderID] = [t7].[OrderID]
    ) AS [t13]
    WHERE [t0].[CustomerID] = N'ALFKI'
    GROUP BY [t13].[ProductName]
) AS [t14]
ORDER BY [t14].[value] DESC

VS.

g.V('customerId','ALFKI').as('customer') \
    .out('ordered').out('contains').out('is').as('products') \
    .in('is').in('contains').in('ordered').except('customer') \
    .out('ordered').out('contains').out('is').except('products') \
    .groupCount().cap().orderMap(T.decr)[0..<5].productName
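
The logic both queries on this slide express ("products bought by customers who bought what I bought, minus what I already have, ranked by frequency") can be sketched in a few lines of Python over a toy data set. This is a simplification under stated assumptions: orders and order lines are collapsed into one set of products per customer, and every customer and product other than `ALFKI` is made up for the example.

```python
from collections import Counter


def recommend(purchases, customer, limit=5):
    """Rank products bought by customers who share a purchase with `customer`.

    purchases maps a customer id to the set of products they bought.
    A product scores once per (shared product, other customer) pair,
    mirroring how the traversal counts paths.
    """
    mine = purchases[customer]
    counts = Counter()
    for product in mine:
        for other, theirs in purchases.items():
            if other != customer and product in theirs:
                # Count only products the target customer does not have yet.
                counts.update(theirs - mine)
    return [name for name, _ in counts.most_common(limit)]


purchases = {
    "ALFKI": {"chai", "tofu"},
    "BONAP": {"chai", "tofu", "ikura"},
    "CHOPS": {"chai", "ikura", "pavlova"},
    "DRACD": {"konbu"},
}

print(recommend(purchases, "ALFKI"))  # ['ikura', 'pavlova']
```

The shape of the function follows the traversal step by step (out to products, back in to other customers, out to their products), which is the readability argument the slide is making.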

Page 21: DataStax: Datastax Enterprise - The Multi-Model Platform

Thank you