MarkLogic and Hadoop - Strata + Hadoop World 2014

© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. MarkLogic & Hadoop Presented by: Jim Clark, Senior Director, Product Management

Upload: marklogic

Post on 21-May-2015

DESCRIPTION

A global investment bank had to find a solution to satisfy the recent Dodd-Frank and Basel III regulatory requirements. This legislation requires companies and other entities to keep all trade and related information available for external auditing for up to 7 years. To satisfy the new legislation, the IT organization first tried to deploy a solution using an architecture based on legacy technology, but found this approach too expensive and inflexible. Instead, the bank deployed MarkLogic Enterprise NoSQL with Tiered Storage and Hadoop to meet its requirements faster and at a lower cost. The bank can now scale out the architecture to accommodate both operational and analytic workloads while satisfying the regulatory requirements.

TRANSCRIPT

MarkLogic & Hadoop Presented by: Jim Clark, Senior Director, Product Management

SLIDE: 2

Why should we care?

SLIDE: 3

Why is Hadoop important?

Economics of commodity scale out vs. up

Unstructured throughout

More data > clever algorithms

Fault tolerant by design

Momentum and community

Emerging compute and storage infrastructure

SLIDE: 4

The Hadoop “Ecosystem”

SLIDE: 5

Hadoop

SLIDE: 6

Built-in Search

Scalability and Elasticity

ACID Transactions

Government-grade Security

HA/DR

Cloud Deployment

Hadoop-ready

NoSQL. No Compromises.

SLIDE: 7

Real-time applications

Hadoop

Real-time applications

Batch analytics

SLIDE: 8

Real-time applications

Hadoop

Real-time applications

Batch analytics

Magic?

SLIDE: 9

The best database for Hadoop

Hadoop

Real-time applications

Batch analytics

MarkLogic

SLIDE: 10

Harnessing Data & Reimagining Applications

Reduce Risk

Manage Compliance

Create New Value from Data

Optimize Operations

Lower TCO / Better IT Economics

Better Decision-making

SLIDE: 11

Hadoop

Staging

Analytics

Persistence

SLIDE: 12

Batch Analytics with Hadoop

Progressive Enhancement

Raw Data

Application

mlcp

MarkLogic

Batch Analytics

SLIDE: 13

Direct access

Batch and real-time

No ETL or re-indexing

Consistent migrations

Online in seconds

Open-source reader

MapReduce processing

SLIDE: 14

Data Retention and Tiered Storage

Provide multiple Service Level Agreements (SLAs) in a single system

Decrease time and costs of ETL to bring offline content back online

Empower your operations team without imposing burdens on your developers

SLIDE: 15

Information lifecycle

Time: Active, then Historical, then Archive

Active: SSD, DAS, SAN, Hadoop

Historical: DAS, SAN, NAS, Hadoop, S3

Archive: NAS, Hadoop, S3
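
The lifecycle on this slide can be read as a lookup from a document's phase to the storage types that may hold it. The following is a minimal Python sketch of that idea, not MarkLogic's tiered storage API. It assumes the three storage rows on the slide match the three phases in order, and the age cut-offs (`ACTIVE_MAX_YEARS`, `HISTORICAL_MAX_YEARS`) and the function `phase_for_age` are hypothetical names and values chosen only for illustration.

```python
from enum import Enum


class Phase(Enum):
    ACTIVE = "Active"
    HISTORICAL = "Historical"
    ARCHIVE = "Archive"


# Storage options per phase, assuming the slide's three storage rows
# correspond to the three phases in order.
STORAGE_OPTIONS = {
    Phase.ACTIVE: ["SSD", "DAS", "SAN", "Hadoop"],
    Phase.HISTORICAL: ["DAS", "SAN", "NAS", "Hadoop", "S3"],
    Phase.ARCHIVE: ["NAS", "Hadoop", "S3"],
}

# Hypothetical age cut-offs in years; these values are NOT from the presentation.
ACTIVE_MAX_YEARS = 1
HISTORICAL_MAX_YEARS = 3


def phase_for_age(age_years: float) -> Phase:
    """Map a document's age to a lifecycle phase using the cut-offs above."""
    if age_years < ACTIVE_MAX_YEARS:
        return Phase.ACTIVE
    if age_years < HISTORICAL_MAX_YEARS:
        return Phase.HISTORICAL
    return Phase.ARCHIVE


if __name__ == "__main__":
    for age in (0.5, 2, 6):
        phase = phase_for_age(age)
        print(f"{age} years: {phase.value} on {', '.join(STORAGE_OPTIONS[phase])}")
```

Keeping the phase-to-storage mapping in data rather than in code means an operations team could change where a phase lives without touching application logic, which is the separation the tiered storage slides argue for.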

SLIDE: 16

Active

Local 10K SAS, RAID10

Replication for HA

Merge overhead for updates

20 hosts, 320 shards

4 TB of SSD cache

96 TB

SLIDE: 17

Compliance

Shared NAS

63 hosts

Effective 8 TB/host

Active: 96 TB

Compliance: 504 TB

SLIDE: 18

Analytic

Hadoop

120 hosts

Effective 12 TB/host

10 MarkLogic hosts

Active: 96 TB

Compliance: 504 TB

Analytic: 1,044 TB

SLIDE: 19

Active

Compliance

Analytic

Online migration

TB

SLIDE: 20

Tier          Total Size (TB)   Total Cost ($000)   Effective Unit Cost ($/GB)
Operational   96                592                 $25
Compliance    504               2,066               $4
Analytic      1,044             2,080               $1.50
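
The compliance tier is the one case where the transcript states every input to the sizing arithmetic: 63 hosts at an effective 8 TB/host, and a total cost of 2,066 ($000). The following is a minimal Python sketch of that calculation. It assumes 1 TB = 1,000 GB and that the 2,066 figure belongs to the compliance tier (inferred from column order); the function names are illustrative and not part of the presentation.

```python
def effective_capacity_tb(hosts: int, tb_per_host: float) -> float:
    """Total effective capacity of a tier in TB."""
    return hosts * tb_per_host


def unit_cost_per_gb(total_cost_thousands: float, size_tb: float,
                     gb_per_tb: int = 1000) -> float:
    """Effective unit cost in $/GB from a cost given in $000 and a size in TB.

    gb_per_tb defaults to 1000 (decimal units), which is an assumption.
    """
    return (total_cost_thousands * 1000) / (size_tb * gb_per_tb)


# Compliance tier inputs as stated on the slides.
compliance_tb = effective_capacity_tb(hosts=63, tb_per_host=8)
# Assumption: the 2,066 ($000) total cost is the compliance tier's.
compliance_cost = unit_cost_per_gb(2066, compliance_tb)

if __name__ == "__main__":
    print(f"Compliance: {compliance_tb:.0f} TB at ${compliance_cost:.2f}/GB")
```

Under these assumptions the result comes to 504 TB at roughly $4.10/GB, in line with the $4 shown for the compliance tier.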

SLIDE: 21

MarkLogic makes Hadoop better

Complementary Capabilities

MarkLogic                Hadoop
Online applications      Offline analytics
Decision making          Model building
Real-time                Long-haul batch
Distributed indexes      Distributed file system

SLIDE: 22

Tiered Storage

Bitemporal

Semantics

Alerting

Elasticity

Differentiated Hadoop Use Cases

Geospatial

Composable Queries & Powerful Search

More Features. No Compromises!

SLIDE: 23

For additional information, we have resources at www.marklogic.com

Contact me directly [email protected]

THANK YOU!!