© Copyright 2014 Pivotal. All rights reserved.
Operationalizing Data Analytics
How big data analytics frameworks are evolving in the age of Hadoop
Jan 2015
Abstract
Hadoop is widely regarded as a key component of a Big Data infrastructure, but many companies have yet to reap expected benefits from the platform.
In this webinar, Brian Hopkins of Forrester and Greg Chase of Pivotal examine the following:
• Business cases driving Hadoop deployments
• Challenges in translating deployment into business benefits
• Other tools in a big data platform needed to realize insights
• Future evolution of big data infrastructure with in-memory processing
• How to achieve business value with data stored in Hadoop
Business Cases
What is driving adoption of Hadoop?
How is Hadoop being used in the enterprise?
Hadoop has been considered a key capability for implementing big data initiatives in the enterprise. What are the different ways Hadoop is being used?
• Hadoop adoption has been driven by large organizations with lots of data and complex needs for insight
• Early Hadoop adoption in large organizations was driven by the need to reduce the cost of data persistence for analytics
• Early use cases: fraud, IT security, ad tech, pricing optimization, offloading transactional analytics
• Emerging use cases: Internet of Things, customer journey path analytics, dynamic digital experiences
[Chart] Percentage of firms that have implemented Hadoop
[Chart] Percentage of firms with > 20,000 employees that have implemented Hadoop
Source: Forrester’s Business Technology Technographics Global Data And Analytics Survey, Q2 2014. Base: 1658 business and technology management professionals with knowledge of data and analytics.
Hadoop Use Cases for Pivotal Customers
Retail • CRM – Customer Scoring • Store Siting and Layout • Fraud Detection / Prevention • Supply Chain Optimization
Advertising & Public Relations • Demand Signaling • Ad Targeting • Sentiment Analysis • Customer Acquisition
Financial Services • Algorithmic Trading • Risk Analysis • Fraud Detection • Portfolio Analysis
Media & Telecommunications • Network Optimization • Customer Scoring • Churn Prevention • Fraud Prevention
Manufacturing • Product Research • Engineering Analytics • Process & Quality Analysis • Distribution Optimization
Energy • Smart Grid • Exploration
Government • Market Governance • Counter-Terrorism • Econometrics • Health Informatics
Healthcare & Life Sciences • Pharmaco-Genomics • Bio-Informatics • Pharmaceutical Research • Clinical Outcomes Research
Challenges
What are typical barriers to achieving business results with Hadoop?
How mature are data analytics on Hadoop, really?
Enterprises have invested in Hadoop, but their analytics capabilities are still at the initial stages of maturity.
• Enterprises are asking, “What do I need that I don’t have already?”
  – Especially the larger ones that have already been investing for years in analytics technology
• Firms are struggling with business cases
• Insight is elusive
  – Valuable insight is even more so
• Experiment and fail fast
[Chart] Large enterprises say they have enough big data and are not expanding
[Chart] Large enterprises say their business cases have a proven ROI
Source: Forrester’s Business Technology Technographics Global Data And Analytics Survey, Q2 2014. Base: 1658 business and technology management professionals with knowledge of data and analytics.
Common Hadoop Analytics Challenges
• Handling volatile streaming data
• Querying large datasets
• Applying advanced analytics
• Data consistency at scale
[Diagram labels: In-Memory, Web App]
Realize Insights
How to apply advanced analytics to data stored in Hadoop?
What are the key data access methods?
What methods do you see enterprises using to access and work with big data?
• The basic access paradigm is still the key-value store (KVS)
• MapReduce is too limiting for many cases; graph, stream, and SQL are emerging
• SQL on Hadoop is hot, and for good reason
  – Recognize the different flavors of this
  – It’s not an apples-to-apples comparison
• Recognize the emerging need for speed: streaming and in-memory
  – Accelerate MapReduce, graph, and search
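To make the contrast between these access methods concrete, here is a minimal sketch in plain Python. It is an illustration only, not something shown in the webinar: the sample records, the function names, and the use of an in-memory SQLite database as a stand-in for a SQL-on-Hadoop engine are all assumptions.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (customer_id, amount). Not from the webinar.
EVENTS = [("c1", 20.0), ("c2", 5.0), ("c1", 7.5), ("c3", 12.0), ("c2", 1.0)]


def kvs_lookup(store, key):
    """Key-value access: fetch one value by its key, nothing more."""
    return store.get(key)


def mapreduce_totals(events):
    """MapReduce-style aggregation: map, group by key, then reduce."""
    mapped = ((cid, amount) for cid, amount in events)  # map phase
    groups = defaultdict(list)
    for cid, amount in mapped:  # shuffle phase: group values by key
        groups[cid].append(amount)
    return {cid: sum(vals) for cid, vals in groups.items()}  # reduce phase


def sql_totals(events):
    """The same aggregation stated declaratively in SQL.

    SQLite stands in for a SQL-on-Hadoop engine here (an assumption).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (customer_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM events GROUP BY customer_id"
    ).fetchall()
    conn.close()
    return dict(rows)


# A dict acts as the key-value store: latest amount seen per customer.
last_amount = {cid: amount for cid, amount in EVENTS}
```

The key-value lookup answers only "what is stored under this key", while the MapReduce and SQL versions compute the same per-customer totals; the SQL version leaves the grouping strategy to the engine.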
A Business Data Lake Adds Analytic Insights
[Architecture diagram labels]
Centralized Management: System monitoring • System management
Unified Data Management Tier: Data mgmt. services • MDM • RDM • Audit and policy mgmt.
Processing Tier
Workflow Management
Distillation Tier
HDFS storage: Unstructured and structured data
In-memory • MPP database
Unified Sources: Real-time ingestion • Micro batch ingestion • Batch ingestion
Flexible Actions: Real-time insights • Interactive insights • Batch insights
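The three ingestion modes named on this slide differ mainly in how many events are handed over at once. A minimal sketch of that idea in Python follows; the function names and the list used as a sink are hypothetical and do not represent any Pivotal API.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield lists of up to batch_size events from any iterable stream."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # stream exhausted
        yield batch


def ingest(stream, batch_size, sink):
    """Append each batch to sink (a list standing in for a real store).

    Returns the number of batches written.
    """
    count = 0
    for batch in micro_batches(stream, batch_size):
        sink.append(batch)
        count += 1
    return count
```

With `batch_size=1` this behaves like per-event (real-time) ingestion, a small `batch_size` gives micro-batches, and a `batch_size` at least as large as the input gives a single batch load.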
In-Memory Computing
How will in-memory computing evolve big data platforms?
What is happening with in-memory?
There is a lot of momentum around in-memory processing with Spark and other technologies. What is really going on?
• It’s important to understand the flavors of in-memory
  – Pure DBs, DB acceleration, caches/grids, Hadoop/Spark
• Much of it is not all that new
  – Your DBs today are likely already accelerated with better in-memory caching
• What is new or interesting?
  – Spark (Spark SQL, GraphX, Spark Streaming, MLlib)
  – Spark will steal HDFS workloads
  – Streaming is a form of in-memory and comes in many flavors too
  – Caching
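The "DB acceleration" and "caches/grids" flavors both rest on the same basic idea: keep recently used data in memory and go to the slower store only on a miss. The following Python sketch shows a small read-through LRU cache under that assumption. The class name, the capacity, and the dictionary standing in for a slow database are all hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Small LRU read-through cache in front of a slower backing store."""

    def __init__(self, load_fn, capacity=2):
        self._load_fn = load_fn  # called only on a cache miss
        self._capacity = capacity
        self._items = OrderedDict()  # insertion order tracks recency
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._load_fn(key)  # read through to the backing store
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used
        return value


# A plain dict stands in for a slow database (an assumption).
SLOW_STORE = {"a": 1, "b": 2, "c": 3}
cache = ReadThroughCache(SLOW_STORE.__getitem__, capacity=2)
```

Repeated reads of the same key are served from memory; once the capacity is exceeded, the least recently used entry is dropped and must be reloaded on its next access.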
How In-Memory Evolves the Data Lake
• Current: In-memory distributed databases: SQL & NoSQL
• Future: Tachyon in-memory file system
  – Extending analytical data warehouses to in-memory OLAP
  – Convergence of in-memory OLAP and OLTP
  – Robust handling of Spark RDDs
[Diagram labels, top to bottom]
Tez • HAWQ • Spark • GemFire • GemFire XD
Tachyon (in-memory polyglot file system)
HDFS • NFS • S3 • GlusterFS • Ceph
Business Value
How to achieve business value with analytics based on Hadoop
How do enterprises cross the chasm?
How can enterprises get over the chasm between rapidly implementing analytics (bottom up) and reaping business benefits (top down)?
• Look for shared architecture requirements as you run LOB-specific pilots
  – Focus investment on metrics a LOB exec cares about
  – Demonstrate shared architecture benefits
• Look for ways to automate and scale insights execution
  – Deliver insight at the point of decision
• Stop using security as a blanket cloud objection
  – You likely already have sensitive customer data in the cloud
  – Leverage cloud when data is “soupy”
The Journey to Data Driven Innovation
STORE: Business Data Lake – Store everything
ANALYZE: Big Data Analytics – Generate insights
BUILD: Data-Driven Applications – Operationalize
INNOVATE: Agile Enterprise – Iterate rapidly
[Diagram labels: PDL Data Science • Pivotal Labs Agile • Pivotal CF Services • PDL Data Architecture • Agile Development • Big Data SQL-Based Analytics • Enterprise PaaS]
The Foundation for Data-Driven Enterprise
World’s Leading Experts: Pivotal Labs – Pivotal Data Labs
[Diagram labels]
BATCH
INTERACTIVE: HAWQ • Greenplum DB
Pivotal HD
REAL-TIME: GemFire XD • GemFire
Find out more…
• Pivotal Big Data Suite
• Download Pivotal HD
• Enterprise SQL on Hadoop
• Big Data @ Pivotal blog
  – The Future Architecture of the Data Lake
Thank You