Making Big Data Analytics with Hadoop Fast & Easy (Webinar Slides)
DESCRIPTION
Looking to analyze your Big Data assets to unlock real business benefits today, but sick of all the theory, hype and hoopla? View these slides from Actian and Yellowfin’s "Big Data Analytics with Hadoop" webinar to discover how we’re making Big Data Analytics fast and easy. Hold on as we go from data in Hadoop to dashboard in just 40 minutes. Learn how to combine Hadoop with the most advanced Big Data technologies, and the world’s easiest BI solution, to quickly generate real business value from Big Data Analytics. Watch as we use live CDR data stored in Hadoop – quickly connecting, preparing, optimizing and analyzing this data in a tangible real-world use case from the telecommunications industry – to easily deliver actionable insights to anyone, anywhere, anytime.
To learn more about Yellowfin, and to try its intuitive Business Intelligence platform today, go here: http://www.yellowfinbi.com
To learn more about Actian, and its next generation suite of Big Data technologies, go here: http://www.actian.com/
TRANSCRIPT
Making Big Data Analytics Fast and Easy
Using Actian, Yellowfin and Hadoop
December 16, 2013
John Ryan, Marketing Manager APAC, Actian Corporation
Ryan Templeton, Senior Solutions Architect, Actian Corporation
Ivan Seow, Senior Technical Consultant, Yellowfin
Take Action on Big Data
• Fastest Data Prep Engine
• Fastest Hadoop Loader
• Fastest Single Node Database
• Fastest MPP Database
• Huge library of Analytical Functions
Making BI Easy
• Ranked #1 BI Vendor: Dresner Global BI Study 2012 & 13
• #1 Dashboard Vendor: BARC BI Survey 12
• #1 Enterprise Reporting Vendor: BARC BI Survey 13
• Gartner: ‘Vendor to Consider’
Today’s Agenda
1. Big Data Analytics with Hadoop
2. Making Analytics in Hadoop Fast & Easy
3. Customer Example (Telecom)
4. Demo: From Data to Dashboard
  • Making Hadoop Fast and Easy
  • Making BI Fast and Easy
5. Summary
6 Confidential © 2012 Actian Corporation
Big Data Analytics With Hadoop
Expect to have HDFS in production: 73%
Based on 263 respondents. TDWI Best Practices Report – Q2 2013
Big Data Source for Analytics Most Likely to Benefit from Hadoop: 71%
Based on 263 respondents. TDWI Best Practices Report – Q2 2013
Why is analytics inside Hadoop so hard and slow?
• HDFS is a file system, not a database
• Queries not standard SQL, only resemble SQL
• Need a Data Scientist
• MapReduce inefficient for analytic queries
Making Big Data with Hadoop Fast and Easy With Actian and Yellowfin
Actian Big Data Analytic Platform
Accelerating Big Data 2.0
Diagram labels: Enterprise; Data Value; Business Intelligence; Applications; DW; Big Data Storage
Platform stages: Connect, Prepare, Optimize, Analyze
Advanced technology platform
Industry leading:
• Scale
• Performance
• Complexity
• Cost (price/performance)
• Time to Value
Multiple deployment options:
• On-premise
• Cloud
• Hybrid
• Embedded
Industry Leading Performance
• Process Hadoop Data Faster: Dataflow vs PIG (MapReduce), DBT-3 @ 1TB run times
• Analyze Data Faster: database benchmarks, TPC-H QphH @ 1TB (non-clustered)
Today’s demonstration
Steps: Connect Hadoop, Transform Data, Parallel Load, Fast Database Queries, Fast Analysis
Components: Actian Dataflow, Actian Vector, BI Visualization Layer (Yellowfin BI)
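The demonstration steps above can be illustrated with a minimal sketch. It does not use Actian Dataflow, Actian Vector or Yellowfin: an in-memory CSV string stands in for a file in Hadoop, and Python's built-in sqlite3 stands in for the analytic database. The column names (caller, area_code, status, duration_s) are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for a CDR file stored in Hadoop (hypothetical columns).
RAW = """caller,area_code,status,duration_s
5550001,212,completed,65
5550002,415,failed,0
5550003,212,completed,10
"""


def connect(raw_text):
    """Connect: read raw records (here from a string, not from HDFS)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: keep the fields needed for analysis and cast types."""
    return [(r["area_code"], r["status"], int(r["duration_s"])) for r in rows]


def load(conn, rows):
    """Load: bulk insert into the database (sqlite3 as a stand-in)."""
    conn.execute("CREATE TABLE cdr (area_code TEXT, status TEXT, duration_s INTEGER)")
    conn.executemany("INSERT INTO cdr VALUES (?, ?, ?)", rows)


def query(conn):
    """Query: completed calls and total seconds per area code."""
    sql = (
        "SELECT area_code, COUNT(*), SUM(duration_s) FROM cdr "
        "WHERE status = 'completed' GROUP BY area_code ORDER BY area_code"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    load(db, transform(connect(RAW)))
    print(query(db))
```

A BI layer would sit on top of the final query; in this sketch the result is simply printed.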
Telecom Example: Storing CDR Log Files inside Hadoop
Customer Use Case
• Tier two telecom provider
• Planning for large growth with minimal staff impact
• Business demands deeper insights
IT Challenges
• Collect, manage, process CDR data in Hadoop
• Users are domain experts, not data scientists
• Swamped with data: network switch dumps 200MB/min during peak times. Hundreds of thousands of records per drop. 170 columns.
• Too hard to analyze: raw data must first be distilled and enriched to gain insight
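To put the quoted peak rate in perspective, here is a back-of-envelope calculation. Only the 200MB/min figure comes from the slide; the number of peak hours per day is an assumption made purely for illustration, since the slides do not say how long peak periods last.

```python
# Peak rate quoted on the slide: 200 MB per minute.
PEAK_MB_PER_MIN = 200


def volume_gb(minutes_at_peak, mb_per_min=PEAK_MB_PER_MIN):
    """Decimal gigabytes produced in `minutes_at_peak` minutes at the given rate."""
    return minutes_at_peak * mb_per_min / 1000


if __name__ == "__main__":
    print(f"1 peak hour : {volume_gb(60):.0f} GB")
    # Assumption (not from the slides): four peak hours per day.
    print(f"4 peak hours: {volume_gb(4 * 60):.0f} GB")
```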
What the business was asking for
• Fastest time to decision: speed up processing by an order of magnitude
• Increased granularity of analysis: without increasing processing times or bogging down the backend
• Proactive analysis, not reactive: enable trend analysis and predictive capabilities
• Answer real business questions: e.g. visual insight into near real-time customer and vendor performance, routing performance optimization, etc.
• Scale for future growth: extensible for future capabilities and scalable growth
Specific Business Questions - CDR Analysis
• Answer Service Rate (ASR & Adjusted ASR)
  • Calls completed vs. route attempts (vendor performance)
  • Calls completed vs. call attempts (customer satisfaction)
• Opportunity Monitor
  • Calculate profit/loss per call due to routing path chosen
• Post Dial Delay (PDD)
  • Annoying delay until path through network selected
• Analysis of near real time quality measures
  • Call duration, jitter and packet loss
• Trends and correlations of above metrics
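The two completion ratios and the PDD measure listed above can be sketched in a few lines. The record layout is hypothetical (one record per call attempt, with a count of route attempts), and the slides do not spell out which of the two ratios is labeled "ASR" and which "Adjusted ASR", so the functions are named after what they compute.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CallRecord:
    """Hypothetical, simplified CDR: one record per call attempt."""
    completed: bool       # True if the call was completed
    route_attempts: int   # vendor routes tried for this call (>= 1)
    pdd_ms: float         # post dial delay in milliseconds


def completed_per_route_attempt(calls: List[CallRecord]) -> float:
    """Calls completed vs. route attempts (vendor performance)."""
    attempts = sum(c.route_attempts for c in calls)
    completed = sum(1 for c in calls if c.completed)
    return completed / attempts if attempts else 0.0


def completed_per_call_attempt(calls: List[CallRecord]) -> float:
    """Calls completed vs. call attempts (customer satisfaction)."""
    completed = sum(1 for c in calls if c.completed)
    return completed / len(calls) if calls else 0.0


def mean_pdd_ms(calls: List[CallRecord]) -> float:
    """Average post dial delay across all call attempts."""
    return sum(c.pdd_ms for c in calls) / len(calls) if calls else 0.0


if __name__ == "__main__":
    sample = [
        CallRecord(True, 1, 1200.0),
        CallRecord(True, 3, 4000.0),
        CallRecord(False, 2, 2600.0),
        CallRecord(True, 2, 2200.0),
    ]
    print(completed_per_route_attempt(sample))
    print(completed_per_call_attempt(sample))
    print(mean_pdd_ms(sample))
```

Counting route attempts in the denominator penalizes calls that needed several vendors before connecting, which is why that ratio reads as a vendor measure while the per-call ratio reflects what the caller experienced.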
CDR Workflow Overview
Stages: Connect, Transform, Parallel Data Load
Workflow annotations:
• Filter data
• Logical functions
• Split flow for separate processing rules
• Meta-node encapsulates processing
• Extract failed routing attempts
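The filter and split steps named in the workflow can be sketched as plain Python functions. This is an illustration only, not the Dataflow workflow shown in the webinar; the record fields (caller, status) and the status values are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]

KNOWN_STATUSES = {"completed", "failed"}  # hypothetical status values


def filter_rows(records: Iterable[Record]) -> List[Record]:
    """Filter data: keep records that have a caller and a known status."""
    return [r for r in records if r.get("caller") and r.get("status") in KNOWN_STATUSES]


def split_flow(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Split flow: completed calls go one way, failed routing attempts the other."""
    completed, failed = [], []
    for r in records:
        (completed if r["status"] == "completed" else failed).append(r)
    return completed, failed


def process(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Wraps both steps behind one call, similar in spirit to a meta-node."""
    return split_flow(filter_rows(records))


if __name__ == "__main__":
    raw = [
        {"caller": "5550001", "status": "completed"},
        {"caller": "5550002", "status": "failed"},
        {"caller": "", "status": "completed"},       # dropped: no caller
        {"caller": "5550004", "status": "unknown"},  # dropped: unknown status
    ]
    ok, failed_attempts = process(raw)
    print(len(ok), len(failed_attempts))
```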
Data processing – Execution Plan
Phase 1 (four parallel pipelines): Reader, FilterRows, DeriveFields, Group(partial)
Phase 2 (four parallel pipelines): Repartition, Group(final), Writer
Compiled to a set of physical graphs
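The two-phase plan above follows a common pattern: aggregate partially within each partition, repartition by key, then finish the aggregation. The sketch below shows that pattern in plain Python, run sequentially; it is not Actian Dataflow code. The row layout (area code, call duration in seconds) and the derived "minutes, rounded up" field are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, int]  # (area_code, duration_seconds): hypothetical layout


def phase1_partial(partition: Iterable[Row]) -> Dict[str, int]:
    """Reader, FilterRows, DeriveFields, Group(partial) for one partition."""
    partial: Dict[str, int] = defaultdict(int)
    for area_code, seconds in partition:   # Reader
        if seconds <= 0:                   # FilterRows: drop zero-length calls
            continue
        minutes = -(-seconds // 60)        # DeriveFields: minutes, rounded up
        partial[area_code] += minutes      # Group(partial)
    return dict(partial)


def repartition(partials: List[Dict[str, int]], n_out: int) -> List[List[Tuple[str, int]]]:
    """Send every partial result for a given key to the same output bucket."""
    buckets: List[List[Tuple[str, int]]] = [[] for _ in range(n_out)]
    for partial in partials:
        for key, value in partial.items():
            buckets[zlib.crc32(key.encode()) % n_out].append((key, value))
    return buckets


def phase2_final(bucket: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Group(final): merge the partial sums that arrived in one bucket."""
    final: Dict[str, int] = defaultdict(int)
    for key, value in bucket:
        final[key] += value
    return dict(final)


def run(partitions: List[List[Row]], n_out: int = 2) -> Dict[str, int]:
    partials = [phase1_partial(p) for p in partitions]  # parallel in a real engine
    result: Dict[str, int] = {}
    for bucket in repartition(partials, n_out):
        result.update(phase2_final(bucket))  # Writer: keys are disjoint per bucket
    return result


if __name__ == "__main__":
    data = [[("212", 30), ("415", 61)], [("212", 120), ("415", 0)], [("212", 59)]]
    print(run(data))
```

Grouping partially before the repartition step means only one row per key per partition has to move between workers, rather than every input row.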
Demo: Making Big Data Analytics Fast and Easy
Customer Takeaways
• Actionable Insights (FAST): processing streaming CDR data in seconds
• Analysis (Deeper Analysis): visibility at the Area Code and Exchange level
• Cost Savings: 20,000 updates made to routing tables during the first week of collecting data
• Scalability: 8.9 billion rows of data collected during the first 6 months
Solution Architecture
• End Users: Desktop & Mobile Devices
• Yellowfin BI: Dashboard, Ad Hoc, Statistics, Data Mining, Analytics
• Hadoop: Collection
• Paraccel Dataflow: Extraction, Cleansing, Enrichment, Aggregation, Analysis, Mining
• Vectorwise: Very fast reporting database
• Clustered Execution, Parallel Loading
• OSS/BSS
• Data Retention
Summary – Take Action on Big Data
Accelerating Big Data 2.0
Diagram labels: Enterprise; Data Value; Business Intelligence; Applications; DW; Big Data Storage
Platform stages: Connect, Prepare, Optimize, Analyze
Advanced technology platform
Industry leading:
• Scale
• Performance
• Complexity
• Cost (price/performance)
• Time to Value
Multiple deployment options:
• On-premise
• Cloud
• Hybrid
• Embedded
Questions
Ivan Seow [email protected]
John Ryan [email protected]
Ryan Templeton [email protected]
Actian: www.actian.com
Yellowfin: www.Yellowfin.bi