5 reasons to augment your data warehouse with hadoop webinar
Post on 13-Apr-2017
5 Reasons to Augment Your Enterprise Data Warehouse with Hadoop
Date: 1st October 2015
Presenters:
Nirav Shah, Marketing Head, CIGNEX Datamatics
Bharath Mundlapudi, Co-founder & CTO, Orzota, Inc.
CIGNEX Datamatics: Established in 2000, India, US & UK
8 #1
14
Open Source Products
Open Source Consultants
Pure Play Open Source Services Company
Open Source Implementations
Global Offices
Open Source Community Contributions
Open Source Books Authored
Business Engagement Platforms 13+ 5+
5000+ 500+ 500+
Portals, Content & Collaboration Portals Enterprise Integration Identity Relationship Management
Enterprise Content Document & Web Content Management Learning/Knowledge Management Imaging and Scanning - OCR/Digitization Enterprise & NLP Search BPM/Workflow
E-Commerce B2B B2C
Internet of Things (IoT) Big Data Analytics Data Integration Information Delivery Data Analysis
Open Source Consulting Application Modernization
OpeRA™ - Open Source Readiness Assessment Managed Application & Platform Services
Business Engagement Platforms Big Data Platform – Panoramyx™
IoT Platform – Vitalstatistyx™ Digital Employee Engagement Platform – DEEP™
Reputation Management Platform – RMP™ Franchise Management Platform – FMP™
Concept 5k – The PoC Lab
Pure Play Open Source Consulting Company
Bharath Mundlapudi is Co-founder, President & CTO of Orzota. Bharath was part of the initial team at Yahoo! that built Hadoop. He has extensive advisory experience (both strategy and technical) in Big Data technologies and Data Science solutions for various verticals – Finance, Retail, Manufacturing, and Tech. Prior to that, he was an architect in the Data Science group at Netflix and the Java team at Sun Microsystems.
Presenter Profile
Orzota Inc.: Established in 2012, Silicon Valley, CA & Chennai, India
– Use Case Analysis
– Data Architecture, Design and Augmentation
– Exploratory and Predictive Analytics
– Vendor Evaluation
– PoCs and Implementation
– Performance Optimization
Big Data Management Services
Clients
Orzota is a Big Data solutions company that provides technology-enabled services to help businesses accelerate their big data projects. Its team of skilled data scientists, architects, and engineers has created solutions for customers in a wide variety of industries.
• Typical Enterprise Data Architecture
• Augment Proprietary EDW with Hadoop
– Pain-points of EDW
– Use Cases
– Solution
• Case Study: A top 10 bank in the US
– Solution architecture
– Approach & Best Practices: Augmenting Teradata with Hadoop
– Challenges
– Benefits
• Q & A
Webinar Topics
Typical Enterprise Data Architecture
Transactional Systems (OLTP)
Customer Relationship
Management (CRM)
Enterprise Resource Planning
(ERP)
Data
Warehouse
BI and Reporting Tools
ETL
Staging
Data Mart
Data Mart
ETL
Emerging Enterprise Data Architecture
Efficient Analytical & Operational
Processing
Replication across multiple data centers for 99.999% uptime (<10 mins of downtime / yr)
Scale on demand at reduced TCO
Global Application with Geography specific
Data
Millions of reads & writes
Agile Application Rollouts
Next Generation Enterprise Data Warehouse
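The 99.999% uptime target on this slide can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical and it assumes a 365-day year. It shows that "five nines" leaves a downtime budget of roughly five minutes per year, which is consistent with the slide's bound of under 10 minutes.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(99.999)
print(f"99.999% uptime allows about {five_nines:.2f} minutes of downtime per year")
```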
Typical EDW Pain Points – The 5 Reasons
1. Inability to handle unstructured data: RDBMS and MPP stores are not designed to handle variety.
2. Excessive resource use: ETL and other jobs conflict with analytics use.
3. Wasted storage: Less than 20% of data is hot (actively used).
4. Inefficient backups: Backup to tape is slow and expensive. Restores can cause significant disruption.
5. Expensive disaster recovery: A full data warehouse at the DR site is expensive.
Solution: Hadoop
• Open Source Apache Project
– Framework for large scale data processing
– Uses commodity servers
– Massive scalability
– Distributed and fault-tolerant
– Most dominant big data platform
• Distributed and Supported by:
Solution: Augment your Proprietary EDW with Hadoop
• Backups
– Proprietary EDW: Tape backups are slow and expensive
– Hadoop EDW: Backup to Hadoop is low cost. Recovery is fast.
• Performance
– Proprietary EDW: ETL, apps and analytics requirements compete
– Hadoop EDW: Offload ETL and other apps to Hadoop. Reduce primary EDW usage.
• Storage Capacity
– Proprietary EDW: ~20% of data is hot
– Hadoop EDW: Move warm and cold data to Hadoop
• ROI
– Proprietary EDW: Expensive and proprietary
– Hadoop EDW: Hadoop costs 2-10% of EDW
• & more
• Examples of Hadoop distributions: Cloudera, Hortonworks & MapR
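The storage and ROI figures on this slide can be combined into a rough estimate. The sketch below is not from the webinar: the function name is hypothetical, and it assumes that cost scales linearly with data volume and that the "2-10% of EDW" figure applies per unit of data stored. Under those assumptions, keeping only the hot ~20% in the EDW and moving the rest to Hadoop would cost roughly 22% to 28% of keeping everything in the EDW.

```python
def blended_cost_fraction(hot_fraction: float, hadoop_cost_ratio: float) -> float:
    """Cost of EDW + Hadoop relative to holding all data in the EDW.

    hot_fraction: share of data kept in the EDW (slide figure: ~0.20).
    hadoop_cost_ratio: Hadoop cost per unit of data relative to the EDW
        (slide figure: 0.02 to 0.10). Linear scaling is an assumption.
    """
    if not (0 <= hot_fraction <= 1 and 0 <= hadoop_cost_ratio <= 1):
        raise ValueError("both arguments must be fractions between 0 and 1")
    return hot_fraction + (1 - hot_fraction) * hadoop_cost_ratio


low = blended_cost_fraction(0.20, 0.02)
high = blended_cost_fraction(0.20, 0.10)
print(f"Estimated cost: {low:.1%} to {high:.1%} of the all-EDW baseline")
```

Real savings depend on migration effort, staffing, and how much EDW capacity can actually be retired, none of which this estimate covers.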
EDW Augmentation with Hadoop
Transactional Systems (OLTP)
Customer Relationship
Management (CRM)
Enterprise Resource Planning
(ERP)
Hadoop Cluster Teradata Data Warehouse
BI and Reporting Tools Big Data Analytics
Unstructured Data
Hadoop Use Cases
Network Failure Prediction
Single View of “X” X= Customer, Employee, Partner
Churn Analysis
Fraud Detection Risk Modeling
Data Lake
Search Quality
Targeted Marketing
Recommendation Engine
Operational Analysis
Case Study Top 10 Bank in US
Hadoop Augmentation
Client Overview
• A major US bank wanted to offload its proprietary EDW to Hadoop
– Challenge
• Current system couldn’t scale to support new business use cases
• Upgrades would cost hundreds of millions of dollars
– Objective
• Avoid EDW upgrade and reduce on-going maintenance cost
• Scale the data architecture based on need
Solution Architecture
Hadoop Teradata
Data movement
Data models
Data verification
Data quality
ETL Process
Data models
Data verification
ETL Process
Data Engineers Data Scientists Data Analysts
Mainframe
Scheduler
Data models
Data movement
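The architecture lists data verification on both the Teradata and the Hadoop side. The slides do not describe how the verification was done, so the following is only a generic sketch of one common approach: compare a row count and an order-independent checksum computed over the rows exported from each system. The function name and sample rows are hypothetical, and the sketch assumes both systems export values in the same canonical form (for example, decimals as strings).

```python
import hashlib
from typing import Iterable, Sequence, Tuple


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, int]:
    """Return (row_count, checksum); the checksum ignores row order."""
    count = 0
    checksum = 0
    for row in rows:
        # Canonical text form of the row; repr keeps None and "None" distinct.
        canonical = "\x1f".join(repr(value) for value in row).encode("utf-8")
        digest = hashlib.sha256(canonical).digest()
        # Adding per-row hashes modulo 2**64 makes the result order-independent.
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % (1 << 64)
        count += 1
    return count, checksum


# Hypothetical exports of the same table from the two systems.
source_rows = [(1, "alice", "10.50"), (2, "bob", "7.00")]
target_rows = [(2, "bob", "7.00"), (1, "alice", "10.50")]      # same data, new order
corrupted_rows = [(1, "alice", "10.50"), (2, "bob", "7.01")]   # one changed value

print(table_fingerprint(source_rows) == table_fingerprint(target_rows))
print(table_fingerprint(source_rows) == table_fingerprint(corrupted_rows))
```

A matching fingerprint is strong evidence that two copies agree, but it does not pinpoint which rows differ when they do not; a keyed row-by-row comparison would be needed for that.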
Approach – Augmenting EDW with Hadoop
Define
– Business objective(s)
– Use Case(s)
– Migration strategy
Architect
– Solution Architecture
– Select Right Technology
Execute
– Implement
– Deploy
– Verify & Test
Key EDW Architecture Design to Production Challenges
Business
– Use-case Discovery
– Value from Data
– Project Management
– Technology Strategy
– Talent
– Process
Technical
– Structuring Data
– Functional Gaps
– Floating Point Computation
– Source of Truth
– Key Management
– Verification, Integrity & Quality
– Getting the right data architecture
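The slides do not elaborate on the "Floating Point Computation" challenge. One plausible reading, stated here as an assumption, is that aggregates computed on Hadoop may not match the EDW bit for bit, because floating-point addition is not associative and a distributed engine adds values in a different order. The sketch below illustrates the effect with a plain left-to-right sum and shows two common mitigations: exact summation with `math.fsum` and comparison within a tolerance. The helper names are hypothetical.

```python
import math
from typing import Iterable


def naive_sum(values: Iterable[float]) -> float:
    """Left-to-right floating-point accumulation, as a simple engine might do."""
    total = 0.0
    for value in values:
        total += value
    return total


def aggregates_match(a: float, b: float, rel_tol: float = 1e-9) -> bool:
    """Compare two aggregates within a relative tolerance instead of exactly."""
    return math.isclose(a, b, rel_tol=rel_tol)


values = [0.1] * 10 + [1e16, -1e16]

forward = naive_sum(values)
backward = naive_sum(reversed(values))
print(forward == backward)   # the two orders disagree for this input
print(math.fsum(values))     # correctly rounded sum, independent of order

# Small rounding differences are absorbed by a tolerance-based comparison.
print(aggregates_match(0.1 + 0.2, 0.3))
```

For monetary columns, storing values as fixed-point decimals on both systems avoids the issue entirely, at some cost in performance.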
• Performance
– Improved SLA for many workloads
• Capacity
– Improved capacity of EDW
• Leverage open source technology advancements
– Development tools, Advanced ML libraries etc.
• Lower TCO
– Commodity hardware for Hadoop at very low cost to off-load expensive EDW
– Open source technology reduced licensing costs and vendor dependency
– Accelerated development speed with out-of-box features
Benefits Delivered
Thank you
www.cignex.com
Contact Us
Sales: sales@cignex.com | Jobs: jobs@cignex.com | Others: info@cignex.com
facebook.com/CIGNEXTechnologies youtube.com/cignexglobal twitter.com/cignex www.cignex.com
www.orzota.com
Take a Quick Assessment for FREE http://operaonline.cignex.com
Test Drive Big Data Analytics Engage us for Proof-of-Concept (PoC)
@ US5K
Q & A
2 Hour FREE Consultation
for all attendees
Email us @ info@cignex.com Contact Us @ www.cignex.com/contact-us