Data Monetization: Leveraging Legacy with Big Data
DESCRIPTION
Covering 65 petabytes of active data across 80 countries, 60 million customers, and 7,000 systems, data virtualization lowers total cost of ownership, improves agility, and enables greater business self-service. http://www.cisco.com/c/en/us/products/cloud-systems-management/data-analytics/index.html
TRANSCRIPT
Data Monetization
PUBLIC
1st October 2014
Leveraging Legacy with Big Data
2
7000+ Operational Systems
65 PB of Active Data
3
Making Data an Asset
Legacy bad, heritage good
65 PB of Active Data
Monetization
Overhead $, Asset $
4
What do our businesses want?
AMBITION TECHNOLOGY REQUIREMENTS
A single unified global data platform
Internet speed of change
Advanced Analytical Capabilities
Lowest possible TCO-D
Agile Change
Self Service Analytics
Automated Delivery
5
Data Architecture circa 2011
ETL, WAREHOUSE, ANALYTICS
Source: Ops, Trades, Position, Corp Actions
External: Market Data, Client, Exchange
Integration: ODS, CMF, Staging
Warehouse: Enterprise Logical Model
Division Marts: Product (x4)
Strategic Marts: Function (x4)
Channels: eCommerce, Analytical Tools, Reporting
Layers are linked by ETL steps and Read connections.
6
Data Architecture circa 2011
What's the problem?
Change is too slow
Running costs prohibitively expensive
Multiple platforms giving a partial view of our business
...incredibly unhappy customers
7
The Elephant enters the room
Apache Hadoop
8
Future State Data Architecture
The Data Operating System
Data Sources: RDBMS, XML, API, Files, EUC
Integration (Real-time, Batch): Tag, filter & process
Data Platform: Security, Operations, Metadata Management, Governance
Pluggable Multi-tenant Processing: Batch, Online, Interactive, In-Memory, Search, Graph, SQL & Scripting
Compute and Storage
API Façade (SQL and RESTful)
Consumption: Internal Apps, External Apps, Analytics, Reporting, Business Intelligence
Repositories: DWH, ODS / Data Marts
Apps
9
Agile Data Development
Content driven, business enabled product development
Modelling Sprints, Product Sprints
Load Data
Business Analysis & DQ Check
Integration
Reporting & Dashboards
Advanced Analytics
Data Products
Data Engineering, Data Science
10
The Data Lake
Building a big data enabled Enterprise
“Big Data is the most important technology I’ve seen since Mozilla”
Project Sponsor
OpEx 95%
Apache Hadoop has a TCO-D that is 95% cheaper than the current analytical platforms
5 Weeks, 11 Sources
To... Load & Integrate & Dashboard
11
So we’re done?
Hurdles for Hadoop
AMBITION TECHNOLOGY REQUIREMENTS
Internet speed of change
Advanced Analytical Capabilities
Lowest possible TCO-D
Agile Change
Self Service Analytics
Automated Delivery
A single unified global data platform
12
Issue #1: Global
Immovable data
API Façade (SQL and RESTful)
13
Issue #2: Unified
Consolidating Platforms
Future State Analytics
Legacy RDBMS MPP
Reporting & Dashboards
Advanced Analytics
API Façade (SQL and RESTful)
14
Future State Data Architecture
Driving Principles
RDBMS is becoming legacy... but SQL is not
Virtualization is a key enabler for our Big Data strategy
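As a rough illustration of the principle that SQL outlives any one platform, the sketch below exposes two separate stores through a single SQL view, so the consumer never needs to know where the rows live. It is not the architecture described in these slides: SQLite stands in for both platforms, and the `lake` database name, the `trades` tables, the `all_trades` view, and the sample rows are all hypothetical.

```python
import sqlite3


def build_virtual_view() -> sqlite3.Connection:
    """Expose two separate stores through one SQL view (illustrative only)."""
    # The main database plays the legacy RDBMS; the attached database plays
    # a second platform. Both are in-memory stand-ins, not real systems.
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    conn.execute("CREATE TABLE main.trades (region TEXT, amount REAL)")
    conn.execute("CREATE TABLE lake.trades (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO main.trades VALUES (?, ?)",
        [("EMEA", 100.0), ("APAC", 50.0)],
    )
    conn.executemany(
        "INSERT INTO lake.trades VALUES (?, ?)",
        [("EMEA", 25.0), ("AMER", 75.0)],
    )

    # In SQLite only a TEMP view may reference tables in several attached
    # databases, so the unified view is created as temporary.
    conn.execute(
        """
        CREATE TEMP VIEW all_trades AS
        SELECT region, amount FROM main.trades
        UNION ALL
        SELECT region, amount FROM lake.trades
        """
    )
    return conn


def totals_by_region(conn: sqlite3.Connection) -> dict:
    """Query the view as if it were a single table."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM all_trades "
        "GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())
```

Calling `totals_by_region(build_virtual_view())` sums the rows from both stores per region through the one view; the query text stays the same if either underlying store is later replaced.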
15
Journeys to insight
Successes to date
Sprint 1
Sprint 2 (builds on Sprint 1)
Sprint 3 (builds on Sprint 2)