Getting Real About Data Virtualization - Informatica · 2011-05-02
TRANSCRIPT
5/2/2011
1
Getting Real About Data Virtualization
Informatica Data Services
Ash Parikh
May 3, 2011
2
Defining Data Virtualization
Simple Data Federation Does Not Cut It
• Makes multiple heterogeneous data sources appear as one
• Federates data in real time & also supports physical materialization to the DW
• Abstracts data sources from consumers and insulates them from change
• Hides & handles complexity (quality, rich transformations, business-IT collaboration)
• Lets the business own the data & define the rules while IT retains control
[Figure: HARDWARE VIRTUALIZATION (Logical View of Computing Resources). Labels: Operating Systems; Hardware Abstraction; Enterprise Servers, Enterprise Network, Enterprise Storage. Source: VMWare, Inc.]
[Figure: DATA VIRTUALIZATION (Logical View of Underlying Data). Labels: Data Consumers (Portal, BI, Composite Apps); Data Abstraction; Logical Data Objects (CUSTOMER, ORDER, PRODUCT, INVOICE); Enterprise Data Sources.]
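The first bullets on this slide (heterogeneous sources appearing as one, federated at read time rather than copied) can be illustrated with a minimal Python sketch. It is not Informatica Data Services code: the two sources, the `accounts` table, the `ORDERS_CSV` extract, and the `customer_orders` function are all hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational CRM table held in SQLite.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE accounts (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Source 2 (hypothetical): an order extract delivered as CSV text.
ORDERS_CSV = "account_id,amount\n1,100.0\n1,50.0\n2,75.5\n"


def customer_orders():
    """Logical CUSTOMER view: joins both sources each time it is read.

    Nothing is copied into a warehouse; consumers see one record shape
    and never learn which source a field came from.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        key = int(row["account_id"])
        totals[key] = totals.get(key, 0.0) + float(row["amount"])
    for acc_id, name in crm.execute("SELECT id, name FROM accounts ORDER BY id"):
        yield {"customer_id": acc_id, "name": name, "order_total": totals.get(acc_id, 0.0)}


if __name__ == "__main__":
    for record in customer_orders():
        print(record)
```

Because consumers call only `customer_orders()`, either source could be swapped for another store without changing the consumer, which is the insulation from change that the slide describes.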
3
Here’s Why…
The Traditional DI Process
1. Source
2. Target
3. Mapping
4. Workflow
5. Execute/Test
6. Deploy
[Figure label: Business]
Typical Value Stream Map - Too Much WAIT & WASTE
• The business gets involved late & does not get what it needs
4
The Problem(s)
5
It Takes Too Long to Deliver the Data the Business Needs!
[Figure labels: Business, IT, Trust; Applications, DW, Unstructured Data, Mainframe, DM, DM]
Change Request … Approve & Prioritize … Analyze & Design … Build … Test … Deploy
6
Data Is Everywhere & Growing!
[Figure labels: Business, IT, Trust; Applications, DW/MDM Hub, Unstructured Data, Mainframe, DM (several), Spread Marts]
7
The Impact
8
Reports Take too Long
Reporting Scenario: Ongoing requests for data that is NOT in the DW
Change Request … Approve & Prioritize … Analyze & Design … Build … Test … Deploy
[Figure: timeline between Business and IT, from Change Request to Deploy to Production. Today: 3-6 Months. What if? Weeks/Days.]
• 66% of BI requirements change on a daily to monthly basis
• 71% of respondents said they have to ask data analysts to create custom reports for them
• 36% of custom report requests require a custom cube or data mart to answer the request
• 77% of respondents said it takes between days and months to get their BI requests fulfilled
Source: Forrester Research, “Agile BI: Best Practices for Breaking Through the BI Backlog,” 2010
9
HealthNow Case Study
16 Heterogeneous Enterprise Stores With Large Volumes of Data
Different Price Info in Each BU & 1700 Dev Hours to Add 1 Product
Agreement on MEMBER & Attributes Time-Consuming & Painful
No Reuse of Data Services (SQL, Web services) for BI, MDM & SOA
[Figure labels: Business, IT, No Reuse. Systems shown:]
Product Config Mgmt (MS SQL Server)
Facets [Benefits, Products] (Sybase ASE)
Data Warehouse (DB2)
30,000 Data Marts (MS Access)
BI (Cognos)
Portal (WebSphere)
10
Lean Integration & Data Virtualization
11
Data Virtualization Built on Lean Integration Principles
• Early and ongoing business user involvement for agile DI
The Traditional DI Process
1. Source
2. Target
3. Mapping
4. Workflow
5. Execute/Test
6. Deploy
Original Value Stream Map - Too Much WAIT & WASTE
The New Agile DI Process
1. Logical Data Object
2. Source
3. Integrate - Preview
- Profile at any stage
- Apply rich transformations
- Apply DQ & masking rules on-the-fly
- Federate data without data movement
4. Comment/Tag - Debug
5. Deploy as Reusable Data Services - Web services or SQL
Optimized Value Stream Map - Cut the WAIT & WASTE
[Figure label: Business, shown beside each process]
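The agile process on this slide starts from a logical data object, binds a source to it, applies DQ and masking rules on the fly, and exposes the result for reuse. A minimal Python sketch of that flow follows. It is an illustration only, not the product's API: `LogicalDataObject`, `trim_name`, `mask_ssn`, and the sample MEMBER rows are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, str]
Rule = Callable[[Row], Row]


def trim_name(row: Row) -> Row:
    """Data-quality rule: collapse extra whitespace and title-case the name."""
    return {**row, "name": " ".join(row["name"].split()).title()}


def mask_ssn(row: Row) -> Row:
    """Masking rule: expose only the last four characters of the SSN."""
    ssn = row["ssn"]
    return {**row, "ssn": "*" * (len(ssn) - 4) + ssn[-4:]}


@dataclass
class LogicalDataObject:
    """Step 1: the business-facing object; step 2: the source bound to it."""
    name: str
    source: Callable[[], Iterable[Row]]
    rules: List[Rule] = field(default_factory=list)

    def read(self, limit: Optional[int] = None) -> List[Row]:
        """Step 3: apply every rule at read time; a small limit acts as a preview."""
        out: List[Row] = []
        for i, row in enumerate(self.source()):
            if limit is not None and i >= limit:
                break
            for rule in self.rules:
                row = rule(row)
            out.append(row)
        return out


# Hypothetical MEMBER object over an in-memory source.
member = LogicalDataObject(
    name="MEMBER",
    source=lambda: [
        {"name": "  jane   doe ", "ssn": "123-45-6789"},
        {"name": "JOHN SMITH", "ssn": "987-65-4321"},
    ],
    rules=[trim_name, mask_ssn],
)

if __name__ == "__main__":
    print(member.read(limit=1))  # preview of the first row
    print(member.read())         # full read, as any consumer would call it
```

The rules live on the object rather than in each consumer, so a report, a portal, and a web service that all call `member.read()` get the same cleansed and masked rows.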
12
Self Service – Analyst Empowerment & Business-IT Collaboration
[Figure labels: BI Report, DI Analyst, DI Developer, Data Warehouse, Batch ETL, SQL or Web Service]
• Improved analyst & developer productivity
• Easily map sources to physical & virtual targets
• Quickly find data via integrated business glossary
• Specify transformations with reusable expressions
• Include pre-built rules and mappings (e.g. ETL, DQ)
• Collaborate, test and validate specification results
• Automatically generate ETL mappings and SQL views
13
5x Faster Direct Data Access, Increased Reuse, Improved Governance & Agility
[Figure: "Virtual Table" of logical data objects (PRODUCT, ORDER, MEMBER, CLAIM). Labels: 1 week (vs. 3 months); REUSE; BI Strategy, MDM Strategy, SOA Strategy. Systems shown:]
Product Config Mgmt (MS SQL Server)
Facets [Benefits, Products] (Sybase ASE)
Data Warehouse (DB2)
30,000 Data Marts (MS Access)
BI (Cognos)
Portal (WebSphere)
14
The Case for Making Data Virtualization Part of Your Information Management Best Practices
Informatica Data Services (Data Virtualization)
• Leverage a scalable & reliable engine
• Involve business early to define & validate rules
• Access all data, accelerate profiling
• Abstract, find, govern
• Support multiple styles of data processing
• Rich transforms, DQ & masking rules in real time
• Single-click reuse across all applications
[Figure labels: sources (CRM, Accounts, Call Center); roles (Business Manager; Analyst, Steward; Developer); Common Metadata; Logical Data Objects (Customer: Name, Address, Category, Orders); ETL, DQ, Transformation; Batch, Web Services, EII; Optimizations & Caching; Query Engine; WS Server; callouts numbered 1-7]
15
Next Steps
“SOA Data Integration Architecture Group”
Forrester IaaS (Data Services) Wave
Let’s Talk!
Architect to Architect Webinar Series: Data Virtualization Architecture & Best Practices for Agile Data Integration
May 19, 2011
16