your data any place, any time data warehouse platform
TRANSCRIPT
Your Data Any Place, Any Time
Data Warehouse Platform
Agenda
• Data Warehouse Challenges• Microsoft Data Warehouse Platform
– Build Your DW faster– Manage your DW– Deliver Insights
• Analysts Reviews• Customer References
Low usage of data warehouse Steep learning curve for usersInability to gain insight from dataUnable to collaborate and share data
Deliver
Cant keep pace with growing data volumes and usersConcurrent workloads on the systemDisappearing batch windowsUnable to identify bottlenecks
Manage
Inability to incorporate legacy systems and unstructured dataDirty data from sourcesCant make current decisions based on old dataComplex development tools
Build
Data Warehouse Challenges
Accelerate Business Insightwith SQL Server 2008 for Data Warehousing
Visual Development EnvironmentEnterprise Data Integration platform Connectivity to heterogeneous data sourcesClean and profile data
BuildDeep analytical capabilityEnterprise Scale ReportingClose integration with Office SystemCollaborate and share information with Office SharePoint Server
Deliver
Enterprise ScalabilityVisibility across all data warehouse workloadsContinuous AvailabilitySimplified Manageability
Manage
End to EndBusiness
Intelligence
Agenda
• Data Warehouse Challenges• Microsoft Data Warehouse Platform
– Build Your DW faster– Manage your DW– Deliver Insights
• Analysts Reviews• Customer References
Build Your DW faster
BI Development Studio to build your data warehouse Visual Development environment to build your
reports, cubes and ETL packages Powerful debugging capabilities Development lifecycle support
SQL Server Integration Services provides an Enterprise Data Integration platform Connect to heterogeneous data sources Clean and validate your data Load directly into partitioned tables and cubes Enterprise Scalability to load larger volumes of data
in smaller batch windows Supports continuous loading with
SQL Server Service Broker connectivity
BI Development Studio
Complete, Integrated development tool in BI Development Studio
Build end to end DW solution Create reports Build OLAP cubes and KPI Create Data Mining models Create your end to end data
integration packages Development Lifecycle support
with Visual Studio TFS Integrated into
Visual Studio
Breakthrough ETL Capabilities
Change Data Capture (CDC) Identifies operational changes
MERGE Transact-SQL keyword Addresses common DW load scenarios
SQL Server Integration Services Connectivity to any source Scalable pipeline architecture Comprehensive set of transformations Improved performance of lookup tasks DW smarts handling slowly changing dimensions and direct
loading of cubes Clean and validate using data mining and data quality Built-in data profiling for source and destination Extensible platform through pluggable custom tasks and
Componentized development using .NET
Change Data CaptureGeneralize log-reading capability for all applications, notably DW• Improves DW scale
– Provide change information upfront for efficient processing of dimensions
– Eliminate expensive lookups– DW can be loaded throughout the day or at
specific intervals• Changes are captured and placed in change
tables– Log based capture with minimal impact on source– Captures complete content of changes &
maintains cross table consistency– Change Metadata stored as part of the changes– ETL process calls functions to retrieve changes– Works even across schema changes!
Change Tables
OLTP
Data warehouse
Agenda
Data Warehouse Challenges Microsoft Data Warehouse Platform
Build Your DW faster Manage your DW Deliver Insights
Analysts Reviews Customer References
Manageability
Partition your large data sets into manageable chunks using partitioning
Simplify aggregation of your data using Indexed views
Take advantage of simplified Index Management Use Resource Governor to manage concurrent
workloads Use data compression to reduce the storage
costs and increase query performance by reducing I/O
Use backup compression to reduce storage costs
Resource GovernorSQL Server
Admin Workload
Backup
Admin Tasks
Report WorkloadOLTP Workload
OLTP Activity
Executive
ReportsAd-hoc Reports
High
Min Memory 10%Max Memory 20%
Max CPU 20%
Admin Pool
Max CPU 90%
Application Pool
Ability to differentiate workloads e.g.
app_name, login
Per-request limits Max memory
% Max CPU time Grant timeout Max Requests
Resource monitoring
Data Partitioning
Partitioning breaks a single object into multiple manageable pieces Transparent to the application Allows easy management of very large tables and
indexes The row is the unit of partitioning All partitions run on a single SQL Server database Table fully available while loading, indexing a new
partition Create new and drop old partition quickly
Time
Sliding Window
Archived
HotRead-OnlyNear-Line Read-MostlyObsolete
Simplify Aggregate ManagementUsing Indexed Views
Aggregations can be pre-computed to minimize query execution time
Tables can pre-joined with stored results sets
Automatic query rewrite to use Indexed Views as appropriately reducing response times
Proactive manageability using Database Tuning Advisor to suggest appropriate views
0
50
100
150
200
250
300
Original Query times
With Indexed Views
Simplified Index Management SQL Server 2008 includes Alter Index
for online Index Management Rebuild Indexes without taking
data offline Allows Indexes to be rebuilt
at partition level Significant Improvement in Adding and
Dropping Clustered Indexes Parallel Index Management Identify new or unused Indexes using
Missing Indexes feature – DMV and Show Plan
Manageability Tools
Management Studio to manage your entire BI infrastructure
Declarative Management Framework provides policy-based configuration management
Dynamics Management Views to get real time insight into the database
SQL Query Analyzer provides prescriptive guidance on tuning query performance
Management Objects allows integration into existing management infrastructure
Enterprise Scalability
Sophisticated Indexing functionality to query large amounts of data
Parallel Query Execution to handle large numbers of concurrent users
Partitioned Query Processing that enables queries to span multiple partitions
Increase query performance in large data warehouses by enabling Star Schema Query Optimizations
Star Join Improvement
Pain points: Slow star join query with grouping and
aggregation No fact table partition elimination through join on
date dimension. Bad cardinality estimates due to “Gatekeeper row”
problem: presence of one row in filtered result from dimension table drastically affects # of matching rows in fact table.
Star Join Enhancement Features
• Improved recognition of star join pattern in query (available in some cases since SQL Server 7.0)• Enhanced rules to generate query plan alternatives for Star Join• Improved costing of Star Join plans• Multiple bitmap filter query execution strategy added
Data Compression
Key goal: shrink DW fact tables Secondary goal: improve query performance
Less I/O (but higher CPU) Higher buffer hit rates
2X to 7X compression ratio for real DW fact data anticipated, depending on data
Orthogonal to other features Works for data and indexes Techniques: page level dictionary,
differential encoding, variable-length encoding of ints, dates, don’t store 0s.
High Availability
Use Snapshot Isolation to take read snapshots of your data
Use Database Mirroring to minimize data loss and reduce down time
Backup and Restore partitions to reduce downtime
Agenda
Data Warehouse Challenges Microsoft Data Warehouse Platform
Build Your DW faster Manage your DW Deliver Insights
Analysts Reviews Customer References
Deliver Insights
Deep Analytical Capabilities using SQL Server Analysis Services Unified Dimension Models Key Performance Indicators Predictive Analysis using integrated Data Mining Scalable Analytical Engine that can handle large analytical data
sets with larger number of users Present the information to the organization using SQL Server
Reporting Services Ad-hoc reporting using Report Builder Supports OLAP and relational sources
Deliver rich information via Office System 2007 Better Integration with Excel 2007, including full predictive
analysis through data mining Rich Visualization using Microsoft Office PerformancePoint Server Collaborate and share Excel sheets, reports and dashboards using
Office SharePoint Server 2007
Analysis ServicesEnhanced OLAP and Data Mining Capabilities
Decision Trees
Clustering Time Series
Sequence Clustering
AssociationNaïve Bayes
Neural Net
plus… Logistic
Regression Linear
Regression Text Mining
Unified Dimension ModelMetadata layer for analytical applicationsProvides single version of truth for attributes and calculationsEnable deep analysis using Aggregation Engine
Advanced Business Intelligence
Key Performance IndicatorsDrive processes using Actions
Open Interface to End User Tools via Web Services
OLAP Leadership
http://www.olapreport.com
Unprecedented 3rd year as market growth leaderHighest growth Rate in industry
Reporting ServicesEnhanced Reporting Capabilities
Enterprise Reporting Build Complex reports using
integrated Visual Designer Present data from both relational
and OLAP sources Multiple delivery Options Enable Adhoc reporting via Report
Builder Collaborate and share
reports via with Office SharePoint Server
Provide reports over the Internet
Scalable Reporting Engine to drive reports of any size or complexity
Analyze data in ExcelPublish spreadsheets and documents to SharePoint
Reuse spreadsheets, reports, and documents in dashboards
and business applications
Access and explore data in spreadsheets, reports, and documents in the Portal
Best withSQL Server
Business Intelligence With Office 2007
Demo
End to End BI Demo
“The 2005 results reveal significant strengths for Microsoft SQL Server in the very large database (VLDB) market for both data warehouse (DW) and transaction processing (OLTP) systems.”
“They can scale and handle data warehouses of any size; we know that now,”… “I routinely talk to SQL Server customers that have built 7TB data warehouses or Windows Datacenter [systems] with 3,000 users.”
Donald FeinbergVP & Distinguished Analyst
“The recent release of the 2006 Gartner DW DBMS Magic Quadrant demonstrates the significant progress Microsoft has made in the Data Warehouse (DW) Database Management Systems (DBMS) market, moving from deep placement in the challenger’s quadrant to riding the line between leader and challenger.”.
Industry Accolades
"Microsoft's BI Tools revenue growth in 2005 was more than 25 percent, growing at more than twice the rate of the overall market. Microsoft has seen strong growth over the past several years as it has expanded and enhanced its database-embedded BI features …. Microsoft's impact on the BI tools market cannot be overemphasized. “ Dan Vesset
SQL Server 2005 TB+ Customers
5.1 TB Credit Card DW, 5 Mil card holders300 power users, complex query, OLAP, SSIS, SSRS, Office BSM
4 TB Consumer Packaged Goods (CPG) DW350 users, complex relational query, SSIS & SSRS
3 TB Retail Data Warehouse on HP SuperdomeUses SQL Server OLAP, SSIS, SSRS
2 TB DW, originally migrated from Informix300 users, complex query, OLAP & Data Mining
2 TB of Clinical Data & growing, 50 TB storageUS Dept of Veterans Affairs, 1200+ facilities, 40+ OLAP Marts
2.2 TB SAP Business Warehouse systemIn the process of migrating SAP R/3 OLTP to SS2005
Resources
Build Your Data Warehouse faster Learn for free, SQL Server BI technologies Develop ETL packages with SQL Server Integration Services Project REAL — Business Intelligence in Practice Data Warehousing Webcasts – presented by Kimball Group
Manage your Data Warehouse Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server Managing Aggregations with SQL Server 2005 Indexed Views Scale Out using SQL Server 2005 for DW Workloads Microsoft Data Warehouse
Deliver insights to your organization Microsoft BI SQL Server Reporting Services SQL Server Analysis Services Predictive Analysis using Data Mining
Learn more about Microsoft’s presence in BI & DW Microsoft Data Warehouse customer references Microsoft BI Market Share info Winter Corp Survey results
© 2004 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
Appendix
SQL Server Momentum – Winter Corp Survey
Great Progress in Winter Corporation’s Data Warehousing TopTen 2005 Survey Largest data warehousing entry at 19.5 terabytes Eighth largest data warehousing database in the
world Eight entries at or more than 4 terabytes The largest showing ever:
23 data warehousing entries in all
Five times more data warehousing entries compared with previous roundhttp://www.microsoft.com/sql/prodinfo/compare/wintercorp-survey.mspx
Instant insight to health of entire networkOperations now 30-50% more efficientImproved network troubleshooting and manageabilityDevelopers not taxed with report generationFaster delivery of all types of reports
Developed NetMon based on the SQL Server BI platformSQL SSIS to integrate heterogeneous data into a single data warehouseAnalysis Services for building and analyzing OLAP cubesReporting Services to instantly report on network health
Event data logged differently for various areas of network Dispersed among separate databases, Oracle, MySQL and SQL ServerNeeded an integrated BI platform for all dataNeeded a single version of network health
“NetMon should helps us be 30-50% more efficient…we’ll be able to shift a good amount of energy toward new proactive projects” –Tim Winebarger, Manager NOC Tools Group, Cox Comm.
Customer: Cox CommunicationsLeading network service provider – telephony, cable TV, internet
CustomerBusiness Challenge Solution Customer
Results/Benefits
Customer Results/Benefit
s
Solution: Microsoft BI
Customer Business Challenge
50% reduction in development time over OracleSQL Server Integration Services shows up to 400% performance improvementDeliver more meaningful, up-to-date info
End to end business intelligence with: SQL Server, Reporting Services, Analysis Services, Integration Services and Data Mining1.6 TB Raw Data, 64-bit, 20-way server
Instances of store out of stockInappropriate stock levelsDifficult to perform analysis of large data volumesOverburdened operational systems
Project ObjectiveCapture transaction level sales and store stock position to increase sales and improve profitability
Single version of the truth for store managers and corporate executivesImproved productivity for store managersEnhanced attention to margin driversNew opportunities aftermarket sales (technical assistance program) and better inventory tracking to reduce fraudROI estimated at $6 million
SQL Server data warehouse Analysis Services cubes with 15+ dimensionsProClarity – Web-based reports for stores and desktop client for corporateSolution Decision Factors: Time to market and cost
150+ disparate, disconnected legacy systems Integrated reporting system for 200+ stores, districts, and corporateMaintain competitive advantage through customer innovationAgile and flexible infrastructure with immediate, demonstrable ROI
Customer: CompUSA Largest North American Retailer
CustomerBusiness Challenge Solution Customer
Results/Benefits
“We're looking for an ROI next year of about $6 million, and that's a very conservative estimate—one that just looks at taking advantage of the aftermarket sales these reports will help generate.”
–Cathy Witt, Vice President and CIO, CompUSA
Edcon DW System
Scenario/Business Top retailer in South Africa, 650 stores, 11K employees
Diversified retailing conglomerate with centralized back-end ops Multiple store lines
Won data warehouse from Oracle Workload/Architecture 9 TB of total data in DW
5 production servers, multiple instances, based on retail store line Growth of approximately 1 TB per month total All ETL uses DTS
4 TB single instance for customer data Front-ended by SQL Server OLAP cubes
All ad hoc query workload with ProClarity 300 users total growing to 500 by YE04
DB Platform Currently running on 32-bit systems Currently evaluating 64-bit HW platforms
Storage plans to grow to 70 TB total, with 15 TB allocated for DW
Clalit Health Services Patient DW Scenario/Business Oldest and largest HMO in Israel, covers 60% (3.7 mil) of the population
15 hospitals, 1300 clinics, 400 pharmacies, 32K employees (6K doctors) IBM Informix on time Sun Solaris migration in 2002
67% TCO savings, 25% perf improvements: 15x in some OLAP queries Workload/Architecture 5 TB of total data, multiple instances
1 TB in largest single SQL Server instance Ad hoc query workload 15-20 concurrent users at this time, 20 concurrent in-flight queries often
OLAP cubes using Microsoft Analysis Services ISV front-end app: ProClarity ETL: Ascential Software’s Datastage ETL Patient data collected from 2 S/390, 2 AS400, 150 Unix servers
Also runs main clinic system OLTP on SQL Server Migrating 1000 local SQL Servers in clinics to central DB, 2 TB expected
DB Platform Windows Server 2003, Unisys 8-way, 64-bit, 32 GB RAM, EMC SAN
© 2004 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.