dw-bi - best practice for dw with sql-2008r2

Post on 22-Aug-2014

Best Practices for Data Warehousing with SQL Server 2008 R2

SQL Server Technical Article

Writers:

Mark Whitehorn, Solid Quality Mentors
Keith Burns, Microsoft
Eric N. Hanson, Microsoft

Technical Reviewer: Eric N. Hanson, Microsoft

Published: December 2010

Applies To: SQL Server 2008 R2, SQL Server 2008

Summary: There is considerable evidence that successful data warehousing projects often produce a very high return on investment. Over the years a great deal of information has been collected about the factors that lead to a successful implementation versus an unsuccessful one. These are encapsulated here into a set of best practices, which are presented with particular reference to the features in SQL Server 2008 R2. The application of best practices to a data warehouse project is one of the best investments you can make toward the establishment of a successful Business Intelligence infrastructure.

Copyright

This is a preliminary document and may be changed substantially prior to final commercial release of the software described herein.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2010 Microsoft Corporation. All rights reserved.

Microsoft, Excel, PerformancePoint Server, SharePoint Server, SQL Server 2008 R2, Visual Basic.Net, Visual C#, Visual C++, and Visual Studio are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Table of Contents

Introduction
Benefits of Using Microsoft Products for Data Warehousing and Business Intelligence
    Best Practices: Creating Value for Your Business
        Find a sponsor
        Get the architecture right at the start
        Develop a Proof of Concept
        Select Proof of Concept projects on the basis of rapid ROI
        Incrementally deliver high value projects
Designing Your Data Warehouse/BI solution
    Best Practices: Initial Design
        Keep the design in line with the analytical requirements of the users
        Use data profiling to examine the distribution of the data in the source systems
        Design from the start to partition large tables, particularly large fact tables
        Plan to age out old data right from the start
    Best Practices: Specifying Hardware
        Design for maintenance operation performance, not just query performance
        Specify enough main memory so most queries never do I/O
ETL
    Best Practices: Simplify the ETL Process and Improve Performance
        Use SSIS to simplify ETL programming
        Simplify the transformation process by using Data Profiling tasks
        Simplify using MERGE and INSERT INTO
        Terminate all SQL statements with a semi-colon in SQL Server 2008 R2
        If you cannot tolerate downtime, consider using ping pong partitions
        Use minimal logging to load data precisely where you want it as fast as possible
        Simplify data extraction by using Change Data Capture in the SQL Server source systems
        Simplify and speed up ETL with improved Lookup
Relational Data Warehouse Setup, Query, and Management
    Best Practices: General
        Use the resource governor to reserve resources for important work such as data loading, and to prevent runaway queries
        Carefully plan when to rebuild statistics and indexes
    Best Practices: Date/time
        Use the correct time/date data type
        Consider using datetime2 in some database ports
    Best Practices: Compression and Encryption
        Use PAGE compression to reduce data volume and speed up queries
        Use backup compression to reduce storage footprint
    Best Practices: Partitioning
        Partition large fact tables
        Partition-align your indexed views
        Design your partitioning scheme for ease of management first and foremost
        For best parallel performance, include an explicit date range predicate on the fact table in queries, rather than a join with the Date dimension
    Best Practice: Manage Multiple Servers Uniformly
        Use Policy-Based Management to enforce good practice across multiple servers
    Additional Resources
Analysis
    Best Practices: Analysis
        Use PowerPivot for end-user analysis
        Seriously consider the best practice advice offered by AMO warnings
        Use MOLAP writeback instead of ROLAP writeback
        Use SQL Server 2008 R2 Analysis Services backup rather than file copy
        Write simpler MDX without worrying about performance
        Scale out if you need more hardware capacity and hardware price is important
Reporting
    Best Practices: Data Presentation
        Allow IT and business users to create both simple and complex reports
        Present data in the most accessible way possible
        Present data in reports that can be understood easily
        Present data to users in familiar environments
    Best Practices: Performance
        Structure your query to return only the level of detail displayed in the report
        Filter by using parameters that are passed to the query
        Sort within the query
        Avoid using subreports inside a grouping
        Limit the data in charts to what a user can see
        Pre-sort and pre-group the data in your query
        Use drillthrough rather than drilldown when detail data volumes are large
        Avoid complex expressions in the page header and footer
        Turn off CanGrow on textboxes and AutoSize on images if possible
        Do not return columns that you are not going to use from your query
    Best Practices: System Architecture and Performance ...
