Exploring Scalability, Performance and Deployment

Upload: rsnarayanan

Post on 25-Dec-2014

Page 1: Exploring Scalability, Performance And Deployment
SSIS Exploring Scalability, Performance and Deployment

Vinod Kumar M
Technology Evangelist – DB and BI
Microsoft
www.ExtremeExperts.com

Objectives and Takeaways

A high-level view
• Design considerations
• How to measure performance
• Performance implications of architecture
• Manageability aspects of SSIS
• Deployment tips

Out of scope
• Prescriptive guidance for specific situations

Agenda

• Quick Introduction
• Understanding Buffers and Memory
• OVAL Concept Detailed
• Component Specific Notes
• Manageability Features
• Deployment Considerations

Introduction

SSIS Life Cycle Tools

Design the SSIS Package
• Business Intelligence Development Studio (Visual Studio)
• Migration wizard for pre-SQL Server 2005 packages
• Version control integration (VSS)

Deployment/Execution
• Deployment utility to copy packages
• Command-line execution (dtexec.exe) and its graphical front end (dtexecui.exe)
• Flexible configuration options

Supportability
• Rich per-package logging
• SQL Server Management Studio for monitoring running packages and organizing stored packages
• Checkpoints for restartability
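
As a rough illustration of command-line execution, the sketch below assembles a dtexec call from Python. The `/File`, `/ConfigFile`, `/CheckPointing`, `/CheckFile` and `/Set` options are dtexec options; the helper names, package path and property path are hypothetical and only there for the example.

```python
import subprocess


def build_dtexec_command(package_path, config_file=None,
                         checkpoint_file=None, overrides=None):
    """Build the argument list for a dtexec run (helper name is hypothetical)."""
    cmd = ["dtexec", "/File", package_path]
    if config_file:
        # Apply a configuration file when the package loads.
        cmd += ["/ConfigFile", config_file]
    if checkpoint_file:
        # Enable checkpoint restartability and name the checkpoint file.
        cmd += ["/CheckPointing", "on", "/CheckFile", checkpoint_file]
    for property_path, value in (overrides or {}).items():
        # /Set takes "propertyPath;value".
        cmd += ["/Set", f"{property_path};{value}"]
    return cmd


def run_package(cmd):
    """Run dtexec; an exit code of 0 means the package executed successfully."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical package and variable, shown only to illustrate the shape.
    command = build_dtexec_command(
        r"C:\packages\LoadSales.dtsx",
        config_file=r"C:\packages\test.dtsConfig",
        overrides={r"\Package.Variables[User::BatchSize].Properties[Value]": 5000},
    )
    print(" ".join(command))
```

Building the argument list separately from running it keeps the command easy to log and to inspect before anything is executed.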

Deep dive - Performance

Buffers and Memory

Buffers are based on design-time metadata
• The width of a row determines the size of the buffer
• Smaller rows = more rows in memory = greater efficiency

Memory copies are expensive!
• A buffer might have placeholder columns filled by downstream components
• Pointer magic where possible
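
The row-width point can be made concrete with a little arithmetic. The sketch below is a simplified model, not the engine's exact sizing algorithm: it assumes the Data Flow task defaults of 10 MB for DefaultBufferSize and 10,000 for DefaultBufferMaxRows, and the function name is made up for illustration.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # DefaultBufferSize default, in bytes
DEFAULT_BUFFER_MAX_ROWS = 10_000        # DefaultBufferMaxRows default


def rows_per_buffer(row_width_bytes,
                    buffer_size=DEFAULT_BUFFER_SIZE,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Estimate how many rows fit in one buffer (simplified model)."""
    if row_width_bytes <= 0:
        raise ValueError("row width must be positive")
    # A buffer holds whole rows only and is capped by the row limit.
    return min(max_rows, buffer_size // row_width_bytes)


if __name__ == "__main__":
    for width in (200, 1_000, 4_000):
        print(width, "bytes per row ->", rows_per_buffer(width), "rows per buffer")
```

Under this model a 200-byte row hits the row cap, while a 4,000-byte row fits only about a quarter as many rows, so trimming unneeded columns directly increases the rows handled per buffer.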

Component Types

Row based (synchronous outputs)
• Logically works at a row level
• Buffer reused
• Examples: Data Conversion, Derived Column

Partially blocking (asynchronous outputs)
• May logically work at a row level
• Data copied to new buffers
• Examples: Merge, Merge Join, Union All

Blocking (asynchronous outputs)
• Needs all input buffers before producing any output rows
• Data copied to new buffers
• Examples: Aggregate, Sort
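
The difference between row-based and blocking components can be mimicked with Python generators. This is only a conceptual analogy, assuming dictionaries stand in for buffer rows; the function names are hypothetical and nothing here is SSIS API.

```python
def derived_column(rows):
    """Row-based (synchronous): each row is updated in place and passed on."""
    for row in rows:
        row["total"] = row["qty"] * row["price"]  # fills a placeholder column
        yield row  # same object, no copy


def sort_rows(rows, key):
    """Blocking (asynchronous): consumes all input before emitting any output."""
    # sorted() builds a new list, much like copying data into new buffers.
    yield from sorted(rows, key=lambda r: r[key])


if __name__ == "__main__":
    source = [{"qty": 2, "price": 5.0}, {"qty": 1, "price": 3.0}]
    for row in sort_rows(derived_column(source), key="total"):
        print(row)
```

The first function can hand a row downstream as soon as it arrives, while the second cannot emit anything until the last input row has been read.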

CPU Utilization

Execution Tree
• Starts from a source or an asynchronous output
• Ends at a destination or an input that has no synchronous outputs

Each execution tree can get a worker thread
• Use the Data Flow task's EngineThreads property to control parallelism
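
The execution tree rule can be sketched as a small graph walk. This is an illustrative model only: the data flow is assumed to be given as (upstream, downstream) pairs plus a set of components with asynchronous outputs, and the function name is hypothetical.

```python
from collections import deque


def execution_trees(edges, async_components):
    """Split a data flow into execution trees (simplified model)."""
    downstream, has_upstream, components = {}, set(), set()
    for src, dst in edges:
        downstream.setdefault(src, []).append(dst)
        has_upstream.add(dst)
        components.update((src, dst))

    # A tree starts at a source or at a component with asynchronous outputs.
    sources = sorted(c for c in components if c not in has_upstream)
    starts = sources + sorted(
        c for c in async_components if c in downstream and c not in sources
    )

    trees = []
    for start in starts:
        members = [start]
        queue = deque(downstream.get(start, []))
        while queue:
            node = queue.popleft()
            if node in members:
                continue
            members.append(node)
            # The tree ends at an input with no synchronous outputs.
            if node not in async_components:
                queue.extend(downstream.get(node, []))
        trees.append(members)
    return trees


if __name__ == "__main__":
    flow = [
        ("Flat File Source", "Derived Column"),
        ("Derived Column", "Sort"),
        ("Sort", "OLE DB Destination"),
    ]
    for tree in execution_trees(flow, async_components={"Sort"}):
        print(" -> ".join(tree))
```

In this example the Sort splits the flow into two trees, so up to two worker threads could be used, subject to the thread limit.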

Performance Strategy

Use OVAL to identify the factors affecting data integration performance:
• Operations: What logic should be applied to the data?
• Volume: How much data must be processed?
• Application: Which app is best suited to these operations on this volume of data? For example, use SQL Server or SSIS for sorting data?
• Location: Where should the app run? For example, on a shared server, or on a standalone machine?

An OVAL Example—Loading a Text File

Simple scenario…

Interesting performance considerations!
• Text file on Server 1
• SQL Server on Server 2

Understand all operations performed

Operations
1. Open a transaction on SQL Server
2. Read data from the text file
3. Load data into the SSIS data flow
4. Load the data into SQL Server
5. Commit the transaction

Beware of hidden operations
• Data conversion in either step 3 or 4
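
The five operations, including the hidden conversion, can be traced in a few lines. This sketch uses Python's built-in sqlite3 as a stand-in for SQL Server and a hypothetical three-column `sales` table, so it shows the shape of the work rather than SSIS itself.

```python
import csv
import io
import sqlite3


def load_text(lines, conn):
    """Load comma-separated text into the (hypothetical) sales table."""
    # Steps 1 and 5: the connection context manager wraps the inserts in a
    # transaction and commits on success or rolls back on error.
    with conn:
        reader = csv.reader(lines)  # step 2: read data from the text
        # Step 3: the hidden operation, converting strings to typed values.
        rows = ((int(r[0]), r[1], float(r[2])) for r in reader)
        # Step 4: load the rows into the database.
        conn.executemany(
            "INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)", rows
        )


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
    load_text(io.StringIO("1,East,10.5\n2,West,3\n"), connection)
    print(connection.execute("SELECT COUNT(*) FROM sales").fetchone()[0])
```

Writing the conversion out explicitly makes it easier to see that it costs CPU wherever it runs, whether in the data flow or at the destination.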

Volume

Reduce where possible
• Don't push unneeded columns
• Use a Conditional Split to filter rows
• Do not parse or convert columns unnecessarily
• In a fixed-width format you can combine adjacent unneeded columns into one
• Leave unneeded columns as strings
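
The fixed-width tips can be illustrated with a small parser. The layout below is hypothetical: three adjacent unneeded columns are declared as a single 30-character filler field, and only the columns that are actually used get converted.

```python
# (name, width, converter); a converter of None means skip without parsing.
LAYOUT = [
    ("order_id", 8, int),
    ("filler", 30, None),   # adjacent unneeded columns combined into one
    ("amount", 10, float),
]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width line, converting only the needed columns."""
    record, pos = {}, 0
    for name, width, convert in layout:
        raw = line[pos:pos + width]
        pos += width
        if convert is not None:
            record[name] = convert(raw.strip())
    return record


if __name__ == "__main__":
    sample = "00000042" + "x" * 30 + "     19.99"
    print(parse_line(sample))
```

Fewer declared columns means fewer slices per row, and skipping conversion for the filler avoids parsing work whose result would be thrown away.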

Application

Is SSIS right for this?
• Overhead of starting up an SSIS package may offset any performance gain over BCP for small data sets.
• Is BCP good enough?
• Is the greater manageability and control of SSIS needed?

Bulk Insert Task vs. Data Flow

Location

Consider the following configuration…
• Text file on Server 1
• SQL Server on Server 2

Where should SSIS run? (Licensing issues aside)

Measuring Performance

OVAL does not provide prescriptive guidance
• Too many variables

Improve performance by applying OVAL and measuring
• SSIS logging
• Performance counters
• SQL Server Profiler, for extract queries, lookups and loading
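
Outside the SSIS tooling, the measure-then-adjust loop can be approximated with a small timing wrapper. This is a generic sketch, not an SSIS logging API; the function name and the statistics it returns are assumptions made for the example.

```python
import time


def measure(stage_name, func, rows):
    """Run one stage over a list of rows and report its throughput."""
    start = time.perf_counter()
    result = func(rows)
    elapsed = time.perf_counter() - start
    rate = len(result) / elapsed if elapsed > 0 else float("inf")
    stats = {
        "stage": stage_name,
        "rows": len(result),
        "seconds": elapsed,
        "rows_per_sec": rate,
    }
    return result, stats


if __name__ == "__main__":
    data = list(range(100_000))
    _, stats = measure("double", lambda rows: [r * 2 for r in rows], data)
    print(stats)
```

Recording the same figures before and after each change keeps the comparison honest when several factors vary at once.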

Parallelism
• Focus on the critical path
• Utilize available resources

Memory constrained
Reader and CPU constrained

Let it rip!
Optimize the slowest
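
"Optimize the slowest" follows from the fact that a pipeline cannot run faster than its slowest stage. A minimal sketch, with hypothetical stage names and made-up throughput figures:

```python
def find_bottleneck(stage_rows_per_sec):
    """Return (stage, rate) for the slowest stage, which bounds throughput."""
    if not stage_rows_per_sec:
        raise ValueError("no stages measured")
    return min(stage_rows_per_sec.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    rates = {"read file": 120_000, "convert": 45_000, "load": 80_000}
    stage, rate = find_bottleneck(rates)
    print(f"Optimize '{stage}' first: about {rate} rows/sec")
```

Speeding up any other stage leaves end-to-end throughput unchanged until the bottleneck itself moves.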

Moving Ahead

Manageability Features

• Logging and Log Providers
• Checkpoint Restartability
• Precedence Constraints
• Configurations
• SSIS Service

Checkpointing

1. Package loads
2. Checkpoint file created
3. Data Flow Task runs, then write checkpoint
4. Data Flow Task runs, then write checkpoint
5. Send Mail Task runs, then write checkpoint
6. Package completes
7. Checkpoint file deleted
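
The checkpoint life cycle can be imitated in a few lines to show why a restarted package skips completed work. This is a conceptual sketch only: the JSON file format and the function name are assumptions and bear no relation to the real SSIS checkpoint file.

```python
import json
import os


def run_with_checkpoints(tasks, checkpoint_path):
    """Run (name, callable) tasks in order, resuming after the last checkpoint."""
    done = []
    if os.path.exists(checkpoint_path):
        # A leftover file means the previous run failed part-way through.
        with open(checkpoint_path) as fh:
            done = json.load(fh)

    executed = []
    for name, action in tasks:
        if name in done:
            continue  # completed in an earlier run
        action()      # an exception here leaves the checkpoint file in place
        done.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)  # write checkpoint

    # Package completed: the checkpoint file is no longer needed.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return executed
```

If the second task fails, the file still lists the first task, so the next run starts at the second task instead of repeating the first.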

Configuration Scenario

Multiple configurations
• Dev: Dev DB
• Test: Test DB
• Production: Prod DB

Machines where packages are being designed / tested / executed

Configuration updates package on load with DB locations (and mail server, file share locations…)

Package handoff
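
The idea of a configuration updating the package on load can be sketched as a per-environment overlay on design-time defaults. All server names, shares and setting keys below are hypothetical, and the function is a conceptual stand-in rather than how SSIS stores configurations.

```python
# Hypothetical per-environment values; only the differences are listed.
CONFIGURATIONS = {
    "dev": {"db_server": "DEVSQL01", "mail_server": "devmail"},
    "test": {"db_server": "TESTSQL01", "mail_server": "testmail"},
    "production": {"db_server": "PRODSQL01", "mail_server": "mail",
                   "file_share": r"\\prodfs\etl"},
}


def load_package(package_defaults, environment, configurations=CONFIGURATIONS):
    """Return package settings with the environment's values applied on load."""
    settings = dict(package_defaults)  # the package itself is left unchanged
    settings.update(configurations[environment])
    return settings


if __name__ == "__main__":
    defaults = {"db_server": "localhost", "mail_server": "localhost",
                "file_share": r"C:\etl"}
    print(load_package(defaults, "test"))
```

Because the package file is identical in every environment, handing it from development to test to production only requires pointing it at a different configuration.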

Precedence constraints

• Directs flow from object to object…
• Basically, 'when do I move on'
• Success, Failure, Completion, or one of those plus an expression (condition)

Dataflow Task → SendMail Task, with one of:
• Success
• Completion
• Failure
• Success & expression
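
The 'when do I move on' decision can be written as a small predicate. This sketch models only the cases listed above (a constraint, optionally combined with an expression that must also hold); the enum and function names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Constraint(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    COMPLETION = "completion"


def should_run_next(outcome, constraint, expression=None, variables=None):
    """Decide whether the downstream task runs after the upstream one ends."""
    if constraint is Constraint.COMPLETION:
        matches = True  # runs whether the upstream task succeeded or failed
    else:
        matches = outcome.value == constraint.value
    if expression is not None:
        # Constraint plus expression: both must be satisfied.
        matches = matches and bool(expression(variables or {}))
    return matches


if __name__ == "__main__":
    # "Success & expression": send mail only if rows were actually loaded.
    print(should_run_next(Outcome.SUCCESS, Constraint.SUCCESS,
                          expression=lambda v: v.get("row_count", 0) > 0,
                          variables={"row_count": 250}))
```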

Tackle the basics …
Manageability …

Deployment Flow

Tools to organize and 'copy' packages and supporting files

BI Studio
• Design package
• Add configurations
• Add miscellaneous files
• Set project deployment properties
• Build

User
• Copy/move deployment folder\files

Installation Wizard
• Choose destination (SQL Server or file system)
• Modify protection level
• Choose location of supporting files
• Change configurations
• Execute Installation Wizard

SQL Agent
• Create desired agent jobs

SQL Management Studio

• Utilizes the SSIS service
• Allows monitoring of currently executing packages
• Maintains stored package structure
• Ad hoc package execution

Simple flow …
Deployment …

SSIS: Summary

Fast!
• Data flows process large volumes of data efficiently, even through complex operations
• Exceptional price/performance on multi-core

Feature Rich
• Many pre-built adapters and transformations reduce hand coding
• Extensible object model enables specialized custom or scripted components
• Highly productive visual environment speeds development and debugging
• Integral part of a complete BI stack (IS-AS-RS)

Beyond ETL
• Enables integration of XML, RSS and Web Services data
• Data cleansing features enable "difficult" data to be handled during loading
• Data and text mining allow "smart" handling of data for imputation of incomplete data, conditional processing of potential problems, or smart escalation of issues such as fraud detection

Your Feedback is Important!

Please fill out the feedback form

Questions !!!

Thank you (in Hindi, Gujarati, Bengali, Punjabi, Odia, Tamil, Telugu, Kannada and Malayalam)

question & answer

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.