sqlserverdays2012_ssis_cdc

26
Partner Logo Here SQL SERVER DAYS 2012 the perfect marriage SSIS & CDC IN SQL SERVER 2012 BI consultant koen verbeeck

Upload: koenverbeeck

Post on 19-Jan-2015

1.077 views

Category:

Technology


0 download

DESCRIPTION

Slidedeck of SQL Server Days 2012 session about SSIS and Change Data Capture. Created by Koen Verbeeck.

TRANSCRIPT

Page 1: SQLServerDays2012_SSIS_CDC

SQL SERVER DAYS 2012

the perfect marriage

SSIS & CDC IN SQL SERVER 2012

BI consultantkoen verbeeck

Page 2: SQLServerDays2012_SSIS_CDC

WHO AM I?

Page 3: SQLServerDays2012_SSIS_CDC

OUTLINE

• introduction to change data capture + demo

• control flow tasks+ demo

• data flow components+ demo

• Oracle as a source

• conclusion

Page 4: SQLServerDays2012_SSIS_CDC

INTRODUCTION

• change data capture (CDC)• designed to capture insert/update/delete data

o make it available in “easily consumed” relational formato pretty much DIY in TSQL

• introduced in SQL Server 2008o technically hasn’t changed in SQL Server 2012

• alternative for change tracking

• typically used in data warehouse scenarioso speed up extraction by incremental loads

• why?• more data everyday• less time to process it

o full load vs. incremental load I want my data real-time!

Page 5: SQLServerDays2012_SSIS_CDC

INTRODUCTION

• CDC versus change tracking

CDC change tracking

which changes? insert/update/delete insert/update/delete

shows type of DML? yes yes

methodology? asynchronous synchronous

column changed? yes yes

historical data? yes no

editions? Enterprise only all

Page 6: SQLServerDays2012_SSIS_CDC

DEMOshowcase “2008” CDC

Page 7: SQLServerDays2012_SSIS_CDC

OUTLINE

• introduction to change data capture + demo

• control flow tasks+ demo

• data flow components+ demo

• Oracle as a source

• conclusion

Page 8: SQLServerDays2012_SSIS_CDC

CONTROL FLOW TASKS

• CDC control task

• controls lifecycle of CDC packageso handles initial load synchronizationo management of LSN rangeso error scenarios and recovery

• stores state in SSIS variable andin database table

• two operationso initial loado incremental load

Page 9: SQLServerDays2012_SSIS_CDC

DEMOinitial load package

Page 10: SQLServerDays2012_SSIS_CDC

CONTROL FLOW TASKS

• initial load package

• also possible from snapshot

Page 11: SQLServerDays2012_SSIS_CDC

OUTLINE

• introduction to change data capture + demo

• control flow tasks+ demo

• data flow components+ demo

• Oracle as a source

• conclusion

Page 12: SQLServerDays2012_SSIS_CDC

DATA FLOW COMPONENTS

• CDC source• reads a range of change data• all / net changes

o and everything in between

Page 13: SQLServerDays2012_SSIS_CDC

DATA FLOW COMPONENTS

• CDC splitter• basically a conditional split on steroids

o … but without a GUI

• splits flow of changed data intoinserts, updates and deletes

• do not combine with ALL CDC processing modeo order between operations is lost

Page 14: SQLServerDays2012_SSIS_CDC

DEMOincremental load package

Page 15: SQLServerDays2012_SSIS_CDC

DATA FLOW COMPONENTS

• incremental load package

• remember to run it twice…• use the __$reprocessing column to identify rows in need of attention

Page 16: SQLServerDays2012_SSIS_CDC

OUTLINE

• introduction to change data capture + demo

• control flow tasks+ demo

• data flow components+ demo

• Oracle as a source

• conclusion

Page 17: SQLServerDays2012_SSIS_CDC

ORACLE AS A SOURCE

• Oracle CDC components by Attunity• not installed by default• can be found on installation media /Tools/AttunityCDCOracle

CDC Designer

log miner

CDC Service mirror

cdc

figure modeled after slide of Matt Masson [16]

Page 18: SQLServerDays2012_SSIS_CDC

ORACLE AS A SOURCE

• Oracle CDC Service• prepare SQL Server

o creates MSXDBCDC database + objects

• local system accounto logon as a policy permission

• CDC service master passwordo used to encrypt Oracle credentials

• one service per Oracle DB

Page 19: SQLServerDays2012_SSIS_CDC

ORACLE AS A SOURCE

• Oracle CDC Designer• create and manage CDC instances

o multiple instances per service possible

• containso Oracle DB infoo tables and columns being trackedo mirroring info

Page 20: SQLServerDays2012_SSIS_CDC

OUTLINE

• introduction to change data capture + demo

• control flow tasks+ demo

• data flow components+ demo

• Oracle as a source

• conclusion

Page 21: SQLServerDays2012_SSIS_CDC

CONCLUSION

• no new technical enhancements• enhancements are on manageability level• provided by Attunity

• new SSIS tasks/components• CDC control task• CDC source• CDC splitter

• two types of packages• initial load• incremental load• remember the __$reprocessing column

• possible to use Oracle as CDC source• CDC service & designer

Page 22: SQLServerDays2012_SSIS_CDC

RESOURCES

• CDC 2008 & Change Tracking• [1] MSDN - Change Data Capture

http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx

• [2] MSDN - Comparing Change Data Capture and Change Trackinghttp://msdn.microsoft.com/en-us/library/cc280519(v=sql.105).aspx

• [3] Kendra Little - Change Data Capture vs Change Trackinghttp://www.littlekendra.com/2010/06/23/cdcvsct/

• CDC & SSIS 2012• [4] MSDN - CDC Control Task

http://msdn.microsoft.com/en-us/library/hh758674.aspx

• [5] MSDN - CDC Sourcehttp://msdn.microsoft.com/en-us/library/hh758686.aspx

• [6] MSDN - CDC Splitterhttp://msdn.microsoft.com/en-us/library/hh758656.aspx

• [7] MSDN - CDC Flow Componentshttp://msdn.microsoft.com/en-us/library/hh231087.aspx

• [8] MSDN - Define a State Variablehttp://msdn.microsoft.com/en-us/library/hh758667

• [9] Allan Mitchell - SSIS CDC Processing Mode Not Updating Flag Attributeshttp://copperblueblog.com/2012/06/06/ssis-cdc-processing-mode-not-updating-flag-attributes/

Page 23: SQLServerDays2012_SSIS_CDC

RESOURCES

• CDC & SSIS 2012 (continued)• [10] Allan Mitchell - CDC Package Needs Executing Twice

http://copperblueblog.com/2012/06/06/ssis-change-data-capture-cdc-package-needs-executing-twice/

• [11] Allan Mitchell - Changes Made To Source During Full Loadhttp://copperblueblog.com/2012/06/07/ssis-cdc-changes-made-to-source-during-full-load/

• [12] Allan Mitchell - Batch Sizes, Fast Load, Commit Size And The OLE DB Destinationhttp://copperblueblog.com/2012/10/02/batch-sizes-fast-load-commit-size-and-the-ole-db-destination/

• [13] Matt Masson - CDC in SSIS for SQL Server 2012http://www.mattmasson.com/index.php/2011/12/cdc-in-ssis-for-sql-server-2012-2/

• [14] Matt Masson - CDC Control Task Operationshttp://www.mattmasson.com/index.php/2012/02/cdc-control-task-operations/

• [15] Matt Masson - Processing Modes for the CDC Sourcehttp://www.mattmasson.com/index.php/2012/01/processing-modes-for-the-cdc-source/

• [16] Channel9 - TechEd North America (Matt Masson) - Incremental ETL Using CDC for SQL and Oracle with SQL Server Integration Services (SSIS) 2012http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DBI322

Page 24: SQLServerDays2012_SSIS_CDC

RESOURCES

• CDC & Oracle• [17] TechNet Wiki - Installing Microsoft SQL Server 2012 Change Data

Capture for Oracle by Attunityhttp://social.technet.microsoft.com/wiki/contents/articles/7647.installing-microsoft-sql-server-2012-change-data-capture-for-oracle-by-attunity.aspx

• [18] SSIS blog MSDN - CDC for Oracle in SQL Server 2012http://blogs.msdn.com/b/mattm/archive/2012/03/26/cdc-for-oracle-in-sql-server-2012.aspx

• [19] SSIS blog MSDN - Video about CDC for Oracle Databases in SSIS 2012http://blogs.msdn.com/b/mattm/archive/2012/07/10/video-about-cdc-for-oracle-databases-in-ssis-2012.aspx

Page 25: SQLServerDays2012_SSIS_CDC

Q&A

Page 26: SQLServerDays2012_SSIS_CDC

26

Partner Logo Here

THANK YOU

[email protected]@Ko_Verhttp://www.linkedin.com/in/kverbeeck