sqlserverdays2012_ssis_cdc
DESCRIPTION
Slidedeck of SQL Server Days 2012 session about SSIS and Change Data Capture. Created by Koen Verbeeck.TRANSCRIPT
SQL SERVER DAYS 2012
the perfect marriage
SSIS & CDC IN SQL SERVER 2012
BI consultantkoen verbeeck
WHO AM I?
OUTLINE
• introduction to change data capture + demo
• control flow tasks+ demo
• data flow components+ demo
• Oracle as a source
• conclusion
INTRODUCTION
• change data capture (CDC)• designed to capture insert/update/delete data
o make it available in “easily consumed” relational formato pretty much DIY in TSQL
• introduced in SQL Server 2008o technically hasn’t changed in SQL Server 2012
• alternative for change tracking
• typically used in data warehouse scenarioso speed up extraction by incremental loads
• why?• more data everyday• less time to process it
o full load vs. incremental load I want my data real-time!
INTRODUCTION
• CDC versus change tracking
CDC change tracking
which changes? insert/update/delete insert/update/delete
shows type of DML? yes yes
methodology? asynchronous synchronous
column changed? yes yes
historical data? yes no
editions? Enterprise only all
DEMOshowcase “2008” CDC
OUTLINE
• introduction to change data capture + demo
• control flow tasks+ demo
• data flow components+ demo
• Oracle as a source
• conclusion
CONTROL FLOW TASKS
• CDC control task
• controls lifecycle of CDC packageso handles initial load synchronizationo management of LSN rangeso error scenarios and recovery
• stores state in SSIS variable andin database table
• two operationso initial loado incremental load
DEMOinitial load package
CONTROL FLOW TASKS
• initial load package
• also possible from snapshot
OUTLINE
• introduction to change data capture + demo
• control flow tasks+ demo
• data flow components+ demo
• Oracle as a source
• conclusion
DATA FLOW COMPONENTS
• CDC source• reads a range of change data• all / net changes
o and everything in between
DATA FLOW COMPONENTS
• CDC splitter• basically a conditional split on steroids
o … but without a GUI
• splits flow of changed data intoinserts, updates and deletes
• do not combine with ALL CDC processing modeo order between operations is lost
DEMOincremental load package
DATA FLOW COMPONENTS
• incremental load package
• remember to run it twice…• use the __$reprocessing column to identify rows in need of attention
OUTLINE
• introduction to change data capture + demo
• control flow tasks+ demo
• data flow components+ demo
• Oracle as a source
• conclusion
ORACLE AS A SOURCE
• Oracle CDC components by Attunity• not installed by default• can be found on installation media /Tools/AttunityCDCOracle
CDC Designer
log miner
CDC Service mirror
cdc
figure modeled after slide of Matt Masson [16]
ORACLE AS A SOURCE
• Oracle CDC Service• prepare SQL Server
o creates MSXDBCDC database + objects
• local system accounto logon as a policy permission
• CDC service master passwordo used to encrypt Oracle credentials
• one service per Oracle DB
ORACLE AS A SOURCE
• Oracle CDC Designer• create and manage CDC instances
o multiple instances per service possible
• containso Oracle DB infoo tables and columns being trackedo mirroring info
OUTLINE
• introduction to change data capture + demo
• control flow tasks+ demo
• data flow components+ demo
• Oracle as a source
• conclusion
CONCLUSION
• no new technical enhancements• enhancements are on manageability level• provided by Attunity
• new SSIS tasks/components• CDC control task• CDC source• CDC splitter
• two types of packages• initial load• incremental load• remember the __$reprocessing column
• possible to use Oracle as CDC source• CDC service & designer
RESOURCES
• CDC 2008 & Change Tracking• [1] MSDN - Change Data Capture
http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx
• [2] MSDN - Comparing Change Data Capture and Change Trackinghttp://msdn.microsoft.com/en-us/library/cc280519(v=sql.105).aspx
• [3] Kendra Little - Change Data Capture vs Change Trackinghttp://www.littlekendra.com/2010/06/23/cdcvsct/
• CDC & SSIS 2012• [4] MSDN - CDC Control Task
http://msdn.microsoft.com/en-us/library/hh758674.aspx
• [5] MSDN - CDC Sourcehttp://msdn.microsoft.com/en-us/library/hh758686.aspx
• [6] MSDN - CDC Splitterhttp://msdn.microsoft.com/en-us/library/hh758656.aspx
• [7] MSDN - CDC Flow Componentshttp://msdn.microsoft.com/en-us/library/hh231087.aspx
• [8] MSDN - Define a State Variablehttp://msdn.microsoft.com/en-us/library/hh758667
• [9] Allan Mitchell - SSIS CDC Processing Mode Not Updating Flag Attributeshttp://copperblueblog.com/2012/06/06/ssis-cdc-processing-mode-not-updating-flag-attributes/
RESOURCES
• CDC & SSIS 2012 (continued)• [10] Allan Mitchell - CDC Package Needs Executing Twice
http://copperblueblog.com/2012/06/06/ssis-change-data-capture-cdc-package-needs-executing-twice/
• [11] Allan Mitchell - Changes Made To Source During Full Loadhttp://copperblueblog.com/2012/06/07/ssis-cdc-changes-made-to-source-during-full-load/
• [12] Allan Mitchell - Batch Sizes, Fast Load, Commit Size And The OLE DB Destinationhttp://copperblueblog.com/2012/10/02/batch-sizes-fast-load-commit-size-and-the-ole-db-destination/
• [13] Matt Masson - CDC in SSIS for SQL Server 2012http://www.mattmasson.com/index.php/2011/12/cdc-in-ssis-for-sql-server-2012-2/
• [14] Matt Masson - CDC Control Task Operationshttp://www.mattmasson.com/index.php/2012/02/cdc-control-task-operations/
• [15] Matt Masson - Processing Modes for the CDC Sourcehttp://www.mattmasson.com/index.php/2012/01/processing-modes-for-the-cdc-source/
• [16] Channel9 - TechEd North America (Matt Masson) - Incremental ETL Using CDC for SQL and Oracle with SQL Server Integration Services (SSIS) 2012http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DBI322
RESOURCES
• CDC & Oracle• [17] TechNet Wiki - Installing Microsoft SQL Server 2012 Change Data
Capture for Oracle by Attunityhttp://social.technet.microsoft.com/wiki/contents/articles/7647.installing-microsoft-sql-server-2012-change-data-capture-for-oracle-by-attunity.aspx
• [18] SSIS blog MSDN - CDC for Oracle in SQL Server 2012http://blogs.msdn.com/b/mattm/archive/2012/03/26/cdc-for-oracle-in-sql-server-2012.aspx
• [19] SSIS blog MSDN - Video about CDC for Oracle Databases in SSIS 2012http://blogs.msdn.com/b/mattm/archive/2012/07/10/video-about-cdc-for-oracle-databases-in-ssis-2012.aspx
Q&A