DB2 9.7: IBM Data Movement Tool

Enable applications from Oracle to DB2 the easy way

Skill Level: Intermediate

Vikram S. Khatri
Certified Consulting I/T Specialist
IBM

19 Jun 2009

Updated 28 Jan 2010

This article presents a very simple and powerful tool that enables applications from Oracle to be run on IBM® DB2® Version 9.7 for Linux®, UNIX®, and Windows®. The tool can also be used to move data from various other database management systems to DB2 for Linux, UNIX, and Windows and DB2 for z/OS®.

Introduction

Beginning with DB2 V9.7 for Linux, UNIX, and Windows, the Migration Toolkit (MTK) is not required in order to use applications from Oracle on DB2 products. This tool replaces the MTK functionality with a greatly simplified workflow.

For all other scenarios, for example, moving data from a database to DB2 for z/OS, this tool supports the MTK, particularly in the area of high-speed data movement. Using this tool, as much as 4 TB of data has been moved in just three days.

A GUI provides an easy-to-use interface for the novice, while the command line API is often preferred by the advanced user.

Preparation

IBM Data Movement Tool. © Copyright IBM Corporation 2009, 2010. All rights reserved.


Download

First, download the tool from the Download section to your target DB2 server. Additional steps are required to move data to DB2 for z/OS. (Check for the latest available version of the tool.)

Installation

Once you have downloaded the IBMDataMovementTool.zip file, extract the files into a directory called IBMDataMovementTool on your target DB2 server. A server-side install (on DB2) is strongly recommended to achieve the best data movement performance.

Prerequisites

• DB2 V9.7 should be installed on your target server if you are enabling an Oracle application to be run on DB2 for Linux, UNIX, and Windows.

• Java™ version 1.5 or higher must be installed on your target server. To verify your current Java version, run the java -version command. By default, Java is installed as part of DB2 for Linux, UNIX, and Windows in <install_dir>\SQLLIB\java\jdk (Windows) or /opt/ibm/db2/V9.7/java/jdk (Linux).

Table 1. Location of JDBC drivers for your source database and DB2

Database                            JDBC drivers
Oracle                              ojdbc5.jar or ojdbc6.jar or ojdbc14.jar, xdb.jar, xmlparserv2.jar or classes12.jar or classes111.jar for Oracle 7 or 8i
SQL Server                          sqljdbc5.jar or sqljdbc.jar
Sybase                              jconn3.jar
MySQL                               mysql-connector-java-5.0.8-bin.jar or latest driver
PostgreSQL                          postgresql-8.1-405.jdbc3.jar or latest driver
DB2 for Linux, UNIX, and Windows    db2jcc.jar, db2jcc_license_cu.jar or db2jcc4.jar, db2jcc4_license_cu.jar
DB2 for z                           db2jcc.jar, db2jcc_license_cisuz.jar or db2jcc4.jar, db2jcc4_license_cisuz.jar
DB2 for i                           jt400.jar
MS Access                           Optional Access_JDBC30.jar

Environment setup

• UNIX: Log in to your server as the DB2 instance owner.

developerWorks® ibm.com/developerWorks


• Windows: Launch a DB2 Command Window.

• Change to the IBMDataMovementTool directory. The tool is a JAR file with two driver scripts to run the tool.

IBMDataMovementTool.cmd - Command script to run the tool on Windows.
IBMDataMovementTool.sh - Command script to run the tool on UNIX.
IBMDataMovementTool.jar - JAR file of the tool.
Pipe.dll - A DLL required on Windows if the pipe option is used.

Create the DB2 target database

Because a database connection to the target is required to run the tool, the DB2 database must be created first. On DB2 V9.7, we recommend that you use the default automatic storage and choose a 32KB page size. When enabling applications to be run on DB2 V9.7, the instance and the database must be operating in compatibility mode. It is also recommended that you adjust the rounding behavior to match that of Oracle. You can deploy objects out of dependency order by setting the revalidation semantics to deferred_force.

On UNIX systems

$ db2set DB2_COMPATIBILITY_VECTOR=ORA
$ db2set DB2_DEFERRED_PREPARE_SEMANTICS=YES
$ db2stop force
$ db2start
$ db2 "create db testdb automatic storage yes on /db2data1,/db2data2,/db2data3 DBPATH ON /db2system PAGESIZE 32 K"
$ db2 update db cfg for testdb using auto_reval deferred_force
$ db2 update db cfg for testdb using decflt_rounding round_half_up

On Windows systems

C:\> db2set DB2_COMPATIBILITY_VECTOR=ORA
C:\> db2set DB2_DEFERRED_PREPARE_SEMANTICS=YES
C:\> db2stop force
C:\> db2start
C:\> db2 "create db testdb automatic storage yes on C:,D: DBPATH ON E: PAGESIZE 32 K"
C:\> db2 update db cfg for testdb using auto_reval deferred_force
C:\> db2 update db cfg for testdb using decflt_rounding round_half_up

Extracting objects and data

Before you run the tool, have the following information for your source and DB2 server ready:

• IP Address or Host Name of the source and DB2 servers


• Port numbers to connect

• Names of the databases, SID, subsystem name, and so on, as required

• A user ID with DBA privileges on the source database

• Password for that user

• Location of your source database and DB2 JDBC drivers

• Enough space, or volume/mount point information, where the data will be stored

Run IBMDataMovementTool.cmd on Windows or ./IBMDataMovementTool.sh on UNIX. The tool starts a GUI if the server is capable of displaying graphics. Otherwise, it switches to the interactive command line mode to gather input.

On Windows:
IBMDataMovementTool.cmd

On UNIX:
chmod +x IBMDataMovementTool.sh
./IBMDataMovementTool.sh

What is the DB2_COMPATIBILITY_VECTOR?
The DB2_COMPATIBILITY_VECTOR is used to place both the DB2 V9.7 instance and database into an Oracle compatible mode. For details, see the DB2 V9.7 Information Center.

You will now see a GUI window. Some messages should also appear in the shell window. Please look through these messages to ensure no errors were logged before you start using the GUI.

If you have not set DB2_COMPATIBILITY_VECTOR, the tool will report a warning. Please follow the steps to set the compatibility vector if you have not done so.

[2010-01-10 17.08.58.578] INPUT Directory = .
[2010-01-10 17.08.58.578] Configuration file loaded: 'jdbcdriver.properties'
[2010-01-10 17.08.58.593] Configuration file loaded: 'IBMExtract.properties'
[2010-01-10 17.08.58.593] appJar : 'C:\IBMDataMovementTool\IBMDataMovementTool.jar'
[2010-01-10 17.08.59.531] DB2 PATH is C:\Program Files\IBM\SQLLIB
[2010-01-10 17.35.30.015] *** WARNING ***. The DB2_COMPATIBILITY_VECTOR is not set.
[2010-01-10 17.35.30.015] To set compatibility mode, discontinue this program and run the following commands
[2010-01-10 17.35.30.015] db2set DB2_COMPATIBILITY_VECTOR=FFF
[2010-01-10 17.35.30.015] db2stop force
[2010-01-10 17.35.30.015] db2start

Using the Graphical User Interface


The GUI screen shown in Figure 1 has fields for specifying the source and DB2 database connection information. The sequence of events in this screen is:

1. Specify source and DB2 connection information.

2. Click on Connect to Oracle to test the connection.

3. Click on Connect to DB2 to test the connection.

4. Specify the working directory where DDL and data are to be extracted to.

5. Choose whether you want DDL, DATA, or both. If you select only DDL, an additional genddl script will be generated.

6. Click on the Extract DDL/Data button. You can monitor progress in the console window.

7. After the data extraction has completed successfully, go through the result output files for the status of the data movement, warnings, errors, and other potential issues.

8. Optionally, you can click on the View Script/Output button to check the generated scripts, DDL, data, or the output log file.

9. Click on the Deploy DDL/Data button to create tables and indexes in DB2 and load the data that was extracted from the source database.

10. You can use Execute DB2 Script to run the generated DB2 scripts instead of running them from the command line. Data movement is an iterative exercise. If you need to drop all tables before you start fresh, you can select the drop table script and execute it. You can also use this button to execute the scripts in the order you want them to be executed.

Figure 1. Input parameters for source and DB2 database


After clicking on the Extract DDL/Data button, you will notice the tool's messages in the View File tab, as shown in Figure 2:

Figure 2. Extract DDL and Data


After completing the extraction of DDL and DATA, you will notice several new files created in the working directory. These files can be run against DB2 from the command line.

Configuration files

The following command scripts are regenerated each time you run the tool in GUI mode. However, you can use these scripts to perform all data movement steps without the GUI. This is helpful when you want to embed the tool in a batch process to automate data movement.

Table 2. Command scripts

File name: IBMExtract.properties
Description: This file contains all input parameters that you specified through your GUI or command line input values. You can edit this file manually to modify or correct parameters. Note: This file is overwritten each time you run the GUI.

File name: unload
Description: This script is created by the tool. It unloads data from the source database server to flat files, if you check the DDL and Data options. The same script moves data from the source database to DB2 using pipes, if you check the pipe option in the GUI to eliminate intermediate flat files. The pipe option is controlled through the usePipe option in the IBMExtract.properties file.

File name: rowcount
Description: This script is created by the tool, and you can run it after deploying data to verify row counts in the source and DB2 databases.

Figure 3. Files created after data extraction


Using command line mode

You can run the tool in command line mode, particularly when GUI capability is not available. The tool switches modes automatically if it is not able to start the GUI. If you want to force the tool to run in interactive command line mode, specify the -console option to the IBMDataMovementTool command.

On Windows:
IBMDataMovementTool -console

On UNIX:
./IBMDataMovementTool.sh -console

You will be presented with interactive options to specify the source and DB2 database connection parameters in a step-by-step process. Sample output from the console window is shown below:

[2010-01-10 20.08.05.390] INPUT Directory = .
[2010-01-10 20.08.05.390] Configuration file loaded: 'jdbcdriver.properties'
[2010-01-10 20.08.05.390] Configuration file loaded: 'IBMExtract.properties'
[2010-01-10 20.08.05.390] appJar : 'C:\IBMDataMovementTool\IBMDataMovementTool.jar'
Debug (Yes) : 1
Debug (No) : 2
Enter a number (Default=2) :
IS TARGET DB2 LOCAL (YES) : 1
IS TARGET DB2 REMOTE (NO) : 2
Enter a number (Default=1) :
Extract DDL (Yes) : 1
Extract DDL (No) : 2
Enter a number (Default=1) :
Extract Data (Yes) : 1
Extract Data (No) : 2
Enter a number (Default=1) :
Enter # of rows limit to extract. (Default=ALL) :
Enter # of rows limit to load data in DB2. (Default=ALL) :
Compress Table in DB2 (No) : 1
Compress Table in DB2 (YES) : 2
Enter a number (Default=1) :
Compress Index in DB2 (No) : 1
Compress Index in DB2 (YES) : 2
Enter a number (Default=1) :
******* Source database information: *****
Oracle : 1
MS SQL Server : 2
Sybase : 3
MS Access Database : 4
MySQL : 5
PostgreSQL : 6
DB2 z/OS : 7
DB2 LUW : 8
Enter a number (Default 1) :
DB2 Compatibility Feature (DB2 V9.7 or later) : 1
No Compatibility feature : 2
Enter compatibility feature (Default=1) :

Deploying objects and loading data

Create database target objects

After extraction of the DDL and DATA, you have three different ways of deploying the extracted objects in DB2.


• Click the Deploy DDL/DATA button from the GUI screen

• Go to the Interactive Deploy tab and deploy objects in a step-by-step process

• Deploy DDL/DATA using the command line script db2gen

Which option you choose depends on your data and object movement requirements. If you are migrating only non-PL/SQL DDL objects and DATA, using the db2gen script or clicking the Deploy DDL/DATA button from the GUI will suffice.

The interactive deploy option is likely your better choice when you are also deploying PL/SQL objects such as triggers, functions, procedures, and PL/SQL packages.

The GUI screen shown in Figure 4 is used for interactive deployment of DDL and other database objects. The sequence of events in this screen is:

1. Ensure you are connected to DB2 using the Extract/Deploy tab.

2. Click on the Interactive Deploy tab.

3. Use the Open Directory button to select the working directory containing the previously extracted objects. The objects are read and listed in a tree view.

4. You can deploy all objects by pressing the Deploy All Objects button on the toolbar. Most objects will deploy successfully while others may fail.

5. When you click on an object that failed to deploy in the tree view, you can see the source of the object in the editor window. The reason for the failure is listed in the deployment log below.

6. The Oracle compatibility mode generally allows deployment of objects as is. However, there may still be unsupported features that prevent successful deployment of some objects out of the box. Using the editor, you can adjust the source code of these objects to work around any issues. When you deploy the changed object, the new source is saved with a backup of the old source.

7. You can select one or more objects using the CTRL key and click the Deploy Selected Objects button on the toolbar to deploy objects after they have been edited. Deployment failures often occur in a cascade, which means that once one object is successfully deployed, others that depend on it will also deploy.


8. Repeat steps 5 through 7 until all objects have been successfully deployed.

Figure 4. Interactive deploy of the objects

Compare row counts

• Go to the root directory of the data movement and run the rowcount script.

• You should see a report generated in the "<source database name>.tables.rowcount" file. The report contains row counts from both the source and target databases.

oracle : db2
"TESTCASE"."CALL_STACKS" : 123 "TESTCASE"."CALL_STACKS" : 123
"TESTCASE"."CLASSES" : 401 "TESTCASE"."CLASSES" : 401
"TESTCASE"."DESTINATION" : 513 "TESTCASE"."DESTINATION" : 513
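With hundreds of tables, checking such a report by eye is error-prone. The following is a minimal sketch, not part of the tool, that flags lines whose two counts differ. It assumes every data line has exactly the layout of the sample report above (name, colon, count, twice, separated by whitespace) and that table names contain no spaces; the function name check_rowcounts is hypothetical.

```shell
#!/bin/sh
# check_rowcounts FILE
# Prints every report line whose source and DB2 counts differ and
# returns non-zero if at least one mismatch is found.
# Assumed line layout (taken from the sample report):
#   "SCHEMA"."TABLE" : 123 "SCHEMA"."TABLE" : 123
check_rowcounts() {
    awk '
        # Data lines have six whitespace-separated fields; the header
        # line ("oracle : db2") has only three and is skipped.
        NF == 6 && $2 == ":" && $5 == ":" {
            if ($3 != $6) {
                printf "MISMATCH %s source=%s db2=%s\n", $1, $3, $6
                bad++
            }
        }
        END { if (bad > 0) exit 1 }
    ' "$1"
}
```

Calling check_rowcounts with the path of the generated report prints nothing and returns 0 when all counts agree, so it can gate a later step in an automation script.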

Use pipes to move data

When the source database is too large and there is not enough space to hold intermediate data files, using pipes is the recommended way to move the data.


On Windows systems

The tool uses Pipe.dll to create Windows pipes. Make sure that this DLL is placed in the same directory as the IBMDataMovementTool.jar file.

On UNIX systems

The tool creates UNIX pipes using the mkfifo command and uses them to move data from the source to DB2.
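To illustrate the mechanism only: a named pipe lets a producer stream rows to a consumer without an intermediate file on disk. The sketch below is not what the tool runs. An awk loop stands in for the unload process and wc -l stands in for the DB2 load, and the function name pipe_demo is hypothetical.

```shell
#!/bin/sh
# pipe_demo
# Streams 1000 generated "rows" from a producer to a consumer through
# a named pipe (FIFO) and prints the number of rows received.
pipe_demo() {
    dir=$(mktemp -d) || return 1
    fifo="$dir/t1.pipe"
    mkfifo "$fifo" || return 1

    # Producer (stand-in for the unload): writes rows into the pipe in
    # the background. The open blocks until a reader attaches.
    awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > "$fifo" &

    # Consumer (stand-in for the load): reads rows as they arrive.
    rows=$(wc -l < "$fifo")

    wait            # reap the background producer
    rm -rf "$dir"   # the FIFO itself never stored the data on disk
    echo $rows      # unquoted on purpose: trims padding some wc versions add
}
```

Because the data only ever lives in the pipe buffer, the space needed is independent of the table size, which is the property the pipe option relies on.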

Before you can use a pipe between the source and DB2 database, the table definitions must already exist. Follow this procedure:

1. Specify # Extract Rows=1 in the GUI or set LimitExtractRows=1 in the IBMExtract.properties, if you're using the command line window.

2. Click on the Extract DDL/Data button to unload the data, or run the unload script from the command line window.

3. Click on the Deploy DDL/Data button, or run the db2gen script from the command line window.

4. Select Use Pipe, or set usepipe=true in the IBMExtract.properties, if you're using the command line window.

5. Click on the Extract / Deploy through Pipe Load button, or run the unload script from the command line window.

Figure 5. Use of pipe to move data


Additional steps for DB2 on z/OS

UNLOAD process on z/OS

• This tool requires USS (UNIX System Services) to run, but DB2 LOAD on z/OS cannot use HFS files to load the data. That is why you need to use the JZOS toolkit to create PS datasets on z/OS from UNIX System Services. However, DB2 LOAD can use USS (or HFS) files to load CLOBs and BLOBs into DB2. That is why we create PS datasets on z/OS to move the data from the source database to z/OS, and we use UNIX System Services HFS files to keep all the BLOBs and CLOBs.

• The LOAD statement cannot be run from USS. Use the SYSPROC.DSNUTILS stored procedure to run LOAD, CHECK DATA, and RUNSTATS.

• Creating PS datasets is a challenge because you need to allocate one for each table. A fixed size cannot be allocated ahead of time because the size of the table is unknown, and a fixed size might waste a lot of space on z/OS. To avoid a space problem, an algorithm is used to allocate the size.

You can use this tool from z/OS to do the data movement from a source database to DB2 for z/OS. However, the following additional steps are required.

1. Download and install JZOS from this IBM link.


2. This zip file contains a file named jzos.pax. Using UNIX System Services, FTP this file in binary mode to the directory where you would like JZOS installed.

3. Change to the directory where you saved the .pax file.

4. Run the command pax -rvf jzos.pax. This creates a subdirectory called jzos in your current working directory. This subdirectory will be referred to as <JZOS_HOME>.

5. In the user's home directory, create a file named .profile based on the template given below, making changes as required for your z/OS DB2 installation.

export JZOS_HOME=$HOME/jzos
export JAVA_HOME=/usr/lpp/java/J1.5
export PATH=$JAVA_HOME/bin:$PATH
export CLPHOME=/usr/lpp/db2/db2910/db2910_base/lib/IBM
export CLASSPATH=$CLASSPATH:/usr/lpp/db2/db2910/db2910_base/lib/clp.jar
export CLPPROPERTIESFILE=$HOME/clp.properties
export LIBPATH=$LIBPATH:<JZOS_HOME>
alias db2="java com.ibm.db2.clp.db2"

6. CLPHOME and CLASSPATH may have to be modified depending on your environment. Replace <JZOS_HOME> with the appropriate directory.

7. In the user's home directory, create a file named clp.properties based on the template given below:

#Specify the value as ON/OFF or leave them blank
DisplaySQLCA=ON
AutoCommit=ON
InputFilename=
OutputFilename=
DisplayOutput=
StopOnError=
TerminationChar=
Echo=
StripHeaders=
MaxLinesFromSelect=
MaxColumnWidth=20
IsolationLevel=
<SUBSYSTEM_NAME>=<IP address>:<port number>/<location name>,USER,PASSWD

Replace items on the last line as appropriate.

8. Run the command chmod 777 <JZOS_HOME>/*.so

9. Run the IBMDataMovementTool.sh -console command and specify values for the parameters through the interactive prompts.

10. IBMExtract.properties, geninput and unload scripts are created for you.

11. The zdb2tableseries parameter in IBMExtract.properties specifies the name of the series for PS datasets. For example, if your TSO ID is DNET770 and this parameter is set to R, the name of the PS dataset created for the first table will be DNET770.TBLDATA.R0000001.

12. The znocopypend parameter is used to add the NOCOPYPEND parameter to the LOAD statement. With this parameter, the z/OS DB2 DBA can perform the backup because the table will not be put in COPY pending mode.

13. The zoveralloc parameter specifies by how much you want to oversize your file allocation requests. A value of 1 means that you are not oversizing at all. In an environment with sufficient free storage, this might work. In a realistic environment, 15/11 (1.3636) is a good estimate. It is recommended that you start at 1.3636 (15/11) and lower the value gradually until you get file write errors, and then increase it a little. If you know the value of the SMS parameter REDUCE SPACE UP TO, you can calculate the ideal value of zoveralloc as 1 / (1 - (X/100)), where X is the value of REDUCE SPACE UP TO given as an integer between 0 and 100. Note that REDUCE SPACE UP TO represents a percentage.

14. The zsecondary parameter is used to allocate fixed secondary extents. Start with a value of 0 and increase it slowly until file errors occur, and then bring it back down.

15. Run the geninput script to create an input file for the unload process.

16. Run the unload script to generate DDL and DATA.

17. Run the generated script to create the DDL and load data on z/OS DB2.

18. DSNUTILS will fail if you do not delete the intermediate datasets. The following Java program can delete them.

java -cp /u/dnet770/migr/IBMDataMovementTool.jar:$JZOS_HOME/ibmjzos.jar \
-Djava.ext.dirs=${JZOS_HOME}:${JAVA_HOME}/lib/ext ibm.Cleanup

19. After data loading into the DB2 tables on z/OS is completed, you may find datasets that you need to delete. Use the following Java program to delete those datasets as part of cleanup. Create a script named jd as shown below:


JZOS_HOME=$HOME/jzos
JAVA_HOME=/usr/lpp/java/J1.5
CLASSPATH=$HOME/migr/IBMDataMovementTool.jar:$JZOS_HOME/ibmjzos.jar
LIBPATH=$LIBPATH:$JZOS_HOME

$JAVA_HOME/bin/java -cp $CLASSPATH \
-Djava.ext.dirs=${JZOS_HOME}:${JAVA_HOME}/lib/ext ibm.Jd $1

Change the file permission to 755 and run the script. You will get the output shown below:

DNET770:/u/dnet770/migr: >./jd
USAGE: ibm.Jd <filter_key>
USAGE: ibm.Jd "DNET770.TBLDATA.**"
USAGE: ibm.Jd "DNET770.TBLDATA.**.CERR"
USAGE: ibm.Jd "DNET770.TBLDATA.**.LERR"
USAGE: ibm.Jd "DNET770.TBLDATA.**.DISC"

So, if you want to delete all datasets under "DNET770.TBLDATA", use the following command.

DNET770:/u/dnet770/migr: >./jd "DNET770.TBLDATA.**"
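Returning to the zoveralloc formula in step 13, the value 1 / (1 - (X/100)) is easy to evaluate before editing IBMExtract.properties. The following is a minimal sketch of that arithmetic only; the function name overalloc is hypothetical, and X is assumed to be the REDUCE SPACE UP TO percentage supplied by your storage administrator.

```shell
#!/bin/sh
# overalloc X
# Prints 1 / (1 - X/100) to four decimal places, where X is the SMS
# REDUCE SPACE UP TO percentage. X = 100 would divide by zero, so only
# 0 <= X < 100 is accepted; anything else returns a non-zero status.
overalloc() {
    awk -v x="$1" 'BEGIN {
        if (x < 0 || x >= 100) exit 1
        printf "%.4f\n", 1 / (1 - x / 100)
    }'
}
```

For example, overalloc 25 prints 1.3333 and overalloc 0 prints 1.0000 (no oversizing). The suggested starting value of 15/11 corresponds to X of roughly 27.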

Plan for very large data movement

The strength of this tool is large-scale data movement. It has been used to move 4 TB of Oracle data in just three days with good planning and procedures. Here are tips and techniques that will help you achieve large-scale data movement within the time window you have.

Hardware requirement and capacity planning

It is outside the scope of this article to discuss hardware requirements and database capacity planning, but keep the following considerations in mind when estimating the time to complete a large-scale data movement.

• You need a good network connection between the source and DB2 servers, preferably 1 Gbps or higher. Network bandwidth limits how quickly the data movement can complete.

• The number of CPUs on the source server determines how many tables you can unload in parallel. For a database larger than 1 TB, you should have a minimum of 4 CPUs on the source server.


• The number of CPUs on the DB2 server determines the speed of the LOAD process. As a rule of thumb, loading the data into DB2 takes 1/4 to 1/3 of the total time, and the rest is consumed by the unload process.

• Plan the DB2 database layout ahead of time. Please consult IBM's best practice papers for DB2.
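As a rough worked example of the network limit mentioned above, the time needed to push a given volume through a link at its nominal rate gives a lower bound on the data movement window. This sketch is an illustration, not a sizing method from the tool. It assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and ignores protocol overhead and all unload and load processing; the function name transfer_hours is hypothetical.

```shell
#!/bin/sh
# transfer_hours SIZE_TB LINK_GBPS
# Prints the minimum wire time in hours, to one decimal place, for
# moving SIZE_TB terabytes over a link of LINK_GBPS gigabits per second.
transfer_hours() {
    awk -v tb="$1" -v gbps="$2" 'BEGIN {
        bits    = tb * 1e12 * 8          # volume in bits
        seconds = bits / (gbps * 1e9)    # time at the nominal line rate
        printf "%.1f\n", seconds / 3600
    }'
}
```

For example, transfer_hours 4 1 prints 8.9. Actual elapsed time will be longer, since unload and load processing are not included in this figure.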

Tips and techniques

• Gain an understanding of the tool in command line mode. Use the GUI to generate the data movement scripts (geninput and unload) and practice data unload by running the unload script from the command line.

• Extract only DDL from the source by setting GENDDL=true and UNLOAD=false in the unload script. Use the generated DDL to plan the table space and table mapping. Use a separate output directory to store generated DDL and data by specifying the target directory using the -DOUTPUT_DIR parameter in the unload script. The DDL should be generated ahead of the final data movement.

• Use the geninput script to generate a list of tables to be moved from the source to DB2. Use the SRCSCHEMA=ALL and DSTSCHEMA=ALL parameters in the geninput script to generate a list of all tables. Edit the file to remove unwanted tables and split it into several input files for a staggered movement approach, where you unload from the source and load to the target in parallel.

• After breaking the table input file (generated by the geninput script) into several files, copy the unload script into the same number of files, change the name of the input file in each, and specify a different directory for each unload process. For example, you could create 10 unload scripts that each unload 500 tables, totalling 5000 tables.

• Make sure that you do DDL and DATA in separate steps. Do not mix the two in a single step for such a large movement of data.

• The tool unloads data from the source tables in parallel, controlled by the NUM_THREADS parameter in the unload script. The default value is 5, and you can increase it to a level where CPU utilization on your source server is around 90%.

• Pay attention to the tables listed in the input tables file. The geninput script does not have the intelligence to put the tables in a particular order, so you need to order the tables in a way that minimizes unload time. The tables listed in the input files are fed to a pool of threads in a round-robin fashion. It may happen that all threads but one have finished the unload process. In order to keep all threads busy, organize the tables in the input file in increasing order of row count.

• It may still happen that all other tables have unloaded while a few threads are still unloading very large tables. You can unload the same table in multiple threads if you specify the WHERE clause properly in the input file. For example:

"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 1 and 1000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 1000001 and 2000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 2000001 and 3000000
"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE id between 3000001 and 4000000

Make sure that you use the right keys in the WHERE clause, preferably either the primary key or a unique index. The tool takes care of generating the proper DB2 LOAD scripts to load data from the multiple files it produces. No other setup is required to unload the same table in multiple threads, except to add a different WHERE clause to each line as explained.

• After breaking your unload process into several batches, you can start loading data into DB2 as soon as a batch has finished unloading. The key here is the separate output directory for each unload batch. All files necessary to put the data into DB2 are generated in the output directory. For DDL, you will use the generated db2ddl script to create table definitions. For data, you will use the db2load script to load the data into DB2. If you combine DDL and data in a single step, the name of the script will be db2gen.

• Automate the whole process in your shell scripts so that the unload and load processes are synchronised. Each and every large data movement from Oracle or other databases to DB2 is unique. You will have your skills tested determining how to automate all of these jobs. Save the output of the jobs in a file by using the tee command, so that you can keep watching the progress while the output is saved in a log file.
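The per-range input lines shown in the WHERE clause tip can be generated rather than typed by hand. The following is a minimal sketch, not part of the tool, that prints one input-file line per key range. It assumes a numeric key column whose values start at 1 and the input-line layout of the example above; the function name gen_ranges and its arguments are hypothetical.

```shell
#!/bin/sh
# gen_ranges SCHEMA TABLE KEY_COLUMN MAX_KEY CHUNK_SIZE
# Prints one input-file line per key range, covering 1..MAX_KEY in
# consecutive, non-overlapping chunks of CHUNK_SIZE keys.
gen_ranges() {
    awk -v s="$1" -v t="$2" -v k="$3" -v max="$4" -v step="$5" 'BEGIN {
        q = "\""                         # double quote around identifiers
        name = q s q "." q t q
        for (lo = 1; lo <= max; lo += step) {
            hi = lo + step - 1
            if (hi > max) hi = max       # the last range may be shorter
            printf "%s:SELECT * FROM %s WHERE %s between %d and %d\n", name, name, k, lo, hi
        }
    }'
}
```

Running gen_ranges ACCOUNT T1 id 4000000 1000000 reproduces the four lines of the example. Rows whose key is NULL or falls outside 1 to MAX_KEY are not covered by any range, so check the actual key distribution before relying on generated lines.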

Run mock tests

Always run a mock movement to test your automation and to validate the way you planned the staggered unload from the source and load into DB2. The only customization needed is creating the shell scripts that run these tasks in the right order. Follow these steps to run the mock tests:

1. Copy your data movement scripts and automation shell scripts to a mock directory.

2. Estimate your time by unloading a few large tables in a few threads, and stagger the movement of the data accordingly.

3. Add a WHERE clause to limit the number of rows to test the movement of data. For example, you can add a ROWNUM clause to limit the number of rows in Oracle or use the TOP clause for SQL Server.

"ACCOUNT"."T1":SELECT * FROM "ACCOUNT"."T1" WHERE rownum < 100
"ACCOUNT"."T2":SELECT * FROM "ACCOUNT"."T2" WHERE rownum < 100
"ACCOUNT"."T3":SELECT * FROM "ACCOUNT"."T3" WHERE rownum < 100
"ACCOUNT"."T4":SELECT * FROM "ACCOUNT"."T4" WHERE rownum < 100

4. Practice your scripts and make changes as necessary, and prepare for the final run.
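The row limit in step 3 can be added to a whole input file at once. The following is a minimal sketch for an Oracle source, not part of the tool. It assumes that each line of the original input file has the form "SCHEMA"."TABLE":SELECT * FROM "SCHEMA"."TABLE" with no WHERE clause yet; lines that already contain a WHERE clause are passed through unchanged. The function name make_mock_input is hypothetical.

```shell
#!/bin/sh
# make_mock_input INPUT_FILE ROW_LIMIT
# Prints each table line with "WHERE rownum < ROW_LIMIT" appended.
# Blank lines are dropped; lines that already have a WHERE clause are
# printed as they are, so review those by hand.
make_mock_input() {
    awk -v n="$2" '
        /^[ \t]*$/ { next }            # skip blank lines
        / WHERE /  { print; next }     # leave existing clauses untouched
                   { printf "%s WHERE rownum < %d\n", $0, n }
    ' "$1"
}
```

Redirect the output to a new file in the mock directory and point the copied unload script at it, so the production input file stays untouched.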

Final run

1. You have already extracted the DDL and made any required manual changes for the mapping between tables and table spaces.

2. Schedule downtime for the movement of the data.

3. If Oracle is the source, make sure that the open cursors setting for the Oracle database is around 10000.

4. Watch the output in the log file.

Large movements of data are much more about planning, discipline, and the ability to automate jobs. The tool provides all the capability that you require for such a movement. This little tool has moved very large databases from source to DB2.

Support for the tool

This tool is not supported by the IBM support organization. However, you can report bugs, issues, suggestions, and enhancement requests in the support forum.

Frequently asked questions

Table 3. Frequently asked questions

Question/Issue

Answer/Solution

Do I need to install anything on my source database server in order for this tool to work?

You do not need to install anything on your source database for this tool.

What are the supported platforms for this tool?

Windows, z/OS, AIX, Linux, UNIX, HP-UX, Solaris, Mac, and any other platform that has a JVM on it.

I am running this tool from a secure shell window on my Linux/UNIX platform. I see a few messages in the command line shell, but I do not see the GUI and it seems that the tool has hung.

Depending on your DISPLAY settings, the GUI window has probably opened on your display-capable server. You need to export your DISPLAY settings properly. Consult your UNIX system administrator.

I am trying to move data from PostgreSQL and I do not see a PostgreSQL JDBC driver included with the tool.

No JDBC drivers are provided with the tool due to licensing considerations. You should get the JDBC driver for your database from your licensed software.

It is not possible to grant DBA to the user extracting data from the Oracle database. How can I use the tool?

At a minimum, the user needs SELECT_CATALOG_ROLE and SELECT privileges on the tables used for the migration.

What are the databases to which this tool can connect?

Any database that has a type-IV JDBC driver. So, you can connect to MySQL, PostgreSQL, Ingres, SQL Server, Sybase, Oracle, DB2, and others. The tool can also connect to a database that has an ODBC-JDBC connector, so you can also move data from an Access database.

What version of Java do I need to run this tool?

You need at least Java 1.5 to run the tool. The dependency on Java 1.5 is mainly due to the GUI portion of the tool. If you really need support for Java 1.4.2, send me a note and I will compile the tool for Java 1.4.2, but the GUI will not be available to create the data movement driver scripts. You can determine the version of Java by running this command on Linux/UNIX or Windows:

$ java -version
C:\>java -version

How do I check the version of the tool?

Run IBMDataMovementTool -version on Windows or ./IBMDataMovementTool.sh -version on Linux/UNIX.

I get the error "Unsupported major.minor version 49.0" or "(.:15077): Gtk-WARNING **: cannot open display: " when I run the tool. What does it mean?

You are using a version of Java older than 1.5. Install Java 1.5 or later to overcome this problem. We prefer that you install IBM Java. The Gtk-WARNING message also indicates that the GUI cannot open an X display, so check your DISPLAY settings as well.

What information do I need for the source and DB2 database servers in order to run this tool?

You need to know the IP address, port number, database name, user ID, and password for the source and DB2 databases. The user ID for the source database should have DBA privileges, and the user ID for the DB2 database should have SYSADM privilege.

I am running this tool from my Windows workstation and it is running extremely slowly. What can I do?

The default memory allocated to this tool by the IBMDataMovementTool.cmd or IBMDataMovementTool.sh command script is 990MB, set with the -Xmx switch for the JVM. Try reducing this value, as you might have less memory available on your workstation.

I am doing a data movement from SQL Server to DB2. How do I get my TEXT field to go to VARCHAR in DB2?

Specify mssqltexttoclob=true in the IBMExtract.properties file.

I am doing a data movement from Sybase to DB2 and it did not move my T-SQL procedures to DB2.

The purpose of this tool is only DDL and data movement. You will have to use the MTK to move procedures and triggers.

I am doing a DDL movement from Sybase to DB2 and I have my Sybase objects in a file. I do not see a way to specify a DDL file as a data source.

The purpose of this tool is high speed data movement, and that is why there is no capability to transform a DDL file from a database to DB2. You can, however, use IBM InfoSphere Data Architect to transform DDL from a source database to a target.

I am doing a data movement from MS Access to DB2 and I do not see all indexes and other objects in the generated DDL.

We use a basic ODBC-JDBC connector to connect to the MS Access database. You will need a different, commercial JDBC driver to obtain the complete set of DDL. You can try the HXTT JDBC driver for MS Access. If you use the HXTT driver, you will have to specify DBVENDOR=hxtt in the generated unload script instead of access.

I am doing a data movement from Sybase to DB2 using this tool and I am getting a large number of errors.

It is quite possible that your Sybase database is not enabled for the required JDBC support. Please consult your Sybase DBA to ensure that the correct JDBC stored procedures are installed in your Sybase database.

I am doing a data movement from MySQL to DB2 and I am running out of memory.

Try different values for FETCHSIZE=nnn in the generated unload script and run the data movement from the command line. If you use the GUI, it will overwrite the unload script.

I am doing a data movement from Oracle to DB2 and I notice that three JAR files are required for the data movement. My understanding is that we only need a JDBC driver for data movement. Why are additional JAR files needed?

The additional JAR files are mainly required for Oracle XML data types. You should get those files from your Oracle installation directory.

I want the Oracle data type CLOB to go as DBCLOB in DB2.

Go to the IBMExtract.properties file and set DBCLOB=true.

I am using this tool to move data from Oracle to DB2 and I am getting many Oracle SQL errors saying that a table was not found.

The user ID connecting to Oracle should have SELECT_CATALOG_ROLE granted to it and SELECT privileges on the tables.

I do not want NCHAR and NVARCHAR2 to go as GRAPHIC or VARGRAPHIC in DB2. I want them to go as CHAR and VARCHAR2 since I created the DB2 database as UTF-8.

Go to the IBMExtract.properties file and set GRAPHIC=false.

Can I move data from an Oracle database to a DB2 version earlier than V9.7/V9.5?

Yes, go to IBMExtract.properties and set db2_compatibility=false.

I noticed that your tool moved Oracle's NUMBER(38) to NUMBER(31) and I understand that DB2 supports only up to 31. I do not want to round down and I want to convert this to DOUBLE.

Go to IBMExtract.properties and set roundDown_31=false.

I am getting lots of data rejected. How do I get the rejected data in a file so that I can analyze the reason for the rejection?

Go to IBMExtract.properties and set dumpfile=true.

I am trying to load data from a workstation to a DB2 server and I am getting errors. Do I have to run the tool from the server only?

It is preferable to run this tool from the DB2 server to extract data from the source database and avoid an intermediate server. However, if you want to run this tool from an intermediate server, you can specify REMOTELOAD=TRUE in the generated unload script. Please remember that the DB2 LOAD utility requires BLOB, CLOB, and XML data to be available on the server. You will need to mount those directories with the same naming convention on the target DB2 server.

I can only log in to my DB2 server through an SSH shell and we do not allow X-Windows to run on the DB2 server. How do I run this GUI tool to move DDL and data?

Run IBMDataMovementTool.sh from your SSH session. If there is no graphics support, the tool switches to command line input automatically. If it does not switch for some reason, specify the -console option to the IBMDataMovementTool.sh command to force the tool to run in the interactive command line mode. The command line mode is just a way to gather the input and to generate the necessary scripts for data movement. Likewise, the GUI is just a way to generate the scripts; the actual work is done by the scripts.

Why did you not create the DB2 database through your script, since you ask for the name of the database?

DBAs normally like to create their databases according to their own storage path information. We do, however, create the necessary table spaces so that DB2 automatically puts tables in the right table space. You should consider reading IBM's best practice papers to plan your database carefully. It is recommended that you create the DB2 database with a default page size of 32K.
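Several answers above involve setting a value in the IBMExtract.properties file. If you automate your movement with shell scripts, a small helper such as the following sketch can apply those settings consistently before each run. The function name set_property is hypothetical, and the sketch assumes that the file uses plain key=value lines with no spaces around the equals sign, that the file ends with a newline, and that keys and values contain no characters that are special to sed.

```shell
# set_property FILE KEY VALUE: set KEY=VALUE in a properties file,
# replacing an existing KEY= line or appending a new one.
# Assumes plain key=value lines and simple keys and values.
set_property() {
    file="$1"; key="$2"; value="$3"
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        tmp="$(mktemp)"
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
        # Copy the contents back so that the original file permissions are kept.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example usage with property names taken from the answers above:
#   set_property IBMExtract.properties dumpfile true
#   set_property IBMExtract.properties roundDown_31 false
```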

Acknowledgements

Many IBMers from around the world provided valuable feedback on the tool, and without their feedback the tool would not have been possible in its present shape. I acknowledge significant help, feedback, suggestions, and guidance from the following people.

• Jason A Arnold

• Serge Rielau

• Marina Greenstein

• Maria N Schwenger

• Patrick Dantressangle

• Sam Lightstome

• Barry Faust

• Vince Lee

• Connie Tsui

• Raanon Reutlinger

• Antonio Maranhao

• Max Petrenko

• Kenneth Chen

• Masafumi Otsuki

• Neal Finkelstein

Disclaimer

This article contains a tool. IBM grants you ("Licensee") a non-exclusive, royalty-free license to use this tool. However, the tool is provided as-is and without any warranties, whether EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM AND ITS LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE THAT RESULT FROM YOUR USE OF THE SOFTWARE. IN NO EVENT WILL IBM OR ITS LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR INABILITY TO USE SOFTWARE, EVEN IF IBM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Downloads

• Product: IBM Data Movement Tool (see Note 1)

Note

1. A new build of the tool is uploaded very frequently, after bug fixes and new enhancements. Click Help > Check New Version in the GUI or enter the command ./IBMDataMovementTool.sh -check to check whether a new build is available for download. You can find the tool's build number from the Help > About menu option or by entering the ./IBMDataMovementTool.sh -version command. This tool uses the JGoodies Forms 1.2.1, JGoodies Look 2.2.2, and JSyntaxPane 0.9.4 packages for the GUI.

Resources

Learn

• "Migrate from MySQL or PostgreSQL to DB2 Express-C" (developerWorks, June 2006) was the first article written for this tool.

• "DB2 Viper 2 compatibility features" (developerWorks, July 2007) is the article that explains compatibility features.

• You can also use the Migration Toolkit for the migration of data and procedures.

Get products and technologies

• Download DB2 Express-C 9.7, a no-charge version of DB2 Express database server for the community.

• Download a free trial version of DB2 9.7 for Linux, UNIX, and Windows.

• Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

• Participate in the discussion forum for this content.

• Check out developerWorks blogs and get involved in the developerWorks community.

About the author

Vikram S. Khatri works for IBM in the Sales and Distribution Division and is a member of the DB2 Migration team. Vikram has 24 years of IT experience and specializes in enabling non-DB2 applications to DB2. Vikram supports the DB2 technical sales organization by assisting with complex database migration projects as well as with database performance benchmark testing.

Trademarks

IBM, AIX, DB2, and z/OS are trademarks of IBM Corporation in the United States and many other countries. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States and other countries. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Oracle is a trademark of Oracle Corporation in the United States and other countries. Other company, product, or service names may be trademarks or service marks of others.
