adbms assignment

M. Sc. Part I [INFORMATION TECHNOLOGY] Practical Journal [DATAWAREHOUSING AND MINING]

Seat Number [ ]

Department of Computer Science and Information Technology

Deccan Education Society's

Kirti College of Arts, Science and Commerce [NAAC Accredited: A Grade]

CERTIFICATE

This is to certify that Mr./Miss ______________________ of M.Sc. Part-I [COMPUTER SCIENCE] with Seat No. _______ has successfully completed the practical of Datawarehousing and Data Mining under my supervision in this college during the year 2006 - 2007.

Lecturer-in-charge (Mrs. Apurva Yadav)

Head of Department, Dept. of Com.Sc. and I.T. (Dr. Seema Purohit)

NAAC Accreditation A Grade

Deccan Education Society's

Kirti M. Doongursee College, Dadar, Mumbai-28

Dept. of Computer Science & I.T.

M.Sc. (Part-I) 2006-07

PAPER IV (SECTION-I)

DATAWAREHOUSING AND DATA MINING

INDEX

No. Topic

1. Create a warehouse in MS SQL Server 2000 and import various databases from external sources such as Access/Excel/Text File by using Data Transformation Services (DTS) tool.

2. Create and schedule a DTS Package using Data Transformation Services (DTS) tool. Fire at least 5 queries on the database.

3. Create a Database using Analysis Manager and create a Single-Dimensional OLAP cube by using STAR schema.

4. Create a Database using Analysis Manager and create a Multi-Dimensional OLAP cube by using Snowflake schema.

5. Create a Mining Model by using OLAP Data.

6. Create a Mining Model by using Relational Data.

Roll No. 22

Subject: Data Warehousing

Practical 01

Create a warehouse in MS SQL Server 2000 and import various databases from external sources such as Access/Excel/Text File by using Data Transformation Services (DTS) tool.

STEP 1: Build OLTP SYSTEM

Steps in importing a database from Access:

1. Click Start -> Programs -> Microsoft SQL Server -> Import and Export Data.
2. In the Database drop-down list, choose the entry for creating a new database (<new>). Enter the database name. Click OK.
3. Select the database practdb from the Database drop-down list. Click Next.
4. Click Select All. Click the Transform button next to the Customer table and check the drop and recreate option. Repeat this step for each table.
5. Check Save DTS Package and click Next. Name your package practdts. Click Next.
6. Click Finish.
7. Start Enterprise Manager: Start -> Programs -> Microsoft SQL Server -> Enterprise Manager.
8. Right-click the DTS package practdts and select Design Package.

Steps in creating a fact table:

1. Select Microsoft OLE DB Provider for SQL Server from the Connection toolbox.
2. Select Microsoft Access from the Connection toolbox.
3. Select Execute SQL Task from the Task toolbox and define two tasks:
   a) Create Fact Table: click Build Query to write the query, then click Parse Query.
   b) Drop Fact Table: write the query that drops the table, then click Parse Query.
4. Transform the data from MS Access to SQL Server:
   a) Select Microsoft Access first, then select Transform Data Task from the Task toolbox and drop it on FactConnection.
   b) Select the Drop Fact Table task first and then the Create Fact Table task, and choose On Completion from the Workflow menu.
   c) Select the Create Fact Table task first and then Microsoft Access, and choose On Success from the Workflow menu.
   d) Double-click the arrow pointing from Microsoft Access to FactConnection. Click Build Query, then Parse Query.
   e) Deselect all the transformations.
   f) Click New, click Copy Column, and click OK.
   g) Select the source column, then select the destination column that you want, and click OK.

The transformation is complete. Save the DTS package and execute the package.
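The workflow above (drop the fact table, recreate it, then copy selected source columns into destination columns) can be sketched outside DTS. This is only an illustration in Python with the standard sqlite3 module, not what DTS runs internally; the table names orders_src and sales_fact, their columns, and the helper copy_columns are all hypothetical.

```python
import sqlite3

def copy_columns(conn, source, dest, mapping):
    """Copy rows from `source` to `dest`; `mapping` is {source column: destination column}."""
    src_cols = ", ".join(mapping.keys())
    dst_cols = ", ".join(mapping.values())
    conn.execute(f"INSERT INTO {dest} ({dst_cols}) SELECT {src_cols} FROM {source}")
    conn.commit()

conn = sqlite3.connect(":memory:")

# Hypothetical source table standing in for the Access data.
conn.execute("CREATE TABLE orders_src (cust_id INTEGER, prod_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders_src VALUES (?, ?, ?)",
                 [(1, 10, 25.0), (2, 11, 40.5)])

# Same order as the package workflow: drop the fact table, then create it.
conn.execute("DROP TABLE IF EXISTS sales_fact")
conn.execute("CREATE TABLE sales_fact "
             "(customer_key INTEGER, product_key INTEGER, sales_amount REAL)")

# One source-to-destination pair per Copy Column transformation.
copy_columns(conn, "orders_src", "sales_fact",
             {"cust_id": "customer_key",
              "prod_id": "product_key",
              "amount": "sales_amount"})

rows = conn.execute("SELECT * FROM sales_fact ORDER BY customer_key").fetchall()
print(rows)
```

Because the table is dropped and recreated on every run, executing the sketch twice leaves the same rows rather than duplicating them, which is the point of the drop and recreate option.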

BUILD A CUBE:

Step 1: Create a database.

Step 2: Create a data source.

Step 3: After creating the data source, right-click the Cubes folder under your database practdb. Select New Cube -> Wizard.

Click Next.

How to add measures to the cube:

Measures are the quantitative values in the database that you want to analyze. Commonly-used measures are sales, cost, and budget data. Measures are analyzed against the different dimension categories of a cube.

1. To define the measures for your cube, under Fact table numeric columns, double-click store_sales. Repeat this procedure for the store_cost and unit_sales columns, and then click Next.

Click Next
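To make "measures are analyzed against dimension categories" concrete, here is a minimal sketch in Python with sqlite3. It sums the three measures named above for each value of one dimension attribute. The tiny customer and sales_fact tables and their rows are made-up sample data, not FoodMart contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE sales_fact (customer_id INTEGER, store_sales REAL,
                         store_cost REAL, unit_sales REAL);
INSERT INTO customer VALUES (1, 'USA'), (2, 'Canada'), (3, 'USA');
INSERT INTO sales_fact VALUES (1, 10.0, 4.0, 2), (2, 5.0, 2.0, 1), (3, 7.5, 3.0, 3);
""")

# Each measure is a numeric fact column aggregated per dimension member.
measures = conn.execute("""
    SELECT c.country,
           SUM(f.store_sales), SUM(f.store_cost), SUM(f.unit_sales)
    FROM sales_fact AS f
    JOIN customer AS c ON c.customer_id = f.customer_id
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()

for row in measures:
    print(row)
```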

How to build your Customer dimension:

1. Click New Dimension.
2. In the Welcome step, click Next.
3. In the Choose how you want to create the dimension step, select Star Schema: A single dimension table, and then click Next.
4. In the Select the dimension table step, click Customer, and then click Next.
5. In the Select the dimension type step, click Next.
6. To define the levels for your dimension, under Available columns, double-click the Country, State_Province, City, and lname columns, in that order. After you double-click each column, its name appears under Dimension levels. After you have selected all four columns, click Next.
7. In the Specify the member key columns step, click Next.
8. In the Select advanced options step, click Next.
9. In the last step of the wizard, enter Customer in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
10. In the Cube Wizard, you should see the Customer dimension in the Cube dimensions list.

How to build your Product dimension:

1. Click New Dimension again. In the Welcome to the Dimension Wizard step, click Next.
2. In the Choose how you want to create the dimension step, select Snowflake Schema: Multiple, related dimension tables, and then click Next.
3. In the Select the dimension tables step, double-click product and product_class to add them to Selected tables. Click Next.
4. The two tables you selected in the previous step and the existing join between them are displayed in the Create and edit joins step of the Dimension Wizard. Click Next.
5. To define the levels for your dimension, under Available columns, double-click the product_category, product_subcategory, and brand_name columns, in that order. After you double-click each column, its name appears under Dimension levels. Click Next after you have selected all three columns.
6. In the Specify the member key columns step, click Next.
7. In the Select advanced options step, click Next.
8. In the last step of the wizard, enter Product in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
9. You should see the Product dimension in the Cube dimensions list.
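In a snowflake dimension the levels come from more than one table, so the dimension is resolved through a join. The sketch below, in Python with sqlite3, joins product to product_class to list the three levels chosen above. The column product_class_id is assumed to be the join key, and the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_class (
    product_class_id    INTEGER PRIMARY KEY,
    product_category    TEXT,
    product_subcategory TEXT
);
CREATE TABLE product (
    product_id       INTEGER PRIMARY KEY,
    product_class_id INTEGER REFERENCES product_class (product_class_id),
    brand_name       TEXT
);
INSERT INTO product_class VALUES (1, 'Dairy', 'Cheese'), (2, 'Dairy', 'Milk');
INSERT INTO product VALUES (100, 1, 'BrandA'), (101, 2, 'BrandB'), (102, 1, 'BrandC');
""")

# Category and subcategory live in product_class; brand lives in product.
levels = conn.execute("""
    SELECT pc.product_category, pc.product_subcategory, p.brand_name
    FROM product AS p
    JOIN product_class AS pc ON pc.product_class_id = p.product_class_id
    ORDER BY 1, 2, 3
""").fetchall()

for path in levels:
    print(" > ".join(path))
```

A star-schema dimension such as Customer needs no join of this kind, because all of its levels sit in a single table.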

How to build your Store dimension:

1. Click New Dimension again. In the Welcome to the Dimension Wizard step, click Next.
2. In the Choose how you want to create the dimension step, select Snowflake Schema: Multiple, related dimension tables, and then click Next.
3. In the Select the dimension tables step, double-click store and promotion to add them to Selected tables. Click Next.
4. The two tables you selected in the previous step and the existing join between them are displayed in the Create and edit joins step of the Dimension Wizard. Click Next.
5. To define the levels for your dimension, under Available columns, double-click the store country, store state, promotion name, store city, and store name columns, in that order. After you double-click each column, its name appears under Dimension levels. Click Next after you have selected all five columns.
6. In the Specify the member key columns step, click Next.
7. In the Select advanced options step, click Next.
8. In the last step of the wizard, enter store in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
9. You should see the store dimension in the Cube dimensions list.

How to build your Time dimension:

1. In the Select the dimensions for your cube step of the wizard, click New Dimension. This calls the Dimension Wizard.
2. In the Welcome step, click Next.
3. In the Choose how you want to create the dimension step, select Star Schema: A single dimension table, and then click Next.
4. In the Select the dimension table step, click time_by_day. You can view the data contained in the time_by_day table by clicking Browse Data. When you are finished viewing the time_by_day table, click Next.
5. In the Select the dimension type step, select Time dimension, and then click Next.
6. Next, you will define the levels for your dimension. In the Create the time dimension levels step, click Select time levels, click Year, Quarter, Month, and then click Next.
7. In the Select advanced options step, click Next.
8. In the last step of the wizard, enter Time for the name of your new dimension.

   Note: You can designate whether this dimension will be shared or private using the Share this dimension with other cubes check box, which is located on the lower left corner of the screen. Leave the box selected.

9. Click Finish to return to the Cube Wizard.
10. In the Cube Wizard, you should now see the Time dimension in the Cube dimensions list.
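The Year, Quarter, Month levels selected in step 6 are all derived from a single date column. A minimal Python sketch of that derivation follows; the function name time_levels is hypothetical, and the wizard's own member naming may differ.

```python
from datetime import date

def time_levels(d):
    """Return the Year, Quarter and Month level values for one calendar date."""
    quarter = (d.month - 1) // 3 + 1   # months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
    return {"Year": d.year, "Quarter": quarter, "Month": d.month}

sample = time_levels(date(1998, 8, 15))
print(sample)
```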

How to finish building your cube:

1. In the Cube Wizard, click Next.
2. Click Yes when prompted by the Fact Table Row Count message.
3. In the last step of the Cube Wizard, name your cube Pract1Cube, and then click Finish.
4. The wizard closes and then launches Cube Editor, which contains the cube you just created. Arrange the tables by dragging their blue or yellow title bars.
5. Save the cube in Cube Editor.

Design Storage and Process the Cube

How to design storage by using the Storage Design Wizard

1. In the Analysis Manager tree pane, expand the Cubes folder, right-click the Pract1Cube cube, and then click Design Storage.
2. In the Welcome step, click Next.
3. Select MOLAP as your data storage type, and then click Next.
4. Under Set Aggregation Options, click Performance gain reaches. In the box, enter 40 to indicate the percentage.

   You are instructing Analysis Services to give a performance boost of up to 40 percent, regardless of how much disk space this requires. Administrators can use this tuning ability to balance the need for query performance against the disk space required to store aggregation data.

5. Click Start.
6. You can watch the Performance vs. Size graph in the right side of the wizard while Analysis Services designs the aggregations. Here you can see how increasing performance gain requires additional disk space utilization. When the process of designing aggregations is complete, click Next.
7. Under What do you want to do?, select Process now, and then click Finish. Note: Processing the aggregations may take some time.
8. In the window that appears, you can watch your cube while it is being processed. When processing is complete, a message appears confirming that the processing was completed successfully.
9. Click Close to return to the Analysis Manager tree pane.
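The trade-off the wizard tunes (faster queries in exchange for disk space) comes from storing precomputed aggregations. The following Python/sqlite3 sketch shows the idea with one hand-made summary table. It does not reproduce how Analysis Services chooses or stores its aggregations; the table names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (country TEXT, city TEXT, unit_sales INTEGER);
INSERT INTO sales_fact VALUES
    ('USA', 'Seattle', 3), ('USA', 'Portland', 2),
    ('Canada', 'Vancouver', 4), ('USA', 'Seattle', 1);

-- Precomputed aggregation: extra storage, but country-level questions
-- no longer need to scan every fact row.
CREATE TABLE agg_sales_by_country AS
    SELECT country, SUM(unit_sales) AS unit_sales
    FROM sales_fact
    GROUP BY country;
""")

# Same answer, computed from the detail rows versus read from the aggregation.
from_fact = conn.execute(
    "SELECT country, SUM(unit_sales) FROM sales_fact "
    "GROUP BY country ORDER BY country").fetchall()
from_agg = conn.execute(
    "SELECT country, unit_sales FROM agg_sales_by_country "
    "ORDER BY country").fetchall()

print(from_fact)
print(from_agg)
```

Each additional summary table of this kind speeds up one more class of query and costs more storage, which is the curve the Performance vs. Size graph plots.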

METADATA:

Click the Data tab to analyse the cube data.

Practical 02

Create and schedule a DTS Package using Data Transformation Services (DTS) tool. Fire at least 5 queries on the database.

BUILD OLTP SYSTEM

CREATE DATABASE: nw_mart

Create an empty nw_mart database in SQL Server.

OPEN SQL Query Analyzer

Start -> Programs -> Microsoft SQL Server -> Query Analyzer

Click OK. In the SQL Query Analyzer window, enter the following code.

CODE: (For Creating and Dropping Dimension tables)

if exists (select * from sysobjects where id = object_id(N'[dbo].[Customer_Dim]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Customer_Dim]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Employee_Dim]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Employee_Dim]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Product_Dim]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Product_Dim]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Sales_Fact]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Sales_Fact]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Shipper_Dim]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Shipper_Dim]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Time_Dim]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[Time_Dim]
GO

CREATE TABLE [dbo].[Customer_Dim] (
    [CustomerKey] [int] IDENTITY (1, 1) NOT NULL ,
    [CustomerID] [nchar] (5) NOT NULL ,
    [CompanyName] [nvarchar] (40) NOT NULL ,
    [ContactName] [nvarchar] (30) NOT NULL ,
    [ContactTitle] [nvarchar] (30) NOT NULL ,
    [Address] [nvarchar] (60) NOT NULL ,
    [City] [nvarchar] (15) NOT NULL ,
    [Region] [nvarchar] (15) ,
    [PostalCode] [nvarchar] (10) NULL ,
    [Country] [nvarchar] (15) NOT NULL ,
    [Phone] [nvarchar] (24) NOT NULL ,
    [Fax] [nvarchar] (24) NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Employee_Dim] (
    [EmployeeKey] [int] IDENTITY (1, 1) NOT NULL ,
    [EmployeeID] [int] NOT NULL ,
    [EmployeeName] [nvarchar] (30) NOT NULL ,
    [HireDate] [datetime] NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Product_Dim] (
    [ProductKey] [int] IDENTITY (1, 1) NOT NULL ,
    [ProductID] [int] NOT NULL ,
    [ProductName] [nvarchar] (40) NOT NULL ,
    [SupplierName] [nvarchar] (40) NOT NULL ,
    [CategoryName] [nvarchar] (15) NOT NULL ,
    [ListUnitPrice] [money] NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Sales_Fact] (
    [TimeKey] [int] NOT NULL ,
    [CustomerKey] [int] NOT NULL ,
    [ShipperKey] [int] NOT NULL ,
    [ProductKey] [int] NOT NULL ,
    [EmployeeKey] [int] NOT NULL ,
    [RequiredDate] [datetime] NOT NULL ,
    [LineItemFreight] [money] NOT NULL ,
    [LineItemTotal] [money] NOT NULL ,
    [LineItemQuantity] [smallint] NOT NULL ,
    [LineItemDiscount] [money] NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Shipper_Dim] (
    [ShipperKey] [int] IDENTITY (1, 1) NOT NULL ,
    [ShipperID] [int] NOT NULL ,
    [ShipperName] [nvarchar] (40) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Time_Dim] (
    [TimeKey] [int] IDENTITY (1, 1) NOT NULL ,
    [TheDate] [datetime] NOT NULL ,
    [DayOfWeek] [nvarchar] (20) NOT NULL ,
    [Month] [int] NOT NULL ,
    [Year] [int] NOT NULL ,
    [Quarter] [int] NOT NULL ,
    [DayOfYear] [int] NOT NULL ,
    [Holiday] [nvarchar] (1) NOT NULL ,
    [Weekend] [nvarchar] (1) NOT NULL ,
    [YearMonth] [nvarchar] (10) NOT NULL ,
    [WeekOfYear] [int] NOT NULL
) ON [PRIMARY]
GO

Explanation:

sysobjects

The SQL Server sysobjects table contains one row for each object created within a database. In other words, it has a row for every constraint, default, log, rule, stored procedure, and so on in the database. Therefore, this table can be used to retrieve information about the database.

OBJECT_ID

Returns the database object identification number.

Syntax

OBJECT_ID ( 'object' )

Arguments

'object'

Is the object to be used. object is either char or nchar. If object is char, it is implicitly converted to nchar.

Return Types

int

UNICODE STRING

Unicode strings have a format similar to character strings but are preceded by an N identifier (N stands for National Language in the SQL-92 standard). The N prefix must be uppercase. For example, 'Michél' is a character constant while N'Michél' is a Unicode constant. Unicode constants are interpreted as Unicode data, and are not evaluated using a code page. Unicode constants do have a collation, which primarily controls comparisons and case sensitivity. Unicode constants are assigned the default collation of the current database, unless the COLLATE clause is used to specify a collation. Unicode data is stored using two bytes per character, as opposed to one byte per character for character data.

In Microsoft SQL Server, these data types support Unicode data:

nchar
nvarchar
ntext

Note The n prefix for these data types comes from the SQL-92 standard for National (Unicode) data types.

Use of nchar, nvarchar, and ntext is the same as char, varchar, and text, respectively.
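The storage difference between character and Unicode data can be checked with a short Python sketch. It assumes a single-byte code page (cp1252 is used here as an example) for char data and UTF-16 little-endian as a stand-in for the two-byte Unicode storage; the sample strings are arbitrary.

```python
latin_text = "Kirti"                      # arbitrary sample string

single_byte = latin_text.encode("cp1252")     # one byte per character
two_byte = latin_text.encode("utf-16-le")     # two bytes per character

print(len(latin_text), len(single_byte), len(two_byte))

# Text outside the code page cannot be stored as single-byte character data,
# but still takes two bytes per character in the Unicode encoding.
hindi_text = "\u0915\u0940\u0930\u094d\u0924\u093f"   # Devanagari sample
try:
    hindi_text.encode("cp1252")
    fits_code_page = True
except UnicodeEncodeError:
    fits_code_page = False

hindi_bytes = hindi_text.encode("utf-16-le")
print(fits_code_page, len(hindi_bytes))
```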

OBJECTPROPERTY

Returns information about objects in the current database.

Syntax

OBJECTPROPERTY ( id , property )

Arguments

id

Is an expression containing the ID of the object in the current database. id is int.

property

Is an expression containing the information to be returned for the object specified by id. property can be one of these values.

Note Unless noted otherwise, the value NULL is returned when property is not a valid property name.

Property name    Object type    Description and values returned
IsUserTable      Table          User-defined table. 1 = True, 0 = False
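The "check the catalog, then drop" pattern used in the script can be tried in a self-contained way with Python's sqlite3 module. SQLite has no sysobjects, OBJECT_ID or OBJECTPROPERTY; its sqlite_master catalog plays the analogous role here, and the helper names table_exists and drop_if_exists are hypothetical.

```python
import sqlite3

def table_exists(conn, name):
    """Look the table up in SQLite's catalog (analogous to querying sysobjects)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None

def drop_if_exists(conn, name):
    """Drop the table only when the catalog says it is there."""
    if table_exists(conn, name):
        conn.execute(f'DROP TABLE "{name}"')
        return True
    return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Dim (CustomerKey INTEGER)")

first = drop_if_exists(conn, "Customer_Dim")    # table present, so it is dropped
second = drop_if_exists(conn, "Customer_Dim")   # nothing left to drop, no error
print(first, second)
```

Guarding the DROP this way is what lets the creation script be re-run safely on a database that already contains the tables.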

IDENTITY (Property)

Creates an identity column in a table. This property is used with the CREATE TABLE and ALTER TABLE Transact-SQL statements.

Note The IDENTITY property is not the same as the SQL-DMO Identity property that exposes the row identity property of a column.

Syntax

IDENTITY [ ( seed , increment ) ]

Arguments

seed

Is the value that is used for the very first row loaded into the table.

increment

Is the incremental value that is added to the identity value of the previous row that was loaded.

You must specify both the seed and increment or neither. If neither is specified, the default is (1,1).
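As a small illustration of seed and increment, the Python sketch below generates the values an identity column would hand out for successive rows. The function name identity is hypothetical, and the sketch ignores gaps that can appear in a real table after deletes or failed inserts.

```python
from itertools import count, islice

def identity(seed=1, increment=1):
    """Yield seed, seed + increment, seed + 2 * increment, ..."""
    return count(seed, increment)

default_keys = list(islice(identity(), 3))          # IDENTITY with no arguments, i.e. (1, 1)
custom_keys = list(islice(identity(100, 10), 3))    # IDENTITY (100, 10)
print(default_keys, custom_keys)
```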

STEPS:

1. Save the query as Nwmartcreate.sql.

2. Execute the Nwmartcreate.sql script in SQL Server on the newly created nw_mart database. This script creates the table structure for the sample database.

Create DTS package

3. Save the sample DTS package, NorthwindDTS.dts, to SQL Server.

4. Run the DTS package by selecting Execute from the Package menu.

This retrieves the data from the Northwind database and populates the tables in the newly created nw_mart database.
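What such a load has to do can be sketched in two steps: fill a dimension so that each business key receives a surrogate key, then fill the fact table by translating business keys into those surrogate keys. The Python/sqlite3 sketch below is an assumption about the general technique, not the contents of NorthwindDTS.dts; the src_ tables and their rows are hypothetical stand-ins for Northwind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source (OLTP) tables standing in for Northwind.
CREATE TABLE src_products (ProductID INTEGER, ProductName TEXT);
CREATE TABLE src_order_lines (ProductID INTEGER, Quantity INTEGER);
INSERT INTO src_products VALUES (11, 'Chai'), (42, 'Tofu');
INSERT INTO src_order_lines VALUES (42, 5), (11, 2), (42, 1);

-- Mart tables; AUTOINCREMENT plays the role of IDENTITY (1, 1).
CREATE TABLE Product_Dim (
    ProductKey  INTEGER PRIMARY KEY AUTOINCREMENT,
    ProductID   INTEGER NOT NULL,
    ProductName TEXT NOT NULL
);
CREATE TABLE Sales_Fact (
    ProductKey       INTEGER NOT NULL,
    LineItemQuantity INTEGER NOT NULL
);

-- Step 1: load the dimension; surrogate keys are generated automatically.
INSERT INTO Product_Dim (ProductID, ProductName)
    SELECT ProductID, ProductName FROM src_products;

-- Step 2: load the fact table, swapping the business key for the surrogate key.
INSERT INTO Sales_Fact (ProductKey, LineItemQuantity)
    SELECT d.ProductKey, s.Quantity
    FROM src_order_lines AS s
    JOIN Product_Dim AS d ON d.ProductID = s.ProductID;
""")

loaded = conn.execute("""
    SELECT d.ProductName, SUM(f.LineItemQuantity)
    FROM Sales_Fact AS f
    JOIN Product_Dim AS d ON d.ProductKey = f.ProductKey
    GROUP BY d.ProductName
    ORDER BY d.ProductName
""").fetchall()
print(loaded)
```

The dimension must be loaded before the fact table, since the fact rows can only be keyed once the surrogate keys exist.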

To schedule the Package NorthwindDTS.dts

Right click NorthwindDTS.dts Package -> Schedule Package

Queries:

1. Show the total units sold for each product where the required date is earlier than today.

CODE:

SELECT Product_Dim.ProductName, Product_Dim.CategoryName, Product_Dim.SupplierName,
       SUM(Sales_Fact.LineItemQuantity) AS [Total Units Sold], Sales_Fact.RequiredDate
FROM Sales_Fact
INNER JOIN Product_Dim ON Sales_Fact.ProductKey = Product_Dim.ProductKey
GROUP BY Product_Dim.ProductName, Product_Dim.CategoryName, Product_Dim.SupplierName, Sales_Fact.RequiredDate
HAVING (Sales_Fact.RequiredDate < getdate())

Output:

2. Show the product name, category name, supplier name, and total units sold for each product whose total units sold is greater than 100.

Code:

SELECT Product_Dim.ProductName, Product_Dim.CategoryName, Product_Dim.SupplierName,
       SUM(Sales_Fact.LineItemQuantity) AS [Total Units Sold]
FROM Sales_Fact
INNER JOIN Product_Dim ON Sales_Fact.ProductKey = Product_Dim.ProductKey
GROUP BY Product_Dim.ProductName, Product_Dim.CategoryName, Product_Dim.SupplierName
HAVING (SUM(Sales_Fact.LineItemQuantity) > 100)

Output:
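The columns listed in GROUP BY decide what SUM adds up, and HAVING then filters whole groups. The Python/sqlite3 sketch below uses a made-up four-row Sales_Fact table to show that adding the measure column LineItemQuantity to the GROUP BY list yields a total per product and quantity value rather than a total per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales_Fact (ProductKey INTEGER, LineItemQuantity INTEGER);
INSERT INTO Sales_Fact VALUES (1, 60), (1, 60), (1, 30), (2, 40);
""")

# One group per product: product 1 totals 60 + 60 + 30 = 150.
per_product = conn.execute("""
    SELECT ProductKey, SUM(LineItemQuantity)
    FROM Sales_Fact
    GROUP BY ProductKey
    HAVING SUM(LineItemQuantity) > 100
    ORDER BY ProductKey
""").fetchall()

# One group per (product, quantity) pair: the two 60-unit rows form a group
# of 120, and the 30-unit row is left in a group of its own.
per_product_and_quantity = conn.execute("""
    SELECT ProductKey, SUM(LineItemQuantity)
    FROM Sales_Fact
    GROUP BY ProductKey, LineItemQuantity
    HAVING SUM(LineItemQuantity) > 100
    ORDER BY ProductKey
""").fetchall()

print(per_product)
print(per_product_and_quantity)
```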

3. View the total units sold for each product whose average quantity per line item is greater than 0.5.

Code:

SELECT Product_Dim.ProductName, Product_Dim.CategoryName,
       SUM(Sales_Fact.LineItemQuantity) AS [Total Units Sold]
FROM Sales_Fact
INNER JOIN Product_Dim ON Sales_Fact.ProductKey = Product_Dim.ProductKey
GROUP BY Product_Dim.ProductKey, Product_Dim.ProductName, Product_Dim.CategoryName
HAVING (AVG(Sales_Fact.LineItemQuantity) > 0.5)

Output:

4. View the company name and total quantity sold for all the products.

Code:

SELECT Customer_Dim.CompanyName, Sum(Sales_Fact.LineItemQuantity) AS TotalQtySold
FROM Sales_Fact, Customer_Dim
WHERE Sales_Fact.CustomerKey = Customer_Dim.CustomerKey
GROUP BY Customer_Dim.CompanyName
ORDER BY Sum(Sales_Fact.LineItemQuantity) DESC

Output:

5. View the total price of products sold by each employee to each company.

Code:

SELECT Employee_Dim.EmployeeName, Product_Dim.ProductName, Customer_Dim.CompanyName,
       Time_Dim.TheDate, Sales_Fact.LineItemTotal
FROM Sales_Fact
INNER JOIN Customer_Dim ON Customer_Dim.CustomerKey = Sales_Fact.CustomerKey
INNER JOIN Employee_Dim ON Employee_Dim.EmployeeKey = Sales_Fact.EmployeeKey
INNER JOIN Product_Dim ON Product_Dim.ProductKey = Sales_Fact.ProductKey
INNER JOIN Time_Dim ON Time_Dim.TimeKey = Sales_Fact.TimeKey

Output:
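A star join connects the fact table to each dimension through its key and leaves the fact table at the centre. The sketch below, in Python with sqlite3, uses a cut-down version of the schema (two dimensions only, with invented names and amounts) and additionally sums LineItemTotal per employee and company, which is one way to get a total rather than one row per line item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeName TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CompanyName TEXT);
CREATE TABLE Sales_Fact (EmployeeKey INTEGER, CustomerKey INTEGER, LineItemTotal REAL);
INSERT INTO Employee_Dim VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Customer_Dim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Sales_Fact VALUES (1, 1, 100.0), (1, 1, 50.0), (2, 1, 20.0), (1, 2, 5.0);
""")

# Every dimension joins directly to the fact table, never to another dimension.
totals = conn.execute("""
    SELECT e.EmployeeName, c.CompanyName, SUM(f.LineItemTotal)
    FROM Sales_Fact AS f
    JOIN Employee_Dim AS e ON e.EmployeeKey = f.EmployeeKey
    JOIN Customer_Dim AS c ON c.CustomerKey = f.CustomerKey
    GROUP BY e.EmployeeName, c.CompanyName
    ORDER BY e.EmployeeName, c.CompanyName
""").fetchall()

for row in totals:
    print(row)
```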

Practical 03

Create a Database using Analysis Manager and create a Single-Dimensional OLAP cube by using STAR schema.

Set Up the System Data Source Connection:

A data source contains the information necessary to access source data for an object.

Why? Before you begin working with Analysis Manager, you must first set connections to the source of your data in the ODBC Data Source Administrator.

How to set up your system data source name (DSN)

1. Microsoft Windows NT 4.0 users: Click the Start button, point to Settings, click Control Panel, and then double-click Data Sources (ODBC).

   Windows 2000 users: Click the Start button, point to Settings, click Control Panel, double-click Administrative Tools, and then double-click Data Sources (ODBC).

2. On the System DSN tab, click Add.
3. Select Microsoft Access Driver (*.mdb), and then click Finish.
4. In the Data Source Name box, enter pract3DSN, and then under Database, click Select.
5. In the Select Database dialog box, browse to C:\Program Files\Microsoft Analysis Services\Samples, and then click FoodMart 2000.mdb. Click OK.
6. In the ODBC Microsoft Access Setup dialog box, click OK.
7. In the ODBC Data Source Administrator dialog box, click OK.

Start Analysis Manager:

Analysis Manager is a snap-in program that runs on Microsoft Management Console (MMC).

How to start Analysis Manager

Click the Start button, point to Programs, Microsoft SQL Server, and Analysis Services, and then click Analysis Manager.

Set Up the Database and Data Source:

How to set up your database structure

1. In the Analysis Manager tree view, expand Analysis Servers.
2. Click the name of your server. A connection with the Analysis server will be established.
3. Right-click your server's name, and then click New Database.
4. In the Database dialog box, in the Database name box, enter pract3DW, and then click OK.
5. In the Analysis Manager tree pane, expand the server, and then expand the pract3DW database you just created.

Next, set up a connection to the sample data through the pract3DSN data source.

How to set up your data source

1. In the Analysis Manager tree pane, right-click the Data Sources folder under the pract3DW database, and then click New Data Source.
2. In the Data Link Properties dialog box, click the Provider tab, and then click Microsoft OLE DB Provider for ODBC Drivers.
3. Click the Connection tab, and then from the Use data source name list, click pract3DSN.
4. Click Test Connection to be sure everything works. A message should appear in the Microsoft Data Link dialog box, stating that your connection was successful. In the message box, click OK.
5. Click OK to close the Data Link Properties dialog box.

Build a Cube:

How to open the Cube Wizard

In the Analysis Manager tree pane, under the pract3DW database, right-click the Cubes folder, point to New Cube, and then click Wizard.

How to add measures to the cube

1. In the Welcome step of the Cube Wizard, click Next.
2. In the Select a fact table from a data source step, expand the data source, and then click sales_fact_1998.
3. You can view the data in the sales_fact_1998 table by clicking Browse data. After you finish browsing data, close the Browse data window, and then click Next.
4. To define the measures for your cube, under Fact table numeric columns, double-click store_sales. Repeat this procedure for the store_cost and unit_sales columns, and then click Next.

How to build your Customer dimension

1. Click New Dimension.
2. In the Welcome step, click Next.
3. In the Choose how you want to create the dimension step, select Star Schema: A single dimension table, and then click Next.
4. In the Select the dimension table step, click Customer, and then click Next.
5. In the Select the dimension type step, click Next.
6. To define the levels for your dimension, under Available columns, double-click the Country, State_Province, City, and lname columns, in that order. After you double-click each column, its name appears under Dimension levels. After you have selected all four columns, click Next.
7. In the Specify the member key columns step, click Next.
8. In the Select advanced options step, click Next.
9. In the last step of the wizard, enter Customer in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
10. In the Cube Wizard, you should see the Customer dimension in the Cube dimensions list.

How to finish building your cube

1. In the Cube Wizard, click Next.
2. Click Yes when prompted by the Fact Table Row Count message.
3. In the last step of the Cube Wizard, name your cube pract3CUBE, and then click Finish.
4. The wizard closes and then launches Cube Editor, which contains the cube you just created. You can arrange the tables by dragging their blue or yellow title bars.

Process the Cube:

1. Close Cube Editor, and in Analysis Manager expand the Cubes folder.
2. Right-click pract3CUBE.
3. Select the Process option.



Browse Cube Data:

How to view cube data using Cube Browser

1. In the Analysis Manager tree pane, right-click the pract3CUBE cube, and then click Browse Data.

2. Cube Browser appears, displaying a grid made up of one dimension and the measures of your cube.

How to drill down

1. In the Customer dimension, double-click the cell in your grid that contains the country Canada. The grid expands to include a column for the next level of the dimension.

Note: You can close the expanded column by double-clicking a cell that has been expanded.

Use the above techniques to move dimensions to and from the grid.

When you are finished, click Close to close Cube Browser.
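Drilling down amounts to regrouping the same measure by one more level of the dimension. The Python/sqlite3 sketch below illustrates this for the Country and State_Province levels; the customer_sales table, its rows, and the browse helper are hypothetical, and Cube Browser itself reads from the processed cube rather than running SQL like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_sales (country TEXT, state_province TEXT, unit_sales INTEGER);
INSERT INTO customer_sales VALUES
    ('Canada', 'BC', 4), ('Canada', 'BC', 1),
    ('USA', 'WA', 3), ('USA', 'OR', 2);
""")

def browse(conn, levels):
    """Total unit_sales grouped by the given dimension levels (trusted column names)."""
    cols = ", ".join(levels)
    sql = (f"SELECT {cols}, SUM(unit_sales) FROM customer_sales "
           f"GROUP BY {cols} ORDER BY {cols}")
    return conn.execute(sql).fetchall()

top_level = browse(conn, ["country"])                        # before drilling down
drilled = browse(conn, ["country", "state_province"])        # after drilling down
print(top_level)
print(drilled)
```

The country totals are unchanged by the drill-down; they are only split across the members of the next level.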

Practical 04

Create a Database using Analysis Manager and create a Multi-Dimensional OLAP cube by using Snowflake schema.

Set Up the System Data Source Connection

A data source contains the information necessary to access source data for an object.

Why?

Before you begin working with Analysis Manager, you must first set connections to the source of your data in the ODBC Data Source Administrator.

How to set up your system data source name (DSN)

1. Microsoft Windows NT 4.0 users: Click the Start button, point to Settings, click Control Panel, and then double-click Data Sources (ODBC).

   Windows 2000 users: Click the Start button, point to Settings, click Control Panel, double-click Administrative Tools, and then double-click Data Sources (ODBC).

2. On the System DSN tab, click Add.
3. Select Microsoft Access Driver (*.mdb), and then click Finish.
4. In the Data Source Name box, enter pract4DSN, and then under Database, click Select.
5. In the Select Database dialog box, browse to C:\Program Files\Microsoft Analysis Services\Samples, and then click FoodMart 2000.mdb. Click OK.
6. In the ODBC Microsoft Access Setup dialog box, click OK.
7. In the ODBC Data Source Administrator dialog box, click OK.

Start Analysis Manager

Analysis Manager is a snap-in program that runs on Microsoft Management Console (MMC).

How to start Analysis Manager

Click the Start button, point to Programs, Microsoft SQL Server, and Analysis Services, and then click Analysis Manager.

Set Up the Database and Data Source

How to set up your database structure

1. In the Analysis Manager tree view, expand Analysis Servers.
2. Click the name of your server. A connection with the Analysis server will be established.
3. Right-click your server's name, and then click New Database.
4. In the Database dialog box, in the Database name box, enter pract4DW, and then click OK.
5. In the Analysis Manager tree pane, expand the server, and then expand the pract4DW database you just created.

Next, set up a connection to the sample data through the pract4DSN data source.

How to set up your data source

1. In the Analysis Manager tree pane, right-click the Data Sources folder under the pract4DW database, and then click New Data Source.
2. In the Data Link Properties dialog box, click the Provider tab, and then click Microsoft OLE DB Provider for ODBC Drivers.
3. Click the Connection tab, and then from the Use data source name list, click pract4DSN.
4. Click Test Connection to be sure everything works. A message should appear in the Microsoft Data Link dialog box, stating that your connection was successful. In the message box, click OK.
5. Click OK to close the Data Link Properties dialog box.

Build a Cube

How to open the Cube Wizard

In the Analysis Manager tree pane, under the pract4DW database, right-click the Cubes folder, point to New Cube, and then click Wizard.

How to add measures to the cube

1. In the Welcome step of the Cube Wizard, click Next.
2. In the Select a fact table from a data source step, expand the data source, and then click sales_fact_1998.
3. You can view the data in the sales_fact_1998 table by clicking Browse data. After you finish browsing data, close the Browse data window, and then click Next.
4. To define the measures for your cube, under Fact table numeric columns, double-click store_sales. Repeat this procedure for the store_cost and unit_sales columns, and then click Next.

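How to build your Product dimension

1. Click New Dimension. In the Welcome to the Dimension Wizard step, click Next.
2. In the Choose how you want to create the dimension step, select Snowflake Schema: Multiple, related dimension tables, and then click Next.
3. In the Select the dimension tables step, double-click product and product_class to add them to Selected tables. Click Next.
4. The two tables you selected in the previous step and the existing join between them are displayed in the Create and edit joins step of the Dimension Wizard. Click Next.
5. To define the levels for your dimension, under Available columns, double-click the product_category, product_subcategory, and brand_name columns, in that order. After you double-click each column, its name appears under Dimension levels. Click Next after you have selected all three columns.
6. In the Specify the member key columns step, click Next.
7. In the Select advanced options step, click Next.
8. In the last step of the wizard, enter Product in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
9. You should see the Product dimension in the Cube dimensions list.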



How to build your Store dimension

1. Click New Dimension again. In the Welcome to the Dimension Wizard step, click Next.
2. In the Choose how you want to create the dimension step, select Snowflake Schema: Multiple, related dimension tables, and then click Next.
3. In the Select the dimension tables step, double-click store and region to add them to Selected tables. Click Next.
4. The two tables you selected in the previous step and the existing join between them are displayed in the Create and edit joins step of the Dimension Wizard. Click Next.
5. To define the levels for your dimension, under Available columns, double-click the store country, sales country, sales region, store state, sales city, and store city columns, in that order. After you double-click each column, its name appears under Dimension levels. Click Next after you have selected all six columns.
6. In the Specify the member key columns step, click Next.
7. In the Select advanced options step, click Next.
8. In the last step of the wizard, enter Store in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.
9. You should see the Store dimension in the Cube dimensions list.


How to finish building your cube

1. In the Cube Wizard, click Next.
2. Click Yes when prompted by the Fact Table Row Count message.
3. In the last step of the Cube Wizard, name your cube pract4CUBE, and then click Finish.
4. The wizard closes and then launches Cube Editor, which contains the cube you just created. You can arrange the tables by dragging their blue or yellow title bars.

Process the Cube

1. Close Cube Editor, and in Analysis Manager expand the Cubes folder.
2. Right-click pract4CUBE.
3. Select the Process option.



Browse Cube Data


How to view cube data using Cube Browser

1. In the Analysis Manager tree pane, right-click the pract4CUBE cube, and then click Browse Data.

2. Cube Browser appears, displaying a grid made up of one dimension and the measures of your cube. The other dimension appears at the top of the browser.

How to replace a dimension in the grid

1. To replace one dimension in the grid with another, drag the dimension from the top box and drop it directly on top of the column you want to exchange it with. Make sure the pointer appears with a double-ended arrow during this process.

2. Using this drag and drop technique, select the Store dimension button and drag it to the grid, dropping it directly on top of Measures. The Store and Measures dimensions will switch positions in Cube Browser.

3. When you are finished, click Close to close Cube Browser.

Subject: Data Mining

Practical 05

Create a Mining Model by using Relational Data.

Create a Relational Data Mining Model Using Microsoft Decision Trees

Scenario: The Marketing department is now getting familiar with data mining techniques. They realize the data warehouse contains a great deal of information that is not in the cube. They want to analyze this detailed information to find out whether it will reveal interesting facts about customers' buying behavior.

In this section you will create a relational mining model using the Microsoft Decision Trees algorithm to investigate the data warehouse data.

How to create a data mining model that discovers customer patterns

1. In the Analysis Manager tree pane, right-click the Mining Models folder, and then click New Mining Model.

2. The Mining Model Wizard opens. In the Welcome to the Mining Model Wizard step, click Next.

3. In the Select source type step, click Relational Data. Click Next.

4. In the Select case tables step, click A single table contains the data. In the Available tables box, select Customer. Click Next.

5. In the Select data mining technique step, in the Technique box, select Microsoft Decision Trees. Click Next.

6. In the Select the key column step, in the Case key column box, click customer_id. Click Next.

7. In the Select input and predictable columns step, select the following columns and successively move them to the Predictable columns box using the > button: marital_status, yearly_income, num_children_at_home, total_children, education, member_card, occupation, houseowner, num_cars_owned.

8. The same columns will also be used as input columns. Select the same columns and move them to the Input columns box by using the > button next to the Input column list. Click Next.

9. In the final step, in the Model Name box, enter Advanced customer patterns discovery. Ensure that Save and process now is selected. Click Finish.

10. The Process window appears, showing your model being processed. When processing is complete, the message "Processing completed successfully" appears. Click Close.

How to read the Customer decision tree

1. You are now in Relational Mining Model Editor. You can use this editor to edit properties of the model or to browse the result of it. Maximize the Relational Mining Model Editor.

2. Click the Content tab at the bottom of the right pane.

3. The decision tree for the Education characteristic appears. In the Mining Model Wizard, you selected several columns from the relational table as both input and predictable columns for the mining model. This means that those columns were used to train the model and were also the targets of its predictions. Consequently, the relational mining model generated one decision tree for each predictable column. Each decision tree is defined by nodes determined by the other columns. In the Education decision tree example, you can see that the two most important factors that predict the likelihood that the customer has a certain education level are his or her yearly income (defined by the 1st level of the tree) and his or her occupation (defined by the 2nd level of the tree).

4. Now you have two main ways to investigate and navigate the tree further: You can double-click on nodes of the tree, or you can use the content navigator pane. You can see that the tree extends beyond the right edge of the editor. To access those invisible nodes, you can make one of the nodes in the branch that you investigate the new root of your current decision tree view. To do this, double-click the selected node. In this example, double-click Yearly Income = $30K - $50K. The decision tree makes this node the root of the current view and creates more space to display all its children.

5. You can see in the content navigator pane that the part of the tree currently displayed in the content detail pane is magnified. Now, move your mouse over the content navigator pane and click different locations. You can see that the decision tree magnifies what it displays in the content detail pane based on the position of your mouse. To return to the original tree pane, in the content navigator pane, move your mouse over the root of the tree and click on it to refresh the content detail pane.

6. To investigate other trees, in the Prediction Tree box, select Yearly Income. Its decision tree appears. You can see that this tree is much deeper and larger than the previous one. You can use the two navigation methods described in the previous step to navigate this tree.

7. In a similar way, select other characteristics in the Prediction Tree box and investigate the various characteristic patterns.
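The idea of one tree per predictable column, with the most informative input column at the first level, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer rows are hypothetical, and information gain is used as a stand-in scoring measure. It is not claimed to be the scoring method that Microsoft Decision Trees actually uses.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def information_gain(rows, input_col, target_col):
    """Reduction in entropy of target_col after splitting rows on input_col."""
    base = entropy([r[target_col] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[input_col], []).append(r[target_col])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def best_root_split(rows, target_col, columns):
    """Pick the input column (any column except the target) with the highest gain."""
    candidates = [c for c in columns if c != target_col]
    return max(candidates, key=lambda c: information_gain(rows, c, target_col))

# Hypothetical customer rows (not taken from the Customer table).
customers = [
    {"education": "Graduate", "yearly_income": "$50K+", "marital_status": "M"},
    {"education": "Graduate", "yearly_income": "$50K+", "marital_status": "S"},
    {"education": "High School", "yearly_income": "$10K - $30K", "marital_status": "M"},
    {"education": "High School", "yearly_income": "$10K - $30K", "marital_status": "S"},
]
columns = ["education", "yearly_income", "marital_status"]

# One root split per predictable column, mirroring "one tree per predictable column".
root_splits = {target: best_root_split(customers, target, columns) for target in columns}
```

With these sample rows, the root of the education tree splits on yearly_income, because income separates the education levels completely while marital status does not separate them at all.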


Subject: Data Mining

Practical 06

Create a Mining Model by using OLAP Data.

Step 1: Create a database: pract6DM

Step 2: Create a data source

Build a Cube

A cube is a multidimensional structure of data. Cubes are defined by a set of dimensions and measures.
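To make the definition concrete, the following Python sketch models a cube as a mapping from dimension members to summed measures. It is a simplified illustration, not how Analysis Services stores a cube. The measure names store_sales and unit_sales come from this practical; the fact rows and the build_cube helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: dimension keys plus numeric measures.
facts = [
    {"store": "Store 1", "product": "Food", "store_sales": 10.0, "unit_sales": 4},
    {"store": "Store 1", "product": "Drink", "store_sales": 6.0, "unit_sales": 3},
    {"store": "Store 2", "product": "Food", "store_sales": 5.0, "unit_sales": 2},
]

def build_cube(rows, dimensions, measures):
    """Sum each measure for every combination of the chosen dimension members."""
    cube = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        cell = cube[tuple(row[d] for d in dimensions)]
        for m in measures:
            cell[m] += row[m]
    return dict(cube)

# The same facts viewed along one dimension and along two dimensions.
by_store = build_cube(facts, ["store"], ["store_sales", "unit_sales"])
by_store_product = build_cube(facts, ["store", "product"], ["store_sales", "unit_sales"])
```

Each key of the result is one cell of the cube, and each value holds the measures for that cell.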

How to open the Cube Wizard

In the Analysis Manager tree pane, under the pract6DM database, right-click the Cubes folder, click New Cube, and then click Wizard.


1. In the Welcome step of the Cube Wizard, click Next.

2. In the Select a fact table from a data source step, expand the pract6DM data source, and then click sales_fact_1998.

3. You can view the data in the sales_fact_1998 table by clicking Browse data. After you finish browsing data, close the Browse data window, and then click Next.

4. To define the measures for your cube, under Fact table numeric columns, double-click store_sales. Repeat this procedure for the store_cost and unit_sales columns, and then click Next.

How to build your Time dimension

1. In the Select the dimensions for your cube step of the wizard, click New Dimension. This calls the Dimension Wizard.

In the Select advanced options step, click Next.

How to build your Customer dimension


dimension table, and then click Next.

4. In the Select the dimension table step, click Customer, and then click Next.

5. In the Select the dimension type step, click Next.

6. To define the levels for your dimension, under Available columns, double-click the Country, State_Province, City, and lname columns, in that order. After you double-click each column, its name appears under Dimension levels. After you have selected all four columns, click Next.

7. In the Specify the member key columns step, click Next.

8. In the Select advanced options step, click Next.

9. In the last step of the wizard, enter Customer in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.

10. In the Cube Wizard, you should see the Customer dimension in the Cube dimensions list.

How to build your Product dimension




displayed in the Create and edit joins step of the Dimension Wizard. Click Next.

5. To define the levels for your dimension, under Available columns, double-click the product_category, product_subcategory, and brand_name columns, in that order. After you double-click each column, its name appears under Dimension levels. Click Next after you have selected all three columns.



How to build your Store dimension


dimension table, and then click Next.

4. In the Select the dimension table step, click Store, and then click Next.

5. In the Select the dimension type step, click Next.

6. To define the levels for your dimension, under Available columns, double-click the store_country, store_state, store_city, and store_name columns, in that order. After you double-click each column, its name will appear under Dimension levels. After you have selected all four columns, click Next.

7. In the Specify the member key columns step, click Next.

8. In the Select advanced options step, click Next.

9. In the last step of the wizard, enter Store in the Dimension name box, and leave the Share this dimension with other cubes box selected. Click Finish.

10. In the Cube Wizard, you should see the Store dimension in the Cube dimensions list.


1. In the Cube Wizard, click Next.

2. Click Yes when prompted by the Fact Table Row Count message.

3. In the last step of the Cube Wizard, name your cube pract6cube, and then click Finish.

Cube Editor

Edit a Cube

You can make changes to your existing cube by using Cube Editor.

How to edit your cube in Cube Editor

You can use two methods to get to Cube Editor:

1. In the Analysis Manager tree pane, right-click an existing cube, and then click Edit.

2. Create a new cube using Cube Editor directly. This method is not recommended unless you are an advanced user.

In the schema pane of Cube Editor, the fact table (with yellow title bar) and the joined dimension tables (blue title bars) are seen. In the Cube Editor tree pane, you can preview the structure of your cube in a hierarchical tree. You can edit the properties of the cube by clicking the Properties button at the bottom of the left pane.

How to add a dimension to an existing cube

At this point, you decide you need a new dimension to provide data on product promotions. You can easily build this dimension in Cube Editor.

NOTE: Dimensions built in Cube Editor are, by default, private dimensions; that is, they can be used only with the cube you are working on and cannot be shared with other cubes. They do not appear in the Shared Dimensions folder in the Analysis Manager tree view. When creating such a dimension through the Dimension Wizard, you can make it shared across cubes.

1. In Cube Editor, on the Insert menu, click Tables.

2. In the Select table dialog box, click the promotion table, click Add, and then click Close.

3. To define the new dimension, double-click the promotion_name column in the promotion table.

4. In the Map the Column dialog box, select Dimension, and then click OK.

5. Select the Promotion Name dimension in the tree view.

6. On the Edit menu, click Rename.

7. Type Promotion, and then press ENTER.

8. Save your changes.

9. Close Cube Editor. When prompted to design the storage, click No.

Design Storage and Process the Cube

You can design storage options for the data and aggregations in your cube. Before you can use or browse the data in your cubes, you must process them.

How to design storage by using the Storage Design Wizard

1. In the Analysis Manager tree pane, expand the Cubes folder, right-click the pract6cube, and then click Design Storage.

2. In the Welcome step, click Next.

3. Select MOLAP as your data storage type, and then click Next.

4. Under Set Aggregation Options, click Performance gain reaches. In the box, enter 40 to indicate the percentage.

You are instructing Analysis Services to give a performance boost of up to 40 percent, regardless of how much disk space this requires. Administrators can use this tuning ability to balance the need for query performance against the disk space required to store aggregation data.

5. Click Start.

6. You can watch the Performance vs. Size graph in the right side of the wizard while Analysis Services designs the aggregations. Here you can see how increasing performance gain requires additional disk space utilization. When the process of designing aggregations is complete, click Next.

7. Under What do you want to do?, select Process now, and then click Finish.

Note: Processing the aggregations may take some time.



Browse Cube Data

Now you're ready to browse the data in the pract6cube cube!


1. In the Analysis Manager tree pane, right-click the pract6cube, and then click Browse Data.

2. Cube Browser appears, displaying a grid made up of one dimension and the measures of your cube. The additional four dimensions appear at the top of the browser.

Build a Cube with Parent-Child Dimensions

A parent-child dimension is an organized hierarchy of members that is defined by its parent-child relationships. Often it does not have a symmetrical number of levels for each of its branches.

Why?

Parent-child dimensions are often used for describing employees or relationships between geographical areas. They can be used to represent charts of accounts (Profit & Loss, Balance Sheet, and so on). In some cases, Products or Customer dimensions can also be organized in a nonsymmetrical way. The parent-child schema is used in a relational database for this type of dimension: one column represents the children, and another represents the parents.
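A short Python sketch can show how two columns in one table describe a hierarchy whose branches have different depths. The column names employee_id, supervisor_id, and full_name follow the employee table used in this practical; the rows, the use of None for the top-level member, and the helper functions are hypothetical.

```python
# Hypothetical employee rows: (employee_id, supervisor_id, full_name).
# supervisor_id is None for the top-level member (an assumption of this sketch).
employees = [
    (1, None, "Director"),
    (2, 1, "Store Manager"),
    (3, 1, "HR Manager"),
    (4, 2, "Cashier"),
    (5, 4, "Trainee Cashier"),
]

def build_children(rows):
    """Map each parent key to the list of its child keys."""
    children = {}
    for employee_id, supervisor_id, _ in rows:
        children.setdefault(supervisor_id, []).append(employee_id)
    return children

def depth_below(children, member):
    """Number of levels beneath a member (0 for a leaf)."""
    kids = children.get(member, [])
    return 0 if not kids else 1 + max(depth_below(children, k) for k in kids)

children = build_children(employees)
roots = children[None]
# Depth of each branch directly under the root; the values differ,
# which is the nonsymmetrical shape described above.
branch_depths = {kid: depth_below(children, kid) for kid in children[roots[0]]}
```

Here the branch under employee 2 has two further levels while the branch under employee 3 has none, so a fixed list of levels could not describe this dimension.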

Scenario: Now the Sales cube is built. The HR department heard about this new analysis tool and wants to analyze employee salary by store.

In this section you will build an HR cube for employee salary analysis. You will create the employee dimension as a parent-child dimension. Then you will use it, as well as regular dimensions, to generate the HR cube.

How to open the Dimension Wizard for Analysis Manager

1. In the Analysis Manager tree pane, under the pract6DM database, right-click the Shared Dimensions folder, click New Dimension, and then click Wizard.

How to build your Employee dimension

1. In the Welcome step, click Next.

2. In the Choose how you want to create the dimension step, select Parent-Child: Two related columns in a single dimension table, and then click Next.

3. In the Select the dimension table step, click employee, and then click Next.

4. To define the child column, next to Member key, select employee_id. To define the parent column, next to Parent key, select supervisor_id. To define the Member name column, next to Member name, select full_name. Click Next.

5. In the Select advanced options step of the wizard, click Next.

6. In the final step, enter Employee in the Dimension name box. Click Finish.

7. You are now in Dimension Editor. On the File menu, click Exit to close Dimension Editor.

8. You should see the Employee dimension in the Shared dimensions list.

How to build the HR cube

1. In the Analysis Manager tree pane, under the pract6DM database, right-click the Cubes folder, click New Cube, and then click Wizard.

2. Follow the steps in the wizard to create an HR cube with the following characteristics:

a. Fact table: salary
b. Measures: salary_paid, vacation_used
c. Dimensions: Employee, Store, Time
d. Count fact table rows? Yes

3. In the last step of the wizard, name your cube HR, and then click Finish.

4. Cube Editor appears. To manually create the joins, drag the the_date field of the time_by_day table onto the pay_date field in the salary table.

5. Click the store_id field in the store table and drag it onto the store_id field in the employee table.

6. Remove the department_id join that was automatically created between the salary table and the employee table: Select the join by clicking on it, and then press Delete.

7. When this is complete, close Cube Editor. Click Yes when you are prompted to save the cube, but click No when you are prompted to design the storage.

Create a Calculated Member

You can create customized measures or dimension members, called calculated members, by combining cube data or by using arithmetic operators, numbers, and/or functions.

Why?

You can use calculated members to enhance your analysis by modeling the raw data into meaningful business indicators. Calculated members increase the value of your analysis. They can outline trends, behaviors, and exceptions.

Scenario: Now the Sales cube is populated with data. The Marketing department wants to enhance the pract6cube data and determine the average product price of the products sold at each store.

How to create a calculated member

1. In the Analysis Manager tree pane, under the pract6DM database, right-click the pract6cube, and then click Edit.

2. You are now editing the pract6cube in Cube Editor. The cube components (dimensions, measures, calculated members...) are listed in the left pane of Cube Editor.

3. Right-click Calculated Members, and then click New Calculated Member.

4. You are now in Calculated Member Builder. The first three boxes determine the dimension's characteristics of the calculated member: Parent dimension (the dimensions to which it belongs), Parent member (the parent under which it is attached), and Member name.

5. Leave Parent dimension set to Measures. The Parent Member box is unavailable because the measure dimension does not support hierarchies. In the Member name box, enter Average price.

6. The lower part of Calculated Member Builder provides all the components necessary for building the calculated member expression. Under Data, expand the Measures dimension, and then expand MeasuresLevel. The list of measures appears.

7. Select Store Sales, and then drag it into the Value expression box.

8. In the number and operator pad, click the / operator. The operator appears at the end of the expression in the Value expression box.

9. Under Data, select the Unit sales measure and drag it to the end of the expression in the Value expression box.

10. The calculated member is now completely defined. Click OK. Calculated Member Builder closes and you are back in Cube Editor. Notice that the newly created calculated member is now available in the Calculated Members folder in the left pane of Cube Editor.

11. Save your changes by clicking the Save icon or by clicking Save on the File menu.

How to view calculated member data

Calculated members are calculated on the fly. This means that the data resulting from the calculated member expression is never stored; it is calculated every time the calculated member is requested in an analysis.

1. To view data, click the Data tab at the bottom of the right pane. The data appears, with the Measures dimension in columns and the Customer dimension in rows. Notice that four columns appear: the three measures and the calculated member you just created, Average Price.

2.

3. Close Cube Editor.
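The behaviour of the Average price calculated member (Store Sales divided by Unit Sales, evaluated when requested rather than stored) can be sketched in Python. This is an illustration only: the aggregated cell values are hypothetical, and the average_price and browse helpers are names invented for this sketch, not part of Analysis Services.

```python
# Hypothetical aggregated cells, one per store; only the stored measures are kept.
cells = {
    "Store 1": {"store_sales": 30.0, "unit_sales": 12},
    "Store 2": {"store_sales": 45.0, "unit_sales": 15},
}

def average_price(cell):
    """Store Sales / Unit Sales for one cell; None when no units were sold."""
    if cell["unit_sales"] == 0:
        return None
    return cell["store_sales"] / cell["unit_sales"]

def browse(stored_cells):
    """Return a view that adds the calculated member without changing stored data."""
    return {
        member: {**cell, "average_price": average_price(cell)}
        for member, cell in stored_cells.items()
    }

view = browse(cells)
```

The stored cells still contain only store_sales and unit_sales after browsing. The ratio is taken on the aggregated values of each cell, so it is a ratio of sums rather than an average of per-row prices.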

Create Member Properties

A member property is an attribute of a dimension member. It provides end users with additional information about the member.

Why?

Member properties have a variety of uses. In addition to providing information about a member, member properties can be used in queries and thus provide end users with more options when analyzing cube data. Member properties can also be the basis of levels in virtual dimensions.

Scenario: The Marketing department wants to extend the pract6cube analysis capabilities to analyze customer sales data based on customer characteristics: gender, marital status, education, yearly income, number of children at home, and membership card.

In this section you will add six member properties to the Customer dimension: gender, marital status, education, yearly income, number of children at home, and membership card. These member properties will qualify each member of the Customer dimension.

How to create Member Properties

1. In the Analysis Manager tree pane, expand the Shared Dimensions folder.

2. Right-click the Customer dimension, and then click Edit.

3. In Dimension Editor, expand Lname. You will see the Member Properties folder for the level.

4. In the schema pane, drag the gender column from the Customer table to the Member Properties folder for Lname.

5. Repeat the previous step for the following five columns: marital_status, education, yearly_income, num_children_at_home, and member_card. You should see six member properties under Lname in the Member Properties folder: Gender, Marital Status, Education, Yearly Income, Num Children At Home, and Member Card.

6. On the File menu, click Save.

7. Close Dimension Editor.

Create a Virtual Dimension

A virtual dimension is a logical dimension based on the contents of a physical dimension. These contents can be either existing member properties in the physical dimension or columns in the tables of the physical dimension.

Why?

Using virtual dimensions, you can analyze cube data based on the member properties of the dimension members in a cube. The benefit is that this type of dimension does not consume disk space or processing time.
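As a rough illustration of the idea, the Python sketch below regroups a measure that is already available per customer by one of the customers' member properties. The customer names, property values, sales figures, and the slice_by_property helper are all hypothetical; the sketch only shows that the grouping can be derived from existing members and properties.

```python
from collections import defaultdict

# Hypothetical members of the Lname level with one member property each.
member_properties = {
    "Smith": {"yearly_income": "$30K - $50K"},
    "Jones": {"yearly_income": "$150K+"},
    "Patel": {"yearly_income": "$30K - $50K"},
}

# Hypothetical store_sales totals already aggregated per customer.
store_sales = {"Smith": 120.0, "Jones": 300.0, "Patel": 80.0}

def slice_by_property(measure_by_member, properties, property_name):
    """Regroup a per-member measure by the value of one member property."""
    totals = defaultdict(float)
    for member, value in measure_by_member.items():
        totals[properties[member][property_name]] += value
    return dict(totals)

sales_by_income = slice_by_property(store_sales, member_properties, "yearly_income")
```

The result has one entry per distinct property value, which is what the members of a virtual dimension built on that property would be.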

Scenario:

Now that you have added six member properties to the Customer dimension, you will create a virtual dimension with the Yearly Income member property and then add this newly created dimension to the pract6cube.

How to create a virtual dimension

1. In the Analysis Manager tree pane, right-click the Shared Dimensions folder, point to New Dimension, and then click Wizard.

2. In the Welcome step of the Dimension Wizard, click Next.

3. Select Virtual Dimension: The member properties of another dimension, and then click Next.

4. In the Select the dimension with the member properties step, click the Customer dimension, and then click Next.

5. In the Select the levels for the virtual dimension step, click the Lname.Yearly Income member property, and then click the add (>) button. Click Next.

6. In the Select advanced options step, make sure that no items in the Options box are checked. You will not need to set these advanced options at this time. Click Next.

7. In the Finish the Dimension Wizard step, in the Dimension Name box, enter Yearly Income.

8. Click Finish.

9. You are now in Dimension Editor. On the File menu, click Exit.

10. The new dimension is included in the list of shared dimensions.

How to add a virtual dimension to an existing cube

1. In the Analysis Manager tree view, right-click the pract6cube in the Cubes folder, and then click Edit.

2. In Cube Editor, right-click Dimensions in the left pane tree. Click Existing Dimensions.

3. In Dimension Manager, select the newly created dimension, Yearly Income, and drag it to the Cube dimensions list. Click OK.

4. Close Cube Editor. Click Yes when prompted to save the cube.

5. Click Yes when prompted by the Design Storage window.

6. Follow the Storage Design Wizard steps and select the following settings:

a. Data storage type: MOLAP
b. Aggregation options: Performance gain reaches 20%
c. Final step: Process the cube

7. Click Close in the Process dialog box when the last line reads: "Processing completed successfully".

Create an OLAP Data Mining Model Using Microsoft Decision Trees

A data mining model is a model that contains all the settings necessary to run a specific data mining task.

How to create a data mining model that discovers customer patterns

1. In the Analysis Manager tree view, expand the Cubes folder, right-click the pract6cube, then select New Mining Model.

2. The Mining Model Wizard opens. In the Select data mining technique step, in the Technique box, select Microsoft Decision Trees. Click Next.

3. In the Select case step, select Customer in the Dimension box. In the Level box, ensure that Lname is selected. Click Next.

4. In the Select predicted entity step, select A member property of the case level. Then, in the Member properties box, select Member Card.

5. Click Next.

6. In the Select training data step, scroll to the Customer dimension and clear the Country, State Province, and City boxes (we don't need to determine customer patterns at an aggregated level, only at the individual customer level). Click Next.

7. In the Create a dimension and virtual cube (optional) step, enter Customer Patterns in the Dimension name box. Then, in the Virtual cube name box, enter Trained Cube. Click Next.

8. In the final step, type Customer patterns discovery in the Model name field. Ensure that Save and Process now is selected. Click Finish.

9. A window appears that shows your model being processed. When processing is complete, the message "Processing completed successfully" appears. Click Close.

How to read the Customer decision tree

1. You are now in OLAP Mining Model Editor. You can use this editor to edit properties of the model or to browse the result of it. Maximize the OLAP Mining Model Editor.

2. In the right pane, the decision tree is displayed. It is composed of 4 panes. The content detail pane (1) in the middle represents the portion of the decision tree on which the focus is set. The content navigator pane (2) represents the complete view of the tree. It enables you to set the focus to a different part of the tree. The two other panes provide attributes information (3) that can be seen with numeric values (with the Totals tab) or graphically (with the Histogram tab) and the node path area (4) related to the node that has the focus.

3. In the decision tree area of the content detail pane, the color represents the density of Cases (in our case: density of customers). The darker it is, the more cases are contained in the node. Click on the All node. It is black because it represents 100% of the (7632) cases. 7632 represents the number of customers that were active in 1998 (customers who have transactions recorded in the Sales cube). This also shows that not all customers were active during 1998, because we only have 7632 cases out of the 9991 customers contained in the Lname level of the Customer dimension.

4. The attributes pane shows that for the All node, 55.83% of all cases, or 4263 cases, are likely to select the Bronze card; 11.50% to select the Golden card; 23.32% to select the Normal card and 9.34% to select the Silver card. The Probability column in the Totals panel of the attributes pane can be resized if percentage is not shown.

5. These percentages change if you select different nodes of the tree. Let's try to investigate which customers are likely to select the Golden card. To do this, we will redraw the tree to outline the high density of Golden cards. In the lower right-hand side, select Golden in the Tree color based on field. The tree now shows a different pattern of colors. We can see that the Customer.Lname.Yearly Income = $150K+ node has a higher density than any other node.

6. Double-click the Customer.Lname.Yearly Income = $150K+ node. The tree now displays only the sub-tree beneath the Customer.Lname.Yearly Income = $150K+ node. Select the Customer.Lname.Marital Status = M node. In the node path pane, you can see the complete characteristics definition of the customers contained in this node: customers whose yearly income is $150K+ and who are married. The attributes pane now shows that a higher percentage (81.05%) of customers than in the previous level (45.09%) are likely to select the Golden card.

7. You can look at other branches of the tree and investigate how likely a customer is to select one card over another. The Marketing department can use this information to determine the characteristics of customers who are most likely to select a specific type of card. Based on these characteristics (income, number of children, marital status, and so on), the card services and programs can be redefined to better fit their customers.

8. When you are finished analyzing the decision tree, close OLAP Mining Model Editor.
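The percentages shown in the attributes pane for a node can be thought of as the share of each card type among the cases that satisfy the node path. The Python sketch below illustrates that reading with a handful of hypothetical cases; the node_distribution helper and the sample data are assumptions of this sketch, and the figures it produces are unrelated to the 7632 cases in the practical.

```python
from collections import Counter

# Hypothetical cases (customers) with two inputs and the predicted property.
cases = [
    {"yearly_income": "$150K+", "marital_status": "M", "member_card": "Golden"},
    {"yearly_income": "$150K+", "marital_status": "M", "member_card": "Golden"},
    {"yearly_income": "$150K+", "marital_status": "S", "member_card": "Golden"},
    {"yearly_income": "$150K+", "marital_status": "S", "member_card": "Bronze"},
    {"yearly_income": "$30K - $50K", "marital_status": "M", "member_card": "Bronze"},
    {"yearly_income": "$30K - $50K", "marital_status": "S", "member_card": "Normal"},
]

def node_distribution(all_cases, node_path, target):
    """Share of each target value among the cases matching every condition in node_path."""
    matching = [c for c in all_cases if all(c[k] == v for k, v in node_path.items())]
    counts = Counter(c[target] for c in matching)
    return {label: n / len(matching) for label, n in counts.items()}

all_node = node_distribution(cases, {}, "member_card")
income_node = node_distribution(cases, {"yearly_income": "$150K+"}, "member_card")
married_node = node_distribution(
    cases, {"yearly_income": "$150K+", "marital_status": "M"}, "member_card"
)
```

Each extra condition in the node path narrows the set of cases, so the share of Golden cards changes from node to node in the same way the attributes pane changes when a different node is selected.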

Department of Computer Science and Information Technology Deccan Education Societys

Kirti College of Arts, Science and Commerce. [ NAAC Accredited : A Grade]

C E R T I F I C A T E

This is to certify that Mr./Miss ______________________ of M.Sc Part-I [COMPUTER SCIENCE] with Seat No._______ has successfully completed the practical of Advanced Database Management System under my supervision in this college during the year 2006 - 2007.

Lecturer-in-charge Head of Department (Mrs.Apurva Yadav) Dept of Com.Sc and I.T (Dr. Seema Purohit)


Deccan Education Societys


M.Sc. (Part-I) 2006 07

PAPER IV (SECTION-II)

ADVANCED DATABASE MANAGEMENT SYSTEM

INDEX

No. Title Page No. Date Sign

1 Introduction to SQL

2 Distributed Database

3 Object Oriented Database

4 Active Database

5 Temporal Database

6 Spatial Database

7 XML Database

8 Multimedia Database


Subject: ADBMS

Practical No.1

Revision to SQL

1. Basic Queries of SQL.

a. Create the table with the following columns:

empno number(4)
ename varchar2(15)
job varchar2(15)
join_date date
salary number(6)

2. Alter the table column Salary.
3. Insert the 10 records.
4. Update the 2 records.
5. Select the records.
6. Use the following functions: AVG(), MIN(), MAX(), CEIL(), FLOOR(), MOD(), POWER().

Practical No:1 Revision to SQL

Create table Query:-

create table emp(empno number(4), ename varchar2(15), job varchar2(15), join_date date, salary number(6));

Alter table :-

SQL> select * from emp;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        3 Kiran           Manager         18-APR-83     36024
        4 Raja            Manager         03-APR-83     40000
        5 Sayali          DBA             01-NOV-84     20000
        6 Reshma          Accountant      21-NOV-84     15000
        7 Kaustubh        Tester          28-SEP-83     18000
        8 Sunil           Tester          24-OCT-81     12000
        9 Saurbi          Clerk           05-OCT-83     10000
       10 Deepa           Programmer      15-AUG-79     21000

8 rows selected.

SQL> alter table emp add dept_no number(3);

Table altered.

SQL> select * from emp;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY   DEPT_NO
--------- --------------- --------------- --------- --------- ---------
        3 Kiran           Manager         18-APR-83     36024
        4 Raja            Manager         03-APR-83     40000
        5 Sayali          DBA             01-NOV-84     20000
        6 Reshma          Accountant      21-NOV-84     15000
        7 Kaustubh        Tester          28-SEP-83     18000
        8 Sunil           Tester          24-OCT-81     12000
        9 Saurbi          Clerk           05-OCT-83     10000
       10 Deepa           Programmer      15-AUG-79     21000

8 rows selected.

SQL> alter table emp drop column dept_no;

Table altered.

SQL> select * from emp;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        3 Kiran           Manager         18-APR-83     36024
        4 Raja            Manager         03-APR-83     40000
        5 Sayali          DBA             01-NOV-84     20000
        6 Reshma          Accountant      21-NOV-84     15000
        7 Kaustubh        Tester          28-SEP-83     18000
        8 Sunil           Tester          24-OCT-81     12000
        9 Saurbi          Clerk           05-OCT-83     10000
       10 Deepa           Programmer      15-AUG-79     21000

8 rows selected.

Insert Query :-

insert into emp values(3,'Kiran','Manager','18-Apr-1983',35000);
insert into emp values(4,'Raja','Manager','03-Apr-1983',40000);
insert into emp values(5,'Sayali','DBA','1-Nov-1984',30000);
insert into emp values(6,'Reshma','Accountant','21-Nov-1984',15000);
insert into emp values(7,'Kaustubh','Tester','28-Sep-1983',18000);
insert into emp values(8,'Sunil','Tester','24-Oct-1981',12000);
insert into emp values(9,'Saurbi','Clerk','05-Oct-1983',10000);
insert into emp values(10,'Deepa','Programmer','15-Aug-1979',21000);

Select Query:-

SQL> select * from emp;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        3 Kiran           Manager         18-APR-83     35000
        4 Raja            Manager         03-APR-83     40000
        5 Sayali          DBA             01-NOV-84     30000
        6 Reshma          Accountant      21-NOV-84     15000
        7 Kaustubh        Tester          28-SEP-83     18000
        8 Sunil           Tester          24-OCT-81     12000
        9 Saurbi          Clerk           05-OCT-83     10000
       10 Deepa           Programmer      15-AUG-79     21000

8 rows selected.

Update Query:-

update emp set salary=20000 where ename='Sayali';

Select Query :-

SQL> select * from emp where ename='Sayali';

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        5 Sayali          DBA             01-NOV-84     20000

Function :-

AVG():-

SQL> select avg(salary) from emp;

AVG(SALARY)
-----------
      20600

MIN():-

SQL> select min(salary) from emp;

MIN(SALARY)
-----------
      10000

MAX():-

SQL> select max(salary) from emp;

MAX(SALARY)
-----------
      40000

MOD():-

SQL> select * from emp where mod(salary,7)=0;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        3 Kiran           Manager         18-APR-83     35000
       10 Deepa           Programmer      15-AUG-79     21000

POWER():-

SQL> update emp set salary=salary+power(2,10) where empno=3;

1 row updated.

SQL> select * from emp where empno=3;

    EMPNO ENAME           JOB             JOIN_DATE    SALARY
--------- --------------- --------------- --------- ---------
        3 Kiran           Manager         18-APR-83     36024

FLOOR():-

SQL> select floor(15.7) from dual;

FLOOR(15.7)
-----------
         15

CEIL():-

SQL> select ceil(15.7) from dual;

CEIL(15.7)
----------
        16
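The MIN, MAX, MOD, POWER, FLOOR, and CEIL results above can be cross-checked outside Oracle with a short Python sketch. It is only an approximation of the session: an in-memory SQLite database stands in for Oracle, the % operator stands in for MOD(), and Python's math module stands in for POWER(), FLOOR(), and CEIL(). The rows are the ones from the insert statements above.

```python
import math
import sqlite3

# In-memory SQLite database standing in for the Oracle emp table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table emp(empno integer, ename text, job text, join_date text, salary integer)"
)
rows = [
    (3, "Kiran", "Manager", "18-Apr-1983", 35000),
    (4, "Raja", "Manager", "03-Apr-1983", 40000),
    (5, "Sayali", "DBA", "1-Nov-1984", 30000),
    (6, "Reshma", "Accountant", "21-Nov-1984", 15000),
    (7, "Kaustubh", "Tester", "28-Sep-1983", 18000),
    (8, "Sunil", "Tester", "24-Oct-1981", 12000),
    (9, "Saurbi", "Clerk", "05-Oct-1983", 10000),
    (10, "Deepa", "Programmer", "15-Aug-1979", 21000),
]
conn.executemany("insert into emp values (?, ?, ?, ?, ?)", rows)

# MIN() and MAX() are built-in aggregates in SQLite as well.
min_salary, max_salary = conn.execute(
    "select min(salary), max(salary) from emp"
).fetchone()

# MOD(salary, 7) = 0 written with the % operator.
divisible_by_7 = [
    r[0] for r in conn.execute("select ename from emp where salary % 7 = 0 order by empno")
]

# POWER(2, 10) computed in Python and passed in as a bind parameter.
conn.execute("update emp set salary = salary + ? where empno = 3", (int(math.pow(2, 10)),))
kiran_salary = conn.execute("select salary from emp where empno = 3").fetchone()[0]

# FLOOR(15.7) and CEIL(15.7).
floor_value, ceil_value = math.floor(15.7), math.ceil(15.7)
```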


Subject: ADBMS

Practical No.2 2.1 1. For the following global conceptual schema, divide the schema into Horizontal fragments

and place them on different nodes. Implement at least 5 suitable queries on these fragments that will demonstrate distributed databases environment.

We are given the following three relations with their keys underlined: Supplier( Sno,Sname,City,State) Part( Pno,Pname,Color) Supplier-Part( Sno,Pno,Qty). We know that Suppliers can supply many Parts and many Suppliers can supply a Part. Assume the Supplier table is horizontally fragmented using the predicates: State = Maharashtra State = Karnataka. We can also assume that Suppliers are evenly located in only those two states. In addition, the Part table is horizontally fragmented using the predicates: 1 Pno 100, 101 Pno 200. Part numbers are continuous from 1 to 500, inclusive. Fragment the Supplier- Part relation according to your choice horizontally. Implement at least 5 suitable queries using oracle 8i/9i.

2. For a given global conceptual schema, divide the schema into vertical fragments and place them on different nodes. Implement at least 5 suitable queries on these fragments that will demonstrate a distributed database environment.

Assume we have a global conceptual schema that contains the following table (key: Eno):

Employee(Eno, Ename, Job, Dno, Dname, Location)

Also assume that we vertically fragment the table as follows:

Employee(Eno, Ename, Job, Dno)
Department(Dno, Dname, Location)

In addition, assume we have 2 nodes/sites that contain the following fragments:

Site1/Node1 has Employee
Site2/Node2 has Department

Implement at least 5 suitable queries using Oracle 8i/9i on the Employee fragments.

3. Place replicas of the global conceptual schema on different nodes and implement at least 5 suitable queries that will demonstrate a distributed database environment.

Assume a schema Student(Rollno, Name, Std, Marks). When a record is inserted, updated or deleted on one node, the same action must be fired on the other node at the same time.
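The replication requirement in problem 3 can be sketched as a helper that applies every insert, update or delete to all nodes and undoes it everywhere if any node rejects it. The journal does not say which Oracle mechanism is intended, so this Python/sqlite3 example only illustrates the behaviour; the `replicate` helper and the sample rows are hypothetical, and the two in-memory databases stand in for the two nodes.

```python
import sqlite3

# Two independent in-memory databases stand in for the two nodes.
nodes = [sqlite3.connect(":memory:") for _ in range(2)]
for node in nodes:
    node.execute(
        "CREATE TABLE student "
        "(rollno INTEGER PRIMARY KEY, name TEXT, std TEXT, marks INTEGER)"
    )


def replicate(sql, params=()):
    """Apply the same DML on every node; roll all nodes back on failure."""
    try:
        for node in nodes:
            node.execute(sql, params)
    except sqlite3.Error:
        for node in nodes:
            node.rollback()
        raise
    for node in nodes:
        node.commit()


# Hypothetical sample data, not taken from the journal.
replicate("INSERT INTO student VALUES (?, ?, ?, ?)", (1, "Asha", "FY", 72))
replicate("INSERT INTO student VALUES (?, ?, ?, ?)", (2, "Ravi", "FY", 65))
replicate("UPDATE student SET marks = ? WHERE rollno = ?", (80, 1))
replicate("DELETE FROM student WHERE rollno = ?", (2,))

# Both replicas should now hold identical contents.
snapshots = [
    node.execute("SELECT * FROM student ORDER BY rollno").fetchall()
    for node in nodes
]
print(snapshots)
```

This is not a full two-phase commit; it only shows that each change reaches both copies or neither.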

Practical No:2 Distributed Database

HORIZONTAL FRAGMENTATION

Create the table Supplier in oracle9i

create table supplier9i(sno number(3),sname varchar2(30),
city varchar2(20),state varchar2(20))
/

Insert at least 10 records in the Supplier table in oracle9i

insert into supplier9i values(&sno,'&sname','&city','&state');

SQL> select * from supplier9i;

  SNO SNAME      CITY       STATE
----- ---------- ---------- ------------
    1 Ashwini    Mumbai     MH
    2 Supriya    Vasai      MH
    3 Shilpa     Thane      MH
    4 Pallavi    Dadar      MH
    5 Aditya     Vasai      MH
    6 Sandesh    Parel      MH
    7 Saurabh    Virar      MH
    8 Samadhan   Worli      MH
    9 Jayesh     Thane      MH
   10 Amit       Byculla    MH

10 rows selected.

Create the table Part in oracle9i

create table part9i(pno number(3),pname varchar2(30),
color varchar2(20))
/

Table created.

Insert at least 10 records in the Part table in oracle9i

insert into part9i values (&pno,'&pname','&color');

SQL> select * from part9i;

      PNO PNAME      COLOR
--------- ---------- ------
       11 cpu        silver
       12 floppy     pink
       13 monitor    white
       14 keyboard   black
       15 mouse      blue
       16 web camera blue

6 rows selected.

Create the table Supplier_Part in oracle9i

SQL> create table supplier_part9i(sno number(3),pno number(3),
qty number(5))
/

Insert at least 10 records in the Supplier_Part table in oracle9i

SQL> insert into supplier_part9i values (&sno,&pno,&qty);

SQL> select * from supplier_part9i order by sno;

      SNO       PNO       QTY
--------- --------- ---------
        1        16        20
        2        14        55
        3        12        24
        4        12        45
        5        13        41
        6        17        40
        7        12        34
        8        14        22
        9        11        20
       10        11        34

10 rows selected.

Create the table Supplier in oracle8i

create table supplier8i(sno number(3),sname varchar2(30),
city varchar2(20),state varchar2(20))
/

Insert at least 10 records in the Supplier table in oracle8i

insert into supplier8i values(&sno,'&sname','&city','&state');

SQL> select * from supplier8i;

    SNO SNAME      CITY       STATE
------- ---------- ---------- ------------
     11 niraj      thane      mp
     12 yogesh     thane      mp
     13 swapnil    worli      mp
     14 sachin     dadar      mp
     15 salil      chembur    mp
     16 saloni     mumbai     mp
     17 snehal     borivali   mp
     18 kamini     dahisar    mp
     19 Ratna      Vasai      mp
     20 Ramesh     Vasai      mp

10 rows selected.

SQL> insert into part8i values(&pno,'&pname','&color');

SQL> select * from part8i;

      PNO PNAME      COLOR
--------- ---------- ----------
      200 floppy     white
      201 pendrive   white
      202 cpu        black
      203 lancard    black
      204 touchpad   red
      205 cdrom      white

6 rows selected.

Create the table Supplier_Part in oracle8i

create table supplier_part8i(sno number(3),pno number(3),
qty number(5))
/

insert into supplier_part8i values (&sno,&pno,&qty);

SQL> select * from supplier_part8i order by pno;

    SNO     PNO     QTY
------- ------- -------
    204      11      12
    202      12      25
    203      13      28
    201      14      15
    201      15      20
    203      16      36
    201      17      40
    205      18      30
    203      19      20
    205      20      50

10 rows selected.

Display the names of all the suppliers from both states

SQL> select s.sname,e.sname from supplier9i s,msc26.supplier8i@oracle8i e
where s.sno=e.sno;

SNAME      SNAME
---------- ----------
Ashwini    niraj
Supriya    yogesh
Silpa      swapnil
Pallavi    sachin
Aditya     salil
Sandesh    saloni
Saurabh    snehal
samadhan   kamini
jayesh     Ratna
amit       Ramesh

10 rows selected.

SQL> select sno from supplier9i
  2  union
  3  select sno from msc26.supplier8i@oracle8i;

      SNO
---------
        1
        2
        3
        4
        5
        6
        7
        8
        9
       10

10 rows selected.

SQL> select sno from supplier9i
  2  intersect
  3  select sno from msc26.supplier8i@oracle8i;

      SNO
---------
        1
        2
        3
        4
        5
        6
        7
        8
        9
       10

10 rows selected.

SQL> select sname from supplier9i
  2  union all
  3  select sname from msc26.supplier8i@oracle8i;

SNAME
----------
Ashwini
Supriya
Silpa
Pallavi
Aditya
Sandesh
Saurabh
samadhan
jayesh
amit
niraj
yogesh
swapnil
sachin
salil
saloni
snehal
kamini
Ratna
Ramesh

20 rows selected.

SQL> select s.sno,s.sname,p.pname,sp.qty from supplier9i s,part9i p, supplier_part9i sp where
  2  s.sno=sp.sno and p.pno=sp.pno;

     SNO SNAME      PNAME              QTY
-------- ---------- ------------ ---------
       1 Ashwini    web camera          20
       2 Supriya    keyboard            55
       3 Silpa      floppy              24
       4 Pallavi    floppy              45
       5 Aditya     monitor             41
       7 Saurabh    floppy              34
       8 samadhan   keyboard            22
       9 jayesh     cpu                 20
      10 amit       cpu                 34

9 rows selected.

SQL> select s.sno,s.sname,p.pname,sp.qty from supplier9i s,part9i p, supplier_part9i sp where
  2  s.sno=sp.sno and p.pno=sp.pno
  3  union
  4  select s.sno,s.sname,p.pname,sp.qty from msc26.supplier8i@oracle8i s,
  5  msc26.part8i@oracle8i p, msc26.supplier_part8i@oracle8i sp where
  6  s.sno=sp.sno and p.pno=sp.pno;

  SNO SNAME      PNAME             QTY
----- ---------- ----------- ---------
    1 Ashwini    web camera         20
    1 niraj      cpu                12
    2 Supriya    keyboard           55
    3 Silpa      floppy             24
    3 swapnil    touchpad           28
    4 Pallavi    floppy             45
    4 sachin     floppy             15
    5 Aditya     monitor            41
    5 salil      floppy             20
    6 saloni     touchpad           36
    7 Saurabh    floppy             34
    7 snehal     floppy             40
    8 kamini     cdrom              30
    8 samadhan   keyboard           22
    9 Ratna      touchpad           20
    9 jayesh     cpu                20
   10 Ramesh     cdrom              50
   10 amit       cpu                34

18 rows selected.

Total qty available in both states

select sum(s.qty),sum(e.qty) from supplier_Part9i s,scott.supplier_part8i@oracle8i e
where s.sno=e.sno;

select pname,sum(qty) from part9i,supplier_part9i
where part9i.pno=supplier_part9i.pno group by pname;
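A total over horizontally fragmented data can also be computed by summing each fragment separately and then combining the fragments with UNION ALL, which avoids depending on matching sno values across the two nodes. The sketch below is an illustration in Python with sqlite3 rather than the Oracle session itself: the attached in-memory database stands in for the oracle8i node, and the rows are the Supplier_Part values listed earlier in this practical.

```python
import sqlite3

# The attached in-memory database stands in for the remote oracle8i node.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS node2")
conn.execute(
    "CREATE TABLE main.supplier_part9i (sno INTEGER, pno INTEGER, qty INTEGER)"
)
conn.execute(
    "CREATE TABLE node2.supplier_part8i (sno INTEGER, pno INTEGER, qty INTEGER)"
)

# Rows copied from the two Supplier_Part listings in this practical.
rows_9i = [(1, 16, 20), (2, 14, 55), (3, 12, 24), (4, 12, 45), (5, 13, 41),
           (6, 17, 40), (7, 12, 34), (8, 14, 22), (9, 11, 20), (10, 11, 34)]
rows_8i = [(204, 11, 12), (202, 12, 25), (203, 13, 28), (201, 14, 15),
           (201, 15, 20), (203, 16, 36), (201, 17, 40), (205, 18, 30),
           (203, 19, 20), (205, 20, 50)]
conn.executemany("INSERT INTO main.supplier_part9i VALUES (?, ?, ?)", rows_9i)
conn.executemany("INSERT INTO node2.supplier_part8i VALUES (?, ?, ?)", rows_8i)

# One total per fragment, labelled by the node that holds it.
per_fragment = dict(conn.execute(
    "SELECT 'node1', SUM(qty) FROM main.supplier_part9i "
    "UNION ALL "
    "SELECT 'node2', SUM(qty) FROM node2.supplier_part8i"
).fetchall())

# Grand total over the union of both fragments.
total = conn.execute(
    "SELECT SUM(qty) FROM ("
    "SELECT qty FROM main.supplier_part9i "
    "UNION ALL "
    "SELECT qty FROM node2.supplier_part8i)"
).fetchone()[0]

print(per_fragment, total)
```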

VERTICAL FRAGMENTATION

SQL> create table emp(empno number(5),ename varchar2(20),job varchar2(15),deptno number(4));

Table created.
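The emp table created above holds the Employee fragment (empno, ename, job, deptno); per the problem statement, the Department fragment (deptno, dname, location) sits on the other node. A sketch of how the global Employee relation is rebuilt by joining the two fragments on the shared department number follows. It uses Python with sqlite3 purely for illustration: the attached in-memory database stands in for the second node, and only a few of the journal's rows are loaded.

```python
import sqlite3

# The attached in-memory database stands in for the second node.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS node2")

# Fragment 1 (node 1): employee attributes plus the shared column deptno.
conn.execute(
    "CREATE TABLE main.emp "
    "(empno INTEGER PRIMARY KEY, ename TEXT, job TEXT, deptno INTEGER)"
)
# Fragment 2 (node 2): department attributes keyed by deptno.
conn.execute(
    "CREATE TABLE node2.dept "
    "(deptno INTEGER PRIMARY KEY, dname TEXT, location TEXT)"
)

# A subset of the rows used in this practical.
conn.executemany(
    "INSERT INTO main.emp VALUES (?, ?, ?, ?)",
    [(2, "Pallavi", "Md", 10), (5, "Aditya", "Agm", 20),
     (7, "Samadhan", "Clerk", 20)],
)
conn.executemany(
    "INSERT INTO node2.dept VALUES (?, ?, ?)",
    [(10, "Accounting", "Mumbai"), (20, "Research", "Jalgaon")],
)

# Joining the fragments on deptno reconstructs the global relation
# Employee(Eno, Ename, Job, Dno, Dname, Location).
global_employee = conn.execute(
    "SELECT e.empno, e.ename, e.job, e.deptno, d.dname, d.location "
    "FROM main.emp AS e JOIN node2.dept AS d ON d.deptno = e.deptno "
    "ORDER BY e.empno"
).fetchall()

for row in global_employee:
    print(row)
```

The shared deptno column is what makes the split lossless: without it in both fragments the join could not pair each employee with its department.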

SQL> /
Enter value for empno: 1
Enter value for ename: 'Sandesh'
Enter value for job: 'CEO'
Enter value for deptno: 10
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(1,'Sandesh','CEO',10)

1 row created.

SQL> /
Enter value for empno: 2
Enter value for ename: 'Pallavi'
Enter value for job: 'Md'
Enter value for deptno: 10
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(2,'Pallavi','Md',10)

1 row created.

SQL> /
Enter value for empno: 3
Enter value for ename: 'Ashwini'
Enter value for job: 'Md'
Enter value for deptno: 10
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(3,'Ashwini','Md',10)

1 row created.

SQL> /
Enter value for empno: 4
Enter value for ename: 'Supriya'
Enter value for job: 'Ceo'
Enter value for deptno: 10
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(4,'Supriya','Ceo',10)

1 row created.

SQL> /
Enter value for empno: 5
Enter value for ename: 'Aditya'
Enter value for job: 'Agm'
Enter value for deptno: 20
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(5,'Aditya','Agm',20)

1 row created.

SQL> /
Enter value for empno: 6
Enter value for ename: 'Saurabh'
Enter value for job: 'Gm'
Enter value for deptno: 20
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(6,'Saurabh','Gm',20)

1 row created.

SQL> /
Enter value for empno: 7
Enter value for ename: 'Samadhan'
Enter value for job: 'Clerk'
Enter value for deptno: 20
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(7,'Samadhan','Clerk',20)

1 row created.

SQL> /
Enter value for empno: 8
Enter value for ename: 'Ram'
Enter value for job: 'Peon'
Enter value for deptno: 20
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(8,'Ram','Peon',20)

1 row created.

SQL> /
Enter value for empno: 9
Enter value for ename: 'Shayam'
Enter value for job: 'Security'
Enter value for deptno: 10
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(9,'Shayam','Security',10)

1 row created.

SQL> /
Enter value for empno: 10
Enter value for ename: 'Gopal'
Enter value for job: 'Md'
Enter value for deptno: 20
old   1: insert into emp values(&empno,&ename,&job,&deptno)
new   1: insert into emp values(10,'Gopal','Md',20)

1 row created.

SQL> select * from emp;

     EMPNO ENAME       JOB             DEPTNO
---------- ----------- ----------- ----------
         1 Sandesh     Ceo                 10
         2 Pallavi     Md                  10
         3 Ashwini     Md                  10
         4 Supriya     Ceo                 10
         5 Aditya      Agm                 20
         6 Saurabh     Gm                  20
         7 Samadhan    Clerk               20
         8 Ram         Peon                20
         9 Shayam      Security            10
        10 Gopal       Md                  20

10 rows selected.

SQL> conn msc01/msc01@oracle8i
Connected.

SQL> create table dept(deptno number(4),dname varchar2(20),location varchar2(20));

Table created.

SQL> insert into dept values(&deptno,&dname,&location);
Enter value for deptno: 10
Enter value for dname: 'Accounting'
Enter value for location: 'Mumbai'
old   1: insert into dept values(&deptno,&dname,&location)
new   1: insert into dept values(10,'Accounting','Mumbai')

1 row created.

SQL> /
Enter value for deptno: 20
Enter value for dname: 'Research'
Enter value for location: 'Jalgaon'
old   1: insert into
