power point for data mining

Data Mining & OLAP

Upload: tommy96

Post on 20-May-2015


TRANSCRIPT

Page 1: Power Point for Data Mining

Data Mining & OLAP

Page 2: Power Point for Data Mining

What is Data Mining?

Data Mining is the set of activities used to find new, hidden, or unexpected patterns in data.
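As a minimal illustration of finding a pattern that is not visible from any single record, the sketch below counts which items tend to appear together. The transactions, the `frequent_pairs` helper, and the `min_support` threshold are all hypothetical and are not taken from the slides; co-occurrence counting is only one of many data mining techniques.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"beer", "chips", "milk"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(transactions, min_support=2)
print(pairs)
```

With this sample data, three pairs reach the threshold: bread with milk, bread with butter, and beer with chips.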

Page 3: Power Point for Data Mining

What is OLAP?

On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

Page 4: Power Point for Data Mining

OLAP Functionalities

Dynamic multi-dimensional analysis of consolidated data supporting end user analytical and navigational activities including:

Calculations and modeling applied across dimensions, through hierarchies and/or across members
Trend analysis over sequential time periods
Slicing subsets for on-screen viewing
Drill down to deeper levels of consolidation
Reach-through to underlying detail data
Rotation to new dimensional comparisons in the viewing area
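Two of the activities listed above, slicing and rotation, can be sketched with a small in-memory cube. This is a minimal sketch under stated assumptions: the cube contents, the `DIMENSIONS` tuple, and the `slice_cube` and `rotate` helpers are hypothetical names chosen for illustration, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical sales cube: (country, product, quarter) -> sales amount.
cube = {
    ("US", "Widget", "Q1"): 100,
    ("US", "Widget", "Q2"): 120,
    ("US", "Gadget", "Q1"): 80,
    ("UK", "Widget", "Q1"): 60,
    ("UK", "Gadget", "Q2"): 90,
}
DIMENSIONS = ("country", "product", "quarter")

def slice_cube(cube, dimension, member):
    """Slicing: keep only the cells where one dimension equals a given member."""
    axis = DIMENSIONS.index(dimension)
    return {key: value for key, value in cube.items() if key[axis] == member}

def rotate(cube, row_dim, col_dim):
    """Rotation: total the measure into a rows-by-columns view of two dimensions."""
    r, c = DIMENSIONS.index(row_dim), DIMENSIONS.index(col_dim)
    view = defaultdict(lambda: defaultdict(int))
    for key, value in cube.items():
        view[key[r]][key[c]] += value
    return {row: dict(cols) for row, cols in view.items()}

q1 = slice_cube(cube, "quarter", "Q1")            # only the Q1 cells
by_country = rotate(cube, "country", "quarter")   # countries as rows
by_quarter = rotate(cube, "quarter", "country")   # same data, axes swapped
```

Swapping the two arguments of `rotate` gives the same totals arranged along the other axis, which is the sense in which a rotation offers a new dimensional comparison.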

Page 5: Power Point for Data Mining

2 Approaches to Conduct the Analysis

Multidimensional OLAP (MOLAP)
  Hypercube
Relational OLAP (ROLAP)
  In ROLAP, the multidimensional database server is replaced with a large relational database server
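The difference between the two approaches can be sketched in a few lines. In this hedged illustration, the MOLAP side consolidates the data once into an in-memory structure and answers by lookup, while the ROLAP side leaves the detail rows in a relational table (here an in-memory SQLite database from Python's standard library) and aggregates at query time. The sample rows, the `"ALL"` placeholder member, and the `molap_total` and `rolap_total` functions are assumptions made for this example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical detail rows: (country, product, amount).
rows = [
    ("US", "Widget", 100),
    ("US", "Gadget", 80),
    ("UK", "Widget", 60),
    ("UK", "Widget", 40),
]

# MOLAP-style: consolidate once into an in-memory cube, then answer by lookup.
molap_cube = defaultdict(int)
for country, product, amount in rows:
    molap_cube[(country, product)] += amount
    molap_cube[(country, "ALL")] += amount  # pre-consolidated country total

def molap_total(country, product="ALL"):
    """Look up a pre-computed total; no detail rows are read."""
    return molap_cube[(country, product)]

# ROLAP-style: keep the detail in a relational table and aggregate per query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country TEXT, product TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def rolap_total(country, product=None):
    """Aggregate the detail rows with SQL each time the question is asked."""
    sql = "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE country = ?"
    params = [country]
    if product is not None:
        sql += " AND product = ?"
        params.append(product)
    return conn.execute(sql, params).fetchone()[0]
```

Both functions return the same totals for the same question. The MOLAP lookup is fast but can only answer what was consolidated in advance, whereas the ROLAP query still has the individual rows available.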

Page 6: Power Point for Data Mining

Components of OLAP

Diagram labels: internal data, external data, OLTP, transaction database, data transformation services, data warehouse, mapping measures and dimensions, multidimensional cube, end user OLAP interface

Page 7: Power Point for Data Mining

Infrastructure of Data Warehouses & OLAP Systems

Page 8: Power Point for Data Mining

Hypercube data representations make it convenient to query data along any dimension
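A rough sketch of why this is convenient: when every cell names its member on each dimension, one function can total the measure along whichever dimension is requested. The cell data and the `total_by` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical cells: each cell names its dimension members and one measure.
cells = [
    {"country": "US", "product": "Widget", "year": 2014, "sales": 100},
    {"country": "US", "product": "Gadget", "year": 2015, "sales": 80},
    {"country": "UK", "product": "Widget", "year": 2015, "sales": 60},
]

def total_by(cells, dimension, measure="sales"):
    """Sum the measure along whichever dimension the caller names."""
    totals = defaultdict(int)
    for cell in cells:
        totals[cell[dimension]] += cell[measure]
    return dict(totals)

print(total_by(cells, "country"))
print(total_by(cells, "product"))
print(total_by(cells, "year"))
```

The same call works for country, product, or year without any change to the data layout.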

Page 9: Power Point for Data Mining

Sales Performance from Various Markets

Country

Page 10: Power Point for Data Mining

Drill Down Operation of OLAP Cube

Country > Region

Page 11: Power Point for Data Mining

Drill Down Operation of OLAP Cube

Country > Region > City
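The drill-down path from country to region to city can be sketched as a function that totals sales one level below the members chosen so far. This is an illustrative sketch only: the records, the `HIERARCHY` tuple, and the `drill_down` function are assumptions, and the place names and figures are invented.

```python
from collections import defaultdict

HIERARCHY = ("country", "region", "city")

# Hypothetical detail records; the place names and figures are illustrative.
sales = [
    {"country": "US", "region": "West", "city": "Seattle", "sales": 50},
    {"country": "US", "region": "West", "city": "Portland", "sales": 30},
    {"country": "US", "region": "East", "city": "Boston", "sales": 40},
    {"country": "UK", "region": "South", "city": "London", "sales": 70},
]

def drill_down(records, path=()):
    """Total sales one level below `path`, e.g. ("US",) lists the US regions."""
    if len(path) >= len(HIERARCHY):
        raise ValueError("already at the deepest level")
    level = HIERARCHY[len(path)]
    totals = defaultdict(int)
    for rec in records:
        # Keep only records that match every member chosen so far.
        if all(rec[HIERARCHY[i]] == member for i, member in enumerate(path)):
            totals[rec[level]] += rec["sales"]
    return dict(totals)

print(drill_down(sales))                  # by country
print(drill_down(sales, ("US",)))         # regions within the US
print(drill_down(sales, ("US", "West")))  # cities within US > West
```

Each call extends the path by one member, mirroring the Country, Country > Region, and Country > Region > City views on these slides.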

Page 12: Power Point for Data Mining

Workflow Monitoring

Page 13: Power Point for Data Mining

Schematic Diagram of Business Flow

Participants: Company, Customer, Sales & Marketing, Manufacturing, PMC, Shipper, Accounting, Warehouse

Documents exchanged: purchase order, order request approval, order request, job order, delivery note, shipping order, invoice, payment, purchase confirmation

Page 14: Power Point for Data Mining

Sample Workflow for Electronic Procurement - Participating Organizations

Organizations: Supplier, Buyer

Roles: User, Invoice Approver, PO Approver, Commerce, Finance, Supplier Reviewer, Shipper

Workflow steps: Purchase Request, PO Request Approval, PO Approval, Purchase Order, Configuration, Review, Purchase Confirmation and ETA, Shipping Order, Invoice, Invoice Request Approval, Invoice Approval, Payment, App, Shipment Verification

Page 15: Power Point for Data Mining

Management and Monitoring Process

Database tables (SQL Server):

Port: PortID (PK), ScheduleID (FK1), PortName
Message: MessageID (PK), ScheduleID (FK1), MessageName
Schedule: ScheduleID (PK), ScheduleName, Context_optional, Attributes, where_optional, ModuleID, GroupID, State (foreign keys FK1, FK2)
Module: ModuleID (PK), ModuleName
WhereTable: WhereID (PK), ScheduleID (FK1), msgport
GroupTable: GroupID (PK), GroupName

Note on Module: A module is a collection of schedules, in no specified order. Messages and ports are described in lists that are part of the schedule header.

Diagram labels: SQL Server, HandleApproval, Query, Receive Approval Status, Update, Approve, Email User, ChangeStatus, Call ValidateSchedule, Monitoring Application, BizTalk, Custom

Page 16: Power Point for Data Mining

Orchestrating Business Activities

Diagram labels: BizTalk Orchestration Engine, COM Components, Web Service (Internal), Web Service (External), MSMQ, Exchange Workflows, SQL Server, Script Files, BizTalk Messaging Services, Internal Apps

Page 17: Power Point for Data Mining

Business Orchestration

Diagram labels: Business Process Flow, Implementation

Page 18: Power Point for Data Mining

BizTalk Server - An Integration Server

MS BizTalk Server

Scan-based Trading

Inventory Management

BOM Module

PO Module

CO Module

Other Modules

Other Legacy Systems

Customers

Suppliers

ECTools

Page 19: Power Point for Data Mining

BizTalk Server - An Automation Server

MS BizTalk Server

Scan-based Trading

Inventory Management System

Customer

Accounting System

Flowchart labels: Begin, Receive Inventory Record, Issue Delivery Note, Update Inventory Record, Credit or COD, Customer, Issue Invoice, COD, Credit customers' account, Receive Payment, Account, Accounting System, End

Pre-defined business rules can be added for process automation

Supports various types of protocol for messaging

Data format conversion between different formats

ECTools
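The idea of a pre-defined business rule driving an automated process can be sketched as a lookup from a condition to the steps that follow. The step names below are taken from the flowchart labels on this slide, but the assignment of steps to the credit and COD branches is an assumption, because the transcript does not preserve the arrows of the flowchart. The `RULES` table and the `plan_steps` function are hypothetical and have nothing to do with the actual BizTalk Server API.

```python
# Step names come from the slide's flowchart labels. Which steps belong to
# which branch is an ASSUMPTION; the transcript does not show the arrows.
COMMON_STEPS = [
    "Receive Inventory Record",
    "Issue Delivery Note",
    "Update Inventory Record",
]

# Hypothetical pre-defined rule: payment terms decide the remaining steps.
RULES = {
    "credit": ["Issue Invoice", "Credit customers' account"],
    "cod": ["Receive Payment"],
}

def plan_steps(payment_terms):
    """Return the ordered steps for one delivery under the rule table above."""
    key = payment_terms.lower()
    if key not in RULES:
        raise ValueError(f"no business rule for payment terms: {payment_terms!r}")
    return COMMON_STEPS + RULES[key]

for step in plan_steps("COD"):
    print(step)
```

Keeping the rule in a table rather than in branching code means a new payment type can be supported by adding one entry, which is the kind of flexibility the slide attributes to pre-defined rules.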

Page 20: Power Point for Data Mining

Questions for Discussion

What are the potential OLAP applications in business operations?

Suggested Answer:

Marketing and sales analysis
Database marketing
Budgeting
Financial reporting
Management reporting
Profitability analysis
Quality analysis

Page 21: Power Point for Data Mining

Questions for Discussion

What kind of data is MOLAP good at handling?

Suggested Answer:

MOLAP is good at handling summarized data; it is not particularly well suited to handling large amounts of detailed data.

Page 22: Power Point for Data Mining

Questions for Discussion

What kind of data is ROLAP suitable for handling?

Suggested Answer:

ROLAP architectures are especially well suited to situations where dynamic access to combinations of summarized and detailed data is more important than the performance gains offered by the MOLAP approach, which uses only summarized or pre-consolidated data.

Page 23: Power Point for Data Mining

Questions for Discussion

What are the limitations and challenges of Data Mining?

Suggested Answer:

Identification of missing information
Whether the original data set contains the elements necessary for effective mining cannot yet be detected
Data noise and missing values
Large databases and high dimensionality