Basic Concepts of Data Warehousing

Upload: srikanth-venkata

Post on 29-May-2018

TRANSCRIPT

  • 8/8/2019 Basic Concepts of Data Warehousing

    1/22

    Basic Concepts of Data Warehousing

    Presented by: Rashmi Jha

    2/22

    Agenda

    Evolution
    Traditional Approach
    Need of Data Warehouse
    Definition & Characteristics of Data Warehouse
    Data Warehouse Architecture
    Modeling Technique

    3/22

    Evolution

    60s: Batch reports
        1. Hard to find and analyze information.
        2. Inflexible and expensive; reprogram for every new request.

    70s: Terminal-based DSS and EIS
        1. Still inflexible, not integrated with desktop tools.

    4/22

    5/22

    The Traditional Research Approach

    Query-driven (lazy, on-demand)

    [Diagram labels: Clients; Integration System; Metadata; Wrapper, Wrapper, Wrapper, ...]

    6/22

    Disadvantages of Query-Driven Approach

    Delay in query processing
    Slow or unavailable information sources
    Complex filtering and integration
    Inefficient and potentially expensive for frequent queries
    Competes with local processing at sources
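The query-driven approach can be sketched roughly as follows. This is only an illustration: the `Wrapper` and `IntegrationSystem` classes, the `calls` counter, and the sample rows are hypothetical names and data, not part of the original slides. The counter shows why repeated queries compete with local processing at the sources: every query reaches every source again.

```python
class Wrapper:
    """Hypothetical wrapper that exposes one source's rows in a common format."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows
        self.calls = 0  # how many times this source had to do work

    def fetch(self, predicate):
        # Each call represents work performed at the source at query time.
        self.calls += 1
        return [row for row in self._rows if predicate(row)]


class IntegrationSystem:
    """Hypothetical mediator: forwards each client query to every wrapper."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.fetch(predicate))  # lazy, on-demand
        return results


sources = [
    Wrapper("store_a", [{"title": "Movie A", "amount": 3.0}]),
    Wrapper("store_b", [{"title": "Movie A", "amount": 4.5},
                        {"title": "Movie B", "amount": 2.5}]),
]
mediator = IntegrationSystem(sources)

# The same query asked three times hits every source three times.
for _ in range(3):
    hits = mediator.query(lambda row: row["title"] == "Movie A")

source_calls = [wrapper.calls for wrapper in sources]
print(len(hits), source_calls)
```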

    7/22

    Data, everywhere, yet:

    Can't find the data I need
        1. Data is scattered over the network.
        2. Many versions, subtle differences.

    Can't get the data I need
        1. Need an expert to get the data.

    Can't understand the data I found
        1. Available data poorly documented.

    Can't use the data I found
        1. Results are unexpected.
        2. Data needs to be transformed from one form to another.

    8/22

    Need for Data Warehousing

    Faster time-to-market for products & services.
    Replacement of older, less-responsive decision support systems.
    Better business intelligence for end users.
    Reduction in time to locate, access and analyze.

    9/22

    Data Warehouse

    Definition given by W.H. Inmon:
    "The data warehouse is a collection of integrated, subject-oriented databases designed to support the DSS (decision support) function, where each unit of data is relevant to some moment in time."

    10/22

    Data Warehousing

    Stored collection of diverse data
    A solution to the data integration problem
    Single repository of information
    Subject-oriented
    Organized by subject, not by application
    Used for analysis, data mining, etc.
    Large volume of data (GB, TB)
    Non-volatile
    Historical
    Time attributes are important
    Updated infrequently
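One way to picture "non-volatile", "historical" and "time attributes are important" is an append-only store in which every row carries an effective date. The sketch below rests on stated assumptions: the `PriceHistory` class, its method names, and the sample rental rates are hypothetical and do not come from the slides.

```python
from datetime import date


class PriceHistory:
    """Hypothetical append-only history: rows are added, never updated in place."""

    def __init__(self):
        self._rows = []  # each row is (effective_date, movie, rate)

    def append(self, effective_date, movie, rate):
        self._rows.append((effective_date, movie, rate))

    def as_of(self, movie, when):
        """Return the rate that was in effect for `movie` on the date `when`."""
        candidates = [(d, rate) for d, m, rate in self._rows
                      if m == movie and d <= when]
        if not candidates:
            return None
        # The row with the latest effective date on or before `when` wins.
        return max(candidates, key=lambda pair: pair[0])[1]


history = PriceHistory()
history.append(date(2019, 1, 1), "Movie A", 3.0)
history.append(date(2019, 6, 1), "Movie A", 2.5)  # new rate; the old row is kept

print(history.as_of("Movie A", date(2019, 3, 15)))
print(history.as_of("Movie A", date(2019, 8, 8)))
```

Because old rows are kept, a question about any past moment can still be answered after the rate changes.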

    11/22

    Advantages of Warehousing Approach

    High query performance
        But not necessarily most current information
    Doesn't interfere with local processing at sources
    Complex queries at warehouse
    Information copied at warehouse
    Can modify, annotate, summarize, restructure, etc.
    Can store historical information
    Security, no auditing
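The trade-off listed above (fast queries that do not touch the sources, at the cost of possibly stale data) can be illustrated with a minimal sketch. The `Source` and `Warehouse` classes, the `reads` counter, and the sample rows are hypothetical and serve only as an illustration.

```python
class Source:
    """Hypothetical operational source; `reads` counts how often it is accessed."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.reads = 0

    def extract(self):
        self.reads += 1
        return list(self.rows)


class Warehouse:
    """Hypothetical warehouse that holds a copy of the source data."""

    def __init__(self, sources):
        self.sources = sources
        self.rows = []

    def refresh(self):
        # Information is copied into the warehouse ahead of query time.
        self.rows = [row for source in self.sources for row in source.extract()]

    def query(self, predicate):
        # Queries run against the copy and never touch the sources.
        return [row for row in self.rows if predicate(row)]


source = Source([("Movie A", 3.0)])
warehouse = Warehouse([source])
warehouse.refresh()

# New data arrives at the source after the refresh.
source.rows.append(("Movie A", 4.5))

for _ in range(3):
    stale_hits = warehouse.query(lambda row: row[0] == "Movie A")

print(stale_hits, source.reads)
```

Three queries cost the source a single read, but they do not see the newer row until the next `refresh()`.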

    12/22

    Characteristics of Data Warehouse

    It is:
    Subject-oriented
    Integrated
    Time-varying
    Non-volatile
    And consistent

    13/22

    Data Warehouse Architecture

    14/22

    Data Warehouse Architecture

    15/22

    Data Warehouse Modeling Technique

    Dimensional Data Model
        1. Dimension
        2. Attribute
        3. Hierarchy
        4. Fact table
        5. Lookup table
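As a rough illustration of how these five terms relate, the sketch below rolls a measure up along a hierarchy. All names and values (`MARKET_LOOKUP`, `MARKET_HIERARCHY`, `FACTS`, `roll_up`, the stores and amounts) are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Lookup table for a hypothetical "market" dimension: key -> attributes.
MARKET_LOOKUP = {
    1: {"store": "Store 10", "district": "District 1", "region": "East"},
    2: {"store": "Store 11", "district": "District 1", "region": "East"},
    3: {"store": "Store 20", "district": "District 7", "region": "West"},
}

# Hierarchy: the dimension's attributes ordered from finest to coarsest.
MARKET_HIERARCHY = ["store", "district", "region"]

# Fact table rows: (market key, payment amount).
FACTS = [(1, 3.0), (2, 4.5), (3, 2.5), (1, 1.0)]


def roll_up(level):
    """Total the measure at one level of the market hierarchy."""
    if level not in MARKET_HIERARCHY:
        raise ValueError(f"unknown hierarchy level: {level}")
    totals = defaultdict(float)
    for market_key, amount in FACTS:
        # Resolve the fact's key through the lookup table, then aggregate.
        totals[MARKET_LOOKUP[market_key][level]] += amount
    return dict(totals)


print(roll_up("store"))
print(roll_up("region"))
```

The fact table stores only keys and measures; descriptive attributes live in the lookup table, so the same facts can be summarized at any level of the hierarchy.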

    16/22

    Designing Data Models for DW

    Star Schema
    Snowflake Schema

    17/22

    Data Mart Model - Star Schema

    TIME
        time key
        day, month, year, period

    CUSTOMER
        customer key
        first name, last name, street address, city, state, zip, transaction id, payment date, payment status

    MOVIE
        movie key
        movie copy number, movie number, movie title, rental status, rental date, due date

    MARKET
        market key
        store number, store city, store state, district name, region name

    REVENUE
        movie key (FK), market key (FK), customer key (FK), time key (FK)
        movie rental rate, overdue charge, payment amount
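A minimal sketch of this star schema, using Python's built-in `sqlite3` module, might look like the following. Several things here are assumptions made for illustration: the `_dim` and `_fact` table suffixes, the column data types, the sample rows, and the choice to include only a subset of the slide's columns.

```python
import sqlite3

# In-memory database; names follow the slide with spaces replaced by underscores.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER,
                       year INTEGER, period TEXT);
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, first_name TEXT,
                           last_name TEXT, city TEXT, state TEXT, zip TEXT);
CREATE TABLE movie_dim (movie_key INTEGER PRIMARY KEY, movie_number INTEGER,
                        movie_title TEXT);
CREATE TABLE market_dim (market_key INTEGER PRIMARY KEY, store_number INTEGER,
                         store_city TEXT, store_state TEXT,
                         district_name TEXT, region_name TEXT);
-- The fact table holds one foreign key per dimension plus the measures.
CREATE TABLE revenue_fact (
    movie_key INTEGER REFERENCES movie_dim(movie_key),
    market_key INTEGER REFERENCES market_dim(market_key),
    customer_key INTEGER REFERENCES customer_dim(customer_key),
    time_key INTEGER REFERENCES time_dim(time_key),
    movie_rental_rate REAL,
    overdue_charge REAL,
    payment_amount REAL
);
""")

# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?, ?)",
                 [(1, 8, 8, 2019, "Q3"), (2, 9, 8, 2019, "Q3")])
conn.executemany("INSERT INTO customer_dim VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, "Ann", "Lee", "Boston", "MA", "02101")])
conn.executemany("INSERT INTO movie_dim VALUES (?, ?, ?)",
                 [(1, 100, "Movie A"), (2, 101, "Movie B")])
conn.executemany("INSERT INTO market_dim VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 10, "Boston", "MA", "District 1", "East"),
                  (2, 20, "Denver", "CO", "District 7", "West")])
conn.executemany("INSERT INTO revenue_fact VALUES (?, ?, ?, ?, ?, ?, ?)",
                 [(1, 1, 1, 1, 3.0, 0.0, 3.0),
                  (2, 1, 1, 2, 3.0, 1.5, 4.5),
                  (1, 2, 1, 2, 2.5, 0.0, 2.5)])

# Each dimension is exactly one join away from the fact table.
revenue_by_region = conn.execute("""
    SELECT m.region_name, t.year, SUM(f.payment_amount)
    FROM revenue_fact AS f
    JOIN market_dim AS m ON m.market_key = f.market_key
    JOIN time_dim AS t ON t.time_key = f.time_key
    GROUP BY m.region_name, t.year
    ORDER BY m.region_name
""").fetchall()
print(revenue_by_region)
```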

    18/22

    Data Mart Model - Snowflake Schema

    REVENUE
        movie key (FK), market key (FK), customer key (FK), time key (FK)
        movie rental rate, overdue charge, payment amount

    MARKET
        market key
        store number, store city, store state
        region key (FK), district key (FK)

    MOVIE
        movie key
        movie copy number, movie number, movie title, rental status, rental date, due date

    CUSTOMER
        customer key
        first name, last name, transaction id, payment date, payment status

    TIME
        time key
        day, month, year, period

    DISTRICT
        district key
        number, name, office address, manager name

    REGION
        region key
        number, name, office address, manager name

    ADDRESS
        customer key (FK)
        street address, city, state, zip
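In the snowflake variant the region and district attributes move out of MARKET into their own tables, so a query by region needs one more join. The sketch below uses `sqlite3` with a reduced set of columns. The data types, the sample rows, and the trimmed REVENUE table are assumptions made for illustration.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
CREATE TABLE region (region_key INTEGER PRIMARY KEY, number INTEGER, name TEXT);
CREATE TABLE district (district_key INTEGER PRIMARY KEY, number INTEGER, name TEXT);
-- MARKET no longer stores region and district names, only keys.
CREATE TABLE market (
    market_key INTEGER PRIMARY KEY,
    store_number INTEGER,
    store_city TEXT,
    store_state TEXT,
    region_key INTEGER REFERENCES region(region_key),
    district_key INTEGER REFERENCES district(district_key)
);
-- Reduced fact table: only the key and measure needed for this example.
CREATE TABLE revenue (
    market_key INTEGER REFERENCES market(market_key),
    payment_amount REAL
);
""")

# Made-up sample rows, for illustration only.
snow.executemany("INSERT INTO region VALUES (?, ?, ?)",
                 [(1, 1, "East"), (2, 2, "West")])
snow.executemany("INSERT INTO district VALUES (?, ?, ?)",
                 [(1, 1, "District 1"), (2, 7, "District 7")])
snow.executemany("INSERT INTO market VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 10, "Boston", "MA", 1, 1), (2, 20, "Denver", "CO", 2, 2)])
snow.executemany("INSERT INTO revenue VALUES (?, ?)",
                 [(1, 3.0), (1, 4.5), (2, 2.5)])

# Fact -> MARKET -> REGION: the extra hop is the cost of normalizing the dimension.
revenue_by_region_name = snow.execute("""
    SELECT r.name, SUM(f.payment_amount)
    FROM revenue AS f
    JOIN market AS m ON m.market_key = f.market_key
    JOIN region AS r ON r.region_key = m.region_key
    GROUP BY r.name
    ORDER BY r.name
""").fetchall()
print(revenue_by_region_name)
```

The region name is stored once in REGION rather than repeated on every MARKET row, at the price of a longer join path.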

    19/22

    Different Levels of Data Model

    Conceptual Data Model
    Logical Data Model
    Physical Data Model

    20/22

    Various terminologies used in the DW

    Data Mining
    Data Cleansing
    Data Integration
    Data Mart

    21/22

    ETL Process

    Capture
    Scrub
    Transform
    Load
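The four steps can be sketched as a small pipeline. Everything concrete here is an assumption made for illustration: the `RAW_ROWS` sample data, the function names, the scrubbing rules, and the `payment` target table. A real ETL job would capture from actual source systems.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from a source system.
RAW_ROWS = [
    {"customer": " ann lee ", "amount": "3.00"},
    {"customer": "BOB RAY", "amount": "4.50"},
    {"customer": "", "amount": "2.50"},       # missing customer name
    {"customer": "cy poe", "amount": "n/a"},  # unparseable amount
]


def capture():
    """Capture: take a snapshot of the source data."""
    return list(RAW_ROWS)


def scrub(rows):
    """Scrub: trim whitespace and drop rows that fail basic quality checks."""
    clean = []
    for row in rows:
        name = row["customer"].strip()
        try:
            float(row["amount"])
        except ValueError:
            continue  # amount is not a number
        if name:
            clean.append({"customer": name, "amount": row["amount"]})
    return clean


def transform(rows):
    """Transform: convert to the warehouse format (title-cased name, integer cents)."""
    return [(row["customer"].title(), round(float(row["amount"]) * 100))
            for row in rows]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payment "
                 "(customer_name TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO payment VALUES (?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]


etl_conn = sqlite3.connect(":memory:")
loaded = load(etl_conn, transform(scrub(capture())))
print(loaded)
```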

    22/22

    THANK YOU