basic concepts of data warehousing
TRANSCRIPT
-
8/8/2019 Basic Concepts of Data Warehousing
1/22
Basic Concepts of Data Warehousing
Presented by: Rashmi Jha
-
Agenda
  Evolution
  Traditional Approach
  Need of Data Warehouse
  Definition & Characteristics of Data Warehouse
  Data Warehouse Architecture
  Modeling Technique
-
Evolution
60s: Batch Reports
  1. Hard to find and analyze information.
  2. Inflexible and expensive; reprogram for every new request.
70s: Terminal-based DSS and EIS
  1. Still inflexible, not integrated with desktop tools.
-
The Traditional Research Approach
Query-driven (lazy, on-demand)
[Figure: Clients; Integration System; Metadata; Wrapper, Wrapper, Wrapper, ...]
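The query-driven approach in the figure can be sketched roughly as follows. This is only an illustration under assumptions: the two sources, their field names, and the names make_wrapper and IntegrationSystem are made up for the example and do not come from the slides. The field maps play the role of the metadata, and every client query is forwarded to every source at query time.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Wrapper = Callable[[str], List[Row]]  # takes a customer name, returns rows in a common shape


def make_wrapper(source_rows: List[Row], field_map: Dict[str, str]) -> Wrapper:
    """Build a wrapper that queries one source and renames its fields to common names."""
    def wrapper(customer: str) -> List[Row]:
        # Find which source field holds the customer name (hypothetical metadata lookup).
        name_field = next(src for src, common in field_map.items() if common == "customer")
        return [
            {field_map[k]: v for k, v in row.items() if k in field_map}
            for row in source_rows
            if row[name_field] == customer
        ]
    return wrapper


class IntegrationSystem:
    """Mediator: forwards each client query to every wrapper at query time."""

    def __init__(self, wrappers: List[Wrapper]) -> None:
        self.wrappers = wrappers

    def query(self, customer: str) -> List[Row]:
        results: List[Row] = []
        for wrapper in self.wrappers:  # every query touches every source
            results.extend(wrapper(customer))
        return results


# Two made-up sources that use different field names for the same facts.
billing = [{"cust": "Ann", "amt": 4.0}, {"cust": "Bob", "amt": 2.5}]
rentals = [{"customer_name": "Ann", "fee": 3.0}]

mediator = IntegrationSystem([
    make_wrapper(billing, {"cust": "customer", "amt": "amount"}),
    make_wrapper(rentals, {"customer_name": "customer", "fee": "amount"}),
])
ann_rows = mediator.query("Ann")
```

Because nothing is stored centrally, each call to query repeats the filtering and integration work against the live sources.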
-
Disadvantages of Query-Driven Approach
Delay in query processing
  Slow or unavailable information sources
  Complex filtering and integration
Inefficient and potentially expensive for frequent queries
Competes with local processing at sources
-
Data, everywhere: yet
Can't find the data I need
  1. Data is scattered over the network.
  2. Many versions, subtle differences.
Can't get the data I need
  1. Need an expert to get the data.
Can't understand the data I found
  1. Available data poorly documented.
Can't use the data I found
  1. Results are unexpected.
  2. Data needs to be transformed from one form to another.
-
Need for Data Warehousing
Faster time-to-market for products & services.
Replacement of older, less-responsive decision support systems.
Better business intelligence for end users.
Reduction in time to locate, access and analyze.
-
Data Warehouse
Definition given by W.H. Inmon:
The data warehouse is a collection of integrated, subject-oriented databases designed to support the DSS (decision support) function, where each unit of data is relevant to some moment in time.
-
Data Warehousing
Stored collection of diverse data
  A solution to the data integration problem
  Single repository of information
Subject-oriented
  Organized by subject, not by application
  Used for analysis, data mining, etc.
Large volume of data (GB, TB)
Non-volatile
  Historical
  Time attributes are important
Updates infrequently.
-
Advantages of Warehousing Approach
High query performance
  But not necessarily most current information
Doesn't interfere with local processing at sources
  Complex queries at warehouse
Information copied at warehouse
  Can modify, annotate, summarize, restructure, etc.
  Can store historical information
  Security, no auditing
-
Characteristics of Data Warehouse
It is:
  Subject-oriented
  Integrated
  Time-varying
  Non-volatile
  And consistent
-
Data Warehouse Architecture
-
Data Warehouse Architecture
-
Data Warehouse Modeling Technique
Dimensional Data Model
  1. Dimension
  2. Attribute
  3. Hierarchy
  4. Fact table
  5. Lookup Table
-
Designing Data Models for DW
Star Schema
Snowflake Schema
-
Data Mart Model - Star Schema
TIME
  time key
  day, month, year, period
CUSTOMER
  customer key
  first name, last name, street address, city, state, zip, transaction id, payment date, payment status
MOVIE
  movie key
  movie copy number, movie number, movie title, rental status, rental date, due date
MARKET
  market key
  store number, store city, store state, district name, region name
REVENUE
  movie key (FK), market key (FK), customer key (FK), time key (FK)
  movie rental rate, overdue charge, payment amount
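As a rough sketch, the star schema above could be built and queried with Python's built-in sqlite3 module. The assumptions are these: table and column names are adapted from the slide (TIME becomes time_dim, spaces become underscores), the CUSTOMER and MOVIE dimensions are trimmed to a few columns for brevity, and all inserted rows are invented sample data. The query shows the typical star-schema pattern of joining the fact table directly to each dimension it needs.

```python
import sqlite3

star_db = sqlite3.connect(":memory:")
star_db.executescript("""
CREATE TABLE time_dim (
    time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER, period TEXT
);
CREATE TABLE market (
    market_key INTEGER PRIMARY KEY, store_number INTEGER, store_city TEXT,
    store_state TEXT, district_name TEXT, region_name TEXT
);
-- MOVIE and CUSTOMER are shortened here; the slide lists more attributes.
CREATE TABLE movie (movie_key INTEGER PRIMARY KEY, movie_title TEXT);
CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
-- Fact table: one foreign key per dimension plus the numeric measures.
CREATE TABLE revenue (
    movie_key INTEGER REFERENCES movie(movie_key),
    market_key INTEGER REFERENCES market(market_key),
    customer_key INTEGER REFERENCES customer(customer_key),
    time_key INTEGER REFERENCES time_dim(time_key),
    movie_rental_rate REAL, overdue_charge REAL, payment_amount REAL
);
""")

# Invented sample rows, for illustration only.
star_db.execute("INSERT INTO time_dim VALUES (1, 8, 8, 2019, 'Q3')")
star_db.executemany("INSERT INTO market VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 101, "Springfield", "IL", "District A", "North"),
    (2, 202, "Austin", "TX", "District B", "South"),
])
star_db.execute("INSERT INTO movie VALUES (1, 'Example Movie')")
star_db.execute("INSERT INTO customer VALUES (1, 'Ann', 'Lee')")
star_db.executemany("INSERT INTO revenue VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 3.0, 0.0, 3.0),
    (1, 1, 1, 1, 2.0, 0.0, 2.0),
    (1, 2, 1, 1, 3.0, 1.5, 4.5),
])

# Total payments per region and year: the fact table joins straight to each dimension.
totals = star_db.execute("""
    SELECT m.region_name, t.year, SUM(r.payment_amount)
    FROM revenue AS r
    JOIN market AS m ON m.market_key = r.market_key
    JOIN time_dim AS t ON t.time_key = r.time_key
    GROUP BY m.region_name, t.year
    ORDER BY m.region_name
""").fetchall()
```

Note that region name is stored directly on each MARKET row, so the query needs only one join per dimension, at the cost of repeating the region text for every store.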
-
Data Mart Model - Snowflake Schema
REVENUE
  movie key (FK), market key (FK), customer key (FK), time key (FK)
  movie rental rate, overdue charge, payment amount
MARKET
  market key
  store number, store city, store state
  region key (FK), district key (FK)
MOVIE
  movie key
  movie copy number, movie number, movie title, rental status, rental date, due date
CUSTOMER
  customer key
  first name, last name, transaction id, payment date, payment status
TIME
  time key
  day, month, year, period
DISTRICT
  district key
  number, name, office address, manager name
REGION
  region key
  number, name, office address, manager name
ADDRESS
  customer key (FK)
  street address, city, state, zip
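A minimal sqlite3 sketch of the snowflaked MARKET dimension follows, under these assumptions: only the REVENUE, MARKET, REGION, and DISTRICT tables are modeled, REVENUE is cut down to one key and one measure, column names are adapted from the slide with underscores, and the rows are invented. It illustrates the main practical consequence of snowflaking: the region name now lives in its own table, so the same report needs one more join.

```python
import sqlite3

snow_db = sqlite3.connect(":memory:")
snow_db.executescript("""
CREATE TABLE region (
    region_key INTEGER PRIMARY KEY, number INTEGER, name TEXT,
    office_address TEXT, manager_name TEXT
);
CREATE TABLE district (
    district_key INTEGER PRIMARY KEY, number INTEGER, name TEXT,
    office_address TEXT, manager_name TEXT
);
-- MARKET no longer stores region/district names, only keys into the lookup tables.
CREATE TABLE market (
    market_key INTEGER PRIMARY KEY, store_number INTEGER, store_city TEXT, store_state TEXT,
    region_key INTEGER REFERENCES region(region_key),
    district_key INTEGER REFERENCES district(district_key)
);
-- Fact table reduced to one dimension key and one measure for this sketch.
CREATE TABLE revenue (
    market_key INTEGER REFERENCES market(market_key),
    payment_amount REAL
);
""")

# Invented sample rows, for illustration only.
snow_db.executemany("INSERT INTO region VALUES (?, ?, ?, ?, ?)", [
    (1, 10, "North", "1 Example St", "Manager One"),
    (2, 20, "South", "2 Example St", "Manager Two"),
])
snow_db.executemany("INSERT INTO district VALUES (?, ?, ?, ?, ?)", [
    (1, 100, "District A", "3 Example St", "Manager Three"),
    (2, 200, "District B", "4 Example St", "Manager Four"),
])
snow_db.executemany("INSERT INTO market VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 101, "Springfield", "IL", 1, 1),
    (2, 202, "Austin", "TX", 2, 2),
])
snow_db.executemany("INSERT INTO revenue VALUES (?, ?)", [
    (1, 3.0), (1, 2.0), (2, 4.5),
])

# Revenue per region: fact -> market -> region, one join more than in a star schema.
region_totals = snow_db.execute("""
    SELECT rg.name, SUM(rv.payment_amount)
    FROM revenue AS rv
    JOIN market AS m ON m.market_key = rv.market_key
    JOIN region AS rg ON rg.region_key = m.region_key
    GROUP BY rg.name
    ORDER BY rg.name
""").fetchall()
```

The trade-off is less repeated text in the dimension tables in exchange for longer join paths in queries.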
-
Different Levels of Data Model
Conceptual Data Model
Logical Data Model
Physical Data Model
-
Various terminologies used in the DW
Data Mining
Data Cleansing
Data Integration
Data Mart
-
ETL Process
Capture
Scrub
Transform
Load
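The four steps can be illustrated with a small Python sketch. Everything specific here is an assumption made for the example: the hard-coded source rows, the particular scrubbing rules (trim, drop incomplete rows, remove duplicates), and the payment_fact target table are hypothetical and are not taken from the slides.

```python
import sqlite3
from datetime import datetime


def capture():
    """Capture: pull raw rows from a source (a hard-coded list stands in for it here)."""
    return [
        {"customer": " Ann ", "paid_on": "2019-08-01", "amount": "3.50"},
        {"customer": "Bob", "paid_on": "2019-08-02", "amount": ""},        # missing amount
        {"customer": " Ann ", "paid_on": "2019-08-01", "amount": "3.50"},  # duplicate
    ]


def scrub(rows):
    """Scrub: trim whitespace, drop rows with missing values, remove exact duplicates."""
    seen = set()
    clean = []
    for row in rows:
        trimmed = {k: v.strip() for k, v in row.items()}
        if not all(trimmed.values()):
            continue  # incomplete row
        key = tuple(sorted(trimmed.items()))
        if key in seen:
            continue  # already kept an identical row
        seen.add(key)
        clean.append(trimmed)
    return clean


def transform(rows):
    """Transform: convert strings to typed values and derive year and month."""
    out = []
    for row in rows:
        paid = datetime.strptime(row["paid_on"], "%Y-%m-%d")
        out.append((row["customer"], paid.year, paid.month, float(row["amount"])))
    return out


def load(conn, rows):
    """Load: append the transformed rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payment_fact "
        "(customer TEXT, year INTEGER, month INTEGER, payment_amount REAL)"
    )
    conn.executemany("INSERT INTO payment_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(scrub(capture())))
loaded = warehouse.execute("SELECT * FROM payment_fact").fetchall()
```

Keeping each step as its own function makes it easy to test the scrubbing and transformation rules separately from the source and the target database.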
-
THANK YOU