dataware slides2
Post on 30-May-2018
-
8/14/2019 Dataware slides2
1/16
Good Evening to All
-
III & II
IT
-
Characteristics of Data-warehousing
Goals Of Data-Warehousing
Architecture Of Data-Warehousing
The Phases Involved
-
Accessibility: Getting the required information whenever it is needed
Timeliness: Time taken to submit the report
Formats: Formats such as spreadsheets, graphs, maps, etc.
Integrity: Accuracy and reliability of data
-
A Data Warehouse is a database of data gathered from many systems and intended to support management reporting and decision making.
This process of gathering data is called Data Warehousing.
-
Subject oriented: A Data Warehouse deals with all the subjects of corporate data, e.g. sales, finance, customers, etc.
Integrated: It integrates data from different database systems (heterogeneous data) into single homogeneous data.
Non-volatile: The Data Warehouse is a read-only database. Its data cannot be overwritten or deleted, so it is non-volatile.
Time variant: Historical data has chronological importance, i.e. historical data is maintained for future analysis.
-
To provide a reliable, single, integrated source of information.
To give end users access to their data without a reliance on reports produced by the Information System (IS) department.
To allow analysis of corporate data and predictive models, and to improve Business Intelligence.
-
Four data structures for the storage of data are:
1. DATA STORE 1, called Online Transaction Processing (OLTP)
2. DATA STORE 2, called Integration Layer or Data Warehouse
3. DATA STORE 3, called Data Mart or High Performance Query Structures (HPQS)
4. DATA STORE 4, called Online Analytical Processing (OLAP)
Three data flow paths between the four data structures are:
1. FLOW1, from DATA STORE 1 to DATA STORE 2
2. FLOW2, from DATA STORE 2 to DATA STORE 3
3. FLOW3, from DATA STORE 3 to DATA STORE 4
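The four stores and three flows form a simple chain from the source systems to the end user. A minimal sketch in Python, assuming the stores are just numbered labels; the names STORES, FLOWS and path_from_source_to_user are hypothetical and do not come from the slides:

```python
# Hypothetical names; the slides define only four stores and three flows.
STORES = {
    1: "Online Transaction Processing (OLTP)",
    2: "Integration Layer / Data Warehouse",
    3: "Data Mart / HPQS",
    4: "Online Analytical Processing (OLAP)",
}

# Each flow moves data from one store to the next one in the chain.
FLOWS = {"FLOW1": (1, 2), "FLOW2": (2, 3), "FLOW3": (3, 4)}


def path_from_source_to_user():
    """Walk the flows in name order and return the store numbers visited."""
    path = [1]
    for name in sorted(FLOWS):
        src, dst = FLOWS[name]
        if src != path[-1]:
            raise ValueError(f"{name} does not start where the last flow ended")
        path.append(dst)
    return path


for number in path_from_source_to_user():
    print(number, STORES[number])
```

The check inside the loop only confirms that each flow starts at the store where the previous one ended, which is the property the slide describes.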
-
The architecture is divided into three phases:
1. Extract Phase
2. Transform Phase
3. Loading Phase
Transfer data
Data Store 1 --------------- Data Store 2
There are different mechanisms for extracting that data out of its sources. This is called Data
-
The art of determining what records to extract from the source system is frequently called change data capture. Some general techniques are used to recognize changes to source database tables. They are:
Timestamps: The lucky among us extract data from systems that timestamp records whenever they are inserted or updated.
Triggers: Every time a record is inserted into, updated in, or deleted from a source table, these triggers write a corresponding message in a log file.
File Compares: One way to identify changes in your data is to compare the file as it appears today to a copy of how it appeared when you last loaded the warehouse.
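Two of these techniques can be illustrated with a short Python sketch. It assumes that source rows are dictionaries carrying an updated_at timestamp and that a file snapshot can be represented as a dictionary keyed by record id; the function names, field names and sample data are hypothetical. Triggers are left out because they live inside the source database.

```python
from datetime import datetime


def changed_since(rows, last_load):
    """Timestamp technique: keep rows modified after the last warehouse load."""
    return [r for r in rows if r["updated_at"] > last_load]


def file_compare(previous, current):
    """File-compare technique: diff two snapshots keyed by record id."""
    inserted = [k for k in current if k not in previous]
    deleted = [k for k in previous if k not in current]
    updated = [k for k in current if k in previous and current[k] != previous[k]]
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical sample data for illustration only.
rows = [
    {"id": 1, "name": "Asha", "updated_at": datetime(2019, 8, 1)},
    {"id": 2, "name": "Ravi", "updated_at": datetime(2019, 8, 14)},
]
recent = changed_since(rows, last_load=datetime(2019, 8, 10))

last_snapshot = {1: "Asha", 2: "Ravi", 3: "Mina"}
todays_snapshot = {1: "Asha", 2: "Ravi K.", 4: "Omar"}
diff = file_compare(last_snapshot, todays_snapshot)

print([r["id"] for r in recent])
print(diff)
```

Note that the timestamp filter alone cannot see deleted records, while the snapshot comparison can, at the cost of reading both copies in full.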
-
The Transform phase is where this data is transformed into the required form in DATA STORE 2. Some of the fundamental steps in the Transformation phase are:
1. Converting heterogeneous data to homogeneous data: The data in DATA STORE 2 comes from the different source systems of DATA STORE 1, so the data is heterogeneous. DATA STORE 2 is called the Integration Layer or Warehouse.
2. Adding surrogate keys: For example, rather than using the customer number as the key on the CUSTOMER table, you might use a surrogate key that is simply a sequential number generated by your warehouse load programs.
3. Removing dirty data: Bad records can be handled by:
a. Ignoring them.
b. Rejecting bad records, but saving them in a separate file for manual review.
c. Loading as much of the bad record as possible and pointing out the errors for later.
4. Normalization: A normalized database is like a flat file that is broken up into smaller files or tables in order to store the data more
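As a rough illustration of steps 1 to 3, the Python sketch below assigns sequential surrogate keys, brings names into one common format, and sets bad records aside for manual review (option b). The record layout, the validity rule, and the function name transform are assumptions made for this example, not part of the slides.

```python
from itertools import count


def transform(records):
    """Assign surrogate keys to clean records; set bad ones aside for review."""
    next_key = count(1)   # sequential surrogate key generator
    key_for = {}          # natural customer number -> surrogate key
    loaded, rejected = [], []
    for rec in records:
        name = (rec.get("name") or "").strip()
        # Assumed rule: a record is dirty if its number or name is missing.
        if not rec.get("customer_no") or not name:
            rejected.append(rec)  # option (b): keep for manual review
            continue
        natural = rec["customer_no"]
        if natural not in key_for:
            key_for[natural] = next(next_key)
        loaded.append({
            "customer_key": key_for[natural],
            "customer_no": natural,
            "name": name.title(),  # one common format for all sources
        })
    return loaded, rejected


# Hypothetical rows, as if gathered from different source systems.
source = [
    {"customer_no": "C-17", "name": " alice "},
    {"customer_no": "", "name": "no number"},
    {"customer_no": "C-42", "name": "BOB"},
    {"customer_no": "C-17", "name": "alice"},
]
loaded, rejected = transform(source)
print(loaded)
print(rejected)
```

The same customer number always maps to the same surrogate key, so the warehouse key stays stable even if source systems format their own keys differently.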
-
Transformed data is sent to DATA STORE 3, which is called the DATA MART.
DEFINITION OF DATA MART: Data Marts are databases that share many of the features of data warehouses but are smaller in scope.
The LOADING phase can use several schemas. Two of them are:
Star Schema: Data is maintained in one fact table and multiple dimension tables.
Snowflake Schema: Data is maintained in the form of normalized dimension tables.
DATA STORE 3 is also called High Performance Query Structures [HPQS].
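A star schema can be sketched with Python's built-in sqlite3 module. The table names, column names and sample rows below are hypothetical; the only point is the shape, with one fact table referencing several dimension tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe who and what.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT);
-- The single fact table holds the measures and points at the dimensions.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    amount       REAL
);
INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO dim_product  VALUES (1, 'Pen'), (2, 'Book');
INSERT INTO fact_sales   VALUES (1, 1, 10.0), (1, 2, 25.0), (2, 2, 25.0);
""")

# A typical data mart query: total sales per customer.
totals = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(totals)
```

A snowflake schema would go one step further and split a dimension such as dim_product into additional normalized tables, for example a separate category table.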
DATA FLOW 3 is the transfer of data from the High Performance Query Structures to the end user reporting applications, DATA STORE 4.
DATA STORE 4 is the data in the end users' hands. This report in the users' hands is the end of the information utility. It is, also, the
-
A Centralized Data Warehouse Server is maintained at a particular place. The transactions of all the Government Departments are transferred to the Centralized Server, the Data Warehouse Server. The topology of the network is equated to the architecture of the Data Warehouse, as shown in the figure.
-
DWHS: Data Warehousing Server
OLPS: Online Analytical Processing System
In the above example, data from three departments is extracted, transformed, and sent to the Centralized Server [DWHS].
Data Marts can answer most complex queries, and report generation will be immediate.
This data can be checked further for corrections; any incorrect data found in the Data Warehouse can be reported to the government.
Thus, Data Warehousing can take both private and public sectors to a top level.