IBM Cognos tutorial - abc learn
Data Warehouse Concepts
What is a Data Warehouse?
In general terms, a warehouse is a repository where we store things. A data warehouse is a
collection of an organization's data, organized and categorized in a specific manner.
A data warehouse stores the historical data of an organization so that the organization can
analyze its performance over past years and plan for the future.
The popular definition of the data warehouse by W. H. Inmon:
A data warehouse is:
Subject oriented: Data in the warehouse is categorized into different subject areas. For example,
consider KFC, which has many branches all over the world. If we have to analyze sales for
India, then “sales” is the subject.
Integrated: A data warehouse holds data coming from multiple sources, integrated into one
consistent form. For example, KFC stores in India may store a date field as “dd/mm/yyyy”,
whereas stores in another country may store the same field as “MM/DD/YYYY”. The data
warehouse will use only one fixed format, say “MM/DD/YYYY”.
Time-Variant: A data warehouse stores historical data, so an organization can identify sales
patterns over a period of 3 months, 6 months or 2 years.
Non-Volatile: Data in the warehouse does not change once it is entered.
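To make the "Integrated" point concrete, here is a minimal sketch of bringing dates from two sources into one warehouse format. The source names ("india", "other_country") and the function name are assumptions made up for this illustration; they are not part of any tool discussed in this tutorial.

```python
from datetime import datetime

# Hypothetical mapping: which date layout each source system uses.
SOURCE_FORMATS = {
    "india": "%d/%m/%Y",          # dd/mm/yyyy
    "other_country": "%m/%d/%Y",  # MM/DD/YYYY
}

# The single format the warehouse settles on (MM/DD/YYYY).
WAREHOUSE_FORMAT = "%m/%d/%Y"


def to_warehouse_date(raw: str, source: str) -> str:
    """Parse a date string using its source layout and re-emit it in the warehouse layout."""
    parsed = datetime.strptime(raw, SOURCE_FORMATS[source])
    return parsed.strftime(WAREHOUSE_FORMAT)


print(to_warehouse_date("25/12/2023", "india"))          # 12/25/2023
print(to_warehouse_date("12/25/2023", "other_country"))  # 12/25/2023
```

Both inputs describe the same day, and both come out in the same format, which is the kind of consistency the warehouse enforces.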
Another definition from Ralph Kimball is more precise:
A data warehouse is a copy of transaction data specifically structured for query and analysis.
What is a Data Mart?
A data mart is a subset of a data warehouse. Suppose an organization is established in
many different locations and each location maintains its own store of data. We call each of
these a data mart: the warehouse holds all the data in integrated form, whereas a data mart
holds only a part of the data warehouse.
What are the uses of having a Data Warehouse?
A data warehouse is generally used to analyze trends over a period of time, which improves the
decision making of an organization. Once the data is loaded into the warehouse, we either create
an OLAP cube or use the data directly to analyze trends. Based on these trends, top-level
management plans its future business strategies.
Data warehouse Architecture
We will have multiple sources where all the operational data is stored. Some may store it in flat
files and some in databases. We have to read all the sources and perform ETL (Extract,
Transform and Load) operations, which integrate the data from these sources,
transform it into one uniform structure and then load it into our target data warehouse.
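The ETL steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the flat file is simulated with an in-memory string, both the operational database and the target warehouse are in-memory SQLite databases, and all table and column names (orders, sales, price_cents and so on) are invented for the example.

```python
import csv
import io
import sqlite3

# Extract (source 1): a flat file, simulated here with an in-memory string.
flat_file = io.StringIO("product,amount\nburger,120\nfries,45\n")
flat_rows = [(row["product"], row["amount"]) for row in csv.DictReader(flat_file)]

# Extract (source 2): an operational database that stores prices in cents.
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE orders (item TEXT, price_cents INTEGER)")
source_db.execute("INSERT INTO orders VALUES ('Burger', 13000)")
db_rows = source_db.execute("SELECT item, price_cents FROM orders").fetchall()


def transform(flat_rows, db_rows):
    """Bring both sources to one structure: (lowercase product name, amount as float)."""
    unified = [(name.strip().lower(), float(amount)) for name, amount in flat_rows]
    unified += [(name.strip().lower(), cents / 100) for name, cents in db_rows]
    return unified


# Load: write the unified rows into the target warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (product TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?)", transform(flat_rows, db_rows))
warehouse.commit()

loaded = warehouse.execute(
    "SELECT product, amount FROM sales ORDER BY product, amount"
).fetchall()
print(loaded)
```

A real ETL tool adds scheduling, error handling and incremental loads on top of this, but the three stages stay the same.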
Data Warehouse Architecture
There are two approaches to designing a data warehouse: the “Top-Down approach” and
the “Bottom-Up approach”.
Top-Down Approach: The above image shows the top-down approach, where we read
data from multiple sources, transform it and load it into the warehouse, and then create
our data marts on top of the warehouse.
Bottom-Up Approach: This is the opposite of the above approach. Here we create the data
marts first and then build the data warehouse on top of all the data marts.
Data Warehouse Terms
Dimension: A dimension is a category of descriptive information stored in a data warehouse.
Fact: A fact is a measurable quantity that we analyze against the dimensions.
Attribute: Attributes are the elements in a dimension.
Ex: Suppose we have product information classified in a dimension. The attributes of the “Product
dimension” are “Product_ID”, “Product_Name”, “Product_Color” etc.
Sales is a fact table, where you store numeric values associated with the dimension attributes. For
instance, suppose we have to calculate the number of sales of each product for the current month.
In this case, Product is a dimension, Month is a dimension, and the number of products sold is the fact.
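As a rough illustration of the example above, the sketch below sums a fact (units sold) for each combination of the two dimensions (product and month). The sample rows and the function name are invented for this illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, month, units_sold).
sales_facts = [
    ("Burger", "2024-01", 3),
    ("Burger", "2024-01", 2),
    ("Fries", "2024-01", 4),
    ("Burger", "2024-02", 1),
]


def units_by_product_and_month(facts):
    """Sum the fact (units sold) for every (product, month) dimension pair."""
    totals = defaultdict(int)
    for product, month, units in facts:
        totals[(product, month)] += units
    return dict(totals)


totals = units_by_product_and_month(sales_facts)
print(totals[("Burger", "2024-01")])  # 5
```

The dimensions decide how the rows are grouped; the fact is the number that gets added up.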
OLAP: Online Analytical Processing is a multidimensional model with which users can view
data across multiple dimensions at a single glance.
Types of OLAP:
o Relational OLAP
o Multi-dimensional OLAP
o Hybrid OLAP
OLTP: Online Transaction Processing is a transaction-based model whose main aim is
to retrieve data fast and update transactions quickly.
Schemas in Data Warehouse
A schema is a logical arrangement of tables in a data warehouse. Just as relational
databases have schemas, a data warehouse has schemas, namely the Star schema, the Snowflake
schema and the Fact Constellation schema.
Star Schema
In this logical arrangement of tables, the fact table is at the center of the schema, surrounded
by multiple dimension tables. Fact and dimension tables are linked via primary key-foreign key
relationships.
The fact table holds the foreign keys of all the dimension tables along with the facts, whereas a
dimension table holds the attributes that describe the dimension.
Snowflake Schema:
In this logical arrangement of tables, dimension tables are connected to other dimension
tables, which in turn are connected to the fact table via primary key-foreign key relationships.
In other words, the dimension tables in a snowflake schema are normalized.
Because of the normalization, data redundancy is reduced and storage space is
saved.
Fact Constellation Schema:
A fact constellation schema, unlike the star or snowflake schema, has multiple fact tables. It is
also called a Galaxy Schema.
Star Schema Example
In the schema shown, a fact table (sales) is connected to multiple dimension tables: item, location,
time and branch. Each dimension table has attributes describing the dimension, and the fact table
has the foreign keys of all the dimension tables along with facts like dollars_sold and units_sold.
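A star schema like this one can be sketched with an in-memory SQLite database. The key and attribute column names (item_key, item_name, month and so on) and the sample rows are assumptions for illustration; the time dimension is named time_dim here only to keep the table name unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per member, with descriptive attributes.
CREATE TABLE item     (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE location (location_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE branch   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);

-- Fact table at the center: foreign keys to every dimension plus the facts.
CREATE TABLE sales (
    item_key     INTEGER REFERENCES item(item_key),
    location_key INTEGER REFERENCES location(location_key),
    time_key     INTEGER REFERENCES time_dim(time_key),
    branch_key   INTEGER REFERENCES branch(branch_key),
    dollars_sold REAL,
    units_sold   INTEGER
);

INSERT INTO item VALUES (1, 'Burger'), (2, 'Fries');
INSERT INTO location VALUES (1, 'Delhi');
INSERT INTO time_dim VALUES (1, '2024-01');
INSERT INTO branch VALUES (1, 'Main');
INSERT INTO sales VALUES
    (1, 1, 1, 1, 50.0, 10),
    (1, 1, 1, 1, 25.0, 5),
    (2, 1, 1, 1, 8.0, 4);
""")

# Join the fact table to two of its dimensions and aggregate the facts.
report = conn.execute("""
    SELECT i.item_name, t.month, SUM(s.units_sold), SUM(s.dollars_sold)
    FROM sales AS s
    JOIN item AS i ON i.item_key = s.item_key
    JOIN time_dim AS t ON t.time_key = s.time_key
    GROUP BY i.item_name, t.month
    ORDER BY i.item_name
""").fetchall()
print(report)
```

Every query follows the same shape: start from the fact table, join only the dimensions you need, and aggregate the facts.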
Snowflake Schema Example:
In this schema, a fact table (sales) is connected to multiple dimensions, while the item and
location dimensions are in turn connected to the Supplier and City dimensions respectively.
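The normalization in a snowflake schema can be sketched the same way. In this minimal example, it is assumed that the item dimension points to a separate supplier table instead of repeating the supplier name on every item row; all column names and rows are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Supplier details live in their own table (the normalized part).
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_name TEXT);

-- The item dimension stores only a key that points to its supplier.
CREATE TABLE item (
    item_key     INTEGER PRIMARY KEY,
    item_name    TEXT,
    supplier_key INTEGER REFERENCES supplier(supplier_key)
);

-- The fact table still references the item dimension directly.
CREATE TABLE sales (
    item_key   INTEGER REFERENCES item(item_key),
    units_sold INTEGER
);

INSERT INTO supplier VALUES (1, 'Acme Foods');
INSERT INTO item VALUES (1, 'Burger', 1), (2, 'Fries', 1);
INSERT INTO sales VALUES (1, 10), (2, 4), (1, 5);
""")

# Reaching the supplier takes one extra join: fact -> item -> supplier.
units_per_supplier = db.execute("""
    SELECT sup.supplier_name, SUM(s.units_sold)
    FROM sales AS s
    JOIN item AS i ON i.item_key = s.item_key
    JOIN supplier AS sup ON sup.supplier_key = i.supplier_key
    GROUP BY sup.supplier_name
""").fetchall()
print(units_per_supplier)
```

The supplier name is stored once rather than on every item row, at the cost of one more join per query.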
Fact Constellation schema Example:
In this schema, we find two fact tables, Sales and Shipping, which are linked through the
dimension tables they share, and each fact table is connected to multiple dimension tables.
Now we are jumping into the actual topic “Cognos Business Intelligence”.
Overall DWH architecture
We will be dealing with the Cognos reporting tool, the Cognos metadata management tool and the
other useful tools that Cognos BI has.
Let’s go through where BI actually comes into the picture.
The below image shows the overall data warehouse architecture, from loading the data to using
the loaded data to enhance the business of your organization. From all the data sources
(operational databases, flat files etc.) we load the data into the data warehouse through
different ETL processes, create cubes and use the data for mining purposes.
The below figure describes the Business Intelligence flow, that is, how the data is used for
business optimization. Business intelligence is a process through which you can derive
ways to enhance your business from the data that you have. Using BI, one can look
at the data at different levels and make decisions that improve the business.
Cognos BI is used to generate reports (lists, crosstabs and charts). The below figure shows
the entire flow that happens with Cognos.
The first step in using Cognos BI is to gather requirements. This is possible when you understand
your business structure and the data that drives your business. Once the requirements are finalized,
the next step is to create your framework metadata model. You can choose to use either a star
schema or a snowflake schema, or you can use the existing relational model.
After this step is done, we have to publish the model that we have created. Publishing the
model makes the metadata model available for reporting. Once the model is
published, we can use it to create different types of reports as per the requirements, and
schedule the reports as required.
Different Versions of Cognos
Cognos ReportNet
Cognos ReportNet (CRN) is a web-based software product for creating and managing ad-hoc and
custom-made reports. ReportNet is developed by the Ottawa-based company Cognos (formerly
Cognos Incorporated), an IBM company. The web-based reporting tool was launched in
September 2003. Since IBM's acquisition of Cognos, ReportNet has been renamed IBM Cognos
ReportNet like all other Cognos products.
Components:
Cognos Report Studio – A Web-based product for creating complex professional looking reports
Cognos Query Studio - A Web-based product for creating ad-hoc reports.
Cognos Framework Manager – A metadata modeling tool to create BI metadata for reporting and
dashboard applications.
Cognos Connection – The main portal used to access reports, schedule reports and perform
administrative activities.
Cognos 8.x
IBM Cognos 8 BI, initially launched in September 2005, combined the features of several
previous products, including ReportNet, PowerPlay and Metrics Manager. There are also Express
and Extended versions of Cognos 8 BI. Full features:
Components:
Report Studio (Professional report authoring tool formatted for the web)
Query Studio (Ad hoc report authoring tool with instant data preview)
Analysis Studio (Explore multi-dimensional cube data to answer business questions)
Metric Studio (Monitor, analyze, and report on KPIs)
Metric Designer (Define, load, and maintain metrics to be available in Metric Studio)
Event Studio (Action based agents to notify decision makers as events happen)
Framework Manager (Semantic metadata layer tool which creates models or packages)
PowerPlay Studio (formerly PowerPlay Web)
Analytic Applications (Packaged BI Applications, built on an adaptable platform and extensible
into Business Analytics)
Cognos 10.x
Here we can see the different components and how they fit together. The topmost layer contains
Cognos Connection, Administrator, Business Insight and the different studios. They are all
web-based, and the end user does not need to install any client-side software as long as a
recent web browser is installed.
The bottom layer is the data layer, where you may have homogeneous or heterogeneous
database systems. Data may be relational or multi-dimensional. On top of it, we can see three
modeling tools: Framework Manager, Transformer and Metric Designer. All of them are
client-based installations.
We’ll follow the flow of components from top to bottom, as shown in the BI components
figure below.
Before getting into how you create the model, deploy it and use it to create
reports, we first have to know the architecture of Cognos.
Cognos Architecture:
The Cognos Business Intelligence framework is a 3-tier architecture. The first tier of the
architecture has the Web server, which is responsible for access to the user interfaces. The
second tier has the IBM Cognos BI Server, which has gateways and dispatchers
to route requests from the different UIs to the database and to communicate the responses back.
The third tier has the data sources. You can have multiple data sources, and they
connect to the Cognos server using JDBC or APIs.
The below figure shows the architecture of Cognos BI:
In the data tier, you can see one more component called the “Content Store”, which is the repository
for the Cognos server. Whenever you save an object (a report, a model and so on), its data is
stored in the Content Store database; when you reopen or reuse an existing object, its data is
fetched from the Content Store. All this work is performed by the “Content Manager”.
Content Manager: The Content Manager is responsible for connecting to the Content Store
database, saving reports and models to it and fetching them back from it.
Dispatcher: The job of the dispatchers is to route requests to and from the Web UI and
Windows-based components.
Web-based and Windows-based components of Cognos
Cognos BI has two types of components:
1. Web-based components.
2. Windows-based components.
Web-based components include:
Cognos Connection: This is the portal from where you can access all the web-based components
of IBM Cognos BI.
Cognos Administration: Using this web-based UI you can perform administrative tasks like
granting access, revoking access, creating users and groups, defining roles etc.
Cognos Report Studio: Using the Report Studio tool we can create and format reports. We
can create multiple types of reports, such as list reports, crosstab reports, graphs etc.
Cognos Query Studio: Here we create ad-hoc reports. Ad-hoc reports are useful when a user
wants to see a report without any prompts and with little formatting. Using Query Studio we can
create a report instantly.
Cognos Metric Studio: This studio is used to build customized scorecard reports to monitor and
analyze metrics.
Cognos Analysis Studio: This studio is used to analyze data across different dimensions and to
compare trends.
Cognos Business Insight: This tool is used to build dashboards. A dashboard allows a user to
quickly look into the data and make better decisions.
Cognos Business Insight Advanced: With this tool, we can build more powerful dashboards
as well as simple reports.
Windows-based Components:
Cognos Framework Manager: Framework Manager is a tool used to create metadata
models, which we can use in Report Studio or Analysis Studio.
Cognos Map Manager: Using this tool we can create maps, which will allow you to create a new
region by using the existing regions.