principles operational v analytical systems
DESCRIPTION
Sheffield Hallam University. Data Warehousing & Data Mining. Principles Operational v Analytical Systems. A customer walks into a bank, and speaks to a cashier. The cashier uses his/her computer to answer the queries / fulfil the actions. List five typical queries / actions - PowerPoint PPT PresentationTRANSCRIPT
Principles
Operational v AnalyticalSystems
Data Warehousing & Data Mining Sheffield Hallam University
1
A customer walks into a bank, and speaks to a cashier. The cashier uses his/her computer to answer the queries / fulfil the actions.
List five typical queries / actions1.__________________________________2.__________________________________3.__________________________________4.__________________________________5.__________________________________
2
What similarities can you spot about the nature of the data involved in your five queries / actions? Hint: volume, up-to-dateness (“currency”), level of detail
List four similarities
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________
3
What general characteristics do you think can be stated about the IT application that the cashier uses to handle the customer queries / actions? Hint: Complete the following sentence: The application is oriented towards ..........
List four characteristics
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________4
Operational Systems1. High volume of
transactions2. Small processing
per transaction
3. Frequent updating of data
4. Data is always current
5. Transaction driven6. Predictable query types
7. Static structure8. Content varies
9. High accuracy10. High availability
11. Mature support
Table 1: Attributes of an Operational System 5
Explanation of points 7 & 8 on the previous slide•How often does the underlying design of the bank’s database have to change (tables, relationships, integrity rules etc)?
•Just thinking about the current data (not the archive data), how much bigger /smaller will the bank’s volume of data be in one year’s time?
•Just thinking about the current data, how much of that data will have different values in one year’s time?
Do you understand why we said “Static structure; content varies” ?
6
Operational systems are generally based on Relational Database systems.
Very highly optimised towards fast writing / retrieval of small items of data (eg Oracle, DB2, SQL-Server, MySQL = very long established, huge $$ research investments)
Highly optimised towards using Relationships to fetch related data (eg Customer Name, and current balance)
These reasons make Relational Databases extremely quick
Operational Systems
7
Operational systems are generally based on Relational Database systems (cont ...)
The database itself can enforce Referential Integrity (eg cannot delete customer name & address if they still have an account open)
• Relational design: Only store each data item in one place (eg if customer changes address, only one copy to change)
These reasons make Operational applications much easier to write.
Operational Systems
8
The literature tends to use the term
On-Line Transactional Processing (OLTP)for what we have described as “operational” systems.
Transactional =
On-line : This term is a bit historic ... originally most systems processed batches of data non-interactively (cheques are one of the few batch-oriented systems left now). Now systems all tend to be on-line / interactive
Since the industry calls it OLTP, we will too.
9
Management have different needs to that of the operational side of the business. Managers are much more concerned with trends and totals and are generally not so concerned with the finer details. What they want are reporting systems that:
Give quick access to summaries of data Have data structures that are business oriented Allow users to explore the data Give them control over report writing
Management Reporting
10
The manager of the bank wants to analyse the effectiveness of last month’s business
List five typical queries1.__________________________________2.__________________________________3.__________________________________4.__________________________________5.__________________________________Nb: TRY to stick with just data drawn from the Cashier system. But you will find this hard. Later we will see that it is a feature of Analytical system that they integrate data from many sources
11
What similarities can you spot about the nature of the data involved in your five queries?Hint: volume, up-to-dateness (“currency”), level of detail
List four similarities
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________
12
What general characteristics do you think can be stated about analytical applications? Hint: Complete the following sentence: The application is oriented towards ..........
List four characteristics
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________
13
Analytical Systems1. Small volume of
transactions2. Often huge
processing per transaction
3. Data output level is summary
4. Data routinely added to, but infrequently changed
5. Analysis driven6. Flexible results
structure7. 'Fairly accurate' better
than no result
8. Medium availability
9. Requires different database tools
Table 2: Attributes of an Analytical System 14
Data is stored in structures easy for business users to understand (not constrained by Relational rules)
Data held in duplicate if this makes access easier/quicker (eg can hold summary/totals of data too)
Out-of-date data held (with timestamp) as well as new (allows examination of historical trends)
These sort of systems are referred to as OLAP systems or On-Line Analytical Processing Systems.
On-line Analytical Processing (OLAP)The Management Data is better stored in a Data Warehouse.
15
16
The diagram below shows how Data is converted into Business Intelligence
Information Knowledge
Structure Analyse Apply
Data Intelligence
The Process Flow
The diagram below shows the relationships between the various components:
Database Data Warehouse
OLTP OLAP
SQL Server, Oracle, Ingres, Informix, DB/2
etc
SAS Warehouse Administrator, SQL Server etc
Application Software, SAP, SQL
Server, Oracle, Spreadsheets etc
SQL Server Analysis Services, Crystal Analysis, Oracle Discovery
etc
Cleansing/Staging
Database Data Warehouse
OLTP OLAP
SQL Server, Oracle, Ingres, Informix, DB/2
etc
SAS Warehouse Administrator, SQL Server etc
Application Software, SAP, SQL
Server, Oracle, Spreadsheets etc
SQL Server Analysis Services, Crystal Analysis, Oracle Discovery
etc
Cleansing/Staging
Structure Analyse Apply
Information KnowledgeData Intelligence
A Data Warehouse is ...
"... where data is specifically structured for query and analysis performance and ease-of-use"
Kimball, 2002
Definition
19
Data Warehouses offer the flexibility needed to cope with the Management demands. A major issue is that often there are many differing OLTP systems and other data storage media. The Data Warehouse offers the opportunity to gather these together into one system with a unified structure.
Because data is stored in a simplified aggregated format it allow reports to be written by staff who have a lesser computing background.
Why a Data Warehouse?
20
OLAP versus OLTPIf a report is designed to obtain information from an OLTP system it will generally be:
• Slow to produce the answer• Complicated to write• Slow down the operational system• Need complicated formulas for grouping data
21
Operational Analytical
Data Individual Items Summarised
Data Relations Simple Chains Complex/Unknown
Time Present Past
Access Record-at-a-time Many records
Approach Support a transaction Explore a domain
Implementation Relational Follow the module!
Summary
22