Business Intelligence Open Source
DESCRIPTION
Business Intelligence Open Source course: theory and principal vendors.
TRANSCRIPT
Business Intelligence
www.robertomarchetto.com
History
● The term "Business Intelligence" first appeared in 1958, used by Hans Peter Luhn, an IBM researcher
● It described an automatic method to provide current awareness services to scientists and engineers
● Current definition: Business Intelligence is a combination of processes and technologies for gathering, storing, analyzing and providing access to information, to help enterprise users make informed decisions
Main concept
● Collect data from different sources
● Integrate and clean up data in a common, easy-to-analyze repository
● Provide business-related analysis for managers and decision makers
● Focus on business, data integration, data presentation
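The collect, integrate and clean-up steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original course: the two source extracts (`crm_rows`, `erp_rows`), their field names, and the helpers `clean_name` and `integrate` are all hypothetical.

```python
# Hypothetical extracts: two systems describe the same customers
# with different field names and inconsistent formatting.
crm_rows = [
    {"customer": " Acme Corp ", "country": "it"},
    {"customer": "Globex", "country": "US"},
]
erp_rows = [
    {"cust_name": "ACME CORP", "revenue": "1200.50"},
    {"cust_name": "Globex", "revenue": "800"},
    {"cust_name": "Initech", "revenue": "n/a"},  # dirty value
]


def clean_name(raw):
    """Normalize a customer name so rows from both sources can be matched."""
    return " ".join(raw.split()).title()


def integrate(crm, erp):
    """Merge both sources into one repository keyed by the cleaned customer name."""
    repository = {}
    for row in crm:
        name = clean_name(row["customer"])
        repository[name] = {"country": row["country"].upper(), "revenue": 0.0}
    for row in erp:
        name = clean_name(row["cust_name"])
        try:
            revenue = float(row["revenue"])
        except ValueError:
            continue  # clean-up step: skip rows whose revenue cannot be parsed
        entry = repository.setdefault(name, {"country": None, "revenue": 0.0})
        entry["revenue"] += revenue
    return repository


repository = integrate(crm_rows, erp_rows)
for name, facts in sorted(repository.items()):
    print(name, facts)
```

Real data integration tools add scheduling, logging and error reporting around the same basic steps.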
Data warehouse
● Bill Inmon: a collection of data in support of the decision-making process
● End-user oriented
● Collected from different sources
● Time dependent
● Data is not editable (read-only once loaded)
● In theory, the term refers to a group of processes
● In the real world, it is often used for the database itself
OLTP: On-Line Transaction Processing
● Commonly used in ERP, CRM systems and database applications
● Focus on the transaction level (one invoice, one sales order, a search query, etc.)
● Updates and insertions are frequent
● Relational model with many tables, using normalization rules
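As a rough sketch of the OLTP style described above, the following uses Python's standard `sqlite3` module with an in-memory database. The table and column names (`customer`, `sales_order`, `order_line`) are hypothetical and chosen only for illustration. One sales order is written as a single transaction across normalized tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Normalized schema: each entity lives in its own table (hypothetical names).
conn.executescript("""
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sales_order (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    order_date  TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id INTEGER NOT NULL REFERENCES sales_order(id),
    product  TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    price    REAL NOT NULL
);
""")

# One business transaction: a single sales order with its lines.
# The "with" block commits on success and rolls back on error.
with conn:
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
    conn.execute(
        "INSERT INTO sales_order (id, customer_id, order_date) "
        "VALUES (10, 1, '2011-03-01')"
    )
    conn.executemany(
        "INSERT INTO order_line (order_id, product, quantity, price) "
        "VALUES (?, ?, ?, ?)",
        [(10, "widget", 2, 9.5), (10, "gadget", 1, 20.0)],
    )

# Typical OLTP read: look at one specific order, not the whole data set.
order_total = conn.execute(
    "SELECT SUM(quantity * price) FROM order_line WHERE order_id = ?", (10,)
).fetchone()[0]
print(order_total)
```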
OLAP: On-Line Analytical Processing
● A system designed for analysis purposes
● Focused on exploring the data as a whole
● Data, once added, changes much less frequently
● Dr. Codd's 12 rules for OLAP (1993)
● Multidimensional view
● Intuitive data manipulation
● Dimensions, Facts, Hierarchy levels, Cardinality
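The terms in the last bullet can be made concrete with a toy example. In this sketch (all data and the `roll_up` helper are made up for illustration), each fact carries a measure (`sales`) plus keys into two dimensions; the date dimension has a three-level hierarchy (year, quarter, month), and the cardinality of a level is the number of distinct members at that level.

```python
from collections import defaultdict

# Each fact: keys into the date and store dimensions, plus a numeric measure.
facts = [
    {"date": (2002, "Q1", "Jan"), "store": "CA", "sales": 100.0},
    {"date": (2002, "Q1", "Feb"), "store": "CA", "sales": 150.0},
    {"date": (2002, "Q2", "Apr"), "store": "WA", "sales": 200.0},
    {"date": (2003, "Q1", "Jan"), "store": "CA", "sales": 50.0},
]


def roll_up(rows, level):
    """Aggregate sales at one level of the date hierarchy.

    level 1 = year, level 2 = year/quarter, level 3 = year/quarter/month.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["date"][:level]] += row["sales"]
    return dict(totals)


by_year = roll_up(facts, 1)
by_quarter = roll_up(facts, 2)

# Cardinality of a hierarchy level = number of distinct members at that level.
print(len(by_year), len(by_quarter))
print(by_year)
```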
Relational OLAP
● Uses relational database schemas and SQL to store and access OLAP cubes
● Reuse of RDBMS technology
● Many tools and vendors available
● SQL can be used directly by many tools
● Scalability
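A common way to store a cube relationally is a star schema: one fact table surrounded by dimension tables. The sketch below assumes such a layout and uses Python's standard `sqlite3` module; the table names (`dim_date`, `dim_store`, `fact_sales`) and the sample rows are hypothetical. The "cube query" is then plain SQL with joins and `GROUP BY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: two dimension tables and one fact table (hypothetical names).
conn.executescript("""
CREATE TABLE dim_date  (date_id  INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, country TEXT, state TEXT);
CREATE TABLE fact_sales (
    date_id     INTEGER REFERENCES dim_date(date_id),
    store_id    INTEGER REFERENCES dim_store(store_id),
    store_sales REAL
);
INSERT INTO dim_date  VALUES (1, 2002, 1), (2, 2002, 2), (3, 2003, 1);
INSERT INTO dim_store VALUES (1, 'USA', 'CA'), (2, 'USA', 'WA');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 150.0), (3, 1, 50.0), (1, 2, 200.0);
""")

# Aggregate the measure along two dimensions: year and state.
rows = conn.execute("""
    SELECT d.year, s.state, SUM(f.store_sales) AS total
    FROM fact_sales f
    JOIN dim_date  d ON d.date_id  = f.date_id
    JOIN dim_store s ON s.store_id = f.store_id
    GROUP BY d.year, s.state
    ORDER BY d.year, s.state
""").fetchall()

for year, state, total in rows:
    print(year, state, total)
```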
Memory OLAP, Hybrid OLAP
● Memory OLAP (MOLAP, more commonly expanded as Multidimensional OLAP) uses optimized multidimensional arrays
● Requires pre-computation and storage of the cube (processing)
● Often performs better than ROLAP: better caching, multidimensional indexing
● Compression techniques, statistical indexes
● Less scalable than ROLAP on high data volumes; fewer tools and vendors available
● Hybrid OLAP (HOLAP) is the combination of ROLAP and MOLAP
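The pre-computation step can be illustrated with a deliberately small sketch in plain Python. The data, the `process_cube` and `lookup` helpers, and the convention of keeping "all members" totals in an extra last row and column are all assumptions made for this example; real MOLAP engines use far more elaborate storage, compression and indexing.

```python
years = [2002, 2003]
states = ["CA", "WA"]
year_index = {y: i for i, y in enumerate(years)}
state_index = {s: i for i, s in enumerate(states)}

# Raw facts: (year, state, sales)
facts = [
    (2002, "CA", 100.0),
    (2002, "CA", 150.0),
    (2002, "WA", 200.0),
    (2003, "CA", 50.0),
]


def process_cube(rows):
    """Pre-compute a 2-D array of sales; the extra last row and column hold totals."""
    cube = [[0.0] * (len(states) + 1) for _ in range(len(years) + 1)]
    for year, state, sales in rows:
        i, j = year_index[year], state_index[state]
        cube[i][j] += sales    # cell for (year, state)
        cube[i][-1] += sales   # total for the year across all states
        cube[-1][j] += sales   # total for the state across all years
        cube[-1][-1] += sales  # grand total
    return cube


cube = process_cube(facts)


def lookup(year=None, state=None):
    """Answer a query by array lookup only; None means 'all members'."""
    i = year_index[year] if year is not None else -1
    j = state_index[state] if state is not None else -1
    return cube[i][j]


print(lookup(2002, "CA"), lookup(year=2002), lookup(state="CA"), lookup())
```

After processing, every query is a constant-time array access, which is where the performance advantage comes from; the cost is paid up front and grows with the number of dimension members.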
Slowly Changing Dimensions
● In some Business Intelligence implementations data is always added and almost never modified
● This makes it possible to go back along the timeline
● For example, if an employee was hired in a given time period, you can analyze the data as it was in that period and count the exact number of employees at that time
● A common approach to implementing Slowly Changing Dimensions is to add some special fields to the database records, giving each record a time-related validity
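The validity-field approach in the last bullet is often called a "Type 2" slowly changing dimension. A minimal sketch follows; the field names `valid_from` and `valid_to`, the sample employees, and the helper functions are hypothetical. Each change closes the current record and appends a new one, so earlier states remain queryable.

```python
from datetime import date, timedelta

# Employee dimension with validity fields; valid_to = None means "still current".
employee_dim = [
    {"employee": "Anna", "department": "Sales",
     "valid_from": date(2009, 1, 1), "valid_to": date(2010, 6, 30)},
    {"employee": "Anna", "department": "Marketing",
     "valid_from": date(2010, 7, 1), "valid_to": None},
    {"employee": "Marco", "department": "Sales",
     "valid_from": date(2010, 3, 1), "valid_to": None},
]


def valid_on(record, day):
    """True if the record was the valid version on the given day."""
    return record["valid_from"] <= day and (
        record["valid_to"] is None or day <= record["valid_to"]
    )


def headcount(rows, day, department=None):
    """Count employees as they were on a given day, optionally per department."""
    return sum(
        1 for r in rows
        if valid_on(r, day) and (department is None or r["department"] == department)
    )


def change_department(rows, employee, new_department, change_day):
    """Close the current record and append a new one; history is not overwritten."""
    for r in rows:
        if r["employee"] == employee and r["valid_to"] is None:
            r["valid_to"] = change_day - timedelta(days=1)
    rows.append({"employee": employee, "department": new_department,
                 "valid_from": change_day, "valid_to": None})


print(headcount(employee_dim, date(2010, 1, 1)))           # before Marco was hired
print(headcount(employee_dim, date(2010, 4, 1), "Sales"))  # both in Sales
change_department(employee_dim, "Marco", "Finance", date(2011, 1, 1))
print(headcount(employee_dim, date(2010, 8, 1), "Sales"))  # past state is unchanged
```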
MDX
● Multidimensional Expressions (MDX) is a query language for OLAP databases
● MDX is to OLAP databases what SQL is to OLTP databases
● Powerful for computing indexes and navigating through OLAP dimensions
● Example: SELECT {[Measures].[Store Sales]} ON COLUMNS, {[Date].[2002], [Date].[2003]} ON ROWS FROM Sales WHERE ([Store].[USA].[CA])
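To show what the example query asks for, here is a rough Python equivalent over made-up fact rows. It is not an MDX engine: the `evaluate` helper, the flat field names and the sample values are assumptions for illustration only. The slicer (the `WHERE` part) restricts the data to the USA/CA store member, the rows axis lists the years 2002 and 2003, and the single column is the Store Sales measure.

```python
# Made-up fact rows; field names are hypothetical.
facts = [
    {"year": 2002, "country": "USA", "state": "CA", "store_sales": 100.0},
    {"year": 2002, "country": "USA", "state": "WA", "store_sales": 200.0},
    {"year": 2003, "country": "USA", "state": "CA", "store_sales": 50.0},
    {"year": 2003, "country": "USA", "state": "CA", "store_sales": 25.0},
    {"year": 2004, "country": "USA", "state": "CA", "store_sales": 75.0},
]


def evaluate(rows, row_years, slicer):
    """Sum store_sales per requested year, keeping only rows that match the slicer."""
    result = {}
    for year in row_years:
        result[year] = sum(
            r["store_sales"]
            for r in rows
            if r["year"] == year and all(r[k] == v for k, v in slicer.items())
        )
    return result


# Rows axis: 2002 and 2003; slicer: Store = USA / CA; measure: Store Sales.
result = evaluate(facts, [2002, 2003], {"country": "USA", "state": "CA"})
for year, sales in result.items():
    print(year, sales)
```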
Features for a BI platform
● Data storage, data management
● Data integration, process scheduling
● Querying and reporting
● On-Line Analytical Processing (OLAP)
● Document management, versioning
● Statistical computations
● Microsoft Office or OpenOffice support
● Ease of use and end-user self-service creation of documents (independence from developers)
Data Mining
● Requires a strong background in computational statistics
Open Source offers
● Reporting
● OLAP
● Charts
● Portal containers
● Data integration tools
● Libraries, CMS, scheduler
● Databases
SpagoBI (BI Suite)
● Engineering Informatica (Italy)
● Integration of components using drivers
● Comprehensive
● Full Open Source
Pentaho (BI Suite)
● Pentaho (USA)
● Acquisition instead of integration
● Strong marketing
● Commercial and Open Source
JasperServer (BI Suite)
● JasperSoft (USA)
● Famous for JasperReports
● Easy to use
● Commercial and Open Source
Palo (In memory OLAP)
● Jedox (Germany)
● Interesting technology (M-OLAP, GPU)
● Excel and OpenOffice plugins
● Web spreadsheet and reporting
● Open Source and Commercial support
Talend (Data Integration)
● Talend (France)
● Named a Gartner "Cool Vendor" for Data Integration
● Data Integration, Data Quality, Data Management, ESB
● Open Source and Commercial support