olap solutions using pentaho analysis services

Post on 01-Jan-2017


OLAP Solutions using Pentaho Analysis Services

Gabriele Pozzani

PAS

● Pentaho Analysis Services (PAS) provides
– OLAP capabilities
– To interactively analyze data through a cross-tab interface
– No need to define a query
– A front-end provides the interface to retrieve and format data
● Drill-down
● Drill-up
● Slicing
● Dicing
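The four operations listed above can be illustrated without any OLAP server. The following is only a minimal sketch in plain Python: the fact rows, the column names, and the helpers `aggregate` and `select` are hypothetical and are not part of PAS.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
FACTS = [
    (2016, "Q1", "North", "Books", 100),
    (2016, "Q1", "South", "Books", 80),
    (2016, "Q2", "North", "Games", 50),
    (2016, "Q2", "South", "Games", 70),
    (2017, "Q1", "North", "Books", 120),
    (2017, "Q1", "South", "Games", 90),
]
COLUMNS = ("year", "quarter", "region", "product")


def aggregate(rows, by):
    """Sum sales grouped by the named columns (an aggregation at that level)."""
    idx = [COLUMNS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[4]
    return dict(totals)


def select(rows, **allowed):
    """Keep rows whose value is in the allowed set for every named column."""
    return [
        row for row in rows
        if all(row[COLUMNS.index(col)] in values for col, values in allowed.items())
    ]


# Drill-up: totals per year. Drill-down: the same totals split by quarter.
by_year = aggregate(FACTS, ["year"])
by_quarter = aggregate(FACTS, ["year", "quarter"])

# Slicing fixes one member of one dimension; dicing restricts several dimensions.
north_slice = aggregate(select(FACTS, region={"North"}), ["year"])
diced = aggregate(select(FACTS, region={"North"}, year={2016}), ["product"])
```

A cross-tab front-end performs the same kind of regrouping interactively, so the user never writes the query by hand.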

PAS components (I)

● PAS consists of four components

1. Mondrian OLAP Engine: receives MDX queries from JPivot and returns a multi-dimensional result set
• Included in the Pentaho Server

2. Schema Workbench: designs and tests Mondrian cube schemas
• Cubes are used by Mondrian to interpret MDX and translate it into SQL queries on an RDBMS

PAS components (II)

3. JPivot analysis front-end: a Java-based analysis tool. Front-end for OLAP cubes

4. Aggregate designer: a designer for generating aggregate tables to speed up the analytical engine
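To show why aggregate tables speed up the engine, here is a small sketch using Python's built-in `sqlite3` module. The table names, columns, and data are hypothetical. The `fact_count` column follows what is, to my knowledge, Mondrian's usual naming convention for aggregate tables, so treat it as an assumption rather than as output of the Aggregate Designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact table: one row per sale.
    CREATE TABLE sales_fact (year INTEGER, month INTEGER, store TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        (2016, 1, 'A', 10.0), (2016, 1, 'B', 20.0),
        (2016, 2, 'A', 30.0), (2017, 1, 'A', 40.0);

    -- Aggregate table: one pre-computed row per year.
    CREATE TABLE agg_year_sales_fact AS
        SELECT year, SUM(amount) AS amount, COUNT(*) AS fact_count
        FROM sales_fact
        GROUP BY year;
""")

# The yearly totals can be computed by scanning the whole fact table...
from_fact = conn.execute(
    "SELECT year, SUM(amount) FROM sales_fact GROUP BY year ORDER BY year"
).fetchall()

# ...or simply read from the much smaller aggregate table.
from_agg = conn.execute(
    "SELECT year, amount FROM agg_year_sales_fact ORDER BY year"
).fetchall()
```

Both queries return the same totals, but the second reads one row per year instead of one row per sale.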

Schemas

● Mondrian Schemas are XML documents
– Describe multidimensional cubes
– Describe the mapping between the multi-dimensional and the relational model
– Are used to translate MDX to SQL
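As an illustration of the MDX-to-SQL mapping, the sketch below pairs an MDX query with one SQL query that answers the same question on a tiny star schema. The tables, the cube name `Sales`, and the data are hypothetical, and the SQL is hand-written: the SQL that Mondrian actually generates will generally look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical star schema: one dimension table, one fact table.
    CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE sales_fact (time_id INTEGER, amount INTEGER);
    INSERT INTO time_dim VALUES (1, 2016, 'Q1'), (2, 2016, 'Q2'), (3, 2017, 'Q1');
    INSERT INTO sales_fact VALUES (1, 100), (1, 50), (2, 70), (3, 90);
""")

# The multi-dimensional question: the Amount measure for every Year member.
MDX = (
    "SELECT {[Measures].[Amount]} ON COLUMNS, "
    "[Time].[Year].Members ON ROWS "
    "FROM [Sales]"
)

# A relational formulation of the same question: join the fact table to the
# dimension table and group by the column behind the Year level.
SQL = """
    SELECT d.year, SUM(f.amount)
    FROM sales_fact AS f
    JOIN time_dim AS d ON d.time_id = f.time_id
    GROUP BY d.year
    ORDER BY d.year
"""
rows = conn.execute(SQL).fetchall()
```

The schema is what tells the engine that `[Time].[Year]` lives in `time_dim.year` and that `[Measures].[Amount]` is a sum over `sales_fact.amount`.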

MDX

● MDX: Multi-Dimensional eXpressions
– A language designed for querying OLAP databases
– A de facto standard developed by Microsoft

http://msdn.microsoft.com/en-us/library/ms145506.aspx

Pentaho Schema Workbench


● PSW is a graphical tool
– To create Mondrian schemas
– To publish schemas to the Pentaho Server

Connect to DB

● The first thing to do is to establish a connection to the database
– Options → Connections...

JDBC Explorer

● Once the connection has been established you can explore the database
– File → New → JDBC Explorer

Create a new schema

● The schema editor can:
– Create a new schema (File → New → Schema)
– Save the schema to disk as an .xml file
– Edit object attributes
– Switch to a view of the XML representation of the schema (view only, no editing)

Main tasks

● Basic tasks for defining a schema are:

1. Create a schema

2. Create cubes

2.1. Choose a fact table

2.2. Add measures

3. Create dimensions

3.1. Edit the default hierarchy and choose a dimension table

3.2. Define hierarchy levels

4. Associate dimensions with cubes
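The result of these four steps is an XML document. The sketch below builds a minimal one with Python's standard `xml.etree.ElementTree` module instead of the Workbench GUI. The element and attribute names follow the Mondrian 3 schema format as I understand it. The schema, cube, table, and column names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# 1. Create a schema.
schema = ET.Element("Schema", name="SalesSchema")

# 3. Create a (shared) dimension, 3.1 choose its dimension table,
#    3.2 define the hierarchy levels.
time_dim = ET.SubElement(schema, "Dimension", name="Time", type="TimeDimension")
hierarchy = ET.SubElement(time_dim, "Hierarchy", hasAll="true", primaryKey="time_id")
ET.SubElement(hierarchy, "Table", name="time_dim")
ET.SubElement(hierarchy, "Level", name="Year", column="year", type="Numeric",
              uniqueMembers="true", levelType="TimeYears")
ET.SubElement(hierarchy, "Level", name="Quarter", column="quarter",
              uniqueMembers="false", levelType="TimeQuarters")

# 2. Create a cube and 2.1 choose its fact table.
cube = ET.SubElement(schema, "Cube", name="Sales")
ET.SubElement(cube, "Table", name="sales_fact")

# 4. Associate the shared dimension with the cube through the
#    fact table's foreign key.
ET.SubElement(cube, "DimensionUsage", name="Time", source="Time",
              foreignKey="time_id")

# 2.2 Add a measure (declared after the dimensions inside the cube).
ET.SubElement(cube, "Measure", name="Amount", column="amount",
              aggregator="sum", formatString="#,###")

xml_text = ET.tostring(schema, encoding="unicode")
```

The numbered comments follow the task list rather than the order of the statements, because shared dimensions are declared at schema level before the cubes that use them.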

1. Create a schema

● File → New → Schema

2. Create cubes


2.1. Choose a fact table

[Screenshot callouts: DB schema; table name in the schema]

2.2. Add measures

3. Create dimensions (I)

● Dimensions can be added to:– A cube: "private dimensions" known only to the

cube that contains them– A schema: "shared dimensions" that can be

associated to multiple cubes

3. Create dimensions (II)

Fact table foreign key

● Date/time-related dimensions have the TimeDimension type

3. Create dimensions (III)

Ordinary dimensions have the StandardDimension type


3.1. Add/edit hierarchies

● A new hierarchy is created for each dimension
● New hierarchies can be added to dimensions
● Each hierarchy must have a table node and one or more levels

3.1. Dimension table

● Same settings as for fact tables

3.2. Add hierarchy levels

4. Associate shared dimensions

● Shared dimensions can be associated with a cube by adding a "Dimension usage"

Shared dim.

Testing and deployment

● Once schemas have been defined they may be
– Tested using the MDX query tool included in PSW
– Published to the Pentaho Server

MDX query tool (I)

● File → New → MDX Query
● If a schema editor is open, the MDX query tool attempts to connect to the underlying DB and load the schema definition

MDX query tool (II)

● A query can be entered in the upper pane

● The result is shown in the lower pane

Publishing the cube (I)

● File → Publish...

Server URL

Password specified in publisher_config.xml

User with privileges for publishing

Publishing the cube (II)

● If the connection succeeds a dialog appears
– Choose the location in the server's solution repository where the schema will be saved
– Specify the data source to use on the server side to execute the SQL queries (corresponding to the MDX ones)

JPivot


● Once a cube has been published it can be used to build analysis applications

● Pentaho provides the JPivot front-end in the Pentaho User Console

Analysis View

Create a new analysis view

Schema to use

Cube to use, as defined in the schema

New analysis view

JPivot toolbar

Drilling

● Drilling allows the user to navigate from one level of aggregation to another

Drilling flavors

● There are four ways to drill, each producing a different result

● The drill mode is selected in the toolbar
– Drill member (applies to dimensions)
– Drill position (applies to dimensions)
– Drill replace (applies to dimensions)
– Drill through (applies to measures)

Drill member & Drill position

● Drill member: the drilling on one instance of a member is also applied to all other instances of this member

● Drill position: the drilling is applied only to the selected member instance, not to the other instances of that member

Drill replace

● The drilled member is replaced with the drill result
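The difference between the three dimension drill modes is easier to see on a concrete row axis. The sketch below is a simplified model and not JPivot code: the axis, the `CHILDREN` hierarchy, and the three functions are hypothetical, and JPivot's exact behaviour (especially for drill replace) may differ in detail.

```python
# Hypothetical hierarchy: the year 2016 has two quarters as children.
CHILDREN = {"2016": ["2016-Q1", "2016-Q2"]}


def drill_position(axis, index):
    """Expand only the member instance at the given row position."""
    *prefix, member = axis[index]
    expanded = [(*prefix, child) for child in CHILDREN.get(member, [])]
    return axis[: index + 1] + expanded + axis[index + 1:]


def drill_member(axis, member):
    """Expand every instance of the member, wherever it appears on the axis."""
    result = []
    for row in axis:
        result.append(row)
        if row[-1] == member:
            result.extend((*row[:-1], child) for child in CHILDREN.get(member, []))
    return result


def drill_replace(axis, member):
    """Replace the drilled member with its children instead of adding them."""
    result = []
    for row in axis:
        if row[-1] == member and member in CHILDREN:
            result.extend((*row[:-1], child) for child in CHILDREN[member])
        else:
            result.append(row)
    return result


# Two dimensions on the rows axis: Region, then Time.
axis = [("North", "2016"), ("South", "2016")]

by_position = drill_position(axis, 0)   # quarters appear under North only
by_member = drill_member(axis, "2016")  # quarters appear under North and South
replaced = drill_replace(axis, "2016")  # the 2016 rows themselves disappear
```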

Drill through

● It applies to measures
● It retrieves the detail rows behind the rolled-up measure aggregate value and shows them in a separate table
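In other words, drill through goes from an aggregated cell back to the fact rows that produced it. A minimal sketch of that idea, with hypothetical fact rows and helper names:

```python
# Hypothetical fact rows: (order_id, year, region, amount)
FACTS = [
    (1, 2016, "North", 100),
    (2, 2016, "North", 50),
    (3, 2016, "South", 70),
    (4, 2017, "North", 120),
]


def cell_value(year, region):
    """The rolled-up measure shown in the pivot table cell."""
    return sum(amount for _, y, r, amount in FACTS if (y, r) == (year, region))


def drill_through(year, region):
    """The detail rows that were aggregated into that cell."""
    return [row for row in FACTS if (row[1], row[2]) == (year, region)]


total = cell_value(2016, "North")
details = drill_through(2016, "North")
```

The amounts of the detail rows always add up to the value of the cell they were drilled from.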

The OLAP Navigator (I)

● It is a GUI that lets you control the mapping between the cube and the pivot table
– Which dimension is mapped to which axis
– How multiple dimensions on one axis are ordered
– What slice of the cube is used in the analysis

The OLAP Navigator (II)

● The navigator has three sections
– A Columns section
– A Rows section
– A Filter section

Controlling placement of dimensions on axes

● By clicking the small square in front of a dimension you can move it from Rows to Columns, or from Columns to Rows

Slicing with the OLAP Navigator (I)

● A slicer corresponds to the MDX WHERE clause
– Used to show only a subset (slice) of the data

● By clicking the funnel icon you move a dimension into the Filter section
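To make the link between the Filter section and the WHERE clause concrete, the sketch below composes a two-axis MDX query as a string. The helper `build_mdx`, the cube name, and the member names are hypothetical. JPivot generates its own MDX, which can be inspected in the MDX query pane.

```python
def build_mdx(cube, columns, rows, slicer=None):
    """Compose a two-axis MDX SELECT; `slicer` lists the members for WHERE."""
    query = (
        f"SELECT {{{', '.join(columns)}}} ON COLUMNS, "
        f"{{{', '.join(rows)}}} ON ROWS "
        f"FROM [{cube}]"
    )
    if slicer:
        # Dimensions moved to the Filter section end up here, not on an axis.
        query += f" WHERE ({', '.join(slicer)})"
    return query


unsliced = build_mdx(
    "Sales",
    columns=["[Measures].[Amount]"],
    rows=["[Time].[2016]", "[Time].[2017]"],
)
sliced = build_mdx(
    "Sales",
    columns=["[Measures].[Amount]"],
    rows=["[Time].[2016]", "[Time].[2017]"],
    slicer=["[Region].[North]"],
)
```

The sliced query shows the same rows and columns as the unsliced one, but every cell is restricted to the member named in the WHERE clause.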

Slicing with the OLAP Navigator (II)

Specifying member sets

● It is also possible to specify particular members on the column and row axes

MDX query pane

● You can also view the MDX query that represents the current state of the analysis view
– Useful to learn MDX syntax

Export

● Print to PDF

● Export in MS Excel format

Charts

● JPivot can display data in a chart
● The chart can be configured

Alternative to JPivot

● Pentaho has a modular structure
– It may be extended with new plugins

● SAIKU
– Provides a plugin for Pentaho offering lightweight OLAP features
– It also provides a RESTful server that can connect with any OLAP system
– http://analytical-labs.com

Saiku

● It can run OLAP analyses on any cube already defined

● Based on the definition of what we want to see in the analysis
– By specifying which dimensions/measures we want on columns, rows, and filters
● Drag 'n' drop UI

Defining the analysis (I)

● Once a cube has been selected the available dimensions (with hierarchies) and measures are listed

Defining the analysis (II)

● Then, we can drag 'n' drop dimensions and measures as we want into columns, rows, and filters
– The only restriction is that measures cannot be placed on both columns and rows

● After each change the query is updated and executed automatically

Defining the analysis (III)

Filtering

● Filters may be applied to visible (columns and rows) and invisible (filter) dimensions

Ordering

● Each dimension and/or measure can be used to order data
– But not all possible combinations are allowed
– We can't order both by a measure on columns and a dimension on rows (or vice versa)

Popup menus

● Some options for fast filtering and adding/removing dimension levels are available by clicking on the column and row headers

Charts

● Data can also be displayed in a chart

Statistics

● Saiku can also show some statistics about column values

Other commands

● Other available commands include:
– Show MDX query
– Drill through on cell
– Export Drill-Through on cell to CSV
– Export XLS
– Export CSV

Saiku remarks

● Saiku is still in development
– Some features of JPivot are missing
– Some features have bugs or malfunctions
● Charts
● Drill through
