Pentaho Business Analytics & Data Integration

Amjad Akkawi, amjad.akkawi@zaponet.com

Post on 25-Feb-2016

About Us – Zaponet Data Science Solutions

Zaponet is a service integrator and development shop providing solutions and professional services for building state-of-the-art data products that leverage big-data and data-science technologies.

Zaponet architects, designs and builds big-data solutions: data warehouses, user-profile systems, recommendation engines, complex event processing and more.

Some of our technology partners are: Pentaho, Cloudera, Infobright, Vertica, Kognitio, GigaSpaces.

• More details: www.zaponet.com

• Future meetup: Pentaho Weka for data science

About Me – Amjad Akkawi

Zaponet CTO

Experience in Pentaho

Agenda

• Pentaho in business analytics & data integration

• Pentaho BI Demo

• Pentaho PDI Demo

About Pentaho

• Recognized leader in business analytics & data integration
• Subscription-based business model
• Achieved critical mass:
  – Over 1,200 commercial customers
  – Over 10,000 production deployments
  – Over 185 countries
• Stewardship of the most important open source analytics projects

Industry recognition. Over 160 partners globally.

Why Customers Love Pentaho

Themes: Innovation & Scalability; Superior Customer Service; Total Value; Speed of Deployment

• 8 weeks time to market
• 2 weeks time to market
• €350K+ cost saving
• 75% lower acquisition costs
• Music files from 20,000 sources
• Operational reports at all 1000 retail stores
• Less than 1 month ROI
• Analyzing buying patterns of 5 million members
• Analytics on 500,000 patient records
• Fully rolled out in budget in 4 months
• Marketing dashboard in less than 1 day

Customer quotes:

• …“better functionality and more support”
• …“top-notch professional support”
• “Pentaho support is as good as its software”
• …“a great partner through every phase of our project”
• …“ROI was almost immediate”.

Pentaho in the Big Data Fabric

Big Data Mgmt:
• Hadoop: Java MapReduce, Pig, Pentaho MapReduce
• NoSQL Databases
• Analytic Databases

Data Integration:
• Job Orchestration
• Workflow
• Scheduling
• High Performance
• Visual IDE

Big Analytics:
• Pentaho Business Analytics
• 3rd Party Tools: R, 3rd Party BI Tools, Applications

High Level Feature/Functions

• Data Mining (Advanced Power Users & Viewers): Advanced Predictive Analysis
• Dashboards (Information Consumers): Self-service Interactive KPI & Metrics and Visualization
• Analysis (Knowledge Workers / Business Users): Self-service Interactive and Ad Hoc Analysis
• Reporting (Business Users): Ad hoc and Operational Reports
• Data (Power Users, Developers & DBAs): High Performance Data Integration, BIG DATA, Cleansing and Presentation

Components are independent

Dashboards

Dashboards & Interactive Dashboards

Dashboards – Geo Location-Based

Reports – Interactive, Static, Distributed

Reports – Reporting Pack & House Styles

Enhanced In-Memory Analytics

• Enhanced in-memory caching for speed-of-thought visualization & analysis
  – More re-usability of in-memory data
  – Fewer trips to the database/disk
• Builds on existing unique extreme-scale in-memory analytics
  – Support for external data grids: Infinispan / JBoss Enterprise Data Grid and Memcached
  – Scale to caching hundreds of GBs (potentially TBs) of data in-memory
• Competition
  – Java heap or C++ memory space (a few GB at most; most BI products), or
  – Proprietary (hard to manage) in-memory technology (e.g. QlikView, MicroStrategy)
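
The caching idea on this slide (reuse results held in memory, make fewer trips to the database) can be illustrated with a minimal read-through cache. This is only a sketch and not Pentaho's implementation: `QueryCache` and `FakeDatabase` are hypothetical names, the query text and rows are made up, and a plain Python dict stands in for an external data grid such as Infinispan or Memcached.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]


class FakeDatabase:
    """Hypothetical stand-in for a database; counts round trips."""

    def __init__(self) -> None:
        self.trips = 0

    def run(self, query: str) -> List[Row]:
        self.trips += 1
        # Made-up aggregate result, the same for every query.
        return [("north", 10), ("south", 7)]


class QueryCache:
    """Hypothetical read-through cache keyed by query text."""

    def __init__(self, database: FakeDatabase) -> None:
        self._database = database
        self._store: Dict[str, List[Row]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, query: str) -> List[Row]:
        # Serve from memory when the same query was seen before.
        if query in self._store:
            self.hits += 1
            return self._store[query]
        # Otherwise go to the database once and remember the result.
        self.misses += 1
        result = self._database.run(query)
        self._store[query] = result
        return result


database = FakeDatabase()
cache = QueryCache(database)
query = "SELECT region, SUM(sales) FROM facts GROUP BY region"
first = cache.get(query)
second = cache.get(query)
```

The second call is answered from memory, so the database is queried only once. A production cache would also need eviction and invalidation when the underlying data changes, which this sketch leaves out.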

Analyzer – Table format

Analyzer – Chart format

Analyzer: Geo Location-Based Analysis

Scenario 1

Operational Database

Dashboard

Report

Scenario 2

Data Mart(s) / Warehouse

Metadata

Dashboard

Report

Analyzer

Metadata – Schema Workbench

Complex calculations and multi-cube requirements may need more modeling

Scenario 3

Unstructured Data

Data Mart(s) / Warehouse

Structured Data

BIG DATA Technology and/or Staging Area & Data Vault

Pentaho Data Integration

Source data acquisition

Initial consolidation as required

Pentaho Data Integration

Cleansing

Transformation

Change Data Capture

Data Warehouse Management

PDI PDI Metadata

Dashboard

Report

Analyzer

Variations on a Theme

Unstructured Data

Ad-hoc Data

Data Mart(s) / Warehouse

Structured Data

Alerting: SMS, eMail & attachments

Pentaho Data Integration

Source data acquisition

Initial consolidation as required

Pentaho Data Integration

Cleansing

Transformation

Change Data Capture

Data Warehouse Management

PDI PDI Metadata

Dashboard

Report

Analyzer

BIG DATA Technology and/or Staging Area & Data Vault

PDI Components

• Enterprise Edition Data Integration Server
  – Execution and remote monitoring
  – Integrated scheduling
  – Enterprise Security options
  – Enhanced content management including revision history and locking
  – Remote distributed cluster based processing

Kettle Conceptual Model

Pentaho Data Integration

Step based processing engine with instant visualization of results
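
Step-based processing can be pictured as rows streaming through a chain of small, single-purpose steps. The sketch below uses Python generators purely as an illustration; it does not use the Kettle/PDI API, the step names (`read_rows`, `clean_name`, `filter_active`) and sample rows are made up, and it runs in a single thread, whereas a real engine may run steps concurrently.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def read_rows(source: List[Row]) -> Iterator[Row]:
    """Input step: emit a copy of each source row, one at a time."""
    for row in source:
        yield dict(row)


def clean_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Cleansing step: trim whitespace and normalise the case of 'name'."""
    for row in rows:
        row["name"] = str(row["name"]).strip().title()
        yield row


def filter_active(rows: Iterable[Row]) -> Iterator[Row]:
    """Filter step: pass on only rows flagged as active."""
    for row in rows:
        if row["active"]:
            yield row


def run_pipeline(source: List[Row]) -> List[Row]:
    """Chain the steps; each row flows through all of them in order."""
    return list(filter_active(clean_name(read_rows(source))))


sample = [
    {"name": "  alice ", "active": True},
    {"name": "BOB", "active": False},
    {"name": "carol", "active": True},
]
result = run_pipeline(sample)
```

Because each step only sees the rows handed to it by the previous one, steps can be added, removed or reordered without touching the others, which is the property a visual step editor relies on.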

Pentaho Data Integration

Step based performance

Pentaho Data Integration

Integrated Metadata Creation

Pentaho and Big Data

Forrester Wave, Enterprise Hadoop Solutions, Q1 2012

• Only vendor in strong performer category: “an impressive Hadoop integration tool”
• Only business analytics vendor
• Richest functionality
• Most extensive integration with open source Apache Hadoop and major Hadoop distributions

Expanded Insight into Big and Diverse Data

• Improved support for Hadoop
  – Simpler deployment across Hadoop clusters
    • Support for the Hadoop cache
    • Debian RPM installer
  – Performance and ease of use enhancements for Pentaho MapReduce visual development
  – Support for Hadoop Security data access
• New NoSQL database support
  – Cassandra
  – MongoDB
• Growing the Pentaho big data community
  – Open sourced all big data components (Hadoop & NoSQL)
    • Apache License – same as used by leading Hadoop and NoSQL distros
  – New big data developer resources: How to documents, videos, walk-throughs

Hadoop Data Management & Integration

Accessible by any ETL developer or data scientist

Pentaho MapReduce

NoSQL Data Management & Integration

Accessible by any ETL developer or data scientist

Visual Job Orchestration

Any Data Source

Scheduling

Accessible to any ETL developer or data scientist

Pentaho Integration Options

Diagram labels: Pentaho BI Server; Other Application; Pentaho; Custom Stuff; My Application; Pentaho Components

Integration: Bundled, Mashup, Extended, Embedded

Value

• Bundled: Fastest Way to Get Analytics that Have Your Look & Feel
• Mashup: An Integrated Experience for Your End User
• Extended: Customizing Pentaho for Your Experience
• Embedded: Ultimate Integration and Customization

What it Takes?

• Pentaho is a separate app, branded with Partner’s logo, look & feel
• Optional: Partner app may include links to Pentaho reports, analysis and dashboards (popping new window)
• Optional: Single sign-on creates a seamless experience
• Pentaho & Partner app have the same UI
• Pentaho User Console, or individual reports, analysis or dashboards are included in partner app
• Single sign-on creates a seamless experience
• Pentaho’s core functionality is extended through plug-ins. Examples:
  – Connecting to custom data sources
  – Adding new visualizations
  – Customizing security
  – Replacing Pentaho rules engine
• Integrate with Partner’s App Server
• Directly embedding Pentaho into your app
• Calling Pentaho Java APIs from your App

Skill Level

• Bundled: Limited HTML skills
• Mashup: HTML skills
• Extended: HTML skills, Java skills
• Embedded: HTML skills, Java skills, knowledge of Pentaho architecture

Q & A

NEXT … Pentaho PDI Demo, Pentaho BI Demo

“Traditional” Database Support

DATA INTEGRATION

DATA ANALYSIS

Author

Broadest Support for Big Data Platforms

Hadoop NoSQL Analytic Databases
