Open Source Solutions: Managing, Analyzing and Delivering Business Information Mark R. Madsen – November 2009 www.ThirdNature.net

Upload: mark-madsen

Post on 27-Jan-2015


DESCRIPTION

These slides on the usage of open source solutions within the business intelligence and data warehousing market go with a webcast and research report. The webcast is archived at http://ow.ly/KLz0 along with a PDF of the report. This presentation describes what open source software is being deployed and presents the benefits, challenges and practices for organizations adopting open source technologies.

TRANSCRIPT

Page 1: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Leveraging Open Source Business Intelligence Across Your Organization
Mark R. Madsen – February 2009
www.ThirdNature.net

Open Source Solutions: Managing, Analyzing and Delivering Business Information
Mark R. Madsen – November 2009
www.ThirdNature.net

Page 2: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Slide 2, February 2009, Mark Madsen

The First Recorded Patent

Page 3: Open Source Solutions: Managing, Analyzing and Delivering Business Information

The First Monopoly

Page 4: Open Source Solutions: Managing, Analyzing and Delivering Business Information

The Origin of Copyright

•1556: The Worshipful Company of Stationers and Newspaper Makers is granted a Royal Charter, giving it a monopoly over the publishing industry until …

•1710: “An Act for the Encouragement of Learning, by vesting the Copies of Printed Books in the Authors or purchasers of such Copies, during the Times therein mentioned”, otherwise known as the Statute of Anne, put the rights into the hands of authors

Page 5: Open Source Solutions: Managing, Analyzing and Delivering Business Information

After Each Revolution, the Old Pirates Become the New Establishment

Pirate

Establishment

Page 6: Open Source Solutions: Managing, Analyzing and Delivering Business Information

What is Commercial Software, Really?

Page 7: Open Source Solutions: Managing, Analyzing and Delivering Business Information

What Makes Software Open Source?

More freedom

Academic Licenses

Reciprocal Licenses

“Freeware” Licenses

Commercial Licenses

Less freedom

The fuzzy dividing line between open and closed source

Page 8: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Some Quick Definitions

Proprietary Software: Software under a license that provides limited usage rights only, provided in binary format.

Open Source Software (OSS): Software under a license that allows acquisition, modification and redistribution.

Freeware: Software that does not have licensing limitations, generally distributed in binary format. Not the same as open source.

Page 9: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Fauxpen Source: Something appearing with greater frequency as open source becomes more popular and lower-tier proprietary vendors seek a differentiator.

Page 10: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Evolution of the Software Market 1987

Source: John Prendergast (data: Bloomberg, Factset)

Page 11: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Evolution of the Software Market 1997

Source: John Prendergast (data: Bloomberg, Factset)

Page 12: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Evolution of the Software Market 2007

Source: John Prendergast (data: Bloomberg, Factset)

Page 13: Open Source Solutions: Managing, Analyzing and Delivering Business Information

The DW & BI Software Market Today

According to IDC, the analytics and data warehouse software market is growing at 10.3% CAGR.

2005: 17,386
2006: 19,342
2007: 21,408
2008: 23,601
2009: 26,001
2010: 28,682
2011: 31,595
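As a rough check on the growth figure, the compound annual growth rate can be recomputed from the chart values. This is a minimal sketch, not IDC's method: it assumes the seven figures on the slide pair with the years 2005 to 2011 in order, and the function name `cagr` and the dictionary `market` are illustrative names that do not appear on the slide. Under that assumption, the 2006 to 2011 span comes out at about 10.3%, while the full 2005 to 2011 span comes out slightly higher.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate over `periods` yearly steps."""
    return (end_value / start_value) ** (1 / periods) - 1


# Assumption: slide values paired with years in the order they appear.
market = {
    2005: 17386,
    2006: 19342,
    2007: 21408,
    2008: 23601,
    2009: 26001,
    2010: 28682,
    2011: 31595,
}

# Growth rate over the five yearly steps from 2006 to 2011.
rate_2006_2011 = cagr(market[2006], market[2011], 2011 - 2006)

# Growth rate over the six yearly steps from 2005 to 2011.
rate_2005_2011 = cagr(market[2005], market[2011], 2011 - 2005)

print(f"2006-2011 CAGR: {rate_2006_2011:.1%}")
print(f"2005-2011 CAGR: {rate_2005_2011:.1%}")
```

The choice of base year changes the result noticeably, so a quoted CAGR is only meaningful together with the period it covers.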

Page 14: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Any Industry This Big is Maturing

[Chart: Annual US software sales, 1970 to 2000, on a vertical axis running from -10 to 150. Source: US Dept. of Commerce]

Page 15: Open Source Solutions: Managing, Analyzing and Delivering Business Information

“If the automobile had followed the same development as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year killing everyone inside.”

Robert Cringely

Time

Anything

Reality

Page 16: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Software Revenue = Corporate IT Cost

[Chart: IT costs as a percent of equipment investment, 1968 to 2004, on a vertical axis running from 0 to 50. Source: US Dept. of Commerce]

Page 17: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Open Source is an Inevitable Consequence

If the means of production is widely distributed at commodity cost
And the internet connects all those means of production
And the supply of any software program is infinite
Then we need to rethink some things.

“The era of high capital industrial production is giving way to a different model.” – Peter Drucker

Page 18: Open Source Solutions: Managing, Analyzing and Delivering Business Information

A Perfect Commodity Changes Things

Open source is a means of production and distribution of software, and is driving change in the market.

But the fact that the internet is a massive copying machine for the perfect commodity is the real change in conditions.

The basis of open source is economics, not ideology.

Page 19: Open Source Solutions: Managing, Analyzing and Delivering Business Information

The Real State of Enterprise Software?

Page 20: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Enterprise Software Economics

The enterprise software model is breaking down. Some facts:

• 70% - 80% of sales & marketing is for new sales

• 76% of new license revenue goes to sales & marketing

• Maintenance makes up 45% of revenues and this number is increasing

• 75% of R&D for mature products is for updates, bug fixing, and non-revenue enhancements

• Maintenance and support is becoming the biggest factor in software company profitability.

Sources: Goldman Sachs, Tech Strategy Partners, Forrester

Page 21: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Open Source Disruption

“Which sector of the industry is most vulnerable to disruption by open source in the next five years?”

1. Web publishing and content management
2. Social software
3. Business Intelligence

Source: North Bridge Venture Partners

Page 22: Open Source Solutions: Managing, Analyzing and Delivering Business Information

BI is Entering Mainstream Adoption

The BI market has lots of segments, most new, some mature, some being rejuvenated.

Platforms

Databases

Reporting & Analysis

Data Integration

Predictive analytics

Page 23: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Maturity for OSS Components of the Stack

Information delivery

Dashboards & Scorecards

Analytics / OLAP clients

Interactive Reporting

Standard Reporting

Visualization

GIS & location

Predictive Analytics

Search/Discovery

Modeling

Portal Workflow

Infrastructure

Operating Systems

Servers

Integration Management

ETL EII EAI EDR

Information Management

DW/Mart/ODS OLAP servers MDM* Data Quality

Databases

Metadata

Page 24: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Interest in and Use of Open Source

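Advanced analytics: 5% in production, 8% prototype or pilot, 18% evaluating, 43% considering, 26% no plans

Business intelligence: 14% in production, 8% prototype or pilot, 22% evaluating, 37% considering, 19% no plans

Data integration and ETL: 18% in production, 12% prototype or pilot, 17% evaluating, 31% considering, 22% no plans

Database: 18% in production, 13% prototype or pilot, 18% evaluating, 29% considering, 22% no plans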

Source: Third Nature Open Source BI/DW adoption survey

Page 25: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Database Use

Bizgres: 2%
Kickfire: 2%
LucidDB: 2%
MonetDB: 3%
SQLite: 3%
CouchDB: 3%
Palo: 3%
Firebird: 7%
Ingres: 7%
BerkeleyDB: 8%
EnterpriseDB: 10%
Infobright: 11%
Postgres: 44%
MySQL: 75%

Source: Third Nature Open Source BI/DW adoption survey

Page 26: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Data Integration Tool Use

What’s popular

Clover: 2%
Open Data Quality: 2%
OSDQ: 2%
Apatar: 5%
Red Hat Teiid: 5%
DataCleaner: 8%
Jitterbit: 13%
Talend: 33%
Pentaho DI / Kettle: 42%

What it’s being used for

Low-latency ETL for a data warehouse or mart: 8%
Master data management efforts: 10%
Data quality efforts: 15%
Data migration efforts: 15%
Operational integration: 21%
Batch ETL for a data warehouse or mart: 30%

Source: Third Nature Open Source BI/DW adoption survey

Page 27: Open Source Solutions: Managing, Analyzing and Delivering Business Information

BI Tool Use

What’s popular

OpenReports: 2%
Palo: 2%
MarvelIT: 5%
OpenI: 5%
SpagoBI: 9%
Jfree: 14%
BIRT: 19%
Mondrian: 26%
Jaspersoft: 28%
Pentaho: 47%

What it’s being used for

OLAP: 14.6%
Reports embedded in an application or website: 15.2%
Reporting against an application database: 15.9%
End user or interactive reporting: 16.5%
Dashboards or scorecards: 17.1%
Static reports: 20.7%

Source: Third Nature Open Source BI/DW adoption survey

Page 28: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Advanced Analytics Use

Cytoscape: 2%
Taverna: 3%
Axiis: 4%
Processing: 4%
Orange: 7%
Graphviz: 8%
Knime: 8%
RapidMiner: 23%
Weka: 42%
R: 46%

Source: Third Nature Open Source BI/DW adoption survey

Page 29: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Usage of the tools

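Database: 18% replacing proprietary software, 10% replacing internally developed software, 10% supplementing a system with similar features, 13% adding new functionality to an existing system, 50% using as part of a new system or project

Data Integration: 14% replacing proprietary software, 15% replacing internally developed software, 10% supplementing a system with similar features, 8% adding new functionality to an existing system, 53% using as part of a new system or project

BI: 16% replacing proprietary software, 11% replacing internally developed software, 14% supplementing a system with similar features, 18% adding new functionality to an existing system, 41% using as part of a new system or project

Adv. Analytics: 25% replacing proprietary software, 18% replacing internally developed software, 14% supplementing a system with similar features, 7% adding new functionality to an existing system, 36% using as part of a new system or project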

Source: Third Nature Open Source BI/DW adoption survey

Page 30: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Who’s Adopting Open Source for BI/DW?

1. The under-budgeted
2. ISVs
3. The under-served
4. The over-served
5. Developers who never had it before

More co-existence and use in edge cases than straight replacements, and often competing with lack of use.

Page 31: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Adoption by Organization Size

Page 32: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Adoption by Size of Organization

Medium and large are the two biggest evaluators, with small using the most in production.

Source: Third Nature Open Source BI/DW adoption survey

[Chart: share of organizations evaluating versus using open source, by organization size (Small, Medium, Large). Data labels in slide order: 38%, 23%, 41%, 23%, 37%, 32%]

Page 33: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Scope of System Deployment

[Chart: Department or Division versus Corporate-wide deployment, by organization size (Small, Medium, Large). Data labels in slide order: 27%, 38%, 32%, 35%, 40%, 27%]

Source: Third Nature Open Source BI/DW adoption survey

Page 34: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Open Source Purchasing

[Chart by organization size (Large, Medium, Small). Categories on the slide:

No purchase
Maintenance or support contract
Training
Consulting or installation services
Phone, email or on-site support from the vendor
Commercial license
Phone, email or on-site support from a third party
Subscription to value-added, enterprise features

Data labels in slide order: 21%, 31%, 13%, 6%, 9%, 14%, 33%, 31%, 23%, 24%, 22%, 29%, 33%, 28%, 30%, 52%, 28%, 36%, 45%, 38%, 38%, 58%, 53%, 54%]

Source: Third Nature Open Source BI/DW adoption survey

Page 35: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Where Are People Getting Information?

Internet relay chat (IRC): 7%
Support from a third party: 14%
Classroom training: 14%
Pre-bundled software (e.g. a database packaged with a BI tool): 16%
Software features in a paid "professional" version of the software: 17%
Outside consultant or systems integrator: 19%
Vendor support, paid or as part of a subscription: 20%
Third party books or documentation: 27%
Web-based training: 28%
Print articles: 29%
Vendor evaluation / trial support (free): 32%
Blogs: 37%
Web seminars or screencasts: 37%
Community forums: 47%
Online demos: 47%
White papers: 48%
Online documentation / wikis: 53%
Online articles: 53%

Source: Third Nature Open Source BI/DW adoption survey

Page 36: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Why Consider Open Source?

IT is after one of three things:

Page 37: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Rationale When Evaluating OSS

Lower cost and reducing vendor risk are the two big reasons.

Access to the source code: 28%
Extensibility, customizability of software: 28%
Open development process and road …: 32%
Easier to evaluate or procure: 32%
Speed of innovation of the software: 32%
Flexibility in deployment: 33%
Lower maintenance costs: 43%
Reduced dependence on a vendor: 44%
Open standards: 48%
Lower acquisition costs: 66%

Source: Third Nature Open Source BI/DW adoption survey

Page 38: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Good News: It Works

Better performance: 12%
Quicker turnaround on bug fixes: 22%
Speed of innovation of the software: 30%
Extensibility / customizability of software: 32%
Access to the source code: 33%
Freedom from vendor lock-in: 34%
Flexibility in deployment: 36%
Reduced dependence on vendor: 40%
Ease of integration / open standards: 43%
Lower costs: 69%

Source: Third Nature Open Source BI/DW adoption survey

The benefits are largely being realized.

Page 39: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Reduced Vendor Dependence

Avoid vendor imposed upgrade cycles

Page 40: Open Source Solutions: Managing, Analyzing and Delivering Business Information
Page 41: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Why did the software evaluations fail?

Lack of vendor service or support: 16%
Higher costs than anticipated: 18%
Interoperability problems: 19%
Lack of available consulting: 21%
Reliability problems: 25%
Difficulty finding available solutions: 28%
Difficulty integrating into current environment: 29%
Required more internal expertise than expected: 32%
Scalability problems: 34%
Missing or incomplete features: 72%

Source: Third Nature Open Source BI/DW adoption survey

The biggest reason is maturity of the software.

Page 42: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Data Size, All Database Types

Less than 50GB: 24%
50 to <100GB: 14%
100 to <500GB: 14%
500GB to <1TB: 15%
1 to <5TB: 13%
5 to <20TB: 4%
20TB to 50TB: 1%
More than 50TB: 3%

67% of the sample < 1TB

Source: Third Nature Open Source BI/DW adoption survey

Page 43: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Performance problems

Poor batch reporting performance: 33%
Poor ETL or data integration performance: 33%
Poor performance loading data: 37%
Poor interactive BI or analytics performance: 69%

Source: Third Nature Open Source BI/DW adoption survey

Page 44: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Solving Performance Problems

Replace every single thing before the database?

Migrating to an analytic database is twice as likely as to another row-store database.

Migrate to a different traditional database: 4%
Buy a specialized accelerator: 8%
Migrate to an analytic database: 10%
Limit the number of users accessing the system: 18%
Change ETL or data integration tools: 18%
Rewrite the BI application or reports: 26%
Limit the amount of data stored in the system: 30%
Redesign the ETL or data integration: 32%
Change BI or analytics tools: 32%
Buy more powerful hardware: 34%
Database or application tuning: 38%

Source: Third Nature Open Source BI/DW adoption survey

Page 45: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Discontinuity Drives Open Source BI Use

The situations most appropriate to open source BI tools often involve discontinuous change.

• New interface requirements
• New integration requirements
• Platform change
• Schema change
• Data latency / real-time requirements
• Segmenting the user population

The data warehouse is becoming much more diverse – one BI vendor can no longer be expected to provide tools for all needs.

Page 46: Open Source Solutions: Managing, Analyzing and Delivering Business Information

First Thought is Often “Replace”

Page 47: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Coexist is More Likely Than Replace

Page 48: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Augment is Also More Likely

Page 49: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Recommendations

1. Don't focus solely on cost savings. Many of the benefits people discovered later were not among the reasons they mentioned up front.

2. Plan to augment, not replace, existing software with open source. Rather than trying to save money by replacing software, look at gaps in the BI portfolio or data warehouse stack and use open source to supplement your systems.

Page 50: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Recommendations

3. Consider developing open source policies. Most organizations are adopting open source in an ad-hoc fashion, project by project.

4. Evaluate open source like any other software. It doesn't matter if the software is free if it takes longer to build, manage and deploy solutions to end users, if it is unstable, or if it is missing a key feature.

5. Make open source the default option. When there are no internal tools, open source should be the first alternative.

Page 51: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Questions?

“When a new technology rolls over you, you're either part of the steamroller or part of the road.” – Stewart Brand

Page 52: Open Source Solutions: Managing, Analyzing and Delivering Business Information

Creative Commons

Thanks to the people who made their images available via Creative Commons:

glassblower - http://flickr.com/photos/cazasco/261229878/
canal - http://flickr.com/photos/mcsixth/150749007/
rc toy truck.jpg - http://flickr.com/photos/texas_hillsurfer/2683650363/
asymmetry_building_tokyo.jpg - http://flickr.com/photos/fukagawa/2004102417/
beer_free_beer2.jpg - http://flickr.com/photos/fzero/173386050
beer_free_beer3.jpg - http://flickr.com/photos/henrikmoltke/142750871/
condiments_salsa.jpg - http://flickr.com/photos/uberculture/2462506722/
london modern and ancient together.jpg - http://www.flickr.com/photos/cc_chapman/299509390/
firemen not noticing fire.jpg - http://flickr.com/photos/oldonliner/1485881035/
acapluco_cliff_divers_cc.jpg - http://flickr.com/photos/raveller/
highway storm.jpg - http://flickr.com/photos/areyoumyrik/235230688
Tenessee chicken - http://www.flickr.com/photos/mayhem/2495739721/

Page 53: Open Source Solutions: Managing, Analyzing and Delivering Business Information

About the Presenter

Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.