tdwi bobi 2014 web updated

RESEARCH EXCERPTS THE STATE OF BIG DATA MANAGEMENT IMPLEMENTATION PRACTICES FOR BETTER DECISIONS INSIGHTFUL ARTICLES INSIDE FACEBOOK'S RELATIONAL PLATFORM WHAT DATA WAREHOUSES CAN LEARN FROM BIG DATA THE FUTURE OF CUSTOMER-CENTRIC RETAIL THE VERY BEST OF TDWI’S BI ARTICLES, RESEARCH, AND NEWSLETTERS VOLUME 11 PLUS 2014 FORECAST: BI, ANALYTICS, AND BIG DATA TRENDS AND RECOMMENDATIONS FOR THE NEW YEAR 2013 IN REVIEW: CLOUD BI TAKES OFF

Upload: prince2venkat

Post on 23-Jan-2016


  • RESEARCH EXCERPTS
    The State of Big Data Management
    Implementation Practices for Better Decisions

    INSIGHTFUL ARTICLES
    Inside Facebook's Relational Platform
    What Data Warehouses Can Learn from Big Data
    The Future of Customer-Centric Retail

    The Very Best of TDWI's BI Articles, Research, and Newsletters

    Volume 11

    PLUS
    2014 Forecast: BI, Analytics, and Big Data Trends and Recommendations for the New Year

    2013 in Review: Cloud BI Takes Off

  • See and understand your data in seconds with Tableau.

    The data you capture has massive potential to give you a competitive advantage. Turn that potential into reality with fast visual analysis and easy sharing of reports and dashboards from Tableau. By revealing patterns, outliers, and insights, Tableau helps you find more insights in your data and get more value from it.

    Tableau provides:
    Live, optimized connection to a variety of data sources
    The ability to analyze truly big data at interactive speeds
    An in-memory data engine that overcomes slow databases
    Drag & drop data visualization, no coding required

    Tableau is changing the way companies are analyzing and sharing their data. For a free trial, visit www.tableausoftware.com/tdwi


  • TDWI's Best of BI, Vol. 11 (tdwi.org)

    TABLE OF CONTENTS

    FEATURES
    5   2013 in Review: Cloud BI Takes Off, by Stephen Swoyer
    10  2014 Forecast: BI, Analytics, and Big Data Trends and Recommendations for the New Year, by Fern Halper, Philip Russom, and David Stodder

    TDWI BEST PRACTICES REPORTS
    21  The State of Big Data Management, by Philip Russom
    25  Implementation Practices for Better Decisions, by David Stodder

    TEN MISTAKES TO AVOID SERIES
    29  Ten Mistakes to Avoid When Delivering Business-Driven BI, by Laura Reeves

    TDWI FLASHPOINT
    32  Enabling an Agile Information Architecture, by William McKnight
    34  TDWI Salary Survey: Average Wages Rise a Modest 2.3 Percent in 2012, by Mark Hammond

    BUSINESS INTELLIGENCE JOURNAL
    36  The Database Emperor Has No Clothes, by David Teplow
    41  Dynamic Pricing: The Future of Customer-Centric Retail, by Troy Hiltbrand

    BI THIS WEEK
    48  Inside Facebook's Relational Platform, by Stephen Swoyer
    50  Load First, Model Later: What Data Warehouses Can Learn from Big Data, by Jonas Olsson

    TDWI CHECKLIST REPORT
    52  How to Gain Insight from Text, by Fern Halper

    55  NEW! Vote for Your Favorite Best of BI Story
    56  TDWI Webinar Series
    57  TDWI Education
    60  Best Practices Awards 2013
    66  BI Solutions
    69  About TDWI

    SPONSOR INDEX
    Birst
    HP Vertica
    Information Builders
    Paxata
    Tableau Software

    Volume 11

  • Enterprise-caliber BI born in the cloud.

    Pulling real insights from all your data just got a whole lot easier. And faster. Birst is engineered with serious muscle under the hood. Like an automated data warehouse coupled with powerful analytics. Now add the speed and ease of use of the cloud and you get a solution more flexible than legacy BI and a whole lot more powerful than Data Discovery. But, hey, don't just take our word for it: find out why Gartner named us a Challenger in its most recent BI Magic Quadrant and why more than a thousand businesses rely on Birst for their analytic needs. Learn to think fast at www.birst.com.

    Think fast.

    Take a look at what's under our hood at birst.com

    2014 Think fast is a trademark of Birst, Inc. All other trademarks cited here are the property of their respective owners.

  • EDITORIAL DIRECTOR'S NOTE

    Welcome to the eleventh annual TDWI's Best of Business Intelligence: A Year in Review. Each year we select a few of TDWI's most well-received, hard-hitting articles, research, and information, and present them to you in this publication.

    Stephen Swoyer kicks off this issue with a review of major business intelligence (BI) developments. In "2013 in Review: Cloud BI Takes Off," he argues that cloud BI has gained real and verifiable momentum in 2013.

    In this issue's 2014 Forecast, TDWI Research analysts Fern Halper, Philip Russom, and David Stodder share their predictions and recommendations for the coming year, including BI and analytics trends and tips for successful big data implementations.

    To further represent TDWI Research, we've provided excerpts from some of the past year's Best Practices Reports. Russom's "The State of Big Data Management" covers big data implementations, and Stodder's "Implementation Practices for Better Decisions" explains how to use data visualization to help your organization succeed.

    This volume's Ten Mistakes to Avoid will help you dodge common blunders when delivering a BI solution. And thanks to articles from TDWI's e-newsletters, you'll learn more about agile information architectures, salary trends in the BI/DW industry, Facebook's relational platform, and how to maximize the ROI of your data warehouse.

    In "The Database Emperor Has No Clothes," one of our selections from the Business Intelligence Journal, you'll read about Hadoop's advantages over relational database management systems. Our second Journal piece, "Dynamic Pricing: The Future of Customer-Centric Retail," describes a new toll road in Washington, DC that uses big data and advanced analytics to monitor and manage traffic flow.

    We're also including a selection of our informative, on-demand Webinars, as well as a peek inside a TDWI World Conference keynote address, given by popular speaker Ken Rudin. And don't miss our Best of BI story survey on page 55; we want to know your favorite stories from this issue.

    TDWI is committed to providing industry professionals with information that is educational, enlightening, and immediately applicable. Enjoy, and we look forward to your feedback on the Best of Business Intelligence, Volume 11.

    Denelle Hanlon
    Editorial Director, TDWI's Best of Business Intelligence
    The Data Warehousing Institute
    [email protected]

    tdwi.org

    Editorial Director: Denelle Hanlon

    Senior Production Editor: Roxanne Cooke

    Graphic Designer: Rod Gosser

    President: Richard Zbylut

    Director of Education: Paul Kautza

    Director, Online Products & Marketing: Melissa Reeve

    President & Chief Executive Officer: Neal Vitale

    Senior Vice President & Chief Financial Officer: Richard Vitale

    Executive Vice President: Michael J. Valenti

    Vice President, Finance & Administration: Christopher M. Coates

    Vice President, Information Technology & Application Development: Erik A. Lindgren

    Vice President, Event Operations: David F. Meyers

    Chairman of the Board: Jeffrey S. Klein

    Reaching the staff
    Staff may be reached via e-mail, telephone, fax, or mail.

    E-mail: To e-mail any member of the staff, please use the following form: [email protected]

    Renton office (weekdays, 8:30 a.m. to 5:00 p.m. PT)
    Telephone 425.277.9126; Fax 425.687.2842
    555 S Renton Village Place, Ste. 700, Renton, WA 98057

    Corporate office (weekdays, 8:30 a.m. to 5:30 p.m. PT)
    Telephone 818.814.5200; Fax 818.734.1522
    9201 Oakdale Avenue, Suite 101, Chatsworth, CA 91311

    Advertising opportunities
    Scott Geissler, [email protected], 248.658.6365

    Reprints and e-prints: For single article reprints (in minimum quantities of 250-500), e-prints, plaques, and posters, contact PARS International. Phone 212.221.9595; E-mail [email protected]; Web www.magreprints.com/QuickQuote.asp

    Copyright 2014 by TDWI (The Data Warehousing Institute), a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. Mail requests to Permissions Editor, c/o Best of BI 2014, 555 S Renton Village Place, Ste. 700, Renton, WA 98057. The information in this magazine has not undergone any formal testing by 1105 Media and is distributed without any warranty expressed or implied. Implementation or use of any information contained herein is the reader's sole responsibility. While the information has been reviewed for accuracy, there is no guarantee that the same or similar results may be achieved in all environments. Technical inaccuracies may result from printing errors, new developments in the industry, and/or changes or enhancements to either hardware or software components. Produced in the USA.

    TDWI is a trademark of 1105 Media, Inc. Other product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.


  • HP Vertica 7
    Big Data Analytics - No Limits, No Compromises

    It's Time to Modernize Your Enterprise Data Warehouse

    So, you have made a major investment in your data warehouse. However, squeezing out marginal performance gains requires you to constantly invest in more hardware and services. There has to be a better, more cost-effective solution: a solution where you can put an end to the constant compromise between your quality of insight and speed of decision making. Why make compromises or be limited by your data warehouse? Purpose built for Big Data analytics from the very first line of code, only the HP Vertica Analytics Platform can modernize your enterprise data warehouse by delivering:

    Blazing-Fast Analytics: Run queries 50x-1,000x faster than legacy data warehouses
    Massive Scalability: Add an unlimited number of industry-standard servers for petabyte-scale
    Optimized Data Storage: Store 10x-30x more data per server than row databases to achieve the lowest TCO
    Open Architecture: Leverage tight built-in support for Hadoop, R, and your choice of ETL and BI

    FREE Download HP Vertica Community Edition!

  • FEATURE

    2013 in Review: Cloud BI Takes Off

    By Stephen Swoyer

    The year that was 2013 doesn't fit any obvious pattern. It was like no other. Applications such as social media sentiment analysis and big data analytics are hard, costly, and time-consuming. In 2013, enterprises began to come to terms with this inescapable fact.

    Along the way, marketing shifted into overdrive, cloud BI at long last took off, NoSQL grew exponentially, and a co-creator of the data warehouse (which, by the way, turned 25 this year) served up a trenchant assessment of the state of BI. Such was the year that was 2013.

    Self-Service Salvation

    This year, vendor marketers celebrated self service as a fix for the usability, inflexibility, and adoption issues that have long bedeviled BI. Selling it as a new tool, however, was a tougher sell. Self-service BI isn't new; it isn't even sort of new. A decade ago, the former Business Objects, the former Cognos, and the former Hyperion (along with BI stalwarts such as Information Builders and MicroStrategy) also championed self service as a solution for what ailed BI. In painful point of fact, self service is almost as old as BI itself.

    Self service first became a mantra for BI back in the early 1990s. "The problem was that data was locked in a database and [users] didn't want to have to go to IT to ask them to write SQL queries whenever they needed something," industry luminary Cindi Howson, a principal with BIScorecard, told BI This Week in an interview in 2012. "That first form of self service was generating SQL through a semantic layer, a business view, or whatever you want to call it."

    One of the things that ailed BI in 1993 and 2003 is the same thing that ails BI today: anemic adoption. This is in spite of the fact that today's BI tools incorporate a staggering array of self-service features, along with (often genuinely helpful) data visualization capabilities. They likewise address new usage paradigms (such as visual BI discovery) and, accordingly, are less rigid (in regard to both usability and information access) than were their predecessors.

    BI vendors in 2013 talked about use cases and feature bundles that couldn't have been anticipated a decade ago, but has any of this actually made BI better, more usable, or more pervasive? Are companies having more success with BI, and does this success actually correlate with value?

    Judging by BI adoption rates, you might not think so. According to Howson, BI adoption has hovered at or around 25 percent for the better part of a decade. In the 2013 installment of BI Scorecard's Successful BI Survey, sizeable percentages of respondents reported that BI tools still aren't easy enough to use (an issue cited by 24 percent of survey respondents) or aren't able to answer complex questions (23 percent).

    In Search of a Silver Bullet

    Self service is a time-tested prescription for the BI blues. Information search is a new(er) solution.

    The case for search goes something like this: thanks to the inflexibility of the data warehouse and its rigid data model, information from multi-structured sources (e.g., machine or sensor logs, blog postings and documents, videos and photos) can't easily be prepared and schematized for SQL query access. Information search promises to bridge the structured and multi-structured worlds, situating business facts in a rich semantic context.

    Search, too, is by no means new: Google, for example, introduced an enterprise search appliance almost a decade ago. That being said, 2013 produced some genuinely interesting developments on the information search front, such as IBM's still-incubating Project Neo natural-language search technology (NLS).

    Elsewhere, Information Builders introduced a new version of its WebFOCUS Magnify information search offering and Microsoft touted NLS as a major part of PowerBI, the BI and analytic component of its Office365 cloud service. Also this year, vendors such as Cloudera (Cloudera Search), MarkLogic, and DataRPM, along with established players such as NeutrinoBI and Oracle (with its Endeca product line), likewise touted search as a differentiating technology.

    True, search might be marketed as a silver bullet, but there's no disputing that it's a genuinely intriguing technology. What's more, it's becoming increasingly commoditized. Tools such as Cloudera Search, DataRPM, and WebFOCUS Magnify leverage open source software (OSS) components such as Solr (an OSS search platform) and Lucene (an OSS indexing library).

    The upshot is that it's increasingly possible to build a serviceable information search platform using free OSS tools. For example, a savvy organization could use a combination of Solr, the Apache unstructured information management architecture (UIMA) project, the R statistical programming environment, and other technologies to build and deploy an analytic search platform that addresses both faceted search (a non-taxonomic scheme for classifying information in multiple dimensions) and NLS requirements. Look for information search to play a more prominent role in 2014.
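To make the idea of faceted search concrete, here is a minimal sketch in plain Python. It does not use Solr or any of the tools named above; the catalog, the field names, and the helper functions `filter_docs` and `facet_counts` are all hypothetical and exist only for illustration. The point is simply that each document carries several independent dimensions, and a search front end can both filter on any of them and show how many hits fall under each value.

```python
from collections import Counter

# Hypothetical mini-catalog; field names and values are illustrative only.
DOCS = [
    {"id": 1, "source": "blog", "year": 2013, "topic": "cloud"},
    {"id": 2, "source": "sensor-log", "year": 2013, "topic": "operations"},
    {"id": 3, "source": "blog", "year": 2012, "topic": "cloud"},
    {"id": 4, "source": "document", "year": 2013, "topic": "pricing"},
]


def filter_docs(docs, **selected):
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]


def facet_counts(docs, fields):
    """Count how many documents fall under each value of each facet."""
    return {field: Counter(d[field] for d in docs) for field in fields}


# Narrow the result set by one dimension, then summarize the others.
hits = filter_docs(DOCS, year=2013)
counts = facet_counts(hits, ["source", "topic"])
```

A production search platform would compute these counts inside the index rather than in application code, but the user-facing behavior (pick a value in one dimension, see the remaining counts in the others) is the same.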

    At Long Last, Cloud?

    In 2013, we saw BI marketing shift to the cloud, with Microsoft's PowerBI for Office365, a new cloud offering from Tableau, a new Active Data Warehouse Private Cloud service from Teradata, a cloud BI platform-as-a-service (PaaS) offering from start-up RedRock BI, and a data management splash by relative newcomer Treasure Data, which markets a hosted big data analytic service (this last mixes OSS pieces of Hadoop with proprietary bits), among other entries.

    There's been no shortage of SaaS BI offerings, including solutions from Birst, Domo (a relative newcomer), and GoodData, among others, but for a long time, prevailing wisdom held that enterprise BI and data warehousing just wouldn't take in the cloud. BI information is too sensitive and data warehousing workloads too demanding for cloud environments, some argued.

    There's some truth to both claims: some workloads or applications simply can't be shifted to the cloud, owing chiefly to regulatory requirements. Give Mark Madsen, a research analyst with IT strategy consultancy Third Nature, a few hours and he'll exhaustively tally the many and varied reasons data warehousing workloads aren't a great fit for the highly virtualized, loosely coupled cloud.

    That said, Madsen himself expects that a clear majority of data warehousing (DW) workloads will shift to the cloud over the next decade. As 2013 draws to a close, in fact, it's fair to say that cloud BI has real and verifiable momentum. Several vendors (RedRock BI, but also MicroStrategy and Yellowfin) even market BI PaaS offerings, which shift BI (i.e., platform infrastructure, workloads, data, development) entirely to the cloud.

    MicroStrategy hosts its own PaaS service, while Yellowfin BI can be deployed on Amazon Web Services (AWS) and used in conjunction with Amazon's Redshift massively parallel processing (MPP) cloud data warehouse service. Other BI vendors, such as Actuate, Jaspersoft, and Talend, are available as PaaS packages, too.

    Then there's AWS, which is a bona fide cloud powerhouse. A year ago, Amazon announced Redshift, an MPP cloud data warehouse for AWS. Let's not sugarcoat this: there are significant challenges involved in shifting data warehouse workloads into the cloud. With Redshift, Amazon seems to have licked many of them.

    Steve Dine, managing partner with DataSource Consulting and a frequent instructor at TDWI educational events, says he's worked with Redshift in a few client engagements.

    "It scales well. Just like any MPP system, it scales based on how well you parallelize your workloads, how well you partition your data, and how many nodes you spin up," he explains.

    "[Redshift is] just like any columnar database: if you're isolating it to a subset of attributes, it's great; if you're trying to do very wide queries, as you would in many retail situations, you are likely to see better performance from a row-based MPP database."
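Dine's point about narrow versus wide queries can be illustrated with a toy model. The sketch below is not Redshift code and says nothing about its internals; the table, the column names, and the two cost functions are hypothetical. It only counts how many cell values each layout would have to scan, ignoring compression, disk blocks, and the cost of stitching columns back into rows.

```python
# Hypothetical sales table; column names and values are made up for illustration.
ROWS = [
    {"store": "A", "sku": 101, "qty": 2, "price": 5.0},
    {"store": "A", "sku": 102, "qty": 1, "price": 9.5},
    {"store": "B", "sku": 101, "qty": 4, "price": 5.0},
]

# The same data pivoted into a column layout: one list per attribute.
COLUMNS = {name: [row[name] for row in ROWS] for name in ROWS[0]}


def cells_read_row_store(rows, wanted):
    """A row layout scans whole rows, so attributes outside `wanted` are read too."""
    return len(rows) * len(rows[0])


def cells_read_column_store(columns, wanted):
    """A column layout scans only the attributes the query names."""
    return sum(len(columns[name]) for name in wanted)


narrow = ["qty"]        # e.g., total units sold
wide = list(COLUMNS)    # e.g., a report that touches every attribute

narrow_cost = (cells_read_row_store(ROWS, narrow), cells_read_column_store(COLUMNS, narrow))
wide_cost = (cells_read_row_store(ROWS, wide), cells_read_column_store(COLUMNS, wide))
total_qty = sum(COLUMNS["qty"])
```

In this toy model the narrow query scans a quarter of the cells in the column layout, while the wide query scans the same number in both. A real columnar engine would also pay to reassemble rows for the wide query, which is consistent with Dine's observation that a row-based database may come out ahead there.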

    Dine doesn't think of Redshift as a silver bullet (e.g., even though it's inexpensive, Redshift's per-TB pricing can quickly add up) but sees it (1) as a compelling option for smaller companies looking to build data warehouses in the cloud, and (2) as a proof of concept for large companies concerned about shifting MPP workloads to the cloud.

    "It just democratizes [MPP analytic databases]," he points out. "What's nice about it is that you can spin it up and set it to automatically take snapshots. You can bring it up and take it down whenever you want. Will it work for everybody? As with any [MPP platform], it just depends on what your workload is."

    Hadoop, NoSQL, and Google F1

    As a combined market/technology segment, NoSQL continues to grow like a flowering kudzu plant. At O'Reilly's Strata conference in February, Pivotal, a big data spin-off formed by EMC and VMware last December, announced Hawq, an ANSI SQL, ACID-compliant database system (based on EMC's Greenplum MPP database) for Hadoop.

    Elsewhere this year, the OSS community (aided by Cloudera, Hortonworks, MapR, and other commercial software vendors) focused on bolstering Hadoop's security and disaster recovery bona fides. One of the biggest deliverables of 2013 was version 2.2 of the Hadoop framework, which went live in early October. Hadoop 2.2 bundles YARN (a backronym for yet another resource negotiator), which promises to make it easier to monitor and manage non-MapReduce workloads in Hadoop clusters. (Prior to YARN, Hadoop's JobTracker and TaskTracker jointly managed resource negotiation. Both daemons were built with the MapReduce compute engine in mind.) Now that YARN's available, users should finally be able to manage, monitor, and scale mixed workloads in the Hadoop environment.

    Nor is Hadoop the last word in NoSQL. Vendors such as Basho Technologies (which develops the Riak distributed DBMS), Cloudant (which bases its distributed NoSQL database on the Apache CouchDB project), DataStax (a commercial distribution of Apache Cassandra), FoundationDB, MarkLogic, and RainStor, among others, would beg to differ. This October, Basho previewed a version 2 release of Riak that claims to support strong consistency (i.e., strong transactions, or the ACID that's known and loved by DM types). Most distributed DBMSs (such as NuoDB and Splice Machine) support what's known as eventual consistency.

    An altogether new entrant was F1, the ANSI SQL- and ACID-compliant database platform that Google announced in September. Google, which uses F1 to power its AdWords service, claims it can function as a single platform for both OLTP and analytic workloads. Unlike Hadoop, F1 addresses classic data management requirements (with support for strong transactions and row-level locking), and does so at Google scale. Google's push behind F1 underscores an important consideration: we conflate the terms NoSQL, big data, and Hadoop at our own risk.

    Final Thoughts: Business unIntelligence and the Data Warehouse at 25

    This year was a milestone annum, too. The data warehouse itself was born 25 years ago, in 1988, when Dr. Barry Devlin and Paul Murphy published their seminal paper, "An architecture for a business and information system," in the IBM Systems Journal.

    Devlin was back in 2013 with a new book, Business unIntelligence: Insight and Innovation beyond Analytics and Big Data. In many ways, Devlin's book is a wry assessment of his prodigal creation, which, a quarter century on, is at once dominant and besieged.

    Over the last quarter century, organizations have invested hundreds of billions of dollars in data warehouse systems, to say nothing of the BI tools that are the DW's raison d'être. There probably isn't a Global 2000 organization that doesn't have at least one enterprise data warehouse.

    The net net of this ubiquity, as Devlin demonstrates with perspicacity and humor, is a kind of muddling through. (The "unIntelligence" in Devlin's title speaks to precisely this problem.) In this regard, Devlin could well say of BI what philosopher Immanuel Kant famously said of humankind: "Out of the crooked timber of [a BI implementation project], no straight thing was ever made."

    The Business of BI: Comings, Goings, and IPOs

    This year, we bade adieu (chiefly by way of acquisition) to several stalwart vendors. Composite Software, Kalido, KXEN, ParAccel, and Pervasive Software, among others, were acquired this year. Cisco Systems, which seems poised to make a big push into data management in 2014 and beyond, snapped up Composite in June; Silverback Enterprise Group, an Austin-based holding company, acquired Kalido in October; SAP nabbed KXEN, a long-time Teradata Partner, in September; and Actian acquired both ParAccel and Pervasive. Will these technologies survive and thrive, or will they vanish (as with the former Brio Software, the former Celequest, and the former DecisionPoint Software, to name just a few) into the void of the industry's memory hole?

    Elsewhere this year, Tableau's long-rumoured IPO finally (and successfully) took place. Dell pulled off a kind of reverse IPO: in early February, its shares were delisted from both the NASDAQ and the Hong Kong Stock Exchange. At the same time, founder Michael Dell (bolstered by VC giant Silver Lake Partners, and with an additional $2 billion in financing from Microsoft) came back to take Dell private. Prior to its delisting, Dell had managed to cobble together a large information management portfolio, anchored by its Toad assets, which it acquired from the former Quest Software. Its execution in 2014 will bear watching.

    So, too, will that of Teradata, which this year became acquainted with the downside of being a public company. Teradata missed its earnings in Q2 and, just ahead of its Partners conference in October, cut its earnings outlook for the year. As a result, the DW giant's stock was repeatedly buffeted by the market. Slings and arrows, indeed.


    Buzzwords: A Plea

    A new year brings new buzzwords. By "buzzword," we mean those once-unique coinages that, when used sparingly or in isolation, have imaginative, conceptual, and/or thematic power. As poet A.R. Ammons once aptly put it, "A word too much repeated falls out of being."

    And how. This year, adjectives such as "disruptive," "transformative," "self service," "high value," and "advanced" and/or "predictive analytic" (along with the term "analytic" itself) were taken up as adjectives or adverbs by marketers everywhere. Even the word "cloud" was used and misused by marketeers.

    All of these terms passed into a lexicon of descriptive words (sprinkled with a few scant nouns and verbs) that includes marketing mainstays such as "game changing," "unprecedented," and "patent pending," as well as more innocuous descriptors such as "visual," "intuitive," "market leading," or "innovative."

    These words comprise the noise that we must filter out if we're to meaningfully assess products and technologies, or (more important) make buying decisions about products and technologies. It isn't that these words aren't useful and don't mean anything. Rather, it's that the contexts in which they're employed are so general (or so inapposite) as to dilute their meanings. They're all noise and no (or very little) signal.

    This year, and with shocking frequency, BI and DM vendors delivered "disruptive" tools that "surface" (a popular alternative is "expose") "innovative" and/or "intuitive" capabilities, almost always in a rich "visual" and/or "self-service" context, and promise to "transform" a typically dysfunctional status quo. In many cases, tools tout "advanced analytic" or "predictive" capabilities and (increasingly) have "cloud" components or attributes, too.

    This author is as guilty in the dilution of these terms as anyone else. For the New Year, he's resolved to strike them from his lexicon, at least in contexts where they're inappropriate.

    If only industry vendors would do the same.

    Stephen Swoyer is a contributing editor for TDWI.

  • FEATURE

    2014 Forecast: BI, Analytics, and Big Data Trends and Recommendations for the New Year

    By Fern Halper, Philip Russom, and David Stodder

    Four Analytics Technology Trends for 2014

    By Fern Halper, Research Director for Advanced Analytics, TDWI

    In 2013, TDWI saw increasing activity around predictive analytics as a foothold for advanced analytics. Predictive analytics is fast becoming an important component of an organization's analytics arsenal, providing significant advantage for achieving a range of desired business outcomes, including higher customer profitability and more efficient and effective operations. TDWI expects interest will continue to build in 2014 for this technology, and that it will continue to evolve. Additionally, other advanced analytics technologies will begin to gain momentum.

    Advanced analytics provides algorithms for complex analysis of either structured or unstructured data. It uses sophisticated statistical models, machine learning, and other advanced techniques to find patterns in data for prediction and decision optimization. Although some of the techniques for advanced analytics have been around for many years, several factors have come together in almost a perfect storm to ignite increasing market interest in these technologies: the explosion of data (type, volume, and frequency); the availability of cheap computing power; and the realization that analytics can provide a competitive advantage.

    Here are four analytics trends that I see as we move into 2014.

    Trend #1: Predictive analytics deployment models progress. Predictive analytics is a technology whose time has finally come. Although many of the algorithms have been around for decades, more organizations want to utilize the power of this technology. Use cases include predicting customer and machine behavior, such as fraud, churn, or machine failure. Figure 1, from a recent TDWI World Conference survey, illustrates that close to 90% of respondents cited predictive analytics as a technology they would be using in the next three years. Twenty-seven percent were currently using it. TDWI expects adoption of predictive analytics will continue in 2014.

    The deployment options for this technology are evolving. There has been a market move over the past few years to democratize predictive analytics (i.e., make it easier to use). This has fueled the growth of the technology. For example, in the new TDWI Best Practices Report, "Predictive Analytics for Business Advantage," 86% of respondents cited business analysts and 79% cited statisticians when we asked the question, "In the near future, who do you expect will be using predictive analytics tools in your company?" This points to a shift occurring (at least in perception) about who is going to make use of predictive analytics.

    Many respondents believe that business analysts will build the predictive models. Whether all business analysts have the skills to build complex models is another discussion. However, other deployment models for predictive analytics that can help a wider group of people make use of the technology will become more popular in 2014. These include:

    Operationalizing it. One way to make predictive analytics more pervasive is to include it as part of an automated business process. For instance, a data scientist or business analyst might build a model for cross-sell and up-sell that is instantiated into a call center system. A call center agent might use the model output without even necessarily knowing there is a complex model working behind the scenes. The agent might only see the next best offer to suggest to a customer.

    Consumerizing it. Another way to make predictive analytics more consumable is for a technical person to build a model that someone less technical can interact with. For instance, a data scientist might build a model that a marketing analyst uses.
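The "operationalizing" pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any vendor's product: the offers, the customer attributes, and the hand-set weights in `OFFER_WEIGHTS` are hypothetical stand-ins for a model that a data scientist would actually train offline. What matters is the division of labor, where the scoring logic lives behind a single function and the call center screen shows only its answer.

```python
# Hypothetical, hand-set weights standing in for a model trained offline.
OFFER_WEIGHTS = {
    "premium_upgrade": {"tenure_years": 0.4, "monthly_spend": 0.02},
    "family_plan": {"household_size": 0.9, "tenure_years": 0.1},
}


def score_offers(customer):
    """Score every offer as a weighted sum of the customer's attributes."""
    return {
        offer: sum(weight * customer.get(feature, 0) for feature, weight in weights.items())
        for offer, weights in OFFER_WEIGHTS.items()
    }


def next_best_offer(customer):
    """Return only the top-scoring offer; this is all the agent's screen shows."""
    scores = score_offers(customer)
    return max(scores, key=scores.get)


# Two illustrative callers with made-up attributes.
long_time_single = {"tenure_years": 5, "monthly_spend": 40, "household_size": 1}
new_large_household = {"tenure_years": 1, "monthly_spend": 20, "household_size": 4}

offer_a = next_best_offer(long_time_single)
offer_b = next_best_offer(new_large_household)
```

Because the agent-facing code calls only `next_best_offer`, the model behind it can be retrained or replaced without changing what the agent sees, which is the sense in which the model becomes part of an automated business process.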

    Trend #2: Geospatial analytics continues to gain steam. Geospatial data, sometimes referred to as location data or simply spatial data, is emerging as an important source of information in both traditional and big data analytics. Geospatial data and geographic information systems (GIS) software are being integrated with other analytics products to enable analytics that utilize location and geographic information. Use cases include market segmentation, logistics, detecting fraud, and situational awareness.

    Geospatial analytics is being used in visualizations that layer geospatial information together with other information to spot patterns. Geospatial data is also being combined with other forms of data to be used in more sophisticated analysis, such as prediction. In this case, the geospatial data is combined with other sources of data as attributes for a model. In fact, more than 70% of respondents surveyed for "Predictive Analytics for Business Advantage" indicated that they plan to incorporate geospatial data into their predictive models within the next three years (see Figure 3). And as Figure 1 illustrates, close to 70% of the respondents to the 2013 World Conference survey plan to be using geospatial analytics in the next three years. This kind of analysis is unmistakably gaining in popularity. TDWI expects this trend to continue in 2014.

    Figure 1. What kind of analytics are you currently using in your organization today to analyze data? In three years? Please select all that apply. Source: TDWI World Conference Tech Survey, 2013.

    Analysis type                                                                         Using now or will use within 3 years
    Visualization tools                                                                   96%
    Predictive analytics                                                                  88%
    Geospatial analytics                                                                  70%
    Other advanced statistical techniques (e.g., clustering, forecasting, optimization)   64%
    Text analytics                                                                        53%
    Web analytics                                                                         51%
    Social media analytics                                                                51%
    Other data mining techniques (e.g., neural nets, machine learning)                    45%
    Link analysis                                                                         28%

    Trend #3: Analytics in the cloud becomes more popular. Although the adoption of cloud analytics has been slower than predicted, TDWI is seeing an increasing number of companies investigating the technology. This trend will continue into 2014. A hybrid approach (using a combination of public and private clouds) will also become more popular.

    TDWI Research in predictive analytics supports this trend. In Figure 2, only 26% of respondents stated that they would never use the cloud for BI or predictive analytics, whereas 35% were thinking about using the cloud for some kind of analytics and 25% were currently using it. This is an increase from previous surveys.

    As companies start to think about analytic workloads, the cloud will continue to become more popular. TDWI is already seeing certain kinds of analytics workloads move to the cloud. Typically, these are not well suited to the data warehouse. For instance, companies that are already collecting data in the public cloud are analyzing it there to reduce it. This might include telemetry data or other kinds of big data. They are then sending this reduced data set to their on-premises data centers for further analysis.

    Companies are also using the public cloud for analytics sandboxes (i.e., test beds) for advanced analytics. More often, though, they leave the data in the cloud and use it for many different kinds of advanced analysis. As this occurs, service providers are also setting up communities where data (such as census data) can be shared. TDWI expects to see more use cases of cloud analytics emerge in 2014.

    Trend #4: Data and the Internet of Things. There is little doubt that the world will continue to create more data. TDWI Research indicates that in addition to ever-increasing amounts of structured data, companies will begin to use other forms of data for analysis. For instance, in Predictive Analytics for Business Advantage, we asked respondents who were already using predictive analytics what data they plan to use for it. Figure 3 illustrates their responses.

    Three big growth areas (aside from geospatial data) over the next three years include social media data, text data, and real-time event data. For instance, TDWI sees more companies utilizing text analytics technologies to essentially structure unstructured data. Organizations are using this data in isolation to discover patterns; however, they are also marrying the text data with structured data to provide lift to advanced analytics models.

    Likewise, more companies are looking to real-time data for advanced analytics. New technologies, such as complex event processing (CEP), stream mining, in-memory analytics, and in-database analytics, have enabled real-time analytics for customer interactions and operational and situational intelligence, among other use cases. TDWI expects that more users will adopt real-time data in 2014.

    Figure 2. Source: TDWI Research survey to support TDWI Best Practices Report: Predictive Analytics for Business Advantage, 2014.

    Does your organization use cloud computing for analytics?

    We don't use the cloud now but we are thinking about using a cloud for BI or predictive analytics | 35%
    We would never use the cloud for BI or predictive analytics | 26%
    We use a public cloud or SaaS for BI and/or predictive analytics
    We use a public cloud or SaaS for BI and/or predictive analytics and on-premises
    We use a hybrid approach to BI/predictive analytics, meaning we use both public and private cloud
    We use cloud and on-premises solutions
    We use a private cloud for BI or predictive analytics
    Don't know
    Remaining chart values (order of assignment to the six responses above is not preserved): 14%, 11%, 6%, 4%, 2%, 2%

    One class of potentially real-time data is machine-generated data, which is called out separately in Figure 3. Interestingly, close to 20% of respondents are already using machine-generated data in some sort of predictive capacity. Another 20% expect to use it in the next three years.

    This machine-generated data is part of the Internet of Things (IoT), which TDWI expects to grow in market awareness in 2014. This term refers to the fact that devices are becoming more numerous, and they are equipped with sensors and other technologies that can send data over the Internet. These devices are generating huge amounts of data that can be used in various ways. For instance, insurance companies are looking to use telemetric data from devices placed in cars to help develop better risk models for insurance. Logistics providers are tracking produce that is tagged with RFID tags to check for temperature and spoilage. Since data simply dumped into some sort of storage environment is not that useful, IoT analytics will also start to evolve.

    A final word

    Companies are finally starting to utilize more advanced analytics, and this trend will continue into 2014. This will be the case whether organizations are dealing with big data or with their current data sets. Organizations are looking to employ more analytics over different kinds of data, especially once they feel that they have their BI implementations under control. 2014 promises to be an exciting year for analytics!

    Fern Halper is director of TDWI Research for advanced analytics, focusing on predictive analytics, social media analysis, text analytics, cloud computing, and other big data analytics approaches. She has more than 20 years of experience in data and business analysis, and has published numerous articles on data mining and information technology. Halper is co-author of Dummies books on cloud computing, hybrid cloud, service-oriented architecture, service management, and big data. She has been a partner at industry analyst firm Hurwitz & Associates and a lead analyst for AT&T Bell Labs. Her Ph.D. is from Texas A&M University. You can reach her at [email protected], or follow her on Twitter: @fhalper.

    10 Recommendations for Big Data Implementations

    By Philip Russom, Research Director for Data Management, TDWI

    It's human nature to start a new calendar year by ruminating on the many things you'd like to achieve. After all, most of us humans want to improve our work and home lives, and setting goals is one way to achieve improvement.

    If you're a data management professional or similar specialist, the spirit of the new year may have you thinking about how to get a better grip on one of the most apparent opportunities facing us nowadays, namely so-called big data.

    Figure 3. Source: TDWI Best Practices Report: Predictive Analytics for Business Advantage, 2014.

    What kind of data do you use for predictive analytics? Now? Three years from now?

    Data type | Using today and will keep using | Will use within 3 years | No plans | N/A or don't know
    Structured data (from tables, records) | 98% | 2% across the remaining categories
    Demographic data | 77% | 11% | 6% | 6%
    Time series data | 65% | 14% | 11% | 10%
    Web log data | 37% | 33% | 21% | 9%
    Geospatial data | 35% | 37% | 16% | 12%
    Clickstream data from websites | 32% | 29% | 25% | 14%
    Real-time event data | 31% | 40% | 20% | 9%
    Internal text data (i.e., from e-mails, call center notes, claims, etc.) | 31% | 45% | 18% | 6%
    External social media text data | 21% | 44% | 25% | 10%
    Machine-generated data (e.g., RFID, sensor, etc.) | 19% | 22% | 38% | 21%

    Managing big data is a relatively new practice, so its best practices and critical success factors are still emerging. To give you a leg up, allow me to present 10 recommendations for successful big data implementations.

    1. Demand business value from big data.

    Think of big data as an opportunity, and seize it. In TDWI's big data management survey of 2013, 89% of survey respondents said the management of big data is an opportunity. Sure, the management of big data presents technical challenges, but the insights resulting from the analysis of big data can lead to cost reductions and revenue lift. Hence, the primary path to business value from big data is through analytics. A second path joins new big data with older enterprise data to extend complete views of customers and other business entities. A third path taps streaming big data to enlighten and accelerate time-sensitive business processes.

    Leverage big data, don't just manage it. It costs money, time, bandwidth, and human resources to capture, store, process, and deliver big data. Therefore, no one should be content to simply manage big data as a cost center that burns up valuable resources.

    2. Put advanced analytics and big data together.

    It's the analytics, stupid. Current consensus says that analytics is the primary path to getting business value from big data. Therefore, the point of managing big data is to provide a large and rich data set for actionable business insights, discovered via analytics. This fact is so apparent that there's even a name for it: big data analytics.

    For example, a common analytic application today is the sessionization of website log data, which reveals the behavior of site visitors, information that helps marketers and Web designers do their jobs better. As another example, trucks and railcars are loaded with sensors and GPS systems nowadays so logistics firms can analyze operator behavior, vehicle performance, onboard inventory, and delivery route efficiency.

    In these examples, collecting big data from Web applications or sensors is almost incidental. The real point is to elevate the business to the next level of corporate performance based on insights gleaned from the analysis of big data.
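    To make the sessionization example concrete, here is a minimal Python sketch. It assumes log records arrive as (visitor ID, timestamp, URL) tuples and that a visit ends after 30 minutes of inactivity; the record layout, the sample data, and the 30-minute cutoff are illustrative assumptions, not details from the TDWI research.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity cutoff; 30 minutes is a common convention, not a value from the article.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(events, timeout=SESSION_TIMEOUT):
    """Group (visitor_id, timestamp, url) log records into per-visitor sessions.

    A new session starts whenever a visitor has been idle longer than `timeout`.
    Returns a dict mapping visitor_id to a list of sessions (each a list of records).
    """
    by_visitor = defaultdict(list)
    for record in events:
        by_visitor[record[0]].append(record)

    sessions = {}
    for visitor, records in by_visitor.items():
        records.sort(key=lambda r: r[1])  # order each visitor's clicks by time
        visitor_sessions = [[records[0]]]
        for prev, curr in zip(records, records[1:]):
            if curr[1] - prev[1] > timeout:
                visitor_sessions.append([curr])    # idle gap exceeded: open a new session
            else:
                visitor_sessions[-1].append(curr)  # still the same visit
        sessions[visitor] = visitor_sessions
    return sessions

# Made-up sample log records for illustration.
sample_log = [
    ("visitor_a", datetime(2014, 1, 6, 9, 0), "/home"),
    ("visitor_a", datetime(2014, 1, 6, 9, 5), "/products"),
    ("visitor_a", datetime(2014, 1, 6, 11, 0), "/home"),
    ("visitor_b", datetime(2014, 1, 6, 9, 2), "/pricing"),
]
result = sessionize(sample_log)
```

    Once clicks are grouped into sessions, measures such as pages per visit or common navigation paths can be computed per session rather than per raw log line.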

    3. Don't expect new forms of analytics to replace older forms.

    Online analytic processing (OLAP) continues to be the most common form of analytics today. OLAP is here to stay because of its value serving a wide range of end users. The current trend is to complement OLAP with advanced forms of analytics based on technologies for data mining, statistics, natural language processing, and SQL-based analytics. These are more suited to exploration and discovery than OLAP is. Note that most data warehouses today are designed to provide data mostly for standard reports and OLAP, whereas future-facing data warehouses also provide additional data and functionality for advanced analytics.

    Use big data to create new applications and extend old ones. For example, big data can expand the data samples that data mining and statistical analysis applications depend on for accurate actuarial calculations and customer segments or profiles. Similarly, much of big data's value comes from mixing it with other enterprise data. The proverbial 360-degree view of customers and other business entities accumulates more degrees when big data from new sources is integrated into views.

    4. Hire and train your staff for big data management.

    The focus should be on training and hiring data analysts, data scientists, and data architects who can develop the applications for data exploration, discovery analytics, and real-time monitoring that organizations need if they're to get full value from big data. Most BI/DW professionals are already cross-trained in many data disciplines; cross-train them more. When in doubt, hire and train data specialists, not application specialists, to manage big data. TDWI's take is that it's easier to train a BI professional in Hadoop and other big data technologies than it is to train an applications developer in BI and data warehousing.

    As with all data management, collaboration is key to the management of big data. Due to big data's diversity, diverse technology teams will need to play coordinated roles. From a business viewpoint, big data should be managed as an enterprise asset, such that multiple business units and stakeholders have access to big data so they can leverage it. It takes a lot of collaboration, both business and technical, to be sure everyone knows their role and has their needs met.


    5. Beware the proliferation of siloed repositories for big data analytics.

    Most analytic applications have a departmental bias. For example, the sales and marketing departments want to own and control customer intelligence, just as procurement needs to control supply chain analytics, and the financial department owns financial analysis. Furthermore, unstructured and semi-structured big data is regularly segregated because it can't be managed properly in the usual relational databases. For these and other reasons, TDWI sees big data collections and big data platforms (such as Hadoop and NoSQL databases) too often managed in isolated silos.

    Your goal should be to integrate big data into your well-integrated enterprise data and BI/DW environments, not proliferate twenty-first-century spreadmarts and teramarts. Besides, eventually we'll stop calling it big data and just assume it's a subset of enterprise data. Someone (probably not you) should decide whether big data platforms will be departmentally owned (as a lot of analytic applications are) or shared enterprise infrastructure supplied by central IT (similar to how IT provides SAN/NAS storage, servers, the network, and so on).

    6. Consider a data warehouse architecture that mixes relational and Hadoop technologies.

    Architecture can enable or inhibit critical next-generation big data management functions such as extreme scalability, complete views, unforeseen forms of analytics, big data as an enterprise asset, and real-time operation.

    One architectural strategy being adopted by many organizations is to reserve the data warehouse for the relational and multidimensional data that populates the majority of BI deliverables, namely standard reports, reports in dashboard or scorecard styles, metrics and key performance indicators for performance management, and multidimensional OLAP. In most organizations, the list constitutes a whopping 80% or more of the output of a BI program. So it makes sense that you guard the DW that your deliverables (and your job!) depend on most.

    On one hand, this architecture assures that the vast majority of BI deliverables have ample, clean, well-modeled, and well-documented data sourced from a traditional warehouse. On the other hand, this architecture assumes that the 20% or fewer deliverables for data exploration and advanced analytics will be populated with big data and similar data sets managed on other data platforms, not the core data warehouse. This is quite advantageous because each deliverable type and the data it requires are supported by the data platforms that are most conducive to them and most easily optimized for them. It also parallels team structures that separate reporting and analytics, because the two require very different skills and tools.

    Organizations that have moved to this two-part DW architecture usually depend on two key platform types: a relational database management system (RDBMS) and the Hadoop Distributed File System (HDFS). TDWI keeps finding more RDBMS/HDFS data warehouses, in a growing list of industries, which indicates this will soon be a common architectural approach to data warehouse environments, especially in organizations that need to leverage big data via analytics while still maintaining high standards in reporting and related deliverables.

    7. Define places for big data in architectures for data warehousing and enterprise data management.

    For example, an obvious place to start is to rethink the data staging area within your data warehouse. That's where big data enters a data warehouse environment and where it is usually stored and processed before being loaded into the warehouse proper. Consider moving your data staging area to a standalone big data management platform, whether on Hadoop, a columnar DBMS, a data warehouse appliance, or a combination of these and other alternative (non-relational) data platforms outside the core data warehouse.

    Data staging aside, there are many other areas within standard DW architectures where alternative data platforms can make a contribution, namely in archiving detailed source data, managing non-structured data, managing file-based data, data sandboxes, more processing power for an ETL hub or ELT push down, and anywhere you might use a non-dimensional operational data store.

    Consider the many new architectures that boost scalability and performance for big data. If your relational data warehouse is still on an SMP platform, make migration to MPP a priority. Consider distributing your data warehouse architecture, largely to offload a workload to a standalone platform that performs well with that workload. When possible, take analytic algorithms to the data, instead of data to the algorithm (as is the DW tradition); this new paradigm is seen with in-database analytics, Hadoop with MapReduce layered over it, and gate-array processing in some storage platforms and appliances.
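    The idea of taking the algorithm to the data can be illustrated with a small Python sketch. It uses an in-memory SQLite database as a stand-in for a warehouse platform; the `shipments` table, its columns, and the sample rows are hypothetical. The first function pulls every row into the application before computing, and the second lets the database engine aggregate so that only the results move.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse; table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (route TEXT, delay_minutes REAL)")
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?)",
    [("north", 12.0), ("north", 18.0), ("south", 5.0), ("south", 7.0), ("south", 9.0)],
)

def average_delay_client_side(connection):
    """Data-to-algorithm pattern: move every row to the application, then compute."""
    totals = {}
    for route, delay in connection.execute("SELECT route, delay_minutes FROM shipments"):
        count, total = totals.get(route, (0, 0.0))
        totals[route] = (count + 1, total + delay)
    return {route: total / count for route, (count, total) in totals.items()}

def average_delay_in_database(connection):
    """Algorithm-to-data pattern: the engine aggregates, and only results move."""
    query = "SELECT route, AVG(delay_minutes) FROM shipments GROUP BY route"
    return dict(connection.execute(query).fetchall())

client_side = average_delay_client_side(conn)
in_database = average_delay_in_database(conn)
```

    Both functions return the same averages. With five rows the difference is invisible, but the second form transfers one row per route instead of one row per shipment, which is the point of in-database processing at big data volumes.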


    8. Reevaluate your current portfolio of data platforms and data management tools.

    For one thing, big data management is, more and more, a multi-platform solution (as are most data warehouse architectures), so you should expect to further diversify your software portfolio accordingly to fully accommodate big data. For another thing, survey data shows that the software types poised for the most brisk new adoption in the next three years are Hadoop (including HDFS, MapReduce, and miscellaneous Hadoop tools) and complex event processing (for streaming real-time big data). After those come NoSQL DBMSs, private clouds, and data virtualization/federation. If you're like most organizations surveyed, all of these have a potential use in your big data management (BDM) solution, so you should educate yourself about them, then evaluate the ones that come closest to your BDM requirements.

    In addition, diverse big data is subject to diverse processing, which may require multiple platforms. To keep things simple, users should manage big data on as few data platform types as possible to minimize data movement as well as to avoid data synchronization and silo problems that work against the single version of the truth. Yet there are ample exceptions to this rule, such as the multi-platform RDBMS/HDFS architecture for DWs discussed earlier. As you expand into multiple types of analytics with multiple big data structures, you will inevitably spawn many types of data workloads. Because no single platform runs all workloads equally well, most DW and analytic systems are trending toward a multi-platform environment.

    9. Embrace all formats of big data, not just relational big data.

    Non-structured and semi-structured data types are daunting for the uninitiated, but they are the final frontier: the data your enterprise hasn't tapped for analytics. For example, human language text drawn from your website, call center application, and social media can be processed by tools for text mining or text analytics to create a sentiment analysis, which in turn gives sales and marketing valuable insights into what your customers think of your firm and its products. As another example, organizations with an active supply chain can analyze semi-structured data exchanged among partners (in, say, XML, JSON, RFID, or CSV formats) to understand which partners are the most profitable and which supplies are of the highest quality.

    Create a phased plan that eventually addresses all types of big data. You have to start somewhere, so start with relational data, then move on to other structured data, such as log files that have a recurring record structure. Carefully select a beachhead for unstructured data, such as text analytics applied to call center text in support of sentiment analysis. Look for mission-critical data that's semi-structured, as in the XML documents your procurement department is exchanging with partnering companies. Then continue down the line of big data types.
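    As a rough illustration of bringing semi-structured procurement data into an analyzable shape, the Python sketch below flattens one XML document into table-ready rows using the standard library. The document layout, element names, and attribute names are invented for the example; real partner documents will follow whatever schema the trading partners have agreed on.

```python
import xml.etree.ElementTree as ET

# Made-up procurement document; element and attribute names are hypothetical.
SAMPLE_ORDER = """
<purchaseOrder partner="Acme Supply">
  <line sku="A-100" quantity="10" unitPrice="2.50"/>
  <line sku="B-200" quantity="4" unitPrice="10.00"/>
</purchaseOrder>
"""

def flatten_order(xml_text):
    """Turn one semi-structured order document into flat, table-ready rows."""
    root = ET.fromstring(xml_text)
    partner = root.get("partner")
    rows = []
    for line in root.findall("line"):
        quantity = int(line.get("quantity"))
        unit_price = float(line.get("unitPrice"))
        rows.append({
            "partner": partner,
            "sku": line.get("sku"),
            "quantity": quantity,
            "line_total": quantity * unit_price,
        })
    return rows

rows = flatten_order(SAMPLE_ORDER)
order_total = sum(row["line_total"] for row in rows)
```

    Rows in this shape can be loaded alongside relational data, so spend by partner or by SKU becomes an ordinary aggregation.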

    10. Embrace big data in motion, not just big data at rest.

    Some forms of big data are generated continuously. For example, Web servers can capture every click of every website visitor and append information about these to logs. An RFID chip emits data every time it passes by an RFID receiver. Sensors mounted on mobile assets (e.g., trucks, railcars, shipping pallets) transmit valuable information about their route and environment.

    In a lot of ways, streaming data of this sort is the hardest form of big data to handle, because it takes special systems to capture and process the data in real time, such as systems based on complex event processing (CEP). Yet streaming data is worth the effort and expense when it delivers unique insights into business processes, and does so faster and more frequently than any other data source can.

    Many organizations start with streaming big data by capturing a stream and analyzing it offline. (Web log data is a common stream to start with, followed by various types of machine data.) Assuming the analysis corroborates that the stream contains valuable content, the next phase is to start processing the messages, events, and other data in the stream as they arrive in real time.
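    The second phase, processing events as they arrive, can be sketched in a few lines of Python. This is a toy sliding-window monitor, not a CEP engine: the class name, the 60-second window, the threshold, and the sample readings are all hypothetical. Each incoming reading updates a time window, and the handler reports whether the rolling average has crossed the threshold.

```python
from collections import deque

class SlidingWindowMonitor:
    """Keep readings from the last `window_seconds` and flag a high rolling average."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.readings = deque()  # (timestamp_seconds, value) pairs, oldest first

    def on_event(self, timestamp, value):
        """Handle one event as it arrives; return True if the rolling average is too high."""
        self.readings.append((timestamp, value))
        # Evict readings that have fallen out of the time window.
        while self.readings[0][0] <= timestamp - self.window_seconds:
            self.readings.popleft()
        average = sum(v for _, v in self.readings) / len(self.readings)
        return average > self.threshold

# Made-up sensor stream of (seconds since start, reading) pairs.
monitor = SlidingWindowMonitor(window_seconds=60, threshold=8.0)
stream = [(0, 4.0), (20, 5.0), (40, 9.0), (70, 10.0), (90, 11.0)]
alerts = [t for t, value in stream if monitor.on_event(t, value)]
```

    The same stream could first be written to a file and analyzed offline to choose a sensible window and threshold, which mirrors the capture-then-process progression described above.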

    Philip Russom is director of TDWI Research for data management and oversees many of TDWI's research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at [email protected], @prussom on Twitter, and on LinkedIn at linkedin.com/in/philiprussom.


    Driving the Next Phase of BI Innovation: Four Trends for 2014

    By David Stodder, Research Director for Business Intelligence, TDWI

    Heading into the New Year, three Cs dominate many discussions about business intelligence and self-directed discovery analytics: content, context, and collaboration. Users want to go beyond the traditional limits of BI and data warehousing systems to access and analyze big data, which includes unstructured and semi-structured content. Users also seek more than just the numbers; they want context to fill in the gaps. New requirements are making it necessary for business users and IT to establish better forms of collaboration to accomplish objectives, govern the data, and ensure overall performance.

    Here are four trends that I see guiding how organizations will approach BI and emerging visual, self-directed data discovery analytics deployments in the next 12 months. Note that given these trends, success with BI and analytics will increasingly involve innovation in both technology implementation and in solving the often more difficult people and organizational challenges.

    Trend #1: Expansion in users' consumption and direction of BI and analytics will force recalibration of the relationship with IT. IT's role in BI and analytics is and always will be vital, but the spotlight today is on business-driven BI and analytics. Widespread implementation of cloud computing and software-as-a-service for customer relationship management, project management, and other applications has opened the door to the movement of BI and analytics outside IT's direct control. Although data ownership and governance concerns require organizations to be cautious about who is doing what with the data, the clear trend is toward increased data consumption and self-directed analysis by a broad range of users.

    Often, the initial intent is not to create permanent business-driven systems. Built to be disposable, such shadow BI and analytics systems are to be shut down once a marketing campaign or particular analytical inquiry has run its course (of course, we all know how many such systems really die versus those that end up in IT's lap). Shadow systems can be beneficial to organizations as a place for experimentation with new and innovative technologies or practices that are not part of IT's repertoire. TDWI Research finds that one of the top reasons for business users to deploy self-service, business-driven BI and analytics systems is that IT lacks the experience and expertise to give users what they want (see Figure 1).

    The business-driven trend in BI and analytics means that business and IT must define a new level of collaboration. IT has a key role in governance, but IT must also excel at preparing and provisioning data for the growing population of data consumers and self-directed analysts. One key step in forging a new collaborative relationship is the adoption of agile development methods, which bring business users and IT developers together in small teams to work on projects designed to deliver continuous, incremental value. In the coming year, we will see a growing number of organizations either adopt agile methods or apply agile principles to update and improve collaboration in the development of BI and analytics systems.

    Figure 1. Source: TDWI Best Practices Report: Achieving Greater Agility with Business Intelligence, 2013.

    What are your organization's main reasons for implementing self-service BI and analytics?

    Users are requesting to do more on their own | 67%
    IT cannot keep up with changing business needs | 58%
    Users are going rogue and IT needs a comprehensive solution | 38%
    Current BI processes cannot adapt to test-and-learn analytic processes | 32%
    IT lacks adequate BI/analytics expertise | 31%
    Lack of IT budget or need to reduce IT's BI/DW budget | 28%
    Users need access to unstructured data sources and content | 27%
    We do not have a self-service BI initiative | 23%
    Poor quality of data in IT-managed BI reports | 18%


    Trend #2: BI and visual discovery implementations focus on improving business processes. In recent years, the biggest buzz in BI has been about self-service visual discovery tools. These have enabled nontechnical users to get beyond simple reports to investigate the "why" questions behind the numbers. The tools have accelerated the trend toward greater independence from IT for data access and BI dashboard creation. Although independence from IT offers pluses and minuses from a data governance perspective, enabling users to do more investigation and sharing of insights on their own is critical to developing a broader analytic culture and strengthening data-driven decision making.

    Improving user productivity has long been a key goal of BI. Self-service discovery can help organizations accelerate progress toward user productivity goals by more closely integrating analytic activities with business process workflow. Then, users can apply their discovery insights more directly to the tasks and automated procedures for which they are held accountable. Discovery analytics can improve situational awareness by enabling users to examine, in real time, patterns that they may detect through activity monitoring or alerting functions.

    In 2014, we will see leading BI and visual data discovery tool vendors focus technology releases on achieving a tighter integration with process workflow, role-based responsibilities and interaction, and common metadata across process and analysis applications.

    Trend #3: Organizations will expand the role of text analytics and enterprise search in users' BI and discovery tool sets. For most organizations, textual content still accounts for the lion's share of their data and most of what lies beyond data warehouse systems that manage structured relational data. In addition to documents, e-mails, customer satisfaction surveys, and other internal content, organizations are reaching out externally into social media sources and are applying text analytics to interpret customer sentiment.

    Along with text analytics, enterprise search tools are vital for exploration and navigation of content. Search-based discovery, using tagging or labeling to describe the data, is an important capability to have for finding content that exists outside of what is described in BI and DW systems' structured metadata. When integrated with BI systems, enterprise search can employ indexes to help users sift through and locate not only unstructured and semi-structured content, but also items in voluminous and numerous BI reports. Finally, enterprise search can help organizations gain a vital outside perspective, such as whether website visitors are finding information easily or whether refinements need to be made to content classification systems.

    Moving forward, organizations need to address user requirements to be able to view or access the full spectrum of data. In the new year, leading vendors will more fully incorporate search and text analytics tools and functions to expand and unify data access and analysis beyond the limits of existing BI and DW boundaries.

    Trend #4: Analytics will enable users to improve operational decisions through insights into broader patterns. Traditional BI applications often fall short of supporting operational decision makers because the systems exist in silos. They do not enable users to access data from outside single or small numbers of sources, and they limit views to simple reports and dashboards. Users can thus be blind to insights drawn from data scientists' analyses of multiple and diverse data sources that could be valuable to operational decisions and performance monitoring.

    Leading organizations are beginning to link advanced analytics focused on larger trends and patterns to users' BI and performance management systems so that insights drawn from analytics can be brought to bear on daily interactions with customers, patients, partners, or other individuals. They are also basing key performance metrics on sharp, detailed analytics rather than standard budget numbers and forecasts.

    A good example is the use of BI and analytics for population health management. Healthcare providers want to discover and track trends and patterns in larger populations and deliver analytic insights in real time so that practitioners can make better decisions at the point of care. Organizations also want to use knowledge of population health to predict and spot gaps in a patient's continuum of care from provider to provider so that all can be more proactive with patient treatments and avoid expensive emergency hospital visits.

    While providers and policy organizations concerned with population health must work within the confines of patient health information privacy regulations such as HIPAA that limit how electronic health record (EHR) data can be used, organizations are finding ways to implement predictive analytics across a broad range of sources to assess health risks and disease patterns. BI dashboards, including on mobile devices, will be an increasingly effective means of delivering actionable, real-time insights into larger patterns to improve the care of individual patients.


    User satisfaction: closer at hand

    BI technologies and practices are improving, enabling diverse users, from executives to frontline personnel, to be productive with more kinds of data. BI systems are delivering actionable insight from low-latency data for operations, while analytics applications are providing the means for deeper, more exploratory discovery. In 2014, some of the standard frustrations that have plagued users of BI and analytics applications should thankfully fade as tools become more user friendly, visual, universal, and faster. Of course, as new objectives are defined, new challenges will come to the fore. Business users and IT will need a strong partnership to overcome them.

    David Stodder is director of TDWI Research for business intelligence. He focuses on providing research-based insights and best practices for organizations implementing BI, analytics, data discovery, data visualization, performance management, and related technologies and methods. Stodder has provided thought leadership about BI, analytics, information management, and IT management for over two decades. Previously, he headed up his own independent firm and served as vice president and research director with Ventana Research. He was the founding chief editor of Intelligent Enterprise and served as editorial director for nine years. He was also one of the founders of Database Programming & Design magazine. You can reach him at [email protected], or follow him on Twitter: @dbstodder


    Even the world's top organizations make poor decisions when planning, selecting, and rolling out a business intelligence (BI) solution, mistakes that can be detrimental to BI success.

    In this white paper, updated for 2014, we detail the six worst practices in BI, and show you how to avoid them.

    Download your copy now to ensure a successful BI implementation in your own organization by learning from the mistakes of others.

    Top 6 Worst Practices in Business Intelligence

    Avoid these major pitfalls

    Download the White Paper: Updated for 2014

    WebFOCUS iWay Software Omni

    informationbuilders.com Get Social


    TDWI Research

    2013 TDWI Best Practices Report

    The State of Big Data Management

    By Philip Russom

    Status of Implementations for Big Data Management

    A number of user organizations are actively managing big data today, as seen in survey results. However, do they manage big data with a dedicated big data management (BDM) solution, as opposed to extending existing data management platforms? To quantify these issues, this report's survey asked: "What's the status of big data management in your organization today?" (See Figure 1.) The survey also asked: "When do you expect to have a big data management solution in production?" (See Figure 2.)

    Dedicated BDM solutions are quite rare, for the moment. Only 10% of respondents report having deployed a special solution for managing big data today. Most of these are very new (7%), whereas a few are relatively mature (3%), as seen in Figure 1. This is consistent with the 11% of respondents who already have a BDM solution in production, as seen in Figure 2.

    In the short term, the number of deployed BDM solutions will double. Another 10% of respondents say they have a BDM solution in development as a committed project, as seen in Figure 1. This is consistent with the 10% who say they will deploy a dedicated BDM solution within six months, as seen in Figure 2.

    Half of surveyed organizations plan to bring a BDM solution online within three years. In addition to the 10% within six months just noted, more solutions will come online in 12 months (20%), 24 months (19%), and 36 months (12%). If users' plans pan out, dedicated BDM solutions will jump from rare to mainstream within three years. But note that users' plans are by no means certain, because many projects are still in the prototyping or discussion stage (20% and 37%, respectively, in Figure 1).

    Few organizations don't need a special solution for managing big data. Just under a quarter report no plans at present for such a solution (23% in Figure 1); even fewer say they'll never deploy a BDM solution (6% in Figure 2).
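    The cumulative arithmetic behind these statements can be checked directly from the Figure 2 percentages. What follows is a minimal Python sketch, not part of the TDWI report: the dictionary keys and the helper name `cumulative` are illustrative assumptions, while the percentages are the ones reported in Figure 2.

```python
# Figure 2 responses, in percent of 461 respondents, as reported in the survey.
# The key names are illustrative labels, not the report's exact wording.
figure2 = {
    "already in production": 11,
    "within 6 months": 10,
    "within 12 months": 20,
    "within 24 months": 19,
    "within 36 months": 12,
    "in 3+ years": 22,
    "never": 6,
}


def cumulative(shares):
    """Return a running total of the percentages, in insertion order."""
    total = 0
    running_totals = {}
    for label, pct in shares.items():
        total += pct
        running_totals[label] = total
    return running_totals


running = cumulative(figure2)

# Share planning to go live between 6 and 36 months from now.
later_plans = sum(
    figure2[k] for k in ("within 12 months", "within 24 months", "within 36 months")
)

print(f"12 to 36 months: {later_plans}%")
print(f"In production within 36 months: {running['within 36 months']}%")
```

    Under these figures, the 12-, 24-, and 36-month responses sum to 51% (the "half" cited above), and 72% of respondents expect a solution in production within 36 months once the 11% already in production and the 10% within six months are included.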

    Strategies for Managing Big Data

    Different organizations take different technology approaches to managing big data. On one hand, a "fork in the road" decision is whether to manage big data in existing data management platforms or to deploy one or more dedicated solutions just for managing big data. On the other hand, some organizations don't have or say they don't need a strategy for managing big data. (See Figure 3.)

    Half of organizations have a strategy for managing big data. This is true whether the strategy involves deploying new data management systems specifically for big data (20%) or extending existing systems to accommodate big data (31%). One survey respondent selected "Other" and added the comment: "Our big data strategy is a core competency for our business."

    The other half doesn't have a strategy, for various reasons. Some don't have a strategy because they're not committed to big data (15%). "The business value is questionable," said one respondent. Others lack a strategy for managing big data, as yet, even though they know they need one (30%). "Once our POC completes, strategy can be defined."

    A lack of maturity can prevent a strategy from coalescing. One survey respondent added the comment: "We don't know enough yet to determine a strategy." Another commented: "Our data management is in a nascent stage. [It] needs to mature before a strategy becomes clear."

    As with many strategies, hybrids can be useful. According to one respondent: "[We'll use] a blend of extending existing [platforms] and deploying new [ones] in a hybrid mode." Another echoed that strategy, but turned it into an evolutionary process: "[We'll] extend existing systems now, and add new and better systems later."

    What's the status of BDM in your organization today?

    Deployed and relatively mature 3%

    Deployed, but very new 7%

    In development, as a committed project 10%

    Prototype or proof-of-concept under way 20%

    Under discussion, but no commitment 37%

    No plans for managing big data with a special solution 23%

    Figure 1. Based on 461 respondents.

    When do you expect to have a BDM solution in production?

    It is already in production 11%

    Within 6 months 10%

    Within 12 months 20%

    Within 24 months 19%

    Within 36 months 12%

    In 3+ years 22%

    Never 6%

    Figure 2. Based on 461 respondents.

    Which of the following best describes your organization's strategy for managing big data?

    Deploy new data management systems specifically for big data 20%

    Extend existing data management systems to accommodate big data 31%

    No strategy for managing big data, although we do need one 30%

    No strategy for managing big data, because we don't need one 15%

    Other 4%

    Figure 3. Based on 461 respondents.


    This article is an excerpt.


    Strategy should be part business, part technology. Ideally, BDM strategy should start with upper management, which determines that big data and its management support business goals enough that the business should in turn support big data management. Without this business strategy in place first, technology strategies for BDM are putting the cart before the horse.

    The Success of Big Data Management

    Managing big data successfully on a technology level is one thing. Managing big data so that it supports business goals successfully is a different matter. For example, the benefits of BDM noted in the discussion of Figure 3 include business goals such as more numerous and accurate business insights, greater business value from big data, and business optimization.

    To estimate metrics for these measures of success, this report's survey asked two related questions: "How successful has your organization been with the technical management of big data?" and "How successful has big data management been in terms of supporting business goals?" (See Figures 4 and 5.) Note that these questions were answered by a subset of 188 survey respondents (41% of the total respondents) who claim they've managed one or more forms of big data. Hence, their responses are strongly credible, as they are based on direct, hands-on experience.

    Big data management (BDM) is moderately successful for both technology and business. A clear majority of respondents feel BDM (which they've done hands-on) is moderately successful on both technology and business levels (65% in Figure 4 and 64% in Figure 5, respectively).

    This is good news, considering that BDM is a relatively new practice. It also suggests that BDM can balance both technology and business goals.

    Few consider BDM to be highly successful. This is the case for both technology (11%) and business (12%). No doubt, BDM will mature into higher levels of success.

    Roughly a quarter of respondents consider BDM to be not very successful. Again, this is true for both technology (24%) and business (24%). The lack of success in some organizations may be due to the newness of BDM. At this point in BDM, we've mostly seen organizations' first attempts and early implementation stages; as these mature, success ratings will likely improve.

    Philip Russom is director of TDWI Research for data management and oversees many of TDWI's research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at [email protected], @prussom on Twitter, and on LinkedIn at linkedin.com/in/philiprussom. The report was sponsored by Cloudera, Dell Software, Oracle, Pentaho, SAP, and SAS.

    How successful has big data management been in terms of supporting business goals?

    Highly successful 12%

    Moderately successful 64%

    Not very successful 24%

    Figure 5. Based on 188 respondents who have experience managing big data.

    How successful has your organization been with the technical management of big data?

    Highly successful 11%

    Moderately successful 65%

    Not very successful 24%

    Figure 4. Based on 188 respondents who have experience managing big data.


    tdwi.org/cbip

    CERTIFIED BUSINESS INTELLIGENCE PROFESSIONAL

    Get Recognized as an Industry Leader

    Advance your career with CBIP

    Professionals holding a TDWI CBIP certification command an average salary of $113,500, more than $8,200 greater than the average for non-certified professionals. (2013 TDWI Salary, Roles, and Responsibilities Report)

    TDWI CERTIFICATION

    Distinguishing yourself in your career can be a difficult yet rewarding task. Let your résumé show that you have the powerful combination of experience and education that comes from the BI and DW industry's most meaningful and credible certification program.

    Become a Certified Business Intelligence Professional today! Find out how to advance your career with a BI certification credential from TDWI. Take the first step: visit tdwi.org/cbip.


    TDWI Research

    2013 TDWI Best Practices Report

    Implementation Practices for Better Decisions

    By David Stodder

    Increasingly, implementation success rises and falls with users, not IT; dashboards, visual analytics, and discovery tools are giving users more control, enabling them to progress further on their own rather than depend on IT. This is important for large organizations where IT application backlogs are a problem; it is also a significant benefit for small and midsize firms that do not have extensive IT support for visual reporting and analysis. However, as always, with the advantages come new challenges.

    One of the most potent benefits is better communication. Our research makes it clear that performance management continues to be a vital initiative and that the associated dashboards are intended to be the centerpiece. In Figure 1, we can see that KPI definition and delivery is the most prevalent activity currently deployed for users through implementation of data visualization and visual analysis technologies (60%). Second and third highest are snapshot report creation (45%) and alerting/activity monitoring (44%). For all three of these activities, visualizations are critical in providing actionable insight; they enable executives, managers, and other users to focus on the situation at hand rather than having to tease out facts from data tables, ratios, and formulas.

    Visualizations enable new forms of collaboration on data. Many tools allow users to publish charts, not only in dashboards for viewers to share, but also through e-mail and collaboration platforms such as Microsoft SharePoint. Dashboards can deliver context for visualizations by providing annotations and related charts, since one chart often cannot tell the whole story. Other means of storytelling, including animation or video and audio files, may be part of the collaboration.

    Storytelling is important because visualizations are usually (and often intentionally) left open to interpretation. Different viewers can draw different interpretations, which they can investigate by drilling down into the data. Some charts may hide the importance of certain factors, while others might exaggerate them. This ambiguity makes it important for executives, managers, and users to work with visualizations as tools to engage in a productive dialogue about metrics and measures. Organizations can use visualizations to overcome the "one-way street" limitations often cited as the bane of performance management and standard BI reporting.

    Time series analysis is an important focus. A significant percentage of respondents implement visualizations for time series analysis (39%). Users in most organizations need to analyze change over time, and they typically use various line charts for this purpose. Some will also apply more exotic visualizations such as scatterplots for specialized time series analysis, including examining correlations over time between multiple data sources. Visualizations for pattern and trend analysis, often related to time series analysis, are employed by 35% of respondents.

    Time series, pattern, and trend analysis complement predictive analysis. Organizations want to use history to forecast what will happen next and identify what factors will cause patterns to repeat themselves. Almost a third (32%) of respondents use visualizations for forecasting, modeling, and simulation, and 22% are doing so for predictive analysis. Again, visualizations can improve vital collaboration on predictive analysis among different subject matter experts, who can share perspectives and help the organization adjust strategies to be proactive. The organization will anticipate events and be prepared with the most intelligent way to respond.

    Geospatial analysis and visualization

    The ability to superimpose data visualizations on top of maps is already a powerful asset for firms in industries such as real estate, energy, telecommunications, land management, law enforcement, and urban planning. As more location-based data from geographical information systems (GIS) becomes available, organizations in many other industries are also becoming interested in analytical capabilities. Retail firms, for example, can use the combination of business data and maps to determine where to locate stores; healthcare organizations can better understand patient behavior and disease patterns; insurance firms can use location analysis to improve risk management; and marketing functions in a variety of firms can overlay customer information and demographics on maps to sharpen messaging to different neighborhoods.

    Which of the following business analysis, reporting, and alerting activities are currently deployed for users in your organization through implementation of data visualization and visual analysis technologies? (Please select all that apply.)

    KPI definition and delivery 60%

    Snapshot report creation 45%

    Alerting/activity monitoring 44%

    Time series analysis 39%

    Pattern and trend analysis 35%

    Visual analysis of content 34%

    Forecasting, modeling, and simulation 32%

    Predictive analysis 22%

    Outlier, anomaly, or exception detection 21%

    Portfolio analysis 21%

    Quantitative modeling and scoring 20%

    List reduction 6%

    Figure 1. Based on answers from 408 respondents; respondents could select more than one answer.

    Although just under half (49%) of organizations surveyed are not currently implementing geospatial analysis, a significant percentage are implementing visualization for activities such as geographic targeting (35%), routing and logistics (14%), and finding nearest locations. Nearly a third (31%) of respondents seek to integrate geospatial with other types of analysis. The ability to visualize corporate data and advanced analysis such as time series along with location information can help organizations add a new dimension to business strategy and operational intelligence. Mapping visualizations can be enhanced with data to become geographical heat maps; these might show the most or least profitable sales territories or where customers are having particular kinds of service problems.


    This article is an excerpt.


    David Stodder is director of TDWI Research for business intelligence. He focuses on providing research-based insights and best practices for organizations implementing BI, analytics, data discovery, data visualization, performance management, and related technologies and methods. Stodder has provided thought leadership about BI, analytics, information management, and IT management for over two decades. Previously, he headed up his own independent firm and served as vice president and research director with Ventana Research. He was the founding chief editor of Intelligent Enterprise and served as editorial director for nine years. He was also one of the founders of Database Programming & Design magazine. You can reach him at [email protected], or follow him on Twitter: @dbstodder. The report was sponsored by Adaptive Planning, ADVIZOR Solutions, Esri, Pentaho, SAS, and Tableau Software.

  • BI Training Solutions: As Close as Your Conference Room

    tdwi.org/onsite

    TDWI ONSITE EDUCATION

    TDWI Onsite Education brings our vendor-neutral BI and DW training to companies worldwide, tailored to meet the specific needs of your organization. From fundamental courses to advanced techniques, plus prep courses and exams for the Certified Business Intelligence Professional (CBIP) designation, we can bring the training you need directly to your team in your own conference room.

    YOUR TEAM, OUR INSTRUCTORS, YOUR LOCATION.

    Contact Yvonne Baho at 978.582.7105 or [email protected] for more information.


    Ten Mistakes to Avoid When Delivering Business-Driven BI

    1. NOT SOLVING A REAL BUSINESS PROBLEM

    Technology is cool. Working on a BI project can be great for your résumé. Everyone is doing it.

    It's easy to get caught up in the hype, but these are not good reasons to launch a BI project. Maybe you already have several reporting and BI environments in place. Maybe you are starting fresh. Either way, too many organizations fail to seek real business problems to address. The BI project may be IT driven, and it is admirable to have an IT group that cares enough about the organization to want to build a BI solution. The intentions are good. However, no matter how much IT believes a project will benefit the organization, most projects will fail unless they are tied to problems the business community needs to address.

    Developing a new BI solution can benefit IT by reducing maintenance costs of the current reporting environment or replacing technology that is no longer supported. These are benefits to the overall organization, but they do not motivate business people to use new solutions that are provided.

    How can you find a real business problem to work on? Often there is a critical business need where data is not accessible or is only available to a few people with strong technical skills. This can be a great place to start. If there is no single, overriding concern, then more research is needed, typically through requirements gathering.

    Identify and prioritize business problems that need attention. Have an analyst from IT collect requirements. Conduct interviews or facilitated group sessions. (These should be business discussions about the challenges facing the organization.) Flesh out potential analyses and supporting data. Analyses that need the same data can be grouped together into themes. Now, hold a joint prioritization session to compare themes and evaluate business impact and feasibility (complexity, effort, and cost). Typically, there is a clear set of analyses and data that could have significant business benefit and can be delivered in an achievable project.
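    The joint prioritization session described in this article (comparing themes on business impact and on feasibility in terms of complexity, effort, and cost) could be supported by a simple scoring sheet. The Python sketch below is one hypothetical way to do that, not a method given in the article: the 1-to-5 scales, the equal weighting of the three feasibility factors, the impact-times-feasibility score, and the theme names and ratings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Theme:
    """A group of analyses that share the same data (hypothetical example)."""
    name: str
    impact: int      # business impact, 1 (low) to 5 (high); assumed scale
    complexity: int  # 1 (simple) to 5 (hard); assumed scale
    effort: int      # 1 (small) to 5 (large); assumed scale
    cost: int        # 1 (cheap) to 5 (expensive); assumed scale

    def feasibility(self) -> float:
        # Average the three difficulty factors and invert the result,
        # so that a higher value means a more achievable theme.
        return 6 - (self.complexity + self.effort + self.cost) / 3

    def score(self) -> float:
        # Assumed scoring rule: impact weighted by feasibility.
        return self.impact * self.feasibility()


def rank(themes):
    """Return themes ordered from highest to lowest score."""
    return sorted(themes, key=lambda t: t.score(), reverse=True)


# Invented themes and ratings, as a session might record them.
themes = [
    Theme("Supplier risk", impact=4, complexity=5, effort=5, cost=4),
    Theme("Customer churn", impact=5, complexity=2, effort=3, cost=2),
    Theme("Inventory turns", impact=3, complexity=2, effort=2, cost=2),
]

ranked = rank(themes)
for theme in ranked:
    print(f"{theme.name}: {theme.score():.1f}")
```

    A sheet like this only orders the discussion. The ratings themselves should still come from the business and IT participants in the session.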