Data Mart Consolidation: Repenting for Sins of the Past

White Paper

William McKnight
McKnight Consulting Group
www.mcknightcg.com



Contents

Part 1 Data Mart Consolidation (DMC): The Business Rationale
    Building the Case for Data Mart Consolidation
    The Benefits of the Program Approach to Data Warehousing
    Desired Outcomes of Data Mart Consolidation
    Approaches to Data Mart Consolidation

Part 2 The Interviews
    3M
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Delta Air Lines
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Michigan Department of Community Health
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Healthcare Insurance Company
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Royal Bank of Canada
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Major Telecommunications Company
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Anthem Blue Cross Blue Shield
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment
    Sekisui Systems Corporation
        The Pre-Consolidation Environment
        Reasons for Consolidation
        The Consolidation Project
        The Benefits Realized
        The Post-Consolidation Environment

Part 3 Best Practices for Data Mart Consolidation
    Best Practices for DMC
    Customer-Reported Keys to DMC Success
    Author’s Additional Keys to DMC Success

About the Author


Part 1: Data Mart Consolidation (DMC): The Business Rationale

Building the Case for Data Mart Consolidation

For much of the last decade, conventional theories surrounding decision support architectures have focused more on cost than business benefit. Lack of Return on Investment (ROI) quantification has resulted in platform selection criteria being focused on perceived minimization of initial system cost rather than maximizing lasting value to the enterprise. Often these decisions are made within departmental boundaries without consideration of an overarching data warehousing strategy.

This reasoning has led many organizations down the eventual path of data mart proliferation. This represents the creation of non-integrated data sets developed to address specific application needs, usually with an inflexible design. In the vast majority of cases, data mart proliferation is not the result of a chosen architectural strategy, but a consequence due to lack of an architectural strategy.

To further complicate matters, the recent economic environment and ensuing budget reduction cycles have forced IT managers to find ways of squeezing every drop of performance out of their systems while still managing to meet users’ needs. In other words, we’re all being asked to do more with less. Wouldn’t it be great to follow in others’ footsteps and learn from their successes while still being considered a thought leader?

The good news is that the data warehousing market is now mature enough that there are successes and best practices to be leveraged. There are proven methods to reduce costs, gain efficiencies, and increase the value of enterprise data. Pioneering organizations have found a way to save millions of dollars while providing their users with integrated, consistent, and timely information. The path that led to these results started with a rapidly emerging trend in data warehousing today – Data Mart Consolidation (DMC).

I’ve learned that companies worldwide are embracing DMC as a way to save large amounts of money while still providing high degrees of business value with ROI. DMC is an answer to the issues many face today. There is a way to cut BI costs and continue to deliver business value with BI. Others have done it and I’m going to share how they did it in this paper.

This paper details the process of DMC at eight different organizations while capturing the keys to success from each. These case studies were specifically selected to demonstrate several variations on the concept of consolidation. While there is no such thing as a “cookie-cutter” DMC process, there are common best practices and lessons to be shared.

The Benefits of the Program Approach to Data Warehousing

Tenets of sound business practices apply to data warehousing. One of these is the necessity to accomplish an objective in the most efficient manner. What is the most efficient way to accomplish data warehousing objectives?

It’s the way that builds a data warehouse to solve specific needs, but does so in a manner that leverages previous investment in the architecture, tools, processes, and people and does not prohibit future growth. This enables an efficient, programmatic approach to data warehousing created to serve information to the enterprise. By leveraging an integrated data warehousing approach you will realize efficiencies generated by economies of scale.


Efficiency as it relates to DMC comes in three primary forms. First, there are true cost efficiencies involving the hardware, software and personnel carrying costs of the environment and switching the costs over to a more manageable expense stream. Many in this study referred to these as “IT benefits” but lower Total Cost of Ownership (TCO) and economies of scale are business benefits as well. With one data warehousing program as opposed to many, fewer resources and processes need to be supported in an enterprise.

Secondly, there are efficiencies associated with having a “single version of the truth” to reference as opposed to engaging in internal “data warfare” or spending most of the “analysis” time searching for data or “making do” with undesirable, outdated data. As the interviews will attest, many companies were engaged in “data warfare,” but it’s not simply a matter of whose data is better. In many organizations, the best data is not accessible or the users are not trained on the access method. A central warehouse helps set aside the politics of whose data is better by establishing a consistent, trustworthy source of information. Creating a “single version of the truth” drives internal efficiencies by focusing resources on the value-added activities of business rather than data gathering activities.

Thirdly, there are system efficiencies to be gained by eliminating redundant processes. For example, although many are using the file delivery capabilities of operational systems to feed data to their data warehousing environment, getting data out of the source is still one of the most difficult tasks in data warehousing. Usually the first extract request is not met with “open arms.” A second or third one can be impossible. This leads many to a “single extract, many load” architecture which solves some problems but not others.
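The “single extract, many load” idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the row layout, the two in-memory lists standing in for data marts, and the function names `extract_once` and `load_many` do not come from any organization in this study.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


def extract_once(source: Iterable[Row]) -> List[Row]:
    """Read the operational source a single time and keep the rows."""
    return list(source)


def load_many(rows: List[Row], loaders: Dict[str, Callable[[List[Row]], None]]) -> None:
    """Hand the same extracted rows to every target's loader."""
    for loader in loaders.values():
        loader(rows)


# Hypothetical in-memory targets standing in for two separate data marts.
sales_mart: List[Row] = []
finance_mart: List[Row] = []

# Hypothetical operational rows.
source_rows = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "B", "amount": 75.5},
]

extracted = extract_once(source_rows)
load_many(
    extracted,
    {
        # The sales mart keeps every column.
        "sales": lambda rows: sales_mart.extend(rows),
        # The finance mart keeps only the columns it cares about.
        "finance": lambda rows: finance_mart.extend(
            {"order_id": r["order_id"], "amount": r["amount"]} for r in rows
        ),
    },
)
```

In this sketch the source is read only once, which eases the extract burden, but each target still holds its own copy of the data in its own shape. That is one plausible reading of why the architecture solves some problems but not others.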

Fortunately for those who have met the challenges, data warehousing has proved itself time and time again as a valid conduit for delivering data and data analysis into business processes and thereby improving them while helping the company achieve its stated goals. DMC allows organizations to reap the benefits of integrated, centralized data warehousing while delivering significant cost savings through internal efficiencies. In essence, it is the grand slam of IT initiatives.

Desired Outcomes of Data Mart Consolidation

Data warehousing is a process, not a project, and a journey rather than a destination. This applies to DMC as well. The case studies below represent several forms that DMC can take including merging data marts into a new warehouse, picking an existing warehouse/mart and merging other warehouse/marts into it, and moving analytical functionality from other databases onto a data warehouse. The consolidation itself can leverage existing designs and re-route Extract, Transform & Load (ETL) processes into the consolidated warehouse or consolidate designs as well as the platform.

This paper provides a framework of DMC reference points, lays out options for DMC and provides best practices for those considering, planning or doing some form of DMC.

Approaches to Data Mart Consolidation

Approaches and steps to DMC as well as maturity levels with DMC emerged from the interviews.

1. Rehosting – The process of picking up database designs and ETL “lock, stock and barrel” and moving them to a different platform in an effort to gain either performance or cost advantages. Often the rehosting will be done onto a platform with existing data constructs, thereby expanding the utility of the platform.


2. Rearchitecting – The process of merging database designs and therefore the data acquisition strategy for the data as well. Rearchitecting may involve picking the best model components from various models and/or it may involve more zero-based approaches, starting from scratch, that use requirements as the basis for the new model.

Part 2: The Interviews

3M

The Pre-Consolidation Environment

3M is a multi-faceted company that had a data mart environment which represented its diversity. Before the consolidation, they had 40 major data marts, several smaller ones, and some previous failed attempts at a data warehouse in the environment. Previous attempts at a more encompassing data warehouse had proved to be too constrained and inflexible to make it very far so a data mart environment had perpetuated over the years.

The marts were solving numerous business objectives including decision support, financial and sales reporting. There were 25 different platforms in place, “just about everything” according to Al Messerli, the former Director of the Enterprise Information Management Group at 3M and now with Allen Messerli Enterprise Systems, LLC. This included many UNIX, some Windows NT and some mainframe systems.

Altogether, it was many terabytes and the environment had grown tremendously over 30 years so obviously it began well prior to market acceptance of data warehousing. This was a firmly entrenched environment many years in the making and it was going to be a challenge to consolidate it!

ETL was being done mostly through “pushes” from the operational environments with data pickup and movement through proprietary methods. There were also all kinds of data access tools and methods deployed in the pre-consolidated environment.

As a result, major subject areas of sales, product, and customer were duplicated across these data marts and not in a consistent manner. As a matter of fact, the main reason for the consolidation was the inconsistent results and inability to get a corporate-wide view of customers, which was creating enormous business pain. Without this one face to the customer, 3M was unable to get complete customer information due to the distributed nature of the data.

Reasons for Consolidation

Another major reason for consolidation was a very large opportunity to reduce environment carrying costs by eliminating data marts. 3M did a complete financial impact analysis of the consolidation. ROI was expected to be $20M per year! Indirect expense reductions from internal efficiencies were also projected to accrue. In addition, the consolidated warehouse was expected to help meet market and customer penetration as well as sales growth goals.

The project was made very visible to the user community. “Everybody knew” according to Messerli. The idea for consolidation was primarily from certain individuals in IT. Cultural resistance was faced and a year-long sell cycle from the C-level throughout the organization was required.

Ironically, most of the resistance was from others in IT. The business saw the benefits more readily. Culture needed to be substantially changed to make it work for the enterprise and this required lots of selling “from the top-down and bottom-up.”


This took the form of chalk talks, hands-on sessions, user groups and data trusteeship (a form of data stewardship). With 40+ data marts, there was a huge need to provide many mart users with a comfort level around a data warehouse environment and the concept of data sharing.

Data security would actually be improved with the ability to apply a consistent security policy at the data warehouse level and implement business unit specific security around subject areas.

The Consolidation Project

Since it was impossible to pick one from among the 40 marts to use as the conduit for the data warehouse, 3M built a brand new data warehouse from scratch to accomplish its consolidation objectives. It was going to be totally comprehensive, with atomic level detail on all business subject areas and constructs from the existing marts incorporated over time so the mart platforms could be retired.

The ETL was completely redesigned. In building the new warehouse, 3M made sure the new environment would include all the old functionality and then some. They did some zero-based analysis around business needs for a warehouse and how to construct the warehouse. It turned out that no pre-existing subject area in any mart was selected to move verbatim into the new warehouse.

The extract load on the source systems was not materially affected by the DMC since those systems had mostly been programmed to push ample data out previously and this was not changed. Furthermore, they were previously extracting detailed data so that was maintained.

3M normalized the new data model. The data warehouse team did extensive data comparisons between the legacy marts and the new warehouse to demonstrate that the data warehouse was correct (or if the numbers were different, that the data warehouse was “better”).
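The kind of comparison described above can be sketched in a few lines of Python. This is an illustration only, not 3M's actual method: the paper does not say how the comparisons were run, and the sample rows, the `region` and `sales` column names, and the tolerance value are all assumptions. The sketch sums a measure per key on each side and reports the keys whose totals differ or are missing.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]
Mismatch = Tuple[str, Optional[float], Optional[float]]


def totals_by_key(rows: Iterable[Row], key: str, measure: str) -> Dict[str, float]:
    """Sum a numeric measure per key value (for example, sales per region)."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[str(row[key])] += float(row[measure])
    return dict(totals)


def compare_totals(
    legacy: Dict[str, float],
    warehouse: Dict[str, float],
    tolerance: float = 0.01,
) -> List[Mismatch]:
    """Return (key, legacy_total, warehouse_total) for every disagreement."""
    mismatches: List[Mismatch] = []
    for key in sorted(set(legacy) | set(warehouse)):
        old, new = legacy.get(key), warehouse.get(key)
        if old is None or new is None or abs(old - new) > tolerance:
            mismatches.append((key, old, new))
    return mismatches


# Hypothetical rows from one legacy mart and from the new warehouse.
legacy_rows = [
    {"region": "East", "sales": 100.0},
    {"region": "East", "sales": 50.0},
    {"region": "West", "sales": 200.0},
]
warehouse_rows = [
    {"region": "East", "sales": 150.0},
    {"region": "West", "sales": 210.0},
    {"region": "North", "sales": 30.0},
]

report = compare_totals(
    totals_by_key(legacy_rows, "region", "sales"),
    totals_by_key(warehouse_rows, "region", "sales"),
)
# report == [("North", None, 30.0), ("West", 200.0, 210.0)]
```

Each reported difference then has to be explained: either the warehouse is wrong and gets fixed, or the team can show why the warehouse number is the “better” one.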

Each migration was a separate project and in total it took several years to get the functionality of all 40 into the warehouse. All 40 marts are now gone. Data outages were managed with parallel runs, causing only glitches in a very complex undertaking.

The team had top-down support after the year-long sell cycle for the effort and a “no choice” budget allocation back to business units.

The Benefits Realized

The benefits indeed were “many and large” and exceeded investment by quite a few times over. Benefits came in many business areas including procurement, finance, sales, marketing, supply chain and e-business.

The Post-Consolidation Environment

The consolidation is now complete. 3M chose Teradata for the data warehouse platform. Teradata was deemed to be the only solution that scaled to the eventual size and users they would have in a consolidated environment comprised of hundreds of source systems, 5,000 tables, and 20,000 daily users. Scalability was the major driver behind this decision.

They now have a consolidated and manageable set of data access tools and do ETL “one way.” The data warehouse is now 15 TB of total disk space and has over 10,000 users. The marts were eliminated. Many of their platforms were obsolete according to Messerli.


The environment continues to evolve with more business functions, subject areas, users, and subsidiaries coming on board. The new warehouse environment has opened up the data to channel partners and customers on a self-service basis.

Corporate mandates support the shared, centralized warehouse concept now and 100% of ongoing data warehouse efforts go into the centralized, mission-critical data warehouse.

Top 3 keys to DMC success:

1. Getting complete buy-in from executives and throughout the organization

2. Good data standardization and a good data model

3. Good user tools to help facilitate user buy-in

Delta Air Lines

The Pre-Consolidation Environment

Delta had three databases called data warehouses by their users. All three were on Teradata and served Financial, Marketing and Flight data interests, respectively. There were only 50 users in total for all the warehouses.

The Financial warehouse was used for financial analysis. The 12 users primarily accessed the 100 GB warehouse with a modern data access tool. The Flight data warehouse supported revenue management – the effectiveness and profitability of flights. Its 12 users accessed the 700 GB warehouse primarily through a data mining tool.

The largest of the warehouses was Marketing. It was used to look at frequent flyer information in order to adjust and judge the effectiveness of marketing programs. The 500 GB were accessed with both a modern data access tool and a data mining tool. None of the warehouses leveraged a packaged ETL tool.

Reasons for Consolidation

Ticket, flight and financial data were duplicated in the pre-consolidation environment and they were materially inconsistent in their representation of this data. This approach didn’t provide an accurate, consistent view of the same subject. This was not specifically traced to negative ROI impact but there was a general feeling of dissatisfaction and data disagreement within the user community.

There were separate staffs for each warehouse. A goal of consolidation was to bring the warehouse under one group, which caused consternation. Typically IT groups were functionally aligned and were the single points of contact for the business units. Consolidating caused different groups (functional and warehouse) to be making contact with the users and this had to be managed. Additionally, there was conflict over which tool to use and when to use it. There was a desire to get to a standard tool set and develop a training program to help the casual user.

The main reason for consolidation was not cost savings, but was to get to an “enterprise view – a single source of the truth” according to Wayne Hyde, former IT Vice President at Delta Air Lines and now with Reflection Technologies. This would eliminate competition regarding whose data is best, which was previously left to IT to figure out. The consolidated warehouse would help put people on a common goal instead of being in competition.


Bottom line improvement had to be demonstrated by getting data in the hands of lots of people besides the financial analysts. “If only 60 people have access, they will be overworked. But get the data to hundreds of thousands of people who can engage the data in an adhoc fashion at the time they are performing business processes, they can exploit the data to perform better and impact costs, processes, fraud and recover revenue” according to Hyde.

IT did the analysis of corporate pain points and decided on DMC. The stated goal of the project was not the end-all data warehouse, but focused on consolidating the 3 existing warehouses and “let the future chips fall where they may.”

Several “IT” benefits were also expected including saving machine cycles by loading one copy of the data (vs. many), redeploying people to more productive value-adding work as opposed to redundant work, and better leveraging machine capacity. For example, during the DMC process, it was determined that different groups were trying to perform the same analysis!

In order to get DMC going, Delta took an ROI view of inefficiencies, redundancies, and software licenses. They did not establish quantifiable business ROI objectives for the initial transition, but asked the business for the ROI when determining what priority to train users for the new warehouse.

The Consolidation Project

Pleased with Teradata to date, Delta stuck with Teradata for the consolidated warehouse. The initial step was to consolidate platforms and copy the data warehouse designs for Flight and Financial data onto Marketing’s platform.

Once standard tools were selected, the team used zero-based analysis of business requirements to define data warehousing needs. The users overwhelmed the data warehouse team with demand.

However, according to Hyde, “Replatforming reports is like trading cars but still using the car for the same routes. It might be a nicer car but it does nothing for ROI – just psychological benefits. You’ve got to provide some kind of incremental capability. One is changing the dimension of timeliness. There are some benefits from data marts but you still have different business units making decisions with different [...] People need to look at the negative impacts of data marts.”

Delta ended up with multiple development teams organized under a central data warehouse team. They had a business specific team that did specific reports, ad hoc analysis and dashboard building. The platform consolidation took 18-24 months and yielded 60% to 70% of the enterprise view, the rest of which would be added over time.

Interestingly, they did not do parallel runs with the older warehouses. They just cut over after the platform movement and dealt with any issues. Extract loads on the source systems actually increased over time since the new data warehouse identified needs over and above those that the previous warehouses uncovered.

The Benefits Realized

There are numerous benefits cited for the consolidation but a good example is in Revenue Management. Delta Air Lines was able to contest tens of millions of promotional dollars that were claimed by travel agents. This analysis was made possible through a consolidated environment with a common view of the data. However, the real value was giving access to data to hundreds of thousands of people, not just a select few.


The Post-Consolidation Environment

The consolidation is complete and the two warehouses that were consolidated are history. The DMC of the three warehouses also led to a total of 27 marts being eliminated. Delta Air Lines is focused on its architected data warehouse now, which is 4 TB of usable data on Teradata and uses an ETL tool in places with an entirely different data access tool than before.

Users were consolidated from the Finance and Flight data warehouses and the user base has grown over time to 4,000 users.

Top 3 keys to DMC success:

1. Having a strategic vision of where you are going from an enterprise view of the data

2. Having a delivery of new capabilities, not just the old. Need NEW capabilities to establish new points of memorable value to be tied to the effort.

3. Senior level understanding of the vision (sponsorship)

“Miss any one and you can be dead. If you have the strategy without the sponsorship, you can get started but not finish. If you have strategy without delivery, you’ll be condemned” according to Hyde.

Michigan Department of Community Health (DCH)

The Pre-Consolidation Environment

Starting in 1994, DCH began storing Medicaid paid claims on their data warehouse, which maintains 5 years’ worth of paid claims. They have 1.2 million Medicaid recipients and the majority of claims are paid through managed care. In 1998, they also started receiving encounter data, which are records of interactions between members and care providers, and have accumulated 66 million encounter data records to date.

David McLaury is the Director for Project Development and Implementation. The Department of Community Health represents the largest user of the data warehouse environment in Michigan.

The State of Michigan operates an enterprise data warehouse, which multiple state agencies utilize. It is a Teradata implementation. The department also operates a number of Oracle operational databases that were being used for analytical work in addition to operational needs. Users did not and could not have robust data access tools due to how the tools would interfere with the system’s primary operational purpose.

Reasons for Consolidation

The main reason that these operational databases were consolidated into the data warehouse was to provide better query and analytical capabilities. By consolidating these databases onto the warehouse, they are now also able to move information onto a new data mart, which uses a MedStat schema and is also run by the department.

The idea for the DMC came from business needs. McLaury chairs a departmental committee that oversees the project and approved the DMC. The user community was actively involved in the consolidation, including acquiring the necessary federal funding to support the project.


One goal for the DMC was to create an integrated data warehouse environment that they could manageably add onto over time and was available to department managers for all kinds of programs, not just those known initially.

There was concern about losing control and especially about security. Data owners must sign off on new users and these users must sign usage agreements. These programs helped assure that owners still felt like owners and alleviated cultural resistance.

The Consolidation Project

DCH created new data flows from the databases to the data warehouse. The additional data and emphasis on the data warehouse supported additional data cleansing activity. Data requirements were re-gathered and analysis was done on the requirements to understand what operational data was required for analytical purposes.

Some database redesign was necessary in the move but some legacy designs were good enough, even for analytical purposes. The consolidation will take 2 years and is being done by stepwise movement of the data from the operational databases into the data warehouse.

The Benefits Realized

The benefits of DMC have been broad-based, especially in analytical areas. An example is the ability to cross-compare Medicaid paid claims and encounter data to other departmental data sets.

The users are still adjusting to having access to more data. While many still take the approach of accessing the same data as before only in a different database, access to multi-source data for program purposes will over time provide the biggest benefits of the DMC. The more data is added, the more benefits will grow. This will include data such as long-term care, nursing facilities, mental health services, substance abuse services, and dental services over the next year.

The Post-Consolidation Environment

The Oracle operational databases were and are still available, but they are not nearly as attractive for analytical purposes because the data warehouse is now available with clean, integrated, and historical data modeled for access and analytics. Reporting is also being moved to the data warehouse. The data warehouse (combined with the MedStat data mart) is 500 GB with 270 users.

BULL is the state’s contracted entity for the Teradata warehouse. This decision was originally made through competitive bids. As a scalable platform available to a variety of leading tools, Teradata was kept in place for the added data the DMC brought into the data warehouse.

Top 3 keys to DMC Success:

1. Leadership and agreement that you have to do DMC

2. Show the ROI for DMC before and after

3. Have sufficient funding for the effort


Healthcare Insurance Company

The Pre-Consolidation Environment

Before the merger of the two companies that formed this health care insurance company, there was a mainframe data warehouse at one and a Teradata data warehouse at the other. The Teradata data warehouse actually acted more like an Operational Data Store (ODS) in that its data was immediately available to users after the data was generated in the operational systems. After the merger, this Teradata data warehouse became the feeder system for the mainframe data warehouse.

Eventually, both of the systems gave way to a new Teradata data warehouse – one destined to be this company’s consolidated data warehouse. In addition, there is still another data warehouse in the environment that is not yet part of the consolidation effort.

Prior to any consolidation, this company had three different ETL processes, three sets of definitions, some of the same data in three places, some critical data missing from the warehouses, and customer tracking being done in multiple data warehouses.

There was “extra everybody effort with extra cost on users – joining data from different systems and learning different systems” according to the Director of the Data Warehouse. Each data warehouse had different reconciliation masters (one to the general ledger, one to invoices and one to cash). So the data did not easily reconcile and there was a cost associated with bringing it all together.

In most cases, the atomic level detail was captured everywhere although the summaries and some minor aspects were different between the warehouses. For example, the Financial data warehouse has 90% currency-type fields so there are shorter records but it is still detailed. There were also homonyms and synonyms if you looked across the warehouse environment, which created confusion for the users who frequently had to access data across different warehouses to accomplish a business objective.
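To make the homonym and synonym problem concrete, here is a small Python sketch. The warehouse names, field names, and meanings in `FIELD_MAP` are invented for illustration and are not taken from this company's environment. A synonym is one business meaning stored under several field names; a homonym is one field name carrying several meanings.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical mapping: (warehouse, field name) -> agreed business meaning.
FIELD_MAP: Dict[Tuple[str, str], str] = {
    ("claims_dw", "member_id"): "member_id",
    ("finance_dw", "subscriber_no"): "member_id",
    ("claims_dw", "amount"): "billed_amount",
    ("finance_dw", "amount"): "paid_amount",
}


def find_synonyms(field_map: Dict[Tuple[str, str], str]) -> Dict[str, Set[str]]:
    """Meanings that appear under more than one field name."""
    names: Dict[str, Set[str]] = defaultdict(set)
    for (_warehouse, field), meaning in field_map.items():
        names[meaning].add(field)
    return {meaning: fields for meaning, fields in names.items() if len(fields) > 1}


def find_homonyms(field_map: Dict[Tuple[str, str], str]) -> Dict[str, Set[str]]:
    """Field names that carry more than one meaning."""
    meanings: Dict[str, Set[str]] = defaultdict(set)
    for (_warehouse, field), meaning in field_map.items():
        meanings[field].add(meaning)
    return {field: found for field, found in meanings.items() if len(found) > 1}


synonyms = find_synonyms(FIELD_MAP)
homonyms = find_homonyms(FIELD_MAP)
```

With the sample mapping, `member_id` shows up as a synonym pair and `amount` as a homonym. A user joining data across warehouses has to know both facts to get a correct answer, which is the confusion described above.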

Reasons for Consolidation

While direct carrying cost reduction was expected, this was not the most important or the largest benefit. Although originally perceived as IT cost savings (because IT came up with the project idea), DMC was positioned to provide business benefit. IT savings alone would not have justified it.

The architectural goal of an enterprise-wide data warehouse was made very visible to the user community. They had a business owner of the project and a steering committee. Interestingly, there were more privacy issues when the data resided on 3 different data warehouses than there were after the initial consolidation was completed!

The Consolidation Project

The DMC thus far has consisted of rehosting the (former) mainframe data warehouse to Teradata for performance reasons and also feeding a separate schema from the ODS-like data warehouse – two separate schemas for the two pre-merged organizations – but at least sitting in the same Teradata instance. This allowed the pre-merger Teradata data warehouse to focus solely on the organization’s needs for an ODS.

The parties involved recommended benchmarking to make sure the chosen DMC environment would perform as advertised. Although they’d had Teradata for almost 10 years, they ran benchmarks prior to confirming its selection for the consolidated environment. Teradata solved an immediate pain point by delivering a 5-fold performance increase compared to the mainframe data warehouse.
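A benchmark of this kind comes down to timing the same workload on both platforms and taking the ratio. The Python sketch below shows the arithmetic only. The function names and the stand-in workload are assumptions, and the 50-second and 10-second figures are made-up numbers chosen to reproduce a 5-fold result, not measurements from this company.

```python
import statistics
import time
from typing import Callable


def median_runtime(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Run a workload several times and return the median elapsed seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate runtime must be positive")
    return baseline_seconds / candidate_seconds


# Illustrative numbers only: a query taking 50 s on the old platform
# and 10 s on the new one is a 5-fold improvement.
example = speedup(50.0, 10.0)

# Timing a stand-in workload; a real benchmark would submit the same
# representative queries to each platform instead.
elapsed = median_runtime(lambda: sum(range(100_000)), repeats=3)
```

The median is used rather than the mean so that one unusually slow run does not distort the comparison. That is a design choice of this sketch, not something the paper specifies.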

Moving the existing ETL streams, access environments and database designs to the consolidated platform was the first step of the DMC. Most of their data transformation happens in the mainframe operational environment anyway, so the Extract and Transformation stayed the same. Only the Load changed for the DMC. The number of extracts has been reduced, however, as a result of the consolidation. To ensure integrity, a parallel run of about three months for each pre-consolidated warehouse occurred.
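The parallel-run check described above can be illustrated with a minimal sketch. It assumes each environment can export the same load cycle as a list of rows; the field names (`claim_id`, `paid_cents`) and the sample rows are hypothetical and are not taken from the organization's systems. The sketch compares only row counts and a control total, which is one simple way to compare two environments and not necessarily the method this organization used.

```python
def summarize(rows, amount_field):
    """Return (row count, control total) for a list of dict rows."""
    return len(rows), sum(row[amount_field] for row in rows)


def parallel_run_matches(legacy_rows, consolidated_rows, amount_field="paid_cents"):
    """True when both environments report the same row count and control total."""
    return summarize(legacy_rows, amount_field) == summarize(
        consolidated_rows, amount_field
    )


# Hypothetical extracts of the same load cycle from each environment.
# Amounts are stored as integer cents to avoid floating-point drift.
legacy = [
    {"claim_id": 1, "paid_cents": 12050},
    {"claim_id": 2, "paid_cents": 8025},
]
consolidated = [
    {"claim_id": 2, "paid_cents": 8025},
    {"claim_id": 1, "paid_cents": 12050},
]

print(parallel_run_matches(legacy, consolidated))
```

Matching counts and totals do not prove that every row is identical, so a check like this would normally be a first gate before any row-level comparison.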

The consolidation of the third data warehouse (previously mentioned as outside the scope of consolidation thus far) and the redesign of the schemas remain to be accomplished. So, while many of the challenges in the pre-consolidation environment have been met by the DMC efforts to date, there is still much work to be done.

The Benefits Realized

The larger benefits for this DMC came from the business perspective, specifically more timely data for making better decisions and the ability to turn around requests quickly by not having to reconcile data and prove use of the “right” data. An example of this is profiling providers and determining whether members are being treated appropriately. This was improved upon by consolidating the data warehouse environment.

The Post-Consolidation Environment

The mainframe cycles were re-dedicated to OLTP-type work. The warehouse is providing detailed data to support complex and diverse user queries in a manageable way. There will be more to this DMC story since it is not complete. Stay tuned.

Top 3 keys to DMC Success:

1. High levels of business customer support – it’s not all IT

2. Know going into a DMC that you are fixing a business problem

3. Benchmark to determine the best platform to use

Royal Bank of Canada (RBC) Financial Group

The Pre-Consolidation Environment

RBC Financial Group had a 2.5 TB data warehouse along with several predominant data marts, some of which pre-dated the data warehouse. These marts ran on heterogeneous database platforms. These data marts were loaded from a combination of source systems, flat files and the enterprise data warehouse. There were numerous ways to load and access data, different staffs for the different marts and the warehouse, and a variety of vendor tools deployed to access the data.

Systems and Technology within RBC Financial Group conducted a health check on the data warehouse environment. As a result, the decision was made to transform to a hub and spoke environment, which would simplify the ETL and processing as well as optimize resource utilization. According to Mohammad Rifaie, the Group Manager of Information Resource Management at the RBC Financial Group, “Data integration is absolutely critical to create a ‘Single Version of the Truth’ whereby all business information/data is unified and shared across all functional departments. This enterprise-wide view of our customer behavior along with operational data will allow for analysis and insight that was not possible before. A consistent, single view of our data should improve sales, reduce operational costs, increase customer retention and satisfaction, and ultimately lead to maximized profitability.”

Reasons for Consolidation

Technological constraints imposed by existing multiple processing platforms made it very difficult to share data. As a result, much data was replicated. This also resulted in duplication in resources and processes, which led to a higher cost of ownership and a greater potential for inconsistency. An impartial assessment of the data warehouse environment by an analyst group advocated consolidation onto a Teradata platform if RBC Financial Group was to reduce costs, improve the effectiveness of the environment and realize their strategic objectives.

Besides prohibitively higher operating costs, different processing environments prevented RBC Financial Group from leveraging all sources of information. The data stored in independent data marts usually encompassed one or two subject areas (sales, marketing, customer service) and failed to provide an integrated environment that allowed the various pockets of information to be shared and leveraged across the organization. RBC Financial Group had top-down support and strong executive sponsorship for the effort. Both were cited as keys to success.

Although it was not stated that the new data warehouse would be the final architecture for data warehousing, that’s how it worked out. RBC Financial Group now will only have a physical mart for geographical purposes. “DMC is like having a rearview mirror AND a front windshield” according to Rifaie.

The Consolidation Project

The first step was to port the data to the single platform, then “rationalize” the data, removing duplicate data and unneeded ETL. They did not redesign initially – they “forklifted” the existing designs. Then they redesigned and rearchitected. They’ve just finished the redesign of the client subject area, which is the most widely used, and are now redoing the ETL to load the new tables and removing the legacy constructs. Arrangement, Product and other subject areas will be done this way as well.

RBC Financial Group chose an existing platform to consolidate onto. They had analyst help in choosing the solution for their DMC and they chose Teradata due to multiple areas of savings and benefits including high availability and reliability. They had 99.995% availability in the 7 previous years with Teradata, which Rifaie says is “built for data warehousing and they have compression and economical indexing. TCO is low for Teradata.” Cultural challenges were overcome by keeping the focus on TCO and nothing else.

The technical re-porting took 4 months (with one more mart to go). The team had to “steal machines cycles whenever they could – after midnight, weekends, etc.” to keep from impacting user environments. There was no impact on the source systems for DMC since the systems put out files for data mart/warehouse environment pick-up (both before and after the DMC).

Parallel runs with the legacy marts and warehouse lasted 1 month after the queries were converted to the new warehouse, during which time they were able to obtain written sign-off from every client in the data warehouse. The nodes, disk, and software that the marts and warehouse resided on were then deployed elsewhere.

The Benefits Realized

“Data Warehousing is about repenting for the sins of the past” according to Rifaie. “The data warehouse is corporate memory. Redundant data is difficult to control. In a data mart, the primary key might be a numeric identifier column but it might be different in another mart where it might be dual-columns. It will be problematic to join the data from these two.”
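The key mismatch Rifaie describes can be made concrete with a small sketch. Everything in it is hypothetical: the mart contents, the two-column (branch, account) key, and the crosswalk table are assumptions used for illustration, not RBC structures. It shows one common way to join two marts whose keys differ, which is to map one key onto the other through an explicit crosswalk and to report any keys that cannot be mapped.

```python
# Hypothetical rows: mart A keys clients by a single numeric identifier,
# mart B keys the same clients by a two-column (branch, account) pair.
mart_a = {101: {"name": "A. Client"}, 102: {"name": "B. Client"}}
mart_b = {
    ("TOR", "0007"): {"balance": 2500},
    ("MTL", "0042"): {"balance": 900},
}

# Assumed crosswalk from the two-column key to the numeric key.
crosswalk = {("TOR", "0007"): 101, ("MTL", "0042"): 102}


def join_marts(mart_a, mart_b, crosswalk):
    """Join mart B rows to mart A rows through the crosswalk.

    Returns the joined rows keyed by the numeric identifier, plus a list
    of mart B keys that could not be matched.
    """
    joined, unmapped = {}, []
    for composite_key, b_row in mart_b.items():
        client_id = crosswalk.get(composite_key)
        if client_id is None or client_id not in mart_a:
            unmapped.append(composite_key)
            continue
        joined[client_id] = {**mart_a[client_id], **b_row}
    return joined, unmapped


joined, unmapped = join_marts(mart_a, mart_b, crosswalk)
print(joined)
print(unmapped)
```

Maintaining a crosswalk like this for every pair of marts is the kind of recurring cost that a single consolidated key removes.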

For example, once the Business and Personal Marketing data marts are on a single Teradata platform with the EDW, there will be additional revenue and cost-avoidance opportunities. This will be followed by a subsequent data rationalization project to eliminate unnecessary data and process duplication between the EDW and the data marts.


DMC also positions RBC Financial Group to handle new data and business requirements more effectively. These include business centricity, effortless scalability, high user concurrency, ease of access, complex and ad hoc query performance, data centralization, fast fail-safe data load utilities, capability to handle multiple subject areas, open access, integrated metadata, generic modelling, data-source neutrality, and software addressing all critical components of the architecture.

By consolidating data marts and the enterprise data warehouse onto the same platform, RBC Financial Group has been able to improve overall profitability by:

• Lowering the total cost to own, operate, and expand the data warehouse environment

• Reducing the requirement for scarce and expensive skill sets

• Enabling data integration across functional areas

• Improving efficiency in making data available to meet changing business requirements

• Providing an enterprise-wide “single version of the truth” spanning from customer information to actionable data

• Facilitating easier implementation of Data Governance and Privacy and Confidentiality

• Shortening the supply chain for data access so they can see a client’s complete relationship to the bank in one place, which has helped improve client relationships

The Post-Consolidation Environment

There is one ETL tool with one way to do ETL now. The data warehouse is 3 TB and supports 2,500 to 3,000 users.

Top 3 keys to DMC Success:

1. Base a business case on real savings. Architecture doesn’t sell. Avoid technical terms. Build the case on savings of FTEs, operations and strict TCO.

2. Make sure to obtain support of business partners at the highest level.

3. Make sure to communicate with users about schedules and changes. Do hand-holding. You may need to change their queries for them. Get sign off. Have communication surveys and parallel runs.

Major Telecommunications Company

The Pre-Consolidation Environment

The pre-consolidation environment had over 70 “reporting systems” which served business unit-specific purposes. There was nothing that, from a 10,000-foot perspective, resembled a data warehouse. None of the marts had enough of a footprint to be considered “major” from the big picture perspective.

Each mart had a “handful” of users, and people used whichever marts they had informally learned held the data they needed and to which they could get access. Their choices were not always best for their needs, but without an organized approach, this was the environment.


The reasons for the marts were as vast and numerous as the platforms. The environment had grown over “centuries” and, without central management, it was difficult to tell how large the environment really was, although it could easily be surmised to be multiple terabytes. Only when an inventory was done for evaluating DMC opportunities did this company realize the extent of the problems.

ETL was hand-coded since individual projects could not justify the purchase of a tool. Data access tools and methods were numerous and not very robust. Additionally, support staffs were numerous and not dedicated since they resided in business areas.

Major subject areas were duplicated across the pre-consolidation environment but more importantly business functions were also duplicated. Not only were they duplicated, as you would imagine with so many marts, they were inconsistently duplicated. The main problems were data duplication and confusion as opposed to inconsistent representation problems.

Reasons for Consolidation

The idea for DMC actually came from the business side. This project had to focus on direct expense reduction as the main key to success. This meant achieving the goal of reducing technical support and maintenance requirements. Top-down support for the anticipated savings helped them deal with cultural resistance, which can be the age-old conflict between centralization and decentralization. Interestingly, privacy was no more an issue in the new environment than it was in the old environment.

The Consolidation Project

They built a new data warehouse on Teradata to absorb all the data marts. They rerouted the existing streams but added others that were necessary for the consolidation. They’ve done “a little redesign as we go and we’ll see when we’re done if more is necessary” according to the leader of the effort.

The ETLs and database design were changed, but like several other DMCs, they relied on source system file outputs so the extract load on the source systems was not affected much. For a few of the mart consolidations, parallel runs of 1 month were done.

The complete consolidation of all the marts, which is still occurring, will take approximately three years in total. Phase 1 delivered 22 of the data marts and took two years, which included scoping, planning and financial analysis. They have picked up momentum and experience and anticipate finishing the remaining 48+ in the next year.

The Benefits Realized

The users have acknowledged that the new tools are better and there are many benefits, especially in the financial reporting environment. They get financial insight they never had before, especially into their quote-to-cash cycle. They can now make more immediate decisions with consistent data usage, less data latency and less redundancy. Overall, it’s a more efficient business operation.

The Post-Consolidation Environment

The Teradata warehouse and standardized ETL and data access tools dominate the new environment, which was consolidated around this one set of tools. There are 2.3 TB of usable disk, which will be doubled by project’s end. There are 5,000 users and the data-using community has greatly expanded with this project.

Teradata was chosen for the DMC since it was already being used for some internal applications and they were not sure how well alternative products would hold up. Scalability was “Very important. With Teradata you know you can easily keep adding nodes, but with SMP, you can add CPUs but you get to diminishing returns. They start battling among themselves if you get too many of them.”

Normal growth has occurred in the warehouse since it went into production, but they are more focused on bringing the other marts in, not advancing what they’ve already brought in. While some of the marts still exist, they serve no purpose. All will be gone soon. The success of this DMC is assured.

Top 3 keys to DMC Success:

1. Executive buy-in

2. Proper data management and data modeling techniques

3. A team that is knowledgeable with the chosen toolsets

Anthem Blue Cross Blue Shield

The Pre-Consolidation Environment

Anthem’s consolidation was a result of the merger of Blue Cross Blue Shield companies in Indiana, Kentucky, Ohio, and Connecticut in the period 1993-97. There were three incompatible data warehouses that needed to be brought together to provide a consolidated view of the business.

Each Blue Cross Blue Shield plan used its data warehouse for pricing, understanding treatments, some fraud and abuse, group reporting, utilization, underwriting, provider contracting and affairs management, exposure, mandatory government reporting, and experience analysis. The warehouses were initially implemented in the several years prior to the merger, some as far back as 1990.

Two were mainframe warehouses and one was on Teradata. Each was hundreds of gigabytes in size. “In a way we were fortunate that each state had a data warehouse. Everyone was used to using a data warehouse and there being a data warehouse around, but also each state had its own representations of data, its own technologies and its own way of doing things” according to the author.

ETL was done with COBOL code and there were numerous data access tools and methods. Many of these methods were of the heavy-lifting, programming variety such as Visual Basic, Microsoft Access, Q&E, PowerBuilder, and CLIST applications. Major subject areas were duplicated in the environment since each data warehouse was developed with not only different staffs, but different staffs in different companies. The models were vastly different.

For example, the representation of a customer was by policy in one, customer ID in another and something else in the third. This inconsistent representation worked for each independent company but when the companies got together, this presented problems.

Reasons for Consolidation

DMC was sold mainly on the idea of integrating data to get cross-company views from which Anthem would have much richer data for doing functions like fraud detection and claims re-routing to best-of-breed providers. The idea came from a combination of IT, the business, and Teradata. Each state was vested in their representations and the use of their data. Each was a $1B+ organization so this was not a small effort.


The project was made very visible to the user community. The chief actuary of the consolidated company, who represented much of the usage, was also the executive sponsor.

Still, there was cultural resistance, which was eased by keeping the existing data warehouses alive while the new data warehouse was built. “There are always privacy issues when dealing with healthcare data. Some state specific data is not accessible by all – only by personnel in that state. Even though the companies were merged, it would take quite some time to merge all business processes” per the author.

The Consolidation Project

The goal of the DMC was to establish one data warehouse that would, by default, receive data from all data warehouses in Blue Cross Blue Shield plans that were being merged with Anthem.

Ohio’s was the most recently built data warehouse. It was built on Teradata and the chief sponsor of the project was Ohio’s chief actuary in the pre-consolidation environment. For this reason, and the good experience to date with Teradata, it was selected as the platform for the ADW (Anthem data warehouse). Anthem began bringing data in from the other data warehouses. New streams were created for the Indiana and Kentucky plan data and each subject area went through the process of data element comparison, logical modeling, database design, code value comparison, data transformation, and implementation for the design.
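The code value comparison step named above can be sketched briefly. The code sets below are hypothetical and do not come from any Anthem plan; the function name and report layout are likewise assumptions made for illustration. The sketch compares two sources' code tables and lists codes that exist in only one source or that carry different meanings in each, which are the cases that need a decision before data transformation.

```python
def compare_code_values(source_a, source_b):
    """Compare two {code: meaning} tables from different sources.

    Returns codes found only in one source and codes present in both
    whose meanings disagree.
    """
    codes_a, codes_b = set(source_a), set(source_b)
    shared = codes_a & codes_b
    return {
        "only_in_a": sorted(codes_a - codes_b),
        "only_in_b": sorted(codes_b - codes_a),
        "conflicting": sorted(c for c in shared if source_a[c] != source_b[c]),
    }


# Hypothetical code tables for the same data element in two plans.
plan_a = {"M": "Male", "F": "Female", "U": "Unknown"}
plan_b = {"M": "Male", "F": "Female", "U": "Unspecified", "X": "Not collected"}

report = compare_code_values(plan_a, plan_b)
print(report)
```

Each entry in the report would then be resolved in the consolidated model, for example by choosing one meaning for a conflicting code and mapping the other source to it during transformation.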

The database design, ETL processes, and access environments were changed. Each subject area was redesigned to represent the single version of the truth. Anthem wanted one scalable data model for absorption of new data sources or new Anthem, Inc. acquisitions. “Being able to access, review, analyze and share data across the company made all the difference between success and failure” according to the author.

The consolidation took about one year, although this included other normal development and creation of value-added functionality. Top-down support, combined with parallel runs to smooth the transition, helped overcome cultural resistance.

The Benefits Realized

There were many benefits of the consolidated data warehouse. Some were related to the fact that there was consolidation of the prior data warehouses and some are related to the ongoing developments on the data warehouse.

The DMC helped Anthem win new business and generate income through the flexibility and reporting capabilities it provided. The cost of care was lowered by $250M annually by using the ADW to identify patterns in the data that allowed Anthem to build better networks and craft the network reimbursement arrangements in different ways. The ADW was instrumental in reducing the cost of products for policyholders and members (i.e., pay VALID bills ONCE).

The ADW is used to ensure practitioners are licensed to perform, and Anthem was able to negotiate lower costs from providers by dealing with them based on their profitability as determined by the data warehouse. Anthem was also able to reduce Caesarian section rates, improve the results from coronary bypass surgeries and improve staff productivity. “I don’t think much of this could have been accomplished without a single version of the truth” according to the author.

The Post-Consolidation Environment

The initial consolidation of the 3 states into 1 data warehouse is complete.


The multi-terabyte Teradata warehouse has hundreds of users. The former warehouses were still in place well after the DMC given their new role of feeding the ADW. Other Blue Cross Blue Shield plans still need to be brought in so this is a work-in-progress.

Teradata was chosen because of Ohio Blue Cross Blue Shield’s good experience with Teradata and its known scalability. If the chosen solution had been unable to handle the large workload, the shared concept would have died and Anthem would have stayed with separate data warehouses, which means they wouldn’t have gained half of what they did with the ADW – and would have wasted millions!

Top 3 Keys to DMC Success:

1. Strong, active executive sponsorship keeping the project out of internal politics

2. Source the data warehouse from operational systems, not existing data warehouse/data marts

3. Create a program with standards and processes

Sekisui Systems Corp.

The Pre-Consolidation Environment

Sekisui had seven data marts distributed to branch offices fed from a central data warehouse. These supported a variety of business functions such as increasing the frequency of effective customer calls by saving time to create meeting materials, by automating the sales cycle, and by providing information directly to selected customers.

The environment had grown over time both in data size and number of users. Before the DMC, the marts had 110 GB of total disk space with 66 GB used. There were 400 users. Despite the number of marts, they managed to keep consistency among the DBMS, the ETL, and data access for all of them. They also had only 1 DBA for all seven marts, but there were still efficiencies to be gained from DMC.

Reasons for Consolidation

One anticipated benefit was cost reduction by consolidating the machines from seven branch offices. It was time to replace some of these anyway due to obsolescence, further opening the door to DMC.

Another benefit was to unify the system operation and further standardize the operating skill of the enterprise to the platform they could grow with. “Since the Teradata warehouse was already constructed, we wanted to standardize the operating skill on Teradata by consolidating the data marts to Teradata” according to Masaaki Kondo, Director of the Corporate Group Systems Division at Sekisui.

The ROI was estimated by comparing the costs of continuing to license the branch office machines to the cost of a consolidated approach. The System Operating Department Manager (IT) came up with the idea for DMC at Sekisui. There wasn’t any cultural resistance.

The Consolidation Project

Sekisui consolidated onto the existing Teradata data warehouse by redesigning the entire system. Extract loads on the operational systems were reduced with the DMC. The project took 6 months, just as expected.


The Benefits Realized

The project is complete and the data marts for the sales database systems are consolidated. The planned benefit, direct expense reduction, was achieved. Support costs were reduced and all access is now against the data warehouse.

The Post-Consolidation Environment

Scalability was crucial for data expansion. Concurrency was also immensely important since usage concentrated around 9 a.m. system-wide.

If the DBMS were unable to handle the workload, Sekisui would be cut off from information on members’ daily sales activities and division managers’ sales results. All organizations in the Sekisui group using the system would be affected.

Top 3 keys to DMC Success:

1. Create an organized data warehouse (not data marts) which is best suited to your goals

2. Educate the end users on the project and secure their agreement

3. Unify codes and subjects in a consolidated environment


Part 3: Best Practices for DMC

Key Findings from the Interviews:

1. The number of marts/warehouses consolidated ranged from 3 to 70 with a median of 7.5.

2. The majority of environments had duplicate and inconsistent data across the pre-consolidation environment.

3. The primary reason for DMC varied, with very strong opinions for the reasons cited! Five cited a business rationale, such as creating a consolidated view of customers, as the main reason, while three cited IT cost reductions.

4. All performed at least some manner of rearchitecting, although several made this a later stage step that came after rehosting.

5. Except for the case where the consolidated databases had operational functions to perform in the environment as well, only one kept the consolidated marts/warehouses in the environment after the DMC. The old platforms were redeployed to other uses or, in most cases, eliminated.

6. Every DMC was made very visible to the user community. These projects required a great deal of support, which most received from the highest levels of the organization. It was not possible to accomplish DMC objectives in a skunkworks manner.

7. Very few user data access outages were reported. Most DMC programs took great caution to transition users smoothly to the new environment.

8. Five programs credited IT with the idea for the DMC. The other three credited the business with the initiative.

9. All said scalability was important to the data warehousing decision. Many referenced the sudden increase in data and users that the warehouse would be taking on after the DMC as putting scalability on the top of the criteria list.

10. Almost every DMC faced some degree of cultural resistance to the idea of consolidating and centralizing. Most of this was adeptly dealt with through attaining top-down support and cultivating user interests throughout the project. The majority of resistance went away as soon as early benefits of the DMC were realized.

11. DMC efforts caused little change in the impact on operational systems.


DMC can be used to put in place a scalable, integrated, multi-application data warehouse that absorbs all analytical-type activity in an organization or it can be used to “simply” get an antiquated system out of the environment by moving its function to a system still under support from its vendor.

Regardless of the ambition, many DMC efforts eventually lead to the first goal. The act of initiating the consolidation idea within an organization seems to spawn more and more consolidation.

For those organizations that are considering DMC and will have the opportunity to plan for its success, some best practices emerged from the interviews as well as from anecdotal evidence. The keys are also applicable to newer data warehouse efforts or those being revamped to a centralized data warehouse environment.

Customer-Reported Keys to DMC Success

1. Get top-down support. This was cited as the #1 key to success in 5 of the cases and was a top 3 key in all but one case.

2. Fix a problem. Whether you justify on cost savings or a business benefit (or both), the DMC should fix a major, known problem that can be quantified in business terms.

3. Have data standards and a sound data model.

4. Pick the right tools and platform. Put DMC on a scalable platform. The data volume managed within a single database will instantaneously explode with DMC, and future efforts will continue to grow the environment. Also note that many took this opportunity of changing platforms to also change data access and ETL tools.

5. Set expectations and communicate with users. There is no such thing as over-communication in a DMC project. This is about the users, and care needs to be taken to migrate the users without any disruption in their ability to access data.

Author’s Additional Keys to DMC Success

1. Don’t just rehost, rearchitect. This time of transition is also an opportunity to reevaluate the data warehouse program according to established best practices – a time to evaluate what is and isn’t working and fully take advantage of the new platform and the migration process.

2. Starve the pre-consolidated marts of attention and resources. Negotiate the condition for user signoff prior to DMC. Make sure all utility is removed from the marts.

3. Justify on either platform cost savings, business benefits or both. The larger the project, the more difficult the technical challenge of DMC and the more evident the platform cost savings. It is always easiest to justify on cost savings but business benefit based on delivering new capabilities can be significant.

4. Expect and plan for cultural resistance. Ownership, as a concept in the former environment, may now be designated at a subject area level as opposed to a data mart level. Carry forward security and stewardship designations and responsibilities to the consolidated data warehouse. This may even be a time to improve these programs.

5. Consolidate ETL and access tools too. Part of re-gathering requirements for a DMC is taking the opportunity to ensure tools are still compatible with the new platform and are the most fit for purpose.


About the Author

William McKnight is founder and president of McKnight Consulting Group, a consulting firm specializing in data warehousing solutions. William is an internationally recognized expert in data warehousing and MDM with more than 15 years of experience architecting and managing information and technology services for G2000 organizations.

William is a frequent and highly rated speaker at major worldwide conferences and private events, providing instruction on customer intimacy, return-on-investment, architecture, business integration, and other business intelligence strategic and architecture issues. He is a well-published author and a columnist in Information Management for the column “Information Management Leadership.”

A regularly featured expert on data warehouse/business intelligence and MDM at major conferences, William is widely quoted on data warehouse and has been featured on several prominent expert panels. An expert witness, skills evaluation author and a judge for best practices competitions, William is the former executive of a recognized best practices information management program.

5960 West Parker Road
Suite 278, #133
Plano, TX 75093
(214) 514-1444
www.mcknightcg.com