white paper: distributed data quality


Upload: alan-d-duncan

Post on 27-Jan-2015


DESCRIPTION

In this new paper, I explore the organisational and cultural challenges of implementing information governance and data quality. I identify potential problems with the traditional centralised methods of data quality management, and offer alternative organisational models which can enable a more distributed and democratised approach to improving your organisation's data. I also propose a simple four-step approach to delivering immediate business value from your data.

TRANSCRIPT

Page 1: WHITE PAPER: Distributed Data Quality

1 IAIDQ and University of Arkansas at Little Rock, “Understanding how Organisations Manage the Quality of their Information and Data Assets” November 2012, p21

2 Gartner Inc. “Measuring the Business Value of Data Quality (G00218962)” October 2011, p3

3 TDWI Research “Self-Service Business Intelligence” Q3 2011, p32

In a world where technology is the fundamental enabler in our everyday lives, we find that human activity is becoming ever more complex. We’re living in a mobile, connected, graphical, multi-tasking, object-oriented, cloud-serviced world, and the rate at which we’re collecting data is showing no sign of abatement.

Distributed computing, increased personal autonomy, self-norming organisations and opportunity for self-service were meant to lead to better agility, responsiveness and empowerment. The trade-offs are in the forms of dilution of knowledge, hidden inefficiencies, reduced commitment to discipline and rigour, and unintended business consequences.

And still poorer data quality.

Distributed Data Quality

A centralised approach isn’t the only way of achieving better data outcomes

Author - Alan D Duncan

Fast Facts

• 70% of companies recognise data as a strategic asset. 1

• Data quality affects overall labour productivity by as much as 20%. 2

• Data quality inhibits successful implementation of self-service Business Intelligence solutions in 55% of cases. 3

In response to these challenges, companies have invested to create a focused capability to manage their data quality, with traditional approaches typically relying on a centralised, bureau-style service function.

Data Quality Analysts undertake systemised data profiling, engage with business users to capture and then encode the quality rules and criteria, and then provide ongoing monitoring and cleansing.

This comes with an expectation of implementing an (expensive) core data quality software platform, developed and administered by a technical team of Data Quality specialists, that incorporates a single set of organisation-wide rules on a one-size-fits-all basis.

The traditional approach to addressing Data Quality Issues

Centralised Data Quality

[Figure: Centralised Data Quality. A common process model and IM Governance Controls span business units A, B and C.]

In these examples A, B and C represent the different departments, business units or entities that contribute data to the enterprise.

While each business unit may operate separately, there is significant commonality across the whole business lifecycle.

Customers, product lines and service channels are inter-related and cross-business effectiveness is enhanced by close co-operation.

FEATURES: Approaches & processes for Information Management and Data Quality need to be held in common. One set of controls and policies is put in place for Information Management throughout the organisation.

GOOD FOR: Very hierarchical organisations where there needs to be a significant amount of information sharing between departments and business units. This is typical in organisations where information sharing and re-use is required at the detailed level.

LIMITATIONS:

• The business knowledge needs to be transferred from the expert business user to a second party.

• As business rules change, there’s limited flexibility.

• Documentation is maintained separately from the rules base.

• Programmatic encoding of agreed rules makes them difficult to maintain and enhance in the longer term.

• Can’t (easily) deal with ad-hoc data sets, reducing flexibility and inhibiting creativity and experimentation.

• Overall, this means there’s a significant opportunity for the business community to abdicate its responsibility for its data.

• The addition of a new business unit or a new process requires significant system modifications: either the business unit’s process is modified to fit the organisation, or a new implementation is needed to accommodate the variance within established programs.

• The up-front investment in organisational and technology capability can be significant, while the benefits can be hard to quantify.


4 Forrester Consulting, “Information Governance, Turning Data Into Business Value” October 2011, p6

5 IAIDQ and University of Arkansas at Little Rock, “Understanding how Organisations Manage the Quality of their Information and Data Assets” November 2012, p17

6 TDWI Research “Next Generation Data Integration” Q2 2011, p21

7 Aberdeen Group, “Data Management for BI”, December 2010, p7

8 IAIDQ and University of Arkansas at Little Rock, “Understanding how Organisations Manage the Quality of their Information and Data Assets” November 2012, p37

9 Information Week 2014 Analytics, BI & Information Management Survey, November 2013, p12

CENTRALISED DATA QUALITY EXAMPLE – UK TELCO & MRS. WHITEHOUSE

A particular UK Telecommunications company had chosen to implement a centralised approach to managing their customer data.

The Customer Data team was the only group with access to create and update customer details.

Sometimes, anomalies and errors still crept into customer data (there wouldn’t be a need for a Data Quality team if that weren’t so!).

Common cases are where data entry for a sales contract may not have been completed before the ordered item was shipped, when the hand-written details on the “New Customer” form were not fully legible, or when the billing price on a customer’s invoice was wrongly calculated.

One repeating error was when the name and email addresses for new prospects got mis-typed. Most often this was an inadvertent problem caused by “sticky fingers”, but on rare occasions there could be a more deliberate and malicious cause.

Like the one time when someone in the Customer Data team took a complaint call from a customer and decided that they should respond by substituting an “S” for the “W” in the billing details for “Mrs. Whitehouse”. With no further checking and no other business user with direct access to the customer data file, the change went through un-verified.

And the follow-on customer service call was even more robust...

By and large, the data management industry and software vendors have perpetuated the view that a centralised approach is required, and that it is the only way to implement Data Quality Management. However, there are organisational cultures that don’t respond well to (or don’t work with) a centralised approach.

When it comes to reviewing the overall organisational operating models for Data Quality Management, the question is somewhat different. It’s not so much about the hierarchical situation of the core team; it’s more to do with the overall approach to bringing the various data groups together in a cross-functional, enterprise-wide approach.

Rather than being a hierarchical or functional issue, it will be the social and cultural characteristics of an organisation that will be the deciding factors in determining which approach to adopt.

Additionally, in a world where change is now the norm and technology enablers foster new environments, the speed with which we can integrate new data sources into our information architecture becomes a competitive advantage.

Alternative Models of Governance – there’s more than one approach

Fast Facts

• 74% of companies are planning an information governance project in the near future. 4

• Less than a quarter of companies have met most or all of their data quality goals. 5

• Fewer than 40% of businesses co-ordinate their data quality practices with their data integration capability. 6

Fast Facts

• 42% of best-in-class companies consider building a data-driven decision culture to be the most important strategic action that they can take. 7

• Mature organisations put almost twice as much effort into data quality activities as the general business population. 8

• 59% of businesses say data quality problems are the biggest barrier to successful analytics or Business Intelligence initiatives. 9

Each organisation will of course have its own dynamics, cultural constraints and behavioural norms that influence the way the business runs; these are often not formally recognised or dealt with, even in organisations that have well-documented Mission or Value Statements.

This paper looks at two additional broad categories of alternative organisational and cultural models that can have an overarching influence on the approach to establishing a Data Quality environment:

• Federated Data Quality

• Distributed Data Quality


Federated Data Quality Management

[Figure: Federated Data Quality Management. Shared Information Management Principles sit above business units A, B and C.]

In these examples A, B and C represent the different departments, business units or entities that contribute data to the enterprise.

In the Federated model each unit operates autonomously and may have very different approaches & processes. However, each executes the same overall functionality & responsibilities as the others.

In this environment core policies are maintained at a corporate level, allowing for local interpretation in their implementation.

FEATURES: There are shared guiding principles & objectives for Information Management. Guidelines cascade down, allowing local teams to add to or amend existing principles and to execute in support of their own requirements. There is flexibility and an ability to adapt quickly to changing circumstances, and control is shared between necessary corporate guidelines and necessary local interpretation.

GOOD FOR: Geographically diverse organisations with a loose hierarchy. Information sharing is appropriate at a high level (themes, approaches, learning). This would suit organisations where significant differences in the operational environment exist or rapid change in the environment is regarded as business as usual, but where the overall high-level objectives are held in common.

LIMITATIONS:

• Requires some cultural commitment to the measurement and improvement of data at an organisational level.

• Can be confusing as to where the responsibility lies for the delivery at a practical level.

• Can create an environment in which nobody is responsible for the outcome.
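The Federated pattern described above, with core policies held at corporate level and interpreted locally, can be pictured as a shared set of defaults that each unit may amend. The following is a minimal sketch only: the policy names, values and the `effective_policy` helper are illustrative assumptions, not part of any product or of the paper's method.

```python
# Corporate-level principles (hypothetical names and values).
CORPORATE_PRINCIPLES = {
    "customer_name_required": True,
    "max_days_since_review": 365,
    "postcode_pattern": None,  # deliberately left for local interpretation
}

# Local interpretations for business units A, B and C (all hypothetical).
LOCAL_INTERPRETATIONS = {
    "A": {"postcode_pattern": r"\d{4}"},
    "B": {"postcode_pattern": r"\d{5}", "max_days_since_review": 180},
    "C": {},  # unit C adopts the corporate principles unchanged
}


def effective_policy(unit):
    """Combine corporate principles with one unit's local amendments."""
    policy = dict(CORPORATE_PRINCIPLES)               # start from shared defaults
    policy.update(LOCAL_INTERPRETATIONS.get(unit, {}))  # apply local overrides
    return policy


policy_b = effective_policy("B")
```

Keeping the corporate defaults and the local amendments in separate structures makes it visible which unit has departed from which principle, which may help with the responsibility question raised in the limitations above.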

Distributed Data Quality Management

Distributed Data Quality provides organisations with the ability to put data quality at the front line of their business, shifting responsibility to the people who best understand the data that’s coming into their business and allowing them to deal with the issue of bad data before it impacts their systems.

In the distributed model, there is little or no commonality within business units, customer base and product lines, with each operating fully autonomously. Local decision-making is paramount.

The Distributed Data Quality environment enables the knowledge holders to manage their respective data elements within a framework that ensures organisation-wide enablement.

FEATURES: Very localised business rules. High levels of autonomy. Few, if any, dependencies for data exchange.

GOOD FOR: Organisations that are operationally diverse and functionally independent, with little or no need for sharing information between silos. Also suits organisations that want to take a staged approach to implementing Data Quality and to achieve an early result, compared with the longer timescales of a centralised or federated implementation.

[Figure: Distributed Data Quality Management. Business units A, B and C each maintain their own Information Management Principles (A, B and C respectively).]

In these examples A, B and C represent the different departments, business units or entities that contribute data to the enterprise.

DISTRIBUTED DATA QUALITY EXAMPLE – National Food Distribution Cooperative

The Cooperative aids its independent members with centralised administration, including marketing, advertising, finance advisory and IT services. These services are provided across a broad portfolio of stakeholders including:

• 60 Members

• 120 Suppliers

• 29 hosted ERP Systems

In the distributed Data Quality implementation, the cooperative’s stakeholder areas encompass the following different collection types:

• Consignment Goods within the cooperative’s Warehouses

• Supplier Submitted Data from Members

• Collection of Non-hosted ERP and Sales Information

• Reporting to NSW Government for Participation in Delivery Contracts

Each of the differing types of information is used for a different purpose, and the specific characteristics of that collection and its subsequent validation are determined by the party using the information.
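One way to picture validation that is determined by the party using the information is a registry in which each owner supplies the rules for its own collection type. This is a hedged sketch: the collection-type keys, field names and the `owned_validator` decorator are assumptions made for illustration, not details of the cooperative's actual implementation.

```python
# Registry mapping each collection type to the validator its owner supplied.
VALIDATORS = {}


def owned_validator(collection_type):
    """Decorator that registers a validation function for one collection type."""
    def register(func):
        VALIDATORS[collection_type] = func
        return func
    return register


@owned_validator("consignment_goods")
def check_consignment(record):
    # Hypothetical warehouse rule: a location and a non-negative quantity.
    return record.get("warehouse") is not None and record.get("quantity", 0) >= 0


@owned_validator("government_reporting")
def check_report(record):
    # Hypothetical reporting rule: contract and period must both be present.
    return bool(record.get("contract_id")) and bool(record.get("period"))


def validate(collection_type, record):
    """Apply the owning party's rules; fail loudly if nobody owns the type."""
    validator = VALIDATORS.get(collection_type)
    if validator is None:
        raise KeyError(f"no owner has registered rules for {collection_type!r}")
    return validator(record)


ok = validate("consignment_goods", {"warehouse": "W1", "quantity": 3})
```

Raising an error for an unregistered collection type is a design choice in this sketch: it surfaces data sets that nobody has yet taken responsibility for.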


How to Bring it all together

Success with data quality is never guaranteed. However, data quality improvement does not need to be daunting and a lot of progress can be made quickly.

There are several simple steps that you can take to deliver immediate value:

[Figure: four steps (Collect, Validate, Protect, Monitor) supported by common Data Quality processes and a collaboration workspace.]

Collect your data

Make an inventory of the data that is useful to your business process – where it resides, what purpose(s) it serves, what format it comes in, who is the responsible person for the data set, and the access methods. Collate your data sets into an accessible workspace where data can be manipulated, corrected and distributed.
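As an illustration of the Collect step, the inventory can start as a simple structured list with one entry per data set. The sketch below is an assumption-laden example: the `DataSetEntry` fields mirror the attributes named in the step, while the sample data sets and the `missing_details` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSetEntry:
    """One inventory row; field names are illustrative assumptions."""
    name: str
    location: str          # where the data resides
    purposes: List[str]    # what purpose(s) it serves
    data_format: str       # what format it comes in
    owner: str             # the responsible person for the data set
    access_method: str     # how the data is reached


def missing_details(entry):
    """Return the names of inventory fields that are still blank."""
    return [attr for attr, value in vars(entry).items() if not value]


inventory = [
    DataSetEntry("customer_master", "CRM database", ["billing", "marketing"],
                 "SQL table", "J. Smith", "read-only view"),
    DataSetEntry("supplier_prices", "shared drive", ["purchasing"],
                 "CSV", "", "file copy"),
]

# Data sets whose inventory entry is incomplete, with the fields to chase up.
incomplete = {e.name: missing_details(e) for e in inventory if missing_details(e)}
```

Here `incomplete` flags that `supplier_prices` has no responsible person recorded, which is the kind of gap worth closing before moving on to validation.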

Validate your data

Perform test profiling on the data to identify potential issues and establish validation rules. Collaborate with colleagues to prototype and share business rules, to ensure that there is common understanding of any potential data problems. Based on the testing results, identify potential root causes and develop an action plan to ensure that data quality problems are addressed at source.
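The Validate step might look something like the following minimal sketch, which profiles one column and then applies a small set of named rules that colleagues could review and share. The sample records, column names and rule definitions are all assumptions for illustration; real rules would come from the business users who understand the data.

```python
import re

# Hypothetical sample records; None marks a missing value.
records = [
    {"customer_id": "C001", "email": "ann@example.com", "postcode": "2000"},
    {"customer_id": "C002", "email": "bob(at)example.com", "postcode": "2010"},
    {"customer_id": "C003", "email": None, "postcode": "20x0"},
]


def profile(rows, column):
    """Basic profile of one column: row count, missing count, distinct values."""
    values = [r.get(column) for r in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def email_has_valid_shape(row):
    value = row.get("email")
    return bool(value) and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None


def postcode_is_four_digits(row):
    value = row.get("postcode")
    return bool(value) and re.fullmatch(r"\d{4}", value) is not None


# Named rules are easier to discuss and share than anonymous checks.
RULES = {
    "email_has_valid_shape": email_has_valid_shape,
    "postcode_is_four_digits": postcode_is_four_digits,
}


def validate(rows, rules):
    """Return, for each rule, the customer ids of the rows that fail it."""
    failures = {name: [] for name in rules}
    for row in rows:
        for name, rule in rules.items():
            if not rule(row):
                failures[name].append(row["customer_id"])
    return failures


email_profile = profile(records, "email")
failures = validate(records, RULES)
```

The failure lists point at specific records, which gives a starting place for the root-cause analysis the step calls for.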

Protect your data

Once your data has been fixed, you want to ensure that only trusted data is propagated to other data users. Based on the shared validation rules, screen and filter any data that is found to be in error.
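A sketch of the Protect step follows, assuming the shared validation rules are available as simple functions. Rows that pass every rule are released to other data users; rows that fail are held back together with the names of the rules they broke. The rule names, fields and sample orders are hypothetical.

```python
def screen(rows, rules):
    """Split rows into trusted and quarantined using shared validation rules."""
    trusted, quarantined = [], []
    for row in rows:
        failed = [name for name, rule in rules.items() if not rule(row)]
        if failed:
            # Keep the reasons with the row so the owner can fix it at source.
            quarantined.append({"row": row, "failed_rules": failed})
        else:
            trusted.append(row)
    return trusted, quarantined


# Hypothetical shared rules for incoming order data.
shared_rules = {
    "quantity_is_positive": lambda r: isinstance(r.get("quantity"), int) and r["quantity"] > 0,
    "supplier_is_named": lambda r: bool(r.get("supplier")),
}

incoming = [
    {"order": 1, "supplier": "Acme", "quantity": 10},
    {"order": 2, "supplier": "", "quantity": 5},
    {"order": 3, "supplier": "Bolt", "quantity": -2},
]

trusted, quarantined = screen(incoming, shared_rules)
```

Quarantining rather than silently dropping failed rows is a design choice in this sketch: nothing is lost, and the people who understand the data can correct it.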

Monitor your data

The value of your data is a function of its use. Be clear about the usage of your data, both as an input to and an output of the business process. Ongoing monitoring of live data sets will identify new data quality issues and enable you to assess any potential impact before they have a negative effect upon your business.
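Ongoing monitoring can be as simple as tracking the failure rate of a rule across successive batches and raising an alert when it crosses an agreed level. The sketch below assumes daily batches, a single hypothetical rule and an arbitrary threshold of 25%; all three are illustrative choices rather than recommendations.

```python
def failure_rate(rows, rule):
    """Fraction of rows that fail the rule (0.0 for an empty batch)."""
    if not rows:
        return 0.0
    failed = sum(1 for row in rows if not rule(row))
    return failed / len(rows)


def monitor(batches, rule, threshold):
    """Return (label, rate) for each batch whose failure rate exceeds threshold."""
    alerts = []
    for label, rows in batches:
        rate = failure_rate(rows, rule)
        if rate > threshold:
            alerts.append((label, rate))
    return alerts


def has_email(row):
    # Hypothetical rule: every record should carry an email address.
    return bool(row.get("email"))


batches = [
    ("Mon", [{"email": "a@x.org"}, {"email": "b@x.org"},
             {"email": "c@x.org"}, {"email": "d@x.org"}]),
    ("Tue", [{"email": "a@x.org"}, {"email": ""},
             {"email": None}, {"email": "d@x.org"}]),
]

alerts = monitor(batches, has_email, threshold=0.25)
```

In this sample the Tuesday batch is flagged because half of its records lack an email address, giving the data owner a chance to investigate before the problem spreads.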

As more participants join the collaborative process, the power, utility and value of your data will grow. The most important thing is to make a start.

The lack of centralisation is not really the underlying issue. The challenge is really about a lack of a structured approach or the absence of a data quality culture.

The expectations and capabilities that are required will be very different depending on the cultural makeup of the organisation. It is necessary to identify the most suitable model and match any data quality initiatives to fit with the cultural norms that apply.

In many cases a traditional, centralised approach to data quality management is not desirable (and may even be counter-productive).

A distributed and end-user oriented approach to Data Quality Management can be desirable, for several reasons:

• Many organisations still haven’t given data quality due consideration, and the impacts can be significant, but often hidden. Empowering users to think about and act upon data challenges can become a catalyst for a more structured, enterprise-wide approach.

• By managing data quality issues locally, knowledge and expertise are maintained as close to point-of-use as possible.

• In environments where funding is hard to come by or where there isn’t appetite to establish critical mass for data quality activity, progress can still be made and value can still be delivered.

• Change of culture is difficult to impose upon an organisation. Change of approach derived from grass-roots initiatives and operational process improvements can have a more long-lasting impact.

• A co-ordinated approach that enables collaboration, co-operation and communication between business users will carry more weight than a centrally imposed regime that does not engage with the business process.

Change the culture, change the result. That doesn't necessitate centralisation to make it happen.