Lessons in Data Modeling: Data Modeling & MDM


Page 1: Lessons in Data Modeling: Data Modeling & MDM

Data Modeling & Master Data Management (MDM)

Donna Burbank, Global Data Strategy Ltd.

Lessons in Data Modeling DATAVERSITY Series

September 28th, 2017

Page 2: Lessons in Data Modeling: Data Modeling & MDM

Global Data Strategy, Ltd. 2017

Donna Burbank

Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership.

She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market.

As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and was recently awarded the Excellence in Data Management Award from DAMA International in 2016. She was on the review committee for the Object Management Group’s Information Management Metamodel (IMM) and the Business Process Modeling Notation (BPMN). Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and Analytics software in the market.

She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications. She can be reached at [email protected] and is based in Boulder, Colorado, USA.


Follow on Twitter: @donnaburbank
Today’s hashtag: #LessonsDM

Page 3: Lessons in Data Modeling: Data Modeling & MDM


DATAVERSITY Lessons in Data Modeling Series
This Year’s Line Up

• January (on demand): How Data Modeling Fits Into an Overall Enterprise Architecture
• February (on demand): Data Modeling and Business Intelligence
• March (on demand): Conceptual Data Modeling – How to Get the Attention of Business Users
• April (on demand): The Evolving Role of the Data Architect – What does it mean for your Career?
• May (on demand): Data Modeling & Metadata Management
• June (on demand): Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling
• July (on demand): Data Modeling & Metadata for Graph Databases
• August (on demand): Data Modeling & Data Integration
• September 28: Data Modeling & Master Data Management (MDM)
• October 26: Agile & Data Modeling – How Can They Work Together?
• December 5: Data Modeling, Data Quality & Data Governance

Page 4: Lessons in Data Modeling: Data Modeling & MDM

What is Master Data?
Definition

• Master Data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts (sic).

• Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.

Source: Gartner

Page 5: Lessons in Data Modeling: Data Modeling & MDM

A Data Model Describes the Entities of the Business
The “Who, What, Where, When, Why” of the Organization – the Nouns

Entity: A classification of the types of objects found in the real world (persons, places, things, concepts and events) of interest to the enterprise. ¹

[Figure: the questions Who?, What?, Where?, When?, Why? and How? shown alongside example entities: Product, Salesperson, Invoice, Order, Period, Location]

¹ DAMA Dictionary of Data Management

Page 6: Lessons in Data Modeling: Data Modeling & MDM

A Data Model Is a Visual Representation of Core Entities

A data model is a graphical view of the core entities important to the organization.

Humans tend to think in pictures.

But… not all entities are Master Data entities.

Page 7: Lessons in Data Modeling: Data Modeling & MDM

A Data Model Is a Visual Representation of Core Entities

A data model is a graphical view of the core entities important to the organization.

Humans tend to think in pictures.

Early Master Data

From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009

Page 8: Lessons in Data Modeling: Data Modeling & MDM

Transaction Data vs. Master Data

Consider the following retail transaction data:

Customer      | Date     | Product                  | Code    | Price | Quantity | Location
Stefan Kraus  | 1/2/2017 | Scarpa Telemark Ski Boot | SC1279  | €250  | 1        | St. Moritz, CH
Donna Burbank | 1/5/2017 | Scarpa Telemark Ski Boot | SCU1289 | $150  | 1        | Boulder, CO
Stefan Kraus  | 1/2/2017 | North Face Down Jacket   | NF8392  | €450  | 1        | Zurich, CH
Stefan Kraus  | 1/2/2017 | Garmin Sports Watch      | GM29384 | €200  | 2        | Zurich, CH
Wendy Hu      | 3/4/2017 | Prana Yoga Pant          | PN82734 | $51   | 5        | New York, NY
Joe Smith     | 4/1/2017 | Garmin Sports Watch      | GM29384 | $150  | 1        | Albany, NY

Transaction Data
• Describes an action (verb), e.g. “buy”
• May include measurements about the action (Who, When, What, How Many, Where, How Much, etc.)
• E.g. Stefan Kraus, 1/2/2017, Scarpa Telemark Ski Boot, St. Moritz, CH, €250

Master Data
• Describes the key entities (nouns), e.g. Customer, Product, Location
• Provides attributes & context for these nouns
• E.g. Wendy Hu, age 25, female, resident of New York, NY, Customer since 2005, preferred customer card, etc.
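To make the noun/verb distinction concrete, here is a minimal Python sketch that splits a few of the flat retail rows into master entities and transactions that reference them by key. The class names, the `customer_id` surrogate key and the `split_rows` helper are illustrative assumptions, not part of the presentation, and grouping customers by exact name is deliberately naive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:          # master data (noun)
    customer_id: int     # hypothetical surrogate key, not in the source table
    name: str


@dataclass(frozen=True)
class Product:           # master data (noun)
    code: str
    name: str


@dataclass(frozen=True)
class Transaction:       # transaction data (the "buy" action)
    customer_id: int
    product_code: str
    date: str
    price: str
    quantity: int
    location: str


# A few flat rows copied from the retail table.
rows = [
    ("Stefan Kraus", "1/2/2017", "Scarpa Telemark Ski Boot", "SC1279", "€250", 1, "St. Moritz, CH"),
    ("Stefan Kraus", "1/2/2017", "Garmin Sports Watch", "GM29384", "€200", 2, "Zurich, CH"),
    ("Joe Smith", "4/1/2017", "Garmin Sports Watch", "GM29384", "$150", 1, "Albany, NY"),
]


def split_rows(flat_rows):
    """Separate master entities from the transactions that refer to them."""
    customers = {}   # name -> Customer (exact-name grouping, an assumption)
    products = {}    # product code -> Product
    transactions = []
    for name, date, product, code, price, quantity, location in flat_rows:
        if name not in customers:
            customers[name] = Customer(len(customers) + 1, name)
        if code not in products:
            products[code] = Product(code, product)
        transactions.append(
            Transaction(customers[name].customer_id, code, date, price, quantity, location)
        )
    return customers, products, transactions


customers, products, transactions = split_rows(rows)
```

Each customer and product is stored once, while every purchase remains its own transaction row. Real systems rarely share clean names or keys, which is why MDM adds matching rules.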

Page 9: Lessons in Data Modeling: Data Modeling & MDM

Transaction Data vs. Master Data

The retail transaction table from the previous slide is shown again, annotated with these callouts:
• Master Data: Customer
• Master Data: Product
• Master Data: Location
• Reference Data: Country Codes
• Reference Data: State Codes
• Transaction Data

Page 10: Lessons in Data Modeling: Data Modeling & MDM

Master Data – the Opportunity
A 360 Degree View through Data

Stefan Krauss
• Age = 31
• Occupation = Ski Instructor
• Purchased €500 in outdoor gear in 2016
• 100% of purchases online
• Top Finisher in Engadin Ski Marathon 2010-2015
• Member of Loyalty Program since 2010
• Prefers Text Message
• Address = Pontresina, Switzerland

Page 11: Lessons in Data Modeling: Data Modeling & MDM

Master Data – the Opportunity (& Need)
A 360 Degree View through Data

Stefan Krauss
• Age = 62
• Occupation = Banker
• Member of Loyalty Program since 1990
• Football Fan
• Prefers Physical Mail
• 100% of spending in store
• 75% of spending is while on holiday
• Purchased €3.500 in outdoor gear in 2016
• Address = Zurich, Switzerland

Page 12: Lessons in Data Modeling: Data Modeling & MDM

Master Data Management (MDM)

• There are many architectural approaches to MDM. Two are the following:

Centralized
• Core data stored in a common schema in a centralized “hub”.
• Used as a common reference for operational systems, DW, etc.

Virtualized/Registry
• Data remains in source systems.
• Referenced through a common virtualization layer.

BOTH require a Data Model

Page 13: Lessons in Data Modeling: Data Modeling & MDM


MDM Data Models

• In an MDM Data Model, the core attributes for master data entities can be identified.

• This is typically the superset of attributes used by core systems & stakeholders in the organization.

[Figure labels: Core, Shared Attributes; Source System A; Source System B; Source System C]
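As a rough illustration of how candidate MDM attributes might be derived, the Python sketch below compares the attribute sets of three source systems. All attribute names are hypothetical; the deck only names the systems. It computes both the superset of attributes and the subset shared by every system, from which a team would then select the attributes to master.

```python
# Hypothetical attribute sets per source system (names are illustrative only).
SOURCE_SCHEMAS = {
    "Source System A": {"customer_id", "first_name", "last_name", "email"},
    "Source System B": {"customer_id", "last_name", "postal_code"},
    "Source System C": {"customer_id", "last_name", "email", "loyalty_tier"},
}

# Superset: every attribute used by at least one system.
superset = set().union(*SOURCE_SCHEMAS.values())

# Shared core: attributes present in every system.
shared = set.intersection(*SOURCE_SCHEMAS.values())
```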

Page 14: Lessons in Data Modeling: Data Modeling & MDM

Master Data Overview

[Figure: architecture diagram. Labels: CRM, In-Store Sales, Marketing, Finance, Online Sales, Supply Chain; ETL; Data Quality & Matching; MDM “Golden Record”; Data Model; Reference Data Sets; Publish & Subscribe; Lookup; End User Applications; Data Warehouse; BI & Reporting]

Each system has its own unique functionality and associated data model.

The MDM data model is a selected superset of the source system models.

MDM can feed the dimensional model for the data warehouse (e.g. customer, location, etc.)

Applications can reference the “Golden Record” for lookup.

Page 15: Lessons in Data Modeling: Data Modeling & MDM

Business Rules & Matching

• Once master attributes have been assigned, and their populating source systems identified, a next step in MDM is to clarify how records are identified as equivalent, in a process known as matching.
  • Matching rules provide the criteria used to match records from disparate systems as candidates for a golden record.
• Matching strategies are based on identifying attributes, and multiple match strategies can be defined, for example:
  • Match Strategy 1: Match on Date of Birth + Social Security Number
  • Match Strategy 2: Match on Social Security Number + Last Name
• Match strategies can be executed in sequential order. For example, if no match is found using Strategy 1, a match will be searched for using Strategy 2, and so on through the list of match strategies.
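The sequential fallback described above can be sketched in a few lines of Python. This is an illustration only: the field names, the `find_match` helper and the sample records are assumptions, the Social Security Numbers are made up, and a real MDM tool would configure such rules rather than hand-code them.

```python
from typing import List, Optional, Tuple

# Each strategy lists the identifying attributes that must all agree.
MATCH_STRATEGIES = [
    ("date_of_birth", "ssn"),   # Match Strategy 1
    ("ssn", "last_name"),       # Match Strategy 2
]


def find_match(record: dict, candidates: List[dict]) -> Optional[Tuple[int, dict]]:
    """Try each strategy in order; return (strategy number, matched record) or None."""
    for number, attributes in enumerate(MATCH_STRATEGIES, start=1):
        for candidate in candidates:
            # A missing value never counts as agreement.
            if all(
                record.get(attr) is not None and record.get(attr) == candidate.get(attr)
                for attr in attributes
            ):
                return number, candidate
    return None


# Made-up records for illustration.
hub_records = [
    {"id": "A-1", "date_of_birth": "1985-03-04", "ssn": "000-00-0001", "last_name": "Hu"},
]
incoming = {"date_of_birth": None, "ssn": "000-00-0001", "last_name": "Hu"}

result = find_match(incoming, hub_records)
```

Here the incoming record has no date of birth, so Strategy 1 finds nothing and the match is made by Strategy 2.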


Page 16: Lessons in Data Modeling: Data Modeling & MDM

Data Model / Database Keys for Matching Rules

• Candidate attribute combinations for matching are often aligned with the primary and alternate keys from the logical data model.

Matching on Primary Key
Ideally, if all systems use the same unique identifier, matching is easier. But this isn’t often realistic in “real world” systems.

Matching on Alternate Keys
• First, match on Date of Birth + SSN
• Then, match on SSN + Last Name
• Etc.

Page 17: Lessons in Data Modeling: Data Modeling & MDM

Fuzzy Matching

• Fuzzy matching logic can also be used, which is particularly helpful in matching string fields such as names and addresses, where human error or different entry standards between systems can cause slight variations in similar values, e.g.
  • “101 Main St” vs. “101 Main Street”
  • “John Smith” vs. “J Smith”
• In addition, synonyms can be created to assist with matching, for example:
  • “St”, “St.”, “Street”, etc. for addresses
  • “Tim”, “Timothy” for names and nicknames.
• When using fuzzy matching, data quality thresholds can be defined for auto approval.
  • Match scores are created for each fuzzy match, for example .9 would indicate a strong match and .2 a weak one.
  • Using these scores as a guide, thresholds can be defined for which matches can be auto-approved, which can be auto-rejected, and which need human review from a data steward.
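A minimal sketch of score-based triage, assuming Python's standard `difflib.SequenceMatcher` as the similarity measure (an MDM product would apply its own algorithm). The synonym table and both thresholds are illustrative assumptions; the 0.9 and 0.2 values simply reuse the strong and weak example scores mentioned above.

```python
from difflib import SequenceMatcher

# Illustrative synonym table: variants are mapped to one canonical token.
SYNONYMS = {"st": "street", "st.": "street", "tim": "timothy"}

AUTO_APPROVE = 0.9   # assumed threshold: at or above, accept without review
AUTO_REJECT = 0.2    # assumed threshold: at or below, discard without review


def normalize(value: str) -> str:
    """Lower-case the value and replace known synonyms token by token."""
    return " ".join(SYNONYMS.get(token, token) for token in value.lower().split())


def match_score(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(score: float) -> str:
    """Route a score to auto-approval, auto-rejection or steward review."""
    if score >= AUTO_APPROVE:
        return "auto-approve"
    if score <= AUTO_REJECT:
        return "auto-reject"
    return "steward review"


address_score = match_score("101 Main St", "101 Main Street")
name_score = match_score("John Smith", "J Smith")
```

With these assumptions the two address variants normalize to the same string and are auto-approved, while “John Smith” vs. “J Smith” lands between the thresholds and is routed to a data steward.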


Page 18: Lessons in Data Modeling: Data Modeling & MDM


Matching Approval – Key Stewardship Role

• A key responsibility of Data Stewards for MDM is the manual review and approval of potential matches which cannot be auto-approved and which require human review.

• In these cases, the match score is below the defined threshold, and requires a data steward to review the proposed matches and MDM golden record. Each steward would review the items for their given area only.

Match Group ID | Name         | Match Status | Match Score | Record Source
000007         | John R Smith | Proposed     | .7843       | System A
000007         | Jack Smith   | Proposed     | .6532       | System B
000007         | John Smith   | Proposed     | .6894       | System C
000007         | John R Smith | Proposed     | etc.        | System D

Page 19: Lessons in Data Modeling: Data Modeling & MDM

Survivorship – Attribute Groups

• Once matches have been approved, a golden record can be assigned from a match group through a process of Survivorship. (Note: These rules are distinct from Matching Rules)
• In order to create a mastered record from the various source systems, a series of attribute groups are defined, with specific survivorship rules for each of those attributes.
• For example, attribute groups could be defined for the following scenarios:
  • Name fields (e.g. containing First Name, Last Name, Maiden Name, etc.). Rules could be defined that these attributes are populated from System A.
  • Demographic fields (e.g. containing Race, Ethnicity, Gender, etc.). Rules could be defined that these fields are populated from System B.
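The attribute-group idea can be sketched as a simple lookup of which source system "wins" for each group. This Python sketch is a simplification under stated assumptions: the field names, the `build_golden_record` helper and the sample values are hypothetical, and real survivorship rules often also weigh recency or completeness.

```python
# Attribute group -> (attributes in the group, source system that populates them).
SURVIVORSHIP_RULES = {
    "name": (("first_name", "last_name", "maiden_name"), "System A"),
    "demographics": (("race", "ethnicity", "gender"), "System B"),
}


def build_golden_record(match_group: dict) -> dict:
    """Assemble a golden record; match_group maps a system name to its record."""
    golden = {}
    for attributes, source in SURVIVORSHIP_RULES.values():
        source_record = match_group.get(source, {})
        for attribute in attributes:
            # Missing values stay None rather than being borrowed from another system.
            golden[attribute] = source_record.get(attribute)
    return golden


# Made-up match group for illustration.
match_group = {
    "System A": {"first_name": "John", "last_name": "Smith", "gender": "U"},
    "System B": {"first_name": "Jack", "last_name": "Smith", "gender": "M"},
}

golden = build_golden_record(match_group)
```

The name fields survive from System A and the demographic fields from System B, even though both systems carry overlapping attributes.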

[Figure: System A and System B feeding the MDM “Golden Record”]

Page 20: Lessons in Data Modeling: Data Modeling & MDM

Harmonization

• Harmonization: The process of harmonization pushes the mastered record back to source systems.
  • While this helps keep the MDM and source systems in sync, and works to improve overall data quality …
  • … it should be handled carefully, with close coordination with the owners and stewards of the source systems.

Page 21: Lessons in Data Modeling: Data Modeling & MDM

Governance & Business Process for MDM

• Successful MDM depends critically on collaboration between the owners and stewards of various systems, and between business and IT stakeholders.
• In fact, the top two reasons for failure of MDM systems cited by the Gartner analyst group ¹ are:
  • Failure of IT to Align With Business Process Improvements and Document Business Value
  • Delaying or Mismanaging Information Governance Implementation
• While the implementation of the hub and population strategies is complex, more complex is understanding the business processes and governance processes around the populating and publishing systems.

¹ Top Four Reasons Your MDM Program Will Fail, and How to Avoid Them, Gartner, 2016, ID: G00223675, by Bill O’Kane. Note: The remaining two reasons are: Failure to Manage Initial Master Data Quality & Defining Transactional (Fact) Data as Master Data

Page 22: Lessons in Data Modeling: Data Modeling & MDM

The Importance of Business Process
Identifying key data dependencies in core business processes

• Process models are a helpful tool for describing core business processes (e.g. BPMN).
  • “Swimlanes” outline organizational considerations
  • Data can be mapped to key business processes to understand creation & usage of information.
• Understanding business process is critical to Data Governance
  • Who is using data?
  • How is it used in business processes?
  • Are there redundancies, conflicts, etc.?

Page 23: Lessons in Data Modeling: Data Modeling & MDM

CRUD Matrix – Understanding Data Usage
Create, Read, Update, Delete

[Table: CRUD matrix. Columns (Users, Departments, and/or Systems): Product Development, Supply Chain, Accounting, Marketing, Finance. Rows (Data entities or attributes), with the operations shown for each: Product Assembly Instructions (C, R); Product Components (C, R); Product Price (C, U, R); Product Name (C; U,D); Etc.]

• CRUD Matrices show where data is Created, Read, Updated or Deleted across the various areas of the organization
• This can be a helpful tool in data governance & data quality to support root cause analysis.
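A CRUD matrix can also be held as a small data structure and queried when tracing a data quality issue. In the Python sketch below, the placement of each operation under a department is hypothetical (only the row and column names come from the slide), and the helper names are assumptions.

```python
# (data element, department) -> operations performed.
# Cell placement is hypothetical; only the names come from the slide.
CRUD = {
    ("Product Price", "Accounting"): {"C", "U"},
    ("Product Price", "Marketing"): {"R"},
    ("Product Name", "Product Development"): {"C"},
    ("Product Name", "Marketing"): {"U", "D"},
}

WRITE_OPERATIONS = {"C", "U", "D"}


def who_can(element: str, operation: str) -> list:
    """Departments that perform the given operation on the element."""
    return sorted(
        dept for (elem, dept), ops in CRUD.items()
        if elem == element and operation in ops
    )


def writers(element: str) -> list:
    """Departments that create, update or delete the element."""
    return sorted(
        dept for (elem, dept), ops in CRUD.items()
        if elem == element and ops & WRITE_OPERATIONS
    )


name_writers = writers("Product Name")
price_readers = who_can("Product Price", "R")
```

When a product name turns out to be wrong, a query like `writers("Product Name")` narrows the root cause search to the departments that can change it.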

Page 24: Lessons in Data Modeling: Data Modeling & MDM

Case Study: Linking Data with Process for MDM
Managing the Data that Runs the Business

• An international restaurant chain realized through its digital strategy that:
  • While menus are the core product that drives their business…
  • They had little control or visibility over their menu data
  • Menu data was scattered across multiple systems in the organization from supply chain to kitchen prep to marketing, restaurant operations, etc.
• Menu data was consolidated & managed in a central hub:
  • Master Data Management created a “single view of menu” for business efficiency & quality control
  • Data Governance created the workflow & policies around managing menu data
• Process Models & Data Mappings were critical
  • Business Process Models to identify the flow of information
  • CRUD Matrices to understand usage, stewardship & ownership

[Figure labels: Product Creation & Testing; Menu Display & Marketing; Supply Chain; Point of Sale & Restaurant Operations]

Page 25: Lessons in Data Modeling: Data Modeling & MDM


Summary

• Master Data focuses on the core entities of the business (e.g. Customer, Product, Supplier, etc.)

• Data models are a critical part of any MDM initiative – defining & managing these core entities

• Master Data Management can provide significant business opportunity, as long as governance, process, data quality, survivorship rules, etc. are managed correctly.

• Data Governance is critical to any MDM initiative

• Business Process Models and CRUD matrices are important tools in aligning MDM to business success


Page 27: Lessons in Data Modeling: Data Modeling & MDM

About Global Data Strategy, Ltd
Data-Driven Business Transformation

• Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and information.
• Our core values center around providing solutions that are:
  • Business-Driven: We put the needs of your business first, before we look at any technology solution.
  • Clear & Relevant: We provide clear explanations using real-world examples.
  • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography.
  • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry.

Business Strategy Aligned With Data Strategy

Visit www.globaldatastrategy.com for more information

Page 28: Lessons in Data Modeling: Data Modeling & MDM

Contact Info

• Email: [email protected]
• Twitter: @donnaburbank, @GlobalDataStrat
• Website: www.globaldatastrategy.com

Page 29: Lessons in Data Modeling: Data Modeling & MDM

Questions?

Thoughts? Ideas?