business semantics for data governance and stewardship

Post on 17-Jul-2015


Business Semantics for Data Governance & Stewardship

Dr. Pieter De Leenheer

Sloan Hall, Stanford University

Feb 4, 2015

Overview

• ICT: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

La Trahison des Images (Magritte, 1929)

La Trahison des Images (2)

https://deleenheer.wordpress.com/2009/12/15/magrittes-flirting-with-semantics/

What we talk about when we talk about no Data Governance

The Problem

• Who approved this?

• I wish these guys spoke our language.

• I can’t understand this report!

• I’ve never seen this code! Who introduced this?

• This doesn’t seem right. Are we sure this data is correct?

• This rule is different in our country!

• This is an exception to the rule!

Glossary Search

• How frequently do you look up a word for your business?

• To what purpose?

– Clarification

– Differentiation

• What are your main sources?

• Hierarchy-based navigation or keyword-based search?

• Authoritative truth or trust?

From Truth to Trust: Behind the Curtains

https://www.research.ibm.com/visual/projects/history_flow/results.htm


Spectrum of Business Semantics

Welty, C., Lehmann, F., Gruninger, G., and Uschold, M. (1999). Ontology: Expert systems all over again? In Invited panel at AAAI-99: The National Conference on Artificial Intelligence, Austin, Texas, USA.

The Big ‘Metadata’ Bang

Catalogue and text files

• The start of an organization’s data management

• Represented by shared folders with lists of things such as product, customer, templates

• First ‘clouds’ of metadata

– Naturally emerge as by-product

– For human consumption

– Locally understood

• From this point exponential expansion:

– in volume

– in consumers (receiver)

– in producers (sender)

– in entropy

Glossary

• List of terms and definitions

e.g., http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/data-governance-and-stewardship-materials/

Thesaurus

• Add homo-, syno-, mero-, hyper- and hyponymous relations between terms
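The relation types listed for a thesaurus can be sketched as typed links between terms. This is a minimal illustration only: the `Thesaurus` class and the example terms ("client", "customer", "party") are assumptions made for the sketch, not something shown in the talk.

```python
from collections import defaultdict

# Relation kinds named on the slide.
RELATIONS = {"synonym", "homonym", "meronym", "hypernym", "hyponym"}


class Thesaurus:
    """Hypothetical in-memory thesaurus: terms linked by typed relations."""

    def __init__(self):
        # (term, relation) -> set of related terms
        self._links = defaultdict(set)

    def relate(self, term, relation, other):
        """Record that `other` is a <relation> of `term`."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._links[(term, relation)].add(other)
        # Synonymy and homonymy are symmetric; hypernym and hyponym are inverses.
        if relation in ("synonym", "homonym"):
            self._links[(other, relation)].add(term)
        elif relation == "hypernym":
            self._links[(other, "hyponym")].add(term)
        elif relation == "hyponym":
            self._links[(other, "hypernym")].add(term)

    def related(self, term, relation):
        return sorted(self._links.get((term, relation), set()))


# Invented example terms, for illustration only.
thesaurus = Thesaurus()
thesaurus.relate("client", "synonym", "customer")
thesaurus.relate("customer", "hypernym", "party")
print(thesaurus.related("customer", "synonym"))  # ['client']
print(thesaurus.related("party", "hyponym"))     # ['customer']
```

Storing the inverse link at write time keeps lookups trivial, at the cost of a slightly larger index.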

Taxonomy

• Formalized representation of a “thesaurus”

• Generalize and specialize properties and relations

– generalize Vendor and Customer with similar properties into Party

– specialize Location into Home Address and Office Address because of different properties

• Classifying a thing as a Term, Data Element or System

– E.g., “customer” vs. “CUST_TBL” vs. “CRM” to determine ownership

• Inheritance-based reasoning such as syllogisms

– Premise: “John Doe” is a lead

– Premise: All leads receive a mortgage offering

– Conclusion: “John Doe” receives a mortgage offering
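The syllogism can be sketched as a walk up a concept hierarchy, collecting every statement that holds for all members of each concept on the way. This is a minimal sketch: the `Taxonomy` class is hypothetical, and placing Lead under Party with the statement "has an address" is an assumption added only to show inheritance across two levels.

```python
class Taxonomy:
    """Hypothetical taxonomy supporting inheritance-based conclusions."""

    def __init__(self):
        self.parent = {}       # concept -> more general concept
        self.statements = {}   # concept -> statements true of all its members
        self.instance_of = {}  # individual -> concept

    def specialize(self, child, parent):
        self.parent[child] = parent

    def assert_for_all(self, concept, statement):
        self.statements.setdefault(concept, set()).add(statement)

    def classify(self, individual, concept):
        self.instance_of[individual] = concept

    def conclusions(self, individual):
        """Collect statements inherited from the individual's concept upward."""
        found, seen = set(), set()
        concept = self.instance_of.get(individual)
        while concept is not None and concept not in seen:  # guard against cycles
            seen.add(concept)
            found |= self.statements.get(concept, set())
            concept = self.parent.get(concept)
        return sorted(found)


tax = Taxonomy()
tax.specialize("Lead", "Party")                  # assumption, not on the slide
tax.assert_for_all("Party", "has an address")    # assumption, not on the slide
tax.assert_for_all("Lead", "receives a mortgage offering")  # second premise
tax.classify("John Doe", "Lead")                 # first premise
print(tax.conclusions("John Doe"))
# ['has an address', 'receives a mortgage offering']
```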

Frames

Logical constraints

• Modal Logic:

– context determines meaning, truthfulness, validity

– plausibility vs. necessity

• Modalities determine:

– who owns a term per region, process, function

– where and how to enforce terms

– what the definition of a term is
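One way to picture context-dependent meaning is a lookup that returns the most specific definition matching the current situation. This is a sketch under stated assumptions: `ScopedDefinition`, `resolve`, the dimension names ("region", "process") and all definition texts and owners are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedDefinition:
    term: str
    text: str
    owner: str
    # Pairs of (dimension, value) that must all hold; () applies everywhere.
    context: tuple = ()


def resolve(definitions, term, situation):
    """Return the most specific definition whose context matches, or None."""
    matches = [
        d for d in definitions
        if d.term == term
        and all(situation.get(dim) == value for dim, value in d.context)
    ]
    # More context constraints means a more specific definition.
    return max(matches, key=lambda d: len(d.context), default=None)


# Invented definitions and owners, for illustration only.
DEFINITIONS = [
    ScopedDefinition("address", "Where a party can be reached by post",
                     "global steward"),
    ScopedDefinition("address", "Postal address including box number",
                     "BE steward", (("region", "BE"),)),
    ScopedDefinition("address", "Delivery location used for invoicing",
                     "BE billing steward",
                     (("region", "BE"), ("process", "billing"))),
]

hit = resolve(DEFINITIONS, "address", {"region": "BE", "process": "billing"})
print(hit.owner)  # BE billing steward
```

The same lookup answers both "what is the definition" and "who owns the term" for a given region, process or function.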

Hierarchical Context in ACORD

Multidimensional Context


Situating an organization’s level of glossary need

Size: 1 to 50

– Characterizing events: first term-and-condition templates, first products, customers

– Business needs: a catalogue of items like customers, products and offerings

– Technology support status: spreadsheet, database

Size: 51 to 100

– Characterizing events: first customer segmentation; lead engine setup; business functions defined

– Business needs: as the catalogues grow in size, transform loose descriptions and definitions in text files into a glossary of terms

– Technology support status: shared file folders (for lead, prospect, customer, product, offering)

Size: 101 to 500

– Characterizing events: business functions populated; inter-functional business processes develop; product and customer data volumes grow

– Business needs: a thesaurus for comparing glossaries; differentiation of customer types, pricing models, reporting templates; local data analytics and storage

– Technology support status: spreadsheets, MediaWiki, functional processes like Salesforce, SDLC, ServiceNow; forecasting tools, reporting tools, databases

Size: 501 to 1000

– Characterizing events: invested growth; mergers and acquisitions take place; first signs of corrupt data reports on the board table

– Business needs: transforming thesauri into taxonomies, data models and architecture frames; ISO/ACORD/BCBS standardization

– Technology support status: MediaWikis go viral without proper alignment between them; first metadata tools in IT to align certain functions; business limited to spreadsheets

Size: 1001 plus

– Characterizing events: global operations; one or more red flags: legal (regulatory compliance breached), organizational (CxO fired), bad reputation (fraud), financial loss (penalties, debt)

– Business needs: reporting standards transformed into corporate data policies, rules and data quality; modalities as to who is to define them and how and where to enforce them have been set; the need for the CDO function is mentioned but meets resistance from CIO/CTO; Big Data opportunities loom beyond the data nebula (screen with universe)

– Technology support status: platform with several data management systems (Informatica, IBM, Oracle) scarred by M&A; lineage fragmented, not properly validated by business; data governance organization theorized (or failed before), so no one takes accountability; lack of functional descriptions or enterprise-wide championship; glossaries’ usefulness implodes as their numbers increase; the enterprise data model is common ground for IT but useless to the business. Validation is urgent.


Principles of Business Semantics

• Democracy

• Emergence

• Perspective rendering

• Perspective unification

• Validation

http://www.academia.edu/874733/Business_semantics_management_A_case_study_for_competency-centric_HRM

Principles at work in the Situation Map

• Emergence is a continuous principle at work

• Unification and rendering are continuously in flux, but at two different frequencies (business vs. IT)

• Validation is limited to technical lineage

• Democracy and Business Validation (socio-technical) are lacking

• Reactive rather than proactive governance (defining) and stewardship (enforcing)

• Lack of tools


Gradually Build Trust based on Stewardship and Validation

• What?

– Qualitative metadata: e.g., definition for address, codes, mappings, classifications, etc.

• Who?

– Roles and responsibilities for people

• How ?

– Collaborative workflows to orchestrate people in achieving high-quality metadata

– Start Simple, Buy-in, Council

– Measure Maturity and Trust

– Separate stewardship from integration

Data Governance Council: Governance Operating Model

[Figure: governance operating model. Labels recovered from the diagram:]

• Data Governance Organization: Roles & Responsibilities; Processes & Workflow; Asset Types & Traceability

• Data Stewardship Activities

• IT / Operational Data Management Activities: Data Quality Development; Data Modeling; Metadata Lineage; Metadata Scanning; Reference Data Authoring; Data Integration

• Relationship labels: Establishes & drives; Aligns & Coordinates; Reports & Escalates; Monitors & Remediates

• Products: Collibra Business Semantics Glossary (BSG); Collibra Reference Data Accelerator (RDA); Collibra Data Stewardship Manager (DSM); Collibra Platform; other data management vendor products

• Capability labels: Hierarchy Management; Business & Data Definitions; Business Traceability; Semantic Modeling; Mapping Specifications; Policy Management; Business Rules; Data Quality Rules; Data Quality Reporting; Issue Management; Reference Data Crosswalks; Master Data Stewardship; Data Quality Profiling; DQ Defect Resolution

Example in Health Insurance

http://prezi.com/ve1ws8jmpqcn/workflow/

Global Data Governance

• Objective

– n Enterprise service buses => 1 Global Information Market Place

• Challenges

– Data Service = data sharing agreement across organization silos, policies, regulations, semantic assumptions. E.g., Address

– No clear balance between data ownership and control:

• responsibilities are not set

• for each data point : increasing exposure to risk regarding quality and policy compliance

• Service is more about trust because truth is relative

Solution: One Global Information Hub

• Phase 1: Jun-Sept

• Phase 2: Oct-Nov

• Phase 3: Dec -

What is to be governed?

Data Governance Questions

• What does the term “address” mean?

• How is the term “address” represented?

• In what system are data elements on “address” recorded?

• What views does a data sharing agreement include?

• With which policy does my data sharing agreement comply?

• Under which country is my term “address” classified?

• …

Collibra Traceability Paths

Term has attributes definition, description, etc.

Term is represented by Data Element

Data Element has system of record System

Data Sharing Agreement groups Data View

Business Term ≠ Data Element

https://compass.collibra.com/display/COOK/Asset+Types+and+Traceability+Requirements
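The traceability paths read naturally as typed edges between assets, so a governance question such as "in what system is this term recorded?" becomes a walk along a fixed sequence of relations. This is a minimal sketch, not Collibra's API: the `follow` helper is hypothetical, and linking “customer” to “CUST_TBL” to “CRM” is an assumption that reuses the example names from the Taxonomy slide.

```python
# Each triple is (subject asset, relation, object asset).
# ASSUMPTION: the concrete links below are illustrative only.
TRIPLES = [
    ("customer", "is represented by", "CUST_TBL"),        # Term -> Data Element
    ("CUST_TBL", "has system of record", "CRM"),          # Data Element -> System
]


def follow(triples, start, path):
    """Walk from `start` along the given relation names; return reachable assets."""
    frontier = {start}
    for relation in path:
        frontier = {
            obj for subj, rel, obj in triples
            if subj in frontier and rel == relation
        }
    return sorted(frontier)


# "In what system are data elements on this term recorded?"
systems = follow(TRIPLES, "customer",
                 ["is represented by", "has system of record"])
print(systems)  # ['CRM']
```

Keeping the term and the data element as separate nodes is what makes the question answerable: the business term has a definition and an owner, while the data element has a system of record.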

Operating Model

Traceability Diagram

Who? RACI

How is it to be governed?

• Status Types and Workflows

– For Domains, Terms, Users, and later for Issues and Data Sharing Agreements

BUSINESS SEMANTICS GLOSSARY

[Figure: status diagram for a term requested on the domain page; transitions are labeled with the workflow numbers below.]

Status types: Candidate, In Progress, Under Review, Accepted, In Revision, Rejected, Deprecated

Workflows

1 Propose Business Term

2 Edit Business Term

3 Onboarding Business Term

4 Deprecate Business Term

5 Reactivate Business Term

https://compass.collibra.com/display/COOK/Lifecycle%3A+Workflows+and+Status+Types
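A term lifecycle like this can be sketched as a small state machine. The status names come from the slide, but the arrows of the diagram did not survive in this transcript, so every transition in `ALLOWED` below is an assumption chosen for illustration and may differ from the actual Collibra workflows.

```python
from enum import Enum


class Status(Enum):
    CANDIDATE = "Candidate"
    IN_PROGRESS = "In Progress"
    UNDER_REVIEW = "Under Review"
    ACCEPTED = "Accepted"
    IN_REVISION = "In Revision"
    REJECTED = "Rejected"
    DEPRECATED = "Deprecated"


# ASSUMPTION: illustrative transitions only; the real diagram may differ.
ALLOWED = {
    Status.CANDIDATE: {Status.IN_PROGRESS, Status.REJECTED},
    Status.IN_PROGRESS: {Status.UNDER_REVIEW},
    Status.UNDER_REVIEW: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: {Status.IN_REVISION, Status.DEPRECATED},
    Status.IN_REVISION: {Status.UNDER_REVIEW},
    Status.REJECTED: set(),
    Status.DEPRECATED: {Status.ACCEPTED},
}


def advance(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    return target


status = Status.CANDIDATE
for step in (Status.IN_PROGRESS, Status.UNDER_REVIEW, Status.ACCEPTED):
    status = advance(status, step)
print(status.value)  # Accepted
```

Making the allowed moves explicit data rather than scattered conditionals lets a governance council review and change the lifecycle without touching the enforcement code.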

How is it to be governed? Propose Workflow

How is it to be governed? Onboarding Workflow

How is it to be governed? Approval Workflow

Questions for the Audience

We presume the starting point is a glossary.

• What factors would make it impossible?

• Do you know of cases where it has been achieved without one?

• Is it possible to establish data governance without a glossary?
