metadata intelligence - torornto - 2016-11...metadata intelligence and the cmmi data maturity model...
TRANSCRIPT
Metadata Intelligence and the CMMI Data Maturity Model
Protecting Critical Data
Wed. Nov. 16, 2016
Don SoulsbySandhill Consultants
Information Resource Management Association of Canada
Who we are:
Sandhill ConsultantsIs a group of federated people, partners, products and processes that help our clients:Build comprehensive
Data ArchitectureResulting from a persistent
Data Management ProcessFounded on a robust
Data Governance PracticeProduces Trusted, Governed Data
Don Soulsby • Vice President Architecture
Strategies, Sandhill Consultants• CMMI – Enterprise Data
Management Expert (EDME)• Enterprise Architecture,
Business Intelligence, Metadata Consultant
• Data/Process modeller• Educator & Industry Speaker
3
Meta an abstract, high-level analysis
Wright person who creates, builds, or repairs something specified
No Good News
No headline about the airship that didn’t crash…
…By any other name
stratuminsurance.com
It is not just about Locks and Keys!
Michael Bruemmer, VP, Experian Data Breach
Resolution
Companies are no longer judged by whether they have a breach, but rather on how they respond when a breach occurs.
5 Things to Do When Your Company Gets Hacked
InfoMgmt – Daily - Oct 14, 2016 - by David Weldon
1. Pull the plug2. Wipe the slate clean3. Diagnose compromised systems4. Notify affected users5. Take corrective action
Phil Phillips, the head systems administrator at Experts Exchange
http://www.information-management.com/
The Towering Inferno (1974)
It’s 10pm, do you know where..
your DATA ASSETS are?
Mel Epstein, New York's WNEW-TV, 1967
Major Challenges – Data Assets
• Lack of Senior Executive support, do not see data assets on the books.
• Data in many distributed systems with no common data model and multiple repositories of metadata with limited search capabilities.
• Difficulty in getting funding centralized and strategically planned.
• IT project culture, limited support for cross project data programs – master data.
That which is Everybody’s business is Nobody’s business
Izaak Walton (17th Century)
Investigate the Data Management Process
Data Management Body of Knowledge
Data Management Maturity - DMM
The DMM model, first announced in May 2014, was developed using the principles and structure of CMMI
Institute’s Capability Maturity Model Integration (CMMI®)
The DMM℠ model outlines data process improvement across business lines — allowing executives to make better and faster decisions using a strategic view of
their data.
CMMI Assessment
Data Management Maturity Model (DMM)
Data Management
Strategy
DataGovernance
DataQuality
Platform &Architecture
DataOperations
The Capability Maturity Model
Process
Control
Process
Measurement
Process
Definition
Basic
Management
Control
Initial
(1)ad hoc Process
Repeatable
(2)Stable Process
Defined
(3)Standard Process
Managed
(4)Measured Process
Optimizing
(5)Improving Process
Consistent
Execution
Controlled
Environment
Quality and
Productivity
Improvement
Continuing
Improvement
Chaotic
Form & Substance
Data Management
Strategy
DataGovernance
DataQuality
Platform &Architecture
DataOperations
FORM• Data Structure and
Context• The definition and
meaning of the data
CONTENT• Data Content• Data Quality
dimensions and standards
MetadataIntelligence
Data Vulnerability
DATAArchitecture
WHAT?
WHERE? WHO?
HOW?
Network
Architecture
Security
Architecture
RISK ASSESSMENT
• Classification• Valuation• Detection• MitigationApplication
Architecture
Data Risk Assessment
• Classification• Valuation• Detection• Mitigation
Risk Assessment: Classification
“Data is a Managed Corporate Asset”
Not all assets need to be managed the same way!
137Sales Revenue Amount Tower of Tuna Count
Classification
Carolus Linnaeus
Aristotle
Railway Asset Classification
Uniform Classification of Accounts (UCA)
• Purpose:
o The purpose of the UCA is to define the method of accounting for railway companies subject to regulation by the Canadian Transportation Agency (Agency).
• E.g. Equipment
o Rolling Stock – Revenue Service
• 171 Locomotives
Subject Area Model
UCA - Data – Asset/Expense ?
General Expenses• The accounts in this activity group are designed
to record costs related to the railway company as a whole.
• 801 Management Serviceso Include the costs of corporate and rail functions of a
staff nature which provide services to other management groups, including administrative and office services (including cafeterias, libraries, etc.), investigation and security, medical services, legal services, information systems and data processing, and research and development.
Data Risk Assessment
• Classification• Valuation• Detection• Mitigation
Risk Assessment: Valuation
Monetize Data Assets ?
More than classification – Rules!
• 271 Locomotives - Accumulated Amortization
o Record accumulated amortization of account 171 - Locomotives.
o Record by type of locomotive as set out in schedule A. (Section 1901)
• 501 Locomotive Maintenance
o Include the cost of repairing locomotives.
o Record according to schedule A (section 1901).
o Reference: Instruction 1502.04
• 503 Locomotive Servicing
o Include the cost of servicing locomotives. Record according to schedule A (section 1901).
o Reference: Instruction 1502.04
• 971 Locomotives - Amortizationo Include the amortization expense assessed for account 171 - Locomotives.
o Record according to schedule A. (Section 1901)
Data Asset Prioritization
Critical data elements
• (CDEs) are defined as “the data that is critical to success” in a specific business area (line of business, shared service, or group function), or “the data required to get the job done.”
Competing with High Quality Data: Concepts, Tools, and Techniques for Building a Successful Approach to Data Quality
by Rajesh JugulumJuly 2014
Privacy• Personal information• Personal Financial
information • Non-personal
information
Critical Data Element - Risk
Accessibility• Secure - Internet• Monitored - Extranet• Unsecure - Internet
Reliability• Trusted• Verifiable• Un-trusted
Confidentiality• Sensitive• Confidential• Internal Use • Public
Privacy• Personal information• Personal Financial
information • Non-personal
information
Critical Data Element - Risk
Accessibility• Secure - Internet• Monitored - Extranet• Unsecure - Internet
Reliability• Trusted• Verifiable• Un-trusted
Confidentiality• Sensitive• Confidential• Internal Use • Public
Data Risk Assessment
• Classification• Valuation• Detection• Mitigation
Risk Assessment: Detection
Courtesy ASG.
F O R MC O N T E N T+
Anatomy of a Data Element
Data Governance Capabilities
Governance Management
Business Glossary
Metadata Management
DataGovernance
Best practices for ensuring broad participation in the practice and senior oversight of the effectiveness of data management.
Stewardship
Everybody’s business is Nobody’s business, and Nobody’s business is My business
Clara Barton
Form vs. Content
• BASIC TABULAR STRUCTURE
Form - Row & Column Stewards
Governance Stewardship• Data Metadata - Party
who is responsible for the decision about WHAT data the enterprise will collect and maintain the data represented by the Column
Process Stewardship
• Process Metadata - Party who is responsible for the decision about HOW the enterprise will collect and maintain the data represented by the Row
Content Stewards
• Data Qualification Stewardship
o Compliance Metadata - Party who is responsible for qualifying entries per the constraints on the data.
Glossary - Critical Data Element
• Business Term– Name– Description– Workflow Status– Abbreviation – Acronym– Synonym
• Term Type– Topic
• Term– Qualified Term
(element)
• Origin• System of Reord• Business Rule• Authority• Scope• Organization• Domain
BusinessTerm
BusinessTerm
Multiple
Data Quality Capabilities
Data Quality Strategy
Data Profiling
Data Quality Assessment
Data Cleansing
DataQuality
Best practices for defining and implementing a collaborative approach for detecting, assessing, and cleansing data defects to ensure fitness for intended uses in business operations, decision making, and planning.
Data Quality Issues
ETL
ACCOUNTABILITY
UCA – Data Quality
Section 1602 Instructions Relating to Particular Groups of Statistics
• Records Rejected
• 1602.04 For computerized data bases used in the development of operating statistics from waybills and conductor's journals, records shall be maintained of data rejected due to uncorrected error conditions.
Measurement
DataQuality indicators
Metadata Repository Report
Data Quality Dimensions
1. Accuracy – criteria related to affinity with original intent, veracity as compared to an authoritative source, and measurement precision
2. Completeness –criteria related to the availability of required data attributes
3. Coverage – criteria related to the availability of required data records
4. Conformity – criteria related to alignment of content with required standards
5. Consistency – criteria related to compliance with required patterns and uniformity rules
6. Duplication – criteria related to redundancy of records or attributes7. Integrity – criteria related to the accuracy of data relationships
(parent and child linkage) 8. Timeliness – criteria related to the currency of content and
availability to be used when needed
Conformity, Duplication, Masking
MASK &CAMOFLAGUE
Data Quality going Forward
OBSERVATIONAL Analytics
ALGORYTMIC Analytics
DQ%
DQ%
Data Risk Assessment
• Classification• Valuation• Detection• Mitigation
Risk Assessment: Mitigation
Data Traceability Ecosystem
Business Domain
IT Domain
Metadata Driven Impact Analysis
Business Domain
Technology Domain
PhysicalData Model
ConceptualData Model
Business Glossary
DATABASE
LogicalData Model
Middle Out MetadataRepository
Forensic Metadata
Study exposes social media sites that delete photographs’ metadata.
http://www.embeddedmetadata.org/social-media-test-results.php
UCA Data + Metadata
Section 1102 Records
• 1102.02 Records are to be maintained in a manner facilitating review, audit and tracing of transactions to source data. Where full information on transactions is not recorded in the general books of a carrier the entries therein shall be supported by other records, suitably cross-referenced, containing such information.
Metadata Integration
The integration ofbusiness and technical
metadata has become the holy grail of Data Governance.
Rick Sherman DM Review Magazine
“But just like the holy grail, it seems to be a never-ending quest fueled by the belief that technology is going to solve all our problems.”
TMDDatabase Column
Business & Technical Metadata
BMDBusiness Term
ASSOCIATIVE ENTITY
Business Glossary - Taxonomy
Pinot Noir
France
Country
Red
Type
Burgundy
Region
Light
Body
Index for Searching
Taxonomy VS Ontology
Taxonomy • General - the practice and science of
classification of things or concepts• Corporate taxonomy, the hierarchical
classification of entities of interest to an enterprise, organization or administration
Business
Glossary
Business
Information
Model
Ontology• Information Science, fformal naming and
definition of the types, properties, and interrelationships of the entities that exist for a particular domain of discourse.
Data Governance Metamodel
BusinessTerm
DataElement
ASSOCIATIVE ENTITY
Data Element
KEYS
Subject Area• Entity
• AttributeRelationship
BusinessTermBusiness Concept
or Category• Business Term
• Qualified Term (Element)
Relationship
?????
Business
Information
Model
Technical
Data
Model
Risk Governed DataBusiness InformationModel
TechnicalData
Model
DATA MOVEMENT
LINEAGE
CDE SEMANTIC
LINEAGE
Finding Critical Data Elements
Data Reference Model
Identifying Critical Data Elements
RegulatoryCompliance
Critical Master Data
CustomerProduct
Process
Location
Calendar
$
42
Transaction Molecule
Big Data Model
MODEL
Business Terms
‘The ‘Wisdom of Crowds”
The author suggests that a crowd (or group) is usually
more accurate than an expert alone
Critical Data Element Lifecycle
CriticalData
ElementGovernance
Search for term
Maintain Term
Retire Term
Identify New Term
Review Term
Approve Term
Publish Term
Data Management Maturity Model (DMM)
Data Management
Strategy
DataGovernance
DataQuality
Platform &Architecture
DataOperations
MetadataIntelligence
CriticalData Elements
Roles &Responsibilities
DataLifecycle
DataStandards
Business/Technology StrategyBusiness Domain Technology Domain
Business Information Model
Technical Data Model
INNOVATION& CHANGE
TECHNOLOGYPROJECTIONS
Business Strategy
TechnologyStrategy
ALIGNMENT
DATA GOVERNANCE
BusinessOperating
Model
Data Mgmt.Operating
Model
CRITICALDATA ELEMENT
Not Just Data
BusinessDynamics
Model
SystemDynamics
Model
FunctionDynamics
Model
PROCESS
Business Information
Model
Logical Data Model – TO BE
Data DeliveryModel
DATA
Logical Data Model – AS IS
Reverse Engineered
Physical Model
Data Standards
Measure = ObservationStandard
"You can't manage, what you can't measure”
You can’t measure,what you can’t define
More than just a firewall
IT Domain Traceability
Data
Customer
Name
Order
Order Number
Package
Package ID
• Logical Name• Physical Name• Unstructured Data
Application
Customer
Shop On Line
Deliver Order
• ERP Systems• Application Services• Integration
SecurityOrder PackagePurchase
Product
Customer
Process
Order
• Access• Definitions• Processes• Organizations
Network
• Servers• Cloud• Mobile
GOVERNANCE
Enterprise Traceability
Data
Ne
two
rk
Secu
rity
Application
PRODUCT
PLA
CE PA
RTY
PROCESS
Technology Domain
Business Domain
In Summary
• Classificationo Definitions & Standards
• Identifying and defining their information, and establishing business standards for it.
• Valuationo Quality criteria
• Establishing quality controls and monitoring quality of information for accuracy, timeliness, consistency, validity, completeness, redundancy and impact across projects.
• Detectiono Protection & Risk
• Classifying information to ensure appropriate protection in accordance with privacy and information security policies, and relevant standards.
• Mitigationo Life-Cycle
• Providing leadership and direction for information from its collection or creation through to its retention or disposal based on business, regulatory and legal requirements. How information is to be used, disseminated and maintained throughout the life-cycle.
How to Get Started ?
”... please which way I ought to go from
here?"
"That depends a good deal on where you
want to get to," said the Cat.
"I don't much care where----" said Alice.
"Then it doesn't matter which way you
go," said the Cat.
NEED A MAP
YOU ARE HERE
SandhillDMM
Assessment
CMMI - Data Maturity
Model (DMM)
Data Management Maturity Assessment
POWEROF 3
EDME
Metadata Intelligrnce 2.0?
• Metadata Lifecycle
• Integration of Non-textual Metadata
• Cross pollination of Metadata Standards
o Semantic Web
• Resource Description Framework (W3C)
• Dublin Core (Non W3C)
o ISO (Geographic)
EM-SOS!
Data
Management
Strategy
Platform &Architecture
DataOperations
Data
Management
Maturity
Model
DataQuality
DataGovernance
Data Structure
Modeling
Data Content
Modeling
Data Standards
Modeling
Data Lifecycle
Modeling
Business/Data
Strategy
Helping Clients Understand and Manage
the Form and Substance of Their Data.
EMSOS!
SANDHILL
DMM Assessment
Power Designer
Capability
Delivery Solutions
Data
Governance
Capability
Delivery System
Tibetan Proverb
“ If upstream is dirty,
downstream will be muddy”
[shuǐtóu] [bù] [qīngshuǐ] [wěi] [hún]