L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Upload: beverly-pearson

Post on 26-Dec-2015


TRANSCRIPT

Page 1: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata

Business Intelligence

Erwin Moeyaert

Page 2: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Overview

• What is metadata?
• Why is it needed?
• Types of metadata
• Metadata life cycle

Page 3: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

What is Metadata?

• Data ‘reporting’

– WHO created the data?

– WHAT is the content of the data?

– WHEN was it created?

– WHERE is it geographically?

– HOW was the data developed?

– WHY was the data developed?
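The six questions above can be pictured as fields on a descriptive record. This is only a minimal sketch: the class name `DatasetMetadata`, its field names, and the sample values are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical record answering the six questions for one dataset."""
    who: str    # creator of the data
    what: str   # content of the data
    when: date  # creation date
    where: str  # geographic coverage
    how: str    # method used to develop the data
    why: str    # purpose the data was developed for


# Sample values are invented for illustration only.
sales_meta = DatasetMetadata(
    who="Sales operations team",
    what="Daily order totals per store",
    when=date(2015, 1, 31),
    where="Belgium",
    how="Nightly extract from the order-entry system",
    why="Monthly revenue reporting",
)

# The record describes the dataset without holding any of its rows.
for question, answer in asdict(sales_meta).items():
    print(f"{question.upper():5} {answer}")
```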

Page 4: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

The Metadata

• The name suggests some high-level technological concept, but it really is fairly simple. Metadata is “data about data”.

• With the emergence of the data warehouse as a decision support structure, the metadata are considered as much a resource as the business data they describe.

• Metadata are abstractions: high-level data that provide concise descriptions of lower-level data.

Page 5: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

What is Metadata?

time period

author

sources

(file) size

title

supplemental information

abstract

©2005 CSC Brands, L.P. All Rights Reserved

Page 6: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

What is Metadata?

• entity

• attributes

Page 7: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

The Meta Data

• Last component of DW environments.

• It is information that is kept about the warehouse rather than information kept within the warehouse.

• Legacy systems generally don’t keep a record of characteristics of the data (such as what pieces of data exist and where they are located).

Page 8: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

• Better end-user data access and analysis tools can help users figure out how to get the information they need out of the warehouse.

• Only good, easily accessible metadata can help them figure out what is available in the data warehouse and how to ask for it.

Page 9: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata Repositories

Users and Developers often need a way to find information on the data they use. Information can include:
• Source System(s) of the Data, contact information
• Related tables or subject areas
• Programs or Processes which use the data
• Population rules (Update or Insert and how often)
• Status of the Data Warehouse’s processing and condition

Page 10: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

General Metadata Issues Associated with Data Warehouse Use

• What tables, attributes, and keys does the data warehouse contain?
• Where did each set of data come from?
• What transformation logic was applied in loading the data?
• How has the metadata changed over time?
• What aliases exist and how are they related to each other?
• What are the cross-references between technical and business terms?
• How often does the data get reloaded?
• How much data is there? (assists in avoiding the submission of unrealistic queries)
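Some of these questions can be answered straight from a database catalog. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a warehouse; the table `sales_fact` and its columns are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact ("
    " sale_id INTEGER PRIMARY KEY,"
    " store_code TEXT NOT NULL,"
    " amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_fact (store_code, amount) VALUES (?, ?)",
    [("A", 10.0), ("B", 25.5), ("A", 7.25)],
)

# What tables does the warehouse contain?
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# What attributes and keys does a table have?
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(sales_fact)").fetchall()
attributes = [(name, col_type) for _, name, col_type, _, _, _ in columns]
key_columns = [name for _, name, _, _, _, pk in columns if pk]

# How much data is there?
row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]

print(tables, attributes, key_columns, row_count)
```

Questions about origin, transformation logic, aliases, and reload frequency are not held in such a catalog and would need a separate metadata store.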

Page 11: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Typical Mapping Metadata

• Identification of original source fields.
• Simple attribute-to-attribute mapping.
• Attribute conversions.
• Physical characteristic conversions.
• Encoding/reference table conversions.
• Naming changes.
• Key changes.
• Default values.
• Logic to choose from among multiple sources.
• Algorithmic changes.
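To make a few of these mapping items concrete, here is a minimal sketch of mapping records applied to one source row. The `FieldMapping` class, the field names such as `CUST_NO`, and the `GENDER_CODES` table are hypothetical; the slides do not prescribe any particular structure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


def _identity(value: Any) -> Any:
    """Simple attribute-to-attribute mapping: pass the value through."""
    return value


@dataclass(frozen=True)
class FieldMapping:
    """Hypothetical mapping record for one target attribute."""
    source_field: str                           # original source field
    target_field: str                           # name in the warehouse (naming change)
    convert: Callable[[Any], Any] = _identity   # attribute conversion
    default: Optional[Any] = None               # default value when nothing usable arrives


GENDER_CODES = {"1": "M", "2": "F"}  # encoding/reference table conversion

MAPPINGS = [
    FieldMapping("CUST_NO", "customer_id", convert=int),
    FieldMapping("CUST_NM", "customer_name", convert=str.strip),
    FieldMapping("SEX_CD", "gender", convert=GENDER_CODES.get, default="U"),
]


def apply_mappings(source_row: dict, mappings: list) -> dict:
    """Build a target row by applying each mapping record in turn."""
    target_row = {}
    for m in mappings:
        raw = source_row.get(m.source_field)
        value = m.convert(raw) if raw is not None else None
        target_row[m.target_field] = m.default if value is None else value
    return target_row


result = apply_mappings(
    {"CUST_NO": "0042", "CUST_NM": " Ada ", "SEX_CD": "9"}, MAPPINGS)
print(result)
```

Keeping the rules as data rather than burying them in load code is what lets the same records double as documentation of the transformation.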

Page 12: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Copyright © 1997, Enterprise Group, Ltd.

Data Warehouse Process

[Figure: Source OLTP Systems, Data Warehouse, Data Marts]

Process stages:
• Design, Mapping
• Extract, Scrub, Transform
• Load, Index, Aggregation
• Replication, Data Set Distribution
• Access & Analysis, Resource Scheduling & Distribution

Shown alongside the stages: Meta Data, System Monitoring

Data Characteristics:
• Raw Detail, No/Minimal History
• Integrated, Scrubbed
• History, Summaries
• Targeted, Specialized (OLAP)

Page 13: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Meta Data Description

• Information about the data warehouse system
– Content
– Organizational
– Structural
– Management Information
– Scheduling Information
– Contact Information
– Technical Information

Page 14: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Why Do You Need Meta Data?

• Share resources
– Users
– Tools
• Document system
• Without metadata
– Not Sustainable
– Not able to fully utilize resource

Page 15: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata Life Cycle

• Collection - Identify metadata and capture into repository; automate

• Maintenance - Put in place processes to synchronize metadata automatically with changing data architecture; automate

• Deployment - Provide metadata to users in the right form and with the right tools; match metadata offered to specific needs of each audience

Page 16: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata Collection

• Right metadata at the right time
• Variety of collection strategies
• Sources
– potential sources of data for DW
– external data
– data structures
• Data Models - enterprise data model start point
– import from CASE tool
– correlate enterprise and warehouse models

Page 17: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata Collection

• Warehouse mappings
– map operational data into warehouse data structure
– Need record of logical connection used for mapping and transformation
• Warehouse usage information
– After roll out
– What tables accessed, by whom and for what
– What queries written
– Capture nature of business problem or query
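Usage information of this kind can be summarised with very little code once it is captured. The sketch below assumes a hypothetical log of (user, table, business question) entries; the names and sample data are invented for illustration.

```python
from collections import Counter

# Hypothetical usage log entries: (user, table accessed, business question).
usage_log = [
    ("ann", "sales_fact", "monthly revenue by store"),
    ("bob", "sales_fact", "top customers"),
    ("ann", "customer_dim", "top customers"),
    ("ann", "sales_fact", "monthly revenue by store"),
]

# What tables are accessed, and how often?
accesses_per_table = Counter(table for _, table, _ in usage_log)

# By whom?
users_per_table = {}
for user, table, _ in usage_log:
    users_per_table.setdefault(table, set()).add(user)

# For what? Count how often each business question recurs.
questions = Counter(question for _, _, question in usage_log)

print(accesses_per_table.most_common())
print({table: sorted(users) for table, users in users_per_table.items()})
print(questions.most_common(1))
```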

Page 18: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Maintaining Metadata

• Up to date with reality
• Capture incremental changes
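One way to capture incremental changes is to compare the structure recorded in the repository with the structure that actually exists. The function below is a sketch under that assumption; `diff_schemas` and the column names are hypothetical, and each snapshot is simply a `{column: type}` dictionary.

```python
def diff_schemas(recorded: dict, actual: dict) -> dict:
    """Compare two {column: type} snapshots and report what changed."""
    recorded_cols, actual_cols = set(recorded), set(actual)
    return {
        "added": sorted(actual_cols - recorded_cols),
        "removed": sorted(recorded_cols - actual_cols),
        "retyped": sorted(
            col for col in recorded_cols & actual_cols
            if recorded[col] != actual[col]
        ),
    }


# What the repository believes versus what the table looks like today.
recorded = {"sale_id": "INTEGER", "store_code": "TEXT", "amount": "REAL"}
actual = {"sale_id": "INTEGER", "store_code": "INTEGER", "region": "TEXT"}

changes = diff_schemas(recorded, actual)
print(changes)
```

Running such a comparison on a schedule, and writing the result back to the repository, is one way to keep the metadata in line with reality.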

Page 19: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata Deployment

• Warehouse developers need:
– physical structure info for data sources
– enterprise data model
– warehouse data model
– concerned with accuracy, completeness and flexibility of metadata
– Need access to comprehensive impact analysis capabilities
– Need to defend against accuracy & integrity questions
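Impact analysis can be sketched as a walk over lineage metadata: given a changed object, find everything loaded from it, directly or indirectly. The `FEEDS` graph and its object names below are hypothetical, and breadth-first search is just one reasonable traversal choice.

```python
from collections import deque

# Hypothetical lineage edges: each object lists the objects loaded from it.
FEEDS = {
    "orders_oltp": ["stg_orders"],
    "stg_orders": ["sales_fact"],
    "sales_fact": ["monthly_sales_mart", "store_ranking_report"],
    "monthly_sales_mart": ["revenue_dashboard"],
}


def impacted_by(changed: str, feeds: dict) -> list:
    """Return every downstream object reachable from `changed`, breadth first."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        current = queue.popleft()
        for target in feeds.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print(impacted_by("stg_orders", FEEDS))
```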

Page 20: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Meta Data

• Types
– Technical
– Business / User
• Levels
– Core
– Basic
– Deluxe

Page 21: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Core Technical Meta Data

• Source
• Target
• Algorithm

Page 22: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Basic Technical Meta Data

• History of transformation changes
• Business rules
• Source program / system name
• Source program author / owner
• Extract program name & version
• Extract program author / owner
• Extract JCL / Script name
• Extract JCL / Script author / owner
• Load JCL / Script name

Page 23: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Basic Technical Meta Data (cont’d)

• Load JCL / Script author / owner
• Load frequency
• Extract dependencies
• Transformation dependencies
• Load dependencies
• Load completion date / time stamp
• Load completion record count
• Load status
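The load-related items (frequency, completion time stamp, record count, status) are natural by-products of the load itself. The sketch below records them while rows are written; the `LoadRun` class, the `run_load` helper, and the script name are assumptions made for illustration, not an existing tool's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


@dataclass
class LoadRun:
    """Hypothetical technical metadata captured for one load."""
    load_script: str                         # load JCL / script name
    frequency: str                           # load frequency
    completed_at: Optional[datetime] = None  # load completion time stamp
    record_count: int = 0                    # load completion record count
    status: str = "PENDING"                  # load status


def run_load(run: LoadRun, rows: Iterable[dict],
             write_row: Callable[[dict], None]) -> LoadRun:
    """Write each row and record count, status, and completion time."""
    try:
        for row in rows:
            write_row(row)
            run.record_count += 1
        run.status = "SUCCESS"
    except Exception:
        # A failed write leaves the count at the last successful row.
        run.status = "FAILED"
    run.completed_at = datetime.now(timezone.utc)
    return run


target = []
run = run_load(LoadRun("load_sales.sh", "daily"),
               [{"id": 1}, {"id": 2}], target.append)
print(run.status, run.record_count, run.completed_at)
```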

Page 24: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Deluxe Technical Meta Data

• Source system platform
• Source system network address
• Source system support contact
• Source system support phone / beeper
• Target system platform
• Target system network address
• Target system support contact
• Target system support phone / beeper
• Etc.

Page 25: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Core Business Meta Data

• Field / object description
• Confidence level
• Frequency of update

Page 26: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Basic Business Meta Data

• Source system name
• Valid entries (e.g. “There are three valid codes: A, B, C”)
• Formats (e.g. Contract Date: 82/4/30)
• Business rules used to calculate or derive the data
• Changes in business rules over time
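Valid entries and formats are easy to check mechanically once they are recorded as metadata. The sketch below assumes a hypothetical `FIELD_RULES` dictionary and reads the example date 82/4/30 as year/month/day, which is an interpretation rather than something the slides state.

```python
from datetime import datetime

# Hypothetical business metadata for two fields.
FIELD_RULES = {
    "status_code": {"valid_entries": {"A", "B", "C"}},
    "contract_date": {"format": "%y/%m/%d"},  # assumed layout of 82/4/30
}


def is_valid(field: str, value: str) -> bool:
    """Check a value against the valid entries and format recorded for a field."""
    rules = FIELD_RULES[field]
    if "valid_entries" in rules and value not in rules["valid_entries"]:
        return False
    if "format" in rules:
        try:
            datetime.strptime(value, rules["format"])
        except ValueError:
            return False
    return True


print(is_valid("status_code", "A"), is_valid("status_code", "D"))
print(is_valid("contract_date", "82/4/30"), is_valid("contract_date", "82/13/30"))
```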

Page 27: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Deluxe Business Meta Data

• Data owner
• Data owner contact information
• Typical uses
• Level of summarization
• Related fields / objects
• Existing queries / reports using this field / object
• Estimated size (tables / objects)

Page 28: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Amount of Meta Data

• How much Meta Data do I need?
• As much as you can support!

Page 29: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Meta Data Functions - Technical

• Maintenance
• Troubleshooting
• Documentation
• Logging / Metrics

Page 30: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Meta Data Location

• DB Resident
– Almost always relational
– C/S predominantly
– Normalized design
– OODB is popular option for proprietary solutions

Page 31: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Repository

• Specialized databases designed to maintain metadata, together with tools and interfaces that allow a company to collect and distribute its metadata

• Repository Requirements
– Logically Common
– Open
– Extensible

Page 32: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Meta Data Process

• Integrated with entire process and data flow
– Populated from beginning to end
– Begin population at design phase of project
– Dedicated resources throughout
• Build
• Maintain

[Figure: process stages (Design, Mapping; Extract, Scrub, Transform; Load, Index, Aggregation; Replication, Data Set Distribution; Access & Analysis, Resource Scheduling & Distribution), with Meta Data and System Monitoring shown alongside]

Page 33: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

General Metadata Issues

General metadata issues associated with Data Warehouse use:
– What tables, attributes and keys does the DW contain?
– Where did each set of data come from?
– What transformations were applied with cleansing?
– How have the metadata changed over time?
– How often do the data get reloaded?
– Are there so many data elements that you need to be careful what you ask for?

Page 34: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Components of the Metadata

• Transformation maps – records that show what transformations were applied

• Extraction & relationship history – records that show what data was analyzed

• Algorithms for summarization – methods available for aggregating and summarizing

• Data ownership – records that show origin

• Patterns of access – records that show what data are accessed and how often

Page 35: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Typical Mapping Metadata

Transformation mapping records include:
– Identification of original source
– Attribute conversions
– Physical characteristic conversions
– Encoding/reference table conversions
– Naming changes
– Key changes
– Values of default attributes
– Logic to choose from multiple sources
– Algorithmic changes
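The "logic to choose from multiple sources" entry is often just a recorded priority order. The sketch below assumes three hypothetical source systems (`crm`, `billing`, `legacy`) and picks the first one that supplies a non-empty value; real selection rules may be more involved.

```python
# Hypothetical priority order, recorded as mapping metadata.
SOURCE_PRIORITY = ["crm", "billing", "legacy"]


def choose_value(candidates: dict, priority: list = SOURCE_PRIORITY) -> tuple:
    """Return (source, value) from the highest-priority source that has a value."""
    for source in priority:
        value = candidates.get(source)
        if value not in (None, ""):
            return source, value
    return None, None


# The CRM value is empty, so the billing system wins.
chosen = choose_value({"crm": "", "billing": "ada@example.com", "legacy": "ADA"})
print(chosen)
```

Returning the source along with the value keeps a record of where each attribute came from.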

Page 36: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata

• The structure of metadata will differ from process to process, because the purpose of each is different.

• This means that multiple copies of metadata describing the same data item are held within the data warehouse.

• Most vendor tools for copy management and end-user data access use their own versions of metadata.

Page 37: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

Metadata

• Copy management tools use metadata to understand the mapping rules to apply in order to convert the source data into a common form.

• End-user access tools use metadata to understand how to build a query.

• The management of metadata within the data warehouse is a very complex task that should not be underestimated.
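As a rough illustration of how an end-user access tool might use metadata to build a query, the sketch below translates business terms into table and column names. The `BUSINESS_TERMS` cross-reference, the table `sales_fact`, and the `build_query` helper are hypothetical and handle only a single-table aggregate.

```python
# Hypothetical cross-reference between business terms and technical names.
BUSINESS_TERMS = {
    "Store": ("sales_fact", "store_code"),
    "Revenue": ("sales_fact", "amount"),
}


def build_query(dimension: str, measure: str) -> str:
    """Turn two business terms into a grouped SQL query string."""
    dim_table, dim_col = BUSINESS_TERMS[dimension]
    measure_table, measure_col = BUSINESS_TERMS[measure]
    if dim_table != measure_table:
        raise ValueError("this sketch only handles terms from a single table")
    return (
        f'SELECT {dim_col} AS "{dimension}", '
        f'SUM({measure_col}) AS "{measure}" '
        f"FROM {dim_table} GROUP BY {dim_col}"
    )


query = build_query("Store", "Revenue")
print(query)
```

The user asks for "Revenue by Store" and never needs to know the physical names, which is the point of keeping the cross-reference as metadata.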

Page 38: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

METADATA VIEWS

• BUSINESS USER’S VIEW

FROM A BUSINESS USER’S VIEW, METADATA SHOULD CONTAIN THE FOLLOWING SIX ELEMENTS:

1. TABLE OF CONTENTS
2. ORIGIN OF THE DATA FOR THE WAREHOUSE
3. TRANSFORMATION SEQUENCE
4. ACCESS LEVEL
5. TIMELINE OF THE JOURNEY
6. ACCESS ESTIMATES

Page 39: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

METADATA VIEWS

• DSS (DECISION SUPPORT SYSTEM) DEVELOPER’S VIEW

1. TRANSFORMATION AND BUSINESS RULES
2. DATA MODELS
3. AVAILABLE OPERATION DATA

Page 40: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

METADATA VIEWS

• CORPORATE VIEW

METADATA IS A LOGICAL COLLECTION OF METADATA FROM VARIOUS SOURCES, INCLUDING THE FOLLOWING SIX PLACES:

Page 41: L/O/G/O Metadata Business Intelligence Erwin Moeyaert

METADATA VIEWS

1. LEGACY SYSTEM METADATA CONSISTING OF A DATA DICTIONARY CONTAINING INFORMATION ABOUT PROGRAM LIBRARIES, DATABASE CATALOGS AND FILE LAYOUTS.

2. OPERATIONAL CLIENT/SERVER SYSTEMS – CONSISTING OF DISTRIBUTED SOFTWARE COMPONENTS FROM A VARIETY OF VENDORS.

3. ENTERPRISE MODELS – THEY ARE THE FIRST STAGE IN THE ULTIMATE GOAL OF BUILDING CORPORATE METADATA.