Post on 13-Jan-2016

…optimise your IT investments

Data Discovery: Understanding data relationships

Philip Howard, Research Director – Bloor Research

Confidential © Bloor Research 2009

telling the Information Management story

Agenda

What are data relationships and why are they important?

Different approaches to discovering data relationships

Features you might look for in a data discovery tool

What is a data relationship?

1. A relationship between database tables, either within or across databases

2. A relationship within or across non-relational data sources

3. A relationship between a relational and non-relational source

Note that relationships may be complex and/or involve more than 2 elements

Why are data relationships important?

1. Data migration

2. Data archival

3. Master data management

4. Data governance

5. Data modelling

6. Business intelligence

7. Data integration

8. Legacy migration

9. Data warehousing

…

Why are data relationships difficult?

No definition exists across multiple sources

Within a source many relationships are not explicit

Ownership of relationships is diverse

Many relationships are defined within application software and not in the data source

Data relationships in place

Different issues arise when you consider relationships within systems versus across systems

Data relationships within systems

Typical functions:

Identification of primary-foreign key pairs

Dependency analysis

Redundant columns

Usually provided through data profiling, which also provides error statistics
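
As a minimal sketch of how a profiling tool might propose primary-foreign key pairs, the Python below treats a column as a candidate key when its values are unique and non-null, and proposes a foreign key wherever nearly all of another column's values are contained in such a key. The table layout (a dict of column name to list of values), the function names and the 10% orphan tolerance are illustrative assumptions, not any vendor's method. The orphan count plays the role of the error statistics mentioned above.

```python
def candidate_keys(table):
    """Return columns whose values are all non-null and unique."""
    return [
        col for col, values in table.items()
        if values and None not in values and len(set(values)) == len(values)
    ]


def foreign_key_candidates(tables, max_orphan_share=0.1):
    """Propose (child_table, child_col, parent_table, parent_col, orphans).

    A pair is proposed when at most 'max_orphan_share' of the child's
    non-null values are missing from the parent's candidate key.
    """
    proposals = []
    for parent_name, parent in tables.items():
        for key_col in candidate_keys(parent):
            key_values = set(parent[key_col])
            for child_name, child in tables.items():
                if child_name == parent_name:
                    continue
                for col, values in child.items():
                    present = [v for v in values if v is not None]
                    if not present:
                        continue
                    # Orphans are child values with no matching parent key.
                    orphans = sum(1 for v in present if v not in key_values)
                    if orphans / len(present) <= max_orphan_share:
                        proposals.append(
                            (child_name, col, parent_name, key_col, orphans)
                        )
    return proposals


# Hypothetical tables, purely for illustration.
tables = {
    "customers": {"cust_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]},
    "orders": {"order_id": [10, 11, 12, 13], "cust_id": [1, 1, 3, 2]},
}
pairs = foreign_key_candidates(tables)
```

Value containment alone can produce false positives (small integer columns often overlap by chance), so proposals of this kind would normally be reviewed by someone who knows the data.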

Data relationships across systems

Requirement for relationship discovery

No requirement for error statistics

Requirement to report rule violations where these represent a violation of a cross-source relationship
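
To illustrate what a violation of a cross-source relationship might look like, the sketch below checks that every customer reference in one system exists in another. The two sources, their field names and the helper function are hypothetical; real sources would usually need key matching or transformation before a comparison like this is meaningful.

```python
def cross_source_violations(child_rows, child_field, parent_rows, parent_field):
    """Return child rows whose reference has no match in the parent source."""
    known = {row[parent_field] for row in parent_rows}
    return [row for row in child_rows if row[child_field] not in known]


# Hypothetical sources: a CRM system and a separate billing system.
crm_customers = [{"custID": "C1"}, {"custID": "C2"}]
billing_accounts = [
    {"account": "A-100", "cust#": "C1"},
    {"account": "A-101", "cust#": "C9"},  # no such customer in the CRM
]

violations = cross_source_violations(
    billing_accounts, "cust#", crm_customers, "custID"
)
```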

Specific requirements

For MDM – overlap & precedence analysis, transformation & business rules and exceptions, outlier analysis, matching keys

For data migration & archival – business entities
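
One way to picture overlap and precedence analysis for MDM is the sketch below. It assumes each source has already been reduced to a dict from a shared matching key to an attribute value, which is itself a significant assumption. The source names, sample values and function names are hypothetical.

```python
def overlap_report(source_a, source_b):
    """Compare two sources that are keyed by the same matching key."""
    keys_a, keys_b = set(source_a), set(source_b)
    shared = keys_a & keys_b
    return {
        "only_a": sorted(keys_a - keys_b),
        "only_b": sorted(keys_b - keys_a),
        "shared": sorted(shared),
        # Shared keys whose attribute values disagree between sources.
        "conflicts": sorted(k for k in shared if source_a[k] != source_b[k]),
    }


def merge_with_precedence(sources):
    """Merge sources listed from highest to lowest precedence."""
    merged = {}
    # Apply the lowest-precedence source first so that values from
    # higher-precedence sources overwrite it.
    for source in reversed(sources):
        merged.update(source)
    return merged


# Hypothetical sources keyed by customer identifier.
crm = {"C1": "ann@example.com", "C2": "bob@example.com"}
billing = {"C2": "robert@example.com", "C3": "cy@example.com"}

report = overlap_report(crm, billing)
golden = merge_with_precedence([crm, billing])  # CRM wins on conflict
```

Here the conflict list shows where a precedence rule is actually needed, and the merged result shows the effect of choosing the CRM as the preferred source.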

General functions

Automation of MDM and Profiling functions

Visualisation of relationships

Semantics: the semantic type of the data, e.g. zip code

Context-free discovery, e.g. recognising that cust# is equivalent to custID

Data classification: recognising the relationship between a pre-defined, business-user-maintained domain of values and the actual content of a field in order to identify the content of a field as well as unexpected values.

Business glossary
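
Two of the functions above can be sketched in a few lines of Python. The first function is a deliberately naive, name-based way of recognising that cust# and custID refer to the same thing. The second classifies a field by comparing its content with pre-defined domains of values and reports the unexpected values. The suffix list, the domains, the 0.75 threshold and all function names are assumptions made for illustration; a real tool would also examine the data values and would need a more careful approach than suffix stripping.

```python
import re


def normalise_name(name):
    """Reduce a column name to a comparable form, e.g. 'custID' -> 'cust'."""
    name = re.sub(r"[^a-z0-9]", "", name.lower())
    # Naive: strips a trailing identifier-style suffix if one is present.
    for suffix in ("number", "num", "no", "id"):
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)]
    return name


def classify_field(values, domains, threshold=0.75):
    """Return (domain_name, unexpected_values) for the best-matching domain.

    'domains' maps a domain name to the set of values the business maintains.
    Returns (None, []) if no domain covers at least 'threshold' of the values.
    """
    if not values:
        return None, []
    best_name, best_share = None, 0.0
    for name, allowed in domains.items():
        share = sum(1 for v in values if v in allowed) / len(values)
        if share > best_share:
            best_name, best_share = name, share
    if best_name is None or best_share < threshold:
        return None, []
    unexpected = [v for v in values if v not in domains[best_name]]
    return best_name, unexpected


# Hypothetical business-maintained domains and field content.
domains = {
    "country_code": {"GB", "US", "FR", "DE"},
    "currency": {"GBP", "USD", "EUR"},
}
field_values = ["GB", "US", "GB", "FR", "XX"]

same_column = normalise_name("cust#") == normalise_name("custID")
label, unexpected = classify_field(field_values, domains)
```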

Tools Landscape

Conclusion

Understanding data relationships across data sources is important in many data management disciplines

There are relatively few tools that are good at discovering such relationships – moreover, data discovery is a broad discipline and no one tool is good at all aspects of relationship discovery.
