Data Vault ReConnect Speed Presenting AM Part One

Presenter: Hans Hultgren
Date: June 5, 2014
Company: Genesee Academy
eMail: [email protected]
Twitter: gohansgo

Upload: hans-hultgren

Post on 26-Jun-2015

DESCRIPTION

First set of 5x5 Speed Presenting Updates: 1) Core Business Concept 2) Ensemble Modeling 3) Re-Defining the Link 4) Re-Defining the Satellite 5) Architectural Layers

TRANSCRIPT

Page 1: Data Vault ReConnect Speed Presenting AM Part One

Presenter: Hans Hultgren
Date: June 5, 2014
Company: Genesee Academy
eMail: [email protected]
Twitter: gohansgo

Page 2: Data Vault ReConnect Speed Presenting AM Part One

Core Business Concept is the #Entity, Master Data Component, or #Dimension that the Business actually Works with.

Page 3: Data Vault ReConnect Speed Presenting AM Part One

The level where the Business actually most commonly uses the Concept…

Page 4: Data Vault ReConnect Speed Presenting AM Part One

Not a Super-Type…

Page 5: Data Vault ReConnect Speed Presenting AM Part One

Not a Sub-Type or lower specification

Page 6: Data Vault ReConnect Speed Presenting AM Part One

[Diagram labels: Sale, Cust, L_Buyr; Evnt, Prsn, L, Type; type values Sale / Retrn / Purch, Cust / Emp / Vend, Buyer / Seller]

Page 7: Data Vault ReConnect Speed Presenting AM Part One

Page 8: Data Vault ReConnect Speed Presenting AM Part One

Separate the things that change from the things that don’t change.
DV Ensemble Conforms to a Single Key.

Page 9: Data Vault ReConnect Speed Presenting AM Part One

Maintain the KEY DEPENDENCY of Context Attributes. But apply #UnifiedDecomposition to reduce the modeling impact of changes.
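
As a rough illustration of the two slides above, the sketch below splits one flat source record into an ensemble whose parts all hang off a single key. The `Ensemble` class, the `decompose` function, the satellite names, and the sample record are hypothetical and are not taken from the presentation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ensemble:
    """All parts of one Core Business Concept, conformed to a single key."""

    business_key: str
    # Satellite name -> context attributes; every attribute depends on business_key.
    satellites: dict[str, dict[str, str]] = field(default_factory=dict)


def decompose(record: dict[str, str], key_column: str,
              satellite_plan: dict[str, list[str]]) -> Ensemble:
    """Separate the stable key from the changing context attributes."""
    ensemble = Ensemble(business_key=record[key_column])
    for sat_name, columns in satellite_plan.items():
        # Each satellite carries only context, never the key column itself.
        ensemble.satellites[sat_name] = {
            col: record[col] for col in columns if col in record
        }
    return ensemble


# Hypothetical sample data and satellite split.
record = {"Customer_Number": "C-100", "Cust_Name": "Acme", "Cust_Type_Code": "RTL"}
plan = {
    "Sat_Customer_Name": ["Cust_Name"],
    "Sat_Customer_Type": ["Cust_Type_Code"],
}
ensemble = decompose(record, "Customer_Number", plan)
print(ensemble.business_key, sorted(ensemble.satellites))
```

Under this sketch, a new attribute means a new entry in `plan` (a new satellite) rather than a change to the key or to existing satellites, which is one way to read "reduce the modeling impact of changes".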

Page 10: Data Vault ReConnect Speed Presenting AM Part One
Page 11: Data Vault ReConnect Speed Presenting AM Part One

Page 12: Data Vault ReConnect Speed Presenting AM Part One

Links are based on a Unique, Specific, Natural Business Relationship. Each naturally existing relationship should be modeled individually.

Page 13: Data Vault ReConnect Speed Presenting AM Part One

The Link contains only: 2-n FKs for the Relationship + 3 Untouchables: Surrogate Key, Load Date/Time, and Record Source

L_Cust_Class: L_Cust_Class_SID, H_Customer_SID, H_Sequence2_SID, Date/Time Stamp, Record source
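
The link structure on this slide could be sketched in SQLite roughly as follows. This is an assumption-laden illustration: the second hub `H_Class` and its key `H_Class_SID` stand in for the slide's `H_Sequence2_SID`, the column names `Load_DTS` and `Record_Source` stand in for "Date/Time Stamp" and "Record source", and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE H_Customer (
    H_Customer_SID  INTEGER PRIMARY KEY,
    Customer_Number TEXT NOT NULL UNIQUE,
    Load_DTS        TEXT NOT NULL,
    Record_Source   TEXT NOT NULL
);
-- Hypothetical second hub; the slide only shows H_Sequence2_SID.
CREATE TABLE H_Class (
    H_Class_SID   INTEGER PRIMARY KEY,
    Class_Code    TEXT NOT NULL UNIQUE,
    Load_DTS      TEXT NOT NULL,
    Record_Source TEXT NOT NULL
);
-- The link: hub foreign keys plus surrogate key, load time, record source.
CREATE TABLE L_Cust_Class (
    L_Cust_Class_SID INTEGER PRIMARY KEY,
    H_Customer_SID   INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
    H_Class_SID      INTEGER NOT NULL REFERENCES H_Class (H_Class_SID),
    Load_DTS         TEXT NOT NULL,
    Record_Source    TEXT NOT NULL,
    UNIQUE (H_Customer_SID, H_Class_SID)
);
""")

# Invented sample rows.
conn.execute("INSERT INTO H_Customer VALUES (1, 'C-100', '2014-06-05T09:00:00', 'CRM')")
conn.execute("INSERT INTO H_Class VALUES (1, 'GOLD', '2014-06-05T09:00:00', 'CRM')")
conn.execute("INSERT INTO L_Cust_Class VALUES (1, 1, 1, '2014-06-05T09:00:00', 'CRM')")

# List the link's columns to confirm it carries no context attributes.
link_columns = [row[1] for row in conn.execute("PRAGMA table_info(L_Cust_Class)")]
print(link_columns)
```

The `UNIQUE` constraint on the pair of hub keys is a design choice of this sketch, reflecting the idea that one link row represents one specific relationship.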

Page 14: Data Vault ReConnect Speed Presenting AM Part One

Transactions are also #CoreBusinessConcepts. So they will be decomposed into a full #Ensemble including a Hub, Link(s) and Satellites.

Page 15: Data Vault ReConnect Speed Presenting AM Part One

A Transactional Link differs from a Link because it has a unique identity (similar to a 3NF Entity).

Page 16: Data Vault ReConnect Speed Presenting AM Part One

Page 17: Data Vault ReConnect Speed Presenting AM Part One

All Context + All History

Can only describe the Key of the Hub/Link

Can have no #ForeignKeys

Page 18: Data Vault ReConnect Speed Presenting AM Part One

As part of Ensemble: Has no Meaning except in relation to the Whole

Multi-Valued means key is not the same as Hub

Page 19: Data Vault ReConnect Speed Presenting AM Part One

Can hold a CODE reference to a Dis-Connected Concept, when the #Code is adequate context by itself and downstream joins are rare (<= 5% of the time).

Customer Hub: H_Cust_Seq_ID, Customer_Number, Date/Time Stamp, Record Source

Href_Customer Type: H_Cust_Type_Seq_ID, Cust_Type_Code, Date/Time Stamp, Record Source

Customer Sat: H_Cust_Seq_ID, Date/Time Stamp, Cust_Name, Cust_Type_Code, Record Source

Customer Type Sat: H_Cust_Type_Seq_ID, Date/Time Stamp, Cust_Type_Description, Record Source
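
One way to picture the code reference on this slide is the SQLite sketch below. The satellite keeps `Cust_Type_Code` as a plain value with no foreign key, and the reference hub is joined on the code only when a description is needed. Table names with underscores, the `Load_DTS` column name, and all sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Hub (
    H_Cust_Seq_ID   INTEGER PRIMARY KEY,
    Customer_Number TEXT NOT NULL UNIQUE,
    Load_DTS        TEXT NOT NULL,
    Record_Source   TEXT NOT NULL
);
CREATE TABLE Customer_Sat (
    -- The parent hub key is the satellite's own key, not an extra relationship.
    H_Cust_Seq_ID  INTEGER NOT NULL REFERENCES Customer_Hub (H_Cust_Seq_ID),
    Load_DTS       TEXT NOT NULL,
    Cust_Name      TEXT,
    Cust_Type_Code TEXT,  -- plain code value, deliberately not a foreign key
    Record_Source  TEXT NOT NULL,
    PRIMARY KEY (H_Cust_Seq_ID, Load_DTS)
);
CREATE TABLE Href_Customer_Type (
    H_Cust_Type_Seq_ID INTEGER PRIMARY KEY,
    Cust_Type_Code     TEXT NOT NULL UNIQUE,
    Load_DTS           TEXT NOT NULL,
    Record_Source      TEXT NOT NULL
);
CREATE TABLE Customer_Type_Sat (
    H_Cust_Type_Seq_ID    INTEGER NOT NULL
        REFERENCES Href_Customer_Type (H_Cust_Type_Seq_ID),
    Load_DTS              TEXT NOT NULL,
    Cust_Type_Description TEXT,
    Record_Source         TEXT NOT NULL,
    PRIMARY KEY (H_Cust_Type_Seq_ID, Load_DTS)
);
-- Invented sample rows.
INSERT INTO Customer_Hub VALUES (1, 'C-100', '2014-06-05T09:00:00', 'CRM');
INSERT INTO Customer_Sat VALUES (1, '2014-06-05T09:00:00', 'Acme', 'RTL', 'CRM');
INSERT INTO Href_Customer_Type VALUES (10, 'RTL', '2014-06-05T09:00:00', 'REF');
INSERT INTO Customer_Type_Sat VALUES (10, '2014-06-05T09:00:00', 'Retail', 'REF');
""")

# Common case: the code alone is adequate context, so no join is needed.
common = conn.execute(
    "SELECT Cust_Name, Cust_Type_Code FROM Customer_Sat"
).fetchall()

# Rare case: resolve the code to its description by joining on the code value.
rare = conn.execute("""
    SELECT s.Cust_Name, ts.Cust_Type_Description
    FROM Customer_Sat AS s
    JOIN Href_Customer_Type AS t ON t.Cust_Type_Code = s.Cust_Type_Code
    JOIN Customer_Type_Sat AS ts ON ts.H_Cust_Type_Seq_ID = t.H_Cust_Type_Seq_ID
""").fetchall()
print(common, rare)
```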

Page 20: Data Vault ReConnect Speed Presenting AM Part One

Assuming Dis-Connected Concept is a Core Business Concept related to that subject area, it can also be Promoted easily.

Customer Hub: H_Cust_Seq_ID, Customer_Number, Date/Time Stamp, Record Source

Href_Customer Type: H_Cust_Type_Seq_ID, Cust_Type_Code, Date/Time Stamp, Record Source

Customer Sat: H_Cust_Seq_ID, Date/Time Stamp, Cust_Name, Record Source

Customer Type Sat: H_Cust_Type_Seq_ID, Date/Time Stamp, Cust_Type_Description, Record Source

Cust_/_Cust_Type_Link: H_Cust_Type_Seq_ID, H_Cust_Seq_ID, Date/Time Stamp, Record Source
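
A possible promotion step, sketched in SQLite under stated assumptions: the link is populated by matching the code held in the customer satellite against the reference hub. The table name `Cust_Cust_Type_Link`, the `Load_DTS` column name, the `'PROMOTION'` record source, and all sample rows are hypothetical. Removing `Cust_Type_Code` from the satellite afterwards is left out of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Starting point: the satellite still carries the code (invented sample rows).
CREATE TABLE Customer_Sat (
    H_Cust_Seq_ID  INTEGER NOT NULL,
    Load_DTS       TEXT NOT NULL,
    Cust_Name      TEXT,
    Cust_Type_Code TEXT,
    Record_Source  TEXT NOT NULL
);
CREATE TABLE Href_Customer_Type (
    H_Cust_Type_Seq_ID INTEGER PRIMARY KEY,
    Cust_Type_Code     TEXT NOT NULL UNIQUE,
    Load_DTS           TEXT NOT NULL,
    Record_Source      TEXT NOT NULL
);
INSERT INTO Customer_Sat VALUES
    (1, '2014-06-05T09:00:00', 'Acme', 'RTL', 'CRM'),
    (2, '2014-06-05T09:00:00', 'Globex', 'WHS', 'CRM');
INSERT INTO Href_Customer_Type VALUES
    (10, 'RTL', '2014-06-05T09:00:00', 'REF'),
    (20, 'WHS', '2014-06-05T09:00:00', 'REF');

-- Promotion target: a link between the customer and customer type keys.
CREATE TABLE Cust_Cust_Type_Link (
    H_Cust_Type_Seq_ID INTEGER NOT NULL,
    H_Cust_Seq_ID      INTEGER NOT NULL,
    Load_DTS           TEXT NOT NULL,
    Record_Source      TEXT NOT NULL,
    PRIMARY KEY (H_Cust_Type_Seq_ID, H_Cust_Seq_ID)
);
""")

# Derive one link row per (customer, type) pair found via the code value.
conn.execute(
    """
    INSERT INTO Cust_Cust_Type_Link
        (H_Cust_Type_Seq_ID, H_Cust_Seq_ID, Load_DTS, Record_Source)
    SELECT DISTINCT t.H_Cust_Type_Seq_ID, s.H_Cust_Seq_ID, ?, ?
    FROM Customer_Sat AS s
    JOIN Href_Customer_Type AS t ON t.Cust_Type_Code = s.Cust_Type_Code
    """,
    ("2014-06-05T10:00:00", "PROMOTION"),
)

link_rows = conn.execute(
    "SELECT H_Cust_Type_Seq_ID, H_Cust_Seq_ID "
    "FROM Cust_Cust_Type_Link ORDER BY H_Cust_Seq_ID"
).fetchall()
print(link_rows)
```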

Page 21: Data Vault ReConnect Speed Presenting AM Part One

Page 22: Data Vault ReConnect Speed Presenting AM Part One

Source-facing auditability and Business-facing central business view…

Page 23: Data Vault ReConnect Speed Presenting AM Part One

#LessLayers = Better!

Page 24: Data Vault ReConnect Speed Presenting AM Part One

Problem with 1:1 RAW DV in the EDW…

[Diagram labels: Accounting, Finance, Logistics, Sales (shown twice); Star 1 through Star 11, Star n…]

Page 25: Data Vault ReConnect Speed Presenting AM Part One

…will need to have a BDW Layer for the EDW. However, we can now still lose the auditable path, and we are tied to a mandatory persisted layer…

[Diagram labels: Star 1 through Star 11, Star n…; Accounting, Finance, Logistics, Sales (shown twice); BDV]

Page 26: Data Vault ReConnect Speed Presenting AM Part One

Can instead practice “RAW Integration” to maintain auditable path and reduce layers.

[Diagram labels: Star 1 through Star 11, Star n…; Acctg / Fin, Sales, BDV; Sales, Accounting, Finance, Logistics; RAW BDV]

Page 27: Data Vault ReConnect Speed Presenting AM Part One

How Much can be done #InMemory? #Virtualized?