data quality & data governance

Data Quality and Data Governance Tuba Yaman Him

Upload: tuba-yaman-him

Post on 18-Jan-2017



Page 1: Data Quality & Data Governance

Data Quality and Data Governance

Tuba Yaman Him

Page 4: Data Quality & Data Governance

Why is Data Quality Important?
• Wrong Reports = Wrong Decisions
• Bad Reputation
• Wasted Money

According to a recent study in the UK, US and France, 16% to 18% of departmental budgets are eaten up because of poor data quality. The research also indicates that 90% of surveyed companies admit that inaccurate data (and its consequences, such as duplicate accounts, lost contacts and missed sales opportunities) contributes to budget waste. On top of this, a 2009 Gartner study revealed that the average organization surveyed loses $8.2 million annually because of poor data quality, and that most of this is due to lost productivity.

Page 5: Data Quality & Data Governance

Modern Data Environment

[Diagram]
Source systems: ERP Systems (SAP/Oracle etc.), CRM (Salesforce, Dynamics etc.), Manufacturing Systems, Financial Systems, Web Applications, Documents
Enterprise Data Warehouse
Data marts: Marketing Data Mart, Sales Data Mart, Financial Data Mart

Page 7: Data Quality & Data Governance

Dimensions Of Data Quality
• Integrity
• Accuracy
• Currency
• Validity

Page 8: Data Quality & Data Governance

Dimensions Of Data Quality

Accuracy
• Do data objects accurately represent the "real-world" values?
• Is the data correct?
• Example: A wrong sales amount, wrong contact information for a customer, etc.

Page 9: Data Quality & Data Governance

Dimensions Of Data Quality

Integrity
• Is any data missing important relationship linkages?
• Example: A product ownership without a valid owner/customer record.
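The integrity example above (a product ownership without a valid customer record) can be checked with a simple lookup. This is a minimal sketch in Python; the record layout and the field names `customer_id` and `product_id` are assumptions made for illustration, not taken from the slides.

```python
# Hypothetical sample data; field names are illustrative assumptions.
customers = [
    {"customer_id": 1, "name": "Customer A"},
    {"customer_id": 2, "name": "Customer B"},
]
ownerships = [
    {"product_id": "P-10", "customer_id": 1},
    {"product_id": "P-11", "customer_id": 3},     # points at a missing customer
    {"product_id": "P-12", "customer_id": None},  # no owner at all
]


def find_orphan_ownerships(ownerships, customers):
    """Return ownership rows whose customer_id matches no customer record."""
    known_ids = {c["customer_id"] for c in customers}
    return [o for o in ownerships if o["customer_id"] not in known_ids]


orphans = find_orphan_ownerships(ownerships, customers)
for row in orphans:
    print("Orphan ownership:", row["product_id"])
```

In a relational database the same rule would normally be enforced with a foreign key constraint; a check like this is mainly useful for auditing data that was loaded without one.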

Page 10: Data Quality & Data Governance

Dimensions Of Data Quality

Completeness
• Is any necessary part of the data missing?
• Example: A customer record which has an address without a city, although city is mandatory.
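A completeness rule such as "city is mandatory" can be expressed as a list of required fields. The sketch below is one possible way to do it in Python; the field names and the choice of which fields are mandatory are assumptions for illustration.

```python
# Assumed rule set: which address fields are mandatory is illustrative only.
MANDATORY_FIELDS = ("street", "city", "postal_code")


def missing_mandatory_fields(address, mandatory=MANDATORY_FIELDS):
    """Return the mandatory fields that are absent, None, or blank."""
    missing = []
    for field in mandatory:
        value = address.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record: the city was left empty.
address = {"street": "Example Street 1", "city": "", "postal_code": "00000"}
print(missing_mandatory_fields(address))
```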

Page 11: Data Quality & Data Governance

Dimensions Of Data Quality

Currency
• Is the data up to date?
• Do we provide real-time data to our clients?
• Example: Customers with old address information. A bank which cannot provide the real-time balance of its customers' funds.
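One way to monitor currency is to flag records that have not been confirmed within some time window. The following Python sketch assumes each record carries a hypothetical `last_verified` date and that one year is an acceptable maximum age; both are assumptions, not something the slides specify.

```python
from datetime import date, timedelta


def is_stale(last_verified, today, max_age_days=365):
    """Return True if the record was last verified more than max_age_days ago."""
    return (today - last_verified) > timedelta(days=max_age_days)


# Hypothetical records; `last_verified` is an assumed field name.
records = [
    {"customer": "A", "last_verified": date(2016, 11, 1)},
    {"customer": "B", "last_verified": date(2014, 3, 15)},
]
today = date(2017, 1, 18)  # fixed reference date so the result is reproducible
stale = [r["customer"] for r in records if is_stale(r["last_verified"], today)]
print(stale)
```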

Page 12: Data Quality & Data Governance

Dimensions Of Data Quality

Uniqueness
• Are there multiple, unnecessary representations of the same data objects within your data?
• Example: 3 different records which refer to the same customer. Misspelling can be the reason.

Page 13: Data Quality & Data Governance

Dimensions Of Data Quality

Validity
• Do data values comply with the specified formats and rules?
• Example: A customer record whose DOB is dd/mm/1735. A customer record with an invalid UK postal code such as WC3T.
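The two validity examples above can be sketched with a regular expression and a date check. The postcode pattern below is a simplified format check only (it does not confirm that a postcode actually exists), and the earliest accepted birth year of 1900 is an assumption chosen for illustration.

```python
import re
from datetime import date, datetime

# Simplified UK postcode shape: outward code, optional space, inward code.
UK_POSTCODE_RE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


def is_valid_uk_postcode(value):
    """Format check only; says nothing about whether the postcode is in use."""
    return bool(UK_POSTCODE_RE.match(value.strip().upper()))


def is_valid_dob(value, earliest_year=1900):
    """Accept dd/mm/yyyy dates that parse, are not in the future,
    and are not earlier than the assumed earliest_year."""
    try:
        dob = datetime.strptime(value, "%d/%m/%Y").date()
    except ValueError:
        return False
    return earliest_year <= dob.year and dob <= date.today()


print(is_valid_uk_postcode("WC3T"))      # outward code only, so rejected
print(is_valid_uk_postcode("SW1A 1AA"))  # well-formed full postcode
print(is_valid_dob("01/01/1735"))        # parses, but implausibly old
```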

Page 14: Data Quality & Data Governance

Methods and Tools For Data Quality

Objective | How to
Validation | Regular Expressions
Data Merging For Duplicate Data | SSIS Fuzzy Lookup, Fuzzy Grouping Packages
Integrity | Proper ETL and ELT Process
Completeness | Mandatory Fields Rules, ETL/ELT
Verification For Important Information | Activation E-mails, Verification SMS
Prevent Typographical Error | Autocomplete Tools
Minimizing Human Errors | Employee Training

Page 15: Data Quality & Data Governance

SSIS Fuzzy Matching

• Tuba Yaman Him • [email protected] • Deniz Apt. • Ataşehir • İstanbul
• Tuba Him • [email protected] • Deniz Apt. • Ataşehir • istanbul
90% Match

• Tuğba Yaman Him • [email protected] • Deniz Apt. • Ataşehir • İstanbul
• Tuba Him • [email protected] • Deniz Apt. • Ataşehir • istanbul
90% Match
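The idea behind fuzzy matching can be illustrated outside SSIS. The sketch below uses Python's standard `difflib.SequenceMatcher` rather than the SSIS Fuzzy Lookup algorithm, so its scores will not reproduce the percentages on the slide. The field names, the accent-stripping normalization and the simple averaging of per-field scores are all assumptions made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Strip accents, lowercase, and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_similarity(rec_a, rec_b, fields):
    """Unweighted average of per-field similarities (an assumed scoring rule)."""
    scores = [similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores)


FIELDS = ("name", "building", "district", "city")  # hypothetical field names
rec1 = {"name": "Tuba Yaman Him", "building": "Deniz Apt.",
        "district": "Ataşehir", "city": "İstanbul"}
rec2 = {"name": "Tuba Him", "building": "Deniz Apt.",
        "district": "Ataşehir", "city": "istanbul"}

score = record_similarity(rec1, rec2, FIELDS)
print(f"Similarity: {score:.0%}")
```

Pairs scoring above a chosen threshold would then be reviewed or merged; where to set that threshold depends on how costly a false merge is compared with a missed duplicate.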

Page 16: Data Quality & Data Governance

Data Governance

Data governance is a set of policies, rules and standards for increasing and maintaining enterprise data quality. It is about putting people in charge of fixing and preventing issues with data so that the enterprise can become more efficient. Data governance also describes an evolutionary process for a company: it alters the company's way of thinking and sets up the processes for handling information so that it can be used by the entire organization. Technology is used, in whatever form is necessary, to support that process. When companies want, or are required, to gain control of their data, they empower their people, set up processes and get help from technology to do it.

Page 17: Data Quality & Data Governance

Data Governance

Page 18: Data Quality & Data Governance

Data Governance – Job Ads

Country | Job ads
USA | 1,885
India | 290
UK | 253
Canada | 113
Germany | 83
Singapore | 25
Switzerland | 24
Turkey | 0

Page 19: Data Quality & Data Governance

Data Governance Team Missions

Page 20: Data Quality & Data Governance

Data Quality Scorecard

Objective | Action Plan | KPI | Target | Jul.2016 | Aug.2016 | Sep.2016
Decrease duplicates | A merging flow will be implemented | Number of duplicate records in CDB | 0 | 11,276 | 3,500 | 200
Increase the correctness of email info | Verification process will be implemented | Number of invalid email addresses in Customer DB | <500 | 25,500 | 4,700 | 4,700
Decrease wrong relationships between products and customers | ETL enhancement is planned | Number of incorrect relations between products and customers in DB | 0 | 2,700 | 2,700 | 2,900
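A scorecard like this can be summarized automatically. The Python sketch below takes the KPI values from the scorecard (reading the dot-separated figures as thousands, which is an assumption about the original notation) and reports the change since July 2016 and whether the target has been met. The data layout and the function name `summarize` are illustrative.

```python
import operator

# Comparison operators used to express targets such as "<500" or "0".
OPS = {"<": operator.lt, "<=": operator.le}

# Monthly values are Jul, Aug, Sep 2016, taken from the scorecard.
scorecard = [
    {"kpi": "Duplicate records in CDB",
     "target": ("<=", 0), "monthly": [11276, 3500, 200]},
    {"kpi": "Invalid email addresses in Customer DB",
     "target": ("<", 500), "monthly": [25500, 4700, 4700]},
    {"kpi": "Incorrect product-customer relations in DB",
     "target": ("<=", 0), "monthly": [2700, 2700, 2900]},
]


def summarize(row):
    """Compare the latest month with the first month and with the target."""
    first, latest = row["monthly"][0], row["monthly"][-1]
    op, limit = row["target"]
    return {
        "kpi": row["kpi"],
        "latest": latest,
        "change_pct": round(100 * (latest - first) / first, 1),
        "target_met": OPS[op](latest, limit),
    }


summaries = [summarize(row) for row in scorecard]
for s in summaries:
    status = "met" if s["target_met"] else "not met"
    print(f'{s["kpi"]}: {s["latest"]} ({s["change_pct"]:+.1f}% since Jul), target {status}')
```

A summary like this makes the third KPI stand out: it is the only one moving away from its target, which suggests the planned ETL enhancement had not yet taken effect by September.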

Page 21: Data Quality & Data Governance