Practical Data Strategies in the Real World of Poor Data Quality Andrew Patricio | www.dataeffectiveness.com

Post on 21-Apr-2017


TRANSCRIPT

Page 1: Practical Data Strategies in the real world of poor Data Quality

Practical Data Strategies in the Real World of Poor Data Quality

Andrew Patricio | www.dataeffectiveness.com

Page 2: Practical Data Strategies in the real world of poor Data Quality

Agenda

• Foundation
• Data Effectiveness
• Data Sophistication
• Data Prioritization
• Consistency, Relevancy, Accuracy
• Data Quality Culture
• Reporting platform
• Managing Requests
• Summary

Data Effectiveness

Andrew Patricio www.dataeffectiveness.com EDW2017 2

Page 3: Practical Data Strategies in the real world of poor Data Quality

Foundation


Page 4: Practical Data Strategies in the real world of poor Data Quality

The Foundation

ef·fec·tive·ness /iˈfektivnəs/, noun: the degree to which something is successful in producing the intended or desired result

Page 5: Practical Data Strategies in the real world of poor Data Quality

The Wrong Question

Not “What do you want?”

Instead, “What problem are you trying to solve?”

Page 6: Practical Data Strategies in the real world of poor Data Quality

Effectiveness is about solving problems, not deliverables

What do you want?
• Focused on requirements
• Mid-stream changes = not delivering what was promised
• Encourages business to think transactionally instead of as partners in the solution
• Overall sense is one of CYA: “We just did what you asked”

What problem are you trying to solve?
• Focused on end goal
• Mid-stream changes = steering to maintain drive towards end goal
• Forces business to think of themselves as part of the team, and to articulate the problem, thereby making sure they understand it themselves
• Overall sense is one of partners on a journey to discover an unknown answer

Page 7: Practical Data Strategies in the real world of poor Data Quality

The Ends (sometimes) Justify the Means

Having a goal of effectiveness instead of quality means a project is successful to the degree that it achieves the desired result

“What problem are you trying to solve?” is how to define the desired result

This combination gives you both a structure to make progress and the freedom to follow and steer around obstacles

Page 8: Practical Data Strategies in the real world of poor Data Quality

About Me – Andrew Patricio

President, Data Effectiveness Inc
• www.dataeffectiveness.com
• Data Evaluation
• Data Strategy
• Data Infrastructure

Personal background
• Chief Data Officer at DC Public Schools, Nov 2010 to June 2016
• IT & management consulting
• Electrical Engineering

Page 9: Practical Data Strategies in the real world of poor Data Quality

Data Effectiveness


Page 10: Practical Data Strategies in the real world of poor Data Quality

Data Driven Decision Making

All organizations seek to make decisions based on data

Page 11: Practical Data Strategies in the real world of poor Data Quality

Data Reality

But the reality is that the data we have available is often in poor shape


Page 12: Practical Data Strategies in the real world of poor Data Quality

Getting to Data Driven – Reporting vs Analytics

Steve Levitt, Freakonomics Podcast, 26 June 2014:

“Yeah, I think the hardest single thing is that even if you have the desire … to be data driven, that the existing systems…I never would have thought this before I started working with companies. I never would have imagined that it is an I.T. problem that you simply cannot get the data you want, and the data are held in 27 different data sets that have different identifiers … the I.T. support and the complexity in these big firms blows your mind about how hard it is to do the littlest, simple things.”

Data analysts are NOT necessarily technologists

Page 13: Practical Data Strategies in the real world of poor Data Quality

Data Driven Decision Making

High performance data analytics…

…in the real world of data

Requires pragmatic data reporting

Page 14: Practical Data Strategies in the real world of poor Data Quality

Data Sophistication


Page 15: Practical Data Strategies in the real world of poor Data Quality

Data Sophistication Cycle

Results oriented incompatible with data driven?

• In a results-oriented organization the push is to “get things done” and the velocity of the need often makes it difficult for data systems to keep up.

• Data quality often suffers and the data driven aspect gets starved of food

Solution is to design data system complexity to slightly lead process sophistication rather than being too far ahead

Page 16: Practical Data Strategies in the real world of poor Data Quality

Data Sophistication Cycle

Data capture system evolves along with process sophistication
Reporting sophistication should keep pace with data quality

Example Data Entry System             Key Data Structure
Notepad                               Open entry
Excel                                 Data cells
MS Access                             Data records
Student Information System (SIS)      Normalized data model
Reporting system separate from SIS    Reporting data model

[Diagram axes alongside the table: Process Sophistication, Data Quality, Reporting Sophistication]

Don’t build a formal data warehouse for Excel “data systems”!

Page 17: Practical Data Strategies in the real world of poor Data Quality

Data Prioritization


Page 18: Practical Data Strategies in the real world of poor Data Quality

Capacity vs Demand

Not all data requests are created equal
Need to prioritize given finite capacity, time, and budget
Can’t do everything perfectly but can be consciously imperfect
Effectiveness is defined by achieving desired results, so need to set expectations accordingly about those results

“What problem are you trying to solve?” but different parts of the organization have different problems

Page 19: Practical Data Strategies in the real world of poor Data Quality

Data Driven Pipeline

[Diagram: Data Reporting → Effective Data → Data Analytics → Effective Decisions → Programs / Business → Effective Outcomes → Organizational Success]

• Product of business is Effective Outcomes
• Product of analytics is Effective Decisions
• Product of reporting is Effective Data

Page 20: Practical Data Strategies in the real world of poor Data Quality

Organizational Goals drive focus of data pipeline

[Diagram: Prioritize Outcomes → Prioritize Analytics → Prioritize Data]

• Desired organizational success prioritizes which outcomes the business should focus on
• Desired business outcomes prioritize which decisions analytics should focus on
• Desired analytics decisions prioritize which data reporting should focus on

Page 21: Practical Data Strategies in the real world of poor Data Quality

Focus on relevant data

Two considerations:
1. Some organizational goals are foundational if not necessarily value adding: eg Regulatory, Human Resources, Financial health, etc
2. Not all interesting questions are relevant

Result is that resources are focused on data that ultimately solves the main problem of achieving organizational goals

Page 22: Practical Data Strategies in the real world of poor Data Quality

Data Quality

Not all of your data needs to be at the same level of quality. The sole measure is whether or not it is sufficient to achieve a particular organizational goal.

[Diagram labels: Reporting Infrastructure; Effective Data; Business Streams and various Analytics; Effective Outcomes; Overall Organizational Successes]

Page 23: Practical Data Strategies in the real world of poor Data Quality

What is Data Effectiveness?

Data Effectiveness is the primary responsibility of reporting

Being effectively data driven starts with Data Effectiveness:

Getting good data, when it is needed, to whoever needs it

[Diagram labels: Data Reporting; Effective Data; Data Analytics; Effective Decisions; Programs / Business; Effective Outcomes; Organizational Success]

Page 24: Practical Data Strategies in the real world of poor Data Quality

CAR cycle


Page 25: Practical Data Strategies in the real world of poor Data Quality

How does Data go wrong?

Data entry issues
• Fat fingering
• Workarounds, solving problem in front of them
  • Transactional system only cares about latest enrollment action, not data changes
• Poor understanding of process/policy
• Duplication

Legacy data
• Different definitions year to year (regulatory changes, etc)
• Poor QA processes (definition incorrect)
• System transitions (poor data transfer strategy from previous vendors)

Page 26: Practical Data Strategies in the real world of poor Data Quality

Data issues: End of year attendance example

Date report run    SY13-14 ADA (example)
July 2014          95%
October 2014       92%

Initially assumed that it was a bug in the second report
Reason behind the nonsensical error was that schools were changing the enrollment date from Aug 14 to Aug 15 instead of entering a new enrollment for the year
Registrars were just solving the immediate problem in front of them
Students who were present in SY14-15 data in June were missing in October
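A minimal sketch of why overwriting an enrollment date changes a report that was already run. It assumes a hypothetical membership check that counts a student on a given day only if an enrollment record covers that day; the function name, the dates, and the far-future "still enrolled" date are illustrative and not taken from the actual system.

```python
from datetime import date

# Hypothetical far-future end date standing in for "still enrolled".
OPEN = date(3030, 1, 1)

def enrolled_on(enrollments, student, day):
    """True if any enrollment record for the student covers the given day."""
    return any(start <= day <= end
               for s, start, end in enrollments if s == student)

# State when the first report was run: one record covering the school year.
before_edit = [("s1", date(2014, 8, 25), OPEN)]

# State at the rerun: the start date was overwritten for the next year
# instead of a new enrollment record being added.
after_edit = [("s1", date(2015, 8, 24), OPEN)]

day_in_prior_year = date(2015, 3, 2)
counted_before = enrolled_on(before_edit, "s1", day_in_prior_year)
counted_after = enrolled_on(after_edit, "s1", day_in_prior_year)
```

With the overwritten date the student no longer appears as a member for any day in the earlier year, so a rerun of the same report can produce a different number.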

Page 27: Practical Data Strategies in the real world of poor Data Quality

Data issues: School Dashboard vs Weekly reports

Idea was to get more regularly updated data to schools
Inconsistencies reduced trust in data

Two different queries implemented the same metric; data quality issues meant slightly different answers

• School on student table used for dashboard queries
• Didn’t always match school based on enrollment history used in reports

Page 28: Practical Data Strategies in the real world of poor Data Quality

Fixing Data Quality

How do we make our data more effective given these challenges?

Improve Data Quality long term?

Make data driven decisions today?

Page 29: Practical Data Strategies in the real world of poor Data Quality

Consistency, Accuracy, Relevancy cycle

Problem is how to build a train as it’s moving down the track. When data quality is not so good you still have to provide reports and make decisions; you cannot wait until everything is perfect because that’s a moving target.

Good enough is good enough, but what is good enough?

[Diagram: Consistency → Accuracy → Relevancy]

Page 30: Practical Data Strategies in the real world of poor Data Quality

Consistency, Accuracy, Relevancy cycle

Goal is to have accurate metrics aligned with business goal
• Cannot talk about accuracy if there isn’t agreement on the value being reported
• Once the value is consistent, you can talk about if it’s accurate
• Once it’s accurate you can talk about whether it’s relevant to business goal

Metric A                      Report 1    Report 2    Report 3
Starting point                90          81          87
Consistent (DATA)             87          87          87
Accurate (INFORMATION)        85          85          85
Relevant (KNOWLEDGE)          Metric aligned with goal

Not relevant: determine proposed change and go through cycle again

Page 31: Practical Data Strategies in the real world of poor Data Quality

Consistency – DATA

“What is the value measure of this metric?”

Driven by reporting

Consistency means literally just that: a metric has the same value for the same parameters no matter who pulls it

Factors
• Traceability – same metric in different reports must be traced back to same source
• Same parameters – need to be careful because different metrics could be referred to by the same common name
• Time factor – legitimate changes can be made after report is run

Total absences    Truant absences    Pulled    Reason behind difference
100               90                 Oct       First pull
88                88                 Nov       Data corrected
80                85                 Dec       Some unexcused absences corrected to suspensions
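The consistency test described above (same metric, same parameters, same value regardless of who pulls it) can be sketched as a simple grouping check. This is an illustration only: the tuple layout, the function name, and the sample values are assumptions, with the Metric A numbers borrowed from the cycle example.

```python
from collections import defaultdict

def inconsistent_metrics(pulls):
    """Group pulls by (metric, parameters) and return groups whose values disagree."""
    values = defaultdict(set)
    for report, metric, params, value in pulls:
        values[(metric, params)].add(value)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}

# (report, metric, parameters, value): hypothetical pulls of two metrics.
pulls = [
    ("Report 1", "Metric A", "SY13-14", 90),
    ("Report 2", "Metric A", "SY13-14", 81),
    ("Report 3", "Metric A", "SY13-14", 87),
    ("Report 1", "Metric B", "SY13-14", 40),
    ("Report 2", "Metric B", "SY13-14", 40),
]
flagged = inconsistent_metrics(pulls)
```

Keying on the parameters as well as the metric name guards against the "same common name" trap: two pulls are only compared when they claim to measure the same thing. The time factor is not modeled here; pulls taken on different dates may differ legitimately.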

Page 32: Practical Data Strategies in the real world of poor Data Quality

Accuracy – INFORMATION

“Is the value measure shown for this metric correct?”

Driven by Analytics

Once you have consistency, you can work on accuracy: key is to use only good data when verifying “accuracy”

Metric could be “inaccurate” because
• Bug in query – fix
• Wrong or inconsistent business rules – nail down definitions. Two different sets of business rules for the same metric could both be appropriate: decide whether they are really two different metrics, or which set is the “correct” business rules
• Data quality – identify source and reason (data entry team)

Page 33: Practical Data Strategies in the real world of poor Data Quality

Relevancy – KNOWLEDGE

“Is this metric helping to meet our goal?”

Driven by business

Once you have accuracy, then you can determine whether that metric is useful. If not, then either business goal or metric needs to change
• Changing metric
  • Use new metric – longer to get consistency, cycle could be just as long or longer
  • Refine business rules of existing metric – less effort to get consistency, shorter cycle
• Changing business goal
  • Effective data in hand is worth two in the bush
  • Tail could be wagging the dog, but an unmeasurable business goal is just a wish

Example:
• Unexcused absences: suspensions are not considered unexcused absences, so this doesn’t truly capture time away from instruction
• In Seat Attendance (ISA): counts all absences except in-school suspension, etc
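To make the difference between the two example metrics concrete, here is a small sketch under assumed absence codes. The code strings and function names are hypothetical; the only rules taken from the slide are that suspensions do not count as unexcused absences and that ISA counts all absences except in-school suspension.

```python
def unexcused_absences(records):
    """Count only records coded as unexcused."""
    return sum(1 for kind in records if kind == "unexcused")

def isa_absences(records):
    """Count every absence except in-school suspension (student is still in the building)."""
    return sum(1 for kind in records if kind != "in_school_suspension")

# One student's absence records, using hypothetical codes.
records = ["unexcused", "excused", "suspension", "in_school_suspension", "unexcused"]

unexcused_total = unexcused_absences(records)
isa_total = isa_absences(records)
```

The same records give 2 under the first metric and 4 under the second, which is why the choice of metric matters when the goal is time away from instruction.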

Page 34: Practical Data Strategies in the real world of poor Data Quality

Cycle

As data becomes information becomes knowledge, the data sophistication of the process grows, which requires more/different metrics

Different metrics could be at different points in the cycle

[Diagram: many repeated Consistency → Accuracy → Relevancy cycles]

Page 35: Practical Data Strategies in the real world of poor Data Quality

Data Quality Culture


Page 36: Practical Data Strategies in the real world of poor Data Quality

Why is there inconsistency in the first place?

Ongoing issue is data entry problem
• Need to balance flexibility/freedom of entry with validation checks
• Most systems can validate based on patterns or entries but do not have enough flexibility to differentiate between other valid and invalid entries

Why are there data entry errors?

Often users don’t have the access to make a needed data change so they must enter a request for the tech team to handle
• Strictness of data entry check needs to balance against technical team capacity

Page 37: Practical Data Strategies in the real world of poor Data Quality

Short sighted data entry

Example: Enrollment overlaps

Student Information System is transactional and only tracks current state
• For enrollment it doesn’t care about data values in enrollment history
• Only cares about latest enrollment action (admit or withdrawal) and school
• “Enrollment history” in system is merely log of events
• Users can willy-nilly adjust enrollment history with no effect on current status
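Because the transactional system does not police enrollment history, overlaps have to be detected after the fact. The following is one possible sketch, assuming each enrollment period is an (admit date, withdraw date, school) tuple for a single student; the function name, the sample periods, and the decision to treat a same-day hand-off as valid are all assumptions.

```python
from datetime import date

def find_overlaps(periods):
    """Return pairs of enrollment periods for one student that overlap in time.

    A period that starts on the day the previous one ends is treated as a
    hand-off between schools, not an overlap.
    """
    ordered = sorted(periods)
    overlaps = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            # 'later' starts on or after 'earlier' because the list is sorted,
            # so they overlap only if it starts before 'earlier' ends.
            if later[0] < earlier[1]:
                overlaps.append((earlier, later))
    return overlaps

periods = [
    (date(2011, 8, 24), date(2012, 6, 24), "123"),
    (date(2012, 6, 24), date(2012, 10, 10), "456"),
    (date(2012, 9, 1), date(2013, 6, 20), "789"),  # admitted before leaving 456
]
overlaps = find_overlaps(periods)
```

The pairwise loop is quadratic, which is acceptable for the handful of periods a single student has.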

Page 38: Practical Data Strategies in the real world of poor Data Quality

Preventing data entry errors

Business line workers are our "data entry team" rather than our “users”
• Successful data reporting intimately tied to their effectiveness
• Perfect system which users are not comfortable with will still have bad data quality

Taking this point of view automatically fosters more collaboration
• Connecting the dots for end users by tracing the pathway from a specific data entry error to specific issue on data report
• Data Integrity Management system displays errors to “data entry team”
  • Includes steps as to how issue can be fixed
  • Includes direct link to relevant record in transactional system to minimize context switching

[Diagram: Users → Data Entry Team]

Page 39: Practical Data Strategies in the real world of poor Data Quality

Data Integrity Management system

Central system to flag data errors to users for them to correct
• Ideally errors reported back to users who entered them
• Provides specific resolution steps

Page 40: Practical Data Strategies in the real world of poor Data Quality

Data Integrity Management System

Fixing Data

Error Correction Cycle
• Feed back errors to users for them to correct
• Technical team looks for other common data entry errors to either prevent through front-end validation or add to error checking

[Diagram labels: Transactional Systems; Error Identification; Error Dashboard; Users (ie “data entry team”); Fix Data Errors; Technical team; Improve Front End Validations; Update Error Patterns]
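A rough sketch of the error identification step, assuming error patterns are kept as a list the technical team can extend. Everything here is hypothetical: the pattern names, the record fields, the resolution text, and the example URL are placeholders, not the actual Data Integrity Management system.

```python
# Each pattern: (error name, test that returns True for a bad record, resolution steps).
ERROR_PATTERNS = [
    ("missing_school",
     lambda r: not r.get("school"),
     "Enter the school code on the enrollment record."),
    ("withdraw_before_admit",
     lambda r: r["withdraw"] < r["admit"],
     "Correct the withdraw date or add a new enrollment."),
]

def build_error_dashboard(records, base_url):
    """Return one dashboard row per (record, matched error pattern)."""
    rows = []
    for record in records:
        for name, is_bad, steps in ERROR_PATTERNS:
            if is_bad(record):
                rows.append({
                    "entered_by": record["entered_by"],  # route back to who entered it
                    "error": name,
                    "how_to_fix": steps,
                    "link": f"{base_url}/{record['id']}",  # direct link to the record
                })
    return rows

# Dates are ISO strings, which compare correctly as plain text.
records = [
    {"id": 1, "entered_by": "registrar_a", "school": "123",
     "admit": "2014-08-25", "withdraw": "2015-06-20"},
    {"id": 2, "entered_by": "registrar_b", "school": "",
     "admit": "2014-08-25", "withdraw": "2014-08-01"},
]
dashboard = build_error_dashboard(records, "https://sis.example/enrollment")
```

Adding a newly discovered error pattern is then a one-line change to the list, which mirrors the "Update Error Patterns" step in the cycle.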

Page 41: Practical Data Strategies in the real world of poor Data Quality

Data Integrity Management System


Page 42: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform


Page 43: Practical Data Strategies in the real world of poor Data Quality

Single system for operations and reporting

Many organizations create reports from queries directly off transactional systems
• Makes querying a bear due to complex data model for transactional system
• All reports require technical team capacity, even simple ones
• Highly normalized = simple knowledge is stored in a complex way
• Optimized for inserts not reporting
• Business definitions often exist only in query code

Example: find Residency Verification

    select decode(afv.value, null, 'N', 438, 'N', 'Y') "Residency Verification SY13-14"
    from students p,
         adhoc_fields_values afv,
         adhoc_fields_drop_downs afdd
    where p.pupil_number = afv.pupil_number(+)
      and afv.adhoc_fields_def_ID(+) = 109
      and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+)
      and afv.value = AFDD.FIELD_KEY_VALUE(+)

Page 44: Practical Data Strategies in the real world of poor Data Quality

Reporting platform - Speed

Data model focused on reporting, not on transactions
• Space vs speed tradeoff highly biased towards speed
  • Virtually unlimited disk space
  • Batch processing not real time
• Complete flexibility to organize data optimally for ease of reporting
• Central store for all siloed data (data-warehouse lite)

[Diagram: example reporting tables Student Demographics, Admit_withdraw, Attendance Base, Assessment, Courses_Taken]

Page 45: Practical Data Strategies in the real world of poor Data Quality

Reporting platform – ease of use

Really nothing more than a dedicated reporting database, not a data warehouse

Data model can be tailored for reporting
• Keeps track of all changes, not just latest data (valid from, valid to)
• Super flat, highly denormalized
• Redundancy okay so long as we have data traceability
• Have multiple copies/formats/structures of same base data for different users/uses
• Fewer joins so can shift technical capacity to more complex business rules
• Can be exposed more directly to data analysts for increased self-service

Query against the transactional system:

    select decode(afv.value, null, 'N', 438, 'N', 'Y') "Residency Verification"
    from students p,
         adhoc_fields_values afv,
         adhoc_fields_drop_downs afdd
    where p.pupil_number = afv.pupil_number(+)
      and afv.adhoc_fields_def_ID(+) = 109
      and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+)
      and afv.value = AFDD.FIELD_KEY_VALUE(+)

Same result from the reporting platform:

    select [Residency Verification] from student_demographics_snapshot
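The "valid from, valid to" idea can be illustrated with an as-of lookup over versioned rows. This is a sketch under assumptions: the row layout, the function name, and the half-open interval convention are illustrative, and the far-future date reuses the 1 January 3030 convention shown in the deck's enrollment example.

```python
from datetime import date

FAR_FUTURE = date(3030, 1, 1)  # stands in for "still current" instead of a null

# (student, school, valid_from, valid_to): every change is kept as its own row.
history = [
    ("s1", "123", date(2014, 8, 25), date(2015, 1, 10)),
    ("s1", "456", date(2015, 1, 10), FAR_FUTURE),
]

def school_as_of(rows, student, day):
    """Return the school recorded for the student on the given day, if any."""
    for s, school, valid_from, valid_to in rows:
        # Half-open interval: valid_from is included, valid_to is not.
        if s == student and valid_from <= day < valid_to:
            return school
    return None

autumn_school = school_as_of(history, "s1", date(2014, 12, 1))
spring_school = school_as_of(history, "s1", date(2015, 3, 1))
before_enrollment = school_as_of(history, "s1", date(2014, 1, 1))
```

Because old rows are never overwritten, a report rerun for an earlier date returns the same answer it did the first time.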

Page 46: Practical Data Strategies in the real world of poor Data Quality

Reporting platform - Consistency

Common processing
• Common query code centralized
• Batch ETL so can make multiple passes to pre-calculate higher order metrics

Consistent business rules
• Can have old and new metrics back-calculated as well (old vs new truancy rules)
• Calculate metric in one place so one number, right or wrong, is reported

Data Traceability
• Data path from systems of record to reports fully documented

[Diagram: Herding Kittens → One Big Powerful Cat]

Page 47: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform Example Architecture

SSIS, SQL Server, Perl on Virtual Machine servers

[Diagram components]
• Sources: Primary ERP; Accounting data system; HR data system; CRM; Misc Systems; Assessment data dumps; External imports; Misc Data Files
• ETL (SQL Server Integration Services, Perl, manual loads)
• Reporting Database (MS SQL Server)
• Data Mart (MS SQL Server)
• Direct SQL (SQL Server Management Studio)

Page 48: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform – Business Rules Centralized

Based on weekly attendance report
• Updated daily
• Calculates individual student attendance metrics

Metric            Details
Truancy           Calculates truancy based on old rules and new rules so can compare trends
Absence Counts    Period and Daily; Unexcused, Excused, In Seat Attendance, Suspension

Page 49: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform – common processing tasks

Enrollment admit withdraw matching
• SIS stores enrollment as separate admit and withdraw events
• Need to match admits to withdrawals for the same enrollment period and school

SIS events:

Date               Type          School
24 August 2011     Admit         123
24 June 2012       Withdrawal    123
24 June 2012       Admit         456
10 October 2012    Withdrawal    456
11 October 2012    Admit         789

Matched enrollment periods:

Admit Date         Withdraw Date      School
24 August 2011     24 June 2012       123
24 June 2012       10 October 2012    456
11 October 2012    1 January 3030     789

Currently enrolled is stored as a “withdrawal date” in the far future so that there is an actual date and not a null to compare against: currently enrolled is today() < [withdraw date]
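The admit/withdraw matching described above can be sketched as follows. This is an assumed approach, not the platform's actual ETL code: it pairs each admit with the next withdrawal at the same school and fills open enrollments with the far-future date from the slide. The function and variable names are hypothetical.

```python
from datetime import date

FAR_FUTURE = date(3030, 1, 1)  # placeholder withdraw date for "currently enrolled"

def match_enrollments(events):
    """Pair each admit with the next withdrawal at the same school."""
    periods = []
    open_admits = {}  # school -> admit date still waiting for a withdrawal
    # Sort by date; on the same day, handle withdrawals before admits.
    ordered = sorted(events, key=lambda e: (e[0], e[1] != "Withdrawal"))
    for day, kind, school in ordered:
        if kind == "Admit":
            open_admits[school] = day
        elif school in open_admits:
            periods.append((open_admits.pop(school), day, school))
        # A withdrawal with no matching admit is skipped in this sketch.
    for school, admit in open_admits.items():
        periods.append((admit, FAR_FUTURE, school))
    return sorted(periods)

events = [
    (date(2011, 8, 24), "Admit", "123"),
    (date(2012, 6, 24), "Withdrawal", "123"),
    (date(2012, 6, 24), "Admit", "456"),
    (date(2012, 10, 10), "Withdrawal", "456"),
    (date(2012, 10, 11), "Admit", "789"),
]
periods = match_enrollments(events)

def currently_enrolled(period, today):
    """Mirror of the slide's rule: today() < [withdraw date]."""
    return today < period[1]
```

In a real load, unmatched withdrawals would more likely be routed to an error report than dropped silently.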

Page 50: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform – Optimized for Reporting

Generally two ways we need to analyze assessments
• Single view of all assessments for a student – data in columns
  • Each row a single student for a particular school year
• Comparing one run of an assessment with another – data in rows
  • Each row a single assessment for a single student for a particular school year

Data in rows:

Student    Assessment    SY        Score
123        A1 Q1         SY1415    90
123        A1 Q2         SY1415    80
123        A1 Q3         SY1415    70
123        A1 Q4         SY1415    100
456        A2 Sem 1      SY1415    65

Data in columns:

Student    A1 Q1    A1 Q2    A1 Q3    A1 Q4    A2 Sem 1    A2 Sem 2    SY
123        90       80       70       100      76          87          SY1415
456        60       70       80       90       65          86          SY1415

All traceable back to same original data load so potential for different answers is minimized
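Deriving the "data in columns" view from the "data in rows" view, so that both come from the same load, can be sketched like this. The function name and the dictionary layout are assumptions; the sample scores are a subset of the values shown in the tables.

```python
def pivot_scores(rows):
    """Turn (student, assessment, school_year, score) rows into one dict per student and year."""
    wide = {}
    for student, assessment, school_year, score in rows:
        wide.setdefault((student, school_year), {})[assessment] = score
    return wide

long_rows = [
    ("123", "A1 Q1", "SY1415", 90),
    ("123", "A1 Q2", "SY1415", 80),
    ("123", "A1 Q3", "SY1415", 70),
    ("123", "A1 Q4", "SY1415", 100),
    ("456", "A2 Sem 1", "SY1415", 65),
]
wide = pivot_scores(long_rows)
```

Because the wide view is computed from the long rows rather than loaded separately, the two structures cannot drift apart.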

Page 51: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform Development

How to develop system with poor data quality?

With poor data quality it is hard to determine whether some inconsistent or inaccurate number is due to a bug in your query or inconsistent data.

Page 52: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform Development

Key is to realize that reporting platform did not need to be accurate per se, it just needed to not be more inaccurate.

DO NO HARM

Solution
• Prioritize – Start with recreating standard reports in reporting platform and compare with existing standard reports: CAR cycle
• Compartmentalize – Run reports using only students with no data quality issues so any errors are likely due to bugs that can be nailed down and fixed

Page 53: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform Development

1. Create Sample Report and compare to Standard Report (eg attendance weekly)
2. Check for discrepancies
   1. If discrepancy is due to mistake in reporting platform or query, fix it
   2. If discrepancy is due to bad data, store student id in exceptions table
3. Pull Sample Report again, filtering out exception students so that only “Good Data” is included in report
4. Continue until no discrepancies
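One pass of the comparison in steps 1 to 3 might look like the sketch below. It is illustrative only: the per-student dictionaries, the function name, and the `bad_data_students` set (which stands in for the human judgment about why a discrepancy exists) are assumptions.

```python
def validate_report(standard, sample, bad_data_students):
    """Compare per-student values and sort discrepancies into two buckets.

    Returns (exceptions, bugs): students to store in the exceptions table,
    and students whose discrepancy points at the platform or query.
    """
    exceptions, bugs = set(), set()
    for student, expected in standard.items():
        if sample.get(student) == expected:
            continue  # no discrepancy for this student
        if student in bad_data_students:
            exceptions.add(student)
        else:
            bugs.add(student)
    return exceptions, bugs

standard = {"s1": 3, "s2": 5, "s3": 0}  # existing standard report
sample = {"s1": 3, "s2": 4, "s3": 1}    # same report from the reporting platform
exceptions, bugs = validate_report(standard, sample, bad_data_students={"s2"})
```

The loop in step 4 would rerun this after each fix, excluding the exception students, until the `bugs` set comes back empty.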

Page 54: Practical Data Strategies in the real world of poor Data Quality

Reporting Platform Development

Need to ensure that reporting platform is not introducing new errors. How?
Use only known good data to validate:

[Flowchart: Reporting Platform → Report query → Sample Report → Compare with Standard Report]
• No discrepancies → Report validated
• Discrepancies → Why?
  • Fix any issues with Reporting platform
  • Filter out students with bad data into exceptions table (bad data students vs good data students)

Page 55: Practical Data Strategies in the real world of poor Data Quality

Managing requests


Page 56: Practical Data Strategies in the real world of poor Data Quality

Capacity vs Demand

Demand for data is ever increasing, people are hungry for data
Needed to do more with the same size team

Two Tracks
• Increase reporting efficiency
• Reduce demand on reporting team

Page 57: Practical Data Strategies in the real world of poor Data Quality

Increase Efficiency

Users make requests via online “Data Request Tool” (DRT)
• Central point of communication with requestors for clarifications
• Tracks implementation notes and report writer assignments
• Report files attached to request along with query code
• One report can be attached to multiple requests to allow for reuse
• Data snapshot of common data available on front end
  • Updated daily with common metrics (absences, GPA, grade level, school, etc)
  • User can customize columns/filters to download for themselves
  • Example of some columns available:

Student_ID      YTD_Unexcused_Absences        Total SBT Suspension_Days
School_Name     YTD_Excused_Absences          Truant - still be truant?
ELL_Status      YTD_ISA_Average_Attendance    Truant_>=10_days
FARM_Status     Membership_days               Current_School_Average_Attendance
Student_Race    Absences_Towards_Truancy      Current_School_Excused_Absences
SPED_Status     Suspension_Absences_Days      Current_School_ISA_Average_Attendance

Page 58: Practical Data Strategies in the real world of poor Data Quality

Increase Efficiency

“Data Request Tool” (DRT)

Page 59: Practical Data Strategies in the real world of poor Data Quality

Increase Efficiency

Data Librarian is first point of contact for requests to reporting team
• Dedicated FTE position
• Clarifies request requirements
• Is there an already completed report that can fulfill this request?
• Acts as gatekeeper to qualify requests before they hit reporting capacity

[Flowchart steps: Program needs data; Standard Report? Common metric?; Program Enters Data Request; Data Librarian clarifies request; Existing report available?; Report Writer assigned; Report Created; Report Reviewed; Report Delivered]

Page 60: Practical Data Strategies in the real world of poor Data Quality

Self Service Reporting

Goal is to provide self-service reporting to analysts while ensuring consistency
• Giving them raw access to reporting platform is too overwhelming
• Analysts are not database developers/DBAs
• Even with SQL skills, they would still need joins to get meaningful data
• Creating dedicated pull of custom data would mean another thing to maintain

Solution was first to create regularly disseminated standard report with commonly requested metrics and standard demographics

Page 61: Practical Data Strategies in the real world of poor Data Quality

Self Service Reporting

Then save weekly snapshot of each report into a dedicated “data mart”
• Simply add “report date” field to existing columns
• Analysts already used to seeing these reports so no learning curve in using data
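The snapshot approach amounts to appending each run of the report with one extra field. A minimal sketch, assuming the data mart is a simple list of row dictionaries; the function name and sample values are hypothetical, while the column names follow the examples listed for the Data Request Tool.

```python
from datetime import date

def append_snapshot(data_mart, report_rows, report_date):
    """Copy each report row into the data mart with an added report_date field."""
    for row in report_rows:
        data_mart.append({**row, "report_date": report_date})
    return data_mart

data_mart = []
append_snapshot(data_mart,
                [{"Student_ID": "123", "YTD_Excused_Absences": 2}],
                date(2015, 3, 2))
append_snapshot(data_mart,
                [{"Student_ID": "123", "YTD_Excused_Absences": 3}],
                date(2015, 3, 9))
```

Since the columns are the ones analysts already see in the standard report, the only new thing to learn is filtering or trending on `report_date`.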

Page 62: Practical Data Strategies in the real world of poor Data Quality

“Data Mart” example - Standard Report

Standard Report data flows into data mart. Analysts/Power Users can create dashboards in tools like PowerBI for staff to use or they can access it directly

[Diagram: Standard Report Daily Feeds and Standard Report Weekly Feeds (Standard Report wk 1, wk 2, wk 3, wk 4 … wk 52) → Quickie Data Mart → Analytics, Power Users]

Page 63: Practical Data Strategies in the real world of poor Data Quality

Report requests hitting report writers

[Chart: Data Requests per Month, Aug through Jul, y-axis 0 to 120; series SY12-13, SY13-14, SY14-15, SY15-16]

More self-service reporting and standardized reports
• Fewer adhoc requests for standard data
• Reporting capacity can be spent on more complex requests

Page 64: Practical Data Strategies in the real world of poor Data Quality

Summary


Page 65: Practical Data Strategies in the real world of poor Data Quality

Takeaways

“What problem are you trying to solve?”

Effective Data

[Diagram labels: Data Reporting; Effective Data; Data Analytics; Effective Decisions; Programs / Business; Effective Outcomes; Organizational Success]

Page 66: Practical Data Strategies in the real world of poor Data Quality

Takeaways

• Don’t overengineer data systems
• Focus on data that supports organizational goals

Page 67: Practical Data Strategies in the real world of poor Data Quality

Takeaways

Consistency First, then Accuracy, then Relevancy

Metric A          Report 1    Report 2    Report 3
Starting point    90          81          87
Consistent        87          87          87
Accurate          85          85          85
Relevant          Metric aligned with goal

School Staff is our "data entry team" rather than our “users”

[Diagram: Users → Data Entry Team]

Page 68: Practical Data Strategies in the real world of poor Data Quality

ROI

Meet your data where it is today and build to where you want to be

Page 69: Practical Data Strategies in the real world of poor Data Quality

[email protected]

@dataeffectively
dataeff.site
dataeff.blog
dataeff.me