peter aiken · 2012-07-27 · peter aiken [email protected] +1 804 382 5957 j. brian cassel...
TRANSCRIPT
- datablueprint.com 6/19/2012 © Copyright this and previous years by Data Blueprint - all rights reserved!
Data - It Shouldn't Be This Hard
Lessons From The Trenches
Peter Aiken [email protected] +1 804 382 5957
J. Brian Cassel [email protected] +1 804 628-1926
Peter Aiken
• DoD Computer Scientist
– Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
• Visiting Scientist
– Software Engineering Institute/Carnegie Mellon University (2001-2002)
• DAMA International Advisor/Board Member (http://dama.org)
– 2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
– 2005 DAMA Community Award
• Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
• Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
• Founding Director, Data Blueprint (1999)
• Full time in information technology since 1981
• IT engineering research and project background
• University teaching experience since 1979
• Seven books and dozens of articles
• Research Areas
– reengineering, data reverse engineering, software requirements engineering, information engineering, human-computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
• Director
– George Mason University/Hypermedia Laboratory (1989-1993)
• Published Papers
– Communications of the ACM, IBM Systems Journal, InformationWEEK, Information & Management, Information Resources Management Journal, Hypermedia, Information Systems Management, Journal of Computer Information Systems and IEEE Computer & Software
J. Brian Cassel
• J. Brian Cassel, PhD
• Visiting Senior Lecturer / Fulbright Scholar
• King's College London, Cicely Saunders Institute
• London, England
• +44 (0)20 7848 5679 (office)
• +44 (0)78 2739 9421 (mobile)
• http://www.csi.kcl.ac.uk/
• Massey Cancer Center
• Virginia Commonwealth University
• Richmond, Virginia, USA
• http://www.massey.vcu.edu/
Cassel is a senior analyst in the oncology administration department of the VCU Health System; assistant professor of Quality Health Care in the School of Medicine; adjunct assistant professor in Life Sciences; affiliate research associate professor in psychology; and member, Massey Cancer Center, VCU. Dr Cassel is the director of the VCU Life Sciences and Religion Initiative. In that role, he created and co-teaches the “Faith and Life Sciences” class (with Mark Wood and John Quillin), created and co-chairs the “Life Sciences and Religion Community Forum of Central Virginia” (with Mark Wood), funded by the Local Societies Initiative of the Metanexus Institute, and co-directs the “Science, Reason and Faith” series of six public lectures and debates this year (with Anthony Ellis, Thomas Huff and Donald Smith). His doctorate (City University of New York Graduate School and University Center, 1995) was in social-personality and health psychology. His dissertation, “Altruism is only part of the story: A prospective, longitudinal study of AIDS volunteers” defined a typology of AIDS volunteers; this prospective, longitudinal study revealed the antecedents and consequences of altruistic behavior. Dr Cassel has taught courses in research methods, social psychology, science-and-religion, clinical outcomes evaluation, and morality/justice at SUNY, NYU and VCU in BS, MS and PhD programs. He serves on several cancer and palliative-care focused committees for the VCU Health System. Dr. Cassel is recognized nationally for his studies of the financial and clinical outcomes of palliative care, and participates in state-wide and nation-wide training programs on palliative care. 
He presented a paper on the bio-psycho-social-spiritual aspects of end-of-life care at the 2006 Metanexus Institute conference and has published and presented papers on self-help group leaders, AIDS volunteers, end-of-life care, palliative care, cancer care, and the use of data systems in advancing cancer programs. With Everett Worthington he has submitted two research proposals on forgiveness at the end of life.
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
The File Naming Convention Committee's Output
24 hour observation of all of the large aircraft flights in the world, condensed down to just over a minute
IT Project Failure Rates (moving average)
Source: Standish Chaos Reports as reported at: http://www.galorath.com/wp/software-project-failure-costs-billions-better-estimation-planning-can-help.php
[Chart: share of IT projects by outcome, vertical axis 0% to 60%]
Year        1994  1996  1998  2000  2002  2004  2009
Succeeded   16%   27%   26%   28%   34%   29%   32%
Challenged  53%   33%   46%   49%   51%   53%   44%
Failed      31%   40%   28%   23%   15%   18%   24%
Why Data Projects Fail by Joseph R. Hudicka
• Assessed 1200 migration projects!
– Surveyed only experienced migration specialists who have done at least four migration projects
• The median project costs over 10 times the amount planned!
• Biggest Challenges: Bad Data; Missing Data; Duplicate Data
• The survey did not consider projects that were cancelled largely due to data migration difficulties
• "… problems are encountered rather than discovered"
[Chart: Median Project Expense and Median Project Cost, horizontal axis $0 to $500,000]
Joseph R. Hudicka "Why ETL and Data Migration Projects Fail" Oracle Developers Technical Users Group Journal June 2005 pp. 29-31
[Diagram: Data Management practice areas and the flows between them. Practice areas: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations, Data Asset Use. Flow labels: Organizational Strategies, Goals, Guidance, Direction, Integrated Models, Standard Data, Business Data, Application Models & Designs, Implementation, Feedback, Business Value.]
Data Management
• Data Program Coordination: Manage data coherently.
• Organizational Data Integration: Share data across boundaries.
• Data Stewardship: Assign responsibilities for data.
• Data Development: Engineer data delivery systems.
• Data Support Operations: Maintain data availability.
Data Management Capability Maturity Model Levels
• Initial (1): Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
• Repeatable (2): We have DM experience and have the ability to implement disciplined processes
• Defined (3): We have experience that we have standardized so that all in the organization can follow it
• Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
• Optimizing (5): We have a process for improving our DM capabilities
One concept for process improvement; others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
All focus on understanding current processes and determining where to make improvements.
Key Finding: Process Frameworks are not Created Equal
With the exception of CMM and ITIL, use of process-efficiency frameworks does not predict higher on-budget project delivery…
[Chart: Percentage of Projects on Budget, By Process Framework Adoption]
…while the same pattern generally holds true for on-time performance
[Chart: Percentage of Projects on Time, By Process Framework Adoption]
Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey Results, Washington D.C.: Corporate Executive Board 2006, p. 23.
Assessment Components
Data Management Practice Areas
• Data program coordination: DM is practiced as a coherent and coordinated set of activities
• Organizational data integration: Delivery of data in support of organizational objectives – the currency of DM
• Data stewardship: Designating specific individuals as caretakers for certain data
• Data development: Efficient delivery of data via appropriate channels
• Data support operations: Ensuring reliable access to data
Capability Maturity Model Levels (examples of practice maturity)
• 1 - Initial: Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
• 2 - Repeatable: We have DM experience and have the ability to implement disciplined processes
• 3 - Documented: We have standardized DM practices so that all in the organization can perform them with uniform quality
• 4 - Managed: We manage our DM processes so that the whole organization can follow our standard DM guidance
• 5 - Optimizing: We have a process for improving our DM capabilities
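As an illustration only, the assessment grid (five practice areas, each rated against the five maturity levels) can be recorded as a small data structure. The sketch below is hypothetical: the `Maturity` enum, the `weakest_areas` helper, and the example scores are assumptions made for illustration, not part of the assessment instrument and not real results.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The five maturity levels named on the slide."""
    INITIAL = 1
    REPEATABLE = 2
    DOCUMENTED = 3
    MANAGED = 4
    OPTIMIZING = 5


PRACTICE_AREAS = (
    "Data program coordination",
    "Organizational data integration",
    "Data stewardship",
    "Data development",
    "Data support operations",
)


def weakest_areas(assessment):
    """Return the practice areas rated at the lowest level, in slide order."""
    missing = [a for a in PRACTICE_AREAS if a not in assessment]
    if missing:
        raise ValueError(f"unrated practice areas: {missing}")
    lowest = min(assessment[a] for a in PRACTICE_AREAS)
    return [a for a in PRACTICE_AREAS if assessment[a] == lowest]


# Hypothetical scores, invented purely to exercise the helper.
example = {
    "Data program coordination": Maturity.REPEATABLE,
    "Organizational data integration": Maturity.INITIAL,
    "Data stewardship": Maturity.INITIAL,
    "Data development": Maturity.DOCUMENTED,
    "Data support operations": Maturity.REPEATABLE,
}

print(weakest_areas(example))
```

Keeping the levels as an ordered enum means "which area is furthest behind" becomes a plain comparison, which fits the crawl-walk-run framing used later in the deck.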
Data Management Practices Measurement (DMPA)
[Chart: maturity levels Initial (I), Repeatable (II), Documented (III), Managed (IV), Optimizing (V), with two bands labeled "Focus: Implementation and Access" and "Focus: Guidance and Facilitation"]
• CMU's Software Engineering Institute (SEI) Collaboration
• Results from hundreds of organizations in various industries including:
– Public Companies
– State Government Agencies
– Federal Government
– International Organizations
• Defined industry standard
• Steps toward defining data management "state of the practice"
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
Typical System Evolution
[Diagram: application and data pairs that accumulate over time]
• Payroll Application (3rd GL) / Payroll Data (database)
• R&D Applications (researcher supported, no documentation) / R&D Data (raw)
• Mfg. Applications (contractor supported) / Mfg. Data (home grown database)
• Finance Application (3rd GL, batch system, no source) / Finance Data (indexed)
• Marketing Application (4th GL, query facilities, no reporting, very large) / Marketing Data (external database)
• Personnel App. (20 years old, un-normalized data) / Personnel Data (database)
Application-Centric Development Flow
[Diagram: Strategy -> Goals/Objectives -> Systems/Applications -> Network/Infrastructure -> Data/Information]
• In support of strategy, the organization develops specific goals/objectives
• The goals/objectives drive the development of specific systems/applications
• Development of systems/applications leads to network/infrastructure requirements
• Data/information are typically considered after the systems/applications and network/infrastructure have been articulated
• Problems with this approach:
– This ensures that data is formed around the application and not the organizational information requirements
– Processes are narrowly formed around applications
– Very little data reuse is possible
Original articulation from Doug Bagley @ Walmart
Data-Centric Development Flow
[Diagram: Strategy -> Goals/Objectives -> Data/Information -> Network/Infrastructure -> Systems/Applications]
• In support of strategy, the organization develops specific goals/objectives
• The goals/objectives drive the development of specific data/information assets with an eye to organization-wide usage
• Network/infrastructure components are developed to support organization-wide use of data
• Development of systems/applications is derived from the data/network architecture
• Advantages of this approach:
– Data/information assets are developed from an organization-wide perspective
– Systems support organizational data/information needs and complement organizational process flows
– Data/information reuse is maximized
Original articulation from Doug Bagley @ Walmart
Designing for Evolution is Different than Creating New Systems
[Diagram labels: Common Organizational Data (and corresponding data needs requirements); New Organizational Capabilities; Systems Development Activities; Create; Evolve; Future State (Version +1); Results; Increasing scope and depth of information architecture utility; Individual SDLC Effort]
Individual SDLC efforts make increasing use of IA
• Over time the:
– Number of requests increases
– Utility of the results increases
– Amount of metadata contributed by new systems development increases
[Diagram: three Individual SDLC Efforts (Requirements, Design, Implement), each exchanging Requests and Results with organized system metadata]
TDJ Reporting
[Org chart labels: Top Job; Top Operations Job; Top Finance Job; Top Technology Job; Top Marketing Job; Top Data Job; Data Governance Organization]
• There is enough work to justify the function
• There is not much talent
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
Reduce-Reuse-Recycle … Data?
• Reduce the amount of organizational data ROT
– Redundant, obsolete, trivial
• Reuse the remainder
– Fewer vocabulary items to resolve
– Greater quality engineering leverage
• Integration is impossible without information architecture components (for mapping)
– Maintenance of these components promotes greater reuse
• Shared data is typified by organizational ability to use information as a strategic asset
• However, assets are useless without knowledge of the asset characteristics
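One way to picture the "reduce ROT" step is as a pass over a data inventory. The sketch below is a minimal illustration under stated assumptions: the `DataAsset` fields, the two-year staleness threshold, and the rule that "trivial" means "no known consumers" are all hypothetical choices, not definitions from the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataAsset:
    name: str
    content_hash: str  # identical hashes are treated as redundant copies
    last_used: date    # last recorded read or write
    consumers: int     # number of known downstream users


def classify_rot(assets, today, stale_after_days=730):
    """Label each asset 'redundant', 'obsolete', 'trivial', or 'keep'."""
    seen_hashes = set()
    labels = {}
    for asset in assets:
        if asset.content_hash in seen_hashes:
            labels[asset.name] = "redundant"
        elif (today - asset.last_used).days > stale_after_days:
            labels[asset.name] = "obsolete"
        elif asset.consumers == 0:
            labels[asset.name] = "trivial"
        else:
            labels[asset.name] = "keep"
        seen_hashes.add(asset.content_hash)
    return labels


# Hypothetical inventory, invented for illustration.
inventory = [
    DataAsset("customers_master", "a1", date(2012, 6, 1), 12),
    DataAsset("customers_copy", "a1", date(2012, 5, 1), 3),
    DataAsset("orders_1998", "b2", date(2001, 1, 15), 1),
    DataAsset("scratch_export", "c3", date(2012, 6, 10), 0),
]

labels = classify_rot(inventory, today=date(2012, 6, 19))
print(labels)
```

Note that every rule depends on metadata about the asset (hash, last use, consumers), which is the point of the last bullet: without knowledge of asset characteristics, nothing can be classified.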
Interdependencies
[Diagram: Data Governance, Master Data, and Data Quality shown as interdependent]
Inextricably Intertwined Interactions
[Diagram: interactions among Data Governance Practices, Master Data Management Practices, and Data Quality Engineering Practices. Monitoring labels: Governance Violations Monitoring, Master Data Monitoring, Data Quality Monitoring. Other labels: Operational Data, Data Harvesting, Routine Data Scans, Focused Data Scans, Governance Rules, Monitoring Rules, Data Quality Rules, Quality Rules, Master Data Catalogs, Master Data & Characteristics, Improved Quality Data, Monitoring Results: Suspected/Identified Data Quality Problems.]
Crawl, walk, run, fly
• Progress beyond crawling requires
– Two distinct skill sets that must reach agreement
  • Expert organizational knowledge
  • Data management engineering expertise
– Practice
  • Collaborating
  • Learning from mistakes
– Ongoing training commitment
Experience Pebbles
http://myriadtechnicalresources.com
Myth Busters on Silver Bullets
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
Data Governance Hype Cycle
http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp
How to avoid becoming part of the parade?
[Image labels: SOA, ERP, OOPS, NoSQL, CLOUD, MDM, Cloud]
Response: Massey Cancer Center Data Analytics
SQL server system (MDAS) combining VCUHS internal data sources (linkable at patient level) and external data sources (useful for benchmarking):
• Hospital billing claims
– IP & OP encounter data
– Cost, charge, reimbursement, & utilization data
– ICD-9 diagnosis, CPT procedure, UBC revenue codes
• Physician billing claims
– IP & OP encounter data
– Utilization & charge data
– ICD-9 diagnosis, CPT procedure codes
• Pathology DB
– Surgery/Cytology Path Reports
– Test Values
• UHC
– IP data for Academic Medical Centers
– ICD-9, CPT, DRG codes, LOS, Mortality
• Outpatient Pharmacy DB
– Drug Utilization Details
• Bone Marrow Transplant DB
– Clinical data on donors & recipients
• Public Health Data
– BRFSS
– SEER
– VCR
– Vital Statistics
• Social Security
– Dates of death
• Patient Satisfaction
• Cancer Registry
– Site, stage, pathology details
– Initial Treatment
• US Census Population Data
• Intellimed / VHHA
– IP data for all Virginia hospitals
– Can expand to other states as well (Thomson-Reuters or AHRQ-HCUP)
[Chart: Total Profit (Loss) for BMT-related hospital accounts, by fiscal year, FY 1997 through FY 2011 (est); vertical axis $(3,000,000) to $11,000,000]
Dr. Retchin (CEO), circa 2000: The hospital is losing $1-$2 million per year on this program. Fix it.
How?
• Worked very closely with service line administrator, BMT medical director, and managed care contracting director
• Evaluation & improvement of all aspects of the revenue cycle – direct costs, indirect costs, coding, pricing, contract negotiations, tracking denials, etc.
• Compared VCUHS program to external benchmarks (UHC)
• Developed significant familiarity with transplant types, phases of transplants, contract terms (stop-loss; second-dollar outlier payments; phases; global-billing), etc.
• Did this continuously
• Once profitable, program expanded: from 60 transplants per year to ~150
Agile Programming (Doesn't Just Mean ... )
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
“Our hospital wants us to use the existing system, can we create an Oncology ‘cube’?”
• Can you get all the information you need in a “cube” from an existing business intelligence data system?
– Would it include outpatient care?
– Would it capture the whole care continuum?
– Would it allow you to categorize by disease type?
– Would it allow you to categorize by modality of care?
Patient-based analyses following patient from diagnosis through treatment
[Diagram: timelines for five example patients (IDs 1111111 through 5555555), with each encounter marked over time]
• From diagnosis (Hospital Cancer Registry): primary site of cancer, date of diagnosis, stage/spread of disease
• Follow patient interactions over time (Hospital, Physician, Pharm Claims): capture all encounter dates and details
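A patient-level link between registry and claims can be sketched as follows. This is a minimal illustration under stated assumptions, not the MDAS implementation: the field names (`patient_id`, `dx_date`, `service_date`, `source`) and all sample rows are hypothetical, and the patient IDs simply echo the placeholder IDs on the slide.

```python
from collections import defaultdict
from datetime import date

# Hypothetical registry rows: one diagnosis record per patient.
registry = [
    {"patient_id": "1111111", "site": "lung", "dx_date": date(2010, 3, 1)},
    {"patient_id": "2222222", "site": "breast", "dx_date": date(2010, 5, 20)},
]

# Hypothetical claim rows: many encounters per patient, from any claims source.
claims = [
    {"patient_id": "1111111", "service_date": date(2010, 4, 15), "source": "hospital"},
    {"patient_id": "1111111", "service_date": date(2010, 3, 10), "source": "physician"},
    {"patient_id": "2222222", "service_date": date(2010, 6, 1), "source": "pharmacy"},
    {"patient_id": "9999999", "service_date": date(2010, 1, 5), "source": "hospital"},
]


def patient_timelines(registry_rows, claim_rows):
    """Attach each patient's claims, in date order, to the registry record."""
    by_patient = defaultdict(list)
    for claim in claim_rows:
        by_patient[claim["patient_id"]].append(claim)

    timelines = {}
    for row in registry_rows:
        encounters = sorted(by_patient[row["patient_id"]],
                            key=lambda c: c["service_date"])
        timelines[row["patient_id"]] = {
            "site": row["site"],
            "dx_date": row["dx_date"],
            "encounters": [
                {
                    "days_from_dx": (c["service_date"] - row["dx_date"]).days,
                    "source": c["source"],
                }
                for c in encounters
            ],
        }
    return timelines


timelines = patient_timelines(registry, claims)
print(timelines["1111111"]["encounters"])
```

Driving the join from the registry keeps the analysis population to diagnosed cancer patients; claims for anyone not in the registry (the `9999999` row here) are left out.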
A closer look …
What we know from the cancer registry…
What we gain from integrating billing claims
Consulting firm: “Close down palliative care program”
• VCU Health System opened one of the first Palliative Care Units in the US, May 2000.
• Consultants recommended closing it in 2002.
– They looked at net margin for hospitalizations ending on the PC Unit and saw that the costs greatly exceeded reimbursement.
– They thought that getting rid of the unit would get rid of this problem.
• RWJ Foundation supported urgent response.
• Appropriate financial analyses convinced consultants that the unit actually produced valuable hospital outcomes.
– See KR White & JB Cassel (2009). “The Business Case for a Hospital Palliative Care Unit: Justifying its Continued Existence”. Practice of Evidence-Based Management, T Kovner, D Fine & R D’Aquila (Eds.), Chicago: Health Administration Press, pp 171-180.
Cost-avoidance in drugs (-77%), labs (-95%), imaging (-95%), supplies (-60%).
Agenda
• What do we mean by data management maturity?
• Four lessons:
1. Organizational thinking must change
2. Crawl, walk, run
3. No silver bullets
4. Taking an agile approach
http://peteraiken.net
Contact Information:
Peter Aiken, Ph.D.
Department of Information Systems
School of Business
Virginia Commonwealth University
Snead Hall Room B4217
301 West Main Street
Richmond, Virginia 23284-4000

Data Blueprint
10124C West Broad Street
Glen Allen VA 23060
804.521.4056
http://datablueprint.com

office: +1.804.883.7594
cell: +1.804.382.5957
e-mail: [email protected]
web: http://peteraiken.net
http://twitter.com/paiken