data governance in the big data era
![Page 1: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/1.jpg)
Data Governance in a Big Data Era
Pieter De Leenheer, PhD
Stanford University
Nov 3, 2016
![Page 2: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/2.jpg)
Misconceptions of Data Governance that impede Data Valuation
• Data governance is a published repository of common definitions.
• Data governance is a concern of – and hence managed by – IT.
• Data governance is just data quality (DQ) and master data management (MDM).
• Data governance is siloed by business function.
• Data governance provides no value or participation for the data-consuming community.
![Page 3: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/3.jpg)
Admin
• http://www.slideshare.net/pdeleenh/data-governance-in-the-big-data-era
![Page 4: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/4.jpg)
Hierarchical Data Management
• Formal
  • Operational and analytical data
• Inward Focus:
  • Improve internal/external coordination
  • Understand customer
  • Predict next transaction
• Controlled by Central Provider
  • MDM, DWH, DM, Dashboards
  • Tedious Waterfall
  • Compromised by obsolete cost assumptions
• Consumer
  • Small elite, C-level
![Page 5: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/5.jpg)
Hierarchical Data Governance
• Wikipedia: “a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality”.
  • Biased by Total (Data) Quality Management practice
  • Suggests ‘policing’ rather than ‘empowerment’
• How to evolve to a democratic networked approach?
  • Involves ICs and middle management
  • With less middle-men slack
  • Dealing with Big Data
![Page 6: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/6.jpg)
Data Big Bang
• Phenomenon: connectivity between
  • Social
  • Knowledge
  • Technology
• Draws curiosity
  • Web Science (Pentland, etc.)
  • Big Data Native Market Entrants (23andMe, Uber, Inventure)
• Disruption
  • Bottom up
  • Starting from data
  • Low end
• +80% unstructured data or ‘dark matter’
![Page 7: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/7.jpg)
Three Forces Shaping the Digital Economy (1)
1. Digitalization of the Physical
  • Entertainment, Wealth, Biology, Chemistry
  • MPx, PayPal, Bitcoin, 3D printing, IoT, VR
2. Sustained and accelerated growth of digital power (despite the slowdown of Moore’s Law)
  • Mass parallelization (Hadoop and Hive)
  • Move function and reliability to software
  • Miniaturization
![Page 8: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/8.jpg)
Three Forces Shaping the Digital Economy (2)
3. Modular and Generative Programmability
“By carefully excluding features that are not universally useful Internet technologies became easily adopted on a massive scale and gave the Web a generative [i.e. self-reproductive] character” (Zittrain, 2009).
• This opens new business models unimaginable before:
  • apps extend the function of a smartphone
  • aggregations of components in complex machines
• Once digitized, opens new ways of manipulation and transport
![Page 9: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/9.jpg)
The “Dark Matter” of Big Data Universe
• Observed consequences of these forces:
  1. Consumerization of Digital Technologies pivoting around 2000
  2. Grassroot Participation / Peer-based
  3. Digitalization of Trust
• All contribute to Big Data
• (2) and (3) contribute to Social Capital: Dark Matter (aka unstructured data)?
  • Human communication, text heavy
  • Context: emphasis, emotion, location at moment of capturing changes meaning:
    • “I did not say Peter’s talk stinks”
![Page 10: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/10.jpg)
Data-driven Hierarchies, Networks & Hybrids
Hierarchical | Networked
Network peers provide ideas and feedback, but also service (analogy: the data scientist as Uber driver)
Product Ownership | Service (hence Data) Access
Example: Uber doesn’t own; it only dispatches information about rolling material to riders and focuses on lifetime value retention.
Data analogy: access to data is more important than owning it, as the cost of IS is marginal and is replaced by data value appreciation by the using community
Passive resources (material, goods) | Active resources (data, consumer)
Value-in-exchange | Value-in-use
Acquisition | Retention
Example: SaaS, Netflix, Costco, etc.
Data analogy: from formal roles and responsibilities that support internal processes to trust based on social capital
Process | Relations
Provider pushes | Consumer pulls
Example: feedback, mods on games, user participation, A/B testing, etc.
Data analogy: data helpdesk
Consumerization of tech, grassroot participation, digitalization of trust
![Page 11: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/11.jpg)
Shift in Data Governance Approaches
• Consequences of digital forces pose a gigantic risk to organizations, even those with hierarchical governance
• Hierarchical data governance
  • Few consumers served by a central opaque provider
  • Inward
  • Compromised by obsolete cost assumptions about digital power
  • Use of digital optimizes to some extent
  • Not scalable for big data used by larger ‘data scientist’ populations
• Combine with Networked Approach
  • Democratization (production)
    • Breadlines
    • Consumerization of BI and cheap digital power
    • Many serve many
    • Supports customer
  • Amazonification (consumption)
    • Access, SLA, Trust, etc.
  • Outward
![Page 12: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/12.jpg)
Big Data Analytics Challenges
• When everybody has data scientists, predicting the next transaction is not competitive anymore
• From ‘predict next transaction’ to life-long relation building and value creation
  • reduce search and navigation for the customer with better apps
  • crowdsourcing to cross-compare with and learn from other customers (Opower, INRIX, Zillow)
  • get trust from the customer through branded non-intrusive apps: personal health monitoring, Nest
• Retention analysis example
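The transcript names a retention analysis example but does not include it. As a minimal sketch of what such an analysis might compute, the Python below derives period-over-period retention from activity records. The function name `retention_by_period`, the `(customer_id, period)` record shape, and the sample data are assumptions made for illustration; they are not taken from the talk.

```python
from collections import defaultdict


def retention_by_period(events):
    """Compute period-over-period retention.

    events: iterable of (customer_id, period) pairs, where period is a
    sortable label such as "2016-01" (hypothetical record shape).
    Returns {period: fraction of the previous period's active customers
    who are active again in this period}.
    """
    active = defaultdict(set)
    for customer_id, period in events:
        active[period].add(customer_id)

    periods = sorted(active)
    retention = {}
    # Compare each period with the one immediately before it.
    for previous, current in zip(periods, periods[1:]):
        retained = active[previous] & active[current]
        retention[current] = len(retained) / len(active[previous])
    return retention


# Illustrative data only: four customers in January, three return in February.
sample_events = [
    ("a", "2016-01"), ("b", "2016-01"), ("c", "2016-01"), ("d", "2016-01"),
    ("a", "2016-02"), ("b", "2016-02"), ("c", "2016-02"), ("e", "2016-02"),
    ("a", "2016-03"), ("e", "2016-03"),
]
print(retention_by_period(sample_events))
```

A retention figure like this shifts the question from "what will this customer buy next" to "which customers keep coming back", which is the change of focus the slide argues for.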
![Page 13: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/13.jpg)
Big Data Governance Challenges
• Scalable balance between (hierarchical) control and (networked) empowerment
• Minimize search for data sets
  • Advanced descriptors such as a business glossary
  • Manage attention drift in case of proliferation
  • Usage (page ranking): data sets that are reused more are more relevant
• Digitalization of Trust
  • Authenticity: lineage and provenance
  • Data sets owned by people in your social capital
  • Price: prices may be a mechanism, but it is difficult to identify a fair price and establish a currency-based market for data assets: see Infonomics
  • Service level agreements
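The "usage (page ranking)" idea above can be sketched in a few lines of Python: treat every reuse of a data set as a link pointing to it and run a basic PageRank power iteration. This is only an illustration under stated assumptions. The function `rank_datasets`, the `(consumer, source)` edge format, the damping factor of 0.85, and the data set names are all hypothetical; the talk does not specify an algorithm beyond the page-ranking analogy.

```python
def rank_datasets(reuse_edges, damping=0.85, iterations=50):
    """Rank data sets by how much they are reused.

    reuse_edges: list of (consumer, source) pairs, meaning that the
    data set or report `consumer` reuses the data set `source`
    (hypothetical input format).
    Returns a list of (name, score) sorted from most to least relevant.
    """
    nodes = sorted({name for edge in reuse_edges for name in edge})
    out_links = {name: [] for name in nodes}
    for consumer, source in reuse_edges:
        out_links[consumer].append(source)

    # Start from a uniform distribution over all data sets.
    rank = {name: 1.0 / len(nodes) for name in nodes}
    for _ in range(iterations):
        new_rank = {name: (1.0 - damping) / len(nodes) for name in nodes}
        for name in nodes:
            # A data set that reuses nothing spreads its score evenly.
            targets = out_links[name] or nodes
            share = damping * rank[name] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return sorted(rank.items(), key=lambda item: item[1], reverse=True)


# Illustrative reuse graph: "customers" is reused three times, "orders" twice.
sample_edges = [
    ("churn_report", "customers"),
    ("sales_dashboard", "customers"),
    ("sales_dashboard", "orders"),
    ("forecast", "orders"),
    ("forecast", "customers"),
]
for name, score in rank_datasets(sample_edges):
    print(f"{name}: {score:.3f}")
```

In this sample the most reused data set, `customers`, ranks first, followed by `orders`. A plain reuse count would give the same order here; the link-based score differs once reuse by highly ranked data sets should count for more.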
![Page 14: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/14.jpg)
Digitalization of Trust Challenges
• In hierarchical data governance, trust is
  • established by a centrally sanctioned competence center
  • or by externally appointed trustees with formal roles: stewards, owners, architects
• In a networked peer-driven approach, trust is more complicated:
  • Authenticity: is the data factual or opinionated?
  • Intention: does this data have good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of?
  • Assess expertise or quality: are the people involved skilled or certified stewards?
  • Is it accurately representing our business reality, i.e. customer base?
  • Is it complete and up to date?
  • Has it been certified through a standard process?
![Page 15: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/15.jpg)
Danger of the old paradigm models
• Weapons of Math Destruction (WMD) are models
  • Threaten to destabilize
    • Equality
    • Democracy
• Traits of WMDs
  • Opaque
  • Unregulated
  • Uncontestable
  • …hence: ungoverned
![Page 16: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/16.jpg)
The Rise of the Chief Data Officer (CDO) [6]
Data governance & stewardship provide the right level of control and trust in data
Data Consumers (Business)
LEADERSHIP: CEO, CFO, VP, Marketing
ROLES: Data Scientist, Business Analyst
TECHNOLOGY: Visualization, Self-service BI
NEED
Data Authority
Data Infrastructure (IT)
LEADERSHIP: CIO
ROLES: Information Manager, Data Architect, Data Modeler
TECHNOLOGY: Hadoop, Databases, Data Integration
Data Authority
LEADERSHIP: Chief Data Officer
ROLES: Data Governance Manager, Data Steward
TECHNOLOGY: Data Stewardship Platform
![Page 17: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/17.jpg)
Recommendations for the Chief Data Officer
• Collaboration: inwards / outwards
• Data Space: traditional data / big data
• Value Impact: service / strategy
• Join our MIT Sloan CDO Research
  • http://www.iscdo.org/
![Page 18: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/18.jpg)
Conclusion
• Digital forces have digitally empowered individuals in the organization
• A hybrid data governance approach should combine
  • Hierarchical control of critical data assets to enhance internal coordination
  • Networked peer-driven empowerment to drive ‘serendipity’
  • On a shared platform
• Key challenges are:
  • Digitalization of trust with a focus on social capital
  • Big data analytics that drives lifetime value for the customer
  • Get rid of old models that are opaque, unregulated and incontestable
  • Recognize CDO leadership and role transition
![Page 19: Data Governance in the Big Data Era](https://reader035.vdocuments.net/reader035/viewer/2022070515/5878bc4d1a28ab724c8b7a6f/html5/thumbnails/19.jpg)
Recommended Reading
• O’Neil, C.: Weapons of Math Destruction
• Franks, B.: Taming the Big Data Tidal Wave
• Sundararajan, A.: The Sharing Economy
• Pentland, S.: Social Physics: How Good Ideas Spread
• Madnick, R. et al.: A Cubic Framework for the Chief Data Officer
• Zittrain, J.: The Future of the Internet
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
• https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/