analyze your data, transform your business
TRANSCRIPT
Copyright © 2012, SAS Institute Inc. All rights reserved.
ANALYZE YOUR DATA, TRANSFORM YOUR BUSINESS
DAN SOCEANU, SENIOR DATA MANAGEMENT SOLUTIONS ARCHITECT, SAS
INFORMATION VS. KNOWLEDGE, WISDOM?
“Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?” T.S. Eliot
CONTEXT DEFINITION
con·text/’käntekst/
Noun
The circumstances that form the setting for an event, statement, or idea, and in
terms of which it can be fully understood and assessed.
"the decision was taken within the context of planned cuts in spending"
The parts of something written or spoken that immediately precede and follow a
word or passage and clarify its meaning.
"word processing is affected by the context in which words appear"
CONTEXT: DATA IS NOT INFORMATION*
Information is data in context
Data are simply collected facts and statistics used for reference or analysis. In
computing, data are quantities, characters or symbols on which operations are
performed by a computer, or being stored and transmitted.
Knowledge is information in context
Information assets are combinations of data sources, in a system. These assets are
often subject to an Information Architecture. Examples include documents, catalogs and
taxonomies. You can have data without information, but you cannot have information
without data. Knowledge encompasses the understanding of information.
Wisdom is knowledge in context
Wisdom comes from the ability to discern inner qualities and relationships in order to
apply sound judgment to a particular course of action
*Source: Enterprise Architects, EA Blog, “Data is NOT Information” by Chris Aitken
CONTEXT: CONTEXT CATEGORIES*
Computing context
Examples: Connectivity, bandwidth, peripherals, networks
User context
Examples: Profile, location, emotion, proximity, activity, relation
Physical context
Examples: Audio, Video, temperature, condition, texture
Time context
Examples: Time of day, week, month, year, era, period
*Source: B. Schilit, N. Adams, and R. Want, “Context-Aware Computing Applications”, Santa Cruz, 1994.
BUSINESS CONTEXT IN COMPUTING
• Business context is used for search, discovery and navigation. Examples of
business context include: purpose, business requirements, who uses, when
to use, how to use, use cases, special procedures, how developed and tools
& methods used for analysis
• Business context also refers to social, business, or organizational
characteristics of the deployment environment, e.g. “the company is a small
enterprise”, “the company has branches in different countries”, “customers
speak different languages”, and “the revenue trend is negative”*
*Source: “Aligning Software Configuration with Business and IT Context”, Fabiano Dalpiaz, Raian Ali and Paolo Giorgini.
ARTIFICIAL INTELLIGENCE*: MAY SOLVE ALL OUR COMPUTING PROBLEMS…
• Computing excels at computational
speed and accuracy, but cannot currently
incorporate the human dimensions of
sight, sound, touch and smell fully
(analog + digital; biologic + machine)
• “Deep Learning” techniques are focusing
on speech and sound recognition &
understanding
• Text Analytics with Sentiment Analysis
is forging the path toward large-scale
contextual analytics
*Source: Ray Kurzweil, “The Singularity Is Near: When Humans Transcend Biology”, 2005
ARTIFICIAL INTELLIGENCE: …BUT IT’S NOT HERE YET!
*Source: MIT Technology Review May/June 2013, “10 Breakthrough Technologies 2013”
CHALLENGE: APPLYING CONTEXT FOR ANALYTICS IN THE FACE OF POOR QUALITY DATA AND A LACK OF STANDARDS
TOO MUCH DATA in too many places
POOR QUALITY DATA cannot be trusted
INCONSISTENT DATA across multiple sources
Result: the data strategy is not able to support the business strategy
CONTEXT CHALLENGES IN THE DAY-TO-DAY ENTERPRISE
Tools and techniques for integrating enterprise data were primarily
designed for building data repositories, not for business analysis:
• Information context is inconsistent and often inaccurate
• Context often has multiple representations
• Information context is often highly interrelated, yet not available in one
source or system
• Information context can have multiple temporal (time) characteristics
• The data is often incomplete, inaccurate or not current
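Several of these challenges (multiple representations, temporal characteristics, incomplete or stale data) can be illustrated with a small check. This is only a sketch under stated assumptions: the record layout, the `STATE_ALIASES` mapping, and the 365-day freshness threshold are hypothetical and do not come from the presentation.

```python
from datetime import date

# Hypothetical mapping: several representations of the same context value.
STATE_ALIASES = {"nc": "NC", "n.c.": "NC", "north carolina": "NC"}


def standardize_state(value):
    """Map a known representation to one code; None if missing or unrecognized."""
    if value is None:
        return None
    return STATE_ALIASES.get(value.strip().lower())


def assess_record(record, today, max_age_days=365):
    """Return the list of quality issues found in one record (assumed fields)."""
    issues = []
    if standardize_state(record.get("state")) is None:
        issues.append("state missing or unrecognized")
    updated = record.get("last_updated")
    if updated is None:
        issues.append("no update date")
    elif (today - updated).days > max_age_days:
        issues.append("not current")
    return issues


# Illustrative records only; none of these values appear in the source.
records = [
    {"id": 1, "state": "North Carolina", "last_updated": date(2013, 3, 1)},
    {"id": 2, "state": "N.C.", "last_updated": date(2010, 6, 15)},
    {"id": 3, "state": None, "last_updated": None},
]
today = date(2013, 6, 1)
report = {r["id"]: assess_record(r, today) for r in records}
```

In this sketch the first two records resolve to the same state code despite different spellings, the second is flagged as not current, and the third is flagged as incomplete.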
THE QUEST FOR ANALYTICS: EXISTING ANALYTICS DATA MANAGEMENT PROCESS
[Diagram labels: Data Warehouse; Reporting Tools; Read; ETL; Application; 3rd Party; Appliance; Transactional; Social Media; DI; Data Marts; Analytics; Analytics Data Tables; Ad-hoc Data Management; ETL/ELT]
METADATA …IS IN THE EYE OF THE BEHOLDER
Business Metadata
– Business rules, Definitions, Terminology, Glossaries, Algorithms
and Lineage using business language
– Audience: Business users
Technical Metadata
– Defines Source and Target systems, their Table and Field
structures and attributes, Derivations and Dependencies
– Audience: Specific Tool Users – BI, ETL, Profiling, Modeling
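The two views of metadata can be pictured as two linked record types. The sketch below is an illustration only: the class names, field names, and the "Active Customer" example are hypothetical, chosen to mirror the items listed above (definitions, rules, and lineage on the business side; source and target tables, fields, and derivations on the technical side).

```python
from dataclasses import dataclass, field


@dataclass
class TechnicalMetadata:
    """Assumed shape of a technical mapping between a source and a target field."""
    source_table: str
    source_field: str
    target_table: str
    target_field: str
    derivation: str = ""


@dataclass
class BusinessMetadata:
    """Assumed shape of a business glossary term, expressed in business language."""
    term: str
    definition: str
    business_rule: str
    lineage: list = field(default_factory=list)  # list of TechnicalMetadata


def lineage_summary(entry):
    """Describe where a business term's data comes from, one line per mapping."""
    return [
        f"{m.source_table}.{m.source_field} -> {m.target_table}.{m.target_field}"
        for m in entry.lineage
    ]


# Hypothetical example values.
mapping = TechnicalMetadata(
    source_table="CRM_ACCOUNT",
    source_field="LAST_ORDER_DT",
    target_table="DW_CUSTOMER",
    target_field="IS_ACTIVE",
    derivation="LAST_ORDER_DT within the last 12 months",
)
active_customer = BusinessMetadata(
    term="Active Customer",
    definition="A customer who placed an order in the last 12 months",
    business_rule="Order date no more than 12 months before the report date",
    lineage=[mapping],
)
summary = lineage_summary(active_customer)
```

A business user would read the term, definition, and rule; a tool user would read the mapping. Linking the two is what lets lineage be expressed in business language.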
Company Confidential - For Internal Use Only
Copyright © 2013, SAS Institute Inc. All rights reserved.
CONTEXT IN DATA MODELING DESIGN
This sample diagram represents an identifying relationship between two tables: DEPARTMENT and EMPLOYEE.
• This relationship indicates that an EMPLOYEE may not exist outside of the context of a DEPARTMENT.
• In identifying relationships, the primary key of the parent table becomes part of the primary key of the child table.
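Since the diagram itself is not reproduced here, the identifying relationship can be sketched in SQL, run through Python's built-in `sqlite3` module. The column names (`dept_id`, `emp_no`, `name`) and sample rows are assumptions for illustration; only the table names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE DEPARTMENT (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );

    -- Identifying relationship: the parent's primary key (dept_id)
    -- is part of the child's primary key.
    CREATE TABLE EMPLOYEE (
        dept_id INTEGER NOT NULL REFERENCES DEPARTMENT(dept_id),
        emp_no  INTEGER NOT NULL,
        name    TEXT NOT NULL,
        PRIMARY KEY (dept_id, emp_no)
    );
    """
)

# Hypothetical sample rows.
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Finance')")
conn.execute("INSERT INTO EMPLOYEE VALUES (10, 1, 'Ada')")


def try_insert_orphan():
    """Attempt to add an EMPLOYEE whose DEPARTMENT does not exist."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (99, 1, 'Orphan')")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_inserted = try_insert_orphan()
employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```

With the foreign key enforced, the employee row that references a missing department is rejected, which is the schema-level meaning of "an EMPLOYEE may not exist outside of the context of a DEPARTMENT".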
THE ANALYTICS LIFECYCLE: DATA PREP & MANAGEMENT CONSUMES 80% OF THE TIME
[Lifecycle diagram stages: IDENTIFY / FORMULATE PROBLEM; DATA PREP & MANAGEMENT; DATA EXPLORATION; TRANSFORM & SELECT; BUILD MODEL; VALIDATE MODEL; DEPLOY MODEL; EVALUATE / MONITOR RESULTS]
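[Figure: time spent between BUSINESS PROBLEM and BUSINESS DECISION. First bar, two segments: "Preparing to solve the problem" and "Solving the problem" (labels 20%, 80%). Second bar, three segments: "Preparing to solve the problem", "Solving the problem", and "Innovate" (labels 30%, 20%, 50%).]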
THE ANALYTICS LIFECYCLE: MULTIPLE PARTICIPANTS AND CONTEXTS
BUSINESS MANAGER: Domain Expert, Makes Decisions, Evaluates Processes and ROI
IT SYSTEMS / MANAGEMENT: Model Validation, Model Deployment, Model Monitoring, Data Preparation
BUSINESS ANALYST: Data Exploration, Data Visualization, Report Creation
DATA MINER / STATISTICIAN: Exploratory Analysis, Descriptive Segmentation, Predictive Modeling
[Lifecycle diagram stages: IDENTIFY / FORMULATE PROBLEM; DATA PREPARATION; DATA EXPLORATION; TRANSFORM & SELECT; BUILD MODEL; VALIDATE MODEL; DEPLOY MODEL; EVALUATE / MONITOR RESULTS]
ANALYTICS MATURITY GOAL: FROM REACTIVE TO PREDICTIVE
[Chart axes: Competitive Advantage and Degree of Intelligence. Data labels: Raw data, Clean data]
• Standard reports: What happened?
• Ad hoc reports: How many, how often, where?
• Query drill down: Where exactly is the problem?
• Statistical Analysis: Why is this happening?
• Forecast: What if these trends continue?
• Predict: What will happen next?
• Optimize: What is the best that can happen?
• Alerts: What actions are needed?
DATA MANAGEMENT
DATA MANAGEMENT METHODOLOGY
HOLISTIC DATA MANAGEMENT: ANALYTICS REQUIRES PROPER BUSINESS CONTEXT
[Diagram labels: Data Governance; Data Warehouse; Source Systems; Operations; Cloud; Appliance; Static Reporting; Read; ETL; Dynamic Visualization; Data Management; ADW; Data Governance Program; Data Monitoring; Exploration; Quality; Integration; MDM; Data Marts; Model Development; Operational]
DATA MANAGEMENT ESSENTIAL CAPABILITIES FOR ANALYTIC SUCCESS
Enterprise Data Access
• Relational, File, XML, Semi-Structured / Unstructured
• Message Queues, Streaming
• Data Federation
Data Management
• Data Integration
• Data Quality
• Master Data Management
Analytics Management
• Model Management & Monitoring
• Champion / Challenger Process
• Model Deployment & Integration
Decision Management
• Rules, Decision & Analytic Services
• Optimization and Automation
• Embedding Analytics and Data at the point of interaction
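The slide names a champion / challenger process without defining it. A minimal sketch follows, assuming the common reading: the deployed model (champion) and a candidate (challenger) are scored on the same holdout data, and the challenger is promoted only if it improves the error by more than a chosen margin. The models, the holdout values, and the use of mean absolute error are all illustrative assumptions.

```python
def mean_absolute_error(model, holdout):
    """Average absolute difference between predictions and actual values."""
    return sum(abs(model(x) - y) for x, y in holdout) / len(holdout)


def select_champion(champion, challenger, holdout, min_improvement=0.0):
    """Return ("challenger", error) if it beats the champion by more than
    min_improvement on the holdout set; otherwise ("champion", error)."""
    champion_error = mean_absolute_error(champion, holdout)
    challenger_error = mean_absolute_error(challenger, holdout)
    if champion_error - challenger_error > min_improvement:
        return "challenger", challenger_error
    return "champion", champion_error


# Hypothetical models, written as plain functions for illustration.
def current_model(x):
    return 2 * x + 1


def candidate_model(x):
    return 2 * x


# Hypothetical holdout data as (input, actual) pairs.
holdout = [(1, 2.0), (2, 4.0), (3, 6.0)]

winner, error = select_champion(current_model, candidate_model, holdout)
```

The `min_improvement` margin is a design choice in this sketch: requiring a minimum gain avoids replacing a deployed model for differences that may be noise.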
ANALYTICS PRACTICE: ANALYSIS NEEDS DATA
Outcome: A Road Map for embracing data management best practices
• Data required for analysis must be identified, sourced, and transformed
• Sourcing activities must fit into IT operations and conform to IT governance
• Critical technologies, personnel, budgets, and dependencies must be identified
Data Management:
• Better Deployment of Resources
• Improve Quality of Decisions
• Increases IT Efficiency
• Improve User/Customer Satisfaction
• Better Data Sharing Between Business Units
• Prepare for New Initiatives
• Enable Data Source Integration
• Establishes Data Definitions
For every corporate strategy and problem, there is a corresponding data need!
• Data quality inhibits usage of forecasting tools – not able to leverage available technology
• Conflicting definitions at the business unit level are a data integration issue
• Dozens of data sources with no plan to address data sharing needs
• Changes to the number and structure of data sources are major challenges
• Data definitions are unique to business units, and there is no automated integration method
• Advertising traffic information is maintained by several systems that have different methods for calculating usage
• Lack of a 360° view of vendors and agencies prevents business units from accurately anticipating demand
• Suboptimal data prevents an analytical approach to advertising deployment
ANALYTICS PRACTICE: ANALYSIS NEEDS CONTEXT
Required: A Framework for defining and sharing data
• Develop policies for sharing data across
the enterprise
• Define what the data represents and what
it will be labeled
• Define who can use the data and the
restrictions on how they use it
• Identify who is responsible for data
quality
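The four framework points (what the data represents, what it is labeled, who may use it and how, and who is responsible for its quality) can be captured in a single glossary record. The sketch below is hypothetical: the `GlossaryEntry` fields, the role names, and the example entry are assumptions made for illustration, not part of any SAS product or of the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    """Assumed shape of one shared-data definition."""
    label: str                    # what the data will be labeled
    definition: str               # what the data represents
    steward: str                  # who is responsible for data quality
    allowed_roles: set = field(default_factory=set)  # who can use the data
    restrictions: str = ""        # restrictions on how they use it


def can_use(entry, role):
    """Return True if the given role is permitted to use this data element."""
    return role in entry.allowed_roles


# Hypothetical entry.
household = GlossaryEntry(
    label="Household",
    definition="All citizens registered at the same residential address",
    steward="Data Quality Office",
    allowed_roles={"case_worker", "analyst"},
    restrictions="Aggregate reporting only",
)

analyst_allowed = can_use(household, "analyst")
vendor_allowed = can_use(household, "vendor")
```

Keeping the definition, access rule, and steward in one record is one way to make a sharing policy checkable rather than purely descriptive.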
Integrated View of Citizen
• Inability to integrate citizen data from multiple sources and channels may lead to increased fraud
• Inaccurate view of family units can lead to missed opportunity to assist children
• Inaccurate criminal history data leads to poor judicial decisions for bail bond assignments
Effective Education
• Inability to quickly integrate/analyze student data from multiple sources
• Lack of universal KPIs
• Multiple overlapping LOB projects on tap
• Missed opportunities to direct resources to at-risk students
Adapting to Budget Reduction Realities
• Difficulty integrating data from hundreds of data centers
• Efforts at federal & statewide transformation and consolidation hindered
Enabling Strategic Initiatives
• Without Data Governance, enterprise-level transformation is not attainable
• Currently there is a need for data integration that cannot be met. Federal IT integration and improving coordination across agencies such as Homeland Security is a major challenge.
Working Smarter by Leveraging Analytics
• Unable to execute high impact analysis such as forecasting police deployment by neighborhood based on historical crime data
• Suboptimal data quality limits the ability to analyze criminality patterns for strategic investments that can limit recidivism
Meeting New Healthcare Challenges
• Inability to coordinate across many similar healthcare programs to drive efficiency & limit fraud
• ACA raises the bar on needing a 360° view of patients, which is unattainable without data governance
ANALYTICS PRACTICE: ANALYSIS NEEDS A PURPOSE
Law Enforcement
Transportation Analysis
Effective Citizen Programs
Tax Compliance & Collections
Detecting Fraud
Policy Enforcement
Educational Effectiveness
Criminal Corrections Analysis
Required: Priorities and Road Map for BI and Analytic Capabilities
• Get consensus on business priorities
• Identify data required for analysis
• Quantify the business impact
DATA MANAGEMENT: THE SAS® DATA MANAGEMENT FRAMEWORK
CORPORATE DRIVERS: Decision Making; Customer Focus; Compliance Mandates; Mergers & Acquisitions; At-Risk Projects; Operational Efficiencies
SOLUTIONS: Master/Reference DM; Data Visualization; Data Quality; Data Virtualization; Data Profiling; Metadata Management; Data Exploration; Data Monitoring
DATA MANAGEMENT: Data Lifecycle; Reference and Master Data; Data Security; Data Architecture; Metadata; Data Quality; Data Administration; Data Warehousing & BI/Analytics
DATA STEWARDSHIP: Roles & Tasks
DATA GOVERNANCE: Decision-making Bodies; Guiding Principles; Program Objectives; Decision Rights
METHODS: People; Process; Technology
Business Data Glossary
Data Integration