brighttalk reason 114 for learning math - final

Follow Us: #ITSMSummit Reason #114 For Learning Math: Using Analytics to Improve Service Assurance

Upload: andrew-white

Post on 29-Aug-2014

Category:

Technology


TRANSCRIPT

Page 1: Brighttalk   reason 114 for learning math - final

Follow Us: #ITSMSummit!

Reason #114 For Learning Math: Using Analytics to Improve Service Assurance

Page 2: Brighttalk   reason 114 for learning math - final

Mr. White has fifteen years of experience designing and managing the deployment of Systems Monitoring and Event Management software. Prior to joining IBM, Mr. White held various positions, including leading the Monitoring and Event Management organization of a Fortune 100 company and developing solutions as a consultant for a wide variety of organizations, including the Mexican Secretaría de Hacienda y Crédito Público, Telmex, Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance, and the US Navy Facilities and Engineering Command.

Andrew White, Cloud and Smarter Infrastructure Solution Specialist, IBM Corporation

Page 3: Brighttalk   reason 114 for learning math - final

http://weheartit.com/entry/12433848

Page 4: Brighttalk   reason 114 for learning math - final

Ground rules for this session…
•  If you can’t tell if I am trying to be funny…
   –  GO AHEAD AND LAUGH!
•  Feel free to text, tweet, yammer, or whatever to share with the rest of the attendees
•  If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.

Page 5: Brighttalk   reason 114 for learning math - final

I am here today to share some of what I have learned about

Page 6: Brighttalk   reason 114 for learning math - final

CIOs turn to innovative technologies to deliver better outcomes

Cloud & Optimized Workloads
§  Agile provisioning
§  Elastic compute power
§  Scalable storage resources
§  Intelligent services

Mobile Enterprise
§  Hybrid mobile app development
§  Multi-channel integration
§  Device management
§  Workloads on the move

Security Intelligence
§  People & identity
§  Data & information
§  Application security
§  Security analytics

Big Data Analytics
§  Analyze an enormous variety of information sources
§  Real-time insights & actions on streaming data

IBM CIO Study (2012)

Page 7: Brighttalk   reason 114 for learning math - final

Why is problem solving hard?

Non-transparency (lack of clarity of the situation)
•  commencement opacity
•  continuation opacity

Polytely (multiple goals)
•  inexpressiveness
•  opposition
•  transience

Complexity (large numbers of items, interrelations, and decisions)
•  enumerability
•  connectivity (hierarchy relation, communication relation, allocation relation)
•  heterogeneity

Dynamics (time considerations)
•  temporal constraints
•  temporal sensitivity
•  phase effects
•  dynamic unpredictability

Page 8: Brighttalk   reason 114 for learning math - final

Problem Cycle Evaluation  

Recognition

Observation

Analysis Solution

Validation

Control

Page 9: Brighttalk   reason 114 for learning math - final

Predictive Modeling Timeline

Point of Observation

Past Behavior
•  The observation period used to feed the forecasting models

Future Behavior
•  The performance period the model is trying to predict
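
The timeline above can be sketched in a few lines of Python. This is only an illustration: the sample readings, the function name `split_at_observation_point`, and the chosen point of observation are all assumptions, not anything taken from the talk. It shows how a single point of observation divides the data into the observation period (used to feed a model) and the performance period (what the model is trying to predict).

```python
from datetime import datetime, timedelta


def split_at_observation_point(samples, observation_point):
    """Split (timestamp, value) samples into past and future behavior."""
    past = [(t, v) for t, v in samples if t <= observation_point]
    future = [(t, v) for t, v in samples if t > observation_point]
    return past, future


# Hypothetical hourly readings, purely illustrative.
start = datetime(2014, 8, 1)
samples = [(start + timedelta(hours=h), 40 + h) for h in range(10)]

# Assumed point of observation: six hours after the first reading.
point_of_observation = start + timedelta(hours=6)

observation_period, performance_period = split_at_observation_point(
    samples, point_of_observation
)
```

Only `observation_period` would be handed to a forecasting model; `performance_period` is held back so the forecast can be checked against what actually happened.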

Page 10: Brighttalk   reason 114 for learning math - final

Predictive models harness the information lost in past data so you can discretely identify situations and react to them quickly.

Page 11: Brighttalk   reason 114 for learning math - final

Analytics 1.0

In the early days, we were just happy to know if the network was up or down.

We suffered from event floods and the perpetually red event console.

Page 12: Brighttalk   reason 114 for learning math - final

Analytics 2.0

Eventually the technology allowed us to correlate based on topology and filter unnecessary events.

Dashboards were all the rage and were measured in data per square inch.

Page 13: Brighttalk   reason 114 for learning math - final

Evolution of Analytics

Descriptive Analytics: What Happened?
Diagnostic Analytics: Why Did It Happen?
Predictive Analytics: What Will Happen?
Prescriptive Analytics: How Do We Make It Happen?

Chart axes: Difficulty, Value

Adapted from Gartner

Page 14: Brighttalk   reason 114 for learning math - final

First…

… we need to talk a little bit about your brain

Page 15: Brighttalk   reason 114 for learning math - final

The Triune Brain

Reptilian Brain (basal ganglia)

Mammalian Brain (limbic system)

Cognitive Brain (neocortex)

Page 16: Brighttalk   reason 114 for learning math - final

Our Thought Process

*** not very reliable

Cognition

Limbic Center (hippocampus and amygdala)

Cortex (hippocampus and amygdala)

Conscious Choice (via motor centers)

Most primitive, seat of unconscious

Long-term memory

Conscious, meaning, choice

Perception (via the senses)***

Pre-Frontal Cortex (hippocampus and amygdala)

Stimulus

Page 17: Brighttalk   reason 114 for learning math - final

Short Term Memory

Your Brain Working Memory Understanding Judgement Relationship

Short-term memory is where the real work of sense-making takes place

Short-term memory has a limited amount of space (The estimate is 7 ± 2)

Page 18: Brighttalk   reason 114 for learning math - final

Information the brain can consume

Chart axes: Quantity, Time

Page 19: Brighttalk   reason 114 for learning math - final

Information is cheap. Understanding is expensive. -Karl Fast, Professor of UX Design, Kent State University

Page 20: Brighttalk   reason 114 for learning math - final

From Data to Wisdom

Data
•  Symbols
•  Metrics
•  Facts

Information
•  Patterns
•  Comparisons
•  Organization

Knowledge
•  Trends
•  Generalizations
•  Beliefs

Intelligence
•  Decisions
•  Skill
•  Adaptation

Wisdom
•  Accountability
•  Foresight
•  Synthesis

Diagram labels: Correlation, Analysis, Application, Understanding, Complexity, Context, Communication, Repetition

Page 21: Brighttalk   reason 114 for learning math - final

Chart axes: x, y

$y_i = \alpha_0 + \alpha_i x_i + \varepsilon_i$

Data

Information

Knowledge
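
The step from data to knowledge on this slide rests on a fitted regression line (an intercept, a slope, and an error term). A minimal sketch of that step, assuming a single slope coefficient and ordinary least squares, neither of which the slide states explicitly, could look like this. The data points and the name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Data: hypothetical raw observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Information: the pattern found in the data.
intercept, slope = fit_line(xs, ys)

# The error term: what the line does not explain at each point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Knowledge: a generalization that can be applied to an unseen x.
forecast = intercept + slope * 6.0
```

The residuals are worth keeping around: they are the part of each observation the line cannot account for, and a residual that suddenly grows is often the first hint that the pattern has changed.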

Page 22: Brighttalk   reason 114 for learning math - final

Why Knowledge?

Data, Information, Knowledge, Intelligence, Wisdom

Chart axes: Past, Future; Abstract, Tangible

Knowledge is the point of transition

Page 23: Brighttalk   reason 114 for learning math - final

All You Need

Love

Page 24: Brighttalk   reason 114 for learning math - final

Models of Reasoning

•  Inductive
   –  Starts with Data Available
   –  Concludes with Possible Hypotheses
   –  Bottom Up “Data Driven Approach”

•  Deductive
   –  Starts with Theoretical Framework
   –  Concludes with Logical Deductions
   –  Theory Driven Approach

Diagram labels: Data, Interpretation, Theory Development, Hypothesis Testing, Hypothesis, Theory

Page 25: Brighttalk   reason 114 for learning math - final

Two Types of Decision Making

Programmed Decisions
   –  Routine
   –  Repetitive
   –  Well-Structured
   –  Predetermined Decision Rules

Non-Programmed Decisions
   –  Unique
   –  Presence of Risk
   –  Presence of Uncertainty
   –  Black Swans

Page 26: Brighttalk   reason 114 for learning math - final

How To Improve Decision Making

•  Programmed Decision Making
   –  Collect evidence
   –  Identify the problem
   –  Select a solution
   –  Implement and evaluate the outcome

•  Non-Programmed Decision Making
   –  Narrow evidence down to the ideal level
   –  Apply heuristics to limit the impact of cognitive bias
   –  Present options to a human for a decision
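
One way to picture the split between the two kinds of decision is a small router: situations covered by a predetermined rule are handled automatically, and everything else is narrowed to a short list and handed to a person. The sketch below is illustrative only. The rule table, the situation names, the scoring of evidence, and the `decide` function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical rule table: situation name -> predetermined response.
PROGRAMMED_RULES = {
    "disk_full": "purge temp files",
    "service_hung": "restart service",
}


@dataclass
class Decision:
    kind: str                  # "programmed" or "non-programmed"
    action: Optional[str]      # chosen automatically, or None if a human must choose
    options: List[str] = field(default_factory=list)


def decide(situation, evidence, max_options=3):
    """Route a situation to a rule, or shortlist evidence for a human."""
    if situation in PROGRAMMED_RULES:
        # Programmed: routine, well-structured, rule already exists.
        return Decision("programmed", PROGRAMMED_RULES[situation])
    # Non-programmed: narrow the evidence to the strongest few items
    # and present them as options instead of acting automatically.
    ranked = sorted(evidence, key=lambda item: item[1], reverse=True)
    shortlist = [name for name, _score in ranked[:max_options]]
    return Decision("non-programmed", None, shortlist)


routine = decide("disk_full", [])
novel = decide(
    "unknown_slowdown",
    [("db locks", 0.7), ("network", 0.2), ("gc pauses", 0.5), ("dns", 0.1)],
)
```

Capping the shortlist at a handful of options is a deliberate choice here: it keeps what is shown to the human within what short-term memory can reasonably hold.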

Page 27: Brighttalk   reason 114 for learning math - final

Four Sources of Bad Decisions

•  Failure to frame the problem correctly
•  Poor use of evidence
•  Faulty decision making process
•  No feedback for improvement

Page 28: Brighttalk   reason 114 for learning math - final

Common Logical Fallacies
•  Appeals to Authority – where you rely on an expert source to form the basis of your argument
•  False Inductions – where you infer a causal relationship where none is evident
•  Reification – when you take a hypothesis or potential theory and present it as a known truth
•  The Slippery Slope – when you base an argument on the thinking that once one action is taken, it will trigger a sequence of events that will result in the direst of consequences
•  The Band Wagon – when you present an argument as true on the basis of its popularity
•  The False Dichotomy – when you provide only two options and force a choice to be made
•  The Straw Man – when you create a false argument and refute it, implying that the counterargument is true
•  Observational Selection – when you draw attention to the positive aspects of an idea and ignore the negatives
•  Statistics of Small Numbers – when you take one (or a very small sample) and use it to draw a general conclusion

Page 29: Brighttalk   reason 114 for learning math - final

The problem is not that there are no silver bullets… the problem is that there are no werewolves. - Jim Tussing, CTO, Nationwide Insurance

Page 30: Brighttalk   reason 114 for learning math - final
Page 31: Brighttalk   reason 114 for learning math - final

Global Warming and Inflation

Inflation

Global warming

Page 32: Brighttalk   reason 114 for learning math - final

Hidden Factors

Hidden Factor

Smoking Lung Cancer

Page 33: Brighttalk   reason 114 for learning math - final

Page 34: Brighttalk   reason 114 for learning math - final
Page 35: Brighttalk   reason 114 for learning math - final

Boyd’s Loop

Observe: Observation, Outside Information, Unfolding Circumstances, Unfolding Interaction With Environment
Orient: Cultural Norms, Cognitive Abilities, Knowledge Life Cycle, Prior Wisdom, New Information
Decide: Decision (Hypothesis)
Act: Action (Test)
Connections: Feed Forward, Feedback, Implicit Guidance & Control

•  Note how orientation shapes observation, shapes decision, shapes action, and in turn is shaped by the feedback and other phenomena coming into our sensing or observing window.

•  Also note how the entire “loop” (not just orientation) is an ongoing many-sided implicit cross-referencing process of projection, empathy, correlation, and rejection.

From “The Essence of Winning and Losing,” John R. Boyd, January 1996.

Observe Orient Decide Act

Page 36: Brighttalk   reason 114 for learning math - final

Where the Breakdown Occurs

Observe, Orient, Decide, Act

Situational Awareness
•  Level 1: Perception of Elements in Current Situation
•  Level 2: Comprehension of Current Situation
•  Level 3: Projection of Future Status

Current State, Decision, Performance of Actions, Feedback

Individual Influences
•  Goals & Objectives
•  Preconceptions
•  Expectations
•  Abilities
•  Experience
•  Training
•  Long Term Memory
•  Automaticity
•  Cognitive Processes

Systemic Influences
•  System Capability
•  Interface Design
•  Stress & Workload
•  Complexity
•  Automation

Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors 37(1), 32–64.

Page 37: Brighttalk   reason 114 for learning math - final

Sometimes We Miss What is Going On

Say… what’s a mountain goat doing all the way up here in these clouds?

Page 38: Brighttalk   reason 114 for learning math - final

Rare Events

The “one chance in a million” will undoubtedly occur, with no less and no more than its appropriate frequency, however surprised we may be that it should occur to us. - Sir Ronald A. Fisher

©  Aquire  Inc.  2012  

Page 39: Brighttalk   reason 114 for learning math - final

The Gaussian Bell Curve

Mean

Within ±1σ: 68%
Within ±2σ: 95%
Within ±3σ: 99.7%
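
The share of a normal distribution that falls within k standard deviations of the mean can be computed directly from the error function, which is how the textbook figures of roughly 68%, 95%, and 99.7% are obtained. The short sketch below does that with Python's standard `math.erf`. The function name `coverage_within` and the "per million" framing are illustrative additions, not part of the talk.

```python
import math


def coverage_within(k_sigma):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k_sigma / math.sqrt(2))


# Coverage for one, two, and three standard deviations.
coverage = {k: coverage_within(k) for k in (1, 2, 3)}

# How many of a million normally distributed samples are expected
# to land beyond three standard deviations.
rare_per_million = (1 - coverage[3]) * 1_000_000
```

Even with a well-behaved bell curve, a stream of a million measurements is expected to produce a couple of thousand readings beyond three sigma, which is worth remembering before treating every such reading as an incident.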

Page 40: Brighttalk   reason 114 for learning math - final

The trick is not to spend our time trying to get better at predicting this world, or making it more predictable, for both of these strategies are bound to fail. - Nassim Nicholas Taleb, Author and Philosopher

Page 41: Brighttalk   reason 114 for learning math - final

Normative Decision Making Model
•  Limited Information Collection
   –  7 +/- 2
   –  Tendency to acquire manageable rather than optimal amounts of information
   –  Difficulty identifying all possible options
•  Judgmental Heuristics
   –  Judgmental heuristics - rules of thumb or shortcuts that people use to reduce information processing demands
   –  Availability heuristic - tendency to base decisions on information readily available in memory
   –  Representativeness heuristic - tendency to assess the likelihood of an event occurring based on impressions about similar occurrences
•  Satisficing
   –  Choosing a solution that meets a minimum standard of acceptance

Page 42: Brighttalk   reason 114 for learning math - final

The Analytics Focus…

In addition to handling monitoring and performance alerts, it helps drive improved availability.

The Formula:
1.  Continually collect, categorize, and analyze all events from as many sources as possible
2.  Correlate events and analyze them using previous outages as patterns to identify situations worth investigating
3.  Notify a support team so the situation can be mitigated before becoming an outage
4.  Automate responses that have well established situational fingerprints and proven resolution steps

Page 43: Brighttalk   reason 114 for learning math - final

Most Common Modeling Tasks
•  Classification: predicting an item class, “decision tree”
•  Clustering: finding natural groups or clusters in data
•  Association: finding things that occur together
•  Deviation: finding changes or outliers
•  Estimation: predicting values
•  Linkage: finding relationships among actors
•  Mining: extracting information from data

Page 44: Brighttalk   reason 114 for learning math - final

Types of Analytical Algorithms

Algorithm | Description
Decision Tree | Calculating the odds of an outcome
Association Rules | Identifying the relationships between elements
Naïve Bayes | Clearly showing the differences in a particular variable
Sequence Clustering | Grouping data based on a sequence of events
Time Series | Analyze and forecast time-based data
Neural Networks | Seek to uncover non-intuitive relationships in data
Text Mining | Analyze unstructured text data looking for context and meaning
Linear Regression | Determine the relationship between columns to predict an outcome
Logistic Regression | Evaluate the relationship between columns in order to evaluate the probability that a column will contain a specific state

Page 45: Brighttalk   reason 114 for learning math - final

Questions Answered by Analytics

Business Question | Method
What is the best that can happen? | Optimization
What will happen next? | Predictive
What if this trend continues? | Predictive/Forecasting
Why is this happening? | Variance analysis/Root Cause
Is some action needed? | Alerts
Where is the problem? | Query/Drill Down
How many, how often, when? | Ad hoc reports
What happened? | Standard reports

Chart axis: Value

Page 46: Brighttalk   reason 114 for learning math - final

Understanding what is already known but has not been shown.

Page 47: Brighttalk   reason 114 for learning math - final

Incident Life Cycle

Down Time: Detection Time, Response Time, Repair Time, Recovery Time

Milestones: Outage, Detection, Diagnosis, Repair, Recover, Restore

Observe Orient Decide Act

Page 48: Brighttalk   reason 114 for learning math - final

Anatomy of an Outage

Components shown: Corporate LANs & VPNs, Load Balancer, Firewall, Web Servers, Message Queue, WAS, Database, zOS CICS, zOS MQ, DB2

1.  5:45-ish pm: CICS ABENDs start flooding the console but not high enough to ticket
2.  6:00-ish pm: MQ flows are interrupted and are alerting in Flow Diagnostics
3.  6:04 pm: Synthetic transactions fail, and at 6:14 the Ops Center confirms the issue and creates a P0 Incident
4.  6:54 pm: Support teams investigate the interrupted flows and determine it is a “back-end” problem
5.  10:29 pm: Support teams investigate MQ, ultimately rule it out, and decide to reset CICS to resolve the issue

Page 49: Brighttalk   reason 114 for learning math - final

http://www.ithakabound.com/wp-content/uploads/2010/02/DC-Snow-men-pushing-car.jpg

Why did this happen?

Page 50: Brighttalk   reason 114 for learning math - final

The Problem

If there is no ‘early detection’ before the outage, operations teams can only react while the outage is already in effect and the business is already losing money...

Why aren’t operations teams preventative today?

§  Too much data to analyze manually
§  Existing analytic techniques, such as standard thresholds, are not up to the task
§  They cannot detect problems while they are emerging (before business impact)
§  Set the threshold too high: insufficient warning before total failure
§  Set the threshold too low: too much noise, and everything is ignored
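
The threshold dilemma above can be illustrated with a small comparison between a fixed threshold and a baseline that adapts to recent history. This is a sketch under stated assumptions, not the method used in the talk: the queue-depth readings are invented, and the rolling window of 5 samples and the 3-sigma band are arbitrary choices made for the example.

```python
import statistics


def static_alerts(values, threshold):
    """Indexes where a reading exceeds a fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def baseline_alerts(values, window=5, k=3.0):
    """Indexes where a reading exceeds the recent mean by k standard deviations."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        spread = statistics.pstdev(history)
        # Guard against a perfectly flat history (zero spread).
        if values[i] > mean + k * max(spread, 1e-9):
            alerts.append(i)
    return alerts


# Hypothetical queue-depth readings: steady near 20, then climbing to failure.
queue_depth = [20, 21, 19, 20, 22, 21, 20, 35, 50, 70, 95, 100]

too_high = static_alerts(queue_depth, 90)     # fires only at the very end
too_low = static_alerts(queue_depth, 19.5)    # fires on almost every reading
adaptive = baseline_alerts(queue_depth)       # fires when behavior first changes
```

With these made-up readings the high threshold gives almost no warning, the low one alerts on nearly everything, and the adaptive baseline first fires when the readings leave their normal range. Note that the baseline itself drifts upward once the climb is part of its history, which is one reason real deployments usually combine several techniques.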

Page 51: Brighttalk   reason 114 for learning math - final

Processing Streams

Inputs: Real-Time Event Streams, Patterns from Historical Data, Causal Relationship from Past RCAs

Situational Awareness Engine

Output: Detected and Predicted Situations

Adapted from http://www.slideshare.net/TimBassCEP/getting-started-in-cep-how-to-build-an-event-processing-application-presentation-717795

Page 52: Brighttalk   reason 114 for learning math - final

Complex Event Processing

Event Pipeline

Event Queries

Time Window

Data Events

Control Event

Other Events

Event Filter

Scenarios

A

B

C

Feedback Loop

Event Intelligence

Action Events
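
The pieces named on this slide (an event filter, a time window, an event query, and resulting action events) can be sketched as one small class. This is a toy illustration rather than any real CEP product's API: the class `WindowedPattern`, the event dictionaries, and the "three ABENDs in 60 seconds" scenario are all hypothetical.

```python
from collections import deque


class WindowedPattern:
    """Emit an action event when enough matching events arrive within a time window."""

    def __init__(self, predicate, window_seconds, min_count):
        self.predicate = predicate          # event filter
        self.window_seconds = window_seconds
        self.min_count = min_count
        self.times = deque()                # timestamps of recent matches

    def feed(self, event):
        """Process one event; return an action event or None."""
        if not self.predicate(event):
            return None
        now = event["time"]
        self.times.append(now)
        # Time window: discard matches older than the window.
        while now - self.times[0] > self.window_seconds:
            self.times.popleft()
        if len(self.times) >= self.min_count:
            return {"type": "action", "time": now, "count": len(self.times)}
        return None


# Hypothetical event pipeline; times are seconds since the start of the stream.
events = [
    {"time": 0, "source": "CICS", "kind": "abend"},
    {"time": 30, "source": "MQ", "kind": "flow_interrupted"},
    {"time": 40, "source": "CICS", "kind": "abend"},
    {"time": 200, "source": "CICS", "kind": "abend"},
    {"time": 220, "source": "CICS", "kind": "abend"},
    {"time": 240, "source": "CICS", "kind": "abend"},
]

# Scenario: three or more ABENDs inside any 60-second window.
pattern = WindowedPattern(lambda e: e["kind"] == "abend", 60, 3)
actions = [a for a in (pattern.feed(e) for e in events) if a]
```

Each individual ABEND here is too minor to act on by itself; it is the burst within the window that produces the single action event at the end of the stream.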

Page 53: Brighttalk   reason 114 for learning math - final

One Integrated Environment

Distributed Database Mainframe Network Middleware Storage

Event Pool

Operational Data Warehouse

Predictive

Enrichment & Correlation

Service Desk Paging

CMDB

Knowledge

Asset Mgmt

Event Catalog

Event API

Business Telemetry

3rd Party Providers

Presentation Framework

Page 54: Brighttalk   reason 114 for learning math - final

Integrate Your Processes

Presentation Framework

Asset Management & Topology Database

Aggregation and Analysis

Security Management

Availability Management

Configuration Management

Change Management

Performance Management

Enterprise Data Sources

Business Telemetry

Information

Configuration Discrepancies

Enrichment Data Business Activity Data

Historical Data

“Enriched” Events

Change Activity

Topology Snapshots

Trend-Related Faults

Discovered Problems

Status Indications

Incidents

Audit Information and Suspicious Activity

Enrichment Data Business Activity Data

Automated Discovery

Page 55: Brighttalk   reason 114 for learning math - final

Automated Action
Notification and Escalation
Business Impact Analysis
Root Cause Analysis
Correlation and Event Suppression
Enrichment
Distributed Collectors
Distributed Collectors
Distributed Collectors
LOB Managed Monitoring System
Service Provider Managed Monitoring System
Vendor Managed Monitoring System
Element Manager
Element Manager
Element Manager
Service Center, Yammer, CMDB, CVOL, APM, KM Entries, Triage, xMatters
Visualization Framework
Common Event Format
Topology And Relationship Database
Automated Action Tools
Automated Provisioning System
Predictive Analysis
Automated Change Reconciliation
Security Management
Archive and Report
Business Telemetry Data
Service Center and Enterprise Notification Tool
Meta-Data Integration Bus

Page 56: Brighttalk   reason 114 for learning math - final

Predict: Predictive Outage Avoidance
Ensure availability of applications and services
•  Use learning tools to augment custom best practices
•  Leverage statistical methods to maximize predictive warning
•  Improve problem detection across IT silos

Resolve: Faster Problem Resolution
Find & correct problems faster with tools that determine actions required to resolve issues
•  Identify problems quicker with insight to large unstructured repositories
•  Isolate problems quicker by bringing relevant unstructured data into problem investigations
•  Repair problems quicker with the right details quickly to hand

Perform: Optimized Performance
Track, Optimize, and Predict capacity and performance needs over time
•  Track capacity and performance of applications and services in classic and cloud environments
•  Optimize resource deployment with what-if and best fit planning tools
•  Escalate capacity and performance problems before they cause critical failures

Know: Improved Insight
Enhance visibility into systems resource relationships while increasing customer satisfaction
•  Determine what resources are interdependent to assess impact of failures
•  Gain insight into what is important to your customer
•  Decrease customer churn and acquisition costs while increasing customer retention and satisfaction

Automated Analytics helps lower IT Administration Costs:
•  Performance and Capacity planning tools monitor appropriately and escalate, reducing time-consuming report browsing
•  Learning tools reduce customization and best practices investment on initial deployment
•  Log Analysis helps speed problem resolution to be able to do more with less

Page 57: Brighttalk   reason 114 for learning math - final

Let’s keep the conversation going…

[email protected]
ReverendDrew
SystemsManagementZen.Wordpress.com
systemsmanagementzen.wordpress.com/feed/
@SystemsMgmtZen
ReverendDrew
[email protected]
614-306-3434

Page 58: Brighttalk   reason 114 for learning math - final