Incident Management Revue Strategic Process Planning and Integration Management (SPPIM) Sue Silkey, Thelma Simons and Gail Schaplowsky

Upload: jared-casey

Post on 17-Dec-2015

TRANSCRIPT

Page 1: Incident Management Revue Strategic Process Planning and Integration Management (SPPIM) Sue Silkey, Thelma Simons and Gail Schaplowsky

Incident Management Revue

Strategic Process Planning and Integration Management (SPPIM)

Sue Silkey, Thelma Simons and Gail Schaplowsky

Page 2

Best Practices

• Best practices serve as a guide to designing IT management processes that increase overall efficiency, reduce costs and align IT with business needs.

• ITIL asks…

Page 3

How ITIL best practices can help

• Faster incident recovery

• Fewer unplanned outages

• Better communication with users

• Information that enables better informed management decisions

Page 4

Incident Management

Goal

• Restore normal service operation as quickly as possible and minimize adverse impact on business operations

• Basically this means using all available resources to get the user back to a productive state as quickly as possible

Page 5

Incident Management

Benefits

• Minimize the disruption and downtime for our users

• Maintain a record during the entire Incident life-cycle. (This allows any member of the service team to obtain or provide an up-to-date progress report.)

• Build a knowledgebase of known issues to allow quicker resolution of frequent Incidents

Page 6

Incident Management

How we implemented

• Began using process July, 2006

• Continued regular meetings to review and tweak process

• Process formally adopted in December, 2006

Current status

• Starting to develop metrics to create management reports (how many incidents, major incidents, etc.)
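The kind of count such a management report might start from can be sketched as follows. This is only an illustration: the Case record, its field names and the sample case IDs are hypothetical and are not taken from our case-tracking tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str    # hypothetical identifier, for illustration only
    case_type: str  # "Incident", "Major Incident", "Service Request" or "Problem"


def count_by_type(cases):
    """Return how many cases of each case type were logged."""
    return Counter(case.case_type for case in cases)


# Made-up sample data, not real case records.
sample_cases = [
    Case("C-1", "Incident"),
    Case("C-2", "Major Incident"),
    Case("C-3", "Incident"),
    Case("C-4", "Service Request"),
]

report = count_by_type(sample_cases)
for case_type, total in sorted(report.items()):
    print(f"{case_type}: {total}")
```

A real report would read cases from the tracking system and would probably break the counts down by month or by service.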

Page 7

Definitions

• Incident - any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service

• Service Request - request for increased functionality for new services, not a failure in the IT infrastructure.

• Major Incident – an Incident for which the degree of impact on the User community is extreme, and which requires a response that is above and beyond that given to normal incidents.

• Problem - A condition identified by multiple incidents exhibiting common symptoms, or from one single significant incident, indicative of a single error, for which the cause is unknown

Page 8

Incident Lifecycle

Page 9

A day in the life… of an Incident

Our players

• Nervous Nellie – Gail Schaplowsky

• Incident/Major Incident – Dave Barnhill

• Support Staff – Mike Wright

• Major Incident Manager – Sue Silkey

• CSC Staff – Bill Farris

• Narrator – Thelma Simons

We begin on a bright and sunny day…

Page 10

Page 11

Case Types

• Incident - any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service

• Service Request - request for increased functionality for new services, not a failure in the IT infrastructure.

• Major Incident – an Incident for which the degree of impact on the User community is extreme, and which requires a response that is above and beyond that given to normal incidents.

• Problem - A condition identified by multiple incidents exhibiting common symptoms, or from one single significant incident, indicative of a single error, for which the cause is unknown

Page 12

Incident Management

Goal

• Restore normal service operation as quickly as possible and minimize adverse impact on business operations

Page 13

I+U=P

Impact + Urgency = Priority

Page 14

I+U=P

Impact is defined as the number of people affected by a service outage.

• Low Impact: One customer affected, where no executive or executive staff are involved.

• Medium Impact: Several customers are affected, or an executive or executive staff are involved.

• High Impact: Whole organization, complete department or building affected, or revenue/financial systems affected.
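The three impact levels above can be sketched as a small classification function. This is a rough illustration, not part of the adopted process: the function and parameter names are hypothetical, and "several customers" is assumed here to mean more than one.

```python
from enum import Enum


class Impact(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


def classify_impact(customers_affected: int,
                    executive_involved: bool = False,
                    org_wide: bool = False,
                    financial_system: bool = False) -> Impact:
    """Classify impact following the Low/Medium/High wording above.

    org_wide stands for "whole organization, complete department or
    building affected". Treating "several" as "more than one" is an
    assumption made for this sketch.
    """
    if org_wide or financial_system:
        return Impact.HIGH
    if customers_affected > 1 or executive_involved:
        return Impact.MEDIUM
    return Impact.LOW


print(classify_impact(1).value)
print(classify_impact(1, executive_involved=True).value)
print(classify_impact(40, org_wide=True).value)
```

The High checks come first so that a single customer on a revenue/financial system is still classified as High.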

Page 15

I+U=P

Urgency is defined as the effect of the event on a customer’s ability to work. (This is not to be confused with how urgent the requestor believes the incident to be.)

• Low Urgency: Ability not impaired, the customer is requesting extra or additional functions or services (a service request).

• Medium Urgency: Abilities are partially impaired, and customers cannot use certain functions or services.

• High Urgency: Abilities are completely impaired and customers cannot work.

Page 16

I+U=P

Priority is based on Impact and Urgency. The priority determines how quickly the issue needs to be addressed.

• Low Priority: Work to be completed in 4 business days.

• Medium Priority: Work to be completed in 2 business days.

• High Priority: Work to be completed in 4 hours.

• Urgent Priority: Work to be completed in 2 hours.
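One way to read "Impact + Urgency = Priority" is as a literal sum of two scores. The sketch below takes that reading. The target times come from the list above, but the numeric scores and the cut-offs that turn a sum into a priority are assumptions made for illustration, since the slides do not give the actual mapping.

```python
from enum import IntEnum


class Level(IntEnum):
    """Shared Low/Medium/High scale for both Impact and Urgency (assumed scores)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Target completion times as stated for each priority.
TARGETS = {
    "Low": "4 business days",
    "Medium": "2 business days",
    "High": "4 hours",
    "Urgent": "2 hours",
}


def priority(impact: Level, urgency: Level) -> str:
    """Derive a priority from impact and urgency.

    The cut-offs below are hypothetical, not the documented matrix.
    """
    score = impact + urgency  # ranges from 2 to 6
    if score >= 6:
        return "Urgent"
    if score == 5:
        return "High"
    if score == 4:
        return "Medium"
    return "Low"


p = priority(Level.HIGH, Level.MEDIUM)
print(p, "-", TARGETS[p])
```

A lookup table keyed on (impact, urgency) pairs would serve equally well and would be easier to adjust if the real matrix is not a simple sum.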

Page 17

Major Incident

I am the highest category of impact for an incident

I result in significant disruption to our business

In short, in matters technical on which we are dependent

I am the very model of an IT Major Incident!

(Sung to the tune of the Major-General’s Song from The Pirates of Penzance)

Page 18

Case Types

• Incident: an event which is not part of the standard operation of a service and which causes or may cause an interruption to, or a reduction in the quality of, that service, i.e., some piece of technology that I previously used is not working now.

• Major Incident: an Incident for which the degree of impact on the User community is extreme, or where the disruption is excessive and which requires a response that is above and beyond that given to normal incidents.

Page 19

Major Incident Responsibilities

Support Staff Major Incident Checklist

Assign the case to yourself (if not already done)

Updates:

• Hourly updates should be made to the work log or to the Major Incident Manager at the CSC. If you do not make these hourly updates, the MIM or CSC will contact you for an update.

• Resolution updates should be called in to the MIM or CSC for verification.

Once verified, move the case to Resolved status and complete the information in the Solutions tab.

Page 20

Major Incident Responsibilities

Major Incident Manager Checklist

1. Replicate or substantiate the failure (via monitoring equipment alerts)

2. Log the case

3. Consult the Call List (contact support staff, Service Owner, SCC)

4. Monitor the case

a. Check activity log for updates hourly

b. If activity log hasn’t been updated for an hour, contact support staff.

5. Upon “resolution” or moving the case to “Pending – Major Incident Cleared”

a. Test that failure is resolved.

b. Contact the SCC.
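Step 4 of the checklist above (check the activity log hourly and contact support staff if it has gone an hour without an update) can be sketched as a simple time comparison. The function name and the sample timestamps are hypothetical. A real check would read the last update time from the case's activity log.

```python
from datetime import datetime, timedelta

# The checklist calls for updates every hour.
UPDATE_INTERVAL = timedelta(hours=1)


def needs_follow_up(last_update: datetime, now: datetime) -> bool:
    """Return True when the activity log has gone a full hour without an update."""
    return now - last_update >= UPDATE_INTERVAL


# Made-up timestamps for illustration.
last_logged = datetime(2006, 12, 1, 9, 0)

print(needs_follow_up(last_logged, datetime(2006, 12, 1, 9, 30)))
print(needs_follow_up(last_logged, datetime(2006, 12, 1, 10, 5)))
```

When the function returns True, the Major Incident Manager or CSC would contact the assigned support staff for an update.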

Page 21

Call List

Page 22

Tune in next time…

• What will happen to Major Incident?

• Come back next month to see the continuing saga of Mr. Incident as he wafts his way through Change Management, Problem Management and Configuration Management.

Page 23

Hope you had fun and…

Learned

• The difference between Incident and Major Incident

• How IM can minimize the disruption and downtime for our users

• The importance of maintaining a record during the entire Incident life-cycle

• That building a knowledgebase of known issues will allow quicker resolution of frequent Incidents

Page 24

IM Wrap Up

• Where we are

• Where we want to be

• Metrics to tell us when we arrive

• Annual Review

• New committee based on reorganization

Page 25

Upcoming Sessions

Future sessions are scheduled on:

• Change Management

• Problem Management

• Configuration Management

• Release Management

Page 26

Questions?

More information at SPPIM (PSMO) website

www.technology.ku.edu/psmo

Also in IS/Process Management public folders