it – emergency change process - university of new...

ITSM Change Management IT Emergency Change Process Yvette Fournier- Change Manager 505-321-3287 (pager: 505-951-0950) [email protected] October 22, 2009 1


Page 1: IT – Emergency Change Process - University of New …aissis.unm.edu/nav_pages/files/Emergency_Change_Process... · Change Management IT –Emergency Change Process ... Current Procedure

ITSM Change Management

IT – Emergency Change Process

Yvette Fournier, Change Manager
505-321-3287 (pager: 505-951-0950)

[email protected]
October 22, 2009


Different Types of Changes:

1. Logging and Notification

2. Peer Review and Approval

3. Manager Review and Approval

4. High Risk/Outage – TAT/CAB Review and Approval

5. Emergency Changes – CAB/EC Review and Approval, followed by TAT/CAB post-implementation review


Change Characteristic \ Type of Request for Change

RFC types: RFC – IT – Logging & Notification; RFC – IT – Peer Review; RFC – IT – Manager Review; RFC – IT – High Risk / Outage (TAT/CAB Review); RFC – IT – Emergency (CAB/EC Review)

1. Regularly applied changes that have well documented and tested procedures for applying and are low risk and low impact. X

2. Regularly applied changes that have well documented and tested procedures for applying and are not high risk but have a medium impact and/or where the change process has changed. X

3. Changes that are rarely or have never been applied but are not high risk or high impact and do not create outages. X

4. Technical support required from other groups. X

5. Change is likely to affect the work of other groups and/or users. X

6. High risk, high impact or outage occurs; uses pre-determined maintenance window; minimum one week notification. X

7. High risk, high impact or outage; OUTSIDE of pre-determined maintenance window; two weeks notification. X

8. Same as #7 but less than one week notification can occur. X

9. Same as #8 but less than two weeks notification can occur. X

10. Outage exists due to an incident and change must be applied in order for service to become available.


Emergency Changes create the most Pain!


Current Procedure

1. Change Initiator (CI) realizes that a change must be applied ASAP.

2. CI discusses change with Supervisor and/or Manager, if available, and agree that change must be applied.

3. CI notifies Customer and begins to apply the change as agreed upon.

4. Outage occurs and Support Center is “Slammed with Calls!”

5. Gil, Moira and/or other directors get calls from Customers.

6. Change Initiator/Manager/Supervisor needs to explain why?

7. Notice (apology!!) has to be sent out and posted on White Board.

8. Major Stress for all involved.


Why? Because Change Initiators have questions without answers.

1. What is the Emergency Change Process?

2. Does an Emergency Change Process even exist?

3. If one exists, how do I know that my change must follow the Emergency Change process?

4. Do I need to follow the Emergency Change Process if I am applying the change due to an Incident?

5. Who approves the Change?

6. Whom do I communicate with and how do I know that the date and/or time selected is acceptable?

7. Who notifies users of the outage due to the application of an emergency change?


Emergency Change Process

1. Change Initiator (CI) realizes that a change must be applied ASAP.

2. CI discusses change with Supervisor and/or Manager, if available, and they agree that change must be applied.

3. CI contacts Change Manager.

4. Change Manager coordinates communication process and approval process.

5. If approved, CI applies the change & notifies the CM of outcome.

6. Post Implementation review occurs at TAT/CAB meeting.


Step 1 – Identifying an Emergency Change


Production Services Maintenance Window

A period of time designated in advance by the technical staff of a high-availability service during which preventive maintenance or upgrades that could cause disruption of service may be performed.

The purpose of stating a time period in advance is to allow clients of the service to prepare for possible disruption or prepare for any major changes to the functioning of the service.

This type of disclosure is typically guaranteed as part of a service level agreement.


Production Services Maintenance Window (continued)

The ITSM office is requesting that all services have a pre-defined maintenance window that is documented and posted on the IT web site.

The current IT maintenance windows can be found on the following web site:

http://it.unm.edu/availability/


Emergency Change Criteria

A Change planned, scheduled and implemented at very short notice in order to protect a service from an unacceptable risk of failure or degradation, lack or loss of functionality.

It is understood that maintenance windows, by their very nature, may involve outages. Therefore, if a change is applied within a preapproved maintenance window, should that change be classified as an emergency change?


Emergency Change Criteria (continued)

Answer: IT DEPENDS!

The answer is based on the phrase “very short notice”.

IT’s current standard for notifications is as follows:

1. A high risk, high impact change or one that creates an outage outside of a maintenance window requires 2 weeks' notification to our users.

2. The same kind of change applied within the maintenance window requires 1 week's notification.

NOTE: A change required to resolve an incident or a problem responsible for creating a MAJOR disruption of service follows the Incident Management Protocol.


Emergency Change Criteria (continued)

If you answer No to any of these questions, follow the Emergency Change Process:

When using a predefined Maintenance Window:

Can the change wait until the predefined Maintenance Window?

Can Notification of the Pending Change be sent out at a minimum of 1 week in advance?


Emergency Change Criteria (continued)

If you answer No to this question, follow the Emergency Change Process:

If outside of a predefined Maintenance Window:

Can Notification of the Pending Change be sent out at a minimum of 2 weeks in advance?
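The screening questions for both cases (inside and outside a predefined Maintenance Window) can be sketched as a small check. This is only an illustration, not part of the ITSM procedure: the `PlannedChange` record and its field names are hypothetical, and the 7-day and 14-day thresholds simply restate the 1-week and 2-week notification standards given earlier. It assumes the change is high risk, high impact or creates an outage.

```python
from dataclasses import dataclass

# Notification standards restated from the slides: 1 week inside a
# predefined maintenance window, 2 weeks outside of one.
NOTICE_DAYS_IN_WINDOW = 7
NOTICE_DAYS_OUTSIDE_WINDOW = 14


@dataclass
class PlannedChange:
    """Hypothetical record; the field names are illustrative only."""
    uses_maintenance_window: bool
    can_wait_for_window: bool   # only meaningful when a window is used
    notice_days_available: int  # days of advance notice that can be given


def must_follow_emergency_process(change: PlannedChange) -> bool:
    """Return True when any screening question is answered 'No'."""
    if change.uses_maintenance_window:
        if not change.can_wait_for_window:
            return True
        return change.notice_days_available < NOTICE_DAYS_IN_WINDOW
    return change.notice_days_available < NOTICE_DAYS_OUTSIDE_WINDOW
```

For example, a change that can wait for its window but can only be announced 3 days ahead would be screened as an emergency change, while one announced a full week ahead would not.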


Emergency Change Criteria (continued)

Important Note

(Nothing to do with an emergency change.)

On all major changes make sure you give yourself enough time to set up the announcement of the impending outage, high impact or high risk change. If it’s a major outage, communication needs to be coordinated with the Support Center who may involve Planning and PR/Marketing.


Emergency Change Criteria (continued)

What does NOT have to follow the Emergency Change Protocol:

A change that must be applied to resolve/workaround an INCIDENT or a PROBLEM that has created a Major disruption of service follows the Incident Management Protocol.

Caveats: 1. ONLY if the ENTIRE service is affected and 2. NO other services are affected.

The Incident Management Protocol needs to be followed for changes required due to Major Incidents or Problems. Incident Management Protocol includes coordinating notifications through the Manager on Duty (MOD).
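The exemption and its two caveats can be read as a single condition. The sketch below is an assumption about how to encode it; the function and parameter names are hypothetical and do not come from the ITSM documentation.

```python
def follows_incident_protocol(
    resolves_major_incident: bool,
    entire_service_affected: bool,
    other_services_affected: bool,
) -> bool:
    """True when the change is handled under the Incident Management Protocol.

    Both caveats must hold: the ENTIRE service is already affected,
    and NO other services are affected by applying the change.
    """
    return (
        resolves_major_incident
        and entire_service_affected
        and not other_services_affected
    )
```

A fix for a partial outage, or one that requires rebooting a server shared with other services, returns False here and would therefore go through the Emergency Change Process instead.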


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

1. GroupWise Post Office #1 is down; however, all users in Post Office #2 are functioning without any problems. Resolving the incident requires bringing down the server, which will disrupt the functioning GroupWise users as well as the users in Post Office #1.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

1. Yes. By bringing down the entire system, instead of a 50% degradation of service, a 100% degradation of service will occur.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

2. One of the applications which shares the use of a specific server is experiencing slow response time. Some Users are able to use the application while others are unable to logon. The Support Center is receiving multiple complaints about the application. Applications Support, after contacting the application vendor, has received an application configuration change relating to this issue and the Server needs to be rebooted after the configuration change is made.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

2. Yes. The application resides on a server shared with other applications. Rebooting the server will create an outage for the users of the other applications.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

3. A known problem exists within the GroupWise application. The existing workaround requires a reboot of the email server twice a week in order to prevent a major disruption of service until the Root Cause can be identified and resolved. The scheduled reboot will occur outside of the regular maintenance window.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

3. Yes & then No. As soon as the initial workaround has been identified, the emergency change process should be followed for the first scheduled outage. No additional approvals will be required for subsequent outages relating to this incident since rebooting on a planned schedule prevents a major disruption with the potential for loss of data. However, some notification is still required.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

4. Servers receive system maintenance patches from their vendors on a regular basis. The technical support staff apply the patches on a weekly basis using their maintenance window. These patches will require a reboot of their servers. The reboot process takes no more than 30 minutes.


Emergency Change Criteria (continued)

Examples – are these Emergency Changes?

4. No. Since this is occurring on a regular basis in their maintenance window, TAT/CAB needs to approve the first occurrence in order to approve the process. Subsequent outages do not need approval but still require notification.


Step 2 – Enter a Change Request in Peregrine


Step 3 – Contact the Change Manager

• Pager: 505-951-0950

• Email to Pager: [email protected]

Reminder: Our pagers do not accept voice mail. You must key in, not state, your contact cell or telephone number when prompted.

Good Idea! Enter the pager # in your cell phone; enter the email of pager in your address book.


Step 3 – Contact the Change Manager (continued)

When contacting the Change Manager, be prepared to answer the following questions:

1. Why the change cannot follow the normal RFC process,

2. Reason for change,

3. Date and time the change initiator would like to implement the change,

4. The risk associated with the change or with not applying the change,

5. If an outage will occur, how long it will last,

6. What services will be affected,

7. Back out plan, if change is unsuccessful, and how long the back out process will take.
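One way to make sure nothing is forgotten before paging the Change Manager is to treat the seven questions as a checklist. The following is a minimal sketch under that assumption; `EmergencyChangeBriefing` and its field names are hypothetical and are not a Peregrine form or any existing tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EmergencyChangeBriefing:
    """Hypothetical checklist mirroring the seven questions above."""
    why_not_normal_rfc: Optional[str] = None
    reason_for_change: Optional[str] = None
    requested_date_time: Optional[str] = None
    risk: Optional[str] = None
    outage_duration: Optional[str] = None  # e.g. "no outage" or "30 minutes"
    services_affected: Optional[str] = None
    back_out_plan: Optional[str] = None    # include how long back out takes

    def missing_answers(self) -> List[str]:
        """Names of the questions that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Calling `missing_answers()` on a partially filled briefing lists the questions still to be prepared; an empty list means the initiator has an answer for every item.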


Step 4. TAT/CAB Notification and CAB/EC approvers

The Change Manager, working with information from the Change Initiator and the information available on the Service Catalog IT Internal Web site:

1. Selects the members of the TAT/CAB that will approve the change at the CAB/EC meeting.

2. Identifies additional staff deemed necessary for the particular change being considered.

3. Schedules a conference room and a Web-Conference or GWIM chat.

4. Prepares and sends the text message and email notifying the TAT/CAB and identified staff of the need for a CAB/EC meeting.


CAB/EC is a subset of the TAT/CAB(+):

At a minimum, the CAB/EC approvers are:

• the Change Manager,

• the Manager On Duty,

• the Change Initiator’s Director,

• the affected Service Owner(s),

• IT Security.
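The minimum approver list can be assembled mechanically once the initiator's Director and the affected Service Owners are known. This is an illustrative sketch only; the function name and the use of plain strings for people and roles are assumptions.

```python
from typing import Iterable, List


def minimum_cab_ec_approvers(
    initiator_director: str, service_owners: Iterable[str]
) -> List[str]:
    """List the minimum CAB/EC approvers, without duplicates."""
    candidates = ["Change Manager", "Manager On Duty", initiator_director]
    candidates.extend(service_owners)
    candidates.append("IT Security")

    approvers: List[str] = []
    for person in candidates:
        # One person may hold two roles (e.g. a Director who owns the service).
        if person not in approvers:
            approvers.append(person)
    return approvers
```

Additional staff identified by the Change Manager for a particular change would be added on top of this minimum.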


Step 4. TAT/CAB Notification and CAB/EC approvers (continued)

Notification will include:

• The TAT/CAB members plus necessary staff required for the emergency change approval (CAB/EC),

• Description of the emergency change,

• Services affected,

• Where and when the CAB/EC meeting will occur: GWIM, Audio-conference, conference room or someone’s office.


Step 4. TAT/CAB Notification and CAB/EC approvers (continued)

Change Advisory Board (TAT/CAB)

Change Advisory Board/Emergency Change (CAB/EC)

Step 4. TAT/CAB Notification and CAB/EC approvers (continued)

If the CAB/EC cannot convene, the Change Manager decides.

• If it is impossible to convene the CAB/EC, the Change Manager will make an informed decision as to whether or not the change may be applied.

• The Change Manager will make the decision after discussing the reason for and the implications of not applying the change with the MOD and the change initiator.


Step 4. TAT/CAB Notification and CAB/EC approvers (continued)

The TAT/CAB members receive notification via email and/or a text message noting the emergency change request and the need for a CAB/EC meeting. The text message will also include when and where the meeting will occur plus the names of the individuals necessary for approval.
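The contents of the notification described in Step 4 can be sketched as a plain-text message builder. The wording of the message, the function name and its parameters are all hypothetical; the slides only specify which pieces of information the notice carries.

```python
from typing import Iterable


def build_cab_ec_notice(
    description: str,
    services_affected: Iterable[str],
    meeting_when: str,
    meeting_where: str,
    required_attendees: Iterable[str],
) -> str:
    """Compose a plain-text notice covering the items listed in Step 4."""
    lines = [
        "EMERGENCY CHANGE: CAB/EC meeting needed",
        f"Change: {description}",
        f"Services affected: {', '.join(services_affected)}",
        f"Meeting: {meeting_when} at {meeting_where}",
        f"Required for approval: {', '.join(required_attendees)}",
    ]
    return "\n".join(lines)
```

Keeping the notice to a handful of short lines is a design choice made here so that the same text could be sent by email and by text message.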


STEP 5. Approval or Rejection

CAB/EC Agenda

1. Roll Call

2. Change Manager introduces Change Initiator

3. Change Initiator:

• Why the change cannot follow the normal RFC process,

• Reason for change,

• If approved, when it will be implemented,

• The risk associated with the change,

• How long the outage will last, if one is to occur,

• What services will be affected,

• Back out plan if change is unsuccessful & how long the back out process will take.

4. Change is approved or rejected by CAB/EC


Step 6 – Notification Coordination

After a decision is made, the Change Manager:

1. Coordinates communication with the Support Center/Customer Care and the Service Owner(s), if the change is approved. Depending on the nature of the change, notification will also be sent to all IT management.

2. Updates the Emergency RFC entry in Peregrine reflecting status;

3. Notifies the TAT/CAB of decision made by the CAB/EC.


Step 6 – Notification Coordination - approved (continued)

• The Service Owner (SO), if available, contacts the primary users of the service(s) and informs them of the upcoming emergency change.

• If the SO is unavailable, the Change Manager will contact the primary users identified in the Service Catalog.


Step 6 – Notification Coordination - approved (continued)

Service Owner & Primary User Contact information can be found in the IT Service Catalog by using one of the following links:

Service Owner by Service Category
http://itinternal.unm.edu/servicecatalog/service_own_cat.php/#content

Services by Owner
http://itinternal.unm.edu/servicecatalog/service_by_owner.php


Step 7 – RFC approved

Change Initiator follows through with the change:

1. Implements the change;

2. Reports, to the Change Manager, on the success or failure of the change;

3. Attends the next TAT/CAB meeting for a post-implementation review of the Emergency RFC and its outcome.


Step 7 – RFC Not Approved

1. Change Initiator’s RFC will follow the appropriate RFC request process;

2. Change Initiator will then request approval via the next TAT/CAB meeting.
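The two Step 7 outcomes can be summarised as a simple branch on the CAB/EC decision. This is a sketch for illustration; the function name is hypothetical and the returned strings paraphrase the slides.

```python
from typing import List


def next_steps_for_initiator(approved: bool) -> List[str]:
    """Return the Change Initiator's follow-up actions after the decision."""
    if approved:
        return [
            "Implement the change",
            "Report success or failure to the Change Manager",
            "Attend the next TAT/CAB meeting for the post-implementation review",
        ]
    # Not approved: the request falls back to the regular RFC route.
    return [
        "Follow the appropriate RFC request process",
        "Request approval at the next TAT/CAB meeting",
    ]
```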


Next Steps

1. Set up the Audio-Conference for control and use by the ITSM office.

2. Create the IT CAB/EC GWIM Group.

3. Create the IT CAB/EC texting Group.

4. Present to TAT.

5. Schedule presentations with all IT Groups.

• After receiving this overview, you are to begin following the Emergency Change Process.

6. Modify the existing RFC-IT form to include an Emergency Change Identifier and to automatically page the Change Manager.

7. Create the IT Emergency RFC form in Peregrine.

8. Post docs in fastinfo and create links in Peregrine.


Happy, Happy, Joy, Joy!

Whether the Emergency Change Request is approved or not:

1. The pain points associated with Emergency Changes decrease;

2. Process and Roles are well defined;

3. IT Management, End Users and Customers will be kept informed re. availability of IT services.
