ITimpulse NOC process This is an interactive, detailed, step wise guide explaining how alerts are managed at our NOC. This document contains information that is considered proprietary and confidential. No information contained in this document may be released, re-printed, or redistributed without prior permission from ITimpulse.

Upload: lambert-porter

Post on 25-Dec-2015

Page 1

ITimpulse NOC process

This is an interactive, detailed, step wise guide explaining how alerts are managed at our NOC.

This document contains information that is considered proprietary and confidential. No information contained in this document may be released, re-printed, or redistributed without prior permission from ITimpulse.

Page 2

How to navigate this PPT

• View this presentation in Slide Show (full-screen) mode. Press F5 to start Slide Show mode.

• Do not navigate using the keyboard.

• Use your mouse and click on the buttons; they will redirect you to the appropriate slide.

• The back and next buttons take you to the previous slide and to the next.

• The more-information button redirects you to more information on the topic.

• The home button takes you back to the first slide.

Page 3

Alert detected

• An alert is generated by the RMM and an email is sent to the NOC.

• Our Service desk responds to the alert within minutes.

• Service desk checks if the alert is valid.

• Service desk sets the priority and directs the ticket to the correct resource in our NOC team.

Valid Invalid

Page 4

Valid Alerts

• Valid alerts are categorized and assigned a priority depending on our SLA.

• They are then assigned to L1 techs.

Urgent High Low

Work request

Ticket Life Cycle

Server Outage

Page 5

Urgent Priority

• All urgent requests receive a response within 10 minutes. In other words, an engineer is working towards problem resolution within 10 minutes.

• Urgent priority tickets (excluding server outages) are assigned directly to L2 engineers.

• An L3 engineer gets involved if the problem is not resolved within an hour.

Typical Urgent alerts

Page 6

Handling Server Outage

A server outage is categorized as urgent. The NOC performs these steps to verify whether it’s a network problem or a server crash.

1. Check for a scheduled outage.

2. Check if other devices at the same site are online.

3. Ping the site’s public IP.

4. Try to access the device from another computer in the network.
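
The four checks above can be read as a small decision procedure. The Python sketch below is one possible reading, not the NOC's documented rule: the slides list the checks but do not say how the results are combined, so the order of evaluation, the names `OutageChecks` and `classify_outage`, and the extra `SERVER_REACHABLE` outcome are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class OutageKind(Enum):
    SCHEDULED = "scheduled outage"
    SERVER_REACHABLE = "server reachable"  # assumed outcome, not on the slides
    SERVER_DOWN = "server down"
    NETWORK_DOWN = "network down"


@dataclass(frozen=True)
class OutageChecks:
    """Results of the four verification steps (field names are hypothetical)."""
    scheduled_outage: bool      # step 1
    other_devices_online: bool  # step 2
    public_ip_responds: bool    # step 3
    reachable_from_peer: bool   # step 4


def classify_outage(checks: OutageChecks) -> OutageKind:
    # Assumption: a planned outage explains the alert on its own.
    if checks.scheduled_outage:
        return OutageKind.SCHEDULED
    # Assumption: if a peer on the same network reaches the device, it is up.
    if checks.reachable_from_peer:
        return OutageKind.SERVER_REACHABLE
    # Assumption: the site counts as reachable if anything there answers.
    site_reachable = checks.other_devices_online or checks.public_ip_responds
    return OutageKind.SERVER_DOWN if site_reachable else OutageKind.NETWORK_DOWN
```

For example, `classify_outage(OutageChecks(False, True, True, False))` yields `OutageKind.SERVER_DOWN`: the site answers, but the server itself does not.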

Server Down Network Down

Page 7

Server down

If a server is confirmed as offline, the NOC performs the following actions:

1. Check if the server reboots and comes back up.

2. Access the device using iLO/DRAC.

3. If the server is virtualized, check access from the host machine.

4. Inform the customer.

Yes No

Page 8

Server reboots

Since we set all servers to reboot automatically in case of a BSOD, they mostly come back up. If they do...

1. Once the server is back online, our engineers perform a root cause analysis of the issue.

2. We implement a fix and monitor the server for another 7 days.

Page 9

Server Stays offline

Since we set all servers to reboot automatically in case of a BSOD, they mostly come back up. If they don’t...

1. We inform you that the server is offline and needs onsite attention.

2. We document the probable cause and everything we have tried in the ticket.

3. Our engineer is available to help when someone gets onsite.

Page 10

Network down

The NOC checks to see if other devices at the site are online. Yes / No

We try to ping the gateway to see if it is an internet connection issue. Yes / No

Inform Customer

Page 11

Inform Customer

• We will call a number provided by you depending on the time of day.

• We will email you about the problem, along with the findings of our investigation.

• All troubleshooting will be documented in detail in your PSA.

Page 12

High Priority

• All high priority requests receive a response within 30 minutes.

• An L2 engineer gets involved on the ticket within 60 minutes, and an L3 engineer if the problem is not resolved in 4 hours.

• We resolve most high priority tickets in 24 hours.

Typical High priority alerts

Page 13

Low priority

• All low priority requests receive a response within 4 hours.

• An L2 engineer gets involved on the ticket if it cannot be resolved after 1 hour of troubleshooting.

• We resolve most low priority tickets in 48 hours.
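
The response and escalation figures quoted on the Urgent, High and Low priority slides can be collected into one lookup table. The sketch below is illustrative only: the names `SlaTargets`, `SLA`, `response_overdue` and `expected_tier` are hypothetical, `None` marks a figure the slides do not give, and all thresholds are measured from a single clock, which is a simplifying assumption (the low priority slide counts "1 hour of troubleshooting", and the urgent slide excludes server outages from direct L2 assignment).

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaTargets:
    respond_within: timedelta
    l2_after: Optional[timedelta]            # None: no figure on the slides
    l3_after: Optional[timedelta]
    typical_resolution: Optional[timedelta]


# Figures taken from the priority slides; urgent tickets go straight to L2.
SLA = {
    "urgent": SlaTargets(timedelta(minutes=10), timedelta(0),
                         timedelta(hours=1), None),
    "high": SlaTargets(timedelta(minutes=30), timedelta(minutes=60),
                       timedelta(hours=4), timedelta(hours=24)),
    "low": SlaTargets(timedelta(hours=4), timedelta(hours=1),
                      None, timedelta(hours=48)),
}


def response_overdue(priority: str, waited: timedelta) -> bool:
    """True once a ticket has waited longer than its response target."""
    return waited > SLA[priority].respond_within


def expected_tier(priority: str, worked: timedelta) -> str:
    """Tier that should be involved after `worked` time on an open ticket."""
    targets = SLA[priority]
    if targets.l3_after is not None and worked >= targets.l3_after:
        return "L3"
    if targets.l2_after is not None and worked >= targets.l2_after:
        return "L2"
    return "L1"
```

With this table, a high priority ticket still open after two hours maps to `"L2"`, and after four hours to `"L3"`.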

Typical Low priority alerts

Page 14

Examples of urgent priority alerts

Server down alert

Critical Event viewer error

Critical RMM alert

Event viewer error that leads to critical error

Device failure caused by patch deployment

Database offline alert

Scheduled task failure

Exchange service outage alerts

Server Performance threshold alert

Page 15

Examples of high priority alerts

Non-Critical Event viewer Error

Event viewer warning

Non-Critical RMM alert

Server Anti-virus scan or update alert

Server Malware infection alert

Server Backup failure alert

Scheduled task failure

Software or RMM agent deployment

Page 16

Examples of Low priority alerts

Event viewer alerts from workstations

Performance issues on workstations

Workstation backup failure alert

RMM alert for workstations

Workstation Anti-virus scan or update alert

Workstation Malware infection alert

Software or RMM agent deployment

Workstation Patch installation failure alert

Patch approval

Page 17

Work request

• More info on work requests

• All work requests receive a response within 4 hours.

• All work requests are resolved within 24 hours.

• The time may vary depending on the scope of the request.

Page 18

Invalid alerts

• Invalid alerts are closed and properly documented.

Page 19

Ticket Life Cycle

The service desk is our front line of support. They perform the tasks below on each and every ticket. The service desk does not perform any troubleshooting.

Acknowledge

• An alert generated by your RMM creates a ticket in the PSA. For devices managed by our NOC, this alert is forwarded to the NOC’s board or queue.

Validate

• Our service desk team validates these alerts. They remove the false positives. The validated alerts are further prioritized and categorized.

Assign

• Our service desk assigns the ticket to the right resource. If something needs to be done at a later time, they also schedule it.
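
The Acknowledge, Validate and Assign steps can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not the NOC's actual tooling: the `Ticket` fields, the status strings and the function names are hypothetical, and the routing rule (urgent to L2, everything else to L1) is taken from the Valid Alerts and Urgent Priority slides while leaving out the server outage exception, whose routing the slides do not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Ticket:
    summary: str
    is_false_positive: bool = False
    priority: Optional[str] = None  # "urgent", "high" or "low"
    assignee: Optional[str] = None  # "L1" or "L2"
    status: str = "new"
    history: List[str] = field(default_factory=list)


def acknowledge(ticket: Ticket) -> None:
    """The alert has created a ticket on the NOC board or queue."""
    ticket.status = "acknowledged"
    ticket.history.append("acknowledge")


def validate(ticket: Ticket, priority: str) -> bool:
    """Close false positives; prioritize everything else."""
    ticket.history.append("validate")
    if ticket.is_false_positive:
        ticket.status = "closed"
        return False
    ticket.priority = priority
    ticket.status = "validated"
    return True


def assign(ticket: Ticket) -> None:
    """Assumed routing: urgent tickets go to L2, the rest to L1."""
    ticket.assignee = "L2" if ticket.priority == "urgent" else "L1"
    ticket.status = "assigned"
    ticket.history.append("assign")


def service_desk(ticket: Ticket, priority: str) -> Ticket:
    """Run the three service desk steps; no troubleshooting happens here."""
    acknowledge(ticket)
    if validate(ticket, priority):
        assign(ticket)
    return ticket
```

A false positive stops after validation: `service_desk(Ticket("Test alert", is_false_positive=True), "low")` ends with status `"closed"` and no assignee.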

Page 20

Assigned to L1: L1 receives tickets assigned by SD.

Our L1 team follows our internal knowledge base and documented resolutions to resolve a problem.

Resolved: The majority of tickets are resolved by the L1 team.

Unresolved: Tickets are escalated to L2.

If an input is needed, we contact you.

Monitor: Where resolution cannot be confirmed immediately.

Page 21

Escalated to L2

• When a ticket cannot be resolved with known procedures, it is escalated to L2.

• All our L2 engineers are MCITP certified and have over 3 years of experience.

• L2 engineers find the root cause and resolve the problem.

• Depending on priority, they get 30 minutes to 4 hours to research and resolve the problem.

• Any tickets that are not resolved are further escalated to L3.

Page 22

Assigned to L2: L2 receives tickets assigned by SD.

Depending on priority, L2 engineers get 30 minutes to 4 hours to research and resolve the problem.

Resolved: Resolved tickets are documented and closed.

Unresolved: Tickets are escalated to L3.

Monitor: Where resolution cannot be confirmed immediately.

If an input is needed, we contact you.

Page 23

Escalated to L3

• When a ticket cannot be resolved at L2, it is escalated to L3.

• L3 is our last tier of support. Our L3 engineers have over 6 years of experience in the field and are also subject matter experts in a field of their choice.

• In the rare circumstance that a ticket cannot be resolved by an L3 engineer, we will call you to discuss how to proceed further.

Page 24

Assigned to L3: L3 receives escalations from L2.

L3 engineers form our final tier of support.

Resolved: Resolved tickets are documented and closed.

Unresolved: Tickets are escalated.

Monitor: Where resolution cannot be confirmed immediately.
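
Across the L1, L2 and L3 slides, each tier ends a ticket in one of a few outcomes. The sketch below shows that branching as a single function. It is an illustration only: the `Outcome` enum, the `next_step` name and the returned strings are hypothetical, and treating "an input is needed" as available at every tier is an assumption (the slides list it for L1 and L2).

```python
from enum import Enum


class Outcome(Enum):
    RESOLVED = "resolved"
    UNRESOLVED = "unresolved"
    MONITOR = "monitor"
    NEEDS_INPUT = "needs input"


TIERS = ("L1", "L2", "L3")
NEXT_TIER = {"L1": "L2", "L2": "L3"}  # L3 is the last tier of support


def next_step(tier: str, outcome: Outcome) -> str:
    """Return what happens to a ticket after a tier has worked on it."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    if outcome is Outcome.RESOLVED:
        return "document and close"
    if outcome is Outcome.MONITOR:
        return "hold for monitoring"
    if outcome is Outcome.NEEDS_INPUT:
        return "contact customer"
    # Unresolved: move up one tier, or call the customer after L3.
    if tier in NEXT_TIER:
        return "escalate to " + NEXT_TIER[tier]
    return "call customer to discuss options"
```

For instance, `next_step("L2", Outcome.UNRESOLVED)` returns `"escalate to L3"`, while the same outcome at L3 returns `"call customer to discuss options"`.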

Page 25

Resolved tickets

• Resolved tickets are fully documented in the PSA.

• An appropriate time entry is added in the PSA.

• The ticket is marked closed.

• Our Quality team reviews that the ticket was properly closed.

• If a ticket was closed by an L2 or L3 engineer, the engineer creates a new solution article for the problem in our internal KB.
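
The closure steps above amount to a checklist that a quality review could run against a ticket. The following sketch is a hypothetical helper, not part of any PSA product: the function name `closure_gaps`, its parameters and the wording of the returned items are assumptions made for illustration.

```python
from typing import List


def closure_gaps(documented: bool, time_entry_added: bool,
                 closed_by: str, kb_article_created: bool) -> List[str]:
    """List the closure steps still missing for a resolved ticket."""
    gaps = []
    if not documented:
        gaps.append("document the resolution in the PSA")
    if not time_entry_added:
        gaps.append("add a time entry in the PSA")
    # Only tickets closed by L2 or L3 require a new KB solution article.
    if closed_by in ("L2", "L3") and not kb_article_created:
        gaps.append("create a solution article in the internal KB")
    return gaps
```

An empty list means the ticket meets every step listed on this slide; a ticket closed by L1 never triggers the KB item.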

Page 26

Assigned to Customer

• Any tickets that need physical access to the site are assigned to the customer.

• Tickets where more information is required for resolution are assigned to the customer.

• Only 1 in every 50 tickets will require your attention.

Page 27

Unresolved by L3

• This often means that we have reached a dead end and may need a workaround or replacement, as the problem cannot be resolved.

• Our L3 engineer will call you and discuss the available options, their downsides and the time implementation will take.

• Any changes will only be made after your approval.

Page 28

Ticket on hold for monitoring

• Some tickets may be resolved but need confirmation before closure.

• Such tickets are assigned back to the SD team and put on hold for a specified period of time.

• After the period of time has passed, the SD team checks if the issue is resolved.

• Resolved tickets are closed. Unresolved tickets are reassigned to engineers.
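
The hold-and-recheck rule above can be expressed as one small function. This is a minimal sketch under assumptions: the name `review_held_ticket`, its parameters and the returned strings are hypothetical, and the length of the hold is whatever period was specified for the ticket.

```python
from datetime import datetime, timedelta


def review_held_ticket(held_at: datetime, hold_for: timedelta,
                       now: datetime, issue_resolved: bool) -> str:
    """Decide what the SD team does with a ticket held for monitoring."""
    # The ticket stays on hold until the specified period has passed.
    if now < held_at + hold_for:
        return "on hold"
    # After the hold, SD confirms the fix or hands the ticket back.
    return "closed" if issue_resolved else "reassigned to engineer"
```

As a usage example, the 7-day monitoring window mentioned on the Server reboots slide would be passed as `hold_for=timedelta(days=7)`.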

Page 29

End of ticket Life Cycle

This brings us to the end of the Ticket Life Cycle section. Press the back button below to go to the previous section. Click home to get to the beginning of the slide show.

Page 30

This document contains information that is considered proprietary and confidential. No information contained in this document may be released, re-printed, or redistributed without prior permission from ITimpulse.

To learn more about how to get started with NOC services, our NOC onboarding process, and how we integrate with your existing tools to deliver seamless NOC services, schedule a web demo with us.

Email [email protected] to schedule a live demo.

For further inquiries and information, please feel free to contact us at:
US: +1 646-351-8634
India: +91 020-6500-2328
Email: [email protected]
Website: www.itimpulse.in
Direct mail: ITimpulse, B112, Ganga Osian Square, Wakad, Pune – 411057

ITimpulse provides RMM agnostic, White label NOC services for MSPs