A Deep Dive into Nagios Analytics Alexis Lê-Quôc (@alq) http://datadoghq.com

Posted on 02-Jan-2016

Page 1: A Deep Dive into Nagios Analytics

A Deep Dive into Nagios Analytics

Alexis Lê-Quôc (@alq) http://datadoghq.com

Page 2: A Deep Dive into Nagios Analytics

@alq

Dev & Ops

Nagios user since 2008

Datadog co-founder

Page 3: A Deep Dive into Nagios Analytics

A little survey

Page 4: A Deep Dive into Nagios Analytics

Top 3 failed checks

Page 5: A Deep Dive into Nagios Analytics

Top 3 failed checks

That I responded to last week

That woke me up

That most of my team responded to at least once

That impact our business the most?

That I responded to 5 weeks ago

Page 7: A Deep Dive into Nagios Analytics

Using memory to prioritize remediation...

At best, finding local optima

At worst, Brownian motion

Page 8: A Deep Dive into Nagios Analytics

Analytics

Page 9: A Deep Dive into Nagios Analytics

Performance Metrics

Nagios Traffic

Other Sources

In the “Cloud”

Page 10: A Deep Dive into Nagios Analytics

Nagios: a “chatty” source

out of the 40+ that Datadog supports

Page 11: A Deep Dive into Nagios Analytics

One example

Page 12: A Deep Dive into Nagios Analytics
Page 13: A Deep Dive into Nagios Analytics

Almost 13,000 Nagios “events” over the past week
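
A weekly event count like this one can be reproduced from a raw Nagios log. The sketch below is a minimal illustration, not the method used for the talk: it assumes the standard `[epoch] TYPE: field;field;...` log line layout, and the sample lines, host names, and function names are made up for the example.

```python
import re
from collections import Counter

# Nagios log lines look like "[epoch] EVENT TYPE: field;field;..."
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")

def parse_line(line):
    """Return (timestamp, event_type, fields), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts, event_type, rest = m.groups()
    return int(ts), event_type, rest.split(";")

def count_events(lines, start, end):
    """Count events per type with start <= timestamp < end."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed and start <= parsed[0] < end:
            counts[parsed[1]] += 1
    return counts

# Hypothetical sample lines, for illustration only.
SAMPLE = [
    "[1357000000] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;Connection refused",
    "[1357000060] SERVICE NOTIFICATION: oncall;web01;HTTP;CRITICAL;notify-by-email;Connection refused",
    "[1357000120] HOST ALERT: db01;DOWN;SOFT;1;CRITICAL - Host Unreachable",
    "[1357000180] Auto-save of retention data completed successfully.",
]

WEEK = 7 * 86400
weekly = count_events(SAMPLE, 1357000000, 1357000000 + WEEK)
print(sum(weekly.values()), dict(weekly))
```

Lines without an upper-case event type, such as the retention message, are skipped rather than counted.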

Page 14: A Deep Dive into Nagios Analytics

Constant stream

Page 15: A Deep Dive into Nagios Analytics

86 notifications!

Page 16: A Deep Dive into Nagios Analytics

Pattern

Page 17: A Deep Dive into Nagios Analytics

Pattern

Page 18: A Deep Dive into Nagios Analytics

More data? More questions.

Page 19: A Deep Dive into Nagios Analytics

A dialog with data

Not a scientific study

Page 20: A Deep Dive into Nagios Analytics

Population

Quartile:  25%  50%  75%  100%
Value:      20   93  322   904

Page 21: A Deep Dive into Nagios Analytics

Does size matter?

Page 22: A Deep Dive into Nagios Analytics

Weekly count per host split by quartile

Page 23: A Deep Dive into Nagios Analytics

Weekly count per host split by quartile

Outliers: sick hosts, silenced checks
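
One way to surface such outliers from a log is to count alerts per host and flag the hosts far above the rest. The sketch below assumes the standard alert line layout (host as the first field) and uses Tukey's 1.5 × IQR fence as the cut-off. That rule is an assumption of this example, not something the slides name, and the function names are hypothetical.

```python
from collections import Counter
from statistics import quantiles

def alerts_per_host(lines):
    """Count SERVICE ALERT and HOST ALERT lines per host (first field after the type)."""
    counts = Counter()
    for line in lines:
        for marker in ("] SERVICE ALERT: ", "] HOST ALERT: "):
            if marker in line:
                host = line.split(marker, 1)[1].split(";", 1)[0]
                counts[host] += 1
                break
    return counts

def outlier_hosts(counts, k=1.5):
    """Hosts whose count exceeds Q3 + k * IQR (an assumed rule, not from the talk)."""
    if len(counts) < 2:
        return []
    q1, _, q3 = quantiles(counts.values(), n=4)
    fence = q3 + k * (q3 - q1)
    return sorted(host for host, n in counts.items() if n > fence)
```

A host flagged this way may be genuinely sick or may carry a check that should have been silenced. The count alone cannot tell the two apart.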

Page 24: A Deep Dive into Nagios Analytics

Notifications

Page 25: A Deep Dive into Nagios Analytics

Notifications

1-3% of alerts notify

Little difference per quartile
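
A ratio like this can be estimated by comparing notification lines with alert lines in the same log. The sketch below is a rough illustration that reads "1-3% of alerts notify" as notifications divided by alerts. Both that reading and the function name are assumptions of the example.

```python
def notification_ratio(lines):
    """Return notifications / alerts for a sequence of Nagios log lines."""
    alerts = notifications = 0
    for line in lines:
        # The event type sits between "] " and the first ": ".
        head = line.split("] ", 1)[-1].split(": ", 1)[0]
        if head in ("SERVICE ALERT", "HOST ALERT"):
            alerts += 1
        elif head in ("SERVICE NOTIFICATION", "HOST NOTIFICATION"):
            notifications += 1
    return notifications / alerts if alerts else 0.0
```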

Page 26: A Deep Dive into Nagios Analytics

Does time of day matter?

Page 27: A Deep Dive into Nagios Analytics
Page 28: A Deep Dive into Nagios Analytics

Mean about the same across quartiles

Time-based deviation?
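
The time-of-day question can be probed by bucketing alert timestamps by hour and looking at both the mean and the spread across days. The sketch below is one possible approach under stated assumptions: timestamps are epoch seconds, hours are taken in UTC, and the population standard deviation stands in for "deviation". None of these choices come from the slides.

```python
from collections import Counter
from datetime import datetime, timezone
from statistics import mean, pstdev

def hourly_profile(timestamps):
    """Map each hour (0-23) to (mean, std dev) of the alert count per day in that hour."""
    per_day_hour = Counter()
    days = set()
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        per_day_hour[(dt.date(), dt.hour)] += 1
        days.add(dt.date())
    if not days:
        return {}
    profile = {}
    for hour in range(24):
        # Days with no alert in this hour count as zero.
        counts = [per_day_hour[(day, hour)] for day in days]
        profile[hour] = (mean(counts), pstdev(counts))
    return profile
```

An hour with a mean close to the other hours but a much larger deviation would point to occasional bursts rather than a steady daily pattern.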

Page 29: A Deep Dive into Nagios Analytics

Does the day of week matter?

Page 30: A Deep Dive into Nagios Analytics
Page 31: A Deep Dive into Nagios Analytics

Not really
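
A day-of-week check takes only a few lines once the alert timestamps are extracted. The sketch below assumes epoch-second timestamps interpreted in UTC, and the function name is made up for the example.

```python
from collections import Counter
from datetime import datetime, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_counts(timestamps):
    """Count alerts per weekday; weekday() returns 0 for Monday."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).weekday() for ts in timestamps
    )
    return {DAYS[i]: counts[i] for i in range(7)}
```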

Page 32: A Deep Dive into Nagios Analytics

Squeaky wheels? (checks)

Page 33: A Deep Dive into Nagios Analytics

Outlier

Page 34: A Deep Dive into Nagios Analytics

Outlier in more detail

Page 35: A Deep Dive into Nagios Analytics

Long Tail
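
A long tail shows up when checks are ranked by alert count next to a running share of the total. The sketch below assumes the standard `SERVICE ALERT: host;service;...` layout and treats the service description as the check name. The function name is hypothetical.

```python
from collections import Counter

def check_ranking(lines):
    """Return [(check, count, cumulative_share)] sorted by descending count."""
    counts = Counter()
    for line in lines:
        if "] SERVICE ALERT: " in line:
            fields = line.split("] SERVICE ALERT: ", 1)[1].split(";")
            if len(fields) >= 2:
                counts[fields[1]] += 1  # second field is the service description
    total = sum(counts.values())
    ranking, running = [], 0
    for check, n in counts.most_common():
        running += n
        ranking.append((check, n, running / total))
    return ranking
```

If the first few rows already account for most of the total, those checks are the squeaky wheels and the remaining rows form the tail.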

Page 36: A Deep Dive into Nagios Analytics

Squeaky wheels? (hosts)

Page 37: A Deep Dive into Nagios Analytics

Same outlier

Page 38: A Deep Dive into Nagios Analytics

Similar pattern as checks

Page 39: A Deep Dive into Nagios Analytics

Long Tail

Page 40: A Deep Dive into Nagios Analytics

Recurring alerts

Page 41: A Deep Dive into Nagios Analytics

[Chart: alert age (Young to Old) against frequency (Seldom happens to Happens often)]

Page 42: A Deep Dive into Nagios Analytics

Happen once in a while

Occur often, for a long time: tolerated
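
The age and frequency split can be approximated per (host, check) pair from its alert timestamps. The sketch below is only an illustration: the thresholds (30 days for "old", 7 alerts per week for "often") and the function name are arbitrary assumptions, not values from the talk.

```python
def classify(alert_times, now, old_after_days=30, often_per_week=7):
    """Classify one (host, check) pair by age and alert frequency.

    alert_times: non-empty list of epoch seconds; now: epoch seconds.
    Thresholds are illustrative assumptions.
    """
    age_days = (now - min(alert_times)) / 86400
    weeks = max(age_days / 7, 1)  # avoid inflating the rate for very young alerts
    per_week = len(alert_times) / weeks
    age = "old" if age_days >= old_after_days else "young"
    freq = "often" if per_week >= often_per_week else "seldom"
    return age, freq
```

Pairs that come back as `("old", "often")` match the "tolerated" group: they fire often, have done so for a long time, and nobody has fixed them.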

Page 43: A Deep Dive into Nagios Analytics

More data? More questions.

Page 44: A Deep Dive into Nagios Analytics

HOWTO?

Page 45: A Deep Dive into Nagios Analytics

Find out tomorrow!

Awk

Postgres

R

d3
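
The slides do not show how these tools were chained together. As one possible first step, and purely as an assumption of this example, service alerts can be flattened into CSV, a format that Postgres (`COPY ... CSV`) and R (`read.csv`) can both load. The column names and function name below are hypothetical.

```python
import csv
import io

def alerts_to_csv(lines):
    """Convert SERVICE ALERT log lines to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "host", "service", "state", "state_type", "attempt"])
    for line in lines:
        if "] SERVICE ALERT: " not in line:
            continue
        stamp, rest = line.strip().split("] SERVICE ALERT: ", 1)
        fields = rest.split(";")
        if len(fields) < 5:
            continue  # skip truncated lines
        writer.writerow([stamp.lstrip("[")] + fields[:5])
    return out.getvalue()
```

The free-text plugin output is dropped here to keep the table narrow. It could be kept as a seventh column, since the `csv` module quotes embedded commas.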

Page 46: A Deep Dive into Nagios Analytics

Presentation matters

Page 47: A Deep Dive into Nagios Analytics
Page 48: A Deep Dive into Nagios Analytics

Take-away?

Page 49: A Deep Dive into Nagios Analytics

Take-aways

• Don’t rely on your memory

• Your Nagios logs are a treasure trove

• Have a dialog with your data

• Presentation matters