this one weird time series trick

THIS ONE WEIRD TIME-SERIES TRICK Baron Schwartz Monitorama 2014 Portland

Upload: vividcortex

Post on 19-Jul-2015


TRANSCRIPT

Page 1: This One Weird Time Series Trick

THIS ONE WEIRD TIME-SERIES TRICK

Baron Schwartz

Monitorama 2014 Portland

Page 2: This One Weird Time Series Trick

PERFORMANCE MANAGEMENT

Optimization, Backups, Replication, and more

Baron Schwartz, Peter Zaitsev &

Vadim Tkachenko

High Performance MySQL

3rd Edition

Covers Version 5.5

Baron Schwartz - [email protected] - @xaprb

YES, WE ARE HIRING

Page 3: This One Weird Time Series Trick
Page 4: This One Weird Time Series Trick

WHAT IS TYPICAL?

Page 5: This One Weird Time Series Trick

BUT FIRST...

Page 6: This One Weird Time Series Trick

WORLDVIEW

• We need performance management, not monitoring.

• Work-getting-done is top priority.

• We need more than recipes.

Page 7: This One Weird Time Series Trick

ANOMALY DETECTION

• Trending and prediction are not anomaly detection.

• Anomalousness is not true-or-false, it’s a probability.

• Anomalous and bad are not the same.

• “There is an anomaly” implies a statistical framework.

Page 8: This One Weird Time Series Trick

THANK YOU

• Any questions?

Page 9: This One Weird Time Series Trick

HOLY GRAIL*

• Determine “normal” behavior

• Predict how a metric “should” behave

• Quantify deviation from normality

• Do useful* stuff

* For the record, this is not something I value highly, but it’s worth talking about.

Page 10: This One Weird Time Series Trick

OUT OF SCOPE

• All that stuff is potentially too practical for this talk.

• I just want to talk about a fun tool I use sometimes.

• I’d rather play with math than do anything useful.

Page 11: This One Weird Time Series Trick

WHY GIVE THIS TALK?

Toufic already gave the beginning and end.

Page 12: This One Weird Time Series Trick

IT’S A TIME SERIES WORLD

Ability to process cheaply is important.

Page 13: This One Weird Time Series Trick
Page 14: This One Weird Time Series Trick

HOW MANY OF YOU KNOW...

Binomial Theorem

Standard Deviation

Standard Error

Random Walk

T-Statistic, R-Squared

Pearson’s Correlation

Heteroscedasticity

Homoscedasticity

ANOVA

ARIMA

YAGNI

Page 15: This One Weird Time Series Trick

TRIED KOLMOGOROV-SMIRNOV TEST

Bartender didn’t have any

Page 17: This One Weird Time Series Trick

SERIOUSLY

• Basic statistics is good for a lot of amazing things.

• Use simple, efficient approaches if you can.

• Hat tip to Neil Gunther

• Advanced stuff is there if you need it; use an advisor.

• Hat tip to Neil Gunther

• Don’t let the clowns and naysayers stop you.

Page 18: This One Weird Time Series Trick

• http://blog.b3k.us/2012/06/08/the-engineering-activity-spectrum.html

Metrics Analysis

Page 19: This One Weird Time Series Trick

BEGIN AT THE BEGINNING

Page 20: This One Weird Time Series Trick

ANOMALY DETECTION

Characterization of Past

Central Tendency

Forecast/Predict

Deviation

Anomaly

Page 21: This One Weird Time Series Trick
Page 22: This One Weird Time Series Trick

CENTRAL TENDENCIES

The mother of all central tendencies is the average.*

Page 23: This One Weird Time Series Trick

CONTROL CHARTS

Is the process within normal limits?
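
A minimal Python sketch of a classic control chart, assuming the common convention of limits at the mean plus or minus three standard deviations of a baseline sample. The function names, the `k` multiplier, and the sample data are illustrative assumptions, not taken from the talk.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits computed from a baseline sample."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - k * spread, center, center + k * spread


def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, _, upper = limits
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical data: a quiet baseline, then new points to check against it.
baseline = [10, 12, 11, 9, 10, 11, 10, 12, 9, 11]
limits = control_limits(baseline)
flagged = out_of_control([10, 11, 25, 9], limits)
print(flagged)
```

The limits are fixed once from the baseline, which is exactly why this form only suits a process whose mean does not drift.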

Page 24: This One Weird Time Series Trick

PROBLEM

Control charts assume a stationary mean.

Systems are “less normal” than we assume, in both senses.

Page 25: This One Weird Time Series Trick

RECENCY

What is a system’s “recent” normal?

Page 26: This One Weird Time Series Trick

MOVING AVERAGE

Average over a window of recent data
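
A short Python sketch of a windowed moving average, assuming a fixed window measured in samples. The class name and the running-total approach are illustrative choices. Note that it has to keep every sample in the window in memory.

```python
from collections import deque


class MovingAverage:
    """Average over the most recent `window` samples."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.samples = deque(maxlen=window)
        self.total = 0.0

    def add(self, x):
        """Add a sample and return the current windowed average."""
        if len(self.samples) == self.samples.maxlen:
            # The oldest sample is about to be evicted; drop it from the total.
            self.total -= self.samples[0]
        self.samples.append(x)
        self.total += x
        return self.total / len(self.samples)


ma = MovingAverage(3)
averages = [ma.add(x) for x in [1, 2, 3, 4, 5]]
print(averages)
```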

Page 27: This One Weird Time Series Trick

MOVING CONTROL CHARTS

Page 28: This One Weird Time Series Trick

PROBLEMS

Moving average is “more expensive” to compute

Moving average is influenced by “distant” past

Page 29: This One Weird Time Series Trick

“These days should be remembered and kept throughout every generation” (Esther 9:28)

REMEMBER ALL THE THINGS

Page 30: This One Weird Time Series Trick

EXPONENTIAL MOVING AVERAGES

• Infinite memory, biased towards recent history (past data trails off to nothing)

• Cheap to compute

• Choose a decay factor α

• S_t = α x_t + (1-α) S_{t-1}
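
The recurrence above can be sketched in a few lines of Python. The seed value is not given on the slide; this sketch assumes the first smoothed value is simply the first sample, and the function name is illustrative.

```python
def ewma(values, alpha):
    """Yield S_t = alpha * x_t + (1 - alpha) * S_{t-1} for each sample.

    Assumption: the series is seeded with S_0 = x_0.
    """
    smoothed = None
    for x in values:
        if smoothed is None:
            smoothed = x  # seed with the first sample
        else:
            smoothed = alpha * x + (1 - alpha) * smoothed
        yield smoothed


print(list(ewma([10, 20, 30], alpha=0.5)))
```

Only one number of state is kept per series, which is what makes this cheap compared with storing a whole window.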

Page 31: This One Weird Time Series Trick


Page 32: This One Weird Time Series Trick

EWMA = LOW-PASS FILTER

Page 33: This One Weird Time Series Trick

CHOOSING DECAY

α = 2/(N+1), where N is the window length of the simple moving average whose average sample age the EWMA should match
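
One way to read this rule, sketched in Python: with α = 2/(N+1), the weighted mean age of the samples in an EWMA is (1-α)/α = (N-1)/2, which is also the mean age of the samples in an N-sample simple moving average (ages 0 through N-1). The function names are illustrative assumptions.

```python
def alpha_for_window(n):
    """Decay factor matching an n-sample simple moving average."""
    return 2.0 / (n + 1)


def average_age(alpha):
    """Mean age of samples in an EWMA, counting the newest sample as age 0.

    The weight on a sample of age k is alpha * (1 - alpha) ** k,
    so the weighted mean age is (1 - alpha) / alpha.
    """
    return (1.0 - alpha) / alpha


alpha = alpha_for_window(9)
print(alpha, average_age(alpha))
```

For N = 9 this gives α = 0.2 and a mean age of about 4 samples, the same as (9-1)/2.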

Page 34: This One Weird Time Series Trick

EXPONENTIAL MOVING CONTROL CHARTS

• Need exponential moving average - easy

• Need exponential moving standard deviation - hmm.

• Standard deviation = square root of variance

• Variance = “mean of square minus square of mean”

• MVP solution: exponential moving avg of squared values
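
The bullets above can be sketched as follows: keep one EWMA of the values and one EWMA of the squared values, then derive the variance from the two. The class name, the three-sigma multiplier `k`, the seeding with the first sample, and the choice to test each point against limits built from earlier points only are all assumptions of this sketch.

```python
import math


class ExponentialControlChart:
    """Control chart built from an EWMA of x and an EWMA of x squared."""

    def __init__(self, alpha, k=3.0):
        self.alpha = alpha
        self.k = k
        self.mean = None      # EWMA of the values
        self.mean_sq = None   # EWMA of the squared values

    def check(self, x):
        """Return True if x is outside the current limits, then absorb x."""
        if self.mean is None:
            # Seed both averages with the first sample.
            self.mean, self.mean_sq = x, x * x
            return False
        # Variance = mean of square minus square of mean (clamped at zero
        # to absorb floating-point rounding).
        variance = max(self.mean_sq - self.mean ** 2, 0.0)
        sd = math.sqrt(variance)
        anomalous = sd > 0 and abs(x - self.mean) > self.k * sd
        a = self.alpha
        self.mean = a * x + (1 - a) * self.mean
        self.mean_sq = a * x * x + (1 - a) * self.mean_sq
        return anomalous


chart = ExponentialControlChart(alpha=0.2)
flags = [chart.check(x) for x in [10, 12, 10, 12, 10, 12, 11, 50]]
print(flags)
```

Because anomalous points are still absorbed into both averages, a large spike widens the limits for a while afterwards; whether that is desirable depends on the metric.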

Page 35: This One Weird Time Series Trick

EMCC

Page 36: This One Weird Time Series Trick

SHORTCOMINGS

• Works well when data is approximately normally distributed

• Non-Gaussian data throws “standard deviation” for a loop; false positives ensue

• Requires more advanced techniques

• STILL USEFUL ANYWAY.

Page 37: This One Weird Time Series Trick

EWMA WITH TREND

Page 38: This One Weird Time Series Trick

DOUBLE EXPONENTIAL SMOOTHING

• Predict the series based on the average and the trend

• Need decay factors α, β

• S_t = α x_t + (1-α)(S_{t-1} + B_{t-1})

• B_t = β(S_t - S_{t-1}) + (1-β) B_{t-1}
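
A Python sketch of the two recurrences above. The slide does not say how to initialize the level and trend; this sketch assumes the level starts at the first sample and the trend at the difference between the first two samples. The function name and the one-step-ahead forecast S_t + B_t are illustrative.

```python
def double_exponential_smoothing(values, alpha, beta):
    """Return one-step-ahead forecasts S_t + B_t.

    forecasts[i] predicts values[i + 1]; the last entry predicts the
    next, not yet observed, sample.
    Assumption: S_0 = x_0 and B_0 = x_1 - x_0.
    """
    if len(values) < 2:
        raise ValueError("need at least two samples to seed the trend")
    level = values[0]
    trend = values[1] - values[0]
    forecasts = [level + trend]
    for x in values[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        forecasts.append(level + trend)
    return forecasts


print(double_exponential_smoothing([1, 2, 3, 4, 5], alpha=0.5, beta=0.5))
```

On a perfectly linear series the level and trend never need correcting, so each forecast lands exactly on the next point.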

Page 39: This One Weird Time Series Trick

DOUBLE EXPONENTIAL SMOOTHING

Page 40: This One Weird Time Series Trick

HOLT-WINTERS FORECASTING

• Holt-Winters Forecasting adds seasonality indexes

• Relatively complex, expensive, slow to train, learns to expect anomalies

Page 41: This One Weird Time Series Trick

DIMINISHING RETURNS

• DES and H-W get complicated fast

• They work fine on non-Gaussian data.

• But Gaussian anomaly detection won’t.

• Lots of parameters; choosing them requires:

• Work, prior knowledge, complex optimization; or

• Machine learning

Page 42: This One Weird Time Series Trick

MACD

• Moving Average Convergence-Divergence

• Difference of two EMAs with different decay factors
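
A minimal Python sketch of the idea: run a fast and a slow exponential moving average over the same series and subtract. The function names, the seeding with the first sample, and the example decay factors are assumptions of this sketch.

```python
def ema(values, alpha):
    """Exponential moving average, seeded with the first sample."""
    out = []
    smoothed = None
    for x in values:
        smoothed = x if smoothed is None else alpha * x + (1 - alpha) * smoothed
        out.append(smoothed)
    return out


def macd(values, fast_alpha, slow_alpha):
    """Fast EMA minus slow EMA, point by point."""
    fast = ema(values, fast_alpha)
    slow = ema(values, slow_alpha)
    return [f - s for f, s in zip(fast, slow)]


# Hypothetical step change: the difference moves away from zero after the jump.
print(macd([0, 0, 10, 10], fast_alpha=0.5, slow_alpha=0.1))
```

On a flat series both averages agree and the difference stays at zero; after a level shift the fast average reacts first, so the difference swings in the direction of the shift.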

Page 43: This One Weird Time Series Trick

MACD

Page 44: This One Weird Time Series Trick

QUESTIONS?

@xaprb • linkedin.com/in/xaprb

baron - at - vividcortex.com

Page 45: This One Weird Time Series Trick

CREDITS

• car crash

• squirrel

• highway

• nerdy kid

• boarded up windows

• grail

• out of scope

• drill press

• alice in wonderland

• owl

• singapore opera

• chess

• decaying leaf

• spiderweb

• train tracks

• eye

• spiral

Page 46: This One Weird Time Series Trick

REGRESSION

Add(x, y) {
    n++
    sx += x
    sy += y
    sxx += x * x
    sxy += x * y
    syy += y * y
}

Slope() {
    ss_xy = n*sxy - sx*sy
    ss_xx = n*sxx - sx*sx
    return ss_xy/ss_xx
}

Intercept() {
    return (sy - Slope()*sx)/n
}

// And so on
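
A runnable Python rendering of the pseudocode above, keeping the slide's variable names. The class name is illustrative, and the `r_squared` method is one guess at what "and so on" might include; it is not from the slide.

```python
class RunningRegression:
    """Simple linear regression from running sums, one point at a time."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = 0.0
        self.sxx = self.sxy = self.syy = 0.0

    def add(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y
        self.syy += y * y

    def slope(self):
        # Raises ZeroDivisionError until at least two distinct x values are seen.
        ss_xy = self.n * self.sxy - self.sx * self.sy
        ss_xx = self.n * self.sxx - self.sx * self.sx
        return ss_xy / ss_xx

    def intercept(self):
        return (self.sy - self.slope() * self.sx) / self.n

    def r_squared(self):
        # Assumption: one possible use of syy, not shown on the slide.
        ss_xy = self.n * self.sxy - self.sx * self.sy
        ss_xx = self.n * self.sxx - self.sx * self.sx
        ss_yy = self.n * self.syy - self.sy * self.sy
        return (ss_xy * ss_xy) / (ss_xx * ss_yy)


reg = RunningRegression()
for x in range(5):
    reg.add(x, 2 * x + 1)  # points on the line y = 2x + 1
print(reg.slope(), reg.intercept(), reg.r_squared())
```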

Page 47: This One Weird Time Series Trick

EWMA REGRESSION

Add(x, y) {
    // replace sx += x by the following
    // (here a weights the old value, so a = 1 - α in the EWMA formula):
    sx = sx*a + (1-a)*x

    // repeat as needed; left as an
    // exercise for the reader
}
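
One possible way to complete the exercise, sketched in Python. It rests on assumptions the slide does not state: every running sum becomes an exponentially weighted average seeded with the first point, and because those averages carry weights that sum to 1, the count `n` in the slope and intercept formulas is treated as 1. The class name is illustrative, and `a` weights the old value as on the slide.

```python
class EwmaRegression:
    """Linear regression over exponentially weighted averages."""

    def __init__(self, a):
        self.a = a  # weight on the old value, as on the slide
        self.seeded = False
        self.sx = self.sy = self.sxx = self.sxy = self.syy = 0.0

    def _blend(self, old, new):
        return old * self.a + (1 - self.a) * new

    def add(self, x, y):
        if not self.seeded:
            # Assumption: seed every average with the first point.
            self.sx, self.sy = x, y
            self.sxx, self.sxy, self.syy = x * x, x * y, y * y
            self.seeded = True
            return
        self.sx = self._blend(self.sx, x)
        self.sy = self._blend(self.sy, y)
        self.sxx = self._blend(self.sxx, x * x)
        self.sxy = self._blend(self.sxy, x * y)
        self.syy = self._blend(self.syy, y * y)

    def slope(self):
        # Same formula as the running-sum version with n = 1.
        return (self.sxy - self.sx * self.sy) / (self.sxx - self.sx * self.sx)

    def intercept(self):
        return self.sy - self.slope() * self.sx


reg = EwmaRegression(a=0.8)
for x in range(10):
    reg.add(x, 2 * x + 1)  # points on the line y = 2x + 1
print(reg.slope(), reg.intercept())
```

On data that lies exactly on a line, any weighting recovers that line; the weighting only matters once the relationship drifts, at which point recent points dominate the fit.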