Introduction to Time Series Data and Analysis
Simon Taylor, Department of Mathematics and Statistics
20 April 2016
Contents
What is Time Series Data?
Analysis Tools
• Trace Plot
• Auto-Correlation Function
• Spectrum
Time Series Models
• Moving Average
• Auto-Regressive
Further Topics
What is Time Series Data?
A time series is a set of observations made sequentially through time.
Examples:
• Changes in execution time, RAM or bandwidth usage.
• The number of times a piece of software has run in consecutive periods of time.
• Financial, geophysical, marketing, demographic data, etc.
The objectives in time series analysis are:
• Description – How does the data vary over time?
• Explanation – What causes the observed variation?
• Prediction – What are the future values of the series?
• Control – Aim to improve control over the process.
Common Questions
Q: How important is preserving data order?
A: Very! Changing the data order breaks the dependence between measurements.

Q: How frequently do I need to take measurements?
A: It depends:
• Too sparse: risk missing the dependence structure.
• Too frequent: swamped with noise.
Figure: Sampling frequency, with the same signal sampled at 1 Hz, 10 Hz and 100 Hz.
Why are time series important in benchmarking?

Q: Can I use simple summary statistics?
A: You can, but they only describe overall properties.

Q: Can't I just interpolate between data points?
A: Signals are often subject to uncontrollable random noise, so the error from interpolation may be large if the noise is large.
Figure: Three time series X, Y and Z with x̄ = 0 and s² = 1.
Analysis Tools – Trace Plot
A trace plot is a graph of the measurements against time.
Easy to visually identify key features:
• Trends – Long-term trend in the mean level.
• Seasonality – Regular peaks & falls in the measurements.
• Outliers – Unusual measurements that are inconsistent with the rest of the data.
• Discontinuities – Abrupt change to the underlying process.
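A trace plot takes only a few lines in most environments. The sketch below is my own illustration (assuming Python with NumPy and matplotlib, not tooling from the slides): it builds a synthetic series containing a trend, a seasonal cycle and an injected outlier, then plots it against time.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Hypothetical benchmark series: linear trend + seasonal cycle + noise,
# with one injected outlier, so several key features are visible in the trace.
rng = np.random.default_rng(0)
t = np.arange(200)
x = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 25) + rng.normal(0.0, 0.5, t.size)
x[150] += 8.0  # injected outlier

fig, ax = plt.subplots()
ax.plot(t, x)
ax.set_xlabel("Time")
ax.set_ylabel("Measurement")
ax.set_title("Trace plot")
fig.savefig("trace_plot.png")
```

In the saved figure, the upward drift, the regular peaks and the single spike at t = 150 are each visible by eye, which is exactly the point of the trace plot.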
Analysis Tools – Auto-correlation function
Correlation measures the linear dependence between two data sets.

Auto-correlation measures the correlation between all data pairs at lag k apart:

rk = Σ_{t=1}^{T−k} (xt − x̄)(xt+k − x̄) / ((T − 1)s²),

where x̄ and s² are the sample mean and variance.
Figure: Lag 5 ACF calculation.
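The formula for rk translates directly into code. This is a minimal NumPy sketch (the function name `acf` and the white-noise test series are my own, not from the slides):

```python
import numpy as np

def acf(x, max_lag):
    """r_k = sum_{t=1}^{T-k} (x_t - xbar)(x_{t+k} - xbar) / ((T - 1) s^2)."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    xbar = x.mean()
    s2 = x.var(ddof=1)  # sample variance with divisor T - 1
    return np.array([((x[:T - k] - xbar) * (x[k:] - xbar)).sum() / ((T - 1) * s2)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
r = acf(rng.standard_normal(2000), max_lag=5)
# r[0] equals 1 exactly; for white noise the higher lags sit near zero.
```

Note that r0 = 1 by construction, since the numerator at lag 0 is exactly (T − 1)s².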
Analysis Tools – Spectrum
The spectrum describes how the power in a time series varies across frequencies:

I(ω) = (1/(πT)) |Σ_{t=1}^{T} xt e^{i2πtω}|²,

for ω ∈ (0, 1/2].

Identifies prominent seasonal and cyclic variation.
Figure: Fourier decomposition and spectrum of time series Xt.
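The periodogram I(ω) can be evaluated directly at the Fourier frequencies ω = j/T. A small sketch under my own assumptions (the `periodogram` function and the pure-cosine test signal are illustrative; note that the sign of the exponent does not matter because of the modulus):

```python
import numpy as np

def periodogram(x):
    """I(w) = (1/(pi*T)) |sum_{t=1}^T x_t e^{i 2 pi t w}|^2 at w = j/T in (0, 1/2]."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    t = np.arange(1, T + 1)
    freqs = np.arange(1, T // 2 + 1) / T
    dft = np.array([np.sum(x * np.exp(2j * np.pi * t * w)) for w in freqs])
    return freqs, np.abs(dft) ** 2 / (np.pi * T)

# A pure cosine with period 8 should concentrate its power at w = 1/8.
t = np.arange(1, 257)
x = np.cos(2 * np.pi * t / 8)
freqs, spec = periodogram(x)
peak = freqs[np.argmax(spec)]  # expected: 0.125
```

For real analyses an FFT-based routine (e.g. `scipy.signal.periodogram`) is far faster; the loop above just mirrors the formula.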
Time series models
Let X1:T = {X1, . . . ,XT} denote a sequence of T measurements.
A time series is stationary if, for any t and lag k, the joint distributions of X1:t and X(1+k):(t+k) are the same.

A time series is weakly stationary if the first two moments are constant over time:

E[Xt] = µ and Cov(Xt, Xt+k) = γ(k).
Gaussian White Noise Process, GWNP
The time series {Zt} follows a Gaussian white noise process if:
Zt ∼ N(0, σ²) independently, for t = 1, . . . , T.
Gaussian White Noise Process
Figure: Gaussian white noise process.
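A quick simulation confirms that a GWNP satisfies the weak-stationarity conditions empirically. This sketch is my own (seed and sample size are arbitrary choices):

```python
import numpy as np

# Simulate a Gaussian white noise process and check its moments:
# constant mean 0, constant variance sigma^2, and no autocorrelation.
rng = np.random.default_rng(42)
sigma = 2.0
z = rng.normal(0.0, sigma, size=10_000)

mean_hat = z.mean()                          # theory: 0
var_hat = z.var(ddof=1)                      # theory: sigma^2 = 4
lag1_hat = np.corrcoef(z[:-1], z[1:])[0, 1]  # theory: 0
```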
MA(q) process
Moving Average Process of Order q, MA(q)
The process {Xt} is a moving average process of order q if:
Xt = β0Zt + β1Zt−1 + · · ·+ βqZt−q
where {Zt} is a GWNP and β0, . . . , βq are constants (β0 = 1).
Expectation: E[Xt ] = 0.
Auto-covariance:

Cov(Xt, Xt+k) = σ² Σ_{i=0}^{q−|k|} βi βi+|k| for |k| = 0, . . . , q; and 0 otherwise.
MA(q) process
Figure: Left: MA(1), β1 = 0.9. Right: MA(2), (β1, β2) = (−0.4,0.9).
AR(p) process
Autoregressive Process of Order p, AR(p)
The process {Yt} is an autoregressive process of order p if:
Yt = α1Yt−1 + · · ·+ αpYt−p + Zt
where {Zt} is a GWNP and α1, . . . , αp are constants.
Expectation: E[Yt] = 0.

Auto-covariance for AR(1):

Cov(Yt, Yt+k) = σ² α1^|k| / (1 − α1²), provided |α1| < 1.
AR(p) process
Figure: Left: AR(1), α1 = 0.9. Right: AR(2), (α1, α2) = (0.8,−0.64).
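The AR(1) auto-covariance formula can likewise be verified by simulation. This sketch is my own (α1 = 0.9 as in the left panel; the burn-in length is an arbitrary choice):

```python
import numpy as np

# Simulate Y_t = 0.9 Y_{t-1} + Z_t and check the AR(1) formula:
# Var[Y_t] = sigma^2 / (1 - alpha1^2), lag-1 correlation = alpha1.
rng = np.random.default_rng(7)
alpha1, sigma, T = 0.9, 1.0, 200_000
z = rng.normal(0.0, sigma, size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = alpha1 * y[t - 1] + z[t]

y = y[1_000:]                     # drop the transient from the y_0 = 0 start
var_hat = y.var()                 # theory: 1 / (1 - 0.81), about 5.26
lag1_hat = np.corrcoef(y[:-1], y[1:])[0, 1]  # theory: alpha1 = 0.9
```

Unlike the MA(q) cut-off, the AR(1) auto-covariance decays geometrically in |k| and never reaches exactly zero.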
Non-stationary process
Figure: Examples of non-stationary processes: a random walk, a non-stationary AR(2), an AR(1) with a mean change, and a concatenated Haar MA.
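The random walk is the simplest of these examples: Xt = Xt−1 + Zt, which fails weak stationarity because Var[Xt] = tσ² grows over time. A short sketch of my own showing this:

```python
import numpy as np

# Estimate Var[X_t] of a random walk across many independent paths:
# for standard normal steps, theory says Var[X_t] = t.
rng = np.random.default_rng(3)
n_paths, T = 5_000, 400
walks = rng.standard_normal((n_paths, T)).cumsum(axis=1)

var_at_100 = walks[:, 99].var()   # theory: 100
var_at_400 = walks[:, 399].var()  # theory: 400
```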
On-going Research in Time Series
Figure: Keyword cloud from the Journal of Time Series Analysis, 2002–2015. Red: Models, Navy: Properties, Grey: Inference & Methods.
Further Reading
• Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (2008) Time Series Analysis: Forecasting and Control. 4th ed., John Wiley & Sons.
• Chatfield, C. (2004) The Analysis of Time Series: An Introduction. 6th ed., CRC Press.
• Signal Processing Toolbox, MATLAB (http://uk.mathworks.com/products/signal/)
• Statsmodels, Python (http://statsmodels.sourceforge.net/)