TRANSCRIPT
Introduction to Algorithmic Trading Strategies, Lecture 5
Pairs Trading by Stochastic Spread Methods
Haksun Li, [email protected]
www.numericalmethod.com
Outline
- First passage time
- Kalman filter
- Maximum likelihood estimate
- EM algorithm
References
While most papers on the basic cointegration methods emphasize the construction of a synthetic mean-reverting asset, the stochastic spread methods focus on the dynamics of the price of the synthetic asset.
Most referenced academic paper: Elliott, van der Hoek, and Malcolm (2005), "Pairs Trading": models the spread process as a state-space version of an Ornstein-Uhlenbeck process.
Jonathan Chiu, Daniel Wijaya Lukman, Kourosh Modarresi, Avinayan Senthi Velayutham (2011), "High-frequency Trading", Stanford University.
The idea has also appeared in a number of popular pairs trading books:
- Ehrman (2005), "The Handbook of Pairs Trading": technical analysis and charting for the spread.
- Vidyamurthy (2004), "Pairs Trading: Quantitative Methods and Analysis": ARMA models, HMM, some non-parametric approaches, and a Kalman filter model.
Spread as a Mean-Reverting Process
$x_{k+1} - x_k = (a - b x_k)\tau + \sigma\sqrt{\tau}\,\varepsilon_{k+1}$, $b > 0$
The long-term mean $= a/b$.
The rate of mean reversion $= b$.
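The discrete dynamics above can be simulated directly. The following minimal Python sketch (parameter values are illustrative, not taken from the lecture) generates a path and lets one check that it hovers around the long-term mean $a/b$:

```python
import random

def simulate_spread(a, b, sigma, tau, x0, n, seed=42):
    """Simulate the discretized mean-reverting spread
    x_{k+1} = x_k + (a - b*x_k)*tau + sigma*sqrt(tau)*eps_{k+1},
    with eps_{k+1} i.i.d. standard normal."""
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x.append(x[-1] + (a - b * x[-1]) * tau + sigma * (tau ** 0.5) * eps)
    return x

# Illustrative parameters: the path should hover around a/b = 2.0.
path = simulate_spread(a=1.0, b=0.5, sigma=0.1, tau=0.01, x0=0.0, n=50000)
```

Discarding an initial burn-in (the start value $x_0$ is far from the mean), the time average of the path approximates $a/b$.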
Sum of Power Series
We note that $\sum_{i=0}^{n-1} r^i = \frac{1 - r^n}{1 - r}$ for $r \neq 1$; for $|r| < 1$, $\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}$.
Unconditional Mean
$\mathrm{E}[x_{k+1}] = a\tau + (1 - b\tau)\,\mathrm{E}[x_k]$
Iterating from $x_0$: $\mathrm{E}[x_n] = (1 - b\tau)^n x_0 + \frac{a}{b}\left(1 - (1 - b\tau)^n\right)$
Long Term Mean
Since $|1 - b\tau| < 1$, $(1 - b\tau)^n \to 0$, so $\mathrm{E}[x_n] \to a/b$ as $n \to \infty$.
Unconditional Variance
$\mathrm{Var}[x_{k+1}] = (1 - b\tau)^2\,\mathrm{Var}[x_k] + \sigma^2\tau$
Starting from $\mathrm{Var}[x_0] = 0$: $\mathrm{Var}[x_n] = \sigma^2\tau\,\frac{1 - (1 - b\tau)^{2n}}{1 - (1 - b\tau)^2}$
Long Term Variance
$\mathrm{Var}[x_n] \to \frac{\sigma^2\tau}{1 - (1 - b\tau)^2} = \frac{\sigma^2}{b(2 - b\tau)} \approx \frac{\sigma^2}{2b}$ for small $\tau$.
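The long-term formulas can be sanity-checked by iterating the one-step mean and variance recursions until they converge. This short Python sketch (illustrative parameters, not from the lecture) does exactly that:

```python
def long_term_moments(a, b, sigma, tau, n=100000):
    """Iterate the one-step recursions
      E[x_{k+1}]   = a*tau + (1 - b*tau) * E[x_k]
      Var[x_{k+1}] = (1 - b*tau)^2 * Var[x_k] + sigma^2 * tau
    starting from x_0 = 0, and return the values after n steps."""
    m, v = 0.0, 0.0
    for _ in range(n):
        m = a * tau + (1.0 - b * tau) * m
        v = (1.0 - b * tau) ** 2 * v + sigma ** 2 * tau
    return m, v

a, b, sigma, tau = 1.0, 0.5, 0.2, 0.01
m, v = long_term_moments(a, b, sigma, tau)
# m should approach a/b; v should approach sigma^2*tau / (1 - (1 - b*tau)^2)
```

After enough iterations both quantities agree with the closed-form limits to machine precision.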
Observations and Hidden State Process
The hidden state process is:
$x_{k+1} = x_k + (a - b x_k)\tau + \sigma\sqrt{\tau}\,\varepsilon_{k+1}$, $\varepsilon_k \sim N(0, 1)$ i.i.d., $0 < b\tau < 1$
The observations:
$y_k = x_k + D\,\omega_k$, $\omega_k \sim N(0, 1)$ i.i.d.
We want to compute the expected state from observations: $\hat{x}_{k|k} = \mathrm{E}\left[x_k \mid y_1, \dots, y_k\right]$
First Passage Time
Standardized Ornstein-Uhlenbeck process: $Z(t) = \left(x(t) - \frac{a}{b}\right)\frac{\sqrt{2b}}{\sigma}$
First passage time: $T_{a,0} = \inf\left\{t \geq 0 : Z(t) = 0 \mid Z(0) = a\right\}$
The pdf of $T_{a,0}$ has a maximum value at
$\hat{t} = \frac{1}{2}\ln\left[1 + \frac{1}{2}\left(\sqrt{\left(a^2 - 3\right)^2 + 4a^2} + a^2 - 3\right)\right]$
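As an illustration, the following Python sketch evaluates the mode formula and checks numerically that it sits at the peak of the closed-form first-passage-time density of a standardized OU process started at $Z(0) = a$; the density expression is the standard one from the general OU first-passage literature, stated here as an assumption rather than quoted from the lecture:

```python
import math

def fpt_density(t, a):
    """pdf of the first passage time to 0 of a standardized OU process
    started at Z(0) = a (standard closed form; an assumption here)."""
    s = 1.0 - math.exp(-2.0 * t)
    return (abs(a) * math.sqrt(2.0 / math.pi) * math.exp(-t) / s ** 1.5
            * math.exp(-a * a * math.exp(-2.0 * t) / (2.0 * s)))

def fpt_mode(a):
    """Location of the maximum of the first-passage-time pdf."""
    r = math.sqrt((a * a - 3.0) ** 2 + 4.0 * a * a)
    return 0.5 * math.log(1.0 + 0.5 * (r + a * a - 3.0))

a = 1.5
t_hat = fpt_mode(a)   # a positive time; the density peaks here
```

Setting the derivative of the log-density to zero reduces to the quadratic $2u^2 + (a^2 - 1)u - 1 = 0$ in $u = e^{-2t}$, whose admissible root gives exactly the $\hat{t}$ formula above.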
A Sample Trading Strategy
Buy when $Z(t) = -a$; unwind after time $\hat{t}$.
Sell when $Z(t) = a$; unwind after time $\hat{t}$.
Kalman Filter The Kalman filter is an efficient recursive filter that estimates the state of a dynamic system from a series of incomplete and noisy measurements.
Conceptual Diagram
[Diagram: prediction at time t; update at time t+1 as new measurements come in; correct for a better estimation.]
A Linear Discrete System
$x_t = F_t x_{t-1} + B_t u_t + w_t$
$F_t$: the state transition model applied to the previous state $x_{t-1}$
$B_t$: the control-input model applied to the control vector $u_t$
$w_t \sim N(0, Q_t)$: the process noise, drawn from a multivariate Normal distribution
Observations and Noises
$y_t = H_t x_t + v_t$
$H_t$: the observation model mapping the true states to observations
$v_t \sim N(0, R_t)$: the observation noise
Discrete System Diagram
[Diagram omitted from transcript.]
Prediction
Predicted (a priori) state estimate: $\hat{x}_{t+1|t} = F_{t+1}\hat{x}_{t|t} + B_{t+1}u_{t+1}$
Predicted (a priori) estimate covariance: $P_{t+1|t} = F_{t+1}P_{t|t}F_{t+1}^{\top} + Q_{t+1}$
Update
Measurement residual: $\tilde{y}_{t+1} = y_{t+1} - H_{t+1}\hat{x}_{t+1|t}$
Residual covariance: $S_{t+1} = H_{t+1}P_{t+1|t}H_{t+1}^{\top} + R_{t+1}$
Optimal Kalman gain: $K_{t+1} = P_{t+1|t}H_{t+1}^{\top}S_{t+1}^{-1}$
Updated (a posteriori) state estimate: $\hat{x}_{t+1|t+1} = \hat{x}_{t+1|t} + K_{t+1}\tilde{y}_{t+1}$
Updated (a posteriori) estimate covariance: $P_{t+1|t+1} = \left(I - K_{t+1}H_{t+1}\right)P_{t+1|t}$
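The predict/update equations can be collected into a few lines of code. The sketch below is a scalar (one-dimensional) version, which is all the spread model needs; it omits the control-input term $B u$, and the constant-state usage example at the bottom is purely illustrative:

```python
def kalman_step(x, P, y, F, Q, H, R):
    """One scalar Kalman filter predict/update cycle.

    Predict:  x_pred = F*x,  P_pred = F*P*F + Q
    Update:   K = P_pred*H / (H*P_pred*H + R)
              x_new = x_pred + K*(y - H*x_pred)
              P_new = (1 - K*H) * P_pred
    """
    x_pred = F * x
    P_pred = F * P * F + Q
    S = H * P_pred * H + R      # residual covariance
    K = P_pred * H / S          # optimal Kalman gain
    x_new = x_pred + K * (y - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Illustrative use: filter noisy observations of a constant state
# (F=1, Q=0, H=1, R=1) starting from a diffuse prior.
x, P = 0.0, 1e6
for y in [1.2, 0.8, 1.1, 0.9, 1.05, 0.95]:
    x, P = kalman_step(x, P, y, F=1.0, Q=0.0, H=1.0, R=1.0)
```

With $F = 1$ and $Q = 0$ the filter reduces to a running average of the observations, and the posterior variance $P$ shrinks roughly like $R/n$.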
Computing the 'Best' State Estimate
Given $F$, $Q$, $H$, $R$, we define the conditional variance
$\Sigma_{t|s} \equiv \mathrm{E}\left[\left(x_t - \hat{x}_{t|s}\right)^2\right]$
Start with $\hat{x}_{0|0} = \mathrm{E}\left[x_0\right]$, $\Sigma_{0|0} = \mathrm{Var}\left[x_0\right]$.
Predicted (a Priori) State Estimation
$\hat{x}_{t+1|t} = \mathrm{E}\left[x_{t+1} \mid y_1, \dots, y_t\right] = F\,\hat{x}_{t|t}$
Predicted (a Priori) Variance
$\Sigma_{t+1|t} = \mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t}\right)^2\right]$
$= \mathrm{E}\left[\left(F\left(x_t - \hat{x}_{t|t}\right) + w_{t+1}\right)^2\right]$
$= F^2\,\Sigma_{t|t} + Q$
Minimize Posteriori Variance
Let the Kalman updating formula be
$\hat{x}_{t+1|t+1} = \hat{x}_{t+1|t} + k\left(y_{t+1} - H\hat{x}_{t+1|t}\right)$
We want to solve for $k$ such that the conditional variance is minimized:
$\Sigma_{t+1|t+1} = \mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t+1}\right)^2\right]$
Solve for K
$\mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t+1}\right)^2\right]$
$= \mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t} - k\left(y_{t+1} - H\hat{x}_{t+1|t}\right)\right)^2\right]$
$= \mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t} - k\left(Hx_{t+1} + v_{t+1} - H\hat{x}_{t+1|t}\right)\right)^2\right]$
$= \mathrm{E}\left[\left((1 - kH)\left(x_{t+1} - \hat{x}_{t+1|t}\right) - k v_{t+1}\right)^2\right]$
$= (1 - kH)^2\,\mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t}\right)^2\right] + k^2 R$
$= (1 - kH)^2\,\Sigma_{t+1|t} + k^2 R$
First Order Condition for k
$\frac{\mathrm{d}}{\mathrm{d}k}\left[(1 - kH)^2\,\Sigma_{t+1|t} + k^2 R\right] = 0$
$-2H(1 - kH)\,\Sigma_{t+1|t} + 2kR = 0$
$k\left(H^2\,\Sigma_{t+1|t} + R\right) = H\,\Sigma_{t+1|t}$
Optimal Kalman Filter
$k = \frac{H\,\Sigma_{t+1|t}}{H^2\,\Sigma_{t+1|t} + R}$
Updated (a Posteriori) State Estimation
So, we have the "optimal" Kalman updating rule:
$\hat{x}_{t+1|t+1} = \hat{x}_{t+1|t} + k\left(y_{t+1} - H\hat{x}_{t+1|t}\right)$
$= \hat{x}_{t+1|t} + \frac{H\,\Sigma_{t+1|t}}{H^2\,\Sigma_{t+1|t} + R}\left(y_{t+1} - H\hat{x}_{t+1|t}\right)$
Updated (a Posteriori) Variance
$\Sigma_{t+1|t+1} = \mathrm{E}\left[\left(x_{t+1} - \hat{x}_{t+1|t+1}\right)^2\right]$
$= (1 - kH)^2\,\Sigma_{t+1|t} + k^2 R$
$= \Sigma_{t+1|t} - 2kH\,\Sigma_{t+1|t} + k^2\left(H^2\,\Sigma_{t+1|t} + R\right)$
Substituting $k\left(H^2\,\Sigma_{t+1|t} + R\right) = H\,\Sigma_{t+1|t}$:
$= \Sigma_{t+1|t} - 2kH\,\Sigma_{t+1|t} + kH\,\Sigma_{t+1|t}$
$= (1 - kH)\,\Sigma_{t+1|t}$
Parameter Estimation
We need to estimate the parameters from the observable data before we can use the Kalman filter model.
We need to write down the likelihood function in terms of the parameters $\theta$, and then maximize it w.r.t. $\theta$.
Likelihood Function A likelihood function (often simply the likelihood) is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values.
Maximum Likelihood Estimate
We find $\hat{\theta}$ such that the likelihood $L(\theta)$ is maximized given the observations.
Example Using the Normal Distribution
We want to estimate the mean $\mu$ of a sample of size $n$ drawn from a Normal distribution.
$L(\mu) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{\left(x_i - \mu\right)^2}{2\sigma^2}\right)$
Log-Likelihood
Maximizing the log-likelihood is equivalent to maximizing the following (dropping terms that do not depend on $\mu$):
$-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i - \mu\right)^2$
First order condition w.r.t. $\mu$: $\sum_{i=1}^{n}\left(x_i - \mu\right) = 0$, hence $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n}x_i$.
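A quick numerical check of this example: the sketch below (sample values are made up for illustration) evaluates the Gaussian log-likelihood and confirms that the sample mean maximizes it:

```python
import math

def normal_log_likelihood(mu, sample, sigma=1.0):
    """Log-likelihood of i.i.d. N(mu, sigma^2) data."""
    n = len(sample)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in sample) / (2.0 * sigma ** 2))

sample = [0.9, 1.3, 0.7, 1.1]
mle = sum(sample) / len(sample)   # the first-order condition gives the sample mean
```

Perturbing `mle` in either direction can only lower the log-likelihood, consistent with the first-order condition.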
Nelder-Mead
After we write down the likelihood function for the Kalman model in terms of $\theta$, we can run any multivariate optimization algorithm, e.g., Nelder-Mead, to search for $\hat{\theta} = \operatorname*{argmax}_{\theta} L(\theta)$.
The disadvantage is that it may not converge well, hence not landing close to the optimal solution.
Marginal Likelihood
For the set of hidden states $X$, we write the marginal likelihood
$L(\theta; Y) = p(Y \mid \theta) = \sum_{X} p(Y, X \mid \theta)$
Assuming we know the conditional distribution of $X$, we could instead maximize the following:
$\max_{\theta}\,\mathrm{E}_X\left[L\left(\theta; Y, X\right)\right]$, or
$\max_{\theta}\,\mathrm{E}_X\left[\log L\left(\theta; Y, X\right)\right]$
The expectation is a weighted sum of the (log-)likelihoods, weighted by the probability of the hidden states.
The Q-Function
Where do we get the conditional distribution of $X$ from?
Suppose we somehow have an (initial) estimate of the parameters, $\theta_0$. Then the model has no unknowns, and we can compute the distribution of $X$:
$p\left(X \mid Y, \theta_0\right)$
EM Intuition
Suppose we know $\theta$: we know the model completely, so we can find the hidden states $X$.
Suppose we know $X$: we can estimate $\theta$ by, e.g., maximum likelihood.
What do we do if we know neither $\theta$ nor $X$?
Expectation-Maximization Algorithm
Expectation step (E-step): compute the expected value of the log-likelihood function w.r.t. the conditional distribution of $X$ given $Y$ under $\theta_k$:
$Q\left(\theta \mid \theta_k\right) = \mathrm{E}_{X \mid Y, \theta_k}\left[\log L\left(\theta; Y, X\right)\right]$
Maximization step (M-step): find the parameters $\theta_{k+1}$ that maximize the Q-value:
$\theta_{k+1} = \operatorname*{argmax}_{\theta} Q\left(\theta \mid \theta_k\right)$
EM Algorithms for Kalman Filter
- Offline: Shumway and Stoffer, smoother approach, 1982
- Online: Elliott and Krishnamurthy, filter approach, 1999
A Trading Algorithm
From $y_1, y_2, \dots, y_t$, we estimate $\hat{\theta}_t$. Decide whether to make a trade at $t$ and unwind at $t+1$, or some time later, e.g., $t + \hat{t}$.
As $y_{t+1}$ arrives, estimate $\hat{\theta}_{t+1}$. Repeat.
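A minimal signal-generation sketch of this loop, assuming the parameter estimates $(a, b, \sigma)$ are already available at each step (the re-estimation itself is outside this sketch) and using an illustrative entry threshold of one long-term standard deviation:

```python
def trading_signals(spread, a, b, sigma, entry=1.0):
    """Generate entry signals from an observed spread series, assuming
    the OU parameters (a, b, sigma) have already been estimated.
    The spread is standardized by its long-term mean a/b and its
    long-term standard deviation sigma/sqrt(2b)."""
    mean = a / b
    sd = sigma / (2.0 * b) ** 0.5
    signals = []
    for x in spread:
        z = (x - mean) / sd
        if z <= -entry:
            signals.append("buy")    # spread unusually low: buy the spread
        elif z >= entry:
            signals.append("sell")   # spread unusually high: sell the spread
        else:
            signals.append("hold")
    return signals
```

The `entry` threshold and the decision rule are illustrative assumptions; the lecture's own rule enters at $Z(t) = \pm a$ and unwinds after the first-passage-time mode $\hat{t}$.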
Results (1)
[Chart omitted from transcript.]
Results (2)
[Chart omitted from transcript.]
Results (3)
[Chart omitted from transcript.]