Ph.D. dissertation presentation
Empirical properties of functional regression models and application to high-frequency financial data
Xi Zhang
Department of Mathematics and Statistics
Utah State University
March 20, 2013
1 Xi Zhang | March 20, 2013 1 / 48
Outline
1 Introduction
   Functional data analysis
   High-frequency financial data sets
2 Empirical properties of forecasts with the functional autoregressive model
3 Functional prediction of intraday cumulative returns
4 Functional multifactor regression for intraday price curves
5 Summary and Conclusions
Functional Data Analysis (FDA)
FDA analyzes data that provide information about curves, surfaces, or anything else varying over a continuum (time, spatial location, wavelength, probability, etc.).
The core idea is that curves should be treated as individual and complete statistical objects, rather than as collections of individual observations.
Statistical tools of FDA typically rely on some form of smoothing to transform the high-dimensional or incomplete data that make up a curve into a smoother curve that can be described by a smaller number of parameters.
The inherent complexity of functional data makes it impossible to estimate the “distribution” of a random function in a meaningful way, or to find estimates that converge at a reasonable rate. The properties of functional principal component analysis (FPCA) are therefore of great importance in FDA.
Eight years of the price process
Cumulative Intraday returns
Definition
Suppose $P_n(t_j)$, $n = 1, \ldots, N$, $j = 1, \ldots, m$, is the price of a financial asset at time $t_j$ on day $n$. The functions
$$r_n(t_j) = 100[\ln P_n(t_j) - \ln P_n(t_1)], \quad j = 2, \ldots, m, \quad n = 1, \ldots, N,$$
are defined as the intraday cumulative returns (CIDRs/IDCRs).
The above definition implicitly assumes that $t_{j+1} > t_j$. We work with one-minute averages, so $t_{j+1} - t_j = 1$ min, and $P(t_j)$ is the average of the maximum and minimum price within the $j$th minute.
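The definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the toy prices are assumptions, and the prices are taken to be already aggregated into one-minute averages, one row per day.

```python
import numpy as np

def cumulative_intraday_returns(prices):
    """Return r_n(t_j) = 100 * [ln P_n(t_j) - ln P_n(t_1)] for every day n.

    prices: array of shape (N, m); row n holds P_n(t_1), ..., P_n(t_m).
    The first column of the result is 0 by construction (j = 1).
    """
    log_p = np.log(np.asarray(prices, dtype=float))
    return 100.0 * (log_p - log_p[:, [0]])

# Hypothetical prices: two days, four one-minute averages each.
prices = np.array([[100.0, 101.0, 100.5, 102.0],
                   [50.0, 49.5, 50.0, 50.5]])
cidr = cumulative_intraday_returns(prices)
```

Each row of `cidr` is one daily curve that starts at zero, which is what makes the curves comparable across days with different price levels.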
A closer look at five days
Why CIDRs/IDCRs?
They resemble the price curves $P_n(t_j)$ for a trading day $n$, which are of high interest to stock investors.
They give more relevant information by showing how the return changes during a trading day.
They can be treated as continuous curves, one curve per day, and are therefore suited to functional data analysis.
High frequency returns
Outline
1 Introduction
2 Empirical properties of forecasts with the functional autoregressive model
   Introduction
   Simulation study
   Results
3 Functional prediction of intraday cumulative returns
4 Functional multifactor regression for intraday price curves
5 Summary and Conclusions
Functional Autoregressive Model (FAR)
FAR(1) model:
$$X_{n+1} = \Psi(X_n) + \varepsilon_{n+1},$$
where the errors $\varepsilon_n$ and the observations $X_n$ are curves, and the operator $\Psi$ acting on a function $X$ is defined as
$$\Psi(X)(t) = \int \psi(t, s) X(s)\, ds,$$
where $\psi(t, s)$ is a bivariate kernel assumed to satisfy $\|\Psi\| < 1$, where
$$\|\Psi\|^2 = \iint \psi^2(t, s)\, dt\, ds. \tag{1}$$
The condition $\|\Psi\| < 1$ ensures the existence of a stationary causal solution to the FAR(1) equations.
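On a grid, the kernel operator and the norm in (1) reduce to matrix operations. The sketch below is an illustration under assumptions that are not in the slides: a uniform midpoint grid on [0, 1], a Riemann-sum approximation of the integrals, and hypothetical function names.

```python
import numpy as np

def apply_kernel(psi, x, ds):
    """Approximate (Psi x)(t) = int psi(t, s) x(s) ds by a midpoint Riemann sum."""
    return psi @ x * ds

def hs_norm(psi, ds):
    """Approximate ||Psi|| = sqrt( iint psi(t, s)^2 dt ds ) on the same grid."""
    return np.sqrt(np.sum(psi ** 2) * ds * ds)

m = 100                                  # number of grid cells (illustrative)
ds = 1.0 / m
grid = (np.arange(m) + 0.5) * ds         # cell midpoints in [0, 1]
psi = np.full((m, m), 0.5)               # constant kernel psi(t, s) = 0.5
x = np.sqrt(2.0) * np.sin(2.0 * np.pi * grid)

norm = hs_norm(psi, ds)                  # equals 0.5 here, so ||Psi|| < 1 holds
image = apply_kernel(psi, x, ds)         # near zero: the sine integrates to 0 over [0, 1]
```

Checking `hs_norm(psi, ds) < 1` before simulating is a cheap way to confirm that a chosen kernel satisfies the stationarity condition.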
Methods
Bosq (2000) advocated a standard method: estimate the operator $\Psi$ and forecast $X_{n+1}$ by $\hat\Psi(X_n)$. (Estimated Kernel (EK))
The empirical version of the bivariate kernel $\psi$:
$$\hat\psi_p(t, s) = \sum_{k,\ell=1}^{p} \hat\psi_{k\ell}\, \hat v_k(t)\, \hat v_\ell(s), \tag{2}$$
where
$$\hat\psi_{ji} = \hat\lambda_i^{-1} (N - 1)^{-1} \sum_{n=1}^{N-1} \langle X_n, \hat v_i \rangle \langle X_{n+1}, \hat v_j \rangle, \tag{3}$$
and $\hat v_k$, $k = 1, 2, \ldots, p$, are the estimated (or empirical) FPCs (EFPCs); $p$ is the number of EFPCs.
Kargin and Onatski (2008) proposed a sophisticated method: one-step-ahead prediction in the FAR(1) model based on predictive factors. (Predictive Factors (PF))
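A minimal sketch of the EK forecast built from (2) and (3) is given below. It is not the code used in the dissertation. It assumes mean-zero curves observed on a uniform midpoint grid and computes the EFPCs from the discretized empirical covariance; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def ek_forecast(X, ds, p):
    """One-step-ahead EK forecast from curves X of shape (N, m) on a uniform grid.

    Assumes mean-zero curves, as in the FAR(1) model.
    """
    N = X.shape[0]
    cov = X.T @ X / N                        # empirical covariance kernel on the grid
    lam, vec = np.linalg.eigh(cov * ds)      # discretized covariance operator
    top = np.argsort(lam)[::-1][:p]          # indices of the p largest eigenvalues
    lam = lam[top]
    V = vec[:, top] / np.sqrt(ds)            # EFPCs scaled so that sum(v**2) * ds == 1
    scores = X @ V * ds                      # scores[n, i] = <X_n, v_i>
    # Equation (3): psi_hat[j, i] = lam_i^{-1} (N-1)^{-1} sum_n <X_n, v_i><X_{n+1}, v_j>
    psi_hat = (scores[1:].T @ scores[:-1]) / (N - 1) / lam[np.newaxis, :]
    # Equation (2) applied to the last curve: sum_{k,l} psi_hat[k, l] <X_N, v_l> v_k(t)
    forecast = V @ (psi_hat @ scores[-1])
    return forecast, psi_hat

# Synthetic data (illustrative only): two basis functions, AR(1) on the first score.
rng = np.random.default_rng(0)
N, m = 500, 50
ds = 1.0 / m
grid = (np.arange(m) + 0.5) * ds
v1 = np.sqrt(2.0) * np.sin(2.0 * np.pi * grid)
v2 = np.sqrt(2.0) * np.cos(2.0 * np.pi * grid)
a = np.zeros(N)
for n in range(1, N):
    a[n] = 0.5 * a[n - 1] + rng.normal()
b = 0.5 * rng.normal(size=N)
X = np.outer(a, v1) + np.outer(b, v2)

forecast, psi_hat = ek_forecast(X, ds, p=2)
```

In this toy setup the leading entry of `psi_hat` should land near the autoregressive coefficient 0.5 used to generate the first score. Division by small eigenvalues is what makes the choice of `p` delicate in practice.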
Objective
Is the method of Predictive Factors (PF) superior in finite samples to the Estimated Kernel (EK) method?
Data generating process
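FAR(1) model:
$$X_{n+1}(t) = \int_0^1 \psi(t, s) X_n(s)\, ds + \varepsilon_{n+1}(t), \quad n = 1, 2, \ldots, N.$$
Three error processes:
Brownian bridges:
$$\varepsilon^{(1)}(t) = BB(t),$$
$$\varepsilon^{(2)}(t) = \xi_1 \sqrt{2} \sin(2\pi t) + \sqrt{\lambda}\sqrt{2}\, \xi_2 \cos(2\pi t),$$
where $\xi_1$ and $\xi_2$ are independent standard normals and $\lambda$ can be any constant (in the simulations we use $\lambda = 0.5$),
$$\varepsilon^{(3)}(t) = \varepsilon^{(2)}(t) + a\, \varepsilon^{(1)}(t).$$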
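The three error processes can be simulated as sketched below. The Brownian bridge is built from a discretized Brownian motion $W$ via $BB(t) = W(t) - tW(1)$. The grid size and the value $a = 0.5$ are illustrative assumptions, since the slides do not state the value of $a$.

```python
import numpy as np

def brownian_bridge(m, rng):
    """Brownian bridge on the grid t_k = k/m, k = 0, ..., m, via BB(t) = W(t) - t W(1)."""
    t = np.linspace(0.0, 1.0, m + 1)
    increments = rng.normal(0.0, np.sqrt(1.0 / m), size=m)
    w = np.concatenate([[0.0], np.cumsum(increments)])
    return t, w - t * w[-1]

def eps2(t, rng, lam=0.5):
    """Two-component trigonometric error with independent standard normal scores."""
    xi1, xi2 = rng.normal(size=2)
    return (xi1 * np.sqrt(2.0) * np.sin(2.0 * np.pi * t)
            + np.sqrt(lam) * np.sqrt(2.0) * xi2 * np.cos(2.0 * np.pi * t))

rng = np.random.default_rng(0)
t, e1 = brownian_bridge(100, rng)    # epsilon^(1)
e2 = eps2(t, rng)                    # epsilon^(2) with lambda = 0.5
a = 0.5                              # assumed value, for illustration only
e3 = e2 + a * e1                     # epsilon^(3)
```

The bridge is pinned to zero at both endpoints, whereas the second process lives in a two-dimensional space; the third mixes the two.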
Kernels
Four kernels (defined for $(t, s) \in [0, 1]^2$):
Gaussian: $\psi(t, s) = C \exp\{-(t^2 + s^2)/2\}$,
Identity: $\psi(t, s) = C$,
Sloping plane (t): $\psi(t, s) = Ct$,
Sloping plane (s): $\psi(t, s) = Cs$.
The constants $C$ are chosen such that $\|\Psi\| = 0.5$ or $\|\Psi\| = 0.8$.
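The constant $C$ for each kernel can be found numerically by evaluating the unscaled kernel on a grid and dividing the target norm by its norm. The sketch below assumes a midpoint grid and uses hypothetical names; for the identity and sloping-plane kernels the result can be checked by hand ($C = 0.5$ and $C = 0.5\sqrt{3}$ for a target norm of 0.5).

```python
import numpy as np

def scaled_kernel(shape_fn, target_norm, m=200):
    """Return (C, C * shape) on an m-by-m midpoint grid so that ||Psi|| = target_norm."""
    grid = (np.arange(m) + 0.5) / m
    t, s = np.meshgrid(grid, grid, indexing="ij")
    raw = shape_fn(t, s)
    raw_norm = np.sqrt(np.sum(raw ** 2) / m ** 2)   # Riemann sum for the double integral
    C = target_norm / raw_norm
    return C, C * raw

shapes = {
    "gaussian": lambda t, s: np.exp(-(t ** 2 + s ** 2) / 2.0),
    "identity": lambda t, s: np.ones_like(t),
    "sloping_t": lambda t, s: t,
    "sloping_s": lambda t, s: s,
}
constants = {name: scaled_kernel(fn, 0.5)[0] for name, fn in shapes.items()}
```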
Measures of quality of prediction
The quantities
$$E_n = \sqrt{\int_0^1 \left(X_n(t) - \hat X_n(t)\right)^2 dt} \quad \text{and} \quad R_n = \int_0^1 \left|X_n(t) - \hat X_n(t)\right| dt$$
are used to measure the prediction error at time $n$.
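Both error measures are straightforward to approximate on a grid. The sketch below assumes a uniform midpoint grid on [0, 1] and uses a toy curve $X_n(t) = t$ with the zero forecast, for which the exact values are $E_n = 1/\sqrt{3}$ and $R_n = 1/2$.

```python
import numpy as np

def prediction_errors(x, x_hat, ds):
    """Return (E_n, R_n): the L2 and L1 distances between a curve and its forecast."""
    diff = x - x_hat
    e_n = np.sqrt(np.sum(diff ** 2) * ds)
    r_n = np.sum(np.abs(diff)) * ds
    return e_n, r_n

m = 100
ds = 1.0 / m
grid = (np.arange(m) + 0.5) * ds
x = grid                    # toy observed curve X_n(t) = t
x_hat = np.zeros(m)         # toy forecast: the zero function
E, R = prediction_errors(x, x_hat, ds)
```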
Comparison of five prediction methods
MP: Mean Prediction, $\hat X_{n+1}(t) = 0$.
NP: Naive Prediction, $\hat X_{n+1} = X_n$.
EX: Exact, $\hat X_{n+1} = \Psi(X_n)$.
EK: Estimated Kernel.
EKI: Estimated Kernel Improved, using $\hat\lambda_i + b$ instead of $\hat\lambda_i$.
PF: Predictive Factors.
Boxplots of the prediction errors, $\|\Psi\| = 0.5$
$E_n$ (left) and $R_n$ (right); innovations: $\varepsilon^{(1)}$; kernel: sloping plane (t); $N = 100$, $p = 3$.
Conclusions
Based on all 32 sets of boxplots and 32 sets of tables, we report:
Taking the autoregressive structure into account reduces prediction errors.
None of the methods EX, EK, and EKI uniformly dominates the others. In most cases method EK is the best, or at least as good as the others.
In some cases, method PF performs visibly worse than the other methods, but always better than NP.
Using the improved estimation does not generally reduce prediction errors.
Outline
1 Introduction
2 Empirical properties of forecasts with the functional autoregressive model
3 Functional prediction of intraday cumulative returns
   Introduction
   Methods and models
   Application to US stocks
   Results
4 Functional multifactor regression for intraday price curves
5 Summary and Conclusions
Capital Asset Pricing Model (CAPM)
The simplest form of the celebrated Capital Asset Pricing Model (CAPM):
$$r_n = \alpha + \beta r_{m,n} + \varepsilon_n, \tag{4}$$
where
$$r_n = 100(\ln P_n - \ln P_{n-1}) \approx 100\, \frac{P_n - P_{n-1}}{P_{n-1}} \tag{5}$$
is the return, in percent, over a unit of time on a specific asset, e.g. a stock, and $r_{m,n}$ is the analogously defined return on a relevant market index.
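The approximation in (5) is easy to check numerically. The prices below are hypothetical; for a 1% price move the log return and the simple return differ only in the third decimal place.

```python
import math

def log_return_pct(p_now, p_prev):
    """Left-hand side of (5): 100 * (ln P_n - ln P_{n-1})."""
    return 100.0 * (math.log(p_now) - math.log(p_prev))

def simple_return_pct(p_now, p_prev):
    """Right-hand side of (5): 100 * (P_n - P_{n-1}) / P_{n-1}."""
    return 100.0 * (p_now - p_prev) / p_prev

log_r = log_return_pct(101.0, 100.0)        # about 0.995
simple_r = simple_return_pct(101.0, 100.0)  # 1.0
```

The gap grows with the size of the move, so the approximation is best suited to short time units.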
Objective
Model the relationship between the IDCR curves for a single asset and those for a market index.
Evaluate the relevance of the candidate models by comparing their predictive power.
Simple Functional CAPM (SF)
A simple functional CAPM is defined as
$$Y_n(t) = \alpha + \psi X_n(t) + \varepsilon_n(t), \quad t \in [0, 1]. \tag{6}$$
A model without the intercept ($\alpha \equiv 0$), denoted SF*, is also considered.
Fully Functional CAPM (FF)
This model is defined by the relation
$$Y_n(t) = \alpha(t) + \int \psi(t, s) X_n(s)\, ds + \varepsilon_n(t), \quad t \in [0, 1]. \tag{7}$$
If $\alpha \equiv 0$, this model is denoted FF*.
Functional CAPM with dependent errors
This model is defined by (6), but the errors are assumed to follow a functional autoregressive process of order 1, i.e. a FAR(1) process:
$$\varepsilon_n(t) = \int \varphi(t, s)\, \varepsilon_{n-1}(s)\, ds + w_n(t), \tag{8}$$
where the $w_n$ are iid mean-zero random functions.
Fully Functional CAPM with dependent errors (FFDE): this model is defined by (7), with errors that follow the FAR(1) process (8). This model fails for prediction, because the kernel operators with kernels $\varphi(t, s)$ and $\psi(t, s)$ do not, in general, commute.
Questions we seek to answer
Can a simpler model with a scalar coefficient give predictions as good as a model with a kernel coefficient?
Does including an intercept improve predictions, or does this extra parameter actually make them worse?
Does modeling error correlation lead to improved predictions?
Estimation of regression parameters
All calculations were performed with the R package fda.
The cumulative returns at one-minute resolution are converted to functional objects.
99 Fourier basis functions are used.
Empirical functional principal components (EFPCs) $\hat v_1, \ldots, \hat v_p$ of the data are computed.
Evaluate the quality of prediction
The integrated mean squared error is defined as
$$\mathrm{MSEP}(N) = N^{-1} \sum_{n=1}^{N} \int \left(Y_n(t) - \hat Y_n(t)\right)^2 dt. \tag{9}$$
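A minimal sketch of (9) on a uniform grid follows. The curves are toy constants chosen so that the result can be verified by hand; the function name is hypothetical.

```python
import numpy as np

def msep(Y, Y_hat, ds):
    """Integrated mean squared error of prediction for curves of shape (N, m)."""
    squared_l2 = np.sum((Y - Y_hat) ** 2, axis=1) * ds   # one integral per day
    return np.mean(squared_l2)

m = 100
ds = 1.0 / m
Y = np.vstack([np.full(m, 1.0), np.full(m, 2.0)])   # two toy response curves
Y_hat = np.zeros((2, m))                            # toy predictions
value = msep(Y, Y_hat, ds)                          # (1 + 4) / 2 = 2.5
```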
Data preparation
10 large U.S. corporations in five sectors
Standard & Poor’s 100 index representing market index
1000-day periods between 01/03/2000 and 02/22/2006, without obvious outliers
Description of 10 Stocks representing five sectors
Sector                  Stock  Full Name         1000-day period
Energy                  XOM    Exxon Mobil       05/25/2000-05/19/2004
Energy                  CVX    Chevron           10/10/2001-07/23/2004, 12/13/2004-02/22/2006
Information Technology  MSFT   Microsoft         05/25/2000-05/19/2004
Information Technology  IBM    IBM               01/03/2000-12/24/2003
Financial               CITI   Citi Bank         10/17/2000-03/07/2005
Financial               BOA    Bank of America   03/13/2001-12/19/2005
Consumer Staples        KO     Coca-Cola         05/25/2000-05/19/2004
Consumer Staples        WMT    Wal-Mart Stores   05/25/2000-05/19/2004
Consumer Discretionary  MCD    McDonald’s        10/17/2000-03/07/2005
Consumer Discretionary  DIS    The Walt Disney   05/25/2000-05/19/2004
Prediction results (1)
Prediction results (2)
Conclusions
Models with an intercept, i.e. SF and FF, predict better than models without an intercept, i.e. SF* and FF*. The latter should not be used.
Modeling error dependence with a functional AR(1) model does not improve the MSEPs.
Neither of the two models with an intercept, SF and FF, dominates the other. They have almost the same MSEPs.
The SF model is recommended if minimizing the MSEP is the only concern. It is intuitive, its estimation is straightforward, and the prediction equation is very simple.
Outline
1 Introduction
2 Empirical properties of forecasts with the functional autoregressive model
3 Functional prediction of intraday cumulative returns
4 Functional multifactor regression for intraday price curves
   Motivation
   Methods and models
   Application to U.S. stocks
   Results
5 Summary and Conclusions
Objective
Are additional factors, beyond the IDCRs/CIDRs on a market index, statistically significant, and do they lead to improved predictions?
A general factor model
Factor model:
$$R_n(t) = \beta_0(t) + \sum_{j=1}^{p} \beta_j F_{nj}(t) + \varepsilon_n(t). \tag{10}$$
The parameters of the model are the mean function $\beta_0(\cdot)$ and the vector of coefficients
$$\beta = [\beta_1, \ldots, \beta_p]^T.$$
Parameter Estimation
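The mean function is estimated by
$$\hat\beta_0(t) = \bar R(t) - \sum_{j=1}^{p} \hat\beta_j \bar F_j(t). \tag{11}$$
The method of moments estimator of $\beta$ is
$$\hat\beta = \hat F^{-1} \hat R, \tag{12}$$
where
$$\hat F = \left[ N^{-1} \sum_{n=1}^{N} \left\langle F^c_{nj}, F^c_{nk} \right\rangle,\ j, k = 1, 2, \ldots, p \right] \quad (p \times p), \tag{13}$$
$$\hat R = \left[ N^{-1} \sum_{n=1}^{N} \left\langle R^c_n, F^c_{nj} \right\rangle,\ j = 1, 2, \ldots, p \right]^T \quad (p \times 1). \tag{14}$$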
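The estimator in (11) to (14) can be sketched as follows. Two assumptions are made that the slides do not spell out: the superscript $c$ is read as centering by the sample mean curve, and inner products are approximated by Riemann sums on a uniform grid. Scalar factors are represented as constant curves. All names and the synthetic data are hypothetical.

```python
import numpy as np

def mom_estimate(R, F, ds):
    """Method of moments estimates for the functional factor model.

    R: (N, m) response curves; F: (N, p, m) factor curves.
    Returns beta_hat of shape (p,) and beta0_hat of shape (m,).
    """
    N = R.shape[0]
    Rc = R - R.mean(axis=0)                                  # centered responses (assumed meaning of c)
    Fc = F - F.mean(axis=0)                                  # centered factors
    F_hat = np.einsum("njt,nkt->jk", Fc, Fc) * ds / N        # equation (13)
    R_hat = np.einsum("nt,njt->j", Rc, Fc) * ds / N          # equation (14)
    beta_hat = np.linalg.solve(F_hat, R_hat)                 # equation (12)
    beta0_hat = R.mean(axis=0) - np.einsum("j,jt->t", beta_hat, F.mean(axis=0))  # (11)
    return beta_hat, beta0_hat

# Noise-free synthetic data, so the estimator should recover the truth exactly.
rng = np.random.default_rng(1)
N, m = 50, 40
ds = 1.0 / m
grid = (np.arange(m) + 0.5) * ds
F = np.empty((N, 2, m))
F[:, 0, :] = np.outer(rng.normal(size=N), np.sin(2.0 * np.pi * grid))  # curve-valued factor
F[:, 1, :] = rng.normal(size=N)[:, None] * np.ones(m)                  # scalar factor
true_beta0 = grid ** 2
true_beta = np.array([2.0, -1.0])
R = true_beta0 + np.einsum("j,njt->nt", true_beta, F)
beta_hat, beta0_hat = mom_estimate(R, F, ds)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is the usual numerical preference when only the product is needed.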
Predictive efficiency
Relative predictive efficiency gains (in percent) are defined as
$$E = 100 \left( \frac{\mathrm{MSEP}_M}{\mathrm{MSEP}_F} - 1 \right),$$
where $\mathrm{MSEP}_M$ is the MSEP computed using only $M_n$ from model SF, and $\mathrm{MSEP}_F$ is the MSEP computed using all factors in the model.
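As a quick worked example of the efficiency measure, with made-up MSEP values: if the market-only model has MSEP 1.10 and the full factor model has MSEP 1.00, the gain is 10 percent. A negative value would mean the extra factors made predictions worse.

```python
def efficiency_gain(msep_market_only, msep_all_factors):
    """Relative predictive efficiency gain, in percent."""
    return 100.0 * (msep_market_only / msep_all_factors - 1.0)

gain = efficiency_gain(1.10, 1.00)   # hypothetical MSEP values
loss = efficiency_gain(0.95, 1.00)   # negative: extra factors hurt in this toy case
```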
Confidence Intervals
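Asymptotic
$\hat\beta$ is asymptotically distributed with mean $\beta$ and covariance matrix $N^{-1} F^{-1} \Gamma F^{-1}$.
The matrix $\Gamma$ is estimated as the long-run covariance matrix of the sequence
$$\xi_n = \left[ \left\langle \hat\varepsilon_n, F_{n1} - \bar F_1 \right\rangle, \ldots, \left\langle \hat\varepsilon_n, F_{np} - \bar F_p \right\rangle \right]^T,$$
where
$$\hat\varepsilon_n(t) = R_n(t) - \hat\beta_0(t) - \sum_{j=1}^{p} \hat\beta_j F_{nj}(t).$$
The R function lrvar, with default kernel and bandwidth values, is used to estimate $\Gamma$.
The variance of $\hat\beta_j$ is the $j$th diagonal element of $N^{-1} F^{-1} \Gamma F^{-1}$.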
Subsampling
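The asymptotic intervals can be sketched as below. Several things here are assumptions rather than the method of the slides: the Bartlett-kernel estimator with a user-chosen bandwidth is a stand-in for the R function lrvar and is not claimed to reproduce its defaults, the normal quantile 1.96 presumes asymptotic normality, and all names and numbers are hypothetical.

```python
import numpy as np

def long_run_cov(xi, bandwidth):
    """Bartlett-kernel long-run covariance of the rows of xi, shape (N, p)."""
    xi = xi - xi.mean(axis=0)
    N = xi.shape[0]
    gamma = xi.T @ xi / N                        # lag-0 autocovariance
    for h in range(1, bandwidth + 1):
        weight = 1.0 - h / (bandwidth + 1.0)     # Bartlett weights
        g = xi[h:].T @ xi[:-h] / N               # lag-h autocovariance
        gamma += weight * (g + g.T)
    return gamma

def asymptotic_ci(beta_hat, F_hat, gamma, N, z=1.96):
    """Intervals beta_hat_j +/- z * sqrt of the j-th diagonal of N^{-1} F^{-1} Gamma F^{-1}."""
    F_inv = np.linalg.inv(F_hat)
    cov = F_inv @ gamma @ F_inv / N
    se = np.sqrt(np.diag(cov))
    return beta_hat - z * se, beta_hat + z * se

# Toy inputs chosen so the standard errors are 0.2 and 0.3.
lower, upper = asymptotic_ci(np.array([1.0, 0.0]), np.eye(2),
                             np.diag([4.0, 9.0]), N=100)
xi_demo = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
gamma0 = long_run_cov(xi_demo, bandwidth=0)      # bandwidth 0 gives the sample covariance
```

A coefficient is judged significant at the 5% level under these assumptions when its interval excludes zero.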
Sector                  Symbol  Full Name
Energy                  XOM     Exxon Mobil Corporation
Energy                  CVX     Chevron Corporation
Energy                  COP     ConocoPhillips
Information Technology  MSFT    Microsoft Corporation
Information Technology  IBM     IBM Corporation
Information Technology  ORCL    Oracle Corporation
Financial               CITI    Citi Bank
Financial               BOA     Bank of America Corporation
Financial               JPM     JPMorgan Chase & Co.
Consumer Staples        KO      Coca-Cola
Consumer Staples        WMT     Wal-Mart Stores
Consumer Staples        PG      Procter & Gamble Co.
Consumer Discretionary  MCD     McDonald’s Corporation
Consumer Discretionary  DIS     The Walt Disney Corporation
Consumer Discretionary  CMCSA   Comcast Corporation
Transportation          FDX     FedEx Corporation
Transportation          JBLU    JetBlue Airways Corporation
Transportation          UPS     United Parcel Service, Inc.
Models to test
A simpler model:
$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 L_{n-1} + \varepsilon_n(t), \tag{15}$$
PA: model with $L_{n-1}$ representing the asset daily return;
PI: model with $L_{n-1}$ representing the index daily return;
FF: Fama–French model:
$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 S_n + \beta_3 H_n + \varepsilon_n(t), \tag{16}$$
where $S_n$ and $H_n$ are the Fama–French factors (scalars).
OF: model with oil futures as the extra factor:
$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 C_n(t) + \varepsilon_n(t). \tag{17}$$
Table: Summary of conclusions for the OF model for the stocks
Sector                  Subsampling  Asymptotic
Energy                  0/+          +
Information Technology  0            −
Financial               0            −/0
Consumer Staples        0            −/0
Consumer Discretionary  0            0/−
Transportation          0            −
Table: Monte Carlo study results from bootstrapping.
Bootstrapped data  Size (asymptotic)  Size (subsampling)  Power (asymptotic)  Power (subsampling)
MSFT1              7                  0                   74                  0
WMT1               5                  0                   98                  3
UPS1               6                  0                   56                  0
Outline
1 Introduction
2 Empirical properties of forecasts with the functional autoregressive model
3 Functional prediction of intraday cumulative returns
4 Functional multifactor regression for intraday price curves
5 Summary and Conclusions
Main results
The sophisticated prediction method recently proposed by Kargin and Onatski (2008) does not dominate a simpler method based on functional principal components. Limits on the quality of predictions are found, and it is shown that no other method can exceed them.
Complex functional regression models do not perform better than a simple model.
A functional regression framework is proposed that allows us to evaluate quantitatively how the shapes of intraday price curves depend on the shapes of other curve-valued factors or on scalar factors.
Scalar factors have no significant impact on the shape of the price curves.
The oil factor significantly affects the intraday price evolution of oil companies; its effect on other stocks is mostly negative.
Asymptotic theory leads to practically useful confidence intervals for the regression coefficients.
Publication
Kokoszka, P., Miao, H., and Zhang, X. Functional multifactor regression for intraday price curves. Submitted to Journal of Econometrics.
Kokoszka, P. and Zhang, X. Functional prediction of intra-day cumulative returns. Statistical Modelling, 12(4):377-398, 2012.
Didericksen, D., Kokoszka, P., and Zhang, X. Empirical properties of forecasts with the functional autoregressive model. Computational Statistics, 27(2):285-298, 2012.
Kokoszka, P. and Zhang, X. Estimation of the autoregressive kernel in the functional AR(1) process. Utah State University, Utah, USA, 2011.
Acknowledgement
Special thanks to Dr. Piotr S. Kokoszka and to my Ph.D. committee members: Dr. Daniel Coster, Dr. Richard Cutler, Dr. John Stevens, and Dr. Lie Zhu.