
Ph.D. dissertation presentation

Empirical properties of functional regression models and application to high-frequency financial data

Xi Zhang

Department of Mathematics and Statistics

Utah State University

March 20, 2013

1 Xi Zhang | March 20, 2013 1 / 48

Ph.D. dissertation presentation | Introduction

Outline

1 Introduction
   Functional data analysis
   High-frequency financial data sets

2 Empirical properties of forecasts with the functional autoregressive model

3 Functional prediction of intraday cumulative returns

4 Functional multifactor regression for intraday price curves

5 Summary and Conclusions


Ph.D. dissertation presentation | Introduction | Functional data analysis

Functional Data Analysis (FDA)

FDA analyzes data that provide information about curves, surfaces, or anything else varying over a continuum (time, spatial location, wavelength, probability, etc.).

The core idea is that curves should be treated as individual and complete statistical objects, rather than as collections of individual observations.

Statistical tools of FDA typically rely on some form of smoothing to transform the high-dimensional or incomplete data that make up a curve into a smoother curve that can be described by a smaller number of parameters.

The inherent complexity of functional data makes it practically impossible to estimate the “distribution” of a random function in a meaningful way, or to find estimates that converge at a reasonable rate. This is why the properties of functional principal component analysis (FPCA) are of great importance in FDA.


Ph.D. dissertation presentation | Introduction | High-frequency financial data sets

8 years price process


Cumulative Intraday returns

Definition

Suppose $P_n(t_j)$, $n = 1, \ldots, N$, $j = 1, \ldots, m$, is the price of a financial asset at time $t_j$ on day $n$. The functions

$$r_n(t_j) = 100\,[\ln P_n(t_j) - \ln P_n(t_1)], \quad j = 2, \ldots, m, \quad n = 1, \ldots, N,$$

are defined as the intraday cumulative returns (IDCRs), also called cumulative intraday returns (CIDRs).

The above definition implicitly assumes that $t_{j+1} > t_j$. We work with one-minute averages, so $t_{j+1} - t_j = 1$ min, and $P_n(t_j)$ is the average of the maximum and minimum price within the $j$th minute.
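The definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the days-by-minutes array layout are assumptions made here, not part of the presentation (whose computations were done in R).

```python
import numpy as np

def cumulative_intraday_returns(prices):
    """Return r_n(t_j) = 100 * [ln P_n(t_j) - ln P_n(t_1)].

    `prices` is a hypothetical (N days) x (m minutes) array of positive
    one-minute prices. Column 0 (j = 1) is kept and is zero by construction.
    """
    log_prices = np.log(np.asarray(prices, dtype=float))
    return 100.0 * (log_prices - log_prices[:, :1])

# Toy data (made up): two days, three one-minute prices each.
toy_prices = np.array([[100.0, 101.0, 99.0],
                       [50.0, 50.0, 51.0]])
toy_returns = cumulative_intraday_returns(toy_prices)
```

Each row of `toy_returns` is one daily curve that starts at zero, which is what allows the days to be compared as functional observations.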


Cumulative Intraday returns


Five days closer look


Why CIDR’s/IDCR’s?

They are similar to the price curves $P_n(t_j)$ for a trading day $n$, which are of high interest to stock investors.

They give more relevant information by showing how the return changes during a trading day.

They can be treated as continuous curves, one curve per day, and so are well suited to functional data analysis.


High frequency returns


Ph.D. dissertation presentation | Empirical properties of forecasts with the functional autoregressive model

Outline

1 Introduction

2 Empirical properties of forecasts with the functional autoregressive model
   Introduction
   Simulation study
   Results

3 Functional prediction of intraday cumulative returns

4 Functional multifactor regression for intraday price curves

5 Summary and Conclusions


Ph.D. dissertation presentation | Empirical properties of forecasts with the functional autoregressive model | Introduction

Functional autoregressive (FAR) model

FAR(1) model:

$$X_{n+1} = \Psi(X_n) + \varepsilon_{n+1},$$

where the errors $\varepsilon_n$ and the observations $X_n$ are curves, and the operator $\Psi$ acting on a function $X$ is defined as

$$\Psi(X)(t) = \int \psi(t, s)\,X(s)\,ds,$$

where $\psi(t, s)$ is a bivariate kernel assumed to satisfy $\|\Psi\| < 1$, where

$$\|\Psi\|^2 = \iint \psi^2(t, s)\,dt\,ds. \tag{1}$$

The condition $\|\Psi\| < 1$ ensures the existence of a stationary causal solution to the FAR(1) equations.


Methods

Bosq (2000) advocated a standard method: estimate the operator $\Psi$ and forecast $X_{n+1}$ by $\hat{\Psi}(X_n)$. (Estimated Kernel (EK))

The empirical version of the bivariate kernel $\psi$ is

$$\hat{\psi}_p(t, s) = \sum_{k,\ell=1}^{p} \hat{\psi}_{k\ell}\,\hat{v}_k(t)\,\hat{v}_\ell(s), \tag{2}$$

where

$$\hat{\psi}_{ji} = \hat{\lambda}_i^{-1} (N - 1)^{-1} \sum_{n=1}^{N-1} \langle X_n, \hat{v}_i \rangle \langle X_{n+1}, \hat{v}_j \rangle, \tag{3}$$

and $\hat{v}_k$, $k = 1, 2, \ldots, p$, are the estimated (or empirical) FPCs (EFPCs). Here $p$ is the number of EFPCs.

Kargin and Onatski (2008) proposed a more sophisticated method: one-step-ahead prediction in the FAR(1) model based on predictive factors. (Predictive Factors (PF))
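The Estimated Kernel forecast of equations (2) and (3) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the code used in the study: curves are taken to be mean zero and sampled on a uniform grid of m points in [0, 1], integrals are replaced by Riemann sums with weight 1/m, the covariance is normalized by N, and the names `estimate_kernel` and `predict_next` are hypothetical.

```python
import numpy as np

def estimate_kernel(X, p):
    """Estimated Kernel on a uniform grid of m points in [0, 1].

    X is an (N, m) array of curves assumed to have mean zero.
    Returns an (m, m) array approximating psi_hat_p(t, s).
    """
    N, m = X.shape
    h = 1.0 / m                                  # Riemann-sum weight (assumption)
    cov = X.T @ X / N                            # empirical covariance kernel
    evals, evecs = np.linalg.eigh(cov * h)       # discretized covariance operator
    top = np.argsort(evals)[::-1][:p]            # indices of the p largest eigenvalues
    lam = evals[top]                             # estimated eigenvalues
    V = evecs[:, top] / np.sqrt(h)               # EFPCs scaled to unit L2 norm
    scores = X @ V * h                           # scores[n, k] = <X_n, v_k>
    # coef[j, i] = lam_i^{-1} (N-1)^{-1} sum_n <X_n, v_i><X_{n+1}, v_j>, eq. (3)
    coef = (scores[1:].T @ scores[:-1]) / (N - 1) / lam[np.newaxis, :]
    return V @ coef @ V.T                        # eq. (2) evaluated on the grid

def predict_next(psi_hat, x_last):
    """One-step forecast: Riemann sum of psi_hat(t, s) * x_last(s) over s."""
    return psi_hat @ x_last / psi_hat.shape[0]
```

A forecast of the next curve is then `predict_next(estimate_kernel(X, p), X[-1])`. The sign ambiguity of the eigenvectors does not matter, because each EFPC enters the kernel twice.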


Objective

Is the method of Predictive Factors (PF) superior in finite samples to the Estimated Kernel (EK) method?


Ph.D. dissertation presentation | Empirical properties of forecasts with the functional autoregressive model | Simulation study

Data generating process

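FAR(1) model:

$$X_{n+1}(t) = \int_0^1 \psi(t, s)\,X_n(s)\,ds + \varepsilon_{n+1}(t), \quad n = 1, 2, \ldots, N.$$

Three error processes:

Brownian bridges:

$$\varepsilon^{(1)}(t) = BB(t).$$

$$\varepsilon^{(2)}(t) = \xi_1 \sqrt{2}\,\sin(2\pi t) + \sqrt{\lambda}\,\sqrt{2}\,\xi_2 \cos(2\pi t),$$

where $\xi_1$ and $\xi_2$ are independent standard normals and $\lambda$ can be any constant (in the simulations we use $\lambda = 0.5$).

$$\varepsilon^{(3)}(t) = \varepsilon^{(2)}(t) + a\,\varepsilon^{(1)}(t).$$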
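The three error processes can be simulated on a grid roughly as follows. This is a sketch under assumptions made here: the grid runs from 0 to 1 inclusive, the Brownian bridge is built as W(t) - t W(1) from a discretized Brownian motion, and the function names are hypothetical. The slides do not state the value of the constant a, so it is left as a required argument.

```python
import numpy as np

def brownian_bridge(grid, rng):
    """BB(t) = W(t) - t * W(1), with W a Brownian motion sampled on `grid`.

    Assumes grid[0] == 0 and grid[-1] == 1.
    """
    steps = rng.normal(0.0, np.sqrt(np.diff(grid)))   # independent increments
    w = np.concatenate([[0.0], np.cumsum(steps)])
    return w - grid * w[-1]

def eps2(grid, rng, lam=0.5):
    """xi1 * sqrt(2) sin(2 pi t) + sqrt(lam) * sqrt(2) * xi2 cos(2 pi t)."""
    xi1, xi2 = rng.standard_normal(2)
    return (xi1 * np.sqrt(2) * np.sin(2 * np.pi * grid)
            + np.sqrt(lam) * np.sqrt(2) * xi2 * np.cos(2 * np.pi * grid))

def eps3(grid, rng, a, lam=0.5):
    """eps2 plus a times an independent Brownian bridge; `a` is user-supplied."""
    return eps2(grid, rng, lam) + a * brownian_bridge(grid, rng)

grid = np.linspace(0.0, 1.0, 101)
rng = np.random.default_rng(0)
bb = brownian_bridge(grid, rng)
e2 = eps2(grid, rng)
e3 = eps3(grid, rng, a=0.1)   # a = 0.1 is an arbitrary illustrative value
```

The second process lives in a two-dimensional space spanned by one sine and one cosine, while the Brownian bridge has infinitely many nonzero eigenvalues, so the three choices presumably probe different levels of error complexity.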


Kernels

Four kernels (defined for $(t, s) \in [0, 1]^2$):

Gaussian: $\psi(t, s) = C \exp\{-(t^2 + s^2)/2\}$,

Identity: $\psi(t, s) = C$,

Sloping plane (t): $\psi(t, s) = Ct$,

Sloping plane (s): $\psi(t, s) = Cs$.

The constants $C$ are chosen such that $\|\Psi\| = 0.5$ or $\|\Psi\| = 0.8$.
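One way to pick the constant C numerically is to evaluate the unscaled kernel on a grid, approximate the norm in (1) by a Riemann sum, and rescale. The sketch below does this and then iterates the FAR(1) recursion. The midpoint grid, the dictionary `KERNELS`, the function names, and the choice to start the recursion at the first innovation without a burn-in period are all assumptions made for illustration.

```python
import numpy as np

# Unscaled kernel shapes (C = 1); t and s are 2-D meshgrid arrays.
KERNELS = {
    "gaussian": lambda t, s: np.exp(-(t ** 2 + s ** 2) / 2),
    "identity": lambda t, s: np.ones_like(t),
    "sloping_t": lambda t, s: t,
    "sloping_s": lambda t, s: s,
}

def hs_norm(psi):
    """Riemann-sum approximation of the norm in (1) on an m x m midpoint grid."""
    m = psi.shape[0]
    return np.sqrt(np.sum(psi ** 2) / m ** 2)

def scaled_kernel(name, target_norm, m=200):
    """Return (grid, psi) with psi rescaled so that its norm equals target_norm."""
    grid = (np.arange(m) + 0.5) / m
    t, s = np.meshgrid(grid, grid, indexing="ij")
    raw = KERNELS[name](t, s)
    return grid, raw * (target_norm / hs_norm(raw))

def simulate_far1(psi, innovations):
    """X_{n+1} = integral of psi(t, s) X_n(s) ds + eps_{n+1}, started at X_1 = eps_1."""
    m = psi.shape[0]
    X = np.zeros_like(innovations)
    X[0] = innovations[0]
    for n in range(1, len(innovations)):
        X[n] = psi @ X[n - 1] / m + innovations[n]
    return X
```

For the identity kernel the norm equals C, so `scaled_kernel("identity", 0.5)` is constant 0.5; for the sloping plane in t the double integral of t squared is 1/3, giving C close to 0.5 times the square root of 3.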


Measures of quality of prediction

The quantities

$$E_n = \sqrt{\int_0^1 \left( X_n(t) - \hat{X}_n(t) \right)^2 dt} \quad \text{and} \quad R_n = \int_0^1 \left| X_n(t) - \hat{X}_n(t) \right| dt$$

are used to measure the prediction error at time $n$.
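Both error measures reduce to one-dimensional numerical integration. A minimal sketch, assuming curves sampled on a common grid and using a hand-written trapezoidal rule (the helper names are hypothetical):

```python
import numpy as np

def trapezoid(y, grid):
    """Trapezoidal approximation of the integral of y over `grid`."""
    return np.sum((y[1:] + y[:-1]) * np.diff(grid)) / 2.0

def prediction_errors(x_true, x_pred, grid):
    """Return (E_n, R_n): the L2 and L1 distances between curve and forecast."""
    diff = x_true - x_pred
    e_n = np.sqrt(trapezoid(diff ** 2, grid))
    r_n = trapezoid(np.abs(diff), grid)
    return e_n, r_n

# Illustrative check: the curve X(t) = t against the zero forecast.
grid = np.linspace(0.0, 1.0, 101)
e_n, r_n = prediction_errors(grid, np.zeros_like(grid), grid)
```

For this toy case the exact values are the square root of 1/3 for E_n and 1/2 for R_n, which the trapezoidal rule reproduces closely.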


Ph.D. dissertation presentation | Empirical properties of forecasts with the functional autoregressive model | Results

Comparison of five prediction methods

MP Mean Prediction: $\hat{X}_{n+1}(t) = 0$.

NP Naive Prediction: $\hat{X}_{n+1} = X_n$.

EX Exact: $\hat{X}_{n+1} = \Psi(X_n)$.

EK Estimated Kernel.

EKI Estimated Kernel Improved, using $\hat{\lambda}_i + b$ instead of $\hat{\lambda}_i$.

PF Predictive Factors.


Boxplots of the prediction errors, $\|\Psi\| = 0.5$

$E_n$ (left) and $R_n$ (right); innovations: $\varepsilon^{(1)}$, kernel: sloping plane (t), $N = 100$, $p = 3$.


Conclusions

Based on all 32 sets of boxplots and 32 sets of tables, we report:

Taking the autoregressive structure into account reduces prediction errors.

None of the methods EX, EK, EKI uniformly dominates the others. In most cases method EK is the best, or at least as good as the others.

In some cases, method PF performs visibly worse than the other methods, but always better than NP.

Using the improved estimation does not generally reduce prediction errors.


Ph.D. dissertation presentation | Functional prediction of intraday cumulative returns

Outline

1 Introduction

2 Empirical properties of forecasts with the functional autoregressive model

3 Functional prediction of intraday cumulative returns
   Introduction
   Methods and models
   Application to US stocks
   Results

4 Functional multifactor regression for intraday price curves

5 Summary and Conclusions


Ph.D. dissertation presentation | Functional prediction of intraday cumulative returns | Introduction

Capital Asset Pricing Model (CAPM)

The simplest form of the celebrated Capital Asset Pricing Model (CAPM) is

$$r_n = \alpha + \beta r_{m,n} + \varepsilon_n, \tag{4}$$

where

$$r_n = 100(\ln P_n - \ln P_{n-1}) \approx 100\,\frac{P_n - P_{n-1}}{P_{n-1}} \tag{5}$$

is the return, in percent, over a unit of time on a specific asset, e.g. a stock, and $r_{m,n}$ is the analogously defined return on a relevant market index.
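The approximation in (5) is easy to check numerically. The snippet below is only an illustration with made-up prices; the function names are hypothetical.

```python
import math

def log_return_pct(p_now, p_prev):
    """Left-hand side of (5): 100 * (ln P_n - ln P_{n-1})."""
    return 100.0 * (math.log(p_now) - math.log(p_prev))

def simple_return_pct(p_now, p_prev):
    """Right-hand side of (5): 100 * (P_n - P_{n-1}) / P_{n-1}."""
    return 100.0 * (p_now - p_prev) / p_prev

# A 1% price move: the two definitions differ only in the third decimal.
log_r = log_return_pct(101.0, 100.0)
simple_r = simple_return_pct(101.0, 100.0)
```

The gap grows with the size of the move, which is why the approximation is stated for returns over a short unit of time.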


Objective

Model the relationship between the IDCR curves for a single asset and those for a market index

Evaluate their relevance by comparing their predictive power


Ph.D. dissertation presentation | Functional prediction of intraday cumulative returns | Methods and models

Simple Functional CAPM (SF)

A simple functional CAPM is defined as

$$Y_n(t) = \alpha + \psi X_n(t) + \varepsilon_n(t), \quad t \in [0, 1]. \tag{6}$$

A model without the intercept ($\alpha \equiv 0$), denoted SF*, is also considered.


Fully Functional CAPM (FF)

This model is defined by the relation

$$Y_n(t) = \alpha(t) + \int \psi(t, s)\,X_n(s)\,ds + \varepsilon_n(t), \quad t \in [0, 1]. \tag{7}$$

If $\alpha \equiv 0$, this model is denoted FF*.


Functional CAPM with dependent errors

This model is defined by (6), but the errors are assumed to follow a functional autoregressive process of order 1, i.e. a FAR(1) process:

$$\varepsilon_n(t) = \int \varphi(t, s)\,\varepsilon_{n-1}(s)\,ds + w_n(t), \tag{8}$$

where the $w_n$ are iid mean-zero random functions.

Fully Functional CAPM with dependent errors (FFDE). This model is defined by (7) with errors that follow the FAR(1) process. It fails when used for prediction, because the kernel operators with kernels $\varphi(t, s)$ and $\psi(t, s)$ do not commute.


Questions we seek to answer

Can a simpler model with a scalar coefficient give predictions as good as a model with a kernel coefficient?

Does including an intercept improve predictions, or does this extra parameter actually make them worse?

Does modeling error correlation lead to improved predictions?


Estimation of regression parameters

All calculations were performed with the R package fda.

The cumulative returns at one-minute resolution are converted to functional objects.

99 Fourier basis functions are used.

Empirical functional principal components (EFPCs) $\hat{v}_1, \ldots, \hat{v}_p$ of the data are computed.


Evaluate the quality of prediction

The integrated mean squared error is defined as

$$\mathrm{MSEP}(N) = N^{-1} \sum_{n=1}^{N} \int \left( Y_n(t) - \hat{Y}_n(t) \right)^2 dt. \tag{9}$$
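On discretized curves, (9) is an average of per-day integrated squared errors. A minimal sketch, assuming observed and predicted curves are stored as (N, m) arrays on a common grid (the name `msep` and this layout are assumptions made here):

```python
import numpy as np

def msep(Y, Y_hat, grid):
    """Average over days of the trapezoidal integral of (Y_n - Y_hat_n)^2."""
    sq = (Y - Y_hat) ** 2
    widths = np.diff(grid)
    integrals = np.sum((sq[:, 1:] + sq[:, :-1]) * widths, axis=1) / 2.0
    return integrals.mean()

# Illustrative check: constant curves equal to 1, predicted by 0, on [0, 1].
grid = np.linspace(0.0, 1.0, 51)
Y = np.ones((3, 51))
value = msep(Y, np.zeros_like(Y), grid)
```

Here each day contributes an integrated squared error of 1, so the average is 1 as well.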


Ph.D. dissertation presentation | Functional prediction of intraday cumulative returns | Application to US stocks

Data preparation

10 large U.S. corporations in five sectors

Standard & Poor’s 100 index representing market index

1000-day-long periods between 01/03/2000 and 02/22/2006, without obvious outliers


Description of 10 Stocks representing five sectors

Sector                  Stock  Full name        1000-day period
Energy                  XOM    Exxon Mobil      05/25/2000-05/19/2004
Energy                  CVX    Chevron          10/10/2001-07/23/2004 and 12/13/2004-02/22/2006
Information Technology  MSFT   Microsoft        05/25/2000-05/19/2004
Information Technology  IBM    IBM              01/03/2000-12/24/2003
Financial               CITI   Citi Bank        10/17/2000-03/07/2005
Financial               BOA    Bank of America  03/13/2001-12/19/2005
Consumer Staples        KO     Coca-Cola        05/25/2000-05/19/2004
Consumer Staples        WMT    Wal-Mart Stores  05/25/2000-05/19/2004
Consumer Discretionary  MCD    McDonald’s       10/17/2000-03/07/2005
Consumer Discretionary  DIS    The Walt Disney  05/25/2000-05/19/2004


Ph.D. dissertation presentation | Functional prediction of intraday cumulative returns | Results

Prediction results (1)


Prediction results (2)


Conclusions

Models with an intercept, i.e. SF and FF, give better predictions than models without an intercept, i.e. SF* and FF*. The latter should not be used.

Modeling error dependence with a functional AR(1) model does not improve the MSEPs.

Neither of the two models with an intercept, SF and FF, dominates the other. They have almost the same MSEPs.

The SF model is recommended if minimizing the MSEP is the only concern. It is intuitive, its estimation is straightforward, and the prediction equation is very simple.


Ph.D. dissertation presentation | Functional multifactor regression for intraday price curves

Outline

1 Introduction

2 Empirical properties of forecasts with the functional autoregressive model

3 Functional prediction of intraday cumulative returns

4 Functional multifactor regression for intraday price curves
   Motivation
   Methods and models
   Application to U.S. stocks
   Results

5 Summary and Conclusions


Ph.D. dissertation presentation | Functional multifactor regression for intraday price curves | Motivation

Objective

Are additional factors, beyond the IDCRs/CIDRs on a market index, statistically significant, and do they lead to improved predictions?


Ph.D. dissertation presentation | Functional multifactor regression for intraday price curves | Methods and models

A general factor model

Factor model:

$$R_n(t) = \beta_0(t) + \sum_{j=1}^{p} \beta_j F_{nj}(t) + \varepsilon_n(t). \tag{10}$$

The parameters of the model are the mean function $\beta_0(\cdot)$ and the vector of coefficients

$$\boldsymbol{\beta} = [\beta_1, \ldots, \beta_p]^T.$$


Parameter Estimation

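The mean function is estimated by

$$\hat{\beta}_0(t) = \bar{R}(t) - \sum_{j=1}^{p} \hat{\beta}_j \bar{F}_j(t). \tag{11}$$

The method of moments estimator of $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}} = \mathbf{F}^{-1} \mathbf{R}, \tag{12}$$

where

$$\mathbf{F} = \left[ N^{-1} \sum_{n=1}^{N} \langle F^c_{nj}, F^c_{nk} \rangle, \quad j, k = 1, 2, \ldots, p \right] \quad (p \times p), \tag{13}$$

$$\mathbf{R} = \left[ N^{-1} \sum_{n=1}^{N} \langle R^c_n, F^c_{nj} \rangle, \quad j = 1, 2, \ldots, p \right]^T \quad (p \times 1). \tag{14}$$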
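Equations (11) to (14) translate into a small linear solve. The sketch below makes several assumptions that the slides do not state explicitly: the superscript c is read as centering by the sample mean curve, inner products are approximated by Riemann sums with spacing h, scalar factors would be passed as constant curves, and the function name and array shapes are hypothetical.

```python
import numpy as np

def fit_factor_model(R, F, h):
    """Method of moments fit of R_n(t) = beta_0(t) + sum_j beta_j F_nj(t) + eps_n(t).

    R: (N, m) response curves; F: (N, p, m) factor curves; h: grid spacing.
    Returns (beta0_hat, beta_hat) with shapes (m,) and (p,).
    """
    N = R.shape[0]
    R_bar = R.mean(axis=0)
    F_bar = F.mean(axis=0)                                   # shape (p, m)
    Rc = R - R_bar                                           # centered responses
    Fc = F - F_bar                                           # centered factors
    F_mat = np.einsum("njt,nkt->jk", Fc, Fc) * h / N         # eq. (13)
    R_vec = np.einsum("nt,njt->j", Rc, Fc) * h / N           # eq. (14)
    beta_hat = np.linalg.solve(F_mat, R_vec)                 # eq. (12)
    beta0_hat = R_bar - beta_hat @ F_bar                     # eq. (11)
    return beta0_hat, beta_hat
```

On noise-free data generated from model (10) the estimator recovers the coefficients exactly, which gives a quick sanity check before applying it to real curves.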


Predictive efficiency

Relative predictive efficiency gains (in percent) are defined as

$$E = 100 \left( \frac{\mathrm{MSEP}_M}{\mathrm{MSEP}_F} - 1 \right),$$

where $\mathrm{MSEP}_M$ is the MSEP computed using only $M_n$ from model SF, and $\mathrm{MSEP}_F$ is the MSEP computed using all factors in the model.
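As a quick numerical reading of this formula (the function name and the numbers are illustrative assumptions, not results from the study):

```python
def efficiency_gain(msep_market, msep_full):
    """E = 100 * (MSEP_M / MSEP_F - 1), in percent."""
    return 100.0 * (msep_market / msep_full - 1.0)

# Made-up values: the market-only model has a 10% larger MSEP.
gain = efficiency_gain(1.1, 1.0)
# Made-up values: the extra factors make predictions worse.
loss = efficiency_gain(0.9, 1.0)
```

A positive E means the full factor model predicts better than the market-only model, zero means no difference, and a negative E means the additional factors hurt prediction.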


Confidence Intervals

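Asymptotic

$\hat{\boldsymbol{\beta}}$ is asymptotically distributed with mean $\boldsymbol{\beta}$ and covariance matrix $N^{-1} \mathbf{F}^{-1} \boldsymbol{\Gamma} \mathbf{F}^{-1}$.

The matrix $\boldsymbol{\Gamma}$ is estimated as the long-run covariance matrix of the sequence

$$\xi_n = \left[ \langle \hat{\varepsilon}_n, F_{n1} - \bar{F}_1 \rangle, \ldots, \langle \hat{\varepsilon}_n, F_{np} - \bar{F}_p \rangle \right]^T,$$

where

$$\hat{\varepsilon}_n(t) = R_n(t) - \hat{\beta}_0(t) - \sum_{j=1}^{p} \hat{\beta}_j F_{nj}(t).$$

The R function lrvar, with default kernel and bandwidth values, is used to estimate $\boldsymbol{\Gamma}$.

The variance of $\hat{\beta}_j$ is the $j$th diagonal element of $N^{-1} \mathbf{F}^{-1} \boldsymbol{\Gamma} \mathbf{F}^{-1}$.

Subsampling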


Ph.D. dissertation presentation | Functional multifactor regression for intraday price curves | Application to U.S. stocks

Sector                  Symbol  Full name
Energy                  XOM     Exxon Mobil Corporation
Energy                  CVX     Chevron Corporation
Energy                  COP     ConocoPhillips
Information Technology  MSFT    Microsoft Corporation
Information Technology  IBM     IBM Corporation
Information Technology  ORCL    Oracle Corporation
Financial               CITI    Citi Bank
Financial               BOA     Bank of America Corporation
Financial               JPM     JPMorgan Chase & Co.
Consumer Staples        KO      Coca-Cola
Consumer Staples        WMT     Wal-Mart Stores
Consumer Staples        PG      Procter & Gamble Co.
Consumer Discretionary  MCD     McDonald’s Corporation
Consumer Discretionary  DIS     The Walt Disney Corporation
Consumer Discretionary  CMCSA   Comcast Corporation
Transportation          FDX     FedEx Corporation
Transportation          JBLU    JetBlue Airways Corporation
Transportation          UPS     United Parcel Service, Inc.


Models to test

A simpler model:

$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 L_{n-1} + \varepsilon_n(t), \tag{15}$$

PA: model with $L_{n-1}$ representing the asset daily return;

PI: model with $L_{n-1}$ representing the index daily return;

FF: Fama–French model:

$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 S_n + \beta_3 H_n + \varepsilon_n(t), \tag{16}$$

where $S_n$ and $H_n$ are the Fama–French factors (scalars).

OF: model with oil futures as the extra factor:

$$R_n(t) = \beta_0(t) + \beta_1 M_n(t) + \beta_2 C_n(t) + \varepsilon_n(t). \tag{17}$$


Ph.D. dissertation presentation | Functional multifactor regression for intraday price curves | results

Table: Summary of conclusions for the OF model for the stocks

Sector                  Subsampling  Asymptotic
Energy                  0/+          +
Information Technology  0            −
Financial               0            −/0
Consumer Staples        0            −/0
Consumer Discretionary  0            0/−
Transportation          0            −


Table: Monte Carlo study results based on bootstrapping.

Bootstrapped data  Size (asymptotic)  Size (subsampling)  Power (asymptotic)  Power (subsampling)
MSFT1              7                  0                   74                  0
WMT1               5                  0                   98                  3
UPS1               6                  0                   56                  0


Ph.D. dissertation presentation | Summary and Conclusions


Main results

The sophisticated prediction method recently proposed by Kargin and Onatski (2008) does not actually dominate a simpler method based on the functional principal components. Limits on the quality of predictions are established, and it is shown that no other method can exceed them.

Complex functional regression models do not perform better than a simple model.

A functional regression framework is proposed that allows us to evaluate quantitatively how the shapes of intraday price curves depend on the shapes of other curve-valued factors or on scalar factors.

Scalar factors have no significant impact on the shape of the price curves.

Oil factors significantly affect the intraday price evolution of oil companies, but their effect on other stocks is mostly negative.

Asymptotic theory leads to practically useful confidence intervals for the regression coefficients.


Publications

Kokoszka, P., Miao, H., and Zhang, X. Functional multifactor regression for intraday price curves. Submitted to Journal of Econometrics.

Kokoszka, P. and Zhang, X. Functional prediction of intra-day cumulative returns. Statistical Modelling, 12(4):377-398, 2012.

Didericksen, D., Kokoszka, P., and Zhang, X. Empirical properties of forecasts with the functional autoregressive model. Computational Statistics, 27(2):285-298, 2012.

Kokoszka, P. and Zhang, X. Estimation of the autoregressive kernel in the functional AR(1) process. Utah State University, Utah, USA, 2011.


Acknowledgement

Special thanks to Dr. Piotr S. Kokoszka and to my PhD committee members: Dr. Daniel Coster, Dr. Richard Cutler, Dr. John Stevens, and Dr. Lie Zhu.


Thank You.
