SVD and LS


Page 1: SVD and LS

M. A. Miceli “SVD and LS” - Stats in the Château - August 31 - September 4 2009


SVD and LS

M.A. Miceli, University of Rome I

Stats in the Château, Jouy-en-Josas

August 31 - September 4 2009

Page 2: SVD and LS

Motivations

• Problems of high dimensionality in estimation:
– Rank < actual dimension of the data sets (inverse problems)
– Thresholds for accepting variables ease in every dimension as the number of variables/dimensions increases (e.g. Wald test).

• How the SVD helps in extracting robust correlations between dependent and independent variables: automatic choice of “model”.

• Why
• Some evidence in predicting US CPI indexes
• Some issues about normalizations

Page 3: SVD and LS

Motivations

Given a simultaneous linear system of equations

$Y_{T,N} = X_{T,M}\, B_{M,N} + \text{Errors}$

1. Collapsing the dimensionality of the system to its minimum rank = min[rank(Y), rank(X)].

2. Advantages of SVD w.r.t. Principal Components:
• PC requires a square matrix, e.g. the autocorrelation matrix, and ranks the dimensions within that single matrix;
• SVD ranks the correlations between the X and Y dimensions.

3. Discretionary possibility of getting rid of some dimensions that are believed negligible: we are interested in getting rid of those dimensions that can be generated by a totally random system of the same dimensions (Marchenko-Pastur conditions adapted to a rectangular matrix).

Page 4: SVD and LS

Definition of SVD of a matrix product

• SVD definition

$A_{M,N} = U_{M,M}\, S_{M,N}\, V'_{N,N}$

or

$A_{M,N} = U_{M,N}\, S_{N,N}\, V'_{N,N}$

Having two matrices

$Y_{T,N},\; X_{T,M}$

one can write

$X'Y_{M,N} = U_{M,M}\, S_{M,N}\, V'_{N,N}$

and therefore

$Y_{T,N}\, V_{N,N} = X_{T,M}\, U_{M,M}\, S_{M,N}$

If T << max(M,N)? No problems
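The two forms of the SVD on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic Gaussian data; the sizes and all variable names are illustrative and do not come from the slides. It verifies the full and reduced factorizations of X'Y and the identity (X'Y) V = U S obtained by post-multiplying by V.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, N = 50, 6, 3                      # illustrative sizes, not from the slides
X = rng.standard_normal((T, M))
Y = rng.standard_normal((T, N))

A = X.T @ Y                             # M x N cross-product matrix X'Y

# Full SVD: U is M x M, S is M x N (diagonal block padded with zeros), V' is N x N
U_full, s, Vt = np.linalg.svd(A, full_matrices=True)
S_full = np.zeros((M, N))
S_full[:N, :N] = np.diag(s)

# Reduced SVD: U is M x N, S is N x N, V' is N x N (valid here because M >= N)
U_thin, s_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)

full_ok = np.allclose(A, U_full @ S_full @ Vt)
thin_ok = np.allclose(A, U_thin @ np.diag(s_thin) @ Vt_thin)

# Post-multiplying by V moves V' to the left-hand side: (X'Y) V = U S
rotated_ok = np.allclose(A @ Vt.T, U_full @ S_full)
```

Both forms carry the same N singular values; the full form only adds zero rows to S and extra orthonormal columns to U.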

Page 5: SVD and LS

Diagonalizing the LS estimator

• Consider regressing every column $y_n$ on the set of explanatory variables X:

$b_n = (X'X)^{-1} X' y_n$

• we write

$B_{M,N} = (X'X)^{-1}_{M,M}\, (X'Y)_{M,N}$

• We diagonalize both matrices, (X’X) and (X’Y):
– X’X:

$X'X = P_X\, \Lambda\, P_X^{-1}$

– X’Y rectangular:

$X'Y = U_{xy}\, S_{xy}\, V'_{xy}$

– NB. For a symmetric positive semi-definite matrix such as X’X, the SVD IS the same as the diagonalisation. We will write

$X'X = U_{xx}\, S_{xx}\, V'_{xx}$
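As a hedged illustration of this slide, the sketch below (NumPy, synthetic data, illustrative names) builds the LS estimator once directly and once from the eigendecomposition of X'X combined with the SVD of X'Y. It also checks that, for the symmetric positive definite matrix X'X, singular values and eigenvalues coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
T, M, N = 200, 5, 3                     # illustrative sizes
X = rng.standard_normal((T, M))
Y = X @ rng.standard_normal((M, N)) + 0.1 * rng.standard_normal((T, N))

XtX = X.T @ X
XtY = X.T @ Y

# Direct LS estimator B = (X'X)^{-1} (X'Y), solved without forming the inverse
B_direct = np.linalg.solve(XtX, XtY)

# Diagonalisation of the symmetric X'X and SVD of the rectangular X'Y
eigvals, P = np.linalg.eigh(XtX)        # X'X = P diag(eigvals) P'
U_xy, s_xy, Vt_xy = np.linalg.svd(XtY, full_matrices=False)

# Same estimator written through the two decompositions
B_decomp = P @ np.diag(1.0 / eigvals) @ P.T @ U_xy @ np.diag(s_xy) @ Vt_xy

# For X'X the singular values coincide with the eigenvalues
s_xx = np.linalg.svd(XtX, compute_uv=False)
same_spectrum = np.allclose(np.sort(s_xx), np.sort(eigvals))
same_estimator = np.allclose(B_direct, B_decomp)
```

The equality of the two spectra relies on X'X being symmetric with non-negative eigenvalues; it does not hold for a general square matrix.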

Page 6: SVD and LS


Page 7: SVD and LS

SVD of the covariance matrix

[Diagram: matrix blocks (X’Y), Uxy, Sxy with a 0 block, Vxy]

Page 8: SVD and LS

SVD mapping from column basis to row basis

[Diagram: matrix blocks X’Y, Vxy, Uxy, Sxy with a 0 block]

Page 9: SVD and LS

SVD: splitting the product X’Y

[Diagram: Y Vxy (linear combinations of Y) and X Uxy (linear combinations of X), with Sxy]
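One way to read the splitting of X'Y is that the singular vectors define paired linear combinations of the X columns and of the Y columns whose cross-products are exactly the singular values. A minimal sketch, assuming NumPy and synthetic data (the names `FX` and `FY` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
T, M, N = 120, 7, 4                     # illustrative sizes
X = rng.standard_normal((T, M))
Y = rng.standard_normal((T, N))

U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)

FX = X @ U        # T x N: linear combinations of the X columns
FY = Y @ Vt.T     # T x N: linear combinations of the Y columns

# (X U)'(Y V) = U' (X'Y) V = S: the k-th X factor is paired only with the k-th Y factor
cross = FX.T @ FY
```

The off-diagonal entries of `cross` vanish, so each pair of factors can be kept or dropped independently of the others.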

Page 10: SVD and LS


Adding diagonalisation of both X and Y matrices

Page 11: SVD and LS

Returning to the original variables

[Diagram: matrix blocks Y, X, Uxx, Uxy, Inv(Dxx), Sxy, Vxy’, Vyy’]

Replacing the old “B”: any advantage??!!

We may cancel factors: any criterion?

Page 12: SVD and LS

RMT

1. Marchenko-Pastur conditions give the singular value density and interval limits for square matrices. Bouchaud, Miceli et al (2005) derive them for rectangular matrices.

2. We run exactly the same experiment with purely randomly generated matrices “many times”: limits and densities replicate the theory.
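The random benchmark in point 2 can be sketched as a small Monte Carlo experiment. This is only an illustration under stated assumptions: Gaussian entries, column-wise standardisation, and the scaling X'Y/T are choices made here, not details confirmed by the slides. The sizes reuse W=100, M=77, N=9 from the example later in the deck, and the function name is hypothetical.

```python
import numpy as np

def random_singular_values(T, M, N, n_trials, seed=0):
    """Singular values of X'Y/T for independent random X (T x M) and Y (T x N)."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_trials, min(M, N)))
    for i in range(n_trials):
        X = rng.standard_normal((T, M))
        Y = rng.standard_normal((T, N))
        # Standardise each column to zero mean and unit variance (an assumption)
        X = (X - X.mean(axis=0)) / X.std(axis=0)
        Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
        out[i] = np.linalg.svd(X.T @ Y / T, compute_uv=False)
    return out

sv = random_singular_values(T=100, M=77, N=9, n_trials=200)
upper_edge = sv[:, 0].max()    # largest singular value seen under pure noise
lower_edge = sv[:, -1].min()   # smallest singular value seen under pure noise
```

Singular values of the real data that fall inside the band between `lower_edge` and `upper_edge` cannot be distinguished from noise by this criterion.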

Page 13: SVD and LS

Marchenko-Pastur limits and density
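For reference, the classical Marchenko-Pastur result for a T x M matrix X with i.i.d. zero-mean, unit-variance entries says that the eigenvalues of X'X/T, for q = M/T <= 1, concentrate on the interval between (1 - sqrt(q))^2 and (1 + sqrt(q))^2. The sketch below covers this classical case only, not the rectangular X'Y extension cited in the talk, and compares the limits with one simulated matrix; function names are illustrative.

```python
import numpy as np

def mp_edges(q):
    """Lower and upper edges of the Marchenko-Pastur spectrum for q = M/T <= 1."""
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def mp_density(lam, q):
    """Marchenko-Pastur density at the points lam (zero outside the edges)."""
    lo_edge, hi_edge = mp_edges(q)
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    inside = (lam > lo_edge) & (lam < hi_edge)
    rho = np.zeros_like(lam)
    x = lam[inside]
    rho[inside] = np.sqrt((hi_edge - x) * (x - lo_edge)) / (2 * np.pi * q * x)
    return rho

rng = np.random.default_rng(3)
T, M = 2000, 500                         # illustrative sizes, q = 0.25
X = rng.standard_normal((T, M))
eigs = np.linalg.eigvalsh(X.T @ X / T)   # empirical spectrum of the sample covariance
lo, hi = mp_edges(M / T)
```

With finite T and M the extreme empirical eigenvalues fall slightly outside or inside the theoretical edges, which is why the talk also checks the limits by simulation.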

Page 14: SVD and LS

RMT

1. Density and limits change depending on whether we use raw or already diagonalized data.

2. Is this “double diagonalization” worthwhile?

• Singular values are homogeneous of degree 0 (HD0) in standardization, eigenvectors are NOT.

Page 15: SVD and LS

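Diagonalized “LS estimator”

We may approach the same problem in different ways:

1. raw data

$Y_{T,N}\, V^{xy}_{N,N} = X_{T,M}\, U^{xy}_{M,M}\, S^{xy}_{M,N}$

2. normalized factors

$\left(Y_{T,N}\, V^{yy}_{N,N}\, (\Lambda^{yy}_{N,N})^{-1/2}\right) V^{xy}_{N,N} = \left(X_{T,M}\, U^{xx}_{M,M}\, (\Lambda^{xx}_{M,M})^{-1/2}\right) U^{xy}_{M,M}\, S^{xy}_{M,N}$

3. non normalized factors

$Y_{T,N}\, V^{yy}_{N,N}\, V^{xy}_{N,N} = X_{T,M}\, U^{xx}_{M,M}\, [\Lambda^{xx}_{M,M}]^{-1}\, U^{xy}_{M,M}\, S^{xy}_{M,N}$

Here $\Lambda^{xx}$ and $\Lambda^{yy}$ are the diagonal eigenvalue matrices of X’X and Y’Y, and in each case the xy matrices come from the SVD of the cross-product of the variables used in that case.

“Unfortunately” 3. works best. Why? … Is it because factor normalization changes the ranking of the SVD singular values, and this eventually affects the factor selection? NO! Very disturbing.

Answer at the end ….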
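The difference between approaches 2 and 3 can be made concrete with a small sketch. It assumes that the factors are the eigenvector rotations of X'X and Y'Y, that normalisation divides each factor by the square root of its eigenvalue, and that factor selection keeps the k largest singular values; these are assumptions for illustration, and `fit_factor_ls` is a hypothetical helper, not the author's code.

```python
import numpy as np

def fit_factor_ls(X, Y, k, normalize):
    """Fitted values for Y using k SVD factors built on eigen-factors of X and Y."""
    lam_x, U_xx = np.linalg.eigh(X.T @ X)    # X'X = U_xx diag(lam_x) U_xx'
    lam_y, V_yy = np.linalg.eigh(Y.T @ Y)    # Y'Y = V_yy diag(lam_y) V_yy'

    FX = X @ U_xx                             # non-normalised X factors
    FY = Y @ V_yy                             # non-normalised Y factors
    if normalize:
        FX = FX / np.sqrt(lam_x)              # now FX'FX = I
        FY = FY / np.sqrt(lam_y)              # now FY'FY = I

    U_xy, s_xy, Vt_xy = np.linalg.svd(FX.T @ FY, full_matrices=False)
    # Keep only the k largest singular values
    core = U_xy[:, :k] @ np.diag(s_xy[:k]) @ Vt_xy[:k, :]

    if normalize:
        FY_hat = (FX @ core) * np.sqrt(lam_y)   # LS in factor space, then undo Y scaling
    else:
        FY_hat = FX @ (core / lam_x[:, None])   # [Lambda_xx]^{-1} U_xy S_xy V_xy'
    return FY_hat @ V_yy.T                    # rotate back to the original Y columns

rng = np.random.default_rng(6)
T, M, N = 150, 6, 3                           # illustrative sizes
X = rng.standard_normal((T, M))
Y = X @ rng.standard_normal((M, N)) + 0.3 * rng.standard_normal((T, N))

Y_ols = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
fit_norm_all = fit_factor_ls(X, Y, k=N, normalize=True)
fit_raw_all = fit_factor_ls(X, Y, k=N, normalize=False)
fit_norm_1 = fit_factor_ls(X, Y, k=1, normalize=True)
fit_raw_1 = fit_factor_ls(X, Y, k=1, normalize=False)
```

With all factors kept, both variants reproduce the ordinary LS fit, so any difference between them can only come from which directions are discarded when k is reduced.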

Page 16: SVD and LS

Example: Forecasting US CPI Indexes

Time series are month-on-month (mom) % changes:
• Y := 9 CPI indexes, Aug 1983 to Apr 2007
• X := 77 macroeconomic series, Nov 1983 to Apr 2007, including 3 lags of the Ys.

T=282, N=9, M=77, rolling window W=100 or other lengths.

n = N/W, m = M/W.

CPI_CMDTY      Commodities        SA
CPI_APPAREL    Apparel            SA
CPI_FD         Food & Beverages   SA
CPI_HOUS       Housing            SA
CPI_SERV       Services           SA
CPI_TRASP      Transportation     SA
CPI_MEDIC      Medical Care       SA
PPI_TOT_MOM    US PPI             SA
PPI_CORE_MOM   US PPI Core        SA
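To make the rolling-window set-up concrete, the sketch below computes the singular values of the windowed cross-product for every window of length W. Random numbers stand in for the 77 macro series and the 9 price indexes, which are not reproduced here; the standardisation inside each window and the function name are assumptions.

```python
import numpy as np

def rolling_singular_values(X, Y, W):
    """Singular values of the standardised X'Y/W on each rolling window of length W."""
    T = X.shape[0]
    rows = []
    for start in range(T - W + 1):
        Xw = X[start:start + W]
        Yw = Y[start:start + W]
        # Standardise within the window (an assumption, not stated on the slide)
        Xw = (Xw - Xw.mean(axis=0)) / Xw.std(axis=0)
        Yw = (Yw - Yw.mean(axis=0)) / Yw.std(axis=0)
        rows.append(np.linalg.svd(Xw.T @ Yw / W, compute_uv=False))
    return np.array(rows)

rng = np.random.default_rng(4)
T, N, M, W = 282, 9, 77, 100           # sizes quoted on the slide
X = rng.standard_normal((T, M))        # synthetic placeholder for the macro series
Y = rng.standard_normal((T, N))        # synthetic placeholder for the price indexes
sv = rolling_singular_values(X, Y, W)
n, m = N / W, M / W                    # ratios n = 0.09 and m = 0.77
```

With T=282 and W=100 there are T - W + 1 = 183 windows, each giving N = 9 singular values.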

Page 17: SVD and LS

[Figure: CPIs; x-axis 0 to 300, y-axis -0.05 to 0.05]

Page 18: SVD and LS

[Figure: Xs; x-axis 0 to 300, y-axis -0.5 to 0.5]

Page 19: SVD and LS


Estimation by Model III

Page 20: SVD and LS

Singular values - Model I: raw and random data

[Figure: x-axis Sep-93 to Mar-07 (half-yearly ticks), y-axis 0.000 to 1.800]

Page 21: SVD and LS

Singular values: Model I – Randomly generated DATA

[Figure: x-axis 0 to 180, y-axis 0.2 to 0.5]

Page 22: SVD and LS

Singular values - Model I: raw and random data

[Figure: x-axis 1 to 9, y-axis 0.000 to 1.800]

Page 23: SVD and LS

Singular values for SVD on raw and random DATA

[Figure: two panels; first panel x-axis 0.2 to 0.5, y-axis 0 to 90; second panel x-axis 0 to 1.8, y-axis 0 to 350]

Page 24: SVD and LS

Interest Rates Coefficients

[Figure: series R3M_USMACRO, R10Y_USMACRO, R2Y_USD_M; x-axis Sep-96 to Sep-06 (half-yearly ticks), y-axis 0 to 0.35]

Page 25: SVD and LS

Estimation by Model II

Factors are divided by their own eigenvalue

Page 26: SVD and LS

Singular values: Model II – Data NORMALIZED FACTORS

[Figure: x-axis 0 to 180, y-axis 0.65 to 1]

lambda max = 0.934

lambda min = 0.608

Page 27: SVD and LS

Singular values: Model II – Randomly generated NORMALIZED FACTORS

[Figure: x-axis 0 to 180, y-axis 0.5 to 0.95]

lambda max = 0.934

lambda min = 0.608

Randomly generated singular values don’t look very different ….

Page 28: SVD and LS

Singular values: Model II: normalized data and random factors

[Figure: x-axis 1 to 9, y-axis 0.600 to 1.000]

Page 29: SVD and LS

Singular values: Model II: normalized data and random factors

[Figure: x-axis 1 to 8, y-axis 0.600 to 1.000]

Page 30: SVD and LS

Singular values for SVD on raw and random FACTORS

[Figure: two panels; first panel x-axis 0.65 to 1, y-axis 0 to 120; second panel x-axis 0.5 to 1, y-axis 0 to 150]

Page 31: SVD and LS

Let’s see estimations by Model III

Page 32: SVD and LS

P&L Model III - Factors on raw data

[Figure: x-axis 0 to 120, y-axis -10 to 90]

Page 33: SVD and LS

P&L Model III - CPI Indexes (Model of Non Normalized Factors) – In sample

[Figure: two panels, "With ALL svd factors" and "2 svd factors"; x-axis 0 to 120 in both; y-axes -5 to 35 and -5 to 30]

Page 34: SVD and LS

Let’s see estimations by Model II (normalized factors)

Page 35: SVD and LS

P&L Model II (Normalized factors) - Factors

[Figure: x-axis 0 to 120, y-axis -50 to 250]

Page 36: SVD and LS

P&L Model II (Normalized factors) – CPIs

[Figure: x-axis 0 to 120, y-axis -150 to 250]

Page 37: SVD and LS

Example of CPI_comdty estimation

[Figure: two panels, "Normalized factors" and "Non normalized factors"; x-axis 0 to 120 in both; y-axes -5 to 4 and -50 to 40]

Page 38: SVD and LS

OUT OF SAMPLE

• Estimation on t = 1, …, 120
• Forecast at fixed coefficients for t = 121, …, 282
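The out-of-sample design can be sketched as follows. Plain least squares is used here as a stand-in for the SVD-based estimators of the talk, on synthetic data; only T = 282 and the 120-observation estimation sample come from the slide, and the reduced M and N are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
T, T_est = 282, 120                     # sample split from the slide
M, N = 5, 2                             # reduced sizes, for illustration only
X = rng.standard_normal((T, M))
B_true = rng.standard_normal((M, N))
Y = X @ B_true + 0.5 * rng.standard_normal((T, N))

# Estimation on t = 1, ..., 120 (rows 0 to 119)
B_hat, *_ = np.linalg.lstsq(X[:T_est], Y[:T_est], rcond=None)

# Forecast at fixed coefficients for t = 121, ..., 282 (rows 120 to 281)
Y_fore = X[T_est:] @ B_hat
rmse = np.sqrt(np.mean((Y[T_est:] - Y_fore) ** 2, axis=0))
```

Because `B_hat` is never re-estimated, any drift in the true relationship after t = 120 shows up directly in the forecast errors.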

Page 39: SVD and LS

P&L: Factors (Model II)

[Figure: x-axis 0 to 300, y-axis -20 to 160]

Page 40: SVD and LS

Forecast on CPIs

[Figure: two panels, "All factors" and "2 factors only"; x-axis 0 to 300 in both; y-axes -10 to 90 and -10 to 80]

Easier to predict: 1. medical care (since stable), 2. commodities (oil), 3. transportation

Page 41: SVD and LS

Forecasts on CPI Comdty

[Figure: x-axis 0 to 300, y-axis -6 to 6]

Page 42: SVD and LS


Conclusions 1

Page 43: SVD and LS


Conclusions on the example