TRANSCRIPT
Rank Regularized Estimation of Approximate
Factor Models
Jushan Bai Serena Ng
Columbia University
April 2018
Approximate Factor Models Rank Minimization: NP Hard Approximate-Rank Minimization Rank Regularized Factor Models
Outline
1 Approximate Factor Models (APC vs PC)
2 Rank Minimization: NP Hard
3 Approximate-Rank Minimization (RPC vs PC)
4 Rank Regularized Factor Models (Number of Factors; Linear Restrictions)
Overview
Model: X = FΛ′ + e
APC: asymptotic principal components, F = √T U_r;
eigenvectors can be constructed by iterative OLS.
What if we do iterative ridge regressions instead of OLS?
Singular value thresholding ⇒ robust PC
Regularize the rank of the common component
Algorithmic view: finite sample error bounds.
This paper: rank regularized factor analysis
Parametric analysis, asymptotic results for inference.
A new, conservative factor selection rule.
(*) Factor analysis under general linear restrictions.
Notation
X_it ∼ (0, 1), i = 1, …, N, t = 1, …, T.
svd: X = UDV′ = U_r D_r V_r′ + U_{n−r} D_{n−r} V_{n−r}′
Normalized data: Z = X/√(NT) = U D̄ V′, D̄ = D/√(NT).
Unscaled model: X = F⁰Λ⁰′ + e.
Scaled model: Z = F∗Λ∗′ + e∗, with F∗ = F⁰/√T, Λ∗ = Λ⁰/√N.
Asymptotic Principal Components (APC)
min_{F,Λ} (1/NT)‖X − FΛ′‖_F² assuming a strong factor structure:
Σ_F > 0, Σ_Λ > 0; e weakly correlated.
(F̂, Λ̂) = (√T U_r, √N V_r D_r), with F̂′F̂/T = I_r, Λ̂′Λ̂/N = D_r².
(Bai 2003): Under the normalization F′F/T = I_r or Λ′Λ/N = I_r,
√N(F̂_t − H′_NT F⁰_t) →d N(0, Avar(F_t)),
√T(Λ̂_i − G_NT Λ⁰_i) →d N(0, Avar(Λ_i)),
with G = H⁻¹_NT.
Lemma
Rotation matrix: H_NT = (Λ⁰′Λ⁰/N)(F⁰′F̂/T) D̄_r⁻².
Results of independent interest:
Let H_{1,NT} = (Λ⁰′Λ⁰)(Λ̂′Λ⁰)⁻¹; then H_NT = H_{1,NT} + o_p(1).
Let H_{2,NT} = (F⁰′F⁰)⁻¹(F⁰′F̂); then H_NT = H_{2,NT} + o_p(1).
Principal Components (PC)
Recall APC: (F̂, Λ̂) = (√T U_r, √N V_r D_r), F̂′F̂/T = I_r, Λ̂′Λ̂/N = D_r².
Many definitions of PC: e.g. (F, Λ) = (√T U_r D_r, V_r).
This paper defines PC: (F̃, Λ̃) = (√T U_r D_r^{1/2}, √N V_r D_r^{1/2}) = (√T F_z, √N Λ_z).
Normalization: F̃′F̃/T = D_r, Λ̃′Λ̃/N = D_r.
Why? Under APC, F̂′F̂/T = I_r, which is not convenient for imposing restrictions.
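These scalings are easy to verify numerically. A minimal sketch (the simulated panel and dimensions are my own illustration, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, r = 200, 100, 3
F0 = rng.normal(size=(T, r))
L0 = rng.normal(size=(N, r))
X = F0 @ L0.T + rng.normal(size=(T, N))

Z = X / np.sqrt(N * T)                        # normalized data
U, d, Vt = np.linalg.svd(Z, full_matrices=False)
Ur, Dr, Vr = U[:, :r], np.diag(d[:r]), Vt[:r].T

F_apc, L_apc = np.sqrt(T) * Ur, np.sqrt(N) * Vr @ Dr                       # APC
F_pc, L_pc = np.sqrt(T) * Ur @ np.sqrt(Dr), np.sqrt(N) * Vr @ np.sqrt(Dr)  # PC

assert np.allclose(F_apc.T @ F_apc / T, np.eye(r))   # APC: F'F/T = I_r
assert np.allclose(L_apc.T @ L_apc / N, Dr @ Dr)     #      Lam'Lam/N = D_r^2
assert np.allclose(F_pc.T @ F_pc / T, Dr)            # PC:  F'F/T = D_r
assert np.allclose(L_pc.T @ L_pc / N, Dr)            #      Lam'Lam/N = D_r
# both scalings give the same common component
assert np.allclose(F_apc @ L_apc.T, F_pc @ L_pc.T)
```

Either scaling spans the same column space U_r, so the fitted common component is identical; only the normalization differs.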
Relation with APC: F̃ = F̂ D_r^{1/2}, Λ̃ = Λ̂ D_r^{−1/2}.
Define H̃_NT = H_NT D_r^{1/2}. From identities:
√N(F̃_t − H̃′_NT F⁰_t) = √N D_r^{1/2}(F̂_t − H′_NT F⁰_t),
√T(Λ̃_i − H̃⁻¹_NT Λ⁰_i) = √T D_r^{−1/2}(Λ̂_i − H⁻¹_NT Λ⁰_i).
Asymptotic properties:
(i) √N(F̃_t − H̃′_NT F⁰_t) →d N(0, D_r^{1/2} Avar(F_t) D_r^{1/2});
(ii) √T(Λ̃_i − G̃_NT Λ⁰_i) →d N(0, D_r^{−1/2} Avar(Λ_i) D_r^{−1/2}),
with G̃_NT = H̃⁻¹_NT.
Let A be an n × n matrix with eigenvalues in D = diag(d):
Trace norm: Σ_{k=1}^n A_kk = Σ_{k=1}^n d_k
Nuclear norm: ‖A‖_* = Σ_{k=1}^n d_k
Frobenius norm: ‖A‖_F² = Σ_ij A_ij² = trace(A′A)
ℓ1 norm: ‖A‖_1 = Σ_ij |A_ij|
Spectral norm: ‖A‖_2 = max_k |d_k|
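A quick numerical check of these norms (numpy conventions; the test matrix is an arbitrary example of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5))
d = np.linalg.svd(A, compute_uv=False)        # singular values

nuclear = d.sum()                             # ||A||_*
frob2 = (A ** 2).sum()                        # ||A||_F^2
l1 = np.abs(A).sum()                          # entrywise l1 norm
spectral = d.max()                            # ||A||_2

assert np.isclose(frob2, np.trace(A.T @ A))          # ||A||_F^2 = trace(A'A)
assert np.isclose(frob2, (d ** 2).sum())             # = sum of squared d_k
assert np.isclose(nuclear, np.linalg.norm(A, 'nuc'))
assert np.isclose(spectral, np.linalg.norm(A, 2))
assert nuclear >= spectral                           # norm ordering
```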
Spark vs Rank
For A ∈ R^{m×n}, n < m:
spark(A) = min_{x≠0} ‖x‖_0 s.t. Ax = 0
rank(A) = ‖D‖_0 = nnz(D)
spark(A) = size of the smallest set of linearly dependent columns.
rank(A) = size of the largest set of linearly independent columns.
spark(A) = n + 1 ⇔ rank(A) = n.
If spark(A) ≠ n + 1: spark(A) ≤ rank(A) + 1.
spark(A) ≥ 1 + 1/µ(A), where µ(A) = max_{m≠n} |⟨a_m, a_n⟩|.
Computing spark(A) is NP-hard: Tillmann/Pfetsch IEEE-14.
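The combinatorial nature of spark is what the NP-hardness result formalizes: in the worst case every column subset must be tested. A brute-force sketch for a tiny matrix (my own example):

```python
import numpy as np
from itertools import combinations

def spark(A, tol=1e-10):
    """Smallest number of linearly dependent columns (brute force)."""
    m, n = A.shape
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            # a size-k dependent subset exists iff its rank is below k
            if np.linalg.matrix_rank(A[:, cols], tol=tol) < k:
                return k
    return n + 1  # all columns linearly independent

A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 0.]])
# third column = first + second: the smallest dependent set has size 3
assert spark(A) == 3
assert np.linalg.matrix_rank(A) == 2
```

Rank, by contrast, comes from one SVD; the exponential subset search is why spark does not scale.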
NP Hard
NP problems: decision problems for which a "yes" answer can be efficiently verified using deterministic computations performed in polynomial time.
An NP-hard problem is one that admits no general computational solution that is significantly faster than brute-force search.
1. Minimum Rank Factor Analysis
Early factor analysis: decompose Σ_X = Σ_C + Σ_e s.t.
(i) the common-component matrix Σ_C has smallest rank;
(ii) Σ_C is non-negative definite;
(iii) Σ_e is a diagonal positive definite matrix (Heywood cases).
Rank minimization is NP hard (non-convexity).
Evidence in the 1950s suggested many non-zero eigenvalues, which questioned the usefulness of the concept of minimum rank.
1980s: Decompose Σ_X by solving surrogate problems s.t.
(i) Σ_X − Σ_e ≥ 0, (ii) Σ_e ≥ 0.
(i) CMTFA: min trace(Σ_X − Σ_e) = Σ_{i=1}^N D^C_ii
(ii) MARFA: C = C∗ + C⁻.
C∗ is the best minimum rank approximation of C; rank(C∗) = r; min Σ_{i=r+1}^N D^C_ii.
Approximate Minimum Rank: ten Berge-Kiers (1991)
min_r s.t. Σ_{i=r+1}^N D^C_ii ≤ δ, subject to (i)+(ii). (*)
δ: tolerance for the maximum unexplained common variance.
The approximate minimum rank of Σ_C is the smallest r that solves (*) for some δ ≥ 0.
Minimum rank: special case of δ = 0.
Sum of eigenvalues is convex.
2. Matrix Completion
Complete the matrix Z with missing values.
Ω = index set of positions of observed data.
Underdetermined without some structure.
Assume the latent matrix L is low rank.
Netflix challenge: L = AB′, A = movie genres, B = tastes.
Hard problem:
min rank(L) with L_ij = Z_ij, (i, j) ∈ Ω.
Surrogate problem:
min ‖L‖_* with L_ij = Z_ij, (i, j) ∈ Ω.
Z can be recovered if (i) there are not too many missing values, and (ii) they are missing at random.
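The surrogate problem can be attacked by a soft-impute style iteration: fill the missing entries with the current estimate, then soft-threshold the singular values. A sketch (the sampling rate, shrinkage level, and iteration count are my own choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, r = 60, 40, 2
L_true = rng.normal(size=(T, r)) @ rng.normal(size=(r, N))
mask = rng.random((T, N)) < 0.7           # Omega: ~70% of entries observed

def svt(M, gamma):
    # soft-threshold the singular values of M
    U, d, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(d - gamma, 0.0)) @ Vt

L = np.zeros((T, N))
for _ in range(200):
    # impute missing entries with the current estimate, then shrink
    L = svt(np.where(mask, L_true, L), gamma=0.5)

rel_err = np.linalg.norm(L - L_true) / np.linalg.norm(L_true)
assert rel_err < 0.2   # low-rank matrix approximately recovered
```

With a low-rank target and enough randomly placed observations, the iteration recovers the full matrix up to the small shrinkage bias.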
3. Low Rank Decomposition
Eckart-Young: the best rank r approximation of Z is U_r D_r V_r′.
svd is sensitive to noise corruption.
Z = L (low rank) + S (sparse, big noise)
Compressed sensing: solve underdetermined systems, recover sparse signals.
Computer vision: S = background noise.
Hard problem:
min_{L,S} rank(L) + γ‖S‖_0 (sparsity constraint).
Objective function and constraint both non-convex.
Candes et al (2009)
Surrogate problem is convex:
min_{L,S} ‖L‖_* + γ‖S‖_1.
L, S can be recovered with high probability under incoherence conditions: L not sparse, S not low rank.
General problem:
Z = L (low rank) + S (sparse, big noise) + W (small noise)
min_{L,S} ‖L‖_* + γ‖S‖_1, with ‖W‖_F ≤ δ.
Overview: Good to Relax
Hard problems: rank function.
Surrogate problems: nuclear norm.
Cai et al (2008, Theorem 1):
U_r D^γ_r V_r′ = argmin_L γ‖L‖_* + (1/2)‖Z − L‖_F².
SVT = singular value thresholding operator:
D^γ_r = diag((D_11 − γ)+, …, (D_rr − γ)+)
SVT is the proximal operator of the nuclear norm.
Optimal low rank approximation under the rank constraint: U_r D^γ_r V_r′.
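A sanity check that the SVT operator minimizes the surrogate objective (a crude comparison against nearby candidates, not a proof; the matrix and γ are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.normal(size=(8, 6))
gamma = 0.8

def objective(L):
    # gamma*||L||_* + (1/2)||Z - L||_F^2
    return gamma * np.linalg.norm(L, 'nuc') + 0.5 * np.linalg.norm(Z - L) ** 2

# SVT: soft-threshold the singular values of Z
U, d, Vt = np.linalg.svd(Z, full_matrices=False)
L_svt = U @ np.diag(np.maximum(d - gamma, 0.0)) @ Vt

# SVT should (weakly) beat simple alternatives and random perturbations
assert objective(L_svt) <= objective(Z) + 1e-9
assert objective(L_svt) <= objective(np.zeros_like(Z)) + 1e-9
for _ in range(20):
    P = L_svt + 0.1 * rng.normal(size=Z.shape)
    assert objective(L_svt) <= objective(P) + 1e-9
```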
Relation to Factor Models
We have the low rank solution
U_r D^γ_r V_r′ = argmin_L γ‖L‖_* + (1/2)‖Z − L‖_F². (1)
L (of rank r) can be factorized: L = AB′,
min_{A,B} γ‖AB′‖_* + (1/2)‖Z − AB′‖_F². (2)
Theorem: (A, B) solves (2) iff L = AB′ solves (1).
Solution: Robust Principal Components (RPCA)
A = U_r (D^γ_r)^{1/2}, B = V_r (D^γ_r)^{1/2}.
Sketch of idea, γ = 0: AB′ = U_r D_r V_r′:
trace(D_r) = trace(U_r′ A B′ V_r) ≤ ‖A‖_F ‖B‖_F ≤ (1/2)(‖A‖_F² + ‖B‖_F²).
L = AB′, ‖L‖_* = trace(D_r) ≤ (1/2)(‖A‖_F² + ‖B‖_F²).
Put A = U D_r^{1/2}, B = V D_r^{1/2}:
(1/2)(‖A‖_F² + ‖B‖_F²) = (1/2)(‖D_r^{1/2}‖_F² + ‖D_r^{1/2}‖_F²) = ‖D_r‖_1.
Bound holds with equality: ‖D_r‖_1 = (1/2)(‖A‖_F² + ‖B‖_F²).
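The equality at the end of the sketch can be confirmed numerically (arbitrary test matrix, γ = 0):

```python
import numpy as np

rng = np.random.default_rng(4)
Z = rng.normal(size=(7, 5))
U, d, Vt = np.linalg.svd(Z, full_matrices=False)
D_half = np.diag(np.sqrt(d))

A = U @ D_half          # A = U D^{1/2}
B = Vt.T @ D_half       # B = V D^{1/2}

assert np.allclose(A @ B.T, Z)          # L = AB' reproduces Z
lhs = d.sum()                           # ||L||_* = trace(D)
rhs = 0.5 * ((A ** 2).sum() + (B ** 2).sum())
assert np.isclose(lhs, rhs)             # the bound is attained with equality
```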
FOC View
FOC: (i) −(Z − AB′)B + γA = 0, (ii) −(Z − AB′)′A + γB = 0.
Left multiplying (i) by A′ and (ii) by B′: A′A = B′B. Rearranging:
[ −γI  Z ; Z′  −γI ] [ A ; B ] = [ A ; B ] A′A.
This has the generic structure ZV = VX.
Eigenvalues of X are those of Z; V are the corresponding eigenvectors.
A = U_r (D^γ_r)^{1/2}, B = V_r (D^γ_r)^{1/2}.
A particular normalization.
Factor Analysis and RPC
With A′A = B′B = D^γ_r:
RPC of Z: (A, B) = (U_r (D^γ_r)^{1/2}, V_r (D^γ_r)^{1/2})
RPC of X: (F̄, Λ̄) = (√T U_r (D^γ_r)^{1/2}, √N V_r (D^γ_r)^{1/2})
PC of X: (F̃, Λ̃) = (√T U_r D_r^{1/2}, √N V_r D_r^{1/2})
Relation between RPC and PC:
F̄ = F̃ (D^γ_r D_r⁻¹)^{1/2}, Λ̄ = Λ̃ (D^γ_r D_r⁻¹)^{1/2}.
Even big factors will be shrunk.
Small factors can be killed since rank(D^γ_r) ≤ r.
Sparse large noise is not treated as factors.
Smaller common component: var(C̄) ≤ var(C̃).
Effects of Regularization: Δ²_NT = D^γ_r D_r⁻¹.
H̄_NT = H̃_NT Δ_NT.
F̄_t − H̄′_NT F⁰_t = Δ_NT (F̃_t − H̃′_NT F⁰_t)
Λ̄_i − Ḡ_NT Λ⁰_i = Δ_NT (Λ̃_i − H̃⁻¹_NT Λ⁰_i)
Proposition
(i) √N(F̄_t − H̄′_NT F⁰_t) →d N(0, Δ_∞ Avar(F̃) Δ_∞);
(ii) √T(Λ̄_i − Ḡ_NT Λ⁰_i) →d N(0, Δ_∞ Avar(Λ̃) Δ_∞).
Unlike APC and PC, Ḡ_NT = Δ_NT H̄⁻¹_NT ≠ H̄⁻¹_NT.
Bias/Variance Tradeoff
diag(Δ_∞) = δ, δ_i < 1. The Proposition implies
Avar(F̄) ≤ Avar(F̃), and Avar(Λ̄) ≤ Avar(Λ̃).
Regularization bias since C̃ = U_r D_r V_r′ ≠ C̄ = U_r D^γ_r V_r′.
Case r = 1: δ_1 = (D_11 − γ)+ / D_11, C̄_it = δ_1 C̃_it.
Abias(C̄_it) = (δ_1 − 1) C⁰_it
Avar(C̄_it) = δ_1² Avar(C̃_it)
Amse(C̄_it) = (δ_1 − 1)² (C⁰_it)² + δ_1² Amse(C̃_it).
Relative MSE < 1 when Amse(C̃_it) is large:
Amse(C̄_it)/Amse(C̃_it) = (δ_1 − 1)² (C⁰_it)²/Amse(C̃_it) + δ_1².
Asymptotic vs. Finite Sample Results
Z = L + S is consistent with many probabilistic structures.
Econometric theory: X = F⁰Λ⁰′ + e, Z = X/√(NT).
Strong factor structure: Σ_F > 0, Σ_Λ > 0.
r population eigenvalues diverge with N.
Estimation: choose F, Λ with e residually determined.
min(√N, √T)(C̃_it − C⁰_it) →d N(0, Avar(C_it)).
Machine Learning Results
Solve the problem given data (finite sample).
Choose L and S simultaneously.
Netflix/noiseless problems: no reference to eigenvalues.
Incoherence condition: L is not sparse.
S is selected uniformly at random and is not low rank.
For γ = 1/√(max(m,n)), (L̂, Ŝ) = (L, S) with probability 1 − c_0 n⁻¹⁰ if
‖S‖_0 < c_1 m n and rank(L) ≤ c_1 min(m,n) µ⁻¹ (log max(m,n))⁻².
Agarwal, Negahban and Wainwright (2012, Annals of Statistics)
M-estimation based on the regularized nuclear norm. Assume restricted strong convexity of the loss function.
With noisy data, cannot exactly recover L.
What matters are the eigenvectors of the largest singular values.
err² = ‖L̂ − L‖_F² + ‖Ŝ − S‖_F²
If ‖L‖_∞ < c/√(mn), then with high probability err² ≤ c (N+T)/(NT).
‖L‖_∞ = max_it |L_it|, ‖L‖_F² = Σ_{i=1}^r d_i².
‖L‖_∞ < c/√(mn) is a constraint on the sum of eigenvalues.
(N+T)/(NT) ≈ min(N,T)⁻¹.
Econometric theory: min(√N, √T)(C̃_it − C_it) = O_p(1).
Different objectives, but the results broadly agree.
Also related: Bertsimas, Copenhaver, Mazumder (2016), Lettau and Pelger (2017).
Number of Factors: min rank + model complexity
BaiNg-02: r̂ = argmin_k log(ssr_k) + k g(N,T), ssr_k = ‖Z − F̂_k Λ̂_k′‖_F²
BaiNg-17: r̄ = argmin_k log(s̄sr_k) + k g(N,T), s̄sr_k = ‖Z − F̄_k Λ̄_k′‖_F²
ssr_k = 1 − Σ_{j=1}^k d_j², s̄sr_k = 1 − Σ_{j=1}^k (d_j − γ)+²
IC̄_k ≈ IC_k + γ Σ_{j=1}^k (2d_j − γ)/ssr_k.
A data dependent, heavier penalty.
r̂ ≥ r∗: sparse outliers or weak factors.
With ‖Z‖_F = 1, γ = .05 reduces the contribution of factor i from d_i² to (d_i − .05)+². The effect on small factors is proportionally larger.
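The two rules can be sketched directly from the eigenvalues of the normalized panel. In this sketch, the penalty g(N,T) is a standard Bai-Ng style choice, and the DGP, γ, and kmax are my own assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(5)
T, N, r0 = 200, 100, 3
X = rng.normal(size=(T, r0)) @ rng.normal(size=(r0, N)) + rng.normal(size=(T, N))
Z = X / np.linalg.norm(X)                    # normalize so that ||Z||_F = 1
d = np.linalg.svd(Z, compute_uv=False)

g = (N + T) / (N * T) * np.log(min(N, T))    # Bai-Ng style penalty (assumption)
kmax, gamma = 10, 0.05

def pick(ssr_fn):
    # minimize log(ssr_k) + k g(N,T) over k = 1..kmax
    ics = [np.log(ssr_fn(k)) + k * g for k in range(1, kmax + 1)]
    return 1 + int(np.argmin(ics))

r_hat = pick(lambda k: 1 - (d[:k] ** 2).sum())                         # PC rule
r_bar = pick(lambda k: 1 - (np.maximum(d[:k] - gamma, 0) ** 2).sum())  # RPC rule

assert r_hat == r0
assert r_bar <= r_hat    # thresholded eigenvalues act as a heavier penalty
```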
Implications for Factor Augmented Regressions
y_{t+h} = α′F_t + β′W_t + ε_{t+h}.
Replacing F by F̂, F̃, or F̄ gives identical fit! They are all spanned by U_r, hence perfectly correlated.
The estimates of α simply adjust for the scale difference.
For F̄ to have an effect, do ridge regressions. Given κ:
α_OLS = (F̄′F̄)⁻¹F̄′y = (D^γ_r)^{−1/2} U′y/√T
α_R = (F̄′F̄ + κI_r)⁻¹F̄′y
    = (D^γ_r + (κ/T)I_r)⁻¹ D^γ_r α_OLS = (I_r + (κ/T)(D^γ_r)⁻¹)⁻¹ α_OLS
    ≈ (I_r − (κ/T)(D^γ_r)⁻¹) α_OLS.
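The ridge adjustment is an exact finite-sample identity when F′F is diagonal. A check (the simulated y and the toy D^γ_r values are my own; F is scaled so that F′F = T D^γ_r):

```python
import numpy as np

rng = np.random.default_rng(6)
T, r = 150, 3
U, _ = np.linalg.qr(rng.normal(size=(T, r)))
Dg = np.diag([0.9, 0.5, 0.2])                # toy thresholded singular values
F = np.sqrt(T) * U @ np.sqrt(Dg)             # F'F = T * Dg
y = rng.normal(size=T)
kappa = 2.0

a_ols = np.linalg.solve(F.T @ F, F.T @ y)
a_ridge = np.linalg.solve(F.T @ F + kappa * np.eye(r), F.T @ y)

# exact identity: a_ridge = (I + (kappa/T) Dg^{-1})^{-1} a_ols
shrink = np.linalg.inv(np.eye(r) + (kappa / T) * np.linalg.inv(Dg))
assert np.allclose(a_ridge, shrink @ a_ols)

# smaller (thresholded) singular values are shrunk proportionally more
ratios = a_ridge / a_ols
assert np.all((ratios > 0) & (ratios < 1)) and ratios[2] < ratios[0]
```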
RPC by SVT via Iterative Ridge
Given an m × n matrix Z, initialize an m × r matrix F = UD where U is orthonormal and D = I_r.
A. Repeat till convergence:
i. (solve Λ given F): Λ = Z′F(F′F + γI_r)⁻¹.
ii. svd(Λ) = U_Λ D_Λ V_Λ′; set Λ = U_Λ D_Λ and D = D_Λ.
iii. (solve F given Λ): F = ZΛ(Λ′Λ + γI_r)⁻¹.
iv. svd(F) = U_F D_F V_F′; set F = U_F D_F and D = D_F.
B. (Cleanup) From svd(ZU_Λ) = U_r D_r V_r′, let V_r′ = V′U_r, D^γ_r = (D_r − γI_r)+.
Useful when T, N are large and a direct svd is expensive.
Iterative ridge regressions implement SVT.
The cleanup takes care of numerical precision problems.
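A stripped-down sketch of the iteration: I keep only the two ridge steps and drop the per-iteration SVD renormalization and the cleanup, so this is not the full algorithm above, just its core. The fixed point still reproduces the direct SVT solution:

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, r, gamma = 40, 30, 3, 0.3
Z = rng.normal(size=(m, r)) @ rng.normal(size=(r, n)) / np.sqrt(m * n) \
    + 0.01 * rng.normal(size=(m, n))

F, _ = np.linalg.qr(rng.normal(size=(m, r)))        # F = UD with D = I_r
for _ in range(500):
    Lam = Z.T @ F @ np.linalg.inv(F.T @ F + gamma * np.eye(r))    # ridge in Lam
    F = Z @ Lam @ np.linalg.inv(Lam.T @ Lam + gamma * np.eye(r))  # ridge in F
C_iter = F @ Lam.T                                  # common component

# direct SVT of Z for comparison
U, d, Vt = np.linalg.svd(Z, full_matrices=False)
C_svt = U[:, :r] @ np.diag(np.maximum(d[:r] - gamma, 0.0)) @ Vt[:r]

assert np.allclose(C_iter, C_svt, atol=1e-6)
```

The individual factors converge only up to a rotation, which is why the comparison is on the product FΛ′ rather than on F and Λ separately.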
Generalized Ridge
General regularized problem:
(F^{γ1,γ2}, Λ^{γ1,γ2}) = argmin_{F,Λ} (1/2)‖Z − FΛ′‖_F² + (γ1/2)‖F‖_F² + (γ2/2)‖Λ‖_F².
Let D^γ_r = (D_r − √(γ1γ2) I_r)+. The solution is
F^{γ1,γ2} = (γ2/γ1)^{1/4} U_r (D^γ_r)^{1/2}
Λ^{γ1,γ2} = (γ1/γ2)^{1/4} V_r (D^γ_r)^{1/2}
C^{γ1,γ2} = U_r D^γ_r V_r′.
Monte Carlo
X_it = F⁰′_t Λ⁰_i + e_it + s_it, e_it ∼ (0, 1)
Sparse error: s_it ∼ N(µ, ω²) if (i, t) ∈ Ω.
[κ_N N] units have outliers in [κ_T T] of the sample.
(κ_N, κ_T) = (0.1, 0.03), ω ∈ (5, 10, 20), µ = 5, r = 5.
DGP1 (outliers): F⁰_t ∼ N(0, I_r), Λ⁰_i ∼ N(0, I_r).
DGP2 (weak loadings): F⁰ = U_r D_r^{1/2}, Λ⁰ = V_r D_r^{1/2},
diag(D_r) = [1, 0.8, 0.5, 0.3, 0.2θ], ω = 5, θ ∈ (1, 0.75, 0.5).
Case 1: Outlier, ω = 5
[figure]
Case 2: Small Eigenvalue, θ = 0.75
[figure]
Table 1: DGP 1, N = 100, r = 5, r∗ = 5
T, ω      signal / noise      mean: r̂, r̄      span of F⁰
100, 5    0.83  0.12  0.00    5.00  5.00    0.98  0.98  0.98  0.98
100, 10   0.83  0.12  0.00    5.00  5.00    0.98  0.98  0.98  0.98
100, 20   0.83  0.12  0.00    5.00  5.00    0.98  0.98  0.98  0.98
200, 5    0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98
200, 10   0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98
200, 20   0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98
400, 5    0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98
400, 10   0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98
400, 20   0.83  0.13  0.00    5.00  5.00    0.98  0.98  0.98  0.98

100, 5    0.81  0.12  0.02    5.36  5.00    0.63  0.98  0.92  0.98
100, 10   0.78  0.12  0.06    5.79  5.00    0.28  0.98  0.85  0.97
100, 20   0.69  0.12  0.17    6.81  5.00    0.01  0.97  0.72  0.97
200, 5    0.81  0.13  0.02    5.67  5.00    0.32  0.98  0.87  0.98
200, 10   0.78  0.13  0.06    5.91  5.00    0.19  0.98  0.84  0.98
200, 20   0.69  0.13  0.17    7.13  5.00    0.00  0.98  0.69  0.98
400, 5    0.81  0.13  0.02    5.88  5.00    0.12  0.98  0.84  0.98
400, 10   0.78  0.13  0.06    5.90  5.00    0.16  0.98  0.84  0.98
400, 20   0.69  0.13  0.18    7.15  5.00    0.00  0.98  0.69  0.98
Table 2: DGP 2, N = 100, r = 5, r∗ = 3, ω = 5
T, θ       signal / noise      mean: r̂, r̄      span of F⁰
100, 1.00  0.67  0.02  0.00    3.94  3.00    0.07  0.95  0.74  0.96
100, 0.75  0.67  0.01  0.00    3.95  3.00    0.05  0.95  0.73  0.96
100, 0.50  0.67  0.01  0.00    3.97  3.00    0.04  0.95  0.73  0.96
200, 1.00  0.67  0.02  0.00    4.01  3.00    0.00  0.95  0.73  0.97
200, 0.75  0.67  0.01  0.00    4.00  3.00    0.00  0.95  0.73  0.97
200, 0.50  0.67  0.01  0.00    4.00  3.00    0.00  0.95  0.73  0.97
400, 1.00  0.67  0.02  0.00    4.26  3.00    0.00  0.95  0.69  0.97
400, 0.75  0.67  0.01  0.00    4.00  3.00    0.00  0.95  0.73  0.97
400, 0.50  0.67  0.01  0.00    4.00  3.00    0.00  0.95  0.73  0.97

100, 1.00  0.60  0.02  0.11    4.81  2.93    0.01  0.93  0.61  0.96
100, 0.75  0.59  0.01  0.11    4.84  2.95    0.01  0.93  0.60  0.96
100, 0.50  0.59  0.01  0.11    4.86  2.96    0.01  0.93  0.60  0.96
200, 1.00  0.60  0.02  0.11    5.01  3.00    0.01  0.93  0.58  0.96
200, 0.75  0.59  0.01  0.11    5.00  3.01    0.01  0.93  0.58  0.96
200, 0.50  0.59  0.01  0.11    5.00  3.01    0.01  0.93  0.58  0.96
400, 1.00  0.60  0.02  0.11    5.21  3.10    0.00  0.84  0.56  0.94
400, 0.75  0.59  0.01  0.11    5.00  3.12    0.00  0.83  0.58  0.94
400, 0.50  0.59  0.01  0.11    5.00  3.13    0.00  0.82  0.58  0.93
FRED-MD Data
Eigenvalues
        Balanced Panel        Non-Balanced Panel
F       d²_i      d̄²_i        d²_i      d̄²_i
1       0.1828    0.1426      0.1493    0.1131
2       0.0921    0.0643      0.0709    0.0468
3       0.0716    0.0473      0.0682    0.0446
4       0.0604    0.0384      0.0561    0.0349
5       0.0453    0.0265      0.0426    0.0245
6       0.0416    0.0237      0.0341    0.0182
7       0.0301    0.0152      0.0317    0.0164
8       0.0287    0.0143      0.0268    0.0129
(r̂, r̄)  8         3           8         3
Financial Data
Eigenvalues
        Balanced Panel        Non-Balanced Panel
F       d²_i      d̄²_i        d²_i      d̄²_i
1       0.6896    0.6090      0.6800    0.6001
2       0.0464    0.0274      0.0447    0.0261
3       0.0341    0.0181      0.0337    0.0178
4       0.0138    0.0045      0.0141    0.0047
5       0.0114    0.0032      0.0133    0.0043
6       0.0092    0.0021      0.0109    0.0030
7       0.0072    0.0012      0.0090    0.0020
8       0.0066    0.0010      0.0075    0.0013
(r̂, r̄)  8         3           8         3
Linear Restrictions: R vec(Λ) = φ
(F^{γ,τ}, Λ^{γ,τ}) = argmin_{F,Λ} (1/2)‖Z − FΛ′‖_F² + (γ/2)(‖F‖_F² + ‖Λ‖_F²) + (τ/2)‖R vec(Λ) − φ‖_2².
Vector form: ‖Z − FΛ′‖_F² = ‖vec(Z′) − (F ⊗ I_N)vec(Λ)‖_2².
Allows cross-equation restrictions.
F given Λ: F^{γ,τ} = ZΛ(Λ′Λ + γI_r)⁻¹ (standard ridge)
Λ given F (generalized ridge):
vec(Λ^{γ,τ}) = ((F′F ⊗ I_N) + γI_{Nr} + τR′R)⁻¹ [vec(Z′F) + τR′φ]
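A sketch of the Λ-step (toy dimensions and a single restriction of my own choosing). With τ = 0 it reproduces the standard ridge step; as τ grows, the restriction binds:

```python
import numpy as np

rng = np.random.default_rng(9)
N, T, r = 6, 40, 2
Z = rng.normal(size=(T, N))
F = rng.normal(size=(T, r))
gamma = 0.1

# tau = 0 reproduces Lambda = Z'F (F'F + gamma I)^{-1}
A0 = np.kron(F.T @ F, np.eye(N)) + gamma * np.eye(N * r)
b0 = (Z.T @ F).reshape(-1, order='F')            # vec(Z'F), column-stacked
lam0 = np.linalg.solve(A0, b0)
ridge = Z.T @ F @ np.linalg.inv(F.T @ F + gamma * np.eye(r))
assert np.allclose(lam0, ridge.reshape(-1, order='F'))

# one toy restriction: the (1,1) loading is zero
R = np.zeros((1, N * r)); R[0, 0] = 1.0; phi = np.zeros(1)
tau = 1e6
vec_lam = np.linalg.solve(A0 + tau * R.T @ R, b0 + tau * R.T @ phi)
assert np.abs(R @ vec_lam - phi).max() < 1e-3    # restriction nearly binds
```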
Implementation when constraints bind
Let W_F = (F′F + γI_r)⁻¹. Then
vec(Λ^{γ,∞}) = vec(Λ^{γ,0}) − (W_F ⊗ I_N)R′ [R(W_F ⊗ I_N)R′]⁻¹ (R vec(Λ^{γ,0}) − φ)
Two Step Approach
1. Estimate without linear restrictions, τ = 0: Λ^{γ,0} = Z′F_k(F_k′F_k + γI_r)⁻¹.
2. Impose binding linear restrictions:
vec(Λ^{γ,∞}) = vec(Λ^{γ,0}) − (W^k_F ⊗ I_N)R′ [R(W^k_F ⊗ I_N)R′]⁻¹ (R vec(Λ^{γ,0}) − φ)
Note: F^{γ,∞}′F^{γ,∞} and Λ^{γ,∞}′Λ^{γ,∞} will not, in general, be diagonal.
Conclusion
Iterative least squares: PC.
Iterative ridge: implements SVT.
SVT solves a surrogate of the minimum rank problem.
min rank + parsimony ⇒ IC̄, a data dependent penalty.
FRED-MD, finance data: r̂ = 8, r̄ = 3.
Factor estimation under linear restrictions.
Missing values problem: in progress.
Incoherence Conditions
U ∈ R^{T×r}, V ∈ R^{N×r}
Single incoherence: singular vectors are not too skewed:
max_{i=1,…,T} ‖U′e_i‖² ≤ µ_0 r/T and max_{j=1,…,N} ‖V′e_j‖² ≤ µ_0 r/N.
Joint incoherence: singular vectors are not too correlated:
max_{i,j} |(UV′)_ij| ≤ √(µ_1 r/(NT)).
Singular vectors are reasonably spread out for small µ.
Σ_i (UV′)²_ij = ‖V′e_j‖_2² and Σ_j (UV′)²_ij = ‖U′e_i‖_2².
µ_1 dominates.
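These quantities are cheap to compute for a given basis. A sketch for random orthonormal U, V (my own example; random subspaces are incoherent, so µ_0 stays modest):

```python
import numpy as np

rng = np.random.default_rng(11)
T, N, r = 200, 100, 2
U, _ = np.linalg.qr(rng.normal(size=(T, r)))       # orthonormal factor basis
V, _ = np.linalg.qr(rng.normal(size=(N, r)))

mu0_U = np.sum(U ** 2, axis=1).max() * T / r       # max_i ||U'e_i||^2 * T/r
mu0_V = np.sum(V ** 2, axis=1).max() * N / r
mu1 = np.abs(U @ V.T).max() ** 2 * N * T / r       # joint incoherence

# row norms of U sum to r, so mu0 >= 1; random bases keep it modest
assert mu0_U >= 1.0 and mu0_V >= 1.0
assert mu0_U < 20 and mu0_V < 20

# the identity from the slide: sum_i (UV')_{ij}^2 = ||V'e_j||^2
col = (U @ V.T)[:, 0]
assert np.isclose((col ** 2).sum(), (V[0] ** 2).sum())
```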
Example when the incoherence condition fails:
Z = [1 0 0; 0 0 0; 0 0 0] = (1, 0, 0)′ [1] (1 0 0)
Z is too sparse, and its singular vectors are too sparse.
Completion requires many entries of Z to be observed.
Rank is easy to compute; spark needs a combinatorial search.
spark(A) ≤ m + 1 = rank(A) + 1 (when A has full row rank m).
Donoho and Elad (2003): spark(A) ≥ 1 + µ⁻¹(A).
Stable ℓ1 recovery: min ‖x‖_1 s.t. ‖Ax − b‖_2 ≤ ε.
Coherence-based guarantee: if A has normalized columns and Ax = b has a solution satisfying ‖x‖_0 < (1 + µ⁻¹(A))/2, then x is the unique sparse solution.
Restricted Isometry Property: an m × n matrix A satisfies the RIP of order k if
(1 − δ_k)‖z‖_2² ≤ ‖Az‖_2² ≤ (1 + δ_k)‖z‖_2² for all z with ‖z‖_0 ≤ k.
RIP ensures that the matrix is properly scaled on sparse vectors.
Statistical RIP: P(|‖Ax‖_2² − ‖x‖_2²| ≤ δ‖x‖_2²) ≥ 1 − ε with respect to a uniform distribution of x among all k-sparse vectors in R^n.