*0.5in @let@token thebibliography[1]![1][]== mortgage and … · · 2016-01-29mortgage and...
TRANSCRIPT
Mortgage and Housing Microdata: DatabaseConstruction and Challenges for Statistical
Inference
Nancy Wallace
Haas School of BusinessReal Estate and Financial Markets Laboratory
Fisher Center for Real Estate and Urban Economics
Macro Financial Modeling ConferenceJanuary 29, 2016
1
Policy Challenge Network Structures House Price Indices Conclusions
Statistical Inference in the Mortgage Market
I Standards should be very high given policy objectives.
I Mortgage origination is a high dimensional contracting problem so mostcontract elements are jointly endogenous.
I Lenders set the contract menus based on pricing embedded options (defaultand prepayment) – leads to ex post sample selection and sorting in mortgageorigination data.
I Pre-crisis mortgage regulation controlled by competing regulatory authorities,Federal TILA, state consumer protection and foreclosure laws, zip code levelUnderserved Area Goals.• Induces non-random assignment and rationing of contract types even at
the level of zip codes.
I No standardized loan identification number exists in U.S. – leads tochallenges in data set assembly and sampling.
I Available mortgage data sets are problematic because none of them spanthe contracting space.
2
Policy Challenge Network Structures House Price Indices Conclusions
No Single Data Set Spans the Contracting Space HMDA LPS/
EquifaxABSNet/Lewtan
DataQuick/Corelogic
FNMA/FHLMC
Coverage ~90%Orig.
~20%Orig. ~100%PLS LiensandTransfers
N.A.
Storage(G) 100 8,500 360 2,100 130Loans(M) 262.97 67.40 20.00 118.47 31.31Orig.($T) 44.70 11.50 4.48 36.48 5.61
StaticFieldsLoanLevelProductTypes
All PrimarilyGSE
Non-GSE All GSEFRM
GeographicIdentifiers
CensusTracts
ZipCodes,Zip(3)GSE
ZipCodes Address Zip(3)
BorrowerIncome
Yes No Yes No DTI
InterestRate No Yes Yes No NoOrig.Date Year YM YM YMD NoOrig.Amount Yes Yes Yes Yes YesPurpose P,R P,R,Other P,R,Other P,R P,ROriginator Yes No Aggregator Yes AggregatorHousePrice No Yes–LTV Yes–LTV Yes Yes-LTVOtherLiens No IfEquifax Partial Yes NoSec.Status Partial Yes Yes No Partial
DynamicFieldsLoanLevelPaymentPerformance
No YMD YMD PayoffYMD Y
HousePrice No No No Yes,notattributes
No
CreditScore No IfEquifax No No No
3
Policy Challenge Network Structures House Price Indices Conclusions
Linking mortgage data: Without Loan IDs
DataQuickCharacteris/cs
DataQuickTransac/ons
HMDA LPS
ABSNetRMBS
GSEs
HousePriceModel
MortgagePopula/on
MicroNetworkModel
Compe//on&Funding
MicroCorporateStressTest
MacroStressTest
InterestRateModel
Equifax FlowofFunds Y-9C
Analy&
cs
Data
Techno
logy
HighPerformanceDataManagement:Cloudera/Impala/HadoopHighPerformanceCompu/ngCluster(HPCC)
MicroLoanLevel
StressTest
EDGARBondCUSIP
RMBSBonds
ABSNetMort.
Liens
OtherCredit
4
Policy Challenge Network Structures House Price Indices Conclusions
Combined Matching and Learning Algorithms Success Rate
I Average match rates in the literature are usually around 30% or less.
I DataQuick and ABSNet Merge (2000 - 2013): For zip codes that arepresent in DQ (90% coverage in U.S.) overall match is 87%, 97% forCalifornia.
I DataQuick and HMDA Merge (2005 - 2013): Overall match of 60%,90% for California.
I LPS and DQ merge is currently underway.
I Exploit network structure of lenders and subsidiaries to enhancematches.
5
Policy Challenge Network Structures House Price Indices Conclusions
Why does this Matter?
I Challenges of sampling and population inference using existing data sets.
I Example 1: Inference concerning the competitive structure of the mortgagemarket.
I Example 2: Inference concerning house price dynamics and mortgageperformance.
6
Policy Challenge Network Structures House Price Indices Conclusions
Herfindahls using only HMDA: Market Appears Competitive
Stanton et al. (2014)
25°N
45°N
100°W 80°W
0.035
0.043
0.051
0.058
0.064
0.073
0.082
0.093
0.104
0.117
0.134
0.157
0.185
0.225
0.277
0.367
0.462
0.604
0.850
1.000
7
Policy Challenge Network Structures House Price Indices Conclusions
Mortgage Market Loan-Flow Networks
1.3 Aggregator Subsidiary: Correspondent, wholesale, retail
1.4 Ind./Affil.Dep.
1.5 Ind. MC1.6 Ind.Broker
1.1 HoldingCo. Sec.. Shelf
1.2 HoldingCo. Sec. Shelf
1.0 Bank/ThriftHolding Company
2.0 IBank/Ind.Holding Company
2.1 HoldingCo. Sec Shelf
2.2 HoldingCo. Sec Shelf
2.3 Aggregator Subsidiary: Correspondent, wholesale
2.4 Ind. Dep. 2.5 Ind. MC2.6 Ind.Broker
Supply Chain
Network
8
Policy Challenge Network Structures House Price Indices Conclusions
Example: Trees – 2006 Mortgage Originations
WEPCO FCU
WESTWORKS MORTGAGE
WHITE SANDS FCU
WEST TEXAS STATE BANK
VVV MANAGEMENT CORP
VNB NEW YORK CORP
WESTERN MORTGAGE
BNC MORTGAGE INC.
INDYMAC BANCORP INC
H&R BLOCK INC
WAUWATOSA SAVINGS BANK
WOODFOREST NATIONAL MORTGAGE
TAYLOR, BEAN & WHITAKER
WRENTHAM CP BANK
NEW CENTURY
LUMINENT MORTGAGE
VOLUNTEER FED L S&L
CAPSTEAD MORTGAGE CO
WASHINGTON MUTUAL CAP MARKETS
CITIGROUP INC
WOODS GUARANTEE SAVINGS
WESTERN CITIES FINANCIAL
WEICHERT FINANCIAL C
COUNTRYWIDE FINANCIAL
GENERAL MOTORS CORPORATION
WESTERN CU
WESTERN PLAINS MORTGAGE
WEYERHAEUSER EMPLOYEE CU
WESTERN SIERRA BANK
WEST−AIR COMM FCU
WHITMAN MORTGAGE
YANKEE FARM CREDIT
WASHINGTION MUTUAL BANK
9
Policy Challenge Network Structures House Price Indices Conclusions
Example: Networks – 2006 Mortgage Originations
WEPCO FCU
WESTWORKS MORTGAGE
WHITE SANDS FCU
WEST TEXAS STATE BANK
VVV MANAGEMENT CORP
VNB NEW YORK CORP
WESTERN MORTGAGE
BNC MORTGAGE INC.
INDYMAC BANCORP INC
H&R BLOCK INC
WAUWATOSA SAVINGS BANK
WOODFOREST NATIONAL MORTGAGE
TAYLOR, BEAN & WHITAKER
WRENTHAM CP BANK
NEW CENTURY
LUMINENT MORTGAGE
VOLUNTEER FED L S&L
CAPSTEAD MORTGAGE CO
WASHINGTON MUTUAL CAP MARKETS
CITIGROUP INC
WOODS GUARANTEE SAVINGS
WESTERN CITIES FINANCIAL
WEICHERT FINANCIAL C
COUNTRYWIDE FINANCIAL
GENERAL MOTORS CORPORATION
WESTERN CU
WESTERN PLAINS MORTGAGE
WEYERHAEUSER EMPLOYEE CU
WESTERN SIERRA BANK
WEST−AIR COMM FCU
WHITMAN MORTGAGE
YANKEE FARM CREDIT
WASHINGTION MUTUAL BANK
10
Policy Challenge Network Structures House Price Indices Conclusions
“Mortgage Loan-Flow Networks and Financial Norms,” Stanton,
Walden, and Wallace (2015)I Develop an equilibrium model of intermediaries that share risk in a
financial network.I Key properties:
• Participation in network is voluntary (assume pairwise stability as inJackson and Wolinsky, 1996)
• Agents decide whether to invest in quality• Financial norms represented by investment decisions• Model simple enough to allow for numerical approximation of equilibrium
in mid- large-sized networks.
I Three general implications:1. Network structure influences financial norms – both direct and indirect
counterparties.2. Heterogeneous financial norms often coexist in the network.3. Network proximity is related to financial norms.
11
Policy Challenge Network Structures House Price Indices Conclusions
Key Results
I Decisions to share risk and to invest in screening are both non-monotonic incapital requirements (analytic).
I Heterogeneous screening (quality) policies arise naturally within lendingnetworks, leading to clusters of high and low quality lending (analytic).
I High quality lenders tend to have more connections and these tend to be ofhigher quality (simulation).
I Ex post, a sub-network of concentrated, low quality lending channels isobservable in the 2006 U.S. mortgage market (empirical).
I Systemically, pivotal firms can be forecast (empirical and analytical).
12
Policy Challenge Network Structures House Price Indices Conclusions
Pivotal Firm Linkages
−1 −0.5 0 0.5 1−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
13
Policy Challenge Network Structures House Price Indices Conclusions
No Single Data Set Links Lenders, House Prices with
Time-Matched House Characteristics and Mortgage IDs
DataQuick ModifiedDataQuick
Corelogic Trulia
TransactionPrices
Yes Yes Yes Yes
HouseCharacteristics
Static UpdatedAnnually
Static Static
LienRecordingHistory
Yes Yes No No
Foreclosure Yes Yes Yes YesRecordedMortgageOriginator
Yes Yes No No
MortgageID No No No NoHousePriceIndices
RepeatSales/Hedonic
DynamicHedonic
RepeatSales RepeatSales
14
Policy Challenge Network Structures House Price Indices Conclusions
Shortcomings of repeat-sales indices
I Small sample size: Ignores houses that sell just once during sampleperiod.
I Changing Quality/Quantity: Assumes house characteristics remainconstant (and that relative prices of each characteristic remainconstant).• Adding a new bedroom will make us think prices of unaltered houses
have gone up.I Volatility: Volatility of index will underestimate volatility of individual
house prices.• Construction of index automatically induces smoothing.• Even ignoring this, index is a (somewhat) diversified portfolio, not an
individual house.
I The first-differencing required in repeat-sales estimators mechanicallyinduces serial correlation depending on the timing of sales.
I Sample-selection bias: Are houses that sell representative of housesthat don’t?
15
Policy Challenge Network Structures House Price Indices Conclusions
Problems with repeat-sales indices 1Most houses don’t sell very often
I A house is only included if it sells more than once, so we throw out ahuge fraction of house sales.
I In San Francisco between 2003–2012,• 24,342 houses sold at least once.• 20,778 (85.4%) sold only once, so are discarded.• Index considers only 3,564 houses (14.6%)
I In Los Angeles between 2003–2012,• 236,406 houses sold at least once.• 194,375 (82.2%) sold only once, so are discarded.• Index considers only 42,031 houses (17.8%)
16
Policy Challenge Network Structures House Price Indices Conclusions
Problems with repeat-sales indices 2Changes in property characteristics
I In San Francisco, 2502 Leavenworth sold in 2003 and 2008.• In 2003:
F Square footage = 1,752F Total rooms = 5F Bathrooms = 1F Price = $1,503,000
• In 2008:F Square footage = 2,913F Total rooms = 8F Bathrooms = 3F Price = $5,500,000
I Changes in size (or quality) are wrongly counted as “returns”
17
Policy Challenge Network Structures House Price Indices Conclusions
Problems with Repeat-Sales indices 3The index is not a house price!
I For mortgage valuation, stress testing, etc., we need the distribution offuture house prices.
I In repeat-sales methodology, each period’s index growth is aconstant.• No specification of inter-period dynamics.
I Even if we assume house prices follow (say) geometric Brownianmotion, we can’t just use the volatility of the index.• Index levels are estimated (though standard errors are never shown).• Construction of index automatically induces smoothing.• Volatility of index will underestimate volatility of individual house prices.• Even ignoring this, index is a (somewhat) diversified portfolio, not an
individual house.I We also can’t use index properties to test for (e.g.) serial correlation in
house prices.• Estimation procedure mechanically induces serial correlation in index.
F Even when there is none in individual house prices.
18
Policy Challenge Network Structures House Price Indices Conclusions
“A New Dynamic House-Price Index for Mortgage Valuation and
Stress Testing,” Stanton and Wallace, 2015
I Objective to account for:• Dynamics flexibly.• Property characteristics.• Macroeconomic variables.• All sales, not just repeats.• Unobserved heterogeneity across properties.
I Purpose of the index• Develop suitable dynamic house price index for applications in mortgage
valuation.• Develop suitable index for DFAST nine-quarter-ahead planning horizon –
incorporate macro fundamentals directly into index.
19
Policy Challenge Network Structures House Price Indices Conclusions
New IndexWrite log-price, yi ,t , as
yi ,t =
house price index︷ ︸︸ ︷Abxi ,t + Bbξt + αb,t + µi + εi ,t ,
= X tβb + αb,t + µi + εi ,t ,
αb,t = ραb,t−1 + ηt , where
εi ,t ∼ i.i.d. N(0,σ2ε ),
µi ∼ i.i.d. N(0,σ2µ),
ηi ,t ∼ i.i.d. N(0,σ2η).
I xi ,t = home hedonics (beds, size, lot size).I ξt = macro fundamentals (interest rates, population, unemployment).I µi = house-specific random effect (due to unobservable differences).I αb,t = unexplained (and unobserved) portion of the index.
20
Policy Challenge Network Structures House Price Indices Conclusions
Estimation
I Model can be written as a standard, linear state-space model if weaugment the state, st , to include µi :
st =
αb,t
β1,t
β2,t...
βk ,t
µ1,t
µ2,t...
µI,t
≡
αb,t
β1
β2...βk
µ1
µ2...µI
=
α+ ραb,t−1 + ηt
β1,t−1
β2,t−1...
βk ,t−1
µ1,t−1
µ2,t−1...
µI,t−1
.
21
Policy Challenge Network Structures House Price Indices Conclusions
Estimation
I In principle, we could estimate the model’s parameters by maximizingthe likelihood function for this state space model using standardKalman Filter (Kalman and Bucy, 1961).
I We don’t do it this way.• Dimensionality of problem is very high.• Lots of missing data.• Want to be able to extend to even more general settings.
F E.g., more general distributional assumptions.
I Instead we use Markov Chain Monte Carlo (MCMC) Bayesianmethods (e.g., Johannes and Polson, 2009).• Details in paper.
22
Policy Challenge Network Structures House Price Indices Conclusions
Alameda County: Comparison WRS Indices and Dynamic Index
23
Policy Challenge Network Structures House Price Indices Conclusions
San Diego County: Comparison WRS indices and Dynamic Index
24
Policy Challenge Network Structures House Price Indices Conclusions
San Francisco County: WRS indices and Dynamic Index
25
Policy Challenge Network Structures House Price Indices Conclusions
Alameda Zip = 94618: Comparison Zillow Medians, Medians and
Model
26
Policy Challenge Network Structures House Price Indices Conclusions
San Diego Zip = 92129: Comparison Zillow Medians, Medians and
Model
27
Policy Challenge Network Structures House Price Indices Conclusions
San Francisco, Zip = 94111: Comparison Zillow Medians,
Medians, and Model
28
Policy Challenge Network Structures House Price Indices Conclusions
Conclusions
I Without a mortgage loan ID integrating existing data sets is a significanthurdle – regulatory oversight is challenging
I Random sampling from non-integrated loan-level mortgage data sets maylead to biased estimates of the true population distributions of the contractspace, the geographic coverage of the contracts, and industrial organizationof the industry.
I Lack of dynamic house price data sets present significant challenges fordisentangling price from quality changes in the housing market – especially achallenge in coastal states with high real option values.
• Currently, market and regulators rely on WRS indices due to lack of data.• Use of a fundamentally static price indices leads to significant problems
in mortgage valuation and stress testing.
I New large data set matching and learning algorithms hold considerablepromise in addressing both the mortgage and the house price datachallenges for economists and regulators.
29