PG 1
CCAR/DFAST Data Wrangling
Regulatory Environment Summary
Fallout from the 2008-2009 financial crisis included the emergence of a new regulatory landscape intended to safeguard the U.S. banking system from systemic collapse. In 2012, the Federal Reserve Board of Governors (Fed) began to require the largest U.S. Bank Holding Companies (BHCs) to file a Comprehensive Capital Analysis and Review (CCAR), with stress tests intended to assess the capital adequacy of these BHCs in times of crisis. By 2015, CCAR and the stress tests, now known as DFAST (Dodd-Frank Act Stress Tests), were expanded to cover U.S. BHCs with between $10 billion and $50 billion in consolidated assets, as well as foreign banks whose exempt status had expired.
For banks, CCAR reporting and DFAST stress testing are complex, data-intensive endeavors that pose challenges such as the following:
DFAST requires credit modeling and risk assessment at a granular level over vast amounts of data
There is often a need for third-party data from sources such as Trepp to supplement internal data
Retrieving, maintaining, and standardizing both internal and external data is usually difficult and time-consuming
Subsets of data selected for reporting and testing must reflect the existing portfolio of loans at the bank
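One way to check that a selected subset reflects the existing portfolio is to compare the mix of loan types in each. The following is a minimal sketch, not a method described in this document: the `loan_type` field, the helper names, and the sample data are all hypothetical, and both inputs are assumed to be non-empty.

```python
from collections import Counter


def type_shares(loans):
    """Return the fraction of loans in each loan type (assumes a non-empty list)."""
    counts = Counter(loan["loan_type"] for loan in loans)
    total = sum(counts.values())
    return {loan_type: n / total for loan_type, n in counts.items()}


def max_share_gap(portfolio, subset):
    """Largest absolute difference in loan-type share between portfolio and subset."""
    p, s = type_shares(portfolio), type_shares(subset)
    return max(abs(p.get(t, 0.0) - s.get(t, 0.0)) for t in set(p) | set(s))


# Hypothetical data: 60% residential, 40% commercial and industrial.
portfolio = [{"loan_type": "residential"}] * 6 + [{"loan_type": "commercial and industrial"}] * 4
balanced_subset = [{"loan_type": "residential"}] * 3 + [{"loan_type": "commercial and industrial"}] * 2
skewed_subset = [{"loan_type": "residential"}] * 1 + [{"loan_type": "commercial and industrial"}] * 4

balanced_gap = max_share_gap(portfolio, balanced_subset)  # same mix as the portfolio
skewed_gap = max_share_gap(portfolio, skewed_subset)      # residential under-represented
```

A small gap suggests the subset mirrors the portfolio on this one dimension; in practice the same comparison could be repeated for other attributes such as geography or vintage.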
Like many organizations, BHCs store data in several repositories used by organizational units such as Finance, Treasury, and Credit. Stress and risk models repeatedly require data from these silos, augmented by a variety of external data. The latter include, but are not limited to, economic data, exogenous credit scores, and external loan augmentation data such as that provided by Trepp.
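The integration problem described above can be illustrated with a small sketch that merges loan records from several silos and an external source, and flags which sources had nothing for a given loan. The source names, field names, and loan IDs are hypothetical, and a shared loan identifier across sources is assumed; real silos often lack one.

```python
# Hypothetical extracts keyed by a shared loan ID (an assumption).
finance = {"L-100": {"balance": 250_000.0}, "L-200": {"balance": 90_000.0}}
credit = {"L-100": {"internal_rating": "BB"}, "L-300": {"internal_rating": "A"}}
external = {"L-100": {"property_type": "office"}, "L-200": {"property_type": "retail"}}


def integrate(sources):
    """Merge records from named sources by loan ID and list the sources missing for each loan."""
    loan_ids = sorted(set().union(*(source.keys() for source in sources.values())))
    merged = {}
    for loan_id in loan_ids:
        record = {}
        missing = []
        for name, source in sources.items():
            if loan_id in source:
                record.update(source[loan_id])
            else:
                missing.append(name)
        record["missing_sources"] = missing
        merged[loan_id] = record
    return merged


merged = integrate({"finance": finance, "credit": credit, "external": external})
incomplete = [loan_id for loan_id, rec in merged.items() if rec["missing_sources"]]
```

Keeping the list of missing sources on each record makes incomplete data visible early, before it reaches a stress or risk model.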
Most financial institutions have neither the expertise nor the personnel necessary to meet their regulatory requirements efficiently. They therefore require outside data preparation and reporting assistance in the form of staff augmentation or automated solutions. In 2013, a Fed report on the financial industry’s compliance progress noted that several banks’ revenue estimates were inaccurate due to data limitations and weak information management systems.
Why do banks have challenges in data wrangling?
1. Internal data silos
2. Incomplete data
3. Unstructured data
Requirements for CCAR reporting and DFAST stress testing result in complex data challenges for many banks.
BHCs store data in several repositories, posing data integration challenges.
PG 2
Opex Analytics Experience
Opex Analytics has gained substantial experience with extract-transform-load steps in support of CCAR and DFAST from past projects. BHCs typically start by assembling and preparing data mostly by hand. Because reports are required quarterly and annually, such laborious processes soon become a burden. In addition, regulators require the creation of scenarios based on idiosyncratic risk drivers and granular loss estimates. The challenge lies in integrating various data sources that historically have served only specific purposes, including:
Most of the steps performed for each analysis can be automated by creating data marts and automated scripts to perform the following:
Banks often overwrite data with newer information, yet the previous versions (e.g., the borrower’s credit score on the day of the loan application) are indispensable for loan modeling. In addition, there are several loan types, such as adjustable, fixed, commercial and industrial, and residential.
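The overwrite problem can be avoided by appending each new value with its effective date and looking values up "as of" a given date. The sketch below is one possible illustration under stated assumptions, not a description of any particular bank's or Opex's implementation: the `ScoreHistory` class, the borrower ID, the dates, and the scores are all hypothetical.

```python
from bisect import bisect_right
from datetime import date


class ScoreHistory:
    """Append-only history of credit scores per borrower (hypothetical structure)."""

    def __init__(self):
        # borrower_id -> parallel, date-sorted lists of effective dates and scores
        self._dates = {}
        self._scores = {}

    def record(self, borrower_id, effective, score):
        """Add a new score version instead of overwriting the previous one."""
        dates = self._dates.setdefault(borrower_id, [])
        scores = self._scores.setdefault(borrower_id, [])
        pos = bisect_right(dates, effective)
        dates.insert(pos, effective)
        scores.insert(pos, score)

    def as_of(self, borrower_id, when):
        """Return the score in effect on `when`, or None if none existed yet."""
        dates = self._dates.get(borrower_id, [])
        pos = bisect_right(dates, when)
        if pos == 0:
            return None
        return self._scores[borrower_id][pos - 1]


history = ScoreHistory()
history.record("B-001", date(2012, 3, 1), 640)
history.record("B-001", date(2014, 6, 15), 705)

# The score on the (hypothetical) application date stays retrievable
# even though a newer score has since been recorded.
score_at_application = history.as_of("B-001", date(2013, 1, 10))
latest_score = history.as_of("B-001", date.today())
```

The same pattern applies to any attribute that changes over time; in a database it would typically take the form of effective-dated rows rather than in-memory lists.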
PG 3
At Opex, we have executed several projects with BHCs that required automation of data wrangling. Drawing on our expertise in a variety of tools, we create solutions tailored to each individual client. The workflow consists not only of the typical extract, transform, and load steps, but is also augmented with advanced data cleansing techniques that require both technical and business knowledge, and with specific methodologies for automatically interpreting unstructured data.
At Opex Analytics, we use the tool of your choice, be it open-source Python, R, or Java/C, or a commercial offering such as SAS. We assist banks in transitioning from manual extract-transform-load processes in support of CCAR and DFAST to automated and intelligent solutions.
PG 4
Opex DFAST Leadership Team
Diego Klabjan, Ph.D., is a founder of Opex Analytics, where he serves as chief data scientist and technology officer. Diego is a leader in the field of analytics. He is a full professor in Northwestern’s Department of Industrial Engineering and Management Sciences and the Founding Director of its Master of Science in Analytics. He was also in the first group of people to be recognized as Certified Analytics Professionals (CAP) by INFORMS.
Bradford Winkelman is a senior data scientist at Opex Analytics, where he uses his diverse background in optimization and statistical modeling to bring creative solutions to difficult problems. In addition to Bachelor’s degrees in mathematics and economics from Indiana University, he recently completed a Master’s degree in Industrial and Systems Engineering at the University of Wisconsin in Madison. His work experience includes statistical analysis of state highway maintenance quality assurance data, and various analytical roles at Bank of America. At the bank, he first worked within the risk organization, gaining experience in economic time-series analysis and geographic risk assessment, and later developed models for customer credit card behavior.