Equity Portfolio Analysis using Data Mining Rajesh Shekhar Data Mining Prof. Chris Volinsky

Post on 20-Dec-2015


Equity Portfolio Analysis using Data Mining

Rajesh Shekhar

Data Mining
Prof. Chris Volinsky

◦ Use data mining techniques to build a portfolio with superior return/risk characteristics using technical indicators
  • Maximize return
  • Minimize risk

◦ Build different momentum-based strategies

Objective

◦ Risk diversification
  • Select stocks across sectors for natural diversification.
  • Virtual sectors are created using the k-means clustering algorithm.

◦ Return maximization
  • Use momentum-based indicators to predict future returns.
  • Try different trading algorithms.

Approach

Investment Universe: Large Market Cap Stocks (Top 100/300/500)

Daily stock prices were collected from WRDS (CRSP database) for the entire stock universe from 1999 to 2009.

A custom benchmark of the top 100/300/500 stocks was created because the composition of the S&P 500 over the period was not known.

Data Collection

Data Issues

Issue: Approach

◦ Large dataset (entire stock universe from 1999 to 2009; more than 5 GB): Use a database (SQL Server), create proper indexes, and query to get subsets of the data.

◦ Ticker name changes: Use PERMNO (CRSP's permanent security identifier) instead of the ticker.

◦ Dividends (the price change alone does not give the true return because it ignores dividends paid): Use the daily adjusted return, which adjusts for dividends.

◦ Missing returns: Use the average to fill the returns.

◦ Duplicates: Use a `SELECT DISTINCT` SQL query to filter the data.

◦ Null values: Use the average to fill the returns.
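The duplicate and missing-value handling above can be sketched in a few lines of Python. This is only an illustration, not the deck's SQL Server or MATLAB code: the function name `clean_returns`, the `(permno, date, ret)` tuple layout, and the choice to fill gaps with each security's own average return are all assumptions, since the slides do not say which average was used.

```python
from statistics import mean

def clean_returns(rows):
    """rows: list of (permno, date, ret) tuples; ret may be None (hypothetical layout)."""
    # Drop exact duplicates, keeping first-seen order (mirrors SELECT DISTINCT).
    unique_rows = list(dict.fromkeys(rows))

    # Average of the observed returns for each security (assumed fill rule).
    observed = {}
    for permno, _, ret in unique_rows:
        if ret is not None:
            observed.setdefault(permno, []).append(ret)
    averages = {p: mean(v) for p, v in observed.items()}

    # Fill missing returns; a security with no observed returns stays None.
    return [
        (permno, date, averages.get(permno) if ret is None else ret)
        for permno, date, ret in unique_rows
    ]
```

Keying every row on the permanent identifier rather than the ticker means a ticker rename does not split one security's history in two.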

◦ Used k-means clustering to create virtual sectors
  • 11 clusters for the 300/500-stock universes and 10 clusters for the 100-stock universe
  • Inputs: β, market cap (liquidity), P/E (price/earnings)
  • $\beta_{\text{stock}} = \operatorname{cov}(R_{\text{stock}}, R_{\text{market}}) / \operatorname{var}(R_{\text{market}})$
  • β captures the long-term adjusted equilibrium rate of return

Virtual Sectors
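A minimal sketch of the two computations on this slide, written in plain Python rather than the MATLAB used in the project. The helper names (`beta`, `standardize`, `kmeans`) are hypothetical, and standardizing the three inputs before clustering is an assumption added here because β, market cap, and P/E live on very different scales; the slides do not say whether the features were scaled.

```python
import random
from statistics import mean

def beta(stock_returns, market_returns):
    """beta = cov(R_stock, R_market) / var(R_market)."""
    ms, mm = mean(stock_returns), mean(market_returns)
    cov = sum((s - ms) * (m - mm) for s, m in zip(stock_returns, market_returns))
    var = sum((m - mm) ** 2 for m in market_returns)
    return cov / var  # the 1/(n-1) factors cancel

def standardize(rows):
    """Scale each feature column to zero mean and unit variance (assumed step)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [(sum((x - mu) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
           for c, mu in zip(cols, mus)]
    return [[(x - mu) / sd for x, mu, sd in zip(row, mus, sds)] for row in rows]

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels
```

Each row passed to `standardize` would be one stock's `[beta, market_cap, pe]` triple, and `k` would be 10 or 11 as stated on the slide; the resulting labels play the role of sector codes.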

◦ Different models were tried for capturing momentum indicators (linear models based on APT)

◦ The best model for capturing momentum was based on time decay of historical returns:

  $r = \sum_{j} k^{j} r_{j}$

  where
  r = predicted stock return
  $r_j$ = historical return in period j
  j = time period (j = 0 for the current time)
  k = constant obtained after calibration

  • More weight on recent data

◦ Two-year moving window for prediction

◦ Portfolio analysis and rebalancing every two weeks

Stock Selection Model
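As a rough illustration of the time-decay model, the sketch below reads the slide's formula as a geometrically decaying sum, r = Σ k^j · r_j, so that a value of k between 0 and 1 puts more weight on recent periods. That reading, the function names, and the grid-search calibration are assumptions; the slides only say that k was obtained "after calibration".

```python
def predict_return(past_returns, k):
    """past_returns[0] is the current period (j = 0), past_returns[1] the one before, and so on."""
    return sum((k ** j) * r_j for j, r_j in enumerate(past_returns))

def calibrate_k(histories, realized, candidates):
    """Pick the candidate k with the smallest squared prediction error.

    histories: list of past-return lists; realized: the return that followed each one.
    This grid search is a hypothetical stand-in for the unspecified calibration.
    """
    def sse(k):
        return sum((predict_return(h, k) - y) ** 2
                   for h, y in zip(histories, realized))
    return min(candidates, key=sse)
```

With the two-year moving window from the slide, `past_returns` would hold the returns inside that window, newest first, and the prediction would be refreshed at each rebalancing date.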

Long Only

Short Only

Long-Short

Sector Rotation

Sector Portfolio Optimization

Strategies

Basic Idea: Go long the top “n” performing stocks in each sector based on market cap

Portfolio Weights: All selected stocks are equally weighted in the portfolio

Long Only

• Basic Idea: Short the bottom “n” performing stocks in each sector based on market cap

• Portfolio Weights: All selected stocks are equally weighted in the portfolio

Short Only

Basic Idea: Combination of Long and Short

Portfolio Weights: All selected stocks are equally weighted in portfolio

Long / Short
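The long-only, short-only, and long/short strategies share the same two steps: pick "n" names per virtual sector, then weight them equally. The sketch below is one possible reading, not the project's MATLAB engine. Ranking by predicted return is an assumption (the slides say "performing stocks ... based on market cap" without giving the exact ranking rule), as is scaling the long/short book to a gross exposure of 1; all names are hypothetical.

```python
def select_per_sector(predicted, sectors, n, side="long"):
    """Pick n stocks per sector: highest predicted return for 'long', lowest for 'short'.

    predicted: {stock: predicted return}; sectors: {stock: virtual sector label}.
    """
    by_sector = {}
    for stock, sector in sectors.items():
        by_sector.setdefault(sector, []).append(stock)
    picks = []
    for members in by_sector.values():
        ranked = sorted(members, key=lambda s: predicted[s],
                        reverse=(side == "long"))
        picks.extend(ranked[:n])
    return picks

def equal_weights(longs=(), shorts=()):
    """Equal absolute weight per name; longs positive, shorts negative."""
    names = len(longs) + len(shorts)
    if names == 0:
        return {}
    weights = {s: 1.0 / names for s in longs}
    weights.update({s: -1.0 / names for s in shorts})
    return weights
```

Calling `equal_weights(longs=...)` alone gives the long-only book, `equal_weights(shorts=...)` the short-only book, and passing both gives the combined long/short book.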

Basic Idea: Go long the top-performing sectors and short the bottom-performing ones

Portfolio Weights: The weight in each sector is proportional to its return (more weight on the more outperforming sector; shorting allowed)

Sector Rotation
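One way to turn "weight proportional to return, shorting allowed" into numbers is sketched below. The normalization is an assumption: the slides do not say how the weights were scaled, so this version divides each sector's return by the sum of absolute returns, which fixes gross exposure at 1. The function name is hypothetical.

```python
def sector_rotation_weights(sector_returns):
    """sector_returns: {sector: predicted or trailing return}.

    Positive returns become long weights and negative returns short weights,
    each proportional to the size of the return (assumed scaling: gross = 1).
    """
    gross = sum(abs(r) for r in sector_returns.values())
    if gross == 0:
        return {s: 0.0 for s in sector_returns}
    return {s: r / gross for s, r in sector_returns.items()}
```

A variant that subtracts the cross-sector average return before scaling would still produce short positions when every sector return is positive; which of the two the project used is not stated.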

Basic Idea: Select stocks using the long-only strategy.

Portfolio Weights: Decided by Markowitz portfolio optimization techniques

◦ Sector constraints: weights vary from 0.9 to 1.1 of the target sector weights

◦ Asset constraints (shorting and leverage allowed): weights vary from -0.1 to 1.1

◦ Allocation on the efficient frontier

Sector Portfolio Optimization
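To make the constrained mean-variance idea concrete, here is a deliberately naive sketch that brute-forces a weight grid instead of calling a quadratic-programming solver. It is not the optimizer used in the project. It enforces only the asset bounds from the slide (-0.1 to 1.1) plus full investment; the sector constraints are left out, and the function name, grid step, and return-target formulation are assumptions. It is only practical for a handful of assets.

```python
from itertools import product

def min_variance_portfolio(mu, cov, target_return, lo=-0.1, hi=1.1, step=0.1):
    """Grid search for minimum-variance weights with sum(w) = 1, lo <= w_i <= hi,
    and expected return >= target_return. Returns (weights, variance) or (None, inf)."""
    n = len(mu)
    ticks = range(round(lo / step), round(hi / step) + 1)  # weights in units of step
    total = round(1 / step)
    best, best_var = None, float("inf")
    for head in product(ticks, repeat=n - 1):
        last = total - sum(head)  # last weight is fixed by the budget constraint
        if last not in ticks:
            continue
        w = [t * step for t in head + (last,)]
        ret = sum(wi * mi for wi, mi in zip(w, mu))
        if ret < target_return - 1e-12:
            continue
        var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
        if var < best_var:
            best, best_var = w, var
    return best, best_var
```

Sweeping `target_return` over a range of values and recording each `(variance, return)` pair traces an approximation of the efficient frontier, from which an allocation can then be chosen.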

Components:

◦ Database (SQL Server)

◦ Portfolio Engine (MATLAB code)

◦ Portfolio Reports & Graphs

◦ Risk Analysis (MATLAB code)

◦ Performance & Risk Report

Implementation

• MATLAB (Object Oriented)

• SQL Server database (more than 5 GB of raw data; 12 GB with indexes)

Vary Input parameters

◦ Stock universe (100/300/500)

◦ Stocks selected (10/20/40)

◦ Running time window (2001-2002, 2005-2007)

◦ Rebalancing period (15/21/30/45 days)

Robustness (Testing)
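The parameter sweep above can be enumerated mechanically. The sketch below only builds the list of configurations; the key names are hypothetical, and whether the project ran every combination or only a subset is not stated in the slides.

```python
from itertools import product

universes = [100, 300, 500]            # stock universe size
stocks_selected = [10, 20, 40]         # number of stocks held
windows = ["2001-2002", "2005-2007"]   # running time windows as listed on the slide
rebalance_days = [15, 21, 30, 45]      # rebalancing period in days

# One dictionary per backtest configuration (full Cartesian product).
configs = [
    {"universe": u, "n_stocks": n, "window": w, "rebalance_days": d}
    for u, n, w, d in product(universes, stocks_selected, windows, rebalance_days)
]
```

Each entry in `configs` would then be handed to the backtest, and the resulting risk metrics compared across runs to judge how sensitive the strategies are to each input.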

Results: 100 stocks universe

Results: 300 stocks universe

Performance Measurement: Risk Metrics

2001-2003 (100 stocks)

Metric                  LongOnly  ShortOnly  LongShort  SectorRotation  SectorPortOpt
Return (annual)           20.16%     37.10%     16.18%          52.15%         14.49%
Sigma (annual)            32.93%     55.44%     36.73%          47.04%         30.01%
Alpha (annual)            37.35%     54.29%     33.36%          69.34%         31.68%
Sharpe ratio (annual)    0.53233    0.62196    0.36859          1.0532          0.395
Info ratio (annual)        1.599    0.73514    0.67099          1.1749         1.2174
VaR (95%, daily)          -2.91%     -3.93%     -3.28%          -3.96%         -2.81%
CVaR (95%, daily)         -3.72%     -7.23%     -5.56%          -5.97%         -3.69%
Max DD (daily)            15.39%     43.56%     27.95%          30.36%         13.40%

2005-2007 (100 stocks)

Metric                  LongOnly  ShortOnly  LongShort  SectorRotation  SectorPortOpt
Return (annual)           25.55%      0.53%      6.00%          22.43%         33.30%
Sigma (annual)            15.29%     15.71%     12.49%          16.77%         16.23%
Alpha (annual)            15.43%     -9.59%     -4.12%          12.30%         23.18%
Sharpe ratio (annual)     1.3871   -0.24379    0.13134          1.0782         1.7846
Info ratio (annual)       1.3748   -0.40856   -0.25663         0.67326         1.7481
VaR (95%, daily)          -1.31%     -1.51%     -1.22%          -1.52%         -1.37%
CVaR (95%, daily)         -1.71%     -2.15%     -1.52%          -1.91%         -1.79%
Max DD (daily)             8.64%      8.04%      5.90%           8.59%          7.78%

2001-2003 (300 stocks)

Metric                  LongOnly  ShortOnly  LongShort  SectorRotation  SectorPortOpt
Return (annual)           37.91%     90.17%     57.09%         105.86%         61.34%
Sigma (annual)            42.74%     66.97%     48.95%          54.23%         42.79%
Alpha (annual)            54.26%    106.53%     73.44%         122.21%         77.69%
Sharpe ratio (annual)     0.8257      1.308      1.113          1.9049         1.3728
Info ratio (annual)       1.5872     1.2666     1.2248           1.891         2.1169
VaR (95%, daily)          -3.58%     -4.69%     -4.40%          -4.17%         -3.52%
CVaR (95%, daily)         -4.53%     -8.36%     -6.78%          -6.33%         -4.48%
Max DD (daily)            17.55%     45.84%     27.92%          33.22%         24.54%

2005-2007 (300 stocks)

Metric                  LongOnly  ShortOnly  LongShort  SectorRotation  SectorPortOpt
Return (annual)           70.02%     25.05%     45.07%          51.91%         69.17%
Sigma (annual)            21.68%     19.03%     17.62%          22.98%         23.96%
Alpha (annual)            58.18%     13.22%     33.24%          40.08%         57.33%
Sharpe ratio (annual)     3.0318     1.0882     2.3126          2.0707         2.7072
Info ratio (annual)       3.2054     0.5056     1.6552          1.6635         2.7043
VaR (95%, daily)          -1.44%     -1.54%     -1.29%          -1.72%         -1.61%
CVaR (95%, daily)         -1.89%     -2.24%     -1.83%          -2.43%         -2.34%
Max DD (daily)            10.99%     10.09%      8.27%          13.61%         15.98%
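For readers unfamiliar with the metrics in these tables, the sketch below shows common textbook definitions of three of them computed from a series of daily returns. The slides do not state how the project computed its figures, so the 252-day annualization, the zero default risk-free rate, and the historical-simulation approach to VaR and CVaR are all assumptions, as are the function names. Alpha and the information ratio are omitted because they also need the benchmark series.

```python
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor

def sharpe_ratio(daily_returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio from daily returns."""
    excess = [r - daily_risk_free for r in daily_returns]
    return mean(excess) / stdev(excess) * TRADING_DAYS ** 0.5

def var_cvar(daily_returns, level=0.95):
    """Historical VaR and CVaR, reported as (negative) daily returns."""
    ordered = sorted(daily_returns)
    cutoff = max(1, int(round(len(ordered) * (1 - level))))
    tail = ordered[:cutoff]          # the worst (1 - level) share of days
    return tail[-1], mean(tail)      # VaR = tail boundary, CVaR = tail average

def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded wealth curve, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = max(worst, 1 - wealth / peak)
    return worst
```

The sign conventions match the tables: VaR and CVaR come out negative, maximum drawdown positive, and CVaR is never less severe than VaR at the same confidence level.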

Benchmark: Custom benchmark

Value-added return = Pure sector allocation + Allocation/selection interaction + Within-sector selection

$$R_V = \sum_{j=1}^{S} (w_{P,j} - w_{B,j})(R_{B,j} - R_B) + \sum_{j=1}^{S} (w_{P,j} - w_{B,j})(R_{P,j} - R_{B,j}) + \sum_{j=1}^{S} w_{B,j}(R_{P,j} - R_{B,j})$$

where

$R_V$ = the value-added return

$w_{P,j}$ = portfolio weight of sector j

$w_{B,j}$ = benchmark weight of sector j

$R_{P,j}$ = portfolio return of sector j

$R_{B,j}$ = benchmark return of sector j

$R_B$ = return on the portfolio's benchmark

$S$ = number of sectors

Performance Attribution
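The three-way split of value-added return can be checked numerically with a short sketch. It uses the standard sector-level decomposition that the slide's three term names and variable list correspond to; the function name and the list-per-sector input layout are hypothetical.

```python
def attribution(w_p, w_b, r_p, r_b):
    """Each argument is a list with one entry per sector.

    Returns (pure sector allocation, allocation/selection interaction,
    within-sector selection).
    """
    R_B = sum(w * r for w, r in zip(w_b, r_b))  # total benchmark return
    sectors = list(zip(w_p, w_b, r_p, r_b))
    allocation = sum((wp - wb) * (rb - R_B) for wp, wb, rp, rb in sectors)
    interaction = sum((wp - wb) * (rp - rb) for wp, wb, rp, rb in sectors)
    selection = sum(wb * (rp - rb) for wp, wb, rp, rb in sectors)
    return allocation, interaction, selection
```

When both weight vectors sum to 1, the three components add up to the portfolio return minus the benchmark return, which is a convenient sanity check on any attribution run.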

Performance Attribution Results

◦ Transaction costs:
  • Slippage cost and explicit costs are taken into account
  • Market impact and other implicit costs are ignored

◦ Leverage costs are not taken into account

◦ Portfolio turnover is not taken into account

Other Issues

◦ Virtual sectors work reasonably well.

◦ Time-decayed returns are a decent predictor of future returns in stable markets over short time periods.

◦ Statistically relevant for large market caps.

Conclusions