prediction of credit default by continuous optimization

Prediction of Credit Default by Continuous Optimization. 4th International Summer School "Achievements and Applications of Contemporary Informatics, Mathematics and Physics", National University of Technology of the Ukraine, Kiev, Ukraine, August 5-16, 2009. Gerhard-Wilhelm Weber *, Efsun Kürüm, Kasırga Yıldırak. Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey. * Faculty of Economics, Management and Law, University of Siegen, Germany; Center for Research on Optimization and Control, University of Aveiro, Portugal.


DESCRIPTION

AACIMP 2009 Summer School lecture by Gerhard Wilhelm Weber. "Modern Operational Research and Its Mathematical Methods" course.

TRANSCRIPT

Page 1: Prediction of Credit Default by Continuous Optimization

Prediction of Credit Default by Continuous Optimization

4th International Summer School
Achievements and Applications of Contemporary Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 5-16, 2009

Gerhard-Wilhelm Weber *
Efsun Kürüm, Kasırga Yıldırak

Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey

* Faculty of Economics, Management and Law, University of Siegen, Germany
Center for Research on Optimization and Control, University of Aveiro, Portugal

Page 2: Prediction of Credit Default by Continuous Optimization

Outline

• Main Problem from Credit Default

• Logistic Regression and Performance Evaluation

• Cut-Off Values and Thresholds

• Classification and Optimization

• Nonlinear Regression

• Numerical Results

• Outlook and Conclusion

Page 3: Prediction of Credit Default by Continuous Optimization

Main Problem from Credit Default

• Whether a credit application should be approved or rejected.

Solution

• Learning about the default probability of the applicant.

Page 5: Prediction of Credit Default by Continuous Optimization

Logistic Regression

$$\log\left(\frac{P(Y = 1 \mid X = x_l)}{P(Y = 0 \mid X = x_l)}\right) = \beta_0 + \beta_1 \cdot x_{l1} + \beta_2 \cdot x_{l2} + \dots + \beta_p \cdot x_{lp} \qquad (l = 1, 2, \dots, N)$$
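To make the logit relation concrete, the sketch below inverts it to obtain the default probability P(Y = 1 | X = x_l) for one firm. The coefficient values and the two covariates are hypothetical and serve only as an illustration; they are not estimates from the lecture.

```python
import math

def default_probability(beta, x):
    """P(Y = 1 | X = x) when log(p / (1 - p)) = beta_0 + sum_k beta_k * x_k."""
    # beta[0] is the intercept beta_0; beta[1:] pair with the covariates in x.
    z = beta[0] + sum(b_k * x_k for b_k, x_k in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and financial ratios for a single firm l.
beta = [-2.0, 0.8, -1.5]
x_l = [1.2, 0.4]

p = default_probability(beta, x_l)
log_odds = math.log(p / (1.0 - p))  # recovers the linear predictor
print(f"P(default) = {p:.4f}, log-odds = {log_odds:.4f}")
```

Taking the log-odds of the returned probability gives back the linear predictor, which is exactly the left-hand side of the model equation.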

Page 6: Prediction of Credit Default by Continuous Optimization

Goal

Our study is based on one of the Basel II criteria, which recommends that the bank should divide corporate firms into 8 rating degrees, one of them being the default class.

We have two problems to solve here:

• To distinguish the defaults from non-defaults.

• To put non-default firms in an order based on their credit quality and classify them into (sub)classes.

Page 7: Prediction of Credit Default by Continuous Optimization

Data

• Data have been collected by a bank from firms operating in the manufacturing sector in Turkey.

• They cover the period between 2001 and 2006.

• There are originally 54 qualitative variables and 36 quantitative variables.

• Data on quantitative variables are formed based on the balance sheets submitted by the firms' accountants. Essentially, they are the well-known financial ratios.

• The data set covers 3150 firms, of which 92 are in the state of default. As the number of defaults is small, in order to overcome possible statistical problems, we downsize the number to 551, keeping all the default cases in the set.

Page 8: Prediction of Credit Default by Continuous Optimization

We evaluate performance of the model

[Figure: distributions of the test result value for non-default cases and default cases, separated by a cut-off value; ROC curve with axes TPF (sensitivity) versus FPF (1 - specificity).]

Page 9: Prediction of Credit Default by Continuous Optimization

Model outcome versus truth

                   truth
model outcome      d                                n
d′                 True Positive Fraction (TPF)     False Positive Fraction (FPF)
n′                 False Negative Fraction (FNF)    True Negative Fraction (TNF)
total              1                                1

Page 10: Prediction of Credit Default by Continuous Optimization

Definitions

• sensitivity (TPF) := P(D′ | D)

• specificity := P(ND′ | ND)

• 1 - specificity (FPF) := P(D′ | ND)

• points (TPF, FPF) constitute the ROC curve

• c := cut-off value

• c takes values between −∞ and ∞

• TPF(c) := P(z > c | D)

• FPF(c) := P(z > c | ND)
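The definitions of TPF(c) and FPF(c) can be estimated directly from scored cases. The following is a minimal sketch, assuming a list of model scores z for known defaults and another for known non-defaults; the scores are made up for illustration.

```python
def tpf_fpf(scores_default, scores_non_default, c):
    """Empirical TPF(c) = P(z > c | D) and FPF(c) = P(z > c | ND)."""
    tpf = sum(z > c for z in scores_default) / len(scores_default)
    fpf = sum(z > c for z in scores_non_default) / len(scores_non_default)
    return tpf, fpf

def roc_points(scores_default, scores_non_default):
    """Sweep the cut-off c over all observed scores; returns (FPF, TPF) pairs."""
    cuts = [float("-inf")] + sorted(set(scores_default) | set(scores_non_default))
    points = []
    for c in cuts:
        tpf, fpf = tpf_fpf(scores_default, scores_non_default, c)
        points.append((fpf, tpf))  # x = FPF, y = TPF
    return sorted(points)

# Made-up scores z: higher means "more likely to default".
d_scores = [0.9, 0.8, 0.55, 0.4]
nd_scores = [0.7, 0.3, 0.2, 0.1, 0.05]

print(tpf_fpf(d_scores, nd_scores, 0.5))
print(roc_points(d_scores, nd_scores))
```

Lowering c moves the point towards (1, 1), raising it moves the point towards (0, 0), which is why sweeping c traces the whole ROC curve.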

Page 12: Prediction of Credit Default by Continuous Optimization

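[Figure: ROC curve (TPF versus FPF) redrawn on normal-deviate axes, Normal Deviate (TPF) versus Normal Deviate (FPF), with a threshold t and a cut-off value c marked.]

$$\mathrm{TPF}(c_i) =: \Phi(a + b \cdot c_i), \qquad \mathrm{FPF}(c_i) =: \Phi(c_i),$$

$$a := \frac{\mu_n - \mu_s}{\sigma_s}, \qquad b := \frac{\sigma_n}{\sigma_s}.$$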

Page 13: Prediction of Credit Default by Continuous Optimization

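Classification

[Figure: score distributions of the actually non-default cases and the actually default cases along the c axis, from −∞ to ∞. Ex.: cut-off values split the axis into class I, class II, class III, class IV and class V.]

To assess the discriminative power of such a model, we calculate the Area Under the (ROC) Curve:

$$\mathrm{AUC} := \int_{-\infty}^{\infty} \Phi(a + b \cdot c)\, d\Phi(c).$$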
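The AUC integral can be evaluated numerically by writing dΦ(c) as the standard normal density times dc. The sketch below does this with a midpoint rule and compares the result with the identity AUC = Φ(a / √(1 + b²)), which holds for this binormal form because the integral equals P(Z ≤ a + b·C) for independent standard normal Z and C. The values of a and b are only an example, and the truncation to [-8, 8] and the grid size are choices of this sketch.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: PHI.cdf is Phi, PHI.pdf its density

def auc_stieltjes(a, b, lo=-8.0, hi=8.0, m=4000):
    """Midpoint rule for the integral of Phi(a + b*c) * pdf(c) over [lo, hi]."""
    h = (hi - lo) / m
    total = 0.0
    for k in range(m):
        c = lo + (k + 0.5) * h
        total += PHI.cdf(a + b * c) * PHI.pdf(c)
    return total * h

def auc_closed_form(a, b):
    """Binormal identity: AUC = Phi(a / sqrt(1 + b^2))."""
    return PHI.cdf(a / math.sqrt(1.0 + b * b))

a, b = 1.0, 0.95  # example values only
numeric = auc_stieltjes(a, b)
exact = auc_closed_form(a, b)
print(numeric, exact)
```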

Page 14: Prediction of Credit Default by Continuous Optimization

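Relationship between thresholds and cut-off values

$$\Phi(c) = t \iff c = \Phi^{-1}(t)$$

[Figure: ROC curve (TPF versus FPF) with the thresholds t_0, t_1, t_2, t_3, t_4, t_5 marked on the FPF axis. Ex.: R = 5.]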
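The map between thresholds t on the FPF axis and cut-off values c on the score axis is just the standard normal quantile function and its inverse. A small sketch, using the initial threshold values listed under Numerical Results as input (any increasing values in (0, 1) would do):

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal

# Interior thresholds t_1, ..., t_{R-1} (initial values from the Numerical Results slide).
thresholds = [0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35]

# c = Phi^{-1}(t): cut-off value belonging to each threshold.
cutoffs = [PHI.inv_cdf(t) for t in thresholds]

# t = Phi(c): mapping back recovers the thresholds.
recovered = [PHI.cdf(c) for c in cutoffs]

for t, c in zip(thresholds, cutoffs):
    print(f"t = {t:<7} c = {c: .4f}")
```

Because Φ is strictly increasing, ordered thresholds always give ordered cut-off values, so the class boundaries keep their order under the transformation.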

Page 15: Prediction of Credit Default by Continuous Optimization

Optimization in Credit Default

Problem:

Simultaneously to obtain the thresholds and the parameters a and b that maximize AUC, while balancing the size of the classes (regularization) and guaranteeing a good accuracy.

Page 17: Prediction of Credit Default by Continuous Optimization

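Optimization Problem

$$\max_{a,\,b,\,\tau}\ \ \alpha_1 \cdot \int_0^1 \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr)\,dt \;-\; \alpha_2 \cdot \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \frac{1}{n}\Bigr)^2$$

subject to

$$\int_{t_i}^{t_{i+1}} \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr)\,dt \ \ge\ \delta_i, \qquad t_{i+1} > t_i \qquad (i = 0, 1, \dots, R-1),$$

$$\tau := (t_1, t_2, \dots, t_{R-1})^T, \qquad t_0 = 0, \quad t_R = 1.$$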
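To see how the pieces of this problem fit together, the following sketch evaluates the objective and checks the constraints for a given candidate (a, b, τ). It is only an evaluation routine, not a solver. The weights α1 and α2, the bounds δ_i, the value of n (set here to the number of classes so that equal-width classes carry no penalty) and the assumption b > 0 are all hypothetical choices of this sketch.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal

def roc(t, a, b):
    """Phi(a + b * Phi^{-1}(t)); end points handled explicitly, assuming b > 0."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return PHI.cdf(a + b * PHI.inv_cdf(t))

def integral_roc(lo, hi, a, b, m=2000):
    """Midpoint rule for the integral of the ROC curve over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(roc(lo + (k + 0.5) * h, a, b) for k in range(m))

def objective(a, b, tau, alpha1, alpha2, n):
    """alpha1 * AUC - alpha2 * sum_i (t_{i+1} - t_i - 1/n)^2 with t_0 = 0, t_R = 1."""
    t = [0.0] + list(tau) + [1.0]
    balance = sum((t[i + 1] - t[i] - 1.0 / n) ** 2 for i in range(len(t) - 1))
    return alpha1 * integral_roc(0.0, 1.0, a, b) - alpha2 * balance

def is_feasible(a, b, tau, delta):
    """Check t_{i+1} > t_i and the per-class bound on the integral for every i."""
    t = [0.0] + list(tau) + [1.0]
    if any(t[i + 1] <= t[i] for i in range(len(t) - 1)):
        return False
    return all(integral_roc(t[i], t[i + 1], a, b) >= delta[i]
               for i in range(len(t) - 1))

# Hypothetical candidate with R = 4 classes of equal width.
tau = [0.25, 0.5, 0.75]
value = objective(1.0, 0.95, tau, alpha1=1.0, alpha2=5.0, n=4)
print(value, is_feasible(1.0, 0.95, tau, delta=[0.0] * 4))
```

With equal-width classes the regularization term vanishes and the objective reduces to α1 times the AUC; moving a threshold away from the balanced position lowers the value.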

Page 18: Prediction of Credit Default by Continuous Optimization

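Area Over the ROC Curve

[Figure: ROC curve (TPF versus FPF) with the thresholds t_0, t_1, ..., t_5 on the FPF axis; the region under the curve is labelled AUC, the region over the curve 1 - AUC.]

$$\mathrm{AOC} := \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr)\,dt$$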

Page 19: Prediction of Credit Default by Continuous Optimization

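New Version of the Optimization Problem

$$\min_{a,\,b,\,\tau}\ \ \alpha_2 \cdot \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \frac{1}{n}\Bigr)^2 \;+\; \alpha_1 \cdot \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr)\,dt$$

subject to

$$\int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr)\,dt \ \le\ t_{j+1} - t_j - \delta_j \qquad (j = 0, 1, \dots, R-1).$$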
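A short check that this version is the same problem as the maximization form, written out as a worked derivation (only the definitions AUC and AOC from the slides are used):

$$
\begin{aligned}
\int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr)\,dt
&= (t_{j+1} - t_j) - \int_{t_j}^{t_{j+1}} \Phi(a + b \cdot \Phi^{-1}(t))\,dt, \\
\text{so}\quad \int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(\cdot)\bigr)\,dt \le t_{j+1} - t_j - \delta_j
&\iff \int_{t_j}^{t_{j+1}} \Phi(a + b \cdot \Phi^{-1}(t))\,dt \ge \delta_j, \\
\alpha_2 \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \tfrac{1}{n}\Bigr)^2 + \alpha_1\,\mathrm{AOC}
&= \alpha_1 - \Bigl(\alpha_1\,\mathrm{AUC} - \alpha_2 \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \tfrac{1}{n}\Bigr)^2\Bigr).
\end{aligned}
$$

Since AOC = 1 - AUC and α1 is a constant, minimizing the new objective is equivalent to maximizing the original one, and the feasible sets coincide.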

Page 20: Prediction of Credit Default by Continuous Optimization

Regression in Credit Default

Optimization problem:

Simultaneously to obtain the thresholds and the parameters a and b that maximize AUC, while balancing the size of the classes (regularization) and guaranteeing a good accuracy.

Discretization of the integral leads to a nonlinear regression problem.

Page 21: Prediction of Credit Default by Continuous Optimization

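Discretization of the Integral

Riemann-Stieltjes integral:

$$\mathrm{AUC} = \int_{-\infty}^{\infty} \Phi(a + b \cdot c)\, d\Phi(c)$$

Riemann integral:

$$\mathrm{AUC} = \int_0^1 \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr)\,dt$$

Discretization:

$$\mathrm{AUC} \approx \sum_{k=1}^{R} \Phi\bigl(a + b \cdot \Phi^{-1}(t_k)\bigr)\, \Delta t_k$$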
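The effect of the discretization can be inspected numerically. The sketch below evaluates the sum over a coarse threshold grid and over a fine uniform grid, taking Δt_k = t_k - t_{k-1} (an assumption about the slide's notation) and b > 0, and compares both with the binormal identity AUC = Φ(a / √(1 + b²)). The parameter values and grids are examples only.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal

def roc(t, a, b):
    """Phi(a + b * Phi^{-1}(t)); end points handled explicitly, assuming b > 0."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return PHI.cdf(a + b * PHI.inv_cdf(t))

def auc_riemann(grid, a, b):
    """Sum over k = 1..R of Phi(a + b*Phi^{-1}(t_k)) * (t_k - t_{k-1})."""
    return sum(roc(grid[k], a, b) * (grid[k] - grid[k - 1])
               for k in range(1, len(grid)))

a, b = 1.0, 0.95  # example values only
coarse = [0.0, 0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35, 1.0]
fine = [k / 2000 for k in range(2001)]

auc_coarse = auc_riemann(coarse, a, b)
auc_fine = auc_riemann(fine, a, b)
reference = PHI.cdf(a / math.sqrt(1.0 + b * b))
print(auc_coarse, auc_fine, reference)
```

Because the integrand is increasing in t, a right-endpoint sum can only overestimate the integral, and the gap shrinks as the grid is refined; on a grid with few, unevenly spaced thresholds the difference can be substantial.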

Page 22: Prediction of Credit Default by Continuous Optimization

Optimization Problem with Penalty Parameters

$$\Pi_\Theta(a, b, \tau) := \alpha_2 \cdot \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \frac{1}{n}\Bigr)^2 + \alpha_1 \cdot \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr)\,dt + \alpha_3 \cdot \sum_{j=0}^{R-1} \theta_j \cdot \underbrace{\Bigl(\delta_j - \int_{t_j}^{t_{j+1}} \Phi(a + b \cdot \Phi^{-1}(t))\,dt\Bigr)}_{=:\ \Psi_j(a,\,b,\,\tau)}$$

$$\Theta := (\theta_1, \theta_2, \dots, \theta_{R-1})^T, \qquad \theta_j \ge 0 \qquad (j = 0, 1, \dots, R-1)$$

In the case of violation of any one of these constraints, we introduce penalty parameters. As a penalty is increased, the iterates are forced towards the feasible set of the optimization problem.

Page 23: Prediction of Credit Default by Continuous Optimization

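Optimization Problem further discretized

$$\Pi_\Theta(a, b, \tau) = \alpha_2 \cdot \sum_{i=0}^{R-1} \Bigl(t_{i+1} - t_i - \frac{1}{n}\Bigr)^2 + \alpha_1 \cdot \Bigl(\sum_{j=1}^{R} \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t_j))\bigr)\, \Delta t_j\Bigr)^2 + \alpha_3 \cdot \sum_{j=0}^{R-1} \theta_j \Bigl(\delta_j - \sum_{\nu=0}^{n_j} \Phi\bigl(a + b \cdot \Phi^{-1}(\eta_{j\nu})\bigr)\, \Delta\eta_{j\nu}\Bigr)^2,$$

where the η_{jν} (ν = 0, 1, ..., n_j) are nodes discretizing the interval [t_j, t_{j+1}].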

Page 25: Prediction of Credit Default by Continuous Optimization

Nonlinear Regression

$$\min_{\beta}\ f(\beta) = \sum_{j=1}^{N} \bigl(d_j - g(x_j, \beta)\bigr)^2 =: \sum_{j=1}^{N} f_j^2(\beta)$$

$$F(\beta) := \bigl(f_1(\beta), \dots, f_N(\beta)\bigr)^T$$

$$\min_{\beta}\ f(\beta) = F^T(\beta)\, F(\beta)$$

Page 26: Prediction of Credit Default by Continuous Optimization

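Nonlinear Regression

• Gauss-Newton method:

$$\nabla F(\beta)\, \nabla^T F(\beta)\, q = -\nabla F(\beta)\, F(\beta), \qquad \beta_{k+1} := \beta_k + q_k$$

• Levenberg-Marquardt method:

$$\bigl(\nabla F(\beta)\, \nabla^T F(\beta) + \lambda I_p\bigr)\, q = -\nabla F(\beta)\, F(\beta), \qquad \lambda \ge 0$$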
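As an illustration of the Levenberg-Marquardt step, the sketch below fits a two-parameter model by repeatedly solving (JᵀJ + λI) q = -Jᵀ F, where J is the N×2 Jacobian of the residual vector F; this reads ∇F(β)∇ᵀF(β) as JᵀJ, which is an assumption about the slide's notation. The model g(x, β) = β0·exp(β1·x), the synthetic data and the rule for raising or lowering λ are hypothetical choices of this sketch, not the setup used in the lecture.

```python
import math

def residuals(beta, xs, ds):
    """F(beta) with f_j = d_j - g(x_j, beta), g(x, beta) = beta[0] * exp(beta[1] * x)."""
    return [d - beta[0] * math.exp(beta[1] * x) for x, d in zip(xs, ds)]

def jacobian(beta, xs):
    """Row j holds the partial derivatives of f_j with respect to beta[0], beta[1]."""
    return [[-math.exp(beta[1] * x), -beta[0] * x * math.exp(beta[1] * x)]
            for x in xs]

def sum_of_squares(beta, xs, ds):
    try:
        return sum(f * f for f in residuals(beta, xs, ds))
    except OverflowError:  # a wild trial step; treat it as infinitely bad
        return float("inf")

def lm_step(beta, xs, ds, lam):
    """Solve (J^T J + lam * I) q = -J^T F for two parameters by Cramer's rule."""
    F, J = residuals(beta, xs, ds), jacobian(beta, xs)
    a11 = sum(r[0] * r[0] for r in J) + lam
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + lam
    g1 = -sum(r[0] * f for r, f in zip(J, F))
    g2 = -sum(r[1] * f for r, f in zip(J, F))
    det = a11 * a22 - a12 * a12
    return [(g1 * a22 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]

def levenberg_marquardt(beta, xs, ds, lam=1e-2, iters=200):
    for _ in range(iters):
        q = lm_step(beta, xs, ds, lam)
        trial = [beta[0] + q[0], beta[1] + q[1]]
        if sum_of_squares(trial, xs, ds) < sum_of_squares(beta, xs, ds):
            beta, lam = trial, lam / 10.0  # accept: move towards Gauss-Newton
        else:
            lam *= 10.0                    # reject: damp the step more strongly
    return beta

# Noise-free synthetic data generated from beta = (2.0, 0.7).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ds = [2.0 * math.exp(0.7 * x) for x in xs]
beta_hat = levenberg_marquardt([1.0, 0.1], xs, ds)
print(beta_hat)
```

With λ = 0 the step reduces to the Gauss-Newton step; a large λ shortens the step and turns it towards the steepest-descent direction, which is what makes the method tolerant of poor starting values.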

Page 28: Prediction of Credit Default by Continuous Optimization

Nonlinear Regression

Alternative solution:

$$\min_{t,\,q}\ t$$

$$\text{subject to}\quad \bigl\| \bigl(\nabla F(\beta)\, \nabla^T F(\beta) + \lambda I_p\bigr)\, q - \bigl(-\nabla F(\beta)\, F(\beta)\bigr) \bigr\|_2 \le t, \qquad t \ge 0,$$

$$\| L q \|_2 \le M.$$

This is a conic quadratic programming problem, which can be solved by interior point methods.

Page 29: Prediction of Credit Default by Continuous Optimization

Numerical Results

Initial Parameters

a       b       Threshold values (t)
1       0.95    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
1.5     0.85    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
0.80    0.95    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35
2       0.70    0.0006  0.0015  0.0035  0.01   0.035    0.11  0.35

Optimization Results

a       b       Threshold values (t)                                       AUC
0.9999  0.9501  0.0004  0.0020  0.0032  0.012  0.03537  0.09  0.3400      0.8447
1.4999  0.8501  0.0003  0.0017  0.0036  0.011  0.03537  0.10  0.3500      0.9167
0.7999  0.9501  0.0004  0.0018  0.0032  0.011  0.03400  0.10  0.3300      0.8138
2.0001  0.7001  0.0004  0.0020  0.0031  0.012  0.03343  0.11  0.3400      0.9671

Page 30: Prediction of Credit Default by Continuous Optimization

Numerical Results

Accuracy Error in Each Class

I       II      III     IV      V       VI      VII     VIII
0.0000  0.0000  0.0000  0.0001  0.0001  0.0010  0.0010  0.0075
0.0000  0.0000  0.0000  0.0001  0.0001  0.0010  0.0018  0.0094
0.0000  0.0000  0.0000  0.0000  0.0001  0.0002  0.0018  0.0059
0.0000  0.0000  0.0000  0.0001  0.0001  0.0006  0.0018  0.0075

Number of Firms in Each Class

I       II      III     IV      V       VI      VII     VIII
4       56      27      133     115     102     129     61
2       42      52      120     119     111     120     61
4       43      40      129     114     116     120     61
4       56      24      136     106     129     111     61

Number of firms in each class at the beginning: 10, 26, 58, 106, 134, 121, 111, 61
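A quick arithmetic check on the class counts above: the optimization only moves firms between classes, so every row should add up to the same total as the initial allocation. The snippet uses the numbers exactly as listed in the table and also reports each class's share for the first run.

```python
# Class counts I..VIII, copied from the table above.
initial = [10, 26, 58, 106, 134, 121, 111, 61]
optimized = [
    [4, 56, 27, 133, 115, 102, 129, 61],
    [2, 42, 52, 120, 119, 111, 120, 61],
    [4, 43, 40, 129, 114, 116, 120, 61],
    [4, 56, 24, 136, 106, 129, 111, 61],
]

total = sum(initial)
row_totals = [sum(row) for row in optimized]

# Share of firms per class for the first optimization run.
shares = [count / total for count in optimized[0]]

print("total firms:", total)
print("row totals :", row_totals)
print("shares     :", [round(s, 3) for s in shares])
```

All four runs keep the total unchanged and leave class VIII at its initial size, while the counts in classes I to VII are redistributed.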

Page 31: Prediction of Credit Default by Continuous Optimization

Generalized Additive Models

http://144.122.137.55/gweber/
