pitfalls of curve fitting for large losses · pitfalls of curve fitting for large losses 12 october...

20
1 www.guycarp.com Pitfalls of Curve Fitting for Large Losses 12 October 2011 Guy Carpenter Amit Parmar, Michael Cane 1 Guy Carpenter Agenda Introduction Theoretical analysis Data sample size issues Model uncertainty Parameter error Summary Real-world analysis UK Motor market fitting Individual clients versus market curve Summary Questions

Upload: phungnhu

Post on 27-Jul-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

1

www.guycarp.com

Pitfalls of Curve Fitting for Large Losses

12 October 2011

Guy Carpenter

Amit Parmar, Michael Cane

1 Guy Carpenter

Agenda

Introduction

Theoretical analysis

– Data sample size issues

– Model uncertainty

– Parameter error

– Summary

Real-world analysis

– UK Motor market fitting

– Individual clients versus market curve

Summary

Questions

2

Introduction

3 Guy Carpenter

Introduction What is curve fitting used for?

Understanding the historical data and simplifying data sets

Modelling where there are few data points

Understanding potential extremes of the data (via tails)

Reducing sample variation

3

4 Guy Carpenter

Introduction Why is curve fitting important for actuaries?

Stochastic modelling

Benchmarking exercises

Helps alleviate free-cover problem in

experience rating

Exposure rating may not be possible

Fundamental input to the capital model

Advantages to separating the frequency

and severity

x

x

x

x

x

x

x

x

x

x

x

x

x

Free cover

5 Guy Carpenter

Introduction What are some of the common pitfalls?

4

Theoretical analysis

7 Guy Carpenter

Theoretical analysis

If we sample from:

A known distribution

With known parameters

Is it possible to go wrong?

Lets find out…

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

- 10 20 30

CD

F

Claim Size (£M)

5

8 Guy Carpenter

Sample sizes

− 30, 300 & 3000 ultimate claim data samples

Distribution

− Simple Pareto

Parameters

− Alpha = 1.6

− Lambda = 1,500,000

Reinsurance structure

− Common motor programme:

£3m xs £2m

£5m xs £5m

£15m xs £10m

Unlimited xs £25m

Theoretical analysis Our experiment

Data sample size issues

6

10 Guy Carpenter

Theoretical analysis - Data sample size issues What are the implications of insufficient data?

Sample

size

Simple Pareto

Alpha

CV

30 1.364 0.193

300 1.648 0.059

3000 1.596 0.019

Results obtained using

MetaRisk Fit

How does the low sample size affect the pricing?

11 Guy Carpenter

118%

144%

177%

235%

97% 93% 93%100%

70%

100%

130%

160%

190%

220%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

30 claims

300 claims

3000 claims

Theoretical analysis - Data sample size issues Loss cost to the layer

Pricing using Simple Pareto distribution from each data set

Significantly mis-priced with small data sample

Assume 3000

claims provides

benchmark price

7

Model uncertainty

13 Guy Carpenter

Theoretical analysis – Model uncertainty

Suppose we have:

Sufficient data:

− 3000 claim data sample

What can go wrong?

Distribution:

− What are the chances of selecting the correct distribution?

?

What is the effect on our pricing?

8

14 Guy Carpenter

MetaRisk Fit – Severity distributions

Simple Pareto Lognormal Pareto T

Extreme Value Limit Generalized Cauchy

Inverse Transformed

Gamma

Exponential Normal Split Simple Pareto

Inverse Paralogistic Uniform Transformed Gamma

Loglogistic Generalized Extreme Value Inverse Burr

Paralogistic Extremal Pareto Burr

Loggamma Ballasted Pareto Transformed Beta

Gamma Power Generalized Beta

Inverse Weibull Beta Inverse Generalized Beta

Inverse Gaussian Inverse Beta

Inverse Gamma Generalized Pareto

Key: 1-Parameter 2-Parameter 3-Parameter 4-Parameter

Theoretical analysis – Model uncertainty Possible severity distributions

Common distributions used to conduct our analysis

15 Guy Carpenter

Theoretical analysis – Model uncertainty Chances of getting the wrong distribution with sufficient data

MetaRisk Fit:

Simple Pareto

is 1 of the 28

distributions

9

16 Guy Carpenter

107% 106%

89%

70%

103% 99%93%

83%

142%

89%

12%

0%

99% 92%

83%

68%

178%

194%

72%

2%

0%

20%

40%

60%

80%

100%

120%

140%

160%

180%

200%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

Lognormal

Ballested Pareto

Exponential

Transformed Beta

Gamma

Simple pareto

means

Theoretical analysis – Model uncertainty Expected loss to the layer

3000 claims

Lognormal: Over-pricing for lower layers; Under-pricing for higher layers

17 Guy Carpenter

102%

101%

92%83%

100%

99%95%

89%

104%

83%

22%

1%

99%

95%

89%80%

109%126%

65%

7%

0%

20%

40%

60%

80%

100%

120%

140%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

Lognormal

Ballested Pareto

Exponential

Transformed Beta

Gamma

Simple pareto

Theoretical analysis – Model uncertainty Standard deviation of loss to the layer

3000 claims

Lognormal also underestimates volatility on the higher layers

10

Parameter error

19 Guy Carpenter

Theoretical analysis – Parameter error

Suppose we have:

Sufficient data:

− 3000 claim data sample

Correct distribution:

− Simple Pareto

What can go wrong?

Incorrect parameters:

− Instead of α = 1.6

− We could pick lower or higher values

What is the effect on our pricing?

?

11

20 Guy Carpenter

0%

50%

100%

150%

200%

250%

300%

350%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

Theoretical analysis – Parameter error The funnel of uncertainty

α =1.6

1.2

1.4

1.8

2.0

Simple Pareto with different alpha values

318%

177%

32%

63%

243%

153%

44%

68%

133%

114%

77%

87%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

How can we deal with this volatility?

21 Guy Carpenter

Theoretical analysis – Parameter error Quantifying parameter error

Parameter error is effectively

measuring sample size error

Distortion is accentuated in multi-

parameter distributions

Parameter standard deviation and

correlation quantifies parameter

uncertainties

We simulate parameters for each run of

the model e.g., year of simulation

We assume a lognormal distribution for

parameter uncertainty

MetaRisk Fit extract -

Ballasted Pareto, 3000 claims

12

22 Guy Carpenter

Theoretical analysis Summary

Data sample

issues

Distribution

uncertainty

Parameter

error

Effect on pricing

5M xs 5M layer

Insufficient

data

Sufficient

data

Wrong

parameters

Wrong

distribution

Right

distribution

Right

parameters

Wrong

parameters

Mis-priced by

144%

Mis-priced by

106%

No effect

Right

distribution

Real-world analysis UK Motor Market

13

24 Guy Carpenter

Real-world analysis Setting the scene

Case Study: UK Motor Market

Benchmarking is particularly important in Europe:

– No industry data collectors such as ISO / NCCI

Homogenous line of business

We have access to approximately 60% of motor market data in the UK

Unlimited reinsurance coverage

– Not loss limited

– Low deductibles

Compulsory line of business

25 Guy Carpenter

Real-world analysis Market data statistics – for developed claims

Market data summary statistics

Number of companies 21

Analysis threshold £1,700,000

Total number of claims 1750

Average claim number (per client) 83

Minimum claim number 11

Maximum claim size £30,235,668

Basis Report Year

Years selected 2000 – 2010

Inflation 7.5% pa

14

26 Guy Carpenter

The largest observed claim has a big influence on the fit

Real-world analysis Largest claim effect

How do we deal

with such outliers?

Remove

Ignore

Weighting

Transform

27 Guy Carpenter

Real-world analysis What selection criteria to use?

Mathematical tests

Goodness-of-fit tests such as:

By eye – visual judgement

E.G.,

– CDF

– PDF

– QQ Graph

– PP Graph

1

)1(

2

Akaike 2.

Kn

KKKNLL

enfornK

NLL 2

))ln(ln(.

2

HQ 3.

2

)ln(.

2

Schwartz 4.

nKNLL

Likelihood-Log Natural 1.

parameters ofnumber K

points data ofnumber n :Where

15

28 Guy Carpenter

Choosing the market curve Possible criteria

Good fit versus over parameterisation

– Use an information criteria like the H-Q test

Higher number of parameters may lead to less predictive power

Parameter CV should be low

Parameters should be significantly different from zero

Interpretability of the model and parameters

Where is the curve going to be used ?

Curve-fitting is subjective; it is an art not a science

29 Guy Carpenter

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

0M 2M 4M 6M 8M 10M

Pro

bab

ility

Claim value (£m)

Inverse Gaussian Empirical

Real-world analysis What part of curve to fit to?

Inverse Gaussian – good fit to the body of the distribution (0 - £10M)

16

30 Guy Carpenter

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

10M 15M 20M 25M 30M

Pro

bab

ility

Claim value (£m)

Inverse Gaussian Empirical

Real-world analysis What part of curve to fit to?

Although, the fit is heavier at the tail (£10M - £30M)

31 Guy Carpenter

Generalised Beta

Has a good fit when looking at the CDF graph

Best performing in tests

BUT…

CVs of parameters are too high

Beta value is too low

Selected Distribution: Weibull

17

32 Guy Carpenter

107%

124%

106%

118%

85%

90%

95%

100%

105%

110%

115%

120%

125%

3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

Burr

Simple pareto

Lognormal

Transformed Gamma

Generalised Beta

Weibull

Real-world analysis Effect on the layers

Burr, Lognormal & Transformed Gamma similar to Weibull

Lo

g o

f lo

ss

es

Simple Pareto & Generalised Beta: Over-pricing for higher layers

Individual clients’ versus market curve

18

34 Guy Carpenter

Real-world analysis - Individual clients’ vs. Market curve Individual client data statistics

Attribute Client A Client B Client C Market

Total number of claims 293 52 12 1,515

Analysis threshold £1,700,000

Maximum claim size £29,731,529 £16,415,791 £12,090,704 £30,235,668

Minimum claim size £1,709,255 £1,736,425 £1,727,721 £1,702,032

Average claim size £3,955,290 £4,138,872 £4,853,976 £4,209,709

Basis Report Year

Years 2000 - 2010

35 Guy Carpenter

Real-world analysis Client empirical vs. best fit

Parameters

Name Value Std Dev CV

mu 14.28 0.26 0.02

sigma 0.89 0.11 0.12

Correlations

beta

theta -0.94

Parameters

Name Value Std Dev CV

alpha 2.35 0.51 0.22

tau 1.70 0.31 0.18

Correlations

beta

theta 0.87

Parameters

Name Value Std Dev CV

alpha 1.07 0.38 0.35

Client A

Client B

Client C

19

36 Guy Carpenter

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12MP

rob

abili

ty

Claim value

Chart Title

Market

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Market Empirical

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Market Empirical 98% 100% 66% 27%

Client A 95% 79% 64% 55%

Client B 96% 83% 109% 256%

Client C 102% 153% 378% 1660%

Real-world analysis Market Curve vs. Clients’ best fit

Layers 3M XS 2M 5M XS 5M 15M XS 10M Unltd XS 25M

Market Dev 100% 100% 100% 100%

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Chart Title

Market

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Market Empirical

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Chart Title

Market

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Market Empirical

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Chart Title

Market

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Market Empirical

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Chart Title

Market

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Market Empirical0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2M 7M 12M

Pro

bab

ility

Claim value

Market 99% 95% 86% 65%

0%10%20%30%40%50%60%70%80%90%

100%

2M 12M

Pro

ba

bil

ity

Claim value

Market

Market developed

Market Empirical

Client A - Lognormal

Client B - Log Gamma

Client C - Simple Pareto

Summary

20

38 Guy Carpenter

Summary Key messages

Parameter Selection

Model Selection

Sample Size

Bad news: Difficult to hide from

the pitfalls of curve fitting

– Multiplicative effect

– Implications where

curves are most needed

– Model selection has

least impact

Good news:

‘Ultimately curve-fitting is where science and art meet’

Bench marking

Company data

Blended approach

39 Guy Carpenter

Any Questions?