coursework1 20122013 solutions


Upload: yous123

Post on 04-Jun-2018


Page 1: CourseWork1 20122013 Solutions

8/13/2019 CourseWork1 20122013 Solutions

http://slidepdf.com/reader/full/coursework1-20122013-solutions 1/10

Statistics and Probabilistic Modelling for Insurance

Solutions to Course Work No1 2012/2013

1. (a) Since $f(x, y)$ is a joint density, we have that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = c \iint_{x^2+y^2 \le R^2} dy\,dx = 1. \tag{1}$$

Noting that the double integral is equal to the area of the circle, i.e.,

$$\iint_{x^2+y^2 \le R^2} dy\,dx = \pi R^2,$$

from (1) we obtain that $c = \dfrac{1}{\pi R^2}$.

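(b) We have

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{1}{\pi R^2}\int_{x^2+y^2 \le R^2} dy = \frac{1}{\pi R^2}\int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy = \frac{2}{\pi R^2}\sqrt{R^2 - x^2},$$

if $x^2 \le R^2$, and $f_X(x) = 0$ if $x^2 > R^2$. By symmetry, the marginal density of $Y$ is given by

$$f_Y(y) = \begin{cases} \dfrac{2}{\pi R^2}\sqrt{R^2 - y^2}, & \text{for } y^2 \le R^2, \\ 0, & \text{for } y^2 > R^2. \end{cases}$$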

(c) The distribution function, $F_D(a)$, $0 \le a \le R$, of the distance $D = \sqrt{X^2 + Y^2}$ is obtained as follows:

$$F_D(a) = P(D \le a) = P\left(\sqrt{X^2 + Y^2} \le a\right) = P(X^2 + Y^2 \le a^2) = \iint_{x^2+y^2 \le a^2} f(x, y)\,dy\,dx = \frac{1}{\pi R^2}\iint_{x^2+y^2 \le a^2} dy\,dx = \frac{\pi a^2}{\pi R^2} = \frac{a^2}{R^2},$$

where we have used the fact that $\iint_{x^2+y^2 \le a^2} dy\,dx$ is the area of a circle of radius $a$ and thus is equal to $\pi a^2$. When $R = 50$ miles and $a = 10$ miles we obtain

$$P(D \le 10) = \frac{100}{2500} = 0.04.$$


(d) Using the distribution function, $F_D(a)$, from part (c) we obtain

$$f_D(a) = \frac{d}{da}F_D(a) = \frac{2a}{R^2}, \quad 0 \le a \le R.$$

Hence, we have

$$E(D) = \int_0^R a\,\frac{2a}{R^2}\,da = \frac{2}{R^2}\int_0^R a^2\,da = \frac{2R}{3}.$$
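The results of parts (c) and (d) can be checked by simulation. The following is a minimal sketch, assuming points are drawn uniformly on the disk by rejection from the enclosing square; the function names, sample size and seed are illustrative choices and not part of the coursework.

```python
import math
import random


def sample_uniform_disk(radius, rng):
    """Draw one point uniformly from the disk x^2 + y^2 <= radius^2 by rejection."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            return x, y


def simulate_distance(radius=50.0, a=10.0, n=200_000, seed=1):
    """Estimate P(D <= a) and E(D) for D = sqrt(X^2 + Y^2)."""
    rng = random.Random(seed)
    hits = 0
    total = 0.0
    for _ in range(n):
        x, y = sample_uniform_disk(radius, rng)
        d = math.hypot(x, y)
        hits += d <= a
        total += d
    return hits / n, total / n


p_hat, mean_hat = simulate_distance()
print(p_hat, (10.0 / 50.0) ** 2)   # estimate vs. a^2 / R^2
print(mean_hat, 2 * 50.0 / 3)      # estimate vs. 2R / 3
```

With a sample of this size the estimates should agree with $a^2/R^2 = 0.04$ and $2R/3$ to roughly two decimal places.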

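2. (a) Equating the theoretical mean and variance to the empirical mean, $\bar{x}$, and variance, we get

$$\begin{cases} e^{\mu + \frac{1}{2}\sigma^2} = \bar{x} \\[4pt] e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right) = \dfrac{\sum_{i=1}^{120}(x_i - \bar{x})^2}{119} \end{cases} \;\Longrightarrow\; \begin{cases} e^{2\mu+\sigma^2} = \left(\dfrac{\sum_{i=1}^{120} x_i}{120}\right)^2 \\[4pt] e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right) = \dfrac{\sum_{i=1}^{120}(x_i - \bar{x})^2}{119} \end{cases} \;\Longrightarrow\; e^{\sigma^2} - 1 = \frac{\sum_{i=1}^{120}(x_i - \bar{x})^2 / 119}{\bar{x}^2}.$$

Hence, we have

$$e^{\sigma^2} - 1 = \frac{15601373}{2020.292^2} \;\Longrightarrow\; \sigma^2 = \log\left(1 + \frac{15601373}{2020.292^2}\right) = 1.57327 \;\Longrightarrow\; \sigma = 1.2543$$

and

$$e^{\mu + \frac{1}{2}\sigma^2} = 2020.292 \;\Longrightarrow\; \mu = \log(2020.292) - \frac{1}{2}\sigma^2 = 6.82436.$$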
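The method-of-moments arithmetic above can be reproduced in a few lines. This sketch uses only the two summary statistics quoted in the solution (sample mean 2020.292 and sample variance 15601373); the variable names are illustrative.

```python
import math

# Summary statistics quoted in the solution (n = 120 claims).
sample_mean = 2020.292
sample_var = 15601373.0

# Lognormal moments: mean = exp(mu + sigma^2 / 2),
# variance = mean^2 * (exp(sigma^2) - 1).
sigma2 = math.log(1.0 + sample_var / sample_mean**2)
sigma = math.sqrt(sigma2)
mu = math.log(sample_mean) - 0.5 * sigma2

print(sigma2, sigma, mu)
```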

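(b) The likelihood function takes the form

$$L(\alpha, \lambda) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda + x_i)^{\alpha+1}}$$

and hence

$$\begin{aligned} \log L(\alpha, \lambda) = l(\alpha, \lambda) &= \log \prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda + x_i)^{\alpha+1}} \\ &= \sum_{i=1}^{n}\left[\log(\alpha\lambda^{\alpha}) - \log(\lambda + x_i)^{\alpha+1}\right] = \sum_{i=1}^{n}\left[\log\alpha + \alpha\log\lambda - (\alpha+1)\log(\lambda + x_i)\right] \\ &= n\log\alpha + n\alpha\log\lambda - \sum_{i=1}^{n}(\alpha+1)\log(\lambda + x_i). \end{aligned}$$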

SPMI CW1 2012/2013 Solutions   2 


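Differentiating the log-likelihood function with respect to $\alpha$ and $\lambda$, and then solving for $\alpha$, one finds that the maximum likelihood estimators must satisfy

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \alpha} = \frac{n}{\alpha} + n\log\lambda - \sum_{i=1}^{n}\log(\lambda + x_i) = 0 \;\Longrightarrow\; \hat{\alpha} = \frac{n}{\sum_{i=1}^{n}\log(\lambda + x_i) - n\log\lambda},$$

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \lambda} = \frac{n\alpha}{\lambda} - \sum_{i=1}^{n}(\alpha+1)\frac{1}{\lambda + x_i} = 0 \;\Longrightarrow\; \hat{\alpha} = \frac{\sum_{i=1}^{n}\frac{1}{\lambda + x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda + x_i}}.$$

Hence, the maximum likelihood estimator $\hat{\lambda}$ must be a solution of

$$\frac{n}{\sum_{i=1}^{n}\log(\lambda + x_i) - n\log\lambda} = \frac{\sum_{i=1}^{n}\frac{1}{\lambda + x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda + x_i}},$$

which may be solved numerically, e.g. in Excel. Thus, we obtain $\hat{\lambda} = 1872.13$ and $\hat{\alpha} = 1.880467$.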
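As an alternative to Excel, the equation for $\hat{\lambda}$ can be solved by bisection. The sketch below is only illustrative: the 120 coursework observations are not reproduced here, so it runs on a synthetic data set (Pareto quantiles at evenly spaced probabilities, with assumed parameters 1.88 and 1872.13), and the function names and the bracket $[1, 10^7]$ are assumptions.

```python
import math


def alpha_from_dalpha(lam, xs):
    """alpha that solves d logL / d alpha = 0 for a given lambda."""
    return len(xs) / sum(math.log1p(x / lam) for x in xs)


def alpha_from_dlambda(lam, xs):
    """alpha that solves d logL / d lambda = 0 for a given lambda."""
    s = sum(1.0 / (lam + x) for x in xs)
    return s / (len(xs) / lam - s)


def fit_pareto(xs, lo=1.0, hi=1e7, iters=200):
    """Bisection on a log scale for the lambda at which both alphas agree."""
    def gap(lam):
        return alpha_from_dalpha(lam, xs) - alpha_from_dlambda(lam, xs)

    gap_lo = gap(lo)
    if gap_lo * gap(hi) > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)          # geometric midpoint
        gap_mid = gap(mid)
        if gap_lo * gap_mid <= 0:
            hi = mid
        else:
            lo, gap_lo = mid, gap_mid
    lam = math.sqrt(lo * hi)
    return alpha_from_dalpha(lam, xs), lam


# Synthetic sample (NOT the coursework data): Pareto quantiles at (i - 0.5) / n.
n, alpha0, lam0 = 120, 1.88, 1872.13
xs = [lam0 * ((1.0 - (i - 0.5) / n) ** (-1.0 / alpha0) - 1.0) for i in range(1, n + 1)]

alpha_hat, lam_hat = fit_pareto(xs)
print(alpha_hat, lam_hat)
```

Here $\sum \log(\lambda + x_i) - n\log\lambda$ is evaluated as $\sum \log(1 + x_i/\lambda)$, which is the same quantity but numerically better behaved for large $\lambda$.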

(c) Note that both Lognormal and Pareto distributions are defined on $(0, \infty)$ and you need to have $\sum_i E_i = 120 = \sum_i O_i$.

(i) One appropriate choice is to use 10 equally probable bins as determined by the fitted Lognormal distribution. So, in the case of Lognormal with $\mu = 6.82436$ and $\sigma = 1.2543$ we have

[table of bins $(L, U]$ with observed and expected counts]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,[F(U) - F(L)] = 120 \times 0.1.$$

We need to compare with $\chi^2$ with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so, at the 5% level, we reject the hypothesis that the data follow a Lognormal distribution with $\mu = 6.82436$ and $\sigma = 1.2543$. The critical value at 1% is 18.48 and so, at the 1% level, we do not reject the Lognormal fit. The p-value of the test statistic is 0.018503 and of course leads to the same conclusions.

(ii) One appropriate choice is to use 10 equally probable bins as determined by the fitted Pareto distribution (note that there is no built-in inverse Pareto distribution function in Excel, but the cdf can be inverted explicitly in order to find the bins). So, in the case of Pareto with $\alpha = 1.8805$ and $\lambda = 1872.13$ we have

[table of bins $(L, U]$ with observed and expected counts]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,[F(U) - F(L)] = 120 \times 0.1.$$

We need to compare with $\chi^2$ with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so we do not reject the hypothesis that the data follow a Pareto distribution with $\alpha = 1.8805$ and $\lambda = 1872.13$. The critical value at 1% is 18.48 and so we again do not reject the Pareto fit. The p-value of the test statistic is 0.277482 and of course leads to the same conclusions.

(iii) Recall that in order for the goodness-of-fit test to be valid one needs $E_i \ge 5$, so in both cases (i) and (ii) the test is reliable. The quality of the test is also reaffirmed by the high number of observations, namely 120, used to perform it.

Based on the test statistics one can say that the Pareto fit is better than the Lognormal fit: it has a much smaller test statistic and thus a higher p-value.

This particular choice of bins ensures $O_i > 5$, $i = 1, \ldots, 10$, and simplifies the computation of $E_i$, $i = 1, \ldots, 10$. Other choices are also possible and, clearly, the test statistic will be influenced by the bins selected. However, given the above calculations it is unlikely that the conclusions drawn would be affected.
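The equally probable bins in (i) and (ii) can be obtained by inverting the fitted cdfs. The following sketch shows one way to do it; the Pareto quantile comes from solving $1 - (\lambda/(\lambda+x))^{\alpha} = p$ for $x$. The observed counts in the example are hypothetical and are there only to exercise the statistic, since the coursework data are not reproduced here.

```python
import math
from statistics import NormalDist


def lognormal_quantile(p, mu, sigma):
    """Inverse cdf of the Lognormal(mu, sigma) distribution."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def pareto_cdf(x, alpha, lam):
    return 1.0 - (lam / (lam + x)) ** alpha


def pareto_quantile(p, alpha, lam):
    """Inverse of pareto_cdf, obtained by solving F(x) = p for x."""
    return lam * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


def chi_square_statistic(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


k, n = 10, 120
probs = [j / k for j in range(1, k)]   # interior bin boundaries: 0.1, ..., 0.9
lognormal_edges = [lognormal_quantile(p, 6.82436, 1.2543) for p in probs]
pareto_edges = [pareto_quantile(p, 1.8805, 1872.13) for p in probs]
expected = [n / k] * k                 # 12 expected claims in every bin

# Hypothetical observed counts, for illustration only (not the coursework data).
observed_demo = [10, 14, 12, 9, 15, 12, 11, 13, 12, 12]
stat_demo = chi_square_statistic(observed_demo, expected)
print(lognormal_edges)
print(pareto_edges)
print(stat_demo)
```

The resulting statistic would then be compared with the $\chi^2$ critical values for 7 degrees of freedom quoted above.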

3. It will be instructive to use the notation $M$ for the retention level and $L$ for the limiting level ($M = 20$, $L = 60$).

(a) It is not difficult to see from the definition of $F_Z(x)$ that the partial density function of its continuous part is $f_Z(z) \equiv f_X(z + M) = c\exp\{-c(M + z)\}$, if $0 < z < L - M$, where $f_X(\cdot)$ is the density of the original individual claims $X$.

Hence, one can conclude that the r.v. $X$ has an exponential distribution with parameter $c$, i.e., $f_X(x) = c\,e^{-cx}$.

(b) We have

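$$Z = \min\{\max(0, X - M),\, L - M\} = \min\{\max(0, X - 20),\, 40\}.$$

(c) Applying similar reasoning as in the lectures,

$$E(Z) = \int_M^L [1 - F_X(x)]\,dx = \int_M^L \left[1 - \left(1 - e^{-cx}\right)\right]dx = \int_M^L e^{-cx}\,dx = \frac{e^{-cM} - e^{-cL}}{c}.$$

Alternatively, but much longer,

$$\begin{aligned} E(Z) &= \int_0^{\infty} \min\{\max(0, x - M),\, L - M\}\, f_X(x)\,dx \\ &= \int_M^L (x - M)\, f_X(x)\,dx + (L - M)\int_L^{+\infty} f_X(x)\,dx \\ &= \int_0^{L-M} y\, f_X(y + M)\,dy + (L - M)\,[1 - F_X(L)] \\ &= \int_0^{L-M} y\, c\, e^{-c(y+M)}\,dy + (L - M)\, e^{-cL} \\ &= c\, e^{-cM}\int_0^{L-M} y\, e^{-cy}\,dy + (L - M)\, e^{-cL} \\ &= -e^{-cM}\left[(L - M)\, e^{-c(L-M)} + \frac{1}{c}\left(e^{-c(L-M)} - 1\right)\right] + (L - M)\, e^{-cL} \\ &= \frac{e^{-cM} - e^{-cL}}{c}. \end{aligned}$$

Hence,

$$E(Z) = \frac{e^{-cM} - e^{-cL}}{c} = \frac{e^{-0.1 \times 20} - e^{-0.1 \times 60}}{0.1} = 1.32857.$$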

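(d) Applying similar reasoning as in the lectures, we have

$$F_Y(y) = \begin{cases} F_X(y), & \text{if } y < M, \\ F_X(L + y - M), & \text{if } y \ge M, \end{cases}$$

i.e.,

$$F_Y(y) = \begin{cases} 1 - \exp\{-cy\}, & \text{if } y < M, \\ 1 - \exp\{-c(L + y - M)\}, & \text{if } y \ge M. \end{cases}$$

We have

$$E(Y) = \int_0^M [1 - F_X(x)]\,dx + \int_L^{+\infty} [1 - F_X(x)]\,dx = \int_0^M e^{-cx}\,dx + \int_L^{\infty} e^{-cx}\,dx = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$

Alternatively, and simpler,

$$E(Y) = E(X) - E(Z) = \frac{1}{c} - \frac{e^{-cM} - e^{-cL}}{c} = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$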


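Alternatively, but much longer,

$$E(Y) = \int_0^{\infty} \min(x, M)\, f_X(x)\,dx + \int_0^{\infty} \max(0, x - L)\, f_X(x)\,dx = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$

Finally, we have

$$E(Y) = \frac{1 + e^{-cL} - e^{-cM}}{c} = \frac{1 + e^{-0.1 \times 60} - e^{-0.1 \times 20}}{0.1} = 8.67143.$$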
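Both closed-form expectations can be cross-checked by integrating the payoffs against the exponential density numerically. This is a minimal sketch using a midpoint rule; the truncation point 400 and the step count are arbitrary illustrative choices (the neglected tail is of order $e^{-40}$).

```python
import math

c, M, L = 0.1, 20.0, 60.0   # exponential rate, retention, limit (from the solution)


def reinsurer_share(x):
    """Z = min(max(0, X - M), L - M)."""
    return min(max(0.0, x - M), L - M)


def insurer_share(x):
    """Y = X - Z = min(X, M) + max(0, X - L)."""
    return x - reinsurer_share(x)


def expectation(payoff, upper=400.0, steps=400_000):
    """Midpoint-rule approximation of E[payoff(X)] for X ~ Exp(c)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += payoff(x) * c * math.exp(-c * x)
    return total * h


ez_closed = (math.exp(-c * M) - math.exp(-c * L)) / c
ey_closed = (1.0 + math.exp(-c * L) - math.exp(-c * M)) / c
ez_numeric = expectation(reinsurer_share)
ey_numeric = expectation(insurer_share)
print(ez_closed, ez_numeric)
print(ey_closed, ey_numeric)
```

The two shares add up to the full claim, so the two expectations sum to $E(X) = 1/c = 10$.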

4.

(a) We have that $Y = I\xi$, where $I$ is the indicator of the loss event (accident),

$$I = \begin{cases} 1, & \text{with probability } q, \\ 0, & \text{with probability } 1 - q, \end{cases}$$

and $\xi$ is the severity of the loss given that the loss event (accident) occurs. For the cdf $F_Y(y)$, for $y \ge 0$, we have

$$\begin{aligned} F_Y(y) = P(Y \le y) &= P(Y \le y \mid I = 0)\,P(I = 0) + P(Y \le y \mid I = 1)\,P(I = 1) \\ &= P(I\xi \le y \mid I = 0)\,P(I = 0) + P(I\xi \le y \mid I = 1)\,P(I = 1) \\ &= P(0 \le y)(1 - q) + P(\xi \le y)\,q = 1 - q + q\,F_\xi(y). \end{aligned}$$

Substituting the gamma cdf (see lecture notes on Loss distributions)

$$F_\xi(y) = 1 - \exp\{-\alpha y\} - \alpha y\exp\{-\alpha y\} = 1 - \exp\{-\alpha y\}(1 + \alpha y),$$

we obtain that

$$F_Y(y) = 1 - q + q\left[1 - \exp\{-\alpha y\}(1 + \alpha y)\right] = 1 - q(1 + \alpha y)\exp\{-\alpha y\}.$$

(b) Applying maximum likelihood to estimate $q$, we have

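$$L(q, x) = \binom{n}{x} q^{x}(1 - q)^{n-x} = \binom{2000}{400} q^{400}(1 - q)^{2000-400},$$

$$\log L(q, x) = \log\binom{n}{x} + x\log q + (n - x)\log(1 - q),$$

$$\frac{d\log L(q, x)}{dq} = \frac{x}{q} - \frac{n - x}{1 - q} = 0 \;\Longrightarrow\; \hat{q} = \frac{x}{n} = \frac{400}{2000} = 0.2.$$

Since $E(\xi) = \mu = 2/\alpha$, we have that $\hat{\alpha} = 2/E(\xi) = 2/1000 = 0.002$.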

(c) Set $\mu = E(\xi)$ and $\sigma^2 = Var(\xi)$. Then, in view of

$$E(Y^k) = E(I^k\xi^k) = E(I^k)\,E(\xi^k) = q\,E(\xi^k),$$

we have

$$E(Y) = \mu_Y = q\,E(\xi) = q\mu = \frac{2q}{\alpha},$$

$$Var(Y) = \sigma_Y^2 = E(Y^2) - [E(Y)]^2 = q\,E(\xi^2) - q^2[E(\xi)]^2 = q(\sigma^2 + \mu^2) - q^2\mu^2 = q\sigma^2 + q(1 - q)\mu^2 = \frac{2q}{\alpha^2} + \frac{4q(1 - q)}{\alpha^2} = \frac{2q(3 - 2q)}{\alpha^2}.$$

Numerically,

$$E(Y) = \mu_Y = q\mu = 0.2 \times 1000 = 200,$$

$$Var(Y) = \sigma_Y^2 = \frac{2q(3 - 2q)}{\alpha^2} = \frac{2 \times 0.2 \times (3 - 2 \times 0.2)}{0.002^2} = 260000,$$

$$E(S_n) = n\,E(Y) = 2000 \times 200 = 400000,$$

$$Var(S_n) = 2000 \times \frac{2q(3 - 2q)}{\alpha^2} = 2000 \times 260000 = 52 \times 10^7.$$

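(d) From the lectures we have that, for a probability level $\beta$, say $\beta = 0.95$, and with $\theta$ denoting the security loading coefficient in the premium $P_n = (1 + \theta)E(S_n)$,

$$\begin{aligned} \beta = P(S_n \le P_n) &= P\big(S_n \le (1 + \theta)E(S_n)\big) = P\big(S_n - E(S_n) \le \theta E(S_n)\big) \\ &= P\left(\frac{S_n - E(S_n)}{\sqrt{Var(S_n)}} \le \frac{\theta E(S_n)}{\sqrt{Var(S_n)}}\right) = P\left(S^* \le \frac{\theta E(S_n)}{\sqrt{Var(S_n)}}\right), \end{aligned}$$

where $P(S^* \le x) \approx \Phi(x)$, the standard normal cdf. Hence $\Phi\left(\theta E(S_n)/\sqrt{Var(S_n)}\right) = 0.95$, which gives

$$\theta \approx \frac{q_\beta\sqrt{Var(S_n)}}{E(S_n)} = \frac{q_\beta\sqrt{n}\,\sigma_Y}{n\,\mu_Y} = \frac{q_\beta\,\sigma_Y}{\sqrt{n}\,\mu_Y} = \frac{1.65 \times \sqrt{260000}}{\sqrt{2000} \times 200} = 0.0940645, \tag{2}$$

where we have used that

$$E(S_n) = \sum_{j=1}^{n} q_j\mu_j = nq\mu = n\mu_Y \quad\text{and}\quad Var(S_n) = \sum_{j=1}^{n} Var(Y_j) = n\,Var(Y) = n\sigma_Y^2,$$

and that $\mu_Y = 200$, $\sigma_Y = \sqrt{260000} = 509.902$, and the standard normal quantile is $q_\beta = 1.65$ at $\beta = 0.95$.

From (2) we see that the security loading coefficient $\theta$ decays as $1/\sqrt{n}$, so the more policies $n$, the smaller the security loading $\theta$, which is natural since the risk is shared among a larger number of policyholders.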
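The numbers in parts (c) and (d) follow directly from the estimates $\hat{q} = 0.2$ and $\hat{\alpha} = 0.002$. The short sketch below recomputes them; the variable names are illustrative, and 1.65 is the rounded normal quantile used in the solution.

```python
import math

q, alpha, n = 0.2, 0.002, 2000   # estimates and portfolio size from the solution
quantile = 1.65                  # rounded standard normal quantile at 0.95

mean_y = 2 * q / alpha                     # E(Y) = 2q / alpha
var_y = 2 * q * (3 - 2 * q) / alpha**2     # Var(Y) = 2q(3 - 2q) / alpha^2
mean_s, var_s = n * mean_y, n * var_y      # sums over n independent policies

loading = quantile * math.sqrt(var_s) / mean_s   # formula (2)
print(mean_y, var_y, mean_s, var_s, loading)

# The loading shrinks like 1 / sqrt(n) as the portfolio grows.
for size in (500, 2000, 8000):
    print(size, quantile * math.sqrt(var_y) / (math.sqrt(size) * mean_y))
```

Quadrupling the number of policies halves the loading, as the $1/\sqrt{n}$ decay suggests.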


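(e) Denote the loss of the insurance company by $W$. We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } Y > d, \end{cases} \;\Longrightarrow\; F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w + d) = P(Y \le w + d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q + q\,F_\xi(w + d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\,(1 + \alpha(w + d))\exp\{-\alpha(w + d)\}, & \text{if } w \ge 0. \end{cases}$$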

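(f) (i) We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } d < Y \le m, \\ m - d, & \text{if } Y > m, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w + d) = P(Y \le w + d), & \text{if } 0 \le w < m - d, \\ 1, & \text{if } m - d \le w, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\,(1 + \alpha(w + d))\exp\{-\alpha(w + d)\}, & \text{if } 0 \le w < m - d, \\ 1, & \text{if } m - d \le w. \end{cases}$$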

(ii) We have

$$\begin{aligned} E(W) &= \int_0^{\infty} [1 - F_W(w)]\,dw = \int_0^{m-d} [1 - F_W(w)]\,dw \\ &= \int_0^{m-d} \left[1 - \left(1 - q\,(1 + \alpha(w + d))\exp\{-\alpha(w + d)\}\right)\right]dw \\ &= \int_0^{m-d} q\,(1 + \alpha(w + d))\exp\{-\alpha(w + d)\}\,dw \\ &= \int_0^{m-d} q\exp\{-\alpha(w + d)\}\,dw + \int_0^{m-d} q\,\alpha(w + d)\exp\{-\alpha(w + d)\}\,dw \\ &= q\,e^{-\alpha d}\int_0^{m-d} e^{-\alpha w}\,dw + q\,\alpha\int_0^{m-d} (w + d)\,e^{-\alpha(w+d)}\,dw \\ &= \frac{q}{\alpha}\left(e^{-\alpha d} - e^{-\alpha m}\right) + \frac{q}{\alpha}\left[(1 + \alpha d)\,e^{-\alpha d} - (1 + \alpha m)\,e^{-\alpha m}\right] \\ &= \frac{q}{\alpha}\left[(2 + \alpha d)\,e^{-\alpha d} - (2 + \alpha m)\,e^{-\alpha m}\right]. \end{aligned}$$

Numerically,

$$E(W) = \frac{q}{\alpha}\left[(2 + \alpha d)\,e^{-\alpha d} - (2 + \alpha m)\,e^{-\alpha m}\right] = 107.438,$$

$$E(S) = 2000\,E(W) = 2000 \times 107.438 = 214876,$$

so the mean has almost halved compared with $E(S_n) = 2000 \times 200 = 400000$.
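The closed form for $E(W)$ can be checked against a direct numerical integration of the survival function. The values of the deductible $d$ and the cap $m$ are not stated in this transcript, so the sketch below assumes $d = 100$ and $m = 800$; these are assumptions, chosen because they are consistent with the quoted value 107.438. The function names are illustrative.

```python
import math

q, alpha = 0.2, 0.002   # estimates from part (b)
d, m = 100.0, 800.0     # ASSUMED deductible and cap; not stated in the transcript


def survival_w(w):
    """1 - F_W(w) for 0 <= w < m - d, from part (f)(i)."""
    u = alpha * (w + d)
    return q * (1.0 + u) * math.exp(-u)


def expected_w_closed():
    """Closed form derived in part (f)(ii)."""
    return (q / alpha) * ((2 + alpha * d) * math.exp(-alpha * d)
                          - (2 + alpha * m) * math.exp(-alpha * m))


def expected_w_numeric(steps=100_000):
    """Midpoint rule for the integral of the survival function over [0, m - d)."""
    h = (m - d) / steps
    return h * sum(survival_w((i + 0.5) * h) for i in range(steps))


ew_closed = expected_w_closed()
ew_numeric = expected_w_numeric()
print(ew_closed, ew_numeric, 2000 * ew_closed)
```

Under these assumed values the closed form and the numerical integral agree, and multiplying by 2000 policies gives the portfolio mean.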

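5.

We have the general results

$$E(S) = \mu\,E(N) \quad\text{and}\quad V(S) = \sigma^2 E(N) + \mu^2\,Var(N). \tag{3}$$

(a)(i) For the Poisson case

$$E(S) = \mu\lambda \quad\text{and}\quad V(S) = \lambda\,E(X^2).$$

So

$$E(S) = \mu\lambda = \left(2 \times \frac{1}{3} + 3 \times \frac{1}{2} + 4 \times \frac{1}{6}\right) \times 150 = 425,$$

$$V(S) = \lambda\,E(X^2) = 150 \times \left(2^2 \times \frac{1}{3} + 3^2 \times \frac{1}{2} + 4^2 \times \frac{1}{6}\right) = 1275.$$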

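(ii) From (3) we have

$$E(S) = \mu\,E(N) = \mu\,\frac{1 - p}{p} = \frac{5}{3} \times \frac{0.98}{0.02} = \frac{245}{3} = 81.6667,$$

$$V(S) = \sigma^2 E(N) + \mu^2\,Var(N) = \sigma^2\,\frac{1 - p}{p} + \mu^2\,\frac{1 - p}{p^2} = \frac{5}{3^2} \times \frac{0.98}{0.02} + \frac{5^2}{3^2} \times \frac{0.98}{0.02^2} = \frac{61495}{9} = 6832.78.$$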


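(iii) From (3) we have

$$E(S) = \mu\,E(N) = \mu\,\frac{k(1 - p)}{p} = 3 \times \frac{4 \times 0.98}{0.02} = 588,$$

$$V(S) = \sigma^2 E(N) + \mu^2\,Var(N) = \sigma^2\,\frac{k(1 - p)}{p} + \mu^2\,\frac{k(1 - p)}{p^2} = 3^2 \times \frac{4 \times 0.98}{0.02} + 3^2 \times \frac{4 \times 0.98}{0.02^2} = 89964.$$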
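The three cases in part (a) are all instances of formula (3), so they can be recomputed with one helper. The sketch below uses exact fractions; the severity means and variances for the geometric and negative binomial cases (5/3 and 5/9, 3 and 9) are read off the numbers in the solution, since the question itself is not reproduced here. The helper name is illustrative.

```python
from fractions import Fraction as F


def compound_moments(mu, sigma2, mean_n, var_n):
    """E(S) and V(S) from formula (3)."""
    return mu * mean_n, sigma2 * mean_n + mu**2 * var_n


# Poisson(150) claim count; claim sizes 2, 3, 4 with probabilities 1/3, 1/2, 1/6.
sizes = {2: F(1, 3), 3: F(1, 2), 4: F(1, 6)}
mu = sum(x * p for x, p in sizes.items())
sigma2 = sum(x * x * p for x, p in sizes.items()) - mu**2
poisson = compound_moments(mu, sigma2, 150, 150)

# Geometric claim count with p = 0.02; severity mean 5/3 and variance 5/9.
p = F(2, 100)
geometric = compound_moments(F(5, 3), F(5, 9), (1 - p) / p, (1 - p) / p**2)

# Negative binomial claim count with k = 4, p = 0.02; severity mean 3, variance 9.
k = 4
neg_binomial = compound_moments(F(3), F(9), k * (1 - p) / p, k * (1 - p) / p**2)

for name, (mean_s, var_s) in [("Poisson", poisson), ("geometric", geometric),
                              ("negative binomial", neg_binomial)]:
    print(name, float(mean_s), float(var_s))
```

For the Poisson case the mean and variance of $N$ coincide, so formula (3) collapses to $V(S) = \lambda(\sigma^2 + \mu^2) = \lambda E(X^2)$, which is the shortcut used in (a)(i).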

(b)(i) Comparing

$$M_S(t) = \exp\left\{200\left(\frac{1}{1 - 2t} - 1\right)\right\}, \quad\text{for } t < 1/2,$$

with formula (22) from the lecture notes on Risk models, which holds for a Poisson number of claims,

$$M_S(t) = \exp\{\lambda\,(M_X(t) - 1)\},$$

it is evident that this corresponds to a collective risk model with a Poisson ($\lambda = 200$) number of claims, and

$$M_X(t) = \frac{1}{1 - 2t}, \quad\text{for } t < 1/2,$$

is the m.g.f. of exponentially distributed claim amounts, $X \sim \mathrm{Exp}(0.5)$.

(ii) Therefore we have

$$E(S) = \mu\lambda = 2 \times 200 = 400 \quad\text{and}\quad V(S) = \lambda\,E(X^2) = 200 \times (2^2 + 2^2) = 1600.$$
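The identified model (Poisson claim count with mean 200, exponential claims with rate 0.5) can be simulated to check the two moments. This is a rough sketch: the Poisson variate is generated by counting exponential inter-arrival times in a unit interval, and the number of simulations and the seed are arbitrary illustrative choices.

```python
import random


def poisson_draw(lam, rng):
    """Poisson(lam) variate: arrivals of a rate-lam Poisson process in [0, 1]."""
    count, t = 0, rng.expovariate(lam)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def simulate_aggregate(lam=200.0, rate=0.5, n_sims=3000, seed=7):
    """Sample mean and variance of S = X_1 + ... + X_N over n_sims portfolios."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_claims = poisson_draw(lam, rng)
        totals.append(sum(rng.expovariate(rate) for _ in range(n_claims)))
    mean = sum(totals) / n_sims
    var = sum((s - mean) ** 2 for s in totals) / (n_sims - 1)
    return mean, var


mean_s, var_s = simulate_aggregate()
print(mean_s, var_s)   # to be compared with 400 and 1600
```

The sample variance converges more slowly than the sample mean, so a larger `n_sims` is needed if a tight check on $V(S)$ is wanted.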
