The Use of Multivariate Analysis Techniques to Design a Class Plan

Mark Scully, Tillinghast-Towers Perrin

1999 CAS Seminar on Ratemaking, March 11-12, 1999, Nashville, Tennessee

Page 1: The Use of Multivariate Analysis Techniques to Design a Class Plan

March 11-12, 1999, Nashville, Tennessee

Mark Scully, Tillinghast-Towers Perrin

The Use of Multivariate Analysis Techniques to Design a Class Plan

1999 CAS Seminar on Ratemaking

Page 2

Overview of Presentation

Background

Multivariate analysis techniques: Generalized Linear Models (GLMs); classification and regression trees (CART, CHAID)

Implementation: pricing; marketing; agents' compensation; results monitoring

Page 3

Several Factors are Converging toward Better Analysis of Customer and Prospect Attributes

Greater emphasis on pricing vs. underwriting

Increased familiarity with techniques

Faster computers

Influence of direct writers, non-standard companies, and banks

Use of multiple distribution channels

Increased competition

Page 4

Why Multivariate Statistical Techniques?

Most rating variables are correlated.

Different variables may be showing the same underlying effect.

Repeated use of univariate techniques leads to double-counting of the same effects.

Multivariate techniques can capture interactions.

They provide standard errors in addition to point estimates.
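The double-counting point can be illustrated with a small sketch. All figures are hypothetical (not from the presentation): mileage is given a true relativity of 2.0 and age none, but young drivers are concentrated in the high-mileage cells. The one-way relativities attribute part of the mileage effect to age. The multiplicative marginal-totals iteration, used here only as a simple stand-in for a multivariate fit, does not.

```python
# Hypothetical data: exposure and claim counts by (age group, mileage group).
exposure = {("young", "high"): 800.0, ("young", "low"): 200.0,
            ("old", "high"): 200.0, ("old", "low"): 800.0}
claims = {("young", "high"): 160.0, ("young", "low"): 20.0,
          ("old", "high"): 40.0, ("old", "low"): 80.0}
ages, mileages = ("young", "old"), ("high", "low")

def one_way_frequency(position, level):
    """Claim frequency for one level of one variable, ignoring the other."""
    cells = [k for k in exposure if k[position] == level]
    return sum(claims[k] for k in cells) / sum(exposure[k] for k in cells)

univariate_age = one_way_frequency(0, "young") / one_way_frequency(0, "old")
univariate_mileage = one_way_frequency(1, "high") / one_way_frequency(1, "low")

# Multiplicative marginal-totals iteration: frequency = a[age] * m[mileage].
a = {age: 1.0 for age in ages}
m = {mil: 1.0 for mil in mileages}
for _ in range(200):
    for age in ages:
        a[age] = (sum(claims[(age, mil)] for mil in mileages)
                  / sum(exposure[(age, mil)] * m[mil] for mil in mileages))
    for mil in mileages:
        m[mil] = (sum(claims[(age, mil)] for age in ages)
                  / sum(exposure[(age, mil)] * a[age] for age in ages))

multivariate_age = a["young"] / a["old"]
multivariate_mileage = m["high"] / m["low"]

print(f"age relativity:     one-way {univariate_age:.2f}, multivariate {multivariate_age:.2f}")
print(f"mileage relativity: one-way {univariate_mileage:.2f}, multivariate {multivariate_mileage:.2f}")
```

Multiplying the two one-way relativities would price the young, high-mileage cell at 3.0 times the old, low-mileage cell, although the data were built with a factor of 2.0.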

Page 5

Different Rating Variables may be Manifestations of the Same Underlying Effect

Underlying effect: driving intensity

Rating variables: annual mileage, vehicle make/model, driver age

Page 6

Interactions Arise when the Combined Effect of two Variables Differs from the Sum of their Single Effects

[Figure: relativity (0.6 to 2.0) by age of driver, shown separately for female and male drivers.]

The differential between female and male differs by age.
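A minimal sketch of how such an interaction could be detected in a table of relativities. The values are hypothetical, and the check is written for a multiplicative plan, where "no interaction" means the male/female ratio is the same in every age band (in an additive plan the difference, not the ratio, would be constant).

```python
# Hypothetical relativities by age band and sex (illustrative values only).
relativity = {
    "16-20": {"female": 1.40, "male": 1.90},
    "26-30": {"female": 1.00, "male": 1.10},
    "46-50": {"female": 0.80, "male": 0.80},
}

def sex_differentials(table):
    """Male-to-female relativity ratio within each age band."""
    return {age: row["male"] / row["female"] for age, row in table.items()}

def has_interaction(table, tolerance=0.01):
    """True when the male/female differential is not constant across ages."""
    ratios = list(sex_differentials(table).values())
    return max(ratios) - min(ratios) > tolerance

differentials = sex_differentials(relativity)
for age, ratio in differentials.items():
    print(f"{age}: male/female = {ratio:.2f}")
print("interaction present:", has_interaction(relativity))
```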

Page 7

Confidence Intervals Indicate the Degree of Certainty Inherent in Relativity Estimates

[Figure: German Bonus-Malus, frequency model. Relativity (0 to 3) by Bonus-Malus class.]
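One common way to obtain such an interval, sketched under the assumption of a log-link model: the interval is built on the coefficient scale and then exponentiated, so it is asymmetric around the relativity. The coefficient and standard error below are hypothetical, and the function name is illustrative only.

```python
import math

def relativity_interval(beta, std_error, z=1.96):
    """Relativity exp(beta) with an approximate 95% interval built on the log scale."""
    estimate = math.exp(beta)
    lower = math.exp(beta - z * std_error)
    upper = math.exp(beta + z * std_error)
    return estimate, lower, upper

# Hypothetical coefficient and standard error for one rating class.
estimate, lower, upper = relativity_interval(beta=math.log(1.5), std_error=0.10)
print(f"relativity {estimate:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

A class with little exposure would show a larger standard error and therefore a wider interval.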

Page 8

Confidence Intervals Indicate the Degree of Certainty Inherent in Relativity Estimates

[Figure: Territorial relativities, frequency model. Relativity (0 to 2) by territory (1 to 10).]

Page 9

Confidence Intervals Indicate the Degree of Certainty Inherent in Relativity Estimates

[Figure: Territorial relativities, severity model. Relativity (0 to 1.4) by territory (1 to 10).]

Page 10

Statistical Rating Techniques Indicate the Relative Explanatory Power of each Variable...

…and the extent to which variables are correlated.

[Figure: Variable A and Variable B.]

Page 11

What statistical techniques do we commonly use?

Generalized Linear Models (GLMs)

Classification and regression trees: CHAID, CART

Page 12

What are GLMs?

A statistical procedure for measuring the effect of one or more independent variables upon a dependent variable

For ratemaking, the dependent variables are typically frequency and severity

GLMs allow extreme flexibility in model structure and design: multiplicative or additive plans (or others); different error distributions; variable interactions

GLMs explicitly produce relativity estimates (and more)

Page 13

Basic Theory of GLMs (I)

Let $Y_i$, $i = 1, 2, \dots, n$, be observations from a random variable. We model them as follows:

$$Y_i = h^{-1}\left(x_i^T \beta + \xi_i\right) + e_i,$$

where:

$h$ = the link function

$x_i$ = a vector of variables associated with the i-th observation

$\xi_i$ = a scalar parameter (the offset)

$\beta$ = the parameter vector

$e_i$ = an error term (with mean equal to 0)
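A minimal sketch of the systematic part of this model, i.e. the mean of the i-th observation as the inverse link applied to the linear predictor plus offset. The function name, the coefficients, and the use of log(exposure) as the offset are assumptions made for illustration.

```python
import math

def glm_mean(x, beta, offset=0.0, inverse_link=math.exp):
    """Mean of Y_i: the inverse link applied to the linear predictor plus offset."""
    linear_predictor = sum(xj * bj for xj, bj in zip(x, beta)) + offset
    return inverse_link(linear_predictor)

# Hypothetical log-link frequency model: intercept, young-driver flag, high-mileage flag.
beta = [math.log(0.10), math.log(1.5), math.log(2.0)]
x = [1.0, 1.0, 1.0]          # a young, high-mileage risk
offset = math.log(2.0)       # assumed offset: log of two years of exposure

expected_claims = glm_mean(x, beta, offset)   # log link: the effects multiply
identity_mean = glm_mean([1.0, 1.0], [100.0, 25.0], inverse_link=lambda eta: eta)
print(expected_claims, identity_mean)         # identity link: the effects add
```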

Page 14

Basic Theory of GLMs (II)

Typically, the random term $e_i$ is chosen from the exponential family with density in the following general form:

$$f(y) = \exp\left\{\frac{y\theta - b(\theta)}{\phi / w} + c(y, \phi)\right\},$$

where $\theta$ and $\phi$ are parameters and $w$ is the weight of each observation.

If we denote the mean of this distribution as $\mu$, then its variance may be expressed as $\phi V(\mu)/w$, where $V(\cdot)$ is referred to as the variance function.
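As a check on the notation, the Poisson distribution can be written in this general form. The identification below is the standard one, written with $w = 1$:

```latex
f(y) = \frac{\mu^{y} e^{-\mu}}{y!}
     = \exp\left\{ y \log\mu - \mu - \log y! \right\}
\quad\Longrightarrow\quad
\theta = \log\mu,\qquad b(\theta) = e^{\theta},\qquad \phi = 1,\qquad c(y,\phi) = -\log y!
```

Differentiating $b$ twice gives $b''(\theta) = e^{\theta} = \mu$, so the Poisson variance function is $V(\mu) = \mu$.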

Page 15

Basic Theory of GLMs (III)

Distribution       Variance Function   Canonical Link
Normal             1                   Identity
Poisson            $\mu$               Log
Binomial           $\mu(1-\mu)$        Logit = $\log(\mu/(1-\mu))$
Gamma              $\mu^2$             Reciprocal = $\mu^{-1}$
Inverse Gaussian   $\mu^3$             $\mu^{-2}$
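The table can be encoded directly, which is sometimes handy when checking a variance assumption by hand. This is only a sketch: the dictionary layout and function name are illustrative, and the variance formula is the phi * V(mu) / w expression from the previous slide.

```python
import math

# Variance function V(mu) and canonical link g(mu) for each distribution in the table.
FAMILIES = {
    "normal":           (lambda mu: 1.0,             lambda mu: mu),
    "poisson":          (lambda mu: mu,              lambda mu: math.log(mu)),
    "binomial":         (lambda mu: mu * (1.0 - mu), lambda mu: math.log(mu / (1.0 - mu))),
    "gamma":            (lambda mu: mu ** 2,         lambda mu: 1.0 / mu),
    "inverse_gaussian": (lambda mu: mu ** 3,         lambda mu: mu ** -2),
}

def variance(family, mu, phi=1.0, weight=1.0):
    """Var(Y) = phi * V(mu) / w, using the variance function of the named family."""
    variance_function, _ = FAMILIES[family]
    return phi * variance_function(mu) / weight

print(variance("gamma", mu=2.0, phi=0.5))   # phi * mu^2 / w
print(FAMILIES["poisson"][1](1.0))          # log link evaluated at mu = 1
```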

Page 16

Literature on GLMs

Generalized Linear Models, Second Edition, P. McCullagh and J.A. Nelder, Chapman & Hall 1989 (ISBN 0 412 31760 5)

“Statistical Motor Rating: Making Effective Use of Your Data”, M.J. Brockman and T.S. Wright, JIA 119, III, 457-543 (April 1992).

“Technical Aspects of Domestic Lines Pricing”, Greg Taylor, University of Melbourne Research Paper 45 (ISBN 0 7325 1474 6)

Page 17

GLMs: Some Practical Considerations (I)

A log link function produces multiplicative relativities.

Separate models for frequency and severity: better understanding of data; appropriate distributions exist

Typical error distributions for frequency: Poisson/quasi-Poisson; negative binomial

Typical distributions for severity: Normal; Gamma; Inverse Gaussian
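To show how a log link yields multiplicative relativities, here is a minimal sketch of a Poisson frequency model fitted by Fisher scoring in plain Python. The four cells are hypothetical, the offset is assumed to be log(exposure), and the small `solve` helper is written for this example; production work would normally rely on a statistics package.

```python
import math

# Hypothetical cells: (young driver?, high mileage?, exposure, claim count).
cells = [
    (1, 1, 800.0, 160.0),
    (1, 0, 200.0, 20.0),
    (0, 1, 200.0, 40.0),
    (0, 0, 800.0, 80.0),
]

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_poisson_log(cells, iterations=25):
    """Fisher scoring for a Poisson GLM with log link and log(exposure) offset."""
    rows = [[1.0, float(young), float(high)] for young, high, _, _ in cells]
    total_claims = sum(c[3] for c in cells)
    total_exposure = sum(c[2] for c in cells)
    beta = [math.log(total_claims / total_exposure), 0.0, 0.0]
    p = len(beta)
    for _ in range(iterations):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for x, (_, _, exposure, claims) in zip(rows, cells):
            mu = exposure * math.exp(sum(xj * bj for xj, bj in zip(x, beta)))
            for j in range(p):
                score[j] += x[j] * (claims - mu)      # X^T (y - mu)
                for k in range(p):
                    info[j][k] += x[j] * mu * x[k]    # X^T W X with W = diag(mu)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

beta = fit_poisson_log(cells)
base_frequency = math.exp(beta[0])
relativities = {"young": math.exp(beta[1]), "high_mileage": math.exp(beta[2])}
print(base_frequency, relativities)
```

Because the coefficients add on the log scale, the fitted frequency for any cell is the base frequency times the product of the relevant exponentiated coefficients, which is the multiplicative rating structure.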

Page 18

GLMs: Some Practical Considerations (II)

Variables may be modeled as continuous covariates or categorical factors

An array of statistical and practical tests exists for model testing: variable significance tests; quantile plots; residual plots; comparison of actual data to model
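One form a variable significance test can take is a comparison of deviances between nested models. The sketch below assumes a Poisson frequency model with the scale parameter fixed at 1. The claim counts and fitted values are hypothetical and are not the output of any model in the presentation.

```python
import math

def poisson_deviance(actual, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(actual, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total

# Hypothetical claim counts for four cells and fitted values from two nested models.
actual = [160.0, 20.0, 40.0, 80.0]
fitted_with_mileage = [160.0, 20.0, 40.0, 80.0]   # model including a mileage factor
fitted_without = [120.0, 30.0, 30.0, 120.0]       # intercept-only model

drop_in_deviance = (poisson_deviance(actual, fitted_without)
                    - poisson_deviance(actual, fitted_with_mileage))
CHI2_95_1DF = 3.841  # 95th percentile of chi-squared with 1 degree of freedom
print(f"drop in deviance {drop_in_deviance:.2f}; significant: {drop_in_deviance > CHI2_95_1DF}")
```

With an estimated scale parameter (for example quasi-Poisson or Gamma models), an F-test on the scaled deviances is the usual substitute.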

Page 19

Comparison of Actual to Model Helps to Identify Areas Currently Under- or Overpriced

[Figure: indicated and current pure premium (-500 to 2,000) and % difference (-30% to 170%) by age of driver, for age bands 16-20, 21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-60, 61-70, 71-80, 81-99.]

Loss segments: How much do we write? Are we growing here? How many dollars are involved? Are there other reasons to stay here?

Profit segments: How much do we write? Are we losing business? How many dollars are involved? How do we get more?

Page 20

The Significance of these Profit/Loss Areas Depends also on their Volume of Business

[Figure: gain/(loss) in millions of $ (-10 to 20) by age of driver, for age bands 16-20, 21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-60, 61-70, 71-80, 81-99.]

Note: Gain/(Loss) = (Current PP - Indicated PP) x Exposures
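The formula in the note is simple enough to sketch directly, assuming PP stands for pure premium per exposure. The segments and amounts below are hypothetical.

```python
def gain_or_loss(current_pp, indicated_pp, exposures):
    """Gain/(Loss) = (current PP - indicated PP) x exposures."""
    return (current_pp - indicated_pp) * exposures

# Hypothetical segments: (age band, current PP, indicated PP, exposures).
segments = [
    ("16-20", 1200, 1500, 10_000),
    ("41-45", 600, 500, 80_000),
]
results = {age: gain_or_loss(current, indicated, expo) / 1_000_000
           for age, current, indicated, expo in segments}
for age, millions in results.items():
    print(f"{age}: {millions:+.1f} million")
```

In this example the larger percentage gap sits in the 16-20 band, but the larger dollar amount sits in the 41-45 band because of its exposure volume.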

Page 21

What are classification and regression trees?

Procedures for successively subdividing data into homogeneous groups

Like GLMs, they use a dependent variable and one or more independent ones

The result is not necessarily symmetric

They implicitly capture the natural interactions between factors

They can produce a simpler rating plan or form a single rating variable out of many

They produce homogeneous groups (i.e., a tree structure) but no rating plan or relativities
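A minimal sketch of one subdivision step, assuming a CART-style criterion: choose the binary split of a single variable that minimizes the within-group sum of squared errors. The data are hypothetical and unweighted; a real analysis would weight by exposure and search over all candidate variables.

```python
def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_binary_split(xs, ys):
    """Return (threshold, cost) of the split of xs that minimizes the total SSE of ys."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

# Hypothetical observations: vehicle age and observed claim frequency.
vehicle_age = [0, 1, 2, 3, 5, 8]
frequency = [0.20, 0.19, 0.21, 0.10, 0.11, 0.09]
threshold, cost = best_binary_split(vehicle_age, frequency)
print(f"split at vehicle age < {threshold}")
```

Applying the same step again within each resulting group gives the successive subdivision described above.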

Page 22

Classification and Regression Trees produce an asymmetrical grouping of the data

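[Figure: example tree from a German-language motor portfolio, starting from the whole portfolio (Bestand) and ending in groups numbered 1 to 13. Split labels include SF class (SF M.O1/2, SF 1-3, SF 5-10, SF 11-15, SF 15-22), sex (male, female), vehicle age (< 2, > 2), vehicle type (Typ 10-15, Typ 16-20, Typ 10-17, Typ 21-25, Typ 18-25), garage (none), civil servants (Beamten), and R & A.]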

Page 23

Some differences between CHAID and CART

The dependent variable for CHAID must be categorical; for CART it can be metric (continuous)

Different splitting algorithms (e.g., CHAID uses a chi-squared test on contingency tables)

CHAID splits into multiple groups; CART makes binary splits

Different stopping criteria
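The kind of test CHAID relies on can be sketched as a Pearson chi-squared statistic on a contingency table of predictor category against a categorical outcome. The counts are hypothetical, and this is only the test statistic, not the full CHAID merge-and-split procedure.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two categories of a predictor,
# columns are policies with a claim and policies without a claim.
table = [[30, 70],
         [10, 90]]
statistic = chi_squared(table)
CHI2_95_1DF = 3.841  # 95th percentile of chi-squared with 1 degree of freedom
print(f"chi-squared {statistic:.2f}; categories differ: {statistic > CHI2_95_1DF}")
```

Categories whose statistic falls below the critical value would be candidates for merging, while a CART-style algorithm would instead search for the best two-way split.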

Page 24

GLMs May Be Used to Produce a Rating Plan with Variables Generated by CART or CHAID

[Diagram: potential rating variables -> CART/CHAID analysis -> CART/CHAID variables -> GLM analysis]

Page 25

Results from the Rating Analysis Can be Used Beyond the Production of a Rating Plan

Rating Analysis

Actuarially Optimal Model

Constraints:
• Regulatory
• Agents
• Stability
• Competition
• etc.

Rating Plan Actually Implemented

Uses beyond the rating plan:
• Marketing
• UW Guidelines
• Agents’ Compensation
• Monitoring