Practical GLM Modeling of Deductibles
David Cummings
State Farm Insurance Companies
Overview
• Traditional Deductible Analyses
• GLM Approaches to Deductibles
• Tests on simulated data
Empirical Method
• All losses at $500 deductible: $1,000,000
• Losses eliminated by $1,000 deductible: $100,000
• Loss Elimination Ratio: 10%
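The arithmetic on this slide can be sketched in a few lines. The first calculation uses the slide's own totals; the claim-level function and the four sample losses are hypothetical and only illustrate how the two totals would be built from data observed at the $500 deductible.

```python
def empirical_ler(ground_up_losses, base_deductible, new_deductible):
    """Share of losses paid at base_deductible that is eliminated at new_deductible."""
    paid_base = sum(max(x - base_deductible, 0.0) for x in ground_up_losses)
    paid_new = sum(max(x - new_deductible, 0.0) for x in ground_up_losses)
    return (paid_base - paid_new) / paid_base

# Totals from the slide: $100,000 eliminated out of $1,000,000
slide_ler = 100_000 / 1_000_000

# Hypothetical ground-up losses for policies written at a $500 deductible
sample_losses = [700.0, 1200.0, 3000.0, 800.0]
sample_ler = empirical_ler(sample_losses, 500.0, 1000.0)
```

Only losses observed under the lower deductible enter the calculation, which is the limitation the next slide lists under "Cons".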
Empirical Method
• Pros
– Simple
• Cons
– Need credible data at low deductible
– No $1000 deductible data is used to price the $1000 deductible
Loss Distribution Method
• Fit a severity distribution to data
• Calculate expected value of truncated distribution
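A minimal sketch of the two steps, assuming a Gamma severity and ground-up (untruncated) sample data. The simulated losses, the function names, and the $500 and $1,000 deductibles are illustrative assumptions, not values from the presentation.

```python
from scipy import stats

def limited_expected_value(d, shape, scale):
    """E[min(X, d)] for X ~ Gamma(shape, scale)."""
    return (shape * scale * stats.gamma.cdf(d, shape + 1, scale=scale)
            + d * stats.gamma.sf(d, shape, scale=scale))

def loss_elimination_ratio(d_new, d_base, shape, scale):
    """Share of losses paid at d_base that is eliminated by moving to d_new."""
    mean = shape * scale
    lev_base = limited_expected_value(d_base, shape, scale)
    lev_new = limited_expected_value(d_new, shape, scale)
    return (lev_new - lev_base) / (mean - lev_base)

# Step 1: fit a severity distribution (hypothetical ground-up losses, location fixed at 0)
losses = stats.gamma.rvs(2.0, scale=1500.0, size=5000, random_state=0)
shape_hat, _, scale_hat = stats.gamma.fit(losses, floc=0)

# Step 2: expected value of the truncated distribution, expressed as an LER
ler_1000 = loss_elimination_ratio(1000.0, 500.0, shape_hat, scale_hat)
```

Once the distribution is fitted, the same function prices any deductible, which is the "direct calculation" advantage. Real data would already be truncated at the policy deductible, so the plain fit in step 1 would need a truncation adjustment.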
Loss Distribution Method
• Pros
– Provides framework to relate data at different deductibles
– Direct calculation for any deductible
• Cons
– Need to reflect other rating factors
– Framework may be too rigid
Complications
• Deductible truncation is not clean
• “Pseudo-deductible” effect
– Due to claims awareness/self-selection
– May be difficult to detect in severity distribution
GLM Modeling Approaches
1. Fit severity distribution using other rating variables
2. Use deductible as a variable in severity/frequency models
3. Use deductible as a variable in pure premium model
GLM Approach 1 – Fit Distribution w/ variables
• Fit a severity model
• Linear predictor relates to untruncated mean
• Maximum likelihood estimation adjusted for truncation
• Reference:
– Guiahi, “Fitting Loss Distributions with Emphasis on Rating Variables”, CAS Winter Forum, 2001
GLM Approach 1– Fit Distribution w/ variables
X = untruncated random variable ~ Gamma
Y = loss data, net of deductible d
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$
$f_Y(y) = \dfrac{f_X(y + d;\ \mu_X)}{1 - F_X(d;\ \mu_X)}$
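The truncation-adjusted likelihood can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the data are simulated, there is a single binary rating variable `v`, the Gamma shape is common to all risks, and the optimizer (Nelder-Mead from SciPy) is simply a convenient general-purpose choice.

```python
import numpy as np
from scipy import optimize, stats

def neg_log_lik(params, y, d, V):
    """Negative log-likelihood of net losses y, each truncated at its deductible d."""
    *beta, log_shape = params
    shape = np.exp(log_shape)               # common Gamma shape (assumption)
    mu = np.exp(V @ np.asarray(beta))       # untruncated mean via log link
    scale = mu / shape
    log_f = stats.gamma.logpdf(y + d, shape, scale=scale)   # log f_X(y + d)
    log_sf = stats.gamma.logsf(d, shape, scale=scale)       # log(1 - F_X(d))
    return -np.sum(log_f - log_sf)

# Hypothetical simulated data: true beta = (7.0, 0.5), true shape = 2
rng = np.random.default_rng(0)
n = 50_000
v = rng.integers(0, 2, n)
ded = rng.choice([250.0, 500.0, 1000.0], n)
x = rng.gamma(shape=2.0, scale=np.exp(7.0 + 0.5 * v) / 2.0)   # ground-up losses
seen = x > ded                              # only losses above the deductible are reported
y, d = x[seen] - ded[seen], ded[seen]       # loss data, net of deductible
V = np.column_stack([np.ones(seen.sum()), v[seen]])

start = np.array([np.log(np.mean(y + d)), 0.0, 0.0])
fit = optimize.minimize(neg_log_lik, start, args=(y, d, V),
                        method="Nelder-Mead", options={"maxiter": 2000})
beta0_hat, beta1_hat, shape_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
```

The likelihood has to be coded and maximized by hand, and the fitted coefficients describe the untruncated mean rather than a deductible factor.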
GLM Approach 1 – Fit Distribution w/ variables
• Pros
– Applies GLM within framework
– Directly models truncation
• Cons
– Non-standard GLM application
– Difficult to adapt to rate plan
– No frequency data used in model
Practical Issues
• No standard statistical software
– Complicates analysis
– Less computationally efficient
$f_Y(y) = \dfrac{f_X(y + d;\ \mu_X)}{1 - F_X(d;\ \mu_X)}$
Not a member of the exponential family of distributions
Practical Issues
• No clear translation into a rate plan
– Deductible effect depends on mean
– Mean depends on all other variables
– Deductible effect varies by other variables
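A small numerical illustration of why the deductible effect cannot be read off as a single factor. It assumes a Gamma severity with a fixed shape of 2 and compares two hypothetical risks whose untruncated means differ (for example, because of other rating variables).

```python
from scipy import stats

def ground_up_ler(d, mean, shape):
    """E[min(X, d)] / E[X] for X ~ Gamma with the given mean and shape."""
    scale = mean / shape
    lev = (mean * stats.gamma.cdf(d, shape + 1, scale=scale)
           + d * stats.gamma.sf(d, shape, scale=scale))
    return lev / mean

# Same $1,000 deductible, two hypothetical risks with different expected severity
ler_low_mean = ground_up_ler(1000.0, mean=2000.0, shape=2.0)
ler_high_mean = ground_up_ler(1000.0, mean=6000.0, shape=2.0)
```

The same deductible removes a larger share of loss for the low-mean risk, so a rate plan with one multiplicative deductible factor cannot reproduce the fitted model exactly.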
Practical Issues
• No use of frequency information
– Frequency effects derived from severity fit
– Loss of information
$1 - F_X(d;\ \mu_X)$
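Under this approach the only frequency effect of a deductible is the fitted probability that a loss exceeds it. A sketch assuming a Gamma severity; the function name and the parameter values are hypothetical.

```python
from scipy import stats

def implied_frequency_relativity(d_new, d_base, mean, shape):
    """Ratio of claim-reporting probabilities, 1 - F_X(d), implied by the severity fit."""
    scale = mean / shape
    return (stats.gamma.sf(d_new, shape, scale=scale)
            / stats.gamma.sf(d_base, shape, scale=scale))

# Hypothetical fitted severity: mean 2,000 with shape 1 (exponential special case)
rel_1000_vs_500 = implied_frequency_relativity(1000.0, 500.0, mean=2000.0, shape=1.0)
```

Observed claim counts never enter this calculation, which is the loss of information the slide refers to.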
GLM Approach 2 – Frequency/Severity Model
• Standard GLM approach
• Fit separate frequency and severity models
• Use deductible as independent variable
GLM Approach 2 – Frequency/Severity Model
• Pros
– Utilizes standard GLM packages
– Incorporates deductible effects on frequency and severity
– Allows model forms that fit rate plan
• Cons
– Potential inconsistency of models
– Specification of deductible effects
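A minimal sketch of Approach 2, with a small hand-written IRLS fitter standing in for a standard GLM package. Everything here is an assumption for illustration: the simulated deductible effects (-0.25 on log frequency, +0.15 on log severity), the use of log deductible as the single covariate, and the helper name `fit_log_link_glm`.

```python
import numpy as np

def fit_log_link_glm(X, y, var_power, weights=None, n_iter=30):
    """IRLS for a log-link GLM with variance function mu**var_power.

    var_power=1 gives Poisson estimates, var_power=2 gives Gamma estimates.
    The first column of X is assumed to be the intercept.
    """
    weights = np.ones(len(y)) if weights is None else weights
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(np.average(y, weights=weights))   # intercept-only start
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        w = weights * mu ** (2 - var_power)            # IRLS working weights
        z = eta + (y - mu) / mu                        # working response
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Hypothetical data: deductible is the only rating variable
rng = np.random.default_rng(1)
n = 200_000
ded = rng.choice([250.0, 500.0, 1000.0, 2500.0], n)
log_d = np.log(ded / 500.0)
X = np.column_stack([np.ones(n), log_d])

counts = rng.poisson(np.exp(-2.5 - 0.25 * log_d))
has_claim = counts > 0
sev_mean = np.exp(8.0 + 0.15 * log_d[has_claim])
k = 2.0 * counts[has_claim]                  # shape of the average of `counts` Gamma(2) losses
avg_sev = rng.gamma(shape=k, scale=sev_mean / k)

freq_beta = fit_log_link_glm(X, counts, var_power=1)
sev_beta = fit_log_link_glm(X[has_claim], avg_sev, var_power=2,
                            weights=counts[has_claim])
pure_premium_slope = freq_beta[1] + sev_beta[1]   # combined deductible effect on the log scale
```

Because the two models are fitted separately, nothing forces the combined effect to behave sensibly across deductibles, which is the "potential inconsistency" listed under Cons.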
Test Data
• Simulated Data
– 1,000,000 policies
– 80,000 claims
• Risk Characteristics
– Amount of Insurance
– Deductible
– Construction
– Alarm System
• Gamma Severity Distribution
• Poisson Frequency Distribution
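A data set of this kind could be simulated roughly as below. The presentation does not give the parameters used, so every number here (frequencies, relativities, severity scale, the smaller policy count) is a hypothetical choice; only the structure (Poisson frequency, Gamma severity, the four risk characteristics, deductible truncation) follows the slide.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000                                   # smaller than the slide's 1,000,000, for speed

# Risk characteristics (hypothetical levels)
amount = rng.choice([100_000, 250_000, 500_000], n)
deductible = rng.choice([250, 500, 1000, 2500], n)
construction = rng.choice(["frame", "masonry"], n)
alarm = rng.integers(0, 2, n)

# Poisson frequency of ground-up losses
freq = (0.10 * np.where(construction == "frame", 1.2, 1.0)
             * np.where(alarm == 1, 0.9, 1.0))
n_ground_up = rng.poisson(freq)

# Gamma severity of ground-up losses, one row per loss
policy = np.repeat(np.arange(n), n_ground_up)
sev_mean = 4000.0 * (amount[policy] / 250_000) ** 0.5
ground_up = rng.gamma(shape=2.0, scale=sev_mean / 2.0)

# Deductible truncation: a loss becomes a claim only if it exceeds the deductible
reported = ground_up > deductible[policy]
paid = ground_up[reported] - deductible[policy][reported]

claim_count = np.bincount(policy[reported], minlength=n)
incurred = np.bincount(policy[reported], weights=paid, minlength=n)
```

This clean truncation does not include any "pseudo-deductible" effect; that would need an extra reporting rule on top of `reported`.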
Conclusions from Test Data – Frequency/Severity Models
• Deductible as categorical variable
– Good overall fit
– Highly variable estimates for higher or less common deductibles
– When amount effect is incorrect, interaction term improves model fit
[Figure: Severity Relativities Using Categorical Variable]
Conclusions from Test Data – Frequency/Severity Models
• Deductible as continuous variable
– Transformations with best likelihood
  • Ratio of deductible to coverage amount
  • Log of deductible
– Interaction terms with amount improve model fit
– Carefully examine the results for inconsistencies
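The transformations named above could be built as model covariates along these lines. The function name, the dictionary keys, and the particular interaction (log deductible times log amount) are hypothetical; the slide does not say which interaction form was used.

```python
import math

def deductible_features(deductible, coverage_amount):
    """Continuous deductible covariates for a frequency or severity GLM."""
    log_ded = math.log(deductible)
    return {
        "ded_to_amount": deductible / coverage_amount,          # ratio of deductible to coverage amount
        "log_deductible": log_ded,                              # log of deductible
        "log_ded_x_log_amount": log_ded * math.log(coverage_amount),  # one possible interaction term
    }

features = deductible_features(1000.0, 100_000.0)
```

Each candidate transform would be fitted in turn and compared on likelihood, then checked for implausible results such as relativities that rise with the deductible.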
[Figure: Frequency Relativities by Deductible, one curve per Coverage Amount (100,000 and 500,000)]
[Figure: Severity Relativities by Deductible, one curve per Coverage Amount (100,000 and 500,000)]
[Figure: Pure Premium Relativities by Deductible, one curve per Coverage Amount (100,000 and 500,000)]
GLM Approach 3 – Pure Premium Model
• Fit pure premium model using Tweedie distribution
• Use deductible as independent variable
GLM Approach 3 – Pure Premium Model
• Pros
– Incorporates frequency and severity effects simultaneously
– Ensures consistency
– Analogous to Empirical LER
• Cons
– Specification of deductible effects
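A sketch of Approach 3 using quasi-likelihood IRLS with the Tweedie variance function. The variance power of 1.5, the simulated data, and the helper name `fit_tweedie_glm` are assumptions for illustration; a production analysis would normally use a GLM package and estimate or test the power parameter.

```python
import numpy as np

def fit_tweedie_glm(X, y, var_power=1.5, n_iter=30):
    """IRLS for a log-link GLM with Tweedie variance function mu**var_power.

    The first column of X is assumed to be the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                 # intercept-only start
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        w = mu ** (2 - var_power)              # IRLS working weights
        z = eta + (y - mu) / mu                # working response
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Hypothetical compound Poisson-Gamma pure premiums; true log-scale slope is -0.25 + 0.15 = -0.10
rng = np.random.default_rng(2)
n = 300_000
ded = rng.choice([250.0, 500.0, 1000.0, 2500.0], n)
log_d = np.log(ded / 500.0)
X = np.column_stack([np.ones(n), log_d])

counts = rng.poisson(np.exp(-2.5 - 0.25 * log_d))
pos = counts > 0
pure_premium = np.zeros(n)
pure_premium[pos] = rng.gamma(shape=2.0 * counts[pos],
                              scale=np.exp(8.0 + 0.15 * log_d[pos]) / 2.0)

pp_beta = fit_tweedie_glm(X, pure_premium, var_power=1.5)
```

A single model produces the deductible relativity directly, so the frequency and severity effects cannot contradict each other.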
Conclusions from Test Data – Pure Premium Models
• Deductible as categorical variable
– Good overall fit
– Some highly variable estimates
• Good fit with some continuous transforms
– Can avoid inconsistencies with good choice of transform
Extension of GLM – Dispersion Modeling
• Double GLM
• Iteratively fit two models
– Mean model fit to data
– Dispersion model fit to residuals
• Reference
– Smyth, Jørgensen, “Fitting Tweedie’s Compound Poisson Model to Insurance Claims Data: Dispersion Modeling,” ASTIN Bulletin, 32:143-157
Double GLM in Modeling Deductibles
• Gamma distribution assumes that variance is proportional to µ²
• Deductible effect on severity
– Mean increases
– Variance increases more gradually
• Double GLM significantly improves model fit on Test Data
– More significant than interactions
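The variance assumption and the double GLM relaxation can be written out as below. The notation is an assumption of this note: $z_{ij}$ denotes the covariates of the dispersion model (for example, the deductible), $\delta_i$ the Gamma unit deviance used as the "residual" response, and the log link for the dispersion is a common choice rather than something stated on the slides.

```latex
\begin{align*}
\text{Gamma GLM, constant dispersion:} \quad
  & \operatorname{Var}(Y_i) = \phi\,\mu_i^{2} \\
\text{Double GLM:} \quad
  & \operatorname{Var}(Y_i) = \phi_i\,\mu_i^{2}, \qquad
    \log \phi_i = \gamma_0 + \gamma_1 z_{i1} + \cdots + \gamma_m z_{im} \\
\text{Dispersion model response:} \quad
  & \delta_i = 2\left[\frac{y_i - \mu_i}{\mu_i} - \log\frac{y_i}{\mu_i}\right], \qquad
    \mathbb{E}[\delta_i] \approx \phi_i
\end{align*}
```

If a higher deductible raises the mean while the variance grows more slowly than $\mu^2$, then $\phi_i$ must fall as the deductible rises, which a constant-dispersion model cannot represent.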
[Figure: Pure Premium Relativities by Deductible, Tweedie Model – $500,000 Coverage Amount; series: Constant Dispersion, Double GLM]
Conclusion
• Deductible modeling is difficult
• Tweedie model with Double GLM seems to be the best approach
• Categorical vs. Continuous – Need to compare various models
• Interaction terms may be important