machine learning applied to insurance problems › ~wueth › talks › 2017... · boosting machine...
TRANSCRIPT
Machine LearningApplied to Insurance Problems
Mario V. Wuthrich
RiskLab, ETH Zurich
Artificial Intelligence in Insurance and FinanceZurich University of Applied Sciences ZHAW
Winterthur, September 7, 2017
Insurance Technology (InsurTech)1 2
• Artificial Intelligence: learn and accumulate useful knowledge through data.
• Big Data Analytics based on the research of massive data.
• Cloud Computing for real-time operations.
• Block Chain Technology for a more efficient and anonymous exchange of data(decentralization of information, insured-insurer/insurer-insurer).
• Internet of Things: integration and interaction of physical devices (e.g. wearables,sensors) through computer systems to reduce and manage risk.
1innovation and application of advanced technology to the insurance industry2source: China InsurTech Development White Paper, 2017
1
Near Future InsurTech Applications
• health and accident insurance
• life and pension insurance
• non-life insurance
• re-insurance
B product development and insurance pricingB marketing and distribution of productsB administration, accounting and portfolio optimizationB claims handling, triage of potentially large claims and claims reservingB risk drivers analysis and risk management
2
• Section 1: Supervised Learning
3
Insurance Pricing: State-of-the-Art
Basic Assumption:There are structural differences which can be explained by a regression function
µ : X → R, x 7→ µ(x).
• X is the feature space, covariate space (containing all potential insurance policies);
• x ∈ X is the feature, covariate, explanatory variable of a single policy;
• µ(·) is the regression function/pricing functional describing structural differences.
Example of feature:
x = (age, gender, type of car, claims code, income, body weight, smoker, . . .)
4
Supervised Learning
Determine the (unknown) regression function (pricing functional)
µ : X → R, x 7→ µ(x)
from (noisy) observations D = {(Y1,x1), . . . , (Yn,xn)} being of type
Yi = µ (xi) + εi with3 E[εi] = 0.
This problem is not new in insurance, but X is increasingly more complex.
3Typically, we also assume independence between the εi for policies i = 1, . . . , n.
5
Classical and New Solutions to Pricing Problems
• Classical approaches use generalized linear models (GLMs) and generalized additivemodels (GAMs), implemented in commercial software like Emblem or SAS.
• New approaches use neural networks, regression trees, random forests, boosting.
⇒ These are still at an experimental stage in the insurance industry:
B Issue 1: standard software does not exist, yet (deviance statistics, big data);B Issue 2: data warehouses may still bear some issues;B Issue 3: stability/robustness over time still needs to be explored;B Issue 4: graphical illustrations are still poor;B Issue 5: communication to management and customers is still difficult;B Issue 6: uncertainty measures are completely missing.
We give some examples illustrating the state-of-the-art.
6
Sensitivity Testing with Neural Networks4
●
●
●● ●
●
●
●
●
●
●
●
1994 1996 1998 2000 2002 2004
0.05
0.06
0.07
0.08
0.09
0.10
reporting delay
accident year
●
●
●
● ●●
●
●
●
●
●
●
0.91
00.
915
0.92
00.
925
0.93
00.
935
0.94
0
●
●
mean of T (left axis)probability of T=0 (right axis)
●●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
● ●
●
●
● ● ● ●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
0.00
0.05
0.10
0.15
0.20
0.25
reporting delay
injured part
10 13 16 22 25 31 34 37 42 45 52 55 61 64 70 81
● mean of T
Neural networks are used for sensitivity testing and analysis of non-linearities.
4more than 10 mio. accident claims over 12 accounting years
7
Prediction Uncertainties in Neural Networks
Process uncertainty from Yi = µ (xi) + εi and parameter uncertainty from5
logµ(x) = β0 +
q2∑l=1
βl φ
w(2)l,0 +
q1∑k=1
w(2)l,k φ
w(1)k,0 +
d∑j=1
w(1)k,jxj
.
inj_part
age
cc
LoB
AQ
C
5deep neural network with two hidden layers and 132 parameters
8
Regression Trees6 for Risk Driver Analysis7
time lags t+ 1: Y1 Y2 Y3 Y4 Y5 Y6numbers of leaves 8 11 18 12 4 4
feature components used for split questionsclaim closed cl cl cl cl cl cl
lawyer involved law law law
claims code cc cc cc cc cc
claims diagnosis diag diag diag diag diag
reporting delay j j j
previous payments Y0 Y1 Y2 Y3 Y4 Y5previous payments Y1 Y2
Payment process (Yt)t≥0: relevant feature information and Markov condition.
6CART, Breiman, Friedman, Olshen and Stone (1984)7insurance claims cash flows of 90k claims
9
Boosting Machine for the Lee-Carter Model8
tree improved Lee−Carter fit; females
year
age
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
1876
1881
1886
1891
1896
1901
1906
1911
1916
1921
1926
1931
1936
1941
1946
1951
1956
1961
1966
1971
1976
1981
1986
1991
1996
2001
2006
2011
−1.5
−1.0
−0.5
0.0
0.5
1.0
1.5
Stage-wise adaptive learning applied to residuals is known as a Boosting Machine.
8Swiss female mortality data: back-testing the Lee-Carter model
10
• Section 2: Unsupervised Learning
11
Telematics Car Driving Data9
B Feature Extraction: Find patterns in (noisy) features X0 = {x1, . . . ,xn}.
9GPS location data second by second: 200 individual trips of 3 car drivers.
12
Normalized v-a Heatmaps
B Unsupervised Learning: Find patterns in (noisy) features X0 = {x1, . . . ,xn}.
Normalized v-a heatmaps of 8 car drivers in speed bucket [5km/h, 20km/h).
13
Application of K-Means Algorithm for K = 4
14
Conclusions should be here ...
... and your remarks!
B Lecture notes on SSRN preprint server (first draft):
Data Analytics for Non-Life Insurance Pricing, Manuscript ID 2870308.15