
Page 1: Reliable Probability Forecasting – a Machine Learning Perspective David Lindsay Supervisors: Zhiyuan Luo, Alex Gammerman, Volodya Vovk

Reliable Probability Forecasting – a Machine Learning Perspective

David Lindsay

Supervisors: Zhiyuan Luo, Alex Gammerman, Volodya Vovk

Page 2:

Overview
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 3:

Probability Forecasting

Qualified predictions are important in many applications (especially medicine).

Most machine learning algorithms make “bare” predictions.

Those that do make qualified predictions make no claims of how effective the measures are!

Page 4:

Probability Forecasting: Generalisation of Pattern Recognition

Goal of pattern recognition = find the “best” label for each new test object.

Example Abdominal Pain Dataset:

Training set to “learn” from (object x = patient details, label y = diagnosis):

• Name: David, Sex: M, Height: 6’2” → Appendicitis
• Name: Daniil, Sex: M, Height: 6’4” → Dyspepsia
• Name: Mark, Sex: M, Height: 6’1” → Non-specific
• …
• Name: Sian, Sex: F, Height: 5’8” → Dyspepsia

Test object: Name: Wilma, Sex: F, Height: 5’6” → ?

What is the true label? The true label is unknown or withheld from the learner.

Page 5:

Probability Forecasting: Generalisation of Pattern Recognition

Probability forecast – estimate the conditional probability of a label given an observed object:

  P̂(y | x) ≈ Pr(y | x)

The learner is given the training set and a test object (e.g. Name: Helen, Sex: F, Height: 5’6”) whose true label is unknown. We want the learner to estimate probabilities for all possible class labels:

• P̂(Dyspepsia | x) = 0.1
• P̂(Appendicitis | x) = 0.7
• P̂(Non spec | x) = 0.2
• etc…

Page 6:

Probability forecasting more formally…

X object space, Y label space, Z = X × Y example space.

Given the previous examples z_1, z_2, …, z_{n−1} and a new object x_n, our learner makes probability forecasts for all possible labels:

  P̂(y_n = 1 | x_n), P̂(y_n = 2 | x_n), …, P̂(y_n = |Y| | x_n)

Use the probability forecasts to predict the most likely label:

  ŷ_n = arg max_{i ∈ Y} P̂(y_n = i | x_n)
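The prediction rule can be sketched in a few lines of Python (an illustrative sketch; the forecast values and labels are the hypothetical ones from the earlier example):

```python
# Sketch of the prediction rule: given probability forecasts for every
# label in Y, predict the label with the highest forecast probability.

def predict_label(forecasts):
    """forecasts: dict mapping each label y in Y to P_hat(y | x_n)."""
    return max(forecasts, key=forecasts.get)

forecasts = {"Dyspepsia": 0.1, "Appendicitis": 0.7, "Non-specific": 0.2}
print(predict_label(forecasts))  # -> Appendicitis
```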

Page 7:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 8:

Studies of Probability Forecasting

Probability forecasting has been a well-studied area since the 1970s, in psychology, statistics and meteorology.

These studies assessed two criteria of probability forecasts:
• Reliability = the probability forecasts should not lie
• Resolution = the probability forecasts are practically useful

Page 9:

Reliability

When an event is predicted with probability p̂, it should have approximately 1 − p̂ chance of being incorrect.

a.k.a. well calibrated. Considered an asymptotic property. Dawid (1985) proved no deterministic learner can be reliable for all data – still interesting to investigate.

This property is often overlooked in practical studies!
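The calibration property can be illustrated with a small simulation (a sketch, assuming a perfectly calibrated forecaster; the 0.7 forecast and trial count are arbitrary):

```python
import random

# Illustrative check of reliability (calibration): if a forecaster says
# an event has probability p_hat, the event should occur about a
# fraction p_hat of the time.  Here a perfectly calibrated forecaster
# is simulated by drawing outcomes with exactly that probability.
random.seed(0)
p_hat = 0.7
trials = 100_000
hits = sum(random.random() < p_hat for _ in range(trials))
frequency = hits / trials
print(abs(frequency - p_hat) < 0.02)  # empirical frequency tracks the forecast
```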

Page 10:

Resolution

Probability forecasts are practically useful, e.g. they can be used to rank the labels in order of likelihood!

Closely related to classification accuracy - common focus of machine learning.

Separate from reliability, i.e. do not go “hand in hand” (Lindsay, 2004)

Page 11:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 12:

Experimental design

Tested several learners on many datasets in the online setting:
• ZeroR = Control
• K-Nearest Neighbour
• Neural Network
• C4.5 Decision Tree
• Naïve Bayes
• Venn Probability Machine Meta Learner (see later…)

Page 13:

The Online Learning Setting

Before: 2 7 6 1 7 ? ?
After:  2 7 6 1 7 2 ?

• Learning machine makes a prediction for the new example (label withheld).
• Update the training data for the learning machine for the next trial.
• Repeat the process for all examples.
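The protocol above can be sketched as a simple loop (an illustrative sketch; the learner interface and the toy data are assumptions, with a majority-vote "ZeroR"-style learner standing in for the real algorithms):

```python
# Sketch of the online learning protocol: at each trial the learner
# predicts for the new object with its label withheld, then the true
# label is revealed and added to the training data.

def online_protocol(examples, fit_predict):
    """examples: list of (object, label); fit_predict(train, x) -> label."""
    train, errors = [], 0
    for x, y in examples:
        y_pred = fit_predict(train, x)   # label of (x, y) withheld
        errors += (y_pred != y)
        train.append((x, y))             # reveal label, update training data
    return errors

# Toy run with a majority-vote learner in the style of ZeroR:
def zero_r(train, x):
    labels = [y for _, y in train]
    return max(set(labels), key=labels.count) if labels else None

print(online_protocol([(1, "a"), (2, "a"), (3, "b"), (4, "a")], zero_r))
```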

Page 14:

Lots of benchmark data

Tested on data available from the UCI Machine Learning repository:
• Abdominal Pain: 6387 examples, 135 features, 9 classes, noisy
• Diabetes: 768 examples, 8 features, 2 classes
• Heart-Statlog: 270 examples, 13 features, 2 classes
• Wisconsin Breast Cancer: 685 examples, 10 features, 2 classes
• American Votes: 435 examples, 16 features, 2 classes
• Lymphography: 148 examples, 18 features, 4 classes
• Credit Card Applications: 690 examples, 15 features, 2 classes
• Iris Flower: 150 examples, 4 features, 3 classes
• And many more…

Page 15:

Programs

Extended the WEKA data mining system implemented in Java:
• Added VPM meta learner to the existing library of algorithms
• Allowed learners to be tested in the online setting

Created Matlab scripts to easily create plots (see later)

Page 16:

Results, papers and website

All results that I discuss today can be found in my 3 tech reports:
• The Probability Calibration Graph – a useful visualisation of the reliability of probability forecasts, Lindsay (2004), CLRC-TR-04-01
• Multi-class probability forecasting using the Venn Probability Machine – a comparison with traditional machine learning methods, Lindsay (2004), CLRC-TR-04-02
• Rapid implementation of Venn Probability Machines, Lindsay (2004), CLRC-TR-04-03

And on my web site: http://www.david-lindsay.co.uk/research.html

Page 17:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 18:

Loss Functions

Square loss:

  s_n = (1/n) Σ_{i=1}^{n} Σ_{j ∈ Y} ( I_{y_i = j} − p̂_{i,j} )²

Log loss:

  l_n = −(1/n) Σ_{i=1}^{n} Σ_{j ∈ Y} I_{y_i = j} log p̂_{i,j}

There are many other possible loss functions… DeGroot and Fienberg (1982) showed that all loss functions measure a mixture of reliability and resolution. Log loss punishes more harshly: the learner is forced to spread its bets.
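Both losses are straightforward to compute; a minimal sketch (the two-label forecasts and labels are made-up illustrative data):

```python
import math

# The two loss functions above, computed for a sequence of probability
# forecasts.  forecasts[i] maps each label j to p_hat_{i,j}; labels[i]
# is the true label y_i.

def square_loss(forecasts, labels):
    n = len(labels)
    return sum((int(y == j) - f[j]) ** 2
               for f, y in zip(forecasts, labels) for j in f) / n

def log_loss(forecasts, labels):
    # Only the true label's term survives the indicator I_{y_i = j}.
    n = len(labels)
    return -sum(math.log(f[y]) for f, y in zip(forecasts, labels)) / n

forecasts = [{"a": 0.9, "b": 0.1}, {"a": 0.3, "b": 0.7}]
labels = ["a", "b"]
print(round(square_loss(forecasts, labels), 3))  # 0.1
print(round(log_loss(forecasts, labels), 3))     # 0.231
```

Note how log loss blows up as the probability given to the true label approaches zero, which is why a learner is "forced to spread its bets".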

Page 19:

ROC Curves

Naïve Bayes on the Abdominal Pain data set.

1. Graph shows the trade-off between false and true positive predictions.
2. Want the curve to be as close to the upper left corner as possible (away from the diagonal).
3. My results show that this graph tests resolution.
4. Area under the curve provides a measure of the quality of probability forecasts.
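The area under the ROC curve has a convenient rank interpretation, sketched below for a binary problem (an illustrative sketch with made-up scores, not the thesis's own implementation):

```python
# Sketch of the ROC area (AUC) for a binary problem: the probability
# that a randomly chosen positive example receives a higher forecast
# than a randomly chosen negative one (ties count one half).

def roc_area(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

print(roc_area([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

Because only the ordering of the scores matters, the AUC is insensitive to calibration, consistent with the claim that ROC analysis tests resolution rather than reliability.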

Page 20:

Table comparing traditional scores

Algorithm        | Error     | Sqr Loss  | Log Loss | ROC Area
VPM C4.5         | 40.7 (8)  | 0.54 (5)  | 0.8 (4)  | 0.76 (1)
Naïve Bayes      | 29.2 (2)  | 0.50 (4)  | 1.3 (7)  | 0.72 (5)
VPM Naïve Bayes  | 28.9 (1)  | 0.44 (1)  | 0.6 (1)  | 0.75 (2)
10-NN            | 33.4 (4)  | 1.0 (11)  | 2.6 (10) | 0.54 (10)
20-NN            | 33.4 (4)  | 0.96 (10) | 2.2 (9)  | 0.55 (9)
C4.5             | 39.6 (7)  | 0.67 (7)  | 3.3 (11) | 0.57 (8)
Neural Net       | 30.5 (3)  | 0.45 (2)  | 0.72 (2) | 0.75 (3)
30-NN            | 34.3 (5)  | 0.47 (3)  | 0.73 (3) | 0.74 (4)
VPM 1-NN         | 41.6 (9)  | 0.58 (6)  | 0.9 (5)  | 0.61 (6)
1-NN             | 34.6 (6)  | 0.73 (8)  | 2.1 (8)  | 0.59 (7)
ZeroR            | 55.6 (10) | 0.74 (9)  | 1.1 (6)  | 0.49 (11)

(Ranks in parentheses.)

Page 21:

Problems with Traditional Assessment

Loss functions and ROC curves give more information than error rate about the quality of probability forecasts. But…
• Loss functions measure a mixture of resolution and reliability
• The ROC curve measures resolution
• We don’t have any method of solely assessing reliability
• We don’t have a method of telling whether probability forecasts are over- or under-estimated

Page 22:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 23:

Inspiration for PCG (Meteorology)

Murphy & Winkler (1977): calibration data for precipitation forecasts.

Reliable points lie close to the diagonal.

Page 24:

A PCG plot of ZeroR on Abdominal Pain

[Plot: x-axis = Predicted Probability, y-axis = Empirical frequency of being correct; the line of calibration (diagonal) and the PCG coordinates are shown.]

Reliability: the PCG coordinates lie close to the line of calibration, i.e. ZeroR is not accurate but it is reliable!

The plot may not span the whole axis – ZeroR makes no predictions with high probability.
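Coordinates in the spirit of the PCG can be sketched by binning the forecast probabilities and, within each bin, comparing the mean forecast with the empirical frequency of being correct (the binning scheme and toy data here are assumptions for illustration; Lindsay (2004), CLRC-TR-04-01 defines the PCG precisely):

```python
# Sketch of PCG-style calibration coordinates: for each probability bin,
# plot (mean forecast probability, empirical frequency of being correct).
# Reliable forecasters produce points near the diagonal.

def calibration_points(probs, correct, bins=10):
    points = []
    for b in range(bins):
        lo, hi = b / bins, (b + 1) / bins
        idx = [i for i, p in enumerate(probs)
               if lo <= p < hi or (b == bins - 1 and p == 1.0)]
        if idx:
            mean_p = sum(probs[i] for i in idx) / len(idx)
            freq = sum(correct[i] for i in idx) / len(idx)
            points.append((mean_p, freq))
    return points

probs   = [0.95, 0.9, 0.9, 0.15, 0.1]   # forecast attached to each prediction
correct = [1,    1,   0,   0,    0]     # was the prediction correct?
print(calibration_points(probs, correct))
```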

Page 25:

PCG a visualisation tool and measure of reliability

Statistic          | Naïve Bayes | VPM Naïve Bayes
Total              | 2764.5      | 496.7
Mean               | 0.0483      | 0.0087
Standard Deviation | 0.0757      | 0.0112
Max                | 0.4203      | 0.1017
Min                | 4.9e-17     | 9.2e-8

VPM is reliable, as its PCG follows the diagonal!

Naïve Bayes over- and under-estimates its probabilities – much like real doctors!
• Unreliable: a forecast of 0.9 only has a 0.55 chance of being right (over-estimate)
• Unreliable: a forecast of 0.1 in fact has a 0.3 chance of being right (under-estimate)

Page 26:

Learners predicting like people!

[PCG plots: Naïve Bayes vs. People]

Lots of psychological research shows that people make unreliable probability forecasts.

Page 27:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 28:

Table comparing scores with PCG

Algorithm        | Error     | Sqr Loss  | Log Loss | ROC Area  | PCG
VPM C4.5         | 40.7 (8)  | 0.54 (5)  | 0.8 (4)  | 0.76 (1)  | 838.1 (4)
Naïve Bayes      | 29.2 (2)  | 0.50 (4)  | 1.3 (7)  | 0.72 (5)  | 2764.5 (7)
VPM Naïve Bayes  | 28.9 (1)  | 0.44 (1)  | 0.6 (1)  | 0.75 (2)  | 496.7 (1)
10-NN            | 33.4 (4)  | 1.0 (11)  | 2.6 (10) | 0.54 (10) | 5062.9 (11)
20-NN            | 33.4 (4)  | 0.96 (10) | 2.2 (9)  | 0.55 (9)  | 4492.7 (10)
C4.5             | 39.6 (7)  | 0.67 (7)  | 3.3 (11) | 0.57 (8)  | 3481.2 (8)
Neural Net       | 30.5 (3)  | 0.45 (2)  | 0.72 (2) | 0.75 (3)  | 1320.5 (6)
30-NN            | 34.3 (5)  | 0.47 (3)  | 0.73 (3) | 0.74 (4)  | 921.2 (5)
VPM 1-NN         | 41.6 (9)  | 0.58 (6)  | 0.9 (5)  | 0.61 (6)  | 554.6 (2)
1-NN             | 34.6 (6)  | 0.73 (8)  | 2.1 (8)  | 0.59 (7)  | 4307.5 (9)
ZeroR            | 55.6 (10) | 0.74 (9)  | 1.1 (6)  | 0.49 (11) | 678.6 (3)

(Ranks in parentheses.)

Page 29:

Correlations of scores

Scores                  | Corr. Coeff. | Interpretation
ROC vs. Sqr Reliability | −0.1         | Inverse, none
PCG vs. Error           | 0.26         | Direct, weak
PCG vs. Sqr Resolution  | 0.04         | Direct, none
PCG vs. Sqr Reliability | 0.76         | Direct, strong
ROC vs. Error           | −0.52        | Inverse, moderate
ROC vs. Sqr Resolution  | 0.67         | Direct, strong

Page 30:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 31:

What is the VPM meta-learner?

Volodya’s VPM:
1. Predicts a label
2. Produces upper u and lower l bounds for the predicted label only

My VPM extension:
1. Extracts more information
2. Produces a probability forecast for all possible labels
3. Predicts a label using these probability forecasts
4. Produces Volodya’s bounds as well!

The VPM meta-learning framework “sits on top” of an existing learner Γ to complement its predictions with probability estimates.

Page 32:

Volodya’s original use of VPM

[Plot: error rate and bounds against online trial number.]

Low Error | 1414.1 | 22.1%
Error     | 1835   | 28.9%
Up Error  | 2216.5 | 34.7%

Upper (red) and lower (green) bounds lie above and below the actual number of errors (black) made on the data.

Page 33:

Output from VPM compared with that of the original underlying learner

Key: Predicted = underlined, Actual =

Naïve Bayes – probability forecast for each class label, plus bounds:

Trial # | Appx    | Div    | Perf. Pept. | Non. Spec | Choli  | Intest obstr | Pancr   | Renal.  | Dysp.  | Up | Low
5831    | 0.93    | 2.9e-9 | 1.7e-13     | 0.07      | 1.3e-9 | 2.2e-9       | 4.0e-11 | 6.3e-10 | 7.6e-9 | NA | NA
2490    | 9.4e-5  | 0.01   | 0.17        | 2.3e-5    | 0.16   | 0.46         | 0.2     | 2.2e-7  | 2.2e-4 | NA | NA
1653    | 3.08e-9 | 4.5e-6 | 3.3e-6      | 4.4e-5    | 0.99   | 4.2e-3       | 3.4e-3  | 4.1e-10 | 1.3e-4 | NA | NA

VPM Naïve Bayes – probability forecast for each class label, plus bounds:

Trial # | Appx | Div  | Perf. Pept. | Non. Spec | Choli | Intest obstr | Pancr | Renal. | Dysp. | Up   | Low
5831    | 0.53 | 0.01 | 0.0         | 0.42      | 0.01  | 0.01         | 0.0   | 0.01   | 0.01  | 0.68 | 0.41
2490    | 0.02 | 0.03 | 0.10        | 0.07      | 0.05  | 0.15         | 0.08  | 0.09   | 0.4   | 0.71 | 0.07
1653    | 0.03 | 0.0  | 0.03        | 0.08      | 0.73  | 0.0          | 0.04  | 0.01   | 0.09  | 0.82 | 0.08

Page 34:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 35:

ZeroR

[PCG plots: Heart Disease, Lymphography, Diabetes]

• ZeroR outputs probability forecasts which are mere label frequencies.
• ZeroR predicts the majority class label at each trial.
• Uses no information about the objects in its learning – the simplest of all learners.
• Accuracy is poor, but reliability is good.
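ZeroR's forecasting rule fits in a few lines (a minimal sketch; the diagnosis labels are the illustrative ones used earlier):

```python
from collections import Counter

# ZeroR as a probability forecaster: ignore the object entirely and
# output the frequencies of the labels observed so far.

def zero_r_forecast(train_labels):
    counts = Counter(train_labels)
    n = len(train_labels)
    return {label: c / n for label, c in counts.items()}

print(zero_r_forecast(["Dyspepsia", "Dyspepsia", "Appendicitis", "Dyspepsia"]))
# {'Dyspepsia': 0.75, 'Appendicitis': 0.25}
```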

Page 36:

K-NN

[PCG plots: 10-NN, 20-NN, 30-NN]

• K-NN finds the subset of the K closest (nearest neighbouring) examples in the training data using a distance metric, then counts the label frequencies amongst this subset.
• Acts like a more sophisticated version of ZeroR that uses information held in the object.
• An appropriate choice of K must be made to obtain reliable probability forecasts (depends on data).
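The K-NN forecasting rule described above can be sketched as follows (an illustrative sketch using Euclidean distance on toy one-dimensional objects; the real experiments used WEKA's implementations):

```python
import math
from collections import Counter

# K-NN as a probability forecaster: find the K nearest training
# examples under a distance metric, then forecast the label
# frequencies among those neighbours.

def knn_forecast(train, x, k):
    """train: list of (object_vector, label); returns label -> probability."""
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], x))[:k]
    counts = Counter(label for _, label in neighbours)
    return {label: c / k for label, c in counts.items()}

train = [((0.0,), "a"), ((0.1,), "a"), ((1.0,), "b"), ((1.1,), "b")]
print(knn_forecast(train, (0.05,), 3))
```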

Page 37:

Traditional Learners and VPM

Traditional learners can be very unreliable (yet accurate) – it depends on the data. My research shows empirically that VPM is reliable, and it can recalibrate a learner’s original probability forecasts to make them more reliable! The improvement in reliability often comes without detriment to classification accuracy.

[PCG plots: Naïve Bayes vs. VPM Naïve Bayes; C4.5 vs. VPM C4.5; Neural Net vs. VPM Neural Net; 1-NN vs. VPM 1-NN]

Page 38:

Back to the plan…
• What is probability forecasting?
• Reliability and resolution criteria
• Experimental design
• Problems with traditional assessment methods: square loss, log loss and ROC curves
• Probability Calibration Graph (PCG)
• Traditional learners are unreliable yet accurate!
• Extension of Venn Probability Machine (VPM)
• Which learners are reliable? Psychological and theoretical viewpoint

Page 39:

Psychological Heuristics

When faced with the difficult task of judging probability, people employ a limited number of heuristics which reduce the judgements to simpler ones:
• Availability – an event is predicted more likely to occur if it has occurred frequently in the past.
• Representativeness – one compares the essential features of the event to those of the structure of previous events.
• Simulation – the ease with which the simulation of a system of events reaches a particular state can be used to judge the propensity of the (real) system to produce that state.

Page 40:

Interpretation of reliable learners using heuristics

ZeroR, K-NN and VPM learners are reliable probability forecasters, and we can identify these heuristics in their learning algorithms. Remember, psychological research states: more heuristics → more reliable forecasts.

Page 41:

Psychological Interpretation of ZeroR

The simplest of all reliable probability forecasters uses 1 heuristic:
• The learner merely counts the labels it has observed so far, and uses the frequencies of labels as its forecasts (Availability).

Page 42:

Psychological Interpretation of K-NN

More sophisticated than the ZeroR learner, the K-NN learner uses 2 heuristics:
• Uses the distance metric to find the subset of the K closest examples in the training set (Representativeness).
• Then counts the label frequencies in the subset of K nearest neighbours to make its forecasts (Availability).

Page 43:

Psychological Interpretation of VPM

Even more sophisticated, the VPM meta-learner uses all 3 heuristics:
• The VPM tries each new test example with all possible classifications (Simulation).
• Then, under each tentative simulation, it clusters training examples which are similar into groups (Representativeness).
• Finally, the VPM calculates the frequency of labels in each of these groups to make its forecasts (Availability).
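The three steps can be sketched under a deliberately simple nearest-neighbour taxonomy (an illustrative assumption with toy data; the tech reports define the real VPM taxonomies, and a full Venn predictor would also derive probability intervals from the resulting frequency rows):

```python
import math
from collections import Counter

def nn_category(extended, i):
    """Toy taxonomy: the category of example i is the label of its
    nearest neighbour among the other examples."""
    obj = extended[i][0]
    j = min((k for k in range(len(extended)) if k != i),
            key=lambda k: math.dist(extended[k][0], obj))
    return extended[j][1]

def venn_forecast(train, x, labels):
    rows = {}
    for y in labels:                          # 1. try each classification (Simulation)
        extended = train + [(x, y)]
        cats = [nn_category(extended, i) for i in range(len(extended))]
        new_cat = cats[-1]                    # 2. group similar examples (Representativeness)
        group = [extended[i][1] for i in range(len(extended)) if cats[i] == new_cat]
        counts = Counter(group)               # 3. label frequencies in the group (Availability)
        rows[y] = {l: counts[l] / len(group) for l in labels}
    return rows

train = [((0.0,), "a"), ((1.0,), "a"), ((3.0,), "b")]
rows = venn_forecast(train, (2.0,), ["a", "b"])
print(rows)
```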

Page 44:

Theoretical justifications

• ZeroR can be proven to be asymptotically reliable (and experiments show it performs well on finite data).
• K-NN has plenty of theory, Stone (1977), to support its convergence to the true probability distribution.
• VPM has a lot of theoretical justification for finite data using martingales.

Page 45:

Take home points

• Probability forecasting is useful for real life applications, especially medicine.
• Want learners to be reliable and accurate.
• PCG can be used to check reliability.
• ZeroR, K-NN and VPM provide consistently reliable probability forecasts.
• Traditional learners Naïve Bayes, Neural Net and Decision Tree can provide unreliable forecasts.
• VPM can be used to improve the reliability of probability forecasts without detriment to classification accuracy.

Page 46:

Fin – Acknowledgments

Supervision: Alex Gammerman, Volodya Vovk, Zhiyuan Luo
Mathematical Advice: Daniil Riabko, Volodya Vovk, Teo Sharia
Proofreading: Zhiyuan Luo, Siân Cox
Graphics & Design: Siân Cox
Catering: Siân Cox