Cancer Incidence Predictions
(Finnish Experience)
Tadeusz Dyba
Joint Research Center
EPAAC Workshop, January 22-23 2014, Ispra
Rationale for making cancer incidence predictions
Administrative:
to plan the allocation of the resources
(then predictions should be as accurate as possible)
Scientific:
to evaluate the success of disease control
(then predictions that do not come true are useful)
Example of administrative prediction: Updating cancer registry data
(Annual Report of Finnish Cancer Registry 2006/2007)
Example of scientific prediction:
Precision of prediction
"Predictions are often given as point forecasts with no guidance as to
their likely accuracy."
"Given their importance, it is perhaps surprising and rather regrettable,
that many […] do not regularly produce prediction intervals, and that
most predictions are still given as a single value."
Chris Chatfield, "Time-Series Forecasting",
Chapman & Hall, 2001.
Why use prediction intervals in incidence predictions?
• Monitoring a range of possible future outcomes
• Evaluation of cancer prevention actions
• Some predictions more accurate than others
• Elimination of absolutely inaccurate predictions
Calculating prediction intervals
Using the formula for the variance of a conditional distribution,
var(c_iT) = var( E(c_iT | θ_iT) ) + E( var(c_iT | θ_iT) ),
it can be shown for any model that
var(c_iT) = var(θ_iT) + E(θ_iT),
where θ_iT is the estimator of the predicted number of cases c_iT
(for Poisson counts, E(c_iT | θ_iT) = var(c_iT | θ_iT) = θ_iT).
uncertainty about the predicted number of cases
  = uncertainty about the parameters of the model
  + uncertainty about the future distribution
Confidence interval: reflects only the uncertainty about the parameters of the model
Prediction interval: reflects both sources of uncertainty
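The difference between the two intervals can be illustrated with a minimal sketch. It assumes Poisson-distributed counts and a normal approximation for the interval limits; the normal approximation, the function name and the input values are assumptions made for illustration, not taken from the slides.

```python
import math

def confidence_and_prediction_interval(theta_hat, var_theta_hat, z=1.96):
    """Return (confidence interval, prediction interval) for a predicted count.

    theta_hat     : estimated expected number of cases (hypothetical input)
    var_theta_hat : variance of that estimate (hypothetical input)
    z             : normal quantile, 1.96 for a nominal 95% interval (assumption)
    """
    # Confidence interval: only the uncertainty about the model parameters.
    ci_half = z * math.sqrt(var_theta_hat)
    # Prediction interval: adds the Poisson variance of the future count,
    # var(c) = var(theta) + E(theta).
    pi_half = z * math.sqrt(var_theta_hat + theta_hat)
    ci = (theta_hat - ci_half, theta_hat + ci_half)
    pi = (theta_hat - pi_half, theta_hat + pi_half)
    return ci, pi

# Illustrative values only: 400 predicted cases, estimator variance 225.
ci, pi = confidence_and_prediction_interval(400.0, 225.0)
```

The prediction interval is always at least as wide as the confidence interval, because the term E(θ) is non-negative.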
Why use simple models for cancer incidence prediction?
• Rule of parsimony (a "status quo" assumption lies behind any prediction)
• Complicated models are not likely to hold in the future
• Lack of information, or of reliable information, about the causes
of cancers
• Shorter prediction intervals, hence more precise predictions
(if the model holds)
• Clear interpretation
AGE-PERIOD MODELS
I_it = c_it / n_it
c_it = number of cases
n_it = number of person-years
i = age, t = period
Decreasing trends of cancer incidence
ln E(I_it) = α_i + β t
ln E(I_it) = α_i + β_i t
Increasing trends of cancer incidence
E(I_it) = α_i + β_i t
E(I_it) = α_i (1 + β t) ← no non-identifiability property
Assumptions about the distributions:
c_it ∼ Poisson
I_it, age-adjusted ∼ Normal
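As a hedged illustration of how the log-linear age-period model with a common slope, ln E(I_it) = α_i + βt, could be fitted under the Poisson assumption, the sketch below uses iteratively reweighted least squares in NumPy. It is not the author's Stata implementation; the function names are hypothetical, and it assumes every age group has at least one case.

```python
import numpy as np

def fit_log_linear_age_period(cases, person_years, periods, n_iter=50):
    """Poisson maximum-likelihood fit of ln E(I_it) = alpha_i + beta * t.

    cases, person_years : arrays of shape (n_ages, n_periods)
    periods             : array of shape (n_periods,) with the period values t
    Returns (alpha, beta), alpha having one entry per age group.
    """
    cases = np.asarray(cases, dtype=float)
    person_years = np.asarray(person_years, dtype=float)
    periods = np.asarray(periods, dtype=float)
    n_ages, n_periods = cases.shape

    # Design matrix: one indicator column per age group plus a common slope column.
    X = np.zeros((n_ages * n_periods, n_ages + 1))
    for i in range(n_ages):
        rows = slice(i * n_periods, (i + 1) * n_periods)
        X[rows, i] = 1.0
        X[rows, -1] = periods

    y = cases.ravel()                      # row-major: age i, then period t
    offset = np.log(person_years.ravel())  # log person-years as offset

    # Start from the crude age-specific rates and a zero slope.
    coef = np.zeros(n_ages + 1)
    coef[:n_ages] = np.log(cases.sum(axis=1) / person_years.sum(axis=1))

    for _ in range(n_iter):
        eta = X @ coef + offset
        mu = np.exp(eta)                   # expected number of cases
        z = (eta - offset) + (y - mu) / mu # working response
        XtW = X.T * mu                     # Poisson weights equal mu
        new_coef = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(new_coef - coef)) < 1e-10
        coef = new_coef
        if converged:
            break
    return coef[:n_ages], coef[-1]

def predict_rates(alpha, beta, future_period):
    """Predicted age-specific incidence rates at a future period value."""
    return np.exp(alpha + beta * future_period)
```

The predicted number of cases in a future period is then the predicted rate multiplied by the projected person-years for each age group.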
Empirical coverage error of ex post predictions
Finnish data (1953-2003)
Site                 Females   Males
Lip                  95        95
Oesophagus           95        95
Stomach              84        100
Colon                97        73
Rectum               89        97
Liver                78        65
Gallbladder          76        65
Pancreas             89        97
Lung                 84        31
Corpus uteri         86        ---
Ovary                86        ---
Kidney               84        78
Bladder              76        70
Skin melanoma        89        86
Skin non-melanoma    68        81
Nervous system       76        78
Thyroid              68        86
Non-Hodgkin          85        86
Leukaemia            81        92
Cancer sites without screening activities; horizon of prediction = 5 years; 32 ex post predictions per site
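A minimal sketch of how an empirical coverage figure of this kind could be computed is given below. It assumes that each table entry is the percentage of ex post predictions whose prediction interval contained the value later observed; that reading, the function name and the numbers are assumptions for illustration only.

```python
def empirical_coverage(observed, intervals):
    """Percentage of observed values that fall inside their prediction interval.

    observed  : list of observed values, one per ex post prediction
    intervals : list of (lower, upper) prediction limits, in the same order
    """
    if len(observed) != len(intervals):
        raise ValueError("observed and intervals must have the same length")
    hits = sum(lower <= obs <= upper
               for obs, (lower, upper) in zip(observed, intervals))
    return 100.0 * hits / len(observed)

# Hypothetical example: four ex post predictions, one of which misses.
coverage = empirical_coverage(
    [10, 20, 30, 40],
    [(8, 12), (15, 19), (25, 35), (39, 45)],
)
```

Comparing this figure with the nominal level of the interval shows whether the intervals are too narrow or too wide for a given site.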
Site specific predictions
- Lung Cancer
Hakulinen T and Pukkala E, Int J Epidemiol, 1981
a simulation model to predict lung cancer incidence in Finland based on historical
smoking habits and possible future scenarios of starting and quitting smoking
- Breast Cancer
Seppanen et al., Cancer Causes Control, 2006
predicting breast cancer incidence under historical and possible future scenarios
of screening practices in Finland
Examples of predictive models for Finland based
on age-period-region specific data
- cancer control is by region in Finland
- stratification by region increases the homogeneity of the data, eliminating extra-Poisson
variation for the more common cancers
No.  Model                      D.F.  Pearson's X²  Deviance  Dev. + 2*NP
1    α_i                        702   991.2         1044.6    1070.6
2    α_i (1 + β t)              701   983.2         1037.8    1065.8
3    α_i (1 + γ_r + β t)        697   701.6         724.6     760.6
4    α_i (1 + γ_r + β_r t)      693   695.9         718.9     762.9
5    α_i + β_i t                689   966.6         1024.5    1076.5
6    α_ir                       650   644.7         677.9     807.9
7    α_ir (1 + β t)             649   637.0         671.1     803.1
8    α_ir (1 + β_r t)           645   634.2         667.0     807.0
9    α_ir + β_i t               638   619.8         657.4     811.4
10   α_ir + β_ir t              588   577.6         609.9     863.9
The models for cancer incidence are specific to age (α_i), period (β) and region (γ_r), and are applied to cancer sites
with an increasing (or stable) incidence pattern.
The example of fit is for lung cancer in females in Finland.
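The last column of the table can be recomputed with a short sketch. It assumes that NP denotes the number of fitted parameters and that D.F. + NP equals the number of data cells; both are inferred from the table itself (for model 1, (1070.6 − 1044.6) / 2 = 13 and 702 + 13 = 715), not stated on the slide. The variable and function names are hypothetical.

```python
# (model number, degrees of freedom, deviance) copied from the table above.
MODELS = [
    (1, 702, 1044.6),
    (2, 701, 1037.8),
    (3, 697, 724.6),
    (4, 693, 718.9),
    (5, 689, 1024.5),
    (6, 650, 677.9),
    (7, 649, 671.1),
    (8, 645, 667.0),
    (9, 638, 657.4),
    (10, 588, 609.9),
]

# Assumption inferred from the table: D.F. + NP = 715 for every model.
N_CELLS = 715

def penalised_deviance(df, deviance, n_cells=N_CELLS):
    """Deviance plus twice the number of parameters (an AIC-type criterion)."""
    n_params = n_cells - df
    return deviance + 2 * n_params

scores = {no: round(penalised_deviance(df, dev), 1) for no, df, dev in MODELS}
best_model = min(scores, key=scores.get)  # model with the smallest criterion
```

Under this criterion the smallest value in the table belongs to model 3, α_i (1 + γ_r + β t), which adds a region effect to the age-period model with a common slope.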
Predictions for males in Finland as age-adjusted incidence rates
based on age-period-region models
Other approaches to prediction
Age-period-cohort models
Moller B, et al. Stat in Med, 2003
(empirical evaluation of Age-Period-Cohort models for prediction, applying different
methods to data from the Nordic countries; the precision of the predictions
was not evaluated by means of prediction intervals)
Rutherford M et al. Int J Biost 2012, PhD Thesis 2011
(within a flexible parametric modelling framework, the period and cohort cubic spline
functions are forced to be linear beyond the boundary knots in order to predict future incidence)
Bayesian age-period-cohort models
Bashir G and Esteve J, J of Epidemiol and Biostat, 2001
Bray I, Appl Stat, 2002
Baker A and Bray I, Am. J Epidemiol, 2005
Cleries R et al., Stat Med, 2012
(choice of smoothing priors is crucial, long credible intervals)
Generalized additive models
Clements et al., Biostatistics, 2005
(prediction intervals more precise than those from the Bayesian approach)
Software for incidence prediction
- Published papers are sometimes accompanied by computer code to perform the prediction,
or the code is available upon request; many Bayesian predictions use WinBUGS software
- Nordpred package developed in Norway, written in R
http://www.kreftregisteret.no/en/Research/Projects/Nordpred/Nordpred-software/
Moller B, et al. Stat in Med, 2003
Engholm et al., Association of Nordic Cancer Registries. Danish Cancer Society, 2009
- The four age-period models presented here can be applied using Stata macros
http://www.cancer.fi/syoparekisteri/en/general/links/
Hakulinen T and Dyba T, Stat Med, 1994; Dyba T and Hakulinen T, Stat Med, 1997; 2000
- Prediction based on APC models with restricted cubic splines can be carried out using Stata macros
Rutherford M et al., Int J Biost 2012; Stata Jour 2012; Rutherford M, PhD Thesis 2011
- Online analysis tools that allow predictions to be made for certain data sets
Nordic countries: http://www.kreftregisteret.no/en/Research/Projects/Nordpred/Nordpred-software/
Other countries: http://www-dep.iarc.fr/WHOdb/predictions_sel.htm
Closing remarks
• Predictive methods should clearly state the assumptions used in the prediction process
• No single method fits all cancer sites
• Published cancer incidence predictions often lack the necessary measure of precision
• Mathematically advanced predictive methods/models are hard to interpret
• Software for the different prediction methods needs to be collected in one place (website?)
• Without external information on cancer etiology, latency time and screening activities,
performing cancer incidence predictions will remain a challenge
THANK YOU
Long-term Bayesian predictions for Finland:
numbers of new cases in females 1990-1994
___________________________________________
Site         Observed   Projected   90% CI
___________________________________________
Oesophagus   442        440         130; 1552
Lung         2208       2152        706; 6539
Melanoma     1240       1675        431; 6216
Breast       13930      15032       4952; 46354
Brain        1948       2117        643; 8285
___________________________________________
Bashir G and Esteve J, J of Epidemiol and Biostat, 2001
Posterior estimates of the precision parameters
of the predictive model
_______________________________________________
Timescale   Posterior median   90% Credible Interval
_______________________________________________
Age         18.6               6.9; 52.5
Period      674.2              54.9; 2993.4
Cohort      513.2              61.6; 2724.3
Bray I, Appl Stat, 2002
Moller B, et al. Stat in Med, 2003