Novel Image Analysis Algorithms for Quantifying Expression of Nuclear Proteins
assessed by Immunohistochemistry
Elton Rexhepaj, MSc
UCD School of Biomolecular and Biomedical Science
UCD Conway Institute, University College Dublin
Biomarker Validation: Application of Tissue Microarrays
BIOMARKER DEVELOPMENT
Interpretation of IHC
Manual
• Subjective, time consuming
• Inherent intra-observer variability
• Semi-quantitative data
• Pathologist-based analysis remains the current standard
Automated
• Objective quantification of IHC staining
• Reproducible data
• Continuous output
• A new tool in the hands of the pathologist
Application of Image Analysis Approaches to assess IHC
Brennan et al resubmitted, Clinical cancer research, 2007
Rexhepaj et al, manuscript in preparation
• Promising tumour marker
• Located in the cytoplasm and the nucleus
• Nuclear and cytoplasmic fractions of survivin have different biological roles
• Manual interpretation of survivin is challenging
• Conflicting data exists on its prognostic impact in breast cancer
• Need for new automated scoring models
• Can automated scores lead to discovery of new prognostic subgroups?
Altered Nuclear-Cytoplasmic Ratio of Survivin is a Prognostic Indicator in Breast Cancer
Automated image analysis of survivin
• Breast Cancer TMA
– 102 patients
– 0.6 mm cores arrayed in duplicate
– Full clinicopathological data
– Median follow-up 77 months
• Image acquisition
– Aperio ScanScope CS Autoscanner
[Figure: H&E, IHC x10 and IHC x40 images of cores with low, medium and high staining intensity]
Brennan et al submitted 2007
Separating nuclear from cytoplasmic stain
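We were able to separate cytoplasmic from nuclear staining and to quantify the IHC staining intensity of each compartment independently
[Figure: example images with cytoplasmic staining only and with cytoplasmic and nuclear staining, at low and high intensity]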
Positive pixel count algorithm
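The idea of a positive pixel count can be sketched as follows. This is a minimal illustration, not the Aperio implementation: the intensity cut-offs, the function names and the way the cytoplasmic to nuclear ratio is formed from the two positivity fractions are all assumptions made for the example.

```python
import numpy as np

# Hypothetical cut-offs on a 0-255 scale (darker pixel = more stain).
STRONG_MAX, MEDIUM_MAX, WEAK_MAX = 100, 175, 220


def positive_pixel_count(intensity, mask):
    """Count strong/medium/weak positive pixels inside one compartment mask."""
    pixels = intensity[mask]
    strong = int(np.sum(pixels < STRONG_MAX))
    medium = int(np.sum((pixels >= STRONG_MAX) & (pixels < MEDIUM_MAX)))
    weak = int(np.sum((pixels >= MEDIUM_MAX) & (pixels < WEAK_MAX)))
    positive = strong + medium + weak
    return {
        "strong": strong,
        "medium": medium,
        "weak": weak,
        "negative": int(pixels.size - positive),
        # Fraction of compartment pixels classed as positive.
        "positivity": positive / pixels.size if pixels.size else 0.0,
    }


def cytoplasm_to_nuclear_ratio(intensity, cyto_mask, nuc_mask):
    """Illustrative CNR: cytoplasmic positivity divided by nuclear positivity."""
    cyto = positive_pixel_count(intensity, cyto_mask)["positivity"]
    nuc = positive_pixel_count(intensity, nuc_mask)["positivity"]
    return float("inf") if nuc == 0 else cyto / nuc


# Toy 4x4 "image": the left two columns are treated as nucleus, the rest as cytoplasm.
toy = np.array([
    [50, 50, 240, 240],
    [150, 150, 240, 240],
    [200, 200, 200, 200],
    [250, 250, 250, 250],
])
nuc_mask = np.zeros(toy.shape, dtype=bool)
nuc_mask[:, :2] = True
cyto_mask = ~nuc_mask
print(positive_pixel_count(toy, nuc_mask))
print(cytoplasm_to_nuclear_ratio(toy, cyto_mask, nuc_mask))
```

In practice the compartment masks would come from a segmentation step and the cut-offs would need calibrating per stain and scanner.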
Random Forest Clustering Survivin cytoplasmic to nuclear ratio
• By applying RFC we were able to find four clusters of patients
• The cytoplasmic to nuclear ratio (CNR) differed between the clusters
[Figure: patients plotted by RFC dimension 1 against RFC dimension 2, with clusters labelled 1 to 4]
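One common way to do random forest clustering is sketched below. It assumes the usual unsupervised recipe: build a synthetic contrast data set by permuting each feature column, train a forest to tell real from synthetic, and cluster the real samples on the forest proximity. The slides do not say which variant, features or clustering step were used, so the function name, the use of scikit-learn and SciPy, and the average-linkage step are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def random_forest_clusters(X, n_clusters, n_trees=500, seed=0):
    """Cluster the rows of X using random forest proximities (illustrative)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Synthetic contrast data: permuting each column independently keeps the
    # marginal distributions but destroys the dependence between features.
    synthetic = np.column_stack([rng.permutation(col) for col in X.T])
    data = np.vstack([X, synthetic])
    labels = np.concatenate([np.ones(len(X)), np.zeros(len(synthetic))])

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(data, labels)

    # Proximity of two real samples = fraction of trees where they share a leaf.
    leaves = forest.apply(X)  # shape (n_samples, n_trees)
    proximity = np.mean(leaves[:, None, :] == leaves[None, :, :], axis=2)
    distance = np.sqrt(1.0 - proximity)
    np.fill_diagonal(distance, 0.0)

    # Hierarchical clustering on the proximity-derived distance matrix.
    tree = linkage(squareform(distance, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy data with two features per patient (hypothetical values).
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([
    demo_rng.normal(0.0, 0.3, size=(20, 2)),
    demo_rng.normal(5.0, 0.3, size=(20, 2)),
])
demo_labels = random_forest_clusters(X_demo, n_clusters=2, n_trees=50)
print(demo_labels)
```

The permutation trick needs at least two features, since a single permuted column is indistinguishable from the original. The two "RFC dimensions" on a plot like the one above would typically come from scaling the same distance matrix down to two dimensions.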
CNR and patient survival
• Clusters with a high CNR showed improved breast cancer-specific survival (BCS) and overall survival (OS)
[Figure: survival curves for overall survival and breast cancer-specific survival against time (months), comparing CNR < 5 (low CNR) with CNR > 5 (high CNR); P = 0.005 and P = 0.05 are shown on the plots]
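Assuming the survival curves are Kaplan-Meier estimates (the slide does not name the estimator), the calculation behind each curve can be sketched in a few lines. The follow-up times and event flags below are invented for illustration and are not cohort data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up in months
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Invented example: five patients, two of them censored.
example_curve = kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0])
for month, prob in example_curve:
    print(month, round(prob, 3))
```

One such curve would be computed per group (CNR < 5 and CNR > 5) and the two compared with a test such as the log-rank test.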
Cox Regression Analysis of OS
Univariate and Multivariate analysis revealed that the CNR was a significant predictor of OS in this cohort along with tumour size and nodal status
                              Univariate                        Multivariate*
Variable                      HR     95% CI         p value     HR     95% CI         p value
CNR (<5 vs >5)                0.1    0.01 - 0.73    0.023       0.09   0.01 - 0.72    0.024
Nodal status (pos vs neg)     3.03   1.48 - 6.20    0.002       2.74   1.21 - 6.19    0.015
Grade (1 & 2 vs 3)            2.52   1.32 - 4.81    0.005       0.63   0.27 - 1.48    0.29
ER status (pos vs neg)        0.38   0.20 - 0.73    0.004       0.61   0.22 - 1.67    0.334
Her2 (1 & 2+ vs 3+)           2.19   1.06 - 4.52    0.034       2.0    0.84 - 4.78    0.119
PR status (pos vs neg)        0.41   0.21 - 0.80    0.009       0.86   0.36 - 2.08    0.737
Tumor size (continuous)       1.04   1.02 - 1.06    0.001       1.05   1.02 - 1.08    0.002
Ki-67 (0-10% vs 11-100%)      2.60   1.01 - 6.67    0.047       0.98   0.31 - 3.10    0.975
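How the columns of a Cox table relate to each other can be checked with a short calculation. In a Cox model the hazard ratio is exp(beta), and a Wald-type 95% confidence interval is exp(beta ± 1.96 · SE). The sketch below assumes the intervals in the table are Wald intervals and recovers the standard error, z statistic and two-sided p value from a reported HR and CI; it uses the univariate nodal status row (HR 3.03, 95% CI 1.48 - 6.20) as input. The function name is hypothetical.

```python
import math


def wald_from_hazard_ratio(hr, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and a two-sided p value from an HR and its 95% CI."""
    beta = math.log(hr)
    # On the log scale the CI is symmetric: width = 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


beta, se, z, p = wald_from_hazard_ratio(3.03, 1.48, 6.20)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

Small differences from a reported p value are expected, because the inputs are rounded and the original analysis may have used a different test (for example a likelihood ratio test).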
Low CNR a new prognostic subgroup
                        Cytoplasmic:Nuclear Ratio <5    Cytoplasmic:Nuclear Ratio >5    P value
                        (n = 78)                        (n = 18)
Tumor Size
  Median (Range)        22 (10-100)                     24 (11-60)
  0-20mm                33 (42)                         6 (33)                          0.601⁴
  >21mm                 45 (58)                         12 (67)
ER status¹
  ER -                  26 (33)                         1 (6)                           0.019⁵
  ER +                  52 (67)                         17 (94)
PR status¹
  PR -                  35 (45)                         3 (17)                          0.033⁵
  PR +                  43 (55)                         15 (83)
NHG
  NHG I & II            37 (47)                         18 (100)                        *
  NHG III               41 (53)                         0
p53 Status²
  p53 -                 53 (68)                         17 (94)                         0.005⁵
  p53 +                 25 (32)                         1 (6)
Myc Amplification³
  Low                   43 (55)                         17 (94)                         0.016⁵
  Intermediate/High     16 (21)                         1 (6)
  Missing               22
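Comparisons of this kind (a categorical marker against CNR group) are typically tested with a chi-square or Fisher's exact test. The sketch below is a plain two-sided Fisher's exact test applied to the ER status counts from the table (26 ER- and 52 ER+ with CNR <5, 1 ER- and 17 ER+ with CNR >5). Which test produced each P value in the table is not stated in the slides, so treating this row as a Fisher's exact comparison is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )


# Rows: CNR <5, CNR >5. Columns: ER negative, ER positive.
p_er = fisher_exact_two_sided(26, 52, 1, 17)
print(f"Fisher exact p for ER status by CNR group: {p_er:.4f}")
```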
A low Survivin CNR is associated with a mitotic/proliferative phenotype
Survivin - conclusions
• Image analysis applied to Survivin IHC
• Image analysis of IHC can produce new automated quantitative scoring models
• RFC was used to identify new prognostic subgroups
• Previously unidentified prognostic subgroups can be uncovered
• A low Survivin CNR is associated with a mitotic/proliferative phenotype
What can be improved
[Diagram: a PATTERN (size, shape, distance . . .) is obtained by MACHINE LEARNING or by MANUAL CALIBRATION]
Apply the learned or calibrated patterns to the rest of the cohort.
• The supervised approach:
- is not reproducible and cannot be extended to other tissue types
- requires a domain expert to select the validation and test cohorts of patients
• The manual calibration:
- is time consuming
- needs to be repeated for each new slide/cohort/type of tissue
Alternative: application of non-supervised learning algorithms to learn the patterns on a case-by-case basis
Automated image analysis of ER and PR
• Members of the nuclear hormone receptor family
• Expressed in around 70% of breast cancer cases
• Estrogen often induces an increase in progesterone receptors
• Currently, hormone receptor status is manually assessed by a pathologist
• An arbitrary cut-off of 10% positive cells is used to decide whether a patient should have adjuvant hormonal therapy
Data
• More than 1000 patients
• Full clinico-pathological follow-up
COHORT I
- 564 pre-menopausal women with primary breast cancer
- Patients were randomly assigned to either two years of adjuvant tamoxifen or a control group
COHORT II
- 512 consecutive breast cancer cases
COHORT III
- 179 consecutive cases of invasive breast cancer
Application of IHC nuclear algorithm on tissue cores examples
Algorithm validation
- Validation set: 18 representative tissue cores stained for ER
- A trained pathologist was asked to score each tissue core blind
- A very good correlation was observed between manual and automated scores
[Figure: manual pathologist assessment plotted against automated percentage]
Correlation of manual with automated score of ER
• A good correlation was seen between manual and automated scores
Correlation of manual with automated score of PR
• A good correlation was seen between manual and automated scores
Selection of the threshold for ER status – cohort I
• 358 thresholds were generated in the range 0-100%
• For each cut-off, BCS and OS of ER-negative patients were compared with those of ER-positive patients
• The best cut-off was 5% for ER and 7% for PR
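The threshold scan can be sketched as a loop over candidate cut-offs with a survival comparison at each one. The slides do not state how the thresholds were spaced or which criterion defined the "best" cut-off, so the sketch assumes evenly spaced thresholds and picks the cut-off with the smallest two-group log-rank p value. All function names and the toy data are hypothetical.

```python
import math


def logrank_p(times, events, groups):
    """Two-group log-rank test; groups holds 0/1 flags. Returns the p value."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                        # at risk
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square tail with 1 df


def best_cutoff(scores, times, events, cutoffs):
    """Return (p value, cut-off) for the cut-off with the smallest p value."""
    results = []
    for c in cutoffs:
        groups = [1 if s >= c else 0 for s in scores]
        if 0 < sum(groups) < len(groups):  # skip cut-offs that leave a group empty
            results.append((logrank_p(times, events, groups), c))
    return min(results) if results else None


# 358 evenly spaced thresholds between 0 and 100% (spacing is an assumption).
thresholds = [100 * i / 357 for i in range(358)]

# Invented toy cohort: automated % positive nuclei, follow-up (months), event flag.
toy_scores = [0, 1, 2, 3, 50, 60, 70, 80]
toy_times = [5, 6, 7, 8, 50, 60, 70, 80]
toy_events = [1, 1, 1, 1, 0, 0, 0, 0]
print(best_cutoff(toy_scores, toy_times, toy_events, [0, 25, 65, 100]))
```

Choosing the cut-off with the smallest p value on the same data inflates significance, so a cut-off found this way is best confirmed in an independent cohort.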
A novel approach to automatically define the threshold for ER status – cohort I
- Random forest clustering was used to automatically cluster patients into ER+/- subgroups
A novel approach to automatically define the threshold for PR status – cohort I
- Random forest clustering was used to automatically cluster patients into PR+/- subgroups
ER/PR status as defined by clusters and correlation with manual scores – cohort I
• ER status as defined by RFC was correlated with manual scores
• The Spearman correlation coefficient was 0.8 for ER and 0.7 for PR
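For reference, the Spearman coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. A minimal sketch follows; the example values are invented and are not the cohort data.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented manual and automated scores for four cores.
manual = [0, 10, 20, 30]
automated = [1, 5, 4, 90]
print(spearman(manual, automated))
```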
Correlation of RFC clusters with tamoxifen response cohort I
• There was a significant effect of 2 years of tamoxifen treatment in the ER+ and PR+ patients as determined by RFC
• No treatment effect was evident in ER-, PR- patients as determined by RFC
Summary
- A novel non-supervised image analysis algorithm
- Excellent correlation of manual with automated scoring
- Univariate analysis of OS showed no significant difference in the HRs between manual and automated scores
- A patient clustering approach to investigate patient stratification
- A new automated approach to stratify patients into ER-/+ subgroups
- The ability to predict tamoxifen response was similar for manual and automated scoring
Acknowledgements
Supervisor: Prof. William Gallagher
UCD School of Medicine and Medical Science: Dr Amanda McCann, Dr Dermot Leahy, Dr. Donal Brennan
Gallagher Lab: Dr. Darran O’Connor, Dr. Linda Whelan, Dr. Annette Byrne, Dr. Mairin Rafferty, Dr. Richard Talbot, Dr. Shauna Hegarty, Caroline Currid, Sharon McGee, Elaine McSherry, Liam Faller, Ian Miller, Denise Ryan, Fiona Lanegan, Ben Collins, Tom Lau, Karen Power, Stephen Madden, Sarah Penny
Dept of Pathology, Lund University, Sweden: Prof Goran Landberg, Dr Karin Jirstrom, Asa Kronblad
TARP Laboratory, NCI, NIH, Washington: Dr Stephen Hewitt
Aperio
Also: Dr. Helen Cooney, Aisling O Riordan, Dr Catherine Kelly, Dr Sallyann O’Brien
EMBO practical course on Tissue Microarray construction and image analysis
June 2008 – THE RETURN !!!
http://coursewiki.embo.org/doku.php?id=tissue_microarrays:microarray_course