unsupervised discovery of novel emphysema subtypes
TRANSCRIPT
Unsupervised Discovery of
Novel Emphysema Subtypes
Funding: NHLBI R01-HL077612, R01-HL093081, R01-HL121270,
HHSN268200900017C
Lung
Andrew Francis Laine, D.Sc. Percy K. and Vida L. W. Hudson Professor of Biomedical Engineering and
Professor of Radiology (Physics)
Department of Biomedical Engineering
Columbia University
New York, NY 10027
Presented at
New Jersey Public Health Annual Conference:
“Pressing Public Health Challenges: Systems Thinking, Opioid Crisis, and Use of Artificial Intelligence”
Rutgers Continuing Education Center at Atrium, Division of Continuing Education
October 4, 2019, Somerset, NJ
Heffner Biomedical Imaging Lab
Jie Yang (Ph.D. Student, 2019)
Xinyang Feng (Ph.D. student, 2019)
Yrjö Häme (Ph.D. student)
Yu Gan (Post-doc)
Thomas Vetterli (M.Sc. Intern)
Additional Collaborators
Columbia University Medical Center:
Pallavi P. Balte, Ph.D.
John H.M. Austin, M.D.
Benjamin M. Smith, M.D.
Yifei Sun, Ph.D.
Wei Shen, M.D.
Iowa University:
Eric A. Hoffman, Ph.D.
.
University of Virginia:
Ani Manichaikul, Ph.D.
Acknowledgements
Collaborators
Elsa D. Angelini, Ph.D.
R. Graham Barr, M.D., Ph.D.
2
And the MESA, SPIROMICS Investigators!
Emphysema
• Emphysema + COPD is 3rd leading cause of death in USA.
• Defined by loss of interalveolar septa
• Predicts mortality in patients with and without COPD
Leopold/Gough, Thorax, 1957
Johannessen, AJRCCM, 2013
Oelsner, Ann Intern Med, 2014
Background
4
Computed tomography (CT) used to analyze lung structure:
Resolution =
0.5×0.5×0.75 mm
Matrix size =
512×512×500 pixels
Intensity range =
[-1024 1024] HU
Axial Sagittal Coronal
40 megavoxels of the lung:
• Enable in vivo study of lung
structure and disease patterns.
Background
Emphysema Nodule
[1] http://www.goldcopd.org/
[2] Siegel et al., Cancer statistics, A Cancer Journal for Clinicians, 2016.
5
1st leading cause of
cancer-related death in
the US [2].
> 150,000 deaths in the
US in 2018.
Lung nodule: Higher
attenuation abnormality.
4th leading cause of death
in the world [1]:
Affects 16 millions of
subjects in the US.
Emphysema: Lower
attenuation abnormality.
Axial CT Slice
Lung tissue abnormalities on CT:
• Characterized by localized texture patterns.
COPD and Pulmonary
Emphysema
Lung Cancer and Nodule
Background
[1] Smith et al., Pulmonary emphysema subtypes on computed tomography: the MESA COPD study, The American Journal of Medicine, 2014.
6
Lung texture learning to characterize emphysema subtypes:
Emphysema Subcategories
Centrilobular
Emphysema
(CLE)
Panlobular
Emphysema
(PLE)
Paraseptal
Emphysema
(PSE)
N = 140 N = 140 N = 2
Exact mechanism of developing
COPD remains unknown;
Three standard emphysema
subtypes defined at autopsy [1]:
• Limited inter-rater agreement.
Lung texture learning for
emphysema subtyping can
advance disease understanding
COPD and Emphysema
N = 140 N = 140
Leopold/Gough, Thorax, 1957
Thurlbeck, 1963
Heard/Izukawa, 1964
Smith, Am J Med, 2013
Takahashi, Int J COPD 2008
Classic Emphysema Subtypes Lung
Panlobular
Centrilobular Paraseptal
Pananlobular
Background
Limitations of existing CT-based texture analysis:
Limited to supervised learning;
Limited to texture features, without considering spatial locations;
Superior
Anterior
Inferior
Core
Peel
Posterior
8 [1] Lynch et al., CT-definable subtypes of chronic obstructive pulmonary disease: a statement of the fleischner society. Radiology, 2015.
[2] Swensen et al., The probability of malignancy in solitary pulmonary nodules: application to small radiologically indeterminate nodules.
Archives of internal medicine, 1997.
Emphysema subtypes:
• Different in spatial prevalence [1].
Lung nodules:
• Location predicts malignancy [2].
Background
Limitations of existing CT-based texture analysis:
Limited to supervised learning;
Limited to texture features, without considering spatial locations;
Limited to high-resolution full-lung CT scans.
9 [1] Bild et al., Multi-ethnic study of atherosclerosis: objectives and design. American journal of epidemiology, 2002.
[2] Hoffman et al., Reproducibility and Validity of Lung Density Measures from Cardiac CT Scans—The Multi-Ethnic Study of
Atherosclerosis (MESA) Lung Study, Academic radiology, 2009
Large dataset of cardiac CT scans are available:
• 10,000s longitudinal cardiac CT scans + gold
standard full-lung CT scans in MESA [1];
• ~2/3 of the lung region + reproducible %𝑒𝑚𝑝ℎ [2];
• Enable large-scale longitudinal study.
Full-lung CT Scan Cardiac CT Scan
SPIROMICS and MESA
• SubPopulations and Intermediate Outcome
Measures In COPD Study
–COPD case-control study
–2,983 participants with CT scans (~6,000 scans)
–Whole genome sequencing, multi-omics
• Multi-Ethnic Study of Atherosclerosis Lung Study
–Population-based, prospective cohort study
–3,205 participants with full-lung CT scans
(~50,000 scans)
–Whole genome sequencing, multi-omics
Lung
Hypothesis
• Unsupervised learning of spatial lung texture
patterns on research CT scans will yield novel
emphysema subtypes.
–Reproducible
–Distinct symptoms
–Specific histology and genetic basis
Background
Aimed to tackle the problem of CT-based lung texture learning exploiting spatial
localization, using unsupervised / weakly-supervised learning.
12
Aim 1: Develop an algorithm for unsupervised learning
of localized texture patterns for emphysema.
Aim 2: Label the discovered localized texture patterns on
large datasets of cardiac CT scans.
Aim 3: Examine possible correlations / hits with GWAS
genomic information in MESA and SPIROMICS.
NIH R01-HL121270: Novel
Quantitative Emphysema Subtypes
in MESA and SPIROMICS.
(PIs Dr. R.G. Barr, Dr. A.F. Laine)
Data
• Lung masks:
• VIDA Diagnostics APOLLOⓇ;
• Emphysema masks for full-lung scans:
• Hidden Markov measure field (HMMF)
based model [1] : %𝑒𝑚𝑝ℎHMMF;
• Intensity thresholding: %𝑒𝑚𝑝ℎ−950;
• Lung masks:
• Intensity thresholding <-400 HU [2] +
closed space dilation [3].
-700
-800
-900
-1000
(HU)
Lung Airway
MESA: Bild et al., American Journal of Epidemiology, 2002;
MESA COPD: Thomashow et al., American journal of respiratory and critical care medicine, 2013.
SPIROMICS: Couper et al., Thorax, 2013.
LIDC-IDRI: http://www.cancerimagingarchive.net/
Kaggle DSB2017: https://www.kaggle.com/c/data-science-bowl-2017
[1] Hame et al., Adaptive quantification and longitudinal analysis of pulmonary emphysema with a hidden Markov
measure field model. IEEE TMI, 2014.
[2] Hu et al., Automatic lung segmentation for accurate quantitation of volumetric X-ray CT images. IEEE TMI, 2001.
[3] Masutani et al., Region-growing based feature extraction algorithm for tree-like objects. In VBC, 1996.
13
Preprocessing Tools
Background: Processing Overview
• Unsupervised machine learning defined novel emphysema patterns on CT.
14
Textons
Fre
quency
Texture
Spatial
From 3 standard subtypes (1950’s)
to 10 novel spatial lung texture patterns
(sLTP) for emphysema
sLTP Labeling
sLTP
No Emphysema
Similarity
! " #$%
Unsupervised
Learning
Yang et al., Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary
Emphysema: The MESA COPD Study. MICCAI, 2017
15
Unsupervised Learning of Localized Texture
Patterns for Pulmonary Emphysema
• Learning localized texture patterns
in an unsupervised manner;
• Homogeneity vs. redundancy of
the learned patterns.
• Applied a method to standardize lung shape spatial mapping;
• Developed a two-stage unsupervised framework combining spatial and texture information;
• Discovered emphysema patterns on large COPD cohorts with compelling clinical significance.
Challenges
Contributions
[1] Jie Yang et al., Unsupervised Machine Learning to Define Quantitative Subtypes of Pulmonary
Emphysema on CT, Science (to submit in December 2018).
[2] Jie Yang et al., Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary
Emphysema: The MESA COPD Study. MICCAI, 2017.
Method Unsupervised learning of localized texture patterns for pulmonary emphysema
• Poisson Distance Map [1] (PDM): 𝒓 = 𝟏 − 𝑼
• Poisson Distance Conformal Map (PDCM): 𝜽 & 𝝓
∆𝑈 𝑥, 𝑦, 𝑧 = −1, for 𝑥, 𝑦, 𝑧 ∈ 𝑉 and ∆𝑈 = 𝑈𝑥𝑥 + 𝑈𝑦𝑦 + 𝑈𝑧𝑧
subject to 𝑈 𝑥, 𝑦, 𝑧 = 0, for 𝑥, 𝑦, 𝑧 ∈ 𝜕𝑉
Lung mask
Lung mask boundary
Lung Shape
Spatial Mapping 1
[1] Gorelick et al., Shape representation and classification using the poisson equation. IEEE TPAMI, 2006.
16
(𝒙, 𝒚, 𝒛) → (𝒓, 𝜽, 𝝓)
-700
-800
-900
-1000
( HU )
𝜽 𝝓
PDM PDCM
Anterior
Superior
Medial
Intensity Image
𝒓=𝟏−𝑼
Core
1
0.75
0.5
0.25
0
𝑼
Preliminary Evaluation of PDCM Unsupervised learning of localized texture patterns for pulmonary emphysema
PDCM for Studying Emphysema Spatial Locations over Populations
17
PDCM = a useful tool
for population study
of spatial patterns.
MESA COPD Study:
• N = 317 CT scans;
• Angular and radial
projections of intensity or
relative intensity values;
• Agree with definitions or
observations of standard
emphysema subtypes.
S = superior
I = Inferior
A = Anterior
P = Posterior
L = Lateral
M = Medial
S
L
I
Core
Peel
P
A M
Method Unsupervised learning of localized texture patterns for pulmonary emphysema
Learning Stage 1:
Augmented Lung Texture Patterns (LTPs) 2
Region of interest (ROI)
= 25 × 25 × 25 mm3
Texture feature 𝑭𝑻
= texton [1] - based feature
Spatial feature 𝑭𝑺
= location in 36 lung sub-regions
𝑟/3, 𝜃/4, 𝜙/3
𝜽
𝝓 𝐹𝑇𝑥 / 𝐹𝑆𝑥 = texture / spatial feature of ROI 𝑥
𝐹𝑇𝑘 / 𝐹𝑆𝑘 = texture / spatial centroid of 𝐿𝑇𝑃𝑘
[1] Varma et al., A Statistical Approach to Material Classification using Image Patch Exemplars, IEEE TPAMI, 2009.
18
Method Unsupervised learning of localized texture patterns for pulmonary emphysema
2
𝜒2 𝐹𝑇𝑥 , 𝐹𝑇𝑘𝑡−1
𝜔 ∙ 𝑊 ∙ 𝐹𝑆𝑥 , 𝐹𝑆𝑘𝑡−1
2
2
𝛾 ∙ 𝕀 𝜒2 𝐹𝑇𝑥 , 𝐹𝑇𝑘𝑡−1
> 𝑡ℎ𝑟𝑒𝑠ℎ𝜒2
Iteratively update ROI assignment
Λkt
of LTPk, by minimizing a
dedicated cost function [1] :
Texture
distance
Spatial
regularization
Texture
penalty
𝐹𝑇𝑥 / 𝐹𝑆𝑥 = texture / spatial feature of ROI 𝑥
𝐹𝑇𝑘 / 𝐹𝑆𝑘 = texture / spatial centroid of 𝐿𝑇𝑃𝑘
19
Learning Stage 1:
Augmented Lung Texture Patterns (LTPs)
Method Unsupervised learning of localized texture patterns for pulmonary emphysema
Learning Stage 2:
Spatially-informed lung
texture patterns (sLTPs) 3
sLTP Labeling
sLTP
No Emphysema
Similarity
! " #$%Similar LTP: can be replaced by each other.
𝑵𝒊⟶𝒋 = # of ROIs labeled with 𝐿𝑇𝑃𝑗 when removing 𝐿𝑇𝑃𝑖.
Infomap [1] Graph Partitioning of LTP similarity:
𝐺𝑖,𝑗 =𝑁𝑖⟶𝑗 + 𝑁𝑗⟶𝑖
𝑁𝑖 + 𝑁𝑗∙ 𝕀
𝑁𝑖⟶𝑘𝑘
𝑁𝑖> 𝑡ℎ𝑟𝑒𝑠ℎ𝑁→
∙ 𝕀 𝑁𝑗⟶𝑘𝑘
𝑁𝑗> 𝑡ℎ𝑟𝑒𝑠ℎ𝑁→
[1] Rosvall et al., Maps of random walks on complex networks reveal community structure. PNAS, 2008.
20
21
Learning Pipeline and Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
𝜽
𝝓
Anterior
Superior
Lateral
Core
Spatial
Feature
Texture
Feature
Textons
Fre
qu
en
cy
sLTP 1-10
Lung CT Scan
Emphysema 3D ROI
High-Resolution
CT Images Feature Extraction Unsupervised Learning of sLTPs
Inferior
Posterior
Medial
# 1
# 3
# 10
(b) Similarity
Graph Partition
(a) Initial
clustering
Training Set 1 (N = 1,462) Training Set 2 (N = 1,460)
21
Spatial density
0 1 2 3 4
Training Set 1
(N = 1,462)
Training Set 2
(N = 1,460)
Learning Reproducibility
• Regional level:
• Proportion of labeling
overlap of test ROIs = 0.82.
• Individual level:
• Spearman’s correlation of
sLTP histograms over all
subjects (N = 2,922)
• All sLTP > 0.95
DATA: SPIROMICS (N=2,922)
𝜽
𝝓
Anterior
Superior
Lateral
Core
Spatial
Feature
Texture
Feature
Textons
Fre
qu
en
cy
sLTP 1-10
Lung CT Scan
Emphysema 3D ROI
High-Resolution
CT Images Feature Extraction Unsupervised Learning of sLTPs
Inferior
Posterior
Medial
# 1
# 3
# 10
(b) Similarity
Graph Partition
(a) Initial
clustering
All SPIROMICS (N = 2,922)
MESA Lung
N = 3,128
SPIROMICS
N = 2,922
Quantitative Emphysema
Subtype (QES) 1-6
sLTP 3
sLTP 5
sLTP 9
sLTP 2
sLTP 1
sLTP 8
sLTP 10
sLTP 6
sLTP 4
sLTP 7
Population
Clustering
Popu
lation
-based
sLT
P M
erg
ing
Heatmap of %sLTP label histograms
over populations
10 sLTP to 6 QES
sLTP
Labeling
Apical
Vanishing Lung
Obstructive CPFE
Diffuse
Restrictive CPFE
Senile sLTP
histogram
sLTP
histogram
Learning Pipeline and Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
DATA: SPIROMICS (N=2,922) and MESA Lung Study (N=3,128)
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Apical Diffuse Senile Restrictive
CPFE
Obstructive
CPFE
Vanishing
Lung Units
SPIROMICS (n=2,853) β Est. β Est. β Est. β Est. β Est. β Est.
MRC-Dyspnea 0.1 -0.001 0.1 0.2 -0.1 0.1 Scale 0-4
SGRQ Score 1.1 0.8 -0.4 4.2 -1 0.4 Score 0-100
Resting O2 saturation (%) 0.004 -0.4 0.1 -0.9 0.2 -0.2 %
Post-exercise O2 saturation (%) -0.8 -0.4 -0.3 -1.8 0.4 -0.5 %
6 Minute Walk Test Distance (m) -0.3 -2.3 5.2 -22.0 5.2 -1.8 Meters
Hemoglobin 0.06 0.06 0.03 0.15 -0.05 0.07 g/dl
Exacerbations 0.1 0.2 0.03 0.3 0.1 0.1 Count
MESA Lung (n=2,949)
MRC-Dyspnea 0.3 -0.01 -0.0002 0.2 -0.02 0.02 Scale 0-4
Resting O2 saturation (%) -0.5 -0.3 0.2 -0.8 0.3 -2.2 %
FEV1 (mL) -310 18.8 14.6 -126.5 -82.9 817.6 Milliliters
FVC (mL) -150.2 153.7 102.8 -100.3 -56.0 1149.6 Milliliters
FEV1/FVC (%) -6.9 -2.6 -2.6 -0.9 -2.0 8.1 %
Total Lung Volume (mL) -64.2 485.3 333.8 -378.4 145.9 1781.7 Milliliters
MESA (n=6,683) HR HR HR HR HR HR
CLRD Hospitalization 2.9 1.5 0.8 1.0 1.4 1.1
CLRD Mortality 2.2 1.5 0.9 0.99 1.8 1.3
All-cause Mortality 1.6 0.9 0.96 1.1 0.9 0.99
MRC= Medical Research Council; O2=Oxygen; SGRQ=St. George's respiratory questionnaire;
6MW=Six minute walk; FEV1= Forced expiratory volume in one second; FVC=Forced expiratory volume in one second;
HR=Hazards ratio; CLRD=Chronic lower respiratory disease
β estimates compared to normal lung from multivariate linear regression models adjusted for age, sex, race, height, weight,
smoking status, pack-years, COPD, scanner manufacturer, FEV1, other QES
Summary of Clinical
Significance in SPIROMICS
and MESA:
Association of QES with
respiratory symptoms,
physiology, and prognosis
Red shading = statistically significant worsening;
Blue/green shading = statistically significant “improvement”.
SPIROMICS
MESA
Apical Diffuse Senile Restrictive Obstructive Vanishing CPFE CPFE Lung
Me
an
QE
S (
%)
9
8
7
6
5
4
3
2
1
0
Population Distribution of the QES
# 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Apical
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
Dyspnea
Desaturation on exertion
↓ 6MWT
Exacerbations, COPD death
↓↓ FEV1
↓ FVC
↓ FEV1/FVC
- TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Diffuse # 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
Hypoxemia at rest
Desaturation on exertion
Exacerbations, COPD death
FEV1
↑ FVC
↓ FEV1/FVC
↑ TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Senile # 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
FEV1
↑ FVC
↓ FEV1/FVC
↑ TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Restrictive CPFE # 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
Dyspnea
Hypoxemia at rest
Desaturation on exertion
↓↓6MWT
Exacerbations
↓ FEV1
FVC
FEV1/FVC
↓↓ TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Obstructive CPFE # 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
Desaturation on exertion
COPD death
↓ FEV1
↓ FVC
↓ FEV1/FVC
↑ TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
Vanishing Lung # 1
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
# 2 # 3 # 4 # 5
# 6 # 7 # 8 # 9 # 10
S
IP A
S
IP A
S
IP A
S
IP A
S
IP A
Apical Diffuse Senile Restrictive CPFE Obstructive CPFE Vanishing Lung
Adjusted for age, sex, race/ethnicity,
height, weight, smoking status, pack-
years, COPD, scanner manufacturer,
FEV1, other QES.
Associated with
Dyspnea
Desaturation on exertion
↑↑ FEV1
↑↑ FVC
FEV1/FVC
↑↑ TLV on CT
Experimental Results in SPIROMICS and MESA Lung Study Unsupervised learning of localized texture patterns for pulmonary emphysema
30
Unsupervised Learning of Localized Texture Patterns for Pulmonary Emphysema
Severe Apical QES Restrictive CPFE QES
Obstructive CPFE QES Severe Vanishing Lung QES
0
2
4
6
8
10
-lo
g1
0(p−
valu
e)
0
20
40
60
80
100
Reco
mbin
atio
n ra
te (c
M/M
b)
chr5:174730928 0.2
0.4
0.6
0.8
r2
DRD1 SFXN1
174.6 174.7 174.8 174.9
Position on chr5 (Mb)
• GWAS results:
• 5 genetic variants for four QES
• Apical QES: DRD1
Summary Unsupervised learning of localized texture patterns for pulmonary emphysema
Novel unsupervised learning of emphysema patterns on CT:
• A standardized lung shape spatial mapping;
• A two-stage learning framework.
Applied on large COPD and controls yielded:
• 10 highly-reproducible sLTPs;
• Six quantitative emphysema subtypes, associated independently
with distinct symptoms, lung function changes and mortality.
Enables:
• Novel definitions of emphysema subtypes;
• CT-based emphysema-specific signatures (biomarkers)
of the lungs;
• May facilitate future study for understanding COPD and
emphysema, and the design of personalized / gene / drug
therapies.
Other Evaluations:
• GWAS; 3 “hits” reproducible
• Extensive evaluation of sLTP
reproducibility in MESA
COPD (N = 317);
• Linking sLTP and standard
emphysema subtypes in
MESA COPD (N=317).
32
Labeling the Unsupervised Localized Texture
Patterns on Large Datasets of Cardiac CT Scans
• Introduced a robust emphysema segmentation framework on cardiac CT scans;
• Proposed a deep learning based domain adaptation [2] for robust lung texture learning.
• Labeled emphysema texture patterns on 17,039 longitudinal cardiac and full-lung CT scans.
Challenges working with
cardiac CT Scans
Overview of Method
[1] Jie Yang et al., Emphysema Quantification on Cardiac CT Scans Using Hidden Markov Measure
Field Model: The MESA Lung Study, MICCAI, 2016.
[2] Jie Yang et al., Unsupervised Domain Adaption with Adversarial Learning (UDAA) for Emphysema
Subtyping on Cardiac CT Scans: e MESA Study, ISBI, 2019 (under review).
• Missing apical regions;
• Degraded texture quality;
• Heterogeneous scanner types.
Method Labeling the unsupervised localized texture patterns on large datasets of cardiac CT scans
Adaptation of HMMF-based method
for emphysema segmentation on
cardiac CT scans [1]
1
Measure field 𝒒 Emphysema mask Intensity image
-700
-800
-900
-1000
(HU)
1
0.8
0.6
0.4
0.2
0
𝑷 𝒒, 𝜽 𝑰 =𝟏
𝑹𝑷(𝑰|𝒒, 𝜽)𝑷𝒒(𝒒)𝑷𝜽(𝜽)
𝒒 = measure field
𝑹 = constant
𝜽 = likelihood parameter
HMMF-based segmentation [2]:
1. Parametric models
𝑷(𝑰|𝒒, 𝜽) of intensity
distributions for emphysema
and normal tissue classes.
2. Enforces spatial coherence
via 𝑷𝒒 𝒒 ;
33
Method Labeling the unsupervised localized texture patterns on large datasets of cardiac CT scans
1
34
HU
-1000 -950 -900 -850 -800 -7500
0.01
0.02
0.03
0.04
0.05
Histogram
Skew-normal fit
𝜽𝑵 = [𝝁𝑵, 𝝈𝑵, 𝜶𝑵 𝜽𝑬 = [𝝁𝑬, 𝝈𝑬
Normal lung tissue:
Skew-normal distribution
Emphysema:
Normal distribution
• 𝑪 = clique: 8-connected in 2D.
• 𝝀 = Markovian weight:
• scanner-specific.
• Use longitudinal scans of
healthy population to tune.
Adaptation of HMMF-based method
for emphysema segmentation on
cardiac CT scans [1] HMMF-based segmentation [2]:
1. Parametric models
𝑷(𝑰|𝒒, 𝜽) of intensity
distributions for emphysema
and normal tissue classes.
2. Enforces spatial coherence
via 𝑷𝒒 𝒒 ;
𝝁𝑵
𝝁𝑬
Scanner-specific
Subject-specific
Method Labeling the unsupervised localized texture patterns on large datasets of cardiac CT scans
CNN model to classify ROIs from synthetic cardiac CT scans 2
35
Cardiac CT Scan
Full-lung CT Scan
Synthetic Cardiac CT Scan
36×36×8
Convolution (Conv)kernel size = 3×3×3stride = 1
32@36×36×8
Rectified linear unit
(ReLU)
32@18×18×4 48@18×18×4 48@9×9×2
Fully-connected (FC)
…
… …
…
128 128
Softmax
sLTP
Label…
…
Input: ROIs from synthetic cardiac CT scans
kernel number@ feature map size
Feature output: Operators:
Max-poolingkernel size = 3×3×3stride = 2
Method Labeling the unsupervised localized texture patterns on large datasets of cardiac CT scans
CNN model to classify ROIs from synthetic cardiac scans 2
NN1: feature extractor NN2: image classifier
36
ROIs ! " :
from synthetic
cardiac scans
ROIs ! #:
from real
cardiac scans
…
… ……
… …
NN1: feature extractor
NN3: domain discriminator
NN2: image classifier
sLTP label
… …
domain label
36×36×8
36×36×8
Method Labeling the unsupervised localized texture patterns on large datasets of cardiac CT scans
Unsupervised domain adaptation with adversarial
learning (UDAA) to classify ROIs from real cardiac scans 3
𝑳𝒕𝒐𝒕𝒂𝒍 = 𝑳𝒄𝒍𝒂𝒔𝒔 − 𝜶 ∙ 𝑳𝒅𝒐𝒎𝒂𝒊𝒏 Loss function:
𝑳𝒄𝒍𝒂𝒔𝒔 = − 𝒚𝒄 ∙ 𝐥𝐨𝐠 (𝒚 𝒄)𝑵𝒔𝑳𝑻𝑷
𝒄=𝟏
𝑳𝒅𝒐𝒎𝒂𝒊𝒏 = − 𝒚𝒅 ∙ 𝐥𝐨𝐠 (𝒚 𝒅) − (𝟏 − 𝒚𝒅) ∙ 𝐥𝐨𝐠(𝟏 − 𝒚 𝒅) 37
Source domain 𝑺
Target domain 𝑻
Data - MESA Lung Study: • N = 6,814 subjects in Exam 1 – Exam 5 (2000 – 2012)
• Exam 1-4: two repeated cardiac CT scans per visit;
38
Experimental Results: HMMF-based Emphysema Segmentation Labeling the unsupervised lung texture patterns on large datasets of cardiac CT scans
%emph in first scan
0 10 20 30 40 50
%em
ph
in
rep
eate
d s
can
0
10
20
30
40
50%emph
-950 ICC=0.980
%emph-950G
ICC=0.981
%emphHMMF
ICC=0.986
%emph-950
%emph-950G
%emphHMMF
%emph
%emph
Higher intra-class correlation
(ICC) on repeated cardiac scans
in Exam 1- 4 (N=9,621)
TPFN
FP%emph
10 20 30 40
Dic
e
0
0.2
0.4
0.6
0.8
1
%emph
10 20 30 40
Dic
e
0
0.2
0.4
0.6
0.8
1
%emph-950
Dice=0.39
%emph-950G
Dice=0.40
%emphHMMF
Dice=0.61
%emph
10 20 30 40
Dic
e
0
0.2
0.4
0.6
0.8
1
%emph
10 20 30 40
Dic
e
0
0.2
0.4
0.6
0.8
1
%emph-950
Dice=0.39
%emph-950G
Dice=0.40
%emphHMMF
Dice=0.61
%emph-950
%emph-950G
%emphHMMF
%emph-950
%emph-950G
%emphHMMF
%emph
Higher Dice overlap
on repeated cardiac scans in
Exam 1- 4 (Ndisease=471)
%emph-950
%emph-950G
TP
FN
FP
%emphHMMF
Example of
emphysema spatial
overlap:
HMMF => less FN
and less FP.
Data - MESA Lung Study: • N = 6,814 subjects in Exam 1 – Exam 5 (2000 – 2012)
• Longitudinal cardiac scans in Exam 1-4 and full-lung scan in Exam 5.
39
Experimental Results: HMMF-based Emphysema Segmentation Labeling the unsupervised lung texture patterns on large datasets of cardiac CT scans
Higher pairwise Pearson’s
correlation 𝒓 on longitudinal
cardiac scans within 18
months (Nnormal=478).
%emph cardiac Exam1
0 10 20 30
%em
ph
ca
rdia
c E
xa
m 2
-4
0
5
10
15
20
25
30%emph
-950 r=0.80
%emph-950C
r=0.83
%emphHMMF
r=0.87
(d)
Steadier emphysema longitudinal
progression (Nnormal=87; Ndisease=238)
• ∆(𝑡) = %𝑒𝑚𝑝ℎ(𝑡) − %𝑒𝑚𝑝ℎ(𝑏𝑎𝑠𝑒𝑙𝑖𝑛𝑒)
40
Experimental Results – UDAA-based Lung Texture Learning Labeling the unsupervised lung texture patterns on large datasets of cardiac CT scans
ROI-level validation accuracy for sLTP labeling and
domain classification in MESA Exam 5 (N=7,816 ROIs)
𝐴𝐶𝐶𝑐𝑙𝑎𝑠𝑠 = sLTP classification accuracy on synthetic ROIs
𝐴𝐶𝐶𝑑𝑜𝑚𝑎𝑖𝑛 = domain classification accuracy of real vs. synthetic ROIs
CNN = training NN1+ NN2 without domain discriminator NN3
UDAA = unsupervised domain adaptation with adversarial training
Slightly lower
sLTP classification
accuracy
Much lower (better)
domain classification
accuracy
ROI-Level Evaluation:
• Synthetic cardiac ROIs in Exam 5 and real
cardiac ROIs in Exam 1-5
36×36×8
Convolution (Conv)
32@36×36×8
ReLU
32@18×18×4 48@18×18×4 48@9×9×2
Fully-connected (FC)
…
… …
…
128 128
Softmax
LTP
Label
…
…
Input: ROIs fromsynthetic cardiac CT scans
# of kernels@feature size Max-pooling
Input 1: ROIs from synthetic cardiac scans
Input 2: ROIsfrom real cardiac scans
…
… ……
… …
) ) * : feature extractor
) ) + : domain discriminator
) ) , : image classifier
LTP label
… … domain label
36×36×8
36×36×8
Cardiac Scan
Full-lung Scan
Synthetic Cardiac Scan
Intensity Image LTP Labeling Map
(A) (B)
(C)
Fig. 1: Illustration of the UDAA framework: (A) Generation of synthetic cardiac CT scans; (B) CNN architecture for sLTP labeling on synthetic cardiac CT
scans; (C) Domain adaptation component to learn discriminative image features between synthetic and real cardiac scans.
Table 1: MESA cardiac CT examsalong with splits used to train thedomain adaptation module.
Exam (Year) Ex1 (2000-2002) Ex2 (2002-2004) Ex3 (2004-2005) Ex4 (2005-2008) Ex5 (2010-2012)
# of Helical scans ( train | val | test ) - | - | - - | - | - - | - | - - | - | - 1,146 | 422 | 414
# of MDCT scans ( train | val | test ) 934 | 329 | 2,245 408 | 150 | 983 423 | 151 | 699 152 | 54 | 241 - | - | -
# of EBT scans ( train | val | test ) 748 | 260 | 2,167 271 | 102 | 967 459 | 152 | 772 245 | 83 | 380 - | - | -
Total # of scans evaluated 6,683 2,881 2,656 1,155 1,982
mizes sLTPclassification loss, as:
L t ot al = L cl ass − ↵L dom ai n (1)
where ↵ is a positive weight that defines the relative impor-
tance of the domain-adaptation task for the sLTP classifier.
During training, ↵ is initiated at 0 and is gradually increased
up to ↵m ax using the following schedule [12]:
↵ =2 · ↵m ax
1 + exp(− γ ·p)− 1 (2)
where γ was set to 10, and p is the training progress, lin-
early increasing from 0 to 1. This strategy allows the N Nd
to be less sensitive to noisy signal at the early training
stages. The weight ↵m ax is determined by maximizing the
ACCt ot al = ACCcl ass − ACCdom ai n metric in the valida-
tion set, where ACCcl ass is the sLTP classification accuracy
and ACCdom ai n is the domain classification accuracy.
2.3.2. Domain Discrimination in Longitudinal Setting
While the domain discriminator does not require registered
inputs in DS and DT , there is a risk that the learning process
isbeing driven by population-differences rather than domain-
differences if the sampling is not regulated. We therefore en-
force sampling of ROIs per training batch to come from the
samesubjects and similar locations, matching therelativedis-
tance vectors d i between the center voxels of ROIs x i to the
lung mask bounding box (hence not using fine registration).
In our longitudinal setting, N Nd needs to discriminate syn-
thetic cardiac ROIs in MESA Exam 5 from real cardiac ROIs
in earlier exams. We further constrain the ROIs sampling
for training N Nd, such that the percent emphysema differ-
ence is less than 5% for two ROIs x i 2 S and x j 2 T , if
x i and x j come from same subjects longitudinal scans with
|d i − d j | < 0.1. This excludes pairing ROIs with drastic
changes in inflation level or emphysema progression, which
may introduce some bias when training N Nd.
3. RESULTS
3.1. Exper imental Setting
We use all full-lung HRCT scans and cardiac CT scans in
MESA that have ULN values of percent emphysema [11].
Thisresulted in N = 2,837 subjects in full-lung Exam 5, which
we randomly divide into training St r ai n , validation Sval and
test St est sets with a ratio of 3:1:1. In longitudinal cardiac
exams, subjectsbelonging to St r ai n and Sval areused to opti-
mizetheUDAA framework, whilesubjectsbelonging to St est
and subjects not having HRCT full-lung Exam 5 (i.e. un-
seen during training and validation) are used as longitudinal
test sets to evaluate the sLTP labeling performance. Thefinal
number of cardiac scans evaluated in this study is reported in
Table 1. To measure the influence of the domain-adaptation
task in the UDAA module, we also test a source-only CNN
training (i.e. training only N N f and N Nc using the source
synthetic scans, and applying thetrained model to real cardiac
scans). In validation steps, we measure ACCdom ai n for the
source-only CNN model by adding adomain classifier similar
to N Nd, but setting its gradient to zero when backpropagat-
ing to N N f , so that thedomain classifier doesnot imposeany
effect on the feature extractor.
Data - MESA Lung Study
41
Individual-level reproducibility of sLTP labeling on
longitudinal scan pairs in MESA acquired within a time lapse
<= 48 months
𝑁𝑝 = number of scan pairs;
𝑁𝑘 = number of sLTPs present;
𝜒2 Distance = average 𝜒2 distance
between sLTP histograms in pairs of scan;
Correlation = Spearman’s correlation
between %sLTP, reporting mean, min and
max among sLTPs.
UDAA has generally higher
consistency:
• Significantly better when
scanner type changes.
Experimental Results – UDAA-based Lung Texture Learning Labeling the unsupervised lung texture patterns on large datasets of cardiac CT scans
Individual-Level Evaluation of sLTP
Labeling in MESA Exam 1-5
Summary Unsupervised learning of localized texture patterns for pulmonary emphysema
Robust emphysema segmentation on cardiac CT scans:
• Scanner-specific and subject-specific parameterization.
Consistent lung texture learning on cardiac CT scans:
• Unsupervised domain adaptation with adversarial training;
• Leads to domain invariant feature learning.
Enables:
• Large-scale multi-site longitudinal studies over 10 years of follow-up;
• May facilitate future study for better understanding of the disease progression.
Summary - Contributions and Impact
43
• Unsupervised learning of localized emphysema texture patterns:
• Novel lung shape spatial mapping = a useful tool to study spatial patterns on lung CT.
• Novel discovery of 10 highly-reproducible sLTPs and 6 clinically-significant QES:
• May facility disease understanding and personalized therapy.
• Labeling emphysema texture on cardiac CT scans:
• Robust emphysema segmentation on cardiac CT scans;
• Novel lung texture labeling with domain adaptation on cardiac CT scans:
• Enable usage of widely available cardiac CT scans.
44
10/4 /19, 12)35 PMAltmet ric – Associat ion Between Long- term Exposure to Ambient Air…and Change in Quant itat ively Assessed Emphysema and Lung Funct ion
Page 1 of 14ht tps:// jamanetwork.altmet ric.com/details/64927651/news
! Share
1916
MORE…
? About this Attention Score
In the top 5% of all research
outputs scored by Altmetr ic
Mentioned by
195 news outlets
12 blogs
417 tweeters
6 Facebook pages
1 Redditor
What is this page?
Association Between Long-term Exposure to Ambient Air
Pollution and Change in Quantitatively Assessed
Emphysema and Lung FunctionOverview of attention for article published in JAMA: Journal of the American Medical Association, August 2019
? So far, Altmetric has seen 215 news stor ies from 195 outlets.
Showing items 1–100
The Harmful Effects of Poor Air Quality
Technology Networks, 23 Sep 2019
A recent study published in The Journal of the American Medical Association
(JAMA) found that long-term exposure to even…
August 2019 Brie fing - Pulmonology
Physician's Briefing, 03 Sep 2019
Here are what the editors at HealthDay consider to be the most important
developments in Pulmonology for August 2019.
August 2019 Brie fing - Radiology
Physician's Briefing, 03 Sep 2019
Here are what the editors at HealthDay consider to be the most important
developments in Radiology for August 2019.
Polusi udara kota ternyata t idak cuma berdampak
pada pernapasan
BBC News, 02 Sep 2019
Hak atas foto Getty Images Kaitan antara polusi perkotaan dan penyakit
pernapasan mendorong banyak seruan untuk membersihkan…
EPA-Funded Research: Climate Change Will Worsen
Lung Disease for Americans
Rocket News, 21 Aug 2019
E.A. Crunden Environment, World By as much as a pack of cigarettes a day.
Air pollution, especially one type that is worsening…
News
References
• A. E. Anderson, J. Hernandez, P. Eckert, and A. G. Foraker. Emphysema in lung macrosections correlated with smoking habits. Science, 144(3621):1025–1026, 1964.
• S. G. Armato III, G. McLennan, L. Bidaut, M. F. McNitt-Gray, C. R. Meyer, A. P. Reeves, B. Zhao, D. R. Aberle, C. I. Henschke, and E. A. Hoffman. The lung image database consortium (LIDC) and image database resource
initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical Physics, 38(2):915–931, 2011.
• D. E. Bild, D. A. Bluemke, G. L. Burke, R. Detrano, A. V. D. Roux, A. R. Folsom, P. Greenland, D. R. Jacobs Jr, R. Kronmal, and K. Liu. Multi-ethnic study of atherosclerosis: objectives and design. American Journal of
Epidemiology, 156(9):871–881, 2002.
• D. Couper, L. M. LaVange, M. Han, R. G. Barr, E. Bleecker, E. A. Hoffman, R. Kanner, E. Kleerup, F. J. Martinez, P. G. Woodruff, et al. Design of the subpopulations and intermediate outcomes in COPD study (SPIROMICS).
Thorax, pages thoraxjnl–2013, 2013.
• Y. Häme, E. D. Angelini, M. A. Parikh, B. M. Smith, E. A. Hoffman, R. G. Barr, and A. F. Laine. Sparse sampling and unsupervised learning of lung texture patterns in pulmonary emphysema: MESA COPD study. In 2015 IEEE
12th International Symposium on Biomedical Imaging (ISBI), pages 109–113. IEEE, 2015.
• Y. Hame, E.D. Angelini, E.A. Hoffman, R.G. Barr, and A.F. Laine, 2014. Adaptive quantification and longitudinal analysis of pulmonary emphysema with a hidden markov measure field model. IEEE transactions on medical
imaging, 33(7), pp.1527-1540.
• E. A. Hoffman, R. Jiang, H. Baumhauer, M. A. Brooks, J. J. Carr, R. Detrano, J. Reinhardt, J. Rodriguez, K. Stukovsky, N. D. Wong, et al. Reproducibility and validity of lung density measures from cardiac CT scans - the Multi-
Ethnic Study of Atherosclerosis (MESA) Lung Study. Academic radiology, 16(6):689–699, 2009.
• S. Hu, E. A. Hoffman, and J. M. Reinhardt. Automatic lung segmentation for accurate quantitation of volumetric X-ray CT images. IEEE Transactions on Medical Imaging, 20(6):490–498, 2001.
• Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, and V. Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016.
• L. Gorelick, M. Galun, E. Sharon, R. Basri, and A. Brandt, 2006. Shape representation and classification using the poisson equation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(12), pp.1991-2005.
• Y. Masutani, K. Masamune, and T. Dohi. Region-growing based feature extraction algorithm for tree-like objects. In Visualization in Biomedical Computing, pages 159–171. Springer, 1996.
• K. Murphy, J. P. Pluim, E. M. van Rikxoort, P. A. de Jong, B. de Hoop, H. A. Gietema, O. Mets, M. de Bruijne, P. Lo, and M. Prokop. Toward automatic regional analysis of pulmonary function using inspiration and expiration
thoracic CT. Medical Physics, 39(3):1650– 1662, 2012.
• M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118–1123, 2008.
• R. L. Siegel, K. D. Miller, and A. Jemal. Cancer statistics, 2016. CA: A Cancer Journal for Clinicians, 66(1):7–30, 2016.
• B. M. Smith, J. H. Austin, and et al. Pulmonary emphysema subtypes on computed tomography: the MESA COPD study. The American Journal of Medicine, 127(1):94.e7–23, 2014.
• S. J. Swensen, M. D. Silverstein, D. M. Ilstrup, C. D. Schleck, and E. S. Edell. The probability of malignancy in solitary pulmonary nodules: application to small radiologically indeterminate nodules. Archives of internal
medicine, 157(8):849–855, 1997.
• M. A. Thomashow, D. Shimbo, M. A. Parikh, E. A. Hoffman, J. Vogel-Claussen, K. Hueper, J. Fu, C.-Y. Liu, D. A. Bluemke, C. E. Ventetuolo, et al. Endothelial microparticles in mild chronic obstructive pulmonary disease and
emphysema. The Multi-Ethnic Study of Atherosclerosis Chronic Obstructive Pulmonary Disease study. American journal of respiratory and critical care medicine, 188(1):60– 68, 2013.
• M. Varma, and A. Zisserman, 2005. A statistical approach to texture classification from single images. International Journal of Computer Vision, 62(1), pp.61-81.
• J. Yang, E. D. Angelini, B. M. Smith, J. H. Austin, E. A. Hoffman, D. A. Bluemke, R. G. Barr, and A. F. Laine. Explaining Radiological Emphysema Subtypes with Unsupervised Texture Prototypes: MESA COPD Study. In
International MICCAI Workshop on Medical Computer Vision. Springer, 2016.
• B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
45