statistical connectomics and clinical applications materials/zalesky_stat... · and clinical...

Andrew Zalesky [email protected]

Statistical Connectomics and Clinical Applications

Overview

1. Connectome thresholding 2. Connectome sensitivity and specificity 3. Comparing connectomes 4. Clinical applications

5. Q & A / Discussion

Chapter 11

Topic 1 Connectome thresholding

1. Global thresholding

• Weight-based thresholding

• Density-based thresholding

• Consensus thresholding

2. Local thresholding

• Minimum spanning tree

• Disparity filter

• Multi-scale methods

• Remove spurious connections, thereby improve specificity

• Emphasize topological properties • Simplify graph analysis

Benefits of thresholding

Weight-based thresholding

Unthresholded matrix, 𝐶𝑖𝑗

Thresholded, 𝐴𝑖𝑗

Binarized, 𝐵𝑖𝑗

𝐴𝑖𝑗 = 𝐶𝑖𝑗 if 𝐶𝑖𝑗 > 𝜏

0 otherwise

𝐵𝑖𝑗 = 1 if 𝐶𝑖𝑗 > 𝜏

0 otherwise

Density-based thresholding • Same connection density for all individuals (but different 𝜏)

Patient Control Density = 53% 𝜏 = 0.20

Density = 75% 𝜏 = 0.20

Density = 20% 𝜏 = 0.31

Density = 20% 𝜏 = 0.42

Weigh

t-based

D

ensity-b

ased

Connectivity strength Nu

mb

er

of

con

ne

ctio

ns

Fornito et al, 2013

Consensus-based & group thresholding

de Reus & van den Heuvel et al, 2013 ; Mišic et al, 2015

• Eliminate connections that are not consistently identified in at least 𝑚 of 𝑁 individuals

1

𝑁 𝟏𝐶𝑖𝑗 𝑛 ≥𝜏𝑛

< 𝑚

𝐶𝑖𝑗 𝑛 ← 0 ∀𝑛

Group threshold

Individual threshold

𝑛 indexes individuals

Assumptions made by global thresholding

1. Assumption: Thresholds and connection densities are typically chosen arbitrarily

• Alternatives: Area under curve 2. Assumption: Weak connections are spurious

connections

• Is a weak connection spurious, or is a weak connection simply a weak connection?

• Connection strength versus connection probability

Minimum spanning tree (MST)

• Global thresholding can result in network fragmentation

• Step 1: Find MST

• Step 2: Add 𝑘-strongest neighbours of each node to MST

• Is this contrived? Forcing a property onto the network

Alexander-Bloch et al, 2010

Disparity filter

Serrano et al, 2009; Foti et al, 2011

𝑿

1 2 𝒌

• Threshold each node independently

• Null hypothesis: node-wise normalized connection strengths are given by uniformly distributing 𝑘 − 1 points on the unit interval

⋯

0.8 0.1 0.1

0 0.5 1 What is the probability that the longest interval exceeds connection strength?

Trial 1

Trial 2

Trial 𝑁

Topic 2 Connectome sensitivity and specificity

• Methods for mapping connectomes are imperfect

• False positives (FPs) and false negatives (FNs)

• Intuition for why FPs are most detrimental

• Asymptotic analysis

• Streamline filters (SIFT, LiFE)

Images by O’Donnell & Westin

False positives (FPs) and false negatives (FNs)

FP

FN

Erroneous Correct

Tractography

Tract tracer

False positive rate

Tru

e p

osi

tive

rat

e

Receiver operating characteristic (ROC curve)

Calabrese et al, 2015

Deterministic Probabilistic

Healthy male, 32 years; 𝑏 = 3000 s/mm2; 64 dx; 3T

116 AAL regions

23

%

>80

%

Det

erm

inis

tic

P

rob

abili

stic

False positive rate Tr

ue

po

siti

ve r

ate

Patient-control comparisons

Neurosurgical planning

Graph analysis

FPs or FNs: which are most detrimental?

• Depends on the application

2:1 rule of thumb

Connectome specificity is at least twice as important as connectome sensitivity to accurate estimation of:

• Network modularity (𝑄-index)

• Clustering coefficient

• Network efficiency

• Many other measures

False positive rate

Tru

e p

osi

tive

rat

e

Zalesky et al, 2016

Ground truth

network

What is meant by “FPs twice as bad as FNs”?

Network perturbed with FPs

Network perturbed with FNs

Add 10 FPs Add 10 FNs

𝐶𝐺𝑇 𝐶𝐹𝑃 𝐶𝐹𝑁

𝐶𝐹𝑃:𝐹𝑁 =Δ𝐹𝑃 𝐶

Δ𝐹𝑁 𝐶

Δ𝐹𝑃𝐶 = 𝐶𝐺𝑇 − 𝐶𝐹𝑃 Δ𝐹𝑁𝐶 = 𝐶𝐺𝑇 − 𝐶𝐹𝑁

Why are FPs so detrimental?

FNs are intra-modular FPs are inter-modular

Nodes per module, n

Clustering coefficient FP:FN Network efficiency FP:FN

lim𝑛→∞𝐶𝐹𝑃:𝐹𝑁 = 2 lim

𝑛→∞𝐸𝐹𝑃:𝐹𝑁 = ∞

Asymptotic analysis: clustering and efficiency

A B

Qazi et al, 2009

Streamline filters: (SIFT, LiFE)

SIFT: Smith et al, 2012; LiFE: Pestilli et al, 2014

• Post-processing filter applied to set of streamlines to reduce tractography biases

• Use cautiously in patient-control studies

Fiber 1: 100 streamlines


Healthy Control



Patient

Pathology

Topic 3 Comparing connectomes 1. Measures

• Global descriptors of topology

• Nodal (regional) measures

• Edge-based measures

2. Statistical inference on connectomes

• Generic methods

• Network-specific and multivariate methods

3. Clinical applications and software

• Modularity • Small-world organization • Rich club, etc.

• Node degree • Connection

density

Level 3: Complex topology

Level 1: Connectivity strength (e.g. streamline count)

Level 2: Low-level topological descriptors

What can be compared between connectomes?

Level 3: Complex topology

Level 1: Connectivity strength (e.g. streamline count)

Level 2: Low-level topological descriptors

𝑂 1

𝑂 𝑁

𝑂 𝑁2

𝑁: number of nodes

Multiple comparisons and localizing power

𝑝 = 0.04

𝑝 = 0.04

𝑝 = 0.67

𝑝 = 0.7

• Mass univariate testing: independently test hypothesis at each of thousands of connections

Generic methods for multiple testing correction

• Bonferroni correction – reject null hypothesis for any edge with 𝑝-value satisfying

𝑝 ≤𝛼

𝑀

Total number of edges tested

Family-wise error rate

• False discovery rate

𝐹𝐷𝑅 = 𝐸𝐹𝑃

𝐹𝑃 + 𝑇𝑃

Number of false positives

Number of true positives

Expectation operator

• Step 1. Sort 𝑝-values from smallest to largest

𝑝(𝑗) = [0.002, 0.02, 0.3, 0.4, 0.8]

Let 𝑝(𝑗) denote the 𝑗th smallest 𝑝-value

• Step 2. Identify the largest 𝑗 such that:

𝑝(𝑗) ≤𝑗𝛼

𝑀

Total number of edges tested

Desired FDR

• Step 3. Reject the null hypothesis for 𝑝1, … , 𝑝(𝑗∗)

𝑗𝛼

𝑀 = 0.01 0.02 0.03 0.04 0.05

𝑗 = 1 2 3 4 5

× ×

Benjamini-Hochberg procedure for FDR control

Rationale for network-based inference • Axonal fibres can provide conduits for propagation

of disease processes • Corollary: Pathological cascades and sub-networks

Seeley et al, 2009; Raj et al, 2015; Raj et al, 2012; Schmidt et al, 2015; Buldyrev et al, 2011

Network-based statistic (NBS)

Cluster of voxels in an image

Connected component in a network

Network-based statistic (NBS)

Patients Controls

Test-statistic matrix

Thresholded test-statistc

matrix

t-st

atis

tic

Component size

Freq

uen

cy

Null distribution

Significant sub-networks

Pati

ents

C

on

tro

ls

• If null hypothesis is true, distribution of test statistic is insensitive to permutation of patients and controls

Permutation testing

Pati

ents

C

on

tro

ls

Size of component

×

Largest component found in Permutation 1 has 𝑆𝑖𝑧𝑒 = 2

0 1 2 3 4 5

Permutation #1

Pati

ents

C

on

tro

ls

×

Largest component found in Permutation 1 has 𝑆𝑖𝑧𝑒 = 1

0 1 2 3 4 5

×

Size of component

Permutation #2

Size of connected component

×

5

× 4 3 2 1 0

× × × ×

×

×

×

× × × × × × × ×

× × ×

× × × × ×

×

×

×

𝑝 =#𝑝𝑒𝑟𝑚𝑢𝑡𝑎𝑡𝑖𝑜𝑛𝑠 ≥ 5

#𝑡𝑜𝑡𝑎𝑙 𝑝𝑒𝑟𝑚𝑢𝑡𝑎𝑡𝑖𝑜𝑛𝑠=3

5000

6

× ×

×

×

×

Permutation #5000

Extensions and alternative methods

• Spatial pairwise clustering (SPC) Zalesky et al, 2013

• NBS – TFCE Poster #3917

• Screening-filtering method Meskaldji etal, 2014

• Sum of powered score Kim et al, 2013

• Differential network detection Chen et al, 2015

Multivariate inference

• Inference at the level of individual patients

• Argument against mass univariate approaches: Complex interactions intrinsic to brain graphs cannot be reduced to isolated elements and processes

• Support vector machine (SVM) Dosenbach et al, 2010

• Multivariate distance matrix regression (MDMR) Sheehzad et al, 2014

• Canonical correlation analysis (CCA) and partial least squares (PLS) Smith et al, 2015

• Network-based sparse regression (network-based and fused lasso) Li & Li, 2008

Cannabis use

Schizophrenia

Amyotrophic lateral sclerosis (ALS)

Task-based functional connectivity

Depression

Clinical connectomics

Fornito et al, 2015

Modelling responses to pathological perturbation of brain networks

Soft

war

e f

or

con

ne

cto

me

infe

ren

ce

• CONN: functional connectivity toolbox https://www.nitrc.org/projects/conn/

• NBS: network-based statistic https://www.nitrc.org/projects/nbs/

• Graphvar https://www.nitrc.org/projects/graphvar/

• BCT: brain connectivity toolbox https://sites.google.com/site/bctnet/

• Connectome Viewer http://cmtk.org/viewer/

• GLG: graph theory GLM (MEGA LAB) https://www.nitrc.org/projects/metalab_gtg/

Modulating functional brain systems with non-invasive brain stimulation

Tuesday June 28

Afternoon Symposium: 14:45 – 16:00

statistical connectomics and clinical applications materials/zalesky_stat... · and clinical...

Documents