Batch effect correction: How do we compare against ComBat? Yalchin Oytam* & Fariborz Sobhanmanesh

Upload: australian-bioinformatics-network

Post on 10-May-2015

DESCRIPTION

The TBCP-funded global signals in genomic data project is developing methods and software to view and characterise large correlated signals in genomic data and, in the case of batch effects, also to correct for them. In the last year, we have developed Python-based software to quickly identify genomic signals, with the next phase being the characterisation of these signals. In parallel, we have finished development of a method to identify and remove batch effects which outperforms existing methods.

While we have several bodies of work in development, in this talk we will discuss in particular the performance and importance of the new batch effect removal algorithm. This new technique maximises the removal of the structured technical noise known as batch effects, with the constraint that the probability of overcorrection is kept to a fraction which is set by the end-user. This tunability allows control of overcorrection, defined as the removal of genuine biological variance as well as batch noise. Overcorrection should be minimised as it can lead to false positive results due to the artificial deflation of within-group variances.

Benchmarking across four datasets against ComBat, the leading currently used technique, we show this new method is far superior in balancing removal of batch noise while preserving biological signal. Additionally, the new method is able to leave largely unchanged one of the datasets which has no significant batch effect, whereas ComBat reduces the variance of that dataset by over 45%. For noise removal, we use “guided-PCA”, a recently published quantifier of batch effects, to show the probability of batch effects remaining in the data post correction. For signal preservation, we calculate in each case the proportion of the original variance which remains in the datasets after correction.

TRANSCRIPT

Page 1: Batch effect correction: How do we compare against ComBat? - Yalchin Oytam

Batch effect correction: How do we compare against ComBat?

Yalchin Oytam* & Fariborz Sobhanmanesh

Page 2

Synopsis

Batch Effects:

•Uncorrected (or under-corrected): detrimental reduction in power of test; distortion to multiplicity correction

•Over-corrected: false positives; distortion to multiplicity correction

Novel method, which:

•Quantifies the probability of under/over correction

•Enables the experimenter to choose confidence/risk (p-value) as constraint for batch removal

AIM: Benchmark the novel method against ComBat

Summary:

•Discuss batch effects

•Introduce performance criteria

•Compare the two methods

Page 3

Batch Effects?

•Definition

•Structured technical noise / distortion common to all replicates in a processing batch.

•Varies markedly from batch to batch.

•Pervasive and persistent, even under best practice.

•Not remediable by normalisation techniques.

•Typically accounts for 20-45% of the power in the measurement data!

Page 4

Impact of batch effects

          Rep1       Rep2       Rep3       Rep4
Treat1    t11 + B1   t12 + B2   t13 + B3   t14 + B4
Treat2    t21 + B1   t22 + B2   t23 + B3   t24 + B4
Treat3    t31 + B1   t32 + B2   t33 + B3   t34 + B4
Treat4    t41 + B1   t42 + B2   t43 + B3   t44 + B4
Treat5    t51 + B1   t52 + B2   t53 + B3   t54 + B4
Treat6    t61 + B1   t62 + B2   t63 + B3   t64 + B4
Control   c1 + B1    c2 + B2    c3 + B3    c4 + B4

•Differences between B1, B2, B3, and B4 inflate within-treatment variances, diminishing power of any between-treatment comparison test.

•Different genes are affected differently, distorting rank of p-values, and hence distorting multiplicity correction (FDR).
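The first point above can be illustrated with a small simulation. This is only a sketch under assumed numbers: the treatment means, batch offsets and noise level below are hypothetical and are not taken from the talk. It follows the layout of the table, where replicate j of every treatment carries the same batch term Bj.

```python
import numpy as np

rng = np.random.default_rng(0)

n_treatments, n_batches = 7, 4   # six treatments plus control, four replicates each
treatment_means = np.linspace(0.0, 3.0, n_treatments)   # hypothetical t_i values
batch_offsets = np.array([-2.0, -0.5, 0.5, 2.0])        # hypothetical B1..B4

# Replicate j of treatment i is processed in batch j: x_ij = t_i + noise_ij + B_j
noise = rng.normal(scale=0.3, size=(n_treatments, n_batches))
clean = treatment_means[:, None] + noise
observed = clean + batch_offsets[None, :]

# Within-treatment variance is the variance across the replicates in each row
var_clean = clean.var(axis=1, ddof=1).mean()
var_observed = observed.var(axis=1, ddof=1).mean()

print(f"mean within-treatment variance without batch terms: {var_clean:.3f}")
print(f"mean within-treatment variance with batch terms:    {var_observed:.3f}")
```

Because the batch terms differ between replicates of the same treatment, they add to the spread inside each row, which is the denominator of any between-treatment test statistic.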

“What if treatments are not distributed across batches?”

Page 5

Method: Principal Component Analysis

CSIRO Overcoming the challenges of multiplicity and batch effects
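The transcript preserves only the title of this slide. As a general reminder of what principal component analysis computes, here is a minimal NumPy sketch. The function name `pca`, the toy data and the size of the batch shift are all assumptions made for illustration, and the code is not the authors' software.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA of a samples-by-features matrix via SVD of the column-centred data."""
    centred = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # sample coordinates on the PCs
    explained = s**2 / np.sum(s**2)                   # fraction of variance per PC
    return scores, vt[:n_components], explained[:n_components]

rng = np.random.default_rng(1)
batch = np.repeat([0, 1], 10)        # hypothetical: 20 samples in two batches
data = rng.normal(size=(20, 50))     # hypothetical: 50 features of pure noise
data[batch == 1] += 2.0              # toy additive shift applied to the second batch

scores, loadings, explained = pca(data)
print("variance explained by PC1 and PC2:", np.round(explained, 3))
print("mean PC1 score per batch:",
      scores[batch == 0, 0].mean(), scores[batch == 1, 0].mean())
```

In this toy setting the batch shift is the largest source of variance, so the two batches separate along the first principal component.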

Page 6

Method: Principal Component Analysis


Page 7

A snapshot of batch correction software

Page 8

Benchmarking – ComBat vs Our Method

• Two dimensions: Noise Rejection and Signal Preservation

•Noise Rejection: guided PCA (gPCA), a third-party quantification of batch noise in data (Reese et al. 2013)

•Signal Preservation: ratio of data variance after batch correction to raw data variance

•Ideal: Reject all batch noise, without removing any biological variance.
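The two criteria above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' benchmarking code: `gpca_delta` follows our reading of the guided PCA statistic (variance along the first batch-guided component divided by variance along the first ordinary component, with a permutation p-value), the helper names are hypothetical, and the per-batch mean centring used as the "correction" is a naive stand-in that is neither ComBat nor the new method.

```python
import numpy as np

def variance_retained(corrected, raw):
    """Signal preservation: total variance after correction / raw total variance."""
    return corrected.var(axis=0, ddof=1).sum() / raw.var(axis=0, ddof=1).sum()

def gpca_delta(data, batch):
    """Variance on the first batch-guided PC relative to the first unguided PC."""
    x = data - data.mean(axis=0)
    labels = np.unique(batch)
    y = (batch[:, None] == labels[None, :]).astype(float)   # samples x batches indicator
    v_guided = np.linalg.svd(y.T @ x, full_matrices=False)[2][0]
    v_unguided = np.linalg.svd(x, full_matrices=False)[2][0]
    return np.var(x @ v_guided, ddof=1) / np.var(x @ v_unguided, ddof=1)

def gpca_pvalue(data, batch, n_perm=200, seed=0):
    """Permutation p-value: share of label shuffles with delta >= the observed delta."""
    rng = np.random.default_rng(seed)
    observed = gpca_delta(data, batch)
    permuted = np.array([gpca_delta(data, rng.permutation(batch))
                         for _ in range(n_perm)])
    return observed, float(np.mean(permuted >= observed))

# Toy data (assumed): 24 samples in 3 batches, 40 features, additive batch shifts
rng = np.random.default_rng(2)
batch = np.repeat([0, 1, 2], 8)
raw = rng.normal(size=(24, 40)) + rng.normal(scale=1.5, size=(3, 40))[batch]

# Naive stand-in correction: subtract each batch's own feature means
corrected = raw.copy()
for b in np.unique(batch):
    corrected[batch == b] -= corrected[batch == b].mean(axis=0)

delta_raw, p_raw = gpca_pvalue(raw, batch)
retained = variance_retained(corrected, raw)
print(f"gPCA delta (raw) = {delta_raw:.3f}, permutation p-value = {p_raw:.3f}")
print(f"proportion of raw variance retained after correction = {retained:.3f}")
```

A small gPCA p-value indicates that batch structure is present, while the retained-variance ratio shows how much of the original variance the correction left in place. Reading the two together is what distinguishes noise rejection from overcorrection.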

Page 9

Benchmarking – Cell Data

gPCA p-value for batch effect presence in raw data = 0.008

Page 10

Benchmarking – Animal Data

gPCA p-value for batch effect presence in raw data = 0.037

Page 11

Benchmarking – ComBat’s “Native” Dataset

gPCA p-value for batch effect presence in raw data = 0.225

Page 12

Benchmarking – ComBat’s “Native” Dataset

Page 13

Benchmarking – ComBat’s “Native” Dataset

Page 14

Benchmarking – ComBat’s “Native” Dataset

Page 15

Thank you

CAFHS/Genomics
Yalchin Oytam, Research Scientist
Phone: +61 2 9490 5077
Email: [email protected]

Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au

Acknowledgements: Konsta Duesing, Mike Buckley, Bill Wilson, Maxine McCall