rna-seq - bioinformatics differential gene expressi… · rna-seq quantification harm nijveen...

Post on 07-Jun-2020


RNA-seq

Quantification

Harm Nijveen

Differential expression

Which genes are higher/lower expressed between tissues, after treatment, etc.?
Differentially Expressed genes (DEGs) have an expression level that is significantly different between different conditions.

Is the expression of gene X different between root and leaf?

[Figure: expression of gene X in root and leaf, one sample each]

Based on one sample: perhaps…

[Figure: expression of gene X in root and leaf, several samples each]

Based on these samples: NO!

[Figure: expression of gene X in root and leaf, several samples each]

Based on these samples: YES!

Is a gene differentially expressed?

With only one measurement: impossible to say
We have to know the within-treatment variation
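A minimal sketch of why the within-treatment variation matters. The expression values are made up for illustration, and Welch's t statistic is used here only as a simple stand-in for a test; it is not the method the RNA-seq tools below use. Both comparisons have the same difference in mean (30), but only the one with little variation between replicates gives a large test statistic.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for two groups of replicate measurements."""
    var_a = statistics.variance(a)  # sample variance within group a
    var_b = statistics.variance(b)  # sample variance within group b
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Hypothetical expression values of gene X (three replicates per tissue).
root_noisy, leaf_noisy = [10, 60, 110], [40, 90, 140]   # large spread
root_tight, leaf_tight = [58, 60, 62], [88, 90, 92]     # small spread

t_noisy = welch_t(leaf_noisy, root_noisy)  # small: difference is lost in the noise
t_tight = welch_t(leaf_tight, root_tight)  # large: difference stands out
print(t_noisy, t_tight)
```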

Determining expression variation

Accurately determining the variation requires many biological samples (replicates)
Unfortunately in most cases we only have two or three replicates

Variation has to be estimated

Read count distribution

Poisson distribution: variance = mean
Holds for technical replicates

Negative binomial: variance > mean
Better fit for biological replicates

https://intro-prog-bioinfo-2012.wikispaces.com/
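The difference between the two distributions can be checked with a small simulation. This is a sketch under assumptions: the mean (20) and dispersion (0.5) are arbitrary, and the negative binomial is drawn as a gamma-Poisson mixture, which gives variance = mean + dispersion x mean^2. The helper names are hypothetical.

```python
import math
import random
import statistics

def sample_poisson(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method (fine for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_negative_binomial(mean, dispersion, rng):
    """Draw one count from a gamma-Poisson mixture (a negative binomial)."""
    # The Poisson rate itself varies between samples: gamma with shape 1/dispersion
    # and scale mean*dispersion, so its average equals `mean`.
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rate, rng)

rng = random.Random(1)
mean, dispersion, n = 20.0, 0.5, 20000
technical = [sample_poisson(mean, rng) for _ in range(n)]
biological = [sample_negative_binomial(mean, dispersion, rng) for _ in range(n)]

# Variance-to-mean ratio: close to 1 for Poisson, well above 1 for negative binomial.
poisson_ratio = statistics.variance(technical) / statistics.mean(technical)
nb_ratio = statistics.variance(biological) / statistics.mean(biological)
print(poisson_ratio, nb_ratio)
```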

Variance depends on the mean

Trapnell et al. 2012

Main assumption: Variance depends on the mean.
Objective: Find a function that best describes the relationship between the mean and variance.
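As an illustration of "finding a function", the sketch below assumes the negative binomial form variance = mean + alpha x mean^2 and estimates the single parameter alpha by least squares over all genes. This is an assumption made for the example, not necessarily the function fitted by Trapnell et al. or by any particular tool, and `fit_dispersion` is a hypothetical helper.

```python
def fit_dispersion(means, variances):
    """Least-squares estimate of alpha in variance = mean + alpha * mean**2."""
    # Minimising sum((v - m - alpha * m**2)**2) over alpha gives this closed form.
    numerator = sum((v - m) * m**2 for m, v in zip(means, variances))
    denominator = sum(m**4 for m in means)
    return numerator / denominator

# Hypothetical per-gene means, with variances generated from alpha = 0.1.
gene_means = [10.0, 100.0, 1000.0]
gene_variances = [m + 0.1 * m**2 for m in gene_means]

alpha = fit_dispersion(gene_means, gene_variances)
print(alpha)  # close to 0.1, the value used to generate the variances
```

With only two or three replicates the per-gene variances are noisy, which is why a shared function fitted across all genes helps.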

p-value

To find differentially expressed genes we can do a statistical test and determine a p-value.
p-value = 0.05 means that there is a 5% chance for a gene that is not differentially expressed to show an expression difference at least this large.
But: with 10,000 genes, i.e. 10,000 tests, you can expect 0.05 x 10,000 = 500 false positives!
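The arithmetic can be checked with a quick simulation. It assumes that none of the 10,000 genes is differentially expressed, in which case the p-values are uniformly distributed between 0 and 1.

```python
import random

rng = random.Random(0)
n_genes, cutoff = 10_000, 0.05

# Under the null hypothesis every p-value is a uniform random number in [0, 1).
null_p_values = [rng.random() for _ in range(n_genes)]

false_positives = sum(p < cutoff for p in null_p_values)
expected = cutoff * n_genes  # 0.05 x 10,000

print(false_positives, expected)  # the simulated count lands near the expected 500
```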

Multiple testing correction

We need to correct the p-value for doing a large number of tests.
We can use the False Discovery Rate (FDR), which produces an adjusted p-value called the q-value.
q-value = 0.05 means that about 5% of the genes called significant at this threshold are expected to be not differentially expressed.
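One common way to control the FDR is the Benjamini-Hochberg procedure. The slides do not say which procedure is meant, so treat this as an illustrative sketch; `benjamini_hochberg` is a hypothetical helper, and the four p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values never increase
    # when moving to a smaller raw p-value.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print(adjusted)  # approximately [0.02, 0.04, 0.04, 0.02]
```

Each raw p-value is multiplied by the number of tests and divided by its rank, so the smallest p-values receive the strongest correction.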

Some tools..

DESeq/DESeq2
edgeR
Sleuth (kallisto)
HISAT2/StringTie/Ballgown (can quantify isoforms)

Plotting DEGs

Volcano plot

x: log2(fold change)
y: -log10(p-value)

MA plot

x: mean expression
y: log2(fold change)

A dot represents one gene
Red dots are significant
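The coordinates of a single dot in each plot can be computed as follows. This is a sketch with hypothetical function names and made-up numbers; it assumes both mean expression values are positive (real tools deal with zero counts in their own way), and the x-axis of an MA plot is often drawn on a log scale.

```python
import math

def volcano_point(mean_a, mean_b, p_value):
    """(x, y) for a volcano plot: log2 fold change and -log10 p-value."""
    return math.log2(mean_b / mean_a), -math.log10(p_value)

def ma_point(mean_a, mean_b):
    """(x, y) for an MA plot: mean expression and log2 fold change."""
    return (mean_a + mean_b) / 2, math.log2(mean_b / mean_a)

# Hypothetical gene: mean expression 50 in root, 200 in leaf, p-value 0.001.
vx, vy = volcano_point(50, 200, 0.001)
mx, my = ma_point(50, 200)
print(vx, vy)  # fold change of 4 gives x = 2; p = 0.001 gives y close to 3
print(mx, my)  # mean expression 125, same log2 fold change of 2
```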

[Figures: volcano plots and MA plot. Red dots have p-adj < 0.01]

Schurch, N. J., P. Schofield, M. Gierliński, C. Cole, A. Sherstnev, V. Singh, N. Wrobel, K. Gharbi, G. G. Simpson and T. Owen-Hughes (2015). "Evaluation of tools for differential gene expression analysis by RNA-seq on a 48 biological replicate experiment." arXiv preprint arXiv:1505.02017.

[Figure axis: Log2(|Fold change|)]

Which are the interesting genes?

Highest fold change?
Lowest p-value?
Other?

Comparing multiple treatments

Time series, multiple tissues, etc.
Look for genes with a similar expression pattern
Using various kinds of clustering methods

Co-expression
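One simple way to score whether two genes have a similar expression pattern is the Pearson correlation between their profiles, which many clustering methods can use as a similarity measure. The sketch below uses made-up profiles over four time points; the gene names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical expression profiles across four time points.
profiles = {
    "geneA": [1, 2, 4, 8],
    "geneB": [10, 20, 40, 80],  # same pattern as geneA at a higher level
    "geneC": [8, 4, 2, 1],      # opposite pattern
}

r_ab = pearson(profiles["geneA"], profiles["geneB"])  # close to 1: co-expressed
r_ac = pearson(profiles["geneA"], profiles["geneC"])  # negative: opposite trend
print(r_ab, r_ac)
```

Correlation ignores the absolute expression level, so geneA and geneB count as co-expressed even though geneB is ten times higher.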

And now?

How do the ‘usual suspects’ behave?
Which biological processes are enriched?
Which pathways are enriched?
Depends on the biological question!
Continued this afternoon…
