20130701 Statistics Paper Study Group: Quantitative Methods for Analyzing Genetic Differences
Early changes of hepatitis B virus quasispecies during lamivudine treatment and the correlation with antiviral efficacy. J Hepatol. 2009 May;50(5):895-905.
20130701 Journal
Evaluation of genomic diversity and complexity
Advice on calculation from base frequency data
A viral quasispecies is a group of viruses related by a similar mutation or mutations, competing within a highly mutagenic environment.
Hepatitis B virus (HBV)
A 3.2 kb, double-stranded, circular DNA virus.
HBV is the second most common cause of hepatocellular carcinoma (HCC).
HCC is the third leading cause of cancer death in Japan.
Nucleoside analog (lamivudine) therapy is available, but drug resistance is often encountered.
Experimental design and result
Patients (n=25) were treated with lamivudine (anti-HBV drug).
DNA cloning was performed at baseline (0 weeks) and at 4 weeks.
Responder (n=14): low HBV-DNA; low AST, ALT
Non-responder (n=11): high HBV-DNA; high AST, ALT
What dynamic changes could occur during anti-viral therapy? ???
Experimental design and result (4 weeks)
Responder (n=14): low HBV-DNA; low AST, ALT
Non-responder (n=11): low HBV-DNA; high AST, ALT
→ Responders had low complexity and diversity; non-responders had high complexity and diversity.
↓ Responders were monoclonal; non-responders were polyclonal.
Evaluation of quasispecies heterogeneity
Complexity: Shannon entropy (Sn)
Diversity: genetic distance (d); synonymous substitutions per site (dS or Ks); non-synonymous substitutions per site (dN or Ka)
Example clone frequencies: 4/8, 2/8, 1/8
Evaluation of quasispecies heterogeneity
Complexity: Shannon entropy (Sn)
$$S_n = -\frac{\sum_i p_i \ln p_i}{\ln N}$$
$p_i$: frequency of clone i in the viral population
$N$: total number of clones
p <- c(1,2,4)/8
- sum(log(p)*p)/log(8)
[1] 0.4583333
Sn ~ 0: monoclonal
Sn ~ 1: polyclonal
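The calculation above can be wrapped in a small helper. This is a minimal sketch, not the paper's code: the function name `shannon_sn` and its arguments are hypothetical, and it takes clone counts plus the total number of clones N as on the slide. Note that the slide's example counts (1, 2, 4) are used as given, although they sum to 7 rather than 8.

```r
# Normalized Shannon entropy: Sn = -sum(p_i * ln p_i) / ln N
# counts : number of clones observed for each distinct sequence
# n_total: total number of clones N (assumed to default to sum(counts))
shannon_sn <- function(counts, n_total = sum(counts)) {
  p <- counts / n_total
  p <- p[p > 0]                      # treat 0 * log(0) as 0
  -sum(p * log(p)) / log(n_total)
}

sn_slide <- shannon_sn(c(1, 2, 4), n_total = 8)  # slide example, 0.4583333
sn_mono  <- shannon_sn(8)                        # one clone type only
sn_poly  <- shannon_sn(rep(1, 8))                # eight distinct clones
print(c(slide = sn_slide, monoclonal = sn_mono, polyclonal = sn_poly))
```

The two extra calls illustrate the end points of the scale: a single clone type gives 0 and eight distinct clones give 1.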
Evaluation of quasispecies heterogeneity
Diversity: genetic distance (d); synonymous substitutions per site (dS or Ks); non-synonymous substitutions per site (dN or Ka)
Hamming distance
A A T G C T
A C T G T T
Dist 2
Levenshtein distance
A A T G - C T
A C T G G T T
Transversion 1, Transition 1, Insertion 1 (any cost can be set)
Dist 3
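Both distances on this slide can be checked in base R. The sketch below is illustrative only: `hamming` is a hypothetical helper, and the Levenshtein distance uses `utils::adist`, which accepts separate costs for insertions, deletions, and substitutions but does not distinguish transitions from transversions. The cost values in the last call are arbitrary assumptions.

```r
# Hamming distance: number of mismatching positions between equal-length sequences
hamming <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  stopifnot(length(x) == length(y))
  sum(x != y)
}

d_hamming <- hamming("AATGCT", "ACTGTT")                 # slide example: 2

# Levenshtein distance with unit costs (substitution, insertion, deletion)
d_lev <- utils::adist("AATGCT", "ACTGGTT")[1, 1]          # slide example: 3

# Same comparison with an assumed higher cost for gaps
d_weighted <- utils::adist(
  "AATGCT", "ACTGGTT",
  costs = list(insertions = 2, deletions = 2, substitutions = 1)
)[1, 1]

print(c(hamming = d_hamming, levenshtein = d_lev, weighted = d_weighted))
```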
Evaluation of quasispecies heterogeneity
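Diversity: genetic distance (d); synonymous substitutions per site (dS or Ks); non-synonymous substitutions per site (dN or Ka)
Codon C G A, Arg (R), and its single-nucleotide neighbors:
1st position changed: A G A (synonymous mutation), G G A, T G A
2nd position changed: C A A, C C A, C T A
3rd position changed: C G G, C G C, C G T
Possible dS sites for C G A: 1/3, 0/3, 3/3
Methods: Miyata & Yasunaga; Nei & Gojobori; Li; Maximum Likelihood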
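The per-position fractions for C G A (1/3, 0/3, 3/3) can be reproduced by enumerating the nine single-nucleotide neighbors and checking which ones still encode the same amino acid. This is a minimal sketch assuming the standard genetic code; the names `aa` and `syn_fraction` are hypothetical, and this is only the site-counting step, not a full implementation of any of the listed methods.

```r
# Standard genetic code; codons ordered with bases T, C, A, G (third position fastest)
bases  <- c("T", "C", "A", "G")
grid   <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
codons <- paste0(grid$b1, grid$b2, grid$b3)
aa <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
names(aa) <- codons

# Fraction of the three possible changes at each codon position that are synonymous
syn_fraction <- function(codon) {
  sapply(1:3, function(pos) {
    alts  <- setdiff(bases, substr(codon, pos, pos))
    n_syn <- 0
    for (b in alts) {
      mutant <- codon
      substr(mutant, pos, pos) <- b            # single-nucleotide neighbor
      if (aa[[mutant]] == aa[[codon]]) n_syn <- n_syn + 1
    }
    n_syn / length(alts)
  })
}

frac_cga <- syn_fraction("CGA")
print(frac_cga)
```

For C G A the only synonymous change at the first position is A G A, and every change at the third position is synonymous.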
Evaluation of quasispecies heterogeneity
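Diversity: genetic distance (d); synonymous substitutions per site (dS or Ks); non-synonymous substitutions per site (dN or Ka)
Methods: Miyata & Yasunaga; Nei & Gojobori; Li; Maximum Likelihood
Li's method classifies each nucleotide site by degeneracy:
$L_0$: all changes are non-synonymous.
$L_2$: 1/3 of changes are synonymous.
$L_4$: all changes are synonymous.
$A_i$: number of transitions per site at $L_i$ sites
$B_i$: number of transversions per site at $L_i$ sites
Example codon C G A, Arg (R): C is an $L_2$ site (A G A is a synonymous mutation), G is an $L_0$ site, A is an $L_4$ site.
$$d_S = \frac{L_2 A_2 + L_4 A_4}{L_2 + L_4} + B_4$$
$$d_N = \frac{L_0 B_0 + L_2 B_2}{L_0 + L_2} + A_0$$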
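As a worked example of the two formulas, the sketch below evaluates them in R, assuming the Li (1993) form dS = (L2 A2 + L4 A4)/(L2 + L4) + B4 and dN = (L0 B0 + L2 B2)/(L0 + L2) + A0. The function names and all input values are hypothetical and chosen only to show the arithmetic; they are not data from the paper.

```r
# dS: transitions at 2-fold and 4-fold sites (length-weighted) plus transversions at 4-fold sites
li_dS <- function(L2, L4, A2, A4, B4) {
  (L2 * A2 + L4 * A4) / (L2 + L4) + B4
}

# dN: transversions at 0-fold and 2-fold sites (length-weighted) plus transitions at 0-fold sites
li_dN <- function(L0, L2, A0, B0, B2) {
  (L0 * B0 + L2 * B2) / (L0 + L2) + A0
}

# Hypothetical site counts and per-site substitution estimates
L0 <- 200; L2 <- 60; L4 <- 40
A0 <- 0.01; B0 <- 0.005
A2 <- 0.03; B2 <- 0.01
A4 <- 0.05; B4 <- 0.04

dS <- li_dS(L2, L4, A2, A4, B4)   # (1.8 + 2.0) / 100 + 0.04
dN <- li_dN(L0, L2, A0, B0, B2)   # (1.0 + 0.6) / 260 + 0.01
print(c(dS = dS, dN = dN, ratio = dN / dS))
```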
Evaluation of quasispecies heterogeneity
Diversity: genetic distance (d); synonymous substitutions per site (dS or Ks); non-synonymous substitutions per site (dN or Ka)
Methods: Miyata & Yasunaga; Nei & Gojobori; Li; Maximum Likelihood
Codon C G A, Arg (R): A G A is a synonymous mutation.
$$q_{ij} = \begin{cases} 0 & \text{more than one mutation} \\ \pi_j & \text{synonymous transversion} \\ \kappa \pi_j & \text{synonymous transition} \\ \omega \pi_j & \text{nonsynonymous transversion} \\ \kappa \omega \pi_j & \text{nonsynonymous transition} \end{cases}$$
$q_{ij}$: substitution rate from codon i to codon j.
$\pi_j$: equilibrium frequency of codon j.
$\omega = d_N / d_S$
$\kappa = T_s / T_v$, where $T_s$ is the transition rate and $T_v$ is the transversion rate.
Likelihood ratio test: $LR = 2 \ln \frac{L_{\omega\,\mathrm{estimate}}}{L_{\omega = 1}}$, compared with a $\chi^2$ distribution (df=1).
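The rate rule and the likelihood ratio test can be sketched in R as follows. This assumes the standard genetic code; `codon_rate` is a hypothetical helper, the parameter values and the two log-likelihoods are made up for illustration, and the sketch neither fits the model nor excludes stop codons as a real codon-model program would.

```r
# Standard genetic code; codons ordered with bases T, C, A, G (third position fastest)
bases  <- c("T", "C", "A", "G")
grid   <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
aa <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
names(aa) <- paste0(grid$b1, grid$b2, grid$b3)

# Off-diagonal rate q_ij for codons i and j (diagonal entries are not handled here)
codon_rate <- function(i, j, kappa, omega, pi_j) {
  x <- strsplit(i, "")[[1]]
  y <- strsplit(j, "")[[1]]
  d <- which(x != y)
  if (length(d) != 1) return(0)                 # more than one mutation
  purines <- c("A", "G")
  is_transition <- (x[d] %in% purines) == (y[d] %in% purines)
  rate <- pi_j
  if (is_transition) rate <- rate * kappa       # transition: multiply by kappa
  if (aa[[i]] != aa[[j]]) rate <- rate * omega  # nonsynonymous: multiply by omega
  rate
}

pi_j <- 1 / 61   # assumed equal equilibrium frequencies over sense codons
q_syn_tv    <- codon_rate("CGA", "AGA", kappa = 2, omega = 0.5, pi_j = pi_j)
q_syn_ts    <- codon_rate("CGA", "CGG", kappa = 2, omega = 0.5, pi_j = pi_j)
q_nonsyn_tv <- codon_rate("CGA", "CTA", kappa = 2, omega = 0.5, pi_j = pi_j)
q_nonsyn_ts <- codon_rate("CGA", "CAA", kappa = 2, omega = 0.5, pi_j = pi_j)
q_double    <- codon_rate("CGA", "AAA", kappa = 2, omega = 0.5, pi_j = pi_j)

# Likelihood ratio test with hypothetical log-likelihoods
lnL_estimated <- -1000   # omega estimated freely
lnL_null      <- -1002   # omega fixed at 1
LR <- 2 * (lnL_estimated - lnL_null)
p_value <- pchisq(LR, df = 1, lower.tail = FALSE)
print(c(LR = LR, p = p_value))
```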
Clustering
Sorry!!!
Experimental design and result (4 weeks)
Responder (n=14): low HBV-DNA; low AST, ALT
Non-responder (n=11): low HBV-DNA; high AST, ALT
→ Responders had low complexity and diversity; non-responders had high complexity and diversity.
↓ Responders were monoclonal; non-responders were polyclonal.
Genotype estimation from frequency?
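Cloning: full clone sequences with known clone frequencies (4/8, 2/8, 1/8)
A T T G C G A T G C A C T G C T
A T T C G G A T G C G C T G C T
A T T G C G A A G C G C T G C T
NGS: clone frequencies unknown (??, ??, ??)
A T T G C G A T G C A C T G C T
A T T ? ? G A T G C ? C T G C T
A T T G C G A ? G C ? C T G C T
Per-site base frequencies (bases A, G, C, T): 0.95 / 0.05, 0.98 / 0.02, 0.90 / 0.10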
Monte Carlo simulation?
Random sampling
Generate pseudosequence
Rare mutation ??
Linkage ??
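One way to picture the Monte Carlo idea is to draw each variable site independently from its base frequencies and assemble pseudosequences, then compute the Shannon entropy of the simulated clones. The sketch below is only an illustration under strong assumptions: the site positions, alternate bases, and the pairing of frequencies to sites are hypothetical, and sampling sites independently ignores linkage, which is exactly the open question raised on the slide.

```r
set.seed(1)

ref <- strsplit("ATTGCGATGCACTGCT", "")[[1]]   # majority sequence from the slide

# Hypothetical variable sites: position, alternate base, alternate-base frequency
sites <- list(
  list(pos = 4,  alt = "C", p_alt = 0.05),
  list(pos = 8,  alt = "A", p_alt = 0.02),
  list(pos = 11, alt = "G", p_alt = 0.10)
)

# Draw n pseudosequences, sampling each variable site independently (no linkage)
simulate_pseudoclones <- function(ref, sites, n) {
  replicate(n, {
    s <- ref
    for (v in sites) {
      if (runif(1) < v$p_alt) s[v$pos] <- v$alt
    }
    paste(s, collapse = "")
  })
}

clones <- simulate_pseudoclones(ref, sites, n = 100)

# Normalized Shannon entropy of the simulated clone population
p  <- as.numeric(table(clones)) / length(clones)
sn <- -sum(p * log(p)) / log(length(clones))
print(sort(table(clones), decreasing = TRUE))
print(sn)
```

With alternate-base frequencies this low, rare variants may not appear at all in a small sample, which is the "rare mutation" concern listed above.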
NGS: clone frequencies unknown (??, ??, ??)
A T T G C G A T G C A C T G C T
A T T ? ? G A T G C ? C T G C T
A T T G C G A ? G C ? C T G C T
Per-site base frequencies (bases A, G, C, T): 0.95 / 0.05, 0.98 / 0.02, 0.90 / 0.10