evolutionary trajectory analysis-atlanta-2010.graffle
[Figure 5 (B) graphic: phylogenetic tree of cluster 1 sequences, from A/Brevig Mission/1/1918 to A/California/04/2009. Tip labels give GenBank accession and strain name. Scale bar: 0.02.]
[Figure 5 (A) graphic: phylogenetic tree of cluster 2 sequences, including A/California/04/2009. Tip labels give GenBank accession and strain name. Scale bar: 0.02.]
Introduction
In late April and early May 2009, the first pandemic strain of the 21st century arrived in the form of the pandemic H1N1 2009 influenza virus. The virus was initially named "swine flu", and the origin of each of its segments raised many questions.
We began by examining the relationship between the nucleotide differences separating A/California/04/2009 from its top BLAST hits and the corresponding differences in isolation year. Although we expected a correlation between nucleotide difference and isolation year difference, we were surprised by the triangular pattern shown in Figure 1.
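The two quantities plotted in Figure 1 can be sketched as follows. This is a minimal illustration and not the code used for the poster: the function names, the two-digit-year pivot, and the rescaling of partial hits to the full reference length are all assumptions made for the example.

```python
import re


def isolation_year(strain_name, pivot=9):
    """Parse the isolation year from the last '/' field of a strain name.

    Two-digit years are expanded with a hypothetical pivot: values up to
    `pivot` are read as 20xx, all others as 19xx. Returns None if the last
    field is not a 2- or 4-digit number.
    """
    last = strain_name.strip().split("/")[-1]
    if not re.fullmatch(r"\d{4}|\d{2}", last):
        return None
    year = int(last)
    if len(last) == 4:
        return year
    return 2000 + year if year <= pivot else 1900 + year


def nucleotide_difference(reference, hit):
    """Count mismatches between two aligned sequences of equal length.

    Gap columns ('-') are skipped. The count is rescaled to the ungapped
    reference length so that short hits are comparable to full-length ones
    (an assumed normalization, which presumes a constant rate of change).
    """
    if len(reference) != len(hit):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for r, h in zip(reference.upper(), hit.upper()):
        if r == "-" or h == "-":
            continue
        compared += 1
        if r != h:
            mismatches += 1
    if compared == 0:
        return 0.0
    reference_length = sum(1 for r in reference if r != "-")
    return mismatches * reference_length / compared


# One point of the plot: (isolation year difference, nucleotide difference).
year_diff = isolation_year("A/California/04/2009") - isolation_year("A/swine/Iowa/15/1930")
```

Each BLAST hit would contribute one such pair, with the year difference on one axis and the normalized nucleotide difference on the other.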
Plot Results
Three clusters of sequences were noted:
• Cluster 1, diagonal triangle side labeled "1" in Figure 1
• Cluster 2, vertical triangle side labeled "2"
• Cluster 3, top triangle side labeled "3"
R. Burke Squires1, Elizabeth McClellan3, Jyothi Noronha1, Victoria Hunt1, Richard H. Scheuermann1,2
1Department of Pathology, 2Division of Biomedical Informatics, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas; 3Department of Statistics, Southern Methodist University, Dallas, Texas.
Evolutionary Trajectory Analysis of Pandemic H1N1 2009
Quantitate Alignments
To quantify the differences we observed in the alignments, we developed a metric that counts the number of amino acid changes at each position:
• Any change in an amino acid within a single column is counted.
• If an amino acid persists, the count is low, whereas amino acids that change often show a higher count.
This metric was consistently lower for cluster 1 segments (Table 1), which is consistent with the relative persistence of amino acid substitutions over time.
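A minimal sketch of such a column-wise count is given below. It assumes that the alignment rows are already sorted by isolation year and that a "change" is a difference between consecutive rows in the same column. The poster does not spell out these details, so both the function names and the counting rule are illustrative.

```python
def column_change_counts(alignment):
    """Count amino acid changes per column of a protein alignment.

    `alignment` is a list of equal-length aligned sequences sorted by
    isolation year. A change is counted whenever a residue differs from
    the residue in the previous row; gap positions ('-') are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    counts = [0] * length
    for previous, current in zip(alignment, alignment[1:]):
        for i, (a, b) in enumerate(zip(previous, current)):
            if a != "-" and b != "-" and a != b:
                counts[i] += 1
    return counts


def average_changes_per_strain(alignment):
    """Total number of counted changes divided by the number of strains."""
    if not alignment:
        return 0.0
    return sum(column_change_counts(alignment)) / len(alignment)


# A substitution that persists (K -> R once) versus one that flickers.
persistent = ["MKT", "MKT", "MRT", "MRT"]
flickering = ["MKT", "MRT", "MKT", "MRT"]
```

With these toy alignments, the persistent substitution is counted once and the flickering one three times, which is the behavior the bullets above describe.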
Table 3. Slopes of the evolutionary trajectory clusters and the corresponding mutation rates for influenza A virus segments.
Segment   ET Slope   SUR Slope   Exp. Determined Mutation Rate
PB2       6.9        24.9        4.3 (1)
PB1       7.7        26.9
PA        6.0        23.2
HA        5.7        28.8        5.7 (2)
NP        3.2        18.2        3.6 (3)
NA        3.9        23.1        3.2 (4)
M         1.4        5.6         1.5 (5)
NS        2.1        12.5        1.5 - 2.6 (6, 7)
Lastly, we compared the slopes of the trend lines of the ET and SUR clusters for each of the 8 segments to experimentally determined mutation rates and found that the slopes of the ET cluster, as shown in Table 3, closely match the published mutation rates.
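The comparison can be made explicit with a short calculation on the Table 3 values. The sketch below only covers the segments that have a published rate in the table, and the ratio used (slope divided by the nearest point of the published range) is an illustrative choice, not a statistic reported on the poster.

```python
# Values copied from Table 3: (ET slope, SUR slope, published rate low, high).
# PB1 and PA are omitted because Table 3 lists no published rate for them.
TABLE_3 = {
    "PB2": (6.9, 24.9, 4.3, 4.3),
    "HA": (5.7, 28.8, 5.7, 5.7),
    "NP": (3.2, 18.2, 3.6, 3.6),
    "NA": (3.9, 23.1, 3.2, 3.2),
    "M": (1.4, 5.6, 1.5, 1.5),
    "NS": (2.1, 12.5, 1.5, 2.6),
}


def ratio_to_published(slope, low, high):
    """Ratio of a slope to the nearest value in the published range.

    A slope that falls inside [low, high] gives a ratio of exactly 1.0.
    """
    nearest = min(max(slope, low), high)
    return slope / nearest


et_ratios = {
    seg: ratio_to_published(et, low, high)
    for seg, (et, sur, low, high) in TABLE_3.items()
}
sur_ratios = {
    seg: ratio_to_published(sur, low, high)
    for seg, (et, sur, low, high) in TABLE_3.items()
}

for seg in TABLE_3:
    print(f"{seg}: ET ratio {et_ratios[seg]:.2f}, SUR ratio {sur_ratios[seg]:.2f}")
```

For every listed segment the ET ratio stays below 2, while every SUR ratio exceeds 3, which is the pattern the paragraph above summarizes.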
Conclusion
We have shown that sequences similar to the pandemic H1N1 strain, as identified by BLAST analysis, fall into different categories based on other measures of sequence comparison. We propose that:
• Cluster 1 represents the true "Evolutionary Trajectory" (ET) of each segment of the A/California/04/2009 pandemic H1N1 2009 strain.
• Cluster 2 contains sequences that are "Similar, but Unrelated" (SUR).
• Cluster 3 contains sequences that are only moderately similar and also unrelated.
It has become apparent through this analysis that sequence similarity, in the absence of a time component, is not a stringent enough criterion to determine relationships among sequences.
Figure 4. Nucleotide difference versus isolation year difference between A/California/04/2009 and its BLAST hits, with the three clusters of sequences highlighted with different symbols.
Figure 2. Alignment of Evolutionary Trajectory (ET) strains, sorted by isolation year, which shows the persistence of amino acid changes (white background).
Figure 3. Alignment of Similar, but Unrelated (SUR) strains, sorted by isolation year, which shows a lack of persistence of amino acid changes (white background).
Table 1. Average number of amino acid changes per strain in alignments of clusters 1 and 2.
Figure 1. Nucleotide difference versus isolation year difference between A/California/04/2009 and the top 1000 BLAST hits (performed June 2009). The three sides of the triangular pattern are labeled 1, 2, and 3.
Alignments
To further examine the sequence relationships in the different strain clusters, we assembled a working set of sequences for each cluster in the Influenza Research Database (IRD) (www.fludb.org). We aligned representative sequences from strains in cluster 1:
• We sorted sequences by ascending isolation year (Figure 2).
• We observed a persistence of amino acid changes in the sequences over time.
We aligned representative sequences from strains in cluster 2 (Figure 3):
• We observed no persistence. Amino acid changes seemed to appear randomly.
Cluster 2
7.89
7.4
8.55
11.21
11.92
7.01
38.75
2.03
HA
M
Cluster 1
PB1
NA
PA
NS
5.09
2.22
3.28
12.18
5.19
5.53
NP
9.73
PB2
1.3
Segment
Phylogenetic Trees
Phylogenetic trees for cluster 1 and 2 sequences were generated:
• Trees are rooted at the earliest strain, with ordered subtrees.
• We observed a gradual progression of strains with short branch lengths in the cluster 1 tree.
In contrast, cluster 2 trees lacked the gradual progression of strains and appeared to possess longer branch lengths.
Tree Quantitation
We compared the average branch lengths of each tree and performed a t-test, with the results shown in Table 2.
The results confirm that cluster 1 shows a gradual accumulation of sequence alterations, as evidenced by relatively short branch lengths. In addition, the gradual transition in branches seemed to reflect the normal progression of time based on the year of isolation.
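The branch length comparison can be sketched with the standard library alone. The poster does not state which t-test variant was used, so the example below assumes Welch's unequal-variance form, and the branch lengths are made-up values for illustration only. Converting the statistic to a p-value requires the t distribution, for example through `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical branch lengths (substitutions per site) for two trees.
cluster_1_branches = [0.002, 0.003, 0.004, 0.003]
cluster_2_branches = [0.006, 0.009, 0.008, 0.007]

t_stat, dof = welch_t(cluster_1_branches, cluster_2_branches)
print(f"mean 1 = {mean(cluster_1_branches):.4f}, "
      f"mean 2 = {mean(cluster_2_branches):.4f}, t = {t_stat:.2f}, df = {dof:.1f}")
```

A negative statistic here means that the first tree has the shorter average branch length.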
2.21E-05 NS 6.42 2.87 9.13E-01 M 4.36 4.14 4.44E-01 NA 5.39 5.55
NP 1.50E-05 2.87 6.69 3.55 HA 4.42E-08 9.55
2.85 5.57 PA 2.93E-06 8.57 9.13E-11 6.92 PB1
3.47 PB2 4.23 1.84E-02
P-value Cluster 2 Avg. Cluster 1 Avg. Segment
Table 2. Analysis of branch lengths of the phylogenetic trees of clusters 1 and 2. T-test results show a low probability that the two groups are related (average values are in units of 10^-3).
Figure 5. (A) (Above) Phylogenetic tree of cluster 2, rooted at the earliest 2001 strain, with A/California/04/2009 indicated by the red arrow.
(B) (Left) Phylogenetic tree of cluster 1, with A/California/04/2009 indicated by the red arrow, rooted at A/Brevig Mission/1918 in the lower left, with ordered subtrees.
Overall Method
• BLAST each segment, returning the top 1000 hits. BLAST analysis was performed with all influenza segments in June 2009.
• Normalize for short sequence lengths, assuming a constant rate of evolution.
• Plot nucleotide differences versus isolation year differences.
• Determine a one-phase association equation to separate clusters 1 and 2.
• Align clusters 1 and 2.
• Quantitate the alignments by counting amino acid changes per column.
• Determine phylogenetic trees for clusters 1 and 2.
• Quantitate the trees by averaging branch lengths and performing a t-test.
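The cluster separation step can be illustrated with the usual form of a one-phase association curve, Y = Y0 + (Plateau - Y0)(1 - exp(-K x)). The sketch below is an assumption-laden example: the parameter values are hypothetical, and the rule that points above the curve belong to cluster 2 (more nucleotide differences than the elapsed time would predict) is an interpretation of the method, not a detail stated on the poster.

```python
from math import exp


def one_phase_association(x, y0, plateau, k):
    """Evaluate Y = Y0 + (Plateau - Y0) * (1 - exp(-K * x))."""
    return y0 + (plateau - y0) * (1.0 - exp(-k * x))


def assign_cluster(year_diff, nt_diff, y0, plateau, k):
    """Assign a point to cluster 1 or 2 relative to the boundary curve.

    Assumed rule: points above the curve are cluster 2 (SUR), and points
    on or below it are cluster 1 (ET).
    """
    boundary = one_phase_association(year_diff, y0, plateau, k)
    return 2 if nt_diff > boundary else 1


# Hypothetical boundary parameters, chosen only for illustration.
PARAMS = {"y0": 0.0, "plateau": 400.0, "k": 0.05}

# (isolation year difference, nucleotide difference) for two made-up hits.
points = [(10, 60.0), (10, 250.0)]
labels = [assign_cluster(x, y, **PARAMS) for x, y in points]
```

In practice the three parameters would be fitted to the plotted data rather than fixed by hand.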
1. Gorman et al. Evolution of influenza A virus PB2 genes: implications for evolution of the ribonucleoprotein complex and origin of human influenza A virus. The Journal of Virology (1990) vol. 64 (10).
2. Fitch et al. Long term trends in the evolution of H(3) HA1 human influenza type A. Proceedings of the National Academy of Sciences of the United States of America (1997) vol. 94 (15).
3. Shu et al. Analysis of the evolution and variation of the human influenza A virus nucleoprotein gene from 1933 to 1990. The Journal of Virology (1993) vol. 67 (5).
4. Xu et al. Genetic Variation in Neuraminidase Genes of Influenza A (H3N2) Viruses. Virology (1996) vol. 224 (1).
5. Saitou and Nei. Polymorphism and evolution of influenza A virus genes. Molecular Biology and Evolution (1986) vol. 3 (1).
6. Nobusawa and Sato. Comparison of the Mutation Rates of Human Influenza A and B Viruses. The Journal of Virology (2006) vol. 80 (7).
7. Parvin et al. Measurement of the mutation rates of animal viruses: influenza A virus and poliovirus type 1. The Journal of Virology (1986) vol. 59, pp. 377-383.