conserved epitope regions (cer): elucidation of ...€¦ · conserved epitope regions (cer):...

1
Introduction We utilized data from the Influenza Research Database (IRD) [4] (www.fludb.org), which integrates information from primary sources such as: NCBI’s Influenza Virus Database (IVD) [5] Immune Epitope Database (IEDB) (www.immuneepitope.org ) [2] RCSB Protein Databank (PDB) SNP/Consensus sequence generated through an internal IRD pipeline (adapted from Crooks et al. [6]) was used as well as custom script to quantify epitope coverage across each protein. Results Five CERs can be found in the HA protein (Figure 1B, Table 2). These regions are not only well conserved, but also include both B cell and T cell epitopes. For the most part, the HA CER are found in the stalk region of the protein, with only one of the five CER located at the base of the globular head. Similar CER are found in each of the influenza proteins with significant representation in the IEDB data (Figure 4B, Table 2). Conserved Epitope Regions (CER): Elucidation of evolutionarily stable, immunologically reactive regions of human H1N1 influenza viruses R. Burke Squires 1 , Brett Pickett 1 , Jyothi Noronha 1 , Victoria Hunt 1 , Richard H. Scheuermann 1 , 2 1 Department of Pathology, 2 Division of Biomedical Informatics, University of Texas Southwestern Medical Center, Dallas, Texas. Results Figure 1 (Top right): (A) Number of T cell, B cell and MHC Class I and II epitopes broken down by protein and host (human, mouse or other). For mouse and human, MHC epitopes are broken down for Class I and Class II; for other host they remain grouped together (MHC). (B) Influenza segment sequence variation as measured by polymorphism score (poly) plotted in comparison with epitope coverage. Sequence polymorphism scores were computed for human H1N1 subtype sequences using a formula adapted from Crooks et al [6] and downloaded from the IRD, with scores for proteins ranging from 0 (fully conserved) to 432 (all amino acids equally represented at a position). In order to visualize regions of sequence variation, an average polymorphism score was computed using a 5 amino acid sliding window. An epitope coverage score was computed by counting the number of epitopes that occur at each amino acid position. T cell and MHC binding epitope data shown are from human only, while B cell epitope data is from all host species. All epitope data was obtained from the Immune Epitope Database (IEDB) on August 30, 2010 [2]. Colored bars at the top of each chart show regions of high immunological activity but little sequence variation – Conserved Epitope Regions (CER). For HA, CER are colored as follows: HA-CER1 (magenta), HA-CER2 (yellow), HA-CER3 (blue), HA-CER4 (cyan), and HA-CER5 (red). (C) The protein structure of H1N1 HA (PDB ID: 1RUZ) showing CER regions colored as in (B). Regions in only one HA monomer were highlighted for clarity. Discussion The CERs for HA: Include all four highly cross-reactive epitopes predicted by Duvvuri, et al. [10] Are conserved in both seasonal H1N1 and the pandemic H1N1 2009 viruses. Overlap with 6 other epitopes found by Duvvuri et al. for 10 of 18 (55.6%) epitopes in total. Conclusion References 1. Both, G.W., et al., Antigenic drift in influenza virus H3 hemagglutinin from 1968 to 1980: multiple evolutionary pathways and sequential amino acid changes at key antigenic sites. J. Virol., 1983. 48(1): p. 52-60. 2. Peters, B., et al., The Immune Epitope Database and Analysis Resource: From Vision to Blueprint. PLoS Biology, 2005. 3(3): p. e91. 3. Lamb, R. and R. Krug, Orthomyxoviridae: the viruses and their replication. Fields virology, 2001. 1: p. 1487ñ1531. 4. Squires, B., et al., BioHealthBase: informatics support in the elucidation of influenza virus host pathogen interactions and virulence. Nucl. Acids Res., 2007: p. gkm905. 5. Bao, Y., et al., The influenza virus resource at the National Center for Biotechnology Information. Journal of virology, 2008. 82(2): p. 596. 6. Crooks, G.E., et al., WebLogo: a sequence logo generator. Genome Res, 2004. 14(6): p. 1188-90. 7. Throsby, M., et al., Heterosubtypic neutralizing monoclonal antibodies cross- protective against H5N1 and H1N1 recovered from human IgM+ memory B cells. PLoS One, 2008. 3(12): p. 3942. 8. Ekiert, D., et al., Antibody recognition of a highly conserved influenza virus epitope. Science, 2009. 324(5924): p. 246. 9. Wang, T., et al., Broadly protective monoclonal antibodies against H3 influenza viruses following sequential immunization with different hemagglutinins. PLoS Pathog, 2010. 6: p. e100796. 10. Duvvuri, V.R.S.K., et al., Original Article: Highly conserved cross-reactive CD4+ T-cell HA-epitopes of seasonal and the 2009 pandemic influenza viruses. Influenza and Other Respiratory Viruses, 2010. 4(5): p. 249-258. a Monoclonal antibody binding region reported in Reference b Numbering scheme used in Reference c Equivalent location in numbering scheme used in Figure 4 d Overlap with HA conserved epitope regions (CER) reported here This study was conducted under the contract HHSN266200400041C Influenza viruses evolve through competing forces of immune pressure and protein functional constraints [1]. Immune pressure is exerted on the viruses to select for naturally occurring sequence variants that alter protein regions recognized by both the humoral and cellular arms of the adaptive immune system. However, the ability of influenza virus to evade host immunity through mutational selection is constrained by the requirement to retain essential protein functions necessary for effective virus replication and spread. This results in a constant mutation/selection tug-of-war that selects for sequence alterations in protein regions recognized by the host immune system (immune epitopes) while selecting against alterations in sequence features that impair protein function. For vaccine development, the ideal immunogen would focus the vaccine immune response on protein regions that are both strong immune epitopes and functionally constrained. Protein Regions HA 25 – 50 (HA-CER1); 115 – 130 (HA-CER2); 340 – 375 (HA-CER3); 395 – 410 (HA-CER4); 435 – 450 (HA-CER5) NA 130 – 145 (NA-CER1); 405 – 425 (NA-CER2); 435 – 450 (NA-CER3) M1 15 – 30 (M1-CER1) NS1 30 – 45 (NS1-CER1); 150 – 165 (NS1-CER2); 180 – 190 (NS1-CER3) PA 395 – 420 (PA-CER1) Region a Location b H0 coordinates c CER coverage d Eikert et al. [8] CR6261 Region1 HA1 Val40 47 HA-CER1 HA1 Leu42 49 HA-CER1 HA1 Leu292 307? none HA2 Thr49 392 none HA2 Val52 395 HA-CER4 HA2 Ile56 399 HA-CER4 CR6261 Region2 HA1 His18 25 HA-CER1 HA1 His38 45 HA-CER1 HA2 Trp21 364 HA-CER3 HA2 Thr41 384 none HA2 Ile45 388 none Wang et al. [9] 12D1 HA2 76-106 419-446 HA-CER5 In conclusion, the CERs described here provide excellent candidates for vaccine development that are known targets of the adaptive immune system. Epitopes from these CER offer broad cross-reactive immune response while being conserved in both seasonal H1N1 and the pandemic H1N1 2009 viruses. Table 1. Conserved Epitope Regions. Discussion The current strategy for seasonal influenza vaccine development relies on the selection of two type A strains, an H1N1 and an H3N2, and one type B strain from recently circulating viruses predicted to be similar to the strains that will emerge in the coming season. This approach is problematic in that it: Requires the strain selection committee to accurately predict the basic structures of future viruses and also Requires vaccine manufacturers to develop new vaccine formulations each year. Through the integration of data about the location of experimentally validated immune epitopes from the IEDB with an analysis of amino acid sequence conservation from the IRD, we have identified Conserved Epitope Regions (CER) within each of the influenza proteins that are targeted by the host immune system and that are conserved within type A influenza viruses. We propose that these CER, especially those within the HA protein, could be excellent candidates for epitopes that could generate broad cross-reactive antibodies and therefore cross-protective immunity Recent experimental evidence suggests that the computational strategy described here does indeed generate experimentally verifiable results. Recently, investigators have isolated monoclonal antibodies that demonstrate heterosubtypic cross-reactivity [7-9].. A comparison of the regions identified by the computational methods described here with the regions identified by these experimental methods shows a dramatic correlation (Table 2) with each of the antibody binding regions covered by one or more CER. Table 2. Overlap between experimentally determined cross- reactive epitopes and computationally determined epitope regions. Materials and Methods Results To investigate the relationships between protein functional regions and immune epitopes, we compared the location of amino acids included in immune epitopes with a surrogate of functional constraint – sequence conservation. Figure 1B shows an overlay of graphs for epitope coverage and sequence variation for several influenza proteins. In some cases: 1. Curves closely parallel each other (e.g. the region between amino acids (aa) 199 - 232 in HA (left)) 2. Regions showing sequence variation are not found in validated immune epitopes 3. Regions with limited sequence variation include amino acid positions found in multiple immune epitopes (e.g. the region around aa441 in HA; the region between aa406 – 421 in PB1). These amino acids appear to be highly conserved in spite of immune recognition; we refer to these as Conserved Epitope Regions (CER).

Upload: others

Post on 14-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Conserved Epitope Regions (CER): Elucidation of ...€¦ · Conserved Epitope Regions (CER): Elucidation of evolutionarily stable, immunologically reactive regions of human H1N1 influenza

Introduction

We utilized data from the Influenza Research Database (IRD) [4]

(www.fludb.org), which integrates information from primary sources

such as:

•  NCBI’s Influenza Virus Database (IVD) [5]

•  Immune Epitope Database (IEDB) (www.immuneepitope.org) [2]

•  RCSB Protein Databank (PDB)

SNP/Consensus sequence generated through an internal IRD

pipeline (adapted from Crooks et al. [6]) was used as well as custom

script to quantify epitope coverage across each protein.

Results

Five CERs can be found in the HA protein (Figure 1B, Table 2).

These regions are not only well conserved, but also include both B

cell and T cell epitopes.

For the most part, the HA CER are found in the stalk region of the

protein, with only one of the five CER located at the base of the

globular head. Similar CER are found in each of the influenza

proteins with significant representation in the IEDB data (Figure 4B,

Table 2).

Conserved Epitope Regions (CER): Elucidation of evolutionarily stable, immunologically reactive regions of human H1N1 influenza viruses

R. Burke Squires1, Brett Pickett1, Jyothi Noronha1, Victoria Hunt1, Richard H. Scheuermann1,2 1Department of Pathology, 2Division of Biomedical Informatics, University of Texas Southwestern Medical Center, Dallas, Texas.

Results

Figure 1 (Top right): (A) Number of T cell, B cell and MHC Class I and II epitopes broken down by protein and host (human, mouse or other). For mouse and human, MHC epitopes are broken down for Class I and Class II; for other host they remain grouped together (MHC). (B) Influenza segment sequence variation as measured by polymorphism score (poly) plotted in comparison with epitope coverage. Sequence polymorphism scores were computed for human H1N1 subtype sequences using a formula adapted from Crooks et al [6] and downloaded from the IRD, with scores for proteins ranging from 0 (fully conserved) to 432 (all amino acids equally represented at a position). In order to visualize regions of sequence variation, an average polymorphism score was computed using a 5 amino acid sliding window. An epitope coverage score was computed by counting the number of epitopes that occur at each amino acid position. T cell and MHC binding epitope data shown are from human only, while B cell epitope data is from all host species. All epitope data was obtained from the Immune Epitope Database (IEDB) on August 30, 2010 [2]. Colored bars at the top of each chart show regions of high immunological activity but little sequence variation – Conserved Epitope Regions (CER). For HA, CER are colored as follows: HA-CER1 (magenta), HA-CER2 (yellow), HA-CER3 (blue), HA-CER4 (cyan), and HA-CER5 (red). (C) The protein structure of H1N1 HA (PDB ID: 1RUZ) showing CER regions colored as in (B). Regions in only one HA monomer were highlighted for clarity.

Discussion

The CERs for HA:

•  Include all four highly cross-reactive epitopes predicted by

Duvvuri, et al. [10]

•  Are conserved in both seasonal H1N1 and the pandemic H1N1

2009 viruses.

•  Overlap with 6 other epitopes found by Duvvuri et al. for 10 of 18

(55.6%) epitopes in total.

Conclusion

References

1.  Both, G.W., et al., Antigenic drift in influenza virus H3 hemagglutinin from 1968 to 1980: multiple evolutionary pathways and sequential amino acid changes at key antigenic sites. J. Virol., 1983. 48(1): p. 52-60.

2.  Peters, B., et al., The Immune Epitope Database and Analysis Resource: From Vision to Blueprint. PLoS Biology, 2005. 3(3): p. e91.

3.  Lamb, R. and R. Krug, Orthomyxoviridae: the viruses and their replication. Fields virology, 2001. 1: p. 1487ñ1531.

4.  Squires, B., et al., BioHealthBase: informatics support in the elucidation of influenza virus host pathogen interactions and virulence. Nucl. Acids Res., 2007: p. gkm905.

5.  Bao, Y., et al., The influenza virus resource at the National Center for Biotechnology Information. Journal of virology, 2008. 82(2): p. 596.

6.  Crooks, G.E., et al., WebLogo: a sequence logo generator. Genome Res, 2004. 14(6): p. 1188-90.

7.  Throsby, M., et al., Heterosubtypic neutralizing monoclonal antibodies cross-protective against H5N1 and H1N1 recovered from human IgM+ memory B cells. PLoS One, 2008. 3(12): p. 3942.

8.  Ekiert, D., et al., Antibody recognition of a highly conserved influenza virus epitope. Science, 2009. 324(5924): p. 246.

9.  Wang, T., et al., Broadly protective monoclonal antibodies against H3 influenza viruses following sequential immunization with different hemagglutinins. PLoS Pathog, 2010. 6: p. e100796.

10.  Duvvuri, V.R.S.K., et al., Original Article: Highly conserved cross-reactive CD4+ T-cell HA-epitopes of seasonal and the 2009 pandemic influenza viruses. Influenza and Other Respiratory Viruses, 2010. 4(5): p. 249-258.

a Monoclonal antibody binding region reported in Reference b Numbering scheme used in Reference c Equivalent location in numbering scheme used in Figure 4 d Overlap with HA conserved epitope regions (CER) reported here

This study was conducted under the contract HHSN266200400041C

Influenza viruses evolve through competing forces of immune

pressure and protein functional constraints [1]. Immune pressure is

exerted on the viruses to select for naturally occurring sequence

variants that alter protein regions recognized by both the humoral

and cellular arms of the adaptive immune system. However, the

ability of influenza virus to evade host immunity through mutational

selection is constrained by the requirement to retain essential protein

functions necessary for effective virus replication and spread.

This results in a constant mutation/selection tug-of-war that selects

for sequence alterations in protein regions recognized by the host

immune system (immune epitopes) while selecting against

alterations in sequence features that impair protein function.

For vaccine development, the ideal immunogen would focus the

vaccine immune response on protein regions that are both strong

immune epitopes and functionally constrained.

Protein   Regions  HA   25 – 50 (HA-CER1); 115 – 130 (HA-CER2); 340 – 375

(HA-CER3); 395 – 410 (HA-CER4); 435 – 450 (HA-CER5)  NA   130 – 145 (NA-CER1); 405 – 425 (NA-CER2); 435 – 450

(NA-CER3)  M1   15 – 30 (M1-CER1)  NS1   30 – 45 (NS1-CER1); 150 – 165 (NS1-CER2); 180 – 190

(NS1-CER3)  PA   395 – 420 (PA-CER1)  

Regiona   Locationb   H0 coordinatesc  

CER coveraged  

Eikert et al. [8]  

CR6261 Region1  

HA1 Val40   47   HA-CER1  HA1 Leu42   49   HA-CER1  HA1 Leu292   307?   none  HA2 Thr49   392   none  HA2 Val52   395   HA-CER4  HA2 Ile56   399   HA-CER4  

CR6261 Region2  

HA1 His18   25   HA-CER1  HA1 His38   45   HA-CER1  HA2 Trp21   364   HA-CER3  HA2 Thr41   384   none  HA2 Ile45   388   none  

Wang et al. [9]   12D1   HA2 76-106   419-446   HA-CER5  

In conclusion, the CERs described here provide excellent candidates

for vaccine development that are known targets of the adaptive

immune system. Epitopes from these CER offer broad cross-reactive

immune response while being conserved in both seasonal H1N1 and

the pandemic H1N1 2009 viruses.

Table 1. Conserved Epitope Regions.

Discussion

The current strategy for seasonal influenza vaccine development

relies on the selection of two type A strains, an H1N1 and an H3N2,

and one type B strain from recently circulating viruses predicted to

be similar to the strains that will emerge in the coming season. This

approach is problematic in that it:

•  Requires the strain selection committee to accurately predict the

basic structures of future viruses and also

•  Requires vaccine manufacturers to develop new vaccine

formulations each year.

Through the integration of data about the location of experimentally

validated immune epitopes from the IEDB with an analysis of amino

acid sequence conservation from the IRD, we have identified

Conserved Epitope Regions (CER) within each of the influenza

proteins that are targeted by the host immune system and that are

conserved within type A influenza viruses.

We propose that these CER, especially those within the HA protein,

could be excellent candidates for epitopes that could generate broad

cross-reactive antibodies and therefore cross-protective immunity

Recent experimental evidence suggests that the computational

strategy described here does indeed generate experimentally

verifiable results. Recently, investigators have isolated monoclonal

antibodies that demonstrate heterosubtypic cross-reactivity [7-9]..

A comparison of the regions identified by the computational methods

described here with the regions identified by these experimental

methods shows a dramatic correlation (Table 2) with each of the

antibody binding regions covered by one or more CER.

Table 2. Overlap between experimentally determined cross-reactive epitopes and computationally determined epitope regions.

Materials and Methods

Results

To investigate the relationships between protein functional regions

and immune epitopes, we compared the location of amino acids

included in immune epitopes with a surrogate of functional constraint

– sequence conservation. Figure 1B shows an overlay of graphs for

epitope coverage and sequence variation for several influenza

proteins.

In some cases:

1.  Curves closely parallel each other (e.g. the region between

amino acids (aa) 199 - 232 in HA (left))

2.  Regions showing sequence variation are not found in

validated immune epitopes

3.  Regions with limited sequence variation include amino acid

positions found in multiple immune epitopes (e.g. the region

around aa441 in HA; the region between aa406 – 421 in

PB1). These amino acids appear to be highly conserved in

spite of immune recognition; we refer to these as

Conserved Epitope Regions (CER).