an epidemiological analysis of sars-cov-2 genomic ......16 bhagat phool singh government medical...

13
viruses Article An Epidemiological Analysis of SARS-CoV-2 Genomic Sequences from Different Regions of India Pragya D. Yadav 1,† , Dimpal A. Nyayanit 1,† , Triparna Majumdar 1 , Savita Patil 1 , Harmanmeet Kaur 2 , Nivedita Gupta 2, *, Anita M. Shete 1 , Priyanka Pandit 1 , Abhinendra Kumar 1 , Neeraj Aggarwal 2 , Jitendra Narayan 2 , Neetu Vijay 2 , Usha Kalawat 3 , Attayur P. Sugunan 4 , Ashok Munivenkatappa 5 , Tara Sharma 6 , Sulochna Devi 7 , Tapan Majumdar 8 , Subhash Jaryal 9 , Rupinder Bakshi 10 , Yash Joshi 1 , Rima Sahay 1 , Jayanti Shastri 11 , Mini Singh 12 , Manoj Kumar 13 , Vinita Rawat 14 , Shanta Dutta 15 , Sarita Yadav 16 , Kaveri Krishnasamy 17 , Sharmila Raut 18 , Debasis Biswas 19 , Biswajyoti Borkakoty 20 , Santwana Verma 21 , Sudha Rani 22 , Hirawati Deval 23 , Disha Patel 24 , Jyotirmayee Turuk 25 , Bharti Malhotra 26 , Bashir Fomda 27 , Vijaylakshmi Nag 28 , Amita Jain 29 , Anudita Bhargava 30 , Varsha Potdar 1 , Sarah Cherian 1 , Priya Abraham 1 , Anjani Gopal 31 , Samiran Panda 2 and Balram Bhargava 2 Citation: Yadav, P.D.; Nyayanit, D.A.; Majumdar, T.; Patil, S.; Kaur, H.; Gupta, N.; Shete, A.M.; Pandit, P.; Kumar, A.; Aggarwal, N.; et al. An Epidemiological Analysis of SARS-CoV-2 Genomic Sequences from Different Regions of India. Viruses 2021, 13, 925. https:// doi.org/10.3390/v13050925 Academic Editors: KrisztiánBányai, István Kiss and György Lengyel Received: 22 March 2021 Accepted: 4 May 2021 Published: 17 May 2021 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affil- iations. Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/). 1 Indian Council of Medical Research-National Institute of Virology (ICMR-NIV), Pune 411021, India; [email protected] (P.D.Y.); [email protected] (D.A.N.); [email protected] (T.M.); [email protected] (S.P.); [email protected] (A.M.S.); [email protected] (P.P.); [email protected] (A.K.); [email protected] (Y.J.); [email protected] (R.S.); [email protected] (V.P.); [email protected] (S.C.); [email protected] (P.A.) 2 Indian Council of Medical Research, New Delhi 110029, India; [email protected] (H.K.); [email protected] (N.A.); [email protected] (J.N.); [email protected] (N.V.); [email protected] (S.P.); [email protected] (B.B.) 3 Sri Venkateswara Institute of Medical Sciences, Tirupati 517507, India; [email protected] 4 ICMR—NIV Field Unit, Kerala 688005, India; [email protected] 5 ICMR-NIV, Bangalore Unit, Bangalore 560029, India; [email protected] 6 VRDL Sikkim Government College of Nursing, Gangtok 737101, India; [email protected] 7 Regional Institute of Medical Sciences IMPHAL, Imphal 795004, India; [email protected] 8 Government Medical College, Agartala, Tripura 799006, India; [email protected] 9 Dr. Rajendra Prasad Government Medical College, (H.P.), Kangra 176001, India; [email protected] 10 Government Medical College, Patiala 147001, India; [email protected] 11 Kasturba Hospital for Infectious Diseases, Mumbai 400034, India; [email protected] 12 Post Graduate Institute of Medical Education & Research, Chandigarh 160012, India; [email protected] 13 Rajendra Institute of Medical Sciences, Ranchi 834009, India; [email protected] 14 Government Medical College, Haldwani 263129, India; [email protected] 15 National Institute of Cholera and Enteric Diseases, Kolkata 700010, India; [email protected] 16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; [email protected] 17 King Institute of Preventive Medicine & Research, Chennai 600032, India; [email protected] 18 Indira Gandhi Government Medical College & Hospital, Nagpur 440018, India; [email protected] 19 All India Institute of Medical Sciences, Bhopal 462020, India; [email protected] 20 Regional Medical Research Centre, Dibrugarh 786010, India; [email protected] 21 Indira Gandhi Medical College & Hospital, Shimla 171001, India; [email protected] 22 Osmania Medical College, Hyderabad 500095, India; [email protected] 23 Regional Medical Research Center Gorakhpur, Gorakhpur 273013, India; [email protected] 24 B.J. Medical College, Ahmedabad 380016, India; [email protected] 25 RMRC, Bhubaneswar 751023, India; [email protected] 26 Sawai Man Singh Medical College, Jaipur 302004, India; [email protected] 27 Sher-I-Kashmir Institute of Medical Sciences, Srinagar 190011, India; [email protected] 28 All India Institute of Medical Sciences, Jodhpur 342005, India; [email protected] 29 Department of Microbiology, King George’s Medical University, Lucknow 226003, India; [email protected] 30 All India Institute of Medical Sciences, Raipur, Raipur 492099, India; [email protected] 31 Indian Institute of Science Education and Research, Pune 411008, India; [email protected] * Correspondence: [email protected] The authors contributed equally to the manuscript. Viruses 2021, 13, 925. https://doi.org/10.3390/v13050925 https://www.mdpi.com/journal/viruses

Upload: others

Post on 12-Aug-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

viruses

Article

An Epidemiological Analysis of SARS-CoV-2 GenomicSequences from Different Regions of India

Pragya D. Yadav 1,† , Dimpal A. Nyayanit 1,†, Triparna Majumdar 1, Savita Patil 1, Harmanmeet Kaur 2,Nivedita Gupta 2,*, Anita M. Shete 1, Priyanka Pandit 1 , Abhinendra Kumar 1, Neeraj Aggarwal 2 ,Jitendra Narayan 2 , Neetu Vijay 2, Usha Kalawat 3, Attayur P. Sugunan 4, Ashok Munivenkatappa 5,Tara Sharma 6, Sulochna Devi 7, Tapan Majumdar 8, Subhash Jaryal 9, Rupinder Bakshi 10, Yash Joshi 1 ,Rima Sahay 1, Jayanti Shastri 11, Mini Singh 12, Manoj Kumar 13 , Vinita Rawat 14, Shanta Dutta 15 ,Sarita Yadav 16, Kaveri Krishnasamy 17, Sharmila Raut 18, Debasis Biswas 19, Biswajyoti Borkakoty 20,Santwana Verma 21, Sudha Rani 22, Hirawati Deval 23, Disha Patel 24, Jyotirmayee Turuk 25, Bharti Malhotra 26,Bashir Fomda 27, Vijaylakshmi Nag 28, Amita Jain 29, Anudita Bhargava 30, Varsha Potdar 1 , Sarah Cherian 1,Priya Abraham 1, Anjani Gopal 31, Samiran Panda 2 and Balram Bhargava 2

�����������������

Citation: Yadav, P.D.; Nyayanit, D.A.;

Majumdar, T.; Patil, S.; Kaur, H.;

Gupta, N.; Shete, A.M.; Pandit, P.;

Kumar, A.; Aggarwal, N.; et al. An

Epidemiological Analysis of

SARS-CoV-2 Genomic Sequences

from Different Regions of India.

Viruses 2021, 13, 925. https://

doi.org/10.3390/v13050925

Academic Editors: Krisztián Bányai,

István Kiss and György Lengyel

Received: 22 March 2021

Accepted: 4 May 2021

Published: 17 May 2021

Publisher’s Note: MDPI stays neutral

with regard to jurisdictional claims in

published maps and institutional affil-

iations.

Copyright: © 2021 by the authors.

Licensee MDPI, Basel, Switzerland.

This article is an open access article

distributed under the terms and

conditions of the Creative Commons

Attribution (CC BY) license (https://

creativecommons.org/licenses/by/

4.0/).

1 Indian Council of Medical Research-National Institute of Virology (ICMR-NIV), Pune 411021, India;[email protected] (P.D.Y.); [email protected] (D.A.N.);[email protected] (T.M.); [email protected] (S.P.); [email protected] (A.M.S.);[email protected] (P.P.); [email protected] (A.K.); [email protected] (Y.J.);[email protected] (R.S.); [email protected] (V.P.); [email protected] (S.C.);[email protected] (P.A.)

2 Indian Council of Medical Research, New Delhi 110029, India; [email protected] (H.K.);[email protected] (N.A.); [email protected] (J.N.); [email protected] (N.V.);[email protected] (S.P.); [email protected] (B.B.)

3 Sri Venkateswara Institute of Medical Sciences, Tirupati 517507, India; [email protected] ICMR—NIV Field Unit, Kerala 688005, India; [email protected] ICMR-NIV, Bangalore Unit, Bangalore 560029, India; [email protected] VRDL Sikkim Government College of Nursing, Gangtok 737101, India; [email protected] Regional Institute of Medical Sciences IMPHAL, Imphal 795004, India; [email protected] Government Medical College, Agartala, Tripura 799006, India; [email protected] Dr. Rajendra Prasad Government Medical College, (H.P.), Kangra 176001, India; [email protected] Government Medical College, Patiala 147001, India; [email protected] Kasturba Hospital for Infectious Diseases, Mumbai 400034, India; [email protected] Post Graduate Institute of Medical Education & Research, Chandigarh 160012, India; [email protected] Rajendra Institute of Medical Sciences, Ranchi 834009, India; [email protected] Government Medical College, Haldwani 263129, India; [email protected] National Institute of Cholera and Enteric Diseases, Kolkata 700010, India; [email protected] Bhagat Phool Singh Government Medical College, Sonipat 131305, India; [email protected] King Institute of Preventive Medicine & Research, Chennai 600032, India; [email protected] Indira Gandhi Government Medical College & Hospital, Nagpur 440018, India; [email protected] All India Institute of Medical Sciences, Bhopal 462020, India; [email protected] Regional Medical Research Centre, Dibrugarh 786010, India; [email protected] Indira Gandhi Medical College & Hospital, Shimla 171001, India; [email protected] Osmania Medical College, Hyderabad 500095, India; [email protected] Regional Medical Research Center Gorakhpur, Gorakhpur 273013, India; [email protected] B.J. Medical College, Ahmedabad 380016, India; [email protected] RMRC, Bhubaneswar 751023, India; [email protected] Sawai Man Singh Medical College, Jaipur 302004, India; [email protected] Sher-I-Kashmir Institute of Medical Sciences, Srinagar 190011, India; [email protected] All India Institute of Medical Sciences, Jodhpur 342005, India; [email protected] Department of Microbiology, King George’s Medical University, Lucknow 226003, India;

[email protected] All India Institute of Medical Sciences, Raipur, Raipur 492099, India; [email protected] Indian Institute of Science Education and Research, Pune 411008, India; [email protected]* Correspondence: [email protected]† The authors contributed equally to the manuscript.

Viruses 2021, 13, 925. https://doi.org/10.3390/v13050925 https://www.mdpi.com/journal/viruses

Page 2: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 2 of 13

Abstract: The number of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) casesis increasing in India. This study looks upon the geographic distribution of the virus clades andvariants circulating in different parts of India between January and August 2020. The NPS/OPSfrom representative positive cases from different states and union territories in India were collectedevery month through the VRDLs in the country and analyzed using next-generation sequencing.Epidemiological analysis of the 689 SARS-CoV-2 clinical samples revealed GH and GR to be thepredominant clades circulating in different states in India. The northern part of India largely re-ported the ‘GH’ clade, whereas the southern part reported the ‘GR’, with a few exceptions. Thesesequences also revealed the presence of single independent mutations—E484Q and N440K—fromMaharashtra (first observed in March 2020) and Southern Indian States (first observed in May 2020),respectively. Furthermore, this study indicates that the SARS-CoV-2 variant (VOC, VUI, variant ofhigh consequence and double mutant) was not observed during the early phase of virus transmission(January–August). This increased number of variations observed within a short timeframe across theglobe suggests virus evolution, which can be a step towards enhanced host adaptation.

Keywords: SARS-CoV-2; epidemiology; NGS; clades; India

1. Introduction

The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) that was firstidentified in Wuhan City in the Hubei Province of China in December 2019 has now spreadworldwide [1]. The causative agent was identified as SARS-COV-2 based on laboratorydiagnoses and the study of its genome through next-generation sequencing (NGS) [2].SARS-CoV-2 belongs to the subgenus Sarbecovirus of the Betacoronavirus genus that is partof the Coronaviridae family, which has one of the largest RNA viral genomes. It is anenveloped, non-segmented, positive-sense RNA virus, with a genome approximately 30 kbin length [3]. The first ten whole-genome sequences of SARS-CoV-2 obtained from nineaffected patients in China showed 99.98% sequence identity with each other, and ~96.3%identity with a bat Coronavirus (BatCoV) strain, RaTG13 [4].

A wide range of studies has been published to describe the geographic distributionof SARS-CoV-2 clades and variants across the world using whole-genome sequencing(WGS). WGS has emerged as an important tool for understanding the effect of muta-tions on disease transmission dynamics, and for predicting the trends of the ongoingpandemic. Several studies have reported genetic variations in the virus resulting fromdifferent types of mutations: missense, synonymous, insertion, deletion, and non-codingmutations [5–7]. Recently, the Center for Disease Control (CDC) classified SARS-CoV-2variants into three groups: “variant of interest (VUI)”, “variant of concern”, and “vari-ant of high consequence” (https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/variant-surveillance/variant-info.html (accessed on 26 April 2021)). The B.1.526, B.1.525,P.2 Phylogenetic Assignment of Named Global Outbreak LINeages (PangoLIN) lineagesare classified as VUIs, whereas the B.1.1.7, B.1.351, B.1.427, B.1.429, P.1 PangoLIN lineagesare classified as VOCs.

The present study was designed to detect the genetic diversity of SARS-CoV-2 strainssampled from different states in India, which were taken monthly at different time pointsand which demonstrated the presence of mutations in the SARS-CoV-2 variant as describedby the CDC, among other institutions. This study reveals the circulating lineages of SARS-CoV-2 in India and also highlights the mutational rates of the virus in different geographicregions in India over a consistent period of six months (January–August 2020).

2. Materials and Methods2.1. Sample Acquisition

Respiratory specimens i.e., nasopharyngeal/oropharyngeal swabs (NPS/OPS) posi-tive for SARS-CoV-2 detected by real-time RT-PCR, were collected monthly (from 1 January

Page 3: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 3 of 13

to 31 August 2020) from Virus Research and Diagnostic Laboratories (VRDLs) throughoutIndia and sequenced. Samples were collected from 25 states and Union Territories (UTs)in India (Andhra Pradesh, Assam, Chandigarh, Chhattisgarh, Delhi, Gujarat, Haryana,Himachal Pradesh, Jammu and Kashmir, Jharkhand, Karnataka, Kerala, Madhya Pradesh,Maharashtra, Manipur, Odisha, Punjab, Rajasthan, Sikkim, Tamil Nadu, Tripura, Telan-gana, Uttarakhand, Uttar Pradesh, and West Bengal). The inclusion criteria for the samplesin the study were the following: (i) appropriate storage of the sample at −80 ◦C, and(ii) a cycle threshold of <30. The samples fulfilling the above criteria were packed indry ice and in triple-layer packaging, according to guidelines established by the Inter-national Air Transport Association (IATA). The samples were subsequently transferredto the nodal laboratory—the Indian Council of Medical Research-National Institute ofVirology (ICMR-NIV) in Pune—for sequencing and analysis. Apart from the sequences ofthe above-collected samples, all other sequences of the SARS-CoV-2 collected in India fromJanuary to August and deposited in the Global Initiative on Sharing All Influenza Data(GISAID) by various researchers, updated until 9 December 2020, were also analyzed.

2.2. RNA Extraction and Next-Generation Sequencing

Viral nucleic acid was extracted from the NPS/OPS specimens using 400 µL of thesample. Nucleic acid extraction was conducted using a MagMAX™ viral pathogen nucleicacid isolation kit (Thermo Fisher Scientific, Waltham, MA, USA). An automated protocolusing the KingFisher Flex (Thermo Fisher Scientific, Waltham, MA, USA) Magnetic ParticleProcessor for high-throughput nucleic acid extraction was used, following the instructionsof the manufacturer. Nucleic acid was eluted with 50 µL of elution buffer. The extractedRNA was quantified using the Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Waltham,MA, USA) with the Qubit RNA High Sensitivity kit. Host ribosomal RNA (rRNA) depletionwas carried out using the NEBNext rRNA depletion kit (New England Biolabs, Ipswich,MA, USA) and the extracted RNA was re-quantified. The quantified RNA was usedto generate genomic libraries for sequencing. In brief, the library preparation involvedfragmentation, adapter ligation, amplification, and quantification. The quantified librarieswere normalized and loaded onto the Illumina machine for sequencing [8,9].

The paired-end FASTQ files generated from the MiniSeq machine were analyzed on theCLC Genomics Workbench version 20 (CLC, Qiagen, Hilden, Germany). A reference-basedassembly method, as implemented in the Workbench, was used to retrieve the SARS-CoV-2 sequence. The SARS-CoV-2 isolate Wuhan-Hu-1 (Accession No.: NC_045512.2) wasused as the reference for mapping. The retrieved sequences were deposited in the publicrepository, GISAID.

2.3. Phylogenetic and Sequence Analysis

SARS-CoV-2 sequences from India (as of 9 December 2020), were downloaded (n = 3119)from the GISAID database, including the sequences of COVID-19 cases that occurred fromJanuary to August 2020 [10]. Representative sequences based on data from the differentStates/UT and the specific clades were used in the analysis along with the sequencesretrieved in this study. The sequences were aligned using the CLC genomics workbench.The aligned file was manually checked for correctness. A neighbor-joining (NJ) phyloge-netic tree was constructed from the coding region of the SARS-CoV-2 genome using themaximum composite likelihood model along with gamma distribution as the rate variationparameter. A bootstrap replication of 1000 cycles was performed to assess the statisticalrobustness of the generated tree. The amino acid variation for each gene, as well as the netnucleotide and amino acid divergence were identified using the MEGA software version7.0 [11] and illustrated using the GraphPad Prism v9. PopART v1.7 was used to drawhaplotype networks using the median-joining approach [12,13].

Page 4: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 4 of 13

3. Results3.1. Site Selection and Clinical Analysis

Altogether, 1603 samples were received from twenty-five states and UTs, 79 of whichwere discarded in compliance with the inclusion criteria. The data set of 1524 samples wasused for the clinical and NGS analyses.

The clinical data were analyzed for the samples that gave complete SARS-CoV-2sequences (n = 689). The mean age of patients in the study was 37 years, with a medianof 35 years (range 25–45 years). A total of 71.45% of the samples were from male patientswith a mean age of 37 years, while the remaining samples were from females with a meanof 38 years. The data set for age and gender were not available for three samples.

The clinical symptoms observed in the patients included in the study indicated that67.95% of the males and 68.39% of the females sampled were asymptomatic. Overall,67.34% of cases were asymptomatic, while 23.80% were symptomatic with one or morepresenting symptoms (Supplementary Table S1). Symptom history was not available for theremaining 8.85% of patients. Low-grade fever was the most consistent clinical presentation,observed in 64.02% of symptomatic cases, followed by cough (51.22%), sore throat (25%),breathlessness (20.73%), headache (17.07%), body ache (11.59%), and feeling cold (9.15%).Loss of smell and taste was observed in 9.76% of cases, while nausea and vomiting wereobserved in 2.44% of cases. Hemoptysis was reported in a single case.

3.2. NGS Data Selection and Phylogenetic Analysis

Out of the 1524 total samples sequenced, reference-based mapping led to the retrievalof 819 SARS-CoV-2 genomic sequences that had the genome coverage of 98%. However,the study analyzed 689 complete SARS-CoV-2 genomic sequences that were retrieved forthis study along with seven additional sequences. These 689 sequences had a genomecoverage of more than 99.75%. A total of 696 SARS-CoV-2 sequences were analyzed in thisstudy. The percentage of relevant mapped reads had a median value of 10.21, with an intra-quartile range of 48.29 [1.58–49.44]. Details with respect to the percentage of the genomeretrieved and the read mapped to it, along with the percentage of genome length takenfrom each SARS-CoV-2 sequence in this study are provided in the Supplementary Table S2.The phylogenetic tree revealed that the SARS-CoV-2 sequences collected from the differentstates in India represent different GISAID clades with a small proportion of L and Vclades [14] (Figure 1).

3.3. Mutations Observed in the SARS-CoV-2 Sequences

From among the 3815 sequences analyzed, which include sequences from the GISAIDdatabase, 2473 independent amino acid variations were observed with respect to the SARS-CoV-2 isolate Wuhan-Hu-1 (Accession No.: NC_045512.2). Many individual variationsobserved in the SARS-CoV-2 sequences in the study were not reported. Additionally,we looked for the presence of the amino acid mutation that was reportedly found in theSARS-CoV-2 variants. However, independent amino acid mutations found in the SARS-CoV-2 variants were observed in the sequences retrieved between January and August.Figure 2A depicts the month-wise amino acid mutations observed in more than 0.5% ofthe sequences, along with those seen in the SARS-CoV-2 variants. It was noted that themajority of the changes occurred in the N terminal and the RBD domains of the S1 subunitof the spike protein.

Page 5: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 5 of 13Viruses 2021, 13, x FOR PEER REVIEW 5 of 13

Figure 1. Phylogenetic tree of the SARS-CoV-2 genome: Neighbor-joining tree for Indian SARS-CoV-2 sequences with a bootstrap replication of 1000 cycles. FigTree v1.4.4 was used to visualize the generated tree. Various GISIAD clades of the SARS-CoV-2 are marked in different colors. The number of sequences used to generate the tree is assigned the n value.

3.3. Mutations Observed in the SARS-CoV-2 Sequences From among the 3815 sequences analyzed, which include sequences from the

GISAID database, 2473 independent amino acid variations were observed with respect to the SARS-CoV-2 isolate Wuhan-Hu-1 (Accession No.: NC_045512.2). Many individual variations observed in the SARS-CoV-2 sequences in the study were not reported. Addi-tionally, we looked for the presence of the amino acid mutation that was reportedly found in the SARS-CoV-2 variants. However, independent amino acid mutations found in the SARS-CoV-2 variants were observed in the sequences retrieved between January and Au-gust. Figure 2A depicts the month-wise amino acid mutations observed in more than 0.5% of the sequences, along with those seen in the SARS-CoV-2 variants. It was noted that the majority of the changes occurred in the N terminal and the RBD domains of the S1 subunit of the spike protein.

Figure 1. Phylogenetic tree of the SARS-CoV-2 genome: Neighbor-joining tree for Indian SARS-CoV-2 sequences with a bootstrap replication of 1000 cycles. FigTree v1.4.4 was used to visualizethe generated tree. Various GISIAD clades of the SARS-CoV-2 are marked in different colors. Thenumber of sequences used to generate the tree is assigned the n value.

Another interesting finding that warrants attention is the presence of a combinationof amino acid mutations, used by the GISAID to differentiate clades. Firstly, the GR-GHvariants defined by the presence of the D614G, G204R, and Q57H mutations in the S, N,and ORF3a proteins, respectively, were observed in 14 sequences from different parts ofIndia, starting from May 2020 (Figure 2). Ten of these were from Maharashtra (n = 6) andGujarat (n = 4), and one each from Odisha, Kerala, Chhattisgarh, and Punjab. Secondly, theGV-GR variant defined by the presence of A222V in the S and G204R mutations in the Nprotein, along with the common D614G mutation of the G clade, were observed startingfrom August 2020 in Maharashtra (n = 1) and Telangana (n = 2). Finally, the G-S variant,defined by the presence of the D614G and L84S mutations in the S and ORF8 proteins,respectively, was observed in four sequences of SARS-CoV-2 reported from Gujarat inJune 2020.

Page 6: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 6 of 13Viruses 2021, 13, x FOR PEER REVIEW 6 of 13

Figure 2. Month-wise changes in the mutations observed in SARS-CoV-2 sequences: (A) Month-wise changes in the amino acid mutations in the spike protein that were observed in more than 0.5% of the samples studied. (B) Month-wise changes in the Leucine and Phenylalanine amino acids at GP 3606, position 3606, observed within different clades.

Another interesting finding that warrants attention is the presence of a combination of amino acid mutations, used by the GISAID to differentiate clades. Firstly, the GR-GH variants defined by the presence of the D614G, G204R, and Q57H mutations in the S, N, and ORF3a proteins, respectively, were observed in 14 sequences from different parts of India, starting from May 2020 (Figure 2). Ten of these were from Maharashtra (n = 6) and Gujarat (n = 4), and one each from Odisha, Kerala, Chhattisgarh, and Punjab. Secondly, the GV-GR variant defined by the presence of A222V in the S and G204R mutations in the N protein, along with the common D614G mutation of the G clade, were observed starting from August 2020 in Maharashtra (n = 1) and Telangana (n = 2). Finally, the G-S variant, defined by the presence of the D614G and L84S mutations in the S and ORF8 proteins, respectively, was observed in four sequences of SARS-CoV-2 reported from Gujarat in June 2020.

The overall percentage of net evolutionary nucleotide and amino acid divergence be-tween different clades is provided in Table 1. The lower left matrix and the upper right matrix show the percentage of nucleotide and amino acid divergence of the different clades. It can be observed that the nucleotide divergence of the clades circulating in the early phase of the pandemic is less, indicating closer evolutionary relatedness. The G clade

Figure 2. Month-wise changes in the mutations observed in SARS-CoV-2 sequences: (A) Month-wise changes in the aminoacid mutations in the spike protein that were observed in more than 0.5% of the samples studied. (B) Month-wise changesin the Leucine and Phenylalanine amino acids at GP 3606, position 3606, observed within different clades.

The overall percentage of net evolutionary nucleotide and amino acid divergencebetween different clades is provided in Table 1. The lower left matrix and the upper rightmatrix show the percentage of nucleotide and amino acid divergence of the different clades.It can be observed that the nucleotide divergence of the clades circulating in the early phaseof the pandemic is less, indicating closer evolutionary relatedness. The G clade sequenceshave higher divergence, which increased with the evolution of the G clade sequences intoits subclades. The highest divergence from the L clade sequences was observed in therecently identified GV-GR clade sequences. However, the G-S clade sequences were foundto be more similar (99.959%) to the S clade sequences rather than to the G clade sequences.

Page 7: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 7 of 13

Table 1. Percentage nucleotide and amino acid evolutionary divergence over Sequence Pairs between different GISAIDgroups, using the P distance along with uniform distribution as the rate variation parameter among sites.

GISAID CladesPercentage Evolutionary Divergence over Sequence Pairs between Groups

L S V Unclassified Cluster G GR GH GH-GR GV-GR GS

L 0.081 0.077 0.070 0.077 0.108 0.101 0.112 0.152 0.097S 0.041 0.093 0.088 0.094 0.125 0.117 0.129 0.170 0.114V 0.043 0.050 0.100 0.090 0.122 0.114 0.125 0.166 0.074

UnclassifiedCluster 0.035 0.045 0.052 0.097 0.128 0.120 0.131 0.173 0.118

G 0.043 0.051 0.053 0.051 0.080 0.072 0.085 0.124 0.084GR 0.058 0.066 0.068 0.066 0.047 0.104 0.069 0.100 0.115GH 0.059 0.067 0.069 0.066 0.047 0.063 0.087 0.149 0.105

GH-GR 0.060 0.068 0.070 0.068 0.050 0.041 0.055 0.109 0.114GV-GR 0.077 0.085 0.087 0.085 0.066 0.052 0.083 0.056 0.160

GS 0.054 0.062 0.040 0.063 0.050 0.065 0.061 0.064 0.085

Further conservation of the missense mutation (Leucine (Leu)-Phenylalanine (Phe))was observed at the genomic position (GP) 3606, located in the Non-Structural Protein 6(NSP-6) of the SARS-CoV-2. We investigated this missense mutation month-wise andobserved that majority of the G clade and its subclades sequences had Leucine at GP:3606 (Figure 2B, marked as a circle). Furthermore, the SARS-CoV-2 sequences that werereported in the early pandemic (L, S, V) likewise had Leucine at this position, althoughwith a few exceptions.

Interestingly, 12.7% of SARS-CoV-2 sequences (n = 3815) analyzed in this study hadPhenylalanine (Figure 2B, marked as a square). The majority of these sequences (84%) aregrouped in the unclassified cluster, as opposed to the sequences containing Leucine at GP:3606. The rest of the sequences belonged to different GISIAD clades. The two sequencesreported from India during January 2020, had Leucine at GP: 3606. The SARS-CoV-2sequences deposited from India from March onwards had the Phenylalanine amino acidat GP: 3606. Figure 2 depicts the different GISAID clades that were observed month-wiseand had the GP: 3606 missense mutation. This study looks upon the conservation of aminoacids in the 406 unclassified SARS-CoV-2 sequences that contain Phenylalanine amino acidat GP: 3606.

Figure 3 is the NJ tree of the 406 unclassified sequences with the L3606F mutation,along with the representative clade sequences, which show two different clusters. Inthe first cluster, we can see the branches with nodes in blue and its shades, while in thesecond, the branches in the color green can be observed. An analysis of the unclassifiedsequences was performed to identify the presence of any conserved amino acid mutationalpattern (n = 406) that led to the observed clustering. It was noted that 331/406 sequenceshad a conserved pattern in the ORF1ab (T2016K, A4489V) and N: P13L, Figure 3 (blueand its shades). This is identified as the B.6 variant in the PangoLIN classification [15].Interestingly, another conserved pattern was also observed in 61/406 sequences in theORF1ab (R207C, V378I, and M2790I) and classified within the B.4 variant in the PangoLINclassification (Figure 3, green). These CI2 sequences were mainly from the southern part ofIndia (Kerala (n = 33), Karnataka (n = 6), Leh-Ladakh (n = 6), Gujarat (n = 2), Maharashtra(n = 1), UP (n = 1), West Bengal (n = 1), and Indian citizens sampled in Iran (n = 13). A totalof 82.5% of these sequences were observed in March. Guanine (G) was found in 59% of B.4(CI2 sequences) at position 8 in ORF8, resulting in the G8stop codon mutation in the ORF8protein along with the ORF1ab (D6270G) amino acid change. These CI2 sequences weremainly identified in Kerala (n = 28), Karnataka (n = 6), and Gujarat (n = 2) during the earlyperiod of the pandemic (January–May).

Page 8: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 8 of 13

1.0E-4

hCoV-19/India/MP-NIHSAD-0405-24/2020|EPI_ISL_452793|_2020-05-03

hCoV-19/India/MH-AFMC-5458/2020|EPI_ISL_496529|_2020-05-23

hCoV-19/K

erala_India/nCoV22920/2020/06-2020

hCoV-19/India/MH-NIV-46430/2020|EPI_ISL_479545|_2020-05-26

hCoV-19

/India/J

K-NCDC

-02326/2

020|EPI

_ISL_43

5090|_2

020-03-

29

hCoV-19/In

dia/H

R-THSTI2010C

/2020|EPI_IS

L_528384|_2020-04-27

hCoV-19/India/DL-MaxCov0021-CSIR-IGIB/2020|EPI_ISL_459923|_2020-05-08

hCoV-19/India/O

R-ILS

CV13929/2020|E

PI_IS

L_463026|_2020-05-10

hCoV-19/Andhra_Pradesh_India/SVIMSCOV-408/03-2020

hCoV-19/India/MP-DRDE_759/2020|EPI_ISL_476848|_2020-04-11

hCoV-19/India/DL-ILBS-299M/2020|EPI_ISL_605809|_2020-05-01

hCoV-19/India/MH-NIV-6315/2020|EPI_ISL_454542|_20

20-03-30

hCoV-19/India/TG-CCMB-M499/2020|EPI_ISL_495213|_2020-06-10

hCoV-19/India/TG-CCMB-O2-P1/2020|EPI_ISL_458075|_2020-04-15

hcov-19/TCP-504_Lucknow/UP

hCoV-19/India/DL-NCDC7661-CSIR-IGIB/2020|EPI_ISL_482656|_2020-05-28

hCoV-19/M

aharashtra

_India/_IG

GMC/NGP

/2468/2020

/04-2020

hCoV-19/India/TG-CCMB-J021/2020|EPI_ISL_447556|_2020-03-30

hCoV-19/India/MH-MW8/2020|EPI_ISL_508440|_2020-04-27

hCoV-19/In

dia/U

N-1093/2020|E

PI_IS

L_421663|_2020-03-10

hCoV-19/Kerala_India/nCoV3206/2020/03-2020

hCoV-19

/India/D

L-NCDC1

957-CSIR

-IGIB/202

0|EPI_IS

L_482515

|_2020-0

5-18

hCoV-19/India/M

P-DRDE_530/2020|E

PI_IS

L_476840|_2020-04-09

hCoV-19/India/TG-CCMB-L302/2020|EPI_ISL_471643|_2020-04-20

hCoV-19/India/U

N-1652/2020|E

PI_IS

L_424363|_2020-03-12

hCoV-19/India/OR-ILSCV1274

2/2020|EPI_ISL_463013|_2020

-05-09

hCoV-19/India/MH-BJMC-1697/2020|EPI_ISL_496521|_2020-04-19

hCoV-19/In

dia/BR-NC

DC-4155/2

020|EPI_IS

L_4364

49|_202

0-04-12

hCoV-19/Kerala_India/nCoV4743/2020/03-2020

hCoV-19/India/HR-THSTI-BAL-913/2020|EPI_ISL_454865|_2020-04-16

hCoV-19/India/TG-CCMB-K496/2020|EPI_ISL_447580|_2020-04-13

hCoV-19/India/BR-NCDC-02251/2020|EPI_ISL_435112|_2020-03-28

hCoV-19/Kerala_India/nCoVQC-71/2020/04-2020

hCoV-19/India/DL-NCDC7512-CSIR-IGIB/2020|EPI_ISL_482634|_2020-05-28

hCoV-19/India/TG-CCMB-J230/2020|EPI_ISL_447572|_2020-04-01

hCoV-19/India/TG-CCMB-K849/2020|EPI_ISL_447860|_2020-04-16

hCoV-19/India/MH-MW12/2020|EPI_ISL_508424|_2020-05-01

hCoV-19/India/MH-NIV-22876/2020|EPI_ISL_479520|_2020-05-07

hCoV-19/India/DL-NCDC2334-CSIR-IGIB/2020|EPI_ISL_482560|_2020-05-19

hCoV-19/In

dia/K

A-IB47/2020|E

PI_IS

L_508325|_2020-06-03

hCoV-19/India/TG-GMC-RR1191/2020|EPI_ISL_438138|_2020-03-25

hCoV-19/India/WB-IICB-005/2020|EPI_ISL_569863|_2020-08-12

hCoV-19/India/KA-nimh-15896/2020|EPI_ISL_486409|_2020-05-10

hCoV-19/India/TG-CCMB-J812-P1/2020|EPI_ISL_458072|_2020-04-08

hCoV-19/India/TN-CCMB-C2-13/2020|EPI_ISL_447585|_2020-04-16

hCoV-19/India/MH-NIV-7966/2020|EPI_ISL_454552|_2020-04-06

hCoV-19/India/UT-AR45/2020|EPI_ISL_508189|_2020-05-23

hCoV-19/India/WB-NCDC-2502/2020|EPI_ISL_436414|_2020-03-31

hCoV-19/India/OR-ILSCV27725/2020|EPI_ISL_481122|_2020-05-21

hCoV-19/Karnataka_India/CoV-20-1261/2020/04-2020

hCoV-19/India/DL-MaxCov0010-CSIR-IGIB/2020|EPI_ISL_459913|_2020-05-03

hCoV-19/In

dia/U

N-1115/2020|E

PI_IS

L_421667|_2020-03-10

hCoV-19/India/M

H-NIV-7558/2020|E

PI_IS

L_454549|_2020-04-05

hCoV-19/India/DL-NCDC1934-CSIR-IGIB/2020|EPI_ISL_482513|_2020-05-18

hCoV-19/India/DL-NCDC7629-CSIR-IGIB/2020|EPI_ISL_482651|_2020-05-28hCoV-19/India/K

A-nim

h-7817/2020|EPI_IS

L_486399|_2020-04-29hCoV-19/India/TN-NCDC-02311/2020|EPI_ISL_435078|_2020-03-29hCoV-19/India/MH-NIV-SARI-4375/2020|EPI_ISL_479572|_2020-04-04

hCoV-19/India/DL-NCDC1580-CSIR-IGIB/2020|EPI_ISL_482498|_2020-05-16

hCoV-19/India/TG-CCMB-K272/2020|EPI_ISL_458070|_2020-04-12

hCoV-19/India/DL-NCDC7032-CSIR-IGIB/2020|EPI_ISL_482620|_2020-05-26

hCoV-19/India/TG-CCMB-O2/2020|EPI_ISL_458067|_2020-04-15

hCoV-19/India/M

H-NIV-Q

C-798/2020|E

PI_IS

L_454568|_2020-04-25

hCoV-19/India/DL-NCDC-3400/2020|EPI_ISL_436436|_2020-04-06

hCoV-19/Asam_India/nCoV11481/2020/05-2020

hCoV-19/India/KA-nimh-19184/2020|EPI_ISL_515966|_2020-05-14

hCoV-19/In

dia/D

L-NCDC-3831/2020|E

PI_IS

L_436437|_2020-04-08

hCoV-19/India/O

R-ILSC

V32670/2020|EPI_ISL_481162|_2020-06-03

hCoV-19/Mahara

shtra_India/_IGG

MC/NGP/11318/2

020/05-2020

hCoV-19/K

erala_India/n

CoV5479/2020//04-2020

hCoV-19/Kerala_India/nCoV3521/2020/03-2020

hCoV-19/India/TG-CCMB-L301-P1/2020|EPI_ISL_458074|_2020-04-20

hCoV-19/India/M

H-ACTREC-105/2020|EPI_ISL_699761|_2020-05-23

hCoV-19/India/MP-DRDE_1794/2020|EPI_IS

L_476849|_2020-04-29

hCoV-19/India/KA-nimh-17608/2020|EPI_ISL_515964|_2020-05-12

hCoV-19/India/GJ-GB

RC111/2020|EPI_IS

L_451156|_2020-05-01

hCoV-19/India/TG-CCMB-L1975/2020|EPI_ISL_495179|_2020-06-04

hCoV-19/India/GJ-GBRC98/2020|EPI_ISL_450787|_2020-05-11

hCoV-19/India/DL-NCDC5743-CSIR

-IGIB/2020|EPI_ISL_482614|_2020-0

5-25

hCoV-19/Asam_India/nCoV11521/2020/05-2020

hCoV-19/India/DL-NCDC7757-CSIR-IGIB/2020|EPI_ISL_482669|_2020-05-28

hCoV-19/India/D

L-NCDC-4138

/2020|EPI_IS

L_4364

48|_20

20-04-12

hCoV-19/India/DL-MaxCov0022-CSIR-IGIB/2020|EPI_ISL_459924|_2020-05-08

hCoV-19/Maharashtra

_India/_IGGMC/NGP

/6754/2020/05-2020

hCoV-19/India/KA-nimh-15835/2020|EPI_ISL_486389|_2020-05-10

hCoV-19/In

dia/W

B-NCDC-02248/2020|E

PI_IS

L_435081|_2020-03-28

hCoV-19/India/TG-CCMB-L040-P3/2020|EPI_ISL_471587|_2020-04-18

hCoV-19/India/RJ-SMSCOV175/2020|EPI_ISL_454833|_2020-04-27

hCoV-19

/India/D

L-NCDC

-3230/20

20|EPI_I

SL_436

431|_20

20-04-05

hCoV-19/Kerala_India/nCoV3814/2020/03-2020

hCoV-19/India/OR-ILSCV29181/2020|EPI_ISL_481127|_2020-05-22

hCoV-19/Andhra_Pradesh_India/SVIMSCOV-491/03-2020

hCoV-19/India/DL-NCDC1873-CSIR-IGIB/2020|EPI_ISL_482509|_2020-05-18

hCoV-19/India/TN-NCDC-02333/2020|EPI_ISL_435087|_2020-03-29

hCoV-19/India/KA-nimh-14835/2020|EPI_ISL_515959|_2020-05-08

hCoV-19/Kerala_India/nCoV4200/2020/03-2020

hCoV-19/India/KA-nimh-10559/2020|EPI_ISL_486400|_2020-05-03

hCoV-19/India/DL-NCDC-4189/2020|EPI_ISL_436450|_2020-04-13

hCoV-19/India/WB-S38/2020|EPI_ISL_508446|_2020-05-02

hCoV-19/In

dia/O

R-ILSCV15907/2020|E

PI_IS

L_463038|_2020-05-11

hCoV-19/India/UN-UN

1/2020|EPI_ISL_486852|_2020-06-10

hCoV-19/Ind

ia/TG-CCMB

-J318/2020|E

PI_ISL_4478

55|_2020-04-

02

hCoV-19/Punjab_India/ncov8370/2020/05-2020

hCoV-19/India/M

H-NIV-981/2020|E

PI_IS

L_452213|_2020-03-10

hCoV-19/India/UT-IMT-BY60/2020|EPI_ISL_547584|_2020-08-24

hCoV-19/India/DL-NCDC-4205/2020|EPI_ISL_436451|_2020-04-13

hCoV-19/India/TG-CCMB-N7/2020|EPI_ISL_495273|_2020-06-13

hCoV-19/India/M

H-ACTREC-031/2020|E

PI_IS

L_699687|_2020-05-06

hCoV-19/India/DL-NCDC7557-CSIR-IGIB/2020|EPI_ISL_482643|_2020-05-27

hCoV-19/India/DL-MaxCov0003-CSIR-IGIB/2020|EPI_ISL_459911|_2020-05-09

hCoV-19/India/OR-ILS

CV13676/2020|EPI_IS

L_463019|_2020-05-0

9

hCoV-19/In

dia/T

N-NCDC-02309/2020|E

PI_IS

L_435084|_2020-03-29

hCoV-19/India/DL-NCDC7721-CSIR-IGIB/2020|EPI_ISL_482664|_2020-05-28

hCoV-19/India/K

A-nim

h-3720/2020|EPI_IS

L_486394|_2020-04-23

hCoV-19/India/TG-CCMB-L040-P1/2020|EPI_ISL_458077|_2020-04-18

hCoV-19/In

dia/U

P-ACTREC-001/2020|E

PI_IS

L_699658|_2020-04-23

hCoV-19/Andhra_Pradesh_India/SVIMSCOV-334/03-2020

hCoV-19/India/HR-THSTI-BAL-231/2020|EPI_ISL_454862|_2020-04-11

hCoV-19/Kerala_India/nCoV6580/2020//04-2020

hCoV-19/India/TG-CCMB-L1973/2020|EPI_ISL_495178|_2020-06-04

hCoV-19/India/MP-NCDC-4867/2020|EPI_ISL_436458|_2020-04-16

hCoV-19/Kerala_India/nCoV4408/2020/03-2020

hCoV-19/India/WB-IK3/2020|EPI_ISL_511900|_2020-03-31

hCoV-19/India/MH-NIV-7044/2020|EPI_ISL_454547|_2020-04-04

hCoV-19/India/TG-CCMB-J068/2020|EPI_ISL_447562|_2020-03-31

hCoV-19/India/TG-CCMB-J125/2020|EPI_ISL_450326|_2020-04-01

hCoV-19/India/HR-THSTI-BAL-39/2020|EPI_ISL_454858|_2020-04-07

hCoV-19/Rajasthan_India/C-103525/2020_/05-2020

hCoV-19/India/UT-AR37/2020|EPI_ISL_511910|_2020-05-28

hCoV-19/Rajasthan_India/C-108451/2020_/06-2020

hCoV-19/In

dia/L

A-NCDC-01441/2020|E

PI_IS

L_435101|_2020-03-15

hCoV-19/India/DL-ILBS-282M/2020|EPI_ISL

_605800|_2020-04-09

hCoV-19/Punjab_India/ncoV8440/2020/05-2020

hCoV-19/India/DL-MaxCov0014-CSIR-IGIB/2020|EPI_ISL_459916|_2020-05-12

hCoV-19/India/TG-CCMB-O3/2020|EPI_ISL_458068|_2020-04-14

hCoV-19

/India/DL

-NCDC22

93-CSIR-

IGIB/202

0|EPI_ISL

_482556|

_2020-05

-19

hCoV-19/I

ndia/TN-N

CDC-023

27/2020|E

PI_ISL_43

5094|_202

0-03-29

hCoV-19/India/AS-NCDC-2525/2020|EPI_ISL_436421|_2020-03-31

hCoV-19/K

erala_India/n

CoV3448/2020/03-2020

hCoV-19/India/O

R-RMRC157/2020|EPI_ISL_455758|_2020-05-07

hCoV-19/India/

OR-ILSCV1276

2/2020|EPI_ISL_

463018|_2020-0

5-09

hCoV-19/India/GJ-GBRC213a/2020|EPI_ISL_475043|_2020-06-11

hCoV-19/India/MH-ACTREC-002/2020|EPI_ISL_699659|_2020-04-29

hCoV-19/India/MH-NIV-7786/2020|EPI_ISL_452209|_2020-04-06

hCoV-19/India/TG-CCMB-L149/2020|EPI_ISL_447861|_2020-04-16

hCoV-19

/India/TN

-NCDC-02

245/2020

|EPI_ISL

_435091|

_2020-03

-28

hCoV-19/India/M

H-ACTREC-144/2020|EPI_ISL_699800|_2020-05-29

hCoV-19

/India/D

L-NCDC

2189-C

SIR-IGI

B/2020|E

PI_ISL_

482546

|_2020-0

5-18

hCoV-19/India/KA-nimh-15833/2020|EPI_ISL_515961|_2020-05-10

hCoV-19/India/MH-NCDC-3965/2020|EPI_ISL_436444|_2020-04-09

hCoV-19/K

erala_India/n

CoV4350/2020/03-2020

hCoV-19/India/TG-CCMB-K601/2020|EPI_ISL_447582|_2020-04-14

hCoV-19/India/M

H-NIV-7893/2020|EPI_ISL_454551|_2020-04-07

hCoV-19/India/OR-ILSCV31447/2020|EPI_ISL_481148|_2020-05-26

hCoV-19/India/DL-NCDC7649-CSIR-IGIB/2020|EPI_ISL_482655|_2020-05-28

hCoV-19/India/GJ-GBRC10/2020|EPI_ISL_437438|_2020-04-24

hCoV-19/India/H

R-TH

STI2010B

/2020|EPI_IS

L_528383|_2020-04-27

hCoV-19/India/OR-ILSCV27726/2020|EPI_ISL_481123|_2020-05-21

hCoV-19/India/TG-CCMB-J166/2020|EPI_ISL_447568|_2020-04-01

hCoV-19/India/DL-MaxCov0018-CSIR-IGIB/2020|EPI_ISL_459920|_2020-05-21

hCoV-19/R

ajasthan_

India/1488

4/2020/05-

2020

hCoV-19/Ind

ia/OR-RMR

C169/2020|E

PI_ISL_4557

68|_2020-05-

07

hCoV-19/India/KA-nimh-11071/2020|EPI_ISL_515947|_2020-05-04

hCoV-19/India/OR-ILSCV32593/2020|EPI_ISL_481159|_2020-06-03

hCoV-19/India/TN-CCMB-C13/2020|EPI_ISL_458035|_2020-04-29

hCoV-19/India/WB-IICB-029/2020|EPI_ISL_581504|_2020-08-04

hCoV-19/India/DL-NCDC4462-CSIR-IGIB/2020|EPI_ISL_482591|_2020-05-23

hCoV-19/Ind

ia/DL-M

axCov0

034-CS

IR-IGIB/2020|EPI_IS

L_459936|_2020-05-10

hCoV-19/Kerala_India/nCoV4317/2020/03-2020

hCoV-19/In

dia/U

N-1644/2020|E

PI_IS

L_421672|_2020-03-12

hCoV-19/India

/GJ-GBRC16/2

020|EPI_ISL_

437444|_2020

-04-26

hCoV-19/India/O

R-ILC

-RMRC123/2020|EPI_ISL_455311|_2020-05-07

hCoV-19/India/KA-nimh-1598/

2020|EPI_ISL_436157|_2020-04

-17

hCoV-19/India/DL-NCDC7554-CSIR-IGIB/2020|EPI_ISL_482641|_2020-05-27

hCoV-19/India/KA-nimh-51

02/2020|EPI_ISL_486397|_

2020-04-25

hCoV-19/U

ttarakhand_India/4868/COVID-19/H

LD/2020/05-2020

hCoV-19/In

dia/L

A-NCDC-01616/2020|E

PI_IS

L_435105|_2020-03-17

hCoV-19/India/GJ-GBRC148b/2020|EPI_ISL_458110|_2020-05-17

hCoV-19/India/TG-CCMB-J288/2020|EPI_ISL_450329|_2020-04-01

hCoV-19/In

dia/T

G-CCMB-K499/2020|E

PI_IS

L_447865|_2020-04-14

hCoV-19/K

erala_India/nCoV3582/2020/03-2020

hCoV-19/India/MH-NIV-9127/2020|EPI_ISL_454556|_2020-04-12

hCoV-19/India/GJ-GBRC-415b/2020|EPI_ISL_586543|_2020-07-11

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3101/04/20/07-2020/07-2020

hCoV-19/In

dia/M

H-NIV-3472/2020|E

PI_IS

L_454528|_2020-03-17

hCoV-19/India/W

B-IICB-023/2020|EPI_ISL_581497|_2020-08-04

hCoV-19/India/D

L-NCDC7527-C

SIR-IGIB/2020|E

PI_IS

L_482637|_2020-05-28

hCoV-19/Kerala_India/nCoVQC-73/2020/04-2020

hCoV-19/India/OR-I

LSCV13744/2020|E

PI_ISL_463023|_202

0-05-09

hCoV-19/India/KA-NCDC-4055/2020|EPI_ISL_436447|_2020-04-10

hCoV-19/India/DL-NCDC-3271/2020|EPI_ISL_436433|_2020-04-05

hCoV-19/India/GJ-GB

RC114/2020|EPI_IS

L_451159|_2020-05-03

hCoV-19/India/D

L-ILBS-2476T/2020|E

PI_IS

L_605807|_2020-05-01

hCoV-19/India/TG-CCMB-L302-P1/2020|EPI_ISL_458076|_2020-04-20

hCoV-19/India/TG-CCMB-OM35/2020|EPI_ISL_644729|_2020-04-15

hCoV-19/In

dia/U

N-1100/2020|E

PI_IS

L_421664|_2020-03-10

hCoV-19/India/TG-CCMB-K599/2020|EPI_ISL_447863|_2020-04-14

hCoV-19/India/TG-CCMB-M930/2020|EPI_ISL_495264|_2020-06-13

hCoV-19/India/H

R-TH

STI2010D

/2020|EPI_IS

L_528385|_2020-04-27

hCoV-19/India/UP-NCDC

-02322/2020|EPI_ISL_43

5082|_2020-03-29

hCoV-19/India/D

L-MaxC

ov0036

-CSIR-IGIB/2020

|EPI_IS

L_4599

38|_20

20-05-11

hCoV-19/In

dia/U

N-1621/2020|E

PI_IS

L_421671|_2020-03-12

hCoV-19/India/M

H-ACTREC-017/2020|E

PI_IS

L_699673|_2020-05-03

hCoV-19/India/TG-CCMB-J812-P2/2020|EPI_ISL_471585|_2020-04-08

hCoV-19/India/DL-MaxCov0019-CSIR-IGIB/2020|EPI_ISL_459921|_2020-05-08

hCoV-19/M

aharashtra_India/C179/2020/03-2020

hCoV-19/In

dia/U

N-1616/2020|E

PI_IS

L_421669|_2020-03-12

hCoV-19/India/TG-CCMB-L301-P3/2020|EPI_ISL_471642|_2020-04-20

hCoV-19/Madhyapradesh_India/nCoV2415/2020/04-2020

hcov-19/Q100/IR

AN/UK

hCoV-19/Kerala_India/nCoV4758/2020/03-2020

hCoV-19/India/KA-IB14/2020|EPI_ISL_508292|_2020-06-18

hCoV-19/India/M

H-MW5/2020|E

PI_IS

L_508439|_2020-04-27

hCoV-19/In

dia/K

A-IB51/2020|E

PI_IS

L_508329|_2020-06-19

hCoV-19/India/TG-CCMB-OM32/2020|EPI_ISL_644726|_2020-04-13

hCoV-19/India/RJ-SMSCOV161/2020|EPI_ISL_454832|_2020-04-21

hCoV-19/Andh

ra_Prades

h_India/

SVIMSCOV-474/0

3-2020

hCoV-19/In

dia/K

A-InStem

-NCBS-0046/2020|E

PI_IS

L_477246|_2020-05-02

hCoV-19/In

dia/TG-CC

MB-J118/2

020|EPI_IS

L_447573|

_2020-04-0

1

hCoV-19/India/KA-nimh-17585/2020|EPI_ISL_515963|_2020-05-12

hCoV-19/India/DL-ILBS-9306T/2020|EPI_ISL_605804|_2020-05-01

hCoV-19/Kerala_India/nCoV4332/2020/03-2020

hCoV-19/Kerala_India/nCoV4753/2020/03-2020

hCoV-19/India/MH-NIV-12430/2020|EPI_ISL_479496|_2020-04-21

hCoV-19/India/TG-CCMB-L040/2020|EPI_ISL_471586|_2020-04-18

hCoV-19/India/TG-CCMB-J199/2020|EPI_ISL_450327|_2020-04-01

hCoV-19/India/TG-CCMB-J321/2020|EPI_ISL_447577|_2020-04-02

hCoV-19/Jammu_and_Kashmir_India/_SK-CoV-2-279/2020_04-2020

hCoV-19/India/TN-NCDC-02312/2020|EPI_ISL_435093|_2020-03-29

hCoV-19/K

erala_India/n

CoV3333/2020/03-2020

hCoV-19/In

dia/M

H-NCDC-02315/2020|E

PI_IS

L_435086|_2020-03-29

hCoV-19/India/KA-nimh-14723/2020|EPI_ISL_486388|_2020-05-10

hCoV-19/In

dia/W

B-IK22/2020|E

PI_IS

L_508352|_2020-04-25

hCoV-19/India/TG

-CCMB-OM34/2020|E

PI_IS

L_644728|_2020-04-15

hCoV-19/Kerala_India/nCoV3073/2020/03-2020

hCoV-19/Kerala_India/nCoV4318/2020/03-2020

hCoV-19/India/DL-MaxCov0032-CSIR-IGIB/2020|EPI_ISL_459934|_2020-05-05

hCoV-19/India/D

L-NCDC-3163/2020|EPI_IS

L_436424|_2020-04-05

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3120/04/20/07-2020/07-2020

hCoV-19

/India/T

G-CCMB

-J130/20

20|EPI_I

SL_4580

71|_2020

-04-01

hCoV-19/Chattisgarh_India/2810/04-2020hCoV-19/India/TN-NCDC-02332/2020|EPI_ISL_435096|_2020-03-29

hCoV-19/India/MH-NIV-7511/2020|EPI_ISL_452208|_2020-04-05

hCoV-19/India/DL-NCDC-2507/2020|EPI_ISL_436415|_2020-03-31

hCoV-19/India/OR-ILSCV1

2743/2020|EPI_ISL_463014

|_2020-05-09

hCoV-19/India/H

R-TH

STI2010A

/2020|EPI_IS

L_528382|_2020-04-27

hCoV-19/India/KA

-nimh-0351/2020|EPI_ISL_428484|_2020-04-10

hCoV-19/India/TN-CCMB-C3-14/2020|EPI_ISL_447586|_2020-04-16

hCoV-19/India/D

L-ILBS-9304D

/2020|EPI_ISL_605813|_2020-04-12

hCoV-19/India/D

L-MaxC

ov0039-CS

IR-IGIB/2020|EPI_IS

L_459941|_2020-05-11

hCoV-19/India/O

R-RMRC108/2020|EPI_ISL_455752|_2020-05-06

hCoV-19/In

dia/DL-

NCDC-3

982/202

0|EPI_ISL_43

6445|_2

020-04-

09

hCoV-19/India/H

R-TF24/2020|EPI_ISL_508498|_2020-06-19

hCoV-19/India/TG-CCMB-L276-P1/2020|EPI_ISL_458073|_2020-04-20

hCoV-19/Kerala_India/nCoV4352/2020/03-2020

hCoV-19/Maha

rashtra_India/

_IGGMC/NGP

/6837/2020/05

-2020

hCoV-19/Kerala_India/nCoV5093/2020/03-2020

hCoV-19/India/DL-NCDC7556-CSIR-IGIB/2020|EPI_ISL_482642|_2020-05-27

hCoV-19/Ind

ia/DL-NCDC-3195/2020|EPI_IS

L_436430|_2020-04-05

hCoV-19

/India/D

L-NCDC

-3264/20

20|EPI_I

SL_436

432|_20

20-04-05

hCoV-19/In

dia/U

N-1063/2020|E

PI_IS

L_424361|_2020-03-10

hCoV-19/India/TG-CCMB-L301/2020|EPI_ISL_458063|_2020-04-20

hCoV-19

/Rajastha

n_India/1

4887/202

0_/05-20

20

hCoV-19/U

ttar_Pradesh_India/_RMRCGKPCOV-8545/05/20/06-2020/06-2020

hCoV-19/India/U

N-1617/2020|E

PI_IS

L_421670|_2020-03-12

hCoV-19/India/MH-BJMC-1675/2020|EPI_ISL_496547|_2020-04-18

hCoV-19/India/TG-CCMB-J009/2020|EPI_ISL_447557|_2020-03-30

hCoV-19/India/WB-NCDC-02250/2020|EPI_ISL_435074|_2020-03-28

hCoV-19/India/K

A-nim

h-5297/2020|EPI_IS

L_486398|_2020-04-25

hCoV-19/India/TN-CCMB-C1-12/2020|EPI_ISL_447584|_2020-04-16

hCoV-19/India/WB-S7/2020|EPI_ISL_455641|_2020-03-31

hCoV-19/Chandigarh_India/_nCOV-4498/2020/05-2020

hCoV-19/India/MH-CCMB-NIV2/2020|EPI_ISL_450322|_2020-04-04hCoV-19/Andhra_Pradesh_India/SVIMSCOV-389/03-2020

hCoV-19/India/TG-CCMB-J247/2020|EPI_ISL_447853|_2020-04-01

hCoV-19/India/MH-NCDC-02310/2020|EPI_ISL_435085|_2020-03-29

hCoV-19/India/TG-CCMB-J283/2020|EPI_ISL_450328|_2020-04-01

hCoV-19/India/KA-nimh-11872/2020|EPI_ISL_515952|_2020-05-06

hCoV-19/India/TG-CCMB-J132/2020|EPI_ISL_447850|_2020-04-01

hCoV-19/Jammu_and_Kashmir_India/_SK-CoV-2-52551/2020/05-2020

hCoV-19/Ma

harashtra_Ind

ia/_IGGMC/NG

P/4666/2020/

05-2020

hCoV-19/In

dia/D

L-ILBS-271M

/2020|EPI_IS

L_605808|_2020-05-01

hCoV-19/India/MP-NCDC-4879/2020|EPI_ISL_436463|_2020-04-16

hCoV-19/India/TG-CCMB-J047/2020|EPI_ISL_447862|_2020-03-31

hCoV-19/India/TN-NCDC-02328/2020|EPI_ISL_435095|_2020-03-29

hCoV-19/India/WB-S27/2020|EPI_ISL_455656|_2020-04-30

hCoV-19/In

dia/MH-NI

V-6423/20

20|EPI_IS

L_454543

|_2020-03-

31

hCoV-19/India/TG-CCMB-K500/2020|EPI_ISL_447866|_2020-04-14

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3100/04/20/07-2020/07-2020

hCoV-19/K

erala_India/n

CoV5411/2020/03-2020

hCoV-19/India/DL-NCDC5723-CSIR-IGIB/2020|EPI_ISL_482611|_2020-05-23

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3117/04/20/07-2020/07-2020

hCoV-19

/India/D

L-MaxC

ov0033

-CSIR-IG

IB/2020

|EPI_ISL

_45993

5|_2020

-05-06

hCoV-19/India/DL-MaxCov0012-CSIR-IGIB/2020|EPI_ISL_459914|_2020-05-01

hCoV-19/India/DL-NCDC7515-CSIR-IGIB/2020|EPI_ISL_482635|_2020-05-28

hCoV-19/M

aharashtra

_India/_IG

GMC/NGP

/1635/2020

/04-2020

hCoV-19/K

arnataka_In

dia/C

oV-20-10727/2020/04-2020

hCoV-19/India/OR-ILSCV11165/2020|EPI_ISL_463010|_2020-05-07

hCoV-19/India/MP-DRDE_2497/2020|EPI_ISL_476852|_

2020-05-04

hCoV-19/Kerala_India/nCoV4432/2020/03-2020

hCoV-19/India/TG-CCMB-J207/2020|EPI_ISL_447570|_2020-04-01

hCoV-19/India/DL-NC

DC-3190/2020|EPI_ISL_436429|_2020-04-05

hCoV-19/Tamilna

du_India/K

IPM-20-75638

/2020/06-202

0

hCoV-19/India/MH-CCMB-NIV1/2020|EPI_ISL_450321|_2020-04-04

hCoV-19/India/DL-NCDC5726-CSIR-IGIB/2020|EPI_ISL_482613|_2020-05-23

hCoV-19/India/TG-CCMB-L007/2020|EPI_ISL_447583|_2020-04-16

hCoV-19/M

aharashtra

_India/_IGG

MC/NGP/18

89/2020/04

-2020

hCoV-19/India/GJ-GBRC96/2020|EPI_ISL_450784|_2020-05-09

hCoV-19/India/GJ-GBRC271b/2020|EPI_ISL_483869|_2020-06-13

hCoV-19/India/TG

-CCMB-M7/2020|EPI_ISL_495226|_2020-06-04

hCoV-19/India/TG-CCMB-J206/2020|EPI_ISL_447569|_2020-04-01

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3107/04/20/07-2020/07-2020

hCoV-19/India/O

R-NCDC-02336/2020|E

PI_IS

L_435088|_2020-03-29

hCoV-19/India/OR-ILSCV13921/2020|EPI_ISL_463025|_2020-05-10

hCoV-19/India/MP-DRDE_2429/2020|EPI_ISL_476850|_2020-05-03

hCoV-19/K

erala_India/n

CoV5185/2020/03-2020

hCoV-19/India/G

J-GBRC103/2020|E

PI_IS

L_451666|_2020-05-04

hCoV-19/India/PB-IMT-L176/2020|EPI_ISL_539486|_2020-04-30

hCoV-19/India/TG-CC

MB-OM31/2020|EPI_ISL_644725|_2020-04-13

hCoV-19/In

dia/L

A-NCDC-01444/2020|E

PI_IS

L_435102|_2020-03-15

hCoV-19/Maha

rashtra_India/_

IGGMC/NGP/46

53/2020/05-202

0

hCoV-19/India/WB-IICB-018/2020|EPI_ISL_581450|_2020-08-12

hCoV-19/Chandigarh_India/_nCOV-4339/2020/05-2020

hCoV-19/India/TG-GMC-KN443/2020|EPI_ISL_431103|_2020-03-16

hCoV-19/W

est_Bengal_In

dia/2019-n

CoV/0144/20/03-2020

hCoV-19/Chattisgarh_India/32050/06-2020

hCoV-19/In

dia/U

N-1073/2020|E

PI_IS

L_421662|_2020-03-10

hCoV-19/India/WB-IICB-024/2020|EPI_ISL_581499|_2020-08-12

hCoV-19/India/TG-CCMB-J304/2020|EPI_ISL_447567|_2020-04-01

hCoV-19/Chandigarh_India/_nCOV-4476/2020/05-2020

hCoV-19/India/DL-NCDC-3279/2020|EPI_ISL_436434|_2020-04-05

hCoV-19/India/KA-nimh-19188/2020|EPI_ISL_515967|_2020-05-14

hCoV-19/India/M

H-NIV-6614/2020|EPI_ISL_454544|_2020-04-02

hCoV-19/India/TG-CCMB-J224/2020|EPI_ISL_447571|_2020-04-01

hCoV-19/Jammu_and_Kashmir_India/SK-CoV-2-229/2020_03-2020

hCoV-19/India/WB-IICB-026/2020|EPI_ISL_581501|_2020-08-12

hCoV-19/India/TN-NCDC-2517/2020|EPI_ISL_436418|_2020-03-31

hCoV-19/Kerala_India/nCoV5899/2020//04-2020

hCoV-19/India/U

N-1111/2020|E

PI_IS

L_421666|_2020-03-10

hCoV-19/Karnatak

a_India/

CoV-20-

3781/20

20/04-20

20

hCoV-19/Kerala_India/nCoV3586/2020/03-2020

hCoV-19/India/GJ-GBRC208b/2020|EPI_ISL_475036|_2020-06-11

hCoV-19/Kerala_India/nCoV3233/2020/03-2020

hCoV-19/K

erala_India/n

CoV4763/2020/03-2020

hCoV-19

/India/D

L-NCDC-

3170/202

0|EPI_IS

L_436425

|_2020-0

4-05

hCoV-19/In

dia/TG-CCM

B-J165/202

0|EPI_ISL_4

47848|_202

0-04-01

hCoV-19/In

dia/P

B-IMT-L178/2020|E

PI_IS

L_539487|_2020-05-30

hCoV-19/India/MH-NIV-6964/2020|EPI_ISL_454546|_2020-04-03

hCoV-19/India/OR-ILSCV

13678/2020|EPI_ISL_46

3020|_2020-05-09

hCoV-19/India/KA-nimh-15819/2020|EPI_ISL_486408|_2020-05-10

hCoV-19/In

dia/K

A-nimh-1454/2020|E

PI_IS

L_486392|_2020-04-15

hCoV-19/India/U

N-1104/2020|E

PI_IS

L_421665|_2020-03-10

hCoV-19/India/OR-ILSCV32656/2020|EPI_ISL_481161|_2020-06-03

hCoV-19/India/MH-NIV-729-3/2020|EPI_ISL_454561|_2020-04-07

hCoV-19/India/OR-ILSCV16512/2020|EPI_ISL_463051|_2020-05-12

hCoV-19/West_Bengal_India_/2019-nCoV/5047/20/05-2020

hCoV-19/Kerala_India/nCoV5157/2020/03-2020

hCoV-19/_Jammu_and_KashmirIndia/_Sk-Cov-2-52384/2020_05-2020

hCoV-19/Karnataka_India/CoV-20-18177/2020/05-2020

hCoV-19/Kerala_India/nCoV4762/2020/03-2020

hCoV-19/India/D

L-NCDC7715-C

SIR-IGIB/2020|E

PI_IS

L_482661|_2020-05-28

hCoV-19/Ind

ia/OR-ILSCV

13715/2020|

EPI_ISL_463

022|_2020-05

-09

hCoV-19/K

erala_India/n

CoV4198/2020/03-2020

hCoV-19/India/U

N-1125/2020|E

PI_IS

L_421668|_2020-03-10

hCoV-19/India/WB-IICB-019/2020|EPI_ISL_581451|_2020-08-12

hCoV-19/India/M

H-NIV-1804/2020|E

PI_IS

L_454526|_2020-03-13

hCoV-19/India/UN-NCDC-02240/2020|EPI_ISL_435098|_2020-03-28

hCoV-19/P

unjab_India/n

cov8298/2020/05-2020

hCoV-19/India/D

L-NCDC7686-C

SIR-IGIB/2020|EPI_ISL_482660|_2020-05-28

hCoV-19/India/U

P-NBRI-N13-TC

R323/2020|EPI_ISL_516942|_2020-04-02

hCoV-19/India/DL-MaxCov0031-CSIR-IGIB/2020|EPI_ISL_459933|_2020-05-05

hCoV-19/India/TG-CCMB-J232/2020|EPI_ISL_447576|_2020-04-02

hCoV-19/Andhra_Pradesh_India/SVIMSCOV-404/03-2020

hCoV-19/In

dia/BR-NC

DC-2515/20

20|EPI_ISL

_436417|_2

020-03-31

hCoV-19/India/TG-CCMB-J131/2020|EPI_ISL_447566|_2020-04-01

hCoV-19/India/DL-NCDC-4208/2020|EPI_ISL_436452|_2020-04-14

hCoV-19/India/TG-CCMB-J375/2020|EPI_ISL_450330|_2020-04-02

hCoV-19/India/DL-MaxCov0038-CSIR-IGIB/2020|EPI_ISL_459940|_2020-05-10

hCoV-19/India/TG-CCMB-O2-P4S/2020|EPI_ISL_471646|_2020-04-15

hCoV-19/India/UP-NCDC-02320/2020|EPI_ISL_435099|_2020-03-29

hCoV-19/India/TG-CCMB-J208/2020|EPI_ISL_447574|_2020-04-01

hCoV-19/India/PB-IMT-L169/2020|EPI_ISL_561344|_2020-04-30

hCoV-19/India/TG-CCMB-K771/2020|EPI_ISL_447859|_2020-04-15

hCoV-19/India/W

B-IICB-027/2020|EPI_ISL_581502|_2020-08-12

hCoV-19/India/U

P-NBRI-N14-TCR32

4/2020

|EPI_IS

L_5169

43|_20

20-04-02

hCoV-19/India/TG

-CCMB-J235/2020|E

PI_IS

L_447852|_2020-04-01

hCoV-19/India/TG

-CCMB-J048/2020|E

PI_IS

L_447559|_2020-03-31

hCoV-19/India/TG-CCMB-L276-P3/2020|EPI_ISL_471641|_2020-04-20

hCoV-19/India/TG-CCMB-J045/2020|EPI_ISL_447847|_2020-03-31

hCoV-19/K

arnataka_In

dia/C

oV-20-3693/2020/04-2020

hCoV-19/India/TG-CCMB-J043/2020|EPI_ISL_447560|_2020-03-31

hCoV-19/India/MP-DRDE-2498/2020|EPI_ISL_476853|_2020-05-04

hCoV-19/Maharas

htra_India/_IGGM

C/NGP/7180/2020

/05-2020

hCoV-19/Ind

ia/DL-M

axCov0

035-CS

IR-IGIB/2020|EPI_IS

L_459937|_2020-05-10

hCoV-19/India/KA

-nimh-0318/2020|EPI_ISL_428483|_2020-04-10

hCoV-19/India/MH-NIV-QC-797/2020|EPI_ISL_452215|_2020-04-22

hCoV-19/India/DL-MaxCov0020-CSIR-IGIB/2020|EPI_ISL_459922|_2020-05-10

hCoV-19/In

dia/T

G-CCMB-J220/2020|E

PI_IS

L_450331|_2020-04-01

hCoV-19/Maharashtr

a_India/_IGGMC/NG

P17399/2020/06-202

0

hCoV-19/In

dia/K

A-nimh-4263/2020|E

PI_IS

L_486395|_2020-04-23

hCoV-19/Kerala_India/nCoV3336/2020/03-2020

hCoV-19/In

dia/O

R-ILSCV15937/2020|E

PI_IS

L_463041|_2020-05-11 h

CoV-19/India/TG-CCMB-J327/2020|EPI_ISL_447578|_2020-04-02

hCoV-19/India/TG-CCMB-J458/2020|EPI_ISL_447575|_2020-04-02

hCoV-19/In

dia/T

G-CCMB-J714/2020|E

PI_IS

L_458298|_2020-04-06

hCoV-19/Jammu_and_Kashmir_India/_SK-CoV-2-803/2020_04-2020

hCoV-19/Ind

ia/GJ-GBRC109/2020|EPI_ISL_451154|_2020-05-03

hCoV-19/India/MH-NIV-6311/2020|EPI_ISL_452206|_2020-03-30

hcov-19/Q

111/Luckn

ow/UP

hCoV-19/India/GJ-GBRC271a/2020|EPI_ISL_483868|_2020-06-13

hCoV-19/India/KA-nimh-0999/2020|

EPI_ISL_486382|_2020-04-14

hCoV-19/India/MP-DRDE_3926/2020|EPI_ISL_476884|_2020-05-16

hCoV-19/India/DL-MaxCov0040-CSIR-IGIB/2020|EPI_ISL_459942|_2020-04-27

hCoV-19/India/TG

-CCMB-J222/2020|E

PI_IS

L_450332|_2020-04-01

hCoV-19/India/DL-NCDC1892-CSIR-IGIB/2020|EPI_ISL_482511|_2020-05-18

hCoV-19/In

dia/T

G-CCMB-O1/2020|E

PI_IS

L_458066|_2020-04-13

hCoV-19/India/OR

-ILSCV19192/2020

|EPI_ISL_463067|

_2020-05-14

hCoV-19/India/TG-CCMB-J812/2020|EPI_ISL_447858|_2020-04-06

hCoV-19/India/TG-CCMB-J212/2020|EPI_ISL_447564|_2020-04-01

hCoV-19/India/TG-CCMB-M980/2020|EPI_ISL_495269|_2020-06-13

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3112/04/20/07-2020/07-2020

hCoV-19/In

dia/D

L-NCDC-3394/2020|E

PI_IS

L_436435|_2020-04-09

hCoV-19/India/M

H-NIV-6216/2020|EPI_ISL_454540|_2020-04-01

hCoV-19/India/OR-ILSCV14374/2020|EPI_ISL_463031|_2020-05-10

hCoV-19/K

erala_India/n

CoV4201/2020/03-2020

hCoV-19/India/KA-nimh-14834/2020|EPI_ISL_515958|_2020-05-08

hCoV-19/In

dia/HR-TH

STI-BAL-9

14/2020|E

PI_ISL_45

4866|_202

0-04-16

hCoV-19/In

dia/DL-NC

DC218

8-CSIR

-IGIB/2020

|EPI_IS

L_4825

45|_202

0-05-18

hCoV-19/India

/GJ-GBRC14/2

020|EPI_ISL_4

37442|_2020-04

-27

hCoV-19

/India/T

G-CCMB-J

067/202

0|EPI_IS

L_4475

61|_202

0-03-31

hCoV-19/India/TG-CCMB-K128/2020|EPI_ISL_447579|_2020-04-10

hCoV-19/India/BR-NCDC-2519/2020|EPI_ISL_436419|_2020-03-31

hCoV-19/Jammu_and_Kashmir_India/_SK-CoV-2-52564/2020_05-2020

hCoV-19/India/WB-IICB-022/2020|EPI_ISL_581496|_2020-08-12

hCoV-19/India/DL-MaxCov0015-CSIR-IGIB/2020|EPI_ISL_459917|_2020-05-08

hCoV-19/India/WB-IICB-020/2020|EPI_ISL_581494|_2020-08-12

hCoV-19/India/HR-THSTI-BAL-918/2020|EPI_ISL_454867|_2020-04-16

hCoV-19/India/M

H-NIV-1738/2020|E

PI_IS

L_454525|_2020-03-12

hCoV-19/India/DL-ILBS-279M/2020|EPI_ISL_605799|_2020-05-01

hCoV-19/India/PB-IMT-I14/2020|EPI_ISL_539483|_2020-04-27

hCoV-19/India/GJ-GBRC116/2020|EPI_ISL_451161|_2020-05-03

hCoV-19/India/WB-IICB-017/2020|EPI_ISL_581449|_2020-08-04

hCoV-19/India/GJ-GBRC24b/2020|EPI_ISL_437454|_2020-04-26

hCoV-19/India/TG-CCMB-J350/2020|EPI_ISL_447854|_2020-04-02

hCoV-19/Kerala_India/nCoV5591/2020//04-2020

hCoV-19/India/TG-CCMB-J223/2020|EPI_ISL_447851|_2020-04-01

hCoV-19

/India/DL

-NCDC205

8-CSIR-IG

IB/2020|E

PI_ISL_48

2537|_20

20-05-19

hCoV-19/India/un-OUMRK1090/2020|EPI_ISL_447902|_2020-03-25

hCoV-19/India/GJ-GBRC272b/2020|EPI_ISL_483871|_2020-06-13

hCoV-19/India/TG-CCMB-O2-P4M/2020|EPI_ISL_471645|_2020-04-15

hCoV-19/India/OR-ILSCV16509/2020|EPI_ISL_463049|_2020-05-12

hCoV-19/India/MH-MW3/2020|EPI_ISL_508433|_2020-04-27

hCoV-19/A

ndhra_Pr

adesh_Ind

ia/SVIMSC

OV-381/03

-2020

hCoV-19/India/DL-NCDC2037-CSIR-IGIB/2020|EPI_ISL_482531|_2020-05-18

hCoV-19/Kerala_India/nCoV3256/2020/03-2020

hCoV-19/India/DL-ILBS-1718M/2020|EPI_ISL_605805|_2020-05-01

hCoV-19/India/DL-NCDC7728-CSIR-IGIB/2020|EPI_ISL_482665|_2020-05-28

hCoV-19

/India/T

G-CCMB

-K600B/2

020|EPI_

ISL_4475

81|_2020

-04-14

hCoV-19/India/TG-CCMB-L044/2020|EPI_ISL_447864|_2020-04-16

hCoV-19/India/UP-NCDC-02321/2020|EPI_ISL_435100|_2020-03-29

hCoV-19/India/DL-NCDC7717-CSIR-IGIB/2020|EPI_ISL_482663|_2020-05-28

hCoV-19/India/DL-NCDC7101-CSIR-IGIB/2020|EPI_ISL_482629|_2020-05-27

hCoV-19/India/DL-NCDC2192-CSIR-IGIB/2020|EPI_ISL_482547|_2020-05-18

hCoV-19/India/DL-NCDC2214-CSIR-IGIB/2020|EPI_ISL_482552|_2020-05-18

hCoV-19/India/WB-NCDC-02252/2020|EPI_ISL_435097|_2020-03-28

hCoV-19/India/DL-MaxCov0030-CSIR-IGIB/2020|EPI_ISL_459932|_2020-05-13

hCoV-19/India/DL-MaxCov0013-CSIR-IGIB/2020|EPI_ISL_459915|_2020-05-12

hCoV-19/India/MH-AFMC-5473/2020|EPI_ISL_496520|_2020-05-23

hCoV-19/K

erala_India/n

CoV3213/2020/03-2020

hCoV-19/India/RJ-SMSCOV141/2020|EPI_ISL_454831|_2020-04-29

hCoV-19/India/D

L-NCDC1722

-CSIR-IGIB/2020

|EPI_IS

L_4825

01|_20

20-05-17

hCoV-19/India/RJ-SMSCOV109/2020|EPI_ISL_454830|_2020-04-23

NC_045512_SA

RS-CoV-2_isolate_W

uhan-Hu-1

hCoV-19/India/GJ-GBRC-415a/2020|EPI_ISL_586542|_2020-07-11

hCoV-19/In

dia/DL-Ma

xCov00

41-CSIR-IGIB

/2020|EPI_ISL_4

59943|_20

20-05-01

hCoV-19/India/T

G-CCMB-J390

/2020|EPI_IS

L_4478

57|_20

20-04-02

hCoV-19/In

dia/L

A-NCDC-01614/2020|E

PI_IS

L_435104|_2020-03-17

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3090/04/20/07-2020/07-2020

hCoV-19/India/DL-MaxCov0037-CSIR-IGIB/2020|EPI_ISL_459939|_2020-05-11

hCoV-19/In

dia/DL-NC

DC-3175/2

020|EPI_IS

L_4364

26|_202

0-04-05

hCoV-19/Sikkim-India/5925/2020/06-2020

hCoV-19/India/TG-CCMB-O4/2020|EPI_ISL_458069|_2020-04-03

hCoV-19/India/DL-NCDC1733-CSIR-IGIB/2020|EPI_ISL_482503|_2020-05-18

hCoV-19/India/KA-nimh-11068/2020|EPI_ISL_515945|_2020-05-04

hCoV-19/India/O

R-ILSCV12752/2

020|EPI_ISL_463

017|_2020-05-09

hCoV-19/In

dia/L

A-NCDC-01760/2020|E

PI_IS

L_435106|_2020-03-18

hCoV-19/India/KA-nimh-19688/2020|EPI_ISL_515970|_2020-05-16

hCoV-19/Uttar_Pradesh_India/_RMRCGKPCOV-3116/04/20/07-2020/07-2020

hCoV-19/In

dia/T

N-NCDC-02242/2020|E

PI_IS

L_435083|_2020-03-28

hCoV-19/India/TN-CCMB-C21/2020|EPI_ISL_471584|_2020-05-06hCoV-19/India/GJ-GBRC322b/2020|EPI_ISL_500950|_2020-06-17

hCoV-19/India/TG-CCMB-L299/2020|EPI_ISL_458062|_2020-04-20

hCoV-19/In

dia/DL-ILBS

-1929M/202

0|EPI_ISL_6

05806|_202

0-05-01

hCoV-19/In

dia/U

P-NBRI-N11-T

CR316/2020|E

PI_IS

L_516940|_2020-04-02

hCoV-19/India/TG-CCMB-J107/2020|EPI_ISL_447849|_2020-04-01

hCoV-19/India/TG

-GMC-RK1090/2020|EPI_ISL_438139|_2020-03-25

hCoV-19/In

dia/L

A-NCDC-01604/2020|E

PI_IS

L_435103|_2020-03-17

hCoV-19/India/A

S-NCDC-2535

/2020|EPI_IS

L_4364

22|_20

20-03-31

hCoV-19/Maharashtra_India/C1046/2020/03-2020

hCoV-19/Maha

rashtra_India/

_IGGMC/NGP

/7141/2020/05-

2020

hCoV-19/India/DL-MaxCov0016-CSIR-IGIB/2020|EPI_ISL_459918|_2020-05-09

hCoV-19/India/TN-NCDC-02244/2020|EPI_ISL_435092|_2020-03-28

hCoV-19/India/AP-NCDC-3936/2020|EPI_ISL_436440|_2020-04-09

hCoV-19/India/WB-IICB-028/2020|EPI_ISL_581503|_2020-08-12

hCoV-19/India/DL-MaxCov0017-CSIR-IGIB/2020|EPI_ISL_459919|_2020-05-21

hCoV-19/India/DL-NCDC-3181/2020|EPI_ISL_436428|_2020-04-05

hCoV-19/India/UT-AR46/2020|EPI_ISL_508190|_2020-05-23

hCoV-19/P

unjab_India/n

coV8446/2020/05-2020

hCoV-19/India/TG-CCMB-J057/2020|EPI_ISL_447563|_2020-03-31

hCoV-19/India/TG-CCMB-M664/2020|EPI_ISL_495222|_2020-06-11

hCoV-19/India/TG-CCMB-J374/2020|EPI_ISL_447856|_2020-04-02

hCoV-19/India/MP-DRDE-4022/2020|EPI_ISL_476890|_2020-05-16

hCoV-19/India/MP-NIHSAD-1005-52/2020|EPI_ISL_452795|_2020-05-09

hCoV-19/India/TN-NCDC-02334/2020|EPI_ISL_435080|_2020-03-29

Figure 3. Phylogenetic tree of the 406 unclassified SARS-CoV-2 genomes: A Neighbor-joining tree of the 406 unclassifiedSARS-CoV-2 sequences retrieved in this study, along with the representative SARS-Cov-2 sequences from different cladeswith a bootstrap replication of 1000 cycles. Two major groups of unclassified sequences were observed, which are marked indifferent shades of blue and green. The first cluster has amino acid changes at ORF1ab (T2016K, A4489V) and N: P13L,represented in blue and its shades, whereas the second cluster has amino acid changes at ORF1ab (R207C, V378I, andM2790I), represented with the green color edges. The representative L, S, V, G, GH, GR GISAID clades are marked on thenodes with the colors red, pink, orange, green, blue, and light blue, respectively. FigTree v1.4.4 was used to visualize thegenerated tree.

A haplotype network plot was generated for the unclassified sequences retrieved(n = 96) in this study along with the other representative clade sequences (Figure 4). Thenetwork plot indicates the presence of two different clusters within the unclassified SARS-CoV-2 sequences. Supplementary Figure S1 depicts the NJ tree for the same set of sequences.The rest of the sequences (n = 14) did not have any of the above-described patterns due tothe ambiguity in the sequences.

Page 9: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 9 of 13

Viruses 2021, 13, x FOR PEER REVIEW 9 of 13

A haplotype network plot was generated for the unclassified sequences retrieved (n = 96) in this study along with the other representative clade sequences (Figure 4). The network plot indicates the presence of two different clusters within the unclassified SARS-CoV-2 sequences. Supplementary Figure S1 depicts the NJ tree for the same set of se-quences. The rest of the sequences (n = 14) did not have any of the above-described pat-terns due to the ambiguity in the sequences.

Figure 4. A haplotype network plot was generated from the 96 SARS-Cov-2 sequences belonging to the unclassified cluster, along with representative sequences of the other clades using the me-dian-joining method in PopART v1.7 with epsilon as 0. The light-green color of the branches de-picts the B.4 variant of the SARS-CoV-2. Blue-grey, orange, and dark green are the sequences from the B.6 variant.

3.4. The Temporal Trend of Indian SARS-CoV-2 Sequences Demonstrates an Increase of G and Its Subclades in Different States of India

Figure 5 illustrates the temporal trend along with the geographic distribution of the SARS-CoV-2 sequences in India. In the earlier stage of the COVID-19 pandemic, the L, S, and V clade sequences were observed only in Delhi, Maharashtra, and Karnataka states in India. This was followed by the reporting of the G and its subclades from March, an observation similar to previous literature [16,17]. The G clade sequences evolved into two newer subclades (GR and GH). The GR and the GH clades established themselves and superseded their parent strain within only two months (March–May). The predominance of different strains was observed in different states in India. The parental G strain majorly affected Rajasthan, Madhya Pradesh, and Odisha. The southern parts of India (Telangana, Andhra Pradesh, Karnataka, and Maharashtra) seemed to be predominantly affected by the GR strain. The northern states Jammu and Kashmir, Punjab, Chandigarh, Haryana,

Figure 4. A haplotype network plot was generated from the 96 SARS-Cov-2 sequences belonging tothe unclassified cluster, along with representative sequences of the other clades using the median-joining method in PopART v1.7 with epsilon as 0. The light-green color of the branches depictsthe B.4 variant of the SARS-CoV-2. Blue-grey, orange, and dark green are the sequences from theB.6 variant.

3.4. The Temporal Trend of Indian SARS-CoV-2 Sequences Demonstrates an Increase of G and ItsSubclades in Different States of India

Figure 5 illustrates the temporal trend along with the geographic distribution of theSARS-CoV-2 sequences in India. In the earlier stage of the COVID-19 pandemic, the L, S,and V clade sequences were observed only in Delhi, Maharashtra, and Karnataka statesin India. This was followed by the reporting of the G and its subclades from March, anobservation similar to previous literature [16,17]. The G clade sequences evolved into twonewer subclades (GR and GH). The GR and the GH clades established themselves andsuperseded their parent strain within only two months (March–May). The predominanceof different strains was observed in different states in India. The parental G strain majorlyaffected Rajasthan, Madhya Pradesh, and Odisha. The southern parts of India (Telangana,Andhra Pradesh, Karnataka, and Maharashtra) seemed to be predominantly affected bythe GR strain. The northern states Jammu and Kashmir, Punjab, Chandigarh, Haryana,and Delhi had a prevalence of the GH strains (Figure 5). It was observed that the northernpart of India had a higher dominance of the GH clade, whereas the southern and centralparts of India had the GR clade [18]. These scenarios were derived from the representativesamples, which are fewer in number as compared to the real scenario, hence the limitationto the sample set analyzed.

Page 10: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 10 of 13

Viruses 2021, 13, x FOR PEER REVIEW 10 of 13

and Delhi had a prevalence of the GH strains (Figure 5). It was observed that the northern part of India had a higher dominance of the GH clade, whereas the southern and central parts of India had the GR clade.[18] These scenarios were derived from the representative samples, which are fewer in number as compared to the real scenario, hence the limitation to the sample set analyzed.

Figure 5. Distribution of the SARS-CoV-2 genome prevalence from the outbreak phase (January 2020) up to the seventh month of the pandemic. Stacked area plots are generated to demonstrate the cumulative temporal trends of the SARS-CoV-2 observed in the different states in India. The x-axis depicts the number of SARS-Cov-2 sequences observed in the respective months. The size of each pie chart within the states of the Indian map is proportional to the numbers in each respec-tive clade. The outline of India’s map is downloaded from http://www.survey-ofindia.gov.in/file/Map%20of%20India_1.jpg (accessed on 20 March 2020) and further modified to include relevant data in the SVG editor.

4. Discussion Whole-genome sequencing (WGS) serves as an important tool for determining geo-

graphical prevalence, the evolution of viruses over time, predicting the trends of disease transmission, and for understanding the most effective designs and platforms for devel-oping vaccines and therapeutics [19,20]. WGS also helps in tracing the transmission chains of the virus [21]. The number of cases of COVID-19 is on a continuously rising trend all across the world (1). In India, during the first SARS-CoV-2 wave, a maximum number of cases was reached in the period between September and October 2020, and subsequently

Figure 5. Distribution of the SARS-CoV-2 genome prevalence from the outbreak phase (January2020) up to the seventh month of the pandemic. Stacked area plots are generated to demonstrate thecumulative temporal trends of the SARS-CoV-2 observed in the different states in India. The x-axisdepicts the number of SARS-Cov-2 sequences observed in the respective months. The size of eachpie chart within the states of the Indian map is proportional to the numbers in each respective clade.The outline of India’s map is downloaded from http://www.surveyofindia.gov.in/file/Map%20of%20India_1.jpg (accessed on 20 March 2020) and further modified to include relevant data in theSVG editor.

4. Discussion

Whole-genome sequencing (WGS) serves as an important tool for determining geo-graphical prevalence, the evolution of viruses over time, predicting the trends of diseasetransmission, and for understanding the most effective designs and platforms for develop-ing vaccines and therapeutics [19,20]. WGS also helps in tracing the transmission chains ofthe virus [21]. The number of cases of COVID-19 is on a continuously rising trend all acrossthe world (1). In India, during the first SARS-CoV-2 wave, a maximum number of cases wasreached in the period between September and October 2020, and subsequently declineduntil February 2021. The recent exponential upsurge (second wave) of the COVID-19 casesin India has been observed from April 2021, with more than 0.2 million new cases being re-ported as of 17 April 2021 (https://www.worldometers.info/coronavirus/country/india/(accessed on 22 March 2021)). During the first year of its spread, GISAID classified the virusinto different clades based on the specific mutation observed at different protein positions(Figure S2) [22]. However, with the recent diversity of the new SARS-CoV-2 variants, a

Page 11: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 11 of 13

dynamic nomenclature based on the phylogenetic framework is used to identify lineageswith an active spread, referred to as PangoLIN [15].

Three variants as defined in the PangoLIN nomenclature—B.1.1.7 lineage (also knownas 20B/501Y.V1 Variant of Concern (VOC) 1 December 2020) and B.1.351 lineage (alsoknown as 20C/501Y.V2) have recently been reported from India [23,24]. These variants areof concern due to antigenic drift, increased transmissibility, and immune escape (especiallyfor B.1.351) mechanisms. The number of variants is increasing, and these strains carrysignificant mutations in the S gene. We identified a group of strains within lineage B.4,defined by two major changes including a stop codon in ORF8. However, the effect of thisunique cluster in disease outcomes or virus transmission has not been ascertained thus far.The analysis of the genome sequences of the SARS-CoV-2 retrieved in this study led to theidentification of individual amino acid mutations present in the early samples. Recently, anew PangoLIN lineage (B.1.617) was identified in Indian SARS-CoV-2 sequences, with theE484Q and L452R mutation (commonly known as a double mutant) in the spike proteinof SARS-CoV-2, which is considered to have higher transmission rates. The SARS-CoV-2 sequence analyses during the period between January and August 2020 revealed thepresence of the E484Q mutation in the spike protein. These sequences were found inMaharashtra in March (n = 1) and July (n = 2) 2020. Another immune escape mutation, theN440K amino acid in the spike protein, was also observed in Telangana (n = 7), AndhraPradesh (n = 5), and Assam (n = 1) from May 2020. This indicates that despite the absenceof the double mutant variant during the early phase of infection, the presence of a singleindependent mutation could be seen. Furthermore, it was also observed that the multiplemutations found in the VOC, VUI, and variant of high consequence were not presentduring this period, although single independent mutations were seen. In addition, thisstudy observed the presence of individual amino acid variants in the SARS-CoV-2 variantsB.1.1.7 (S494P), B.1.525 (A67V, Q677H), B.1.526 (L5F, T95I, S477N), and P2 (V1176F) in theearlier samples.

The amino acid mutation—L3606F in the NSP6 region of ORF1ab—is quite intriguing.Most of the strains group in the unclassified cluster. The ORF1ab with L3606 prominentlyconsist of the early S clade sequences, along with the newly emerged G clade and itssubclades. Based on the analysis of this study, 84.6% of sequences in the unclassified clusterhad the ORF1ab: L3606F mutation. This comprises the B.4 and B.6 lineages. The changesin the amino acid of the remaining SARS-CoV-2 sequences (3.4%) in this set are undefined.The effect at this genomic position needs to be looked upon as limited evolution is observedwhen Phenylalanine is present at GP 3606 as compared to Leucine. It is observed that theLeucine at the 3606 GP has an increasing number of G and its subclade. This study alsoidentifies the presence of stop codons in the ORF8 protein of the Kerala SARS-CoV-2. Theaccessory protein ORF8 plays a role in the immune response and evasion, as reported by arecent computational study [25].

This analysis further demonstrates that the SARS-CoV-2 variant was not reportedto be circulating in India until August, the samples of which were downloaded on 9thDecember 2020 [23]. The circulating clades in the country may be attributed to the earlyintroductions into India through travelers as well as the mixing of clades. Besides, the earlytransmissions within the country could be chiefly traced to movements of migrant workersand the holding of religious gatherings. The independent identification of the amino acidmutations observed in the SARS-CoV-2 variants from the early phase samples indicates anevolutionary trend in the current circulating strain that is geared towards host adaptation.The molecular epidemiology of SARS-CoV-2 needs to be analyzed continuously so thatchanges in the amino acids can be tracked, and the effect of these mutations in the diseasetransmission dynamics and its pathophysiology can be promptly assessed.

Page 12: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 12 of 13

Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/v13050925/s1 Figure S1 Neighbor-joining of the 96 SARS-CoV-2 sequences belonging to theunclassified cluster using the Tamura-Nei model with a bootstrap replication of 1000 cycles. Thelight green colour depicts the CI2 (B.4 lineage) variant of the SARS-CoV-2 and Blue and its shadesare the sequences from A3i (B.6 lineage) variant. The red colour is a part of the A3i group that hasadditional missense conservation of S2015R. The representative sequences of L, S, V, G, GH, GRGISAID clades are marked in Red, pink, orange, Green, Blue, and light blue colour respectively.Figure S2. Temporal distribution of the SARS-CoV-2 genome across the world from the outbreakphase (January 2020) till seven (or eight) months of the pandemic. Stacked area plots are generatedto demonstrate the cumulative temporal trends of the SARS-CoV-2 observed for different countriesof the various continent. The x-axis depicts the number of SARS-Cov-2 sequences observed duringrespective months. The size of each pie chart within the continent of the world map is proportionalto the numbers within each respective clade. The outline of the world map is downloaded fromhttps://d-maps.com/index.php?lang=en and further modified to include relevant data in the SVGeditor. Table S1. The patient’s clinical history, for the clinical sample that could retrieve SARS-CoV-2genomic sequences. Table S2. Details of the patient’s age and gender along with the percentage of thegenome retrieved and relevant reads mapped to the SARS-CoV-2 sequence retrieved in this study.

Author Contributions: Conceptualization, P.D.Y., N.G., S.P. (Savita Patil), P.A. and B.B. (Balram Bhar-gava); methodology, P.D.Y., V.P., S.P. (Savita Patil), T.M., A.M.S., P.P. software, H.K., S.C., D.A.N., A.K.,Y.J.; re-sources, R.S., N.A., J.N., N.V., U.K., A.P.S., A.M., T.S., S.D. (Sulochna Devi), T.M. (Tapan Ma-jumdar), S.C.J., R.B., Y.J., R.S., J.S., M.S., M.K., V.R., S.D. (Shanta Dutta), S.Y., K.K., S.R., D.B., B.B.(Biswajyoti Borkakoty), S.V., S.R. (Sharmila Raut), H.D., D.P., J.T., B.M., B.F., V.N., A.J., A.B., A.G.;writing—original draft preparation, P.D.Y., N.G., D.A.N.; writing—P.D.Y., V.P., S.P. (Samiran Panda),T.M., A.M.S., P.P., D.A.N. supervision, P.D.Y., N.G., S.P. (Samiran Panda), P.A.; project administration,P.D.Y., N.G., S.P. (Samiran Panda), P.A.; funding acquisition, P.D.Y., P.A. All authors have read andagreed to the published version of the manuscript.

Funding: Intramural funding from the Indian Council of Medical Research-National Institute ofVirology, Pune, supported this work under the project fund approved for “Molecular epidemiologicalanalysis of SARS-CoV-2 circulating in different regions of India” (20-3-18N).

Institutional Review Board Statement: The institutional human ethics committee at ICMR NIV Punehad approved the study protocol (IHEC no. NIV/IEC/Dec/2020/D-6 dated 31 December 2020).

Informed Consent Statement: Patient consent was waived due to screening of retrospective samples(TS/NS) already collected during Covid-19 pandemic by all the respective centers involved inthe study.

Data Availability Statement: All the sequences are already submitted and are available at the publicdomain (GISAID database) available from https://www.gisaid.org (accessed on 22 March 2021).

Acknowledgments: The authors thank the staff of ICMR-NIV Pune, especially Pranita Gawande,Hitesh Dighe, Kaumudi Kalele, Ashwini Waghmare, Manisha Dudhmal, Rajshree Lande and Vish-wajeet Dhanure for their excellent support in completing this study. The authors would like toacknowledge Krishnapal Karmodia and Sanjeev Galande from IISER, Pune for helping us utilize theirNGS facility. We would like to acknowledge all the authors that have submitted the SARS-CoV-2sequences to the GISAID database.

Conflicts of Interest: The authors declare no conflict of interest. The findings and conclusions are ofthe authors, and the funding agencies have no role in any part of the study.

References1. Novel Coronavirus (2019-NCoV) Situation Reports. Available online: https://www.who.int/emergencies/diseases/novel-

coronavirus-2019/situation-reports (accessed on 7 April 2020).2. Yadav, P.D.; Potdar, V.A.; Choudhary, M.L.; Nyayanit, D.A.; Agrawal, M.; Jadhav, S.M.; Majumdar, T.D.; Shete-Aich, A.; Basu, A.;

Abraham, P.; et al. Full-Genome Sequences of the First Two SARS-CoV-2 Viruses from India. Indian J. Med. Res. 2020, 151, 200.[CrossRef]

3. Wu, A.; Peng, Y.; Huang, B.; Ding, X.; Wang, X.; Niu, P.; Meng, J.; Zhu, Z.; Zhang, Z.; Wang, J.; et al. Genome Composition andDivergence of the Novel Coronavirus (2019-NCoV) Originating in China. Cell Host Microbe 2020, 27, 325–328. [CrossRef]

Page 13: An Epidemiological Analysis of SARS-CoV-2 Genomic ......16 Bhagat Phool Singh Government Medical College, Sonipat 131305, India; yadav78sarita@yahoo.com 17 King Institute of Preventive

Viruses 2021, 13, 925 13 of 13

4. Lu, R.; Zhao, X.; Li, J.; Niu, P.; Yang, B.; Wu, H.; Wang, W.; Song, H.; Huang, B.; Zhu, N.; et al. Genomic Characterisationand Epidemiology of 2019 Novel Coronavirus: Implications for Virus Origins and Receptor Binding. Lancet 2020, 395, 10224.[CrossRef]

5. Rahimi, A.; Mirzazadeh, A.; Tavakolpour, S. Genetics and Genomics of SARS-CoV-2: A Review of the Literature with the SpecialFocus on Genetic Diversity and SARS-CoV-2 Genome Detection. Genomics 2020, 113. [CrossRef]

6. Roy, C.; Mandal, S.M.; Mondol, S.K.; Mukherjee, S.; Ghosh, W.; Chakraborty, R. Trends of Mutation Accumulation across GlobalSARS-CoV-2 Genomes: Implications for the Ecology and Evolution of the Novel Coronavirus. Genomics 2020. [CrossRef]

7. Mercatelli, D.; Giorgi, F.M. Geographic and Genomic Distribution of SARS-CoV-2 Mutations. Front. Microbiol. 2020, 11. [CrossRef]8. Yadav, P.D.; Albariño, C.G.; Nyayanit, D.A.; Guerrero, L.; Jenks, M.H.; Sarkale, P.; Nichol, S.T.; Mourya, D.T. Equine Encephalosis

Virus in India, 2008. Emerg. Infect. Dis. 2018, 24, 898–901. [CrossRef]9. Yadav, P.D.; Nyayanit, D.A.; Shete, A.M.; Jain, S.; Majumdar, T.P.; Chaubal, G.Y.; Shil, P.; Kore, P.M.; Mourya, D.T. Complete

Genome Sequencing of Kaisodi Virus Isolated from Ticks in India Belonging to Phlebovirus Genus, Family Phenuiviridae. TicksTick-Borne Dis. 2019, 10, 23–33. [CrossRef]

10. Shu, Y.; McCauley, J. GISAID: Global Initiative on Sharing All Influenza Data—from Vision to Reality. Eurosurveillance 2017, 22.[CrossRef]

11. Kumar, S.; Stecher, G.; Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol. Biol.Evol. 2016, 33, 1870–1874. [CrossRef]

12. Leigh, J.W.; Bryant, D. Popart: Full-Feature Software for Haplotype Network Construction. Methods Ecol. Evol. 2015, 6, 1110–1116.[CrossRef]

13. Bandelt, H.J.; Forster, P.; Röhl, A. Median-Joining Networks for Inferring Intraspecific Phylogenies. Mol. Biol. Evol. 1999, 16,37–48. [CrossRef]

14. Banu, S.; Jolly, B.; Mukherjee, P.; Singh, P.; Khan, S.; Zaveri, L.; Shambhavi, S.; Gaur, N.; Reddy, S.; Kaveri, K.; et al. A DistinctPhylogenetic Cluster of Indian SARS-CoV-2 Isolates. Open Forum Infect. Dis. 2020. [CrossRef]

15. Rambaut, A.; Holmes, E.C.; O’Toole, Á.; Hill, V.; McCrone, J.T.; Ruis, C.; du Plessis, L.; Pybus, O.G. A Dynamic NomenclatureProposal for SARS-CoV-2 Lineages to Assist Genomic Epidemiology. Nat. Microbiol. 2020, 5, 1403–1407. [CrossRef]

16. Isabel, S.; Graña-Miraglia, L.; Gutierrez, J.M.; Bundalovic-Torma, C.; Groves, H.E.; Isabel, M.R.; Eshaghi, A.; Patel, S.N.; Gubbay,J.B.; Poutanen, T.; et al. Evolutionary and Structural Analyses of SARS-CoV-2 D614G Spike Protein Mutation Now DocumentedWorldwide. Sci. Rep. 2020, 10, 14031. [CrossRef]

17. Potdar, V.; Cherian, S.S.; Deshpande, G.R.; Ullas, P.T.; Yadav, P.D.; Choudhary, M.L.; Gughe, R.; Vipat, V.; Jadhav, S.; Patil, S.; et al.Genomic Analysis of SARS-CoV-2 Strains among Indians Returning from Italy, Iran & China, & Italian Tourists in India. Indian J.Med. Res. 2020, 151, 255. [CrossRef]

18. Coronavirus Disease (COVID-19) Situation Reports. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/ (accessed on 22 August 2020).

19. Leitmeyer, K.; Rico-Hesse, R. Viral Evolution and Epidemiology. Curr. Opin. Infect. Dis. 1997, 10, 367–371. [CrossRef]20. Parvez, M.K.; Parveen, S. Evolution and Emergence of Pathogenic Viruses: Past, Present, and Future. Intervirology 2017, 60, 1–7.

[CrossRef] [PubMed]21. Campbell, F.; Strang, C.; Ferguson, N.; Cori, A.; Jombart, T. When Are Pathogen Genome Sequences Informative of Transmission

Events? PLoS Pathog. 2018, 14. [CrossRef]22. GISAID—HCoV-19 Genomic Epidemiology. Available online: https://www.gisaid.org/epiflu-applications/hcov-19-genomic-

epidemiology/ (accessed on 22 September 2020).23. Yadav, P.D.; Nyayanit, D.A.; Sahay, R.R.; Sarkale, P.; Pethani, J.; Patil, S.; Baradkar, S.; Potdar, V.; Patil, D.Y. Isolation and

characterization of VUI-202012/01, a SARS-CoV-2 variant: Human cases travelled from United Kingdom to India. J. Travel Med.2021, 28, taab009.

24. Yadav, P.D.; Nyayanit, D.A.; Sahay, R.R.; Shete, A.M.; Majumdar, T.; Patil, S.; Patil, D.Y.; Gupta, N.; Kaur, H.; Aggarwal, N.; et al.Imported SARS-CoV-2 V501Y.V2 variant (B.1.351) detected in travelers from South Africa and Tanzania to India. Travel Med.Infect. Dis. 2021. [CrossRef]

25. Flower, T.G.; Buffalo, C.Z.; Hooy, R.M.; Allaire, M.; Ren, X.; Hurley, J.H. Structure of SARS-CoV-2 ORF8, a Rapidly EvolvingImmune Evasion Protein. Proc. Natl. Acad. Sci. USA 2021, 118. [CrossRef] [PubMed]