COVID | Free Full-Text | Genetic Analysis and Epitope Prediction of SARS-CoV-2 Genome in Bahia, Brazil: An In Silico Analysis of First and Second Wave Genomics Diversity

1. Introduction

In December 2019, a pneumonia of unknown origin emerged in Wuhan, Hubei Province, China. The pathogen was subsequently identified as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of Coronavirus Disease 2019 (COVID-19). Owing to its high transmissibility, the virus soon spread to many other countries, which led the World Health Organization (WHO) to declare a public health emergency of international concern and, soon afterwards, a pandemic [1,2]. According to the WHO, as of 31 December 2022, more than 730 million confirmed COVID-19 infections and almost 6.7 million deaths had been reported worldwide (https://covid19.who.int/, accessed on 1 March 2023).
SARS-CoV-2 is the seventh human coronavirus (CoV) to be described. Its genome was first sequenced in January 2020 in China. The virus is a betacoronavirus of the Coronaviridae family, whose members usually cause respiratory and gastrointestinal tract disease [2,3]. Since the beginning of the COVID-19 pandemic, genetic analyses of SARS-CoV-2 in various countries and at different times have revealed that the virus has accumulated many mutations, some of which may confer new chemical properties on its proteins. For example, the D614G and N501Y mutations of the spike (S) protein confer greater transmissibility [4]. Various studies have identified SARS-CoV-2 genomic variations of differing types, including missense, synonymous, insertion, deletion, and non-coding mutations [5]. This mutational variation has allowed new lineages to emerge around the globe, such as B.1 (derived from the B lineage first described in China), which caused a large outbreak in Italy [6].
The pandemic has brought challenges that need to be addressed, and bioinformatics can play a fundamental role in this process. Tasks such as estimating the transmissibility of the virus, reconstructing transmission routes, and identifying possible animal sources and reservoirs [7] are all necessary. Monitoring these factors supports genetic diversity analysis, the association of clinical and epidemiological patterns, the evaluation of diagnostic methods, and the development of more effective vaccines and other therapeutic approaches [8]. The objective of this study was to conduct an in silico analysis of the diversity of SARS-CoV-2 genomes circulating in the state of Bahia.

4. Discussion

The nucleotide variants found in Bahia were similar to those reported in an Indian study [18], which analyzed 20,086 genomes and found A23403G, C3037T, C241T, and C14408T to be the most prevalent mutations. The C14408T mutation has also been implicated in a significant reduction in the rate of viral replication, and the reduction is greater when it occurs together with the C241T and C3037T mutations. This effect was analyzed using a SARS-CoV-2 replication model [19].
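The mutation labels above follow the pattern reference base, genome position, alternate base. As a minimal sketch of how such labels can be parsed and how prevalence across a set of genomes can be tallied, the example below uses hypothetical helper names (`parse_mutation`, `mutation_prevalence`) and a toy input invented for illustration; it is not the pipeline used in this study or in [18].

```python
import re
from collections import Counter

# Label pattern: reference base, 1-based genome position, alternate base.
MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(label):
    """Split a label such as 'A23403G' into (ref, position, alt)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a nucleotide substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def mutation_prevalence(genomes):
    """Return {label: fraction of genomes carrying that mutation}.

    `genomes` maps a genome identifier to its list of mutation labels.
    """
    if not genomes:
        return {}
    counts = Counter()
    for labels in genomes.values():
        counts.update(set(labels))  # count each mutation once per genome
    total = len(genomes)
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy data, made up for illustration only.
toy_genomes = {
    "genome_1": ["C241T", "C3037T", "C14408T", "A23403G"],
    "genome_2": ["C241T", "A23403G"],
    "genome_3": ["A23403G"],
    "genome_4": [],
}
prevalence = mutation_prevalence(toy_genomes)
most_prevalent = max(prevalence, key=prevalence.get)
```

Sorting the resulting dictionary by value gives a ranking of the kind reported for the most prevalent mutations.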
Our protein-level results diverged from those of a study performed in the state of Amazonas, in which the N protein, the structural protein that forms the nucleocapsid, accumulated the most mutations. In the state of Bahia, the S protein (spike protein) presented the highest mutation frequency [20]. Various studies have conducted genomic analysis of SARS-CoV-2 and revealed mutations in many genes, including ORF1ab, ORF3a, ORF6, ORF7, ORF8, ORF10, S, M, E, and N. Among them, the ORF1ab, S, and ORF8 genes were reported as having considerably more mutations than other genes [5].
The amino acid substitutions observed help explain essential mechanisms by which the virus interacts with the host. For example, the high prevalence of substitutions involving aspartate and glutamate helps explain the increased affinity of the virus for ACE-2 cell receptors (when aspartate is replaced by tyrosine), which in turn helps explain the rapid epidemic expansion of SARS-CoV-2 [4].
In the structural and functional distribution of these genomes, the variable sites included both transitions and transversions, with non-synonymous mutations prevailing over synonymous ones in both classes. The same pattern was observed in a global study that analyzed 4254 strains of SARS-CoV-2 and found 767 synonymous mutations against 1352 non-synonymous mutations. However, that study did not specify whether the mutations were transitions or transversions [20].
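The two classifications used above can be made concrete with a short sketch. A transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), while a transversion crosses the two classes; a mutation is synonymous when the mutated codon still encodes the same amino acid. The function names and the example codons below are assumptions chosen for illustration and are not taken from the study's data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def substitution_type(ref, alt):
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def codon_effect(ref_codon, alt_codon):
    """Classify a codon change as synonymous or non-synonymous."""
    same_residue = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_residue else "non-synonymous"


# Illustrative codons (assumed, not read from any genome in the study):
# GAT (aspartate) -> GGT (glycine) is an A-to-G change at the second base.
example_1 = (substitution_type("A", "G"), codon_effect("GAT", "GGT"))
# TTC -> TTT keeps phenylalanine, so the change is silent.
example_2 = (substitution_type("C", "T"), codon_effect("TTC", "TTT"))
```

Tallying these two labels for every variable site yields a two-by-two breakdown of the kind described in the paragraph above.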
When analyzing high-impact mutations, we found little information in the literature on other possible locations (or resulting viral advantages) of the mutations present in the Brazil/BA-FIOCRUZ-PVM54698/2021|EPI_ISL_2663286|2021-03-04 genome. This set of mutations is believed to be exclusive to Bahia. The genome presents three labeled private mutations and 20 unlabeled private mutations. Labeled private mutations are private mutations to a genotype that is common within a clade, while unlabeled private mutations are private mutations that are neither labeled nor reversions [21].
Other high-impact mutations have been found in various countries, such as Japan, the United States, and France [22,23,24,25]. The reported advantages of these mutations are decreased susceptibility to treatment with the monoclonal antibodies bamlanivimab/etesevimab, decreased neutralization by convalescent and post-vaccination sera, and increased production of the ACE2-binding spike protein, which increases both transmissibility and virulence [26,27].
HLA is essential in both antigen presentation and adaptive immune response [28]. Identifying alleles associated with viral epitopes is therefore essential for developing vaccines and therapeutic options, as well as for developing a better understanding of the mechanism of viral infection. For example, HLA-B*15:01 (also observed in other studies) is strongly associated with asymptomatic SARS-CoV-2 infection, and is likely involved in early viral clearance [29]. HLA-A*02:01 is protective against severe cases of COVID-19 [30].
During lineage analysis, the P.1 and P.2 lineages together accounted for more than 50% of the genomes. The same pattern was seen in Brazil as a whole, where these lineages reached 75% of nationally sequenced genomes after October 2020 [4].
According to data from the FIOCRUZ Genomic Network, through October 2020 the B.1.1.28 and B.1.1.33 lineages were the most prevalent in Brazil and played an essential role in the first wave of the pandemic [4,31]. This was corroborated in Bahia, where these lineages, both of which originated from the B.1.1 lineage, were initially prevalent. The phylogenetic tree also contains lineage B.1.1.7, the alpha variant first identified in the United Kingdom, whose most notable mutation is the substitution of asparagine by tyrosine [32].
As seen in this phylogeny, the gamma variant is predominant. Both the gamma and zeta variants originate from the B.1.1.28 lineage, with the gamma variant emerging in Manaus and the zeta variant in Rio de Janeiro. Both variants carry S protein mutations, including SNPs that convert glutamate to lysine [4]. These mutations, in particular, affect both transmissibility and the host immune response [33], and are likely one of the reasons for the high frequency of the P.1 lineage in this population. Nevertheless, in March 2021, Bahia ranked seventh among Brazilian states in the proportion of gamma lineage genomes [32].
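Amino acid substitutions such as those described in this section are usually written as one-letter residue codes around a protein position (D614G, for instance). As a minimal sketch, the hypothetical helper below expands such a label into words. The labels N501Y and E484K are used here as assumed shorthand for the asparagine-to-tyrosine and glutamate-to-lysine changes; apart from D614G, the text itself does not give residue positions for them.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "Q": "glutamine", "E": "glutamate", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

SUBSTITUTION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def describe_substitution(label):
    """Expand a label such as 'D614G' into a readable description."""
    match = SUBSTITUTION_RE.match(label)
    if match is None:
        raise ValueError(f"not an amino acid substitution label: {label!r}")
    ref, pos, alt = match.groups()
    if ref not in AMINO_ACIDS or alt not in AMINO_ACIDS:
        raise ValueError(f"unknown residue code in {label!r}")
    return f"{AMINO_ACIDS[ref]} to {AMINO_ACIDS[alt]} at position {pos}"


# Assumed example labels (see the note above).
descriptions = {
    label: describe_substitution(label)
    for label in ("D614G", "N501Y", "E484K")
}
```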
According to the WHO, these variants can be divided into variants of concern (VOCs) and variants of interest (VOIs). VOCs present increased transmissibility and virulence, while VOIs involve community transmission or are present in multiple countries. Of the VOCs existing during the period of the present study, only the alpha and gamma variants were present in the state of Bahia. Among the VOIs existing from March 2020 to September 2021, the zeta variant stood out in frequency, and the eta variant (B.1.525) was also detected [34].
