Enrichment of G‐to‐U Substitution in SARS‐CoV‐2 Functional Regions and Its Compensation via Concurrent Mutations
Abstract
We surveyed single nucleotide variant (SNV) patterns in 5 903 647 complete SARS-CoV-2 genomes. Among 10 012 SNVs, APOBEC-mediated C-to-U (C > U) deamination was the most prevalent, followed by G > U and other RNA editing-related substitutions (A > G, U > C, and G > A). However, C > U mutations were less frequent in functional contexts, for example, the S protein, intrinsically disordered regions, and nonsynonymous mutations, where G > U was over-represented. Notably, G-loss substitutions rarely appeared together. Instead, G-gain mutations tended to co-occur with other mutations more frequently, with a marked preference for the S protein, suggesting a compensatory mechanism for the G loss caused by G > U mutations. Temporally, C > U frequency declined until late 2021 and then resurged in early 2022. Conversely, G > U frequency decreased steadily, with a pronounced drop in January 2022, coinciding with reduced COVID-19 severity. Vaccinated individuals exhibited a slightly but significantly higher C > U frequency and a notably lower G > U frequency than the unvaccinated group. Additionally, cancer patients had a higher G > U frequency than general patients during the same period. Interestingly, none of the C > U SNVs were uniquely identified in 2724 environmental samples. These findings suggest novel functional roles of G > U in COVID-19 symptoms, potentially linked to oxidative stress and reactive oxygen species, while C > U remains the dominant substitution, likely driven by host immune-mediated RNA editing.