Browsing by Author "Srivastava, Mansi"

CASowary: CRISPR-Cas13 guide RNA predictor for transcript depletion (BMC, 2022)
Krohannon, Alexander; Srivastava, Mansi; Rauch, Simone; Srivastava, Rajneesh; Dickinson, Bryan C.; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
Background: Recent discovery of the gene editing system CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and its associated proteins (Cas) has resulted in its widespread use for improved understanding of a variety of biological systems. Cas13, a lesser studied Cas protein, has been repurposed to allow for efficient and precise editing of RNA molecules. The Cas13 system utilizes base complementarity between a crRNA/sgRNA (CRISPR RNA or single guide RNA) and a target RNA transcript to preferentially bind only the target transcript. Compared with the upstream regulatory regions of protein-coding genes in the genome, the transcriptome is significantly more redundant, with many transcripts sharing wide stretches of identical nucleotide sequence. Transcripts also exhibit complex three-dimensional structures and interact with an array of RBPs (RNA Binding Proteins), both of which may impact the effectiveness of transcript depletion of target sequences. However, our understanding of the features and corresponding methods that can predict whether a specific sgRNA will effectively knock down a transcript is very limited. Results: Here we present a novel machine learning and computational tool, CASowary, to predict the efficacy of an sgRNA. We used publicly available RNA knockdown data from Cas13 characterization experiments for 555 sgRNAs targeting the transcriptome in HEK293 cells, in conjunction with transcriptome-wide protein occupancy information. Our model utilizes a Decision Tree architecture with a set of 112 sequence and target availability features to classify sgRNA efficacy into one of four classes, based upon the expected level of target transcript knockdown.
After accounting for noise in the training data set, the noise-normalized accuracy exceeds 70%. Additionally, highly effective sgRNA predictions have been experimentally validated using an independent RNA-targeting Cas system, CIRTS, confirming the robustness and reproducibility of our model's sgRNA predictions. Utilizing a transcriptome-wide protein occupancy map generated using POP-seq in HeLa cells against a publicly available protein-RNA interaction map in HEK293 cells, we show that CASowary can predict high quality guides for numerous transcripts in a cell line specific manner. Conclusions: Application of CASowary to whole transcriptomes should enable rapid deployment of CRISPR/Cas13 systems, facilitating the development of therapeutic interventions linked with aberrations in RNA regulatory processes.

Computational Methods for Determining RNA-RNA Interactions (2023-06)
Schaeper, David; Janga, Sarath Chandra; Yan, Jingwen; Srivastava, Mansi
RNA molecules play vital roles in both viruses and cells, and one way to study their function is through the RNA-RNA interactions (RRIs) that occur. RRIs form in one of two ways: through protein-mediated RRIs, where a protein brings the RNA molecules together, or through direct complementary base pairing between the molecules, called RNA-centric. Protein-mediated RRIs have been captured and analyzed through experimental protocols such as cross-linking ligation and sequencing of hybrids (CLASH) and mapping RNA interactome in vivo (MARIO). RNA-centric interactions have been investigated through experimental protocols such as ligation of interacting RNA followed by high-throughput sequencing (LIGR-seq), sequencing of psoralen crosslinked, ligated, selected hybrids (SPLASH), psoralen analysis of RNA interactions and structures (PARIS), and cross-linking of matched RNAs and deep sequencing (COMRADES).
Tools have also been developed to predict RRIs; the predominant ones, RNAup and IntaRNA, utilize minimum free energy (MFE) calculations. In this work, RRIs were first studied in the context of SARS-CoV-2 and its variants to observe evolutionary changes to RRIs. Using in silico RRIs generated through the COMRADES protocol by Ziv et al. alongside computational predictions generated through IntaRNA and a large population of SARS-CoV-2 sequences, covariation analysis was used on the population stratified by variants to determine variant-specific evolutionary changes for certain long-range RRIs. Statistical evidence was also found for a novel Beta variant specific RNA-RNA interaction. RRIs were then studied in the human HEK293T cell line through a novel experimental protocol using Oxford Nanopore long-read sequencing technology, which captures more complete information on RRIs, mapped with the newly developed pipeline Alignment of Chimera through Clustering and Read Splitting (ACCRES). Through this, multi-molecule RNA interactions were detected using an iterative BLAST approach, which to our knowledge is the first time these have been reported. Interaction interfaces were quantified, and the interactions were characterized by their biotype to understand the landscape of these interactions in the cell line. A network was built, and functional enrichment was performed to show the interplay between known functions in the cell.

Direct RNA-sequencing of human cell lines for transcriptome-wide mapping and annotation of 3' tails at single molecule resolution
Govindaraman, Aniruddhan; Quoseena, Mir; Kadumuri, Raja Shekar Vanna; Srivastava, Mansi; Srivastava, Rajneesh; Janga, Sarath Chandra
The 3' endonucleolytic cleavage of pre-messenger RNA (pre-mRNA) and successive polyadenylation is a fundamental cellular process in eukaryotes.
The 3' terminal regions are known to be polyadenylated by canonical poly(A) polymerases during RNA processing of messenger RNA (mRNA) molecules; however, they are also known to harbor additional UnMapped Regions (UMR) composed of uridylation and guanylation [1]. Although short read sequencing technologies are extensively used to study 3' terminal regions, major limitations of these approaches include their inability to detect homopolymeric sequences and to sequence full length isoforms [1-2]. Nanopore sequencing enables long read sequencing and identification of full length transcripts at single molecule resolution; however, there are currently no tools for systematically analyzing 3' terminal UMRs from direct RNA-sequencing datasets. Here, we present RAPTOR (https://github.com/aniram118/RAPTOR), a command line tool for 3' terminal UMR analysis of nanopore direct RNA sequencing data. RAPTOR provides a comprehensive report of UMR sequence information, cognate transcript annotations, nucleotide base composition, conserved hexamer signals and a range of analysis plots at single molecule resolution. For benchmarking, we sequenced mRNA samples obtained from HepG2 (liver hepatocellular carcinoma) and K562 (bone marrow chronic myelogenous leukemia) cell lines, resulting in 243,802 and 598,428 reads respectively. RAPTOR identified high quality UMRs that exhibited median lengths of 201 and 173 nt in the HepG2 and K562 transcriptomes respectively. Nucleotide composition analysis of the identified 3' UMRs showed an enrichment for A and U nucleotides in both HepG2 [A: 29%, U: 28%, G: 20%, C: 23%] and K562 [A: 30%, U: 29%, G: 19%, C: 22%] cells. Several high confidence UMRs were verified by qPCR and Sanger sequencing, confirming sequence length and identity, respectively. In addition, de novo motif analysis of UMR regions enabled the discovery of several noncanonical motifs beyond poly A/U patterns.
These UMR motifs were significantly associated (p-value < 0.01) with the established binding motifs of several known RNA Binding Proteins including SART3, HuR (ELAVL1), TIA1, IGF2BP2/3, PABPCs, PCBPs, SRSFs, HNRNPs and RBM /6, suggesting an unappreciated role of these RBPs in binding to 3' tails of mRNAs.

Experimental and computational methods for studying the dynamics of RNA-RNA interactions in SARS-COV2 genomes (Oxford University Press, 2024)
Srivastava, Mansi; Dukeshire, Matthew R.; Mir, Quoseena; Omoru, Okiemute Beatrice; Manzourolajdad, Amirhossein; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
Long-range ribonucleic acid (RNA)–RNA interactions (RRI) are prevalent in positive-strand RNA viruses, including Beta-coronaviruses, and these take part in regulatory roles, including the regulation of sub-genomic RNA production rates. Crosslinking of interacting RNAs and short read-based deep sequencing of the resulting RNA–RNA hybrids have shown that these long-range structures exist in severe acute respiratory syndrome coronavirus (SARS-CoV)-2 on both genomic and sub-genomic levels and in dynamic topologies. Furthermore, co-evolution of coronaviruses with their hosts is navigated by genetic variations made possible by their large genome, high recombination frequency and high mutation rate. SARS-CoV-2’s mutations are known to occur spontaneously during replication, and thousands of aggregate mutations have been reported since the emergence of the virus. Although many long-range RRIs have been experimentally identified using high-throughput methods for the wild-type SARS-CoV-2 strain, the evolutionary trajectory of these RRIs across variants, the impact of mutations on RRIs and the interaction of SARS-CoV-2 RNAs with the host have been largely open questions in the field.
In this review, we summarize recent computational tools and experimental methods that have been enabling the mapping of RRIs in viral genomes, with a specific focus on SARS-CoV-2. We also present available informatics resources to navigate the RRI maps and shed light on the impact of mutations on the RRI space in viral genomes. Investigating the evolution of long-range RNA interactions and that of virus–host interactions can contribute to the understanding of new and emerging variants as well as aid in developing improved RNA therapeutics critical for combating future outbreaks.

Mutational Landscape and Interaction of SARS-CoV-2 with Host Cellular Components (MDPI, 2021-09)
Srivastava, Mansi; Hall, Dwight; Omoru, Okiemute Beatrice; Gill, Hunter Mathias; Smith, Sarah; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and its rapid evolution has led to a global health crisis. Increasing mutations across the SARS-CoV-2 genome have severely impacted the development of effective therapeutics and vaccines to combat the virus. However, the new SARS-CoV-2 variants and their evolutionary characteristics are not fully understood. Host cellular components such as the ACE2 receptor, RNA-binding proteins (RBPs), microRNAs, small nuclear RNA (snRNA), 18S rRNA, and the 7SL RNA component of the signal recognition particle (SRP) interact with various structural and non-structural proteins of SARS-CoV-2. Several of these viral proteins are currently being examined for designing antiviral therapeutics. In this review, we discuss current advances in our understanding of various host cellular components targeted by the virus during SARS-CoV-2 infection. We also summarize the mutations across the SARS-CoV-2 genome that direct the evolution of new viral strains.
Because coronaviruses are rapidly evolving in humans, they can escape therapies and vaccine-induced immunity. In order to understand the virus’s evolution, it is essential to study its mutational patterns and their impact on host cellular machinery. Finally, we present a comprehensive survey of currently available databases and tools to study viral–host interactions that stand as crucial resources for developing novel therapeutic strategies for combating SARS-CoV-2 infection.

Role of SARS-CoV-2 in Altering the RNA-Binding Protein and miRNA-Directed Post-Transcriptional Regulatory Networks in Humans (MDPI, 2020-09-25)
Srivastava, Rajneesh; Daulatabad, Swapna Vidhur; Srivastava, Mansi; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The outbreak of a novel coronavirus SARS-CoV-2 responsible for the COVID-19 pandemic has caused a worldwide public health emergency. Due to the constantly evolving nature of the coronaviruses, SARS-CoV-2-mediated alterations on post-transcriptional gene regulations across human tissues remain elusive. In this study, we analyzed publicly available genomic datasets to systematically dissect the crosstalk and dysregulation of the human post-transcriptional regulatory networks governed by RNA-binding proteins (RBPs) and micro-RNAs (miRs) due to SARS-CoV-2 infection. We uncovered that 13 out of 29 SARS-CoV-2-encoded proteins directly interacted with 51 human RBPs, the majority of which were abundantly expressed in gonadal tissues and immune cells. We further performed a functional analysis of differentially expressed genes in mock-treated versus SARS-CoV-2-infected lung cells that revealed enrichment for the immune response, cytokine-mediated signaling, and metabolism-associated genes.
This study also characterized the alternative splicing events in SARS-CoV-2-infected cells compared to the control, demonstrating that skipped exons and mutually exclusive exons were the most abundant events that potentially contributed to differential outcomes in response to the viral infection. A motif enrichment analysis on the RNA genomic sequence of SARS-CoV-2 clearly revealed the enrichment for RBPs such as SRSFs, PCBPs, ELAVs, and HNRNPs, suggesting the sponging of RBPs by the SARS-CoV-2 genome. A similar analysis to study the interactions of miRs with SARS-CoV-2 revealed functionally important miRs that were highly expressed in immune cells, suggesting that these interactions may contribute to the progression of the viral infection and modulate the host immune response across other human tissues. Given the need to understand the interactions of SARS-CoV-2 with key post-transcriptional regulators in the human genome, this study provided a systematic computational analysis to dissect the role of dysregulated post-transcriptional regulatory networks controlled by RBPs and miRs across tissue types during a SARS-CoV-2 infection.

Role of SARS-CoV-2 in Altering the RNA-Binding Protein and miRNA-Directed Post-Transcriptional Regulatory Networks in Humans (MDPI, 2020)
Srivastava, Rajneesh; Daulatabad, Swapna Vidhur; Srivastava, Mansi; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The outbreak of a novel coronavirus SARS-CoV-2 responsible for the COVID-19 pandemic has caused a worldwide public health emergency. Due to the constantly evolving nature of the coronaviruses, SARS-CoV-2-mediated alterations on post-transcriptional gene regulations across human tissues remain elusive.
In this study, we analyzed publicly available genomic datasets to systematically dissect the crosstalk and dysregulation of the human post-transcriptional regulatory networks governed by RNA-binding proteins (RBPs) and micro-RNAs (miRs) due to SARS-CoV-2 infection. We uncovered that 13 out of 29 SARS-CoV-2-encoded proteins directly interacted with 51 human RBPs, the majority of which were abundantly expressed in gonadal tissues and immune cells. We further performed a functional analysis of differentially expressed genes in mock-treated versus SARS-CoV-2-infected lung cells that revealed enrichment for the immune response, cytokine-mediated signaling, and metabolism-associated genes. This study also characterized the alternative splicing events in SARS-CoV-2-infected cells compared to the control, demonstrating that skipped exons and mutually exclusive exons were the most abundant events that potentially contributed to differential outcomes in response to the viral infection. A motif enrichment analysis on the RNA genomic sequence of SARS-CoV-2 clearly revealed the enrichment for RBPs such as SRSFs, PCBPs, ELAVs, and HNRNPs, suggesting the sponging of RBPs by the SARS-CoV-2 genome. A similar analysis to study the interactions of miRs with SARS-CoV-2 revealed functionally important miRs that were highly expressed in immune cells, suggesting that these interactions may contribute to the progression of the viral infection and modulate the host immune response across other human tissues.
Given the need to understand the interactions of SARS-CoV-2 with key post-transcriptional regulators in the human genome, this study provided a systematic computational analysis to dissect the role of dysregulated post-transcriptional regulatory networks controlled by RBPs and miRs across tissue types during a SARS-CoV-2 infection.

SARS-CoV-2 contributes to altering the post-transcriptional regulatory networks across human tissues by sponging RNA binding proteins and micro-RNAs (2020-07-06)
Srivastava, Rajneesh; Daulatabad, Swapna Vidhur; Srivastava, Mansi; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The outbreak of a novel coronavirus SARS-CoV-2 responsible for the COVID-19 pandemic has caused a worldwide public health emergency. Due to the constantly evolving nature of the coronaviruses, SARS-CoV-2 mediated alteration of post-transcriptional gene regulation across human tissues remains elusive. In this study, we systematically dissected the crosstalk and dysregulation of human post-transcriptional regulatory networks governed by RNA binding proteins (RBPs) and micro-RNAs (miRs) due to SARS-CoV-2 infection. We uncovered that 13 out of 29 SARS-CoV-2 encoded proteins directly interact with 51 human RBPs, the majority of which were abundantly expressed in gonadal tissues and immune cells. We further performed functional analysis of differentially expressed genes in mock treated versus SARS-CoV-2 infected lung cells that revealed an enrichment for immune response, cytokine mediated signaling, and metabolism associated genes. This study also characterized the alternative splicing events in SARS-CoV-2 infected cells compared to control, demonstrating that skipped exons and mutually exclusive exons were the most abundant events that potentially contributed to differential outcomes in response to viral infection.
Motif enrichment analysis on the RNA genomic sequence of SARS-CoV-2 clearly revealed an enrichment for RBPs such as SRSFs, PCBPs, ELAVs and HNRNPs, illustrating the sponging of RBPs by the SARS-CoV-2 genome. Similar analysis to study the interactions of miRs with SARS-CoV-2 revealed the potential for several miRs to be sponged, suggesting that these interactions may contribute to altered post-transcriptional regulation across human tissues. Given the need to understand the interactions of SARS-CoV-2 with key post-transcriptional regulators in the human genome, this study provides a systematic analysis to dissect the role of dysregulated post-transcriptional regulatory networks controlled by RBPs and miRs across tissue types during SARS-CoV-2 infection.

Trancriptome-Wide Applications of Protein Occupancy Profile Sequencing (POP-seq) (2023-06)
Sangani, Neel; Janga, Sarath Chandra; Yan, Jingwen; Srivastava, Mansi
Dynamic protein-RNA interactions regulate RNA metabolism and alter cellular physiology by modulating key regulatory processes such as capping, splicing, polyadenylation, and localization. Several high throughput methods have been developed to detect protein-RNA interactions, but they often exhibit biases due to the inherent limitations of crosslinking-based approaches. We propose Protein Occupancy Profile-Sequencing (POP-seq), a phase separation-based method that does not require crosslinking to detect protein occupancy transcriptome wide. In this study, we employed POP-seq to examine the unbiased regulatory protein-RNA interactions in the following cell lines: K562, HepG2, A549, MCF7, Jurkat, and HEK293. In our preliminary analysis, we compared the POP-seq identified interactions using two protocols, one involving UV crosslinking (UPOP-seq) and the other with no crosslinking (NPOP-seq), in K562 and HepG2 cells. This comparative analysis of the two protocols showed >70% overlapping genes detected by both approaches in the two cell lines.
Most of these peaks were mapped to intronic regions of protein-coding genes. Concurrently, we also implemented this crosslinking-free approach on two leukemia cell lines: Jurkat and K562. Differential analysis shows higher binding activity in Jurkat compared to K562, with the majority of the peaks spanning intronic regions of protein-coding genes, followed by SINE and LINE elements. Differential proximal binding analysis shows that SE events followed by A3SS events play a major role in alternative splicing, suggesting that enriched regions play a vital role in cellular functions including post-transcriptional regulation of gene expression. Motif analysis shows significant, clinically relevant motif enrichment in POP-seq identified peaks. This study was further expanded by adding three additional human cell lines: MCF7, A549, and HEK293. Differential peak analysis across cell lines revealed a closer association between A549 and MCF7 cells based on the normalized POP-seq peaks per gene. We observed that genes associated with differential peaks between cell lines exhibited enrichment for crucial cellular functions, particularly in the post-transcriptional regulation of gene expression. Our analysis unveiled a notable enrichment of specific motifs within the identified peaks obtained from POP-seq. These overrepresented motifs were significantly linked to somatic variation, phenotypic variation (Phenvar), clinical variation (ClinVar), GWAS, and allele-specific expression (ASE), with a preferential abundance of the motifs on the C and G bases. Additionally, our alternative splicing analysis revealed that POP-seq detected protein-RNA interactions that substantially contributed to splicing events in certain cell line pairs, while their impact was less pronounced in others. Overall, our study offers the first extensive dataset of protein-RNA interaction maps across the transcriptome in multiple cell lines, utilizing a crosslinking-free approach.
This valuable resource not only provides comprehensive insights into regulatory interactions but also opens new possibilities for applying this method in primary tissues to detect and study protein-RNA interactions in a broader biological context.

Transcription Factors in the Development and Pro-Allergic Function of Mast Cells (Frontiers Media, 2021-06-07)
Srivastava, Mansi; Kaplan, Mark H.; BioHealth Informatics, School of Informatics and Computing
Mast cells (MCs) are innate immune cells of hematopoietic origin localized in the mucosal tissues of the body and are broadly implicated in the pathogenesis of allergic inflammation. Transcription factors have a pivotal role in the development and differentiation of mast cells in response to various microenvironmental signals encountered in the resident tissues. Understanding the regulation of mast cells by transcription factors is therefore vital for mechanistic insights into allergic diseases. In this review we summarize advances in defining the transcription factors that impact the development of mast cells throughout the body and in specific tissues, and factors that are involved in responding to the extracellular milieu. We will further describe the complex networks of transcription factors that impact mast cell physiology and expansion during allergic inflammation and functions from degranulation to cytokine secretion. As our understanding of the heterogeneity of mast cells becomes more detailed, the contribution of specific transcription factors in mast cell-dependent functions will potentially offer new pathways for therapeutic targeting.