Browsing by Author "Daulatabad, Swapna Vidhur"
Now showing 1 - 10 of 15

Item Genome Scale Analysis of Alternative Splicing Events in lncRNA Knockdowns
Porto, Felipe Wendt; Daulatabad, Swapna Vidhur; Janga, Sarath Chandra
Recent developments in our understanding of the interactions between lncRNA and cellular components have improved treatment approaches for various human diseases including cancer, vascular diseases, and brain diseases (1, 2, 3). Although investigation of specific lncRNAs revealed their role in the metabolism of cellular RNA, our understanding of their contribution to post-transcriptional regulation is relatively limited. In this study, we explore the role of lncRNAs in modulating alternative splicing and their impact on downstream protein-RNA interaction networks. Analysis of alternative splicing events across 39 lncRNA wildtype and knockout RNA-sequencing datasets from three human cell lines, HeLa (cervical cancer), K562 (myeloid leukemia), and U87 (glioblastoma), resulted in high-confidence (FDR < 0.01) identification of 4432 skipped exon events and 2474 retained intron events, implicating 759 genes as impacted at the post-transcriptional level due to the loss of lncRNAs. We observed that a majority of the alternatively spliced genes in a lncRNA knockout were specific to the cell type, in agreement with the finding that genes affected by alternative splicing also displayed enriched functions in a cell-type-specific manner (4, 5). To understand the mechanism behind these cell-type-specific alternative splicing patterns, we analyzed RNA binding protein (RBP)-RNA interaction profiles across the spliced regions. Despite limited RBP binding data across cell lines, alternatively spliced events detected in lncRNA perturbation experiments were associated with RBPs binding in proximal intron-exon junctions, in a cell-type-specific manner.
Based on the RBP binding profiles in HeLa and K562 cells, we hypothesize that several lncRNAs are likely to exhibit a sponge effect in disease contexts, resulting in the functional disruption of RBPs and their downstream functions. We propose that such lncRNA sponges can extensively rewire the post-transcriptional gene regulatory networks by altering the protein-RNA interaction landscape in a cell-type-specific manner.

Item Lantern: a semi-automated pipeline and repository for annotating lncRNAs with ontologies
Daulatabad, Swapna Vidhur; Srivastava, Rajneesh; Janga, Sarath Chandra
With advancements in omics technologies, the range of biological processes that long non-coding RNAs (lncRNAs) are involved in is expanding extensively [1,2]. The accelerating rate of evidence discovery of lncRNAs' role in various critical biochemical, cellular, and physiological processes is necessitating a robust platform of lncRNA annotation resources. Available resources with lncRNA ontology annotations are rare, despite a plethora of resources for annotating genes and an extensive body of lncRNA literature. Here, we present a lncRNA annotation extractor and repository (Lantern), developed using PubMed's abstract retrieval engine and NCBO's recommender annotation system [3]. Between 1 and 150 abstracts were extracted per lncRNA, which were subsequently used for extracting annotations with respect to each ontology by querying NCBO's recommender system via its Application Programming Interface (API). To evaluate the quality of annotations in Lantern, a benchmarking analysis was performed by deploying Lantern's pipeline over 182 lncRNAs from lncRNAdb [4] and comparing the extracted annotations against annotations mapped onto lncRNAdb's manually curated free text. Benchmarking analysis suggested that Lantern has a recall of 0.62 against lncRNAdb for 182 lncRNAs and a precision of 0.8 based on manual verification of ontology annotations for 50 lncRNAs.
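As a rough illustration of the two benchmarking figures reported above, the sketch below computes recall against a curated reference set and precision from a manual review. It is not Lantern's code: the function names and the `term_*` identifiers are hypothetical placeholders, and the abstract does not state whether scores were computed per lncRNA or pooled.

```python
def recall_against_reference(extracted, reference):
    """Fraction of curated reference annotations that the pipeline recovered."""
    extracted, reference = set(extracted), set(reference)
    if not reference:
        return 0.0
    return len(extracted & reference) / len(reference)


def precision_from_review(extracted, judged_correct):
    """Fraction of extracted annotations that a manual reviewer accepted."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(judged_correct)) / len(extracted)


# Hypothetical annotations for a single lncRNA (placeholders, not real ontology IDs).
extracted_terms = {"term_A", "term_B", "term_C", "term_D", "term_E"}
curated_terms = {"term_B", "term_C", "term_D", "term_F"}       # from curated free text
reviewer_approved = {"term_A", "term_B", "term_C", "term_D"}   # manual verification

recall = recall_against_reference(extracted_terms, curated_terms)
precision = precision_from_review(extracted_terms, reviewer_approved)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this toy case three of the four curated terms are recovered and four of the five extracted terms are approved, so the two measures answer different questions: recall needs a reference set, while precision needs a judgment on each extracted term.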
Additionally, lncRNAs were annotated with multiple kinds of omics information, such as RBP interactions, tissue-specific expression, protein co-expression, coding potential, sub-cellular localization, and SNPs, for around 11,000 lncRNAs; these were retrieved and analyzed by robust NGS tools and pipelines. The extracted annotations for 11,000 lncRNAs are available at http://www.iupui.edu/~sysbio/lantern/.

Item Lantern: an integrative repository of functional annotations for lncRNAs in the human genome
(BMC, 2021-05-26) Daulatabad, Swapna Vidhur; Srivastava, Rajneesh; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
Background: With advancements in omics technologies, the range of biological processes in which long non-coding RNAs (lncRNAs) are involved is expanding extensively, thereby generating the need to develop lncRNA annotation resources. Although there is a plethora of resources for annotating genes, and despite the extensive corpus of lncRNA literature, the available resources with lncRNA ontology annotations are rare. Results: We present a lncRNA annotation extractor and repository (Lantern), developed using PubMed's abstract retrieval engine and NCBO's recommender annotation system. Lantern's annotations were benchmarked against lncRNAdb's manually curated free text. Benchmarking analysis suggested that Lantern has a recall of 0.62 against lncRNAdb for 182 lncRNAs and a precision of 0.8. Additionally, we annotated lncRNAs with multiple omics annotations, including predicted cis-regulatory TFs, interactions with RBPs, tissue-specific expression profiles, protein co-expression networks, coding potential, sub-cellular localization, and SNPs for ~11,000 lncRNAs in the human genome, providing a one-stop dynamic visualization platform.
Conclusions: Lantern integrates annotations derived from a novel, accurate semi-automatic ontology annotation engine with a variety of multi-omics annotations for lncRNAs, to provide a central web resource for dissecting the functional dynamics of long non-coding RNAs and to facilitate future hypothesis-driven experiments. The annotation pipeline and a web resource with current annotations for human lncRNAs are freely available on sysbio.lab.iupui.edu/lantern.

Item Long Non-Coding RNA Expression Levels Modulate Cell-Type-Specific Splicing Patterns by Altering Their Interaction Landscape with RNA-Binding Proteins
(MDPI, 2019-08-06) Porto, Felipe Wendt; Daulatabad, Swapna Vidhur; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
Recent developments in our understanding of the interactions between long non-coding RNAs (lncRNAs) and cellular components have improved treatment approaches for various human diseases including cancer, vascular diseases, and neurological diseases. Although investigation of specific lncRNAs revealed their role in the metabolism of cellular RNA, our understanding of their contribution to post-transcriptional regulation is relatively limited. In this study, we explore the role of lncRNAs in modulating alternative splicing and their impact on downstream protein-RNA interaction networks. Analysis of alternative splicing events across 39 lncRNA knockdown and wildtype RNA-sequencing datasets from three human cell lines, HeLa (cervical cancer), K562 (myeloid leukemia), and U87 (glioblastoma), resulted in the high-confidence (false discovery rate (FDR) < 0.01) identification of 11,630 skipped exon events and 5895 retained intron events, implicating 759 genes as impacted at the post-transcriptional level due to the loss of lncRNAs. We observed that a majority of the alternatively spliced genes in a lncRNA knockdown were specific to the cell type.
In tandem, the functions annotated to the genes affected by alternative splicing across each lncRNA knockdown also displayed cell-type specificity. To understand the mechanism behind this cell-type-specific alternative splicing pattern, we analyzed RNA-binding protein (RBP)-RNA interaction profiles across the spliced regions in order to observe cell-type-specific alternative splice event RBP binding preference. Despite limited RBP binding data across cell lines, alternatively spliced events detected in lncRNA perturbation experiments were associated with RBPs binding in proximal intron-exon junctions in a cell-type-specific manner. The cellular functions affected by alternative splicing were also affected in a cell-type-specific manner. Based on the RBP binding profiles in HeLa and K562 cells, we hypothesize that several lncRNAs are likely to exhibit a sponge effect in disease contexts, resulting in the functional disruption of RBPs and their downstream functions. We propose that such lncRNA sponges can extensively rewire post-transcriptional gene regulatory networks by altering the protein-RNA interaction landscape in a cell-type-specific manner.

Item A long non‐coding RNA (Lrap) modulates brain gene expression and levels of alcohol consumption in rats
(Wiley, 2021-03) Saba, Laura M.; Hoffman, Paula L.; Homanics, Gregg E.; Mahaffey, Spencer; Daulatabad, Swapna Vidhur; Janga, Sarath Chandra; Tabakoff, Boris; BioHealth Informatics, School of Informatics and Computing
LncRNAs are important regulators of quantitative and qualitative features of the transcriptome. We have used QTL and other statistical analyses to identify a gene coexpression module associated with alcohol consumption. The "hub gene" of this module, Lrap (Long non-coding RNA for alcohol preference), was an unannotated transcript resembling a lncRNA. We used partial correlation analyses to establish that Lrap is a major contributor to the integrity of the coexpression module.
Using CRISPR/Cas9 technology, we disrupted an exon of Lrap in Wistar rats. Measures of alcohol consumption in wild type, heterozygous and knockout rats showed that disruption of Lrap produced increases in alcohol consumption/alcohol preference. The disruption of Lrap also produced changes in expression of over 700 other transcripts. Furthermore, it became apparent that Lrap may have a function in alternative splicing of the affected transcripts. The GO category of "Response to Ethanol" emerged as one of the top candidates in an enrichment analysis of the differentially expressed transcripts. We validate the role of Lrap as a mediator of alcohol consumption by rats, and also implicate Lrap as a modifier of the expression and splicing of a large number of brain transcripts. A defined subset of these transcripts significantly impacts alcohol consumption by rats (and possibly humans). Our work shows the pleiotropic nature of non-coding elements of the genome, the power of network analysis in identifying the critical elements influencing phenotypes, and the fact that not all changes produced by genetic editing are critical for the concomitant changes in phenotype.

Item Monoallelically expressed noncoding RNAs form nucleolar territories on NOR-containing chromosomes and regulate rRNA expression
(eLife Sciences, 2024-01-19) Hao, Qinyu; Liu, Minxue; Daulatabad, Swapna Vidhur; Gaffari, Saba; Song, You Jin; Srivastava, Rajneesh; Bhaskar, Shivang; Moitra, Anurupa; Mangan, Hazel; Tseng, Elizabeth; Gilmore, Rachel B.; Frier, Susan M.; Chen, Xin; Wang, Chengliang; Huang, Sui; Chamberlain, Stormy; Jin, Hong; Korlach, Jonas; McStay, Brian; Sinha, Saurabh; Janga, Sarath Chandra; Prasanth, Supriya G.; Prasanth, Kannanganattu V.; Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering
Out of the several hundred copies of rRNA genes arranged in the nucleolar organizing regions (NOR) of the five human acrocentric chromosomes, ~50% remain transcriptionally inactive. NOR-associated sequences and epigenetic modifications contribute to the differential expression of rRNAs. However, the mechanism(s) controlling the dosage of active versus inactive rRNA genes within each NOR in mammals is yet to be determined. We have discovered a family of ncRNAs, SNULs (Single NUcleolus Localized RNA), which form constrained sub-nucleolar territories on individual NORs and influence rRNA expression. Individual members of the SNULs monoallelically associate with specific NOR-containing chromosomes. SNULs share sequence similarity to pre-rRNA and localize in the sub-nucleolar compartment with pre-rRNA. Finally, SNULs control rRNA expression by influencing pre-rRNA sorting to the DFC compartment and pre-rRNA processing. Our study discovered a novel class of ncRNAs influencing rRNA expression by forming constrained nucleolar territories on individual NORs.

Item Nm-Nano: a machine learning framework for transcriptome-wide single-molecule mapping of 2´-O-methylation (Nm) sites in nanopore direct RNA sequencing datasets
(Taylor & Francis, 2024) Hassan, Doaa; Ariyur, Aditya; Daulatabad, Swapna Vidhur; Mir, Quoseena; Janga, Sarath Chandra; Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering
2´-O-methylation (Nm) is one of the most abundant modifications found in both mRNAs and noncoding RNAs. It contributes to many biological processes, such as the normal functioning of tRNA, the protection of mRNA against degradation by the decapping and exoribonuclease (DXO) protein, and the biogenesis and specificity of rRNA. Recent advancements in single-molecule sequencing techniques for long-read RNA sequencing data offered by Oxford Nanopore Technologies have enabled the direct detection of RNA modifications from sequencing data.
In this study, we propose a bio-computational framework, Nm-Nano, for predicting the presence of Nm sites in direct RNA sequencing data generated from two human cell lines. The Nm-Nano framework integrates two supervised machine learning (ML) models for predicting Nm sites: Extreme Gradient Boosting (XGBoost) and Random Forest (RF) with k-mer embedding. Evaluation on benchmark datasets from direct RNA sequencing of HeLa and HEK293 cell lines demonstrates high accuracy (99% with XGBoost and 92% with RF) in identifying Nm sites. Deploying Nm-Nano on HeLa and HEK293 cell lines reveals genes that are frequently modified with Nm. In HeLa cell lines, 125 genes are identified as frequently Nm-modified, showing enrichment in 30 ontologies related to immune response and cellular processes. In HEK293 cell lines, 61 genes are identified as frequently Nm-modified, with enrichment in processes like glycolysis and protein localization. These findings underscore the diverse regulatory roles of Nm modifications in metabolic pathways, protein degradation, and cellular processes. The source code of Nm-Nano can be freely accessed at https://github.com/Janga-Lab/Nm-Nano.

Item Penguin: A Tool for Predicting Pseudouridine Sites in Direct RNA Nanopore Sequencing Data
(Elsevier, 2022) Hassan, Doaa; Acevedo, Daniel; Daulatabad, Swapna Vidhur; Mir, Quoseena; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
Pseudouridine is one of the most abundant RNA modifications, occurring when uridines are isomerized by Pseudouridine synthase proteins. It plays an important role in many biological processes and has been reported to have application in drug development. Recently, single-molecule sequencing techniques such as the direct RNA sequencing platform offered by Oxford Nanopore Technologies have enabled direct detection of RNA modifications on the molecule being sequenced.
In this study, we introduce a tool called Penguin that integrates several machine learning (ML) models to identify RNA Pseudouridine sites on Nanopore direct RNA sequencing reads. Pseudouridine sites were identified on single-molecule sequencing data collected from direct RNA sequencing, resulting in 723K reads in HEK293 and 500K reads in HeLa cell lines. Penguin extracts a set of features from the raw signal measured by the Oxford Nanopore device and the corresponding basecalled k-mer. Those features are used to train the predictors included in Penguin, which in turn can predict, in the testing phase, whether the signal is modified by the presence of Pseudouridine sites. We have included various predictors in Penguin, including Support Vector Machines (SVM), Random Forest (RF), and Neural Network (NN). The results on the two benchmark datasets for HEK293 and HeLa cell lines show outstanding performance of Penguin in both random split testing and independent validation testing. In random split testing, Penguin identified Pseudouridine sites with a high accuracy of 93.38% by applying SVM to the HEK293 benchmark dataset. In independent validation testing, Penguin achieves an accuracy of 92.61% by training SVM on the HEK293 benchmark dataset and testing it on the HeLa benchmark dataset. Thus, in independent validation testing, Penguin outperforms the existing Pseudouridine predictors in the literature, with 16% higher accuracy. Employing Penguin to predict Pseudouridine revealed a significant enrichment of “regulation of mRNA 3’-end processing” in the HEK293 cell line and of “positive regulation of transcription from RNA polymerase II promoter involved in cellular response to chemical stimulus” in the HeLa cell line.
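The feature-extraction and evaluation steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not Penguin's implementation: the specific features (mean, median, standard deviation, and length of the signal segment, plus a one-hot encoding of the k-mer), the DNA-style `ACGT` alphabet, and the helper names `event_features` and `random_split` are all hypothetical choices made here.

```python
import random
from statistics import mean, median, pstdev

BASES = "ACGT"  # assumption: k-mers are reported in a DNA-style alphabet


def one_hot_kmer(kmer):
    """Encode each base of the k-mer as four 0/1 indicators."""
    vec = []
    for base in kmer.upper().replace("U", "T"):
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def event_features(signal, kmer):
    """Summary statistics of one raw-signal segment plus its encoded k-mer."""
    stats = [mean(signal), median(signal), pstdev(signal), len(signal)]
    return stats + one_hot_kmer(kmer)


def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[: len(rows) - n_test], rows[len(rows) - n_test:]


# Toy input: a three-sample signal segment aligned to the 2-mer "AC".
features = event_features([1.0, 2.0, 3.0], "AC")
train_rows, test_rows = random_split(range(10), test_fraction=0.2, seed=1)
print(len(features), len(train_rows), len(test_rows))
```

Random split testing corresponds to calling `random_split` on rows from a single cell line. Independent validation instead uses all rows from one cell line for training and all rows from the other for testing, so no shuffling across datasets is involved.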
Penguin software and models are available on GitHub at https://github.com/Janga-Lab/Penguin and can be readily employed for predicting Ψ sites from Nanopore direct RNA-sequencing datasets.

Item Role of SARS-CoV-2 in Altering the RNA-Binding Protein and miRNA-Directed Post-Transcriptional Regulatory Networks in Humans
(MDPI, 2020-09-25) Srivastava, Rajneesh; Daulatabad, Swapna Vidhur; Srivastava, Mansi; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The outbreak of a novel coronavirus SARS-CoV-2 responsible for the COVID-19 pandemic has caused a worldwide public health emergency. Due to the constantly evolving nature of the coronaviruses, SARS-CoV-2-mediated alterations on post-transcriptional gene regulations across human tissues remain elusive. In this study, we analyzed publicly available genomic datasets to systematically dissect the crosstalk and dysregulation of the human post-transcriptional regulatory networks governed by RNA-binding proteins (RBPs) and micro-RNAs (miRs) due to SARS-CoV-2 infection. We uncovered that 13 out of 29 SARS-CoV-2-encoded proteins directly interacted with 51 human RBPs, the majority of which were abundantly expressed in gonadal tissues and immune cells. We further performed a functional analysis of differentially expressed genes in mock-treated versus SARS-CoV-2-infected lung cells that revealed enrichment for the immune response, cytokine-mediated signaling, and metabolism-associated genes. This study also characterized the alternative splicing events in SARS-CoV-2-infected cells compared to the control, demonstrating that skipped exons and mutually exclusive exons were the most abundant events that potentially contributed to differential outcomes in response to the viral infection. A motif enrichment analysis on the RNA genomic sequence of SARS-CoV-2 clearly revealed the enrichment for RBPs such as SRSFs, PCBPs, ELAVs, and HNRNPs, suggesting the sponging of RBPs by the SARS-CoV-2 genome.
A similar analysis to study the interactions of miRs with SARS-CoV-2 revealed functionally important miRs that were highly expressed in immune cells, suggesting that these interactions may contribute to the progression of the viral infection and modulate the host immune response across other human tissues. Given the need to understand the interactions of SARS-CoV-2 with key post-transcriptional regulators in the human genome, this study provided a systematic computational analysis to dissect the role of dysregulated post-transcriptional regulatory networks controlled by RBPs and miRs across tissue types during a SARS-CoV-2 infection.

Item Role of SARS-CoV-2 in Altering the RNA-Binding Protein and miRNA-Directed Post-Transcriptional Regulatory Networks in Humans
(MDPI, 2020) Srivastava, Rajneesh; Daulatabad, Swapna Vidhur; Srivastava, Mansi; Janga, Sarath Chandra; BioHealth Informatics, School of Informatics and Computing
The outbreak of a novel coronavirus SARS-CoV-2 responsible for the COVID-19 pandemic has caused a worldwide public health emergency. Due to the constantly evolving nature of the coronaviruses, SARS-CoV-2-mediated alterations on post-transcriptional gene regulations across human tissues remain elusive. In this study, we analyzed publicly available genomic datasets to systematically dissect the crosstalk and dysregulation of the human post-transcriptional regulatory networks governed by RNA-binding proteins (RBPs) and micro-RNAs (miRs) due to SARS-CoV-2 infection. We uncovered that 13 out of 29 SARS-CoV-2-encoded proteins directly interacted with 51 human RBPs, the majority of which were abundantly expressed in gonadal tissues and immune cells. We further performed a functional analysis of differentially expressed genes in mock-treated versus SARS-CoV-2-infected lung cells that revealed enrichment for the immune response, cytokine-mediated signaling, and metabolism-associated genes.
This study also characterized the alternative splicing events in SARS-CoV-2-infected cells compared to the control, demonstrating that skipped exons and mutually exclusive exons were the most abundant events that potentially contributed to differential outcomes in response to the viral infection. A motif enrichment analysis on the RNA genomic sequence of SARS-CoV-2 clearly revealed the enrichment for RBPs such as SRSFs, PCBPs, ELAVs, and HNRNPs, suggesting the sponging of RBPs by the SARS-CoV-2 genome. A similar analysis to study the interactions of miRs with SARS-CoV-2 revealed functionally important miRs that were highly expressed in immune cells, suggesting that these interactions may contribute to the progression of the viral infection and modulate the host immune response across other human tissues. Given the need to understand the interactions of SARS-CoV-2 with key post-transcriptional regulators in the human genome, this study provided a systematic computational analysis to dissect the role of dysregulated post-transcriptional regulatory networks controlled by RBPs and miRs across tissue types during a SARS-CoV-2 infection.