Dissecting Protein-RNA Interaction Network in Human Genome
Abstract
In eukaryotes, gene regulation is a complex, multilevel process comprising transcriptional, post-transcriptional, and post-translational control. Although regulation at the transcriptional and post-translational levels is gradually being understood, the protein machinery and mechanisms underlying post-transcriptional regulation remain to be elucidated. In the first study of this dissertation, I designed and implemented the database of RNA Binding Protein (RBP) Expression and Disease Dynamics (READ-DB: darwin.soic.iupui.edu), a non-redundant, curated database of human RBPs. This RBP knowledge base integrates data from different experimental studies, providing a one-stop portal for understanding the expression, evolutionary trajectories, and disease dynamics of RBPs in the context of post-transcriptional regulatory networks. Although several experimental procedures exist for studying the function of RBPs, the lack of a suitable computational method for profiling differential occupancy limits the scope of research. In the second study, I built a scalable framework for comparing genome-wide protein occupancy profiles across cell types to uncover alterations in protein-RNA interactomes. diffHunter (github.com/Sasanh/diffHunter) is a window-based peak calling and profile comparison method that efficiently stores the base-pair-level read information of every sample in a NoSQL (Not Only SQL) database. It identifies and quantifies genome-wide binding differences between a pair of samples in two stages: Peak Calling and Differential Binding Identification. Identifying such regions makes it possible to compare biologically important regions that differ between two conditions. Finally, I studied A-to-I RNA editing as a specialized function of one RBP family. ADAR family RBPs are the primary drivers of the conversion of adenosine to inosine (A-to-I) within mRNA.
I developed the Cancer-specific RNA-editing Identification using Somatic variation Pipeline (CRISP: github.com/Sasanh/CRISP), a computational framework for accurate identification of A-to-I editing events that contribute to the prognosis and stratification of glioblastoma subtypes, as well as editing events that can serve as molecular classifiers for therapeutic approaches. I proposed two models that explain the cis-regulatory role of A-to-I editing events in noncoding regions in modulating the post-transcriptional regulation of target transcripts in glioblastoma.
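
The two-stage, window-based comparison described for diffHunter in the second study can be illustrated with a minimal Python sketch. This is not diffHunter's actual implementation: the function names, the fixed non-overlapping windows, the read-count threshold, the pseudocount, and the log2 fold-change cutoff are all assumptions chosen for illustration, and the sketch omits the NoSQL storage layer and any sequencing-depth normalization. It assumes both samples supply per-base coverage lists of equal length for the same region.

```python
from math import log2

def window_counts(coverage, window=4):
    """Sum per-base read coverage into fixed-size, non-overlapping windows."""
    return [sum(coverage[i:i + window]) for i in range(0, len(coverage), window)]

def call_peaks(counts, min_reads):
    """Stage 1 (peak calling): indices of windows whose read count meets the threshold."""
    return {i for i, c in enumerate(counts) if c >= min_reads}

def differential_windows(cov_a, cov_b, window=4, min_reads=8,
                         min_abs_lfc=1.0, pseudo=1.0):
    """Stage 2 (differential binding): compare peak windows between two samples.

    Returns (start, end, log2_fold_change) tuples for windows that are a peak
    in at least one sample and whose absolute log2 fold change (B over A)
    reaches min_abs_lfc. All thresholds here are hypothetical defaults.
    """
    counts_a = window_counts(cov_a, window)
    counts_b = window_counts(cov_b, window)
    candidates = call_peaks(counts_a, min_reads) | call_peaks(counts_b, min_reads)
    results = []
    for i in sorted(candidates):
        # The pseudocount avoids division by zero for windows empty in one sample.
        lfc = log2((counts_b[i] + pseudo) / (counts_a[i] + pseudo))
        if abs(lfc) >= min_abs_lfc:
            start = i * window
            end = min(start + window, len(cov_a))
            results.append((start, end, lfc))
    return results

# Toy per-base coverage for the same 12-base region in two conditions.
sample_a = [0, 0, 1, 1, 5, 6, 6, 5, 2, 2, 2, 2]
sample_b = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 2]
diff = differential_windows(sample_a, sample_b)
for start, end, lfc in diff:
    print(f"window [{start}, {end}): log2 fold change = {lfc:.2f}")
```

In this toy input, only the window covering bases 4 to 8 is reported: it is a peak in the first sample and its occupancy drops by more than twofold in the second. A real analysis would normalize for library size and assess statistical significance before calling a difference.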