Prediction and validation of the unexplored RNA-binding protein atlas of the human proteome
Abstract
Detecting protein-RNA interactions is challenging both experimentally and computationally because RNAs are large in number, diverse in cellular location and function, and flexible in structure. As a result, many RNA-binding proteins (RBPs) remain to be identified. Here, SPOT-Seq, a template-based function-prediction technique for RBPs, is applied to the human proteome, and its results are validated against a recent proteomic experiment that discovered 860 mRNA-binding proteins (mRBPs). The coverage (or sensitivity) is 42.6% for 1217 known RBPs annotated in the Gene Ontology and 43.6% for the 860 newly discovered human mRBPs. This consistent sensitivity indicates the robust performance of SPOT-Seq in predicting RBPs. More importantly, SPOT-Seq detects 2418 novel RBPs in the human proteome, 291 of which were validated by the newly discovered mRBP set. Among the 291 validated novel RBPs, 61 are not homologous to any known RBPs. Successful validation of the predicted novel RBPs permits further analysis of their phenotypic roles in disease pathways. The dataset of 2418 predicted novel RBPs, along with confidence levels and complex structures, is available at http://sparks-lab.org (in publications) for experimental confirmation and hypothesis generation.
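The coverage figures quoted in the abstract are set-overlap ratios, and the "validated novel" count is a set difference followed by an intersection. The minimal Python sketch below illustrates that bookkeeping only. The function names and the protein identifiers are hypothetical placeholders chosen for illustration; they are not taken from SPOT-Seq or from the published dataset, and the sketch does not reproduce the reported percentages.

```python
def sensitivity(predicted, reference):
    """Fraction of the reference proteins that also appear in the predicted set."""
    reference = set(reference)
    if not reference:
        raise ValueError("reference set is empty")
    return len(reference & set(predicted)) / len(reference)


def novel_validated(predicted, known, validation):
    """Predicted proteins absent from the known annotation but present in the validation set."""
    return (set(predicted) - set(known)) & set(validation)


# Toy identifiers (assumed, purely illustrative; not real proteins or real results).
predicted = {"P1", "P2", "P3", "P4", "P5"}   # stand-in for predicted RBPs
known = {"P1", "P2", "P6", "P7"}             # stand-in for annotated known RBPs
validation = {"P3", "P4", "P8", "P9"}        # stand-in for an experimental mRBP set

cov_known = sensitivity(predicted, known)
cov_validation = sensitivity(predicted, validation)
validated = sorted(novel_validated(predicted, known, validation))

print(cov_known, cov_validation, validated)
```

With real data, `predicted` would hold the predicted RBP identifiers, `known` the annotated RBPs, and `validation` the experimentally discovered mRBPs, all mapped to a common identifier scheme before comparison.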