Characterizing alternative splicing and long non-coding RNA with high-throughput sequencing technology
Abstract
Several experimental methods have been developed for the study of the central dogma since the late 20th century. Protein mass spectrometry and next-generation sequencing (including DNA-Seq and RNA-Seq) form a triangle of experimental methods, corresponding to the three vertices of the central dogma, i.e., DNA, RNA, and protein. Numerous RNA sequencing and protein mass spectrometry experiments have been carried out in an attempt to understand how expression changes of known genes affect biological functions in various organisms. However, it was long overlooked that the data from these experiments are in fact holograms that also reveal other delicate biological mechanisms, such as RNA splicing and the expression of long non-coding RNAs. In this dissertation, we carried out five studies based on high-throughput sequencing data, in an attempt to understand how RNA splicing and differential expression of long non-coding RNAs are associated with biological functions. In the first two studies, we identified and characterized 197 stimulant-induced and 477 developmentally regulated alternative splicing events from RNA sequencing data. In the third study, we introduced a method for identifying novel alternative splicing events that had never been documented. In the fourth study, we introduced a method for identifying known and novel RNA splicing junctions from protein mass spectrometry data. In the fifth study, we introduced a method for identifying long non-coding RNAs from poly-A-selected RNA sequencing data. Taking advantage of these methods, we turned RNA sequencing and protein mass spectrometry data into an information gold mine of splicing and long non-coding RNA activities.