Browsing by Author "Zhou, Ao"
Alt Event Finder: a tool for extracting alternative splicing events from RNA-seq data.
(BMC, 2012) Zhou, Ao; Breese, Marcus R.; Hao, Yangyang; Edenberg, Howard J.; Li, Lang; Skaar, Todd C.; Liu, Yunlong
BACKGROUND: Alternative splicing increases proteome diversity by expressing multiple gene isoforms that often differ in function. Identifying alternative splicing events from RNA-seq experiments is important for understanding the diversity of transcripts and for investigating the regulation of splicing. RESULTS: We developed Alt Event Finder, a tool for identifying novel splicing events by using transcript annotation derived from genome-guided construction tools, such as Cufflinks and Scripture. With a proper combination of alignment and transcript reconstruction tools, Alt Event Finder is capable of identifying novel splicing events in the human genome. We further applied Alt Event Finder to a set of RNA-seq data from rat liver tissues, and identified dozens of novel cassette exon events whose splicing patterns changed after extensive alcohol exposure. CONCLUSIONS: Alt Event Finder is capable of identifying de novo splicing events from data-driven transcript annotation, and is a useful tool for studying splicing regulation.

Characterizing alternative splicing and long non-coding RNA with high-throughput sequencing technology
(2018-10) Zhou, Ao; Wu, Huanmei; Liu, Yunlong; Janga, Sarath C.; Liu, Xiaowen
Several experimental methods have been developed for the study of the central dogma since the late 20th century. Protein mass spectrometry and next-generation sequencing (including DNA-Seq and RNA-Seq) form a triangle of experimental methods, corresponding to the three vertices of the central dogma, i.e., DNA, RNA, and protein.
Numerous RNA sequencing and protein mass spectrometry experiments have been carried out in an attempt to understand how expression changes of known genes affect biological functions in various organisms. However, it was once overlooked that the data from these experiments are in fact holograms that also reveal other delicate biological mechanisms, such as RNA splicing and the expression of long non-coding RNAs. In this dissertation, we carried out five studies based on high-throughput sequencing data, in an attempt to understand how RNA splicing and differential expression of long non-coding RNAs are associated with biological functions. In the first two studies, we identified and characterized 197 stimulant-induced and 477 developmentally regulated alternative splicing events from RNA sequencing data. In the third study, we introduced a method for identifying novel alternative splicing events that had never been documented. In the fourth study, we introduced a method for identifying known and novel RNA splicing junctions from protein mass spectrometry data. In the fifth study, we introduced a method for identifying long non-coding RNAs from poly-A selected RNA sequencing data. Taking advantage of these methods, we turned RNA sequencing and protein mass spectrometry data into an information gold mine of splicing and long non-coding RNA activities.

Characterizing the roles of long non-coding RNA in rat alcohol preference
(IEEE, 2016-12) Zhou, Ao; Wang, Yadong; Liu, Yunlong; Feng, Weixing; Edenberg, Howard J.; Medical and Molecular Genetics, School of Medicine
Alcohol is one of the major threats to health in the United States. With the emergence of next-generation sequencing technology, the association between alcohol preference and the variants and expression of genes has been investigated. However, the roles of long non-coding RNAs (lncRNAs) in alcohol preference remain unclear.
In this study, we identified 37 novel lncRNAs that are differentially expressed between alcohol-preferring (P) and non-preferring (NP) rats. The functional study of these lncRNAs demonstrates that they are associated with gene regulation as well as neural functions. This suggests that these lncRNAs may contribute to alcohol preference behaviors.

Lipopolysaccharide treatment induces genome-wide pre-mRNA splicing pattern changes in mouse bone marrow stromal stem cells
(BioMed Central, 2016-08-22) Zhou, Ao; Li, Meng; He, Bo; Feng, Weixing; Huang, Fei; Xu, Bing; Dunker, A. Keith; Balch, Curt; Li, Baiyan; Liu, Yunlong; Wang, Yue; Department of Medical and Molecular Genetics, IU School of Medicine
Background: Lipopolysaccharide (LPS) is a gram-negative bacterial antigen that triggers a series of cellular responses. LPS pre-conditioning was previously shown to improve the therapeutic efficacy of bone marrow stromal cells/bone-marrow derived mesenchymal stem cells (BMSCs) for repairing ischemic, injured tissue. Results: In this study, we systematically evaluated the effects of LPS treatment on genome-wide splicing pattern changes in mouse BMSCs by comparing transcriptome sequencing data from control vs. LPS-treated samples, revealing 197 exons whose BMSC splicing patterns were altered by LPS. Functional analysis of these alternatively spliced genes demonstrated significant enrichment of phosphoproteins, zinc finger proteins, and proteins undergoing acetylation. Additional bioinformatics analysis strongly suggests that LPS-induced alternatively spliced exons could have major effects on protein functions by disrupting key protein functional domains, protein-protein interactions, and post-translational modifications.
Conclusion: Although it is still to be determined whether such proteome modifications improve BMSC therapeutic efficacy, our comprehensive splicing characterizations provide greater understanding of the intracellular mechanisms that underlie the therapeutic potential of BMSCs. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2898-5) contains supplementary material, which is available to authorized users.

PEPPI: a peptidomic database of human protein isoforms for proteomics experiments
(BMC, 2010-10-07) Zhou, Ao; Zhang, Fan; Chen, Jake Yue; BioHealth Informatics, School of Informatics and Computing
Background: Protein isoform generation, which may derive from alternative splicing, genetic polymorphism, and posttranslational modification, is an essential source of molecular diversity in eukaryotic cells. Previous studies have shown that protein isoforms play critical roles in disease diagnosis, risk assessment, sub-typing, prognosis, and treatment outcome predictions. Understanding the types, presence, and abundance of different protein isoforms in different cellular and physiological conditions is a major task in functional proteomics, and may pave the way to molecular biomarker discovery for human diseases. In tandem mass spectrometry (MS/MS) based proteomics analysis, peptide peaks with exact matches to protein sequence records in the proteomics database may be identified with mass spectrometry (MS) search software. However, due to limited annotation and poor coverage of protein isoforms in proteomics databases, high-throughput protein isoform identifications, particularly those arising from alternative splicing and genetic polymorphism, have not been possible.
Results: Therefore, we present the PEPtidomics Protein Isoform Database (PEPPI, http://bio.informatics.iupui.edu/peppi), a comprehensive database of computationally synthesized human peptides that can identify protein isoforms derived from either alternatively spliced mRNA transcripts or SNP variations. We collected genome, pre-mRNA alternative splicing, and SNP information from Ensembl. We synthesized in silico isoform transcripts that cover all exons and theoretically possible junctions of exons and introns, as well as all their variations derived from known SNPs. With three case studies, we further demonstrated that the database can help researchers discover and characterize new protein isoform biomarkers from experimental proteomics data. Conclusions: We developed a new tool for the proteomics community to characterize protein isoforms from MS-based proteomics experiments. By cataloguing each peptide configuration in the PEPPI database, users can study genetic variations and alternative splicing events at the proteome level. They can also batch-download peptide sequences in FASTA format to search for MS/MS spectra derived from human samples. The database can help generate novel hypotheses on molecular risk factors and molecular mechanisms of complex diseases, leading to identification of potentially highly specific protein isoform biomarkers.