Browsing by Author "Feng, Haodi"
IsoTree: A New Framework for De novo Transcriptome Assembly from RNA-seq Reads
(IEEE, 2018-02) Zhao, Jin; Feng, Haodi; Zhu, Daming; Zhang, Chi; Xu, Ying; Medical and Molecular Genetics, School of Medicine
High-throughput sequencing of mRNA has made deep and efficient probing of the transcriptome more affordable. However, the vast number of short RNA-seq reads makes de novo transcriptome assembly an algorithmic challenge. In this work, we present IsoTree, a novel framework for transcript reconstruction in the absence of reference genomes. Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting $k$-mers (overlapping substrings generated from reads), IsoTree constructs the splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative mixed integer linear programming scheme to build a prefix tree, called an isoform tree. Each path from the root node of the isoform tree to a leaf node represents a plausible transcript candidate; candidates are then pruned using paired-end read information. Experiments showed that in most cases IsoTree performs better than other leading transcriptome assembly programs. IsoTree is available at https://github.com/Jane110111107/IsoTree.

The Longest Common Exemplar Subsequence Problem
(IEEE, 2018-12) Zhang, Shu; Wang, Ruizhi; Zhu, Daming; Jiang, Haitao; Feng, Haodi; Guo, Jiong; Liu, Xiaowen; BioHealth Informatics, School of Informatics and Computing
In this paper, we propose to find order-conserved subsequences of genomes by finding longest common exemplar subsequences of the genomes. Given two genomes, the longest common exemplar subsequence problem asks for a common exemplar subsequence of maximum length. We focus on genomes in which the genes of the same gene family lie in at most $s$ spans.
We propose a dynamic programming algorithm with time complexity $O(s 4^s mn)$ to find a longest common exemplar subsequence of two genomes, with one genome admitting $s$-span genes of the same gene family, where $m$ and $n$ are the numbers of genes in the two given genomes. Our algorithm can be extended to find longest common exemplar subsequences of more than two genomes.
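
To make the problem statement above concrete, here is a minimal sketch in Python. It is not the dynamic programming algorithm from the paper and does not use the span structure at all. It assumes that a genome is a sequence of gene family labels and that an "exemplar" subsequence contains at most one gene per family; the abstract does not spell out this definition, so treat it as an assumption. The function name `longest_common_exemplar_subsequence` is hypothetical, and the running time is exponential in the number of gene families.

```python
from functools import lru_cache


def longest_common_exemplar_subsequence(genome1, genome2):
    """Return one longest common subsequence of the two genomes in which
    each gene family label occurs at most once (assumed definition)."""
    g1, g2 = tuple(genome1), tuple(genome2)

    @lru_cache(maxsize=None)
    def best(i, j, used):
        # used: frozenset of gene families already placed in the subsequence
        if i == len(g1) or j == len(g2):
            return ()
        # Either skip the current gene of g1 or the current gene of g2.
        candidates = [best(i + 1, j, used), best(i, j + 1, used)]
        # Or match the two genes if they share an unused family.
        if g1[i] == g2[j] and g1[i] not in used:
            rest = best(i + 1, j + 1, used | {g1[i]})
            candidates.append((g1[i],) + rest)
        return max(candidates, key=len)

    return best(0, 0, frozenset())


if __name__ == "__main__":
    a = ["a", "b", "a", "c"]
    b = ["a", "a", "b", "c"]
    print(longest_common_exemplar_subsequence(a, b))
```

The memoization key includes the set of families already used, which is what makes this brute force exponential; the algorithm described in the abstract avoids this blow-up by exploiting the bound of at most $s$ spans per gene family.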