Application of Data Pipelining Technology in Cheminformatics and Bioinformatics

dc.contributor.advisor: Perry, Douglas G.
dc.contributor.author: Mao, Linyong
dc.date.accessioned: 2005-08-08T17:13:11Z
dc.date.available: 2005-08-08T17:13:11Z
dc.date.issued: 2002-12
dc.degree.discipline: School of Informatics
dc.degree.grantor: Indiana University
dc.degree.level: Master of Science
dc.description: Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University, December 2002 [en]
dc.description.abstract: Data pipelining is the processing, analysis, and mining of large volumes of data through a branching network of computational steps. A data pipelining system consists of a collection of modular computational components and a network for streaming data between them. By defining a logical path for data through a network of computational components and configuring each component accordingly, a user can create a protocol to perform virtually any desired function on the data and extract knowledge from them. A set of data pipelines was constructed to explore the relationship between the biodegradability and structural properties of halogenated aliphatic compounds in a data set in which each compound has one degradation rate and nine structure-derived properties. After training, the data pipeline was able to calculate the degradation rates of new compounds with relatively good accuracy. A second set of data pipelines was generated to cluster new DNA sequences. The data pipelining technology was applied to identify a core sequence to represent a DNA cluster and to construct the 95% confidence distance interval for the cluster. The results show that 74% of the DNA sequences were correctly clustered and that there was no false clustering. [en]
dc.format.extent: 910363 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/1805/322
dc.identifier.uri: http://dx.doi.org/10.7912/C2/813
dc.language.iso: en_US
dc.subject: data pipelining technology [en]
dc.subject: cheminformatics [en]
dc.subject: bioinformatics [en]
dc.title: Application of Data Pipelining Technology in Cheminformatics and Bioinformatics [en]
dc.type: Thesis [en]
Files
Original bundle
Name: Linyong Mao.pdf
Size: 889.03 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.83 KB
Format: Item-specific license agreed upon to submission