Application of Data Pipelining Technology in Cheminformatics and Bioinformatics
dc.contributor.advisor | Perry, Douglas G. | |
dc.contributor.author | Mao, Linyong | |
dc.date.accessioned | 2005-08-08T17:13:11Z | |
dc.date.available | 2005-08-08T17:13:11Z | |
dc.date.issued | 2002-12 | |
dc.degree.discipline | School of Informatics | |
dc.degree.grantor | Indiana University | |
dc.degree.level | Master of Science | |
dc.description | Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University, December 2002 | en |
dc.description.abstract | Data pipelining is the processing, analysis, and mining of large volumes of data through a branching network of computational steps. A data pipelining system consists of a collection of modular computational components and a network for streaming data between them. By defining a logical path for data through a network of computational components and configuring each component accordingly, a user can create a protocol to perform virtually any desired function on the data and extract knowledge from it. A set of data pipelines was constructed to explore the relationship between the biodegradability and structural properties of halogenated aliphatic compounds in a data set in which each compound has one degradation rate and nine structure-derived properties. After training, the data pipeline was able to predict the degradation rates of new compounds with relatively good accuracy. A second set of data pipelines was generated to cluster new DNA sequences. The data pipelining technology was applied to identify a core sequence to represent a DNA cluster and to construct the 95% confidence distance interval for the cluster. The results show that 74% of the DNA sequences were correctly clustered and that there was no false clustering. | en |
dc.format.extent | 910363 bytes | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | https://hdl.handle.net/1805/322 | |
dc.identifier.uri | http://dx.doi.org/10.7912/C2/813 | |
dc.language.iso | en_US | |
dc.subject | data pipelining technology | en |
dc.subject | cheminformatics | en |
dc.subject | bioinformatics | en |
dc.title | Application of Data Pipelining Technology in Cheminformatics and Bioinformatics | en |
dc.type | Thesis | en |