Luddy School of Informatics, Computing, and Engineering
The Indiana University Luddy School of Informatics, Computing, and Engineering is a core school with programs on the Bloomington and Indianapolis campuses. Works found here were created by Indianapolis faculty, staff, and students.
Now showing 1 - 10 of 1320
Item IUPUI New Media Animation Projects (2005-02-14T22:08:50Z) Wiser, Leslie

Item American Family Archive Project (2005-02-14T22:21:24Z) Wiser, Leslie

Item Fitness Training Guide (2005-03-21T16:38:14Z) Reed, David
This interactive multimedia presentation guides the user through a fitness training regimen. It details specific exercises and muscle groups and discusses nutritional concerns.

Item UITS Communications & Planning Office: 2002-2003 Year in Review Portfolio (2005-07-21T20:40:15Z) Hoffman, James C.
This is an interactive portfolio showcasing the work of the UITS Communications & Planning Office. The portfolio presents a sampling of projects completed during the 2002-2003 fiscal year.

Item Capstone 2003 (2005-08-02) Nguyen, Diep (Christine) 1980-

Item Security of our Personal Genome (2003-08) Smith, Gregory H.
Our personal genome, which is the map of our DNA, is our ultimate source of identity and should be given our highest concern for security. The primary approach used for securing any highly sensitive health care data such as our genome would be to guard against any personal identity information being associated with the data. The belief that nameless data records eliminate risk and would be a benefit to research is the common pretense for how we manage our health data systems. However, the incredible advances that we are seeing in computational power and in more affordable and sophisticated DNA sequencing software may be creating a problem greater than the benefit they provide. Now we must be concerned about all data in the health care systems that could provide a link to accessible identity-free data. Old data records or samples that provide possibilities of DNA sequence matching to existing identity-free genomic data present a whole new problem. How might this change the face of health care? Will further advances in technology make it impossible for us to secure our personal health information?
Solutions could restrict our ability to improve health care, or they could force us to rely more heavily on ethical judgment to protect the rights of patients. The unprecedented rate of recent advances in information technologies, along with the improved speed, economy, and accuracy of mapping the human genome, has created serious concerns about the usage and security of this new, highly sensitive genetic data. Our knowledge of DNA has come a long way in the 50 years since James Watson and Francis Crick first presented their discovery of the double helix. The discovery timeline has been crowded in recent years, starting with the U.S. Department of Energy’s Human Genome Initiative in 1986 and culminating in completion of the Human Genome Project in 2003. The exponential growth of genomic scientific accomplishment now forces us to assume new milestones will arrive sooner rather than later.

Item Web-based Email Management For Email Overload (2005-08-08T16:50:27Z) Campiranon, Chatree
An email overload problem occurs when users try to use an email service in a way it was not designed for. Moreover, many web-based email services provide large email storage space, and users tend to keep more unused emails. Issues that cause email overload are 1) keeping too many emails, 2) using email for conversational threads, and 3) using email as a task management tool. Forty-five participants were selected to participate in user study sessions including a questionnaire, a time-on-task study, and an interview. Participants were divided into three groups of 15. Participants in the first group were assigned as Gmail users. Participants in the second group were assigned as Yahoo! Mail users. After the user study sessions for the first two groups were finished, the results were analyzed and a new web-based email prototype was designed as a suggestion of how web-based email could be developed to handle the email overload problem.
Users in the third group then tested the new prototype in the same manner as the first two groups. Users in the third group were satisfied with the features and design of the new prototype. The design of the new prototype focused on solutions that can handle the email overload problem: 1) email categorizing, 2) email thread grouping, 3) email searching, and 4) email task management. This study illustrates how web-based email can be designed with features to handle email overload problems while keeping the interface usable for most users.

Item Construction of a Database of Secondary Structure Segments and Short Regions of Disorder and Analysis of Their Properties. Zang, Yizhi; Dunker, Keith
Prediction of the secondary structure of a protein from its amino acid sequence remains an important task. Not only has the growth of databases holding only protein sequences outpaced that of solved protein structures, but successful predictions can provide a starting point for direct tertiary structure modeling [1],[2], and they can also significantly improve sequence analysis and sequence-structure threading [3],[4] for aiding in structure and function determination. Previous work on predicting secondary structures of proteins has yielded best accuracies ranging from 63% to 71% [5]. These numbers, however, should be taken with caution, since the performance of a method based on one training set may vary when it is trained on a different training set. Improving predictions of secondary structure involves three challenges. The first challenge is establishing an appropriate database. The next challenge is to represent the protein sequence appropriately. The third challenge is finding an appropriate method of classification. Two of the three challenges are thus related to an appropriate database and characteristic features.
Here, we report the development of a database of non-identical segments of secondary structure elements and fragments with missing electron densities (disordered fragments) extracted from the Protein Data Bank and categorized into groups of equal lengths, from 6 to 40. The number of residues corresponding to the above-mentioned categories is: 219,788 for α-helices, 82,070 for β-sheets, 179,388 for coils, and 74,724 for disorder. The total number of fragments in the database is 49,544, of which 17,794 are α-helices, 10,216 β-sheets, 16,318 coils, and 5,216 disordered regions. Across the whole range of lengths, α-helices were found to be enriched in L, A, E, I, and R; β-sheets were enriched in V, I, F, Y, and L; coils were enriched in P, G, N, D, and S; and disordered regions were enriched in S, G, P, H, and D. In addition to the amino acid sequence, for each fragment of every structural type, we calculated the distance between the residues immediately flanking its termini. The observed distances range between 3 and 30 Å. We found that for the three secondary structure types the average distance between the bookending residues increases linearly with sequence length, while distances were more constant for disorder. For each length between 6 and 40, we compared amino acid compositions of all four structural types and found a strong compositional dependence on length only for the β-sheet fragments, while the other three types showed virtually no change with length. Using the Kullback-Leibler (KL) distance between amino acid compositions, we quantified the differences between the four categories. We found that the closest pair in terms of the KL distance was coil and disorder (dKL = 0.06 bits), then α-helix and β-sheet (dKL = 0.14 bits), while all other pairs were almost equidistant from one another (dKL ≈ 0.25 bits). With increasing segment length we found a decreasing KL distance between sheet and coil, sheet and disorder, and disorder and helix.
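As a rough illustration of the KL comparison described in this abstract, the sketch below computes a KL divergence in bits between two amino acid compositions. It is not the authors' code: the function names, the pseudocount, the toy fragments, and the choice of the plain (asymmetric) divergence with a base-2 logarithm are all assumptions, since the abstract does not say how the distance was defined or whether it was symmetrized.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(fragments, pseudocount=1.0):
    """Relative frequency of each amino acid over a list of fragments.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency positive so the logarithm below is always defined.
    """
    counts = Counter()
    for fragment in fragments:
        counts.update(ch for ch in fragment if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def kl_bits(p, q):
    """KL divergence D(p || q) in bits (base-2 logarithm)."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS)


def symmetric_kl_bits(p, q):
    """Average of both directions, one common way to symmetrize (assumed)."""
    return 0.5 * (kl_bits(p, q) + kl_bits(q, p))


# Toy fragments invented for illustration only; they are not from the database.
helix_like = ["LAEELLRRAI", "AELLKKLAEE"]
coil_like = ["PGNDSGPNGS", "SGDPNGSPGD"]

p_helix = composition(helix_like)
p_coil = composition(coil_like)

print("D(helix || coil) in bits:", kl_bits(p_helix, p_coil))
print("symmetrized, in bits:", symmetric_kl_bits(p_helix, p_coil))
```

Because the plain divergence is not symmetric, D(p || q) and D(q || p) generally differ; the averaged form is shown only as one possible way to obtain a single number per pair of structural types.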
Analyzing hierarchical clustering of lengths from 6 to 18 for sheet, coil, disorder, and helix, we found that the coil group had the closest proximity among lengths from 6 to 18. The next closest were helix and disorder. The sheet group showed the greatest differences among its lengths from 6 to 18. In the sheet and coil groups, fragments of length 17 had the longest distance, while fragments of length 6 had the longest distance in the disorder and helix groups.

Item Application of Data Pipelining Technology in Cheminformatics and Bioinformatics (2002-12) Mao, Linyong; Perry, Douglas G.
Data pipelining is the processing, analysis, and mining of large volumes of data through a branching network of computational steps. A data pipelining system consists of a collection of modular computational components and a network for streaming data between them. By defining a logical path for data through a network of computational components and configuring each component accordingly, a user can create a protocol to perform virtually any desired function with data and extract knowledge from them. A set of data pipelines was constructed to explore the relationship between the biodegradability and structural properties of halogenated aliphatic compounds in a data set in which each compound has one degradation rate and nine structure-derived properties. After training, the data pipeline was able to calculate the degradation rates of new compounds with relatively good accuracy. A second set of data pipelines was generated to cluster new DNA sequences. The data pipelining technology was applied to identify a core sequence to represent a DNA cluster and construct the 95% confidence distance interval for the cluster.
The result shows that 74% of the DNA sequences were correctly clustered and there was no false clustering.

Item Automating Laboratory Operations by Intergrating Laboratory Information Management Systems (LIMS) with Analytical Instruments and Scientific Data Management System (SDMS) (2005-06) Zhu, Jianyong; Merchant, Mahesh
The large volume of data generated by commercial and research laboratories, along with requirements mandated by regulatory agencies, has forced companies to use laboratory information management systems (LIMS) to improve efficiency in tracking and managing samples and in precisely reporting test results. However, most general-purpose LIMS do not provide an interface to automatically collect data from analytical instruments and store it in a database. A scientific data management system (SDMS) provides a “Print-to-Database” technology, which facilitates the entry of reports generated by instruments directly into the SDMS database as Windows enhanced metafiles, thereby minimizing data entry errors. Unfortunately, SDMS does not allow further analysis to be performed. Many LIMS vendors provide plug-ins for a single instrument, but none of them provides a general-purpose interface to extract the data from SDMS and store it in LIMS. In this project, a general-purpose middle layer named LabTechie was designed, built, and tested for seamless integration between instruments, SDMS, and LIMS. This project was conducted at American Institute of Technology (AIT) Laboratories, an analytical laboratory that specializes in trace chemical measurement of biological fluids. Data is generated from 20 analytical instruments, including gas chromatography/mass spectrometer (GC/MS), high performance liquid chromatography (HPLC), and liquid chromatography/mass spectrometer (LC/MS), and is currently stored in NuGenesis SDMS iv (Waters, Milford, MA). This approach can be easily expanded to include additional instruments.