Browsing by Author "Department of Biohealth Informatics, School of Informatics and Computing"
Now showing 1 - 10 of 18
Item Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches (BioMed Central, 2016)
Cherukuri, Yesesri; Janga, Sarath Chandra; Department of Biohealth Informatics, School of Informatics and Computing
Improved DNA sequencing methods have transformed the field of genomics over the last decade. This has become possible due to the development of inexpensive short-read sequencing technologies, which have now resulted in three generations of sequencing platforms. More recently, a new fourth generation of Nanopore-based single-molecule sequencing technology was developed, based on the MinION® sequencer, which is portable, inexpensive, and fast. It is capable of generating reads of length greater than 100 kb. Though it has many specific advantages, the two major limitations of MinION reads are high error rates and the need for the development of downstream pipelines. Algorithms for error correction have already emerged, while the development of pipelines is still at a nascent stage.

Item Characterization of proteoforms with unknown post-translational modifications using the MIScore (ACS, 2016)
Kou, Qiang; Zhu, Binhai; Wu, Si; Ansong, Charles; Tolić, Nikola; Paša-Tolić, Ljiljana; Liu, Xiaowen; Department of Biohealth Informatics, School of Informatics and Computing
Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome.
Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization aims to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform–spectrum matches still relies mainly on manual annotation. We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.

Item Consumer Health Informatics: Empowering Healthy-Lifestyle-Seekers Through mHealth (Elsevier, 2016)
Faiola, Anthony; Holden, Richard J.; Department of Biohealth Informatics, School of Informatics and Computing
People are at risk from noncommunicable diseases (NCD) and poor health habits, with interventions like medications and surgery carrying further risk of adverse effects. This paper addresses ways people are increasingly moving to healthy living medicine (HLM) to mitigate such health threats. HLM-seekers increasingly leverage mobile technologies that enable control of personal health information and collaboration with clinicians and other agents to establish healthy living practices. For example, outcomes from consumer health informatics research include empowering users to take charge of their health through active participation in decision-making about healthcare delivery. Because the success of health technology depends on its alignment and integration with a person's sociotechnical system, we introduce SEIPS 2.0 as a useful conceptual model and analytic tool.
SEIPS 2.0 approaches human work (i.e., life's effortful activities) within the complexity of the design and implementation of mHealth technologies and their potential to emerge as consumer-facing NLM products that support NCDs like diabetes.

Item Data collection challenges in community settings: Insights from two field studies of patients with chronic disease (Springer, 2015-05)
Holden, Richard J.; McDougald Scott, Amanda M.; Hoonakker, Peter L.T.; Hundt, Ann S.; Carayon, Pascale; Department of Biohealth Informatics, School of Informatics and Computing
Purpose: Collecting information about health and disease directly from patients can be fruitfully accomplished using contextual approaches, ones that combine more and less structured methods in home and community settings. This paper's purpose is to describe and illustrate a framework of the challenges of contextual data collection. Methods: A framework is presented based on prior work in community-based participatory research and organizational science, comprising ten types of challenges across four broader categories. Illustrations of challenges and suggestions for addressing them are drawn from two mixed-method, contextual studies of patients with chronic disease in two regions of the US. Results: The first major category of challenges concerned the researcher-participant partnership, for example, the initial lack of mutual trust and understanding between researchers, patients, and family members. The second category concerned patient characteristics, such as cognitive limitations and a busy personal schedule, that created barriers to successful data collection. The third concerned research logistics and procedures such as recruitment, travel distances, and compensation. The fourth concerned scientific quality and interpretation, including issues of validity, reliability, and combining data from multiple sources.
The two illustrative studies faced both common and diverse research challenges and used many different strategies to address them. Conclusion: Collecting less structured data from patients and others in the community is potentially very productive but requires the anticipation, avoidance, or negotiation of various challenges. Future work is necessary to better understand these challenges across different methods and settings, as well as to test and identify strategies to address them.

Item A framework for identifying genotypic information from clinical records: exploiting integrated ontology structures to transfer annotations between ICD codes and Gene Ontologies (IEEE, 2015-09)
Hashemikhabir, Seyedsasan; Xia, Ran; Xiang, Yang; Janga, Sarath Chandra; Department of Biohealth Informatics, School of Informatics and Computing
Although some methods have been proposed for automatic ontology generation, none of them addresses the issue of integrating large-scale heterogeneous biomedical ontologies. We propose a novel approach for integrating various types of ontologies efficiently and apply it to integrate International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9CM) and Gene Ontologies (GO). This approach is one of the early attempts to quantify the associations among clinical terms (e.g. ICD9 codes) based on their corresponding genomic relationships. We reconstructed a merged tree for a partial set of GO and ICD9 codes and measured the performance of this tree in terms of associations’ relevance by comparing them with two well-known disease-gene datasets (i.e. MalaCards and Disease Ontology). Furthermore, we compared the genomic-based ICD9 associations to temporal relationships between them from electronic health records. Our analysis shows promising associations supported by both comparisons, suggesting high reliability.
We also manually analyzed several significant associations and found promising support from literature.

Item HAPPI-2: a Comprehensive and High-quality Map of Human Annotated and Predicted Protein Interactions (BioMed Central, 2017-02-17)
Chen, Jake Yue; Pandey, Ragini; Nguyen, Thanh M.; Department of Biohealth Informatics, School of Informatics and Computing
BACKGROUND: Human protein-protein interaction (PPI) data is essential to network and systems biology studies. PPI data can help biochemists hypothesize how proteins form complexes by binding to each other, how extracellular signals propagate through post-translational modification of de-activated signaling molecules, and how chemical reactions are coupled by enzymes involved in a complex biological process. Our capability to develop good public database resources for human PPI data has a direct impact on the quality of future research on genome biology and medicine. RESULTS: The database of Human Annotated and Predicted Protein Interactions (HAPPI) version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPI) linked by 23,060 human proteins, making it the most comprehensive database covering human PPI data today. These PPIs contain both physical/direct interactions and high-quality functional/indirect interactions. Compared with the HAPPI 1.0 database release, HAPPI database version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides users with a friendly search, curation, and data retrieval interface, allowing them to retrieve human PPIs and available annotation information on the interaction type, interaction quality, interacting partner drug targeting data, and disease information. The updated HAPPI-2 can be freely accessed by academic users at http://discovery.informatics.uab.edu/HAPPI .
CONCLUSIONS: While the underlying data for HAPPI-2 are integrated from diverse data sources, the new HAPPI-2 release represents a good balance between data coverage and data quality of human PPIs, making it ideally suited for network biology.

Item Health Care Human Factors/Ergonomics Fieldwork in Home and Community Settings (Sage, 2016-10)
Valdez, Rupa S.; Holden, Richard J.; Department of Biohealth Informatics, School of Informatics and Computing
Designing innovations aligned with patients’ needs and workflows requires human factors/ergonomics (HF/E) fieldwork in home and community settings. Fieldwork in these extra-institutional settings is challenged by a need to balance the occasionally competing priorities of patient and informal caregiver participants, study team members, and the overall project. We offer several strategies that HF/E professionals can use before, during, and after home and community site visits to optimize fieldwork and mitigate challenges in these settings. Strategies include interacting respectfully with participants, documenting the visit, managing the study team–participant relationship, and engaging in dialogue with institutional review boards.

Item Healthcare Data Analytics for Parkinson’s Disease Patients: A Study of Hospital Cost and Utilization in the United States (American Medical Informatics Association, 2017-02-10)
Mukherjee, Sunanda; Wu, Huanmei; Jones, Josette; Department of Biohealth Informatics, School of Informatics and Computing
Parkinson's Disease (PD), a prevalent problem, especially for aged populations, is a progressive but non-fatal nervous system disorder. PD patients develop specific motor as well as non-motor symptoms over time. There are several limitations in the study of PD, such as the unavailability of data, proper diagnosis, and treatment methods. These limitations significantly reduce PD patients' quality of life, either directly or indirectly.
PD also imposes great financial burdens on PD patients and their families. This project aims to analyze the most common reasons for PD patient hospitalization, review complications that occur during inpatient stays, and measure the costs associated with PD patient characteristics. Using the HCUP NIS data, comprehensive data analysis has been performed. The results are presented as customized visualizations using Tableau and other software systems. The preliminary findings shed light on how to improve the quality of life of PD patients.

Item Identification of discriminative imaging proteomics associations in Alzheimer's Disease via a novel sparse correlation model (World Scientific, 2016-12)
Yan, Jingwen; Risacher, Shannon L.; Nho, Kwangsik; Saykin, Andrew J.; Shen, Li; Department of Biohealth Informatics, School of Informatics and Computing
Brain imaging and protein expression, from both cerebrospinal fluid and blood plasma, have been found to provide complementary information in predicting the clinical outcomes of Alzheimer's disease (AD). But the underlying associations that contribute to such a complementary relationship have not previously been studied. In this work, we perform an imaging proteomics association analysis to explore how they are related to each other. While traditional association models, such as Sparse Canonical Correlation Analysis (SCCA), cannot guarantee the selection of only disease-relevant biomarkers and associations, we propose a novel discriminative SCCA (denoted as DSCCA) model with new penalty terms to account for the disease status information. Given brain imaging, proteomic, and diagnostic data, the proposed model can perform a joint association and multi-class discrimination analysis, such that we can not only identify disease-relevant multimodal biomarkers, but also reveal strong associations between them.
Based on a real imaging proteomic data set, the empirical results show that DSCCA and traditional SCCA have comparable association performances. But in a further classification analysis, canonical variables of imaging and proteomic data obtained in DSCCA demonstrate much more discrimination power toward multiple pairs of diagnosis groups than those obtained in SCCA.

Item IODNE: An integrated optimization method for identifying the deregulated subnetwork for precision medicine in cancer (Wiley, 2017-03)
Renbarger, J.; Radovich, M.; Vasudevaraja, V.; Kinnebrew, G.H.; Zhang, S.; Cheng, L.; Inavolu Mounika, S.; Department of Biohealth Informatics, School of Informatics and Computing
Subnetwork analysis can explore complex patterns of entire molecular pathways for the purpose of drug target identification. In this article, the gene expression profiles of a cohort of patients with breast cancer are integrated with protein-protein interaction (PPI) networks using, simultaneously, both edge scoring and node scoring. A novel optimization algorithm, integrated optimization method to identify deregulated subnetwork (IODNE), is developed to search for the optimal dysregulated subnetwork of the merged gene and protein network. IODNE is applied to select subnetworks for Luminal-A breast cancer from The Cancer Genome Atlas (TCGA) data. A large fraction of cancer-related genes and the well-known clinical targets, ER1/PR and HER2, are found by IODNE. This validates the utility of IODNE. When applying IODNE to the triple-negative breast cancer (TNBC) subtype data, we identified subnetworks that contain genes such as ERBB2, HRAS, PGR, CAD, POLE, and SLC2A1.