Applications of Data Mining in Healthcare

Peng, Bo

Files

Bo_dissertation_3.pdf (374.13 KB)

Date

2019-05

Authors

Peng, Bo

Language

American English

Committee Chair

Mohler, George

Committee Members

Dundar, Murat
Zheng, Jiang Yu

Degree

M.S.

Degree Year

2019

Grantor

Purdue University

Abstract

With increases in the quantity and quality of healthcare-related data, data mining tools have the potential to improve people’s standard of living through personalized and predictive medicine. In this thesis we improve the state of the art in data mining for several problems in the healthcare domain. In problems such as drug-drug interaction prediction and Alzheimer’s Disease (AD) biomarker discovery and prioritization, current methods either require tedious feature engineering or have unsatisfactory performance. New, effective computational tools are needed that can tackle these complex problems. In this dissertation, we develop new algorithms for two healthcare problems: high-order drug-drug interaction prediction and amyloid imaging biomarker prioritization in Alzheimer’s Disease. Drug-drug interactions (DDIs) and their associated adverse drug reactions (ADRs) represent a significant detriment to public health. Existing research on DDIs primarily focuses on pairwise DDI detection and prediction. Effective computational methods for high-order DDI prediction are desired. In this dissertation, I present a deep learning-based model, D³I, for cardinality-invariant and order-invariant high-order DDI prediction. The proposed models achieve an F1 value of 0.740 and an AUC value of 0.847 on high-order DDI prediction, and outperform classical methods on order-2 DDI prediction. These results demonstrate the strong potential of D³I and deep learning-based models in tackling the prediction problems of high-order DDIs and their induced ADRs.
The second problem I consider in this thesis is amyloid imaging biomarker discovery, for which I propose an innovative machine learning paradigm enabling precision medicine in this domain. The paradigm tailors the imaging biomarker discovery process to the individual characteristics of a given patient. I implement this paradigm using a newly developed learning-to-rank method, PLTR. The PLTR model integrates two objectives for joint optimization: pushing up relevant biomarkers and ranking among relevant biomarkers. The empirical study of PLTR conducted on the ADNI data yields promising results in identifying and prioritizing individual-specific amyloid imaging biomarkers based on the individual’s structural MRI data. The resulting top-ranked imaging biomarkers have the potential to aid personalized diagnosis and disease subtyping.

Description

Indiana University-Purdue University Indianapolis (IUPUI)

Keywords

Data mining, Healthcare

Type

Thesis

Permanent Link

https://hdl.handle.net/1805/18933
http://dx.doi.org/10.7912/C2/2363

Collections

Computer & Information Science Department Theses and Dissertations
