Browsing by Subject "machine learning"
Advancing cyanobacteria biomass estimation from hyperspectral observations: Demonstrations with HICO and PRISMA imagery
(Elsevier, 2021-12) O'Shea, Ryan E.; Pahlevan, Nima; Smith, Brandon; Bresciani, Mariano; Egerton, Todd; Giardino, Claudia; Li, Lin; Moore, Tim; Ruiz-Verdu, Antonio; Ruberg, Steve; Simis, Stefan G. H.; Stumpf, Richard; Vaičiūtė, Diana; Earth Sciences, School of Science
Retrieval of the phycocyanin concentration (PC), a characteristic pigment of, and proxy for, cyanobacteria biomass, from hyperspectral satellite remote sensing measurements is challenging due to uncertainties in the remote sensing reflectance (∆Rrs) resulting from atmospheric correction and instrument radiometric noise. Although several individual algorithms have been proven to capture local variations in cyanobacteria biomass in specific regions, their performance has not been assessed on hyperspectral images from satellite sensors. Our work leverages a machine-learning model, Mixture Density Networks (MDNs), trained on a large (N = 939) dataset of collocated in situ chlorophyll-a concentrations (Chla), PCs, and remote sensing reflectance (Rrs) measurements to estimate PC from all relevant spectral bands. The performance of the developed model is demonstrated via PC maps produced from select images of the Hyperspectral Imager for the Coastal Ocean (HICO) and Italian Space Agency's PRecursore IperSpettrale della Missione Applicativa (PRISMA) using a matchup dataset. As input to the MDN, we incorporate a combination of widely used band ratios (BRs) and line heights (LHs) taken from existing multispectral algorithms, that have been proven for both Chla and PC estimation, as well as novel BRs and LHs to increase the overall cyanobacteria biomass estimation accuracy and reduce the sensitivity to ∆Rrs. When trained on a random half of the dataset, the MDN achieves uncertainties of 44.3%, which is less than half of the uncertainties of all viable optimized multispectral PC algorithms.
The MDN is notably better than multispectral algorithms at preventing overestimation on low (<10 mg m−3) PC. Visibly, HICO and PRISMA PC maps show the wider dynamic range that can be represented by the MDN. The available in situ and satellite-derived Rrs matchups and measured in situ PC demonstrate the robustness of the MDN for estimating low (<10 mg m−3) PC and the reduced impact of ∆Rrs on medium-to-high in situ PC (>10 mg m−3). According to our extensive assessments, the developed model is anticipated to enable practical PC products from PRISMA and HICO; therefore, the model is promising for planned hyperspectral missions, such as the Plankton Aerosol and Cloud Ecosystem (PACE). This advancement will enhance the complementary roles of hyperspectral radiometry from satellite and low-altitude platforms for quantifying and monitoring cyanobacteria harmful algal blooms at both large and local spatial scales.

Advancing Expert Human-Computer Interaction Through Music
(Michigan Publishing, 2012-09) Smith, Benjamin D.; Garnett, Guy E.
One of the most important challenges for computing over the next decade is discovering ways to augment and extend human control over ever more powerful, complex, and numerous devices and software systems. New high-dimensional input devices and control systems provide these affordances, but require extensive practice and learning on the part of the user. This paper describes a system created to leverage existing human expertise with a complex, highly dimensional interface, in the form of a trained violinist and violin.
A machine listening model is employed to provide the musician and user with direct control over a complex simulation running on a high-performance computing system.

An Adversarial Approach to Enable Re-Use of Machine Learning Models and Collaborative Research Efforts Using Synthetic Unstructured Free-Text Medical Data
(IOS, 2019) Kasthurirathne, Suranga N.; Dexter, Gregory; Grannis, Shaun J.; Epidemiology, School of Public Health
We leverage Generative Adversarial Networks (GAN) to produce synthetic free-text medical data with low re-identification risk, and apply these to replicate machine learning solutions. We trained GAN models to generate free-text cancer pathology reports. Decision models trained using synthetic datasets reported performance metrics that were statistically similar to those of models trained using original test data. Our results further the use of GANs to generate synthetic data for collaborative research and re-use of machine learning models.

AIMS Philanthropy Project: Studying AI, Machine Learning & Data Science Technology for Good
(Indiana University Lilly Family School of Philanthropy and Indiana University School of Informatics and Computing, IUPUI, Indianapolis, IN., 2021-02-07) Herzog, Patricia Snell; Naik, Harshal R.; Khan, Haseeb A.
This project investigates philanthropic activities related to Artificial Intelligence, Machine Learning, and Data Science technology (AIMS). Advances in AIMS technology are impacting the field of philanthropy in substantial ways. This report focuses on methods employed in analyzing and visualizing five data sources: Open Philanthropy grants database, Rockefeller Foundation grants database, Chronicle of Philanthropy article database, GuideStar Nonprofit Database, and Google AI for Social Good grant awardees. The goal was to develop an accessible website platform that engaged human-centered user experience (UX) design techniques to present information about AIMS Philanthropy (https://www.aims-phil.org/).
Each dataset was analyzed for a set of general questions that could be answered visually. The visuals aim to provide answers to these two primary questions: (1) How much funding was invested in AIMS? and (2) What focus areas, applications, discovery, or other purposes was AIMS funding directed toward? Cumulatively, this project identified 325 unique organizations with a total of $2.6 billion in funding for AIMS philanthropy.

Analysis of AI Models for Student Admissions: A Case Study
(ACM, 2023-03) Van Basum, Kelly; Fang, Shaiofen; Computer and Information Science, School of Science
This research uses machine learning-based AI models to predict admissions decisions at a large urban research university. Admissions data spanning five years was used to create an AI model to determine whether a given student would be directly admitted into the School of Science under various scenarios. During this time, submission of standardized test scores as part of a student's application became optional, which led to interesting questions about the impact of standardized test scores on admission decisions. We first developed AI models and analyzed these models to understand which variables are important in admissions decisions, and how the decision to exclude test scores affects the demographics of the students who are admitted.
We then evaluated the predictive models to detect and analyze biases these models may carry with respect to three variables chosen to represent sensitive populations: gender, race, and whether a student was the first in their family to attend college.

Analyzing the symptoms in colorectal and breast cancer patients with or without type 2 diabetes using EHR data
(Sage, 2021) Luo, Xiao; Storey, Susan; Gandhi, Priyanka; Zhang, Zuoyi; Metzger, Megan; Huang, Kun; Computer Information and Graphics Technology, School of Engineering and Technology
This research extracted patient-reported symptoms from free-text EHR notes of colorectal and breast cancer patients and studied the correlation of the symptoms with comorbid type 2 diabetes, race, and smoking status. An NLP framework was developed first to use UMLS MetaMap to extract all symptom terms from the 366,398 EHR clinical notes of 1694 colorectal cancer (CRC) patients and 3458 breast cancer (BC) patients. Semantic analysis and clustering algorithms were then developed to categorize all the relevant symptoms into eight symptom clusters defined by seed terms. After all the relevant symptoms were extracted from the EHR clinical notes, the frequency of the symptoms reported by CRC and BC patients over three time-periods post-chemotherapy was calculated. Logistic regression (LR) was performed with each symptom cluster as the response variable while controlling for diabetes, race, and smoking status. The results show that the CRC and BC patients with Type 2 Diabetes (T2D) were more likely to report symptoms than CRC and BC patients without T2D over three time-periods in the cancer trajectory.
We also found that current smokers were more likely to report anxiety (CRC, BC), neuropathic symptoms (CRC, BC), anxiety (BC), and depression (BC) than non-smokers.

Artificial Intelligence for Global Health: Learning From a Decade of Digital Transformation in Health Care
(arXiv, 2020) Mathur, Varoon; Purkayastha, Saptarshi; Gichoya, Judy Wawira; BioHealth Informatics, School of Informatics and Computing
The health needs of those living in resource-limited settings are a vastly overlooked and understudied area in the intersection of machine learning (ML) and health care. While the use of ML in health care has been popularized only over the last few years by the advancement of deep learning, low- and middle-income countries (LMICs) have already been undergoing a digital transformation of their own in health care over the last decade, leapfrogging milestones due to the adoption of mobile health (mHealth). With the introduction of new technologies, it is common to start afresh with a top-down approach, and implement these technologies in isolation, leading to lack of use and a waste of resources. In this paper, we outline the necessary considerations both from the perspective of current gaps in research, as well as from the lived experiences of health care professionals in resource-limited settings. We also outline briefly several key components of successful implementation and deployment of technologies within health systems in LMICs, including technical and cultural considerations in the development process relevant to the building of machine learning solutions.
We then draw on these experiences to address where key opportunities for impact exist in resource-limited settings, and where AI/ML can provide the most benefit.

Assessing Demand for Transparency in Intelligent Systems Using Machine Learning
(IEEE, 2018-07) Vorm, Eric S.; Miller, Andrew D.; Human-Centered Computing, School of Informatics and Computing
Intelligent systems offering decision support can lessen cognitive load and improve the efficiency of decision making in a variety of contexts. These systems assist users by evaluating multiple courses of action and recommending the right action at the right time. Modern intelligent systems using machine learning introduce new capabilities in decision support, but they can come at a cost. Machine learning models provide little explanation of their outputs or reasoning process, making it difficult to determine when it is appropriate to trust, or if not, what went wrong. In order to improve trust and ensure appropriate reliance on these systems, users must be afforded increased transparency, enabling an understanding of the system's reasoning, and an explanation of its predictions or classifications. Here we discuss the salient factors in designing transparent intelligent systems using machine learning, and present the results of a user-centered design study.
We propose design guidelines derived from our study, and discuss next steps for designing for intelligent system transparency.

Associating persistent self-reported cognitive decline with neurocognitive decline in older breast cancer survivors using machine learning: The Thinking and Living with Cancer study
(Elsevier, 2022-11) Van Dyk, Kathleen; Ahn, Jaeil; Zhou, Xingtao; Zhai, Wanting; Ahles, Tim A.; Bethea, Traci N.; Carroll, Judith E.; Cohen, Harvey Jay; Dilawari, Asma A.; Graham, Deena; Jacobsen, Paul B.; Jim, Heather; McDonald, Brenna C.; Nakamura, Zev M.; Patel, Sunita K.; Rentscher, Kelly E.; Saykin, Andrew J.; Small, Brent J.; Mandelblatt, Jeanne S.; Root, James C.; Radiology and Imaging Sciences, School of Medicine
Introduction: Many cancer survivors report cognitive problems following diagnosis and treatment. However, the clinical significance of patient-reported cognitive symptoms early in survivorship can be unclear. We used a machine learning approach to determine the association of persistent self-reported cognitive symptoms two years after diagnosis and neurocognitive test performance in a prospective cohort of older breast cancer survivors. Materials and Methods: We enrolled breast cancer survivors with non-metastatic disease (n=435) and age- and education-matched non-cancer controls (n=441) between August 2010 and December 2017 and followed until January 2020; we excluded women with neurological disease and all women passed a cognitive screen at enrollment. Women completed the FACT-Cog Perceived Cognitive Impairment (PCI) scale and neurocognitive tests of attention, processing speed, executive function, learning, memory and visuospatial ability, and timed activities of daily living assessments at enrollment (pre-systemic treatment) and annually to 24 months, for a total of 59 individual neurocognitive measures.
We defined persistent self-reported cognitive decline as clinically meaningful decline (3.7+ points) on the PCI scale from enrollment to twelve months with persistence to 24 months. Analysis used four machine learning models based on data for change scores (baseline to twelve months) on the 59 neurocognitive measures and measures of depression, anxiety, and fatigue to determine a set of variables that distinguished the 24-month persistent cognitive decline group from non-cancer controls or from survivors without decline. Results: The sample of survivors and controls ranged in age from 60 to 89. Thirty-three percent of survivors had self-reported cognitive decline at twelve months and two-thirds continued to have persistent decline to 24 months (n=60). Least Absolute Shrinkage and Selection Operator (LASSO) models distinguished survivors with persistent self-reported declines from controls (AUC=0.736) and survivors without decline (n=147; AUC=0.744). The variables that separated groups were predominantly neurocognitive test performance change scores, including declines in list learning, verbal fluency, and attention measures. Discussion: Machine learning may be useful to further our understanding of cancer-related cognitive decline. Our results suggest that persistent self-reported cognitive problems among older women with breast cancer are associated with a constellation of mild neurocognitive changes warranting clinical attention.

Automated Assessment of Psychiatric Patients Using Medical Notes
(2022-12) Wang, Shuo; Miled, Zina Ben; King, Brain; Lee, John
Psychiatric patients require continuous monitoring on par with their severity status. Unfortunately, current assessment instruments are often time-consuming. The present thesis introduces several passive digital markers (PDMs) that can help reduce this burden by automating the assessment using medical notes.
The methodology leverages medical notes already annotated according to the General Assessment of Functioning (GAF) scale to develop a disease severity PDM for schizophrenia, bipolar type I or mixed bipolar and non-psychotic patients. Topic words that are representative of three disease severity levels (severe impairment, serious impairment, moderate to no impairment) are identified and the top 50 words from each severity level are used to summarize the raw text of the medical notes. The summary of the text is processed by a classifier that generates a disease severity level. Two classifiers are considered: BERT PDM and Clinical BERT PDM. The evaluation of these classifiers showed that the BERT PDM delivered the best performance. The PDMs developed using the BERT PDM can assign medical notes from each encounter to a severe impairment level with a positive predictive value higher than 0.84. These PDMs are generalizable and their development was facilitated by the availability of a substantial number of medical notes from multiple institutions that were annotated by several health care providers. The methodology introduced in the present thesis can support the automated monitoring of the progression of the disease severity for psychiatric patients by digitally processing the medical note produced at each encounter without additional burden on the health care system. Applying the same methodology to other diseases is possible subject to availability of the necessary data.