Department of Computer Science Works

Recent Submissions

  • Item
    Classification of Alzheimer’s Disease Leveraging Multi-task Machine Learning Analysis of Speech and Eye-Movement Data
    (Frontiers Media, 2021-09-20) Jang, Hyeju; Soroski, Thomas; Rizzo, Matteo; Barral, Oswald; Harisinghani, Anuj; Newton-Mason, Sally; Granby, Saffrin; da Cunha Vasco, Thiago Monnerat Stutz; Lewis, Caitlin; Tutt, Pavan; Carenini, Giuseppe; Conati, Cristina; Field, Thalia S.; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Alzheimer’s disease (AD) is a progressive neurodegenerative condition that results in impaired performance in multiple cognitive domains. Preclinical changes in eye movements and language can occur with the disease, and progress alongside worsening cognition. In this article, we present the results from a machine learning analysis of a novel multimodal dataset for AD classification. The cohort includes data from two novel tasks not previously assessed in classification models for AD (pupil fixation and description of a pleasant past experience), as well as two established tasks (picture description and paragraph reading). Our dataset includes language and eye movement data from 79 memory clinic patients with diagnoses of mild-moderate AD, mild cognitive impairment (MCI), or subjective memory complaints (SMC), and 83 older adult controls. The analysis of the individual novel tasks showed similar classification accuracy when compared to established tasks, demonstrating their discriminative ability for memory clinic patients. Fusing the multimodal data across tasks yielded the highest overall AUC of 0.83 ± 0.01, indicating that the data from novel tasks are complementary to established tasks.
  • Item
    T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
    (ACL Anthology, 2021) Li, Raymond; Xiao, Wen; Wang, Lanjun; Jang, Hyeju; Carenini, Giuseppe; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Transformers are the dominant architecture in NLP, but their training and fine-tuning is still very challenging. In this paper, we present the design and implementation of a visual analytic framework for assisting researchers in this process, by providing them with valuable insights about the model’s intrinsic properties and behaviours. Our framework offers an intuitive overview that allows the user to explore different facets of the model (e.g., hidden states, attention) through interactive visualization, and provides a suite of built-in algorithms that compute the importance of model components and different parts of the input sequence. Case studies and feedback from a user focus group indicate that the framework is useful, and suggest several improvements. Our framework is available at: https://github.com/raymondzmc/T3-Vis.
  • Item
    Zero-shot Learning with Minimum Instruction to Extract Social Determinants and Family History from Clinical Notes using GPT Model
    (IEEE, 2023) Bhate, Neel Jitesh; Mittal, Ansh; He, Zhe; Luo, Xiao; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Demographics, social determinants of health, and family history documented in the unstructured text within electronic health records are increasingly being studied to understand how this information can be utilized with the structured data to improve healthcare outcomes. Since the GPT models were released, many studies have applied them to extract this information from narrative clinical notes. Unlike existing work, our research focuses on investigating zero-shot learning for extracting this information together by providing minimum information to the GPT model. We utilize de-identified real-world clinical notes annotated for demographics, various social determinants, and family history information. Given that the GPT model might provide text different from the text in the original data, we explore two sets of evaluation metrics, the traditional NER evaluation metrics and semantic similarity evaluation metrics, to completely understand the performance. Our results show that the GPT-3.5 method achieved an average of 0.975 F1 on demographics extraction, 0.615 F1 on social determinants extraction, and 0.722 F1 on family history extraction. We believe these results can be further improved through model fine-tuning or few-shot learning. Through the case studies, we also identified the limitations of the GPT models, which need to be addressed in future research.
  • Item
    Identification of predictive patient characteristics for assessing the probability of COVID-19 in-hospital mortality
    (Public Library of Science, 2024) Rajwa, Bartek; Naved, Md Mobasshir Arshed; Adibuzzaman, Mohammad; Grama, Ananth Y.; Khan, Babar A.; Dundar, M. Murat; Rochet, Jean-Christophe; Computer Science, Luddy School of Informatics, Computing, and Engineering
    As the world emerges from the COVID-19 pandemic, there is an urgent need to understand patient factors that may be used to predict the occurrence of severe cases and patient mortality. Approximately 20% of SARS-CoV-2 infections lead to acute respiratory distress syndrome caused by the harmful actions of inflammatory mediators. Patients with severe COVID-19 are often afflicted with neurologic symptoms, and individuals with pre-existing neurodegenerative disease have an increased risk of severe COVID-19. Although collectively, these observations point to a bidirectional relationship between severe COVID-19 and neurologic disorders, little is known about the underlying mechanisms. Here, we analyzed the electronic health records of 471 patients with severe COVID-19 to identify clinical characteristics most predictive of mortality. Feature discovery was conducted by training a regularized logistic regression classifier that serves as a machine-learning model with an embedded feature selection capability. SHAP analysis using the trained classifier revealed that a small ensemble of readily observable clinical features, including characteristics associated with cognitive impairment, could predict in-hospital mortality with an accuracy greater than 0.85 (expressed as the area under the ROC curve of the classifier). These findings have important implications for the prioritization of clinical measures used to identify patients with COVID-19 (and, potentially, other forms of acute respiratory distress syndrome) having an elevated risk of death.
  • Item
    Geometrically Matched Multi-source Microscopic Image Synthesis Using Bidirectional Adversarial Networks
    (Springer, 2022) Zhuang, Jun; Wang, Dali; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Microscopic images from multiple modalities can produce plentiful experimental information. In practice, biological or physical constraints under a given observation period may prevent researchers from acquiring enough microscopic scans. Recent studies demonstrate that image synthesis is one of the popular approaches to relax such constraints. Nonetheless, most existing synthesis approaches only translate images from the source domain to the target domain without solid geometric associations. To address this challenge, we propose an innovative model architecture, BANIS, to synthesize diversified microscopic images from multi-source domains with distinct geometric features. The experimental outcomes indicate that BANIS successfully synthesizes favorable image pairs on C. elegans microscopy embryonic images. To the best of our knowledge, BANIS is the first application to synthesize microscopic images that associate distinct spatial geometric features from multi-source domains.
  • Item
    mmEat: Millimeter wave-enabled environment-invariant eating behavior monitoring
    (Elsevier, 2022-03) Xie, Yucheng; Jiang, Ruizhe; Guo, Xiaonan; Wang, Yan; Cheng, Jerry; Chen, Yingying; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Dietary habits are closely related to people’s health. An unhealthy diet can cause obesity, diabetes, and heart disease, and can increase the risk of cancer. It is necessary to have a monitoring system that helps people keep track of their eating behaviors. Traditional sensor-based and camera-based dietary monitoring systems either require users to wear dedicated devices or may incur privacy concerns. WiFi-based methods, though yielding reasonably robust performance in certain cases, have major limitations. The wireless signals usually carry substantial information that is specific to the environment where eating activities are performed. To overcome these limitations, we propose mmEat, a millimeter wave-enabled environment-invariant eating behavior monitoring system. In particular, we propose an environment impact mitigation method that analyzes mmWave signals in the Doppler-Range domain. To differentiate dietary activities with various utensils (i.e., eating with fork, fork and knife, spoon, chopsticks, bare hand) for fine-grained eating behavior monitoring, we construct a Spatial–Temporal Heatmap by integrating multiple dimensional measurements. We further utilize an unsupervised learning-based 2D segmentation method and an eating period derivation algorithm to estimate the time duration of each eating activity. Our system has the potential to infer the food categories and eating speed. Extensive experiments with over 1000 eating activities show that our system can achieve dietary activity recognition with an average accuracy of 97.5% and a false detection rate of 5%.
  • Item
    A framework for graph-base neural network using numerical simulation of metal powder bed fusion for correlating process parameters and defect generation
    (Elsevier, 2022) Akter Jahan, Suchana; Al Hasan, Mohammad; El-Mounayri, Hazim; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Powder bed fusion (PBF) is the most common technique used for metal additive manufacturing. This process involves consolidation of metal powder using a heat source such as a laser or electron beam. During the formation of three-dimensional (3D) objects by sintering metal powders layer by layer, many different thermal phenomena occur that can create defects or anomalies on the final printed part. Similar to other additive manufacturing techniques, PBF has been in practice for decades, yet it is still going through research and development endeavors, which are required to understand the physics behind this process. Defects and deformations highly impact the product quality and reliability of the overall manufacturing process; hence, it is essential that we understand the reason and mechanism of defect generation in the PBF process and take appropriate measures to rectify them. In this paper, we have attempted to study the effect of processing parameters (scanning speed, laser power) on the generation of defects in the PBF process using a graph-based artificial neural network that uses numerical simulation results as input or training data. Use of graph-based machine learning is novel in the area of manufacturing, let alone additive manufacturing or powder bed fusion. The outcome of this study provides an opportunity to design a feedback-controlled in-situ online monitoring system in powder bed fusion to reduce printing defects and optimize the manufacturing process.
  • Item
    Informative Causality Extraction from Medical Literature via Dependency-Tree-Based Patterns
    (Springer, 2022-05-25) Kabir, M. Ahsanul; Almulhim, AlJohara; Luo, Xiao; Al Hasan, Mohammad; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Extracting cause-effect entities from medical literature is an important task in medical information retrieval. A solution to this task can be used for compilation of various causality relations, such as causality between disease and symptoms, between medications and side effects, and between genes and diseases. Existing solutions for extracting cause-effect entities work well for sentences where the cause and the effect phrases are named entities, single-word nouns, or noun phrases consisting of two to three words. Unfortunately, in medical literature, cause and effect phrases in a sentence are not simply nouns or noun phrases; rather, they are complex phrases consisting of several words, and existing methods fail to correctly extract the cause and effect entities in such sentences. Partial extraction of cause and effect entities conveys poor-quality, non-informative, and often contradictory facts compared to those intended in the given sentence. In this work, we solve this problem by designing an unsupervised method for cause and effect phrase extraction, patterncausality, which is specifically suitable for the medical literature. Our proposed approach first uses a collection of cause-effect dependency patterns as templates to extract head words of cause and effect phrases, and then uses a novel phrase extraction method to obtain complete and meaningful cause and effect phrases from a sentence. Experiments on a cause-effect dataset built from sentences from PubMed articles show that for extracting cause and effect entities, patterncausality is substantially better than the existing methods, with an order of magnitude improvement in the F-score metric over the best of the existing methods. We also build different variants of patterncausality, which use different phrase extraction methods; all variants are better than the existing methods. patterncausality and its variants also show modest performance improvement over the existing methods for extracting cause and effect entities in a domain-neutral benchmark dataset, in which cause and effect entities are nouns or noun phrases consisting of one to two words.
  • Item
    The United States COVID-19 Forecast Hub dataset
    (Springer, 2022-08-01) Cramer, Estee Y.; Huang, Yuxin; Wang, Yijin; Ray, Evan L.; Cornell, Matthew; Bracher, Johannes; Brennen, Andrea; Rivadeneira, Alvaro J. Castro; Gerding, Aaron; House, Katie; Jayawardena, Dasuni; Kanji, Abdul Hannan; Khandelwal, Ayush; Le, Khoa; Mody, Vidhi; Mody, Vrushti; Niemi, Jarad; Stark, Ariane; Shah, Apurv; Wattanchit, Nutcha; Zorn, Martha W.; Reich, Nicholas G.; US COVID-19 Forecast Hub Consortium; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
  • Item
    An adaptive hybrid approach: Combining genetic algorithm and ant colony optimization for integrated process planning and scheduling
    (Emerald Insight, 2020) Uslu, Mehmet Fatih; Uslu, Süleyman; Bulut, Faruk; Computer Science, Luddy School of Informatics, Computing, and Engineering
    Optimization algorithms can differ in performance for a specific problem. Hybrid approaches, using this difference, might give a higher performance in many cases. This paper presents a hybrid approach of Genetic Algorithm (GA) and Ant Colony Optimization (ACO) specifically for Integrated Process Planning and Scheduling (IPPS) problems. GA and ACO have given different performances in different cases of IPPS problems. In some cases GA has outperformed ACO, and in other cases ACO has outperformed GA. This hybrid method can be constructed as (I) GA to improve ACO results or (II) ACO to improve GA results, based on the performance of the algorithm pair at the given problem scale. The proposed hybrid GA-ACO approach (hAG) runs both GA and ACO simultaneously, and the better performing one is selected as the primary algorithm in the hybrid approach. hAG also avoids convergence by resetting parameters that cause the algorithms to converge to local optimum points. Moreover, the algorithm can obtain more accurate solutions with this avoidance strategy. The new hybrid optimization technique (hAG) merges a GA with a local search strategy based on the interior point method. The efficiency of hAG is demonstrated by solving a constrained multi-objective mathematical test-case. The benchmarking results of the experimental studies with AIS (Artificial Immune System), GA, and ACO indicate that the proposed model has outperformed other non-hybrid algorithms in different scenarios.