Browsing by Author "He, Zhe"
Now showing 1 - 6 of 6
Annotation and Information Extraction of Consumer-Friendly Health Articles for Enhancing Laboratory Test Reporting (American Medical Informatics Association, 2024-01-11)
Authors: He, Zhe; Tian, Shubo; Erdengasileng, Arslan; Hanna, Karim; Gong, Yang; Zhang, Zhan; Luo, Xiao; Lustria, Mia Liza A.
Affiliation: Engineering Technology, Purdue School of Engineering and Technology
Abstract: Viewing laboratory test results is patients' most frequent activity when accessing patient portals, but lab results can be very confusing for patients. Previous research has explored various ways to present lab results, but few studies have attempted to provide tailored information support based on an individual patient's medical context. In this study, we collected and annotated interpretations of textual lab results in 251 health articles about laboratory tests from AHealthyMe.com. We then evaluated transformer-based language models, including BioBERT, ClinicalBERT, RoBERTa, and PubMedBERT, for recognizing key terms and their types. Using BioPortal's term search API, we mapped the annotated terms to concepts in major controlled terminologies. Results showed that PubMedBERT achieved the best F1 score under both strict and lenient matching criteria. SNOMED CT had the best coverage of the terms, followed by LOINC and ICD-10-CM. This work lays the foundation for enhancing the presentation of lab results in patient portals by providing patients with contextualized interpretations of their lab results and individualized question prompts that they can, in turn, refer to during physician consults.

Attention Mechanism with BERT for Content Annotation and Categorization of Pregnancy-Related Questions on a Community Q&A Site (IEEE, 2020-12)
Authors: Luo, Xiao; Ding, Haoran; Tang, Matthew; Gandhi, Priyanka; Zhang, Zhan; He, Zhe
Affiliation: Engineering Technology, School of Engineering and Technology
Abstract: In recent years, the social web has been increasingly used for health information seeking, sharing, and subsequent health-related research.
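The strict and lenient matching criteria used to score the NER models in the lab-report annotation study can be illustrated with a small sketch. The entity spans and labels below are hypothetical, not taken from the paper: strict matching requires identical span boundaries and label, while lenient matching accepts any span overlap with a matching label.

```python
# Sketch of strict vs lenient span matching for NER evaluation.
# Entities are (start, end, label) character spans; all data is hypothetical.

def f1(tp, fp, fn):
    """Micro F1 from true-positive, false-positive, and false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def overlaps(a, b):
    """Lenient criterion: labels agree and the spans overlap at all."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]

def evaluate(gold, pred, strict=True):
    """Score predicted entities against gold; exact tuples when strict."""
    match = (lambda g, p: g == p) if strict else overlaps
    tp = sum(any(match(g, p) for g in gold) for p in pred)
    fp = len(pred) - tp
    fn = sum(not any(match(g, p) for p in pred) for g in gold)
    return f1(tp, fp, fn)

gold = [(0, 7, "TEST"), (12, 20, "COND")]
pred = [(0, 7, "TEST"), (13, 20, "COND")]   # second span off by one character
print(evaluate(gold, pred, strict=True))    # 0.5: only the exact span counts
print(evaluate(gold, pred, strict=False))   # 1.0: the overlapping span counts
```

The gap between the two scores shows why papers report both: lenient F1 credits models that find the right term with slightly wrong boundaries.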
Women often use the Internet or social networking sites to seek information related to pregnancy at different stages. They may ask questions about birth control, trying to conceive, labor, or taking care of a newborn or baby. Classifying different types of questions about pregnancy information (e.g., before, during, and after pregnancy) can inform the design of social media and professional websites for pregnancy education and support. This research investigates the attention mechanism built into, or added on top of, the BERT model for classifying and annotating pregnancy-related questions posted on a community Q&A site. We evaluated two BERT-based models and compared them against traditional machine learning models for question classification. Most importantly, we investigated two attention mechanisms: the built-in self-attention mechanism of BERT and an additional attention layer on top of BERT for relevant term annotation. The classification results showed that the BERT-based models outperformed the traditional models, and that BERT with an additional attention layer achieved higher overall precision than the basic BERT model. The results also showed that the two attention mechanisms annotate relevant content differently, and that both could serve as feature selection methods for text mining in general.

Biostatistics and Health Data Science, School of Medicine (JMIR, 2021-11-25)
Authors: Zhang, Zhan; Kmoth, Lukas; Luo, Xiao; He, Zhe
Affiliation: Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
Abstract: Background: Personal clinical data, such as laboratory test results, are increasingly being made available to patients via patient portals. However, laboratory test results are presented in a way that is difficult for patients to interpret and use. Furthermore, the indications of laboratory test results may vary among patients with different characteristics and in different medical contexts.
To date, little is known about how to design patient-centered technology to facilitate the interpretation of laboratory test results. Objective: The aim of this study is to explore design considerations for supporting patient-centered communication and comprehension of laboratory test results, as well as discussions between patients and health care providers. Methods: We conducted user-centered, multicomponent design research consisting of user studies, iterative prototype design, and pilot user evaluations to explore design concepts and considerations useful for supporting patients in not only viewing but also interpreting and acting upon laboratory test results. Results: The user study results informed the iterative design of a system prototype with several interactive features: using graphical representations and clear takeaway messages to convey the concerning nature of the results; enabling users to annotate laboratory test reports; clarifying medical jargon in nontechnical verbiage and allowing users to interact with the medical terms (e.g., saving, favoriting, or sorting); and providing pertinent and reliable information to help patients comprehend test results within their medical context. A pilot user evaluation with 8 patients showed that the new patient-facing system was perceived as useful in not only presenting laboratory test results to patients in a meaningful way but also facilitating in situ patient-provider interactions.
Conclusions: We draw on our findings to discuss design implications for supporting patient-centered communication of laboratory test results and for making technology support informative, trustworthy, and empathetic.

How the clinical research community responded to the COVID-19 pandemic: an analysis of the COVID-19 clinical studies in ClinicalTrials.gov (AMIA, 2021-04-01)
Authors: He, Zhe; Erdengasileng, Arslan; Luo, Xiao; Xing, Aiwen; Charness, Neil; Bian, Jiang
Affiliation: Computer Information and Graphics Technology, School of Engineering and Technology
Abstract: In the past few months, a large number of clinical studies on the novel coronavirus disease (COVID-19) have been initiated worldwide to find effective therapeutics, vaccines, and preventive strategies for COVID-19. In this study, we aim to understand the landscape of COVID-19 clinical research and identify issues that may cause recruitment difficulty or reduce study generalizability. We analyzed 3765 COVID-19 studies registered in the largest public registry, ClinicalTrials.gov, leveraging natural language processing (NLP) and using descriptive, association, and clustering analyses. We first characterized COVID-19 studies by study features such as phase and tested intervention. We then took a deep dive into their eligibility criteria to understand whether these studies: (1) considered the reported underlying health conditions that may lead to severe illness, and (2) excluded older adults, either explicitly or implicitly, which may reduce the generalizability of these studies to the older adult population. Our analysis included 2295 interventional studies and 1470 observational studies. Most trials did not explicitly exclude older adults with common chronic conditions. However, known risk factors such as diabetes and hypertension were considered by less than 5% of trials, based on their trial descriptions.
Pregnant women were excluded by 34.9% of the studies. Most COVID-19 clinical studies included both genders and older adults. However, risk factors such as diabetes, hypertension, and pregnancy were under-represented, likely skewing the population that was sampled. A careful examination of existing COVID-19 studies can inform future COVID-19 trial design toward balanced internal validity and generalizability.

Pregnancy-Related Information Seeking in Online Health Communities: A Qualitative Study (Springer, 2021)
Authors: Lu, Yu; Zhang, Zhan; Min, Katherine; Luo, Xiao; He, Zhe
Affiliation: Engineering Technology, School of Engineering and Technology
Abstract: Pregnancy often imposes risks on women's health. Consumers are increasingly turning to online resources (e.g., online health communities) to look for pregnancy-related information for better care management. To inform design opportunities for online support interventions, it is critical to thoroughly understand consumers' information needs throughout the entire course of pregnancy, including three main stages: pre-pregnancy, during pregnancy, and postpartum. In this study, we present a content analysis of pregnancy-related question posts on Yahoo! Answers to examine how information seekers formulated their inquiries and the types of replies they received. This analysis revealed 14 main types of information needs, most of which were "stage-based". We also found that peers in online health communities provided a variety of support, including affirmation of pregnancy, opinions or suggestions, health information, personal experience, and references to health providers' services.
We draw on these findings to discuss design opportunities for tailoring informatics interventions to support consumers' information needs at different pregnancy stages.

Zero-shot Learning with Minimum Instruction to Extract Social Determinants and Family History from Clinical Notes using GPT Model (IEEE, 2023)
Authors: Bhate, Neel Jitesh; Mittal, Ansh; He, Zhe; Luo, Xiao
Affiliation: Computer Science, Luddy School of Informatics, Computing, and Engineering
Abstract: Demographics, social determinants of health, and family history documented in the unstructured text of electronic health records are increasingly being studied to understand how this information can be combined with structured data to improve healthcare outcomes. Since the release of the GPT models, many studies have applied them to extract this information from narrative clinical notes. Unlike existing work, our research investigates zero-shot learning for extracting this information jointly while providing minimal instruction to the GPT model. We use de-identified real-world clinical notes annotated for demographics, various social determinants, and family history information. Given that the GPT model might return text that differs from the text in the original notes, we explore two sets of evaluation metrics, traditional NER evaluation metrics and semantic similarity evaluation metrics, to fully characterize performance. Our results show that the GPT-3.5 method achieved an average F1 of 0.975 on demographics extraction, 0.615 on social determinants extraction, and 0.722 on family history extraction. We believe these results can be further improved through model fine-tuning or few-shot learning. Through case studies, we also identified limitations of the GPT models that need to be addressed in future research.
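The two families of metrics described in the last abstract, exact NER-style matching versus softer semantic similarity scoring, can be sketched as follows. The field names, gold values, and hard-coded "model output" are hypothetical, and difflib's character-level ratio merely stands in for whatever semantic similarity measure the study actually used:

```python
# Compare extracted attribute values against gold annotations two ways:
# exact string match (NER-style) and a soft similarity score that credits
# near-miss extractions. All data is hypothetical; difflib's ratio is a
# stand-in for a real semantic similarity metric (e.g., embedding cosine).
from difflib import SequenceMatcher

gold = {"marital_status": "married", "tobacco_use": "denies smoking"}
extracted = {"marital_status": "married", "tobacco_use": "denies tobacco use"}

def exact_score(gold, pred):
    """Fraction of fields whose extracted value equals the gold value verbatim."""
    return sum(gold[k] == pred.get(k) for k in gold) / len(gold)

def soft_score(gold, pred):
    """Average character-level similarity across fields."""
    sims = [SequenceMatcher(None, gold[k], pred.get(k, "")).ratio() for k in gold]
    return sum(sims) / len(sims)

print(exact_score(gold, extracted))       # 0.5: one field differs verbatim
print(soft_score(gold, extracted) > 0.5)  # soft score credits partial overlap
```

This illustrates why both metric families matter for generative extraction: a paraphrased but correct answer scores zero under exact matching yet high under similarity scoring.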