Browsing by Author "Jones, Josette"
Item
An answer recommendation framework for an online cancer community forum (Springer Nature, 2023-05-15)
Athira, B.; Idicula, Sumam Mary; Jones, Josette; Kulanthaivel, Anand; BioHealth Informatics, School of Informatics and Computing
Health community forums are online platforms for discussing matters related to the management of illness. People are increasingly searching for answers online, particularly when they are diagnosed with life-threatening diseases such as cancer. People seek suggestions or advice through these platforms to make decisions during their treatments. However, locating the correct information or similar people is often a great challenge for them. In this scenario, this paper proposes an answer recommendation system for an online breast cancer community forum that provides guidance and valuable references to users while they make decisions. The answer is a summary of a topic already discussed in the forum, so that users do not need to go through all the answer posts, which span multiple pages, or initiate a thread once again. The answer recommendation system has three phases: a query similarity model that retrieves similar past queries, query-answer pair generation, and answer recommendation. The query similarity model is implemented as a Siamese network with a Bi-LSTM architecture, which achieves an F1-score of 85.5%. The paper also shows that transfer learning helps the model generalize well on our breast cancer query-query pair data set. The query-answer pairs are generated by an extractive summarization technique that is based on an optimization algorithm.
The effectiveness of the generated summary is evaluated against a manually generated summary, and the result shows a ROUGE-1 score of 49%.

Item
Analysis of Co-Indicators and Counter-Indicators Among Patients Using Coding Algorithms: Learning Phenotype study
Reddy, Nagarjuna; Jones, Josette; Kanakasabai, Saravanan; Klapper, Gregory
Chronic complications associated with diabetes are responsible for increased mortality and morbidity rates. The main aim of the project is to analyze the co-indicators and counter-indicators among the patients by mapping the conditions to ICD codes and developing an algorithm. A strong positive correlation is identified with respect to BMI, poverty, education, age, and T2DM cohorts and its comorbidities.

Item
Analyzing Chlamydia and Gonorrhea Health Disparities from Health Information Systems: A Closer Examination Using Spatial Statistics and Geographical Information Systems (2022-05)
Lai, Patrick T. S.; Jones, Josette; Dixon, Brian E.; Wilson, Jeffrey; Wu, Huanmei; Shih, Patrick
The emergence and development of electronic health records have contributed to an abundance of patient data that can be used and analyzed to promote health outcomes and even eliminate health disparities. However, the data received present challenges such as inconsistencies, accuracy issues, and unstructured formatting. Furthermore, current electronic health records and clinical information systems do not contain the social determinants of health that may enhance our understanding of the characteristics and mechanisms of disease risk and transmission as well as health disparities research. Linkage to external population health databases to incorporate these social determinants of health is often necessary.
This study provides an opportunity to identify and analyze health disparities using geographical information systems for two important sexually transmitted diseases, chlamydia and gonorrhea, using Marion County, Indiana as the geographical location of interest. Population health data from the Social Assets and Vulnerabilities Indicators community information system and electronic health record data from the Indiana Network for Patient Care will be merged to measure the distribution and variability of greatest chlamydia and gonorrhea risk and to determine where the greatest areas of health disparities exist. A series of both statistical and spatial statistical methods such as a longitudinal measurement of health disparity through the Gini index, a hot-spot and cluster analysis, and a geographically weighted regression will be conducted in this study. The outcome and broader impact of this research will contribute to enhanced surveillance and more effective strategies for identifying the level of health disparities for sexually transmitted diseases in vulnerable localities and high-risk communities. Additionally, the findings from this study will lead to improved standardization and accuracy in data collection to facilitate subsequent studies involving multiple disparate data sources. Finally, this study will likely introduce ideas for potential social determinants of health to be incorporated into electronic health records and clinical information systems.

Item
Annotating and Detecting Topics in Social Media Forum and Modelling the Annotation to Derive Directions-A Case Study (Research Square, 2021)
B., Athira; Jones, Josette; Idicula, Sumam Mary; Kulanthaivel, Anand; Zhang, Enming; BioHealth Informatics, School of Informatics and Computing
The widespread influence of social media impacts every aspect of life, including the healthcare sector.
Although medics and health professionals are the final decision makers, the advice and recommendations obtained from fellow patients are significant. In this context, the present paper explores the topics of discussion posted by breast cancer patients and survivors on online forums. The study examines an online forum, Breastcancer.org, maps the discussion entries to several topics, and proposes a machine learning model based on a classification algorithm to characterize the topics. To explore the topics of breast cancer patients and survivors, approximately 1000 posts are selected and manually labeled with annotations. In contrast, millions of posts are available to build the labels. A semi-supervised learning technique is used to build the labels for the unlabeled data; hence, the large data are classified using a deep learning algorithm. The deep learning algorithm BiLSTM with the BERT word embedding technique provided a better F1-score of 79.5%. This method is able to classify the following topics: medication reviews, clinician knowledge, various treatment options, seeking and providing support, diagnostic procedures, financial issues and implications for everyday life. What matters the most for the patients is coping with everyday living as well as seeking and providing emotional and informational support. The approach and findings show the potential of studying social media to provide insight into patients' experiences with critical health problems such as cancer.

Item
Assessment of Parkinson's Disease Progression by Feature Relevance Analysis and Regression Techniques Using Machine Learning Algorithms
Gullapelli, Rakesh; Jones, Josette; Lai, Patrick T. S.
Remote patient tracking has been gaining increased attention due to its low-cost non-invasive methods.
The Unified Parkinson's Disease Rating Scale (UPDRS) is often used to track Parkinson's Disease (PD) symptoms, but it requires clinic visits and time-consuming medical tests that may not be feasible for most elderly PD patients. One of the major challenges in predicting PD in its early stages is that PD symptoms overlap with the symptoms of other diseases such as multiple sclerosis and Alzheimer's disease. Moreover, most of the current methods used for tracking PD rely on expert clinical raters, whose assessment of PD symptoms may be difficult because of inter-individual variability. Predicting relevant features using machine learning algorithms is helpful in providing the scientific decision-making classification rules necessary to assess the disease progression in early stages.

Item
Association between Urinary Phytoestrogens and C-reactive Protein in the Continuous National Health and Nutrition Examination Survey (Taylor & Francis, 2017)
Reger, Michael K.; Zollinger, Terrell W.; Liu, Ziyue; Jones, Josette; Zhang, Jianjun; Epidemiology, School of Public Health
Objective: A reduced risk of some cancers and cardiovascular disease associated with phytoestrogen intake may be mediated through its effect on serum C-reactive protein (CRP; an inflammation biomarker). Therefore, this study examined the associations between urinary phytoestrogens and serum CRP. Methods: Urinary phytoestrogen and serum CRP data obtained from 6009 participants aged ≥ 40 years in the continuous National Health and Nutrition Examination Survey during 1999–2010 were analyzed. Results: After adjustment for confounders, urinary concentrations of total and all individual phytoestrogens were inversely associated with serum concentrations of CRP (all p < 0.004).
The largest reductions in serum CRP (mg/L) per interquartile range increase in urinary phytoestrogens (ng/mL) were observed for total phytoestrogens (β = −0.18; 95% confidence interval [CI], −0.22, −0.15), total lignan (β = −0.15; 95% CI, −0.18, −0.12), and enterolactone (β = −0.15; 95% CI, −0.19, −0.12). A decreased risk of having high CRP concentrations (≥3.0 mg/L) for quartile 4 vs quartile 1 was also found for total phytoestrogens (OR = 0.63; 95% CI, 0.53, 0.73), total lignan (OR = 0.64; 95% CI, 0.54, 0.75), and enterolactone (OR = 0.59; 95% CI, 0.51, 0.69). Conclusion: Urinary total and individual phytoestrogens were significantly inversely associated with serum CRP in a nationally representative sample of the U.S. population.

Item
Bioinformatics and Pharmacogenomics in Drug Discovery and Development
Anyanwu, Chukwuma Eustace; Jones, Josette
Objective: Literature review to evaluate the extent to which Bioinformatics has facilitated the drug discovery and development process from an economic perspective. Problem: A plethora of genomic and proteomic information was uncovered by the U.S. Human Genome Project (HGP). Despite the impact that Bioinformatics and Pharmacogenomics were projected to have in the drug discovery and development process, the challenges facing pharmaceutical companies in this regard still persist. Design: An extensive integrated literature review of library resources such as MEDLINE, ERIC, PsychInfo, EconLit, Social Services Abstracts, ABI/INFORM and LISA (all 1990 – Present). These electronic databases were researched because of their focuses on the healthcare sector, medical and scientific innovations, economic modeling and analysis, bioinformatics and computational biology, applied social research and technology applications. Semi-structured interviews of Bioinformatics professionals were also conducted to complement the literature review.
Internet-based databases from reliable resources were also researched, resulting in serendipitous discoveries. Sample: Published English language reports of studies and research carried out worldwide from 1990 to 2004, relating to drug discovery and development. Selection criteria: Primary focus was on research publications and journals that identify and discuss the practice of Bioinformatics, especially in the area of drug discovery and development. Premium was placed on articles and publications that discussed the economic impacts of Bioinformatics in the drug discovery process. Results: Though the goals of Bioinformatics have been clearly defined, and the discipline is widely practiced in the pharmaceutical industry, this study has not found any definite attempts to evaluate its economic and regulatory impact specifically in facilitating the drug discovery and development process, and the delivery of personalized drugs. Discussion: Bioinformatics and Pharmacogenomics are the new facets of the ever-evolving drug discovery and development process. It may still be a while before their full impact and potential is attained.

Item
Bioinformatics and Pharmacogenomics in Drug Discovery and Development- a Socio-economic Perspective (2006-07-26)
Anyanwu, Chukwuma Eustace; Jones, Josette
A plethora of genomic and proteomic information was uncovered by the U.S. Human Genome Project (HGP), mostly by means of bioinformatics tools and techniques. Despite the impact that bioinformatics and pharmacogenomics were projected to have in the drug discovery and development process, the challenges facing the pharmaceutical industry, such as the high cost and the slow pace of drug development, appear to persist.
Socio-economic barriers exist that hinder the full integration of bioinformatics and pharmacogenomics into the drug discovery and development process, hence limiting the desired and expected effects.

Item
Biomedical Literature Mining and Knowledge Discovery of Phenotyping Definitions (2019-07)
Binkheder, Samar Hussein; Jones, Josette; Li, Lang; Quinney, Sara Kay; Wu, Huanmei; Zhang, Chi
Phenotyping definitions are essential in cohort identification when conducting clinical research, but they become an obstacle when they are not readily available. Developing new definitions manually requires expert involvement that is labor-intensive, time-consuming, and unscalable. Moreover, automated approaches rely mostly on electronic health records’ data that suffer from bias, confounding, and incompleteness. Limited efforts have been made to utilize text-mining and data-driven approaches to automate extraction and literature-based knowledge discovery of phenotyping definitions and to support their scalability. In this dissertation, we proposed a text-mining pipeline combining rule-based and machine-learning methods to automate retrieval, classification, and extraction of phenotyping definitions’ information from literature. To achieve this, we first developed an annotation guideline with ten dimensions to annotate sentences with evidence of phenotyping definitions' modalities, such as phenotypes and laboratories. Two annotators manually annotated a corpus of sentences (n=3,971) extracted from full-text observational studies’ methods sections (n=86). Percent and Kappa statistics showed high inter-annotator agreement on sentence-level annotations. Second, we constructed two validated text classifiers using our annotated corpora: abstract-level and full-text sentence-level. We applied the abstract-level classifier on a large-scale biomedical literature of over 20 million abstracts published between 1975 and 2018 to classify positive abstracts (n=459,406).
After retrieving their full-texts (n=120,868), we extracted sentences from their methods sections and used the full-text sentence-level classifier to extract positive sentences (n=2,745,416). Third, we performed a literature-based discovery utilizing the positively classified sentences. Lexica-based methods were used to recognize medical concepts in these sentences (n=19,423). Co-occurrence and association methods were used to identify and rank phenotype candidates that are associated with a phenotype of interest. We derived 12,616,465 associations from our large-scale corpus. Our literature-based associations and large-scale corpus contribute to building new data-driven phenotyping definitions and expanding existing definitions with minimal expert involvement.

Item
Bridging The Gap Between Healthcare Providers and Consumers: Extracting Features from Online Health Forum to Meet Social Needs of Patients using Network Analysis and Embedding (2020-08)
Mokashi, Maitreyi; Chakraborty, Sunandan; Jones, Josette; Zheng, Jiaping
Chronic disease patients have to face many issues during and after their treatment. Many of these issues are personal, professional, or social in nature. These issues may be overlooked by the respective healthcare providers and become major obstacles in the patient’s day-to-day life and their disease management. We extract data from an online health platform that serves as a ‘safe haven’ for patients and survivors to discuss help and coping issues. This thesis presents a novel approach that acts as the first step to include the social issues discussed by patients on online health forums, which the healthcare providers need to consider in order to create holistic treatment plans. There are numerous online forums where patients share their experiences and post questions about their treatments and their subsequent side effects. We collected data from an “Online Breast Cancer Forum”.
On this forum, users (patients) have created threads across many related topics and shared their experiences and questions. We connect the patients (users) with the topics in which they have posted by converting the data into a bipartite network and turning the network nodes into a high-dimensional feature space. From this feature space, we perform community detection on the node embeddings to unearth latent connections between patients and topics. We claim that these latent connections, along with the existing ones, will help to create a new knowledge base that will eventually help the healthcare providers to understand and acknowledge the non-medical issues related to a treatment, and create more adaptive and personalized plans. We performed both qualitative and quantitative analysis on the obtained embeddings to prove the superior quality of our approach and its potential to extract more information when compared to other models.