Browsing by Subject "Databases"
Item Cancer reporting: timeliness analysis and process reengineering (2015-11-09)
Jabour, Abdulrahman M.; Jones, Josette; Dixon, Brian; Haggstrom, David; Davide, Bolchini
Introduction: Cancer registries collect tumor-related data to monitor incidence rates and support population-based research. A common concern with using population-based registry data for research is reporting timeliness. Data timeliness has been recognized as an important data characteristic by both the Centers for Disease Control and Prevention (CDC) and the Institute of Medicine (IOM). Yet few recent studies in the United States (U.S.) have systematically measured timeliness. The goal of this research is to evaluate the quality of cancer data and examine methods by which the reporting process can be improved. The study aims are to: (1) evaluate the timeliness of cancer case reporting at the Indiana State Department of Health (ISDH) Cancer Registry, (2) identify the perceived barriers and facilitators to timely reporting, and (3) reengineer the current reporting process to improve turnaround time. Method: For Aim 1, using the ISDH dataset from 2000 to 2009, we evaluated the reporting timeliness and the subtasks within the process cycle. For Aim 2, certified cancer registrars reporting for ISDH were invited to a semi-structured interview. The interviews were recorded and qualitatively analyzed. For Aim 3, we designed a reengineered workflow to reduce reporting time and tested it using simulation. Results: The results show variation in the mean reporting time, which ranged from 426 days in 2003 to 252 days in 2009. The barriers identified were categorized into six themes, and the most common barrier was accessing medical records at external facilities. We also found that cases reside for a few months in the local hospital database while waiting for treatment data to become available.
The recommended workflow focused on leveraging a health information exchange for data access and adding a notification system to inform registrars when new treatments are available.

Item Cardioinformatics Advancements in Healthcare and Biotechnology (American Heart Association, 2023)
Khomtchouk, Bohdan B.; Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering

Item End-User Needs of Fragmented Databases in Higher Education Data Analysis and Decision Making (2019-05)
Briggs, Amanda; Cafaro, Francesco; Dombrowski, Lynn; Reda, Khairi
In higher education, a wealth of data is available to advisors, recruiters, marketers, and program directors. However, data sources can be accessed in a variety of ways and often do not seem to represent the same data set, presenting users with the confounding notion that data sources are in conflict with one another. As users identify new ways of accessing and analyzing this data, they are modifying existing work practices and sometimes creating their own databases. To understand how users are navigating these databases, the researchers employed a mixed methods research design, including a survey and an interview, to identify the needs of end users who access these seemingly fragmented databases. The study resulted in three overarching categories (access, understandability, and use) that affect work practices for end users.
The researchers used these themes to develop a set of broadly applicable design recommendations as well as six sets of sketches for implementation: development of a data gateway, training, collaboration, tracking, definitions and roadblocks, and time management.

Item HAPPI-2: a Comprehensive and High-quality Map of Human Annotated and Predicted Protein Interactions (BioMed Central, 2017-02-17)
Chen, Jake Yue; Pandey, Ragini; Nguyen, Thanh M.; Department of Biohealth Informatics, School of Informatics and Computing
BACKGROUND: Human protein-protein interaction (PPI) data is essential to network and systems biology studies. PPI data can help biochemists hypothesize how proteins form complexes by binding to each other, how extracellular signals propagate through post-translational modification of de-activated signaling molecules, and how chemical reactions are coupled by enzymes involved in a complex biological process. Our capability to develop good public database resources for human PPI data has a direct impact on the quality of future research on genome biology and medicine. RESULTS: The database of Human Annotated and Predicted Protein Interactions (HAPPI) version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPIs) linked by 23,060 human proteins, making it the most comprehensive database covering human PPI data today. These PPIs contain both physical/direct interactions and high-quality functional/indirect interactions. Compared with the HAPPI 1.0 database release, HAPPI database version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides users with a friendly search, curation, and data retrieval interface, allowing them to retrieve human PPIs and available annotation information on the interaction type, interaction quality, interacting partner drug targeting data, and disease information.
The updated HAPPI-2 (http://discovery.informatics.uab.edu/HAPPI) can be freely accessed by academic users. CONCLUSIONS: While the underlying data for HAPPI-2 are integrated from diverse data sources, the new HAPPI-2 release represents a good balance between data coverage and data quality of human PPIs, making it ideally suited for network biology.

Item Improved Adverse Drug Event Prediction Through Information Component Guided Pharmacological Network Model (IC-PNM) (IEEE, 2021)
Ji, Xiangmin; Wang, Lei; Hua, Liyan; Wang, Xueying; Zhang, Pengyue; Shendre, Aditi; Feng, Weixing; Li, Jin; Li, Lang; Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
Improving adverse drug event (ADE) prediction is highly critical in pharmacovigilance research. We propose a novel information component guided pharmacological network model (IC-PNM) to predict drug-ADE signals. This new method combines the pharmacological network model and the information component, a Bayesian statistical method. We use 33,947 drug-ADE pairs from the FDA Adverse Event Reporting System (FAERS) 2010 data as the training data, and 21,065 new drug-ADE pairs from FAERS 2011-2015 as the validation samples. The IC-PNM data analysis suggests that both large and small sample size drug-ADE pairs are needed in training the predictive model for its prediction performance to reach an area under the receiver operating characteristic curve (AUROC) of 0.82. On the other hand, the IC-PNM prediction performance improved to AUROC = 0.91 if we removed the small sample size drug-ADE pairs from the prediction model during validation.

Item In Silico Target Prediction by Training Naive Bayesian Models on Chemogenomics Databases (2006-06-29T19:50:21Z)
Nidhi; Merchant, Mahesh
The completion of the Human Genome Project is seen as a gateway to the discovery of novel drug targets (Jacoby, Schuffenhauer, & Floersheim, 2003).
How much of this information is actually translated into knowledge, e.g., the discovery of novel drug targets, is yet to be seen. The traditional route of drug discovery has been from target to compound. Conventional research techniques focus on studying animal and cellular models, followed by the development of a chemical concept. Modern approaches that have evolved as a result of progress in molecular biology and genomics start out with molecular targets, which usually originate from the discovery of a new gene. Subsequent target validation to establish suitability as a drug target is followed by high throughput screening assays in order to identify new active chemical entities (Hofbauer, 1997). In contrast, chemogenomics takes the opposite approach to drug discovery (Jacoby, Schuffenhauer, & Floersheim, 2003). It puts chemical entities to the forefront as probes to study their effects on biological targets and then links these effects to the genetic pathways of these targets (Figure 1a). The goal of chemogenomics is to rapidly identify new drug molecules and drug targets by establishing chemical and biological connections. Just as classical genetic experiments are classified into forward and reverse, experimental chemogenomics methods can be distinguished as forward and reverse depending on the direction of the investigative process, i.e., from phenotype to target or from target to phenotype, respectively (Jacoby, Schuffenhauer, & Floersheim, 2003). The identification and characterization of protein targets are critical bottlenecks in forward chemogenomics experiments. Currently, methods such as affinity matrix purification (Taunton, Hassig, & Schreiber, 1996) and phage display (Sche, McKenzie, White, & Austin, 1999) are used to determine targets for compounds. None of the current techniques used for target identification after the initial screening is efficient.
In silico methods can provide complementary and efficient ways to predict targets by using chemogenomics databases to obtain information about chemical structures and target activities of compounds. Annotated chemogenomics databases integrate chemical and biological domains and can provide a powerful tool to predict and validate new targets for compounds with unknown effects (Figure 1b). A chemogenomics database contains both chemical properties and biological activities associated with a compound. The MDL Drug Data Report (MDDR) (Molecular Design Ltd., San Leandro, California) is one of the well-known and widely used databases that contain chemical structures and corresponding biological activities of drug-like compounds. The relevance and quality of information that can be derived from these databases depend on their annotation schemes as well as the methods that are used for mining this data. In recent years, chemists and biologists have used such databases to carry out similarity searches and look up biological activities for compounds that are similar to the probe molecules for a given assay. With the emergence of new chemogenomics databases that follow a well-structured and consistent annotation scheme, new automated target prediction methods are possible that can give insights into the biological world based on structural similarity between compounds. The usefulness of such databases lies not only in predicting targets, but also in establishing the genetic connections of the targets discovered as a consequence of the prediction. The ability to perform automated target prediction relies heavily on a synergy of very recent technologies, which include:
i) Highly structured and consistently annotated chemogenomics databases. Many such databases have surfaced very recently: WOMBAT (Sunset Molecular Discovery LLC, Santa Fe, New Mexico), KinaseChemBioBase (Jubilant Biosys Ltd., Bangalore, India), and StARLITe (Inpharmatica Ltd., London, UK), to name a few.
ii) Chemical descriptors (Xue & Bajorath, 2000) that capture the structure-activity relationship of the molecules as well as computational techniques (Kitchen, Stahura, & Bajorath, 2004) that are specifically tailored to extract information from these descriptors.
iii) Data pipelining environments that are fast, integrate multiple computational steps, and support large datasets.
A combination of all these technologies may be employed to bridge the gap between chemical and biological domains, which remains a challenge in the pharmaceutical industry.

Item Mapping the rules: conceptual and logical relationships in a system for pediatric clinical decision support (2013-10-07)
Ralston, Rick K.; Odell, Jere D.; Whipple, Elizabeth C.; Liu, Gilbert C.
The Child Health Improvement through Computer Automation (CHICA) system uses evidence-based guidelines and information collected in the clinic and stored in an electronic medical record (EMR) to inform physician and patient decision making. CHICA helps physicians to identify and select relevant screenings and also provides personalized, just-in-time information for patients. This system relies on a database of Medical Logic Modules (MLMs) written in the Arden Rules syntax. These MLMs store observations (StorObs) during the clinical encounter, which trigger potential screenings and preventive health interventions for discussion with the patient or for follow-up at the next visit. This poster shows how informationists worked with the CHICA team to describe the MLMs using standard vocabularies, including Medical Subject Headings (MeSH) and Logical Observation Identifiers Names and Codes (LOINC). After assigning keywords to the database of MLMs, the informationists used visualization tools to generate maps. These maps show how rules are related by logic (shared StorObs) and by concept (shared vocabulary).
The CHICA team will use these maps to identify gaps in the clinical decision support database and (if needed) to develop rules which bridge related but currently isolated concepts.

Item Predicting Opioid Prescriptions based on Patient Demographics in MIMIC-IV (IEEE Xplore, 2021-06)
Kodela, Snigdha; Pinnamraju, Jahnavi; Gichoya, Judy W.; Purkayastha, Saptarshi; Biohealth Informatics, School of Engineering and Technology
Opioids are widely used analgesics because of their efficacy, mild sedative and anxiolytic properties, and flexibility of administration through multiple routes. Understanding the demographics of the patients receiving these medications helps provide customized care for the susceptible group of people. We conducted a demographic evaluation of frequently prescribed opioid drugs in the MIMIC-IV database. We analyzed prescribing patterns of six commonly used opioids against demographics, predominantly age, gender, ethnicity, marital status, and year. After conducting exploratory data analysis, we built models using Logistic Regression, Random Forest, and XGBoost to predict opioid prescriptions from these demographics. We also analyzed the association between demographics and the frequency of prescribed medications for pain management. We found statistically significant differences in opioid prescriptions between male and female patients, between married and unmarried patients, and across ages and ethnic groups, as well as an association with in-hospital deaths.

Item The United States COVID-19 Forecast Hub dataset (Springer, 2022-08-01)
Cramer, Estee Y.; Huang, Yuxin; Wang, Yijin; Ray, Evan L.; Cornell, Matthew; Bracher, Johannes; Brennen, Andrea; Rivadeneira, Alvaro J. Castro; Gerding, Aaron; House, Katie; Jayawardena, Dasuni; Kanji, Abdul Hannan; Khandelwal, Ayush; Le, Khoa; Mody, Vidhi; Mody, Vrushti; Niemi, Jarad; Stark, Ariane; Shah, Apurv; Wattanchit, Nutcha; Zorn, Martha W.; Reich, Nicholas G.; US COVID-19 Forecast Hub Consortium; Computer Science, Luddy School of Informatics, Computing, and Engineering
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at the county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
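
The Forecast Hub abstract mentions that probabilistic forecasts from several teams can be combined into ensemble models. As a rough illustration only, the sketch below takes the median across models at each quantile level, which is one simple way such a combination might be done. The model names, quantile levels, and numbers are invented for this example and are not Forecast Hub data, and the function `median_ensemble` is a hypothetical helper, not part of any Forecast Hub package.

```python
from statistics import median

# Hypothetical toy forecasts: model name -> {quantile level: predicted incident deaths}.
# All values are invented for illustration; they are NOT taken from the Forecast Hub.
forecasts = {
    "model_a": {0.025: 80.0, 0.5: 120.0, 0.975: 190.0},
    "model_b": {0.025: 95.0, 0.5: 135.0, 0.975: 210.0},
    "model_c": {0.025: 70.0, 0.5: 110.0, 0.975: 160.0},
}


def median_ensemble(model_forecasts):
    """Combine forecasts by taking the median across models at each quantile level.

    Only quantile levels reported by every model are included in the result.
    """
    shared_levels = sorted(
        set.intersection(*(set(f) for f in model_forecasts.values()))
    )
    return {
        level: median(f[level] for f in model_forecasts.values())
        for level in shared_levels
    }


ensemble = median_ensemble(forecasts)
for level, value in ensemble.items():
    print(f"quantile {level}: {value}")
```

Taking the median rather than the mean keeps a single extreme model from dominating the combined forecast. Whether that trade-off is appropriate depends on the models being combined.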