Browsing by Author "Payne, Philip R. O."
Now showing 1 - 3 of 3
Item: Developing real-world evidence from real-world data: Transforming raw data into analytical datasets (Wiley, 2021-10-14)
Authors: Bastarache, Lisa; Brown, Jeffrey S.; Cimino, James J.; Dorr, David A.; Embi, Peter J.; Payne, Philip R. O.; Wilcox, Adam B.; Weiner, Mark G.
Department: Medicine, School of Medicine
Abstract: Development of evidence-based practice requires practice-based evidence, which can be acquired through analysis of real-world data from electronic health records (EHRs). The EHR contains volumes of information about patients (physical measurements, diagnoses, exposures, and markers of health behavior) that can be used to create algorithms for risk stratification or to gain insight into associations between exposures, interventions, and outcomes. But to transform real-world data into reliable real-world evidence, one must not only choose the correct analytical methods but also understand the quality, detail, provenance, and organization of the underlying source data, and address differences in these characteristics across sites when conducting analyses that span institutions. This manuscript explores the idiosyncrasies inherent in the capture, formatting, and standardization of EHR data and discusses the clinical domain and informatics competencies required to transform raw clinical, real-world data into high-quality, fit-for-purpose analytical datasets used to generate real-world evidence.

Item: A protocol to evaluate RNA sequencing normalization methods (BMC, 2019-12-20)
Authors: Abrams, Zachary B.; Johnson, Travis S.; Huang, Kun; Payne, Philip R. O.; Coombes, Kevin
Department: Medicine, School of Medicine
Abstract:
Background: RNA sequencing technologies have allowed researchers to gain a better understanding of how the transcriptome affects disease. However, sequencing technologies often unintentionally introduce experimental error into RNA sequencing data. To counteract this, normalization methods are routinely applied to reduce the non-biologically derived variability inherent in transcriptomic measurements. However, the comparative efficacy of the various normalization techniques has not been tested in a standardized manner. Here we propose tests that evaluate numerous normalization techniques and apply them to a large-scale standard data set. These tests comprise a protocol that allows researchers to measure the amount of non-biological variability present in any data set after normalization has been performed, a crucial step in assessing the biological validity of data following normalization.
Results: In this study we present two tests to assess the validity of normalization methods applied to a large-scale data set collected for systematic evaluation purposes. We tested various RNASeq normalization procedures and concluded that transcripts per million (TPM) was the best-performing normalization method, based on its preservation of biological signal compared with the other methods tested.
Conclusion: Normalization is of vital importance for accurately interpreting the results of genomic and transcriptomic experiments. More work, however, needs to be done to optimize normalization methods for RNASeq data. The present effort helps pave the way for more systematic evaluations of normalization methods across different platforms. With our proposed schema, researchers can evaluate their own or future normalization methods to further improve the field of RNASeq normalization.
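The best-performing method reported in the study above, transcripts per million (TPM), follows a standard two-step formula: divide each gene's counts by its length in kilobases, then rescale each sample so the values sum to one million. The snippet below is a minimal illustrative sketch of that calculation (the function name, toy data, and array layout are assumptions for illustration; this is not the authors' evaluation code).

```python
import numpy as np

def tpm_normalize(counts, gene_lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts          : (n_genes, n_samples) array of raw read counts
    gene_lengths_bp : (n_genes,) array of gene/transcript lengths in base pairs
    """
    # Step 1: length-normalize counts -> reads per kilobase (RPK)
    rpk = counts / (gene_lengths_bp[:, np.newaxis] / 1_000.0)
    # Step 2: rescale each sample (column) so its values sum to one million
    return rpk / (rpk.sum(axis=0) / 1_000_000.0)

# Toy example: 3 genes x 2 samples
counts = np.array([[100.0, 200.0],
                   [300.0, 400.0],
                   [ 50.0,  60.0]])
lengths = np.array([1_000.0, 2_000.0, 500.0])
tpm = tpm_normalize(counts, lengths)
print(tpm.sum(axis=0))  # each column sums to 1,000,000
```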
Item: The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment (Oxford University Press, 2021)
Authors: Haendel, Melissa A.; Chute, Christopher G.; Bennett, Tellen D.; Eichmann, David A.; Guinney, Justin; Kibbe, Warren A.; Payne, Philip R. O.; Pfaff, Emily R.; Robinson, Peter N.; Saltz, Joel H.; Spratt, Heidi; Suver, Christine; Wilbanks, John; Wilcox, Adam B.; Williams, Andrew E.; Wu, Chunlei; Blacketer, Clair; Bradford, Robert L.; Cimino, James J.; Clark, Marshall; Colmenares, Evan W.; Francis, Patricia A.; Gabriel, Davera; Graves, Alexis; Hemadri, Raju; Hong, Stephanie S.; Hripcsak, George; Jiao, Dazhi; Klann, Jeffrey G.; Kostka, Kristin; Lee, Adam M.; Lehmann, Harold P.; Lingrey, Lora; Miller, Robert T.; Morris, Michele; Murphy, Shawn N.; Natarajan, Karthik; Palchuk, Matvey B.; Sheikh, Usman; Solbrig, Harold; Visweswaran, Shyam; Walden, Anita; Walters, Kellie M.; Weber, Griffin M.; Zhang, Xiaohan Tanner; Zhu, Richard L.; Amor, Benjamin; Girvin, Andrew T.; Manna, Amin; Qureshi, Nabeel; Kurilla, Michael G.; Michael, Sam G.; Portilla, Lili M.; Rutter, Joni L.; Austin, Christopher P.; Gersing, Ken R.
Department: Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering
Abstract:
Objective: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, they are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers.
Materials and methods: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics.
Results: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access.
Conclusions: N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care, and thereby reduce the immediate and long-term impacts of COVID-19.
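As one concrete illustration of the harmonization step described in the Results above, the sketch below maps two hypothetical site exports with different column names and result codings into a single shared schema. Every table, column, and coding here is invented for illustration; the actual N3C pipeline harmonizes established clinical data models into a common data model and is far more involved.

```python
import pandas as pd

# Hypothetical per-site exports with differing column names and result codings.
site_a = pd.DataFrame({
    "patient_id": ["A1", "A2"],
    "test_dt": ["2020-04-01", "2020-04-03"],
    "sars_cov2_pcr": ["POS", "NEG"],
})
site_b = pd.DataFrame({
    "mrn": ["B7", "B9"],
    "collected": ["2020-04-02", "2020-04-05"],
    "covid_result": ["Detected", "Not detected"],
})

# Site-specific mappings into one harmonized schema (illustrative, not the N3C model).
COLUMN_MAPS = {
    "site_a": {"patient_id": "person_id", "test_dt": "test_date", "sars_cov2_pcr": "result"},
    "site_b": {"mrn": "person_id", "collected": "test_date", "covid_result": "result"},
}
RESULT_MAP = {"POS": "positive", "NEG": "negative",
              "Detected": "positive", "Not detected": "negative"}

def harmonize(df: pd.DataFrame, site: str) -> pd.DataFrame:
    out = df.rename(columns=COLUMN_MAPS[site])
    out["result"] = out["result"].map(RESULT_MAP)        # normalize result codings
    out["test_date"] = pd.to_datetime(out["test_date"])  # normalize date types
    out["site"] = site                                    # keep provenance
    return out[["site", "person_id", "test_date", "result"]]

harmonized = pd.concat(
    [harmonize(site_a, "site_a"), harmonize(site_b, "site_b")],
    ignore_index=True,
)
print(harmonized)
```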