Browsing by Subject "Imputation"

Odyssey: a semi-automated pipeline for phasing, imputation, and analysis of genome-wide genetic data
(BioMed Central, 2019-06-28) Eller, Ryan J.; Janga, Sarath C.; Walsh, Susan; Biology, School of Science
BACKGROUND: Genome imputation, admixture resolution and genome-wide association analyses are time-consuming and computationally intensive processes with many composite and requisite steps. Analysis time increases further when building and installing the programs required for these analyses. For scientists who may not be well versed in programming languages but want to perform these operations hands-on, there is a lengthy learning curve to using the vast number of programs available for these analyses.
RESULTS: In an effort to streamline the entire process with easy-to-use steps for scientists working with big data, the Odyssey pipeline was developed. Odyssey is a simplified, efficient, semi-automated genome-wide imputation and analysis pipeline, which prepares raw genetic data, performs pre-imputation quality control, phasing, imputation, post-imputation quality control, population stratification analysis, and genome-wide association with statistical data analysis, including result visualization. Odyssey integrates programs such as PLINK, SHAPEIT, Eagle, IMPUTE, Minimac, and several R packages to create a seamless, easy-to-use, and modular workflow controlled via a single user-friendly configuration file. Odyssey was built with compatibility in mind, and thus utilizes the Singularity container solution, which can be run on Linux, MacOS, and Windows platforms. It is also easily scalable from a simple desktop to a High-Performance System (HPS).
CONCLUSION: Odyssey facilitates efficient and fast automation of genome-wide association analysis and can go from raw genetic data to genome-phenome association visualization and analysis results in 3-8 h on average, depending on the input data, choice of programs within the pipeline and available computer resources. Odyssey was built to be flexible, portable, compatible, scalable, and easy to set up. Biologists less familiar with programming can now work hands-on with their own big data using this easy-to-use pipeline.

SPCS: a spatial and pattern combined smoothing method for spatial transcriptomic expression
(Oxford University Press, 2022) Liu, Yusong; Wang, Tongxin; Duggan, Ben; Sharpnack, Michael; Huang, Kun; Zhang, Jie; Ye, Xiufen; Johnson, Travis S.; Biostatistics and Health Data Science, School of Medicine
High-dimensional, localized ribonucleic acid (RNA) sequencing is now possible owing to recent developments in spatial transcriptomics (ST). ST is based on highly multiplexed sequence analysis and uses barcodes to match the sequenced reads to their respective tissue locations. ST expression data suffer from high noise and dropout events; however, smoothing techniques promise to improve data interpretability before downstream analyses are performed. Single-cell RNA sequencing (scRNA-seq) data suffer from similar limitations, but the smoothing methods developed for scRNA-seq, known as one-factor smoothing methods, can only utilize associations in transcriptome space. Since they do not account for spatial relationships, these one-factor smoothing methods cannot take full advantage of ST data. In this study, we present a novel two-factor smoothing technique, spatial and pattern combined smoothing (SPCS), that employs the k-nearest neighbor (kNN) technique to utilize information from both transcriptome and spatial relationships.
When SPCS is applied to multiple ST slides from pancreatic ductal adenocarcinoma (PDAC), dorsolateral prefrontal cortex (DLPFC) and simulated high-grade serous ovarian cancer (HGSOC) datasets, the smoothed slides have better separability, partition accuracy and biological interpretability than those smoothed by preexisting one-factor methods. Source code of SPCS is available on GitHub (https://github.com/Usos/SPCS).

Validating Imputation Procedures to Calculate Corrected Opioid-Involved Overdose Deaths, Marion County, Indiana, 2011-2016
(Sage, 2020-01) Gupta, Sumedha; Cohen, Alex; Lowder, Evan M.; Ray, Bradley R.; Economics, School of Liberal Arts
Objectives: Understanding the scope of the current opioid epidemic requires accurate counts of the number of opioid-involved drug overdose deaths. Given known errors and limitations in the reporting of these deaths, several studies have used statistical methods to develop estimates of the true number of opioid-involved overdose deaths. This study validates these procedures using a detailed county-level database of linked toxicology and vital records data.
Methods: We extracted and linked toxicology and vital records data from Marion County, Indiana (Indianapolis), during a 6-year period (2011-2016). Using toxicology data as a criterion measure, we tested the validity of multiple imputation procedures, including the Ruhm regression-based imputation approach for correcting the number of opioid-involved overdose deaths.
Results: Estimates deviated from true opioid-involved overdose deaths by 3% and increased in accuracy during the study period (2011-2016). For example, in 2016, 231 opioid-involved overdose deaths were noted in the toxicology data, whereas the corresponding imputed estimate was 233 opioid-involved overdose deaths.
A simple imputation approach, based on the share of opioid-involved overdose deaths among all drug overdose deaths for which the death certificate specified ≥1 drug, deviated from true opioid-involved overdose deaths by ±5%.
Conclusions: Commonly used imputation procedures produced estimates of the number of opioid-involved overdose deaths that are similar to the true number of opioid-involved overdose deaths obtained from toxicology data. Although future studies should examine whether these results extend beyond the geographic area covered in our data set, our findings support the continued use of these imputation procedures to quantify the extent of the opioid epidemic.
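
The share-based approach described in the last abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the opioid share observed among deaths with at least one specified drug is applied to deaths whose certificates specify no drug, which the abstract does not spell out. The function names and the example counts (150, 200, 40) are hypothetical. Only the 2016 figures (231 observed, 233 imputed) come from the abstract.

```python
def impute_opioid_deaths(opioid_specified: int,
                         any_drug_specified: int,
                         unspecified: int) -> float:
    """Share-based corrected count of opioid-involved overdose deaths.

    Assumption (not stated in the abstract): deaths with no drug specified
    are allocated to opioids at the same share observed among deaths with
    at least one drug specified on the death certificate.
    """
    if any_drug_specified <= 0:
        raise ValueError("need at least one death with a specified drug")
    if not 0 <= opioid_specified <= any_drug_specified:
        raise ValueError("opioid_specified must lie in [0, any_drug_specified]")
    if unspecified < 0:
        raise ValueError("unspecified must be non-negative")
    share = opioid_specified / any_drug_specified
    return opioid_specified + share * unspecified


def percent_deviation(estimate: float, truth: float) -> float:
    """Signed deviation of an estimate from the criterion value, in percent."""
    return 100.0 * (estimate - truth) / truth


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    print(impute_opioid_deaths(150, 200, 40))       # 150 + 0.75 * 40 = 180.0
    # 2016 figures quoted in the abstract: 233 imputed vs. 231 in toxicology.
    print(round(percent_deviation(233, 231), 2))    # about 0.87 percent
```

With the 2016 figures from the abstract, the imputed estimate is less than 1% above the toxicology-based count.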