CRISPR Cas13 sg A Designer
dc.contributor.author | Krohannon, Alexander | |
dc.contributor.author | Janga, Sarath | |
dc.date.accessioned | 2022-04-15T20:40:46Z | |
dc.date.available | 2022-04-15T20:40:46Z | |
dc.description | Digitized for IUPUI ScholarWorks inclusion in 2021. | |
dc.description.abstract | The recent discovery of the gene editing system built on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and its associated (Cas) proteins has led to its widespread use for improving our understanding of a variety of biological systems by enabling large-scale perturbation of genomes and transcriptomes. Cas13, a less-studied Cas protein, has been repurposed to allow efficient and precise editing of RNA molecules. The Cas13 system relies on base complementarity between a crRNA (CRISPR RNA) and a target RNA transcript to bind preferentially to the target transcript only. Compared with the upstream regulatory regions of protein-coding genes in the genome, the transcriptome is significantly more redundant, and many transcripts share long stretches of identical nucleotide sequence. Additionally, transcripts adopt complex three-dimensional structures and interact with multiple RBPs (RNA-binding proteins), both of which further limit the scope of effective target sequences. As a result, there is currently no method to predict whether a crRNA will be effective. This project aims to create a novel machine learning model to predict the efficacy of a crRNA, using publicly available RNA knockdown data from Cas13 characterization experiments [1] for 555 sgRNAs targeting the transcriptome in HEK293 cells. Several types of machine learning models were tested during development, including ARD (Automatic Relevance Determination), Bayesian Ridge, Elastic Net, Huber, K-Nearest Neighbors, Linear, and SVM (Support Vector Machines). K-Nearest Neighbors showed the greatest accuracy, predicting the knockdown value within 10% of the mean value in 39.1% of instances. Despite differences in overall accuracy, Elastic Net had the lowest precision error (0.0638) and SVM had the lowest recall error (0.0094).
Implementation of this model will allow rapid deployment of new types of transcriptome screens and enable potential treatments for diseases linked to aberrations in RNA regulatory processes. | en_US |
dc.identifier.uri | https://hdl.handle.net/1805/28536 | |
dc.title | CRISPR Cas13 sg A Designer | en_US |
dc.type | Poster | en_US |
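
The abstract describes predicting crRNA knockdown with a K-Nearest Neighbors model and scoring predictions by how often they fall within 10% of the measured value. The sketch below illustrates that idea only and is not the authors' implementation. The one-hot sequence encoding, the function names (`one_hot`, `knn_predict`, `fraction_within`), the choice of `k`, the reading of "within 10%" as a relative tolerance, and the toy sequences and knockdown values are all assumptions made for illustration.

```python
from math import sqrt

BASES = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as a flat list of 0/1 values, four per base.

    Assumed feature encoding; the poster does not state which features were used.
    """
    return [1.0 if base == b else 0.0 for base in seq.upper() for b in BASES]


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value by averaging the targets of the k nearest training points."""
    # Pair each Euclidean distance with its target, then sort by distance.
    dists = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(x, query))), y)
        for x, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def fraction_within(predicted, observed, tolerance=0.10):
    """Fraction of predictions within a relative tolerance of the observed value.

    Assumed reading of the abstract's "within 10%" accuracy measure.
    """
    hits = sum(
        1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance * abs(o)
    )
    return hits / len(observed)


# Hypothetical toy data: guide sequences with made-up knockdown values.
guides = ["ACGUACGU", "ACGUACGA", "UUUUCCCC", "UUUUCCCG"]
knockdown = [0.80, 0.70, 0.20, 0.30]

train_x = [one_hot(g) for g in guides]
pred = knn_predict(train_x, knockdown, one_hot("ACGUACGG"), k=2)
score = fraction_within([0.75, 0.50], [0.80, 0.30])
```

With real data, the same scoring function could be applied to held-out guides for each candidate model so that the models are compared on an identical measure.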