Predictions on and Analysis of Viral Proteins Encoded by Overlapping Genes

dc.contributor.advisor: Dunker, A. Keith
dc.contributor.author: Khosravi, Mahvash
dc.date.accessioned: 2011-08-19T18:10:48Z
dc.date.available: 2011-08-19T18:10:48Z
dc.date.issued: 2011-08-19
dc.degree.date: August 2007 [en_US]
dc.degree.grantor: Indiana University [en_US]
dc.degree.level: M.A. [en_US]
dc.description.abstract: Overlapping genes are adjacent genes that share a portion of their coding sequence. Such genes are often observed in the compact genomes of viruses, prokaryotes, and mitochondria, and are also seen in human and other mammalian genomes. Gene overlap is a way to minimize genome size and maximize encoding capacity; the overlapping genes produce different proteins. A major task in the post-genomic era is the large-scale study of the structures and functions of proteins, which play crucial roles in virtually all biological processes. It is generally assumed that 3-D structure determines the function of a protein, but many proteins or regions of proteins can function in the absence of 3-D structure. The term "disordered" is used to describe these proteins and regions. A large number of studies have shown that biological functions depend on both ordered and disordered proteins. Natively disordered regions are common and play essential roles in many proteins, especially in signaling and regulation. The goal of this research was to analyze the ordered and disordered tendencies of viral proteins encoded by overlapping genes. Our hypothesis is that, in a pair of proteins or protein regions encoded by overlapping genes, at least one member of the pair is disordered (unstructured). This hypothesis is based on the observation that structured proteins require highly specific amino acid sequences, while unstructured (disordered) sequences are essentially unconstrained. Thus, given a structured protein and its associated mRNA sequence, any sequence derived from an overlapping reading frame seems highly unlikely to have a sequence pattern commensurate with a structured protein; a sequence pattern consistent with a disordered protein seems much more likely.
We studied the protein products of overlapping gene sequences, tested the hypothesis, and addressed two questions. First, do the proteins encoded by overlapping genes have opposite order-disorder content; that is, does an ordered part of one overlapping protein correspond to a disordered part of the other? Second, do the protein regions encoded by overlapping sequence contain more disordered amino acids than those encoded by non-overlapping sequence? Using our database of overlapping viral genes and the disorder predictor PONDR VL3, we predicted the order-disorder state of each amino acid in 97 viral protein sequences. The results supported our hypothesis: ordered amino acids are mostly associated with non-overlapping regions, while disordered amino acids are more prevalent in overlapping regions. In the overlapping regions of 52 protein pairs, most of the amino acid pairs facing each other on the two protein sequences included at least one disordered residue. Of the 52 pairs, 3 had no disordered amino acids and 22 had no ordered amino acids on either sequence. The fraction of ordered pairs in the pooled overlapping regions of the 52 protein pairs was 0.28. The non-overlapping regions of the 97 proteins were predominantly ordered: the fraction of ordered amino acids in the pooled non-overlapping regions was 0.77. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/2622
dc.identifier.uri: http://dx.doi.org/10.7912/C2/803
dc.subject: Genes [en_US]
dc.subject: Proteins [en_US]
dc.subject: Signaling [en_US]
dc.subject: Regulation [en_US]
dc.subject: Amino acids [en_US]
dc.subject: Sequence [en_US]
dc.subject: Overlapping [en_US]
dc.subject: Protein regions [en_US]
dc.subject: Protein predictor [en_US]
dc.subject: Database [en_US]
dc.title: Predictions on and Analysis of Viral Proteins Encoded by Overlapping Genes [en_US]
dc.type: Thesis [en]
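
The pairwise tally described in the abstract (facing amino acid pairs in an overlapping region, classified by whether both residues are ordered) can be illustrated with a minimal sketch. This is not the thesis's code. It assumes the per-residue predictions have already been reduced to labels, 'O' for ordered and 'D' for disordered, and that the two label strings are already aligned position by position over the overlapping region. It also assumes that "ordered pair" means both facing residues are ordered, which is one reading of the abstract. The function name `pair_fractions` and the example labels are hypothetical.

```python
def pair_fractions(labels_a, labels_b):
    """Tally facing residue pairs in one overlapping region.

    labels_a, labels_b: equal-length strings of 'O' (ordered) and
    'D' (disordered), assumed to be aligned position by position.
    Returns the fraction of pairs where both residues are ordered and
    the fraction where at least one residue is disordered.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label strings must have equal length")
    total = len(labels_a)
    if total == 0:
        raise ValueError("overlapping region is empty")

    # A pair counts as ordered only when both facing residues are 'O'.
    both_ordered = sum(
        1 for a, b in zip(labels_a, labels_b) if a == "O" and b == "O"
    )
    return {
        "both_ordered": both_ordered / total,
        "at_least_one_disordered": (total - both_ordered) / total,
    }


# Made-up labels for illustration only; not data from the thesis.
result = pair_fractions("OODDDO", "ODDDOO")
print(result)
```

Pooling across many protein pairs, as the abstract does for its 52 pairs, would amount to summing the counts over all regions before dividing, rather than averaging the per-pair fractions.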
Files
Original bundle
Name:
Thesis_Mahvash Khosravi (2).pdf
Size:
514.22 KB
Format:
Adobe Portable Document Format
Description:
Main Thesis