Protein function prediction by integrating sequence, structure and binding affinity information

dc.contributor.advisor: Zhou, Yaoqi
dc.contributor.author: Zhao, Huiying
dc.contributor.other: Liu, Yunlong
dc.contributor.other: Meroueh, Samy
dc.contributor.other: Janga, Sarath Chandra
dc.date.accessioned: 2014-02-03T18:13:42Z
dc.date.available: 2014-02-03T18:13:42Z
dc.date.issued: 2014-02-03
dc.degree.date: 2013 [en_US]
dc.degree.discipline: School of Informatics [en]
dc.degree.grantor: Indiana University [en_US]
dc.degree.level: Ph.D. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Proteins are nano-machines that work inside every living organism. Functional disruption of one or several proteins is the cause of many diseases. However, the functions of most proteins are yet to be annotated, because inexpensive sequencing techniques have dramatically sped up the discovery of new protein sequences (265 million and counting) and experimental examination of every protein in all its possible functional categories is simply impractical. Thus, it is necessary to develop computational function-prediction tools that complement and guide experimental studies. In this study, we developed a series of predictors for highly accurate prediction of proteins with DNA-binding, RNA-binding and carbohydrate-binding capability. These predictors use a template-based technique that combines sequence and structural information with predicted binding affinity. Both sequence-based and structure-based approaches were developed. Results indicate the importance of binding affinity prediction for improving the sensitivity and precision of function prediction. Application of these methods to the human genome and to structural genomics targets demonstrated their usefulness in annotating proteins of unknown function and in discovering moonlighting proteins with DNA-, RNA-, or carbohydrate-binding function. In addition, we investigated disruption of protein functions by naturally occurring genetic variations due to insertions and deletions (INDELs). We found that protein structures are the most critical features in recognising disease-causing non-frameshifting INDELs. The predictors for function prediction are available at http://sparks-lab.org/spot, and the predictor for classification of non-frameshifting INDELs is available at http://sparks-lab.org/ddig. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/3913
dc.identifier.uri: http://dx.doi.org/10.7912/C2/932
dc.language.iso: en_US [en_US]
dc.subject: protein function [en_US]
dc.subject.lcsh: Proteomics -- Data processing -- Research [en_US]
dc.subject.lcsh: Proteins -- Analysis -- Mathematics -- Research [en_US]
dc.subject.lcsh: Artificial intelligence [en_US]
dc.subject.lcsh: Algorithms [en_US]
dc.subject.lcsh: Proteins -- Structure-activity relationships [en_US]
dc.subject.lcsh: Protein-protein interactions [en_US]
dc.subject.lcsh: Genomics -- Data processing [en_US]
dc.subject.lcsh: Proteins -- Analysis [en_US]
dc.subject.lcsh: Biologically-inspired computing -- Research [en_US]
dc.subject.lcsh: Expert systems (Computer science) [en_US]
dc.subject.lcsh: Data mining [en_US]
dc.subject.lcsh: Bioinformatics -- Research [en_US]
dc.subject.lcsh: DNA-protein interactions [en_US]
dc.subject.lcsh: RNA-protein interactions [en_US]
dc.subject.lcsh: Carbohydrates [en_US]
dc.title: Protein function prediction by integrating sequence, structure and binding affinity information [en_US]
dc.type: Thesis [en]
Files
Original bundle
Name: thesis_Sep2.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format