Protein Fold Recognition Using Adaboost Learning Strategy

dc.contributor.author: Su, Yijing
dc.date.accessioned: 2010-09-29T20:32:02Z
dc.date.available: 2010-09-29T20:32:02Z
dc.degree.date: 2007-12
dc.degree.discipline: School of Informatics
dc.degree.grantor: Indiana University
dc.degree.level: M.S.
dc.description.abstract: Protein structure prediction is one of the most important and difficult problems in computational molecular biology. Unlike sequence-only comparison, protein fold recognition based on machine learning algorithms attempts to detect similarities between protein structures that may not be accompanied by any significant sequence similarity. It takes advantage of structural and physical properties beyond sequence information. In this thesis, we present a novel classifier for protein fold recognition that combines the AdaBoost algorithm with a k-Nearest Neighbor classifier. The experimental framework consists of two tasks: (i) cross-validation within the training dataset, and (ii) testing on an unseen validation dataset, in which 90% of the proteins have less than 25% sequence identity with the training samples. Our method achieves a 64.7% success rate in classifying the independent validation dataset into 27 types of protein folds. Our experiments on protein fold recognition demonstrate the merit of this approach: coupling the AdaBoost strategy with weak learning classifiers leads to improved and robust performance, with 64.7% accuracy versus the 61.2% accuracy reported in the published literature using identical sample sets, feature representation, and class labels.
dc.identifier.uri: https://hdl.handle.net/1805/2267
dc.identifier.uri: http://dx.doi.org/10.7912/C2/889
dc.language.iso: en_US
dc.subject: Adaboost
dc.subject: Recognition
dc.subject: Learning Strategy
dc.subject: Protein Fold
dc.title: Protein Fold Recognition Using Adaboost Learning Strategy
dc.type: Thesis
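
The abstract does not spell out how AdaBoost and the k-Nearest Neighbor classifier are combined, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes boosting by weighted resampling (each round trains a k-NN learner on a sample drawn according to the current weights) and the multi-class SAMME weight update, and it assumes at least two classes. The function names `knn_predict`, `adaboost_knn_fit`, and `adaboost_knn_predict` and all parameters are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def adaboost_knn_fit(X, y, rounds=10, k=1, sample_frac=1.0, seed=0):
    """Boosting by resampling with k-NN weak learners (assumed SAMME update).

    Returns a list of (alpha, sample_X, sample_y) tuples, one per accepted round.
    Assumes y contains at least two distinct class labels.
    """
    rng = random.Random(seed)
    n = len(X)
    n_classes = len(set(y))
    weights = [1.0 / n] * n
    sample_size = max(1, int(sample_frac * n))
    ensemble = []
    for _ in range(rounds):
        # Draw a training sample with replacement, favouring high-weight points.
        idx = rng.choices(range(n), weights=weights, k=sample_size)
        sub_X = [X[i] for i in idx]
        sub_y = [y[i] for i in idx]
        # Weighted error of this weak learner on the full training set.
        preds = [knn_predict(sub_X, sub_y, x, k) for x in X]
        err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
        if err >= 1.0 - 1.0 / n_classes:
            continue  # no better than random guessing: discard this round
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
        ensemble.append((alpha, sub_X, sub_y))
        # Increase the weight of misclassified points, then renormalise.
        weights = [w * math.exp(alpha) if p != t else w
                   for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_knn_predict(ensemble, x, k=1):
    """Alpha-weighted vote over the weak learners kept in the ensemble."""
    scores = Counter()
    for alpha, sub_X, sub_y in ensemble:
        scores[knn_predict(sub_X, sub_y, x, k)] += alpha
    return scores.most_common(1)[0][0]
```

In a setting like the one described in the abstract, `X` would hold one feature vector per protein and `y` one of the 27 fold labels; the value of `k`, the number of rounds, and the distance measure used in the thesis are not stated in this record and would need to be taken from the full text.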
Files
Original bundle
Name: Su_11-15.pdf
Size: 296.71 KB
Format: Adobe Portable Document Format