Direct prediction of profiles of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles

dc.contributor.author: Li, Zhixiu
dc.contributor.author: Yang, Yuedong
dc.contributor.author: Faraggi, Eshel
dc.contributor.author: Zhan, Jian
dc.contributor.author: Zhou, Yaoqi
dc.contributor.department: Department of BioHealth Informatics, IU School of Informatics and Computing [en_US]
dc.date.accessioned: 2016-07-07T16:28:15Z
dc.date.available: 2016-07-07T16:28:15Z
dc.date.issued: 2014-10
dc.description.abstract: Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or a profile of sequences is currently employed to guide experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because it lacks tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7 percentage points (from 23.6% to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low-complexity regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at the protein surface. The accuracy of the sequence profiles obtained is comparable to that of profiles generated by the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org. [en_US]
dc.eprint.version: Author's manuscript [en_US]
dc.identifier.citation: Li, Z., Yang, Y., Faraggi, E., Zhan, J., & Zhou, Y. (2014). Direct prediction of profiles of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles. Proteins, 82(10), 2565–2573. http://doi.org/10.1002/prot.24620 [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/10313
dc.language.iso: en_US [en_US]
dc.publisher: Wiley Online Library [en_US]
dc.relation.isversionof: 10.1002/prot.24620 [en_US]
dc.relation.journal: Proteins [en_US]
dc.rights: Publisher Policy [en_US]
dc.source: PMC [en_US]
dc.subject: Inverse protein folding problem [en_US]
dc.subject: Knowledge-based energy function [en_US]
dc.subject: Neural network [en_US]
dc.subject: Protein design [en_US]
dc.subject: Sequence profiles [en_US]
dc.title: Direct prediction of profiles of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles [en_US]
dc.type: Article [en_US]
Files
Original bundle
Name: nihms602508.pdf
Size: 1.15 MB
Format: Adobe Portable Document Format
Description: Main Article