Improved protein model quality prediction by changing the target function
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. ORCID iD: 0000-0003-2232-3006
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Protein model quality assessment is an important part of protein structure prediction. For more than a decade, we have developed a set of methods using various types of protein descriptions and machine learning methods. Common to all these methods is that the target function, i.e. the description of the quality of a residue in a protein model, has been the S-score. However, many other quality estimation functions exist. These can roughly be divided into superposition-based functions, such as the S-score, and contact-based functions. The contact-based functions have been shown to be better at evaluating the quality of multi-domain proteins.
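As an illustration of a superposition-based target function, the S-score of a residue is commonly written as S = 1 / (1 + (d / d0)^2), where d is the distance between the residue in the model and in the native structure after superposition. The sketch below assumes that form and a distance threshold d0 of 3 Å; neither the threshold nor the function names are taken from this record, and the superposition step itself is not shown.

```python
def s_score(deviation, d0=3.0):
    """Local S-score for one residue.

    deviation: distance in angstroms between the residue in the model
               and in the native structure after superposition.
    d0:        distance threshold (3.0 is an assumed value).
    """
    return 1.0 / (1.0 + (deviation / d0) ** 2)


def global_s_score(deviations, d0=3.0):
    """Global score, taken here as the mean of the local S-scores."""
    return sum(s_score(d, d0) for d in deviations) / len(deviations)


if __name__ == "__main__":
    # Hypothetical per-residue deviations in angstroms.
    deviations = [0.0, 1.5, 3.0, 6.0]
    print([round(s_score(d), 3) for d in deviations])
    print(round(global_s_score(deviations), 3))
```

A residue placed exactly right scores 1, a residue displaced by d0 scores 0.5, and the score decays towards 0 for large deviations, so badly misplaced residues cannot dominate the global average.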

Here, we examine the effects of retraining ProQ3D using identical inputs but different target functions. We find that using the same function for training and testing provides the best agreement. However, using contact-based target functions yields higher correlations and a better ranking of individual models.
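The two kinds of comparison mentioned above, correlation and ranking, can be sketched as follows. This is an illustrative example only: the record does not state which statistics were used, so the Pearson correlation and the first-ranked-model loss, as well as the function names and the numbers, are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between predicted and true quality scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def first_rank_loss(predicted, true):
    """True quality lost by picking the model with the highest prediction.

    Returns the difference between the best true score among the models
    and the true score of the top-predicted model (0 means a perfect pick).
    """
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


if __name__ == "__main__":
    # Hypothetical scores for three models of one target.
    predicted = [0.9, 0.5, 0.7]
    true = [0.6, 0.8, 0.4]
    print(pearson(predicted, true))
    print(first_rank_loss(predicted, true))
```

A predictor can correlate well with the true scores and still pick a poor top model, which is why correlation and ranking are worth reporting separately.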

Keyword [en]
Model Quality Assessment, Protein Model Quality Assessment, structural bioinformatics, machine learning, deep learning, target function, S-score, TM-score, GDT_TS, LDDT, CAD
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-137694; OAI: oai:DiVA.org:su-137694; DiVA, id: diva2:1063478
Funder
Swedish Research Council, VR-NT 2012-5046
Available from: 2017-01-10 Created: 2017-01-10 Last updated: 2017-01-16 Bibliographically approved
In thesis
1. Protein Model Quality Assessment: A Machine Learning Approach
2017 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Many protein structure prediction programs exist, and they can efficiently generate a number of protein models of varying quality. One of the problems is that it is difficult to know which model is the best one for a given target sequence. Selecting the best model is one of the major tasks of Model Quality Assessment Programs (MQAPs). These programs are able to predict model accuracy before the native structure is determined. The accuracy estimation can be divided into two parts: global (the accuracy of the whole model) and local (the accuracy of each residue). ProQ2 is one of the most successful MQAPs for prediction of both local and global model accuracy and is based on a machine learning approach.

In this thesis, I present my own contribution to Model Quality Assessment (MQA) and the newest developments of the ProQ program series. Firstly, I describe a new ProQ2 implementation in the protein modelling software package Rosetta. This new implementation allows use of ProQ2 as a scoring function for conformational sampling inside Rosetta, which was not possible before. Moreover, I present two new methods, ProQ3 and ProQ3D, which both outperform their predecessor. ProQ3 introduces new training features that are calculated from Rosetta energy functions, and ProQ3D introduces a new machine learning approach based on deep learning. The ProQ3 program participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12) and was one of the best methods in the MQA category. Finally, an important issue in model quality assessment is how to select the target function that the predictor is trying to learn. In the fourth manuscript, I show that MQA results can be improved by selecting a contact-based target function instead of the more conventional superposition-based functions.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2017. p. 46
Keyword
Protein Model Quality Assessment, structural bioinformatics, machine learning, deep learning, support vector machine, proq, Artificial Neural Network, protein structure prediction
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-137695 (URN), 978-91-7649-633-6 (ISBN), 978-91-7649-634-3 (ISBN)
Public defence
2017-02-10, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Funder
Swedish Research Council, VR-NT 2012-5046
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

Available from: 2017-01-18 Created: 2017-01-10 Last updated: 2017-01-18 Bibliographically approved

Open Access in DiVA

fulltext (25826 kB), 162 downloads
File information
File name: FULLTEXT01.pdf
File size: 25826 kB
Checksum (SHA-512): 17d6b6e1d0846a20f0a244aa89ec6d85ff1c68768857148296fdcfff0cb5f51298b6e1d158c74cb1894de3407ff68ea511eed27f15bf9990cbf6040ac831100b
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Elofsson, Arne
By organisation
Department of Biochemistry and Biophysics; Science for Life Laboratory (SciLifeLab)
Bioinformatics and Systems Biology
