Improved protein structure prediction using potentials from deep learning.
Journal
Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462
Publication information
Publication date:
01 2020
History:
received: 2019-04-02
accepted: 2019-12-10
pubmed: 2020-01-17
medline: 2020-05-06
entrez: 2020-01-17
Status:
ppublish
Abstract
Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence.
Identifiers
pubmed: 31942072
doi: 10.1038/s41586-019-1923-7
pii: 10.1038/s41586-019-1923-7
Chemical substances
Proteins: 0
Caspases: EC 3.4.22.-
caspase 13: EC 3.4.22.-
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
706-710
Comments and corrections
Type: CommentIn
Type: CommentIn
References
Dill, K. A., Ozkan, S. B., Shell, M. S. & Weikl, T. R. The protein folding problem. Annu. Rev. Biophys. 37, 289–316 (2008).
pubmed: 18573083
pmcid: 2443096
Dill, K. A. & MacCallum, J. L. The protein-folding problem, 50 years on. Science 338, 1042–1046 (2012).
pubmed: 23180855
Schaarschmidt, J., Monastyrskyy, B., Kryshtafovych, A. & Bonvin, A. M. J. J. Assessment of contact predictions in CASP12: co-evolution and deep learning coming of age. Proteins 86, 51–66 (2018).
pubmed: 29071738
Kirkwood, J. Statistical mechanics of fluid mixtures. J. Chem. Phys. 3, 300–313 (1935).
Kryshtafovych, A., Schwede, T., Topf, M., Fidelis, K. & Moult, J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins 87, 1011–1020 (2019).
pubmed: 31589781
Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins 57, 702–710 (2004).
pubmed: 15476259
Zhang, Y. Protein structure prediction: when is it useful? Curr. Opin. Struct. Biol. 19, 145–155 (2009).
pubmed: 19327982
pmcid: 2673339
Senior, A. W. et al. Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins 87, 1141–1148 (2019).
pubmed: 31602685
Das, R. & Baker, D. Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382 (2008).
pubmed: 18410248
Jones, D. T. Predicting novel protein folds by using FRAGFOLD. Proteins 45, 127–132 (2001).
Zhang, C., Mortuza, S. M., He, B., Wang, Y. & Zhang, Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 86, 136–151 (2018).
pubmed: 29082551
Kirkpatrick, S., Gelatt, C. D. Jr & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
pubmed: 17813860
Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
pubmed: 10592235
pmcid: 102472
Altschuh, D., Lesk, A. M., Bloomer, A. C. & Klug, A. Correlation of co-ordinated amino acid substitutions with function in viruses related to tobacco mosaic virus. J. Mol. Biol. 193, 693–707 (1987).
pubmed: 3612789
Ovchinnikov, S., Kamisetty, H. & Baker, D. Robust and accurate prediction of residue–residue interactions across protein interfaces using evolutionary information. eLife 3, e02030 (2014).
pubmed: 24842992
pmcid: 4034769
Seemayer, S., Gruber, M. & Söding, J. CCMpred—fast and precise prediction of protein residue–residue contacts from correlated mutations. Bioinformatics 30, 3128–3130 (2014).
pubmed: 25064567
pmcid: 4201158
Morcos, F. et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl Acad. Sci. USA 108, E1293–E1301 (2011).
pubmed: 22106262
Jones, D. T., Buchan, D. W., Cozzetto, D. & Pontil, M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28, 184–190 (2012).
pubmed: 22101153
Skwark, M. J., Raimondi, D., Michel, M. & Elofsson, A. Improved contact predictions using the recognition of protein like contact patterns. PLOS Comput. Biol. 10, e1003889 (2014).
pubmed: 25375897
pmcid: 4222596
Jones, D. T., Singh, T., Kosciolek, T. & Tetchner, S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 31, 999–1006 (2015).
pubmed: 25431331
Wang, S., Sun, S., Li, Z., Zhang, R. & Xu, J. Accurate de novo prediction of protein contact map by ultra-deep learning model. PLOS Comput. Biol. 13, e1005324 (2017).
pubmed: 28056090
pmcid: 5249242
Jones, D. T. & Kandathil, S. M. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features. Bioinformatics 34, 3308–3315 (2018).
pubmed: 29718112
pmcid: 6157083
Ovchinnikov, S. et al. Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins 84, 67–75 (2016).
pubmed: 26677056
pmcid: 5490371
Aszódi, A. & Taylor, W. R. Estimating polypeptide α-carbon distances from multiple sequence alignments. J. Math. Chem. 17, 167–184 (1995).
Zhao, F. & Xu, J. A position-specific distance-dependent statistical potential for protein structure and functional study. Structure 20, 1118–1126 (2012).
pubmed: 22608968
pmcid: 3372698
Xu, J. & Wang, S. Analysis of distance-based protein structure prediction by deep learning in CASP13. Proteins 87, 1069–1081 (2019).
pubmed: 31471916
Aszódi, A., Gradwell, M. J. & Taylor, W. R. Global fold determination from a small number of distance restraints. J. Mol. Biol. 251, 308–326 (1995).
pubmed: 7643405
Kandathil, S. M., Greener, J. G. & Jones, D. T. Prediction of interresidue contacts with DeepMetaPSICOV in CASP13. Proteins 87, 1092–1099 (2019).
pubmed: 31298436
pmcid: 6899903
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997).
pubmed: 9149153
Liu, D. C. & Nocedal, J. On the limited memory BFGS method for large scale optimization. Math. Program. 45, 503–528 (1989).
Li, Y., Zhang, C., Bell, E. W., Yu, D.-J. & Zhang, Y. Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins 87, 1082–1091 (2019).
pubmed: 31407406
Konagurthu, A. S., Lesk, A. M. & Allison, L. Minimum message length inference of secondary structure from protein coordinate data. Bioinformatics 28, i97–i105 (2012).
pubmed: 22689785
pmcid: 3371855
Dawson, N. L. et al. CATH: an expanded resource to predict protein function through structure and sequence. Nucleic Acids Res. 45, D289–D295 (2017).
pubmed: 27899584
Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170–D176 (2017).
pubmed: 27899574
Remmert, M., Biegert, A., Hauser, A. & Söding, J. HHblits: lightning-fast iterative protein sequence searching by HMM–HMM alignment. Nat. Methods 9, 173–175 (2012).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
pubmed: 9254694
pmcid: 146917
Yu, F. & Koltun, V. Multi-scale context aggregation by dilated convolutions. Preprint at arXiv https://arxiv.org/abs/1511.07122 (2015).
Oord, A. d. et al. Wavenet: a generative model for raw audio. Preprint at arXiv https://arxiv.org/abs/1609.03499 (2016).
Clevert, D.-A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). Preprint at arXiv https://arxiv.org/abs/1511.07289 (2015).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Kabsch, W. & Sander, C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983).
pubmed: 6667333
Yang, Y. et al. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Briefings Bioinf. 19, 482–494 (2018).
Zemla, A., Venclovas, C., Moult, J. & Fidelis, K. Processing and analysis of CASP3 protein structure predictions. Proteins 37, 22–29 (1999).
Mariani, V., Biasini, M., Barbato, A. & Schwede, T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29, 2722–2728 (2013).
pubmed: 23986568
pmcid: 3799472
Abriata, L. A., Tamo, G. E. & Dal Peraro, M. A further leap of improvement in tertiary structure prediction in CASP13 prompts new routes for future assessments. Proteins 87, 1100–1112 (2019).
pubmed: 31344267
Kayikci, M. et al. Visualization and analysis of non-covalent contacts using the Protein Contacts Atlas. Nat. Struct. Mol. Biol. 25, 185–194 (2018).
pubmed: 29335563
pmcid: 5837000
Croll, T. I. et al. Evaluation of template-based modeling in CASP13. Proteins 87, 1113–1127 (2019).
pubmed: 31407380
pmcid: 6851432
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning Vol. 70, 3319–3328 (2017).
Abadi, M. et al. Tensorflow: a system for large-scale machine learning. In Proc. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) 265–283 (2016).
Söding, J., Biegert, A. & Lupas, A. N. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33, W244–W248 (2005).
pubmed: 15980461
pmcid: 1160169
Cong, Q. et al. An automatic method for CASP9 free modeling structure prediction assessment. Bioinformatics 27, 3371–3378 (2011).
pubmed: 21994223
pmcid: 3232368
Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).
pubmed: 15849316
pmcid: 1084323
Tovchigrechko, A., Wells, C. A. & Vakser, I. A. Docking of protein models. Protein Sci. 11, 1888–1896 (2002).
pubmed: 12142443
pmcid: 2373684
Audet, M. et al. Crystal structure of misoprostol bound to the labor inducer prostaglandin E
pubmed: 30510194