Improved protein structure prediction using potentials from deep learning.


Journal

Nature
ISSN: 1476-4687
Titre abrégé: Nature
Pays: England
ID NLM: 0410462

Informations de publication

Date de publication:
01 2020
Historique:
received: 02 04 2019
accepted: 10 12 2019
pubmed: 17 1 2020
medline: 6 5 2020
entrez: 17 1 2020
Statut: ppublish

Résumé

Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence

Identifiants

pubmed: 31942072
doi: 10.1038/s41586-019-1923-7
pii: 10.1038/s41586-019-1923-7
doi:

Substances chimiques

Proteins 0
Caspases EC 3.4.22.-
caspase 13 EC 3.4.22.-

Types de publication

Journal Article

Langues

eng

Sous-ensembles de citation

IM

Pagination

706-710

Commentaires et corrections

Type : CommentIn
Type : CommentIn

Références

Dill, K. A., Ozkan, S. B., Shell, M. S. & Weikl, T. R. The protein folding problem. Annu. Rev. Biophys. 37, 289–316 (2008).
pubmed: 18573083 pmcid: 2443096
Dill, K. A. & MacCallum, J. L. The protein-folding problem, 50 years on. Science 338, 1042–1046 (2012).
pubmed: 23180855
Schaarschmidt, J., Monastyrskyy, B., Kryshtafovych, A. & Bonvin, A. M. J. J. Assessment of contact predictions in CASP12: co-evolution and deep learning coming of age. Proteins 86, 51–66 (2018).
pubmed: 29071738
Kirkwood, J. Statistical mechanics of fluid mixtures. J. Chem. Phys. 3, 300–313 (1935).
Kryshtafovych, A., Schwede, T., Topf, M., Fidelis, K. & Moult, J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins 87, 1011–1020 (2019).
pubmed: 31589781
Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins 57, 702–710 (2004).
pubmed: 15476259
Zhang, Y. Protein structure prediction: when is it useful? Curr. Opin. Struct. Biol. 19, 145–155 (2009).
pubmed: 19327982 pmcid: 2673339
Senior, A. W. et al. Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins 87, 1141–1148 (2019).
pubmed: 31602685
Das, R. & Baker, D. Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382 (2008).
pubmed: 18410248
Jones, D. T. Predicting novel protein folds by using FRAGFOLD. Proteins 45, 127–132 (2001).
Zhang, C., Mortuza, S. M., He, B., Wang, Y. & Zhang, Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 86, 136–151 (2018).
pubmed: 29082551
Kirkpatrick, S., Gelatt, C. D. Jr & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
pubmed: 17813860
Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
pubmed: 10592235 pmcid: 102472
Altschuh, D., Lesk, A. M., Bloomer, A. C. & Klug, A. Correlation of co-ordinated amino acid substitutions with function in viruses related to tobacco mosaic virus. J. Mol. Biol. 193, 693–707 (1987).
pubmed: 3612789
Ovchinnikov, S., Kamisetty, H. & Baker, D. Robust and accurate prediction of residue–residue interactions across protein interfaces using evolutionary information. eLife 3, e02030 (2014).
pubmed: 24842992 pmcid: 4034769
Seemayer, S., Gruber, M. & Söding, J. CCMpred—fast and precise prediction of protein residue–residue contacts from correlated mutations. Bioinformatics 30, 3128–3130 (2014).
pubmed: 25064567 pmcid: 4201158
Morcos, F. et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl Acad. Sci. USA 108, E1293–E1301 (2011).
pubmed: 22106262
Jones, D. T., Buchan, D. W., Cozzetto, D. & Pontil, M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28, 184–190 (2012).
pubmed: 22101153
Skwark, M. J., Raimondi, D., Michel, M. & Elofsson, A. Improved contact predictions using the recognition of protein like contact patterns. PLOS Comput. Biol. 10, e1003889 (2014).
pubmed: 25375897 pmcid: 4222596
Jones, D. T., Singh, T., Kosciolek, T. & Tetchner, S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 31, 999–1006 (2015).
pubmed: 25431331
Wang, S., Sun, S., Li, Z., Zhang, R. & Xu, J. Accurate de novo prediction of protein contact map by ultra-deep learning model. PLOS Comput. Biol. 13, e1005324 (2017).
pubmed: 28056090 pmcid: 5249242
Jones, D. T. & Kandathil, S. M. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features. Bioinformatics 34, 3308–3315 (2018).
pubmed: 29718112 pmcid: 6157083
Ovchinnikov, S. et al. Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins 84, 67–75 (2016).
pubmed: 26677056 pmcid: 5490371
Aszódi, A. & Taylor, W. R. Estimating polypeptide α-carbon distances from multiple sequence alignments. J. Math. Chem. 17, 167–184 (1995).
Zhao, F. & Xu, J. A position-specific distance-dependent statistical potential for protein structure and functional study. Structure 20, 1118–1126 (2012).
pubmed: 22608968 pmcid: 3372698
Xu, J. & Wang, S. Analysis of distance-based protein structure prediction by deep learning in CASP13. Proteins 87, 1069–1081 (2019).
pubmed: 31471916
Aszódi, A., Gradwell, M. J. & Taylor, W. R. Global fold determination from a small number of distance restraints. J. Mol. Biol. 251, 308–326 (1995).
pubmed: 7643405
Kandathil, S. M., Greener, J. G. & Jones, D. T. Prediction of interresidue contacts with DeepMetaPSICOV in CASP13. Proteins 87, 1092–1099 (2019).
pubmed: 31298436 pmcid: 6899903
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997).
pubmed: 9149153
Liu, D. C. & Nocedal, J. On the limited memory BFGS method for large scale optimization. Math. Program. 45, 503–528 (1989).
Li, Y., Zhang, C., Bell, E. W., Yu, D.-J. & Zhang, Y. Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins 87, 1082–1091 (2019).
pubmed: 31407406
Konagurthu, A. S., Lesk, A. M. & Allison, L. Minimum message length inference of secondary structure from protein coordinate data. Bioinformatics 28, i97–i105 (2012).
pubmed: 22689785 pmcid: 3371855
Dawson, N. L. et al. CATH: an expanded resource to predict protein function through structure and sequence. Nucleic Acids Res. 45, D289–D295 (2017).
pubmed: 27899584
Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170–D176 (2017).
pubmed: 27899574
Remmert, M., Biegert, A., Hauser, A. & Söding, J. HHblits: lightning-fast iterative protein sequence searching by HMM–HMM alignment. Nat. Methods 9, 173–175 (2012).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
pubmed: 9254694 pmcid: 146917
Yu, F. & Koltun, V. Multi-scale context aggregation by dilated convolutions. Preprint at arXiv https://arxiv.org/abs/1511.07122 (2015).
Oord, A. d. et al. Wavenet: a generative model for raw audio. Preprint at arXiv https://arxiv.org/abs/1609.03499 (2016).
Clevert, D.-A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). Preprint at arXiv https://arxiv.org/abs/1511.07289 (2015).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Kabsch, W. & Sander, C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983).
pubmed: 6667333
Yang, Y. et al. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Briefings Bioinf. 19, 482–494 (2018).
Zemla, A., Venclovas, C., Moult, J. & Fidelis, K. Processing and analysis of CASP3 protein structure predictions. Proteins 37, 22–29 (1999).
Mariani, V., Biasini, M., Barbato, A. & Schwede, T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29, 2722–2728 (2013).
pubmed: 23986568 pmcid: 3799472
Abriata, L. A., Tamo, G. E. & Dal Peraro, M. A further leap of improvement in tertiary structure prediction in CASP13 prompts new routes for future assessments. Proteins 87, 1100–1112 (2019).
pubmed: 31344267
Kayikci, M. et al. Visualization and analysis of non-covalent contacts using the Protein Contacts Atlas. Nat. Struct. Mol. Biol. 25, 185–194 (2018).
pubmed: 29335563 pmcid: 5837000
Croll, T. I. et al. Evaluation of template-based modeling in CASP13. Proteins 87, 1113–1127 (2019).
pubmed: 31407380 pmcid: 6851432
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning Vol. 70, 3319–3328 (2017).
Abadi, M. et al. Tensorflow: a system for large-scale machine learning. In Proc. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) 265–283 (2016).
Söding, J., Biegert, A. & Lupas, A. N. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33, W244–W248 (2005).
pubmed: 15980461 pmcid: 1160169
Cong, Q. et al. An automatic method for CASP9 free modeling structure prediction assessment. Bioinformatics 27, 3371–3378 (2011).
pubmed: 21994223 pmcid: 3232368
Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).
pubmed: 15849316 pmcid: 1084323
Tovchigrechko, A., Wells, C. A. & Vakser, I. A. Docking of protein models. Protein Sci. 11, 1888–1896 (2002).
pubmed: 12142443 pmcid: 2373684
Audet, M. et al. Crystal structure of misoprostol bound to the labor inducer prostaglandin E
pubmed: 30510194

Auteurs

Andrew W Senior (AW)

DeepMind, London, UK. andrewsenior@google.com.

Richard Evans (R)

DeepMind, London, UK.

John Jumper (J)

DeepMind, London, UK.

James Kirkpatrick (J)

DeepMind, London, UK.

Laurent Sifre (L)

DeepMind, London, UK.

Tim Green (T)

DeepMind, London, UK.

Chongli Qin (C)

DeepMind, London, UK.

Augustin Žídek (A)

DeepMind, London, UK.

Alexander W R Nelson (AWR)

DeepMind, London, UK.

Alex Bridgland (A)

DeepMind, London, UK.

Hugo Penedones (H)

DeepMind, London, UK.

Stig Petersen (S)

DeepMind, London, UK.

Karen Simonyan (K)

DeepMind, London, UK.

Steve Crossan (S)

DeepMind, London, UK.

Pushmeet Kohli (P)

DeepMind, London, UK.

David T Jones (DT)

The Francis Crick Institute, London, UK.
University College London, London, UK.

David Silver (D)

DeepMind, London, UK.

Koray Kavukcuoglu (K)

DeepMind, London, UK.

Demis Hassabis (D)

DeepMind, London, UK.

Articles similaires

Selecting optimal software code descriptors-The case of Java.

Yegor Bugayenko, Zamira Kholmatova, Artem Kruglov et al.
1.00
Software Algorithms Programming Languages
Databases, Protein Protein Domains Protein Folding Proteins Deep Learning
Animals Hemiptera Insect Proteins Phylogeny Insecticides

Exploring blood-brain barrier passage using atomic weighted vector and machine learning.

Yoan Martínez-López, Paulina Phoobane, Yanaima Jauriga et al.
1.00
Blood-Brain Barrier Machine Learning Humans Support Vector Machine Software

Classifications MeSH