Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.


Journal

Nucleic acids research
ISSN: 1362-4962
Abbreviated title: Nucleic Acids Res
Country: England
NLM ID: 0411011

Publication information

Publication date:
08 01 2020
History:
accepted: 07 11 2019
revised: 09 10 2019
received: 23 09 2019
pubmed: 17 11 2019
medline: 30 5 2020
entrez: 17 11 2019
Status: ppublish

Abstract

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and no longer requiring releases to be synchronised between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements make it easier to open Genome3D up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.

Identifiers

pubmed: 31733063
pii: 5626529
doi: 10.1093/nar/gkz967
pmc: PMC7139969

Chemical substances

Proteins 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

D314-D319

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BB/N019431/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/K020013/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/M011712/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/M011526/1
Country: United Kingdom
Agency: Medical Research Council
ID: MC_U105192716
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/N019172/1
Country: United Kingdom

Copyright information

© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


Authors

Ian Sillitoe (I)

Institute of Structural and Molecular Biology, UCL, Gower Street, London WC1E 6BT, UK.

Antonina Andreeva (A)

MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

Tom L Blundell (TL)

Department of Biochemistry, University of Cambridge, Old Addenbrooke's Site, 80 Tennis Court Road, Cambridge CB2 0QH, UK.

Daniel W A Buchan (DWA)

Department of Computer Science, UCL, Gower Street, London WC1E 6BT, UK.
The Francis Crick Institute, 1 Midland Rd, London NW1 1AT, UK.

Robert D Finn (RD)

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

Julian Gough (J)

MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

David Jones (D)

Department of Computer Science, UCL, Gower Street, London WC1E 6BT, UK.
The Francis Crick Institute, 1 Midland Rd, London NW1 1AT, UK.

Lawrence A Kelley (LA)

Centre for Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.

Typhaine Paysan-Lafosse (T)

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

Su Datt Lam (SD)

Institute of Structural and Molecular Biology, UCL, Gower Street, London WC1E 6BT, UK.
Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor 43600, Malaysia.

Alexey G Murzin (AG)

MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

Arun Prasad Pandurangan (AP)

MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

Gustavo A Salazar (GA)

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

Marcin J Skwark (MJ)

Department of Biochemistry, University of Cambridge, Old Addenbrooke's Site, 80 Tennis Court Road, Cambridge CB2 0QH, UK.

Michael J E Sternberg (MJE)

Centre for Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.

Sameer Velankar (S)

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

Christine Orengo (C)

Institute of Structural and Molecular Biology, UCL, Gower Street, London WC1E 6BT, UK.
