UniRule: a unified rule resource for automatic annotation in the UniProt Knowledgebase.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
01 11 2020
History:
received: 30 01 2020
revised: 13 04 2020
accepted: 05 05 2020
pubmed: 14 5 2020
medline: 20 2 2021
entrez: 14 5 2020
Status: ppublish

Abstract

The number of protein records in the UniProt Knowledgebase (UniProtKB: https://www.uniprot.org) continues to grow rapidly as a result of genome sequencing and the prediction of protein-coding genes. Providing functional annotation for these proteins presents a significant and continuing challenge. In response, UniProt has developed a method of annotation, known as UniRule, based on expertly curated rules. UniRule integrates related systems (RuleBase, HAMAP, PIRSR, PIRNR) developed by the members of the UniProt consortium. It uses protein family signatures from InterPro, combined with taxonomic and other constraints, to select sets of reviewed proteins that have common functional properties supported by experimental evidence. This annotation is propagated to unreviewed records in UniProtKB that meet the same selection criteria, most of which do not have (and are never likely to have) experimentally verified functional annotation. Release 2020_01 of UniProtKB contains 6496 UniRule rules, which provide annotation for 53 million proteins, accounting for 30% of the 178 million records in UniProtKB. UniRule thus provides scalable enrichment of annotation in UniProtKB. UniRule rules are integrated into UniProtKB and can be viewed at https://www.uniprot.org/unirule/. The rules, and the code required to run them, are publicly available for researchers who wish to annotate their own sequences. The implementation used to run the rules is known as UniFIRE and is available at https://gitlab.ebi.ac.uk/uniprot-public/unifire.
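
The selection-and-propagation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the UniFIRE implementation and does not reflect the actual UniRule rule format: the class names, field names, signature identifiers, accessions and annotation text below are all hypothetical and chosen only for illustration. A rule here is assumed to consist of a set of required signatures, one taxonomic constraint and the annotations to propagate. The last function simply reproduces the coverage arithmetic quoted in the abstract.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Toy rule (hypothetical format): conditions plus annotations to propagate."""
    rule_id: str
    required_signatures: frozenset  # signature IDs that must all be present
    required_taxon: str             # taxon that must appear in the lineage
    annotations: tuple              # annotation strings attached on a match


@dataclass
class ProteinRecord:
    """Toy unreviewed record (hypothetical fields)."""
    accession: str
    signatures: set
    lineage: list
    annotations: list = field(default_factory=list)


def rule_matches(rule, protein):
    """True if the protein has every required signature and the required taxon."""
    return (rule.required_signatures <= protein.signatures
            and rule.required_taxon in protein.lineage)


def apply_rules(rules, proteins):
    """Attach each matching rule's annotations, keeping the rule ID as evidence."""
    for protein in proteins:
        for rule in rules:
            if rule_matches(rule, protein):
                for annotation in rule.annotations:
                    protein.annotations.append((annotation, rule.rule_id))
    return proteins


def coverage_percent(annotated, total):
    """Share of records that received annotation, as a percentage."""
    return 100.0 * annotated / total


# Hypothetical example data, not taken from UniProtKB or InterPro.
example_rules = [
    Rule("RULE_EXAMPLE_1", frozenset({"SIG_EXAMPLE_A"}), "Bacteria",
         ("Function: example hydrolase activity",)),
]
example_proteins = [
    ProteinRecord("ACC_EXAMPLE_1", {"SIG_EXAMPLE_A", "SIG_EXAMPLE_B"},
                  ["Bacteria", "Proteobacteria"]),
    ProteinRecord("ACC_EXAMPLE_2", {"SIG_EXAMPLE_A"}, ["Eukaryota", "Metazoa"]),
]

for record in apply_rules(example_rules, example_proteins):
    print(record.accession, record.annotations)

# Figures quoted in the abstract: 53 million of 178 million records.
print(round(coverage_percent(53_000_000, 178_000_000)))
```

In this sketch only the first record satisfies both the signature and the taxonomic condition, so only it receives the annotation. Storing the rule identifier next to each annotation is a design choice made here so that propagated annotation stays traceable to the rule that produced it.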

Identifiers

pubmed: 32399560
pii: 5836494
doi: 10.1093/bioinformatics/btaa485
pmc: PMC7750954

Chemical substances

Proteins 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

4643-4648

Grants

Agency: British Heart Foundation
ID: RG/13/5/30112
Country: United Kingdom
Agency: NHGRI NIH HHS
ID: U41 HG002273
Country: United States
Agency: Parkinson's UK
ID: G-1307
Country: United Kingdom
Agency: NIGMS NIH HHS
ID: P20 GM103446
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG007822
Country: United States
Agency: NLM NIH HHS
ID: G08 LM010720
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM080646
Country: United States

Investigators

Alex Bateman (A)
Alan Bridge (A)
Cathy Wu (C)
Cecilia Arighi (C)
Lionel Breuza (L)
Elisabeth Coudert (E)
Hongzhan Huang (H)
Damien Lieberherr (D)
Michele Magrane (M)
Maria J Martin (MJ)
Peter McGarvey (P)
Darren Natale (D)
Sandra Orchard (S)
Ivo Pedruzzi (I)
Sylvain Poux (S)
Manuela Pruess (M)
Shriya Raj (S)
Nicole Redaschi (N)
Lucila Aimo (L)
Ghislaine Argoud-Puy (G)
Andrea Auchincloss (A)
Kristian Axelsen (K)
Emmanuel Boutet (E)
Emily Bowler (E)
Ramona Britto (R)
Hema Bye-A-Jee (H)
Cristina Casals-Casas (C)
Paul Denny (P)
Anne Estreicher (A)
Maria Livia Famiglietti (ML)
Marc Feuermann (M)
John S Garavelli (JS)
Penelope Garmiri (P)
Arnaud Gos (A)
Nadine Gruaz (N)
Emma Hatton-Ellis (E)
Chantal Hulo (C)
Nevila Hyka-Nouspikel (N)
Florence Jungo (F)
Kati Laiho (K)
Philippe Le Mercier (P)
Antonia Lock (A)
Yvonne Lussi (Y)
Alistair MacDougall (A)
Patrick Masson (P)
Anne Morgat (A)
Sandrine Pilbout (S)
Lucille Pourcel (L)
Catherine Rivoire (C)
Karen Ross (K)
Christian Sigrist (C)
Elena Speretta (E)
Shyamala Sundaram (S)
Nidhi Tyagi (N)
C R Vinayaka (CR)
Qinghua Wang (Q)
Kate Warner (K)
Lai-Su Yeh (LS)
Rossana Zaru (R)
Shadab Ahmed (S)
Emanuele Alpi (E)
Leslie Arminski (L)
Parit Bansal (P)
Delphine Baratin (D)
Teresa Batista Neto (TB)
Jerven Bolleman (J)
Chuming Chen (C)
Yongxing Chen (Y)
Beatrice Cuche (B)
Austra Cukura (A)
Edouard De Castro (E)
ThankGod Ebenezer (T)
Elisabeth Gasteiger (E)
Sebastien Gehant (S)
Leonardo Gonzales (L)
Abdulrahman Hussein (A)
Alexandr Ignatchenko (A)
Giuseppe Insana (G)
Rizwan Ishtiaq (R)
Vishal Joshi (V)
Dushyanth Jyothi (D)
Arnaud Kerhornou (A)
Thierry Lombardot (T)
Aurelian Luciani (A)
Jie Luo (J)
Mahdi Mahmoudy (M)
Alok Mishra (A)
Katie Moulang (K)
Andrew Nightingale (A)
Joseph Onwubiko (J)
Monica Pozzato (M)
Sangya Pundir (S)
Guoying Qi (G)
Daniel Rice (D)
Rabie Saidi (R)
Edward Turner (E)
Preethi Vasudev (P)
Yuqi Wang (Y)
Xavier Watkins (X)
Hermann Zellner (H)
Jian Zhang (J)

Comments and corrections

Type: ErratumIn

Copyright information

© The Author(s) 2020. Published by Oxford University Press.

Authors

Alistair MacDougall (A)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Vladimir Volynkin (V)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Rabie Saidi (R)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Diego Poggioli (D)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Kantar Consulting, Casalecchio Di Reno, 40033 Bologna, Italy.

Hermann Zellner (H)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Emma Hatton-Ellis (E)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Vishal Joshi (V)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Claire O'Donovan (C)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Sandra Orchard (S)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Andrea H Auchincloss (AH)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Delphine Baratin (D)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Jerven Bolleman (J)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Elisabeth Coudert (E)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Edouard de Castro (E)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Chantal Hulo (C)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Patrick Masson (P)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Ivo Pedruzzi (I)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Catherine Rivoire (C)

SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211 Geneva 4, Switzerland.

Cecilia Arighi (C)

Protein Information Resource, University of Delaware, Newark, DE 19711, USA.

Qinghua Wang (Q)

Protein Information Resource, University of Delaware, Newark, DE 19711, USA.

Chuming Chen (C)

Protein Information Resource, University of Delaware, Newark, DE 19711, USA.

Hongzhan Huang (H)

Protein Information Resource, University of Delaware, Newark, DE 19711, USA.

John Garavelli (J)

Protein Information Resource, University of Delaware, Newark, DE 19711, USA.

C R Vinayaka (CR)

Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.

Lai-Su Yeh (LS)

Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.

Darren A Natale (DA)

Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.

Kati Laiho (K)

Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.

Maria-Jesus Martin (MJ)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Alexandre Renaux (A)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Klemens Pichler (K)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
