PDBrenum: A webserver and program providing Protein Data Bank files renumbered according to their UniProt sequences.

Amino Acid Sequence; Animals; Databases, Protein; Humans; Internet / organization & administration; Protein Conformation

Journal

PloS one

ISSN: 1932-6203

Abbreviated title: PLoS One

Country: United States

NLM ID: 101285081

Publication information

Publication date:
2021

History:

received: 5 May 2021

accepted: 5 June 2021

entrez: 6 July 2021

pubmed: 7 July 2021

medline: 3 November 2021

Status: epublish

Abstract

The Protein Data Bank (PDB) was established at Brookhaven National Laboratories in 1971 as an archive for biological macromolecular crystal structures. In mid 2021, the database has almost 180,000 structures solved by X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and other methods. Many proteins have been studied under different conditions, including binding partners such as ligands, nucleic acids, or other proteins; mutations, and post-translational modifications, thus enabling extensive comparative structure-function studies. However, these studies are made more difficult because authors are allowed by the PDB to number the amino acids in each protein sequence in any manner they wish. This results in the same protein being numbered differently in the available PDB entries. For instance, some authors may include N-terminal signal peptides or the N-terminal methionine in the sequence numbering and others may not. In addition to the coordinates, there are many fields that contain structural and functional information regarding specific residues numbered according to the author. Here we provide a webserver and Python3 application that fixes the PDB sequence numbering problem by replacing the author numbering with numbering derived from the corresponding UniProt sequences. We obtain this correspondence from the SIFTS database from PDBe. The server and program can take a list of PDB entries or a list of UniProt identifiers (e.g., "P04637" or "P53_HUMAN") and provide renumbered files in mmCIF format and the legacy PDB format for both asymmetric unit files and biological assembly files provided by PDBe.
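
The renumbering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the PDBrenum implementation: the `Residue` type, the `renumber` function, the dictionary-shaped mapping, and the toy offset are all hypothetical, chosen only to show how a per-residue correspondence (such as the one SIFTS provides) replaces author numbering with UniProt numbering. Residues with no UniProt counterpart are returned as `None` here, which is an assumption of this sketch rather than the tool's documented behavior.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Residue(NamedTuple):
    chain: str
    author_num: int  # residue number chosen by the depositing author
    name: str        # three-letter residue name


# Hypothetical per-residue mapping in the spirit of SIFTS:
# (chain, author residue number) -> UniProt residue number.
Mapping = Dict[Tuple[str, int], int]


def renumber(
    residues: List[Residue], mapping: Mapping
) -> List[Tuple[Residue, Optional[int]]]:
    """Pair each residue with its UniProt number, or None if unmapped."""
    return [(r, mapping.get((r.chain, r.author_num))) for r in residues]


# Toy data: the author numbered the construct from 1, while the construct
# starts at UniProt position 94 (an illustrative offset, not real data).
residues = [
    Residue("A", 1, "SER"),
    Residue("A", 2, "SER"),
    Residue("A", 3, "VAL"),
]
# Author residue 3 has no UniProt counterpart (for example, a tag residue).
mapping = {("A", 1): 94, ("A", 2): 95}

renumbered = renumber(residues, mapping)
for res, uniprot_num in renumbered:
    print(res.chain, res.name, res.author_num, "->", uniprot_num)
```

A lookup keyed on chain and author number, rather than a single constant offset, is used because insertions, deletions, and tags can make the offset vary along a chain.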

Identifiers

pubmed: 34228733

doi: 10.1371/journal.pone.0253411

pii: PONE-D-21-14953

pmc: PMC8259974

Publication types

Journal Article; Research Support, N.I.H., Extramural

Languages

eng

Pagination

e0253411

Grants

Agency: NIGMS NIH HHS

ID: R35 GM122517

Country: United States

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Bulat Faezov (B)

Roland L Dunbrack (RL)

Articles similaires

[Redispensing of expensive oral anticancer medicines: a practical application].

Smoking Cessation and Incident Cardiovascular Disease.

Evaluation of Low-Value Services Across Major Medicare Advantage Insurers and Traditional Medicare.

Effectiveness of Virtual Yoga for Chronic Low Back Pain: A Randomized Clinical Trial.

Classifications MeSH