Systematic comparison of the protein-protein interaction databases from a user's perspective.
Database and software selection
Database comparisons
Molecular networks
Protein interaction databases
Protein interactions
Systems biology
Journal
Journal of biomedical informatics
ISSN: 1532-0480
Abbreviated title: J Biomed Inform
Country: United States
NLM ID: 100970413
Publication information
Publication date:
03 2020
History:
received: 08 04 2019
revised: 08 11 2019
accepted: 27 01 2020
pubmed: 1 2 2020
medline: 29 7 2021
entrez: 1 2 2020
Status:
ppublish
Abstract
In the absence of periodic systematic comparisons, biologists and bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 important ones (both lists are available at startbioinfo.com), and compared the features and coverage of 16 carefully selected databases related to human PPIs. For these 16 databases, we quantitatively compared the coverage of 'experimentally verified' as well as 'total' (experimentally verified and predicted) PPIs. Coverage was compared in two ways: (a) the PPIs returned by the web interfaces in response to gene queries were compared. The query set consisted of 108 genes that were either differentially expressed across tissues (specific to kidney, testis, or uterus, or ubiquitous, i.e., expressed in 43 normal human tissues) or associated with certain diseases (breast cancer, lung cancer, Alzheimer's, cystic fibrosis, diabetes, and cardiomyopathy). Coverage was also compared for well-studied versus less-studied genes. The coverage of high-quality interactions was assessed separately using a set of literature-curated, experimentally proven PPIs (the gold-standard PPI-set); (b) the back-end data from 15 PPI databases were downloaded and compared. Combined results from STRING and UniHI covered around 84% of 'experimentally verified' PPIs. Approximately 94% of the 'total' PPIs available across the databases were retrieved by the combined use of hPRINT, STRING, and IID. Among the experimentally verified PPIs found exclusively in one database, STRING contributed around 71% of the hits. The coverage of certain databases was skewed for some gene types. Analysis with the gold-standard PPI-set revealed that GPS-Prot, STRING, APID, and HIPPIE each covered ~70% of the curated interactions.
The database usage frequencies did not always correlate with their respective advantages, thereby justifying the need for more frequent studies of this nature.
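As an illustration of the kind of coverage arithmetic described in the abstract, the following is a minimal Python sketch, not the authors' pipeline. It assumes each database can be reduced to a set of unordered protein pairs; the function names (`normalize`, `combined_coverage`, `best_combination`, `exclusive_hits`) and the toy databases `db_a`, `db_b`, and `db_c` are hypothetical and carry no real data.

```python
from itertools import combinations


def normalize(pairs):
    """Store each PPI as an unordered pair so (A, B) and (B, A) match."""
    return {frozenset(pair) for pair in pairs}


def combined_coverage(databases, names):
    """Fraction of the PPIs pooled over all databases found in the named subset."""
    universe = set().union(*databases.values())
    if not universe:
        return 0.0
    covered = set().union(*(databases[name] for name in names))
    return len(covered) / len(universe)


def best_combination(databases, k):
    """Exhaustively pick the k databases with the highest combined coverage."""
    return max(
        combinations(sorted(databases), k),
        key=lambda names: combined_coverage(databases, names),
    )


def exclusive_hits(databases, name):
    """PPIs reported by one database and by no other."""
    others = set().union(
        *(ppis for other, ppis in databases.items() if other != name)
    )
    return databases[name] - others


# Hypothetical toy data, for illustration only.
toy = {
    "db_a": normalize([("P1", "P2"), ("P2", "P3"), ("P3", "P4")]),
    "db_b": normalize([("P2", "P1"), ("P4", "P5")]),
    "db_c": normalize([("P5", "P6")]),
}

if __name__ == "__main__":
    print(combined_coverage(toy, ["db_a", "db_b"]))
    print(best_combination(toy, 2))
    print(len(exclusive_hits(toy, "db_a")))
```

The exhaustive search in `best_combination` is only practical for a small number of databases, such as the 15 or 16 compared here; a real comparison would also need to map gene and protein identifiers to a common namespace before pairs can be matched.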
Identifiers
pubmed: 32001390
pii: S1532-0464(20)30007-1
doi: 10.1016/j.jbi.2020.103380
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
103380
Copyright information
Copyright © 2020 Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of Competing Interest: AB, SD, DD, SO and KB were supported by Shodhaka Life Sciences Pvt. Ltd. Though KKA received financial support only from IBAB, he has also been the founder director of Shodhaka LS Pvt. Ltd. This company supports basic research and the affiliation with this company in no way alters the pure academic nature of the work being reported.