Mapping global dynamics of benchmark creation and saturation in artificial intelligence.

Artificial Intelligence; Benchmarking / methods; Ecosystem; Physical Phenomena

Journal

Nature Communications

ISSN: 2041-1723

Abbreviated title: Nat Commun

Country: England

NLM ID: 101528555

Publication information

Publication date:
10 Nov 2022

History:

received: 24 Mar 2022

accepted: 31 Oct 2022

entrez: 10 Nov 2022

pubmed: 11 Nov 2022

medline: 15 Nov 2022

Status: epublish

Abstract

Benchmarks are crucial to measuring and steering progress in artificial intelligence (AI). However, recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, we introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation. We curate data for 3765 benchmarks covering the entire domains of computer vision and natural language processing, and show that a large fraction of benchmarks quickly trends towards near-saturation, that many benchmarks fail to find widespread utilization, and that benchmark performance gains for different AI tasks are prone to unforeseen bursts. We analyze attributes associated with benchmark popularity, and conclude that future benchmarks should emphasize versatility, breadth and real-world utility.

Identifiers

DOI: 10.1038/s41467-022-34591-0

PMID: 36357391

PMC: PMC9649641

pii: 10.1038/s41467-022-34591-0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

6793

Authors

Simon Ott (S)

Adriano Barbosa-Silva (A)

Kathrin Blagec (K)

Jan Brauner (J)

Matthias Samwald (M)
