Facilitating accessible, rapid, and appropriate processing of ancient metagenomic data with AMDirT.
FAIR data
aDNA
environmental
metadata
metagenomics
microbial
microbiome
palaeogenomics
Journal
F1000Research
ISSN: 2046-1402
Titre abrégé: F1000Res
Pays: England
ID NLM: 101594320
Informations de publication
Date de publication:
2023
2023
Historique:
accepted:
07
05
2024
medline:
12
9
2024
pubmed:
12
9
2024
entrez:
12
9
2024
Statut:
epublish
Résumé
Access to sample-level metadata is important when selecting public metagenomic sequencing datasets for reuse in new biological analyses. The Standards, Precautions, and Advances in Ancient Metagenomics community (SPAAM, https://spaam-community.org) has previously published AncientMetagenomeDir, a collection of curated and standardised sample metadata tables for metagenomic and microbial genome datasets generated from ancient samples. However, while sample-level information is useful for identifying relevant samples for inclusion in new projects, Next Generation Sequencing (NGS) library construction and sequencing metadata are also essential for appropriately reprocessing ancient metagenomic data. Currently, recovering information for downloading and preparing such data is difficult when laboratory and bioinformatic metadata is heterogeneously recorded in prose-based publications. Through a series of community-based hackathon events, AncientMetagenomeDir was updated to provide standardised library-level metadata of existing and new ancient metagenomic samples. In tandem, the companion tool 'AMDirT' was developed to facilitate rapid data filtering and downloading of ancient metagenomic data, as well as improving automated metadata curation and validation for AncientMetagenomeDir. AncientMetagenomeDir was extended to include standardised metadata of over 6000 ancient metagenomic libraries. The companion tool 'AMDirT' provides both graphical- and command-line interface based access to such metadata for users from a wide range of computational backgrounds. We also report on errors with metadata reporting that appear to commonly occur during data upload and provide suggestions on how to improve the quality of data sharing by the community. Together, both standardised metadata reporting and tooling will help towards easier incorporation and reuse of public ancient metagenomic datasets into future analyses.
Sections du résumé
Background
UNASSIGNED
Access to sample-level metadata is important when selecting public metagenomic sequencing datasets for reuse in new biological analyses. The Standards, Precautions, and Advances in Ancient Metagenomics community (SPAAM, https://spaam-community.org) has previously published AncientMetagenomeDir, a collection of curated and standardised sample metadata tables for metagenomic and microbial genome datasets generated from ancient samples. However, while sample-level information is useful for identifying relevant samples for inclusion in new projects, Next Generation Sequencing (NGS) library construction and sequencing metadata are also essential for appropriately reprocessing ancient metagenomic data. Currently, recovering information for downloading and preparing such data is difficult when laboratory and bioinformatic metadata is heterogeneously recorded in prose-based publications.
Methods
UNASSIGNED
Through a series of community-based hackathon events, AncientMetagenomeDir was updated to provide standardised library-level metadata of existing and new ancient metagenomic samples. In tandem, the companion tool 'AMDirT' was developed to facilitate rapid data filtering and downloading of ancient metagenomic data, as well as improving automated metadata curation and validation for AncientMetagenomeDir.
Results
UNASSIGNED
AncientMetagenomeDir was extended to include standardised metadata of over 6000 ancient metagenomic libraries. The companion tool 'AMDirT' provides both graphical- and command-line interface based access to such metadata for users from a wide range of computational backgrounds. We also report on errors with metadata reporting that appear to commonly occur during data upload and provide suggestions on how to improve the quality of data sharing by the community.
Conclusions
UNASSIGNED
Together, both standardised metadata reporting and tooling will help towards easier incorporation and reuse of public ancient metagenomic datasets into future analyses.
Identifiants
pubmed: 39262445
doi: 10.12688/f1000research.134798.2
pmc: PMC11387932
doi:
Substances chimiques
DNA, Ancient
0
Types de publication
Journal Article
Langues
eng
Sous-ensembles de citation
IM
Pagination
926Informations de copyright
Copyright: © 2024 Borry M et al.
Déclaration de conflit d'intérêts
No competing interests were disclosed.