Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models.


Journal

bioRxiv: the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187

Publication information

Publication date:
12 Feb 2024
History:
medline: 26 Feb 2024
pubmed: 26 Feb 2024
entrez: 26 Feb 2024
Status: epublish

Abstract

Every step in common microbiome profiling protocols has variable efficiency for each microbe. For example, different DNA extraction kits may have different efficiency for Gram-positive and -negative bacteria. These variable efficiencies, combined with technical variation, create strong processing biases, which impede the identification of signals that are reproducible across studies and the development of generalizable and biologically interpretable prediction models. "Batch-correction" methods have been used to alleviate these issues computationally with some success. However, many make strong parametric assumptions which do not necessarily apply to microbiome data or processing biases, or require the use of an outcome variable, which risks overfitting. Lastly and importantly, existing transformations used to correct microbiome data are largely non-interpretable, and could, for example, introduce values to features that were initially mostly zeros. Altogether, processing bias currently compromises our ability to glean robust and generalizable biological insights from microbiome data. Here, we present DEBIAS-M (

Identifiers

pubmed: 38405914
doi: 10.1101/2024.02.09.579716
pmc: PMC10888995

Publication types

Preprint

Languages

eng

Authors

MeSH classifications