The effects of sequencing strategies on Metagenomic pathogen detection using bronchoalveolar lavage fluid samples.

Keywords: Bioinformatics analysis; Dataset size; Pathogen detection; Read length; mNGS

Journal

Heliyon
ISSN: 2405-8440
Abbreviated title: Heliyon
Country: England
NLM ID: 101672560

Publication information

Publication date:
15 Jul 2024
History:
received: 03 12 2023
revised: 17 06 2024
accepted: 21 06 2024
medline: 19 7 2024
pubmed: 19 7 2024
entrez: 19 7 2024
Status: epublish

Abstract

Metagenomic next-generation sequencing (mNGS) is a powerful tool for pathogen detection. Its accuracy depends on both wet-lab and dry-lab procedures. The objective of our study was to assess the influence of read length and dataset size on pathogen detection. In this study, 43 clinical bronchoalveolar lavage fluid (BALF) samples, which tested positive via clinical mNGS and were consistent with the diagnosis, were re-sequenced on the Illumina NovaSeq 6000 platform. The raw re-sequencing data, consisting of 100 million (M) paired-end 150 bp (PE150) reads, were divided into simulated datasets with eight different data sizes (5 M, 10 M, 15 M, 20 M, 30 M, 50 M, 75 M, 100 M) and five different read lengths (single-end 50 bp (SE50), SE75, SE100, PE100, and PE150). Both the Kraken2 and IDseq bioinformatics pipelines were used to analyze the previously diagnosed pathogens in the simulated data. Detection of pathogens was based on read-count thresholds ranging from 1 to 10 and reads-per-million (RPM) thresholds ranging from 0.2 to 2. Our results revealed that increasing dataset size and read length can enhance the performance of mNGS in pathogen detection. However, larger data sizes for mNGS incur higher economic costs and longer turnaround times for data analysis. Our findings indicate that 20 M reads are sufficient for SE75 mode to achieve high recall rates. Additionally, high nucleic acid loads in samples can make pathogen detection efficiency more stable, reducing the impact of the sequencing strategy. The choice of bioinformatics pipeline had a significant impact on the recall rates achieved in pathogen detection. Increasing dataset size and read length can enhance the performance of mNGS in pathogen detection but increases the economic and time costs of sequencing and data analysis. Currently, 20 M reads in SE75 mode may be the best sequencing option.
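The abstract describes detection calls based on read-count and RPM thresholds, summarized as recall rates. The following is a minimal illustrative sketch of that kind of calculation, not the authors' pipeline. It assumes RPM means pathogen reads scaled to one million sequenced reads, and that a threshold is met when the value is greater than or equal to it. The function names, the way the two criteria can be combined, and the example numbers are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def rpm(pathogen_reads: int, total_reads: int) -> float:
    """Reads per million: pathogen reads scaled to 1,000,000 sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return pathogen_reads * 1_000_000 / total_reads


def is_detected(
    pathogen_reads: int,
    total_reads: int,
    min_reads: Optional[int] = None,
    min_rpm: Optional[float] = None,
) -> bool:
    """Apply whichever thresholds are given (assumption: '>=' counts as a hit).

    The abstract does not say whether the read-count and RPM criteria were
    applied separately or jointly; here every supplied threshold must pass.
    """
    if min_reads is not None and pathogen_reads < min_reads:
        return False
    if min_rpm is not None and rpm(pathogen_reads, total_reads) < min_rpm:
        return False
    return True


def recall(
    samples: Iterable[Tuple[int, int]],
    min_reads: Optional[int] = None,
    min_rpm: Optional[float] = None,
) -> float:
    """Fraction of known-positive samples in which the pathogen is called.

    Each sample is (pathogen_reads, total_reads) for its diagnosed pathogen.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    hits = sum(is_detected(p, t, min_reads, min_rpm) for p, t in samples)
    return hits / len(samples)


# Hypothetical counts for four known-positive samples at a 20 M dataset size.
example = [(4, 20_000_000), (1, 20_000_000), (30, 20_000_000), (0, 20_000_000)]
print(recall(example, min_reads=3))    # read-count criterion only
print(recall(example, min_rpm=1.0))    # RPM criterion only
```

Sweeping `min_reads` from 1 to 10 or `min_rpm` from 0.2 to 2 over each simulated dataset would give one recall value per threshold, dataset size, and read length, which is the shape of comparison the abstract reports.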

Identifiers

pubmed: 39027502
doi: 10.1016/j.heliyon.2024.e33429
pii: S2405-8440(24)09460-X
pmc: PMC11255660

Publication types

Journal Article

Languages

eng

Pagination

e33429

Copyright information

© 2024 The Authors. Published by Elsevier Ltd.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Ziyang Li (Z)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

Zhe Guo (Z)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

Weimin Wu (W)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

Li Tan (L)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

Qichen Long (Q)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

Han Xia (H)

School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.
MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Min Hu (M)

Department of Laboratory Medicine, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.
Center for Clinical Molecular Diagnostics, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, China.

MeSH classifications