Evaluation of recombination detection methods for viral sequencing.

bioinformatics; recombination; recombination detection methods

Journal

Virus evolution
ISSN: 2057-1577
Abbreviated title: Virus Evol
Country: England
NLM ID: 101664675

Publication information

Publication date:
2023
History:
received: 24 04 2023
revised: 03 08 2023
accepted: 15 11 2023
medline: 22 12 2023
pubmed: 22 12 2023
entrez: 22 12 2023
Status: epublish

Abstract

Recombination is a key evolutionary driver in shaping novel viral populations and lineages. When unaccounted for, recombination can impact evolutionary estimations or complicate their interpretation. Therefore, identifying signals for recombination in sequencing data is a key prerequisite to further analyses. A repertoire of recombination detection methods (RDMs) has been developed over the past two decades; however, the prevalence of pandemic-scale viral sequencing data poses a computational challenge for existing methods. Here, we assessed eight RDMs to determine whether any are suitable for the analysis of bulk sequencing data: PhiPack (Profile), 3SEQ, GENECONV, recombination detection program (RDP) (OpenRDP), MaxChi (OpenRDP), Chimaera (OpenRDP), UCHIME (VSEARCH), and gmos. To test the performance and scalability of these methods, we analysed simulated viral sequencing data across a range of sequence diversities, recombination frequencies, and sample sizes. Furthermore, we provide a practical example for the analysis and validation of empirical data. We find that RDMs need to be scalable, use an analytical approach and resolution suitable for the intended research application, and be accurate for the properties of a given dataset (e.g. sequence diversity and estimated recombination frequency). Analysis of simulated and empirical data revealed that the assessed methods exhibited considerable trade-offs between these criteria. Overall, we provide general guidelines for the validation of recombination detection results, the benefits and shortcomings of each assessed method, and future considerations for recombination detection methods for the assessment of large-scale viral sequencing data.

Identifiers

pubmed: 38131005
doi: 10.1093/ve/vead066
pii: vead066
pmc: PMC10734630

Databanks

Dryad
10.5061/dryad.d7wm37q6f

Publication types

Journal Article

Languages

eng

Pagination

vead066

Copyright information

© The Author(s) 2023. Published by Oxford University Press.

Conflict of interest statement

A.E.D. is employed by Illumina Australia Pty Ltd and holds a financial interest in its parent company Illumina Inc.

Authors

Frederick R Jaya (FR)

Australian Institute for Microbiology & Infection, University of Technology Sydney, 15 Broadway, Ultimo, New South Wales 2007, Australia.
Ecology and Evolution, Research School of Biology, Australian National University, 134 Linnaeus Way, Acton, Australian Capital Territory 2600, Australia.

Barbara P Brito (BP)

Australian Institute for Microbiology & Infection, University of Technology Sydney, 15 Broadway, Ultimo, New South Wales 2007, Australia.
New South Wales Department of Primary Industries, Elizabeth Macarthur Agricultural Institute, Woodbridge Road, Menangle, New South Wales 2568, Australia.

Aaron E Darling (AE)

Australian Institute for Microbiology & Infection, University of Technology Sydney, 15 Broadway, Ultimo, New South Wales 2007, Australia.
Illumina Australia Pty Ltd, Ultimo, New South Wales 2007, Australia.

MeSH classifications