SARS-CoV-2 lineage assignments using phylogenetic placement/UShER are superior to pangoLEARN machine-learning method.
Bioinformatics
COVID-19
Phylogenetics
variants
Journal
Virus Evolution
ISSN: 2057-1577
Abbreviated title: Virus Evol
Country: England
NLM ID: 101664675
Publication information
Publication date:
2024
History:
received: 5 June 2023
revised: 13 December 2023
accepted: 5 January 2024
medline: 16 February 2024
pubmed: 16 February 2024
entrez: 16 February 2024
Status:
epublish
Abstract
With the rapid spread and evolution of SARS-CoV-2, the ability to monitor its transmission and distinguish among viral lineages is critical for pandemic response efforts. The most commonly used software for the lineage assignment of newly isolated SARS-CoV-2 genomes is pangolin, which offers two methods of assignment, pangoLEARN and pUShER. PangoLEARN rapidly assigns lineages using a machine-learning algorithm, while pUShER performs a phylogenetic placement to identify the lineage corresponding to a newly sequenced genome. In a preliminary study, we observed that pangoLEARN (decision tree model), while substantially faster than pUShER, offered less consistency across different versions of pangolin v3. Here, we expand upon this analysis to include v3 and v4 of pangolin, which moved the default algorithm for lineage assignment from pangoLEARN in v3 to pUShER in v4, and perform a thorough analysis confirming that pUShER is not only more stable across versions but also more accurate. Our findings suggest that future lineage assignment algorithms for various pathogens should consider the value of phylogenetic placement.
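As an illustration of what "stability across versions" can mean in practice, the following minimal Python sketch computes the fraction of genomes that receive the same lineage from every software version compared. It is not the authors' method or metric: the function name `stability_fraction`, the version labels, the genome identifiers, and the lineage calls are all hypothetical, and the input is assumed to be a simple mapping from version to per-genome lineage calls.

```python
from typing import Dict


def stability_fraction(calls_by_version: Dict[str, Dict[str, str]]) -> float:
    """Return the fraction of shared genomes assigned the same lineage by every version.

    calls_by_version maps a version label to a {genome_id: lineage} mapping.
    Only genomes present in all versions are considered.
    """
    versions = list(calls_by_version.values())
    if not versions:
        raise ValueError("need calls from at least one version")

    # Genomes that every version produced a call for.
    shared_ids = set(versions[0]).intersection(*versions[1:])
    if not shared_ids:
        raise ValueError("no genome is shared by all versions")

    # A genome is stable when all versions agree on a single lineage.
    stable = sum(
        1 for gid in shared_ids if len({calls[gid] for calls in versions}) == 1
    )
    return stable / len(shared_ids)


# Hypothetical example data (made up for illustration, not from the study).
example_calls = {
    "version_a": {"genome1": "B.1.1.7", "genome2": "B.1.617.2", "genome3": "B.1"},
    "version_b": {"genome1": "B.1.1.7", "genome2": "AY.4", "genome3": "B.1"},
}
example_stability = stability_fraction(example_calls)
```

In this made-up example, two of the three genomes keep the same call across both versions. A stability score of this kind says nothing about accuracy, which needs a separate comparison against a trusted reference assignment.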
Identifiers
pubmed: 38361813
doi: 10.1093/ve/vead085
pii: vead085
pmc: PMC10868549
Publication types
Journal Article
Languages
eng
Pagination
vead085
Copyright information
© The Author(s) 2024. Published by Oxford University Press.
Conflict of interest statement
None declared.