SNPtotree-Resolving the Phylogeny of SNPs on Non-Recombining DNA.

Keywords: SNPs; evolutionary genetics; haploid markers; non-recombining DNA; phylogenetic tree; population genetics; software

Journal

Genes
ISSN: 2073-4425
Abbreviated title: Genes (Basel)
Country: Switzerland
NLM ID: 101551097

Publication information

Publication date:
22 09 2023
History:
received: 25 08 2023
revised: 18 09 2023
accepted: 21 09 2023
medline: 30 10 2023
pubmed: 28 10 2023
entrez: 28 10 2023
Status: epublish

Abstract

Genetic variants on non-recombining DNA and the hierarchical order in which they accumulate are commonly of interest. This variant hierarchy can be established and combined with information on the population and geographic origin of the individuals carrying the variants to find population structures and infer migration patterns. Further, individuals can be assigned to the characterized populations, which is relevant in forensic genetics, genetic genealogy, and epidemiologic studies. However, there is currently no straightforward method to obtain such a variant hierarchy. Here, we introduce the software SNPtotree v1.0, which uniquely determines the hierarchical order of variants on non-recombining DNA without error-prone manual sorting. The algorithm uses pairwise variant comparisons to infer their relationships and integrates the combined information into a phylogenetic tree. Variants that have contradictory pairwise relationships or ambiguous positions in the tree are removed by the software. When benchmarked using two human Y-chromosomal massively parallel sequencing datasets, SNPtotree outperforms traditional methods in the accuracy of phylogenetic trees for sequencing data with high amounts of missing information. The phylogenetic trees of variants created using SNPtotree can be used to establish and maintain publicly available phylogeny databases to further explore genetic epidemiology and genealogy, as well as population and forensic genetics.
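
The pairwise comparison step described in the abstract can be illustrated with a minimal sketch. It is not the SNPtotree implementation: the function name `pairwise_relation`, the 1/0/`None` encoding of derived, ancestral, and missing calls, and the relation labels are assumptions made for illustration only. The sketch classifies two variants by which combinations of derived alleles are observed among individuals typed at both sites.

```python
from typing import Optional, Sequence

# Assumed encoding: 1 = derived allele, 0 = ancestral allele, None = missing call
Allele = Optional[int]


def pairwise_relation(a: Sequence[Allele], b: Sequence[Allele]) -> str:
    """Classify two variants using only individuals typed at both sites."""
    both = only_a = only_b = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # a missing call at either variant is uninformative
        if x == 1 and y == 1:
            both += 1
        elif x == 1:
            only_a += 1
        elif y == 1:
            only_b += 1

    if both and only_a and only_b:
        # All three derived combinations cannot fit on one tree without
        # recurrent mutation or genotyping error
        return "contradictory"
    if both and only_a:
        return "a upstream of b"  # carriers of b are a subset of carriers of a
    if both and only_b:
        return "b upstream of a"
    if both:
        return "equivalent"  # same carriers observed, same branch
    if only_a and only_b:
        return "parallel"  # disjoint carriers, different branches
    return "ambiguous"  # too little shared information to decide


if __name__ == "__main__":
    snp_a = [1, 1, 1, 0, None]
    snp_b = [1, 0, None, 0, 1]
    print(pairwise_relation(snp_a, snp_b))
```

In this sketch, a variant involved in "contradictory" pairs, or one whose pairs are mostly "ambiguous", would be a candidate for removal before the remaining relations are assembled into a tree, mirroring the filtering the abstract describes.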

Identifiers

pubmed: 37895186
pii: genes14101837
doi: 10.3390/genes14101837
pmc: PMC10606150

Chemical substances

DNA 9007-49-2

Publication types

Journal Article

Languages

eng

Citation subsets

IM


Authors

Zehra Köksal (Z)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2100 Copenhagen, Denmark.

Claus Børsting (C)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2100 Copenhagen, Denmark.

Leonor Gusmão (L)

DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro 20550-013, Brazil.

Vania Pereira (V)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2100 Copenhagen, Denmark.
