An investigation of irreproducibility in maximum likelihood phylogenetic inference.


Journal

Nature Communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
30 11 2020
History:
received: 24 06 2020
accepted: 05 11 2020
entrez: 1 12 2020
pubmed: 2 12 2020
medline: 22 12 2020
Status: epublish

Abstract

Phylogenetic trees are essential for studying biology, but their reproducibility under identical parameter settings remains unexplored. Here, we find that 3515 (18.11%) IQ-TREE-inferred and 1813 (9.34%) RAxML-NG-inferred maximum likelihood (ML) gene trees are topologically irreproducible when executing two replicates (Run1 and Run2) for each of 19,414 gene alignments in 15 animal, plant, and fungal phylogenomic datasets. Notably, coalescent-based ASTRAL species phylogenies inferred from Run1 and Run2 sets of individual gene trees are topologically irreproducible for 9/15 phylogenomic datasets, whereas concatenation-based phylogenies inferred twice from the same supermatrix are reproducible. Our simulations further show that irreproducible phylogenies are more likely to be incorrect than reproducible phylogenies. These results suggest that a considerable fraction of single-gene ML trees may be irreproducible. Increasing reproducibility in ML inference will benefit from providing analyses' log files, which contain typically reported parameters (e.g., program, substitution model, number of tree searches) but also typically unreported ones (e.g., random starting seed number, number of threads, processor type).
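The abstract's notion of two replicate trees being "topologically irreproducible" can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`splits`, `same_topology`) are hypothetical, the parser handles only simple unquoted Newick strings, and trees are compared as unrooted by checking whether their sets of non-trivial bipartitions are identical.

```python
import re

def splits(newick):
    """Return (leaves, non-trivial bipartitions) for a simple Newick string.

    Assumes unquoted labels; branch lengths and internal labels are ignored.
    """
    text = re.sub(r":[^,();]+", "", newick.strip())   # drop ":branch_length"
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    stack, clades, leaves, prev = [], set(), set(), None
    for tok in tokens:
        if tok == "(":
            stack.append(set())              # open a new clade
        elif tok == ")":
            done = stack.pop()               # close the clade, record its leaves
            clades.add(frozenset(done))
            if stack:
                stack[-1] |= done
        elif tok not in ",;" and prev != ")":
            leaves.add(tok)                  # leaf label
            if stack:
                stack[-1].add(tok)
        # a label directly after ")" is an internal label or support value: skipped
        prev = tok
    leaves = frozenset(leaves)
    if not leaves:
        return leaves, set()
    ref = min(leaves)                        # fixed leaf used to pick a canonical side
    result = set()
    for clade in clades:
        other = leaves - clade
        if len(clade) < 2 or len(other) < 2:
            continue                         # trivial split, shared by all trees
        result.add(other if ref in clade else clade)
    return leaves, result

def same_topology(newick_a, newick_b):
    """True if both trees have the same leaves and the same unrooted splits."""
    return splits(newick_a) == splits(newick_b)

run1 = "((A:0.1,B:0.2)90:0.3,(C:0.1,D:0.1):0.2);"
run2 = "(A,(B,(C,D)));"      # same unrooted topology, different rooting
run3 = "((A,C),(B,D));"      # different topology
print(same_topology(run1, run2), same_topology(run1, run3))
```

In this sketch, a gene alignment would count as reproducible when `same_topology` returns `True` for its Run1 and Run2 trees; differences in branch lengths or rooting alone do not count as topological differences.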

Identifiers

pubmed: 33257660
doi: 10.1038/s41467-020-20005-6
pii: 10.1038/s41467-020-20005-6
pmc: PMC7705714

Data repositories

figshare
10.6084/m9.figshare.11917770

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

6096

Authors

Xing-Xing Shen (XX)

State Key Laboratory of Rice Biology, Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 310058, Hangzhou, China. xingxingshen@zju.edu.cn.
Institute of Insect Sciences, Zhejiang University, 310058, Hangzhou, China. xingxingshen@zju.edu.cn.

Yuanning Li (Y)

Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37235, USA.

Chris Todd Hittinger (CT)

Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI, 53706, USA.
DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, 53706, USA.

Xue-Xin Chen (XX)

State Key Laboratory of Rice Biology, Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 310058, Hangzhou, China.
Institute of Insect Sciences, Zhejiang University, 310058, Hangzhou, China.

Antonis Rokas (A)

Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37235, USA. antonis.rokas@vanderbilt.edu.
