Analysis of Paralogs in Target Enrichment Data Pinpoints Multiple Ancient Polyploidy Events in Alchemilla s.l. (Rosaceae).


Journal

Systematic Biology
ISSN: 1076-836X
Abbreviated title: Syst Biol
Country: England
NLM ID: 9302532

Publication information

Publication date:
2021-12-16
History:
received: 2020-09-01
revised: 2021-04-28
accepted: 2021-05-03
pubmed: 2021-05-13
medline: 2022-03-08
entrez: 2021-05-12
Status: ppublish

Abstract

Target enrichment is becoming increasingly popular for phylogenomic studies. Although baits for enrichment are typically designed to target single-copy genes, paralogs are often recovered with increased sequencing depth, sometimes from a significant proportion of loci, especially in groups experiencing whole-genome duplication (WGD) events. Common approaches for processing paralogs in target enrichment data sets include random selection, manual pruning, and mainly, the removal of entire genes that show any evidence of paralogy. These approaches are prone to errors in orthology inference or removing large numbers of genes. By removing entire genes, valuable information that could be used to detect and place WGD events is discarded. Here, we used an automated approach for orthology inference in a target enrichment data set of 68 species of Alchemilla s.l. (Rosaceae), a widely distributed clade of plants primarily from temperate climate regions. Previous molecular phylogenetic studies and chromosome numbers both suggested ancient WGDs in the group. However, both the phylogenetic location and putative parental lineages of these WGD events remain unknown. By taking paralogs into consideration and inferring orthologs from target enrichment data, we identified four nodes in the backbone of Alchemilla s.l. with an elevated proportion of gene duplication. Furthermore, using a gene-tree reconciliation approach, we established the autopolyploid origin of the entire Alchemilla s.l. and the nested allopolyploid origin of four major clades within the group. Here, we showed the utility of automated tree-based orthology inference methods, previously designed for genomic or transcriptomic data sets, to study complex scenarios of polyploidy and reticulate evolution from target enrichment data sets.

Keywords: Alchemilla; allopolyploidy; autopolyploidy; gene tree discordance; orthology inference; paralogs; Rosaceae; target enrichment; whole genome duplication.

Identifiers

pubmed: 33978764
pii: 6274658
doi: 10.1093/sysbio/syab032
pmc: PMC8677558

Databases

Dryad
doi: 10.5061/dryad.cc2fqz660

Publication types

Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

190-207

Copyright information

© The Author(s) 2021. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

Authors

Diego F Morales-Briones (DF)

Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA.
Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA.

Berit Gehrke (B)

University Gardens, University Museum, University of Bergen, Mildeveien 240, 5259 Hjellestad, Norway.

Chien-Hsun Huang (CH)

State Key Laboratory of Genetic Engineering and Collaborative Innovation Center of Genetics and Development, Ministry of Education Key Laboratory of Biodiversity and Ecological Engineering, Institute of Plant Biology, Center of Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai 200433, China.

Aaron Liston (A)

Department of Botany and Plant Pathology, Oregon State University, 2082 Cordley Hall, Corvallis, OR 97331, USA.

Hong Ma (H)

Department of Biology, the Huck Institute of the Life Sciences, the Pennsylvania State University, 510D Mueller Laboratory, University Park, PA 16802, USA.

Hannah E Marx (HE)

Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109-1048, USA.
Museum of Southwestern Biology and Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA.

David C Tank (DC)

Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA.

Ya Yang (Y)

Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA.
