A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study.
Alfalfa
Autotetraploid
Crop pangenome
Graph-based pangenome
Polyploids
Journal
BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
31 Oct 2024
History:
received: 13 Jun 2024
accepted: 21 Oct 2024
medline: 1 Nov 2024
pubmed: 1 Nov 2024
entrez: 1 Nov 2024
Status:
epublish
Abstract
BACKGROUND: The concept of pangenomics and the importance of structural variants are gaining recognition within the plant genomics community. Due to advancements in sequencing and computational technology, it has become feasible to sequence the entire genome of numerous individuals of a single species at a reasonable cost. Pangenomes have been constructed for many major diploid crops, including rice, maize, soybean, sorghum, pearl millet, peas, sunflower, grapes, and mustards. However, pangenomes for polyploid species are relatively scarce and are available for only a few crops, including wheat, cotton, rapeseed, and potatoes.
MAIN BODY: In this review, we explore the various methods used in crop pangenome development, discussing the challenges and implications of these techniques based on insights from published pangenome studies. We offer a systematic guide and discuss the tools available for constructing a pangenome and conducting downstream analyses. Alfalfa, a highly heterozygous, cross-pollinated, and autotetraploid forage crop species, is used as an example to discuss the concerns and challenges posed by polyploid crop species. We compared linear and graph-based methods by constructing an alfalfa graph pangenome from three publicly available genome assemblies. To illustrate the intricacies captured by pangenome graphs for a complex crop genome, we aligned five different gene sequences against the three graph-based pangenomes. The comparison of the three graph pangenome methods reveals notable differences in the genomic variation captured by each pipeline.
CONCLUSIONS: Pangenome resources are proving invaluable by offering insights into core and dispensable genes, novel gene discovery, and genome-wide patterns of variation. The development of user-friendly online portals for linear pangenome visualization has made these resources accessible to the broader scientific and breeding community. However, challenges remain with graph-based pangenomes, including compatibility with other tools, extraction of sequence for regions of interest, and visualization of the genetic variation captured in pangenome graphs. These issues necessitate further refinement of tools and pipelines to effectively address the complexities of polyploid, highly heterozygous, and cross-pollinated species.
Identifiers
pubmed: 39482604
doi: 10.1186/s12864-024-10931-w
pii: 10.1186/s12864-024-10931-w
Publication types
Journal Article
Review
Languages
eng
Citation subsets
IM
Pagination
1022
Grants
Agency: USDA-ARS
ID: 5026-12210-004-00D
Copyright information
© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.
pubmed: 23644548
doi: 10.1038/nmeth.2474
Morisse P, Marchet C, Limasset A, Lecroq T, Lefebvre A. Scalable long read self-correction and assembly polishing with multiple sequence alignment. Sci Rep. 2021;11:761.
pubmed: 33436980
pmcid: 7804095
doi: 10.1038/s41598-020-80757-5
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.
pubmed: 25409509
pmcid: 4237348
doi: 10.1371/journal.pone.0112963
Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de bruijn graphs. Genome Res. 2008;18:821–9.
pubmed: 18349386
pmcid: 2336801
doi: 10.1101/gr.074492.107
Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1:18.
pubmed: 23587118
pmcid: 3626529
doi: 10.1186/2047-217X-1-18
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJM, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–23.
pubmed: 19251739
doi: 10.1101/gr.089532.108
Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, et al. ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter. Genome Res. 2017;27:768–77.
pubmed: 28232478
doi: 10.1101/gr.214346.116
Boisvert S, Laviolette F, Corbeil J, Ray. Simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol. 2010;17:1519–33.
pubmed: 20958248
doi: 10.1089/cmb.2009.0238
Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de bruijn graph. Bioinformatics. 2015;31:1674–6.
pubmed: 25609793
doi: 10.1093/bioinformatics/btv033
Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–8.
pubmed: 22495754
doi: 10.1093/bioinformatics/bts174
Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci USA. 2011;108:1513–8.
pubmed: 21187386
doi: 10.1073/pnas.1017351108
Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013;29:2669–77.
pubmed: 23990416
doi: 10.1093/bioinformatics/btt476
Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–5.
pubmed: 28336562
doi: 10.1126/science.aal3327
Zhang X, Zhang S, Zhao Q, Ming R, Tang H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat Plants. 2019;5:833–45.
pubmed: 31383970
doi: 10.1038/s41477-019-0487-8
Durand NC, Shamim MS, Machol I, Rao SSP, Huntley MH, Lander ES, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–8.
pubmed: 27467249
doi: 10.1016/j.cels.2016.07.002
Kermit: guided genome assembler using colored overlap graphs. https://github.com/rikuu/kermit . Accessed 12 Jun 2024.
Kermit-optical-maps. https://github.com/Denopia/kermit-optical-maps . Accessed 12 Jun 2024.
Novo_Stitch. Novo&Stitch is a genome assembly reconciliation tool based on optical map. https://github.com/ucrbioinfo/Novo_Stitch . Accessed 12 Jun 2024.
OMGS. OMGS is a fast and accurate genome scaffolding tool with one or multiple Bionano optical maps. https://github.com/ucrbioinfo/OMGS . Accessed 12 Jun 2024.
Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12: 491.
pubmed: 22192575
pmcid: 3280279
doi: 10.1186/1471-2105-12-491
Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP + and AUGUSTUS supported by a protein database. NAR Genom Bioinform. 2021;3:lqaa108.
pubmed: 33575650
doi: 10.1093/nargab/lqaa108
Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008;24:637–44.
pubmed: 18218656
doi: 10.1093/bioinformatics/btn013
Geneid. Predict genic elements as splice sites, exons or genes, along eukaryotic DNA sequences. https://github.com/guigolab/geneid?tab=readme-ov-file . Accessed 12 Jun 2024.
Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20:2878–9.
pubmed: 15145805
doi: 10.1093/bioinformatics/bth315
Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
pubmed: 15144565
doi: 10.1186/1471-2105-5-59
Interproscan. Genome-scale protein function classification. https://github.com/ebi-pf-team/interproscan . Accessed 12 Jun 2024.
Mount DW. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007;2007:pdb.top17.
pubmed: 21357135
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
pubmed: 19451168
doi: 10.1093/bioinformatics/btp324
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
pubmed: 22388286
pmcid: 3322381
doi: 10.1038/nmeth.1923
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37:907–15.
pubmed: 31375807
pmcid: 7605509
doi: 10.1038/s41587-019-0201-4
Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.
pubmed: 29750242
pmcid: 6137996
doi: 10.1093/bioinformatics/bty191
Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15:461–8.
pubmed: 29713083
doi: 10.1038/s41592-018-0001-7
BLASR. The PacBio
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.
pubmed: 14759262
doi: 10.1186/gb-2004-5-2-r12
Song B, Marco-Sola S, Moreto M, Johnson L, Buckler ES, Stitzer MC. AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication. Proc Natl Acad Sci USA. 2022;119:e2113075119.
pubmed: 34934012
doi: 10.1073/pnas.2113075119
Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020;587:246–51.
pubmed: 33177663
doi: 10.1038/s41586-020-2871-y
Darling ACE, Mau B, Blattner FR, Perna NT, Mauve. Multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14:1394–403.
pubmed: 15231754
pmcid: 442156
doi: 10.1101/gr.2289704
lastz. Program for aligning DNA sequences, a pairwise aligner. https://github.com/lastz/lastz . Accessed 12 Jun 2024.
Angiuoli SV, Salzberg SL. Mugsy: fast multiple alignment of closely related whole genomes. Bioinformatics. 2011;27:334–42.
pubmed: 21148543
doi: 10.1093/bioinformatics/btq665
Goel M, Sun H, Jiao W-B, Schneeberger K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 2019;20:277.
pubmed: 31842948
doi: 10.1186/s13059-019-1911-0
Chakraborty M, Emerson JJ, Macdonald SJ, Long AD. Structural variants exhibit widespread allelic heterogeneity and shape variation in complex traits. Nat Commun. 2019;10:4872.
pubmed: 31653862
doi: 10.1038/s41467-019-12884-1
Nattestad M, Schatz MC, Assemblytics. A web analytics tool for the detection of variants from an assembly. Bioinformatics. 2016;32:3021–3.
pubmed: 27318204
doi: 10.1093/bioinformatics/btw369
Kronenberg ZN, Fiddes IT, Gordon D, Murali S, Cantsilieris S, Meyerson OS, et al. High-resolution comparative analysis of great ape genomes. Science. 2018;360:eaar6343.
pubmed: 29880660
pmcid: 6178954
doi: 10.1126/science.aar6343
Smolka M, Paulin LF, Grochowski CM, Horner DW, Mahmoud M, Behera S, et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol. 2024;42:1571–80.
English AC, Salerno WJ, Reid JG, PBHoney. Identifying genomic variants via long-read discordance and interrupted mapping. BMC Bioinformatics. 2014;15:180.
pubmed: 24915764
doi: 10.1186/1471-2105-15-180
Cretu Stancu M, van Roosmalen MJ, Renkens I, Nieboer MM, Middelkamp S, de Ligt J, et al. Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat Commun. 2017;8:1326.
pubmed: 29109544
doi: 10.1038/s41467-017-01343-4
Fan X, Abbott TE, Larson D, Chen K. BreakDancer: Identification of genomic structural variation from paired-end read mapping. Curr Protoc Bioinformatics. 2014;45:15.6.1–11.
pubmed: 25152801
Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014;15:R84.
pubmed: 24970577
doi: 10.1186/gb-2014-15-6-r84
Rausch T, Zichner T, Schlattl A, Stütz AM, Benes V, Korbel JO. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics. 2012;28:i333–339.
pubmed: 22962449
doi: 10.1093/bioinformatics/bts378
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
pubmed: 20644199
doi: 10.1101/gr.107524.110
Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv. 2012. https://doi.org/10.48550/arXiv.1207.3907 .
Perea C, De La Hoz JF, Cruz DF, Lobaton JD, Izquierdo P, Quintero JC, et al. Bioinformatic analysis of genotype by sequencing (GBS) data with NGSEP. BMC Genomics. 2016;17 Suppl 5(Suppl 5):498.
pubmed: 27585926
doi: 10.1186/s12864-016-2827-7
Tello D, Gil J, Loaiza CD, Riascos JJ, Cardozo N, Duitama J. NGSEP3: accurate variant calling across species and sequencing protocols. Bioinformatics. 2019;35:4716–23.
pubmed: 31099384
pmcid: 6853766
doi: 10.1093/bioinformatics/btz275
Giordano F, Stammnitz MR, Murchison EP, Ning Z. scanPAV: a pipeline for extracting presence-absence variations in genome pairs. Bioinformatics. 2018;34:3022–4.
pubmed: 29608694
pmcid: 6129304
doi: 10.1093/bioinformatics/bty189
Tay Fernandez CG, Marsh JI, Nestor BJ, Gill M, Golicz AA, Bayer PE, et al. An SGSGeneloss-based method for constructing a gene presence-absence table using mosdepth. Methods Mol Biol. 2022;2512:73–80.
pubmed: 35818000
doi: 10.1007/978-1-0716-2429-6_5
Tahir Ul Qamar M, Zhu X, Xing F, Chen L-L. ppsPCP: A plant presence/absence variants scanner and pan-genome construction pipeline. Bioinformatics. 2019;35:4156–8.
pubmed: 30851098
doi: 10.1093/bioinformatics/btz168
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10:10.
doi: 10.1093/gigascience/giab008
SURVIVOR. Toolset for SV simulation, comparison and filtering. https://github.com/fritzsedlazeck/SURVIVOR . Accessed 25 Sep 2024.
Zheng Z, Zhu M, Zhang J, Liu X, Hou L, Liu W, et al. A sequence-aware merger of genomic structural variations at population scale. Nat Commun. 2024;15:960.
pubmed: 38307885
pmcid: 10837428
doi: 10.1038/s41467-024-45244-9
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
pubmed: 20110278
doi: 10.1093/bioinformatics/btq033
Jasmine. SV merging across samples. https://github.com/mkirsche/Jasmine . Accessed 12 Jun 2024.
Emms DM, Kelly S, OrthoFinder. Solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16:157.
pubmed: 26243257
pmcid: 4531804
doi: 10.1186/s13059-015-0721-2
Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.
pubmed: 12952885
pmcid: 403725
doi: 10.1101/gr.1224503
Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49.
pubmed: 22217600
pmcid: 3326336
doi: 10.1093/nar/gkr1293
Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–2.
pubmed: 23060610
pmcid: 3516142
doi: 10.1093/bioinformatics/bts565
Wang J, Yang W, Zhang S, Hu H, Yuan Y, Dong J, et al. A pangenome analysis pipeline provides insights into functional gene identification in rice. Genome Biol. 2023;24:19.
pubmed: 36703158
pmcid: 9878884
doi: 10.1186/s13059-023-02861-9
Garrison E, Sirén J, Novak AM, Hickey G, Eizenga JM, Dawson ET, et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol. 2018;36:875–9.
pubmed: 30125266
pmcid: 6126949
doi: 10.1038/nbt.4227
Hickey G, Heller D, Monlong J, Sibbesen JA, Sirén J, Eizenga J, et al. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biol. 2020;21:35.
pubmed: 32051000
pmcid: 7017486
doi: 10.1186/s13059-020-1941-7
Li H, Feng X, Chu C. The design and construction of reference pangenome graphs with minigraph. Genome Biol. 2020;21:265.
pubmed: 33066802
pmcid: 7568353
doi: 10.1186/s13059-020-02168-z
Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y, et al. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol. 2024;42:663–73.
pubmed: 37165083
doi: 10.1038/s41587-023-01793-w
Garrison E, Guarracino A, Heumos S, Villani F, Bao Z, Tattini L et al. Building pangenome graphs. BioRxiv. 2023. https://doi.org/10.1101/2023.04.05.535718 .
Khan J, Kokot M, Deorowicz S, Patro R. Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with cuttlefish 2. Genome Biol. 2022;23:190.
pubmed: 36076275
pmcid: 9454175
doi: 10.1186/s13059-022-02743-6
Holley G, Melsted P. Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs. Genome Biol. 2020;21:249.
pubmed: 32943081
pmcid: 7499882
doi: 10.1186/s13059-020-02135-8
Sirén J, Monlong J, Chang X, Novak AM, Eizenga JM, Markello C, et al. Pangenomics enables genotyping of known structural variants in 5202 diverse genomes. Science. 2021;374:abg8871.
pubmed: 34914532
pmcid: 9365333
doi: 10.1126/science.abg8871
Sović I, Šikić M, Wilm A, Fenlon SN, Chen S, Nagarajan N. Fast and sensitive mapping of nanopore sequencing reads with GraphMap. Nat Commun. 2016;7:11307.
pubmed: 27079541
pmcid: 4835549
doi: 10.1038/ncomms11307
Rautiainen M, Marschall T, GraphAligner. Rapid and versatile sequence-to-graph alignment. Genome Biol. 2020;21:253.
pubmed: 32972461
pmcid: 7513500
doi: 10.1186/s13059-020-02157-2
Paten B, Eizenga JM, Rosen YM, Novak AM, Garrison E, Hickey G. Superbubbles, ultrabubbles, and cacti. J Comput Biol. 2018;25:649–63.
pubmed: 29461862
pmcid: 6067107
doi: 10.1089/cmb.2017.0251
Chen S, Krusche P, Dolzhenko E, Sherman RM, Petrovski R, Schlesinger F, et al. Paragraph: a graph-based structural variant genotyper for short-read sequence data. Genome Biol. 2019;20:291.
pubmed: 31856913
pmcid: 6921448
doi: 10.1186/s13059-019-1909-7
Eggertsson HP, Jonsson H, Kristmundsdottir S, Hjartarson E, Kehr B, Masson G, et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat Genet. 2017;49:1654–60.
pubmed: 28945251
doi: 10.1038/ng.3964
Eggertsson HP, Kristmundsdottir S, Beyter D, Jonsson H, Skuladottir A, Hardarson MT, et al. GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs. Nat Commun. 2019;10:5402.
pubmed: 31776332
pmcid: 6881350
doi: 10.1038/s41467-019-13341-9
Ebler J, Ebert P, Clarke WE, Rausch T, Audano PA, Houwaart T, et al. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat Genet. 2022;54:518–25.
pubmed: 35410384
pmcid: 9005351
doi: 10.1038/s41588-022-01043-w
Sibbesen JA, Maretty L, Danish Pan-Genome Consortium, Krogh A. Accurate genotyping across variant classes and lengths using variant graphs. Nat Genet. 2018;50:1054–9.
pubmed: 29915429
doi: 10.1038/s41588-018-0145-5
Horsfield ST, Tonkin-Hill G, Croucher NJ, Lees JA. Accurate and fast graph-based pangenome annotation and clustering with ggCaller. Genome Res. 2023;33:1622–37.
pubmed: 37620118
pmcid: 10620059
doi: 10.1101/gr.277733.123
Sheikhizadeh S, Schranz ME, Akdel M, de Ridder D, Smit S. PanTools: representation, storage and exploration of pan-genomic data. Bioinformatics. 2016;32:i487–93.
pubmed: 27587666
doi: 10.1093/bioinformatics/btw455
Anari SS, de Ridder D, Schranz ME, Smit S. Pangenomic read mapping. BioRxiv. 2019. https://doi.org/10.1101/813634 .
Sibbesen JA, Eizenga JM, Novak AM, Sirén J, Chang X, Garrison E, et al. Haplotype-aware pantranscriptome analyses using spliced pangenome graphs. Nat Methods. 2023;20:239–47.
pubmed: 36646895
doi: 10.1038/s41592-022-01731-9
Wick RR, Schultz MB, Zobel J, Holt KE. BANDAGE: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31:3350–2.
pubmed: 26099265
pmcid: 4595904
doi: 10.1093/bioinformatics/btv383
Gonnella G, Niehus N, Kurtz S. GfaViz: flexible and interactive visualization of GFA sequence graphs. Bioinformatics. 2019;35:2853–5.
pubmed: 30596893
doi: 10.1093/bioinformatics/bty1046
Durant É, Sabot F, Conte M, Rouard M. PANACHE: a web browser-based viewer for linearized pangenomes. Bioinformatics. 2021;37:4556–8.
pubmed: 34601567
pmcid: 8652104
doi: 10.1093/bioinformatics/btab688
SequenceTubeMap: displays multiple genomic sequences in the form of a tube map. https://github.com/vgteam/sequenceTubeMap . Accessed 12 Jun 2024.
Guarracino A, Heumos S, Nahnsen S, Prins P, Garrison E. ODGI: understanding pangenome graphs. Bioinformatics. 2022;38:3319–26.
pubmed: 35552372
pmcid: 9237687
doi: 10.1093/bioinformatics/btac308
Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet. 2012;44:226–32.
pubmed: 22231483
pmcid: 3272472
doi: 10.1038/ng.1028
Qin P, Lu H, Du H, Wang H, Chen W, Chen Z, et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell. 2021;184:3542–3558.e16.
pubmed: 34051138
doi: 10.1016/j.cell.2021.04.046
Glick L, Mayrose I. The effect of methodological considerations on the construction of gene-based plant pan-genomes. Genome Biol Evol. 2023;15:15.
doi: 10.1093/gbe/evad121
Mehrotra S, Goyal V. Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function. Genomics Proteom Bioinf. 2014;12:164–71.
doi: 10.1016/j.gpb.2014.07.003
Negi P, Rai AN, Suprasanna P. Moving through the stressed genome: emerging regulatory roles for transposons in plant stress response. Front Plant Sci. 2016;7:1448.
pubmed: 27777577
pmcid: 5056178
doi: 10.3389/fpls.2016.01448
Bourque G, Burns KH, Gehring M, Gorbunova V, Seluanov A, Hammell M, et al. Ten things you should know about transposable elements. Genome Biol. 2018;19:199.
pubmed: 30454069
pmcid: 6240941
doi: 10.1186/s13059-018-1577-z
Bariah I, Keidar-Friedman D, Kashkush K. Where the wild things are: transposable elements as drivers of structural and functional variations in the wheat genome. Front Plant Sci. 2020;11:585515.
pubmed: 33072155
pmcid: 7530836
doi: 10.3389/fpls.2020.585515
Zhao Q, Feng Q, Lu H, Li Y, Wang A, Tian Q, et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nat Genet. 2018;50:278–84.
pubmed: 29335547
doi: 10.1038/s41588-018-0041-z
Golicz AA, Bayer PE, Barker GC, Edger PP, Kim H, Martinez PA, et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat Commun. 2016;7:13390.
pubmed: 27834372
pmcid: 5114598
doi: 10.1038/ncomms13390
Badouin H, Gouzy J, Grassa CJ, Murat F, Staton SE, Cottret L, et al. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature. 2017;546:148–52.
pubmed: 28538728
doi: 10.1038/nature22380
Hu Z, Sun C, Lu K-C, Chu X, Zhao Y, Lu J, et al. EUPAN enables pan-genome studies of a large number of eukaryotic genomes. Bioinformatics. 2017;33:2408–9.
pubmed: 28369371
doi: 10.1093/bioinformatics/btx170
Ho SS, Urban AE, Mills RE. Structural variation in the sequencing era. Nat Rev Genet. 2020;21:171–89.
pubmed: 31729472
doi: 10.1038/s41576-019-0180-9
Kosugi S, Momozawa Y, Liu X, Terao C, Kubo M, Kamatani Y. Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing. Genome Biol. 2019;20:117.
pubmed: 31159850
pmcid: 6547561
doi: 10.1186/s13059-019-1720-5
Zhou Y, Zhang Z, Bao Z, Li H, Lyu Y, Zan Y, et al. Graph pangenome captures missing heritability and empowers tomato breeding. Nature. 2022;606:527–34.
pubmed: 35676474
pmcid: 9200638
doi: 10.1038/s41586-022-04808-9
He Q, Tang S, Zhi H, Chen J, Zhang J, Liang H, et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat Genet. 2023;55:1232–42.
pubmed: 37291196
pmcid: 10335933
doi: 10.1038/s41588-023-01423-w
Guarracino A, Mwaniki N, Marco-Sola S, Garrison E. Wfmash: a pangenome-scale aligner. Zenodo. 2021. https://doi.org/10.5281/zenodo.6949373 .
Garrison E, Guarracino A. Unbiased pangenome graphs. Bioinformatics. 2023;39:btac743.
pubmed: 36448683
doi: 10.1093/bioinformatics/btac743
Cleary A, Ramaraj T, Kahanda I, Mudge J, Mumey B. Exploring frequented regions in pan-genomic graphs. IEEE/ACM Trans Comput Biol Bioinform. 2019;16:1424–35.
pubmed: 30106690
doi: 10.1109/TCBB.2018.2864564
Andreace F, Lechat P, Dufresne Y, Chikhi R. Comparing methods for constructing and representing human pangenome graphs. Genome Biol. 2023;24:274.
pubmed: 38037131
doi: 10.1186/s13059-023-03098-2
Liao W-W, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, et al. A draft human pangenome reference. Nature. 2023;617:312–24.
pubmed: 37165242
doi: 10.1038/s41586-023-05896-x
Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(Suppl 2):ii215–225.
pubmed: 14534192
doi: 10.1093/bioinformatics/btg1080
Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinf. 2007;Chap. 4:Unit 4.3.
UniProt Consortium. Uniprot: the universal protein knowledgebase in 2023. Nucleic Acids Res. 2023;51:D523–31.
doi: 10.1093/nar/gkac1052
Reiser L, Bakker E, Subramaniam S, Chen X, Sawant S, Khosa K, et al. The Arabidopsis information resource in 2024. Genetics. 2024;227:iyae027.
pubmed: 38457127
doi: 10.1093/genetics/iyae027
UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47:D506–15.
doi: 10.1093/nar/gky1049
Zheng Y, Jiao C, Sun H, Rosli HG, Pombo MA, Zhang P, et al. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol Plant. 2016;9:1667–70.
pubmed: 27717919
doi: 10.1016/j.molp.2016.09.014
Tian F, Yang DC, Meng YQ, Jin J, Gao G. PlantRegMap: charting functional regulatory maps in plants. Nucleic Acids Res. 2020;48:D1104–13.
pubmed: 31701126
Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42:222–30 Database issue:D.
doi: 10.1093/nar/gkt1223
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20:275.
pubmed: 31843001
pmcid: 6913007
doi: 10.1186/s13059-019-1905-y
Xu Z, Wang H. LTR_FINDER: An efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35(Web Server issue):W265–8.
pubmed: 17485477
pmcid: 1933203
doi: 10.1093/nar/gkm286
Ellinghaus D, Kurtz S, Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 2008;9:18.
pubmed: 18194517
pmcid: 2253517
doi: 10.1186/1471-2105-9-18
Ou S, Jiang N, Ltr_retriever. A highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2018;176:1410–22.
pubmed: 29233850
doi: 10.1104/pp.17.01310
Shi J, Liang C. Generic repeat Finder: A High-Sensitivity Tool for genome-wide De Novo repeat detection. Plant Physiol. 2019;180:1803–15.
pubmed: 31152127
pmcid: 6670090
doi: 10.1104/pp.19.00386
Su W, Gu X, Peterson T. TIR-Learner, a new ensemble method for tir transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol Plant. 2019;12:447–60.
pubmed: 30802553
doi: 10.1016/j.molp.2019.02.008
Xiong W, He L, Lai J, Dooner HK, Du C. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc Natl Acad Sci USA. 2014;111:10263–8.
pubmed: 24982153
pmcid: 4104883
doi: 10.1073/pnas.1410068111
Hubley R, Smit A. ISB repeat modeler. ISB repeat modeler. https://www.repeatmasker.org/RepeatModeler/ . Accessed 11 Jun 2024.
Institute for Systems Biology. Repeat masker. ISB Repeat Masker. https://www.repeatmasker.org/RepeatMasker/ . Accessed 11 Jun 2024.
Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.
pubmed: 9862982
doi: 10.1093/nar/27.2.573
Mudge J, Farmer AD. Sequencing, assembly, and annotation of the alfalfa genome. In: Yu L-X, Kole C, editors. The Alfalfa Genome. Cham: Springer International Publishing; 2021. p. 87–109.
doi: 10.1007/978-3-030-74466-3_6
Ballouz S, Dobin A, Gillis JA. Is it time to change the reference genome? Genome Biol. 2019;20:159.
pubmed: 31399121
doi: 10.1186/s13059-019-1774-4
Bradbury PJ, Casstevens T, Jensen SE, Johnson LC, Miller ZR, Monier B, et al. The practical haplotype graph, a platform for storing and using pangenomes for imputation. Bioinformatics. 2022;38:3698–702.
pubmed: 35748708
doi: 10.1093/bioinformatics/btac410
Gallais A. Quantitative genetics and breeding methods in autopolyploid plants. Versailles: QUAE; 2004.
Zhou Q, Tang D, Huang W, Yang Z, Zhang Y, Hamilton JP, et al. Haplotype-resolved genome analyses of a heterozygous diploid potato. Nat Genet. 2020;52:1018–23.
pubmed: 32989320
doi: 10.1038/s41588-020-0699-x
Li A, Liu A, Du X, Chen J-Y, Yin M, Hu H-Y, et al. A chromosome-scale genome assembly of a diploid alfalfa, the progenitor of autotetraploid alfalfa. Hortic Res. 2020;7:194.
pubmed: 33328470
doi: 10.1038/s41438-020-00417-7
Uitdewilligen JGAML, Wolters A-MA, D’hoop BB, Borm TJA, Visser RGF, van Eck HJ. A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS ONE. 2013;8:e62355.
pubmed: 23667470
doi: 10.1371/journal.pone.0062355
USDA-ARS. Legume information system: genome assembly of cultivated Alfalfa at Diploid genome. https://data.legumeinfo.org/Medicago/sativa/genomes/ . Accessed 11 Jun 2024.
Russelle MP, Alfalfa. After an 8,000-year journey, the queen of Forages stands poised to enjoy renewed popularity. Am Sci. 2001;89:252–61.
doi: 10.1511/2001.3.252
Wang Z, Şakiroğlu M. The origin, evolution, and genetic diversity of alfalfa. In: Yu L-X, Kole C, editors. The Alfalfa Genome. Cham: Springer International Publishing; 2021. p. 29–42.
doi: 10.1007/978-3-030-74466-3_3
Li A, Liu A, Wu S, Qu K, Hu H, Yang J, et al. Comparison of structural variants in the whole genome sequences of two Medicago truncatula ecotypes: Jemalong A17 and R108. BMC Plant Biol. 2022;22:77.
pubmed: 35193491
doi: 10.1186/s12870-022-03469-0
Hardigan MA, Crisovan E, Hamilton JP, Kim J, Laimbeer P, Leisner CP, et al. Genome reduction uncovers a large dispensable genome and adaptive role for copy number variation in asexually propagated Solanum tuberosum. Plant Cell. 2016;28:388–405.
pubmed: 26772996
doi: 10.1105/tpc.15.00538
Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38:4647–54.
pubmed: 34320186
doi: 10.1093/molbev/msab199
Jeffares DC, Jolly C, Hoti M, Speed D, Shaw L, Rallis C, et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat Commun. 2017;8:14061.
pubmed: 28117401
doi: 10.1038/ncomms14061
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92.
pubmed: 22728672
doi: 10.4161/fly.19695
Alonge M, Wang X, Benoit M, Soyk S, Pereira L, Zhang L, et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell. 2020;182:145–e16123.
pubmed: 32553272
pmcid: 7354227
doi: 10.1016/j.cell.2020.05.021
Medina CA, Zhao D, Lin M, Sapkota M, Sandercock AM, Beil CT et al. Pre-breeding in alfalfa germplasm develops highly differentiated populations, as revealed by genome-wide microhaplotype markers. Sci Rep. 2024. https://doi.org/10.21203/rs.3.rs-4215295/v1 .
Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, et al. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020;38:276–8.
pubmed: 32055031
doi: 10.1038/s41587-020-0439-x
Wang Z, Wang X, Zhang H, Ma L, Zhao H, Jones CS, et al. A genome-wide association study approach to the identification of candidate genes underlying agronomic traits in alfalfa (Medicago sativa L). Plant Biotechnol J. 2020;18:611–3.
pubmed: 31487419
doi: 10.1111/pbi.13251
Hu H, Li R, Zhao J, Batley J, Edwards D. Technological development and advances for constructing and analyzing plant pangenomes. Genome Biol Evol. 2024;16:evae081.
pubmed: 38669452
doi: 10.1093/gbe/evae081
Beyer W, Novak AM, Hickey G, Chan J, Tan V, Paten B, et al. Sequence tube maps: making graph genomes intuitive to commuters. Bioinformatics. 2019;35:5318–20.
pubmed: 31368484
doi: 10.1093/bioinformatics/btz597
Wilson ZA, Morroll SM, Dawson J, Swarup R, Tighe PJ. The Arabidopsis MALE STERILITY1 (MS1) gene is a transcriptional regulator of male gametogenesis, with homology to the PHD-finger family of transcription factors. Plant J. 2001;28:27–39.
pubmed: 11696184
doi: 10.1046/j.1365-313X.2001.01125.x
Fu G-Q, Xu S, Xie Y-J, Han B, Nie L, Shen W-B, et al. Molecular cloning, characterization, and expression of an alfalfa (Medicago sativa L.) heme oxygenase-1 gene, MsHO1, which is pro-oxidants-regulated. Plant Physiol Biochem. 2011;49:792–9.
pubmed: 21316255
doi: 10.1016/j.plaphy.2011.01.018
Zhang X, Chen B, Wang L, Ali S, Guo Y, Liu J, et al. Genome-wide identification and characterization of caffeic acid o-methyltransferase gene family in soybean. Plants. 2021;10:2816.
doi: 10.3390/plants10122816
Jing Y, Paau AS, Brill WJ. Leghemoglobins from alfalfa (Medicago sativa L. vernal) root nodules. I. Purification and in vitro synthesis of five leghemoglobin components. Plant Sci Lett. 1982;25:119–32.
doi: 10.1016/0304-4211(82)90170-5
Löbler M, Hirsch AM. An alfalfa (Medicago sativa L.) cDNA encoding an acidic leghemoglobin (MsLb3). Plant Mol Biol. 1992;20:733–6.
pubmed: 1450387
doi: 10.1007/BF00046457
GBrowser. https://gbrowser.sourceforge.net/ . Accessed 30 Sep 2024.
Diesh C, Stevens GJ, Xie P, De Jesus Martinez T, Hershberg EA, Leung A, et al. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol. 2023;24:74.
pubmed: 37069644
doi: 10.1186/s13059-023-02914-z
Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinf. 2013;14:178–92.
doi: 10.1093/bib/bbs017
GitHub - SJTU-CGM/PPanG: a precise pangenome browser combining linear and graph-based pan-genome. https://github.com/SJTU-CGM/PPanG/ . Accessed 30 Sep 2024.
Manuweera B, Mudge J, Kahanda I, Mumey B, Ramaraj T, Cleary A. Pangenome-wide association studies with frequented regions. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. New York: ACM; 2019. p. 627–32.
doi: 10.1145/3307339.3343478
Tay Fernandez CG, Nestor BJ, Danilevicz MF, Marsh JI, Petereit J, Bayer PE, et al. Expanding gene-editing potential in crop improvement with pangenomes. Int J Mol Sci. 2022;23:2276.
pubmed: 35216392
doi: 10.3390/ijms23042276
Jin S, Han Z, Hu Y, Si Z, Dai F, He L, et al. Structural variation (SV)-based pan-genome and GWAS reveal the impacts of SVs on the speciation and diversification of allotetraploid cottons. Mol Plant. 2023;16:678–93.
pubmed: 36760124
doi: 10.1016/j.molp.2023.02.004
Sun C, Hu Z, Zheng T, Lu K, Zhao Y, Wang W, et al. RPAN: Rice pan-genome browser for ∼3000 rice genomes. Nucleic Acids Res. 2017;45:597–605.
pubmed: 27940610
doi: 10.1093/nar/gkw958
Guangdong Laboratory for Lingnan Modern Agriculture SP of W& CRPT, Institute AG, Chinese Academy of Agricultural Sciences. RiceSuperPIRdb. http://www.ricesuperpir.com/ . Accessed 11 Jun 2024.
Lawrence CJ, Dong Q, Polacco ML, Seigfried TE, Brendel V. MaizeGDB, the community database for maize genetics and genomics. Nucleic Acids Res. 2004;32(Database issue):D393–7.
pubmed: 14681441
doi: 10.1093/nar/gkh011
Gui S, Yang L, Li J, Luo J, Xu X, Yuan J, et al. ZEAMAP, a comprehensive database adapted to the maize multi-omics era. iScience. 2020;23:101241.
pubmed: 32629608
doi: 10.1016/j.isci.2020.101241
Bayer PE, Petereit J, Durant É, Monat C, Rouard M, Hu H, et al. Wheat panache: a pangenome graph database representing presence-absence variation across sixteen bread wheat genomes. Plant Genome. 2022;15:e20221.
pubmed: 35644986
doi: 10.1002/tpg2.20221