A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study.


Journal

BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
31 Oct 2024
History:
received: 13 06 2024
accepted: 21 10 2024
medline: 1 11 2024
pubmed: 1 11 2024
entrez: 1 11 2024
Status: epublish

Abstract

Pangenomics and the importance of structural variants are gaining recognition within the plant genomics community. Owing to advances in sequencing and computational technology, it has become feasible to sequence the entire genome of numerous individuals of a single species at a reasonable cost. Pangenomes have been constructed for many major diploid crops, including rice, maize, soybean, sorghum, pearl millet, peas, sunflower, grapes, and mustards. However, pangenomes for polyploid species are relatively scarce and are available for only a few crops, including wheat, cotton, rapeseed, and potatoes. In this review, we explore the various methods used in crop pangenome development, discussing the challenges and implications of these techniques based on insights from published pangenome studies. We offer a systematic guide and discuss the tools available for constructing a pangenome and conducting downstream analyses. Alfalfa, a highly heterozygous, cross-pollinated, autotetraploid forage crop species, is used as an example to discuss the concerns and challenges posed by polyploid crop species. We conducted a comparative analysis using linear and graph-based methods by constructing an alfalfa graph pangenome using three publicly available genome assemblies. To illustrate the intricacies captured by pangenome graphs for a complex crop genome, we used five different gene sequences and aligned them against the three graph-based pangenomes. The comparison of the three graph pangenome methods reveals notable differences in the genomic variation captured by each pipeline. Pangenome resources are proving invaluable by offering insights into core and dispensable genes, novel gene discovery, and genome-wide patterns of variation. The development of user-friendly online portals for linear pangenome visualization has made these resources accessible to the broader scientific and breeding community. However, challenges remain with graph-based pangenomes, including compatibility with other tools, extraction of sequence for regions of interest, and visualization of the genetic variation captured in pangenome graphs. These issues necessitate further refinement of tools and pipelines to effectively address the complexities of polyploid, highly heterozygous, and cross-pollinated species.

Identifiers

pubmed: 39482604
doi: 10.1186/s12864-024-10931-w
pii: 10.1186/s12864-024-10931-w

Publication types

Journal Article; Review

Languages

eng

Citation subsets

IM

Pagination

1022

Grants

Agency: USDA-ARS
ID: 5026-12210-004-00D

Copyright information

© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.

Références

Jiao Y, Peluso P, Shi J, Liang T, Stitzer MC, Wang B, et al. Improved maize reference genome with single-molecule technologies. Nature. 2017;546:524–7.
pubmed: 28605751 pmcid: 7052699 doi: 10.1038/nature22971
Sun S, Zhou Y, Chen J, Shi J, Zhao H, Zhao H, et al. Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nat Genet. 2018;50:1289–95.
pubmed: 30061735 doi: 10.1038/s41588-018-0182-0
Yang N, Liu J, Gao Q, Gui S, Chen L, Yang L, et al. Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement. Nat Genet. 2019;51:1052–9.
pubmed: 31152161 doi: 10.1038/s41588-019-0427-6
Li C, Xiang X, Huang Y, Zhou Y, An D, Dong J, et al. Long-read sequencing reveals genomic structural variations that underlie creation of quality protein maize. Nat Commun. 2020;11:17.
pubmed: 31911615 pmcid: 6946643 doi: 10.1038/s41467-019-14023-2
Ge F, Qu J, Liu P, Pan L, Zou C, Yuan G, et al. Genome assembly of the maize inbred line A188 provides a new reference genome for functional genomics. Crop J. 2022;10:47–55.
doi: 10.1016/j.cj.2021.08.002
Wang B, Hou M, Shi J, Ku L, Song W, Li C, et al. De novo genome assembly and analyses of 12 founder inbred lines provide insights into maize heterosis. Nat Genet. 2023;55:312–23.
pubmed: 36646891 doi: 10.1038/s41588-022-01283-w
Wang B, Yang X, Jia Y, Xu Y, Jia P, Dang N, et al. High-quality Arabidopsis thaliana genome assembly with nanopore and HiFi long reads. Genomics Proteom Bioinf. 2022;20:4–13.
doi: 10.1016/j.gpb.2021.08.003
Buisine N, Quesneville H, Colot V. Improved detection and annotation of transposable elements in sequenced genomes using multiple reference sequence sets. Genomics. 2008;91:467–75.
pubmed: 18343092 doi: 10.1016/j.ygeno.2008.01.005
Lee H, Chawla HS, Obermeier C, Dreyer F, Abbadi A, Snowdon R. Chromosome-scale assembly of winter oilseed rape Brassica napus. Front Plant Sci. 2020;11:496.
pubmed: 32411167 pmcid: 7202327 doi: 10.3389/fpls.2020.00496
Bayer PE, Hurgobin B, Golicz AA, Chan C-KK, Yuan Y, Lee H, et al. Assembly and comparison of two closely related Brassica napus genomes. Plant Biotechnol J. 2017;15:1602–10.
pubmed: 28403535 pmcid: 5698052 doi: 10.1111/pbi.12742
Sun F, Fan G, Hu Q, Zhou Y, Guan M, Tong C, et al. The high-quality genome of Brassica napus Cultivar ZS11 reveals the introgression history in semi-winter morphotype. Plant J. 2017;92:452–68.
pubmed: 28849613 doi: 10.1111/tpj.13669
Lv H, Wang Y, Han F, Ji J, Fang Z, Zhuang M, et al. A high-quality reference genome for cabbage obtained with SMRT reveals novel genomic features and evolutionary characteristics. Sci Rep. 2020;10:12394.
pubmed: 32709963 pmcid: 7381634 doi: 10.1038/s41598-020-69389-x
Parkin IAP, Koh C, Tang H, Robinson SJ, Kagale S, Clarke WE, et al. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea. Genome Biol. 2014;15:R77.
pubmed: 24916971 pmcid: 4097860 doi: 10.1186/gb-2014-15-6-r77
Liu S, Liu Y, Yang X, Tong C, Edwards D, Parkin IAP, et al. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat Commun. 2014;5:3930.
pubmed: 24852848 doi: 10.1038/ncomms4930
Zhang L, Liang J, Chen H, Zhang Z, Wu J, Wang X. A near-complete genome assembly of Brassica rapa provides new insights into the evolution of centromeres. Plant Biotechnol J. 2023;21:1022–32.
pubmed: 36688739 pmcid: 10106856 doi: 10.1111/pbi.14015
Xu H, Wang C, Shao G, Wu S, Liu P, Cao P, et al. The reference genome and full-length transcriptome of pakchoi provide insights into cuticle formation and heat adaption. Hortic Res. 2022;9:uhac123.
pubmed: 35949690 pmcid: 9358696 doi: 10.1093/hr/uhac123
Yang Z, Jiang Y, Gong J, Li Q, Dun B, Liu D, et al. R gene triplication confers European fodder turnip with improved clubroot resistance. Plant Biotechnol J. 2022;20:1502–17.
pubmed: 35445530 pmcid: 9342621 doi: 10.1111/pbi.13827
Istace B, Belser C, Falentin C, Labadie K, Boideau F, Deniot G, et al. Sequencing and chromosome-scale assembly of plant genomes, Brassica rapa as a Use Case. Biology (Basel). 2021;10:732.
pubmed: 34439964
Chu JS-C, Peng B, Tang K, Yi X, Zhou H, Wang H, et al. Eight soybean reference genome resources from varying latitudes and agronomic traits. Sci Data. 2021;8:164.
pubmed: 34210987 pmcid: 8249447 doi: 10.1038/s41597-021-00947-2
Valliyodan B, Cannon SB, Bayer PE, Shu S, Brown AV, Ren L, et al. Construction and comparison of three reference-quality genome assemblies for soybean. Plant J. 2019;100:1066–82.
pubmed: 31433882 doi: 10.1111/tpj.14500
Yi X, Liu J, Chen S, Wu H, Liu M, Xu Q, et al. Genome assembly of the JD17 soybean provides a new reference genome for comparative genomics. G3 (Bethesda). 2022;12:jkac017.
pubmed: 35188189 doi: 10.1093/g3journal/jkac017
Shen Y, Du H, Liu Y, Ni L, Wang Z, Liang C, et al. Update soybean Zhonghuang 13 genome to a golden reference. Sci China Life Sci. 2019;62:1257–60.
pubmed: 31444683 doi: 10.1007/s11427-019-9822-2
Garg V, Dudchenko O, Wang J, Khan AW, Gupta S, Kaur P, et al. Chromosome-length genome assemblies of six legume species provide insights into genome organization, evolution, and agronomic traits for crop improvement. J Advanc Res. 2022;42:315–29.
doi: 10.1016/j.jare.2021.10.009
Xie M, Chung CY-L, Li M-W, Wong F-L, Wang X, Liu A, et al. A reference-grade wild soybean genome. Nat Commun. 2019;10:1216.
pubmed: 30872580 pmcid: 6418295 doi: 10.1038/s41467-019-09142-9
Ma Z, Zhang Y, Wu L, Zhang G, Sun Z, Li Z, et al. High-quality genome assembly and resequencing of modern cotton cultivars provide resources for crop improvement. Nat Genet. 2021;53:1385–91.
pubmed: 34373642 pmcid: 8423627 doi: 10.1038/s41588-021-00910-2
Hu Y, Chen J, Fang L, Zhang Z, Ma W, Niu Y, et al. Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat Genet. 2019;51:739–48.
pubmed: 30886425 doi: 10.1038/s41588-019-0371-5
Li F, Fan G, Lu C, Xiao G, Zou C, Kohel RJ, et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat Biotechnol. 2015;33:524–30.
pubmed: 25893780 doi: 10.1038/nbt.3208
Zeng X, Xu T, Ling Z, Wang Y, Li X, Xu S, et al. An improved high-quality genome assembly and annotation of tibetan hulless barley. Sci Data. 2020;7:139.
pubmed: 32385314 pmcid: 7210891 doi: 10.1038/s41597-020-0480-0
Mascher M, Gundlach H, Himmelbach A, Beier S, Twardziok SO, Wicker T, et al. A chromosome conformation capture ordered sequence of the barley genome. Nature. 2017;544:427–33.
pubmed: 28447635 doi: 10.1038/nature22043
Schreiber M, Mascher M, Wright J, Padmarasu S, Himmelbach A, Heavens D, et al. A genome assembly of the barley transformation reference cultivar golden promise. G3 (Bethesda). 2020;10:1823–7.
pubmed: 32241919 doi: 10.1534/g3.119.401010
Rajarammohan S, Kaur L, Verma A, Singh D, Mantri S, Roy JK, et al. Genome sequencing and assembly of Lathyrus sativus - a nutrient-rich hardy legume crop. Sci Data. 2023;10:32.
pubmed: 36650149 pmcid: 9845207 doi: 10.1038/s41597-022-01903-4
Emmrich PMF, Sarkar A, Njaci I, Kaithakottil GG, Ellis N, Moore C, et al. A draft genome of grass pea (Lathyrus sativus), a resilient diploid legume. BioRxiv. 2020. https://doi.org/10.1101/2020.04.24.058164 .
Shen C, Du H, Chen Z, Lu H, Zhu F, Chen H, et al. The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Mol Plant. 2020;13:1250–61.
pubmed: 32673760 doi: 10.1016/j.molp.2020.07.003
Long R, Zhang F, Zhang Z, Li M, Chen L, Wang X, et al. Genome assembly of alfalfa cultivar Zhongmu-4 and identification of SNPs associated with agronomic traits. Genomics Proteom Bioinf. 2022;20:14–28.
doi: 10.1016/j.gpb.2022.01.002
Chen H, Zeng Y, Yang Y, Huang L, Tang B, Zhang H, et al. Allele-aware chromosome-level genome assembly and efficient transgene-free genome editing for the autotetraploid cultivated alfalfa. Nat Commun. 2020;11:2494.
pubmed: 32427850 pmcid: 7237683 doi: 10.1038/s41467-020-16338-x
Choi JY, Lye ZN, Groen SC, Dai X, Rughani P, Zaaijer S, et al. Nanopore sequencing-based genome assembly and evolutionary genomics of circum-basmati rice. Genome Biol. 2020;21:21.
pubmed: 32019604 pmcid: 7001208 doi: 10.1186/s13059-020-1938-2
Du H, Yu Y, Ma Y, Gao Q, Cao Y, Chen Z, et al. Sequencing and de novo assembly of a near complete indica rice genome. Nat Commun. 2017;8:15324.
pubmed: 28469237 doi: 10.1038/ncomms15324
Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, et al. Improvement of the Oryza sativa nipponbare reference genome using next generation sequence and optical map data. Rice (N Y). 2013;6:4.
pubmed: 24280374 doi: 10.1186/1939-8433-6-4
Yan H, Sun M, Zhang Z, Jin Y, Zhang A, Lin C, et al. Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet. Nat Genet. 2023;55:507–18.
pubmed: 36864101 pmcid: 10011142 doi: 10.1038/s41588-023-01302-4
Varshney RK, Shi C, Thudi M, Mariac C, Wallace J, Qi P, et al. Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments. Nat Biotechnol. 2017;35:969–76.
pubmed: 28922347 pmcid: 6871012 doi: 10.1038/nbt.3943
Su X, Wang B, Geng X, Du Y, Yang Q, Liang B, et al. A high-continuity and annotated tomato reference genome. BMC Genomics. 2021;22:898.
pubmed: 34911432 pmcid: 8672587 doi: 10.1186/s12864-021-08212-x
Hosmani PS, Flores-Gonzalez M, van de Geest H, Maumus F, Bakker LV, Schijlen E, et al. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. BioRxiv. 2019. https://doi.org/10.1101/767764 .
Takei H, Shirasawa K, Kuwabara K, Toyoda A, Matsuzawa Y, Iioka S, et al. De novo genome assembly of two tomato ancestors, Solanum pimpinellifolium and Solanum lycopersicum var. cerasiforme, by long-read sequencing. DNA Res. 2021;28:28.
doi: 10.1093/dnares/dsaa029
Karetnikov DI, Vasiliev GV, Toshchakov SV, Shmakov NA, Genaev MA, Nesterov MA, et al. Analysis of genome structure and its variations in potato cultivars grown in Russia. Int J Mol Sci. 2023;24:5713.
pubmed: 36982787 pmcid: 10059000 doi: 10.3390/ijms24065713
Kyriakidou M, Anglin NL, Ellis D, Tai HH, Strömvik MV. Genome assembly of six polyploid potato genomes. Sci Data. 2020;7:88.
pubmed: 32161269 pmcid: 7066127 doi: 10.1038/s41597-020-0428-4
Xu PGSC, Pan X, Cheng S, Zhang S, Mu B. Genome sequence and analysis of the tuber crop potato. Nature. 2011;475:189–95.
pubmed: 21743474 doi: 10.1038/nature10158
Sun H, Jiao W-B, Krause K, Campoy JA, Goel M, Folz-Donahue K, et al. Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar. Nat Genet. 2022;54:342–8.
pubmed: 35241824 pmcid: 8920897 doi: 10.1038/s41588-022-01015-0
Wang F, Xia Z, Zou M, Zhao L, Jiang S, Zhou Y, et al. The autotetraploid potato genome provides insights into highly heterozygous species. Plant Biotechnol J. 2022;20:1996–2005.
pubmed: 35767385 pmcid: 9491450 doi: 10.1111/pbi.13883
Bao Z, Li C, Li G, Wang P, Peng Z, Cheng L, et al. Genome architecture and tetrasomic inheritance of autotetraploid potato. Mol Plant. 2022;15:1211–26.
pubmed: 35733345 doi: 10.1016/j.molp.2022.06.009
Van Lieshout N, van der Burgt A, de Vries ME, Ter Maat M, Eickholt D, Esselink D, et al. Solyntus, the New highly contiguous reference genome for Potato (Solanum tuberosum). G3 (Bethesda). 2020;10:G3.
Kuo Y-T, Ishii T, Fuchs J, Hsieh W-H, Houben A, Lin Y-R. The evolutionary dynamics of repetitive DNA and its impact on the genome diversification in the genus sorghum. Front Plant Sci. 2021;12:729734.
pubmed: 34475879 pmcid: 8407070 doi: 10.3389/fpls.2021.729734
International Wheat Genome Sequencing Consortium (IWGSC). Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science. 2018;361:eaar7191.
doi: 10.1126/science.aar7191
Walkowiak S, Gao L, Monat C, Haberer G, Kassa MT, Brinton J, et al. Multiple wheat genomes reveal global variation in modern breeding. Nature. 2020;588:277–83.
pubmed: 33239791 pmcid: 7759465 doi: 10.1038/s41586-020-2961-x
Xi H, Nguyen V, Ward C, Liu Z, Searle IR. Chromosome-level assembly of the common vetch (Vicia sativa) reference genome. Gigabyte. 2022;2022:gigabyte38.
pubmed: 36824524 pmcid: 9650280 doi: 10.46471/gigabyte.38
Shirasawa K, Kosugi S, Sasaki K, Ghelfi A, Okazaki K, Toyoda A, et al. Genome features of common vetch (Vicia sativa) in natural habitats. Plant Direct. 2021;5:e352.
pubmed: 34646975 pmcid: 8496506 doi: 10.1002/pld3.352
Liu C, Wang Y, Peng J, Fan B, Xu D, Wu J, et al. High-quality genome assembly and pan-genome studies facilitate genetic discovery in mung bean and its improvement. Plant Commun. 2022;3:100352.
pubmed: 35752938 pmcid: 9700124 doi: 10.1016/j.xplc.2022.100352
Ha J, Satyawan D, Jeong H, Lee E, Cho K-H, Kim MY, et al. A near-complete genome sequence of mungbean (Vigna radiata L.) provides key insights into the modern breeding program. Plant Genome. 2021;14:e20121.
pubmed: 34275211 doi: 10.1002/tpg2.20121
Hufford MB, Seetharam AS, Woodhouse MR, Chougule KM, Ou S, Liu J, et al. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science. 2021;373:655–62.
pubmed: 34353948 pmcid: 8733867 doi: 10.1126/science.abg5289
Chen J, Wang Z, Tan K, Huang W, Shi J, Li T, et al. A complete telomere-to-telomere assembly of the maize genome. Nat Genet. 2023;55:1221–31.
pubmed: 37322109 pmcid: 10335936 doi: 10.1038/s41588-023-01419-6
Zhou P, Silverstein KAT, Ramaraj T, Guhlin J, Denny R, Liu J, et al. Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes. BMC Genomics. 2017;18:261.
pubmed: 28347275 pmcid: 5369179 doi: 10.1186/s12864-017-3654-1
Hirsch CN, Foerster JM, Johnson JM, Sekhon RS, Muttoni G, Vaillancourt B, et al. Insights into the maize pan-genome and pan-transcriptome. Plant Cell. 2014;26:121–35.
pubmed: 24488960 pmcid: 3963563 doi: 10.1105/tpc.113.119982
Yao W, Li G, Zhao H, Wang G, Lian X, Xie W. Exploring the rice dispensable genome using a metagenome-like assembly strategy. Genome Biol. 2015;16:187.
pubmed: 26403182 pmcid: 4583175 doi: 10.1186/s13059-015-0757-3
Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial pan-genome. Proc Natl Acad Sci USA. 2005;102:13950–5.
pubmed: 16172379 pmcid: 1216834 doi: 10.1073/pnas.0506758102
Morgante M, De Paoli E, Radovic S. Transposable elements and the plant pan-genomes. Curr Opin Plant Biol. 2007;10:149–55.
pubmed: 17300983 doi: 10.1016/j.pbi.2007.02.001
Li R, Li Y, Zheng H, Luo R, Zhu H, Li Q, et al. Building the sequence map of the human pan-genome. Nat Biotechnol. 2010;28:57–63.
pubmed: 19997067 doi: 10.1038/nbt.1596
Montenegro JD, Golicz AA, Bayer PE, Hurgobin B, Lee H, Chan C-KK, et al. The pangenome of hexaploid bread wheat. Plant J. 2017;90:1007–13.
pubmed: 28231383 doi: 10.1111/tpj.13515
Gordon SP, Contreras-Moreira B, Woods DP, Des Marais DL, Burgess D, Shu S, et al. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure. Nat Commun. 2017;8:2184.
pubmed: 29259172 pmcid: 5736591 doi: 10.1038/s41467-017-02292-8
Yang T, Liu R, Luo Y, Hu S, Wang D, Wang C, et al. Improved pea reference genome and pan-genome highlight genomic features and evolutionary characteristics. Nat Genet. 2022;54:1553–63.
pubmed: 36138232 pmcid: 9534762 doi: 10.1038/s41588-022-01172-2
Gao L, Gonda I, Sun H, Ma Q, Bao K, Tieman DM, et al. The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat Genet. 2019;51:1044–51.
pubmed: 31086351 doi: 10.1038/s41588-019-0410-2
Bozan I, Achakkagari SR, Anglin NL, Ellis D, Tai HH, Strömvik MV. Pangenome analyses reveal impact of transposable elements and ploidy on the evolution of potato species. Proc Natl Acad Sci USA. 2023;120:e2211117120.
pubmed: 37487084 pmcid: 10401005 doi: 10.1073/pnas.2211117120
Hoopes G, Meng X, Hamilton JP, Achakkagari SR, de Alves Freitas Guesdes F, Bolger ME, et al. Phased, chromosome-scale genome assemblies of tetraploid potato reveal a complex genome, transcriptome, and predicted proteome landscape underpinning genetic diversity. Mol Plant. 2022;15:520–36.
pubmed: 35026436 doi: 10.1016/j.molp.2022.01.003
Cochetel N, Minio A, Guarracino A, Garcia JF, Figueroa-Balderas R, Massonnet M, et al. A super-pangenome of the north American wild grape species. Genome Biol. 2023;24:290.
pubmed: 38111050 pmcid: 10729490 doi: 10.1186/s13059-023-03133-2
Steuernagel B, Jupe F, Witek K, Jones JDG, Wulff BBH. NLR-parser: rapid annotation of plant NLR complements. Bioinformatics. 2015;31:1665–7.
pubmed: 25586514 pmcid: 4426836 doi: 10.1093/bioinformatics/btv005
Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30:1236–40.
pubmed: 24451626 pmcid: 3998142 doi: 10.1093/bioinformatics/btu031
Peng R, Xu Y, Tian S, Unver T, Liu Z, Zhou Z, et al. Evolutionary divergence of duplicated genomes in newly described allotetraploid cottons. Proc Natl Acad Sci USA. 2022;119:e2208496119.
pubmed: 36122204 pmcid: 9522333 doi: 10.1073/pnas.2208496119
Barragan AC, Weigel D. Plant NLR diversity: the known unknowns of pan-NLRomes. Plant Cell. 2021;33:814–31.
pubmed: 33793812 pmcid: 8226294 doi: 10.1093/plcell/koaa002
Murat F, Van de Peer Y, Salse J. Decoding plant and animal genome plasticity from differential paleo-evolutionary patterns and processes. Genome Biol Evol. 2012;4:917–28.
pubmed: 22833223 pmcid: 3516226 doi: 10.1093/gbe/evs066
Soltis PS, Soltis DE. The role of genetic and genomic attributes in the success of polyploids. Proc Natl Acad Sci USA. 2000;97:7051–7.
pubmed: 10860970 pmcid: 34383 doi: 10.1073/pnas.97.13.7051
Pellicer J, Hidalgo O, Dodsworth S, Leitch IJ. Genome size diversity and its impact on the evolution of land plants. Genes. 2018;9:88.
pubmed: 29443885 pmcid: 5852584 doi: 10.3390/genes9020088
Van de Peer Y, Mizrachi E, Marchal K. The evolutionary significance of polyploidy. Nat Rev Genet. 2017;18:411–24.
pubmed: 28502977 doi: 10.1038/nrg.2017.26
Bennett MD, Leitch IJ. Genome size evolution in plants. In: The evolution of the genome. Elsevier, Academic Press; 2005. p. 89–162. https://doi.org/10.1016/B978-012301463-4/50004-8 .
Herdan G. Quantitative Linguistics. Oxford, UK: Butterworths; 1964.
Heaps HS. Information Retrieval: computational and theoretical aspects. New York, NY: Academic Press, Inc; 1978.
Wang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature. 2018;557:43–9.
pubmed: 29695866 doi: 10.1038/s41586-018-0063-9
Ruperao P, Thirunavukkarasu N, Gandham P, Selvanayagam S, Govindaraj M, Nebie B, et al. Sorghum pan-genome explores the functional utility for genomic-assisted breeding to accelerate the genetic gain. Front Plant Sci. 2021;12:666342.
pubmed: 34140962 pmcid: 8204017 doi: 10.3389/fpls.2021.666342
Torkamaneh D, Lemay MA, Belzile F. The pan-genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content. Plant Biotechnol J. 2021;19:1852–62.
pubmed: 33942475 pmcid: 8428833 doi: 10.1111/pbi.13600
Ou L, Li D, Lv J, Chen W, Zhang Z, Li X, et al. Pan-genome of cultivated pepper (Capsicum) and its use in gene presence-absence variation analyses. New Phytol. 2018;220:360–3.
pubmed: 30129229 doi: 10.1111/nph.15413
Hübner S, Bercovich N, Todesco M, Mandel JR, Odenheimer J, Ziegler E, et al. Sunflower pan-genome analysis shows that hybridization altered gene content and disease resistance. Nat Plants. 2019;5:54–62.
pubmed: 30598532 doi: 10.1038/s41477-018-0329-0
Varshney RK, Roorkiwal M, Sun S, Bajaj P, Chitikineni A, Thudi M, et al. A chickpea genetic variation map based on the sequencing of 3,366 genomes. Nature. 2021;599:622–7.
pubmed: 34759320 pmcid: 8612933 doi: 10.1038/s41586-021-04066-1
Hurgobin B, Golicz AA, Bayer PE, Chan C-KK, Tirnaz S, Dolatabadian A, et al. Homoeologous exchange is a major cause of gene presence/absence variation in the amphidiploid Brassica napus. Plant Biotechnol J. 2018;16:1265–74.
pubmed: 29205771 pmcid: 5999312 doi: 10.1111/pbi.12867
Monnahan P, Brandvain Y. The effect of autopolyploidy on population genetic signals of hard sweeps. Biol Lett. 2020;16:20190796.
pubmed: 32097595 pmcid: 7058959 doi: 10.1098/rsbl.2019.0796
Tuttle HK, Del Rio AH, Bamberg JB, Shannon LM. Potato soup: analysis of cultivated potato gene bank populations reveals high diversity and little structure. Front Plant Sci. 2024;15:1429279.
pubmed: 39091313 pmcid: 11291250 doi: 10.3389/fpls.2024.1429279
Conover JL, Wendel JF. Deleterious mutations accumulate faster in allopolyploid than diploid cotton (Gossypium) and unequally between subgenomes. Mol Biol Evol. 2022;39:msac024.
pubmed: 35099532 pmcid: 8841602 doi: 10.1093/molbev/msac024
Pham GM, Newton L, Wiegert-Rininger K, Vaillancourt B, Douches DS, Buell CR. Extensive genome heterogeneity leads to preferential allele expression and copy number-dependent expression in cultivated potato. Plant J. 2017;92:624–37.
pubmed: 28869794 doi: 10.1111/tpj.13706
Schnable JC, Springer NM, Freeling M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc Natl Acad Sci USA. 2011;108:4069–74.
pubmed: 21368132 pmcid: 3053962 doi: 10.1073/pnas.1101368108
Liang Z, Schnable JC. Functional divergence between subgenomes and gene pairs after whole genome duplications. Mol Plant. 2018;11:388–97.
pubmed: 29275166 doi: 10.1016/j.molp.2017.12.010
Contreras-Moreira B, Cantalapiedra CP, García-Pereira MJ, Gordon SP, Vogel JP, Igartua E, et al. Analysis of plant pan-genomes and transcriptomes with GET_HOMOLOGUES-EST, a clustering solution for sequences of the same species. Front Plant Sci. 2017;8:184.
pubmed: 28261241 pmcid: 5306281 doi: 10.3389/fpls.2017.00184
Lin K, Zhang N, Severing EI, Nijveen H, Cheng F, Visser RGF, et al. Beyond genomic variation–comparison and functional annotation of three Brassica rapa genomes: a turnip, a rapid cycling and a Chinese cabbage. BMC Genomics. 2014;15:250.
pubmed: 24684742 pmcid: 4230417 doi: 10.1186/1471-2164-15-250
Zhao J, Bayer PE, Ruperao P, Saxena RK, Khan AW, Golicz AA, et al. Trait associations in the pangenome of pigeon pea (Cajanus cajan). Plant Biotechnol J. 2020;18:1946–54.
pubmed: 32020732 pmcid: 7415775 doi: 10.1111/pbi.13354
Lee J-H, Venkatesh J, Jo J, Jang S, Kim GW, Kim J-M, et al. High-quality chromosome-scale genomes facilitate effective identification of large structural variations in hot and sweet peppers. Hortic Res. 2022;9:uhac210.
pubmed: 36467270 pmcid: 9715575 doi: 10.1093/hr/uhac210
Li H, Wang S, Chai S, Yang Z, Zhang Q, Xin H, et al. Graph-based pan-genome reveals structural and sequence variations related to agronomic traits and domestication in cucumber. Nat Commun. 2022;13:682.
pubmed: 35115520 pmcid: 8813957 doi: 10.1038/s41467-022-28362-0
Liu Y, Du H, Li P, Shen Y, Peng H, Liu S, et al. Pan-genome of wild and cultivated soybeans. Cell. 2020;182:162–176.e13.
pubmed: 32553274 doi: 10.1016/j.cell.2020.05.023
Li Y, Zhou G, Ma J, Jiang W, Jin L, Zhang Z, et al. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits. Nat Biotechnol. 2014;32:1045–52.
pubmed: 25218520 doi: 10.1038/nbt.2979
Schatz MC, Maron LG, Stein JC, Hernandez Wences A, Gurtowski J, Biggers E, et al. Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biol. 2014;15:506.
pubmed: 25468217
Liu C, Peng P, Li W, Ye C, Zhang S, Wang R, et al. Deciphering variation of 239 elite japonica rice genomes for whole genome sequences-enabled breeding. Genomics. 2021;113:3083–91.
pubmed: 34237377 doi: 10.1016/j.ygeno.2021.07.002
Shang L, Li X, He H, Yuan Q, Song Y, Wei Z, et al. A super pan-genomic landscape of rice. Cell Res. 2022;32:878–96.
pubmed: 35821092 pmcid: 9525306 doi: 10.1038/s41422-022-00685-z
Hu Z, Wang W, Wu Z, Sun C, Li M, Lu J, et al. Novel sequences, structural variations and gene presence variations of Asian cultivated rice. Sci Data. 2018;5:180079.
pubmed: 29718005 pmcid: 5931083 doi: 10.1038/sdata.2018.79
Zhang X, Liu T, Wang J, Wang P, Qiu Y, Zhao W, et al. Pan-genome of Raphanus highlights genetic variation and introgression among domesticated, wild, and weedy radishes. Mol Plant. 2021;14:2032–55.
pubmed: 34384905 doi: 10.1016/j.molp.2021.08.005
Yu J, Golicz AA, Lu K, Dossa K, Zhang Y, Chen J, et al. Insight into the evolution and functional characteristics of the pan-genome assembly from sesame landraces and modern cultivars. Plant Biotechnol J. 2019;17:881–92.
pubmed: 30315621 doi: 10.1111/pbi.13022
Tao Y, Luo H, Xu J, Cruickshank A, Zhao X, Teng F, et al. Extensive variation within the pan-genome of cultivated and wild sorghum. Nat Plants. 2021;7:766–73.
pubmed: 34017083 doi: 10.1038/s41477-021-00925-x
Gui S, Wei W, Jiang C, Luo J, Chen L, Wu S, et al. A pan-zea genome map for enhancing maize improvement. Genome Biol. 2022;23:178.
pubmed: 35999561 doi: 10.1186/s13059-022-02742-7
Golicz AA, Batley J, Edwards D. Towards plant pangenomics. Plant Biotechnol J. 2016;14:1099–105.
pubmed: 26593040 doi: 10.1111/pbi.12499
Della Coletta R, Qiu Y, Ou S, Hufford MB, Hirsch CN. How the pan-genome is changing crop genomics and improvement. Genome Biol. 2021;22:3.
pubmed: 33397434 doi: 10.1186/s13059-020-02224-8
Yuan Y, Bayer PE, Batley J, Edwards D. Current status of structural variation studies in plants. Plant Biotechnol J. 2021;19:2153–63.
pubmed: 34101329 doi: 10.1111/pbi.13646
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.
pubmed: 24451623 pmcid: 3998144 doi: 10.1093/bioinformatics/btu033
Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.
pubmed: 20525638 doi: 10.1093/sysbio/syq010
Price MN, Dehal PS, Arkin AP. FastTree 2 — approximately maximum-likelihood trees for large alignments. PLoS ONE. 2010;5:e9490.
pubmed: 20224823 pmcid: 2835736 doi: 10.1371/journal.pone.0009490
Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18:170–5.
pubmed: 33526886 doi: 10.1038/s41592-020-01056-5
Cheng H, Jarvis ED, Fedrigo O, Koepfli K-P, Urban L, Gemmell NJ, et al. Haplotype-resolved assembly of diploid genomes without parental data. Nat Biotechnol. 2022;40:1332–5.
pubmed: 35332338 doi: 10.1038/s41587-022-01261-x
Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–36.
pubmed: 28298431 pmcid: 5411767 doi: 10.1101/gr.215087.116
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–77.
pubmed: 22506599 pmcid: 3342519 doi: 10.1089/cmb.2012.0021
Xiao C-L, Chen Y, Xie S-Q, Chen K-N, Wang Y, Han Y, et al. MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads. Nat Methods. 2017;14:1072–4.
pubmed: 28945707 doi: 10.1038/nmeth.4432
Li H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics. 2016;32:2103–10.
pubmed: 27153593 doi: 10.1093/bioinformatics/btw152
Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 2020;17:155–8.
pubmed: 31819265 doi: 10.1038/s41592-019-0669-3
Vaser R, Šikić M. Time- and memory-efficient genome assembly with Raven. Nat Comput Sci. 2021;1:332–6.
pubmed: 38217213 doi: 10.1038/s43588-021-00073-4
Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563–9.
pubmed: 23644548 doi: 10.1038/nmeth.2474
Morisse P, Marchet C, Limasset A, Lecroq T, Lefebvre A. Scalable long read self-correction and assembly polishing with multiple sequence alignment. Sci Rep. 2021;11:761.
pubmed: 33436980 pmcid: 7804095 doi: 10.1038/s41598-020-80757-5
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.
pubmed: 25409509 pmcid: 4237348 doi: 10.1371/journal.pone.0112963
Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de bruijn graphs. Genome Res. 2008;18:821–9.
pubmed: 18349386 pmcid: 2336801 doi: 10.1101/gr.074492.107
Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1:18.
pubmed: 23587118 pmcid: 3626529 doi: 10.1186/2047-217X-1-18
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJM, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–23.
pubmed: 19251739 doi: 10.1101/gr.089532.108
Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, et al. ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter. Genome Res. 2017;27:768–77.
pubmed: 28232478 doi: 10.1101/gr.214346.116
Boisvert S, Laviolette F, Corbeil J. Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol. 2010;17:1519–33.
pubmed: 20958248 doi: 10.1089/cmb.2009.0238
Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de bruijn graph. Bioinformatics. 2015;31:1674–6.
pubmed: 25609793 doi: 10.1093/bioinformatics/btv033
Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–8.
pubmed: 22495754 doi: 10.1093/bioinformatics/bts174
Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci USA. 2011;108:1513–8.
pubmed: 21187386 doi: 10.1073/pnas.1017351108
Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013;29:2669–77.
pubmed: 23990416 doi: 10.1093/bioinformatics/btt476
Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–5.
pubmed: 28336562 doi: 10.1126/science.aal3327
Zhang X, Zhang S, Zhao Q, Ming R, Tang H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat Plants. 2019;5:833–45.
pubmed: 31383970 doi: 10.1038/s41477-019-0487-8
Durand NC, Shamim MS, Machol I, Rao SSP, Huntley MH, Lander ES, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–8.
pubmed: 27467249 doi: 10.1016/j.cels.2016.07.002
Kermit: guided genome assembler using colored overlap graphs. https://github.com/rikuu/kermit . Accessed 12 Jun 2024.
Kermit-optical-maps. https://github.com/Denopia/kermit-optical-maps . Accessed 12 Jun 2024.
Novo_Stitch. Novo&Stitch is a genome assembly reconciliation tool based on optical map. https://github.com/ucrbioinfo/Novo_Stitch . Accessed 12 Jun 2024.
OMGS. OMGS is a fast and accurate genome scaffolding tool with one or multiple Bionano optical maps. https://github.com/ucrbioinfo/OMGS . Accessed 12 Jun 2024.
Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12:491.
pubmed: 22192575 pmcid: 3280279 doi: 10.1186/1471-2105-12-491
Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom Bioinform. 2021;3:lqaa108.
pubmed: 33575650 doi: 10.1093/nargab/lqaa108
Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008;24:637–44.
pubmed: 18218656 doi: 10.1093/bioinformatics/btn013
Geneid. Predict genic elements as splice sites, exons or genes, along eukaryotic DNA sequences. https://github.com/guigolab/geneid?tab=readme-ov-file . Accessed 12 Jun 2024.
Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20:2878–9.
pubmed: 15145805 doi: 10.1093/bioinformatics/bth315
Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
pubmed: 15144565 doi: 10.1186/1471-2105-5-59
Interproscan. Genome-scale protein function classification. https://github.com/ebi-pf-team/interproscan . Accessed 12 Jun 2024.
Mount DW. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007;2007:pdb.top17.
pubmed: 21357135
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
pubmed: 19451168 doi: 10.1093/bioinformatics/btp324
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
pubmed: 22388286 pmcid: 3322381 doi: 10.1038/nmeth.1923
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37:907–15.
pubmed: 31375807 pmcid: 7605509 doi: 10.1038/s41587-019-0201-4
Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.
pubmed: 29750242 pmcid: 6137996 doi: 10.1093/bioinformatics/bty191
Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15:461–8.
pubmed: 29713083 doi: 10.1038/s41592-018-0001-7
BLASR. The PacBio
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.
pubmed: 14759262 doi: 10.1186/gb-2004-5-2-r12
Song B, Marco-Sola S, Moreto M, Johnson L, Buckler ES, Stitzer MC. AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication. Proc Natl Acad Sci USA. 2022;119:e2113075119.
pubmed: 34934012 doi: 10.1073/pnas.2113075119
Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020;587:246–51.
pubmed: 33177663 doi: 10.1038/s41586-020-2871-y
Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14:1394–403.
pubmed: 15231754 pmcid: 442156 doi: 10.1101/gr.2289704
lastz. Program for aligning DNA sequences, a pairwise aligner. https://github.com/lastz/lastz . Accessed 12 Jun 2024.
Angiuoli SV, Salzberg SL. Mugsy: fast multiple alignment of closely related whole genomes. Bioinformatics. 2011;27:334–42.
pubmed: 21148543 doi: 10.1093/bioinformatics/btq665
Goel M, Sun H, Jiao W-B, Schneeberger K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 2019;20:277.
pubmed: 31842948 doi: 10.1186/s13059-019-1911-0
Chakraborty M, Emerson JJ, Macdonald SJ, Long AD. Structural variants exhibit widespread allelic heterogeneity and shape variation in complex traits. Nat Commun. 2019;10:4872.
pubmed: 31653862 doi: 10.1038/s41467-019-12884-1
Nattestad M, Schatz MC. Assemblytics: a web analytics tool for the detection of variants from an assembly. Bioinformatics. 2016;32:3021–3.
pubmed: 27318204 doi: 10.1093/bioinformatics/btw369
Kronenberg ZN, Fiddes IT, Gordon D, Murali S, Cantsilieris S, Meyerson OS, et al. High-resolution comparative analysis of great ape genomes. Science. 2018;360:eaar6343.
pubmed: 29880660 pmcid: 6178954 doi: 10.1126/science.aar6343
Smolka M, Paulin LF, Grochowski CM, Horner DW, Mahmoud M, Behera S, et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol. 2024;42:1571–80.
English AC, Salerno WJ, Reid JG. PBHoney: identifying genomic variants via long-read discordance and interrupted mapping. BMC Bioinformatics. 2014;15:180.
pubmed: 24915764 doi: 10.1186/1471-2105-15-180
Cretu Stancu M, van Roosmalen MJ, Renkens I, Nieboer MM, Middelkamp S, de Ligt J, et al. Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat Commun. 2017;8:1326.
pubmed: 29109544 doi: 10.1038/s41467-017-01343-4
Fan X, Abbott TE, Larson D, Chen K. BreakDancer: Identification of genomic structural variation from paired-end read mapping. Curr Protoc Bioinformatics. 2014;45:15.6.1–11.
pubmed: 25152801
Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014;15:R84.
pubmed: 24970577 doi: 10.1186/gb-2014-15-6-r84
Rausch T, Zichner T, Schlattl A, Stütz AM, Benes V, Korbel JO. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics. 2012;28:i333–9.
pubmed: 22962449 doi: 10.1093/bioinformatics/bts378
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
pubmed: 20644199 doi: 10.1101/gr.107524.110
Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv. 2012. https://doi.org/10.48550/arXiv.1207.3907 .
Perea C, De La Hoz JF, Cruz DF, Lobaton JD, Izquierdo P, Quintero JC, et al. Bioinformatic analysis of genotype by sequencing (GBS) data with NGSEP. BMC Genomics. 2016;17(Suppl 5):498.
pubmed: 27585926 doi: 10.1186/s12864-016-2827-7
Tello D, Gil J, Loaiza CD, Riascos JJ, Cardozo N, Duitama J. NGSEP3: accurate variant calling across species and sequencing protocols. Bioinformatics. 2019;35:4716–23.
pubmed: 31099384 pmcid: 6853766 doi: 10.1093/bioinformatics/btz275
Giordano F, Stammnitz MR, Murchison EP, Ning Z. scanPAV: a pipeline for extracting presence-absence variations in genome pairs. Bioinformatics. 2018;34:3022–4.
pubmed: 29608694 pmcid: 6129304 doi: 10.1093/bioinformatics/bty189
Tay Fernandez CG, Marsh JI, Nestor BJ, Gill M, Golicz AA, Bayer PE, et al. An SGSGeneloss-based method for constructing a gene presence-absence table using mosdepth. Methods Mol Biol. 2022;2512:73–80.
pubmed: 35818000 doi: 10.1007/978-1-0716-2429-6_5
Tahir Ul Qamar M, Zhu X, Xing F, Chen L-L. ppsPCP: A plant presence/absence variants scanner and pan-genome construction pipeline. Bioinformatics. 2019;35:4156–8.
pubmed: 30851098 doi: 10.1093/bioinformatics/btz168
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10:giab008.
doi: 10.1093/gigascience/giab008
SURVIVOR. Toolset for SV simulation, comparison and filtering. https://github.com/fritzsedlazeck/SURVIVOR . Accessed 25 Sep 2024.
Zheng Z, Zhu M, Zhang J, Liu X, Hou L, Liu W, et al. A sequence-aware merger of genomic structural variations at population scale. Nat Commun. 2024;15:960.
pubmed: 38307885 pmcid: 10837428 doi: 10.1038/s41467-024-45244-9
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
pubmed: 20110278 doi: 10.1093/bioinformatics/btq033
Jasmine. SV merging across samples. https://github.com/mkirsche/Jasmine . Accessed 12 Jun 2024.
Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16:157.
pubmed: 26243257 pmcid: 4531804 doi: 10.1186/s13059-015-0721-2
Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.
pubmed: 12952885 pmcid: 403725 doi: 10.1101/gr.1224503
Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49.
pubmed: 22217600 pmcid: 3326336 doi: 10.1093/nar/gkr1293
Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–2.
pubmed: 23060610 pmcid: 3516142 doi: 10.1093/bioinformatics/bts565
Wang J, Yang W, Zhang S, Hu H, Yuan Y, Dong J, et al. A pangenome analysis pipeline provides insights into functional gene identification in rice. Genome Biol. 2023;24:19.
pubmed: 36703158 pmcid: 9878884 doi: 10.1186/s13059-023-02861-9
Garrison E, Sirén J, Novak AM, Hickey G, Eizenga JM, Dawson ET, et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol. 2018;36:875–9.
pubmed: 30125266 pmcid: 6126949 doi: 10.1038/nbt.4227
Hickey G, Heller D, Monlong J, Sibbesen JA, Sirén J, Eizenga J, et al. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biol. 2020;21:35.
pubmed: 32051000 pmcid: 7017486 doi: 10.1186/s13059-020-1941-7
Li H, Feng X, Chu C. The design and construction of reference pangenome graphs with minigraph. Genome Biol. 2020;21:265.
pubmed: 33066802 pmcid: 7568353 doi: 10.1186/s13059-020-02168-z
Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y, et al. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol. 2024;42:663–73.
pubmed: 37165083 doi: 10.1038/s41587-023-01793-w
Garrison E, Guarracino A, Heumos S, Villani F, Bao Z, Tattini L, et al. Building pangenome graphs. BioRxiv. 2023. https://doi.org/10.1101/2023.04.05.535718 .
Khan J, Kokot M, Deorowicz S, Patro R. Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with cuttlefish 2. Genome Biol. 2022;23:190.
pubmed: 36076275 pmcid: 9454175 doi: 10.1186/s13059-022-02743-6
Holley G, Melsted P. Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs. Genome Biol. 2020;21:249.
pubmed: 32943081 pmcid: 7499882 doi: 10.1186/s13059-020-02135-8
Sirén J, Monlong J, Chang X, Novak AM, Eizenga JM, Markello C, et al. Pangenomics enables genotyping of known structural variants in 5202 diverse genomes. Science. 2021;374:abg8871.
pubmed: 34914532 pmcid: 9365333 doi: 10.1126/science.abg8871
Sović I, Šikić M, Wilm A, Fenlon SN, Chen S, Nagarajan N. Fast and sensitive mapping of nanopore sequencing reads with GraphMap. Nat Commun. 2016;7:11307.
pubmed: 27079541 pmcid: 4835549 doi: 10.1038/ncomms11307
Rautiainen M, Marschall T. GraphAligner: rapid and versatile sequence-to-graph alignment. Genome Biol. 2020;21:253.
pubmed: 32972461 pmcid: 7513500 doi: 10.1186/s13059-020-02157-2
Paten B, Eizenga JM, Rosen YM, Novak AM, Garrison E, Hickey G. Superbubbles, ultrabubbles, and cacti. J Comput Biol. 2018;25:649–63.
pubmed: 29461862 pmcid: 6067107 doi: 10.1089/cmb.2017.0251
Chen S, Krusche P, Dolzhenko E, Sherman RM, Petrovski R, Schlesinger F, et al. Paragraph: a graph-based structural variant genotyper for short-read sequence data. Genome Biol. 2019;20:291.
pubmed: 31856913 pmcid: 6921448 doi: 10.1186/s13059-019-1909-7
Eggertsson HP, Jonsson H, Kristmundsdottir S, Hjartarson E, Kehr B, Masson G, et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat Genet. 2017;49:1654–60.
pubmed: 28945251 doi: 10.1038/ng.3964
Eggertsson HP, Kristmundsdottir S, Beyter D, Jonsson H, Skuladottir A, Hardarson MT, et al. GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs. Nat Commun. 2019;10:5402.
pubmed: 31776332 pmcid: 6881350 doi: 10.1038/s41467-019-13341-9
Ebler J, Ebert P, Clarke WE, Rausch T, Audano PA, Houwaart T, et al. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat Genet. 2022;54:518–25.
pubmed: 35410384 pmcid: 9005351 doi: 10.1038/s41588-022-01043-w
Sibbesen JA, Maretty L, Danish Pan-Genome Consortium, Krogh A. Accurate genotyping across variant classes and lengths using variant graphs. Nat Genet. 2018;50:1054–9.
pubmed: 29915429 doi: 10.1038/s41588-018-0145-5
Horsfield ST, Tonkin-Hill G, Croucher NJ, Lees JA. Accurate and fast graph-based pangenome annotation and clustering with ggCaller. Genome Res. 2023;33:1622–37.
pubmed: 37620118 pmcid: 10620059 doi: 10.1101/gr.277733.123
Sheikhizadeh S, Schranz ME, Akdel M, de Ridder D, Smit S. PanTools: representation, storage and exploration of pan-genomic data. Bioinformatics. 2016;32:i487–93.
pubmed: 27587666 doi: 10.1093/bioinformatics/btw455
Anari SS, de Ridder D, Schranz ME, Smit S. Pangenomic read mapping. BioRxiv. 2019. https://doi.org/10.1101/813634 .
Sibbesen JA, Eizenga JM, Novak AM, Sirén J, Chang X, Garrison E, et al. Haplotype-aware pantranscriptome analyses using spliced pangenome graphs. Nat Methods. 2023;20:239–47.
pubmed: 36646895 doi: 10.1038/s41592-022-01731-9
Wick RR, Schultz MB, Zobel J, Holt KE. BANDAGE: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31:3350–2.
pubmed: 26099265 pmcid: 4595904 doi: 10.1093/bioinformatics/btv383
Gonnella G, Niehus N, Kurtz S. GfaViz: flexible and interactive visualization of GFA sequence graphs. Bioinformatics. 2019;35:2853–5.
pubmed: 30596893 doi: 10.1093/bioinformatics/bty1046
Durant É, Sabot F, Conte M, Rouard M. PANACHE: a web browser-based viewer for linearized pangenomes. Bioinformatics. 2021;37:4556–8.
pubmed: 34601567 pmcid: 8652104 doi: 10.1093/bioinformatics/btab688
SequenceTubeMap: displays multiple genomic sequences in the form of a tube map. https://github.com/vgteam/sequenceTubeMap . Accessed 12 Jun 2024.
Guarracino A, Heumos S, Nahnsen S, Prins P, Garrison E. ODGI: understanding pangenome graphs. Bioinformatics. 2022;38:3319–26.
pubmed: 35552372 pmcid: 9237687 doi: 10.1093/bioinformatics/btac308
Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet. 2012;44:226–32.
pubmed: 22231483 pmcid: 3272472 doi: 10.1038/ng.1028
Qin P, Lu H, Du H, Wang H, Chen W, Chen Z, et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell. 2021;184:3542–3558.e16.
pubmed: 34051138 doi: 10.1016/j.cell.2021.04.046
Glick L, Mayrose I. The effect of methodological considerations on the construction of gene-based plant pan-genomes. Genome Biol Evol. 2023;15:evad121.
doi: 10.1093/gbe/evad121
Mehrotra S, Goyal V. Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function. Genomics Proteom Bioinf. 2014;12:164–71.
doi: 10.1016/j.gpb.2014.07.003
Negi P, Rai AN, Suprasanna P. Moving through the stressed genome: emerging regulatory roles for transposons in plant stress response. Front Plant Sci. 2016;7:1448.
pubmed: 27777577 pmcid: 5056178 doi: 10.3389/fpls.2016.01448
Bourque G, Burns KH, Gehring M, Gorbunova V, Seluanov A, Hammell M, et al. Ten things you should know about transposable elements. Genome Biol. 2018;19:199.
pubmed: 30454069 pmcid: 6240941 doi: 10.1186/s13059-018-1577-z
Bariah I, Keidar-Friedman D, Kashkush K. Where the wild things are: transposable elements as drivers of structural and functional variations in the wheat genome. Front Plant Sci. 2020;11:585515.
pubmed: 33072155 pmcid: 7530836 doi: 10.3389/fpls.2020.585515
Zhao Q, Feng Q, Lu H, Li Y, Wang A, Tian Q, et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nat Genet. 2018;50:278–84.
pubmed: 29335547 doi: 10.1038/s41588-018-0041-z
Golicz AA, Bayer PE, Barker GC, Edger PP, Kim H, Martinez PA, et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat Commun. 2016;7:13390.
pubmed: 27834372 pmcid: 5114598 doi: 10.1038/ncomms13390
Badouin H, Gouzy J, Grassa CJ, Murat F, Staton SE, Cottret L, et al. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature. 2017;546:148–52.
pubmed: 28538728 doi: 10.1038/nature22380
Hu Z, Sun C, Lu K-C, Chu X, Zhao Y, Lu J, et al. EUPAN enables pan-genome studies of a large number of eukaryotic genomes. Bioinformatics. 2017;33:2408–9.
pubmed: 28369371 doi: 10.1093/bioinformatics/btx170
Ho SS, Urban AE, Mills RE. Structural variation in the sequencing era. Nat Rev Genet. 2020;21:171–89.
pubmed: 31729472 doi: 10.1038/s41576-019-0180-9
Kosugi S, Momozawa Y, Liu X, Terao C, Kubo M, Kamatani Y. Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing. Genome Biol. 2019;20:117.
pubmed: 31159850 pmcid: 6547561 doi: 10.1186/s13059-019-1720-5
Zhou Y, Zhang Z, Bao Z, Li H, Lyu Y, Zan Y, et al. Graph pangenome captures missing heritability and empowers tomato breeding. Nature. 2022;606:527–34.
pubmed: 35676474 pmcid: 9200638 doi: 10.1038/s41586-022-04808-9
He Q, Tang S, Zhi H, Chen J, Zhang J, Liang H, et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat Genet. 2023;55:1232–42.
pubmed: 37291196 pmcid: 10335933 doi: 10.1038/s41588-023-01423-w
Guarracino A, Mwaniki N, Marco-Sola S, Garrison E. Wfmash: a pangenome-scale aligner. Zenodo. 2021. https://doi.org/10.5281/zenodo.6949373 .
Garrison E, Guarracino A. Unbiased pangenome graphs. Bioinformatics. 2023;39:btac743.
pubmed: 36448683 doi: 10.1093/bioinformatics/btac743
Cleary A, Ramaraj T, Kahanda I, Mudge J, Mumey B. Exploring frequented regions in pan-genomic graphs. IEEE/ACM Trans Comput Biol Bioinform. 2019;16:1424–35.
pubmed: 30106690 doi: 10.1109/TCBB.2018.2864564
Andreace F, Lechat P, Dufresne Y, Chikhi R. Comparing methods for constructing and representing human pangenome graphs. Genome Biol. 2023;24:274.
pubmed: 38037131 doi: 10.1186/s13059-023-03098-2
Liao W-W, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, et al. A draft human pangenome reference. Nature. 2023;617:312–24.
pubmed: 37165242 doi: 10.1038/s41586-023-05896-x
Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(Suppl 2):ii215–25.
pubmed: 14534192 doi: 10.1093/bioinformatics/btg1080
Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinf. 2007;Chap. 4:Unit 4.3.
UniProt Consortium. Uniprot: the universal protein knowledgebase in 2023. Nucleic Acids Res. 2023;51:D523–31.
doi: 10.1093/nar/gkac1052
Reiser L, Bakker E, Subramaniam S, Chen X, Sawant S, Khosa K, et al. The Arabidopsis information resource in 2024. Genetics. 2024;227:iyae027.
pubmed: 38457127 doi: 10.1093/genetics/iyae027
UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47:D506–15.
doi: 10.1093/nar/gky1049
Zheng Y, Jiao C, Sun H, Rosli HG, Pombo MA, Zhang P, et al. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol Plant. 2016;9:1667–70.
pubmed: 27717919 doi: 10.1016/j.molp.2016.09.014
Tian F, Yang DC, Meng YQ, Jin J, Gao G. PlantRegMap: charting functional regulatory maps in plants. Nucleic Acids Res. 2020;48:D1104–13.
pubmed: 31701126
Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42(Database issue):D222–30.
doi: 10.1093/nar/gkt1223
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20:275.
pubmed: 31843001 pmcid: 6913007 doi: 10.1186/s13059-019-1905-y
Xu Z, Wang H. LTR_FINDER: An efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35(Web Server issue):W265–8.
pubmed: 17485477 pmcid: 1933203 doi: 10.1093/nar/gkm286
Ellinghaus D, Kurtz S, Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 2008;9:18.
pubmed: 18194517 pmcid: 2253517 doi: 10.1186/1471-2105-9-18
Ou S, Jiang N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2018;176:1410–22.
pubmed: 29233850 doi: 10.1104/pp.17.01310
Shi J, Liang C. Generic Repeat Finder: a high-sensitivity tool for genome-wide de novo repeat detection. Plant Physiol. 2019;180:1803–15.
pubmed: 31152127 pmcid: 6670090 doi: 10.1104/pp.19.00386
Su W, Gu X, Peterson T. TIR-Learner, a new ensemble method for tir transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol Plant. 2019;12:447–60.
pubmed: 30802553 doi: 10.1016/j.molp.2019.02.008
Xiong W, He L, Lai J, Dooner HK, Du C. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc Natl Acad Sci USA. 2014;111:10263–8.
pubmed: 24982153 pmcid: 4104883 doi: 10.1073/pnas.1410068111
Hubley R, Smit A. ISB repeat modeler. https://www.repeatmasker.org/RepeatModeler/ . Accessed 11 Jun 2024.
Institute for Systems Biology. Repeat masker. ISB Repeat Masker. https://www.repeatmasker.org/RepeatMasker/ . Accessed 11 Jun 2024.
Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.
pubmed: 9862982 doi: 10.1093/nar/27.2.573
Mudge J, Farmer AD. Sequencing, assembly, and annotation of the alfalfa genome. In: Yu L-X, Kole C, editors. The Alfalfa Genome. Cham: Springer International Publishing; 2021. p. 87–109.
doi: 10.1007/978-3-030-74466-3_6
Ballouz S, Dobin A, Gillis JA. Is it time to change the reference genome? Genome Biol. 2019;20:159.
pubmed: 31399121 doi: 10.1186/s13059-019-1774-4
Bradbury PJ, Casstevens T, Jensen SE, Johnson LC, Miller ZR, Monier B, et al. The practical haplotype graph, a platform for storing and using pangenomes for imputation. Bioinformatics. 2022;38:3698–702.
pubmed: 35748708 doi: 10.1093/bioinformatics/btac410
Gallais A. Quantitative genetics and breeding methods in autopolyploid plants. Versailles: QUAE; 2004.
Zhou Q, Tang D, Huang W, Yang Z, Zhang Y, Hamilton JP, et al. Haplotype-resolved genome analyses of a heterozygous diploid potato. Nat Genet. 2020;52:1018–23.
pubmed: 32989320 doi: 10.1038/s41588-020-0699-x
Li A, Liu A, Du X, Chen J-Y, Yin M, Hu H-Y, et al. A chromosome-scale genome assembly of a diploid alfalfa, the progenitor of autotetraploid alfalfa. Hortic Res. 2020;7:194.
pubmed: 33328470 doi: 10.1038/s41438-020-00417-7
Uitdewilligen JGAML, Wolters A-MA, D’hoop BB, Borm TJA, Visser RGF, van Eck HJ. A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS ONE. 2013;8:e62355.
pubmed: 23667470 doi: 10.1371/journal.pone.0062355
USDA-ARS. Legume information system: genome assembly of cultivated Alfalfa at Diploid genome. https://data.legumeinfo.org/Medicago/sativa/genomes/ . Accessed 11 Jun 2024.
Russelle MP. Alfalfa: after an 8,000-year journey, the queen of Forages stands poised to enjoy renewed popularity. Am Sci. 2001;89:252–61.
doi: 10.1511/2001.3.252
Wang Z, Şakiroğlu M. The origin, evolution, and genetic diversity of alfalfa. In: Yu L-X, Kole C, editors. The Alfalfa Genome. Cham: Springer International Publishing; 2021. p. 29–42.
doi: 10.1007/978-3-030-74466-3_3
Li A, Liu A, Wu S, Qu K, Hu H, Yang J, et al. Comparison of structural variants in the whole genome sequences of two Medicago truncatula ecotypes: Jemalong A17 and R108. BMC Plant Biol. 2022;22:77.
pubmed: 35193491 doi: 10.1186/s12870-022-03469-0
Hardigan MA, Crisovan E, Hamilton JP, Kim J, Laimbeer P, Leisner CP, et al. Genome reduction uncovers a large dispensable genome and adaptive role for copy number variation in asexually propagated Solanum tuberosum. Plant Cell. 2016;28:388–405.
pubmed: 26772996 doi: 10.1105/tpc.15.00538
Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38:4647–54.
pubmed: 34320186 doi: 10.1093/molbev/msab199
Jeffares DC, Jolly C, Hoti M, Speed D, Shaw L, Rallis C, et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat Commun. 2017;8:14061.
pubmed: 28117401 doi: 10.1038/ncomms14061
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92.
pubmed: 22728672 doi: 10.4161/fly.19695
Alonge M, Wang X, Benoit M, Soyk S, Pereira L, Zhang L, et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell. 2020;182:145–161.e23.
pubmed: 32553272 pmcid: 7354227 doi: 10.1016/j.cell.2020.05.021
Medina CA, Zhao D, Lin M, Sapkota M, Sandercock AM, Beil CT, et al. Pre-breeding in alfalfa germplasm develops highly differentiated populations, as revealed by genome-wide microhaplotype markers. Sci Rep. 2024. https://doi.org/10.21203/rs.3.rs-4215295/v1 .
Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, et al. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020;38:276–8.
pubmed: 32055031 doi: 10.1038/s41587-020-0439-x
Wang Z, Wang X, Zhang H, Ma L, Zhao H, Jones CS, et al. A genome-wide association study approach to the identification of candidate genes underlying agronomic traits in alfalfa (Medicago sativa L). Plant Biotechnol J. 2020;18:611–3.
pubmed: 31487419 doi: 10.1111/pbi.13251
Hu H, Li R, Zhao J, Batley J, Edwards D. Technological development and advances for constructing and analyzing plant pangenomes. Genome Biol Evol. 2024;16:evae081.
pubmed: 38669452 doi: 10.1093/gbe/evae081
Beyer W, Novak AM, Hickey G, Chan J, Tan V, Paten B, et al. Sequence tube maps: making graph genomes intuitive to commuters. Bioinformatics. 2019;35:5318–20.
pubmed: 31368484 doi: 10.1093/bioinformatics/btz597
Wilson ZA, Morroll SM, Dawson J, Swarup R, Tighe PJ. The Arabidopsis MALE STERILITY1 (MS1) gene is a transcriptional regulator of male gametogenesis, with homology to the PHD-finger family of transcription factors. Plant J. 2001;28:27–39.
pubmed: 11696184 doi: 10.1046/j.1365-313X.2001.01125.x
Fu G-Q, Xu S, Xie Y-J, Han B, Nie L, Shen W-B, et al. Molecular cloning, characterization, and expression of an alfalfa (Medicago sativa L.) heme oxygenase-1 gene, MsHO1, which is pro-oxidants-regulated. Plant Physiol Biochem. 2011;49:792–9.
pubmed: 21316255 doi: 10.1016/j.plaphy.2011.01.018
Zhang X, Chen B, Wang L, Ali S, Guo Y, Liu J, et al. Genome-wide identification and characterization of caffeic acid O-methyltransferase gene family in soybean. Plants. 2021;10:2816.
doi: 10.3390/plants10122816
Jing Y, Paau AS, Brill WJ. Leghemoglobins from alfalfa (Medicago sativa L. vernal) root nodules. I. Purification and in vitro synthesis of five leghemoglobin components. Plant Sci Lett. 1982;25:119–32.
doi: 10.1016/0304-4211(82)90170-5
Löbler M, Hirsch AM. An alfalfa (Medicago sativa L.) cDNA encoding an acidic leghemoglobin (MsLb3). Plant Mol Biol. 1992;20:733–6.
pubmed: 1450387 doi: 10.1007/BF00046457
GBrowser. https://gbrowser.sourceforge.net/ . Accessed 30 Sep 2024.
Diesh C, Stevens GJ, Xie P, De Jesus Martinez T, Hershberg EA, Leung A, et al. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol. 2023;24:74.
pubmed: 37069644 doi: 10.1186/s13059-023-02914-z
Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinf. 2013;14:178–92.
doi: 10.1093/bib/bbs017
GitHub - SJTU-CGM/PPanG: a precise pangenome browser combining linear and graph-based pan-genome. https://github.com/SJTU-CGM/PPanG/ . Accessed 30 Sep 2024.
Manuweera B, Mudge J, Kahanda I, Mumey B, Ramaraj T, Cleary A. Pangenome-wide association studies with frequented regions. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. New York: ACM; 2019. p. 627–32.
doi: 10.1145/3307339.3343478
Tay Fernandez CG, Nestor BJ, Danilevicz MF, Marsh JI, Petereit J, Bayer PE, et al. Expanding gene-editing potential in crop improvement with pangenomes. Int J Mol Sci. 2022;23:2276.
pubmed: 35216392 doi: 10.3390/ijms23042276
Jin S, Han Z, Hu Y, Si Z, Dai F, He L, et al. Structural variation (SV)-based pan-genome and GWAS reveal the impacts of SVs on the speciation and diversification of allotetraploid cottons. Mol Plant. 2023;16:678–93.
pubmed: 36760124 doi: 10.1016/j.molp.2023.02.004
Sun C, Hu Z, Zheng T, Lu K, Zhao Y, Wang W, et al. RPAN: Rice pan-genome browser for ∼3000 rice genomes. Nucleic Acids Res. 2017;45:597–605.
pubmed: 27940610 doi: 10.1093/nar/gkw958
Guangdong Laboratory for Lingnan Modern Agriculture SP of W& CRPT, Institute AG, Chinese Academy of Agricultural Sciences. RiceSuperPIRdb. http://www.ricesuperpir.com/ . Accessed 11 Jun 2024.
Lawrence CJ, Dong Q, Polacco ML, Seigfried TE, Brendel V. MaizeGDB, the community database for maize genetics and genomics. Nucleic Acids Res. 2004;32(Database issue):D393–7.
pubmed: 14681441 doi: 10.1093/nar/gkh011
Gui S, Yang L, Li J, Luo J, Xu X, Yuan J, et al. ZEAMAP, a comprehensive database adapted to the maize multi-omics era. iScience. 2020;23:101241.
pubmed: 32629608 doi: 10.1016/j.isci.2020.101241
Bayer PE, Petereit J, Durant É, Monat C, Rouard M, Hu H, et al. Wheat panache: a pangenome graph database representing presence-absence variation across sixteen bread wheat genomes. Plant Genome. 2022;15:e20221.
pubmed: 35644986 doi: 10.1002/tpg2.20221

Authors

Harpreet Kaur (H)

Department of Horticultural Science, University of Minnesota, St. Paul, MN, 55108, USA. kaurh@umn.edu.

Laura M Shannon (LM)

Department of Horticultural Science, University of Minnesota, St. Paul, MN, 55108, USA.

Deborah A Samac (DA)

USDA-ARS, Plant Science Research Unit, St. Paul, MN, 55108, USA.

MeSH classifications