A reference genome for pea provides insight into legume genome evolution.

Chromosome Mapping; Chromosomes, Plant / genetics; Evolution, Molecular; Fabaceae / classification; Gene Expression Regulation, Plant; Genetic Variation; Genome, Plant; Genomics; Pisum sativum / genetics; Phenotype; Phylogeny; Plant Proteins / genetics; Quantitative Trait Loci; Reference Standards; Repetitive Sequences, Nucleic Acid; Seed Storage Proteins / genetics; Whole Genome Sequencing

Journal

Nature genetics

ISSN: 1546-1718

Abbreviated title: Nat Genet

Country: United States

NLM ID: 9216904

Publication information

Publication date:
September 2019

History:

received: 28 December 2018

accepted: 10 July 2019

entrez: 4 September 2019

pubmed: 4 September 2019

medline: 24 January 2020

Status: ppublish

Abstract

We report the first annotated chromosome-level reference genome assembly for pea, Gregor Mendel's original genetic model. Phylogenetics and paleogenomics show genomic rearrangements across legumes and suggest a major role for repetitive elements in pea genome evolution. Compared to other sequenced Leguminosae genomes, the pea genome shows intense gene dynamics, most likely associated with genome size expansion when the Fabeae diverged from its sister tribes. During Pisum evolution, translocation and transposition differentially occurred across lineages. This reference sequence will accelerate our understanding of the molecular basis of agronomically important traits and support crop improvement.

Identifiers

DOI: 10.1038/s41588-019-0480-1

PMID: 31477930

PII: 10.1038/s41588-019-0480-1

Chemical substances

Plant Proteins 0

Seed Storage Proteins 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

Pagination

1411-1422
