The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars.


Journal

Nature genetics
ISSN: 1546-1718
Abbreviated title: Nat Genet
Country: United States
NLM ID: 9216904

Publication information

Publication date:
Apr 2024
History:
received: 10 05 2022
accepted: 23 02 2024
medline: 16 4 2024
pubmed: 16 4 2024
entrez: 15 4 2024
Status: ppublish

Abstract

Coffea arabica, an allotetraploid hybrid of Coffea eugenioides and Coffea canephora, is the source of approximately 60% of coffee products worldwide, and its cultivated accessions have undergone several population bottlenecks. We present chromosome-level assemblies of a di-haploid C. arabica accession and modern representatives of its diploid progenitors, C. eugenioides and C. canephora. The three species exhibit largely conserved genome structures between diploid parents and descendant subgenomes, with no obvious global subgenome dominance. We find evidence for a founding polyploidy event 350,000-610,000 years ago, followed by several pre-domestication bottlenecks, resulting in narrow genetic variation. A split between wild accessions and cultivar progenitors occurred ~30.5 thousand years ago, followed by a period of migration between the two populations. Analysis of modern varieties, including lines historically introgressed with C. canephora, highlights their breeding histories and loci that may contribute to pathogen resistance, laying the groundwork for future genomics-based breeding of C. arabica.

Identifiers

pubmed: 38622339
doi: 10.1038/s41588-024-01695-w
pii: 10.1038/s41588-024-01695-w

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

721-731

Grants

Organization: NSF | Directorate for Biological Sciences (BIO)
ID: 1442190
Organization: NSF | Directorate for Biological Sciences (BIO)
ID: 2030871
Organization: Academy of Finland (Suomen Akatemia)
ID: 343656
Organization: Fonds Wetenschappelijk Onderzoek (Research Foundation Flanders)
ID: G056517N
Organization: Universitair Ziekenhuis Gent (Ghent University Hospital)
ID: BOF.MET.2021.0005.01

Copyright information

© 2024. The Author(s).

References

Van de Peer, Y., Mizrachi, E. & Marchal, K. The evolutionary significance of polyploidy. Nat. Rev. Genet. 18, 411–424 (2017).
pubmed: 28502977 doi: 10.1038/nrg.2017.26
Van de Peer, Y., Ashman, T.-L., Soltis, P. S. & Soltis, D. E. Polyploidy: an evolutionary and ecological force in stressful times. Plant Cell 33, 11–26 (2021).
pubmed: 33751096 doi: 10.1093/plcell/koaa015
Leebens-Mack, J. H. et al. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574, 679–685 (2019).
doi: 10.1038/s41586-019-1693-2
Sun, H. et al. Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar. Nat. Genet. 54, 342–348 (2022).
pubmed: 35241824 pmcid: 8920897 doi: 10.1038/s41588-022-01015-0
Athiyannan, N. et al. Long-read genome sequencing of bread wheat facilitates disease resistance gene cloning. Nat. Genet. 54, 227–231 (2022).
pubmed: 35288708 pmcid: 8920886 doi: 10.1038/s41588-022-01022-1
Wu, S. et al. Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement. Nat. Commun. 9, 4580 (2018).
pubmed: 30389915 pmcid: 6214957 doi: 10.1038/s41467-018-06983-8
Wang, T. et al. A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus. Nat. Plants 9, 554–571 (2023).
pubmed: 36997685 doi: 10.1038/s41477-023-01378-0
Edger, P. P. et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 51, 541–547 (2019).
pubmed: 30804557 pmcid: 6882729 doi: 10.1038/s41588-019-0356-4
Li, F. et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat. Biotechnol. 33, 524–530 (2015).
pubmed: 25893780 doi: 10.1038/nbt.3208
Chalhoub, B. et al. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345, 950–953 (2014).
pubmed: 25146293 doi: 10.1126/science.1253435
Sattler, M. C., Carvalho, C. R. & Clarindo, W. R. The polyploidy and its key role in plant breeding. Planta 243, 281–296 (2016).
pubmed: 26715561 doi: 10.1007/s00425-015-2450-x
McClintock, B. The significance of responses of the genome to challenge. Science 226, 792–801 (1984).
pubmed: 15739260 doi: 10.1126/science.15739260
Sha, Y. et al. Genome shock in a synthetic allotetraploid wheat invokes subgenome-partitioned gene regulation, meiotic instability, and karyotype variation. J. Exp. Bot. 74, 5547–5563 (2023).
pubmed: 37379452 doi: 10.1093/jxb/erad247
Thomas, B. C., Pedersen, B. & Freeling, M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 16, 934–946 (2006).
pubmed: 16760422 pmcid: 1484460 doi: 10.1101/gr.4708406
Schnable, J. C., Springer, N. M. & Freeling, M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl Acad. Sci. USA 108, 4069 (2011).
pubmed: 21368132 pmcid: 3053962 doi: 10.1073/pnas.1101368108
Gaeta, R. T., Pires, J. C., Iniguez-Luy, F., Leon, E. & Osborn, T. C. Genomic changes in resynthesized Brassica napus and their effect on gene expression and phenotype. Plant Cell 19, 3403–3417 (2007).
pubmed: 18024568 pmcid: 2174891 doi: 10.1105/tpc.107.054346
Burns, R. et al. Gradual evolution of allopolyploidy in Arabidopsis suecica. Nat. Ecol. Evol. 5, 1367–1381 (2021).
pubmed: 34413506 pmcid: 8484011 doi: 10.1038/s41559-021-01525-w
Conant, G. C., Birchler, J. A. & Pires, J. C. Dosage, duplication, and diploidization: clarifying the interplay of multiple models for duplicate gene evolution over time. Curr. Opin. Plant Biol. 19, 91–98 (2014).
pubmed: 24907529 doi: 10.1016/j.pbi.2014.05.008
Carvalho, A. et al. Melhoramento do cafeeiro: IV - Café Mundo Novo. Bragantia 12, 97–130 (1952).
doi: 10.1590/S0006-87051952000200001
Scalabrin, S. et al. A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm. Sci. Rep. 10, 4642 (2020).
pubmed: 32170172 pmcid: 7069947 doi: 10.1038/s41598-020-61216-7
Cenci, A., Combes, M.-C. & Lashermes, P. Genome evolution in diploid and tetraploid Coffea species as revealed by comparative analysis of orthologous genome segments. Plant Mol. Biol. 78, 135–145 (2012).
pubmed: 22086332 doi: 10.1007/s11103-011-9852-3
Bawin, Y. et al. Phylogenomic analysis clarifies the evolutionary origin of Coffea arabica. J. Syst. Evol. 59, 953–963 (2020).
doi: 10.1111/jse.12694
Yu, Q. et al. Micro-collinearity and genome evolution in the vicinity of an ethylene receptor gene of cultivated diploid and allotetraploid coffee species (Coffea). Plant J. 67, 305–317 (2011).
pubmed: 21457367 doi: 10.1111/j.1365-313X.2011.04590.x
Merot-L’anthoene, V. et al. Development and evaluation of a genome-wide Coffee 8.5K SNP array and its application for high-density genetic mapping and for investigating the origin of Coffea arabica L. Plant Biotechnol. J. 17, 1418–1430 (2019).
pubmed: 30582651 pmcid: 6576098 doi: 10.1111/pbi.13066
Wellman, F. L. Coffee: Botany, Cultivation and Utilization (L. Hill, 1961).
Lécolier, A., Besse, P., Charrier, A., Tchakaloff, T.-N. & Noirot, M. Unraveling the origin of Coffea arabica ‘Bourbon pointu’ from La Réunion: a historical and scientific perspective. Euphytica 168, 1–10 (2009).
doi: 10.1007/s10681-009-9886-7
Clarindo, W. R., Carvalho, C. R., Caixeta, E. T. & Koehler, A. D. Following the track of ‘Híbrido de Timor’ origin by cytogenetic and flow cytometry approaches. Genet. Resour. Crop Evol. 60, 2253–2259 (2013).
doi: 10.1007/s10722-013-9990-3
Bertrand, B., Guyot, B., Anthony, F. & Lashermes, P. Impact of the Coffea canephora gene introgression on beverage quality of C. arabica. Theor. Appl. Genet. 107, 387–394 (2003).
pubmed: 12750771 doi: 10.1007/s00122-003-1203-6
Marie, L. et al. G × E interactions on yield and quality in Coffea arabica: new F1 hybrids outperform American cultivars. Euphytica 216, 78 (2020).
doi: 10.1007/s10681-020-02608-8
Bertrand, B., Villegas Hincapié, A. M., Marie, L. & Breitler, J.-C. Breeding for the main agricultural farming of Arabica coffee. Front. Sustain. Food Syst. 5, 709901 (2021).
doi: 10.3389/fsufs.2021.709901
Breitler, J.-C. et al. CRISPR/Cas9-mediated efficient targeted mutagenesis has the potential to accelerate the domestication of Coffea canephora. Plant Cell Tissue Organ Cult. 134, 383–394 (2018).
doi: 10.1007/s11240-018-1429-2
Berthaud, J. Etude cytogénétique d’un haploïde de Coffea arabica L. Cafe Cacao The 20, 91–96 (1976).
Denoeud, F. et al. The coffee genome provides insight into the convergent evolution of caffeine biosynthesis. Science 345, 1181–1184 (2014).
pubmed: 25190796 doi: 10.1126/science.1255274
Pellicer, J. & Leitch, I. J. The Plant DNA C-values database (release 7.1): an updated online repository of plant genome size data for comparative studies. New Phytol. 226, 301–305 (2020).
pubmed: 31608445 doi: 10.1111/nph.16261
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
pubmed: 34320186 pmcid: 8476166 doi: 10.1093/molbev/msab199
Petit, M. et al. Mobilization of retrotransposons in synthetic allotetraploid tobacco. New Phytol. 186, 135–147 (2010).
pubmed: 20074093 doi: 10.1111/j.1469-8137.2009.03140.x
Sarilar, V. et al. Allopolyploidy has a moderate impact on restructuring at three contrasting transposable element insertion sites in resynthesized Brassica napus allotetraploids. New Phytol. 198, 593–604 (2013).
pubmed: 23384044 doi: 10.1111/nph.12156
Bird, K. A., VanBuren, R., Puzey, J. R. & Edger, P. P. The causes and consequences of subgenome dominance in hybrids and recent polyploids. New Phytol. 220, 87–93 (2018).
pubmed: 29882360 doi: 10.1111/nph.15256
Göbel, U. et al. Robustness of transposable element regulation but no genomic shock observed in interspecific Arabidopsis hybrids. Genome Biol. Evol. 10, 1403–1415 (2018).
pubmed: 29788048 pmcid: 6007786 doi: 10.1093/gbe/evy095
Birchler, J. A. & Veitia, R. A. The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol. 186, 54–62 (2010).
pubmed: 19925558 doi: 10.1111/j.1469-8137.2009.03087.x
Zeiss, D. R., Piater, L. A. & Dubery, I. A. Hydroxycinnamate amides: intriguing conjugates of plant protective metabolites. Trends Plant Sci. 26, 184–195 (2021).
pubmed: 33036915 doi: 10.1016/j.tplants.2020.09.011
Bird, K. A. et al. Replaying the evolutionary tape to investigate subgenome dominance in allopolyploid Brassica napus. New Phytol. 230, 354–371 (2021).
pubmed: 33280122 pmcid: 7986222 doi: 10.1111/nph.17137
Combes, M.-C., Joët, T., Stavrinides, A. K. & Lashermes, P. New cup out of old coffee: contribution of parental gene expression legacy to phenotypic novelty in coffee beans of the allopolyploid Coffea arabica L. Ann. Bot. 131, 157–170 (2023).
pubmed: 35325016 doi: 10.1093/aob/mcac041
Yoo, M. J., Szadkowski, E. & Wendel, J. F. Homoeolog expression bias and expression level dominance in allopolyploid cotton. Heredity 110, 171–180 (2013).
pubmed: 23169565 doi: 10.1038/hdy.2012.94
Meyer, F. G., Fernie, L. M., Narasimhaswami, R. L., Monaco, L. C. & Greathead, D. J. FAO Coffee Mission to Ethiopia, 1964–1965 (Food and Agriculture Organization of the United Nations, 1968).
Halle, F. Echantillonnage du matériel Coffea arabica récolté en Ethiopie. Bulletin IFCC 14, 13–18 (1978).
Krug, C. A. & Mendes, A. J. T. Cytological observations in Coffea – IV. J. Genet. 39, 189–203 (1940).
doi: 10.1007/BF02982835
Cros, J. et al. Phylogenetic analysis of chloroplast DNA variation in Coffea L. Mol. Phylogenet. Evol. 9, 109–117 (1998).
pubmed: 9479700 doi: 10.1006/mpev.1997.0453
Lashermes, P. et al. Molecular characterisation and origin of the Coffea arabica L. genome. Mol. Gen. Genet. 261, 259–266 (1999).
pubmed: 10102360 doi: 10.1007/s004380050965
Wu, Y. et al. Genomic mosaicism due to homoeologous exchange generates extensive phenotypic diversity in nascent allopolyploids. Natl Sci. Rev. 8, nwaa277 (2021).
pubmed: 34691642 doi: 10.1093/nsr/nwaa277
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
pubmed: 28024154 doi: 10.1038/ng.3748
Moat, J., Gole, T. W. & Davis, A. P. Least concern to endangered: applying climate change projections profoundly influences the extinction risk assessment for wild Arabica coffee. Glob. Change Biol. 25, 390–403 (2019).
doi: 10.1111/gcb.14341
Kuper, R. & Kröpelin, S. Climate-controlled Holocene occupation in the Sahara: motor of Africa’s evolution. Science 313, 803–807 (2006).
pubmed: 16857900 doi: 10.1126/science.1130989
Excoffier, L. et al. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics 37, 4882–4885 (2021).
pubmed: 34164653 pmcid: 8665742 doi: 10.1093/bioinformatics/btab468
Lambeck, K. et al. Sea level and shoreline reconstructions for the Red Sea: isostatic and tectonic considerations and implications for hominin migration out of Africa. Quat. Sci. Rev. 30, 3542–3574 (2011).
doi: 10.1016/j.quascirev.2011.08.008
Montagnon, C., Mahyoub, A., Solano, W. & Sheibani, F. Unveiling a unique genetic diversity of cultivated Coffea arabica L. in its main domestication center: Yemen. Genet. Resour. Crop Evol. 68, 2411–2422 (2021).
doi: 10.1007/s10722-021-01139-y
Nordborg, M. & Donnelly, P. The coalescent process with selfing. Genetics 146, 1185 (1997).
pubmed: 9215919 pmcid: 1208046 doi: 10.1093/genetics/146.3.1185
Hu, G. et al. Two divergent haplotypes from a highly heterozygous lychee genome suggest independent domestication events for early and late-maturing cultivars. Nat. Genet. 54, 73–83 (2022).
pubmed: 34980919 pmcid: 8755541 doi: 10.1038/s41588-021-00971-3
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
pubmed: 20926424 pmcid: 3025716 doi: 10.1093/bioinformatics/btq559
Molloy, E. K., Durvasula, A. & Sankararaman, S. Advancing admixture graph estimation via maximum likelihood network orientation. Bioinformatics 37, i142–i150 (2021).
pubmed: 34252951 pmcid: 8336447 doi: 10.1093/bioinformatics/btab267
Pfeifer, B. & Kapan, D. D. Estimates of introgression as a function of pairwise distances. BMC Bioinformatics 20, 207 (2019).
pubmed: 31014244 pmcid: 6480520 doi: 10.1186/s12859-019-2747-z
Gaut, B. S., Seymour, D. K., Liu, Q. & Zhou, Y. Demography and its effects on genomic variation in crop domestication. Nat. Plants 4, 512–520 (2018).
pubmed: 30061748 doi: 10.1038/s41477-018-0210-1
dos Santos, T. B., Baba, V. Y., Vieira, L. G. E., Pereira, L. F. P. & Domingues, D. S. The urea transporter DUR3 is differentially regulated by abiotic and biotic stresses in coffee plants. Physiol. Mol. Biol. Plants 27, 203–212 (2021).
pubmed: 33707863 pmcid: 7907287 doi: 10.1007/s12298-021-00930-6
Wang, W. et al. Structural basis of salicylic acid perception by Arabidopsis NPR proteins. Nature 586, 311–316 (2020).
pubmed: 32788727 pmcid: 7554156 doi: 10.1038/s41586-020-2596-y
Mukhtar, M. S. et al. Independently evolved virulence effectors converge onto hubs in a plant immune system network. Science 333, 596–601 (2011).
pubmed: 21798943 pmcid: 3170753 doi: 10.1126/science.1203659
Jousimo, J. et al. Ecological and evolutionary effects of fragmentation on infectious disease dynamics. Science 344, 1289–1293 (2014).
pubmed: 24926021 doi: 10.1126/science.1253621
Cooley, M. B., Pathirana, S., Wu, H. J., Kachroo, P. & Klessig, D. F. Members of the Arabidopsis HRT/RPP8 family of resistance genes confer resistance to both viral and oomycete pathogens. Plant Cell 12, 663–676 (2000).
pubmed: 10810142 pmcid: 139919 doi: 10.1105/tpc.12.5.663
Mohr, T. J. et al. The Arabidopsis downy mildew resistance gene RPP8 is induced by pathogens and salicylic acid and is regulated by W-box cis elements. Mol. Plant Microbe Interact. 23, 1303–1315 (2010).
pubmed: 20831409 doi: 10.1094/MPMI-01-10-0022
MacQueen, A. et al. Population genetics of the highly polymorphic RPP8 gene family. Genes 10, 691 (2019).
pubmed: 31500388 pmcid: 6771003 doi: 10.3390/genes10090691
Cheng, Y. T. et al. Stability of plant immune-receptor resistance proteins is controlled by SKP1-CULLIN1-F-box (SCF)-mediated protein degradation. Proc. Natl Acad. Sci. USA 108, 14694–14699 (2011).
pubmed: 21873230 pmcid: 3167521 doi: 10.1073/pnas.1105685108
Hedtmann, C. et al. The plant immunity regulating F-Box Protein CPR1 supports plastid function in absence of pathogens. Front. Plant Sci. 8, 1650 (2017).
pubmed: 29018463 pmcid: 5615928 doi: 10.3389/fpls.2017.01650
Feuillet, C., Schachermayr, G. & Keller, B. Molecular cloning of a new receptor-like kinase gene encoded at the Lr10 disease resistance locus of wheat. Plant J. 11, 45–52 (1997).
pubmed: 9025301 doi: 10.1046/j.1365-313X.1997.11010045.x
Zhou, H. et al. Molecular analysis of three new receptor-like kinase genes from hexaploid wheat and evidence for their participation in the wheat hypersensitive response to stripe rust fungus infection. Plant J. 52, 420–434 (2007).
pubmed: 17764502 doi: 10.1111/j.1365-313X.2007.03246.x
Xia, T. et al. Efficient expression and function of a receptor-like kinase in wheat powdery mildew defence require an intron-located MYB binding site. Plant Biotechnol. J. 19, 897–909 (2021).
pubmed: 33225586 doi: 10.1111/pbi.13512
Florez, J. C. et al. High throughput transcriptome analysis of coffee reveals prehaustorial resistance in response to Hemileia vastatrix infection. Plant Mol. Biol. 95, 607–623 (2017).
pubmed: 29094279 doi: 10.1007/s11103-017-0676-7
Gaut, B. S., Díez, C. M. & Morrell, P. L. Genomics and the contrasting dynamics of annual and perennial domestication. Trends Genet. 31, 709–719 (2015).
pubmed: 26603610 doi: 10.1016/j.tig.2015.10.002
Chen, Z. J. Molecular mechanisms of polyploidy and hybrid vigor. Trends Plant Sci. 15, 57–71 (2010).
pubmed: 20080432 pmcid: 2821985 doi: 10.1016/j.tplants.2009.12.003
Lan, T. et al. Insights into bear evolution from a Pleistocene polar bear genome. Proc. Natl Acad. Sci. USA 119, e2200016119 (2022).
pubmed: 35666863 pmcid: 9214488 doi: 10.1073/pnas.2200016119
Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol. 33, 623–630 (2015).
pubmed: 26006009 doi: 10.1038/nbt.3238
Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).
pubmed: 27749838 pmcid: 5503144 doi: 10.1038/nmeth.4035
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
pubmed: 25409509 pmcid: 4237348 doi: 10.1371/journal.pone.0112963
English, A. C. et al. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS ONE 7, e47768 (2012).
pubmed: 23185243 pmcid: 3504050 doi: 10.1371/journal.pone.0047768
Rastas, P. Lep-MAP3: robust linkage mapping even for low-coverage whole genome sequencing data. Bioinformatics 33, 3726–3732 (2017).
pubmed: 29036272 doi: 10.1093/bioinformatics/btx494
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
pubmed: 33526886 pmcid: 7961889 doi: 10.1038/s41592-020-01056-5
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).
pubmed: 31383970 doi: 10.1038/s41477-019-0487-8
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
pubmed: 28336562 pmcid: 5635820 doi: 10.1126/science.aal3327
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
pubmed: 27467250 pmcid: 5596920 doi: 10.1016/j.cels.2015.07.012
Lyons, E. & Freeling, M. How to usefully compare homologous plant genes and chromosomes as DNA sequences. Plant J. 53, 661–673 (2008).
pubmed: 18269575 doi: 10.1111/j.1365-313X.2007.03326.x
Lyons, E. et al. Finding and comparing syntenic regions among Arabidopsis and the outgroups papaya, poplar, and grape: CoGe with rosids. Plant Physiol. 148, 1772–1781 (2008).
pubmed: 18952863 pmcid: 2593677 doi: 10.1104/pp.108.124867
Lefebvre-Pautigny, F. et al. High resolution synteny maps allowing direct comparisons between the coffee and tomato genomes. Tree Genet. Genomes 6, 565–577 (2010).
doi: 10.1007/s11295-010-0272-3
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
pubmed: 24695404 pmcid: 4103590 doi: 10.1093/bioinformatics/btu170
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://doi.org/10.48550/arXiv.1303.3997 (2013).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).
pubmed: 31843001 pmcid: 6913007 doi: 10.1186/s13059-019-1905-y
Orozco-Arias, S. et al. Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes. Brief. Bioinform. 24, bbac511 (2023).
pubmed: 36502372 doi: 10.1093/bib/bbac511
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
pubmed: 23060610 pmcid: 3516142 doi: 10.1093/bioinformatics/bts565
Ma, J. & Bennetzen, J. L. Rapid recent growth and divergence of rice nuclear genomes. Proc. Natl Acad. Sci. USA 101, 12404–12410 (2004).
pubmed: 15240870 pmcid: 515075 doi: 10.1073/pnas.0403715101
Orozco-Arias, S. et al. Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics. Biology 7, 32 (2018).
pubmed: 29799487 pmcid: 6022998 doi: 10.3390/biology7020032
Campbell, M. S. et al. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations. Plant Physiol. 164, 513–524 (2014).
pubmed: 24306534 doi: 10.1104/pp.113.230144
Keilwagen, J., Hartung, F. & Grau, J. in Gene Prediction: Methods and Protocols (ed. Kollmar, M.) 161–177 (Springer, 2019).
Cheng, B., Furtado, A. & Henry, R. J. The coffee bean transcriptome explains the accumulation of the major bean components through ripening. Sci. Rep. 8, 11414 (2018).
pubmed: 30061608 pmcid: 6065352 doi: 10.1038/s41598-018-29842-4
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
pubmed: 23104886 doi: 10.1093/bioinformatics/bts635
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
pubmed: 25516281 pmcid: 4302049 doi: 10.1186/s13059-014-0550-8
Sankoff, D. et al. Models for similarity distributions of syntenic homologs and applications to phylogenomics. IEEE/ACM Trans. Comput. Biol. Bioinform. 16, 727–737 (2019).
doi: 10.1109/TCBB.2018.2849377
Andrews, S. FastQC: a quality control tool for high throughput sequence data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc (2010).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
pubmed: 19451168 pmcid: 2705234 doi: 10.1093/bioinformatics/btp324
Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F. & Orlando, L. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).
pubmed: 23613487 pmcid: 3694634 doi: 10.1093/bioinformatics/btt193
Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2018).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
pubmed: 22728672 pmcid: 3679285 doi: 10.4161/fly.19695
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
pubmed: 21653522 pmcid: 3137218 doi: 10.1093/bioinformatics/btr330
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, https://doi.org/10.1186/s13742-015-0047-8 (2015).
Alexander, D. H. & Lange, K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinf. 12, 246 (2011).
doi: 10.1186/1471-2105-12-246
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinf. 15, 356 (2014).
doi: 10.1186/s12859-014-0356-4
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
pubmed: 22960212 pmcid: 3522152 doi: 10.1534/genetics.112.145037
Salojärvi, J. et al. Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch. Nat. Genet. 49, 904–912 (2017).
pubmed: 28481341 doi: 10.1038/ng.3862
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
pubmed: 24451623 pmcid: 3998144 doi: 10.1093/bioinformatics/btu033
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).
pubmed: 21753753 pmcid: 3154645 doi: 10.1038/nature10231
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
pubmed: 33590861 pmcid: 7931819 doi: 10.1093/gigascience/giab008
Pickrell, J. K. & Pritchard, J. K. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967 (2012).
pubmed: 23166502 pmcid: 3499260 doi: 10.1371/journal.pgen.1002967
Orozco-Arias, S. et al. TIP_finder: an HPC software to detect transposable element insertion polymorphisms in large genomic datasets. Biology 9, 281 (2020).
pubmed: 32917036 pmcid: 7563458 doi: 10.3390/biology9090281
Kautsar, S. A., Suarez Duran, H. G., Blin, K., Osbourn, A. & Medema, M. H. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res. 45, W55–W63 (2017).
pubmed: 28453650 pmcid: 5570173 doi: 10.1093/nar/gkx305
Klopfenstein, D. V. et al. GOATOOLS: a Python library for Gene Ontology analyses. Sci. Rep. 8, 10872 (2018).
pubmed: 30022098 pmcid: 6052049 doi: 10.1038/s41598-018-28948-z
Salojärvi, J. jsalojar/PiNSiR: first release of PiNSiR. Zenodo https://doi.org/10.5281/zenodo.5136527 (2021).

Authors

Jarkko Salojärvi (J)

School of Biological Sciences, Nanyang Technological University, Singapore, Singapore. jarkko@ntu.edu.sg.
Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland. jarkko@ntu.edu.sg.
Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore. jarkko@ntu.edu.sg.

Aditi Rambani (A)

Boyce Thompson Institute, Cornell University, Ithaca, NY, USA.

Zhe Yu (Z)

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada.

Romain Guyot (R)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France.
Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia.

Susan Strickler (S)

Boyce Thompson Institute, Cornell University, Ithaca, NY, USA.

Maud Lepelley (M)

Société des Produits Nestlé SA, Nestlé Research, Tours, France.

Cui Wang (C)

Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland.

Sitaram Rajaraman (S)

Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland.

Pasi Rastas (P)

Institute of Biotechnology, University of Helsinki, Helsinki, Finland.

Chunfang Zheng (C)

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada.

Daniella Santos Muñoz (DS)

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada.

João Meidanis (J)

Institute of Computing, University of Campinas, Campinas, Brazil.

Alexandre Rossi Paschoal (AR)

Department of Computer Science, The Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Brazil.

Yves Bawin (Y)

Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Melle, Belgium.

Trevor J Krabbenhoft (TJ)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA.

Zhen Qin Wang (ZQ)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA.

Steven J Fleck (SJ)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA.

Rudy Aussel (R)

Société des Produits Nestlé SA, Nestlé Research, Tours, France.
Centre d'Immunologie de Marseille-Luminy, Aix Marseille Université, Marseille, France.

Laurence Bellanger (L)

Société des Produits Nestlé SA, Nestlé Research, Tours, France.

Aline Charpagne (A)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Coralie Fournier (C)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Mohamed Kassam (M)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Gregory Lefebvre (G)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Sylviane Métairon (S)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Déborah Moine (D)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Michel Rigoreau (M)

Société des Produits Nestlé SA, Nestlé Research, Tours, France.

Jens Stolte (J)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland.

Perla Hamon (P)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France.

Emmanuel Couturon (E)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France.

Christine Tranchant-Dubreuil (C)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France.

Minakshi Mukherjee (M)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA.

Tianying Lan (T)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA.

Jan Engelhardt (J)

Department of Computer Science, University of Leipzig, Leipzig, Germany.

Peter Stadler (P)

Department of Computer Science, University of Leipzig, Leipzig, Germany.
Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.

Samara Mireza Correia De Lemos (SM)

Group of Genomics and Transcriptomes in Plants, São Paulo State University, UNESP, Rio Claro, Brazil.

Suzana Ivamoto Suzuki (SI)

Centro de Ciências Agrárias, Universidade Estadual de Londrina, Londrina, Brazil.

Ucu Sumirat (U)

Indonesian Coffee and Cocoa Research Institute (ICCRI), Jember, Indonesia.

Ching Man Wai (CM)

University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Nicolas Dauchot (N)

Research Unit in Plant Cellular and Molecular Biology, University of Namur, Namur, Belgium.

Simon Orozco-Arias (S)

Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia.

Andrea Garavito (A)

Departamento de Ciencias Biológicas, Facultad de Ciencias Exactas y Naturales, Universidad de Caldas, Manizales, Colombia.

Catherine Kiwuka (C)

National Agricultural Research Organization (NARO), Entebbe, Uganda.

Pascal Musoli (P)

National Agricultural Research Organization (NARO), Entebbe, Uganda.

Anne Nalukenge (A)

National Agricultural Research Organization (NARO), Entebbe, Uganda.

Erwan Guichoux (E)

Biodiversité Gènes & Communautés, INRA, Bordeaux, France.

Havinga Reinout (H)

Hortus Botanicus Amsterdam, Amsterdam, the Netherlands.

Martin Smit (M)

Hortus Botanicus Amsterdam, Amsterdam, the Netherlands.

Lorenzo Carretero-Paulet (L)

Departamento de Biología y Geología, Universidad de Almería, Almería, Spain.

Oliveiro Guerreiro Filho (OG)

Instituto Agronômico (IAC) Centro de Café 'Alcides Carvalho', Fazenda Santa Elisa, Campinas, Brazil.

Masako Toma Braghini (MT)

Instituto Agronômico (IAC) Centro de Café 'Alcides Carvalho', Fazenda Santa Elisa, Campinas, Brazil.

Lilian Padilha (L)

Embrapa Café/Instituto Agronômico (IAC) Centro de Café 'Alcides Carvalho', Fazenda Santa Elisa, Campinas, Brazil.

Gustavo Hiroshi Sera (GH)

Instituto de Desenvolvimento Rural do Paraná- IAPAR, Londrina, Brazil.

Tom Ruttink (T)

Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Melle, Belgium.
Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.

Robert Henry (R)

Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, Queensland, Australia.

Pierre Marraccini (P)

CIRAD - UMR DIADE (IRD-CIRAD-Université de Montpellier) BP 64501, Montpellier, France.

Yves Van de Peer (Y)

Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa.
College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China.
Center for Plant Systems Biology, VIB, Ghent, Belgium.

Alan Andrade (A)

Embrapa Café/Inovacafé Laboratory of Molecular Genetics Campus da UFLA-MG, Lavras, Brazil.

Douglas Domingues (D)

Group of Genomics and Transcriptomes in Plants, São Paulo State University, UNESP, Rio Claro, Brazil.

Giovanni Giuliano (G)

Italian National Agency for New Technologies, Energy and Sustainable Economic Development, ENEA Casaccia Research Center, Rome, Italy.

Lukas Mueller (L)

Boyce Thompson Institute, Cornell University, Ithaca, NY, USA.

Luiz Filipe Pereira (LF)

Embrapa Café/Lab. Biotecnologia, Área de Melhoramento Genético, Londrina, Brazil.

Stephane Plaisance (S)

VIB Nucleomics Core, Leuven, Belgium.

Valerie Poncet (V)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France.

Stephane Rombauts (S)

Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
Center for Plant Systems Biology, VIB, Ghent, Belgium.

David Sankoff (D)

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada.

Victor A Albert (VA)

Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA. vaalbert@buffalo.edu.

Dominique Crouzillat (D)

Société des Produits Nestlé SA, Nestlé Research, Tours, France. dcrouzillat@gmail.com.

Alexandre de Kochko (A)

Institut de Recherche pour le Développement (IRD), Université de Montpellier, Montpellier, France. alexandre.dekochko@gmail.com.

Patrick Descombes (P)

Société des Produits Nestlé SA, Nestlé Research, Lausanne, Switzerland. patrick.descombes@rd.nestle.com.

MeSH classifications