The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars.
Journal
Nature genetics
ISSN: 1546-1718
Abbreviated title: Nat Genet
Country: United States
NLM ID: 9216904
Publication information
Publication date:
Apr 2024
History:
received: 10 May 2022
accepted: 23 February 2024
medline: 16 April 2024
pubmed: 16 April 2024
entrez: 15 April 2024
Status:
ppublish
Abstract
Coffea arabica, an allotetraploid hybrid of Coffea eugenioides and Coffea canephora, is the source of approximately 60% of coffee products worldwide, and its cultivated accessions have undergone several population bottlenecks. We present chromosome-level assemblies of a di-haploid C. arabica accession and modern representatives of its diploid progenitors, C. eugenioides and C. canephora. The three species exhibit largely conserved genome structures between diploid parents and descendant subgenomes, with no obvious global subgenome dominance. We find evidence for a founding polyploidy event 350,000-610,000 years ago, followed by several pre-domestication bottlenecks, resulting in narrow genetic variation. A split between wild accessions and cultivar progenitors occurred ~30.5 thousand years ago, followed by a period of migration between the two populations. Analysis of modern varieties, including lines historically introgressed with C. canephora, highlights their breeding histories and loci that may contribute to pathogen resistance, laying the groundwork for future genomics-based breeding of C. arabica.
Identifiers
pubmed: 38622339
doi: 10.1038/s41588-024-01695-w
pii: 10.1038/s41588-024-01695-w
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
721-731
Grants
Agency: NSF | Directorate for Biological Sciences (BIO)
ID: 1442190
Agency: NSF | Directorate for Biological Sciences (BIO)
ID: 2030871
Agency: Academy of Finland (Suomen Akatemia)
ID: 343656
Agency: Fonds Wetenschappelijk Onderzoek (Research Foundation Flanders)
ID: G056517N
Agency: Universitair Ziekenhuis Gent (Ghent University Hospital)
ID: BOF.MET.2021.0005.01
Copyright information
© 2024. The Author(s).
References
Van de Peer, Y., Mizrachi, E. & Marchal, K. The evolutionary significance of polyploidy. Nat. Rev. Genet. 18, 411–424 (2017).
pubmed: 28502977
doi: 10.1038/nrg.2017.26
Van de Peer, Y., Ashman, T.-L., Soltis, P. S. & Soltis, D. E. Polyploidy: an evolutionary and ecological force in stressful times. Plant Cell 33, 11–26 (2021).
pubmed: 33751096
doi: 10.1093/plcell/koaa015
Leebens-Mack, J. H. et al. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574, 679–685 (2019).
doi: 10.1038/s41586-019-1693-2
Sun, H. et al. Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar. Nat. Genet. 54, 342–348 (2022).
pubmed: 35241824
pmcid: 8920897
doi: 10.1038/s41588-022-01015-0
Athiyannan, N. et al. Long-read genome sequencing of bread wheat facilitates disease resistance gene cloning. Nat. Genet. 54, 227–231 (2022).
pubmed: 35288708
pmcid: 8920886
doi: 10.1038/s41588-022-01022-1
Wu, S. et al. Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement. Nat. Commun. 9, 4580 (2018).
pubmed: 30389915
pmcid: 6214957
doi: 10.1038/s41467-018-06983-8
Wang, T. et al. A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus. Nat. Plants 9, 554–571 (2023).
pubmed: 36997685
doi: 10.1038/s41477-023-01378-0
Edger, P. P. et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 51, 541–547 (2019).
pubmed: 30804557
pmcid: 6882729
doi: 10.1038/s41588-019-0356-4
Li, F. et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat. Biotechnol. 33, 524–530 (2015).
pubmed: 25893780
doi: 10.1038/nbt.3208
Chalhoub, B. et al. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345, 950–953 (2014).
pubmed: 25146293
doi: 10.1126/science.1253435
Sattler, M. C., Carvalho, C. R. & Clarindo, W. R. The polyploidy and its key role in plant breeding. Planta 243, 281–296 (2016).
pubmed: 26715561
doi: 10.1007/s00425-015-2450-x
McClintock, B. The significance of responses of the genome to challenge. Science 226, 792–801 (1984).
pubmed: 15739260
doi: 10.1126/science.15739260
Sha, Y. et al. Genome shock in a synthetic allotetraploid wheat invokes subgenome-partitioned gene regulation, meiotic instability, and karyotype variation. J. Exp. Bot. 74, 5547–5563 (2023).
pubmed: 37379452
doi: 10.1093/jxb/erad247
Thomas, B. C., Pedersen, B. & Freeling, M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 16, 934–946 (2006).
pubmed: 16760422
pmcid: 1484460
doi: 10.1101/gr.4708406
Schnable, J. C., Springer, N. M. & Freeling, M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl Acad. Sci. USA 108, 4069 (2011).
pubmed: 21368132
pmcid: 3053962
doi: 10.1073/pnas.1101368108
Gaeta, R. T., Pires, J. C., Iniguez-Luy, F., Leon, E. & Osborn, T. C. Genomic changes in resynthesized Brassica napus and their effect on gene expression and phenotype. Plant Cell 19, 3403–3417 (2007).
pubmed: 18024568
pmcid: 2174891
doi: 10.1105/tpc.107.054346
Burns, R. et al. Gradual evolution of allopolyploidy in Arabidopsis suecica. Nat. Ecol. Evol. 5, 1367–1381 (2021).
pubmed: 34413506
pmcid: 8484011
doi: 10.1038/s41559-021-01525-w
Conant, G. C., Birchler, J. A. & Pires, J. C. Dosage, duplication, and diploidization: clarifying the interplay of multiple models for duplicate gene evolution over time. Curr. Opin. Plant Biol. 19, 91–98 (2014).
pubmed: 24907529
doi: 10.1016/j.pbi.2014.05.008
Carvalho, A. et al. Melhoramento do cafeeiro: IV - Café Mundo Novo. Bragantia 12, 97–130 (1952).
doi: 10.1590/S0006-87051952000200001
Scalabrin, S. et al. A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm. Sci. Rep. 10, 4642 (2020).
pubmed: 32170172
pmcid: 7069947
doi: 10.1038/s41598-020-61216-7
Cenci, A., Combes, M.-C. & Lashermes, P. Genome evolution in diploid and tetraploid Coffea species as revealed by comparative analysis of orthologous genome segments. Plant Mol. Biol. 78, 135–145 (2012).
pubmed: 22086332
doi: 10.1007/s11103-011-9852-3
Bawin, Y. et al. Phylogenomic analysis clarifies the evolutionary origin of Coffea arabica. J. Syst. Evol. 59, 953–963 (2020).
doi: 10.1111/jse.12694
Yu, Q. et al. Micro-collinearity and genome evolution in the vicinity of an ethylene receptor gene of cultivated diploid and allotetraploid coffee species (Coffea). Plant J. 67, 305–317 (2011).
pubmed: 21457367
doi: 10.1111/j.1365-313X.2011.04590.x
Merot-L’anthoene, V. et al. Development and evaluation of a genome-wide Coffee 8.5K SNP array and its application for high-density genetic mapping and for investigating the origin of Coffea arabica L. Plant Biotechnol. J. 17, 1418–1430 (2019).
pubmed: 30582651
pmcid: 6576098
doi: 10.1111/pbi.13066
Wellman, F. L. Coffee: Botany, Cultivation and Utilization (L. Hill, 1961).
Lécolier, A., Besse, P., Charrier, A., Tchakaloff, T.-N. & Noirot, M. Unraveling the origin of Coffea arabica ‘Bourbon pointu’ from La Réunion: a historical and scientific perspective. Euphytica 168, 1–10 (2009).
doi: 10.1007/s10681-009-9886-7
Clarindo, W. R., Carvalho, C. R., Caixeta, E. T. & Koehler, A. D. Following the track of ‘Híbrido de Timor’ origin by cytogenetic and flow cytometry approaches. Genet. Resour. Crop Evol. 60, 2253–2259 (2013).
doi: 10.1007/s10722-013-9990-3
Bertrand, B., Guyot, B., Anthony, F. & Lashermes, P. Impact of the Coffea canephora gene introgression on beverage quality of C. arabica. Theor. Appl. Genet. 107, 387–394 (2003).
pubmed: 12750771
doi: 10.1007/s00122-003-1203-6
Marie, L. et al. G × E interactions on yield and quality in Coffea arabica: new F1 hybrids outperform American cultivars. Euphytica 216, 78 (2020).
doi: 10.1007/s10681-020-02608-8
Bertrand, B., Villegas Hincapié, A. M., Marie, L. & Breitler, J.-C. Breeding for the main agricultural farming of Arabica coffee. Front. Sustain. Food Syst. 5, 709901 (2021).
doi: 10.3389/fsufs.2021.709901
Breitler, J.-C. et al. CRISPR/Cas9-mediated efficient targeted mutagenesis has the potential to accelerate the domestication of Coffea canephora. Plant Cell Tissue Organ Cult. 134, 383–394 (2018).
doi: 10.1007/s11240-018-1429-2
Berthaud, J. Etude cytogénétique d’un haploïde de Coffea arabica L. Café Cacao Thé 20, 91–96 (1976).
Denoeud, F. et al. The coffee genome provides insight into the convergent evolution of caffeine biosynthesis. Science 345, 1181–1184 (2014).
pubmed: 25190796
doi: 10.1126/science.1255274
Pellicer, J. & Leitch, I. J. The Plant DNA C-values database (release 7.1): an updated online repository of plant genome size data for comparative studies. New Phytol. 226, 301–305 (2020).
pubmed: 31608445
doi: 10.1111/nph.16261
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
pubmed: 34320186
pmcid: 8476166
doi: 10.1093/molbev/msab199
Petit, M. et al. Mobilization of retrotransposons in synthetic allotetraploid tobacco. New Phytol. 186, 135–147 (2010).
pubmed: 20074093
doi: 10.1111/j.1469-8137.2009.03140.x
Sarilar, V. et al. Allopolyploidy has a moderate impact on restructuring at three contrasting transposable element insertion sites in resynthesized Brassica napus allotetraploids. New Phytol. 198, 593–604 (2013).
pubmed: 23384044
doi: 10.1111/nph.12156
Bird, K. A., VanBuren, R., Puzey, J. R. & Edger, P. P. The causes and consequences of subgenome dominance in hybrids and recent polyploids. New Phytol. 220, 87–93 (2018).
pubmed: 29882360
doi: 10.1111/nph.15256
Göbel, U. et al. Robustness of transposable element regulation but no genomic shock observed in interspecific Arabidopsis hybrids. Genome Biol. Evol. 10, 1403–1415 (2018).
pubmed: 29788048
pmcid: 6007786
doi: 10.1093/gbe/evy095
Birchler, J. A. & Veitia, R. A. The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol. 186, 54–62 (2010).
pubmed: 19925558
doi: 10.1111/j.1469-8137.2009.03087.x
Zeiss, D. R., Piater, L. A. & Dubery, I. A. Hydroxycinnamate amides: intriguing conjugates of plant protective metabolites. Trends Plant Sci. 26, 184–195 (2021).
pubmed: 33036915
doi: 10.1016/j.tplants.2020.09.011
Bird, K. A. et al. Replaying the evolutionary tape to investigate subgenome dominance in allopolyploid Brassica napus. New Phytol. 230, 354–371 (2021).
pubmed: 33280122
pmcid: 7986222
doi: 10.1111/nph.17137
Combes, M.-C., Joët, T., Stavrinides, A. K. & Lashermes, P. New cup out of old coffee: contribution of parental gene expression legacy to phenotypic novelty in coffee beans of the allopolyploid Coffea arabica L. Ann. Bot. 131, 157–170 (2023).
pubmed: 35325016
doi: 10.1093/aob/mcac041
Yoo, M. J., Szadkowski, E. & Wendel, J. F. Homoeolog expression bias and expression level dominance in allopolyploid cotton. Heredity 110, 171–180 (2013).
pubmed: 23169565
doi: 10.1038/hdy.2012.94
Meyer, F. G., Fernie, L. M., Narasimhaswami, R. L., Monaco, L. C. & Greathead, D. J. FAO Coffee Mission to Ethiopia, 1964–1965 (Food and Agriculture Organization of the United Nations, 1968).
Halle, F. Echantillonnage du matériel Coffea arabica récolté en Ethiopie. Bulletin IFCC 14, 13–18 (1978).
Krug, C. A. & Mendes, A. J. T. Cytological observations in Coffea – IV. J. Genet. 39, 189–203 (1940).
doi: 10.1007/BF02982835
Cros, J. et al. Phylogenetic analysis of chloroplast DNA variation in Coffea L. Mol. Phylogenet. Evol. 9, 109–117 (1998).
pubmed: 9479700
doi: 10.1006/mpev.1997.0453
Lashermes, P. et al. Molecular characterisation and origin of the Coffea arabica L. genome. Mol. Gen. Genet. 261, 259–266 (1999).
pubmed: 10102360
doi: 10.1007/s004380050965
Wu, Y. et al. Genomic mosaicism due to homoeologous exchange generates extensive phenotypic diversity in nascent allopolyploids. Natl Sci. Rev. 8, nwaa277 (2021).
pubmed: 34691642
doi: 10.1093/nsr/nwaa277
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
pubmed: 28024154
doi: 10.1038/ng.3748
Moat, J., Gole, T. W. & Davis, A. P. Least concern to endangered: applying climate change projections profoundly influences the extinction risk assessment for wild Arabica coffee. Glob. Change Biol. 25, 390–403 (2019).
doi: 10.1111/gcb.14341
Kuper, R. & Kröpelin, S. Climate-controlled Holocene occupation in the Sahara: motor of Africa’s evolution. Science 313, 803–807 (2006).
pubmed: 16857900
doi: 10.1126/science.1130989
Excoffier, L. et al. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics 37, 4882–4885 (2021).
pubmed: 34164653
pmcid: 8665742
doi: 10.1093/bioinformatics/btab468
Lambeck, K. et al. Sea level and shoreline reconstructions for the Red Sea: isostatic and tectonic considerations and implications for hominin migration out of Africa. Quat. Sci. Rev. 30, 3542–3574 (2011).
doi: 10.1016/j.quascirev.2011.08.008
Montagnon, C., Mahyoub, A., Solano, W. & Sheibani, F. Unveiling a unique genetic diversity of cultivated Coffea arabica L. in its main domestication center: Yemen. Genet. Resour. Crop Evol. 68, 2411–2422 (2021).
doi: 10.1007/s10722-021-01139-y
Nordborg, M. & Donnelly, P. The coalescent process with selfing. Genetics 146, 1185 (1997).
pubmed: 9215919
pmcid: 1208046
doi: 10.1093/genetics/146.3.1185
Hu, G. et al. Two divergent haplotypes from a highly heterozygous lychee genome suggest independent domestication events for early and late-maturing cultivars. Nat. Genet. 54, 73–83 (2022).
pubmed: 34980919
pmcid: 8755541
doi: 10.1038/s41588-021-00971-3
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
pubmed: 20926424
pmcid: 3025716
doi: 10.1093/bioinformatics/btq559
Molloy, E. K., Durvasula, A. & Sankararaman, S. Advancing admixture graph estimation via maximum likelihood network orientation. Bioinformatics 37, i142–i150 (2021).
pubmed: 34252951
pmcid: 8336447
doi: 10.1093/bioinformatics/btab267
Pfeifer, B. & Kapan, D. D. Estimates of introgression as a function of pairwise distances. BMC Bioinformatics 20, 207 (2019).
pubmed: 31014244
pmcid: 6480520
doi: 10.1186/s12859-019-2747-z
Gaut, B. S., Seymour, D. K., Liu, Q. & Zhou, Y. Demography and its effects on genomic variation in crop domestication. Nat. Plants 4, 512–520 (2018).
pubmed: 30061748
doi: 10.1038/s41477-018-0210-1
dos Santos, T. B., Baba, V. Y., Vieira, L. G. E., Pereira, L. F. P. & Domingues, D. S. The urea transporter DUR3 is differentially regulated by abiotic and biotic stresses in coffee plants. Physiol. Mol. Biol. Plants 27, 203–212 (2021).
pubmed: 33707863
pmcid: 7907287
doi: 10.1007/s12298-021-00930-6
Wang, W. et al. Structural basis of salicylic acid perception by Arabidopsis NPR proteins. Nature 586, 311–316 (2020).
pubmed: 32788727
pmcid: 7554156
doi: 10.1038/s41586-020-2596-y
Mukhtar, M. S. et al. Independently evolved virulence effectors converge onto hubs in a plant immune system network. Science 333, 596–601 (2011).
pubmed: 21798943
pmcid: 3170753
doi: 10.1126/science.1203659
Jousimo, J. et al. Ecological and evolutionary effects of fragmentation on infectious disease dynamics. Science 344, 1289–1293 (2014).
pubmed: 24926021
doi: 10.1126/science.1253621
Cooley, M. B., Pathirana, S., Wu, H. J., Kachroo, P. & Klessig, D. F. Members of the Arabidopsis HRT/RPP8 family of resistance genes confer resistance to both viral and oomycete pathogens. Plant Cell 12, 663–676 (2000).
pubmed: 10810142
pmcid: 139919
doi: 10.1105/tpc.12.5.663
Mohr, T. J. et al. The Arabidopsis downy mildew resistance gene RPP8 is induced by pathogens and salicylic acid and is regulated by W-box cis elements. Mol. Plant Microbe Interact. 23, 1303–1315 (2010).
pubmed: 20831409
doi: 10.1094/MPMI-01-10-0022
MacQueen, A. et al. Population genetics of the highly polymorphic RPP8 gene family. Genes 10, 691 (2019).
pubmed: 31500388
pmcid: 6771003
doi: 10.3390/genes10090691
Cheng, Y. T. et al. Stability of plant immune-receptor resistance proteins is controlled by SKP1-CULLIN1-F-box (SCF)-mediated protein degradation. Proc. Natl Acad. Sci. USA 108, 14694–14699 (2011).
pubmed: 21873230
pmcid: 3167521
doi: 10.1073/pnas.1105685108
Hedtmann, C. et al. The plant immunity regulating F-Box Protein CPR1 supports plastid function in absence of pathogens. Front. Plant Sci. 8, 1650 (2017).
pubmed: 29018463
pmcid: 5615928
doi: 10.3389/fpls.2017.01650
Feuillet, C., Schachermayr, G. & Keller, B. Molecular cloning of a new receptor-like kinase gene encoded at the Lr10 disease resistance locus of wheat. Plant J. 11, 45–52 (1997).
pubmed: 9025301
doi: 10.1046/j.1365-313X.1997.11010045.x
Zhou, H. et al. Molecular analysis of three new receptor-like kinase genes from hexaploid wheat and evidence for their participation in the wheat hypersensitive response to stripe rust fungus infection. Plant J. 52, 420–434 (2007).
pubmed: 17764502
doi: 10.1111/j.1365-313X.2007.03246.x
Xia, T. et al. Efficient expression and function of a receptor-like kinase in wheat powdery mildew defence require an intron-located MYB binding site. Plant Biotechnol. J. 19, 897–909 (2021).
pubmed: 33225586
doi: 10.1111/pbi.13512
Florez, J. C. et al. High throughput transcriptome analysis of coffee reveals prehaustorial resistance in response to Hemileia vastatrix infection. Plant Mol. Biol. 95, 607–623 (2017).
pubmed: 29094279
doi: 10.1007/s11103-017-0676-7
Gaut, B. S., Díez, C. M. & Morrell, P. L. Genomics and the contrasting dynamics of annual and perennial domestication. Trends Genet. 31, 709–719 (2015).
pubmed: 26603610
doi: 10.1016/j.tig.2015.10.002
Chen, Z. J. Molecular mechanisms of polyploidy and hybrid vigor. Trends Plant Sci. 15, 57–71 (2010).
pubmed: 20080432
pmcid: 2821985
doi: 10.1016/j.tplants.2009.12.003
Lan, T. et al. Insights into bear evolution from a Pleistocene polar bear genome. Proc. Natl Acad. Sci. USA 119, e2200016119 (2022).
pubmed: 35666863
pmcid: 9214488
doi: 10.1073/pnas.2200016119
Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol. 33, 623–630 (2015).
pubmed: 26006009
doi: 10.1038/nbt.3238
Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).
pubmed: 27749838
pmcid: 5503144
doi: 10.1038/nmeth.4035
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
pubmed: 25409509
pmcid: 4237348
doi: 10.1371/journal.pone.0112963
English, A. C. et al. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS ONE 7, e47768 (2012).
pubmed: 23185243
pmcid: 3504050
doi: 10.1371/journal.pone.0047768
Rastas, P. Lep-MAP3: robust linkage mapping even for low-coverage whole genome sequencing data. Bioinformatics 33, 3726–3732 (2017).
pubmed: 29036272
doi: 10.1093/bioinformatics/btx494
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
pubmed: 33526886
pmcid: 7961889
doi: 10.1038/s41592-020-01056-5
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).
pubmed: 31383970
doi: 10.1038/s41477-019-0487-8
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
pubmed: 28336562
pmcid: 5635820
doi: 10.1126/science.aal3327
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
pubmed: 27467250
pmcid: 5596920
doi: 10.1016/j.cels.2015.07.012
Lyons, E. & Freeling, M. How to usefully compare homologous plant genes and chromosomes as DNA sequences. Plant J. 53, 661–673 (2008).
pubmed: 18269575
doi: 10.1111/j.1365-313X.2007.03326.x
Lyons, E. et al. Finding and comparing syntenic regions among Arabidopsis and the outgroups papaya, poplar, and grape: CoGe with rosids. Plant Physiol. 148, 1772–1781 (2008).
pubmed: 18952863
pmcid: 2593677
doi: 10.1104/pp.108.124867
Lefebvre-Pautigny, F. et al. High resolution synteny maps allowing direct comparisons between the coffee and tomato genomes. Tree Genet. Genomes 6, 565–577 (2010).
doi: 10.1007/s11295-010-0272-3
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
pubmed: 24695404
pmcid: 4103590
doi: 10.1093/bioinformatics/btu170
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://doi.org/10.48550/arXiv.1303.3997 (2013).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).
pubmed: 31843001
pmcid: 6913007
doi: 10.1186/s13059-019-1905-y
Orozco-Arias, S. et al. Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes. Brief. Bioinform. 24, bbac511 (2023).
pubmed: 36502372
doi: 10.1093/bib/bbac511
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
pubmed: 23060610
pmcid: 3516142
doi: 10.1093/bioinformatics/bts565
Ma, J. & Bennetzen, J. L. Rapid recent growth and divergence of rice nuclear genomes. Proc. Natl Acad. Sci. USA 101, 12404–12410 (2004).
pubmed: 15240870
pmcid: 515075
doi: 10.1073/pnas.0403715101
Orozco-Arias, S. et al. Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics. Biology 7, 32 (2018).
pubmed: 29799487
pmcid: 6022998
doi: 10.3390/biology7020032
Campbell, M. S. et al. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations. Plant Physiol. 164, 513–524 (2014).
pubmed: 24306534
doi: 10.1104/pp.113.230144
Keilwagen, J., Hartung, F. & Grau, J. in Gene Prediction: Methods and Protocols (ed. Kollmar, M.) 161–177 (Springer, 2019).
Cheng, B., Furtado, A. & Henry, R. J. The coffee bean transcriptome explains the accumulation of the major bean components through ripening. Sci. Rep. 8, 11414 (2018).
pubmed: 30061608
pmcid: 6065352
doi: 10.1038/s41598-018-29842-4
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
pubmed: 23104886
doi: 10.1093/bioinformatics/bts635
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
pubmed: 25516281
pmcid: 4302049
doi: 10.1186/s13059-014-0550-8
Sankoff, D. et al. Models for similarity distributions of syntenic homologs and applications to phylogenomics. IEEE/ACM Trans. Comput. Biol. Bioinform. 16, 727–737 (2019).
doi: 10.1109/TCBB.2018.2849377
Andrews, S. FastQC: a quality control tool for high throughput sequence data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc (2010).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
pubmed: 19451168
pmcid: 2705234
doi: 10.1093/bioinformatics/btp324
Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F. & Orlando, L. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).
pubmed: 23613487
pmcid: 3694634
doi: 10.1093/bioinformatics/btt193
Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2018).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
pubmed: 22728672
pmcid: 3679285
doi: 10.4161/fly.19695
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
pubmed: 21653522
pmcid: 3137218
doi: 10.1093/bioinformatics/btr330
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, https://doi.org/10.1186/s13742-015-0047-8 (2015).
Alexander, D. H. & Lange, K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics 12, 246 (2011).
doi: 10.1186/1471-2105-12-246
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).
doi: 10.1186/s12859-014-0356-4
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
pubmed: 22960212
pmcid: 3522152
doi: 10.1534/genetics.112.145037
Salojärvi, J. et al. Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch. Nat. Genet. 49, 904–912 (2017).
pubmed: 28481341
doi: 10.1038/ng.3862
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
pubmed: 24451623
pmcid: 3998144
doi: 10.1093/bioinformatics/btu033
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).
pubmed: 21753753
pmcid: 3154645
doi: 10.1038/nature10231
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
pubmed: 33590861
pmcid: 7931819
doi: 10.1093/gigascience/giab008
Pickrell, J. K. & Pritchard, J. K. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967 (2012).
pubmed: 23166502
pmcid: 3499260
doi: 10.1371/journal.pgen.1002967
Orozco-Arias, S. et al. TIP_finder: an HPC software to detect transposable element insertion polymorphisms in large genomic datasets. Biology 9, 281 (2020).
pubmed: 32917036
pmcid: 7563458
doi: 10.3390/biology9090281
Kautsar, S. A., Suarez Duran, H. G., Blin, K., Osbourn, A. & Medema, M. H. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res. 45, W55–W63 (2017).
pubmed: 28453650
pmcid: 5570173
doi: 10.1093/nar/gkx305
Klopfenstein, D. V. et al. GOATOOLS: a Python library for Gene Ontology analyses. Sci. Rep. 8, 10872 (2018).
pubmed: 30022098
pmcid: 6052049
doi: 10.1038/s41598-018-28948-z
Salojärvi, J. jsalojar/PiNSiR: first release of PiNSiR. Zenodo https://doi.org/10.5281/zenodo.5136527 (2021).