Genome resources for three modern cotton lines guide future breeding efforts.
Journal
Nature plants
ISSN: 2055-0278
Abbreviated title: Nat Plants
Country: England
NLM ID: 101651677
Publication information
Publication date:
30 May 2024
History:
received: 27 Oct 2023
accepted: 27 Apr 2024
medline: 31 May 2024
pubmed: 31 May 2024
entrez: 30 May 2024
Status:
aheadofprint
Abstract
Cotton (Gossypium hirsutum L.) is the key renewable fibre crop worldwide, yet its yield and fibre quality show high variability due to genotype-specific traits and complex interactions among cultivars, management practices and environmental factors. Modern breeding practices may limit future yield gains due to a narrow founding gene pool. Precision breeding and biotechnological approaches offer potential solutions, contingent on accurate cultivar-specific data. Here we address this need by generating high-quality reference genomes for three modern cotton cultivars ('UGA230', 'UA48' and 'CSX8308') and updating the 'TM-1' cotton genetic standard reference. Despite hypothesized genetic uniformity, considerable sequence and structural variation was observed among the four genomes, which overlap with ancient and ongoing genomic introgressions from 'Pima' cotton, gene regulatory mechanisms and phenotypic trait divergence. Differentially expressed genes across fibre development correlate with fibre production, potentially contributing to the distinctive fibre quality traits observed in modern cotton cultivars. These genomes and comparative analyses provide a valuable foundation for future genetic endeavours to enhance global cotton yield and sustainability.
Identifiers
pubmed: 38816498
doi: 10.1038/s41477-024-01713-z
pii: 10.1038/s41477-024-01713-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Cotton Incorporated (Cotton Inc.)
ID: 18-753
Agency: National Science Foundation (NSF)
ID: IOS1739092
Agency: National Science Foundation (NSF)
ID: IOS1444552
Copyright information
© 2024. The Author(s).