Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle.
Alleles
Allelic Imbalance
Animals
Base Sequence
Cattle / genetics
Chromosomes, Mammalian / genetics
Female
Genetic Loci
Genetic Variation
Genome
Haplotypes / genetics
INDEL Mutation / genetics
Male
Molecular Sequence Annotation
Polymorphism, Single Nucleotide / genetics
RNA, Messenger / genetics
Repetitive Sequences, Nucleic Acid / genetics
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
29 04 2020
History:
received: 20 08 2019
accepted: 27 03 2020
entrez: 1 5 2020
pubmed: 1 5 2020
medline: 30 7 2020
Status:
epublish
Abstract
Inbred animals were historically chosen for genome analysis to circumvent assembly issues caused by haplotype variation, but this resulted in a composite of the two genomes. Here we report a haplotype-aware scaffolding and polishing pipeline, which was used to create haplotype-resolved, chromosome-level genome assemblies of Angus (taurine) and Brahman (indicine) cattle subspecies from contigs generated by the trio binning method. These assemblies reveal structural and copy number variants that differentiate the subspecies and show that variant detection is sensitive to the specific reference genome chosen. Six genes with immune-related functions have additional copies in the indicine compared with the taurine lineage, and an indicus-specific extra copy of fatty acid desaturase is under positive selection. The haplotyped genomes also enable transcripts to be phased to detect allele-specific expression. This work exemplifies the value of haplotype-resolved genomes to better explore evolutionary and functional variations.
Identifiers
pubmed: 32350247
doi: 10.1038/s41467-020-15848-y
pii: 10.1038/s41467-020-15848-y
pmc: PMC7190621
Chemical substances
RNA, Messenger
0
Publication types
Journal Article
Research Support, N.I.H., Intramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
2071
Grants
Agency: Wellcome Trust
ID: 108749/Z/15/Z
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/M011615/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/S020152/1
Country: United Kingdom