Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
29 04 2020
History:
received: 20 08 2019
accepted: 27 03 2020
entrez: 1 5 2020
pubmed: 1 5 2020
medline: 30 7 2020
Status: epublish

Abstract

Inbred animals were historically chosen for genome analysis to circumvent assembly issues caused by haplotype variation, but this resulted in a composite of the two genomes. Here we report a haplotype-aware scaffolding and polishing pipeline which was used to create haplotype-resolved, chromosome-level genome assemblies of Angus (taurine) and Brahman (indicine) cattle subspecies from contigs generated by the trio binning method. These assemblies reveal structural and copy number variants that differentiate the subspecies and show that variant detection is sensitive to the specific reference genome chosen. Six genes with immune-related functions have additional copies in the indicine compared with the taurine lineage, and an indicus-specific extra copy of fatty acid desaturase is under positive selection. The haplotyped genomes also enable transcripts to be phased to detect allele-specific expression. This work exemplifies the value of haplotype-resolved genomes to better explore evolutionary and functional variations.

Identifiers

pubmed: 32350247
doi: 10.1038/s41467-020-15848-y
pii: 10.1038/s41467-020-15848-y
pmc: PMC7190621

Chemical substances

RNA, Messenger 0

Publication types

Journal Article; Research Support, N.I.H., Intramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

2071

Grants

Agency: Wellcome Trust
ID: 108749/Z/15/Z
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/M011615/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/S020152/1
Country: United Kingdom

Authors

Wai Yee Low (WY)

The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia.

Rick Tearle (R)

The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia.

Ruijie Liu (R)

The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia.

Sergey Koren (S)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.

Arang Rhie (A)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.

Derek M Bickhart (DM)

Dairy Forage Research Center, ARS USDA, Madison, WI, USA.

Benjamin D Rosen (BD)

Animal Genomics and Improvement Laboratory, ARS USDA, Beltsville, MD, USA.

Zev N Kronenberg (ZN)

Phase Genomics, 4000 Mason Road, Suite 225, Seattle, WA, 98195, USA.

Sarah B Kingan (SB)

Pacific Biosciences, Menlo Park, CA, 94025, USA.

Elizabeth Tseng (E)

Pacific Biosciences, Menlo Park, CA, 94025, USA.

Françoise Thibaud-Nissen (F)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.

Fergal J Martin (FJ)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

Konstantinos Billis (K)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

Jay Ghurye (J)

Center for Bioinformatics and Computational Biology, Lab 3104A, Biomolecular Science Building, University of Maryland, College Park, MD, 20742, USA.

Alex R Hastie (AR)

Bionano Genomics, San Diego, CA, USA.

Joyce Lee (J)

Bionano Genomics, San Diego, CA, USA.

Andy W C Pang (AWC)

Bionano Genomics, San Diego, CA, USA.

Michael P Heaton (MP)

US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA.

Adam M Phillippy (AM)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.

Stefan Hiendleder (S)

The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia. stefan.hiendleder@adelaide.edu.au.

Timothy P L Smith (TPL)

US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA. tim.smith2@usda.gov.

John L Williams (JL)

The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia. john.williams01@adelaide.edu.au.
