Continuous chromosome-scale haplotypes assembled from a single interspecies F1 hybrid of yak and cattle.
Bos grunniens
Bos taurus
Highland cattle
genome assembly
phasing
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date: 1 April 2020
History:
received: 1 October 2019
revised: 8 January 2020
accepted: 10 March 2020
entrez: 4 April 2020
pubmed: 4 April 2020
medline: 12 June 2021
Status: ppublish
Abstract
BACKGROUND
The development of trio binning as an approach for assembling diploid genomes has enabled the creation of fully haplotype-resolved reference genomes. Unlike other methods of assembly for diploid genomes, this approach is enhanced, rather than hindered, by the heterozygosity of the individual sequenced. To maximize heterozygosity and simultaneously assemble reference genomes for 2 species, we applied trio binning to an interspecies F1 hybrid of yak (Bos grunniens) and cattle (Bos taurus), 2 species that diverged nearly 5 million years ago. The genomes of both of these species are composed of acrocentric autosomes.
RESULTS
We produced the most continuous haplotype-resolved assemblies for a diploid animal yet reported. Both the maternal (yak) and paternal (cattle) assemblies have the largest 2 chromosomes in single haplotigs, and more than one-third of the autosomes similarly lack gaps. The maximum length haplotig produced was 153 Mb without any scaffolding or gap-filling steps and represents the longest haplotig reported for any species. The assemblies are also more complete and accurate than those reported for most other vertebrates, with 97% of mammalian universal single-copy orthologs present.
CONCLUSIONS
The high heterozygosity inherent to interspecies crosses maximizes the effectiveness of the trio binning method. The interspecies trio binning approach we describe is likely to provide the highest-quality assemblies for any pair of species that can interbreed to produce hybrid offspring that develop to sufficient cell numbers for DNA extraction.
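To illustrate why heterozygosity helps trio binning, the following is a minimal Python sketch of the general idea: k-mers found in only one parent are used to assign each read from the offspring to a parental haplotype before assembly. It is not the authors' pipeline. The function names (`kmers`, `parent_specific_kmers`, `classify_read`), the toy sequences, and the choice of k = 4 are all assumptions made for illustration, and the sketch omits steps that production tools need, such as filtering out k-mers caused by sequencing errors.

```python
from typing import Iterable, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(
    maternal_reads: Iterable[str], paternal_reads: Iterable[str], k: int
) -> Tuple[Set[str], Set[str]]:
    """Return (maternal-only, paternal-only) k-mer sets."""
    mat: Set[str] = set()
    for read in maternal_reads:
        mat |= kmers(read, k)
    pat: Set[str] = set()
    for read in paternal_reads:
        pat |= kmers(read, k)
    # K-mers shared by both parents carry no haplotype information.
    return mat - pat, pat - mat


def classify_read(read: str, mat_only: Set[str], pat_only: Set[str], k: int) -> str:
    """Assign a read to the parent whose specific k-mers it contains more of."""
    read_kmers = kmers(read, k)
    mat_hits = len(read_kmers & mat_only)
    pat_hits = len(read_kmers & pat_only)
    if mat_hits > pat_hits:
        return "maternal"
    if pat_hits > mat_hits:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy parental sequences that differ at a single base (hypothetical data).
maternal = ["ACGTACGTTAGC"]
paternal = ["ACGTACCTTAGC"]
K = 4

mat_only, pat_only = parent_specific_kmers(maternal, paternal, K)

for read in ["TACGTTAG", "GTACCTTA", "ACGTAC"]:
    print(read, classify_read(read, mat_only, pat_only, K))
```

In this toy case only reads that overlap the single differing base can be assigned; the read `ACGTAC` lies entirely in sequence shared by both parents and stays unassigned. The more the two parental genomes differ, as in an interspecies cross, the more parent-specific k-mers exist and the larger the fraction of reads that can be binned.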
Identifiers
pubmed: 32242610
pii: 5815405
doi: 10.1093/gigascience/giaa029
pmc: PMC7118895
Publication types
Journal Article
Research Support, N.I.H., Intramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2020. Published by Oxford University Press.