A near-chromosome-scale genome assembly of the gemsbok (Oryx gazella): an iconic antelope of the Kalahari desert.
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date: 1 February 2019
History:
received: 15 October 2018
accepted: 12 December 2018
pubmed: 17 January 2019
medline: 25 June 2019
entrez: 17 January 2019
Status: epublish
Abstract
The gemsbok (Oryx gazella) is one of the largest antelopes in Africa. Gemsbok are heterothermic and thus highly adapted to live in the desert, changing their feeding behavior when faced with extreme drought and heat. A high-quality genome sequence of this species will assist efforts to elucidate these and other important traits of gemsbok and facilitate research on conservation efforts. Using 180 Gbp of Illumina paired-end and mate-pair reads, a 2.9 Gbp assembly with scaffold N50 of 1.48 Mbp was generated using SOAPdenovo. Scaffolds were extended using Chicago library sequencing, which yielded an additional 114.7 Gbp of DNA sequence. The HiRise assembly using SOAPdenovo + Chicago library sequencing produced a scaffold N50 of 47 Mbp and a final genome size of 2.9 Gbp, representing 90.6% of the estimated genome size and including 93.2% of expected genes according to Benchmarking Universal Single-Copy Orthologs analysis. The Reference-Assisted Chromosome Assembly tool was used to generate a final set of 47 predicted chromosome fragments with N50 of 86.25 Mbp and containing 93.8% of expected genes. A total of 23,125 protein-coding genes and 1.14 Gbp of repetitive sequences were annotated using de novo and homology-based predictions. Our results provide the first high-quality, chromosome-scale genome sequence assembly for gemsbok, which will be a valuable resource for studying adaptive evolution of this species and other ruminants.
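The abstract reports scaffold N50 values (1.48 Mbp, 47 Mbp, and 86.25 Mbp) as its main measure of assembly contiguity. As a minimal sketch of how that statistic is conventionally computed, the function below sorts scaffold lengths from longest to shortest and returns the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the paper or from the tools it used.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not data from the paper.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
```

With these toy values the total is 300, and the two longest scaffolds (80 + 70) already cover half of it, so the N50 is 70. A higher N50 for the same total size means the assembly is split into fewer, longer pieces.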
Identifiers
pubmed: 30649288
pii: 5289690
doi: 10.1093/gigascience/giy162
pmc: PMC6351727
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Grants
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/P020062/1
Country: United Kingdom