First de novo whole genome sequencing and assembly of the bar-headed goose.
10X Genomics Chromium
Anser indicus
Avian genomes
Bar-headed goose
Comparative genomics
Conservation genomics
High-altitude adaptation
Hypoxia
Positive selection
Qinghai-Tibetan Plateau
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date: 2020
History:
received: 2019-09-23
accepted: 2020-03-15
entrez: 2020-04-16
pubmed: 2020-04-16
medline: 2020-04-16
Status: epublish
Abstract
BACKGROUND
The bar-headed goose (
METHODS
In this study, we present the first de novo whole genome sequencing and assembly of the bar-headed goose, along with gene prediction and annotation.
RESULTS
10X Genomics sequencing produced a total of 124 Gb of sequencing data, corresponding to an average coverage of 103× of the estimated bar-headed goose genome size. The genome assembly comprised 10,528 scaffolds, with a total length of 1.143 Gb and a scaffold N50 of 10.09 Mb. Annotation of the assembly identified a total of 102 Mb (8.9%) of repetitive sequences, 16,428 protein-coding genes, and 282 tRNAs. In total, we identified 63 expanded and 20 contracted gene families in the bar-headed goose compared with the other 15 vertebrates. We also performed a positive selection analysis between the bar-headed goose and the closely related low-altitude goose, the swan goose (
CONCLUSIONS
We report what is currently the most complete genome sequence of the bar-headed goose. Our assembly will provide a valuable resource for further studies of gene function in the bar-headed goose. The data will also facilitate studies of the evolution, population genetics and high-altitude adaptations of the bar-headed goose at the genomic level.
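The summary statistics quoted in the abstract (average coverage, repeat fraction, scaffold N50) follow from simple arithmetic. The Python sketch below is illustrative only and is not the authors' pipeline: the function name `scaffold_n50` and the toy scaffold lengths are hypothetical, and the genome size is merely the value implied by dividing the quoted 124 Gb by the quoted 103-fold coverage.

```python
def scaffold_n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together hold at least half of the assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Figures quoted in the abstract.
sequenced_gb = 124.0   # total sequencing data, Gb
coverage = 103.0       # average coverage (fold)
assembly_gb = 1.143    # total assembly length, Gb
repeats_mb = 102.0     # repetitive sequence, Mb

# Genome size implied by coverage = sequenced bases / genome size.
implied_genome_gb = sequenced_gb / coverage

# Repeat content as a fraction of the assembly length.
repeat_fraction = repeats_mb / (assembly_gb * 1000.0)

# Hypothetical toy scaffold lengths, used only to illustrate N50.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = scaffold_n50(toy_lengths)

print(f"implied genome size: {implied_genome_gb:.2f} Gb")
print(f"repeat fraction: {repeat_fraction:.1%}")
print(f"toy N50: {toy_n50}")
```

On the quoted figures, the repeat fraction works out to roughly 8.9% of the assembly, matching the percentage given in the abstract.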
Identifiers
pubmed: 32292659
doi: 10.7717/peerj.8914
pii: 8914
pmc: PMC7144584
Databases
figshare: 10.6084/m9.figshare.8229083.v1
Publication types
Journal Article
Languages
eng
Pagination
e8914
Copyright information
©2020 Wang et al.
Conflict of interest statement
The authors declare there are no competing interests. Rongkai Hao is employed by Novogene Bioinformatics Institute.