First de novo whole genome sequencing and assembly of the bar-headed goose.

10X Genomics Chromium; Anser indicus; Avian genomes; Bar-headed goose; Comparative genomics; Conservation genomics; High-altitude adaptation; Hypoxia; Positive selection; Qinghai-Tibetan Plateau

Journal

PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425

Publication information

Publication date:
2020
History:
received: 2019-09-23
accepted: 2020-03-15
entrez: 2020-04-16
pubmed: 2020-04-16
medline: 2020-04-16
Status: epublish

Abstract

BACKGROUND
The bar-headed goose (
METHODS
In this study, we present the first de novo whole genome sequencing and assembly of the bar-headed goose, along with gene prediction and annotation.
RESULTS
10X Genomics sequencing produced a total of 124 Gb of sequencing data, covering the estimated genome size of the bar-headed goose 103 times (average coverage). The genome assembly comprised 10,528 scaffolds, with a total length of 1.143 Gb and a scaffold N50 of 10.09 Mb. Annotation of the bar-headed goose genome assembly identified a total of 102 Mb (8.9%) of repetitive sequences, 16,428 protein-coding genes, and 282 tRNAs. In total, we identified 63 expanded and 20 contracted gene families in the bar-headed goose compared with the other 15 vertebrates. We also performed a positive selection analysis between the bar-headed goose and the closely related low-altitude goose, swan goose (
CONCLUSIONS
We report the most complete genome sequence of the bar-headed goose currently available. Our assembly will provide a valuable resource for further studies of gene function in the bar-headed goose. The data will also be valuable for studies of the evolution, population genetics and high-altitude adaptations of bar-headed geese at the genomic level.
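
The figures quoted in the abstract can be cross-checked with simple arithmetic. The Python sketch below is only an illustration and is not the authors' pipeline: it derives the genome size implied by 124 Gb of data at 103-fold coverage, recomputes the repeat fraction from 102 Mb over a 1.143 Gb assembly, and shows how a scaffold N50 is conventionally defined. The function name `scaffold_n50` and the toy scaffold lengths are hypothetical and chosen for illustration only.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Figures taken from the abstract.
sequenced_gb = 124.0   # total sequencing data
fold_coverage = 103.0  # reported average coverage
assembly_gb = 1.143    # total assembly length
repeats_mb = 102.0     # repetitive sequence content

# Genome size implied by data volume and coverage (about 1.2 Gb).
implied_genome_gb = sequenced_gb / fold_coverage

# Repeat fraction of the assembly; rounds to the reported 8.9%.
repeat_fraction = repeats_mb / (assembly_gb * 1000.0)

# Hypothetical toy scaffold lengths, NOT data from the paper.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = scaffold_n50(toy_lengths)

print(f"implied genome size: {implied_genome_gb:.2f} Gb")
print(f"repeat fraction: {repeat_fraction:.1%}")
print(f"toy N50: {toy_n50}")
```

On these numbers the implied genome size (about 1.2 Gb) is slightly larger than the 1.143 Gb assembly, which is the usual relationship between a genome size estimate and an assembled length.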

Identifiers

pubmed: 32292659
doi: 10.7717/peerj.8914
pii: 8914
pmc: PMC7144584

Databases

figshare
10.6084/m9.figshare.8229083.v1

Publication types

Journal Article

Languages

eng

Pagination

e8914

Copyright information

©2020 Wang et al.

Conflict of interest statement

The authors declare there are no competing interests. Rongkai Hao is employed by Novogene Bioinformatics Institute.

Authors

Wen Wang (W)

State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xi'ning, Qinghai, China.

Fang Wang (F)

Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xi'ning, Qinghai, China.

Rongkai Hao (R)

Novogene Bioinformatics Institute, Beijing, China.

Aizhen Wang (A)

College of Eco-Environmental Engineering, Qinghai University, Xi'ning, Qinghai, China.

Kirill Sharshov (K)

Research Institute of Experimental and Clinical Medicine, Novosibirsk, Russia.

Alexey Druzyaka (A)

Institute of Systematics and Ecology of Animals, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.

Zhuoma Lancuo (Z)

School of Finance and Economics, Qinghai University, Xi'ning, Qinghai, China.

Yuetong Shi (Y)

KunLun College of Qinghai University, Xi'ning, Qinghai, China.

Shuo Feng (S)

State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xi'ning, Qinghai, China.

MeSH classifications