A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds.

Centromeres; Genome assembly; Hi-C; Linked-read sequencing; Optical mapping; Single-molecule real-time (SMRT) sequencing; Telomeres

Journal

BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
08 Apr 2019
History:
received: 27 07 2018
accepted: 24 03 2019
entrez: 10 4 2019
pubmed: 10 4 2019
medline: 28 7 2019
Status: epublish

Abstract

The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data, followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map. Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), which was based mainly on Sanger sequencing reads. N50 of contigs is 120-fold higher (5.381 Mbp compared to 0.053 Mbp) and we anchor >98% of the sequence to chromosomes. All 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features. The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identifying structural variants, and comparative genomics.
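The contig N50 quoted in the abstract is a standard contiguity metric: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper or its data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name): sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made up for illustration, arbitrary units).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

A higher N50 means that more of the assembly sits in long sequences, which is why the metric is used to compare the contiguity of two assemblies of the same genome.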

Identifiers

pubmed: 30961563
doi: 10.1186/s12864-019-5642-0
pii: 10.1186/s12864-019-5642-0
pmc: PMC6454739

Publication types

Journal Article

Languages

eng

Pagination

275

Grants

Agency: Vetenskapsrådet
ID: 2013-722
Agency: Svenska Forskningsrådet Formas
ID: 2014-5096

Authors

Andreas Wallberg (A)

Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.

Ignas Bunikis (I)

Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.

Olga Vinnere Pettersson (OV)

Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.

Mai-Britt Mosbech (MB)

Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.

Anna K Childers (AK)

USDA-ARS Insect Genetics and Biochemistry Research Unit, Fargo, ND, USA.
USDA-ARS Bee Research Lab, Beltsville, MD, USA.

Jay D Evans (JD)

USDA-ARS Bee Research Lab, Beltsville, MD, USA.

Alexander S Mikheyev (AS)

Okinawa Institute of Science and Technology, Okinawa, Japan.

Hugh M Robertson (HM)

Department of Entomology and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Gene E Robinson (GE)

Department of Entomology and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Matthew T Webster (MT)

Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden. matthew.webster@imbim.uu.se.
