Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish.


Journal

GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872

Publication information

Publication date:
1 June 2020
History:
received: 4 October 2019
revised: 16 April 2020
accepted: 27 May 2020
entrez: 20 June 2020
pubmed: 20 June 2020
medline: 5 October 2021
Status: ppublish

Abstract

Whole-genome sequencing data from wild-caught individuals of four closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using the long-read Oxford Nanopore Technology (ONT) PromethION platform and short-read Illumina platforms. Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented, with high numbers of contigs, while ONT-only assemblies were error prone, with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, came from assemblies generated using a combination of short- and long-read data. For these assemblies, BUSCO scores were consistently >90% complete using the Eukaryota database. High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
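The abstract reports two summary metrics, sequence coverage and N50, without defining them. The sketch below shows the conventional definitions: mean coverage as total sequenced bases divided by genome size, and N50 as the length of the shortest contig in the smallest set of longest contigs that together span at least half of the assembly. The function names and all numbers are hypothetical illustrations, not values or code from the article.

```python
def mean_coverage(total_bases, genome_size):
    """Mean sequence coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def n50(contig_lengths):
    """N50 of an assembly, given the lengths of its contigs.

    Contigs are taken from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical inputs, chosen only to illustrate the arithmetic.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
example_cov = mean_coverage(45_000_000_000, 1_200_000_000)
```

With these toy inputs, the two 8-unit contigs already cover half of the 32-unit assembly, so the N50 is 8. A fragmented assembly with many short contigs yields a lower N50 than a contiguous assembly of the same total length, which is why the metric separates the Illumina-only and hybrid assemblies described above.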

Identifiers

pubmed: 32556169
pii: 5859380
doi: 10.1093/gigascience/giaa067
pmc: PMC7301629

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2020. Published by Oxford University Press.
