Effect of sequence depth and length in long-read assembly of the maize inbred NC358.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date: 08 May 2020
History:
received: 23 Nov 2019
accepted: 09 Apr 2020
entrez: 10 May 2020
pubmed: 10 May 2020
medline: 6 Aug 2020
Status:
epublish
Abstract
Improvements in long-read data and scaffolding technologies have enabled rapid generation of reference-quality assemblies for complex genomes. Still, an assessment of critical sequence depth and read length is important for allocating limited resources. To this end, we have generated eight assemblies for the complex genome of the maize inbred line NC358 using PacBio datasets ranging from 20× to 75× genomic depth and with N50 subread lengths of 11-21 kb. Assemblies with ≤30× depth and N50 subread length of 11 kb are highly fragmented, with even low-copy genic regions showing degradation at 20× depth. Distinct sequence-quality thresholds are observed for complete assembly of genes, transposable elements, and highly repetitive genomic features such as telomeres, heterochromatic knobs, and centromeres. In addition, we show high-quality optical maps can dramatically improve contiguity in even our most fragmented base assembly. This study provides a useful resource allocation reference to the community as long-read technologies continue to mature.
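The abstract describes datasets by two quantities, genomic depth and N50 subread length. As a minimal sketch of how these are conventionally computed from a list of read lengths (not the authors' pipeline; the function names `n50` and `depth` and the toy values are illustrative assumptions):

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input: no reads, so no N50


def depth(lengths: Iterable[int], genome_size: int) -> float:
    """Return mean sequencing depth: total sequenced bases / genome size."""
    return sum(lengths) / genome_size


# Toy example (hypothetical values, not data from the study).
toy_reads = [2, 3, 4, 5, 6]               # read lengths in bases
print(n50(toy_reads))                      # 5
print(depth(toy_reads, genome_size=10))    # 2.0
```

In the toy example the reads total 20 bases; the two longest reads (6 and 5) already cover 11 bases, which is at least half, so the N50 is 5. A real calculation would pass the subread lengths of a dataset and an estimated genome size, which the record above does not state.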
Identifiers
pubmed: 32385271
doi: 10.1038/s41467-020-16037-7
pii: 10.1038/s41467-020-16037-7
pmc: PMC7211024
Chemical substances
DNA Transposable Elements
0
Publication types
Journal Article
Research Support, N.I.H., Intramural
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
2288