Probing the physical limits of reliable DNA data retrieval.

Base Sequence; DNA / chemistry; Databases, Nucleic Acid; Gene Dosage; Genetic Engineering / methods; High-Throughput Nucleotide Sequencing; Information Science; Information Storage and Retrieval / methods

Journal

Nature Communications

ISSN: 2041-1723

Abbreviated title: Nat Commun

Country: England

NLM ID: 101528555

Publication information

Publication date:
30 January 2020

History:

received: 28 May 2019

accepted: 16 December 2019

entrez: 1 February 2020

pubmed: 1 February 2020

medline: 24 April 2020

Status: epublish

Abstract

Synthetic DNA is gaining momentum as a potential medium for archival data storage. In this process, digital information is translated into sequences of nucleotides, and the resulting synthetic DNA strands are then stored for later retrieval. Here, we demonstrate reliable file recovery with PCR-based random access when as few as ten copies per sequence are stored, on average. This results in a density of about 17 exabytes/gram, nearly two orders of magnitude greater than prior work has shown. We successfully retrieve the same data in a complex pool of over 10

Identifiers

DOI: 10.1038/s41467-020-14319-8 PMID: 32001691 PMC: PMC6992699

pubmed: 32001691

doi: 10.1038/s41467-020-14319-8

pii: 10.1038/s41467-020-14319-8

pmc: PMC6992699

Chemical substances

DNA 9007-49-2

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

Pagination

616

Comments and corrections

Type: ErratumIn

Authors

Lee Organick (L)

Yuan-Jyue Chen (YJ)

Siena Dumas Ang (S)

Randolph Lopez (R)

Xiaomeng Liu (X)

Karin Strauss (K)

Luis Ceze (L)