Probing the physical limits of reliable DNA data retrieval.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
30 01 2020
History:
received: 28 05 2019
accepted: 16 12 2019
entrez: 1 2 2020
pubmed: 1 2 2020
medline: 24 4 2020
Status: epublish

Abstract

Synthetic DNA is gaining momentum as a potential medium for archival data storage. In this process, digital information is translated into sequences of nucleotides and the resulting synthetic DNA strands are then stored for later retrieval. Here, we demonstrate reliable file recovery with PCR-based random access when as few as ten copies per sequence are stored, on average. This results in a density of about 17 exabytes/gram, nearly two orders of magnitude greater than prior work has shown. We successfully retrieve the same data in a complex pool of over 10

Identifiers

pubmed: 32001691
doi: 10.1038/s41467-020-14319-8
pii: 10.1038/s41467-020-14319-8
pmc: PMC6992699

Chemical substances

DNA 9007-49-2

Publication types

Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

616

Comments and corrections

Type: ErratumIn

Authors

Lee Organick (L)

Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, 98195, USA. leeorg@cs.washington.edu.

Yuan-Jyue Chen (YJ)

Microsoft, Redmond, WA, 98052, USA.

Siena Dumas Ang (S)

Microsoft, Redmond, WA, 98052, USA.

Randolph Lopez (R)

Department of Bioengineering, University of Washington, Seattle, WA, 98195, USA.

Xiaomeng Liu (X)

Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, 98195, USA.

Karin Strauss (K)

Microsoft, Redmond, WA, 98052, USA. kstrauss@microsoft.com.

Luis Ceze (L)

Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, 98195, USA. luisceze@cs.washington.edu.
