High-scale random access on DNA storage systems.


Journal

NAR genomics and bioinformatics
ISSN: 2631-9268
Abbreviated title: NAR Genom Bioinform
Country: England
NLM ID: 101756213

Publication information

Publication date:
Mar 2022
History:
received: 9 Oct 2021
revised: 2 Dec 2021
accepted: 22 Dec 2021
entrez: 14 Feb 2022
pubmed: 15 Feb 2022
medline: 15 Feb 2022
Status: epublish

Abstract

Due to the rapid decline in the cost of synthesizing and sequencing deoxyribonucleic acid (DNA), its high information density, and its durability of up to centuries, utilizing DNA as an information storage medium has received the attention of many scientists. State-of-the-art DNA storage systems exploit the high capacity of DNA and enable random access (predominantly random reads) by primers, which serve as unique identifiers for directly accessing data. However, primers come with a significant limitation regarding the maximum available number per DNA library. The number of different primers within a library is typically very small (e.g. ≈10). We propose a method to overcome this deficiency and present a general-purpose technique for addressing and directly accessing thousands to potentially millions of different data objects within the same DNA pool. Our approach utilizes a fountain code, sophisticated probe design, and microarray technologies. A key component is locality-sensitive hashing, which makes checks for dissimilarity among such a large number of probes and data objects feasible.
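
The abstract names locality-sensitive hashing (LSH) as the component that keeps dissimilarity checks tractable, but it does not specify the scheme. The following is a minimal sketch of one common LSH variant, MinHash over k-mers with banding, and is not presented as the authors' implementation. The function names, the k-mer length, the hash count and the band count are all illustrative assumptions. The idea is that only sequences sharing at least one band bucket become candidate pairs, so an exact similarity check is needed for a small subset instead of for every pair.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of overlapping k-mers of a DNA sequence."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, num_hashes=32, k=4):
    """MinHash signature: for each seed, the minimum hash over all k-mers."""
    tokens = kmers(seq, k)
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_hashes))


def candidate_pairs(seqs, num_hashes=32, bands=16, k=4):
    """Index pairs of sequences that share at least one band bucket."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for idx, seq in enumerate(seqs):
        sig = minhash_signature(seq, num_hashes, k)
        for b in range(bands):
            band = sig[b * rows:(b + 1) * rows]
            buckets[(b, band)].add(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def jaccard(a, b, k=4):
    """Exact Jaccard similarity of the k-mer sets, used to verify candidates."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb)


if __name__ == "__main__":
    # Hypothetical probe sequences, for illustration only.
    probes = [
        "ACGTACGTACGTACGTACGT",
        "ACGTACGTACGTACGTACGT",
        "TTTTTTTTGGGGGGGGCCCC",
    ]
    for i, j in sorted(candidate_pairs(probes)):
        print(i, j, jaccard(probes[i], probes[j]))
```

In a probe-design setting, a new probe would be accepted only if none of its candidates exceeds a chosen similarity threshold. Jaccard similarity over k-mers is used here as a simple stand-in; a real design would likely also consider hybridization-relevant measures such as edit distance or melting temperature.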

Identifiers

pubmed: 35156022
doi: 10.1093/nargab/lqab126
pii: lqab126
pmc: PMC8829907

Publication types

Journal Article

Languages

eng

Pagination

lqab126

Copyright information

© The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

Authors

Alex El-Shaikh (A)

Department of Computer Science, University of Marburg, Marburg 35037, Germany.

Marius Welzel (M)

Department of Computer Science, University of Marburg, Marburg 35037, Germany.

Dominik Heider (D)

Department of Computer Science, University of Marburg, Marburg 35037, Germany.

Bernhard Seeger (B)

Department of Computer Science, University of Marburg, Marburg 35037, Germany.
