Scalable search of massively pooled nucleic acid samples enabled by a molecular database query language.
Journal
medRxiv : the preprint server for health sciences
Abbreviated title: medRxiv
Country: United States
NLM ID: 101767986
Publication information
Publication date:
15 Apr 2024
History:
medline: 3 5 2024
pubmed: 3 5 2024
entrez: 3 5 2024
Status:
epublish
Abstract
The surge in nucleic acid analytics requires scalable storage and retrieval systems akin to electronic databases used to organize digital data. Such a system could transform disease diagnosis, ecological preservation, and molecular surveillance of biothreats. Current storage systems use individual containers for nucleic acid samples, requiring single-sample retrieval that falls short compared with digital databases that allow complex and combinatorial data retrieval on aggregated data. Here, we leverage protective microcapsules with combinatorial DNA labeling that enables arbitrary retrieval on pooled biosamples analogous to Structured Query Languages. Ninety-six encapsulated pooled mock SARS-CoV-2 genomic samples barcoded with patient metadata are used to demonstrate queries with simultaneous matches to sample collection date ranges, locations, and patient health statuses, illustrating how such flexible queries can be used to yield immunological or epidemiological insights. The approach applies to any biosample database labeled with orthogonal barcodes, enabling complex post-hoc analysis, for example, to study global biothreat epidemiology.
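The kind of combined query the abstract describes (a collection date range, a set of locations, and a set of health statuses matched at once) can be illustrated with a purely software analogue. This is a minimal sketch of the query logic only, not the authors' molecular method: the `Sample` fields, the `query` function, and all sample values are hypothetical and are not taken from the preprint.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sample:
    """Hypothetical metadata record standing in for one barcoded sample."""
    sample_id: int
    collected: date
    location: str
    status: str


def query(pool, start, end, locations, statuses):
    """Return samples matching a date range AND a location set AND a status set.

    Each condition plays the role of one barcode category; a sample is
    retrieved only if it matches every category simultaneously.
    """
    return [
        s for s in pool
        if start <= s.collected <= end
        and s.location in locations
        and s.status in statuses
    ]


# Invented illustrative pool (not data from the preprint).
pool = [
    Sample(1, date(2021, 1, 10), "site_A", "hospitalized"),
    Sample(2, date(2021, 2, 5), "site_A", "recovered"),
    Sample(3, date(2021, 2, 20), "site_B", "hospitalized"),
    Sample(4, date(2021, 3, 15), "site_A", "hospitalized"),
]

hits = query(
    pool,
    start=date(2021, 1, 1),
    end=date(2021, 2, 28),
    locations={"site_A"},
    statuses={"hospitalized", "recovered"},
)
print([s.sample_id for s in hits])  # prints [1, 2]
```

In SQL terms this corresponds to a `WHERE` clause that combines a range condition with two set-membership conditions; in the approach described above, the selection would act on DNA labels of pooled microcapsules instead of on rows in a table.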
Identifiers
pubmed: 38699348
doi: 10.1101/2024.04.12.24305660
pmc: PMC11064994
Publication types
Preprint
Languages
eng