Deep Learning Encoding for Rapid Sequence Identification on Microbiome Data.
convolutional neural networks
deep learning
denoising
embedding
encoding
microbiome
rapid sequence identification
Journal
Frontiers in bioinformatics
ISSN: 2673-7647
Abbreviated title: Front Bioinform
Country: Switzerland
NLM ID: 9918227263306676
Publication information
Publication date: 2022
History:
received: 2022-02-08
accepted: 2022-05-30
entrez: 2022-10-28
pubmed: 2022-10-29
medline: 2022-10-29
Status: epublish
Abstract
We present a novel approach for rapidly identifying sequences that leverages the representational power of deep learning techniques and is applied to the analysis of microbiome data. The method involves creating a latent sequence space, training a convolutional neural network to rapidly identify sequences by mapping them into that space, and using the encoded latent space for denoising to correct sequencing errors. Using mock bacterial communities of known composition, we show that this approach achieves single-nucleotide resolution, generating results for sequence identification and abundance estimation that match the best available microbiome algorithms in terms of accuracy while vastly increasing the speed of accurate processing. We further show that this approach supports phenotypic prediction at the sample level on an experimental data set for which the ground truth for sequence identities and abundances is unknown, but the expected phenotypes of the samples are definitive. Moreover, this approach offers a potential solution for the analysis of data from other types of experiments that currently rely on computationally intensive sequence identification.
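The idea of identifying a read by mapping it into an embedding space and looking up the closest known sequence can be illustrated with a minimal sketch. This is not the authors' method: the abstract describes a trained convolutional neural network, whereas the sketch below swaps in a normalised k-mer count vector as a stand-in encoder. The function names (`one_hot`, `kmer_embedding`, `identify`), the toy reference sequences, and the choice of k = 3 are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def one_hot(seq):
    """One-hot encode a DNA string (a typical CNN input format).

    Each base becomes a 4-element row ordered A, C, G, T.
    Bases outside the alphabet (e.g. N) become all zeros.
    """
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def kmer_embedding(seq, k=3):
    """Hypothetical stand-in for a learned encoder.

    Returns a unit-length vector of k-mer counts. A trained network
    would replace this function in a real pipeline.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0.0] * len(kmers)
    s = seq.upper()
    for i in range(len(s) - k + 1):
        j = index.get(s[i:i + k])
        if j is not None:  # skip k-mers containing unknown bases
            vec[j] += 1.0
    norm = sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def identify(query, references, k=3):
    """Return the name of the reference nearest to the query
    (Euclidean distance in the embedding space)."""
    q = kmer_embedding(query, k)

    def dist(name):
        r = kmer_embedding(references[name], k)
        return sqrt(sum((a - b) ** 2 for a, b in zip(q, r)))

    return min(references, key=dist)


# Toy reference set (invented sequences, not real taxa).
references = {
    "ref_A": "ACGTACGTACGTTTGACCA",
    "ref_B": "TTGGCCAAGGTTCCAAGGT",
}

# ref_A with a single substitution, mimicking a sequencing error.
noisy_read = "ACGTACGTACCTTTGACCA"
print(identify(noisy_read, references))
```

Assigning the noisy read to its nearest reference is the sense in which an embedding space can double as a denoiser: a read carrying a sequencing error still lands close to the sequence it came from.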
Identifiers
pubmed: 36304316
doi: 10.3389/fbinf.2022.871256
pii: 871256
pmc: PMC9580936
Publication types
Journal Article
Languages
eng
Pagination
871256
Copyright information
Copyright © 2022 Borgman, Stark, Carson and Hauser.
Conflict of interest statement
JB, KS, JC and LH were employed by Digital Infuzion, Inc.