G-Aligner: a graph-based feature alignment method for untargeted LC-MS-based metabolomics.
Combinatorial optimization
Feature alignment
LC–MS
Multidimensional assignment problem
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
14 Nov 2023
History:
received: 05 Jul 2023
accepted: 09 Oct 2023
medline: 16 Nov 2023
pubmed: 15 Nov 2023
entrez: 14 Nov 2023
Status:
epublish
Abstract
BACKGROUND
Liquid chromatography-mass spectrometry (LC-MS) is widely used in untargeted metabolomics for composition profiling. In multi-run analyses, feature alignment algorithms align the features of each run into consensus features so that intensity variations across runs can be observed. However, most existing feature alignment methods focus on accurate retention time correction and underestimate the importance of feature matching. None of the existing methods comprehensively considers feature correspondences among all runs and achieves optimal matching.
RESULTS
To comprehensively analyze feature correspondences among runs, we propose G-Aligner, a graph-based feature alignment method for untargeted LC-MS data. In the feature matching stage, G-Aligner treats features and potential correspondences as nodes and edges in a multipartite graph, formulates the multi-run feature matching problem as an unbalanced multidimensional assignment problem, and provides three combinatorial optimization algorithms to find optimal matching solutions. Compared with the feature alignment methods in OpenMS, MZmine2 and XCMS on three public metabolomics benchmark datasets, G-Aligner achieved the best feature alignment performance on all three datasets, with increases of up to 9.8% in accurately aligned features and up to 26.6% in accurately aligned analytes. Integrating G-Aligner into the analysis workflow of each compared software package also yielded more accurate results on that software's self-extracted features. G-Aligner is open-source and freely available at https://github.com/CSi-Studio/G-Aligner under a permissive license. Benchmark datasets, manual annotation results, evaluation methods and results are available at https://doi.org/10.5281/zenodo.8313034
CONCLUSIONS
In this study, we proposed G-Aligner to improve feature matching accuracy for untargeted metabolomics LC-MS data. G-Aligner comprehensively considers potential feature correspondences between all runs, converting the feature matching problem into a multidimensional assignment problem (MAP). In evaluations on three public metabolomics benchmark datasets, G-Aligner achieved the highest alignment accuracy on both manually annotated features and features extracted by popular software, demonstrating the effectiveness and robustness of the algorithm.
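The multipartite-graph formulation described in the abstract can be illustrated with a small sketch. This is not one of G-Aligner's three optimization algorithms, which the abstract does not name; it is a simple greedy heuristic chosen only to show the problem structure. The function names, the `(run, mz, rt)` tuple layout, the tolerance values, and the cost function are all assumptions made for illustration.

```python
from itertools import combinations

def candidate_edges(features, mz_tol=0.01, rt_tol=0.5):
    """Build the edges of the multipartite graph.

    features: list of (run, mz, rt) tuples (hypothetical layout).
    Returns (cost, i, j) triples sorted by ascending cost.
    """
    edges = []
    for i, j in combinations(range(len(features)), 2):
        run_i, mz_i, rt_i = features[i]
        run_j, mz_j, rt_j = features[j]
        if run_i == run_j:
            continue  # multipartite: no edges between features of one run
        d_mz, d_rt = abs(mz_i - mz_j), abs(rt_i - rt_j)
        if d_mz <= mz_tol and d_rt <= rt_tol:
            # Illustrative cost: sum of tolerance-normalized deviations.
            edges.append((d_mz / mz_tol + d_rt / rt_tol, i, j))
    return sorted(edges)

def greedy_align(features, mz_tol=0.01, rt_tol=0.5):
    """Greedily merge features into consensus groups.

    Each group holds at most one feature per run. Runs may contribute
    different numbers of features (the "unbalanced" case), so some
    groups stay incomplete or remain singletons.
    """
    group_of = list(range(len(features)))
    members = {i: [i] for i in range(len(features))}
    runs = {i: {features[i][0]} for i in range(len(features))}
    for _, i, j in candidate_edges(features, mz_tol, rt_tol):
        gi, gj = group_of[i], group_of[j]
        if gi == gj or runs[gi] & runs[gj]:
            continue  # already merged, or merging would duplicate a run
        for k in members[gj]:
            group_of[k] = gi
        members[gi] += members.pop(gj)
        runs[gi] |= runs.pop(gj)
    return sorted(sorted(m) for m in members.values())

features = [
    ("A", 100.000, 5.00), ("A", 100.002, 5.30),
    ("B", 100.001, 5.05), ("B", 100.003, 5.35),
    ("C", 100.000, 5.10),
]
print(greedy_align(features))  # [[0, 2, 4], [1, 3]]
```

A greedy pass like this commits to the cheapest edges first and can miss the globally optimal assignment, which is the motivation the abstract gives for treating the task as a combinatorial optimization problem over all runs at once.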
Identifiers
pubmed: 37964228
doi: 10.1186/s12859-023-05525-4
pii: 10.1186/s12859-023-05525-4
pmc: PMC10644574
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
431
Grants
Agency: Natural Science Foundation of Shandong Province
ID: 2022HWYQ-081
Copyright information
© 2023. The Author(s).