G-Aligner: a graph-based feature alignment method for untargeted LC-MS-based metabolomics.

Keywords: Combinatorial optimization; Feature alignment; LC–MS; Multidimensional assignment problem

Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
14 Nov 2023
History:
received: 05 Jul 2023
accepted: 09 Oct 2023
medline: 16 Nov 2023
pubmed: 15 Nov 2023
entrez: 14 Nov 2023
Status: epublish

Abstract

BACKGROUND
Liquid chromatography-mass spectrometry (LC-MS) is widely used in untargeted metabolomics for composition profiling. In multi-run analyses, feature alignment algorithms align the features of each run into consensus features so that intensity variations can be observed across runs. However, most existing feature alignment methods concentrate on accurate retention time correction and underestimate the importance of feature matching. None of the existing methods comprehensively considers feature correspondences among all runs to achieve optimal matching.
RESULTS
To comprehensively analyze feature correspondences among runs, we propose G-Aligner, a graph-based feature alignment method for untargeted LC-MS data. In the feature matching stage, G-Aligner treats features and potential correspondences as nodes and edges in a multipartite graph, formulates the multi-run feature matching problem as an unbalanced multidimensional assignment problem, and provides three combinatorial optimization algorithms to find optimal matching solutions. In comparison with the feature alignment methods in OpenMS, MZmine2 and XCMS on three public metabolomics benchmark datasets, G-Aligner achieved the best feature alignment performance on all three datasets, with up to 9.8% and 26.6% increases in accurately aligned features and analytes, respectively. It also helped all of the compared software tools obtain more accurate results on their self-extracted features when integrated into their analysis workflows. G-Aligner is open-source and freely available at https://github.com/CSi-Studio/G-Aligner under a permissive license. Benchmark datasets, manual annotation results, evaluation methods and results are available at https://doi.org/10.5281/zenodo.8313034.
CONCLUSIONS
In this study, we proposed G-Aligner to improve feature matching accuracy for untargeted metabolomics LC-MS data. G-Aligner comprehensively considers potential feature correspondences between all runs, casting the feature matching problem as a multidimensional assignment problem (MAP). In evaluations on three public metabolomics benchmark datasets, G-Aligner achieved the highest alignment accuracy on both manually annotated features and features extracted by popular software, demonstrating the effectiveness and robustness of the algorithm.
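
As a rough illustration of the graph formulation described in the abstract, the following Python sketch builds a small multipartite graph over toy features and groups them into consensus features. It is not G-Aligner's implementation and does not reproduce any of its three optimization algorithms: the feature values, the tolerances `MZ_TOL` and `RT_TOL`, the edge cost, and the greedy merging heuristic are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy features: (run_id, feature_id, m/z, retention time in seconds)
FEATURES = [
    (0, "a0", 180.063, 120.0),
    (0, "b0", 255.232, 300.0),
    (1, "a1", 180.064, 123.0),
    (1, "b1", 255.233, 296.0),
    (2, "a2", 180.062, 118.5),
    (2, "c2", 400.100, 500.0),
]

MZ_TOL = 0.01   # assumed m/z tolerance (Da)
RT_TOL = 10.0   # assumed retention-time tolerance (s)


def candidate_edges(features):
    """Edges of the multipartite graph: pairs from different runs within tolerance."""
    edges = []
    for (i, f), (j, g) in combinations(enumerate(features), 2):
        if f[0] == g[0]:
            continue  # no edges inside one run (multipartite constraint)
        d_mz, d_rt = abs(f[2] - g[2]), abs(f[3] - g[3])
        if d_mz <= MZ_TOL and d_rt <= RT_TOL:
            # Normalised distance used as the edge cost (an illustrative choice).
            edges.append((d_mz / MZ_TOL + d_rt / RT_TOL, i, j))
    return sorted(edges)


def greedy_align(features):
    """Greedily merge cheapest edges; each group keeps at most one feature per run."""
    group_of = list(range(len(features)))            # current group label per feature
    members = {i: [i] for i in range(len(features))}  # group label -> feature indices
    for _, i, j in candidate_edges(features):
        gi, gj = group_of[i], group_of[j]
        if gi == gj:
            continue  # already in the same consensus feature
        runs_i = {features[k][0] for k in members[gi]}
        runs_j = {features[k][0] for k in members[gj]}
        if runs_i & runs_j:
            continue  # merging would place two features of one run in a group
        for k in members[gj]:
            group_of[k] = gi
        members[gi].extend(members.pop(gj))
    return sorted(sorted(features[k][1] for k in m) for m in members.values())


consensus = greedy_align(FEATURES)
print(consensus)
```

A greedy pass like this can get stuck in locally good but globally suboptimal groupings, which is the kind of limitation that motivates treating the problem as a multidimensional assignment problem and solving it with dedicated combinatorial optimization.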

Identifiers

pubmed: 37964228
doi: 10.1186/s12859-023-05525-4
pii: 10.1186/s12859-023-05525-4
pmc: PMC10644574

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

431

Grants

Agency: Natural Science Foundation of Shandong Province
ID: 2022HWYQ-081

Copyright information

© 2023. The Author(s).

Authors

Ruimin Wang (R)

Fudan University, Shanghai, 200433, Shanghai, China.
School of Engineering, Westlake University, Hangzhou, 310030, Zhejiang, China.
Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, 250021, Shandong, China.

Miaoshan Lu (M)

School of Engineering, Westlake University, Hangzhou, 310030, Zhejiang, China.
Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, 250021, Shandong, China.
Zhejiang University, Hangzhou, 310058, Zhejiang, China.

Shaowei An (S)

Fudan University, Shanghai, 200433, Shanghai, China.
Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, 250021, Shandong, China.
School of Life Sciences, Westlake University, Hangzhou, 310030, Zhejiang, China.

Jinyin Wang (J)

Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, 250021, Shandong, China.
Zhejiang University, Hangzhou, 310058, Zhejiang, China.
School of Life Sciences, Westlake University, Hangzhou, 310030, Zhejiang, China.

Changbin Yu (C)

Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, 250021, Shandong, China. yu_lab@sdfmu.edu.cn.
