Statistical analysis of isocratic chromatographic data using Bayesian modeling.
Bayesian inference
Method development
Multilevel model
Retention modeling
Journal
Analytical and bioanalytical chemistry
ISSN: 1618-2650
Abbreviated title: Anal Bioanal Chem
Country: Germany
NLM ID: 101134327
Publication information
Publication date:
May 2022
History:
received: 2021-09-16
accepted: 2022-02-08
revised: 2022-01-28
pubmed: 2022-03-30
medline: 2022-04-22
entrez: 2022-03-29
Status:
ppublish
Abstract
Chromatographic retention times are usually modeled one analyte at a time. This approach has limitations: no information is shared between analytes, so model predictions generalize poorly to out-of-sample analytes. In this work, a publicly available dataset was used to illustrate the benefits of pooling the individual data and analyzing them simultaneously using a Bayesian hierarchical approach. Statistical analysis was carried out using the Stan program coupled with R, which enables full Bayesian inference with Markov chain Monte Carlo sampling. This methodology allows (i) incorporating prior knowledge about the likely values of model parameters, (ii) considering the between-analyte variability and the correlation between the model parameters, (iii) explaining the between-analyte variability by available predictors, and (iv) sharing information across the analytes. The last of these is especially valuable when the data contain only limited information about certain model parameters. The results are obtained in the form of a posterior probability distribution, which quantifies uncertainty about the model parameters and predictions. Posterior probability is also directly relevant for decision-making. In this work, we used the Neue model to describe the relationship between retention factor and acetonitrile content in the mobile phase for 1026 analytes. The model was parametrized in terms of retention factor in 100% water, retention factor in 100% acetonitrile, and curvature coefficient, and considered log P and pK
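As an illustration of the parametrization mentioned in the abstract, the following minimal Python sketch evaluates one commonly used Neue-type equation written in terms of the log retention factor in 100% water, the log retention factor in 100% acetonitrile, and a curvature coefficient. The abstract does not state the exact equation, so the functional form, the function name `neue_log_k`, the parameter names, and the numeric values are all assumptions made for illustration, not the authors' implementation.

```python
def neue_log_k(phi, log_kw, log_ka, s2):
    """Log10 retention factor at organic-modifier volume fraction phi.

    Assumed form (not quoted from the paper):
        log k = log_kw - (log_kw - log_ka) * (1 + s2) * phi / (1 + s2 * phi)

    phi    : acetonitrile volume fraction, between 0 and 1
    log_kw : log10 retention factor in 100% water (phi = 0)
    log_ka : log10 retention factor in 100% acetonitrile (phi = 1)
    s2     : curvature coefficient (s2 = 0 gives a straight line in phi)
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be a volume fraction between 0 and 1")
    if s2 <= -1.0:
        raise ValueError("s2 must be greater than -1 to keep the denominator positive")
    s1 = log_kw - log_ka  # total change in log k between pure water and pure acetonitrile
    return log_kw - s1 * (1.0 + s2) * phi / (1.0 + s2 * phi)


# Hypothetical analyte: log kw = 3, log ka = -1, curvature s2 = 2.
for phi in (0.0, 0.5, 1.0):
    print(phi, neue_log_k(phi, log_kw=3.0, log_ka=-1.0, s2=2.0))
```

By construction, this form returns `log_kw` at `phi = 0` and `log_ka` at `phi = 1`, which is what makes the two endpoint parameters directly interpretable. In a hierarchical model of the kind the abstract describes, each analyte would have its own `log_kw`, `log_ka`, and `s2`, drawn from a shared population distribution.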
Identifiers
pubmed: 35347353
doi: 10.1007/s00216-022-03968-x
pii: 10.1007/s00216-022-03968-x
Chemical substances
Water
059QF0KO0R
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
3471-3481
Copyright information
© 2022. Springer-Verlag GmbH Germany, part of Springer Nature.
References
Snyder LR, Kirkland JJ, Dolan JW. Introduction to modern liquid chromatography, 2nd ed. New York: John Wiley & Sons, Inc.; 2009.
doi: 10.1002/9780470508183
Nikitas P, Pappa-Louisi A. Retention models for isocratic and gradient elution in reversed-phase liquid chromatography. Journal of chromatography. A 2009;1216(10):1737–1755. https://doi.org/10.1016/j.chroma.2008.09.051
doi: 10.1016/j.chroma.2008.09.051
pubmed: 18838140
Neue UD. Nonlinear Retention Relationships in Reversed-Phase Chromatography. Chromatographia 2006;63(S13):S45–S53. https://doi.org/10.1365/s10337-006-0718-9 , http://www.springerlink.com/index/10.1365/s10337-006-0718-9 .
doi: 10.1365/s10337-006-0718-9
Gelman A. Multilevel (Hierarchical) Modeling: What It Can and Cannot Do. Technometrics 2006; 48(3):432–435. https://doi.org/10.1198/004017005000000661 .
doi: 10.1198/004017005000000661
Hox J. Multilevel analysis: Techniques and applications, 2nd ed. New York: Routledge; 2010.
doi: 10.4324/9780203852279
Stangl DK. Prediction and decision making using Bayesian hierarchical models. Stat Med 1995; 14(20):2173–2190.
doi: 10.1002/sim.4780142002
pubmed: 8552895
Wiczling P. Analyzing chromatographic data using multilevel modeling. Anal Bioanal Chem 2018; 410(16):3905–3915. https://doi.org/10.1007/s00216-018-1061-3 .
doi: 10.1007/s00216-018-1061-3
pubmed: 29679115
Haddad PR, Taraji M, Szücs R. Prediction of Analyte Retention Time in Liquid Chromatography. Anal Chem 2021;93(1):228–256. https://doi.org/10.1021/acs.analchem.0c04190 .
doi: 10.1021/acs.analchem.0c04190
pubmed: 33085452
Bouwmeester R, Gabriels R, Hulstaert N, Martens L, Degroeve S. DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nat Methods 2021;18(11):1363–1369. https://doi.org/10.1038/s41592-021-01301-5 .
doi: 10.1038/s41592-021-01301-5
pubmed: 34711972
Giese S H, Sinn L R, Wegner F, Rappsilber J. Retention time prediction using neural networks increases identifications in crosslinking mass spectrometry. Nat Commun 2021;12(1):3237. https://doi.org/10.1038/s41467-021-23441-0 .
doi: 10.1038/s41467-021-23441-0
pubmed: 34050149
pmcid: 8163845
McElreath R. 2016. Statistical rethinking: a Bayesian course with examples in R and Stan.
Gelman A, Simpson D, Betancourt M. The prior can often only be understood in the context of the likelihood. Entropy 2017;19(10):555. https://doi.org/10.3390/e19100555 .
doi: 10.3390/e19100555
Boswell PG, Schellenberg JR, Carr PW, Cohen JD, Hegeman AD. Easy and accurate high-performance liquid chromatography retention prediction with different gradients, flow rates, and instruments by back-calculation of gradient and flow rate profiles. J Chromatogr A 2011;1218(38):6742–6749. https://doi.org/10.1016/J.CHROMA.2011.07.070 , https://www.sciencedirect.com/science/article/abs/pii/S0021967311011095?via%3Dihub .
doi: 10.1016/j.chroma.2011.07.070
pubmed: 21840007
Boswell PG, Schellenberg JR, Carr PW, Cohen JD, Hegeman AD. A study on retention ‘projection’ as a supplementary means for compound identification by liquid chromatography–mass spectrometry capable of predicting retention with different gradients, flow rates, and instruments. J Chromatogr A 2011;1218(38):6732–6741. https://doi.org/10.1016/J.CHROMA.2011.07.105 , https://www.sciencedirect.com/science/article/abs/pii/S0021967311011447?via%3Dihub .
doi: 10.1016/j.chroma.2011.07.105
pubmed: 21862024
Kruschke JK. Doing Bayesian data analysis: a tutorial with R, JAGS, and Stan, 2nd ed. London: Academic Press; 2014.
Hoffman MD, Gelman A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J Mach Learn Res 2014;15(1):1593–1623.
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A. Stan: A probabilistic programming language. Journal of Statistical Software, Articles 2017;76(1):1–32. https://doi.org/10.18637/jss.v076.i01 .
Stan Development Team. 2021. RStan: the R interface to Stan. https://mc-stan.org/ , R package version 2.21.3.
Margossian C, Gillespie B. 2017. Differential equations based models in Stan. https://mc-stan.org/events/stancon2017-notebooks/stancon2017-margossian-gillespie-ode.html .
Kubik L, Kaliszan R, Wiczling P. Analysis of Isocratic-Chromatographic-Retention Data using Bayesian Multilevel Modeling. Anal Chem 2018;90(22):13670–13679. https://doi.org/10.1021/acs.analchem.8b04033 .
doi: 10.1021/acs.analchem.8b04033
pubmed: 30335375
Neue UD, Phoebe CH, Tran K, Cheng Y-F, Lu Z. Dependence of reversed-phase retention of ionizable analytes on pH, concentration of organic solvent and silanol activity. J Chromatogr A 2001; 925(1):49–67. https://doi.org/10.1016/S0021-9673(01)01009-3 .
doi: 10.1016/S0021-9673(01)01009-3
pubmed: 11519817
Pappa-Louisi A, Nikitas P, Balkatzopoulou P, Malliakas C. Two- and three-parameter equations for representation of retention data in reversed-phase liquid chromatography. J Chromatogr A 2004; 1033(1):29–41. https://doi.org/10.1016/J.CHROMA.2004.01.021 .
doi: 10.1016/j.chroma.2004.01.021
pubmed: 15072288
Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for Bayesian models. Stat Comput 2014; 24 (6): 997–1016. https://doi.org/10.1007/s11222-013-9416-2 , http://link.springer.com/10.1007/s11222-013-9416-2 .
doi: 10.1007/s11222-013-9416-2
Vehtari A, Gelman A, Gabry J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat Comput 2017;27:1413–1432.
doi: 10.1007/s11222-016-9696-4
Hanai T. Structure-retention correlation in liquid chromatography. J Chromatogr A 1991;550:313–324. https://doi.org/10.1016/S0021-9673(01)88547-2 , http://www.sciencedirect.com/science/article/pii/S0021967301885472 .
doi: 10.1016/S0021-9673(01)88547-2
Gritti F, Guiochon G. Adsorption Mechanism in RPLC. Effect of the Nature of the Organic Modifier. Anal Chem 2005;77(13):4257–4272. https://doi.org/10.1021/ac0580058 .
doi: 10.1021/ac0580058
pubmed: 15987135
Royal Society of Chemistry. 2021. CSID:2015292. https://www.chemspider.com/Chemical-Structure.2015292.html .
Wiczling P, Kamedulska A, Kubik L. Application of Bayesian Multilevel Modeling in the Quantitative Structure-Retention Relationship Studies of Heterogeneous Compounds. Anal Chem 2021;93(18):6961–6971. https://doi.org/10.1021/acs.analchem.0c05227 .
doi: 10.1021/acs.analchem.0c05227
pubmed: 33905658