Mixture density networks for the indirect estimation of reference intervals.

Child; Hemoglobins / analysis; Humans; Prospective Studies; Reference Values

Distributional regression; Latent class regression; Mixture density networks; Reference intervals

Journal

BMC Bioinformatics

ISSN: 1471-2105

Abbreviated title: BMC Bioinformatics

Country: England

NLM ID: 100965194

Publication information

Publication date:
29 Jul 2022

History:

received: 14 Jan 2022

accepted: 15 Jul 2022

entrez: 29 Jul 2022

pubmed: 30 Jul 2022

medline: 3 Aug 2022

Status: epublish

Abstract

BACKGROUND

Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of "healthy" samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDNs) to overcome this problem and model all parameters of the mixture distribution in a single step.

RESULTS

Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application to real-world hemoglobin samples provides results in line with current gold standard approaches, but also suggests further investigations with respect to adequate regularization strategies in order to prevent overfitting the data.

CONCLUSIONS

Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.
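The idea summarized in the abstract, a mixture whose component weights and parameters all depend on age, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two Gaussian components (healthy and pathologic), and `mixture_params` is a hypothetical stand-in for the output heads of a trained network, filled with made-up illustrative numbers. In an actual MDN, those outputs would come from a neural network trained by minimizing the negative log-likelihood shown here.

```python
import math
from statistics import NormalDist


def softmax(logits):
    """Map unconstrained scores to mixture weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_params(age):
    """Hypothetical stand-in for a trained MDN evaluated at one age (years).

    Returns (weights, means, sds) for [healthy, pathologic] components.
    All numbers are illustrative assumptions, not estimates from the paper.
    """
    weights = softmax([2.0 - 0.05 * age, 0.0])      # weights vary with age
    means = [11.5 + 0.15 * age, 9.0 + 0.10 * age]   # e.g. hemoglobin, g/dL
    sds = [0.8, 1.5]
    return weights, means, sds


def negative_log_likelihood(ages, values):
    """Mean negative log-likelihood of unlabeled samples under the mixture."""
    nll = 0.0
    for age, y in zip(ages, values):
        weights, means, sds = mixture_params(age)
        density = sum(
            w * NormalDist(mu, sd).pdf(y)
            for w, mu, sd in zip(weights, means, sds)
        )
        nll -= math.log(density)
    return nll / len(ages)


def reference_interval(age, coverage=0.95):
    """Central interval of the healthy component only, at a given age."""
    _, means, sds = mixture_params(age)
    healthy = NormalDist(means[0], sds[0])
    tail = (1.0 - coverage) / 2.0
    return healthy.inv_cdf(tail), healthy.inv_cdf(1.0 - tail)
```

Calling `reference_interval(age)` for a grid of ages traces an age-dependent reference band. Because the weights are themselves functions of age, a changing share of pathologic samples across age groups does not have to be absorbed by the healthy component's mean and spread.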


Identifiers

DOI: 10.1186/s12859-022-04846-0

PMID: 35906555

PMC: PMC9336034

pii: 10.1186/s12859-022-04846-0

Chemical substances

Hemoglobins 0

Publication types

Journal Article

Languages

eng

Pagination

307


Authors

Tobias Hepp

Jakob Zierk

Manfred Rauh

Markus Metzler

Sarem Seitz
