A review on longitudinal data analysis with random forest.

clustered data; longitudinal data; machine learning; multivariate response; repeated measurements

Journal

Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837

Publication information

Publication date:
19 03 2023
History:
received: 31 08 2022
revised: 12 12 2022
accepted: 31 12 2022
pubmed: 19 1 2023
medline: 22 3 2023
entrez: 18 1 2023
Status: ppublish

Abstract

In longitudinal studies, variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistical methods, especially in the context of high-dimensional data. In this paper, we review extensions of the standard RF method for the purpose of longitudinal data analysis. Extension methods are categorized according to the data structures for which they are designed. We consider both univariate and multivariate response longitudinal data and further categorize the repeated measurements according to whether the time effect is relevant. Even though most extensions are proposed for low-dimensional data, some can be applied to high-dimensional data. Information on available software implementations of the reviewed extensions is also given. We conclude with a discussion of the limitations of our review and some future research directions.

Identifiers

pubmed: 36653905
pii: 6991123
doi: 10.1093/bib/bbad002
pmc: PMC10025446

Publication types

Review; Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2023. Published by Oxford University Press.

Authors

Jianchang Hu (J)

Institute of Medical Biometry and Statistics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany.

Silke Szymczak (S)

Institute of Medical Biometry and Statistics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany.
