Prediction intervals with random forests.
Random forest
out-of-bag calibration
prediction interval
splitting rule
Journal
Statistical methods in medical research
ISSN: 1477-0334
Abbreviated title: Stat Methods Med Res
Country: England
NLM ID: 9212457
Publication information
Publication date:
01 2020
History:
pubmed: 23 2 2019
medline: 2 3 2021
entrez: 22 2 2019
Status:
ppublish
Abstract
The classical and most commonly used approach to building prediction intervals is the parametric approach. However, its main drawback is that its validity and performance depend heavily on the assumed functional link between the covariates and the response. This research investigates new methods that improve the performance of prediction intervals with random forests. Two aspects are explored: the method used to build the forest and the method used to build the prediction interval. Four methods to build the forest are investigated: three from the classification and regression tree (CART) paradigm, plus the transformation forest method. For CART forests, in addition to the default least-squares splitting rule, two alternative splitting criteria are investigated. We also present and evaluate the performance of five flexible methods for constructing prediction intervals. Combining the four forest-building methods with the five interval methods yields 20 distinct method variations. To reliably attain the desired confidence level, we include a calibration procedure performed on the out-of-bag information provided by the forest. The 20 method variations are thoroughly investigated and compared to five alternative methods through simulation studies and in real data settings. The results show that the proposed methods are very competitive. They outperform commonly used methods both in simulation settings and with real data.
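The abstract does not spell out the calibration procedure, so the following is only a minimal sketch of the general idea of out-of-bag (OOB) calibration, not the authors' method. It assumes each training observation comes with a hypothetical sample of OOB predictive values (here simulated, and deliberately too narrow), builds central quantile intervals from that sample, and searches a grid for the smallest nominal level whose OOB coverage reaches the target. The function names, the grid search, and the simulated data are all illustrative assumptions.

```python
import random


def quantile_interval(sample, level):
    """Central empirical interval of `sample` at nominal coverage `level`."""
    s = sorted(sample)
    n = len(s)
    # Number of order statistics trimmed from each tail.
    lo_idx = round((1.0 - level) / 2.0 * (n - 1))
    hi_idx = (n - 1) - lo_idx
    return s[lo_idx], s[hi_idx]


def oob_coverage(oob_samples, y, level):
    """Fraction of observed responses that fall inside their OOB interval."""
    hits = 0
    for sample, yi in zip(oob_samples, y):
        lo, hi = quantile_interval(sample, level)
        hits += lo <= yi <= hi
    return hits / len(y)


def calibrate_level(oob_samples, y, target=0.95, grid_size=200):
    """Smallest nominal level on a grid over [0.5, 1] with OOB coverage >= target.

    Falls back to 1.0 (the full sample range) if no grid value reaches the target.
    """
    for k in range(grid_size + 1):
        level = 0.5 + 0.5 * k / grid_size
        if oob_coverage(oob_samples, y, level) >= target:
            return level
    return 1.0


# Simulated stand-in data (an assumption, not from the paper): true responses
# have standard deviation 1.0, while the hypothetical OOB predictive samples
# have standard deviation 0.8, so uncalibrated intervals under-cover.
random.seed(0)
n_obs, n_draws, target = 300, 100, 0.90
y = [random.gauss(0.0, 1.0) for _ in range(n_obs)]
oob_samples = [
    [random.gauss(0.0, 0.8) for _ in range(n_draws)] for _ in range(n_obs)
]

level = calibrate_level(oob_samples, y, target)
print("calibrated nominal level:", level)
print("OOB coverage at the target level:", oob_coverage(oob_samples, y, target))
print("OOB coverage at the calibrated level:", oob_coverage(oob_samples, y, level))
```

In this sketch the calibrated nominal level would then be used when building intervals for new observations; with a real forest, the simulated samples would be replaced by whatever OOB predictive information the chosen interval method provides.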
Identifiers
pubmed: 30786820
doi: 10.1177/0962280219829885
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM