HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2023
History:
received: 2023-03-21
accepted: 2023-08-04
medline: 2023-09-04
pubmed: 2023-08-31
entrez: 2023-08-31
Status: epublish

Abstract

Surrogate models are frequently used to replace costly engineering simulations. A single surrogate is usually chosen either from previous experience or by fitting multiple surrogates and selecting one based on mean cross-validation errors. This paper presents a novel stacking strategy that results from reinterpreting the model selection process in terms of the generalization error. For the first time, the problem is translated into a well-studied financial problem: portfolio management and optimization. In short, it is demonstrated that the individual residuals calculated by leave-one-out procedures are samples from a random variable ε_i whose second non-central moment is the i-th model's generalization error. A stacking methodology based solely on the behavior of linear combinations of the random variables ε_i is therefore proposed. First, several surrogate models are calibrated. The Directed Bubble Hierarchical Tree (DBHT) clustering algorithm is then used to determine which models are worth stacking. The stacking weights can be calculated using any financial approach to the portfolio optimization problem. This alternative understanding of the problem lets practitioners use established financial methodologies to calculate the models' weights, significantly improving the out-of-sample performance of the ensemble. A case study is carried out to demonstrate the applicability of the new methodology. In total, 124 models were trained on a specific dataset: 40 Machine Learning models and 84 Polynomial Chaos Expansion models (combining 3 types of base random variables and 7 least-squares algorithms for fitting the coefficients of expansions up to fourth order). Of these, 99 could be fitted without convergence or other numerical issues. The DBHT algorithm with Pearson correlation distance and generalization error similarity selected a subgroup of 23 models from the 99 fitted ones, a reduction of about 77% in the total number of models and a good filtering scheme that still preserves diversity. Finally, it is demonstrated that the weights obtained by building a Hierarchical Risk Parity (HRP) portfolio perform better for various input random variables, indicating better out-of-sample performance. In this way, a finance-based stacking strategy has demonstrated its worth in improving the out-of-sample capabilities of stacked models, which illustrates how the new understanding of model stacking methodologies may be useful.
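The portfolio reading of stacking described in the abstract can be illustrated with a minimal sketch. It assumes polynomial least-squares fits as stand-in surrogates and synthetic data, and all function names are hypothetical. The weights come from the closed-form minimizer of the ensemble's second moment under a sum-to-one constraint (the analogue of a minimum-variance portfolio); this is a simpler swapped-in technique, not the DBHT selection or HRP weighting used in the paper. When the weights sum to one, the stacked residual is the weighted sum of the individual leave-one-out residuals, so the estimated ensemble error is the quadratic form w^T M w.

```python
import numpy as np


def loo_residuals(design, y):
    """Leave-one-out residuals of a least-squares fit, by explicit refitting."""
    n = y.shape[0]
    residuals = np.empty(n)
    for k in range(n):
        keep = np.arange(n) != k  # drop the k-th sample
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        residuals[k] = y[k] - design[k] @ coef
    return residuals


def second_moment_matrix(E):
    """M[i, j] = mean(e_i * e_j); the diagonal estimates each model's error."""
    return E.T @ E / E.shape[0]


def ensemble_error(w, M):
    """Estimated generalization error of the stacked model with weights w."""
    return float(w @ M @ w)


def min_error_weights(M):
    """Minimize w^T M w subject to sum(w) = 1 (weights may be negative)."""
    z = np.linalg.solve(M, np.ones(M.shape[0]))
    return z / z.sum()


# Synthetic example (assumed data, not the paper's dataset).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=40)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(40)

# Three stand-in surrogates: polynomial fits of degree 1, 3 and 5.
degrees = (1, 3, 5)
E = np.column_stack([loo_residuals(np.vander(x, d + 1), y) for d in degrees])

M = second_moment_matrix(E)
w = min_error_weights(M)

print("individual LOO errors:", M.diagonal())
print("stacking weights:     ", w)
print("ensemble LOO error:   ", ensemble_error(w, M))
```

Because each single-model choice is itself a feasible weight vector, the minimized ensemble error cannot exceed the best individual leave-one-out error on the same samples. Whether that advantage carries over out of sample is exactly where more robust portfolio constructions, such as the HRP approach named in the abstract, come in.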

Identifiers

pubmed: 37651433
doi: 10.1371/journal.pone.0290331
pii: PONE-D-23-08481
pmc: PMC10470931

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e0290331

Copyright information

Copyright: © 2023 Ozelim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Citation

PLoS One. 2023 Aug 31;18(8):e0290331

Authors

Luan Carlos de Sena Monteiro Ozelim (LCSM)

Aeronautics Institute of Technology (ITA), São José dos Campos, São Paulo, Brazil.

Dimas Betioli Ribeiro (DB)

Aeronautics Institute of Technology (ITA), São José dos Campos, São Paulo, Brazil.

José Antonio Schiavon (JA)

Aeronautics Institute of Technology (ITA), São José dos Campos, São Paulo, Brazil.

Vinicius Resende Domingues (VR)

Aeronautics Institute of Technology (ITA), São José dos Campos, São Paulo, Brazil.

Paulo Ivo Braga de Queiroz (PIB)

Aeronautics Institute of Technology (ITA), São José dos Campos, São Paulo, Brazil.
