Diversity-driven knowledge distillation for financial trading using Deep Reinforcement Learning.


Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
Aug 2021
History:
received: 10 07 2020
revised: 08 12 2020
accepted: 22 02 2021
pubmed: 29 3 2021
medline: 29 6 2021
entrez: 28 3 2021
Status: ppublish

Abstract

Deep Reinforcement Learning (DRL) is increasingly used to develop financial trading agents for a wide range of tasks. However, optimizing DRL agents is notoriously difficult and unstable, especially in noisy financial environments, which significantly hinders the performance of trading agents. In this work, we present a novel method that improves the training reliability of DRL trading agents, building upon the well-known approach of neural network distillation. In the proposed approach, teacher agents are trained on different subsets of the RL environment, thus diversifying the policies they learn. Student agents are then trained using distillation from the trained teachers to guide the training process, allowing better exploration of the solution space while "mimicking" an existing policy/trading strategy provided by the teacher model. The boost in effectiveness of the proposed method comes from the use of diversified ensembles of teachers trained to perform trading for different currencies. This enables us to transfer the common view regarding the most profitable policy to the student, further improving training stability in noisy financial environments. In the conducted experiments, we find that, when applying distillation, constraining the teacher models to be diversified can significantly improve the performance of the final student agents. We demonstrate this through an extensive evaluation on various financial trading tasks. Furthermore, we provide additional experiments in the separate domain of control in games, using the Procgen environments, to demonstrate the generality of the proposed method.
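
The abstract does not give the loss function, so the following is only a minimal sketch of one common way to distill from an ensemble of teachers, not the authors' implementation. It assumes a discrete action space (three hypothetical actions such as buy, hold, sell), averages the temperature-softened action distributions of the teachers into a "common view", and measures how far the student's policy is from that average with a KL divergence. The function names, the temperature value, and all logits are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of action logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_policy(teacher_logits, temperature=1.0):
    """Average the softened action distributions of several teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_actions = len(probs[0])
    return [sum(p[a] for p in probs) / len(probs) for a in range(n_actions)]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher ensemble || student) for a single state."""
    target = ensemble_policy(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)


# Hypothetical logits for one market state. Each teacher is assumed to have
# been trained on a different currency pair (the values are made up).
teachers = [[2.0, 0.5, -1.0], [1.5, 0.0, -0.5], [1.0, 1.0, -2.0]]
agreeing_student = [1.5, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]

loss_agree = distillation_loss(agreeing_student, teachers, temperature=2.0)
loss_disagree = distillation_loss(disagreeing_student, teachers, temperature=2.0)
print(loss_agree < loss_disagree)
```

In a full training loop, a term like this would typically be added to the student's usual RL objective with a weighting coefficient; how the paper combines the two is not stated in the abstract.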

Identifiers

pubmed: 33774425
pii: S0893-6080(21)00076-9
doi: 10.1016/j.neunet.2021.02.026

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

193-202

Copyright information

Copyright © 2021 Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Avraam Tsantekidis (A)

School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece. Electronic address: avraamt@csd.auth.gr.

Nikolaos Passalis (N)

School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece. Electronic address: passalis@csd.auth.gr.

Anastasios Tefas (A)

School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece. Electronic address: tefas@csd.auth.gr.
