Magnetic control of tokamak plasmas through deep reinforcement learning.


Journal

Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462

Publication information

Publication date:
02 2022
History:
received: 14 07 2021
accepted: 01 12 2021
entrez: 17 2 2022
pubmed: 18 2 2022
medline: 16 4 2022
Status: ppublish

Abstract

Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a previously undescribed architecture for tokamak magnetic controller design that autonomously learns to command the full set of control coils. This architecture meets control objectives specified at a high level, at the same time satisfying physical and operational constraints. This approach has unprecedented flexibility and generality in problem specification and yields a notable reduction in design effort to produce new plasma configurations. We successfully produce and control a diverse set of plasma configurations on the Tokamak à Configuration Variable

Identifiers

pubmed: 35173339
doi: 10.1038/s41586-021-04301-9
pii: 10.1038/s41586-021-04301-9
pmc: PMC8850200

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

414-419

Copyright information

© 2022. The Author(s).

Authors

Federico Felici (F)

Swiss Plasma Center - EPFL, Lausanne, Switzerland. federico.felici@epfl.ch.

Jonas Buchli (J)

DeepMind, London, UK. buchli@deepmind.com.

Brendan Tracey (B)

DeepMind, London, UK. btracey@deepmind.com.

Francesco Carpanese (F)

DeepMind, London, UK.
Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Timo Ewalds (T)

DeepMind, London, UK.

Abbas Abdolmaleki (A)

DeepMind, London, UK.

Diego de Las Casas (D)

DeepMind, London, UK.

Craig Donner (C)

DeepMind, London, UK.

Leslie Fritz (L)

DeepMind, London, UK.

Cristian Galperti (C)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

James Keeling (J)

DeepMind, London, UK.

Maria Tsimpoukelli (M)

DeepMind, London, UK.

Jackie Kay (J)

DeepMind, London, UK.

Antoine Merle (A)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Jean-Marc Moret (JM)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Seb Noury (S)

DeepMind, London, UK.

Federico Pesamosca (F)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

David Pfau (D)

DeepMind, London, UK.

Olivier Sauter (O)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Cristian Sommariva (C)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Stefano Coda (S)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Basil Duval (B)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Ambrogio Fasoli (A)

Swiss Plasma Center - EPFL, Lausanne, Switzerland.

Pushmeet Kohli (P)

DeepMind, London, UK.

Koray Kavukcuoglu (K)

DeepMind, London, UK.

MeSH classifications