Collective foraging of active particles trained by reinforcement learning.


Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
10 Oct 2023
History:
received: 28 06 2023
accepted: 05 10 2023
medline: 11 10 2023
pubmed: 11 10 2023
entrez: 10 10 2023
Status: epublish

Abstract

Collective self-organization of animal groups is a recurring phenomenon in nature that has attracted considerable attention in the natural and social sciences. To understand how collective motion can be achieved without external control, social interactions have been considered which regulate the motion and orientation of neighbors relative to each other. Here, we want to understand the motivation and possible reasons behind the emergence of such interaction rules using an experimental model system of light-responsive active colloidal particles (APs). Via reinforcement learning (RL), the motion of particles is optimized regarding their foraging behavior in the presence of randomly appearing food sources. Although RL maximizes the rewards of single APs, we observe the emergence of collective behaviors within the particle group. The advantage of such a collective strategy in the context of foraging is that it compensates for a lack of local information, which strongly increases the robustness of the resulting policy. Our results demonstrate that collective behavior may not only result from the optimization of behaviors at the group level but may also arise from maximizing the benefit of individuals. Apart from a better understanding of collective behaviors in natural systems, these results may also be useful for the design of autonomous robotic systems.

Identifiers

pubmed: 37816879
doi: 10.1038/s41598-023-44268-3
pii: 10.1038/s41598-023-44268-3
pmc: PMC10564893

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

17055

Grants

Agency: European Research Council
ID: 693683
Country: International

Copyright information

© 2023. Springer Nature Limited.

Authors

Robert C Löffler (RC)

Fachbereich Physik, Universität Konstanz, 78464, Konstanz, Germany.

Emanuele Panizon (E)

The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151, Trieste, Italy.

Clemens Bechinger (C)

Fachbereich Physik, Universität Konstanz, 78464, Konstanz, Germany. clemens.bechinger@uni-konstanz.de.
Centre for the Advanced Study of Collective Behaviour, Universität Konstanz, 78464, Konstanz, Germany. clemens.bechinger@uni-konstanz.de.

MeSH classifications