Visual motion perception as online hierarchical inference.


Journal

Nature Communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
01 12 2022
History:
received: 21 10 2021
accepted: 07 11 2022
entrez: 1 12 2022
pubmed: 2 12 2022
medline: 6 12 2022
Status: epublish

Abstract

Identifying the structure of motion relations in the environment is critical for navigation, tracking, prediction, and pursuit. Yet, little is known about the mental and neural computations that allow the visual system to infer this structure online from a volatile stream of visual information. We propose online hierarchical Bayesian inference as a principled solution for how the brain might solve this complex perceptual task. We derive an online Expectation-Maximization algorithm that explains human percepts qualitatively and quantitatively for a diverse set of stimuli, covering classical psychophysics experiments, ambiguous motion scenes, and illusory motion displays. We thereby identify normative explanations for the origin of human motion structure perception and make testable predictions for future psychophysics experiments. The proposed online hierarchical inference model furthermore affords a neural network implementation which shares properties with motion-sensitive cortical areas and motivates targeted experiments to reveal the neural representations of latent structure.

Identifiers

pubmed: 36456546
doi: 10.1038/s41467-022-34805-5
pii: 10.1038/s41467-022-34805-5
pmc: PMC9715570

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

7403

Grants

Agency: NINDS NIH HHS
ID: U19 NS118246
Country: United States

Copyright information

© 2022. The Author(s).

Authors

Johannes Bill (J)

Department of Neurobiology, Harvard Medical School, Boston, MA, USA. johannes_bill@hms.harvard.edu.
Department of Psychology, Harvard University, Cambridge, MA, USA. johannes_bill@hms.harvard.edu.

Samuel J Gershman (SJ)

Department of Psychology, Harvard University, Cambridge, MA, USA.
Center for Brain Science, Harvard University, Cambridge, MA, USA.
Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA.

Jan Drugowitsch (J)

Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
Center for Brain Science, Harvard University, Cambridge, MA, USA.
