Relative luminance and binocular disparity preferences are correlated in macaque primary visual cortex, matching natural scene statistics
Samonds JM, Potetz BR, Lee TS | PNAS | 2012 | 10.1073/pnas.1200125109 | PDF
Humans excel at inferring information about 3D scenes from the 2D images projected onto their retinas, using a wide range of depth cues. One example of such inference is the tendency of observers to perceive lighter image regions as closer. This psychophysical behavior may have an ecological basis, because nearer regions tend to be lighter in natural 3D scenes. Here, we show that an analogous association exists between the relative luminance and binocular disparity preferences of neurons in macaque primary visual cortex. The joint coding of relative luminance and binocular disparity at the neuronal population level may be an integral part of the neural mechanisms for perceptual inference of depth from images.
The activity of a neuron, even in early sensory areas, is not simply a function of its local receptive field or tuning properties; it also depends on the global context of the stimulus as well as the neural context. That is, the activity of surrounding neurons and global brain states can exert considerable influence on a neuron's response. In this paper we implemented an L1-regularized point process model to assess the contributions of multiple factors to the firing rates of many individual units recorded simultaneously from V1 with a 96-electrode "Utah" array. We found that the spikes of surrounding neurons indeed provide strong predictions of a neuron's response, in addition to the neuron's receptive field transfer function. We also found that the same spikes could be accounted for by the local field potentials, a surrogate measure of global network states. This work shows that accounting for network fluctuations can improve estimates of single-trial firing rates and stimulus-response transfer functions.
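The kind of model described in this abstract can be sketched as an L1-regularized Poisson GLM: a neuron's spike count is predicted from stimulus features together with the spikes of other recorded units, with an L1 penalty selecting a sparse set of predictors. The sketch below is a toy illustration under those assumptions (all names, dimensions, and data are invented, and the fit uses plain proximal gradient descent, not necessarily the paper's optimizer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix: stimulus features plus the binned spikes of other
# simultaneously recorded units (here just Gaussian stand-ins).
T, n_stim, n_coupled = 2000, 8, 12
X = rng.normal(size=(T, n_stim + n_coupled))
w_true = np.zeros(n_stim + n_coupled)
w_true[[0, 3, n_stim + 2]] = [0.8, -0.5, 0.6]      # sparse ground truth
y = rng.poisson(np.exp(X @ w_true - 1.0))          # simulated spike counts

def fit_l1_poisson_glm(X, y, lam=50.0, lr=2e-5, n_iter=5000):
    """Minimize sum_t [exp(x_t.w) - y_t * (x_t.w)] + lam * ||w||_1
    by proximal gradient descent (ISTA)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (np.exp(X @ w) - y)           # Poisson NLL gradient
        w = w - lr * grad
        # Soft-thresholding step induces exact zeros (sparsity).
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

w_hat = fit_l1_poisson_glm(X, y)
```

The L1 penalty drives the weights of uninformative predictors to exactly zero, which is what makes this family of models useful for identifying which neighboring units (or stimulus features) actually contribute to a neuron's firing.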
The inference of depth information from single images is typically performed by devising models of image formation based on the physics of light interaction and then inverting these models to solve for depth. Once inverted, these models are highly underconstrained, requiring many assumptions such as Lambertian surface reflectance, smoothness of surfaces, uniform albedo, or lack of cast shadows. Little is known about the relative merits of these assumptions in real scenes. A statistical understanding of the joint distribution of real images and their underlying 3D structure would allow us to replace these assumptions and simplifications with probabilistic priors based on real scenes. Furthermore, statistical studies may uncover entirely new sources of information that are not obvious from physical models. Real scenes are affected by many regularities in the environment, such as the natural geometry of objects, the arrangements of objects in space, natural distributions of light, and regularities in the position of the observer. Few current computer vision algorithms for 3D shape inference make use of these trends. Despite the potential usefulness of statistical models and the growing success of statistical methods in vision, few studies have investigated the statistical relationship between images and range (depth) images. Those studies that have examined this relationship in nature have uncovered meaningful and exploitable statistical trends in real scenes that may be useful for designing new algorithms for surface inference, and also for understanding how humans perceive depth in real scenes [32, 18, 46]. In this chapter, we will highlight some results we have obtained in our study of the statistical relationships between 3D scene structures and 2D images, and discuss their implications for understanding human 3D surface perception and its underlying computational principles.
Relating functional connectivity in V1 neural circuits and 3D natural scenes using Boltzmann machines
Yimeng Zhang, Xiong Li, Jason M. Samonds, Tai Sing Lee | Vision Research | 2015 | 10.1016/j.visres.2015.12.002 | PDF
Bayesian theory has provided a compelling conceptualization for perceptual inference in the brain. Central to Bayesian inference is the notion of statistical priors. To understand the neural mechanisms of Bayesian inference, we need to understand the neural representation of statistical regularities in the natural environment. In this paper, we investigated empirically how statistical regularities in natural 3D scenes are represented in the functional connectivity of disparity-tuned neurons in the primary visual cortex of primates. We applied a Boltzmann machine model to learn from 3D natural scenes, and found that the units in the model exhibited cooperative and competitive interactions, forming a “disparity association field”, analogous to the contour association field. The cooperative and competitive interactions in the disparity association field are consistent with constraints of computational models for stereo matching. In addition, we simulated neurophysiological experiments on the model, and found the results to be consistent with neurophysiological data in terms of the functional connectivity measurements between disparity-tuned neurons in the macaque primary visual cortex. These findings demonstrate that there is a relationship between the functional connectivity observed in the visual cortex and the statistics of natural scenes. They also suggest that the Boltzmann machine can be a viable model for conceptualizing computations in the visual cortex and, as such, can be used to predict neural circuits in the visual cortex from natural scene statistics.
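The cooperative and competitive interactions described here can be illustrated with a toy binary Boltzmann machine whose couplings are specified by hand (this is only a sketch of the model class, not the couplings learned in the paper): units preferring the same disparity in neighboring "hypercolumns" cooperate, implementing a smoothness constraint, while units within a hypercolumn compete, implementing a uniqueness constraint.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny grid: n_cols hypercolumns, each with n_disp disparity-tuned units.
n_cols, n_disp = 5, 4
N = n_cols * n_disp
idx = lambda c, d: c * n_disp + d   # flat index of unit (column c, disparity d)

W = np.zeros((N, N))
for c in range(n_cols):
    for d in range(n_disp):
        for d2 in range(n_disp):
            if d2 != d:
                W[idx(c, d), idx(c, d2)] = -2.0    # competition within a column
        if c + 1 < n_cols:
            W[idx(c, d), idx(c + 1, d)] = 1.0      # cooperation across columns
            W[idx(c + 1, d), idx(c, d)] = 1.0      # (keep W symmetric)

def gibbs_sample(W, b, n_steps=5000):
    """Single-site Gibbs sampling for a binary Boltzmann machine."""
    s = rng.integers(0, 2, size=N)
    for _ in range(n_steps):
        i = rng.integers(N)
        p_on = 1.0 / (1.0 + np.exp(-(W[i] @ s + b[i])))  # P(s_i = 1 | rest)
        s[i] = int(rng.random() < p_on)
    return s

b = np.full(N, -0.5)       # mild bias toward silence
s = gibbs_sample(W, b)
```

With couplings of this sign structure, samples tend to settle into states with roughly one active disparity unit per column whose preferred disparity agrees across neighboring columns, which is the "disparity association field" behavior the abstract describes.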
The Bayesian paradigm has provided a useful conceptual theory for understanding perceptual computation in the brain. While the detailed neural mechanisms of Bayesian inference are not fully understood, recent computational and neurophysiological work has illuminated the underlying computational principles and representational architecture. The fundamental insights are that the visual system is organized as a modular hierarchy to encode an internal model of the world, and that perception is realized by statistical inference based on such an internal model. In this paper, we will discuss and analyze the variety of representational schemes for these internal models and how they might be used to perform learning and inference. We will argue for a unified theoretical framework relating the internal models to the observed neural phenomena and mechanisms in the visual cortex.
Predictive Encoding of Contextual Relationships for Perceptual Inference, Interpolation and Prediction
Mingmin Zhao, Chengxu Zhuang, Yizhou Wang, Tai Sing Lee | International Conference on Learning Representations (ICLR) 2015 | 2015 | arXiv
We propose a new neurally-inspired model that can learn to encode the global relationship context of visual events across time and space, and to use the contextual information to modulate the analysis-by-synthesis process in a predictive coding framework. The model learns latent contextual representations by maximizing the predictability of visual events based on local and global contextual information through both top-down and bottom-up processes. In contrast to standard predictive coding models, the prediction error in this model is used to update the contextual representation but does not alter the feedforward input for the next layer, and is thus more consistent with neurophysiological observations. We establish the computational feasibility of this model by demonstrating its capabilities in several respects. We show that our model can outperform gated Boltzmann machines (GBMs) in the estimation of contextual information. Our model can also interpolate missing events or predict future events in image sequences while simultaneously estimating contextual information. We show that it achieves state-of-the-art prediction accuracy in a variety of tasks and possesses the ability to interpolate missing frames, a function that is lacking in GBMs.
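The key architectural point in this abstract — prediction error updating the contextual representation while leaving the feedforward signal untouched — can be sketched in a few lines. Everything below is illustrative (random, untrained weight matrices with invented names), intended only to make the update rule concrete:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical dimensions and random (untrained) weights.
d_in, d_z = 16, 8
W_pred = rng.normal(scale=0.1, size=(d_in, d_z))   # context -> predicted input
W_ctx = rng.normal(scale=0.1, size=(d_z, d_in))    # error -> context update

def step(x_next, z, lr=0.1):
    pred = W_pred @ z                 # top-down prediction of the next input
    err = x_next - pred               # prediction error
    z_new = z + lr * (W_ctx @ err)    # error updates the context representation
    return z_new, x_next              # feedforward input is forwarded unaltered

z = np.zeros(d_z)
for _ in range(50):
    x = rng.normal(size=d_in)
    z, feedforward = step(x, z)
```

The contrast with standard predictive coding is visible in the return value: the signal passed to the next layer is `x_next` itself, not the residual `err`, which only flows into the context update.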
Relating functional connectivity in V1 neural circuits and 3D natural scenes using Boltzmann machines
Yimeng Zhang, Xiong Li, Jason M. Samonds, Ben Poole, Tai Sing Lee | Computational and Systems Neuroscience (Cosyne) 2015 | 2015 | PDF
Bayesian theory has provided a compelling conceptualization for perceptual inference in the brain. To understand the neural mechanisms of Bayesian inference, we need to understand the neural representation of statistical regularities in the natural environment. Here, we investigated empirically how the second-order statistical regularities in natural 3D scenes are represented in the functional connectivity of a population of disparity-tuned neurons in the primary visual cortex of primates. We applied the Boltzmann machine to learn from 3D natural scenes and found that the functional connectivity between nodes exhibited patterns of cooperative and competitive interactions that are consistent with the observed functional connectivity between disparity-tuned neurons in the macaque primary visual cortex. The positive interactions encode statistical priors about spatial correlations in depth and implement a smoothness constraint. The negative interactions within a hypercolumn and across hypercolumns emerge automatically to reflect the uniqueness constraint found in computational models of stereopsis. These findings demonstrate that there is a relationship between the functional connectivity observed in the visual cortex and the statistics of natural scenes. This relationship suggests that the functional connectivity between disparity-tuned neurons can be considered a disparity association field. They also suggest that the Boltzmann machine, or a Markov random field in general, can be a viable model for conceptualizing computations in the visual cortex and, as such, can be used to leverage natural scene statistics to understand neural circuits in the visual cortex.