Pinpointing the neural signatures of single-exposure visual recognition memory
Mehrpour, Vahid; Meyer, Travis; Simoncelli, Eero P; Rust, Nicole C
Memories of the images that we have seen are thought to be reflected in the reduction of neural responses in high-level visual areas such as inferotemporal (IT) cortex, a phenomenon known as repetition suppression (RS). We challenged this hypothesis with a task that required rhesus monkeys to report whether images were novel or repeated while ignoring variations in contrast, a stimulus attribute that is also known to modulate the overall IT response. The monkeys' behavior was largely contrast invariant, contrary to the predictions of an RS-inspired decoder, which could not distinguish responses to images that are repeated from those that are of lower contrast. However, the monkeys' behavioral patterns were well predicted by a linearly decodable variant in which the total spike count was corrected for contrast modulation. These results suggest that the IT neural activity pattern that best aligns with single-exposure visual recognition memory behavior is not RS but rather sensory referenced suppression: reductions in IT population response magnitude, corrected for sensory modulation.
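A minimal numerical sketch of the decoding comparison described above, using entirely synthetic data (the population size, firing rates, and 20% repetition suppression are illustrative assumptions, not values from the study): an RS-inspired decoder thresholds the raw population spike count and so confuses repeated images with low-contrast novel ones, while a sensory-referenced variant first divides out the contrast modulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_trials = 50, 1000

# Synthetic IT responses: overall magnitude scales with contrast, and is
# suppressed by an assumed ~20% when the image is a repeat.
contrast = rng.uniform(0.3, 1.0, n_trials)
repeated = rng.integers(0, 2, n_trials).astype(bool)
gain = contrast * np.where(repeated, 0.8, 1.0)
counts = rng.poisson(10.0 * gain[:, None] * np.ones(n_neurons))
total = counts.sum(axis=1)

# RS-inspired decoder: a low total spike count means "repeated".
rs_pred = total < np.median(total)

# Sensory-referenced variant: correct for contrast before thresholding.
corrected = total / contrast
sr_pred = corrected < np.median(corrected)

print("RS accuracy: ", (rs_pred == repeated).mean())
print("SRS accuracy:", (sr_pred == repeated).mean())
```

In this simulation the contrast-corrected decoder substantially outperforms the raw-count decoder, mirroring the dissociation the abstract reports between the RS-inspired decoder and the monkeys' contrast-invariant behavior.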
Inference of nonlinear receptive field subunits with spike-triggered clustering
Shah, Nishal P; Brackbill, Nora; Rhoades, Colleen; Kling, Alexandra; Goetz, Georges; Litke, Alan M; Sher, Alexander; Simoncelli, Eero P; Chichilnisky, E J
Responses of sensory neurons are often modeled using a weighted combination of rectified linear subunits. Since these subunits often cannot be measured directly, a flexible method is needed to infer their properties from the responses of downstream neurons. We present a method for maximum likelihood estimation of subunits by soft-clustering spike-triggered stimuli, and demonstrate its effectiveness in visual neurons. For parasol retinal ganglion cells in macaque retina, estimated subunits partitioned the receptive field into compact regions, likely representing aggregated bipolar cell inputs. Joint clustering revealed shared subunits between neighboring cells, producing a parsimonious population model. Closed-loop validation, using stimuli lying in the null space of the linear receptive field, revealed stronger nonlinearities in OFF cells than ON cells. Responses to natural images, jittered to emulate fixational eye movements, were accurately predicted by the subunit model. Finally, the generality of the approach was demonstrated in macaque V1 neurons.
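The clustering idea can be sketched in a few lines. The toy below (synthetic filters and an ad hoc softmax responsibility rule standing in for the paper's maximum-likelihood updates) simulates a neuron driven by two rectified subunits, soft-clusters its spike-triggered stimuli, and checks that the cluster centers recover the subunit filters.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_stim = 20, 40000

# Two hypothetical ground-truth subunit filters (disjoint halves of the RF).
w_true = np.zeros((2, dim))
w_true[0, :10] = 1.0
w_true[1, 10:] = 1.0

# Simulate an LN-LN neuron: rate = sum of rectified subunit outputs.
x = rng.normal(size=(n_stim, dim))
rate = np.maximum(x @ w_true[0], 0) + np.maximum(x @ w_true[1], 0)
spikes = rng.poisson(0.1 * rate)
sts = np.repeat(x, spikes, axis=0)       # spike-triggered stimulus ensemble

# Soft clustering: assign each spike-triggered stimulus a soft weight per
# subunit, then update each subunit as a responsibility-weighted average.
w = rng.normal(size=(2, dim))
for _ in range(50):
    w /= np.linalg.norm(w, axis=1, keepdims=True)
    logits = 5.0 * (sts @ w.T)
    resp = np.exp(logits - logits.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    w = (resp.T @ sts) / resp.sum(axis=0)[:, None]

# Each estimated subunit should align with one ground-truth filter.
corr = np.abs(np.corrcoef(np.vstack([w, w_true]))[:2, 2:])
print(np.round(corr, 2))
```

Each cluster center converges toward the spike-triggered average of the stimuli that one subunit was responsible for, which is the intuition behind the estimator.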
Compound stimuli reveal the structure of visual motion selectivity in macaque MT neurons
Zaharia, Andrew D; Goris, Robbe L T; Movshon, J Anthony; Simoncelli, Eero P
Motion selectivity in primary visual cortex (V1) is approximately separable in orientation, spatial frequency, and temporal frequency ("frequency-separable"). Models for area MT neurons posit that their selectivity arises by combining direction-selective V1 afferents whose tuning is organized around a tilted plane in the frequency domain, specifying a particular direction and speed ("velocity-separable"). This construction explains "pattern direction selective" MT neurons, which are velocity-selective but relatively invariant to spatial structure, including spatial frequency, texture and shape. We designed a set of experiments to distinguish frequency- and velocity-separable models and executed them with single-unit recordings in macaque V1 and MT. Surprisingly, when tested with single drifting gratings, most MT neurons' responses are fit equally well by models with either form of separability. However, responses to plaids (sums of two moving gratings) tend to be better described as velocity-separable, especially for pattern neurons. We conclude that direction selectivity in MT is primarily computed by summing V1 afferents, but pattern-invariant velocity tuning for complex stimuli may arise from local, recurrent interactions.

Significance Statement How do sensory systems build representations of complex features from simpler ones? Visual motion representation in cortex is a well-studied example: the direction and speed of moving objects, regardless of shape or texture, are computed from the local motion of oriented edges. Here we quantify tuning properties based on single-unit recordings in primate area MT, then fit a novel, generalized model of motion computation. The model reveals that two core properties of MT neurons, speed tuning and invariance to local edge orientation, result from a single organizing principle: each MT neuron combines afferents that represent edge motions consistent with a common velocity, much as V1 simple cells combine thalamic inputs consistent with a common orientation.
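The "tilted plane" construction is easy to state numerically. In the sketch below (a toy velocity-separable tuning curve with made-up parameters, not the paper's fitted model), a neuron responds according to how close a grating's spatiotemporal frequency lies to the plane defined by its preferred velocity, which makes its response to gratings moving at that velocity independent of spatial frequency.

```python
import numpy as np

v_pref = np.array([8.0, 0.0])   # assumed preferred velocity (deg/s)

def response(orientation_deg, sf, tf):
    """Toy velocity-separable tuning: Gaussian falloff with distance from
    the frequency-domain plane tf + v_pref . w_xy = 0."""
    th = np.deg2rad(orientation_deg)
    w_xy = sf * np.array([np.cos(th), np.sin(th)])  # spatial frequency vector
    dist = tf + v_pref @ w_xy                       # distance from the plane
    return np.exp(-dist**2 / 2.0)

# Gratings drifting at the preferred velocity lie exactly on the plane,
# so the response is maximal regardless of spatial frequency.
for sf in [0.5, 1.0, 2.0]:
    print(sf, response(0.0, sf, -v_pref[0] * sf))   # 1.0 for every sf
```

A frequency-separable neuron, by contrast, would be tuned to a particular spatial and temporal frequency separately, and its response would fall off as spatial frequency moves away from its preferred value even when the velocity is held fixed.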
Contextual modulation of sensitivity to naturalistic image structure in macaque V2
Ziemba, Corey M; Freeman, Jeremy; Simoncelli, Eero P; Movshon, J Anthony
The stimulus selectivity of neurons in V1 is well known, as is the finding that their responses can be affected by visual input to areas outside of the classical receptive field. Less well understood are the ways selectivity is modified as signals propagate to visual areas beyond V1, such as V2. We recently proposed a role for V2 neurons in representing the higher order statistical dependencies found in images of naturally occurring visual texture. V2 neurons, but not V1 neurons, respond more vigorously to "naturalistic" images that contain these dependencies than to "noise" images that lack them. In this work, we examine the dependence of these effects on stimulus size. For most V2 neurons, the preference for naturalistic over noise stimuli was modest when stimuli were presented in small patches and gradually strengthened with increasing size, suggesting that the mechanisms responsible for this enhanced sensitivity operate over regions of the visual field that are larger than the classical receptive field. Indeed, we found that surround suppression was stronger for noise than for naturalistic stimuli and that the preference for large naturalistic stimuli developed over a delayed time course consistent with lateral or feedback connections. These findings are compatible with a spatially broad facilitatory mechanism that is absent in V1 and suggest that a distinct role for the receptive field surround emerges in V2 along with sensitivity to more complex image structure.

NEW & NOTEWORTHY The responses of neurons in visual cortex are often affected by visual input delivered to regions of the visual field outside of the conventionally defined receptive field, but the significance of such contextual modulation is not well understood outside of area V1. We studied the importance of regions beyond the receptive field in establishing a novel form of selectivity for the statistical dependencies contained in natural visual textures that first emerges in area V2.
Slow gain fluctuations limit benefits of temporal integration in visual cortex
Goris, Robbe L T; Ziemba, Corey M; Movshon, J Anthony; Simoncelli, Eero P
Sensory neurons represent stimulus information with sequences of action potentials that differ across repeated measurements. This variability limits the information that can be extracted from momentary observations of a neuron's response. It is often assumed that integrating responses over time mitigates this limitation. However, temporal response correlations can reduce the benefits of temporal integration. We examined responses of individual orientation-selective neurons in the primary visual cortex of two macaque monkeys performing an orientation-discrimination task. The signal-to-noise ratio of temporally integrated responses increased for durations up to a few hundred milliseconds but saturated for longer durations. This was true even when cells exhibited little or no adaptation in their response levels. These observations are well explained by a statistical response model in which spikes arise from a Poisson process whose stimulus-dependent rate is modulated by slow, stimulus-independent fluctuations in gain. The response variability arising from the Poisson process is reduced by temporal integration, but the slow modulatory nature of variability due to gain fluctuations is not. Slow gain fluctuations therefore impose a fundamental limit on the benefits of temporal integration.
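The variance argument in this abstract can be reproduced with a short simulation (the rate, gain variability, and window lengths below are arbitrary choices for illustration): in a doubly stochastic model the spike-count variance is rT + (rT)^2 Var(g), so the SNR of the integrated count approaches the ceiling 1/sd(g) instead of growing like sqrt(rT).

```python
import numpy as np

rng = np.random.default_rng(2)
rate = 20.0          # spikes/s (hypothetical)
sigma_g = 0.2        # sd of slow, trial-to-trial gain fluctuations
n_trials = 200000

for T in [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]:
    # Gamma-distributed gain with mean 1 and sd sigma_g, fixed per trial.
    gain = rng.gamma(1 / sigma_g**2, sigma_g**2, n_trials)
    counts = rng.poisson(gain * rate * T)
    snr = counts.mean() / counts.std()
    print(f"T = {T:4.2f} s  SNR = {snr:.2f}  (pure Poisson: {np.sqrt(rate*T):.2f})")
```

With sd(g) = 0.2 the integrated-count SNR saturates toward its ceiling of 1/0.2 = 5 while the pure-Poisson benchmark keeps growing with the window, which is the fundamental limit the abstract describes.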
Perceptually optimized image rendering
Laparra, Valero; Berardino, Alexander; Ballé, Johannes; Simoncelli, Eero P
We develop a framework for rendering photographic images by directly optimizing their perceptual similarity to the original visual scene. Specifically, over the set of all images that can be rendered on a given display, we minimize the normalized Laplacian pyramid distance (NLPD), a measure of perceptual dissimilarity that is derived from a simple model of the early stages of the human visual system. When rendering images acquired with a higher dynamic range than that of the display, we find that the optimization boosts the contrast of low-contrast features without introducing significant artifacts, yielding results of comparable visual quality to current state-of-the-art methods, but without manual intervention or parameter adjustment. We also demonstrate the effectiveness of the framework for a variety of other display constraints, including limitations on minimum luminance (black point), mean luminance (as a proxy for energy consumption), and quantized luminance levels (halftoning). We show that the method may generally be used to enhance details and contrast, and, in particular, can be used on images degraded by optical scattering (e.g., fog). Finally, we demonstrate the necessity of each of the NLPD components (an initial power function, a multiscale transform, and local contrast gain control) in achieving these results, and we show that NLPD is competitive with current state-of-the-art image quality metrics.
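The rendering-as-optimization recipe can be illustrated with a crude stand-in metric (a single center-surround "Laplacian" term on a random synthetic scene, not the full NLPD): projected gradient descent matches the scene's local contrast while a clipping projection enforces the display's [0, 1] range.

```python
import numpy as np

rng = np.random.default_rng(3)

def laplacian(img):
    # Center-surround response: pixel minus local 4-neighbor mean (a crude
    # stand-in for one level of the normalized Laplacian pyramid).
    pad = np.pad(img, 1, mode="edge")
    local_mean = (pad[:-2, 1:-1] + pad[2:, 1:-1]
                  + pad[1:-1, :-2] + pad[1:-1, 2:]) / 4
    return img - local_mean

# Hypothetical HDR scene with values exceeding the display range [0, 1].
scene = rng.uniform(0, 2, (32, 32))
rendered = np.clip(scene, 0, 1)          # naive rendering: hard clipping

def contrast_loss(img):
    return ((laplacian(img) - laplacian(scene)) ** 2).sum()

# Projected gradient descent on the contrast-matching loss. The laplacian
# operator is linear and approximately self-adjoint (edge padding breaks
# exact symmetry), so applying it to the error approximates the gradient.
for _ in range(500):
    err = laplacian(rendered) - laplacian(scene)
    grad = laplacian(err)
    rendered = np.clip(rendered - 0.2 * grad, 0, 1)

print("naive:", contrast_loss(np.clip(scene, 0, 1)),
      " optimized:", contrast_loss(rendered))
```

The optimized rendering trades absolute luminance (which the display cannot reproduce anyway) for local contrast fidelity, which is the same trade the NLPD objective makes with a perceptually calibrated metric.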
Dissociation of Choice Formation and Choice-Correlated Activity in Macaque Visual Cortex
Goris, Robbe L T; Ziemba, Corey M; Stine, Gabriel M; Simoncelli, Eero P; Movshon, J Anthony
Responses of individual task-relevant sensory neurons can predict monkeys' trial-by-trial choices in perceptual decision-making tasks. Choice-correlated activity has been interpreted as evidence that the responses of these neurons are causally linked to perceptual judgments. To further test this hypothesis, we studied responses of orientation-selective neurons in V1 and V2 while two macaque monkeys performed a fine orientation discrimination task. Although both animals exhibited a high level of neuronal and behavioral sensitivity, only one exhibited choice-correlated activity. Surprisingly, this correlation was negative: when a neuron fired more vigorously, the animal was less likely to choose the orientation preferred by that neuron. Moreover, choice-correlated activity emerged late in the trial, earlier in V2 than in V1, and was correlated with anticipatory signals. Together, these results suggest that choice-correlated activity in task-relevant sensory neurons can reflect postdecision modulatory signals.

SIGNIFICANCE STATEMENT When observers perform a difficult sensory discrimination, repeated presentations of the same stimulus can elicit different perceptual judgments. This behavioral variability often correlates with variability in the activity of sensory neurons driven by the stimulus. Traditionally, this correlation has been interpreted as suggesting a causal link between the activity of sensory neurons and perceptual judgments. More recently, it has been argued that the correlation instead may originate in recurrent input from other brain areas involved in the interpretation of sensory signals. Here, we call both hypotheses into question. We show that choice-related activity in sensory neurons can be highly variable across observers and can reflect modulatory processes that are dissociated from perceptual decision-making.
End-to-end optimization of nonlinear transform codes for perceptual quality
Ballé, Johannes; Laparra, Valero; Simoncelli, Eero P.
In: 2016 Picture Coding Symposium (PCS 2016). Institute of Electrical and Electronics Engineers Inc., 2017
Eigen-distortions of hierarchical representations [Meeting Abstract]
Berardino, Alexander; Ballé, Johannes; Laparra, Valero; Simoncelli, Eero P
We develop a method for comparing hierarchical image representations in terms of their ability to explain perceptual sensitivity in humans. Specifically, we utilize Fisher information to establish a model-derived prediction of sensitivity to local perturbations of an image. For a given image, we compute the eigenvectors of the Fisher information matrix with largest and smallest eigenvalues, corresponding to the model-predicted most- and least-noticeable image distortions, respectively. For human subjects, we then measure the amount of each distortion that can be reliably detected when added to the image. We use this method to test the ability of a variety of representations to mimic human perceptual sensitivity. We find that the early layers of VGG16, a deep neural network optimized for object recognition, provide a better match to human perception than later layers, and a better match than a 4-stage convolutional neural network (CNN) trained on a database of human ratings of distorted image quality. On the other hand, we find that simple models of early visual processing, incorporating one or more stages of local gain control, trained on the same database of distortion ratings, provide substantially better predictions of human sensitivity than either the CNN, or any combination of layers of VGG16.
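The eigen-distortion recipe itself is model-agnostic and fits in a few lines. The sketch below applies it to a tiny made-up two-stage model (random filters plus divisive gain control) rather than VGG16: compute the response Jacobian at a base image, form the Fisher matrix under additive Gaussian response noise, and read off the extreme eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 16
W = rng.normal(size=(8, dim)) / np.sqrt(dim)   # toy linear filters
x0 = rng.uniform(0.2, 0.8, dim)                # hypothetical base "image"

def model(x):
    # Toy two-stage response: linear filtering then divisive gain control
    # (a stand-in for the perceptual models compared in the abstract).
    f = W @ x
    return f / (1.0 + np.abs(f).sum())

# Numerical Jacobian of the response at x0 (central differences).
eps = 1e-5
J = np.stack([(model(x0 + eps * e) - model(x0 - eps * e)) / (2 * eps)
              for e in np.eye(dim)], axis=1)

# Under additive Gaussian response noise, the Fisher information matrix is
# F = J^T J; its extreme eigenvectors are the model-predicted most- and
# least-noticeable distortions of x0.
F = J.T @ J
evals, evecs = np.linalg.eigh(F)
most_noticeable, least_noticeable = evecs[:, -1], evecs[:, 0]
print("eigenvalue ratio:", evals[-1] / max(evals[0], 1e-30))
```

Perturbing the base image along the top eigenvector changes the model response far more than perturbing along the bottom one; these are the two distortions whose human detectability the method then measures psychophysically.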
Neural Quadratic Discriminant Analysis: Nonlinear Decoding with V1-Like Computation
Pagan, Marino; Simoncelli, Eero P; Rust, Nicole C
Linear-nonlinear (LN) models and their extensions have proven successful in describing transformations from stimuli to spiking responses of neurons in early stages of sensory hierarchies. Neural responses at later stages are highly nonlinear and have generally been better characterized in terms of their decoding performance on prespecified tasks. Here we develop a biologically plausible decoding model for classification tasks that we refer to as neural quadratic discriminant analysis (nQDA). Specifically, we reformulate an optimal quadratic classifier as an LN-LN computation, analogous to "subunit" encoding models that have been used to describe responses in retina and primary visual cortex. We propose a physiological mechanism by which the parameters of the nQDA classifier could be optimized, using a supervised variant of a Hebbian learning rule. As an example of its applicability, we show that nQDA provides a better account than many comparable alternatives for the transformation between neural representations in two high-level brain areas recorded as monkeys performed a visual delayed-match-to-sample task.
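The central identity, that an optimal quadratic classifier is an LN-LN cascade, can be verified directly. In the snippet below (arbitrary synthetic class parameters), the Gaussian log-likelihood ratio is rewritten as a weighted sum of squared linear "subunit" outputs plus a linear term, and the two forms agree to machine precision.

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 5

# Two Gaussian classes with different means and covariances.
A0 = rng.normal(size=(dim, dim)); C0 = A0 @ A0.T + np.eye(dim)
A1 = rng.normal(size=(dim, dim)); C1 = A1 @ A1.T + np.eye(dim)
m0, m1 = rng.normal(size=dim), rng.normal(size=dim)

P0, P1 = np.linalg.inv(C0), np.linalg.inv(C1)
logdet = 0.5 * (np.log(np.linalg.det(C0)) - np.log(np.linalg.det(C1)))

def quad_discriminant(x):
    # Log-likelihood ratio for equal priors: the optimal quadratic classifier.
    return 0.5 * ((x - m0) @ P0 @ (x - m0)
                  - (x - m1) @ P1 @ (x - m1)) + logdet

# LN-LN form: diagonalize the quadratic term, so that it becomes a weighted
# sum of squared linear filter outputs ("subunits"), plus a linear term.
M = 0.5 * (P0 - P1)
lam, V = np.linalg.eigh(M)               # subunit filters = eigenvectors of M
b = P1 @ m1 - P0 @ m0
c = 0.5 * (m0 @ P0 @ m0 - m1 @ P1 @ m1) + logdet

def lnln(x):
    subunits = (V.T @ x) ** 2            # L: filters, N: squaring
    return lam @ subunits + b @ x + c    # second L: weighted sum

x = rng.normal(size=dim)
print(quad_discriminant(x), lnln(x))     # identical up to round-off
```

The eigenvectors of the quadratic term play the role of the subunit filters, and their squared outputs are combined linearly, which is exactly the LN-LN structure of subunit encoding models.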