We introduce the feature-weighted receptive field (fwRF), an encoding model designed to balance expressiveness, interpretability and scalability. The fwRF is organized around the notion of a feature map—a transformation of visual stimuli into visual features that preserves the topology of visual space (but not necessarily the native resolution of the stimulus). The key assumption of the fwRF model is that activity in each voxel encodes variation in a spatially localized region across multiple feature maps. This region is fixed for all feature maps; however, the contribution of each feature map to voxel activity is weighted. Thus, the model has two separable sets of parameters: “where” parameters that characterize the location and extent of pooling over visual features, and “what” parameters that characterize tuning to visual features. The “where” parameters are analogous to classical receptive fields, while “what” parameters are analogous to classical tuning functions. By treating these as separable parameters, the fwRF model complexity is independent of the resolution of the underlying feature maps. This makes it possible to estimate models with thousands of high-resolution feature maps from relatively small amounts of data. Once a fwRF model has been estimated from data, spatial pooling and feature tuning can be read-off directly with no (or very little) additional post-processing or in-silico experimentation.
We describe an optimization algorithm for estimating fwRF models from data acquired during standard visual neuroimaging experiments. We then demonstrate the model's application to two distinct sets of features: Gabor wavelets and features supplied by a deep convolutional neural network. We show that when Gabor feature maps are used, the fwRF model recovers receptive fields and spatial frequency tuning functions consistent with known organizational principles of the visual cortex. We also show that a fwRF model can be used to regress entire deep convolutional networks against brain activity. The ability to use whole networks in a single encoding model yields state-of-the-art prediction accuracy. Our results suggest a wide variety of uses for the feature-weighted receptive field model, from retinotopic mapping with natural scenes, to regressing the activities of whole deep neural networks onto measured brain activity.