Using a virtual-reality paradigm, we investigated whether a normalization model or a view combination model better accounts for scene recognition performance with 3-D layouts. Participants learned a layout of seven objects from two training views (e.g., 0° and 48°) by discriminating the “correct” layout from distracters. Later, they performed the discrimination task from the training views (e.g., 0° and 48°), an interpolated view (e.g., 24°), an extrapolated view (e.g., 72°), and a far view (e.g., 96°). The layout was easier to discriminate from the interpolated view than from the extrapolated view, and even easier than from the training views themselves. These results extend the applicability of view combination accounts of recognition to 3-D stimuli with stereoscopic depth information.