Human ventral temporal cortex shows a categorical organization, with regions responding selectively to faces, bodies, tools, scenes, words, and other categories. Why is this? Traditional accounts explain category selectivity as arising within a hierarchical system dedicated to visual object recognition. For example, it has been proposed that category selectivity reflects the clustering of category-associated visual feature representations, or that it reflects category-specific computational algorithms needed to achieve view invariance. This visual object recognition framework has gained renewed interest with the success of deep neural network models trained to “recognize” objects: these hierarchical feed-forward networks show similarities to human visual cortex, including categorical separability. We argue that the object recognition framework is unlikely to fully account for category selectivity in visual cortex. Instead, we consider category selectivity in the context of other functions such as navigation, social cognition, tool use, and reading. Category-selective regions are activated during such tasks even in the absence of visual input and even in individuals with no prior visual experience. Further, they are engaged in close connections with broader domain-specific networks. Considering the diverse functions of these networks, category-selective regions likely encode their preferred stimuli in highly idiosyncratic formats; representations that are useful for navigation, social cognition, or reading are unlikely to be meaningfully similar to each other and to varying degrees may not be entirely visual. The demand for specific types of representations to support category-associated tasks may best account for category selectivity in visual cortex. This broader view invites new experimental and computational approaches.