Interaction with the world is a multisensory experience, but most of what is known about the neural correlates of perception comes from studying vision. Auditory inputs enter cortex with its own set of unique qualities, and leads to use in oral communication, speech, music, and the understanding of emotional and intentional states of others, all of which are central to the human experience. To better understand how the auditory system develops, recovers after injury, and how it may have transitioned in its functions over the course of hominin evolution, advances are needed in models of how the human brain is organized to process real-world natural sounds and “auditory objects”. This review presents a simple fundamental neurobiological model of hearing perception at a category level that incorporates principles of bottom-up signal processing together with top-down constraints of grounded cognition theories of knowledge representation. Though mostly derived from human neuroimaging literature, this theoretical framework highlights rudimentary principles of real-world sound processing that may apply to most if not all mammalian species with hearing and acoustic communication abilities. The model encompasses three basic categories of sound-source: (1) action sounds (non-vocalizations) produced by ‘living things’, with human (conspecific) and non-human animal sources representing two subcategories; (2) action sounds produced by ‘non-living things’, including environmental sources and human-made machinery; and (3) vocalizations (‘living things’), with human versus non-human animals as two subcategories therein. The model is presented in the context of cognitive architectures relating to multisensory, sensory-motor, and spoken language organizations. The models' predictive values are further discussed in the context of anthropological theories of oral communication evolution and the neurodevelopment of spoken language proto-networks in infants/toddlers. These phylogenetic and ontogenetic frameworks both entail cortical network maturations that are proposed to at least in part be organized around a number of universal acoustic-semantic signal attributes of natural sounds, which are addressed herein.Graphical abstract
A neurobiological model for how the human brain is organized for processing and perceiving real-world, natural sounds.