Emotions are implicitly expressed in both facial expressions and prosodic components of vocal communication. The ability to recognize nonverbal cues of emotion is an important feature of social competence that matures gradually across childhood and adolescence. Compared to the extensive knowledge about the development of emotion recognition (ER) from facial displays of emotion, relatively little is known about the maturation of this ability in the auditory domain. The current review provides an overview of what behavioural studies show about the development of vocal ER, and of the neural mechanisms that might contribute to this maturational process. Youth are thought to reach adult-like vocal ER ability in early or late adolescence. At a neural level, several structural and functional changes occur in the adolescent brain that may impact the representation of emotional information. However, there is a paucity of developmental neuroimaging work directly examining the neural processing of prosody in youth. We speculate that brain areas relevant to vocal perception in adults may undergo age-related changes that map onto increased vocal ER capacity.