The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic “talent”. In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI – specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to for the first time map directly between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning, and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions.