Fluent speech comprises sequences that are composed from a finite alphabet of learned words, syllables, and phonemes. The sequencing of discrete motor behaviors has received much attention in the motor control literature, but relatively little has been focused directly on speech production. In this paper, we investigate the cortical and subcortical regions involved in organizing and enacting sequences of simple speech sounds. Sparse event-triggered functional magnetic resonance imaging (fMRI) was used to measure responses to preparation and overt production of non-lexical three-syllable utterances, parameterized by two factors: syllable complexity and sequence complexity. The comparison of overt production trials to preparation only trials revealed a network related to the initiation of a speech plan, control of the articulators, and to hearing one's own voice. This network included the primary motor and somatosensory cortices, auditory cortical areas, supplementary motor area (SMA), the precentral gyrus of the insula, and portions of the thalamus, basal ganglia, and cerebellum. Additional stimulus complexity led to increased engagement of the basic speech network and recruitment of additional areas known to be involved in sequencing non-speech motor acts. In particular, the left hemisphere inferior frontal sulcus and posterior parietal cortex, and bilateral regions at the junction of the anterior insula and frontal operculum, the SMA and pre-SMA, the basal ganglia, anterior thalamus, and the cerebellum showed increased activity for more complex stimuli. We hypothesize mechanistic roles for the extended speech production network in the organization and execution of sequences of speech sounds.