Humans use time-varying pitch patterns to convey information in music and speech. Recognition of musical melodies and lexical tones relies on relative pitch (RP), the ability to identify the interval between two pitches. RP processing in music is usually more fine-grained than in tonal languages. Western music has twelve pitch categories within an octave, whereas Taiwanese (Taiwanese Hokkien, a tonal language) has only three level (non-glide) lexical tones. The present study aimed to compare the neural substrates underlying RP processing of musical melodic intervals with those underlying RP processing of level lexical tones in Taiwanese. Functional magnetic resonance imaging data from fourteen participants with good RP were analyzed. The results showed that, compared to auditory imagery of visually presented Taiwanese bi-character words with level lexical tones, imagining the sounds of visually presented musical intervals was associated with enhanced activity in the central subregion of the right dorsal premotor cortex (dPMC), the right posterior parietal cortex (PPC), and the right dorsal precuneus. In the sound-congruence-judgement task, participants imagined the sounds of musical intervals or bi-character words and then judged whether the imagined sounds were melodically congruent with heard sounds. In this task, the contrast of the musical minus linguistic conditions yielded activity in the bilateral dPMC-PPC network and the dorsal precuneus, with the dPMC activated in its rostral subregion. The central dPMC and PPC may mediate the attention-based maintenance of pitch intervals, whereas the dorsal precuneus may support attention control and the spatial/sensorimotor processing of the fine-grained pitch structures of music. When participants judge the congruence between imagined and heard musical intervals, the bilateral rostral dPMC may play a role in attention control, working memory, evaluation of motor activities, and monitoring mechanisms.
Based on the findings of this study and recent studies of amusia, we suggest that higher-order cognitive operations are critical to the pitch processing of musical melodies, which is more fine-grained than that of lexical tones.