Speech perception in face-to-face conversation involves processing of speech sounds (auditory) and speech-associated mouth/lip movements (visual) from a speaker. Using PET where no scanner noise was present, brain regions involved in speech cue processing were investigated with the normal hearing subjects with no previous lip-reading training (N= 17) carrying out a semantic plausibility decision on spoken sentences delivered in a movie file. Multimodality was ensured at the sensory level in all four conditions. Sensory-specific speech cue of one sensory modality, i.e., auditory speech (A condition) or mouth movement (V condition), was delivered with a control stimulus of the other modality whereas speech cues of both sensory modalities (AV condition) were delivered during bimodal condition. In comparison to the control condition, extensive activations in the superior temporal regions were observed bilaterally during the A condition but these activations were reduced in extent and left lateralized during the AV condition. Polymodal region such as left posterior superior temporal sulcus (pSTS) involved in cross-modal interaction/integration of audiovisual speech was found to be activated during the A and more so during the AV conditions but not during the V condition. Activations were observed in Broca's (BA 44), medial frontal (BA 8), and anterior ventrolateral prefrontal (BA 47) regions in the left during the V condition, where lip-reading performance was less successful. Results indicated that the speech-associated lip movements (visual speech cue) rendered suppression on the activity in the right auditory temporal regions. Overadditivity (AV > A + V) observed in the right postcentral region during the bimodal condition relative to the sum of unimodal speech conditions was also associated with reduced activity during the V condition. These findings suggested that visual speech cue could exert an inhibitory modulatory effect on the brain activities in the right hemisphere during the cross-modal interaction of audiovisual speech perception.