The brain constantly has to interpret stimuli from a range of modalities, originating from the same or different objects, to create unambiguous percepts. The mechanisms of such multisensory processing have been studied intensely with respect to the time window of integration and the effect of spatial separation. However, the neural mechanisms underlying the interplay between alerting effects and multisensory integration remain elusive. We addressed this issue with a test paradigm in which potentially alerting stimuli and simultaneously activating stimuli could be manipulated independently: we measured the temporal ventriloquism effect in European starlings using a temporal order judgment paradigm, with subjects judging the temporal order of the lighting of 2 spatially separated lights. When spatially noninformative acoustic stimuli were added to the visual stimuli, performance improved if the 2 visual stimuli were flanked by acoustic cues with a small temporal offset, compared to synchronous presentation. Presenting the 2 acoustic cues with asymmetric offsets showed that this effect was driven mainly by the cue trailing the second visual stimulus, whereas an acoustic cue leading the first visual stimulus had less effect. In contrast, 1 singleton acoustic cue prior to the first visual stimulus, without a second acoustic cue, enhanced performance. Our results support the hypothesis that the first stimulus pair, with the leading sound, activates alerting mechanisms and enhances neural processing, while the second stimulus pair, with the trailing sound, drives multisensory integration through simultaneous activation within the temporal binding window.