Previous research has shown that processing words with an up/down association (e.g., bird, foot) can influence the subsequent identification of visual targets in congruent location (at the top/bottom of the screen). However, as facilitation and interference were found under similar conditions, the nature of the underlying mechanisms remained unclear. We propose that word comprehension relies on the perceptual simulation of a prototypical event involving the entity denoted by a word in order to provide a general account of the different findings. In 3 experiments, participants had to discriminate between 2 target pictures appearing at the top or the bottom of the screen by pressing the left versus right button. Immediately before the targets appeared, they saw an up/down word belonging to the target’s event, an up/down word unrelated to the target, or a spatially neutral control word. Prime words belonging to target event facilitated identification of targets at a stimulus onset asynchrony (SOA) of 250 ms (Experiment 1), but only when presented in the vertical location where they are typically seen, indicating that targets were integrated in the simulations activated by the prime words. Moreover, at the same SOA, there was a robust facilitation effect for targets appearing in their typical location regardless of the prime type. However, when words were presented for 100 ms (Experiment 2) or 800 ms (Experiment 3), only a location nonspecific priming effect was found, suggesting that the visual system was not activated. Implications for theories of semantic processing are discussed.