Humans can envision the world from other people’s viewpoints. To explore the embodied process underlying such spatial perspective taking, we examined whether actions related to a whole-body movement modulate performance on spatial perspective-taking tasks. Results showed that when participants responded by putting their left/right foot or left/right hand forward, actions congruent with the direction of the movement (clockwise/counterclockwise) reduced response times relative to incongruent actions. In contrast, actions irrelevant to the movement (a left/right index-finger response) did not affect performance. Furthermore, we demonstrated that this response congruency effect cannot be explained by either spatial stimulus-response compatibility or sensorimotor interference. These results support the involvement of simulated whole-body movement in spatial perspective taking. Moreover, foot responses were faster than hand responses during spatial perspective taking, whereas the opposite pattern was obtained in a simple orientation judgment task without spatial perspective taking. Overall, our findings highlight the important role of motor simulation in spatial perspective taking.