This research examined whether the informational advantage of an animation over a static picture (and over no visualization as a control condition) can be compensated for by presenting in the text the information that constitutes this advantage. In addition, it investigated whether learners’ spatial abilities act as a compensator in learning with a static picture compared to an animation. Moreover, the underlying cognitive processes were explored using eye-tracking measures. Two hundred and one university students were randomly assigned to one of six conditions resulting from a 2 × 3 between-subjects design with text information (with vs. without dynamic information) and visualization format (no visualization vs. static picture vs. animation) as independent variables and spatial abilities as a continuous factor. For learning outcomes, results revealed that, contrary to expectations, text information did not moderate learning with the different visualization formats. However, learners receiving visualizations significantly outperformed learners in the control conditions, and learners receiving animations significantly outperformed learners receiving static pictures in a transfer test. An analysis of the eye-tracking data revealed that this beneficial effect of animations over static pictures was mediated by a pupillometry measure that is assumed to reflect effortful cognitive processing. Spatial abilities acted as a compensator in learning with the two visualization formats: the advantage of animations was evident for learners with low spatial abilities, but not for learners with high spatial abilities. These results indicate that the informational advantage of animations over static pictures cannot easily be compensated for through text information, but can be compensated for by learners’ spatial abilities.