Existing long-term memory (LTM) can boost the number of representations retained over a short delay in visual short-term memory (VSTM). However, it is unclear whether and how prior LTM affects the initial process of transforming fragile sensory inputs into durable VSTM representations (i.e., VSTM consolidation). The consolidation speed hypothesis predicts faster consolidation for familiar relative to unfamiliar stimuli. Alternatively, the perceptual boost hypothesis predicts that the perceptual processing advantage for familiar stimuli adds a constant boost during VSTM consolidation. To test these competing hypotheses, the present study examined how the large variance in participants’ prior multimedia experience with Pokémon affected VSTM for Pokémon. In Experiment 1, the amount of time allowed for VSTM consolidation was manipulated by presenting consolidation masks at different intervals after the onset of to-be-remembered Pokémon characters. First-generation Pokémon characters, with which participants were more familiar, were consolidated into VSTM faster than recent-generation Pokémon characters, with which participants were less familiar. These effects were absent in participants who were unfamiliar with both generations of Pokémon. In Experiment 1, familiarity also increased the number of retained Pokémon characters when consolidation was uninterrupted but still incomplete because of insufficient encoding time. This capacity effect was absent in Experiment 2, in which sufficient encoding time allowed consolidation to complete. Together, these results support the consolidation speed hypothesis over the perceptual boost hypothesis and highlight the importance of assessing experimental effects on both the processing and representational aspects of VSTM.
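The contrast between the two hypotheses can be illustrated with a toy model. The sketch below is not the analysis used in the study: the saturating exponential curve, the function names, and every parameter value (capacity, rate, boost) are assumptions chosen only for illustration. It treats the number of retained items as a function of the unmasked time available for consolidation, and compares a familiarity effect on the rate (consolidation speed) with a familiarity effect that adds a fixed offset (perceptual boost).

```python
import math


def retained_items(t_ms, capacity, rate_per_ms, boost=0.0):
    """Hypothetical consolidation curve: items retained after t_ms of unmasked time.

    Assumes a saturating exponential approach to `capacity`, plus an
    optional constant `boost`. Both are illustrative assumptions.
    """
    return capacity * (1.0 - math.exp(-rate_per_ms * t_ms)) + boost


def familiarity_advantage(t_ms, hypothesis):
    """Familiar minus unfamiliar retained items under one hypothesis.

    All parameter values below are arbitrary illustrative choices.
    """
    capacity = 3.0       # assumed asymptotic number of items
    base_rate = 0.005    # assumed consolidation rate for unfamiliar stimuli
    unfamiliar = retained_items(t_ms, capacity, base_rate)

    if hypothesis == "speed":
        # Consolidation speed: familiarity raises the rate only.
        familiar = retained_items(t_ms, capacity, 2.0 * base_rate)
    elif hypothesis == "boost":
        # Perceptual boost: familiarity adds a fixed offset at every delay.
        familiar = retained_items(t_ms, capacity, base_rate, boost=0.3)
    else:
        raise ValueError(f"unknown hypothesis: {hypothesis!r}")

    return familiar - unfamiliar


if __name__ == "__main__":
    for t_ms in (50, 100, 200, 400, 5000):
        speed = familiarity_advantage(t_ms, "speed")
        boost = familiarity_advantage(t_ms, "boost")
        print(f"t={t_ms:>5} ms  speed advantage={speed:.3f}  boost advantage={boost:.3f}")
```

Under these assumptions, the speed account yields a familiarity advantage that appears at intermediate consolidation times and vanishes once consolidation has time to complete, whereas the boost account yields the same advantage at every delay. That qualitative difference is what distinguishes the two predictions when mask timing is varied.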