Joint attention, the mutual focus of two individuals on an item, speeds detection and discrimination of target information. However, what happens to that information beyond the initial perceptual episode? Fully comprehending and engaging with our immediate environment also requires working memory (WM), which integrates information from second to second to create a coherent and fluid picture of our world. Yet no research to date has examined how joint attention directly affects WM. To investigate this question, we created a unique paradigm that combines gaze cues with a traditional visual WM task. A central, direct-gaze ‘cue’ face looked left or right, followed 500 ms later by 4, 6, or 8 colored squares presented on one side of the face for encoding. Crucially, the cue face either looked at the squares (valid cue) or looked away from them (invalid cue). A no-shift (direct gaze) condition served as a baseline. After a blank 1,000 ms maintenance interval, participants reported whether the color of a single test square had been present in the preceding display. WM accuracy was significantly greater for colors encoded in the valid condition than in the invalid and direct conditions. Further experiments showed that an arrow cue and a low-level motion cue, both shown to reliably orient attention, did not reliably modulate WM, indicating that social cues are more powerful than these nonsocial cues. This study provides the first direct evidence that sharing the focus of another individual establishes a point of reference from which information is advantageously encoded into WM.
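The trial structure described above can be sketched in code as a fully crossed design. This is only an illustrative sketch, not the authors' experiment script: the timing values, set sizes, and cue conditions come from the abstract, whereas the names (`Trial`, `gaze_for`, `build_trials`), the equal split of probe-present and probe-absent trials, the left/right counterbalancing of the squares, and the number of repetitions are assumptions made for the example.

```python
import itertools
import random
from dataclasses import dataclass

# Values taken from the abstract.
CUE_TO_SQUARES_MS = 500          # delay between gaze cue and colored squares
MAINTENANCE_MS = 1000            # blank interval before the test square
SET_SIZES = (4, 6, 8)            # number of colored squares to encode
CUE_TYPES = ("valid", "invalid", "direct")

# Assumption: squares appear equally often on each side of the face.
SIDES = ("left", "right")


@dataclass(frozen=True)
class Trial:
    cue_type: str        # "valid", "invalid", or "direct"
    set_size: int        # 4, 6, or 8 squares
    square_side: str     # side of the face where the squares appear
    gaze_direction: str  # "left", "right", or "direct" (no shift)
    probe_present: bool  # assumption: half of the test colors were shown


def gaze_for(cue_type: str, square_side: str) -> str:
    """Return where the cue face looks for a given cue condition."""
    if cue_type == "direct":
        return "direct"              # baseline: no gaze shift
    if cue_type == "valid":
        return square_side           # face looks at the squares
    # invalid: face looks away from the squares
    return "left" if square_side == "right" else "right"


def build_trials(reps: int = 1, seed: int = 0) -> list[Trial]:
    """Cross all factors, repeat `reps` times (hypothetical), and shuffle."""
    rng = random.Random(seed)
    trials = [
        Trial(cue, n, side, gaze_for(cue, side), present)
        for cue, n, side, present in itertools.product(
            CUE_TYPES, SET_SIZES, SIDES, (True, False)
        )
        for _ in range(reps)
    ]
    rng.shuffle(trials)
    return trials
```

Calling `build_trials(reps=1)` yields one trial per cell of the 3 (cue) x 3 (set size) x 2 (side) x 2 (probe) design; stimulus presentation and response collection are deliberately left out.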