Although bilinguals respond differently to emotionally valenced words than to emotionally neutral words in their first language (L1), similar effects of emotional valence are hard to find in second language (L2) processing. We examine the extent to which these differences in first and second language processing are due to the context in which the two languages are acquired: L1 is typically acquired in more naturalistic settings (e.g., within the family) than L2 (e.g., at school). Fifty German–English bilinguals learned unfamiliar German and English negative and neutral words in one of two learning conditions: one group (emotion video context) watched videos of a person providing definitions of the words with facial and gestural cues, whereas another group (neutral video context) received the same definitions without gestural and emotional cues. Subsequently, participants carried out an emotional Stroop task, a sentence completion task, and a recall task on the words they had just learned. We found that the effect of learning context on the influence of emotional valence on responding was modulated by (a) language status (L1 versus L2) and (b) task requirement. We suggest that a more nuanced approach is required to capture the differences in emotion effects in the speed versus accuracy of access to words across different learning contexts and different languages. This applies in particular to our finding that bilinguals respond to L2 words in a manner similar to L1 words, provided that the learning context is naturalistic and incorporates emotional and prosodic cues.