Learning to choose adaptively when faced with uncertain and variable outcomes is a central challenge for decision makers. This study examines repeated choice in dynamic probability learning tasks in which outcome probabilities changed either as a function of participants' choices or independently of them. The presence or absence of these sequential choice–outcome dependencies was implemented by manipulating a single aspect of the task between conditions: whether reward was retained or withdrawn across individual choice trials. The study addresses how people adapt to these learning environments and to what extent they engage in 2 choice strategies often contrasted as paradigmatic examples of, respectively, striking violation of and nominal adherence to rational choice: diversification and persistent probability maximizing. Results show that decisions approached adaptive choice diversification and persistence when sufficient feedback was provided on the dynamic rules of the probabilistic environments. The divergent behavior in the 2 environments indicates that diversified choices were a response to the reward retention manipulation rather than to the mere variability of outcome probabilities. Choice in both environments was well accounted for by the generalized matching law, and computational modeling-based strategy analyses indicated that adaptive choice arose mainly from reliance on reinforcement learning strategies.
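
For orientation, the generalized matching law mentioned above is commonly written in the following two-option form. The notation is an assumption for illustration and is not taken from the study: B_1 and B_2 denote the numbers of choices allocated to options 1 and 2, R_1 and R_2 the rewards obtained from them, a a sensitivity parameter, and b a bias parameter. The exact parameterization fitted in the study may differ.

```latex
% Generalized matching law, standard two-option form (notation assumed).
% B_i: choices allocated to option i; R_i: rewards obtained from option i.
% a: sensitivity to the reward ratio; b: bias toward option 1.
\[
  \log\frac{B_1}{B_2} \;=\; a \,\log\frac{R_1}{R_2} \;+\; \log b
\]
% Strict matching is the special case a = 1 and b = 1,
% in which the choice ratio equals the reward ratio.
```

Under this reading, values of a below 1 (undermatching) correspond to choice ratios that are less extreme than the reward ratios.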