The suggestion that the capacity of audiovisual integration has an upper limit of 1 was challenged in 4 experiments using perceptual factors and training to enhance the binding of auditory and visual information. Participants were required to note a number of specific visual dot locations that changed in polarity when a critical auditory stimulus was presented, under relatively fast (200-ms stimulus onset asynchrony [SOA]) and slow (700-ms SOA) rates of presentation. In Experiment 1, transient cross-modal congruency between the brightness of polarity change and pitch of the auditory tone was manipulated. In Experiment 2, sustained chunking was enabled on certain trials by connecting varying dot locations with vertices. In Experiment 3, training was employed to determine if capacity would increase through repeated experience with an intermediate presentation rate (450 ms). Estimates of audiovisual integration capacity (K) were larger than 1 during cross-modal congruency at slow presentation rates (Experiment 1), during perceptual chunking at slow and fast presentation rates (Experiment 2), and, during an intermediate presentation rate posttraining (Experiment 3). Finally, Experiment 4 showed a linear increase in K using SOAs ranging from 100 to 600 ms, suggestive of quantitative rather than qualitative changes in the mechanisms in audiovisual integration as a function of presentation rate. The data compromise the suggestion that the capacity of audiovisual integration is limited to 1 and suggest that the ability to bind sounds to sights is contingent on individual and environmental factors.