Exposure to multiple sensory inputs that are unequal in number often leads to illusory percepts, which may be the product of a conflict between those inputs. To test this conflict, we used the classic sound-induced visual fission and fusion illusions under various temporal configurations and presentation timings. This conflict between unequal numbers of sensory inputs (i.e., crossmodal binding rivalry) depends on the binding of the first audiovisual pair and its temporal proximity to the upcoming unisensory stimulus. We therefore expected that tight coupling of the first audiovisual pair would lead to higher rivalry with the upcoming unisensory stimulus and, thus, weaker illusory percepts. Loose coupling, on the other hand, would lead to lower rivalry and stronger illusory percepts. Our data showed the emergence of two different participant groups: those with low discrimination performance and strong illusion reports (particularly for fusion) and those with the exact opposite pattern, thus extending previous findings on the effect of visual acuity on the strength of the illusion. Most importantly, our data revealed differential illusory strength across temporal configurations for the fission illusion, while for the fusion illusion these effects were noted only for the largest stimulus onset asynchronies tested. These findings suggest that the optimal integration theory for the double flash illusion should be expanded to also take into account the multisensory temporal interactions of the stimuli presented (i.e., temporal sequence and configuration).