This study extended previous research on stimulus equivalence with all-auditory stimuli by using a methodology more similar to conventional match-to-sample training and testing for three 3-member equivalence relations. In addition, it examined the effect of conflicting non-arbitrary relations on auditory equivalence. Participants in three conditions (n = 11 each) were trained and tested for the formation of equivalence using recorded auditory nonsense-syllable stimuli. In the Same Voice (SV) condition, stimuli were pronounced by the same voice in training and testing. In the Different Voice Test (DVT) condition, stimuli were all pronounced by the same voice in training but by three different voices in testing, with the sample always in a different voice from the equivalent comparisons. This established potentially competing sources of stimulus control, since participants might respond in accordance with either non-arbitrary auditory relations (shared voice) or equivalence. In the third condition (Different Voice; DV), participants received the same testing as in the DVT condition but were trained with stimuli pronounced by different voices, such that voice was unrelated to the programmed contingencies. As predicted, the DVT condition produced less equivalence responding and more non-arbitrary matching than the DV condition. These data are broadly consistent with previous findings with visual stimuli.