The intelligibility of speech in a harmonic masker varying in fundamental frequency contour, broadband temporal envelope, and spatial location
Differences in fundamental frequency (F0), modulations in the masker envelope, and differences in spatial location between a speech target and a masker can improve speech intelligibility in cocktail-party situations. These cues have been thoroughly investigated independently and associated with unmasking mechanisms: F0 segregation, temporal dip listening and spatial unmasking, respectively. Two experiments were conducted to examine whether F0 segregation interacts with spatial unmasking (experiment 1) or temporal modulations in the masker envelope (experiment 2) by measuring speech reception thresholds for a monotonized or an intonated voice against eight types of harmonic complex masker. In experiment 1, the masker varied in F0 contour (monotonized or intonated), mean F0 (0 or 3 semitones above that of the target) and spatial location (co-located or separated from the target). In experiment 2, the masker varied in F0 contour, mean F0 and broadband temporal envelope (stationary or 1-voice modulated). The benefits associated with spatial separation and F0 differences added up linearly in almost all conditions, whereas modulations in the masker envelope improved speech intelligibility only in the presence of intonated maskers. In addition, in both experiments F0 segregation benefited considerably from natural variations in the F0 pattern of the target voice, but was largely disrupted by those of the masker.
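As an illustrative sketch only (not the authors' code), the factorial design described above can be enumerated as follows: two F0 contours, two mean-F0 offsets and two levels of a third factor give the eight masker types of each experiment. The sketch also converts the 3-semitone offset to a frequency ratio using the standard equal-tempered relation, ratio = 2^(semitones/12). The function names and the 100 Hz target F0 are hypothetical placeholders that do not come from the study.

```python
from itertools import product


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for an interval given in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def masker_conditions(third_factor_levels):
    """Cross F0 contour x mean-F0 offset x a third two-level factor."""
    contours = ("monotonized", "intonated")
    mean_f0_offsets = (0, 3)  # semitones above the target's mean F0
    return list(product(contours, mean_f0_offsets, third_factor_levels))


# Experiment 1: third factor is spatial location.
exp1 = masker_conditions(("co-located", "separated"))
# Experiment 2: third factor is broadband temporal envelope.
exp2 = masker_conditions(("stationary", "1-voice modulated"))

# Hypothetical target mean F0, chosen only to make the arithmetic concrete.
TARGET_F0_HZ = 100.0
masker_f0_hz = TARGET_F0_HZ * semitones_to_ratio(3)

if __name__ == "__main__":
    print(len(exp1), len(exp2))
    print(round(semitones_to_ratio(3), 3), round(masker_f0_hz, 1))
```

A 3-semitone offset thus places the masker's mean F0 roughly 19% above that of the target, whatever the target's actual F0.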