
Towards mass-customizing up/down sound cues for listeners : issues concerning inter-subject variability

Authors Au, John Tsun Lam
Issue Date 2008
Summary Sound cues filtered with individualized head-related transfer functions (HRTFs) can provide accurate up/down directional cues to the listener from whom the HRTFs were measured. When the same cues are presented to other listeners, errors occur in the perceived up/down directions, and these errors vary greatly among listeners. This thesis presents a study that relates individuals’ localization errors for non-individualized HRTF-filtered up/down sound cues to a proposed index (referred to as the ‘matching score’). These scores are calculated from an individual’s ear dimensions and from the spectra of the HRTFs used to filter the sound cues. It is hypothesized that, for a particular listener and a particular up/down HRTF-filtered sound cue, the higher the matching score, the lower the localization error (H1). The matching score, based upon the ‘delay-and-add’ theory proposed by Hebrank and Wright (1974), is new and original and forms part of the academic contribution of the thesis. If H1 is supported, this thesis will be the first study to provide empirical evidence for the ‘delay-and-add’ theory. Three-dimensional moulds of the outer ears of thirty-three participants have been collected. Using the ear dimensions, matching scores have been calculated between each of the 33 participants and 192 open-copyrighted non-individualized HRTFs (from the LISTEN database: IRCAM and AKG Acoustics, 2004, and the CIPIC database: Algazi et al., 2001). These calculations have been repeated for each of the four selected elevation angles (30 and 15 degrees below ear level and 30 and 60 degrees above ear level) to give 25344 matching scores (33 participants x 192 HRTFs x 4 angles). Using these matching scores, five non-individualized HRTFs having the 0th, 25th, 50th, 75th, and 100th percentile average matching scores have been selected for each of the four elevation angles. These twenty HRTFs are then used to produce a total of twenty sound cues (4 angles x 5 HRTFs).
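The counts and the percentile-based selection described in the summary can be illustrated with a minimal sketch. This is not the thesis's procedure: the matching-score formula is not given in this record, so the scores here are random placeholders, and both the averaging over participants and the nearest-rank percentile rule are assumptions made only for illustration. All names (`select_by_percentile`, `scores`, and so on) are hypothetical.

```python
import random

N_PARTICIPANTS = 33
N_HRTFS = 192
ELEVATIONS_DEG = [-30, -15, 30, 60]   # below ear level is negative
PERCENTILES = [0, 25, 50, 75, 100]


def select_by_percentile(avg_scores, percentiles):
    """Return indices of the HRTFs whose average score sits at each
    requested percentile, using a nearest-rank rule (an assumption)."""
    order = sorted(range(len(avg_scores)), key=lambda i: avg_scores[i])
    last = len(order) - 1
    return [order[round(p / 100 * last)] for p in percentiles]


# Placeholder matching scores, indexed as scores[elevation][participant][hrtf].
# Real scores would come from ear dimensions and HRTF spectra.
rng = random.Random(0)
scores = {
    elev: [[rng.random() for _ in range(N_HRTFS)] for _ in range(N_PARTICIPANTS)]
    for elev in ELEVATIONS_DEG
}

# 33 participants x 192 HRTFs x 4 angles
total_scores = sum(len(row) for per_elev in scores.values() for row in per_elev)

selected = {}
for elev, per_elev in scores.items():
    # Average each HRTF's score over all participants (assumed meaning of
    # "average matching score").
    avg = [
        sum(per_elev[p][h] for p in range(N_PARTICIPANTS)) / N_PARTICIPANTS
        for h in range(N_HRTFS)
    ]
    selected[elev] = select_by_percentile(avg, PERCENTILES)

n_cues = sum(len(v) for v in selected.values())  # 4 angles x 5 HRTFs
print(total_scores, n_cues)
```

With real matching scores in place of the placeholders, `selected` would hold, for each elevation angle, the five HRTFs spanning the lowest to the highest average score.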
Seventeen participants (randomly selected from the 33 participants of the survey) were invited back to take part in a within-subject experiment in which each listener localized 120 sound cues presented in random order. These 120 sound cues comprise 6 repetitions of each of the 20 sound cues. The objective of the experiment is to evaluate the relationship between the matching score and the localization errors. In this experiment, listeners obtained an average localization error of about 36 degrees for an elevation cue, which is comparable to the results of other studies (Zotkin et al., 2003; Seeber and Fastl, 2003). The proposed matching scores have been found to correlate significantly with the localization errors, and regression analyses confirm that as the scores increase, the errors reduce significantly. In other words, H1 has been supported. Further analyses using the ‘school-effect’ model indicate that the matching score alone can explain 27% of the between-listener variation in the localization errors. Inter- and intra-listener variability in perceiving the up/down direction of a sound cue is also discussed. The study is the first to provide empirical support for the ‘delay-and-add’ theory as an explanation of human perception of the up/down direction of a sound cue. Future work to refine the ‘matching score’ calculation is desirable.
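The trial structure of the listening experiment (20 sound cues, 6 repetitions each, presented in random order) can be sketched as follows. This is only an illustration of the counts stated in the summary; the function name, the cue labels, and the use of a seeded shuffle are assumptions, and the thesis's actual randomization scheme is not described in this record.

```python
import random
from collections import Counter


def build_trial_order(cues, repetitions, seed=None):
    """Repeat every cue `repetitions` times and shuffle the result.
    The seed is only there to make the sketch reproducible."""
    trials = [cue for cue in cues for _ in range(repetitions)]
    random.Random(seed).shuffle(trials)
    return trials


# Each cue is labelled by (elevation in degrees, percentile of the
# HRTF's average matching score); the labels are hypothetical.
cues = [
    (elev, pct)
    for elev in (-30, -15, 30, 60)
    for pct in (0, 25, 50, 75, 100)
]

trials = build_trial_order(cues, repetitions=6, seed=1)
counts = Counter(trials)
print(len(cues), len(trials))
```

Each listener would receive one such shuffled list, so every cue is heard the same number of times while its position in the session varies.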
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2008
Language English
Format Thesis
Access View full-text via DOI
Files in this item:
File Description Size Format
th_redirect.html 337 B HTML
Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.