Simulating how humans hear themselves vocalize : a two-parameter spectral model


Abstract/Contents

Abstract
It is well known that people are often uncomfortable hearing recordings of their own singing and speaking voice. This discomfort arises from the unfamiliarity of the recorded voice, which reaches the ear through a different transmission mechanism than normal hearing: listening to one's recorded voice involves only a single air-conduction pathway, whereas the voice we hear while singing and speaking propagates through both an air-conduction and a bone-conduction pathway. Although the phenomenon is well known, the hearing of one's own voice has received little attention from researchers because it is a complex process involving multiple paths from the vocal folds to the hearing sensation. In addition, studying the auditory mechanism and perception of living humans is further complicated by ethical concerns.

In this dissertation, we conduct a perceptual experiment to investigate the spectral characteristics of one's own hearing; through this perceptual method, we expect to overcome the difficulties faced by mechanical studies. Moreover, we develop a two-parameter model of one's own hearing based on the experimental results. This study is the first to propose such a simple model of one's own hearing, with potential applications ranging from medical devices to general consumer electronics. First, we design and implement an equalizer with eight frequency bands. Then, we conduct two experiments with different groups -- amateur singers and professional singers -- using speech and singing voice samples. During the experiments, subjects process their recorded voice with the equalizer and judge how closely the result matches the voice they imagine hearing while vocalizing. We estimate transfer functions from air conduction to one's own hearing based on the chosen equalizer settings. In the results, we observe that each subject's transfer functions are relatively consistent and mostly resemble band-pass filters. Moreover, the transfer functions averaged across subjects also show a relatively high degree of similarity regardless of gender and level of singing experience.

This consistency of the transfer functions allows us to develop a model. In the first simplification step, we apply singular value decomposition (SVD) to compute a low-dimensional approximation of the original data. Then, we derive three models from the reduced rank-2 matrix by further data simplification and filter alteration. After pilot tests with each model, we conclude that the third model, consisting of one constant filter and two variable filters, can closely simulate one's own hearing with the least effort. Lastly, we conduct a validation experiment and demonstrate the accuracy and usability of the model, with positive feedback from participants.
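The modeling pipeline summarized in the abstract lends itself to a brief sketch. The example below is a minimal illustration, not the dissertation's actual code: it assumes hypothetical per-subject gains chosen on an eight-band equalizer (the band centre frequencies and all data values are invented for demonstration), stacks them into a matrix of estimated transfer functions, and uses NumPy's SVD to form the rank-2 approximation from which the simplified models are derived.

```python
import numpy as np

# Hypothetical centre frequencies (Hz) for the eight equalizer bands;
# the actual bands are defined in the dissertation and may differ.
band_centers = np.array([125, 250, 500, 1000, 2000, 4000, 8000, 16000])

# Illustrative data only: each row holds one subject's chosen per-band
# gains in dB, i.e. the estimated transfer function from air conduction
# to the subject's own hearing, sampled at the eight bands.
gains_db = np.array([
    [-6.0, -2.0, 1.5, 3.0, 2.0, -1.0, -4.0, -8.0],
    [-5.0, -1.0, 2.0, 3.5, 1.5, -2.0, -5.0, -9.0],
    [-7.0, -3.0, 1.0, 2.5, 2.5, -0.5, -3.5, -7.5],
])

# Centre the data, compute the singular value decomposition, and keep
# only the two largest singular values to obtain a rank-2 approximation.
mean_curve = gains_db.mean(axis=0)
U, s, Vt = np.linalg.svd(gains_db - mean_curve, full_matrices=False)
rank2 = mean_curve + (U[:, :2] * s[:2]) @ Vt[:2, :]

# Each subject's transfer function is now described by a shared mean
# curve plus two per-subject weights on the first two singular vectors.
weights = U[:, :2] * s[:2]
print("per-subject weights on the two basis filters:\n", weights)
print("rank-2 reconstruction error (dB RMS):",
      np.sqrt(np.mean((rank2 - gains_db) ** 2)))
```

This decomposition loosely mirrors the structure of the third model described in the abstract (one constant filter plus two variable filters), though the dissertation applies further data simplification and filter alteration that the sketch omits.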

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English

Creators/Contributors

Associated with Won, Sook Young
Associated with Stanford University, Department of Music.
Primary advisor Berger, Jonathan
Thesis advisor Chafe, Chris
Thesis advisor Slaney, Malcolm
Thesis advisor Wang, Ge

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Sook Young Won.
Note Submitted to the Department of Music.
Thesis Thesis (Ph.D.)--Stanford University, 2014.
Location electronic resource

Access conditions

Copyright
© 2014 by Sook Young Won
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
