A modulation spectra approach to sound texture analysis and synthesis

Abstract/Contents

Abstract
What information is perceived when listening to the sound of flowing water, blowing wind or crackling fire? These are examples of a class of sounds that are characterized as textures, because their frequency and temporal attributes are irregular and difficult to resolve, yet retain overall stability. Sound textures contain more structure than filtered noise and thus are not well modeled as such. Tonal methods often used for speech or music do not work well either. A better understanding of sound textures can provide insights into our auditory process and the information extracted from auditory inputs. To gain that understanding, it is necessary to find a quantitative representation of the added information. In this dissertation, various representations used for audio signal processing are investigated. This leads us to a carrier-modulator view of signals. For modeling the modulator, we find that modulation spectra capture the sound texture information in a compact representation. Based on this approach, we define a feature set that captures the information of sound textures. The derived features are relatively independent and intuitive to understand. We present a system that extracts the feature set from a sound texture example and uses the extracted feature set to synthesize new samples. A listening test shows that the proposed feature set and analysis-synthesis method improve sound texture resynthesis compared to a state-of-the-art method. Notably, our proposed method improves the resynthesis of sound textures with tonal components. Our method also works for long-term stable instrument sounds, which are not conventionally thought of as sound textures. Our study of sound textures further shows the importance of modulation in auditory perception. Such knowledge can be used to find sparse representations of sounds, and can be applied to data compression, machine-hearing applications and the improvement of audio analysis-synthesis methods. It also gives insight into the perception of timbre.
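The carrier-modulator view and the modulation spectrum mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's method: it assumes a single band, takes the magnitude of the analytic signal as the modulator (envelope), and calls the magnitude spectrum of that envelope the modulation spectrum. The function names `analytic_envelope` and `modulation_spectrum` and the toy signal parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def analytic_envelope(x):
    """Envelope (modulator) as the magnitude of the analytic signal.

    Assumption: an FFT-based Hilbert transform is an adequate envelope
    estimator for this toy example.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def modulation_spectrum(x, sample_rate):
    """Magnitude spectrum of the mean-removed envelope of x."""
    envelope = analytic_envelope(x)
    envelope = envelope - envelope.mean()  # drop the DC term
    magnitude = np.abs(np.fft.rfft(envelope)) / len(envelope)
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    return freqs, magnitude


# Toy signal (hypothetical values): a 100 Hz carrier whose amplitude
# is modulated at 4 Hz.
sample_rate = 1000
t = np.arange(2 * sample_rate) / sample_rate
carrier = np.sin(2 * np.pi * 100 * t)
modulator = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)
signal = modulator * carrier

freqs, magnitude = modulation_spectrum(signal, sample_rate)
peak_hz = freqs[np.argmax(magnitude)]
print(f"Strongest modulation frequency: {peak_hz:.1f} Hz")
```

In this toy case the strongest component of the modulation spectrum sits at the 4 Hz modulation rate rather than at the 100 Hz carrier, which is the sense in which the modulator is described separately from the carrier. A fuller treatment would presumably split the signal into frequency bands first and compute one such spectrum per band.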

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2017
Issuance monographic
Language English

Creators/Contributors

Associated with Kim, Hyung Suk
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor Smith, Julius O. (Julius Orion)
Thesis advisor Smith, Julius O. (Julius Orion)
Thesis advisor Berger, Jonathan
Thesis advisor Murmann, Boris

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Hyung Suk Kim.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2017.
Location electronic resource

Access conditions

Copyright
© 2017 by Hyung Suk Kim
License
This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
