The holistic voice : examining vocal expression through context, perception, and production

Noufi, Camille

Abstract/Contents

Abstract: In this dissertation, I explore the complex, crucial role of acoustic paralinguistic attributes in vocal expression, emphasizing the need for speech technologies to accurately capture these nuances. Central to my research are two key propositions: (1) the integration of context awareness and adaptive modifications in vocal expression production and perception significantly enhances the alignment of speech technologies with human communication, and (2) the introduction of the "vocal persona," a concept defined as a chosen set of vocal expressions that orient and respond to a communication context, enriches our understanding of both natural and synthesized voices. Employing a blend of qualitative research, audio signal processing, and machine learning, these studies examine the production, encoding, modification, and perception of paralinguistic attributes in both speech and singing. Together, the studies clarify and formalize the influences of multimodal context on vocal expression alongside the role of acoustic paralinguistic cues in communication. They result in the introduction of the vocal persona as a novel framework for holistic vocal expressiveness. The dissertation encompasses a comprehensive literature review on vocal expression and affect, expressive speech technologies, and the intersection of voice with context and personality, along with the role of machine learning in speech and audio processing. Studies of how affect is perceived in speech and song are included, as well as machine learning experiments, such as accent classification and prosodic context embeddings. The research extends to analysis for re-synthesis techniques and specific applications like tracking acoustic speech features following pediatric traumatic brain injury. A significant portion of the dissertation is dedicated to the thematic analysis of vocal persona, proposing a model and framework for natural persona-guided expression. The research culminates in a synthesis of these insights that proposes a framework for persona-guided speech synthesis, including context-adaptive voice conversion and text-to-speech applications.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	©2023
Publication date	2023
Issuance	monographic
Language	English

Creators/Contributors

Author	Noufi, Camille
Degree supervisor	Berger, Jonathan, 1954-
Thesis advisor	Berger, Jonathan, 1954-
Thesis advisor	Chafe, Chris
Thesis advisor	Smith, Julius O. (Julius Orion)
Degree committee member	Chafe, Chris
Degree committee member	Smith, Julius O. (Julius Orion)
Associated with	Stanford University, School of Humanities and Sciences
Associated with	Stanford University, Department of Music

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Camille Noufi.
Note	Submitted to the Department of Music.
Thesis	Thesis Ph.D. Stanford University 2023.
Location	https://purl.stanford.edu/fv310sy3280

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
