Affective analysis and synthesis of laughter
- Laughter is a universal human response to emotional stimuli. Though the production mechanism of laughter may seem crude when compared to other modes of vocalization such as speech and singing, the resulting auditory signal is nonetheless expressive. That is, laughter triggered by different social and emotional contexts is characterized by distinctive auditory features that imply certain states and attitudes of the laughing person. By implementing prototypes for interactive laughter synthesis and conducting crowdsourced experiments on the synthesized laughter stimuli, this dissertation investigates acoustic features of laughter expressions and how they may give rise to emotional meaning. The first part of the dissertation (Chapter 3) provides a new approach for interactive laughter synthesis that prioritizes expressiveness. Our synthesis model, with a reference implementation in the ChucK programming language, offers three levels of representation: the transcription mode requires specifying precise values of all control parameters, the instrument mode allows users to freely trigger and control laughter within the instrument's capacities, and the agent mode semi-automatically generates laughter according to its predefined characteristic tendency. Modified versions of this model have served as a stimulus generator for conducting perception experiments, as well as an instrument for the laptop orchestra. The second part of the dissertation (Chapter 4) describes a series of experiments conducted to understand (1) how acoustic features affect listeners' perception of emotions in synthesized laughter, and (2) the extent to which these observed relationships between features and emotions are laughter-specific. To explore the first question, a few chosen features are varied systematically to measure their impact on the perceived intensity and valence of emotions.
To explore the second question, we intentionally eliminate timbral and pitch-contour cues that are essential to our recognition of laughter in order to gauge the extent to which our acoustic features are specific to the domain of laughter. As a related contribution, we describe our attempts to characterize features of the auditory signal that can be used to distinguish laughter from speech (Chapter 5). While the corpus used to conduct this work does not provide annotations of the emotional qualities of laughter, and instead simply labels a given frame as either laughter, filler (such as 'uh', 'like', or 'er'), or garbage (including speech without laughter), this portion of the research nonetheless serves as a starting point for applying our insights from Chapter 3 and Chapter 4 to a more practical problem involving laughter classification using real-life data. By focusing on the affective dimensions of laughter, this work complements prior work on laughter synthesis that has primarily emphasized acceptability criteria. Moreover, by collecting listeners' responses to synthesized laughter stimuli, this work attempts to establish a causal link between acoustic features and emotional meaning that is difficult to achieve when using real laughter sounds. The collection of research presented in this dissertation is intended to offer novel tools and a framework for exploring many more unsolved questions about how humans communicate through laughter.
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Music.
|Wang, Ge, 1977-
|Statement of responsibility
|Submitted to the Department of Music.
|Thesis (Ph.D.)--Stanford University, 2014.
- © 2014 by Ji Eun Oh
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).