Data-driven methods for improving social interaction in virtual environments
- The COVID-19 pandemic highlighted the value of remote social communication between individuals. The fundamentals of such communication are actively studied for virtual reality interaction, remote video calls, and social networking, but research on these methods infrequently focuses on encoding explicit social use cases. In addition, systems that do incorporate modalities for explicit social information, such as touch, are often hand-tuned or manually generated. In this work, we leverage data-driven methods to improve social interaction between individuals in virtual environments. To socialize effectively in shared virtual environments, we must first understand how a person physically interacts in such environments. For the first contribution of this thesis, we develop an interaction-expectation model to improve hand tracking and interaction. The purpose of this model is to predict hand-object interaction before it occurs, allowing for smooth interaction in virtual environments and improved haptic feedback in shared-object settings. We find that we are able to predict human-object interaction before it occurs over short timescales (approximately 100 ms). To improve social interaction, we must also understand the emotional intent of actions between individuals. In the second contribution of this thesis, we collect a dataset of pairs of individuals interacting to convey affective information through touch, recorded using a soft pressure sensor on one participant's arm. This dataset was collected in a more natural environment than existing ones and uses scenario prompts rather than single-word prompts. For the third contribution, we develop a system that automatically conveys social touch information using our dataset: an algorithm that leverages computer vision techniques to map from the recorded data to a sleeve with an array of actuators that indent the skin.
We find that humans interpret the affective intent of our system with accuracy comparable to that of direct human touch. Finally, we consider the visual mode of social experience by improving affective facial expressions for 2D virtual avatars. In recent years, socialization via virtual avatars has increased dramatically, with the growing use of 2D drawn, rigged virtual avatars with face tracking. As the final contribution of this thesis, we create a novel dataset of 2D avatar expressions, of higher quality and with richer data than previous datasets. We then propose use cases for this dataset to automate the creation of 2D avatars. Through the collection of contributions in this thesis, we seek to push the field of virtual social interaction forward with multi-modal interaction. This will allow people to interact in virtual environments, connect with remote loved ones, and represent versions of themselves online more easily and effectively.
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource.
|Salvato, Millie Aila
|Bohg, Jeannette, 1981-
|Degree committee member
|Stanford University, Department of Mechanical Engineering
|Statement of responsibility
|Submitted to the Department of Mechanical Engineering.
|Thesis Ph.D. Stanford University 2022.
- © 2022 by Millie Aila Salvato
- This work is licensed under a Creative Commons Attribution Non Commercial Share Alike 3.0 Unported license (CC BY-NC-SA).