Using natural language processing on social media to understand psychological processes

Abstract/Contents

Abstract
Because of their global reach, social media platforms provide a treasure trove of naturalistic language data for examining social and psychological phenomena. In this work, we use social media language to examine affective processes and measure population well-being and health. Regarding the examination of affective processes, in Chapter 2, we found that Twitter users in the U.S. (in 55,867 tweets from 1,888 users) and Japan (in 63,863 tweets from 1,825 users) are more likely to produce affect that supports cultural values, but are more influenced by affect that violates cultural values. In the process, we also built a new sentiment analysis tool for Japanese to better facilitate research in this area. In Chapter 3, we examined these patterns specifically in U.S. news media (in 30 million tweets posted by 182 news sources). We showed that across a decade (2011-2020), politically biased news sources produced posts with more high-arousal negative affect (e.g., anger, hate) on Twitter and that these posts were more likely to spread. Regarding the measurement of population well-being and health, in Chapter 4, we combined advances made between 2013 and 2021 that incrementally improved data pre-processing, natural language processing, and machine-learning methods for measuring population well-being from social media language. Using 1.3 billion Twitter posts and 1.9 million Gallup respondents in the U.S., we establish the current state of the art (as of 2022) for measuring population well-being and demonstrate its performance and validity across several domains, including test-retest reliability and face and external convergent validity.
In Chapter 5, we study the psychological adaptation to the COVID-19 pandemic by leveraging 620,191 Twitter posts from the early stages of the pandemic (March to May 2020) that were geolocated to 889 U.S. counties. We found that people switched from more self-referential language (I, me, mine) to more group-referential language (we, us, ours) following shared trauma, similar to what has been observed during prior crises. In the same study, we also found that overall use of self-referential language in a county predicted future COVID-19 mortality rates. Together, these studies show the potential of social media language data, combined with psychology and computer science methods, to help us understand and monitor psychological and societal processes in the digital age.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022; 2022
Issuance monographic
Language English

Creators/Contributors

Author Hsu, Tiffany Weiting
Degree supervisor Tsai, Jeanne Ling
Thesis advisor Tsai, Jeanne Ling
Thesis advisor Eichstaedt, Johannes C
Thesis advisor Knutson, Brian
Thesis advisor Markus, Hazel Rose
Degree committee member Eichstaedt, Johannes C
Degree committee member Knutson, Brian
Degree committee member Markus, Hazel Rose
Associated with Stanford University, Department of Psychology

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Tiffany Weiting Hsu.
Note Submitted to the Department of Psychology.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/bk438xj6535

Access conditions

Copyright
© 2022 by Tiffany Weiting Hsu
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
