Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter
Abstract/Contents
- Abstract
- The pernicious problem of online hate speech has drawn attention from companies, governments, and regulatory bodies, and it necessitates reliable systems for automated hate speech detection. In recent years, a growing body of research has explored deep neural network classifiers for this task. We perform a comparative evaluation on a public English Twitter corpus of 25k tweets labeled as *hate speech*, *offensive language*, or *neither* and find that a combined convolutional and gated recurrent unit (CNN-GRU) network is the most competitive deep architecture, but that it still performs poorly on the *hate speech* class compared to linear classifiers with hand-engineered features. We further propose a method to enrich deep neural network models by integrating lexicon-based features. Our proposed method improves the performance of deep models by up to 4% in F1 score. An analysis of the prediction errors shows that certain keywords are strong indicators of hate speech for all models, while the CNN-GRU is robust in ambiguous contexts. Finally, we present a simple heuristic for discovering new hate speech terms using a trained model and evaluate it against an existing hate speech lexicon.
Description
Type of resource | text |
---|---|
Date created | May 2018 |
Creators/Contributors
Author | Wang, Cindy |
---|---|
Advisor | Potts, Christopher |
Degree granting institution | Stanford University, Department of Computer Science |
Subjects
Subject | computer science |
---|---|
Subject | engineering |
Subject | Stanford |
Subject | natural language processing |
Subject | hate speech |
Subject | computational social science |
Subject | deep learning |
Subject | neural networks |
Genre | Thesis |
Bibliographic information
Related item | |
---|---|
Location | https://purl.stanford.edu/td984qh7001 |
Access conditions
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
Preferred citation
- Preferred Citation
- Wang, Cindy. (2018). Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter. Stanford Digital Repository. Available at: https://purl.stanford.edu/td984qh7001
Collection
Undergraduate Theses, School of Engineering
Contact information
- Contact
- cindyw@cs.stanford.edu