Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter

Abstract/Contents

Abstract
The pernicious problem of hate speech online has drawn attention from companies, governments, and regulatory bodies, and it necessitates reliable systems for automated hate speech detection. In recent years, a growing body of research has explored deep neural network classifiers for this task. We perform a comparative evaluation on a public English Twitter corpus of 25k tweets labeled as "hate speech", "offensive language", or "neither" and find that a combined convolutional and gated recurrent unit (CNN-GRU) network is the most competitive deep architecture, but that it still performs poorly on the "hate speech" class compared to linear classifiers with hand-engineered features. We further propose a method to enrich deep neural network models by integrating lexicon-based features. Our proposed method improves the performance of deep models by up to 4% in F1 score. An analysis of the prediction errors shows that certain keywords are strong indicators of hate speech for all models, while the CNN-GRU is robust in ambiguous contexts. Finally, we present a simple heuristic for discovering new hate speech terms using a trained model and evaluate it against an existing hate speech lexicon.

Description

Type of resource: text
Date created: May 2018

Creators/Contributors

Author: Wang, Cindy
Advisor: Potts, Christopher
Degree granting institution: Stanford University, Department of Computer Science

Subjects

Subject: computer science
Subject: engineering
Subject: Stanford
Subject: natural language processing
Subject: hate speech
Subject: computational social science
Subject: deep learning
Subject: neural networks
Genre: Thesis

Access conditions

Use and reproduction
User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).

Preferred citation

Wang, Cindy. (2018). Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter. Stanford Digital Repository. Available at: https://purl.stanford.edu/td984qh7001

Collection

Undergraduate Theses, School of Engineering

