Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter

Abstract/Contents

Abstract: The pernicious problem of hate speech online has drawn attention from companies, governments, and regulatory bodies and necessitates reliable systems for automated hate speech detection. In recent years, a growing body of research has explored deep neural network classifiers for this task. We perform a comparative evaluation on a public English Twitter corpus of 25k tweets labeled as "hate speech", "offensive language", or "neither" and find that a combined convolutional and gated recurrent unit (CNN-GRU) network is the most competitive deep architecture, but that it still performs poorly on the "hate speech" class compared to linear classifiers with hand-engineered features. We further propose a method to enrich deep neural network models by integrating lexicon-based features. Our proposed method improves the performance of deep models by up to 4% in F1 score. An analysis of the prediction errors shows that certain keywords are strong indicators of hate speech for all models, while the CNN-GRU is robust in ambiguous contexts. Finally, we present a simple heuristic for discovering new hate speech terms using a trained model and evaluate it against an existing hate speech lexicon.

Type of resource	text
Date created	May 2018

Author	Wang, Cindy
Advisor	Potts, Christopher
Degree granting institution	Stanford University, Department of Computer Science

Subject	computer science
Subject	engineering
Subject	Stanford
Subject	natural language processing
Subject	hate speech
Subject	computational social science
Subject	deep learning
Subject	neural networks
Genre	Thesis

Related item	GitHub repository
Location	https://purl.stanford.edu/td984qh7001

Use and reproduction: User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).

Preferred Citation: Wang, Cindy. (2018). Lexicon Integrated Deep Neural Networks for Fine-grained Hate Speech Detection on Twitter. Stanford Digital Repository. Available at: https://purl.stanford.edu/td984qh7001

Undergraduate Theses, School of Engineering
