"Fine-Grained Detection of Hate Speech Using BERToxic" by Yakoob Khan

Dartmouth College Undergraduate Theses

Title

Fine-Grained Detection of Hate Speech Using BERToxic

Author

Yakoob Khan, Dartmouth College

Date of Award

Spring 6-1-2021

Document Type

Thesis (Undergraduate)

Department or Program

Department of Computer Science

First Advisor

Soroush Vosoughi

Abstract

This thesis describes our approach to the fine-grained detection of hate speech using deep learning. Building on the transformer encoder architecture, we propose BERToxic, a system that fine-tunes a pre-trained BERT model to locate toxic spans in a given text and applies additional post-processing steps to refine the prediction boundaries. The post-processing steps are (1) labeling the character offsets between consecutive toxic tokens as toxic and (2) assigning a toxic label to every word that has at least one token labeled as toxic. Through experiments, we show that these two post-processing steps improve the performance of our model by 4.16% on the test set. We further examine the effect of ensemble models on hate speech detection. The ensemble neural architectures we study include late fusion, where predictions from token and sequence classification models are aggregated in the prediction phase, and multi-task learning, where the same two models are trained jointly. Finally, given the scarcity and cost of labeled data, we explore data augmentation strategies, such as adding hate speech-related external datasets and applying token modification techniques to generate synthetic training examples. Our system significantly outperformed the baseline models and achieved an F1-score of 0.683, placing 17th out of 91 teams in a hate speech detection competition (SemEval-2021 Task 5). Our code is available at https://github.com/Yakoob-Khan/Toxic-Spans-Detection
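
To make the two post-processing steps concrete, the following is a minimal sketch, not the thesis's actual implementation. It assumes that token predictions arrive as (start, end) character offsets with a binary toxic label, and that words are whitespace-delimited. The function names, the example sentence, and the token boundaries are all hypothetical.

```python
import re
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def fill_between_toxic_tokens(spans: List[Span], labels: List[int]) -> Set[int]:
    """Step (1): mark characters of toxic tokens and the characters
    lying between two consecutive toxic tokens."""
    toxic: Set[int] = set()
    prev_end = None  # end offset of the previous token, if it was toxic
    for (start, end), label in zip(spans, labels):
        if label:
            if prev_end is not None:
                toxic.update(range(prev_end, start))  # fill the gap
            toxic.update(range(start, end))
            prev_end = end
        else:
            prev_end = None
    return toxic


def expand_to_words(text: str, toxic: Set[int]) -> Set[int]:
    """Step (2): if any character of a word is toxic, mark the whole word.
    Assumption: a word is a maximal run of non-whitespace characters."""
    expanded = set(toxic)
    for match in re.finditer(r"\S+", text):
        word_chars = range(match.start(), match.end())
        if any(i in toxic for i in word_chars):
            expanded.update(word_chars)
    return expanded


def postprocess(text: str, spans: List[Span], labels: List[int]) -> Set[int]:
    """Apply step (1), then step (2), and return toxic character offsets."""
    return expand_to_words(text, fill_between_toxic_tokens(spans, labels))


# Hypothetical example: the last word is split into two subword tokens,
# and only the first of them was predicted toxic.
text = "what a stupid idiot"
spans = [(0, 4), (5, 6), (7, 13), (14, 16), (16, 19)]
labels = [0, 0, 1, 1, 0]

result = postprocess(text, spans, labels)
print(text[min(result):max(result) + 1])  # prints "stupid idiot"
```

In this example, step (1) fills the space between the two toxic tokens, and step (2) extends the prediction from the first subword to the rest of its word, so the final span covers both words without a gap.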

Original Citation

Yakoob Khan, Weicheng Ma, and Soroush Vosoughi. (2021). Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech Using BERToxic. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021).

Recommended Citation

Khan, Yakoob, "Fine-Grained Detection of Hate Speech Using BERToxic" (2021). Dartmouth College Undergraduate Theses. 221.
https://digitalcommons.dartmouth.edu/senior_theses/221

Included in

Artificial Intelligence and Robotics Commons, Data Science Commons