Department of Computer Science
This thesis describes our approach to the fine-grained detection of hate speech using deep learning. We leverage the transformer encoder architecture to propose BERToxic, a system that fine-tunes a pre-trained BERT model to locate toxic spans in a given text and applies additional post-processing steps to refine the prediction boundaries. The post-processing steps are (1) labeling the character offsets between consecutive toxic tokens as toxic and (2) assigning a toxic label to every word that has at least one token labeled as toxic. Through experiments, we show that these two post-processing steps improve the performance of our model by 4.16% on the test set. We further examine the effect of ensemble models on hate speech detection. The ensemble neural architectures we study include late fusion, where predictions from token and sequence classification models are aggregated in the prediction phase, and multi-task learning, where the two aforementioned models are trained jointly. Finally, given the scarcity and cost of labeled data, we explore data augmentation strategies such as appending hate speech-related external datasets and applying token modification techniques to generate synthetic training examples. Our system significantly outperformed the baseline models and achieved an F1-score of 0.683, ranking 17th out of 91 teams in a hate speech detection competition. Our code is made available at https://github.com/Yakoob-Khan/Toxic-Spans-Detection
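The two post-processing steps can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes each token prediction is given as a hypothetical `(char_start, char_end, is_toxic)` triple with an exclusive end offset, that words are delimited by whitespace, and that the output is a sorted list of toxic character offsets. The names `word_spans` and `postprocess` are illustrative only.

```python
from typing import List, Tuple

# Assumed input format: (char_start, char_end_exclusive, predicted_toxic)
Token = Tuple[int, int, bool]
Span = Tuple[int, int]


def word_spans(text: str) -> List[Span]:
    """Return (start, end_exclusive) spans of whitespace-delimited words."""
    spans: List[Span] = []
    start = None
    for i, ch in enumerate(text):
        if ch.isspace():
            if start is not None:
                spans.append((start, i))
                start = None
        elif start is None:
            start = i
    if start is not None:
        spans.append((start, len(text)))
    return spans


def postprocess(text: str, tokens: List[Token]) -> List[int]:
    """Refine token-level toxic predictions into toxic character offsets."""
    toxic = set()

    # Start from the characters covered by tokens predicted as toxic.
    for start, end, is_toxic in tokens:
        if is_toxic:
            toxic.update(range(start, end))

    # Step 1: label the characters between two consecutive toxic tokens
    # (for example, the space separating them) as toxic.
    for (_, end1, tox1), (start2, _, tox2) in zip(tokens, tokens[1:]):
        if tox1 and tox2:
            toxic.update(range(end1, start2))

    # Step 2: if any token inside a word is toxic, label the whole word.
    flagged = [(s, e) for s, e, is_toxic in tokens if is_toxic]
    for w_start, w_end in word_spans(text):
        if any(s < w_end and e > w_start for s, e in flagged):
            toxic.update(range(w_start, w_end))

    return sorted(toxic)


# Example: only the sub-word piece "pid" and the word "idiot" are predicted toxic.
text = "you stupid idiot ok"
tokens = [(0, 3, False), (4, 7, False), (7, 10, True), (11, 16, True), (17, 19, False)]
offsets = postprocess(text, tokens)
```

In this example, step 1 adds the space at offset 10 between the two toxic tokens, and step 2 extends the partially flagged word "stupid" to its full span, so `offsets` covers characters 4 through 15.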
Yakoob Khan, Weicheng Ma, and Soroush Vosoughi. (2021). Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech Using BERToxic. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021).
Khan, Yakoob, "Fine-Grained Detection of Hate Speech Using BERToxic" (2021). Dartmouth College Undergraduate Theses. 221.