
An Investigation of Large Language Models for Real-World Hate Speech Detection

  • SUNY Buffalo
  • Union County Magnet High School
  • East Chapel Hill High School
  • Minhang Crosspoint Academy
  • University of Texas at San Antonio

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

24 Scopus citations

Abstract

Hate speech has emerged as a major problem plaguing our social spaces today. While there have been significant efforts to address this problem, existing methods are still limited in detecting hate speech online. A major limitation of existing methods is that hate speech detection is a highly contextual problem, and these methods cannot fully capture the context of hate speech to make accurate predictions. Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks. LLMs have undergone extensive training using vast amounts of natural language data, enabling them to grasp intricate contextual details. Hence, they could be used as knowledge bases for context-aware hate speech detection. However, a fundamental problem with using LLMs to detect hate speech is that there are no studies on effectively prompting LLMs for context-aware hate speech detection. In this work, we conduct a large-scale study of hate speech detection, employing five established hate speech datasets. We discover that LLMs not only match but often surpass the performance of current benchmark machine learning models in identifying hate speech. We also propose four diverse prompting strategies that optimize the use of LLMs in detecting hate speech. Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech by fully utilizing the knowledge base in LLMs, significantly outperforming existing techniques. Furthermore, although LLMs can provide a rich knowledge base for the contextual detection of hate speech, suitable prompting strategies play a crucial role in effectively leveraging this knowledge base for efficient detection.

Original language: English
Title of host publication: Proceedings - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023
Editors: M. Arif Wani, Mihai Boicu, Moamar Sayed-Mouchaweh, Pedro Henriques Abreu, Joao Gama
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1568-1573
Number of pages: 6
ISBN (Electronic): 9798350345346
State: Published - 2023
Event: 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023 - Jacksonville, United States
Duration: Dec 15 2023 → Dec 17 2023

Publication series

Name: Proceedings - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023

Conference

Conference: 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023
Country/Territory: United States
City: Jacksonville
Period: 12/15/23 → 12/17/23

Keywords

  • few-shot learning
  • hate speech
  • large language model
  • prompt engineering
