MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection

Data Distribution

This document provides the information required to access the MetaHate research collection.

Any scientific publication resulting from the use of this collection must explicitly cite the following reference:


    @misc{piot2024metahate,
          title={MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection},
          author={Paloma Piot and Patricia Martín-Rodilla and Javier Parapar},
          year={2024},
          eprint={2401.06526},
          archivePrefix={arXiv},
          primaryClass={cs.CL}
    }
        

The MetaHate collection is accessible for research purposes, subject to appropriate user agreements.

Data Format

The MetaHate dataset contains 1,226,202 social media posts and is formatted in TSV (Tab-Separated Values).

The meta-collection features two columns: one for the hate speech label (1 for hate, 0 for non-hate) and another for the post content.
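For illustration, here is a minimal Python sketch for loading the collection with pandas. The file name metahate.tsv, the absence of a header row, and the column order (label first, then post text) are assumptions; adjust them to match the copy you receive.

    import pandas as pd

    # Minimal loading sketch: the file name, the lack of a header row,
    # and the column order are assumptions -- verify against your copy.
    df = pd.read_csv(
        "metahate.tsv",
        sep="\t",
        names=["label", "text"],  # 1 = hate, 0 = non-hate
        quoting=3,                # csv.QUOTE_NONE: posts may contain quote characters
    )

    print(df["label"].value_counts())  # class balance
    print(df.sample(3))                # inspect a few posts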

Access to the complete meta-collection will be granted only upon submission of all relevant agreements for the derived datasets; otherwise, we will provide access only to the publicly available datasets. Access to the original datasets will be granted exclusively when the corresponding original agreement is submitted.

The table below outlines the utilized datasets and indicates whether the original dataset agreement is required.

| Dataset | Publication | Size | Agreement needed |
| --- | --- | --- | --- |
| Online Harassment 2017 | A Large Human-Labeled Corpus for Online Harassment Research (Golbeck et al. 2017) | 19,838 | Yes (jgolbeck@umd.edu) |
| Hateval 2019 | SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (Basile et al. 2019) | 12,747 | No |
| OLID 2019 | Predicting the Type and Target of Offensive Posts in Social Media (Zampieri et al. 2019) | 14,052 | No |
| US 2020 Elections | Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection (Grimminger and Klinger 2021) | 2,999 | No |
| "Call me sexist but" 2021 | The ‘Call me sexist, but’ sexism dataset (Samory et al. 2020) | 3,058 | Yes (web) |
| HASOC 2019 | Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages (Mandl et al. 2019) | 6,981 | No |
| A Curated Hate Speech Dataset 2023 | A curated dataset for hate speech detection on social media text (Mody, Huang and de Oliveira 2023) | 560,385 | No |
| Hate Speech B 2016 | Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter (Waseem and Hovy 2016) | 6,909 | Yes (z@zeerak.org) |
| Hate Speech A 2016 | Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter (Waseem 2016) | 16,849 | Yes (z@zeerak.org) |
| Hate Offensive 2017 | Automated Hate Speech Detection and the Problem of Offensive Language (Davidson et al. 2017) | 24,783 | No |
| TRAC1 2018 | Aggression-annotated Corpus of Hindi-English Code-mixed Data (Kumar et al. 2018) | 14,537 | Yes (form) |
| ENCASE 2018 | Large Scale Crowdsourcing and Characterization of Twitter Abusive Behaviour (Founta et al. 2018) | 91,950 | No |
| MLMA 2019 | Multilingual and Multi-Aspect Hate Speech Analysis (Ousidhoum et al. 2019) | 5,593 | No |
| #MeTooMA 2020 | #MeTooMA: Multi-Aspect Annotations of Tweets Related to the MeToo Movement (Gautam et al. 2020) | 9,889 | Yes (akash15011@iiitd.ac.in) |
| HateXplain 2020 | HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection (Mathew et al. 2020) | 20,109 | No |
| Hate Speech Data 2017 (Only Whisper) | A Measurement Study of Hate Speech in Social Media (Mondal, Silva, and Benevenuto 2017) and Analyzing the Targets of Hate in Online Social Media (Silva et al. 2016) | 6,157 | No |
| Hateful Tweets 2022 | Pinpointing Fine-Grained Relationships between Hateful Tweets and Replies (Albanyan and Blanco 2022) | 1,141 | Yes (abdullahalbanyan@my.unt.edu) |
| Multiclass Hate Speech 2022 | Large-Scale Hate Speech Detection with Cross-Domain Transfer (Toraman, Sahinuç and Yilmaz 2022) | 102,076 | Yes (cagritoraman@gmail.com) |
| Measuring Hate Speech 2020 & 2022 | Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application (Kennedy et al. 2022) and The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism (Sachdeva et al. 2022) | 39,565 | No |
| BullyDetect 2018 | Using the Reddit Corpus for Cyberbully Detection (Bin Abdur Rakib and Soon 2018) | 6,562 | No |
| Intervene Hate 2019 | A Benchmark Dataset for Learning to Intervene in Online Hate Speech (Qian et al. 2019) | 45,170 | No |
| Slur Corpus 2020 | Towards a Comprehensive Taxonomy and Large-Scale Annotated Corpus for Online Slur Usage (Kurrek, Saleem, and Ruths 2020) | 39,960 | No |
| CAD 2021 | Introducing CAD: the Contextual Abuse Dataset (Vidgen et al. 2021) | 23,060 | No |
| ETHOS 2020 | ETHOS: an Online Hate Speech Detection Dataset (Mollas et al. 2020) | 998 | No |
| Hate in Online News Media 2018 | Anatomy of Online Hate: Developing a Taxonomy and Machine Learning Models for Identifying and Classifying Hate in Online News Media (Salminen et al. 2018) | 3,214 | No |
| Supremacist 2018 | Hate Speech Dataset from a White Supremacy Forum (Gibert et al. 2018) | 10,534 | No |
| The Gab Hate Corpus 2022 | Introducing the Gab Hate Corpus: defining and applying hate-based rhetoric to social media posts at scale (Kennedy et al. 2018) | 27,434 | No |
| HateComments 2023 | Hateful Comment Detection and Hate Target-Type Prediction for Video Comments (Gupta, Priyadarshi and Gupta 2023) | 2,070 | No |
| TRAC2 2020 | Developing a Multilingual Annotated Corpus of Misogyny and Aggression (Bhattacharya et al. 2020) | 5,329 | Yes (form) |
| Toxic Spans 2021 | SemEval-2021 Task 5: Toxic Spans Detection (Pavlopoulos et al. 2021) | 10,621 | No |
| Ex Machina 2016 | Ex Machina: Personal Attacks Seen at Scale (Wulczyn, Thain, and Dixon 2016) | 115,705 | No |
| Context Toxicity 2020 | Toxicity Detection: Does Context Really Matter? (Pavlopoulos et al. 2020) | 19,842 | No |
| Hugging Face and Kaggle | Hugging Face and Kaggle (Roshan Sharma and Ali Toosi) | 29,530 | No |
| Kaggle | Kaggle (Munki Albright) | 18,208 | No |
| Kaggle | Kaggle (SR) | 159,571 | No |
| Kaggle | Kaggle (Jigsaw) | 223,549 | No |

Other related work

Apart from the datasets included in MetaHate and referenced in our publication, we analyzed the following works.

Hate Lingo 2018 (ElSherief et al. 2018): Despite advocating for a shift away from keyword-based dataset creation methods, ElSherief et al. compiled their dataset by (1) using a lexicon sourced from Hatebase, (2) using various hate speech hashtags, (3) incorporating segments from the datasets of Davidson et al. 2017 and Waseem and Hovy 2016, together with 41 English hate tweets retrieved from the No Hate Speech Movement, and (4) integrating a random sample of tweets. They concentrated on examining online hate speech with a specific focus on its target: whether it was directed towards a particular individual or entity, or generalized in nature. Contributed: None (attempts to contact the authors yielded no response).

SWAD 2020 (Pamungkas et al. 2020): Pamungkas et al. integrated the OLID dataset (Zampieri et al. 2019) into their work, supplementing it by filtering entries from noswearing that contained at least one profanity. They focused on the hate speech topic and kept the binary classification scheme from OLID. However, we opted not to use this particular corpus, as all of its entries were already encompassed in the work by Zampieri et al. Contributed: None.

Dynamic Hate 2021 (Vidgen et al.): Vidgen et al.'s work explains the process of generating datasets with human input and details all steps and rounds. The study produced 41,135 synthetic entries classified as hate or non-hate. Contributed: None (synthetic data is outside the scope of our work).

Cyberbullying Personality 2018 (Tahmasbi et al. 2018): In this study, the authors collected 3,987 tweets associated with a contentious hashtag targeting a media personality. However, the methodology presented limitations: out of these, only 219 were identified as instances of cyberbullying, with no detailed annotation process; rather, all tweets containing the hashtag were indiscriminately labelled as cyberbullying. Contributed: None (we were unable to obtain the data).

AMI 2018 (Fersini et al. 2018, Anzovino et al. 2018): The primary objective of Automatic Misogyny Identification (AMI) is to discern misogynistic content from non-misogynistic content, among other tasks. In pursuit of this goal, the researchers curated a dataset consisting of 4,454 posts from Twitter, employing keyword-based selection, monitoring potential victims' accounts, and accessing the history of identified misogynistic accounts. While AMI proposes various tasks, our focus in this research centres on the first task: binary classification. Contributed: None (we were unable to access the data).

Mean Birds (Chatzakou et al. 2017): Chatzakou et al. curated a dataset by scraping tweets using different hashtags. Their objective was to identify instances of bullying and aggression from a collection of 9,484 entries, which included spam messages, normal ones, and aggressive or bullying tweets from Twitter. Contributed: None (attempts to contact the authors yielded no response).

Harassment Corpus 2018 (Rezvan et al. 2018): Rezvan et al. annotated a Twitter corpus comprising 25,000 entries, categorizing the content based on the type of harassment: sexual, racial, appearance-related, intellectual, and political. Contributed: None (attempts to contact the authors yielded no response).

Ambivalent Sexism 2017 (Jha and Mamidi 2017): Jha and Mamidi directed their attention to the analysis of sexist content on social media. Expanding upon the dataset introduced by Waseem and Hovy 2016, they augmented it by incorporating comments reflecting benevolent sexism. This extension involved a dataset comprising 22,142 tweets. Contributed: None (we were unable to obtain the data).

CONAN 2019-2022 (Bonaldi et al. 2022, Fanton et al. 2021, Chung et al. 2019): The CONAN project aims at tackling online hate speech through counter-narratives. Through their work, the authors have collected 8,883 synthetic pairs of hate conversations, labelled by target category: disabled people, Jews, LGBT+ people, migrants, Muslims, people of colour, and women. Although their creation strategy is quite complex, they used some annotated data and then augmented it using LMs. Contributed: None (synthetic data is outside the scope of our work).

Several other relevant works have been conducted on the topic of constructing hate datasets, including the efforts made by Glavas et al. 2020, who constructed a hate dataset by sampling data from three different collections (Kumar et al. 2018, Wulczyn et al. 2016, Gao and Huang 2017). Further efforts, such as Cercas Curry et al. 2021, developed an abuse detection dataset with conversation context, but for conversational AI. We did not include this dataset in our meta-collection, as we wanted to focus on social media content written by humans.

Various methodologies, such as the one proposed by Bretschneider et al. 2016, centred around analyzing comments from users within the video games World of Warcraft (WoW) and League of Legends (LoL). While this dataset specifically pertains to hateful comments, we opted not to incorporate it into our study. Our focus remains on comments made within the domain of social media, excluding other forms of private communication users may engage in on the internet.

Contributions to the realm of hate speech detection have been made by Röttger et al. 2021 and Kirk et al. 2022, who devised functional tests specifically designed for evaluating hate speech detection models. The latter builds upon the former, introducing advancements for detecting hate expressed through emojis. Despite both works generating synthetic datasets to assess their respective systems, we opted not to incorporate these datasets into our meta-collection. Our decision stems from our emphasis on curating textual content originating directly from human sources.

Furthermore, recent studies have extended the scope of hate speech detection by integrating contextual information derived from users' interactions. Wijesiriwardene et al. 2020 curated a dataset focusing on toxic social media interactions among high school students. This work contributes significantly to the hate speech detection domain, as it labels entire interactions as either toxic or non-toxic. Unfortunately, this labelling approach hinders our ability to extract smaller text segments labelled as hate or non-hate speech, so we opted not to include this resource in our collection.

Previous research, such as the work conducted by Suryawanshi et al. 2020, has delved into visual content like memes. In their study, they curated a multimodal offensive meme dataset centred around the 2016 U.S. presidential election and developed a classifier for this multimodal experiment. Interestingly, their findings revealed that textual input exerted a more significant influence on the outcomes than the accompanying images. We made a deliberate choice not to include this study in our analysis, as our focus is specifically on human-authored social media posts. Additionally, Das et al. 2023 have undertaken efforts to explore a multi-modal approach in detecting hate speech within visual content, with a specific focus on BitChute videos.

All these datasets hold significance in the realm of detecting harassment and toxic behaviour on the internet. However, they either deviate from our hate speech definition, lack the crucial social media element, or we were unable to obtain access to the data.

Data Usage

In order to use MetaHate, you need to agree to our Terms and Conditions. Moreover, to access the full data, we require the signed original Terms of Use of the works listed above. To acquire the textual collection, kindly complete the user agreements and submit them to paloma.piot@udc.es.

Disclaimer

This dataset includes content that may contain hate speech, offensive language, or other forms of inappropriate and objectionable material. The content present in the dataset is not created or endorsed by the authors or contributors of this project. It is collected from various sources and does not necessarily reflect the views or opinions of the project maintainers.

This dataset is intended for research, analysis, or educational purposes only. The authors do not endorse or promote any harmful, discriminatory, or offensive behaviour conveyed in the dataset.

Users are advised to exercise caution and sensitivity when interacting with or interpreting the dataset. If you choose to use the dataset, it is recommended to handle the content responsibly and in compliance with ethical guidelines and applicable laws.

The project maintainers disclaim any responsibility for the content within the dataset and cannot be held liable for how it is used or interpreted by others.

Acknowledgements

The authors acknowledge funding from the Horizon Europe research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 101073351. The authors also acknowledge the financial support provided by the Consellería de Cultura, Educación, Formación Profesional e Universidades (accreditation 2019-2022 ED431G/01, ED431B 2022/33) and the European Regional Development Fund, which acknowledges the CITIC Research Center in ICT of the University of A Coruña as a Research Center of the Galician University System, and the project PID2022-137061OB-C21 (Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación, Proyectos de Generación de Conocimiento; supported by the European Regional Development Fund). The authors also acknowledge the funding of project PLEC2021-007662 (MCIN/AEI/10.13039/501100011033, Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación, Plan de Recuperación, Transformación y Resiliencia, Unión Europea-Next Generation EU).