Bubble reachers and uncivil discourse in polarized online public sphere comments dataset

Kobellarz, Jordan K; Brocic, Milos; Silver, Daniel; Silva, Thiago

doi:10.5281/zenodo.10443022

Published December 29, 2023 | Version v1

Dataset Open


1. Universidade Tecnológica Federal do Paraná
2. McGill University
3. University of Toronto

This dataset contains comments in Portuguese and English gathered from various sources, including news websites from Brazil and Canada, social media sites such as Facebook and Reddit, e-commerce reviews, and Wikipedia comments. Each comment is accompanied by a "toxicity" score provided by the Perspective API.

Disclaimer: This file includes words or language that some readers may consider profane, vulgar, or offensive. Due to the topic studied in this article, quoting offensive language is academically justified, but neither we nor PLOS in any way endorse the use of these words or the content of the quotes. Likewise, the quotes do not represent our opinions or those of PLOS, and we condemn online harassment and offensive language.

Column information:

preprocessed_text: the text after undergoing preprocessing steps;
dataset: the given name of the dataset;
source: the dataset's source name;
dataset_source: a combination of the dataset name with its source to facilitate data aggregation tasks;
TOXICITY: a continuous score between 0.0 and 1.0 provided by the Perspective API.
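As an illustration of how these columns might be used together, the following minimal sketch streams the comments file and computes the number of scored comments and the mean TOXICITY per dataset_source group. It assumes the CSV is comma-delimited, UTF-8 encoded, and has a header row with the column names listed above; none of this is confirmed by the record itself. The function names are hypothetical, and rows with an empty score are skipped on the assumption that such rows may exist.

```python
import csv
from collections import defaultdict


def mean_toxicity_by_group(rows, group_col="dataset_source"):
    """Return {group: (count, mean TOXICITY)} for an iterable of row dicts.

    Hypothetical helper; column names follow the list above.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        raw = row.get("TOXICITY")
        if raw is None or raw == "":
            continue  # assumption: some rows may lack a score
        totals[row[group_col]] += float(raw)
        counts[row[group_col]] += 1
    return {g: (counts[g], totals[g] / counts[g]) for g in counts}


def summarize_file(path, group_col="dataset_source"):
    """Stream the CSV row by row so the large file is never fully loaded."""
    # newline="" lets the csv module handle line breaks inside quoted comments.
    # encoding="utf-8" is an assumption about the file.
    with open(path, newline="", encoding="utf-8") as fh:
        return mean_toxicity_by_group(csv.DictReader(fh), group_col)
```

Calling summarize_file("Bubble_Reachers_and_Uncivil_Discourse_2023_comments.csv") would return one entry per dataset_source value; passing group_col="dataset" or group_col="source" aggregates at a coarser level. Reading row by row keeps memory use low for a file of this size. If very long comments trigger a field size error, csv.field_size_limit can be raised before reading.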

In addition to the comments, there is a spreadsheet containing analyses referenced in the article associated with this dataset.

Files (378.9 MB)

Name	MD5 checksum	Size
Bubble_Reachers_and_Uncivil_Discourse_2023_comments.csv	90c3d4f3da3e04bf002d1d38f66304e8	378.9 MB
Bubble_Reachers_and_Uncivil_Discourse_2023_stats.xlsx	c91f6d8481dde11185b98315cbb54802	24.2 kB
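
The MD5 checksums listed for the two files can be used to confirm that a download completed without corruption. The sketch below is one possible way to do this with the Python standard library; the function names and the EXPECTED_MD5 dictionary are hypothetical, and the local paths are assumed to match the file names on this record.

```python
import hashlib

# Checksums copied from the file listing on this record.
EXPECTED_MD5 = {
    "Bubble_Reachers_and_Uncivil_Discourse_2023_comments.csv":
        "90c3d4f3da3e04bf002d1d38f66304e8",
    "Bubble_Reachers_and_Uncivil_Discourse_2023_stats.xlsx":
        "c91f6d8481dde11185b98315cbb54802",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are not read into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True when the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, matches_checksum(name, EXPECTED_MD5[name]) can be called for each downloaded file, assuming it was saved under its original name in the working directory. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.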

	All versions	This version
Views	224	224
Downloads	174	174
Data volume	55.3 GB	55.3 GB
