Conversational Networks For Automatic Online Moderation
- 1. Avignon Université
Description
Description. This repository contains several datasets of conversational networks, extracted from the chat messages exchanged by players of the SpaceOrigin MMORPG. Each graph represents a specific conversation, and belongs to one of two classes: Abusive (1) or Non-abusive (0). Vertices represent users, and edges represent the fact that the connected users exchanged messages during the considered time period. Edges are weighted and directed: weights represent the intensity of the message exchanges, and directions represent who sent messages to whom.
We provide two types of graphs: unsigned and signed. Unsigned graphs were extracted using the method described in paper [1], below. Version 1.0 of this dataset contains only a subset of the conversations, subsampled to obtain balanced classes. Version 1.1 is extended to contain all available conversations, so there are many more Non-abusive than Abusive conversations. Signed graphs were extracted later, using the method described in publication [9] below. In these graphs, each edge carries an additional sign that indicates the polarity of the messages exchanged by the two users: friendly (positive) vs. hostile (negative).
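As an illustration of the graph model described above, here is a minimal Python sketch of one conversation held in memory. It is not the file format used in the archive and not the authors' code: the `Edge` class, the helper functions, the user names and all numeric values are hypothetical, chosen only to show directed, weighted edges with an optional sign.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    """One directed edge of a conversational graph (hypothetical layout)."""
    source: str                 # user who sent the messages
    target: str                 # user who received them
    weight: float               # intensity of the exchange
    sign: Optional[int] = None  # +1 friendly, -1 hostile, None if unsigned


def out_strength(edges):
    """Sum the weights of the outgoing edges of each user."""
    totals = defaultdict(float)
    for edge in edges:
        totals[edge.source] += edge.weight
    return dict(totals)


def hostile_fraction(edges):
    """Share of signed edges that are negative; None for an unsigned graph."""
    signed = [edge for edge in edges if edge.sign is not None]
    if not signed:
        return None
    return sum(1 for edge in signed if edge.sign < 0) / len(signed)


# Toy conversation: made-up users and values, class label 1 (Abusive).
conversation = {
    "label": 1,
    "edges": [
        Edge("alice", "bob", 2.0, +1),
        Edge("bob", "alice", 1.5, +1),
        Edge("carol", "alice", 3.0, -1),
        Edge("alice", "carol", 0.5, -1),
    ],
}

strengths = out_strength(conversation["edges"])
ratio = hostile_fraction(conversation["edges"])
```

Leaving `sign` as `None` lets the same structure hold an unsigned graph, in which case `hostile_fraction` returns `None` instead of a ratio.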
These datasets were used to train a classifier to automatically recognize abusive messages. See the papers below for more details. The repository also contains some figures that appear in these papers.
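When the two classes are not balanced, one common option is to weight each class inversely to its frequency during training. The sketch below shows that generic heuristic in plain Python. It is an assumption for illustration only, not the procedure used in the papers, and the label counts are made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 8 Non-abusive (0) and 2 Abusive (1) conversations.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
```

With these made-up counts the minority class receives the larger weight, so each misclassified Abusive conversation counts more in a weighted loss.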
Publications. The following papers used the unsigned version of the conversational networks. The extraction method is described in paper [1].
- [1] É. Papegnies, V. Labatut, R. Dufour & G. Linarès. “Conversational Networks for Automatic Online Moderation,” IEEE Transactions on Computational Social Systems 6(1):38–55, 2019. ⟨hal-01999546⟩ DOI: 10.1109/tcss.2018.2887240
- [2] É. Papegnies, R. Dufour, V. Labatut & G. Linarès. “Détection de messages abusifs au moyen de réseaux conversationnels,” in 8ème Conférence sur les modèles et l'analyse de réseaux : approches mathématiques et informatiques (MARAMI), 2017. ⟨hal-01614279⟩
- [3] É. Papegnies, V. Labatut, R. Dufour & G. Linarès. “Graph-based Features for Automatic Online Abuse Detection,” in International Conference on Statistical Language and Speech Processing (SLSP), Springer, Lecture Notes in Computer Science 10583:70–81, 2017. ⟨hal-01571639⟩ DOI: 10.1007/978-3-319-68456-7_6
- [4] N. Cécillon. “Exploration de descripteurs de plongements de graphes pour la détection de messages abusifs,” MSc Thesis, Université d'Avignon, 2019. ⟨dumas-04073337⟩
- [5] N. Cécillon, V. Labatut, R. Dufour & G. Linarès. “Abusive Language Detection in Online Conversations by Combining Content- and Graph-based Features,” in International Workshop on Modeling and Mining Social-Media Driven Complex Networks, Frontiers in Big Data 2:8, 2019. ⟨hal-02130205⟩ DOI: 10.3389/fdata.2019.00008
- [6] N. Cécillon, V. Labatut, R. Dufour & G. Linarès. “Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations,” in 11ème Conférence sur les modèles et l'analyse de réseaux : approches mathématiques et informatiques (MARAMI), 2020. ⟨hal-02993571⟩
- [7] N. Cécillon, V. Labatut, R. Dufour & G. Linarès. “Graph embeddings for Abusive Language Detection,” Springer Nature Computer Science 2:37, 2021. ⟨hal-03042171⟩ DOI: 10.1007/s42979-020-00413-7
- [8] N. Cécillon, R. Dufour & V. Labatut. “Approche multimodale par plongements de texte et de graphes pour la détection de messages abusifs,” Traitement Automatique des Langues 62:13–38, 2021. ⟨hal-03527016⟩
The following publications use the signed version of the graphs. The modified extraction method is described in publication [9].
- [9] N. Cécillon. “Combining Graph and Text to Model Conversations: An Application to Online Abuse Detection,” PhD Thesis, Université d'Avignon, 2024. ⟨tel-04441308⟩
Funding. Part of this work was funded by a grant from the Provence-Alpes-Côte-d'Azur region (PACA, France) and the Nectar de Code company.
Citation. If you use this dataset, please cite paper [1] for the unsigned networks:
@Article{Papegnies2019,
author = {Papegnies, Étienne and Labatut, Vincent and Dufour, Richard and Linarès, Georges},
title = {Conversational Networks for Automatic Online Moderation},
journal = {IEEE Transactions on Computational Social Systems},
year = {2019},
volume = {6},
number = {1},
pages = {38-55},
doi = {10.1109/TCSS.2018.2887240},
}
and [9] for the signed ones:
@PhdThesis{Cecillon2024,
author = {Cécillon, Noé},
title = {Combining Graph and Text to Model Conversations: An Application to Online Abuse Detection},
school = {Université d'Avignon},
year = {2024},
type = {PhD Thesis},
address = {Avignon, FR},
url = {https://theses.fr/2024AVIG0100},
}
Files
Name | Size |
---|---|
SpaceOrigin_graphs.zip (md5:f66923268374b1322f47780958be3925) | 18.9 MB |
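The MD5 checksum listed with the archive can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the local path passed to `verify_archive` is assumed to be wherever the file was saved.

```python
import hashlib

# Checksum published on this page for SpaceOrigin_graphs.zip.
EXPECTED_MD5 = "f66923268374b1322f47780958be3925"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_archive("SpaceOrigin_graphs.zip")` should return `True` for an uncorrupted copy saved under that name in the current directory.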
Additional details
Related works
- Is documented by
- Journal article: 10.1109/tcss.2018.2887240 (DOI)
- Conference paper: 10.3389/fdata.2019.00008 (DOI)
- Is required by
- Conference paper: 10.1007/978-3-319-68456-7_6 (DOI)
- Journal article: 10.1007/s42979-020-00413-7 (DOI)
- Obsoletes
- Dataset: 10.6084/m9.figshare.7442273 (DOI)
Funding
- Conseil Régional Provence-Alpes-Côte d'Azur
Dates
- Created: 2017
- Updated: 2024
Software
- Repository URL: https://github.com/CompNet/Alert
- Development Status: Inactive