Dataset Open Access

Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS

Gaur, Manas; Aribandi; Alambo; Kursuncu; Thirunarayan; Beich; Pathak; Sheth

Suicide is the 10th leading cause of death in the U.S. (1999-2019). However, predicting when someone will attempt or complete suicide has been nearly impossible. In the modern world, many individuals suffering from mental illness seek emotional support and advice on well-known and easily accessible social media platforms such as Reddit. While prior artificial intelligence research has demonstrated the ability to extract valuable information from social media on suicidal thoughts and behaviors, these efforts have not considered both the severity and the temporality of risk. The insights made possible by access to such data have enormous clinical potential, most dramatically envisioned as a trigger to employ timely and targeted interventions (i.e., voluntary and involuntary psychiatric hospitalization) to save lives. In this work, we address this knowledge gap by developing natural datasets of users experiencing suicide-related ideations, suicide-related behaviors, or suicide attempts (https://zenodo.org/record/2667859#.YCwdTR1OlQI), as manifested through their communication on r/SuicideWatch and associated mental health subreddits. Using the Columbia Suicide Severity Rating Scale (C-SSRS), a widely recognized questionnaire for assessing suicide risk severity, the domain experts in the study annotated 448 users with the following labels: Supportive (a new addition to C-SSRS, specific to social media), Suicide Ideation, Suicide Behavior, and Suicide Attempt. High standards in annotation were maintained, with a substantial inter-rater agreement of 0.76.

Files (36.8 MB)
Name | Size | Checksum
500_anonymized_Reddit_users_posts_labels - 500_anonymized_Reddit_users_posts_labels.csv | 3.6 MB | md5:3daf3175b8c85d17c88f96cf4597697d
Redditors_and_posts_batch_1.xlsx | 291.1 kB | md5:ac00635a93f05c2323be082073cb1a43
Redditors_and_posts_batch_2.xlsx | 294.3 kB | md5:d594431db867b4e24f0c6fe6e6474fda
Redditors_and_posts_batch_3.xlsx | 331.2 kB | md5:0de886e0ced9ba95bb53e194104334ae
Redditors_and_posts_batch_4.xlsx | 610.6 kB | md5:bdb9013f14e3ae82c4cb28bbb6094c0a
Redditors_and_posts_batch_5.xlsx | 264.5 kB | md5:211b5b6b378a4804bd94b3642b151fd6
Unlabeled_Dataset.json | 31.3 MB | md5:0c2b813f393d9f10468abddcfe3ad383
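
The MD5 checksums listed for each file can be used to confirm that a download arrived intact. The following is a minimal sketch of one way to do that in Python, not a tool provided with the dataset. It assumes the files were saved under the names shown in the listing, in a single local directory. The names `md5_of` and `verify_downloads` are hypothetical helpers introduced only for this illustration.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this record.
# Assumption: the files were saved locally under these exact names.
EXPECTED_MD5 = {
    "500_anonymized_Reddit_users_posts_labels - "
    "500_anonymized_Reddit_users_posts_labels.csv": "3daf3175b8c85d17c88f96cf4597697d",
    "Redditors_and_posts_batch_1.xlsx": "ac00635a93f05c2323be082073cb1a43",
    "Redditors_and_posts_batch_2.xlsx": "d594431db867b4e24f0c6fe6e6474fda",
    "Redditors_and_posts_batch_3.xlsx": "0de886e0ced9ba95bb53e194104334ae",
    "Redditors_and_posts_batch_4.xlsx": "bdb9013f14e3ae82c4cb28bbb6094c0a",
    "Redditors_and_posts_batch_5.xlsx": "211b5b6b378a4804bd94b3642b151fd6",
    "Unlabeled_Dataset.json": "0c2b813f393d9f10468abddcfe3ad383",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        print(("OK      " if ok else "MISMATCH") + "  " + name)
```

A file reported as a mismatch is either missing from the directory, saved under a different name, or different in content from the published copy, so downloading it again is the usual next step.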
Statistics | All versions | This version
Views | 731 | 731
Downloads | 783 | 783
Data volume | 2.0 GB | 2.0 GB
Unique views | 668 | 668
Unique downloads | 320 | 320
