There is a newer version of the record available.

Published November 20, 2021 | Version 0.0.0
Dataset Open

Emotional and Cognitive Changes Surrounding Online Depression Identity Claims

  • 1. University of Michigan
  • 2. The University of Texas at Austin

Description

This repository contains the data files for our submitted paper, Emotional and Cognitive Changes Surrounding Online Depression Identity Claims. The files contain anonymized user IDs, timestamps, identity claim times, LIWC variables, a boolean distinguishing posts from comments, and a boolean distinguishing mental health subreddits from other subreddits. They are named ic_liwc.csv (users with identity claims) and control_liwc.csv (users without identity claims). The identity claims themselves are excluded from these files, but metadata about them is required to split users into groups, so we also provide a file for that purpose, ic_properties.csv.
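As a rough illustration of how the per-row timestamps and identity claim times might be used, the following sketch splits each user's rows into those written before the claim and those written at or after it. The column names (user_id, timestamp, identity_claim_time) are hypothetical, since this record does not list the actual CSV headers, and the sketch assumes both time columns hold numeric values such as Unix epoch seconds.

```python
import csv
from collections import defaultdict

# Hypothetical column names: the record does not list the actual CSV headers.
USER_COL = "user_id"
TIME_COL = "timestamp"
CLAIM_COL = "identity_claim_time"


def iter_rows(path):
    """Yield rows of a CSV file one at a time as dicts keyed by header name."""
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)


def split_around_claim(rows):
    """Group rows per user into those before and those at/after the claim."""
    before, after = defaultdict(list), defaultdict(list)
    for row in rows:
        # Assumption: both time columns are numeric (e.g. Unix epoch seconds).
        is_before = float(row[TIME_COL]) < float(row[CLAIM_COL])
        target = before if is_before else after
        target[row[USER_COL]].append(row)
    return before, after
```

With the archive extracted, calling split_around_claim(iter_rows("ic_liwc.csv")) would return two dictionaries keyed by user ID. Rows are read one at a time, but the grouped result is still held in memory, so a file of this size may call for filtering to a subset of users first.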

All code used in the analysis will be released before the article is published, along with documentation of our data collection process. The Reddit posts themselves will not be made widely available. In this we follow Cohan et al., whose data collection process we adopt and who release raw text data only to researchers upon request.

Prior to publication, we will also add n-grams for reproduction of the LLR analysis and the list of terms used in our robustness check.

Files

ic_data.zip (1.8 GB)
md5:b69df34f632c6e99a28273101a89a470
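
The md5 checksum listed for ic_data.zip can be used to confirm that a download completed intact. The sketch below is one way to do that with Python's standard library. The function names and the local file path are illustrative assumptions, not part of the record.

```python
import hashlib

# Checksum listed on this record for ic_data.zip.
EXPECTED_MD5 = "b69df34f632c6e99a28273101a89a470"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the 1.8 GB archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's md5 equals the checksum listed above."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, matches_published_checksum("ic_data.zip") should return True for an uncorrupted copy of the archive saved under that (assumed) local name.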