Dataset Open Access

Reddit Mental Health Dataset

Low, Daniel M.; Rumker, Laurie; Talker, Tanya; Torous, John; Cecchi, Guillermo; Ghosh, Satrajit S.


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Low, Daniel M.</dc:creator>
  <dc:creator>Rumker, Laurie</dc:creator>
  <dc:creator>Talker, Tanya</dc:creator>
  <dc:creator>Torous, John</dc:creator>
  <dc:creator>Cecchi, Guillermo</dc:creator>
  <dc:creator>Ghosh, Satrajit S.</dc:creator>
  <dc:date>2020-07-13</dc:date>
  <dc:description> 

This dataset contains posts from 28 subreddits (15 of them mental health support groups) from 2018 to 2020. We used this dataset to understand the impact of COVID-19 on mental health support groups from January to April 2020, and included older timeframes to obtain baseline posts from before COVID-19.

Please cite if you use this dataset:

Low DM, Rumker L, Talker T, Torous J, Cecchi G, Ghosh SS (2020). Natural language processing reveals vulnerable mental health support groups and heightened health anxiety on Reddit during COVID-19. PsyArXiv. https://doi.org/10.31234/osf.io/xvwcy

License

This dataset is made available under the Public Domain Dedication and License v1.0 whose full text can be found at: http://www.opendatacommons.org/licenses/pddl/1.0/

The data were downloaded using the Pushshift API. Reuse of this data is subject to the Reddit API terms.

 

Reddit Mental Health Dataset

Contains posts and text features for the following timeframes from 28 mental health and non-mental health subreddits:


	15 specific mental health support groups (r/EDAnonymous, r/addiction, r/alcoholism, r/adhd, r/anxiety, r/autism, r/bipolarreddit, r/bpd, r/depression, r/healthanxiety, r/lonely, r/ptsd, r/schizophrenia, r/socialanxiety, and r/suicidewatch)
	2 broad mental health subreddits (r/mentalhealth, r/COVID19_support)
	11 non-mental health subreddits (r/conspiracy, r/divorce, r/fitness, r/guns, r/jokes, r/legaladvice, r/meditation, r/parenting, r/personalfinance, r/relationships, r/teaching).


Filenames and corresponding timeframes:


	post: Jan 1 to April 20, 2020 (called "mid-pandemic" in manuscript; r/COVID19_support appears). Unique users: 320,364. 
	pre: Dec 2018 to Dec 2019. A full year which provides more data for a baseline of Reddit posts. Unique users: 327,289.
	2019: Jan 1 to April 20, 2019 (r/EDAnonymous appears). A control for seasonal fluctuations to match post data. Unique users: 282,560.
	2018: Jan 1 to April 20, 2018. A control for seasonal fluctuations to match post data. Unique users: 177,089.


Unique users across all time windows (pre and 2019 overlap): 826,961.

See manuscript Supplementary Materials (https://doi.org/10.31234/osf.io/xvwcy) for more information.

Note: if subsampling (e.g., to balance subreddits), we recommend bootstrapping analyses for unbiased results.
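
The note above can be illustrated with a minimal sketch of one possible reading: rather than analysing a single balanced subsample, redraw the subsample many times and summarise the spread of the resulting statistic. The function names, the toy feature values, and the choice of a per-subreddit mean are assumptions made for illustration; they are not taken from the manuscript or from the dataset files.

```python
import random
import statistics


def balanced_bootstrap(groups, n_per_group, n_iter=1000, seed=0):
    """Repeatedly draw a balanced resample and record each group's mean.

    groups: dict mapping a group name (e.g. a subreddit) to a list of numbers.
    Returns a dict mapping each group name to a list of n_iter resampled means.
    """
    rng = random.Random(seed)
    means = {name: [] for name in groups}
    for _ in range(n_iter):
        for name, values in groups.items():
            # Draw with replacement so every group contributes n_per_group values.
            sample = rng.choices(values, k=n_per_group)
            means[name].append(statistics.fmean(sample))
    return means


def percentile_interval(samples, lower=2.5, upper=97.5):
    """Return the (lower, upper) percentile bounds of a list of numbers."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    low_index = min(last, int(len(ordered) * lower / 100))
    high_index = min(last, int(len(ordered) * upper / 100))
    return ordered[low_index], ordered[high_index]


# Hypothetical feature values per subreddit (NOT real data from this dataset).
toy = {
    "r/anxiety": [0.2, 0.4, 0.6, 0.8] * 50,
    "r/fitness": [0.1, 0.2, 0.3] * 10,
}

resampled = balanced_bootstrap(toy, n_per_group=20, n_iter=500, seed=42)
for name, values in resampled.items():
    low, high = percentile_interval(values)
    print(name, round(statistics.fmean(values), 3), (round(low, 3), round(high, 3)))
```

Reporting an interval over many balanced resamples, instead of a single draw, makes the result less dependent on which posts happened to be selected. Whether sampling with or without replacement is more appropriate depends on the analysis.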

 </dc:description>
  <dc:identifier>https://zenodo.org/record/3941387</dc:identifier>
  <dc:identifier>10.17605/OSF.IO/7PEYQ</dc:identifier>
  <dc:identifier>oai:zenodo.org:3941387</dc:identifier>
  <dc:language>eng</dc:language>
  <dc:relation>doi:10.17605/OSF.IO/7PEYQ</dc:relation>
  <dc:relation>doi:10.31234/osf.io/xvwcy</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/covid-19</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/medicalnlp</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/natural-language-processing</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/zenodo</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>http://www.opendefinition.org/licenses/odc-pddl</dc:rights>
  <dc:subject>Natural Language Processing</dc:subject>
  <dc:subject>Mental Health</dc:subject>
  <dc:subject>Psychiatry</dc:subject>
  <dc:subject>COVID-19</dc:subject>
  <dc:subject>Reddit</dc:subject>
  <dc:subject>Social Media</dc:subject>
  <dc:title>Reddit Mental Health Dataset</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>