Published November 3, 2023 | Version v2
Dataset Open

Dataset of suicidal ideation texts in Brazilian Portuguese - Boamente System

  • 1. Universidade Federal do Delta do Parnaíba
  • 2. Instituto Federal do Maranhão
  • 3. Instituto Federal de Educação, Ciência e Tecnologia do Ceará
  • 4. Universidade Federal do Maranhão

Description

We obtained non-clinical texts from tweets (user posts on the online social network Twitter). To find suicide-related tweets, we used the Twitter API to download tweets matching search terms associated with suicide. After several experiments to retrieve relevant texts, 5699 tweets were collected in May 2021. Each downloaded tweet came with user-specific information (for example, user ID, timestamp, language, location, and number of likes), but we kept only the post content (suicide-related texts) and discarded the additional data. As a result, all texts are anonymized.

After data collection, three psychologists were invited to annotate the data, each labeling every tweet independently. To avoid bias in the annotation process, we selected psychologists with different psychological approaches, namely cognitive behavioral theory, psychoanalytic theory, and humanistic theory. The professionals classified each tweet as either negative for suicidal ideation (annotated as 0) or positive for suicidal ideation (annotated as 1).

All tweets with at least one divergence between the psychologists (n = 1513) were excluded, leaving 4186 instances. A further 398 duplicate tweets were also excluded. The final dataset consists of 3788 instances: 2691 labeled negative and 1097 labeled positive.
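The filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the toy rows are made up, the function names are hypothetical, and duplicates are assumed to be exact text matches (the description does not say how duplicates were identified). The closing lines only re-check the arithmetic of the reported counts.

```python
from collections import Counter


def filter_unanimous(rows):
    """Keep rows whose annotators all gave the same label.

    `rows` is a list of (text, labels) pairs, where `labels` holds one
    0/1 label per annotator. Returns a list of (text, label) pairs.
    """
    kept = []
    for text, labels in rows:
        if len(set(labels)) == 1:  # no divergence between annotators
            kept.append((text, labels[0]))
    return kept


def drop_duplicates(pairs):
    """Drop repeated texts, keeping the first occurrence.

    Assumption: a duplicate is an exact text match.
    """
    seen = set()
    unique = []
    for text, label in pairs:
        if text not in seen:
            seen.add(text)
            unique.append((text, label))
    return unique


# Toy, made-up rows (not taken from the dataset).
toy = [
    ("text a", (1, 1, 1)),
    ("text b", (0, 1, 0)),  # divergence, excluded
    ("text c", (0, 0, 0)),
    ("text a", (1, 1, 1)),  # duplicate, excluded
]
final = drop_duplicates(filter_unanimous(toy))
label_counts = Counter(label for _, label in final)

# Arithmetic check on the counts reported in the description.
collected, divergent, duplicates = 5699, 1513, 398
after_agreement = collected - divergent
after_dedup = after_agreement - duplicates
assert after_agreement == 4186
assert after_dedup == 2691 + 1097
```

Requiring unanimous agreement trades dataset size for label reliability: about a quarter of the collected tweets are dropped at that step.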

 

Files

boamente_dataset.csv (349.6 kB)
md5:222acfd0582d89ad85a4005c62370e18
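A downloaded copy can be checked against the MD5 checksum listed above. The sketch below uses only Python's standard `hashlib`; the helper names and the local path in the usage note are assumptions for illustration.

```python
import hashlib

# Checksum as shown in the file listing for boamente_dataset.csv.
EXPECTED_MD5 = "222acfd0582d89ad85a4005c62370e18"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_listing("boamente_dataset.csv")` should return `True` for an intact download saved under that name in the working directory (the path is an assumption).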

Additional details

Related works

Is cited by
Conference proceeding: 10.1016/j.procs.2022.09.093 (DOI)

References

  • Diniz, Evandro J. S., José E. Fontenele, Adonias C. de Oliveira, Victor H. Bastos, Silmar Teixeira, Ricardo L. Rabêlo, Dario B. Calçada, Renato M. dos Santos, Ana K. de Oliveira, and Ariel S. Teles. 2022. "Boamente: A Natural Language Processing-Based Digital Phenotyping Tool for Smart Monitoring of Suicidal Ideation" Healthcare 10, no. 4: 698. https://doi.org/10.3390/healthcare10040698