Published June 27, 2024 | Version v1
Dataset
Open
Data Donation with Dona: De-identified Messaging Data (WhatsApp and Facebook) and Evaluation Responses
Description
General information
The dataset contains de-identified messaging metadata from 78 WhatsApp and 7 Facebook data donations. The dataset was collected in an online study using the data donation platform Dona. After donating their messaging data, the study participants viewed visual summaries of their messaging data and evaluated this visual feedback. The responses to the evaluation questions and the sociodemographic data of the participants are also included in the dataset.
The data was collected from August 2022 to June 2024.
For more information on Dona, the associated publications and updates, please visit https://mbp-lab.github.io/dona-blog/.
File description
donation_table.csv
- contains general information about the donations, including
- donation_id: donation identifier
- donor_id: the ID of the donor, used to distinguish messages sent by the donor from those sent by contacts
- source: the messaging platform from which the data was donated (WhatsApp or Facebook)
- external_id: ID used to connect the messaging data with the survey data
messages_table.csv
- contains the donated messages, including
- conversation_id: chat identifier
- sender_id: sender identifier
- datetime: time of the message (UNIX time for Facebook, device time for WhatsApp)
- word_count: word count of the message, obtained by splitting the text on whitespace
- donation_id: donation identifier (also listed in donation_table.csv)
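The word_count column can be illustrated with a minimal sketch. It assumes that "splitting the text based on whitespace" behaves like Python's default `str.split()`; the function name `word_count` is hypothetical and this is not necessarily the code used by Dona.

```python
def word_count(text: str) -> int:
    """Count words by splitting the text on whitespace (assumed definition)."""
    # str.split() without arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and discards empty strings.
    return len(text.split())


if __name__ == "__main__":
    print(word_count("See you   at 8?\nOk!"))
```

Under this definition, punctuation attached to a word does not create extra words, and an empty message has a word count of 0.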
messages_filtered_table.csv
- same structure as messages_table.csv, except that chats without considerable interaction were removed. These were defined as chats in which the donor's share of the word count was less than 10% or more than 90%.
survey.xlsx
- contains the survey responses of the participants.
survey_table_coding.xlsx
- contains the mapping between the column names in survey.xlsx and their meaning, including the original survey questions and response options. Each sheet of the Excel file details the survey questions and responses in one of the study languages (English, German, Armenian).
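The filtering rule for messages_filtered_table.csv can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that a message belongs to the donor when its sender_id equals the donor_id of the corresponding donation, that chats are identified by the pair (donation_id, conversation_id), and that shares of exactly 10% or 90% are kept. The function name `filter_conversations` and its parameters are hypothetical.

```python
from collections import defaultdict


def filter_conversations(messages, donor_of_donation, low=0.10, high=0.90):
    """Keep messages from chats where the donor's word share lies in [low, high].

    messages: iterable of dicts with the columns of messages_table.csv.
    donor_of_donation: dict mapping donation_id to donor_id
                       (as listed in donation_table.csv).
    """
    messages = list(messages)
    donor_words = defaultdict(int)
    total_words = defaultdict(int)

    for m in messages:
        chat = (m["donation_id"], m["conversation_id"])
        total_words[chat] += m["word_count"]
        # Assumption: the donor's own messages carry sender_id == donor_id.
        if m["sender_id"] == donor_of_donation[m["donation_id"]]:
            donor_words[chat] += m["word_count"]

    kept = {
        chat
        for chat, total in total_words.items()
        if total > 0 and low <= donor_words[chat] / total <= high
    }
    return [m for m in messages if (m["donation_id"], m["conversation_id"]) in kept]
```

A chat in which the donor wrote 95 of 100 words would be dropped under this rule, as would one in which the donor wrote only 5 of 100.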
Files (1.0 GB)
Checksum | Size |
---|---|
md5:a53acfe20ff02a1ab192906a185bd81f | 7.7 kB |
md5:88f32bde7ad12bbc05a55157937be768 | 502.2 MB |
md5:31c48ff9d6ad7ed1d811f18c15d59984 | 534.4 MB |
md5:998ecc88ed94f26270f06cff0a7b739d | 12.5 kB |
md5:47ad7f7d1105fd538aa507263e291071 | 15.0 kB |
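The MD5 checksums listed for the files can be used to verify a download. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that the larger CSV files do not need to fit in memory. The function names `md5_of_file` and `matches_checksum` are hypothetical, and any path passed to them is a placeholder for wherever the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:a53acfe20ff02a1ab192906a185bd81f")` returns True only if the downloaded file is byte-for-byte identical to the published one.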
Additional details
Funding
- Empathische Künstliche Intelligenz 01IS20046
- Federal Ministry of Education and Research
Software
- Repository URL: https://github.com/mbp-lab/dona-brm
- Programming language: Python