The dataset and software for Simulating Human Responses to Environmental Messaging by Dr Ian Drumm and Dr Atefeh Tate, University of Salford, UK.
Description
This dataset will be referenced from a corresponding journal article.
The data presented pertains to ongoing work to implement and evaluate virtual humans whose responses to environmental messaging are shaped by their media diets and social interactions. The project scraped thousands of social media post–comment pairs related to environmental issues, classified them by viewpoint through the large-scale orchestration of multiple instances of large language models, and built a vector database of embedded interactions with associated classification metadata to serve as a knowledge source for a chatbot. Dynamic, metadata-based filtering of this knowledge source, in conjunction with retrieval-augmented generation, enabled a chatbot with selectable personas that generate responses to new social media posts based on stereotypical attitudes grounded in current news and zeitgeists. A qualitative and quantitative evaluation was conducted to demonstrate the validity of the approach, though its full potential remains to be explored.
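The combination of metadata-based filtering and retrieval described above can be sketched in a few lines. This is an illustrative toy, not the CLIMATE_BOT implementation; the record fields (`category`, `embedding`, `text`) and the `retrieve` function are assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_embedding, records, persona_category, k=3):
    """Filter the knowledge base by persona category metadata,
    then rank the remaining post/comment pairs by embedding similarity."""
    filtered = [r for r in records if r["category"] == persona_category]
    ranked = sorted(filtered,
                    key=lambda r: cosine(query_embedding, r["embedding"]),
                    reverse=True)
    return [r["text"] for r in ranked[:k]]

# Toy knowledge base of embedded post/comment pairs with classification metadata.
kb = [
    {"category": "Sceptical", "embedding": [1.0, 0.0], "text": "sceptical reply"},
    {"category": "Concerned", "embedding": [1.0, 0.0], "text": "concerned reply"},
    {"category": "Concerned", "embedding": [0.0, 1.0], "text": "off-topic reply"},
]
print(retrieve([0.9, 0.1], kb, "Concerned", k=1))  # → ['concerned reply']
```

The retrieved texts would then be passed as RAG context to the chatbot, so that each persona only "sees" interactions matching its viewpoint category.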
Data Content & Compliance Note
Original Content Redaction: Due to strict data-redistribution restrictions in Reddit's Terms of Service and API policies, the original user-generated content (posts and comments) used to generate this dataset has been entirely redacted.
Data Provided: To enable analysis and evaluation of the metric scoring system, the dataset includes synthetic or "fake" comments and all associated quantitative and qualitative metrics (e.g., BERT scores, perplexity, and category justifications). This allows for the verification of scoring algorithms without violating the original content license.
The code for generating the database of scored comments is provided via the GitHub repository below.
Source Code
https://github.com/iduos/CLIMATE_BOT
Reddit Search
The data relates to a Reddit search and classification based on the following parameters:
| Parameter | Value | Notes / Description |
| --- | --- | --- |
| --subreddits | "worldnews, politics, conservative, liberal, libertarian" | List of subreddits to collect data from |
| --query | "climate change OR global warming OR net zero OR renewable energy OR carbon tax OR sea level rise OR extreme weather" | Search query terms used to filter posts |
| --start_date | "2025-01-01" | Start date for data collection |
| --end_date | "2025-11-01" | End date for data collection |
| --bin_by_period | "month" | Searches respective months in the search period |
| --scoring_prompts | "rubrics/climateUK4.json" | Prompt file for scoring climate-related discussions |
| --scorer_llm | "gemini-2.5-flash" | Model used for scoring content |
| --embed_model | "nomic-embed-text:latest" | Embedding model used for storing post/comment pairs |
| --sample_size | 5000 | Samples from 30,000+ items to build the database |
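Assuming a collection script that accepts the flags above (the script name `collect_and_score.py` is hypothetical, not taken from the repository), the full invocation could be assembled like this:

```python
import shlex

# Parameters copied from the table above; only the script name is illustrative.
params = {
    "--subreddits": "worldnews, politics, conservative, liberal, libertarian",
    "--query": ("climate change OR global warming OR net zero OR renewable energy "
                "OR carbon tax OR sea level rise OR extreme weather"),
    "--start_date": "2025-01-01",
    "--end_date": "2025-11-01",
    "--bin_by_period": "month",
    "--scoring_prompts": "rubrics/climateUK4.json",
    "--scorer_llm": "gemini-2.5-flash",
    "--embed_model": "nomic-embed-text:latest",
    "--sample_size": "5000",
}

argv = ["python", "collect_and_score.py"]
for flag, value in params.items():
    argv += [flag, value]

# Shell-quote the multi-word values so the command can be pasted into a terminal.
print(" ".join(shlex.quote(a) for a in argv))
```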
Classified Knowledge Base
An export of the vector database with 5,000 classified items is given in
climate_uk4_5000GF_database_export.csv
Rubric for classification
climateUK4.json
Evaluation Metrics
Evaluation metrics are given for 400 items sampled from the database (100 per category).
We conducted both qualitative and quantitative evaluations of the chatbot. For a given post, a real comment and a generated comment were compared. The real comment's id was used to filter it from the RAG context, ensuring that no real comment used in the evaluation contributed to formulating the chatbot's generated comment. Each generated comment was added to a new dataset of items of the form {post / real comment / chatbot comment}. The {real comment / chatbot comment} pairs were presented as reference and generated texts, respectively, to a variety of tools for calculating chatbot metrics, and averages were then computed. We aimed to assess linguistic similarity, semantic alignment, language predictability, and emotional alignment. Our automated evaluation used the following quantitative metrics: BERTScore F1, embedding similarity (ES), real comment perplexity (RP), chatbot comment perplexity (BP), emotional similarity (EM), and sentiment difference (SD). The baselines represent random pairings of real comments, serving as an empirical chance-level reference. The table gives the overall evaluation.
| Category | BERT F1 | ES | RP median [IQR] | BP median [IQR] | EM | SD |
| --- | --- | --- | --- | --- | --- | --- |
| Concerned | 0.706 ±0.026 | 0.204 ±0.132 | 69.6 [45.3-150.9] | 64.9 [45.1-96.1] | 0.722 ±0.152 | 0.658 ±0.427 |
| Sceptical | 0.702 ±0.032 | 0.182 ±0.126 | 92.6 [51.2-165.6] | 72.0 [51.4-154.4] | 0.736 ±0.135 | 0.553 ±0.386 |
| Paradoxical | 0.700 ±0.033 | 0.170 ±0.127 | 66.4 [44.3-129.7] | 65.3 [46.5-111.8] | 0.760 ±0.154 | 0.510 ±0.419 |
| Baselines (real comment pairs) | 0.695 | 0.129 | n/a | n/a | 0.757 | 0.559 |
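The per-category aggregates in the table (mean ± standard deviation) can be reproduced from the per-item evaluation CSVs with a sketch like the following. The inline CSV below is a toy stand-in; only the column name `bert_score_f1` is taken from the per-item fields documented in this README.

```python
import csv
import io
import statistics

def category_summary(csv_text, metric):
    """Mean and sample standard deviation of one metric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(r[metric]) for r in rows]
    return statistics.mean(values), statistics.stdev(values)

# Toy stand-in for one of the gf_<category>.csv files.
demo = """bert_score_f1,embedding_similarity
0.70,0.20
0.72,0.18
0.68,0.22
"""
mean, sd = category_summary(demo, "bert_score_f1")
print(f"BERT F1: {mean:.3f} ±{sd:.3f}")  # → BERT F1: 0.700 ±0.020
```

For the perplexity columns the same loop would report `statistics.median` and the interquartile range rather than mean ± standard deviation.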
Here is the data we used to ascertain these metrics.
/evaluation_data/
gf_concerned.csv
gf_sceptical.csv
gf_paradoxical.csv
gf_irrelevant.csv
Each evaluated item includes:
- the Reddit post
- the original comment, its classification by the LLM, and the LLM's justification
- the fake (bot-generated) comment, its classification by the LLM, and the LLM's justification
- metrics comparing the original and fake comments:
  - bert_score_f1, bert_score_precision, bert_score_recall
  - embedding_similarity
  - emotional_similarity
  - sentiment_difference
  - generated_perplexity (fake comment perplexity)
  - reference_perplexity (original comment perplexity)
Human Evaluation
Human evaluation was conducted by two coders on 100 samples equally distributed across the categories (Concerned, Paradoxical, Sceptical, and Irrelevant). The file gives the LLM (Gemini 2.5 Flash) classifications of original comments on Reddit posts alongside human classifications of the same comments. Both apply the rubric given in climateUK4.json.
/evaluation_data/
Majority_LLM_CAT_v_HUMAN.csv
FAKE_v_HUMAN gives the category filters used for generating the fake comments on Reddit posts, alongside human classifications of those fake comments applying the rubric given in climateUK4.json.
/evaluation_data/
Majority_FAKE_v_HUMAN.csv
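Agreement between the LLM and human classifications in these files can be summarised with Cohen's kappa. The minimal stdlib sketch below assumes the two label columns have already been read into aligned lists; it is not code from the repository.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy aligned labels standing in for the LLM and human columns.
llm   = ["Concerned", "Sceptical", "Concerned",   "Irrelevant"]
human = ["Concerned", "Sceptical", "Paradoxical", "Irrelevant"]
print(round(cohens_kappa(llm, human), 3))  # → 0.667
```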
Lineup tests
We evaluated whether chatbot comments were distinguishable from real Reddit comments using a lineup-style human judgment task (Wickham et al., 2010). For each post, panels of five comments were shown to raters, four real and one chatbot-generated, within one of four viewpoint categories (Sceptical, Paradoxical, Concerned, Irrelevant). Comments were anonymized and standardized, and the chatbot comment's position was randomized. Three raters each evaluated 50 panels (100 total trials) and selected the comment they believed was written by the chatbot. Performance near the chance level (1/5 = 20%) indicated that the chatbot's comments were operationally indistinguishable from human ones.
/evaluation_data/
lineup_CODER_A.csv
lineup_CODER_B.csv
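Lineup performance can be scored against the 1-in-5 chance level as follows. The data here is a toy stand-in, not the actual format of the lineup CSVs.

```python
import math

def lineup_accuracy(picks, truths):
    """Fraction of panels where the rater picked the chatbot's true position."""
    return sum(p == t for p, t in zip(picks, truths)) / len(picks)

def binomial_p_at_least(hits, n, p=0.2):
    """One-sided probability of scoring >= hits out of n under 1-in-5 chance."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(hits, n + 1))

# Toy example: 10 panels; the rater's pick vs the chatbot's true position (1-5).
picks  = [1, 4, 2, 3, 5, 1, 2, 2, 4, 3]
truths = [1, 4, 3, 3, 1, 2, 5, 5, 1, 5]
acc = lineup_accuracy(picks, truths)  # 0.3, near the 0.2 chance level
p_value = binomial_p_at_least(int(acc * len(picks)), len(picks))
print(acc, round(p_value, 3))
```

A large p-value here means the rater's hit rate is statistically consistent with guessing, i.e. the chatbot comments were not reliably identifiable.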
Files
Additional details
Software
- Repository URL: https://github.com/iduos/CLIMATE_BOT
- Programming language: Python
- Development Status: Active