Benchmark Dataset for Environmental Values and Sustainability Alignment in Large Language Models

Kunkel, Stefanie; Hartwig, Tilman; Voss, Marcus; Schütt, Emma; Gellrich, Angelika

doi:10.5281/zenodo.20445903

Published May 29, 2026 | Version 1.0.1

Dataset Open

1. Research Institute For Sustainability – Helmholtz Centre Potsdam
2. German Federal Environment Agency
3. Birds on Mars
4. University of Potsdam

This repository contains a benchmark dataset for evaluating environmental values, sustainability-related attitudes, and behavioural recommendations in large language models (LLMs).

The dataset was developed to systematically assess environmental cognition, affect, and behavioural orientations expressed by LLMs across a broad set of sustainability-related prompts. It includes responses from 31 widely used proprietary and open-weight models evaluated under multiple prompting conditions.

The benchmark combines:

- questions derived from established environmental awareness surveys,
- sustainability-related behavioural measures,
- multilingual prompt formulations,
- comparative model evaluations,
- and derived sustainability-related indices.

The repository includes:

- a consolidated Excel workbook,
- machine-readable CSV exports for all sheets,
- YAML prompt definitions,
- documentation for index construction,
- and metadata intended to support FAIR and reproducible research practices.
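
As a starting point for working with the machine-readable exports, the following minimal sketch reads every CSV file inside csv_exports.zip into memory using only the Python standard library. The archive name comes from the file listing of this record; everything else is an assumption, in particular that the CSV files are UTF-8 encoded, comma-delimited, and carry a header row. The function name read_csv_exports is hypothetical and not part of the published code repository. No sheet or column names are assumed, so the sketch simply iterates over whatever the archive contains.

```python
import csv
import io
import zipfile


def read_csv_exports(zip_path):
    """Return {archive member name: list of row dicts} for every CSV in the zip."""
    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with archive.open(name) as raw:
                # Assumption: UTF-8 text (optional BOM), comma-delimited,
                # first row holds the column names.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


if __name__ == "__main__":
    import os

    # Only runs if the archive has been downloaded next to this script.
    if os.path.exists("csv_exports.zip"):
        for name, rows in read_csv_exports("csv_exports.zip").items():
            print(f"{name}: {len(rows)} rows")
```

All values are returned as strings; converting scores or index values to numbers is left to the caller, since the column layout is documented in README.txt and Explanation_Indices.txt rather than here.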

The dataset enables:

- comparative benchmarking of LLM sustainability alignment,
- analysis of environmental attitudes embedded in model outputs,
- investigation of contextual sensitivity and persona-based steering effects,
- and comparison between LLM responses and human survey benchmarks from Germany.

The benchmark is intended as a reusable framework for future research on AI governance, sustainability-related value alignment, steerability, and normative robustness in generative AI systems.

Further methodological details and analysis will be provided in the corresponding research paper on arXiv: https://arxiv.org/abs/2606.02741

The repository with the code to generate this dataset can be found here: https://gitlab.opencode.de/uba-ki-lab/llm-questionnaire-benchmarking-framework

Files

Files (171.2 kB)

Name	MD5 checksum	Size
00_master_prompts.yaml	96a3d2e4287e6d9b01e09d2a407111f4	2.0 kB
csv_exports.zip	741f13aa9582fbff6125f5052fe3b520	87.6 kB
Explanation_Indices.txt	b4e9367d9ed2e9bffbe09191ae7b4c97	1.4 kB
llm_role_prompting_dataset.xlsx	5605bc47dd9316d83eb4f3f99c43a479	76.9 kB
README.txt	ecb8fa5fdf453204e7cd5c2feb7d2410	3.3 kB
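
The MD5 checksums in the file listing can be used to confirm that a download is complete and unmodified. The sketch below is one way to do this with Python's standard library; the file names and checksum values are copied from the listing, while the helper names md5_of and verify_downloads are hypothetical, as is the assumption that all five files were saved into a single local directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 values copied from the file listing of this record.
EXPECTED_MD5 = {
    "00_master_prompts.yaml": "96a3d2e4287e6d9b01e09d2a407111f4",
    "csv_exports.zip": "741f13aa9582fbff6125f5052fe3b520",
    "Explanation_Indices.txt": "b4e9367d9ed2e9bffbe09191ae7b4c97",
    "llm_role_prompting_dataset.xlsx": "5605bc47dd9316d83eb4f3f99c43a479",
    "README.txt": "ecb8fa5fdf453204e7cd5c2feb7d2410",
}


def md5_of(path, chunk_size=65536):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Return {file name: 'ok' | 'mismatch' | 'missing'} for each expected file."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling verify_downloads(".") in the download directory returns one status per file. MD5 is used here only because it is the checksum the record publishes; it detects transfer errors but is not a safeguard against deliberate tampering.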

Additional details

Repository URL: https://gitlab.opencode.de/uba-ki-lab/llm-questionnaire-benchmarking-framework

	All versions	This version
Views	34	34
Downloads	45	45
Data volume	1.9 MB	1.9 MB