Published August 31, 2025 | Version v0.2.0
Software | Open Access

RISE-UNIBAS/humanities_data_benchmark

Description

This repository contains benchmark datasets (images), prompts, ground truths, and evaluation scripts for assessing the performance of large language models (LLMs) on humanities-related tasks. The suite is designed as a resource for researchers and practitioners interested in systematically evaluating how well various LLMs perform on digital humanities (DH) tasks involving visual materials. For detailed test results and model comparisons, visit our results dashboard at https://rise-unibas.github.io/humanities_data_benchmark/.
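The evaluation scripts themselves live in the repository; as a rough illustration of the kind of scoring such a suite performs, the sketch below compares a model-generated transcription of a benchmark image against a ground-truth transcription using a simple character error rate. All names, values, and file layouts here are hypothetical and do not reflect the repository's actual API.

```python
# Hypothetical sketch (not the repository's actual API): score a model's
# transcription of a benchmark image against its ground-truth text using
# character error rate (edit distance normalized by reference length).

def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, ground_truth: str) -> float:
    """CER = edit distance divided by the length of the ground truth."""
    if not ground_truth:
        return 0.0 if not prediction else 1.0
    return levenshtein(prediction, ground_truth) / len(ground_truth)


if __name__ == "__main__":
    # In a real run, `prediction` would come from an LLM prompted with a
    # benchmark image, and `ground_truth` from the dataset's reference file.
    prediction = "Lettre de M. Dupont, 12 mars 1887"
    ground_truth = "Lettre de M. Dupont, 12 mai 1887"
    print(f"CER: {character_error_rate(prediction, ground_truth):.3f}")
```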

Notes

If you use this software, please cite it using the metadata from this file.

Files (100.6 MB)

RISE-UNIBAS/humanities_data_benchmark-v0.2.0.zip
Size: 100.6 MB
md5: 0ee2c8e5b11294088099908ec9a090d9

Additional details

Related works