There is a newer version of the record available.

Published January 18, 2026 | Version v0.4.1
Software | Open

RISE-UNIBAS/humanities_data_benchmark

Description

This repository contains benchmark datasets (images and text), prompts, ground truths, and evaluation scripts for assessing the performance of large language models (LLMs) on humanities-related tasks. The suite is designed as a resource for researchers and practitioners interested in systematically evaluating how well various LLMs perform on digital humanities (DH) tasks involving visual and text-like materials. For detailed test results and model comparisons, visit our results dashboard at https://rise-services.rise.unibas.ch/benchmarks/.

Notes

If you use this software, please cite it using the metadata from this file.

Files (158.3 MB)

RISE-UNIBAS/humanities_data_benchmark-v0.4.1.zip
Size: 158.3 MB
md5:efe477ffb91162001a2a325d4f903fc0
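
The MD5 checksum listed for the archive can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the repository: the function names `md5_of_file` and `matches_expected` are hypothetical, and the local file name used in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as listed on the record page for the v0.4.1 archive.
EXPECTED_MD5 = "efe477ffb91162001a2a325d4f903fc0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat for a large archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved locally as `humanities_data_benchmark-v0.4.1.zip`, calling `matches_expected("humanities_data_benchmark-v0.4.1.zip")` should return `True` for an intact download. MD5 detects accidental corruption only; it is not a safeguard against deliberate tampering.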
