Published March 11, 2025 | Version 1.0
Dataset Open

XL-WSD-LLM: Extending XL-WSD to evaluate Large Language Models

  • 1. University of Bari Aldo Moro

Description

This benchmark extends XL-WSD. Starting from XL-WSD, we build a set of prompts for evaluating Large Language Models (LLMs) in two settings. The first is a multiple-choice task, and the second is a generative task in which we assess the quality of the generated definition.

The benchmark consists of three compressed archives. Two of them contain the training and test data for each task and language; the third contains the output of the LLMs that we evaluated. Each dataset is split into two folders, FT and TT. FT contains data without machine translation, while TT contains data in which missing glosses have been automatically translated.
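As a rough illustration of the FT/TT split, the sketch below groups the files of an already extracted archive by the variant folder they sit under. Only the folder names FT and TT come from the description; the function name `group_by_variant`, the extraction directory, and the assumption that FT and TT may appear at any depth of the tree are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_by_variant(root, variants=("FT", "TT")):
    """Map each variant folder name to the sorted list of files found under it.

    Assumption: a file belongs to a variant when one of its parent
    directories (relative to `root`) is named exactly like that variant.
    """
    root = Path(root)
    groups = defaultdict(list)
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Directory components between `root` and the file itself.
        parents = path.relative_to(root).parts[:-1]
        for variant in variants:
            if variant in parents:
                groups[variant].append(path)
                break
    # Files outside both variant folders are ignored.
    return {variant: sorted(groups[variant]) for variant in variants}
```

Calling `group_by_variant("extracted_archive")`, where `extracted_archive` is a placeholder for wherever an archive was unpacked, returns a dictionary with one list of paths for FT and one for TT.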

More details are available in the pre-print article "Exploring the Word Sense Disambiguation Capabilities of Large Language Models", published on arXiv.org.

Files

Files (394.5 MB)

MD5 checksum | Size
md5:bcdf35f090c761179d11ff65f6ceec47 | 51.8 MB
md5:59616a82ef2ac519a004cfa058439c5e | 4.1 MB
md5:c7c1918a3f86c428a5e267d2cc84f4f1 | 338.6 MB
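
The MD5 digests listed above can be used to check that a downloaded archive is intact. The following is a minimal sketch using Python's standard `hashlib`. The three digests are copied from the file listing, while the helper names `md5_of` and `is_listed` and any file path passed to them are hypothetical, since the archive names are not shown here.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
LISTED_MD5 = {
    "bcdf35f090c761179d11ff65f6ceec47",
    "59616a82ef2ac519a004cfa058439c5e",
    "c7c1918a3f86c428a5e267d2cc84f4f1",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large archives do not need
    to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest matches one of the listed digests."""
    return md5_of(path) in LISTED_MD5
```

For example, `is_listed("downloads/archive.zip")` (a placeholder path) returns `False` when the file is truncated or corrupted, in which case the download should be repeated.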

Additional details

Dates

Updated
2025-03-11