Published January 5, 2026 | Version v2
Dataset Open

Leaderboard Spanish Language Benchmark for Artificial Intelligence Models (TELEIA)

  • 1. Universidad Autónoma de Madrid
  • 2. New York University
  • 3. Universidad de Valladolid
  • 4. SomosNLP
  • 5. Universidad Politécnica de Madrid

Description

TELEIA Datasets Leaderboard

This dataset contains the answers of different LLMs to the TELEIA (Spanish Language Benchmark for Artificial Intelligence Models) benchmark.

LLMs evaluated:

  • Yi-6B-Chat
  • Meta-Llama-3-8B-Instruct
  • Llama-2-7b-chat-hf
  • gemma-7b-it
  • Mistral-7B-Instruct-v0.1
  • occiglot-7b-es-en-instruct
  • GPT3.5
  • GPT4

Files:

  • TELEIA_Cervantes_AVE_results.xlsx: vocabulary and grammatical structures, following the format of the Cervantes AVE exam
  • TELEIA_PCE_results.xlsx: test on morphology and semantics resembling the style of the PCE exam, consisting of short questions or sentences to be completed
  • TELEIA_SIELE_results.xlsx: different texts with questions related to them, based on the reading comprehension task of the SIELE exam

Each .xlsx file contains one sheet per model with that model's results, with the following columns:

  • question: question from TELEIA
  • option_a: possible answer from TELEIA
  • option_b: possible answer from TELEIA
  • option_c: possible answer from TELEIA
  • option_d: possible answer from TELEIA
  • correct_answer: correct answer from TELEIA
  • llm_question: complete question posed to the LLM
  • tokens_in: list of tokens that make up the question
  • tokens_in_count: number of tokens that make up the question
  • llm_answer: raw answer from the LLM
  • llm_answer_filtered: answer from the LLM in the format {A,B,C,D}
  • tokens_out: list of tokens that make up the raw answer
  • tokens_out_count: number of tokens that make up the raw answer
  • word_count: number of words that make up the raw answer
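
As an illustration of how these columns can be used, the following is a minimal sketch that computes per-model accuracy by comparing llm_answer_filtered with correct_answer. It rests on assumptions not confirmed by this record: that each sheet name identifies one model, and that correct_answer is stored as a letter comparable to llm_answer_filtered. The helper names (accuracy, leaderboard, load_sheets) are hypothetical, and load_sheets needs pandas and openpyxl installed.

```python
from typing import Dict, Iterable, List, Mapping


def accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Fraction of rows whose llm_answer_filtered equals correct_answer.

    Assumption: both columns hold a letter such as "A".."D". The comparison
    ignores case and surrounding whitespace. Returns 0.0 for empty input.
    """
    total = 0
    correct = 0
    for row in rows:
        total += 1
        expected = str(row["correct_answer"]).strip().upper()
        given = str(row["llm_answer_filtered"]).strip().upper()
        if given == expected:
            correct += 1
    return correct / total if total else 0.0


def leaderboard(sheets: Mapping[str, List[Mapping[str, object]]]) -> Dict[str, float]:
    """Map each sheet (assumed: one model) to its accuracy, best first."""
    scores = {model: accuracy(rows) for model, rows in sheets.items()}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


def load_sheets(path: str) -> Dict[str, List[dict]]:
    """Read every sheet of an .xlsx file into a list of row dictionaries."""
    import pandas as pd  # imported lazily; requires pandas and openpyxl

    # sheet_name=None makes read_excel return a dict of sheet name -> DataFrame.
    frames = pd.read_excel(path, sheet_name=None)
    return {name: df.to_dict(orient="records") for name, df in frames.items()}
```

With the files downloaded locally, `leaderboard(load_sheets("TELEIA_PCE_results.xlsx"))` would return the models ordered by accuracy under the assumptions above. Rows where the filtered answer is empty are counted as incorrect.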

 

Files

README.md

Files (1.2 MB)

  • md5:0943a2763ca763c1050f732ebbea509e (6.4 kB)
  • md5:98deb31aae90173f9807cd687c370d65 (13.0 kB)
  • md5:1883ffff3883b39e65eeb8606a8156e5 (9.1 kB)
  • md5:4dbbc9d2b36418d60991c0211ef7b0be (5.8 kB)
  • md5:8c9039581d3536371ef29693cdb178ab (386.5 kB)
  • md5:19161eae9195c89be4aa56906ea21ab7 (391.0 kB)
  • md5:86f0a93838772e4bb405a039339ae637 (419.8 kB)
  • md5:88d5580d63ce1601d3ada0996f0fae98 (6.5 kB)
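
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard hashlib module. The function names md5_of and matches are hypothetical, and which checksum belongs to which file is not stated here, so the path and expected value must be supplied by the reader.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected: str) -> bool:
    """Compare a file's digest with a listed value such as "md5:0943a2...".

    An optional "md5:" prefix is accepted, and the comparison ignores case.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of(path) == expected_hex
```

For example, `matches(local_path, "md5:8c9039581d3536371ef29693cdb178ab")` returns True only if the local file has exactly that digest.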

Additional details

Funding

Agencia Estatal de Investigación
Fun4Date PID2022-136684OB-C21/C22
European Commission
SMARTY 101140087
OpenAI (United States)
Researcher Access Program