Vocabulary Tests LLMs
Description
Vocabulary evaluation of LLMs
This repository contains the results of the vocabulary tests run on the LLM tools/models presented in the paper "The continued usefulness of vocabulary tests for evaluating large language models", published in PLOS ONE: https://doi.org/10.1371/journal.pone.0308259
The file names correspond to the vocabulary tests whose results are presented in Tables 3-6 of the paper (note that the questions for the TOEFL test are not public).
In each file, the first column contains the question posed to the LLM tool/model and the second column contains the correct answer. The remaining columns contain the answers given by each of the LLM tools/models evaluated. The results and percentages are summarized at the bottom of the file, after the last test item.
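The column layout described above can be turned into per-model percentages with a few lines of Python. The following is a minimal sketch, not the authors' scoring procedure: it assumes the rows have already been loaded into lists of strings (the on-disk file format is not stated here), that the summary rows at the bottom have been excluded, and that an answer counts as correct when it matches the key after trimming whitespace and ignoring case. The function name `score_models` and the example data are hypothetical.

```python
def score_models(header, rows):
    """Return {model_name: percent_correct}.

    Assumed row layout (from the description above):
    [question, correct_answer, model_1_answer, model_2_answer, ...]
    """
    models = header[2:]
    correct = {m: 0 for m in models}
    total = 0
    for row in rows:
        question, key = row[0], row[1]
        # Skip blank or incomplete rows (assumption: test items always
        # have both a question and a correct answer).
        if not question.strip() or not key.strip():
            continue
        total += 1
        for model, answer in zip(models, row[2:]):
            # Assumed matching rule: case-insensitive exact match.
            if answer.strip().lower() == key.strip().lower():
                correct[model] += 1
    return {m: (100.0 * correct[m] / total if total else 0.0) for m in models}


# Hypothetical example data, not taken from the result files.
header = ["Question", "Correct answer", "Model A", "Model B"]
rows = [
    ["Which word means 'happy'?", "glad", "glad", "sad"],
    ["Which word means 'big'?", "large", "large", "large"],
]
scores = score_models(header, rows)
```

With real files, the matching rule would likely need adjusting to the answer format each test uses (for example, option letters versus full words).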
The models evaluated are:
| Model | Link |
|---|---|
| Llama 2 7b | https://huggingface.co/meta-llama/Llama-2-7b-chat |
| Llama 2 13b | https://huggingface.co/meta-llama/Llama-2-13b-chat |
| Llama 2 70b | https://huggingface.co/meta-llama/Llama-2-70b-chat |
| Mistral 7b v0.1 | https://huggingface.co/mistralai/Mistral-7B-v0.1 |
| GPT 3.5 turbo 0613 | https://platform.openai.com/docs/models/gpt-3-5-turbo |
| GPT 4 0613 | https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4 |
| Bard | |
To cite our work:
@article{10.1371/journal.pone.0308259,
doi = {10.1371/journal.pone.0308259},
author = {Martínez, Gonzalo AND Conde, Javier AND Merino-Gómez, Elena AND Bermúdez-Margaretto, Beatriz AND Hernández, José Alberto AND Reviriego, Pedro AND Brysbaert, Marc},
journal = {PLOS ONE},
publisher = {Public Library of Science},
title = {Establishing vocabulary tests as a benchmark for evaluating large language models},
year = {2024},
month = {12},
volume = {19},
url = {https://doi.org/10.1371/journal.pone.0308259},
pages = {1-17},
number = {12},
}
Files
| Name | Size | Checksum |
|---|---|---|
| LLM_Vocabulary_Evaluation.zip | 237.4 kB | md5:9419049130f5325ad4a2f587acfa3c8d |
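The md5 checksum listed for the archive can be used to confirm that a download is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `archive_is_intact` are hypothetical, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# Checksum as listed for LLM_Vocabulary_Evaluation.zip.
EXPECTED_MD5 = "9419049130f5325ad4a2f587acfa3c8d"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected md5."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("LLM_Vocabulary_Evaluation.zip")` would return `True` for an unmodified copy saved under that name in the current directory.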
Additional details
Software
- Repository URL: https://github.com/WordsGPT/LLM_Vocabulary_Evaluation