Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations
Description
Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations.
The dataset is useful for studying lexical aspects of LLMs under different parameter/role configurations.
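As an illustration of the kind of lexical analysis the dataset supports, a simple diversity measure such as the type-token ratio can be computed over each answer. This is only a sketch; the accompanying paper defines its own lexical-diversity metrics.

```python
# Sketch: type-token ratio (unique words / total words) over an answer text.
# This is an illustrative measure, not the metric used in the paper.
import re

def type_token_ratio(text: str) -> float:
    """Return the ratio of unique word tokens to total word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

ttr = type_token_ratio("the cat sat on the mat")  # 5 unique / 6 tokens
```

Higher ratios indicate that the model repeats itself less within an answer, which is one way to compare outputs across parameter and role configurations.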
- The 0_Base_Topics.xlsx file lists the topics used for the dataset generation.
- The rest of the files collect the models' answers to these topics under different configurations of parameters/context:
- Temperature (parameter): Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
- Frequency penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
- Top probability (parameter): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
- Presence penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
- Roles (context):
- Default: No role is assigned to the LLM; its default behavior is used.
- Child: The LLM is requested to answer as a five-year-old child.
- Young adult male: The LLM is requested to answer as a young male adult.
- Young adult female: The LLM is requested to answer as a young female adult.
- Elderly adult male: The LLM is requested to answer as an elderly male adult.
- Elderly adult female: The LLM is requested to answer as an elderly female adult.
- Affluent adult male: The LLM is requested to answer as an affluent male adult.
- Affluent adult female: The LLM is requested to answer as an affluent female adult.
- Lower-class adult male: The LLM is requested to answer as a lower-class male adult.
- Lower-class adult female: The LLM is requested to answer as a lower-class female adult.
- Erudite: The LLM is requested to answer as an erudite who uses a rich vocabulary.
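The parameter and role combinations above can be sketched as a chat-completion request payload. This is a hypothetical illustration: the role wording, model names, and parameter values shown are examples, not the exact prompts or settings used to generate the dataset.

```python
# Sketch: build one (topic, role, parameters) request. Roles are applied as
# system prompts; the wording here is illustrative only.
ROLE_PROMPTS = {
    "default": None,  # no system prompt: the model answers with its default persona
    "child": "Answer as a five-year-old child.",
    "erudite": "Answer as an erudite who uses a rich vocabulary.",
}

def build_request(topic, role="default", temperature=0.8, top_p=1.0,
                  frequency_penalty=0.0, presence_penalty=0.0):
    """Return a chat-completion payload for one cell of the configuration grid."""
    messages = []
    system = ROLE_PROMPTS.get(role)
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": topic})
    return {
        "model": "gpt-3.5-turbo",               # or gpt-4 / Llama3-8B / Mistral-7B
        "messages": messages,
        "temperature": temperature,              # 0.2 = focused, 0.8 = more random
        "top_p": top_p,                          # nucleus sampling probability mass
        "frequency_penalty": frequency_penalty,  # -2.0 .. 2.0
        "presence_penalty": presence_penalty,    # -2.0 .. 2.0
    }

req = build_request("Climate change", role="child", temperature=0.2)
```

Iterating `build_request` over every topic, role, and parameter value yields the full grid of configurations represented by the dataset files.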
Paper
- Paper: Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study
- Cite:
@article{10.1145/3696459,
author = {Mart\'{\i}nez, Gonzalo and Hern\'{a}ndez, Jos\'{e} Alberto and Conde, Javier and Reviriego, Pedro and Merino-G\'{o}mez, Elena},
title = {Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3696459},
doi = {10.1145/3696459},
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = sep,
keywords = {LLM, Lexical diversity, ChatGPT, Evaluation}
}
Files (46.5 MB)
| MD5 | Size |
|---|---|
| md5:73c20af0a18681f3da00f36af36fa41d | 25.6 kB |
| md5:a18f6ca007b32e892faf969ee0deb4ff | 2.5 MB |
| md5:86af04906f8f4ddefef51d36e5df380c | 2.8 MB |
| md5:813512e1041f74a1818e71673d19ee73 | 2.1 MB |
| md5:9dc67df880147e7480628e762440aae6 | 3.0 MB |
| md5:6cdc561af95e8bca5441618592fd021c | 5.0 MB |
| md5:9286bac6ac6d8912e5b3332489f93591 | 9.9 MB |
| md5:6a21d6912efb6922e5bad887e9a3f22f | 4.0 MB |
| md5:150c6d693bd655bf9a21b91b2d307029 | 2.7 MB |
| md5:851097698072a74ee2410156b483619c | 4.4 MB |
| md5:7b37975ad42c11b2a7251f4f2d410da3 | 5.1 MB |
| md5:d7bb6668678e32caccc56298c5f688a6 | 2.0 MB |
| md5:c3e7224d4f4cde9f4f6d6071caf2bef0 | 2.9 MB |
Additional details
Related works
- Is published in: 10.48550/arXiv.2402.15518 (DOI)