Prompts generated from ChatGPT3.5 and ChatGPT4 with NYT and HC3 topics in different roles and parameters configurations

Gonzalo, Martínez; José Alberto, Hernández; Javier, Conde; Pedro, Reviriego; Elena, Merino

doi:10.5281/zenodo.10646082

Published February 11, 2024 | Version v1

Dataset Open

Prompts generated from ChatGPT3.5 and ChatGPT4 with NYT and HC3 topics in different roles and parameters configurations

1. Universidad Carlos III de Madrid
2. Universidad Politécnica de Madrid
3. Universidad de Valladolid

Prompts generated from ChatGPT3.5 and ChatGPT4 with NYT and HC3 topics in different roles and parameter configurations.

The dataset is useful to study lexical aspects of LLMs with different parameters/roles configurations.

The 0_Base_Topics.xlsx file lists the topics used for the dataset generation
The rest of the files collect the answers of ChatGPT to these topics with different configurations of parameters/context:
- Temperature (parameter): Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
- Frequency penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
- Top probability (parameter): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
- Presence penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
- Roles (context)
  - Default: No role is assigned to the LLM, the default role is used.
  - Child: The LLM is requested to answer as a five-year-old child.
  - Young adult male: The LLM is requested to answer as a young male adult.
  - Young adult female: The LLM is requested to answer as a young female adult.
  - Elderly adult male: The LLM is requested to answer as an elderly male adult.
  - Elderly adult female: The LLM is requested to answer as an elderly female adult.
  - Affluent adult male: The LLM is requested to answer as an affluent male adult.
  - Affluent adult female: The LLM is requested to answer as an affluent female adult.
  - Lower-class adult male: The LLM is requested to answer as a lower-class male adult.
  - Lower-class adult female: The LLM is requested to answer as a lower-class female adult.
  - Erudite: The LLM is requested to answer as an erudite who uses a rich vocabulary.

Files

Files (32.5 MB)

Name	Size	Download all
0_Base_topics.xlsx md5:73c20af0a18681f3da00f36af36fa41d	25.6 kB	Download
Frequency_GPT35.xlsx md5:a18f6ca007b32e892faf969ee0deb4ff	2.5 MB	Download
Frequency_NYT_GPT4.xlsx md5:86af04906f8f4ddefef51d36e5df380c	2.8 MB	Download
Presence_GPT35.xlsx md5:813512e1041f74a1818e71673d19ee73	2.1 MB	Download
Presence_NYT_GPT4.xlsx md5:9dc67df880147e7480628e762440aae6	3.0 MB	Download
Roles_GPT35.xlsx md5:6cdc561af95e8bca5441618592fd021c	5.0 MB	Download
Roles_NYT_GPT4_.xlsx md5:150c6d693bd655bf9a21b91b2d307029	2.7 MB	Download
Temperature_GPT35.xlsx md5:851097698072a74ee2410156b483619c	4.4 MB	Download
Temperature_NTY_GPT4.xlsx md5:7b37975ad42c11b2a7251f4f2d410da3	5.1 MB	Download
Top_GPT35.xlsx md5:d7bb6668678e32caccc56298c5f688a6	2.0 MB	Download
Top_NYT_GPT4.xlsx md5:c3e7224d4f4cde9f4f6d6071caf2bef0	2.9 MB	Download

	All versions	This version
Views	335	136
Downloads	754	328
Data volume	2.5 GB	957.3 MB

Prompts generated from ChatGPT3.5 and ChatGPT4 with NYT and HC3 topics in different roles and parameters configurations

Creators

Description

Files

Files (32.5 MB)