Usability Survey Dataset: ChatGPT for Role‑Play Language Practice
Description
Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice
This Zenodo record contains the raw survey responses collected for the study reported in:
Pablo Gervás, Carlos León, Mayuresh Kumar, Gonzalo Méndez, Susana Bautista (2025). Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice.
In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025), Volume 2, pp. 257–264. SciTePress. DOI: 10.5220/0013235400003932.
Paper PDF (publisher page): https://www.scitepress.org/publishedPapers/2025/132354/pdf/index.html
Authors
The dataset authors are the same as the paper authors:
- Pablo Gervás
- Carlos León
- Mayuresh Kumar
- Gonzalo Méndez
- Susana Bautista
Contact author: Carlos León — cleon@ucm.es
ORCID identifiers (from the paper PDF)
- Pablo Gervás — ORCID: https://orcid.org/0000-0003-4906-9837
- Carlos León — ORCID: https://orcid.org/0000-0002-6768-1766
- Mayuresh Kumar — ORCID: https://orcid.org/0000-0002-1728-7349
- Gonzalo Méndez — ORCID: https://orcid.org/0000-0001-7659-1482
- Susana Bautista — ORCID: https://orcid.org/0000-0003-1648-0208
Overview
Large Language Model (LLM) chatbots can sustain fluent dialogues and can be configured via prompts to play specific roles in an interaction. The related paper proposes and evaluates a prompting framework to make a chatbot:
- propose conversational situations of appropriate complexity,
- play a role in those situations,
- monitor learner language, and
- provide feedback proactively and on request.
This dataset provides the participant-level questionnaire data from a usability-focused user study of that approach.
What is in this record?
Files
- Usability of ChatGPT(1-20).xlsx: raw survey export (one row per participant).
Unit of analysis
- One row = one participant (N=20)
Variables
The spreadsheet includes:
- Administrative timestamps (start/end time),
- Participant background (gender; self-rated English skills),
- USE Questionnaire (Usefulness, Ease of Use, Ease of Learning, Satisfaction),
- System Usability Scale (SUS),
- Custom items about role-play language practice (clarifications, switching language, correction behavior, praise, perceived learning, perceived practice of specific skills),
- Open-ended comments (optional).
Quick descriptive summary (computed from the XLSX)
- Participants: 20
- Columns: 67
- Collection date: 2024-05-31 (all responses collected on the same day, based on Hora de inicio)
- Hora de inicio range: 2024-05-31 08:16:51 → 2024-05-31 18:56:06
- Survey completion time (minutes):
- mean ≈ 6.41
- median ≈ 6.08
- min ≈ 1.43, max ≈ 14.72
- Gender (self-reported): Man=13, Woman=7
- The Correo electrónico field is "anonymous" for all records in the provided file.
Instruments included
USE Questionnaire (30 items)
The dataset contains the 30-item USE Questionnaire (Likert 1–5). In this file, the items appear in the standard order and can be aggregated into the common subscales:
- Usefulness (8 items): items 1–8 (from “It helps me be more effective” to “It does everything I would expect it to do”)
- Ease of Use (11 items): items 9–19
- Ease of Learning (4 items): items 20–23
- Satisfaction (7 items): items 24–30
Suggested aggregation: mean (or sum) within each subscale (report which one you use).
System Usability Scale (SUS) (10 items)
The dataset includes the 10 SUS items (Likert 1–5). See Scoring below for the standard 0–100 calculation.
Custom pedagogical/usability items for language practice
The dataset includes additional questions about:
- whether suggested topics enabled practice of the target feature,
- clarification behavior (and whether clarifications were helpful),
- switching to the native language and returning to Spanish afterward,
- whether the chatbot corrected mistakes (and whether reminders were needed),
- usefulness of corrections (open-ended),
- praise, perceived learning, and perceived Spanish practice,
- perceived practice of: conversation, writing, grammar, vocabulary.
Dataset structure (columns)
The spreadsheet has 67 columns. For convenience, they are grouped below.
1) Administrative / metadata
ID, Hora de inicio, Hora de finalización, Correo electrónico, Nombre, Hora de la última modificación
2) Participant background
Select your gender, and self-rated English skills (Speaking, Listening, Reading, Writing)
3) USE Questionnaire (30 items)
All columns from It helps me be more effective to It is pleasant to use
4) SUS (10 items)
All columns from I think that I would like to use this platform frequently to I needed to learn a lot of things before I could get going with this platform
5) Role-play language-practice items (custom)
All columns from Did you feel that the chatbot proposed conversation topics... to Please, any comments more
Data dictionary (full column list)
Notes
- Some headers include trailing non‑breaking spaces (\xa0) and/or newlines (\n) from the form export.
- Suggested short names below are provided to support scripting; they are not additional files.
| Group | XLSX column header (raw) | Suggested short name | Type | Notes |
|---|---|---|---|---|
| administrative_metadata | ID | id | integer | Administrative / survey export metadata. |
| administrative_metadata | Hora de inicio | hora_de_inicio | datetime | Administrative / survey export metadata. |
| administrative_metadata | Hora de finalización | hora_de_finalización | datetime | Administrative / survey export metadata. |
| administrative_metadata | Correo electrónico | correo_electrónico | string | Administrative / survey export metadata. |
| administrative_metadata | Nombre | nombre | string | Administrative / survey export metadata. |
| administrative_metadata | Hora de la última modificación | hora_de_la_última_modificación | datetime | Administrative / survey export metadata. |
| participant_background | Select your gender | select_your_gender | string | Participant self-report background item. |
| participant_background | Rate your Speaking English language knowledge level | rate_your_speaking_english_language_knowledge_level | integer (1–5) | Participant self-report background item. |
| participant_background | Rate your Listening English language knowledge level | rate_your_listening_english_language_knowledge_level | integer (1–5) | Participant self-report background item. |
| participant_background | Rate your Reading English language knowledge level | rate_your_reading_english_language_knowledge_level | integer (1–5) | Participant self-report background item. |
| participant_background | Rate your Writing English language knowledge level | rate_your_writing_english_language_knowledge_level | integer (1–5) | Participant self-report background item. |
| use_questionnaire | It helps me be more effective | it_helps_me_be_more_effective | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It helps me be more productive | it_helps_me_be_more_productive | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is useful | it_is_useful | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It gives me more control over the activities in my life | it_gives_me_more_control_over_the_activities_in_my_life | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It makes the things I want to accomplish easier to get done | it_makes_the_things_i_want_to_accomplish_easier_to_get_done | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It saves me time when I use it | it_saves_me_time_when_i_use_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It meets my needs | it_meets_my_needs | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It does everything I would expect it to do | it_does_everything_i_would_expect_it_to_do | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is easy to use | it_is_easy_to_use | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is simple to use | it_is_simple_to_use | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is user friendly | it_is_user_friendly | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It requires the fewest steps possible to accomplish what I want to do with it | it_requires_the_fewest_steps_possible_to_accomplish_what_i_want_to_do_with_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is flexible | it_is_flexible | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | Using it is effortless | using_it_is_effortless | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I can use it without written instructions | i_can_use_it_without_written_instructions | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I do not notice any inconsistencies as I use it | i_do_not_notice_any_inconsistencies_as_i_use_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | Both occasional and regular users would like it | both_occasional_and_regular_users_would_like_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I can recover from mistakes quickly and easily | i_can_recover_from_mistakes_quickly_and_easily | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I can use it successfully every time | i_can_use_it_successfully_every_time | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I learned to use it quickly | i_learned_to_use_it_quickly | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I easily remember how to use it | i_easily_remember_how_to_use_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is easy to learn to use it | it_is_easy_to_learn_to_use_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I quickly became skillful with it | i_quickly_became_skillful_with_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I am satisfied with it | i_am_satisfied_with_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I would recommend it to a friend | i_would_recommend_it_to_a_friend | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is fun to use | it_is_fun_to_use | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It works the way I want it to work | it_works_the_way_i_want_it_to_work | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is wonderful | it_is_wonderful | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | I feel I need to have it | i_feel_i_need_to_have_it | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| use_questionnaire | It is pleasant to use | it_is_pleasant_to_use | integer (1–5) | USE Questionnaire item (Likert 1–5). |
| sus | I think that I would like to use this platform frequently | i_think_that_i_would_like_to_use_this_platform_frequently | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I found the platform unnecessarily complex | i_found_the_platform_unnecessarily_complex | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I thought the platform was easy to use | i_thought_the_platform_was_easy_to_use | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I think that I would need the support of a technical person to be able to use this platform | i_think_that_i_would_need_the_support_of_a_technical_person_to_be_able_to_use_this_platform | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I found the various functions in this platform were well integrated | i_found_the_various_functions_in_this_platform_were_well_integrated | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I thought there was too much inconsistency in the platform | i_thought_there_was_too_much_inconsistency_in_the_platform | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I would imagine that most people would learn to use this platform very quickly | i_would_imagine_that_most_people_would_learn_to_use_this_platform_very_quickly | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I found the platform very cumbersome to use | i_found_the_platform_very_cumbersome_to_use | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I felt very confident using the platform | i_felt_very_confident_using_the_platform | integer (1–5) | System Usability Scale item (Likert 1–5). |
| sus | I needed to learn a lot of things before I could get going with this platform | i_needed_to_learn_a_lot_of_things_before_i_could_get_going_with_this_platform | integer (1–5) | System Usability Scale item (Likert 1–5). |
| roleplay_language_practice_items | Did you feel that the chatbot proposed conversation topics that allowed you to practice the targeted feature of the language? | did_you_feel_that_the_chatbot_proposed_conversation_topics_that_allowed_you_to_practice_the_targeted_feature_of_the_language | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Did you request clarifications? | did_you_request_clarifications | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Where they helpful? | where_they_helpful | string | Open-ended text response. |
| roleplay_language_practice_items | Did you need to ask for explanations in your native language? | did_you_need_to_ask_for_explanations_in_your_native_language | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | If you asked for clarification in English, did the chatbot remember to return to Spanish after explaining? | if_you_asked_for_clarification_in_english_did_the_chatbot_remember_to_return_to_spanish_after_explaining | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Did the chatbot remember to correct your mistakes? | did_the_chatbot_remember_to_correct_your_mistakes | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Did you have to remind it to correct your mistakes? | did_you_have_to_remind_it_to_correct_your_mistakes | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Where the corrections useful? | where_the_corrections_useful | string | Open-ended text response. |
| roleplay_language_practice_items | Did you at any point receive praise from the chatbot? | did_you_at_any_point_receive_praise_from_the_chatbot | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Do you think you have learnt new languages skills? | do_you_think_you_have_learnt_new_languages_skills | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Do you think the session helped you practice your Spanish? | do_you_think_the_session_helped_you_practice_your_spanish | integer (1–5) | Custom study item about interaction/pedagogy (Likert 1–5). |
| roleplay_language_practice_items | Conversation | conversation | integer (1–5) | Perceived practice of this skill (Likert 1–5). |
| roleplay_language_practice_items | Writing | writing | integer (1–5) | Perceived practice of this skill (Likert 1–5). |
| roleplay_language_practice_items | Grammar | grammar | integer (1–5) | Perceived practice of this skill (Likert 1–5). |
| roleplay_language_practice_items | Vocabulary | vocabulary | integer (1–5) | Perceived practice of this skill (Likert 1–5). |
| roleplay_language_practice_items | Please, any comments more | please_any_comments_more | string | Open-ended text response. |
Scoring and derived measures
SUS (0–100)
To compute the standard SUS score:
- For items 1, 3, 5, 7, 9: contribution = (score − 1)
- For items 2, 4, 6, 8, 10: contribution = (5 − score)
- SUS score = (sum of contributions) × 2.5
Item numbers follow the column order of the SUS group in the XLSX.
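As a worked illustration of the three scoring steps, the sketch below scores a single respondent in plain Python. The function name `sus_score` and the example answers are hypothetical and are not taken from the dataset.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one respondent's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("responses must be on the 1-5 scale")
    total = 0
    for position, score in enumerate(responses, start=1):
        if position % 2 == 1:   # items 1, 3, 5, 7, 9
            total += score - 1
        else:                   # items 2, 4, 6, 8, 10
            total += 5 - score
    return total * 2.5

# Hypothetical respondent (illustrative values only)
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))  # odd items contribute 17, even items 17: 34 * 2.5 = 85.0
```

A respondent who answers 3 on every item scores 50, which is a convenient sanity check for any implementation.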
USE subscales
A typical approach is to compute a mean (or sum) for each subscale:
- Usefulness (8 items)
- Ease of Use (11 items)
- Ease of Learning (4 items)
- Satisfaction (7 items)
Because the raw export does not embed the survey anchors, treat the scale as ordinal 1–5 and document your assumed anchors when reporting results.
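The subscale means could be computed along the following lines. This is a minimal sketch: the helper `use_subscale_means`, the output column names, and the demo data are all hypothetical, and it assumes the caller passes the 30 USE column names in questionnaire order.

```python
import pandas as pd

# Subscale sizes in questionnaire order, as listed above (8 + 11 + 4 + 7 = 30)
USE_SUBSCALES = [
    ("usefulness", 8),
    ("ease_of_use", 11),
    ("ease_of_learning", 4),
    ("satisfaction", 7),
]

def use_subscale_means(df, use_cols):
    """Return one mean per subscale per participant.

    use_cols: the 30 USE column names in questionnaire order.
    """
    if len(use_cols) != 30:
        raise ValueError("expected 30 USE item columns")
    out = pd.DataFrame(index=df.index)
    start = 0
    for name, size in USE_SUBSCALES:
        block = use_cols[start:start + size]
        out[f"use_{name}_mean"] = df[block].mean(axis=1)  # missing answers are skipped
        start += size
    return out

# Synthetic demo data (not from the dataset): 2 participants, 30 items
cols = [f"item_{i}" for i in range(1, 31)]
demo = pd.DataFrame([[4] * 30, [5] * 8 + [3] * 11 + [2] * 4 + [1] * 7], columns=cols)
result = use_subscale_means(demo, cols)
print(result)
```

With the real file, one option is `use_cols = list(df.columns[11:41])`, assuming the XLSX column order matches the data dictionary (6 administrative and 5 background columns before the USE block); check the selected headers before relying on positions.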
Custom items
Most custom items are also stored as 1–5 numeric responses. Some are phrased as Yes/No questions but appear numerically coded; interpret them as degree/frequency unless you have access to the exact survey labels used in the form.
Recommended preprocessing
1) Preserve raw values
Keep the XLSX as the source of truth. If you export to CSV, store a copy of the exact export and record your preprocessing steps.
2) Clean column names (optional)
If you want to work with cleaner headers, you may:
- strip whitespace,
- replace non‑breaking spaces,
- replace newlines with spaces,
- map the raw headers to short snake_case names.
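One possible way to implement these cleaning steps is sketched below. The function `clean_header` is a hypothetical helper, not part of the dataset; it is written so that its output should match the suggested short names in the data dictionary (accented letters are kept, punctuation is dropped).

```python
import re
import unicodedata

def clean_header(raw):
    """Turn a raw form header into a short snake_case name."""
    text = raw.replace("\xa0", " ").replace("\n", " ")  # NBSP and newlines to spaces
    text = unicodedata.normalize("NFC", text).strip().lower()
    text = re.sub(r"[^\w\s]", "", text)                  # drop punctuation, keep letters and digits
    return re.sub(r"\s+", "_", text)                     # whitespace runs to underscores

print(clean_header("Did you request clarifications?\xa0"))  # did_you_request_clarifications
print(clean_header("Hora de finalización"))                 # hora_de_finalización
```

With pandas, the mapping can be applied as `df = df.rename(columns=clean_header)`. Keep the raw-to-clean mapping alongside any derived file so the original headers remain traceable.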
3) Handle missingness
In the provided file:
- Nombre and Hora de la última modificación are empty for all participants,
- the open-ended Please, any comments more is missing for several participants.
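A quick missingness check could look like the sketch below. The helper `missing_report` and the demo frame are hypothetical; the demo only mimics the pattern described above and does not contain real responses.

```python
import pandas as pd

def missing_report(df):
    """Count missing cells per column and flag columns that are entirely empty."""
    report = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "all_missing": df.isna().all(),
    })
    return report[report["n_missing"] > 0]

# Synthetic demo (not the real data)
demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Nombre": [None, None, None],
    "Please, any comments more": ["Great", None, None],
})
report = missing_report(demo)
print(report)

# Drop only the columns that are empty for every participant
analysis_df = demo.dropna(axis=1, how="all")
print(list(analysis_df.columns))
```

Dropping fully empty columns is a choice for analysis copies only; the raw XLSX should stay unchanged.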
Reproducible loading examples
Python (pandas)
import pandas as pd
df = pd.read_excel("Usability of ChatGPT(1-20).xlsx", sheet_name="Sheet1")
# Parse survey duration in minutes
start = pd.to_datetime(df["Hora de inicio"])
end = pd.to_datetime(df["Hora de finalización"])
df["duration_min"] = (end - start).dt.total_seconds() / 60
# SUS scoring (0–100)
sus_cols = [
    "I think that I would like to use this platform frequently ",
    "I found the platform unnecessarily complex",
    "I thought the platform was easy to use ",
    "I think that I would need the support of a technical person to be able to use this platform ",
    "I found the various functions in this platform were well integrated",
    "I thought there was too much inconsistency in the platform",
    "I would imagine that most people would learn to use this platform very quickly ",
    "I found the platform very cumbersome to use ",
    "I felt very confident using the platform",
    "I needed to learn a lot of things before I could get going with this platform ",
]
X = df[sus_cols].copy()
odd = [0, 2, 4, 6, 8] # items 1,3,5,7,9
even = [1, 3, 5, 7, 9] # items 2,4,6,8,10
sus_contrib = X.copy()
sus_contrib.iloc[:, odd] = sus_contrib.iloc[:, odd] - 1
sus_contrib.iloc[:, even] = 5 - sus_contrib.iloc[:, even]
df["sus_score_0_100"] = sus_contrib.sum(axis=1) * 2.5
R (readxl)
library(readxl)
df <- read_excel("Usability of ChatGPT(1-20).xlsx", sheet = "Sheet1")
dim(df)
head(df)
Ethics, privacy, and responsible reuse
- The provided export is anonymized at the level of direct identifiers: Correo electrónico is "anonymous" for all records and Nombre is empty.
- Free-text fields may still contain incidental personal information entered by participants. If you publish derived versions (cleaned CSV, annotated texts), consider re-checking those fields before release.
- This dataset reflects self-reported perceptions (usability/learning impressions) and does not measure objective learning gains.
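As a first pass before manually re-checking free-text fields, a simple pattern scan can surface obvious identifiers. The sketch below is a heuristic only: the regular expressions, the function `flag_possible_identifiers`, and the sample comments are all hypothetical, and it will not catch names or other indirect identifiers.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def flag_possible_identifiers(texts):
    """Return (index, text) pairs whose text matches an e-mail or phone-like pattern."""
    flagged = []
    for i, text in enumerate(texts):
        if not isinstance(text, str):
            continue  # skip missing answers (None/NaN)
        if EMAIL_RE.search(text) or PHONE_RE.search(text):
            flagged.append((i, text))
    return flagged

# Made-up comments for illustration
comments = ["Very useful", None, "write me at someone@example.org", "call 600 123 456"]
flagged = flag_possible_identifiers(comments)
print(flagged)
```

Any flagged row still needs a human decision, and an empty result does not mean the text is free of personal information.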
License
Choose a Zenodo license compatible with participant consent and institutional policy.
Common options:
- CC BY 4.0 (recommended for open data when possible),
- CC BY-NC 4.0 (if restricting commercial reuse is required),
- CC BY-NC-ND 4.0 (more restrictive; may reduce reuse).
The associated paper is published under CC BY-NC-ND 4.0 according to the PDF footer.
How to cite
Dataset (this Zenodo record)
Replace placeholders after Zenodo assigns the DOI and version.
Recommended citation:
> Gervás, P., León, C., Kumar, M., Méndez, G., & Bautista, S. ([YEAR]). Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice (Version [VERSION]) [Data set]. Zenodo. https://doi.org/[ZENODO_DOI]
BibTeX (dataset)
@dataset{gervas_leon_kumar_mendez_bautista_zenodo_dataset,
author = {Pablo Gerv{\'a}s and Carlos Le{\'o}n and Mayuresh Kumar and Gonzalo M{\'e}ndez and Susana Bautista},
title = {Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice},
year = {2025},
publisher = {Zenodo},
doi = {10.5281/zenodo.18783531}
}
Related paper
Recommended citation:
> Gervás, P., León, C., Kumar, M., Méndez, G., & Bautista, S. (2025). Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice. In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025) – Volume 2, 257–264. SciTePress. https://doi.org/10.5220/0013235400003932
BibTeX (paper)
@conference{csedu25,
author = {Pablo Gerv{\'a}s and Carlos Le{\'o}n and Mayuresh Kumar and Gonzalo M{\'e}ndez and Susana Bautista},
title = {{Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice}},
booktitle = {{Proceedings of the 17th International Conference on Computer Supported Education - Volume 2: CSEDU}},
year = {{2025}},
pages = {{257-264}},
publisher = {{SciTePress}},
organization = {{INSTICC}},
doi = {{10.5220/0013235400003932}},
isbn = {{978-989-758-746-7}}
}
Versioning and changelog
- v1.0.0 — Initial release of the raw XLSX export (N=20).
Funding / acknowledgements (optional)
If you have project or institutional funding information, add it here (grant numbers, institutions, etc.).
Questions?
Please contact Carlos León (cleon@ucm.es).
Additional details
Related works
- Is described by
- Conference paper: 10.5220/0013235400003932 (DOI)
Software
- Development Status
- Active