Published December 1, 2025 | Version 1.0
Dataset · Open Access

Usability Survey Dataset: ChatGPT for Role‑Play Language Practice

  • 1. Universidad Complutense de Madrid
  • 2. Aligarh Muslim University
  • 3. Universidad Complutense de Madrid Facultad de Informática
  • 4. Universidad Francisco de Vitoria

Description

Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice

This Zenodo record contains the raw survey responses collected for the study reported in:

Pablo Gervás, Carlos León, Mayuresh Kumar, Gonzalo Méndez, Susana Bautista (2025). Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice.
In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025), Volume 2, pp. 257–264. SciTePress. DOI: 10.5220/0013235400003932.
Paper PDF (publisher page): https://www.scitepress.org/publishedPapers/2025/132354/pdf/index.html

Authors

The dataset authors are the same as the paper authors:

  • Pablo Gervás
  • Carlos León
  • Mayuresh Kumar
  • Gonzalo Méndez
  • Susana Bautista

Contact author: Carlos León — cleon@ucm.es

ORCID identifiers (from the paper PDF)

  • Pablo Gervás — ORCID: https://orcid.org/0000-0003-4906-9837
  • Carlos León — ORCID: https://orcid.org/0000-0002-6768-1766
  • Mayuresh Kumar — ORCID: https://orcid.org/0000-0002-1728-7349
  • Gonzalo Méndez — ORCID: https://orcid.org/0000-0001-7659-1482
  • Susana Bautista — ORCID: https://orcid.org/0000-0003-1648-0208

Overview

Large Language Model (LLM) chatbots can sustain fluent dialogues and can be configured via prompts to play specific roles in an interaction. The related paper proposes and evaluates a prompting framework to make a chatbot:

  • propose conversational situations of appropriate complexity,
  • play a role in those situations,
  • monitor learner language, and
  • provide feedback proactively and on request.

This dataset provides the participant-level questionnaire data from a usability-focused user study of that approach.

What is in this record?

Files

  • Usability of ChatGPT(1-20).xlsx — raw survey export (one row per participant).

Unit of analysis

  • One row = one participant (N=20)

Variables

The spreadsheet includes:

  1. Administrative timestamps (start/end time),
  2. Participant background (gender; self-rated English skills),
  3. USE Questionnaire (Usefulness, Ease of Use, Ease of Learning, Satisfaction),
  4. System Usability Scale (SUS),
  5. Custom items about role-play language practice (clarifications, switching language, correction behavior, praise, perceived learning, perceived practice of specific skills),
  6. Open-ended comments (optional).

Quick descriptive summary (computed from the XLSX)

  • Participants: 20
  • Columns: 67
  • Collection date: 2024-05-31 (all responses collected on the same day, based on Hora de inicio)
  • Hora de inicio range: 2024-05-31 08:16:51 – 2024-05-31 18:56:06
  • Survey completion time (minutes):
    • mean ≈ 6.41
    • median ≈ 6.08
    • min ≈ 1.43, max ≈ 14.72
  • Gender (self-reported): Man=13, Woman=7
  • The Correo electrónico field is "anonymous" for all records in the provided file.
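
The descriptive summary above can be reproduced from the raw export with pandas. The sketch below wraps the computation in a helper and demonstrates it on a tiny synthetic frame (the two demo rows are illustrative, not part of the dataset); in practice, load the XLSX with `pd.read_excel("Usability of ChatGPT(1-20).xlsx", sheet_name="Sheet1")` first.

```python
import pandas as pd

def summarize(df: pd.DataFrame) -> dict:
    """Reproduce the record's descriptive summary from the raw export."""
    start = pd.to_datetime(df["Hora de inicio"])
    end = pd.to_datetime(df["Hora de finalización"])
    duration_min = (end - start).dt.total_seconds() / 60
    return {
        "participants": len(df),
        "columns": df.shape[1],
        "mean_min": round(duration_min.mean(), 2),
        "gender": df["Select your gender"].value_counts().to_dict(),
    }

# Tiny synthetic frame, for illustration only:
demo = pd.DataFrame({
    "Hora de inicio": ["2024-05-31 08:16:51", "2024-05-31 09:00:00"],
    "Hora de finalización": ["2024-05-31 08:23:51", "2024-05-31 09:06:00"],
    "Select your gender": ["Man", "Woman"],
})
print(summarize(demo))
```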

Instruments included

USE Questionnaire (30 items)

The dataset contains the 30-item USE Questionnaire (Likert 1–5). In this file, the items appear in the standard order and can be aggregated into the common subscales:

  • Usefulness (8 items): items 1–8 (from “It helps me be more effective” … “It does everything I would expect it to do”)
  • Ease of Use (11 items): items 9–19
  • Ease of Learning (4 items): items 20–23
  • Satisfaction (7 items): items 24–30

Suggested aggregation: mean (or sum) within each subscale (report which one you use).
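
Assuming the 30 USE columns appear in the standard order described above, the subscale means can be computed positionally. A sketch (the function and constant names are illustrative; the demo rows are synthetic):

```python
import pandas as pd

# Subscale boundaries by item position (1-based, per the record's description):
# Usefulness 1–8, Ease of Use 9–19, Ease of Learning 20–23, Satisfaction 24–30.
SUBSCALES = {
    "usefulness": slice(0, 8),
    "ease_of_use": slice(8, 19),
    "ease_of_learning": slice(19, 23),
    "satisfaction": slice(23, 30),
}

def use_subscale_means(use_items: pd.DataFrame) -> pd.DataFrame:
    """use_items: the 30 USE columns in standard order, one row per participant."""
    assert use_items.shape[1] == 30, "expected the 30 USE items"
    return pd.DataFrame({
        name: use_items.iloc[:, sl].mean(axis=1) for name, sl in SUBSCALES.items()
    })

# Illustration with two synthetic participants answering all 3s and all 5s:
demo = pd.DataFrame([[3] * 30, [5] * 30])
print(use_subscale_means(demo))
```

If you prefer sums, replace `.mean(axis=1)` with `.sum(axis=1)` and say so when reporting.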

System Usability Scale (SUS) (10 items)

The dataset includes the 10 SUS items (Likert 1–5). See Scoring below for the standard 0–100 calculation.

Custom pedagogical/usability items for language practice

The dataset includes additional questions about:

  • whether suggested topics enabled practice of the target feature,
  • clarification behavior (and whether clarifications were helpful),
  • switching to the native language and returning to Spanish afterward,
  • whether the chatbot corrected mistakes (and whether reminders were needed),
  • usefulness of corrections (open-ended),
  • praise, perceived learning, and perceived Spanish practice,
  • perceived practice of: conversation, writing, grammar, vocabulary.

Dataset structure (columns)

The spreadsheet has 67 columns. For convenience, they are grouped below.

1) Administrative / metadata

ID, Hora de inicio, Hora de finalización, Correo electrónico, Nombre, Hora de la última modificación

2) Participant background

Select your gender, and self-rated English skills (Speaking, Listening, Reading, Writing)

3) USE Questionnaire (30 items)

All columns from It helps me be more effective to It is pleasant to use

4) SUS (10 items)

All columns from I think that I would like to use this platform frequently to
I needed to learn a lot of things before I could get going with this platform

5) Role-play language-practice items (custom)

All columns from Did you feel that the chatbot proposed conversation topics... to Please, any comments more

Data dictionary (full column list)

Notes:

  • Some headers include trailing non‑breaking spaces (\xa0) and/or newlines (\n) from the form export.
  • The suggested short names below are provided to support scripting; they are not additional files.

Group XLSX column header (raw) Suggested short name Type Notes
administrative_metadata ID id integer Administrative / survey export metadata.
administrative_metadata Hora de inicio hora_de_inicio datetime Administrative / survey export metadata.
administrative_metadata Hora de finalización hora_de_finalización datetime Administrative / survey export metadata.
administrative_metadata Correo electrónico correo_electrónico string Administrative / survey export metadata.
administrative_metadata Nombre nombre string Administrative / survey export metadata.
administrative_metadata Hora de la última modificación hora_de_la_última_modificación datetime Administrative / survey export metadata.
participant_background Select your gender select_your_gender string Participant self-report background item.
participant_background Rate your Speaking English language knowledge level  rate_your_speaking_english_language_knowledge_level integer (1–5) Participant self-report background item.
participant_background Rate your Listening English language knowledge level  rate_your_listening_english_language_knowledge_level integer (1–5) Participant self-report background item.
participant_background Rate your Reading English language knowledge level  rate_your_reading_english_language_knowledge_level integer (1–5) Participant self-report background item.
participant_background Rate your Writing English language knowledge level  rate_your_writing_english_language_knowledge_level integer (1–5) Participant self-report background item.
use_questionnaire It helps me be more effective it_helps_me_be_more_effective integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It helps me be more productive it_helps_me_be_more_productive integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is useful it_is_useful integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It gives me more control over the activities in my life  it_gives_me_more_control_over_the_activities_in_my_life integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It makes the things I want to accomplish easier to get done it_makes_the_things_i_want_to_accomplish_easier_to_get_done integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It saves me time when I use it it_saves_me_time_when_i_use_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It meets my needs it_meets_my_needs integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It does everything I would expect it to do it_does_everything_i_would_expect_it_to_do integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is easy to use  it_is_easy_to_use integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is simple to use  it_is_simple_to_use integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is user friendly  it_is_user_friendly integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It requires the fewest steps possible to accomplish what I want to do with it  it_requires_the_fewest_steps_possible_to_accomplish_what_i_want_to_do_with_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is flexible it_is_flexible integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire Using it is effortless using_it_is_effortless integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I can use it without written instructions i_can_use_it_without_written_instructions integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I do not notice any inconsistencies as I use it i_do_not_notice_any_inconsistencies_as_i_use_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire Both occasional and regular users would like it both_occasional_and_regular_users_would_like_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I can recover from mistakes quickly and easily  i_can_recover_from_mistakes_quickly_and_easily integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I can use it successfully every time  i_can_use_it_successfully_every_time integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I learned to use it quickly  i_learned_to_use_it_quickly integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I easily remember how to use it  i_easily_remember_how_to_use_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is easy to learn to use it  it_is_easy_to_learn_to_use_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I quickly became skillful with it  i_quickly_became_skillful_with_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I am satisfied with it  i_am_satisfied_with_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I would recommend it to a friend i_would_recommend_it_to_a_friend integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is fun to use  it_is_fun_to_use integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It works the way I want it to work  it_works_the_way_i_want_it_to_work integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is wonderful it_is_wonderful integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire I feel I need to have it i_feel_i_need_to_have_it integer (1–5) USE Questionnaire item (Likert 1–5).
use_questionnaire It is pleasant to use  it_is_pleasant_to_use integer (1–5) USE Questionnaire item (Likert 1–5).
sus I think that I would like to use this platform frequently  i_think_that_i_would_like_to_use_this_platform_frequently integer (1–5) System Usability Scale item (Likert 1–5).
sus I found the platform unnecessarily complex i_found_the_platform_unnecessarily_complex integer (1–5) System Usability Scale item (Likert 1–5).
sus I thought the platform was easy to use  i_thought_the_platform_was_easy_to_use integer (1–5) System Usability Scale item (Likert 1–5).
sus I think that I would need the support of a technical person to be able to use this platform  i_think_that_i_would_need_the_support_of_a_technical_person_to_be_able_to_use_this_platform integer (1–5) System Usability Scale item (Likert 1–5).
sus I found the various functions in this platform were well integrated i_found_the_various_functions_in_this_platform_were_well_integrated integer (1–5) System Usability Scale item (Likert 1–5).
sus I thought there was too much inconsistency in the platform i_thought_there_was_too_much_inconsistency_in_the_platform integer (1–5) System Usability Scale item (Likert 1–5).
sus I would imagine that most people would learn to use this platform very quickly  i_would_imagine_that_most_people_would_learn_to_use_this_platform_very_quickly integer (1–5) System Usability Scale item (Likert 1–5).
sus I found the platform very cumbersome to use  i_found_the_platform_very_cumbersome_to_use integer (1–5) System Usability Scale item (Likert 1–5).
sus I felt very confident using the platform i_felt_very_confident_using_the_platform integer (1–5) System Usability Scale item (Likert 1–5).
sus I needed to learn a lot of things before I could get going with this platform  i_needed_to_learn_a_lot_of_things_before_i_could_get_going_with_this_platform integer (1–5) System Usability Scale item (Likert 1–5).
roleplay_language_practice_items Did you feel that the chatbot proposed conversation topics that allowed you to practice the targeted feature of the language? did_you_feel_that_the_chatbot_proposed_conversation_topics_that_allowed_you_to_practice_the_targeted_feature_of_the_language integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Did you request clarifications? did_you_request_clarifications integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Where they helpful? where_they_helpful string Open-ended text response.
roleplay_language_practice_items Did you need to ask for explanations in your native language? did_you_need_to_ask_for_explanations_in_your_native_language integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items If you asked for clarification in English, did the chatbot remember to return to Spanish after explaining? if_you_asked_for_clarification_in_english_did_the_chatbot_remember_to_return_to_spanish_after_explaining integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Did the chatbot remember to correct your mistakes? did_the_chatbot_remember_to_correct_your_mistakes integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Did you have to remind it to correct your mistakes? did_you_have_to_remind_it_to_correct_your_mistakes integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Where the corrections useful? where_the_corrections_useful string Open-ended text response.
roleplay_language_practice_items Did you at any point receive praise from the chatbot? did_you_at_any_point_receive_praise_from_the_chatbot integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Do you think you have learnt new languages skills? do_you_think_you_have_learnt_new_languages_skills integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Do you think the session helped you practice your Spanish? do_you_think_the_session_helped_you_practice_your_spanish integer (1–5) Custom study item about interaction/pedagogy (Likert 1–5).
roleplay_language_practice_items Conversation conversation integer (1–5) Perceived practice of this skill (Likert 1–5).
roleplay_language_practice_items Writing writing integer (1–5) Perceived practice of this skill (Likert 1–5).
roleplay_language_practice_items Grammar grammar integer (1–5) Perceived practice of this skill (Likert 1–5).
roleplay_language_practice_items Vocabulary vocabulary integer (1–5) Perceived practice of this skill (Likert 1–5).
roleplay_language_practice_items Please, any comments more please_any_comments_more string Open-ended text response.

Scoring and derived measures

SUS (0–100)

To compute the standard SUS score:

  • For items 1, 3, 5, 7, 9: contribution = (score − 1)
  • For items 2, 4, 6, 8, 10: contribution = (5 − score)
  • SUS score = (sum of contributions) × 2.5

Where the item order is the same as the column order in the XLSX under the SUS group.

USE subscales

A typical approach is to compute a mean (or sum) for each subscale:

  • Usefulness (8 items)
  • Ease of Use (11 items)
  • Ease of Learning (4 items)
  • Satisfaction (7 items)

Because the raw export does not embed the survey anchors, treat the scale as ordinal 1–5 and document your assumed anchors when reporting results.

Custom items

Most custom items are also stored as 1–5 numeric responses. Some are phrased as Yes/No questions but appear numerically coded; interpret them as degree/frequency unless you have access to the exact survey labels used in the form.

Recommended preprocessing

1) Preserve raw values

Keep the XLSX as the source of truth. If you export to CSV, store a copy of the exact export and record your preprocessing steps.

2) Clean column names (optional)

If you want to work with cleaner headers, you may:

  • strip whitespace,
  • replace non‑breaking spaces,
  • replace newlines with spaces,
  • map the raw headers to short snake_case names.
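
The cleaning steps above can be sketched in one small helper (the function name is illustrative; it reproduces the suggested short names in the data dictionary, including accented characters):

```python
import re

def clean_header(raw: str) -> str:
    """Map a raw form-export header to a short snake_case name."""
    s = raw.replace("\xa0", " ").replace("\n", " ")  # non-breaking spaces, newlines
    s = re.sub(r"\W+", "_", s.strip().lower())       # runs of punctuation/space -> _
    return s.strip("_")

print(clean_header("It is easy to use \xa0"))    # it_is_easy_to_use
print(clean_header("Hora de finalización"))      # hora_de_finalización
```

Applied to a pandas frame: `df.columns = [clean_header(c) for c in df.columns]`.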

3) Handle missingness

In the provided file:

  • Nombre and Hora de la última modificación are empty for all participants,
  • the open-ended Please, any comments more is missing for several participants.
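
A quick missingness check can confirm this before analysis. A sketch (the helper name is illustrative; the demo frame is synthetic, mimicking the gaps described above):

```python
import pandas as pd

def missingness_report(df: pd.DataFrame) -> pd.Series:
    """Count missing values per column, restricted to columns with any gaps."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)

# Synthetic illustration of the pattern described above:
demo = pd.DataFrame({
    "ID": [1, 2],
    "Nombre": [None, None],
    "Please, any comments more": ["Nice", None],
})
print(missingness_report(demo))
```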

Reproducible loading examples

Python (pandas)

import pandas as pd

df = pd.read_excel("Usability of ChatGPT(1-20).xlsx", sheet_name="Sheet1")

# Parse survey duration in minutes
start = pd.to_datetime(df["Hora de inicio"])
end = pd.to_datetime(df["Hora de finalización"])
df["duration_min"] = (end - start).dt.total_seconds() / 60

# SUS scoring (0–100)
sus_cols = [
    "I think that I would like to use this platform frequently ",
    "I found the platform unnecessarily complex",
    "I thought the platform was easy to use ",
    "I think that I would need the support of a technical person to be able to use this platform ",
    "I found the various functions in this platform were well integrated",
    "I thought there was too much inconsistency in the platform",
    "I would imagine that most people would learn to use this platform very quickly ",
    "I found the platform very cumbersome to use ",
    "I felt very confident using the platform",
    "I needed to learn a lot of things before I could get going with this platform ",
]

X = df[sus_cols].copy()
odd = [0, 2, 4, 6, 8]   # items 1,3,5,7,9
even = [1, 3, 5, 7, 9]  # items 2,4,6,8,10

sus_contrib = X.copy()
sus_contrib.iloc[:, odd] = sus_contrib.iloc[:, odd] - 1
sus_contrib.iloc[:, even] = 5 - sus_contrib.iloc[:, even]
df["sus_score_0_100"] = sus_contrib.sum(axis=1) * 2.5

R (readxl)

library(readxl)
df <- read_excel("Usability of ChatGPT(1-20).xlsx", sheet = "Sheet1")
dim(df)
head(df)

Ethics, privacy, and responsible reuse

  • The provided export is anonymized at the level of direct identifiers:
    Correo electrónico is "anonymous" for all records and Nombre is empty.
  • Free-text fields may still contain incidental personal information entered by participants.
    If you publish derived versions (cleaned CSV, annotated texts), consider re-checking those fields before release.
  • This dataset reflects self-reported perceptions (usability/learning impressions) and does not measure objective learning gains.

License

Choose a Zenodo license compatible with participant consent and institutional policy.

Common options:

  • CC BY 4.0 (recommended for open data when possible),
  • CC BY-NC 4.0 (if restricting commercial reuse is required),
  • CC BY-NC-ND 4.0 (more restrictive; may reduce reuse).

The associated paper is published under CC BY-NC-ND 4.0 according to the PDF footer.

How to cite

Dataset (this Zenodo record)

Replace placeholders after Zenodo assigns the DOI and version.

Recommended citation:

> Gervás, P., León, C., Kumar, M., Méndez, G., & Bautista, S. ([YEAR]). Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice (Version [VERSION]) [Data set]. Zenodo. https://doi.org/[ZENODO_DOI]

BibTeX (dataset)

@dataset{gervas_leon_kumar_mendez_bautista_zenodo_dataset,
  author       = {Pablo Gerv{\'a}s and Carlos Le{\'o}n and Mayuresh Kumar and Gonzalo M{\'e}ndez and Susana Bautista},
  title        = {Usability Survey Dataset: Prompted ChatGPT Role-Play for Language Practice},
  year         = {2025},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.18783531}
}

Related paper

Recommended citation:

> Gervás, P., León, C., Kumar, M., Méndez, G., & Bautista, S. (2025). Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice. In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025) – Volume 2, 257–264. SciTePress. https://doi.org/10.5220/0013235400003932

BibTeX (paper)

@conference{csedu25,
  author       = {Pablo Gerv{\'a}s and Carlos Le{\'o}n and Mayuresh Kumar and Gonzalo M{\'e}ndez and Susana Bautista},
  title        = {{Prompting an LLM Chatbot to Role Play Conversational Situations for Language Practice}},
  booktitle    = {Proceedings of the 17th International Conference on Computer Supported Education - Volume 2: CSEDU},
  year         = {2025},
  pages        = {257--264},
  publisher    = {SciTePress},
  organization = {INSTICC},
  doi          = {10.5220/0013235400003932},
  isbn         = {978-989-758-746-7}
}

Versioning and changelog

  • v1.0.0 — Initial release of the raw XLSX export (N=20).

Funding / acknowledgements (optional)

If you have project or institutional funding information, add it here (grant numbers, institutions, etc.).

Questions?

Please contact Carlos León (cleon@ucm.es).

Files (41.7 kB)

  • README.md
  • Usability of ChatGPT(1-20).xlsx

Additional details

Related works

Is described by
Conference paper: 10.5220/0013235400003932 (DOI)

Funding

Agencia Estatal de Investigación
DARK NITE PID2023-146308OB-I00
European Commission
EA-DIGIFOLK 101086338
Consejo de Seguridad Nuclear
ADARVE SUBV20/2021
Agencia Estatal de Investigación
CANTOR PID2019-108927RB- I00

Software

Development Status
Active