Published July 12, 2025 | Version v2
Dataset | Open Access

Python Code and Dataset for An Empirical Study of ChatGPT-4o Use in Engineering Education: Prompting and Performance

  • National University of Distance Education

Description

Repository Overview

This repository contains the complete Python codebase and dataset used in the study:

An Empirical Study of ChatGPT-4o Use in Engineering Education: Prompting and Performance

The project investigates the relationship between AI prompting behaviors and academic performance among engineering students using ChatGPT-4o.

The GitHub repository includes:

  • Data preprocessing scripts

  • Metric calculation modules

  • Machine learning models and analysis pipelines

  • Code for figure generation and statistical tests

The full codebase is available at:

https://github.com/lisaza88/An-Empirical-Research-Study-of-ChatGPT-4o-Use-in-Engineering-Education

Dataset

This Zenodo record also includes the original dataset:

An Empirical Study of ChatGPT-4o Use in Engineering Education Prompting and Performance.xlsx

This file contains anonymized data used in the study, including:

  • AI interaction logs (prompts, responses, timestamps)

  • Written student submissions

  • Assignment grades

  • Computed metrics (e.g., structural complexity, content richness, query efficiency)

The dataset is shared in accordance with ethical and privacy standards and is intended for reproducibility and academic reuse.
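One of the computed metrics mentioned above, query efficiency, can be sketched as a simple ratio. This is an illustrative stand-in only; the repository's QE script may weight or normalize prompts differently:

```python
def query_efficiency(useful_responses: int, total_prompts: int) -> float:
    """Illustrative query-efficiency ratio in [0, 1].

    A hypothetical simplification of the study's Query Efficiency
    metric: the fraction of prompts that yielded a useful response.
    """
    if total_prompts == 0:
        return 0.0
    return useful_responses / total_prompts

print(query_efficiency(3, 5))  # 0.6
```

A student who reached a useful answer in fewer, more focused prompts scores closer to 1.0.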

Notes

Python Code Overview (By File Order)

This section describes the functionality of each script, in the order in which they appear in the GitHub repository:

  1. SC - compute_structural_complexity.txt
    Calculates the grammatical and syntactic complexity of student writing.

  2. RU - compute_semantic_novelty.txt
    Measures how semantically novel student responses are compared to typical AI outputs.

  3. RU - compute_response_utility.txt
    Assesses the helpfulness and task alignment of AI responses.

  4. RU - compute_contribution_final.txt
    Combines multiple utility-related dimensions to assess final AI contribution to the assignment.

  5. RU - compute_conceptual_transformation.txt
    Evaluates how well students transformed AI responses conceptually rather than copying them directly.

  6. README.md
    The repository description and usage instructions.

  7. QE - compute_query_efficiency.txt
    Calculates how efficiently students obtain useful answers relative to prompt count.

  8. QD - compute_query_depth.txt
    Aggregates lexical, structural, and logical depth of student prompts.

  9. QD - compute_multistep_depth.txt
    Scores prompts based on presence of multi-step reasoning or layered structure.

  10. QD - compute_lexical_structure.txt
    Evaluates lexical variety and formal characteristics of student prompts.

  11. QD - compute_focus_clarity.txt
    Measures how focused, goal-oriented, and unambiguous the prompts are.

  12. PS - compute_stepwise_alignment.txt
    Assesses whether student work reflects logical integration of AI-generated insights.

  13. PS - compute_problem_solving_score.txt
    Final score summarizing how well students used AI for analytical or problem-solving tasks.

  14. PS - compute_independent_expansion.txt
    Checks how much the student expanded upon or added new ideas beyond AI responses.

  15. PS - compute_conceptual_application.txt
    Measures how students applied AI suggestions within a relevant engineering context.

  16. PRD - compute_prompt_refinement_depth.txt
    Tracks iterative prompt modifications and semantic improvement.

  17. Interface Python Code.txt
    The local interface tool students used to interact with ChatGPT. Logs prompts/responses and emails data to the researcher.

  18. Final Dataset - final_export_to_excel.txt
    Exports all computed metrics and metadata into a final .xlsx file for analysis.

  19. Data merging - convert_grades.txt
    Converts raw Excel grade sheets into structured data.

  20. Data Merging - parse_ai_logs.txt
    Extracts prompts and responses from raw AI logs, removes duplicates, and formats them.

  21. Data Merging - merge all data.txt
    Merges all student data: logs, assignments, grades, and computed metrics.

  22. Data Merging - convert_assignments.txt
    Converts student .docx assignments to JSON with clean, tokenized text.

  23. CR - compute_content_richness.txt
    Assesses conceptual density and information content in student submissions.

  24. ARR - compute_text_similarity.txt
    Calculates similarity between student work and AI responses at the lexical level.

  25. ARR - compute_structural_similarity.txt
    Evaluates structural overlaps between student output and AI output.

  26. ARR - compute_query_submission_link.txt
    Links submitted work back to the queries that most influenced it.

  27. ARR - compute_prompt_response_consistency.txt
    Measures how logically aligned the student’s prompt is with the AI's response.

  28. ARR - compute_copy_paste_score.txt
    Detects copied or lightly modified content from AI responses.

  29. ARR - compute_ai_response_reliance.txt
    Aggregates all ARR metrics into a single AI Reliance Score.
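To illustrate the kind of comparison the ARR scripts perform (e.g., text similarity and copy-paste detection at the lexical level), here is a minimal sketch using `difflib` from the standard library. The function name and approach are assumptions for illustration; the repository's actual metrics may use different algorithms:

```python
from difflib import SequenceMatcher

def lexical_similarity(student_text: str, ai_text: str) -> float:
    """Illustrative lexical-overlap score in [0, 1].

    A hypothetical stand-in for the ARR text-similarity check:
    higher values suggest the submission closely mirrors AI output.
    """
    return SequenceMatcher(None, student_text.lower(), ai_text.lower()).ratio()

ai = "The second law of thermodynamics states that entropy increases."
student = "Entropy increases, as the second law of thermodynamics states."
print(round(lexical_similarity(student, ai), 2))
```

A score near 1.0 would flag copied or lightly modified content, while heavily transformed writing scores lower.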


Dataset Notes

The file An Empirical Study of ChatGPT-4o Use in Engineering Education Prompting and Performance.xlsx contains the full dataset used for the paper "An Empirical Study of ChatGPT-4o Use in Engineering Education: Prompting and Performance." Each row represents a student's session in a weekly engineering class, capturing both behavioral and performance data.

Columns in the dataset:

  1. Student ID – An anonymized unique identifier for each student (e.g., Systems Engineering_1).

  2. Course – The name of the engineering course the student was enrolled in (e.g., Systems Engineering, Environmental Engineering).

  3. Week – Indicates which week of the 16-week semester the session corresponds to.

  4. Attendance – Binary indicator of whether the student was present (1) or absent (0) for the session.

  5. AI Access – Binary indicator of whether the student had access to ChatGPT-4o during the session (1 for access, 0 for no access).

  6. Activity Type – The type of task completed during the session. This could be Case Study Analysis, Engineering Design Report, Multi-Step Engineering Problem-Solving, or Experimental Data Analysis.

  7. AI Query Count – The number of prompts the student submitted to ChatGPT during the session.

  8. AI Response Reliance Score – A percentage score representing how much of the student’s submitted work was derived from AI output.

  9. Assignment Score – The grade (out of 100) the student received for their work that week.

  10. AI Query Depth & Structure Score – A custom metric evaluating how complex, well-structured, and thoughtful the student’s prompts were.

  11. AI Query Efficiency Score – A metric indicating how efficiently students got useful responses using fewer and more focused queries.

  12. Prompt Refinement Depth Score – Captures how much the student iteratively refined or improved their prompts to get better responses.

  13. AI Response Complexity Score – Measures the linguistic and syntactic sophistication of ChatGPT’s responses.

  14. AI Response Utility Score – Evaluates the usefulness and relevance of the AI responses to the assigned task.

  15. AI-Driven Problem-Solving Score – A composite metric assessing how well the student integrated ChatGPT-generated content into their actual solution.

  16. Structural Complexity Score – Measures the structural sophistication of the student’s written work (e.g., use of transitions, logical flow).

  17. Content Richness Score – Assesses how information-dense and conceptually rich the student’s final submission was.

Each row is a unique session, meaning a single student will appear multiple times (once per week, assuming attendance), with different metrics depending on whether they had AI access and how they used it.
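Given this row-per-session layout, a typical first analysis is to compare grades with and without AI access. The sketch below uses hypothetical, invented rows that mirror the documented columns (the real values live in the .xlsx on this record):

```python
from statistics import mean

# Hypothetical session rows mirroring the documented schema;
# values are invented for illustration only.
sessions = [
    {"Student ID": "Systems Engineering_1", "Week": 1, "AI Access": 1,
     "Assignment Score": 88},
    {"Student ID": "Systems Engineering_1", "Week": 2, "AI Access": 0,
     "Assignment Score": 74},
    {"Student ID": "Environmental Engineering_2", "Week": 1, "AI Access": 1,
     "Assignment Score": 91},
]

def mean_score(rows, access):
    """Mean Assignment Score for sessions with the given AI Access flag."""
    return mean(r["Assignment Score"] for r in rows if r["AI Access"] == access)

print(mean_score(sessions, 1), mean_score(sessions, 0))  # 89.5 74
```

The same grouping extends naturally to the behavioral metrics (e.g., comparing Query Depth or Reliance scores across access conditions).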
