Published July 27, 2022 | Version v1
Conference paper Open

Early prediction of student performance in a programming class using prior code submissions and metadata

  • 1. North Carolina State University

Description

Early prediction of student performance is a challenging research problem. In this study, we address this problem in the context of a programming class. We use a dataset containing code submissions and associated metadata from programming assignments in a CS programming course to predict students’ final exam scores. We investigate deep learning approaches, namely convolutional neural network (CNN) based code embeddings and long short-term memory (LSTM) networks, to assess their effectiveness for predicting student performance. In addition, we examine Halstead features extracted from code submissions to augment our models with more code-specific information. We evaluate and compare these deep learning methods against various traditional machine learning approaches. Empirical findings show that the LSTM model using metadata outperformed all the other machine learning methods for early prediction of students’ final exam scores. Our findings provide insight into the use of advanced machine learning models for predicting student performance and identifying at-risk students early, and can aid in determining whether additional support and pedagogical intervention are needed for any student.
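
For readers unfamiliar with the Halstead features mentioned above, the following is a minimal sketch of the standard Halstead measures computed from operator and operand tokens. It is not the paper's feature pipeline: the function name `halstead`, the `Halstead` record, and the assumption that a submission has already been tokenized into operators and operands are all illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Halstead:
    vocabulary: int    # n = n1 + n2 (distinct operators + distinct operands)
    length: int        # N = N1 + N2 (total operators + total operands)
    volume: float      # V = N * log2(n)
    difficulty: float  # D = (n1 / 2) * (N2 / n2)
    effort: float      # E = D * V


def halstead(operators, operands):
    """Compute standard Halstead measures from pre-classified tokens.

    Assumption: the caller has already split a code submission into
    operator tokens and operand tokens; how to do that depends on the
    programming language and is not specified here.
    """
    n1, n2 = len(set(operators)), len(set(operands))
    N1, N2 = len(operators), len(operands)
    vocabulary = n1 + n2
    length = N1 + N2
    # Guard against empty submissions to avoid log2(0) and division by zero.
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0
    return Halstead(vocabulary, length, volume, difficulty, difficulty * volume)


# Hypothetical token lists for a tiny snippet such as "a = a + b; b = a + b"
features = halstead(["=", "+", "=", "+"], ["a", "b", "a", "b"])
```

Measures like these could be appended to a per-submission feature vector alongside metadata before being passed to a model; whether the paper combines them in this way is not stated in the description.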

Files

CSEDM_workshop_paper_final - Nazia Alam.pdf

329.4 kB
md5:e5be3293815ccc9105944133c1c4cc63