Early prediction of student performance in a programming class using prior code submissions and metadata
- 1. North Carolina State University
Description
Early prediction of student performance is a challenging research problem. In this study, we address this problem in the context of a programming class. We use a dataset of code submissions and associated metadata from programming assignments in a CS programming course to predict students’ final exam scores. We investigate deep learning approaches, such as convolutional neural network (CNN) based code embeddings and long short-term memory (LSTM) networks, to assess how effective they are at predicting student performance. In addition, we examine Halstead features extracted from code submissions to augment our models with more code-specific information. We evaluate and compare these deep learning methods against various traditional machine learning approaches. Empirical findings show that the LSTM model using metadata outperformed all the other machine learning methods for early prediction of student final exam scores. Our findings provide insight into the use of advanced machine learning models for predicting student performance and identifying at-risk students early, and can help determine whether a student needs additional support or pedagogical intervention.
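The description does not say which Halstead features were extracted or how the submissions were tokenized. As a rough illustration only, the standard Halstead measures (vocabulary, length, volume, difficulty, effort) can be computed from operator and operand counts as sketched below. The function name `halstead_features` and the assumption that a submission has already been split into operator and operand tokens are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, Sequence


def halstead_features(operators: Sequence[str], operands: Sequence[str]) -> Dict[str, float]:
    """Compute the standard Halstead measures from pre-tokenized code.

    Assumption: the caller has already separated a submission into
    operator tokens and operand tokens; the tokenizer is out of scope here.
    """
    n1 = len(set(operators))   # distinct operators
    n2 = len(set(operands))    # distinct operands
    N1 = len(operators)        # total operator occurrences
    N2 = len(operands)         # total operand occurrences

    vocabulary = n1 + n2
    length = N1 + N2
    # Volume is N * log2(n); defined as 0 for an empty token stream.
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    # Difficulty is (n1 / 2) * (N2 / n2); defined as 0 when there are no operands.
    difficulty = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0
    effort = difficulty * volume

    return {
        "vocabulary": float(vocabulary),
        "length": float(length),
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


# Hand-tokenized example for the two statements: x = a + b ; y = x * a
example = halstead_features(
    operators=["=", "+", "=", "*"],
    operands=["x", "a", "b", "y", "x", "a"],
)
```

A feature vector like `example` could then be concatenated with submission metadata before being passed to a model; whether the study combined features this way is not stated in the description.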
Files
CSEDM_workshop_paper_final - Nazia Alam.pdf (329.4 kB, md5:e5be3293815ccc9105944133c1c4cc63)