COMPARATIVE STUDY OF CLUSTERING ALGORITHMS FOR STUDENT PERFORMANCE EVALUATION
Authors/Creators
Description
Predicting student performance is essential for enhancing educational outcomes, enabling educators to identify students
who may need additional support or intervention. Clustering algorithms, as unsupervised data mining techniques, are
particularly effective at uncovering patterns in student performance data. These algorithms can group students based
on their exam scores, providing insights that allow for more tailored and targeted educational strategies. This study
compares four unsupervised methods, namely K-Means, DBSCAN, Hierarchical Clustering (Ward linkage), and Gaussian
Mixture Models (GMM), on a dataset of 200 students’ scores across five exam questions. After standardizing the data,
we project it into two dimensions via Principal Component Analysis (PCA) for visualization. We then evaluate each
model using three validation metrics: Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. K-Means
with k = 5 achieves the highest Silhouette (0.387) and Calinski-Harabasz (90.156) scores and the lowest
Davies-Bouldin Index (0.883), outperforming alternatives in both visual separation and quantitative metrics. DBSCAN
identifies noise but yields overlapping clusters; Hierarchical clustering shows moderate cohesion; GMM produces
softer boundaries. Our results demonstrate that K-Means offers the most interpretable and robust grouping for this
educational dataset, providing a practical tool for segmenting students into performance tiers. Future work may explore
dynamic k-selection methods, incorporation of additional student features, and deployment in intelligent tutoring
systems.
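
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the study's dataset is not reproduced here, so the scores are synthetic, and the DBSCAN parameters (`eps`, `min_samples`), the random seeds, and the helper names `fit_all` and `evaluate` are assumptions. The description does not say whether clustering was run on the standardized features or on the PCA projection; this sketch assumes the former and uses PCA only for plotting coordinates.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption):
# 200 students x 5 exam-question scores, drawn around 5 hypothetical tiers.
rng = np.random.default_rng(0)
tier_means = rng.uniform(2, 9, size=(5, 5))
scores = np.vstack([rng.normal(m, 1.0, size=(40, 5)) for m in tier_means])

# Standardize, then project to 2D for visualization only.
X = StandardScaler().fit_transform(scores)
X_2d = PCA(n_components=2, random_state=0).fit_transform(X)


def fit_all(X, k=5):
    """Fit the four methods and return a dict of label arrays."""
    return {
        "K-Means": KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X),
        # eps and min_samples are illustrative guesses, not values from the study.
        "DBSCAN": DBSCAN(eps=1.0, min_samples=5).fit_predict(X),
        "Hierarchical (Ward)": AgglomerativeClustering(
            n_clusters=k, linkage="ward"
        ).fit_predict(X),
        "GMM": GaussianMixture(n_components=k, random_state=0).fit_predict(X),
    }


def evaluate(X, labels):
    """Compute the three validation metrics, ignoring DBSCAN noise (label -1)."""
    mask = labels != -1
    if len(set(labels[mask])) < 2:
        return None  # the metrics are undefined for fewer than two clusters
    Xc, lc = X[mask], labels[mask]
    return {
        "silhouette": silhouette_score(Xc, lc),                # higher is better
        "davies_bouldin": davies_bouldin_score(Xc, lc),        # lower is better
        "calinski_harabasz": calinski_harabasz_score(Xc, lc),  # higher is better
    }


results = {name: evaluate(X, labels) for name, labels in fit_all(X).items()}
for name, metrics in results.items():
    print(name, metrics)
```

Because the data here is synthetic, the printed values will not match the figures reported above; the sketch only shows how the three metrics can be computed side by side for each method.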
Files
MAY50.pdf (498.0 kB)
md5:f028d8d9c6bd4c62771f6456374f96be