Journal article Open Access

Using a Latent Class Forest to Identify At-Risk Students in Higher Education

Pelaez, Kevin; Levine, Richard; Fan, Juanjuan; Guarcello, Maureen; Laumakis, Mark

Higher education institutions often examine performance discrepancies among specific subgroups, such as students from underrepresented minority and first-generation backgrounds. Growth in educational technology and computational power has spurred research interest in using data mining tools to help identify groups of students who are academically at risk. Institutions can then make data-informed decisions to help promote student access, increase retention and graduation rates, and guide intervention programs. We introduce the latent class forest, an ensemble that combines latent class analysis with random forests and recursively partitions observations into groups to help identify at-risk students. The procedure is a form of model-based hierarchical clustering that relies on latent class trees to optimally identify subgroups. We motivate and apply our latent class forest method to identify key demographic and academic characteristics of at-risk students in a large-enrollment, bottleneck introductory psychology course at San Diego State University (SDSU). A post hoc analysis measures the efficacy of Supplemental Instruction (SI) across these groups. SI is a peer-led academic intervention that targets historically challenging courses and aims to increase student performance. This analysis allows us to identify the populations that benefit most from SI, which can guide program recruitment and help increase the success rate in the introductory psychology course.


