Published March 15, 2023 | Version v1
Journal article Open

Analysis of Student Pair Teamwork Using GitHub Activities

  • 1. North Carolina State University


Few studies have analyzed students’ teamwork (pairwork) habits in programming projects because
analyzing complex, long-term collaborative processes is challenging and costly. In this work, we
analyze student teamwork data collected from GitHub with the goal of identifying specific
pair teamwork styles. The analysis builds on an initial corpus of commit messages that was manually
labeled by subject matter experts. We then extend this annotation using self-supervised,
semi-supervised learning to develop a large-scale annotated dataset that covers multiple offerings
of a second-semester (CS2) course. Further, we develop a series of predictive models to automatically
identify student teamwork styles. Finally, we compare trends in students’ performance and team selection
for each teamwork style to see whether any style is associated with better student outcomes or different
help-seeking trends among students. Our analysis showed that applying self-supervised, semi-supervised methods
lets us label larger subsets of the data automatically while maintaining, and sometimes improving, the
performance of the fully supervised models on a held-out validation set. It also showed that
members of teams in which all members contribute significantly tend to perform better
in class, but their help-seeking behaviors are not significantly different.
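
The idea of extending a small expert-labeled corpus to a larger one can be illustrated with self-training, one common semi-supervised technique. The sketch below is only an illustration and is not the authors' pipeline: the abstract does not say which algorithm, features, or label set were used. The bag-of-words centroid classifier, the `threshold` value, and every function name here are assumptions introduced for the example.

```python
from collections import Counter
import math


def tokens(msg):
    # Naive whitespace tokenizer (an assumption; real commit text needs more care).
    return msg.lower().split()


def train(labeled):
    """Build one bag-of-words centroid (token counts) per label."""
    centroids = {}
    for msg, label in labeled:
        centroids.setdefault(label, Counter()).update(tokens(msg))
    return centroids


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def predict(centroids, msg):
    """Return (best_label, margin); the margin over the runner-up acts as confidence."""
    vec = Counter(tokens(msg))
    scored = sorted(((cosine(vec, c), label) for label, c in centroids.items()),
                    reverse=True)
    best_score, best_label = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(labeled, unlabeled, threshold=0.3, max_rounds=5):
    """Repeatedly pseudo-label confident messages and retrain on the grown set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = train(labeled)
        confident, remaining = [], []
        for msg in pool:
            label, margin = predict(centroids, msg)
            (confident if margin >= threshold else remaining).append((msg, label))
        if not confident:  # nothing new was learned, so stop early
            break
        labeled.extend(confident)
        pool = [msg for msg, _ in remaining]
    return train(labeled), labeled, pool
```

Given a handful of hand-labeled messages and a pool of unlabeled ones, `self_train` returns the retrained model, the enlarged labeled set, and the messages that stayed unlabeled because no prediction cleared the threshold. Comparing a model trained this way against a purely supervised one on a held-out validation set mirrors the kind of comparison the abstract describes.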



Additional details

SHF:Small: Enabling Scalable and Expressive Program Analysis Notifications 1714538
National Science Foundation