Utilizing programming traces to explore the dimensions of novice programmers' code writing skill
Studies have found that novice programmers are weak at code writing. However, it is unclear which subskills code writing is composed of and which of those subskills novices are weak in. This study uses programming traces to identify the latent subskills that constitute code writing, so that teachers can offer specific instruction on the weak subskills. Data were collected from an undergraduate introductory computer science course taught in Java. Six hundred and fourteen students made submissions to homework programming questions in a web-based learning system. We used the submission traces to compute eleven features related to the correctness of students' submissions and the time they spent on them. To investigate the factors underlying these features, we conducted an exploratory factor analysis on a randomly selected two-thirds of the students and identified four factors. The first factor, code style proficiency, was mainly related to Checkstyle errors. The second, syntactic proficiency, concerned compiler errors. The third, semantic proficiency, concerned runtime errors and logic errors. The fourth, syntactic debugging proficiency, concerned the success rate and the time required for fixing compiler and Checkstyle errors. A confirmatory factor analysis conducted on the remaining one-third of the data supported the four-factor structure. We linked these factors to the extant taxonomy and framework of programming skills.
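
The random two-thirds/one-third partition described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `split_students`, the fixed seed, the rounding rule, and the placeholder student IDs are all assumptions introduced here for illustration.

```python
import random


def split_students(student_ids, efa_fraction=2 / 3, seed=0):
    """Randomly partition student IDs into an exploratory (EFA) sample
    and a confirmatory (CFA) holdout sample.

    Hypothetical helper: the name, seed, and rounding rule are assumptions.
    """
    ids = list(student_ids)        # copy so the caller's list is not reordered
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    rng.shuffle(ids)
    n_efa = round(len(ids) * efa_fraction)
    return ids[:n_efa], ids[n_efa:]


# Placeholder IDs standing in for the 614 students in the study.
students = [f"s{i:03d}" for i in range(614)]
efa_ids, cfa_ids = split_students(students)
print(len(efa_ids), len(cfa_ids))
```

Under this rounding rule the 614 students divide into 409 and 205; the subsample sizes actually used in the study may differ. Splitting by student rather than by submission keeps each student's traces entirely within one sample, so the confirmatory analysis is run on students the exploratory analysis never saw.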