Congratulations to Davin McCall, research student in the School of Computing, who has recently passed his PhD viva, with minor corrections. His PhD thesis was entitled ‘Novice Programmer Errors – Analysis and Diagnostics‘.
The thesis detailed a systematic and thorough approach both to analysing which errors that are most problematic for students, and to automated diagnosis of errors.
Abstract
All programmers make errors when writing program code, and for novices the difficulty of repairing errors can be frustrating and demoralising. It is widely recognised that compiler error diagnostics can be inaccurate, imprecise, or otherwise difficult for novices to comprehend, and many approaches to mitigating the difficulty of dealing with errors are centered around the production of diagnostic messages with improved accuracy and precision, and revised wording considered more suitable for novices. These efforts have shown limited success, partially due to uncertainty surrounding the types of error that students actually have the most difficulty with – which has most commonly been assessed by categorising them according to the diagnostic message already produced – and a traditional approach to the error diagnosis process which has known limitations.
In this thesis we detail a systematic and thorough approach both to analysing which errors that are most problematic for students, and to automated diagnosis of errors. We detail a methodology for developing a category schema for errors and for classifying individual errors in student programs according to such a schema. We show that this classification results in a different picture of the distribution of error types when compared to a classification according to diagnostic messages. We formally define the severity of an error type as a product of its frequency and difficulty, and by using repair time as an indicator of difficulty we show that error types rank differently via severity than they do by frequency alone.
Having developed a ranking of errors according to severity, we then investigate the contextual information within source code that experienced programmers can use to more accurately and precisely classify errors than compiler tools typically do. We show that, for a number of more severe errors, these techniques can be applied in an automated tool to provide better diagnostics than are provided by traditional compilers.