Runtime Error Checking for Novice C Programmers

Runtime error checking for novice C programmers Matthew Heinsen Egan Chris McDonald Computer Science and Software Engineering Computer Science and Software Engineering University of Western Australia University of Western Australia Crawley, WA 6009, Australia Crawley, WA 6009, Australia [email protected] [email protected] ABSTRACT The standard C programming language can be especially Debugging is a source of great frustration for most novice difficult for newcomers. In particular, pointers and manual programmers. Standard and professional debugging tools memory management can present difficulties both in under- are unsuitable for novice programmers, because they are standing at a conceptual level, and in debugging the la- overly complex and do not provide the basic information conically described runtime errors which result from their that novices frequently require. We describe an ongoing misuse. Most newly developed novice-focused debugging project to design, build, and evaluate novice-focused de- systems are designed for object-oriented programming lan- bugging tools supporting the standard C programming languages, as introductory teaching has focused on these language. In particular, our focus is on detecting, reporting, guages, and the most notable tools developed to assist novice and reviewing the typical runtime errors that confuse and C programmers are predominately unmaintained. frustrate student programmers { accesses to invalid memory, Research from the fields of programming languages and reading from uninitialized memory, and C's often \defined to compilers has developed many advanced debugging tech- be undefined" behaviour. In addition, our tool features in- niques, but they are typically only supported by tools de- tegrated execution tracing so that students receive not only signed for expert programmers, rather than for novices. The very informative error reporting, but the ability to replay complexity of these tools, and the time required to learn the entire history of their program leading to that error. their use, at even a modest level, are often insurmount- able hurdles for novice students. Furthermore, while these tools can be used to locate runtime errors, they do not assist Categories and Subject Descriptors novice programmers to understand those errors. D.2.5 [Software Engineering]: Testing and Debugging; In this paper we describe our ongoing project to design, K.3.2 [Computers and Education]: Computer and In- develop, and evaluate novice-focused debugging tools for the formation Science Education C programming language. We discuss the design and implementation of our prototype tool, SeeC (pronounced 'seek'), General Terms and evaluate its error detection and reporting using a large sample of project submissions from undergraduate students. Human Factors SeeC is designed to assist novice programmers by automat- ically recording the execution of their programs, detecting Keywords runtime errors, and allowing review of the execution in a Novice programmers, debuggers simple graphical environment. The approach taken is that the execution of novices' programs is always being recorded, 1. INTRODUCTION so that execution may always be reviewed, in preference to re-running a program which may only fail intermittently. Debugging can be troublesome for novice programmers as Section 2 discusses the C programming language and the it requires the simultaneous application of a wide range of difficulties that it presents to students and educators. Sec- knowledge, which novice programmers are yet to acquire. tion 3 discusses novice programmers' difficulties with debug- If a novice programmer does not have the necessary knowl- ging, and describes the framework that we follow to evaluate edge to debug their program then the debugging process and design debugging systems. Section 4 considers existing may consume large amounts of their time, prevent them debugging tools for both novice and expert programmers. from completing assigned tasks, and ultimately contribute Section 5 discusses the implementation requirements of our to decisions to withdraw from programming courses. proposed debugging system. Section 6 discusses our use of the LLVM compiler infrastructure to implement execution tracing. Section 7 discusses our implementation of runtime error detection. Section 8 discusses our use of the Clang project to acquire precise information about the syntactic and semantic structure of students' C source code. Sec- tion 9 compares SeeC's error detection and reporting to that of Memcheck, the well-known Valgrind tool. We finally sum- 4th Annual International Conference on Computer Science Education: In- marize our discussion and highlight future plans. novation and Technology (CSEIT 2013) 2. ON TEACHING C 3. THE DEBUGGING FRAMEWORK For many years C has been one of the most widely-used Anecdotal reports often indicate that debugging is dif- programming languages. A great number of projects are ficult for novice programmers. Many formal studies have implemented in C, from operating system kernels to end- investigated the nature of the debugging process for both user programs, and it has influenced several other popu- expert programmers and novice programmers, and show the lar languages, such as C++, Objective C, Java, and C#. kind of unique difficulties that novices experience, and how Despite C's age, the growing importance of embedded and detrimental they can be to a novice's progress. sensor-based systems has cemented C's significance in mod- Seppälä reported that a questionnaire given to students ern computing. Gaspar et al. reported on the results of who were studying Java in their main programming course an anonymous online survey designed to determine the role found that \43 percent of the students claimed that they of the C language in the modern computing curricula [11]. spent most of their time trying to make their programs con- The survey found that while respondents used C in only 14% form to exercise specifications or trying to fix runtime er- of introductory courses and 10.9% of intermediate courses, rors" [27]. 67% use C in other courses { primarily Operating systems Ko and Myers presented a framework for studying soft- and Networking: ware errors [16] which we have used to evaluate the design of existing debugging tools, and to guide the design of our pro- \. the very aspects of C which are perceived as totype debugging tool. This framework is based on studies a pedagogical hindrance in introductory courses of programming and debugging, and general research on hu- can be useful to provide a more in-depth under- man error. The framework defines the correctness of a pro- standing of programming at later stages of stu- gram relative to the program's design specifications, which dent education." [11] define the system's behavioural and functional requirements. The framework defines three terms for describing runtime Lee et al. noted that moving to a Java-based introduc- errors: tory sequence \created a gap in knowledge as the students Runtime failures occur when a program's behaviour does progress to upper level courses like operating systems and not comply with its design specifications, e.g. it pro- computer graphics, where they need a command of C and duces incorrect output or crashes. the UNIX environment" [22]. Desnoyers noted that many students had no previous exposure to an unsafe language Runtime faults exist when a program's runtime state may when first encountering C in an operating systems course [7]. lead to a runtime failure, e.g. when an incorrect value Moreover, the number of potential topics in any Computer has been calculated, or a piece of code has been inap- Science curriculum often means that the C programming propriately executed. language can no longer be explicitly taught in some degrees, and students must study it at their own pace to attain the Software errors are any pieces of code that may cause a knowledge assumed for later courses. runtime fault during the program's execution. Students learning the C programming language are often experiencing their first introduction to pointers and man- Note that software errors may lie dormant until circum- ual memory management. Many studies have noted that stances cause them to manifest a runtime fault, and similarly students encounter difficulty while learning to use pointers that a runtime fault will not necessarily produce a runtime and manual memory management. Lahtinen et al. surveyed failure. However, the presence of a runtime failure guar- students and teachers, and found that on average pointers antees that at least one runtime fault exists, which in turn and references were rated the most difficult programming guarantees that at least one software error exists. concepts to learn [19]. Brusilovsky et al. surveyed computer Within this framework, we can consider the debugging science educators, and found that pointers were the most fre- process as follows: after a programmer becomes aware of a quent response for difficult to learn concepts, and the second runtime failure, they must locate the software errors respon- most frequent response for difficult to teach concepts [2]. sible and correct them. This often involves the intermediate Students at The University of Western Australia are in- task of finding the runtime faults responsible for the ob- troduced to C in a first year course covering core aspects of served runtime failure. When the programmer has located a procedural programming language and their relationship the runtime faults, they use their knowledge to identify the to core operating systems concepts. The course is part of software errors responsible, and then to correct those errors. the required sequence for traditional Computer Science stu- Ducasséand Emde identified seven kinds of knowledge dents, who later apply their knowledge in networking and necessary for debugging [9], namely knowledge of: the in- security courses, but is also taken by many Electronic Engi- tended program; the actual program; the programming lan- neering students who later apply their knowledge in courses guage; bugs; debugging methods; general programming ex- on embedded systems and robotics. pertise; and the application domain.

Load more