Usability of Error Messages for Introductory Students

Total Page:16

File Type:pdf, Size:1020Kb

Usability of Error Messages for Introductory Students Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal Volume 2 Issue 2 Article 5 September 2015 Usability of Error Messages for Introductory Students Paul A. Schliep University of Minnesota, Morris Follow this and additional works at: https://digitalcommons.morris.umn.edu/horizons Part of the Software Engineering Commons Recommended Citation Schliep, Paul A. (2015) "Usability of Error Messages for Introductory Students," Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal: Vol. 2 : Iss. 2 , Article 5. Available at: https://digitalcommons.morris.umn.edu/horizons/vol2/iss2/5 This Article is brought to you for free and open access by the Journals at University of Minnesota Morris Digital Well. It has been accepted for inclusion in Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal by an authorized editor of University of Minnesota Morris Digital Well. For more information, please contact [email protected]. Schliep: Usability of Error Messages for Introductory Students Usability of Error Messages for Introductory Students Paul A. Schliep Division of Science and Mathematics University of Minnesota, Morris Morris, Minnesota, USA 56267 [email protected] ABSTRACT that students struggle with various elements of error mes- Error messages are an important tool programmers use to sages such as terminology and source highlighting [1, 7]. help find and fix mistakes or issues in their code. When Several tools and heuristics are being developed to help ad- an error message is unhelpful, it can be difficult to find the dress issues in error message usability. The goals of these issue and may impose additional challenges in learning the methodologies are to help introductory programmers locate language and concepts. Error messages are especially crit- the issue in the program and guide the programmers to a ical for introductory programmers in understanding prob- solution. The purpose of this paper is to discuss analyses lems with their code. Unfortunately, not all error messages of error message design and its usability for introductory in programming are beneficial for novice programmers. This students in the scope of a class (meaning students' interac- paper discusses the general usability of error messages for in- tions with programming in a lab setting and at home), and troductory programmers, analyses of error messages in com- how these developed methodologies help improve the user pilers and DrRacket, and two methodologies intended to im- experience with error messages. prove error handling. This paper is divided into four sections. In Section 2 we provide background on usability studies, dynamic and static typing, compiler and runtime errors, and an overview of lan- Keywords guages and tools used. In Section 3 we focus on analyses Novice programmers, usability, error messages, usability stud- of the usability of error messages in development environ- ies, compiler errors, syntax errors ments and compiler messages for introductory students and how those analyses were performed. In Section 4 we ex- amine Marceau et al.'s recommendations for error message 1. INTRODUCTION design [5] and then we explore Denny et al.'s syntax error One of the most important foundations of computer pro- enhancement tool [1]. gramming is the communication between the system and the user, specifically in the error messages produced by the 2. BACKGROUND system. These error messages are especially important for In order to discuss the analyses of error messages, we introductory-level computer science students to help them need to understand several concepts related to error types resolve issues in their program because the error messages and usability. These concepts include compiler errors, syn- are the primary source for understanding what is wrong. tax errors, runtime errors, usability studies, and Human- According to Marceau et al., \[students] lack the experience Computer Interaction. We also introduce dynamically and to decipher complicated or poorly-constructed feedback" [4]. statically typed languages and their differences. We then The first rule of good message design is to be sure that the discuss the programming languages and tools used in the error does not add confusion [2]. Difficulties in understand- analysis of the error messages. We will focus on the Racket, ing error messages can often lead the programmers to frus- C++, and Java programming languages and the integrated tration because the error message is either too complicated development environment (IDE) DrRacket. to understand or led them down the wrong path [5], which can sometimes introduce new errors [1]. 2.1 Human-computer interaction and meth- Various studies have been conducted on modern program- ming languages' error messages to evaluate their effective- ods of usability analysis ness in helping novice programmers. The results have shown The study of Human-Computer Interaction, or HCI, is focused on how computer technology is used, specifically on the interfaces between the user and the programs on the computer. As Traver notes, \HCI is a discipline that aims to provide user interfaces that make working with a computer a more productive, effective, and enjoyable task" [7]. Much This work is licensed under the Creative Commons Attribution- of the research presented in this paper is viewed from an Noncommercial-Share Alike 3.0 United States License. To view a copy HCI perspective and emphasizes usability. of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or In order to analyze these messages from an HCI per- send a letter to Creative Commons, 171 Second Street, Suite 300, San Fran- cisco, California, 94105, USA. spective and attain qualitative and quantitative information UMM CSci Senior Seminar Conference, December 2013 Morris, MN. about their usability, a case study may be performed. A Published by University of Minnesota Morris Digital Well, 2015 1 Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal, Vol. 2, Iss. 2 [2015], Art. 5 case study is a research method that closely studies a group 2.3 Statically and dynamically typed of participants (in this paper, the participants are introduc- While we do not mention these concepts in any study, it tory students in the scope of the class) and collects data is important to note how some languages differ in error han- about participants interactions by observations and inter- dling. In statically typed languages, type checking is done views. Many of the studies analyzed in this paper are using at compile-time and objects are bound to types. This means a case study design. Using the data obtained from these that when programming in statically typed languages, a pro- studies a t-test may be performed, which is a statistical ex- grammer will need to pay attention to how a variable's type amination of two data sets and finds if they are significantly is cast. However, statically typed languages provide benefits different from each other. The significance in a t-test is de- such as earlier detection of programming mistakes. termined by the calculated probability, or the p-value. The A dynamically typed language means a type is interpreted p-value is the estimated probability that there is no signifi- at runtime rather than compile time and variables in the lan- cant difference in the data sets. In this paper, a p-value < guage are not bound to types. Since the runtime system of 0.05 will be considered significant. a dynamically typed language deduces type and type con- 2.2 Compiler and runtime errors versions, a programmer does not have to worry about type declaration while writing code. When writing code, a programmer may receive various er- As dynamically and statically typed languages differ in ror messages during different phases of the program, runtime how types are handled, something can be legal in dynami- and compile time. A compiler is a program that converts cally typed and illegal in statically typed language. Consider source code written in a programming language to a lower- the following example: level language understood by a processor or interpreter so that instructions can be executed. A compilation error is re- personName = "Francis" turned when the compiler fails to compile a piece of program personName = 7 source code. A program will not run if there is a compila- tion error because the compiler will not be able to create This sequence of statements is illegal in a statically typed executable code to run if there are errors the compiler finds. language since we are binding a string to personName, then We also discuss several examples of syntax errors in this an integer to personName. This statement would then throw paper. A syntax error is a type of compilation error that an error during compile time. This is a legal statement in a occurs when the code does not conform to the syntactical dynamically typed language, however, since variables do not order expected by the parser. The parser is a program (usu- have types in dynamic languages. Note that this statement ally part of a compiler) that receives input as source code would not work in a purely function language, or a functional and builds it into a structural representation of the input language that does not allow side effects, as you can not while checking for correct syntax of the input. As Kum- change the value of a variable. merfield and Kay note, \The usability of compiler errors is important because syntax error correction is the first step 2.4 Overview of programming languages and in the debugging process. It is not possible to continue pro- tools analyzed gram development until the code compiles. This means it is In section 3, we discuss a study performed by Marceau et a crucial part of the error correction process." [3] al. that analyzes the error messages in the DrRacket inte- Below is an example of a compiler error in Java. Here, grated development environment [4].
Recommended publications
  • Compiler Error Messages Considered Unhelpful: the Landscape of Text-Based Programming Error Message Research
    Working Group Report ITiCSE-WGR ’19, July 15–17, 2019, Aberdeen, Scotland Uk Compiler Error Messages Considered Unhelpful: The Landscape of Text-Based Programming Error Message Research Brett A. Becker∗ Paul Denny∗ Raymond Pettit∗ University College Dublin University of Auckland University of Virginia Dublin, Ireland Auckland, New Zealand Charlottesville, Virginia, USA [email protected] [email protected] [email protected] Durell Bouchard Dennis J. Bouvier Brian Harrington Roanoke College Southern Illinois University Edwardsville University of Toronto Scarborough Roanoke, Virgina, USA Edwardsville, Illinois, USA Scarborough, Ontario, Canada [email protected] [email protected] [email protected] Amir Kamil Amey Karkare Chris McDonald University of Michigan Indian Institute of Technology Kanpur University of Western Australia Ann Arbor, Michigan, USA Kanpur, India Perth, Australia [email protected] [email protected] [email protected] Peter-Michael Osera Janice L. Pearce James Prather Grinnell College Berea College Abilene Christian University Grinnell, Iowa, USA Berea, Kentucky, USA Abilene, Texas, USA [email protected] [email protected] [email protected] ABSTRACT of evidence supporting each one (historical, anecdotal, and empiri- Diagnostic messages generated by compilers and interpreters such cal). This work can serve as a starting point for those who wish to as syntax error messages have been researched for over half of a conduct research on compiler error messages, runtime errors, and century. Unfortunately, these messages which include error, warn- warnings. We also make the bibtex file of our 300+ reference corpus ing, and run-time messages, present substantial difficulty and could publicly available.
    [Show full text]
  • Common Logic Errors for Programming Learners: a Three-Decade Litera- Ture Survey
    Paper ID #32801 Common Logic Errors for Programming Learners: A Three-decade Litera- ture Survey Nabeel Alzahrani, University of California, Riverside Nabeel Alzahrani is a Computer Science Ph.D. student in the Department of Computer Science and En- gineering at the University of California, Riverside. Nabeel’s research interests include causes of student struggle, and debugging methodologies, in introductory computer programming courses. Prof. Frank Vahid, University of California, Riverside Frank Vahid is a Professor of Computer Science and Engineering at the Univ. of California, Riverside. His research interests include embedded systems design, and engineering education. He is a co-founder of zyBooks.com. c American Society for Engineering Education, 2021 Common Logic Errors for Programming Learners: A Three- Decade Literature Survey Nabeel Alzahrani, Frank Vahid* Computer Science and Engineering University of California, Riverside {nalza001, vahid}@ucr.edu *Also with zyBooks Abstract We surveyed common logic errors made by students learning programming in introductory (CS1) programming classes, as reported in 47 publications from 1985 to 2018. A logic error causes incorrect program execution, in contrast to a syntax error, which prevents execution. Logic errors tend to be harder to detect and fix and are more likely to cause students to struggle. The publications described 166 common logic errors, which we classified into 11 error categories: input (2 errors), output (1 error), variable (7 errors), computation (21 errors), condition (18 errors), branch (14 errors), loop (27 errors), array (5 errors), function (24 errors), conceptual (43 errors), and miscellaneous (4 errors). Among those errors, we highlighted 43 that seemed to be the most common and/or troublesome.
    [Show full text]
  • Automated Debugging of Java Syntax Errors Through Diagnosis
    Uti and Fox Automated debugging of Java syntax errors through diagnosis Ngozi V. Uti and Dr. Richard Fox Department of Mathematics and Computer Science Northern Kentucky University Highland Heights, KY 41099 Abstract Syntax errors arise in a program because the program code misuses the programming language. When these errors are detected by a compiler, they are not necessarily easy to debug because the compiler error message that reports the error is often misleading. JaSD, a knowledge-based system written in Java, is a program that can determine the cause of a given syntax error and suggest how to correct that error. JaSD uses artificial intelligence techniques to perform debugging. Currently, JaSD has enough knowledge to debug one of four different Java syntax errors. This paper describes the JaSD system and the diagnostic approach taken and offers some examples. Conclusions describe how JaSD can be expanded to debug additional errors. Introduction expected error is common because a semicolon is used Syntax errors can arise in a program because the in Java, C, and C++ to end instructions and in Pascal to program code misuses the programming language. The delineate (separate) instructions. Forgetting a semico- compiler, in its attempt to translate the high-level lan- lon is common and yields the semicolon-expected guage program into machine language, has reached an error in any of these languages. Thus, the programmer instruction that it does not understand. The error might learns how the error can arise. However, there are be because the programmer misused the high-level lan- numerous possible causes for this error, omitting a guage (e.g., improper punctuation or not structuring a semicolon being only one of them.
    [Show full text]
  • Introduction to Computers and Programming
    M01_GADD7119_01_SE_C01.QXD 1/30/08 12:55 AM Page 1 Introduction to Computers CHAPTER 1 and Programming TOPICS 1.1 Introduction 1.4 How a Program Works 1.2 Hardware and Software 1.5 Using Python 1.3 How Computers Store Data 1.1 Introduction Think about some of the different ways that people use computers. In school, students use com- puters for tasks such as writing papers, searching for articles, sending email, and participating in online classes. At work, people use computers to analyze data, make presentations, conduct busi- ness transactions, communicate with customers and coworkers, control machines in manufac- turing facilities, and do many other things. At home, people use computers for tasks such as pay- ing bills, shopping online, communicating with friends and family, and playing computer games. And don’t forget that cell phones, iPods®, BlackBerries®, car navigation systems, and many other devices are computers too. The uses of computers are almost limitless in our everyday lives. Computers can do such a wide variety of things because they can be programmed. This means that computers are not designed to do just one job, but to do any job that their programs tell them to do. A program is a set of instructions that a computer follows to perform a task. For example, Figure 1-1 shows screens from two commonly used programs, Microsoft Word and Adobe Photoshop. Microsoft Word is a word processing program that allows you to create, edit, and print documents with your computer. Adobe Photoshop is an image editing program that allows you to work with graphic images, such as photos taken with your digital camera.
    [Show full text]
  • Compile and Runtime Errors in Java
    Compile and Runtime Errors in Java Mordechai (Moti) Ben-Ari Department of Science Teaching Weizmann Institute of Science Rehovot 76100 Israel http://stwww.weizmann.ac.il/g-cs/benari/ January 24, 2007 This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/ by-nc-nd/2.5/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. Acknowledgement Otto Seppälä and Niko Myller contributed many helpful suggestions. Contents 1 Introduction 5 2 Compile-time Errors 6 2.1 Syntax errors . 7 2.1.1 . expected . 7 2.1.2 unclosed string literal . 7 2.1.3 illegal start of expression . 8 2.1.4 not a statement . 8 2.2 Identifiers . 10 2.2.1 cannot find symbol . 10 2.2.2 . is already defined in . 10 2.2.3 array required but . found . 10 2.2.4 . has private access in . 10 2.3 Computation . 11 2.3.1 variable . might not have been initialized . 11 2.3.2 . in . cannot be applied to . 11 2.3.3 operator . cannot be applied to . ,. 12 2.3.4 possible loss of precision . 12 2.3.5 incompatible types . 12 2.3.6 inconvertible types . 13 2.4 Return statements . 13 2.4.1 missing return statement . 13 2.4.2 missing return value . 14 2.4.3 cannot return a value from method whose result type is void . 14 2.4.4 invalid method declaration; return type required .
    [Show full text]
  • Compile and Runtime Errors in Java
    Compile and Runtime Errors in Java Mordechai (Moti) Ben-Ari Department of Science Teaching Weizmann Institute of Science Rehovot 76100 Israel http://stwww.weizmann.ac.il/g-cs/benari/ August 27, 2007 This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/ by-nc-nd/2.5/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. Acknowledgement Otto Seppälä and Niko Myller contributed many helpful suggestions. Contents 1 Introduction 5 2 Compile-time Errors 6 2.1 Syntax errors . 7 2.1.1 . expected . 7 2.1.2 unclosed string literal . 7 2.1.3 illegal start of expression . 8 2.1.4 not a statement . 8 2.2 Identifiers . 10 2.2.1 cannot find symbol . 10 2.2.2 . is already defined in . 10 2.2.3 array required but . found . 10 2.2.4 . has private access in . 10 2.3 Computation . 11 2.3.1 variable . might not have been initialized . 11 2.3.2 . in . cannot be applied to . 11 2.3.3 operator . cannot be applied to . ,. 12 2.3.4 possible loss of precision . 12 2.3.5 incompatible types . 12 2.3.6 inconvertible types . 13 2.4 Return statements . 13 2.4.1 missing return statement . 13 2.4.2 missing return value . 14 2.4.3 cannot return a value from method whose result type is void . 14 2.4.4 invalid method declaration; return type required .
    [Show full text]
  • Truffle/Graal: from Interpreters to Optimizing Compilers Via Partial Evaluation
    Truffle/Graal: From Interpreters to Optimizing Compilers via Partial Evaluation Jonathan Aldrich 17-396/17-696/17-960: Language Design and Prototyping Carnegie Mellon University Many slides from Oracle’s PLDI 2016 and 2017 tutorials on Truffle/Graal—marked where they occur. 1 Interpreters for Prototyping • Writing optimizing compilers is hard • Many complex optimization algorithms • Relies on knowledge of architecture and performance characteristics • See 15-411/15-611, 15-745 • So when we prototype with interpreters • Easy to write • Easy to change when the language changes • Unfortunately, interpreters are slow • Especially if you make them simple! • Could be 2 orders of magnitude slower than an optimizing compiler • Is there any way to make an interpreter faster? 2 From Interpreter to Compiler • Given • An interpreter for guest language G written in host language H • “if you see an add expression, add the two arguments” • A program in language G • “add(add(x, y), z)” • What if we “run” the interpreter on the program? • Whenever we get to actual input data, we’ll stop evaluating there (but keep going elsewhere) • “add(add(x, y), z)” x+y+z • We have essentially “compiled” the language • This is called Partial Evaluation • When applied in the context of interpreters and programs, it’s called the First Futamura Projection (after Futamura, who proposed it in 1971) 3 Example: Partial Evaluation class ExampleNode { // parameter this in rsi @CompilationFinal boolean flag; normal compilation cmpb [rsi + 16], 0 of method foo() jz L1 int foo() { mov eax, 42 if (this.flag) { ret return 42; L1: mov eax, -1 } else { ret return -1; } } mov rax, 42 Object value of this ret ExampleNode flag: true partial evaluation of method foo() with known parameter this Memory access is eliminated and condition is constant folded during partial evaluation @CompilationFinal field is treated like a final field during partial evaluation Copyright © 2016, Oracle and/or its affiliates.
    [Show full text]
  • Using Language Models to Detect and Correct Syntax Errors
    Syntax and Sensibility: Using language models to detect and correct syntax errors Eddie Antonio Santos, Joshua Charles Campbell, Dhvani Patel, Abram Hindle, and José Nelson Amaral Department of Computing Science University of Alberta Edmonton, Alberta, Canada {easantos,joshua2,dhvani,hindle1,jamaral}@ualberta.ca Abstract—Syntax errors are made by novice and experienced Listing 1: Syntactically invalid Java code. An open brace ({) is missing programmers alike; however, novice programmers lack the years at the end of line 3. of experience that help them quickly resolve these frustrating errors. Standard LR parsers are of little help, typically resolving 1 public class A{ public static void main(String args[]) { syntax errors and their precise location poorly. We propose 2 3 if (args.length < 2) a methodology that locates where syntax errors occur, and 4 System.out.println("Not enough args!"); suggests possible changes to the token stream that can fix the 5 System.exit(1); error identified. This methodology finds syntax errors by using 6 } language models trained on correct source code to find tokens 7 System.out.println("Hello, world!"); that seem out of place. Fixes are synthesized by consulting the 8 } language models to determine what tokens are more likely at the 9 } estimated error location. We compare n-gram and LSTM (long short-term memory) language models for this task, each trained input to a Java 8 compiler such as OpenJDK’s javac [4], on a large corpus of Java code collected from GitHub. Unlike and it reports that there is an error with the source file, but it prior work, our methodology does not rely that the problem overwhelms the user with misleading error messages to identify source code comes from the same domain as the training data.
    [Show full text]
  • Rule-Based Program Specialization to Optimize Gradually Typed Code
    Rule-based Program Specialization to Optimize Gradually Typed Code Francisco Ortina,∗, Miguel Garciaa, Seán McSweeneyb aUniversity of Oviedo, Computer Science Department, Calvo Sotelo s/n, 33007, Oviedo, Spain bCork Institute of Technology, Department of Computer Science, Rossa Avenue, Bishopstown, Cork, Ireland Abstract Both static and dynamic typing provide different benefits to the programmer. Stati- cally typed languages support earlier type error detection and more opportunities for compiler optimizations. Dynamically typed languages facilitate the development of runtime adaptable applications and rapid prototyping. Since both approaches provide benefits, gradually typed languages support both typing approaches in the very same programming language. Gradual typing has been an active research field in the last years, turning out to be a strong influence on commercial languages. However, one important drawback of gradual typing is the runtime performance cost of the additional type checks performed at runtime. In this article, we propose a rule-based program specialization mechanism to provide significant performance optimizations of gradually typed code. Our system gathers dynamic type information of the application by simulating its execution. That type information is used to optimize the generated code, reducing the number of type checks performed at runtime. Moreover, program specialization allows the early detection of compile-time type errors, providing static type safety. To ensure the correctness of the proposed approach, we prove its soundness and efficiency properties. The specialization system has been implemented as part of a full-fledged programming language, measuring the runtime performance gain. The generated code performs significantly better than the state-of-the-art techniques to optimize dynamically typed code.
    [Show full text]
  • Python for Everybody
    Python for Everybody Exploring Data Using Python 3 Charles R. Severance 1.10. WHAT COULD POSSIBLY GO WRONG? 13 I hate you Python! ^ SyntaxError: invalid syntax >>> if you come out of there, I would teach you a lesson File "<stdin>", line 1 if you come out of there, I would teach you a lesson ^ SyntaxError: invalid syntax >>> There is little to be gained by arguing with Python. It is just a tool. It has no emotions and it is happy and ready to serve you whenever you need it. Its error messages sound harsh, but they are just Python’s call for help. It has looked at what you typed, and it simply cannot understand what you have entered. Python is much more like a dog, loving you unconditionally, having a few key words that it understands, looking you with a sweet look on its face (>>>), and waiting for you to say something it understands. When Python says “SyntaxError: invalid syntax”, it is simply wagging its tail and saying, “You seemed to say something but I just don’t understand what you meant, but please keep talking to me (>>>).” As your programs become increasingly sophisticated, you will encounter three gen- eral types of errors: Syntax errors These are the first errors you will make and the easiest to fix. A syntax error means that you have violated the “grammar” rules of Python. Python does its best to point right at the line and character where it noticed it was confused. The only tricky bit of syntax errors is that sometimes the mistake that needs fixing is actually earlier in the program than where Python noticed it was confused.
    [Show full text]
  • An Integrated Approach to Source Level Debugging and Compile Error Reporting in Metaprograms
    Journal of Object Technology Published by AITO — Association Internationale pour les Technologies Objets, c JOT 2011 Online at http://www.jot.fm. An Integrated Approach to Source Level Debugging and Compile Error Reporting in Metaprograms Yannis Lilisa Anthony Savidisab a. Institute of Computer Science, FORTH b. Department of Computer Science, University of Crete Abstract Metaprogramming is an advanced language feature enabling to mix programs with definitions that may be executed either at compile- time or at runtime to generate source code to be put in their place. Such definitions are called metaprograms and their actual evaluation constitutes a compilation stage. As metaprograms are also programs, programmers should be supported in handling compile-time and runtime errors, something introducing challenges to the entire tool chain along two lines. Firstly, the source point of a compile error may well be the outcome of a series of compilation stages, thus never appearing within the original program. Effectively, the latter requires a compiler to track down the error chain across all involved stages so as to provide a meaningful, descriptive and precise error report. Secondly, every compilation stage is instantiated by the execution of the respective staged program. Thus, typical full-fledged source-level debugging for any particular stage should be facilitated during the compilation process. Existing implementations suffer in both terms, overall providing poor error messages, while lacking the required support for debugging metaprograms of any staging depth. In this paper, we outline the implementation of a compile-time metaprogramming system offering all aforementioned facilities. Then, we detail the required amendments to the compilation process and the necessary interaction between the compiler and the tool-chain (IDE).
    [Show full text]
  • Syntax Errors Just Aren't Natural: Improving Error Reporting With
    Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models Joshua Charles Abram Hindle José Nelson Amaral Campbell Department of Computing Department of Computing Department of Computing Science Science Science University of Alberta University of Alberta University of Alberta Edmonton, Canada Edmonton, Canada Edmonton, Canada [email protected] [email protected] [email protected] ABSTRACT programmers get frustrated by syntax errors, often resorting A frustrating aspect of software development is that com- to erratically commenting their code out in the hope that piler error messages often fail to locate the actual cause of they discover the location of the error that brought their a syntax error. An errant semicolon or brace can result in development to a screeching halt [14]. Syntax errors are many errors reported throughout the file. We seek to find annoying to programmers, and sometimes very hard to find. the actual source of these syntax errors by relying on the con- Garner et al. [4] admit that “students very persistently sistency of software: valid source code is usually repetitive keep seeking assistance for problems with basic syntactic de- and unsurprising. We exploit this consistency by construct- tails.” They, corroborated by numerous other studies [11, 12, ing a simple N-gram language model of lexed source code 10, 17], found that errors related to basic mechanics (semi- tokens. We implemented an automatic Java syntax-error lo- colons, braces, etc.) were the most persistent and common cator using the corpus of the project itself and evaluated its problems that beginner programmers faced. In fact these performance on mutated source code from several projects.
    [Show full text]