The Anatomy and Aptness of Learners Programming Language

Muhammad Shumail Naveed

Dissertation submitted for the partial fulfillment of the Degree of Doctor of Philosophy

Department of Computer Science Faculty of Science Federal Urdu University of Arts, Science and Technology Karachi, Sindh, Pakistan.

December 2015

© Muhammad Shumail Naveed 2015

CERTIFICATE OF ORIGINAL AUTHORSHIP

I, Muhammad Shumail Naveed S/O Muhammad Khurshid Zahid, certify that the work in this dissertation has not previously been submitted for a degree nor has it been submitted as part of requirements for a degree except as fully acknowledged within the text.

I also certify that the dissertation has been written by myself and any help that I have received in my research work or in the preparation of the dissertation itself has been acknowledged. In addition, I also certify that all the information sources and literature used are indicated in the dissertation.

————————– ———————– Signature Date

APPROVED BY

Supervisor
Name: Dr. Muhammad Sarim
Assistant Professor
Department of Computer Science
Federal Urdu University of Arts, Science and Technology

——————— ——————— ——————— Signature Seal Date

Co-supervisor
Name: Dr. Kamran Ahsan
Assistant Professor
Department of Computer Science
Federal Urdu University of Arts, Science and Technology

——————— ——————— ——————— Signature Seal Date

Dedicated to my parents

Abstract

Computer programming is a core area of computer science education. However, learning to program is notoriously difficult, and introductory programming courses are known to pose challenges for students. Programming involves a correct understanding of concepts, complicated syntax and the development of problem-solving techniques. It requires beginning students to symbolize their concepts, requirements and methods, to understand ideas of which they may have no previous experience, and to articulate these ideas in the prescribed form of a programming language they have never encountered before. These demands contribute to the high rates of failure and dropout in introductory programming courses.

Decades of research have been devoted to diminishing the problems of introductory programming. Several techniques and methodologies have been proposed, but introductory programming courses still have high dropout and failure rates.

This thesis investigates the existing methods defined to overcome the difficulty of introductory programming and establishes a solution, called a learners’ programming language, as a zeroth programming course that intervenes before the first programming course. The proposed solution aims to introduce the fundamental concepts of programming and to motivate the students, and consequently to prepare them for the first programming course. It is a two-phase foundation programming course, which presents the elementary topics of programming in a graphical environment, later covers these topics with a dedicated textual language, and encourages the students to work in pairs.

An experimental prototype of the proposed solution was developed, and a small study was designed to evaluate the significance of the proposed solution. The study demonstrated that experts acknowledged the proposed solution as a logical way to prepare novice students for the first programming course.
The practical evaluation evinced the significance of the proposed solution. The students who attended the proposed course showed better performance, commitment and self-perceived programming proficiency. The students’ opinions about the proposed solution were somewhat diverse, but the majority of students acknowledged its significance.

The study ultimately suggests that the learners programming language is a viable approach for preparing and motivating students and can be very fruitful in controlling the high dropout and failure rates of the first programming course.

Key words: Introductory programming courses, high dropouts, low retention, motivation and comfort level, syntax and semantics, educational programming languages, CS0, graphical environment, textual language, pair programming, compiler.

Acknowledgements

First and foremost, I am very thankful to Almighty Allah, who has given me the courage and strength to achieve this accomplishment.

I would like to thank my parents for their everlasting love, encouragement, support and constant prayers. In fact, words cannot state how indebted I am to my parents for all of the sacrifices that they made on my behalf.

My deepest gratitude is to my supervisor, Dr. Muhammad Sarim, for his continual support, encouragement, guidance, enthusiasm and immense knowledge. He patiently guided me throughout the work and taught me much about research. He prompted me to write this thesis; the ideas and suggestions he has come up with have had a large influence on the direction in which the research has developed. I am also thankful to my co-supervisor, Dr. Kamran Ahsan, for his valuable suggestions and support.

I am especially grateful to my brother Muhammad Shakeel Azam for his love, encouragement and support. He is the person who sparked my interest in the field of computing.

Furthermore, I would like to thank the University of Balochistan for providing me an opportunity to come to Karachi for research at the Department of Computer Science, Federal Urdu University of Science and Technology.

May the Almighty Allah exquisitely bless all of you.

Contents

1 Introduction
  1.1 Statement of Problem
  1.2 Purpose of Study
  1.3 Research Questions
  1.4 Outline of the Thesis

2 Background
  2.1 Introduction
  2.2 Programming
  2.3 Introductory Programming Courses
  2.4 Dropout and Failure Rates in Introductory Programming Courses

3 Literature Review
  3.1 Introduction
  3.2 Natural Programming Languages
    3.2.1 NLC
    3.2.2 Pegasus
    3.2.3 Metafor
    3.2.4 NaturalJava
    3.2.5 SHRDLU
    3.2.6 HANDS
    3.2.7 SNAP
    3.2.8 sEnglish
  3.3 Visual Languages and Microworlds
    3.3.1 Greenfoot
    3.3.2 Scratch
    3.3.3
    3.3.4 Alice
  3.4 Visualization & Animation Tools
    3.4.1 Jeliot
    3.4.2 Visual interpreter
    3.4.3 VILLE
    3.4.4 PlanAni system
    3.4.5 EduVisor
  3.5 Based Programming Environments
    3.5.1 RAPTOR
    3.5.2 Flowgorithm
    3.5.3 ProGuide
    3.5.4 Iconic Programmer
  3.6 Mini-languages
  3.7 CS0 “Pre-programming” Courses
  3.8 Other Popular Solutions

4 Learners Programming Language
  4.1 Introduction
  4.2 Learners Programming Language
  4.3 Aim and Objectives
  4.4 Course Description
  4.5 Phase I
  4.6 Pair Programming
  4.7 Phase II
  4.8 Dedicated Textual Language
    4.8.1 Subroutines
    4.8.2 Data types and variables
    4.8.3 Literals
    4.8.4 Operators
    4.8.5 Assignment
    4.8.6 Input/Output
    4.8.7 Selection Structure
    4.8.8 Loops
    4.8.9 Comments
    4.8.10 Error handling
    4.8.11 High level code generation
    4.8.12 Flexibility of a textual language
  4.9 Implementation

5 Dedicated Textual Language
  5.1 Introduction
  5.2 Lexical Structure
    5.2.1 Regular Expression
    5.2.2 Definition of lexical structure
    5.2.3 Regular Grammar
  5.3 Syntax and Grammar
    5.3.1 Context-free grammar
    5.3.2 Definition of syntax
  5.4 Semantics
    5.4.1 Attribute grammar
    5.4.2 Definition of semantics
  5.5 Design of Textual Language Translator
    5.5.1 Lexical analyzer
    5.5.2 Syntax analyzer
    5.5.3 Semantic analyzer
    5.5.4 Symbol table management
    5.5.5 Error handling
    5.5.6 Intermediate code generator
    5.5.7 Code generator

6 Experimental Prototype of Learners Programming Language
  6.1 Introduction
  6.2 Phase I
  6.3 Phase II
  6.4 Pair programming
  6.5 Implementation

7 Evaluation & Discussion
  7.1 Introduction
  7.2 Expert Judgment
    7.2.1 Hardness of CS1
    7.2.2 Main cause of problems in CS1
    7.2.3 Paradigm of introductory programming courses
    7.2.4 Significance of prior knowledge in CS1
    7.2.5 Significance of CS0
    7.2.6 Significance of the graphical environment in CS0
    7.2.7 Significance of learners programming language
    7.2.8 High level code generation
    7.2.9 Motivation and comfort level
    7.2.10 Implementation of LPL course
    7.2.11 Essential topics
    7.2.12 Pair programming
    7.2.13 Implementation risk of LPL course
  7.3 Practical Evaluation
    7.3.1 Student’s performance
    7.3.2 Self-perceived programming proficiency
    7.3.3 Commitment to class
    7.3.4 Student’s perception
    7.3.5 Interest in next programming course

8 Conclusions and Future Work
  8.1 Overview and Conclusion
  8.2 Future Work

Appendices
  A Textual Language Grammar
  B Textual Language Grammar in EBNF
  C Synchronizing Sets

Chapter 1

Introduction

This chapter concisely introduces the problem area and describes the central objectives of the research. It also describes why the problem addressed by this research is important. Finally, the structure of the thesis is described.

1.1 Statement of Problem

Programming skill is a core component of computer science courses and one of many skills that students of computer science programs are assumed to master [302, 342]. However, learning to program is extremely difficult for beginners [131, 351, 405], and similarly the first programming language is usually very hard [365]. It requires novice students to symbolize their concepts, requirements and methods, to obtain an understanding of ideas of which they may have no previous experience, and to represent these ideas in the particular form of a programming language they have never encountered before. Sufficient knowledge of problem-solving techniques is essential for programming [448], and the syntax and semantics of programming languages are difficult for novice students [417]. Complicated syntax and semantics, lack of prior knowledge and lack of problem-solving skills are the contributing factors behind the high rate of failure and dropout of students in introductory programming courses. García-Mateos and Fernández-Alemán argued that the implicit intricacy of the matter and a lack of motivation among students are the two main causes of dropout [151]. Several studies report that the dropout and failure rates of students in introductory programming courses are usually very high [46, 151, 229, 390, 429].


Other studies have identified that many students who have completed the course may not know how to program [302] or how to methodically analyze a small fragment of code [273]. Even the majority of students nearing graduation may struggle and fail to design software systems [130, 278].

1.2 Purpose of Study

The purpose of this study is to define a solution for the novice students of programming and determine whether it can increase students’ performance in a first programming course, can increase student motivation and comfort level, and can raise student engagement in the first programming course.

1.3 Research Questions

This thesis aims to answer the following question:

“Can the two-phase programming foundation course called the learners programming language, which combines a graphical environment with a dedicated textual programming language and is augmented with pair programming, prepare novice students to learn and successfully complete the first programming course?”

An answer to this main question will be sought in parts by using three secondary research questions:

1. “Does the learners programming language improve the performance of novice students in the first programming course?”
2. “Does the learners programming language increase the students’ motivation and comfort level in the first programming course?”
3. “Is the retention level of students in the first programming course increased by the intervention of the learners programming language?”

An attempt will be made to answer these questions through the definition of a learners’ programming language and the implementation of its prototype.

1.4 Outline of the Thesis

Chapter 2 defines some rudimentary terminology and identifies the intricacies associated with introductory programming. Chapter 3 reviews the academic literature concerning the support of introductory programming. Chapter 4 presents the objectives and intrinsic features of the learners’ programming language. It also defines the formal structure of the learners programming language. Chapter 5 is the continuation of chapter 4 and describes the design and construction of a core component of the learners programming language. Chapter 6 describes the experimental prototype of the learners programming language. Chapter 7 provides the results of the evaluation of the learners programming language. Chapter 8 recapitulates the contributions of this research and outlines potential future work.

Chapter 2

Background

In chapter 1, the rationale and aims of the study were presented; this chapter defines fundamental concepts and identifies the major intricacies associated with introductory programming courses.

2.1 Introduction

A language is a system that provides communication between entities. Any language that human beings learn from their surroundings and use to communicate with others is called a natural language [176]. Natural languages are used to articulate knowledge, acquaintance and sensations and to communicate our responses to others. In essence, natural languages are the most powerful and logical means of communication. A language is not only a system of communication, but also a form of power [150]. According to Brown and Ogilvie, there are more than 6800 different languages in the world and over 250 families of established languages [65]. It is argued that the development of human language is one of the compelling evolutionary events on earth [143]. The science of natural languages is called linguistics [59].

Programming languages are artificial languages developed to communicate instructions to a machine. They are primarily used to communicate with literal-minded machines [438]. Mitchell defines a programming language as a system of expression [323]. For the convenience of programmers, programming languages provide abstractions, different constructs, control structures and organizing principles for the development of high-quality programs.

The study of programming languages is called programming linguistics [477]. The formal study, analysis, design, development, improvement and implementation of programming languages and language processors are the prime concerns of programming linguistics. There are about 2000 to 3000 notable programming languages reported on the Web [218]. Programming languages are generally classified into four groups: imperative, logic, functional and object-oriented [282, 416].

An imperative language is identified by three fundamental properties: instructions are executed sequentially; variables represent memory locations; and the values of variables are changed by assignment [282]. Fundamentally, imperative languages are developed around the von Neumann architecture [416, 484]. The C language [162, 228] is a prime example of an imperative language. Similarly, visual languages (previously called fourth-generation languages) form a class of languages. .NET languages [413, 450] are the most popular visual languages [416]. These languages contain drag-and-drop facilities for the construction of program segments.

A logic programming language is a rule-based language founded on the operational perspective of predicate calculus [483]. In logic programming languages, programs are expressed in the form of symbolic logic and employ a logical inferencing system to generate results [282, 416]. Prolog [98] and Datalog [113] are well-known examples of logic programming languages.

A functional programming language belongs to a class of programming languages in which there is no difference between expressions and statements [483]. Names are employed only to specify functions and expressions, not to specify memory locations. Functional programming languages are based on the lambda calculus [232, 285, 394], which is a simple model of computation [158]. LISP, FP, Miranda, HOPE and Haskell are examples of functional programming languages.

An object-oriented programming language is a language that supports the development of programs in the form of objects. Object-oriented programming is a realistic and helpful programming methodology that promotes modular design and software reuse. Nearly all object-oriented programming languages support data abstraction [431]. Smalltalk, C++ [441], Java and C# are leading object-oriented programming languages. Aside from these four paradigms, there are many other related programming paradigms [156].
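The contrast between the imperative and functional styles described in this section can be illustrated with a minimal sketch. Python is used here purely for illustration; it is not one of the languages discussed in this thesis, and the function names are hypothetical. The first version relies on sequential statements and assignment to a variable, while the second composes expressions and never reassigns a name.

```python
from functools import reduce


def sum_even_squares_imperative(values):
    """Imperative style: sequential statements that update a variable."""
    total = 0                      # the name refers to a mutable storage cell
    for v in values:               # statements execute one after another
        if v % 2 == 0:
            total = total + v * v  # assignment changes the stored value
    return total


def sum_even_squares_functional(values):
    """Functional style: a composition of expressions, no reassignment."""
    evens = filter(lambda v: v % 2 == 0, values)
    squares = map(lambda v: v * v, evens)
    return reduce(lambda acc, s: acc + s, squares, 0)
```

Both functions compute the same result for the same input; only the way the computation is expressed differs.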

2.2 Programming

Computer programming is a constantly progressing discipline, and each step in its development institutes a new paradigm that formulates more complex software-development tasks [7]. The area of programming languages is part of the nucleus of computer science [468], and programming is one of the expected skills of computer science students [135, 302]. Programming is an extremely helpful skill and can be a rewarding career [391]. Learning and teaching programming would be beneficial to students from any domain. However, programming is a highly complex activity [50, 103, 104, 347, 355, 465] and requires declarative and procedural knowledge. Declarative knowledge is the knowledge of programming syntax and semantics, whereas procedural knowledge comprises problem-solving and program design skills [382]. Programming is not simply knowledge; it is a skill and requires a lot of time and effort to develop.

To become a programmer, many aspects of programming must be considered, essentially syntax, semantics and pragmatics. Furthermore, sufficient knowledge of problem-solving techniques is essential [448]. Winslow identified that it takes about ten years to transform a beginner into an expert programmer [489].

Learning programming is considered a complex task because it involves a correct understanding of concepts and the development of problem-solving techniques [348], and it is therefore usually difficult for students [115, 225, 277, 400, 466]. It requires beginning students to symbolize their concepts, requirements and methods, to obtain an understanding of ideas of which they may have no previous experience, and to articulate these ideas in the prescribed form of a programming language they have never encountered before.

Complicated syntax, an unusual structure, the length of time needed to develop a program and working in isolation are the most highly cited reasons for the difficulty of learning programming [10, 11]. Comprehending and grasping the syntax of a programming language can be frustrating for beginners [110, 371, 439]. The unfamiliar and odd syntax of a contemporary programming language is a big challenge for beginners because they are compelled to learn the syntax and conventional programming skills at the same time. Even professional programmers may be frustrated and hampered by the need to understand the syntactic detail of a new programming language. For beginners especially, the rigid syntax of programming languages causes several setbacks, particularly a lack of motivation and encouragement. The structure of programming is different from the structure followed in any other area of study. The common structures followed in programming (sequence, selection and iteration) cannot be compared to common or typical examples. Instead, these structures have different methods and therefore require a different thinking pattern. Hence, learning to form structured solutions to problems is a big challenge [225].

Students are required to develop numerous skills to be able to comprehend programming and, more importantly, to develop computer programs that solve problems. Beginning students lack the acquaintance and skills of programming experts. The knowledge of beginning students is usually not general, but context specific. Novice students lack an ample mental model and are restricted to surface knowledge of a subject [489], and therefore they cannot learn and apply programming knowledge appropriately. Lahtinen et al. [258] argued that programming is complex not only because of its abstract concepts; students also have difficulties with different issues associated with program construction.

Although some of the complexity is inherent in programming, it may be more arduous than necessary since it requires solutions to be conveyed in ways that are unnatural for beginners.

Complicated syntax and semantics, lack of prior knowledge and lack of problem-solving skills are the contributing factors behind the high rate of failure and dropout of students in introductory programming courses. Sarpong et al. [409] highlight that teaching methods and strategies are contributing factors to the high failure rates of programming students.

Computer science educators are being challenged to discover the correct blend of pedagogy and technology for their curriculum in order to help students persist [372]. Pillay and Jugoo [367] claim that the problems are associated with problem-solving skills and the first language of the student.
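The three control structures named in this section (sequence, selection and iteration) can be seen together in a minimal sketch. The example is written in Python for illustration only, and the function name and the default pass mark are hypothetical choices rather than anything taken from the cited studies.

```python
def count_passing(marks, pass_mark=50):
    """Count how many marks reach the pass mark."""
    # Sequence: statements execute in the order in which they are written.
    passed = 0
    # Iteration: the loop body is repeated once for every mark.
    for mark in marks:
        # Selection: the branch is taken only when the condition holds.
        if mark >= pass_mark:
            passed += 1
    return passed
```

Even in this small fragment a beginner must track the order of execution, the repeated loop body and the conditional branch at the same time, which hints at why forming structured solutions is demanding for novices.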

2.3 Introductory Programming Courses

The introductory course on programming is commonly known as CS1 [19, 67]. The first programming language of a student is extremely important because it sets the tone for all subsequent computing classes [200]. The first programming language works as a reference for learning further programming languages [209]. However, comprehending a first programming language is usually a difficult task in that it requires knowledge of programming theory, syntax, semantics, programming terminology, logic and problem solving [386]. Ambrósio et al. described problem solving as comprising analysis and synthesis [18]. Analysis is the capability to break a problem down into its segments and take them individually so that the problem can be more clearly understood and handled. Synthesis is putting the segments together after they have been handled separately so as to find a solution to the actual problem.

The selection of the for introductory programming courses is a vital issue and a subject of intensive debate [61, 380]. Similarly, the selection of an appropriate programming language for introductory programming courses is a challenging task [254] and an emotional issue [57, 161]. In fact, the selection of a programming language for the first programming course is a critical decision [68, 314] that ultimately affects the attitude and motivation of students. A programmer’s first programming language has an impact on programming ability as profound as the impact of one’s native language on one’s contemplation and thought patterns [480]. Indeed, the programming language used in CS1 and CS2 is considered a fundamental factor in the future progress of students [283]. It is generally recommended to use a programming language developed exclusively for teaching. However, due to pressure and other factors, commercial languages are normally selected for introductory programming courses.

In the past, the programming languages for introductory programming courses were selected on non-pedagogical factors. For instance, in the 1980s Tharp [452] defined selection factors to help educators choose a programming language by comparing COBOL, FORTRAN, Pascal, PL/1 and Snobol. The criteria included factors such as the speed of code implementation and the speed of compilation. Today, pedagogical significance is recognized as the most important factor in choosing a programming language. Parker et al. [361] defined a set of criteria for choosing the programming language for an introductory programming course and recognized the pedagogical factor as one of the important factors that must be considered when deciding on the language.

Mannila and de Raadt [291] defined a set of criteria, compared programming languages and suggested that the most appropriate programming languages for teaching are Python and Eiffel, whereas Java, which was developed principally for commercial applications, still has importance as a teaching language [291].

In [474], Wallace and Martin examined the factors for the selection of a first programming language and argued that Java is a simple path to understanding object orientation and programming skills. However, Hadjerrouit argued that Java is a comparatively difficult language for novice students with no prior background in programming [168]. Hadjerrouit also identified that Java is more appropriate for students with some prior knowledge of programming, particularly in C/C++. In [460], Tuttle recommended Java as a second programming language.

However, Biddle and Tempero [51] argued that Java is not a compelling choice for teaching novice students since it has several pitfalls, such as the low enforcement of encapsulation, implicit pointers, the distinction between primitive and object types, difficulties related to genericity and the early introduction of difficult topics such as exceptions and static functions. A case study on the selection of a first programming language is presented in [210]; it reports that Java is unsuitable for introductory programming courses.

2.4 Dropout and Failure Rates in Introductory Programming Courses

Introductory programming courses are generally known to pose challenges for both students and instructors [93] and are also infamous for high dropout and failure rates. Similarly, low retention has conventionally been an issue in major degree programs and several introductory computing courses [351]. These facts have caused novices to fear programming and to form a negative opinion of their programming education [244]. In [429], Sloan and Troy reported an attrition rate of 40-60% at the University of Illinois. Bennedsen and Caspersen [46] estimated that more than two million students started computer science studies in colleges and universities all over the world in 1999 and that 650,000 (33%) dropped or failed their first programming course.

For many years, the topics of student dropout and retention have been extensively discussed. Several researchers have conducted detailed studies to identify the causes of high dropout and low retention. A landmark study [229] analyzing the causes of dropout in a CS1 course indicates that several factors influence a student’s decision to leave CS1. The most recurrent reasons were lack of motivation and lack of time. The longitudinal case study [338] indicated that programming discipline activity, limited student motivation and course arrangement are the main problems in a first programming course.

Kinnunen and Malmi [229] reported a very high dropout rate at Helsinki University of Technology. In order to identify the causes, a detailed study was conducted on the decisions of computer science minor students to drop out of the CS1 course. This course has a yearly enrollment of 500-600 students, and a dropout rate of 30-50% has been observed. During the study, eighteen students who had dropped out of the CS1 course were interviewed. The results revealed that there are multiple reasons behind students’ decisions to leave the CS1 course, and therefore a single measure to reduce dropout cannot be very fruitful.

Xenos et al. [495] conducted a survey to identify the reasons for the attrition of students from the Informatics course at the Hellenic Open University. Five reasons for dropout were identified. The largest share of students (62.1%) stated that the reason for their attrition was their profession, while many (46.2%) claimed that academic reasons caused their dropout; fewer students claimed that they dropped out for family reasons (17.8%), health reasons (9.5%) and personal reasons (8.9%).

Boyle et al. [60] studied what makes computer science students succeed and conjectured that the final-year grade of students has no relation to their previous mathematical background, prior computing qualification, university entry score or school-level outcomes, but depends on their attitude, expectations and the pattern of educational methodologies followed by the universities.

Ventura [471] studied the determinants of success in an object-first course and deduced that student effort and comfort level are the most significant predictors of success. Similarly, Wilson [487] conducted a study and concluded that comfort level is one of the several factors that foster success in a computer science course.

Motivation and effort are interrelated, and since programming requires regular practice, sustaining students’ motivation is of the highest significance (Settle et al. [419]). Motivation and involvement are influential factors in retaining students in a particular program [376]. According to a study [229], lack of motivation is a significant reason for high dropout. Motivation has a significant role in preventing beginners and students from surrendering and feeling defeated [78].

Decades of research have been devoted to diminishing the problems of introductory programming [472]. Several techniques and methodologies have been proposed, but introductory programming courses still have high dropout and failure rates [475].

Chapter 3

Literature Review

This chapter establishes the research context through a review of the academic literature concerning the support of introductory programming. It endeavors to identify the major systems and approaches formalized to reduce the difficulty of contemporary programming languages. It also reviews the precursor systems introduced to students to prepare them for introductory programming courses.

3.1 Introduction

Programming is an area of challenge within the domain of computing education research [415], with high dropout and low retention rates. Unlike in other courses, many novice students have never been introduced to programming before. Learning how to program is a complex process, and novice students commonly struggle with it.

Decades of studies have been conducted to diminish the problems of introductory programming, and many solutions have been developed. This chapter reviews the existing solutions introduced to support programming. It is important to mention that the boundaries between the solutions and tools described in the subsequent sections are quite fuzzy, and a solution may belong to several groups. A very simple and straightforward scheme is therefore followed to categorize the solutions, and it is entirely possible to categorize them in many other ways.


3.2 Natural Programming Languages

Natural language programming (NLP) is a class of programming that allows programs to be written in natural language [234]. NLP is an important effort to lessen the complexity of programming languages. The use of natural programming languages is recognized as an active research area of computational linguistics and computer science. The immense development in parsing and other fields of natural language processing and computational linguistics makes it possible to use natural language in programming.

Programming in natural language is an old dream, but many researchers, such as Dijkstra, considered it a foolish idea [123]. However, there are many strong partisans of natural language programming [465]. Since the 1960s, several notable natural programming languages have been introduced, and some of them are described below.

3.2.1 NLC

In [30], Ballard and Biermann described a system called NLC (Natural Language Computation), which is based on natural language and allows the user to type commands in English and watch them being executed on the output panel. NLC was developed for the manipulation of data stored in tables or matrices. Because the problem domain of NLC is the world of matrices, all references are to entries, rows and columns, basic operations are performed on them, and no reference is made to the notions of the underlying problem. A scanner, a syntax analyzer, a semantic processor and a matrix computer are the components of the NLC system [52].

The dictionary of NLC contains about 450 entries, of which 60 are imperative verbs, 12 are domain nouns and about 20 are functional nouns. It also contains numerous comparatives and adjectives. Since many English words can be used in different senses, tokens in the symbol table and other tables quite often have multiple meanings. Declaratives and interrogatives are not required in NLC, so each input must be an imperative and must start with an imperative verb. NLC can display the current execution state on the screen, which helps users to remove errors and confirm valid actions. Because NLC was developed for the manipulation of matrices, elementary concepts of introductory imperative programming such as data types, variables, scopes, initialization, input/output, selection, repetition, arrays and abstract data types cannot be learned with NLC.

3.2.2 Pegasus

Pegasus is a multilingual natural programming language developed at the Darmstadt University of Technology [234]. It can read and understand natural language and generate the equivalent executable program files. Pegasus can transform programs written in natural languages including English and German. The architecture of Pegasus is modeled on the brain: the mind, the long-term memory and the short-term memory are its main components.

Pegasus aims to ease programming for all classes of users, including experts and beginners, computer scientists, engineers and mathematicians. Owing to the use of natural language, a program written in Pegasus is simple, understandable and concise. From the patterns of natural language allowed in Pegasus, it can be assumed that a large number of users can easily program in Pegasus without any specific knowledge of programming. However, the pattern of programming in Pegasus is markedly different from the style of imperative programming. The statements of Pegasus are in some manner imperative, though they are largely based on the notions of natural language. For instance, pronouns such as “it”, “he” and “that” are frequently used for references, whereas this type of implicit referencing is nearly unavailable in imperative programming languages. Syntactic and semantic compression is nearly absent in imperative programming languages but extensively employed in Pegasus. Pegasus simplifies programming, but it is neither useful for comprehending the fundamental concepts of introductory imperative programming nor suitable as a precursor to introductory imperative courses.

3.2.3 Metafor

In [275], Liu and Lieberman argue that every computer program tells a story and that programming is therefore the art of creating a story. They introduced the idea of using a natural-language specification as a representation of programs and developed a system called Metafor that renders stories typed by the user as code. As the user types a story, the system automatically generates the corresponding visualization of the user’s description as scaffolding code. According to Liu and Lieberman, Metafor can achieve at least two objectives: supporting beginners in developing intuitions about programming, and helping intermediate programmers as a brainstorming tool. At a minimum, however, sufficient reading knowledge of a programming language is assumed of users. The Parser, Programmatic Interpreter, MetaforLingua, Code Renderer, Introspection, Dialog and User Interface are the main components of Metafor.

The generated code is not directly executable, yet it is fruitful as a brainstorming tool. On the whole, Metafor is quite attractive and may help beginners in understanding programming. However, introductory programming concepts such as variables, data types, input/output and functions cannot be directly learned with Metafor.

3.2.4 NaturalJava

Price et al. [371] developed NaturalJava, a natural-language user interface for developing Java programs, at the University of Utah. Sundance, PRISM and TreeFace are the components of NaturalJava. Sundance is a natural language processing system that accepts English sentences and applies information extraction methods to construct case frames. PRISM is a case frame interpreter that uses a decision tree to deduce program operations from the case frames. TreeFace is an abstract syntax tree manager that provides the interface used by the case frame interpreter to manipulate the syntax tree. NaturalJava allows programming in natural language, but the pattern of programming is quite imperative: variables, methods and loops are explicitly defined. Because the pattern of its statements is very close to that of conventional (synthetic) programming languages, it may be useful for students in understanding imperative programming languages. However, it is more beneficial for object-oriented programming, since it is designed around Java notions such as attributes, methods and classes.

3.2.5 SHRDLU

SHRDLU is one of the first natural language interfaces and was developed by Terry Winograd. SHRDLU can recognize instructions and carry on a conversation about a world (Blocksworld) that consists of some toy blocks on a table [465]. Basically, SHRDLU is an artificial intelligence system that permits users to interact with objects in Blocksworld.

3.2.6 HANDS

HANDS (Human-centered Advances for the Novice Development of Software) is a programming system developed especially for children and based on usability [356]. Principally, it is an event-based programming language with domain-specific facilities for developing simulations and interactive animations. HANDS has a conversational style and fully supports aggregate operations. In HANDS, an agent called Handy sits at a table and manipulates information on cards. Cards store all the data in the system. Every card has a unique, case-insensitive name, and a temporary naming mechanism is available, so there is no concept of variables.

HANDS is a very powerful programming system and helps in understanding certain types of programming concepts, but unfortunately these concepts do not help in understanding the generic concepts of programming and contemporary programming languages.

3.2.7 SNAP

SNAP (A Stylized NAtural Procedural language) [199] is a natural-language-based procedural language developed for nonscientists [37]. A SNAP procedure consists of statements based on English sentences. SNAP was defined in late 1966, and the prototype processor was developed almost completely in FORTRAN IV [36]. Several SNAP structures are similar to those of COBOL. A small control section, a translator and an interpreter are the fundamental constituents of the prototype processor. A SNAP statement generally begins with an imperative verb such as PRINT, READ, DELETE, APPEND or REPEAT [35]. The statements of SNAP resemble English sentences but are organized in a computable manner. For example, the print statement for a string consists of the verb PRINT, a quotation and a period, and the input statement consists of the verb READ, a name and a period. Although SNAP was designed essentially for teaching, its string handling techniques are principally appropriate for text processing systems. The statements of SNAP are somewhat imperative yet very verbose, so SNAP may not be a very cogent precursor to the first introductory programming course, but it may be helpful in a course on basic data processing.
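The statement shapes described above (a verb, an operand and a closing period) can be illustrated with a minimal sketch in Python. It is not the SNAP processor and does not claim to reproduce the real SNAP grammar: the two regular expressions, the function name run and the sample statements are assumptions made only for illustration.

```python
import re

# Hypothetical patterns for two SNAP-style statements, based only on the
# description in the text: PRINT + quotation + period, READ + name + period.
PRINT_RE = re.compile(r'^PRINT\s+"([^"]*)"\s*\.$')
READ_RE = re.compile(r'^READ\s+([A-Za-z][A-Za-z0-9]*)\s*\.$')


def run(statements, inputs):
    """Execute SNAP-like statements; `inputs` supplies the values for READ."""
    output, variables = [], {}
    pending = iter(inputs)
    for statement in statements:
        text = statement.strip()
        if (m := PRINT_RE.match(text)):
            output.append(m.group(1))             # the quoted string is printed
        elif (m := READ_RE.match(text)):
            variables[m.group(1)] = next(pending)  # the name receives the input
        else:
            raise ValueError(f"unrecognised statement: {statement!r}")
    return output, variables


out, names = run(['PRINT "HELLO".', 'READ TITLE.'], ["Walden"])
print(out, names)
```

Each statement is recognised by its leading imperative verb, which mirrors the observation that SNAP statements are English-like yet organized in a computable manner.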

3.2.8 sEnglish

sEnglish [235] is a commercial tool that allows natural language to be used for writing MATLAB programs. It can produce LaTeX and HTML versions of sEnglish documents written in natural language. sEnglish is suitable for numerical computing applications, small handheld devices, and complex and autonomous systems. It is available for different platforms, including Linux, and is also obtainable in combination with Python, SciLab and Octave. The English allowed in sEnglish is fairly simple, yet some knowledge of MATLAB is essential to program in sEnglish. The definition of a variable in sEnglish is quite complex because multiple properties of a variable are defined in a single statement, which differs from imperative languages.

In their seminal paper [279], Lopes et al. presented valuable details on aspects of natural language in the design of programming languages. Through several examples, naturalistic programming languages and some of their properties are analyzed, and it is finally concluded that natural languages are unsuitable for programming.

Redundancy avoidance, locality and immediacy are fundamental attributes of natural languages, and therefore implicit referencing, context dependence and compression are frequently used in natural programming languages. These features increase readability and decrease the intricacy of programming, but they are almost unavailable in imperative programming. The central goal of natural language programming is to achieve sufficient comprehension to enable the use of natural language as a communication medium between user and computer, thereby removing the need for the user to learn a conventional programming language [270]. For these reasons, the notion of natural language programming is not very useful for introductory programming courses. However, imperative-oriented natural language programming is likely to be useful in supporting introductory programming. Similarly, it is widely argued that the use of natural language does not affect introductory programming [345, 346].

Moreover, it is important to state that although natural language programming is an active research area, no extensive study has reported its significance in preparing and motivating students for the first programming course. However, natural language programming is very useful for end-user programming and is already used in different areas, including robot programming [271], databases [23, 269, 368, 465] and question answering systems [31, 171, 186, 223, 262, 280, 494, 505].

3.3 Visual Languages and Microworlds

A visual programming language (or just visual language) is a graphical tool that supports introductory programming. These languages free students from the rigid and complicated syntax of programming languages and motivate them to concentrate on problem solving by providing drag-and-drop facilities for developing programs. Block languages are visual languages based on the notion of programming bricks [140]. Visual, block-based programming environments provide another way of teaching programming to beginners and have proven quite successful for novices. A study [373] on novice programming suggests that block environments can improve novice performance on a number of programming activities. There is a large range of visual languages, and some of them are described below.

3.3.1 Greenfoot

Greenfoot is an educational integrated development environment focused on learning and teaching programming to young novices. The target audience ranges from about 14 years old upwards, and Greenfoot is also useful for college- and university-level education [242]. One of the central aims of Greenfoot is a design that clearly visualizes essential concepts of object-oriented programming. The Greenfoot design was motivated by microworlds and direct interaction environments. With the proper use of Greenfoot, novice students should become familiar with the primary ideas of object-oriented programming and with imperative programming concepts [180].

The Greenfoot system allows the creation of interactive, simulation-like programs in a two-dimensional plane. The environment of Greenfoot is quite simple to understand. Figure 3.1 shows the Greenfoot main window. In Greenfoot, applications are called scenarios, and the environment allows students to run these scenarios, modify and re-run them, and see the effect of the modification [149]. However, it provides no real experience of programming.

3.3.2 Scratch

Scratch is a programming language designed by the MIT Media Lab for teaching programming to young people and other novice programmers [145]. It allows the development of interactive stories, computer games, computer animations, graphic artwork and all other types of multimedia projects. Scratch is also very useful for creating book reports, science projects, greeting cards, simulations and tutorials. The Scratch project began in 2003 and was publicly launched in 2007 [289]. It is free and available in almost 50 languages, and more than two million copies of Scratch have been downloaded from the Scratch Web site. It was originally developed to help people between the ages of 8 and 16 [145].

Figure 3.1: Greenfoot main window [241]

Scratch allows beginners to develop programs by snapping together blocks to control 2-D graphical objects (sprites) moving on a stage. Scratch comprises different blocks and a graphical development environment that includes a paint application for developing graphics and built-in sound editing facilities. Scratch also includes large collections of model applications as well as sound files and graphics, all of which can be used to create Scratch projects. Scratch projects can be shared on the Scratch Web site or saved to the file system. Scratch is very helpful for complex interplay, both in its use of asynchronous processes and in concurrent programming [492].

In Scratch, it is extremely simple to develop new projects that include graphics, pre-built code blocks and sound files. Scratch allows programmers to modify projects on the fly, even while they are running, and is therefore a dynamic programming language. Scratch does not support a text-based approach to programming and therefore provides very little support for writing and learning actual programs. However, it is helpful in understanding programming concepts, including the use of variables, conditional and iterative logic, sound effects and graphics. A study [268] on Scratch and Logo identifies that Scratch provides several affordances that imply that it should be simple to learn and interpret.

The Scratch user interface is quite easy to navigate. It avoids floating palettes and minimizes the use of panes. Figure 3.2 shows the Scratch user interface.

Figure 3.2: Scratch user interface [289]

Scratch is an interpreted language, so projects are not precompiled before execution. Instead, the code that defines a Scratch project is interpreted and processed every time the application is executed. In [238], Kobsiripat claims that Scratch is useful for elementary school students and can lead to innovative development. Scratch is more suitable for object-oriented courses, yet may be useful for the imperative paradigm.

There are several variants of Scratch. For instance, Snap! (formerly BYOB) is a visual drag-and-drop educational programming language. It is an extended implementation of Scratch. Snap! runs in the browser and allows users to create blocks. It is suitable for high school or college students. Figure 3.3 shows the Snap! environment.

Figure 3.3: Snap! environment

In [310], Meerbaum-Salant et al. demonstrated that students learning to program in Scratch develop several bad habits that run contrary to recommended programming practices, such as excessive decomposition and a bottom-up development method that starts with individual blocks. Moreno and Robles [331] further investigated the findings of [310] by analyzing 100 Scratch projects and reported that, in general, the projects suffer from the investigated malpractices.

3.3.3 Blockly

Blockly is a web-based visual programming tool designed by Google. With Blockly, it is extremely simple to create solutions by dragging and dropping components from the toolbox. Users can develop applications by joining blocks, given some acquaintance with programming, and Google Blockly can therefore be viewed as a high-level graphical programming language. Users drag and drop blocks into the workspace, and each block defines a segment of code. The blocks include placeholders for variables and sub-clauses of commands and are able to express the scope of program-segment containment [295]. The block-based nature of Blockly permits students to concentrate on the semantics of an algorithm rather than its syntax [216]. Blockly is also available for mobile platforms. The code base of Blockly allows code to be generated in Python as well as in JavaScript. The Blockly environment is quite simple and understandable, as shown in Figure 3.4.

Figure 3.4: Blockly environment

Several programming concepts, including variables, selection structures, iterative structures and functions, are easy to learn with Blockly.
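The idea that each block defines a segment of code, holds placeholders, and contains other blocks can be sketched in a few lines of Python. This is a toy model written for illustration only. It does not use or imitate the Blockly API, and the class Block, the function generate and the sample program are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A toy block: a code template, placeholder values and contained blocks."""
    template: str                                       # code segment with placeholders
    slots: dict = field(default_factory=dict)           # values for the placeholders
    body: List["Block"] = field(default_factory=list)   # blocks nested inside this one


def generate(block: Block, indent: int = 0) -> List[str]:
    """Render a block and the blocks it contains as indented Python lines."""
    lines = ["    " * indent + block.template.format(**block.slots)]
    for child in block.body:
        lines.extend(generate(child, indent + 1))  # containment becomes indentation
    return lines


# A repetition block that contains one assignment block.
program = Block(
    "for {var} in range({count}):",
    {"var": "i", "count": 3},
    [Block("total = total + {var}", {"var": "i"})],
)
source = "\n".join(["total = 0"] + generate(program))
namespace = {}
exec(source, namespace)   # run the generated text as an ordinary Python program
print(source)
print(namespace["total"])
```

Because the learner only chooses blocks and fills placeholders, the generated text is well formed by construction, which is the sense in which a block environment shifts attention from syntax to semantics.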

3.3.4 Alice

Alice is an educational programming language with a powerful integrated development environment. It uses an object-oriented pattern of programming [181, 334] and is currently available for multiple platforms [212]. Alice was initially developed by Randy Pausch as a rapid prototyping tool for virtual reality animation, complete with glove sensors and head-mounted devices [112]. Pausch employed the original Alice in his Building Virtual Worlds course, where novice programmers had to work together to create and demonstrate virtual worlds.

Alice is programmed in Java and was developed to give students room to understand variables, arrays, events, objects, classes, expressions, inheritance, recursion and data structures [110]. Students can develop simple games and animations by dragging and dropping graphical fragments into an editing area. Alice helps beginning students to gain capacity in computational thinking. It uses the context of animation, game construction and storytelling to introduce novice programmers to programming and problem solving. Alice’s target age is best described as 12-18, or from early secondary school to first-year college students [101]. The Alice 3 code editor, scene editor and runtime display are shown in Figure 3.5.

Figure 3.5: Alice 3 environment [112]

The objects in Alice exist in a three-dimensional virtual world, similar to those of video games or films [425]. The virtual world in Alice is itself an object; it has properties and methods. In fact, Alice is similar to other contemporary object-oriented programming languages. However, in Alice it is not necessary to learn the grammar and syntax of the language in order to develop computer programs. While learning Alice, students can focus on understanding the concepts of computer programming, including the logic of the algorithm, rather than on the spelling and grammar of a language.

The virtual world of Alice is one that can be seen. It has three-dimensional space, and each object in the world has properties such as size, position and color. Alice includes a camera that allows the virtual world to be visualized on the screen. Its environment shows the objects in the virtual world and makes it simple to comprehend computer programming with Alice. Minimal memorization of syntax, visualization and quick feedback are essential traits of Alice [181].

There is no syntax in Alice that has to be mastered. Program elements are simply clicked together to develop a program. Owing to its simplicity, even primary school children can develop programs. However, programming in Alice is very different from programming in an actual programming language [403].

Alice is also used in other programming courses. For instance, Mullins et al. [333] recommended Alice 2.0 as a first programming language, compared it with C++ and found Alice more effective in improving retention. Similarly, in [95] Alice and Java were used in CS1, and feedback from students indicated that the experience with Alice helped them to learn Java.

Alice has the potential to be an astounding pedagogical tool, yet it has several stumbling blocks that must be considered when using it. In [370], Powers et al. critically analyzed Alice. An important feature of Alice is its support for graphical programs that manipulate 3-D objects in a 3-D virtual world. Games, animated movies and other visual programs are easy to create in Alice. It provides a drag-and-drop graphical user interface and permits students to concentrate on more significant programming elements.
The drag-and-drop interface in Alice separates learning syntax from learning semantics. For instance, when a novice programmer drags an if-statement into the program, Alice compels the programmer to select a valid Boolean value. The programmer can only drag valid statements. The structure thus rules out syntax errors, but not logic errors. Alice may also frustrate students, especially when they are trying to make 3-D objects move like real-world objects. Defining objects to move in a realistic way is usually difficult in Alice, and students become so engrossed in defining the movements of their 3-D objects that they overlook the prime objective of learning elementary programming concepts. Much the same situation arises when students try to prevent objects from moving. Despite these problems, Alice increases the confidence of students, but this confidence applies mostly to programming with Alice and seems to evaporate during the transition from Alice to conventional textual languages. Several students claimed that although they could easily program in Alice, they did not have the dexterity required for actual programming. Probably the primary reason for this loss of confidence was syntax-related. An effective method is therefore required to increase student confidence during the transition from Alice to actual textual programming. Similarly, in [300] it is reported that migration from visual programming languages to text-based languages is still a problem.

Garlick and Cankaya compared Alice with conventional pseudocode in a two-semester study [152]. During the study, elementary programming constructs were introduced with Alice to one group, while the other group used traditional pseudocode. Both groups were later evaluated, and it was found that the students using Alice scored lower than those who studied pseudocode. In [439], Stefik and Siebert report that visual tools do help beginners initially (but not the blind) and that they should therefore not be regarded as a silver bullet.

The difficulty of transition is the stumbling block of visual languages, and among several developments, Tiled Grace [192] is an important effort to overcome this problem.
Grace is an imperative object-oriented programming language developed for use in education, especially in introductory programming courses [55, 56, 339]. Grace supports a variety of approaches to teaching programming, including graphics-early, objects-early, objects-late and functional-first [193].

Tiled Grace is a programming environment [192] for the textual language Grace that provides a bridge between the textual environment and the visual representation by supporting textual and visual representations of the code equally. The user interface of Tiled Grace is illustrated in Figure 3.6.

Figure 3.6: Tiled Grace [192]

In Tiled Grace, programmers can switch to a typical textual view at any time, and can change the text before switching back to the tile view.

3.4 Visualization & Animation Tools

A visualization tool is a system that makes programming more accessible to students. According to Urquiza-Fuentes and Velázquez-Iturbide [463], program visualization is the visualization of data structures or actual program code. Cooper and Dann [102] described program visualization tools as visual languages that allow the execution of a program to be seen visually. These tools illustrate the steps performed by the program during execution [330]. Fundamentally, a program visualization is an environment that typically shows the visualized program code in a window [257]. This window can include short comments as an element of the code. The rationale of the visualization is to elucidate the run of the program to the student. The visualization tool can contain additional instructions to describe the program and to draw the student’s attention to the essential concepts. Visualization tools were developed to illustrate program execution graphically with the aim of helping students to learn programming. They demonstrate and animate programming concepts ranging from elementary programming constructs to object-oriented issues [43]. In [259], Lahtinen et al. describe how to obtain the greatest benefit from visualization and report that with the use of visualization, students take a more active part in the programming course.

An algorithm visualization tool is a class of visualization tool designed to generate graphical representations that are intended to aid learners in comprehending the dynamic behavior of computer algorithms [204]. Basically, it is a static or dynamic visualization of the higher-level abstractions that describe the software [463]. Algorithm visualizations can offer a persuasive alternative to other kinds of instruction, particularly written presentations such as real code and pseudocode [422]. Several studies on visualization [6, 20, 29, 224, 350, 434, 435, 462] report the effectiveness of visualization in education.
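The raw material of a program visualization, namely the sequence of executed lines together with the variable values at each step, can be captured with a minimal text-only sketch. The sketch relies on Python’s standard sys.settrace hook. The helper trace_lines and the sample function sum_to are hypothetical names, and none of the tools reviewed in this section is claimed to work this way.

```python
import sys


def trace_lines(func, *args):
    """Run func(*args) and record (line offset, local variables) per executed line."""
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                      # ignore every other function
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))  # snapshot of the state
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)               # always restore the earlier hook
    return result, steps


def sum_to(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


result, steps = trace_lines(sum_to, 2)
for offset, local_vars in steps:
    print(f"line +{offset}: {local_vars}")
```

A graphical tool would render each recorded step as a highlighted code line with the variables drawn beside it; the sketch only prints them.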

3.4.1 Jeliot

Jeliot is a program animation tool developed for teaching and learning the fundamentals of programming [43]. It was developed as a modification of Eliot to serve web-based users. Jeliot I was implemented at the University of Helsinki. The functionality of Jeliot I was almost the same as that of Eliot, but its technical design was entirely different and was based on a client-server architecture. With these improvements, it became possible in Jeliot to modify the visual features of the animation while it is running.

Jeliot 2000 was developed at the Weizmann Institute of Science. It was developed specifically to support novice learning and was intended for teaching elementary computer science to high school students [267]. The aim was to help beginners in understanding introductory concepts of algorithms and programming such as I/O, assignment and control flow, whose dynamic features are not easily grasped by looking at the static depiction of an algorithm in a programming language. The user interface of Jeliot is shown in Figure 3.7.

Figure 3.7: User interface of Jeliot [43]

Jeliot 3 was implemented at the University of Joensuu and provides adequate support for learning introductory Java [43]. Jeliot 3 introduces object-oriented concepts, visualizing objects and inheritance [329]. Jeliot 3 was developed for learning introductory programming, but Moreno and Joy [328] report that Jeliot 3 animations are difficult for beginners to understand.

3.4.2 Visual interpreter

In [473], Virtanen et al. present a visual interpreter designed for learning elementary programming using C++. The visual interpreter (VIP) was developed to support students’ learning process. VIP is implemented in Java and is therefore extensible and virtually platform independent. VIP is not restricted to the web and can also be run as a stand-alone application.

The VIP architecture is modular and consists of independent elements: a compiler, an interpreter, a user interface and a profiler, as shown in Figure 3.8.

Figure 3.8: VIP structure [473]

The main window of VIP comprises different parts, as shown in Figure 3.9.

Figure 3.9: VIP window [473]

A small visualization for VIP can be created in a few minutes by writing a standard C++ source file and linking a web page to it. The new visualization is instantly accessible through the web page using any web browser. This permits the instructor to focus on the real content rather than dealing with unimportant technical details. VIP is designed for small programs; it demonstrates the evaluation of each statement in detail and permits reversible visualizations [132].

3.4.3 VILLE

VILLE (or ViLLE) is a language-independent program visualization tool for teaching introductory programming. Its major features are the ability to add and define new languages, flexible control of the visualization, row-by-row visualization, code line explanations, breakpoints and a parallel view that displays the code execution concurrently in two different languages [449]. Figure 3.10 shows the visualization view in VILLE.

Figure 3.10: The visualization view in VILLE [375]

VILLE allows programming examples to be viewed in different programming languages. It supports pseudocode, C++ and Java. The description of the pseudocode can be changed to suit the instructor’s needs. It is also possible to describe and add new programming languages to expand language support. A study conducted to evaluate the usefulness of VILLE in learning essential programming concepts showed that VILLE enhances students’ learning regardless of previous programming knowledge [375]. In [255], Laakso et al. analyzed the effects of cognitive load in using VILLE. The results show that students with prior knowledge of the tool learned considerably better.

In 2009, the VILLE tool was transformed into a collaborative learning environment [220]. It is now a complete client-server system that emphasizes collaboration between teachers. It supports various kinds of exercises, including the original program visualization exercises.

3.4.4 PlanAni system

PlanAni is an automatic program animation system developed to be used in teaching fundamental programming constructs to beginners [407]. Basically, it is a variable role-based program animation system, where each variable role has a stored visualization called a role image. Role images provide hints on how successive values of the variables associate with each other and to other variables [9]. Figure 3.11 shows a screenshot of the PlanAni user interface when the system is animating a small program that verifies whether its input is a palindrome.

Figure 3.11: Visualization in the PlanAni system [407]

An experiment reported in [9] shows that debugging skills are enhanced by PlanAni. Similarly, Sajaniemi and Kuittinen in [407] reported the pedagogical significance of PlanAni.

3.4.5 EduVisor

EduVisor (EDUcational VISual Object Runtime) is a visualization tool developed for teaching object-oriented technology to novice students [327]. The tool is integrated into an IDE and shows students the structure of their own creations at runtime. EduVisor has three main objectives: i) to increase students’ understanding of the concepts introduced during the first programming course, ii) to enable students to debug their programs quickly, and iii) to enhance the eagerness of students by visualizing their efforts.

Jeliot, Visual Interpreter, VILLE, EduVisor and PlanAni are popular tools, but there are many other tools for animation and visualization, among them WinHIPE [359, 469], SRec [470], Swan [421], JGRASP [107], ALVIE [105], LEONARDO [106], ANIMAL [395], JHAVÉ [337] and TRAKLA2 [245].

In the past 30 years, many different visualization tools have been developed, but opinion about the significance of visualization is markedly mixed. Lahtinen [256] asserts that the available visualization tools for programming education are impressive, yet the studies conducted on them mainly aim to verify the educational effectiveness of the tools. The majority of these studies apply empirical techniques in controlled experimental settings. Petre [366] holds that comprehending the graphical cues provided by a visualization requires a level of knowledge not always found in beginners. To benefit from visualization, novice students must therefore make an effort to comprehend the graphical notation. So rather than simplifying the students’ learning process, visualization can at first impose a new extraneous cognitive load.

3.5 Flowchart Based Programming Environments

A flowchart-based programming environment is a type of visual language that helps students learn programming by providing a graphical tool for writing and visualizing programs as flowcharts.

3.5.1 RAPTOR

RAPTOR (Rapid Algorithmic Prototyping Tool for Ordered Reasoning) is a visual programming environment developed to help students visualize their algorithms by combining basic graphical symbols. Students can run their algorithms either step by step or in continuous mode. The RAPTOR environment is quite simple, as shown in Figure 3.12.

Figure 3.12: RAPTOR environment

The RAPTOR environment displays the contents of all variables as well as the currently executing symbol [80]. RAPTOR does not enforce top-down decomposition and allows students to develop their code incrementally. RAPTOR is written in a combination of C++, C# and Ada, and runs on the .NET Framework. RAPTOR programs are required to be structured. Owing to this support for structured programs, RAPTOR is a useful system for learning imperative programming.

Price and Smith conducted a study on the use of Alice and RAPTOR in CS1 [372]. The study indicates that RAPTOR helped the students in visualizing the concepts and comprehending the selection and control structures.
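The step-by-step execution and variable display described above can be pictured with a small sketch. The following Python fragment is an illustrative assumption only and does not reflect RAPTOR's actual implementation: the names Assign, Decision and run are hypothetical, and the flowchart is reduced to assignment and decision symbols connected by named arrows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union

Env = Dict[str, int]  # variable name -> current value


@dataclass
class Assign:
    """Hypothetical rectangle symbol: store an expression's value in a variable."""
    target: str
    expr: Callable[[Env], int]
    next_id: Optional[str]  # name of the following symbol, None ends the chart


@dataclass
class Decision:
    """Hypothetical diamond symbol: follow one of two arrows."""
    cond: Callable[[Env], bool]
    if_true: Optional[str]
    if_false: Optional[str]


Node = Union[Assign, Decision]


def run(chart: Dict[str, Node], start: str, trace: bool = False) -> Env:
    """Execute the flowchart one symbol at a time, optionally printing variables."""
    env: Env = {}
    current: Optional[str] = start
    while current is not None:
        node = chart[current]
        if isinstance(node, Assign):
            env[node.target] = node.expr(env)
            following = node.next_id
        else:
            following = node.if_true if node.cond(env) else node.if_false
        if trace:
            # Mimics a watch window: current symbol and all variable contents.
            print(f"{current}: {env}")
        current = following
    return env


# A flowchart that sums the integers 1 to 5 with a loop.
chart: Dict[str, Node] = {
    "init_total": Assign("total", lambda e: 0, "init_i"),
    "init_i": Assign("i", lambda e: 1, "check"),
    "check": Decision(lambda e: e["i"] <= 5, "add", None),
    "add": Assign("total", lambda e: e["total"] + e["i"], "step"),
    "step": Assign("i", lambda e: e["i"] + 1, "check"),
}
result = run(chart, "init_total")
```

Calling run(chart, "init_total", trace=True) prints the variables after every symbol, which is roughly what a learner sees when stepping through a flowchart.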

3.5.2 Flowgorithm

Flowgorithm is a simple, recently developed graphical tool that allows novice programmers to develop and run programs using flowcharts. It can generate programs in several programming languages, such as JavaScript, Visual Basic, Python, C++, Java and C#. Figure 3.13 shows a simple flowchart in Flowgorithm and its equivalent program in C++.

Figure 3.13: Flowgorithm environment

Although Flowgorithm is very small, it is extremely useful for imperative programming courses. It supports the essential features of introductory programming, including data types, variables, comments, input/output, selection and iteration structures, and functions.

3.5.3 ProGuide

ProGuide is a dialogue-based tool designed to help weaker students develop basic programs [26, 312]. The tool provides hints, questions and related questions to help students reach the solution. ProGuide includes an algorithm editor/simulator and dialogue-based tools that interact with students. It stores problems and their respective solutions in internal structures. The ProGuide interface is presented in Figure 3.14.

Figure 3.14: ProGuide interface [26]

The ProGuide editor and simulator support the design of algorithms using flowcharts. ProGuide was developed to aid novice students in their introductory learning stage. The problems recommended to students at this stage are typically simple and require only elementary constructs. That is why ProGuide supports only input/output, attribution (assignment), selection and repetition control structures. The developers of ProGuide are very hopeful of its success, yet no statistical results are available that describe its significance in support of learning.

3.5.4 Iconic Programmer

Iconic Programmer is a tool that permits novice programmers to develop programs in the form of flowcharts through a graphical, menu-based interface [87]. Figure 3.15 shows the run panel of Iconic Programmer.

Figure 3.15: Run panel of Iconic Programmer [87]

In Iconic Programmer, it is possible to execute flowchart programs by stepping through the flowchart elements one at a time. Each of the elements represents a sequence, a condition or a loop. The flowcharts developed in Iconic Programmer can be converted into Turing or Java. Iconic Programmer supports input/output, decision structures, looping and code generation [53].

3.6 Mini-languages

Mini-languages are another way to introduce programming to students [69]. They are designed to provide a gentle introduction to programming by offering a clear and simple way to learn in an interactive visual environment [70]. Mini-languages are strongly inspired by Logo and turtle graphics. Several researchers in the field of educational programming categorize mini-languages as microworlds.

The majority of mini-languages provide an actor. Students learn programming by controlling this actor through a program. The actor could be a robot, a turtle or any other active unit. Despite many advantages, mini-languages have some inadequacies. Conventional control structures such as if and while are not supported in the turtle subset. To imitate these constructs, special predicates are required that provide a response while the actor is being directed.

Karel the Robot was the first and the most popular mini-language. Basically, it is an imperative programming tool [102]. It was developed by Pattis [69] as a soft introduction to Pascal for university students taking the introductory programming course. The actor, the robot Karel, performs operations in a world that consists of intersecting streets and avenues, walls and beepers (Figure 3.16).

Figure 3.16: Karel the Robot [69]
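The kind of world described above, an actor on a grid of streets and avenues with beepers, can be sketched in a few lines. The Python fragment below is a hypothetical simplification and not Pattis's original Karel syntax: the class Robot and its method names are assumptions chosen for illustration, and only the outer boundary acts as a wall.

```python
# Headings as (avenue step, street step): north, east, south, west.
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


class Robot:
    """Hypothetical Karel-like actor in a bounded grid world."""

    def __init__(self, avenues, streets, beepers=()):
        self.avenues = avenues          # number of vertical lanes
        self.streets = streets          # number of horizontal lanes
        self.pos = (1, 1)               # (avenue, street), bottom-left corner
        self.heading = 1                # index into DIRECTIONS, facing east
        self.beepers = set(beepers)     # corners that hold a beeper
        self.bag = 0                    # beepers collected so far

    def _ahead(self):
        step_a, step_s = DIRECTIONS[self.heading]
        return (self.pos[0] + step_a, self.pos[1] + step_s)

    def front_is_clear(self):
        """Predicate: is the next corner inside the world?"""
        avenue, street = self._ahead()
        return 1 <= avenue <= self.avenues and 1 <= street <= self.streets

    def next_to_a_beeper(self):
        """Predicate: is there a beeper on the current corner?"""
        return self.pos in self.beepers

    def move(self):
        if not self.front_is_clear():
            raise RuntimeError("blocked by a wall")
        self.pos = self._ahead()

    def turn_left(self):
        self.heading = (self.heading - 1) % 4

    def pick_beeper(self):
        if not self.next_to_a_beeper():
            raise RuntimeError("no beeper here")
        self.beepers.remove(self.pos)
        self.bag += 1


def collect_row(robot):
    """Walk forward to the wall, picking up every beeper on the way."""
    while robot.front_is_clear():
        if robot.next_to_a_beeper():
            robot.pick_beeper()
        robot.move()
    if robot.next_to_a_beeper():
        robot.pick_beeper()


karel = Robot(avenues=5, streets=3, beepers=[(2, 1), (4, 1)])
collect_row(karel)
```

The predicates front_is_clear and next_to_a_beeper play the role of the special predicates mentioned earlier: the student's program has no variables of its own and can only ask the actor about its surroundings.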

In Karel the Robot, a student develops code that controls a moving robot in a grid-based world. In [403], Ruf et al. reported the results of an experiment conducted to compare Scratch and Karel the Robot. Two classes were instructed using Scratch and Karel respectively, and motivation, self-regulation and change in abilities were analyzed. The results indicate that the class using Scratch had higher motivation and worked better; the Karel class, on the other hand, showed higher identified regulation. Several researchers, such as Becker, recommend Karel the Robot for CS1 [41].

Jeroo [408] is a tool with an integrated development environment and a microworld, designed to help novice students learn the meaning of elementary control structures, objects and methods. The Jeroo environment (shown in Figure 3.17) is quite simple and easy to understand.

Figure 3.17: User interface for Jeroo [408]

Jeroo is similar to Karel the Robot and its descendants. It has a simple syntax and a narrow scope, with the aim of providing an easy transition to either C++ or Java. The Jeroo language has no data types and no variables other than references to Jeroo objects.

Martino is another mini-language inspired by Karel [344]. It was originally developed to teach the fundamental concepts of informatics. Darel, from Australia, is another mini-language inspired by Karel. Karel-3D is a further mini-language that extends the notions of Karel the Robot in various directions.

Marta is a Logo-based mini-language. It can be used as a simple introduction to programming for a wide range of beginners, from playgroup age to adults. Logo procedures, multiple-line programs and on-line programs are the different levels of programming allowed in Marta [69]. Josef the Robot is a mini-language with a little influence from Logo. Josef was used to program the robot [456].

In the 1980s, Moscow State University suggested a mini-language called Wayfarer for the students of the Mechanics and Mathematics Department. According to Brusilovsky et al. [69], four to six weeks of work with Wayfarer provide students with a strong foundation in the conventional concepts of programming. Another mini-language, Turingal (TURING machine + PascAL), was developed for Moscow State University in 1983. Turingal is based on the Turing machine and works with a tape of symbols. It supports control structures and subroutines, but its pattern of programming is fairly different from conventional programming.

Tortoise is a mini-language similar to Turingal, but with more features. It was originally developed for fifteen- to sixteen-year-old students in Russian schools. Like Turingal, its patterns of programming and conventions are quite different from conventional programming.

3.7 CS0 “Pre-programming” Courses

CS0 is a course designed to give students a stronger background in the areas required for CS1 [138]. In computing curricula, the CS0 course began as a pre-programming course [324] with no prerequisites [114]. Fundamentally, it is a foundation course, usually offered before the first programming course. Taheri et al. [447] argued that foundation programming should focus on problem-solving skills and algorithmic thinking. CS0 essentially aims to provide a soft introduction to programming before the first substantial programming course. CS0 was introduced as an orientation course for computer science majors [99]. Such courses endeavor to give students an overview of the field. CS0 courses offer an overview of computer science and programming using particular languages [8]. Breadth-first and depth-first are two common layouts of CS0 courses [297]. In a breadth-first CS0 course, the introduction to programming is limited to fundamental concepts. It mostly includes those topics that help students to understand computer science concepts [304]. A depth-first CS0 course generally pivots on a particular programming language to develop problem-solving skills.

The CS0 course is a widely recognized approach to overcoming the difficulty of the first programming course by motivating students and providing a soft introduction to programming. CS0 is not only important for CS1; it is also a gauge of whether a student is likely to fail to complete a degree in computing, as claimed in [66].

Most CS0 courses use the contemporary tools (discussed in the previous sections) to introduce students to the fundamental concepts of programming. In [72], Budny et al. proposed introducing EXCEL, HTML and MATLAB before starting the C programming course. Through EXCEL, students can gain an idea of data input, arrays, matrix operations and built-in functions. HTML can be helpful in comprehending the layout of a program, and MATLAB can be fruitful for introducing branching and loops. Once students are acquainted with these tools, the C language becomes considerably simpler for them to understand.

In [305], McIver and Conway introduced a programming language called GRAIL to teach beginners the elementary concepts of programming without forcing them to understand the complex syntax and semantics of contemporary programming languages. GRAIL is very small, and its grammar is very concise. It was developed to present the concepts of imperative programming with a syntax that requires no prior experience. The syntax and semantics of GRAIL are isomorphic: each syntactic element has a unique meaning, and each construct has a single syntax. This isomorphism keeps the language simple, but leaves students few ways to implement their programs. In GRAIL, non-ASCII Unicode characters are used to make programming simpler; for example, ← is used for assignment, and ≤ and ≥ are used for relations. The symbols are intended to make GRAIL more consistent with students’ existing knowledge. These symbols make programming simple, but students face serious problems when transitioning to an actual programming language, because none of these symbols are allowed in contemporary programming languages.
GRAIL includes a single numeric type with the aim of keeping the language simple. However, the notions of integer and real numbers are introduced in secondary education, so restricting the language to a single numeric type is not necessary. GRAIL claims to avoid the syntax of conventional programming languages, but it provides arrays with the typical [] index notation, and its overall syntax (with some exceptions) is akin to that of contemporary programming languages. The selection structure in GRAIL is almost identical to the equivalent structure in contemporary programming languages.

The University of Washington Bothell has defined a CS0 course for students who have not yet declared computer science as their major. It is a single-term introductory course for non-majors [357]. The course uses GameMaker [182, 352] and C# for introductory programming; it primarily aims to attract students who have not taken any computer science course and to give them a solid foundation in programming. It covers basic topics such as sequences, operators, selection statements, repetitive statements, functions and arrays. The course first uses GameMaker to cover the fundamental topics of elementary programming and then revisits the material in C# using an XNA-based library [444]. The overall philosophy of this CS0 course is very effective, yet it was developed specifically for non-major students, and C# is too difficult for novice beginners.

In [139], Faux reported the outcomes of a study that analyzed the significance of combining problem solving, pseudo code, algorithm development and graphical techniques in the introductory computer science course (CS0), with the hypothesis that a foundation in these concepts prior to the introduction of a programming language would improve the success rate and reduce the learning curve. The posttest programming scores show that the treatment group performed better than the control group.

In [388], Rizvi et al. described a new course using Scratch to improve the retention, attitude and performance of at-risk majors. In [389], Rizvi et al. reported that students who selected Scratch [289, 310, 311] have a high level of self-efficacy in connection with programming abilities.
Similarly, in [387], Rizvi and Humphries present the results of a Scratch-based CS0 course offered in Fall 2009 and Fall 2010 to reduce the dropout of at-risk computer science majors. The results show that the course successfully achieved its objectives and strongly prepares students for further programming courses.

Malan and Leitner [286] used Scratch as a precursor to Java at the Harvard Summer School. At the end of the course, the students were surveyed to identify the significance of Scratch: 76% classified its effect as positive, 8% as negative, and 16% reported that it had no effect.

In [128], Dyne and Braun described the contents and evaluation of a CS0 course called Computational Thinking, which was developed to improve students’ analytical problem solving. Initially, the course was designed for students who are mathematically underprepared to start introductory programming; however, it has since been included in the Montana Tech general education curriculum, so students majoring in any academic discipline may take it. The course mainly covers problem-solving and critical-thinking skills to equip students with the analytical skills they will require to complete CS1. Students in the CS0 class, along with students in FESP (Foundations of Engineering and Science Program), were tested and the results statistically analyzed. The results indicate a significant improvement in scores for CS0 students, and the course was found to be very fruitful in increasing analytical problem-solving skills.

Reed describes a CS0 course based on JavaScript and the World Wide Web [379]. The course has a strong focus on programming, problem solving and general topics of computer science; it therefore provides a wide perspective of the field along with a basis for further studies in computer science, and it has been taught successfully at Dickinson College. Although the course is well designed, overcoming the intricacy of the first programming course is not its sole objective; it also covers many other topics of computer science. Accordingly, 50% of class time is reserved for programming concepts and 15% for HTML and interface design. The remaining 35% is reserved for other topics in computer science.
Several features of JavaScript, such as typeless variables, are detrimental at an introductory level, and students face many problems when transitioning to statically typed programming languages such as C, C++ and Java.

In [461], Uludag et al. introduced an IT0/CS0 course for majors and non-majors that uses Scratch, App Inventor for Android and Lego Mindstorms, and aims to increase the interest and motivation of students.

A CS0 course has been introduced at California Polytechnic State University [177]. The course is designed to offer different tracks that students can select (for example, music, robotics, mobile apps and gaming). This permits novice students to learn the essentials of programming and teamwork. The course is based on project-based learning approaches [187, 412] and focuses on academic and non-academic factors known to improve student retention. The initial assessment indicates positive outcomes in the form of improved performance in CS0 courses and improved student retention.

Dougherty [126] introduced six virtual worlds, implemented in Alice2, for CS0 to cover the essential concepts of algorithms, particularly control structures, data structures, objects and problem solving. The designed course was a 3-week module on algorithms and programming. Under control structures, it covers selection, condition, iteration and recursion. Under data structures, it covers numbers, strings, Booleans, arrays, vectors and lists. Under objects, it covers properties and methods. Under design content, the course covers divide and conquer and flow of control. Many other topics, such as function calls, parameters and return values, interacting objects, input/output, mouse events, searching, sorting, abstraction, concurrency, randomness, debugging and generalizing, are also included. The initial use of virtual worlds with Alice in CS0 revealed that students were more engaged in CS0 and motivated to define and implement their term projects with Alice.

The University of Mary Washington tested a CS0 course based on keeping programming concepts both accessible and fun for students [24]. The course makes use of several active learning techniques, including kinesthetic learning activities [42, 325, 427, 428], competitions, discussions, games and hands-on labs, to cover computer literacy topics, the essentials of computer programming and problem solving.
The course used Alice to teach programming concepts, and feedback received from students showed that such a course can motivate students to continue their studies within the discipline of computer science.

In [164], a CS0 course (Introduction to Programming Logic) at Chestnut Hill College is described. The course used Visual Logic for six weeks and then switched to Python. Visual Logic is a graphical tool that allows students to develop executable flowcharts. Visual Logic is very small, involves minimal syntax and is extremely easy for students to learn. The students’ evaluation of CS0 shows a positive response to the use of Visual Logic. Python with Visual Logic is also recommended by Agarwal et al. for teaching CS0 [3]. In [2], Python and Java are compared for teaching CS0, and Python is declared the better option.

To address the low retention rates of undergraduate majors in CS1 and CS2, Tennessee Technological University introduced a CS0 course [134]. The objective is to give students a breadth-first introduction to computer science and to introduce them to problem solving before they start Java. MindStorms robots [236] are used in CS0 to help the students. The robots can be programmed in Drizzle using the DIODE IDE. The IDE provides a very useful GUI and allows automatic compilation of Drizzle programs.

In [485], Williams reported a study indicating that students acknowledged the value of LEGO MINDSTORMS robots for learning C programming and introductory embedded systems design. However, a drawback of LEGO MINDSTORMS robots (and their equivalents) is the lack of access to robots outside computer labs [315], and therefore a simulator such as Robolang is used. A study investigating the effectiveness of using a LEGO Mindstorms robot to increase motivation in an introductory programming course is reported in [307]. Different features relating to student motivation were quantified, and the tests revealed no significant effect of using the LEGO Mindstorms robot. In [16], Alvarez and Larranaga analyzed the significance of using LEGO Mindstorms robots to support novice students in the fundamental programming course. The study shows that the use of robots neither increased nor decreased students’ marks. However, students felt that the use of robots helped them to comprehend programming concepts.
The study also revealed problems related to the use of physical devices.

In [122], a CS0 course designed to prepare students for CS1 is described. The central aim of the course was to increase students’ capability for algorithmic problem solving and program design. The course covers problem solving, algorithms, data representation, data manipulation, basic programming constructs, functions and procedures, data structures, abstract data types, queues, stacks, lists and object-oriented design. The introduction of CS0 has shown a positive effect on the comfort and confidence levels of students.

In [429], Sloan and Troy introduced a CS 0.5 course. CS 0.5 is an introductory computer science course for CS majors who have little or no background in programming. A gentle, revised version of Guzdial’s media computation [146, 167] was selected for CS 0.5. The course was originally introduced to improve the retention rate and is offered before CS1. It gives beginners programming maturity before they dive into CS1. The course has been implemented and has yielded very encouraging results.

In [1], Python is described as a good choice for CS0 because it has a simple syntax. Python is a high-level programming language developed by Guido van Rossum [118]. It is freely available for several platforms. It is appropriate for large-scale as well as small-scale programming. It also provides support for object-oriented programming and graphical user interfaces. These features make Python feasible for CS0. Python provides a range of built-in data structures [40], and its range of features makes it apposite for CS1, CS2 and other advanced courses in computer science [1].

Python is helpful for students, but it has several drawbacks. In Python, variables do not have to be declared (dynamic typing). On the one hand, this makes programming simple and relieves novice programmers of some effort; on the other hand, students face serious problems when transitioning to languages in which variables must be declared.

A hybrid introductory course called IT0 has been introduced for students of information systems technologies [297]. The course integrates topics from discrete mathematics and programming logic.
The central aim of creating IT0 was to provide students with a solid base in problem-solving skills and mathematical reasoning and to prepare them for advanced computing courses.
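The concern raised earlier in this section about undeclared variables in Python can be made concrete with a short sketch. The fragment below is an illustration only; the helper describe and the variable names are hypothetical. It shows one name being bound, without any declaration, to values of three different types, which a language with declared variable types such as C or Java would reject at compile time.

```python
def describe(value):
    """Return the name of the runtime type of a value (illustrative helper)."""
    return type(value).__name__


kinds = []

count = 10            # no declaration: the name is simply bound to an int
kinds.append(describe(count))

count = "ten"         # the same name is rebound to a string without complaint
kinds.append(describe(count))

count = 10 / 4        # true division of two ints yields a float
kinds.append(describe(count))

# In C or Java, the rough equivalent
#     int count = 10;
#     count = "ten";
# does not compile, because the declared type of count is fixed.
```

A novice who has only seen the Python behaviour has to learn, on moving to such a language, that the type belongs to the variable and not merely to the value.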

3.8 Other Popular Solutions

Mobile devices are becoming necessary tools for many educators and students [453]. In [454], Tillmann et al. proposed that computer programming and the teaching of programming should be performed directly on mobile devices. Jordine et al. [215] introduced a project that used mobile devices to enhance the teaching and learning of Java programming. A very useful discussion in [436] illustrates that faculty at different colleges and universities have used Google’s App Inventor for Android (AIA) in introductory computer science courses for non-majors.

Google’s App Inventor for Android (AIA) is a visual programming environment and drag-and-drop tool (see Figure 3.18) for developing mobile phone applications, designed to be accessible and attractive to non-majors taking introductory computer science courses.

Figure 3.18: App Inventor component designer [491]

AIA offers an environment for developing mobile applications, including social networking and web services, for Google’s Android platform. With App Inventor, it is possible to build all kinds of applications, including games, educational software, location-aware applications, high-tech applications, SMS applications, web-enabled applications, applications that control robots and many other complex applications [491]. A comparative study [358] of Scratch and App Inventor in introductory programming recommends App Inventor for a formal introduction to programming. Similarly, [219] indicates that AIA is a powerful platform for teaching programming and computing fundamentals. Soares [432] described App Inventor as having great potential for teaching advanced concepts of computing. Similarly, the results of a survey [433] support the use of App Inventor to teach programming.

A study analyzing the effect of mobile application development with App Inventor in a university core course revealed that App Inventor is effective in motivating students [194], but it provides no evidence that App Inventor supports the first programming course or eases the complexities involved in the transition to programming languages. The study suggested improving App Inventor’s features. A comprehensive study [378] concluded that smart phones, extensive scaffolding and studio-based learning are helpful in motivating and engaging students to develop sophisticated applications using images, arrays, sound and event handling.

Computer games play a significant role in any area of education [124]. They have gained wider recognition as a motivating and engaging tool in the computer science curriculum [34, 39, 129, 284], and game development in particular is gaining recognition in introductory programming courses [383]. In [253], Kurkovsky described the use of mobile game development to reinforce elementary programming topics such as arrays, loops and classes. Similarly, in [183, 184] a system called ProGames is presented for learning programming skills through attractive, visually appealing games in Greenfoot, and the results of its application are quite encouraging. In another study [89], a teaching model for the learning of debugging is introduced.
The model uses game-based programming to aid the teaching of debugging to beginners. The initial evaluation reveals that the model is helpful in improving students’ understanding of programming concepts, but not very helpful in increasing their debugging self-efficacy.

In [264], Leutenegger and Edgington introduced a “Game First” approach to teaching introductory programming, recognizing that game programming can motivate beginners. The initial response indicates that this approach improved students’ understanding of basic topics.

GameSalad is a free, drag-and-drop, rule-based environment for developing applications [398]. In [119], mobile game development using GameSalad is proposed to increase students’ interest in the computing field. The results of the pilot study indicated an improvement in students’ engagement.

In [501], a game-type module called “Iterative Dungeon” is designed to help students visualize loops. The module supports and visualizes the while loop, the for loop and nested for loops. The module has been evaluated; the results are quite promising, and feedback from students indicates that it has a positive effect on learning. Similarly, in [502] a game-like module called Java Ninja is designed to support students in learning the concept of inheritance. The module has been evaluated, and the results are very promising. In [503], a game-type module called “The Lost Java Code” is presented for teaching decision structures. The module is developed in GameMaker 8.1 and provides an interesting environment in which to practice and review the concepts of decision structures. The initial assessment indicates encouraging results, and student feedback is also very positive.

In [443], Summet et al. introduced a CS1 curriculum that employs a robotics context to teach introductory programming, and initial trial classes have indicated that the approach is quite successful.

Hu et al. [202] introduced a new style of teaching introductory programming. The proposed method is based on integrating three concepts: using a visual language; using the notions of goal and plan; and having an explicit and clearly defined process.
A method is introduced for representing goals and plans in a graphical notation, along with a plan that is constructed in a visual programming environment (VPE). As in Scratch [288, 289, 290] and Alice [370], programs in VPEs are developed by dragging and dropping statement blocks, which naturally prevents syntax errors. According to Letovsky and Soloway [263], goal is a term used to denote intentions, and plan is a term used to represent the techniques for realizing and achieving those intentions. When developing designs in the form of goals and plans, it is essential to have a formal system for representing them. Goals and plans are notated visually using a VPE such as Scratch. Since a program has a specific number of goals to be achieved, the visual notation in the proposed method provides three types of goals: input, process and output. Visual plans are required for implementing the goals. Each goal has a corresponding plan; therefore, there are three types of plans: input, process and output. The process plan takes its input dataflow and processes it to generate a new dataflow that achieves the goal of the process. A plan block is defined to represent a plan, and BYOB is used to construct the plan block. BYOB [27] is a programming language for teaching essential programming concepts such as advanced data structures, object-oriented programming and information. It allows users to create new blocks (encapsulated procedures) and also allows recursion [354]. The proposed method involves five steps, including analysis of the goal, design of the network of plan blocks, growing of the plan blocks and integration of the expanded plan details. All in all, the central idea of teaching programming to beginners using goals and plans with visual notations is quite unique, but the many intermediate steps make it difficult, tedious and cumbersome to use.

Cambranes [73] introduced a tool called Origami that supports flowchart-based visual programming for constructing a program, together with a side-by-side system that dynamically generates a natural language description as a secondary representation. Origami helps beginners to develop and verify a program using a visual language. The Origami tool is inspired by Flip [74].
The programs developed in this environment are naturally free of syntax errors, but semantically invalid instructions are still possible.

Chapter 4

Learners Programming Language

This chapter reviews the predictors of success in introductory programming courses and expounds the aim, the idea and the intrinsic features of a learners’ programming language. It also provides the formal structure of the learners programming language.

4.1 Introduction

Programming is a hard subject to teach and to learn [248]. Novice students lack an understanding of algorithms and elementary programming concepts and find programming very difficult. They present many problems, such as a limited capacity for abstraction and a limited ability to deal with problems [261].

Early failure to comprehend essential concepts debilitates students’ confidence and increases dropout rates. Despite a great deal of work on programming pedagogy and introductory programming courses, dropout and failure rates in programming courses are still very high [475].

A number of studies endeavor to discover student characteristics that predict success in CS1. Most of these studies take a quantitative approach, yet many seek perspectives from faculty [230] or students [174].

Many researchers argue that students’ mathematical abilities are positively reflected in their programming abilities [13, 44, 411].

A study [377] conducted to examine the impact of students’ self-efficacy and mental models of programming on understanding and learning to program reveals that programming self-efficacy is influenced by prior programming knowledge. The study also indicates that the student’s mental model of programming affects self-efficacy and that both self-efficacy and the mental model influence course performance.

A study [169] conducted to analyze the impact of prior programming knowledge in a computing degree program indicates that students who know at least one programming language at the start of an introductory programming course perform considerably better in the evaluation than those who know none. Similarly, students with experience of multiple languages tend to perform better than the others.

Bergin and Reilly [48] carried out a study of fifteen factors, including comfort level, prior computer experience, prior academic experience, specific cognitive skills and self-perception of programming performance, that may affect performance in a first-year object-oriented programming module. The study reports that students’ perception of their understanding is strongly correlated with programming performance. In addition, mathematics scores have a strong association with performance. In another study [49], Bergin and Reilly further analyzed the variables that influence success in introductory programming modules. During the study, 25 variables were identified at four different institutions. The study found that mathematics, game playing and comfort level are predictors of programming performance.
Amoako et al. [21] conducted a study to analyze the background of students, their areas of study and the learning techniques they used while studying programming courses. The study concluded that the learning strategies students follow when studying a programming course have a strong influence on their performance.

In [136], variables that predict computer aptitude, including high school achievement, cognitive style, demographic profiles, prior computer training and experience, and problem-solving abilities, are analyzed. The results show that prior computer exposure and academic achievement were the strongest predictors of class performance. The study also revealed that no variable dominated the others as the best predictor of success.

In [188, 189], it is argued that prior programming experience affects student performance in the first course. It is also described that the depth of the programming experience affects the first course.

Tafliovich et al. [446] investigated the impact of prior knowledge in CS1 and found that prior knowledge of programming influences various facets of the student experience in the first programming course.

Self-efficacy and mental models are two important factors that may affect the learning of programming. Both are significant for knowledge acquisition and transfer. In [482], the combined impact of self-efficacy, mental models and prior knowledge on learning to program in an introductory course is investigated. Seventy-five students enrolled in an introductory programming course participated in the study. Self-efficacy was evaluated using the Computer Programming Self-Efficacy Scale, whereas the students’ mental models were determined by program recall and program comprehension. The results indicate that self-efficacy for programming is affected by prior programming experience, and that student self-efficacy improves significantly during an introductory programming course. In addition, students’ self-efficacy is influenced by their mental models, and both self-efficacy and the mental model have a direct influence on overall achievement in an introductory programming course.

In [15], Alvarado et al. examined whether conventional factors like confidence and prior experience are still important for students’ attitude and performance in CS1. The results indicate that prior experience is still a factor in (male) students’ performance [15].

Owolabi et al. [353] conducted a study to examine the association between six factors (computer anxiety, mathematics anxiety, mathematics ability, gender, age and programming anxiety) and achievement in programming. The results indicate that the combined effect of these six factors on student programming success is 20.8%, and that the association is significant. The results also report that only one factor (mathematics ability) has a significant (positive) association with performance in Basic programming.

In a salient study [488], Wilson and Shrock ascertained the factors that promote success in an introductory computer science course. The study included twelve probable factors: gender, math background, domain-specific self-efficacy, attribution for success/failure (ability, luck, effort and difficulty of a task), comfort level in the course, previous programming experience, previous non-programming experience, encouragement and working style preference. Subjects included 105 students registered in an introductory computer science course that used C++ as the programming language. The study used midterm grades to measure success in the course. The study identified comfort level as the best predictor of success. Math background was next in significance in predicting success in this class. The study also revealed that game playing had a negative impact on the midterm grade. The study also analyzed different types of previous computer experience and concluded that formal training had a positive influence on class grade.

Rountree et al. [396] claim that the grade a student expects to achieve at the beginning of the CS1 course is the strongest indicator of success. In [84], Caspersen et al. reported a study which shows that there is no correlation between students’ mental models and their performance in a programming course.
In [160], Goold and Rimmer posit that secondary school performance and gender are significant in introductory programming, whereas a dislike of programming has a strong impact on performance.

In [75], Campbell formalized a new model to identify the associations between learning style and success, between programming behavior and accomplishment, and between learning style and programming behavior. The study revealed that achievement is significantly correlated with behavior.

Several computer science educators claim that abstraction is a core ability [247, 437]. The capacity for managing abstraction may be a unique trait of computer science majors [185]. Or-Bach and Lavy [349] claim that abstraction is an essential concept in programming in general and in object-oriented programming in particular. However, it is hard for many novice students to learn abstract thinking.

Bennedsen and Caspersen [45] conducted a study to analyze the relation between abstraction ability (cognitive development) and programming ability (the final grade in CS1), and found no correlation between them. The study also computed Pearson’s correlation between the programming exam score and the high school math score and found no correlation between them.

In [360], the correlations between abstract thinking, acquaintance with programming languages and programming capability are analyzed. The study reported that there existed a moderating impact on programming ability between familiarity and abstract thinking.

A study [133] conducted by Erdogan et al. identified the predictors of programming achievement. In the study, students’ achievement in a programming course was defined as the dependent variable, and problem solving, creativity, general aptitudes, computer achievement, mathematics achievement and computer attitudes were defined as the independent variables. The study found that general aptitude is the only factor that considerably predicts students’ achievement in programming.

Motivation is a very significant factor for successful instruction [418]. Ambrose and Kulik [17] identified a correlation between motivation and performance. Similarly, in [419], it is stated that student motivation and success in learning to program are correlated. Anderson et al. argue that students’ low motivation in programming impacts their learning [22].
In [338], Nikula et al. described how to rehabilitate a troubled first programming course and illustrated that a lack of motivation contributes to the high dropout rates in a first programming course.

A study documented in [47] reports that intrinsic motivation has a strong association with performance in first-year programming. The study also identified that comfort level has a large association with performance.

In [47], Bergin and Reilly analyzed the correlation between performance in a first-year programming course and gender, prior computing experience, learning style and academic performance. The study theorized that there is a correlation between programming ability and existing aptitude in science and mathematics. The study also suggested that students’ marks in the first programming course do not correlate with gender. However, prior experience in programming has an impact on success. In [48], Bergin and Reilly identified that students’ perception of their comprehension of the module had a significant association with programming performance, and that mathematics and science scores also have a strong correlation with performance.

Self-efficacy is the level or strength of one’s belief in one’s own capability to complete duties or assignments and achieve goals. Students’ self-efficacy beliefs influence their engagement, and their engagement influences their performance. Hwang et al. identify a reciprocal association between self-efficacy beliefs and academic achievement [206].

Seyal et al. conducted a study to analyze the association between learning style and performance in programming [420]. The study identified that students’ learning styles have a strong impact on their programming performance.

Solution planning and problem solving are probably the most complex skills that novice programming students ought to acquire. Many students lose their motivation and fail to define a solution when confronted with a programming problem.
According to Carter and Jenkins [81], it is usual for students to approach their first-year project intent on evading programming at all costs, most probably because either they cannot program or they presume that they cannot. Learning to program is considered difficult because students are compelled to develop rational problem-solving skills, learn to express themselves in an algorithmic style and use programming languages that are frequently unfamiliar and boring for many of them [332].

4.2 Learners Programming Language

Students without any prior knowledge of programming must simultaneously learn the method for solving a problem and the use of a programming language as a system for solving it [322]. These multiple requirements can increase cognitive load [313]; therefore, the CS1 course often needs to offer scaffolding for beginners.

CS0, a preprogramming course, is one approach to raising CS1 performance [297]. In several computing curricula, the first programming course (CS1) is preceded by CS0 or some other type of preliminary course, and these courses can increase performance in CS1 [66, 122, 134, 164, 305, 429]. CS0 usually covers logical reasoning, problem solving, algorithm design and programming constructs with very little or no focus on syntax [231].

Learners programming language (LPL) is a CS0 course designed to support the students of the first programming course. In principle, LPL does not negate or nullify the existing CS0 courses, but simply combines their rational features with some new notions in a reasonable way. The primary aim of LPL is to provide prior programming knowledge and increase the comfort level of novice students, which in turn grows programming abilities, improves self-efficacy and confidence, increases motivation and, consequently, improves performance in a first programming course. It is generally recognized that students who have higher levels of contentment and are well motivated are expected to provide more desirable and effective feedback [82]. LPL helps novice students to understand the basic programming concepts and increases their problem-solving abilities. The prime objectives of preprogramming (zeroth) programming languages are often confused with those of introductory programming courses (CS1). In fact, pre-programming languages are not developed for doing programming, but for teaching programming [305].
In a similar vein, the learners programming language is principally aimed at supporting the learning of introductory programming, not at actual programming.

Most CS0 courses are breadth-first and cover the general topics of computer science, whereas LPL is a pure computer programming course developed solely to support the first programming course.

The learner’s programming language is designed for introductory imperative programming courses. However, the central philosophy of the learners programming course can be tailored to any other programming paradigm. The imperative paradigm is selected for LPL because it is the most useful approach for introductory programming courses [116, 117, 298, 299], and the majority of introductory programming courses are based on this paradigm. The imperative paradigm is very easy to understand, and it is fairly straightforward to translate algorithms into imperative code [165].

4.3 Aim and Objectives

The prime aim of LPL is to improve students’ performance in the first programming course. To realize this aim, the LPL course provides students with a tour of programming, touching on the elementary concepts and their constructs. The following are the main objectives of the learners programming language.

1. The first objective of LPL is to form the student’s comprehension of elementary programming concepts, in order to prepare them for success in the first programming course. In LPL, novice students are exposed to the jargon of the first programming course and learn about the elementary programming concepts. With LPL, beginners have the opportunity to grow design and problem-solving skills without any need to wrestle with complex syntax. LPL also helps students to understand real coding through simple notations, and helps in understanding the syntax and semantics of the first programming language.
2. The second objective is to build a positive image of programming in novice students. Because of the common perception that programming is intrinsically very difficult, most novice students believe that not everyone can understand programming.
3. The third objective of LPL is to increase the comfort level and motivation of novice students and gently prepare them for the first programming course. The soft introduction provided by LPL before the first programming course naturally decreases the inherent intricacies of that course and improves students’ comfort level and confidence.
4. The fourth objective of LPL is to increase the retention level of students.
5. The fifth objective of LPL is to reduce the dropout/failure rate.

4.4 Course Description

The LPL course is targeted at novice students who have no experience of programming and are underprepared for a first introductory programming course. The LPL course is based on integrating problem-solving techniques, algorithm development, a graphical environment and a textual tool with a collaboration strategy. LPL is intended to be introduced before the first imperative programming course (CS1), with the hypothesis that introducing LPL prior to the first programming course would help students in that course.

This course covers basic imperative programming concepts and problem solving to help novice students by providing the preprogramming skills and confidence they will need to complete the first programming course. Many other CS0 courses [221, 318] concentrate less on programming and more on critical thinking. LPL maintains a strong focus on programming and also assists students in understanding the syntax of the first programming language.

In principle, LPL provides a soft introduction to those major concepts that are usually essential in introductory programming courses. In [109], input/output, control structures, arrays and recursion are identified as difficult topics for novice students. According to Dancik and Kumar [111], counter-controlled loops, particularly down-counting loops, are more difficult for students. Complex syntax is a large barrier to learning programming [87]. In [249], Kuittinen and Sajaniemi claim that the variable is a very complex concept for students. Gobil et al. [157] found that students face difficulties with the selection structure. Mhashi and Alakeel [316] conducted a study to analyze the difficulties faced by their programming students. According to that study, loops, recursion, arrays, pointers, parameter passing, abstract data types and the use of library functions are the most difficult concepts of programming. Similarly, designing programs, decomposing a problem into sub-problems, designing functions and error handling are recognized as the most difficult processes. A landmark study [88] on students’ difficulties in the first programming course corroborates that students have difficulties with conditionals and loops. A study reported in [319] described that all topics based on pointers and memory-related concepts are difficult for students.

The fundamental programming concepts that are essential at the CS0 level are limited and therefore involve only very basic programming constructs. LPL currently supports the following features:

1. Data type
2. Literal
3. Variable
4. Input/Output
5. Operators
6. Array (single dimension)
7. Selection control structure
8. Repetition control structure
9. Comments

The duration and lectures of the LPL course are flexible in principle; however, the course should consist of lectures and lab/activity classes. The lecture portion of the course focuses on introducing the elementary programming constructs. The topics covered during the lectures are further explained in the lab/activity class.

In the LPL course, the essential concepts of imperative programming are introduced in two phases.

4.5 Phase I

The central objective of the first phase is to introduce students to the fundamental concepts of introductory programming without forcing them to learn and concentrate on the syntax and semantics of any programming language. While introducing the concepts, it is important to hold the interest of novice students. Comfort level [48, 488] and confidence [15] are important for success in introductory programming courses, so the platform that introduces the fundamentals of programming in the first phase should engage students in a way that increases their interest, comfort level and confidence. Visual (graphical) environments and virtual worlds are extremely simple and widely recognized as a good way of delivering concepts and engaging students [28, 392].

In the first phase of LPL, the elementary programming concepts are introduced with suitable visual/graphical environment(s). All relevant graphical environments are possible candidates for the first phase of the LPL course. Graphical environments are very flexible in helping students to understand object-oriented concepts [155], but they can also be used productively in imperative programming courses. Among graphical environments, flowchart-based ones are the most strongly advocated for the first phase of an LPL course that aims to support a first imperative programming course. Flowchart-based graphical environments are simple, successful and more compatible with imperative programming languages. Programming language independence is another valuable characteristic of flowcharts. In [87], the flowchart is described as a useful visualization and design tool for teaching and learning control structures and algorithm development. Similarly, in [170] it is described that flowcharts can be productive for developing and learning algorithms. Flowcharts can also provide novice students with clear mental models of algorithms [195]. Mental models are significant in the comprehension of programming, and students’ success in introductory programming courses is influenced by how well their picture of process flow and their mental model develop. The significance of these models in growing understanding is also emphasized by Winslow [489]. Flowchart-based programming environments have been widely used in introducing novice students to programming [497]. In [196], a flowchart-based programming environment is used to grow the problem-solving skills of CS minors in programming, and the results are found to be very positive.

Introducing novice students to the basics of programming at the initial stage by using contemporary programming languages negatively affects their performance; even with pseudocode, it is hard for novice students to communicate the flow of a program without using flowcharts or diagrams [479].

During the first phase of the LPL course, the essential concepts of programming are introduced in the following manner:

1. Introduction of a concept.
2. Representation of a concept in a visual environment.
3. Assignment of tasks/lab activity for the comprehension of a concept.

Consider a very brief and simple illustration of how the variable is introduced in the first phase of the LPL course.

Step 1. Introduction to a variable.

1. Definition of variable
2. Purpose of variable
3. The generic examples of variable
4. Properties of variable (name, type, value, etc.)
5. More detail

Step 2. Representation of a variable in a visual environment (see Figure 4.1 as an example).

Step 3. Assignment of tasks/lab activity for the comprehension of a variable, such as:

1. Write a program that will take input into variables that store your name, father’s name, registration number, contact number and address.
2. Modify your solution for the previous exercise to store the name, father’s name, registration number, contact number and address of five students. Also analyze whether it is reasonable to extend the same program in a similar way to store the same information for 500 students.
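The second exercise is meant to make students notice that one variable per value does not scale. The following is a minimal sketch of both exercises in Python, which is used here purely for illustration and is not LPL notation. The names `FIELDS`, `read_student` and `read_students` are hypothetical, and the use of a list of records is an assumption about how the exercise might be resolved.

```python
# Field names taken from the exercise text; the identifiers are illustrative only.
FIELDS = ["name", "father's name", "registration number", "contact number", "address"]


def read_student(read=input):
    """Read one student's details.

    `read` is any function that takes a prompt and returns a line of text.
    It defaults to the built-in input(), so the values come from the keyboard.
    """
    record = {}
    for field in FIELDS:
        record[field] = read(f"Enter {field}: ")
    return record


def read_students(count, read=input):
    """Read `count` student records into one list.

    With separate variables, five students would need 25 variables and
    500 students would need 2500; a list grows without any new names.
    """
    return [read_student(read) for _ in range(count)]


if __name__ == "__main__":
    # Scripted answers stand in for keyboard input so the sketch runs unattended;
    # call read_students(5) with no `read` argument to type the values instead.
    scripted = iter(f"value {i}" for i in range(1, 5 * len(FIELDS) + 1))
    students = read_students(5, read=lambda prompt: next(scripted))
    print(f"Stored {len(students)} records with {len(FIELDS)} fields each.")
```

Changing the argument from 5 to 500 is the only edit needed to handle the larger class, which is the observation the analysis part of the exercise asks students to reach.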

Figure 4.1: Simple illustration of variable

Decision structure is a fundamental concept of imperative programming. Most students face problems in understanding decision structures. The decision structure is an indispensable element of a learners’ programming language. Both phases of the LPL course take extra care in introducing decision structures to novice students. The following is a simple and brief illustration of how the decision structure is introduced during the first phase of the LPL course.

Step 1. Introduction to decision structure.

1. The generic idea of decision structure
2. Purpose and need of decision structure
3. Generic examples of decision structure
4. Compound statement
5. Relational operators
6. Relational expressions
7. Types of decision structures
8. More detail

Step 2. Representation of decision structures in a visual environment (see Figure 4.2).

Figure 4.2: Simple illustration of decision structure

Step 3. Assignment of tasks/lab activity for the comprehension of decision structure, such as:

1. Write a program that will find and print out the largest of two numbers.
2. Write a program that will find and print out the largest of two numbers.
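For comparison with the flowchart form, the task above can be sketched in a textual language as a single two-way decision. The fragment below is written in Python only for illustration and is not LPL notation; `largest_of_two` is a hypothetical name.

```python
def largest_of_two(a, b):
    """Return the larger of two numbers using one two-way decision."""
    if a > b:       # relational expression, as in the decision box of a flowchart
        return a    # "true" branch
    else:
        return b    # "false" branch, which also covers the case a == b


if __name__ == "__main__":
    # Fixed sample values keep the sketch runnable without keyboard input.
    print(largest_of_two(12, 7))
```

The condition and its two outgoing branches correspond directly to the decision symbol and its two arrows in a flowchart, which is why the visual and textual forms can be taught side by side.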

The loop (repetition structure) is a fundamental programming construct [198]; however, it is extremely difficult for the majority of novice students. The loop is a pivotal element of the learners programming language. During the LPL course, students are introduced to different types of loops. Consider a very brief and simple illustration of how loops are introduced in the first phase of the LPL course.

Step 1. Introduction to loops

1. The generic idea of loop
2. Purpose and need
3. Generic examples
4. Types of loops
5. More detail

Step 2. Representation of the loop in a visual environment (see Figure 4.3)

Figure 4.3: Simple illustration of loop

Step 3. Assignment of tasks/lab activity for the illustration of the loop, such as:

1. Write a program that will take input in an integer variable and find its factorial.
2. Write a program that will take input in an integer variable and check whether the number is prime or not.
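The two loop tasks can be sketched as follows. This is an illustrative Python fragment rather than LPL notation. The function names are hypothetical, and the choice of a counter-controlled loop for the factorial and a condition-controlled loop for the primality check is an assumption made to show both loop types.

```python
def factorial(n):
    """Compute n! with a counter-controlled (for) loop."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    result = 1
    for i in range(2, n + 1):   # runs zero times when n is 0 or 1
        result *= i
    return result


def is_prime(n):
    """Check primality with a condition-controlled (while) loop."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:   # no need to test divisors beyond sqrt(n)
        if n % divisor == 0:
            return False            # found a divisor, so n is not prime
        divisor += 1
    return True


if __name__ == "__main__":
    # Fixed sample value keeps the sketch runnable without keyboard input.
    number = 7
    print(number, factorial(number), is_prime(number))
```

In the factorial the number of repetitions is known in advance, whereas the primality check stops as soon as a divisor is found.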

Engagement and interest are indispensable to academic success; therefore, necessary measures are taken to engage and motivate students at the start of the LPL course. Educational institutions use incentives and/or rewards as a technique to augment student motivation and performance [287]. Competition, grades and appreciation are all highly beneficial within the academic arena. For some students, extrinsic motivation [404] can be important. Many educators deem that an extrinsic motivator may work more rapidly and effectively than intrinsic motivation. Gutierrez and Schraw [166] argued that extrinsic rewards can help students to exhibit better performance. So, on the basis of the evaluation of assigned tasks/lab activities, students are given rewards for good performance, achievement and improvement.

Many commercial and freeware tools are available that can be used in the first phase of the LPL course. As already discussed, microworld-based visual languages are more appropriate for pure object-oriented or objects-first introductory programming courses, whereas flowchart-based graphical environments are more suitable for imperative programming courses.

Alice uses an object-oriented style of programming [181]. It offers three-dimensionality, which provides a sense of reality for objects [326], and is used for teaching object-oriented programming [384].

BlueJ is an environment for learning and teaching object-oriented programming [243]. It provides a complete Java environment in which the object-oriented software project structure is presented graphically with UML-like diagrams [326]. Students can directly create objects of any class by using icons and interact with their methods. BlueJ is developed specifically for Java and is therefore suitable for object-oriented programming [12, 240, 246].

Greenfoot is based on BlueJ and combines a Java IDE with a framework for developing Java programs [326]; therefore, it is appropriate for teaching Java and object-oriented programming [153, 242].

RAPTOR [80] is a visual environment that helps students to visualize algorithms.
It allows students to develop and visualize their flowcharts. RAPTOR is a viable choice for the first phase of an imperative LPL course, yet it is also very fruitful for object-oriented programming [79].

The Iconic Programmer [87] is a visual tool for creating and converting flowcharts. The Iconic Programmer is a logical option for the first phase of the LPL course.

4.6 Pair Programming

Student effort in the initial days and weeks of CS1 is an important factor in the high failure rate. It is widely observed that students who fail to acquire the basic concepts introduced in the first programming lessons are commonly unable to recover, leading to high failure and dropout rates. Hence, it is indispensable to define approaches and interventions that can help novice programmers during their initial teaching sessions. Students frequently experience anxiety when they begin to learn programming [415]; therefore, in the first phase (as well as in the next phase) of the LPL course, students are motivated and encouraged to work in pairs. Working in a pair is the essence of pair programming.

Pair programming is a recurring research topic [108] and one of the pivotal practices of Extreme Programming. In this method, two programmers work together at one computer on the same task: the same plan, algorithm, code or test. The person who holds the keyboard is called the driver, and the person who sits alongside the driver is called the navigator [203, 369]. The basic duty of the driver is to develop the code. The navigator reviews the code and looks for possible errors that the driver leaves in the code by mistake.

Pair programming has emerged as a positive method for developing high-quality software in a time-efficient manner. It plays an important role in helping students to grow self-reasoning and solutions in abstract situations, particularly for problem solving.

In [455], Tomayko described an experiment which shows that programmers working in a pair make fewer errors than in solo programming situations.

Pair programming has been broadly applied in computer science education because of the benefits it brings to students [336].

Pair programming has been very successful in introductory programming courses. Its success is evident in increased confidence, better accomplishment of programming tasks, increased retention rates and decreased frustration [467]. A comprehensive study [137] suggests that the more students are involved in pair programming, the more they benefit from it and the more they learn by collaborating with their partners. Du et al. [127] applied pair programming to improve communication in C language fundamentals exercises. The experimental results indicate that pair programming is an effective technique for enhancing communication. In [83], Carver et al. described a study conducted to examine whether pair programming would increase retention. The results of the study revealed that pair programming significantly increased the retention of students majoring in computer engineering, software engineering or computer science. Similarly, [175] reports that the advantages of pair programming include improved success rates in introductory courses, high student confidence, increased retention in the major and improved learning outcomes. In [493], pair programming is found to be a useful technique in the early days of CS1. Similarly, [486] indicated that students who follow pair programming perform better on programming projects. Another study [303] found that students who used pair programming developed better programs, were more confident in their solutions and enjoyed completing tasks more than students who worked alone.

In the same way, several other studies [62, 64, 94, 97, 173, 341, 374] show that pair programming provides considerable benefits to students in learning programming. Pair programming is already used in CS1 courses [92]. However, it is rarely used in CS0 courses. Pair programming is an intrinsic element of the LPL course.

During the LPL course, the students are motivated to work in pairs, as this is favorable for students and increases their confidence and performance. During programming or problem solving, the driver develops the code and logic, while the other student (the navigator, observer or pointer) analyzes the code and logic. The two students switch roles frequently (after every assignment or activity).

Compatibility of pairs is an important issue in pair programming. Katira et al. [221] found that students prefer to pair with someone they perceive to be of similar technical capability. Similarly, in [222] it is found that students are compatible with partners whom they perceive to have similar capabilities. Another study [63] suggests that students who are paired by ability perform better than randomly paired students and those who work alone. For the LPL course, no strict rule is recommended for pairing the students, but it is preferred to pair the students according to their abilities; nevertheless, it is also possible to consider other factors that can benefit the students.

4.7 Phase II

Graphical environments excel at illustrating programming concepts [357]. The graphical programming environment used in the first phase of LPL introduces the essentials of programming by demonstrating complex abstract concepts with visual aids, and it also raises the interest and motivation of students, but it provides no real experience of programming. Klassen [231] reported that Alice (a graphical environment) alone in CS0 is not adequate to prepare computer science majors for a first programming course. At a minimum, CS0 ought to involve real programming and should adopt a more rigorous approach. So, in order to introduce novice students to actual programming in a simple way, the second phase of the LPL course covers the fundamental topics with a textual programming environment. The textual environment is recognized as a better platform for allowing beginners to deal directly with real programming [208].

The second phase of LPL introduces the students to the essentials of real programming without forcing them to understand the complex syntax of contemporary programming languages. This phase in turn helps novice students to understand contemporary programming languages. The programming fundamentals acquired during the first phase of LPL are further refined during this phase.

Principally, the ultimate success of the second phase depends on the language employed. The sensible use of an appropriate programming language can prepare the students well for the first programming course. The base language for this phase should be simple and understandable. Maintaining the interest of students remains a main concern of the second phase of the course. However, textual programming languages are less interactive than graphical environments. Most contemporary programming languages are fundamentally text-oriented, and their syntax is usually very complex. Classical programming languages require the programmer to make massive conversions from the intended tasks to the code design [335]. It has been observed that syntax remains a major barrier to novice students in the field [439]. Languages such as C, C++, Java and C# are popular and widely used in technical and commercial areas, but their syntax, semantics and other peculiarities are very complex, and it is therefore unreasonable to use these languages in this phase. The subsets and educational environments of contemporary programming languages, such as Educational C [402] and Thetis [147], provide support in the form of visualization and debugging, but their syntax is still very complicated and thus unsuitable for the second phase of the LPL course. There are many educational programming environments, and most of them are graphical.

Python is an open scripting language and is popular for introductory programming [424], but its syntax differs considerably from that of other programming languages, and consequently it is difficult to switch to other languages once the student has learned all the essential concepts in the context of Python. In Python, variables do not have to be declared. This feature relieves programmers, but novice students face difficulties when moving to the mainstream programming languages, which are mostly statically typed.
Natural programming languages (discussed in section 3.2) are text-oriented and allow programs to be developed in a natural language. These languages free programmers from the concrete syntax of programming languages. Redundancy avoidance, locality and immediacy are essential traits of natural languages, and consequently implicit referencing, context dependence and compression are recurrently used in natural programming languages. However, these traits are almost unavailable in programming, and experience with natural programming languages provides no real benefit in learning contemporary programming languages. In [345, 346], it is reported that the use of natural language does not affect introductory programming. On that ground, natural programming languages are not effective in the second phase of LPL.

A study [152] reported that pseudocode is a better approach for introductory programming. CS0 courses introduce problem-solving and the use of algorithms [1, 3], so the recommended language for the second phase of the LPL course is a textual language which avoids excessive syntactic load and introduces the essentials of programming by using simple, understandable, self-explanatory and algorithm-oriented computational statements. Any existing programming language that satisfies these properties is highly suitable for the second phase. However, it is reasonable and logical to design a dedicated programming language for the second phase of the learners’ programming language. A specially developed textual language would typically address the defined objectives more effectively. The subsequent sections describe the fundamental features and the generic detail of the structure and construction of a textual programming language.

4.8 Dedicated Textual Language

The textual language is the nucleus of the learners’ programming language. Although it is not very different from other textual programming languages, it has a simple syntax and is tiny in size. Easy and self-explanatory statements are the real essence of the textual language. With these statements, novice students can easily understand the basics of programming. The dedicated textual language can be visualized as a simple, tiny and understandable form of contemporary programming languages.

High orthogonality is not useful in programming [416], so the textual language allows limited but multiple ways of realizing its constructs. The use of articles (definite and indefinite) is usually very difficult for novice students, especially for non-native English speakers, so the statements in the textual language are very similar to pseudocode, and the use of articles is mostly optional.

Identifying and differentiating statements is often very difficult in the initial stages of programming. So nearly all statements of the textual language (with a few exceptions, such as expressions) begin with a unique word, which is usually a verb. The textual language is case-insensitive and supports all the essential features of programming, which are discussed in section 4.4.

4.8.1 Subroutines

Functions are one of the main components of computer programs. In most contemporary programming languages, the syntax for defining a function is extremely complex. In the C language, the main function is conventionally defined in the following way:

int main (void) { ... }

In Java, the declaration of the function is more complicated and involves more detail. As an illustration, consider the following example:

public static void main(String[] args) { ... }

The textual language of the learners’ programming language allows functions to be declared in simple and understandable ways. All the statements in the program are enclosed within the main function. The following is one possible way of defining the main function in the learners’ programming language:

start of main program
...
end of program

The details of arguments and the return value can be attached to the definition of a function in simple and understandable ways. The textual language also allows a very concise structure for every available construct. For instance, the declaration of the main function may take the following form:

start
...
end

The LPL course does not recommend user-defined functions at the CS0 level. However, it is possible to include user-defined functions if essential. User-defined functions may be defined before or after the main program. Functions may take zero or more parameters and may return a single value.

4.8.2 Data types and variables

The variable is a complex concept for students [249], so the textual language places a particular focus on variables. The textual language supports different types of data. Currently, it supports four fundamental data types: integer, float, character and string. GRAIL [305] supports a single numeric type, whereas two numeric data types are allowed in the textual language. The declaration of variables in contemporary programming languages is very concise but not very helpful in visualizing the actual concept. As an illustration, consider the following C/C++ statement:

int age = 5;

Although the above statement is very concise, it does not clearly reflect the actual purpose. The declaration of a variable in the textual language is simple and helps in visualizing the actual purpose of the statement. All variables in the textual language are properly typed and require a proper declaration before use. In the textual language, a declaration starts with the keyword create (or some equivalent). A declaration statement may take the following forms:

create an integer type variable named age
create float variable fee
...
create integer age

Declaration statements have the same form for all types of variables in the textual language. The language also allows explicit initialization in a simple and obvious style.
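As a rough illustration of how uniform these declaration forms are, the following Python sketch recognizes the three forms shown above with a single pattern. The pattern, the set of optional filler words (articles, "type", "variable", "named") and the function name parse_declaration are assumptions made for this example only; they are not part of the LPL definition.

```python
import re

# Hypothetical pattern covering the declaration forms shown above.
# Which filler words are optional is an assumption of this sketch.
DECLARATION = re.compile(
    r"^create\s+(?:an?\s+)?(integer|float|character|string)"
    r"(?:\s+type)?(?:\s+variable)?(?:\s+named)?\s+([a-z][a-z0-9]*)$",
    re.IGNORECASE,  # the textual language is case-insensitive
)


def parse_declaration(statement):
    """Return (type, name) in lower case, or None if the statement does not match."""
    match = DECLARATION.match(statement.strip())
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


if __name__ == "__main__":
    for text in ("create an integer type variable named age",
                 "create float variable fee",
                 "create integer age"):
        print(text, "->", parse_declaration(text))
```

All three forms reduce to the same (type, name) pair, which is the information a translator would need in order to emit a declaration such as int age; in C.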

4.8.3 Literals

Integer literals are numbers without any fraction. Float literals are decimal values and must conform to the following pattern:

(.)optional

Exponential notation is usually very difficult for beginners and is therefore not available in the textual language. A character literal is a single character enclosed in single quotation marks. A string literal is a sequence of characters on a single line, enclosed in double quotation marks. Single-dimension arrays of integer, float and character are also allowed in the language.

4.8.4 Operators

The language includes the conventional arithmetic operators (+, -, *, /, %), relational operators (>, >=, <, <=, <>) and logical operators (and, or). The textual language does not require the use of these particular symbols for the operators. It is possible to adopt any other simple and understandable set of symbols for the operators.

4.8.5 Assignment

Like other programming languages, the textual language provides an assignment statement. The conventional assignment operator (=) is permitted in the textual language:

num = 141
name = "Maria"
profit = lastprofit

4.8.6 Input/Output

Input and output (I/O) are basic operations in almost every program. However, I/O in most programming languages is esoteric [393]. I/O in the C language is highly cryptic and confusing for students. The standard input statement in C is the scanf function. It requires details about the format of the input data and a pointer to the variable in which the data should be stored. The standard output statement in C is the printf function. It also requires a description of the format of the output data, but does not require a pointer to a variable. As an example, consider the following C code:

scanf("%d", &age);
printf("%d", age);

Input and output are principally symmetrical, but not in the case of C language. The I/O in C++ is based on streams, which is quite different from I/O in C language. The I/O in C++ involves the concepts of overloading of operators. As an illustration, consider the following C++ code:

cin >> age;
cout << age;

The concept of stream insertion (<<) and extraction (>>) in C++ is very difficult for the majority of novice students. As in C and C++, console input and output in Java is highly complicated. As an illustration, consider the following Java code:

InputStreamReader isr = new InputStreamReader(System.in);
BufferedReader bufreader = new BufferedReader(isr);
String inp = bufreader.readLine();
int number = Integer.parseInt(inp);

Input and output in the textual language are extremely simple and understandable. The same input and output statements are available for all types of variables. The input statements in the textual language may take the following forms:

take input in number
input in name

Similarly the output statements may take the following forms:

display the value of age
display number
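To make the contrast with C concrete, the following Python sketch translates the input and output forms shown above into the corresponding scanf and printf calls. It is only an illustration: the function translate_io, the symbol table passed as types, and the mapping from LPL types to C format specifiers are assumptions of this example, and a string variable is assumed to be a C character array.

```python
import re

# Assumed mapping from the four LPL data types to C format specifiers.
FORMATS = {"integer": "%d", "float": "%f", "character": "%c", "string": "%s"}

# Hypothetical patterns for the statement forms shown above.
INPUT = re.compile(r"^(?:take\s+)?input\s+in\s+(\w+)$", re.IGNORECASE)
OUTPUT = re.compile(r"^display\s+(?:the\s+value\s+of\s+)?(\w+)$", re.IGNORECASE)


def translate_io(statement, types):
    """Translate one LPL input/output statement into a line of C.

    `types` maps variable names to LPL type names (a hypothetical symbol table).
    """
    text = statement.strip()
    match = INPUT.match(text)
    if match:
        name = match.group(1)
        fmt = FORMATS[types[name]]
        # A string is assumed to be a char array, so it is passed without '&'.
        target = name if types[name] == "string" else "&" + name
        return f'scanf("{fmt}", {target});'
    match = OUTPUT.match(text)
    if match:
        name = match.group(1)
        fmt = FORMATS[types[name]]
        return f'printf("{fmt}", {name});'
    raise ValueError(f"not an input/output statement: {statement!r}")


if __name__ == "__main__":
    table = {"number": "integer", "name": "string", "age": "integer"}
    for text in ("take input in number", "input in name", "display the value of age"):
        print(text, "->", translate_io(text, table))
```

The student writes the same statement for every type; the format specifier and the address-of operator, which are the confusing parts of the C version, are derived from the declared type.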

4.8.7 Selection Structure

The selection structure is a cornerstone of programming languages. However, it is one of the most difficult topics for novice students [109, 157]. Contemporary programming languages allow a variety of selection structures, which are concise but not very illustrative or helpful in understanding the actual concept. As an example, consider a simple piece of C code:

if (a > b)
{
    number = a;
}
else
{
    number = b;
}

The above code is quite concise, but it does not make explicit which statement (number = a; or number = b;) is executed when the condition is satisfied. The textual language expresses selection structures in much more illustrative ways and helps novice students to comprehend the actual concept. The textual language supports separate structures for single, double and multiple selection. Each conditional structure of the textual language is prefixed by the keyword execute (or some equivalent), which helps students to identify and categorize the statements. The selection statement may take the following form:

execute the statements if a < 10
...
else if a > 10
...
else
...
end of if

4.8.8 Loops

Loops are a cornerstone of programming languages. However, they are usually very difficult for novice students [109, 111, 316, 501]. The loops in most contemporary programming languages are not very illustrative. The textual language provides loops which are close to their equivalents in contemporary programming languages, but more illustrative. Each iterative structure in the textual language starts with the keyword repeat (or some equivalent). The counter-controlled loop and the un-expected condition loop are allowed in the textual language. The counter loop may take the following forms:

repeat with variable num from 1 to 100 and the step is 2
...
end of loop

or,

repeat the statements 100 times
...
end of loop

The un-expected condition loop may take the following forms:

repeat the statements as long as num < 10
...
end of loop

or,

repeat statement while num > 10
...
end of loop
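The two counter-loop headers above can be read as a description of the values the counter takes. The following Python sketch makes that reading explicit. The patterns and the function counter_values are hypothetical, and two points are assumptions of this example rather than statements of the LPL definition: the upper bound is treated as inclusive, and a missing step defaults to 1.

```python
import re

# Hypothetical patterns for the two counter-loop headers shown above.
COUNTER = re.compile(
    r"^repeat\s+with\s+variable\s+(\w+)\s+from\s+(-?\d+)\s+to\s+(-?\d+)"
    r"(?:\s+and\s+the\s+step\s+is\s+(\d+))?$",
    re.IGNORECASE,
)
TIMES = re.compile(r"^repeat\s+(?:the\s+)?statements?\s+(\d+)\s+times$", re.IGNORECASE)


def counter_values(header):
    """Return (counter name or None, range of counter values) for a loop header."""
    text = header.strip()
    match = COUNTER.match(text)
    if match:
        start, stop = int(match.group(2)), int(match.group(3))
        step = int(match.group(4)) if match.group(4) else 1  # assumed default
        # Assumption: "to 100" includes 100 when the step reaches it.
        return match.group(1), range(start, stop + 1, step)
    match = TIMES.match(text)
    if match:
        return None, range(int(match.group(1)))
    raise ValueError(f"not a counter loop header: {header!r}")


if __name__ == "__main__":
    name, values = counter_values(
        "repeat with variable num from 1 to 100 and the step is 2")
    print(name, list(values)[:5], "...", values[-1])
```

Under these assumptions the first header corresponds to the C loop for (num = 1; num <= 100; num += 2), while the second form hides the counter variable from the student entirely.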

4.8.9 Comments

Most programming languages, such as Pascal, C, C++, Java and C#, allow comments. Comments are also allowed in the textual language. Currently, it allows single-line comments, which are prefixed by symbols such as “$$” or “##” and run to the end of the current line.
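A translator for the textual language would typically discard comments before any further processing. The Python sketch below shows one way to do this for the two example markers mentioned above. The function strip_comment is hypothetical, and the decision to ignore markers that appear inside a double-quoted string literal is an assumption of this example.

```python
def strip_comment(line, markers=("$$", "##")):
    """Remove a trailing single-line comment from one line of source text.

    Markers inside a double-quoted string literal are left alone
    (an assumption of this sketch).
    """
    in_string = False
    for index, char in enumerate(line):
        if char == '"':
            in_string = not in_string
        elif not in_string and line.startswith(markers, index):
            return line[:index].rstrip()
    return line.rstrip()


if __name__ == "__main__":
    print(strip_comment("display age $$ show the age"))
    print(strip_comment('name = "a ## b" ## set the name'))
```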

4.8.10 Error handling

Novice students make many errors in their programs, and frequent errors plague them as they learn to program. Denny et al. argued that syntax errors can be a significant barrier to student success [121]. Error correction is a fundamental part of the debugging process. Students strive to correct syntax errors, but consider debugging a difficult process [252]. Novice students working in a new and unfamiliar programming language waste considerable time correcting errors. Excessive time spent on correcting errors can be harmful, as novice students become discouraged with programming. Denny et al. conducted a study [120] to identify the extent to which syntax errors cause problems for students programming in CS1 courses, and reported that students encountered difficulty with syntax errors.

The study of errors and their impact on programming is a separate discipline [33, 76] and involves many other issues. The textual language provides support to novice students in debugging their programs. At a minimum, the textual language environment provides gentle support in the form of error messages that help novice students to locate and correct errors.

4.8.11 High level code generation

The textual language is designed to give students a means of examining the major elementary concepts of imperative programming, without forcing novices to grapple with the unusual notation of contemporary programming languages. It helps students to understand the syntax and other facets of contemporary programming languages.

High-level code generation is probably one of the most prominent features of the textual language. The integrated development environment of the textual language is augmented with a language translator that converts the students’ source programs into high-level programming code. The translation of textual language programs into high-level programs would help novice students to realize their programs and also support them in understanding contemporary programming languages. High-level code generation may also ease the difficulties encountered while transitioning to the first programming language.

4.8.12 Flexibility of a textual language

The prime aim of the textual language is to introduce students to real programming in a simple manner. The structure of the dedicated textual language discussed in the previous sections is one possible way of realizing the textual language. However, LPL does not impose the same structure on all dedicated textual languages. It is justifiable to design a textual language with a different structure, but with the same philosophy as the learners’ programming language. Principally, the textual language and the overall LPL course are based on the notion of flexibility.

4.9 Implementation

The learners’ programming language is intentionally designed as a flexible and economical course. The pliable nature of the LPL course makes it viable for a variety of academic programs. The overall cost, time and effort required for the design and implementation of the LPL course are nominal, and no major risk is involved in its implementation. The LPL course comprises two-phase learning and pair programming.

The first phase of LPL requires a graphical environment. Most graphical tools are open source and freely available, so no special cost is incurred in realizing the first phase. Graphical environments for the first phase of the LPL course do not require any sophisticated hardware or operating system. The second phase of LPL requires a textual language. Although any appropriate textual language can be used, a dedicated textual language is more beneficial in the second phase. The dedicated textual language is extremely small, and its development cost is usually nominal. Furthermore, no special hardware or operating system is required for the textual language. It is believed that teaching the basic concepts of programming is effectively served by implementing solutions in both graphical and text-oriented languages [125]. The learners’ programming language provides more benefits to students by using a graphical language as well as a text-oriented language.

In the LPL course, lectures and laboratory assignments are used to cover the topics. Instructors may implement the LPL course in their own ways. However, there are three main methods for implementing the LPL course, which are illustrated in Figure 4.4. In the first method, the fundamentals of elementary programming are introduced during the first phase. The second phase commences once the first phase is completed. So at any stage of the LPL course, the students are using either the graphical environment or the textual language.

Figure 4.4: Implementation of LPL course

In the second method, the students are introduced to the fundamentals of programming using the graphical environment of the first phase, and after 50% of the course has been introduced the second phase is started. The second phase further explains the concepts already introduced in the first phase. The entire course is covered in the same way.

The third method is a variant of the second method. In this method, the students are introduced to the fundamentals of programming using the graphical environment of the first phase, and after a few sessions the second phase is started. The second phase then introduces the concepts of programming. The entire course is covered in the same way.

Chapter 5

Dedicated Textual Language

This chapter is the continuation of chapter 4 and provides basic guidelines for the design and construction of a textual language and its translator.

5.1 Introduction

The second phase is the core of the LPL course, and the textual language is the nucleus of the second phase. The success of the second phase depends mainly on the effectiveness of the textual language. This section gives generic guidelines for the construction of a textual language. The guidelines apply to constructing the textual language outlined in section 4.7 and may also be helpful in developing a different textual language based on the same philosophy.

Like conventional programming languages, the textual language has syntax and semantics as its main components. In the context of programming languages, syntax refers to the way in which symbols from some alphabet may be combined to generate well-formed programs [430]. According to Chomsky, syntax is the study of the rules and procedures by which sentences are generated in a particular language [91]. Syntax describes the legitimate associations between the components of a language; it thereby provides a structural description of the expressions that form legal (valid) strings in the language. Syntax deals exclusively with the structure and arrangement of symbols in a language and is not concerned with their meaning.


The syntax of the learners’ programming language is divided into two parts: the lexical syntax and the phrase-level syntax. The lexical syntax defines how a sequence of characters is grouped into a sequence of lexemes. Lexemes are coherent lexical units, such as integers or identifiers [309]. Fundamentally, the lexical syntax defines the dictionary of the textual language and the set of patterns for those lexemes which have a variable structure. The phrase-level syntax (or just syntax) describes the rules that govern how words are combined into grammatically correct programs of the learners’ programming language.

The meanings of programming language statements are determined by rules called the semantics [142]. Semantics refers to the meaning of syntactically correct strings in a language. For natural languages, semantics associates sentences and phrases with thoughts, objects and feelings. As far as programming languages are concerned, semantics defines the behavior that a computer or machine follows while the program is executing. As with other programming languages, the syntax of the learners’ programming language should be defined prior to its semantics, because meaning can only be given to valid expressions.

5.2 Lexical Structure

The definition of the lexical syntax is the first step in the development of a textual language, and it begins with the formation of the alphabet of the learners’ programming language. An alphabet is a nonempty, finite set of symbols [197]. The alphabet of a textual language contains all the permissible symbols. After the formation of the alphabet, the second step is to identify all the lexical units of the textual language. Lexical units are the fundamental lexemes that constitute the program code. Keywords, identifiers, literals, operators and punctuators are the lexical units of the textual language. In the textual language, blank spaces are used to separate lexemes.

Keywords are reserved words and have predefined meanings in the learners’ programming language. Keywords can only be used for their predefined purposes and can never be used as programmer-defined identifiers [162]. The number of keywords in the textual language is usually larger than in contemporary programming languages, because all the supported features of the textual language are included in the grammar rather than in separate packages or library files.

Identifiers are names assigned to different program components, such as variables, constants, arrays, labels and functions [162]. In the textual language, identifiers are composed of letters and digits in any sequence, but the first symbol must be a letter. The underscore character may also be allowed in identifiers.

Any explicit value in a program is called a literal. Basically, a literal is a value written directly into a source program. In the textual language there are four types of literals:

Character literal

Integer literal

Floating-point literal

String literal

A character literal is a single character enclosed in single quotation marks. An integer-valued number is called an integer literal. Integer literals consist of a sequence of digits without any fraction. The textual language allows integer literals only in the decimal number system. A floating-point literal is a number that contains an integer part, a decimal point and a fraction part. The integer and fraction parts both consist of a sequence of digits. A string literal is a sequence of characters enclosed in double quotation marks.

Operators are symbols that operate on a value or a variable. The textual language supports the major operators, namely arithmetic, relational and logical operators.

In natural language, punctuators are used to organize and structure the text. Unlike conventional programming languages, the textual language has limited punctuation, and several punctuation marks are not categorized as separate lexemes but as part of another lexeme. For example, in the lexeme “age”, the quotation marks are not recognized as separate lexemes, but are used to delimit the string literal.

Comments are statements that help readers to understand the code [211]. Single-line comments and multi-line comments are the common types of comments provided by the majority of popular programming languages. Comments are an intrinsic part of the learners’ programming language. The textual language supports single-line comments.

After identifying all the lexical units, the next and last step of lexical syntax is the formal definition (construction) of the identified lexical units. Fortunately, there are several well-established techniques that can be used for the definition of lexical syntax.

5.2.1 Regular Expression

The regular expression is a tool and a mathematical system used for describing regular languages. According to Kelty [226], the regular expression originated with Rudolf Carnap while he was working on the syntax of logic; later it was used by McCulloch and Pitts and transformed into a logical calculus of nerve nets. Afterwards, the regular expression was adopted by John von Neumann for his description of the EDVAC computer. Subsequently, extensive work on the regular expression was conducted by Stephen Kleene, who formalized it and coined the name regular expression. Janusz Brzozowski employed the regular expression for designing state diagrams for circuits [71].

A regular expression can be visualized as a mini programming language that describes text. Regular expressions are used in many text-processing tools, including vi, grep and emacs. In fact, they are a key to robust, versatile and coherent text processing [148]. The regular expression is recognized as a universal method for describing the lexical specification of programming languages. It has many other applications, particularly in the area of networking [159, 498].

A regular expression over the alphabet Σ can be defined inductively [483]:

• ∅ is a regular expression defined over the alphabet Σ and describes the regular language containing no words, i.e. {}.

• ε is a regular expression defined over the alphabet Σ and describes the regular language containing one word, whose length is zero, i.e. {ε}.

• a (where a ∈ Σ) is a regular expression defined over the alphabet Σ and describes the regular language {a}.

• If r1 and r2 are regular expressions defined over the alphabet Σ and describe the regular languages R1 and R2 respectively, then:

  – (r1 | r2) is a regular expression and describes the regular language R1 ∪ R2.

  – (r1r2) is a regular expression and describes the regular language R1R2.

  – r1* is a regular expression and describes the regular language R1*.
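The inductive definition above maps directly onto a small recursive data type. The Python sketch below is one possible encoding, given only for illustration: the class names and the function words are hypothetical, and because R1* is in general infinite, the function lists only the words up to a chosen maximum length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:            # ∅, the language with no words
    pass


@dataclass(frozen=True)
class Epsilon:          # ε, the language containing only the empty word
    pass


@dataclass(frozen=True)
class Symbol:           # a single symbol a ∈ Σ
    char: str


@dataclass(frozen=True)
class Union:            # (r1 | r2)
    left: object
    right: object


@dataclass(frozen=True)
class Concat:           # (r1r2)
    left: object
    right: object


@dataclass(frozen=True)
class Star:             # r1*
    inner: object


def words(r, max_len):
    """Return the words of length <= max_len in the language described by r."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Epsilon):
        return {""}
    if isinstance(r, Symbol):
        return {r.char} if len(r.char) <= max_len else set()
    if isinstance(r, Union):
        return words(r.left, max_len) | words(r.right, max_len)
    if isinstance(r, Concat):
        return {u + v
                for u in words(r.left, max_len)
                for v in words(r.right, max_len)
                if len(u + v) <= max_len}
    if isinstance(r, Star):
        base = words(r.inner, max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:  # each round only keeps new words, so this terminates
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


if __name__ == "__main__":
    # a(a|b)* : words that start with 'a', listed up to length 2
    example = Concat(Symbol("a"), Star(Union(Symbol("a"), Symbol("b"))))
    print(sorted(words(example, 2)))
```

Each branch of words corresponds to one clause of the definition, so the function is a direct, if inefficient, reading of what each regular expression describes.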

5.2.2 Definition of lexical structure

The lexical units of a textual language are regular languages; therefore, for each unit a separate regular expression is described.

5.2.2.1 Identifiers

An identifier consists of a letter followed by zero or more letters or digits. This pattern can be defined by the following regular expression:

anyletter → A | B | C | ... | Z | a | b | ... | z
anydigit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier → anyletter (anyletter | anydigit)*

If the language allows the underscore in identifiers, it suffices to add the underscore as an alternative in the first rule.
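The identifier pattern can be checked mechanically. The sketch below transcribes the three rules into Python's re syntax; the constant and function names are assumptions chosen to mirror the rule names.

```python
import re

ANYLETTER = "[A-Za-z]"   # anyletter → A | ... | Z | a | ... | z
ANYDIGIT = "[0-9]"       # anydigit  → 0 | 1 | ... | 9

# identifier → anyletter (anyletter | anydigit)*
IDENTIFIER = re.compile(f"{ANYLETTER}(?:{ANYLETTER}|{ANYDIGIT})*")


def is_identifier(text: str) -> bool:
    """Return True if text as a whole matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ["age", "num1", "1num", "stu_name"]:
    print(candidate, is_identifier(candidate))
```

With the rules as stated, "stu_name" is rejected; adding the underscore to ANYLETTER would accept it.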

5.2.2.2 Keywords

Unlike identifiers, keywords have a fixed pattern, and their regular expressions are very simple. For instance, consider the word display as a keyword of a textual language. Its regular expression involves only the concatenation of its symbols. The following are the regular expressions for some possible keywords of the textual language.

input → input
output → output
display → display
close → close

Like keywords, operators and punctuators have fixed patterns; therefore, their regular expressions follow the same logic. As an illustration, consider the following lexical units and their respective regular expressions:

>= → >=
<> → <>
<= → <=

5.2.2.3 Literals

For each type of literal, there is a separate regular expression. Character literals are enclosed between single quotes, as in ‘b’. For convenience and understandability, the symbol ␣ is used to represent a white space. Following is the regular expression of a character literal:

characterliteral → ‘any character’

any character → a | b | ... | z | A | B | ... | Z | 0 | 1 | ... | 9 | | ␣

In the textual language, integer literals are given only in the decimal number system, so the respective regular expression is simple. Following is the regular expression of an integer literal.

integerliteral → ( + | - | ε ) digit digit*
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Floating-point literals contain one or more digits before and after the decimal point, and may be represented by the following regular expression.

floatliteral → ( + | - | ε ) digits . digits
digits → digit digit*
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The string literal is enclosed in double quotation marks, and its respective regular expression may take the following form:

stringliteral → “any character*”

any character → a | b | ... | z | A | B | ... | Z | 0 | 1 | ... | 9 | ␣
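The literal patterns of this section can be collected into a small classifier. This is a sketch under stated assumptions: straight ASCII quotes stand in for the typographic quotes, the "any character" class is restricted to letters, digits and the space, and the function name classify is hypothetical.

```python
import re

DIGIT = "[0-9]"
SIGN = "[+-]?"                   # ( + | - | ε )
ANY_CHARACTER = "[A-Za-z0-9 ]"   # assumed: letters, digits and the space

PATTERNS = [
    ("integerliteral", re.compile(f"{SIGN}{DIGIT}{DIGIT}*")),
    ("floatliteral", re.compile(f"{SIGN}{DIGIT}+\\.{DIGIT}+")),
    ("characterliteral", re.compile(f"'{ANY_CHARACTER}'")),
    ("stringliteral", re.compile(f'"{ANY_CHARACTER}*"')),
]


def classify(text):
    """Return the name of the literal pattern that text matches, or None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None


for sample in ["-42", "3.14", "'b'", '"hello world"', "3."]:
    print(sample, classify(sample))
```

Because every pattern must match the whole text, a digit string followed by a bare decimal point is rejected by both the integer and the floating-point pattern, in line with the requirement of digits on both sides of the point.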

There are several extensions of the regular expression notation (such as {} and []); however, the conventional notation is used here for the definition of lexical units.

5.2.3 Regular Grammar

A regular grammar is another way to describe the lexical syntax of a textual language. Like regular expressions, regular grammars describe the regular languages; therefore, the two are equivalent systems [397]. According to [385], a regular grammar G is a quadruple (V, Σ, R, S), where:

• V is the rule alphabet, which contains terminals and nonterminals.

• Σ is the set of terminals. Terminals are those symbols that appear in the strings constructed by G.

• R is a finite set of rules.

• S is a nonterminal and the start symbol of the grammar G.

Because the two systems are equivalent, any language that can be described by a regular expression can also be defined by a regular grammar. However, constructing a regular grammar is more difficult than constructing the equivalent regular expression; therefore, the regular expression is preferred for the definition of lexical patterns.
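To make the equivalence concrete, the identifier pattern of section 5.2.2.1 can also be written as a regular grammar. The sketch below assumes a right-linear form with the hypothetical nonterminals S and A (S → letter A; A → letter A | digit A | ε) and recognizes a word by tracking the nonterminals reachable after each symbol.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)

# Right-linear rules: each alternative is (set of terminals, next nonterminal).
#   S → letter A
#   A → letter A | digit A | ε
RULES = {
    "S": [(LETTERS, "A")],
    "A": [(LETTERS, "A"), (DIGITS, "A")],
}
NULLABLE = {"A"}  # nonterminals that have an ε-alternative


def generates(word: str, start: str = "S") -> bool:
    """Return True if the grammar can derive word from the start symbol."""
    current = {start}
    for symbol in word:
        current = {
            nxt
            for nonterminal in current
            for terminals, nxt in RULES[nonterminal]
            if symbol in terminals
        }
        if not current:
            return False
    # The derivation can finish only from a nonterminal that may vanish.
    return bool(current & NULLABLE)


print(generates("num1"), generates("1num"))
```

The grammar needs two nonterminals and an explicit ε-rule to express what the regular expression states in one line, which illustrates why the regular expression is usually the more convenient notation.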

5.3 Syntax and Grammar

The syntax of a language is a collection of rules describing which of its expressions are grammatically valid [179]. Syntax deals with the formation of a program (or sentence) in a language and is concerned with the arrangement of lexical units in a program. The definition of the phrase-level syntax is more strenuous and technical than the lexical specification; among all the steps in the definition of a textual language, it is the most time-consuming. The most indispensable task in the definition of syntax is the construction of a grammar. A grammar is a system used to describe a language. According to Grune and Jacobs [163], a grammar is a book of rules and examples which describes and teaches the language; more precisely, a grammar is any exact, finite-size and complete description of the language. Grammar is also the method through which words are grouped together to form significant and meaningful utterances [440]. There are different types of grammars, but the context-free grammar is a simple and mathematically precise mechanism for describing the syntax of programming languages.

5.3.1 Context-free grammar

Context-free grammar (CFG) is a type of grammar used for the description of programming languages. The core formalism of the context-free grammar was defined by Noam Chomsky in the 1950s [90, 91]. A CFG [296] is defined as a four-tuple (N, T, P, S), where:

• N is a finite set of nonterminals. Nonterminals are also called variables or syntactic categories. Some authors use V to denote the nonterminals.

• T is a finite set of terminals. Some authors use Σ to denote the terminals. The sets of nonterminals and terminals are always disjoint.

• P is a non-empty finite set of productions (rules) of the form B → α, where B ∈ N and α ∈ (N ∪ T)*.

• S is the start symbol (or distinguished symbol) and an element of N. Conventionally, S is the head of the first production of the CFG.

Consider a context-free grammar which describes a fragment of the English language [426].

|

| → with
→ the | a → flower | girl | boy → likes | touches | sees

The above grammar allows several strings in the language:

A girl sees

The girl sees a flower

A boy with a flower likes the girl

Each of the strings allowed in the language can be generated from the above grammar by derivation. A derivation is the process of applying productions (rules) to generate a string of terminals from the start symbol [96]. In a context-free grammar, a derivation may include a sentential form with multiple nonterminals, so there is a choice in the order in which the nonterminals are replaced. A derivation is called leftmost if, in each step, the leftmost nonterminal in the sentential form is replaced. Similarly, in a rightmost derivation the rightmost nonterminal is replaced [272]. The following is the leftmost derivation of “a boy with a flower likes the girl”.

⇒ a
⇒ a boy
⇒ a boy
⇒ a boy with
⇒ a boy with
⇒ a boy with a
⇒ a boy with a flower
⇒ a boy with a flower
⇒ a boy with a flower
⇒ a boy with a flower likes
⇒ a boy with a flower likes
⇒ a boy with a flower likes
⇒ a boy with a flower likes the
⇒ a boy with a flower likes the girl

In the above derivation, the leftmost nonterminal is replaced in each step. The same string can be derived with a rightmost derivation. As an illustration, consider the rightmost derivation of “a boy with a flower likes the girl”.

⇒ girl
⇒ the girl
⇒ likes the girl
⇒ likes the girl
⇒ likes the girl
⇒ likes the girl
⇒ flower likes the girl
⇒ a flower likes the girl
⇒ a flower likes the girl
⇒ with a flower likes the girl
⇒ with a flower likes the girl
⇒ boy with a flower likes the girl
⇒ a boy with a flower likes the girl

Generating the sentence with the rightmost derivation is very similar to generating it with the leftmost derivation. The type of derivation is an important concern when selecting a technique for the recognition of a context-free language.

The derivation of a sizable string is very long; therefore, it is usually represented by a tree called a parse tree (or derivation tree). A tree is a parse tree for a context-free grammar G = (N, T, P, S) if and only if it has the following properties [272, 321, 385].

1. The root is labeled S.

2. Every leaf node is labeled with an element of T ∪ {ε}.

3. The label of every internal vertex is a nonterminal.

4. If a node has label B ∈ N, and its children are labeled (left to right) b1, b2, ..., bn, then B → b1 b2 ... bn is a production in P.

5. A leaf labeled ε has no siblings; that is, a vertex with a child labeled ε can have no other children.

The parse tree of “a boy with a flower likes the girl” is shown in Figure 5.1.

Figure 5.1: Parse tree of “a boy with a flower likes the girl”
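The derivation process described above can be sketched in code. The grammar below is an assumed reconstruction that generates the three example sentences; its nonterminal names (sentence, noun_phrase, prep_phrase and so on) are hypothetical and may differ from those of the cited grammar, so the number of steps need not match the derivation shown earlier. The function searches for a leftmost derivation by always expanding the leftmost nonterminal.

```python
# Assumed grammar: any symbol that is not a key is a terminal.
GRAMMAR = {
    "sentence": [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["article", "noun", "prep_phrase"], ["article", "noun"]],
    "prep_phrase": [["with", "noun_phrase"]],
    "verb_phrase": [["verb", "noun_phrase"], ["verb"]],
    "article": [["the"], ["a"]],
    "noun": [["flower"], ["girl"], ["boy"]],
    "verb": [["likes"], ["touches"], ["sees"]],
}


def leftmost_derivation(target, start="sentence"):
    """Return the sentential forms of a leftmost derivation of target, or None."""

    def search(form, steps):
        # Position of the leftmost nonterminal, if any remains.
        index = next((i for i, s in enumerate(form) if s in GRAMMAR), None)
        if index is None:
            return steps if form == target else None
        # Prune: the terminal prefix must agree with the target, and the form
        # cannot shrink because the grammar has no ε-productions.
        if form[:index] != target[:index] or len(form) > len(target):
            return None
        for body in GRAMMAR[form[index]]:
            new_form = form[:index] + body + form[index + 1:]
            result = search(new_form, steps + [new_form])
            if result is not None:
                return result
        return None

    return search([start], [[start]])


for form in leftmost_derivation("a boy with a flower likes the girl".split()):
    print("⇒", " ".join(form))
```

A rightmost derivation could be obtained in the same way by selecting the last nonterminal of each sentential form instead of the first.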

Sometimes a sentence is ambiguous. For instance, the sentence “in books selected information is given” is ambiguous [321], since it has two readings: i) in books selected, “information is given”; ii) in books, “selected information is given”. The same condition may occur in context-free grammars. A context-free grammar G is said to be ambiguous if there exists at least one string w ∈ L(G) that has at least two different parse trees [272]. In other words, ambiguity in a grammar G indicates that there is a sentence in L(G) with at least two distinct leftmost derivations (or, equivalently, at least two distinct rightmost derivations). For example, the context-free grammar G = ({S}, {a, b, c, +, *}, {S → S + S | S * S | a | b | c}, S) is ambiguous: the string b*c+a has two distinct parse trees (shown in Figure 5.2).

Ambiguity as a problem in grammars was recognized by Floyd [144] and Cantor [77] at around the same time.

In natural languages, ambiguity is a common feature and is handled in different ways, whereas in programming languages ambiguity must be removed so that each sentence has a single representation. However, some context-free languages can be defined only by ambiguous grammars. Such languages are called inherently ambiguous [426].

Figure 5.2: Two parse trees for b*c+a
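The ambiguity of the grammar S → S + S | S * S | a | b | c can be verified by counting parse trees. The sketch below is an illustrative dynamic-programming count over substrings; the function name count_parse_trees and the representation of the input as a string of one-character symbols are assumptions.

```python
from functools import lru_cache

OPERATORS = {"+", "*"}
OPERANDS = {"a", "b", "c"}


def count_parse_trees(tokens: str) -> int:
    """Count the parse trees of tokens under S → S+S | S*S | a | b | c."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of parse trees that derive tokens[i:j] from S.
        total = 0
        if j - i == 1 and tokens[i] in OPERANDS:
            total += 1  # S → a | b | c
        for k in range(i + 1, j - 1):  # S → S op S with the operator at k
            if tokens[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("b*c+a"))
```

For b*c+a the count is 2: one tree has * at the root and the other has +. A grammar is unambiguous only if this count never exceeds 1 for any string of its language.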

The syntax of the textual language is very descriptive; therefore, its grammar is likely to be ambiguous. Ambiguity does not seem very problematic during the definition of a language; however, many techniques for parsing context-free languages cannot handle ambiguous grammars.

5.3.2 Definition of syntax

In order to define the syntax of a textual language, it is worthwhile to develop a separate context-free grammar for each construct, and finally integrate all these grammars to form the complete grammar of the textual language. The statements of a textual language program are enclosed in a function. The function is a fundamental part of computer programs, and the main function is an essential element of programs in many languages. Textual language programs are enclosed in the main function, so defining the structure of the function is the first step in describing the syntax of the textual language. The function has three parts:

Function header

Statement(s)

Function closing delimiter

The function header is a statement that indicates the start of the function. The main function header of the textual language is similar to the “void main (void)” of the C language. In the textual language, the function header is a plain and understandable statement that starts with a word that literally or metaphorically indicates the start of something. The remaining fields are the articles, prepositions and other relevant components that constitute a simple statement. For instance, the following types of statements are allowed as the header of the main function (as discussed in section 4.8).

start of the main program or,

start of program or,

start

There are several ways to define the context-free grammar for the function header of the textual language. Following is one possible realization of the function header:

→ VERB START
→ PREPOSITION OF PROGRAM SYMBOL | ε
→ ARTICLE A | ARTICLE THE | ε
→ FUNCTION NAME | ε

The programs written in the textual language are translated by a language translator. During the construction of the language translator, all the lexical units of the language are tokenized, and each token represents a specific syntactic category. Therefore, the lexical units of the textual language are tokenized for convenience, and the grammar is defined in terms of these tokens. The above grammar has four nonterminals. The first production belongs to the start symbol and derives VERB START followed by a nonterminal for the remaining phrase of the function header. VERB START is a terminal (token) that represents a word (like start) which indicates the start of the function header. The nonterminal for the remaining phrase sequentially generates the following:

1. The preposition (of), which is tokenized as PREPOSITION OF.

2. The article (a, the), which is generated by a separate nonterminal.

3. The name of the function, which is derived from a further nonterminal that may be reduced to the empty string (ε).

4. The word “program”, which is tokenized as PROGRAM SYMBOL.

In the simplest case, a single token VERB START is allowed as a function header. The name of the main function is tokenized as FUNCTION NAME, which is an identifier. In several cases, a separate type of token is not generated for a function name, and the same type of token is generated for all types of identifier. The statement “start of the main program” is a valid function header, and therefore it can be derived from the above grammar. After tokenization, the header becomes “VERB START PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL”, and can be derived as follows:

⇒ VERB START
⇒ VERB START PREPOSITION OF PROGRAM SYMBOL
⇒ VERB START PREPOSITION OF ARTICLE THE PROGRAM SYMBOL
⇒ VERB START PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL

The sentence “start of program” is a valid function header in textual language, and therefore, it can be generated by the grammar. After tokenization, the statement becomes “VERB START PREPOSITION OF PROGRAM SYMBOL”, and can be derived as follows:

⇒ VERB START
⇒ VERB START PREPOSITION OF PROGRAM SYMBOL
⇒ VERB START PREPOSITION OF PROGRAM SYMBOL
⇒ VERB START PREPOSITION OF PROGRAM SYMBOL
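The function header grammar is small enough to be checked by a hand-written recognizer. The following sketch rests on several assumptions: token names are written with underscores, the word-to-token table is hypothetical, and only the word main is treated as a function name.

```python
# Hypothetical word-to-token table (token names written with underscores).
TOKENS = {
    "start": "VERB_START",
    "of": "PREPOSITION_OF",
    "a": "ARTICLE_A",
    "the": "ARTICLE_THE",
    "main": "FUNCTION_NAME",
    "program": "PROGRAM_SYMBOL",
}


def tokenize(statement):
    """Map each word to its token; an unknown word raises KeyError."""
    return [TOKENS[word] for word in statement.lower().split()]


def is_function_header(tokens):
    """Recognize: VERB_START [ PREPOSITION_OF [article] [name] PROGRAM_SYMBOL ]."""
    pos = 0
    if pos >= len(tokens) or tokens[pos] != "VERB_START":
        return False
    pos += 1
    if pos == len(tokens):  # the remaining phrase reduces to ε
        return True
    if tokens[pos] != "PREPOSITION_OF":
        return False
    pos += 1
    if pos < len(tokens) and tokens[pos] in ("ARTICLE_A", "ARTICLE_THE"):
        pos += 1  # the article is optional
    if pos < len(tokens) and tokens[pos] == "FUNCTION_NAME":
        pos += 1  # the function name is optional
    return tokens[pos:] == ["PROGRAM_SYMBOL"]


for header in ["start of the main program", "start of program", "start"]:
    print(header, "->", is_function_header(tokenize(header)))
```

Each optional nonterminal of the grammar becomes an optional step in the recognizer, which is why all three header forms listed earlier are accepted.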

The second part of a function is the list of statements. The textual language is imperative in nature, and the statement section of a program begins with the declaration of variables. After the declarations, the program includes the statements that define the actual computations. In the textual language, the statements for the declaration of variables are simple and obvious. Every declaration starts with a word that indicates the declaration, followed by the data type of the variable, its name and an optional initial value. Supporting words are optionally allowed in the declaration to make it more understandable. The declaration statement may take one of the following forms:

create an integer type variable named age = 20 or,

create an integer type variable named sno or,

create float type variable fee = 1200 or,

create string variable studentname or,

create character grade

One possible context-free grammar for a declaration statement in the textual language is stated below:

→ VERB CREATE IDENTIFIER
→ INDEFINITEARTICLE AN INTEGER DATATYPE | FLOAT DATATYPE | STRING DATATYPE | CHARACTER DATATYPE
→ TYPESYMBOL | ε
→ VARIABLESYMBOL | ε
→ NAMEDSYMBOL | ε
→ = CONSTANT | ε

The above context-free grammar has six productions. The words type, variable and named are tokenized as TYPESYMBOL, VARIABLESYMBOL and NAMEDSYMBOL respectively. The first production belongs to the start symbol and begins with VERB CREATE, which is the first field of the declaration statement and represents a word like create (or an equivalent). VERB CREATE is followed by a nonterminal that defines the permissible data types. The next nonterminal optionally generates the token TYPESYMBOL. A further nonterminal may generate the token VARIABLESYMBOL together with a nonterminal that optionally generates NAMEDSYMBOL. The variable name is represented by IDENTIFIER. The last nonterminal is used to initialize the variable with a constant. Here a single token (CONSTANT) is used to represent any type of constant, including integer, float, character or string constants. It is possible, and sometimes more desirable, to define different tokens for different types of constants. For instance, INTEGER CONSTANT could be used to represent an integer constant, FLOAT CONSTANT a floating-point constant, and so on. After tokenization, the statement “create an integer type variable named age = 30” becomes “VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE TYPESYMBOL VARIABLESYMBOL NAMEDSYMBOL IDENTIFIER = CONSTANT”, and is derived as follows:

⇒ VERB CREATE IDENTIFIER
⇒ VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE IDENTIFIER
⇒ VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE IDENTIFIER
⇒ VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE TYPESYMBOL VARIABLESYMBOL IDENTIFIER
⇒ VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE TYPESYMBOL VARIABLESYMBOL NAMEDSYMBOL IDENTIFIER
⇒ VERB CREATE INDEFINITEARTICLE AN INTEGER DATATYPE TYPESYMBOL VARIABLESYMBOL NAMEDSYMBOL IDENTIFIER = CONSTANT

Figure 5.3 shows a parse tree for the above derivation.

Figure 5.3: Parse tree of “create an integer type variable named age = 30”
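The tokenization and recognition of declaration statements can be sketched as follows. The keyword table, the underscore spelling of token names and the restriction of CONSTANT to numeric constants are assumptions made for illustration. Because this particular grammar contains no recursion, the language of token sequences it generates is finite, so a single pattern over the token sequence is sufficient here; that shortcut is a substitute for a general context-free parser, not the method of the text.

```python
import re

# Hypothetical keyword table (token names written with underscores).
KEYWORDS = {
    "create": "VERB_CREATE",
    "an": "INDEFINITEARTICLE_AN",
    "integer": "INTEGER_DATATYPE",
    "float": "FLOAT_DATATYPE",
    "string": "STRING_DATATYPE",
    "character": "CHARACTER_DATATYPE",
    "type": "TYPESYMBOL",
    "variable": "VARIABLESYMBOL",
    "named": "NAMEDSYMBOL",
    "=": "=",
}


def tokenize(statement):
    """Convert a declaration statement into a list of tokens."""
    tokens = []
    for word in statement.split():
        if word in KEYWORDS:
            tokens.append(KEYWORDS[word])
        elif re.fullmatch(r"[0-9]+(\.[0-9]+)?", word):
            tokens.append("CONSTANT")  # assumed: numeric constants only
        elif re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", word):
            tokens.append("IDENTIFIER")
        else:
            raise ValueError(f"unrecognized lexical unit: {word!r}")
    return tokens


# One pattern over the token sequence, mirroring the six productions.
DECLARATION = re.compile(
    r"VERB_CREATE "
    r"(INDEFINITEARTICLE_AN INTEGER_DATATYPE|FLOAT_DATATYPE"
    r"|STRING_DATATYPE|CHARACTER_DATATYPE) "
    r"(TYPESYMBOL )?"
    r"(VARIABLESYMBOL (NAMEDSYMBOL )?)?"
    r"IDENTIFIER"
    r"( = CONSTANT)?"
)


def is_declaration(statement):
    """Return True if the statement is a valid declaration."""
    return DECLARATION.fullmatch(" ".join(tokenize(statement))) is not None


print(tokenize("create an integer type variable named age = 30"))
print(is_declaration("create character grade"))
```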

Relatively short statements are also allowed to declare a variable. For instance, “create character grade” is a legal declaration statement of the textual language and can be derived as follows:

⇒ VERB CREATE IDENTIFIER
⇒ VERB CREATE CHARACTER DATATYPE IDENTIFIER
⇒ VERB CREATE CHARACTER DATATYPE IDENTIFIER
⇒ VERB CREATE CHARACTER DATATYPE IDENTIFIER
⇒ VERB CREATE CHARACTER DATATYPE IDENTIFIER

After the declaration section, textual language programs contain the statements that specify the actual computation. These statements provide console input/output, arithmetic expressions, assignment statements, condition structures, repetition structures, and many other facilities. In terms of functionality, these statements are quite different from each other, but the central notions behind the development of their context-free grammars are highly similar, and the knowledge of one can be applied in the construction of the others. The input statement is one of the fundamental statements of the textual language. It is a simple and plain statement which starts with a keyword that indicates the start of the input process. The statement also includes the name of the input variable and supporting words that make the statement correct and understandable. The input statement may take the following forms:

take input in variablename or,

input in variablename

One possible context-free grammar for the input statement in the textual language is stated below:

INPUTSYMBOL PREPOSITION IN IDENTIFIER
→ VERB INPUT | ε

Here VERB INPUT represents words like take or get. The word “input” is represented by INPUTSYMBOL. A statement like “take input in age” is tokenized into the form “VERB INPUT INPUTSYMBOL PREPOSITION IN IDENTIFIER” and is derived as follows:

INPUTSYMBOL PREPOSITION IN IDENTIFIER
⇒ VERB INPUT INPUTSYMBOL PREPOSITION IN IDENTIFIER

The output statement in the textual language starts with a word like “display”, “print” or “show”. It also includes the output parameter. The output statement may take the following forms:

display the value of age or,

print a number

One possible context-free grammar for the output statement is stated below:

→ VERB OUTPUT
→ ARTICLE A | ARTICLE THE | ε
→ IDENTIFIER | CONSTANT | VALUE
→ PREPOSITION OF IDENTIFIER | CONSTANT

The above context-free grammar has four productions. The distinguished symbol of the grammar generates VERB OUTPUT, which represents words like print, show and display. VERB OUTPUT is optionally followed by an article. A further nonterminal generates the IDENTIFIER or CONSTANT along with the supporting words. The statement “display the value of age” is tokenized as “VERB OUTPUT ARTICLE THE VALUE OF IDENTIFIER” and is derived as follows:

⇒ VERB OUTPUT
⇒ VERB OUTPUT ARTICLE THE
⇒ VERB OUTPUT ARTICLE THE VALUE
⇒ VERB OUTPUT ARTICLE THE VALUE OF IDENTIFIER

The above derivation can be represented by a parse tree illustrated in Figure 5.4.

Figure 5.4: Parse tree of “display the value of age”

Arithmetic expressions are pivotal components of the textual language and are almost identical to the expressions in high-level programming languages. Therefore, the standard grammar of arithmetic expressions can be used in the textual language. The arithmetic expression in the textual language supports all the basic arithmetic operations. In most situations, the arithmetic expression is assigned to a variable; therefore, it is worthwhile to add a rule for assigning the expression to a variable.

The standard context-free grammar for the arithmetic expression is stated below:

→ IDENTIFIER = + - * / → ( ) → IDENTIFIER | CONSTANT

The distinguished symbol of the grammar defines a variable followed by an equal sign and an expression. A first nonterminal generates the part of an expression involving addition or subtraction. It can in turn generate a second nonterminal, which generates the part of the expression involving multiplication and division. The second nonterminal can also generate a third, from which parenthesized expressions, IDENTIFIER and CONSTANT are generated. It can be seen that the context-free grammar of the arithmetic expression is slightly more complex than the grammars of the other constructs. Recursion is one of the main reasons that make the grammar more complicated. After tokenization, the expression “value = num1 * 20 / (code + counter)” becomes “IDENTIFIER = IDENTIFIER * CONSTANT / (IDENTIFIER + IDENTIFIER)”, and can be derived as follows:

⇒ IDENTIFIER = ⇒ IDENTIFIER = ⇒ IDENTIFIER = ⇒ IDENTIFIER = / ⇒ IDENTIFIER = * / ⇒ IDENTIFIER = * / 5.3. Syntax and Grammar 105

⇒ IDENTIFIER = IDENTIFIER * / ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( + ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( + ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( + ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( IDENTIFIER + ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( IDENTIFIER + ) ⇒ IDENTIFIER = IDENTIFIER * CONSTANT / ( IDENTIFIER + IDENTIFIER)

Figure 5.5 illustrates the parse tree of “value = num1 * 20 / (code + counter)”.

Figure 5.5: Parse tree of “value = num1 * 20 / (code + counter)”
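The layered structure of the arithmetic expression grammar can be illustrated with a small recursive-descent recognizer. The nonterminal names assignment, expression, term and factor are assumed here, following the usual textbook naming. The left-recursive rules are realized as loops, which is a standard substitution for recursive descent rather than a literal transcription of the productions.

```python
import re


def tokenize(text):
    """Split text into lexemes and map them to IDENTIFIER, CONSTANT or symbols."""
    tokens = []
    pattern = r"[A-Za-z][A-Za-z0-9]*|[0-9]+(?:\.[0-9]+)?|\S"
    for lexeme in re.findall(pattern, text):
        if re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", lexeme):
            tokens.append("IDENTIFIER")
        elif lexeme[0].isdigit():
            tokens.append("CONSTANT")
        else:
            tokens.append(lexeme)
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token}, found {self.peek()}")
        self.pos += 1

    def assignment(self):  # assignment → IDENTIFIER = expression
        self.expect("IDENTIFIER")
        self.expect("=")
        self.expression()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected {self.peek()}")

    def expression(self):  # expression → term { (+ | -) term }
        self.term()
        while self.peek() in ("+", "-"):
            self.pos += 1
            self.term()

    def term(self):  # term → factor { (* | /) factor }
        self.factor()
        while self.peek() in ("*", "/"):
            self.pos += 1
            self.factor()

    def factor(self):  # factor → ( expression ) | IDENTIFIER | CONSTANT
        if self.peek() == "(":
            self.pos += 1
            self.expression()
            self.expect(")")
        elif self.peek() in ("IDENTIFIER", "CONSTANT"):
            self.pos += 1
        else:
            raise SyntaxError(f"unexpected {self.peek()}")


def is_assignment(text):
    """Return True if text is a syntactically valid assignment statement."""
    try:
        Parser(tokenize(text)).assignment()
        return True
    except SyntaxError:
        return False


print(tokenize("value = num1 * 20 / (code + counter)"))
print(is_assignment("value = num1 * 20 / (code + counter)"))
```

Placing multiplication and division one level below addition and subtraction is what gives the former their higher precedence in the resulting parse tree.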

Like the other components of the textual language, the loop can be defined very easily. Two types of loops (counter-controlled and unexpected-condition) are allowed in the textual language. A loop statement starts with a word like repeat. The loop contains a condition and a delimited block of statements, which are executed as long as the loop condition is satisfied. The loop may take the following forms:

repeat with variable num from 1 to 100 and the step is 2

...

end of loop or,

repeat the statements 100 times

...

end of loop or,

repeat the statements as long as num < 10

...

end of loop or,

repeat statement while num > 10

...

end of loop

One possible context-free grammar for the counter iteration structure is stated below:

→ VERB REPEAT
→ ARTICLE THE STATEMENTSSYMBOL | STATEMENTSSYMBOL | ε
|
→ CONSTANT TIMESSYMBOL | PREPOSITION WITH IDENTIFIER PREPOSITION TO STEPSYMBOL
→ PREPOSITION FROM | ε
→ ISSYMBOL | ε
→ CONSTANT | IDENTIFIER
→ ANDSYMBOL | ε
→ WHILESYMBOL | ASSYMBOL LONGSYMBOL ASSYMBOL
→ ENDSYMBOL LOOPSYMBOL | PREPOSITION OF
→ VERB TERMINATE LOOPSYMBOL
→ VERB CONTINUE LOOPSYMBOL RELATOPERATOR RELATOPERATOR | ε
→ ANDSYMBOL | ORSYMBOL | ε

In the above grammar, the start symbol derives two nonterminals. The first defines the starting fields of the loop statement: VERB REPEAT, which represents a keyword like repeat, optionally followed by one of two phrases (“the statement” or “statement”). The second generates either a counter-controlled loop or an unexpected-condition loop. The nonterminal for the counter-controlled loop sequentially generates three nonterminals. Two styles of counter loop can be generated. In the first style, the following type of loop statement can be generated:

repeat the statements 50 times

In the second style, the following type of loop statements may be generated:

repeat the statements with counter from 1 to 100 and the step is 10 or,

repeat the statements with number from 1 to 100 step 10 or,

repeat with serial from 1 to 100 step 10

For the first style of the counter loop, the grammar generates CONSTANT followed by TIMESSYMBOL. CONSTANT represents the total number of iterations; the nonterminal that generates CONSTANT or IDENTIFIER can be used instead of CONSTANT to make the statement more general. TIMESSYMBOL is the token of the word “times”.

For the second style of the counter loop, the grammar generates a rule comprising multiple terminals and nonterminals, which are self-explanatory. The unexpected-condition loop is defined by its own nonterminal, which generates four parts in sequence. The first part is either the word “while” or the phrase “as long as”. The second generates relational expressions such as age > 5 or num * grade + 50 < 200. The third generates the sequence of statements that are executed as long as the condition is satisfied. The fourth generates the statement that indicates the end of the loop block; ENDSYMBOL is its first field and represents a word like end, and it is followed by a nonterminal that either generates ARTICLE A, ARTICLE THE or PREPOSITION OF, or reduces to the empty string.

The continue and break statements are also allowed in the textual language. Both statements are used in loops. The continue statement transfers control to the top of the loop block and may take the following forms:

continue the loop or,

continue a loop or,

continue loop

The continue statement is generated as VERB CONTINUE, an optional article, and LOOPSYMBOL. VERB CONTINUE and LOOPSYMBOL represent the words “continue” and “loop” respectively. The break statement immediately terminates the loop, and program control resumes at the next statement following the loop. The break statement in the textual language may take the following forms:

terminate a loop

or,

terminate the loop or,

terminate loop

The break statement is generated as VERB TERMINATE, an optional article, and LOOPSYMBOL. VERB TERMINATE represents the word “terminate”. The other productions in the grammar are simple and self-explanatory. After tokenization, the block of statements “repeat with counter from 1 to 100 and step is 2 num = num + 1 display the value of counter end of loop” becomes “VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDENTIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTICLE THE VALUE OF IDENTIFIER ENDSYMBOL PREPOSITION OF LOOPSYMBOL”, and can be derived as follows:

⇒ VERB REPEAT ⇒ VERB REPEAT ⇒ VERB REPEAT ⇒ VERB REPEAT ⇒ VERB REPEAT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION TO STEPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM PREPOSITION TO STEPSYMBOL 5.3. Syntax and Grammar 111

⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO STEPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CON- STANT STEPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CON- STANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CON- STANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CON- STANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT 5.3. Syntax and Grammar 112

⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPOSITION FROM CONSTANT PREPOSITION TO CON- STANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = + ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = + ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = + ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDENTI- FIER = IDENTIFIER + CONSTANT 5.3. Syntax and Grammar 113

⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDENTI- FIER = IDENTIFIER + CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDENTI- FIER = IDENTIFIER + CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTI- CLE THE ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTI- CLE THE VALU 5.3. Syntax and Grammar 114

⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTI- CLE THE VALUE OF IDENTIFIER ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTI- CLE THE VALUE OF IDENTIFIER ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT ARTICLE THE VALUE OF IDENTIFIER ENDSYMBOL LOOPSYMBOL ⇒ VERB REPEAT PREPOSITION WITH IDENTIFIER PREPO- SITION FROM CONSTANT PREPOSITION TO CONSTANT ANDSYMBOL STEPSYMBOL ISSYMBOL CONSTANT IDEN- TIFIER = IDENTIFIER + CONSTANT VERB OUTPUT AR- TICLE THE VALUE OF IDENTIFIER ENDSYMBOL PREPOSI- TION OF LOOPSYMBOL

The textual language of LPL supports the selection control structure. Each structure begins with a lexical unit that represents the selection or execution of statements, followed by a conditional clause and the sequence of statements to execute as a consequence. The selection control structure may be a single, double or multiple selection structure. In the textual language, the selection control structure may take the following forms:

execute the statements if num > b

num = b

b = b + 1

else

num = 1

end of if

or,

execute statements if num > b

num = b

b = b + 1

else

num = 1

end of if

or,

execute the statements providing that 5 * b > 100

display the value of b

end of if

One possible context-free grammar for the selection control structure is stated below:

→ VERB EXECUTE
→ CONJUNCTION IF | CONJUNCTION PROVIDED
→ THATSYMOL |
| ELSEIFSYMBOL | ELSESYMBOL
→ ENDSYMBOL CONJUNCTION IF
→ PREPOSITION OF | |

To illustrate the above grammar, consider the following code:

execute the statements if valuea > valueb

valuea = valueb

valueb = valueb + 1

else

valuea = 1

end of if

After tokenization, the above segment becomes “VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATIONALOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER = CONSTANT ENDSYMBOL PREPOSITION OF CONJUNCTION IF”, and can be derived as follows:

⇒ VERB EXECUTE
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER

⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER =
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER =
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER =
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER =

⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = +
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = +
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = +
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER +
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER +
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT

⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER =
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER =
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER =

⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER = CONSTANT
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER = CONSTANT
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER = CONSTANT ENDSYMBOL CONJUNCTION IF
⇒ VERB EXECUTE ARTICLE THE STATEMENTSSYMBOL CONJUNCTION IF IDENTIFIER RELATOPERATOR IDENTIFIER IDENTIFIER = IDENTIFIER IDENTIFIER = IDENTIFIER + CONSTANT ELSESYMBOL IDENTIFIER = CONSTANT ENDSYMBOL PREPOSITION OF CONJUNCTION IF

The function closing delimiter is the last component of the main function in the textual language. It is a statement that indicates the end of the function. The closing delimiter starts with a word that indicates the end of the function, possibly followed by supporting lexical units that make the statement more readable. The closing delimiter of the main function may take one of the following forms:

end of the main program

or,

end of the program

or,

end of program

or,

end

One possible context-free grammar for the closing delimiter of the main function is stated below:

→ VERB END

→ PREPOSITION OF PROGRAM SYMBOL |
→ ARTICLE A | ARTICLE THE |
<function name> → FUNCTION NAME |

After tokenization, the statement “end of the main program” becomes “VERB END PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL”, and can be derived as follows:

⇒ VERB END

⇒ VERB END PREPOSITION OF PROGRAM SYMBOL
⇒ VERB END PREPOSITION OF ARTICLE THE PROGRAM SYMBOL
⇒ VERB END PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL
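The closing-delimiter grammar is small enough to be checked by hand. The following is a minimal sketch in Python, assuming that the article and the function name are both optional, as the four forms listed above suggest. The function name is_closing_delimiter is hypothetical and is not part of LPL; the token names are those used in the derivation above.

```python
# Hypothetical helper: recognizes the token sequence of a closing delimiter.
# Token names follow the grammar in the text; everything else is an assumption.
def is_closing_delimiter(tokens):
    """Return True if the token list matches the closing-delimiter grammar."""
    if not tokens or tokens[0] != "VERB END":
        return False
    rest = tokens[1:]
    if not rest:                                   # the bare form "end"
        return True
    if rest[0] != "PREPOSITION OF":
        return False
    rest = rest[1:]
    if rest and rest[0] in ("ARTICLE A", "ARTICLE THE"):   # optional article
        rest = rest[1:]
    if rest and rest[0] == "FUNCTION NAME":                # optional function name
        rest = rest[1:]
    return rest == ["PROGRAM SYMBOL"]              # "program" must close the statement


# "end of the main program"
print(is_closing_delimiter(
    ["VERB END", "PREPOSITION OF", "ARTICLE THE", "FUNCTION NAME", "PROGRAM SYMBOL"]))
```

The sketch accepts the four forms shown earlier and rejects incomplete statements such as “end of”.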

By integrating the context-free grammars defined above, one obtains a complete context-free grammar of the textual language. The following grammar is obtained by this integration:

→ VERB START
→ PREPOSITION OF PROGRAM SYMBOL |
→ ARTICLE A | ARTICLE THE |
→ FUNCTION NAME |
→ |
→ VERB CREATE IDENTIFIER
→ INDEFINITEARTICLE AN INTEGER DATATYPE | FLOAT DATATYPE | STRING DATATYPE | CHARACTER DATATYPE
→ TYPESYMBOL |
→ VARIABLESYMBOL |
→ NAMEDSYMBOL |
→ = CONSTANT | | INPUTSYMBOL PREPOSITION IN IDENTIFIER
→ VERB INPUT |
→ VERB OUTPUT

→ IDENTIFIER | CONSTANT | VALUE
→ PREPOSITION OF IDENTIFIER | CONSTANT
→ IDENTIFIER = + - * /
→ ( ) | IDENTIFIER | CONSTANT
→ VERB REPEAT
→ ARTICLE THE STATEMENTSSYMBOL | STATEMENTSSYMBOL | |
→ CONSTANT TIMESSYMBOL | PREPOSITION WITH IDENTIFIER PREPOSITION TO STEPSYMBOL
→ PREPOSITION FROM |
→ ISSYMBOL |
→ CONSTANT | IDENTIFIER
→ ANDSYMBOL |

→ WHILESYMBOL | ASSYMBOL LONGSYMBOL ASSYMBOL
→ ENDSYMBOL LOOPSYMBOL | PREPOSITION OF
→ VERB TERMINATE LOOPSYMBOL
→ VERB CONTINUE LOOPSYMBOL RELATOPERATOR RELATOPERATOR |
→ ANDSYMBOL | ORSYMBOL
→ VERB EXECUTE
→ CONJUNCTION IF | CONJUNCTION PROVIDED
→ THATSYMOL | | ELSEIFSYMBOL | ELSESYMBOL
→ ENDSYMBOL CONJUNCTION IF
→ PREPOSITION OF |
→ VERB END


5.4 Semantics

A syntactically valid sentence is not always meaningful in a semantic sense. For instance, the sentence “colourless green ideas sleep furiously” is syntactically valid but semantically senseless [91]. Semantics is the study of meaning. The definition of semantics is a logical and strenuous part of the language definition. Semantics defines the association between the structure of a programming language phrase (a set of words) and what the phrase actually implies. These phrases have no intrinsic meaning; their meaning is resolved purely in the context of a system for interpreting and elucidating their structure [459]. As an example, consider Figure 5.6.

Figure 5.6: Expression tree

There are many ways to interpret this tree. Assume that the nodes labeled 1, 10 and 11 are interpreted as ordinary decimal numbers and the nodes labeled + and * as the sum and the product of the values of their children. Then the root node of the tree represents (1 * 10) + 11 = 21. If * represents exponentiation rather than multiplication, the tree represents 1^10 + 11 = 12. If the numbers are read in binary notation, the tree represents (in the decimal number system) (1 * 2) + 3 = 5. In actuality, the tree by itself does not signify an evaluation at all; it only specifies inherent properties such as its shape, height and number of nodes.

The syntax of the textual language is described through a context-free grammar; however, the grammar may generate numerous statements and programs that are syntactically correct yet semantically wrong. As an example, consider the following program in the textual language.

start of the main program

create an integer typed variable named age

take input in grade

display the value of rollnumber

end of program

This program is syntactically correct, yet it has several problems. The third line of the program takes input into a variable grade, but grade has not been declared in the program. Similarly, the fourth line displays the value of rollnumber, but no variable of that name is declared in the program. Although both statements are syntactically valid, they violate the rules of meaning of the textual language. The semantics of a language describes how each grammatically correct sentence is to be interpreted. The semantics of the textual language defines the association between the abstract structure of a program and its meaning. The features of the textual language are very basic, and the possible semantics-related issues are few, yet the description and verification of its semantics is a considerable challenge. The following are the major semantic issues of the textual language.

1. All variables must be declared before they are referenced. The use of an undefined variable is a semantic violation.

2. A value or variable must not be assigned to an incompatible variable. For instance, it is illegal to assign a floating-point value (operand) to an integer type variable. Similarly, a string operand cannot be assigned to a character type variable.

3. The loop continuation and termination statements can only be used in a loop; the use of these statements in a non-iterative structure is a semantic violation.

4. Arithmetic operators can only be used with numeric operands. The use of arithmetic operators with a string operand is a semantic violation.

5. A subscript operator must only be used with array operands.
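The first rule can be checked with a single pass over the token stream. The following is a minimal sketch in Python, assuming that the lexical analyzer delivers (token, lexeme) pairs and that the first IDENTIFIER after VERB CREATE is the variable being declared. The function name undeclared_uses and the lexeme-to-token pairing in the example are illustrative assumptions, not part of LPL.

```python
# Hypothetical check for rule 1: every variable is declared before it is referenced.
def undeclared_uses(tokens):
    """Return the lexemes of identifiers that are used before being declared.

    tokens is a list of (token, lexeme) pairs in program order.
    """
    declared = set()
    errors = []
    declaring = False                  # True between VERB CREATE and its IDENTIFIER
    for token, lexeme in tokens:
        if token == "VERB CREATE":
            declaring = True
        elif token == "IDENTIFIER":
            if declaring:
                declared.add(lexeme)   # this occurrence is the declaration itself
                declaring = False
            elif lexeme not in declared:
                errors.append(lexeme)  # a reference to an undeclared variable
    return errors


# Tokens of the faulty program discussed above (token pairing is assumed).
program = [
    ("VERB CREATE", "create"), ("INTEGER DATATYPE", "integer"),
    ("IDENTIFIER", "age"),
    ("VERB INPUT", "take"), ("IDENTIFIER", "grade"),
    ("VERB OUTPUT", "display"), ("IDENTIFIER", "rollnumber"),
]
print(undeclared_uses(program))   # ['grade', 'rollnumber']
```

A full semantic analyzer would normally perform this check against the symbol table while walking the parse tree; the flat pass is used here only to keep the sketch short.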

There are many techniques and systems for defining the semantics of programming languages, yet none of them is as formal as the systems that exist for defining the syntax of programming languages. Among these techniques, attribute grammar is one formal way of defining the semantics of the textual language.

5.4.1 Attribute grammar

An attribute grammar is an extension of context-free grammar that is used to describe the semantics of a programming language [416].

Although context-free grammar is a powerful meta-system for describing the structure of programming languages, some aspects of that structure, such as the type compatibility rules, are hard or impossible to express with it. For example, it is quite difficult to define context-free grammar rules that restrict the assignment of a floating-point value to an integer type variable. Similarly, it is not possible to define rules in a context-free grammar stating that all variables must be declared before they are used. The language rules that are difficult or impossible to express with context-free grammar and that deal with the legal forms of programs are called static semantic rules. They are called static because the analysis required to verify them can be accomplished at compile time.

Attribute grammar is a system devised by Knuth [237] to describe the syntax and the static semantics of programs. It is a sound mechanism both for defining and for verifying the correctness of the static semantic rules of a program.

Formally, an attribute grammar is a context-free grammar augmented with attributes [141], attribute computation functions and predicate functions [416]. Attributes are associated with the nonterminal and terminal symbols of the grammar.

Attributes have values and can therefore be viewed as variables. Attribute computation functions (semantic functions) are associated with grammar rules and describe how the values of attributes are computed. Predicate functions are also associated with grammar rules and state the semantic rules of the language. With each grammar symbol X there is a set of attributes A(X). The set A(X) consists of two disjoint sets, S(X) and I(X), which represent the synthesized and inherited attributes, respectively. An attribute that passes semantic information up a parse tree is called a synthesized attribute, whereas inherited attributes pass semantic information down and across it.

Apart from their wide application in the definition of programming languages, attribute grammars and their extensions are used extensively in other areas, including query processing [239], natural languages [458], character recognition [504], iterative type inferencing [317] and image parsing [172].

With an attribute grammar, the semantics of the textual language is described by identifying its attributes and defining the semantic rules that state how the processing of these attributes is associated with the grammar rules of the language. Suppose X is a grammar symbol and a is an attribute associated with X; then X.a denotes the value of a for X. Since the same symbol may appear several times in a production, each occurrence is subscripted so that the attribute value of each occurrence can be identified.

5.4.2 Definition of semantics

In order to define the static semantics of the textual language, it is worthwhile to develop attribute grammars for all of its relevant constructs. Consider the simple context-free grammar (discussed in Section 5.3.2) for the declaration of variables in the textual programming language.

→ VERB CREATE IDENTIFIER
→ INDEFINITEARTICLE AN INTEGER DATATYPE | FLOAT DATATYPE | STRING DATATYPE | CHARACTER DATATYPE
→ TYPESYMBOL |
→ VARIABLESYMBOL |
→ NAMEDSYMBOL |
→ = CONSTANT |

The token IDENTIFIER in the above context-free grammar represents a variable name. A variable may appear many times in a program, and to ensure that it is used meaningfully its type must be known, because the correctness of several operations depends on the type of the variable. For instance, in many programming languages, bitwise and shift operations can only be performed on integer operands. The token IDENTIFIER merely represents the variable name and provides no information about its underlying type. It is therefore necessary to define a data type attribute for the variable specified by IDENTIFIER and to construct the equation that relates this attribute to the type given in the declaration.

This can be accomplished by defining an attribute grammar and designating an attribute (here called dtype). The possible values of dtype are integer, float, string and character, which correspond to the tokens INTEGER DATATYPE, FLOAT DATATYPE, STRING DATATYPE and CHARACTER DATATYPE. Here the attribute grammar contains a single attribute; in general, however, it may contain several mutually dependent attributes. The nonterminal for the data type has a dtype determined by the token it derives, and IDENTIFIER receives the same dtype through the equation associated with the declaration rule.

Grammar Rule: → VERB CREATE IDENTIFIER
Semantic Rules: IDENTIFIER.dtype = .dtype

Grammar Rule: → INDEFINITEARTICLE AN INTEGER DATATYPE
Semantic Rule: .dtype = integer

Grammar Rule: → FLOAT DATATYPE
Semantic Rule: .dtype = float

Grammar Rule: → STRING DATATYPE
Semantic Rule: .dtype = string

Grammar Rule: → CHARACTER DATATYPE
Semantic Rule: .dtype = character
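The effect of these rules can be sketched in a few lines of Python. The sketch assumes that a declaration arrives as a list of (token, lexeme) pairs and that the computed dtype is recorded in a symbol table represented as a dictionary. The names DTYPE_OF_TOKEN and declare are hypothetical and chosen only for illustration.

```python
# Values of the dtype attribute for each data type token (from the rules above).
DTYPE_OF_TOKEN = {
    "INTEGER DATATYPE": "integer",
    "FLOAT DATATYPE": "float",
    "STRING DATATYPE": "string",
    "CHARACTER DATATYPE": "character",
}


def declare(tokens, symbol_table):
    """Attach dtype to the declared IDENTIFIER and record it in symbol_table.

    tokens is a list of (token, lexeme) pairs for one declaration statement.
    """
    dtype = None
    for token, lexeme in tokens:
        if token in DTYPE_OF_TOKEN:
            dtype = DTYPE_OF_TOKEN[token]      # dtype of the data type symbol
        elif token == "IDENTIFIER":
            if dtype is None:
                raise ValueError("declaration has no data type")
            symbol_table[lexeme] = dtype       # IDENTIFIER.dtype = that dtype
            return dtype
    raise ValueError("declaration has no identifier")


# "create float type variable fee"
table = {}
declare([("VERB CREATE", "create"), ("FLOAT DATATYPE", "float"),
         ("TYPESYMBOL", "type"), ("VARIABLESYMBOL", "variable"),
         ("IDENTIFIER", "fee")], table)
print(table)   # {'fee': 'float'}
```

Storing the attribute in a symbol table is one common design choice; it lets later rules retrieve the type of a variable by name.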

Consider a parse tree for the string “create float type variable fee” showing the dtype attribute as defined by the above attribute grammar. After tokenization, the string becomes “VERB CREATE FLOAT DATATYPE TYPESYMBOL VARIABLESYMBOL IDENTIFIER”, and the parse tree (Figure 5.7) shows the generated tokens rather than the actual lexemes.

Figure 5.7: Parse tree of “create float type variable fee”

The type of the variable is now associated with IDENTIFIER through dtype, and it is therefore possible to define further semantic rules and computations. Consider the following context-free grammar for simple arithmetic expressions (discussed in Section 5.3.2):

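<assignment> → IDENTIFIER = <expression>
<expression> → <expression> + <theterm> | <expression> - <theterm> | <theterm>
<theterm> → <theterm> * <thefactor> | <theterm> / <thefactor> | <thefactor>
<thefactor> → ( <expression> )
<thefactor> → IDENTIFIER
<thefactor> → CONSTANT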

This grammar defines the assignment operation and basic arithmetic expressions. The grammar is syntactically valid, yet it can derive semantically invalid expressions. For instance:

1. Undeclared variables may be used in the expressions.

2. Incompatible variables or expressions may be assigned to a variable. For example, a floating-point operand may be assigned to an integer type variable, or a string type operand to a character type variable.

3. Several types of variables are impermissible in arithmetic expressions, but can be derived from the arithmetic expression grammar. For instance, the string and character type variables and constants are not allowed in arithmetic expressions.

To describe the semantics of the assignment and arithmetic expressions of the textual language, an attribute grammar is defined. The attributes of its symbols are stated below:

• dtype: A synthesized attribute associated with the tokens CONSTANT and IDENTIFIER. It stores the data type of the constant or variable. The actual type of CONSTANT and IDENTIFIER is intrinsic.

• atype: A synthesized attribute associated with the nonterminals <expression>, <theterm> and <thefactor>. It stores the actual type, integer or float, of the nonterminal, which is computed from the actual type of its child node (or child nodes).

• etype: An inherited attribute associated with the nonterminal <expression>. It stores the type, either integer or float, that is expected for the expression, as determined by the type (dtype) of the variable (IDENTIFIER) on the left side of the assignment statement.

• isvalid: An attribute associated with the nonterminal <assignment>. It stores a truth value: true if a valid expression is assigned to the variable on the left side of the assignment statement, and false otherwise.

Two additional functions, lookup and gettype, are used to extract the data type (dtype) of variables and constants, respectively.

Grammar Rule: <assignment> → IDENTIFIER = <expression>
Semantic Rules: expression.etype = IDENTIFIER.dtype
assignment.isvalid =

if (expression.etype = expression.atype), then

true

else

false

end if

Grammar Rule: <expression1> → <expression2> + <theterm>
Semantic Rules: expression1.atype =

if (expression2.atype = string or expression2.atype = character or theterm.atype = string or theterm.atype = character), then
    error
else if (expression2.atype = float and theterm.atype = float), then
    float
else if (expression2.atype = float and theterm.atype = integer), then
    float
else if (expression2.atype = integer and theterm.atype = float), then
    float
else if (expression2.atype = integer and theterm.atype = integer), then
    integer
else
    error
end if

Grammar Rule: <expression1> → <expression2> - <theterm>
Semantic Rules: expression1.atype =

if (expression2.atype = string or expression2.atype = character or theterm.atype = string or theterm.atype = character), then
    error
else if (expression2.atype = float and theterm.atype = float), then
    float
else if (expression2.atype = float and theterm.atype = integer), then
    float
else if (expression2.atype = integer and theterm.atype = float), then
    float
else if (expression2.atype = integer and theterm.atype = integer), then
    integer
else
    error
end if

Grammar Rule: <expression> → <theterm>
Semantic Rules: expression.atype = theterm.atype

Grammar Rule: <theterm1> → <theterm2> * <thefactor>
Semantic Rules: theterm1.atype =

if (theterm2.atype = string or theterm2.atype = character or thefactor.atype = string or thefactor.atype = character), then
    error
else if (theterm2.atype = float and thefactor.atype = float), then
    float
else if (theterm2.atype = float and thefactor.atype = integer), then
    float
else if (theterm2.atype = integer and thefactor.atype = float), then
    float
else if (theterm2.atype = integer and thefactor.atype = integer), then
    integer
else
    error
end if

Grammar Rule: <theterm1> → <theterm2> / <thefactor>
Semantic Rules: theterm1.atype =

if (theterm2.atype = string or theterm2.atype = character or thefactor.atype = string or thefactor.atype = character), then
    error
else if (theterm2.atype = float and thefactor.atype = float), then
    float
else if (theterm2.atype = float and thefactor.atype = integer), then
    float
else if (theterm2.atype = integer and thefactor.atype = float), then
    float
else if (theterm2.atype = integer and thefactor.atype = integer), then
    integer
else
    error
end if

Grammar Rule: <theterm> → <thefactor>
Semantic Rules: theterm.atype = thefactor.atype

Grammar Rule: <thefactor> → ( <expression> )
Semantic Rules: thefactor.atype = expression.atype

Grammar Rule: <thefactor> → IDENTIFIER
Semantic Rules: IDENTIFIER.dtype = lookup(IDENTIFIER)
thefactor.atype = IDENTIFIER.dtype

Grammar Rule: <thefactor> → CONSTANT
Semantic Rules: CONSTANT.dtype = gettype(CONSTANT)
thefactor.atype = CONSTANT.dtype

The mechanism used to define the semantics of the arithmetic expression can be followed to describe the static semantics of the other elements of a textual language.
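To make the attribute evaluation concrete, the following is a minimal sketch in Python. It assumes that an expression is represented as nested tuples such as ("+", left, right), ("id", name) and ("const", value), and that the symbol table is a dictionary from variable names to their dtype. The tuple representation and the function signatures are assumptions made for illustration; only the attribute names atype, etype and isvalid and the function name gettype come from the text, and the dictionary access plays the role of lookup.

```python
NUMERIC = ("integer", "float")


def gettype(value):
    """Data type of a constant (a stand-in for the gettype function in the text)."""
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "character" if len(value) == 1 else "string"
    raise TypeError("unsupported constant")


def atype(node, symbols):
    """Synthesized attribute: actual type of an expression, or 'error'."""
    kind = node[0]
    if kind == "const":
        return gettype(node[1])
    if kind == "id":
        return symbols[node[1]]        # lookup; KeyError means an undeclared variable
    if kind in ("+", "-", "*", "/"):
        left = atype(node[1], symbols)
        right = atype(node[2], symbols)
        if left not in NUMERIC or right not in NUMERIC:
            return "error"             # string or character operand
        return "float" if "float" in (left, right) else "integer"
    raise ValueError("unknown node kind")


def isvalid(target, expression, symbols):
    """Attribute of the assignment: expected type equals actual type."""
    etype = symbols[target]            # inherited from the variable on the left side
    return etype == atype(expression, symbols)


symbols = {"total": "integer", "rate": "float", "name": "string"}
print(isvalid("total", ("+", ("id", "total"), ("const", 1)), symbols))   # True
print(isvalid("total", ("*", ("id", "rate"), ("const", 2)), symbols))    # False
```

The first assignment is accepted because both operands are integers; the second is rejected because a float expression is assigned to an integer variable.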

5.5 Design of Textual Language Translator

LPL’s textual programming language aims to help beginners understand introductory programming courses. To achieve this objective, it enables beginners to write basic programs in a simple and understandable language, and these programs are then translated into high-level programming languages such as C and C++. The textual language is called the source language, and a high-level language such as C is called the target language. Translating programs into the target language(s) requires a particular class of translator. Terry [451] defines a translator as a function whose domain is a source language and whose range is contained in a target language (see Figure 5.8).

Figure 5.8: Translator [451]

There are many types of translators, but owing to its nature and primitive features, the translator of LPL’s textual language is quite close to a compiler, and therefore the compiler is selected as the base translator of the textual language.

A compiler is a program that translates a program in the source language into an equivalent program in the target language [5]. The compiler also reports the errors detected in the source program during the translation process. Classical compilers, ahead-of-time compilers, just-in-time compilers, cross compilers, incremental compilers, binary compilers and graph compilers are the main types of compilers [406].

The compiler of the textual language is modelled on the conventional compiler; however, its structure and functions are considerably simpler. Its aim is to read a source program, analyze it and convert it into high-level programming language(s). Figure 5.9 shows the functional view of the textual language compiler.

Figure 5.9: Functional view of textual translator

These duties keep the textual language compiler very close to a specific class of compiler called the transcompiler, which typically deals with this kind of translation. A transcompiler (also called a source-to-source compiler, high-level translator or transpiler) is a translator that receives source code in one language and generates source code in another language. Several notable transcompilers have been developed. For instance, MIX10 is a source-to-source compiler [250, 251] that converts MATLAB [205, 233, 381] programs to X10 [86, 294, 320]. A Polyglot extension is a transcompiler that takes a program written in a language extension and converts it to Java source code [340]. Other popular source-to-source compilers are MATISSE [54], Emscripten [499] and the DMS Software Reengineering Toolkit [38].

The compiler of the textual language comprises five main components: the lexical analyzer, syntax analyzer, semantic analyzer, intermediate code generator and code generator. These five phases are the fundamental constituents of conventional compilers. The generic structure of the textual language compiler is shown in Figure 5.10. In principle, these five phases are sufficient for the textual language compiler to fulfil its primary responsibilities. However, in practice some of these phases may be skipped or merged. The symbol table and the error handler are also important components of the textual language compiler.

Figure 5.10: Block diagram of LPL’s compiler

5.5.1 Lexical analyzer

The lexical analyzer is the first phase of the textual language compiler. Conventionally it is also called a scanner or tokenizer [191]. The lexical analyzer is usually a separate component that communicates with the other phases and components of the compiler through global variables and subroutines. This phase receives the source program and translates it into a representation that is more convenient for the compiler.

The lexical analyzer reads the sequence of characters in the program and groups them into lexemes [190]. It also associates a token with each lexeme. A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token. Lexemes are the lowest-level syntactic elements of the programming language. A token is a syntactic category that represents a lexeme.

The lexical analyzer of the textual language reads the stream of characters and generates a stream of words. It combines characters to form words and applies rules to determine whether each word is valid in the textual language. If the word is legal, the lexical analyzer assigns it a syntactic category (part of speech) and passes the lexeme and token to the syntax analyzer of the compiler. In essence, during lexical analysis all the lexical units of the textual language that were defined in the specification of the lexical syntax are recognized and tokenized. The lexical analyzer is also responsible for removing white space and comments. It is the only phase of the textual language compiler that scans every character of the input program.

The first step in the definition of the lexical analyzer is to identify all the lexical units of the textual language and define their syntactic categories. The generic lexical units of the textual language, including identifiers, keywords, operators and literals, are defined in Section 5.2.2. The selection of a token set depends on the structure of the textual language and the design strategy followed by the compiler designer, and it strongly affects the structure of the complete compiler. For example, >, >=, <, and <= can be treated as a single comparison-operator token or as four separate tokens.
The former is usually better since it simplifies the code generation process, although there is no firm rule as to which is better. Normally, arithmetic operators with the same associativity and precedence can be combined. Each keyword of the textual language has a distinct lexical structure and meaning, and therefore a unique token is defined for each keyword. All identifiers (age, grade, fee) have a similar lexical structure, so the same syntactic category is assigned to every identifier. Function names and variable names are mostly placed in the same category, and it is the choice of the compiler designer either to categorize them in the same class or to treat them differently. The former approach is easy to implement but requires extra work at the semantic level, whereas the latter approach increases the size of the compiler and requires extra housekeeping at the lexical level, but somewhat lessens the complexity of the semantic analyzer. In a similar vein, all numeric literals (5, 5.3) can either be tokenized as a single class of tokens or a separate syntactic category can be defined for each kind of literal.

The textual language provides essential programming features with simple and understandable statements, and therefore it includes a large number of keywords. Table 5.1 provides a concise list of the tokens of the textual language.

Table 5.1: Tokens in a textual language

Token                 Lexemes
IDENTIFIER            age, fee, grade, serialnumber
RELATIONALOPERATOR    >, >=, <, <=
CONSTANT              53, 200, 600.84
VERB START            start
VERB CREATE           create
CONJUNCTION IF        if
ELSESYMBOL            else
INTEGER DATATYPE      integer
FLOAT DATATYPE        float
STRING DATATYPE       string
VERB OUTPUT           print, display
VERB REPEAT           repeat, iterate

In the textual language, a token may have a one-to-one association with a lexeme. For instance, the keyword array is associated with a single token. However, a general token such as a number or an identifier may be related to many lexemes. Consider the following program segment:

start of the main program

create an integer type variable named age = 30

display the value of age

end of the main program

The series of lexemes and tokens obtained after the lexical analysis of the above program are shown in Table 5.2.

Table 5.2: Tokens generated by the lexical analyzer

Lexeme | Token
start | VERB START
of | PREPOSITION OF
the | ARTICLE THE
main | FUNCTION NAME
program | PROGRAM SYMBOL
create | VERB CREATE
an | INDEFINITEARTICLE AN
integer | INTEGER DATATYPE
type | TYPESYMBOL
variable | VARIABLESYMBOL
named | NAMEDSYMBOL
age | IDENTIFIER
= | ASSIGNMENTOPERATOR
30 | CONSTANT
display | VERB OUTPUT
the | ARTICLE THE
value | VALUE
of | OF
age | IDENTIFIER
end | VERB END
of | PREPOSITION OF
the | ARTICLE THE
main | FUNCTION NAME
program | PROGRAM SYMBOL
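The mapping from lexemes to tokens in Table 5.2 can be sketched as a small keyword-driven scanner. The following Python fragment is only an illustration, not the translator's actual implementation: the underscored token names, the KEYWORDS table and the tokenize function are assumptions introduced here. For simplicity it maps every occurrence of "of" to a single token and silently skips characters it does not recognize.

```python
import re

# Hypothetical keyword table (lexeme -> token), modelled on Table 5.2.
# The underscored token names are an assumption made for readability.
KEYWORDS = {
    "start": "VERB_START", "end": "VERB_END", "create": "VERB_CREATE",
    "display": "VERB_OUTPUT", "print": "VERB_OUTPUT",
    "of": "PREPOSITION_OF", "the": "ARTICLE_THE", "an": "INDEFINITEARTICLE_AN",
    "main": "FUNCTION_NAME", "program": "PROGRAM_SYMBOL",
    "integer": "INTEGER_DATATYPE", "type": "TYPESYMBOL",
    "variable": "VARIABLESYMBOL", "named": "NAMEDSYMBOL", "value": "VALUE",
}

# A lexeme is a number, a word, or the assignment operator.
LEXEME = re.compile(r"\d+(?:\.\d+)?|[A-Za-z][A-Za-z0-9]*|=")


def tokenize(source):
    """Return a list of (lexeme, token) pairs for the given source text."""
    tokens = []
    for lexeme in LEXEME.findall(source):
        if lexeme == "=":
            token = "ASSIGNMENTOPERATOR"
        elif lexeme[0].isdigit():
            token = "CONSTANT"
        else:
            # Any word that is not a keyword is treated as an identifier.
            token = KEYWORDS.get(lexeme, "IDENTIFIER")
        tokens.append((lexeme, token))
    return tokens


for pair in tokenize("create an integer type variable named age = 30"):
    print(pair)
```

Keeping the keywords in a table means that adding a new keyword to the language only requires a new table entry, while all identifiers share one token class.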

In [4], Aho et al. describe three approaches for the implementation of the lexical analyzer:

1. Develop the lexical analyzer with a lexical-analyzer generator, which generates the lexical analyzer from the lexical specification.
2. Use the I/O facilities of a conventional programming language to write the lexical analyzer.
3. Develop the lexical analyzer in assembly language and explicitly handle the reading of input.

The structure and function of the lexical analyzer are quite simple compared with the other phases of conventional compilers, yet a significant level of knowledge and experience is required for its realization. Similarly, the overall anatomy of the lexical analyzer of the textual language is fairly straightforward, though the methods involved are highly coupled. The detail included in this chapter is useful both for the construction of a handcrafted lexical analyzer and for the use of lexical analyzer generators.

Lexical analyzer generators are programs that generate lexical analyzers. Numerous lexical analyzer generators are available that can be used for the rapid development of the lexical analyzer. Lex is one such program. It was developed by Mike Lesk and Eric Schmidt at AT&T Bell Laboratories in the 1970s [266]. Lex takes a set of descriptions of possible tokens and generates C code (a lexical analyzer) that can recognize those tokens. The set of descriptions passed to Lex is called a lex specification and is written in the form of regular expressions. Lex does not generate an executable program; instead, it converts the lex specification into a file of C code containing the routine "yylex()".

Flex is a rewrite of Lex [266]. Flex (fast lexical analyzer generator) is a freely accessible version of lex and is compatible with it. The version of lex developed in ratfor [227] was translated into C by Vern Paxson and named Flex [265]. Flex is more reliable, faster and more portable than lex. The overall functioning of Flex is very similar to that of Lex. JFlex is a tool that generates lexical analyzers for Java [343], and it is itself written in Java. There are many other tools, such as FsLex, Quex, DFASTART, C# Lex, Alex, Ragel, GPLEX, re2c and lexertl, that can be used for generating the lexical analyzer of the textual language.

For the recognition of regular sentences (lexical units), a finite automaton is necessary and sufficient [490]. A finite automaton [296] is a 5-tuple (Q, Σ, q0, F, δ), where:

Q is a non-empty finite set of states.
Σ is a non-empty finite input alphabet.
q0 ∈ Q is the initial state (start state).
F ⊆ Q is the set of accepting states.
δ: Q × Σ → Q is the transition function (mapping function or total function).

A finite automaton is an accepted formal mechanism for recognizing the language described by a regular expression. The five elements of a finite automaton are represented graphically by a transition graph. A transition graph is a directed graph with labeled edges that illustrates the finite automaton in a simple and understandable way.

During the definition of the textual language, the lexical units are described with regular expressions, and during lexical analysis these lexical units are recognized with finite automata. The recognition requires the construction of finite automata from the regular expressions. There are several methods [85, 201, 207, 414, 496] for converting regular expressions into their equivalent finite automata. Consider the following regular expression for the integer literal of LPL:

integerliteral → ( + | - | ε ) digit digit∗
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The integer literal generated from the above regular expression can be recog- nized by the following finite automaton.

(Q, Σ, q0 , F, δ) , where:

Q = {q0, q1, q2}
Σ = {+, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
q0 is the initial state
F = {q2}

The transitions are defined in the form of functions:

δ(q0,+) = {q1} δ(q1,+) = {} δ(q2,+) = {}

δ(q0,-) = {q1} δ(q1,-) = {} δ(q2,-) = {}

δ(q0,0) = {q2} δ(q1,0) = {q2} δ(q2,0) = {q2}

δ(q0,1) = {q2} δ(q1,1) = {q2} δ(q2,1) = {q2}

δ(q0,2) = {q2} δ(q1,2) = {q2} δ(q2,2) = {q2}

δ(q0,3) = {q2} δ(q1,3) = {q2} δ(q2,3) = {q2}

δ(q0,4) = {q2} δ(q1,4) = {q2} δ(q2,4) = {q2}

δ(q0,5) = {q2} δ(q1,5) = {q2} δ(q2,5) = {q2}

δ(q0,6) = {q2} δ(q1,6) = {q2} δ(q2,6) = {q2}

δ(q0,7) = {q2} δ(q1,7) = {q2} δ(q2,7) = {q2}

δ(q0,8) = {q2} δ(q1,8) = {q2} δ(q2,8) = {q2}

δ(q0,9) = {q2} δ(q1,9) = {q2} δ(q2,9) = {q2}

The above finite automaton can be concisely represented by a transition graph (see Figure 5.11).
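The automaton defined above translates almost directly into a table-driven recognizer. The sketch below is illustrative only: the dictionary DELTA encodes the listed transitions, with missing entries standing for the empty sets, and the names DELTA, START, ACCEPTING and accepts are assumptions chosen for this example.

```python
DIGITS = "0123456789"

# Transition function as a dictionary keyed by (state, symbol).
# Entries that are absent correspond to the empty transitions, e.g. delta(q1, +).
DELTA = {("q0", "+"): "q1", ("q0", "-"): "q1"}
for d in DIGITS:
    DELTA[("q0", d)] = "q2"
    DELTA[("q1", d)] = "q2"
    DELTA[("q2", d)] = "q2"

START = "q0"
ACCEPTING = {"q2"}


def accepts(text):
    """Return True if text is an integer literal according to the automaton."""
    state = START
    for ch in text:
        state = DELTA.get((state, ch))
        if state is None:          # no transition defined: reject
            return False
    return state in ACCEPTING


for sample in ["30", "-7", "+", "5.3"]:
    print(sample, accepts(sample))
```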

The finite automaton in Figure 5.11 can recognize the integer literals defined by its regular expression. Once the finite automaton has recognized an integer literal, it can generate the corresponding token. As an illustration, consider the revised finite automaton in Figure 5.12, which recognizes an integer literal and generates the corresponding token.

Figure 5.11: Finite automaton for integer literals

Figure 5.12: Revised finite automaton for integer literals

The revised finite automaton recognizes an integer literal by entering the final state (q2). Once the integer literal has been completely scanned and any other character is encountered, the finite automaton leaves the final state, generates the token "CONSTANT" and finally switches back to the initial state. Here a single token "CONSTANT" is designated for every type of literal. It is possible, and sometimes more useful, to define a separate token for every class of literal: for instance, INTEGER CONSTANT for integer literals, FLOAT CONSTANT for float-value literals, STRING CONSTANT for string literals, and so on. The finite automaton developed for integer literals can easily be augmented for float-value literals. Figure 5.13 shows the finite automaton that can recognize integer literals as well as float-value literals.

Figure 5.13: Finite automaton for numeric literal

Nondeterministic finite automata and deterministic finite automata are two flavors of finite automata [5]. In a deterministic finite automaton, any state and input character has at most one transition state, whereas in a nondeterministic finite automaton there may be more than one state reachable from a particular state on the same input character [457]. Both deterministic and nondeterministic finite automata are capable of recognizing the lexical units of the textual language. Although the deterministic finite automaton is a special case of the nondeterministic finite automaton, both have the same recognition power, and any nondeterministic finite automaton can be converted into an equivalent deterministic finite automaton.

After defining the finite automaton for every lexical unit of the textual language, the next step is the integration of the automata into a single large automaton. The integration is not mandatory, and it is possible to implement the lexical analyzer without the integrated finite automaton.

The transition diagrams of finite automata can easily be converted into a computer program in which each state of the transition diagram gets a segment of code. If there are edges leaving a state, then its code reads an input character and decides which edge to follow. A function is required to read the next character from the buffer. If there is an edge labeled with the character read, then control switches to the code for the state pointed to by that edge.
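The idea that each state gets its own segment of code can be sketched as follows for numeric literals. Because Figure 5.13 is not reproduced here, the float form assumed below (digits, a decimal point, then at least one digit) and the state names q3 and q4 are assumptions; the token names INTEGER_CONSTANT and FLOAT_CONSTANT follow the separate-token option mentioned earlier. The function returns the longest literal starting at a given position.

```python
DIGITS = set("0123456789")


def scan_number(text, pos=0):
    """Return (token, lexeme, next_pos) for the numeric literal at pos, or None."""
    state, i = "q0", pos
    accepted = None                      # (token, end) of the longest match so far
    while True:
        ch = text[i] if i < len(text) else ""   # "" marks the end of input
        if state == "q0":                # start: optional sign or first digit
            if ch in ("+", "-"):
                state = "q1"
            elif ch in DIGITS:
                state = "q2"
            else:
                break
        elif state == "q1":              # after a sign: a digit is required
            if ch in DIGITS:
                state = "q2"
            else:
                break
        elif state == "q2":              # integer part (accepting)
            if ch == ".":
                state = "q3"
            elif ch not in DIGITS:
                break
        elif state == "q3":              # after the point: a digit is required
            if ch in DIGITS:
                state = "q4"
            else:
                break
        else:                            # q4: fraction digits (accepting)
            if ch not in DIGITS:
                break
        i += 1                           # the character was consumed
        if state == "q2":
            accepted = ("INTEGER_CONSTANT", i)
        elif state == "q4":
            accepted = ("FLOAT_CONSTANT", i)
    if accepted is None:
        return None
    token, end = accepted
    return token, text[pos:end], end


print(scan_number("30 "))
print(scan_number("600.84"))
```

Remembering the last accepting position lets the scanner fall back to the integer part when a decimal point is not followed by a digit.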

5.5.2 Syntax analyzer

The syntax analyzer (also called the parser [32]) is the second phase of the textual language translator. Conventionally, this phase receives the tokens from the lexical analyzer and determines whether they are grammatically correct by verifying that they can be generated by the grammar of the language [5, 423]. Principally, the syntax analyzer determines how the tokens should be structured according to the syntax rules of the language [476]. If the syntax analyzer verifies that the input program is valid, it generates a model of the program (usually in the form of a parse tree) for subsequent use in the compilation process [100]. If the input program is not valid, the syntax analyzer reports the problem and proper diagnostic information to the user.

Normally, the string of tokens is not an explicit parameter; instead, the syntax analyzer calls the lexical analyzer to get the next token. The explicit construction of a parse tree is not necessary, as the verification and translation operations can be interleaved with syntax analysis [5].

The syntax of the textual language is defined by a context-free grammar, and verification of the syntax is performed by the syntax analyzer. Like a conventional syntax analyzer, the syntax analyzer of the textual language determines whether the input program (in the form of tokens) is grammatically correct. The syntax analyzer should report syntax errors in a comprehensible manner and continue processing.

There are many parsing techniques that can be used in the syntax analyzer of the textual language. Most parsing methods fall into two classes: top-down and bottom-up methods [4, 5, 281, 423]. Top-down parsing attempts to find a leftmost derivation for an input sentence; it constructs the parse tree for the input from the root and generates the nodes of the tree in preorder. Backtracking parsers and predictive parsers are two forms of top-down parsers. A predictive parser uses one or more lookahead tokens to predict the next construct in the input sentence, whereas a backtracking parser tries different alternatives to parse the input and backs up if one possibility fails.

Recursive-descent parsing and LL(1) parsing are two classes of top-down parsing. Recursive-descent parsing is quite flexible and is the most appropriate method for a handwritten parser. In this method, a set of recursive procedures is used to parse the input. One procedure is associated with each variable (nonterminal) of the grammar. For instance, the grammar rule for a nonterminal B is viewed as a description of a procedure that recognizes B, and the right-hand side of the rule for B describes the structure of the code for this procedure. Generally, recursive-descent parsing requires the computation of lookahead sets. These sets are called First and Follow sets [281]. The context-free grammar is translated into EBNF if recursive-descent parsing is to be used.

Recursive descent cannot handle a left-recursive grammar, because such a grammar can cause a recursive-descent parser to enter an infinite loop. A grammar is said to be left recursive if it has a variable A for which there is a derivation A ⇒+ Aα for some string α [5]. Left recursion must therefore be removed before the grammar is used in a recursive-descent parser. The removal of immediate left recursion is quite simple and straightforward. Consider the following productions:

A → Aα | β

The left recursion in the above productions can be eliminated by replacing them with the following non-left-recursive productions:

A → βA’

A’ → αA’ | ε
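The transformation above can be expressed as a short routine. The following sketch handles only immediate left recursion of a single nonterminal, under the assumption that productions are represented as lists of symbols and that an empty list stands for ε; the function name and this representation are illustrative choices, not part of the thesis.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    alternatives is a list of right-hand sides; each is a list of symbols.
    The result maps each nonterminal to its new list of right-hand sides.
    """
    # alpha parts: the tails of the alternatives that start with A itself
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # beta parts: the alternatives that do not start with A
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}       # nothing to do
    new = nonterminal + "'"
    return {
        nonterminal: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[]],   # [] is epsilon
    }


print(eliminate_immediate_left_recursion("A", [["A", "alpha"], ["beta"]]))
```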

Although recursive-descent parsing has some restrictions, its overall idea and logic are quite simple. The easy and amenable nature of the recursive-descent parser makes it a viable choice for the textual language. As an example, consider the grammar of the function header from section 5.3.2:

<header main> → VERB START <main phrase>
<main phrase> → PREPOSITION OF <opt article> <function name> PROGRAM SYMBOL | ε
<opt article> → ARTICLE A | ARTICLE THE | ε
<function name> → FUNCTION NAME | ε

The recursive-descent procedures that parse the above segment of grammar can be written in pseudo code as follows:

procedure header main
begin
    match(VERB START)
    main phrase
end header main

procedure main phrase
begin
    if (token = PREPOSITION OF), then
        match(PREPOSITION OF)
        opt article
        function name
        match(PROGRAM SYMBOL)
    end if
end main phrase

procedure opt article
begin
    if (token = ARTICLE A), then
        match(ARTICLE A)
    else if (token = ARTICLE THE), then
        match(ARTICLE THE)
    end if
end opt article

procedure function name
begin
    if (token = FUNCTION NAME), then
        match(FUNCTION NAME)
    end if
end function name

procedure match(expectedToken)
begin
    if (expectedToken ≠ token), then
        error
    else
        gettoken
    end if
end match

In this pseudo code, the token variable holds the current next token in the input, and the match procedure compares the current token with its parameter, advances the input if the comparison succeeds, and generates an error if the expected token and the current next token differ. The method followed for developing the pseudo code for the grammar of the function header can also be used for defining the pseudo code of the other segments of the textual language. As a second example, consider the segment of the context-free grammar for the declaration of variables (described in section 5.3.2).
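A runnable counterpart of the function-header procedures may help to make the control flow concrete. The sketch below is an illustration under stated assumptions rather than the translator's code: tokens are plain strings with underscored names, the input is terminated by a "$" marker, PROGRAM_SYMBOL is assumed to close the "of ..." phrase as in "start of the main program", and the class and function names are hypothetical.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


class HeaderParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]         # the current next token

    def match(self, expected):
        if self.token != expected:
            raise ParseError(f"expected {expected}, found {self.token}")
        self.pos += 1                        # plays the role of gettoken

    def header_main(self):
        self.match("VERB_START")
        self.main_phrase()

    def main_phrase(self):
        if self.token == "PREPOSITION_OF":
            self.match("PREPOSITION_OF")
            self.opt_article()
            self.function_name()
            self.match("PROGRAM_SYMBOL")
        # otherwise the empty alternative is chosen and nothing is consumed

    def opt_article(self):
        if self.token in ("ARTICLE_A", "ARTICLE_THE"):
            self.match(self.token)

    def function_name(self):
        if self.token == "FUNCTION_NAME":
            self.match("FUNCTION_NAME")


def parse_header(tokens):
    """Return True if tokens form a valid function header, else raise ParseError."""
    parser = HeaderParser(tokens)
    parser.header_main()
    parser.match("$")                        # the whole input must be consumed
    return True


print(parse_header(["VERB_START", "PREPOSITION_OF", "ARTICLE_THE",
                    "FUNCTION_NAME", "PROGRAM_SYMBOL"]))
```

Each optional nonterminal simply returns without consuming input when its lookahead token is absent, which is how the empty alternative is realized in recursive descent.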

<declaration> → VERB CREATE <datatype> <may type> <variable> IDENTIFIER <initialization>
<datatype> → INDEFINITEARTICLE AN INTEGER DATATYPE | FLOAT DATATYPE | STRING DATATYPE | CHARACTER DATATYPE
<may type> → TYPESYMBOL | ε
<variable> → VARIABLESYMBOL <may named> | ε
<may named> → NAMEDSYMBOL | ε
<initialization> → = CONSTANT | ε

The recursive-descent procedures that parse the above segment of grammar can be written in pseudo code as follows:

procedure declaration
begin
    match(VERB CREATE)
    datatype
    may type
    variable
    match(IDENTIFIER)
    initialization
end declaration

procedure datatype
begin
    case token of
        INDEFINITEARTICLE AN:
            match(INDEFINITEARTICLE AN)
            match(INTEGER DATATYPE)
        FLOAT DATATYPE:
            match(FLOAT DATATYPE)
        STRING DATATYPE:
            match(STRING DATATYPE)
        CHARACTER DATATYPE:
            match(CHARACTER DATATYPE)
    end case
end datatype

procedure may type
begin
    if (token = TYPESYMBOL), then
        match(TYPESYMBOL)
    end if
end may type

procedure variable
begin
    if (token = VARIABLESYMBOL), then
        match(VARIABLESYMBOL)
        may named
    end if
end variable

procedure may named
begin
    if (token = NAMEDSYMBOL), then
        match(NAMEDSYMBOL)
    end if
end may named

procedure initialization
begin
    if (token = ‘=’), then
        match(=)
        match(CONSTANT)
    end if
end initialization

The recursive-descent parser can also be developed with automated tools. Coco/R is a compiler generator that generates a recursive-descent parser and its associated scanner [301]. There are several versions of Coco/R for different programming languages, such as Oberon, Pascal, C, Java and C#. JavaCC (Java Compiler Compiler) is a compiler generator used to generate parsers in Java. It accepts language specifications in a BNF-like format [442]. JavaCC constructs a recursive-descent parser [478], and every JavaCC grammar has the LL property. Rdp is a parser generator that takes context-free grammar specifications defined in the Iterator Backus Naur Form as input and generates a recursive-descent parser written in ANSI C as output [214].

LL parsing is a top-down parsing technique that uses an explicit stack to perform a parse [281]. The first L in LL(1) indicates that the input is parsed from left to right, the second L indicates a leftmost derivation, and the 1 in parentheses indicates that only one input symbol of lookahead is used at each step to predict the direction of the parse [5]. Like recursive-descent parsing, LL(1) parsing requires the computation of the First and Follow sets. Similarly, an LL(1) parser cannot handle ambiguous or left-recursive grammars.

LL(1) parsing is one possible option for the syntax analyzer of the textual language. However, it is more suitable for automatically generated parsers than for handwritten parsers. ANTLR (Another Tool For Language Recognition) is a parser generator that accepts a context-free grammar augmented with syntactic and semantic predicates and necessary actions [362]. ANTLR has several features that make it simpler to use than other parsing tools. ANTLR combines the specification of lexical and syntax analysis, recognizes LL(k) grammars for k > 1 and automatically builds abstract syntax trees. ANTLR is widely employed, with over 1000 academic and industrial users in different countries [363].

Consider the context-free grammar of the function header (discussed in section 5.3.2):

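<header main> → VERB START <main phrase>
<main phrase> → PREPOSITION OF <opt article> <function name> PROGRAM SYMBOL | ε
<opt article> → ARTICLE A | ARTICLE THE | ε
<function name> → FUNCTION NAME | ε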

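For the construction of the LL(1) parsing table, the computation of the First and Follow sets is required.

First(<header main>) = { VERB START }
First(<main phrase>) = { PREPOSITION OF, ε }
First(<opt article>) = { ARTICLE A, ARTICLE THE, ε }
First(<function name>) = { FUNCTION NAME, ε }
Follow(<header main>) = { $ }
Follow(<main phrase>) = { $ }
Follow(<opt article>) = { FUNCTION NAME, PROGRAM SYMBOL }
Follow(<function name>) = { PROGRAM SYMBOL }

Since <function name> is immediately followed by PROGRAM SYMBOL in the production of <main phrase>, PROGRAM SYMBOL is the only member of Follow(<function name>); and because <function name> may derive ε, PROGRAM SYMBOL also belongs to Follow(<opt article>).

The size of a parsing table is usually very large. So for conciseness, the following abbreviated codes are assigned to the nonterminals of the grammar:

<header main>: <hm>
<main phrase>: <mp>
<opt article>: <oa>
<function name>: <fn>

Now the grammar takes the following form:

<hm> → VERB START <mp>
<mp> → PREPOSITION OF <oa> <fn> PROGRAM SYMBOL | ε
<oa> → ARTICLE A | ARTICLE THE | ε
<fn> → FUNCTION NAME | ε

After the computation of the First and Follow sets, the next step is the construction of a parsing table. Table 5.3 shows the LL(1) parsing table of the function header. Only the non-empty entries are listed; every other combination of nonterminal and terminal is an error entry.

Table 5.3: Parsing table of function header

Nonterminal | Terminal | Production
<hm> | VERB START | <hm> → VERB START <mp>
<mp> | PREPOSITION OF | <mp> → PREPOSITION OF <oa> <fn> PROGRAM SYMBOL
<mp> | $ | <mp> → ε
<oa> | ARTICLE A | <oa> → ARTICLE A
<oa> | ARTICLE THE | <oa> → ARTICLE THE
<oa> | FUNCTION NAME | <oa> → ε
<oa> | PROGRAM SYMBOL | <oa> → ε
<fn> | FUNCTION NAME | <fn> → FUNCTION NAME
<fn> | PROGRAM SYMBOL | <fn> → ε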
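The First and Follow sets and the resulting table entries can also be computed mechanically with the standard fixed-point algorithms. The sketch below applies them to the function-header grammar; the grammar encoding (a dictionary of right-hand sides, with an empty list for ε), the underscored terminal names and all function names are assumptions made for this illustration.

```python
EPS, END = "ε", "$"

# Function-header grammar; [] denotes the empty production.
GRAMMAR = {
    "hm": [["VERB_START", "mp"]],
    "mp": [["PREPOSITION_OF", "oa", "fn", "PROGRAM_SYMBOL"], []],
    "oa": [["ARTICLE_A"], ["ARTICLE_THE"], []],
    "fn": [["FUNCTION_NAME"], []],
}
START = "hm"


def first_of(symbols, first):
    """First set of a string of symbols; contains EPS if the string can vanish."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                      # iterate until no set grows any more
        changed = False
        for nt, alts in GRAMMAR.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for nt, alts in GRAMMAR.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue        # Follow is defined for nonterminals only
                    tail = first_of(alt[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:     # the rest can vanish: inherit Follow(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for nt, alts in GRAMMAR.items():
        for alt in alts:
            lookaheads = first_of(alt, first)
            if EPS in lookaheads:
                lookaheads = (lookaheads - {EPS}) | follow[nt]
            for terminal in lookaheads:
                if (nt, terminal) in table:
                    raise ValueError("grammar is not LL(1)")
                table[(nt, terminal)] = alt
    return table


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
for key in sorted(TABLE):
    print(key, "->", TABLE[key] or [EPS])
```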

After the construction of the parsing table, it is simple to check the syntax of an input string. As an example, consider the LL(1) parse of "start of the main program". After tokenization, the sentence becomes "VERB START PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL".

Table 5.4: LL(1) parsing of function header

Parsing Stack | Input | Action
$ <hm> | VERB START PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | <hm> → VERB START <mp>
$ <mp> VERB START | VERB START PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | match
$ <mp> | PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | <mp> → PREPOSITION OF <oa> <fn> PROGRAM SYMBOL
$ PROGRAM SYMBOL <fn> <oa> PREPOSITION OF | PREPOSITION OF ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | match
$ PROGRAM SYMBOL <fn> <oa> | ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | <oa> → ARTICLE THE
$ PROGRAM SYMBOL <fn> ARTICLE THE | ARTICLE THE FUNCTION NAME PROGRAM SYMBOL $ | match
$ PROGRAM SYMBOL <fn> | FUNCTION NAME PROGRAM SYMBOL $ | <fn> → FUNCTION NAME
$ PROGRAM SYMBOL FUNCTION NAME | FUNCTION NAME PROGRAM SYMBOL $ | match
$ PROGRAM SYMBOL | PROGRAM SYMBOL $ | match
$ | $ | accept
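The trace above is produced by a simple stack-driven loop. The following sketch shows such a driver for the function-header grammar; the TABLE entries are written out by hand from the grammar (with an empty list for ε), and the underscored token names and the function ll1_parse are assumptions of this illustration.

```python
NONTERMINALS = {"hm", "mp", "oa", "fn"}

# LL(1) table for the function-header grammar; [] stands for epsilon.
TABLE = {
    ("hm", "VERB_START"): ["VERB_START", "mp"],
    ("mp", "PREPOSITION_OF"): ["PREPOSITION_OF", "oa", "fn", "PROGRAM_SYMBOL"],
    ("mp", "$"): [],
    ("oa", "ARTICLE_A"): ["ARTICLE_A"],
    ("oa", "ARTICLE_THE"): ["ARTICLE_THE"],
    ("oa", "FUNCTION_NAME"): [],
    ("oa", "PROGRAM_SYMBOL"): [],
    ("fn", "FUNCTION_NAME"): ["FUNCTION_NAME"],
    ("fn", "PROGRAM_SYMBOL"): [],
}


def ll1_parse(tokens, start="hm"):
    """Return the list of parser actions, or raise SyntaxError on invalid input."""
    stack = ["$", start]                 # the top of the stack is the list's end
    tokens = list(tokens) + ["$"]
    pos, actions = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            actions.append("accept")
            return actions
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look} while expanding <{top}>")
            stack.pop()
            stack.extend(reversed(rhs))  # push the right-hand side, leftmost on top
            actions.append(f"<{top}> -> {' '.join(rhs) or 'ε'}")
        elif top == look:
            stack.pop()                  # terminal on the stack matches the input
            pos += 1
            actions.append("match")
        else:
            raise SyntaxError(f"expected {top}, found {look}")


for action in ll1_parse(["VERB_START", "PREPOSITION_OF", "ARTICLE_THE",
                         "FUNCTION_NAME", "PROGRAM_SYMBOL"]):
    print(action)
```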

At the end of parsing, the parsing stack and the input both contain only $, which indicates that the input sentence is valid and has been successfully parsed.

Bottom-up parsing is an important class of parsing techniques. It is more powerful than top-down parsing; however, the constructions involved in bottom-up parsing are more complex. A bottom-up parser generates the parse tree from the leaves (bottom) and works towards the root [4].

The general style of bottom-up parsing is called shift-reduce parsing. In shift-reduce parsing, the parse tree is generated in a bottom-up manner, and the parser reduces the input string to the start symbol of the grammar. At each reduction step, a particular part of the string that matches the right side of a production is replaced by the head of that production, and if the substring is selected appropriately at each step, a rightmost derivation is traced out in reverse. A step that pushes the current state onto the stack and gets the next input token is called a shift action. A step that replaces the right side of a production by its left part is known as a reduce action [406]. The right part and its position are called the handle. The prefixes of right sentential forms that appear on the stack of the parser are called viable prefixes.

A general method of shift-reduce parsing is called LR parsing. LR parsing is a bottom-up technique that can recognize a large class of grammars. The LR class of languages is wider than LL. The "L" in LR indicates that the input is parsed from left to right, the "R" indicates a rightmost derivation in reverse, and a number in parentheses (if any) represents the number of lookahead symbols.

An LR parser comprises an input, an output, a driver program, a stack and a parsing table. All LR parsers have the same driver program, but the parsing table varies from one parser to another. The model diagram of an LR parser is shown in Figure 5.14.

Figure 5.14: model diagram of LR parser [4]

The LR parsing program reads input symbols from the buffer one at a time. It uses a stack to store a string of the form s0 X1 s1 X2 s2 ... Xm sm, where each si is a state and each Xi is a grammar symbol. The parsing table has two parts (action and goto). The program works as follows: it determines the state currently on top of the stack (sm) and the current input symbol (ai), and then consults the parsing action table entry action[sm, ai], which can have one of the following values:

1. Shift j, where j is a state.
2. Reduce A → β.
3. Accept. The LR parser accepts the input.
4. Error. The parser identifies an error in the input and performs some action.
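The four kinds of table entries drive a single loop that is the same for every LR parser. The sketch below illustrates that loop on a deliberately tiny grammar, E → E + n | n, which is not part of the textual language; its SLR action and goto tables were worked out by hand for this example, and all names are assumptions. As a common simplification, only states are kept on the stack, since each state determines the grammar symbol beneath it.

```python
# Productions: number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {
    1: ("E", 3),   # E -> E + n
    2: ("E", 1),   # E -> n
}

# Hand-built SLR tables for the toy grammar; missing entries are errors.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3), (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def lr_parse(tokens):
    """Return the production numbers in the order they are reduced."""
    stack = [0]                          # stack of states; state 0 is initial
    tokens = list(tokens) + ["$"]
    pos, reductions = 0, []
    while True:
        state, symbol = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, symbol), ("error", None))
        if kind == "shift":
            stack.append(arg)            # push the new state, consume the token
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]          # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        elif kind == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {symbol} in state {state}")


print(lr_parse(["n", "+", "n"]))
```

Read backwards, the returned sequence of reductions is the rightmost derivation of the input.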

The goto function produces a state by taking a state and a grammar symbol as arguments. There are different techniques for the construction of an LR parsing table from a grammar. Simple LR (SLR), canonical LR (or just LR) and lookahead LR (LALR) are three popular methods for the construction of an LR parsing table [4]. SLR is the simplest but least powerful technique; it may fail to generate parsing tables for certain context-free grammars. Canonical LR is the most expensive and most powerful technique. LALR lies between SLR and canonical LR. It is often used in practice, because the parsing tables it generates are significantly smaller than the canonical LR tables.

The LR parser is one possible candidate for the syntax analyzer of the textual language. There are several extensions of LR which can also be considered. Generalized LR (GLR) parsing and its extensions are attractive techniques and can be used for defining the syntax analyzer of the textual language. GLR is asymptotically efficient and can handle a variety of context-free grammars [306].

Although it is difficult to construct an LR parser by hand, it is very easy to develop efficient LR parsers with automatic parser generators [4]. Yacc (Yet Another Compiler Compiler) is an automated tool for generating parsers. Yacc is written in portable C [213]. It allows parsers to be generated from LALR(1) grammars without necessarily learning the underlying theory. Yacc accepts a grammar specification and writes a parser that recognizes valid sentences in that grammar.

The yacc specification has a three-part structure [266]. The first section is the definition section, which holds control information for the yacc-generated parser and generally sets up the execution environment in which the parser will operate. It encompasses declarations of the tokens employed in the grammar and the types of values used on the stack, and it may also include a literal block containing C code. The second section consists of the grammar rules, and the third section is C code copied verbatim into the generated C program.
Bison is a parser generator that converts a context-free grammar into a deterministic LR or GLR parser by using LALR parsing tables. Bison is a descendant of yacc [265]. It is free software, available under the GNU General Public License. Bison accepts a grammar specification and writes a parser that recognizes valid sentences in that grammar. The Bison specification has a three-part structure [265], and these three parts are similar to those of yacc.

Yacc and Bison are two popular bottom-up parser generators. However, there are many other tools that can be used for the syntax analyzer of the textual language, among them Beaver, CookCC, CSP, CUP, Dragon, Frown, GOLD, GPPG, Lemon, SableCC, Sweet Parser, UniCC and YooParse.

5.5.3 Semantic analyzer

The semantic analyzer is an indispensable phase of contemporary compilers. This phase determines the meaning of the program [476]. Fundamentally, the semantic analyzer processes the information that is beyond the capacity of context-free grammars and conventional parsing algorithms. A typical approach to semantic analysis is syntax-directed, that is, it attaches semantic actions to the grammar (syntax) rules describing the constructs of the language [406].

The semantic analyzer is the third phase of the textual language compiler. The textual language is very small, and its compiler merely translates source programs into equivalent high-level code and has no concern with the generation of executable code. Therefore, all the semantic operations should be performed during the compilation of the program, and consequently the semantic analyzer of the compiler is called a static semantic analyzer.

Attribute grammar is an extension of context-free grammar and can be used to describe the static semantics of programming languages [416]. Attribute grammar is useful for languages that follow the principle of syntax-directed semantics, which states that the semantic aspect of a source program is strongly associated with its syntax. Generally, a method of associating semantic rules with the grammar that defines the syntax of a language is called syntax-directed translation [5].

During the definition of the semantics of the textual language (in section 5.4), the rules are specified with an attribute grammar that defines its context-sensitive restrictions; however, the evaluation and verification of these rules are carried out by the semantic analyzer of the compiler. Attribute grammar is a formal method both to describe and to verify the correctness of the semantic rules [416].

The semantic rules of an attribute grammar specify the algorithm for computing the semantics of the language in the form of attributes associated with each node of the parse tree. As already discussed, synthesized and inherited attributes are the two main types of attributes. The attribute values of an S-attributed grammar can be computed by a bottom-up traversal of the tree; mostly, a postorder traversal is used. Consider the following pseudocode for the recursive postorder evaluation of synthesized attributes.

PostorderEvaluation( T as treenode )
{
    For each child N of T do
        PostorderEvaluation ( N )
    Evaluate all synthesized attributes of T
}
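The pseudocode above maps directly onto a recursive function. In the sketch below, the Node class, the attrs dictionary and the sample rule evaluate_value (which computes a single synthesized attribute "val" for a tree of additions) are assumptions introduced for illustration; they are not the attribute grammar of the textual language.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)     # attribute name -> value


def postorder_evaluation(node, evaluate):
    """Evaluate the children first, then the synthesized attributes of node."""
    for child in node.children:
        postorder_evaluation(child, evaluate)
    evaluate(node)


def evaluate_value(node):
    # Sample synthesized attribute: the value of a sum of integer leaves.
    if node.children:
        node.attrs["val"] = sum(child.attrs["val"] for child in node.children)
    else:
        node.attrs["val"] = int(node.label)


tree = Node("+", [Node("2"), Node("+", [Node("3"), Node("4")])])
postorder_evaluation(tree, evaluate_value)
print(tree.attrs["val"])
```

Because every child is decorated before its parent, the rule for a node can rely on the attributes of its children already being available.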

Evaluating the attribute values of a parse tree is called decoration. If all the attributes are inherited, the decoration can be performed in a top-down manner; if all the attributes are synthesized, a bottom-up traversal is used for decorating the parse tree.

Type checking of arithmetic expressions is one of the important tasks of the semantic analyzer. The semantic analyzer of the textual language receives the syntax tree from the syntax analyzer and evaluates the attribute grammar. As an illustration, consider a parse tree (Figure 5.15) of the arithmetic statement A = B + C (IDENTIFIER = IDENTIFIER + IDENTIFIER) generated by the arithmetic expression grammar. The attribute grammar of simple arithmetic expressions has both inherited and synthesized attributes, so the computation cannot be performed in any single direction. However, it is possible to compute them in the following order:

1. IDENTIFIER.dtype = lookup(IDENTIFIER)
2. <expression>.etype = IDENTIFIER.dtype
3. IDENTIFIER.dtype = lookup(IDENTIFIER)
4. <thefactor>.atype = IDENTIFIER.dtype
5. <theterm>.atype = <thefactor>.atype
6. IDENTIFIER.dtype = lookup(IDENTIFIER)
7. <thefactor>.atype = IDENTIFIER.dtype
8. <theterm>.atype = <thefactor>.atype
9. <expression>.atype = either integer, float or error
10. <assignment>.isvalid = either true or false

Figure 5.15: A parse tree of A = B + C

The flow of attributes in the tree is illustrated in Figure 5.16. The parse tree is represented by solid lines, whereas the flow of attributes in the tree is represented by dashed lines.

In this example, A and B are defined as float and C is defined as an integer. The final attribute values at each node of the tree are shown in Figure 5.17.
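The ten evaluation steps can be condensed into a few function calls, with the expected type passed down and the actual type returned. The sketch below is only an illustration: the typing rules it uses (an integer combined with a float yields a float, and an integer value may be assigned to a float variable) are assumptions, since the type rules of the textual language are defined elsewhere, and the function names are hypothetical.

```python
NUMERIC = ("integer", "float")


def lookup(symbols, name):
    """Return the declared type of name, or "error" if it is undeclared."""
    return symbols.get(name, "error")


def sum_type(left, right):
    # Assumed rule: both operands must be numeric; float dominates integer.
    if left not in NUMERIC or right not in NUMERIC:
        return "error"
    return "float" if "float" in (left, right) else "integer"


def check_assignment(symbols, target, left, right):
    """Type-check target = left + right; return (etype, atype, isvalid)."""
    etype = lookup(symbols, target)                  # expected type (inherited)
    atype = sum_type(lookup(symbols, left),          # actual type (synthesized)
                     lookup(symbols, right))
    # Assumed rule: types must agree, or an integer is widened to a float.
    is_valid = atype != "error" and (
        etype == atype or (etype, atype) == ("float", "integer"))
    return etype, atype, is_valid


symbols = {"A": "float", "B": "float", "C": "integer"}
print(check_assignment(symbols, "A", "B", "C"))
```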

If most of the attribute values are identical, or the attribute values are only used to compute other attribute values, then it is more straightforward to communicate attribute values through parameters and returned function values than to store them as fields in the nodes of the syntax tree.

Figure 5.16: Flow of attributes in the tree of A = B + C

Figure 5.17: Fully Attributed Tree

Furthermore, if the attribute values have a complex structure, then it may be impractical to store them in the nodes of the syntax tree. In such a case, it is more practical to keep the attribute values accessible through data structures such as lookup tables and graphs. To support this, the attribute grammar needs to be modified by substituting the attribute equations with calls to procedures that represent actions on the data structures used to maintain the attribute values.

Most of the time it is reasonable to define a procedure that evaluates the inherited attributes in preorder and the synthesized attributes in postorder, by passing the values of the inherited attributes as parameters to the recursive calls on the children and accepting the values of the synthesized attributes as the return values. The synthesized values of arithmetic expressions in the textual language can be computed by such a recursive procedure. When multiple synthesized attributes are to be returned, it is desirable to use a record structure (or any other equivalent structure); it is even possible to divide the recursive procedure into multiple procedures that handle the different cases.

Handling the attributed tree becomes more difficult if it contains both inherited and synthesized attributes. If the inherited attributes do not rely on synthesized attributes, but the synthesized attributes rely on the inherited attributes (and on other synthesized attributes), then it is possible to evaluate all the attributes in a single pass over the tree (either parse tree or syntax tree). However, multiple passes are required if the inherited attributes depend on the synthesized attributes.

The structure of the grammar has a strong impact on the properties of the attributes, and it is possible to modify the grammar without changing the language it defines in order to simplify further computations. A method has been devised to change all the inherited attributes into synthesized attributes by modifying the grammar without affecting the actual language. However, it has been observed that modifying the grammar in order to turn the attribute grammar into a purely synthesized one commonly complicates the grammar and reduces its understandability; hence it is not a recommended way of handling the problems of evaluating inherited attributes [281].

The syntax analyzer and the semantic analyzer are not always realized as separate passes, but can be partially or fully combined [406]. If the semantic analysis is simple and no optimization is to be performed by the compiler, then the remaining functions can also be performed by subprograms of the syntax analyzer [483].

Management of the symbol table is one of the pivotal duties of the semantic analyzer. The symbol table enables the semantic analyzer to maintain the meaning of the names that appear in declarations and to perform type inference and type checking on statements in order to verify their correctness according to the type rules of the textual language. If semantic analysis is delayed until after the construction of an abstract syntax tree, then realizing it becomes appreciably more straightforward: it essentially consists of defining the traversal order of the syntax tree, along with the computations to be carried out each time a node is visited in the traversal.

Attribute grammar is a useful technique for describing the static semantics of a language and is used to define the semantics of the textual language. Other techniques, like operational semantics, denotational semantics and axiomatic semantics, would be required for the dynamic semantics of a language [282, 416].

5.5.4 Symbol table management

The symbol table is one of the most important and central data structures of a compiler [476], and of LPL’s compiler as well. The symbol table is usually discussed together with semantic analysis, but it is also closely associated with the syntax analyzer and even with the lexical analyzer. Types, variables, functions and other identifiers are declared in programs, and these identifiers are bound to their meaning in the symbol table [25]. A symbol table is used by a compiler

to keep track of scope and binding information about names [217]. This information is used to identify the different components of the input source program, such as constants, variables, labels and functions. The symbol table is searched every time an identifier is encountered in the source program. When a new identifier, or new information about an existing identifier, is discovered, the symbol table must be updated. The symbol table therefore needs an efficient method for accessing the stored information as well as for adding new information.

The symbol table of the textual language can be viewed as a series of rows, each row containing the attribute values that are associated with a particular identifier. The kinds of attributes stored in the symbol table depend on the nature and structure of the textual language. The following attributes should be maintained in the symbol table of the textual language.

1. Identifier name
2. Type
3. Address
4. The line number at which the identifier is declared
5. The line numbers at which the identifier is referenced
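As a rough illustration of such a row, the five attributes listed above could be held in a small record. The sketch below is in Python rather than the language of the experimental compiler, and the class and field names are assumptions made only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SymbolRow:
    """One row of the symbol table (hypothetical field names)."""
    name: str                      # 1. identifier name
    dtype: str                     # 2. type, e.g. "integer" or "float"
    address: Optional[int] = None  # 3. address (left unset in this sketch)
    declared_at: int = 0           # 4. line number of the declaration
    referenced_at: List[int] = field(default_factory=list)  # 5. lines of use


# "create integer age" declared on line 3 and referenced on lines 5 and 9
row = SymbolRow(name="age", dtype="integer", declared_at=3)
row.referenced_at.append(5)
row.referenced_at.append(9)
```

Keeping the reference lines in a list lets the row grow as the identifier is encountered again during analysis.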

The overall structure of the textual compiler is extremely simple: it analyzes the source program and generates high-level programs from it. For that reason, the contents and structure of the symbol table should also be simple. The symbol table of the textual compiler serves two main functions: verifying semantic correctness and supporting code generation.

The primary operations on a symbol table are insert, lookup and delete. The insert operation stores information in the symbol table, the lookup operation retrieves information, and the delete operation removes information from the symbol table.

The speed of the compiler depends directly on the organization of the symbol table. Fortunately, there are different ways of implementing the symbol table of

textual language. Linear lists, binary search trees, AVL trees, B-trees and hash tables are all implementation techniques for a symbol table.

The linear list is a simple way of implementing a symbol table [217] and is generally used in compilers for which compilation speed is not a major concern. The insert operation takes constant time if insertion is performed at the front or rear of the list, whereas lookup and delete are linear-time operations. A linear list may be considered for organizing the symbol table of the textual compiler.

A search tree is another approach to symbol table management and may also be considered in the textual compiler. Although a search tree has advantages over the linear list, it is still not very efficient, particularly for the delete operation.

The hash table is a very efficient way of implementing the symbol table [217], because all the operations can be accomplished in approximately constant time. A hash table is an array of slots, called buckets, indexed by an integer range. A hash function converts the search key (the identifier name) into an integer value, and the information associated with the search key is stored in the bucket at this position. The hash function may map two or more keys to the same position (index); this situation is called a collision. Hash collisions affect the performance of the lookup and delete operations. The hash function itself should also be efficient.

There are several collision resolution techniques, two of which are open addressing and separate chaining. In open addressing, each bucket has space for only a single item, and a collision is resolved by placing the new item in a subsequent bucket. The contents of the hash table are limited by the capacity of the array used for the symbol table, and collisions become more frequent as the array fills, which degrades performance. More importantly, it is difficult to implement the delete operation.

Separate chaining is one of the best alternatives to open addressing. In separate chaining, each bucket is a linear list, and a colliding item is simply inserted into the list of its bucket. Figure 5.18 shows a basic example of this technique. Here the

size of the hash table is 7 and six identifiers (a, b, income, grade, empno and temp) have been inserted; grade and b have the same hash value (for example, 2). The identifier grade appears before b in bucket number 2; the actual order of items in a list depends on the order of insertion and on how the list is maintained. Linked lists are the common implementation of the list in each bucket.

The size of the bucket array is another important issue that

Figure 5.18: Separately chained hash table

should be considered during the definition of a symbol table. Normally, it is decided at compiler construction time. The size of the bucket array should be a prime number, which tends to distribute the hash values more evenly.

A hash function transforms the identifier name (a character string) into an integer value. This is normally accomplished in three steps. In the first step, each character of the identifier is converted into a nonnegative integer. In the second step, these integers are combined to form a single integer. In the third step, the resulting integer is scaled into the valid range of indices.

The hash table is the most strongly advocated method for the actual compiler of the textual language. However, it is possible to use other techniques in experimental compilers.
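The separate chaining scheme and the three-step hash function described above can be sketched as follows. This is only an illustrative Python sketch, not the implementation used in the LPL compiler: the class name, the multiplier 31 and the use of Python lists in place of linked lists are all assumptions.

```python
class ChainedSymbolTable:
    """Symbol table as a separately chained hash table (illustrative sketch)."""

    def __init__(self, size=7):
        # A prime bucket count, matching the table size used in Figure 5.18.
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _hash(self, name):
        # Step 1: convert each character into a nonnegative integer (ord).
        # Step 2: combine the integers into one (base 31 is an arbitrary choice).
        combined = 0
        for ch in name:
            combined = combined * 31 + ord(ch)
        # Step 3: scale the result into the valid range of indices.
        return combined % self.size

    def insert(self, name, info):
        bucket = self.buckets[self._hash(name)]
        for i, (key, _) in enumerate(bucket):
            if key == name:
                bucket[i] = (name, info)  # update an existing identifier
                return
        bucket.append((name, info))       # a collision just extends the chain

    def lookup(self, name):
        for key, info in self.buckets[self._hash(name)]:
            if key == name:
                return info
        return None

    def delete(self, name):
        bucket = self.buckets[self._hash(name)]
        for i, (key, _) in enumerate(bucket):
            if key == name:
                del bucket[i]
                return True
        return False


table = ChainedSymbolTable()
for ident in ["a", "b", "income", "grade", "empno", "temp"]:
    table.insert(ident, {"type": "integer"})
```

The bucket in which each identifier lands depends on the hash function chosen, so this sketch is not expected to reproduce the exact layout of Figure 5.18.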

5.5.5 Error handling

Novice students make many errors in their programs and spend considerable time correcting them. In conventional language processing, it is

expected that the input program conforms to the grammar of the language. In practice, however, programs may contain errors [500]. The students of the LPL course cannot be expected to write valid programs all the time, and therefore gentle support for handling errors is vital. The textual compiler is confronted with syntactically invalid programs more often than with correct ones. In principle, the reaction to an error can lie anywhere between complete correction of an invalid program and a total collapse of the compiler. There are different levels of error response that could be used in LPL’s textual compiler. Error recovery and error repair are acceptable error handling methods [457].

The least complex acceptable technique for error handling is error recovery, and it is highly advocated for LPL’s textual compiler. In error recovery, the compiler adapts its input stream and internal data structures so that it may continue parsing and verifying the program. Error recovery allows the textual compiler to keep checking the input program for further detectable errors, and to report and recover from those errors as well. Ideally, the textual compiler would recover so gracefully that each error triggers exactly one fully descriptive message, rather than a cascade of messages caused by the same error.

Panic-mode and phrase-level recovery are error recovery strategies [5]. Both strategies are possible candidates for the textual compiler. The selection of an error recovery strategy largely depends upon the parsing technique. For instance, a panic-mode strategy is recommended for recursive-descent parsers [281].

Panic mode is the simplest recovery strategy. With this method, on identifying an error, the syntax analyzer discards input symbols one at a time until one of a set of synchronizing tokens is found [410]. The panic-mode strategy is straightforward to implement, but it has obvious limitations [154]. Its recovery results depend on the quality of the synchronizing tokens, and choosing them well requires considerable skill and knowledge.

Phrase-level recovery is another useful error recovery strategy. With this method, on discovering an error, a parser may perform a local correction. It strives to convert syntactically wrong phrases into correct ones [464]. A common correction is to replace a comma with a semicolon, add a missing semicolon, or delete an

invalid semicolon.

Error repair is another type of error handling. This strategy modifies the source or the internal representation of the invalid program to make it valid. It makes the compiler complex and is consequently not strongly advocated for the textual compiler.

Better error messages are important for novice students. In a compiler, the error messages are one of the most significant points of communication between the system and the programmer [292]. For novices, the error messages are typically critical [293].

The textual compiler performs error handling and provides basic support to students in handling errors. At a minimum, the textual language environment provides gentle support in the form of error messages that help novice students to locate and correct errors.
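The panic-mode strategy described in this section amounts to a short skipping loop. The following Python sketch illustrates the idea only; the token names and the choice of a NEWLINE token as the synchronizing token are hypothetical and are not taken from the LPL grammar.

```python
def panic_mode_recover(tokens, pos, sync_tokens):
    """Skip tokens from index `pos` until a synchronizing token is found.

    Returns the index of the first synchronizing token at or after `pos`,
    or len(tokens) if the input is exhausted first.
    """
    while pos < len(tokens) and tokens[pos] not in sync_tokens:
        pos += 1  # discard one input symbol at a time
    return pos


# Hypothetical token stream in which the identifier is missing after the
# type name, so that parsing fails at index 2.
tokens = ["CREATE", "INTEGER", "ASSIGN", "INTCONST", "NEWLINE",
          "DISPLAY", "IDENTIFIER"]
resume_at = panic_mode_recover(tokens, 2, {"NEWLINE"})
```

After recovery the parser would report a single error for the skipped region and continue from `resume_at`, which helps to avoid a cascade of follow-up messages.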

5.5.6 Intermediate code generator

The intermediate code generator is the fourth phase of the textual compiler. In conventional compilers, the intermediate code generator translates the source program into an intermediate representation. Although the source program can be directly converted into the target program, some advantages of intermediate form are:

1. Certain code optimization techniques can be applied to the intermediate representation of a program.
2. A compiler for a different machine can easily be developed.

Code optimization [178] is the process of transforming a program to improve its runtime performance without affecting its functionality [406]. The textual compiler translates the input source program into high-level code so that novice students can understand the equivalent statements of their programs. A code optimizer usually changes the structure of the program, and consequently it is not required in the compiler of the textual language.

It is not necessary to define the intermediate code generator as a separate phase;

rather, it can be included as a subtask of the parser. Intermediate code is produced as a result of semantic analysis, provided the source program has no lexical, syntax or semantic errors [406].

An abstract syntax tree, Polish notation, threaded code, n-tuple notation and abstract machine code are popular kinds of intermediate representation [457]. All of these representations are possible candidates for the textual compiler; however, the abstract syntax tree is the most recommended representation. Using the abstract syntax tree as the intermediate representation in the textual compiler provides several benefits. The major portion of the abstract syntax tree is generated during syntax and semantic analysis, so it is not necessary to include the intermediate code generator as a separate phase of the textual compiler. This in turn simplifies the compiler structure and reduces the effort required to develop the compiler.

5.5.7 Code generator

Code generation is the concluding phase of the textual compiler. Its sole objective is to generate high-level programs. In conventional compilers, code generation is a machine-dependent phase and involves the details of the particular machine for which code is to be generated [217]. In contrast, no specific knowledge of hardware is required for code generation in the textual compiler. For that reason, the code generation of the textual compiler is quite different from that of conventional compilers. The inherent nature of the textual language reduces the intrinsic complexity of its code generation, and therefore it is not necessary to include code generation as a separate phase of LPL’s compiler; it can be a subtask of parsing.

Programmatic semantics [274, 276] is a useful technique and provides a small yet very helpful guideline for describing the model of the textual code generator. Programmatic semantics is a mapping between natural language constructs and programming language constructs. In this method, noun phrases map to classes, verbs map to functions and adjectives map to properties.

A simple and viable method for code generation in the textual compiler is the

direct conversion of an abstract syntax tree into high-level code. This technique is followed in NaturalJava [371]. The code generator of the textual compiler receives the source program in the form of an abstract syntax tree (renovated syntax tree). It analyzes the tree and generates a set of case frames; these case frames are processed by related procedures, which generate high-level language programs.

A case frame is a pattern-based template developed for every statement or syntactic structure of a program. Each programming construct in the textual language has a separate case frame. During code generation, the abstract syntax tree is analyzed, individual statements or constructs are identified, and their relevant information is collected. On the basis of the collected information, a specific case frame is triggered. Each case frame has a trigger word and an initializing routine that decides when it is applicable. A case frame also has a type, which symbolizes its general concept, and an arbitrary number of slots that extract the required information from syntactic elements.

Consider a loop statement “repeat the statements with age from 1 to 100 and step is 2”. Figure 5.19 illustrates the relevant portion of the abstract parse tree of the loop statement.

Figure 5.19: Section of abstract (or renovated) parse tree for loop statement

During code generation, this section of the abstract parse tree is analyzed. Because of VERB_REPEAT, a case frame for iteration is triggered, and the code generator collects the information needed to generate high-level programs. Figure 5.20 shows the case frame triggered by the loop statement.

Figure 5.20: Case frame for loop statement

The case frame generated for a loop statement includes all the pertinent information that is essential for the generation of high-level code. This case frame is passed to a corresponding function, which generates the loop statement in the high-level language by using the values of the slots. The logic of a function that generates the high-level code for the loop statement is extremely simple: it concatenates the values of the slots with suitable statement text and stores the generated text in the output buffer.

The code generator of the textual compiler generates different types of case frames. Virtually every construct requires a separate case frame. The approach required to define case frames for other constructs of the textual language is very similar to the one followed for the loop structure. As an illustration, consider the case frame for variable declaration. While analyzing the abstract parse tree of a variable declaration, the code generator extracts the type of the variable, the name of the variable and the initial value. Figure 5.21 shows a section of the abstract parse tree for a declaration statement (create float type variable fee = 500.0).

Figure 5.21: Section of abstract parse tree for the declaration of float type variable
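To make the slot-concatenation idea for the loop statement concrete, a minimal Python sketch is given here. The slot names, the dictionary layout and the C-style target syntax are assumptions made for illustration; they are not the contents of Figure 5.20, and a positive step is assumed.

```python
# Hypothetical case frame for:
#   repeat the statements with age from 1 to 100 and step is 2
loop_frame = {
    "trigger": "VERB_REPEAT",  # trigger word that selects this frame
    "type": "iteration",       # general concept of the frame
    "slots": {"variable": "age", "start": "1", "end": "100", "step": "2"},
}


def generate_loop_header(frame):
    """Concatenate the slot values with fixed text to form a C-style loop header."""
    s = frame["slots"]
    return "for ({v} = {a}; {v} <= {b}; {v} += {c})".format(
        v=s["variable"], a=s["start"], b=s["end"], c=s["step"])


# The generated text is stored in an output buffer, as described above.
output_buffer = [generate_loop_header(loop_frame)]
```

A declaration frame could be handled in the same way, with one small generating function per case frame type.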

Declaration statements are prefixed with VERB_CREATE, which triggers the declaration case frame. During compilation, the code generator extracts information from the abstract parse tree and generates the case frame. The case frame for the declaration is shown in Figure 5.22.

Figure 5.22: Case frame for variable declaration

The case frame of a declaration contains four slots. In the actual realization of the compiler, it is possible to add more slots. In the same way, the case frames for other constructs can be defined and the code generator can easily be developed.

Chapter 6

Experimental Prototype of Learners Programming Language

This chapter defines an experimental learners’ programming language. The experimental language is based on the philosophy and techniques described in chapter four.

6.1 Introduction

Learners programming language is a CS0 course that aims to improve students’ performance in the first programming course. The aim, anatomy and expected benefits of the LPL are discussed in chapter 4. However, to examine the actual worth and benefits of the course, an experimental LPL course has been developed that can be applied in a real setting.

The experimental LPL course is developed for imperative programming courses and is intended to be useful for computer science majors as well as for non-major students. The course covers the basic imperative programming concepts and problem solving to prepare the students for the first programming course. It supports the following topics:


1. Introduction to programming
2. Data type
3. Variable and literal
4. Console input and output
5. Operators
6. Single dimension array
7. Selection control structure
8. Repetition control structure
9. Comments

Lectures, lab work and assignments are used to cover these topics. The learners’ programming language is based on two-phase learning, so the topics are introduced in the first phase and later covered during the second phase.

6.2 Phase I

The first phase of the experimental LPL course introduces the students to the essential concepts of programming. It presents the fundamental concepts by using a graphical environment, and motivates the students by relieving them of the complex syntax of programming languages.

The experimental learners’ programming language is developed for the imperative paradigm, and therefore flowchart-based programming environments are recommended for the first phase. Fortunately, many flowchart-based programming environments are widely available, so there is no need to develop a special environment for the first phase. Although any suitable flowchart-based programming environment can be used in the experimental course, RAPTOR [80], Iconic Programmer [87] and Flowgorithm are particularly recommended.

The three steps discussed in section 4.9 are used to introduce each topic of the course. During the first step, the general idea of the topic is introduced. During the second step, the topic is further explored and discussed with

the graphical representations. In the last step the tasks and lab activities are assigned for the comprehension of concepts.

6.3 Phase II

The first phase of the course introduces the students to the fundamentals of elementary programming. However, it provides no real experience of programming. In order to introduce the students to actual programming, the second phase covers the fundamental topics with a textual programming language. For the experimental LPL course, a dedicated textual language is developed. The textual language supports all the topics outlined in section 4.4.

All the statements of a textual program are enclosed within a main program (similar to the main function of contemporary programming languages). User-defined functions are not allowed in the textual language. The main program follows the structure described in section 4.8.1. For instance, it may take the following forms:

start of the main program

...

end of program

or,

start of program

...

end

or,

start

...

end

The textual language supports four data types: integer, float, character and string. All variables are typed and require explicit declaration before any use. Variable declaration statements follow the pattern defined in section 4.8.2. For instance, a declaration statement may take the following forms:

create an integer type variable named age

create float variable amount

...

create integer age

The language also supports explicit initialization in a plain and obvious style:

create integer age = 70

Integer literals are numbers without any fractional part. Float literals are decimal values and follow the pattern: (.)optional.

A character literal in the textual language is a single character enclosed in single quotation marks, whereas a string literal is a sequence of characters on a single line enclosed in double quotation marks.

Single-dimension arrays of integer, float and character are allowed in the textual language. The declaration of an array variable may take the following form:

create an integer array named data of 20 elements

Traditional arithmetic operators (+, -, *, /, %), relational operators (>, >=, <, <=, <>) and logical operators (and/or) are allowed in the textual language. The traditional assignment operator (=) is also allowed in a textual language and can be used in conventional style:

A = 50

The textual language provides simple statements for console input and output. The syntax of these statements is similar to the patterns defined in section 4.8.6. Only a single variable can be used in the input statement; multiple variables are not permitted. Similarly, only a single operand is permitted in the output statement. The input statement may take the following forms:

take input in number

input in name

take input in row 2 of myarray

Similarly the output statement in an experimental textual language may take the following forms:

display the value of rollnumber

display a message hello

display number

display row 5 of myarray

The selection control structure is allowed in the textual language in its single, double and multiple forms. Its syntax is similar to that defined in section 4.8.7. The header of a selection structure is prefixed with the keyword execute, which helps the students to identify and distinguish the selection structure. The selection structure in the textual language may take the following form:

execute the statements if age > 30

...

elseif a < 200

...

else

...

end of if

The repetition control structure is allowed in the textual language. Counter-controlled loops and pretest un-expected condition loops are supported, and their syntax is similar to the structure defined in section 4.8.8. The counter loop in the experimental textual language may take the following forms:

repeat with variable num from 1 to 50 and step is 1

...

end of loop

or,

repeat the statements 200 times

...

end of loop

The pretest un-expected condition loop may take the following forms:

repeat statements while num > 200

...

end of loop

or,

repeat the statements as long as num < 10 and value > 1

...

end of loop

A single-line comment is allowed in the language; it is prefixed by the symbols “##” and runs to the end of the current line.

The textual language and its compiler are defined according to the guidelines given in chapter five. During the definition of the language, the lexical structure, syntax and semantics are specified. Similarly, the compiler of the experimental textual language comprises a lexical analyzer, syntax analyzer, semantic analyzer and code generator.

A small integrated development environment is implemented in C#. The environment includes a tiny text editor and a simple compiler. Figure 6.1 shows the user interface of the environment. The environment comprises two sections: a code section and a debug section. The code section is a text area in

Figure 6.1: LPL’s textual environment

which the students write their programs. A simple compiler is integrated with the environment; it compiles the source program and lists the errors in the debug section.

During the definition of the lexical structure, the lexical units are specified. Keywords, identifiers, literals, operators and punctuators are the lexical units of the textual language. Regular expressions are used to define the patterns of the lexical units. The lexical analyzer recognizes the lexical units and follows the principles defined in section 5.2. However, it also performs some extra chores. Conventionally, a lexical analyzer generates a universal token (IDENTIFIER) for every type of identifier. For instance, consider the following statements:

create integer age = 50

create float fee = 70.0

age = 50.32

During the analysis of the first statement, a conventional lexical analyzer generates a token IDENTIFIER for age and a token CONSTANT for 50. Similarly, in the second statement, it generates the universal token IDENTIFIER for fee and a token CONSTANT for 70.0. In both cases, IDENTIFIER represents a variable but provides no information about its type. Similarly, CONSTANT represents a constant value but provides no information about its underlying type.

In the third statement, a floating-point value is assigned to age, which is an integer variable. In the conventional style, the lexical analyzer generates a token IDENTIFIER for age and a token CONSTANT for 50.32. During parsing, the syntax analyzer successfully recognizes the third statement, since it is syntactically legal to assign a constant to an identifier. However, the semantic analyzer should report an error during the analysis of this statement.

As an illustration, consider the segment of the context-free grammar for variable declaration:

→ VERB_CREATE IDENTIFIER
→ INDEFINITEARTICLE_AN INTEGER_DATATYPE | FLOAT_DATATYPE | STRING_DATATYPE | CHARACTER_DATATYPE
→ TYPESYMBOL | ε
→ VARIABLESYMBOL | ε
→ NAMEDSYMBOL | ε
→ = CONSTANT | ε

The declaration grammar allows several syntactically valid statements, which are semantically wrong. For instance, consider a declaration statement:

create an integer variable age = 25.7

After lexical analysis, the above declaration statement is transformed into a tokenized sentence:

VERB_CREATE INDEFINITEARTICLE_AN INTEGER_DATATYPE VARIABLESYMBOL IDENTIFIER = CONSTANT

During syntax analysis, a parse tree (Figure 6.2) is generated, and the parser successfully parsed the above sentence as it is legal to assign constant to an identifier.

Figure 6.2: Parse tree of variable declaration

During semantic analysis, the types are attached to IDENTIFIER and CONSTANT, and after evaluation of these attributes the semantic analyzer reports an error. The renovated parse tree is shown in Figure 6.3.

Figure 6.3: Renovated parse tree of variable declaration

The lexical analyzer of the experimental textual compiler performs some extra tasks, which reduce the responsibilities of the semantic analyzer. Unlike conventional scanners, the lexical analyzer generates a separate type of token for each kind of lexeme.

Following is the concise list of lexemes and tokens.

Type of Lexeme              Token
Integer-type identifier     INT_IDENTIFIER
Float-type identifier       FLT_IDENTIFIER
Character-type identifier   CHR_IDENTIFIER
String-type identifier      STR_IDENTIFIER
Integer constant            INTEGERCONSTANT
Character constant          CHARACTERCONSTANT
Float constant              FLOATCONSTANT
String constant             STRINGCONSTANT
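One possible way for a scanner to produce such typed tokens is sketched below in Python. The token names follow the table above; everything else is an assumption made for illustration, in particular the idea that the scanner consults a map of declared identifier types and the simple digits-dot-digits test for float constants.

```python
# Token names follow the table above; the lookup scheme is an assumption.
IDENTIFIER_TOKENS = {
    "integer": "INT_IDENTIFIER",
    "float": "FLT_IDENTIFIER",
    "character": "CHR_IDENTIFIER",
    "string": "STR_IDENTIFIER",
}


def classify(lexeme, declared_types):
    """Return the typed token for a constant or a previously declared identifier."""
    if len(lexeme) >= 2 and lexeme.startswith('"') and lexeme.endswith('"'):
        return "STRINGCONSTANT"
    if len(lexeme) == 3 and lexeme.startswith("'") and lexeme.endswith("'"):
        return "CHARACTERCONSTANT"
    if lexeme.isdigit():
        return "INTEGERCONSTANT"
    whole, dot, fraction = lexeme.partition(".")
    if dot and whole.isdigit() and fraction.isdigit():
        return "FLOATCONSTANT"
    # Identifiers take their token from the type recorded at declaration time.
    return IDENTIFIER_TOKENS[declared_types[lexeme]]


declared = {"age": "integer", "fee": "float"}
tokens = [classify(lexeme, declared) for lexeme in ["age", "50.32"]]
```

With typed tokens, the mismatch in `age = 50.32` is already visible in the token stream as an integer identifier followed by a float constant.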

The lexical analyzer of the textual compiler generates a separate type of token for every class of variable and constant. This feature permits the syntax analyzer to perform several semantic checks during parsing, as mentioned in [281]. For example, it is possible to check the compatibility of a variable and its initial value in a declaration statement. However, this requires a separate production in the context-free grammar for every type of variable, and it is therefore only suitable for small languages.

The grammar of the experimental textual language is almost identical to the model grammar. Some changes are made to make the grammar more suitable for a particular parsing algorithm. As an illustration, consider the grammar of arithmetic expressions (discussed in section 5.3.2):

+ - * / → ( ) → IDENTIFIER | CONSTANT

The above grammar is left recursive, and top-down parsing algorithms such as recursive descent cannot handle left-recursive grammars. Therefore, the following non-left-recursive segment is used in the grammar:

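<expression> → <term> <expression_ds>
<expression_ds> → PLUS <term> <expression_ds> | MINUS <term> <expression_ds> | ε
<term> → <factor> <term_ds>
<term_ds> → MULTIPLICATION <factor> <term_ds> | DIVISION <factor> <term_ds> | MODULUS <factor> <term_ds> | ε
<factor> → <numeric_identifier> | <numeric_constants> | LEFTPARENTHESIS <expression> RIGHTPARENTHESIS
<numeric_constants> → INTEGERCONSTANT | FLOATCONSTANT
<numeric_identifier> → INT_IDENTIFIER | FLT_IDENTIFIER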

The complete context-free grammar of the experimental textual language is defined in Appendix A.

The grammar of the textual language is checked by the syntax analyzer, which is the second phase of the experimental compiler. Recursive-descent parsing is employed in the syntax analyzer. To allow a simple and straightforward construction of the recursive-descent parser, the grammar of the textual language is transformed into extended Backus-Naur Form (EBNF) and included in Appendix B.

The grammar of the textual language has multiple nonterminals, and for each nonterminal a separate function is defined in the recursive-descent parser. As an illustration, consider the recursive-descent functions for the non-left-recursive arithmetic expression grammar of LPL’s textual language:

// expression: a term followed by the rest of the expression
public void expression()
{
    term();
    expression_ds();
}

// expression_ds: zero or more "+ term" or "- term" parts
public void expression_ds()
{
    if (token == Token.PLUS)
    {
        match(Token.PLUS);
        term();
        expression_ds();
    }
    else if (token == Token.MINUS)
    {
        match(Token.MINUS);
        term();
        expression_ds();
    }
}

// term: a factor followed by the rest of the term
public void term()
{
    factor();
    term_ds();
}

// term_ds: zero or more "* factor", "/ factor" or "% factor" parts
public void term_ds()
{
    if (token == Token.MULTIPLICATION)
    {
        match(Token.MULTIPLICATION);
        factor();
        term_ds();
    }
    else if (token == Token.DIVISION)
    {
        match(Token.DIVISION);
        factor();
        term_ds();
    }
    else if (token == Token.MODULUS)
    {
        match(Token.MODULUS);
        factor();
        term_ds();
    }
}

// factor: a numeric identifier, a numeric constant or a parenthesized expression
public void factor()
{
    if (token == Token.INT_IDENTIFIER || token == Token.FLT_IDENTIFIER)
    {
        numeric_identifier();
    }
    else if (token == Token.INTEGERCONSTANT || token == Token.FLOATCONSTANT)
    {
        numeric_constants();
    }
    else if (token == Token.LEFTPARENTHESIS)
    {
        match(Token.LEFTPARENTHESIS);
        expression();
        match(Token.RIGHTPARENTHESIS);
    }
}

// numeric_constants: records a type code in dtype for the matched constant
public void numeric_constants()
{
    if (token == Token.INTEGERCONSTANT)
    {
        match(Token.INTEGERCONSTANT);
        dtype = 1;
    }
    else if (token == Token.FLOATCONSTANT)
    {
        match(Token.FLOATCONSTANT);
        dtype = 2;
    }
}

public void numeric_identifier()
{
    if (token == Token.INT_IDENTIFIER)
    {
        match(Token.INT_IDENTIFIER);
    }
    else if (token == Token.FLT_IDENTIFIER)
    {
        match(Token.FLT_IDENTIFIER);
    }
}

// match: consume the current token if it is the expected one
public void match(Token expectedToken)
{
    if (token == expectedToken)
    {
        getToken();
    }
    else
    {
        error();
    }
}

The match function compares the current token with its parameter, advances the input pointer if they are the same, and reports an error if the expected token and the lookahead token differ.

In [483], Wilhelm and Maurer state that if the semantic analysis is simple and no optimization is required by the compiler, then the remaining functions can be performed by subprograms of the syntax analyzer. The semantic analyzer of the experimental compiler is quite simple, and several of its duties are performed by the syntax analyzer, so the semantic analyzer is implemented as a subprogram of the parser. The semantic analyzer employs an attribute grammar and attaches attributes to define the semantics of the language. These attributes are then evaluated by the semantic analyzer to perform type checks.

The contextual verification of assignments, loop continuation statements and termination statements are important checks performed by the semantic analyzer of the experimental textual compiler. The logic followed during semantic analysis is similar to the specification defined in the model of the learners’ programming language.

The following attribute grammar is defined, with attributes attached to the symbols, to describe the semantics of assignment and arithmetic expressions in the experimental language.

• dtype: A synthesized attribute associated with nonterminals. It stores the data type of a nonterminal, which is calculated from the data types of the symbols that can be derived from that nonterminal.
• isvalid: A synthesized attribute associated with <assignment>. It stores a truth value: true if a valid expression is assigned to the variable on the left side of the assignment statement, and false otherwise.

Grammar Rule: <assignment> → <numeric_identifier> ASSIGNMENTOPERATOR <expression>
Semantic Rule: assignment.isvalid =
    if (numeric_identifier.dtype = expression.dtype), then
        true
    else if (numeric_identifier.dtype = float and expression.dtype = integer), then
        true
    else
        false
    end if

Grammar Rule: <expression> → <term> <expression_ds>
Semantic Rule: expression.dtype =
    if (term.dtype = string or term.dtype = character or expression_ds.dtype = string or expression_ds.dtype = character), then
        error
    else if (term.dtype = float and expression_ds.dtype = float), then
        float
    else if (term.dtype = float and expression_ds.dtype = integer), then
        float
    else if (term.dtype = integer and expression_ds.dtype = float), then
        float
    else if (term.dtype = integer and expression_ds.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <expression_ds1> → PLUS <term> <expression_ds2>
Semantic Rule: expression_ds1.dtype =
    if (term.dtype = string or term.dtype = character or expression_ds2.dtype = string or expression_ds2.dtype = character), then
        error
    else if (term.dtype = float and expression_ds2.dtype = float), then
        float
    else if (term.dtype = float and expression_ds2.dtype = integer), then
        float
    else if (term.dtype = integer and expression_ds2.dtype = float), then
        float
    else if (term.dtype = integer and expression_ds2.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <expression_ds1> → MINUS <term> <expression_ds2>
Semantic Rule: expression_ds1.dtype =
    if (term.dtype = string or term.dtype = character or expression_ds2.dtype = string or expression_ds2.dtype = character), then
        error
    else if (term.dtype = float and expression_ds2.dtype = float), then
        float
    else if (term.dtype = float and expression_ds2.dtype = integer), then
        float
    else if (term.dtype = integer and expression_ds2.dtype = float), then
        float
    else if (term.dtype = integer and expression_ds2.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <term> → <factor> <term_ds>
Semantic Rule: term.dtype =
    if (factor.dtype = string or factor.dtype = character or term_ds.dtype = string or term_ds.dtype = character), then
        error
    else if (factor.dtype = float and term_ds.dtype = float), then
        float
    else if (factor.dtype = float and term_ds.dtype = integer), then
        float
    else if (factor.dtype = integer and term_ds.dtype = float), then
        float
    else if (factor.dtype = integer and term_ds.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <term_ds1> → MULTIPLICATION <factor> <term_ds2>
Semantic Rule: term_ds1.dtype =
    if (factor.dtype = string or factor.dtype = character or term_ds2.dtype = string or term_ds2.dtype = character), then
        error
    else if (factor.dtype = float and term_ds2.dtype = float), then
        float
    else if (factor.dtype = float and term_ds2.dtype = integer), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = float), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <term_ds1> → DIVISION <factor> <term_ds2>
Semantic Rule: term_ds1.dtype =
    if (factor.dtype = string or factor.dtype = character or term_ds2.dtype = string or term_ds2.dtype = character), then
        error
    else if (factor.dtype = float and term_ds2.dtype = float), then
        float
    else if (factor.dtype = float and term_ds2.dtype = integer), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = float), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <term_ds1> → MODULUS <factor> <term_ds2>
Semantic Rule: term_ds1.dtype =
    if (factor.dtype = string or factor.dtype = character or term_ds2.dtype = string or term_ds2.dtype = character), then
        error
    else if (factor.dtype = float and term_ds2.dtype = float), then
        float
    else if (factor.dtype = float and term_ds2.dtype = integer), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = float), then
        float
    else if (factor.dtype = integer and term_ds2.dtype = integer), then
        integer
    else
        error
    end if

Grammar Rule: <factor> → <numeric_identifier>
Semantic Rule: factor.dtype = numeric_identifier.dtype

Grammar Rule: <factor> → <numeric_constants>
Semantic Rule: factor.dtype = numeric_constants.dtype

Grammar Rule: <factor> → LEFTPARENTHESIS <expression> RIGHTPARENTHESIS
Semantic Rule: factor.dtype = expression.dtype

Grammar Rule: <numeric_constants> → INTEGERCONSTANT
Semantic Rule: numeric_constants.dtype = integer

Grammar Rule: <numeric_constants> → FLOATCONSTANT
Semantic Rule: numeric_constants.dtype = float

Grammar Rule: <numeric_identifier> → INT_IDENTIFIER
Semantic Rule: numeric_identifier.dtype = integer

Grammar Rule: <numeric_identifier> → FLT_IDENTIFIER
Semantic Rule: numeric_identifier.dtype = float
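The semantic rules above can be read as a small type-combination function. The following Java sketch is illustrative only and is not the thesis implementation: the names TypeRules, DType, combine and isAssignmentValid are hypothetical, and the sketch assumes the reading in which two numeric operands yield float whenever either of them is float and integer only when both are integer.

```java
// Illustrative sketch only: TypeRules, DType, combine and isAssignmentValid
// are hypothetical names, not taken from the experimental compiler.
class TypeRules {
    enum DType { INTEGER, FLOAT, STRING, CHARACTER, ERROR }

    // True for the two numeric types handled by the arithmetic rules.
    static boolean isNumeric(DType t) {
        return t == DType.INTEGER || t == DType.FLOAT;
    }

    // dtype of an arithmetic operation applied to two operands.
    static DType combine(DType left, DType right) {
        if (!isNumeric(left) || !isNumeric(right)) {
            return DType.ERROR;   // string, character or an earlier error
        }
        if (left == DType.FLOAT || right == DType.FLOAT) {
            return DType.FLOAT;   // a float operand makes the result float
        }
        return DType.INTEGER;     // integer combined with integer
    }

    // isvalid of an assignment: equal types, or an integer assigned to a float.
    static boolean isAssignmentValid(DType target, DType value) {
        if (!isNumeric(target) || !isNumeric(value)) {
            return false;         // assumption: only numeric assignments are checked
        }
        return target == value
                || (target == DType.FLOAT && value == DType.INTEGER);
    }

    public static void main(String[] args) {
        DType rhs = combine(DType.INTEGER, DType.FLOAT);
        System.out.println("integer op float -> " + rhs);
        System.out.println("float := integer valid? "
                + isAssignmentValid(DType.FLOAT, DType.INTEGER));
        System.out.println("integer := float valid? "
                + isAssignmentValid(DType.INTEGER, rhs));
    }
}
```

Keeping the rules in one pure function makes the widening behaviour (integer to float, never float to integer) easy to inspect separately from the parser.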

The attributes defined in the attribute grammar are computed by the semantic analyzer, which is defined as a subprogram and invoked during parsing. For convenience, numeric codes are used to represent attribute values. As an illustration, consider the recursive-descent functions of <expression> and <assignment>.

    public int expression() {
        // Type codes: 1 = integer, 2 = float (as assigned in numeric_constants).
        int dtype = 0;
        int d1 = 0;
        int d2 = 0;
        d1 = term();
        d2 = expression_ds();
        if (d2 == -1) {
            dtype = d1;
        } else if (d1 == 1 && d2 == 1) {
            dtype = 1;
        } else if ((d1 == 2 && d2 == 2) || (d1 == 1 && d2 == 2) || (d1 == 2 && d2 == 1)) {
            dtype = 2;
        }
        return dtype;
    }

    public void assignment() {
        bool isvalid = false;
        int d1 = 0;
        int d2 = 0;
        d1 = numeric_identifier();
        d2 = expression();
        if (d1 == 0 && d2 == 0) {
            isvalid = false;
        } else if ((d1 == d2) || (d1 == 2 && d2 == 1)) {
            isvalid = true;
        } else {
            isvalid = false;
        }
    }

In the first function, the data type (dtype) of expression is calculated from the data types of term and expression_ds and returned to the calling function. The second function is defined for assignment; it compares the types of numeric_identifier and expression. If both are the same or compatible, the variable isvalid is set to true, which indicates the correctness of the statement.

A simple symbol table is defined for the experimental compiler. The table is accessed sequentially and contains the following information about the identifiers:

1. Name of lexeme
2. Token
3. Category of the token
4. The line number at which the identifier is declared
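A sequentially accessed table with these four fields could be sketched in Java as follows. This is a minimal illustration rather than the compiler's actual data structure: the class, field and method names are hypothetical, the category value "identifier" is a placeholder, and rejecting a redeclaration is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; class, field and method names are hypothetical.
class SymbolTable {
    // One row per identifier, mirroring the four fields listed above.
    static final class Entry {
        final String lexeme;    // name of the lexeme
        final String token;     // token, e.g. INT_IDENTIFIER
        final String category;  // category of the token
        final int line;         // line at which the identifier is declared

        Entry(String lexeme, String token, String category, int line) {
            this.lexeme = lexeme;
            this.token = token;
            this.category = category;
            this.line = line;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Sequential (linear) search; returns null when the lexeme is unknown.
    Entry lookup(String lexeme) {
        for (Entry e : entries) {
            if (e.lexeme.equals(lexeme)) {
                return e;
            }
        }
        return null;
    }

    // Adds an identifier; returns false if it was already declared
    // (an assumption made for this sketch).
    boolean declare(String lexeme, String token, String category, int line) {
        if (lookup(lexeme) != null) {
            return false;
        }
        entries.add(new Entry(lexeme, token, category, line));
        return true;
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.declare("count", "INT_IDENTIFIER", "identifier", 3);
        table.declare("ratio", "FLT_IDENTIFIER", "identifier", 4);
        Entry e = table.lookup("ratio");
        System.out.println(e.lexeme + " declared on line " + e.line + " as " + e.token);
    }
}
```

A linear search is adequate for the short programs a foundation course is likely to produce; a hash map would be the usual replacement if lookup time mattered.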

The semantic analyzer of the experimental textual compiler treats the renovated abstract syntax tree as the intermediate code, and therefore the compiler has no explicit intermediate code generator.

The panic-mode technique is used as the error recovery strategy for the experimental compiler. In this technique a set of synchronizing tokens is calculated for every nonterminal of the grammar. The synchronizing tokens of each nonterminal are defined by calculating its first and follow sets. If an error is encountered during parsing, the parser generates a message (as shown in Figure 6.4), scans ahead and discards tokens until one of the synchronizing tokens is found in the input. Panic-mode recovery was selected primarily because the compiler uses recursive-descent parsing, and panic mode is the standard form of error recovery in recursive-descent parsing [281]. The synchronizing tokens of all the nonterminals of the grammar are included in Appendix C.

Figure 6.4: LPL’s textual environment showing error messages

Code generation is the last phase of the compiler. It converts the input source program into high-level code in C and C++. Figure 6.5 shows a screenshot of the textual environment with output. The code generator is included as part of the syntax analyzer, and the logic followed in code generation is almost similar to the method described in section 5.5.7. The code generator analyzes the renovated parse tree, collects the information that is essential for code generation, and finally produces the high-level code.

Figure 6.5: LPL’s textual environment with output

The experimental compiler was developed solely to verify the scope and effectiveness of the learners’ programming language; issues like robustness and optimization were therefore not primary considerations in its development.
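The panic-mode recovery described earlier in this section can be sketched in Java as follows. The sketch is a hedged illustration, not the compiler's code: the one-rule statement grammar, the SEMICOLON token used as the synchronizing token, and all class and method names are assumptions made for the example, whereas the actual synchronizing sets are those listed in Appendix C.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only. The one-rule grammar and the SEMICOLON token are
// assumptions made for this example; they are not taken from the LPL grammar.
class PanicModeDemo {
    enum Token { INT_IDENTIFIER, ASSIGNMENTOPERATOR, INTEGERCONSTANT, SEMICOLON, EOF }

    private final List<Token> input;
    private int pos = 0;
    private final List<String> errors = new ArrayList<>();

    PanicModeDemo(List<Token> input) {
        this.input = input;
    }

    private Token lookahead() {
        return pos < input.size() ? input.get(pos) : Token.EOF;
    }

    // Consumes the lookahead if it is the expected token. Otherwise records a
    // message and discards input until a synchronizing token (or EOF) is reached.
    private boolean match(Token expected, Set<Token> sync) {
        if (lookahead() == expected) {
            pos++;
            return true;
        }
        errors.add("token " + pos + ": expected " + expected + ", found " + lookahead());
        while (lookahead() != Token.EOF && !sync.contains(lookahead())) {
            pos++;
        }
        return false;
    }

    // statement -> INT_IDENTIFIER ASSIGNMENTOPERATOR INTEGERCONSTANT SEMICOLON
    private void statement() {
        Set<Token> sync = EnumSet.of(Token.SEMICOLON);
        boolean ok = match(Token.INT_IDENTIFIER, sync)
                && match(Token.ASSIGNMENTOPERATOR, sync)
                && match(Token.INTEGERCONSTANT, sync)
                && match(Token.SEMICOLON, sync);
        // After a failed match the lookahead is SEMICOLON or EOF; consume the
        // terminator so that parsing resumes at the next statement.
        if (!ok && lookahead() == Token.SEMICOLON) {
            pos++;
        }
    }

    // Parses a sequence of statements and returns the collected error messages.
    static List<String> parse(Token... tokens) {
        PanicModeDemo parser = new PanicModeDemo(Arrays.asList(tokens));
        while (parser.lookahead() != Token.EOF) {
            parser.statement();
        }
        return parser.errors;
    }

    public static void main(String[] args) {
        // The first statement lacks its assignment operator; the second is well formed.
        List<String> errors = parse(
                Token.INT_IDENTIFIER, Token.INTEGERCONSTANT, Token.SEMICOLON,
                Token.INT_IDENTIFIER, Token.ASSIGNMENTOPERATOR,
                Token.INTEGERCONSTANT, Token.SEMICOLON);
        errors.forEach(System.out::println);
    }
}
```

Because the parser resynchronizes at the statement terminator, one malformed statement produces one message and the statements after it are still checked.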

6.4 Pair programming

Pair programming is an important pillar of the learners programming language. The LPL course requires students to work in pairs, as this is favorable to them and increases their confidence and performance. In principle the course imposes no strict rule for pairing, yet it is suggested to pair students according to their abilities. Other methods of pairing may also be considered.

6.5 Implementation

The experimental learners’ programming language course is developed specifically for the imperative paradigm. The course is highly flexible and economical. The overall cost and effort required to implement the LPL course are nominal, and no special training is required for instructors to introduce the course.

The first phase of the course requires a flowchart-based programming environment. The recommended tools for the first phase are freely available, so in principle no unusual cost is incurred in its implementation. Similarly, the programming environments of the first phase require no exceptional hardware or operating system.

The second phase of the course requires a textual language. A tiny language is developed specifically for the second phase. The language is free and requires no special hardware or operating system.

The experimental learners programming language course is flexible: it can be introduced as a complete semester course or offered as a short course before the first programming course. Lectures and laboratory assignments would be used to cover the topics of the course. Instructors may use any of the methods discussed in section 4.9 to implement the defined course.

Chapter 7

Evaluation & Discussion

This chapter evaluates the scope and essentiality of the learners programming language. It also discusses the major research findings.

7.1 Introduction

Chapters four and five introduced the learners programming language and discussed its scope and projected benefits. Chapter six defined an experimental LPL course based on the philosophy set out in chapter four. The learners’ programming language is defined as an effective solution for improving the performance of students in CS1. However, no matter how robust our design philosophy and how meticulous our design process, the only practical measure for assessing a CS0 course is how effectively it actually helps students comprehend the first programming course. Therefore, the LPL course introduced in this thesis is evaluated.

Two methods are followed to evaluate the learners programming language. In the first, a comprehensive survey was conducted in which researchers and instructors associated with the field of programming were asked to assess the central theory of the learners programming language and to determine whether the course is effective for novice students and helpful in learning the first programming course. In the second, a small study was conducted in which the LPL course was offered to a group of students, and their performance in a first programming course was compared with that of a group of students who did not take the LPL course before the first programming course.


7.2 Expert Judgment

The learners’ programming language is a strategy to overcome the complexity of the first programming course. It is generally recognized that a good strategy should be based on existing scientific knowledge. This knowledge is sometimes established through research, but often scientists must simply express their judgment, particularly in risk scenarios characterized by high levels of uncertainty. In such situations, the judgments of experts are usually required in order to pool knowledge and minimize error [58]. A comprehensive study was therefore conducted to evaluate the real worth, effectiveness, and risks of the learners programming language. Instructors and researchers of programming were asked through an online survey to judge the essentiality, scope and central theory of the learners programming language. In total, 127 respondents participated in the survey. The following items were used to assess the pertinence and essentiality of the LPL course.

7.2.1 Hardness of CS1

Programming is usually hard to learn, and novice students therefore face problems in a first programming course. To identify the essentiality of the LPL course, the instructors and researchers were asked to respond to the following statement:

“The first programming course is difficult for novice students”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.1. 70.08% of the respondents agreed and 16.54% strongly agreed that novice students face problems in a first programming course, whereas only 3.94% disagreed and 9.45% neither agreed nor disagreed. The learners’ programming language is developed specifically to overcome the hardness of the first programming course, and the feedback from the respondents strongly supports the essentiality of the learners programming language.

Figure 7.1: Experts’ opinion on the hardness of CS1
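Since 127 respondents took part in the survey, each reported percentage corresponds to a whole number of respondents. The short Java sketch below converts the percentages reported for this item back to counts; the class and method names are hypothetical and the calculation is only a consistency check on the reported figures, not part of the original analysis.

```java
// Illustrative sketch only; LikertCounts and toCount are hypothetical names.
class LikertCounts {
    // Converts a reported percentage into the nearest whole number of respondents.
    static long toCount(double percent, int respondents) {
        return Math.round(percent / 100.0 * respondents);
    }

    public static void main(String[] args) {
        int respondents = 127;
        String[] labels = {"Agree", "Strongly agree", "Disagree", "Neither"};
        double[] percents = {70.08, 16.54, 3.94, 9.45};

        long total = 0;
        for (int i = 0; i < labels.length; i++) {
            long count = toCount(percents[i], respondents);
            total += count;
            System.out.printf("%-15s %6.2f%% -> %d respondents%n",
                    labels[i], percents[i], count);
        }
        System.out.println("Accounted for: " + total + " of " + respondents);
    }
}
```

Under this rounding the four reported categories account for all 127 respondents, which suggests that no respondent chose "Strongly disagree" for this item.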

7.2.2 Main cause of problems in CS1

Several factors make the first programming course more complex for novice students. To identify the main cause of problems in CS1, the respondents were asked the following question:

“What is a main cause of problems in a first programming course?”

The respondents could choose from the following responses:

1. Lack of a previous knowledge
2. Technical and complicated syntax of programming language
3. Weak background in mathematics
4. Demographic factors like age and gender
5. Any other

The feedback received from the respondents is shown in Figure 7.2.

32.28% of the respondents think that a lack of previous knowledge is a main cause of problems in CS1, whereas 39.37% consider the technical and complicated syntax to be a main source of the problem and 17.32% attribute it to a weak mathematical background. Only 3.15% of respondents consider demographic variables to be the main cause. It is interesting that 7.87% of respondents deemed external factors, such as the computer labs and academic environment, to be the main factors behind the intricacy of CS1.

Figure 7.2: Experts’ opinion on the main cause of problems in CS1

The learners’ programming language is developed to overcome the complexity of the first programming course by providing a soft introduction to programming without coercing students to deal with the complicated syntax of a contemporary programming language, and the responses strengthen the scope of the learners programming language.

7.2.3 Paradigm of introductory programming courses

Selection of an appropriate paradigm is a pivotal decision in the definition of introductory programming courses. During the study the instructors and researchers were asked to identify the suitable paradigm for introductory programming courses. The responses are shown in Figure 7.3. The majority of respondents (71.65%) recommended the procedural paradigm. Around 11.81% recommended the object-oriented paradigm, 1.57% suggested the logic paradigm, and no respondent endorsed the functional paradigm. The remaining 14.96% of respondents recommended other paradigms.

Figure 7.3: Experts’ opinion on the paradigm of introductory programming courses

The central idea of the LPL course is very simple, and the notion of a learners programming language can be used with any paradigm. However, the LPL course prefers the procedural paradigm for introductory programming courses, and the feedback received from the respondents strongly supports the central theory of the learners programming language.

7.2.4 Significance of prior knowledge in CS1

To identify the significance of prior knowledge of programming in a first programming course, the instructors and researchers were asked to respond to the following statement:

“The students with some previous knowledge of programming work better in the first programming course than those who had not”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.4.

5.51% of the respondents strongly disagreed and 13.39% disagreed about the significance of prior knowledge in CS1, while 7.87% neither agreed nor disagreed. It is very encouraging that 66.14% of the respondents agreed and 7.09% strongly agreed with the significance of prior programming knowledge in CS1.

Figure 7.4: Experts’ opinion on the significance of prior knowledge in CS1

That prior knowledge of programming has a significant impact on the first programming course is a central thesis of the learners programming language, and the feedback received from the respondents strongly supports it.

7.2.5 Significance of CS0

A CS0 course is a simple and effective method for providing elementary knowledge of programming. To identify the significance of a CS0 course, the instructors and researchers were asked to respond to the following statement:

“The introduction of a small programming course (foundation course) before the CS1 would help the students in learning the first programming course”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.5.

3.94% of respondents strongly disagreed and 21.26% disagreed about the significance of a CS0 course for CS1, while 4.72% neither agreed nor disagreed. It is very encouraging that 64.57% of the respondents agreed and 5.51% strongly agreed on the significance of a CS0 course for CS1.

Figure 7.5: Experts’ opinion on the significance of CS0 course

The learners’ programming language is a small foundation course. The feedback received from the respondents acknowledges the significance of a programming foundation course and implicitly supports the philosophy of the learners programming language.

7.2.6 Significance of the graphical environment in CS0

The majority of CS0 courses are equipped with graphical environments. In order to identify the importance of a graphical environment in reducing the complexity of the first programming course, the instructors and researchers were asked to respond to the following statement:

“The introduction of a small programming course that simply includes a graphical tool which provides drag-and-drop facility is completely sufficient for providing the real introduction of programming and well prepares the students for first programming course”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.6.

Figure 7.6: Experts’ opinion on the significance of graphical environment in CS0

5.51% of respondents strongly disagreed and 70.87% disagreed with the significance of a graphical tool in CS0. 4.72% neither agreed nor disagreed, whereas 17.32% agreed and only 1.57% strongly agreed with the significance of a graphical tool in CS0. A large majority of respondents thus consider that a graphical tool alone is not sufficient for introducing programming and preparing students for the first programming course.

The learners’ programming language includes a graphical environment, yet it does not rely entirely on a graphical environment for introducing the fundamentals of programming, and this strategy is endorsed by the expert judgment.

7.2.7 Significance of learners programming language

The learners programming language is a two-phase CS0 course that includes a graphical environment as well as a textual language. To identify the scope and applicability of the LPL course, the instructors and researchers were asked to respond to the following statement:

“A learners programming language (a type of programming foundation course which is based on two-phase learning) that introduced the fundamental concepts of programming with graphical tools and then cover the topics with a dedicated textual language which has a simple, friendly and understandable syntax would strongly support the students in learning and completing the first programming course”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.7.

Figure 7.7: Experts’ opinion on the significance of learners programming language

2.36% of the respondents strongly disagreed and 18.90% disagreed with the significance of the learners programming language for CS1. 11.81% neither agreed nor disagreed, whereas 62.99% agreed and 3.94% strongly agreed with the significance of the LPL course for CS1. A large majority of respondents thus agreed with the significance of the learners programming language.

7.2.8 High level code generation

The textual language converts the input source program into high-level programs. To determine the effectiveness of this feature, the instructors and researchers were asked to respond to the following statement:

“The textual language in LPL course allows the students to program with simple and understandable statements and finally generates the equivalents high level programs in different languages (like C and C++). This feature should help the students in learning the syntax of first programming language”

The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is shown in Figure 7.8.

Figure 7.8: Experts’ opinion on the significance of high-level code generation

1.57% of the respondents strongly disagreed and 14.17% disagreed with the significance of high-level code generation. 14.17% neither agreed nor disagreed, whereas 66.93% agreed and 3.15% strongly agreed with the significance of high-level code generation. A large majority of respondents thus acknowledged that high-level code generation is helpful in learning the syntax of the first programming language.

7.2.9 Motivation and comfort level

Lack of motivation and a low comfort level are main issues in introductory programming courses, and the learners’ programming language therefore aims to increase the motivation and comfort level of students. During the study, instructors and researchers were asked whether the learners’ programming language can improve the motivation and comfort level of students in a first programming course. The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is illustrated in Figure 7.9.

Figure 7.9: Experts’ opinion on the impact of LPL course on motivation and comfort level

3.15% of the respondents strongly disagreed and 17.32% disagreed that the learners programming language can increase the motivation and comfort level of students. 10.24% neither agreed nor disagreed, whereas 65.35% agreed and 3.94% strongly agreed that the learners programming language increases the motivation and comfort level of students. The feedback received from the respondents indicates that LPL has the potential to increase the motivation and comfort level of students.

7.2.10 Implementation of LPL course

The LPL course can be introduced as a short course or as a complete course before the first programming course. During the study the instructors and researchers were asked the following question:

“Which of the following approach is more suitable for the implementation of LPL course?”

The respondents could choose from the following responses:

1. A complete course in a semester before the first programming course
2. A short course before the first programming course

The feedback received from the respondents is shown in Figure 7.10.

Figure 7.10: Experts’ opinion on the implementation of LPL course

Figure 7.10 shows that 73.23% of the respondents recommended the LPL course as a short course, whereas 26.77% recommended it as a complete course before the first programming course. The learners’ programming language is a flexible course and can therefore be used either as a short course or as a complete course before the first programming course.

7.2.11 Essential topics

The learners’ programming language covers the essential topics of programming. During the study the following statement was presented to verify whether the supported features are adequate at a foundation level:

“The fundamental topics of programming (introduction to programming, data types, variables and literal, console I/O, single dimension array, selection control structure, repetition structure and comments) are sufficient in the LPL course and should prepare the students for the first programming course”

The respondents could choose from the following responses:

1. Yes
2. No (specify the further topics)

The feedback received from the respondents is shown in Figure 7.11.

Figure 7.11: Experts’ opinion on the essential topics of learners’ programming language

Figure 7.11 shows that 79.53% of the respondents agreed that the currently supported features are adequate at a foundation level. However, 20.47% suggested further topics such as an introduction to recursion, multidimensional arrays, records and disk I/O. The feedback received during the study supports the essential topics included in the learners programming language and thereby strengthens its scope.

7.2.12 Pair programming

Pair programming is the cornerstone of the learners programming language. During the study, researchers and instructors were asked whether the use of pair programming is helpful for novice students of the learners programming language. The respondents could reply on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree). The feedback received from the respondents is illustrated in Figure 7.12.

Figure 7.12: Experts’ opinion on the impact of pair programming in LPL course

8.66% of the respondents strongly disagreed and 30.71% disagreed with the impact of pair programming in the learners programming language. 4.72% neither agreed nor disagreed, whereas 51.18% agreed and 4.72% strongly agreed with the impact of pair programming in the learners programming language. Pair programming is an important pillar of the learners programming language, and the responses in favor of pair programming support the essentiality of the learners programming language.

7.2.13 Implementation risk of LPL course

The learners’ programming language is a programming foundation course, and apparently no risk is associated with its implementation. During the study the following question was asked to determine whether any risk is involved in the implementation of the LPL course:

“Is there any risk associated with the implementation of learners programming language?”

The respondents could choose from the following responses:

1. Yes
2. No
3. No Idea

The feedback received from the respondents is shown in Figure 7.13.

Figure 7.13: Experts’ opinion on the risk of LPL course

17.32% of the respondents deemed that a risk is associated with the implementation of the LPL course, whereas 68.50% think that no risk is associated with it and 14.17% have no concrete idea about the risk of the LPL course. The majority of respondents thus acknowledged that no risk is associated with the implementation of the learners programming language.

7.3 Practical Evaluation

To ascertain whether the LPL course is really effective for a first programming course, a small study was conducted. As part of the study, 84 undergraduate students were selected and randomly divided into two groups (the control group and the treatment group). The experimental LPL course described in the previous chapter was introduced to the treatment group as a short course before the first programming course, whereas Blockly was introduced to the control group. Later, a first programming course based on the C language was offered by the same instructor to both groups. The following methods are employed to evaluate the effectiveness of the learners programming language.
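The thesis does not state how the random division was carried out. One common way to divide 84 students into two equal groups is to shuffle the list of students and split it in half, as in the following Java sketch; the class and method names, the use of integer student ids and the fixed seed are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; the actual randomization procedure is not
// described in the text, and all names here are hypothetical.
class RandomAssignment {
    // Shuffles student ids 1..students and splits them into two halves:
    // index 0 is the control group, index 1 the treatment group.
    static List<List<Integer>> split(int students, long seed) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= students; id++) {
            ids.add(id);
        }
        Collections.shuffle(ids, new Random(seed)); // seeded for repeatability
        int half = students / 2;
        List<List<Integer>> groups = new ArrayList<>();
        groups.add(new ArrayList<>(ids.subList(0, half)));
        groups.add(new ArrayList<>(ids.subList(half, students)));
        return groups;
    }

    public static void main(String[] args) {
        List<List<Integer>> groups = split(84, 2015L);
        System.out.println("Control group size:   " + groups.get(0).size());
        System.out.println("Treatment group size: " + groups.get(1).size());
    }
}
```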

7.3.1 Student's performance

Students’ performance in a first programming course is the key element for evaluating the significance of the precursor programming language. To identify the impact of the learners programming language, the students of both groups were internally evaluated for the dissertation research.

Procedure

After the completion of the first programming course, the students of both groups were internally evaluated with a pen-and-paper exam. The students’ pass rates and grades were used to measure their performance.

Result

The pass rate of students in the control group is 42.86%, whereas it is 64.29% in the treatment group (see Figure 7.14).

Figure 7.14: Performance of students in CS1

The mean mark for the control group is 42.14 (standard deviation = 25.65), whereas the mean mark for the treatment group is 58.33 (standard deviation = 23.15). Figure 7.15 shows the box plots of the marks obtained by the students of both groups in the first programming course.

Figure 7.15: Box plot of marks

It can be seen that the performance of students in the treatment group is much better than that of the students in the control group. Similarly, the students in the treatment group secured better grades than the students of the control group, as shown in Figure 7.16.

An independent-samples t-test was conducted on the two groups of students using SPSS, which is a powerful package for analyzing data [260]. The results of the test are given in Table 7.1.

Figure 7.16: The grade wise illustration of marks

Table 7.1: Results of independent sample t-test

                                                                                           Confidence Interval
Assumption                    t      df      Sig. (2-tailed)  Mean Diff.  Std. Error Diff.  Lower   Upper
Equal variances assumed       3.037  82      .003             16.190      5.332             5.584   26.797
Equal variances not assumed   3.037  81.155  .003             16.190      5.332             5.582   26.798

The independent-samples t-test conducted on the control group and the treatment group reveals that students who took the LPL course performed significantly better in the first programming course than the students of the control group; the effect of the learners programming language is statistically significant, t(82) = 3.037, p < 0.01.
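The test statistic can be checked from the summary statistics reported above (means 42.14 and 58.33, standard deviations 25.65 and 23.15, with 42 students per group). The Java sketch below is an illustrative recomputation, not the SPSS procedure itself; the class and method names are hypothetical, and because the inputs are rounded the results agree with Table 7.1 only up to rounding.

```java
// Illustrative sketch only; TTestFromSummary and its methods are hypothetical
// names. Inputs are the rounded summary statistics reported in the text.
class TTestFromSummary {
    // Standard error of the mean difference under the equal-variance assumption.
    static double pooledStdError(double s1, int n1, double s2, int n2) {
        double pooledVar = ((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / (n1 + n2 - 2);
        return Math.sqrt(pooledVar * (1.0 / n1 + 1.0 / n2));
    }

    // Welch-Satterthwaite degrees of freedom (equal variances not assumed).
    static double welchDf(double s1, int n1, double s2, int n2) {
        double v1 = s1 * s1 / n1;
        double v2 = s2 * s2 / n2;
        return (v1 + v2) * (v1 + v2)
                / (v1 * v1 / (n1 - 1) + v2 * v2 / (n2 - 1));
    }

    public static void main(String[] args) {
        double controlMean = 42.14, controlSd = 25.65;
        double treatmentMean = 58.33, treatmentSd = 23.15;
        int n = 42; // students per group

        double meanDiff = treatmentMean - controlMean;
        double se = pooledStdError(controlSd, n, treatmentSd, n);
        double t = meanDiff / se;

        System.out.printf("Mean difference: %.3f%n", meanDiff);
        System.out.printf("Std. error:      %.3f%n", se);
        System.out.printf("t statistic:     %.3f%n", t);
        System.out.printf("df (pooled):     %d%n", 2 * n - 2);
        System.out.printf("df (Welch):      %.3f%n", welchDf(controlSd, n, treatmentSd, n));
    }
}
```

With equal group sizes the pooled and unpooled standard errors coincide, which is why both rows of Table 7.1 report the same t value and differ only in their degrees of freedom.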

7.3.2 Self-perceived programming proficiency

The students' perception of their programming proficiency is an important factor that reflects their confidence. To identify the impact of the LPL course, the students were asked to evaluate their perceived proficiency as programmers.

Procedure
Programming proficiency was quantified by one survey item defined for the current analysis and was assessed at the end of the first programming course. The students of both groups were asked to rate their perceived proficiency as a computer programmer on a Likert-type scale:

1. I cannot start a programming task without help.
2. I can start a programming task, but require support in recognizing problems.
3. I am sometimes able to start a programming task, but require support in recognizing problems.
4. I am able to separately recognize problems in programs, though sometimes require help to discover solutions.
5. I am able to solitarily recognize problems in the programs and find solutions without any help.

Result
The perceived programming proficiency of students in the treatment group is much higher than that of students in the control group (see Figure 7.17).

Figure 7.17: Self-perceived programming proficiency of students

A Mann-Whitney U test was conducted on the self-perceived programming proficiency levels of both groups; the results are given in Tables 7.2 and 7.3.

Table 7.2: Ranks

Student group      N    Mean Rank   Sum of Ranks
Control group      42   35.26       1481.00
Treatment group    42   49.74       2089.00

Table 7.2 shows that the mean rank of the treatment group is much higher than the mean rank of the control group.

Table 7.3: U Test Statistics of students’ perception in programming

Test                      Response
Mann-Whitney U            578.000
Wilcoxon W                1481.000
Z                         -2.786
Asymp. Sig. (2-tailed)    .005
Exact Sig. (2-tailed)     .005
Exact Sig. (1-tailed)     .002

A Mann-Whitney U test conducted on the control and treatment groups shows that students who took the LPL course have a higher perception of their programming proficiency than the students of the control group, and the effect of the learners programming language is significant, Z = -2.786 and p < 0.01.
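The U and W values in Table 7.3 follow directly from the rank sums in Table 7.2. The sketch below illustrates that relationship and also computes the normal-approximation z without any correction for tied ranks. It is not the SPSS computation used in the study, and the function names are illustrative.

```python
import math


def u_from_rank_sum(rank_sum, n_group):
    """Mann-Whitney U for a group, given the sum of its ranks."""
    return rank_sum - n_group * (n_group + 1) / 2


def z_without_tie_correction(u, n_a, n_b):
    """Normal approximation to U, ignoring tied observations."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u - mean_u) / sd_u


n_control = n_treatment = 42
sum_control, sum_treatment = 1481.0, 2089.0  # rank sums from Table 7.2

# All ranks 1..N must be accounted for across the two groups.
n_total = n_control + n_treatment
assert sum_control + sum_treatment == n_total * (n_total + 1) / 2

u_control = u_from_rank_sum(sum_control, n_control)
u_treatment = u_from_rank_sum(sum_treatment, n_treatment)
u = min(u_control, u_treatment)  # the smaller U is the one reported
z = z_without_tie_correction(u, n_control, n_treatment)

print(f"U = {u:.1f}, mean ranks = {sum_control / n_control:.2f} "
      f"and {sum_treatment / n_treatment:.2f}, uncorrected z = {z:.3f}")
```

The sketch reproduces U = 578 and the mean ranks of Table 7.2. The uncorrected z is somewhat smaller in magnitude than the reported -2.786; a gap in that direction is what a correction for tied ranks would produce, which is plausible for five-point Likert responses, although the text does not state which correction was applied.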

7.3.3 Commitment to class

Students' affective commitment has a strong impact on their performance in a first programming course. Affective commitment describes how much students want to be in the class because they enjoy it. During the study, students' affective commitment to the first programming course was measured to identify the effect of the learners programming language.

Procedure
Students' affective commitment was assessed before the end of the first programming course by asking the students of both groups to respond to the following statement: “I am enthusiastic about this programming class”. Students responded on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree).

Result
Commitment to the class is higher among students in the treatment group than among students in the control group (see Figure 7.18).

Figure 7.18: Students commitment to a class

A Mann-Whitney U test was conducted on the two groups of students; the results are given in Tables 7.4 and 7.5.

Table 7.4: Ranks

Student group      N    Mean Rank   Sum of Ranks
Control group      42   35.24       1480.00
Treatment group    42   49.76       2090.00

Table 7.4 shows that the mean rank of the treatment group is much higher than the mean rank of the control group.

Table 7.5: U Test Statistics of students’ perception in programming

Test                      Response
Mann-Whitney U            577.000
Wilcoxon W                1480.000
Z                         -2.886
Asymp. Sig. (2-tailed)    .004
Exact Sig. (2-tailed)     .004
Exact Sig. (1-tailed)     .002

A Mann-Whitney U test conducted on the control and treatment groups shows that students who took the LPL course have a higher commitment to the class than the students of the control group, and the effect of the learners programming language is significant, Z = -2.862 and p < 0.01.

7.3.4 Student’s perception

The students of the treatment group were asked to rate the significance of the LPL course. This was measured with one survey item containing the following statement: “The LPL course is helpful in first programming course”. Students responded on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree).

Procedure
Students' perception was assessed at the end of the first programming course. All the students of the treatment group participated in the survey.

Result
The students' perception of the significance of the learners programming language is quite positive. Figure 7.19 shows the overall picture of the feedback received on the significance of the LPL course.

Figure 7.19: Student’s perception on the LPL course

The majority of students agreed with the significance of the learners programming language in CS1; only 7.14% strongly disagreed and 14.29% disagreed with the significance of the LPL course. The course instructor reported that the majority of at-risk students asked for the learners programming language to be introduced as a complete semester-long course before the first programming course, whereas in our study it was introduced as a short course before CS1. It was also reported that a large majority of students wanted a system or facility that would help them in locating, recognizing and correcting errors. Similarly, several students asked to spend more time on the second phase of the LPL course. The instructor also reported that a large majority of students benefited from pair programming, although, interestingly, some students presumed that they had been paired because of limited equipment and facilities.

7.3.5 Interest in next programming course

Low retention is one of the major problems of introductory programming courses. During the study, it was assessed whether the LPL course improves the retention level of novice students.

Procedure
The conventional method for finding the retention rate is to calculate the percentage of students who begin as freshmen in the fall semester of one year and are still attending classes the following fall. However, this method is applicable only when students have the choice to skip the next programming course.

In many institutions the programming courses (CS1, CS2 and advanced programming courses) are not elective, and the students have to attend them. In this case an alternative method is required to assess the retention rate. One possible method is to ask the students about their interest in the next programming course. Although this method does not strictly determine the retention rate, it can help in identifying the expected retention rate. During the study, the students of both groups were asked to respond to the following item: “I am willing in another programming course in the next semester”. Students responded on a Likert scale ranging from 1 (Strongly disagree) to 5 (Strongly agree).

Result
The feedback received from the students suggests that the students of the treatment group have a higher interest in the next programming course than the students of the control group. Figure 7.20 shows the details.

Figure 7.20: Students interest in the next programming course

Students who attended the LPL course are more interested in another programming course than the students of the control group. A Mann-Whitney U test was conducted on both groups of students; the results are given in Tables 7.6 and 7.7.

Table 7.6: Ranks

Student group      N    Mean Rank   Sum of Ranks
Control group      42   34.65       1455.50
Treatment group    42   50.35       2114.50

Table 7.6 shows that the mean rank of the treatment group is much higher than the mean rank of the control group.

Table 7.7: U Test Statistics about the interest of students in a next programming course

Test                      Response
Mann-Whitney U            552.500
Wilcoxon W                1455.500
Z                         -3.042
Asymp. Sig. (2-tailed)    .002
Exact Sig. (2-tailed)     .002
Exact Sig. (1-tailed)     .001

The Mann-Whitney U test conducted on the control and treatment groups indicates that students who took the LPL course have a higher interest in the next programming course than the students of the control group, and the effect of the learners programming language is significant, Z = -3.042 and p < 0.01.

Chapter 8

Conclusions and Future Work

This chapter presents the key findings of the work and describes the contributions made to the field of computing education. A brief outline of further research is also presented, which implicitly illustrates some limitations of the research.

8.1 Overview and Conclusion

The first programming course is notoriously difficult for novice students, because it requires them to understand concepts for which they have no prior experience and to express these concepts in the unfamiliar notation of a programming language they have never encountered before. The difficulties in introductory programming courses usually manifest as high dropout and low retention rates. Close support is therefore required to prepare novice students for the first programming course.

The learners’ programming language is a zeroth programming course designed specifically to improve students’ performance in the first programming course. It is a two-phase foundation programming course that combines a graphical environment with a dedicated textual programming language and is augmented with pair programming.

To identify its actual worth and effectiveness, the learners programming language was evaluated by collecting expert judgments and by applying the course to a small group of students. It would be very difficult to draw any definite conclusion about the learners programming language from the limited number of expert judgments and the small sample analyzed in this thesis, yet the study suggests that there are several substantial benefits to using LPL as a precursor to the first programming course. Introducing the fundamentals of programming with a graphical environment, later covering the programming concepts with a textual language, and encouraging the students to work in pairs is therefore a useful approach for preparing novice students for the first programming course.

Scores in the first programming course show that the treatment group, which attended the LPL course, performed better than the control group, and a better experience in the first programming course was reported in the treatment group. The pass rate was 42.86% in the control group and 64.29% in the treatment group. The mean mark for the control group was 42.14 (standard deviation = 25.65), whereas the mean mark for the treatment group was 58.33 (standard deviation = 23.15). An independent sample t-test on the students’ scores indicated that the students of the treatment group performed significantly better in the first programming course than the students of the control group, t = 3.037 and p < 0.01. These results suggest that the learners programming language is helpful for novice students and supportive in reducing the inherent complexity of the first programming course.

The level of self-perceived programming proficiency among students of the treatment group is much higher than among students of the control group. Programming proficiency was assessed at the end of the first programming course, and students who took the LPL course reported a higher self-perceived programming proficiency.
A Mann-Whitney U test conducted on students’ self-perceived programming proficiency showed that students who took LPL have a higher perception of their programming proficiency than the students of the control group, and the effect of the learners programming language is significant, Z = -2.786 and p < 0.01. This indicates that the learners programming language has the potential to increase the confidence of novice students.

The students of the treatment group were more committed to the first programming course. The students’ affective commitment was assessed before the end of the first programming course. A Mann-Whitney U test on the commitment levels of students in both groups showed that students who took the LPL course have a higher commitment to the class than the students of the control group, and the effect of the learners programming language is significant, Z = -2.862 and p < 0.01. This suggests that the learners programming language is also helpful in motivating and encouraging students.

The students’ perception of the significance of the LPL course was assessed at the end of the first programming course. Around 66.67% of students acknowledged that the LPL course supported the first programming course, which indicates that a clear majority of students in the treatment group recognize the significance of the LPL course.

After the completion of the first programming course, 61.90% of the students in the treatment group were interested in the next programming course, whereas only 30.95% of the students in the control group were willing to take another programming course in the next semester. The feedback received from both groups was analyzed with a Mann-Whitney U test, which showed that students who took LPL have a higher interest in the next programming course than the students of the control group, and the effect of the learners programming language is significant, Z = -3.042 and p < 0.01. This suggests that the learners programming language may also be productive in improving the retention rate of students.

The overall scope and effectiveness of the learners programming language were well recognized in the experts’ judgment. 86.61% of experts opined that novice students face problems in a first programming course; 39.37% deemed complex syntax to be the main cause of the problem, whereas 32.28% considered lack of previous knowledge to be the main cause.
Around 70.08% of experts agreed that the introduction of a small foundation programming course could help students in CS1. However, 76.38% deemed that a single graphical tool is not sufficient in a CS0 course. These findings lend some support to the need for a different type of CS0 course. 66.93% of experts agreed that a learners’ programming language that introduces the fundamental concepts of programming with a graphical tool and then covers the topics with a dedicated textual language that has a simple, friendly and understandable syntax would strongly support students in learning and completing the first programming course. 69.29% of experts agreed that the LPL course could increase the motivation and comfort level of students. It is very encouraging that 70.08% of experts acknowledged that high-level code generation would help novice students in learning and understanding the complicated syntax of CS1. The LPL course includes the fundamental topics of programming, and 79.53% of experts agreed that the supported features are sufficient at CS0 level. It is also encouraging that 55.91% of experts agreed with the significance of pair programming in the LPL course and 68.50% think that no risk is associated with the implementation of the LPL course. These findings suggest the necessity, scope and worth of the learners programming language.

The overall results and feedback obtained during the evaluation of the learners programming language address the research questions of the thesis and suggest that the combination of two-phase learning and pair programming at CS0 level is helpful for novice students and supports them in learning and comprehending the first programming course. Moreover, the results also suggest that the learners programming language helps to increase the motivation and comfort level of students and may also help to improve the retention rate of students in CS1. However, it is important to mention that the introduction of a learners’ programming language is just a single step towards helping novice students cope better with the difficulties associated with the first programming course; to overcome all the major difficulties of introductory programming, it is also important to address other relevant issues.
Finally, the contributions of this work are: i) the definition of two-phase learning, which uses a graphical environment to present the elementary topics and then covers these topics using a textual language; ii) the definition of a textual language that allows programming with simple statements and permits the generation of high-level code; and iii) the use of pair programming in a CS0 course. To the best of our knowledge, these concepts have not previously been combined and investigated in a zeroth programming course.

8.2 Future Work

There are several future directions for overcoming the limitations of this research. The learners’ programming language was evaluated on a small sample. There is a need to conduct a comprehensive multi-institutional study on the impact of the learners programming language, covering a broad range of students. This is because it is not clear whether the learners’ programming language is equally effective for male and female students, whether it is more effective for young or adult students, whether demographic factors strongly affect its usefulness, whether it is equally effective for computer science majors and non-majors, whether it can reduce cognitive load [308, 445] in learning programming, and whether it is effective for every programming paradigm. This would require exhaustive statistical analyses and would provide deeper insight into the scope, structure and effectiveness of the learners programming language.

The learners’ programming language is based simply on two-phase learning, augmented with pair programming. The incorporation of other techniques such as live-coding [364, 399] and doodles [481] in future studies would improve the effectiveness of the learners programming language. Gender inequality in the field of computer science is no secret [14], and the number of women students involved in programming is very low [401]. Further studies could therefore be conducted to identify the causes and to develop a better precursor programming course that increases the motivation of women.

The model of the LPL course is one possible implementation of two-phase learning. Supplementary work is required to identify, analyze and evaluate other viable graphical environments for the first phase of the LPL course. Similarly, additional work would be useful on the definition and alternative representations of the syntactic structure of the textual language.
The compiler of the textual language follows the typical structure of a contemporary compiler. There is considerable scope for further studies on the definition, optimization and implementation of the compiler. Contemporary error-handling techniques are described in the definition of the LPL model. Further studies could be conducted on the formulation of a robust error-handling strategy that would provide better error handling and support novice students in identifying and correcting errors. These studies would help to refine the design of the LPL and would open new directions for further work.

The second phase of the LPL course includes a textual language. The development environment of a textual language is usually less attractive to novice students, yet increasing and maintaining the interest and comfort level of students are the main objectives of the LPL course. There is therefore a need to formulate a development environment that can maintain the interest of students. Similarly, a detailed study could be conducted to determine whether powerful features such as IntelliSense, code refactoring and static checking are beneficial at CS0 level.

The LPL course does not strictly recommend any particular method for grouping students. Further studies could also be conducted to identify a suitable method for pairing the students of the LPL course.

The prime objective of the LPL course is to improve the performance of novice students in the first programming course, and therefore the scope and effectiveness of the LPL course were evaluated by analyzing its effect on CS1. There is a need to analyze the long-term impact of the LPL course on CS2 and further advanced programming courses.

The learners’ programming language is an economical course, and the effort and cost required for its implementation are usually very low. However, further studies could be conducted to estimate the actual effort and budget required for the design and implementation of the LPL course.
Similarly, there is a need to identify the risks associated with the implementation of the LPL course. The learners’ programming language suggests three possible techniques for the implementation of a course. It is essential to conduct a study to analyze these techniques and to ascertain the requisites and a suitable environment for each technique.

Bibliography

[1] K. K. Agarwal and A. Agarwal. Simply Python for CS0. Journal of Computing Sciences in Colleges, 21(4):162–170, 2006.

[2] K. K. Agarwal, A. Agarwal, and M. E. Celebi. Python Puts a Squeeze on Java for CS0 and Beyond. Journal of Computing Sciences in Colleges, 23(6):49–57, 2008.

[3] K. K. Agarwal, A. Agarwal, and L. Fife. Python and Visual Logic: A Good Combination for CS0. Journal of Computing Sciences in Colleges, 27(4):22–27, 2012.

[4] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison Wesley, USA, first edition, 1986.

[5] A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison Wesley, Boston, USA, second edition, 2006.

[6] T. Ahoniemi and E. Lahtinen. Visualizations in preparing for programming exercise sessions. Electronic Notes in Theoretical Computer Science, 178:137–144, 2007.

[7] E. E. Aispuro, G. Licea, J. Suárez, A. Sandoval, M. A. Carreño, I. Estrada, R. Juárez-Ramírez, L. Aguilar, and L. G. Martínez. Supporting the Development of Iapplications in Introductory Programming Courses. Computer Applications in Engineering Education, 20(2):214–220, 2012.

[8] A. Akingbade, T. Finley, D. Jackson, P. Patel, and S. H. Rodger. JAWAA: Easy Web-based Animation from CS0 to Advanced CS Courses. ACM SIGCSE Bulletin, 35(1):162–166, 2003.

[9] N. M. Al-Barakati and A. Y. Al-Aama. The Effect of Visualizing Roles of Variables on Student Performance in an Introductory Programming Course. ACM SIGCSE Bulletin, 41(3):228–232, 2009.

[10] A. Ali. A Conceptual Model for Learning to Program in Introductory Programming Courses. Issues in Informing Science and Information Technology, 6:517–529, 2009.

[11] A. Ali and D. Smith. Teaching an Introductory Programming Language in a General Education Course. Journal of Information Technology Education: Innovations in Practice, 13:57–67, 2014.

[12] B. Y. Alkazemi and G. M. Grami. Utilizing BlueJ to Teach Polymorphism in an Advanced Object-Oriented Programming Course. Journal of Information Technology Education: Innovations in Practice, 11:271–282, 2012.

[13] C. A. Alspaugh. Identification of Some Components of Computer Programming Aptitude. Journal for Research in Mathematics Education, 3(2):89–98, 1972.

[14] C. Alvarado and Z. Dodds. Women in CS: An Evaluation of Three Promising Practices. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pages 57–61, New York, USA, 2010.

[15] C. Alvarado, C. B. Lee, and G. Gillespie. New CS1 Pedagogies and Curriculum, the Same Success Factors? In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pages 379–384, New York, USA, 2014.

[16] A. Alvarez and M. Larranaga. Using LEGO Mindstorms to Engage Students on Algorithm Design. In IEEE Frontiers in Education Conference, pages 1346–1351, 2013.

[17] M. L. Ambrose and C. T. Kulik. Old Friends, New Faces: Motivation Research in the 1990s. Journal of Management, 25(3):231–292, 1999.

[18] A. P. Ambrósio, F. M. Costa, L. Almeida, A. Franco, and J. Macedo. Identifying Cognitive Abilities to Improve CS1 Outcome. In Frontiers in Education Conference, pages F3G–1–F3G–7, 2011.

[19] A. P. L. Ambrósio and F. M. Costa. Evaluating the Impact of PBL and Tablet PCs in an Algorithms and Computer Programming Course. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pages 495–499, New York, USA, 2010.

[20] S. Amershi, G. Carenini, C. Conat, A. K. Mackworth, and D. Poole. Pedagogy and usability in interactive algorithm visualizations: Designing and evaluating CIspace. Interacting with Computers, 20(1):64–96, 2008.

[21] P. Y. O. Amoako, K. A. Sarpong, J. K. Arthur, and C. Adjetey. Performance of Students in Computer Programming: Background, Field of Study and Learning Approach Paradigm. International Journal of Computer Applications, 77(12):17–21, 2013.

[22] R. Anderson, M. D. Ernst, R. Ordóñez, P. Pham, and S. A. Wolfman. Introductory Programming Meets the Real World: Using Real Problems and Data in CS1. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pages 465–466, New York, USA, 2014.

[23] I. Androutsopoulos, G. D. Ritchie, and P. Thanisch. Natural language Interfaces to Databases - An introduction. Natural Language Engineering, 1:29–81, 1995.

[24] K. Anewalt. Making CS0 Fun: An Active Learning Approach Using Toys, Games and Alice. Journal of Computing Sciences in Colleges, 23(3):98–105, 2008.

[25] A. W. Appel. Modern Compiler Implementation in C. Cambridge University Press, Cambridge, UK, 2004.

[26] C. Areia and A. Mendes. A Tool to Help Students to Develop Programming Skills. In Proceedings of the 2007 International Conference on Computer Systems and Technologies, pages 89:1–89:7, New York, USA, 2007.

[27] J. Arlegui, M. Moro, and A. Pina. Simulation of Robotic Sensors in BYOB. Proceedings of Robotics in Education, pages 25–32, 2012.

[28] M. Armoni, O. Meerbaum-Salant, and M. Ben-Ari. From Scratch to “real” Programming. ACM Transactions on Computing Education, 14(4):25:1–25:15, 2015.

[29] A. T. Avancena and A. Nishihara. Usability and Pedagogical Assessment of an Algorithm Learning Tool: A Case Study for an Introductory Programming Course for High School. Issues in Informing Science and Information Technology, 12:21–43, 2015.

[30] B. W. Ballard and A. W. Biermann. Programming in Natural Language: “NLC” As a Prototype. In Proceedings of the 1979 Annual Conference, pages 228–237, New York, USA, 1979.

[31] S. Banerjee, S. Naskar, and S. Bandyopadhyay. BFQA: A Bengali Factoid Question Answering System. In Text, Speech and Dialogue, volume 8655 of Lecture Notes in Computer Science, pages 217–224. 2014.

[32] A. Barenghi, E. Viviani, S. C. R. D. Mandrioli, and M. Pradella. PAPAGENO: A Parallel Parser Generator for Operator Precedence Grammars. In Software Language Engineering, volume 7745 of Lecture Notes in Computer Science, pages 264–274. Springer Berlin Heidelberg, 2013.

[33] T. Barik, K. Lubick, S. Christie, and E. Murphy-Hill. How Developers Visualize Compiler Messages: A Foundational Approach to Notification Construction. In Second IEEE Working Conference on Software Visualization, pages 87–96, 2014.

[34] T. Barnes, E. Powell, A. Chaffin, and H. Lipford. Game2Learn: Improving the Motivation of CS1 Students. In Proceedings of the 3rd International Conference on Game Development in Computer Science Education, pages 1–5, New York, USA, 2008.

[35] M. P. Barnett. SNAP: A Programming Language for Humanists. Computers and the Humanities, 4(4):225–240, 1970.

[36] M. P. Barnett and W. M. Ruhsam. A Natural Language Programming System for Text Processing. IEEE Transactions on Engineering Writing and Speech, 11(2):45–52, 1968.

[37] M. P. Barnett and W. M. Ruhsam. SNAP: An Experiment in Natural Language Programming. In Proceedings of the Spring Joint Computer Conference, pages 75–87, New York, USA, 1969.

[38] I. D. Baxter, C. Pidgeon, and M. Mehlich. DMS®: Program Transformations for Practical Scalable Software Evolution. In Proceedings of the 26th International Conference on Software Engineering, pages 625–634, Washington, USA, 2004.

[39] J. D. Bayliss and S. Strout. Games As a “Flavor” of CS1. ACM SIGCSE Bulletin, 38(1):500–504, March 2006.

[40] D. Beazley and B. K. Jones. Python Cookbook. O’Reilly Media, USA, third edition, 2013.

[41] B. W. Becker. Teaching CS1 with Karel the Robot in Java. ACM SIGCSE Bulletin, 33(1):50–54, 2001.

[42] A. Begel, D. D. Garcia, and S. A. Wolfman. Kinesthetic learning in the classroom. ACM SIGCSE Bulletin, 36(1):183–184, 2004.

[43] M. Ben-Ari, R. Bednarik, R. B. Levy, G. Ebel, A. Moreno, N. Myller, and E. Sutinen. A Decade of Research and Development on Program Animation: The Jeliot Experience. Journal of Visual Languages & Computing, 22(5):375–384, 2011.

[44] J. Bennedsen. Teaching and Learning Introductory Programming-A Model-Based Approach. PhD thesis, University of Oslo, 2008.

[45] J. Bennedsen and M. E. Caspersen. Abstraction Ability As an Indicator of Success for Learning Object-oriented Programming. ACM SIGCSE Bulletin, 38(2):39–43, 2006.

[46] J. Bennedsen and M. E. Caspersen. Failure Rates in Introductory Programming. ACM SIGCSE Bulletin, 39(2):32–36, 2007.

[47] S. Bergin and R. Reilly. The influence of motivation and comfort-level on learning to program. In Proceedings of the 17th Workshop of the Psychology of Programming Interest Group, pages 293–304, 2005.

[48] S. Bergin and R. Reilly. Programming: Factors That Influence Success. ACM SIGCSE Bulletin, 37(1):411–415, 2005.

[49] S. Bergin and R. Reilly. Predicting introductory programming performance: A multi-institutional multivariate study. Computer Science Education, 16(4):303–323, 2006.

[50] A. Berglund and R. Lister. Introductory Programming and the Didactic Triangle. In Proceedings of the Twelfth Australasian Conference on Computing Education, volume 103, pages 35–44, Darlinghurst, Australia, 2010.

[51] R. Biddle and E. Tempero. Java Pitfalls for Beginners. ACM SIGCSE Bulletin, 30(2):48–52, 1998.

[52] A. W. Biermann, B. W. Ballard, and A. M. Holler. A System for Natural Language Computation. ACM SIGLASH Newsletter, 12(1):6–16, 1979.

[53] T. Binkis and T. Blazauskas. Implementation of Extensible Flowcharting Software Using Microsoft DSL Tools. In Proceedings of the 16th International Conference on Information and Software Technologies, pages 211–217, 2010.

[54] J. Bispo, L. Reis, and J. M. P. Cardoso. Multi-Target C Code Generation from MATLAB. In Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, pages 95:95–95:100, New York, USA, 2014.

[55] A. P. Black, K. B. Bruce, M. Homer, and J. Noble. Grace: The Absence of (Inessential) Difficulty. In Proceedings of the ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pages 85–98, New York, USA, 2012.

[56] A. P. Black, K. B. Bruce, M. Homer, J. Noble, A. Ruskin, and R. Yannow. Seeking Grace: A New Object-oriented Language for Novices. In Proceeding of the 44th ACM Technical Symposium on Computer Science Education, pages 129–134, New York, USA, 2013.

[57] J. D. Blake. Language Considerations in the First Year CS Curriculum. Journal of Computing Sciences in Colleges, 26(6):124–129, 2011.

[58] F. Bolger and G. Rowe. The Aggregation of Expert Judgment: Do Good Things Come to Those Who Weight? Risk Analysis, 35(1):5–11, 2015.

[59] I. A. Bolshakov and A. Gelbukh. Computational Linguistics: Models, Resources, Applications. Instituto Politécnico Nacional, Mexico, 2004.

[60] R. D. Boyle, J. Carter, and M. Clark. What Makes Them Succeed? Entry, progression and graduation in Computer Science. Journal of Further and Higher Education, 26(1):3–18, 2002.

[61] G. Braught and T. Wahls. Teaching Objects in Context. Journal of Computing Sciences in Colleges, 23(5):101–109, 2008.

[62] G. Braught, L. M. Eby, and T. Wahls. The Effects of Pair-programming on Individual Programming Skill. ACM SIGCSE Bulletin, 40(1):200–204, 2008.

[63] G. Braught, J. MacCormick, and T. Wahls. The Benefits of Pairing by Ability. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pages 249–253, New York, USA, 2010.

[64] G. Braught, T. Wahls, and L. M. Eby. The Case for Pair Programming in the Computer Science Classroom. ACM Transactions on Computing Education, 11(1):2:1–2:21, 2011.

[65] K. Brown and S. Ogilvie. Concise Encyclopedia of Languages of the World. Elsevier Ltd., 2009.

[66] M. Brown. CS0 As an Indicator of Student Risk for Failure to Complete a Degree in Computing. Journal of Computing Sciences in Colleges, 28(5):9–16, 2013.

[67] M. Brown, C. Hu, C. Burch, and M. Nooner. CS0: Why, What, and How?: Panel Discussion. Journal of Computing Sciences in Colleges, 25(5):79–81, 2010.

[68] K. B. Bruce. Controversy on How to Teach CS1: A Discussion on the SIGCSE-members Mailing List. ACM SIGCSE Bulletin, 36(4):29–34, 2004.

[69] P. Brusilovsky, E. Calabrese, J. Hvorecky, A. Kouchnirenko, and P. Miller. Mini-languages: a way to learn programming principles. Education and Information Technologies, 2(1):65–83, 1997.

[70] P. Brusilovsky, O. Shcherbinina, and S. Sosnovsky. Mini-languages for non-Computer Science Majors: What are the Benefits? Interactive Technology and Smart Education, 1(1):21–28, 2004.

[71] J. A. Brzozowski. Derivatives of Regular Expressions. Journal of the ACM, 11(4):481–494, 1964.

[72] D. Budny, L. Lund, J. Vipperman, and J. L. Patzer II. Four steps to teaching C programming. In 32nd Annual Frontiers in Education, volume 2, pages F1G–18–F1G–22, 2002.

[73] E. Cambranes. Using Natural Language Descriptions of Algorithms in the Early Stage of Programming. In IEEE Symposium on Visual Languages and Human-Centric Computing, pages 217–218, 2012.

[74] E. Cambranes. Using Natural Language Descriptions of Algorithms in the Early Stage of Programming. In IEEE Symposium on Visual Languages and Human-Centric Computing, pages 173–174, 2013.

[75] C. Campbell. A Model for Systematically Investigating Relationships between Variables that Affect the Performance of Novice Programmers. PhD thesis, Faculty of Health, Engineering and Science, Edith Cowan University, 2013.

[76] J. C. Campbell, A. Hindle, and J. N. Amaral. Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models. In Proceedings of the 11th Working Conference on Mining Software Repositories, pages 252–261, New York, USA, 2014.

[77] D. G. Cantor. On The Ambiguity Problem of Backus Systems. Journal of the ACM, 9(4):477–479, 1962.

[78] A. Carbone, J. Hurst, I. Mitchell, and D. Gunstone. An Exploration of Internal Factors Influencing Student Learning of Programming. In Proceedings of the Eleventh Australasian Conference on Computing Education, volume 95, pages 25–34, Darlinghurst, Australia, 2009.

[79] M. C. Carlisle. Raptor: A Visual Programming Environment for Teaching Object-oriented Programming. Journal of Computing Sciences in Colleges, 24(4):275–281, 2009.

[80] M. C. Carlisle, T. A. Wilson, J. W. Humphries, and S. M. Hadfield. RAPTOR: A Visual Programming Environment for Teaching Algorithmic Problem Solving. ACM SIGCSE Bulletin, 37(1):176–180, 2005.

[81] J. Carter and T. Jenkins. Gender and Programming: What’s Going on? ACM SIGCSE Bulletin, 31(3):1–4, 1999.

[82] J. Carter, D. Bouvier, R. Cardell-Oliver, M. Hamilton, S. Kurkovsky, S. Markham, O. W. McClung, R. McDermott, C. Riedesel, J. Shi, and S. White. Motivating All Our Students? In Proceedings of the 16th Annual Conference Reports on Innovation and Technology in Computer Science Education - Working Group Reports, pages 1–18, New York, USA, 2011.

[83] J. C. Carver, L. Henderson, L. He, J. Hodges, and D. Reese. Increased Retention of Early Computer Science and Software Engineering Students Using Pair Programming. In 20th Conference on Software Engineering Education Training, pages 115–122, 2007.

[84] M. E. Caspersen, K. D. Larsen, and J. Bennedsen. Mental Models and Programming Aptitude. ACM SIGCSE Bulletin, 39(3):206–210, 2007.

[85] J. M. Champarnaud, J. L. Ponty, and D. Ziadi. From regular expressions to finite automata. International Journal of Computer Mathematics, 72(4):415–431, 1999.

[86] P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. von Praun, and V. Sarkar. X10: An Object-oriented Approach to Non-uniform Cluster Computing. SIGPLAN Notices, 40(10):519–538, 2005.

[87] S. Chen and S. Morris. Iconic Programming for Flowcharts, Java, Turing, etc. ACM SIGCSE Bulletin, 37(3):104–107, 2005.

[88] Y. Cherenkova, D. Zingaro, and A. Petersen. Identifying Challenging CS1 Concepts in a Large Problem Dataset. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pages 695–700, New York, USA, 2014.

[89] C. Chiu and H. Huang. Guided Debugging Practices of Game Based Programming for Novice Programmers. International Journal of Information and Education Technology, 5(5):343–347, 2015.

[90] N. Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, 1956.

[91] N. Chomsky. Syntactic Structures. De Gruyter, Berlin, second edition, 2002.

[92] V. A. Cicirello. On Self-selected Pairing in CS1: Who Pairs with Whom? Journal of Computing Sciences in Colleges, 24(6):43–49, 2009.

[93] A. Clear and T. Clear. Introductory Programming and Educational Performance Indicators - a Mismatch. In Proceedings of ITx New Zealand’s Conference of IT, pages 123–128, 2014.

[94] D. C. Cliburn. Experiences with Pair Programming at a Small College. Journal of Computing Sciences in Colleges, 19(1):20–29, 2003.

[95] D. C. Cliburn. Student Opinions of Alice in CS1. In 38th Annual Frontiers in Education Conference, pages T3B–1–T3B–6, 2008.

[96] D. I. A. Cohen. Introduction to Computer Theory. John Wiley & Sons, USA, second edition, 1996.

[97] S. Coleman and E. Nichols. Enhancing student Engagement through Pair Programming. In 11th Annual Conference of the Subject Centre for Information and Computer Sciences, pages 47–51, 2010.

[98] A. Colmerauer and P. Roussel. The birth of Prolog. In History of programming languages II, pages 331–367. ACM, New York, USA, 1996.

[99] C. R. Cook. CS0: Computer Science Orientation Course. ACM SIGCSE Bulletin, 29(1):87–91, 1997.

[100] K. Cooper and L. Torczon. Engineering a Compiler. Morgan Kaufmann, USA, second edition, 2011.

[101] S. Cooper. The Design of Alice. ACM Transactions on Computing Education, 10(4):15:1–15:16, 2010.

[102] S. Cooper and W. Dann. Programming: A Key Component of Computational Thinking in CS Courses for Non-majors. ACM Inroads, 6(1):50–54, 2015.

[103] C. J. Costa and M. Aparicio. Evaluating Success of a Programming Learning Tool. In Proceedings of the International Conference on Information Systems and Design of Communication, pages 73–78, New York, USA, 2014.

[104] C. J. Costa, M. Aparicio, and C. Cordeiro. A Solution to Support Student Learning of Programming. In Proceedings of the Workshop on Open Source and Design of Communication, pages 25–29, New York, USA, 2012.

[105] P. Crescenzi and C. Nocentini. Fully Integrating Algorithm Visualization into a CS2 Course: A Two-year Experience. ACM SIGCSE Bulletin, 39(3):296–300, 2007.

[106] P. Crescenzi, C. Demetrescu, I. Finocchi, and R. Petreschi. Reversible Execution and Visualization of Programs with LEONARDO. Journal of Visual Languages & Computing, 11(2):125–150, 2000.

[107] J. H. Cross, D. Hendrix, and D. A. Umphress. JGRASP: An Integrated Development Environment with Visualizations for Teaching Java in CS1, CS2, and Beyond. In Frontiers in Education, pages 1466–1467, 2004.

[108] S. Cruz, F. Q. B. D. Silva, and L. F. Capretz. Forty years of research on personality in software engineering: A mapping study. Computers in Human Behavior, 46:94–113, 2015.

[109] N. B. Dale. Most Difficult Topics in CS1: Results of an Online Survey of Educators. ACM SIGCSE Bulletin, 38(2):49–53, 2006.

[110] T. Daly. Minimizing to Maximize: An Initial Attempt at Teaching Introductory Programming Using Alice. Journal of Computing Sciences in Colleges, 26(5):23–30, 2011.

[111] G. Dancik and A. Kumar. A tutor for counter-controlled loop concepts and its evaluation. In Frontiers in Education, volume 1, pages T3C–7–T3C–12, 2003.

[112] W. Dann and S. Cooper. Education: Alice 3: Concrete to Abstract. Communications of the ACM, 52(8):27–29, 2009.

[113] E. Dantsin, T. Eiter, G. Gottlob, and A. Voronkov. Complexity and Expressive Power of Logic Programming. ACM Computing Surveys, 33(3):374–425, 2001.

[114] S. Davies, J. A. Polack-Wahl, and K. Anewalt. A Snapshot of Current Practices in Teaching the Introductory Programming Sequence. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, pages 625–630, New York, USA, 2011.

[115] E. de Jesus. Teaching Computer Programming with Structured Programming Language and Flowcharts. In Proceedings of the 2011 Workshop on Open Source and Design of Communication, pages 45–48, New York, USA, 2011.

[116] M. de Raadt, R. Watson, and M. Watson. Language trends in introductory programming courses. In Proceedings of Informing Science and IT Education Conference, pages 329–337, 2002.

[117] M. de Raadt, R. Watson, and M. Toleman. Introductory Programming: What’s Happening Today and Will There Be Any Students to Teach Tomorrow? In Proceedings of the Sixth Australasian Conference on Computing Education - Volume 30, pages 277–282, 2004.

[118] H. M. Deitel, P. Deitel, J. P. Liperi, and B. Wiedermann. Python How to Program. Prentice Hall, USA, first edition, 2002.

[119] S. Dekhane and X. Xu. Engaging Students in Computing Using GameSalad: A Pilot Study. Journal of Computing Sciences in Colleges, 28(2):117–123, 2012.

[120] P. Denny, A. Luxton-Reilly, E. Tempero, and J. Hendrickx. Understanding the Syntax Barrier for Novices. In Proceedings of the 16th Annual Joint Conference on Innovation and Technology in Computer Science Education, pages 208–212, New York, USA, 2011.

[121] P. Denny, A. Luxton-Reilly, and D. Carpenter. Enhancing Syntax Error Messages Appears Ineffectual. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education, pages 273–278, New York, USA, 2014.

[122] C. Dierbach, B. Taylor, H. Zhou, and I. Zimand. Experiences with a CS0 Course Targeted for CS1 Success. ACM SIGCSE Bulletin, 37(1):317–320, 2005.

[123] E. W. Dijkstra. On the foolishness of “natural language programming”. In Program Construction, volume 69 of Lecture Notes in Computer Science, pages 51–53. Springer Berlin Heidelberg, 1979.

[124] U. Dorji, P. Panjaburee, and N. Srisawasdi. A Learning Cycle Approach to Developing Educational Computer Game for Improving Students’ Learning and Awareness in Electric Energy Consumption and Conservation. Journal of Educational Technology & Society, 18(1):91–105, 2015.

[125] M. Dorling and D. White. Scratch: A Way to Logo and Python. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pages 191–196, New York, USA, 2015.

[126] J. P. Dougherty. Concept Visualization in CS0 Using ALICE. Journal of Computing Sciences in Colleges, 22(3):145–152, 2007.

[127] W. Du, M. Ozeki, H. Nomiya, K. Murata, and M. Araki. Pair Programming for Enhancing Communication in the Fundamental C Language Exercise. In IEEE 39th Annual Computer Software and Applications Conference, volume 3, pages 664–665, 2015.

[128] M. V. Dyne and J. Braun. Effectiveness of a Computational Thinking (CS0) Course on Student Analytical Skills. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pages 133–138, New York, USA, 2014.

[129] M. Eagle and T. Barnes. Wu’s Castle: Teaching Arrays and Loops in a Game. ACM SIGCSE Bulletin, 40(3):245–249, 2008.

[130] A. Eckerdal, R. McCartney, J. E. Moström, M. Ratcliffe, and C. Zander. Can Graduating Students Design Software Systems? ACM SIGCSE Bulletin, 38(1):403–407, 2006.

[131] A. Eckerdal, M. Laakso, M. Lopez, and A. Sarkar. Relationship Between Text and Action Conceptions of Programming: A Phenomenographic and Quantitative Perspective. In Proceedings of the 16th Annual Joint Conference on Innovation and Technology in Computer Science Education, pages 33–37, New York, USA, 2011.

[132] M. H. Egan and C. McDonald. Program Visualization and Explanation for Novice C Programmers. In Proceedings of the Sixteenth Australasian Computing Education Conference, volume 148, pages 51–57, Darlinghurst, Australia, 2014.

[133] Y. Erdogan, E. Aydin, and T. Kabaca. Identifying Predictors of Programming Achievement. In 6th WSEAS International Conference on EDUCATION and EDUCATIONAL TECHNOLOGY, pages 195–200, 2007.

[134] J. C. Ernest, A. S. Bowser, S. Ghule, S. Sudireddy, J. P. Porter, D. A. Talbert, and M. J. Kosa. Weathering MindStorms with Drizzle and DIODE in CS0. ACM SIGCSE Bulletin, 37(3):353–353, 2005.

[135] M. Esteves, B. Fonseca, L. Morgado, and P. Martins. Improving teaching and learning of computer programming through the use of the Second Life virtual world. British Journal of Educational Technology, 42(4):624–637, 2011.

[136] G. E. Evans and M. G. Simkin. What Best Predicts Computer Proficiency? Communications of the ACM, 32(11):1322–1327, 1989.

[137] S. Faja. Evaluating Effectiveness of Pair Programming as a Teaching Tool in Programming Courses. Information Systems Education Journal, 12(6):36–45, 2014.

[138] C. Farrell. Predicting (and Creating) Success in CS1. Issues in Information Systems, 7(1):259–263, 2006.

[139] R. Faux. Impact of Preprogramming Course Curriculum on Learning in the First Programming Course. IEEE Transactions on Education, 49(1):11–15, 2006.

[140] S. Federici. A Minimal, Extensible, Drag-and-drop Implementation of the C Programming Language. In Proceedings of the 2011 Conference on Information Technology Education, pages 191–196, New York, USA, 2011.

[141] G. Filé. Machines for Attribute Grammars. Information and Control, 69(13):41–124, 1986.

[142] A. E. Fischer and F. S. Grodzinsky. The Anatomy of Programming Languages. Prentice-Hall, Inc., Upper Saddle River, USA, 1993.

[143] W. T. Fitch. The Evolution of Language. Cambridge University Press, New York, USA, first edition, 2010.

[144] R. W. Floyd. On Ambiguity in Phrase Structure Languages. Communications of the ACM, 5(10):526, 1962.

[145] J. L. Ford. Scratch Programming for Teens. Cengage Learning, Canada, first edition, 2008.

[146] A. Forte and M. Guzdial. Computers for Communication, Not Calculation: Media as a Motivation and Context for Learning. In Proceedings of the 37th Annual Hawaii International Conference on System Sciences, pages 1–10, 2004.

[147] S. N. Freund and E. S. Roberts. Thetis: An ANSI C Programming Environment Designed for Introductory Use. ACM SIGCSE Bulletin, 28(1):300–304, 1996.

[148] J. E. F. Friedl. Mastering Regular Expressions. O’Reilly Media, USA, third edition, 2006.

[149] R. J. Gallant and Q. H. Mahmoud. Using Greenfoot and a Moon Scenario to Teach Java Programming in CS1. In Proceedings of the 46th Annual Southeast Regional Conference, pages 118–121, New York, USA, 2008.

[150] F. Gao. Language and power: Korean-Chinese students’ language attitude and practice. Journal of Multilingual and Multicultural Development, 30(6):525–534, 2009.

[151] G. García-Mateos and J. L. Fernández-Alemán. A Course on Algorithms and Data Structures Using On-line Judging. ACM SIGCSE Bulletin, 41(3):45–49, 2009.

[152] R. Garlick and E. C. Cankaya. Using Alice in CS1: A Quantitative Experiment. In Proceedings of the Fifteenth Annual Conference on Innovation and Technology in Computer Science Education, pages 165–168, New York, USA, 2010.

[153] V. Garneli, M. N. Giannakos, and K. Chorianopoulos. Computing education in k-12 schools: A review of the literature. In IEEE Global Engineering Education Conference, pages 543–551, 2015.

[154] T. Geng, F. Xu, H. Mei, W. Meng, Z. Chen, and C. Lai. A Practical GLR Parser Generator for Software Reverse Engineering. Journal of Networks, 9(3):769–776, 2014.

[155] S. Georgantaki and S. Retalis. Using Educational Tools for Teaching Object Oriented Design and Programming. Journal of Information Technology Impact, 7(2):111–130, 2007.

[156] C. Ghezzi and M. Jazayeri. Programming Language Concepts. John Wiley and Sons, New York, USA, third edition, 1997.

[157] A. R. M. Gobil, Z. Shukor, and I. A. Mohtar. Novice Difficulties in Selection Structure. In International Conference on Electrical Engineering and Informatics, volume 2, pages 351–356, 2009.

[158] B. Goldberg. Functional Programming Languages. ACM Computing Surveys, 28(1):249–251, 1996.

[159] Y. Gong, Q. Liu, X. Shao, C. Pan, and H. Jiao. A novel regular expression algorithm based on multi-dimensional finite automata. In IEEE 15th International Conference on High Performance Switching and Routing, pages 90–97, 2014.

[160] A. Goold and R. Rimmer. Indicators of Performance in First-year Computing. In 23rd Australasian Computer Science Conference, pages 74–80, 2000.

[161] L. Goosen. A Brief History of Choosing First Programming Languages. In History of Computing and Education 3, volume 269, pages 167–170. Springer US, 2008.

[162] B. S. Gottfried. Schaum’s Outline of Programming with C. McGraw-Hill, USA, second edition, 1996.

[163] D. Grune and C. J. H. Jacobs. Parsing Techniques: A Practical Guide (Monographs in Computer Science). Springer, second edition, 2007.

[164] D. Gudmundsen, L. Olivieri, and N. Sarawagi. Using Visual Logic: Three Different Approaches in Different Courses - General Education, CS0, and CS1. Journal of Computing Sciences in Colleges, 26(6):23–29, 2011.

[165] D. Gupta. What is a Good First Programming Language? Crossroads, 10(4):7–7, 2004.

[166] A. P. Gutierrez and G. Schraw. Effects of Strategy Training and Incentives on Students’ Performance, Confidence, and Calibration. The Journal of Experimental Education, 83(3):386–404, 2015.

[167] M. Guzdial. A Media Computation Course for Non-majors. ACM SIGCSE Bulletin - Proceedings of the 8th annual conference on Innovation and technology in computer science education, 35(3):104–108, 2003.

[168] S. Hadjerrouit. Java as First Programming Language: A Critical Evaluation. ACM SIGCSE Bulletin, 30(2):43–47, 1998.

[169] D. Hagan and S. Markham. Does It Help to Have Some Programming Experience Before Beginning a Computing Degree Program? ACM SIGCSE Bulletin, 32(3):25–28, 2000.

[170] M. S. Hall. Raptor: Nifty tools. Journal of Computing Sciences in Colleges, 23(1):110–111, 2007.

[171] B. Hammo, H. Abu-Salem, and S. Lytinen. QARAB: A Question Answering System to Support the Arabic Language. In Proceedings of the ACL-02 Workshop on Computational Approaches to Semitic Languages, pages 1–11, Stroudsburg, USA, 2002.

[172] F. Han and S. Zhu. Bottom-Up/Top-Down Image Parsing with Attribute Grammar. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):59–73, 2009.

[173] B. Hanks, C. McDowell, D. Draper, and M. Krnjajic. Program Quality with Pair Programming in CS1. ACM SIGCSE Bulletin, 36(3):176–180, 2004.

[174] B. Hanks, L. Murphy, B. Simon, R. McCauley, and C. Zander. CS1 Students Speak: Advice for Students by Students. ACM SIGCSE Bulletin, 41(1):19–23, 2009.

[175] B. Hanks, S. Fitzgerald, R. McCauley, L. Murphy, and C. Zander. Pair programming in education: a literature review. Computer Science Education, 21(2):135–173, 2011.

[176] M. D. Harris. Introduction to Natural Language Processing. Reston Publishing Company, Inc. A Prentice-Hall Reston, Virginia, 1985.

[177] M. Haungs, C. Clark, J. Clements, and D. Janzen. Improving First-year Success and Retention Through Interest-based CS0 Courses. In Proceedings of the 43rd ACM Technical Symposium on Computer Science Education, pages 589–594, New York, USA, 2012.

[178] B. Hawkins, B. Demsky, D. Bruening, and Q. Zhao. Optimizing Binary Translation of Dynamically Generated Code. In Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pages 68–78, Washington, USA, 2015.

[179] D. Hemmendinger. Syntax, Semantics, and Pragmatics. In Encyclopedia of Computer Science, pages 1737–1738. John Wiley and Sons Ltd, Chichester, UK, 2003.

[180] P. Henriksen and M. Kölling. Greenfoot: Combining Object Visualisation with Interaction. In Companion to the 19th Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications, pages 73–82, New York, USA, 2004.

[181] C. W. Herbert. An Introduction to Programming Using Alice 2.2. Cengage Learning, USA, second edition, 2010.

[182] C. C. Hernandez, L. Silva, R. A. Segura, J. Schimiguel, M. F. P. Ledn, L. N. M. Bezerra, and I. F. Silveira. Teaching Programming Principles through a Game Engine. CLEI Electronic Journal, 13(2):1–8, 2010.

[183] R. Hijon-Neira, Á. Velázquez-Iturbide, C. Pizarro-Romero, and L. Carriço. Game Programming for Improving Learning Experience. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education, pages 225–230, New York, USA, 2014.

[184] R. Hijon-Neira, Á. Velázquez-Iturbide, C. Pizarro-Romero, and L. Carriço. Serious Games for Motivating into Programming. In IEEE Frontiers in Education Conference, pages 1–8, 2014.

[185] J. Hill, B. Houle, S. M. Merritt, and A. Stix. Applying Abstraction to Master Complexity. In Proceedings of the 2nd International Workshop on The Role of Abstraction in Software Engineering, pages 15–21, New York, USA, 2008.

[186] L. Hirschman and R. Gaizauskas. Natural Language Question Answering: The View from Here. Natural Language Engineering, 7:275–300, 2001.

[187] C. E. Hmelo-Silver. Problem-Based Learning: What and How Do Students Learn? Educational Psychology Review, 16:235–266, 2004.

[188] E. Holden and E. Weeden. The Impact of Prior Experience in an Information Technology Programming Course Sequence. In Proceedings of the 4th Conference on Information Technology Curriculum, pages 41–46, New York, USA, 2003.

[189] E. Holden and E. Weeden. The Experience Factor in Early Programming Education. In Proceedings of the 5th Conference on Information Technology Education, pages 211–218, New York, USA, 2004.

[190] J. Holmes. Object oriented Compiler Construction. Prentice Hall, Inc., New Jersey, USA, first edition, 1994.

[191] A. I. Holub. Compiler Design in C. Prentice Hall Software Series, New Jersey, USA, 1990.

[192] M. Homer and J. Noble. Combining Tiled and Textual Views of Code. In Second IEEE Working Conference on Software Visualization, pages 1–10, 2014.

[193] M. Homer, T. Jones, J. Noble, K. B. Bruce, and A. P. Black. Graceful dialects. In Object-Oriented Programming, volume 8586 of Lecture Notes in Computer Science, pages 131–156. Springer Berlin Heidelberg, 2014.

[194] W. L. Honig. Teaching and Assessing Programming Fundamentals for Non Majors with Visual Programming. In Proceedings of the 18th ACM Conference on Innovation and Technology in Computer Science Education, pages 40–45, New York, USA, 2013.

[195] D. Hooshyar, R. B. Ahmad, R. G. Raj, M. H. N. M. Nasir, M. Yousefi, S. Horng, and J. Rugelj. A Flowchart-Based Multi-Agent System For Assisting Novice Programmers with Problem Solving Activities. Malaysian Journal of Computer Science, 28:131–151, 2015.

[196] D. Hooshyar, R. B. Ahmad, S. Shamshirband, M. Yousefi, and S. Horng. A Flowchart-based Programming Environment for Improving Problem Solving Skills of CS minors in Computer Programming. The Asian International Journal of Life Sciences, 24(2):629–646, 2015.

[197] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, USA, 2000.

[198] I. Horton. Beginning C: From Novice to Professional (Beginning: from Novice to Professional). Apress, USA, fourth edition, 2007.

[199] R. J. W. Housden. Further thoughts on SNAP. Computers and the Humanities, 7(6):407–412, 1973.

[200] K. Howell. First Computer Languages. Journal of Computing Sciences in Colleges, 18(4):317–331, 2003.

[201] J. Hromkovič, S. Seibert, and T. Wilke. Translating Regular expressions into Small ε-Free Nondeterministic Finite Automata. Journal of Computer and System Sciences, 62(4):565–588, 2001.

[202] M. Hu, M. Winikoff, and S. Cranefield. Teaching Novice Programming Using Goals and Plans in a Visual Notation. In Proceedings of the Fourteenth Australasian Computing Education Conference - CRPIT Volume 123, pages 43–52, 2012.

[203] H. Hulkko and P. Abrahamsson. A Multiple Case Study on the Impact of Pair Programming on Product Quality. In Proceedings of the 27th International Conference on Software Engineering, pages 495–504, New York, USA, 2005.

[204] C. D. Hundhausen and J. L. Brown. What You See Is What You Code: A live algorithm development and visualization environment for novice learners. Journal of Visual Languages & Computing, 18(1):22–47, 2007.

[205] B. R. Hunt, R. L. Lipsman, and J. M. Rosenberg. A Guide to MATLAB: For Beginners and Experienced Users. Cambridge University Press, New York, USA, 2001.

[206] M. H. Hwang, H. C. Choi, A. Lee, J. D. Culver, and B. Hutchison. The Relationship Between Self-Efficacy and Academic Achievement: A 5-Year Panel Analysis. The Asia-Pacific Education Researcher, pages 1–10, 2015.

[207] L. Ilie and S. Yu. Constructing NFAs by Optimal Use of Positions in Regular Expressions. In Combinatorial Pattern Matching, volume 2373 of Lecture Notes in Computer Science, pages 279–288. 2002.

[208] V. Isomöttönen, A. Lakanen, and V. Lappalainen. K-12 Game Programming Course Concept Using Textual Programming. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, pages 459–464, New York, USA, 2011.

[209] M. Ivanović, Z. Budimac, and D. Paunić. Educational Influences of Choice of First Programming Language. In Proceedings of the international conference on Numerical Analysis and Applied Mathematics, volume 1648, pages 310010–1–310010–4, 2015.

[210] J. Jablonowsk. A Case Study in Introductory Programming. In Proceedings of the International Conference on Computer Systems and Technologies, pages 82:1–82:7, New York, USA, 2007.

[211] P. Jalote. An Integrated Approach to Software Engineering (Texts in Computer Science). Springer, USA, third edition, 2005.

[212] K. Johnsgard and J. McDonald. Using Alice in Overview Courses to Improve Success Rates in Programming I. In IEEE 21st Conference on Software Engineering Education and Training, pages 129–136, 2008.

[213] S. C. Johnson. Yacc: Yet Another Compiler-Compiler. Technical report, Bell Laboratories Murray Hill, USA, 1975.

[214] A. Johnstone and E. Scott. Rdp-an Iterator-based Recursive Descent Parser Generator with Tree Promotion Operators. ACM SIGPLAN Notices, 33(9):87–94, 1998.

[215] T. Jordine, Y. Liang, and E. Ihler. A mobile-device based serious gaming approach for teaching and learning Java programming. In IEEE Frontiers in Education Conference, pages 1–5, 2014.

[216] D. Kafura, A. C. Bart, and B. Chowdhury. Design and Preliminary Results From a Computational Thinking Course. In Proceedings of the 2015 ACM Conference on Innovation and Technology in Computer Science Education, pages 63–68, New York, USA, 2015.

[217] O. G. Kakde. Algorithms for Compiler Design. Charles River Media, Inc., Rockland, USA, first edition, 2002.

[218] R. M. Kaplan. Choosing a First Programming Language. In Proceedings of the 2010 ACM Conference on Information Technology Education, pages 163–164, New York, USA, 2010.

[219] M. Karakus, S. Uludag, E. Guler, S. W. Turner, and A. Ugur. Teaching Computing and Programming Fundamentals via App Inventor for Android. In International Conference on Information Technology Based Higher Education and Training, pages 1–8, 2012.

[220] V. Karavirta, R. Haavisto, E. Kaila, M. Laakso, T. Rajala, and T. Salakoski. Interactive Learning Content for Introductory Computer Science Course Using the ViLLE Exercise Framework. In International Conference on Learning and Teaching in Computing and Engineering, pages 9–16, 2015.

[221] N. Katira, L. Williams, E. Wiebe, C. Miller, S. Balik, and E. Gehringer. On Understanding Compatibility of Student Pair Programmers. ACM SIGCSE Bulletin, 36(1):7–11, 2004.

[222] N. Katira, L. Williams, and J. Osborne. Towards Increasing the Compatibility of Student Pair Programmers. In Proceedings of the 27th International Conference on Software Engineering, pages 625–626, New York, USA, 2005.

[223] B. Katz, G. Borchardt, and S. Felshin. Natural Language Annotations for Question Answering. In Proceedings of the Nineteenth International Florida Artificial Intelligence Research Society Conference, pages 303–306, Menlo Park, USA, 2006.

[224] C. Kehoe, J. Stasko, and A. Taylor. Rethinking the evaluation of algorithm animations as learning aids: an observational study. International Journal of Human-Computer Studies, 54(2):265–284, 2001.

[225] C. Kelleher and R. Pausch. Lowering the Barriers to Programming: A Taxonomy of Programming Environments and Languages for Novice Programmers. ACM Computing Surveys, 37(2):83–137, 2005.

[226] C. M. Kelty. Logical Instruments: Regular Expressions, AI and thinking about thinking. In The Search for a Theory of Cognition: Early Mechanisms and New Ideas, chapter 9, pages 244–279. Rodopi, Netherlands, 2011.

[227] B. W. Kernighan. RATFOR - A Preprocessor for a Rational Fortran. Software: Practice and Experience, 5(4):395–406, 1975.

[228] B. W. Kernighan and D. M. Ritchie. The C Programming Language. Prentice Hall, USA, second edition, 1998.

[229] P. Kinnunen and L. Malmi. Why Students Drop out CS1 Course? In Proceedings of the Second International Workshop on Computing Education Research, pages 97–108, New York, USA, 2006.

[230] P. Kinnunen, R. McCartney, L. Murphy, and L. Thomas. Through the Eyes of Instructors: A Phenomenographic Investigation of Student Success. In Proceedings of the Third International Workshop on Computing Education Research, pages 61–72, New York, USA, 2007.

[231] M. Klassen. Visual Approach for Teaching Programming Concepts. In Proceedings of the 9th International Conference on Engineering Education, pages M4H–5–M4H–10, 2006.

[232] K. Klement. Russell’s 1903 - 1905 Anticipation of the Lambda Calculus. History and Philosophy of Logic, 24(1):15–37, 2003.

[233] A. Knight. Basics of MATLAB and Beyond. Chapman & Hall/CRC, USA, 2010.

[234] R. Knöll and M. Mezini. Pegasus: First Steps Toward a Naturalistic Programming Language. In Companion to the 21st ACM SIGPLAN Symposium on Object-oriented Programming Systems, Languages, and Applications, pages 542–559, New York, USA, 2006.

[235] R. Knöll, V. Gasiunas, and M. Mezini. Naturalistic Types. In Proceedings of the 10th SIGPLAN Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pages 33–48, New York, USA, 2011.

[236] J. Knudsen. The Unofficial Guide to LEGO MINDSTORMS Robots. O’Reilly & Associates, Inc., USA, first edition, 1999.

[237] D. E. Knuth. Semantics of context-free languages. Mathematical systems theory, 2(2):127–145, 1968.

[238] W. Kobsiripat. Effects of the Media to Promote the Scratch Programming Capabilities Creativity of Elementary School Students. Procedia - Social and Behavioral Sciences, 174(0):227–232, 2015.

[239] C. Koch and S. Scherzinger. Attribute grammars for scalable query processing on XML streams. The VLDB Journal, 16(3):317–342, 2007.

[240] M. Kölling. Using BlueJ to Introduce Programming. In Reflections on the Teaching of Programming, volume 4821 of Lecture Notes in Computer Science, pages 98–115. Springer Berlin Heidelberg, 2008.

[241] M. Kölling. Introduction to Programming with Greenfoot: Object-Oriented Programming in Java with Games and Simulations. Pearson Higher Education, USA, first edition, 2009.

[242] M. Kölling. The Greenfoot Programming Environment. ACM Transactions on Computing Education, 10(4):14:1–14:21, 2010.

[243] M. Kölling, B. Quig, A. Patterson, and J. Rosenberg. The BlueJ System and its Pedagogy. Computer Science Education, 13(4):249–268, 2003.

[244] M. Konecki. Introductory Programming Education for Visually Impaired. International Journal of Research in Engineering and Technology, 3(17): 65–70, 2014.

[245] A. Korhonen, J. Helminen, V. Karavirta, and O. Seppälä. TRAKLA2. In 9th Koli Calling International Conference on Computing Education Research, pages 43–46, 2003.

[246] S. Kouznetsova. Using BlueJ and Blackjack to Teach Object-oriented Design Concepts in CS1. Journal of Computing Sciences in Colleges, 22 (4):49–55, 2007.

[247] J. Kramer. Is Abstraction the Key to Computing? Communications of the ACM, 50(4):36–42, 2007.

[248] D. Krpan, S. Mladenović, and M. Rosić. Undergraduate Programming Courses, Students Perception and Success. Procedia - Social and Behavioral Sciences, 174:3868–3872, 2015.

[249] M. Kuittinen and J. Sajaniemi. Teaching Roles of Variables in Elementary Programming Courses. ACM SIGCSE Bulletin, 36(3):57–61, 2004.

[250] V. Kumar. MIX10: Compiling MATLAB to X10 for High Performance. Master’s thesis, School of Computer Science McGill University, Montreal, 2014.

[251] V. Kumar and L. Hendren. MIX10: Compiling MATLAB to X10 for High Performance. ACM SIGPLAN Notices, 49(10):617–636, 2014.

[252] S. K. Kummerfeld and J. Kay. The Neglected Battle Fields of Syntax Errors. In Proceedings of the Fifth Australasian Conference on Computing Education, volume 20, pages 105–111, Darlinghurst, Australia, 2003.

[253] S. Kurkovsky. Mobile Game Development: Improving Student Engagement and Motivation in Introductory Computing Courses. Computer Science Education, 23(2):138–157, 2013.

[254] M. Laakso, E. Kaila, T. Rajala, and T. Salakoski. Define and Visualize Your First Programming Language. In Eighth IEEE International Conference on Advanced Learning Technologies, 2008, pages 324–326, 2008.

[255] M. J. Laakso, T. Rajala, E. Kaila, and T. Salakoski. The Impact of Prior Experience in Using a Visualization Tool on Learning to Program. In Cognition and Exploratory Learning in Digital Age, pages 129–136, 2008.

[256] E. Lahtinen. Students’ Individual Differences in Using Visualizations: Prospects of Future Research on Program Visualizations. In Proceedings of the 8th International Conference on Computing Education Research, pages 92–95, New York, USA, 2008.

[257] E. Lahtinen and T. Ahoniemi. Annotations for Defining Interactive Instructions to Interpreter Based Program Visualization Tools. Electronic Notes in Theoretical Computer Science, 178:121–128, 2007.

[258] E. Lahtinen, K. Ala-Mutka, and H. Järvinen. A Study of the Difficulties of Novice Programmers. ACM SIGCSE Bulletin, 37(3):14–18, 2005.

[259] E. Lahtinen, T. Ahoniemi, and A. Salo. Effectiveness of Integrating Program Visualizations to a Programming Course. In Proceedings of the Seventh Baltic Sea Conference on Computing Education Research - Volume 88, pages 195–198, Darlinghurst, Australia, 2007.

[260] S. Landau and B. S. Everitt. A Handbook of Statistical Analyses using SPSS. Chapman and Hall/CRC, USA, first edition, 2003.

[261] A. V. D. A. Leal and D. J. Ferreira. Teaching Computer Programming Based on Patterns with Activities and Collaborative Games Using Concrete Materials for High School Students. In IEEE Frontiers in Education Conference, pages 1604–1610, 2013.

[262] O. Lebedeva and L. Zaitseva. Question Answering Systems in Education and their Classifications. In Joint International Conference on Engineering Education & International Conference on Information Technology, pages 359–366, Riga, Latvia, 2014.

[263] S. Letovsky and E. Soloway. Delocalized plans and program comprehension. IEEE Software, 3(3):41–49, 1986.

[264] S. Leutenegger and J. Edgington. A Games First Approach to Teaching Introductory Programming. ACM SIGCSE Bulletin, 39(1):115–118, 2007.

[265] J. R. Levine. flex & bison. O’Reilly Media, Sebastopol, USA, first edition, 2009.

[266] J. R. Levine, T. Mason, and D. Brown. lex & yacc. O’Reilly Media, Sebastopol, USA, first edition, 1990.

[267] R. B. Levy, M. Ben-Ari, and P. A. Uronen. The Jeliot 2000 program animation system. Computers & Education, 40(1):1–15, 2003.

[268] C. M. Lewis. How Programming Environment Shapes Perception, Learning and Goals: Logo vs. Scratch. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pages 346–350, New York, USA, 2010.

[269] Y. Li, H. Yang, and H. V. Jagadish. Constructing a Generic Natural Language Interface for an XML Database. In Advances in Database Technology, volume 3896 of Lecture Notes in Computer Science, pages 737–754. Springer Berlin Heidelberg, 2006.

[270] H. Lieberman and H. Liu. Feasibility Studies for Programming in Natural Language. In End User Development, volume 9 of Human-Computer Interaction Series, pages 459–473. Springer Netherlands, 2006.

[271] N. K. Lincoln and S. M. Veres. Natural Language Programming of Complex Robotic BDI Agents. Journal of Intelligent & Robotic Systems, 71(2):211–230, 2013.

[272] P. Linz. An introduction to formal languages and automata. Jones & Bartlett Publishers, USA, third edition, 2000.

[273] R. Lister, E. S. Adams, S. Fitzgerald, W. Fone, J. Hamer, M. Lindholm, R. McCartney, J. E. Moström, K. Sanders, O. Seppälä, B. Simon, and L. Thomas. A Multi-national Study of Reading and Tracing Skills in Novice Programmers. ACM SIGCSE Bulletin, 36(4):119–150, 2004.

[274] H. Liu and H. Lieberman. Toward a Programmatic Semantics of Natural Language. In IEEE Symposium on Visual Languages and Human Centric Computing, pages 281–282, 2004.

[275] H. Liu and H. Lieberman. Metafor: Visualizing Stories as Code. In Proceedings of the 10th International Conference on Intelligent User Interfaces, pages 305–307, New York, USA, 2005.

[276] H. Liu and H. Lieberman. Programmatic Semantics for Natural Language Interfaces. In Extended Abstracts on Human Factors in Computing Systems, pages 1597–1600, New York, USA, 2005.

[277] C. Lo, Y. Lin, and C. Wu. Which Programming Language Should Students Learn First? A Comparison of Java and Python. In International Conference on Learning and Teaching in Computing and Engineering, pages 225–226, 2015.

[278] C. Loftus, L. Thomas, and C. Zander. Can Graduating Students Design: Revisited. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, pages 105–110, New York, USA, 2011.

[279] C. V. Lopes, P. Dourish, D. H. Lorenz, and K. Lieberherr. Beyond AOP: Toward Naturalistic Programming. ACM SIGPLAN Notices, 38(12):34–43, 2003.

[280] V. Lopez, M. Pasin, and E. Motta. AquaLog: An Ontology-Portable Question Answering System for the Semantic Web. In The Semantic Web: Research and Applications, volume 3532 of Lecture Notes in Computer Science, pages 546–562. 2005.

[281] K. C. Louden. Compiler Construction: Principles and Practice. PWS Publishing Company, Boston, USA, 1997.

[282] K. C. Louden and K. A. Lambert. Programming Languages: Principles and Practice. Cengage Learning, USA, third edition, 2011.

[283] P. A. Luker. Never Mind the Language, What About the Paradigm? ACM SIGCSE Bulletin, 21(1):252–256, 1989.

[284] A. Luxton-Reilly and P. Denny. A Simple Framework for Interactive Games in CS1. ACM SIGCSE Bulletin, 41(1):216–220, 2009.

[285] R. Machado. An Introduction to Lambda Calculus and Functional Programming. In 2nd Workshop-School on Theoretical Computer Science, pages 26–33, 2013.

[286] D. J. Malan and H. H. Leitner. Scratch for Budding Computer Scientists. ACM SIGCSE Bulletin, 39(1):223–227, 2007.

[287] Y. Malhotra, D. F. Galletta, and L. J. Kirsch. How Endogenous Motivations Influence User Intentions: Beyond the Dichotomy of Extrinsic and Intrinsic User Motivations. Journal of Management Information Systems, 25(1):267–300, 2008.

[288] J. Maloney, L. Burd, Y. Kafai, N. Rusk, B. Silverman, and M. Resnick. Scratch: a Sneak Preview [education]. In Proceedings of the Second International Conference on Creating, Connecting and Collaborating through Computing, 2004, pages 104–109, 2004.

[289] J. Maloney, M. Resnick, N. Rusk, B. Silverman, and E. Eastmond. The Scratch Programming Language and Environment. ACM Transactions on Computing Education, 10(4):16:1–16:15, 2010.

[290] J. H. Maloney, K. Peppler, Y. Kafai, M. Resnick, and N. Rusk. Programming by Choice: Urban Youth Learning Programming with Scratch. ACM SIGCSE Bulletin, 40(1):367–371, 2008.

[291] L. Mannila and M. de Raadt. An Objective Comparison of Languages for Teaching Introductory Programming. In Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling 2006, pages 32–37, New York, USA, 2006.

[292] G. Marceau, K. Fisler, and S. Krishnamurthi. Measuring the Effectiveness of Error Messages Designed for Novice Programmers. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, pages 499–504, New York, USA, 2011.

[293] G. Marceau, K. Fisler, and S. Krishnamurthi. Mind Your Language: On Novices’ Interactions with Error Messages. In Proceedings of the 10th SIGPLAN Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pages 3–18, New York, USA, 2011.

[294] S. Markstrum, R. M. Fuhrer, and T. Millstein. Towards Concurrency Refactoring for x10. SIGPLAN Notices, 44(4):303–304, 2009.

[295] A. Marron, G. Weiss, and G. Wiener. A Decentralized Approach for Programming Interactive Applications with JavaScript and Blockly. In Proceedings of the 2nd Edition on Programming Systems, Languages and Applications Based on Actors, Agents, and Decentralized Control Abstractions, pages 59–70, New York, USA, 2012.

[296] J. C. Martin. Introduction to Languages and The Theory of Computation. The McGraw-Hill Companies, New York, fourth edition, 2010.

[297] N. L. Martin. IT0: Discrete Math and Programming Logic Topics as a Hybrid Alternative to CS0. Information Systems Education Journal, 30 (1):30–44, 2015.

[298] R. Mason and G. Cooper. Introductory Programming Courses in Australia and New Zealand in 2013 - Trends and Reasons. In Proceedings of the Sixteenth Australasian Computing Education Conference - Volume 148, pages 139–147, Darlinghurst, Australia, 2014.

[299] R. Mason, G. Cooper, and M. de Raadt. Trends in Introductory Programming Courses in Australian Universities: Languages, Environments and Pedagogy. In Proceedings of the Fourteenth Australasian Computing Education Conference - Volume 123, pages 33–42, 2012.

[300] Y. Matsuzawa, T. Ohata, M. Sugiura, and S. Sakai. Language Migration in non-CS Introductory Programming Through Mutual Language Translation Environment. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pages 185–190, New York, USA, 2015.

[301] R. McCauley. A Bounty of Accessible Language Translation Tools. ACM SIGCSE Bulletin, 33(2):14–15, 2001.

[302] M. McCracken, V. Almstrum, D. Diaz, M. Guzdial, D. Hagan, Y. B. Kolikant, C. Laxer, L. Thomas, I. Utting, and T. Wilusz. A Multi-national, Multi-institutional Study of Assessment of Programming Skills of First-year CS Students. ACM SIGCSE Bulletin, 33(4):125–180, 2001.

[303] C. McDowell, L. Werner, H. E. Bullock, and J. Fernald. The Impact of Pair Programming on Student Performance, Perception and Persistence. In Proceedings of the 25th International Conference on Software Engineering, pages 602–607, Washington, USA, 2003.

[304] R. L. McFall and M. DeJongh. Increasing Engagement and Enrollment in Breadth-first Introductory Courses Using Authentic Computing Tasks. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, pages 429–434, New York, USA, 2011.

[305] L. McIver and D. Conway. GRAIL: A Zeroth programming language. In Proceedings of Seventh International Conference on Computing in Education, pages 43–50, 1999.

[306] S. McPeak and G. C. Necula. Elkhound: A Fast, Practical GLR Parser Generator. In Compiler Construction, volume 2985 of Lecture Notes in Computer Science, pages 73–88. Springer Berlin Heidelberg, 2004.

[307] W. I. McWhorter and B. C. O’Connor. Do LEGO Mindstorms Motivate Students in CS1? ACM SIGCSE Bulletin, 41(1):438–442, 2009.

[308] J. Mead, S. Gray, J. Hamer, R. James, J. Sorva, C. S. Clair, and L. Thomas. A Cognitive Approach to Identifying Measurable Milestones for Programming Skill Acquisition. ACM SIGCSE Bulletin, 38(4):182–194, 2006.

[309] A. Meduna. Elements of Compiler Design. Auerbach Publications, Boca Raton, 2007.

[310] O. Meerbaum-Salant, M. Armoni, and M. Ben-Ari. Habits of Programming in Scratch. In Proceedings of the 16th Annual Joint Conference on Innovation and Technology in Computer Science Education, pages 168–172, New York, USA, 2011.

[311] O. Meerbaum-Salant, M. Armoni, and M. Ben-Ari. Learning Computer Science Concepts with Scratch. Computer Science Education, 23(3):239–264, 2013.

[312] A. J. Mendes and M. J. Marcelino. Tools to support initial programming learning. In International Conference on Computer Systems and Technologies, pages IV.16–1–IV.16–6, 2006.

[313] J. J. G. V. Merrienboer and J. Sweller. Cognitive load theory and complex learning: Recent developments and future directions. Educational Psychology Review, 17(2):147–177, 2005.

[314] B. Meyer. The Outside-In Method of Teaching Introductory Programming. In Perspectives of System Informatics, volume 2890 of Lecture Notes in Computer Science, pages 66–78. Springer Berlin Heidelberg, 2003.

[315] R. M. Meyer and D. T. Burhans. Robotran: A Programming Environment for Novices Using LEGO Mindstorms Robots. In Proceedings of the Twentieth International Florida Artificial Intelligence Research Society Conference, pages 321–326, USA, 2007.

[316] M. M. Mhashi and A. M. Alakeel. Difficulties Facing Students in Learning Computer Programming Skills at Tabuk University. In Proceedings of the 12th International Conference on Education and Educational Technology, pages 15–23, Morioka, Japan, 2013.

[317] A. Middelkoop, A. Dijkstra, and S. D. Swierstra. Iterative Type Inference with Attribute Grammars. SIGPLAN Notices, 46(2):43–52, 2010.

[318] D. Middleton. Trying to Teach Problem-solving Instead of Just Assigning It: Some Practical Issues. Journal of Computing Sciences in Colleges, 27 (5):60–65, 2012.

[319] I. Milne and G. Rowe. Difficulties in Learning and Teaching Programming-Views of Students and Tutors. Education and Information Technologies, 7(1):55–66, 2002.

[320] J. Milthorpe, V. Ganesh, A. P. Rendell, and D. Grove. X10 as a Parallel Language for Scientific Computation: Practice and Experience. In IEEE International Parallel Distributed Processing Symposium, pages 1080–1088, 2011.

[321] K. L. P. Mishra and N. Chandrasekaran. Theory of Computer Science Automata, Languages and Computation. Prentice Hall, New Delhi, India, third edition, 2008.

[322] S. Mishra, S. Balan, S. Iyer, and S. Murthy. Effect of a 2-week Scratch Intervention in CS1 on Learners with Varying Prior Knowledge. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education, pages 45–50, New York, USA, 2014.

[323] J. C. Mitchell. Concepts In Programming Languages. Cambridge University Press, Cambridge, England, 2004.

[324] W. Mitchell. Another Look at CS0. Journal of Computing Sciences in Colleges, 17(1):194–205, 2001.

[325] K. Mobley and S. Fisher. Ditching the Desks: Kinesthetic Learning in College Classrooms. The Social Studies, 105(6):301–309, 2014.

[326] S. Montero, P. Díaz, D. Díez, and I. Aedo. Dual instructional support materials for introductory object-oriented programming: Classes vs. objects. In IEEE Education Engineering (EDUCON), pages 1929–1934, 2010.

[327] J. Moons and C. De Backer. Rationale Behind the Design of the EduVisor Software Visualization Component. Electronic Notes in Theoretical Computer Science, 224:57–65, 2009.

[328] A. Moreno and M. S. Joy. Jeliot 3 in a Demanding Educational Setting. Electronic Notes in Theoretical Computer Science, 178:51–59, 2007.

[329] A. Moreno, N. Myller, E. Sutinen, and M. Ben-Ari. Visualizing Programs with Jeliot 3. In Proceedings of the Working Conference on Advanced Visual Interfaces, pages 373–376, New York, USA, 2004.

[330] A. Moreno, E. Sutinen, and M. Joy. Defining and Evaluating Conflictive Animations for Programming Education: The Case of Jeliot ConAn. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pages 629–634, New York, USA, 2014.

[331] J. Moreno and G. Robles. Automatic detection of bad programming habits in scratch: A preliminary study. In IEEE Frontiers in Education Conference, pages 1–4, 2014.

[332] R. Moser. A Fantasy Adventure Game As a Learning Environment: Why Learning to Program is So Difficult and What Can Be Done About It. ACM SIGCSE Bulletin, 29(3):114–116, 1997.

[333] P. Mullins, D. Whitfield, and M. Conlon. Using Alice 2.0 As a First Language. Journal of Computing Sciences in Colleges, 24(3):136–143, 2009.

[334] P. M. Mullins and M. Conlon. Engaging Students in Programming Fundamentals Using Alice 2.0. In Proceedings of the 9th ACM SIGITE Conference on Information Technology Education, pages 81–88, 2008.

[335] B. A. Myers, J. F. Pane, and A. Ko. Natural Programming Languages and Environments. Communications of the ACM, 47(9):47–52, 2004.

[336] N. Myller, R. Bednarik, E. Sutinen, and M. Ben-Ari. Extending the Engagement Taxonomy: Software Visualization and Collaborative Learning. ACM Transactions on Computing Education, 9(1):7:1–7:27, 2009.

[337] T. L. Naps. JHAVÉ: Supporting Algorithm Visualization. IEEE Computer Graphics and Applications, 25(5):49–55, 2005.

[338] U. Nikula, O. Gotel, and J. Kasurinen. A Motivation Guided Holistic Rehabilitation of the First Programming Course. ACM Transactions on Computing Education, 11(4):24:1–24:38, 2011.

[339] J. Noble, M. Homer, K. B. Bruce, and A. P. Black. Designing Grace: Can an Introductory Programming Language Support the Teaching of Software Engineering? In IEEE 26th Conference on Software Engineering Education and Training, pages 219–228, 2013.

[340] N. Nystrom, M. R. Clarkson, and A. C. Myers. Polyglot: An extensible compiler framework for Java. In Proceedings of the Conference on Compiler Construction, pages 138–152, 2003.

[341] C. O’Donnell, J. Buckley, A. H. E. Mahdi, J. Nelson, and M. English. Evaluating Pair-Programming for Non-Computer Science Major Students. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pages 569–574, New York, USA, 2015.

[342] J. O’Kelly and J. P. Gibson. RoboCode & Problem-based Learning: A Non-prescriptive Approach to Teaching Programming. ACM SIGCSE Bulletin, 38(3):217–221, 2006.

[343] E. Okike and M. Attamah. Evaluation of JFlex Scanner Generator Using Form Fields Validity Checking. International Journal of Computer Science Issues, 7(3):12–20, 2010.

[344] G. Olimpo, D. Persico, L. Sarti, and M. Tavella. An Experiment In Introducing The Basic Concepts Of Informatics. In Proceedings of the Fourth World Conference on Computers in Education, pages 31–37. Elsevier Science Publishers, 1985.

[345] O. L. Oliveira, A. M. Monteiro, and N. T. Roman. Natural Language in Introductory Programming: An Experimental Study. In Proceedings of the 16th Annual Joint Conference on Innovation and Technology in Computer Science Education, pages 363–363, New York, USA, 2011.

[346] O. L. Oliveira, A. M. Monteiro, and N. T. Roman. Can Natural Language be Utilized in the Learning of Programming Fundamentals? In IEEE Frontiers in Education Conference, pages 1851–1856, 2013.

[347] O. L. D. Oliveira. Facilitating how to Learn Algorithms and Computer Programming. In Proceedings of International Conference on Engineering and Technology Education, volume 11, pages 229–233, 2010.

[348] O. L. D. Oliveira, A. M. Monteiro, and N. T. Roman. Programming Fundamentals and Human factors: an Empirical Study of Three Variables. iSys-Revista Brasileira de Sistemas de Informação, 8(1):102–124, 2015.

[349] R. Or-Bach and I. Lavy. Cognitive Activities of Abstraction in Object Orientation: An Empirical Study. ACM SIGCSE Bulletin, 36(2):82–86, 2004.

[350] W. I. Osman and M. M. Elmusharaf. Effectiveness of Combining Algorithm and Program Animation: A Case Study with Data Structure Course. Issues in Informing Science and Information Technology, 11:155–168, 2014.

[351] C. Ott. Decoding Feedback: Improving feedback practices for students in introductory programming courses. PhD thesis, University of Otago, Dunedin, New Zealand, 2014.

[352] M. Overmars. Learning Object-Oriented Design by Creating Games. IEEE Potentials, 23(5):11–13, 2005.

[353] J. Owolabi, P. Olanipekun, and J. Iwerima. Mathematics Ability and Anxiety, Computer and Programming Anxieties, Age and Gender as Determinants of Achievement in Basic Programming. GSTF Journal on Computing, 3(4):109–114, 2014.

[354] J. Pais. Using Puzzle Simulations to Introduce Object Oriented Programming with BYOB. Journal of Computing Sciences in Colleges, 29(5):192–193, 2014.

[355] J. F. Pane, C. A. Ratanamahatana, and B. A. Myers. Studying the Language and Structure in Non-programmers’ Solutions to Programming Problems. International Journal of Human-Computer Studies, 54(2):237–264, 2001.

[356] J. F. Pane, B. A. Myers, and L. B. Miller. Using HCI techniques to Design a More Usable Programming System. In Proceedings of IEEE Symposia on Human Centric Computing Languages and Environments, pages 198–206, 2002.

[357] M. Panitz, K. Sung, and R. Rosenberg. Game Programming in CS0: A Scaffolded Approach. Journal of Computing Sciences in Colleges, 26(1): 126–132, 2010.

[358] S. Papadakis, M. Kalogiannakis, V. Orfanakis, and N. Zaranis. Novice Programming Environments. Scratch & App Inventor: A First Comparison. In Proceedings of the 2014 Workshop on Interaction Design in Educational Environments, pages 1:1–1:7, New York, USA, 2014.

[359] C. Pareja-Flores, J. Urquiza-Fuentes, and J. A. Velázquez-Iturbide. WinHIPE: An IDE for Functional Programming Based on Rewriting and Visualization. ACM SIGPLAN Notices, 42(3):14–23, 2007.

[360] C. J. Park and J. S. Hyun. Effects of Abstract Thinking and Familiarity with Programming Languages on Computer Programming Ability in High Schools. In International Conference on Teaching, Assessment and Learning, pages 468–473, 2014.

[361] K. Parker, J. Chao, T. Ottaway, and J. Chang. A Formal Language Selection Process for Introductory Programming Courses. Journal of Information Technology Education: Research, 5(1):133–151, 2006.

[362] T. Parr and K. Fisher. LL(*): The Foundation of the ANTLR Parser Generator. ACM SIGPLAN Notices, 46(6):425–436, 2011.

[363] T. J. Parr and R. W. Quong. ANTLR: A predicated-LL(k) Parser Generator. Software - Practice and Experience, 25(7):789–810, 1995.

[364] J. Paxton. Live Programming As a Lecture Technique. Journal of Computing Sciences in Colleges, 18(2):51–56, 2002.

[365] A. Pears, S. Seidman, L. Malmi, L. Mannila, E. Adams, J. Bennedsen, M. Devlin, and J. Paterson. A Survey of Literature on the Teaching of Introductory Programming. ACM SIGCSE Bulletin, 39(4):204–223, 2007.

[366] M. Petre. Why Looking Isn’t Always Seeing: Readership Skills and Graphical Programming. Communications of the ACM, 38(6):33–44, 1995.

[367] N. Pillay and V. R. Jugoo. An Investigation into Student Characteristics Affecting Novice Programming Performance. ACM SIGCSE Bulletin, 37 (4):107–110, 2005.

[368] A. Popescu, A. Armanasu, O. Etzioni, D. Ko, and A. Yates. Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability. In Proceedings of the 20th International Conference on Computational Linguistics, Stroudsburg, USA, 2004.

[369] L. Porter, M. Guzdial, C. McDowell, and B. Simon. Success in Introductory Programming: What Works? Communications of the ACM, 56(8):34–36, 2013.

[370] K. Powers, S. Ecott, and L. M. Hirshfield. Through the Looking Glass: Teaching CS0 with Alice. ACM SIGCSE Bulletin, 39(1):213–217, 2007.

[371] D. Price, E. Riloff, J. Zachary, and B. Harvey. NaturalJava: A Natural Language Interface for Programming in Java. In Proceedings of the 5th International Conference on Intelligent User Interfaces, pages 207–211, New York, USA, 2000.

[372] K. Price and S. Smith. Improving Student Performance in CS1. Journal of Computing Sciences in Colleges, 30(2):157–163, 2014.

[373] T. W. Price and T. Barnes. Comparing Textual and Block Interfaces in a Novice Programming Environment. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 91–99, New York, USA, 2015.

[374] P. Radhakrishnan and S. Kanmani. Improvement of programming skills using pair programming by boosting extraversion and openness to experience. International Journal of Teaching and Case Studies, 4(1):13–35, 2013.

[375] T. Rajala, M. Laakso, E. Kaila, and T. Salakoski. Effectiveness of Program Visualization: A Case Study with the ViLLE Tool. Journal of Information Technology Education: Innovations in Practice, 7:15–32, 2008.

[376] R. Rajaravivarma. A Games-based Approach for Teaching the Introductory Programming Course. ACM SIGCSE Bulletin, 37(4):98–102, 2005.

[377] V. Ramalingam, D. LaBelle, and S. Wiedenbeck. Self-efficacy and Mental Models in Learning to Program. ACM SIGCSE Bulletin, 36(3):171–175, 2004.

[378] S. Reardon and B. Tangney. Smartphones, Studio-Based Learning, and Scaffolding: Helping Novices Learn to Program. ACM Transactions on Computing Education, 14(4):23:1–23:15, 2014.

[379] D. Reed. Rethinking CS0 with JavaScript. ACM SIGCSE Bulletin, 33 (1):100–104, 2001.

[380] S. Reges. Back to Basics in CS1 and CS2. ACM SIGCSE Bulletin, 38(1): 293–297, 2006.

[381] A. H. Register. A Guide to MATLAB Object-Oriented Programming. Chapman & Hall/CRC, New York, USA, 2007.

[382] V. G. Renumol, S. Jayaprakash, and D. Janakiram. Classification of Cognitive Difficulties of Students to Learn Computer Programming. Technical report, Department of Computer Science, Indian Institute of Technology, Madras, India, 2009.

[383] A. Repenning and A. Ioannidou. Broadening Participation Through Scalable Game Design. ACM SIGCSE Bulletin, 40(1):305–309, 2008.

[384] J. S. Rey. From Alice to BlueJ: A Transition to Java. Master’s thesis, Robert Gordon University, 2009.

[385] E. Rich. Automata, computability and complexity: Theory and Applications. Pearson Prentice Hall, USA, 2008.

[386] A. Riker. Natural Language in Programming: An English Syntax-based Approach for Reducing the Difficulty of First Programming Language Acquisition. Master’s thesis, Department of Computer Science, Graduate School of Arts and Sciences, Brandeis University, 2010.

[387] M. Rizvi and T. Humphries. A Scratch-based CS0 course for At-risk Computer Science Majors. In Frontiers in Education Conference, pages 1–5, 2012.

[388] M. Rizvi, T. Humphries, D. Major, M. Jones, and H. Lauzun. A CS0 Course Using Scratch. Journal of Computing Sciences in Colleges, 26(3): 19–27, 2011.

[389] M. Rizvi, T. Humphries, D. Major, H. Lauzun, and M. Jones. A New CS0 Course for At-Risk Majors. In 24th IEEE-CS Conference on Software Engineering Education and Training, pages 314–323, 2011.

[390] A. Robins. Learning edge momentum: a new account of outcomes in CS1. Computer Science Education, 20(1):37–71, 2010.

[391] A. Robins, J. Rountree, and N. Rountree. Learning and Teaching Programming: A Review and Discussion. Computer Science Education, 13(2):137–172, 2003.

[392] S. H. Rodger. Introducing Computer Science Through Animation and Virtual Worlds. ACM SIGCSE Bulletin - Inroads: paving the way towards excellence in computing education, 34(1):186–190, 2002.

[393] J. Rosenberg and M. Kölling. I/O Considered Harmful (at Least for the First Few Weeks). In Proceedings of the 2nd Australasian Conference on Computer Science Education, pages 216–223, New York, USA, 1996.

[394] J. B. Rosser. Highlights of the History of the Lambda-Calculus. Annals of the History of Computing, 6(4):337–349, 1984.

[395] G. Rössling and B. Freisleben. ANIMAL: A System for Supporting Multiple Roles in Algorithm Animation. Journal of Visual Languages & Computing, 13(3):341–354, 2002.

[396] N. Rountree, J. Rountree, and A. Robins. Predictors of Success and Failure in a CS1 Course. ACM SIGCSE Bulletin, 34(4):121–124, 2002.

[397] E. Rowland and D. Zeilberger. A case study in meta-automation: Automatic generation of congruence automata for combinatorial sequences. Journal of Difference Equations and Applications, 20(7):973–988, 2014.

[398] K. Roy, W. C. Rousse, and D. B. DeMeritt. Comparing the Mobile Novice Programming Environments: App Inventor for Android vs. GameSalad. In Frontiers in Education Conference, pages 1–6, 2012.

[399] M. J. Rubin. The Effectiveness of Live-coding to Teach Introductory Programming. In Proceeding of the 44th ACM Technical Symposium on Computer Science Education, pages 651–656, New York, USA, 2013.

[400] M. A. Rubio, R. Romero-Zaliz, C. Mañoso, and A. P. D. Madrid. Enhancing an Introductory Programming Course with Physical Computing Modules. In IEEE Frontiers in Education Conference, pages 1–8, 2014.

[401] M. A. Rubio, R. Romero-Zaliz, C. Mañoso, and A. P. D. Madrid. Closing the Gender Gap in an Introductory Programming Course. Computers & Education, 82:409–420, 2015.

[402] M. Ruckert and R. Halpern. Educational C. ACM SIGCSE Bulletin, 25 (1):6–9, 1993.

[403] A. Ruf, A. Mühling, and P. Hubwieser. Scratch vs. Karel: Impact on Learning Outcomes and Motivation. In Proceedings of the 9th Workshop in Primary and Secondary Computing Education, pages 50–59, New York, USA, 2014.

[404] R. M. Ryan and E. L. Deci. Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. Contemporary Educational Psychology, 25(1):54–67, 2000.

[405] B. Sabitzer and S. Pasterk. Brain-based Programming continued: Effective teaching in programming courses. In IEEE Frontiers in Education Conference, pages 1–6, 2014.

[406] V. O. Safonov. Trustworthy Compilers. John Wiley & Sons, Inc., Hoboken, New Jersey, USA, first edition, 2010.

[407] J. Sajaniemi and M. Kuittinen. Program Animation Based on the Roles of Variables. In Proceedings of the 2003 ACM Symposium on Software Visualization, pages 7–ff, New York, USA, 2003.

[408] D. Sanders and B. Dorn. Classroom Experience with Jeroo. Journal of Computing Sciences in Colleges, 18(4):308–316, 2003.

[409] K. A. Sarpong, J. K. Arthur, and P. Y. Owusu. Causes of Failure of Students in Computer Programming Courses: The Teacher–Learner Per- spective. International Journal of Computer Applications, 77(12):27–32, 2013.

[410] I. Sasano. Toward Modular Implementation of Practical Identifier Com- pletion on Incomplete Program Text. In Proceedings of the 8th Interna- tional Conference on Bioinspired Information and Communications Tech- nologies, pages 231–234, Brussels, Belgium, 2014.

[411] V. L. Sauter. Predicting Computer Programming Skill. Computers & Education, 10(2):299–302, 1986.

[412] J. R. Savery. Overview of Problem-based learning: Denitions and Dis- tinctions. Interdisciplinary Journal of Problem-based Learning, 1(1):9–20, 2006.

[413] H. Schildt. C# 4.0 The Complete Reference . McGraw-Hill Osborne Media, USA, first edition, 2010. BIBLIOGRAPHY 283

[414] G. Schnitger. Regular Expressions and NFAs Without ε-Transitions. In Proceedings of Annual Symposium on Theoretical Aspects of Computer Science, volume 3884 of Lecture Notes in Computer Science, pages 432– 443. Springer Berlin Heidelberg, 2006.

[415] M. J. Scott. Self-Beliefs in the Introductory Programming Lab and Game- Based Fantasy Role-Play. PhD thesis, Brunel University London, 2015.

[416] R. W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Boston, tenth edition, 2012.

[417] A. Sen. Using Code Analysis Tool in Introductory Programming Class. Issues in Information Systems, 15(1):1–10, 2014.

[418] L. M. Serrano-C´amara, M. Paredes-Velasco, C. Alcover, and J. A. Velazquez-Iturbide. An Evaluation of Students’ Motivation in Computer- Supported Collaborative Learning of Programming Concepts. Computers in Human Behavior, 31:499– 508, 2014.

[419] A. Settle, A. Vihavainen, and J. Sorva. Three Views on Motivation and Programming. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education, pages 321–322, New York, USA, 2014.

[420] A. H. Seyal, Y. S. Mey, M. H. Matusin, H. N. H. Siau, and A. A. Rah- man. Understanding Students Learning Style and Their Performance in Computer Programming Course: Evidence from Bruneian Technical In- stitution of Higher Learning. International Journal of Computer Theory and Engineering, 7(3):241–245, 2015.

[421] C. A. Shaffer, L. S. Heath, and J. Yang. Using the Swan Data Structure Visualization System for Computer Science Education. ACM SIGCSE Bulletin, 28(1):140–144, 1996.

[422] C. A. Shaffer, M. L. Cooper, A. J. D. Alon, M. Akbar, M. Stewart, BIBLIOGRAPHY 284

S. Ponce, and S. H. Edwards. Algorithm Visualization: The State of the Field. ACM Transactions on Computing Education, 10(3):9:1–9:22, 2010.

[423] T. Sharma, S. Das, and V. Bhalla. Parsing - A Brief Study. International Journal of Research, 1(8):1265–1275, 2014.

[424] E. Shein. Python for Beginners. Communications of the ACM, 58(3): 19–21, 2015.

[425] G. B. Shelly, T. J. Cashman, and C. W. Herbert. Alice 2.0 Introductory Concepts and Techniques. Cengage Learning, USA, first edition, 2006.

[426] M. Sipser. Introduction to the Theory of Computation. Cengage Learning, USA, second edition, 2006.

[427] P. A. G. Sivilotti and S. M. Pike. The Suitability of Kinesthetic Learning Activities for Teaching Distributed Algorithms. ACM SIGCSE Bulletin, 39(1):362–366, 2007.

[428] P. A. G. Sivilotti and S. M. Pike. A Collection of Kinesthetic Learning Activities for a Course on Distributed Computing. ACM SIGACT, 38(2): 56–74, 2007.

[429] R. H. Sloan and P. Troy. CS 0.5: A Better Approach to Introductory Computer Science for Majors. ACM SIGCSE Bulletin, 40(1):271–275, 2008.

[430] K. Slonneger and B. Kurtz. Formal Syntax and Semantics of Program- ming Languages. Addison Wesley Longman, Massachusetts, first edition, 1995.

[431] A. Snyder. Encapsulation and Inheritance in Object-oriented Program- ming Languages. ACM SIGPLAN Notices - Proceedings of the conference on object-oriented programming systems, languages, and applications, 21 (11):38–45, 1986. BIBLIOGRAPHY 285

[432] A. Soares. Reflections on Teaching App Inventor for Non-Beginner Pro- grammers: Issues, Challenges and Opportunities. Information Systems Education Journal, 12(4):56, 2014.

[433] A. Soares and N. L. Martin. Teaching Non-Beginner Programmers with App Inventor: Survey Results and Implications. Information Systems Education Journal, 13(5):24, 2015.

[434] J. Sorva, V. Karavirta, and L. Malmi. A Review of Generic Program Visualization Systems for Introductory Programming Education. ACM Transactions on Computing Education, 13(4):15:1–15:64, 2013.

[435] J. Sorva, J. L¨onnberg, and L. Malmi. Students Ways of Experiencing Visual Program Simulation. Computer Science Education, 23(3):207–238, 2013.

[436] E. Spertus, M. L. Chang, P. Gestwicki, and D. Wolber. Novel Approaches to CS0 with App Inventor for Android. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pages 325–326, New York, USA, 2010.

[437] P. Sprague and C. Schahczenski. Abstraction the Key to CS1. Journal of Computing Sciences in Colleges, 17(3):211–218, 2002.

[438] R. Stansifer. The Study of Programming Languages. Prentice Hall, first edition, 1994.

[439] A. Stefik and S. Siebert. An Empirical Investigation into Programming Language Syntax. ACM Transactions on Computing Education, 13(4): 19:1–19:40, 2013.

[440] D. E. Stevenson. Programming Language Fundamentals by Example. Auerbach Publications, USA, first edition, 2006.

[441] B. Stroustrup. A history of C++: 1979–1991. In History of programming languages—II, pages 699–769, New York, USA, 1996. BIBLIOGRAPHY 286

[442] G. Succi and R. W. Wong. The Application of JavaCC to Develop a C/C++ Preprocessor. ACM SIGAPP Applied Computing Review - Spe- cial issue on new applications of parsing tools, 7(3):11–18, 1999.

[443] J. Summet, D. Kumar, K. O’Hara, D. Walker, L. Ni, D. Blank, and T. Balch. Personalizing CS1 with Robots. ACM SIGCSE Bulletin, 41(1): 433–437, 2009.

[444] K. Sung, M. Panitz, S. Wallace, R. Anderson, and J. Nordlinger. Game- themed Programming Assignments: The Faculty Perspective. ACM SIGCSE Bulletin, 40(1):300–304, 2008.

[445] J. Sweller. Cognitive Load Theory, Learning Difficulty, and Instructional Design. Learning and Instruction, 4(4):295–312, 1994.

[446] A. Tafliovich, J. Campbell, and A. Petersen. A Student Perspective on Prior Experience in CS1. In Proceeding of the 44th ACM Technical Sym- posium on Computer Science Education, pages 239–244, New York, USA, 2013.

[447] S. M. Taheri, Y. Hidehiko, and H. K. Tripathy. Novel Assessment of Different Intelligent Tools for Problem Solving. Computer Science and Engineering, 3(3):67–75, 2013.

[448] S. M. Taheri, M. Sasaki, and H. T. Ngetha. Evaluating the Effectiveness of Problem Solving Techniques and Tools in Programming. In Science and Information Conference, pages 928–932, 2015.

[449] M. Tekdal. The Effect of an Example-Based Dynamic Program Visual- ization Environment on Students’ Programming Skills. Journal of Edu- cational Technology & Society, 16(3):400–410, 2013.

[450] J. Templeman and D. Vitter. Visual Studio .NET - The .NET Framework Black Book. Coriolis Group Books, Arizona, USA, 2002.

[451] P. D. Terry. Compilers and Compiler Generators an introduction with C++. International Thomson Computer Press, 1997. BIBLIOGRAPHY 287

[452] A. L. Tharp. Selecting the Right Programming Language. ACM SIGCSE Bulletin - Proceedings of the 13th SIGCSE symposium on Computer sci- ence education, 14(1):151–155, 1982.

[453] S. Tigrek and M. Obadat. Teaching Smartphones programming using (Android Java): Pedagogy and innovation. In International Conference on Information Technology Based Higher Education and Training, pages 1–7, 2012.

[454] N. Tillmann, M. Moskal, J. D. Halleux, M. Fahndrich, J. Bishop, A. Samuel, and T. Xie. The Future of Teaching Programming is on Mobile Devices. In Proceedings of the 17th ACM Annual Conference on Inno- vation and Technology in Computer Science Education, pages 156–161, New York, USA, 2012.

[455] J. E. Tomayko. A Comparison of Pair Programming to Inspections for Software Defect Reduction. Computer Science Education, 12(3):213–222, 2002.

[456] I. Tomek. Josef, the robot. Computers & Education, 6(3):287–293, 1982.

[457] J. Tremblay and P. G. Sorenson. The Theory and Practice of Compiler Writing. McGraw-Hill Computer Science Series, USA, 1985.

[458] U. Tukeyev and D. Rakhimova. Augmented Attribute Grammar in Mean- ing of Natural Languages Sentences. In 13th International Symposium on Advanced Intelligent Systems (ISIS), 2012 Joint 6th International Confer- ence on Soft Computing and Intelligent Systems, pages 1080–1084, 2012.

[459] F. Turbak and D. Gifford. Design Concepts in Programming Languages. The MIT Press, Cambridge, Massachusetts, first edition, 2008.

[460] S. M. Tuttle. iYO Quiero Java!: Teaching Java As a Second Programming Language. Journal of Computing Sciences in Colleges, 17(2):34–45, 2001.

[461] S. Uludag, M. Karakus, and S. W. Turner. Implementing IT0/CS0 with Scratch, App Inventor For android, and Lego Mindstorms. In Proceedings BIBLIOGRAPHY 288

of the 2011 Conference on Information Technology Education, pages 183– 190, New York, USA, 2011.

[462] J. Urquiza-Fuentes and A. Vel´azquez-Iturbide.Toward the Effective Use of Educational Program Animations: The Roles of Student’s Engagement and Topic Complexity. Computers & Education, 67:178–192, 2013.

[463] J. Urquiza-Fuentes and J. A.´ Vel´azquez-Iturbide. A Survey of Successful Evaluations of Program Visualization and Algorithm Animation Systems. ACM Transactions on Computing Education - Special Issue on the 5th Program Visualization Workshop, 9(2):9:1–9:21, 2009.

[464] J. Urquiza-Fuentes, F. J. Almeida-Mart´ınez,A. P´erez-Carrasco,and J. A.´ Vel´azquez-Iturbide. Visualization of the Syntax Error Eecovery Within the Compilation Process. In International Symposium on Computers in Education, pages 1–6, 2012.

[465] D. Vadas and J. R. Curran. Programming With Unrestricted Natural Language. In Proceedings of the Australasian Language Technology Work- shop, pages 191–199, Sydney, Australia, 2005.

[466] A. Vahldick, A. J. Mendes, and M. J. Marcelino. A Review of Games De- signed to Improve Introductory Computer Programming Competencies. In IEEE Frontiers in Education Conference, pages 1–7, 2014.

[467] T. VanDeGrift. Coupling Pair Programming and Writing: Learning About Students’ Perceptions and Processes. ACM SIGCSE Bulletin, 36 (1):2–6, 2004.

[468]J. A.´ Vel´azquez-Iturbide. A Programming Languages Course for Fresh- men. ACM SIGCSE Bulletin, 37(3):271–275, 2005.

[469] J. A. Vel´azquez-Iturbide,C. Pareja-Flores, and J. Urquiza-Fuentes. An approach to effortless construction of program animations. Computers & Education, 50(1):179–192, 2008. BIBLIOGRAPHY 289

[470] J. A. Vel´azquez-Iturbide,A. P´erez-Carrasco,and J. Urquiza-Fuentes. In- teractive Visualization of Recursion with SRec. ACM SIGCSE Bulletin, 41(3):339–339, 2009.

[471] P. R. Ventura. Identifying Predictors of Success for an Objects-First CS1. Computer Science Education, 15(3):223–243, 2005.

[472] A. Vihavainen, J. Airaksinen, and C. Watson. A Systematic Review of Approaches for Teaching Introductory Programming and Their Influence on Success. In Proceedings of the Tenth Annual Conference on Inter- national Computing Education Research, pages 19–26, New York, USA, 2014.

[473] A. T. Virtanen, E. Lahtinen, and H. M J¨arvinen. VIP, a Visual Interpreter for Learning Introductory Programming with C++. In Fifth Koli Calling Conference on Computer Science Education, pages 125–130, 2005.

[474] C. Wallace and P. Martin. Not whether Java but how Java. In Proceedings of Asia-Pacific Software Engineering and International Computer Science Conference, pages 517–518, 1997.

[475] C. Watson and F. W. B. Li. Failure Rates in Introductory Programming Revisited. In Proceedings of the 2014 Conference on Innovation & Tech- nology in Computer Science Education, pages 39–44, New York, USA, 2014.

[476] D. Watson. High-Level Languages and Their Compilers. Addison-Wesley, USA, 1989.

[477] D. A. Watt. Programming Language Design Concepts. John Wiley & Sons Ltd, West Sussex, England, 2004.

[478] M. Werner. A Parser Project in a Programming Languages Course. Jour- nal of Computing Sciences in Colleges, 18(5):184–192, 2003. BIBLIOGRAPHY 290

[479] B. T. Westphal, F. C. Harris Jr, and M. S. Fadali. Graphical program- ming: A vehicle for teaching computer problem solving. In 33rd Annual Frontiers in Education, volume 2, pages F4C–19–23, 2003.

[480] R. L. Wexelblat. The consequences of one’s first programming language. In Proceedings of the 3rd ACM SIGSMALL Symposium and the First SIGPC Symposium on Small Systems, pages 52–55, New York, USA, 2014.

[481] J. Whalley, C. Prasad, and P. K. A. Kumar. Decoding Doodles: Novice Programmers and Their Annotations. In Proceedings of the Ninth Aus- tralasian Conference on Computing Education, volume 66, pages 171–178, Darlinghurst, Australia, 2007.

[482] S. Wiedenbeck, D. Labelle, and V. N. R. Kain. Factors Affecting Course Outcomes in Introductory Programming. In 16th Annual Workshop of the Psychology of Programming Interest Group, pages 97–109, 2004.

[483] R. Wilhelm and D. Maurer. Compiler Design. Addison-Wesley, Harlow, England, 1995.

[484] R. Wilhelm and H. Seidl. Compiler Design: Virtual Machines. Springer- Verlag, 2011.

[485] A. B. Williams. The Qualitative Impact of Using LEGO MINDSTORMS Robots to Teach Computer Engineering, 2003.

[486] L. Williams, E. Wiebe, K. Yang, M. Ferzli, and C. Miller. In Support of Pair Programming in the Introductory Computer Science Course. Com- puter Science Education, 12(3):197–212, 2002.

[487] B. C. Wilson. A Study of Factors Promoting Success in Computer Science Including Gender Differences. COMPUTER SCIENCE EDUCATION, 12:141–164, 2002.

[488] B. C. Wilson and S. Shrock. Contributing to Success in an Introductory BIBLIOGRAPHY 291

Computer Science Course: A Study of Twelve Factors. ACM SIGCSE Bulletin, 33(1):184–188, 2001.

[489] L. E. Winslow. Programming Pedagogy – A Psychological Overview. ACM SIGCSE Bulletin, 28(3):17–22, 1996.

[490] N. Wirth. Compiler Construction. Addison-Wesley, 1996.

[491] D. Wolber, H. Abelson, E. Spertus, and L. Looney. App Inventor: Create Your Own Android Apps. OReilly Media, Inc., Sebastopol, Canada, first edition, 2011.

[492] U. Wolz, H. H. Leitner, D. J. Malan, and J. Maloney. Starting with Scratch in CS1. ACM SIGCSE Bullein, 41(1):2–3, 2009.

[493] K. Wood, D. Parsons, J. Gasson, and P. Haden. It’s Never Too Early: Pair Programming in CS1. In Proceedings of the Fifteenth Australasian Com- puting Education Conference - Volume 136, pages 13–21, Darlinghurst, Australia, 2013.

[494] W. A. Woods. Progress in Natural Language Understanding: An Ap- plication to Lunar Geology. In Proceedings of the National Computer Conference and Exposition, pages 441–450, New York, USA, 1973.

[495] M. Xenos, C. Pierrakeas, and P. Pintelas. A survey on student dropout rates and dropout causes concerning the students in the Course of Infor- matics of the Hellenic Open University. Computers & Education, 39(4): 361–377, 2002.

[496] G. Xing. A Simple Way to Construct NFA with Fewer States and Transi- tions. In Proceedings of the 42nd Annual Southeast Regional Conference, pages 214–218, New York, USA, 2004.

[497] S. Xinogalos. Using Flowchart-based Programming Environments for Simplifying Programming and Software Engineering Processes. In IEEE Global Engineering Education Conference, pages 1313–1322, 2013. BIBLIOGRAPHY 292

[498] X. Yu and M. Becchi. Exploring Different Automata Representations for Efficient Regular Expression Matching on GPUs. ACM SIGPLAN Notices, 48(8):287–288, 2013.

[499] A. Zakai. Emscripten: An LLVM-to-JavaScript Compiler. In Proceedings of the ACM International Conference Companion on Object Oriented Pro- gramming Systems Languages and Applications Companion, pages 301– 312, New York, USA, 2011.

[500] V. Zaytsev. Formal foundations for semi-parsing. In IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering , pages 313–317, 2014.

[501] J. Zhang, M. Atay, E. R. Caldwell, and E. J. Jones. Visualizing Loops Using a Game-Like Instructional Module. In IEEE 13th International Conference on Advanced Learning Technologies, pages 448–450, 2013.

[502] J. Zhang, E. R. Caldwell, and E. Smith. Learning the Concept of Java Inheritance in a Game. In 18th International Conference on Computer Games: AI, Animation, Mobile, Interactive Multimedia, Educational Se- rious Games, pages 212–216, 2013.

[503] J. Zhang, E. Smith, E. R. Caldwell, and M. Perkins. Learning and Prac- ticing Decision Structures in a Game. Journal of Computing Sciences in Colleges, 29(4):60–67, 2014.

[504] M. Zhao. Two-dimensional extended attribute grammar method for the recognition of hand-printed chinese characters. Pattern Recognition, 23 (7):685–695, 1990.

[505] Z. Zheng. AnswerBus Question Answering System. In Proceedings of the Second International Conference on Human Language Technology Re- search, pages 399–404, San Francisco, USA, 2002. Appendices

Appendix A

Textual Language Grammar

This appendix defines the context-free grammar of the experimental textual language, which is used in the second phase of the experimental learners' programming language.

→ STARTSYMBOL
→ OFSYMBOL PROGRAMSYMBOL
→
→ ASYMBOL
→ THESYMBOL
→
→ MAINSYMBOL
→
→ ENDSYMBOL
→
→ CREATESYMBOL


→ THESYMBOL
→ ASYMBOL
→ ANSYMBOL
→ <flt var dec>
→ INTEGERSYMBOL
<flt var dec> → FLOATSYMBOL
→ CHARACTERSYMBOL
→ STRINGSYMBOL
→ TYPESYMBOL
→ INT IDENTIFIER
→ ARRAYSYMBOL AINT IDENTIFIER OFSYMBOL INTEGERCONSTANT FLT IDENTIFIER
→ ARRAYSYMBOL AFLT IDENTIFIER OFSYMBOL INTEGERCONSTANT

CHR IDENTIFIER
→ ARRAYSYMBOL ACHR IDENTIFIER OFSYMBOL INTEGERCONSTANT STR IDENTIFIER
→ VARIABLESYMBOL
→
→ NAMEDLITERAL
→
→ ASSIGNMENTOPERATOR
→
→ INTEGERCONSTANT
→ CHARACTERCONSTANT
→ ASSIGNMENTOPERATOR <flt initial>
→
<flt initial> → INTEGERCONSTANT
<flt initial> → CHARACTERCONSTANT
<flt initial> → FLOATCONSTANT
→ ASSIGNMENTOPERATOR CHARACTERCONSTANT
→
→ ASSIGNMENTOPERATOR STRINGCONSTANT
→
→ ELEMENTSSYMBOL
→

→ INPUTSYMBOL INSYMBOL
→ INPUTMARKSYMBOL
→
→ INT IDENTIFIER
→ FLT IDENTIFIER
→ CHR IDENTIFIER
→ STR IDENTIFIER
→ ROWSYMBOL OFSYMBOL
→ AINT IDENTIFIER
→ AFLT IDENTIFIER
→ ACHR IDENTIFIER
→ OUTPUTSYMBOL
→ VALUESYMBOL
→ TEXTSYMBOL STRINGCONSTANT

→ OFSYMBOL
→ CHARACTERCONSTANT
→ INTEGERCONSTANT
→ FLOATCONSTANT
→ STRINGCONSTANT
→ REPEATSYMBOL
→ THESYMBOL STATEMENTSSYMBOL
→ STATEMENTSSYMBOL
→
→ INTEGERCONSTANT TIMESSYMBOL
→ WITHSYMBOL INT IDENTIFIER TODOWNTOSYMBOL STEPSYMBOL
→ FROMSYMBOL
→
→ ISSYMBOL
→
→ INTEGERCONSTANT
→ INT IDENTIFIER
→ ANDSYMBOL
→

→ WHILESYMBOL
→ ASSYMBOL LONGSYMBOL ASSYMBOL
→ ENDSYMBOL LOOPSYMBOL
→ OFSYMBOL
→ TERMINATESYMBOL LOOPSYMBOL
→ CONTINUESYMBOL LOOPSYMBOL RELATOPERATOR RELATOPERATOR
→
→ ANDSYMBOL
→ ORSYMBOL
→ EXECUTESYMBOL
→ IFSYMBOL
→ PROVIDEDSYMBOL
→ ONSYMBOL CONDITIONSYMBOL THATSYMBOL
→ THATSYMOL
→

→ ELSEIFSYMBOL
→ ELSESYMBOL
→ ENDSYMBOL IFSYMBOL
→ OFSYMBOL
→
→ PLUS
→ MINUS
→
→ MULTIPLICATION
→ DIVISION
→ MODULUS
→
→ LEFTPARENTHESIS RIGHTPARENTHESIS
→ INTEGERCONSTANT
→ FLOATCONSTANT
→ INT IDENTIFIER
→ FLT IDENTIFIER
→ CHR IDENTIFIER ASSIGNMENTOPERATOR

→ STR IDENTIFIER ASSIGNMENTOPERATOR ASSIGNMENTOPERATOR
→ ACHR IDENTIFIER ASSIGNMENTOPERATOR ASSIGNMENTOPERATOR
→ AINT IDENTIFIER
→ AFLT IDENTIFIER
→ CHR IDENTIFIER
→ CHARACTERCONSTANT ACHR IDENTIFIER
→ STR IDENTIFIER
→ STRINGCONSTANT

Appendix B

Textual Language Grammar in EBNF

In this appendix the context-free grammar of the experimental textual language is defined in Extended Backus-Naur Form (EBNF). The EBNF representation of a grammar allows a simple and straightforward construction of a recursive-descent parser.
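The correspondence between EBNF and a recursive-descent parser can be sketched briefly in Python. The sketch below is illustrative only: it covers three rules of this appendix (`STARTSYMBOL`, the optional phrase `[OFSYMBOL [ASYMBOL | THESYMBOL] [MAINSYMBOL] PROGRAMSYMBOL]`, and `ENDSYMBOL`) and leaves the program body out. The method names `header_main`, `main_phrase` and `endmark_main` follow the nonterminal names listed in Appendix C, but pairing them with these particular rules is an assumption, and the `Parser` class and the `parses` helper are hypothetical names that do not come from the prototype.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


class Parser:
    def __init__(self, tokens):
        # An explicit end marker keeps peek() valid at the end of input.
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def accept(self, *kinds):
        """Consume the next token if it is one of `kinds`; report whether it was."""
        if self.peek() in kinds:
            self.pos += 1
            return True
        return False

    def expect(self, kind):
        if not self.accept(kind):
            raise ParseError(f"expected {kind}, found {self.peek()}")

    def header_main(self):
        # ::= STARTSYMBOL
        self.expect("STARTSYMBOL")

    def main_phrase(self):
        # ::= [OFSYMBOL [ASYMBOL | THESYMBOL] [MAINSYMBOL] PROGRAMSYMBOL]
        # Each EBNF option [...] becomes an `if` or an accept() whose result is ignored.
        if self.accept("OFSYMBOL"):
            self.accept("ASYMBOL", "THESYMBOL")
            self.accept("MAINSYMBOL")
            self.expect("PROGRAMSYMBOL")

    def endmark_main(self):
        # ::= ENDSYMBOL
        self.expect("ENDSYMBOL")

    def program(self):
        self.header_main()
        self.main_phrase()
        # Declarations and statements are omitted in this sketch.
        self.endmark_main()
        self.expect("EOF")


def parses(tokens):
    """Return True if `tokens` form a (body-less) program, False otherwise."""
    try:
        Parser(tokens).program()
        return True
    except ParseError:
        return False
```

Each nonterminal maps to one method, each optional part `[...]` to a conditional, and a repetition `{...}` would map to a `while` loop in the same way, which is why the EBNF form is convenient for this style of parser.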

::=

::= STARTSYMBOL
::= [OFSYMBOL [ASYMBOL | THESYMBOL] [MAINSYMBOL] PROGRAMSYMBOL]
::= ENDSYMBOL
::= { }{ }
::= CREATESYMBOL
::= THESYMBOL
::= ASYMBOL
::= ANSYMBOL
::=

::=
::=
<nonint var dec> ::= <flt var dec>
<nonint var dec> ::=
<nonint var dec> ::=
::= INTEGERSYMBOL [TYPESYMBOL]
<flt var dec> ::= FLOATSYMBOL [TYPESYMBOL]
::= CHARACTERSYMBOL [TYPESYMBOL]
::= STRINGSYMBOL [TYPESYMBOL] [VARIABLESYMBOL [NAMEDLITERAL] ] STR IDENTIFIER [ASSIGNMENTOPERATOR STRINGCONSTANT]
::= [VARIABLESYMBOL [NAMEDLITERAL]] INT IDENTIFIER [ASSIGNMENTOPERATOR ]
::= ARRAYSYMBOL [NAMEDLITERAL] AINT IDENTIFIER OFSYMBOL INTEGERCONSTANT [ELEMENTSSYMBOL]

::= [VARIABLESYMBOL [NAMEDLITERAL]] FLT IDENTIFIER [ASSIGNMENTOPERATOR <flt initial>]
::= ARRAYSYMBOL [NAMEDLITERAL] AFLT IDENTIFIER OFSYMBOL INTEGERCONSTANT [ELEMENTSSYMBOL]
::= [VARIABLESYMBOL [NAMEDLITERAL] ] CHR IDENTIFIER [ASSIGNMENTOPERATOR CHARACTERCONSTANT ]
::= ARRAYSYMBOL [NAMEDLITERAL] ACHR IDENTIFIER OFSYMBOL INTEGERCONSTANT [ELEMENTSSYMBOL]
::= INTEGERCONSTANT
::= CHARACTERCONSTANT
<flt initial> ::= INTEGERCONSTANT
<flt initial> ::= CHARACTERCONSTANT
<flt initial> ::= FLOATCONSTANT
::=
::=
::=
::=
::=
::=

::=
::= [INPUTMARKSYMBOL ] INPUTSYMBOL INSYMBOL
::= INT IDENTIFIER
::= FLT IDENTIFIER
::= CHR IDENTIFIER
::= STR IDENTIFIER
::=
::= ROWSYMBOL OFSYMBOL
::= AINT IDENTIFIER
::= AFLT IDENTIFIER
::= ACHR IDENTIFIER
::= OUTPUTSYMBOL
::=
::=
::= [ASYMBOL | THESYMBOL ]
::= VALUESYMBOL
::= TEXTSYMBOL STRINGCONSTANT
::= OFSYMBOL
::=
::= CHARACTERCONSTANT
::= INTEGERCONSTANT
::= FLOATCONSTANT
::= STRINGCONSTANT

::= REPEATSYMBOL [THESYMBOL STATEMENTSSYMBOL | STATEMENTSSYMBOL]
::=
::=
::= { }
::= INTEGERCONSTANT TIMESSYMBOL
::= WITHSYMBOL INT IDENTIFIER [FROMSYMBOL] TODOWNTOSYMBOL [ANDSYMBOL ][ASYMBOL | THESYMBOL ] STEPSYMBOL [ISSYMBOL]
::= INTEGERCONSTANT
::= INT IDENTIFIER
::= {<statement>}
::= WHILESYMBOL
::= ASSYMBOL LONGSYMBOL ASSYMBOL
::= ENDSYMBOL [ASYMBOL | THESYMBOL | OFSYMBOL ] LOOPSYMBOL
::= TERMINATESYMBOL [ASYMBOL | THESYMBOL ] LOOPSYMBOL
::= CONTINUESYMBOL [ASYMBOL | THESYMBOL ] LOOPSYMBOL

::= RELATOPERATOR { }
::= ANDSYMBOL
::= ORSYMBOL
::= EXECUTESYMBOL [THESYMBOL STATEMENTSSYMBOL | STATEMENTSSYMBOL ] {}
::= IFSYMBOL
::= PROVIDEDSYMBOL [THATSYMOL ]
::= ONSYMBOL [ASYMBOL | THESYMBOL ] CONDITIONSYMBOL THATSYMBOL
::= ENDSYMBOL [OFSYMBOL ] IFSYMBOL
::= ELSEIFSYMBOL {}
::= ELSESYMBOL {} ENDSYMBOL [OFSYMBOL ] IFSYMBOL
::=
::= [PLUS | MINUS ]
::=

::= [MULTIPLICATION | DIVISION | MODULUS ]
::=
::=
::= LEFTPARENTHESIS RIGHTPARENTHESIS
::= INTEGERCONSTANT
::= FLOATCONSTANT
::= INT IDENTIFIER
::= FLT IDENTIFIER
::= CHR IDENTIFIER ASSIGNMENTOPERATOR
::= STR IDENTIFIER ASSIGNMENTOPERATOR
::=
::= ASSIGNMENTOPERATOR
::= ACHR IDENTIFIER ASSIGNMENTOPERATOR
::= ASSIGNMENTOPERATOR
::= AINT IDENTIFIER
::= AFLT IDENTIFIER
::= CHR IDENTIFIER
::= CHARACTERCONSTANT
::= ACHR IDENTIFIER
::= STR IDENTIFIER
::= STRINGCONSTANT

::=
::=

Appendix C

Synchronizing Sets

This appendix lists the set of synchronizing tokens calculated for every nonterminal of the grammar of the experimental textual language. The synchronizing tokens of each nonterminal are derived from its FIRST and FOLLOW sets.
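The calculation can be illustrated with a minimal Python sketch. It assumes that the synchronizing set of a nonterminal is the union of its FIRST set (without the empty string) and its FOLLOW set; the appendix does not spell out this rule, so it is an assumption of the sketch. The grammar used is a simplified, hypothetical fragment in which the program body is omitted and `main_phrase` is reduced to `OFSYMBOL PROGRAMSYMBOL` or empty. All function names are illustrative.

```python
EPS = "ε"    # marker for the empty string
EOF = "EOF"  # end-of-input marker

# Simplified fragment (assumption): body section omitted.
GRAMMAR = {
    "program": [["header_main", "main_phrase", "endmark_main"]],
    "header_main": [["STARTSYMBOL"]],
    "main_phrase": [["OFSYMBOL", "PROGRAMSYMBOL"], []],  # [] is the empty production
    "endmark_main": [["ENDSYMBOL"]],
}
START = "program"


def first_of_sequence(symbols, first):
    """FIRST set of a sequence of symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol in the sequence can derive the empty string
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # iterate until no set grows any further
        changed = False
        for nt, productions in GRAMMAR.items():
            for prod in productions:
                new = first_of_sequence(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(EOF)
    changed = True
    while changed:
        changed = False
        for nt, productions in GRAMMAR.items():
            for prod in productions:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue  # terminals have no FOLLOW set
                    rest = first_of_sequence(prod[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def synchronizing_sets():
    first = compute_first()
    follow = compute_follow(first)
    return {nt: (first[nt] - {EPS}) | follow[nt] for nt in GRAMMAR}
```

For this fragment, `synchronizing_sets()` gives `{STARTSYMBOL, EOF}` for `program` and `{ENDSYMBOL, EOF}` for `endmark_main`, which coincide with the corresponding sets listed in this appendix. The other two sets are not expected to match the listing, since the fragment is simplified.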

program = {STARTSYMBOL, EOF}
header main = {STARTSYMBOL, CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL}
main phrase = {OFSYMBOL, CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, EOF}


endmark main = {ENDSYMBOL, EOF}
body section = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL}
declaration = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL}
dat type name declare = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, THESYMBOL, ASYMBOL, ANSYMBOL, FLOATSYMBOL, CHARACTERSYMBOL, STRINGSYMBOL, INTEGERSYMBOL}

all var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, FLOATSYMBOL, CHARACTERSYMBOL, STRINGSYMBOL, INTEGERSYMBOL}
nonint var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, FLOATSYMBOL, CHARACTERSYMBOL, STRINGSYMBOL}
int var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, INTEGERSYMBOL}

flt var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, FLOATSYMBOL}
chr var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, CHARACTERSYMBOL}
str var dec = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, STRINGSYMBOL}

ideclare phrase = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, VARIABLESYMBOL, ARRAYSYMBOL}
fdeclare phrase = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, VARIABLESYMBOL, ARRAYSYMBOL}
cdeclare phrase = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, VARIABLESYMBOL, ARRAYSYMBOL}

int initial = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, INTEGERCONSTANT, CHARACTERCONSTANT}
flt initial = {CREATESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, INTEGERCONSTANT, CHARACTERCONSTANT, FLOATCONSTANT}
statement = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

input statement = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}
any identifier = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}
array variable = {AINT IDENTIFIER, AFLT IDENTIFIER, ACHR IDENTIFIER, ROWSYMBOL}
array identifier = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, AINT IDENTIFIER, AFLT IDENTIFIER, ACHR IDENTIFIER}

output statement = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}
output parameter = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, CHARACTERCONSTANT, INTEGERCONSTANT, FLOATCONSTANT, STRINGCONSTANT, ASYMBOL, THESYMBOL, VALUESYMBOL, TEXTSYMBOL}
output para trail = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, VALUESYMBOL, TEXTSYMBOL}

const identifier = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, OFSYMBOL, CHARACTERCONSTANT, INTEGERCONSTANT, FLOATCONSTANT, STRINGCONSTANT}
constants = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, CHARACTERCONSTANT, INTEGERCONSTANT, FLOATCONSTANT, STRINGCONSTANT}
loop statement = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

loop types = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, INTEGERCONSTANT, WITHSYMBOL, WHILESYMBOL, ASSYMBOL}
for loop = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, INTEGERCONSTANT, WITHSYMBOL}
for cases = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, INTEGERCONSTANT, WITHSYMBOL}

limit = {OFSYMBOL, TODOWNTOSYMBOL, ANDSYMBOL, ASYMBOL, THESYMBOL, STEPSYMBOL, INTEGERCONSTANT, INT IDENTIFIER}
while loop = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, WHILESYMBOL, ASSYMBOL}
while phrase = {INT IDENTIFIER, FLT IDENTIFIER, INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, WHILESYMBOL, ASSYMBOL}
close loop = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

terminate loop = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}
continue loop = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}
relational expression = {INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}
logical operator = {ANDSYMBOL, ORSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS}

if statement = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

word if = {IFSYMBOL, PROVIDEDSYMBOL, ONSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS}

if suffix = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

expression = {INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

expression ds = {PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

term = {INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

term ds = {MULTIPLICATION, DIVISION, PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

factor = {INT IDENTIFIER, FLT IDENTIFIER, INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, MULTIPLICATION, DIVISION, PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

numeric constants = {INTEGERCONSTANT, FLOATCONSTANT, MULTIPLICATION, DIVISION, PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

numeric identifier = {MULTIPLICATION, DIVISION, PLUS, MINUS, RELATOPERATOR, ANDSYMBOL, ORSYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL}

assignment = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

array assignment = {ACHR IDENTIFIER, AINT IDENTIFIER, AFLT IDENTIFIER, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

numeric array = {AINT IDENTIFIER, AFLT IDENTIFIER, ASSIGNMENTOPERATOR}

char operand = {CHARACTERCONSTANT, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

str rvalue = {STRINGCONSTANT, ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL}

numarr operand = {ENDSYMBOL, ELSEIFSYMBOL, ELSESYMBOL, INPUTMARKSYMBOL, INPUTSYMBOL, OUTPUTSYMBOL, REPEATSYMBOL, CHR IDENTIFIER, STR IDENTIFIER, ROWSYMBOL, INT IDENTIFIER, FLT IDENTIFIER, TERMINATESYMBOL, CONTINUESYMBOL, EXECUTESYMBOL, INTEGERCONSTANT, FLOATCONSTANT, LEFTPARENTHESIS, ROWSYMBOL}