Integrating a Graph Builder Into Python Tutor
Total Page:16
File Type:pdf, Size:1020Kb
Integrating a Graph Builder into Python Tutor Diogo Soares # University of Minho, Braga Portugal Maria João Varanda Pereira1 # Ñ Research Centre in Digitalization and Intelligent Robotics, Polythechnic Insitute of Bragança, Portugal Pedro Rangel Henriques # Ñ University of Minho, Braga, Portugal Abstract Analysing unknown source code to comprehend it is quite hard and expensive task. Therefore, the Program Comprehension (PC) subject has always been an area of interest as it helps to realize how a program works by identifying the code that implements each functionality. This means being able to map the problem domain with the program domain. PC is a complex area, but its importance for programmers is so high that many approaches and tools were proposed along the last two decades. Program Animation is one of those approaches requiring specialized techniques. For each programming language, there are already tools that enable us to execute a program step by step, visualize its execution path, observe the effect of each instruction on its data structures, and inspect the value of its variables at any point. In the present context, we sustain the idea that PC techniques and tools can also be of great value for students taking the first steps in programming using a specific language. To this end, weaimto improve Python Tutor, a well-known program visualization tool, with graph-based representations of source code such as Control Flow Graph (CFG), Data Flow Graph (DFG), Function Call Graph (FCG) and System Control Graph (SCG). This helps novice programmers to understand the source code analyzing not only the variable contents but also a set of automatically generated graph-based visualizations, that were not included in Python Tutor so far. This will allow the students to be focused on certain aspects of the program (depending on the graph), abstracting others such as details of its syntax. 2012 ACM Subject Classification Human-centered computing → Visualization systems and tools; Social and professional topics → Computer science education; Software and its engineering → Source code generation Keywords and phrases Program Visualization, Python Tutor, Data Flow Graphs, Control Flow Graphs Digital Object Identifier 10.4230/OASIcs.ICPEC.2021.6 Supplementary Material Software (Graph Builder): https://graph-builder.di.uminho.pt/visualize.html Funding This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Projects Scopes: UIDB/05757/2020 and UIDB/00319/2020. 1 Introduction There’s along the web many tutorials on programming and several learning methodologies but the majority of them uses the same formula [4]: start with the explanation of a set of examples and propose some similar exercises that train the students to write similar code to solve them. To fully understand a programming language, it is not enough to read the code 1 to mark corresponding author © Diogo Soares, Maria João Varanda Pereira, and Pedro Rangel Henriques; licensed under Creative Commons License CC-BY 4.0 Second International Computer Programming Education Conference (ICPEC 2021). Editors: Pedro Rangel Henriques, Filipe Portela, Ricardo Queirós, and Alberto Simões; Article No. 6; pp. 6:1–6:15 OpenAccess Series in Informatics Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany 6:2 Graph-Based Visualization Builder of programs written in that language, it is also necessary to comprehend how the computer behaves when executing the programs code. For each piece of code, the student shall be able to build a full map between the program domain and the problem domain. Usually it is hard to understand his own code, even to a software engineer that has to maintain third-part code he never saw before, the task is much harder. The average developer spends about 60% to 90% of his time understanding code previously developed; these figures are not changing over time [19]. With this in mind, this paper describes and discusses the integration into Python Tutor of a new tool for automatic generation of graph-based visualizations to turn ease the task of program comprehension. As said in [7], the best way to understand programs is by giving programs another aspect than that of their source code. This new feature works as Python Tutor plugin and it analyses the input code and produces several types of complementary graphs depending on the user desires. As the program comprehension task is also very relevant for the beginners wishing to understand how a given program works and how it solves a given problem, we believe that Python Tutor improved with our plug-in will be very useful to help programming students. The paper has 5 sections after that. The state of the art on program animation appears in Section 2. Section 3 discusses how graph-based visualizations can be included in an Animator adding actual value to it. Section 4 exhibits the architecture to build the new tool and discusses how it can be developed and integrated into the host tool. Section 5 illustrates the final system built and describes a experiment conducted to assess the value addedby the visualizations provided. Section 6 closes the paper and points out some directions for future research. 2 Program Visualization and Animation Think-aloud protocol [22] is a method where a person verbalize his thoughts and it can be used not only to improve self−understanding and reasoning but also to explain the cognitive process to others. That’s why is so important to produce documentation when developing software. The use of this method also allows realizing that the thoughts of a developer differ according to the familiarity of a program’s domains [17]. With this in mind, it was derived two concepts of comprehension models, the top-down comprehension model and the bottom-up comprehension models. The top-down approach is used when the programmer is familiar with the program he’s trying to understand. He starts by creating a general hypotheses about the program goal and how it is achieved by using his previous knowledge about the program. This is called the expectation-based comprehension, where the programmer has pre-generated expectations of the code’s meaning [13]. After creating his hypotheses, the developer starts searching in the code for algorithms/structures that he can link to his theory and refines his hypotheses as he does that [6]. This is the inference-based comprehension [13]. If the programmer has no previous knowledge about a program, he has to analyze line by line to create a hypothesis about the program purpose. In this method, the developer starts aggregating functions by their goals. What usually happens is that developers end by using an integrated model [11] where they use top-down approach when possible and bottom-up when necessary. Although the way of coding varies from person to person, follow a guide of good practices in programming (i.e. appropriate variable names and indentation rules [18]) can lead to an uniformed structure that makes program comprehension easier. A different form of coding can really slow down the comprehension of a program for the most experienced programmer [20]. D. Soares, M. J. V. Pereira, and P. R. Henriques 6:3 Nowadays, there’s even more factors that can affect program comprehension. For example, a simple color schema, available in syntax-oriented text editor/IDE, helps the programmers comprehending a program [16]. Newer studies have their focus not only on source code comprehension, but also on the comprehension of structures, hierarchies and architecture of the program as well on the relations between components [19]. Software visualization is the process of giving a visual representation of a program. It takes the information that is presented in the source code and converts it to a graphical representation with the goal of improving the comprehension of that program. Although a visual image can help to understand a program, it’s not that simple since a basic program can have multiple representations. The best one will depend on the task to be performed and what the developer wants to know. This also means that each representation has to be thought for specific tasks and a good portray of the system is imperative for it to behelpful. Simply repackaging massive textual information into a massive graphical representation is not helpful, so a linear translation is not ideal since experts want to see software visualised in context – not just what the code does, but what it means [14]. The visual representation of a program can be focused in some aspects of the program and it can have different levels of abstraction. This can also solve scalability problems because the visualization of big programs at once it will not be a good help. So, the visualization tool must allow an high level of interactivity to go one step at a time, filter the information, to establish the data that is wanted, to define the level of abstraction and zoom facilities [8]. Graphically these representations can be based on diagrams, maps, icons, graphs and specific drawings. Some examples are the Nassi-Shneiderman diagram used to represent a control-flow of a program [12]; Space-filling (like tree maps or sunburts) allows to visualize in a very compressed way code metrics and statistics [5]; Graphs are the most common representation and consists of nodes and arcs where the nodes represent blocks of code and the arcs its flow [10]. In terms of functionality, there are two types of visualizations: static and dynamic [10]. A static representation can be seen as a simple image of the program, it doesn’t change over time and it’s based on static information. Dynamic representations shows the information that changes as the program is executed. A good example of an animated visualization tool is Python Tutor.