DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS STOCKHOLM, SWEDEN 2019

Using eye tracking to study variable naming conventions and their effect on code readability

PONTUS BROBERG

SHAPOUR JAHANSHAHI

KTH SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Master in Computer Science
Date: June 7, 2019
Supervisor: Richard Glassey
Examiner: Örjan Ekeberg
School of Electrical Engineering and Computer Science
Swedish title: En studie av variabelnamngivningskonventioners åverkan på läslighet av kod med hjälp av ögonspårning


Abstract

Using camel case when naming variables is largely considered to be best practice when writing code these days. But is it really the best variable naming convention when it comes to code readability and understanding? And how do different variable naming conventions affect the readability of code? This thesis researches these questions using eye tracking technology. Test subjects are timed as they look at and explain code snippets using different variable naming conventions while their gaze is plotted onto a heatmap. The variable naming conventions tested were single letters, single words, multiple words in camel case and multiple words in snake case. From the results, the conclusion is drawn that no significant difference in readability can be confirmed between the different variable naming conventions.

Sammanfattning

Using camel case when naming variables in code is widely considered good practice. But is it really the best way to name variables when it comes to readability and comprehension of code? And how do different variable naming conventions affect the readability of code? This thesis investigates these questions with the help of eye tracking technology. Test subjects were timed while they looked at and tried to explain small code examples with different kinds of variable names, while their gaze was translated into a heatmap. The variable naming styles tested were a single letter, a single word, multiple words in camel case and multiple words in snake case. From the results, the conclusion is drawn that no significant difference in code readability could be found between the different variable naming conventions.

Contents

1 Introduction
1.1 Research Question
1.2 Approach
1.3 Thesis Outline

2 Background
2.1 Variable naming conventions
2.2 Heatmap
2.3 Code readability
2.4 Eye tracking
2.5 Related work

3 Methods
3.1 Preparations
3.1.1 Creating code snippets
3.1.2 Test station setup
3.2 User tests
3.3 Limitations

4 Results
4.1 Test subjects
4.2 Mean time
4.3 Time distribution
4.4 Questions
4.5 Heatmaps

5 Discussion
5.1 Timed data
5.2 Heatmaps in relation to timed data
5.3 Possible improvements
5.4 Future work

6 Conclusions

Bibliography

A Heatmaps

Chapter 1

Introduction

Reading and understanding code is a key component in the life of a programmer, and being able to understand a program quickly is something that is associated with being a good programmer. However, in a perfect world, a programming novice should also be able to understand a piece of code quickly if the code is written in such a way that it is easy to understand. Since about 70% of code consists of identifiers [1], one could argue that code readability mostly depends on how the author chooses to name these variables, methods and classes. Code readability is also very important when maintaining programs. A bit more than two thirds of a program's total lifecycle cost is spent on maintenance, and researchers have noted that reading code seems to be where most of the maintenance time is spent [2].

While studying computer science at KTH, one is introduced to programming languages such as Java, Python and Go during the first year, and all of these languages seem to follow the same kind of praxis, using camel case, when it comes to naming variables. However, one never really gets a scientific explanation as to why you should program using camel case. Instead you are simply told that it is the best variable naming convention and that you should write code according to it. Is this really the best way to name variables, or is it just popular without any scientific reason to back it up?

With eye tracking technology it is possible to get instant feedback on where a developer is looking on a monitor. Using this technology, a developer can be presented with a snippet of code on a monitor and be told to read it, trying to figure out what the code does. The eye tracker can then generate a heatmap showing which parts of the code the developer spent the most time reading.


1.1 Research Question

The purpose of this study is to challenge the praxis of using camel case to name variables when writing code, and to see whether another convention might be better in terms of code readability and comprehension. The study will therefore aim to answer the question: How does using different variable naming conventions affect the readability of a program?

1.2 Approach

In order to answer this question, four different variable naming conventions will be used: single letter, single word, multiple words in camel case and multiple words in snake case. Six different code snippets will be written, and each snippet will come in two versions using two different variable naming conventions. All of the variable names will be chosen to describe the code as well as possible within the limits of the variable naming convention. A test subject will be shown a set of code snippets and, using Tobii eye trackers, their gaze will be translated to a heatmap. All test subjects will also be timed as they try to explain in general terms what the snippet of code does. The data from these tests will then be analyzed and discussed, and conclusions will be drawn.

1.3 Thesis Outline

The following chapter provides relevant background, definitions and a brief explanation of how eye tracking works. It also presents some similar studies that have been done in this area. The third chapter describes the methods used to gather data. In the fourth chapter all results are presented. The fifth and sixth chapters discuss the results and answer the research question.

Chapter 2

Background

This chapter begins with definitions of the variable naming conventions that we use in this study. Following that is a rundown of the eye tracking technology that is used, explaining briefly how it works and what it can do. The last section of the chapter presents related work that has been done previously in the field.

2.1 Variable naming conventions

Using a set of rules for naming variables in code is a wide-spread practice. A prime example of when to use such a variable naming convention is when working with other people on writing and maintaining code [3].

This section defines the four variable naming conventions that we used during the tests. These can essentially be split into two groups: multi-worded variables (camel case and snake case) and single-worded variables (single letter and single word).

Definition: Camel case

Multiple words in camel case (hereafter named MWCC) is a variable naming style used to write variables with multiple words without using spaces. Since a space in code marks the start of a new command or a new part of the code, spaces cannot be used when naming variables. To distinguish the words from each other, every new word starts with a capital letter. An example of MWCC would be sumOfNumbers. [4]


Definition: Snake case

Multiple words in snake case (hereafter MWSC) is a multi-worded variable naming style where underscores are used instead of spaces. The underscores used to separate the words make the variable look somewhat like a snake, hence the name. Note that no letters are capitalized in MWSC. An example of MWSC would be sum_of_numbers. [4]

Definition: Single word

Single worded (hereafter SW) variable names are names that use only one word. No capital letters are used. An example of SW would be sum.

Definition: Single letter

Single letter (hereafter SL) variable names use only a single letter to describe the variable. Letters are not written in uppercase, and the letter is chosen to describe the variable as well as possible. An example of SL would be s, since that is the first letter of the word sum.
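To make the four definitions concrete, the sketch below declares the same accumulator in each style. This is our own illustration, not code from the test material; the variable names are hypothetical examples.

// Illustrative only: the same variable declared in each of the four
// conventions studied. These declarations are our own examples, not
// code from the actual test snippets.
public class NamingConventions {
    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4};

        int s = 0;               // SL: single letter, first letter of "sum"
        int sum = 0;             // SW: single word
        int sumOfNumbers = 0;    // MWCC: multiple words in camel case
        int sum_of_numbers = 0;  // MWSC: multiple words in snake case

        for (int n : numbers) {
            s += n;
            sum += n;
            sumOfNumbers += n;
            sum_of_numbers += n;
        }
        System.out.println(s + " " + sum + " " + sumOfNumbers + " " + sum_of_numbers);
    }
}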

2.2 Heatmap

A heatmap is a graphical representation of a matrix. Individual tiles in a grid are painted with a shaded colour, scaled to represent the value of the corresponding element of the data matrix [5]. If we interpret a computer screen as a grid with rows and columns, we can use a heatmap to visualize which parts of the screen a user looks at the most.
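As a rough sketch of that idea (our own illustration; the study's actual heatmap generation was done through Unity and the tooling of Sundkvist and Persson [11]), gaze samples can be binned into a matrix of screen tiles whose counts are later mapped to colours:

// A minimal sketch of gaze-to-heatmap binning. The screen is divided
// into a grid of tiles; each recorded gaze point increments the count
// of the tile it falls in. Tile counts can then be mapped to colours.
public class GazeHeatmap {
    private final int[][] counts;
    private final int tileSize; // tile side length in pixels (assumed square)

    public GazeHeatmap(int screenWidth, int screenHeight, int tileSize) {
        this.tileSize = tileSize;
        this.counts = new int[screenHeight / tileSize][screenWidth / tileSize];
    }

    // Record one gaze sample given in screen pixel coordinates.
    public void addGazePoint(int x, int y) {
        int row = y / tileSize;
        int col = x / tileSize;
        if (row >= 0 && row < counts.length && col >= 0 && col < counts[0].length) {
            counts[row][col]++;
        }
    }

    // Intensity in [0, 1] for one tile, scaled by the hottest tile so far.
    public double intensity(int row, int col) {
        int max = 1;
        for (int[] rowCounts : counts) {
            for (int c : rowCounts) {
                max = Math.max(max, c);
            }
        }
        return counts[row][col] / (double) max;
    }
}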

2.3 Code readability

Buse and Weimer define readability as "a human judgment of how easy a text is to understand" [2]. They acknowledge that there are many different factors that determine the readability of code. There are formatting aspects, such as the use of proper indentation, fonts and colours, as well as identifier names. Code complexity can also affect code readability [6].

2.4 Eye tracking

According to Tobii's website [7], eye tracking is a sensor technology used in computers and other devices to see where a user is looking. A user can also use this technology to control a computer with their eyes, since the tracker not only shows where a user is looking but can also detect presence, attention and focus. This is especially useful when a user is unable to speak or use their hands.

The eye tracker works by creating a pattern of near-infrared light on the user's eyes with the help of its cameras, projectors and algorithms. High-resolution images are taken of the user's eyes and the light patterns, and then the user's gaze point is calculated with the help of machine learning, image processing and mathematical algorithms. In this thesis we use eye tracking to plot a heatmap that shows where the test subject spent most of their time looking at the code.

2.5 Related work

A similar study from 2010 by Bonita Sharif and Jonathan I. Maletic also used eye tracking to study identifier styles such as camel case and snake case (under_score) [8]. In that study, an eye tracker was used to gather quantitative data as additional insight to other data gathering techniques. The results show that there was no significant difference between the two styles when it came to accuracy; however, subjects seemed to recognize identifiers in the underscore style more quickly than camel case.

Another study on camel case and underscore was conducted in 2009, where the authors wanted to better understand the readability of identifiers [9]. An empirical study of 135 programmers and non-programmers was conducted where the experiment was presented as a game. A subject was shown a phrase and then shown four different clouds moving on the screen, each cloud containing an identifier written in either camel case or underscore style. Only one cloud matched the previously shown phrase exactly, and the subject was to identify this cloud as quickly as possible. Results from this study showed that camel case identifiers provided higher accuracy among all subjects, regardless of programming experience. It was also found that those trained in the camel case style recognized camel cased identifiers faster than identifiers written in underscore style.

Several studies have been done on code readability and how to measure whether code is readable or not. One example is a study by Todd Sedano from 2016, where he presents and tests a new technique for measuring code readability [10]. 21 subjects were involved in this field test, where they all followed Sedano's code readability testing during four sessions. After these sessions, half of the people writing unreadable code were now writing readable code, and all the programmers already writing readable code improved.

Chapter 3

Methods

In this chapter we present the methods used. The first section describes how we prepared the code snippets and set up the test stations. The following section describes the process of conducting the user tests. The final section discusses the limitations of our method.

3.1 Preparations

3.1.1 Creating code snippets

In order to compare our selected variable naming conventions to each other, we created code snippet pairs. Each snippet pair represents a common programming challenge, written using different variable naming conventions. The code snippets in each pair are identical except for the variable names used.

Java was our language of choice in the code snippets because we targeted test subjects who were Computer Science students with at least one year of Java programming experience.

We limited the number of code snippet pairs to six in order to keep each user test within a 15-minute maximum, allowing for as many tests as possible. The distribution of comparisons between the selected variable naming conventions was chosen to reflect our hypothesis. The number of comparisons between Single Letter/Word and Multi-Word variable names was maximized without excluding comparisons between SL and SW, or between MWCC and MWSC.


Code Snippet Pair    Test group 1    Test group 2
1                    SL              MWSC
2                    SW              MWCC
3                    MWCC            MWSC
4                    MWSC            SL
5                    SL              SW
6                    MWCC            SW

Table 3.1: Variable naming conventions used in the code snippets.

Code Snippet 1

This code snippet pair calculates the sum of the values in an array. The snippets in this pair use the SL and MWSC variable naming conventions, respectively.
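The actual pair is shown in figure 3.1. As a hedged reconstruction of what such a pair might look like, the sketch below sums an array once with single letters and once in snake case; apart from numbers_to_add, a name discussed in chapter 5, the names are our own guesses rather than the exact ones used in the tests.

// Hypothetical reconstruction of code snippet pair 1 (summing an array).
// The real snippets appear in figure 3.1; these variable names are our
// own guesses at the two styles.
public class SnippetPair1 {
    // SL version: single letters chosen from the first letter of each word.
    static int f(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // MWSC version: the same logic with snake_case multi-word names.
    static int sum_of_numbers(int[] numbers_to_add) {
        int running_sum = 0;
        for (int i = 0; i < numbers_to_add.length; i++) {
            running_sum += numbers_to_add[i];
        }
        return running_sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(f(data) + " " + sum_of_numbers(data));
    }
}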

Code Snippet 2

This code snippet pair represents a Selection Sort algorithm. The snippets in this pair use the SW and MWCC variable naming conventions, respectively.

Code Snippet 3

This code snippet pair can be interpreted as a random delay function. The snippets in this pair use the MWCC and MWSC variable naming conventions, respectively.

Code Snippet 4

This code snippet pair prints the sum of all even numbers between 1 and 10, inclusive. The snippets in this pair use the MWSC and SL variable naming conventions, respectively.

Code Snippet 5

This code snippet pair computes the minimum distance between coordinate pairs. The snippets in this pair use the SL and SW variable naming conventions, respectively.

Code Snippet 6

This code snippet pair represents a currency exchange. The snippets in this pair use the MWCC and SW variable naming conventions, respectively.

3.1.2 Test station setup

Two parallel test stations were prepared using a pair of Tobii EyeX eye trackers connected to computers running the Unity video game engine. Using the work of Sundkvist and Persson [11], we prepared 2D views in Unity containing our code snippets. During a user test, the test subject looks at the screen, the Tobii EyeX tracks the eye movements of the test subject in relation to the rendered 2D views, and Unity stores the coordinate data for later use.

3.2 User tests

Test subjects were divided randomly into two evenly distributed test groups. Each test group was assigned to a test station. Each test subject was first given an introduction during which we calibrated the eye tracking gear to adjust to the test subject's eyes. When the calibration was satisfactory and the test subject was ready, we moved on to the code snippets.

Each test subject was presented with six code snippets based on which group they were assigned to (see table 3.1). This was done to make sure that no test subject would see the same kind of code with two different variable naming conventions, since it would always be easier to understand the code the second time around. An example of code snippet pair 1 is shown in figure 3.1. The objective was to correctly explain what the code did in general terms. For each of the six code snippets, the time it took for the test subject to reach the objective was recorded using a stopwatch. Finally, after all of the snippets had been seen, the subject was asked three follow-up questions:

1. Did you notice any difference between the code snippets?

2. Out of these four different ways to write variables, which one, if any, do you use?

3. Which of these four variable naming conventions did you feel was the easiest to understand?

Between questions one and two we debriefed the subject on what the difference between the pictures was and what the test was about.

Figure 3.1: Code snippet pair 1

When all the test subjects were done with their tests, heatmaps were generated using the stored coordinate data from each code snippet that the test subject gazed upon [11]. We fused these heatmaps (illustrating the gaze of the test subjects) together, creating one heatmap per code snippet (6 per group, 12 in total).
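The fusion step itself amounts to element-wise addition of per-subject gaze-count matrices. A minimal sketch, assuming heatmaps are stored as equally sized count matrices (our illustration, not the actual tooling from [11]):

// A sketch of the fusion step: per-subject heatmaps of equal dimensions
// are combined element-wise into one accumulated heatmap per snippet.
import java.util.List;

public class HeatmapFusion {
    // Combine per-subject count matrices for one code snippet by
    // element-wise addition. All matrices must have the same dimensions.
    public static int[][] fuse(List<int[][]> subjectHeatmaps) {
        int rows = subjectHeatmaps.get(0).length;
        int cols = subjectHeatmaps.get(0)[0].length;
        int[][] fused = new int[rows][cols];
        for (int[][] heatmap : subjectHeatmaps) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    fused[r][c] += heatmap[r][c];
                }
            }
        }
        return fused;
    }
}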

3.3 Limitations

An important limiting factor was the small sample size. The test subjects were all computer science students with similar experience, which could also have affected the results. Time restraints due to test station availability limited us to 15 minutes per test subject, which led to the decision of having the test subjects look at 6 code snippets each. The test stations were equipped with Tobii EyeX eye trackers of unknown quality, which proved to be a significant problem for us.

Chapter 4

Results

In this chapter we present the results from the user tests. The first two sections show the average and the distribution of time to understand each code snippet. This is followed by the results of the questions posed to the test subjects. Heatmaps of the gazes of the test subjects are mentioned in the last section of this chapter and can be found in Appendix A.

4.1 Test subjects

There were a total of 33 participants divided into two groups (17 in Group 1, 16 in Group 2).

4.2 Mean time

The arithmetic mean of the time to finish the objective for participants within a group was documented separately for every code snippet, and these average values can be seen in figure 4.1. It is clear that for each code snippet it took Group 2 longer on average to understand the code.

Figure 4.1: Average time to answer per code snippet.

4.3 Time distribution

In figure 4.2 we present the distribution of the time it took the test subjects to finish the test for each code snippet, using a box plot. The median values are displayed using horizontal lines in the boxes. The boxes in this graph represent the inter-quartile range (IQR, i.e. the range of data points between the first and third quartiles of the data sets). Outliers and suspected outliers are data points at least 3 and 1.5 inter-quartile ranges, respectively, above the third quartile, and they are pictured as filled or outlined circles, respectively, above the plot. The whiskers underneath and above the boxes show the minimum and maximum values, respectively. If there are any (suspected) outliers, the upper whisker will instead represent the inner fence (i.e. the start of the range where suspected outliers can exist, 1.5 IQRs above the third quartile) [12].
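For concreteness, the sketch below computes these fences from a sample of completion times (hypothetical numbers, not our measured data) using the standard Tukey-style definitions from [12]; quartile interpolation conventions vary between tools, so this is illustrative rather than the exact procedure behind figure 4.2.

// A sketch of the box-plot fences described above. Data points beyond
// the inner fence (Q3 + 1.5 * IQR) are suspected outliers; points beyond
// the outer fence (Q3 + 3 * IQR) are outliers.
import java.util.Arrays;

public class BoxPlotFences {
    // Simple quartile estimate: median of the lower/upper half of the data.
    static double quartile(double[] sorted, boolean upper) {
        int n = sorted.length;
        int half = n / 2;
        double[] part = upper
                ? Arrays.copyOfRange(sorted, n - half, n)
                : Arrays.copyOfRange(sorted, 0, half);
        int m = part.length;
        return m % 2 == 1 ? part[m / 2] : (part[m / 2 - 1] + part[m / 2]) / 2.0;
    }

    public static void main(String[] args) {
        double[] times = {22, 25, 27, 29, 30, 33, 36, 41, 95}; // hypothetical data
        Arrays.sort(times);

        double q1 = quartile(times, false);
        double q3 = quartile(times, true);
        double iqr = q3 - q1;

        double innerFence = q3 + 1.5 * iqr; // beyond this: suspected outlier
        double outerFence = q3 + 3.0 * iqr; // beyond this: outlier

        for (double t : times) {
            if (t > outerFence) System.out.println(t + " s: outlier");
            else if (t > innerFence) System.out.println(t + " s: suspected outlier");
        }
    }
}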

As can be seen in the results from code snippet pairs 1 and 4 in figure 4.2, the median values for MWSC are lower than those of their SL counterparts, and code snippet pairs 2 and 6 show lower median values for SW compared to MWCC. In snippet pair 3, the median of MWCC is lower than MWSC, and in pair 5, the median of SL is lower than SW.

Furthermore, we observed that snippets 3MWCC, 3MWSC and 4SL each had outliers that took more than twice as much time as any other test subject to finish. In snippets 1SL, 4MWSC, 5SL and 5SW there are suspected outliers well beyond the inner fence. Some of the box plots show a large variance, which is most clearly visible in the size of the IQRs of snippet pairs 2 and 5. Group 2 had a greater maximum evaluation time than Group 1 in all but the first code snippet.

Figure 4.2: The distribution of time in seconds to answer each code snippet.

4.4 Questions

Two of the three questions asked after the tests were quantifiable. Figure 4.3 shows, for each naming convention (SL, SW, MWCC, MWSC), how many of the test subjects claim to use it when programming. MWCC and SW are shown to be the most frequently used among our test subjects. Figure 4.4 presents which of the naming conventions the subjects thought was the easiest to understand. Here MWSC seems to be the easiest, with SW and MWCC also receiving many votes. Note that, for both of these questions, no answer and multiple answers were allowed.

Figure 4.3: The variable naming convention used by the test subject.

Figure 4.4: Which variable naming convention was easiest to understand?

4.5 Heatmaps

The heatmaps of the accumulated gazes of the test subjects can be found in Appendix A. The pink dots on the pictures represent where each test subject was looking according to the eye tracker. We fused the heatmaps from all test subjects who viewed the same code snippet, resulting in 12 heatmaps, one for each code snippet. Separate heatmaps for each snippet viewed by each subject would have resulted in too many pictures, and individually these pictures contain insufficient useful data.

Chapter 5

Discussion

This chapter discusses the results in different areas. First, the timed data is discussed. The following section puts the timed data in relation to the generated heatmaps. In the final section we discuss improvements to this study and potential future work.

Before we move on to more detailed discussions, it is important to highlight the subjective nature of these tests. The specific variable names were, as previously explained, chosen to describe the variables as well as possible. But how to describe a variable as well as possible is a highly subjective matter. While we would think that numbers_to_add is a good descriptive name for an array, one could also argue that it could be misinterpreted, or simply not be as clear to someone else.

5.1 Timed data

Overall, the results are quite in line with earlier studies on the subject. SL code snippets, on average, took as long as or longer than the MWCC/MWSC snippets, and no significant difference could be seen between MWCC and MWSC. If we look at snippets 1 and 6, we can see that the average results between the groups are very similar. We believe this is because these two were the least complex snippets, where the variables didn't really need to give much information for the subject to understand the code.

Snippet 2 is an interesting case. Here we can see that MWCC is slower than SW even though the first variable declared in the MWCC snippet is called arrayToSort. We assumed beforehand that this variable name was going to be a dead giveaway and that the subjects, upon seeing this, would immediately identify the code as a sorting algorithm, but this was not the case, since the average time of SW is about 13 seconds shorter. There can be several reasons for this. One of them could be that the subject, looking at the MWCC snippet, thought that there could be something else going on in the code besides sorting. This would result in the subject reading the code thoroughly, hence taking more time to finally answer what it does.

Another reason could be that we are seeing an example of the Hawthorne Effect [13]. This effect is defined as the problem in field studies that arises when subjects are aware that they are participating in an experiment, so that their behaviour is modified from what it would have been had they not been aware of the experiment taking place. What we think might have happened is that since the subject was aware that an experiment was taking place, they read all of the code instead of jumping to a conclusion upon seeing the first arrayToSort variable. It is however important to mention that the Hawthorne Effect has been criticized, and there are studies that conclude that there is no evidence for such an effect [14].

If we look at snippets 3 and 5, where we compare MWCC to MWSC and SL to SW, it is clear that the results don't really favour one over the other. On snippet 3, the averages were only about three seconds apart. Even though the averages on snippet 5 were about 17 seconds apart, the box plot in figure 4.2 shows that the inter-quartile ranges of these snippets overlap and cover both medians, so there isn't any significant difference. The SW snippet has a higher variance, as can be seen from its larger IQR. Not taking the suspected outliers into account would give SW a new average time of 74 seconds, which is much closer to its corresponding SL average time.

Lastly, looking at snippet 4, we see the biggest difference in percentage between the groups, at about 26%. However, the SL snippet has an outlier similar to the one in snippet 5 SW when it comes to the maximum time spent, which can be seen in figure 4.2. The snippet 4 SL maximum time is more than double the maximum time of snippet 4 MWSC. If we choose to calculate the average time for snippet 4 SL without the outlier, we get a new average time of 41 seconds.

Taking all of this into account, we cannot see any significant difference when it comes to naming variables with multiple words or single words, or even single letters. We can however derive from figures 4.3 and 4.4 that conventions such as MWSC and MWCC seem to be a lot more popular, and are perceived as easier to understand from the developer's point of view.

5.2 Heatmaps in relation to timed data

As we can see in Appendix A, the gaze points on the heatmaps do not perfectly align with the text in the code snippets. However, we find it highly improbable that the test subjects were looking at the gaps in-between the text rows instead of at the code itself. The patterns in figure A.6 and figure A.11 indicate that the users followed the patterns of the code snippet rows, but the information got corrupted somehow. This is most likely due to an eye tracking calibration error. However, some information can still be gathered.

When comparing snippet 1 SL and snippet 1 MWSC, it seems that the subjects looking at SL spent most of their time looking at the calculation operations of the code. We can also see that the subjects looking at MWSC have a much more evenly spread out heatmap. The average times of these two snippets were very similar (29 and 30 seconds respectively), and this would suggest that to understand the code using SL variables, the subjects had to calculate every operation, while the subjects looking at MWSC variables only had to read the calculation operations once to understand what was going on.

5.3 Possible improvements

The eye tracking equipment didn’t work quite as expected. We ran into many problems. Some had to do with the hardware, such as sensitivity to changes in light (which we could not control) and varied results between the two test systems (i.e. the eye tracking gear). Others had to do with the test subjects moving their heads around too much. A single test system setup would have required double the time, but would have been more accurate.

We unintentionally introduced a couple of bugs in our code snippets. This resulted in some mistrust from the test subjects, some of whom started to look for more bugs. This could possibly have driven up the completion time of those tests. Also, creating definitive result objectives for each code snippet would have simplified the evaluation of the completion of the tests.

Finally, the questions we asked each test subject after the code snippets could have been put into a questionnaire instead. This would have helped prevent any bias from being introduced while writing down the answers.

5.4 Future work

A point of interest would be to attempt to reduce the variable naming conventions problem to a descriptive variables problem, in order to ascertain the effect on readability that variables using the same naming convention but different levels of descriptiveness have. One way this could be done is by narrowing the problem down to SL variable names and determining whether descriptive variable letters (e.g. 'e' and 'o' representing 'even' and 'odd', respectively) are more readable than alphabetically ordered variable letters (e.g. 'a', 'b', 'c').
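As a sketch of what such a future test snippet could look like (our own hypothetical example, not part of this study's material), the same loop is shown with descriptive and with alphabetically ordered single letters:

// Illustrative sketch of the proposed future experiment: the same loop
// with descriptive single letters versus alphabetically ordered ones.
public class DescriptiveLetters {
    public static void main(String[] args) {
        int[] v = {1, 2, 3, 4, 5, 6};

        // Descriptive letters: 'e' and 'o' hint at "even" and "odd".
        int e = 0, o = 0;
        for (int n : v) {
            if (n % 2 == 0) e += n; else o += n;
        }

        // Alphabetically ordered letters carry no such hint.
        int a = 0, b = 0;
        for (int n : v) {
            if (n % 2 == 0) a += n; else b += n;
        }
        System.out.println(e + " " + o + " | " + a + " " + b);
    }
}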

Chapter 6

Conclusions

We found some indications that Single Letter variables differed from the other naming conventions, but such a conclusion would require further testing.

In conclusion, no significant difference in readability could be confirmed between the variable naming conventions SL, SW, MWCC and MWSC with the data obtained from our tests.

Bibliography

[1] Florian Deissenboeck and Markus Pizka. "Concise and consistent naming". Fetched: 2019-05-13. 2006. url: https://link.springer.com/content/pdf/10.1007/s11219-006-9219-1.pdf.

[2] R. P. L. Buse and W. R. Weimer. "Learning a Metric for Code Readability". In: IEEE Transactions on Software Engineering 36.4 (July 2010), pp. 546–558. issn: 0098-5589. doi: 10.1109/TSE.2009.70.

[3] Steve McConnell. Code Complete, Second Edition. Redmond, WA, USA: Microsoft Press, 2004. isbn: 0735619670, 9780735619678.

[4] Unknown. Most Common Programming Case Types. Fetched: 2019-05-14. 2018. url: https://chaseonsoftware.com/most-common-programming-case-types/#camelcase.

[5] Michael Friendly. "The history of the cluster heat map". In: The American Statistician (2009).

[6] Andreas Bexell. "Comparing functional to imperative Java: with regards to readability, complexity and verbosity". PhD thesis. 2017. url: http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-64712.

[7] Tobii. This is Eye Tracking. Fetched: 2019-05-14. 2019. url: https://www.tobii.com/group/about/this-is-eye-tracking/.

[8] B. Sharif and J. I. Maletic. "An Eye Tracking Study on camelCase and under_score Identifier Styles". In: 2010 IEEE 18th International Conference on Program Comprehension. June 2010, pp. 196–205. doi: 10.1109/ICPC.2010.41.

[9] D. Binkley et al. "To camelcase or under_score". In: 2009 IEEE 17th International Conference on Program Comprehension. May 2009, pp. 158–167. doi: 10.1109/ICPC.2009.5090039.

[10] T. Sedano. "Code Readability Testing, an Empirical Study". In: 2016 IEEE 29th International Conference on Software Engineering Education and Training (CSEET). Apr. 2016, pp. 111–117. doi: 10.1109/CSEET.2016.36.

[11] Leif Tysell Sundkvist and Emil Persson. Code Styling and its Effects on Code Readability and Interpretation. 2017.

[12] T. W. Kirkman. Statistics to Use. Fetched: 2019-06-03. 1996. url: http://www.physics.csbsju.edu/stats/box2.html.

[13] J. G. Adair. "The Hawthorne effect: A reconsideration of the methodological artifact". In: Journal of Applied Psychology 69.2 (1984), pp. 334–345. doi: 10.1037/0021-9010.69.2.334.

[14] Stephen R. G. Jones. "Was There a Hawthorne Effect?" In: American Journal of Sociology 98.3 (1992), pp. 451–468. doi: 10.1086/230046.

Appendix A

Heatmaps

Figure A.1: Heatmap of code snippet 1 (Group 1, SL).


Figure A.2: Heatmap of code snippet 2 (Group 1, SW).

Figure A.3: Heatmap of code snippet 3 (Group 1, MWCC).

Figure A.4: Heatmap of code snippet 4 (Group 1, MWSC).

Figure A.5: Heatmap of code snippet 5 (Group 1, SL).

Figure A.6: Heatmap of code snippet 6 (Group 1, MWCC).

Figure A.7: Heatmap of code snippet 1 (Group 2, MWSC).

Figure A.8: Heatmap of code snippet 2 (Group 2, MWCC).

Figure A.9: Heatmap of code snippet 3 (Group 2, MWSC).

Figure A.10: Heatmap of code snippet 4 (Group 2, SL).

Figure A.11: Heatmap of code snippet 5 (Group 2, SW).

Figure A.12: Heatmap of code snippet 6 (Group 2, SW).

TRITA-EECS-EX-2019:323
