On Automatically Assessing Code Understandability

Simone Scalabrino, Student Member, IEEE, Gabriele Bavota, Member, IEEE, Christopher Vendome, Member, IEEE, Mario Linares-Vásquez, Member, IEEE, Denys Poshyvanyk, Member, IEEE, Rocco Oliveto, Member, IEEE

• S. Scalabrino and R. Oliveto are with the University of Molise, Italy. E-mail: {simone.scalabrino, rocco.oliveto}@unimol.it
• G. Bavota is with the Università della Svizzera italiana (USI), Switzerland. E-mail: [email protected]
• C. Vendome is with Miami University, USA. E-mail: [email protected]
• D. Poshyvanyk is with the College of William & Mary, USA. E-mail: [email protected]
• M. Linares-Vásquez is with the Universidad de los Andes, Colombia. E-mail: [email protected]

Abstract—Understanding software is an inherent requirement for many maintenance and evolution tasks. Without a thorough understanding of the code, developers would not be able to fix bugs or add new features in a timely manner. Measuring code understandability might be useful to guide developers in writing better code, and could also help in estimating the effort required to modify code components. Unfortunately, there are no metrics designed to assess the understandability of code snippets. In this work, we perform an extensive evaluation of 121 existing and new code-related, documentation-related, and developer-related metrics. We try to (i) correlate each metric with understandability and (ii) build models combining metrics to assess understandability. To do this, we use 444 human evaluations from 63 developers, and we obtained a bold negative result: none of the 121 experimented metrics is able to capture code understandability, not even those assumed to assess apparently related quality attributes, such as code readability and complexity. While we observed some improvements when combining metrics in models, their effectiveness is still far from making them suitable in practice. Finally, we conducted interviews with five professional developers to understand the factors that influence their ability to understand code snippets, aiming at identifying possible new metrics.

Index Terms—Software metrics, Code understandability, Empirical study, Negative result

1 INTRODUCTION

Developers spend most of their time (∼70%) understanding code [1]. While such a finding might look surprising, it only highlights the pivotal role that code understanding plays in any code-related activity, such as feature implementation or bug fixing. While the importance of code understandability is undisputed for maintenance-related activities [2], [3], [4], [5], [6], [7], there is no empirical foundation suggesting how to objectively assess the understandability of a given piece of code. Indeed, our knowledge of factors affecting (positively or negatively) code understandability is basically tied to common beliefs or is focused on the cognitive process adopted when understanding code [8], [9]. For example, we commonly assume that code complexity can be used to assess the effort required to understand a given piece of code. However, there is no empirical evidence that this is actually the case. Similarly, researchers have designed models and proposed metrics to evaluate the readability of code by correlating them with the readability perceived by developers [10], [11], [12], [13], [14], [15]. This typically means that the developers participating in the evaluation are required to read a fragment of code and rate its readability on a given scale (e.g., from 1: low to 5: high readability). However, the perceived readability is something different from the actual understandability of the code; a developer could find a piece of code readable while still experiencing difficulties in understanding it, for example due to unknown APIs used. Let us consider the code fragment listed below:

AsyncHttpClient client = new AsyncHttpClient();
String cookies = CookieManager.getInstance().getCookie(url);
Log.e(TAG, cookies);
client.addHeader(SM.COOKIE, cookies);

A developer, as well as all readability metrics proposed in the literature [10], [11], [12], [13], [14], [15], would consider this snippet of code as readable, since it is concise and utilizes meaningful identifier names. Nevertheless, this snippet of code is not necessarily easy to understand for any given developer, because the used APIs could be unknown to her and even poorly documented. For example, the developer may not understand the implications of the getCookie(url) method call without prior experience using the API or without reading the documentation, e.g., she might not know whether getCookie(url) could throw an exception, return a null value, or produce some other side effects.

Having a metric to estimate the effort required to understand a given piece of code would have a strong impact on several software engineering tasks. For example, it would be possible to use such a metric to (i) improve the estimation of the time needed to fix a bug (the lower the understandability, the higher the time to comprehend the code and thus to fix the bug); (ii) create search-based refactoring recommender systems using code understandability as a fitness function; or (iii) assess the quality of code changes during code reviews.

While we have possible proxies for code understandability, such as code complexity and readability, we do not (i) know whether these proxies actually correlate with the effort required to understand a piece of code, and (ii) have a metric able to provide an estimation for code understandability. Previous attempts to define a code understandability model [4], [16], [6] have not been empirically evaluated, consider understandability as a factor in a quality model [2], [17], or measure understandability at the level of a whole software system [18].

In this work, we investigate 121 metrics to determine the extent to which they correlate with code understandability. These metrics can be categorized into three types: (i) code-related metrics (105 metrics), (ii) documentation-related metrics (11), and (iii) developer-related metrics (5). The code-related metrics are comprised of classic code metrics, like LOC and cyclomatic complexity, and readability metrics, like text coherence [14], [15] and code indentation [10]. As the aforementioned example illustrates, a developer may be able to read some code snippet, but it may use unknown code. Thus, we included existing documentation-related metrics, like the availability of external documentation, and introduced nine new documentation-related metrics. Finally, we included developer-related metrics to understand the extent to which the developer's experience and background might influence code comprehension.

To run our investigation, we performed a study with 63 participants using the 121 metrics to determine their correlation with the understandability of code snippets. Each participant was required to understand up to eight code snippets, leading to a total of 444 evaluations. We consider understandability from two perspectives: perceived and actual understandability of the code snippets. The participants were presented with a code snippet to understand and they were asked whether they could understand it (perceived understandability). If their answer was positive, they were asked to answer three verification questions (actual understandability). We also monitored the time spent understanding each snippet to estimate the effort.

By performing an extensive statistical analysis, we obtained a negative empirical result: none of the considered metrics exhibits a significant correlation with either the perceived or the actual understandability. This result was quite surprising to us, especially considering the involvement in our study of complexity and readability metrics generally thought to influence code understandability. Then, we exploited classification and regression models utilizing combinations of metrics to predict proxies of code understandability. While these models represent a step ahead as compared to the single metrics taken in isolation, their prediction accuracy is still too low to make them useful in practice. Finally, we interviewed five developers to better understand their perspective when it comes to understanding a code snippet and to infer factors that could be exploited in future metrics to automatically assess code understandability.

The main contributions of this work, as compared to our ASE'17 paper [19] that it extends, are the following:
• We increased the size of our dataset from 324 evaluations by 46 developers to 444 evaluations by 63 developers to increase the confidence in our findings;
• We used combinations of features in models to predict the variables that we define as proxies of understandability;
• We conducted structured interviews with five experienced developers to understand what makes code understandable/not understandable to them; we used such information to propose possible future directions for code understandability assessment.

Paper structure. Section 2 provides background information about code characteristics possibly related to its understandability. Section 3 describes the 121 metrics used in our empirical study, while Section 4 presents the proxies we used to assess the developers' understandability of a given code snippet. The design and results of the study are presented in Sections 5 and 6, respectively. Section 7 discusses the threats that could affect our findings, while Section 8 concludes the paper and highlights future research directions.

2 BACKGROUND & RELATED WORK

In this section, we describe metrics and models that have been proposed to measure code readability. All these metrics have been included in our study. Afterwards, we briefly describe related work presenting metrics for measuring understandability (at system level) as a single quality attribute and as part of a quality model.

2.1 Code Readability

Identifiers and comments play a crucial role in program comprehension, since developers express domain knowledge through the names that they assign to the code entities at different levels (i.e., packages, classes, methods, variables) [20], [21], [22], [23], [24]. Thus, source code lexicon impacts the psychological complexity of a program [14], [25]. Another aspect that also contributes to the readability (and potentially understandability) of source code are structural aspects such as indentation, code entity length [10], [11], and visual/spatial aspects such as syntax highlighting, code formatting, and visual areas covered by code entities [12].

All the aforementioned aspects have been used as features in binary classifiers able to predict the readability of code snippets [10], [11], [12], [14], [15]. In the model by Buse and Weimer [10], source code structural aspects (e.g., number of branches, loops, operators, blank lines, comments) represent the underlying features in the classifier. The model was trained a-priori on 100 small snippets; the snippets were tagged manually as readable or non-readable by 120 human annotators. The reported results provide evidence that readability can be estimated automatically.

Posnett et al. [11] proposed a model based on a reduced set of the features introduced by Buse and Weimer. An empirical evaluation conducted on the same dataset used by Buse and Weimer [10] indicated that the model by Posnett et al. is more accurate than the one by Buse and Weimer.

Dorn introduced a readability model, which relies on a larger set of features grouped in four categories: visual, spatial, alignment, and linguistic [12]. This larger set of features highlights the fact that structural aspects are not the only ones that should be considered for code readability; aspects representing and modeling how the code is read on the screen, such as syntax highlighting, variable naming standards, and operators alignment, should also be considered.

Dorn trained and validated the model on a new dataset, including programs in Java, Python, and CUDA, for a total of 360 snippets. Such a model achieved a higher accuracy as compared to the one by Buse and Weimer.

Scalabrino et al. [14], [15] proposed and evaluated a set of features based entirely on source code lexicon analysis (e.g., consistency between source code and comments, specificity of the identifiers, textual coherence, comments readability). The model was evaluated on the two datasets previously introduced by Buse and Weimer [10] and Dorn [12] and on a new dataset, composed of 200 Java snippets, manually evaluated by nine developers. The results indicated that combining the features (i.e., structural+textual) improves the accuracy of code readability models.

2.2 Software/Code Understandability

While readable code might directly impact program comprehension, code readability metrics are not sufficient to measure to what extent the code allows developers to understand its purpose, relationships between code entities, and the latent semantics at the low-level (e.g., statements, beacons, motifs) and high-level structures (e.g., packages, classes). Program understanding is a non-trivial mental process that requires building high-level abstractions from code statements or visualizations/models [9], [4]. There have been several metrics designed to evaluate software understandability by focusing on complexity as well as source-level metrics.

Lin et al. [4] proposed a model for assessing understandability by building an understandability matrix from fuzzy maximum membership estimation for population of fog index, comment ratio, the number of components, CFS, Halstead Complexity, and DMSP. The authors then used PCA and factor analysis to get the weights for the column vectors, which can be multiplied by the matrix to get the Synthesis Vector of Understandability. Finally, the understandability is calculated by using the fuzzy integral. The authors did not empirically evaluate the proposed metric.

Misra and Akman [26] performed a comparative study between existing cognitive complexity measures and their proposed measure: cognitive weight complexity measure (CWCM), which assigns weights to software components by analyzing their control structures. The authors performed a theoretical validation of these metrics based on the properties proposed by Weyuker [27]. They found that only one metric, Cognitive Information Complexity Measure (CICM), satisfied all nine properties, while the others satisfied seven of the nine.

Thongmak et al. [3] considered aspect-oriented software dependence graphs to assess understandability of aspect-oriented software, while Srinivasulu et al. [6] used rough sets and rough entropy (to filter outliers) when considering the following metrics: fog index, comment ratio, the number of components, CFS, Halstead Complexity, and DMSC. These metrics are computed at system level for nine projects, and subsequently the rough entropy outlier factor was calculated for the metrics to identify the outliers, which correspond to either highly understandable or not understandable software based on the metric values.

Capiluppi et al. [18] proposed a measure of understandability that can be evaluated in an automated manner. The proposed measure considers: (i) the percentage of micro-modules (i.e., the numbers of files) that are within the macro-modules (i.e., the directories), and (ii) the relative size of the micro-modules. The authors calculated the proposed measure on the history of 19 open source projects, finding that understandability typically increased during the life-cycle of the systems. Yet, no evaluation is provided for such a measure.

Understandability has also been a factor in quality models to assess software maintainability. Aggarwal et al. [2] investigated the maintainability of software and proposed a fuzzy model, which is composed of three parts: (i) readability of code, (ii) documentation quality, and (iii) understandability of the software. To quantify understandability, the authors utilize a prior work that defines language of software as the symbols used, excluding reserved words. The authors constructed rules based on the ranges of the three factors to determine maintainability.

Similarly, Chen et al. [7] investigated the COCOMO II Software Understandability factors by conducting a study with six graduate students asked to accomplish 44 maintenance tasks, and found that higher quality structure, higher quality organization, and more self-descriptive code were all correlated with less effort spent on the tasks, which leads to high maintainability.

Bansiya and Davis [28] proposed a model where metrics are related to several quality attributes, including understandability. In terms of understandability, the model considers encapsulation and cohesion to have positive influences, while abstraction, coupling, polymorphism, complexity, and design size have a negative influence. The authors validated the model by analyzing several versions of two applications and found that understandability decreases as a system evolves with many new features. Additionally, 13 evaluators analyzed 14 versions of a project and the authors found a correlation between the evaluators' overall assessment of quality and the model's assessment for 11 out of 13 evaluators.

It is worth noting that we do not consider the above discussed understandability metrics [4], [26], [3], [6], [18], [17], [28] in our study since they are defined at system-level (i.e., they provide an overall indication of the system understandability), while we are interested in studying whether it is possible to measure the understandability of a given code snippet, as already done in the literature for code readability. Instead, we included in our study the metrics used by Kasto and Whalley [29] to study the understandability of code snippets in an educational context. Specifically, Kasto and Whalley analyzed the performance of 93 students in their final examination for the Java programming course and they correlated their results with five metrics.

Several studies have explored software understandability and program comprehension with either students or practitioners. Shima et al. considered the understandability of a software system by assessing the probability that a system can be correctly reconstructed from its components [30]. The authors asked eight students to reconstruct a system, and the results suggest that faults tend to occur in hard to understand files or very simple files.

Roehm et al. performed an observational study with 28 developers to identify the steps developers perform when understanding software and the artifacts they investigate [5]. The authors found that developers are more inclined towards relying upon source code as well as discussing with colleagues over utilizing the documentation. The authors also identified some behaviors that improve comprehension, such as consistent naming conventions or meaningful names.

Understandability has been mostly analyzed from the perspective of (i) the quality attribute at the software level, i.e., understandability as "the capability of the software product to enable the user to understand whether the software is suitable, and how it can be used for particular tasks and conditions of use" [31]; and (ii) the theories, challenges, and models for program understanding at cognitive levels [9], [8]. However, as of today, we still lack models for assessing code understandability at snippet-level, similarly to code readability. The only work we found that relates to a code understandability model is based on complexity and size source-code-level metrics [4], [16].

More recently, Trockman et al. [32] analyzed the dataset that we released as part of our ASE'17 work [19] that this paper extends. They showed that it is possible to define classification models with some discriminatory power by combining the 121 metrics considered in our study. Specifically, they use LASSO regression [32] to classify evaluations as understandable or not understandable. In this paper, we also experiment with combinations of metrics to predict the understandability of a given code snippet. However, we (i) rely on our new and larger dataset featuring 22% more data points as compared to the one we previously released; and (ii) experiment with both classification and regression techniques, and use several different proxies for the dependent variable (i.e., code understandability), as compared to the single one used by Trockman et al. [32].

3 CANDIDATE PREDICTORS FOR CODE UNDERSTANDABILITY

Understandability is a multifaceted property of source code and, like readability, is subjective in nature. In readability, the subjectivity is represented by personal taste and habits, while in understandability it lies in the previous knowledge of a developer and in her mental models [8]. Consider a method of an Android activity in a mobile app; its understandability might be high for an Android developer, while it could be low for a Java developer with no experience in Android. In this section, we briefly discuss the 121 metrics considered in our study aimed at assessing their ability to capture the understandability of a given piece of code. Table 1 shows the complete list of metrics: rows contain the basic metrics and columns indicate how the metrics are aggregated (e.g., the Identifiers length for a given code snippet is computed, as suggested by previous work [10], as the average and as the maximum length of the identifiers used in the snippet). We report in boldface the new metrics introduced in our previous work [19]. It is worth noting that the number of metrics shown in Table 1 does not sum up to the 121 metrics considered in our study. This is due to the fact that some forms of aggregation, e.g., "Visual" and "Area", include multiple ways of aggregating the same measure. For example, each metric aggregated as "Visual" should be counted twice. Indeed, for these metrics, a "virtual" color is assigned to each character in the code, based on the type of the token it belongs to (e.g., characters of identifiers have color 1, while characters of keywords have color 2), thus creating a matrix of colors for the snippet. Then, the variation of colors is computed both on the X and on the Y axis of such a matrix [12], thus resulting in two different forms of aggregation. The complete list of metrics is available in our replication package [33]. In the following subsections, we discuss the considered metrics grouped by their type.

TABLE 1: Candidate predictors for code understandability. For each metric, the original table indicates which aggregations are considered (Non-aggregated, Min, Avg, Max, DFT, Visual, Area), with the reference of the work introducing each variant.
Code-related: Cyclomatic comp. [29]; #nested blocks [29]; #parameters [29]; #statements [29]; #assignments [10][12]; #blank lines [10]; #characters [10]; #commas [10][12]; #comments [10][12][12][12]; #comparisons [10][12]; #conditionals [10][12]; #identifiers [29][10][10][12][12][12]; #keywords [10][10][12][12][12]; #literals [12][12]; #loops [10][12]; #numbers [10][10][12][12][12]; #operators [10][12][12][12]; #parenthesis [10][12]; #periods [10][12]; #spaces [10][12]; #strings [12][12]; #words [10]; Indentation length [10][10][12]; Identifiers length [10][10]; Line length [10][10][12]; #aligned blocks [12]; Ext. of alig. blocks [12]; Entropy [11]; LOC [11]; Volume [11]; NMI [14][14][14]; NM [14][14]; ITID [14][14][12]; TC [14][14][14]; Readability [14]; IMSQ [19][19][19]
Documentation-related: CR [14]; CIC [14][14]; CICsyn [14][14]; MIDQ [19][19][19]; AEDQ [19][19][19]
Developer-related: PEgen [19]; PEspec [19]; EAP [19][19][19]

3.1 Code-Related Metrics

Most of the metrics considered in our study assess source code properties. We include the five metrics used by Kasto and Whalley [29]: cyclomatic complexity [34], which estimates the number of linearly independent paths of the snippet; average number of nested blocks, which measures the average code-block nesting in the snippet; number of parameters; number of statements; and number of operands, i.e., number of identifiers.

We also include in this category all the code-related readability metrics defined in the literature [10], [11], [12], [14]. These include the ones by Buse and Weimer [10], assessing properties for a single line of code (e.g., number of identifiers or line length) and then aggregated (with the maximum and/or the average) to work at the level of "code snippet".
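To make the line-level aggregation scheme concrete, the following minimal Java sketch computes two features in the Buse-and-Weimer style (line length and number of identifiers per line) and aggregates them as average and maximum over a snippet. It is illustrative only: it is not the implementation used in the study, and the regex-based identifier matching (which does not filter out keywords) is an assumption; a real implementation would use a Java lexer.

import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineMetricsSketch {

    // Rough identifier pattern; a lexer would also exclude keywords and literals.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    /** Returns {average, maximum} line length over the snippet. */
    public static double[] lineLengthAvgMax(List<String> lines) {
        double sum = 0, max = 0;
        for (String line : lines) {
            sum += line.length();
            max = Math.max(max, line.length());
        }
        return new double[] { lines.isEmpty() ? 0 : sum / lines.size(), max };
    }

    /** Returns {average, maximum} number of identifier-like tokens per line. */
    public static double[] identifiersPerLineAvgMax(List<String> lines) {
        double sum = 0, max = 0;
        for (String line : lines) {
            Matcher m = IDENTIFIER.matcher(line);
            int count = 0;
            while (m.find()) {
                count++;
            }
            sum += count;
            max = Math.max(max, count);
        }
        return new double[] { lines.isEmpty() ? 0 : sum / lines.size(), max };
    }
}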

Lines of code (LOC), token entropy and Halstead's volume are used by Posnett et al. [11] in the context of readability prediction. Dorn [12] presents a variation to the basic metrics introduced by Buse and Weimer [10], measuring the bandwidth of the Discrete Fourier Transform (DFT) of the metrics, the absolute and the relative area of characters belonging to different token categories (e.g., identifiers, keywords or comments), the alignment of characters through different lines, and the number of identifiers containing words belonging to an English dictionary. Note that the area-related metrics defined by Dorn are computed both in an absolute way (e.g., total area of comments) and in a relative way (e.g., area of comments divided by area of strings). These variants are not reported in Table 1 due to space constraints, but are considered in our study and listed in our replication package [33].

Scalabrino et al. [14] define Narrow Meaning Identifiers (NMI), Number of Meanings (NM), Identifiers Terms In Dictionary (ITID) and Textual Coherence (TC) to capture the readability of a code snippet. Such metrics are computed line-by-line (ITID), identifier-by-identifier (NMI and NM) or block-by-block (TC); the authors aggregate the measures using minimum, average and maximum, in order to have a single measure for the snippet. We also use code readability, as defined by Scalabrino et al. [14], as a separate metric, combining together the previously listed metrics. We followed the steps described by Scalabrino et al. to define the readability model by using a logistic classifier that we train on the 420 Java snippets available in the literature [10], [12], [14].

We also introduce a new code-related metric, the Invoked Methods Signature Quality (IMSQ), which measures the quality of the signature of the internal methods invoked by a given code snippet s (i.e., methods belonging to the same system of s) in terms of readability and representativeness. We define the Method Signature Quality (MSQ) of an invoked method m as:

MSQ(m) = \frac{1}{|IS(m)|} \sum_{id \in IS(m)} IQ(id)

where IS(m) is the set of identifiers used in the m's signature (i.e., method name and parameters) and IQ(id) is defined as:

IQ(id) = \begin{cases} \frac{1}{2}\,(Rd(id) + Rp(id)), & \text{if } id \text{ is a method name} \\ Rd(id), & \text{if } id \text{ is a parameter name} \end{cases}

IQ(id) captures the quality of an identifier in terms of its readability (Rd) and its representativeness (Rp). The idea behind the readability is that an identifier should be composed of a (possibly small) set of meaningful words. To measure Rd for an identifier (id), we (i) split id into the words composing it, (ii) expand each word to bring it in its original form (e.g., ptr → pointer), (iii) create a new identifier idexp composed by the expanded words separated by a "_", and (iv) measure the Levenshtein distance between id and idexp. The Levenshtein distance between two strings a and b measures the minimum number of single-character changes needed to transform a into b. The conjecture behind IQ(id) is that the higher the Levenshtein distance between id and idexp, the higher the mental effort required for the developer to understand the meaning of the identifier by mentally splitting and expanding it during program comprehension. Note also that we separate the expanded terms in idexp by using "_" in order to penalize, by increasing the Levenshtein distance, identifiers composed of several words. For example, the identifier printOnStdOut is first split into print, on, std, out; then, each word is expanded, which has no effect on the first two words, but expands std into standard and out into output. Therefore, printOnStdOut is transformed into print_on_standard_output.

To have Rd(id) defined in [0, 1], we normalize the Levenshtein distance (L) between id and idexp as follows:

Rd(id) = 1 - \frac{L(id, id_{exp})}{\max(|id|, |id_{exp}|)}

where max(|id|, |id_exp|) represents the longest identifier among the two. When the distance equals zero, the readability of the identifier equals one, indicating no need for expansion/splitting (i.e., id is composed of a single expanded word).

Note that in the implementation of Rd(id), we used a semi-automatic approach to split/expand identifiers. We first used a naive automatic splitting technique, based on camel case and underscores; then, we automatically checked the presence of each resulting word in an English dictionary. If the word was not found, we manually expanded/further split the specific word. For example, for the word "cmdline" there would not be an automatic split. Since the word "cmdline" does not exist in the dictionary, we manually convert it to "command" and "line". We save all the manual substitutions in order to minimize the human effort. In the literature, there are many automatic approaches for identifier splitting/expansion, but we preferred to implement a simpler and more effective strategy at this stage, since the number of identifiers to split/expand was limited and our goal was to assess the correlation of the defined metrics with the understandability effort. Thus, we wanted to be sure to avoid introducing imprecision while computing the metrics.

When dealing with the identifier used to name a method, we also verify whether it is representative of what the method does (Rp). We compute the textual overlap between the terms used in the identifier and in the method body. We tokenize the method body to define its dictionary. Then, we count the number of times each word from the identifier (expanded or not) is contained in the dictionary extracted from the method body. We consider only names and verbs from the identifiers, ignoring other parts of speech such as conjunctions, since they do not carry semantic information. Following the printOnStdOut example, we check whether the method body contains the words print, standard, std, output, and out. We measure the representativeness as the ratio between the number of words from the identifier (i.e., method name) contained in the method body, and the total number of words in the identifier. If all the words from the identifier are used in m's body, we assume that the method name is representative of m and thus, should ease the understanding of methods invoking m. If, instead, words are not found in the method body, this could hinder the understandability of the methods invoking m.

In our study, we consider the minimum, the average, and the maximum values of the MSQ metric for a given code snippet (e.g., the average MSQ of all methods invoked in the code snippet).
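The following Java sketch illustrates the Rd(id) component described above: the identifier is split on camel case and underscores, each word is expanded through an abbreviation map (a small placeholder for the semi-automatic dictionary built in the study), and the normalized Levenshtein distance between the identifier and its expanded form is returned. The EXPANSIONS entries are hypothetical examples, not the actual dictionary.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class IdentifierReadabilitySketch {

    // Placeholder for the manually curated expansion dictionary described above.
    private static final Map<String, String> EXPANSIONS =
            Map.of("std", "standard", "out", "output", "ptr", "pointer", "cmd", "command");

    /** Rd(id) = 1 - L(id, id_exp) / max(|id|, |id_exp|) */
    public static double rd(String id) {
        String expanded = String.join("_", expand(split(id)));
        int distance = levenshtein(id, expanded);
        return 1.0 - (double) distance / Math.max(id.length(), expanded.length());
    }

    /** Naive camel-case / underscore splitting, as described in the paper. */
    static List<String> split(String id) {
        List<String> words = new ArrayList<>();
        for (String part : id.split("_")) {
            for (String w : part.split("(?<=[a-z0-9])(?=[A-Z])")) {
                if (!w.isEmpty()) {
                    words.add(w.toLowerCase());
                }
            }
        }
        return words;
    }

    static List<String> expand(List<String> words) {
        List<String> expanded = new ArrayList<>();
        for (String w : words) {
            expanded.add(EXPANSIONS.getOrDefault(w, w));
        }
        return expanded;
    }

    /** Classic dynamic-programming edit distance. */
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }
}

For instance, rd("printOnStdOut") compares the identifier with print_on_standard_output, yielding a value well below 1, while a single dictionary word such as value yields exactly 1.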

3.2 Documentation-Related Metrics

Scalabrino et al. [14] introduced three metrics to capture the quality of the internal documentation of a snippet: Comments Readability (CR) measures the readability of the comments in a snippet using the Flesch reading-ease test [35]; Comments and Identifiers Consistency (CIC) measures the consistency between comments and code; and CICsyn, a variant of CIC, which takes synonyms into account.

We also introduce two new metrics aimed at capturing the quality of both the internal (MIDQ) and external (AEDQ) documentation available for code components used in a given snippet. The Methods Internal Documentation Quality (MIDQ) for a snippet s acts as a proxy for the internal documentation (i.e., Javadoc) available for the internal methods (the ones belonging to the same project as s) invoked in s. Given an internal invoked method m, we compute MIDQ(m) using a variation of the approach proposed by Schreck et al. [36]:

MIDQ(m) = \frac{1}{2}\,(DIR(m) + readability_D(m))

where DIR(m) is the Documented Items Ratio, computed as the number of documented items in m divided by the number of documentable items in m. We consider as documentable items for m (i) its parameters, (ii) the exceptions it throws, and (iii) its return value. Such items are considered as documented if there is an explicit reference to them in the Javadoc through the tags @param, @throws and @returns. readability_D(m) represents, instead, the readability of the Javadoc comments assessed using the Flesch reading-ease test [35]. The higher MIDQ, the higher the internal documentation quality for m. We consider the minimum, the average, and the maximum values of the MIDQ metric for a given code snippet.

Concerning the API External Documentation Quality (AEDQ), it tries to capture the amount of information about APIs used in the given snippet s that can be acquired from external sources of documentation, such as Q&A websites. The conjecture is that if external documentation is available, it is more likely that developers are able to understand the usage of an API in a code snippet s. We compute the availability of external documentation for each external class c used in s via the AEDQ(c) metric. First, we identify all Stack Overflow discussions related to c by running the following query:

title:"how to" hasaccepted:yes [java]

In other words, we select all Stack Overflow discussions that (i) contain "how to" and the class name in the title, (ii) have an accepted answer, and (iii) concern Java (since our study has been performed on Java snippets). Then, we sum the votes assigned by the Stack Overflow users to the question in each retrieved discussion, in order to have quantitative information about the interest of the developers' community in such a class. We assume that higher interest in a given API class implies a higher availability of external sources of information (e.g., discussions, code examples, etc.). We consider in our study the minimum, the average, and the maximum values of the AEDQ metric for the external classes used in s.
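A possible implementation of the MIDQ components is sketched below. The simplified method/Javadoc model is an assumption made for illustration (field names, returning 0 when no Javadoc is present, and returning 1 when there are no documentable items are all choices of this sketch); the study parsed real Javadoc and computed readability_D with the Flesch reading-ease test.

import java.util.List;

public class MidqSketch {

    static class JavadocInfo {
        List<String> documentedParams;   // names mentioned in @param tags
        List<String> documentedThrows;   // exceptions mentioned in @throws tags
        boolean hasReturnTag;            // @return / @returns present
        String commentText;              // free-text part of the Javadoc
    }

    static class MethodInfo {
        List<String> parameters;
        List<String> thrownExceptions;
        boolean hasReturnValue;          // false for void methods
        JavadocInfo javadoc;             // null if the method has no Javadoc
    }

    /** MIDQ(m) = (DIR(m) + readability_D(m)) / 2 */
    public static double midq(MethodInfo m) {
        if (m.javadoc == null) {
            return 0.0; // assumption: undocumented methods get the lowest score
        }
        return (dir(m) + fleschReadingEase01(m.javadoc.commentText)) / 2.0;
    }

    /** Documented Items Ratio: documented items / documentable items. */
    static double dir(MethodInfo m) {
        int documentable = m.parameters.size() + m.thrownExceptions.size()
                + (m.hasReturnValue ? 1 : 0);
        if (documentable == 0) {
            return 1.0; // nothing to document
        }
        int documented = (int) m.parameters.stream()
                .filter(m.javadoc.documentedParams::contains).count()
                + (int) m.thrownExceptions.stream()
                .filter(m.javadoc.documentedThrows::contains).count()
                + (m.hasReturnValue && m.javadoc.hasReturnTag ? 1 : 0);
        return (double) documented / documentable;
    }

    /** Placeholder: Flesch reading-ease rescaled to [0, 1]; the real metric needs
     *  sentence, word, and syllable counts, omitted here. */
    static double fleschReadingEase01(String text) {
        return 0.5;
    }
}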
3.3 Developer-Related Metrics

Since understandability is a very subjective feature of code, we introduced three developer-related metrics. We measure the programming experience of the developer who is required to understand a snippet (PEgen and PEspec) and the popularity of the API used in the snippet (EAP).

The common wisdom is that the higher the programming experience of developers, the higher their capability of understanding code. PEgen measures the programming experience (in years) of a developer in general (i.e., in any programming language). PEspec assesses instead the programming experience (in years) of a developer in the programming language in which a given snippet s is implemented. The higher PEspec, the higher the developer's knowledge about the libraries available for such a programming language.

With External API Popularity (EAP), we aim at capturing the popularity of the external APIs used in a given snippet. The assumption is that the lower the popularity, the lower the probability that a typical developer knows the API. If the developer is not aware of the APIs used in a snippet, it is likely that she has to look for its documentation or to inspect its source code, thus spending more effort in code understanding.

We rely on an external base of Java classes E to estimate the popularity of an external class. We chose as E a 10% random sample of classes from Java/Android projects hosted on GitHub in 2016, totaling ∼2M classes from ∼57K Java projects. We used Google BigQuery to extract all the imports of all the classes belonging to such projects using a regular expression. Then, we counted the number of times each class imported in E occurred in the import statements. Note that in Java it is possible to import entire packages (e.g., import java.util.∗). In this case, it is difficult to identify the actual classes imported from the package. For this reason, we applied the following strategy. Let us assume that a class, Foo, is imported only once with the statement import bar.Foo, but it is part of a quite popular package, bar, that is imported 100 times in E through the statement import bar.∗. The class Foo2, belonging to the same package, is imported 99 times with the statement import bar.Foo2. In this case, we increase the number of occurrences of classes belonging to the imported package in a proportional way. In the presented example, we add 1 to the number of Foo's imports, and 99 to the number of Foo2 imports. We found that imports of entire packages represent only 2.6% of all the imports and, therefore, their impact is very low. EAP(c) is defined as the number of c imports normalized over the number of imports of cmax, where cmax is the most imported class we found in E (i.e., java.util.List).
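The sketch below restates the EAP counting strategy in code: explicit import counts are taken from the external base E, wildcard imports are redistributed proportionally among the classes of the imported package, and the counts are normalized by the most imported class. The map-based input format is an assumption; on the Foo/Foo2 example above it attributes 1 and 99 additional occurrences, respectively, as described in the text.

import java.util.HashMap;
import java.util.Map;

public class EapSketch {

    /**
     * classImports: fully-qualified class -> number of explicit imports in E.
     * wildcardImports: package -> number of "import pkg.*" statements in E.
     * classToPackage: fully-qualified class -> its package.
     */
    public static Map<String, Double> eap(Map<String, Integer> classImports,
                                          Map<String, Integer> wildcardImports,
                                          Map<String, String> classToPackage) {
        Map<String, Double> counts = new HashMap<>();
        classImports.forEach((cls, n) -> counts.put(cls, (double) n));

        // Redistribute wildcard imports proportionally to the explicit counts
        // of the classes belonging to the imported package.
        for (Map.Entry<String, Integer> e : wildcardImports.entrySet()) {
            String pkg = e.getKey();
            double pkgTotal = classImports.entrySet().stream()
                    .filter(x -> pkg.equals(classToPackage.get(x.getKey())))
                    .mapToInt(Map.Entry::getValue).sum();
            if (pkgTotal == 0) {
                continue;
            }
            for (Map.Entry<String, Integer> x : classImports.entrySet()) {
                if (pkg.equals(classToPackage.get(x.getKey()))) {
                    counts.merge(x.getKey(), e.getValue() * x.getValue() / pkgTotal, Double::sum);
                }
            }
        }

        // Normalize by the most imported class (cmax, java.util.List in E).
        double max = counts.values().stream().mapToDouble(Double::doubleValue).max().orElse(1.0);
        counts.replaceAll((cls, n) -> n / max);
        return counts;
    }
}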
This is a Phoenix 352 Framework Relational database engine binary categorical variable that is true if a developer Spring 197 Framework Generic application framework perceives that she understood a given code, and false Weka 657 Desktop Machine-learning toolkit otherwise. 2) Time Needed for Perceived Understandability (TNPU). This is a continuous variable in <+, measur- investigating characteristics of code considered as important ing the time spent by the developer to comprehend a for developers during program comprehension. This study given code before having a clear idea on whether she aims at answering the following research questions: understood it or not. RQ1 What is the correlation between the 121 considered metrics 3) Actual Understandability (AU). This is a continuous and the understandability level of a given developer for variable in [0, 1], measuring the actual understanding of a specific code snippet? Given the wide and heteroge- the inspected code. A possible way of measuring actual neous set of considered metrics, answering this research understandability is through verification questions. For question would allow us and, in general, the research example, the developer understanding the code could community to understand how far we are from defining be required to answer three questions, and the percent- a set of metrics capable of automatically and objectively age of correct answers is used to assess AU. assessing code understandability; RQ 4) Actual Binary Understandability (ABUk%). This is a 2 Is it possible to define understandability models able to binary categorical variable derived from AU. It is true if predict code understandability? Given a snippet of code, AU is greater than k, false otherwise. ABUk% is basically we want to determine whether combining metrics in a a proxy to classify snippets of code as understandable model can effectively capture the level of understand- or not based on the level of actual understanding de- ability of that code; velopers are able to achieve while inspecting it. RQ3 How do developers determine the understandability of 5) Timed Actual Understandability (TAU). This is a con- code? While the first two questions relate to metrics to tinuous variable in [0, 1], derived from AU and TNPU. It assess understandability, it is also important to consider gets a value of 0 if the developer perceives that she did the perspective of developers when trying to under- not understand the snippet. Otherwise, it is computed stand code. To this end, we aim to deepen our analysis as: by asking experienced developers what makes a certain  TNPU  T AU = AU 1 − code snippet understandable or not understandable. max TNPU where AU and TNPU are the variables previously 5.1 Data Collection defined. The higher AU, the higher TAU, while the The context of the study consists of 50 Java/Android meth- higher TNPU, the lower TAU. We take into account TNPU ods extracted from ten popular systems listed in Table 2 the relative time ( max TNPU ) instead of the absolute (five methods from each system). We first extracted all the time, so that TAU gives the same importance to both methods having 50±20 ELOCs (i.e., Effective Lines Of Code, the correctness achieved (AU) and the time needed excluding blank and comment lines) from the systems. The (TNPU). max TNPU is, indeed, the maximum TNPU choice of the methods’ size (i.e., 50 ± 20 ELOCs) was driven measured on the snippet. by the decision of excluding methods that are too trivial or 6) Binary Deceptiveness (BDk%). 
5 EMPIRICAL STUDY DESIGN

The goal of our study is to assess the extent to which the considered 121 metrics are related to code understandability and what developers consider as understandable/not understandable. The perspective is of researchers interested in (i) analyzing whether code-related, documentation-related, and developer-related metrics can be used to assess the understandability level of a given piece of code, and (ii) investigating characteristics of code considered as important for developers during program comprehension. This study aims at answering the following research questions:
RQ1 What is the correlation between the 121 considered metrics and the understandability level of a given developer for a specific code snippet? Given the wide and heterogeneous set of considered metrics, answering this research question would allow us and, in general, the research community to understand how far we are from defining a set of metrics capable of automatically and objectively assessing code understandability;
RQ2 Is it possible to define understandability models able to predict code understandability? Given a snippet of code, we want to determine whether combining metrics in a model can effectively capture the level of understandability of that code;
RQ3 How do developers determine the understandability of code? While the first two questions relate to metrics to assess understandability, it is also important to consider the perspective of developers when trying to understand code. To this end, we aim to deepen our analysis by asking experienced developers what makes a certain code snippet understandable or not understandable.

5.1 Data Collection

The context of the study consists of 50 Java/Android methods extracted from ten popular systems listed in Table 2 (five methods from each system). We first extracted all the methods having 50±20 ELOCs (i.e., Effective Lines Of Code, excluding blank and comment lines) from the systems. The choice of the methods' size (i.e., 50±20 ELOCs) was driven by the decision of excluding methods that are too trivial or too complex to understand.

TABLE 2: Systems used in our study
System | Java KLOC | Category | Description
ANTLR | 178 | Desktop | Lexer-parser
Car-report | 45 | Mobile | Car costs monitoring
Hibernate | 948 | Framework | ORM framework
Jenkins | 231 | Web | Continuous integration
K9 mail | 121 | Mobile | Mail client
MyExpenses | 101 | Mobile | Budget monitoring
OpenCMS | 1059 | Web | Content Management System
Phoenix | 352 | Framework | Relational database engine
Spring | 197 | Framework | Generic application framework
Weka | 657 | Desktop | Machine-learning toolkit

Afterwards, we computed all the metrics described in Section 3 for the selected methods (excluding the "Developer programming experience" and the "Developer Java experience"). Then, we used a greedy algorithm for center selection [37] to select the 50 most representative methods based on the defined metrics. Given a set of candidate methods M and a set of already selected centers C, the algorithm chooses, in each iteration, arg max_{m ∈ M} dist(C, m), i.e., the candidate method which is the farthest possible (in terms of considered metrics) from the already selected centers; the first center is randomly selected. In order to select exactly five snippets from each system, we used the set of candidate methods from a specific system as M until the five methods for such a system were selected; then, we changed M with the set of candidate methods from another system, and so on, until |C| = 50.
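A minimal sketch of the greedy center-selection step is shown below. It assumes methods are represented as normalized metric vectors and that dist(C, m) is the distance of m from its closest already selected center (the usual choice for greedy k-center selection); the per-system filtering of candidates described above is omitted.

import java.util.ArrayList;
import java.util.List;

public class CenterSelectionSketch {

    /** Euclidean distance in the (normalized) metric space -- an assumption. */
    static double dist(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /** Distance of a candidate from the closest already selected center. */
    static double distToCenters(List<double[]> centers, double[] candidate) {
        double min = Double.MAX_VALUE;
        for (double[] c : centers) {
            min = Math.min(min, dist(c, candidate));
        }
        return min;
    }

    public static List<double[]> selectCenters(List<double[]> candidates, int k) {
        List<double[]> centers = new ArrayList<>();
        centers.add(candidates.get(0)); // first center: picked at random in the study
        while (centers.size() < k) {
            double[] farthest = null;
            double best = -1;
            for (double[] m : candidates) {
                double d = distToCenters(centers, m);
                if (d > best) {
                    best = d;
                    farthest = m;
                }
            }
            centers.add(farthest);
        }
        return centers;
    }
}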

Note that (i) we did not empty the C set when changing the candidate methods (i.e., when moving from one system to another) to always keep track of the methods selected up to that moment, thus avoiding the risk of adding to C methods similar to the ones already in C; (ii) we did not run the algorithm on the union of all candidate methods, to ensure the selection of five methods per system (thus increasing the heterogeneity of the final sample).

After selecting the 50 methods and computing the values of the metrics for each of these 50 methods, we needed to define a ground-truth, which reports the understandability of each method. To this aim, we invited 63 Java developers and CS students to participate in a survey, where they were required to understand the selected methods. The survey was implemented in a Web application and featured the following steps. First, we collected demographic data about the participants: (i) years of experience in programming and more specifically in Java, and (ii) current position (e.g., CS student, developer, etc.). This information was used in part to compute the developer-related metrics. We asked participants for consent to anonymously use the gathered data. We do not report developers' names and e-mail addresses for privacy reasons. After this preliminary step, each participant was required to understand a subset of eight methods randomly selected from the 50 methods. The Web application was designed to automatically balance the number of evaluations for each of the 50 methods (i.e., the number of participants understanding each method was roughly the same). In total, we collected 444 evaluations across the 50 methods (∼8.9 evaluations per method on average), since not all participants completed the survey.

The eight methods were presented individually (i.e., each method on a different page) to participants, and the Web application allowed navigation of the method and access to the methods/classes invoked/used by it. Also, participants were allowed to browse the Web to collect information about types, APIs, data structures, etc. used in the method. This was done to simulate the typical understanding process performed by developers. We asked participants to carefully read and fully understand each method. Participants could, at any moment, click on the button "I understood the method" or the button "I cannot understand the method". In both cases, the Web application stored the time spent, in seconds, by the developer for the method's understanding before clicking on one of the two buttons. If the participant clicked on "I understood the method", the method was hidden and she was required to answer three verification questions about the method she just inspected. Specifically, one of the questions was about the identifiers used in the method (e.g., what does "CMS" mean?), one was about the purpose of a call to an internal method, and another one about the purpose of the use of an external component (e.g., JDBC APIs). The provided answers were stored for future analysis. An example of verification question is: "What does the invoked method X do?". To answer such a question, the participant must understand all of the consequences of invoking a specific external method.

5.2 Analysis Method

In the context of our study, we measure the understandability level using the previously defined proxies of code understandability.

We measure Perceived Binary Understandability (PBU) using the initial declaration of the participants: if they clicked on the "I cannot understand the method" button, PBU is false, while it is true otherwise (i.e., the participant clicked on "I understood the method").

We measure Time Needed for Perceived Understandability (TNPU) as the time, in seconds, spent by the participant while inspecting the method before clicking on "I understood the method". This metric cannot be computed when the participant clicked on "I cannot understand the method".

We measure Actual Understandability (AU) as the percentage of correct answers given by the participants to the three verification questions. If the participant clicked on the "I cannot understand the method" button, AU is 0.

We measure Actual Binary Understandability (ABUk%) with k = 50. Therefore, ABU50% is true when participants correctly answer at least two of the three verification questions, and false otherwise.

To measure Timed Actual Understandability (TAU), we use the previously defined formula, which combines AU and TNPU. TAU gets a value of 0 if the participant clicked on the "I cannot understand the method" button. In this context, we used a modified version of TNPU in which outliers (detected using the Tukey's test [38], with k = 3) are replaced with the maximum value of TNPU which is not an outlier. We did this because the maximum value of TNPU in our dataset is 1,649 seconds, much greater than the third quartile (164 seconds). Using the real maximum value would have flattened down all the relative times.

Finally, we measure Binary Deceptiveness (BDk%), with k = 50, using the previously defined formula, which combines PBU and ABU50%.

We computed these six variables for each of the 444 evaluations performed by participants (i.e., for each method that each participant tried to understand). We excluded 2 of the 121 considered metrics (i.e., NMImin and ITIDmin), because the value of such metrics was 0 for all the snippets.

To answer RQ1, we first verified which metrics strongly correlate among the 121. This was done to exclude redundant metrics, which capture the same information in different ways, from our analysis. We compute the Kendall rank correlation coefficient (i.e., Kendall's τ) [39] to determine whether there are pairs exhibiting a strong correlation. We adopted the Kendall's τ since it does not assume the data to be normally distributed nor the existence of a straight linear relationship between the analyzed pairs of metrics. Cohen [40] provided a set of guidelines for the interpretation of the correlation coefficient. It is assumed that there is no correlation when 0 ≤ |τ| < 0.1, small correlation when 0.1 ≤ |τ| < 0.3, medium correlation when 0.3 ≤ |τ| < 0.7, and strong correlation when 0.7 ≤ |τ| ≤ 1. For each pair of metrics exhibiting a strong correlation (i.e., with a Kendall's |τ| ≥ 0.7), we excluded the one which presented the highest number of missing values (some metrics cannot be computed in all cases; for example, "Area of comments/literals" cannot be computed if the method does not contain literals) or one at random, when the numbers of missing values were equal.
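The redundancy filter can be sketched as follows, here using the Kendall's τ implementation from Apache Commons Math (an assumption; any implementation of the coefficient would do). Handling of metrics with missing values, which the study also takes into account, is omitted.

import org.apache.commons.math3.stat.correlation.KendallsCorrelation;

public class RedundancyFilterSketch {

    /**
     * metricColumns[i] holds the values of metric i across all snippets.
     * Returns a symmetric matrix flagging metric pairs with |tau| >= 0.7,
     * so that one metric of each flagged pair can be dropped.
     */
    public static boolean[][] stronglyCorrelated(double[][] metricColumns) {
        KendallsCorrelation kendall = new KendallsCorrelation();
        int n = metricColumns.length;
        boolean[][] redundant = new boolean[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double tau = kendall.correlation(metricColumns[i], metricColumns[j]);
                redundant[i][j] = Math.abs(tau) >= 0.7;
                redundant[j][i] = redundant[i][j];
            }
        }
        return redundant;
    }
}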

This allowed us to reduce the number of investigated metrics from 121 to 73. Finally, we computed the Kendall correlation between each of the remaining 73 metrics and PBU, TNPU, AU, BD50%, and TAU to verify whether some of them are able to capture the (actual and perceived) understandability of code. We did not compute the correlation with ABU50%, since, in this case, it would have been redundant, because we already compute the correlation with AU.

To answer RQ2, we tried to combine the metrics defined in Section 3 to predict the six proxies of understandability previously defined. Since the number of metrics is high as compared to the number of instances, we performed a preliminary phase of feature selection. First, we removed the features highly correlated among each other (as previously done for RQ1); then, we removed Area Keywords/Comments, because of the high number of missing values (124, more than 30% of the instances), which could be problematic for some of the used machine learning techniques.

To build a model for predicting PBU, ABU50%, and BD50%, we use a broad selection of classifiers defined in the literature, since such variables are nominal. Specifically, we use (i) Logistic Regression [41], (ii) Bayes Networks [42], (iii) Support Vector Machines (SMO algorithm [43]), (iv) Neural Networks (Multilayer Perceptron), (v) k-Nearest-Neighbors [44], and (vi) Random Forest [45]. As a first step, we use 10% of each dataset to tune the hyperparameters for some machine learning techniques known to be sensitive to parameter tuning, and we remove such instances from the dataset used in the experiment. We focus on the main parameters of each technique and, specifically, we tune: (i) number of hidden layers, learning rate, and momentum for Multilayer Perceptron; (ii) k, weighting method, and distance metric for kNN; (iii) kernel, exponent, and complexity for SMO; (iv) number of features for Random Forest. To do this, we use an exhaustive search approach on a reduced search space: we define discrete values for each parameter (at most 10 values) and we try all of the possible combinations. We search for the combination of parameters that achieves the best AUC using leave-one-out cross-validation on the tuning set. We do not tune hyperparameters for Logistic Regression and Bayes Networks, since they do not rely on any particular parameter.

Both PBU and BD50% were unbalanced: PBU presented a high number of positive instances (∼69%), while BD50% a high number of negative instances (∼80%). To have balanced models, we use the SMOTE filter [46] on the training sets to generate artificial instances for the minority classes.

Moreover, while Random Forest is designed to automatically select the best features, for the other algorithms it is important to have an adequate number of features in relation to the number of instances. To achieve this goal, for those techniques, we performed a second round of feature selection using linear floating forward selection with a wrapper strategy. We used Logistic Regression as the classifier for the wrapper and AUC (Area Under the Curve) as the metric for the evaluation of the effectiveness of each subset of features, computed with a 5-fold cross-validation. We report the average F-Measure of the classes of each variable and the AUC of all the classifiers for the three variables to show the effectiveness of combinations of metrics defined in the literature in the prediction of the nominal proxies of understandability.

For the other proxies, i.e., TNPU, AU and TAU, we use several regression techniques defined in the literature, since such variables are numeric. Specifically, we use (i) Linear Regression, (ii) Support Vector Machines (SMOreg [47]), (iii) Neural Networks (Multilayer Perceptron), (iv) k-Nearest-Neighbors [44], and (v) Random Forest [45]. In this case, we report the correlation of the predicted values with the actual values and the MAE (Mean Absolute Error) of the prediction, computed as:

MAE = \frac{1}{n} \sum_{i=1}^{n} |x_i - x_i^*|

We use the same approach that we used for classification to tune the parameters of the regressors and to select the best features. In this case, we look for the parameters and the features that minimize the mean absolute error. We use linear regression as the regressor in the wrapper strategy.

Both classifiers and regression models need to be trained and tested on different datasets to avoid overfitting (i.e., the model would fit the specific data, but not generalize). For this reason, we performed leave-one-out cross-validation, where we divided the dataset into 444 folds, i.e., each fold contains exactly one instance. For each iteration, one of the folds was used as test set and the union of the other folds as training set for our models. Therefore, for each evaluated instance, we train our models on the whole dataset without the test instance. This solution is ideal in this context for two reasons: (i) since our dataset may be small for regression techniques to be effective, we cannot afford to reduce the number of instances for the training phase; (ii) this type of validation allows us to use all the evaluations of the same developer (except for the one that has to be tested) in the training phase. This would allow machine learning techniques to define rules specific to the developer. It is worth highlighting that we do not aim at comparing different machine learning techniques. Instead, our goal is to understand if any technique is able to effectively combine the considered metrics to capture code understandability.
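The sketch below shows the shape of the leave-one-out evaluation described above using the Weka API: for each evaluation, the remaining instances form the training set, SMOTE (available as a separate Weka package) rebalances the training data, and predictions are accumulated in a single Evaluation object. The ARFF file name and the choice of Random Forest are placeholders; hyperparameter tuning and wrapper-based feature selection are omitted.

import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.instance.SMOTE;

public class LeaveOneOutSketch {

    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("understandability-pbu.arff").getDataSet(); // hypothetical file
        data.setClassIndex(data.numAttributes() - 1);

        Evaluation eval = new Evaluation(data); // accumulates predictions over all folds
        for (int i = 0; i < data.numInstances(); i++) {
            Instances train = new Instances(data);
            train.delete(i);                         // everything except the held-out evaluation
            Instances test = new Instances(data, 0); // empty set with the same header
            test.add(data.instance(i));

            SMOTE smote = new SMOTE();               // oversample the minority class in training data
            smote.setInputFormat(train);
            Instances balanced = Filter.useFilter(train, smote);

            RandomForest rf = new RandomForest();
            rf.buildClassifier(balanced);
            eval.evaluateModel(rf, test);
        }
        // Class index 1 used here as a placeholder for the positive class.
        System.out.printf("AUC = %.2f, F-Measure = %.2f%n",
                eval.areaUnderROC(1), eval.weightedFMeasure());
    }
}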

Finally, to answer RQ3, we conducted semi-structured interviews with five experienced developers, listed in Table 4. The developers provided consent to include their names in this paper. We do not directly associate the names with the performance in the tasks. To guide the interviews, we selected 4 methods among the 50 that we used to answer our first two research questions. Such methods were: (i) the one with the highest mean TAU, i.e., the most understandable one; (ii) the one with the lowest mean TAU, i.e., the least understandable one; (iii) the one with the highest standard deviation in TAU, i.e., the one for which the understandability seems to be most subjective; and (iv) the one with the highest TNPU and a number of BD50% = true greater than zero, i.e., the method that, despite being analyzed for the longest time, still makes some developers incorrectly believe that they understood it. We use TAU as a proxy for understandability to select the first three snippets, as it takes into account both the actual understandability and the time taken to understand the snippet. Table 3 shows the four methods we selected.

For each method, we asked the five developers to read and understand it, thinking aloud if they wanted. Before the participants started to read and understand the snippet, we asked them how familiar they were with the system to which the snippet belongs and with the APIs used in the snippet. For each snippet, after the participants concluded the understanding phase, we asked precise questions: Q1: "do you think this snippet is understandable?"; Q2: "what makes this method understandable/not understandable to you?"; Q3: "is it possible to modify the method to make it more understandable? If yes, how?". If the participants understood the snippet, we asked for the purpose of the snippet to ensure they actually understood it, unless they explained it while they thought aloud during the interview. Finally, we registered the time the participants took to understand the snippets. We report the answers of the participants to each question and, above all, we report interesting insights that we gathered during the interviews.

5.3 Replication Package

The data used in our study is publicly available [33]. We provide the research community with our dataset reporting the (perceived and actual) understandability achieved by the 63 participants, hoping that the availability of such a dataset will help foster research on the definition of a metric to automatically assess code understandability.

6 EMPIRICAL STUDY RESULTS

In this section, we present the results of our empirical study. Fig. 1 provides information about the participants involved in RQ1 (the same dataset is used for RQ2 as well). The majority of them (∼60%) are CS bachelor's students, mixed in terms of years of programming experience. The sample of participants also includes nine master's students, three Ph.D. students, and thirteen professional developers.

Fig. 1: Participants to the study (38 bachelor's students, 9 master's students, 3 Ph.D. students, 13 professional developers).

6.1 RQ1: Correlation With Single Metrics

When evaluating the metrics on a larger number of evaluations and by considering more proxies for code understandability, we still observe the negative results that were shown in our previous study [19]: very few metrics have a correlation with understandability higher than |0.1|. Specifically, 8 metrics have a weak correlation with PBU, only one with TNPU, 13 with AU, 13 with TAU, and 2 with BD50%. Note that, since we only observed weak correlations, these metrics are very unlikely to be appropriate proxies for code understandability. 51 out of the 73 metrics considered showed no correlation at all with any of the proxies.

The metrics with the highest correlation with PBU are two: (i) maximum line length (τ ≈ −0.13), which is one of the metrics introduced by Buse and Weimer [10] for readability prediction; (ii) PEspec (τ ≈ 0.13), which measures the Java experience of the developer. Note that Buse and Weimer also found that maximum line length is the most important metric for readability prediction in their model [10]. Therefore, this reinforces the fact that, generally, developers tend to perceive code with long lines as less pleasant. The correlation with PEspec, instead, shows that developers with more experience in the specific programming language tend to have a slightly higher confidence and tend to perceive snippets of code as understandable more frequently than developers with less experience. Finally, we observed other low correlations with PBU: NMavg (-0.12), i.e., when words used for identifiers have many meanings, they make developers perceive the snippet as slightly less understandable; MIDQmin (0.12), i.e., the higher the minimum quality of the internal documentation, the higher the perceived understandability; and average identifiers' length (-0.12), i.e., shorter identifiers slightly help perceiving the code as more understandable.

While in our previous study we showed that ten of the metrics have a slight correlation with TNPU, replicating this work on a larger number of evaluations reduced that number to just a single metric, i.e., DFT of conditionals (0.11).

The metric that has the highest correlation with AU is average Textual Coherence (τ ≈ −0.16). The fact that such a correlation is negative is surprising, because we expected a higher Textual Coherence to imply a higher understandability. Also, we found that number of parameters negatively correlates with AU (τ ≈ −0.13), i.e., the larger the number of parameters, the lower the actual understandability of the method. We found a similar result also in RQ3 when interviewing developers. Other examples of metrics correlated with AU are PEspec (τ ≈ 0.13) and DFT of conditionals (τ ≈ −0.13).

The metric with the highest correlation with TAU is DFT of conditionals (τ ≈ −0.16). This suggests that high complexity reduces the understandability of a snippet. It should be noted that for TAU, which is an expression of actual understandability, we observe only a slight correlation with programming experience (τ ≈ 0.11 for PEspec).

Finally, only two metrics, i.e., number of literals (Visual X) and DFT of conditionals, are slightly correlated with BD50% (τ ≈ 0.11 and 0.1, respectively). All the other metrics show a negligible correlation.
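The correlations reported in this section are Kendall's τ [39]. As a reference, τ between a metric and a proxy, both measured over the same set of evaluations, can be computed as in the following sketch (plain Java, τ-b variant to account for ties; array and method names are ours):

// Kendall's tau-b between a metric vector and a proxy vector (e.g., TAU values),
// one entry per evaluation. O(n^2), which is fine for a few hundred observations.
static double kendallTauB(double[] metric, double[] proxy) {
    int n = metric.length;
    long concordant = 0, discordant = 0, tiedX = 0, tiedY = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            boolean tx = metric[i] == metric[j];   // pair tied on the metric
            boolean ty = proxy[i] == proxy[j];     // pair tied on the proxy
            if (tx) tiedX++;
            if (ty) tiedY++;
            if (!tx && !ty) {
                double s = (metric[i] - metric[j]) * (proxy[i] - proxy[j]);
                if (s > 0) concordant++; else discordant++;
            }
        }
    }
    double n0 = n * (n - 1) / 2.0;
    return (concordant - discordant) / Math.sqrt((n0 - tiedX) * (n0 - tiedY));
}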

Summary for RQ1. None of the metrics we considered achieves a medium/strong correlation with any of the proxies of code understandability we defined.

6.2 RQ2: Performance of Multi-Metrics Models

We report the performance of the models that combine metrics, dividing them into classification models, for PBU, ABU50% and BD50%, and regression models, for TNPU, AU and TAU.

TABLE 3: Methods used during the interviews with developers

System      Class                          Method              Type                                           URL
OpenCMS     CmsHoverbarContextMenuButton   createContextMenu   The most understandable (TAU = 0.74)           https://git.io/vpHzL
Phoenix     MetaDataEndpointImpl           doDropSchema        The least understandable (TAU = 0.06)          https://git.io/vpHzm
Hibernate   TimesTenDialect                TimesTenDialect     The most subjective (sd(TAU) = 0.46)           https://git.io/vpHzs
MyExpenses  LazyFontSelector               processChar         The most deceptive (TNPU = 391.0, #BD50% = 2)  https://git.io/vpHzZ

TABLE 4: Position and experience of the interviewed developers

Name                 Position                         Programming experience
Salvatore Geremia    PhD Student @ Unimol             8 years
Giovanni Grano       PhD Student @ Uzh                8 years
Stefano Dalla Palma  Android developer @ Datasound    5 years
Carlo Branca         Front-end developer @ Gatelab    8 years
Matteo Merola        Back-end developer @ Bunq        8 years

TABLE 6: Regression results of TNPU, AU, and TAU

                     TNPU                   AU                     TAU
Regressor            Correlation   MAE      Correlation   MAE     Correlation   MAE
Linear Regression    0.14          132.9    0.35          0.30    0.36          0.27
kNN                  0.11          137.1    0.26          0.31    0.21          0.28
SMOreg               0.17          114.4    0.36          0.29    0.29          0.27
Random Forest        0.09          147.0    0.34          0.29    0.29          0.28
ML Perceptron        0.18          124.8    0.37          0.30    0.36          0.27

TABLE 5: Classification results of PBU, ABU50%, and BD50%

                     PBU                    ABU50%                 BD50%
Classifier           F-Measure   AUC        F-Measure   AUC        F-Measure   AUC
Logistic             0.68        0.71       0.63        0.71       0.72        0.71
kNN                  0.62        0.63       0.59        0.66       0.74        0.66
SMO                  0.64        0.63       0.63        0.63       0.71        0.63
Naive Bayes          0.60        0.65       0.66        0.68       0.63        0.66
Random Forest        0.65        0.63       0.67        0.72       0.77        0.64
ML Perceptron        0.66        0.69       0.63        0.70       0.70        0.70

6.2.1 Classification

Table 5 shows the F-Measure and AUC of the classification of PBU, ABU50%, and BD50%. Since we use Logistic Regression for feature selection, we do not include it directly in the comparison among the techniques. It is worth noting that the AUC achieved by Random Forest and Multilayer Perceptron for the classification of ABU50% seems to suggest that it is possible to classify with a good level of confidence snippets as actually understandable or actually not understandable. Looking at the F-Measure, however, it is clear that we are quite far from having a practical classifier for actual understandability. Also, looking at the classification accuracy, 33% of the instances are wrongly classified by the best model.

Such results are even more negative for PBU and, above all, BD50%. For PBU, the maximum F-Measure is ∼0.66. On the other hand, the best F-Measure for BD50% is 0.77, achieved by Random Forest. However, this positive result hides the fact that such a classifier has good results only on negative instances. Both precision and recall for the positive class are, indeed, very low (mean 0.31 and 0.51 for precision and recall, respectively). In general, looking at the F-Measure of the minority classes, which are perceived as not understandable and deceptive, such values are much lower (i.e., 0.48 and 0.37, respectively), despite the fact that we used SMOTE to balance the training sets. We can conclude that the combination of the considered metrics shows a slight discriminatory power for actual binary understandability (ABU50%); however, we are quite far from a practically useful prediction model of actual/perceived understandability and deceptiveness.

6.2.2 Regression

In Table 6, we report the performance of the regression models for TNPU, AU, and TAU. The first thing that is very clear is that our models are not able to predict TNPU (Time Needed for Perceived Understandability). The highest correlation for TNPU is only 0.18, higher than the correlation achieved by single metrics, but still very low. The Mean Absolute Error is also very high. On average, we can expect a prediction of TNPU to be wrong by about 2 minutes. Considering that the average TNPU is 143.4 seconds, without excluding outliers, it is clear that any prediction of TNPU made with the state-of-the-art metrics is practically useless.

On the other hand, it can be seen that there is a good improvement in the correlation for both AU and TAU when using combinations of metrics rather than single metrics. The maximum correlations are 0.37 and 0.36, respectively, leading to a medium correlation. However, it is worth noticing that the Mean Absolute Error is quite high. In our context, the MAE of 0.29 for AU means that we should expect our model to be wrong by one answer, on average (0.29 ≈ 0.33). If our model predicts that AU is 0.66 (the participant gives two correct answers out of three), it may be that she gives one or three correct answers, thus making this prediction poorly actionable except for corner cases (predicted AU very low or very high).
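For reference, the correlation and MAE values reported in Table 6 are the standard outputs of a cross-validated regression run; with Weka [48] they could be obtained roughly as follows (a sketch: the ARFF file name, the use of Linear Regression, and the position of the class attribute are our assumptions):

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.LinearRegression;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RegressionEval {
    public static void main(String[] args) throws Exception {
        // Hypothetical dataset: one instance per evaluation, metrics as features, TAU as numeric class.
        Instances data = new DataSource("understandability-tau.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        LinearRegression regressor = new LinearRegression();
        Evaluation eval = new Evaluation(data);
        // Leave-one-out cross-validation: as many folds as instances.
        eval.crossValidateModel(regressor, data, data.numInstances(), new Random(1));

        System.out.println("Correlation: " + eval.correlationCoefficient());
        System.out.println("MAE:         " + eval.meanAbsoluteError());
    }
}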
Summary for RQ2. Combining the state-of-the-art metrics can result in models that show some discriminatory power in the prediction of some proxies of code understandability (i.e., ABU50%, AU, and TAU). However, such predictions are not sufficiently good yet to be used in practice.

6.3 RQ3: Structured Interviews with Developers

We report the results of the interviews by presenting the developer responses grouped by each type of code snippet. The interviews lasted roughly between one and two hours each. For space limitations, we report the complete code snippets in our replication package [33].

6.3.1 The Most Understandable Method

All the developers correctly described the functionality implemented in the method, and all of them answered positively to Q1 (i.e., they think the method is understandable). The average time needed to understand the method was about 2.5 minutes. As expected, this was the method understood most quickly.

// Actual code
result.add(new CmsGotoMenuEntry(hoverbar));
result.add(new CmsGotoExplorerMenuEntry(hoverbar));

// Desired code
result.add(new CmsGotoMenuEntry        (hoverbar));
result.add(new CmsGotoExplorerMenuEntry(hoverbar));

Fig. 2: Actual code vs aligned code (first snippet)

The single aspect that makes this method highly understandable, according to the developers, is the easy/clear task it implements. Other positive aspects of the method in terms of understandability (Q2) are that it is very tidy (good readability, in general) and it implements a single task. Carlo highlighted that the names are very well chosen. However, both Salvatore and Giovanni put emphasis on the many repetitions in the code. Salvatore said that, to some extent, repetition and alignment are positive, because they help the brain ignore parts of the code that are not useful to read. On the other hand, Giovanni thinks that repetitions make the method "poorly maintainable and less elegant"; however, he does not think that repetitions hinder understandability.

Three developers agreed that there is no negative aspect in terms of understandability, and that it would not be possible to have a more understandable version of the method (Q3). On the other hand, Salvatore thinks that the repetition of the actual parameter "hoverbar" for all the constructors and the lack of alignment force the reader to check whether it is actually true that all the constructors are called with such an actual parameter, and this slightly increases the time needed to understand the method. He would have aligned all the actual parameters of the constructors in the list (Fig. 2). Matteo thinks that the name "hoverbar" is not very clear, and documentation is lacking for it. He also thinks that the abstract return type makes the method slightly less understandable, because it is necessary to understand what kinds of concrete types can be returned. He said that, from his experience, there is often a trade-off between understandability and maintainability. In this case, using abstract types is necessary to make the method more maintainable (e.g., all the classes that implement the same interface have the same signature, and it is easier to extend the system). However, this makes the implemented mechanism (and the system) harder to understand.

While reading and understanding the method, the developers used different approaches. Some of the developers looked at class and internal dependencies of the method (i.e., the context in which such a method exists); some looked as well at some of the classes used in the method and inferred that they are very similar, which was helped by the names; some focused on the method itself. One of them searched on the internet for "hoverbar" to understand its meaning in this context. One of the developers also looked for information about the entire system and the repository (e.g., number of developers), and he looked at the documentation of the implemented interface to get information about the method to understand.

6.3.2 The Least Understandable Method

Four developers out of five did not understand the second method. They admitted that fully understanding the method would have required much longer, and they asked to stop after about 5 minutes. One of them, on the other hand, took some extra time to analyze the classes used in the method (about 8 minutes, in total). In the end, he said he understood the method and he was able to describe what the method did. Three developers said that the method is not understandable (Q1), while the other two, surprisingly, said that the method is understandable. The developer who did not fully understand the method but perceived it as understandable explained that he roughly understood the method, but the lack of knowledge about the system made him say that it would take more time to properly understand it.

Two developers highlighted that a positive aspect of the method (Q2) is that it implements just one task. One of them also generally liked the names used for the identifiers. On the other hand, all the developers listed many negative aspects. The most negative aspect highlighted is the complete lack of comments and internal documentation (all the developers). Also, the fact that the method belongs to a big and untidy class makes the method itself less understandable (2 out of 5 developers). The developers also generally complained about the low readability of the method (4 out of 5 developers). Matteo and Carlo pointed out that another negative aspect is that the method is highly dependent on some internal classes/methods that are hard to understand. Stefano, Salvatore and Matteo did not like the presence of many exit points in the method; Salvatore and Matteo also said that the high number of parameters strongly reduces its understandability (confirming what we found answering RQ1), and Matteo noted that some of the parameters are not used at all. Stefano and Matteo did not like some identifier names (e.g., areTablesIdentifiers), which they considered potentially misleading. Matteo thinks that this method has too many responsibilities. For example, the first thing the method does is to load a schema, based on the value of one of the parameters, but he thinks that it would be better to directly have a parameter with the loaded schema. Also, he thinks that the application domain is quite complex.

All the developers answered Q3 saying that they would improve the understandability of the method by adding comments and improving its readability. Matteo said that he would (i) change the order of the parameters (e.g., putting the schema as the first parameter), (ii) use exceptions instead of SCHEMA_NOT_FOUND instances, and (iii) start with a check for exceptional behaviors (those that lead to SCHEMA_NOT_FOUND) and put the normal behavior after that.

Also in this case, the developers used different approaches to read and understand the snippet. One developer tried to look for the internal documentation of the used classes (which is lacking); two developers looked at the code of the used classes; one developer strongly relied on the names of the identifiers. Finally, one developer said that he would have used the debugger to understand the snippet (i.e., he would have analyzed the dynamic behavior of the code). He explained that understanding this snippet only by looking at the code statically would take longer.

6.3.3 The Most Controversial Method

Four out of five developers were able to understand the method with a low effort (∼2 minutes).

One of the developers took longer (5 minutes). He explained that he had no experience with SQL dialects. However, in the end, all of them were able to understand the method. Three of the developers were familiar with SQL dialects and Hibernate, and they took less than the average time to understand the snippet. All the developers agreed that the method is understandable (Q1). However, three out of five developers explicitly said that a good knowledge of SQL dialects and, partially, of Hibernate is necessary to understand this method; conversely, one developer said that the lack of knowledge of SQL dialects would have just increased the time needed to understand the method, but the method would have been easy to understand anyway.

All the developers appreciated the good names and the good quality of the comments and Javadoc (Q2), and four out of five developers said that there are no negative aspects of this method in terms of understandability and that they would not change it (Q3). Matteo, on the other hand, said that, in general, he would have divided the method into three private methods: one for the columns, one for the SQL properties, and one for the functions. However, he said that it is not strictly necessary in this specific case, because the dialect is easy.

The interviews suggest that this method was the most controversial in our study due to differences in background among the developers. Such subjectivity may have strongly influenced the time needed to understand the snippet. While the subjectivity of this method emerged from the survey we conducted, the five developers we interviewed seemed to agree that the method is understandable. This is most likely due to the fact that they are all professional and experienced developers.

6.3.4 The Most Deceiving Method

Three out of five developers said that they understood the method and that they found the method understandable (Q1). Two developers said that it would take longer to fully understand it. On average, the developers took about 5 minutes to understand the method. Interestingly, when we asked them to precisely describe what the method does, one of the developers noticed that he actually did not understand the reason why a for loop was in the method. Therefore, in the end, only two developers actually understood the method.

Differently from the other methods, for which developers often highlighted more or less the same positive/negative aspects in terms of understandability (Q2), for this method each of them talked about a different positive aspect. Salvatore said that, despite the nesting, the conditions are quite simple and the blocks of the conditions are short. According to him, this facilitates the understanding of causes and effects in the code. Conversely, Matteo said that he did not like the many nested if controls, which make the method hard to understand, according to him. Giovanni liked the names used, and he found them quite clear. Carlo, instead, did not like them, but he said that the fact that all the internal methods called in this snippet were in the same class improved the understandability of the method. Stefano did not find any positive aspects about this method. On the other hand, the developers said that some names, such as "surrogate", are uncommon and abstract, and they may hinder the understandability of the method (2 out of 5 developers); Matteo pointed out that the name of the method itself (processChar) is too generic and ambiguous. The redundancy in some parts of the method is definitely a negative aspect: the method contains two very similar blocks of code (4 out of 5 developers). They all suggested removing this repetition (Q3).

To understand this snippet, three developers looked only at the code of the method, while two of them found it useful to look at the only method in the class that calls the snippet's method.

Table 7 provides a summary of the factors that the developers mentioned at least once during the interviews. The quality of the identifiers is mentioned by all of them, but also code readability and comments seem to be valuable, according to most of them.

TABLE 7: Factors mentioned by the participants

Factor                     Developers mentioning it (out of 5)
Quality of Identifiers     5
Code Readability           4
Presence of Comments       4
# Exit Points              3
Documentation Quality      3
# Responsibilities         3
Quality of the Class       3
# Parameters               2
Nesting                    2
Quality of Dependencies    2
Application Domain         2
Code Repetitions           2
Broken Lines               2
Control Flow               2

Summary for RQ3. The interviewed developers agree on a subset of aspects that are desirable (e.g., good identifiers) or undesirable (e.g., too many exit points) for having understandable code. However, we found no agreement on many other aspects and, above all, on the adopted understanding process.

6.4 Discussion

In our previous study [19], we showed that no single metric has a non-weak correlation with any proxy of understandability. Increasing the size of our dataset further reduced the correlations. Most noticeably, the number of metrics with a correlation with TNPU higher than |0.1| was ten in our previous study, and this number has been reduced to just one with the larger dataset.

We also tried to combine these metrics in classification and regression models to predict different aspects of code understandability. In a previous study, Trockman et al. [32] used LASSO regression to classify ABU50%, and they achieved an AUC of 0.64. We achieved a higher AUC for the classification of ABU50% (0.72) and a comparable AUC for PBU (0.69) and BD50% (0.70). However, looking at the F-Measure, it is clear that the prediction model would not be useful in practice. Compared to the readability models, understandability models are much less effective and practically unusable.

Combining metrics in regression models to predict TNPU, AU, and TAU also shows the limits of the metrics, which can achieve a maximum correlation of 0.37 (with AU). Therefore, we can confirm the negative result of our previous study: the metrics we investigated are not enough to capture code understandability. However, as previously hinted by Trockman et al. [32], combining metrics helps to achieve better results as compared to single metrics for most of the understandability proxies.

In the interviews that we conducted, we found that developers perceive readability as an aspect that highly influences code understandability. However, this contradicts our quantitative results. It is possible that we did not capture any correlation between readability and understandability because we measured the understandability effort only by using the time spent to understand a snippet as a proxy. A factor that we ignore is the mental effort actually needed to understand a snippet. It could be that unreadable code makes developers more tired in the long run, but when focusing on a single snippet this does not result in noticeable differences in terms of comprehension time and it does not affect the correctness of the understanding process. Experienced developers, who had the chance of working both with readable and unreadable code for longer periods, consider this an important aspect because they feel it affects their overall performance. The same may be true for other aspects associated with code readability (such as the quality of identifiers [14]). Most importantly, our interviews with developers show that each developer uses her own understanding process. When code is harder to understand, some look at the related classes, while others say they would run the code and use the debugger to understand it. One of the most surprising facts is that some developers found the least understandable snippet to be understandable.

Finally, we found that the personal background of developers plays a crucial role in understanding: when the knowledge of a concept is lacking, it takes time to acquire such knowledge. This may be the main limit of the metrics we introduced: we measure the "knowledge" contained in a snippet (e.g., with MIDQ and AEDQ), but we do not measure the knowledge that the developer already has. Our metrics may be valid for the "average" developer, but when we try to measure understandability at a personal level, they are not enough.

To further confirm this, we tried to check how the values of our proxies for code understandability vary among different evaluators (for the same snippet) and different snippets (for the same evaluator). For each proxy P, we computed the mean variance of P among different evaluators (for the same snippet) using the formula Vs(P) = (Σ_{i=1..n} var(P_i)) / n, where n is the number of snippets in our dataset and P_i is the vector containing the values of P for snippet i. The higher Vs, the larger the differences in terms of understandability among different evaluators for the same snippets. Similarly, we computed the variation among different snippets for the same evaluator as Vd(P) = (Σ_{i=1..n} var(P_i)) / n, where n is the number of developers and P_i is the vector containing the evaluations of developer i. Again, the higher Vd, the higher the differences in understandability among different snippets for the same evaluator.
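As a concrete reading of these two definitions, the sketch below computes Vs and Vd from a matrix of proxy values (plain Java; the matrix layout, the use of the population variance, and all names are our assumptions):

// values[i][j] = proxy value (e.g., TAU) given by developer j to snippet i; NaN if missing.
static double vS(double[][] values) { return meanVariance(values); }            // rows = snippets
static double vD(double[][] values) { return meanVariance(transpose(values)); } // rows = developers

// (1/n) * sum of var(P_i), one vector P_i per row.
static double meanVariance(double[][] vectors) {
    double sum = 0;
    for (double[] v : vectors) sum += variance(v);
    return sum / vectors.length;
}

// Population variance of the non-missing entries of a vector.
static double variance(double[] v) {
    double mean = 0; int k = 0;
    for (double x : v) if (!Double.isNaN(x)) { mean += x; k++; }
    mean /= k;
    double var = 0;
    for (double x : v) if (!Double.isNaN(x)) var += (x - mean) * (x - mean);
    return var / k;
}

static double[][] transpose(double[][] m) {
    double[][] t = new double[m[0].length][m.length];
    for (int i = 0; i < m.length; i++)
        for (int j = 0; j < m[0].length; j++)
            t[j][i] = m[i][j];
    return t;
}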
We report in Table 8 both Vs and Vd for TNPU, AU, and TAU, i.e., the numeric proxies we previously defined. We also report the values of Vs for the different categories of participants.

TABLE 8: Mean variance of the proxies among snippets and developers (lower values imply more similar scores).

                    AU      TNPU      TAU
Vs (BsC)            0.11    61,683    0.08
Vs (MsC)            0.06     4,209    0.03
Vs (PhD)            0.09    24,924    0.10
Vs (Professional)   0.07    42,425    0.07
Vs                  0.13    34,294    0.10
Vd                  0.12    26,109    0.09

The table shows that Vs is slightly higher than Vd for all the proxies. This shows that understandability depends more on the developer than on the snippet he/she is evaluating, even if such a difference is not very high. Also, it is worth noting that Vs(AU) decreases if we divide the developers into categories based on their professional position. This means that different categories of developers achieve more similar levels of correctness. Specifically, professional developers and Master's students seem to be the most cohesive groups in terms of correctness (i.e., the groups that exhibit the lowest variance). The same happens (but with lower differences) for Vs(TAU). On the other hand, for Vs(TNPU) there are categories with lower inter-group variations (i.e., Master's students and PhD students), while others have higher variations (i.e., Bachelor's students and professional developers).

Finally, because of this result, we tried to use the professional position of the developer as an additional feature in our combined models (RQ2). We observed a slight improvement in the regression performance of TAU (Correlation: +0.07; MAE: -0.02) and AU (Correlation: +0.02; MAE: +0.00), while we achieved lower classification performance for BD50% (F-Measure: -0.06; AUC: -0.02), and comparable regression and classification performance for PBU (F-Measure: -0.02; AUC: +0.01), ABU50% (F-Measure: +0.07; AUC: -0.02), and TNPU (Correlation: +0.03; MAE: +7.5). However, the improvement relates only to the maximum scores achieved: not all the machine learning techniques achieve better results.

Therefore, we can conclude that the effort in the prediction of code understandability should be directed at capturing subjective aspects of developers, namely their background knowledge and their experience, not only in quantitative terms (i.e., years of experience) but also in qualitative terms: the interviews suggest that when developers are not familiar with the topics in the code that they need to understand, they need to spend some time searching for information about them. Introducing new developer-related metrics considering their experience with specific topics (e.g., JDBC APIs) or design patterns (e.g., the bridge design pattern) could be useful to capture such aspects.

7 THREATS TO VALIDITY

Threats to construct validity, concerning the relation between theory and observation, are mainly due to the measurements we performed, both in terms of the 121 metrics that we studied and in terms of the definition of the six dependent variables for the understandability level. Concerning the 121 metrics, we tested our implementation and, when needed (e.g., for the IMSQ metric during the identifiers splitting/expansion), relied on manual intervention to ensure the correctness of the computed metrics. As for the dependent variables, we tried to capture both the perceived and the actual code understandability in many ways. However, different results might be achieved by combining correctness and time in different ways.

Threats to internal validity concern external factors that we did not consider that could affect the variables and the relations being investigated. Since two of the understandability proxies are time-related (i.e., they are based on the time participants spent while understanding the code), it is possible that some participants were interrupted by external events while performing the comprehension task. For this reason, we replaced outliers for TNPU in the computation of TAU with the maximum TNPU that was not an outlier. An outlier was a participant requiring more than Q3 + (3 × IQR) seconds to understand a code snippet, where Q3 is the third quartile and IQR is the Inter-Quartile Range. We used leave-one-out cross-validation to evaluate all the models used to answer RQ2. This means that some of the evaluations of the same developer were used in the training set. This could allow the models to learn some peculiarities about the preferences of the developer. It is worth noting that such evaluations represent a small minority of the training instances (< 2%) and they are unlikely to heavily affect the trained model. Also, our assumption is that, in a real use-case scenario, developers might contribute understandability evaluations to the training set. We acknowledge that this assumption may not hold in all contexts. Finally, as for RQ3, the think-aloud strategy we used to collect qualitative data could have affected the performance of the developers. We did not ask questions while the developers were reading and understanding the code to minimize such a threat.
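As a concrete reading of the outlier rule mentioned above, TNPU values could be capped along these lines (a plain Java sketch; the nearest-rank percentile convention and all names are our choices):

// Replace TNPU outliers (> Q3 + 3*IQR) with the largest non-outlier value.
static double[] capOutliers(double[] tnpu) {
    double[] sorted = tnpu.clone();
    java.util.Arrays.sort(sorted);
    double q1 = percentile(sorted, 25), q3 = percentile(sorted, 75);
    double threshold = q3 + 3 * (q3 - q1);             // Q3 + (3 x IQR)
    double maxNonOutlier = Double.NEGATIVE_INFINITY;
    for (double t : sorted) if (t <= threshold) maxNonOutlier = Math.max(maxNonOutlier, t);
    double[] capped = tnpu.clone();
    for (int i = 0; i < capped.length; i++) if (capped[i] > threshold) capped[i] = maxNonOutlier;
    return capped;
}

// Simple nearest-rank percentile on a sorted array (one of several common conventions).
static double percentile(double[] sorted, double p) {
    int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
    return sorted[Math.max(0, Math.min(idx, sorted.length - 1))];
}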
Threats to conclusion validity concern the relation between the treatment and the outcome. The results of RQ2 may depend on the machine learning techniques used. To limit this threat, we used the most common and widespread machine learning techniques, being careful to choose them from different families, such as tree-based, Bayesian, and neural networks. Also, such results may depend on the parameters used for the machine learning techniques. We always used the standard parameters provided by Weka [48] for all the machine learning techniques.

Threats to external validity concern the generalizability of our findings. Our study has been performed on a large, but limited, set of metrics and by involving 63 participants comprehending a subset of 50 methods extracted from 10 Java systems. We increased the number of observations from our previous study [19] by 22% to improve the generalizability of our findings. All of the results hold for the considered population of participants and for Java code. Larger studies involving more participants and code snippets written in other languages should be performed to corroborate or contradict our results. The same is true for the developers involved in RQ3 (i.e., they were from Italy with a similar background). We tried to select developers with diverse specializations: the three professional developers work in different areas (Android, front-end, back-end); one of the two PhD students had previous experience in industry, while the other one did not. It is worth noting that it is very hard to involve developers in such a long interview (more than an hour each). Since we observed differences in their evaluation of code understandability, a more comprehensive study with a more diverse set of developers would be needed to generalize our results, and it may highlight other factors that underlie the subjectivity of code understandability.

8 CONCLUSION AND FUTURE WORK

We presented an empirical study investigating the correlation between code understandability and 121 metrics related to the code itself, to the available documentation, and to the developer who is understanding the code. We asked 63 developers to understand 50 Java snippets, and we gathered a total of 444 evaluations. We assessed the participants' perceived and actual understanding for each snippet they inspected and the time they needed for the comprehension process. Our results demonstrate that, in most of the cases, there is no correlation between the considered metrics and code understandability. In the few cases where we observed a correlation, its magnitude is very small. Combining metrics generally results in models with some discriminatory power (classification) and with a higher correlation, compared to single metrics (regression). However, such models are still far from being usable in practice for the prediction of understandability. Finally, we reported interviews with software developers, which provide useful insights about what makes code understandable or not understandable. We noticed that each developer puts emphasis on some aspects of understandability, and they give a different level of importance to each aspect. Note that we used shallow/classic models in our study; deep models from the deep learning community should be used as part of future work, given their power to abstract and model complex relationships between input data across several abstraction layers, similar to the cognitive processes in the human brain. As suggested by the interviews, developers exhibited some commonalities but also variability in the understanding process that might not be captured by classic models; thus, shallow models may not be the best choice to predict/measure understandability.

Our study lays the foundations for future research on new metrics actually able to capture facets of code understandability. In our opinion, the state of the art lacks developer-related metrics. Therefore, we think that future research should be aimed at defining more of such metrics to properly capture code understandability. To enable the research community to investigate this direction further, we publicly release our new dataset [33].

ACKNOWLEDGMENTS

Bavota was supported in part by the SNF project JITRA, No. 172479.

REFERENCES

[1] R. Minelli, A. Mocci, and M. Lanza, "I know what you did last summer - an investigation of how developers spend their time," in 23rd IEEE International Conference on Program Comprehension, 2015, pp. 25–35.

[2] K. K. Aggarwal, Y. Singh, and J. K. Chhabra, "An integrated measure of software maintainability," in Annual Reliability and Maintainability Symposium, 2002, pp. 235–241.
[3] M. Thongmak and P. Muenchaisri, Measuring Understandability of Aspect-Oriented Code. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 43–54.
[4] J.-C. Lin and K.-C. Wu, "A model for measuring software understandability," in 6th IEEE International Conference on Computer and Information Technology, 2006, pp. 192–192.
[5] T. Roehm, R. Tiarks, R. Koschke, and W. Maalej, "How do professional developers comprehend software?" in 34th International Conference on Software Engineering (ICSE), 2012, pp. 255–265.
[6] D. Srinivasulu, A. Sridhar, and D. P. Mohapatra, Evaluation of Software Understandability Using Rough Sets. New Delhi: Springer India, 2014, pp. 939–946.
[7] C. Chen, R. Alfayez, K. Srisopha, L. Shi, and B. Boehm, Evaluating Human-Assessed Software Maintainability Metrics. Singapore: Springer Singapore, 2016, pp. 120–132.
[8] M. A. Storey, "Theories, methods and tools in program comprehension: past, present and future," in 13th International Workshop on Program Comprehension, 2005, pp. 181–191.
[9] M. A. D. Storey, K. Wong, and H. A. Muller, "How do program understanding tools affect how programmers understand programs?" in 4th Working Conference on Reverse Engineering, 1997, pp. 12–21.
[10] R. P. L. Buse and W. Weimer, "Learning a metric for code readability," IEEE Transactions on Software Engineering, vol. 36, no. 4, pp. 546–558, 2010.
[11] D. Posnett, A. Hindle, and P. T. Devanbu, "A simpler model of software readability," in 8th Working Conference on Mining Software Repositories, 2011, pp. 73–82.
[12] J. Dorn, "A general software readability model," Master's thesis, University of Virginia, Department of Computer Science, https://www.cs.virginia.edu/~weimer/students/dorn-mcs-paper.pdf, 2012.
[13] E. Daka, J. Campos, G. Fraser, J. Dorn, and W. Weimer, "Modeling readability to improve unit tests," in 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2015, pp. 107–118.
[14] S. Scalabrino, M. Linares-Vásquez, D. Poshyvanyk, and R. Oliveto, "Improving code readability models with textual features," in 24th IEEE International Conference on Program Comprehension, 2016.
[15] ——, "A comprehensive model for code readability," Journal of Software: Evolution and Process, To Appear.
[16] J.-C. Lin and K.-C. Wu, "Evaluation of software understandability based on fuzzy matrix," in IEEE International Conference on Fuzzy Systems, 2008, pp. 887–892.
[17] M. Bartsch and R. Harrison, "An exploratory study of the effect of aspect-oriented programming on maintainability," Software Quality Journal, vol. 16, no. 1, pp. 23–44, 2008.
[18] A. Capiluppi, M. Morisio, and P. Lago, "Evolution of understandability in OSS projects," in 8th European Conference on Software Maintenance and Reengineering, 2004, pp. 58–66.
[19] S. Scalabrino, G. Bavota, C. Vendome, M. Linares-Vásquez, D. Poshyvanyk, and R. Oliveto, "Automatically assessing code understandability: how far are we?" in Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering. IEEE Press, 2017, pp. 417–427.
[20] D. Lawrie, C. Morrell, H. Feild, and D. Binkley, "Effective identifier names for comprehension and memory," Innovations in Systems and Software Engineering, vol. 3, no. 4, pp. 303–318, 2007.
[21] ——, "What's in a name? A study of identifiers," in 14th International Conference on Program Comprehension, 2006.
[22] B. Caprile and P. Tonella, "Restructuring program identifier names," in International Conference on Software Maintenance, 2000, pp. 97–107.
[23] D. Lawrie, H. Feild, and D. Binkley, "Syntactic identifier conciseness and consistency," in 6th International Working Conference on Source Code Analysis and Manipulation, 2006, pp. 139–148.
[24] E. Enslen, E. Hill, L. L. Pollock, and K. Vijay-Shanker, "Mining source code to automatically split identifiers for software analysis," in 6th Working Conference on Mining Software Repositories, 2009.
[25] V. Arnaoudova, M. Di Penta, and G. Antoniol, "Linguistic antipatterns: What they are and how developers perceive them," Empirical Software Engineering, vol. 21, no. 1, pp. 104–158, 2015.
[26] S. Misra and I. Akman, "Comparative study of cognitive complexity measures," in 23rd International Symposium on Computer and Information Sciences, Oct 2008, pp. 1–4.
[27] E. J. Weyuker, "Evaluating software complexity measures," IEEE Transactions on Software Engineering, vol. 14, no. 9, pp. 1357–1365, 1988.
[28] J. Bansiya and C. G. Davis, "A hierarchical model for object-oriented design quality assessment," IEEE Transactions on Software Engineering, vol. 28, no. 1, pp. 4–17, Jan 2002.
[29] N. Kasto and J. Whalley, "Measuring the difficulty of code comprehension tasks using software metrics," in 15th Australasian Computing Education Conference. Australian Computer Society, Inc., 2013, pp. 59–65.
[30] K. Shima, Y. Takemura, and K. Matsumoto, "An approach to experimental evaluation of software understandability," in International Symposium on Empirical Software Engineering, 2002, pp. 48–55.
[31] ISO/IEC, ISO/IEC 9126 Software engineering - Product quality - Part 1: Quality model.
[32] A. Trockman, K. Cates, M. Mozina, T. Nguyen, C. Kästner, and B. Vasilescu, "'Automatically assessing code understandability' reanalyzed: Combined metrics matter," 2018.
[33] S. Scalabrino, G. Bavota, C. Vendome, M. Linares-Vásquez, D. Poshyvanyk, and R. Oliveto, "Replication package," https://dibt.unimol.it/report/understandability-tse.
[34] T. J. McCabe, "A complexity measure," IEEE Transactions on Software Engineering, no. 4, pp. 308–320, 1976.
[35] R. F. Flesch, How to Write Plain English: A Book for Lawyers and Consumers. Harpercollins, 1979.
[36] D. Schreck, V. Dallmeier, and T. Zimmermann, "How documentation evolves over time," in Ninth International Workshop on Principles of Software Evolution. ACM, 2007, pp. 4–10.
[37] J. Kleinberg and É. Tardos, Algorithm Design. Pearson Education India.
[38] J. W. Tukey, "Exploratory data analysis," 1977.
[39] M. G. Kendall, "A new measure of rank correlation," Biometrika, vol. 30, no. 1-2, p. 81, 1938.
[40] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Lawrence Erlbaum Associates, 1988.
[41] S. Le Cessie and J. C. Van Houwelingen, "Ridge estimators in logistic regression," Applied Statistics, pp. 191–201, 1992.
[42] G. H. John and P. Langley, "Estimating continuous distributions in bayesian classifiers," in Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1995, pp. 338–345.
[43] J. Platt, "Fast training of support vector machines using sequential minimal optimization," in Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, MA, 1999, pp. 185–208.
[44] D. W. Aha, D. Kibler, and M. K. Albert, "Instance-based learning algorithms," Machine Learning, vol. 6, no. 1, pp. 37–66, 1991.
[45] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[46] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: synthetic minority over-sampling technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
[47] S. K. Shevade, S. S. Keerthi, C. Bhattacharyya, and K. R. K. Murthy, "Improvements to the SMO algorithm for SVM regression," IEEE Transactions on Neural Networks, vol. 11, no. 5, pp. 1188–1193, 2000.
[48] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software: an update," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.

Simone Scalabrino is a Ph.D. student at the University of Molise, Italy. He received his Master's Degree in Computer Science from the University of Salerno in 2015, defending a thesis on Search Based Software Testing, advised by Andrea De Lucia. He received the Bachelor's Degree from the University of Molise in 2013, defending a thesis about source code readability, advised by Rocco Oliveto and Denys Poshyvanyk. His research interests include software quality, testing and security. He received two ACM SIGSOFT Distinguished Paper awards at ICPC 2016 and ASE 2017. He served as Local Arrangement Co-Chair for SANER 2018.

Gabriele Bavota is an Assistant Professor at the Università della Svizzera italiana (USI), Switzerland. He received the PhD degree in computer science from the University of Salerno, Italy, in 2013. His research interests include software maintenance, empirical software engineering, and mining software repositories. He is the author of over 100 papers that appeared in international journals, conferences and workshops. He received four ACM SIGSOFT Distinguished Paper awards at ASE 2013, ESEC-FSE 2015, ICSE 2015, and ASE 2017, an IEEE TCSE Distinguished Paper Award at ICSME 2018, the best paper award at SCAM 2012, and three distinguished reviewer awards at WCRE 2012, SANER 2015, and MSR 2015. He is the recipient of the 2018 ACM SIGSOFT Early Career Researcher Award. He served as a Program Co-Chair for ICPC'16, SCAM'16, and SANER'17. He also serves and has served as an organizing and program committee member of international conferences in the field of software engineering, such as ICSE, FSE, ASE, ICSME, MSR, SANER, ICPC, SCAM, and others.

Denys Poshyvanyk is the Class of 1953 Term Distinguished Associate Professor of Computer Science at the College of William and Mary in Virginia. He received the MS and MA degrees in Computer Science from the National University of Kyiv-Mohyla Academy, Ukraine, and Wayne State University in 2003 and 2006, respectively. He received the PhD degree in Computer Science from Wayne State University in 2008. He served as a program co-chair for ICSME'16, ICPC'13, WCRE'12 and WCRE'11. He currently serves on the editorial board of IEEE Transactions on Software Engineering (TSE), Empirical Software Engineering Journal (EMSE, Springer) and Journal of Software: Evolution and Process (JSEP, Wiley). His research interests include software engineering, software maintenance and evolution, program comprehension, reverse engineering, software repository mining, source code analysis and metrics. His research papers received several Best Paper Awards at ICPC'06, ICPC'07, ICSM'10, SCAM'10, ICSM'13 and ACM SIGSOFT Distinguished Paper Awards at ASE'13, ICSE'15, ESEC/FSE'15, ICPC'16 and ASE'17. He also received the Most Influential Paper Awards at ICSME'16 and ICPC'17. He is a recipient of the NSF CAREER award (2013). He is a member of the IEEE and ACM. More information available at: http://www.cs.wm.edu/~denys/.

Christopher Vendome is an Assistant Professor at Miami University in Ohio. He received a B.S. in Computer Science from Emory University in 2012 and his M.S. in Computer Science from the College of William & Mary in 2014. He received his PhD degree in Computer Science from the College of William & Mary in 2018. His main research areas are software maintenance and evolution, mining software repositories, program comprehension, software provenance, and software licensing. He has received an ACM Distinguished Paper Award at ASE 2017. He is a member of ACM, IEEE, and the IEEE Computer Society. More information at: http://www.cs.wm.edu/~cvendome/.

Mario Linares-Vásquez is an Assistant Professor at Universidad de los Andes in Colombia. He received his Ph.D. degree in Computer Science from the College of William and Mary in 2016. He received his B.S. in Systems Engineering from Universidad Nacional de Colombia in 2005, and his M.S. in Systems Engineering and Computing from Universidad Nacional de Colombia in 2009. His research interests include software evolution and maintenance, software architecture, mining software repositories, application of data mining and machine learning techniques to support software engineering tasks, and mobile development.

Rocco Oliveto is Associate Professor in the Department of Bioscience and Territory at the University of Molise (Italy). He is the Chair of the Computer Science program and the Director of the Laboratory of Computer Science and Scientific Computation of the University of Molise. He received the PhD in Computer Science from the University of Salerno (Italy) in 2008. His research interests include traceability management, information retrieval, software maintenance and evolution, search-based software engineering, and empirical software engineering. He is the author of about 100 papers that appeared in international journals, conferences and workshops. He serves and has served as an organizing and program committee member of international conferences in the field of software engineering. He is a member of the IEEE Computer Society and ACM.