
Because your code is worth it

The cost of code: Bridging the gap between tech and business

CodeScene’s project management metrics let you measure where you spend your costs and how the development activity shifts over time. This information is useful to bridge the gap between the technical side of the organization and the business side, as you let non-technical managers peek into the codebase from a different point of view.

The Need for Cost Metrics

CodeScene’s project management metrics answer two common questions:

1. How shall we prioritize improvements to our codebase?

2. How can we follow up on the effects of the improvements we do? Did we really get an effect out of that design improvement?

Sure, a traditional hotspot analysis already addresses these questions and gives us a tool to prioritize. However, there’s a linguistic chasm between developers and managers here: to a manager, the concept of a “commit” doesn’t carry much meaning. A commit is a technical term that doesn’t translate to anything in the world of managers. At the same time, technical debt with high interest rates and low quality code are important subjects to address. So how can we communicate in the language of a manager while still tying our data back to something that carries meaning for the developers responsible for the code?

CodeScene bridges this chasm by introducing a suite of project management metrics. These metrics combine our existing behavioral code analyses with data from project management tools, where CodeScene extracts time-based costs (i.e. minutes of time to completion) or story point reports. We then analyze how those costs are distributed across the different parts of your codebase. This gives you Hotspots measured by cost rather than the more technical metrics. Let’s look at an example.

Calculate hotspots by costs on architectural level.

As you see in the preceding figure, most of the developer time is spent in the Web Backend sub-system. This means we want to ensure that the code is easy to understand and to evolve. If not, we want to prioritize improvements to that part. But before we dive into the technical analyses and look for refactoring targets, we’d like to inspect the work trends. Here’s what they look like for the code in the Web Backend sub-system:

Use the trends in type of work to see where your time is spent.

In this example, the work trend shows that there was a burst of critical features added in April. Unfortunately there seem to have been several bugs too, with nearly 40% of the development time spent on fixing defects. Now, if we do some focused refactorings we’d expect that to pay off in the future cost and work trends.

From here we can get much more detailed data by diving down to the file level and identifying the parts of the code where we spend most of our time. Here’s an example:

Get detailed cost metrics on a file level.

You use this information to ensure that the code evolves in the right direction. For example, you’d like to see a decrease in the amount of bug fixes and an increase in the amount of features. You can also use the cost trends to measure the effect of large-scale improvements.

As a developer, you’ll probably notice that there tends to be a strong correlation between the project management results and the results from the technical hotspot analyses. This is an expected finding. However, the main purpose of the project management metrics is to provide a basis for communication. Thus, you’d use this data to motivate investments in software quality, like explaining the need for a larger refactoring of a top hotspot.

Try it Yourself

The project management analyses are exclusive to our on-premise version of CodeScene.

CodeScene provides an open API that lets you integrate with any project management tool. CodeScene also comes with an out of the box JIRA integration that supports costs as both time and story points.
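To make the idea of Hotspots measured by cost more concrete, here is a minimal sketch of how issue costs could be distributed across architectural components. It is not CodeScene’s implementation: the component prefixes, the issue records, and the rule of splitting each issue’s cost evenly over the files it touched are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical architectural boundaries: path prefix -> component name.
COMPONENTS = {
    "web/backend/": "Web Backend",
    "web/frontend/": "Web Frontend",
    "db/": "Database",
}

def component_of(path):
    """Map a file path to its component, or 'Other' if no prefix matches."""
    for prefix, name in COMPONENTS.items():
        if path.startswith(prefix):
            return name
    return "Other"

def cost_by_component(issues):
    """Sum issue costs per component.

    Each issue is a dict with a 'cost' (minutes or story points) and the
    'files' changed by the commits that reference the issue.
    Assumption: the cost of an issue is split evenly across its files.
    """
    totals = defaultdict(float)
    for issue in issues:
        files = issue["files"]
        if not files:
            continue  # nothing to attribute the cost to
        share = issue["cost"] / len(files)
        for path in files:
            totals[component_of(path)] += share
    return dict(totals)

# Made-up issue data, only for demonstration.
issues = [
    {"id": "PROJ-1", "cost": 120, "files": ["web/backend/api.py", "web/backend/auth.py"]},
    {"id": "PROJ-2", "cost": 60, "files": ["web/frontend/app.js", "web/backend/api.py"]},
    {"id": "PROJ-3", "cost": 30, "files": ["db/schema.sql"]},
]
costs = cost_by_component(issues)
for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {cost:.0f}")
```

Splitting evenly is the simplest possible attribution rule; weighting by the number of changed lines per file would be an equally reasonable choice.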

Measure Conway’s Law with CodeScene

Mel Conway’s astute observation that an organization’s communication structure should be reflected in the software architecture has received plenty of attention over the past years. Part of that is due to the popularization of microservices, which promises natural team boundaries where each team might be responsible for their own service. As such, Conway’s Law is an important principle that drives both organizational and technical decisions. At the same time, the organizational and social side of code is largely left to subjective judgments. What if we could guide those decisions with objective data instead? Follow along and see how you can measure Conway’s Law.

The Social Side of Code

As soon as an organization grows beyond a handful of people, social aspects like coordination, communication, and motivation issues increase in importance. Unfortunately these, well, softer aspects of software development are invisible in our code; if you pick up a piece of code from your system there’s no way of telling if it’s been written by a single developer or if that code is a coordination bottleneck for five development teams. That is, we miss an important piece of information: the people side of code.

CodeScene’s behavioral code analysis helps you fill in the blanks. Behavioral code analysis emphasizes trends in the development of your codebase by mining version-control data. Since version-control data is also social data (we know exactly who wrote each piece of code), it’s possible to build up knowledge maps of a codebase, as shown in the next figure.

A knowledge map shows the main developer behind each module.

A knowledge map is useful to guide on- and off-boarding, but it doesn’t really help us on our quest to measure Conway’s Law. But it’s a starting point. Our next step is to add the organizational dimension by aggregating individuals into teams. We also want to raise the abstraction level of the analysis and measure on architecturally significant components and sub-systems, as opposed to individual files. CodeScene solves this by letting you specify your architectural boundaries and teams.

Assign individual developers to teams.

Using that configuration, CodeScene measures the knowledge distribution on an architectural level by aggregating the contributions to individual files into the configured logical boundaries. For example, if you do microservices, each service would be its own component in an analysis. This gives you a powerful tool to evaluate how well your architecture aligns with your organization, as shown in the next figure.

Visualize the sub-systems in which each team works.

The previous visualization is based on the actual code contributions of each team. Each team gets assigned a color (look at the color legend to the right in the figure), and the team that has written most of the code gets highlighted for that sub-system. As such, the information is always up to date and you can choose how far back in time you want to go when collecting the data.

The same analysis also lets you measure the coordination needs on an architectural level. This is useful to detect sub-systems that become coordination bottlenecks or lack a clear ownership, as shown in the next figure.

Find team coordination bottlenecks based on code contributions.

The preceding figure shows a system that’s fairly well aligned with Conway’s Law as most of the coordination needs are low. This means that there’s little overlap between the contributions of the different teams. However, there’s one exception: the Jenkins Plugin has attracted code from three separate teams over the analysis period. This might be fine (a behavioral code analysis doesn’t judge), but it’s a pattern that deviates from the rest of the codebase and, as such, might be worth looking into and understanding.

Behavioral code analysis helps you ask the right questions, and points your attention to the aspects of your system, both social and technical, that are most likely to need it. You use this information to find parts of the code that may have to be split and modularized to facilitate parallel development by separate teams, or to find opportunities to introduce a new team into your organization to take on a shared responsibility.

There’s More

The way developers collaborate is crucial to the success of any system, and this blog post has really just scratched the surface. If you want to dive deeper, you might want to check out my new book, Software Design X-Rays: Fix Technical Debt with Behavioral Code Analysis, which goes into much more detail with several real-world examples.

The analyses themselves are completely automated, so try them out in CodeScene Cloud, which is free for open source, or check out the on-premise version of CodeScene.
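The aggregation described in this article, from individual developers and files up to teams and components, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not CodeScene’s algorithm: the team mapping, the component prefixes, and the use of added lines as the contribution measure are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical configuration: developer -> team, path prefix -> component.
TEAMS = {"ann": "Team A", "bob": "Team A", "cat": "Team B", "dan": "Team C"}
COMPONENTS = {"services/billing/": "Billing", "plugins/jenkins/": "Jenkins Plugin"}

def component_of(path):
    """Map a file path to its configured component, or 'Other'."""
    for prefix, name in COMPONENTS.items():
        if path.startswith(prefix):
            return name
    return "Other"

def team_contributions(changes):
    """Aggregate added lines per team for each component.

    `changes` is an iterable of (author, path, lines_added) tuples, for
    example derived from version-control history.
    """
    result = defaultdict(Counter)
    for author, path, added in changes:
        team = TEAMS.get(author, "Unknown")
        result[component_of(path)][team] += added
    return result

def summarize(contributions):
    """Return {component: (main team, number of contributing teams)}."""
    return {
        comp: (teams.most_common(1)[0][0], len(teams))
        for comp, teams in contributions.items()
    }

# Made-up contribution data, only for demonstration.
changes = [
    ("ann", "services/billing/invoice.py", 200),
    ("bob", "services/billing/tax.py", 50),
    ("ann", "plugins/jenkins/hook.py", 40),
    ("cat", "plugins/jenkins/hook.py", 70),
    ("dan", "plugins/jenkins/report.py", 30),
]
summary = summarize(team_contributions(changes))
for component, (main_team, team_count) in summary.items():
    print(f"{component}: main team {main_team}, {team_count} contributing team(s)")
```

A component with a single contributing team aligns well with Conway’s Law, while a component touched by several teams, like the Jenkins Plugin in this made-up data, is a candidate for a closer look.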

Early warnings for future maintenance problems

A codebase under active development evolves at a rapid pace, and as soon as the organization scales beyond 10-12 people it’s virtually impossible for a single individual to maintain a holistic picture of the system. The roots of future maintenance problems are often introduced in change bursts, perhaps by shoehorning a new feature into an existing design, and from there they only grow worse over time. Wouldn’t it be great if you could get an early warning when that happens so that you can take appropriate counter measures and save your code from decay?

What’s an Early Warning?

The CodeScene tool offers the ability to detect potential maintenance problems and early warnings in your codebase. The earlier you react to those findings, the better, so let’s look at a few examples.

The following figure shows an example of three different warnings auto-detected by CodeScene in Google’s TensorFlow codebase. TensorFlow is a library for machine learning, and the warnings are highlighted using yellow tiles:

CodeScene detects early warnings in your codebase.

These early warnings point your attention to different aspects of the system:

• Detects Code before it becomes a Hotspot: This warning highlights code that isn’t a hotspot yet, but climbs rapidly on the hotspot ranking (check out Predict Maintenance Problems In Large Codebases for an introduction to hotspots). You use this information as a driver to refactor those Hotspots into more cohesive units, if appropriate.

• Identifies Steep Increases In Complexity: Since CodeScene knows how your code typically evolves, the tool can detect parts of the code that suddenly increase in complexity as shown in the next figure. This doesn’t mean it’s a problem; rather, the warning is CodeScene’s way of drawing your attention to an area of the code. Use this information to focus code reviews and additional testing.

Early warnings are delivered for code that accumulates complexity rapidly.

• Predict Delivery Risk: CodeScene calculates a risk profile for your codebase, which is based on how the system has evolved and what a typical change looks like. That is, the algorithm focuses more on how a commit looks than the changed code itself and combines its technical metrics with social data such as developer experience. This is information that you use to prioritize code reviews and to reason about delivery risks of individual features.

The detailed view of high risk commits.

Context Matters

The advantage of social code analyses, like these early warnings, is that they take your context into account. This is important because different organizations have different quality goals: in some codebases large, monolithic files are the norm, while others prefer a more modular design with small and cohesive units.

CodeScene solves that by making the warnings relative to the rest of your code, which means that false positives are kept at a minimum and the presented results are directly relevant in your context.
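To illustrate what a warning that is relative to the rest of your code could look like, here is a small sketch. It is an assumption-laden illustration rather than CodeScene’s algorithm: the growth numbers are made up, and the threshold (mean plus `k` standard deviations across the codebase) is just one simple way to express “unusual for this codebase”.

```python
from statistics import mean, pstdev

def relative_warnings(growth_by_file, k=2.0):
    """Flag files whose recent growth is unusual for this codebase.

    `growth_by_file` maps a path to its recent growth (for example, lines
    of code added during the last month). The threshold is relative: the
    mean plus k population standard deviations over all files.
    """
    values = list(growth_by_file.values())
    if len(values) < 2:
        return []  # not enough data to say what is typical
    threshold = mean(values) + k * pstdev(values)
    return sorted(path for path, g in growth_by_file.items() if g > threshold)

# Made-up growth data: most files grow slowly, one grows in a burst.
growth = {
    "a.py": 10, "b.py": 12, "c.py": 8, "d.py": 11,
    "e.py": 9, "g.py": 7, "h.py": 13, "f.py": 300,
}
flagged = relative_warnings(growth)
print(flagged)
```

Because the threshold is derived from the codebase itself, the same absolute growth could be flagged in one system and considered normal in another.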

Finally, there’s the option of integrating the early warning detection into your continuous integration pipeline as demonstrated here.

CodeScene is free for open source, so give it a try at codescene.io.

Meet the branch measures: a behavioral code analysis to predict delivery risks

Many organizations transition to short lived feature branches and employ practices like continuous integration/delivery. To work in practice, those feature branches have to be kept short lived. By applying behavioral code analysis, we’re able to visualize the branching activity, measure lead times, and even predict the delivery risk of individual branches. The resulting information may highlight bottlenecks in our process or development workflows, and also gives us early warnings so that we can prevent future problems. Follow along and see how it’s done.

Measure Lead Times

CodeScene is a behavioral code analysis tool that analyses patterns in version-control data. Recently, CodeScene introduced a new suite of analyses that measure branching activity as shown in the next figure.

An overview of the branch measures.

By default, CodeScene calculates statistics for all branches that have been worked on during the past two months. This gives us a chance to ensure that we stick to a short development duration. As you see, we also get a separate measure of the time it takes from the last commit until the branch is merged.

In the preceding figure, that average lead time to merge is 13 hours, which is definitely too long. This happens frequently because practices like code reviews tend to become bottlenecks. The consequence is that a developer finishes the work on a feature, moves on to something new since we learn fast that it will take time to get feedback, and then has to context switch back to address potential code review findings. That’s expensive.

Such process loss can be detected automatically, and you see an example of an early warning in the preceding figure. Here’s how it works: since CodeScene analyses your behavior as an organization, the algorithms know what your typical branching patterns look like. As soon as the tool detects branches that deviate from those patterns, it raises a warning. If that happens, there are several things you can do:

• Re-plan the scope: Sometimes it’s just too much work in a single feature. Identifying a smaller feature set that you can deliver faster is one way to shorten the lead times and minimize risk.

• Prioritize verification activities: Use the early warning to focus extra code reviews and tests on the highlighted branches.

Long lived branches are at odds with continuous integration. They also increase the risk for merge conflicts and unexpected feature interactions, so use the early warnings above to act on time and as feedback to your planning.

Predict Delivery Risks

The detailed analysis provides more information as shown in the next figure.

A detailed analysis of the work on each branch.

Try it Yourself

The branch analyses are available in both our on-premise version and in the cloud version available at codescene.io. Give it a try: codescene.io is free for open source projects.
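The two branch measures discussed in this article, the development duration and the lead time from the last commit until the merge, are simple to compute once you have the timestamps. The following sketch shows the arithmetic; the branch names and timestamps are made up, and how you obtain them from your version-control system is left out.

```python
from datetime import datetime, timedelta

def branch_measures(first_commit, last_commit, merged):
    """Return (development duration, lead time to merge) for one branch."""
    return last_commit - first_commit, merged - last_commit

def average(deltas):
    """Average a non-empty list of timedeltas."""
    return sum(deltas, timedelta()) / len(deltas)

# Hypothetical branch data: (first commit, last commit, merge time).
branches = {
    "feature/login": (
        datetime(2018, 3, 1, 9), datetime(2018, 3, 2, 17), datetime(2018, 3, 3, 8)
    ),
    "feature/export": (
        datetime(2018, 3, 5, 10), datetime(2018, 3, 5, 16), datetime(2018, 3, 6, 3)
    ),
}
lead_times = [branch_measures(*times)[1] for times in branches.values()]
avg_lead_time = average(lead_times)
print(f"Average lead time to merge: {avg_lead_time}")
```

Keeping the two measures separate matters: a long development duration points to scope, while a long lead time to merge points to a bottleneck in review or integration.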

CodeScene in your continuous integration pipeline

CodeScene lets you uncover and prioritize code that’s hard to maintain or parts of the code that become team productivity bottlenecks. As such the techniques are reactive. Wouldn’t it be great if we could catch such problems much earlier, ideally before they are even delivered to our main branch? In this blog post we explore a new feature of CodeScene that turns the analyses into a pro-active tool for early feedback. You’ll see how CodeScene offers the ability to detect maintenance problems and early warnings in your codebase by integrating the analysis results into your build pipeline and/or as robot comments in a code review tool like Gerrit.

Fight reviewer fatigue

The challenge with all preventive and corrective techniques is that they require time and discipline. Let’s take code reviews as an example. Code reviews done right are a proven defect removal technique. A code review is also an opportunity for knowledge sharing and learning. However, none of those benefits come for free.

Like all manual processes code reviews are hard to scale. As your organization grows, code reviewer fatigue becomes a real thing; there are just so many lines of code you can review each day. Beyond that point you’re likely to slip. The result is increased lead times, bugs that pass undetected to production, and, in extreme cases, the risk for burnout.

At Empear we’ve developed a system for automated risk classifications to prioritize the code we need to review. The risk classification is built into CodeScene, which exposes a REST API that lets you integrate the classification into your continuous integration pipeline. The following figure shows an example from a Jenkins build:

CodeScene detects high risk changes on your development branches.

The analysis is triggered by a pull request, a range of commits, or a single commit; you decide through the API. The resulting risk classification helps us developers focus our time and expertise on the areas where it’s likely to be needed the most, and it does so before we even know we might have a problem. Let’s look at a specific example.

What’s a High Risk Commit?

Any risk classification algorithm has to go beyond technology and include a social dimension too; reasoning about risk based on code alone is misleading. Let me give you an example.

Let’s say that I do a large, sweeping change to the Linux kernel. Now, pretend that Linus Torvalds would do exactly the same changes. Do our individual changes carry the same risk? Obviously not: Linus knows the code and has worked on it for almost three decades while I’ve never touched the kernel before. Clearly, Linus’s changeset should be considered less risky than mine.

CodeScene resolves this by a machine learning algorithm that calculates a unique risk profile for your codebase. The risk profile is based on how the system has evolved and what a typical change looks like. That is, CodeScene looks more at how a commit impacts the system than the changed code itself. The technical metrics relate to the amount of code that is changed, the complexity of the changed code, and the diffusion of the changes (e.g. how many different sub-systems does the commit touch).

The social dimension of the risk profile relates to the experience of the programmer doing the change. The more experienced the programmer, the lower the risk. This means that two commits with identical changes may be classified differently depending on who is behind them. Experience mediates risk. We’ve also designed the feature as a self learning algorithm that automatically adjusts as a developer gains more experience.

Before we move on I’d like to point out that CodeScene never exposes its experience scores. The main reason is that such metrics are way too easy to misinterpret as some kind of ill-advised performance evaluation. With that covered, let’s see what’s possible once we integrate analysis information into our daily workflow.

CodeScene as an extra team member

If you use CodeScene you’re familiar with its early warning system. For example, CodeScene detects complexity trends where a piece of code becomes increasingly more difficult to maintain over time:

Get automated warnings for steep complexity increases.

By integrating CodeScene in your build pipeline you get those early warnings immediately as you push a commit. This is information that you use to guide your code reviews.

In addition, CodeScene will be able to detect the absence of expected change patterns. CodeScene basically tells you that “hey, when your team members change this piece of code they also normally make a change to the file over here - did you forget to change that file?”. The following figure shows an example:

Automated warnings for absent but expected change patterns.

The absent change pattern warning is based on CodeScene’s temporal coupling analysis: if a cluster of files have changed together for a long time they are intimately related. In the example above, a commit modified the file LinkTagHelper.cs. CodeScene knows that in 90% of all modifications to that file, a related class named ScriptTagHelper.cs is changed too.

The warning fires when such a temporal change pattern is broken. Please note that this may be a good sign (perhaps we just refactored some code duplication), but it may also be a sign of omission and a potential bug. As a consequence, this warning is based on a self-correcting algorithm; if you keep ignoring the warning it will go away automatically as the temporal coupling decreases below the thresholds.

Used this way, CodeScene takes on the role of an extra team member. It’s available to all teams, all the time. CodeScene never becomes bored or fatigued. It even aims to be friendly.

The main advantage of CodeScene’s continuous integration support is that it lets you react to potential problems early. But there’s a potentially large saving at the other end of the spectrum too; instead of treating all pull requests as equals, CodeScene’s risk classification lets you prioritize your code reviews and focus your time where (and when) it’s likely to be needed the most. Code reviewer fatigue is a real thing, so let’s use our time wisely.

Explore CodeScene

The continuous integration support is available today in our on-premises version of CodeScene. We also plan to support it in codescene.io by integrating with GitHub to provide automated feedback on pull requests.
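The temporal coupling idea behind the absent change pattern warning can be sketched as follows. This is a simplified illustration, not CodeScene’s algorithm: the degree of coupling is computed as the share of commits touching one file that also touch another, the 0.8 threshold is an arbitrary assumption, and the commit history is made up to mirror the 90% example from the text.

```python
from collections import Counter
from itertools import permutations

def coupling(commits):
    """Return {(a, b): share of commits touching a that also touch b}."""
    touched = Counter()
    together = Counter()
    for files in commits:
        unique = set(files)
        touched.update(unique)
        together.update(permutations(unique, 2))
    return {pair: n / touched[pair[0]] for pair, n in together.items()}

def absent_changes(commit_files, degrees, threshold=0.8):
    """List files that usually change with the commit's files but are missing."""
    changed = set(commit_files)
    missing = {
        b
        for (a, b), degree in degrees.items()
        if a in changed and b not in changed and degree >= threshold
    }
    return sorted(missing)

# Made-up history: the two files change together in 9 out of 10 commits.
history = [["LinkTagHelper.cs", "ScriptTagHelper.cs"]] * 9 + [["LinkTagHelper.cs"]]
degrees = coupling(history)
warnings = absent_changes(["LinkTagHelper.cs"], degrees)
print(warnings)
```

Note how the self-correcting behavior falls out of the definition: every commit that changes one file without the other lowers the degree of coupling, until it eventually drops below the threshold and the warning disappears.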

The Code

Software (r)Evolution part 1: Predict maintenance problems in large codebases

Welcome to the first part in the Software (r)Evolution series! Software (r)Evolution is a series of articles that explore novel approaches to understanding and improving large-scale codebases. Along the way we’ll use modern data science to uncover both problematic code as well as the behavioral patterns of the developers that build your software. This combination lets you identify the parts of your system that benefit the most from improvements, detect organizational issues and ensure that the suggested improvements give you a real return on your investment. I’m thrilled, so let’s get started!

How Software Evolution Lets You Understand Large Codebases

Bad code tends to stick. Not only does it stay where it is; it stays there for years, often outliving its original programmers, in the organizational sense, and to the displeasure of the next generation of programmers responsible for its maintenance. Making changes to such code is a high risk activity. Given the scale of today’s codebases, we need more efficient tools to identify those parts of the system so that we can apply corrective actions, invest extra testing efforts, or focus code reviews. In this article we use data on our past behavior as software developers to guide such decisions.

The Challenges of Scale

Today’s software systems consist of hundreds of thousands, often millions of lines of code. The scale of such systems makes them virtually impossible to reason about; there are few people in the world who can keep a million lines of code in their head.

A typical system consists of multiple technologies and complex sub-systems.

Today’s systems are also built on several different technologies. As an example, consider a classic three tier architecture where you use JavaScript on the front-end, implement the services in Java or .Net, and use SQL to access the database. As if this technical variety isn’t complex enough, large-scale systems are also developed by multiple programmers organized into different teams. Each individual programmer is only exposed to a small part of the codebase. That means every programmer has their own view of how the system looks. No one has a holistic picture.

Our main challenge, if we want to understand and improve large codebases, is to balance these technical and organizational challenges. That is, any solution we come up with must:

• Gather the collective intelligence of all contributing programmers to identify the parts of the codebase in need of extra attention.

• Present a language-neutral approach to handle polyglot codebases.

• Prioritize the parts of the code that matter the most with respect to productivity and risk amongst millions of lines of code.

Unfortunately, as evident by sources like the CHAOS report (the majority of all projects fail to deliver on time or on budget), this is where organizations fail. I think there’s a simple explanation for this ceaseless failure of our industry: the reason it’s so hard to prioritize improvements is that most of the time we make our decisions based on what we see: the system as it looks today, its code. But I will claim that it’s incomplete information. In particular, there are two key aspects that we lack:

1. Time: From the code alone we cannot see how the system evolves and we cannot identify long term trends. There’s no way of separating code that’s stable in terms of development from code that we have to keep changing. As we’ll see, a time dimension is vital to our ability to prioritize improvements to the code.

2. Social information: From the code alone we cannot tell if some module is a productivity bottleneck where multiple developers need to coordinate their work. Since communication and coordination needs are driving forces of software costs, we’d need a social dimension in order to make informed decisions around our system. And, perhaps somewhat surprisingly, this social dimension goes beyond management and is just as important for our ability to make sound technical decisions.

Let’s see how we can use software evolution to help us out.

Use Version-Control Systems as Behavioral Data

Data analysis is mainstream now and the rise of machine learning has taught us developers how to find patterns in complex phenomena. I find it genuinely fascinating that we are yet to turn those techniques on ourselves. We need to close that gap, so that’s what we’ll spend the rest of this article series on. Let’s uncover what happens when we start to study patterns in our own behavior, as programmers, in order to better understand how our systems grow.

The best thing with this approach is that virtually all software organizations already have the data they need in their version-control systems; we’re just not used to thinking about it that way. A version-control system is basically a behavioral log of how each developer has interacted with the code. Not only does version-control data record when and where a code change was made; version-control data also records social information in terms of who made that change. So let’s put version-control data to use on a large system.

Towards an Evolutionary View of Software

As part of my day job at Empear, I’ve analyzed hundreds of different codebases. There are some patterns that I see recur over and over again, independent of programming languages and technology. Uncovering these patterns helps us understand large codebases.

Have a look at the three graphs in the illustration above. All graphs show the same thing. The X-axis shows each file in the system sorted on their change frequencies (the number of commits as recorded in the version-control data). The Y-axis shows the number of commits for each file.

The graphs above show the data from three radically different systems: systems from different domains, codebases of different size, developed by different organizations and of different age. Yet all graphs show the same power law distribution.

The distributions above show that most of our development activity is located in a relatively small part of the total codebase. The majority of all files are in the long tail, which means they represent code that’s rarely, if ever, touched.

This change distribution of code has several interesting implications. First of all, it gives us a tool to prioritize improvements and refactorings. Refactoring complex code is both a high-risk activity and expensive. Using our knowledge of how code evolves, we’re able to focus on the parts where we’re likely to get a return on that investment. That is, the parts where we spend most of our development efforts as shown in the following illustration.

Any improvements we make to the files in the red area (highlighted in the illustration) have a high likelihood of providing productivity gains since those files represent code we need to work with all the time. It’s also important to identify files with high change frequencies from a quality perspective. The reason is that a high change frequency is correlated to frequent maintenance problems. In fact, the importance of change to a module is so high that more elaborate metrics rarely provide any further predictive value when it comes to fault prediction.

However, despite these findings, our model still suffers a weakness. Why? Because all code isn’t equal. For example, it’s a huge difference to increase a simple version number in a single-line file compared to correcting a bug in a file with 5,000 lines of C++ with tricky, nested conditional logic. The first change is low risk and can for all practical purposes be ignored while the second type of change needs extra attention in terms of test and code inspections. Thus, we need to add a second dimension to our model in order to improve its predictive power: we need to add a complexity dimension. Let’s see how that’s done.
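The change frequency used throughout this article, the number of commits that touch each file, is straightforward to extract from a version-control log. The sketch below counts file occurrences in the output of `git log --format=format: --name-only`, which lists the changed paths of each commit separated by blank lines. The sample log is made up, and the sketch makes no attempt to follow renames, which a production analysis would need to handle.

```python
from collections import Counter

def change_frequencies(log_text):
    """Count how many commits touched each file.

    Expects the output of `git log --format=format: --name-only`: one
    changed path per line, with blank lines between commits. A path is
    listed once per commit, so counting lines counts commits.
    """
    return Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )

# Made-up log output covering three commits.
sample_log = """
src/core.py
src/util.py

src/core.py

README.md
src/core.py
"""
frequencies = change_frequencies(sample_log)
for path, commits in frequencies.most_common():
    print(f"{commits:3d}  {path}")
```

Sorting the result by commit count and plotting it is enough to see whether your own codebase shows the same long-tailed distribution.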

24 25 A language-neutral Identify high risk changes complexity metric with Hotspots

There have been several attempts at measuring software complexity. The most well-known approaches are McCabe Cyclomatic Complexity and the Halstead complexity measures. The major drawback of these metrics is that they are language specific. That is, we need one implementation for each of the programming languages that we use to build our system. This conflicts with our goal of providing language-neutral metrics to get a holistic overview of modern polyglot codebases.

Fortunately, there's a much simpler metric that performs well enough: the number of lines of code. Yes, sure, the number of lines of code is a rough metric, yet it has just as good predictive power as more elaborate metrics like cyclomatic complexity. The advantage of using lines of code lies in the simplicity of the metric: lines of code is both language neutral and intuitive to reason about. So let's use lines of code as a proxy for complexity and combine it with a measure of change frequency to identify Hotspots in our codebase.

A hotspot is complicated code that you have to work with often. Hotspots are calculated from two different data sources:

1. We use the lines of code as a simple proxy for complexity.

2. We calculate the change frequency of each file by mining its version-control history.

CodeScene for software analysis provides its Hotspot analysis as an interactive map that lets you explore your whole codebase. In the following visualizations, each file is represented as a circle as described above:

The Hotspots map lets you view each Hotspot in the context of your system. The visualization also lets you identify clusters of Hotspots that indicate problematic sub-systems. Of course there's more to a true Hotspot than a high change frequency. We'll explore those aspects in a minute, but let's first walk through some use cases.

Know how to use Hotspots

A Hotspots analysis has several use cases and serves multiple audiences:

• Developers use hotspots to identify maintenance problems. Complicated code that we have to work with often is no fun. The hotspots give you information on where those parts are. Use that information to prioritize re-designs.

• Technical leaders use hotspots for risk management. Making a change to a Hotspot or extending its functionality with new features is high risk. A Hotspot analysis lets you identify those areas up-front so that you can schedule additional time or allocate extra testing efforts.

• Hotspots point to code review candidates. Code reviews are powerful in terms of defect removal, but they're also an expensive and manual process, so we want to make sure each review represents time that is well invested (code review fatigue is a real thing). In this case, use the hotspots map to identify your code review candidates.

• Hotspots are input to exploratory tests. A Hotspot Map is an excellent way for a skilled tester to identify parts of the codebase that seem unstable with lots of development activity. Use that information to select your starting points and focus areas for exploratory tests.
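The Hotspot calculation described earlier in this section, lines of code combined with change frequency, can be sketched in a few lines of Python. This is a minimal illustration and not CodeScene's implementation: the function names, the sample data, and in particular the scoring rule (change frequency multiplied by lines of code) are assumptions made for the example. The commit lists are assumed to have been parsed beforehand from something like `git log --name-only`.

```python
from collections import Counter


def change_frequencies(commits):
    """Count how many commits touched each file.

    `commits` is an iterable of lists of file paths, one list per commit.
    """
    freq = Counter()
    for files in commits:
        freq.update(set(files))  # count each file at most once per commit
    return freq


def rank_hotspots(commits, lines_of_code):
    """Rank files by change frequency times size.

    The product is an assumed scoring rule used only for illustration.
    Files that never changed are left out of the ranking.
    """
    freq = change_frequencies(commits)
    scores = {
        path: freq[path] * loc
        for path, loc in lines_of_code.items()
        if freq[path] > 0
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: four commits and the current size of each file.
commits = [
    ["core/engine.py", "core/util.py"],
    ["core/engine.py"],
    ["docs/readme.md"],
    ["core/engine.py", "ui/view.py"],
]
lines_of_code = {
    "core/engine.py": 1200,
    "core/util.py": 300,
    "ui/view.py": 450,
    "docs/readme.md": 40,
}

ranking = rank_hotspots(commits, lines_of_code)
for path, score in ranking:
    print(f"{score:6d}  {path}")
```

In this sample, `core/engine.py` ranks first because it is both the largest file and the one changed most often. A small file that changes often, or a large file that never changes, ends up far lower.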

Dig deeper with complexity trends and machine learning

Once we've identified our top Hotspots we need to understand how they evolve: do we already know about the Hotspot in the sense that we've started to improve the code and make future changes less risky? Are they Hotspots because they get more and more complicated over time, or is it more a question of minor changes to a stable code structure?

To answer those questions, we need to look at the trend over time as illustrated in the following picture.

The picture above shows the complexity trend of a single hotspot, starting in mid 2014 and showing its evolution over the next year and a half. It paints a worrisome picture since the complexity has started to grow rapidly. Worse, the complexity grows non-linearly relative to the amount of new code, which indicates that the code in the hotspot gets harder and harder to understand. As a bonus, you also see that the accumulation of complexity isn't followed by any increase in descriptive comments. This looks more and more like a true maintenance problem.

If you paid close attention, you've probably noted that the complexity trend metric differentiates between lines of code and complexity. While lines of code serves well as a heuristic on a Hotspot map, the metric is too rough when it comes to certain long-term trends. For example, we want to differentiate between a file that just grows in pure size and the case where each line of code becomes harder to understand (often due to liberal use of conditional logic). Both cases have their own set of problems, but the second case implies higher risk. That means we need to be more language aware when calculating trends.

Multi-Dimensional Hotspots

Alright, I hinted earlier that true Hotspots are about more than just high change frequencies. For example, we'd like to consider the complexity trend as part of the Hotspot criteria. CodeScene does that. In addition, CodeScene employs a machine learning algorithm that looks at deeper change patterns in the analysis data, like coupling to other entities and development fragmentation that indicates coordination problems on an organizational level. The rationale is that complicated code that changes often is more of a problem if:

1. The hotspot has to be changed together with several other modules.

2. The hotspot affects many different developers on different teams.

Since CodeScene incorporates organizational information like team structures we're able to detect code that's truly expensive to maintain. Each time a change pattern in your codebase crosses an organizational boundary you'll pay a price in terms of coordination and communication overhead.

The following illustration shows how this algorithm manages to narrow down the number of Hotspots to a small part of the total code size when run on a number of open source projects:

As you see in the picture above, the prioritized Hotspots only make up 2-3% of the total code size. Yet there's a disproportionate amount of development activity in that small part, with 11-16% of all commits touching those Hotspots. This means any code improvement to a prioritized Hotspot is time well-invested.
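The two criteria listed under Multi-Dimensional Hotspots, co-change with other modules and spread across developers and teams, can be approximated from an annotated commit log. The sketch below is only an illustration of those two measurements. It is not the machine learning algorithm mentioned in the text, and the function name, the tuple layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def hotspot_context(commits, hotspot):
    """Summarise coupling and developer spread for one hotspot file.

    `commits` is an iterable of (author, team, files) tuples. This layout
    is assumed for the example; real data would come from version control
    combined with a team mapping.
    """
    coupled = defaultdict(int)
    authors, teams = set(), set()
    for author, team, files in commits:
        if hotspot not in files:
            continue  # only commits that touch the hotspot are relevant
        authors.add(author)
        teams.add(team)
        for other in set(files) - {hotspot}:
            coupled[other] += 1  # file changed together with the hotspot
    return {
        "coupled_files": dict(coupled),
        "authors": len(authors),
        "teams": len(teams),
    }


# Hypothetical sample data.
commits = [
    ("ann", "payments", ["billing.py", "invoice.py"]),
    ("bob", "payments", ["billing.py", "invoice.py", "tax.py"]),
    ("cat", "web", ["billing.py"]),
    ("dan", "web", ["view.py"]),
]

summary = hotspot_context(commits, "billing.py")
print(summary)
```

Here `billing.py` changes together with two other files and is touched by three developers on two teams, so changes to it cross an organizational boundary. How such counts should be weighted against each other is left open; the source does not describe that part.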

Hotspots identify defect dense parts

As we launched CodeScene we spent a lot of time validating and testing the analyses on real-world codebases. One thing I did was to investigate how well the identified Hotspots predict defects. This was done by identifying where in the code corrective actions were performed. We then rewound the version-control history, measured the Hotspots and looked for a correlation.

Our results show a strong correlation between the top Hotspots and the most defect-dense parts of the code. In general, the top Hotspots only make up a minor part of the code, yet that code is responsible for 25-70% of all reported and resolved defects.
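A rough version of this kind of check can be sketched as follows. The text does not say how corrective actions were identified, so the keyword match on commit messages below is an assumption, as are the function name and the sample data. The sketch also measures corrective commits rather than reported defects, so it approximates the figure above instead of reproducing it.

```python
import re

# Assumed heuristic: a commit is "corrective" if its message mentions a fix.
FIX_PATTERN = re.compile(r"\b(fix(es|ed)?|bug|defect)\b", re.IGNORECASE)


def defect_share(commits, hotspots):
    """Fraction of corrective commits that touch at least one hotspot.

    `commits` is an iterable of (message, files) tuples.
    Returns 0.0 when no corrective commits are found.
    """
    hotspots = set(hotspots)
    fixes = [files for message, files in commits if FIX_PATTERN.search(message)]
    if not fixes:
        return 0.0
    touching = sum(1 for files in fixes if hotspots & set(files))
    return touching / len(fixes)


# Hypothetical sample data.
commits = [
    ("Fix null pointer in engine", ["core/engine.py"]),
    ("Add export feature", ["ui/export.py"]),
    ("Bug: wrong rounding", ["core/engine.py", "core/util.py"]),
    ("Fixed typo in docs", ["docs/readme.md"]),
    ("Refactor view", ["ui/view.py"]),
]

share = defect_share(commits, ["core/engine.py"])
print(f"{share:.0%} of corrective commits touch a hotspot")
```

In the sample, three commits count as corrective and two of them touch the single hotspot. On a real codebase the hotspot list would be measured at an earlier point in the history than the fixes it is compared against, as the text describes.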

My book Your Code as a Crime Scene dives deeper into some of the research findings we used as a starting point when developing CodeScene Enterprise Edition, as well as why and how Hotspots work. But let's just summarize the conclusions in one line: There's a strong correlation between Hotspots, maintenance costs and software defects. Hotspots are an excellent starting point if you want to find your productivity bottlenecks in code, and with CodeScene Enterprise Edition that knowledge is more accessible than ever.

www.codescene.com