Bachelor’s thesis

Information and Communications Technology

2019

Aleksandr Osipov SELECTING C# PROFILING TOOLS

BACHELOR’S THESIS | ABSTRACT

TURKU UNIVERSITY OF APPLIED SCIENCES

Bachelor of Engineering, Information and Communications Technology Degree programme

2019 | 56

Aleksandr Osipov

SELECTING C# PROFILING TOOLS

The usage of profiling tools can certainly be beneficial for game development, since it helps to identify performance bottlenecks. However, the number of available profilers is high, which can cause frustration among developers. The objective of this thesis is to research the most recommended profilers and determine the best one based on the following criteria: affordability, learnability, UI intuitiveness, profiler performance, analysis depth, and customizability. The investigation was carried out on the Redgate ANTS memory and performance profilers, the CodeTrack profiler, the JetBrains dotMemory and dotTrace profilers, and the built-in Visual Studio profilers. As a testing target, the game Barotrauma was used, which was developed in cooperation with FakeFish ltd and Undertow Games. In order to ensure that all profilers were tested under similar conditions, the requirements were to play the game in single player for 10 to 30 minutes, encounter at least one monster attack, and experience at least one breach of the submarine. This thesis examines all profiler findings and assesses them according to the six aforementioned criteria, based on the author’s own user experience. The results show that the built-in Visual Studio profilers received the highest grades. JetBrains and CodeTrack are tied for second place, considering that JetBrains is more user-friendly and performant, while CodeTrack is fully free software. The ANTS profilers take last place; nevertheless, they are helpful in optimization matters. This thesis offers advice on selecting the right profiling tool for game development companies and makes a small practical contribution to the optimization of future games.

KEYWORDS:

Profiler, performance, C#, .NET framework, software development

CONTENTS

LIST OF ABBREVIATIONS 5

1 INTRODUCTION 6

2 CURRENT PROFILING APPROACHES AND TOOLS 8
2.1 Testing approaches 8
2.2 Testing levels and types 9

3 PROFILERS AND METHODS 12
3.1 The most common profilers include: 12
3.2 Methods 13
3.2.1 ANTS 13
3.2.2 CodeTrack 24
3.2.3 JetBrains 31
3.2.4 Visual Studio profiler 41

4 RESULTS AND DISCUSSION 50

5 CONCLUSION 55

REFERENCES 56

FIGURES

Figure 1. Memory profiler main menu. 14
Figure 2. Memory profiler configuration. 15
Figure 3. Memory profiler results. 16
Figure 4. Memory profiler class list. 17
Figure 5. Memory profiler instance categorizer. 18
Figure 6. Performance profiler main menu. 19
Figure 7. Performance profiler configuration. 19
Figure 8. Performance profiler results. 20
Figure 9. Performance profiler results options. 21
Figure 10. Performance profiler low profiling overhead mode. 22
Figure 11. Performance profiler lowest profiling overhead mode. 23
Figure 12. CodeTrack main menu. 24
Figure 13. CodeTrack configuration. 25
Figure 14. CodeTrack sampling profiling option. 26
Figure 15. CodeTrack tracing profiling option. 27
Figure 16. CodeTrack deep trace profiling option. 27
Figure 17. CodeTrack results. 28
Figure 18. CodeTrack results in flame view. 29
Figure 19. CodeTrack results in list view. 29
Figure 20. CodeTrack results in timeline view. 30
Figure 21. dotMemory main menu. 31
Figure 22. dotMemory results. 32
Figure 23. dotMemory snapshot comparison. 33
Figure 24. dotMemory memory traffic. 33
Figure 25. dotMemory single snapshot. 34
Figure 26. dotTrace main menu. 35
Figure 27. dotTrace control panel. 36
Figure 28. dotTrace sampling snapshots. 37
Figure 29. dotTrace results. 37
Figure 30. dotTrace tracing snapshots. 38
Figure 31. dotTrace tracing results. 39
Figure 32. dotTrace tracing results in hot spots view. 39
Figure 33. dotTrace timeline results. 40
Figure 34. Visual Studio Diagnostic tools, CPU. 42
Figure 35. Visual Studio Diagnostic tools, Memory. 43
Figure 36. Visual Studio Diagnostic tools memory results. 44
Figure 37. Visual Studio Diagnostic tools CPU profiling results. 44
Figure 38. Visual Studio profiler target options. 45
Figure 39. Visual Studio profiler options. 46
Figure 40. Visual Studio profiler memory snapshots. 47
Figure 41. Visual Studio profiler memory snapshot data. 47
Figure 42. Visual Studio profiler CPU snapshot data. 48
Figure 43. Visual Studio profiler CPU call chain. 49

TABLES

Table 1. Profiler comparison. 54

LIST OF ABBREVIATIONS

CLR Common language runtime

CPU Central processing unit

GC Garbage collection

GPU Graphics processing unit

IDE Integrated development environment

IIS Internet information services

JIT Just-in-time

RAM Random access memory

UI User interface

1 INTRODUCTION

Scheduled profiling in the software development process increases the chance of enhancing slow sections of code (Bertolino & Faedo, 2007, p. 4). It is therefore important to choose a reliable and convenient profiler for the profiling routine.

The usage of a performance profiler guides developers through resource-demanding sections of code, highlighting issues and potential improvements. To achieve an optimized product, performance analysis is advised (Ammann and Offutt 2008, pp. 3-4). Developers can tackle the problem by simply reviewing the code, which is a very time-consuming and inefficient method, or they can use the help of available profilers (Dolan & Moré, 2002, p. 1).

“Performance profilers are software development tools designed to help you analyze the performance of your applications and improve poorly performing sections of code.” (SmartBear 2019)

Performance analysis is a part of software testing, which is covered by the ANSI/IEEE 1059 software testing standard. The standard reads:

“…a technique to validate application. The definition of testing is that testing is the process of analysing a software item to detect the differences between existing and required conditions (that is defects/errors/bugs) and to evaluate the features of the software item. The purpose of testing is verification, validation and error detection in order to find problems – and the purpose of finding those problems is to get them fixed.”

There are two main types of software testing: functional and non-functional (Hailpern and Santhanam 2010, pp. 9-10). Each consists of subtypes: functional testing assesses the system’s compliance with its specified requirements, while non-functional testing evaluates requirements that are non-functional in nature but still important, such as performance, security, and the user interface.

The method used for most tests involves test groups working in an isolated environment with limited knowledge about the software, and test specialists working from given requirements. For performance testing, dedicated software is required to measure the performance impact.

Nowadays there are so many profiler options that the choice creates confusion among developers. The problem discussed in this thesis is how to choose the most relevant and trustworthy profiler.

TURKU UNIVERSITY OF APPLIED SCIENCES THESIS | Aleksandr Osipov 7

Chapter 2 explains the current methods by which software testing is performed. Each approach, level, and type is described in detail without diverting from the main topic, software performance testing. Chapter 3 describes the profilers most recommended across different online resources. It also introduces existing problems and solutions, alongside the methods used to solve the problems and find the optimal profiler.


2 CURRENT PROFILING APPROACHES AND TOOLS

Profiling is a part of software testing. Many approaches exist in software testing, for example the static and dynamic approaches, the exploratory approach, and the white, grey, and black box approaches. Across the different approaches, there are at least three distinct levels of testing: unit testing, integration testing, and system testing, as well as many software testing types or techniques, such as compatibility, smoke and sanity, and functional and non-functional testing.

2.1 Testing approaches

Static testing is a manual or automated review of code to find errors at an early stage of development; it is carried out without executing the code. This contrasts with dynamic testing, where code is executed to determine resource usage and overall performance and to confirm that the software meets business requirements (Pan 1999).

Exploratory testing is an important approach for agile projects (Jorgensen and Jorgensen 2006), because it aims to keep testers on the same page with the developers of a fast-evolving project. It emphasizes tester engagement, the lack of a fixed plan, and the freedom to choose the testing path, relying only on one’s own sense of where the problems might lie.

White, grey, and black box approaches regulate how open the software source code is to testers. White box testing, also called glass box or transparent box testing, is completely open: testers have access to the source code and design documents, which helps them create test cases using the exposed internal structure. Black box testing implies a closed code base, with only input and output data available to testers. The grey box approach is a combination of the two: testers have limited knowledge of the system under test, such as working models and architecture diagrams, but generally have detailed design documents available. The white box approach is commonly used at the unit level, although it can be applied at the integration and system levels. Black box testing is applicable to any level of testing. Grey box testing suits high-level testing, but is recommended to be carried out together with deeper white box testing (Limaye 2009, pp. 107-108).


2.2 Testing levels and types

Levels of software testing range from the check of a single function or a logically complete small section of code up to testing the whole application with all internal functions, modules, and underlying architecture. Depending on the stage of the development process, a suitable test is performed. The first testing level is unit testing, which should be carried out early enough to check the separate modules of the system. After unit testing comes integration-level testing, to verify the ability of modules to interact and work with each other. The final testing level is performed on the system of all integrated parts before introducing it to the market (Labiche, Thevenod-Fosse, Waeselynck and Durand 2000, p. 136).

Unit testing is the testing of a single module or a group of interconnected modules. It is usually carried out with the white box testing approach, to verify the correct operation of a module at an early stage of development (Kumar and Syed, 2010, p. 54).

Integration level testing is a check of a group of units combined to produce an output. If hardware and software have any relation, this is tested at this stage using both white box testing and black box testing (Naik and Tripathy 2008, pp. 16-17).

System level testing is performed to examine a system as a whole, i.e., all integrated components and modules. It falls under the black box testing approach. It checks the design and expected behavior of the software (Craig & Jaskiel 2002).

The main testing type discussed in this thesis is performance testing. Performance testing is a non-functional technique aimed at determining software behavior, such as stability and responsiveness, under different workloads. The most important types are load testing, stress testing, soak testing, and spike testing (Denaro, Polini and Emmerich 2004).

Load testing reproduces the expected workload of concurrent users, transactions, or operations over a certain period of time, by either real or virtual users, to find potential bottlenecks and check the response time of the application (Vokolos and Weyuker 1998, p. 81).

Stress testing is a type of testing in which the program performs a higher number of operations and is given traffic loads above those expected, in order for developers to understand the scalability of the system (Pan 1999). This technique pressures limited hardware


resources and helps to identify the breaking point of the application. System strain can cause memory problems, such as memory leaks and stack overflows; a decrease in the speed of data flow due to the physical constraints of storage devices; and related errors such as data corruption and data exposure, which is a security issue. Stress testing includes soak testing and spike testing (Khan 2010, pp. 12-15).

Soak testing, also referred to as endurance testing, simulates a gradual escalation of end users, calls, or operations, with the objective of observing the possible impact of dense and continuous activity over a long period of time. Such testing may reveal performance decline and excessive demand on system resources.

Spike testing checks software stability during a dramatic rise and fall of system users or system activity over a relatively short period of time, compared to soak testing. Due to the sudden increase or decrease in user activity, the system may not respond as well as intended and, in the worst case, the application can crash.

After completing the different types of performance testing, if a problem is found, developers proceed to deeper analysis using specialized tools. Profiling is an optional second phase: the process of identifying the sections of code that need optimization. For this purpose, a profiler program is used.

Profilers differ in type based on their final output and their data-gathering method. There are two distinct types based on output: the flat profiler and the call-graph profiler. A flat profiler collects the average call times of the program. A call-graph profiler also collects call times but, among other things, can display the call chain, where the developer can examine caller and callee information (GNU gprof 1998).

Furthermore, profiling tools use various methods to gather data. Event-based profilers use the trapping method to collect a well-defined set of events, such as the allocation of memory for a new object, the start and end of a function, or a thrown exception. Event-based profilers slow down the machine but gather more precise data about the application’s execution. Statistical profilers let the program run at nearly full speed, which can reveal otherwise hidden issues that by themselves have little performance impact. Statistical profilers, also called sampling profilers, use the sampling method to record statistics. Depending on the case, this approach is numerically less accurate, but statistically it provides a better picture of the software under test (IBM 2019).
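To make the event-based idea concrete, the trapping mechanism can be sketched in C#. Commercial .NET profilers hook the CLR profiling API instead; this toy version simply wraps a delegate so that every call becomes a recorded event. All names here (ToyEventProfiler, Wrap) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Toy event-based profiler: every call to a wrapped delegate is an
// "event" that is trapped and recorded (call count plus elapsed time).
class ToyEventProfiler
{
    private readonly Dictionary<string, (int Calls, long Ms)> stats =
        new Dictionary<string, (int Calls, long Ms)>();

    // Returns a wrapper that records each invocation of `body`.
    public Action Wrap(string name, Action body) => () =>
    {
        var sw = Stopwatch.StartNew();
        body();
        sw.Stop();
        stats.TryGetValue(name, out var s);
        stats[name] = (s.Calls + 1, s.Ms + sw.ElapsedMilliseconds);
    };

    public (int Calls, long Ms) StatsFor(string name) => stats[name];

    static void Main()
    {
        var profiler = new ToyEventProfiler();
        // Hypothetical workload standing in for a game subsystem update.
        Action update = profiler.Wrap("Update",
            () => System.Threading.Thread.Sleep(5));
        for (int i = 0; i < 3; i++) update();
        var s = profiler.StatsFor("Update");
        Console.WriteLine($"Update: {s.Calls} calls, {s.Ms} ms total");
    }
}
```

Because every call pays the recording cost, the wrapped code runs slower, which is the same reason full event-based profilers slow down the machine under test.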

There are a number of techniques to enable a profiler in a program. Among them are:


• Manual addition of instructions into the source code
• Automatic addition by a tool, according to provided instructions
• Intermediate-level addition, which happens in assembly or decompiled bytecode
• Additions assisted by the compiler itself
• Addition to a compiled executable
• Addition at runtime, called runtime instrumentation, where all operations during software execution are controlled by the tool, and
• Runtime injection, a lighter version of runtime instrumentation, where code is modified at runtime to jump to helper functions.
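The first technique, manual addition of instructions to the source code, is the simplest to illustrate. A minimal C# sketch using the standard System.Diagnostics.Stopwatch follows; LoadLevel is a hypothetical stand-in for a suspect section of game code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ManualInstrumentationDemo
{
    // Hypothetical routine standing in for a slow section of game code.
    internal static void LoadLevel() => Thread.Sleep(30);

    static void Main()
    {
        // Manual instrumentation: the timing instructions are written
        // directly into the source around the suspect section.
        var sw = Stopwatch.StartNew();
        LoadLevel();
        sw.Stop();
        Console.WriteLine($"LoadLevel: {sw.ElapsedMilliseconds} ms");
    }
}
```

Manual instrumentation gives precise control over what is measured, but it requires editing and rebuilding the program, which is exactly what the tool-assisted techniques in the list above avoid.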

The purpose of this thesis is to examine the profilers most recommended on the internet and to choose the most suitable one for the current case: a game called “Barotrauma”, developed in cooperation with FakeFish oy and Undertow Games, which is explained in the next chapter. The developers are certain that there is an existing performance inconsistency during gameplay.


3 PROFILERS AND METHODS

The testing environment for this thesis project is a .NET game with C# scripting. C# is an object-oriented programming language, developed by Microsoft, used with XML-based Web services on the .NET platform. A number of searches were conducted in the Google search engine using combinations of the phrases “best profiler”, “.NET profiler”, “profiler for .NET developers”, and “.NET development”. While going through the surveys and forums, results concerning profilers for languages not supported by the .NET framework were ignored. Among the user suggestions, profilers that were mentioned only once were excluded, on the assumption that they are not a frequent choice for most developers. The majority of answers were found on the following websites: Stack Overflow, Quora, DZone, Stackify, and Reddit. Four profilers appeared most commonly among users’ recommendations, along with a few discontinued profilers.

3.1 The most common profilers include:

1) ANTS profilers – .NET profilers from Redgate. They offer a free 14-day trial and have different licenses according to the type of use. Education, non-profit, and student licenses are available.

2) CodeTrack – a free, open-source profiler for profiling and debugging .NET applications. It started as a one-man project and grew into a complete software tool for performance profiling.

3) dotTrace and dotMemory – .NET profilers from JetBrains. They offer the first 10 days for free, before the user is asked to get a license. These products come as part of JetBrains’ set of .NET tools. Different licenses are offered, including free licenses for students and teachers.

4) Visual Studio Profiler – a built-in profiler that comes with the Visual Studio Integrated Development Environment (IDE).

5) JustTrace – a profiler from Telerik that was also recommended, but it has been discontinued.


6) Slimtune – a free profiler whose source code is available under the MIT License; it is no longer actively supported by its developers.

3.2 Methods

Barotrauma is not a deterministic game; internal variables are randomized, and actions are only statistically predictable. It will not give exactly the same output in each testing session, nor will it create the same conditions and scenarios in the game. To minimize these differences, the requirements were 10 to 30 minutes of gameplay in single player mode, using a standard submarine, where the player must encounter at least one monster attack and at least one hull breach. As a baseline for estimating the effect of each profiler on performance, the normal frame rate of the game without a profiler attached is 60 frames per second (fps). This is an acceptable frame rate for most games, and it is important that profilers do not diminish it significantly, because that would influence the playability of the game. A low frame rate results in the game behaving jerkily and slowly.
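The frame-rate figures reported in the following sections follow directly from per-frame durations: fps = 1000 / average frame time in milliseconds. A minimal sketch, with made-up frame times:

```csharp
using System;
using System.Linq;

class FrameRateCheck
{
    // fps = 1000 / average frame time in milliseconds.
    internal static double AverageFps(double[] frameTimesMs) =>
        1000.0 / frameTimesMs.Average();

    static void Main()
    {
        // At the 60 fps baseline, every frame takes roughly 16.7 ms.
        Console.WriteLine(AverageFps(new[] { 16.7, 16.7, 16.7 }));
        // A heavy profiler stretching frames to 100 ms drops the game to 10 fps.
        Console.WriteLine(AverageFps(new[] { 100.0, 100.0, 100.0 }));
    }
}
```

This relationship is why profiler overhead shows up so directly in the fps readings: any extra milliseconds a profiler adds to each frame reduce the achievable frame rate.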

The next sections describe each of the profilers and how they were tested, and provide an in-depth discussion of the findings. Based on discussion fora, online published surveys, and the author’s own experience, profilers had to adhere to certain criteria before being considered. These criteria are:

• Affordability
• Learnability
• UI intuitiveness
• Profiler performance
• Analysis depth
• Customizability

3.2.1 ANTS

The ANTS profiling bundle comes from the British company Redgate, headquartered in Cambridge, United Kingdom. ANTS consists of a separate memory profiler and a performance profiler.


Memory profiler

The main menu (Figure 1) contains buttons to start a new profiling session or open results from previous sessions, as well as a shortcut to profile a recently used application. It also provides useful links to the profiler’s documentation.

Figure 1. Memory profiler main menu.

When starting a new session, there are 11 possible methods of profiling according to the type of the testing application. Barotrauma is a .NET application, so the first option “.NET executable” was used (Figure 2). Customizations of the profiling session were also possible under “Additional profiler options”.


Figure 2. Memory profiler configuration.

The memory profiler did not affect gameplay harshly; the game stayed in a range of 55-60 fps. The profiler tracked the overall usage of random-access memory (RAM), although the user must take memory snapshots while actively testing the software in order to obtain an in-depth analysis of RAM at a specific point in time (Figure 3).


Figure 3. Memory profiler results.

Snapshots were analyzed in a matter of seconds, and the profiler output color-coded diagrams with information grouped by the types of data stored in RAM, types of objects, etc.

Statistics could be represented in a table with corresponding classes, allocated types of data, sizes and instances (Figure 4). This option was available under the “Class list” button.


Figure 4. Memory profiler class list.

The button “Instance categorizer” brought up the chain of calls that allocated the chosen object (Figure 5).


Figure 5. Memory profiler instance categorizer.

A tab at the bottom allowed snapshot comparison with different filters.

Performance profiler

The main menu of the profiler was similar to the memory profiler’s, with three main sections: starting a new session, restarting a session for a previous application, and links to documentation (Figure 6).


Figure 6. Performance profiler main menu.

The profiling settings included 11 types of applications available for testing; the user could also choose a profiling mode based on the needed depth of profiling (Figure 7).

Figure 7. Performance profiler configuration.


Profiling modes are different techniques that affect performance and the amount of gathered data. The decision should be based on how deep an analysis is required. In the early stage of seeking a performance problem, the lightest mode should be used to identify the issue. For finding more information about the performance bottleneck, the heavier options are recommended, but with a focus on the problematic part of the software. The option with the highest profiling overhead provides the most information while causing the greatest impact on performance.

To compare modes, the following options were chosen: the highest profiling overhead mode with the most detail, the low profiling overhead mode with less detail, and the lowest profiling overhead mode with the least detail.

The heaviest profiling mode affected gameplay: the game produced 8-12 fps, running about six times slower than normal (Figure 8).

Figure 8. Performance profiler results.


The results of profiling were given in the form of a graph, with the possibility to zoom into certain areas of the session; a table with the chain of calls, the completion time of each method’s main body as well as of the body with its children, and hit counts; and line-level results. The graph showed the overall load of the central processing unit (CPU) at any point in the session. The table characterized the methods called during the session, with the option to expand the call tree and obtain detailed results for the execution of the main method body and its children. The line-level profiling output displayed timings for individual lines of source code.

Beyond the listed ways of presenting data (Figure 9), the profiler offered more approaches to outputting the results. Some of them were not applicable in the current case but could certainly be useful, such as calls, the current SQL execution plan, file I/O, etc.

Figure 9. Performance profiler results options.


The low profiling overhead mode turned out to be very similar to the high-overhead mode. In this mode, the game ran at the same 8-12 fps. The findings were also similar to the first mode: a graph with the CPU load, a table with methods, and line-level timings (Figure 10).

Figure 10. Performance profiler low profiling overhead mode.

The lowest profiling overhead mode did not influence gameplay as much as the two previous modes and allowed the user to play the game at nearly 60 fps (Figure 11).


Figure 11. Performance profiler lowest profiling overhead mode.

This mode provided significantly less detail, although it still showed line-level timing for the available data.


3.2.2 CodeTrack

CodeTrack is a one-man project: a .NET profiler developed as a hobby by Nico Van Goethem. It is completely free for personal and commercial use (Figure 12).

Figure 12. CodeTrack main menu.

The main menu was simple, yet sufficient. It included tabs on the left to navigate through the collection process, the analysis menu, the settings, and information about the software. The biggest part of the menu was occupied by recent processes and buttons to start a new process by invoking an executable, attach the profiler to an already running process, trace a Windows service, trace a website running on Internet Information Services (IIS), Microsoft’s extensible web server, or on IIS Express, a lighter version of the same product, and trace .NET Core software.

Starting a new process from an executable brought up a new window to set up the profiler before the session (Figure 13).


Figure 13. CodeTrack configuration.

The “Process” field should specify a link to the executable under test. The optional “Arguments” text field can be used if the application needs any startup arguments, and the working directory for the session and optional environment variables can also be set. An important part, which should be filled in automatically but remains editable, is the choice of the .NET version and the bitness of the application.

As for profiling modes, there were three types (Figure 14).


Figure 14. CodeTrack sampling profiling option.

Sampling mode had the lowest overhead: it took snapshots of the stack of every running thread at equal intervals. It did not track any object information or parameters, but it represented the fastest way to profile the application. The advanced options were to enable inlining tracking, so that functions optimized by the compiler through inlining are not skipped, and to trace the garbage collection (GC) to track what was destroyed during execution.
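CodeTrack itself suspends threads and walks their call stacks; as a rough conceptual sketch of interval sampling (all names here are hypothetical), a sampler can poll a worker thread's current phase at equal intervals, so that phases consuming more wall-clock time accumulate more samples:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class ToySampler
{
    // The worker publishes the name of the phase it is currently in.
    static volatile string currentPhase = "Idle";

    static void Worker()
    {
        currentPhase = "Physics"; Thread.Sleep(80);  // heavy phase
        currentPhase = "Render";  Thread.Sleep(20);  // light phase
        currentPhase = "Done";
    }

    // Sample the worker's phase every 5 ms until it finishes.
    internal static Dictionary<string, int> Sample()
    {
        var samples = new Dictionary<string, int>();
        var worker = new Thread(Worker);
        worker.Start();
        while (worker.IsAlive)
        {
            string phase = currentPhase;
            samples.TryGetValue(phase, out int n);
            samples[phase] = n + 1;
            Thread.Sleep(5);
        }
        return samples;
    }

    static void Main()
    {
        // Phases where more samples land consumed more wall-clock time.
        foreach (var kv in Sample())
            Console.WriteLine($"{kv.Key}: {kv.Value} samples");
    }
}
```

The sample counts are only statistical estimates of where time was spent, which is exactly the accuracy trade-off described for sampling mode above.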

In tracing mode, all method calls are traced (Figure 15). There was an option to set filters to exclude certain method calls, which can give more targeted results and faster execution. An advanced section was also present, with a number of configurations such as precise method timing, tracing inlined functions, timing native memory, resolving generic signatures for traced methods, and tracing garbage collection.


Figure 15. CodeTrack tracing profiling option.

Furthermore, the heaviest approach to collecting information was “Deep Trace” (Figure 16). It had all the functionality of simple tracing, but with the opportunity to create sets for tracking: plugin sets, diagnostics, etc.

Figure 16. CodeTrack deep trace profiling option.


The CodeTrack results window (Figure 17) shows the result of profiling with tracing enabled. This method used a significant amount of resources, and the game ran under 10 fps. The results were shown in tree structures, with the possibility to see method calls per thread; when the program had threads doing similar work, they were combined into a single tree.

Figure 17. CodeTrack results.

The same information was displayed graphically in the “Flame” tab (Figure 18). The length of a line is determined by the method’s timing; light colors indicate low call counts and dark red high call counts.


Figure 18. CodeTrack results in flame view.

The “List” tab formatted the data in a similar way to the “Tree” tab, with child methods and timings, but grouped by the methods’ assembly, namespace, and class (Figure 19).

Figure 19. CodeTrack results in list view.


The timeline showed all methods chronologically from left to right (Figure 20). From top to bottom it displayed call stacks, which gave information about child methods. Thread IDs were located on the left side, and it was possible to see the call stacks of all threads by scrolling down.

Figure 20. CodeTrack results in timeline view.

Only the tracing option was described in detail for this profiler; it was considered a middle ground between sampling and deep tracing, but in fact was closer to deep tracing. Deep tracing showed more detailed output with more impact on performance, whereas the sampling approach did not seem to affect gameplay.


3.2.3 JetBrains

JetBrains is a software development company with headquarters in Prague, Czechia. It has developed dotTrace for profiling CPU usage and dotMemory for profiling memory usage.

dotMemory

dotMemory is a standalone memory profiler that is accessible, like any application, through its executable; however, it was also possible to integrate it into the Visual Studio IDE as part of the ReSharper extension, so it is available through Visual Studio as well.

The first window, with the main menu (Figure 21), had a section on the left to choose the type of a new session, extra options to work with workspaces, and additional information about the profiler and how to work with it. Choosing “Local” under “New session” opened the local applications that the profiler could track. The listed application types covered standard kinds of software built using the .NET framework, such as .NET Core applications, standalone Windows applications, ASP.NET framework-based applications, Internet Information Services, etc.

Figure 21. dotMemory main menu.


The execution frame rate with profiling enabled fluctuated between 15 and 50 fps.

The results (Figure 22) were displayed in the form of a diagram of overall memory usage, with snapshots taken during the session. Snapshots had to be taken manually, or automatically when a specified condition was met. The profiler also offered a brief comparison between snapshots, which is useful for finding differences in memory allocations during execution.

Figure 22. dotMemory results.

If the user needed a closer comparison between snapshots, a “Compare” button at the bottom of the window listed all allocated objects in the selected snapshots.

It opened a new window (Figure 23) with data structures and parameters such as the type name, survived objects present in both snapshots, new objects created between the snapshots, and dead objects present in the first snapshot but missing in the second. It detailed memory based on these allocation categories and gave a picture of what happened between the taken snapshots.
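The survived/new/dead classification can be sketched as a simple set comparison between two snapshots. This is only a conceptual illustration (objects are stood in for by integer ids), not dotMemory's actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch of a snapshot diff like dotMemory's comparison view:
// "survived" objects appear in both snapshots, "new" only in the second,
// and "dead" only in the first.
class SnapshotDiff
{
    public static (int Survived, int New, int Dead) Compare(
        HashSet<int> first, HashSet<int> second)
    {
        int survived = first.Intersect(second).Count();
        int created  = second.Except(first).Count();
        int dead     = first.Except(second).Count();
        return (survived, created, dead);
    }

    static void Main()
    {
        var snap1 = new HashSet<int> { 1, 2, 3 };
        var snap2 = new HashSet<int> { 2, 3, 4, 5 };
        var d = Compare(snap1, snap2);
        Console.WriteLine($"survived={d.Survived} new={d.New} dead={d.Dead}");
        // survived=2 new=2 dead=1
    }
}
```

A growing "new minus dead" balance across successive snapshots is the typical symptom of a memory leak, which is what such comparison views are used to spot.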


Figure 23. dotMemory snapshot comparison.

Also, the “View memory traffic” button revealed the gathered data about objects in memory in a simpler way (Figure 24): it gave only the number of allocated bytes and objects and of collected bytes and objects.

Figure 24. dotMemory memory traffic.

dotMemory had a column on the left listing the currently open results, for fast switching between different parts of the analysis.

To view a single snapshot, it was double-clicked (Figure 25). This opened a detailed breakdown of memory at the time of the snapshot. All parts were self-descriptive and some had visual representations. Furthermore, tabs at the top allowed inspecting the findings by types, inspections, call trees, etc., the majority of which showed charts.

Figure 25. dotMemory single snapshot.

dotTrace

The start window (Figure 26) of the performance profiler was nearly identical to that of the memory profiler, with a few minor exceptions. It had three sections: the first offered a choice between attaching the profiler to a running application, running a local application and profiling a remote application; the second listed the types of applications available for diagnosis; and the last contained a link to an executable together with the profiling methods: sampling, tracing, line-by-line and timeline. When attaching the profiler to a running application, the set of options differed, containing only sampling and timeline.

Figure 26. dotTrace main menu.

Sampling is suitable for cases when accurate time measurement is the most important criterion for finding the problem. It is recommended for most cases, when the timing of functions is enough.

Tracing is a more advanced option that measures both time and call count. JetBrains states that the timing is not precise due to profiler overhead, but the call counts are exact. This option is more suitable for deeper analysis, when time per function alone is not enough and more data is required.

Line-by-line is an advanced option. It generates detailed information about every single line of source code. This option creates massive overhead, which affects the measured call time. To achieve accurate results, it is recommended to use line-by-line profiling when the function slowing the program down is already known, so that the profiling process can be centered on a single segment of code using filters.

Timeline collects temporal statistics about threads and events in the software. It is mostly suitable for multithreaded applications.

For comparison, the tests were done with the sampling, tracing and timeline options, since these are the most common profiling techniques and should give sufficient information about the profiler.
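
The idea behind sampling can be sketched as follows (Python is used as a language-agnostic illustration; the function names and recorded stacks are invented). A sampling profiler periodically records the call stack; aggregating the ticks gives an estimate of inclusive time (function anywhere on the stack) and self time (function on top of the stack):

```python
from collections import Counter

# Each sample is the call stack captured at one sampling tick, outermost first.
samples = [
    ["Main", "Update", "Physics"],
    ["Main", "Update", "Physics"],
    ["Main", "Update", "Render"],
    ["Main", "Draw"],
]

inclusive = Counter()   # ticks in which the function was anywhere on the stack
self_time = Counter()   # ticks in which the function was on top of the stack
for stack in samples:
    for fn in set(stack):
        inclusive[fn] += 1
    self_time[stack[-1]] += 1

total = len(samples)
for fn, ticks in inclusive.most_common():
    print(f"{fn}: inclusive {100 * ticks / total:.0f}%, self {100 * self_time[fn] / total:.0f}%")
```

Because only periodic snapshots are taken, the timings are statistical estimates with low overhead, whereas call counts cannot be recovered; that is exactly the gap that tracing fills.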

While the profiler was running, it was possible to control the data gathering via a small control panel (Figure 27), for example by taking a snapshot or killing the process. The control panel was always rendered on top of all open windows, which in fact eased the control.

Figure 27. dotTrace control panel.

After testing the game for about 15 minutes, dotTrace output a list of the 5 snapshots that were taken (Figure 28). Each snapshot contained data about the application execution in the collected time frame. A snapshot is an actual sample of call stack data, framed by the start and the end of the snapshot; it is not necessarily a single moment in time as it is in dotMemory. The reason to capture performance data as snapshots is to have the opportunity to narrow down the search for bottlenecks in the software and to compare various parts of the software.

Figure 28. dotTrace sampling snapshots.

Samples were shown in a performance viewer window (Figure 29), which offered a collection of ways to inspect the data from the top-left tab. The options were threads tree, call tree, plain list and hot spots.

Figure 29. dotTrace results.

Threads tree and call tree are, as the names say, trees that group the data by threads or by calls, as seen in all previously reviewed profilers. Plain list outputs function names, and the user can rearrange the functions by class, namespace or assembly, or leave them ungrouped. The hot spots view focuses on the most time-consuming functions and displays a call-back tree for each of the 100 functions with the highest values.

The tracing technique noticeably dropped the fps to an average of 10 and resulted in a similar window with a number of snapshots (Figure 30).

Figure 30. dotTrace tracing snapshots.

Every snapshot contained data collected from the testing session, as with the sampling technique, but with accurate call counts (Figure 31). The profiler acquires the call stack data for timing as in sampling, but additionally listens to notifications from the Common Language Runtime (CLR) about entering and exiting functions, which slows down execution and affects the timing. The Common Language Runtime is a virtual machine that converts the compiled code of a .NET program into machine code. The statistics window was displayed very similarly to sampling, but the main focus was on the number of calls rather than on timing.
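
The enter/exit instrumentation that tracing relies on can be imitated with a small sketch (Python is used as a language-agnostic illustration; a real tracing profiler hooks CLR notifications instead of wrapping functions by hand, and the function name here is invented). Note how the bookkeeping runs on every single call, which is exactly the overhead that makes traced timings less precise than sampled ones:

```python
import functools
import time

call_stats = {}  # function name -> [call count, accumulated seconds]

def traced(fn):
    """Instrument a function the way a tracing profiler does: record every
    entry/exit, counting calls exactly and accumulating elapsed time."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats = call_stats.setdefault(fn.__name__, [0, 0.0])
            stats[0] += 1                              # exact call count
            stats[1] += time.perf_counter() - start    # timing, skewed by overhead
    return wrapper

@traced
def update_physics(n):
    return sum(i * i for i in range(n))

for _ in range(100):
    update_physics(1000)

count, seconds = call_stats["update_physics"]
print(f"update_physics: {count} calls, {seconds * 1000:.2f} ms total")
```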

Figure 31. dotTrace tracing results.

The tracing results in the “Hot spots” tab (Figure 32) were, in this case, formed from both timing and call count.

Figure 32. dotTrace tracing results in hot spots view.

The final test of dotTrace was the Timeline option. With this configuration, Barotrauma did not lag and produced 60 fps on average. In this session the data was gathered from the beginning until the end as one sample.

The findings were shown in the Timeline Viewer (Figure 33), which differs from the Performance Viewer in structure and features. The Timeline Viewer contained plenty of filters in the left tab: events, interval filters, thread state and subsystems.

Figure 33. dotTrace timeline results.

The events filter groups functions based on a particular event:

• .NET memory allocation
• When an exception is raised
• Where the application wrote to the debug output
• Points where garbage collection executed
• Time frames where the just-in-time (JIT) compiler performed conversion from intermediate language into native code
• Time when reading and writing operations were performed
• Time when communication between the application and SQL servers was performed

The interval event filter:

• UI freeze intervals
• Incoming HTTP requests

The thread state filter:

• Running threads
• Waiting threads

Subsystems, which helps to check the timing of individual components:

• User code execution
• Native or system code execution
• Threads being ready to run on the next available CPU core
• Time that threads spend waiting for exclusive access to an object
• Garbage collection
• File input and output
• Time of execution within a namespace or assembly

The middle section of the Timeline Viewer contained a list of functions with timings, percentages and the performance graph. The tab on the right contained the call stack, the call tree and a source code window.

3.2.4 Visual Studio profiler

Visual Studio is an integrated development environment (IDE) that is commonly used among .NET developers and is the essential choice of C# developers. According to the surveys made by Stack Overflow in 2017 (https://insights.stackoverflow.com/survey/2017#overview) and 2018 (https://insights.stackoverflow.com/survey/2018#overview), with 64,000 and 100,000 participants respectively, Visual Studio is the most popular development environment among respondents.

In 2010 Microsoft released Visual Studio with a built-in profiler and has continued supporting it up to the time of writing, 2019. Since C# and the whole .NET framework were developed by Microsoft, the easiest way to monitor the performance of a .NET app is with the IDE of the same company.

Visual Studio supports the debugger-integrated Diagnostic Tools and the non-debugger Performance Profiler. Diagnostic Tools can provide information such as breakpoints and variable values, since it works only during debugging sessions. On the other hand, the Performance Profiler results are closer to the end-user experience, because profiling is done on a build and not on a project running in the IDE.

Diagnostic tools

The Diagnostic Tools window can be invoked inside Visual Studio through version-specific commands, for example in the 2017 version via Debug > Windows > Show Diagnostic Tools. The compact tab shown in Figure 34 is the control panel for diagnosing. In the integrated profiler, CPU and memory usage profiling do not exclude each other: CPU activity is tracked throughout the session, while for tracking memory the user must take snapshots.

Figure 34. Visual Studio Diagnostic tools, CPU.

During this play session 3 snapshots were taken, each representing a different stage of the game. The FPS rarely dropped below 55 and stayed in the 55-62 range. In Figure 35 the 3 snapshots show the time collected from the beginning of the session, the object count with the difference from the previous snapshot and the heap size with the difference from the previous snapshot. Clicking the object count or heap size data opens the gathered measurements sorted by objects or heap size respectively. The parentheses contain the differences from the previously taken snapshot, which are also viewable, with a red or green arrow indicating a rise or drop. Figure 36 shows the second snapshot with the information sorted by object count.

Figure 35. Visual Studio Diagnostic tools, Memory.

Figure 36. Visual Studio Diagnostic tools memory results.

Selecting a line in the table opens detailed material about references, roots and counts.

The CPU Usage tab of Diagnostic Tools stores the CPU activity during gameplay, sorted by function.

Each function stores information about its caller and callee functions, with the number of calls and the time spent relative to the caller function in percent (Figure 37). These statistics are available in the caller/callee, call tree and modules views. The caller/callee view lets the user navigate from method to method, always showing three blocks with information specific to each function. The bottom window demonstrates the time spent on each line of source code and changes depending on the open method.

Figure 37. Visual Studio Diagnostic tools CPU profiling results.

Non-debugger Performance Profiler

To open the Performance Profiler in Visual Studio 2017, the user needs to navigate to Analyze > Performance Profiler or Debug > Performance Profiler. It opens a new tab in the main workspace with configurations to choose the target and tools (Figure 38). The startup project can be one already opened in Visual Studio or launched directly by the profiler; the profiler can also attach itself to a running app, and it supports ASP.NET and Windows Store applications.

Figure 38. Visual Studio profiler target options.

It has a variety of tools available in the current case (Figure 39): a tracker of .NET object allocations, a CPU usage profiler, a GPU usage profiler, a memory profiler and a Performance Wizard. Additionally there are application timeline, HTML UI responsiveness, JavaScript memory and network options for other situations, not applicable in this testing. To assess the profiler, the Memory Usage and CPU Usage options were used.

Figure 39. Visual Studio profiler options.

Memory profiler

The memory profiler measurements did not affect the gameplay; the frame rate was a steady 60 fps. The results (Figure 40) are shown as a graph of RAM usage throughout the execution, with snapshots containing a screenshot of the game at that time, the managed heap size and the number of objects. In every sample except the first, the differences shown under the main values demonstrate the change in heap memory and object allocations. Heap size and objects expand into a table (Figure 41) with object types, object count, size in bytes, inclusive size in bytes and module.

Figure 40. Visual Studio profiler memory snapshots.

Figure 41. Visual Studio profiler memory snapshot data.
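
The “inclusive size” column can be understood through a minimal sketch (Python is used as a language-agnostic illustration; the object graph and sizes are invented): it is the object's own size plus the sizes of everything reachable from it.

```python
# Hypothetical object graph: object ID -> (shallow size in bytes, referenced IDs).
heap = {
    1: (64, [2, 3]),   # e.g. a sprite referencing a texture and a sound
    2: (1024, []),     # texture
    3: (256, []),      # sound
}

def inclusive_size(obj_id, seen=None):
    """Shallow size of the object plus the sizes of all objects reachable
    from it, visiting each object only once (heaps may contain cycles)."""
    if seen is None:
        seen = set()
    if obj_id in seen:
        return 0
    seen.add(obj_id)
    size, refs = heap[obj_id]
    return size + sum(inclusive_size(r, seen) for r in refs)

print(inclusive_size(1))  # 64 + 1024 + 256 = 1344
```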

CPU profiler

The performance profiler did affect the gameplay; the frame rate fluctuated between 40 and 60 fps. The statistics (Figure 42) took around a minute to prepare after terminating the testing session. Visual Studio created a graph of CPU activity with 2 sliders to zoom in on values. By default it selects the beginning and the end of the session to assemble a report, but with every change of the zoom frame it recalculates the values.

Figure 42. Visual Studio profiler CPU snapshot data.

Under the graph is a table with function names, total CPU, self CPU and module. The function name and module columns are self-explanatory; the remaining columns, however, are less clear. Total CPU is calculated as follows: the total method activity, including the time when the function called other functions, is divided by the application activity and multiplied by 100 to get the CPU cost in percent. The self CPU column is determined similarly, by dividing the method activity by the application activity, but in this case excluding the time when the function called other methods, leaving only the execution of the function body, and multiplying by 100 to get the estimate in percent.
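
The arithmetic behind the two columns can be shown with a small sketch (Python is used as a language-agnostic illustration; the function names and timings are invented):

```python
# Hypothetical per-function CPU activity in milliseconds over one session.
# "inclusive" includes time spent in callees; "self" is the function body only.
functions = {
    "Game.Update":  {"inclusive": 800.0, "self": 150.0},
    "Physics.Step": {"inclusive": 500.0, "self": 500.0},
    "Game.Draw":    {"inclusive": 200.0, "self": 200.0},
}
app_activity = 1000.0  # total application CPU activity in ms

for name, t in functions.items():
    total_cpu = t["inclusive"] / app_activity * 100  # "Total CPU" column
    self_cpu = t["self"] / app_activity * 100        # "Self CPU" column
    print(f"{name}: Total CPU {total_cpu:.0f}%, Self CPU {self_cpu:.0f}%")
```

Note that total CPU values of different functions can sum to more than 100%, since a caller's inclusive time overlaps with that of its callees; self CPU values do not overlap.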

The full report with the caller/callee, call tree or module table (Figure 43) is accessible by double-clicking a function. It contains blocks similar to the report made by the debugger-integrated profiler, with the caller function, the current function and the callee functions. At the bottom, the report points to the place in the source code where the function was called, indicating the time each line took.

Figure 43. Visual Studio profiler CPU call chain.

4 RESULTS AND DISCUSSION

The research completed on profilers for .NET applications is described in this chapter. All observations and ratings are based on the researcher's personal opinion. For a more thorough evaluation of the various criteria, additional research with multiple user opinions should be done. The evaluation of each tool is presented as a separate paragraph, where features and details are discussed and assessed based on the criteria stated in the Methods chapter:

• Affordability: a criterion evaluating the standard price of the software license and possible offers from the company. A high score means a relatively low and reasonable price.
• Learnability: the ease of learning the functionality with the help of the provided tutorials and documentation. A high score means it is easy to learn.
• UI intuitiveness: simplicity of the UI, with minimal effort needed to understand the program without help. A high score means it is more intuitive and requires less effort to use.
• Profiler performance: the ability of the profiler to manage computer resources so as to keep the execution speed sufficient, i.e. the performance the tested software can achieve with the profiler attached. A high score means the execution was close to normal during testing and the profiler has a low performance impact on the machine.
• Analysis depth: the volume of information collected by the profiler. A high score means more information gathered.
• Customizability: the available number of analysis modes and ways to display the findings. A high score means a convenient number of analysis modes and a sufficient number of options for showing the data.

As the final part of the discussion, a table with the scores concludes the comparison and gives a visual interpretation of the evaluation of the findings.

ANTS

Redgate sells the ANTS performance profiler in a bundle with the ANTS memory profiler and a debugger for 3rd-party code for 935 euros for one license with a duration of 1 year. Individually, the performance profiler costs 465 euros for a standard license and 715 euros for a PRO license, and the memory profiler costs 565 euros. While these are non-free products, the company offers deals: 50% off the price for personal use, free licenses for open-source projects and educational licenses. Due to the high price, but with the list of possible deals, affordability is medium.

Both the memory profiler and the performance profiler offer an intuitive and mostly clear UI, although it is still necessary to check the online tutorial. Searching for a performance problem in software is not an easy task, but even if the user is stuck and overwhelmed by the profiler findings, the Redgate documentation describes very well how to use every function of the profiler and leaves no room for frustration. Profiler learnability is high and UI intuitiveness is medium.

The memory profiler had little to no influence on performance; light lags happened only at the moments when snapshots were taken. The performance profiler's effect differs with the chosen profiling mode. With the heaviest and moderate profiling modes on, the game was hardly playable (8-12 fps). The light profiling mode, with small overhead, collects enough information to detect the problem while leaving the game playable (60 fps). Profiler performance is medium due to the significant decrease in frame rate.

The ANTS profilers gathered a significant amount of information, including connections to other functions in the chain, timing, hit count and data size. Along with great collection abilities, the profilers are also able to present the findings in many ways, personalizing them for every user to ease the search process. Profiling depth and customizability are high.

CodeTrack

CodeTrack is a completely free profiling tool that can be used in personal and commercial projects. Affordability is high.

It mostly has a simple UI, but it gets more complex after the data collection. If it is enough for the user to see all the information in a call tree, it is user-friendly, but to inspect the information with the rest of the available methods the user should consult the documentation, which contains many examples to overcome confusion with the presented analysis. Profiler learnability is high and UI intuitiveness is medium.

It has a large impact on performance (10 fps) when the Deep Trace and Tracing options are used, but a low effect with the Sampling method. Profiler performance is low due to the majority of analysis modes being slow.

The profiler offers several ways to show the collected results. The results contain plenty of information about the processes running in the tested software, demonstrating the depth of profiling. Analysis depth and customizability are high.

JetBrains

JetBrains has a variety of license policies: yearly subscription, monthly subscription, licenses for companies and for individual use, and special offers. The profilers are available as part of a bundle. The standard price for a bundle of the .NET IDE, Visual Studio extensions and profilers is 450 euros for companies and 179 euros for individuals, with the price being reduced for each year of subscription. The special offers cover student licenses, open-source projects, educational organizations, startups, special trainings, non-profit organizations, etc. Affordability is medium due to the large number of deals and a price that, for non-free software, is lower than that of the ANTS profilers.

The JetBrains suite consists of dotMemory and dotTrace. Both profilers have a similar UI, which is very user-friendly and easy to navigate. To understand how the profilers present their findings, the user might need to consult the documentation, where everything is simply explained. dotMemory and dotTrace output their results in a very explanatory manner that is easy to understand. Furthermore, the profilers can be integrated into the Visual Studio IDE. Profiler learnability and UI intuitiveness are high.

The performance impact is noticeable but bearable. Barotrauma executed in the range of 15 to 60 fps with memory profiling, which is slower than with the other memory profilers, at 10 fps with deep CPU profiling and at a normal 60 fps with the light Timeline CPU profiling. The heavier options were in the performance profiler, but it has a number of modes for light and deep profiling, making it customizable for each case. Profiler performance is medium.

The profilers have plenty of options for presenting the analytics, ranging from sorted lists and timing for each line of code in both profilers to colorful and expressive charts in the dotMemory profiler. Depth of analysis and customizability are high.

Visual Studio profiler

The Visual Studio IDE has an integrated profiler, which makes it the easiest option for Visual Studio users, and since Visual Studio is the most popular IDE at the moment according to research (https://pypl.github.io/IDE.html) and the two Stack Overflow surveys mentioned in the Methods chapter, the integrated profiler would be a popular choice for the majority of developers. The profiler comes as a free component of Visual Studio. The prices for the IDE are as follows: the Community version is free, the Professional version is 45 dollars per month or 1199 dollars per year, and the Enterprise version costs 250 dollars per month or 5999 dollars per year. Furthermore, Microsoft has a number of deals, including licenses for educational purposes. Affordability is high.

It is the first reviewed profiler that offers tools both with and without the debugger. The memory and performance profilers that were tested are built with a straightforward UI, perfectly matching the theme of the IDE itself, so Visual Studio users are unlikely to have trouble understanding it. The information displaying the results is descriptive and readable. The documentation and tutorials are sometimes unclear. Learnability is medium, but UI intuitiveness is high.

The profiling tools had an insignificant impact on performance, which is unique among the reviewed profilers, and did not disturb the gameplay during the testing sessions. Profiler performance is high.

With low overhead being an advantage, the way of changing the depth of analysis is different compared to the other profilers, which can be counted as a small disadvantage; at the same time, this different approach to the profiling options for tracking data is unique among the reviewed profilers. Overall it still provided deep analytics with several sufficient ways of presenting it. Depth of analysis and customizability are high.

Table 1. Profiler comparison.

                  ANTS     CodeTrack   JetBrains   Visual Studio

Affordability     Medium   High        Medium      High
Learnability      High     High        High        Medium
UI intuitiveness  Medium   Medium      High        High
Performance       Medium   Low         Medium      High
Analysis depth    High     High        High        High
Customizability   High     High        High        High

As Table 1 sums up the evaluation, all investigated profilers have medium or high scores, which supports their reputation as the most popular profiling software recommended online. There are some relatively unique evaluations, such as the learnability factor, where the Visual Studio profiler is the only candidate with a medium grade among otherwise high marks. Furthermore, the profiler performance criterion has the most diverse results: it contains all possible grades, but most importantly, only Visual Studio received a high grade, because of its optimized operation, which does not interfere with the performance of the tested application.

The purpose of this thesis was to make a profiler recommendation for game development based on user experience, usability and industry preferences. This thesis, however, did not address profiler reliability (i.e. whether the profiler is able to measure the game's performance accurately). A recommended further study in this area would be to make changes to the profiler-identified inefficient code and then run the profiler again to see whether the changes made an impact. These results should then be compared to theoretical measures such as big-O notation.

5 CONCLUSION

The purpose of this thesis was to solve the problem of choosing among the available profilers for the game development industry. The research was conducted on 4 profilers widely recommended in online fora. The procedure was to play the game Barotrauma in single-player mode for 10 to 30 minutes and to experience at least one monster attack and at least one submarine hull breach. The same procedure was repeated for all chosen profiling tools, and afterwards the profiler findings were evaluated against the following criteria: affordability, learnability, UI intuitiveness, profiler performance, analysis depth and customizability. In the discussion chapter, all the results were summarized and presented in a table which determined the winner: the Visual Studio profiler. The Visual Studio built-in profiler has the highest score among the researched profilers. The profilers from JetBrains are second on the list; this software has a reasonable price and a number of deals for a reduced price or a free license, while providing great diagnostics. In case there is a need for a free profiling tool, CodeTrack is a perfect choice after Visual Studio. The use of the ANTS profilers is certainly beneficial for the industry; however, due to their marks for the above-mentioned criteria they took the last place on the list. This research was based purely on user experience and does not include an evaluation of the precision and accuracy of the gathered statistics. To achieve that, future research should include data collection before and after improving slow sections of code to compare the change in performance.
