
Profiler Tool Selection for Curricular Support

Abstract

Profiling is the process of analyzing the structure and performance of software. Profiling can be used to detect patterns of use, to verify performance, to optimize code, to identify corruption and to expose memory leaks or excessive resource demands. Components of a large system can be profiled individually or together. Profiling is accomplished through software tools – software that runs and/or instruments the application under study.

Since profiling reinforces understanding of conceptual processes, it can be used effectively in computer science to strengthen programming skills and awareness of design issues. Essentially, such tool usage exposes hidden details. For example, analysis of memory usage can highlight different types of memory management, and analysis of function calls can illustrate the impact of different portions of a software system (i.e., the functions executed). Students thus learn to think critically in a practical setting, fostering the capacity to evaluate vendor claims and counter assumptions (such as ‘Java has no memory leaks’).

In this paper, we summarize our attempts to select, evaluate, and compare profiling tools, where the computer science classroom is the targeted use of such tools. First, we derived desired characteristics for profiling tools. Then we determined availability, subject to the constraints of an academic environment. Finally, we evaluated implementation and usage details.

Profiling Tool Criteria

Profiling is not a standard topic in the curriculum. However, profiling can be employed as a supplementary tool to reinforce material in programming and related courses. Hence, we reviewed the fundamental concepts that CS students need to understand. Ideally, profiling should be demonstrated at a level that students can easily understand, should reinforce concepts that are being taught in the classroom, and should be something students find useful in their computer science education.

We focused on selecting a tool that students could easily acquire and apply to their own code, and whose feedback they could readily interpret. Our criteria considered computer equipment accessible to students and programming languages familiar to most students. We also surveyed the computer science faculty, soliciting their feedback on appropriate tool usage and desirable features.

Faculty were asked to rank preferred features. Two of the features queried concerned presentation: the presence of a GUI and the display of profiling results (charts, graphs). Other features focused on utility, specifically whether the profiling tool would work with multiple languages, including C++, and with the Unix, Windows and MacOS environments. Faculty were also asked to briefly answer the following questions:
• Which languages do you feel are important to consider?
• Which platform or environment is most important?
• What features or capabilities would you most like to see introduced to the classroom?
• Is there a particular profiling focus or result you feel would benefit CS students most?

We balanced results of the faculty survey with our own classroom experiences and perceived student expectations. In sum, we derived a set of features describing a tool that profiles C++ and is available for both Windows and UNIX systems. A graphical interface was considered to be less important, as long as the formatted results were easy to read and interpret. Since Java is another important language in the curriculum, profiling different languages, in particular C, C++, and Java, would be ideal.

In terms of conceptual material to reinforce with a profiling tool, memory usage was emphasized. Other important concepts included: profiling disk accesses, procedure-level profiling, I/O performance tracking, identification of excessive resource use, and algorithm performance. To aid in teaching profiling, professors anticipated using a profiler to compare algorithms. This could be done by comparing different software programs, or by using one program that changes its control flow over time.

Profiling Tool Availability

Choosing a profiling tool was challenging. In fact, selection proved to be one of the toughest tasks encountered. We had anticipated a fairly straightforward step, and were surprised to spend (too much) time sorting through various profiling tools trying to find one that met all of our needs. The major challenges we faced were: finding free tools, encountering poor documentation, and determining whether tools met all of our criteria for language, platform, cost, and features.

Our project aims to integrate profiling tools into classrooms with students as the target. Thus, we sought an open-source tool. Purchasing a tool might make it inconvenient or infeasible for many students to acquire the tool for their own use and experimentation. Also, academic departments might want free distribution. Many of the most appealing tools were not free, so we did not test them. Generally, these tools had user-friendly GUIs and well-developed graphical features. Most of these tools, however, were targeted to very specific audiences. The second, and perhaps the most frustrating, problem encountered was poor or completely missing documentation. With some of the most promising free tools, we were unable to get past the installation stage because there were no tutorials and no documentation to support investigation. We suspected incomplete documentation would be much less of an issue with a purchased tool because thorough user manuals are usually a selling point. After many hours, we abandoned two tools with good potential: Eclipse and NetBeans.

Eclipse is a development platform that is open source. Many Eclipse developers have created plug-in tools with various capabilities, including profiling. Although Eclipse offers a wide variety of features, the major obstacle was lack of appropriate introductory documentation. Even after hours of research, experimentation, and searching for beginner (step-by-step) instructions, we were unable to get the profiling plug-in tool installed and working. Without more detailed instruction, Eclipse did not appear to be a suitable tool for easy installation targeted toward small, low-level profiling. NetBeans presented a similar problem.

The last challenge faced was finding a tool that met a variety of our desired features. Aside from cost, these included language, platform, and graphical features. We found tools that met some, but not all of our criteria, so we decided not to explore them further. For example, many tools profile only Java programs, but we desired multi-language support. In the end, we compromised and selected a tool that met all of the criteria but not necessarily in the optimal manner. For example, some tools had better graphical user interfaces than others but did not have multi-language support.

Profiling Tools Examined

To prepare for presentation of a profiling tool to a university classroom, we experimented with a wide range of profiling tools ourselves. Online searches and investigation led to many sources for profiling software. Experimenting with the available software provided a comprehension of the breadth of features, strengths, weaknesses, and capabilities of the accessible tools. From our initial exploration, we were able to gain a more thorough understanding of the profiling tools currently available. By understanding the limitations of the tools, we were able to determine a set of minimum functionality to demonstrate in our classroom instruction.

We narrowed our selection from a couple dozen profiling tools down to three strong candidates, and then performed tests on each tool. We ran a small set of programs in each one, evaluating the ease of installation, ease of use, user interface, existence and clarity of documentation and instructions, and the usefulness and readability of the results. The programs we ran were small to medium sized C++ programs – approximately equivalent to the length and complexity of a typical program written in an introductory data structures class. Programs included different types of loops, pointers, arrays, lists, linked lists, etc. During each test, we noted any unexpected errors, confusing instructions or unclear results. These factors determined our evaluation of the tool’s ease of use. Finding answers to issues, explanations of results, and troubleshooting details during our examination of each tool helped us assess the tool’s documentation and instruction quality. The final result set for each run was considered for usefulness and readability. We compared the breadth of the tool’s analysis, the clarity of the output, and any other extra information that the tool gathered.

Our three candidates were GProf, DevPartner and Valgrind. All three are available as freeware, work with C++ programs, and were downloaded and installed without problems.

GProf runs in a Linux shell, which is readily available on university lab machines. It is conveniently included on most Unix/Linux platforms. The tool includes thorough help files. We were also able to find extensive documentation online to provide instructions and explanations of its features.

GProf does not, however, include a graphical interface. Results are displayed as text in the Linux shell in which it is run. The results are precise, but the text output does not make it easy to compare results at a glance.

DevPartner is a free download, providing limited features of the full-cost version. The software is easy to download and install. It runs on top of Visual Studio, providing profiling results for programs that are run within the Visual Studio environment. It also includes help files that imitate Visual Studio’s help file format. The tool provides some graphic results, and shows clear tables that are easy to interpret.

DevPartner result tables give statistical information for every function or method that is called during the program execution. This display includes kernel and system function calls such as heap actions, thread actions, and library elements such as string functions and I/O functions. For each method, DevPartner lists how many times the function was called; fastest, slowest, and average execution times; child functions, and relative times spent in the child functions. A call tree is also represented graphically by double-clicking on the function name. In addition to the method statistics, the source code is shown with line-by-line summary of function calls, so that it is visually apparent at what line in the code the most time or resources are being spent.

DevPartner results are summed into a Performance Analysis Session Summary, which lists a brief set of statistics including the total time, number of methods called, total number of calls, and a brief list of the most prevalent methods. As a whole, DevPartner is very easy to use, and the results are interesting and simple to interpret.

While the DevPartner download is free, Visual Studio is not. For this reason, we felt DevPartner was not the best choice for us. Aside from the dependence on the host software, we felt DevPartner was a strong candidate. For academic departments that already use Visual Studio, this profiling tool might be an appropriate choice.

Valgrind runs on a Linux platform. The program was relatively uncomplicated to download and install. It works with multiple languages, profiles the entire program without requiring modification, re-compiling or re-linking of the application, and offers a variety of tools and extensibility.

We ran the profiler on a test program in order to see the clarity, depth, and features of the results. What we found is that Valgrind is a good comprehensive tool. It includes a collection of distinct tools, most notably Memcheck, Cachegrind, and Massif.

Memcheck detects memory-management problems. The tool checks all memory reads and writes, as well as calls to malloc/new and free/delete. It also reports use of uninitialized or unallocated memory, and provides a clear summary of memory leaks. The memory leak summary shows how many blocks of memory have been lost due to leaks and how many unfreed blocks are still reachable.

Cachegrind profiles the I1, L2, and D1 caches in the CPU. Output identifies the source of cache misses. Cachegrind will also identify the number of cache misses, memory references, and instructions, with line-by-line, function, module, or complete program summaries. The downside of Cachegrind is that it does not consider kernel activity or other process activities.

Massif is a heap profiler that determines the heap memory usage of a program. The tool prints out data on heap blocks and stack size.

Valgrind causes the profiled program to run significantly slower (by up to 100 times, depending on the tool that is used). There is also a graphical plug-in available called KCachegrind. However, this plug-in has little documentation for the novice user or those unfamiliar with the Linux environment.

Considering the characteristics of the three candidates and our established criteria, we settled on Valgrind. Despite causing programs to run at a slower rate, it was the outstanding contender. It was easy to install, included help files and abundant online documentation, worked with multiple languages, provided a graphical user interface plug-in, offered many tool functions, and worked without requiring extra code manipulation or recompiling.

Conclusion

Our work focused on finding an appropriate profiling tool for use in the classroom. By comparing and contrasting various tools, we chose a tool that satisfied language, platform and cost requirements. We geared tool selection towards the needs of professors and students in the context of the introductory computer science course sequence. Desired features included: support for C++; support for multiple languages; operation in both Windows and Unix environments; the ability to track memory usage; and some graphical display of results. Our search for a profiling tool that met all of our criteria proved to be extremely challenging. Many of the tools had some, but not all, of the features desired. We also encountered difficulties with tool documentation and with finding tools that were free to the public.

After sorting through a variety of tools and rejecting many (some with good potential), we settled on a tool called Valgrind. Valgrind is a free Linux-based profiling tool with multiple language support and a graphical add-on. Future work includes the development of introductory tutorials and sample programs to profile in an introductory CS course.