
Order Number 9412048

Visualizing program variable data for debugging

Shomper, Keith Allen, Ph.D.

The Ohio State University, 1993


Visualizing Program Variable Data for Debugging

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Keith A. Shomper, B.A., M.S.

The Ohio State University

1993

Dissertation Committee:

Wayne E. Carlson

Bruce Weide

Mukesh Singhal

Approved by

Adviser, Department of Computer and Information Science

© Copyright by

Keith A. Shomper

1993

To Vickie, Rebekah, Matthew, and Jonathan

Acknowledgements

My appreciation must first and foremost be to the Lord God who gave me the ability and determination to complete this research.

I must also thank Wayne Carlson, my advisor, for keeping me focused on one problem and rescuing me from “biting off more than I could chew.” Your advice on both this research and professional development is very much appreciated. I thank Bruce Weide for his many useful comments on this work, especially for introducing the idea of providing access to the user-defined “print” procedures. Finally, to Mukesh Singhal, thank you for being a part of my reading committee. Your presence as a reader motivated me to produce a better document.

There are others in the Computer and Information Science Department at Ohio State that I wish to thank: Chuck Destefani for being my study partner in the early part of this program, John Boyd for helping me believe that we could finish our programs within the schedules we had set for ourselves, and the professors in my graphics courses, Rick Parent and Roni Yagel, for making this experience rewarding.

Finally, I’d like to thank those who helped me succeed in this program by their steady influence and reassuring support. I thank my parents, Richard and Lillian Huber, for teaching me how to be disciplined in my work. I also wish to thank my wife’s parents, Giovanni and Frances Dastoli, for opening their home to me so I could relax from schoolwork when things got hectic. Thank you Pastor and Mrs. Grandy and the people at Heritage Baptist Church for your prayers and support.

Finally, most of all, I am indebted to my wife. Sweetheart, your love and unfailing belief in me kept me from quitting when it would have been easy to do so. To Rebekah, Matthew, and Jonathan, your playful ability to distract me from my schoolwork and to keep me from being swallowed up in it helped me to maintain a healthy focus on the importance of balancing my profession with the greater responsibility of raising you.

Vita

December 4, 1961 ...... Born - Harrisburg, Pa.

1983 ...... B.A. Mathematics, University of Northern Colorado, Greeley, Co.

1984 ...... M.S. Computer Science, Air Force Institute of Technology, Dayton, Oh.

1985-1990 ...... Computer Analyst, Offutt Air Force Base, Omaha, Ne.

Publications

K. Shomper. Visual Debugging. Technical Research Report OSU-ACCAD-5/93-TR5, Advanced Computing Center for the Arts and Design, The Ohio State University, 2036 Neil Ave, Columbus, Oh 43120, May 1993.

Fields of Study

Major Field: Computer and Information Science

Studies in:

Computer Graphics: Prof. Wayne Carlson

Software Engineering: Prof. Bruce Weide

Distributed Computing: Prof. Mukesh Singhal

Table of Contents

DEDICATION

ACKNOWLEDGEMENTS

VITA

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I Introduction

1.1 Necessity of Debugging
1.2 Developments in Debugging
1.2.1 History of Debugging
1.2.2 Debugging Theory
1.2.3 Debugging Practice
1.2.4 Debugger Technology
1.3 Debugging Graphical Programs
1.4 The Thesis
1.5 Outline of the Dissertation

II Related Work

2.1 Visual Methods in Programming
2.1.1 Taxonomies of Visual Methods
2.1.2 Visual Programming
2.1.3 Program Visualization
2.1.4 Visual Debugging
2.2 Visually-Oriented Debugging Tools
2.2.1 EXDAMS (1969)
2.2.2 Incense (1980)
2.2.3 GDBX (1985)
2.2.4 VIPS (1987)
2.2.5 PROVIDE (1988)
2.2.6 VIPS II (1991)
2.2.7 Commercial Debuggers
2.3 Summary of Previous Work
2.3.1 Best Characteristics
2.3.2 Limitations and Restrictions
2.4 Desirable Characteristics for a Visual Debugger

III The Data Visualizer

3.1 Designing the Data Visualizer
3.2 Data Selection and Display
3.2.1 Data Selection
3.2.2 Data Display
3.3 Debugging With the Data Visualizer
3.4 Architecture and Environment

IV Data Selection

4.1 Reducing Information
4.2 Abstracting Data Structures
4.2.1 Extending Definitional Reduction
4.2.2 Integrating Definitional and Programmatic Reduction
4.3 The DV Selection Definition System
4.3.1 Classifying Data Fields
4.3.2 Handling Pointer Fields
4.3.3 Procedure-Based Selection
4.3.4 Type-Based Selection and Multiple Definitions
4.3.5 Data Selection: An Illustrated Example
4.4 Summary

V Data Display

5.1 DV’s Data Display Philosophy
5.2 Managing Abstractions
5.2.1 Organizing the Data
5.2.2 Modifying Abstractions
5.3 DV Data Display Objects
5.3.1 The Main Window and Menu Bar
5.3.2 Common Features
5.3.3 The Function Object
5.3.4 The Variable Object
5.3.5 The Graphic Object and the Drawing Pad
5.4 Data Display: An Illustrated Example
5.5 Summary

VI Summary

6.1 Evaluation
6.2 Contributions
6.3 Future Work
6.4 Conclusion

BIBLIOGRAPHY

List of Tables

TABLE

1 Methods of Data Reduction Used by Various Debugging Tools

2 Selection Modes

3 Types of Association Used to Organize Debugger Data

List of Figures

FIGURE

1 Shapiro’s Debugging Algorithm

2 Memory Allocations Shown as Blocks (a) and as Text (b)

3 Shu’s Taxonomy of Visual Methods

4 A Comparison of Myers’ and Shu’s Taxonomies for VP Systems

5 A Comparison of Myers’ and Shu’s Taxonomies for PV Systems

6 A PIGS Program for Sorting N Numbers in Descending Order

7 BALSA Image for Breadth-First Traversal Animation

8 A Flowback Analysis Graph

9 (a) Default Formats, (b) Custom Format for RECORD in (a), (c) a Data Structure with Referents, and (d) Layout for Data Structure in (c)

10 VIPS II Debugging an Example Application

11 A Linked List: Abstract View (left) and Full Detailed View (right)

12 DV Data Display Objects

13 xdbx and DV as They Appear Initially

14 xdbx and DV Displaying polygon and ray

15 Evidence of an Error: P Appears in the Wrong Location

16 The Displayed Variables After Correcting Component P[2]

17 Communication Relationships Among the Debugging Processes

18 Two Views of a Linked List: Abstract (left) and Detailed (right)

19 Two Views of the List Data Structure Used by DV

20 A Simple Linked List: (a) Its Representation, (b), (c) and (d) Three Possible Listings of the Elements

21 The Selection Definition System Interface (for Structures)

22 A Model Linked List

23 (a) A Generalized List Record and (b) The Generalized List A = (a,(b,c),d)

24 The Traditional Box-And-Arrow View

25 The Tree With Thread Flags Elided

26 An Inorder Listing of the Tree’s Contents

27 An Inorder Listing With Highlights

28 Abstracting a Circular Queue

29 DV’s Initial Display State

30 DV Displaying a Function add and Its Variable sum

31 DV’s Menu Bar Commands

32 DV Functions Displayed Cascading (Default)

33 DV Functions Resized and Tiled by the User

34 Simple and Combination Variables

35 Creating a Graphic for polygon

36 Highlighting the polygon Graphic

37 xdbx and DV Displaying polygon and ray

38 Examining the Spatial Relationship Between polygon and ray

39 Displaying P as a 3D Point

CHAPTER I

Introduction

“To err is human . . .” Everyone has experienced the truth embodied in this familiar quote, computer programmers not excepted. However, human imperfection can be especially bedeviling to the programmer, because a computer’s forgiveness is much less than divine. Ordinarily, a programming error will produce bugs in the program. A bug is a programming error which has manifested itself by some visible fault in the program’s behavior [25]. Generally, there are two approaches to reduce the number of bugs in a program: formal methods and debugging.

Formal methods limit the introduction of bugs during program development by compensating for human frailties in several ways. They encourage the removal of error-producing ambiguities from the problem statement, program design, and implementation. They also provide methodological assistance in problem solving. Often just faithfully following formal development procedures can help lower the bug count by requiring programmers to think more rigorously, i.e., less intuitively, about the problem’s “messy details.”

Debugging is an iterative process of error removal. It is distinguished from program testing as follows. Testing is the activity of searching for evidence that the program contains an error or bug. Testing attempts to discover bugs; it is not concerned with their removal [45]. Debugging is the effort to remove program bugs by characterizing the fault produced by a bug, locating and correcting the error, and validating the program with the correction. Under this definition, debugging is a strictly reactive approach to error control (as opposed to formal methods which are mainly proactive).

However, as we shall see in Section 1.2.3, some debugging activity can occur before any bugs are discovered, and doing so makes debugging the program easier.

1.1 Necessity of Debugging

Proverbial wisdom says: “An ounce of prevention is worth a pound of cure.” While it is certainly preferable to adopt procedures to prevent errors rather than to correct them using debugging techniques, it is certain that some errors will remain despite the application of formal methods. People make mistakes, and some errors will slip into the program regardless of the best attempts to prevent them. Therefore, debugging is not a menace to sound programming practice as Dijkstra suggested [27], but a useful tool to combat the effects of our human imperfection. Furthermore, most software is created in an environment requiring some assumptions whose consequences cannot be known with all certainty [74]. In these situations errors arise. Therefore, debugging will remain a necessary and legitimate activity for error removal despite the complementary use of formal methods.

Since debugging will continue to be a part of the software development process, programmers must learn how to do it efficiently, and presently this is a problem. Most professional programmers still rely on inefficient manual methods, chiefly printed output, to debug their programs [58]. It is necessary, therefore, that we study debugging to increase its utility. Debugging software is an exercise in problem solving, so its practitioners can expect to benefit from research in problem solving techniques. By examining the ways people debug programs, we can gain insight on how to improve debugging techniques and tools. The next section outlines the development of debugging: its history, theory, practice, and some recent advances in tools to support it.

1.2 Developments in Debugging

Debugging grew up with computer programming [73], but although it has long been a part of program development, debugging practice and debugger development have not received much attention in comparison with other software areas [4, 65]. For this reason, debugging methods have not changed much since their beginning. This does not mean that there has been no advancement in debugging; user interfaces to debuggers are now quite good, but much work remains to help programmers debug more effectively.

1.2.1 History of Debugging

Early debugging systems used binary-encoded lights and switches on the computer’s console to inspect and modify variables in memory registers [32]. Later, the lights and switches were replaced with a typewriter-like interface. This change allowed system software to manage communication between the programmer and his program; this was the birth of the on-line debugger. Through the late fifties and sixties debuggers improved. At first symbols could be associated with program locations; later debugging took place at the assembly language level. Tracing, breakpoints, and program patching were all introduced during this time. By the mid-sixties, debuggers provided most of the capabilities available in today’s terminal-oriented debuggers, including the ability to display variable data in different formats (e.g., decimal, octal, etc.) [68]. Moreover, some even provided rudimentary graphical interaction via the light pen [89].

Unfortunately, as debuggers improved, the expense of computer use pushed most debugging off the machine. During this time analysts debugged programs with printed memory dumps in order to save computer cycles [13]. As computer memories grew, full dumps became less practical, so programmers turned to selective dumps, tracing, and program snapshots. When time-sharing operating systems finally made it cost effective for debugging to return on-line, the assembly language-based debugging tools were no longer adequate for debugging the newer high-level languages. These languages’ ability to construct abstract computation models and user-defined data types had outstripped the debugger’s ability to easily determine when a computation is complete or to display the user-defined types. So instead of debugging practice returning full-circle to the computer, users continued with the practice of using printed output to debug their programs. Although today “printed” output is more frequently written to disk and examined on-line, conventional debugging remains essentially the same as when hardcopy printing was used.

The challenge today is to develop debugging tools to bring the programmer back to the computer [58]. To do this debuggers must allow software developers to debug their programs at the application abstraction level. They must exploit the computer’s data processing capability to abstract and organize the debugger data before presenting it, rather than placing this burden on the user. Furthermore, debuggers must allow programmers to move about the execution space of their programs as abstractly as they might in mental simulation. Efforts have begun recently to enhance the debugger in these directions [2, 59, 66, 76].

1.2.2 Debugging Theory

As previously mentioned, although debugging is nearly as old as the profession of programming, it has only recently been recognized as an activity that could benefit from additional study [4]. We attribute this neglect to two prevailing views on debugging. We’ve already discussed, and discounted, the first: that debugging is a menace to software engineering. The second view pictures debugging as an abstruse art that cannot be formalized in the same manner as other development steps. This view is expressed in the classic waterfall model of software development by its unusual silence about an activity that can consume over fifty percent of the development effort [83]. Still, in spite of this general disregard, some researchers have attempted to characterize the debugging process. A brief summary of that work is presented here.

Early work on the theories of debugging attempted to characterize the steps or sequences a programmer took while debugging his programs. Gould [35] suggested that debugging was an iterative process of selecting the appropriate debugging tactic (tactics range from looking for common syntactic errors to perusing “suspicious” code) and using this tactic to find clues about the error. If the tactic unearthed sufficient clues to solve the problem, then the programmer would locate the bug and correct the error. If the selected tactic revealed no clues, or the clues were insufficient to solve the problem, then the programmer would select a new tactic, possibly based on former clues, and repeat the process. Gould argued that debugging tactics were learned primarily through experience.

Shapiro’s goal [74] was to provide more algorithmic structure to the debugging problem in an attempt to automate as much of the process as possible. He presented the debugging process in the following algorithm:

Read the program P to be debugged
Repeat
    Apply a test case
    While P is found to have an error
        Diagnose the error
        Correct the error
    Endwhile
Until there are no more test cases to apply.

Figure 1: Shapiro’s Debugging Algorithm

Shapiro went on to demonstrate diagnosis and correction algorithms that, with the aid of the user, could pinpoint and correct a subset of error types in PROLOG.
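The control structure of Figure 1 can be sketched in a few lines of Python. This is only an illustration of the loop, not Shapiro's PROLOG diagnosis or correction algorithms. The names `debug`, `has_error`, `diagnose`, and `correct`, and the toy lookup-table "program", are assumptions introduced here.

```python
def debug(program, test_cases, has_error, diagnose, correct):
    """Run the loop of Figure 1 and return the corrected program."""
    for case in test_cases:                    # Repeat: apply a test case
        while has_error(program, case):        # While P is found to have an error
            fault = diagnose(program, case)    #   Diagnose the error
            program = correct(program, fault)  #   Correct the error
    return program                             # Until no test cases remain


# Toy stand-ins (assumptions): the "program" is a lookup table meant to
# hold absolute values, and a test case is an (input, expected) pair.
def has_error(table, case):
    value, expected = case
    return table.get(value) != expected


def diagnose(table, case):
    # In this toy setting the faulty entry is the one the test exercised.
    return case


def correct(table, fault):
    value, expected = fault
    repaired = dict(table)   # copy, so the original table is left untouched
    repaired[value] = expected
    return repaired


buggy = {1: 1, -2: -2, 3: -3}
cases = [(1, 1), (-2, 2), (3, 3)]
fixed = debug(buggy, cases, has_error, diagnose, correct)
print(fixed)
```

Like the algorithm as stated, the sketch does not re-apply earlier test cases after a correction, so a fix that breaks a previously passing case would go unnoticed.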

Vessey [86] used a similar model for the debugging process, noting that expert programmers tended to exhibit more “system thinking” than novices during diagnosis of the error. System thinking is the term Vessey used to describe a programmer’s tendency to obtain a comprehensive understanding of the program while debugging. Gugerty [39] also suggested that both novices and experts attempted to comprehend the overall program purpose, and argued that experts were more adept at doing so. Reflecting views similar to Gould’s, Gugerty noted that expert programmers, with their larger bag-of-tricks, were better at applying a debugging strategy which he dubbed “symptomatic search.” According to Gugerty, programmers employ symptomatic search when they debug on the premise that the current fault has characteristics of some familiar fault type.

Citing Gould [35], Weiser [87] suggested that programmers commonly worked backward from the fault location to the error. Using this thesis, Weiser introduced program slicing as a debugging technique. Katz and Anderson [48], using a debugging model similar to Shapiro’s, supported the notion that a programmer might move forward or backward through the code depending on whether the debugging strategy was one of comprehension or causal reasoning. Nanja and Cook [64] state that debugging strategies can be typified in two approaches, the comprehension and the isolation approaches. Their definition of comprehension is like that of Gugerty and also similar to Vessey’s system thinking. The isolation approach would be an amalgam of symptomatic search, Weiser’s working backwards, and Katz and Anderson’s causal reasoning.

Atwood and Ramsey [6] opted for a different approach to understanding the debugging problem. They developed a theoretical framework to explain human comprehension of computer programs. They then used this model to explain programmer behavior during debugging. In particular, they suggested that a bug’s location in the program’s nesting structure was the prime determinant of the effort needed to correct the bug. However, citing [75], Myers [61] observed that Atwood and Ramsey’s experimental design might have been flawed and not generalizable to all programs. More recently, Araki, et al., [4] introduced a more general model of the debugging process. This model is intended to encompass all previous models, including that of Shapiro and its variants. More importantly, however, this new model identifies three classes of support tools that assist the programmer through the debugging task. These classes are tools to support verification of debugging-related hypotheses about the program, tools to manage the set of hypotheses, and tools to assist in selecting alternate hypotheses for further consideration.

1.2.3 Debugging Practice

While some researchers seek to improve debugging effectiveness in experimental settings, others offer insight on techniques found useful in practice. The most popular debugging technique is to work backwards from the observed failure, attempting to reconcile the differences between the actual program behavior and the expected program behavior [30]. Very often the search for bugs takes place in an atmosphere of impatience; no one likes errors in his programs [14]. Thus, it seems reasonable that in order to increase debugging effectiveness, we must discover procedures that help the programmer work backwards to find the error without inducing heavy delays. This observation has led many authors to suggest preplanning for debugging. Brooks urges programmers to “resist the temptation” of optimism that assumes there will be no bugs [13]. He advises programmers to build “scaffolding,” debugging-specific programs and data, in conjunction with the development of the program. Although other authors are not quite so dismal in their predictions, they continue to promote various types of pro-active debugging tips, ranging from desk-checking just before the program is executed to using assertions at strategic locations in the code [13, 30, 42, 83].

The most common advice to programmers is to include debugging statements as the program is written. Lauesen, Ledgard, Levine, and Van Tassel [52, 53, 54, 83] all recommend early insertion of diagnostic output statements. Lauesen proposes using these statements to output the current value of key variables; Van Tassel argues that they should show control flow while remaining brief and selective. Both also discuss control methods for switching debugging output on and off. In his remarks on this point, Dunn [30] states:

Moreover, when inserting diagnostic probes [statements] in their programs, programmers should start by getting information at a level as close to the application as possible. ... Observing more problem-related operations (e.g., seeing which record the pointer is at, not how the pointer was computed) permits the programmer greater opportunity to apply intuition—the most important debugging asset available.

Considering how frequently preplanning for debugging, especially by providing diagnostic output, is advocated and how most programs initially contain errors, it is surprising that most programmers do not follow these recommendations. Programmers are optimistic, in this case to a fault. We have not yet heeded Brooks’s advice and resisted the temptation to assume that: “Perhaps there are no bugs.”
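The recommendations in this section (diagnostic statements inserted as the program is written, brief and selective, reported at the application level, and switchable on and off) might look like the following minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the `DEBUG` switch table, the `probe` helper, and the `total_balance` example are not taken from any of the cited authors.

```python
import sys

# Hypothetical switches: one per category of diagnostic output.
DEBUG = {"values": True, "flow": False}


def probe(kind, message, sink=None):
    """Emit a diagnostic line only if its category is switched on."""
    if not DEBUG.get(kind, False):
        return
    line = f"[{kind}] {message}"
    if sink is None:
        print(line, file=sys.stderr)  # default: keep probes out of stdout
    else:
        sink(line)                    # e.g. a list's append method


def total_balance(records, sink=None):
    """Sum the balances, reporting progress in application terms."""
    probe("flow", "enter total_balance", sink)
    total = 0
    for index, record in enumerate(records):
        total += record["balance"]
        # Report which record is being processed, not how it was located.
        probe("values",
              f"record {index} ({record['name']}): running total = {total}",
              sink)
    probe("flow", "leave total_balance", sink)
    return total
```

With `DEBUG["flow"]` off and `DEBUG["values"]` on, a call produces one line per record and nothing about control flow. Flipping the switches changes the output without touching the code being debugged.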

1.2.4 Debugger Technology

Generally, today’s windowed debuggers provide good visibility of program control flow and control of debugger functions. It is now quite easy to set breakpoints, identify the next source line to be executed, or see the execution in context. Windowed debuggers have also improved on methods to display pointers and organize the output of variable data. These improvements are welcome; they make debuggers easier to use. However, they do not, to any great degree, extend the programmer’s ability to debug his code at higher abstraction levels. The basic functions of the debugger have evolved little since the sixties. Researchers, recognizing this limitation, have recently begun to propose and implement new functions for debuggers to better equip them as tools for debugging high-level languages. We conclude Section 1.2 by presenting preliminary results on abstracting the breakpoint mechanism and providing more flexible execution control.

A breakpoint is a location in the program, usually identified by or relative to a source statement, where the user wishes control to return to the debugger, so the program state at that location may be queried [32]. From the mid-sixties only minor changes have been made to breakpoints. The most notable of these is to make breakpoints conditional or to allow recording of program information at these points without a full cessation in program execution [66]. According to Olsson, et al., [66] these enhancements don’t go far enough; they don’t allow the user to correlate logically related breakpoints into more abstract occurrences. Olsson et al. propose replacing traditional breakpoints with events. Their experimentation vehicle is a debugger called Dalek.

An event mimics a breakpoint by being explicitly raised at the breakpoint. Such an event is called a primitive event. Higher level events are constructed by logically combining primitive events. Events can also execute Dalek language procedures when raised. Event combination allows more abstract reasoning about the course of execution in a program. For example, suppose a certain sequence of procedure calls constitutes some application task. Monitoring the completion of each procedure with breakpoints may not readily reveal when the task is complete, as the sequence may be incomplete or out of order. However, by raising primitive events upon completion of each procedure, with the constraint that they are raised in order, a user can quickly determine when the task is complete. The authors claim their event mechanism makes debugging more abstract and requires much less work of the user during an extended debugging session than when debugging the same problem with traditional breakpoints.
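The idea of combining primitive events into a higher-level, ordered event can be sketched as follows. This is not Dalek's event language or its API. The `SequenceEvent` class, its `notify` method, and the reset-on-out-of-order policy are assumptions made only to illustrate the ordering constraint described above.

```python
class SequenceEvent:
    """High-level event raised once its primitive events occur in order."""

    def __init__(self, name, expected):
        if not expected:
            raise ValueError("expected must name at least one primitive event")
        self.name = name
        self.expected = list(expected)
        self.position = 0      # index of the next primitive event awaited
        self.raised = False

    def notify(self, primitive):
        """Record one primitive event; return True when the sequence completes."""
        if self.raised:
            return False
        if primitive == self.expected[self.position]:
            self.position += 1
        elif primitive in self.expected:
            # Out of order: start over (assumed policy), keeping the event
            # if it happens to be the first one in the sequence.
            self.position = 1 if primitive == self.expected[0] else 0
        # Primitive events that are not part of this sequence are ignored.
        if self.position == len(self.expected):
            self.raised = True
            return True
        return False


# Each string stands for a primitive event raised when a procedure completes.
task = SequenceEvent("file_task", ["open", "read", "close"])
results = [task.notify(p) for p in ["open", "close", "open", "read", "close"]]
print(results)
```

In the usage lines, the early `close` arrives out of order, so the high-level event is raised only after the later complete `open`, `read`, `close` run.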

Agrawal, et al., [2] report on recent work to improve a debugger by extending it with an execution-backtracking facility. As discussed in Section 1.2.3, a programmer’s most common debugging technique is to work backwards from the fault location seeking the program error. While others have also shown reversible program execution to be useful for debugging [9, 58, 84, 88], all earlier attempts at providing reversible execution assumed the availability of unbounded storage. To avoid this limitation, Agrawal introduces the notion of structured backtracking. Structured backtracking restricts reversible execution, disallowing some backtracking. For example, a user cannot backtrack the program from outside a loop to inside the same loop. The authors argue that the restrictions on reverse execution under structured backtracking do not greatly limit its usefulness, and it removes the apparent requirement for unbounded storage. This makes structured backtracking reasonably space-efficient. Agrawal implemented structured backtracking in a prototype debugger called Spyder. Spyder resembles Sun Microsystems’ dbxtool, and debugs programs written in ANSI C.

Attempts to extend the debugger are not limited to the activities discussed above. Researchers are also investigating how to make better debugging tools for parallel, real-time, and embedded programs [33, 59]; how to integrate Weiser’s slicing methods into a debugging tool [2]; and how to use computer graphics to better visualize the program state while debugging [76]. Our interest, and the focus of this dissertation, falls into this third area.

1.3 Debugging Graphical Programs

Visualizing a program’s state is a difficult task. Dijkstra remarked in [26] that “our powers to visualize processes evolving in time are relatively poorly developed.” To compensate for this limitation, he argued for structured program control statements in order to “make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.” However, representing the current state by correspondence to the program’s text or by textual data values is not always appropriate or effective. For example, the explanation of how to insert a node into a linked list is made simpler by representing the link pointers as arrows rather than as text. Consider also how much easier it is to quickly assess the state of available memory when the current allocations are shown in blocks as in Figure 2(a) rather than as in 2(b).

(a) [Block diagram: a memory axis marked at 0 K, 10 K, 27 K, 37 K, and 50 K, with blocks A1 (10 K), A2 (10 K), A3 (5 K), and A4 (9 K).]

(b) A1 = C800 (10 K)
    A2 = 9400 (10 K)
    A3 = 6C00 (5 K)
    A4 = 2800 (9 K)

Figure 2: Memory Allocations Shown as Blocks (a) and as Text (b)
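The contrast between the two panels of Figure 2 can be approximated even in plain text. The sketch below takes the four allocations listed in panel (b) and prints them both as a textual listing and as crude proportional bars ordered by address. The helper names `as_text` and `as_bars` are assumptions. The sketch does not attempt to reproduce the exact layout of panel (a), because the extent each block occupies relative to its address is not recoverable here.

```python
# Allocations as listed in Figure 2(b): name, hexadecimal address, size in K.
ALLOCATIONS = [
    ("A1", "C800", 10),
    ("A2", "9400", 10),
    ("A3", "6C00", 5),
    ("A4", "2800", 9),
]


def as_text(allocs):
    """Return the listing in the textual style of panel (b)."""
    return [f"{name} = {addr} ({size} K)" for name, addr, size in allocs]


def as_bars(allocs):
    """Return one bar per allocation, highest address first.

    Each '#' stands for 1 K, so relative sizes are visible at a glance.
    """
    rows = []
    ordered = sorted(allocs, key=lambda a: int(a[1], 16), reverse=True)
    for name, addr, size in ordered:
        addr_k = int(addr, 16) // 1024   # convert the hex address to K
        rows.append(f"{addr_k:>3} K {name} " + "#" * size)
    return rows


for line in as_text(ALLOCATIONS):
    print(line)
for line in as_bars(ALLOCATIONS):
    print(line)
```

Even this rough rendering shows at a glance that A3 is about half the size of A1 and A2, which the hexadecimal listing conveys only after some mental arithmetic.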

Using pictures to visualize the current data states of a graphically-oriented program, such as a ray tracer, is a natural idea. Many of the data structures in these types of programs already have an inherent graphical depiction, because the objects they represent are graphical. These data structures, like pointer variables, are usually better understood as pictures, because the pictorial display explicitly reveals geometric properties that are hidden when displayed as text.

While it is the intent of our research to increase the debugger’s ability to visualize all types of data structures, we are nonetheless influenced by the need to support the debugging of graphical applications. Indeed, as we shall see in the examples of Sections 2.1 and 2.2, the attention paid to visualizing data structures in graphics programs has been lacking, yet improving debugger data visualization for these programs can have an immediate and significant impact on the ease with which these types of programs are debugged.

1.4 The Thesis

As we have discussed in Sections 1.2.1 and 1.2.4, current debuggers fail to provide sufficient abstraction capabilities for debugging high-level languages. This premise is supported in the literature, as well as by ongoing research activities. For example, Agrawal, Olsson, and others [4, 33, 69] are working to provide more abstract execution control, better control mechanisms, and debuggers for parallel computations. The abstract display of program variables for debugging also needs to be addressed. Araki, et al., [4] observe:

... debugging tools are not yet effective because they do not provide enough abstraction to represent and retrieve information at the specification and computation-model level.

Our contention that such a debugging tool would be more effective and that its potential is sufficiently strong to warrant investigation is supported by [4, 50, 59, 61, 76]. Therefore, our thesis is:

Any comprehensive attempt to raise the debugger’s ability to handle higher levels of abstraction will require a concomitant commitment to displaying user-defined data structures as abstract objects. Furthermore, the methods for data selection and display must be sufficiently simple to encourage programmers to use them in the hurried atmosphere of debugging. As a consequence, these methods should also support user-developed “print” procedures for pro-active debugging, but must not depend on them. Finally, such a capability can be demonstrated as an extension to existing debugger technology.

As we will see in Chapter Two, a few others have attempted to create graphic abstractions for debugger data; however, these methods have proved too inflexible and cumbersome in the frenzied atmosphere of the debugging environment. This work remedies that situation.

1.5 Outline of the Dissertation

The rest of this dissertation is outlined as follows. Chapter Two presents related work. It discusses how the complexity of software has motivated programmers to use visual methods to develop and understand their programs. It also introduces visual debugging and outlines the history of debuggers which have promoted graphically enhanced displays for their data. Chapter Two concludes by summarizing the best and worst features of these forerunners and offering a new list of desirable characteristics for a visual debugger.

In Chapter Three we introduce the Data Visualizer (DV), our research platform for work on debugger data visualization, and give a brief overview of how the current implementation of DV extends the standard UNIX¹ debugger xdbx. This chapter also introduces DV’s two-step philosophy for visualizing debugger data: data selection and data display. Additionally, we present a simple example of a typical DV debugging session to familiarize the reader with DV’s capabilities. Chapter Three concludes with an overview of DV’s architecture and supporting environment.

How to select debugger data for analysis is the topic of Chapter Four. Here we examine the importance of reducing the amount of data that the programmer must consider in order to manage the problem effectively. This chapter also addresses previous visual debuggers’ techniques for abstracting data structures. These abstraction techniques are then extended and combined in DV’s selection definition system to produce a more powerful mechanism for selecting debugging data. In our discussion of the selection definition system, we present new methods for identifying interesting debugger data that offer solutions for handling pointer data, multi-purpose data structures, and data structures that are too complex for definitional mechanisms.

Chapter Five presents the second step of our data visualization philosophy. It begins by discussing DV’s outlook on data display in the light of previous methods for illustrating data. Next, it examines techniques for managing the displayed data to aid programmers in building abstract visualizations. This chapter also presents DV’s data display objects in more detail than in Chapter Three. Finally, we return to Chapter Three’s debugging example to better illustrate the features of DV’s display objects.

Chapter Six summarizes this work. In this chapter, we evaluate DV against the characteristics set forth earlier in Chapter Two. We also note DV’s contributions to visual debugging and make recommendations for future work. Finally, we conclude with our overall assessment of DV and the future of visual debugging based on our experience with DV.

¹UNIX is a trademark of AT&T.

CHAPTER II

Related Work

“We just don’t talk on the same level.” This might be how a teenage boy would explain why he and his parents don’t communicate. It might also be a programmer’s reply when he is asked about the problems with his most recent software project. Programming is difficult, because men and machines don’t speak the same language. Efforts to bridge this gap started with the development of assembly language, and it is certain that the crusade to find easier ways to communicate with the computer will continue for many years to come. Even if only professionally-trained computer programmers interacted with computers, it is doubtful that conventional, high-level programming languages would be satisfactory as the only medium for communicating with computers. The popularity of the windowed interface with its “point-and-click” interface paradigm indicates that graphical interaction is preferred over text-based interaction.

Furthermore, the low-cost personal computer has made computing facilities widely available to users who are not programmers. These users arrive with high expectations concerning the usefulness of computers and their applications. In order to meet the varied needs of many users, applications provide options to configure their operation, but users who want, and expect, capabilities beyond the available options soon find themselves needing to program. For these users, learning some programming language may require time or skills which they do not have [36, 78]. Visual programming attempts to bring the power of programming to users who are, by profession, non-programmers.

Applying graphics to programming has other advantages for both the computer scientist and the non-computer professional. Graphic descriptions and representations of programs convey higher-level abstractions by de-emphasizing syntax issues [62]. Higher-level abstractions, in turn, assist program understanding. By illustrating ideas with pictures of familiar, real-world objects, the ideas can be made easier to think about, understand, and remember [72, 78]. In his survey on graphics in programming, Raeder [72] lists these other advantages which pictures enjoy when compared with text representations:

• Our eyes give us “random access” to picture details, whereas we scan text sequentially.

• Graphics are multidimensional with a richer language of shapes, colors, texture, animation, etc.

• Pictures convey information to the “reader” faster than text, because of our sensory system’s bent toward real-time image processing.

• Since pictures are natural abstractions of real objects, they provide useful metaphors for illustrating ideas; words must attempt to develop the metaphor in the reader’s mind.

For these reasons, and others, the use of graphics to describe, create, and explain programs has a prolific heritage in computer science. Indeed, the benefit of pictorial representations for programs is widely recognized and endorsed by the multitude of graphical methods which exist to aid programmers in software decomposition, design, and implementation. Data flow diagrams, structure charts and flow charts are just a few examples of these methods.

2.1 Visual Methods in Programming

Although its name suggests an intuitive meaning, there is no clear-cut definition for visual programming. Grafton informally defined visual programming by stating that it involved three primary areas of research: techniques to graphically illustrate software, graphic-based programming, and program animation [37]. Myers offered a more precise definition of visual programming (VP) later in [62]. According to Myers, VP “refers to any system that allows the user to specify a program in a two (or more) dimensional fashion.” This definition does not exclude non-graphical languages. For example, program text specified as two dimensional tables or forms would qualify as VP. However, most VP systems use graphically-based languages. Shu provided a third definition for visual programming in [78]. She used the term visual programming to mean “the use of meaningful graphical representations in the process of programming.” This definition, which is much broader than Myers’ and more like Grafton’s, causes the term to lose some of its intuitive notion that programming should have something to do with directing the computer in an algorithmic fashion to take an action. To avoid confusion, we use the term visual methods to replace the Shu-Grafton definition for visual programming and reserve VP for Myers’ definition.

2.1.1 Taxonomies of Visual Methods

In [62], Myers presented two taxonomies to categorize visual methods. His first major division is on whether or not a system provides program visualization (PV). Program visualization is the use of graphics “to illustrate some aspect of the program or its execution behavior.” Myers justifies this division by noting that VP supports program creation, and PV uses graphics to illustrate programs after they are created. Thus VP and PV have entirely different goals, though they both use graphics. Myers classifies PV systems by whether they visualize a program’s code or data and by whether the visualization systems produce static or dynamic pictures.

To distinguish VP systems, Myers uses two orthogonal criteria: whether or not the VP systems are also programming-by-example systems, and whether the VP systems use interpreted (interactive) languages or compiled (batch) languages. Programming-by-example (PBE) applies to systems which either can mimic the actions of a user’s example (similar to a recording program) or can automatically create a program by inferring its structure based on example input/output pairs. Most PBE systems are interpreted VP systems, but the reverse is not true.

Shu also divides visual methods into two classes: visual environments and visual languages (see Figure 3). According to Shu, in visual environments humans “interact with the computer where showing is the primary means of communication.” In contrast, visual languages are used to tell the computer what to do. Shu includes both languages which manipulate visual information and languages which simplify communication for visual interactions in her visual language class, although these languages are themselves textual. The visual environment class is also further divided based on whether the user manipulates graphic objects to show the computer actions to be mimicked (visual coaching), or whether the computer uses graphics to show the program’s data, its code or execution state, or its design description.

[Figure 3 diagram: Visual Programming divides into Visual Environment and Visual Languages. Visual Environment covers Visual Coaching and Visualization of Data or Information About Data, Program and/or Execution, and Software Design. Visual Languages are for Handling Visual Information, Supporting Visual Interactions, and Programming with Visual Expressions; the last comprises the Visual Programming Languages: Diagrammatic Systems, Iconic Systems, and Form Systems.]

Copyright International Business Machines Corporation. Reprinted with permission from IBM Systems Journal, 26(4).

Figure 3: Shu’s Taxonomy of Visual Methods

Because Shu’s taxonomy includes textual languages which handle visual information and support visual interactions, her visual language class is a superset of Myers’ VP (see Figure 4). Her visual environment category is similar to PV, with one curious exception. Shu disregards visual coaching as a language, because it uses examples to show the computer what actions to perform. She describes this method as “do as I show you.” We think it is more appropriate to emphasize the doing (i.e., telling) rather than the showing of this method. Such an emphasis acknowledges Myers’ interactive PBE systems as visual languages and puts these taxonomies in better agreement. Figures 4 and 5 show the relationships between these two taxonomies using this new emphasis. In these figures the rectangles’ rows and columns are labeled with Myers’ orthogonal criteria for distinguishing VP and PV systems; Shu’s categories appear in the rectangles. Thus, for example, Shu’s category for visualizing software designs is similar to Myers’ PV systems for statically visualizing program source code.

[Figure 4 grid: columns Batch and Interactive; rows PBE Systems and non-PBE Systems; entries Visual Coaching, Diagrammatic and Form Systems, Diagrammatic and Iconic Systems.]

Figure 4: A Comparison of Myers’ and Shu’s Taxonomies for VP Systems

[Figure 5 grid: columns Static and Dynamic; rows Code and Data; entries Software Design, Software Design and Program and/or Execution, Data or Information About Data, Program and/or Execution.]

Figure 5: A Comparison of Myers’ and Shu’s Taxonomies for PV Systems

In the next two sections we present some VP and PV systems as general examples of systems in these fields. These sections primarily use Myers’ terminology; however, in most cases VP may be replaced with visual language or visual programming language, and PV may be replaced with visual environment.

2.1.2 Visual Programming

Although visual programming is considered a recent research area¹, it has a surprisingly deep heritage. Sutherland’s Graphical Program Editor, an early VP system whose program pictures resembled hardware diagrams, demonstrates the desire for programmers to use graphical program descriptions during the assembly-language heyday [81]. Other early VP systems include programs which could “compile” flowcharts and the graphical language AMBIT/G [61, 62]. Modern VP systems rely more heavily on interactive graphics to specify the program. Two examples of these types of systems are the Programming with Interactive Graphical Support (PIGS) System and Piet.

¹Glinert has suggested that this field is only in its infancy [34].

PIGS [70] represents the more modest types of VP systems, those systems which extend conventional programming languages with graphical environments. The main feature of the PIGS environment is a graphical editor for interactively manipulating Nassi-Shneiderman diagrams (NSD). PIGS uses interpreted PASCAL as the underlying programming language. Users construct programs by combining NSD elements for simple statements, loops, and conditionals with the PIGS editor. This editor also controls input of the NSD elements’ action text (see Figure 6).

HEADER BLOCK OF NSD: SORT
VAR A : ARRAY [1..10] OF INTEGER;
    N, I, CHANGE, TEMP : INTEGER;
READ(N)
DO FOR I := 1 TO N
    READ(A[I])
DO UNTIL CHANGE = 0
    CHANGE := 0
    DO FOR I := 1 TO N - 1
        IF A[I] < A[I+1]
            TEMP := A[I+1]; A[I+1] := A[I]; A[I] := TEMP; CHANGE := 1;
DO FOR I := 1 TO N
    WRITE(A[I])

Reprinted by permission of John Wiley & Sons Limited

Figure 6: A PIGS Program for Sorting N Numbers in Descending Order
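The control structure of the NSD in Figure 6 can also be transcribed into ordinary code. The sketch below is a Python paraphrase of that diagram, not PIGS output; the function name sort_descending is an assumption, and the input is passed as a list rather than read with READ.

```python
def sort_descending(a):
    """Exchange sort mirroring the NSD: repeat passes until no swap occurs."""
    a = list(a)                      # work on a copy, playing the role of array A
    n = len(a)
    while True:                      # DO UNTIL CHANGE = 0
        change = 0
        for i in range(n - 1):       # DO FOR I := 1 TO N - 1
            if a[i] < a[i + 1]:      # IF A[I] < A[I+1]
                a[i], a[i + 1] = a[i + 1], a[i]   # swap via TEMP in the original
                change = 1
        if change == 0:
            break
    return a


print(sort_descending([3, 9, 1, 7]))
```

The nesting of the loops and the conditional in this text form corresponds to the nesting of boxes in the diagram, which is the structure PIGS lets the user manipulate graphically.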

Pong [70] cites several advantages that the PIGS approach has over conventional programming. First, PIGS enforces structured programming. Second, PIGS executes the NSD program directly, so there is no discrepancy between the program and its documentation, the NSD flow diagram. Third, by highlighting the NSD diagram to show an executing program’s control flow, PIGS helps the user understand the dynamic process behavior better. Finally, pseudo-code or natural language can replace the PASCAL text in the NSD chart. Although the “program” would not be executable, it would graphically specify the program’s high-level control structure, promoting stepwise refinement.

Piet is an example of VP systems that attempt to support programming using only graphical interactions. Indeed, Glinert [34] claims that users never need to return to the keyboard once Piet is loaded. However, such a complete departure from traditional program text editing does not come without cost. To achieve keyboardless programming, Piet severely restricts the number and type of variables; Piet only allows four six-digit, integer variables in a procedure. Rather than giving these variables text names, Piet denotes them as red, blue, green, and orange rectangles.

The metaphor for Piet programs is the flow chart. To create a program with Piet, a user selects icons for the basic, built-in operations from a menu and organizes them on the program editing easel. He then specifies the program control structure by connecting the icon instances as appropriate. An entire Piet procedure can be associated with an icon. This allows the procedure to be called recursively and from other procedures.

Because of its similarity to flow charts, Piet does a good job of illustrating the program’s control flow. Additionally, allowing the user to graphically “handle” the operations and procedures via icons helped novice programmers understand a program more clearly than its corresponding PASCAL version. These same novices also preferred using Piet over PASCAL², but more experienced programmers found Piet too restrictive and awkward. Glinert defends Piet as a prototype language and suggests that because expert programmers have “on the whole, overcome the difficulty associated with learning to program,” they probably find Piet too rudimentary for their needs. Probably the most serious argument against Piet is that there is no clear answer for removing its restrictions on variables.

The preceding systems represent only a small part of the work in VP. Additional background on visual programming and the influence of visual technology on the evolution of programming and language environments can be found in [3, 22, 23, 44, 77].

2.1.3 Program Visualization

Many times after having translated his task into a suitable programming language, the programmer finds that his efforts with the program are just beginning. The program may need to be debugged, to have its algorithms analyzed, to have enhancements added, or to be tuned for performance. These are some of the activities for which program visualization is useful.

One of the earliest visualizations for programs, albeit manually created, was the flow chart. Flow charts assisted programmers in seeing the organization and control flow of the program and in documenting its purpose. The problem with manually created charts, however, was that they rarely accurately represented the program they visualized [51]. Thus, many systems began to appear for automatically generating flow charts from source code [31, 38, 40, 51]. Graphical computer aided software engineering (CASE) tools are the result of continued work in this area.

²The favorable response to the Piet system was probably inflated, because of its novelty and the presence of the designer during questioning [34].

Program visualization for other purposes (e.g. debugging, algorithm analysis, performance modeling), like the term itself, didn’t “catch on” until more recently. In fact, while CASE-like tools are historically considered PV systems, generally the term is meant to apply to the more recent applications. We conclude this section by presenting the Brown ALgorithm Simulator and Animator (BALSA), an ambitious PV system for visualizing algorithms in execution.

BALSA was developed with the thesis that “it is possible to expose the fundamental characteristics of ... programs through the use of dynamic (real-time) graphic displays ...” [18]. Its developers targeted BALSA towards educational uses, undergraduate instruction and algorithm research, but noted that algorithm animation would also be useful for program debugging. BALSA was motivated, in part, by the sorting animations by Baecker [8], but whereas Baecker’s animations were generated from snapshots of still computer images, BALSA animations were produced in real-time on the computer’s graphics terminal.

Four types of people use the BALSA system: the user, the scriptwriter, the algorithm designer, and the animator. The user is synonymous with the student or the watcher of the animation. The scriptwriter choreographs the animation for playback as part of a lecture. Generally, the scriptwriter is associated with an educator. The algorithm designer and animator are the programmers who provide the means to animate the algorithm.


Reprinted with permission

Figure 7: BALSA Image for Breadth-First Traversal Animation

To create an animation, the designer begins with a “clean” algorithm to which he adds calls to BALSA to notify it of interesting events. The designer also replaces any I/O routines with BALSA I/O routines; these allow BALSA to control I/O and the displayed graphics in a uniform manner. The animator writes software to maintain an image which responds to the interesting events. This combination of the image and its animating control software is called a view. Views respond to interesting events as defined by the animator. In addition, views provide facilities for reverse execution and user inquiries about the image. A typical animation takes approximately 20 hours to program, and an additional one or two hours to script [19]. Figure 7 is a still frame example of a BALSA animation for a breadth-first traversal of an undirected graph. The image represents the graph’s adjacency matrix. It illustrates the order of node traversal by changing the shape, color, and fill-style of the nodes as they are discovered and visited.
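The division of labor between the algorithm designer and the animator can be sketched in a few lines. The Python fragment below is an illustration only and does not reproduce BALSA's actual interface: the notify callback, the event names "discover" and "visit", and the MatrixView class are all hypothetical. The traversal plays the designer's part (a clean algorithm plus event calls), and the view plays the animator's part (state that reacts to events).

```python
from collections import deque


class MatrixView:
    """Hypothetical view: keeps a per-node display state and a visit log."""

    def __init__(self, nodes):
        self.state = {n: "unseen" for n in nodes}
        self.order = []

    def handle(self, event, node):
        # The view decides how each interesting event changes the image.
        if event == "discover":
            self.state[node] = "discovered"
        elif event == "visit":
            self.state[node] = "visited"
            self.order.append(node)


def bfs(graph, start, notify):
    """Breadth-first traversal annotated with interesting-event calls."""
    seen = {start}
    queue = deque([start])
    notify("discover", start)
    while queue:
        node = queue.popleft()
        notify("visit", node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                notify("discover", neighbor)
                queue.append(neighbor)


# Hypothetical four-node undirected graph given as adjacency lists.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
view = MatrixView(graph)
bfs(graph, "A", view.handle)
print(view.order)
```

Because the algorithm only reports events, several different views could be attached to the same traversal without changing it, which is the separation the description above attributes to BALSA.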

BALSA animations are a unique medium for instruction; Brown likened them to a dynamic computer science textbook [19]. The experience of the BALSA project demonstrates the effectiveness of graphics and animation to illustrate the complex dynamics of a program’s control flow and data structures. While Brown never did specifically investigate program debugging using BALSA’s graphics, he consistently argued for using graphics images in debugging [16, 18]. Indeed, he admitted that “many of our displays involve using graphical cues for ‘state’ information about atomic pieces of data structures” [19]. Recent work by Brown in algorithm animation can also be found in [17].

2.1.4 Visual Debugging

Displaying the program state is an important part of debugging. Section 1.2.3 noted that the most common debugging technique is to attempt to reconcile the differences between the actual and expected program behaviors, and that usually this technique involved examining diagnostic output of variable values. Dunn recommended abstracting the output information to enhance its usefulness [30].

Visual debugging strives to provide these capabilities to programmers. More specifically, the term visual debugging refers to any attempt at graphically enhancing program state data for the primary purpose of debugging. BALSA is not a visual debugger. While it may be used for debugging, its primary role is to illustrate programs which are already correct. Furthermore, the PV systems discussed previously (i.e., CASE tools and flow chart generators) are not visual debugging tools, since they also do not directly support debugging. A survey of systems which do support visual debugging follows in the next section.

2.2 Visually-Oriented Debugging Tools

Visual debugging tools are part of the family of PV systems according to the taxonomies of Shu and Myers. Shu identified the visual debuggers PROVIDE and VIPS as visual environments for visualizing program execution states. Myers classified the visual debugger Incense as a PV system for illustrating static data structures. At first it might seem as though Shu and Myers are again at odds over their classification scheme, with one calling visual debuggers dynamic (execution) tools and the other calling them static tools. This is not the case. Visual debuggers are execution tools, because they display run-time program states. However, many of these tools, including Incense, statically display the variable states at breakpoints.

Many authors have contributed to the body of literature on visual debugging. Nevertheless, to avoid an overly lengthy and somewhat redundant presentation of the subject, we mainly limit our discussion to the more often cited authors in this area. We also include a short section on the current status of visual debugging with commercial debuggers to give the reader an idea of how visual methods are working into state-of-the-art debugging environments on workstations and personal computers. Readers interested in pursuing the background of this subject further may refer to additional articles in the bibliography [1, 7, 12, 15, 20, 63, 65].

2.2.1 EXDAMS (1969)

Possibly the first debugging system advocating the use of graphical formats was the Extendible Debugging and Monitoring System (EXDAMS) [61]. One main design goal for EXDAMS was to create “an extendible facility to which new debugging and monitoring aids could be added easily” [9]. To achieve this goal, EXDAMS instruments the original source code, in PL/1, with debugging statements to record the program’s execution history at run-time. It also builds a symbolic model of the program in a random-access file. The history records dynamic information to answer queries about execution-time states. The program model enables interpretation of the execution history at the source language level. An information retrieval routine controls all access to the history file; it also presents a uniform interface for the monitoring and debugging aids.

Although Balzer’s intent was for others to provide debugging aids to the basic EXDAMS environment, he nevertheless provided some of his own static and dynamic debugging aids. Several of his aids are notable. Flowback analysis, which creates an inverted tree-like graph to show how a given variable’s value is derived (see Figure 8), is a remarkable example of the early use of graphics to display debugging information à la Weiser’s slicing. The source code aid shows the current source line highlighted in context. This visual cue was carried over into nearly every other visual debugger to follow. The windows aid assigns variables to a screen area where they are updated as the user moves through execution time. Using this display, variables that index arrays can be displayed graphically as arrows, a feature foreshadowing the method subsequent debuggers would use to draw pointers.

Figure 8: A Flowback Analysis Graph
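The idea behind flowback analysis can be sketched with a recorded history of assignments. The Python fragment below is a minimal illustration under stated assumptions, not a reconstruction of EXDAMS: the record and flowback functions, the dictionary layout of a history entry, and the variable names are all hypothetical.

```python
def record(history, target, value, sources=()):
    """Append one assignment event: target := value, computed from sources."""
    history.append({"target": target, "value": value, "sources": tuple(sources)})


def flowback(history, variable, upto=None):
    """Return a nested (variable, value, children) tree for `variable`.

    Searches backward from event index `upto` (exclusive) for the most recent
    assignment to `variable`, then recurses on the variables it was computed
    from, looking only at events that happened before that assignment.
    """
    if upto is None:
        upto = len(history)
    for index in range(upto - 1, -1, -1):
        event = history[index]
        if event["target"] == variable:
            children = [flowback(history, s, index) for s in event["sources"]]
            return (variable, event["value"], children)
    return (variable, None, [])  # no recorded assignment


# Hypothetical three-statement execution history.
history = []
record(history, "a", 21)
record(history, "b", 7)
record(history, "total", 21 + 7, sources=("a", "b"))
print(flowback(history, "total"))
```

The nested result is the inverted tree in textual form: the queried variable is the root, and each level below it lists the values from which the level above was derived. Keeping the whole history is what makes the query possible, and it is also the cost discussed next.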

However, in spite of its many farsighted ideas, EXDAMS suffered from some undesirable characteristics which limited its practicality as a general debugging tool. First, EXDAMS’s requirement for a full execution history made it unusable for large or long-running programs. Additionally, generation of the history required modification of the source, which could possibly introduce or obscure some bug types. Finally, the playback paradigm restricted the user to a pre-specified execution scenario during the debugging session.

2.2.2 Incense (1980)

After EXDAMS there was relatively little advancement in visual debugging. Some non-debugging systems [57, 82, 85] during the seventies referred to the usefulness of graphics for debugging purposes, but advances in visual debugging would wait until a more powerful graphical environment was available. The experiences gained in the development of Smalltalk at Xerox Palo Alto Research Center (PARC) set the stage for renewed work on visual debugging [49, 61].

Incense was the first visual debugger to graphically display data structures in a windowed environment [61]. Actually, to call Incense a debugger is a bit misleading. Myers developed Incense at Xerox PARC using an Alto minicomputer as a system to interactively investigate program data structures. Incense operated on Mesa programs³, but could not be easily used to debug them due to a lack of integration between the Mesa debugger and Incense. Myers’ intent was to integrate Incense with the Mesa debugger under the Cedar programming environment, but Incense never reached this stage. Thus, while Incense could display data values for an application during debugging, it could not dynamically update its displays at new breakpoints.

Incense’s goal was similar to that of the many visual debuggers which followed after it: to reduce the degree of detail in the data to enable programmers to reason about the program’s state at a higher conceptual level. To this end, Incense allowed “the programmer to define the [graphical] display for any type and have that display used whenever data of that type is shown” [60]. The mechanisms Incense used for defining displays were called Formats, Artists, and Layouts.

³Mesa is a Pascal-like language developed at Xerox PARC.

Figures reprinted with permission

Figure 9: (a) Default Formats, (b) Custom Format for RECORD in (a), (c) a Data Structure with Referents, and (d) Layout for Data Structure in (c)

Formats control how the data is displayed. For example, the default Format for most data types is boxed text as shown in Figure 9(a), but users may create custom formats as in 9(b). The Artist is an abstract data type for handling the display graphic’s creation, erasure, selection, and modification. Myers invented the Layout to handle the placement of pointer variables and their referents. When placing a display graphic, the user specifies a rectangular area on the screen where the graphic should appear. Layouts allow the user to determine beforehand how to divide this area into sub-rectangles for placement of any referents. Figure 9(c) shows how a data structure with referents would be displayed using the Layout in 9(d).
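The subdivision a Layout performs can be sketched as a small geometric routine. The Python fragment below is one possible division scheme chosen for illustration and is not Incense's actual algorithm: the function split_layout, its own_fraction parameter, and the left/right arrangement are assumptions.

```python
def split_layout(rect, referents, own_fraction=0.5):
    """Divide rect = (x, y, width, height) among a record and its referents.

    The left part of the rectangle is reserved for the record itself; the
    right part is cut into equal-height sub-rectangles, one per referent.
    Returns a dict mapping names to (x, y, width, height) rectangles.
    """
    x, y, w, h = rect
    own_w = w * own_fraction
    layout = {"self": (x, y, own_w, h)}
    if referents:
        slot_h = h / len(referents)
        for i, name in enumerate(referents):
            layout[name] = (x + own_w, y + i * slot_h, w - own_w, slot_h)
    return layout


# Hypothetical tree node with two pointer fields.
layout = split_layout((0, 0, 200, 100), ["lesser", "greater"])
for name, area in layout.items():
    print(name, area)
```

Applying the same routine recursively to each referent's sub-rectangle would place the referents' own referents, at the price of each level receiving less screen area than the one above it.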

Like EXDAMS, Incense elevated expectations for the possibilities of applying graphics to debugging. Incense introduced the computer-drawn, box-and-arrow format now common with current visual debuggers. It also demonstrated automatic drawing of actual variable states in a compiled program, in a manner similar to hand-drawn figures [60]. Myers argued that such presentations of debugger data were easier to understand, more abstract, and more enjoyable to use than text representations. Finally, towards the goal of extendibility, Incense provided users with the facilities to create custom displays of data structures.

In practice, however, Incense could not actually support debugging. Besides lacking integration with the Mesa debugger, Incense was unacceptably slow.⁴ Moreover, users couldn’t capitalize on Incense’s abstraction mechanisms, because “the creation of the visualizations was often more difficult than the actual data manipulation algorithms, which meant that programmers never created custom visualizations” [63]. Another five years would pass before any significant strides in visual debugging occurred.

⁴The display of even simple pointers took nearly two seconds on the Alto.

2.2.3 GDBX (1985)

Taking the lead from Myers, Baskerville moved on to develop Graphical-DBX (GDBX), a debugger for the C language [10]. GDBX is an extension to the dbx debugger, providing graphical display of variable states in a manner similar to Myers’ box-and-arrow format. Baskerville extended the capabilities of the visual debugger beyond Incense in two ways. First, GDBX integrated the graphic display of data with a full-featured debugger. The result was that users of GDBX had available all the facilities of dbx plus additional data visualization capabilities. Second, the GDBX display system included a two-dimensional layout algorithm to help the user place the data boxes.

GDBX also introduced some new ideas for visual debuggers. First, GDBX used a scrollable, virtual drawing area to increase usefulness of the display window. Next, GDBX users could change displayed values, even pointer targets, and cause the corresponding modifications to program values; this functionality was a direct result of the integration of GDBX and dbx. The most important new feature, however, was that GDBX provided several display modification commands that allowed the user to elide5 data. Although elision of field data was possible with Incense using Formats, Baskerville was the first to provide a truly usable method. From readily accessible menus users could direct GDBX not to display certain record fields, to limit the depth of pointer dereferencing, or to limit the number of array elements to display. Debuggers can easily return enough information to overwhelm the user [11]; these elision functions were an important step toward helping users more precisely define which data is interesting and needs investigation.

5 To elide data is to hide or remove parts of it from view, i.e., to suppress unnecessary detail.
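The three kinds of elision described for GDBX (hiding a record field, limiting pointer dereferencing depth, and limiting the number of array elements) can be illustrated with a small C sketch. It is not GDBX code: the Node record, the Elision settings, and the show_list routine are hypothetical names introduced only for this illustration, and the text output stands in for GDBX's graphical boxes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NODE_SAMPLES 4

/* Hypothetical record type, used only for this illustration. */
typedef struct Node {
    int id;
    int samples[NODE_SAMPLES];
    struct Node *next;
} Node;

/* Hypothetical elision settings, one per kind of command named in the text. */
typedef struct {
    int show_samples; /* 0 hides the samples field entirely */
    int max_elems;    /* how many array elements to display */
    int max_depth;    /* how many nodes to display before pointers stop being followed */
} Elision;

/* Append text to out if it fits; text that does not fit is dropped. */
static size_t append(char *out, size_t cap, size_t used, const char *text)
{
    size_t len = strlen(text);
    if (used + len < cap) {
        memcpy(out + used, text, len + 1);
        used += len;
    }
    return used;
}

/* Write a one-line display of the list into out; returns its length. */
size_t show_list(const Node *n, const Elision *e, char *out, size_t cap)
{
    size_t used = 0;
    int depth = 0;
    char piece[32];

    assert(e != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (n != NULL) {
        if (depth == e->max_depth) {            /* pointer-depth elision */
            used = append(out, cap, used, "...");
            break;
        }
        snprintf(piece, sizeof piece, "{id=%d", n->id);
        used = append(out, cap, used, piece);
        if (e->show_samples) {                  /* field elision */
            int i;
            used = append(out, cap, used, " samples=[");
            for (i = 0; i < NODE_SAMPLES && i < e->max_elems; i++) {
                if (i > 0)
                    used = append(out, cap, used, ",");
                snprintf(piece, sizeof piece, "%d", n->samples[i]);
                used = append(out, cap, used, piece);
            }
            if (i < NODE_SAMPLES)               /* array-length elision */
                used = append(out, cap, used, i > 0 ? ",..." : "...");
            used = append(out, cap, used, "]");
        }
        used = append(out, cap, used, "}");
        n = n->next;
        depth++;
        if (n != NULL)
            used = append(out, cap, used, " -> ");
    }
    return used;
}
```

For a three-node list, settings that hide the array field and display two nodes produce `{id=1} -> {id=2} -> ...`, which shows how each setting removes one class of detail while leaving the shape of the structure visible.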

In looking ahead to future enhancements, Baskerville mostly suggested improvements to the functions GDBX already provided. As a result, he ignored the hard problem of graphically abstracting the interesting data to better assist the user in conceptualizing the state of the process. As we shall see in Section 2.2.7, commercial debuggers have, by and large, continued to sidestep this problem.

2.2.4 VIPS (1987)

The Visual and Interactive Programming Support (VIPS) debugger represented the next step in visual debugging. Isoda et al. developed VIPS on a PERQ workstation with a DIPS host computer [46]. The target language for VIPS was Ada, but rather than debugging the Ada executable file directly, VIPS interpreted the DIPS Ada compiler’s Diana and quadruple files. By using this strategy, VIPS was able to support a new view of the program, the block view: a macroscopic, nested-block view of Ada packages and procedures. The block view complements the source code view in that it gives the user both a high- and low-level view of the program’s control flow.

Overall, in addition to the block and source code views, VIPS provided five other views or windows to examine and manage the process. These windows are:

• The data window which displays data structures similarly to Incense or GDBX.

• The figure definition window through which displayed variables are associated with their display definition.

• The editor window which is used for modifying the program source code.

• The acceleration window for controlling the interpreter’s execution speed.

• The interaction window through which all standard I/O occurs.

VIPS brought together in one system many of the features of the preceding debuggers. VIPS used dynamic arrows to illustrate data flow as in EXDAMS; it borrowed Myers’ ideas for allowing custom data visualizations using the figure definition window and specialized quadruple programs, and it was an integrated debugger similar to GDBX. Altogether, VIPS provided more functionality than any previous debugger, but its generality belied its usefulness. Despite its many functions, VIPS failed to give programmers real leverage for debugging, because its functions were mostly superficial. The VIPS interpreter lacked features common in typical debuggers, most notably the breakpoint. Additionally, users needed to understand quadruple programming in order to build custom data displays. VIPS also could not elide data, making it unclear how useful it would be for debugging programs with large data structures [67]. Finally, VIPS did not provide a virtual screen area, as did GDBX, exacerbating the display problems when large data structures were used.

2.2.5 PROVIDE (1988)

The PROVIDE debugger is different from the other systems surveyed here, because it attempted not only to improve the visualization of program data states, but also to improve the methods for determining which states to visualize and how those states should be reached. For this reason, PROVIDE bears similarities to Agrawal’s Spyder debugger discussed in Section 1.2.4. This section, however, will not address these features of PROVIDE; instead, it will concentrate on PROVIDE’s data visualization capabilities.

PROVIDE is a prototype debugger for a variant of the C language which allows only integer and one-dimensional array data types [58]. PROVIDE’s configuration consists of Macintosh computers (Macs) networked to a VAX host: the Macs handle the duties, while the VAX manages PROVIDE’s databases and C language interpretation. All data graphics are displayed on the Macs.

To construct pictures of program data, users begin by binding program objects to the formal parameters of picture types in the PROVIDE library. A program object is synonymous with an ordinary C language variable. A picture type’s formal parameters control its display characteristics. For example, PROVIDE supports a pie chart picture type with formal parameters for the title, wedge labels, and wedge percentage values. Once a picture type is bound to the program objects, the user positions the data picture in a display window. From this point forward, the graphic is updated as the associated program objects change values.
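The binding of program objects to a picture type’s formal parameters can be pictured with a short C sketch. This is an illustration only, not PROVIDE’s implementation: the PieChart structure and the pie_bind and pie_percentages functions are hypothetical, and the sketch assumes non-negative integer program objects, in keeping with the integer-only C variant that PROVIDE supports.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WEDGES 8

/* Hypothetical picture type: the title, wedge labels, and wedge values
 * play the role of the formal parameters described in the text. */
typedef struct {
    const char *title;
    const char *labels[MAX_WEDGES];
    const int  *values[MAX_WEDGES]; /* bound program objects (int variables) */
    size_t      count;
} PieChart;

/* Bind one program object to the next free wedge of the chart. */
void pie_bind(PieChart *pie, const char *label, const int *object)
{
    assert(pie != NULL && object != NULL && pie->count < MAX_WEDGES);
    pie->labels[pie->count] = label;
    pie->values[pie->count] = object;
    pie->count++;
}

/* Recompute wedge percentages from the current values of the bound
 * objects. A display system would call this whenever the objects change.
 * Returns the number of wedges written to percent_out. */
size_t pie_percentages(const PieChart *pie, double *percent_out)
{
    long total = 0;
    size_t i;

    for (i = 0; i < pie->count; i++)
        total += *pie->values[i];
    for (i = 0; i < pie->count; i++)
        percent_out[i] = total > 0
            ? 100.0 * (double)*pie->values[i] / (double)total
            : 0.0;
    return pie->count;
}
```

Because the chart stores pointers to the program variables rather than copies of their values, recomputing the percentages after the program changes a variable yields an updated picture without rebinding.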

Picture types are complex objects, offering the greatest potential for abstracting variable state data of any of the visual debuggers surveyed. They can support virtually any type of graphic and allow direct manipulation of that graphic for modification of the program object values. These features vaguely resemble those furnished in X Toolkit Intrinsics widgets. Users familiar with the X Toolkit realize the difficulty of creating new widget classes [5]; the same thing is true of PROVIDE picture types. Furthermore, picture types tend to lack generality. These are crucial weaknesses, because they demand many picture type variations with “the requirement that expert programmers generate [these] types” [58]. As a solution, Moher cited ongoing research to build an interactive facility to intelligently assist users in constructing picture types.

2.2.6 VIPS II (1991)

For VIPS II, Isoda and Shimomura redirected VIPS’ focus from a tool for displaying many program views to a tool for visualizing linked data structures [76]. VIPS II is built to communicate with UNIX’s dbx debugger in a manner similar to Baskerville’s approach with GDBX. Thus, VIPS II provides all the features of a typical breakpoint debugger plus data visualization capabilities.

While VIPS allowed visualization of all types of data structures, VIPS II restricts itself to linked structures. To improve performance in visualizing large linked data structures, VIPS II uses two types of views. The first view type is called a whole-list view; it shows the linked data structure as a depth-first tree with no detail: nodes (data records) are displayed only as small rectangles (see Figure 10). Because this view shows only a broad picture of the list, VIPS II can update it very rapidly. Views with 100 nodes can be updated in approximately one second.6 The second type of list view is the partial-list view. It shows a detailed display of some section of the structure with each node showing the corresponding data record’s current field values and all pointer fields dereferenced as arrows. The partial-list view is limited to ten nodes so it also may be updated quickly. Association of nodes which represent the same record in different views is achieved through graphical cues such as bounding rectangles (in the whole-list window) and positioning.
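The difference in cost between the two views can be sketched in C. The code below is a hypothetical illustration rather than VIPS II’s algorithm: the Item record (one string field and two pointer fields, with an added count field) and the three functions are assumed names, and the structure is assumed to be acyclic. The whole-list walk records only where each node sits, while the partial-list walk stops at a fixed node limit and reads field values only for the nodes it keeps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical linked record: one string field and two pointer fields. */
typedef struct Item {
    const char  *goods;
    int          num;
    struct Item *less;
    struct Item *more;
} Item;

enum { PARTIAL_LIMIT = 10 }; /* node limit of the detailed view */

/* Whole-list view: depth-first walk that records only each node's depth,
 * enough to place a small rectangle; no field values are read.
 * Returns the number of nodes recorded so far. */
size_t whole_view(const Item *n, int depth, int *depths, size_t cap, size_t used)
{
    if (n == NULL || used == cap)
        return used;
    depths[used++] = depth;
    used = whole_view(n->less, depth + 1, depths, cap, used);
    return whole_view(n->more, depth + 1, depths, cap, used);
}

/* Partial-list view: collect at most limit nodes, starting at n, for
 * detailed display. Returns the number of nodes collected so far. */
size_t partial_view(const Item *n, const Item **shown, size_t limit, size_t used)
{
    if (n == NULL || used == limit)
        return used;
    shown[used++] = n;
    used = partial_view(n->less, shown, limit, used);
    return partial_view(n->more, shown, limit, used);
}

/* Detailed text for one node of the partial-list view. */
int item_detail(const Item *n, char *out, size_t cap)
{
    assert(n != NULL && out != NULL);
    return snprintf(out, cap, "goods=\"%s\" num=%d less=%s more=%s",
                    n->goods, n->num,
                    n->less ? "->" : "nil", n->more ? "->" : "nil");
}
```

A caller would pass PARTIAL_LIMIT as the limit of partial_view and format only the collected nodes with item_detail, so the work of reading field values is bounded regardless of how large the whole structure is.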

6 VIPS II was evaluated on a Sun Sparcstation. Each node in the linked structure had two pointer fields and one string field.

Figure 10: VIPS II Debugging an Example Application (Copyright 1991 IEEE, reprinted with permission)

Other improvements in VIPS II include a variation of Baskerville’s method to highlight the most recently changed values, and switches to control both the layout of the linked structure and which pointer values will be dereferenced. Furthermore, since only the partial-list is automatically updated at breakpoints, VIPS II shades the whole-list’s background when this view becomes outdated. Although VIPS II displays only linked data structures, the authors plan to extend it to include arrays, records, and functions.

2.2.7 Commercial Debuggers

Commercial efforts in visual debugging are picking up steam. Displaying variable data in the box format has become standard practice in state-of-the-art commercial debuggers. While some debuggers, such as Think-C for the Macintosh, use fairly pedestrian graphics to display variables, others, such as CodeVision by Silicon Graphics (SGI), enhance the data boxes with color, scrollable fields, and menus for formatting the data values. The most advanced debuggers also allow pointer types to be drawn as arrows [50, 80]. Furthermore, CodeVision includes some advanced features from research debuggers like GDBX’s elision capabilities and VIPS II’s whole-list view [80].

What debuggers in the marketplace have not yet addressed are methods to display graphic abstractions of variable data beyond the pointer arrow. No present commercial debugger for compiled languages on workstations or PCs provides even rudimentary facilities for custom displays, as did Incense, PROVIDE, and VIPS. Except for the infusion of superior layout algorithms, commercial debuggers have little more to offer for improvements in data visualization using the box-and-arrow format. The next step for these debuggers would be to include display customization facilities.

2.3 Summary of Previous Work

It is evident from Sections 1.2.1, 1.2.4, and 2.2 that much work has been aimed at improving the debugger, both on execution control and for visualizing the program’s current data state. This section reviews advances in visual methods using the debuggers surveyed in Section 2.2. It outlines both the highlights of these debuggers and their limitations and weaknesses. After having done this, we will be better prepared to look at new methods for improving debugger data visualization.

2.3.1 Best Characteristics

Each of the debuggers surveyed showed some good characteristics which ought to be continued in future debugging tools. EXDAMS introduced and demonstrated the value of multiple views for the same state data. Certainly, situations arise when a choice among different visualizations can make a difference between merely observing the data and understanding it [11].

Myers was the first to suggest and support custom analogical pictures of abstract debugger information. This lead was endorsed by both VIPS and PROVIDE. In all cases, these systems proposed some “programming language” to construct their made-to-order images. Incense also introduced the box-and-arrow format; this format still has its place in some debugging situations.

Perhaps the most important contribution to date has been Baskerville’s elision functions. With debuggers overwhelming the user with low-level data detail, these functions help programmers begin to get this problem under control. Elision, or the hiding of unnecessary detail, is now practiced in advanced commercial debuggers, and in VIPS II with its whole-list view. Furthermore, GDBX’s integration of graphical capabilities with UNIX’s dbx was also a significant step in providing industrial strength power to a graphical debugger. An additional side-effect of this accomplishment was that it ensured that the data graphics could accurately reflect the process state automatically at breakpoints.

VIPS II followed the practice of tightly integrating common debugger functions with data visualization capabilities. Additionally, VIPS II introduced the practical notion that the pictorial displays needed to be drawn quickly to be useful. However, just as each of these debuggers had certain advantages, they also had some disadvantages. These disadvantages are discussed in the next section.

2.3.2 Limitations and Restrictions

Through the years, researchers have overcome some of the weaknesses of the earlier systems. EXDAMS needed to instrument the source code with debugging statements to record state history information, and the debugging routines interpreted this information rather than the actual run-time data states. As previously mentioned, these techniques could obscure or introduce bugs by altering the original source. Furthermore, the interpretive environment removed the programmer one level from the actual execution, slowing down the debugging cycle.

Incense avoided each of these limitations, but had its own problems with “detached” debugging, because Incense and the Mesa debugger were never integrated. However, integration of debugger and display facilities was not Myers’ primary goal. His goal was to provide a system “to define the display for any [data] type”, and he succeeded in this goal. Nevertheless, in practice, building custom data graphics for a data type ended up being more difficult than simply debugging the program at hand. What Incense really needed was an easier way to accomplish its fundamental goal.

GDBX skirted the issue of custom displays and concentrated on improving Myers’ box-and-arrow format. This is GDBX’s chief weakness, a weakness shared by all of the current commercial visual debuggers. VIPS returned to address custom displays, but fell into the same trap as Incense; likewise, PROVIDE encountered similar problems. Finally, none of these earlier systems provided any useful support for displaying the naturally visual data of graphically-oriented programs.

Therefore, despite the advances and good characteristics of previous efforts, visual debuggers still cannot help the user easily present the program’s variable data at an abstract-model level, because the former customization languages were too cumbersome and the default box-and-arrow format is insufficient for many abstractions. What is needed are new methods for identifying the state data of interest and displaying this data as graphic abstractions.

2.4 Desirable Characteristics for a Visual Debugger

Having examined the best and worst characteristics of earlier visual debugging tools, we are now prepared to identify the features that can improve the effectiveness of data visualization for future debuggers. Looking closely, it becomes apparent that Sections 2.3.1 and 2.3.2 highlight two major issues for elevating the presentation of debugger data from detailed, low-level information to high-level, abstract visualizations. A number of other, more minor, attributes also deserve mention. The following list first presents the two key features needed to fortify visualization capabilities and then itemizes three additional worthwhile elements. These capabilities are:

• A method for automatically excluding unnecessary data values (record fields), so programmers can concentrate better on the critical data. This method, although similar to present-day elision capabilities, needs to better address the management of pointer values.

• Customized graphic displays of the data are desirable, but newer, simpler methods to build these visualizations must be found to encourage programmers to exercise the capability.

• An integrated solution for using pre-defined data structure “print” procedures with the aforementioned data selection and graphic capabilities, so users who practice defensive debugging procedures (as outlined in Section 1.2.3) have easy access to their debugging routines.

• A method to facilitate data organization both to help the user associate related variable values and to support efficient updates of these values at program breakpoints.

• Multiple complementary views of identical data values for increasing program understanding and supporting graduated degrees of abstraction.

CHAPTER III

The Data Visualizer

In order to advance the goals of Section 2.4, we have created the Data Visualizer (DV), a graphical extension to the xdbx debugger. This chapter introduces DV. It begins by exploring the motivation behind DV’s design. Next, it introduces DV’s two-step philosophy of data selection and display for visualizing debugger data. Third, it presents a broad overview of DV’s capabilities from a user’s perspective with a simple example of a possible debugging session. Finally, this chapter concludes with a discussion of DV’s supporting hardware and software and how DV communicates with xdbx and dbx.

3.1 Designing the Data Visualizer

As we have seen, programmers commonly debug programs by examining the state of key variables at execution breakpoints. Furthermore, existing debuggers do not provide adequate visualizations for these key variables beyond representing pointers as arrows and records as boxed values. DV was motivated by the desire to give programmers greater flexibility in identifying and displaying key variables.

At first we considered developing a graphical editor of the type suggested in [58] to create pictorial displays for abstracting key variable information. However, as we considered this option, it seemed more likely to lead to the same quagmire for defining graphic abstractions that troubled Incense, VIPS, and PROVIDE, i.e., the daunting effort required by a user to create custom pictures and bind them to key variables causes users to ignore the customization facilities when debugging. Therefore, we chose a more promising approach: extend an existing windowed debugger with an additional “data visualization” window for displaying the state of key variables.

Using this approach, we are able to take advantage of all of the features of an existing debugger (setting breakpoints, displaying source code in context, etc.), and concentrate entirely on extending the debugger’s data visualization capabilities by improving methods to select and display the data. In this respect, DV is similar to Baskerville’s GDBX.SHOW process [10] or the structure browser in Silicon Graphics’ CodeVision debugger [80]. However, unlike GDBX or CodeVision, DV also addresses the difficulties of visualizations beyond boxes and arrows, in the spirit of Incense, VIPS, and PROVIDE, but without overly complex definitions.

3.2 Data Selection and Display

According to Ducasse and Emde, there are two important tasks facing the programmer as he debugs his software. They are the task of collecting all of the necessary debugging information to solve the problem, and the task of restricting (abstracting) this information to the relevant details [29]. Using “print” statements to produce debugging information hard-codes the collection task, usually resulting in inappropriate information for debugging the problem. On the other hand, symbolic debuggers, like dbx, provide all of the necessary information to debug the program, but they do a poor job of presenting the relevant information, usually overwhelming the analyst with debugger data which is too detailed. Thus, Ducasse and Emde argue that a debugger should be designed to support these tasks separately.

DV is designed to support the extraction of data and its abstraction into manageable information objects as separate tasks, because these tasks are performed by different processes. Extraction of the data is performed by dbx through operations initiated by the user on the xdbx interface. Abstraction of the data is accomplished with DV using two steps or phases, data selection and data display.

3.2.1 Data Selection

Data selection is the process of identifying which data items in a user-defined data structure are important to the debugging problem. It is also the first step in determining how to abstract the debugger data so the programmer can focus on debugging at the application model level. DV’s data selection facilities are comparable to GDBX’s elision functions, but provide more options for controlling how much and what data is displayed, especially in the handling of pointer data.

The heart of data selection in DV is the selection definition system. Using the selection definition system, a programmer can hide the details of a given data type to display variables of that type from an abstract perspective, or he can choose to view each data field and display the type as it would be shown in a typical debugger. Figure 11 is an example of two different views of a linked list made possible by different definitions of the list’s “node” data type. Chapter Four is devoted to discussing DV’s selection definition system and how it can begin to help programmers debug their applications at higher abstraction levels.

Figure 11: A Linked List: Abstract View (left) and Full Detailed View (right)
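The idea of attaching a selection to a data type, rather than to an individual variable, can be sketched in C. The sketch does not reproduce DV’s selection definition system, which is presented in Chapter Four; the bit-mask encoding, the SEL_ constants, and the show_node function are hypothetical. The node layout (a record holding a name, an age, and a height, plus a next pointer) follows the list shown in Figure 11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Person {
    const char *name;
    int age;
    int height;
} Person;

typedef struct ListNode {
    Person data;
    struct ListNode *next;
} ListNode;

/* Hypothetical selection for the node type: one bit per field. */
enum { SEL_NAME = 1, SEL_AGE = 2, SEL_HEIGHT = 4, SEL_NEXT = 8, SEL_ALL = 15 };

/* Append "label=value" to out, separated by a space from earlier text.
 * On truncation the length is clamped to cap - 1. */
static size_t emit(char *out, size_t cap, size_t used,
                   const char *label, const char *value)
{
    int n = snprintf(out + used, cap - used, "%s%s=%s",
                     used > 0 ? " " : "", label, value);
    if (n < 0 || (size_t)n >= cap - used)
        return cap - 1;
    return used + (size_t)n;
}

/* Display one node, showing only the fields named in the selection.
 * Returns the length of the text written to out. */
size_t show_node(const ListNode *n, unsigned sel, char *out, size_t cap)
{
    size_t used = 0;
    char num[32];

    assert(n != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    if (sel & SEL_NAME)
        used = emit(out, cap, used, "name", n->data.name);
    if (sel & SEL_AGE) {
        snprintf(num, sizeof num, "%d", n->data.age);
        used = emit(out, cap, used, "age", num);
    }
    if (sel & SEL_HEIGHT) {
        snprintf(num, sizeof num, "%d", n->data.height);
        used = emit(out, cap, used, "height", num);
    }
    if (sel & SEL_NEXT)
        used = emit(out, cap, used, "next", n->next ? "<node>" : "(nil)");
    return used;
}
```

Calling show_node with SEL_NAME gives an abstract one-field view of every node of the type, while SEL_ALL gives the full detail a conventional debugger would print; the program’s data is the same in both cases.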

3.2.2 Data Display

Data display is the second phase of abstracting the debugger data into manageable information objects. Data display is handled in DV through four data display objects, the Function, the Variable, the Graphic, and the Drawing Pad. The Function is a visual analog of a C programming language procedure (e.g., as procedures have local variables, so Variables are displayed in, and organized by, Functions). Variables appear similar to the traditional boxed record; however, the data management operations they export distinguish them from their forerunners. The Graphic object and the Drawing Pad are original inventions of DV. Graphic objects are typed pictures for illustrating the features of a data structure which can be shown better graphically rather than textually. Graphics are displayed in the Drawing Pad.

Examples of these data display objects are shown in Figure 12. This figure shows two Functions, main and Intersect, each containing Variables. The drawing pad, also shown in Figure 12, is a special Function for displaying program variables as Graphics. In this example, the variable polygon is displayed as both a Variable in the Function main and as a three-dimensional (3D), planar polygon Graphic in drawing pad. Intersect shows the triple as a Variable, while drawing pad shows this same triple as a 3D point Graphic. The Drawing Pad also shows 3D axes to help orient the view. Chapter Five presents each of the data display objects in detail. It explains the operations that each object supports, and discusses how the programmer can use these operations to view his data as more abstract, less detailed, entities.

3.3 Debugging With the Data Visualizer

This next section illustrates how a programmer could use DV in an ordinary debugging task to visualize debugging data. This example sets the stage for the rest of this document by promoting familiarity with DV’s primary facilities for debugger data visualization and by lending context to the detailed discussions of the selection definition system and data display objects in Chapters Four and Five. We use the program Intersect, the same program used to produce Figure 12 earlier, for our example. Intersect reads parameters for a semi-infinite ray and a triangular polygon in three-space and determines whether or not the ray intersects the polygon.

Figure 12: DV Data Display Objects

Figure 13: xdbx and DV as They Appear Initially

Suppose we discover during testing that Intersect contains a bug (e.g., Intersect reports that a given ray, which is known to intersect the input polygon, does not intersect the polygon). We can use DV to debug Intersect by invoking our modified version of xdbx on the erroneous program. Initially, DV’s main window appears empty (see Figure 13) as DV waits for instructions from either the user or xdbx to display variables1,2. Three variables which obviously warrant monitoring are the planar polygon, the semi-infinite ray, and the point at which the ray intersects the polygon’s plane (if such a point exists).

Figures 14 through 16 demonstrate one possible method for monitoring these variables. First, we set a breakpoint to stop the program after reading the input parameters and examine the relationship between the variables polygon and ray (see Figure 14). Because the Graphics for these variables appear to intersect, we suspect there may be an error in the Intersect procedure. Therefore, we set a second breakpoint in Intersect immediately after computing the intersection point P and display P at this breakpoint as shown in Figure 15. Because P does not lie on the ray, the bug must occur in RayPoint. On examining RayPoint we discover that the statement:

P[2] = ray.point[2] + t * ray.direction[0];

incorrectly uses the x-component rather than the z-component of ray.direction.

1 Generally, Variables are synonymous with program variables; however, they may represent combinations of program variables, constants, and program variable expressions. We use the generic term variables to refer to both program variables and DV Variables, and reserve the capitalized form for the DV Variable object.

2 Displayed variables have their values updated and presented to the user at each execution breakpoint. A displayed variable is also called a monitored variable.

Figure 14: xdbx and DV Displaying polygon and ray

Figure 15: Evidence of an Error: P Appears in the Wrong Location

Figure 16: The Displayed Variables After Correcting Component P[2]

After correcting this error, P lies on the ray as expected (see Figure 16), increasing our confidence that the correction is sound.

Our example illustrates how xdbx and DV cooperate to visualize debugger data. This example focuses on the Drawing Pad and Graphics; however, DV provides two other methods for data abstraction: elision and integrated access to user-defined “print” procedures. We explore each of these methods in depth during the discussion on data selection. Finally, in order to present a coherent example of debugging with DV, it was necessary to ignore many details concerning the data display objects’ capabilities. In the full presentation on data display objects in Chapter Five, these details are addressed and additional data display features are introduced.

3.4 Architecture and Environment

In Section 3.1 we discussed the motivation behind DV’s design; this section elaborates on that material. Because DV is exclusively targeted at addressing the inadequacy of extant debuggers’ data display capabilities, we sought an available debugger which would be suitable for extending in this direction. We chose xdbx because of the accessibility of its source code and the ease with which it could be modified to communicate with both dbx and DV.

xdbx [24] is an interface to the dbx debugger. It contains three primary windows for user interaction and three for information display (the xdbx interface to dbx appears on the left in Figures 13-16). Briefly, the purpose of each window is:

• The file window, the informational window at the top of the application, shows the name of the displayed source file.

• The source window, the interaction window immediately below the file window, displays a source file.

• The message window is an informational window. Generally, it reports the present breakpoint location with respect to the program’s source code.

• The command window is an interaction window providing a button interface to the most common dbx commands.

• The dialogue window, the third interaction window, provides access to dbx’s terminal interface.

• The display window is the third informational window. It shows the status of each monitored variable at the current breakpoint.

Although DV is designed to replace xdbx’s display window, DV is not intrinsically dependent upon xdbx or any other debugging interface. DV is designed to cleanly separate the duties of dbx, xdbx, and DV. Specifically, dbx provides the debugging environment, xdbx handles the interface to that environment, and DV manages the visualization of displayed variables. Because of this separation of duties, DV could easily be attached to another dbx-based debugger or could be operated without any xdbx support. Conversion of DV to a debugger other than dbx, while possible, would require more extensive modifications.

Figure 17 illustrates the relationships between DV, xdbx, and dbx as they are currently implemented. Xdbx is the master process; it is initiated first and issues requests to both dbx and DV to perform debugging actions or display data. DV fulfills a dual role. It handles xdbx’s requests for displaying data, and it also issues its own requests to dbx to build and maintain an information database on the displayed variables.

Figure 17: Communication Relationships Among the Debugging Processes (diagram labels: xdbx, modified communication layer, pipedv, pipe 1, pipe 2, dbx)

Two modifications to xdbx were required to allow these processes to communicate as described. First, we created a new procedure, pipedv, to invoke DV from xdbx and establish a communication pipe between them. Xdbx calls pipedv immediately after forking dbx and establishing the first communication pipe. Pipedv receives the handle of the first pipe as a parameter, which it passes to DV, allowing DV to also communicate with dbx.

Permitting DV to share xdbx’s communication pipe to dbx precipitated the second modification: to restrict xdbx’s access to dbx. This modification was necessary because, when both processes sent requests to dbx simultaneously, the return messages would become unintelligibly tangled or dbx would abruptly terminate. We restricted access to dbx by a token-passing scheme; only the process possessing the token may make requests to dbx. Originally, six xdbx procedures wrote directly to dbx; we modified these six procedures to check if xdbx holds the token before writing. Because xdbx depends more heavily on dbx than DV does, xdbx keeps the token until DV requests it. Furthermore, any debugging commands invoked by xdbx while DV holds the token are queued until DV returns the token. DV returns the token after completing each operation. If DV cannot immediately acquire the token from xdbx, it blocks until the token is available. Under this scheme we prevent the possibility of xdbx and DV getting out of sync (i.e., DV displays some state which is not representative of the current breakpoint) by having xdbx instruct DV to update its variables after each stopping action.
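The token-passing rules just described can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the names DbxChannel, TokenArbiter, and their methods are hypothetical and do not come from xdbx or DV, the shared pipe is replaced by an in-memory log, and DV's blocking wait for the token is reduced to an immediate hand-over.

```python
from collections import deque


class DbxChannel:
    """Hypothetical stand-in for the shared pipe to dbx; it only records writes."""

    def __init__(self):
        self.log = []

    def write(self, sender, command):
        self.log.append((sender, command))


class TokenArbiter:
    """Only the token holder may write to dbx; xdbx holds the token by default."""

    def __init__(self, channel):
        self.channel = channel
        self.holder = "xdbx"
        self.queued = deque()  # xdbx commands issued while DV holds the token

    def xdbx_command(self, command):
        if self.holder == "xdbx":
            self.channel.write("xdbx", command)
        else:
            self.queued.append(command)

    def dv_acquire(self):
        # In the real system DV would block here until the token is available;
        # this single-threaded sketch hands it over immediately.
        self.holder = "dv"

    def dv_request(self, command):
        if self.holder != "dv":
            raise RuntimeError("DV must hold the token before writing to dbx")
        self.channel.write("dv", command)

    def dv_release(self):
        # DV returns the token after each operation; queued xdbx commands run now.
        self.holder = "xdbx"
        while self.queued:
            self.channel.write("xdbx", self.queued.popleft())


channel = DbxChannel()
arbiter = TokenArbiter(channel)
arbiter.xdbx_command("stop at 169")
arbiter.dv_acquire()
arbiter.dv_request("print P")
arbiter.xdbx_command("cont")  # queued, because DV holds the token
arbiter.dv_release()
print(channel.log)
```

In this sketch the queued "cont" command reaches the channel only after DV releases the token, so the two writers never interleave.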

We considered two alternative designs for DV to avoid having DV and xdbx share access to dbx, but neither was as acceptable as our final solution. The first approach, to open two independent pipes to dbx, required modifications to dbx which we wished to avoid so we could focus on DV’s development. We also considered invoking two independent dbx sessions. However, such a scenario would require more handshaking between xdbx and DV to keep the dbx sessions synchronized than the token-passing method requires, making this alternative less acceptable.

DV is implemented on a Sun Sparcstation for debugging C language programs. Although C is our target language for development, fewer than 20 of over 320 procedures, functions, and macros comprising DV are dependent on the C language syntax. It is, therefore, quite feasible to modify DV to support additional languages. Additional software supporting DV beyond xdbx and dbx includes the X Window System, Xt Intrinsics, the Motif widget set [5, 41, 47] for DV’s user interface and the Ohio State University’s EDGE Graphics Library for drawing 3D graphics [21]. DV supports color for its Graphic objects. We developed the portions of DV which use color on the SGI Personal Iris and Crimson workstations. Under this configuration, we executed DV remotely on the Sun using the SGI equipment for the display station.

CHAPTER IV

Data Selection

Data selection is the process of identifying which data items are important for debugging the problem at hand. Judging what items are important can be a difficult task, because it is hard to find a good balance between selecting too much information and selecting too little. Yet determining the right degree of detail in the data can seriously affect the programmer’s ability to debug. Too little detail can leave a programmer’s analysis depending on erroneous assumptions about data values, and too much information can obscure meaningful clues, or worse, it can lead the programmer to despair of searching for debugging clues, because the data management is overwhelming.

Many authors offer their advice on creating useful debugging output [29, 30, 52, 53, 54, 79, 83, 87]. This advice commonly addresses either the placement or content of debugging or “print” statements in the source code. However, we agree with Ducasse and Emde that manually inserted “print” statements do not provide a programmer with sufficient flexibility for controlling debugging output, since the relevance of the output data is clearly dependent on the debugging situation [29]. Furthermore, programmers have demonstrated a general disregard for systematically placing debugging statements in their programs, negating the value of this type of approach.

Since we cannot determine beforehand what debugging information is crucial for solving the problem, it would be useful to have a tool that could easily extract debugging information as we debug the program. The more flexible our selection mechanism, the more likely we are to have the right data available to us when it is needed.

Debuggers provide good versatility for examining both the control flow of the process and its run time data values, but they do this at the expense of presenting their information in excruciating detail. If the programmer could direct the debugger to automatically remove irrelevant data items before the program variables are displayed, then the user could concentrate exclusively on pertinent data values while taking advantage of the debugger’s flexibility for data access. Thus, selection with the debugger occurs at two levels: when extracting information from the debugger about current variable states and when choosing what information in those states (e.g., selection of specific fields within a record) is worthy of examination. Throughout the rest of this document, references to data selection refer to the second level, unless noted otherwise. We also use the term reduction to refer to the removal of irrelevant¹ data.

¹ We consider non-selected data to be irrelevant. Therefore, we can use data selection as a method for reducing information, since it distinguishes relevant and irrelevant data.

4.1 Reducing Information

In opening this chapter, we alluded to some possible problems that may arise when too much detail is present in the information set: valuable debugging information is obscured by superfluous data, or the burden of managing so much data exceeds the programmer’s capacity or will to do so. Recognizing such problems, Baskerville introduced his elision functions in GDBX. These functions demonstrated how reduction can help highlight interesting debugger data. In this section, we address some of the other benefits of reduction. Specifically, we see five reasons for reducing, as much as necessary, data in the information set. They are:

• Control: It is obvious that more data requires more control or management effort, the rest of the debugging environment being equal. Managing data which provides no insight on the program’s bug(s) wastes the programmer’s time and energy.

• Efficiency: In addition to wasting the programmer’s resources, the management of extraneous data wastes machine resources.

• Expediency: Management or display of all selected variables may simply not be possible due to hard limits such as screen size. A familiar example of this is when the debugging output scrolls completely off the screen, because the output window is too small or the debugging statement appears within a loop.

• Focus: Researchers on human memory report that short-term memory cannot store many data items for problem solving [56]. Irrelevant data robs the programmer of precious mental capacity. It can also distract him from considering more pertinent data, muddying his perspective of the problem.

• Abstraction: Data abstraction rests, in part, on the principle of information hiding. Modern computer science theory and languages promote a programming style where only the relevant information at a given abstraction level can be accessed. However, debuggers inhibit data abstraction by forcing the programmer to view too much data detail, because they show entire data structures at the level of the source language’s built-in types. This makes it more difficult to reason about the process.

Of course, there are some pitfalls to reducing the data: important data items may be removed. Therefore, the system for reducing the data ought to be interactive and flexible enough to retrieve the elided data on demand. With such a capability, the programmer need not hoard the data in fear that it might be useful sometime, but can safely select the most necessary information, making more data available as circumstances require.

4.2 Abstracting Data Structures

Reduction is necessary for handling debugger data as our discussion in the previous section indicates. It is also the basis for abstraction. Indeed, each of the debugging systems surveyed in Section 2.2 employs some scheme for reducing the displayed information. The methods these systems use to reduce their data can be classified into two groups, definitional reduction and programmatic reduction, as shown in Table 1. Definitional reduction, typified by Baskerville’s elision functions, removes non-pertinent data by allowing the user to define selection characteristics on record and array variables’ fields. Programmatic reduction is achieved by using procedures to “format” the debugger data to emphasize or de-emphasize pre-specified data features.

Both types of reduction methods have their advantages and disadvantages. Controlling and modifying what data is relevant is much simpler using definitional methods. However, no one has yet demonstrated the compatibility of definitional reduction with data abstractions beyond box-and-arrow diagrams. Programmatic reduction allows the programmer to use any of the fields of a variable or combination of variables to abstract a data structure³, but not without the attendant difficulties of programming.

² We use CodeVision from Silicon Graphics, Inc. to represent the present capabilities of this group.

³ Strictly speaking, a data structure is a connected collection of one or more types of data that is usually associated with a single variable. However, in practice, a programmer may implement a data structure with seemingly independent variables.

Table 1: Methods of Data Reduction Used by Various Debugging Tools

System        Allows Multiple   Reduction      Allowable       Default
              Definitions       Type           Abstractions    Abstraction
EXDAMS        Yes               Programmatic   Any             No Default
Incense       Yes               Programmatic   Any             Box-and-Arrow
GDBX          No                Definitional   Box-and-Arrow   Same
VIPS          Yes               Programmatic   Any             Box
PROVIDE       Yes               Programmatic   Any             No Default
VIPS II       No                Definitional   Box-and-Arrow   Same
Commercial²   No                Definitional   Box-and-Arrow   Same

We propose that a system for abstracting debugger data should support reduction methods from both classes. The rest of this section examines how to do this. First, we look at how to extend current definitional reduction methods so that they support higher level abstractions. Then, we discuss how definitional and programmatic reduction can be integrated to simplify the visualization of data structure abstractions. Discussion of DV’s mechanisms for displaying the abstractions, however, appears later in Chapter Five.

4.2.1 Extending Definitional Reduction

Table 1 identifies three systems which use definitional reduction to manage their debugger data. In each case, although the method for defining the selection criteria for the data fields is different, the results are generally the same: unwanted data fields are simply not displayed, and if the data field contains a pointer, then the pointer is not expanded or dereferenced. This capability, which we refer to as simple elision, must be extended in at least two ways if definitional reduction is to support higher level abstractions. To begin, a better mechanism is needed to handle pointer fields. Additionally, multiple data selection definitions for the same data structure must be available simultaneously.

Simple elision helps to bring the debugger data under control and fosters abstraction; however, it cannot, by itself, support a system for abstracting data structures. A major shortcoming of simple elision as an abstraction mechanism is that it is only useful for contiguous data structures such as records and arrays. Simple elision is ineffective for non-contiguous or linked data structures, because it is too basic an operation for handling the linked structure’s pointer fields. While hiding a pointer field presents no difficulties, selecting pointer-referenced data results in a view that reveals the data structure’s individual record boundaries. This view of the linked data structure, as a collection of boxes with a visible “skeleton” of connecting pointers, in addition to being more difficult to manage as a unit, compromises the data structure’s abstraction as a single object.

Most modern programming languages provide pointers for building data structures. Furthermore, the conventional computer science curriculum promotes the use of pointers for constructing data structures, especially for “variable-sized” data. Therefore, any system supporting the abstraction of data structures must inevitably address how to handle the linked data structure’s pointer field(s) in a manner which preserves the data structure’s abstraction as a single object.

Figure 18: Two Views of a Linked List: Abstract (left) and Detailed (right)

In addition to handling linked data structures, a definitional reduction system ought to allow multiple selection definitions for abstracting the same data structure. While this capability is present in each of the programmatic systems, it is noticeably absent in the definitional systems. It is unclear why multiple definitions are ignored by the definitional systems, because they are useful in situations where different “views” of the data structure are necessary or for handling multipurpose data structures.

As an example of the former situation, consider the linked list in Figure 18. The developer of the primary list operations needs the detailed view on the right, since he is working directly with the list’s representation. A user of the linked list needs, and should see, only the list as a single abstraction, as shown on the left in Figure 18. In the not-so-perfect world where the linked list developer and user are the same programmer, it would be useful if the debugger allowed the programmer to rapidly switch between list views as needed. Multiple definitions for abstracting the same data structure support this mode of operation.
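The idea of switching between two definitions for one data structure can be sketched in a few lines. This is an illustration only, not DV's implementation: the Person record, the DEFINITIONS table, the render function, and the sample values are all hypothetical, and the two definitions differ only in whether the link field is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    age: int
    height: int
    next: Optional["Person"] = None


# Two hypothetical selection definitions for the same record type. Each names
# the fields to show and says whether the link field itself is displayed.
DEFINITIONS = {
    "detailed": {"fields": ("name", "age", "height"), "show_link": True},
    "abstract": {"fields": ("name", "age", "height"), "show_link": False},
}


def render(head, view):
    """Return the display lines for a list under the named definition."""
    definition = DEFINITIONS[view]
    lines = []
    node = head
    while node is not None:
        for field in definition["fields"]:
            lines.append(f"{field} = {getattr(node, field)!r}")
        if definition["show_link"]:
            target = "(nil)" if node.next is None else f"<record {node.next.name}>"
            lines.append(f"next = {target}")
        node = node.next
    return lines


people = Person("Ann", 30, 65, Person("Bob", 41, 70))  # made-up sample data
print("\n".join(render(people, "abstract")))
print("\n".join(render(people, "detailed")))
```

Because both definitions exist at the same time, switching views is a matter of passing a different key rather than redefining the selection.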

Also, as mentioned above, the presence of multipurpose data structures, such as an untyped pointer, union, or the generalized list [43], demands flexibility in a definitional system for selecting data to support alternate views of the data structure. For example, DV uses a list data structure to manage its data display objects. The implementation of this list relies on untyped pointers so that objects of any type may be inserted into the list. However, the debugger’s “natural” selection of an untyped pointer field’s value (a pointer to the C type void or char) is useless for viewing the list’s contents (see the top view in Figure 19). Multiple definitions give the programmer the ability to choose, depending on the untyped pointer’s context, an appropriate definition to retrieve the list’s contents as demonstrated in the bottom view of Figure 19.
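The difference between the "natural" view of an untyped pointer and a context-specific definition can be mimicked as follows. The sketch assumes a toy model in which MEMORY maps made-up addresses to stored records; the function names and the sample record are hypothetical and are not taken from DV.

```python
# Hypothetical model of process memory: address -> stored record.
MEMORY = {
    0x1000: {"key_name": "num", "type": "int", "value": "3"},
}


def natural_view(address):
    """Show an untyped pointer the way a debugger would by default."""
    return "(nil)" if address is None else hex(address)


def as_variable(address):
    """Interpret the pointer as referring to a Variable-like record."""
    if address is None:
        return "(nil)"
    record = MEMORY[address]
    return f"{record['key_name']}: {record['type']} = {record['value']}"


# Definitions the programmer may choose from, keyed by the assumed target type.
DEFINITIONS = {"Variable": as_variable}


def show_list(data_pointers, definition=None):
    """Display each data pointer, optionally through a chosen definition."""
    view = DEFINITIONS.get(definition, natural_view)
    return [view(pointer) for pointer in data_pointers]


print(show_list([None, 0x1000]))
print(show_list([None, 0x1000], "Variable"))
```

The first call shows only addresses, which says nothing about the list's contents; the second retrieves the referenced records because the programmer supplied the missing type information.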

4.2.2 Integrating Definitional and Programmatic Reduction

The chief advantage of programmatic reduction is flexibility, but this same flexibility makes it hard to apply. For example, both Myers and Moher lament the difficulty of using the abstraction features that their systems provide [58, 63]. Also, VIPS requires users to understand a special purpose programming language, the Figure Description Language (FDL), to define new figures. These FDL procedures must be translated before debugging the primary application, possibly adding another edit-compile-debug cycle for each user-defined figure. Integrating definitional and programmatic reduction methods allows the programmer to exploit the advantages each method provides.

Figure 19: Two Views of the List Data Structure Used by DV to

With such support in a system for reducing data, we envision a programmer using definitional techniques, extended as discussed in the previous section, to abstract most of the program’s interesting data structures. However, each selection definition also provides built-in access to any user-defined selection procedure for when a data structure’s complexity or the programmer’s fancy exceeds the definitional method’s data selection capabilities. This access to user-defined procedures is especially valuable to programmers who develop data structure “print” procedures as a part of a pro-active approach to debugging [42]. Since these procedures are already a part of the application, they do not introduce any additional debugging activity, nor do they require the user to learn a new language for visualizing data abstractions. Thus, programmers can rely mainly on simple definitional methods for abstracting debugger data and appeal to programmatic methods, which support their pro-active debugging efforts, only as necessary.

4.3 The DV Selection Definition System

To further our experimentation in abstracting data structures for debugging, we designed and implemented DV to support data selection using the reduction features discussed previously. At the heart of DV’s data selection facilities is the selection definition system. This section introduces DV’s selection definition system, and it addresses some of the practical problems which we had to solve to create this implementation.

4.3.1 Classifying Data Fields

Before we discuss how to select debugger data with the selection definition system, it is helpful to develop a classification for the types of field data that appear in data structures. The descriptions of these field classes are not meant to be mutually exclusive, since some data fields may be appropriately assigned to more than one class. Neither are these classes meant to be exhaustive of all the possible ways in which field data may be used. Rather we present this classification system in order to provide a common terminology for explaining the features of the selection definition system, to gain some foothold on sorting out common uses for field data, and to lend further insight on how data selection can be improved for abstracting data structures.

When constructing programs, especially large ones, the developer must often address the design and implementation of new data types. Commonly, these types are defined using pointers to link record components together to form a single data structure [71]. How a data structure’s individual record fields are used to support the type’s abstraction varies widely. Nevertheless, we propose the following categories for describing how data structure field data is used:

• Application: This data is of interest to the user of the application. It is primarily for the storage of this information that the data structure exists, it determines the format of the record(s), and it is generally the most common type of data in the structure. An example of application data is the data field(s) of a linked list.

• Link: This pointer data connects records together to link the data structure into a single object. An example of link data is the “next” field of a linked list.

• Indirect: Although similar to link data, indirect data de-emphasizes the reference’s pointer value and emphasizes the referenced object. An example of indirect data is the individual pointer variable which marks the insertion location in a linked list.

• Index: Index data is similar to indirect data; however, this class implies that the referenced object also belongs to a sequence of like objects in contiguous memory locations. An example of index data is a pointer to a single character of a string of characters.

• Decision: This is data included in the structure for the purpose of interpreting the structure. An example of decision data is the tag field in a generalized linked list [43].

• Bookkeeping: This data is included in the structure to simplify maintenance of the data type abstraction. Bookkeeping data is generally synonymous with internal or private data. An example of bookkeeping data is a reference count.

• Redundant: Redundant data is included in the structure to simplify programming, traversal, or record keeping on the data structure. An example of redundant data is the “prev” link in a double-linked list.

4.3.2 Handling Pointer Fields

Handling pointer fields is possibly the most difficult aspect of abstracting a data structure, because the pointer’s versatility allows it to be used in so many different ways. Our classification of field data emphasizes this point in that three of its seven categories (link, indirect, and index) specifically address pointer data. This emphasis is not accidental, nor does it give undue attention to the pointer field.

As we have seen, former debugging tools were content in most cases to abstract the pointer field as an arrow. The argument for doing so was that typically arrows are used by programmers to represent pointers in figures [10, 60]. Often this assessment is valid; one need only leaf through a textbook on data structures to verify this point of view.

However, at times this viewpoint is too narrow. Certainly, a programmer familiar with linked lists when asked to write down the elements of the list in Figure 20(a) would write something similar to 20(b) or 20(c) rather than what is shown in 20(d). Our issue is this: although visible (link) pointers are most naturally represented as arrows, many situations arise in abstracting data structures where we may follow an (indirect) pointer without actually drawing it. Therefore, the first step in handling the pointer field is to give programmers the ability to specify the manner in which the pointer is used so it can be visualized appropriately.

Figure 20: A Simple Linked List: (a) Its Representation, (b), (c) and (d) Three Possible Listings of the Elements

The selection definition system provides the programmer this capability for specifying how a pointer is used through the mode option in the selection definition interface (see Figure 21). For non-pointer fields the mode option simply controls whether or not a field is retained when the variable is displayed, but for pointer fields additional options are available for specifying whether the pointer is used as indirect, link or index data. Table 2 lists the possible mode-data type combinations. Most of these combinations are also illustrated in the binary tree example of Section 4.3.5.

Figure 21: The Selection Definition System Interface (for Structures) (a form listing, for each field of the structure, its Field Name, Type, Mode, Priority, Dereference As, and Highlight settings)

Table 2: Selection Modes

Mode    Field Type            Meaning
Invis   Non-Pointer/Pointer   Do not select field
Value   Non-Pointer/Pointer   Select field only
Refer   Pointer               Select referent only
Link    Pointer               Select field and referent
Index   Pointer               Select field and referent with context
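To make the first four modes of Table 2 concrete, the following sketch applies a per-field mode to a small linked list. It is an interpretation rather than DV's algorithm: the Node type, the POINTER_FIELDS set, and the select function are hypothetical, pointer values are printed as a fixed placeholder, the list is assumed to be acyclic, and Index mode is left out because its "context" is not specified at this point in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    color: str
    next: Optional["Node"] = None


POINTER_FIELDS = {"next"}  # hypothetical: which fields of Node are pointers


def pointer_text(value):
    return "(nil)" if value is None else "<pointer>"


def select(record, modes):
    """Return display lines for a record given a field -> mode mapping."""
    lines = []
    for field, mode in modes.items():
        value = getattr(record, field)
        if mode == "Invis":
            continue  # do not select the field
        if mode not in ("Value", "Refer", "Link"):
            raise ValueError(f"mode {mode!r} is not covered by this sketch")
        if mode in ("Value", "Link"):
            # Select the field itself.
            shown = pointer_text(value) if field in POINTER_FIELDS else repr(value)
            lines.append(f"{field} = {shown}")
        if mode in ("Refer", "Link") and value is not None:
            # Select the referent by applying the same definition to it.
            lines.extend(select(value, modes))
    return lines


red = Node("Red")
green = Node("Green", red)
blue = Node("Blue", green)
print(select(blue, {"color": "Value", "next": "Refer"}))
print(select(blue, {"color": "Value", "next": "Link"}))
```

With Refer the list reads as a single sequence of colors, while Link exposes each record's pointer field alongside the referenced record.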

Allowing the programmer to dereference pointers using different modes is a significant step towards visualizing linked data structures as single objects, but it by no means solves all the difficulties associated with pointers. We have already discussed the problem of untyped pointers; other problems are cyclic pointer chains, traversal-dependent abstractions, and the invisibility of indirect references. In the rest of this section, we see how DV’s selection definition system solves these problems.

Examining Figure 21 once again, we note that three other control mechanisms appear on the selection definition interface for modifying how pointer fields are selected: Priority, Dereference As, and Highlight. Priorities and dereferencing apply when the mode is set to either Link, Refer, or Index; highlighting is for Refer mode only.

Priorities are integer values for controlling the selection algorithm’s traversal order on the record’s pointer fields. Smaller priority values take precedence. Thus, the pointer field with the smallest priority value is traversed first, followed by the pointer field with the next smallest value, and so on. If two or more pointer fields are assigned the same priority, then the tie is broken by comparing the pointers’ order in the record, with earlier fields taking precedence. Using priorities, a programmer can control the selection order of record data in traversal-dependent data structures. Thus, for example, by modifying field priorities a binary data structure’s information could be selected using preorder, inorder or postorder traversal, or the contents of a linked list could be selected in forward or reverse order. An illustrated example of priorities using a binary search tree is presented in Section 4.3.5.
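The ordering rule for pointer fields (smallest priority value first, ties broken by field order in the record) can be written down directly. The sketch below is illustrative only: traversal_order and collect are hypothetical names, records are plain dictionaries, and collect assumes that a record's own key is selected before any of its pointers are followed, which the text does not state.

```python
def traversal_order(pointer_fields):
    """Order pointer fields by priority value, then by position in the record.

    pointer_fields is a list of (field_name, priority) pairs given in the
    order the fields appear in the record.
    """
    indexed = list(enumerate(pointer_fields))
    indexed.sort(key=lambda item: (item[1][1], item[0]))
    return [name for _, (name, _) in indexed]


def collect(node, pointer_fields):
    """Collect keys from a tree of dict records, following pointers in order.

    Assumption of this sketch: the record's own key is taken first, then each
    pointer field is followed in the order given by traversal_order.
    """
    if node is None:
        return []
    keys = [node["key"]]
    for field in traversal_order(pointer_fields):
        keys.extend(collect(node[field], pointer_fields))
    return keys


leaf1 = {"key": 1, "left": None, "right": None}
leaf3 = {"key": 3, "left": None, "right": None}
root = {"key": 2, "left": leaf1, "right": leaf3}

print(collect(root, [("left", 1), ("right", 2)]))  # left subtree first
print(collect(root, [("left", 2), ("right", 1)]))  # right subtree first
print(collect(root, [("left", 1), ("right", 1)]))  # tie: earlier field wins
```

Changing only the priority numbers changes the order in which the same records are selected, which is the effect the priorities are meant to give the programmer.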

The next control mechanism, dereferencing, is primarily for handling untyped pointers, but may be used whenever a pointer field needs to be dereferenced with an alternate type. Figure 19 was created using dereferencing to permit the data field’s untyped pointer to be dereferenced as a pointer to a Variable object. This field is also commonly used in conjunction with the variable’s preferred definition which we discuss at length in Chapter Five.

Figure 22: A Model Linked List

The final control mechanism is for highlighting indirect pointer types. Ordinarily, an indirect pointer is followed invisibly to the data it references. Highlighting an indirect pointer causes hints about the pointer to be included in the referenced data. Therefore, highlighting lets the programmer compromise between emphasizing the pointer with the Link specification and virtually ignoring it altogether with the unhighlighted Refer specification. As an example of highlighting, consider the list in the top half of Figure 22. This list includes an additional pointer field in the header node for identifying a “current” element. In this case, simply marking the current element with an indirect pointer hint satisfies our informational needs without overemphasizing the pointer’s value. Presently, DV uses a special text string enclosed in angle brackets for highlighting selected data. When DV’s Variable object displays the data, this string could be used to control the highlighting color or font; however, current limitations in the Motif toolkit prevent DV from fully exploiting such features.

The last problem we have yet to discuss is how DV handles selection on data structures with cyclic pointer chains (e.g., a circular queue). Essentially, DV’s selection algorithm performs a depth-first traversal on all records in the data structure that are reachable from the initial record through link or indirect pointers. Nil pointers are not traversed. Additionally, no bound on traversal depth is currently imposed, although one could easily be included in the algorithm. Once a record is visited, it is added to a list of visited records for the current traversal. Before a new record is selected for traversal, this list is checked to prevent an infinite traversal around cyclic pointer chains.
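A minimal sketch of this cycle-safe traversal follows, assuming a record with a single next pointer. The Record class and the select function are hypothetical stand-ins for illustration; they are not DV’s actual data structures.

```python
class Record:
    """A hypothetical list record with one pointer field."""

    def __init__(self, label):
        self.label = label
        self.next = None  # None plays the role of a nil pointer


def select(start):
    """Depth-first selection that visits each reachable record exactly once."""
    visited = set()   # identities of records seen during this traversal
    selected = []

    def visit(record):
        if record is None:          # nil pointers are not traversed
            return
        if id(record) in visited:   # already seen: stop, do not loop forever
            return
        visited.add(id(record))
        selected.append(record.label)
        visit(record.next)

    visit(start)
    return selected


# A three-element circular queue: a -> b -> c -> a
a, b, c = Record("a"), Record("b"), Record("c")
a.next, b.next, c.next = b, c, a
print(select(a))
```

The visited check is what terminates the walk when c points back to a. A depth bound, which the text notes is not currently imposed, could be added as an extra parameter to visit.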

4.3.3 Procedure-Based Selection

Section 4.3.2 focused on pointer fields; therefore, it naturally restricted its attention to link, indirect, and index data. Since application data is by definition “interesting”, it will generally be selected using a Value mode. Furthermore, since bookkeeping data is private and redundant data adds no additional insight for debugging, these field types will usually be elided with an Invis mode. The last field class, the decision field, requires more attention.

[Figure 23 diagram: (a) a record with the fields TAG, DATA, and LINK; (b) a generalized list built from such records.]

Figure 23: (a) A Generalized List Record and (b) The Generalized List A = (a, (b, c), d)

An example of decision data is the TAG field in a generalized linked list (see Figure 23). An algorithm for interpreting generalized lists uses the TAG value to determine whether the DATA field contains immediate data or a pointer to another list. Using the selection definition system, we could create two definitions for the list record: one definition would use the Value mode for selecting the immediate data, and the other definition would use either the Link or Refer modes for following the pointer. However, unless we provide some mechanism for evaluating the TAG field, the selection algorithm cannot determine which definition to use for each individual record.

Our first thought for solving this problem was to allow the mode to be chosen by a conditional expression based on the record’s decision fields. Of course this idea could be extended to conditionally modify each of the control mechanisms. The primary drawback of this approach, however, is that the record must contain the decision field. Therefore, we opted for an entirely different method.

One of the capabilities of the dbx debugger is that it can execute any procedure in the current application using the dbx call command. Since DV communicates freely with dbx (see Section 3.4), the selection definition system also has access to any of the application’s procedures, specifically any data structure “print” procedures. Essentially, we use the dbx call facility as a “hook” back to the application’s programming language for handling decision data. The advantage of this method is its flexibility and the convenient way it incorporates pro-active debugging techniques into DV’s data selection facilities. The disadvantage of this method is that it relies on the user to have created print procedures for his more complex data structures beforehand. We think this risk is acceptable, since, as Myers noted in a survey of programmers at Xerox PARC [61]:

“most people felt that they were currently in the habit of writing data display routines to allow debugging of complex structures.”

Furthermore, others claimed they might do so if the debugger supported this practice. Access to a user-defined procedure is accomplished by typing the procedure name in the “Selection Function” field. A procedure name in this field takes precedence over all field definitions.

4.3.4 Type-Based Selection and Multiple Definitions

Using the selection definition system as described above, the programmer selects the variable field data to retain whenever a variable of the given type is displayed. Two reasons motivate us in choosing to make the selection definitions based on data type. First, because the variable states to be examined are instances of data structures built from new data types, it is convenient for the variables’ selection mechanism to also be based on data type. Second, since it is reasonable to expect program variables with the same data type to represent different instances of the same abstraction, it also seems reasonable to define the selection criteria for one type and thus support multiple instances of the same abstraction.

Nevertheless, although we can expect variables of the same type to most often have their data fields selected uniformly, the selection system must not enforce this policy. As with the linked list in Figure 18, a programmer may choose to examine both an abstract and a detailed view of the data structure simultaneously. Moreover, a multipurpose record, for holding similar, but non-identical, objects of the same class, may require variations in the selected data⁴. Therefore, definitions should be based on data types for the programmer’s convenience in defining how the data fields of program variables should be selected, but they should permit enough flexibility so alternate definitions can be accessed when needed.

⁴ The X Window System, X Intrinsics, and Motif are replete with these types of data structures for events, resources, “callback” structures, etc.

DV’s selection definition system provides this kind of flexibility, and more. If no definition exists for a given type when DV is instructed to display a variable of that type, then DV creates a default definition and uses it to control what variable data is selected for display. The default definition selects all field data for records, but does not dereference pointer fields. However, the default selection definition, like all definitions, is editable through the selection definition interface so that the programmer can modify the default selections as necessary. When more than one definition exists for a single data type, the user must specify which definition is preferred when variables of that type are displayed; otherwise, DV uses the default definition. DV also uses the default definition if the preferred definition does not exist (e.g., it was never created, or was deleted from the definition database, etc.).

4.3.5 Data Selection: An Illustrated Example

This next section illustrates how a programmer could use DV’s selection definition system to control data selection on a threaded binary search tree. The top half of Figure 24 shows the full representation of the binary tree in the traditional box-and-arrow style. The application data for this tree is stored in the name field; rthread and lthread are decision data for determining whether or not the pointer fields rchild and lchild contain children or thread links. The picture of this tree was produced with the selection definition shown below it. Notice that the rchild and lchild fields are specified with the Link mode, so both the link arrow and the referent are selected for display. Figure 24 is also an example of the selection algorithm’s ability to handle linked data structures with cycles.

[Figure 24 screenshot. Top half: box-and-arrow view of a four-record tree whose name fields are "Matt" (the root), "Keith", "Vickie", and "Rebekah", each record showing its name, rthread, lthread, rchild, and lchild fields. Bottom half: the selection definition for struct _node, in which name, rthread, and lthread use the Value mode and rchild and lchild use the Link mode.]

Figure 24: The Traditional Box-And-Arrow View

[Figure 25 screenshot. Top half: the same tree with only the name, rchild, and lchild fields shown in each record. Bottom half: the selection definition for struct _node, in which name uses the Value mode and rthread and lthread use the Invis mode.]

Figure 25: The Tree With Thread Flags Elided

If the programmer decides to ignore the rthread and lthread fields, he can automatically elide this data in each record using the definition shown at the bottom of Figure 25. The results of using this definition are shown at the top of Figure 25. To display the contents of the tree in sorted order requires an inorder traversal with the name field as the only visibly selected item. The selection definition to do this is shown in Figure 26 with the data that is selected by this definition. Notice that the definition in Figure 26 uses a negative priority value to cause the data referenced by lchild to appear before the name field of the same record⁵.

Our final example illustrates how pointer hints are added to highlighted, indirect pointer data (see Figure 27). The selection definition for creating Figure 27 is identical to that of Figure 26, except for highlighting. Highlight text is encoded as:

Notice that the tree’s contents still appear as a single object, but highlighting information has been added which allows the programmer to recover the tree’s structure without displaying the pointers as arrows.

⁵ Non-pointer fields are considered to have a priority value of zero, so that references with negative priorities appear before them and references with positive priorities appear after them.

[Figure 26 screenshot. Top half: the tree displayed as a single object listing "Vickie", "Rebekah", "Matt", and "Keith". Bottom half: the selection definition for struct _node, in which name uses the Value mode, rthread and lthread use the Invis mode, and rchild and lchild use the Refer mode.]

Figure 26: An Inorder Listing of the Tree’s Contents

[Figure 27 screenshot. Top half: the same single-object listing with angle-bracketed pointer hints (e.g., <2:4:0>) interleaved with the names. Bottom half: the selection definition for struct _node, in which rthread and lthread use the Invis mode and rchild and lchild use the Refer mode.]

Figure 27: An Inorder Listing With Highlights

4.4 Summary

This chapter began by looking at the importance of data reduction for identifying relevant debugger data. Later, we saw how linked data structures could be preserved intact as abstract objects using definitional reduction by extending simple elision. Also, we discussed how to support data selection better by integrating definitional and programmatic reduction methods. In Section 4.3 we introduced the centerpiece of DV’s data selection facilities, the selection definition system, and used examples from this system to demonstrate the principles advocated earlier in the chapter. However, reducing the information in the data structure by selection is only the first step towards visualizing the data structure’s abstraction. To complete the process we need to display the reduced data structure in a manner that conveys the abstraction we are attempting to visualize. This topic, data display, is the subject of Chapter Five.

CHAPTER V

Data Display

Determining how to display data structures once the pertinent information is selected is the second step in DV’s approach for visualizing debugger data. As we have seen, previous visual debuggers generally offer two types of abstraction mechanisms to display their data: low-level, box-and-arrow views which are easy and natural to use but limited in their usefulness as abstractions, or completely customized views which are difficult to define. DV attempts to strike a balance between these two options with its data display objects. This requires us to think anew about how to use graphics for visualizing debugger data. The next section addresses this topic and discusses how it helped motivate the design of DV’s data display objects. Additionally, because our debugging model expects that programmers will need to rapidly create and modify abstract data views, Section 5.2 looks at ways to simplify these activities so they don’t distract the programmer from the search for program bugs. After having raised these issues, we proceed in Section 5.3 to introduce DV’s data display objects and discuss their features. Section 5.4 returns to the example debugging session in Chapter Three to further illustrate the data objects’ display features. Finally, we conclude this chapter by reviewing how the display objects implement DV’s approach for visualizing debugger data and how they enable users to effectively manage that data.


5.1 DV’s Data Display Philosophy

In order to deliver a display system with more abstracting power than the traditional box-and-arrow format, but requiring less effort than fully-customized visualizations, we adopt two goals for displaying debugger data:

• Display linked data structures as single objects using data selection and display techniques that are common to the box-and-arrow format.

• Provide a set of graphical display procedures that can be easily bound to selected debugger data.

The aim of the first goal is to elevate the abstraction level of boxed, textual data beyond primitive record boundaries. To reach this goal we depend on the assumption that most programmers are familiar enough with common data structure abstractions (e.g., lists, queues, and trees) so that they only need simple graphical cues in the boxed text to comprehend the abstractions’ states. For example, although the circular queue in Figure 28(a) is displayed linearly in 28(b), we expect most programmers to easily recognize that these pictures represent the same queue state when given the information that the boldly-outlined box (in 28(b)) is the first element and the zeros represent empty elements. Avoiding decorative graphics frees the user from extraneous customizing of the abstraction’s appearance and allows him to concentrate more on the debugging task.

Our first goal is satisfied largely by DV’s selection definition system (see Section 4.3) and the Variable display object. By default a Variable displays all selected data for a linked structure as text with divisions between the linked records denoted by the marker -| ------h In the future, these markers could be used to modify a Variable’s display characteristics to permit better visualization of a linked data structure’s sub-objects (records) while still preserving the single object abstraction. Moreover, ongoing research for automatically drawing data structures [28, 55] may eventually reveal even better display methods for abstracting data structures. These possibilities are discussed further in Section 6.3.

[Figure 28 diagram: (a) the circular queue with its head and tail pointers; (b) the same queue displayed linearly as 0 0 23 34 45 56 67 0.]

Figure 28: Abstracting a Circular Queue

We also recognize that displaying debugger data as text, no matter how nicely it is formatted, does not always evoke the meaning of the data as effectively as a well-chosen graphic. However, creating a suitable picture or plot of the data is more difficult than displaying the text, because it demands an algorithm for synthesizing the image. We surmise that past attempts at providing graphic visualizations for debugger data have failed because of an overemphasis on customization, which forced the programmer to create an image-generating algorithm as part of the debugging task. This premise prompted the second goal and led to the creation of DV’s Graphic display object. A Graphic uses a display procedure and data from an associated Variable to produce a picture in the Drawing Pad. A guiding principle for choosing the original set of display procedures was to keep the Graphics’ pictures as simple as possible.

While DV’s Graphics are appropriate for many applications, the simplicity of these objects makes them particularly suitable for debugging graphical programs (e.g., ray tracing, animation, hidden-surface removal, etc.). This emphasis reflects our desire to support the debugging of these types of programs. When debugging graphical programs, programmers often are frustrated in their attempts to understand the data, because its presentation as text does not clearly show the data’s inherent geometric information. Indeed, our own experience with debugging graphical programs has reinforced our conviction that the ability to display debugging data pictorially is a significant aid for these types of programs.

5.2 Managing Abstractions

If, while debugging, programmers always knew beforehand which data values were important and how they could best be displayed, then they might not require any flexibility in the display system. Realistically, however, we expect programmers to look at several variables, some providing information and others not, in search of a bug. We expect that users may wish to change variables’ abstract depictions to emphasize different aspects of the data. Also, it is possible that a programmer may choose to group variables together, because, as a whole, they represent an abstract state which holds a key to the program’s fault. In the rest of this section, we consider how a visual debugger can support this operational model by giving the user the ability to rapidly organize the displayed debugger data and modify the abstract views it presents.

5.2.1 Organizing the Data

Deciding what information to examine in search of a bug is the first problem a programmer must face when debugging. Once this choice is made, a system for organizing the selected data needs to be applied; otherwise, it may be impossible to manage the data. This is a classic problem with dbx, because the displayed variable data is confusingly intermixed with other debugger output, including the next executable statement and the dbx command prompt. xdbx alleviates this problem somewhat by providing a separate window area for the displayed data, but it continues to present the information in a single, fixed-order list.

Visual debuggers provide better organization of the displayed data using three types of association: positional, set, and abstraction. Positional association groups variables according to their proximity on the screen. A debugger provides positional association if it allows the user to show relationships among variables by modifying their location on the screen. Set association groups variables by a common characteristic (e.g., particularly “suspicious” variables, or variables belonging to the same procedure). Abstraction association groups variables by displaying them collectively as a single abstraction. For example, two variables x and y could be combined and displayed abstractly as a point in two-space. Table 3 shows which types of association the debugging systems surveyed in Section 2.2 use.

Table 3: Types of Association Used to Organize Debugger Data

System (Type columns: Positional, Set, Abstraction)
EXDAMS ✓
Incense ✓ ✓
GDBX ✓
VIPS ✓ ✓
PROVIDE ✓ ✓
VIPS II ✓
Commercial ✓

Allowing users to positionally associate variables on the computer screen is useful, because it is an explicit visual reminder that these variables are abstractly related. As Table 3 shows, most visual debuggers provide positional association.

Set association organizes the debugger data even better. In addition to visually grouping related variables, set association allows operations such as visibility or updating to be applied to entire sets. Applying operations to variables by sets can significantly improve the debugger’s performance when many variables are displayed. However, only VIPS uses set association¹, and VIPS uses it in place of positional association so that related variables within sets cannot be grouped further. Moreover, VIPS makes no attempt at exploiting the opportunity to apply operations to the variables by sets.

¹ VIPS uses the program block in which the variable is declared as its set characteristic.

Abstraction association is necessary for building visualizations of program objects that are implemented using more than one program variable. For example, a stack data structure may be implemented using an array and an integer variable indexing the top of the stack. Abstraction association combines these variables to display the stack as a single object. Traditionally, only debuggers employing programmatic reduction provide abstraction association².

5.2.2 Modifying Abstractions

Just as programmers need flexibility in organizing their displayed data, they must also be able to quickly modify how that data is abstracted. Most visual debuggers provide mechanisms for modifying a variable’s displayed abstraction; however, debuggers which use data selection methods based on data type have an advantage when many variables of the same type are displayed, since the selection criteria for these variables can be changed by modifying a single selection definition.

5.3 DV Data Display Objects

We have already, by necessity, been introduced to DV’s data display objects and some of their features in Chapters Three and Four. This section presents these objects in more detail. It begins by introducing DV’s top-level user-interface objects: the main window and menu bar. Next, it proceeds with a discussion of the features Functions, Variables, Graphics, and the Drawing Pad have in common. After this, we examine the functionality of each display object in succession and how each object participates in visualizing debugger data.

² This can be seen by comparing Table 3 with Table 1.

5.3.1 The Main Window and Menu Bar

In its current implementation, DV is invoked by our modified version of xdbx as described earlier in Chapter Three. Initially, only DV’s main window and menu bar appear (see Figure 29). DV uses the main window as a canvas or “bulletin board” to display debugger data. Users display variables by issuing the dbx display command either from xdbx or through DV’s menu bar. Figure 30 demonstrates how a Function containing one Variable is typically displayed in the main window.

[Figure 29 screenshot: the empty main window beneath the menu bar, whose entries are File, Edit, Display, and Draw.]

Figure 29: DV’s Initial Display State

The menu bar provides access to DV commands, and is divided into four categories as illustrated in Figure 31. The File menu allows users to save selection definitions (see Section 4.3) to external files and restore them, or to exit DV. Users modify the current selection definition database via the Edit menu. The Display menu lets users display new variable data from DV or control whether or not DV updates its displayed data at each execution breakpoint. Inhibiting updates is sometimes desirable when stepping statement-by-statement through the program. Lastly, the Draw menu controls whether or not the Drawing Pad is displayed. The Drawing Pad is a special Function for displaying variables as Graphics (see Section 5.3.5).

[Figure 30 screenshot: the main window containing a Function labeled add, which in turn contains a Variable labeled sum <1>.]

Figure 30: DV Displaying a Function add and Its Variable sum

5.3.2 Common Features

DV’s data display objects fulfill different roles for visualizing debugger data. The primary purpose of Functions is to organize the debugger data. Variables control visualization of the selected data by providing access to methods for modifying how this data is displayed. Graphics and the Drawing Pad handle DV’s non-textual abstractions. Although their purposes differ, DV’s display objects exhibit several common features.

[Figure 31 screenshot: the pull-down command lists of the File, Edit, Display, and Draw menus.]

Figure 31: DV’s Menu Bar Commands

The first features DV’s display objects have in common are that they have a uniform appearance and respond in a mutually consistent manner to the user’s interactions. These features are accomplished in the current implementation by using Motif widgets for building the display objects. Each display object also exports methods for modifying its display state, and in each case these methods are accessed via popup menus. In addition, all manager³ display objects provide a virtual screen with scrollbars for presenting their data. Finally, all objects are updated at each execution breakpoint, unless the user specifically directs otherwise.

³ A display object is a manager if it allows modifications on individual parts of the data it displays (e.g., Variables allow modifications on individual fields). Functions, Variables, and the Drawing Pad are manager display objects.

5.3.3 The Function Object

Two concerns motivated our creation of the Function object: organization of the displayed variable data and performance. After implementation we unexpectedly discovered a third benefit, the efficient use of DV’s drawing area. Functions are a visual analog to program procedures; as procedures contain (local) variables, so Functions contain Variables (see Figure 30). This analogy seemed to be a natural method for organizing the display of potentially many variable states. In practice it has proven its utility. Organizing Variables by Function allows users to associate related variables in meaningful ways. For example, they may choose to concentrate on only those variables within the current call stack frame, or they may back up the call stack to view the parameters passed to the current procedure. Furthermore, Functions allow variables of the same name, but in different procedures, to be displayed simultaneously, even if the variables are of different types.

Organizing Variables by Function is a type of set association similar to that employed by VIPS. However, unlike the VIPS association mechanism, Functions also provide performance benefits. Complicated visualizations of variables are expensive to update at each breakpoint; when updating several of these abstractions, there is a noticeable lag. Most often, however, only the variables in the current or active procedure are of interest and need to be updated. Thus, by default, we only update Variables in the Function corresponding to the active procedure. Functions which are not updated are deemed stale by DV, and DV “greys out” the Variables in these Functions to notify the user that their values may be outdated. If the user needs to update Variables in stale Functions, he may do so via an update procedure exported by the Function, or he may set the Function’s update mode such that the Function is updated unconditionally at each breakpoint.

DV displays Functions within its main window. Because DV shares the screen with the xdbx debugger and possibly other windowed applications, it is important for DV to use screen space wisely. Functions are a convenient method for doing this. By default, Functions cascade across the main window as they are created (see Figure 32); they also may be resized and tiled as Figure 33 illustrates. By using the cascading method, users can save screen space and yet easily view all the Variables of a given Function by clicking the mouse pointer on the Function’s title bar to raise the Function to the top of the window display hierarchy. Additionally, DV automatically raises the most recently updated Function to the top of the hierarchy to reveal its newly updated Variables. This practice, when combined with the Function’s default update mode, conveniently keeps the Function representing the currently active procedure at the top of the display hierarchy as the program moves from breakpoint to breakpoint.

5.3.4 The Variable Object

[Figure 32 screenshot: several Functions, including Update_Function, update_default_widget, and get_variable_value, cascaded across the main window, with the Variable reply <10> visible in the topmost Function.]

Figure 32: DV Functions Displayed Cascading (Default)

[Figure 33 screenshot: the Functions Update_Function, update_default_widget, get_variable_value, and read_dbx resized and tiled, showing the Variables function->key_name <7>, v->key_name <8>, name <9>, and reply <10>.]

Figure 33: DV Functions Resized and Tiled by the User

After the user identifies a variable for display and the selection definition system’s selection algorithm retrieves its data, DV displays the results in a Variable. In DV, there are two types of Variables: simple and combination. Simple Variables manage the selected data of a single program variable; combination Variables allow selected data from many program variables to be grouped together. A Variable’s type is distinguished by the format of its name as it appears in the title bar; combination Variables use angle brackets to enclose the name(s) of the program variable(s) they associate, while simple Variables contain only a single name without the enclosing brackets. Each of these Variable types appears in Figure 34; function->key_name, key and title_name, input parameters to Create_Variable, are simple Variables. They are grouped together in the combination Variable above them.

[Figure 34 screenshot: a combination Variable whose title encloses the names function->key_name, key, and title_name in angle brackets, displayed above the three simple Variables it groups.]

Figure 34: Simple and Combination Variables

With combination Variables, programmers can create new abstractions by freely associating variables. For example, by combining Create_Variable’s parameters in a combination Variable we explicitly declare that these variables are associated by some characteristic. Henceforth, these program variables are manipulated within DV as a unit, simplifying both their management and the programmer’s recognition that they are related. Another example is to combine the loop indices for a two-dimensional array and display the result as a point. Since combination Variables group multiple variables into a single abstraction, they are a type of abstraction association. Moreover, they are the first implementation of an abstraction association mechanism for a visual debugger that is not explicitly based on user programming.

Variables are similar to GDBX’s boxes, but they provide more functionality. In addition to allowing combinations, DV Variables export procedures to:

• Change the Variable’s order in the display hierarchy.

• Recast their selected value⁴.

• Hide value fields.

• Show elided value fields.

• Choose an alternate selection definition.

• Associate a Graphic with the Variable.

• Destroy the Variable.

To perform these actions a Variable keeps track of the program variable it represents, including the variable’s data type, current value, address in memory, dbx display state, who it references, and who it is referenced by. Furthermore, in addition to being set associated by their parent Function, Variables can also be positionally associated within the Function using the mouse. Thus, between the Function and Variable, DV provides the programmer with a mechanism for each type of association to organize the debugger data.

5.3.5 The Graphic Object and the Drawing Pad

Recall from Section 5.1 that our second goal for visualizing debugger data was concerned with producing simple pictures using pre-specified, graphical display procedures. Initially, we envisioned the Graphic as a variant of the Variable: users choose one of several types of Graphic objects to display the selected data as a point, line, color, etc., rather than text. Basically, this approach is used by most visual debuggers for alternate visualizations of the data. However, this approach prevents different abstract pictures from being combined or drawn in the same space. Such a restriction is a serious handicap to the usefulness of these abstractions for graphically-oriented programs. Therefore, we chose to draw all Graphics on a single “drawing pad” using data from associated Variables.

⁴ Combination Variables may not be recast, because they are typeless. Furthermore, Variables representing structures may not be recast, due to a limitation in the C language.

To create a Graphic, the user selects the desired view from the Variable’s popup menu (see Figure 35). This action causes the selected Graphic to appear in the Drawing Pad. At the same time, the Graphic is bound to the Variable so that the two are updated together. Presently, DV supports the following types of Graphics (see footnote 5):

• Point(s) — 2D and 3D

• Segment(s) — 2D and 3D

• Infinite and semi-infinite line — 2D and 3D

• Polygon — 2D and 3D

• Vector — 2D and 3D

• Color (red, green, blue)

Footnote 5: A 3D primitive means that the primitive is defined in three-dimensional space, not that it has “thickness.”

[Figure 35: Creating a Graphic for polygon. Screen capture of the DV window showing the ray and polygon Variables and the polygon Variable’s popup menu. Legible entries include Undisplay, Raise, Lower, Color, Set Definition, Cast, Uncast, Hide Fields, Show Fields, and Mask Dividers, together with the Graphic choices Points, Segments, Polygon, Vector, Ray, and Line, each in 2D and 3D.]

Graphics are drawn in three modes: fresh, stale, and highlighted. Graphics associated with updated Variables are drawn fresh. When a Variable is “greyed out” because its parent Function is stale, its Graphic is displayed in stale mode. Highlight mode is used for identifying the association between Graphics and Variables (see footnote 6). The user highlights a Graphic by “pressing” its pushbutton label (see Figure 36). We chose this method because it prevents confusion about Variable/Graphic relationships without cluttering the Drawing Pad with labels.
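The choice among the three drawing modes can be sketched as a small decision function. This is an illustration only: the names DrawMode, GraphicState, and choose_draw_mode are hypothetical, and giving highlighting precedence over staleness is an assumption, since the text does not say how the two combine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { MODE_FRESH, MODE_STALE, MODE_HIGHLIGHTED } DrawMode;

typedef struct {
    bool parent_function_stale;  /* the Variable is greyed out with its Function */
    bool label_pressed;          /* the user is pressing the pushbutton label */
} GraphicState;

/* Pick the mode in which a Graphic is drawn. */
DrawMode choose_draw_mode(const GraphicState *g)
{
    assert(g != NULL);
    if (g->label_pressed)
        return MODE_HIGHLIGHTED;  /* assumption: highlight overrides stale */
    if (g->parent_function_stale)
        return MODE_STALE;
    return MODE_FRESH;
}
```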

As we have seen, Graphics are displayed in the Drawing Pad. Since the Drawing Pad is a modification of the Function, it can also be resized, moved, selectively updated, and have its stacking order in the display hierarchy changed. Moreover, its popup menu provides access to the following additional features:

• Display/undisplay coordinate axes

• Center the drawing area

• Set the viewing mode (2D or 3D)

• Set the viewing parameters (viewing distance and eye position)

5.4 Data Display: An Illustrated Example

In our debugging example in Chapter Three we used many features of DV’s data display objects to generate the views from which our figures were created. However, it was necessary at that point to ignore the details of these features until the display objects could be introduced more completely. In this section we return to Chapter Three’s example and examine precisely the capabilities of DV’s data display objects.

To recap, we monitored three variables which we expected to lead us to the program’s bug. In Figure 14 (duplicated here as Figure 37), we examined the graphic relationship between polygon and ray. We produced the Graphics for these Variables and the view shown in the Drawing Pad by associating a Graphic with each Variable and then using the Drawing Pad’s exported procedures to center the view, display the axes, and set the Pad for 3D viewing.

Footnote 6: Color is also provided for this purpose, but it cannot be used on monochrome screens. Highlighting is useful for both monochrome and color workstations.

[Figure 36: Highlighting the polygon Graphic. Screen capture of the DV window showing the ray and polygon Variables above the drawing pad, with an annotation marking a pushbutton label.]

Next we declared that polygon and ray appeared to intersect. Admittedly, it is difficult to accurately judge 3D spatial relationships from one 2D projection, but this is not what we did. The Drawing Pad allows users to modify the viewpoint so the data may be examined from many angles and distances. Rotating the data made it apparent that polygon and ray did indeed intersect. In Figure 38 we illustrate this capability. This figure shows five frames (landscape orientation) of the ray and polygon Graphics as the viewer’s eye is rotated about the y-axis. Each frame represents a 45-degree offset from the one before it, with the first frame offset 45 degrees from the initial view in Figure 37. The user interface for controlling the view is also shown in this figure.
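The five-frame sequence just described can be sketched as repeated 45-degree rotations of the eye position about the y-axis. The names Vec3, rotate45_about_y, and eye_frames are hypothetical. The right-handed rotation direction is an assumption, because the text does not state which way DV’s view control turns the eye.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } Vec3;

/* cos(45 degrees), which equals sin(45 degrees). */
#define COS45 0.70710678118654752440

/* Rotate p by 45 degrees about the y-axis (right-handed convention). */
Vec3 rotate45_about_y(Vec3 p)
{
    Vec3 r;
    r.x = COS45 * (p.x + p.z);
    r.y = p.y;                  /* rotating about y leaves the height unchanged */
    r.z = COS45 * (p.z - p.x);
    return r;
}

/* Fill frames[0..n-1] with successive eye positions. frames[0] is already
   45 degrees away from `initial`, matching the sequence described above. */
void eye_frames(Vec3 initial, Vec3 *frames, int n)
{
    assert(frames != NULL || n == 0);
    Vec3 eye = initial;
    for (int i = 0; i < n; i++) {
        eye = rotate45_about_y(eye);
        frames[i] = eye;
    }
}
```

Because each step is a pure rotation, the viewing distance stays constant across the frames and only the direction from which the data is seen changes.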

[Figure 37: xdbx and DV Displaying polygon and ray. Screen capture: xdbx 2.1 stopped in main at line 238 of Intersect.c. The program input shown is .2 .0 .8 .1 .4 -.8 for the ray start point and direction, and .5 -1. -.7 -.5 1. -1. 1.5 1. -1. for the three points of the plane. The DV window shows the polygon and ray Variables and their Graphics in the drawing pad.]

[Figure 38: Examining the Spatial Relationship Between polygon and ray. Five frames of the ray and polygon Graphics, with the annotation “Use Sliders to Change Orientation.”]

After determining that the ray and polygon intersect, we stepped into the procedure Intersect to compute and display the intersection point P. Note that P is displayed as a combination Variable (see Figure 39 and footnote 7). It was necessary to do this because P is declared in Intersect as a pointer variable, as required by the C language parameter passing conventions; therefore, displaying P naturally yields only a memory address. In this instance, the combination Variable allows us to reassociate the x-, y-, and z-components of P and display it as a 3D point.

Footnote 7: Figure 39 shows the user issuing the command display P[0], P[1], P[2] from the xdbx dialogue window; actually, such a command displays three individual Variables. Combination Variables are displayed using DV’s menu bar as described in Section 5.3.1, but we overlooked this detail to preserve clarity in the example.
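The C convention behind this can be shown in a few lines. Inside a callee, an array argument is seen only as a pointer, so its own value is an address; the three components must be read through it and regrouped. The sketch below assumes that P refers to three consecutive float values, which the display P[0], P[1], P[2] command suggests; Point3 and combine_components are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Point3;

/* In the callee, the parameter is a `const float *`: printing P itself
   would show an address. Reading P[0], P[1], and P[2] and regrouping them
   recovers the point. */
Point3 combine_components(const float *P)
{
    assert(P != NULL);
    Point3 pt = { P[0], P[1], P[2] };
    return pt;
}
```

A caller that declares float P[3] and passes P hands over only the address of P[0], which is why the regrouping step is needed on the display side.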

Once P is displayed as a point, it is evident, even without alternate views, that its value is incorrect. Moreover, since polygon and ray have evidently not been corrupted by earlier procedure calls, we know the problem must lie in RayPoint. Upon correcting P’s z-component and appealing once more to the Drawing Pad’s view control, it is easy to ascertain that our modification is appropriate.
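For readers who want to check the example numerically, the following is a minimal sketch of a ray and plane intersection. It assumes the plane is the one through the polygon’s three vertices and that the intersection point is origin + t * direction. The names ray_point and ray_plane are illustrative and are not the program’s RayPoint or Intersect, whose source is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } Vec3;

static Vec3 sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static double dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* Point at parameter t along the ray; every component uses its own
   direction component. */
Vec3 ray_point(Vec3 origin, Vec3 dir, double t)
{
    Vec3 p = { origin.x + t * dir.x,
               origin.y + t * dir.y,
               origin.z + t * dir.z };
    return p;
}

/* Intersect the ray with the plane through tri[0], tri[1], tri[2].
   Returns 1 and stores the point in *out, or 0 if the ray is parallel. */
int ray_plane(Vec3 origin, Vec3 dir, const Vec3 tri[3], Vec3 *out)
{
    assert(tri != NULL && out != NULL);
    Vec3 n = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]));
    double denom = dot(n, dir);
    if (denom > -1e-12 && denom < 1e-12)
        return 0;
    double t = dot(n, sub(tri[0], origin)) / denom;
    *out = ray_point(origin, dir, t);
    return 1;
}
```

With the input shown in Figure 37 (ray start (0.2, 0, 0.8), direction (0.1, 0.4, -0.8) normalized, and plane points (0.5, -1, -0.7), (-0.5, 1, -1), (1.5, 1, -1)), this sketch gives a point near (0.423, 0.892, -0.984). The first two components appear to agree with the values legible for P in Figure 39, while the third does not, which is consistent with the incorrect z-component described above.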

5.5 Summary

This chapter presented DV’s mechanisms for data display: Functions, Variables, Graphics, and the Drawing Pad. This section reviews how these mechanisms implement DV’s display philosophy and how they enable programmers to effectively manage debugger data.

Variables, in cooperation with the selection definition system, provide a usable technique for abstracting linked data structures as single objects. The technique is usable because a Variable abstraction is easily defined while representing its data clearly enough to allow the programmer to comprehend the data structure’s state. Minimizing variations in the graphical cues makes this possible. However, Variables leave room for greater graphical variation as our understanding of drawing data abstractions increases.

Variables also cooperate with Graphics to let programmers display debugger data as geometric objects. By combining Graphics in a single Drawing Pad, the debugger data may be abstractly compared, a capability which is particularly suited for debugging graphical programs. Moreover, the Drawing Pad exports methods for thoroughly examining the data, permitting the programmer to view data in two or three dimensions and to adjust the viewpoint accordingly.

[Figure 39: Displaying P as a 3D Point. Screen capture: xdbx 2.1 stopped in Intersect at line 169 of Intersect.c after the commands func Intersect, stop at 169, display P[0], P[1], P[2], and cont. The DV window shows polygon (1), ray (2), and the displayed components of P (0.422972977161407471, 0.891891896724700928, and 1.02297294139862061), with their Graphics in the drawing pad.]

All types of associations (positional, set, and abstraction) are available to the programmer for organizing debugger data. Functions provide set association. They also sensibly conserve screen space and provide performance benefits. Combination Variables provide abstraction association, and both types of Variables, simple and combination, may be positionally associated. Furthermore, Graphics allow debugger data to be abstractly associated in the Drawing Pad as alluded to earlier.

CHAPTER VI

Summary

In this chapter we summarize the dissertation and the research it represents. We begin by examining how DV satisfies the desirable characteristics set forth in Chapter Two for an improved visual debugger. Next, we present DV’s contributions to the field of visual debugging. Our experience developing and using DV led to many new ideas for improvements and extensions. In Section 6.3 we discuss these ideas. Finally, we conclude by offering our overall assessment of DV and its potential for furthering our capability to visualize debugger data.

6.1 Evaluation

Section 2.4 identified the following five desirable characteristics for improving present-day debuggers’ data visualization capabilities.

• A method for automatically excluding unnecessary data values that better addresses the management of pointer data.

• A simpler method for building graphic visualizations that encourages programmers to exercise the capability.

• An integrated solution for using pre-defined, data structure “print” procedures with the aforementioned data selection and graphic capabilities.

• A method for organizing related variable values and supporting efficient updates of these values at program breakpoints.

• A method for supporting multiple, complementary views of identical data values at graduated degrees of abstraction.

The essence of these characteristics is to raise the debugger’s ability to abstract and organize its displayed data without significantly increasing the programmer’s burden for defining and managing those abstractions. It is critical to minimize the effort to define the displayed abstractions, because experience shows that users are looking for visualization methods that can be applied “on-the-fly.”

DV is a visual debugger that embodies these characteristics. It does so by combining boxed, text-based abstractions for linked and non-linked data structures, consolidated in organized sets, with the capability to display these abstractions using simple graphic primitives. Furthermore, DV protects the users’ investments in pre-defined “print” procedures for their data structures. Therefore, programmers who practice pro-active debugging techniques have their own data structure display algorithms immediately available to them to display their data while debugging. The rest of this section examines each of these characteristics individually and summarizes the specific mechanisms DV uses to satisfy them.

DV’s selection definition system promotes programmers’ efforts to rapidly abstract data structures using data reduction and a systematic method for handling field data in records. The five specification types for pointer fields (Invis, Value, Link, Refer, and Index) open to DV users a new opportunity for visualizing linked data structures as unbroken abstract Variables without an appeal to programming. Not only can the Variable elide irrelevant link information, making the data easier to comprehend, but it also minimizes the task of managing the data by organizing it into fewer objects.
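To make the idea concrete, the sketch below renders a linked list as one text object under a per-field specification. It is an illustration under assumptions, not DV’s selection algorithm: Value is taken to mean “show the field” and Invis “elide it”, as the text describes, while Link is assumed to mean “follow the pointer and fold its target into the same object”. Refer and Index are declared only to complete the list of names and are not modeled. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { SPEC_INVIS, SPEC_VALUE, SPEC_LINK, SPEC_REFER, SPEC_INDEX } FieldSpec;

typedef struct Node {
    int key;
    int scratch;        /* bookkeeping field the programmer may not care about */
    struct Node *next;
} Node;

/* One specification per field of Node. */
typedef struct {
    FieldSpec key_spec;
    FieldSpec scratch_spec;
    FieldSpec next_spec;
} NodeDefinition;

/* Render the list starting at head into buf as "(fields) (fields) ...".
   Returns the number of characters written, excluding the terminator. */
size_t render_list(const Node *head, const NodeDefinition *def,
                   char *buf, size_t cap)
{
    assert(def != NULL && buf != NULL && cap > 0);
    size_t used = 0;
    buf[0] = '\0';
    const Node *n = head;
    while (n != NULL) {
        char fields[64] = "";
        if (def->key_spec == SPEC_VALUE)
            snprintf(fields, sizeof fields, "%d", n->key);
        if (def->scratch_spec == SPEC_VALUE) {
            size_t len = strlen(fields);
            snprintf(fields + len, sizeof fields - len, "%s%d",
                     len ? ", " : "", n->scratch);
        }
        int w = snprintf(buf + used, cap - used, "%s(%s)",
                         used ? " " : "", fields);
        if (w < 0 || (size_t)w >= cap - used) {
            buf[used] = '\0';   /* drop the record that did not fit */
            break;
        }
        used += (size_t)w;
        /* Only a Link pointer is followed; any other spec stops here. */
        n = (def->next_spec == SPEC_LINK) ? n->next : NULL;
    }
    return used;
}
```

With key as Value, scratch as Invis, and next as Link, a three-node list appears as one object with no pointer values shown; changing next to Invis reduces the display to the first record only.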

Graphical pictures of debugger data are desirable, but previous visual debuggers failed to include methods for synthesizing the variable data into meaningful pictures without alienating their users, because the visualizations they promoted demanded too much effort to create. DV’s Graphics, Drawing Pad, and combination Variables use simpler graphics and non-programmatic grouping mechanisms to empower users in combining, displaying, and comparing their debugger data graphically. We expect that allowing users to build graphic displays without resorting to programming will encourage them to use these features while debugging.

To promote defensive or pro-active debugging techniques, DV’s selection definition system supports data selection using the programmer’s pre-defined “print” procedures. This capability is integrated with Variable objects so that any data selected by a “print” procedure can be used in a Graphic or combination Variable in the same manner as data retrieved by the selection definition system’s selection algorithm.

DV’s main mechanism for organizing its displayed data is the Function. Updating Variables by Function is an efficient method for handling the dynamic states of many displayed Variables. With Functions associating Variables in sets and Variables also providing positional and abstraction association, users are free to organize their data in more ways than any previous visual debugger allowed. Moreover, as noted earlier, the Drawing Pad also can be used to abstractly associate variable data.
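One way to picture updating by Function is a loop that treats each set as a unit at a breakpoint. This is a sketch under assumptions: the update_enabled flag, the rule that a skipped Function becomes stale, and the names Var, Function, and update_at_breakpoint are all hypothetical, and the counter stands in for re-reading a value from the debugger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int refresh_count;   /* stand-in for re-reading the value from dbx */
} Var;

typedef struct {
    const char *name;
    bool update_enabled; /* assumption: the user selects which sets to update */
    bool stale;          /* a stale Function greys out all of its Variables */
    Var *vars;
    size_t n_vars;
} Function;

/* At a breakpoint, refresh only the Variables of enabled Functions.
   Returns the number of Variables refreshed. */
size_t update_at_breakpoint(Function *funcs, size_t n_funcs)
{
    assert(funcs != NULL || n_funcs == 0);
    size_t refreshed = 0;
    for (size_t i = 0; i < n_funcs; i++) {
        Function *f = &funcs[i];
        if (!f->update_enabled) {
            f->stale = true;   /* the whole set is marked in one step */
            continue;
        }
        f->stale = false;
        for (size_t j = 0; j < f->n_vars; j++) {
            f->vars[j].refresh_count++;
            refreshed++;
        }
    }
    return refreshed;
}
```

The saving comes from the skipped sets: the cost of an update grows with the number of Variables in enabled Functions rather than with everything on the screen.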

Variables allow multiple definitions through a preferred definition mechanism. Also, since the same data may be displayed more than once, programmers can view different abstractions of the same data simultaneously by assigning different preferred definitions to each copy of the data.

6.2 Contributions

In this section we review the primary contributions of this research. Ideas for future

research appear in the following section.

Of the methods used to manage and abstract the detail of debugger data, elision has proved to be the easiest method for programmers to apply and the most widely used. DV extends simple elision of structured data by supporting different types of field data in records, most notably for various pointer classes. Indeed, DV is the first visual debugger to systematically address visualization of pointers as something other than the link arrow. Consequently, DV is the first visual debugger to display linked data structures as single objects without requiring the user to create a display algorithm for this purpose. In addition, DV uniquely provides:

• Integrated access to user-defined “print” procedures as a data selection mechanism.

• Combination of independent program variables to form new abstractions without requiring user programming.

• Demonstration of a visual debugger with mechanisms (Graphics and the Drawing Pad) for supporting graphical program debugging.

In addition to the contributions listed above, DV also introduced the Function as an object for both organizing debugger data and offering a scheme for efficiently updating the displayed data. Finally, the binding of Graphics to Variables offers a new, flexible method for controlling input parameters to the algorithms for displaying the debugger data as pictures.

6.3 Future Work

As could be expected, our experience in developing DV led to new challenges for extending the debugger’s ability to present its data. We discuss the most interesting of them here.

Our classification of record field types gives us a foothold for describing common uses for the sub-parts of data structures. As a result, we are able to extend the definitional technique for selecting interesting debugger data and displaying it abstractly. However, other classifications could potentially yield even better definitional control. In particular, we believe a finer classification of pointers is possible to augment or replace the role now performed by the highlighting mechanism. Also, the distinction between showing the data with a Value specification and completely eliding it with an Invis specification may be too sharp. An intermediate specification that indicates the data’s presence only, not its value, ought to be studied.

Currently, DV uses textual symbols to implement pointer highlighting and record divisions in linked data structures. A coloring scheme for highlighting data or showing record divisions would avoid cluttering the pertinent debugger data with these extraneous symbols.

Often a history of variable values reveals patterns that indicate the nature of the program’s error. DV’s Functions represent an excellent opportunity for developing “history pages” of old variable states. This requires developing both an implementation for non-updating copies of Functions and a method for managing these copies.

The parameter binding between Variables and Graphics by elision, while flexible, is somewhat clumsy to use. A better approach might be to associate data in the Variable to Graphic parameters via an editable table.

Finally, DV provides no layout policy for Variables within Functions. We felt that attacking this problem would take us too far afield of our principal research. Nevertheless, layout policies can affect the manner in which groups of Variables are perceived; therefore, DV could benefit from experimenting with different layout policies for its Variables.

6.4 Conclusion

The goal of this research was to demonstrate that we could raise the data visualization

capabilities of an existing debugger beyond the traditional box-and-arrow format

while keeping these capabilities simple enough to use in the debugging environment’s

hurried atmosphere. We believe DV accomplishes this goal. Moreover, previous

debuggers did not attempt to distinguish the characteristics of the elementary parts of the interesting data structures. DV’s two-step philosophy for visualizing debugger

data stresses the need for programmers to describe both the relevant data and how it needs to be seen. We expect DV and its philosophy to lead to an even better understanding of how to visualize debugger data.