BEN SHNEIDERMAN Direct Manipulation

en Shneiderman is a long-time proponent of direct manipulation for user interfaces. Direct manipulation affords the user control Band predictability in their interfaces. Pattie Maes believes direct manipulation will have to give way to some form of delegation— namely software agents. Should users give up complete control of their interaction with interfaces? Will users want to risk depending on “agents” that learn their likes and dislikes and act on a user’s behalf? Ben and Pattie debated these issues and more at both IUI 97 (Intelligent User Interfaces conference - January 6–9, 1997) and again at CHI 97 in Atlanta (March 22–27, 1997). Read on and decide for yourself wherev the future of interfaces should be headed—and why. s

42 interactions...november + december 1997

debate on vs Interface Agents

PATTIE MAES

Excerpts from debates vs at IUI 97 and CHI 97

interactions...november + december 1997 43

Jim: Okay, welcome all to the afternoon ses- that allows us to predict the time it takes for a sion. My name is Jim Alty. I think I am sup- specific user to learn a specific task, the speed posed to be the moderator, whatever that of performance on bench-mark tasks, the rate means, for this session. I hope things don’t get and distribution of errors, and the retention of too rough. The debate topic this afternoon, as those tasks over time. We look at subjective you all know, is direct manipulation versus satisfaction as a secondary measure and have intelligent agents. On my right I’ve got Ben developed a standardized Questionnaire of Shneiderman—fresh from his triumphs this User Interaction Satisfaction (QUIS) that the I think we morning when he’s been gnawing away at the university has licensed to more than 100 orga- would do best agent people—from the University of Mary- nizations around the world. QUIS consists of to focus on the land. On my left, of course, fresh for a fight is 71 items, from low-level details to higher level remarkable Pattie Maes from the MIT Media Laboratory. concepts in the interface (Office of Technolo- human Let me just explain. It’s 15 minutes from gy Liaison, +1-301-405-4210). capabilities in each speaker to place their position, and then We think accommodating individual dif- the visual 5 minutes allowed each for rebuttal. Then we ferences is important—not just mentioning domain, which I open it up to the floor for about 30 minutes experts and novices, but understanding quan- think are largely or so. Could you please use the microphones? titatively what performance differences we underutilized If you want to line up behind the micro- anticipate. Do we expect experts to perform by the current phones, I will select you to make your com- twice as fast or twenty times as fast as novices? designs with ments. Then at the end, there will be two Would men perform differently from women “40 icons in 2-3 5-minute summing ups. So, let us commence in the use of interfaces, or prefer different windows. I the debate. interfaces? And then we try to deal with think we should broader cultural issues that are even more dif- have two or Ben: First, my thanks to the organizers of this ficult to measure. three orders of Intelligent User Interfaces workshop for dar- For me, the future is most clearly moving magnitude ing to arrange this debate. It was Angel Puer- in the direction of “ visualiza- more: 4,000 or ta’s careful attention to my concerns that tion.” I think we would do best to focus on more items on made me feel comfortable in coming to speak the remarkable human capabilities in the visu- the screen in an here, and so I want to offer him a souvenir al domain, which I think are largely underuti- orderly way that from our lab—the usual T-shirt. And to Jim lized by the current designs with 40 icons in enables people Alty, a cup of tea for when he needs to relax, 2-3 windows. I think we should have two or to see all of the from our group at the lab. three orders of magnitude more: 4,000 or possibilities and I am pleased to represent the University of more items on the screen in an orderly way navigate among Maryland’s Human-Computer Interaction that enables people to see all of the possibili- them. Lab, which is now 14 years young. Over the ties and navigate among them. years we’ve explored a host of user-interface I will show you three brief videotaped design issues in an interdisciplinary way examples. They all show applications and involving , psychology, and extensions of the strategy of “direct manipula- the library school. Our goal is to create envi- tion,” a term I coined in 1982 to describe the ronments where users comprehend the dis- existing successful systems. These systems all play, where they feel in control, where the had rapid, incremental, and reversible actions, system is predictable, and where they are will- selection by pointing, and immediate feedback ing to take responsibility for their actions. To (100-millisecond updates for all actions). I direct me, responsibility will be the central issues in believe that this strategy reduces errors and manipulation this debate. encourages exploration. The current manifes- My pitch over 20 years has been to make a tations of direct manipulation are the visual science out of user interface research. I want to ways of searching in databases and on the Web get past the notion of “user-friendly,” a vague, accompanied by visual presentation of results. and misleading term, and to be really clear Let’s take a look at an example that goes vsinterface about specifying who the users are and what back 4 years in the first videotape called the agents their tasks are. Then we can make a theory FilmFinder.

44 interactions...november + december 1997

debate

Figure 1(a): FilmFinder shows 1500 films in a starfield display where the location of each point is determined by the year of the film (x- axis) and its popularity in video store rentals (y-axis). The color encodes the film type (Ahlberg & Shneider- man, 1994). ftp://www.cs.umd.edu/pro jects/hcil/Screen- dumps/Film/film- alldots.gif

Figure 1(b): FilmFinder after zooming in on recent popular films. When fewer than 25 films remain, the titles appear automatically. ftp://www.cs.umd.edu/pro jects/hcil/Screen- dumps/Film/titles.gif

Users have great control over the display and as they select items, the details appear in windows on the sides. FilmFinder video (see Figures 1a-c, CHI94 long, and then we can use the category button videotape or HCIL 1995 Video tape reports): to show only drama or only action films. We This display shows thousands of films can zoom in on more recent pictures and take arranged on the x-axis from 1920 to 1993 only the more popular ones. And when there and on the y-axis from low to high populari- are fewer than 25, the titles will appear auto- “ ty. We can use the length slider to trim the set matically. When we select one of them, we get of films by the number of minutes in the film a description of the film and information so we do not have to see films that are too about the actress and actor. We can also hunt

interactions...november + december 1997 45

Figure 1(c): FilmFinder after selecting a single film. The info card pops up with details-on-demand. ftp://www.cs.umd.edu/projects/hcil/Screen-dumps/Film/film-sean.gif

for films organized by actors. In this case, you It would be hard to see how to program might be interested in Sean Connery, and his any kind of agent tool to anticipate all of the films will appear on the screen. possibilities that your eye would pick up. We show the age of the youthful offenders on the Ben: Okay. I think you get an idea that the x-axis. There are 4,700 of them, from 10 to controls are visually apparent as you drag 19 years old. The number of days until a deci- them. The updates occur in 100 milliseconds sion was made on their treatment is and users get a clear understanding of what shown on the y-axis. The rules of this organi- the situation is. This work goes back to 1993, zation say that decisions must be made with- and the 1994 CHI conference has a couple of in 25 days, but you can see a lot more than papers describing it [1, 2]. A general purpose they anticipated are well above the 25-day version of that software was used for the limit. direct Department of Juvenile Justice project, which Interesting things pop up whenever you try manipulation you will hear about shortly. a . I hope you will spot these yel- Here is a FilmFinder done in the UNIX low lines—those were a surprise. We thought version of the product called SpotFire (Figure there was a bug in the program, but it turns 2). Chris Ahlberg has made a commercial out that if you start clicking on them to get product out of this and you can download the details-on-demand, you’ll find out they all vsinterface the demo version off of the Web (http:// occur in a Hartford County. They were all agents www.ivee.com). brought in on a certain day. They all have the

46 interactions...november + december 1997

debate same charge, which is narcotics possession. These were drug busts and they were all put on treatment plans at the same time. Those kind of patterns happen to anyone who tries visualization programs, and it would be hard to see how you could program an agent to anticipate all of the possibilities that your eye can pick up in 1/10th of a second. Another problem we have dealt with in the second videotape has to do with visualizing per- sonal histories. It is called LifeLines (Figure 3). The placement line shows this youth is currently placed at Waxter, a detention center. We also see a placement in a drug-abuse pro- gram and a placement at Cheltenham. This one is thicker because the court found him guilty of auto theft. When I click on the line Figure 2: Spotfire version of FilmFinder provides increased user controls. of the drug-abuse program, all of the related Users can set axes (set to length in minues and year) and glyph attributes information is highlighted. I can tell that the (color is set to subject and larger size indicates award winning film). Spot- placement was for a breaking and entering fire is available in Unix, Windows and Java versions (http://www.ivee.com)

Figure 3: Youth Record prototype using the Lifelines display to show a case history for the Maryland Department of Juvenile Justice. (CHI96 videotape or HCIL 1996 Video Tape Reports; 4) ftp://www.cs.umd.edu/projects/hcil/Research/1997/patientrecord.html

interactions...november + december 1997 47

case and was requested by Brown. A click on image acts as an overview, giving a visual rep- Brown’s name gives the contact information. resentation of the entire body. The axial Ben: Here are the reviews which were written. A cryosections are a local view showing a During the CHI97 debate, click on the code brings the text of the review. thumbnail corresponding to the slider posi- Pattie made a point about In the same way, I can get more details about tion on the overview. We can explore the body my use of autofocus cam- each case and placement by clicking on the by simply dragging the slider. It updates in eras, suggesting that they labels. Those screens are all for the old system, real time, giving an experience of flying were some sort of agent. showing that LifeLines can be used as a front through the human body. Here we see the As we were speaking Alan end and acts as a large menu for all of the brain, the shoulders, the torso, the abdomen, Wexelblat was taking pic- screens. The top buttons can access general the thighs, the knees, and all of the way down tures of the events using information, but any critical information will to the toes. We press the retrieve button and my camera. When I went appear on the overview screen. For example, the corresponding full-size image is retrieved through the color slides I here’s a mention of suicide risk. Seeing the over the network from the NLM archive in a found that most of the overview gives the user a better sense of how couple of minutes. We provide several useful pictures he took were out much information is available and what type alternative overviews and also the ability to of focus! As with most of information it is. Of course, this implies generate any coronal section overview, for autofocus cameras, there that all of the information can be presented in example, near the back of the body or near the is a narrow area in the one screen. front. center that is used for focusing and this must be Ben: We think that LifeLines can also be Ben: Other labs are working on related ideas placed on the intended applied for medical records, and we are now of information visualization. From the Pacific subjects of the photo. applying it in a project with IBM. We give an Northwest National Labs, this textual data- Unfortunately Alan didn’t overview of patient histories that contains the base that has been presented in a two-dimen- know this and simply consultations, conditions, documents, hospi- sional mountains-and-clusters visualization to pointed at the stage area talizations, and medications. Users have great give users an idea of the volume and interrela- but the focus point was control over the display and as they select tionship of items. Steve Eick at AT&T Labs on the background poster. items, the details appear in windows on the has these wonderful visual overviews of large Almost all the photos sides. We think this strategy has great power textual documents. were unusable. in providing convenient access to large and Here, the characters in a children’s book are complex databases in a way that gives the users color coded so that you can see the progress of We can all find this amus- an overview of what is happening and an the story as it moves on to different characters. ing and leave it at that, appreciation of where the details fit into the Departmental e-mail networks and richer but I think there is a seri- overall context. The visual presentation gives information, such as 3D network representa- ous point which is that users enormous bandwidth and there are tions, are part of the things he likes to show agents don’t always do potentially thousands of selectable items on with a variety of user controls to filter the traf- what we expect them to the screen at once offering rapid access to the fic and reveal patterns of usage that might be do, and it takes some item that you are seeking. difficult to see with other data presentation knowledge to make The third and final example, is a visual strategies. effective use of agents (or database of the human body called the Visible The closing slide says that the overview is auto-focus cameras). If we Human. Our role was to develop a browsing the most important. It gives users a sense of were to assess responsi- user interface to the 15 gigabytes of images at context, of what to look at—the big picture. bility — I would take part the National Library of Medicine (3; CHI96 Then they zoom in on what they want, filter of it in failing to give ade- videotape and HCIL 1996 Video Reports; out what they don’t want, and finally go for quate instruction to Alan, free software available for Sun workstations details-on-demand. My claim is that this gives he might take part from http://www.nlm.nih.gov and select the users the feeling of being in control and there- because he was the direct Visible Human links till you find our Visible fore they can be responsible for the decisions user, and maybe the Human Explorer). they make. Thank you. manufacturer has another This direct manipulation interface presents part for a poor design a thumbnail image of a coronal cross section which fails to provide of the body reconstructed directly from the Jim: Okay, thanks very much, Ben. That was appropriate feedback. axial cryosections (Figure 4). This coronal perfect timing. We now hand it over to Pattie.

48 interactions...november + december 1997

debate

Figure 4: Visible Human Explorer user interface, showing a reconstructed coronal section overview (on the left) and an axial preview image of the upper abdominal region (on the upper right). Drag- ging the sliders animates the cross-sections through the body (North et al., 1996). ftp://www.cs.umd.edu/projects/hcil/Research/1995/visible-human.html

Pattie: I’m not going to bribe the moderator tive because it knows what your interests are. It with tea or T-shirts or anything. I hope that can, for example, tell you about something the work will speak for itself. The word “agent” that you may want to know about based on the is used in a lot of different ways. I want to start fact that you have particular interests. Current this presentation by explaining what I mean by software, again, is not at all proactive. It does- the word “agent,” and in a particular, “software n’t take any initiative. All of the initiative has to agent.” Basically, software agents are a new come from the user. A third difference with approach to user software, a new way of think- current software is that software agents are direct ing about software that end-users have to use. more long-lived. They keep running, and they manipulation In particular, the way in which agents differ can run autonomously while the user goes from the software that we use today is that a about and does other things. Finally, software software agent is personalized. agents are adaptive in that they track the user’s A software agent knows the individual user’s interests as they change over time. habits, preferences, and interests. Second, a So, you can ask, well, why do you call it an vsinterface agents software agent is proactive. It can take initia- agent? Why call it an agent given that the term

interactions...november + december 1997 49

is already so overloaded, and given the fact gone. Continuously, new information is being that it’s really software that is slightly different created, new versions of software are being from existing software? Well, we call it an added. Also, that environment is completely agent to emphasize the fact that agent soft- unstructured. Take a look at the World Wide ware can act on your behalf while you are Web, for example. You couldn’t possibly try to doing other things. We also want to empha- visualize the World Wide Web in any way size that it does this based on its knowledge of because it is completely unstructured and Why do we need your preferences, just like a travel agent will because it has been built by so many different software agents? act on your behalf by buying you a travel tick- people and is continuously changing. I believe … Take a look at et based on the information that the travel that the dominant metaphor that we have the World Wide agent has about your preferences. Note that I today is a mismatch for the computer envi- Web, for exam- prefer not to use the term “intelligent agents” ronment we are dealing with tomorrow. ple. You couldn’t nor “autonomous agents” because those terms Second, the user 20 years ago was different possibly try to have even more problems associated with from the typical user today. Twenty years ago visualize the them. we mostly dealt with professional users of World Wide Web Now, why do we need software agents? computers. Today and tomorrow the con- in any way Why does our software need to become more sumer electronics market is going to be the because it is personalized? Why does our software need to one that dominates, and those users do not completely take initiative to help us as a user? This needs even know how to program their VCRs. How “unstructured to happen because our current computer envi- are they going to deal with user interfaces? and because it ronment is getting more and more complex, Third, the number of things that people has been built and the users are becoming more and more have to keep track of and the number of tasks by so many naive. Finally, the number of tasks to take care that they use their computers for is huge and different people of, and the number of issues to keep track of, is increasing all of the time. As we know from and is continu- are continuously increasing. Let me tell you other domains, whenever workload or infor- ously changing. more about this. First of all, the nature of our mation load gets too high, there is a point I believe that computer environment is radically different where a person has to delegate. There is no the dominant today from 20 years ago, back when the cur- other solution than to delegate. For example, metaphor that rent style of computer interaction was invent- many of you may have students that you del- we have today is ed. Twenty years ago, one typically had one egate certain tasks to, or you may have per- a mismatch for user using one computer, and everything in sonal assistants that you delegate certain tasks the computer that computer, every file, every object, was in to, not because you can’t deal with those tasks environment we a particular place because the user put it there. yourself, but because you are overloaded with are dealing with There was a limit to the amount of informa- work and information. I think the same will tomorrow. tion on that computer. It was completely stat- happen with our computer environments: ic. Nothing changed unless the user made it that they become just so complex and we use change. It was completely structured and well them for so many different things that we organized. Today, our computer environments need to be able to delegate. are completely different. We need to be able to delegate to what More and more the World Wide Web and could be thought of as “extra eyes or extra our browser is becoming the one and only ears” that are on the lookout for things that interface. It’s not quite the case yet today, but you may be interested in. We also need “extra it will be a year from now. In that situation, hands or extra brains,” so to speak, because direct our computer is no longer this closed environ- there will be tasks that we just cannot deal manipulation ment that we have complete control over. with because of our limited attention span or Instead, our computer or the screen is a win- limited time, and we need other entities to be dow onto this vast network, this vast network able to represent us and act on our behalf. of information and other people. That net- Some examples to make this more concrete (I work is continuously changing. It is dynamic. didn’t bring any videos because of the limited vsinterface Something may be in one place today, and the amount of time), but most of you have seen at agents next day it may be in another place or may be least one of these agents. These are some of

50 interactions...november + december 1997

debate

the ones we built in our lab. Letizia, built by sent e-mail or whether you already replied to Henry Lieberman, who is here at the confer- a certain e-mail message. It may proactively ence, is an agent which continuously watches remind you of information related to the you as you browse the Web, analyzing and information you are currently looking at. It We need to be memorizing all of the pages that you visit. It works in EMACS. When I am, for example, able to delegate extracts from those pages the common key- looking at an e-mail message from a particular to what could words. person, it proactively reminds me of the previ- be thought of Whenever you are using your Web brows- ous e-mail messages from that same person, as “extra eyes er, Letizia always looks ahead and checks which is very useful because I may have for- or extra ears” whether within a certain depth of the current gotten to reply to one of them. that are on the page, there happen to be any pages that you Firefly, some of you may have tried that lookout for may be interested in. So, for example, if I am agent, is basically a personal filterer for enter- things that interested in scuba diving, my agent may have tainment, not unlike the movie application you may be picked it up because I look at a lot of pages that Ben talked about, except that this agent interested in. about scuba diving. If I go to a particular will again keep track of your interests, your entertainment site, it may look ahead and say, preferences, and proactively tell you about“ hey did you realize that if you follow that link new movies that you may be interested in which that there are some pages about scuba diving you even forgot to ask about in the first place. in the Florida area? The Remembrance Agent is Yenta is another agent that we built which another agent that continuously tracks the tracks what the user’s interests are by looking behavior of the user. It helps you remember at your e-mail and files and extracting key- certain things. It helps you remember who words. It talks to other Yenta agents belonging

interactions...november + december 1997 51

wouldn’t have thought of yourself. Kasbah is another set of agents that buy and sell on behalf of users. We are currently setting up this experiment MIT-wide, meaning for 15,000 people. We have already done tests with 200 people. It’s basically a marketplace where you can create an agent who will buy or sell a second hand book or a second hand music CD for you. You just tell the agent, “I want to sell The Joshua Tree by U2. I want to ask $12 for it at first. You are allowed to go as low as $9. You have two months to sell this CD. You should be really tough and only change the price all the way at the end, near when the 2 months are over.” That agent will represent you in that marketplace, negotiating to other users, and if it notices that another on your behalf with other people or other direct user shares some of your interests, especially if agents who may be interested in buying that manipulation those interests are very rare, then it introduces CD from you. Again, it is sort of acting on you to that other user. It may say “hey did you your behalf. You don’t have to waste any time realize that at this IUI conference there is trying to make 10 bucks, but the agent will do another person who is interested in going this for you. scuba diving in Florida” so that maybe then I think it’s important to address some com- vsinterface we can decide to go scuba diving together. mon misconceptions about agents: First of all, agents Again, it’s suggesting something that you sorry to say so, but agents are not an alterna-

52 interactions...november + december 1997

debate tive for direct manipulation. A lot of confer- ences and magazines pitch agents against direct manipulation. They are actually com- plementary metaphors. Whenever you have an agent interface, typically you also need a very good application interface because an agent is not a substitute for an interface. An Pattie Maes is an Asso- agent basically interacts with the application ciate Professor at MIT’s just like you interact with the application. It’s Media Laboratory, as if you had someone else looking over your where she founded and shoulder as you are using the application, directs the Software noticing some of your preferences and habits Agents Group. She cur- and then offering to automate some of the rently holds the Sony Corporation Career Development tasks for you. So you still need a very good Chair. Previously, she was a visiting Professor and a direct manipulation interface—visualiza- Research Scientist at the MIT Artificial Intelligence Labora- tion—all of these wonderful tools so that the tory. She holds a Bachelor’s degree and Ph.D. degree in user can personally interact with the applica- Computer Science from the Vrije Universiteit Brussel in Bel- tion. An agent can never predict all of the gium. Her areas of expertise are Human Computer Interac- movies that I may possibly be interested in. It tion, Electronic Publishing and Electronic Commerce, may be able to make some interesting sugges- Artificial Intelligence, and Artificial Life. tions to me, but I will still need to look up Pattie is one of the pioneers of a new research area particular movies myself. called Software Agents, that is, semi-intelligent computer A second misconception is that some peo- programs which assist a user with the overload of infor- ple think that agents are necessarily personi- mation and the complexity of the online world. She is one fied or anthropomorphized. In fact, most of the organizers for the leading conferences in this area agents are not. Most of them don’t even deal such as the annual Autonomous Agents conference and with a natural language interaction interface. the annual Practical Applications of Agents and Multi- A third misconception is that agents necessar- Agent Systems conference. She is a founder and board ily rely on traditional AI (artificial intelligence) member of the Agent Society, an international industry techniques, like knowledge representation and and professional organization established to assist in the inferencing. In fact, most of the agents that are widespread development and emergence of intelligent commercially available and have proven success- agent technologies and markets. ful with large numbers of users rely on either Pattie is a consultant in the area of Software Agents for user programming or on machine learning several major companies such as Apple Computer, Hughes rather than traditional AI techniques. Research, etc. She is the editor of three books and is an I want to conclude by addressing some crit- editorial board member and reviewer for numerous profes- icisms of software agents that Ben has come sional journals and conferences, such as the User Modeling up with—as well as people like Jaron Lanier journal, the Personal Computing journal and the Artificial (see “A Conversation with Jaron Lanier,” Life journal. Her work has achieved several prizes, includ- interactions ii.3 (July 1995), pp. 46-65), who ing the IBM best Bachelor’s thesis award (1984), the OOP- is also very vocal about all of this. Opponents SLA-1987 best paper award, the Ars Electronica 1995 of agents typically argue that well-designed award, the Interactive Media Festival Award 1995, the visualization interfaces are better. Like I said ArcTec 1995 award and so on. before, you still need a well-designed interface Finally, she is a founder of Firefly Network, Inc. in when incorporating agents in an application. Boston, Massachusetts—one of the first companies to com- However, some tasks I may just not want to mercialize software agent technology. do myself even if the interface was perfect. If my car had a perfect interface for fixing the engine, I still wouldn’t fix it. I just don’t want to bother with fixing cars. I want someone else to do it.

interactions...november + december 1997 53

A second criticism is that agents make the autonomous agents and presents an anthropo- user dumb. That’s actually more one of Jaron’s morphic vision. Even in the current proceed- objections rather than Ben’s, I think. To some ings her article is titled “Intelligent Software,” extent it’s true. If I don’t fix my car then I’m not so I was delighted with her opening remarks Opponents of going to learn about fixing cars. However, this that rejected intelligent and anthropomorphic agents typically does not constitute a problem. As long as there’s designs. The old Pattie Maes wrote “agents argue that well- always an agent available or I can call one by a will appear as living entities on the screen, designed visual- motor association like AAA, then that’s fine. It’s conveying their current state of behavior with ization too bad that I will never learn about cars, but I animated facial expression or body language interfaces are want to learn about other things instead. rather than windows text, graphics, and fig- better. Like I A third criticism expressed is that using ures.” So we’ve got two Pattie Maes. I will said before, you agents implies giving up all control. That’s incor- choose the newer one that demonstrates still need a well- rect. I think you do give up some control when movement in my direction including her last designed inter- you deal with an agent. I tell the car mechanic to slide which might have been written by me: face when fix my car or to fix this or that part of the car. I “User understanding is central, and user con- incorporating don’t know how exactly he or she is going to do trol is vital for people to be successful.” “agents in an that. I don’t mind giving up some control, actu- In fact, I have other ways of celebrating application. ally, and giving up control over the details as long Pattie Maes. I encourage you to look at her However, some as the job is done in a more-or-less satisfactory Firefly Web site, which is an interesting appli- tasks I may just way, and it saves me a lot of time. cation. Collaborative filtering, I think, will not want to do Okay, just very briefly, I want to say that I become an important approach for many myself even if think where the true challenge lies is in domains. But as a user, I can’t find the agents the interface designing the right user-agent interface. In on the Firefly Web site. In fact, as I searched was perfect. If particular, we need to take care of these two to find the agents, all I came up with was that my car had a issues: understanding and control. Under- the company had previously been called perfect interface standing means that the agent-user collabora- Agents, Inc. and is now called Firefly. If you for fixing the tion can only be successful if the user can read the Firefly Web site, you will not find the engine, I still understand and trust the agent, and control word “agents” in the description of this sys- wouldn’t fix it. means that users must be able to turn over tem. In fact, the interface is a quite lovely, I just don’t want control of tasks to agents but users must never direct manipulation environment, allowing to bother with feel out of control. I believe that this is a won- users to make choices by clicking in a very fixing cars. I derful interface-design challenge, and we have direct manipulation way. want someone come up with a lot of solutions to actually So, I think we’ve made progress in clarify- else to do it. make sure that the agent’s user interface has ing the issues in the past year of our ongoing these two properties that the user feels in con- discussions. For example I think we can sepa- trol or has control when he or she wants it, as rate out the issue of natural language interac- well as that the user understands what the tion, which as far as I can see, has not been a agent does and what its limitations are. Let me success. The systems that were offered com- save that for later, maybe. Thanks. mercially even a few years ago, like Q&A from Symantec or Intellect from AI Corporation to Jim: Thanks very much. Okay, Ben’s going to do database query, and Lotus HAL for spread- go up to 5 minutes to say whatever he likes. sheets, are gone, and direct manipulation is the surviving technology. direct Ben: How interesting. We are debating, but A second issue is anthropomorphic inter- manipulation part of me is drawn to the idea of celebrating faces such as chatty bank tellers and the Postal Pattie Maes and encouraging you to follow Buddy or the proposed Knowledge Navigator her example. I want to draw the audience’s of Apple’s 1988 video. Microsoft’s playful attention to her transformation during the attempt at a social interface in BOB is also a months we’ve had these discussions. As I go failed product. As far as I can see the anthro- vsinterface back to Pattie Maes’s work and I read her ear- pomorphic or social interface is not to be the agents lier papers and her Web sites, she promotes future of computing.

54 interactions...november + december 1997

debate

A third issue of adaptive interfaces is quite interesting. I would concede half a point and say that we now see two levels: the user inter- face level, which users want to be predictable and controllable, as Pattie has stated, and the level below the , where there may be some interesting algorithms such as collabora- tive filtering. If those can be adaptive there may be benefits in the way that Pattie describes. This is related to other adaptations such as when I save a file to disk, I see it saved, and it is retrievable by me. Under the table, there’s a great deal of adaptation dealing with space allocation, disk fragmentation, and compression strategies, but from the user’s point of view, there’s no adaptation. It’s quite predictable. The same goes for engines in

interactions...november + december 1997 55

automobiles: I turn the key, I step on the gas, tant to distinguish these different types of it goes. Underneath the hood, below the user’s agents and not lump them all together. concern, there are adaptive algorithms that Second, I absolutely agree with Ben that so will set the engine speed based on many fac- far the most successful interfaces are the ones tors, such as the temperature, gas mixture, etc. where the agents are pretty much invisible. That level of adaptivity is important as long as They are not visualized as a little anthropo- it does not interfere with the user’s prediction morphic character. For example, in Firefly, of behavior. that is the case. That doesn’t mean that there I am concerned about the anthropomor- isn’t an agent operating. For example, in Fire- phic representation: it misleads the designers, fly, the Firefly agent will proactively tell you I am concerned it deceives the users; it increases anxiety about about other users that you may want to talk about the computer usage, interferes with predictability, to. It will warn you when there is something confusion of reduces user control, and undermines users’ that has changed somewhere that you may be human and responsibility—which I think is central. I interested in. There is still an agent there machine think anthropomorphic representations monitoring all of your preferences and proac- capabilities. I destroy the users’ sense of accomplishment; I tively making recommendations to you, but make the basic think users want to have the feeling they did that doesn’t mean that there has to be this lit- assertion that the job—not some magical agent. tle cute character on the screen. people are not Finally I am concerned about the confu- Now, I think one of the reasons that Ben machines and sion of human and machine capabilities. I and I disagree is actually that we are focusing machines are make the basic assertion that people are not on completely different problem domains. In not people. I do machines and machines are not people. I do pretty much all of the problem domains that not think that not think that human-to-human interaction is Ben looks at we are dealing with a user who is “human-to-human a good model for the design of user interfaces. a professional user, and we are dealing with a interaction is a task domain that is very well structured and an good model for Jim: Okay, Pattie, would you like to respond information domain that is very well orga- the design of to that? nized, so that it lends itself to visualizing all of user interfaces. the different dimensions. The kind of prob- Pattie: First of all, I should clarify that lems that we have typically been dealing with autonomous agents or the word “agents” has a are very different because we are dealing with much broader meaning than the words “soft- end users who are not necessarily very trained. ware agent,” and my group at the Media Lab They may use the Web for a couple of hours does research on autonomous agents more per week, but that is about it. We are dealing generally, as well as software agents. So when with a very different information domain, an Ben was quoting from our Web site, he’s actu- information domain that may be very ill struc- ally quoting other work that we do, for exam- tured and very dynamic. For example, the ple, work on synthetic characters that can World Wide Web is actually sort of one of the interact with people in a virtual environment, key domains that we do all of our research on. which doesn’t have anything to do with the Finally, to illustrate that these approaches software agent’s work. In fact, it has less and aren’t necessarily incompatible, I could envi- less to do with it than it may have at one sion a version of Ben’s movie finder which uses point. So, if you go to autonomous agent con- Ben’s nice visualization interface, where an ferences, for example, the Agents Conference agent is continuously monitoring what movies direct in Marina del Rey in February (First Interna- you seem to be interested in. That agent may, manipulation tional Conference on Autonomous Agents, for example, highlight particular areas in the see interactions iii.6, Conference Preview), interface which it thinks you will be specifi- you’ll see work being discussed that relates to cally interested in. That kind of interface robots, autonomous robots. You’ll see work would actually combine an agent that learns about synthetic believable characters, and about your preferences and proactively makes vsinterface you’ll see work about software agents. So that’s suggestions to you, with a nice visualization agents one thing I wanted to respond to. It’s impor- interface. The reason why you want that kind

56 interactions...november + december 1997

debate of proactive software is that the user does not necessarily always want to have all of that con- trol when searching for a movie. I believe that users sometimes want to be couch-potatoes and wait for an agent to suggest a movie to them to look at, rather than using 4,000 slid- ers, or however many it is, to come up with a movie that they may want to see. Ben Shneiderman is a Professor in the ¥ Department of Question: Okay, I have a remark to Ben, which Computer Science, then transitions into a question for both speakers. Head of the Human- Ben, I was a little bit irritated, I think Pattie was Computer Interac- too, with your lovely presentation on information tion Laboratory, visualization, which seems to be entirely beside and Member of the Institute for Systems Research, all the point. As Pattie said, we will take the best at the University of Maryland at College Park. He has direct manipulation and visualization as we can taught previously at the State University of New York possibly get. It seems to me the contribution to this and at Indiana University. He regularly teaches popu- discussion would be some negative examples of lar short courses and organizes an annual satellite where agentlike things are bad. At some level, I television presentation on User Interface Strategies take the thrust of your position, Ben, as being seen by thousands of professionals since 1987. The reactionary, to put it in simpler words, sort of third edition of his book Designing the User Inter- fear-driven. I would like to test my theory by ask- face:tion (1987) has recently been published by Addi- ing the following question, which is, we have now son-Wesley Longman. a new medium on the interface of playing which Ben is the author of Software Psychology: Human is speech. Speech is now practical. IBM makes Factors in Computer and Information Systems (1980) speech systems that are being used and so on. Addison-Wesley Longman, Reading, MA and his 1989 My question actually to both of you is to see book, co-authored with Greg Kearsley, what your reactions are to this new technology. I Hands-On!, contains a hypertext version on two disks. predict that if you are going to have speech with He is the originator of the Hyperties hypermedia sys- a computer, first of all, research at Stanford tem, now produced by Cognetics Corp., Princeton recently has shown that once you have a comput- Junction, NJ. In addition he has co-authored two text- er talking you cannot prevent people from books, edited three technical books, published more anthropomorphizing the computer. I do not see than 190 technical papers and book chapters. His 1993 how you are going to have a coherent speech edited book Sparks of Innovation in Human-Computer interface without using human communication Interaction collects 25 papers from the past 10 years principles. So I predict that you will say just don’t of research at the University of Maryland. do it. I also want to hear Pattie‘s comments Ben has been on the Editorial Advisory Boards of about speech technology. nine journals including the ACM Transactions on Com- puter- Human Interaction and the ACM interactions. Ben: Do we have another half an hour here? I He edits the Ablex Publishing Co. book series on thought I said very positive things about Fire- “Human-Computer Interaction.” He has consulted and fly and its agent and adaptation, and I cer- lectured for many organizations including Apple, tainly like to see automaticity built into AT&T, Citicorp, GE, Honeywell, IBM, Intel, Library of interfaces that amplify the user’s capabilities. I Congress, NASA, NCR, and university research groups. have trouble with the words like “agents,” and “expert,” and “smart,” and “intelligent” because, they mislead the designer, and designers wind-up leaving out important things. In fact, I love Pattie’s slide up here. If the agent-oriented community would adopt

interactions...november + december 1997 57

these principles it would go a long way in in my research, the main reason being com- making me sympathetic. For example Pattie pletely personal—that these systems often writes “Make the user model available to the don’t understand my accent, but apart from user.” I don’t see that being done in most of that, I do agree that it is not a very high band- the work about agents. Explanations should width kind of connection. There is also a lot be available and methods of operation should of ambiguity. be understandable to the user. So much of the So I personally would like to see speech work in agentry goes against the principles. I being used just for situations where the hands like the new Pattie. I am ready to be partners are not available, like in your car or as an addi- and collaborate with the new Pattie. tional channel, actually, for example, giving an Now, to focus on speech. We have heard agent some additional advice while you are for 25 years the great hope and dream that also pointing at something. For example, speech is going to solve our user interface Henry Lieberman, sitting here in the audi- problems. Dreamers prophecy that the Star ence, did some interesting work where he User interfaces Trek scenario is going to take over and we will taught an agent a particular procedure, and should be talk to our computers. I do not believe that while he was performing actions with the predictable, so speech will be a generally usable tool. It has mouse, he would give speech inputs to tell the that users trust important niches: opportunities for disabled agent what it had to pay attention to. For them. User users, for certain hands-busy, eyes-busy, and example, pay attention to this corner here that interfaces mobility-required applications. In preparing I am dragging or to this side of the rectangle. should be thor- the third edition of my book I worked hard to So in that situation it is very useful because oughly tested, find speech applications that do recognition your hands are already doing something else, and users should effectively. I am quite happy with speech and you need that additional channel to con- be thoroughly store-and-forward applications by telephone, vey some more information in parallel. trained for all but the recognition paradigm is not being ¥ emergencies. widely accepted, even for minor tasks such as In emergency voice dialing. Speech output, except by tele- Question: This question is for both of you. Both “situations, phone, is also a problem because speech is of you seem to be concerned about protecting the people cannot very slow and disruptive of cognitive process- user’s control of the environment, but the one solve problems. ing. I think what annoys me the most about things studies have shown time and time again is They can only the devotees of speech, is their failure to take that users are very good at making mistakes. So do what’s in the scientific evidence that speaking com- how do your positions relate to time-critical deci- rehearsed and mands is cognitively more demanding than sion-support environments, such as medical sys- predictable. pointing. Speech uses your short-term memo- tems or cockpit systems? ry and working memory. By contrast, hand- eye coordination can be conducted in parallel Pattie: I have actually been focusing on a with problem solving by another part of your completely different kind of application, a brain and therefore does not degrade your per- type of application that is not as critical. For formance as much as speaking. example, if your World Wide Web agent gives you a wrong Web page to look at—it assumes ¥ that you are interested in a Web page and you Question: You can do both at the same time. are not—that is not at all critical. It is not a big deal. I have been focusing on that kind of Ben: Yes, you can do hand-eye tasks in paral- situation and those kinds of problems, the lel with problem solving, more easily that you ones where if there is an error it is not very can speak while problem solving. This fact is costly. I’ve been doing that because I believe not a barrier to use of speech, but it is a hur- that it will be very hard to make agents that dle that designers of speech systems must rec- always come up with the right answer, always ognize if they are to find ways to overcome it. do the right thing. I believe that there is a very Pattie: I must admit, I actually agree with a lot large set of these kind of applications where of what Ben says. I haven’t used speech at all things don’t have to be completely precise or

58 interactions...november + december 1997

debate

100 percent correct in order for the agent to talking about the wonderful visual processing be very, very useful to a user. and how important visualization is—this seems to be assuming that everybody has perfect vision. Ben: I like your question. I think it is an What if I am blind, if I am over sixty-five, and I have been extremely important research area. There is a I have a very small useful field of view and I focusing on that long history of work often called supervisory don’t notice things so much on the periphery, or if kind of situation control. Tom Sheridan is a key player in the I am, in my case, a person sitting right here who and those kinds area for nuclear-reactor control rooms, cock- is having a problem with my contact lenses, how of problems, the pits, and so on. I think the design of these sys- could you, Ben you were saying—well speech is ones where if tems is most effective when the users have a okay for disabled users—how are you going to there is an error clear predictive model of what their actions render that into speech for a blind per- it is not very will produce. If they do not know or are son, and what is the role of that? costly. I’ve been uncertain about what the results of their doing that actions are, they will disengage the automatic Ben: That is a legitimate concern. Direct because I system, as is the evidence with cockpit systems manipulation does benefit from and depends believe that it or nuclear-control rooms. So one danger is heavily on visual representations. For those will be very that in complex control-room environments who are vision challenged or blind, alterna-“hard to make when an emergency occurs, users are uncer- tives to visual displays are important. What agents that tain about its behavior. In these situations surprises me is there are great supporters of always come up they are likely to disengage a potentially help- direct manipulation in the visually-challenged with the right ful system and do what makes sense to them. community, because direct manipulation answer, always Therefore, as Jim Foley said, keep it simple, depends on spatial relationships. Blind users do the right very simple. often are strong at spatial processing. If you thing. I believe User interfaces should be predictable, so that can provide movement left, down, backward, that there is a users trust them. User interfaces should be forward, they can navigate fairly rich spaces in very large set of thoroughly tested, and users should be thor- efficient ways. these kind of oughly trained for all emergencies. In emer- I would say also you have been a little too applications gency situations, people cannot solve problems. quick about criticizing menu selection. The where things They can only do what’s rehearsed and pre- question is what would the alternative be, and don’t have to be dictable. It is a good topic that I would love to how might those menus be better designed? I completely see more attention to it by this community. do believe that fast and vast menus are a great precise or 100 benefit in many applications. percent correct ¥ in order for the Question: In the interest of brevity, I was going Jim: Okay. There are two more questions if agent to be to bring up several human limitations or con- you could be very brief, please. very, very useful straints on humans that make direct manipula- to a user. tion a little more interesting and wanted to ask ¥ you both for comments, not that I have all of the Question: This is directed more toward Ben answers, but I think I would just limit it to your than toward Pattie. Will this debate between basic kind of law that the more things you have direct manipulation and agency always exist in to scroll through or the more things you have to interface design or will it eventually be replaced search through, the longer it takes you to search. by some kind of fusion of the two approaches? In The idea of how do you deal with this, particu- other words, are we going to see new Bens and larly, for many people scrolling is not a particu- new Patties every day, or is there going to be some direct larly usable thing. I have particularly seen this kind of “Shneider-Maes”? manipulation with older people, and of course, if you want a good example just go to a fast food restaurant and Ben: I think it has been interesting to see how watch a new person at the cash register try to find the debates evolve. I certainly will point you how to ring up your hamburger. It is pretty ter- to the new Pattie, whom I am ready to cele- rible. You will be there for days. brate and be partners with, as I said, but I vsinterface That kind of idea, and particularly—we are think the debate will move on. I think it has agents

interactions...november + december 1997 59

matured in interesting ways from where we about user performance. We can deal with sat- were a year ago. I think we are all at the edge isfaction also, but please focus on user perfor- of looking at new interfaces, and so, as we mance and realistic tasks. Please, please, please push that envelope back, we are getting better do your studies—whether they are controlled understanding of the territory, of the strengths scientific experiments, studies, or and weakness of direct manipulation, of the simply observations, and get past the wishful strengths and weaknesses of agents, and where thinking and be a scientist and report on real they are appropriate. I am pleased by the users doing real tasks with these systems. That progress in the discussion. is my number one take-away message. I am here to I am here to promote direct manipulation promote direct Pattie: I think we both have changed. Would with comprehensible, predictable, and con- manipulation you agree to that or not? trollable actions. Direct manipulation designs with compre- promote rapid learning. It supports rapid per- hensible, pre- Jim: I think before they start kissing let’s move formance and low error rates while supporting dictable, and on. Last question. exploratory usage in positive ways. controllable Direct manipulation is a youthful concept ¥ actions. Direct which is still emerging in wonderful ways. manipulation Question: This starts out at least as a clarifica- Our current work leans to information visual- designs promote tion question for Dr. Shneiderman, but it may ization with dynamic queries, but there are rapid learning. go places from there. To what extent does direct people doing fascinating things with enriched It supports rapid manipulation, in your definition and in your control panels style sheets, and end-user pro- performance and view, admit autonomous system behavior? gramming. Graphical macros would be my “low error rates Because it seems to me that as soon as you admit favorite project to advance the design of gen- while supporting anything unexpected, uncontrolled, potentially eral computing tools. It is embarrassing that exploratory anthropomorphizable out of the system that you after 15 years of graphic user interface being usage in positive are interacting with, you have opened the door, widely available, we have no graphical macros ways. you have taken a step down the slippery slope tools. What is going on? This is the greatest toward agentness. opportunity for visual programming. Third—and I am answering your question Ben: Yeah. I am in favor of increased automa- here—I think the intelligent agent notion tion that amplifies the productivity of users limits the imagination of the designer, and it and gives them increased capabilities in carry- avoids dealing with interface issues. That’s my ing out their tasks, while preserving their sense view of the agent literature—there is insuffi- of control and their responsibility, responsibil- cient attention to the interface. Maybe the ity, responsibility. I am sort of not answering way agents will mature is as Pattie is suggest- your question because I don’t want to work in ing; that the agents take care of the processes the language you are dealing with. We should below the table, and there is a nice direct be thinking about productivity improvement manipulation interface that the user sees. tools for users, whether they are graphical A leading AI researcher commented to me macros, dynamic queries, starfield displays or that the 30 years of planning work in AI is other things. essentially down the tubes because of lack of attention to the user interface. The designers ¥ deliver a system and the first thing that the direct Question: Then what is it about agents that you users say is, “This is great but what we really manipulation dislike? want to do is change these parameters.” The designers say, “Well, you know, we didn’t put Ben: Can I go to my closing slide? I want to that in the interface.” They just haven’t reassert the importance of scientific evalua- thought adequately about the interface, nor tions. We must get past the argumentation done testing early enough. vsinterface about my system being more friendly than I believe that this language of “intelligent, agents yours or more natural or intuitive, and talk autonomous agents” undermines human

60 interactions...november + december 1997

debate responsibility. I can show you numerous arti- field definitely has grown a lot in the past 10 cles in the popular press which suggest the years or so. In fact, one of the ways I think in computer is the active and responsible party. which a lot of this agent work distinguishes We need to clarify that either programmers or itself from traditional AI is that agent research operators are the cause of computer failures. focuses on building complete systems, systems Agent promoters might shift some attention that are tested, systems that really have to to showing users what is happening so that work, and those same principles and method- they can monitor and supervise the perfor- ologies can be seen in all of the agents work, mance of agents. I was disturbed that in the whether it be robots, synthetic characters, or Autonomous Agents conference that Pattie is software agents. I believe that participating in, the organizers refused to The field is maturing and paying more there are real include the topics of supervision of agents and attention to building things that really work limits to what user interfaces for programming agents. By and paying attention to important UI issues. we can do with contrast, I like Pattie’s summary slide—I think As to people taking responsibility for their visualization and her list is quite wonderful. agents, I think they indeed should. It is soft- direct manipula- My closing comment is that I think there ware that is running on your behalf and that tion because our are exciting opportunities in these visual inter- you have delegated certain tasks to. So, per- computer envi- faces that give users greater control and there- sonally, I don’t see why that problem is specif- ronments are fore greater responsibility in the operation of ic to agents as opposed to software in general. becoming more computers. Thanks. and more com- plex. We cannot Jim: Thanks. Okay, Pattie? Jim: Okay, I would like to thank the Shnei- just add more derman–Maes team for coming here today“and more sliders Pattie: I want to conclude by saying that I and talking to us, and thank you all for par- and buttons. believe that there are real limits to what we ticipating. Also, there are can do with visualization and direct manipu- limitations lation because our computer environments are References because the becoming more and more complex. We can- [1] Ahlberg, C. and Shneiderman, B., Visual Informa- users are not not just add more and more sliders and but- tion Seeking: Tight coupling of dynamic query filters computer- tons. Also, there are limitations because the with starfield displays, Proceedings Of ACM CHI94 trained. So, I users are not computer-trained. So, I believe Conference (April 1994), 313-317 + color plates. believe that we that we will have to, to some extent, delegate [2] Ahlberg, C. and Shneiderman, B., AlphaSlider: A will have to, to certain tasks or certain parts of tasks to agents compact and rapid selector, Proceedings of ACM CHI94 some extent, that can act on our behalf or that can at least Conference, (April 1994), 365-371. delegate certain make suggestions to us. [3] North, C., Shneiderman, B., and Plaisant, C., User tasks or certain However, this is completely a complemen- controlled overviews of an image library: A case study parts of tasks to tary technique to well-designed interfaces— of the Visible Human, Proceedings 1st ACM Internation- agents that can visualization, and direct manipulation—not a al Conference on Digital Libraries, (March 1996), 74-82. act on our replacement. Users still need to be able to [4] Plaisant, C., Rose, A., Milash, B., Widoff, S., and behalf or that bypass the agent if they want to do that. Also Shneiderman, B., LifeLines: Visualizing personal histo- can at least I should say that what we have learned the ries, Proceedings of ACM CHI96 Conference (April make sugges- hard way really is that we have to, when 1996), 221-227, 518. tions to us. designing an agent, pay attention to user- interface issues, such as understanding and control. These are really very, very important PERMISSION TO COPY WITHOUT FEE, ALL OR PART OF THIS MATERIAL IS GRANT- if you want to build agents that really work ED PROVIDED THAT THE COPIES ARE NOT MADE OR DISTRIBUTED FOR DIRECT and that users can trust. The user has to be COMMERCIAL ADVANTAGE, THE ACM COPYRIGHT NOTICE AND THE TITLE OF able to understand what the agent does. THE PUBLICATION AND ITS DATE APPEAR, AND NOTICE IS GIVEN THAT COPYING

The user has to be able to control things if IS BY PERMISSION OF THE ASSOCIATION FOR COMPUTING MACHINERY. TO COPY they desire or to the extent that they want to OTHERWISE, OR PUBLISH, REQUIRES A FEE AND/OR SPECIFIC PERMISSION. control things. I agree with Ben that the agent ©ACM 1072-5520/97/1100 $3.50

interactions...november + december 1997 61