COMMUNICATIONS OF THE ACM | CACM.ACM.ORG | 06/2009 | VOL. 52 | NO. 06

One Laptop Per Child: Vision vs. Reality

Hard-Disk Drives: The Good, The Bad, and the Ugly
How CS Serves The Developing World
Network Front-End Processors
The Claremont Report On Database Research
Autonomous Helicopters

Association for Computing Machinery


Departments

ACM-W Letter
ACM-W Celebrates Women in Computing
By Elaine Weyuker

Letters To The Editor
Share the Threats

blog@CACM
Speech-Activated User Interfaces and Climbing Mt. Exascale
Tessa Lau discusses why she doesn’t use the touch screen on her in-car GPS unit anymore and Daniel Reed considers the future of exascale computing.

CACM Online
Making That Connection
By David Roman

Calendar

Careers

Last Byte

Puzzled
Solutions and Sources
By Peter Winkler

Future Tense
Webmind Says Hello
By Robert J. Sawyer

News

Micromedicine to the Rescue
Medical researchers have long dreamed of “magic bullets” that go directly where they are needed. With micromedicine, this dream could become a life-saving reality.
By Don Monroe

Content Control
Entertainment businesses say digital rights management prevents the theft of their products, but access control technologies have been a uniform failure when it comes to preventing piracy. Fortunately, change is on the way.
By Leah Hoffmann

Autonomous Helicopters
Researchers are improving unmanned helicopters’ capabilities to address regulatory requirements and commercial uses.
By Gregory Goth

Looking Backward and Forward
CRA’s Computing Community Consortium hosted a day-long symposium to discuss the important computing advances of the last several decades and how to sustain that track record of innovation.
By Bob Violino

Viewpoints

Privacy and Security
Answering the Wrong Questions Is No Answer
Asking the wrong questions when building and deploying systems results in systems that cannot be sufficiently protected against the threats they face.
By Eugene H. Spafford

Inside Risks
Reducing Risks of Implantable Medical Devices
A prescription to improve security and privacy of pervasive health care.
By Kevin Fu

The Profession of IT
Beyond Computational Thinking
If we are not careful, our fascination with “computational thinking” may lead us back into the trap we are trying to escape.
By Peter J. Denning

Viewpoint
Why “Open Source” Misses the Point of Free Software
Decoding the important differences in terminology, underlying philosophy, and value systems between two similar categories of software.

Kode Vicious
Obvious Truths
How to determine when to put the brakes on late-running projects and untested software patches.
By George V. Neville-Neil

Photograph courtesy of the Computing Research Association

Association for Computing Machinery
Advancing Computing as a Science & Profession

COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6

Practice

Hard-Disk Drives: The Good, the Bad, and the Ugly
New drive technologies and increased capacities create new categories of failure modes that will influence system designs.
By Jon Elerath

Network Front-end Processors, Yet Again
The history of NFE processors sheds light on the trade-offs involved in designing network stack software.
By Mike O’Dell

Whither Sockets?
High bandwidth, low latency, and multihoming challenge the sockets API.
By George V. Neville-Neil

Article development led by queue.acm.org

Contributed Articles

The Claremont Report on Database Research
By Rakesh Agrawal, Anastasia Ailamaki, Philip A. Bernstein, Eric A. Brewer, Michael J. Carey, Surajit Chaudhuri, AnHai Doan, Daniela Florescu, Michael J. Franklin, Hector Garcia-Molina, Johannes Gehrke, Le Gruenwald, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, Hank F. Korth, Donald Kossmann, Samuel Madden, Roger Magoulas, Beng Chin Ooi, Tim O’Reilly, Raghu Ramakrishnan, Sunita Sarawagi, Michael Stonebraker, Alexander S. Szalay, and Gerhard Weikum

One Laptop Per Child: Vision vs. Reality
By Kenneth L. Kraemer, Jason Dedrick, and Prakul Sharma

Review Articles

How Computer Science Serves the Developing World
By M. Bernardine Dias and Eric Brewer

Research Highlights

Technical Perspective
Reframing Security for the Web
By Andrew Myers

Securing Frame Communication in Browsers
By Adam Barth, Collin Jackson, and John C. Mitchell

Technical Perspective
Software and Hardware Support for Deterministic Replay of Parallel Programs
By Norman P. Jouppi

Two Hardware-based Approaches for Deterministic Multiprocessor Replay
By Derek R. Hower, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas

Virtual Extension

As with all magazines, page limitations often prevent the publication of articles that might otherwise be included in the print edition. To ensure timely publication, ACM created Communications’ Virtual Extension (VE). VE articles undergo the same rigorous review process as those in the print edition and are accepted for publication on their merit. These articles are now available to ACM members in the Digital Library.

Deriving Mutual Benefits from Offshore Outsourcing
Amar Gupta

Advancing Information Technology in Health Care
Steven M. Thompson and Matthew D. Dean

The Challenge of Epistemic Divergence in IS Development
Mark Lycett and Chris Partridge

Hyperlinking the Work for Self-Management of Flexible Workflows
Jonghun Park and Kwanho Kim

Re-Tuning the Music Industry: Can They Re-Attain Business Resonance?
Sudip Bhattacharjee, Ram D. Gopal, James R. Marsden, and Ramesh Sankaranarayanan

A Holistic Framework for Knowledge Discovery and Management
Dursun Delen and Suliman Al-Hawamdeh

Forensics of Computers and Handheld Devices: Identical or Fraternal Twins?
Nena Lim and Anne Khoo

Technical Opinion
Leveraging First-Mover Advantages in Internet-based Consumer Services
T.P. Liang, Andrew J. Czaplewski, Gary Klein, and James J. Jiang

About the Cover:
The One Laptop Per Child vision is being overwhelmed by the reality of business, politics, logistics, and competing interests worldwide. The photo illustration on the cover is adapted from OLPC photos taken in the Gobi Desert.
Illustration by Superbrothers

COMMUNICATIONS OF THE ACM
A monthly publication of ACM Media

Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields. Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional. Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology, and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications, public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts, sciences, and applications of information technology.

ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO: John White
Deputy Executive Director and COO: Patricia Ryan
Director, Office of Information Systems: Wayne Graves
Director, Office of Financial Services: Russell Harris
Director, Office of Membership: Lillian Israel
Director, Office of SIG Services: Donna Cappo

ACM COUNCIL
President: Wendy Hall
Vice-President: Alain Chesnais
Secretary/Treasurer: Barbara Ryder
Past President: Stuart I. Feldman
Chair, SGB Board: Alexander Wolf
Co-Chairs, Publications Board: Ronald Boisvert, Holly Rushmeier
Members-at-Large: Carlo Ghezzi; Anthony Joseph; Mathai Joseph; Kelly Lyons; Bruce Maggs; Mary Lou Soffa
SGB Council Representatives: Norman Jouppi; Robert A. Walker; Jack Davidson

PUBLICATIONS BOARD
Co-Chairs: Ronald F. Boisvert and Holly Rushmeier
Board Members: Gul Agha; Michel Beaudouin-Lafon; Jack Davidson; Nikil Dutt; Carol Hutchins; Ee-Peng Lim; M. Tamer Ozsu; Vincent Shen; Mary Lou Soffa; Ricardo Baeza-Yates

ACM U.S. Public Policy Office
Cameron Wilson, Director
1100 Seventeenth St., NW, Suite 50
Washington, DC 20036 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Chris Stephenson, Executive Director
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (800) 401-1799; F (541) 687-1840

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481

STAFF
Group Publisher: Scott E. Delman, [email protected]
Executive Editor: Diane Crawford
Managing Editor: Thomas E. Lambert
Senior Editor: Andrew Rosenbloom
Senior Editor/News: Jack Rosenberger
Web Editor: David Roman
Editorial Assistant: Zarina Strakhan
Rights and Permissions: Deborah Cotton
Art Director: Andrij Borys
Associate Art Director: Alicia Kubista
Assistant Art Director: Mia Angelica Balaquiot
Production Manager: Lynn D’Addesio
Director of Media Sales: Jennifer Ruzicka
Marketing & Communications Manager: Brian Hebert
Public Relations Coordinator: Virginia Gold
Publications Assistant: Emily Eng

Columnists: Alok Aggarwal; Phillip G. Armour; Martin Campbell-Kelly; Michael Cusumano; Peter J. Denning; Shane Greenstein; Mark Guzdial; Peter Harsha; Leah Hoffmann; Mari Sako; Pamela Samuelson; Gene Spafford; Cameron Wilson

CONTACT POINTS
Copyright permission: [email protected]
Calendar items: [email protected]
Change of address: [email protected]
Letters to the Editor: [email protected]

WEB SITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/guidelines

ADVERTISING
ACM Advertising Department
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 869-7440
F (212) 869-0481
Director of Media Sales: Jennifer Ruzicka, [email protected]
Media Kit: [email protected]

EDITORIAL BOARD
Editor-in-Chief: Moshe Y. Vardi, [email protected]

NEWS
Co-chairs: Marc Najork and Prabhakar Raghavan
Board Members: Brian Bershad; Hsiao-Wuen Hon; Mei Kobayashi; Rajeev Rastogi; Jeannette Wing

VIEWPOINTS
Co-chairs: Susanne E. Hambrusch; John Leslie King; J Strother Moore
Board Members: P. Anandan; William Aspray; Stefan Bechtold; Judith Bishop; Soumitra Dutta; Stuart I. Feldman; Peter Freeman; Seymour Goodman; Shane Greenstein; Mark Guzdial; Richard Heeks; Richard Ladner; Susan Landau; Carlos Jose Pereira de Lucena; Helen Nissenbaum; Beng Chin Ooi; Loren Terveen

PRACTICE
Chair: Stephen Bourne
Board Members: Eric Allman; Charles Beeler; David J. Brown; Bryan Cantrill; Terry Coatta; Mark Compton; Benjamin Fried; Pat Hanrahan; Marshall Kirk McKusick; George Neville-Neil
The Practice section of the CACM Editorial Board also serves as the Editorial Board of .

CONTRIBUTED ARTICLES
Co-chairs: Al Aho and Georg Gottlob
Board Members: Yannis Bakos; Gilles Brassard; Alan Bundy; Peter Buneman; Carlo Ghezzi; Andrew Chien; Anja Feldmann; Blake Ives; James Larus; Igor Markov; Gail C. Murphy; Shree Nayar; Lionel M. Ni; Sriram Rajamani; Jennifer Rexford; Marie-Christine Rousset; Avi Rubin; Abigail Sellen; Ron Shamir; Marc Snir; Larry Snyder; Manuela Veloso; Michael Vitale; Wolfgang Wahlster; Andy Chi-Chih Yao; Willy Zwaenepoel

RESEARCH HIGHLIGHTS
Co-chairs: David A. Patterson and Stuart J. Russell
Board Members: Martin Abadi; Stuart K. Card; Deborah Estrin; Shafi Goldwasser; Maurice Herlihy; Norm Jouppi; Andrew B. Kahng; Linda Petzold; Michael Reiter; Mendel Rosenblum; Ronitt Rubinfeld; David Salesin; Lawrence K. Saul; Guy Steele, Jr.; Gerhard Weikum; Alexander L. Wolf

WEB
Co-chairs: Marti Hearst and James Landay
Board Members: Jason I. Hong; Jeff Johnson; Greg Linden; Wendy E. MacKay

BPA Audit Pending

ACM Copyright Notice
Copyright © 2009 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
Annual subscription cost is included in the society member dues of $99.00 (for students, cost is included in $42.00 dues); the nonmember annual subscription rate is $100.00.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current Advertising Rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0654.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact [email protected].

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to Communications of the ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA

Printed in the U.S.A.

acm-w letter

DOI:10.1145/1516046.1516047
Elaine Weyuker

ACM-W Celebrates Women in Computing

Computer science is no longer the hot, high-enrollment field it once was.

This is not news. While many suggestions have been made for increasing enrollments, it is unlikely that computer science will ever be as vibrant as it could be, and should be, as long as a large portion of the talent pool remains underrepresented. After all, if we are missing the best and the brightest of a group who can offer exciting ideas that would enrich the field, computer science suffers. In addition, different groups often present different perspectives, a scenario completely lost when we do not encourage diversity.

With this in mind, the mission of the ACM Women’s Council (ACM-W) is to inform and support women in computing. Since ACM is an international organization, this means developing programs with a worldwide reach, with something for each of ACM’s very broad constituencies: K–12 students, undergraduates at liberal arts and research institutions, master’s and Ph.D. students, faculty from all types of institutions, and women in industry and government working as computer practitioners and researchers. Increasingly, we strive to partner both with other segments within ACM and other organizations dedicated to improving gender diversity.

Some of our active programs include scholarships to help women students attend research conferences. This effort is not aimed at the advanced Ph.D. student who has already committed to a career in academia or industrial research. Rather we look to support the undergraduate woman by giving her a chance to see the types of options available and encourage her to continue on to graduate school. Similarly we hope to encourage the master’s student to aim for a Ph.D. We offer up to 20 $500 scholarships per year. Moreover, we have recently asked the ACM’s Special Interest Groups (SIGs) to partner with us by offering scholarship recipients complimentary registration as well as provide conference mentors to help them learn the ropes. We are thrilled by the response we have received from many of the SIGs.

Another program involving SIG cooperation is our Athena Lecturer Award honoring the most outstanding women scholars. It was established to address the fact that women are often overlooked when nominations are considered for advanced membership grades or awards. The goal of the Athena Lecturer Award is to celebrate women’s scholarship and technical contributions to the field as well as increase the visibility of women scholars. Rather than asking for individual nominations, each SIG is invited to nominate their most outstanding women scholars. By using this format, we encourage SIGs to think about promoting women in the field, and hopefully remember these women when they are nominating people for other awards or selecting keynote speakers or program chairs for future conferences.

Many readers will be familiar with the Grace Hopper Celebration of Women. To keep the Hopper momentum going throughout the year, ACM-W offers regional Hopper-like events designed to attract attendees within a two-hour driving radius of each other. Not only does this make it relatively inexpensive to attend meetings since students and faculty often travel together, the proximity also helps establish and maintain a local community of women pursuing a common goal. We have sponsored quite a number of these meetings both within the U.S. and Australia, with one being planned in Turkey.

Another unique ACM-W initiative is the Ambassador program in which a woman serves as the Ambassador from her country and shares information about the climate there for women in computing. At times we have had representatives from six different continents. We are now developing our first internationally distributed program aimed at attracting middle school girls to computer science by adapting a successful program to several different cultures.

This is just a sampling of the many programs within ACM-W created to promote and further advance women in the computing field. Readers are encouraged to visit our Web site at http://women.acm.org to learn about the full range of programs and initiatives offered. ACM-W is an all-volunteer organization open to anyone interested in improving gender diversity. If you see a project that interests you, please consider volunteering. If you have an idea for a new project, let us know. Take a look at our newsletter to see project details, read interviews with outstanding women, and learn about upcoming events.

Diversity is not the problem of the underrepresented group. It is everyone’s problem. If we want our field to grow and flourish, we need the contribution of talented people of all types.

Elaine Weyuker is chair of ACM-W and is a researcher at AT&T Labs specializing in empirical software engineering and testing research.

© 2009 ACM 0001-0782/09/0600 $10.00

ACM Senior Members

Additional ACM Senior Members will be included in an upcoming issue. http://seniormembers.acm.org

THE ACM A.M. TURING AWARD

BY THE COMMUNITY... FROM THE COMMUNITY... FOR THE COMMUNITY...

ACM, INTEL, AND GOOGLE CONGRATULATE BARBARA H. LISKOV FOR HER FOUNDATIONAL INNOVATIONS IN PROGRAMMING LANGUAGE DESIGN THAT HAVE MADE SOFTWARE MORE RELIABLE AND HER MANY CONTRIBUTIONS TO BUILDING AND INFLUENCING THE PERVASIVE COMPUTER SYSTEMS THAT POWER DAILY LIFE.

“Intel is a proud sponsor of the ACM A. M. Turing Award, and is pleased to join the community in congratulating this year’s recipient, Professor Barbara Liskov. Her contributions lie at the foundation of all modern programming languages and complex distributed software. Barbara’s work consistently reflects rigorous problem formulation and sound mathematics, a potent combination she used to create lasting solutions.”

Andrew A. Chien
Vice President, Corporate Technology Group
Director, Intel Research

For more information see www.intel.com/research.

“Google is delighted to help recognize Professor Liskov for her research contributions in the areas of data abstraction, modular architectures, and distributed computing fundamentals, areas of fundamental importance to Google. We are proud to be a sponsor of the ACM A. M. Turing Award to recognize and encourage the research that is essential not only to computer science, but to all the fields that depend on its continued advancement.”

Alfred Z. Spector
Vice President, Research and Special Initiatives, Google

For more information, see http://www.google.com/corporate/index.html and http://research.google.com/.

Financial support for the ACM A. M. Turing Award is provided by Intel Corporation and Google.


letters to the editor

DOI:10.1145/1516046.1516049

Share the Threats

Othman El Moulat’s comment “What Role for Computer Science in the War on Terror?” (Apr. 2009) concerning the article “The Topology of Dark Networks” by Jennifer Xu and Hsinchun Chen (Oct. 2008) that the views and articles in Communications should have no bearing on or bias toward any agenda, political or religious, is a point well taken. However, in light of the security breaches occurring throughout the digital world, any information that exposes threats should indeed be well received and published wherever it is relevant to technologists and security specialists, as in Communications.

It is reasonable to suspect that potential terrorist cells or factions willingly and wantonly seek ways to destroy Western technologies and organizations. An article aimed at exposing threats or educating the public on future threats does not in any way target a specific race, creed, or religion.

I applaud the authors of “The Topology of Dark Networks” and hope Communications continues to keep us up to date with factual articles of this nature. Organizations that are concerned with their own beliefs, traditions, and objectives should be willing to transparently share their interests with the rest of the world.

John Orlock, IL

Virtualization Still Evolving

Kirk L. Kroeker’s news article “The Evolution of Virtualization” (Mar. 2009) took a limited view of its subject. I contrast it with how my company uses virtual machines for several quite practical purposes:

Software testing. Rather than build a test environment, then rebuild it after a series of tests, we set it up with a set of baseline virtual machines (perhaps client/server), then run our tests. This way when we finish testing, we copy back over the baseline virtual machine and are ready for the next round of testing;

Customer support. We look to mimic customer configurations in a set of virtual machines, aiming to verify or refute a problem a customer might be having and possibly provide a workaround or new build of the software. If this scenario turns out to be common, we roll it into our testing sandbox;

Project services. When trying to remotely configure and build a solution for a customer, we first build it in a virtual machine, then apply the solution and test. This process also greatly improves delivery of the solution; and

Software demonstration. We build our demos in a virtual machine, making it much easier for us to get them out to field personnel.

Jerry Walter, Troy, OH

How to Define the Granularity of Properties and Functions

I was confused about the discussion of properties and functions in Daniel Jackson’s review article “A Direct Path to Dependable Software” (Apr. 2009). Jackson seemed to be saying that properties are more fine-grain than functions yet also that a property cuts across several functions at the same time. Doesn’t this imply that properties are coarse-grain, assuming they transcend several functions?

Trying to resolve my questions with the help of Webster’s dictionary, I learned that a function is a “factor” and a property is any attribute or characteristic. So functions and properties can be both fine- and coarse-grain, depending on the assumptions of abstraction inherent in the mind of the author.

Does Jackson view a “function” as a modularity construct in a programming language? Does he mean that properties are those factors or attributes (“functions” if you will) that are independent of the software’s special-case implementation?

I may still be confused, but trying to infer Jackson’s meaning led me to conclude the following: Fine-grain attention to the software’s behavior-level characteristics (including properties, functions, or whatever abstractions a developer is using) is important in defining an effective dependability case. Is this correct?

CJ Fearnley, Upper Darby, PA

Author’s Response:

Requirements traditionally break the behavior of a system into a collection of functions, each describing in full some feature of the system. A radiotherapy machine might, for example, offer functions to recall a patient’s prescribed dose from a database; set the equipment to deliver a given dose; activate the equipment; and so on. Prioritizing functions isn’t very useful, because the critical aspects of a system typically involve many functions, though often not in their entirety. A property, on the other hand, describes an expected observation of the system’s behavior and can be expressed at any level of granularity: that, for example, some of the dose delivered to a patient never exceeds some fixed limit; that a patient receives his or her prescribed dose within some tolerance; that the dose delivered and the dose logged always match; and so on. So a property can at the same time be more fine-grain than a single function (since it describes the function only partially) and cut across multiple functions.

Daniel Jackson, Cambridge, MA

Corrections

In the Q&A “Our Dame Commander” (Apr. 2009), Leah Hoffmann described Wendy Hall as “the third female president of ACM.” Hall is the sixth, preceded by: Jean Sammet (1974–1976), Adele Goldberg (1984–1986), Gwen Bell (1992–1994), Barbara Simons (1998–2000), and Maria Klawe (2002–2004).

The photographs of the Rebooting Computing Summit (Apr. 2009) were taken by Richard P. Gabriel (page 2) and by Mary Bronzan (page 19).

Communications welcomes your opinion. To submit a Letter to the Editor, please limit your comments to 500 words or less and send to [email protected].

© 2009 ACM 0001-0782/09/0600 $10.00

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 9 blog@cacm

The Communications Web site, cacm.acm.org, features 13 bloggers in the BLOG@CACM community. In each issue of Communications, we’ll publish excerpts from selected posts, plus readers’ comments.

DOI:10.1145/1516046.1516072 cacm.acm.org/blogs/blog-cacm

Speech-Activated User Interfaces and Climbing Mt. Exascale
Tessa Lau discusses why she doesn’t use the touch screen on her in-car GPS unit anymore and Daniel Reed considers the future of exascale computing.

From Tessa Lau’s “Hello, Computer”
Four years ago when I bought my first in-car Global Positioning System (GPS) unit, it felt like a taste of the future. The unit knew where I was, and regardless of how many wrong turns I made, it could tell me how to get where I wanted to go. It was the ultimate adaptive interface: No matter where I started, it created a customized route that would lead me to my destination.

Alas, my first GPS unit met an untimely end in a theft involving a dark night, an empty street, and a smashed window.

My new GPS, a Garmin nüvi 850, comes with a cool new feature: speech-activated controls.

Speech recognition brings a new dimension to the in-car human-computer interface. When you’re driving, you’re effectively partially blind and have no hands. Being able to talk to the computer and instruct it using nothing but your voice is amazingly empowering, and makes me excited about the future of voice-based interfaces.

The nüvi’s interface is simple and well designed. There’s a wireless, button-activated microphone that you mount to your steering wheel. When you activate the mic, a little icon appears on the GPS screen to indicate that it’s listening, and the GPS plays a short “I’m listening” tone. You can speak the names of any buttons that appear on the screen or one of the always-active global commands (e.g., “main menu,” “music player,” or “go home”). Musical tones indicate whether the GPS has successfully interpreted your utterance. If it recognized your command, it takes you to the next screen and verbally prompts you for the next piece of information (e.g., the street address of your destination). Most of the common GPS functionality can be activated via spoken confirmations without even looking at the screen.

Lists (e.g., of restaurant names) are annotated with numbers so you only have to speak the number of the item you want from the list. However, it also seems to correctly recognize the spoken version of anything in the list, even if it’s not displayed on the current screen (e.g., the name of an artist in the music player).

In my tests it’s been surprisingly accurate at interpreting my speech, despite the generally noisy environment on the road.

What has surprised me the most about this interface is that the voice-based control is so enjoyable and fast that I don’t use the touch screen anymore. Speech recognition, which had been in the realm of artificial intelligence for decades, has finally matured to the point where it’s now reliable enough for use in consumer devices.

Part of the power of the speech-activated user interface comes from the ability to jump around in the interface by spoken word. Instead of having to navigate through several different screens by clicking buttons, you can jump straight to the desired screen by speaking its name. It’s reminiscent of the difference between graphical user interfaces (GUIs) and command lines; GUIs are easier to learn, but once you master them, command lines offer more efficiency and power. As is the case with command lines, it takes some experimentation to discover what commands are available when; I’m still learning about my GPS and how to control it more effectively.

Kudos, Garmin, you’ve done a great job with the nüvi 850. I can’t wait to see what the future will bring! (Voice-based access to email on the road? It seems almost within reach.)

Disclaimer: The views expressed here do not necessarily reflect the views of my employer, ACM, or any other entity besides myself.

Reader’s comment:
Information I’ve read lately on the topic of speech recognition indicates that a device’s ability to correctly recognize commands depends in large measure on the quietness of the environment. I have often found that voice systems on my cell phone don’t work well unless I find a quiet place to access them. So it is good to hear that Garmin has found an effective way to interpret commands while driving, an environment that you note can be noisy.
As you speak of future enhancements, it brings up the issue of what drivers should be able to do while on the road. Multitasking is great, but I’m not sure email while driving is such a good idea…
Debra Gouchy

From Daniel Reed’s “When Petascale Is Just Too Slow”
It seems as if it were just yesterday when I was at the National Center for Supercomputing Applications and we deployed a one teraflop Linux cluster as a national resource. We were as excited as proud parents by the configuration: 512 dual processor nodes (1 GHz Intel Pentium III processors), a Myrinet interconnect, and (gasp) a stunning 5 terabytes of RAID storage. It achieved a then-astonishing 594 gigaflops on the High-Performance LINPACK benchmark, and was ranked 41st on the Top500 list.

The world has changed since then. We hit the microprocessor power (and clock rate) wall, birthing the multicore era; vector processing returned incognito, renamed as graphical processing units (GPUs); terabyte disks are available for a pittance at your favorite consumer electronics store; and the top-ranked system on the Top500 list broke the petaflop barrier last year, built from a combination of multicore processors and gaming engines. The last is interesting for several reasons, both sociological and technological.

Petascale Retrospective
On the sociological front, I remember participating in the first petascale workshop at Caltech in the 1990s. Seymour Cray, Burton Smith, and others were debating future petascale hardware and architectures, a second group was debating device technologies, a third was discussing application futures, and a final group of us was down the hall debating future software architectures. All this was prelude to an extended series of architecture, system software, programming models, algorithms, and applications workshops that spanned several years and multiple retreats.

At the time, most of us were convinced that achieving petascale performance within a decade would require new architectural approaches and custom designs, along with radically new system software and programming tools. We were wrong, or at least so it superficially seems. We broke the petascale barrier in 2008, using commodity x86 microprocessors and GPUs, InfiniBand interconnects, minimally modified , and the same message-based programming model we have been using for the past 20 years.

However, as peak system performance has risen, the number of users has declined. Programming massively parallel systems is not easy, and even terascale computing is not routine. Horst Simon explained this with an interesting analogy, which I have taken the liberty of elaborating slightly. The ascent of Mt. Everest by Edmund Hillary and Tenzing Norgay in 1953 was heroic. Today, amateurs still die each year attempting to replicate the feat. We may have scaled Mt. Petascale, but we are far from making it pleasant or even a routine weekend hike.

This raises the real question: Were we wrong in believing different hardware and software approaches would be needed to make petascale computing a reality? I think we were absolutely right that new approaches were needed. However, our recommendations for a new research and development agenda were not realized. At least, in part, I believe this is because we have been loath to mount the integrated research and development needed to change our current hardware/software ecosystem and procurement models.

Exascale Futures
Evolution or revolution, it’s the persistent question. Can we build reliable exascale systems from extrapolations of current technology or will new approaches be required? There is no definitive answer as almost any approach might be made to work at some level with enough heroic effort. The bigger question is: What design would enable the most breakthrough scientific research in a reliable and cost-effective way?

My personal opinion is that we need to rethink some of our dearly held beliefs and take a different approach. The degree of parallelism required at exascale, even with future many-core designs, will challenge even our most heroic application developers, and the number of components will raise new reliability and resilience challenges. Then there are interesting questions about many-core memory bandwidth, achievable system bisection bandwidth, and I/O capability and capacity. There are just a few programmability issues as well!

I believe it is time for us to move from our deus ex machina model of explicitly managed resources to a fully distributed, asynchronous model that embraces component failure as a standard occurrence. To draw a biological analogy, we must reason about systemic organism health and behavior rather than cellular signaling and death, and not allow cell death (component failure) to trigger organism death (system failure). Such a shift in world view has profound implications for how we structure the future of international high-performance computing research, academic/government/industrial collaborations, and system procurements.

Tessa Lau is a research staff member at IBM Almaden Research Center in San Jose, CA. Daniel Reed is director of scalable and multicore systems at Research in Redmond, WA.

© 2009 ACM 0001-0782/09/0500 $10.00

cacm online

DOI:10.1145/1516046.1516050 David Roman

Making that Connection
The goal of holding readers’ attention has made provocation a timeworn editorial strategy. Communications doesn’t resort to screaming headlines like most storefront fare, but it does strive to publish eye-catching imagery for its must-read articles. This month’s cover story, “One Laptop Per Child: Vision vs. Reality,” with its title’s inherent tension, is a case in point.

Communications also aims for authority; its articles can be a beginning as much as an end. The “Viewpoints” pages, for example, may introduce unsettled and unsettling ideas that prompt readers to react and respond not only to the editorial but to each other. Indeed, the recent debate on network neutrality, first presented in the pages of the February issue, continued into the May issue, and it’s hardly over yet.

You can be a part of this debate at cacm.acm.org. Communications’ Web site invites and lends itself to quick feedback via the “User Comments” feature that allows a continued conversation about a topic. Reachable from the “Tools for Readers” at the top right of each article page, and at the bottom of every article page, the feature requires a simple sign-in (so we can follow who’s speaking). From there, readers are welcome to present what Editor-in-Chief Moshe Vardi calls “well-reasoned and well-argued opinions” to keep the discussion lively. I encourage all readers to start or join an online discussion.

Wanted: Expert Bloggers
Ever consider yourself a blogger? If so, we should talk. Communications wants to expand its ever-evolving roster of expert bloggers. Experience is a plus but credentials and passion are equally important. The level of commitment we require is open-ended; if you are willing to work with us, we will accommodate your schedule. If you are interested but cannot add it to your workload at the moment, we could put you on our future schedule or at least get you on our radar.

In addition, if you follow the blogs of someone you consider a good fit for Communications, we’d like to hear your recommendations. Contact us at [email protected].

ACM Member News

EMER WINS ECKERT-MAUCHLY AWARD
ACM and the IEEE Computer Society will jointly present the Eckert-Mauchly Award to Joel Emer, director of microarchitecture research at Intel, for pioneering contributions to performance analysis, modeling methodologies, and design innovations in several significant industry microprocessors. Emer developed quantitative methods including measurement of real machines, analytical modeling, and simulation techniques that are now widely employed to evaluate the performance of complex computer processors. Emer will receive the 2009 Eckert-Mauchly Award, the most prestigious award in the computer architecture community, at the International Symposium on Computer Architecture, June 20–24, in Austin, TX.
Photograph courtesy of the Intel Corporation.

EGGERS RECEIVES ATHENA LECTURER AWARD
Susan Eggers, a professor of computer science and engineering at the University of Washington, has won ACM’s 2009–2010 Athena Lecturer Award. Eggers’ work on computer architecture and experimental performance analysis led to the development of Simultaneous Multithreading, the first commercially viable multithreaded architecture. This technique improves the overall efficiency of certain processors known as superscalar and has been adopted by Intel, IBM, and others.

WHITNEY RECOGNIZED FOR DISTINGUISHED SERVICE
ACM presented the Distinguished Service Award to Telle Whitney for her profound impact on the participation of women in computing. Whitney, president and CEO of the Anita Borg Institute for Women and Technology, cofounded the Grace Hopper Celebration of Women in Computing, which has grown into an annual event. The conference is widely recognized as one of the best ways to encourage women to major in computing, continue on to graduate school, and pursue a career in computing.

news

Science | DOI:10.1145/1516046.1516051 Don Monroe

Micromedicine to the Rescue
Medical researchers have long dreamed of “magic bullets” that go directly where they are needed. With micromedicine, this dream could become a life-saving reality.

A headache or other pain will send many of us to the medicine cabinet for a pain reliever. Molecules from the swallowed pill quickly find their way directly to the source of the pain. But how do they know where to go? Of course, they don’t; the molecules travel throughout the body, chemically reacting wherever they can.

The consequences of “broadcasting” drugs to the whole body are profound. Drugs that attack rogue, cancer-causing cells also afflict other dividing cells, such as those in the intestine. In fact, chemotherapy doses are often reduced to avoid nausea and other unpleasant side effects, and other, more powerful drugs are too toxic to even be considered.

Researchers have long dreamed of “magic bullets” that go directly where they are needed. Indeed, many current drugs are formulated to be taken up by particular tissues, and nanotechnology is giving researchers even more delivery options. But what if the delivery system could “diagnose” the local conditions? In contrast to today’s “dumb envelopes,” Ehud Shapiro, a computer scientist and biological chemist at the Weizmann Institute of Science in Rehovot, Israel, likens this approach to a “smart envelope.” The envelope “would open up only at the right place and the right time for the specific action,” such as releasing a potent but toxic cancer drug, he says. “This would open up a whole range of molecules that are totally inaccessible today as drugs.”

An example of a test-tube “molecular computer” created by Ehud Shapiro and colleagues. Input: identification of disease indicators (mRNA disease indicators). Computation: diagnosis. Output: drug administration (ssDNA drug). Adapted from Yaakov Benenson, Binyamin Gil, Uri Ben-Dor, Rivka Adar & Ehud Shapiro, Nature 429, 423-429 (27 May 2004).

In addition to delivering drugs, microscopic agents could transform the regeneration of damaged tissues and the diagnosis of disease. The time until these ideas help patients is “probably measured in decades, not in years,” Shapiro admits. Long before that, however, researchers could use the new tools to explore biology in the lab. The challenge of engineering biology, rather than merely observing it, could yield powerful insights into how biological systems work.

Hijacking Biology
Recent years have been revolutionary for biology. The human genome, as well as computer-based tools that measure thousands of biological chemicals simultaneously, have inundated biologists with data about how these chemicals interact to create the processes of life. An eager group of researchers around the world take this data glut as a challenge to build new biological circuits from scratch, in what is known as “synthetic biology.” Using various strategies, they are assembling pieces

that might enable completely new approaches to medical technology.

In 2004, for example, Shapiro and his colleagues created a test-tube “molecular computer” consisting of three interconnected modules. The first module sensed the concentrations of four types of messenger RNA, the working copies of the genetic instructions in DNA, which are used to produce proteins. The second module performed a “diagnosis,” computing whether two of the messenger RNA levels decreased while two others increased, a signature that might indicate a disease. Depending on the results of the computation, the third module dispensed a drug molecule. “We demonstrated the whole process, beginning to end,” Shapiro notes, “but in a test tube.”

To both sense specific strands of messenger RNA and to perform the computation, the Weizmann researchers exploited the sequence-specific matching of DNA strands. So far, though, they have not operated their molecular computer in the complex environment of a living cell. Other teams have had success with different schemes. For instance, a group including Shapiro’s former collaborator Yaakov Benenson, now a researcher at the FAS Center for Systems Biology at Harvard University, demonstrated computation, but not sensing, in cultured human kidney cells. They exploited the newly discovered phenomenon of RNA interference, in which the presence of short RNA templates activates cellular mechanisms that suppress protein synthesis for matching messenger RNA.

A Caltech team led by Christina Smolke, now a professor of bioengineering at Stanford University, designed complex RNA molecules that included three separate sections, performing sensing, computation, and actuation. Although all three modules are part of the same molecule, they act independently, so the function of each part can be separately modified, she says. “You have this plug-and-play type capability to build many types of functions from a smaller set of modular components.” The RNA molecules they design are manufactured by yeast or even mammal cells after the researchers insert the corresponding DNA.

An example of a synthetic riboswitch engineered by Maung Nyan Win and Christina Smolke in which the ribozyme is turned off when the aptamer binds ligand. Maung Nyan Win and Christina D. Smolke, Proc. Natl. Acad. Sci. 104, 14287 (2007).

In addition to computing Boolean logic operations, Smolke’s team has demonstrated other signal-processing functions, including bandpass filtering and adjustable signal gain, with their RNA platform. But she acknowledges that the field has yet to settle on the best approach. “Ultimately, you want to get to a place where there’s some level of standardization,” she says.

Send in the Clones
One barrier to standardization is the wide range of possibilities for using biological agents in medicine. Smolke’s technique, for example, might be used to genetically modify cells in a particular tissue, but she is also exploring modification of immune cells outside of the body to combat cancer. “We’re utilizing the function that [the immune cell] already does really well, and then endowing it with enhanced functions,” she says.

Chris Anderson, a professor in the department of bioengineering at the University of California, Berkeley, envisions a different strategy, one based on engineered bacteria, but admits that “it’s impossible to know what’s going to win out.” A “huge advantage” of using bacteria, Anderson says, is that the biological processes targeted by antibacterial agents are very different from those of human cells, so the engineered bacteria can be easily killed.

For bacteria to be effective, they must be able to evade the body’s immune defenses. Anderson and his colleagues have transplanted genes from other bacteria that allow their E. coli to survive for hours in the bloodstream, instead of just a few minutes. They also introduced growth-control mechanisms into the bacteria, he stressed. “They’re not able to grow without feeding something to the patient.”
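For readers who think in code, the sense, compute, and actuate structure of the three-module molecular computer described earlier in this article can be loosely mirrored in software. The sketch below is only an analogy under stated assumptions: the indicator names, the baseline levels, and the `margin` fold-change threshold are hypothetical, and nothing here models the actual DNA chemistry the Weizmann group used.

```python
from typing import Dict, Tuple

# Hypothetical indicator names: the article says four messenger RNAs are
# sensed, two expected to decrease and two to increase, but does not name them.
EXPECTED_LOW: Tuple[str, ...] = ("mrna_a", "mrna_b")
EXPECTED_HIGH: Tuple[str, ...] = ("mrna_c", "mrna_d")


def sense(sample: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Module 1: express each measured level as a ratio to its baseline."""
    return {name: sample[name] / baseline[name] for name in baseline}


def diagnose(ratios: Dict[str, float], margin: float = 1.5) -> bool:
    """Module 2: true only if both 'low' indicators fell and both 'high' rose.

    `margin` is an assumed fold-change threshold, not a value from the article.
    """
    lows_fell = all(ratios[name] <= 1.0 / margin for name in EXPECTED_LOW)
    highs_rose = all(ratios[name] >= margin for name in EXPECTED_HIGH)
    return lows_fell and highs_rose


def actuate(positive: bool) -> str:
    """Module 3: dispense the drug only on a positive diagnosis."""
    return "release drug" if positive else "stay inert"


def molecular_computer(sample: Dict[str, float], baseline: Dict[str, float]) -> str:
    """Chain the three modules end to end."""
    return actuate(diagnose(sense(sample, baseline)))


if __name__ == "__main__":
    baseline = {"mrna_a": 10.0, "mrna_b": 10.0, "mrna_c": 10.0, "mrna_d": 10.0}
    diseased = {"mrna_a": 2.0, "mrna_b": 3.0, "mrna_c": 30.0, "mrna_d": 25.0}
    print(molecular_computer(diseased, baseline))
```

Keeping the three stages as separate functions echoes the modularity the researchers emphasize: the diagnostic rule can be swapped without touching sensing or drug release.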

In 2006, Anderson and his colleagues unveiled a bacterium they had engineered to invade nearby cells. Importantly, the invasion only occurred under chosen conditions, including lack of oxygen, which often occurs near tumors. Rather than directly combining sensing, computation, and actuation into a single DNA or RNA molecule, Anderson’s genetic modules communicate using smaller molecules, in much the same way as normal cells. When the researchers insert new DNA into the bacteria, they include special sequences that respond to other chemicals in the cell or the environment. They “connect” their modules by inducing this sensitivity to the products of other genes that they insert. In addition, by requiring that two different molecules attach to adjacent regions of DNA, they created the cellular equivalent of an AND gate.

A Need to Communicate
The chemical sensitivity of genes gives cells some ability to communicate with each other. For example, one of the signals that stimulated Anderson’s bacteria to invade was the well-known “quorum-sensing” response that kicks in for some bacteria when they are present in large numbers. Ron Weiss, a professor of electrical engineering and molecular biology at Princeton University, has used the quorum-sensing machinery to build bidirectional communication between two groups of bacteria. The collective behavior constitutes a kind of computation that reflects the interaction between the two strains, each of which could be tuned to detect separate conditions. In some cases, Weiss says, “a cell that specializes in the detection of one condition can do it much better than a cell that tries to do too many things at once.”

From a broader perspective, says computer scientist Tatsuya Suda of the University of California at Irvine, “there’s always communication involved” in micromedicine. The sensing of the environment by the tiny agents is a kind of communication, he notes, as is the dispensing of drugs. As researchers design these tiny communications systems, he stresses, they need to pay careful attention to noise.

In addition to communicating with their environment, microscopic agents may communicate with each other. As an example, Suda cites regenerative medicine, in which the creation of a replacement organ requires coordinated response by many agents. But he admits that, for now, “the state of the art is just trying to find out how they work together as a group, as opposed to how we can take advantage of group behavior.”

For biologically based agents, as for ordinary cells, any communication is likely to occur through the emission and sensing of molecules. In contrast, artificial or hybrid systems incorporating nanometer-scale electronic components might also communicate by ultrasound or radio. In principle, as described by researcher Tad Hogg of Hewlett-Packard Labs, they could signal to point others to medically important locations. In addition, they might be able to transmit information to the outside world.

Augmenting, rather than replacing, the diagnostic strengths of the medical community could be an important early application of micromedicine, and relaxes the demands for on-board computation and drug delivery. At a minimum, small devices might extend the capabilities of chemicals whose locations are monitored in modern medical equipment. “As those imaging devices advance,” Hogg says, “they should be able to give you some information more than just ‘here’ or ‘not here,’ but what they found” in a particular region, perhaps by combining several important local measurements.

Even before medical applications become practical, Shapiro suggests, the emerging tools could provide new resources for basic biology research. “I think that these types of molecular computing devices might be able to analyze living cells ex vivo and help researchers understand cells without killing them,” Shapiro notes. “These applications are probably measured in years rather than in decades.”

Don Monroe is a science and technology writer based in Murray Hill, NJ.

© 2009 ACM 0001-0782/09/0600 $10.00

Search Technology
Kleinberg Wins ACM-Infosys Foundation Award

Jon Kleinberg, a professor of computer science at Cornell University, is the winner of the 2008 ACM-Infosys Foundation Award in the Computing Sciences for his contributions to improving Web search techniques that allow billions of Web users worldwide to find relevant, credible information on the ever-evolving Internet. Kleinberg, 37, developed models that document how information is organized on the Web, how it spreads through large social networks, and how these networks are structured to create the small-world phenomenon known as “six degrees of separation.”

Kleinberg’s use of mathematical models to illuminate search and social networking tools that underpin today’s social structure has created interest in computing from people not formerly drawn to this field.

The ACM-Infosys Foundation Award, established in 2007, recognizes personal contributions by young scientists and system developers to a contemporary innovation that exemplifies the greatest recent achievements in the computing field. Financial support for the $150,000 award is provided by an endowment from the Infosys Foundation.

“Professor Kleinberg’s achievements mark him as a founder and leader of social network analysis in computer science,” says Professor Dame Wendy Hall, president of ACM. “With his innovative models and algorithms, he has broadened the scope of computer science to extend its influence to the burgeoning world of the Web and the social connections it enables. We are fortunate to have the benefit of his profound insights into the link between computer network structure and information that has transformed the way information is retrieved and shared online.”

The ACM-Infosys Foundation Award recognizes young researchers who are currently making sizeable contributions to their fields and furthering computer science innovation. The goal is to identify scientifically sound breakthrough research with potentially broad implications, and encourage the recipients to further their research.


Society | DOI:10.1145/1516046.1516052 Leah Hoffmann

Content Control
Entertainment businesses say digital rights management prevents the theft of their products, but access control technologies have been a uniform failure when it comes to preventing piracy. Fortunately, change is on the way.

By now, the story is familiar: CD sales are falling. Digital music sales are growing, but have not offset the loss. The music business is struggling to adapt to a new technological era. It’s not the first time. At the turn of the 20th century, for instance, as the phonograph gained popularity, the industry’s model of compensation and copyright was suddenly thrown into question. Previously, composers of popular songs relied on the sale of sheet music for their income. After all, musicians needed sheet music to learn and perform a work, even if individual performances generated no royalties. Once performances could be recorded and sold or broadcast on the radio, however, the system grew less appealing to both groups of artists, who were essentially getting paid once for something that could be consumed thousands of discrete, different times. Eventually, collection societies were set up to make sure each party had a share in the new revenue streams.

Today, musical copyright is most prominently embodied not by sheet music but by audio recordings, along with their translations and derivatives (that is, their copies). Yet computers have made light work of reproducing most audio recordings, and the industry is unable to prevent what many young fans are now used to: free copies of their favorite songs from online file-sharing networks like BitTorrent and LimeWire. Legal barriers, like the injunctions imposed by the Digital Millennium Copyright Act (DMCA) against copying protected works or circumventing their digital protections, are unpopular and difficult to enforce. (The industry’s John Doe suits have touched a mere fraction of file sharers, and their effect on the overall volume of illegal downloads is questionable.) Technological barriers, like the widespread security standards and controls known as digital rights management (DRM), have been even less effective.

By putting copyrighted books online, Google Book Search may soon revolutionize book publishing. Photograph by Molly Kleinman.

DRM attempts to control the way digital media are used by preventing purchasers from copying or converting them to other formats. In theory, it gives content providers absolute power over how their work is consumed, enabling them to restrict even uses that are ordinarily covered by the fair use doctrine. Purchase a DVD in Europe, and you’ll be unable to play it on a DVD player in the U.S. because of region-coding DRM. What’s more, according to the DMCA, it would be illegal for you to copy your DVD’s contents into a different format, or otherwise attempt to circumvent its region-coding controls. To take a musical example, until recently songs purchased in Apple’s

16 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6 news popular iTunes music store could only els: Can companies preserve their cur- be played on an iPod due to the com- rent revenue structures through DRM pany’s proprietary DRM. DRM is being or in court, or must they find some oth- Entertainment businesses say “wielded as a er way of making money? For music, they need DRM to prevent the theft the iTunes model appears to be a viable of products that represent their liveli- powerful tool” one, though questions still remain. For hood. In practice, however, DRM has against unapproved movies, the path is less clear. What will been a uniform failure when it comes happen when DVDs become obsolete? to preventing piracy. Those who are technologies, says Will consumers take out subscriptions engaged in large-scale, unauthorized Aaron Perzanowski. to online movie services, or make dis- commercial duplication find DRM crete one-time purchases? “Nobody “trivial to defeat,” says Jessica Litman, knows what the marketplace of the fu- a professor of law at the University of ture will look like,” says Litman. And Michigan. The people who don’t find the wholesale copyright reform that it trivial: ordinary consumers, who are digital activists long for is years away. often frustrated to discover that their on their hard drives. Angry gamers re- One industry whose business mod- purchases are restricted in unintuitive sponded by posting copies of the game el may soon be radically transformed and cumbersome ways. online, making Spore the most pirated is publishing. Under the terms of a In the music industry, at least, game on the Internet. recent settlement reached with the change is underway. 
In 2007, Amazon Authors Guild (which sued Google announced the creation of a digital DRM and Movies in 2005 to prevent the digitization music store that offered DRM-free Yet DRM is nowhere near dead outside and online excerpting of copyrighted songs, and in January 2009, Apple fi- the music business. Hollywood, pro- books as part of its Book Search proj- nalized a deal with music companies tected thus far from piracy by the large ect), Google agreed to set up a book to remove anti-copying restrictions file size of the average feature film, con- rights registry to collect and distrib- on the songs it sold through iTunes. tinues to employ it as movies become ute payments to authors and publish- Since iTunes is the world’s most popu- available through illegal file-sharing ers. Much like the collection societies lar digital music vendor—and the iPod networks. Buy a movie on iTunes, and that were established for musicians, its most popular player—critics com- you’ll still face daunting restrictions the registry would pay copyright hold- plained the deal would only further so- about the number and kind of devices ers whenever Internet users elected to lidify Apple’s hold on the industry. Yet you can play it on. Buy a DVD, and you’ll view or purchase a digital book; 63% of because consumers can now switch to be unable to make a personal-use copy the fee would go to authors and pub- a different music player without losing to watch on your laptop or in the car. lishers, and 37% to Google. the songs they’ve purchased, the pre- DRM has also proven useful as a le- If approved, the settlement would diction seems dubious. gal weapon. 
Kaleidescape, a company be “striking in its scope and potential “As long as the cost of switching whose digital “jukeboxes” organize future impact,” says Deirdre Mulli- technologies is low, I don’t think Apple and store personal media collections, gan, a professor of law at the Univer- will exert an undue influence on con- was sued in 2004 by the DVD Content sity of California, Berkeley’s School of sumers,” says Edward Felten, a profes- Control Association, which licenses the Information. It is nonetheless highly sor of computer science and public af- Content Scrambling System that pro- controversial. Some, like James Grim- fairs at Princeton University. tects most DVDs. (In 2007, a judge ruled melmann, a New York Law School What about piracy? Since DRM there was no breach of the license; the professor, believe it is a “universal win never halted musical piracy in the first case is still open on appeal.) compared with the status quo.” Others place, experts say, there’s little reason The Kaleidescape case is instructive, are disappointed by what they see as to believe its absence will have much experts say, since it shows that prevent- a missed opportunity to set a power- effect. In fact, piracy may well decrease ing piracy isn’t necessarily Hollywood’s ful court precedent for fair use in the thanks to a tiered pricing scheme in biggest concern. Entry-level Kaleides- digital age, and the undeniable danger the Apple deal whereby older and less cape systems start at $10,000—unlike- of monopoly. “No other competitors popular songs are less expensive than ly purchases for would-be copyright appear poised to undertake similar ef- the latest hits. “The easier it is to buy infringers. “Instead, DRM is wielded as forts and risk copyright legislation,” legitimate high-quality, high-value a powerful tool to prevent the develop- says Perzanowski. 
products,” explains Felten, “the less of ment and emergence of unapproved One thing, at least, is clear: It frees a market there is for pirated versions.” technologies. In some instances, that the courts to consider other industries’ By way of illustration, he points to the may overlap with some concern over in- complaints as they slouch toward the 2008 release of Spore, a hotly anticipat- fringement, but as the Kaleidescape ex- digital age. ed game whose restrictive DRM system ample shows, it need not,” says Aaron not only prevented purchasers from Perzanowski, a research fellow at the Leah Hoffmann is a Brooklyn, NY-based science and installing it on more than three com- Berkeley Center for Law & Technology. technology writer. puters, but surreptitiously installed Indeed, the real question typically a separate program called SecuROM comes down to one of business mod- © 2009 ACM 0001-0782/09/0600 $10.00


Technology | DOI:10.1145/1516046.1516053
Gregory Goth

Autonomous Helicopters

Researchers are improving unmanned helicopters' capabilities to address regulatory requirements and commercial uses.

There would seem to be a clear market niche for unmanned helicopters. Equipped with lightweight onboard cameras, they could serve as mapping agents or search-and-rescue "eyes" in places where using a full-sized helicopter and a human crew is life threatening or cost prohibitive. Motion-picture producers have explored the use of autonomous helicopters in filming action scenes in locations where the safety of both flight crews and movie cast members could be at risk from using larger aircraft. Humanitarian groups have considered using autonomous helicopters for land-mine detection, while public safety agencies have explored using them for inspecting bridges and other structures where human inspectors might be endangered. And they are becoming mainstays in applications such as crop dusting in Japan, where the need to fly at a low altitude and spray chemicals can be dangerous for pilots.

[Figure: One of Stanford University's autonomous helicopters flying upside down in an aerobatically challenging airshow. For more photos and video, visit http://heli.stanford.edu/index.html. Photograph by Eugene Fratkin.]

Academic and commercial research teams have been perfecting the capabilities of autonomous helicopters for nearly two decades, with such widespread deployments as a goal. Algorithmic and technological advances are occurring at a steady pace, but regulatory roadblocks and trade restrictions are hampering market acceptance. And, though much of the cutting-edge research in autonomous helicopters demonstrates significant crossover potential between disparate computational and scientific disciplines as well as other aviation applications, many researchers find themselves stymied by these non-technological obstacles that stem from policy concerns.

"A lot of vehicles have at least kinematics that are similar to helicopters," says Adam Coates, a Stanford University Ph.D. student who coauthored Learning for Control from Multiple Demonstrations, which won the International Conference on Machine Learning's best application paper for 2008, and describes how he and colleagues programmed an autonomous helicopter to perform complex aerobatics. "But I think the biggest hurdle is regulatory. It's virtually impossible to do real UAV [unmanned aerial vehicle] operations unless you're a defense contractor or the military, so you have to go to a big defense contractor if you want to do real UAV research."

Regulatory hurdles vary, depending on the sovereignty involved. In the U.S., for example, the Federal Aviation Administration (FAA) has yet to issue regulations regarding the use of autonomous helicopters in public airspace. A 2008 report by the U.S. Government Accountability Office (GAO) noted that unmanned aircraft, whether fixed wing or rotor powered, cannot meet the National Airspace System's safety regulations for tasks such as avoiding other aircraft. Therefore, autonomous crafts' use is limited to case-by-case approval by the FAA, and usually restricted to line-of-sight operation. In Japan, the government has placed strict trade restrictions on the Yamaha RMAX autonomous helicopter, which is regarded as the industry benchmark, to prevent it from being used for military operations by unfriendly nations.

Omead Amidi, a research faculty member at Carnegie Mellon University and CEO of SkEyes Unlimited, a Washington, PA-based firm that manufactures instruments for autonomous aircraft, concurs with Coates' observation about the dearth of regulatory infrastructure hindering wider development and deployment of the craft.

"If you have a helicopter flying over your head, it's because everything about it is regulated," Amidi says. "No such thing exists for autonomous helicopters. If you could convince me to fly one of these over the head of my daughter, OK, it's ready, but I'm not doing it now."

AI to the Forefront

Despite the regulatory issues, which the GAO estimated might take 10 years to resolve in the U.S., researchers have continued to improve autonomous helicopters' capabilities. The most advanced can take off, hover, and maintain flight autonomously through a combination of advanced sensing and navigation equipment such as laser sensors, GPS modules, inertial measurement units that contain accelerometers and gyroscopes, and communications modules that communicate with ground-based computers or human pilots when necessary. The RMAX, for example, first flew fully autonomously out of visual range in Japan in 2000, following preprogrammed instructions.

While the RMAX is well suited for commercial purposes, it is also prohibitively large and expensive for applications such as the surveillance of building interiors or for bootstrapped university research programs. A base model used by the U.S. Army for research weighs approximately 185 pounds, has a rotor diameter of three meters, and costs $86,000, while fully autonomous units, complete with navigational and control equipment, can cost $1 million.

Researchers are successfully applying disparate technologies to improve the vehicles, using much smaller and cheaper helicopters than the RMAX. For example, Coates and coauthor Pieter Abbeel, now a professor in the department of electrical engineering and computer sciences at the University of California, Berkeley, utilized artificial intelligence principles to demonstrate their assertion that an off-the-shelf expectation-maximization algorithm could result in the most advanced autonomous aerobatics yet performed, using a commercially available radio-controlled hobbyist helicopter that weighed about 10 pounds.

Coates says the Stanford project was the culmination of five years of effort, in which numerous approaches were discussed and dismissed. Andrew Ng, a professor of computer science at Stanford, who advised Coates and Abbeel in their project and was a coauthor of the Learning for Control paper, says the project successfully transferred machine learning techniques into a discipline that had hitherto been extremely labor-intensive, relying on painstaking expert modeling of likely behaviors.

Ultimately, they decided to have the helicopter "watch" an expert human pilot's maneuvers via data input from onboard controls and a radio receiver that saved a copy of the human pilot's control stick positions during demonstration flights.

"From those two things, you can examine state changes over time and what the pilot does, and can record a whole trajectory to build up a model," Coates says.

"Previously, the most common approach to designing controllers for autonomous aircraft, both helicopters and fixed wing, was to hire a human engineer to choose parameters for the controller," Ng says. "For example, if the helicopter is pitched forward a little more than you want, how aggressively do you want to pull back on the stick? The traditional approach was to have a person knowledgeable in aerodynamics and helicopters sit down and model that. This approach can often work, but it is a very slow design process and often doesn't perform nearly as well as modern machine learning methods."

Coates and Abbeel discovered that even the most expert human pilot's aerobatic routine contains errors (or, in the language of the problem, is suboptimal). "However, repeated expert demonstrations are often suboptimal in different ways," their Learning for Control paper noted, "suggesting that a large number of suboptimal expert demonstrations could implicitly encode the ideal trajectory the suboptimal expert is trying to demonstrate."

They discovered that merely using an arithmetic average of the states observed at any given time in the expert demonstrations would fall short of arriving at the desired trajectory, explaining that, in practice, each demonstration would occur at a different rate, making it impossible to combine states from the same timestep in each demonstration.

However, by employing the machine learning algorithm, which includes an extended Kalman filter and a dynamic programming algorithm, the researchers were able to infer the intended target trajectory and time alignment of all the demonstrations. And, while real-time variables such as the state of the air around the craft, rotor speed, actuator delays, and the behavior of the helicopter's onboard avionics contribute to an extremely complex environment that cannot be modeled accurately, these variables can be mitigated if the programming is able to make the helicopter fly the same trajectory each time. If so, the errors caused by these variables will tend to be the same, and therefore can be predicted more accurately.

In addition to the aerobatic results of the project, Coates says the ramifications for machine learning theory go deeper. "One of the reasons people liked our paper is that it was an off-the-shelf machine learning algorithm and we solved a strange little application nobody had thought of before," Coates notes. "People know how hard this is, and to see that AI people solved this, I think has made a big impact. We had been preaching for a while that AI is the key to solving really hard problems that aren't accessible to us when we're using lots of classical methods, and if you come up with a problem and make such large strides, it really adds some weight to the argument that AI can be real and practical with algorithms that solve really hard problems."

Smaller, Lighter, Safer

The future of autonomous helicopters might be even more profoundly affected by the march to increasingly powerful processors and smaller form factors.

"One way to avoid safety troubles is by making the helicopters smaller, so there are a lot of efforts going into miniaturizing the machines," says Eric Feron, professor of aerospace software engineering at Georgia Tech University, who studied autonomous helicopters while a graduate student at MIT. "That's where I think things are going now."

Coates says the breakthrough Stanford research was greatly facilitated by increased processor capability that allowed real-time instruction every 20th of a second, which was not possible even five years ago. Additionally, the advent of microelectromechanical systems-based sensing technology, such as gyroscopes, accelerometers, and magnetometers, is leading to increased miniaturization.

Navigationally, academic researchers are now also concentrating on developing obstacle detection technology that will allow autonomous helicopters to fly safely in urban areas teeming with tall buildings, overhead wires, and light poles. Such uses are not on the near horizon, however; the ongoing safety concerns probably point to deployment in sparsely populated areas for natural resource mapping, forest firefighting, and marine search and rescue. Human-generated mapping at quarter-meter resolution can cost $20,000 per square mile, for example, while autonomous helicopters could probably deliver the same results 10 times cheaper, says Amidi.

Georgia Tech's Feron says autonomous helicopters will continue to offer researchers an excellent platform for further research in robotics, whether the researcher is an "aeronaut" who will still be utilizing them 10 years hence, or instead testing a more universally applicable methodology on the machines, and that wider deployment will indeed follow at some point.

"The safety and reliability issues are not unworkable," Feron says. "I think it's just a matter of time."

Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.

© 2009 ACM 0001-0782/09/0600 $10.00

Programming
Repeat Winners

For the second year in a row, students from St. Petersburg University of Information Technology, Mechanics and Optics won the annual ACM International Collegiate Programming Contest (ICPC). With this year's victory, St. Petersburg University has now won the ACM-ICPC world championship three times in the last four years.

Known as "The Battle of the Brains," the ACM ICPC World Finals took place this year at the Royal Institute of Technology in Stockholm, Sweden. The world's top 100 university teams used open standard technology to solve 11 real-world problems involving traffic congestion, suffix-replacement grammars, and other issues, with the goal being to correctly solve the largest number of problems in the shortest amount of time.

The 33rd annual ACM ICPC, sponsored by IBM, was dominated by teams from Russia and China. This year's top 12, medal-winning teams are St. Petersburg University (Russia), which solved nine problems, followed by Tsinghua University (China), St. Petersburg State University (Russia), Saratov State University (Russia), the University of Oxford (U.K.), and Zhejiang University (China). Massachusetts Institute of Technology (U.S.) finished in seventh place, followed by Altai State Technical University (Russia), University of Warsaw (Poland), University of Waterloo (Canada), I Javakhishvili Tbilisi State University (Georgia), and Carnegie Mellon University (U.S.).

"It is clear that computational thinking, which is at the heart of the information technology revolution, is the engine that is driving innovation in these countries," says ACM President Professor Dame Wendy Hall. "As we seek to strengthen computing education and fill the talent pipeline for future workers, it is an important reminder that, while U.S. enrollment in computer science programs may have increased, we need to continue investing in programs that attract women and other underrepresented groups to this field."
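The autonomous helicopter article above notes that demonstrations flown at different rates cannot simply be averaged timestep by timestep, and that a time alignment must be inferred first. The sketch below illustrates only that point. It is not the Stanford team's algorithm, which according to the article combines expectation-maximization, an extended Kalman filter, and dynamic programming; here plain dynamic time warping (DTW) is substituted as a simpler stand-in. The function names, the one-dimensional "state" values, and the choice of one demonstration as the reference are all assumptions made for illustration.

```python
def dtw_path(ref, demo):
    """Return a DTW alignment between two 1-D sequences as (ref_idx, demo_idx) pairs."""
    n, m = len(ref), len(demo)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning ref[:i] with demo[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - demo[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Walk back from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def naive_average(ref, demo):
    """Average two sequences timestep by timestep, ignoring differences in rate."""
    return [(a + b) / 2 for a, b in zip(ref, demo)]


def align_and_average(ref, demos):
    """Warp every demo onto ref's time axis, then average per reference timestep."""
    sums = [0.0] * len(ref)
    for demo in demos:
        per_step = [[] for _ in ref]
        for i, j in dtw_path(ref, demo):
            per_step[i].append(demo[j])
        for i, vals in enumerate(per_step):
            sums[i] += sum(vals) / len(vals)
    return [s / len(demos) for s in sums]


# Hypothetical data: the same maneuver flown once at normal speed and once at half speed.
maneuver = [0, 1, 2, 3, 2, 1, 0]
slow = [v for v in maneuver for _ in range(2)]

print(naive_average(maneuver, slow))
print(align_and_average(maneuver, [maneuver, slow]))
```

With these toy inputs the naive average smears the maneuver, while the aligned average recovers its shape, which is the intuition behind aligning demonstrations before combining them.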

Awards
American Academy Names 2009 Fellows

Computer science was well represented when the American Academy of Arts & Sciences (AAAS) recently announced the election of the 2009 class of fellows and foreign honorary members. The 212 new fellows and 19 foreign honorary members (including scholars, scientists, jurists, writers, artists, civic, corporate and philanthropic leaders) come from 28 states and 11 countries and range in age from 33 to 83. They join one of America's most prestigious honorary societies and a center for independent policy research.

"Since 1780, the Academy has served the public good by convening leading thinkers and doers from diverse perspectives to provide practical policy solutions to the pressing issues of the day," said Leslie Berlowitz, AAAS chief executive officer. "I look forward to welcoming into the Academy these new members to help continue that tradition."

Elected in the category of computer sciences (including AI and information technologies):
• John Seely Brown, Deloitte Center for Edge Innovation/University of Southern California
• Mary Jane Irwin, State University
• Maria Klawe, Harvey Mudd College
• , Kurzweil Technologies
• Michael Sipser, MIT
• Alfred Z. Spector, Google
• Jennifer Widom, Stanford University.

Elected in the category of business, corporate, and philanthropic leadership:
• John Doerr, Kleiner, Perkins, Caufield & Byers.

In an email interview, John Seely Brown offered this career advice for young people: "Nurture a disposition that embraces change and that encourages you to challenge your own assumptions and having others challenge yours."

[Figure: John Seely Brown. Photograph by J.D. Lasica.]


News | DOI:10.1145/1516046.1516071
Bob Violino

Looking Backward and Forward

CRA's Computing Community Consortium hosted a day-long symposium to discuss the important computing advances of the last several decades and how to sustain that track record of innovation.

What are the major computing innovations of the recent past? How did research enable them? What advances are on the horizon, and how can they be realized?

These were among the key questions addressed at an invitation-only symposium held at the Library of Congress in Washington, DC, in March. The symposium, "Computing Research that Changed the World: Reflections and Perspectives," was organized by the Computing Research Association's (CRA's) Computing Community Consortium in cooperation with a half-dozen U.S. congressmen.

[Figure: From left: Daphne Koller, Stanford; , MIT; Rodney Brooks, MIT and Heartland Robotics; and Alfred Spector, Google, were among the symposium's session speakers. Photograph courtesy of the Computing Research Association.]

"The main goals were to explore past game-changing research in the computing fields to understand how they came about and then to take a peek at the future to see how this knowledge could be used to maximize the chances for future game-changing research," says CRA Executive Director Andrew Bernat.

"It became pretty clear that there is no foolproof way to figure out what research will turn into the big hits of tomorrow; rather, that big hits generally are a combination of independent efforts driven by curiosity and applications," Bernat says. "No one foresaw the ultimate outcomes of the initial research, so we must continue to fund a broad range of efforts in [multiple] subdisciplines, using a variety of funding mechanisms."

The symposium's sessions included The Internet and the World Wide Web, which examined areas such as search technology and cloud computing; Evolving Foundations, which looked at the security of online information and global information networks; The Transformation of the Sciences via Computation, which covered topics such as supercomputers and the future of medicine; and Computing Everywhere!, which focused on sensing, computer graphics, and robotics.

Each session featured three talks and a short discussion that identified future challenges. The sessions were followed by an hour-long discussion among all the speakers, with comments from attendees, and a call to action for the future.

As for which areas of research seem particularly promising, Bernat says mobile computing will "continue to be a huge area for exploration and change as are digital media of all types. And networking will continue to boom: not just computer networking, but social networks which will help us understand the dynamics of human behavior."

Daphne Koller, a professor of computer science at Stanford University and one of the symposium's session speakers, says one of the most exciting directions in computing is the ability to use computational methods and models to analyze scientific data, particularly biomedical data.

"New biological assays are producing important data at an ever-increasing rate," Koller says. "These data have the potential of providing unprecedented insight into basic biological processes as well as into the mechanisms and processes underlying human disease. They also have the potential of allowing us to understand the complex genetic and environmental factors that lead to differences in human phenotype, including both disease and response to drug treatment."

However, Koller adds, it's impossible to extract these insights without new computational methods. "Developing these tools is a direction where a lot of progress has been made," she says, "but much more work remains to be done."

Videos and other material from the symposium are available on the CRA Web site, and the Computing Community Consortium will host additional symposiums later this year, including one on artificial intelligence and education and another on educational data mining.

Bob Violino is a writer based in Massapequa Park, NY, who covers business and technology.

© 2009 ACM 0001-0782/09/0600 $10.00


DOI:10.1145/1516046.1516056 Eugene H. Spa!ord VPrivacy and Security Answering the Wrong Questions Is No Answer Asking the wrong questions when building and deploying systems results in systems that cannot be sufficiently protected against the threats they face.

OR OVER 50 years we have been management, yet still attacks succeed. ementary school computer lab, which trying to build computing Each time we apply a new layer, new at- is different from one used to control systems that are trustworthy. tacks appear to defeat it. military weapons. There are some is- The efforts are most notable I conjecture that one reason for sues in common, certainly, but the by the lack of enduring suc- these repeated failures is that we may overall design and deployment should Fcess—and by the oftentimes spectacu- be trying to answer the wrong ques- reflect the differences. lar security and privacy failures along tions. Asking how to make system The availability and familiarity of the way. With each passing year (and “XYZ” secure against all threats is, a few common artifacts has led us to each new threat and breach) we seem at its core, a nonsensical question. deploy them (or variants) everywhere, to be further away from our goals. Almost every environment and its even to unsuitable environments. By Consider what is present in too threats are different. A system con- analogy, what if everything in society many organizations. Operating sys- trolling a communications satellite was constructed of bricks because tems with weak controls and flaws is different from one in a bank, which they are cheap, common, and easy to have been widely adopted because of in turn is different from one in an el- use? Imagine not only homes built of cost and convenience. Thus, firewalls bricks, but everything else from the have been deployed to put up another space shuttle to submarines to medi- layer of defense against the most obvi- Asking how to cal equipment. Thankfully, other fields ous problems. Firewalls are often con- have better sense and choose appropri- figured laxly, so complex intrusion and make system “XYZ” ate tools for important tasks. 
anomaly detection tools are deployed secure against all A time-honored way of reinforcing to discover when the firewalls are pen- a point is by means of a story told as a etrated. These are also imperfect, espe- threats is, at its parable, a fairy tale, or as a joke. One cially when insider threats are consid- core, a nonsensical classic example I tell my students: ered, so we deploy data loss detection Two buddies leaving a tavern find and prevention tools. We also employ question. a distressed and somewhat inebriated virtual machine environments intend- man on his hands and knees in the park- ed to erect barriers against buggy im- ing lot, apparently searching for some- plementations. These are all combined thing. They ask him what he has lost, with malware detection and patch and he replies that he has dropped his

22 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6 V viewpoints

keys. He describes the keys, and says if the two men find them they will receive a reward. They begin to help search. Other people come by and they too are drawn into the search. Soon, there is a crowd combing the lot, with an air of competition to see who will be the first to find the keys. Periodically someone informs the crowd of the discovery of a coin or a particularly interesting piece of rock.

After a while, one in the crowd stands up and inquires of the fellow who lost his keys, “Say, are you sure you lost your keys out here in the lot?” To which the man replies, “No. I lost them in the alley.” Everyone stops to stare at the man. “Well, why the heck are you searching for them here in the parking lot!?” someone exclaimed. To which the man replied, “Well, the light is so much better here. And besides, now I have such good company!”

There are many lessons that can be inferred from this story, but the one I stress with my students is that if they don’t properly define the problem, ask the right questions, and search in the proper places, they may have good company and funding, but they shouldn’t expect to find what they are really seeking.ᵃ

So it is in research—especially in cyber security and privacy. We have people seeking answers to the wrong questions, often because that is where “the light is better” and there seems to be a bigger crowd around them. Until we start asking questions that better address the problems that really need to be solved, we shouldn’t expect to see progress. Here are a few examples of misleading questions:

• How do I secure my commodity operating system against all threats?
• How do I protect my system with an intrusion-detection system, data loss and prevention tools, firewalls, and other techniques?
• How do I find coding flaws in the system I am using so I can patch them?
• How do we build multilevel secure systems?

Each of these questions implies it can be answered in a positive, meaningful manner. That is not necessarily the case.

We have generally failed to understand that when we build and deploy systems they are used in a variety of environments, facing different threats. There is no perfect security in any real system—hardware fails, people make mistakes, and attacks outside our expectations may defeat our protection mechanisms. If an attacker is sufficiently motivated and has enough resources (including time), every system can be defeated in some manner.ᵇ If the attacker doesn’t care if the defeat is noticed, it may reduce the work factor involved; as an obvious example, an assured denial-of-service attack can be accomplished with enough nuclear weapons. The goal in the practice of security is to construct sufficient defenses against the likely threats in such a

ᵃ Another story that resonates with my students is http://spaf.cerias.purdue.edu/Archive/race-horse..

ᵇ There are many books on this topic, and the basic premise is at the heart of nearly every big heist movie, including Ocean’s 11, The Italian Job, and The Thomas Crown Affair. For some interesting, real-life examples outside computing, I recommend the book Spycraft by Robert Wallace and H. Keith Melton.

Illustration by Jon Han.

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 23 viewpoints

way as to reduce the risk of compromise to an acceptable level; if the attack can be made to cost far more than the perceived gain resulting from its success, then that is usually sufficient.

By asking the wrong questions—such as how to patch or modify existing items rather than ask what is appropriate to build or acquire—we end up with systems that cannot be adequately protected against the threats they face. Few current systems are designed according to known security practices,ᶜ nor are they operated within an appropriate policy regime. Without understanding the risks involved, management seeks to “add on” security technology to the current infrastructure, which may add new vulnerabilities.

The costs of replacing existing systems with different ones requiring new training seem so daunting that replacement is seldom considered, even by organizations that face prospects of catastrophic loss. There is so much legacy code that developers and customers alike believe they cannot afford to move to something else. Thus, the market tends toward “add on” solutions and patches rather than fundamental reengineering. Significant research funding is applied to tinkering with current platforms rather than addressing the more fundamental issues. Instead of asking “How do we design and build systems that are secure in a given threat environment?” and “What tools and programming constructs should we be using to produce systems that do not exhibit easily exploited flaws?” we, as a community, continue to ask the wrong questions.

Note that I am not arguing against standards, per se. Standards are important for interoperability and innovation. However, standards are best applied at the interfaces so as to allow innovation and good engineering practice to take place inside. I am also not overlooking the potential expense. Creating new systems, training developers, and developing new code bases might be costly, but only initially—given current losses and trends, this approach would eventually reduce costs in many environments.

Robert H. (Bob) Courtney Jr., one of the first computer security professionals and an early recipient of the NIST/NCSC National Computer Systems Security Award, articulated three “laws” for those who seek to build secure, operational computational artifacts:ᵈ

• Nothing useful can be said about the security of a mechanism except in the context of a specific application and environment.
• Never spend more mitigating a risk than tolerating it will cost you.
• There are management solutions to technical problems but no technical solutions to management problems.

Although not everyone will agree with these three laws, they provide a good starting point for thinking about the practice of information security. The questions we should be asking are not about how to secure system “XYZ,” but whether “XYZ” is appropriate for use in the environment at hand. Can it be configured and protected against the expected threats to a level that matches our risk tolerance? What policies and procedures need to be put in place to augment the technology? What is the true value of what we are protecting? Do we even know what we are protecting?ᵉ

As researchers and practitioners, we need to stop looking for solutions where the light is good and people seem to be gathered. Consider a quote I have been using recently: “Insanity is doing the same thing over and over again while expecting different results.”ᶠ Asking the wrong questions repeatedly is not only hindering us from making real progress but may even be considered insane.

So, what questions are you trying to answer?

ᶜ There are many fine works on security engineering, including Ross Anderson’s opus of that title. If we return to the fundamentals, tried-and-true design principles were articulated by Jerome H. Saltzer and Michael D. Schroeder in “The Protection of Information in Computer Systems,” republished in Communications of the ACM 17, 7 (July 1974), but few systems are designed using these principles.

ᵈ My thanks to William Hugh Murray for his restatement of Courtney’s Laws.

ᵉ Many firms do not understand the value of what they are protecting or where it is located; see http://snipurl.com/sec-econ.

ᶠ This quote is widely attributed to Albert Einstein and to John Dryden. I have been unable to find a definitive source for it, however.

Eugene H. Spafford ([email protected]) is a professor of computer science and the executive director of the Center for Education and Research in Information Assurance and Security (CERIAS) at Purdue University.

Copyright held by author.


DOI:10.1145/1516046.1516055 Kevin Fu

Inside Risks
Reducing Risks of Implantable Medical Devices
A prescription to improve security and privacy of pervasive health care.

Millions of patients benefit from programmable, implantable medical devices (IMDs) that treat chronic ailments such as cardiac arrhythmia,⁶ diabetes, and Parkinson’s disease with various combinations of electrical therapy and drug infusion. Modern IMDs rely on radio communication for diagnostic and therapeutic functions—allowing health-care providers to remotely monitor patients’ vital signs via the Web and to give continuous rather than periodic care. However, the convergence of medicine with radio communication and Internet connectivity exposes these devices not only to safety and effectiveness risks, but also to security and privacy risks. IMD security risks have greater direct consequences than security risks of desktop computing. Moreover, IMDs contain sensitive information with privacy risks more difficult to mitigate than those of electronic health records or pharmacy databases. This column explains the impact of these risks on patient care, and makes recommendations for legislation, regulation, and technology to improve security and privacy of IMDs.

[Photo] From left, Benjamin Ransford (University of Massachusetts), Daniel Halperin (University of Washington), Benessa Defend (University of Massachusetts), and Shane Clark (University of Massachusetts) worked to uncover security flaws in implantable medical devices. Photograph by Ben Ransford.

Consequences and Causes: Security Risks

The consequences of an insecure IMD can be fatal. However, it is fair to ask whether intentional IMD malfunctions represent a genuine threat. Unfortunately, there are people who cause patients harm. In 1982, someone deliberately laced Tylenol capsules with cyanide and placed the contaminated products on store shelves in the Chicago area. This unsolved crime led to seven confirmed deaths, a recall of an estimated 31 million bottles of Tylenol, and a rethinking of security for packaging medicine in a tamper-evident manner. Today, IMDs appear to offer a similar opportunity to other depraved people. While there are no reported incidents of deliberate interference, this can change at any time. The global reach of the Internet and the prevalence and intermingling of radio communications expose IMDs to historically open environments with difficult-to-control perimeters.³,⁴ For instance, vandals caused seizures in photosensitive individuals by posting flashing animations on a Web-based epilepsy support group.¹

Knowing that such vandals will always exist, the next question is whether genuine security risks exist. What could possibly go wrong by allowing an IMD to communicate over great distances with radio and then mixing in Internet-based services? It does not require much sophistication


to think of numerous ways to cause intentional malfunctions in an IMD. Few desktop computers have failures as consequential as that of an IMD. Intentional malfunctions can actually kill people, and are more difficult to prevent than accidental malfunctions. For instance, lifesaving therapies were silently modified and disabled via radio communication on an implantable defibrillator that had passed premarket approval by regulators.³ In my research lab, the same device was reprogrammed with an unauthenticated radio-based command to induce a shock that causes ventricular fibrillation (a fatal heart rhythm).

[Photo] Equipment used to attack an implantable cardiac defibrillator (ICD). Photograph by Ben Ransford.

Manufacturers point out that IMDs have used radio communication for decades, and that they are not aware of any unreported security problems. Spam and viruses were also not prevalent on the Internet during its many-decade nascent period. Firewalls, encryption, and proprietary techniques did not stop the eventual onslaught. It would be foolish to assume IMDs are any more immune to malware. For instance, if malware were to cause an IMD to continuously wake from power-saving mode, the battery would wear out quickly. The malware creator need not be physically present, but could expose a patient to risks of unnecessary surgery that could lead to infection or death. Much as Macintosh users can take comfort that most current malware takes aim at the Windows platform, patients can take comfort that IMDs seldom rely on such widely targeted software, for now.

Consequences and Causes: Privacy

A second risk is violation of patient privacy. Today’s IMDs contain detailed medical information and sensory data (including vital signs, patient name, date of birth, therapies, and medical diagnosis). Data can be read from an IMD by passively listening to radio communication. With newer IMDs providing nominal read ranges of several meters, eavesdropping will become easier. The privacy risks are similar to those of online medical records.

Remedies

Improving IMD security and privacy requires a proper mix of technology and regulation.

Remedy: Technology

Technological approaches to improving IMD security and privacy include judicious use of cryptography and limiting unnecessary exposure to would-be hackers. IMDs that rely on radio communication or have pathways to the Internet must resist a determined adversary.⁵ IMDs can last upward of 20 years, and doctors are unlikely to surgically replace an IMD just because a less-vulnerable one becomes available. Thus, technologists must think 20 to 25 years out. Cryptographic systems available today may not last 25 years.

It is tempting to consider software updates as a remedy for maintaining the security of IMDs. Because software updates can lead to unexpected malfunctions with serious consequences, pacemaker and defibrillator patients make an appointment with a health-care provider to receive firmware updates in a clinic. Thus, it could take too long to patch a security hole.

Beyond cryptography, several steps could reduce exposure to potential misuse. When and where should an IMD permit radio-based, remote reprogramming of therapies (such as changing the magnitude of defibrillation shocks)? When and where should an IMD permit radio-based, remote collection of telemetry (for example, vital signs)? Well-designed cryptographic authentication and authorization make these two questions solvable. Does a pacemaker really need to accept requests for reprogramming and telemetry in all locations from street corners to subway stations? The answer is no. Limit unnecessary exposure.

Remedy: Regulation

Premarket approval for life-sustaining IMDs should explicitly evaluate security and privacy—leveraging the body of knowledge from secure systems and security metrics communities. Manufacturers have already deployed hundreds of thousands of IMDs without voluntarily including reasonable technology to prevent the unauthorized induction of a fatal heart rhythm. Thus, future regulation should provide incentives for improved security and privacy in IMDs.

Regulatory aspects of protecting privacy are more complicated, especially in the United States. Although the U.S. Food and Drug Administration has acknowledged deleterious effects of privacy violations on patient health,² there is no ongoing process or explicit requirement that a manufacturer demonstrate adequate privacy protection. The FDA has no legal remit from Congress to directly regulate privacy (the FDA does not administer HIPAA privacy regulations).

Call to Action

My call to action consists of two parts


legislation, one part regulation, and one part technology.

First, legislators should mandate stronger security during premarket approval of life-sustaining IMDs that rely on either radio communication or computer networking. Action at premarket approval is crucial because unnecessary surgical replacement directly exposes patients to risk of infection and death. Moreover, the threat models and risk retention chosen by the manufacturer should be made public so that health-care providers and patients can make informed decisions when selecting an IMD. Legislation should avoid mandating specific technical approaches, but instead should provide incentives and penalties for manufacturers to improve IMD security.

Second, legislators should give regulators the authority to require adequate privacy controls before allowing an IMD to reach the market. The FDA writes that privacy violations can affect patient health,² and yet the FDA has no direct authority to regulate privacy of medical devices. IMDs increasingly store large amounts of sensitive medical information and fixing a privacy flaw after deployment is especially difficult on an IMD. Moreover, security and privacy are often intertwined. Inadequate security can lead to inadequate privacy, and inadequate privacy can lead to inadequate security. Thus, device regulators have the unique vantage point for not only determining safety and effectiveness, but also determining security and privacy.

Third, regulators such as the FDA should draw upon industry, the health-care community, and academics to conduct a thorough and open review of security and privacy metrics for IMDs. Today’s guidelines are so ambiguous that an implantable cardioverter defibrillator with no apparent authentication whatsoever has been implanted in hundreds of thousands of patients.³

Fourth, technologists should ensure that IMDs do not continue to repeat the mistakes of history by underestimating the adversary, using outdated threat models, and neglecting to use cryptographic controls.⁵ In addition, technologists should not dismiss the importance of usable security and human factors.

Conclusion

There is no doubt that IMDs save lives. Patients prescribed such devices are much safer with the device than without, but IMDs are no more immune to security and privacy risks than any other computing device. Yet the consequences for IMD patients can be fatal. Tragically, it took seven cyanide poisonings in the 1982 Chicago Tylenol poisoning case for the pharmaceutical industry to redesign the physical security of its product distribution to resist tampering by a determined adversary. The security and privacy problems of IMDs are obvious, and the consequences just as deadly. We’d better get it right today, because surgically replacing an insecure IMD is much more difficult than an automated Windows update.

Calendar of Events

June 16–18: Conference on the Future of the Internet 2009, Seoul, Republic of Korea. Contact: Craig Partridge, Phone: 517-324-3425, Email: [email protected]
June 19–20: International Symposium on Memory Management, Dublin, Ireland. Sponsored: SIGPLAN. Contact: Elliot K Kolodner, Email: [email protected]
June 19–20: ACM SIGPLAN/SIGBED 2009 Conference on Languages, Compilers, and Tools for Embedded Systems, Dublin, Ireland. Sponsored: SIGPLAN. Contact: Christoph Kirsch, Email: [email protected]
June 20–24: The 36th Annual International Symposium on Computer Architecture, Austin, TX. Sponsored: SIGARCH. Contact: Stephen W. Keckler, Phone: 512-471-9763, Email: [email protected]
June 22: Fourth International Workshop on Mobility in the Evolving Internet Architecture, Krakow, Poland. Contact: Prof. Xiaoming, Email: [email protected]
June 22–23: Second International Workshop on Future Multimedia Networking, Coimbra, Portugal. Contact: Eduardo Cerquiera, Email: [email protected]
June 22–25: 23rd ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation, Lake Placid, NY. Contact: Carl Tropper, Email: [email protected]
June 23–26: 12th International Symposium on Component Based Software Engineering, East Stroudsburg, PA. Sponsored: SIGSOFT. Contact: Christine Hofmeister, Email: Christine.hofmeister@gmail.com

References
1. Epilepsy Foundation. Epilepsy Foundation Takes Action Against Hackers. March 31, 2008; http://www.epilepsyfoundation.org/aboutus/pressroom/action_against_hackers.cfm.
2. FDA Evaluation of Automatic Class III Designation VeriChip™ Health Information Microtransponder System, October 2004; http://www.sec.gov/Archives/edgar/data/924642/000106880004000587/ex99p2.txt.
3. Halperin, D. et al. Pacemakers and implantable cardiac defibrillators: Software radio attacks and zero-power defenses. In Proceedings of the 29th Annual IEEE Symposium on Security and Privacy, May 2008.
4. Halperin, D. et al. Security and privacy for implantable medical devices. In IEEE Pervasive Computing, Special Issue on Implantable Electronics (Jan. 2008).
5. Schneier, B. Security in the real world: How to evaluate security technology. Computer Security Journal 15, 4 (Apr. 1999); http://www.schneier.com/essay-031.html.
6. Webster, J.G., Ed. Design of Cardiac Pacemakers. IEEE Press, 1995.

Kevin Fu ([email protected]) is an assistant professor of computer science at the University of Massachusetts Amherst.

This work was supported by NSF grant CNS-0831244.

Copyright held by author.
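The technology remedy in the column above calls for cryptographic authentication of radio-based reprogramming requests. As a minimal sketch of that idea, and not the design of any actual device or of the work cited in the references, the Python fragment below has a hypothetical programmer compute an HMAC-SHA256 tag over a counter and a command, and a hypothetical device accept the command only if the tag verifies and the counter is fresh. The names `make_tag` and `CommandVerifier`, the command bytes, and the assumption that programmer and device already share a secret key are all illustrative; key distribution, emergency access, and battery constraints are outside the sketch.

```python
import hashlib
import hmac
import struct


def make_tag(key: bytes, counter: int, command: bytes) -> bytes:
    """Programmer side (hypothetical): authenticate counter + command."""
    # The counter is packed as an unsigned 64-bit big-endian integer,
    # so it must lie in the range 0 .. 2**64 - 1.
    message = struct.pack(">Q", counter) + command
    return hmac.new(key, message, hashlib.sha256).digest()


class CommandVerifier:
    """Device side (hypothetical): accept only authentic, fresh commands."""

    def __init__(self, key: bytes):
        self._key = key           # shared secret, assumed already provisioned
        self._last_counter = 0    # highest counter accepted so far

    def accept(self, counter: int, command: bytes, tag: bytes) -> bool:
        expected = make_tag(self._key, counter, command)
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, tag):
            return False
        # A counter that is not strictly newer indicates a replayed message.
        if counter <= self._last_counter:
            return False
        self._last_counter = counter
        return True
```

Rejecting stale counters means that an eavesdropper who records a valid command cannot simply retransmit it later, while a command with a forged or mismatched tag is refused outright.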


DOI:10.1145/1516046.1516054 Peter J. Denning

The Profession of IT
Beyond Computational Thinking
If we are not careful, our fascination with “computational thinking” may lead us back into the trap we are trying to escape.

In the midst of our struggle to better articulate why computing is so much broader than programming, a movement of sorts has emerged. It is being called “computational thinking.”⁸ The U.S. National Science Foundation’s Computer and Information Science and Engineering (CISE) directorate has asked most proposers, especially those in its CPATH initiative, to include a discussion of how their projects advance computational thinking. Carnegie Mellon University’s Center for Computational Thinking says, “It is nearly impossible to do research in any scientific or engineering discipline without an ability to think computationally.…[We] advocate for the widespread use of computational thinking to improve people’s lives.”¹

Computational thinking is seen by its adherents as a novel way to say what the core of the field is about, a lever to reverse the decline of enrollments, and a rationale for accepting computer science as a legitimate field of science. This movement is driven by four main concerns:

• Bringing computer science to the table of science (as partner, not programmer).
• Finding ways to make computer science a more attractive field for students to major in and for other sciences to collaborate with.
• Resurrecting ongoing inquiry into the deep questions of the field.⁶,⁹
• Showing that computation is fundamental, and often unavoidable, in most endeavors—a desire to proselytize.

Since starting a stint at NASA-Ames in 1983, I have been heavily involved with computational science and I have devoted a substantial part of my own career to advancing these objectives. Since 2003 I have advocated a great-principles approach to the perennially open question, “What is computer science?”⁴

Yet I am uneasy. I am concerned that the computational thinking movement reinforces a narrow view of the field and will not sell well with the other sciences or with the people we want to attract. I worry that we are not getting out of the box, but are merely repackaging it with new paper and a fresh ribbon.

In this column, I will examine two key questions:

• Is computational thinking a unique and distinctive characterization of computer science?
• Is computational thinking an adequate characterization of computer science?

My own conclusion is that both answers are no. I will suggest that a principles-based framework answers both questions yes. We are custodians of a deep and powerful discourse: Let’s not hide it with an inadequate name.

What is Computational Thinking?

Computational thinking has a long history within computer science. Known in the 1950s and 1960s as “algorithmic thinking,” it means a mental orientation to formulating problems as conversions of some input to an output and looking for algorithms to perform the conversions. Today the term has been expanded to include thinking with many levels of abstractions, use of mathematics to develop algorithms, and examining how well a solution scales across different sizes of problems.¹

Is Computational Thinking Unique to Computer Science?

In the 1940s, John von Neumann wrote prolifically on how computers would be not just a tool for helping science, but a way of doing science.

As early as 1975, Physics Nobel Laureate Ken Wilson promoted the idea that simulation and computation


were a way to do science that was not previously available. Wilson’s Nobel Prize was based on breakthroughs he achieved in creating computational models whose simulations produced radical new understandings of phase changes in materials. In the early 1980s, Wilson joined with other leading scientists in many fields to advocate that the grand challenges of science could be cracked by computation and that the government could accelerate the process by supporting a network of supercomputing centers.⁷ They argued that computation had become a third leg of science, joining the traditional legs of theory and experiment. The term “computational thinking” was common in their discussions.

The computational sciences movement eventually grew into a huge interagency initiative in high-performance computing, and culminated in the U.S. Congress passing a law funding a high-performance computing initiative in 1991.

This movement validated the notion that computation (and computational thinking) is essential to the advancement of science. It generated a powerful political movement that codified this notion into a U.S. federal law.

It is important to notice that this movement originated with the leaders of the physical and life sciences. Computer science was present but was not a key player. Computer scientists, in fact, resisted participation until NSF CISE and DARPA set up research programs open only to those collaborating with other sciences.

In the middle 1980s, Ken Wilson advocated the formation of departments of computational science in universities. He carefully distinguished them from computer science. The term “computational science” was chosen to avoid confusion with computer science.

Thus, computational science is seen in the other sciences not as a notion that flows out of computer science, but as a notion that flows from science itself. Computational thinking is seen as a characteristic of this way of science. It is not seen as a distinctive feature of computer science.

Therefore, it is unwise to pin our hopes on computational thinking as a way of telling people about the unique character of computer science. We need some other way to do that.

The sentiment that computational thinking is a recent insight into the true nature of computer science ignores the venerable history of computational thinking in computer science and in all the sciences. Computer science is a science in its own right (see the sidebar “Computer Science as Science”).

Computer Science as Science

Since its beginnings in the late 1930s, computer science has been a unique combination of math, engineering, and science. It is not one, but all three. Major subsets form legitimate fields of math, engineering, or science. But if you focus on a single subset, you cannot express the uniqueness of the field.

The term “computer science” traces back to the writings of John von Neumann, who believed that the architecture of machines and applications could be put on a rigorous scientific basis.

Until about 1990, the emphasis within the field was developing and advancing the technology. Building reliable computers within a networking infrastructure was a grand challenge that took many years. Now that this has been accomplished, we are increasingly able to emphasize the experimental method and reinvigorate our image as a science. Our many partnerships with other sciences including biology, physics, astronomy, materials science, economics, cognitive science, and sociology, have led to amazing innovations.

These collaborations have uncovered questions in the other fields about whether computer science is legitimately science. Many see computer people as engineers implementing principles they did not discover rather than equal partners in the search for new principles. So it matters whether computer science qualifies as a full-fledged science. Whether a field is seen as a science depends on its satisfying six criteria:⁵

• Has an organized body of knowledge
• Results are reproducible
• Has well developed experimental methods
• Enables predictions, including surprises
• Offers hypotheses open to falsification
• Deals with natural objects

Computer science easily passes the first five of these tests, so the debate has tended to center on the last. During the past decade, prominent scientists in other fields have discovered natural information processes—affirming the sixth criterion.³ The older definition of computer science as “the study of phenomena surrounding computers,” which dates back to Alan Perlis, George Forsythe, and Allen Newell around 1970, is giving way to “the study of information processes, natural and artificial.” The shift from computer as object of study to computer as tool is enabling us to revisit the deep questions of our field in the new light of computation as a lens through which to see the world. The most fundamental of these questions is: What is computation?⁶,⁹

Is Computational Thinking Adequate for Computer Science?

In 1936 Alan Turing defined what it means to compute a number. He offered a model of a computing machine and showed that the machines were universal (one could simulate another). He then used his theory to settle a century-old “decision problem” of mathematics, whether there is a by-inspection method to tell if a set of decision rules can terminate with a decision in a finite number of moves. He showed that the “decision problem” was not computable and argued that the very act of inspecting is inherently computational: not even inspectors can avoid computation. Computation is universal and unavoidable. His paper truly was the birth of computer science.

The modern formulations of science


recognize the same truth when they say ity to take care of the concerns listed at that computation is an essential meth- the beginning of this column. But giv- od of doing science. In fact, a growing The real value of en the outside perception, computa- number of scientists are now saying computer science tional thinking is all too easily seen as a that information processes occur nat- repackaging—a change of appearance urally (for example, DNA transcrip- is in the offers we but not of substance. Do we really want tion) and that computation is needed are able to make to replace that older notion with “CS = to understand and eventually control computational thinking”? A colleague them.3 So computation is unavoidable from our expertise, from another field recently said to me: not only in the method of study, but in which is founded “You computer scientists are hungry! what is studied. First you wanted us to take your courses This is a subtle but important dis- in a rich and deep on literacy and fluency. Now you want tinction. Computation is present in discourse. us to think like you!” nature even when scientists are not ob- I suggest that the real value of com- serving it or thinking about it. Compu- puter science is in the offers we are able tation is more fundamental than com- to make from our expertise, which is putational thinking. For this reason founded in a rich and deep discourse. alone, computational thinking seems We are valued at the table when we like an inadequate characterization of at which we can develop various levels help the others solve problems they computer science. of skill. Computational thinking is one care about. We are most valued not for A number of us developed a great of several key practices at which every our computational thinking, but for principles framework that exposes computer scientist should be compe- our computational doing. 
the fundamental scientific principles tent (see the sidebar “The Great Prin- of computing4,6 (see the sidebar “The ciples Framework”). It shortchanges References 1. Carnegie Mellon University Center for Computational Great Principles Framework”). This computer science to try to characterize Thinking; http://www.cs.cmu.edu/~CompThink. framework interprets computer sci- the field by mentioning only one essen- 2. Computer Science Unplugged Web site; http:// csunplugged.org. ence as the study of fundamental prop- tial practice without mentioning the 3. Denning, P. Computing is a natural science. Commun. erties of information processes, both others or the principles of the field. ACM 50, 7 (July 2007), 13–18. 4. Denning, P. Great principles of computing. Commun. natural and artificial. Computers are ACM 46, 11 (Nov. 2003), 15–20. the tool, not the object of study. Com- Conclusion 5. Denning, P. Is computer science science? Commun. ACM 48, 4 (Apr. 2005), 27–31. 2 putation pervades everyday life. Computation is widely accepted as a 6. Great Principles of Computing Web site; http:// The great principles framework lens for looking at the world. We do not greatprinciples.org. 7. Wilson, K.G. Grand challenges to computational reveals that there is something even need to sell that idea. Computational science. In Future Generation Computer Systems. more fundamental than an algorithm: Elsevier, 1989, 171–189. thinking is one of the key practices of 8. Wing, J. Computational thinking. Commun. ACM 49, 3 the representation. Representations computer science. But it is not unique (Mar. 2006), 33–35. 9. Wing, J. Five deep questions in computing. Commun. convey information. A computation is to computing and is not adequate to ACM 51, 1 (Jan. 2008), 58–60. an evolving representation and an al- portray the whole of the field. gorithm is a representation of a meth- In the 1960s and 1970s we allowed, Peter J. Denning ([email protected]) is the director of the od to control the evolution. 
and even encouraged, the perception Cebrowski Institute for Information Innovation and In this framework, computational “CS = programming,” which is now to Superiority at the Naval Postgraduate School in Monterey, CA, and is a past president of ACM. thinking is not a principle; it is a prac- our dismay widely accepted outside the tice. A practice is a way of doing things field and is connected with our inabil- Copyright held by author.

The Great Principles Framework

The Great Principles (GP) framework is a way to express computer science as a field of science based on deep and enduring fundamental principles.3,4,6 The framework has two parts: core principles and core practices.

The core principles are statements and stories about the immutable laws and recurrences that shape and constrain all computing technologies. They can be grouped into seven categories:
• Computation
• Communication
• Coordination
• Recollection
• Automation
• Evaluation
• Design

These are not mutually exclusive groups of principles, but windows that bring particular perspectives about computing. The Internet, for example, is a technology that draws its operating principles primarily from communication, coordination, and recollection, and its architecture from design and evaluation.

The core practices are areas of skill and ability at which computing people can display various levels of performance such as beginner, competent, and expert. There are four core practices:
• Programming
• Engineering of systems
• Modeling
• Applying

Computational thinking can be seen either as a style of thought that runs through the practices or as a fifth practice. It is the ability to interpret the world as algorithmically controlled conversions of inputs to outputs.

30 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6 viewpoints

DOI:10.1145/1516046.1516058
Richard Stallman

Viewpoint
Why “Open Source” Misses the Point of Free Software
Decoding the important differences in terminology, underlying philosophy, and value systems between two similar categories of software.

WHEN WE CALL software “free,” we mean it respects the users’ essential freedoms: the freedom to run it, to study and change it, and to redistribute copies with or without changes (see http://www.gnu.org/philosophy/free-sw.html). This is a matter of freedom, not price, so think of “free speech,” not “free beer.”

These freedoms are vitally important. They are essential, not just for the individual users’ sake, but because they promote social solidarity—that is, sharing and cooperation. They become even more important as more aspects of our culture and life activities are digitized. In a world of digital sounds, images, and words, free software increasingly equates with freedom in general.

Tens of millions of people around the world now use free software; the schools in regions of India and Spain now teach all students to use the free GNU/Linux operating system (see http://www.gnu.org/gnu/linux-and-gnu.html). But most of these users have never heard of the ethical reasons for which we developed this system and built the free software community, because today this system and community are more often described as “open source,” and attributed to a different philosophy in which these freedoms are hardly mentioned.

The free software movement has campaigned for computer users’ freedom since 1983. In 1984 we launched the development of the free operating system GNU, so we could avoid the non-free operating systems that deny freedom to their users. During the 1980s, we developed most of the essential components of such a system, as well as the GNU General Public License (see http://www.gnu.org/licenses/gpl.html), a license designed specifically to protect freedom for all users of a program.

However, not all of the users and developers of free software agreed with the goals of the free software movement. In 1998, a part of the free software community splintered off and began campaigning in the name of “open source.” The term was originally proposed to avoid a possible misunderstanding of the term “free software,” but it soon became associated with philosophical views quite different from those of the free software movement.

Some of the proponents of “open source” considered it a marketing campaign for free software, which would appeal to business executives by citing practical benefits, while avoiding ideas of right and wrong they might not like to hear. Other proponents flatly rejected the free software movement’s ethical and social values. Whichever their views, when campaigning for “open source” they did not cite or advocate those values. The term “open source” quickly became associated with the practice of citing only practical values, such as making powerful, reliable software. Most of the supporters of “open source” have come to it since then, and that practice is what they take it to mean.

Nearly all open source software is free software; the two terms describe almost the same category of software. But they stand for views based on fundamentally different values. Open source is a development methodology; free software is a social movement. For the free software movement, free software is an ethical imperative, because only free software respects the users’ freedom. By contrast, the philosophy of open source considers issues in terms of how to make software “better”—in a practical sense only. It says that non-free software is a suboptimal solution. For the free software movement, however, non-free software is a social problem, and moving to free software is the solution.

Free software. Open source. If it’s the same software, does it matter which name you use? Yes, because different words convey different ideas.


While a free program by any other name would give you the same freedom today, establishing freedom in a lasting way depends above all on teaching people to value freedom. If you want to help do this, it is essential to speak about “free software.”

We in the free software movement don’t think of the open source camp as an enemy; the enemy is proprietary (non-free) software. But we want people to know we stand for freedom, so we do not accept being misidentified as open source supporters.

Common Misunderstandings of “Free Software” and “Open Source”
The term “free software” has a problem of misinterpretation: an unintended meaning, “software you can get for zero price,” fits the term just as well as the intended meaning, “software that gives the user certain freedoms.” We address this problem by publishing the definition of free software, and by saying “Think of free speech, not free beer.” This is not a perfect solution; it cannot completely eliminate the problem. An unambiguous, correct term would be better, if it didn’t have other problems.

Unfortunately, all the alternatives in English have problems of their own. We’ve looked at many alternatives that people have suggested, but none is so clearly correct that switching to it would be a good idea. Every proposed replacement for “free software” has some kind of semantic problem—and this includes “open source software.”

The official definition of “open source software,” which is published by the Open Source Initiative (see http://opensource.org/docs/osd) and too long to cite here, was derived indirectly from our criteria for free software. It is not the same; it is a little looser in some respects, so open source supporters have accepted a few licenses that we consider unacceptably restrictive of the users. Nonetheless, it is fairly close to our definition in practice.

However, the obvious meaning for the expression “open source software” is “You can look at the source code,” and most people seem to think that’s what it means. That is a much weaker criterion than free software, and much weaker than the official definition of open source. It includes many programs that are neither free nor open source. Since that obvious meaning for “open source” is not the meaning that its advocates intend, the result is that most people misunderstand the term. Here is how writer Neal Stephenson defined “open source”: Linux is “open source” software meaning, simply, that anyone can get copies of its source code files.

I don’t think Stephenson deliberately sought to reject or dispute the “official” definition. I think he simply applied the conventions of the English language to come up with a meaning for the term. The state of Kansas published a similar definition: Make use of open-source software (OSS). OSS is software for which the source code is freely and publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code.

Open source supporters try to deal with this by pointing to their official definition, but that corrective approach is less effective for them than it is for us. The term “free software” has two natural meanings, one of which is the intended meaning, so a person who has grasped the idea of “free speech, not free beer” will not get it wrong again. But “open source” has only one natural meaning, which is different from the meaning its supporters intend. So there is no succinct way to explain and justify the official definition of “open source.” That makes for worse confusion.

Another common misunderstanding of “open source” is the idea that it means “not using the GNU GPL.” It tends to accompany a misunderstanding of “free software,” equating it to “GPL-covered software.” These are equally mistaken, since the GNU GPL is considered an open source license, and most of the open source licenses are considered free software licenses.

Different Values Can Lead to Similar Conclusions…But Not Always
Radical groups in the 1960s had a reputation for factionalism: some organizations split because of disagreements on details of strategy, and the two resultant groups treated each other as enemies despite having similar basic goals and values. The right wing made much of this, and used it to criticize the entire left.

Some try to disparage the free software movement by comparing our disagreement with open source to the disagreements of those radical groups. They have it backward. We disagree with the open source camp on the basic goals and values, but their views and ours lead in many cases to the same practical behavior—such as developing free software.

As a result, people from the free software movement and the open source camp often work together on practical projects such as software development. It is remarkable that such different philosophical views can so often motivate different people to participate in the same projects. Nonetheless, these views are very different, and there are situations where they lead to very different actions.

The idea of open source is that allowing users to change and redistribute the software will make it more powerful and reliable. But this is not guaranteed. Developers of proprietary software are not necessarily incompetent. Sometimes they produce a program that is powerful and reliable, even though it does not respect the users’ freedom. How will free software activists and open source enthusiasts react to that?

A pure open source enthusiast, one that is not at all influenced by the ideals of free software, will say, “I am surprised you were able to make the program work so well without using our development model, but you did. How can I get a copy?” This attitude will reward schemes that take away our freedom, leading to its loss.

The free software activist will say, “Your program is very attractive, but not at the price of my freedom. So I have to do without it. Instead I will support a project to develop a free replacement.”


If we value our freedom, we can act to maintain and defend it.

Powerful, Reliable Software Can Be Bad
The idea that we want software to be powerful and reliable comes from the supposition that the software is designed to serve its users. If it is powerful and reliable, that means it serves them better.

But software can only be said to serve its users if it respects their freedom. What if the software is designed to put chains on its users? Then powerfulness only means the chains are more constricting, and reliability that they are harder to remove. Malicious features, such as spying on the users, restricting the users, back doors, and imposed upgrades are common in proprietary software, and some open source supporters want to do likewise.

Under the pressure of the movie and record companies, software for individuals to use is increasingly designed specifically to restrict them. This malicious feature is known as DRM, or Digital Restrictions Management (see DefectiveByDesign.org), and it is the antithesis in spirit of the freedom that free software aims to provide. And not just in spirit: since the goal of DRM is to trample your freedom, DRM developers try to make it difficult, impossible, or even illegal for you to change the software that implements the DRM.

Yet some open source supporters have proposed “open source DRM” software. Their idea is that by publishing the source code of programs designed to restrict your access to encrypted media, and allowing others to change it, they will produce more powerful and reliable software for restricting users like you. Then it will be delivered to you in devices that do not allow you to change it.

This software might be “open source,” and use the open source development model; but it won’t be free software, since it won’t respect the freedom of the users that actually run it. If the open source development model succeeds in making this software more powerful and reliable for restricting you, that will make it even worse.

Fear of Freedom
The main initial motivation for the term “open source software” is that the ethical ideas of “free software” make some people uneasy. That’s true: talking about freedom, about ethical issues, about responsibilities as well as convenience, is asking people to think about things they might prefer to ignore, such as whether their conduct is ethical. This can trigger discomfort, and some people may simply close their minds to it. It does not follow that we ought to stop talking about these things.

However, that is what the leaders of “open source” decided to do. They figured that by keeping quiet about ethics and freedom, and talking only about the immediate practical benefits of certain free software, they might be able to “sell” the software more effectively to certain users, especially businesses.

This approach has proved effective, in its own terms. The rhetoric of open source has convinced many businesses and individuals to use, and even develop, free software, which has extended our community—but only at the superficial, practical level. The philosophy of open source, with its purely practical values, impedes understanding of the deeper ideas of free software; it brings many people into our community, but does not teach them to defend it. That is good, as far as it goes, but it is not enough to make freedom secure. Attracting users to free software takes them just part of the way to becoming defenders of their own freedom.

Sooner or later these users will be invited to switch back to proprietary software for some practical advantage. Countless companies seek to offer such temptation, some even offering copies gratis. Why would users decline? Only if they have learned to value the freedom free software gives them, to value freedom as such rather than the technical and practical convenience of specific free software. To spread this idea, we have to talk about freedom. A certain amount of the “keep quiet” approach to business can be useful for the community, but it is dangerous if it becomes so common that the love of freedom comes to seem like an eccentricity.

That dangerous situation is exactly what we have. Most people involved with free software say little about freedom—usually because they seek to be more acceptable to business. Software distributors especially show this pattern. Nearly all GNU/Linux operating system distributions add proprietary packages to the basic free system, and they invite users to consider this an advantage, rather than a step backward from freedom.

Proprietary add-on software and partially non-free GNU/Linux distributions find fertile ground because most of our community does not insist on freedom with its software. This is no coincidence. Most GNU/Linux users were introduced to the system by “open source” discussion, which doesn’t say freedom is a goal. The practices that don’t uphold freedom and the words that don’t talk about freedom go hand in hand, each promoting the other. To overcome this tendency, we need more, not less, talk about freedom.

Conclusion
As the advocates of open source draw new users into our community, we free software activists must work even more to bring the issue of freedom to those new users’ attention. We have to say, “It’s free software and it gives you freedom!”—more and louder than ever. Every time you say “free software” rather than “open source,” you help our campaign.

Further Reading
1. Joe Barr wrote an article called “Live and Let License” (see http://www.itworld.com/LWD010523vcontrol4) that gives his perspective on this issue.
2. Lakhani and Wolf’s paper on the motivation of free software developers (see http://freesoftware.mit.edu/papers/lakhaniwolf.pdf) states that a considerable fraction are motivated by the view that software should be free. This was despite the fact they surveyed the developers on SourceForge, a site that does not support the view that this is an ethical issue.

Richard Stallman ([email protected]) is the author of the free symbolic debugger GDB, the founder the project to develop the free GNU operating system, and the founder of the Free Software Foundation.

Copyright © 2009 Richard Stallman
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.


DOI:10.1145/1516046.1516057
George V. Neville-Neil

Kode Vicious
Obvious Truths
How to determine when to put the brakes on late-running projects and untested software patches.

Dear KV,
I’ve been working on a project that, like all software projects, is late. And we’re not just late a little, but a lot. The project was supposed to take four weeks and we’re now in our third month. People are blaming the usual suspects: poorly spec’d-out work, management interference, and lack of proper infrastructure. What I want to know is how late is too late? How do people decide to just stop a project?
Late and Getting Later

Dear Later,
If I understand you correctly, and I hope I can because your email message is both short and direct, you are involved in a project that has now taken more than twice as long as predicted to implement and is approaching the thrice mark. I would say this is scary if it weren’t so common. Projects take on a life of their own at some point and when a group of people get together and continue to try to “look on the bright side” they keep finding “silver linings,” even though they are now drenched by the rain. It is amazing to me that a group of people who often seem so hard-headed and pragmatic—that is, engineers—can continue to believe there is a pot of gold just over the rainbow somewhere. Many projects can go on for years when they should have only gone on for months, so long as the money doesn’t run out.

From my point of view there are a few good places to pause and reflect in the life of the project.
1. You have gotten to the end of the originally published schedule and the work is not even 50% complete.
2. The originally published schedule has been extended by 50% or more.
3. The schedule is updated daily and the dates keep getting further out.
4. The engineers avoid coming to team meetings and when they do attend they:
   a. break down in tears;
   b. pretend to be asleep;
   c. bang their heads on the table.

KV is in category C, but then I bet you knew that already. All of the above are indicators of schedule creep and a loss of control of the project. They are all good times to consider pulling the emergency brake handle. The reason the handle gets pulled so rarely is the aforementioned optimism of the staff, whereby if they “just work a little harder” the project will get done. I have never, in my entire working career, seen a project that is 50% off course get back on track because the team worked 80 hours a week instead of 60. Most people in high-pressure professions know how this works. The harder they work past a certain point, the more mistakes they make and the worse their output becomes. Pilots, fire fighters, emergency-room doctors, and the like all know that past a certain point everything they do will actually cause more trouble than if they did nothing at all. Because our profession is not as extreme as theirs we seem to never learn this, which is a shame, because it’s an important lesson. Learn when to stop.
KV

Driver Education
In this month’s installment of “things that ought to be obvious” I discuss patching, compiling, and testing code. I’m sure many of you have had these experiences before, and if you have a fun one to share please write to me and tell me about it.

I am sure most of you have heard the old programmer’s joke, “It compiles, ship it!,” which gets a good guffaw now and again from the denizens of cubicle land. I’m also sure that many readers have been subjected to using code that clearly compiled, ran maybe once, and then actually was shipped. But have you ever had to deal with people sending you patches that just didn’t work?

Recently, KV has been fixing a device driver that seems always to be very close to working. The driver wasn’t originally written by KV, and it certainly wasn’t originally tested by KV, although it now seems that the company I’m dealing with is using me as their unwitting alpha tester. There are few things more frustrating than a piece of software that almost works. It might tick along for days, doing just what it’s supposed to when—bang!—it breaks. With a bit of debugging and a bit of time in the lab I can explain what’s broken to the vendor. I even have source code for the driver so I can patch it when I understand what’s broken and send them patches; sometimes they send me patches.

It’s the part where they send me patches that has been a bit more interesting. I had been faithfully applying patches from the vendor and testing their fixes and I kept getting this sneaking feeling that they were not testing the patches before they sent them out. I had that feeling not just because I’m a naturally paranoid and suspicious person, which I am, but because each patch would fix say, only 70% or 80% of the problem and then I’d have to provide the remaining bit of the fix. Finally, I got a patch that proved that although I am paranoid, it is not without reason. I applied a patch and it didn’t compile: the C keyword struct had been spelled incorrectly. Ha! I had them. They had not even applied their own patch; they were just sending me hacked bits of the driver that they thought would work. All I could think was, “Did you even compile this!?!?” But of course I already knew the answer.

Now I don’t bring this up just because I like to say, “I told you so,” because I don’t. I’d much prefer the code I received worked the first time, since my employers expect the same from me. I bring this up as yet another example of unwarranted programmer optimism.

People who send out a “small patch” without even compiling it are far too confident in their own abilities. Please! Stop! Don’t do that! I don’t care if you see bits in your dreams and they assemble correctly in the morning when you type them in. The amount of time you waste by not doing the most basic tests on code you’re patching isn’t only your own; it’s multiplied by all the hapless suckers who took your patch and tried to use it.

Returning to my earlier remark, I would have thought this was obvious—as obvious as how to spell “struct”—but I have discovered this is not the case.
KV

George V. Neville-Neil ([email protected]) is the proprietor of Neville-Neil Consulting and a member of the ACM Queue editorial board. He works on networking and operating systems code for fun and profit, teaches courses on various programming-related subjects, and encourages your comments, quips, and code snips pertaining to his Communications column.

Copyright held by author.
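The first three checkpoints KV lists in the reply to Late and Getting Later are numeric, so they can be written down as simple checks. The following is a minimal sketch, not anything from the column itself: the `ProjectStatus` fields, the function name, and the choice to require at least three successive estimates before calling the dates "still slipping" are all assumptions made for illustration. The fourth checkpoint (engineer behavior in meetings) is left to human observation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProjectStatus:
    """Hypothetical schedule snapshot; every field name here is an assumption."""
    planned_weeks: float        # the originally published schedule
    elapsed_weeks: float        # time spent so far
    fraction_complete: float    # 0.0 to 1.0, the team's own estimate
    # Successive estimates of total duration, oldest first.
    estimate_history: List[float] = field(default_factory=list)


def pause_signals(status: ProjectStatus) -> List[str]:
    """Return the checkpoints from the list above that currently apply."""
    signals = []

    # Checkpoint 1: original schedule used up, work not even 50% complete.
    if (status.elapsed_weeks >= status.planned_weeks
            and status.fraction_complete < 0.5):
        signals.append("schedule exhausted with under 50% of the work done")

    # Checkpoint 2: the latest estimate is at least 50% over the original.
    history = status.estimate_history
    if history and history[-1] >= 1.5 * status.planned_weeks:
        signals.append("schedule extended by 50% or more")

    # Checkpoint 3: every revision is later than the one before it.
    # Requiring three or more estimates is an assumed threshold.
    if len(history) >= 3 and all(
            later > earlier for earlier, later in zip(history, history[1:])):
        signals.append("dates keep getting further out")

    return signals


if __name__ == "__main__":
    # Roughly the letter writer's situation: four weeks planned, third month.
    late = ProjectStatus(planned_weeks=4, elapsed_weeks=9,
                         fraction_complete=0.4, estimate_history=[6, 8, 12])
    for signal in pause_signals(late):
        print(signal)
```

Any one signal is a reason to pause and reflect; the completion fraction in the example is invented, since the letter does not state one.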



practice

DOI:10.1145/1516046.1516059
Article development led by queue.acm.org

New drive technologies and increased capacities create new categories of failure modes that will influence system designs.

BY JON ELERATH

Hard-Disk Drives: The Good, the Bad, and the Ugly

HARD-DISK DRIVES (HDDS) are like the bread in a peanut butter and jelly sandwich—seemingly unexciting pieces of hardware necessary to hold the software. They are simply a means to an end. HDD reliability, however, has always been a significant weak link, perhaps the weak link, in data storage. In the late 1980s people recognized that HDD reliability was inadequate for large data storage systems so redundancy was added at the system level with some brilliant software algorithms, and RAID (redundant array of independent disks) became a reality. RAID moved the reliability requirements from the HDD itself to the system of data disks. Commercial implementations of RAID include n+1 configurations such as mirroring, RAID-4 and RAID-5, and the n+2 configuration, RAID-6, which increases storage system reliability using two redundant disks (dual parity). Additionally, reliability at the RAID group level has been favorably enhanced because HDD reliability has been improving as well.

Several manufacturers produce one-terabyte HDDs and higher capacities are being designed. With higher areal densities (also known as bit densities), lower fly-heights (the distance between the head and the disk media), and perpendicular magnetic recording technology, can HDD reliability continue to improve? The new technology required to achieve these capacities is not without concern. Are the failure mechanisms or the probability of failure any different from predecessors? Not only are there new issues to address stemming from the new technologies, but also failure mechanisms and modes vary by manufacturer, capacity, interface, and production lot.

How will these new failure modes affect system designs? Understanding failure causes and modes for HDDs using technology of the current era and the near future will highlight the need for design alternatives and trade-offs that are critical to future storage systems. Software developers and RAID architects can not only better understand the effects of their decisions, but also know which HDD failures are outside their control and which they can manage, albeit with possible adverse performance or availability consequences. Based on technology and design, where must the developers and architects place the efforts for resiliency?

This article identifies significant HDD failure modes and mechanisms, their effects and causes, and relates them to system operation. Many failure mechanisms for new HDDs remain unchanged from the past, but the insidious undiscovered data corruptions (latent defects) that have plagued all HDD designs to one degree or another will continue to worsen in the near future as areal densities increase.
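To make the n+1 idea concrete, here is a minimal sketch of single-parity redundancy in the style of RAID-4 and RAID-5, where the parity block is the bytewise XOR of the data blocks in a stripe. It is an illustration under stated assumptions rather than how any particular product works: the function names are hypothetical, blocks are plain in-memory byte strings, and striping, parity rotation, and the dual parity of RAID-6 are left out.

```python
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks in a stripe must be the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def make_parity(data_blocks: List[bytes]) -> bytes:
    """Compute the one redundant block for n data blocks (the '+1')."""
    return xor_blocks(data_blocks)


def rebuild_missing(stripe: List[Optional[bytes]]) -> bytes:
    """Recover the single lost block in a stripe of data blocks plus parity.

    The lost block is marked with None. XOR-ing every surviving block
    (data and parity alike) yields the missing one.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one lost block")
    return xor_blocks([block for block in stripe if block is not None])


if __name__ == "__main__":
    data = [b"good", b"bad!", b"ugly"]
    parity = make_parity(data)
    # Simulate losing the second disk, then rebuild it from the rest.
    recovered = rebuild_missing([data[0], None, data[2], parity])
    print(recovered == data[1])
```

The error raised for a second lost block shows the limit of n+1 protection: a drive failure combined with one more unreadable block in the same stripe cannot be repaired from single parity, which is the situation the n+2 configuration is meant to cover.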

38 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6 practice

Two major categories of HDD failure can prevent access to data: those that fail the entire HDD and those that leave the HDD functioning but corrupt the data. Each of these modes has significantly different causes, probabilities, and effects. The first type of failure, which I term operational, is rather easy to detect, but has lower rates of occurrence than the data corruptions or latent defects that are not discovered until data is read. Figure 1 is a fault tree for the inability to read data, the topmost event in the tree, showing the two basic reasons that data cannot be read.

Figure 1: Fault tree for HDD read failures. [Top event: cannot read data. Operational failures (cannot find data): bad servo track, bad electronics, bad read head, can't stay on track, or SMART limit exceeded. Latent failures (data missing): error during writing (bad media, inherent bit errors, high-fly write) or written but destroyed (thermal asperities, corrosion, scratched media).]

Operational Failures: Cannot Find Data

Operational failures occur in two ways: first, data cannot be written to the HDD; second, after data is written correctly and is still present on the HDD uncorrupted, electronic or mechanical malfunction prevents it from being retrieved.

Bad servo track. Servo data is written at regular intervals on every data track of every disk surface. The servo data is used to control the positioning of the read/write heads. Servo data is required for the heads to find and stay on a track, whether executing a read, write, or seek command. Servo-track information is written only during the manufacturing process and can be neither reconstructed using RAID nor rewritten in the field. Media defects in the servo-wedges cause the HDD to lose track of the heads' locations or where to move the head for the next read or write. Faulty servo tracks result in the inability to access data, even though the data is written and uncorrupted. Particles, contaminants, scratches, or thermal asperities can damage servo data.

Can't stay on track. Tracks on an HDD are not perfectly circular; some are actually spiral. The head position is continuously measured and compared with where it should be. A PES (position error signal) repositions the head over the track. This repeatable run-out is all part of normal HDD head positioning control. NRRO (nonrepeatable run-out) cannot be corrected by the HDD firmware since it is nonrepeatable. Caused by mechanical tolerances from the motor bearings, actuator arm bearings, noise, vibration, and servo-loop response errors, NRRO can make the head positioning take too long to lock onto a track and ultimately produce an error. This mode can be induced by excessive wear and is exacerbated by high rotational speeds. It affects both ball and fluid-dynamic bearings. The insidious aspect of this type of problem is that it can be intermittent. Specific HDD usage conditions may cause a failure while reading data in a system, but under test conditions the problem might not recur.

Two very interesting examples of inability to stay on track are caused by audible noise. A video file available on YouTube shows a member of Sun's Fishworks team yelling at his disk drives and monitoring the latency in disk operations.[5] The vibrations from his yelling induce sufficient NRRO that the actuator cannot settle for over 520 ms. While most (some) of us don't yell at our HDDs, vibrations induced by thermal alarms (warning buzzers) have also been noted to induce NRRO and cause excessive latency and time-outs.

SMART limits exceeded. Today's HDDs collect and analyze functional and performance data to predict impending failure using SMART (self-monitoring analysis reporting technology). In general, sector reallocations are expected, and many spare sectors are available on each HDD. If an excessive number occurs in a specific time interval, however, the HDD is deemed unreliable and is failed out.

SMART isn't really that smart. One trade-off that HDD manufacturers face during design is the amount of RAM available for storing SMART data and the frequency and method for calculating SMART parameters. When the RAM containing SMART data becomes full, is it purged, then refilled with new data? Or are the most recent percentages (x%) of data preserved and the oldest (1-x)% purged? The former method means that a rate calculation such as read-error-rate can be erroneous if the memory fills up during an event that produces many errors. The errors before filling RAM may not be sufficient to trigger a SMART event, nor may the errors after the purge, but had the purge not occurred, the error conditions may easily have resulted in a SMART trip.

In general, the SMART thresholds are set very low, missing numerous conditions that could proactively fail an HDD. Making the trip levels more sensitive (trip at lower levels) runs the risk of failing HDDs with a few errors that really aren't progressing to the point of failure. The HDD may simply have had a series of reallocations, say, that went smoothly, mapping out the problematic area of the HDD. Integrators must assess the HDD manufacturer's implementation of SMART and see if there are other more instructive calculations. Integrators must at least understand the SMART data collection and analysis process at a very low level, then assess their specific usage pattern to decide whether the implementation of SMART is adequate or whether the SMART decisions need to be moved up to the system (RAID group) level.

Head games and electronics. Most head failures result from changes in the magnetic properties, not electrical characteristics. ESD (electrostatic discharge), high temperatures, and physical impact from particles affect magnetic properties. As with any highly integrated circuit, ESD can leave the read heads in a degraded mode. Subsequent moderate to low levels of heat may be sufficient to fail the read heads magnetically. A recent publication from Google didn't find a significant correlation between temperature and reliability. In my conversations with numerous engineers from all the major HDD manufacturers, none has said the temperature does not affect head reliability, but none has published a transfer function relating head life to time and temperature. The read element is physically hidden and difficult to damage, but heat can be conducted from the shields to the read element, affecting magnetic properties of the reader element, especially if it is already weakened by ESD.

The electronics on an HDD are complex. Failed DRAM and cracked chip capacitors have been known to cause HDD failure. As the HDD capacities increase, the buffer sizes increase and more RAM is required to cache writes. Is RAID at the RAM level required to assure reliability of the ever-increasing solid-state memory?

Operational Failure Data

In a number of studies on disk failure rates, all mean times between failures disagree with the manufacturers' specification.[1-3, 6, 7, 10, 11] More disconcerting are the realizations that the failure rates are rarely constant; there are significant differences across suppliers, and great differences within a specific HDD family from a single supplier. These inconsistencies are further complicated by unexpected and uncontrolled lot-to-lot differences.

In a population of HDDs that are all the same model from a single manufacturer, there can be statistically significant subpopulations, each having a different time-to-failure distribution with different parameters. Analyses of HDD data indicate these subpopulations are so different that they should not be grouped together for analyses because the failure causes and modes are different. HDDs are a technology that defies the idea of "average" failure rate or MTBF; inconsistency is synonymous with variability and unpredictability.

The following are examples of unpredictability that existed to such an extent that at some point in the product's life, these subpopulations dominated the failure rate:

• Airborne contamination. Particles within the enclosure tend to fail HDDs early (scratches and head damage). This can give the appearance of an increasing failure rate. After all the contaminated HDDs fail, the failure rate often decreases.

• Design changes. Manufacturers periodically find it necessary to reduce cost, resolve a design issue discovered late in the test phase, or improve yields. Often, the change creates an improvement in field reliability, but can create more problems than it solves. For example, one design change had an immediately positive effect on reliability, but after two years another failure mode began to dominate and the HDD reliability became significantly worse.

• Yield changes. HDD manufacturers are constantly tweaking their processes to improve yield. Unfortunately, HDDs are so complex that these yield enhancements can inadvertently reduce reliability. Continuous tweaks can result in one month's production being highly reliable and another month being measurably worse.

The net impact of variability in reliability is that RAID designers and software developers must develop logic and operating rules that will accommodate significant variability and the worst-case issues for all HDDs. Figure 2 shows a plot for three different HDD populations. If a straight line were to fit the data points and the slope were 1.0, then the population could be represented by a Weibull probability distribution and have a constant failure rate. (The Weibull distribution is used to create the common bathtub curve.) A single straight line cannot fit either population HDD#2 or HDD#3, so they do not even fit a Weibull distribution. In fact, these do not fit any single closed-form distribution, but are composed of multiple failure distributions from causes that dominate at different points in time. Figure 3 is an example of five HDD vintages from a single supplier. A straight line indicates a constant failure rate; the lower the slope, the more reliable the HDD. A vintage represents a product from a single month.

Figure 2: Weibull time to failure plot for three very different populations. [Axes: Probability of Failure versus Time to Failure, hrs; series: HDD #1, HDD #2, HDD #3.]

Figure 3: Failure rate over time for five vintages and the composite. [Axes: Probability of Failure versus Time to Failure, hrs; series: Vintage 1 through Vintage 5 and Composite.]

Latent Defects: Data is Corrupted or Missing

The preceding discussion centered on failure modes in which data was good (uncorrupted) but some other electrical, mechanical, or magnetic function was impaired. These modes are usually rather easily detected and allow the system operator to replace the faulty HDD, reconstruct data on the new HDD, and resume storage functions. But what about data that is missing or corrupted because it either was not written well initially or was erased or corrupted after being written well? All errors resulting from missing data are latent because the corrupted data is resident without the knowledge of the user (software). The importance of latent defects cannot be overemphasized. The combination of a latent defect followed by an operational failure is the most likely sequence to result in a double failure and loss of data.[1]

To understand latent defects better, consider the common causes.

Write errors can be corrected using a read-verify command, but these require an extra read command after writing, and can nearly double the effective time to write data. The BER (bit-error rate) is a statistical measure of the effectiveness of all the electrical, mechanical, magnetic, and firmware control systems working together to write (or read) data. Most bit errors occur on a read command and are corrected using the HDD's built-in error-correcting code algorithms, but errors can also occur during writes. While BER does account for some fraction of defective data, a greater source of data corruption is the magnetic recording media coating the disks.

The distance that the read-write head flies above the media is carefully controlled by the aerodynamic design of the slider, which contains the reader and writer elements. In today's designs, the fly height is less than 0.3 µ-in. Events that disturb the fly height, increasing it above the specified height during a write, can result in poorly written data because the magnetic-field strength is too weak. Remember that magnetic-field strength does not decrease linearly as a function of distance from the media, but is a power function, so field strength falls off very rapidly as the distance between the head and media increases. Writing data while the head is too high can result in the media being insufficiently magnetized so it cannot be read even when the read element is flying at the specified height. If writing over a previously written track, the old data may persist where the head was flying too high. For example, if all the HDDs in a cabinet are furiously writing at the same time, self-induced vibrations and resonances can be great enough to affect the fly height. Physically bumping or banging an HDD during a write or walking heavily across a poorly supported raised floor can create excessive vibration that affects the write.

A more difficult problem to solve is persistent increase in the fly height caused by buildup of lubrication or other hydrocarbons on the surface of the slider. Hydrocarbon lubricants are used in three places within enclosed HDDs. To reduce the NRRO, motors often use fluid-dynamic bearings. The actuator arm that moves the heads pivots using an enclosed bearing cartridge that contains a lubricant. The media itself also has a very thin layer of lubricant applied to prevent the heads from touching the media itself. Lubricants on the media can build up on the head under certain circumstances and cause the head to fly too high. Lube buildup can also mean that uncorrupted, well-written data cannot be read because the read element is too far from the media. Lube buildup can be caused by the mechanical properties of the lubricant, which is dependent upon the chemical composition. Persistent high fly height can also be caused by specific operations. For example, when not writing or reading, if the head is left to sit above the same track while the disks spin, lubricant can collect on the heads. In some cases simply powering down the HDD will cause the heads to touch down (as they are designed to do) in the landing zone to disturb the lube buildup. This is very design specific, however, and does not always work.

During the manufacturing process, the surface of the HDD is checked and defects are mapped out, and the HDD firmware knows not to write in these locations. They also add "padding" around the defective area, mapping out more blocks than the estimated minimum, creating additional physical distance around the defect that is not available for storing data. Since it is difficult to determine the exact length, width, and shape of a defect, the added padding provides an extra safeguard against writing on a media defect.

Media imperfections such as voids (pits), scratches, hydrocarbon contamination (various oils), and smeared soft particles can not only cause errors during writing, but also corrupt data after it has been written. The sputtering process used to apply some of the media layers can leave contaminants buried within the media. Subsequent contact by the slider can remove these bumps, leaving voids in which the media is defective. If data is already written there, the data is corrupted. If none is written, the next write process will be unsuccessful, but the user won't know this unless a write-verify command is used.

Early reliability analyses assumed that once written, data will remain undestroyed except by degradation of the magnetic properties of the media, a process known as bit-rot. Bit-rot, in which the magnetic media is not capable of holding the proper magnetic field to be correctly interpreted as a 0 or a 1, is really not an issue. Media can degrade, but the probability of this mode is inconsequential compared with other modes. Data can become corrupted whenever the disks are spinning, even when data is not being written to or read from the disk. Common causes for erasure include thermal asperities, corrosion, and scratches or smears.

Thermal asperities are instances of high heat for a short duration caused by head-disk contact. This is usually the result of heads hitting small "bumps" created by particles that remain embedded in the media surface even after burnishing and polishing. The heat generated on a single contact can be high enough to erase data. Even if not on the first contact, cumulative effects of numerous contacts may be sufficient to thermally erase data or mechanically destroy the media coatings and erase data.

The sliders are designed to push away airborne particles so they do not become trapped between the head and disk surface. Unfortunately, removing all particles that are in the 0.3 µ-in. range is very difficult, so particles do get caught. Hard particles used in the manufacture of an HDD, such as Al2O3, TiW, and C, will cause surface scratches and data erasure. These scratches are then media defects that are not mapped out, so the next time data is written to those locations the data will be corrupted immediately. Other "soft" materials such as stainless steel can come from assembly tooling and aluminum from residuals from machining the case. Soft particles tend to smear across the surface of the media, rendering the data unreadable and unwritable. Corrosion, although carefully controlled, can also cause data erasure and may be accelerated by high ambient heat within the HDD enclosure and the very high heat flux from thermal asperities.

Latent Defects Data

Latent defects are the most insidious kinds of errors. These data corruptions are present on the HDD but undiscovered until the data is read. If no operational failures occur at the first reading of the data, the corruption is corrected using the parity disk and no data is lost. If one HDD, however, has experienced an operational failure and the RAID group is in the process of reconstruction when the latent defect is discovered, that data is lost. Since latent defects persist until discovered (read) and corrected, their rate of occurrence is an extremely important aspect of RAID reliability.

One study concludes that the BER is fairly inconsequential in terms of creating corrupted data,[4] while another claims the rate of data corruption is five times the rate of HDD operating failures.[8] Analyses of corrupted data identified by specific SCSI error codes and subsequent detailed failure analyses show that the rate of data corruption for all causes is significant and must be included in the reliability model.

NetApp (Network Appliance) completed a study in late 2004 on 282,000 HDDs used in RAID architecture. The RER (read-error rate) over three months was 8 × 10^-14 errors per byte read. At the same time, another analysis of 66,800 HDDs showed an RER of approximately 3.2 × 10^-13 errors per byte. A more recent analysis of 63,000 HDDs over five months showed a much-improved 8 × 10^-15 errors per byte read. In these studies, data corruption is verified by the HDD manufacturer as an HDD problem and not a result of the operating system controlling the RAID group.

While Jim Gray of Microsoft Research asserted that it is reasonable to transfer 4.32 × 10^12 bytes/day/HDD, the study of 63,000 HDDs read 7.3 × 10^17 bytes of data in five months, an approximate read rate of 2.7 × 10^11 bytes/day/HDD.[4] Using combinations of the RERs and number of bytes read yields the hourly read failure rates shown in the table here.

Table: Range of average read error rates.

Read errors per byte       Bytes read per hour
per HDD                    Low rate (1.35 × 10^9)     High rate (1.35 × 10^10)
Low (8.0 × 10^-15)         1.08 × 10^-5 err/hr        1.08 × 10^-4 err/hr
Medium (8.0 × 10^-14)      1.08 × 10^-4 err/hr        1.08 × 10^-3 err/hr
High (3.2 × 10^-13)        4.32 × 10^-4 err/hr        4.32 × 10^-3 err/hr

Latent defects do not occur at a constant rate, but in bursts or adjacent physical (not logical) locations. Although some latent defects are created by wear-out mechanisms, data is not available to discern wear-out from those that occur randomly at a constant rate. These rates are between 2 and 100 times greater than the rates for operational failures.

Potential Value of Data Scrubbing

Latent defects (data corruptions) can occur during almost any HDD activity: reading, writing, or simply spinning. If not corrected, these latent defects will result in lost data when an operational failure occurs. They can be eliminated, however, by background scrubbing, which is essentially preventive maintenance on data errors. During scrubbing, which occurs during times of idleness or low I/O activity, data is read and compared with the parity. If they are consistent, no action is taken. If they are inconsistent, the corrupted data is recovered and rewritten to the HDD. If the media is defective, the recovered data is written to new physical sectors on the HDD and the bad blocks are mapped out.

If scrubbing does not occur, the period of time to accumulate latent defects starts when the HDD begins operation in the system. Since scrubbing requires reading and writing data, it can act as a time-to-failure accelerator for HDD components with usage-dependent time-to-failure mechanisms. The optimal scrub pattern, rate, and time of scrubbing is HDD-specific and must be determined in conjunction with the HDD manufacturer to assure that operational failure rates are not increased.

Frequent scrubbing can affect performance, but too infrequent scrubbing makes the n+1 RAID group highly susceptible to double disk failures. Scrubbing, as with full HDD data reconstruction, has a minimum time to cover the entire HDD. The time to complete the scrub is a random variable that depends on HDD capacity and I/O activity. The operating system may invoke a maximum time to complete scrubbing.

Future Technology and Trade-Offs

How are those failure modes going to impact future HDDs that have more than one-terabyte capacity? Certainly, all the failure mechanisms that occur in the 1TB drive will persist in higher density drives that use perpendicular magnetic recording (PMR) technology. PMR uses a "thick," somewhat soft underlayer, making it susceptible to media scratching and gouging. The materials that cause media damage include softer metals and compositions that were not as great a problem in older, longitudinal magnetic recording. Future higher density drives are likely to be even more susceptible to scratching because the track width will be narrower.

Another PMR problem that will persist as density increases is side-track erasure. Changing the direction of the magnetic grains also changes the direction of the magnetic fields. PMR has a return field that is close to the adjacent tracks and can potentially erase data in those tracks. In general, the track spacing is wide enough to mitigate this mechanism, but if a particular track is written repeatedly, the probability of side-track erasure increases. Some applications are optimized for performance and keep the head in a static position (few tracks). This increases the chances of not only lube buildup (high fly writes) but also erasures.

One concept being developed to increase bit-density is heat assisted magnetic recording (HAMR).[9] This technology requires a laser within the write head to heat a very small area on the media to enable writing. High-stability media using iron-platinum alloys allow bits to be recorded on much smaller areas than today's standard media without being limited by super-paramagnetism. Controlling the amount and location of the heat are, of course, significant concerns.

RAID is designed to accommodate corrupted data from scratches, smears, pits, and voids. The data is re-created from the parity disk and the corrupted data is reconstructed and rewritten. Depending on the size of the media defect, this may be a few blocks or hundreds of blocks. As the areal density of the HDDs increases, the same physical size of the defect will affect more blocks or tracks and require more time for re-creation of data. One trade-off is the amount of time spent recovering corrupted data. A desktop HDD (most ATA drives) is optimized to find the data no matter how long it takes. In a desktop there is no redundancy and it is (correctly) assumed that the user would rather wait 30-60 seconds and eventually retrieve the data than to have the HDD give up and lose data.

Each HDD manufacturer has a proprietary set of recovery algorithms it employs to recover data. If the data cannot be found, the servo controller will move the heads a little to one side of the nominal center of the track, then to the other side. This off-track reading may be performed several times at different off-track distances. This is a very common process used by all HDD manufacturers, but how long can a RAID group wait for this recovery?

Some RAID integrators may choose to truncate these steps with the knowledge that the HDD will be considered failed even though it is not an operational failure. On the other hand, how long can a RAID group response be delayed while one HDD is trying to recover data that is readily recoverable using RAID? Also consider what happens when a scratch is encountered. The process of recovery for a large number of blocks, even if the process is truncated, may result in a time-out condition. The HDD is off recovering data or the RAID group is reconstructing data for so long that the performance comes to a halt; a time-out threshold is exceeded and the HDD is considered failed.

One option is quickly to call the offending HDD failed, copy all the data to a spare HDD (even the corrupted data), and resume recovery. A copy command is much quicker than reconstructing the data based on parity, and if there are no defects, little data will be corrupted. This means that reconstruction of this small amount of data will be fast and not result in the same time-out condition. The offending HDD can be (logically) taken out of the RAID group and undergo detailed diagnostics to restore the HDD and map out bad sectors.

In fact, a recent analysis shows the true impact of latent defects on the frequency of double disk failures.[1] Early RAID papers stated that the only failures of concern were operational failures because, once written, data does not change except by bit-rot.

Improving Reliability

Hard-disk drives don't just fail catastrophically. They may also silently corrupt data. Unless checked or scrubbed, these data corruptions result in double disk failures if a catastrophic failure also occurs. Data loss resulting from these events is the dominant mode of failure for an n+1 RAID group. If the reliability of RAID groups is to increase, or even keep up with technology, the effects of undiscovered data corruptions must be mitigated or eliminated. Although scrubbing is one clear answer, other creative methods to deal with latent defects should be explored.

Multi-terabyte capacity drives using perpendicular recording will be available soon, increasing the probability of both correctable and uncorrectable errors by virtue of the narrowed track widths, lower flying heads, and susceptibility to scratching by softer particle contaminants. One mitigation factor is to turn uncorrectable errors into correctable errors through greater error-correcting capability on the drive (4KB blocks rather than 512- or 520-byte blocks) and by using the complete set of recovery steps. These will decrease performance, so RAID architects must address this trade-off.

Operational failure rates are not constant. It is necessary to analyze field data, determine failure modes and mechanisms, and implement corrective actions for those that are most problematic. The operating system should consider optimizations around these high-probability events and their effects on the RAID operation.

Only when these high-probability events are included in the optimization of the RAID operation will reliability improve. Failure to address them is a recipe for disaster.

Related articles on queue.acm.org

You Don't Know Jack about Disks
Dave Anderson
http://queue.acm.org/detail.cfm?id=864058

CTO Roundtable: Storage
http://queue.acm.org/detail.cfm?id=1466452

A Conversation with Jim Gray
http://queue.acm.org/detail.cfm?id=864078

References

1. Elerath, J.G. Reliability model and assessment of redundant arrays of inexpensive disks (RAID) incorporating latent defects and non-homogeneous Poisson process events. Ph.D. dissertation, Department of Mechanical Engineering, University of Maryland, 2007.
2. Elerath, J.G. and Pecht, M. Enhanced reliability modeling of RAID storage systems. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (Edinburgh, UK, June 2007).
3. Elerath, J.G. and Shah, S. Server class disk drives: How reliable are they? In Proceedings of the Annual Reliability and Maintainability Symposium (January 2004), 151-156.
4. Gray, J. and van Ingen, C. Empirical measurements of disk failure rates and error rates. Microsoft Research Technical Report MSR-TR-2005-166, December 2005.
5. Gregg, B. Shouting in the datacenter, 2008; http://www.youtube.com/watch?v=tDacjrSCeq4.
6. Pinheiro, E., Weber, W.-D., and Barroso, L.A. Failure trends in a large disk drive population. In Proceedings of the Fifth Usenix Conference on File and Storage Technologies (FAST) (February 2007).
7. Schroeder, B. and Gibson, G. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings of the Fifth Usenix Conference on File and Storage Technologies (FAST) (February 2007).
8. Schwarz, T.J.E., et al. Disk scrubbing in large archival storage systems. In Proceedings of the IEEE Computer Society Symposium (2004), 1161-1170.
9. Seigler, M. and McDaniel, T. What challenges remain to achieve heat-assisted magnetic recording? Solid State Technology (Sept. 2007); http://www.solid-state.com/display_article/304597/5/ARTCL/none/none/What-challenges-remain-to-achieve-heat-assisted-magnetic-recording?/.
10. Shah, S. and Elerath, J.G. Disk drive vintage and its affect on reliability. In Proceedings of the Annual Reliability and Maintainability Symposium (January 2004), 163-167.
11. Sun, F. and Zhang, S. Does hard-disk drive failure rate enter steady-state after one year? In Proceedings of the Annual Reliability and Maintainability Symposium, IEEE (January 2007).

Jon Elerath is a staff reliability engineer at SolFocus. He has focused on hard-disk drive reliability for more than half his 35-plus-year career, which includes positions at NetApp, General Electric, Tegal, Tandem Computers, Compaq, and IBM.

© 2009 ACM 0001-0782/09/0600 $10.00
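
The hourly read-failure rates in the article's table of average read error rates appear to be simple products of a read-error rate (errors per byte) and a read volume (bytes per hour). The following Python sketch reproduces that arithmetic under this assumption. The dictionary and function names are hypothetical; only the numeric inputs are taken from the table.

```python
# Read-error rates (errors per byte read) and read volumes (bytes per hour)
# as listed in the article's table; the labels are illustrative.
READ_ERRORS_PER_BYTE = {"low": 8.0e-15, "medium": 8.0e-14, "high": 3.2e-13}
BYTES_READ_PER_HOUR = {"low": 1.35e9, "high": 1.35e10}


def hourly_read_error_rate(errors_per_byte, bytes_per_hour):
    """Expected read errors per hour per HDD, assuming a simple product."""
    return errors_per_byte * bytes_per_hour


# Build every combination of error rate and read volume.
table = {
    (rer_label, volume_label): hourly_read_error_rate(rer, volume)
    for rer_label, rer in READ_ERRORS_PER_BYTE.items()
    for volume_label, volume in BYTES_READ_PER_HOUR.items()
}

for (rer_label, volume_label), rate in table.items():
    print(f"{rer_label:>6} RER, {volume_label:>4} read rate: {rate:.2e} err/hr")
```

The reciprocal of each rate gives a rough mean time between read errors per drive in hours, which is one way to compare these figures against a chosen scrub interval.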


DOI:10.1145/1516046.1516060
Article development led by queue.acm.org

The history of NFE processors sheds light on the trade-offs involved in designing network stack software.

BY MIKE O'DELL

Network Front-end Processors, Yet Again

"This time for sure, Rocky!" (Bullwinkle J. Moose)

The history of the network front-end (NFE) processor, best known as a TCP offload engine (or TOE), extends back to the Arpanet interface message processor and possibly before. The notion is beguilingly simple: partition the work of executing communications protocols from the work of executing the applications that require the services of those protocols. That way, the applications and the network machinery can achieve maximum performance and efficiency, possibly taking advantage of special hardware performance assistance. While this looks utterly compelling on the whiteboard, architectural and implementation realities intrude, often with considerable force.

This article will not attempt to discern whether the NFE is a heavenly gift or a manifestation of evil incarnate. Rather, it will follow its evolution starting from a pure host-based implementation of a network stack and then moving the network stack farther from that initial position, observing the issues that arise. The goal is to offer insight into the trade-offs that influence the location choice for network stack software in a larger systems context. As such, it is an attempt to prevent old mistakes from being reinvented while harvesting as much clean grain as possible.

As a starting point, consider the canonical structure of a common workstation or server before the advent of multicore processors. Ignoring the provenance of the operating-system code, this model springs directly from the quintessential early to mid-1980s computer science department computer, the DEC VAX 11/780 with a 10Mb Ethernet interface with single-cycle direct memory access (DMA) ability and connected to a relatively slow 16-bit bus (the DEC Unibus).

Since there is only one processor, the network stack vies for the attention of the CPU with everything else running on the machine, albeit probably with the aid of a software priority mechanism that makes the network code "more equal than others."

When a packet arrives, the Ethernet interface validates the Ethernet frame cyclic redundancy check (CRC) and then uses DMA to transfer the packet into buffers used by the network code for protocol processing. The DMA transfers require only one local bus cycle for each 16-bit word, and on the VAX 11/780 the processor controller for the Unibus buffers 16-bit words into a single 32-bit transfer into main memory.

The TCP checksum is then calculated by the network code, the protocol state machinery conducts its business, and the TCP payload data is copied into "socket buffers" to await consumption

46 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6 A VAX-11/780 from 1983 with 16MB of RAM, and the Ethernet interface containing a Motorola 68000 processor to handle the network traffic.

by the application program. When the 10 megabits/second of network per- PC platform, that conspicuous lack read for the payload data happens, it formance.” The 10Mbps Ethernet can prompted major renovations of the is copied from the socket buffer into deliver about a megabyte/second of PC’s I/O architecture. For the period of application process memory to be di- payload, so this is consistent with the our interest, that progressed from the gested as required. That makes a total other folk theorem of “one megabyte 16-bit ISA bus, to 32-bit PCI, and now of four passes over the data in a single of memory per MIPS per megabyte of PCI Express. For reasons too boring packet before the application gets a I/O.” Where this came from is difficult to explore here, for a very long time, shot at using it. When networks were to pin down, but it is frequently cred- packets moved from PC Ethernet cards slow compared with memory band- ited to Gene Amdahl. into protocol processing buffers with a width and processor speed, the data- Now, let’s move this same model byte-copy operation performed by the copy inefficiency was considered mi- to PC hardware. For a long time, one CPU, upping the data-handling pass INNEGAN F nor compared with the joy of a working of the principal distinctions between count to five. network stack, so it failed to provoke PCs and minicomputers was I/O per- The first significant improvement immediate improvement. formance. To be brutal, compared came when the raw-packet copy op- H BY PATRICK P This base-case platform appears with its minicomputer forebears, the eration and TCP checksum were com- to be the origin of the folk theorem PC platform started life with almost bined. Some network code tried to

PHOTOGRA that “TCP needs one (VAX-)MIPS per no I/O capabilities. Over the life of the do this in software. As PCI Ethernet


cards developed efficient DMA hardware, some combined the TCP checksum generation with the copy operation, reducing the pass count to three. This clearly reduced CPU use for a given amount of TCP throughput and started the march to “protocol assist” services performed by network interfaces. (“If a little help is good, a lot of help should be better!”) Adapting the network stack code to exploit this new checksum capability was not trivial, but the handwriting on the wall made it clear that such evolution was likely to continue. Significant redesign of the network code had to be done to allow functions to move between hardware and software with greater ease in the future. This was genuine architectural progress, although it did not happen overnight.

A Success Disaster

With the explosion of the Web, performance demands on network servers skyrocketed. Processors and network interfaces were getting faster, and memory bandwidth strangulation was being solved. Gigabit Ethernet quickly became commonplace on server motherboards (and gamer desktop motherboards!). By this time, the cost of all those data copies was clearly unacceptable. Simply halving the number of copies would come close to doubling the sustainable transaction rate for many Web workloads.

This gave rise to the Holy Grail of what became known as zero-copy TCP. The idea was that programs written to exploit this new capability could have data delivered right into application buffers without any intervening copies (ignoring the possible exception of one efficient DMA transfer from the hardware). Clearly this would require some cooperation (or at least reduced antagonism) from designers of Ethernet interface hardware, but a working solution would win many hearts and minds.

The step from a zero-copy TCP network stack to a full-blown TCP offload engine looks pretty obvious at this point. It seems even more attractive given that many PC-based platforms were slow to exploit the multiprocessor abilities the PC was developing. (Whether it is multiple chips or multiple cores on one chip is largely irrelevant.) The ability to add a fast processor that can be applied entirely to protocol processing is certainly an attractive idea. It is, however, much more difficult to do in real life than it first appears on the whiteboard.

Simply moving data directly off the network wire into application buffers is not sufficient. The delivery of packets must be coordinated with all the other things the application is doing and all the other operating-system machinery behind the scenes. As a result, the network protocol stack interacts with the rest of the operating system in exquisitely delicate ways. Truth be told, this coordination machinery is the lion’s share of the code in most stack implementations. The actual TCP state machine fits on a half page, once divorced of all the glue and scaffolding needed to integrate it with the rest of the system environment. It is precisely this subtle and complex control coupling that makes it surprisingly difficult to isolate a network protocol stack fully from its host operating system. There are multiple reasons why this interaction is such a rich breeding ground for implementation bugs, but one vast category is “abstraction mismatch.”

Because communications protocols inherently deal with multiple communicating entities, some assumptions must be made about the behavior of those entities. The degree to which those assumptions match between a host system and protocol code determines how difficult it will be to map to existing semantics and how much new structure and machinery will be required. When networking first went into Berkeley Unix, subtleties on both sides required considerable effort to reconcile. There was a critical desire to make network connections appear to be natural extensions of existing Unix machinery: file descriptors, pipes, and the other ideas that make Unix conceptually compact. But because of radical differences in behavior, especially delay, it is impossible to completely disguise reading 1,000 bytes from a round-the-world network connection so that it appears indistinguishable from reading that same 1,000 bytes from a file on a local file system. Networks have new behaviors that require new interfaces to capture and manage, but those new interfaces must make sense with existing interfaces. This was difficult work, and the modifications left few pieces of the system untouched; a few changed in profound ways.

The fundamental capabilities provided by a network protocol stack are data transfer, multiplexing, flow control, and error management. All of these functions are required for the coordinated delivery of data between endpoints across the Internet. Indeed, that is the purpose of all the structure in the packet headers: to carry the control coordination information, as well as the payload data.

The critical observation is that the exact same operations are required to coordinate the interaction of a network protocol stack and the host operating system within a single system. When all the code is in the same place (that is, running on the same processor), this signaling is easily done with simple procedure calls. If, however, the network protocol stack executes on a remote processor such as a TOE, this signaling must be done with an explicit protocol carried across whatever connects the front-end processor to the host operating system. This protocol is called a host-front end protocol (HFEP).

Designing an HFEP is not trivial, especially if the goal is that it be materially simpler than the protocol being offloaded to the remote processor. Historically, the HFEP has been the Achilles’ heel of NFE processors. The HFEP ends up being asymptotically as complex as the “primary” protocol being offloaded, so there is very little to gain in offloading it. In addition, the HFEP must be implemented twice: once in the host and once in the front-end processor, each one of those being a different host platform as far as the HFEP is concerned. Two implementations, two integrations with host operating systems: this means twice as many sources of subtle race conditions, deadlocks, buffer starvations, and other nasty bugs. This cost requires a huge payoff to cover it.

But Wait a Minute…

About now some readers may be eager to throw a penalty flag for “unconvincing hand waving” because even in the base case, there is a protocol between the Ethernet interface and the host computer device driver. “Doesn’t that count?” you rightfully ask. Yes, indeed, it does.

There is a long history of peripheral chips being designed with absolutely dreadful interfaces. Such chips have been known to make device-driver writers contemplate slow, painful violence if they ever meet the chip designer in a dark alley. The very early Ethernet chips from one famous semiconductor company were absolute masterpieces of egregious overdesign. Not only did they contain many complex functions of dubious utility, but also the functions that were genuinely required suffered from the same virulent infestation of bugs that plagued the useless bits. Tom Lyon wrote a famous Usenix paper in 1985, “All the Chips that Fit,” delivering an epic rant on this expansive topic. (It should be required reading for anyone contemplating hardware design.)

If the goal is efficiency and performance of network code, all of the “mini-protocols” in the entire network protocol subsystem must be examined carefully. Both internal complexity and integration complexity can be serious bottlenecks. Ultimately, the question is: how hard is it to glue this piece onto the other pieces it must interact with frequently? If it is very difficult, it is likely not fast (in an absolute sense), nor is it likely robust from a bug standpoint.

Remember the protocol state machines are generally not the principal source of complexity or performance issues. One extra data copy can make a huge difference in the maximum achievable performance. Therefore, implementations must focus on avoiding data motion: put it where it goes the first time it is touched, then leave it alone. If some other operation on packet payload is required, such as checksum computation, bury it inside an unavoidable operation such as the single transfer into memory. In line with those suggestions, streamline the operating-system interface to maximize concurrency. Once all those issues have been addressed aggressively, there’s not a lot of work left to avoid.

What Does All this Mean for NFEs?

Many times, but not every time, an NFE is likely to be an overly complex solution to the wrong part of the problem. It is possibly an expedient short-term measure (and there’s certainly a place in the world for those), but as a long-term architectural approach, the commoditization of processor cores makes specialized hardware very difficult to justify.

Lacking NFEs, what is required for maximizing host-based network performance? Here are some guidelines:

• Wire interfaces should be designed to be fast and brilliantly simple. Do the bit-speed work and then get the data into memory as quickly as possible, doing any additional work such as checksums that can readily be buried in the unavoidable transfer. Streamline the device as seen by the driver so as to avoid playing “Twenty Questions” with the hardware to determine what just happened.
• Interconnects should have sufficient capacity to carry the network traffic without strangling other I/O operations. From the standpoint of a network interface, PCI Express appears to have adequate performance for 10Gbps Ethernet as does HyperTransport 3.0.
• The system must have sufficient memory bandwidth to get the network payload in and out without strangling the rest of the system, especially the processors. Historically, the PC platform has been chronically starved for memory bandwidth.
• Processors should have enough cores able to exploit the sufficient memory bandwidth.
• Network protocol stacks should be designed to maximize parallelism and minimize blocking, while never copying data.
• A set of network APIs should be designed to maximize performance as opposed to mandatory similarity with existing system calls. Backward compatibility is important to support, but some applications may wish to pay more to get more.

Historical Perspective

NFEs have been rediscovered in at least four or five different periods. In the spirit of full and fair disclosure, I must admit to having directly contributed to two of those efforts and having purchased and integrated yet another. So why does this idea keep recurring if it turns out to be much more difficult than it first appears?
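A brief aside before the history: the earlier advice to bury checksum computation inside an unavoidable transfer can be made concrete with a small sketch. The Python below is an illustration only, not code from any real network stack. The function name copy_with_checksum and the sample payload are assumptions made for this example, and the sum shown is just the 16-bit ones'-complement Internet checksum over the payload bytes (a real TCP checksum also covers a pseudo-header and the TCP header). A production stack would do this in the DMA engine or in tuned native code; the point here is only that the copy and the sum share a single pass over the data.

```python
def copy_with_checksum(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and return the 16-bit Internet checksum of src.

    Both jobs happen in one pass: every 16-bit word is stored and summed
    while it is in hand, so the payload is never walked a second time.
    """
    n = len(src)
    if len(dst) < n:
        raise ValueError("destination buffer is too small")

    total = 0
    for i in range(0, n - 1, 2):
        hi, lo = src[i], src[i + 1]
        dst[i], dst[i + 1] = hi, lo      # the copy
        total += (hi << 8) | lo          # the checksum, in the same pass
    if n % 2:                            # odd length: pad with a zero byte
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8

    while total >> 16:                   # fold carries back in (ones' complement)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical packet payload and "socket buffer", for illustration only.
payload = bytes.fromhex("0001f203f4f5f6f7")
socket_buffer = bytearray(len(payload))
checksum = copy_with_checksum(payload, socket_buffer)
print(hex(checksum))  # prints 0x220d
```

Doing the two jobs separately would touch every byte twice; folding them together is the kind of change that took the pass count from five to three in the account above.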


The capacities and economics of computer systems do not advance smoothly, nor are the rates of improvement of various components synchronized. The resulting interactions produce dramatically different trade-offs in system partitioning that evolve over time. What is correct today may not be right after the next technology improvement. An example will illustrate the point.

Once upon a time, disk storage was expensive, really expensive, but it also exhibited significant economy of scale. At that time, LAN connectivity and processor performance were sufficient to make it desirable to share large disks among multiple workstations, giving rise to the diskless workstation. This lasted for a number of years, but as disks slid down the learning curve, the decreasing cost per megabyte of disk space overwhelmed the operational complexity of diskless workstations so they became diskfull, and they have been ever since, until relatively recently. Today the typical large organization averages the better part of one PC per employee, so the operational grief of administering all those desktop PCs is substantial. This cost is now high enough that the diskless workstation has been rediscovered, this time named thin clients. All the storage is elsewhere; nothing permanent exists on the desktop unit. History is busily repeating itself. Why? Because the various cost curves have moved enough, relative to each other, to the point where centralization makes sense.

The same thing happens with NFEs. At a point in time, systems don’t have enough network “go-fast” to deliver the performance required, so just add a dedicated processor to the network interface to make up for it. The economics of that are fleeting at best, however. Between chip design and system-integration complexity, an NFE will need to be an economically attractive solution for quite some time to recoup the development costs. Unfortunately, the relentless improvements in processor, memory system, and system interconnect in the base PC platform make that window of advantage a shrinking, fast-moving target. Does anyone else remember the HiFN file compression processor chip? It was built into PC systems for a very short time. Processors quickly improved enough to do compression/decompression on the fly, however, and that was the end of HiFN’s dream, well before the dropping cost of disk storage would have killed it.

Any effort to question the efficacy of NFEs should include a caveat for one particular case that merits a special mention because it indeed makes a compelling case for a particular style of NFE.

The proliferation of microcontrollers in devices such as thermostats, light switches, toasters, and almost everything else with more than a simple on/off switch has created a real opportunity for NFEs. Almost all of these microcontroller applications are typified by intense cost pressure, which usually translates into extreme limitations on available computing resources. It is simply out of the question to put a network stack in the vast majority of these systems, but the desirability of remote management of these devices increases daily.

This has created a new breed of NFE: the network communications adapter (NCA) that specializes in the simplicity of the protocol between the microcontroller host and the NFE: serial ASCII. Most microcontrollers have some serial port ability, so by looking like a terminal, the NCA can play the role of translator, speaking serial out one side and TCP/IP out the other. The NCA appears as a host on the TCP network, often containing a simple Web server that vends state information and may provide certain other management functions that get translated into simple ASCII exchanges with the microcontroller system.

An NCA is usually implemented in one of the more powerful microcontrollers that have been designed to provide an Ethernet interface and support enough RAM and ROM to contain a simplified network stack. The NCA is now available as an off-the-shelf module designed for easy integration no more difficult than a modem on a serial port.

The question of which is the tail and which is the dog comes to mind in many of these applications. From the TCP network’s point of view, the NCA is the host and the microcontroller is being managed. From the point of view of the lighting controller, the NCA looks like just one more switch, albeit a chatty one. This distinction is usually irrelevant; it just makes hash of pedantic layering diagrams. There’s something quite satisfying about that.

Conclusion

Rather than debate the religious propriety of NFEs, particularly the TOE variety, I have examined the architectural issues that have produced their recurring rise and fall. The TOE-style NFE is best viewed as a tactical tool with a limited expected lifetime of economic viability, not an enduring architectural approach. This is just another example of the recurring ebb and flow of functions between specialized peripherals and the system CPU(s), as the economics slosh back and forth interacting with system requirements. The limited lifetime of the NFE’s advantages makes it difficult to justify the significant development costs for any but the highest-value applications.

That said, the inexpensive NCA is likely to be an approach that does endure. It literally transforms network communication into an inexpensive, pluggable physical component. By doing so, it provides an avenue for dealing with the extreme cost pressure inherent in microcontroller applications while providing an incremental option of genuine network citizenship when the customer will pay for it.

Related articles on queue.acm.org

TCP Offload to the Rescue
Andy Currid
http://queue.acm.org/detail.cfm?id=1005069

Network Virtualization
Scott Rixner
http://queue.acm.org/detail.cfm?id=1348592

DAFS: A New High-Performance Networked File System
Steve Kleiman
http://queue.acm.org/detail.cfm?id=1388770

Mike O’Dell is a venture partner at New Enterprise Associates (NEA), Chevy Chase, MD, where he works to identify early-stage IT, communications, and energy opportunities. Prior to this position, O’Dell was chief scientist at UUNET Technologies, responsible for network and product architecture during the emergence of the commercial Internet. He has also held positions at Bellcore (now Telcordia), a GaAs Sparc supercomputer startup, and a U.S. government contractor. He was founding editor of Computing Systems, an international refereed scholarly journal.

© 2009 ACM 0001-0782/09/0600 $10.00

Whither Sockets?

High bandwidth, low latency, and multihoming challenge the sockets API.

BY GEORGE V. NEVILLE-NEIL

DOI:10.1145/1516046.1516061
Article development led by queue.acm.org

ONE OF THE most pervasive and longest-lasting interfaces in software is the sockets API. Developed by the Computer Systems Research Group at the University of California at Berkeley, the sockets API was first released as part of the 4.1c BSD operating system in 1982. While there are longer-lived APIs (for example, those dealing with Unix file I/O), it is quite impressive for an API to have remained in use and largely unchanged for 27 years. The only major update to the sockets API has been the extension of ancillary routines to accommodate the larger addresses used by IPv6 [2].

The Internet and the networking world in general have changed in very significant ways since the sockets API was first developed, but in many ways the API has had the effect of narrowing the way in which developers think about and write networked applications. This article briefly examines some of the conditions present when the sockets API was developed and considers how those conditions shaped the way in which networking code was written. Later, I look at ways in which developers have tried to get around some of the inherent limitations in the API and address the future of sockets in a changing networked world.

The two biggest differences between the networks of 1982 and 2009 are topology and speed. For the most part it is the increase in speed rather than the changes in topology that people notice. The maximum bandwidth of a commercially available long-haul network link in 1982 was 1.5Mbps. The Ethernet LAN, which was being deployed at the same time, had a speed of 10Mbps. A home user (and there were very few of these) was lucky to have a 300bps connection over a phone line to any computing facility. The round-trip time between two machines on a local area network was measured in tens of milliseconds, and between systems over the Internet in hundreds of milliseconds, depending of course on location and the number of hops a packet would be subjected to when being routed between machines. (See page 52 for a look at the early Internet.)

[Figure: map of the early Internet. Note: This map does not show ARPA’s experimental satellite connections. Names shown are IMP names, not necessarily host names. Source: personalpages.manchester.ac.uk/staff/m.dodge/cybergeography/atlas]

The topology of networks at the time was relatively simple. Most computers had a single connection to a local area network; the LAN was connected to a primitive router that might have a few connections to other LANs and a single connection to the Internet. From one application to another, the connection either crossed a LAN or transited one or more routers, called IMPs (interface message processors).

History of Sockets

The model of distributed programming that came to be most popularized by the sockets API was the client/server model, in which there is a server and a set of clients. The clients send messages to the server to ask it to do work on their behalf, wait for the server to do the work requested, and at some later point receive an answer. This model of computing is now so ubiquitous it is often the only model with which many software engineers are familiar. At the time it was designed, however, it was seen as a way of extending the Unix file I/O model over a computer network. One other factor that focused the sockets API down to the client/server model was that the most popular protocol it supported was TCP, which has an inherently 1:1 communication model.

The sockets API made the client/server model easy to implement because of the small number of extra system calls that programmers would need to add to their non-networked code so it could take advantage of other computing resources. Although other models are possible, with the sockets API the client/server model is the one that has come to dominate networked computing.

Although the sockets API has more entry points than those shown in Table 1, it is those five shown that are central to the API and that differentiate it from regular file I/O. In reality the socket() call could have been dropped and replaced with a variant of open(), but this was not done at the time. The socket() and open() calls actually return the same thing to a program: a process-unique file descriptor that is used in all subsequent operations with the API. It is the simplicity of the API that has led to its ubiquity, but that ubiquity has held back the development of alternative or enhanced APIs that could help programmers develop other types of distributed programs.

Table 1: Sockets API system calls.

socket()     Create a communication endpoint
bind()       Bind the endpoint to some set of network-layer parameters
listen()     Set a limit on the number of outstanding work requests
accept()     Accept one or more work requests from a client
connect()    Contact a server to submit a work request

Client/server computing had many advantages at the time it was developed. It allowed many users to share resources, such as large storage arrays and expensive printing facilities, while keeping these facilities within the control of the same departments that had once run mainframe computing facilities. With this sharing model, it was possible to increase the utilization of what, at the time, were expensive resources.

Three disparate areas of networking are not well served by the sockets API: low-latency or real-time applications; high-bandwidth applications;

and multihomed systems, that is, those with multiple network interfaces. Many people confuse increasing network bandwidth with higher performance, but increasing bandwidth does not necessarily reduce latency. The challenge for the sockets API is giving the application faster access to network data.

The way in which any program using the sockets API sends and receives data is via calls to the operating system. All of these calls have one thing in common: the calling program must repeatedly ask for data to be delivered. In a world of client/server computing these constant requests make perfect sense, because the server cannot do anything without a request from the client. It makes little sense for a print server to call a client unless the client has something it wishes to print. What, however, if the service provided is music or video distribution? In a media distribution service there may be one or more sources of data and many listeners. For as long as the user is listening to or viewing the media, the most likely case is that the application will want whatever data has arrived. Specifically requesting new data is a waste of time and resources for the application. The sockets API does not provide the programmer a way in which to say, “Whenever there is data for me, call me to process it directly.”

Sockets programs are instead written from the viewpoint of a dearth of, rather than a wealth of, data. Network programs are so used to waiting on data that they use a separate system call, select(), so that they can listen to multiple sources of data without blocking on a single request. The typical processing loop of a sockets-based program isn’t simply read(), process(), read(), but instead select(), read(), process(), select(). Although the addition of a single system call to a loop would not seem to add much of a burden, this is not the case. Each system call requires arguments to be marshaled and copied into the kernel, as well as causing the system to block the calling process and schedule another. If there were data available to the caller when it invoked select(), then all of the work that went into crossing the user/kernel boundary was wasted because a read() would have returned data immediately. The constant check/read/check is wasteful unless the time between successive requests is quite long.

Solving this problem requires inverting the communication model between an application and the operating system. Various APIs that allow the kernel to call directly into a program have been proposed, but none has gained wide acceptance, for a few reasons. The operating systems that existed at the time the sockets API was developed were, except in very esoteric circumstances, single threaded and executed on single-processor computers. If the kernel had been fitted with an up-call API, there would have been the problem of which context the call could have executed in. Having all other work on a system pause because the kernel was executing an up-call into an application would have been unacceptable, particularly in timesharing systems with tens to hundreds of users. The only place in which such software architecture did gain currency was in embedded systems and networked routers where there were no users and no virtual memory.

The issue of virtual memory compounds the problems of implementing a kernel up-call mechanism. The memory allocated to a user process is virtual memory, but the memory used by devices such as network interfaces is physical. Having the kernel map physical memory from a device into a user-space program breaks one of the fundamental protections provided by a virtual memory system.

Attempts to Overcome Performance Issues

A couple of different mechanisms have been proposed and sometimes implemented on various operating systems to overcome the performance issues present in the sockets API. One such mechanism is zero-copy sockets. Anyone who has worked on a network stack knows that copying data is what kills the performance of networking protocols. Therefore, to improve the speed of networked applications that are more interested in high bandwidth than in low latency, the operating system is modified to remove as many data copies as possible. Traditionally, an operating system performs two copies for each packet received by the system.


The first copy is performed by the network driver from the network device’s memory into the kernel’s memory, and the second is performed by the sockets layer in the kernel when the data is read by the user program. Each of these copy operations is expensive because it must occur for each message that the system receives. Similarly, when the program wants to send a message, data must be copied from the user’s program into the kernel for each message sent; then that data will be copied into the buffers used by the device to transmit it on the network.

Most operating-system designers and developers know that data copying is anathema to system performance and work to minimize such copies [...] have device drivers copy data directly into and out of kernel memory. On modern network devices this is a result of how they structure their memory. The driver and kernel share two rings of packet descriptors, one for transmit and one for receive, where each descriptor has a single pointer to memory. The network device driver initially fills these rings with memory from the kernel. When data is received, the device sets a flag in the correct receive descriptor and tells the kernel, usually via an interrupt, that there is data waiting for it. The kernel then removes the filled buffer from the receive descriptor ring and replaces it with a fresh buffer for the device to fill. The packet, in the form of the buffer, then moves through the network stack until it reaches the socket layer, where it is copied out of the kernel when the user’s program calls read(). Data sent by the program is handled in a similar way by the kernel, in that kernel buffers are eventually added to the transmit descriptor ring and a flag is then set to tell the device that it can place the data in the buffer on the network.

All of this work in the kernel leaves the last copy problem unsolved, and several attempts have been made to extend the sockets API to remove this copy operation [1, 3]. The problem remains as to how memory can be safely shared across the user/kernel boundary. The kernel cannot give its memory over to the user program, because at that point it loses control over the memory. [...] of usable memory, leading to system performance degradation. There are also security issues inherent in sharing memory buffers across the kernel/user boundary. There is no single answer to how a user program might achieve higher bandwidth using the sockets API.

For programmers who are more concerned with latency than with bandwidth, even less has been done. The only significant improvement for programs that are waiting for a network event has been the addition of a set of kernel events that a program can wait on. Kernel events, or kevents(), are an extension of the select() mechanism to encompass any possible event that the kernel might be able to tell the program about. Before the advent of kevents, a user program could call select() on any file descriptor, which would let the program know when any of a set of file descriptors was readable, writable, or had an error. When programs were written to sit in a loop and wait on a set of file descriptors (for example, reading from the network and writing to disk), the select() call was sufficient, but once a program wanted to check for other events, such as timers and signals, select() no longer served. The problem for low-latency apps is that kevents() do not deliver data; they deliver only a signal that data is ready, just as the select() call did. The next logical step would be to have an event-based API that also delivered data. There is no reason to have the application cross the user/kernel boundary twice simply to get the data the kernel knows the application wants.

Lack of Support for Multihoming

The sockets API not only presents performance problems to the application writer, but also narrows the type of communication that can take place. The client/server paradigm is inherently a 1:1 type of communication. Although a server may handle requests from a diverse group of clients, each client has only one connection to a single server for a request or set of requests. In a world in which each computer had only one network interface, that paradigm made perfect sense. A connection between a client and server is identified by a quad of <Source IP, Source Port, Destination IP, Destination Port>. Since services generally have a well-known destination port (for example, 80 for HTTP), the only value that can easily vary is the source port, since the IP addresses are fixed.

In the Internet of 1982 each machine that was not a router had only a single network interface, meaning that to identify a service, such as a remote printer, the client computer needed a single destination address and port and had, itself, only a single source address and port to work with. While it did exist, the idea that a computer might have multiple ways of reaching a service was too complicated and far too expensive to implement. Given these constraints, there was no reason for the sockets API to expose to the programmer the ability to write a multihomed program, one that could manage which interfaces or connections mattered to it. Such features, when they were implemented, were a part of the routing software within the operating system. The only way programs could get access to them was through an obscure set of nonstandard kernel APIs called a routing socket.

On a system with multiple network interfaces it is not possible, using the standard sockets API, to write an application that can easily be multihomed, that is, take advantage of both interfaces so if one fails, or if the primary route over which the packets were flowing breaks, the application would not lose its connection to the server.

The recently developed Stream Control Transmission Protocol (SCTP) [4] incorporates support for multihoming at the protocol level, but it is impossible to export this support through the sockets API. Several ad-hoc system calls were initially provided and are the only way to access this functionality. At the moment this is the only protocol that has both the capacity and user demand for this feature, so the API has not been standardized across more than a few operating systems. Table 2 shows the APIs that SCTP added.

Table 2: APIs added by SCTP.

API                                                  Explanation
sctp_bindx()                                         Bind or unbind an SCTP socket to a list of addresses
sctp_connectx()                                      Connect an SCTP socket with multiple destination addresses
sctp_generic_recvmsg()                               Receive data from a peer
sctp_generic_sendmsg(), sctp_generic_sendmsg_iov()   Send data to a peer
sctp_getaddrlen()                                    Return the address length of an address family
sctp_getassocid()                                    Return an association ID for a specified socket address
sctp_getpaddrs(), sctp_getladdrs()                   Return list of addresses to caller
sctp_peeloff()                                       Detach an association from a one-to-many socket to a separate file descriptor
sctp_sendx()                                         Send a message from an SCTP socket
sctp_sendmsgx()                                      Send a message from an SCTP socket

While the list of functions in Table 2 contains more APIs than are strictly necessary, it is important to note that many are derivatives of preexisting APIs, such as send(), which need to be extended to work in a multihoming world. The set of APIs needs to be harmonized to make multihoming a first-class citizen in the sockets world. The problem now is that sockets are so successful and ubiquitous that it is very hard to change the existing API set for fear of confusing its users or the preexisting programs that use it.

As systems come to have more network interfaces built in, providing the ability to write applications that take advantage of multihoming will be an absolute necessity. One can easily imagine the use of such technology in a smartphone, which already has three network interfaces: its primary connection via the cellular network, a WiFi interface, and often a Bluetooth interface as well. There is no reason for an application to lose connectivity if even one of these network interfaces is working properly. The problem for application designers is that they want their code to work, with few or no changes, across a plethora of devices, from cellphones, to laptops, to desktops, and so on. With properly defined APIs we would remove the artificial barrier that prevents this. It is only because of the history of the sockets API and the fact that it has been “good enough” to date that this need has not yet been addressed.

High bandwidth, low latency, and multihoming are driving the development of alternatives to the sockets API. With LANs now reaching 10Gbps, it is obvious that for many applications client/server style communication is far too inefficient to use the available bandwidth. The communication paradigms supported by the sockets API must be expanded to allow for memory sharing across the kernel boundary, as well as for lower-latency mechanisms to deliver data to applications. Multihoming must become a first-class feature of the sockets API because devices with multiple active interfaces are now becoming the norm for networked systems.

Related articles on queue.acm.org

Code Spelunking: Exploring Cavernous Code Bases
George Neville-Neil
http://queue.acm.org/detail.cfm?id=945136

API Design Matters
Michi Henning
http://queue.acm.org/detail.cfm?id=1255422

You Don’t Know Jack about Network Performance
Kevin Fall and Steve McCanne
http://queue.acm.org/detail.cfm?id=1066069

References
1. Balaji, P., Bhagvat, S., Jin, H.-W., and Panda, D.K. Asynchronous zero-copy communication for synchronous sockets in the sockets direct protocol (sdp) over infiniband journal. In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium.
2. Gilligan, R., Thomson, S., Bound, J., McCann, J., and Stevens, W. Basic Socket Interface Extensions for IPv6. RFC 3493 (Feb. 2003); http://www.rfc-editor.org/rfc/rfc3493.txt.
3. Romanow, A., Mogul, J., Talpey, T., and Bailey, S. Remote Direct Memory Access (RDMA) over IP Problem Statement. RFC 4297 (Dec. 2005); http://www.rfc-editor.org/rfc/rfc4297.txt.
4. Stewart, R., et al. Stream Control Transmission Protocol. RFC 2960 (Oct. 2000); http://www.ietf.org/rfc/rfc2960.txt.

George V. Neville-Neil ([email protected]) is the proprietor of Neville-Neil Consulting. He works on networking and operating systems code and teaches courses on program-related topics.

© 2009 ACM 0001-0782/09/0600 $10.00
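As a closing illustration of the connection quad from the multihoming discussion in the sockets article above, the following minimal Python sketch opens a TCP connection over loopback and reads back the four values that identify it. The helper names connection_quad and demo are hypothetical, and the loopback listener on an ephemeral port merely stands in for a service with a well-known port.

```python
import socket


def connection_quad(sock):
    """Return (source IP, source port, destination IP, destination port)."""
    src_ip, src_port = sock.getsockname()[:2]
    dst_ip, dst_port = sock.getpeername()[:2]
    return (src_ip, src_port, dst_ip, dst_port)


def demo():
    """Connect to a local listener and report the quad seen from each end."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    accepted, _ = server.accept()
    try:
        return connection_quad(client), connection_quad(accepted)
    finally:
        for s in (client, accepted, server):
            s.close()


if __name__ == "__main__":
    client_quad, server_quad = demo()
    print("client view:", client_quad)
    print("server view:", server_quad)
```

The two ends report the same four values with source and destination swapped. Because a TCP connection is tied to this one quad, traffic leaving through a second interface with a different source address would belong to a different connection, which is why a plain sockets program cannot simply fail over between interfaces.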

contributed articles

DOI:10.1145/1516046.1516062

The Claremont Report on Database Research

Database research is expanding, with major efforts in system architecture, new languages, cloud services, mobile and virtual worlds, and interplay between structure and text.

RAKESH AGRAWAL, ANASTASIA AILAMAKI, PHILIP A. BERNSTEIN, ERIC A. BREWER, MICHAEL J. CAREY, SURAJIT CHAUDHURI, ANHAI DOAN, DANIELA FLORESCU, MICHAEL J. FRANKLIN, HECTOR GARCIA-MOLINA, JOHANNES GEHRKE, LE GRUENWALD, LAURA M. HAAS, ALON Y. HALEVY, JOSEPH M. HELLERSTEIN, YANNIS E. IOANNIDIS, HANK F. KORTH, DONALD KOSSMANN, SAMUEL MADDEN, ROGER MAGOULAS, BENG CHIN OOI, TIM O’REILLY, RAGHU RAMAKRISHNAN, SUNITA SARAWAGI, MICHAEL STONEBRAKER, ALEXANDER S. SZALAY, GERHARD WEIKUM

A GROUP OF database researchers, architects, users, and pundits met in May 2008 at the Claremont Resort in Berkeley, CA, to discuss the state of database research and its effects on practice. This was the seventh meeting of this sort over the past 20 years and was distinguished by a broad consensus that the database community is at a turning point in its history, due to both an explosion of data and usage scenarios and major shifts in computing hardware and platforms. Here, we explore the conclusions of this self-assessment. It is by definition somewhat inward-focused but may be of interest to the broader computing community as both a window into upcoming directions in database research and

a description of some of the community issues and initiatives that surfaced. We describe the group’s consensus view of new focus areas for research, including database engine architectures, declarative programming languages, interplay of structured data and free text, cloud data services, and mobile and virtual worlds. We also report on discussions of the database community’s growth and processes that may be of interest to other research areas facing similar challenges.

Over the past 20 years, small groups of database researchers have periodically gathered to assess the state of the field and propose directions for future research [1, 3–7]. Reports of the meetings served to foster debate within the database research community, explain research directions to external organizations, and help focus community efforts on timely challenges.

The theme of the Claremont meeting was that database research and the data-management industry are at a turning point, with unusually rich opportunities for technical advances, intellectual achievement, entrepreneurship, and benefits for science and society. Given the large number of opportunities, it is important for the database research community to address issues that maximize relevance within the field, across computing, and in external fields as well.

The sense of change that emerged in the meeting was a function of several factors:

Excitement over “big data.” In recent years, the number of communities working with large volumes of data has grown considerably to include not only traditional enterprise applications and Web search but also e-science efforts (in astronomy, biology, earth science, and more), digital entertainment, natural-language processing, and social-network analysis. While the user base for traditional database management systems (DBMSs) is growing quickly, there is also a groundswell of effort to design new custom data-management solutions from simpler components. The ubiquity of big data is expanding the base of users and developers of data-management technologies and will undoubtedly shake up the database research field.

Data analysis as profit center. In traditional enterprise settings, the barriers between IT departments and business units are coming down, and there are many examples of companies where data is indeed the business itself. As a consequence, data capture, integration, and analysis are no longer viewed as a business cost but as the keys to efficiency and profit. The value of software to support data analytics has been growing as a result. In 2007, corporate acquisitions of business-intelligence vendors alone totaled $15 billion [2], and that is only the “front end” of the data-analytics tool chain. Market pressure for better analytics also brings new users to the technology with new demands. Statistically sophisticated analysts are being hired in a growing number of industries, with increasing interest in running their formulae on the raw data. At the same time, a growing number of nontechnical decision makers want to “get their hands on the numbers” as well in simple and intuitive ways.

Ubiquity of structured and unstructured data. There is an explosion of structured data on the Web and on enterprise intranets. This data is from a variety of sources beyond traditional databases, including large-scale efforts to extract structured information from text, software logs and sensors, and crawls of deep-Web sites. There is also an explosion of text-focused semistructured data in the public domain in the form of blogs, Web 2.0 communities, and instant messaging. New incentive structures and Web sites have emerged for publishing and curating structured data in a shared fashion as well. Text-centric approaches to managing the data are easy to use but ignore latent structure in the data that might add significant value. The race is on to develop techniques that extract useful data from mostly noisy text and structured corpora, enable deeper exploration into individual data sets, and connect data sets together to wring out as much value as possible.

Expanded developer demands. Programmer adoption of relational DBMSs and query languages has grown significantly in recent years, accelerated by the maturation of open source systems (such as MySQL and PostgreSQL) and the growing popularity of object-relational mapping packages (such as Ruby on Rails). However, the expanded user base brings new expectations for programmability and usability from a larger, broader, less-specialized community of programmers. Some of them are unhappy or unwilling to “drop into” SQL, viewing DBMSs


as unnecessarily complicated and daunting to learn and manage relative to other open source components. As the ecosystem for database management evolves beyond the typical DBMS user base, opportunities are emerging for new programming models and new system components for data management and manipulation.

Architectural shifts in computing. While the variety of user scenarios is increasing, the computing substrates for data management are shifting dramatically as well. At the macro scale, the rise of cloud computing services suggests fundamental changes in software architecture. It democratizes access to parallel clusters of computers; every programmer has the opportunity and motivation to design systems and services that scale out incrementally to arbitrary degrees of parallelism. At a micro scale, computer architectures have shifted the focus of Moore’s Law from increasing clock speed per chip to increasing the number of processor cores and threads per chip. In storage technologies, major changes are under way in the memory hierarchy due to the availability of more and larger on-chip caches, large inexpensive RAM, and flash memory. Power consumption has become an increasingly important aspect of the price/performance metric of large systems. These hardware trends alone motivate a wholesale reconsideration of data-management software architecture.

These factors together signal an urgent, widespread need for new data-management technologies. There is an opportunity for making a positive difference. Traditionally, the database community is known for the practical relevance of its research; relational databases are emblematic of technology transfer. But in recent years, the externally visible contribution of the database research community has not been as pronounced, and there is a mismatch between the notable expansion of the community’s portfolio and its contribution to other fields of research and practice. In today’s increasingly rich technical climate, the database community must recommit itself to impact and breadth. Impact is evaluated by external measures, so success involves helping new classes of users, powering new computing platforms, and making conceptual breakthroughs across computing. These should be the motivating goals for the next round of database research.

To achieve these goals, discussion at the 2008 Claremont Resort meeting revolved around two broad agendas we call reformation and synthesis. The reformation agenda involves deconstructing traditional data-centric ideas and systems and reforming them for new applications and architectural realities. One part of this entails focusing outside the traditional RDBMS stack and its existing interfaces, emphasizing new data-management systems for growth areas (such as e-science). Another part of the reformation agenda involves taking data-centric ideas like declarative programming and query optimization outside their original context in storage and retrieval to attack new areas of computing where a data-centric mindset promises to yield significant benefit. The synthesis agenda is intended to leverage research ideas in areas that have yet to develop identifiable, agreed-upon system architectures, including data integration, information extraction, and data privacy. Many of these subcommunities of database research seem ready to move out of the conceptual and algorithmic phase to work together on comprehensive artifacts (such as systems, languages, and services) that combine multiple techniques to solve complex user problems. Efforts toward synthesis can serve as rallying points for research, likely leading to new challenges and breakthroughs, and promise to increase the overall visibility of the work.

Research Opportunities

After two days of intense discussion at the 2008 Claremont meeting, it was surprisingly easy for the group to reach consensus on a set of research topics for investigation in coming years. Before exploring them, we stress a few points regarding what is not on the list.

First, while we tried to focus on new opportunities, we do not propose they be pursued at the expense of existing good work. Several areas we deemed critical were left off because they are already focus topics in the database community. Many were mentioned in previous reports [1, 3–7] and are the subject of significant efforts that require continued investigation and funding. Second, we kept the list short, favoring focus over coverage. Though most of us have other promising research topics we would have liked to discuss at greater length here, we focus on topics that

attracted the broadest interest within the group.

In addition to the listed topics, the main issues raised during the meeting included management of uncertain information, data privacy and security, e-science and other scholarly applications, human-centric interaction with data, social networks and Web 2.0, personalization and contextualization of query- and search-related tasks, streaming and networked data, self-tuning and adaptive systems, and the challenges raised by new hardware technologies and energy constraints. Most are captured in the following discussion, with many cutting across multiple topics.

Revisiting database engines. System R and Ingres pioneered the architecture and algorithms of relational databases; current commercial databases are still based on their designs. But many of the changes in applications and technology demand a reformation of the entire system stack for data management. Current big-market relational database systems have well-known limitations. While they provide a range of features, they have only narrow regimes in which they provide peak performance; online transaction processing (OLTP) systems are tuned for lots of small, concurrent transactional debit/credit workloads, while decision-support systems are tuned for a few read-mostly, large-join-and-aggregation workloads. Meanwhile, for many popular data-intensive tasks developed over the past decade, relational databases provide poor price/performance and have been rejected; critical scenarios include text indexing, serving Web pages, and media delivery. New workloads are emerging in the sciences, Web 2.0-style applications, and other environments where database-engine technology could prove useful but is not bundled in current database systems.

Even within traditional application domains, the database marketplace today suggests there is room for significant innovation. For example, in the analytics markets for business and science, customers can buy petabytes of storage and thousands of processors, but the dominant commercial database systems typically cannot scale that far for many workloads. Even when they can, the cost of software and management relative to hardware is exorbitant. In the OLTP market, business imperatives like regulatory compliance and rapid response to changing business conditions raise the need to address data life-cycle issues (such as data provenance, schema evolution, and versioning).

Given these requirements, the commercial database market is wide open to new ideas and systems, as reflected in the recent funding climate for entrepreneurs. It is difficult to recall when there were so many start-up companies developing database engines, and the challenging economy has not trimmed the field much. The market will undoubtedly consolidate over time, but things are changing fast, and it remains a good time to try radical ideas.

Some research projects have begun taking revolutionary steps in database system architecture. There are two distinct directions: broadening the useful range of applicability for multi-purpose database systems (for example, to incorporate streams, text search, XML, and information integration) and radically improving performance by designing special-purpose database systems for specific domains (for example, read-mostly analytics, streams, and XML). Both directions have merit, and the overlap in their stated targets suggests they may be more synergistic than not. Special-purpose techniques (such as new storage and compression formats) may be reusable in more general-purpose systems, and general-purpose architectural components (such as extensible query optimizer frameworks) may help speed prototyping of new special-purpose systems.

Important research topics in the core database engine area include:
• Designing systems for clusters of many-core processors that exhibit limited and nonuniform access to off-chip memory;
• Exploiting remote RAM and Flash as persistent media, rather than relying solely on magnetic disk;
• Treating query optimization and physical data layout as a unified, adaptive, self-tuning task to be carried out continuously;
• Compressing and encrypting data at the storage layer, integrated with data layout and query optimization;


• Designing systems that embrace nonrelational data models, rather than shoehorning them into tables;
• Trading off consistency and availability for better performance and scale-out to thousands of machines; and
• Designing power-aware DBMSs that limit energy costs without sacrificing scalability.

This list is not exhaustive. One industrial participant at the Claremont meeting noted that this is a time of opportunity for academic researchers; the landscape has shifted enough that access to industrial legacy code provides little advantage, and large-scale clustered hardware is rentable in the cloud at low cost. Moreover, industrial players and investors are aggressively looking for bold new ideas. This opportunity for academics to lead in system design is a major change in the research environment.

Declarative programming for emerging platforms. Programmer productivity is a key long-acknowledged challenge in computing, with its most notable mention in the database context in Jim Gray’s 1998 Turing lecture. Today, the urgency of the challenge is increasing exponentially as programmers target ever more complex environments, including many-core chips, distributed services, and cloud computing platforms.

Nonexpert programmers must be able to write robust code that scales out across processors in both loosely and tightly coupled architectures. Although developing new programming paradigms is not a database problem per se, ideas of data independence, declarative programming, and cost-based optimization provide a promising angle of attack. There is significant evidence that data-centric approaches will have significant influence on programming in the near term.

The recent popularity of the MapReduce programming framework for manipulating big data sets is an example of this potential. MapReduce is attractively simple, building on language and data-parallelism techniques that have been known for decades. For database researchers, the significance of MapReduce is in demonstrating the benefits of data-parallel programming to new classes of developers. This opens opportunities for the database community to extend its contribution to the broader community, developing more powerful and efficient languages and runtime mechanisms that help these developers address more complex problems.

As another example of declarative programming, in the past five years a variety of new declarative languages, often grounded in Datalog, have been developed for domain-specific systems in fields as diverse as networking and distributed systems, computer games, machine learning and robotics, compilers, security protocols, and information extraction. In many of these scenarios, the use of a declarative language has reduced code size by orders of magnitude while also enabling distributed or parallel execution. Surprisingly, the groups behind these efforts have coordinated very little with one another; the move to revive declarative languages in these new contexts has grown up organically.

A third example arises in enterprise-application programming. Recent language extensions (such as Ruby on Rails and LINQ) encourage query-like logic in programmer design patterns. But these packages have yet to address the challenge of enterprise-style programming across multiple machines; the closest effort here is DryadLINQ, focusing on parallel analytics rather than on distributed application development. For enterprise applications, a key distributed design decision is the partitioning of logic and data across multiple “tiers,” including Web clients, Web servers, application servers, and a backend DBMS. Data independence is particularly valuable here, allowing programs to be specified without making a priori permanent decisions about physical deployment across tiers. Automatic optimization processes could make these decisions and move data and code as needed to achieve efficiency and correctness. XQuery has been proposed as an existing language that would facilitate this kind of declarative programming, in part because XML is often used in cross-tier protocols.

It is unusual to see this much energy surrounding new data-centric programming techniques, but the opportunity brings challenges as


well. The research challenges include language design, efficient compilers and runtimes, and techniques to optimize code automatically across both the horizontal distribution of parallel processors and the vertical distribution of tiers. It seems natural that the techniques behind parallel and distributed databases (partitioned dataflow and cost-based query optimization) should extend to new environments. However, to succeed, these languages must be fairly expressive, going beyond simple MapReduce and select-project-join-aggregate dataflows. This agenda will require “synthesis” work to harvest useful techniques from the literature on database and logic programming languages and optimization, as well as to realize and extend them in new programming environments.

To genuinely improve programmer productivity, these new approaches also need to pay attention to the softer issues that capture the hearts and minds of programmers (such as attractive syntax, typing and modularity, development tools, and smooth interaction with the rest of the computing ecosystem, including networks, files, user interfaces, Web services, and other languages). This work also needs to consider the perspective of programmers who want to use their favorite programming languages and data services as primitives in those languages. Example code and practical tutorials are also critical.

To execute successfully, database research must look beyond its traditional boundaries and find allies throughout computing. This is a unique opportunity for a fundamental “reformation” of the notion of data management, not as a single system but as a set of services that can be embedded as needed in many computing contexts.

Interplay of structured and unstructured data. A growing number of data-management scenarios involve both structured and unstructured data. Within enterprises, we see large heterogeneous collections of structured data linked with unstructured data (such as document and email repositories). On the Web, we also see a growing amount of structured data primarily from three sources: millions of databases hidden behind forms (the deep Web); hundreds of millions of high-quality data items in HTML tables on Web pages and a growing number of mashups providing dynamic views on structured data; and data contributed by Web 2.0 services (such as photo and video sites, collaborative annotation services, and online structured-data repositories).

A significant long-term goal for the database community is to transition from managing traditional databases consisting of well-defined schemata for structured business data to the much more challenging task of managing a rich collection of structured, semistructured, and unstructured data spread over many repositories in the enterprise and on the Web, sometimes referred to as the challenge of managing dataspaces.

In principle, this challenge is closely related to the general problem of data integration, a longstanding area for database research. The recent advances in this area and the new issues due to Web 2.0 resulted in significant discussion at the Claremont meeting. On the Web, the database community has contributed primarily in two ways: First, it developed technology that enables the generation of domain-specific (“vertical”) search engines with relatively little effort; and second, it developed domain-independent technology for crawling through forms (that is, automatically submitting well-formed queries to forms) and surfacing the resulting HTML pages in a search-engine index. Within the enterprise, the database research community recently contributed to enterprise search and the discovery of relationships between structured and unstructured data.

The first challenge database researchers face is how to extract structure and meaning from unstructured and semistructured data. Information-extraction technology can now pull structured entities and relationships out of unstructured text, even in unsupervised Web-scale contexts. We expect in coming years that hundreds of extractors will be applied to a given data source. Hence developers and analysts need techniques for applying and managing predictions from large numbers of independently developed extractors. They also need algorithms that can introspect about the correctness of extractions and therefore combine multiple pieces of extraction evidence in a principled fashion. The database community is not alone in these efforts; to contribute in this area, database researchers should continue


to strengthen ties with researchers in information retrieval and machine learning.

Context is a significant aspect of the semantics of the data, taking multiple forms (such as the text and hyperlinks that surround a table on a Web page, the name of the directory in which data is stored, accompanying annotations or discussions, and relationships to physically or temporally proximate data items). Context helps analysts interpret the meaning of data in such applications because the data is often less precise than in traditional database applications, as it is extracted from unstructured text, extremely heterogeneous, or sensitive to the conditions under which it was captured. Better database technology is needed to manage data in context. In particular, there is a need for techniques to discover data sources, enhance the data by discovering implicit relationships, determine the weight of an object’s context when assigning it semantics, and maintain the provenance of data through these steps of storage and computation.

The second challenge is to develop methods for querying and deriving insight from the resulting sea of heterogeneous data. A specific problem is to develop methods to answer keyword queries over large collections of heterogeneous data sources. We must be able to break down the query to extract its intended semantics and route the query to the relevant source(s) in the collection. Keyword queries are just one entry point into data exploration, and there is a need for techniques that lead users into the most appropriate querying mechanism. Unlike previous work on information integration, the challenges here are that we cannot assume we have semantic mappings for the data sources and we cannot assume that the domain of the query or the data sources is known. We need to develop algorithms for providing best-effort services on loosely integrated data. The system should provide meaningful answers to queries with no need for manual integration and improve over time in a pay-as-you-go fashion as semantic relationships are discovered and refined. Developing index structures to support querying hybrid data is also a significant challenge. More generally, we need to develop new notions of correctness and consistency in order to provide metrics and enable users or system designers to make cost/quality trade-offs. We also need to develop the appropriate systems concepts around which these functionalities are tied.

In addition to managing existing data collections, there is an opportunity to innovate in the creation of data collections. The emergence of Web 2.0 creates the potential for new kinds of data-management scenarios in which users join ad hoc communities to create, collaborate, curate, and discuss data online. As an example, consider creating a database of access to clean water in different places around the world. Since such communities rarely agree on schemata ahead of time, the schemata must be inferred from the data; however, the resulting schemata are still used to guide users to consensus. Systems in this context must incorporate visualizations that drive exploration and analysis. Most important, these systems must be extremely easy to use and so will probably require compromising on some typical database functionality and providing more semiautomatic “hints” mined from the data. There is an important opportunity for a feedback loop here; as more data is created with such tools, information extraction and querying could become easier. Commercial and academic prototypes are beginning to appear, but there is plenty of room for additional innovation and contributions.

Cloud data services. Economic and technological factors have motivated a resurgence of shared computing infrastructure, providing software and computing facilities as a service, an approach known as cloud services or cloud computing. Cloud services provide efficiencies for application providers by limiting up-front capital expenses and by reducing the cost of ownership over time. Such services are typically hosted in a data center using shared commodity hardware for computation and storage. A varied set of cloud services is available today, including application services (salesforce.com), storage services (Amazon S3), compute services (Amazon EC2, Google App Engine, and Microsoft Azure), and data services (Amazon SimpleDB, Microsoft SQL Data Services, and Google’s Datastore). They represent a major reformation of data-management architectures, with more on the horizon. We anticipate many future data-centric applications leveraging data services in the cloud.

A cross-cutting theme in cloud services is the trade-off providers face between functionality and operational costs. Today’s early cloud data services offer an API that is much more restricted than that of traditional database systems, with a minimalist query language, limited consistency guarantees, and in some cases explicit constraints on resource utilization. This limited functionality pushes more programming burden on developers but allows cloud providers to build more predictable services and offer service-level agreements that would be difficult to provide for a full-function SQL data service. More work and experience are needed on several fronts to fully understand the continuum between today’s early cloud data services and more full-function but possibly less-predictable alternatives.

Manageability is particularly important in cloud environments. Relative to traditional systems, it is complicated by three factors: limited human intervention, high-variance workloads, and a variety of shared infrastructures. In the majority of cloud-computing settings, there will be no database administrators or system administrators to assist developers with their cloud-based applications; the platform must do much of that work automatically. Mixed workloads have always been difficult to tune but may be unavoidable in this context.

Even a single customer’s workload can vary widely over time; the elastic provisioning of cloud services makes it economical for a user to occasionally harness orders-of-magnitude more resources than usual for short bursts of work. Meanwhile, service tuning depends heavily on the way the shared infrastructure is “virtualized.” For example, Amazon EC2 uses hardware-level virtual machines as its programming interface. On the opposite end of the spectrum, salesforce.com implements “multi-tenant” hosting of many independent schemas in a single managed DBMS. Many other virtualization solutions are possible, each with different views into the workloads above and platforms below and different abilities to control each. These variations require revisiting traditional roles and responsibilities for resource management across layers.

The need for manageability adds urgency to the development of self-managing database technologies that have been explored over the past decade. Adaptive, online techniques will be required to make these systems viable, while new architectures and APIs, including the flexibility to depart from traditional SQL and transactional semantics when prudent, reduce requirements for backward compatibility and increase the motivation for aggressive redesign.

The sheer scale of cloud computing involves its own challenges. Today’s SQL databases were designed in an era of relatively reliable hardware and intensive human administration; as a result, they do not scale effectively to thousands of nodes being deployed in a massively shared infrastructure. On the storage front, it is unclear whether these limitations should be addressed with different transactional implementation techniques, different storage semantics, or both simultaneously. The database literature is rich in proposals on these issues. Cloud services have begun to explore simple pragmatic approaches, but more work is needed to synthesize ideas from the literature in modern cloud computing regimes. In terms of query processing and optimization, it will not be feasible to exhaustively search a domain that considers thousands of processing sites, so some limitations on either the domain or the search will be required. Finally, it is unclear how programmers will express their programs in the cloud, as discussed earlier.

The sharing of physical resources in a cloud infrastructure puts a premium on data security and privacy that cannot be guaranteed by physical boundaries of machines or networks. Hence cloud services are fertile ground for efforts to synthesize and accelerate the work the database community has done in these areas. The key to success is to specifically target usage scenarios in the cloud, seated in practical economic incentives for service providers and customers.

As cloud data services become popular, new scenarios will emerge with their own challenges. For example, we anticipate specialized services that are pre-loaded with large data sets (such as


stock prices, weather history, and Web crawls). The ability to “mash up” interesting data from private and public domains will be increasingly attractive and provide further motivation for the challenges discussed earlier concerning the interplay of structured and unstructured data. The desire to mash up data also points to the inevitability of services reaching out across clouds, an issue already prevalent in scientific data “grids” that typically have large shared data servers at multiple sites, even within a single discipline. It also echoes, in the large, the standard proliferation of data sources in most enterprises. Federated cloud architectures will only add to these challenges.

Mobile applications and virtual worlds. This new class of applications, exemplified by mobile services and virtual worlds, is characterized by the need to manage massive amounts of diverse user-created data, synthesize it intelligently, and provide real-time services. The database community is beginning to understand the challenges faced by these applications, but much more work is needed. Accordingly, the discussion of these topics at the meeting was more speculative than the discussion of the earlier topics but still deserves attention.

Two important trends are changing the nature of the field. First, the platforms on which mobile applications are built (hardware, software, and network) have attracted large user bases and ubiquitously support powerful interactions “on the go.” Second, mobile search and social networks suggest an exciting new set of mobile applications that can deliver timely information (and advertisements) to mobile users depending on location, personal preferences, social circles, and extraneous factors (such as weather), as well as the context in which they operate. Providing these services requires synthesizing user input and behavior from multiple sources to determine user location and intent.

The popularity of virtual worlds like Second Life has grown quickly and in many ways mirrors the themes of mobile applications. While they began as interactive simulations for multiple users, they increasingly blur the distinctions with the real world and suggest the potential for a more data-rich mix. The term “co-space” is sometimes used to refer to a coexisting space for both virtual and physical worlds. In it, locations and events in the physical world are captured by a large number of sensors and mobile devices and materialized within a virtual world. Correspondingly, certain actions or events within the virtual world affect the physical world (such as shopping, product promotion, and experiential computer gaming). Applications of co-space include rich social networking, massive multi-player games, military training, edutainment, and knowledge sharing.

In both areas, large amounts of data flow from users and get synthesized and used to affect the virtual and/or real world. These applications raise new challenges, including how to process heterogeneous data streams in order to materialize real-world events, how to balance privacy against the collective benefit of sharing personal real-time information, and how to apply more intelligent processing to send interesting events in the co-space to someone in the physical world.

The programming of virtual actors in games and virtual worlds requires large-scale parallel programming; declarative methods have been proposed as a solution in this environment, as discussed earlier. These applications also require development of efficient systems, as suggested earlier in the context of database engines, including appropriate storage and retrieval methods, data-processing engines, parallel and distributed architectures, and power-sensitive software techniques for managing the events and communications across large numbers of concurrent users.

Moving Forward

The 2008 Claremont meeting also involved discussions on the database research community’s processes, including organization of publication procedures, research agendas, attraction and mentorship of new talent, and efforts to ensure a benefit from the research on practice and toward furthering our understanding of the field. Some of the trends seen in database research are echoed in other areas of computer science. Whether or not they are, the discussion may be of broader interest in the field.


Prior to the meeting, a team led by one of the participants performed a bit of ad hoc data analysis over database conference bibliographies from the DBLP repository (.uni-trier.de). While the effort was not scientific, the results indicated that the database research community has doubled in size over the past decade, as suggested by several metrics: number of published papers, number of distinct authors, number of distinct institutions to which these authors belong, and number of session topics at conferences, loosely defined. This served as a backdrop to the discussion that followed. An open question is whether this phenomenon is emerging at larger scales, in computer science and in science in general. If so, it may be useful to discuss the management of growth at those larger scales.

The growth of the database community puts pressure on the content and processes of database research publications. In terms of content, the increasingly technical scope of the community makes it difficult for individual researchers to keep track of the field. As a result, survey articles and tutorials are increasingly important to the community. These efforts should be encouraged informally within the community, as well as via professional incentive structures (such as academic tenure and promotion in industrial labs). In terms of processes, the reviewing load for papers is increasingly burdensome, and there was a perception at the Claremont meeting that the quality of reviews had been decreasing. It was suggested at the meeting that the lack of face-to-face program-committee meetings in recent years has exacerbated the problem of poor reviews and removed opportunities for risky or speculative papers to be championed effectively over well-executed but more pedestrian work.

There was some discussion at the meeting about recent efforts (notably by ACM-SIGMOD and VLDB) to enhance the professionalism of papers and the reviewing process via such mechanisms as double-blind reviewing and techniques to encourage experimental repeatability. Many participants were skeptical that the efforts to date have contributed to long-term research quality, as measured in intellectual and practical relevance. At the same time, it was acknowledged that the database community’s growth increases the need for clear and clearly enforced processes for scientific publication. The challenge going forward is to find policies that simultaneously reward big ideas and risk-taking while providing clear and fair rules for achieving these rewards. The publication venues would do well to focus as much energy on processes to encourage relevance and innovation as they do on processes to encourage rigor and discipline.

In addition to tuning the mainstream publication venues, there is an opportunity to take advantage of other channels of communication. For example, the database research community has had little presence in the relatively active market for technical books. Given the growing population of developers working with big data sets, there is a need for accessible books on scalable data-management algorithms and techniques that programmers can use to build software. The current crop of college textbooks is not targeted at this market. There is also an opportunity to present database research contributions as big ideas in their own right, targeted at intellectually curious readers outside the specialty. In addition to books, electronic media (such as blogs and wikis) can complement technical papers by opening up different stages of the research life cycle to discussion, including status reports on ongoing projects, concise presentation of big ideas, vision statements, and speculation. Online fora can also spur debate and discussion if appropriately provocative. Electronic media underscore the modern reality that it is easy to be widely published but much more difficult to be widely read. This point should be reflected in the mainstream publication context, as well as by authors and reviewers. In the end, the consumers of an idea define its value.

Given the growth in the database research community, the time is ripe for ambitious projects to stimulate collaboration and cross-fertilization of ideas. One proposal is to foster more data-driven research by building a globally shared collection of structured data, accepting contributions from all parties. Unlike previous efforts in this vein, the collection should not be designed for any particular benchmark; in fact, it is likely that most of the interesting problems suggested by this data are as yet unidentified.

There was also discussion at the meeting of the role of open source software development in the database community. Despite a tradition of open source software, academic database researchers have only rarely reused or shared software. Given the current climate, it might be useful to move more aggressively toward sharing software and collaborating on software projects across institutions. Information integration was mentioned as an area in which such an effort is emerging.

Finally, interest was expressed in technical competitions akin to the Netflix Prize (www.netflixprize.com) and KDD Cup (www.sigkdd.org/kddcup/index.php) competitions. To kick off this effort in the database domain, meeting participants identified two promising areas for competitions: system components for cloud computing (likely measured in terms of efficiency) and large-scale information extraction (likely measured in terms of accuracy and efficiency). While it was noted that each of these proposals requires a great deal of time and care to realize, several participants volunteered to initiate efforts. That work has begun with the 2009 SIGMOD Programming Contest (db.csail.mit.edu/sigmod09contest).

References
1. Abiteboul, S. et al. The Lowell database research self assessment. Commun. ACM 48, 5 (May 2005), 111–118.
2. Austin, I. I.B.M. acquires Cognos, maker of business software, for $4.9 billion. New York Times (Nov. 11, 2007).
3. Bernstein, P.A. et al. The Asilomar report on database research. SIGMOD Record 27, 4 (Dec. 1998), 74–80.
4. Bernstein, P.A. et al. Future directions in DBMS research: The Laguna Beach participants. SIGMOD Record 18, 1 (Mar. 1989), 17–26.
5. Silberschatz, A. and Zdonik, S. Strategic directions in database systems: Breaking out of the box. ACM Computing Surveys 28, 4 (Dec. 1996), 764–778.
6. Silberschatz, A., Stonebraker, M., and Ullman, J.D. Database research: Achievements and opportunities into the 21st century. SIGMOD Record 25, 1 (Mar. 1996), 52–63.
7. Silberschatz, A., Stonebraker, M., and Ullman, J.D. Database systems: Achievements and opportunities. Commun. ACM 34, 10 (Oct. 1991), 110–120.

Correspondence regarding this article should be addressed to Joseph M. Hellerstein (hellerstein@cs.berkeley.edu).

© 2009 ACM 0001-0782/09/0600 $10.00


DOI:10.1145/1516046.1516063

One Laptop Per Child: Vision vs. Reality

The vision is being overwhelmed by the reality of business, politics, logistics, and competing interests worldwide.

BY KENNETH L. KRAEMER, JASON DEDRICK, AND PRAKUL SHARMA

AT THE WORLD Economic Forum in Davos, Switzerland, January 2005, Nicholas Negroponte unveiled the idea of One Laptop Per Child (OLPC), a $100 PC that would transform education for the world’s disadvantaged schoolchildren by giving them the means to teach themselves and each other. He estimated that up to 150 million of these laptops could be shipped annually by the end of 2007. With $20 million in startup investment, sponsorships and partnerships with major IT industry players, and interest from developing countries, the nonprofit OLPC project generated excitement among international leaders and the world media. Yet as of June 2009 only a few hundred thousand laptops have been distributed (they were first available in 2007), and OLPC has been forced to dramatically scale back its ambitions.

Although some developing countries are indeed deploying OLPC laptops, others have cancelled planned deployments or are waiting on the results of pilot projects before deciding whether to acquire them in numbers. Meanwhile, the OLPC organization (www.olpc.com/) struggles with key staff defections, budget cuts, and ideological disillusionment, as it appears to some that the educational mission has given way to just getting laptops out the door. In addition, low-cost commercial netbooks from Acer, Asus, Hewlett-Packard, and other PC vendors have been launched with great early success.

So rather than distributing millions of laptops to poor children itself, OLPC has motivated the PC industry to develop lower-cost, education-oriented PCs, providing developing countries with low-cost computing options directly in competition with OLPC’s own innovation. In that sense, OLPC’s apparent failure may be a step toward broader success in providing a new tool for children in developing countries. However, it is also clear that the PC industry cannot profitably reach millions of the poorest children, so the OLPC objectives might never be achieved through the commercial market alone.

Here, we review and analyze the OLPC experience, focusing on the two most important issues: the successes and failures of OLPC in understanding and adapting to the developing-country environment and the unexpectedly aggressive reaction by the PC industry, including superpowers Intel and Microsoft, to defeat or co-opt the OLPC effort.

OLPC created a novel technology, the XO laptop, developed with close attention to the needs of students in poor rural areas. Yet it failed to anticipate the social and institutional problems that could arise in trying to diffuse that innovation in the developing-country context. In addition, OLPC has been stymied by underestimating the aggressive reaction of the PC industry to the perceived threat of a $100 laptop


being widely distributed in places the industry sees as emerging markets for its own products.

The case of OLPC can be seen as a study in the general diffusion of innovation in developing countries. Our analysis draws on diffusion-of-innovation theory, exemplified by Rogers,18 and illustrates the difficulty in getting widespread adoption of even proven innovation due to misunderstanding the social and cultural environment in which the innovation is to be introduced. We also bring to bear specific insights from the literature on adoption of IT in developing countries,2,25 using them to analyze the OLPC experience and draw implications for developers and policymakers.

Worldwide distribution of XO laptops.

Country | OLPC Web site (a) | Actual Deployments | Date of Actual Deployment Information/Detail
Uruguay | 202,000 | 150,000 | November 2008 (b)
Peru | 145,000 | 40,000 | 100,000 in distribution (c)
Mexico | 50,000 | 50,000 | Starting to be shipped (d)
Haiti | 13,000 | Dozens | Pilot began in summer 2008 (e)
Afghanistan | 11,000 | 450 | Expected to rise to 2010 (f)
Mongolia | 10,100 | 3,000 | G1G1 laptops beneficiary (g)
Rwanda | 16,000 | 10,000 | Arrived, not deployed; infrastructure issues (h)
Nepal | 6,000 | 6,000 | Delivered April 2007 (i)
Ethiopia | 5,000 | 5,000 | Three schools (j)
Paraguay | 4,000 | 150 | 4,000 planned next quarter (k)
Cambodia | 3,200 | 1,040 | January 29, 2009 (l)
Guatemala | 3,000 | — | Planned before third quarter 2009 (m)
Colombia | 2,600 | 1,580 | January 25, 2009 (n); agreement to buy 65,000 XOs (o)
Brazil | 2,600 | 630 | February 6, 2009 (p)
India | 505 | 31 | January 20, 2009 (q)

(a) OLPC numbers include “XO’s delivered, shipped, or ordered” but do not distinguish between these categories; wiki.laptop.org/go/Deployments
(b) Tabare, V. Uruguay: When education meets technology. Miami Herald (Nov. 22, 2008), A21.
(c) Peru on the up and up, lessons to be learned. Business News Americas (Dec. 18, 2008).
(d) www.bnamericas.com/story.xsql?id_sector=1&id_noticia=431002&Tx_idioma=I&source=
(e) www.olpceu.org/content/xo_stories/haiti/Haiti.html
(f) www.olpcnews.com/countries/afghanistan/olpc_afghanistan_first_school_day.html
(g) www.olpceu.org/content/xo_stories/mongolia/Mongolia.html
(h) www.olpceu.org/content/xo_stories/rwanda/Rwanda.html
(i) www.olpceu.org/content/xo_stories/nepal/Nepal.html
(j) http://www.olpceu.org/content/xo_stories/ethiopia/Ethiopia.html
(k) Bucaramanga computers, OLPC, Gemalto. Business News Americas (Feb. 9, 2009).
(l) wiki.laptop.org/go/OLPC_Cambodia
(m) wiki.laptop.org/go/OLPC_Guatemala
(n) wiki.laptop.org/go/OLPC_Colombia
(o) Pilar Saenz, OLPC Volunteer in Colombia (email)
(p) download.laptop.org/content/conf/20080520-country-wkshp/Presentations/OLPC%20Country%20Meeting%20-%20Day%204%20-%20May%2023rd,%202008/Brazil%20-%20Jose%20Aquino%20-%20Govt%20of%20Brazil.ppt#266,8,Slide 8
(q) www.olpceu.org/content/xo_stories/india/India.html

The original OLPC vision was to change education through the development and distribution of low-cost laptops embodying a new learning model to every child in the developing countries. Despite shifting over time, it can be characterized by the following text from the OLPC charter: “OLPC is not, at heart, a technology program, nor is the XO a product in any conventional sense of the word. OLPC is a nonprofit organization providing a means to an end, an end that sees children in even the most remote regions of the globe being given the opportunity to tap into their own potential, to be exposed to a whole world of ideas, and to contribute to a more productive and saner world community” (www.olpcnews.com/people/negroponte/new_olpc_mission_statement.html).

Conceived and led by Nicholas Negroponte, a former director of MIT’s Media Lab, OLPC aimed to achieve its vision through extraordinary innovation in hardware and software that fosters self-learning and fits with the often-harsh environment in developing countries. The hardware was to be a $100 laptop that would make affordable the large-scale deployment of computer networks in their schools.

The XO laptop developed by OLPC reflects hardware innovation in the power supply, display, networking, keyboard, and touchpad to provide a durable and interactive laptop (see the figure here). The shell of the machine is resistant to dirt and moisture, with all key parts designed to fit behind the display. It contains a pivoting, reversible,

dual-mode (monochrome for outside, color for indoors) display, movable rubber WiFi antennas with wireless mesh networking, and a sealed rubber-membrane keyboard that can be customized for different languages. For low power consumption and ruggedness, the XO design intentionally omits all motor-driven moving parts. It was developed jointly by the MIT Media Lab, OLPC, and Quanta, a Taiwan-based original design manufacturer, and is manufactured by Quanta in Songjiang, China.

The software for the XO consists of a pared-down version of the Fedora Linux operating system and a specially designed graphical user interface called Sugar. It was developed by the project to explore naturalistic concepts related to learning, openness, and collaboration (see footnote a).

Pilot Implementation

High-level officials, including even prime ministers and education ministers, in some developing countries are enthusiastic about OLPC, committed to purchases and/or trial-distribution projects. OLPC pilots in a half-dozen countries report positive changes (such as increased enrollment in schools, decreased absenteeism, increased discipline, and more participation in classrooms), but it is not clear if these changes are directly related to OLPC, as many evaluations are neither independent nor systematic. Independent evaluations in Ethiopia and Uruguay cite a positive effect on the availability of learning material via the laptop but also problems with buggy input devices, connectivity, software functionality, and teacher training.8,12,13

As of June 2009 the largest ongoing pilot project is in Peru, which planned to distribute 140,000 XOs in 2008, even into rural areas high in the Andes where electricity is often limited and Internet connections are not available. There is enthusiasm among students and teachers in the villages and support from the national education ministry and regional governors who have requested 500,000 more laptops.9 However, reports from the classroom suggest that teacher training is limited, and willingness to adopt a new approach to teaching is questionable. Children are excited but somewhat confused about the use of the machines, and educational software is lacking or difficult to use. Also, if a machine fails, it is up to the family to replace it or the child must do without.20

Targeted Cost

Despite its considerable innovation, or perhaps because of it, the OLPC project has been unable to achieve its $100 targeted cost. The current cost of each unit is listed on the OLPC Web site as $199 (www.laptop.org/en/participate/ways-to-give.shtml). However, this does not include upfront deployment costs, which are said to add an additional 5%–10% to the cost of each machine (wiki.laptop.org/go/Larger_OLPC), and subsequent IT-management costs. Nor does it include the cost of teacher training, additional software, and ongoing maintenance and support. OLPC initially required governments to purchase a million units, then reduced the number to 250,000 in April 2007. Such large purchases are difficult to justify for governments in developing countries, and the requirement was ultimately eliminated.

Some countries eventually lost interest due to the higher costs of the XO. For example, Nigeria failed to honor a pledge by its former president to purchase a million units, partly because they no longer cost $100 apiece.21 Meanwhile, other countries, including Libya, have opted for the Intel Classmate, which is priced at approximately $250 for the PC alone. Officials in Libya, which had planned to buy up to 1.2 million XO laptops, became concerned that the machines lacked Windows, and that service, teacher training, and future upgrades would not be provided directly by OLPC. Subsidies from Intel, including donated laptops and teacher training, also helped persuade the Libyan government to choose the Classmate.21

(a) Chief among them are collaboration and expression (such as Web browsing, email, online chat, word processing, drawing, music sequencing, and programming); groups and neighborhoods to signify other users in physical and logical proximity; a view-source-code key to encourage users to tinker with the code; replacing files and folders with “journals” that store activities performed by users; and tagging, clipping, sharing, and searching as systemwide features.22

Production, Sales, Distribution

OLPC originally estimated that it would


ship 100–150 million XO laptops by the end of 2007, but the program has clearly fallen far short. Under more modest goals, production was supposed to reach five million laptops by the end of 2008. By contrast, industry analysts report that Quanta’s manufacturing effort began only in December 2007 and reached a total of 370,500 units by third quarter 2008.16

Early commitments for a million XOs each from Brazil, Libya, and Nigeria evaporated, but relatively large purchases were made by Uruguay (200,000), Peru (145,000), and Mexico (50,000). In November 2007, OLPC launched a philanthropy program called Give One Get One (G1G1, www.olpcnews.com/countries/usa/olpc_xo_laptop_sale.html) where people in the U.S. could buy two machines for $399, with one being sent to a child in a developing country. The first program was successful, with about 167,000 units sold, but a second G1G1 program in November 2008 resulted in only 12,500 units sold.

Lagging production and sales mean that distribution has also lagged. The table here lists distribution as reported by OLPC, but many units have yet to be deployed to their intended recipients.

What has the project accomplished? Why is it so short of its original goals? To answer, we look in more detail at where OLPC succeeded and failed in understanding the developing-country environment and how it was being confronted by the PC industry.

Analysis

OLPC dedicated a great deal of effort to designing a laptop that would function well in a developing-country environment. OLPC’s technologist culture encouraged innovation, showing a good understanding of what was needed in developing countries. For example, the XO is sealed to keep out dirt, has a display that can be read in bright sunlight, runs on low power, and is rugged.

At the same time, the decision to use the Linux/Sugar operating system and interface was driven by a combination of pragmatic considerations and open source ideology. From a pragmatic point of view, Linux doesn’t require the computing power of Windows and has a price tag (zero) compatible with the goal of minimizing cost.

Diffusion of IT innovation does not depend only on the nature of the innovation itself. Often, more important is the social and cultural environment in which it will operate.3,26 Information technologies are not standalone innovations but system innovations, the value of which depends largely on an ecosystem that includes hardware, applications, peripherals, network infrastructure, and services (such as installation, training, repair, and technical support). Deployment involves training teachers, creating software and digital content, delivering maintenance and support, and sustaining a long-term commitment. Such capabilities are in short supply in developing countries,7,26 and OLPC simply never had the resources to provide them.

The OLPC plan was to rely on governments to buy its machines, provide distribution and support, train teachers to use and maintain them, and even sponsor development of local-language software. OLPC established its own distribution network or worked with local voluntary organizations in some countries to help with implementation. For global distribution, OLPC reached (in 2007) a comprehensive agreement with cellphone distributor Brightstar of Miami, FL, to help manage the complexities of entering diverse markets.23 However, none of these institutions had the ability to scale up to deployment of millions of machines. This situation is common in developing countries where endemic problems of infrastructure, financial resources, technical skills, and waning political support “hinder both the completion of IS innovation initiatives and the realization of their expected benefits.”b

b. Negroponte seems to question whether teachers are needed at all. Speaking about providing the rural poor a solid educational basis for development at the 2007 Digital, Life, Design conference in Munich, Germany, Negroponte said: “It’s not about training teachers. It’s not about building schools. With all due respect [to Hewlett-Packard’s e-inclusion efforts], it’s not about curriculum or content. It’s about leveraging the children themselves.”24

IT innovation is also part of socially embedded systems, the use of which cannot be isolated from the social and cultural environment or from local norms of practice.1,25 In some cases, teachers and the educational establishment have resisted innovation that


requires a significant change in pedagogy and that might reduce teacher status.b Even when the laptops are adopted, they are not always used as envisioned by OLPC or by education ministers. One Peruvian teacher said, “The ministry would want us to use the laptop every day for long periods of time. But we have decided to set rules in our school and, really, the laptop, it’s only a tool for us.”10

[Figure: XO features. Labeled parts: antennae, indicator lights, microphone, USB ports, speakers, directional pad, camera, screen rotate, storage access, Wi-Fi access, game buttons, battery light, power button, power light, SD slot, stylus area/touch pad, mouse buttons, latch. Photograph by Mike Lee.]

Such resistance is no surprise to students of innovation diffusion or of IT for development. Rogers18 pointed to examples where innovation diffusion failed due to cultural norms and the effects of such innovation on existing institutional arrangements. Avgerou2 noted that attitudes toward hierarchy are particularly problematic in developing countries. An example illustrating both themes is that the Peruvian experiment was initiated without being explained to the national teachers’ union.10 OLPC has strong support from the Peruvian Education Ministry, but ultimately teachers must actually use the machines in the classroom, and they are likely to see the union as an ally while possibly mistrusting the ministry.

The fact that OLPC was much stronger in developing innovative technology than in understanding how to diffuse it may reflect the engineering orientation of the organization and its lack of understanding of the needs or interests of the nontechnical people who will ultimately buy and use the innovation. This is illustrated by David Cavallo, OLPC’s chief education architect, saying, “We’re hoping that these countries won’t just make up ground but will jump into a new educational environment.”9 Expecting a laptop to cause such revolutionary change showed a degree of naiveté, even for an organization with the best intentions and smartest people.

Competitive Response from the PC Industry

The OLPC project was a potential threat to the PC industry in emerging markets. OLPC’s use of an AMD microprocessor and Linux operating system was a potential threat to the dominant position and historically high profit margins of Intel and Microsoft. Its targeting of a new market (developing-country schools) that existing PC makers were not serving raised the prospect that OLPC might gain a foothold in emerging markets more generally. Moreover, the XO’s ultra-low price raised the likelihood of a new price point for notebooks, potentially forcing PC makers to cannibalize existing low-end products in order to compete (which is what ultimately happened).

Branded PC makers have always faced competition from cheap local brands and clone makers in developing countries, but OLPC threatened to grab a share of education budgets worldwide that PC makers hoped to tap for themselves. Negroponte’s high-profile announcement of the project and the publicity he garnered quickly caught the industry’s attention.

Leading companies first responded by disparaging the XO as a useless toy. Intel’s Craig Barrett called it “a gadget,” saying people want the full functionality of a PC.17 Bill Gates said “...geez, get a decent computer where you can actually read the text and you’re not sitting there cranking the thing while you’re trying to type.”11 Before long, however, the industry began to respond with action, not just words.

In 2006, Intel introduced a small laptop, the Classmate, for developing countries that today sells for $230–$300. Intel has since licensed the Classmate reference design to PC makers to manufacture and distribute and is marketing it aggressively against the XO worldwide. It secured deals to sell hundreds of thousands of Classmates in Libya, Nigeria, and Pakistan, some of the very countries OLPC was counting on. Intel launched a series of pilot projects in these countries, saying it will also test the Classmate in at least 22 others while donating thousands of machines.21 Intel briefly joined OLPC in July 2007 but got into a nondisparagement dispute with Negroponte and dropped out only seven months later.14

In 2007, Microsoft offered to make available Windows, a student version of Microsoft Office, and educational programs to developing countries for $3 per copy when used on computers in schools. OLPC then decided to allow Windows on the XO, a choice driven by demands from some governments for Windows-based PCs. Even in countries with very low levels of PC penetration, officials who make purchasing decisions may favor a technology standard (the Wintel design) they are familiar with or believe children must learn on systems they will encounter later in the work force.

The OLPC project also stimulated innovation in low-cost, low-power PCs. Seeing OLPC’s success in developing a sub-$200 notebook, Asustek introduced the EeePC notebook in 2007 for the educational and consumer markets in both developed and developing countries, selling more than 300,000


units in four months. It was soon joined by major PC makers, including Acer, Dell, Hewlett-Packard, and many smaller ones in creating a new category of PC known today as netbooks.

[Figure: The 2010 version of the One Laptop per Child, the XO-2, will have a foldable e-book form and reduce power consumption to one watt. Photograph courtesy of Fuse Project.]

While the XO was specifically designed for the poor, rural education market in developing countries, netbook vendors target urban consumer and education markets in developed, as well as emerging, markets. In 2008, the netbook market exploded, with sales of 10 million units worldwide mostly running Intel’s low-cost Atom processor and Windows; sales are expected to double in 2009.16

The OLPC has been credited with spurring the netbook market, but the competition it spurred is now OLPC’s own biggest challenge. Developing countries today have a wide choice of vendors offering inexpensive netbooks, and, though not designed like the XO for the rigors of poor rural villages, they are competitive in large, easier-to-serve urban populations. OLPC responded by announcing in January 2009 that its second-generation laptop design would be licensed freely to PC makers to manufacture and distribute, hoping to use the resources of these firms to get millions of laptops into the hands of poor children in developing countries. Negroponte hopes to have a prototype in 18 months (from January), with units selling for perhaps $75 each.5

Lessons

The OLPC experience offers lessons for innovators and others aiming to introduce and deploy IT innovation to benefit the poor, as well as for the governments of developing countries. For innovators, we thus draw three general lessons:

Diffusing a new innovation requires understanding the local environment. OLPC recognized correctly that laptops could reach the poorest children only if they were subsidized by government or other funding sources. This is similar to rural electrification and telephone service, which usually cannot be provided economically and end up subsidized by government or by charges to urban customers who can be served profitably. However, innovators should understand that governments are not monolithic entities, nor are they the same from one country to the next. In some cases, funding can be allocated by an education ministry, in others it must be approved by the legislature, and in others provincial or local governments have jurisdiction. Commitments from high-level officials or political leaders are as binding as a politician’s campaign promises.26 Flying into a country and winning initial support is only a first step and must be followed by a sustained effort by people with a deep understanding of the local environment to ensure commitment leads to money and action. Likewise, social, economic, and cultural environments vary greatly across and even within countries, and deploying new technologies requires understanding these environments. Innovators must consider the need for expertise in sociology, anthropology, public policy, and economics, as well as for engineers, and establish coherent criteria for selecting countries to target based on social, economic, and political characteristics. Success in a few developing countries is critical to broad diffusion, as potential adopters look to their peers for evidence of the value of the innovation.18

Innovative technology can be disruptive and trigger a backlash from incumbents. Some innovations pose a threat to industry incumbents, who may seek to undermine the innovator’s efforts. The more visible the threat, the stronger the reaction is likely to be. This illustrates a dilemma for developers. A program less ambitious and less publicized than OLPC might not attract the attention of industry incumbents but also might not attract the partners, investors, and other sponsors needed to develop and deploy the innovation. As multinational companies direct more attention to emerging markets and so-called “bottom of the pyramid” consumers, there is more likelihood of competition but also more opportunity for cooperation. PC makers across the board are still seeking a formula for well-designed, low-cost computing devices, along with a complementary delivery value chain, market strategy, and business model.

Innovative information technologies do not stand alone. A technology like the XO is a system-level innovation that requires complementary assets to be valuable. While OLPC was able to deliver high-level design and hand off development and manufacturing to Quanta, it had no one to handle marketing, deployment, and support.15 Unlike the commercial PC companies, it was not part of any established business ecology and lacked resources to establish its own ecology.

For developing countries, international agencies, and philanthropists, there are other kinds of lessons:


Understand the true costs and risks, as well as benefits, of innovation. IT innovation like the XO may offer great benefits but also involves costs and risks. The purchase of a laptop is merely the start of a stream of ongoing costs. The total cost of ownership for a laptop program could include infrastructure investment, training, tech support, hardware maintenance, software licenses and upgrades, and replacement expenditures. Cost can also include the opportunity cost or the foregone investment in teachers, facilities, or other educational materials7 cited by India’s education ministry as its main reason for not joining OLPC.6

There is also a risk that the expected benefits might not be realized. Problems in implementation could limit actual use, and the need for ongoing funding means that the innovation might not be sustainable beyond some initial period.2,13 Another risk is investing in a technology platform that might not be supported in the future; for instance, investment in software, content, and training for the XO platform could be wasted if OLPC were to disappear.

Policymakers are able to reduce the risk if they make major acquisition decisions only after careful evaluation of pilot projects that enable learning firsthand how the technology fits with their educational goals and environment. Learning from other countries’ experience can be valuable even when the context is different; Al-Gahtani1 says that successful pilot projects by peers in other developing countries help reduce the perceived risk of adoption.

Adopting organizations need to develop internal capabilities and set priorities. Although governments might receive outside assistance for trials, they must be able to sustain the innovation in the development of digital educational content, training of teachers to integrate ICT-based educational materials in the teaching-learning process, and design and installation of supporting IT and power infrastructure. For example, one independent evaluation concluded: “While the Uruguayan government is making a great effort in providing funding for the hardware, there is no funding for designing and developing software and content for use with the laptops or for conducting a thorough evaluation of the educational and societal outcomes of the project.”13 Other evaluations argue that the countrywide deployments envisaged by OLPC are simply beyond the resources of any developing country, saying that governments must set priorities regarding goals and the regions, sectors, and schools to be served.8,12

Conclusion

The potential significance of the XO, as well as of other IT innovations, in developing countries calls for systematic, independent evaluation, a true “grand challenge” for the computing and social science communities. Researchers can provide value by conducting well-designed studies of the diffusion and results of such innovation. The knowledge created promises to prevent wasting a great deal of money and effort and lead to quicker diffusion and better use of innovations that prove beneficial. While OLPC has so far fallen short of its goals, there is much yet to be learned by studying this case of IT innovation.

Acknowledgments

The Personal Computing Industry Center (pcic.merage.uci.edu/) is supported by grants from the Alfred P. Sloan Foundation and the U.S. National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the author(s) and do not necessarily reflect the views of the Sloan Foundation or the National Science Foundation.

References

1. Al-Gahtani, S.S. Computer technology adoption in Saudi Arabia: Correlates of perceived innovation attributes. Information Technology for Development 10, 1 (Jan. 2003), 57–69.
2. Avgerou, C. Information systems in developing countries: A critical research review. Journal of Information Technology 23 (June 2008), 133–146.
3. Avgerou, C. The significance of context in information systems and organizational change. Information Systems Journal 11, 1 (January 2001), 43–63.
4. BBC News. Sub-$100 laptop design unveiled (Sept. 29, 2005); news.bbc.co.uk/2/hi/technology/4292854.stm.
5. Bray, H. Cheaper laptop promised; Negroponte remains determined to realize vision. Boston Globe (Feb. 11, 2009); www.boston.com/business/technology/articles/2009/02/11/cheaper_cheap_laptop_promised/.
6. Einhorn, B. A crusade to connect children. India criticizes an MIT professor’s quest to provide ‘One Laptop Per Child,’ but he’s forging ahead elsewhere. BusinessWeek.com (Aug. 16, 2006); www.businessweek.com/globalbiz/content/aug2006/gb20060816_021986.htm.
7. Farrell, G. ICT in education in Rwanda. In Survey of ICT and Education in Africa: Rwanda Country Report. World Bank Information Development, Washington D.C., Dec. 2007; www.infodev.org/en/Publication.423.htm.
8. Haertl, H. Low-Cost Devices in Educational Systems: The Use of the ‘XO-Laptop’ in the Ethiopian Educational System. Report distributed by the Division of Health, Education and Social Protection, Information and Communication Technologies, GTZ-Project, Deutsche Gesellschaft fur Technische Zusammenarbeit, Eschborn, Germany, Jan. 2008; www.gtz.de/de/dokumente/gtz/2008-en-laptop.pdf.
9. Hamm, S., Smith, G., and Lakshman, N. Social cause meets business reality. BusinessWeek in Focus (June 12, 2008), 48; www.thefreelibrary.com/ cial+Cause+Meets+Business+Reality-a01611563648.
10. Hansen, L. Laptop deal links rural Peru to opportunity, risk. Weekend Edition. National Public Radio (Sunday, Dec. 14, 2008).
11. Hiser, S. Bill Gates criticises OLPC. PlexNex blog (Mar. 16, 2006); fussnotes.typepad.com/plexnex/2006/03/bill_gates_crit.html.
12. Hooker, M. 1:1 Technologies/Computing in the Developing World: Challenging the Digital Divide. Global e-Schools and Communities Initiative, Dublin, Ireland, May 2008; www.gesci.org/index.php?option=com_content&task=view&id=75&Itemid=64.
13. Hourcade, J.P., Beitler, D., Cormenzana, F., and Flores, P. Early OLPC experiences in a rural Uruguayan school. In Proceedings of CHI 2008 (Florence, Italy, Apr. 5–10). ACM Press, New York, 2008, 2503–2511.
14. Kirkpatrick, D. Negroponte on Intel’s $100 laptop pullout. Fortune (Jan. 4, 2008); money.cnn.com/2008/01/04/technology/kirkpatrick_negroponte.fortune/index.htm.
15. Krstic, I. Sic Transit Gloria Laptopi. Ivan Krstic blog (May 13, 2008); radian.org/notebook/sic-transit-gloria-laptopi.
16. O’Donnell, B. Worldwide Mini-Notebook PC 2008-2012 Forecast Update and 3Q08 Vendor Shares. Market Analysis. IDC, Framingham, MA, Dec. 2008.
17. Reuters. Intel calls MIT’s $100 laptop a ‘gadget.’ CNet news.com (Dec. 9, 2005); news.com/Intel+calls+MITs+100+laptop+a+gadget/2100-1005_3-5989067.html?tag=html.alert.
18. Rogers, E.M. Diffusion of Innovations, Fifth Edition. Free Press, New York, 1995.
19. Shah, A. OLPC struggles to realize ambitious vision. PC World (Dec. 20, 2007); www.pcworld.com/article/140698/olpc_struggles_to_realize_ambitious_vision.html.
20. Simon, S. Laptops may change the way rural Peru learns. Weekend Edition. National Public Radio (Saturday, Dec. 13, 2008).
21. Stecklow, S. and Bandler, J. A little laptop with big ambitions. WallStreetJournal.com (Nov. 24, 2007); online.wsj.com/public/article/SB119586754115002717.html.
22. Vota, W. OMG: OLPC just gutted: 50% staff cut & more. OLPC News Forum post (Jan. 7, 2009); www.olpcnews.com/forum/index.php?topic=4228.msg28414#msg28414.
23. Vota, W. A Brightstar OLPC give one get one XO computer distributor. One Laptop Per Child News (Oct. 12, 2007); www.olpcnews.com/implementation/plan/brightstar_xo_computer_distribution.html.
24. Vota, W. OLPC Nepal creates content while Negroponte dismisses it. One Laptop Per Child News (Jan. 31, 2007); www.olpcnews.com/countries/nepal/negroponte_curriculum_content.html.
25. Walsham, G. and Sahay, S. Research on information systems in developing countries: Current landscape and future prospects. Information Technology for Development 12, 1 (Feb. 2006), 7–24.
26. Warschauer, M. Dissecting the ‘digital divide’: A case study in Egypt. The Information Society 19, 4 (Sept./Oct. 2003), 297–304.

Kenneth L. Kraemer is a research professor in the Paul Merage School of Business, Co-Director of the Personal Computing Industry Center, and Associate Director of the Center for Research on Information Technology and Organizations, all at the University of California, Irvine.

Jason Dedrick is Co-Director and a project scientist in the Personal Computing Industry Center at the University of California, Irvine.

Prakul Sharma is a research associate in the Personal Computing Industry Center at the University of California, Irvine.

© 2009 ACM 0001-0782/09/0600 $10.00

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 73 review articles

DOI:10.1145/1516046.1516064

How Computer Science Serves the Developing World

Information and communication technology for development can greatly improve quality of life for the world’s neediest people.

BY M. BERNARDINE DIAS AND ERIC BREWER

What do the increasingly prominent news stories about $100 laptops, kids learning about computers through a “hole in the wall,” and the power of mobile phones to educate, entertain, and connect people in remote regions have in common? It is the field of information and communication technology for development (ICTD), based on the belief that technology can have a large and positive effect on billions of individuals by helping them overcome the challenges so prevalent in developing regions. ICTD is not new: numerous important though relatively low-profile projects have been building the foundations of ICTD for many years. What’s new are its name and, more important, the increased recognition the field has lately been receiving and its potential for exerting greater influence.

In this article we explore ICTD and examine the role that computer scientists can play in it. Our objective is to convince readers that although achieving all the goals of ICTD will not be easy, even their partial realization could have tremendous impact.

The motivation for this field comes from a new awakening to the vast gap in quality of life between the richest billion people on earth (who enjoy a variety of luxuries, including Internet access) and the poorest billion (who just barely eke out a living, and sometimes not). The base of the world’s economic pyramid has an estimated population of four billion (over half of our planet’s people) living on less than $2 a day.

In response to this awakening, scholars and practitioners have begun to explore the transforming power of information and communication technology when applied to the problems traditionally addressed in development. Can mobile phones provide income generation and facilitate remote medical diagnosis? How can user interfaces be designed so they are accessible to the semiliterate and even the illiterate? What role can computers play in sustainable education for the rural poor? What new devices can we build to encourage literacy among visually impaired children living in poverty? What will a computer that is relevant and accessible to people in developing regions look like? These are just a few of the questions being addressed in ICTD.

In other words, ICTD can be seen as harnessing the power of information and communication technologies, or ICTs, to take up many of the challenges of development. ICTs include technologies ranging from robotic tools and state-of-the-art computers to desktop

and laptop computers in their traditional forms; and from mobile phones, PDAs, and wireless networks to long-established technologies such as radio and television. The software components also span a wide range, from artificial intelligence and new algorithms, interfaces, and applications to the most prosaic programmed commodities.

[Figure: Educational initiatives by the TechBridgeWorld group at CMU explore the efficacy of technology tools like an automated English reading tutor. A more recent partnership with researchers from Ashesi University College in Ghana resulted in the country’s first undergraduate robotics course. Photographs courtesy of TechBridgeWorld at Carnegie Mellon University.]

Although the goals of international-development efforts vary, depending on the nature of each endeavor, the overarching goal of all such projects is the alleviation of the suffering caused by poverty and improvement of quality of life for the world’s poor. The United Nations’ eight Millennium Development Goals (MDGs) infused new energy into the world’s development efforts and helped to focus them on concrete objectives (eradicating extreme poverty and hunger, improving maternal health, prevailing in the battle against HIV/AIDS and malaria, reducing child mortality, and achieving universal primary education, environmental sustainability, and a global partnership for development) to be met by the year 2015. Other development goals, not emphasized in the MDGs, include access to adequate shelter, information, avenues for income generation, and financial credit. The ongoing rural-to-urban shift of so much of the world’s population has introduced a new set of problems as well, including increased vulnerability to disasters and the corresponding challenges for effective disaster responses. These are among the many international-development challenges that ICTD researchers and practitioners hope to address. They expect to reinvent the form, function, and applications of ICTs in new and creative ways so that such challenges may best be met.

From a CS point of view, ICTD can be seen as the next wave in ubiquitous computing. Historically, computers started as huge machines that filled rooms and were only relevant and accessible to a specialized minority. The next big wave was the home PC, which is now relevant and accessible to over one billion people worldwide. ICTD is perhaps the next revolution in computing, transforming the computer and the applications of computing so that this technology can finally become relevant and accessible to the other five billion people of the world.

Given its position at the intersection of technology and development, ICTD brings together a wide variety of actors in many different roles. Among the newest are computer scientists, and their role is potentially a big one, both for their beneficiaries and themselves. It can change the image of the computer science discipline, the nature of the PC, and the future of the field.

A crucial requirement for success


in ICTD, however, is interdisciplinary collaboration: working with scholars and practitioners from many different fields. Sociologists, ethnographers, and anthropologists, for example, can provide valuable information about the communities intended to benefit from ICTD. This information, regarding such things as cultural practices, traditions, languages, beliefs, and livelihoods, must guide the design and implementation processes for successful solutions in ICTD.

Economists and political scientists play important roles in ICTD as well by designing new economic models, marketing strategies, and governmental policies that affect the economic viability and sustainability of technological interventions. Social scientists also play a crucial role in evaluating the impacts and outcomes of ICTD projects using both qualitative and quantitative methods. They observe and predict how people in developing regions interact with technology, and they aim to affect social systems for adopting technology-aided solutions without disruption to the community. Thus computer scientists working in the field of ICTD must quickly learn to work with this variety of scholarly players, to benefit from their points of view, and to complement them wherever possible.

ICTD does not only cross disciplines; it also transcends the boundaries of academia and involves multiple sectors. This reality obliges ICTD researchers to work with practitioners, government representatives, multilateral institutions such as the United Nations, nonprofits, nongovernmental organizations, and even the private sector, whose interest in ICTD begins as it seeks access to emerging markets and new avenues for corporate social responsibility. Many of these sectors’ people have been addressing the challenges of development for decades, and their efforts should profit from the addition of professionals in CS and related fields who will contribute new perspectives and their useful styles of rigor, critique, and innovation.

ICTD is therefore a truly global undertaking with a grand vision. It brings together numerous players, across geographic, socioeconomic, regional, disciplinary, and sectoral boundaries, who must work together if we are to improve the quality of life for the least privileged on our planet.

The Many Challenges of ICTD

Given its enormous ambitions and multidisciplinary requirements, ICTD presents its researchers with a variety of challenges. They include adapting to unfamiliar cultures and traditions, ensuring accessibility to local languages and multiple levels of literacy, overcoming the barriers of misinformation and mistrust of technology, creating solutions that work within the local infrastructure, and many more. For example, networking must work in circumstances with low bandwidth, intermittent bandwidth, or no bandwidth at all. Computers must operate reliably in environments characterized by dust, heat, humidity, and inexperienced users. User interfaces must accommodate semiliterate and illiterate users. And software applications must be sufficiently intelligent to provide useful, accessible, and relevant services to populations that might be interacting with a computing system for the very first time.

Further, ICTD field tests often require considerable ingenuity, whether they involve accessing target communities, setting up long-term studies, transporting equipment, observing the logistics and legalities of export control laws, addressing safety concerns, or establishing trust and common ground with partnering organizations that cross cultural and geographic boundaries. And to begin with, researchers must be entrepreneurial in obtaining funding for their research, as ICTD is not yet an established field with reliable funding sources.

Although the cause is noble and the impact can be large, ICTD must ultimately be judged on its research value, and in particular, its research value in CS. Like other multidisciplinary fields, ICTD must be simultaneously present in multiple communities, each of which may have its own value system for research. Even within computer science, ICTD is judged differently by different CS communities.

In the human-computer interaction (HCI) community (for example, at the annual ACM CHI conferences), ICTD has been well received, as HCI

is multidisciplinary by nature and already deals both with quantitative and qualitative research. Moreover, developing-region users differ in their employment and adoption of technology and thus comprise an important research direction for HCI professionals. In fact, HCI is arguably the easiest discipline within CS in which to work on ICTD research. Other areas with some inherent compatibility include systems, networking, databases, and AI. For example, in systems and networking, which are not as multidisciplinary as HCI and more quantitative in character, ICTD work is less natural, but it can still fit well when technology innovation and novel usage are involved. Examples from top-tier conferences include work on delay-tolerant networking, distributed storage, and novel MAC-layer protocols for long-distance WiFi. In these kinds of approaches to ICTD research, there must be a core technical nugget in addition to real-world deployments.

However, research requires a great deal of effort per published report, given the challenges of deployments; over the long term, ICTD researchers must aim to produce papers that are fewer in number but of higher impact. Moreover, it must be noted that ICTD tends to be driven by the solving of a problem rather than by technological innovation (often, in search of a problem), which means that many ICTD projects may not have a core technical nugget after all. Such problems, although highly satisfying to solve, are harder to claim as CS research.

For most projects, the real research is in actually discovering the specification of the problem via repeated fieldwork and deployments, which is similar in feel to iterative design in HCI. Although HCI is an exception, CS does not generally value problem discovery, especially if the end solution is simple (had we known to apply it). Researcher Matthew Kam went through such iteration to create effective educational games on cellphones:3

• Evaluating 35 existing games for PCs with village students.
• Creating 10 test games for English as a second language (ESL) and testing them with 47 students.
• Studying 28 traditional village games to make the games more intuitive (compared to Western games).
• Implementing a new set of games.
• Leading an ongoing multiyear study on the educational value of these games.

Overall, this process has taken over four years and continues to this day.

ICTD is also developing its own community values over time. The clearest values so far are novelty and on-the-ground empirical results, both quantitative and qualitative. Less clear are the values surrounding repeatability, rigor, and generalizability, and least clear is how to merge the values of qualitative fields such as anthropology or ethnography with those of CS. Consider generalizability: CS values generalizable results as an indicator of potential impact, while qualitative researchers often emphasize the differences in groups or users and aim to broaden the dialogue. This leads to placing value on reusable technology frameworks, such as HCI toolkits, that can be customized and easily localized. We discuss one such framework here for mixed paper/phone applications. ICTD is also creating its own scholarly forums for discussing and disseminating this work. The International Conference on Information and Communication Technologies and Development and the International Conference on Social Implications of Computers in Developing Countries are two examples.

What about Sustainability?

Long-term impact requires that ICTD projects be self-sustaining. First, after the researchers leave and the money stops flowing, does the project continue? Second, can it be replicated in other contexts?

Sustainability is challenging to define, and researchers disagree on the details. Most agree on financial sustainability as a key element: the deployment must produce enough income to at least cover its costs. In this view, philanthropy is acceptable for “kick starting” a project, but not for supporting routine operational costs. Similarly, while projects typically need not be wildly profitable, they should at least be cash-flow positive, as credit can be challenging.

The operating-cost issues add significant constraints to ICTD solutions. They include not just the cost of the technology but also availability (uptime), power requirements, potential for theft, and logistics. One common approach to financial sustainability is to commercialize a solution; this has worked well for mobile phones and treadle pumps, for example. Even if a for-profit venture is not the purpose, researchers must essentially address the same issues of costs, cash flow, awareness (marketing), and ongoing support.

Operational sustainability is the capacity of the permanent staff to keep the project going technically (without the researchers). In theory, financial sustainability enables operational sustainability (by paying for it), but in practice it cannot do so all by itself. This is because of limits on local skills, supplies, and logistics. Solutions must be not only easy to use, but also amenable to straightforward diagnosis and repair with limited training.

Training costs are actually underrated. ICTD projects, particularly in rural areas, cannot view training as a one-time activity needed only when the project starts. Once trained, IT workers are often tempted to leave for better jobs in urban areas or other countries. Thus training is a recurring cost, and it must be short and effective.

These kinds of sustainability are fundamental to scaling a successful pilot project. Unfortunately, development-work pilots rarely turn into large-scale self-sustaining successes. Typically the pilot is small enough and has enough researchers involved (with their own support) that the financial and operational issues do not really hinder it. Thus the pilot is mostly useful to validate prototypes and assess community reactions. The understanding of financial sustainability requires a longer trial with detailed accounting and no hidden subsidies (unless they are expected to continue at scale); it also requires dealing with replacement costs and expected equipment lifetimes. Operational sustainability must be evaluated via detailed tracking of problems and how and by whom they were solved. In both cases, the system evolves to reduce costs or simplify operation.

Finally, replication is the process of moving a successful project to a new environment. As developing regions


are quite heterogeneous in many respects, projects typically need some adjustments to work well with new partners, a different culture, or a different government. Both scaling and replication are active areas of multidisciplinary research, and CS has a critical role to play, given its direct impact on sustainability.

A Few Sample Projects

We have selected four sample ICTD projects that illustrate some of the issues discussed thus far. The projects focus on four different topics (telemedicine, assistive technology, microfinance, and education), all in the context of developing regions. Each example highlights different challenges and characteristics of the ICTD field. Together, these projects reflect CS-related innovation in the areas of systems, networking, HCI, and AI. The past proceedings of the International Conference on Information and Communication Technologies and Development offer a much larger sampling of current or recent research efforts in the ICTD field. Several other examples and an overview of ICTD are also provided in a recently published special edition of IEEE Computer.8

Rural Telemedicine: The Aravind Eye Care Hospital in southern India is a world leader in high-volume low-cost eye care. Working in the state of Tamil Nadu, Aravind served over 2.4 million patients last year and performed over 280,000 cataract surgeries. More than half the patients receive free or discounted eye care (they are subsidized by paying customers), and the hospitals have been financially self-sustaining for decades.

Despite this success, until recently Aravind had limited reach into rural areas; patient surveys indicated that most patients came from within 20km of a hospital and that only 7% of rural patients had access to any kind of eye care. After several iterations, it became clear that the solution was to create rural vision centers (VCs) consisting of 1–2 rooms, a nurse, a technician (to make eyeglasses), and notably the means for high-quality doctor/patient videoconferencing. This “video solution,”7 developed at UC Berkeley, uses novel long-distance WiFi links that are low-cost, low-power, and typically deliver 4Mb/s–6Mb/s between the hospital and the VC over distances ranging from a few to tens of kilometers. (The same basic technology has also been extended to go 382km in Venezuela.)

Having successfully completed a five-VC pilot in early 2006, Aravind now has 24 VCs in operation via a mix of WiFi and DSL (in more urban areas). Some 5,000 patients use the video service per month, with over 100,000 through the end of 2008 having used the WiFi links. Of these 100,000, over 15,000 were effectively blind (primarily due to refractive problems or cataracts), but can now

Researchers from CMU’s TechBridgeWorld are working with the Mathru School for the Blind outside Bangalore, India, to enhance the teaching and learning process for writing Braille through the use of a low-cost writing tutor that gives audio feedback to students. (Photographs courtesy of TechBridgeWorld at Carnegie Mellon University.)

see well; 85% of them have been able to return to income generation. This example shows how the combination of basic needs and large volumes in developing regions enables ICTD research to have great impact. Aravind recently won the $1M 2008 Gates Foundation Award for Global Health, in large part because of the reach of these vision centers.

Assistive Technology: The Mathru School for the Blind is a residential facility that provides free education, clothing, food, and health services to visually impaired children from socially and economically deprived families from remote parts of India. The school is located in the residential area of Yelahanka, a suburb of Bangalore. Teaching Braille, the only means of literacy for the blind, is an important part of the curriculum at Mathru. However, learning to write Braille using the traditional slate and stylus is not an easy process, for several reasons. First, Braille must be written from right to left in mirror-image format so that the correct Braille characters can be read when the paper is removed from the slate and flipped over. Second, students get delayed feedback; they must wait until their writing is complete and the paper has been removed and read. Third, when the teachers themselves are blind, it is difficult to diagnose problems in the students’ writing process by simply reading the end product. Finally, motivation for learning to write Braille is very low because the process is tedious and sometimes even physically taxing for young students.

Researchers from the TechBridgeWorld group at Carnegie Mellon University are working with Mathru to enhance the teaching and learning process for writing Braille using a slate and stylus. This effort has resulted in a low-cost Braille writing tutor that gives audio feedback to the student as he or she forms characters with the stylus. This Braille tutor,2 first designed, implemented, and field-tested in 2006, has been enhanced through an iterative design process to provide several features. They include teaching basic Braille in several languages, teaching basic math symbols, adapting the operational mode to cater to specific student needs, and several educational games that motivate students to learn the skill of writing Braille. This ongoing research has expanded to several new partnerships, including groups in Qatar, Zambia, and China.

Microfinance Support: The Nobel Peace Prize for Muhammad Yunus brought overdue attention to the powerful role of microfinance in developing regions. Such services are in dire need of technological support, not only for basic accounting but also to reduce fraud and satisfy government mandates for reporting. The required reports in India, for example, specify multiple copies of the same tables

The growth in centers (one per color) and patients in the Aravind telemedicine project in India.

[Chart: Patient Throughput. Number of patients (axis from 0 to 5,000) against time in months (2006 to 2008), with one series per vision center: Kallidakurchi, Andipatti, Srivaikundam, Periyakulum, Sholavandan, Chinnamanur, Tirupovanam, Bodi, Alanganallur, and Ambasamudram. Source: Sonesh Surana]


in different formats, which are easily done with a spreadsheet but tedious and error-prone on paper, which has been the typical mode.

Tapan Parikh, in his dissertation work at University of Washington, developed a system called CAM6 (short for ‘camera’) that combines the comfort and tangible nature of paper with the power of mobile phones. Two-dimensional barcodes on the paper guide data entry on the phone and help to manage document flow. In addition to workflow support, CAM uses the keypad for numeric input and provides voice feedback, both of which have been well received by semiliterate rural users. This system is now under trial with 400 microfinance groups in India.

Educational Technology and Technology Education: Project Kané,1 an initiative of the TechBridgeWorld group at Carnegie Mellon University, explores the efficacy of technological tools in improving English literacy for children in developing regions, with a focus on Africa. The project started with a three-week pilot study in Ghana that tested the feasibility and impact of using an automated English-reading tutor to improve the level of English literacy among children from low-income families in Accra. This study gave preliminary indications that the tutor had a positive impact on the students’ performance on spelling and fluency tests. It also identified several important factors for success, such as the need to include some local stories familiar to the children and the necessity to narrate the tutorial (on how to use the automated tutor) in a voice with a Ghanaian accent. Based on this initial success, the pilot was scaled to a six-month study that included three groups of children from very different socioeconomic backgrounds, and it has also been replicated in Mongu, Zambia.

The automated tutor used in these studies was not designed for developing regions, however, and it was clear that new educational-technology tools with that focus were needed. This goal is being pursued through a new partnership between TechBridgeWorld researchers and alumni of the course in robotics and artificial intelligence (Ghana’s first) taught at Ashesi University College.

Ayorkor Mills-Tettey, a doctoral candidate in robotics at Carnegie Mellon University and a native of Ghana, had spearheaded a collaborative project between TechBridgeWorld and Ashesi University College to design and teach that course5 at Ashesi, a private, accredited, nonsectarian college dedicated to training a new generation of ethical and entrepreneurial leaders in Africa. The collaboration between the two universities led to a summer course designed and taught with careful consideration of the local context, infrastructure, and resources.

Several students who took this course have now graduated and have followed different employment paths; some headed to industry (including a startup company for developing mobile applications) and others to graduate school. Empowered with a strong technology education, some of these students are now collaborating with TechBridgeWorld researchers to design, implement, and field-test educational technology tools to improve literacy in their homeland.

Looking to the Future

We believe that technology, along with good governance and macroeconomics, represents the path forward for the majority of the world’s people. Consider that in 1970, South Korean and African incomes were similar; but the rapid relative rise of South Korea shows what is possible, due in large part to technology.4 We believe that proactive research and development of ICTs appropriate for developing regions can lead to similar growth and prosperity over time and to an improved quality of life in the immediate future.

Today we have lots of examples and anecdotes about high impact from ICTD in developing regions, but the field remains ad hoc and largely without the benefit of the innovative thinking that more computer scientists would bring to bear. The situation could change substantially, however. The core costs of computing and communication have dropped to a point that enables CS to affect everyone, especially when combined with the flexibility inherent in software that enables low-cost customization for a wide variety of contexts. This combination makes CS uniquely positioned among all disciplines to have immediate and large-scale impact. But the role of CS in development is essentially a community decision, involving whether we value this work or not. For example, will ICTD be a viable path to a tenure-track CS faculty position?

We can say that although the challenges are great, ICTD is both intellectually rewarding and very attractive to students at all levels. With several recent reports citing the dwindling numbers of students interested in studying CS, perhaps ICTD is one answer. It may help motivate a new generation of computer scientists to contribute their knowledge, talents, and energies toward solving some of the world’s most pressing problems.

References
1. Dias, M.B., Mills-Tettey, G.A., and Mertz, J. The TechBridgeWorld Initiative: Broadening perspectives in computing technology, education, and research. In Proceedings of the International Symposium on Women and ICTD: Creating Global Transformation. ACM Press, NY (June 2005).
2. Kalra, N., Lauwers, T., Dewey, D., Stepleton, T., and Dias, M.B. Iterative design of a Braille writing tutor to combat illiteracy. In Proceedings of the 2nd IEEE/ACM International Conference on Information and Communication Technologies and Development (Dec. 2007).
3. Kam, M., Ramachandran, D., Devanathan, V., Tewari, A., and Canny, J. Localized iterative design for language learning in underdeveloped regions: The PACE Framework. In Proceedings of the ACM Conference on Human Factors in Computing Systems (San Jose, CA, Apr. 28–May 3, 2007).
4. Kim, S.J. Information technology and its impact on economic growth and productivity in Korea. International Economic Journal 17, 3 (Oct. 2003), 55–75.
5. Mills-Tettey, G.A., Dias, M.B., Browning, B., and Amanquah, N. Teaching technical creativity through robotics: A case study in Ghana. In Proceedings of the 2nd AI in ICT for Development Workshop, 20th International Joint Conference on Artificial Intelligence (Jan. 2007).
6. Parikh, T.S., Javid, P., Sasikumar, K., Ghosh, K., and Toyama, K. Mobile phones and paper documents: Evaluating a new approach for capturing microfinance data in rural India. In Proceedings of the ACM Conference on Computer-Human Interaction (Apr. 24–27, 2006, Montreal, Canada).
7. Surana, S., Patra, R., Nedevschi, S., and Brewer, E. Deploying a rural wireless telemedicine system: Experiences in sustainability. IEEE Computer 41, 6 (June 2008), 48–56.
8. Toyama, K. and Dias, M.B., guest editors. IEEE Computer Magazine, Special Edition on Information Communication Technology for Development (June 2008).

M. Bernadine Dias ([email protected]) is an assistant research professor at the Robotics Institute of Carnegie Mellon University, Pittsburgh, PA. She founded and directs the TechBridgeWorld group (www.techbridgeworld.org), which pursues technology research relevant to, and in partnership with, underserved communities throughout the globe.

Eric Brewer ([email protected]) is a professor in the computer science division at the University of California, Berkeley. He founded the Federal Search Foundation, which built FirstGov (now USA.gov), the portal for the U.S. government, and he was the founder and chief scientist of the Inktomi Corporation, now part of Yahoo!

© 2009 ACM 0001-0782/09/0600 $10.00

research highlights

P. 82 Technical Perspective: Reframing Security for the Web
By Andrew Myers

P. 83 Securing Frame Communication in Browsers
By Adam Barth, Collin Jackson, and John C. Mitchell

P. 92 Technical Perspective: Software and Hardware Support for Deterministic Replay of Parallel Programs
By Norman P. Jouppi

P. 93 Two Hardware-Based Approaches for Deterministic Multiprocessor Replay
By Derek R. Hower, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas

DOI:10.1145/1516046.1516065 Technical Perspective Reframing Security for the Web By Andrew Myers

THE WEB HAS brought exciting new functionality while simultaneously requiring new mechanisms to make it secure. We’ve repeatedly discovered that these mechanisms are not good enough, as clever hackers and academics have figured out how to circumvent and misuse them to compromise security.

We now live in a world in which viewing an advertisement might compromise your bank account. In the following paper, “Securing Frame Communication in Browsers,” researchers Adam Barth, Collin Jackson, and John Mitchell not only illustrate how subtle some of these security vulnerabilities can be, they show how to solve them in a principled way. This paper has had a real impact: their solutions have already been widely adopted.

Why is Web security difficult? It’s because the Web browser is a place where programs and data from different sources interact. Each source may control resources whose security can be affected by the programs and data from other sources. In fact, there is a deep, underlying problem that has never been satisfactorily solved: how to securely permit fine-grained sharing and communication between programs from mutually distrusting sources. Conventionally, security was considered the job of the operating system. But the granularity of operating system enforcement is far too coarse for Web applications, whose security depends on the precise details of the interactions between application-level data structures such as frames, cookies, and interpreted application code.

Web security forces us to think anew about the problem of fine-grained sharing across trust domains because many exciting new applications and services require this sharing. Some of the techniques developed for operating system security, such as controlled communication between processes, can be adapted to the Web. But Web security poses new challenges as well. For example, Web security violations can occur within the context of a single Web page, which often comprises multiple frames controlled by code from different sources. These frames may be third-party advertisements or integrated content from multiple parties who do not trust each other; the many mashups based on Google Maps are examples of the latter. The absence of effective solutions to the problem of fine-grained interaction between trust domains, coexisting on the very same Web page, has left Web applications vulnerable.

Fortunately, researchers like Barth, Jackson, and Mitchell are applying principled methods to identify and eliminate these vulnerabilities. The vulnerabilities they address arise from the feature of frame navigation in Web browsers. Code running in one frame (that is, one trust domain) can control where another frame loads its content from. The authors use elegant reasoning to identify the most permissive secure policy for controlling frame navigation. This argument is so simple and convincing that the policy they identify has been adopted by most major browsers.

In itself, this would be a significant contribution, but the paper goes farther. It newly identifies vulnerabilities in two important mechanisms for communication between different frames; one of these mechanisms is in the HTML 5 standard. The paper gives a thoughtful and principled analysis of each communication mechanism and identifies a fix for each. These fixes have also been adopted by current browsers and communication libraries.

The paper is a great example of research that has impact precisely because it offers principled solutions. Too often, proposed computer security mechanisms merely raise the bar against attacks, starting the next phase of an arms race. This is a different kind of work: work that clearly identifies and convincingly solves a real security problem. The work described in this paper makes our lives more secure and helps the next generation of applications to be built securely. And their work also helps us understand how to think about the new security challenges that lie ahead.

Andrew Myers is an associate professor of computer science at Cornell University, Ithaca, NY.

© 2009 ACM 0001-0782/09/0600 $10.00

DOI:10.1145/1516046.1516066 Securing Frame Communication in Browsers By Adam Barth, Collin Jackson, and John C. Mitchell

Abstract
Many Web sites embed third-party content in frames, relying on the browser’s security policy to protect against malicious content. However, frames provide insufficient isolation in browsers that let framed content navigate other frames. We evaluate existing frame navigation policies and advocate a stricter policy, which we deploy in the open-source browsers. In addition to preventing undesirable interactions, the browser’s strict isolation policy also affects communication between cooperating frames. We therefore analyze two techniques for interframe communication between isolated frames. The first method, fragment identifier messaging, initially provides confidentiality without authentication, which we repair using concepts from a well-known network protocol. The second method, postMessage, initially provides authentication, but we discover an attack that breaches confidentiality. We propose improvements in the postMessage API to provide confidentiality; our proposal has been standardized and adopted in browser implementations.

The original version of this paper was published in the Proceedings of the 17th USENIX Security Symposium, July 2008.

1. INTRODUCTION
Web sites contain content from sources of varying trustworthiness. For example, many Web sites contain third-party advertising supplied by advertisement networks or their sub-syndicates.3 Other common aggregations of third-party content include Flickr albums, Facebook badges, and personalized home pages offered by the three major Web portals (iGoogle, My Yahoo! and Windows Live). More advanced uses of third-party components include Yelp’s use of Google Maps to display restaurant locations, and the Windows Live Contacts gadget. A Web site combining content from multiple sources is called a mashup, with the party combining the content called the integrator, and integrated content called a gadget. In simple mashups, the integrator does not intend to communicate with the gadgets and requires only that the browser provide isolation. In more sophisticated mashups, the integrator does wish to communicate and requires secure interframe communication. When a site wishes to provide isolation and communication between content on its pages, the site inevitably relies on the browser rendering process and isolation policy, because Web content is rendered and viewed under browser control.

In this paper, we study a contemporary Web version of a recurring problem in computer systems: isolating untrusted, or partially trusted, components while providing secure intercomponent communication. Whenever a site integrates third-party content, such as an advertisement, a map, or a photo album, the site runs the risk of incorporating malicious content. Without isolation, malicious content can compromise the confidentiality and integrity of the user’s session with the integrator. Although the browser’s well-known “same-origin policy”19 restricts script running in one frame from manipulating content in another frame, browsers use a different policy to determine whether one frame is allowed to navigate (change the location of) another. Although browsers must restrict navigation to provide isolation, navigation is the basis of one form of interframe communication used by leading companies, and navigation can be used to attack a second interframe communication mechanism.

Many recent browsers have overly permissive frame navigation policies that lead to a variety of attacks. To prevent the attacks that we demonstrate against the Google AdSense login page and the iGoogle gadget aggregator, we propose tightening the browser’s frame navigation policy. Based on a comparison of four policies, we advocate a specific policy that restricts navigation while maintaining compatibility with existing Web content. We have collaborated with the HTML 5 working group to standardize this policy and with browser vendors to deploy this policy in Firefox 3, Safari 3.1, and Google Chrome. Because the policy is already implemented in Internet Explorer 7, our preferred policy is now standardized and deployed in the four most-used browsers.

With strong isolation, frames are limited in their interactions, raising the issue of how isolated frames can cooperate as part of a mashup. We analyze two techniques for interframe communication: fragment identifier messaging and postMessage. Table 1 summarizes our results.

• Fragment identifier messaging uses frame navigation to send messages between frames. This channel lacks an important security property: messages are confidential but senders are not authenticated. These properties are analogous to a network channel in which senders encrypt their messages with the recipient’s public key. The Microsoft.Live.Channels library uses fragment identifier messaging to let the Windows Live Contacts gadget communicate with its integrator, following an authentication protocol analogous to the Needham–Schroeder public-key protocol.17

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 83 research highlights

Table 1: Security properties of frame communication channels.

Channel                        Confidentiality  Authentication  Network Analogue
Fragment identifier messaging  Yes              No              Public Key Encryption
Original postMessage           No               Yes             Public Key Signatures
Improved postMessage           Yes              Yes             SSL/TLS

  We discover an attack on this protocol, related to Lowe’s anomaly in the Needham–Schroeder protocol,15 in which a malicious gadget can impersonate the integrator to the Contacts gadget. We suggested a solution based on Lowe’s improvement to the Needham–Schroeder protocol15 that Microsoft implemented and deployed.
• postMessage is a browser API designed for interframe communication10 that is implemented in Internet Explorer 8, Firefox 3, Safari 4, Google Chrome, and Opera. Although postMessage has been deployed in Opera since 2005, we demonstrate an attack on the channel’s confidentiality using frame navigation. In light of this attack, the postMessage channel provides authentication but lacks confidentiality, analogous to a channel in which senders cryptographically sign their messages. To secure the channel, we propose modifying the API. Our proposal has been adopted by the HTML 5 working group and all the major browsers.

The remainder of the paper is organized as follows. Section 2 details our threat models. Section 3 surveys existing frame navigation policies and standardizes a secure policy. Section 4 analyzes two frame communication mechanisms, demonstrates attacks, and proposes defenses. Section 5 describes related work. Section 6 concludes.

2. THREAT MODEL
In this section, we define precise threat models so that we can determine how effectively browser mechanisms defend against specific classes of attacks. We consider two kinds of attackers, a “Web attacker” and a slightly more powerful “gadget attacker.” Although phishing4, 6 can be described informally as a Web attack, we do not assume that either the Web attacker or the gadget attacker can fool the user by using a confusing domain name (such as bankofthevvest.com) or by other social engineering. Instead, we assume the user uses every browser security feature, including the location bar and lock icon, accurately and correctly.

2.1. Web attacker
A Web attacker is a malicious principal who owns one or more machines on the network. To study the browser security policy, we assume that the user’s browser renders content from the attacker’s Web site.

• Network Abilities: The Web attacker has no special network abilities. In particular, the Web attacker can send and receive network messages only from machines under his or her control, possibly acting as a client or server in network protocols of the attacker’s choice. Typically, the Web attacker uses at least one machine as an HTTP server, which we refer to as attacker.com. The Web attacker has HTTPS certificates for domains he or she owns; certificate authorities provide such certificates for free. The Web attacker’s network abilities are decidedly weaker than those of the usual network attacker considered in network security because the Web attacker can neither eavesdrop on messages to nor forge messages from other network locations. For example, a Web attacker cannot be a network “man-in-the-middle.”
• Client Abilities: We assume that the user views attacker.com in a popular browser, rendering the attacker’s content. We make this assumption because an honest user’s interaction with an honest site should be secure even if the user visits a malicious site in another browser window. The Web attacker’s content is subject to the browser’s security policy, making the Web attacker decidedly weaker than an attacker who can execute arbitrary code with the user’s privileges. For example, a Web attacker cannot install a system-wide key logger or botnet client.

We do not assume that the user treats attacker.com as a site other than attacker.com. For example, the user never gives a bank.com password to attacker.com. We also assume that honest sites are free of cross-site scripting vulnerabilities.20 In fact, none of the attacks described in this paper rely on running malicious JavaScript as an honest principal. Instead, we focus on privileges the browser itself affords the attacker to interact with honest sites.

In addition to our interest in protecting users that visit malicious sites, our assumption that the user visits attacker.com is further supported by several techniques for attracting users. For example, an attacker can place Web advertisements, host popular content with organic appeal, or send bulk e-mail encouraging visitors. Typically, simply viewing an attacker’s advertisement (such as on a search page) lets the attacker mount a Web attack. In a previous study,12 we purchased over 50,000 impressions for $30. During each of these impressions, a user’s browser rendered our content, giving us the access required to mount a Web attack.

Attacks accessible to a Web attacker have significant practical impact because these attacks do not require unusual control of the network. Web attacks can also be carried out by a standard man-in-the-middle network attacker, once the user visits a single HTTP site, because a man-in-the-middle


can inject malicious content into the HTTP response, simulating a reply from attacker.com.

2.2. Gadget attacker
A gadget attacker is a Web attacker with one additional ability: the integrator embeds a gadget of the attacker’s choice. This assumption lets us accurately evaluate mashup isolation and communication protocols because the purpose of these protocols is to let an integrator embed untrusted gadgets safely. In practice, a gadget attacker can either wait for the user to visit the integrator or can redirect the user to the integrator’s Web site from attacker.com.

3. FRAME ISOLATION
Web sites can use frames to delegate portions of their screen real estate to other Web sites. For example, a site can sell parts of its pages to advertising networks. The browser displays the location of the main, or top-level, frame in its location bar. Subframes are often visually indistinguishable from other parts of a page, and the browser does not display their location in its user interface.

3.1. Background
The browser’s scripting policy answers the question “when can one frame manipulate the contents of another frame?” The scripting policy is the most important part of the browser security policy because a frame can act on behalf of every other frame it can script. For example,

    otherWindow.document.forms[0].password.value

attempts to read the user’s password from another window. Modern Web browsers let one frame read and write all the properties of another frame only when their content was retrieved from the same origin, i.e., when the scheme (e.g., http or https), host, and port of their locations match. If the content of otherWindow was retrieved from a different origin, the browser’s security policy will prevent the script above from accessing otherWindow.document.

In addition to enforcing the scripting policy, every browser must answer the question “when is one frame permitted to navigate another frame?” Prior to 1999, all Web browsers implemented a permissive policy:

    Permissive Policy
    A frame can navigate any other frame.

For example, if otherWindow includes a frame,

    otherWindow.frames[0].location = "https://attacker.com/";

navigates the frame to https://attacker.com/. Under the permissive policy, the browser navigates otherWindow even if it contains content from another origin. There are a number of idioms for navigating frames, including

    window.open("https://attacker.com/", "frameName");

which navigates a frame named frameName. Frame names exist in a global name space that is shared across origins.

3.2. Cross-window attacks
In 1999, Georgi Guninski discovered that the permissive frame navigation policy admits serious attacks.7 At the time, the password field on the CitiBank login page was contained within a frame, and the Web attacker could navigate that frame to https://attacker.com/, letting the attacker fill the frame with identical-looking content that steals the password. This cross-window attack proceeds as follows:

1. The user views a blog that displays the attacker’s ad.
2. Separately, the user visits bank.com, which displays its password field in a frame.
3. The advertisement navigates the password frame to https://attacker.com/. The location bar remains https://bank.com and the lock icon remains present.
4. The user enters his or her bank.com password into the https://attacker.com/ frame on the bank.com page, submitting the password to attacker.com.

Of the browsers in heavy use today, Internet Explorer 6 and Safari 3 both implement the permissive policy and allow this attack. Internet Explorer 7 and Firefox 2 implement stricter policies (described in subsequent sections). Many Web sites, including Google AdSense, display their password field in a frame and are vulnerable to this attack; see Figure 1.

Figure 1: Cross-window attack. The attacker hijacks the password field, which is in a frame.

3.3. Same-window attacks
In 2001, Mozilla prevented the cross-window attack by implementing a stricter policy:

    Window Policy
    A frame can navigate only frames in its window.
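As a rough illustration of the same-origin test from Section 3.1 and of how the permissive and window policies differ, the sketch below models frames as plain objects. This is a toy model under stated assumptions, not any browser's implementation: the helpers makeFrame, sameOrigin, topFrame, permissivePolicy, and windowPolicy are hypothetical names, and blog.example, portal.example, and mail.example are placeholder hosts.

```javascript
// Illustrative model only: a "frame" is a plain object with a URL and a parent.
// A frame whose parent is null stands in for the top-level frame of a window.
function makeFrame(url, parent = null) {
  return { url, parent };
}

// Same-origin test: scheme, host, and port must all match.
// The standard URL parser reports default ports (80, 443) as "".
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

// Walk up the parent chain to the top-level frame of the containing window.
function topFrame(frame) {
  let f = frame;
  while (f.parent !== null) f = f.parent;
  return f;
}

// Permissive policy: a frame can navigate any other frame.
function permissivePolicy(active, target) {
  return true;
}

// Window policy: a frame can navigate only frames in its own window.
function windowPolicy(active, target) {
  return topFrame(active) === topFrame(target);
}

// Cross-window scenario: the attacker's ad and the bank login frame
// live in different windows.
const blog = makeFrame("https://blog.example/");
const ad = makeFrame("https://attacker.com/ad", blog);
const bank = makeFrame("https://bank.com/");
const login = makeFrame("https://bank.com:443/login", bank);

// Mashup scenario: the attacker's gadget shares a window with an honest gadget.
const portal = makeFrame("https://portal.example/");
const evilGadget = makeFrame("https://attacker.com/gadget", portal);
const mailGadget = makeFrame("https://mail.example/gadget", portal);

console.log(sameOrigin(bank.url, login.url));       // true: default port is implied
console.log(sameOrigin(ad.url, login.url));         // false: different host
console.log(permissivePolicy(ad, login));           // true: cross-window attack allowed
console.log(windowPolicy(ad, login));               // false: attack blocked
console.log(windowPolicy(evilGadget, mailGadget));  // true: a foothold in the window suffices
```

In this model the window policy rejects the ad's attempt to navigate the login frame, yet still lets one gadget navigate another inside the same window.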


This policy prevents the cross-window attack because the Web attacker does not control a frame in a trusted window and, without a foothold in the window, the attacker cannot navigate the login frame. However, the window policy is insufficiently strict to protect users because the gadget attacker does have a foothold in a trusted window in a mashup. (Recall that, in a mashup, the integrator combines gadgets from different sources into a single experience.)

• Aggregators: Gadget aggregators, such as iGoogle, My Yahoo! and Windows Live, provide one form of mashup. These sites let users customize their experience by including gadgets (such as stock tickers, weather predictions, and news feeds) on their home page. These sites put third-party gadgets in frames and rely on the browser to protect users from malicious gadgets.
• Advertisements: Web advertising produces mashups that combine first-party content, such as news articles or sports statistics, with third-party advertisements. Most advertisements, including Google AdWords, are contained in frames, both to prevent the advertisers (who provide the gadgets) from interfering with the publisher’s site and to prevent the publisher from using JavaScript to click on the advertisements.

We refer to pages with advertisements as simple mashups because the integrator and the gadgets do not communicate. Simple mashups rely on the browser to provide isolation but do not require interframe communication.

The window policy offers no protection for mashups because the integrator’s window contains untrusted gadgets. A gadget attacker who supplies a malicious gadget does control a frame in the honest integrator’s window, giving the attacker the foothold required to mount a gadget hijacking attack.14 A malicious gadget can navigate a target gadget to attacker.com and impersonate the gadget to the user. For example, iGoogle is vulnerable to gadget hijacking in browsers, such as Firefox 2, that implement the permissive or window policies; see Figure 2. Consider an iGoogle gadget that lets users access their Hotmail account. If the user is not logged into Hotmail, the gadget requests the user’s Hotmail password. A malicious gadget can replace the Hotmail gadget with identical-looking content and steal the user’s Hotmail password. As in the cross-window attack, the user is unable to distinguish the malicious password field from the honest password field.

Figure 2: Gadget hijacking. Under the window policy, the attacker gadget can navigate other gadgets. (a) Before. (b) After.

3.4. Stricter policies
Although browser vendors do not document their navigation policies, we reverse engineered the policies of existing browsers (see Table 2). In addition to the permissive and window policies, we found two other policies:

    Descendant Policy
    A frame can navigate only its descendants.

    Child Policy
    A frame can navigate only its direct children.

The Internet Explorer 6 team wanted to enable the child policy by default but shipped the permissive policy because the child policy was incompatible with a large number of Web sites. The Internet Explorer 7 team designed the descendant policy to balance the security requirement to defeat the cross-window attack with the compatibility requirement to support existing sites.18

To select a frame navigation policy that provides the best trade-off between security and compatibility, we appeal to the principle of pixel delegation. When one frame embeds a child frame, the parent frame delegates a region of the screen to the child frame. The browser prevents the child frame from drawing outside of its bounding box but does allow the parent frame to draw over the child using the position: absolute style. Frame navigation attacks hinge on the attacker escalating his or her privileges and drawing on otherwise inaccessible regions of the screen. The descendant policy is the most permissive (and therefore


Table 2: Frame navigation policies deployed in existing browsers prior to our work.

IE 6 (Default)  IE 6 (Optional)  IE 7 (Default)  IE 7 (Optional)  Firefox 2  Safari 3    Opera 9
Permissive      Child            Descendant      Permissive       Window     Permissive  Child
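The child and descendant policies of Section 3.4, together with the origin-propagation refinement that section discusses, can be sketched as predicates over a toy frame tree. This is a minimal sketch under stated assumptions, not the HTML 5 algorithm or any browser's code: makeFrame, isDescendantOf, and the three policy functions are hypothetical names, origins are compared as plain strings, and a target that shares the active frame's origin is treated as navigable, which is one reading of the sibling example in the text. Navigation of top-level windows is not modeled.

```javascript
// Illustrative model only: each frame records its origin and its parent frame.
function makeFrame(origin, parent = null) {
  return { origin, parent };
}

// True when `ancestor` appears on the parent chain of `frame`.
function isDescendantOf(frame, ancestor) {
  for (let f = frame.parent; f !== null; f = f.parent) {
    if (f === ancestor) return true;
  }
  return false;
}

// Child policy: a frame can navigate only its direct children.
function childPolicy(active, target) {
  return target.parent === active;
}

// Descendant policy, strictly construed: only descendants of the active frame.
function descendantPolicy(active, target) {
  return isDescendantOf(target, active);
}

// Descendant policy with origin propagation (assumed reading): allow the
// navigation when the target, or any frame above it, shares the active
// frame's origin, since the active frame could script that frame anyway.
function descendantPolicyWithOrigins(active, target) {
  for (let f = target; f !== null; f = f.parent) {
    if (f.origin === active.origin) return true;
  }
  return false;
}

// One integrator embeds two gadgets from a second origin; one of them nests
// a frame from a third origin. An attacker gadget sits beside them.
const integrator = makeFrame("https://integrator.example");
const gadgetA = makeFrame("https://gadgets.example", integrator);
const gadgetB = makeFrame("https://gadgets.example", integrator);
const nested = makeFrame("https://other.example", gadgetB);
const evil = makeFrame("https://attacker.com", integrator);

console.log(childPolicy(integrator, nested));               // false: grandchild
console.log(descendantPolicy(integrator, nested));          // true
console.log(descendantPolicy(gadgetA, gadgetB));            // false: sibling
console.log(descendantPolicyWithOrigins(gadgetA, gadgetB)); // true: same origin
console.log(descendantPolicyWithOrigins(gadgetA, nested));  // true: below a same-origin frame
console.log(descendantPolicyWithOrigins(evil, gadgetB));    // false: hijacking blocked
```

The refined predicate accepts everything the strict descendant policy accepts, because the active frame is itself a same-origin ancestor of each of its descendants.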

most compatible) policy that prevents the attacker from overwriting screen real estate “belonging” to another origin. Although the child policy is stricter than the descendant policy, the added strictness does not provide a significant security benefit because the attacker can simulate the visual effects of navigating a grandchild frame by drawing over the region of the screen occupied by the grandchild frame. The child policy’s added strictness does, however, reduce the policy’s compatibility with existing sites, discouraging browser vendors from deploying the child policy.

Maximizing the compatibility of the descendant policy requires taking the browser’s scripting policy into account. Consider one site that embeds two child frames from a second origin. Should one of those child frames be permitted to navigate its sibling? Strictly construed, the descendant policy forbids this navigation because the target frame is a sibling, not a descendant. However, this navigation should be allowed because an attacker can perform the navigation by injecting a script into the sibling frame that causes the frame to navigate itself. The browser lets the attacker inject this script because the two frames are from the same origin. More generally, the browser can maximize the compatibility of the descendant policy by recognizing origin propagation and letting an active frame navigate a target frame if the target frame is the descendant of a frame in the same origin as the active frame. Defined in this way, the frame navigation policy avoids creating a suborigin privilege.11 This added permissiveness does not sacrifice security because an attacker can perform the same navigations indirectly, but the refined policy is more convenient for honest Web developers.

We collaborated with the HTML 5 working group9 and standardized the descendant policy in the HTML 5 specification. The descendant policy has now been adopted by Internet Explorer 7, Firefox 3, Safari 3.1, and Google Chrome. We also reported a vulnerability in Flash Player that could be used to bypass Internet Explorer 7’s frame navigation policy. Adobe fixed this vulnerability in a security update.

4. FRAME COMMUNICATION
Unlike simple aggregators and advertisements, sophisticated mashups comprise gadgets that communicate with each other and with their integrator. For example, Yelp integrates the Google Maps gadget, illustrating the need for secure interframe communication in real deployments. Google provides two versions of its Maps gadget:

• Frame: In the frame version, the integrator embeds a frame to maps.google.com, in which Google displays a map of the specified location. The user can interact with the map, but the integrator cannot.
• Script: In the script version, the integrator embeds a script from google.com. This script creates a rich JavaScript API that the integrator can use to interact with the map, but the script runs with all of the integrator’s privileges.

Yelp, a popular review Web site, uses the Google Maps gadget to display the locations of restaurants and other businesses. Yelp requires a high degree of interactivity with the Maps gadget because it places markers on the map for each restaurant and displays the restaurant’s review when the user clicks on the marker. To deliver these advanced features, Yelp must use the script version of the Maps gadget, but this design requires Yelp to trust Google Maps completely because Google’s script runs with Yelp’s privileges, granting Google the ability to manipulate Yelp’s reviews and steal Yelp’s customers’ information. Although Google might be trustworthy, the script approach does not scale beyond highly respected gadget providers. Secure interframe communication promises the best of both alternatives: sites with functionality like Yelp can realize the interactivity of the script version of the Google Maps gadget while maintaining the security of the frame version of the gadget.

4.1. Fragment identifier messaging
Although the browser’s scripting policy isolates frames from different origins, clever mashup designers have discovered an unintended channel between frames, fragment identifier messaging,1, 21 which is regulated by the browser’s less-restrictive frame navigation policy. This “found” technology lets mashup developers place each gadget in a separate frame and rely on the browser’s security policy to prevent malicious gadgets from attacking the integrator and honest gadgets. We analyze fragment identifier messaging in use prior to our analysis and propose improvements that have since been adopted.

Mechanism: Normally, when a frame is navigated to a new URL, the browser requests the URL from the network and replaces the frame’s document with the retrieved content. However, if the new URL matches the old URL everywhere except in the fragment (the part after the #), then the browser does not reload the frame. If frames[0] is currently located at http://example.com/doc,

    frames[0].location = "http://example.com/doc#msg";

changes the frame’s location without reloading the frame or destroying its JavaScript context. The frame can read its fragment by polling window.location.hash to see if the fragment has changed. This technique can be used to send messages between frames while avoiding network latency.

Security Properties: The fragment identifier channel has a