COMMuNICATIONS CACM.ACM.ORG OF THE ACM 11/2010 VOL.53 NO.11

using complexity to Protect elections

Association for Computing Machinery CALL FOR PARTICIPATION CTSCTS 20112011 Philadelphia, Pennsylvania, USA

The 2011 International Conference on Collaboration Technologies and Systems

May 23 – 27, 2011 The Sheraton University City Hotel Philadelphia, Pennsylvania, USA

Important Dates: Paper Submission Deadline ------December 15, 2010 Workshop/Special Session Proposal Deadline ------November 15, 2010 Tutorial/Demo/Panel Proposal Deadline ------January 8, 2011 Notification of Acceptance ------February 1, 2011 Final Papers Due ------March 1, 2011

For more information, visit the CTS 2011 web site at: http://cts2011.cisedu.info/

In cooperation with the ACM, IEEE, IFIP ABCD springer.com

Springer References & Key Library Titles

Handbook of Handbook Encyclopedia of Ambient of Natural Machine Intelligence Computing Learning and Smart G. Rozenberg, T. Bäck, C. Sammut, G. I. Webb Environments J. N. Kok, Leiden Univer- (Eds.) sity, The Netherlands H. Nakashima, Future The first reference work (Eds.) University, Hakodate, on Machine Learning Hokkaido, Japan; We are now witnessing Comprehensive A-Z H. Aghajan, Stanford University, Stanford, an exciting interaction between computer coverage of this complex subject area makes CA, USA; J. C. Augusto, University of Ulster at science and the natural sciences. Natural this work easily accessible to professionals and Jordanstown, Newtownabbey, UK (Eds.) Computing is an important catalyst for this researchers in all fields who are interested in a interaction, and this handbook is a major particular aspect of Machine LearningTargeted Provides readers with comprehensive, up-to- record of this important development. literature references provide additional value date coverage in this emerging field. Organizes for researchers looking to study a topic in more all major concepts, theories, methodologies, 2011. Approx. 1700 p. (In 3 volumes, not available detail. trends and challenges into a coherent, unified seperately) Hardcover repository. Covers a wide range of applications ISBN 978-3-540-92909-3 7 $749.00 2010. 800 p. Hardcover relevant to both ambient intelligence and eReference ISBN 978-0-387-30768-8 7 approx. $549.00 smart environments. Examines case studies ISBN 978-3-540-92910-9 7 $749.00 eReference of recent major projects to present the reader 2010. 800 p. with a global perspective of actual develop- Print + eReference ISBN 978-0-387-30164-8 7 approx. $549.00 ments. 2011. Approx. 1700 p. ISBN 978-3-540-92911-6 7 $939.00 Print + eReference 2010. XVIII, 1294 p. 100 illus. Hardcover 2010. 800 p. ISBN 978-0-387-93807-3 7 $229.00 Handbook of ISBN 978-0-387-34558-1 7 approx. $689.00 Biomedical Handbook of Handbook of Imaging Peer-to-Peer Multimedia N. Paragios, École for Digital Centrale de Paris, France; Networking J. Duncan, Yale Univer- X. Shen, University of Entertainment sity, USA; N. Ayache, Waterloo, ON, Canada; and Arts INRIA, France (Eds.) H. Yu, Huawei Technolo- B. Furht, Florida Atlantic This book offers a unique guide to the entire gies, Bridgewater, NJ, University, Boca Raton, chain of biomedical imaging, explaining how USA; J. Buford, Avaya FL, USA (Ed.) image formation is done, and how the most Labs Research, Basking Ridge, NJ, USA; M. Akon, University of Waterloo, ON, The first comprehensive handbook to cover appropriate algorithms are used to address Canada (Eds.) recent research and technical trends in the field demands and diagnoses. It is an exceptional of digital entertainment and art. Includes an tool for radiologists, research scientists, senior Offers elaborate discussions on fundamentals outline for future research directions within undergraduate and graduate students in health of peer-to-peer computing model, networks this explosive field. The main focus targets sciences and engineering, and university and applications. Provides a comprehensive interactive and online games, edutainment, professors. study on recent advancements, crucial design e-performance, personal broadcasting, innova- choices, open problems, and possible solution tive technologies for digital arts, digital visual 2010. Approx. 590 p. Hardcover strategies. Written by a team of leading and other advanced topics. ISBN 978-0-387-09748-0 7 approx. $179.00 international researchers and professionals.

2009. XVI, 769 p. 300 illus., 150 in color. Hardcover 2010. XLVIII, 1500 p. Hardcover ISBN 978-0-387-89023-4 7 $199.00 ISBN 978-0-387-09750-3 7 $249.00

Easy Ways to Order for the Americas 7 Write: Springer Order Department, PO Box 2485, Secaucus, NJ 07096-2485, USA 7 Call: (toll free) 1-800-SPRINGER 7 Fax: 1-201-348-4505 7 Email: [email protected] or for outside the Americas 7 Write: Springer Customer Service Center GmbH, Haberstrasse 7, 69126 Heidelberg, Germany 7 Call: +49 (0) 6221-345-4301 7 Fax : +49 (0) 6221-345-4229 7 Email: [email protected] 7 Prices are subject to change without notice. All prices are net prices. 014712x communications of the acm

Departments News Viewpoints

5 Editor’s Letter 24 Economic and Business Dimensions On P, NP, and The Divergent Online Computational Complexity News Preferences of By Moshe Y. Vardi Journalists and Readers Reading between the lines of the 6 Letters To The Editor thematic gap between the supply How to Think About Objects and demand of online news. By Pablo J. Boczkowski 9 In the Virtual Extension 27 Education 10 BLOG@CACM K–12 Computational Learning Rethinking the Systems Enhancing student learning and Review Process understanding by combining Tessa Lau launches a discussion theories of learning with the about the acceptance criteria computer’s unique attributes. for HCI systems papers at CHI, By Stephen Cooper, Lance C. Pérez, UIST, and other conferences. 13 Turning Data Into Knowledge and Daphne Rainey Today’s data deluge is leading to 12 CACM Online new approaches to visualize, analyze, 30 Legally Speaking A Preference for PDF and catalog enormous datasets. Why Do Software Startups By David Roman By Gregory Goth Patent (or Not)? Assessing the controversial results 31 Calendar 16 Security in the Cloud of a recent empirical study Cloud computing offers many of the role of intellectual property 103 Careers advantages, but also involves security in software startups. risks. Fortunately, researchers are By Pamela Samuelson devising some ingenious solutions. Last Byte By Gary Anthes 33 Privacy and Security Why Isn’t Cyberspace More Secure? 112 Puzzled 19 Career Opportunities Evaluating governmental Rectangles Galore What are the job prospects for actions—and inactions—toward By Peter Winkler today’s—and tomorrow’s— improving cyber security and graduates? addressing future challenges. By Leah Hoffmann By Joel F. Brenner

23 Wide Open Spaces 36 Viewpoint The U.S. Federal Communications Sensor Networks for the Sciences Commission’s decision to open Lessons from the field derived frequencies in the broadcast from developing wireless sensor spectrum could enable broadband networks for monitoring active networks in rural areas, permit smart and hazardous volcanoes. electric grids, and more. By Matt Welsh By Neil Savage on In Support of i Teachers and the CSTA orporat C A call to action to clarify the computer science SST definition to K–12 students. By Duncan Buell Association for Computing Machinery Advancing Computing as a Science & Profession Image courtesy of L

2 communications of the acm | november 2010 | vol. 53 | no. 11 11/2010 vol. 53 no. 11

Practice Contributed Articles Review Articles

74 Using Complexity to Protect Elections Computational complexity may truly be the shield against election manipulation. By Piotr Faliszewski, Edith Hemaspaandra, and Lane A. Hemaspaandra

Research Highlights

84 Technical Perspective Data Races are Evil with No Exceptions 47 By Sarita Adve

42 The Case Against Data Lock-in 58 Understanding Throughput-Oriented 85 Goldilocks: A Race-Aware Want to keep your users? Architectures Java Runtime Just make it easy for them to leave. For workloads with abundant By Tayfun Elmas, Shaz Qadeer, By Brian W. Fitzpatrick and JJ Lueck parallelism, GPUs deliver higher and Serdar Tasiran peak computational throughput 47 Keeping Bits Safe: than latency-oriented CPUs. 93 FastTrack: Efficient and Precise How Hard Can It Be? By Michael Garland and David B. Kirk Dynamic Race Detection As storage systems grow larger By Cormac Flanagan and larger, protecting their data 67 Regulating the Information and Stephen N. Freund for long-term storage is becoming Gatekeepers ever more challenging. Concerns about biased By David S.H. Rosenthal manipulation of search results may require intervention involving 56 Sir, Please Step Away government regulation. from the ASR-33! By Patrick Vogl and Michael Barrett To move forward with programming languages we must first break free Relative Status of Journal from the tyranny of ASCII. and Conference Publications By Poul-Henning Kamp in Computer Science Citations represent a trustworthy Articles’ development led by measure of CS research quality— queue.acm.org whether in articles in conference proceedings or in CS journals. By Jill Freyne, Lorcan Coyle,

Barry Smyth, and Padraig Cunningham About the Cover: Vulnerabilities in the election process Supporting Ubiquitous Location have been around for Information in Interworking centuries. This month’s D cover story, beginning 3G and Wireless Networks on p. 74, examines the Users and wireless ISPs could tap use of computational RAMPERSA

complexity as an effective location-based services across shield against election TARAN

manipulation. Artist

Y networks to create ubiquitous Melvin Galapon picks up H B personal networks. that theme by colorfully depicting a protective By Massimo Ficco, Roberto layer of interference OTOGRAP H

P Pietrantuono, and Stefano Russo hovering over the election practice.

november 2010 | vol. 53 | no. 11 | communications of the acm 3 communications of the acm Trusted insights for computing’s leading professionals.

Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields. Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional. Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology, and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications, public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts, sciences, and applications of information technology.

ACM, the world’s largest educational STAFF editorial Board and scientific computing society, delivers resources that advance computing as a Director of Group Publishing Editor-in-chief science and profession. ACM provides the Scott E. Delman Moshe Y. Vardi ACM Copyright Notice computing field’s premier Digital Library [email protected] [email protected] Copyright © 2010 by Association for and serves its members and the computing Executive Editor News Computing Machinery, Inc. (ACM). profession with leading-edge publications, Diane Crawford Co-chairs Permission to make digital or hard copies conferences, and career resources. Managing Editor Marc Najork and Prabhakar Raghavan of part or all of this work for personal Thomas E. Lambert Board Members or classroom use is granted without Executive Director and CEO Senior Editor Brian Bershad; Hsiao-Wuen Hon; fee provided that copies are not made John White Andrew Rosenbloom Mei Kobayashi; Rajeev Rastogi; or distributed for profit or commercial Deputy Executive Director and COO Senior Editor/News Jeannette Wing advantage and that copies bear this Patricia Ryan Jack Rosenberger notice and full citation on the first Director, Office of Information Systems Web Editor Viewpoints page. Copyright for components of this Wayne Graves David Roman Co-chairs work owned by others than ACM must Director, Office of Financial Services Editorial Assistant Susanne E. Hambrusch; John Leslie King; be honored. Abstracting with credit is Russell Harris Zarina Strakhan J Strother Moore permitted. To copy otherwise, to republish, Director, Office of Membership Rights and Permissions Board Members to post on servers, or to redistribute to Lillian Israel Deborah Cotton P. Anandan; William Aspray; lists, requires prior specific permission Director, Office of SIG Services Stefan Bechtold; Judith Bishop; and/or fee. Request permission to publish Donna Cappo Art Director Stuart I. Feldman; Peter Freeman; from [email protected] or fax Director, Office of Publications Andrij Borys Seymour Goodman; Shane Greenstein; (212) 869-0481. Bernard Rous Associate Art Director Mark Guzdial; Richard Heeks; Director, Office of Group Publishing Alicia Kubista Rachelle Hollander; Richard Ladner; For other copying of articles that carry a Scott Delman Assistant Art Director Susan Landau; Carlos Jose Pereira de Lucena; code at the bottom of the first or last page Mia Angelica Balaquiot Beng Chin Ooi; Loren Terveen or screen display, copying is permitted ACM Council Production Manager provided that the per-copy fee indicated President Lynn D’Addesio Practice in the code is paid through the Copyright Wendy Hall Director of Media Sales Chair Clearance Center; www.copyright.com. Vice-President Jennifer Ruzicka Stephen Bourne Alain Chesnais Marketing & Communications Manager Board Members Subscriptions Secretary/Treasurer Brian Hebert Eric Allman; Charles Beeler; David J. Brown; An annual subscription cost is included Barbara Ryder Public Relations Coordinator Bryan Cantrill; Terry Coatta; Mark Compton; in ACM member dues of $99 ($40 of Past President Virgina Gold Stuart Feldman; Benjamin Fried; which is allocated to a subscription to Stuart I. Feldman Publications Assistant Pat Hanrahan; Marshall Kirk McKusick; Communications); for students, cost Chair, SGB Board Emily Eng George Neville-Neil; Theo Schlossnagle; is included in $42 dues ($20 of which Alexander Wolf Jim Waldo is allocated to a Communications Co-Chairs, Publications Board Columnists subscription). A nonmember annual The Practice section of the CACM Ronald Boisvert and Jack Davidson Alok Aggarwal; Phillip G. Armour; subscription is $100. Editorial Board also serves as Members-at-Large Martin Campbell-Kelly; the Editorial Board of . Carlo Ghezzi; Michael Cusumano; Peter J. Denning; ACM Media Advertising Policy Shane Greenstein; Mark Guzdial; Anthony Joseph; Contributed Articles Communications of the ACM and other Peter Harsha; Leah Hoffmann; Mathai Joseph; Co-chairs ACM Media publications accept advertising Mari Sako; Pamela Samuelson; Kelly Lyons; Al Aho and Georg Gottlob in both print and electronic formats. All Gene Spafford; Cameron Wilson Bruce Maggs; Board Members advertising in ACM Media publications is Mary Lou Soffa; Yannis Bakos; Elisa Bertino; Gilles at the discretion of ACM and is intended Contact Points Fei-Yue Wang Brassard; Alan Bundy; Peter Buneman; to provide financial support for the various Copyright permission SGB Council Representatives Andrew Chien; Anja Feldmann; activities and services for ACM members. [email protected] Joseph A. Konstan; Blake Ives; James Larus; Igor Markov; Current Advertising Rates can be found Calendar items Robert A. Walker; Gail C. Murphy; Shree Nayar; Lionel M. Ni; by visiting http://www.acm-media.org or [email protected] Jack Davidson Sriram Rajamani; Jennifer Rexford; by contacting ACM Media Sales at Change of address Marie-Christine Rousset; Avi Rubin; (212) 626-0654. Publications Board [email protected] Co-Chairs Fred B. Schneider; Abigail Sellen; Letters to the Editor Single Copies Ronald F. Boisvert; Jack Davidson Ron Shamir; Marc Snir; Larry Snyder; [email protected] Single copies of Communications of the Board Members Manuela Veloso; Michael Vitale; Wolfgang Wahlster; Andy Chi-Chih Yao; ACM are available for purchase. Please Nikil Dutt; Carol Hutchins; Web SITE Willy Zwaenepoel contact [email protected]. Joseph A. Konstan; Ee-Peng Lim; http://cacm.acm.org Catherine McGeoch; M. Tamer Ozsu; Research Highlights Communications of the ACM Holly Rushmeier; Vincent Shen; Author Guidelines Co-chairs (ISSN 0001-0782) is published monthly Mary Lou Soffa; Ricardo Baeza-Yates http://cacm.acm.org/guidelines David A. Patterson and Stuart J. Russell by ACM Media, 2 Penn Plaza, Suite 701, ACM U.S. Public Policy Office Board Members New York, NY 10121-0701. Periodicals Cameron Wilson, Director Advertising Martin Abadi; Stuart K. Card; Jon Crowcroft; postage paid at New York, NY 10001, 1828 L Street, N.W., Suite 800 Deborah Estrin; Shafi Goldwasser; and other mailing offices. ACM Advertising Department Washington, DC 20036 USA Monika Henzinger; Maurice Herlihy; 2 Penn Plaza, Suite 701, New York, NY T (202) 659-9711; F (202) 667-1066 Dan Huttenlocher; Norm Jouppi; POSTMASTER 10121-0701 Andrew B. Kahng; Gregory Morrisett; Please send address changes to Computer Science Teachers Association T (212) 869-7440 Michael Reiter; Mendel Rosenblum; Communications of the ACM Chris Stephenson F (212) 869-0481 Ronitt Rubinfeld; David Salesin; 2 Penn Plaza, Suite 701 Executive Director Lawrence K. Saul; Guy Steele, Jr.; Director of Media Sales New York, NY 10121-0701 USA 2 Penn Plaza, Suite 701 Madhu Sudan; Gerhard Weikum; Jennifer Ruzicka New York, NY 10121-0701 USA Alexander L. Wolf; Margaret H. Wright [email protected] T (800) 401-1799; F (541) 687-1840 Web Media Kit [email protected] Association for Computing Machinery Co-chairs (ACM) James Landay and Greg Linden

E R E C S Y 2 Penn Plaza, Suite 701 Board Members A C E L L E New York, NY 10121-0701 USA Gene Golovchinsky; Jason I. Hong; P

T E H T (212) 869-7440; F (212) 869-0481 Jeff Johnson; Wendy E. MacKay Printed in the U.S.A. N I I S Z M A G A

4 communications of the acm | november 2010 | vol. 53 | no. 11 editor’s letter

DOI:10.1145/1839676.1839677 Moshe Y. Vardi On P, NP, and Computational Complexity The second week of August was an exciting week. On Friday, August 6, Vinay Deolalikar announced a claimed proof that P ≠ NP. Slashdotted blogs broke the news on

August 7 and 8, and suddenly the whole large degree of the polynomial or a very with a few hundred variables that can- world was paying attention. Richard large multiplicative constant; after all, not be solved by any extant SAT solver. Lipton’s August 15 blog entry at blog@ (10n)1000 is a polynomial! Similarly, it is “So what?’’ shrugs the practitioner, CACM was viewed by about 10,000 conceivable that P ≠ NP, but NP prob- “these are artificial problems.” Some- readers within a week. Hundreds of lems can be solved by algorithms with how, industrial SAT instances are computer scientists and mathemati- running time bounded by nlog log log n—a quite amenable to current SAT-solv- cians, in a massive Web-enabled col- bound that is not polynomial but in- ing technology, but we have no good laborative effort, dissected the proof in credibly well behaved. theory to explain this phenomenon. an intense attempt to verify its validity. Even more significant, I believe, is There is a branch of complexity theory By the time the New York Times pub- the fact that computational complex- that studies average-case complexity, lished an article on the topic on August ity theory sheds limited light on be- but this study also seems to shed little 16, major gaps had been identified, and havior of algorithms in the real world. light on practical SAT solving. How to the excitement was starting to subside. Take, for example, the Boolean Satisfi- design good algorithms is one of the The P vs. NP problem withstood anoth- ability Problem (SAT), which is the ca- most fundamental questions in com- er challenge and remained wide open. nonical NP-complete problem. When puter science, but complexity theory During and following that exciting I was a graduate student, SAT was a offers only very limited guidelines for week many people have asked me to “scary” problem, not to be touched algorithm design. explain the problem and why it is so with a 10-foot pole. Garey and John- An old cliché asks what the differ- important to computer science. “If ev- son’s classical textbook showed a long ence is between theory and practice, eryone believes that P is different than sad line of programmers who have and answers that “in theory, they are NP,” I was asked, “why it is so impor- failed to solve NP-complete problems. not that different, but in practice, they tant to prove the claim?’’ The answer, Guess what? These programmers are quite different.” This seems to ap- of course, is that believing is not the have been busy! The August 2009 is- ply to the theory and practice of SAT same as knowing. The conventional sue of Communications contained an and similar problems. My point here “wisdom’’ can be wrong. While our article by Sharad Malik and Lintao is not to criticize complexity theory. It intuition does tell us that finding solu- Zhang (p. 76) in which they described is a beautiful theory that has yielded tions ought to be more difficult than SAT’s journey from theoretical hard- deep insights over the last 50 years, checking solutions, which is what the ness to practical success. Today’s SAT as well as posed fundamental, tan- P vs. NP problem is about, intuition solvers, which enjoy wide industrial talizing problems, such as the P vs. can be a poor guide to the truth. Case usage, routinely solve SAT instances NP problem. But an important role in point: modern physics. with over one million variables. How of theory is to shed light on practice, While the P vs. NP quandary is a can a scary NP-complete problem be and there we have large gaps. We central problem in computer science, so easy? What is going on? need, I believe, a richer and broader we must remember that a resolution of The answer is that one must read complexity theory, a theory that would the problem may have limited practi- complexity-theoretic claims carefully. explain both the difficulty and the cal impact. It is conceivable that P = NP, Classical NP-completeness theory is easiness of problems like SAT. More but the polynomial-time algorithms about worst-case complexity. theory, please! yielded by a proof of the equality are Indeed, SAT does seem hard in the completely impractical, due to a very worst case. There are SAT instances Moshe Y. Vardi, editor-in-chief

november 2010 | vol. 53 | no. 11 | communications of the acm 5 letters to the editor

DOI:10.1145/1839676.1839678 How to Think About Objects

hough I agree with Morde- not supported by empirical evidence. the more theoretical areas of CS, lead chai Ben-Ari’s Viewpoint We take issue with both his claim and us to conclude that today’s reviews, “Objects Never? Well, Hard- (perhaps ironically) the evidence he though demanding and sometimes ly Ever!” (Sept. 2010) saying used to support it. disappointing, are not “radically em- that students should be in- Génova rhetorically asked “Must all pirical.” To the extent a problem ex- Ttroduced to procedural programming scientific works be reasoned and de- ists in the CS review process, it is due before object-oriented programming, monstrable?,” answering emphatical- to “hypercriticality,” as Moshe Y. Vardi dismissing OOP could mean throwing ly, “Yes, of course,” to which we whole- said in his “Editor’s Letter” (July 2010, out the baby with the bathwater. heartedly agree. Broadly, there are two p. 5), not “radical empiricism.” OOP was still in the depths of the re- ways to achieve this goal: inference Earl Barr and Christian Bird, search labs when I was earning my col- and deduction. Responding to the davis, CA lege degrees. I was not exposed to it for letter to the editor by Joseph G. Davis the first few years of my career, but it in- “No Straw Man in Empirical Research” trigued me, so I began to learn it on my (Sept. 2010, p. 7), Génova said theo- Author’s Response: own. The adjustment from procedural retical research rests on definition and I’m glad to hear from Barr and Bird that programming to OOP wasn’t just a mat- proof, not on evidence. Nonetheless, there are healthy subfields in CS in this ter of learning a few new language con- he appeared to be conflating inference respect. I used “inductive justification” structs. It required a new way of thinking and deduction in his argument that to support the claim that many classical about problems and their solutions. seminal past research would be unac- works in the field are more theoretical That learning process has continued. ceptable today. Many of the famous and speculative than experimental, not to The opportunity to learn elegant new computer scientists he cited to sup- support an argument that CS suffers today techniques for solving difficult prob- port this assertion—Turing, Shannon, from “radical empiricism.” Investigating the lems is precisely why I love the field. But Knuth, Hoare, Dijkstra—worked (and latter through exhaustive empirical surveys OOP is not the perfect solution, just one proved their findings) largely in the of reviews would require surveyors being tool in the software engineer’s toolbox. more-theoretical side of CS. Even a cur- able to classify a reviewer as a “radical If it were the only tool, we would run the sory reading of the latest Proceedings of empiricist.” If my column served this risk of repeating psychologist Abraham the Symposium on Discrete Algorithms purpose, then I am content with it. Maslow’s warning that if the only tool or Proceedings of Foundations of Com- Gonzalo Génova, Madrid, Spain you have is a hammer, every problem puter Science turns up many theoreti- tends to look like a nail. cal papers with little or no empirical Learning any new software tech- content. The work of other pioneers Conclude with the Conclusions nique—procedural programming, Génova cited, including Meyer and The Kode Vicious Viewpoint “Present- OOP, or simply what’s next—takes Gamma, might have required more ing Your Project” by George V. Neville- time, patience, and missteps. I have empirical evidence if presented today. Neil (Aug. 2010) made several debat- made plenty myself learning OOP, as Génova implied their work would not able points about presentations, one of well as other technologies, and contin- be accepted, and we would therefore which was inexcusable: “…I always end ue to learn from and improve because be unable to benefit from it. The fact with a Questions slide.” of them. that they met the requirements of their You have just given a 25-minute For his next sabbatical, Ben-Ari time but (arguably) not of ours does technical presentation to an educat- might consider stepping back into the not mean they would not have risen to ed, knowledgeable, technical audi- industrial world for a year or two. We’ve the occasion had the bar been set high- ence. Using a series of slides, you have learned a great deal about OOP since er. We suspect they would have, and CS explained your problem, described he left for academia 15 years ago. would be none the poorer for it. your solutions, discussed your experi- Jim Humelsine, Neptune, NJ Génova’s suggestion that CS suf- ments, and finally concluded, display- fers today from “radical empiricism” ing each slide for a minute or two. is an empirical, not deductive, claim Your penultimate slide summarizes Evaluating Research: that can be investigated through sur- the whole presentation, including its Hypercriticality vs. veys and reviews. Still, he supported “takeaway” message—everything you Radical Empiricism it via what he called “inductive justifi- want your listeners to remember. Now In his Viewpoint “Is Computer Science cation,” which sounds to us like argu- you expect to spend four or five min- Truly Scientific?” (July 2010), Gon- ment by anecdote. Using the same in- utes answering questions. The slide zalo Génova suggested that computer ductive approach, conversations with you show as you answer will be on science suffers from “radical empiri- our colleagues here at the University screen two or three times longer than cism,” leading to rejection of research of California, Davis, especially those in any other slide.

6 communications of the acm | november 2010 | vol. 53 | no. 11 letters to the editor

So why remove the most useful slide ing at the 2010 O’Reilly Open Source laboratory at age three. He was born in in the whole presentation—the sum- Convention (http://www.oscon.com/ 1934, and the University of Cambridge mary—and replace it with a content- oscon2010/public/schedule/detail/15255), Computer Laboratory was founded (as free alternative showing perhaps a word David Whiles, CIO of Midland Memo- the Mathematical Laboratory) in 1937. or two. Is your audience so dense it can- rial Hospital, Midland, TX, described The source of the error is apparently not hear you say “Thank you” or ask for his hospital’s deployment of VistA and Milner’s official obituary, which noted questions unless they’re on the screen? how it has since seen a reduction in he “held the first established Chair in Do you think the audience will forget to mortality rates of about two per month, Computer Science at the University of say something? Or is the problem with as well as a dramatic 88% decrease in Cambridge.” Translation to American: you, the presenter? Would you yourself central-line infections entering at cath- He held the first endowed professor- forget to ask for questions if the slide eter sites (http://www.youtube.com/ ship. In Britain, the word “Chair” refers wasn’t on the screen in front of you? watch?v=ExoF_Tq14WY). Mean- to a professorship and should not be Technical presentations should be while, the country of Jordan (http://ehs. confused with “Head of Department.” held to a higher standard of informa- com.jo) is piloting an open source soft- Lawrence C. Paulson, tion content and knowledge transfer ware stack deployment of VistA to pro- Cambridge, England than a sales pitch. My advice: Remove vide electronic health records within the “Thank You” and “Questions” its national public health care system. slides, and leave up your “Conclusions” [In the interests of full disclosure, Correction and “Summary” as long as possible. I am an active member of the global Tom Geller’s news story “Beyond the Michael Wolfe, Hillsboro, OR VistA community, co-founding World- Smart Grid” (June 2010) should have VistA in 2002 (http://worldvista.org), a cited Froehlich, J. Larson, E., Camp- 501(c)(3) promoting affordable health bell, T., Haggerty, C., Fogarty, J., and For Electronic Health Records, care IT through VistA. Though now re- Patel, S. “HydroSense: Infrastructure- Don’t Ignore VistA tired from an official role, I previously Mediated Single-Point Sensing of Why did Stephen V. Cantrill’s article served as a WorldVistA Director.] Whole-Home Water Activity in Proceed- “Computers in Patient Care: The Prom- K.S. Bhaskar, Malvern, PA ings of UbiComp 2009 (Orlando, FL, ise and the Challenge” (Sept. 2010) say Sept. 30–Oct. 3, 2009) instead of Patel, nothing about the Veterans Health S.N., Reynolds, M.S., and Abowd, G.D. Information Systems and Technology Author’s Response: “Detecting Human Movement By Dif- Architecture (VistA) used for decades I appreciate Bhaskar’s comments about ferential Air Pressure Sensing in HVAC throughout the U.S. Department of the VA’s VistA medical information system System Ductwork: An Exploration in Veterans Affairs (VA) medical system and applaud his efforts to generate a Infrastructure-Mediated Sensing” in for its patients’ electronic medical re- workable system in the public domain, but Proceedings of Pervasive 2008, the Sixth cords? With 153 medical centers and he misunderstood the intent of my article. It International Conference on Pervasive 1,400 points of care, the VA in 2008 de- was not to be a comparison of good vs. bad Computing (Sydney, Australia, May 19– livered care to 5.5 million people, reg- or best vs. worst, but rather a discussion 22, 2008). We apologize for this error. istering 60 million visits (http://www1. of many of the endemic issues that have plagued developers in the field since the va.gov/opa/publications/factsheets/ Communications welcomes your opinion. To submit a fs_department_of_veterans_affairs.pdf). 1960s. For example, MUMPS, the language Letter to the Editor, please limit your comments to 500 In his book The Best Care Anywhere on which VistA is based, was developed in words or less and send to [email protected]. (http://p3books.com/bestcareanywhere) the early 1970s for medical applications; © 2010 ACM 0001-0782/10/1100 $10.00 Phillip Longman documented VistA’s VistA achieved general distribution in the role in delivering care with better out- VA in the late 1990s, almost 30 years later. comes than national averages to a pop- Why so long? I tried to address some of Coming Next Month in ulation less healthy than national aver- these issues in the article. Also, VistA does Communications ages at a cost that has risen more slowly not represent an integrated approach, but than national averages. Included was a rather an interfaced approach with several The Impact of long list of references (more than 100 proprietary subsystems. Web 2.0 Technologies in the 2010 second edition), yet Cantrill Stephen V. Cantrill, M.D., Denver wrote “Although grand claims are of- Bayesian Networks ten made about the potential improve- ments in the quality of care, decreases Correction CS Education Week in cost, and so on, these are very diffi- The tribute “Robin Milner: The Elegant cult to demonstrate in a rigorous, sci- Pragmatist” by Leah Hoffmann (June Why We Need a Research entific fashion.” 2010) requires a small correction. It Data Census Public-domain VistA also general- said Milner “served as the first chair izes well outside the VA. For example, of the University of Cambridge Com- As well as the latest news on topic it has been deployed in the U.S. Indian puter Laboratory.” For all his many modeling, eye tracking and mobile phones, and crowdsourcing for Health Service, with additional func- gifts, however, Milner was not preco- social good. tionality, including pediatrics. Speak- cious enough to have run a university

november 2010 | vol. 53 | no. 10 | communications of the acm 7

Previous A.M. Recipients

1966 A.J. Perlis 1967 Maurice Wilkes 1968 R.W. Hamming 1969 Marvin Minsky 1970 J.H. Wilkinson 1971 John McCarthy 1972 E.W. Dijkstra 1973 Charles Bachman 1974 Donald Knuth 1975 Allen Newell 1975 Herbert Simon 1976 Michael Rabin 1976 Dana Scott 1977 John Backus 1978 Robert Floyd 1979 Kenneth Iverson ACM A.M. TURING AWARD 1980 C.A.R Hoare 1981 Edgar Codd NOMINATIONS SOLICITED 1982 Stephen Cook 1983 Ken Thompson 1983 Dennis Ritchie Nominations are invited for the 2010 ACM A.M. Turing Award. This, ACM’s 1984 Niklaus Wirth oldest and most prestigious award, is presented for contributions of a technical 1985 Richard Karp 1986 John Hopcroft nature to the computing com¬munity. Although the long-term influences 1986 Robert Tarjan of the nominee’s work are taken into consideration, there should be a particular 1987 John Cocke 1988 Ivan Sutherland outstanding and trendsetting technical achievement that constitutes the principal 1989 William Kahan claim to the award. The award carries a prize of $250,000 and the recipient is 1990 Fernando Corbató 1991 Robin Milner expected to present an address that will be published in an ACM journal. Financial 1992 Butler Lampson support of the Turing Award is provided by the Intel Corporation and Google Inc. 1993 Juris Hartmanis 1993 Richard Stearns Nominations should include: 1994 Edward Feigenbaum 1994 Raj Reddy 1) A curriculum vitae, listing publications, patents, honors, other awards, etc. 1995 Manuel Blum 1996 Amir Pnueli 2) A letter from the principal nominator, which describes the work of the nominee, 1997 Douglas Engelbart 1998 James Gray and draws particular attention to the contribution which is seen as meriting the 1999 Frederick Brooks award. 2000 Andrew Yao 2001 Ole-Johan Dahl 3) Supporting letters from at least three endorsers. The letters should not all be 2001 Kristen Nygaard 2002 Leonard Adleman from colleagues or co-workers who are closely associated with the nominee, 2002 Ronald Rivest and preferably should come from individuals at more than one organization. 2002 Adi Shamir 2003 Alan Kay Successful Turing Award nominations usually include substantive letters of 2004 Vinton Cerf support from a group of prominent individuals broadly representative of the 2004 Robert Kahn 2005 Peter Naur candidate’s field. 2006 Frances E. Allen 2007 Edmund Clarke For additional information on ACM’s award program 2007 E. Allen Emerson please visit: www.acm.org/awards/ 2007 Joseph Sifakis 2008 Barbara Liskov Additional information on the past recipients 2009 Charles P. Thacker of the A.M. Turing Award is available on: Additional information http://awards.acm.org/homepage.cfm?awd=140 on the past recipients of the A.M. Turing Award Nominations should be sent electronically is available on: http:// awards.acm.org/home­ by November 30, 2010 to: page.cfm?awd=140 Vinton G. Cerf, c/o [email protected] in the virtual extension

DOI:10.1145/1839676.1839679 In the Virtual Extension To ensure the timely publication of articles, Communications created the Virtual Extension (VE) to expand the page limitations of the print edition by bringing readers the same high-quality articles in an online-only format. VE articles undergo the same rigorous review process as those in the print edition and are accepted for publication on merit. The following synopses are from articles now available in their entirety to ACM members via the Digital Library. contributed article contributed article viewpoint DOI: 10.1145/1839676.1839702 DOI: 10.1145/1839676.1839701 DOI: 10.1145/1839676.1839703 Supporting Ubiquitous Location Relative Status of Journal In Support of Computer Science Information in Interworking and Conference Publications Teachers and the CSTA 3G and Wireless Networks in Computer Science Duncan Buell Massimo Ficco, Roberto Pietrantuono, Jill Freyne, Lorcan Coyle, Barry Smyth, A number of recent articles and comments and Stefano Russo and Padraig Cunningham have discussed the imbalance between Users and wireless ISPs could tap location- Citations represent a trustworthy measure enrollment and opportunities in computer based services across networks belonging of CS research quality—whether in articles science and the under-enrollments by to other ISPs to create ubiquitous personal in conference proceedings or in CS journals. minorities and women. An ongoing thread networks. Though computer scientists agree that in Peter Denning’s Communications Location-based services have conference publications enjoy greater columns and elsewhere concerns the emerged as a main interest of wireless status in CS than in other disciplines, there identity of the discipline to which we ISPs, or WISPs, and network operators. is little quantitative evidence to support this belong. As the national representative Positioning mobile devices in the third view. The importance of journal publication from universities to the board of the generation (3G) of wireless communication in academic promotion makes it a highly Computer Science Teachers Association networks (such as the Universal Mobile personal issue, since focusing exclusively (CSTA), I continually see the question of Telecommunications System) is crucial on journal papers misses many significant the identity of our discipline both within to many commercial services, including papers published by CS conferences. and external to our field. location applications that utilize accurate This article aims to quantify the relative The identity of computer science is positioning (such as handset navigation importance of CS journal and conference nowhere more important to the discipline tracking and locating points of interest); papers, showing that those in leading of computer science than in the K–12 public and private emergency services, conferences match the impact of those school system. We can instruct our own calling firefighters, medical teams, and in mid-ranking journals and surpass the students in the nature of the discipline, but emergency roadside assistance; and future impact of those in journals in the bottom those we so instruct will only be those who applications (such as fraud detection, half of the Thompson Reuters rankings first choose to come to us. If we want more location-sensitive billing, and advertising). (http://www.isiknowledge.com) for impact students, and if we want to be understood However, positioning techniques vary measured in terms of citations in Google for what we are, we must clarify the by accuracy, implementation cost, and Scholar. We also show that poor correlation message about computer science that all application scenarios (such as indoor between this measure and conference students will receive as part of their K–12 and outdoor). WISPs can exploit their acceptance rates indicates conference education. availability in order to locate their users in publication is an inefficient market where Even if one does not believe that heterogeneous environments by using the venues equally challenging in terms of “computer science” should be taught most suitable positioning technique in a rejection rates offer quite different returns in the K–12 system, it is nonetheless manner transparent to the user. in terms of citations. necessary for us to be involved in The recent interworking between 3G How to measure the quality of academic defining for the schools that which is systems and wireless networks (such research and performance of particular called “computer science.” There will as IEEE 802.11 and Bluetooth) allows researchers has always involved debate. be courses in Photoshop, Web design, WISPs to leverage wireless networks for Many CS researchers feel that performance office tools, A+ certification, networking, localization purposes. Wireless hotspots in assessment is an exercise in futility, in and such, and there will be (a smaller public and private places (such as homes, part because academic research cannot be number of) courses in Visual Basic, C++, offices, airports, shopping malls, arenas, boiled down to a set of simple performance or even Java. The simple fact is that these hotels, and libraries), along with the new metrics, and any attempt to introduce courses will exist in the schools, and there generation of mobile devices supporting them would expose the entire research is nothing fundamentally wrong with multiple positioning technologies (such enterprise to manipulation and gaming. that. What is a problem is for students as GPS, Bluetooth, Wi-Fi, and RFID), On the other hand, many researchers want to be misled into thinking that these fosters WISP development of integrated some reasonable way to evaluate academic are all indistinguishable and all equally positioning systems. performance, arguing that even an describable as “computer science.” imperfect system sheds light on research quality, helping funding agencies and tenure committees make more informed decisions.

november 2010 | vol. 53 | no. 11 | communications of the acm 9 The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we’ll publish selected posts or excerpts.

Follow us on Twitter at http://twitter.com/blogCACM

doi:10.1145/1839676.1839680 http://cacm.acm.org/blogs/blog-cacm

complexities of system building, it is Rethinking the Systems often impossible to specify all the pa- rameters and heuristics being used within a 10-page paper limit. But the Review Process paper ought to present enough detail to enable another researcher to build Tessa Lau launches a discussion about the acceptance criteria a comparable, if not identical, system. for HCI systems papers at CHI, UIST, and other conferences. ˲˲ Alternative approaches. Why did you choose this particular approach? What other approaches could you Tessa Lau Reviewers reject messy systems papers have taken instead? What is the design “What Makes a Good that don’t have a thorough evaluation space in which your system represents HCI Systems Paper?” of the system, or that don’t compare one point? http://cacm.acm.org/ the system against previous systems ˲˲ Evidence that the system solves the blogs/blog-cacm/86066 (which were often designed to solve a problem as presented. This does not There has been much different problem). have to be a user study. Describe situa- discussion on Twitter, Facebook, and At CHI 2010 there was an ongoing tions where the system would be useful in blogs about problems with the re- discussion about how to fix this prob- and how the system as implemented viewing system for HCI systems papers lem. Can we create a conference/pub- performs in those scenarios. If users (see James Landay’s blog post, “I give lishing process that is fair to systems have used the system, what did they up on CHI/UIST” and the comment work? Plans are afoot to incorporate think? Were they successful? thread at http://dubfuture.blogspot. iterative reviewing into the systems pa- ˲˲ Barriers to use. What would pre- com/2009/11/i-give-up-on-chiuist.html). per review process for UIST, giving au- vent users from adopting the system, Unlike papers on interaction meth- thors a chance to have a dialogue with and how have they been overcome? ods or new input devices, systems are reviewers and address their concerns ˲˲ Limitations of the system. Under messy. You can’t evaluate a system before publication. what situations does it fail? How can using a clean little lab study, or show However, I think the first step is to users recover from these failures? that it performs 2% better than the last define a set of reviewing criteria for What do you think? Let’s discuss. approach. Systems often try to solve a HCI systems papers. If reviewers don’t novel problem for which there was no agree on what makes a good systems previous approach. The value of these paper, how can we encourage authors Readers’ comments systems might not be quantified un- to meet a standard for publication? I’d like to second your first recommendation. til they are deployed in the field and Here’s my list: I’ve reviewed a number of systems papers evaluated with large numbers of users. ˲˲ A clear and convincing description that do not provide a sufficiently compelling Yet doing such an evaluation incurs a of the problem being solved. Why isn’t motivation or use case—why should I or significant amount of time and engi- current technology sufficient? How anyone care about this system? Without neering work, particularly compared many users are affected? How much this, the paper often represents technology to non-systems papers. The result, does this problem affect their lives? in search of a problem. observed in conferences like CHI and ˲˲ How the system works, in enough Now, having read Don Norman’s UIST, is that systems researchers find detail for an independent researcher provocative article “Technology First, Needs it very difficult to get papers accepted. to build a similar system. Due to the Last: The Research-Product Gulf” in the

10 communications of the acm | november 2010 | vol. 53 | no. 11 blog@cacm recent issue of interactions magazine, I sometimes rival the effort to having built the have a keener appreciation for the possible Tessa LAU system in the first place—okay, I might be contribution of some technologies in exaggerating a bit, but still.... search of problems, but I still believe these “If reviewers If we could agree on some reasonable are more the exception than the norm … don’t agree on substitutes, that would be good. Some of and that without adequately making the the papers I’ve worked on have included case for the human-centered value(s) the what makes a good GOMS models of performance, for example, systems will help realize, such papers are systems paper, but not everyone buys into a purely probably more suitable for other venues. analytical approach. Sometimes, even —Joseph McCarthy how can we worse, what I’d like to convey in a paper encourage authors is a conceptual shift, a different way of One problem is that our field is moving thinking about some kind of problem, and so fast that we have to allow new ideas to to meet a standard that’s even harder to evaluate than pure cross evolve with other ideas rapidly. If we for publication?” performance. require evaluations of every paper, then we —Robert St. Amant don’t have the rapid turnaround required for innovations to cross paths with each other. I’ve suggested it before and will suggest On the other hand, it seems wrong not it again. An easy start could be to make to have some filter. Without filters, we accepted interactivity demos “worth” might end up publishing ideas that seem to keep in mind all the different ways a as much as a full paper at CHI—same interesting, but are actually quite useless. systems paper can be really valuable and presentation length, and the associated I think you have a great start on a list of worth seeing at a conference like CHI. A paper (maybe six pages long) needs to be discussion points. One thing to keep in mind systems paper could be interesting thanks of the same “archival” status in the ACM is that we should evaluate papers in whole to a thorough analysis of its deployment Digital Library. rather in parts. I will often recommend and usage (evaluation). Or it could be This could show a true commitment to accepting papers that are deficient in one interesting thanks to a well-argued systems research. area but very good in another. discussion of why it was built a particular —Florian Mueller —Ed Chi way (design). Or it might just demonstrate that a given interesting capability could I think it would be useful to some of be created at all. Or it could be a careful Tessa Lau responds us discussing your post if you could say argument about why a certain system Thank you all for the interesting dis- more about the kinds of evidence you are would be really useful, even if it hasn’t cussion. My goal was to initiate a dis- referring to when you say “evidence that been built or evaluated yet (motivation/ cussion within our community, not to the system solves the problem” that are not position paper). In the end, what I want are have all the answers. user studies. papers that stimulate thought and action. Along those lines, Dan, the question So, what are some examples of specific I’m not going to demand any particular you raise about what constitutes “ap- system problems (“clearly and convincingly levels of motivation, design, or evaluation; propriate evidence” is one that I’ll turn presented”), and what would you consider rather, I’m going to ask whether the back to the community for a collective appropriate evidence to show that your whole is innovative enough. This is a highly answer. system solved the problem? Is it a set subjective decision, which is why I greatly For what it’s worth, though, I don’t of usage scenarios that have been hard value wise program committees who can think of your examples as “systems.” to address through previous designs and make such a judgment on my behalf. The first is usage scenarios or design. you show how a single interface design —David Karger The second is an algorithm. The third can address them completely? Is it a is an interaction method. Each of those new, significantly more efficient algorithm I like your list, and think that the is fairly self-contained and possible to or mechanism, for example, to handle bullet points are representative of good evaluate using a fairly closed study. complex preferences around group evaluation criteria for systems papers What I meant by “systems” is an im- permissions, which would be useful to the across computer science. plemented prototype that gives people builders of group systems to know about? The main sticking point is, as I see access to new functionality that did not (In the latter case, would evidence be it, “Evidence that the system solves the exist before. Examples of “systems” in- performance testing, using logs of previous problem as presented.” In some other clude CoScripter, Many Eyes, Landay’s queries as data?) Is it a new approach for areas of empirical computer science, we DENIM and SILK, Gajos’s SUPPLE, An- using skin-tapping as input? have repositories of test problems, suites drew Ko’s Whyline. How can we show —Dan Gruen of agreed-upon performance metrics, that each of these systems is “innova- testing harnesses for software, and so forth. tive enough” (in David’s words) to mer- I am a strong proponent of rigorous Usability testing is seen as the gold standard it publication? gatekeeping at conferences simply because in HCI, though, and it’s much harder to I need some help figuring out which things leverage such tools to make evaluation by Tessa Lau is a research staff member and manager at IBM’s Almaden Research Center. are worth following in my limited time. user testing efficient. The effort devoted At the same time, I think it is important to user testing of a new system can © 2010 ACM 0001-0782/10/1100 $10.00

november 2010 | vol. 53 | no. 11 | communications of the acm 11 cacm online ACM Member News

DOI:10.1145/1839676.1839681 David Roman Chris Stephenson on K–12 CS Education ACM Member A Preference for PDF News recently interviewed Chris Stephenson, When it comes to electronic formats, Communications readers prefer PDF to executive director of the HTML, according to data on a year’s worth of published articles. PDF’s faithful re- Computer Science Teachers production of magazine pages appears to give it an edge. Legacy is another factor. Association, about the status of Every article in Communications’ 52-year history is available in PDF, while HTML K–12 CS education in the U.S. versions date back to 1999. Also, members are accustomed to accessing ACM’s and how ACM members can help. “I think computer science many transactions, proceedings, and journals exclusively in PDF. teachers in the U.S. face a The table shows how the most popular articles published over a 12-month span number of challenges that make were accessed from the DL and Communications’ Web site. it difficult for them to teach to their full potential,” says Stephenson. “First, computer science is poorly understood in 72,000 1 the U.S. school system, so administra-tors and decision- makers have no idea what CS teachers do or why it is 11 3 important. This means teachers 50,000 2 have to continually fight for their programs and their students and are often

HTM L under-supported in terms of

Pageviews resources. Also, our teacher 25,000 5 certification requirements are 12 a complete mess, and CS 13 6 teachers in many states must 14 10 4 8 more PDF downloads than HTML pageviews. first be certified in some other 9 7 discipline. This doubles the no PDF downloads (on CACM Web site only) time and effort required. And it 0 25,000 75,000 115,000 means CS teachers can be PDF Downloads Title assigned to teach something other than computer science at 1 A Few Billion Lines of Code Later http://cacm.acm.org/magazines/2010/2/69354 any time. Finally, our discipline 2 What Should We Teach New Software Developers? Why? requires teachers to continually http://cacm.acm.org/magazines/2010/1/55760 upgrade both their teaching and 3 You Don’t Know Jack About Software Maintenance technical knowledge, and there http://cacm.acm.org/magazines/2009/11/48444 is very little access to relevant and timely professional 4 MapReduce: A Flexible Data Processing Tool http://cacm.acm.org/magazines/2010/1/55744 development. 5 You’re Doing it Wrong http://cacm.acm.org/magazines/2010/7/95061 “The most important thing 6 Retrospective: An Axiomatic Basis for Computer Programming ACM members can do is to http://cacm.acm.org/magazines/2009/10/42360 advocate for computer science courses in their local schools…. 7 An Interview With Edsger W. Dijkstra http://cacm.acm.org/magazines/2010/8/96632 The other really important action 8 Recent Progress in Quantum Algorithms http://cacm.acm.org/magazines/2010/2/69352 that U.S. members can do is 9 MapReduce and Parallel DBMSs: Friends or Foes? to contact their congressional http://cacm.acm.org/magazines/2010/1/55743 representatives and request 10 Scratch: Programming for All http://cacm.acm.org/magazines/2009/11/48421 that they support the Computer Science Education Act [HR 5929]. 11 How we Teach Introductory Computer Science is Wrong With enough support, this bill http://cacm.acm.org/blogs/blog-cacm/45725 has the power to fundamentally 12 Are You Invisible? http://cacm.acm.org/blogs/blog-cacm/94307 change computer science 13 The “NoSQL” Discussion has Nothing to Do With SQL education in this country.” http://cacm.acm.org/blogs/blog-cacm/50678 A new joint ACM–CSTA report, Running on Empty: The 14 A Tale of A Serious Attempt At P≠NP http://cacm.acm.org/blogs/blog-cacm/97587 Failure to Teach K–12 Computer Methodology: Cited articles were published online or in the monthly print Communications between October 2009 Science in the Digital Age, is through September 2010, and include the 10 titles with the most PDF downloads from the ACM Digital Library in the 12 months following their publication, and the 10 titles with the most HTML pageviews on the Communications available at http://acm.org/ Web site over the same period, according to Google Analytics. Runningonempty/. —Jack Rosenberger

12 communications of the acm | november 2010 | vol. 53 | no. 11 news

Science | doi:10.1145/1839676.1839682 Gregory Goth Turning Data N Into Knowledge Today’s data deluge is leading to new approaches to visualize, analyze, and catalog enormous datasets.

he amount of data available to scientists of nearly every discipline has almost be- come a “Can you top this?” exercise in numbers. TThe Sloan Digital Sky Survey (SDSS), for example, is often cited as a prime example. Since the survey’s 2.5-meter telescope first went online in 1998, more than 2,000 refereed publications have been produced, but they use just 10% of the survey’s available imag- ing data, according to a recent U.S. National Science Foundation work- shop on data-enabled science in the mathematical and physical sciences. Once the next-generation, state-of-the- art Large Synoptic Survey Telescope (LSST) goes online in 2016, however, it is estimated to be capable of produc- ing a SDSS-equivalent dataset every night for the next 10 years. Another of- The Large Synoptic Survey Telescope will have the ability to survey the entire sky in only ten-cited example is the Large Hadron three nights. Collider. It will generate two SDSS’s on i worth of data each day. Some leading computational re- the reality might be that each of the On the surface, then, the scientific search scientists believe, however, that people working on their own discipline orporat C community’s mandate seems clear: progress in utilizing the vast expansion are achieving the results they need to SST create better computational tools to of data will best be attacked on a project- scientifically,” says Dan Masys, M.D., visualize, analyze, and catalog these by-project basis rather than by a pan- chairman of biomedical informatics at enormous datasets. And to some ex- disciplinary computational blueprint. Vanderbilt University. “There’s a cost tent, there is wide agreement these “In theory, you might think we of communication that reaches an ir-

Image courtesy of L tasks must be pursued. should all be working together, and reducible minimum when you work

november 2010 | vol. 53 | no. 11 | communications of the acm 13 news

across disciplinary boundaries, and likely speed processing of problems tion of text provided to us by Google, sometimes it’s worth it. involving multiplying “several million and looks up the statistical properties “But the grander potential for syn- genome data points by several thou- of that word in the text; that is, if you ergy that’s often spoken of at the level sand people” from five days to three give it the word ‘telephone’, it will look of federal funding agencies probably hours—a prime example of focused up how often ‘telephone’ occurs with doesn’t happen as much as people intradisciplinary collaboration and words from a long list of verbs—for think would be best for science,” Masys leading-edge hardware. example, how often does it occur with continues. “You can’t push that rope ‘hug’, or ‘eat’, and so on. all that well, because it depends on the New Perspectives on Data “To Google’s credit they put this art of the possible with respect to tech- Both Mitchell and Randal Bryant, dean out on the Web for anybody to use, but nologies and the vision of the scientists of the school of computer science at they were thinking it would be used by doing the work.” CMU, cite the influence of commercial researchers working on translation— Tom Mitchell, chairman of the ma- companies for helping to expand the and it turned out to be useful for some- chine learning department at Carn- concept of what kind of data, and what thing else.” egie Mellon University (CMU), concurs kind of data storage and computational Meanwhile, the LSST project is with Masys’ assessment. “I think it architectures, can produce useful sci- planning multiple vectors by which its starts from the bottom up and at some entific results. huge dataset—all of which will be pub- point you’ll see commonalities across “The commercial world, Google licly available in near-real time—will domains,” he says. As an example, he and its peers, have been the drivers aid research by professional astrono- cites time series algorithms being de- on the data side, much more than the mers; programs at museums, second- veloped by CMU colleague Eric Xing traditional sciences or universities,” ary schools, and other institutions; and that may also be useful for brain imag- says Bryant, who cites the example of a citizen scientists. The project’s goal, ing work Mitchell is undertaking. Google cluster running a billion-word say the organizers, is “open source, “There’s an example I think is prob- index that outperformed the Big Iron open data.” ably pretty representative of how it’s architecture of the “usual suspects” “We will develop methods for en- going to go,” Mitchell says. “People en- in a 2005 language-translation contest gaging the public so anyone with a Web counter problems and have to design sponsored by the U.S. National Insti- browser can effectively explore aspects algorithms to address them, but time tute of Standards and Technology. of the LSST sky that interest and im- series analysis is a pretty generic prob- The availability of such large datas- pact the public,” according to the LSST lem. So I think bottom up it will grow ets can lead to serendipitous discover- organizers. “We will work with the IT and then they will start connecting ies such as one made by Mitchell and his industry on enhanced visualization across [different disciplines].” colleagues, using a trillion-word index involving dynamic graphics overlays Vanderbilt’s Masys is about to be- Google had originally provided for ma- from metadata and provide tools for gin a collaboration with computation- chine translation projects. “We found public query of the LSST database.” al biologists from Oak Ridge National we could build a computational model The LSST organization’s hope, then, Laboratory. Masys says the Oak Ridge that predicts the neural activity that will is that the distributed nature of allow- scientists’ optimization of Vander- show up in your brain when you think ing any researcher at any level to ac- bilt’s fundamental algorithms and the about an arbitrary noun,” Mitchell says. cess the data will result in a plethora of lab’s teraflop-capable architecture will “It starts by using a trillion-word collec- projects—a kind of “given enough eye-

Ubiquitous Computing Intel’s Friendly, Smart Machines

Context-aware computing, in combined with “soft sensors” about what’s going on around is run through an inference which devices understand what a such as calendars and social it, could learn to recognize algorithm that examines the user is doing and anticipate his networks, to track people’s which person is holding it— input and generates confidence or her needs without being activity and figure out how the based on how the user moves, scores to determine what is asked, are the next step in the devices can help. For instance, a what angle he holds it at, and likely going on. evolution of smart machines, device might locate someone at how fast he presses the Collecting this data will says Justin Rattner, Intel vice her office, hear the sound of buttons—then make require giving users control over president and chief technology human voices, crosscheck her personalized recommendations what gets shared, and allow them officer. calendar, and conclude she’s in a for shows, based on past to turn off sensors, Rattner says. In his keynote address at business meeting, then suggest preferences. A prototype He gives no timeline for IDF2010, the recent Intel to the husband trying to call her Personal Vacation Assistant, introducing such applications, Developer Forum in San that this wouldn’t be a good time developed with Fodor’s Travel, but says, “We believe that Francisco, Rattner laid out a to interrupt. uses GPS location, time of day, context-aware computing is vision in which computers use a A television remote control, and past behavior to poised to fundamentally change variety of sensors—microphones, using unsupervised learning in recommend restaurants and the way we interact and relate to accelerometers, and global which it continuously collects tourist sites. Data, collected over the devices that we use today.” positioning systems (GPSs)— data and makes inferences time and shared among devices, —Neil Savage

14 communications of the acm | november 2010 | vol. 53 | no. 11 news balls” approach to massive datasets. Research & Development However, even massive datasets are “The right question is, sometimes not complete enough to Paper deliver definitive results. Recent dis- Do I have a scientific coveries in biomedical research have revealed that even a complete index question and a Chase of the human genome’s three billion method for answering Due to enormous governmental pairs of chemical bases has not greatly investments in research and accelerated breakthroughs in health it that is scientific, development, scientists in care, because other crucial medical no matter what many Asian countries are data is missing. A study of 19,000 wom- steadily increasing their the dataset is?” number of papers published in en, led by researchers at Brigham and scientific journals. Women’s Hospital in Boston, used data asks Tom Mitchell. The Asia-Pacific region constructed from the National Human increased its total of published Genome Research Institute’s catalog science articles from 13% in the early 1980s to slightly more of genome-wide association study re- than 30% in 2009, according sults published between 2005 and June to the Thomson Reuters 2009—only to find that the single big- National Science Indicators, an annual database of the gest predictor of heart disease among ‘We want to start a joint Ph.D. program number of articles published the study’s cohort is self-reported fam- in public policy and machine learning, in about 12,000 internationally ily history. Correlating such personal because we think the future of policy recognized journals. China data with genetic indexes on a wide de- analysis will be increasingly evidence- leads the pack with 11% in 2009, up from 0.4% in the early mographic scale today is nearly impos- based. And we want to train people 1980s, followed by Japan with sible as an estimated 80% of U.S.-based who understand the algorithms for 6.7% and India with 3.4%. In primary-care physicians do not record analyzing and collecting that evidence contrast, the ratio of articles from scientists in the U.S. patient data in electronic medical re- as well as they understand the policy decreased to 28% in 2009, cords (EMRs). Recent government fi- side.’” As a result, the joint Ph.D. pro- down from 40% in the early nancial incentives are meant to spur gram was created at CMU. 1980s. EMR adoption, but for the immediate In all, 25 nations have increased their research, but future, crucial data in biomedical re- none more so than Singapore. Further Reading search will not exist in digital form. With a population of just five Another issue in biomedical research Duda, S.N. Cushman, C., and Masys, D.R. million, the nation published An XML model of an enhanced dictionary 8,500 articles in 2009, is the reluctance of traditionally trained compared with only 200 in scientists to accept datasets that were to facilitate the exchange of pre-existing clinical research data in international 1981. Singapore now allocates not created under the strict parameters studies, Proceedings of the 12th World 3% of its gross domestic required by, for example, epidemiolo- Congress on Health Informatics, Brisbane, product to research and development, a figure expected Australia, August 20–24, 2007. gists and pharmaceutical companies. to rise to 3.5% by 2015. CMU’s Mitchell says this arena of Mitchell,T.M., Shinkareva, S.V., Carlson, A., The increase in scientific public health research could be in the Chang, K.-M., Malave, V.L., Mason, R.A., publications, especially in vanguard of what may be the true crux and Just, M.A. East Asian countries, reflects a “phenomenal” increase in of the new data flood—the idea that the Predicting human brain activity associated with the meanings of nouns, Science 320, funding, Simon Marginson, a provenance of a given dataset should 5880, May 30, 2008. professor of higher education matter less than the provenance of a at the University of Melbourne, Murray-Rust, P. told The New York Times. given hypothesis. Data-driven science—a scientist’s view, Marginson attributed the “The right question is, Do I have a NSF/JISC Repositories Workshop position increase in research output to scientific question and a method for paper, April 10, 2007. governments’ commitment answering it that is scientific, no matter to establishing knowledge- Newman, H.B. intensive economies. “It’s what the dataset is?” Mitchell asks. In- Data intensive grids and networks for high very much not simply about creasingly, he says, computational sci- energy and nuclear physics: drivers of the knowledge itself—it’s about entists will need to frame their questions formation of an information society, World its usefulness throughout the Summit on the Information Society, Pan- economy. I think that that and provide data for an audience that ex- European Ministerial Meeting, Bucharest, economic vision is really the tends far beyond their traditional peers. Romania, November 7–9, 2002. principal driver,” Marginson “We’re at the beginning of the curve said. Thakar, A.R. of a decades-long trend of increas- Another reason for The Sloan Digital Sky Survey: drinking from increased publications is that ingly evidence-based decision-making the fire hose, Computing in Science and many Asian universities now across society, that’s been noticed by Engineering 10, 1, Jan./Feb. 2008. receive additional funding to people in all walks of life,” he says. have their papers translated “For example, the people at the public Gregory Goth is an Oakville, CT-based writer who into English, the language used specializes in science and technology. by the majority of academic policy school at CMU came to the ma- journals. chine learning department and said, © 2010 ACM 0001-0782/10/1100 $10.00 —Phil Scott

november 2010 | vol. 53 | no. 11 | communications of the acm 15 news

Technology | doi:10.1145/1839676.1839683 Gary Anthes Security in the Cloud Cloud computing offers many advantages, but also involves security risks. Fortunately, researchers are devising some ingenious solutions.

omputing may some day be organized as a public util- ity, just as the telephone system is a public utility,” Massachusetts Institute of CTechnology (MIT) computer science pioneer John McCarthy noted in 1961. We aren’t quite there yet, but cloud computing brings us close. Clouds are all the rage today, promising con- venience, elasticity, transparency, and economy. But with the many ben- efits come thorny issues of security and privacy. The history of computing since the 1960s can be viewed as a continuous move toward ever greater specializa- tion and distribution of computing resources. First we had mainframes, and security was fairly simple. Then we added minicomputers and desktop and laptop computers and client-server models, and it got more complicated. Cloud computing simplifies security issues for users by outsourcing them to companies such These computing paradigms gave way as Microsoft, which recently opened a $550 million data center in Chicago. in turn to n-tier and grid computing and to various types of virtualization. rity management in the cloud. A cell, take action accordingly. They might, As hardware infrastructures grew managed as a single administrative for instance, throttle back the CPU, more complicated and fragmented, domain using common security poli- stop all I/O to a virtual machine (VM), so did the distribution of software and cies, contains a bundle of virtual ma- or take a clone of the VM and move it data. There seemed no end to the ways chines, storage volumes, and networks elsewhere for evaluation. Agents could that users could split up their comput- running across multiple physical ma- be deployed by cloud users, cloud ser- ing resources, and no end to the securi- chines. Around the cells HP inserts vice providers, or third parties such as a ty problems that arose as a result. Part various sensors, detectors, and mitiga- virus protection company, Sadler says. of the problem has been one of moving tors that look for viruses, intrusions, But these agents introduce their targets—just as one computing para- and other suspicious behavior. Virtual- own management challenges. There digm seemed solid, a new, more attrac- ization enables these agents to be very might be as many as 30 agents, inter- tive one beckoned. close to the action without being part acting in various ways and with varying In a sense, cloud computing sim- of it or observed by it, according to HP. drains on system resources. HP Labs plifies security issues for users by out- “People often think of virtualization is developing analytic tools that can sourcing them to another party, one as adding to security problems, but it generate playbooks that script system that is presumed to be highly skilled is fundamentally the answer to a lot of behavior. These templates, tailorable CROSOFT at dealing with them. Cloud users those problems,” says Martin Sadler, by users, employ cost/benefit analyses I M

may think they don’t have to worry director of HP’s Systems Security Lab. and reflect what is most important to FROM

about the security of their software “You can do all sorts of things you can’t users and what cost they are willing to ON I SS and data anymore, because they’re in do when these things are physical ma- bear for various types of protection. I

expert hands. chines.” For example, the sensors can PERM H But such complacency is a mistake, watch CPU activity, I/O patterns, and Virtual Machine Introspection T say researchers at Hewlett-Packard memory usage and, based on models IBM Research is pursuing a similar D WI USE (HP) Laboratories in Bristol, U.K. They of past behavior, recognize suspicious approach called “virtual machine in- H are prototyping Cells as a Service, by activity. They can also assess the prob- trospection.” It puts security inside OTOGRAP H which they hope to automate secu- ability of certain events happening and a protected VM running on the same P

16 communications of the acm | november 2010 | vol. 53 | no. 11 news physical machine as the guest VMs Society running in the cloud. The security VM “People often think employs a number of protective meth- Pew ods, including the whitelisting and of virtualization as blacklisting of guest kernel functions. It can determine the operating system adding to security Report on and version of the guest VM and can problems, but start monitoring a VM without any Mobile beginning assumption of its running it is fundamentally state or integrity. the answer to a lot Apps Instead of running 50 virus scan- of those problems,” ners on a machine with 50 guest VMs, Although a greater number of virtual machine introspection uses just says Martin Sadler, adults are turning to mobile one, which is much more efficient, says phones to text and access director of the Internet, age and gender Matthias Schunter, a researcher at IBM differences exist, according to a Research’s Zurich lab. “Another big HP’s Systems report by Pew Research Center’s advantage is the VM can’t do anything Internet & American Life Project against the virus scan since it’s not Security Lab. and The Nielsen Company. The report, titled The Rise aware it’s being scanned,” he says. of Apps Culture, found that 35% Another variation, called “lie de- of U.S. adults have software tection,” puts a tiny piece of software applications or apps on their inside the VM to look at the list of run- phones, yet only 24% of adults use those apps. Overall, today’s ning processes as seen by the user. In- apps culture—essentially born trospection software outside the VM adversary could launch a side-channel a couple of years ago with can reliably determine all the process- attack based on the VM’s sharing of the introduction of Apple’s iPhone—is predominantly es actually running on the VM; if there physical resources such as CPU data male, younger, and more is any difference between the two lists, caches. The researchers also outlined affluent. some malware, such as a rootkit, is sus- a number of mitigation steps, but con- Eighteen to 29-year-olds pected of running on the VM. cluded the only practical and foolproof comprise only 23% of the U.S. adult population but constitute Looking from both within the VM protection is for cloud users to require 44% of the apps-using and without, the lie detector can also that their VMs run on dedicated ma- population. By contrast, 41% of compare the lists of files on disk, the chines, which is potentially a costly so- the adult population is age 50 lution. and older but this group makes views of open sockets, the lists of load- up just 14% of apps users. ed kernel modules, and so on. “Each Younger adopters also use apps, of these lie tests improves the chanc- Difficulties With Encryption including games and social es of detecting potential malware, Encryption is sometimes seen as the media, more frequently. Gender differences were but none of them can prove that no ultimate security measure, but it also also apparent. Women are malware exists,” says IBM researcher presents difficulties in the cloud. At more likely to rely on social Klaus Julisch. present, processing encrypted data networking apps such as In a third application, a virtual in- means downloading it and decrypting Facebook and Twitter while men are inclined to use trusion detection system runs inside it for local use and then possibly up- productivity and financial apps. the physical machine to monitor traf- loading the results, which is a cumber- Nevertheless, adoption is fic among the guest VMs. The virtual some and costly process. growing rapidly. The Nielsen Company found that the networks hidden inside a physical The ability to process encrypted average number of apps on machine are not visible to conven- data in place has been a dream of a smartphone has swelled tional detectors because the detec- cryptographers for years, but it is now from 22 in December 2009 tors usually reside in a separate ma- demonstrating some progress. Last to 27 today. Not surprisingly, iPhone owners top the list with chine, Schunter says. year, Craig Gentry, first at Stanford an average of 40 apps, while Indeed, snooping between VMs in- University and then at IBM Research, Android users claim 25 and side a machine was shown to be a real proved it is possible to perform cer- BlackBerry owners 14. The next few years will possibility by researchers last year. tain operations on data without first likely usher in dramatic Computer scientists Thomas Risten- decrypting it. The technique, called changes. “Every metric we part, Hovav Shacham, and Stefan Sav- “fully homomorphic encryption,” was capture shows a widening age at the University of California, San hailed as a conceptual breakthrough, embrace of all kinds of apps by a widening population, Diego and Eran Tromer at MIT proved but is so computationally demanding states Roger Entner, coauthor it was possible for an adversary to get that practical applications are years of the report and senior vice his or her VM co-located with a target’s away, experts say. president at Nielsen. “It’s … not VM on a cloud’s physical machine 40% Meanwhile, the more limited abil- too early to say that this is an important new part of the of the time. In a paper, “Hey, You, Get ity to search encrypted data is closer to technology world.” Off of My Cloud,” they showed how the reality. In “Cryptographic Cloud Stor- —Samuel Greengard

november 2010 | vol. 53 | no. 11 | communications of the acm 17 news

age,” a paper published earlier this nies like Google and Amazon and Mi- year, researchers Seny Kamara and crosoft have hundreds of people de- Kristin Lauter of Microsoft Research In “Cryptographic voted to security,” he says. “How many described a virtual private storage ser- Cloud Storage,” do you have?” vice that aims to provide the security of a private cloud and the cost savings Microsoft Further Reading of a public cloud. Data in the cloud researchers Seny remains encrypted, and hence pro- Christodorescu, M., Sailer, R., Schales, D., tected from the cloud provider, court Kamara and Kristin Sgandurra, D., and Zamboni, D. Cloud security is not (just) virtualization subpoenas, and the like. Users index Lauter describe security, Proceedings of the 2009 ACM their data, then upload the data and Workshop on Cloud Computing Security, the index, which are both encrypted, to a virtual private Chicago, IL, Nov. 13, 2009. the cloud. As needed, users can gener- storage service that Gentry, C. ate tokens and credentials that control Fully homomorphic encryption using ideal who has access to what data. provides the security lattices, Proceedings of the 41st Annual Given a token for a keyword, an ACM Symposium on Theory of Computing, of a private cloud Bethesda, MD, May 31–June 2, 2009. authorized user can retrieve point- ers to the encrypted files that contain and the cost savings Kamara, S. and Lauter, K. Cryptographic cloud storage, Proceedings the keyword, and then search for and of a public cloud. of Financial Cryptography: Workshop on download the desired data in encrypt- Real-Life Cryptographic Protocols and ed form. Unauthorized observers can’t Standardization, Tenerife, Canary Islands, know anything useful about the files or Spain, January 25–28, 2010. the keywords. Ristanpart, T., Tromer, E., Sacham, H., The experimental Microsoft service and Savage, S. also offers users “proof of storage,” a when your data is on a server in China Hey, you, get off of my cloud: exploring information leakage in third-party protocol by which a server can prove to but you outsourced to a cloud service compute clouds, Proceedings of the a client that it did not tamper with its in New York?” asks Sion. “Or what if 16th ACM Conference on Computer and encrypted data. The client encodes the you have the legal resources to fight a Communications Security, Chicago, IL, data before uploading it and can verify subpoena for your data, but they sub- Nov. 9–13, 2009. the data’s integrity at will. poena your cloud provider instead? Shi, E., Bethencourt, J., Chan, T-H., Song, D., Not all cloud security risks arise You will be under scrutiny for moving and Perrig, A. from technology, says Radu Sion, a to the cloud by your shareholders and Multi-dimensional range query over encrypted data, Computer Science computer science professor at Stony everyone else.” Technical Report CMU-CS-06-135R, Brook University. There is scant le- Nevertheless, Sion says all but the Carnegie Mellon University, March 2007. gal or regulatory framework, and few most sophisticated enterprises will precedents, to deal with issues of li- be safer putting their computing re- Gary Anthes is a technology writer and editor based in ability among the parties in cloud ar- sources in the expert hands of one of Arlington, VA. rangements, he notes. “What happens the major cloud providers. “Compa- © 2010 ACM 0001-0782/10/1100 $10.00

Distributed Computing Math at Web Speed

“Many hands make light work,” The researchers estimate that possible combinations of the “We believe that our Hadoop goes the old adage. Now there’s a typical computer would have cube in just a few weeks, a task clusters are already more data to prove it. taken at least 500 years to carry the researchers estimate would powerful than many other In recent weeks, both Yahoo! out the same operation. have taken a single computer supercomputers,” says Sze, who and Google have announced the Another group of researchers 35 years. conceived of the project as part results of separate mathematical recently took advantage of Google has yet to release the of an internal Yahoo! contest to experiments that demonstrate Google’s distributed computing details of its technical solution, demonstrate the capabilities of the computational power of large infrastructure to tackle another but it probably bears some Hadoop. clusters of networked PCs. famously thorny computational resemblance to the approach In both cases, the At Yahoo!, a team led by challenge: Rubik’s Cube. The used at Yahoo!, where the team mathematical problems proved researcher Tsz-Wo Sze broke team developed an algorithm used Apache Hadoop, open- particularly well-suited to the world record for calculating capable of solving any Rubik’s source software originally distributed computing because the digits of pi, crunching the Cube configuration in 20 developed at Google (and later the calculations can be parceled famously irrational number moves or less, resolving a developed extensively by Yahoo!) out over the network into much to the two-quadrillionth bit by conundrum that has puzzled that allows developers to stitch smaller operations, capable of stitching together more than mathematicians for three together thousands of computers running on a standard-issue PC. 1,000 computers to complete the decades. The computers over the network into a powerful Making light work indeed. calculation over a 23-day period. simulated all 43 quintillion cloud computer. —Alex Wright

18 communications of the acm | november 2010 | vol. 53 | no. 11 news

Society | doi:10.1145/1839676.1839684 Leah Hoffmann Career Opportunities What are the job prospects for today’s—and tomorrow’s—graduates?

ast fall, Jim Wordelman found programming projects, which is a trend himself in an enviable po- the BLS expects to continue. sition. At a time when un- “The BLS projections are pretty com- employment for recent U.S. pelling,” says Peter Harsha, director of college graduates was at the government affairs at the Computing Lhighest level since 1983, Wordelman, Research Association (CRA). “We’re op- a senior at the University of Illinois at timistic.” Urbana-Champaign’s (UIUC’s) Depart- College students seem to have picked ment of Computer Science, had several up on that optimism, and are returning job offers from companies that wanted to the field after a steep six-year decline to hire him the following summer. Even- caused by the dot-com crash. According tually, he took a job with Microsoft as a to the Taulbee Survey, an annual CRA developer on its Internet Explorer team. study that gathers data for North Ameri- “I did work for the company before and can computer science and computer loved it,” Wordelman explains. The job engineering programs, the number of also put him in a position to follow his computer science majors rose 8.1% in greatest passion: accessibility. 2008 and another 5.5% in 2009. “It’s a If recent data is any indication, cautious uptrend,” says Hambrusch. Wordelman’s case is not unique among At some schools, the surge in interest is computer science graduates in the U.S. even more pronounced: applications to (the job prospects for graduates in the Computer science graduates at Carnegie UIUC’s CS program were up by 26% this Mellon University, shown above, and other United Kingdom, China, and India are schools often have jobs waiting for them year and increased by 32% at CMU. discussed later ). In fact, his fellow UIUC upon graduation. The troubled economy has played a CS graduates received an average of 2.4 role in the uptick. Though the comput- job offers this year. The mean starting technology, engineering, and math- ing industry experienced a wave of lay- salary: $68,650. “Our undergrads have ematics) fields. For a discipline that offs at the height of the recession, it has had no trouble getting positions,” says is still struggling with the public per- been hit less hard than other sectors, Rob Rutenbar, the department head. ception that its jobs are migrating off- and employment was up by an estimat- “Most of them are doing things like soft- shore, such career predictions offer an ed 5% in the second quarter of 2010. ware development. Some launch entre- important counterpoint. preneurial ventures.” Of the new jobs, according to BLS The Coolness Factor At Carnegie Mellon University projections, 27% will be in software en- According to a recent study conducted (CMU), the job outlook is equally rosy: gineering, 21% in computing network- by the National Association of Colleges 95% of this year’s CS students had jobs ing, and 10% in systems analysis. Soft- and Employers, the average salary for waiting for them upon graduation. ware engineering alone is expected to this year’s crop of computer science “Companies may be a little choosier, add nearly 300,000 jobs in the next eight grads stands at $61,112. And while it’s but they are still hiring,” says Susanne years. too early to say for sure, some industry Hambrusch, a computer science profes- Computer programmers will fare watchers predict an influx of students sor at Purdue University, where gradu- less well, with a projected decline in who might otherwise have majored ates enjoyed mean starting salaries of employment of 3% through 2018. The in finance. Harsha, for example, cites $66,875 last year. BLS cites advances in programming David E. Shaw, a computer scientist According to projections from the tools, as well as offshore outsourcing, turned hedge fund manager who made U.S. Bureau of Labor Statistics (BLS), as contributing factors to this decline. a fortune in quantitative trading, then computing will be one of the fastest- Nonetheless, the federal agency pre- returned to scientific research: “He’s a growing job markets through 2018. dicts employers will continue to need model for a certain group.” YAMOTO I

M Employment of software engineers, some local programmers, especially There is also a coolness factor among L E I computer scientists, and network, da- ones with strong technical skills. And a generation of students who grew up AN D

Y tabase, and systems administrators is many companies, having discovered with computers and are deeply engaged H B expected to grow between 24%–32% that outsourcing is more challenging to with technologies like cellphones, Face- through 2018. They account for 71% manage than anticipated, are turning to book and other social media, and the OTOGRAP H P of new jobs among the STEM (science, domestic outsourcing to complete their latest electronic devices from Apple and

november 2010 | vol. 53 | no. 11 | communications of the acm 19 news

other hardware companies. “For every expand as quickly as companies and popular trend in computing there’s universities might like. “We currently a spike in interest,” says Harsha, cit- have about 775 undergrads, and we can ing a similar boom-and-bust cycle that add another couple hundred without a happened with the rise of the personal problem,” says UIUC’s Rutenbar. “But computer during the mid-1980s. Also, we need to do some soul searching if we Harsha says, students may finally have want to grow larger than that.” realized that the stereotype of comput- ing as a lonely career in which you sit in The International Outlook a cube and write code is not true. The job prospects for computer science “We all owe a non-trivial debt to com- graduates in the United Kingdom, Chi- panies like Google and Apple, who do na, and India vary widely as does each cool work on cool products and don’t country’s educational and economic look like your stereotypical guys in flan- situations. nel suits,” says Lenny Pitt, a UIUC com- According to the United Kingdom’s puter science professor. Higher Education Statistics Agency Mark Stehlik, assistant dean for un- (HESA), 17% of 2009’s CS graduates dergraduate education at CMU’s School were unemployed six months later— of Computer Science, has a different more than any other discipline. Indus- historical comparison: the space pro- try watchers caution that the figure gram of the 1960s, which fueled the should be taken with a grain of salt, imaginations and ambitions of a gen- however, since the category includes eration of schoolchildren. “There was students who studied softer subjects such an enterprise built around it,” he like human-computer interaction and recalls. Of course, to go to the moon, you information development as well as tra- had to be a rocket scientist. “And what ditional CS majors. “Higher-level things ACM’s if rocket science wasn’t your thing?” like software and systems design are a interactions Computer science majors, on the other different picture,” says Bill Mitchell, di- hand, have a variety of career options to rector of the British Computer Society. magazine explores choose from once they graduate. “You “They are very much recruiting these critical relationships can do software development across types of people.” between experiences, people, such a wide range of sectors,” notes Ste- Research institutions like the Univer- hlik, as nowadays almost every industry sity of Southampton, which placed 94% and technology, showcasing has computing needs. of its computer science graduates in emerging innovations and industry In spite of recent gains, the supply 2009, echo Mitchell’s sentiment. “Com- leaders from around the world of CS graduates is still dwarfed by the panies still need people with really good projected number of jobs. According to skills, who have been exposed to dif- across important applications of the BLS projections, there will be more ferent languages and platforms, who design thinking and the broadening than twice as many new computing jobs are confident and can code,” says Joyce field of the interaction design. per annum in the next eight years than Lewis, communications manager for Our readers represent a growing the current level of 50,000 computing the University of Southampton’s School graduates will be able to fill. Nor can of Electronics and Computer Science. community of practice that computer science departments, many And while the University of Southamp- is of increasing and vital of whom had trouble dealing with the ton and other members of the Russell global importance. influx of students in the late 1990s, Group—an association of 20 universi- ties that’s often referred to as the U.K.’s Ivy League—have no trouble filling spac- Software engineering es in their computer science programs, educators are nonetheless concerned by alone is expected a massive nationwide drop in interest in to add nearly 300,000 the field. “Enrollment has dropped by nearly 60% over the past eight years,” jobs in the U.S. in says Mitchell, who is working to reform the next eight years. the national IT curriculum and reverse the trend. “Companies tell us they have to bring people in from Silicon Valley.” http://www.acm.org/subscribe Recent computer science gradu- ates in China are also struggling with a demanding job market. According to a study conducted by the MyCOS Insti-

20 communications of the acm | november 2010 | vol. 53 | no. 11 news tute, a Beijing-based think tank, com- in the International Journal of Engineer- puter science, English, and law have ing Studies explained, “The teaching topped a list of majors with the most un- One concern in India load of professors in the top research- employed graduates for the past three is the growing lack intensive schools has increased, and years. In 2009, the most recent year for talented potential research students which data is available, computer sci- of educators to teach are being attracted by high-paying pri- ence was second only to English in the the next generation vate-sector jobs, or by research oppor- number of unemployed graduates. tunities at better-funded institutions Here, too, the figures underlie a more of software abroad.” Those students who do pur- complicated picture. Thanks to govern- engineers—a sue advanced degrees, according to the mental encouragement, the number of study’s authors, often do so to improve university graduates in China has risen shortage of up to their market value in the job market. dramatically during the past 10 years. 70,000 teachers, In 2008, more than six million students graduated nationwide; in 2002, the according to some Further Reading total number was below 1.5 million. estimates. National Association of Software Such increases, education experts con- and Services Companies tend, were not matched with employ- The IT-BPO Sector in India: Strategic Review 2009, http://www.nasscom. ment prospects, particularly in tech- in/Nasscom/templates/NormalPage. nical fields, where market needs are aspx?id=55816. highly specialized. Therefore, students Solanki, K., Dalal, S., and Bharti, V. must work hard to distinguish them- the country’s National Association of Software engineering education and selves from a glut of applicants. Often, Software and Services Companies, the research in India: a survey, International that means earning a graduate degree. IT services sector, still the dominant Journal of Engineering Studies 1,3, 2009. “Companies get a lot of applicants, and source of computing jobs, grew nearly U.S. Bureau of Labor Statistics to make it easier, some use the degree 16.5% in 2009, and software exports are Occupational Outlook Handbook, 2010-11 as a filter,” says Xiaoge Wang, an as- expected to increase by 14.4% in the cur- edition, http://www.bls.gov/oco/. sociate professor in the department of rent fiscal year. Job placements at the Universities & Colleges Admissions Service computer science and technology at country’s top engineering schools are Unistats From Universities and Colleges in the U.K., http://unistats.direct.gov.uk/ Tsinghua University. At Tsinghua, 83% robust, with many students receiving retrieveColleges_en.do. of 2009’s computer science graduates multiple offers upon graduation. One enrolled in graduate programs at home concern, however, is the growing lack Zhang, M. and Virginia M.L. Undergraduate computer science or abroad, up from 78% in 2008 and 66% of educators to teach the next genera- education in China, Proceedings of the 41st in 2007. “Our students would like to go tion of software engineers—a shortage ACM Technical Symposium on Computer to companies like Microsoft or IBM, of up to 70,000 teachers, according to Science Education, March 10–13, 2010, which require a Ph.D. or a master’s,” some estimates. University pay scales Milwaukee, WI. says Wang. are low compared to the private sector, In India, the IT industry is doing and few students pursue the advanced Leah Hoffmann is a Brooklyn, NY-based technology writer. well after a slowdown brought on by degrees that would qualify them for uni- the global downturn. According to versity positions. As a report published © 2010 ACM 0001-0782/10/1100 $10.00

Milestones David Kuck Wins Ken Kennedy Award

David Kuck, an Intel Fellow, operating cost (in terms of new machines, solving linear must be made, so a pre-analyzed is the recipient of the second energy), as well as applications programming and related repository would allow phases, annual ACM-IEEE Computer sensitivity. I’m working on theory problems to satisfy design goals.” and sequences of them, to be Society Ken Kennedy Award for and tools to support codesign. Asked about the next matched to codelets for optimal his four decades of contributions I introduced a computational important innovation with compilation. This is a combined to compiler technology and capacity model in the 1970s compilers, Kuck said, “Compiler static and dynamic approach to parallel computing, which have and pushed it much further in optimization transformations compilation.” improved the cost-effectiveness of the past few years. Measured are well developed, but where The Ken Kennedy Award multiprocessor computing. bandwidth and capacity and how to apply them is still a recognizes substantial In an email interview, Kuck (bandwidth used) for a set of mystery. I believe that building contributions to programmability discussed his current research. architectural nodes for each a large repository of codelets and productivity in computing and “I’m working on hardware/ phase in a computation provide could remove much of the substantial community service software codesign at a very capacity ratios that are invariant mystery related to sequential, or mentoring contributions, and comprehensive level, considering across hardware changes. This vector, parallel, and energy-aware includes a $5,000 honorarium. system cost, performance, and leads to very fast simulations of compilation. Many trade-offs —Jack Rosenberger

november 2010 | vol. 53 | no. 11 | communications of the acm 21 Get the Knowledge of Experts in the Computing Community from Morgan Kaufmann Publishers

Programming Massively GPU Computing Gems An Introduction to Parallel Processors By Wen-mei Hwu Parallel Programming By David B. Kirk and ISBN: 9780123849885 By Peter Pacheco Wen-mei W. Hwu $74.95 | December 2010 ISBN: 9780123742605 ISBN: 9780123814722 $79.95 | January 2011 $69.95 | January 2010

Analyzing Social Media Smart Things No Code Required Networks with NodeXL By Mike Kuniavsky By Allen Cypher, Mira By Derek Hansen, Ben ISBN: 9780123748997 Dontcheva, Tessa Lau Shneiderman and $39.95 | August 2010 and Jeffrey Nichols Marc A. Smith ISBN: 9780123815415 ISBN: 9780123822291 $49.95 | April 2010 $44.95 | August 2010

Use Code 43446 to save 20% on these or other great MK titles at the NEW mkp.com. Also available at Amazon.com or your favorite online retailer!

20102800_AD_CACMAd_077_1200.indd 1 9/24/10 10:29 AM

COMING SOON! High Performance Computing Programming and Applications

• Presents over 20 algo- rithms in pseudocode

• Offers primers on matrix algebra, probability the- ory, and number theory

Catalog no. C7058 • Reviews the latest and most common • Contains hundreds of programs, experiments, December 2010, c. 248 pp. clustering methods exercises, and illustrations ISBN: 978-1-4200-7705-6 • Includes a DVD with color figures • Offers source code, installation guides, and an $89.95 / £57.99 Catalog no. K10863, November 2010 interactive discussion forum online c. 247 pp., ISBN: 978-1-4398-1678-3 Catalog no. K12068, September 2010 $79.95 / £49.99 888 pp., ISBN: 978-1-4398-4620-9 $99.95 / £49.99 SAVE 20% when you order online at www.crcpress.com and enter Promo Code 933JM Limited time offer: Discount expires 1/31/2011

22 communications of the acm | november 2010 | vol. 53 | no. 11 news

Emerging Technology | doi:10.1145/1839676.1839704 Neil Savage Wide Open Spaces The U.S. Federal Communications Commission’s decision to open frequencies in the broadcast spectrum could enable broadband networks in rural areas, permit smart electric grids, and more.

ewly available frequencies in the broadcast spectrum should increase Internet ac- cess and spur innovation, U.S. Federal Communica- Ntions Commission (FCC) Chairman Julius Genachowski said in opening those frequencies to unlicensed com- mercial use. When TV broadcasters switched to narrower digital channels in 2009, they freed up frequencies below 700 mega- hertz, the so-called “white spaces” be- tween channels. Signals at these fre- quencies can travel about three times as far as the traditional Wi-Fi frequency of 2.4 gigahertz, and easily penetrate buildings and other physical obstacles. The new FCC rules create room for two classes of devices: fixed, high-pow- er ones that transmit at up to 4 watts U.S. Federal Communications Commission Chairman Julius Genachowski. and portable, low-power devices limit- ed to 100 milliwatts. Harold Feld, legal outh, CA. By placing white-space radios years before the appearance of the first director of Public Knowledge, a Wash- at substations, Spectrum Bridge cre- low-power applications. ington, D.C., public interest group, ated a wireless network to keep track Feld believes that, as the utility of says thousands of small wireless Inter- of power usage and simultaneously white spaces becomes apparent, the net service providers in rural areas, un- supplied the town’s residents with FCC will look for more spectrum to re- derserved by broadband connections, wireless broadband. A project in Wilm- lease. “As people think about how you will be quick to take advantage of the ington, NC, provided wireless links to could have radios that are more cogni- range and penetration, to achieve good inaccessible water-quality monitors tive, more sensitive to their spectrum coverage with fewer towers. and traffic-monitoring bridge cam- environments and act accordingly, But that will not happen tomorrow. eras. At a hospital in Logan, OH, where people are going to want to see that “It’s going to take a minimum of 18 concrete walls blocked Wi-Fi and the technology become more widely avail- months to get even the most basic de- building’s structure made cabling dif- able,” he says. vices approved and out there,” Feld says. ficult, Spectrum Bridge created a wire- Srivastava notes that when the FCC Neeraj Srivastava, vice president less network to monitor patients, share made the spectrum now used by Wi-Fi of marketing at Spectrum Bridge, a data, and access security cameras. They available in 1985, the popular wireless wireless networking company in Lake also brought broadband access to rural applications were garage door open- Mary, FL, that works with white spaces, Claudville, VA. ers and baby monitors. Nobody had expects the first enhanced Wi-Fi sys- The 4-watt applications can use ex- thought of Bluetooth, Wi-Fi, or home biz a. tems, using fixed devices, could start isting standards; Spectrum Bridge used networking. “There’s always a third di me l appearing in the first quarter of 2011. a modified WiMAX radio in the Logan class of devices, and those are the un- a i oc Under experimental licenses from the hospital, for example. Meanwhile, IEEE known ones. Those probably have the S ca/

i FCC, Srivastava says Spectrum Bridge is working on a standard for the low- most promise, but I can’t tell you what has run several pilot programs that power devices, comparable to its 802.11 they are,” Srivastava says.

y J.D. Las demonstrate the near-term uses. standard for Wi-Fi. That will determine b h One project provided the connectiv- the design of chips for the smartphones Neil Savage is science and technology writer based in Lowell, MA. ity to deploy smart grid power monitors and laptops that will use them, Srivas- otograp h p across the electrical system in Plym- tava says, so it may be more than two © 2010 ACM 0001-0782/10/1100 $10.00

november 2010 | vol. 53 | no. 11 | communications of the acm 23 viewpoints

doi:10.1145/1839676.1839685 Pablo J. Boczkowski VEconomic and Business Dimensions The Divergent Online News Preferences of Journalists and Readers Reading between the lines of the thematic gap between the supply and demand of online news.

he political body, like the Measuring Divergence sis sought to determine whether an biological one, needs the in Online News event covered in one site was also cov- right combination of nutri- I did not look for the dilemma in on- ered in at least one of the others. ents to function adequately. line news. Once I found it in one place, The high level of similarity in news One such key ingredient is however, I went looking for it else- products could not be explained by Tnews about public affairs that is nec- where and found it everywhere. patterns in the nature of demand. If essary to inform political deliberation For a book on imitation in online anything, consumers seemed to want and encourage educated participa- and offline news,1 I measured the more differentiated news products tion among the citizenry. In most lib- amount of readers’ news choices and and also news marked by a different eral democratic societies, this news is the thematic composition of their thematic composition than what was largely provided by elite news organi- choices. I found a large, double-digit offered to them. These sites shared zations in print, broadcast, and online difference between supply (preferenc- 52% of the events presented on their media. But, at least on the Web, while es of journalists) and demand (prefer- respective top 10 stories. Furthermore, these organizations have supplied this ences of readers). almost 59% of these stories shared by kind of news in considerable quanti- More precisely, I calculated the de- more than one site dealt with political, ties, the demand for news among on- gree of similarity in the events covered business, economic, and international line readers has gravitated toward oth- in the stories the three leading online matters. er kinds of content also provided on news sites in Argentina considered By contrast, the most popular sto- these sites, such as information about the most important ones in any given ries among the readers of these sites weather, sports, crime, gossip, and en- news cycle. This meant collecting each were quite dissimilar: only 36% of the tertainment. That frames an interest- homepage’s first 10 stories counting hard news in the top 10 most viewed ing dilemma for online news, and also from left to right and from the top stories on one site were also among the for society as a whole. down in a grid-like manner. The analy- top 10 most viewed stories in at least

24 communications of the acm | November 2010 | vol. 53 | no. 11 V viewpoints

one of the other two sites, and only 32% line-only, and newspaper parent com- We worried that supply and demand of these stories popular in more than panies, and also different geographic are interdependent: journalists might one site dealt with politics, business, orientations. The first two sites were prioritize certain stories because they economic, and international matters. national-global, while the second two perceive that consumers could find This amounted to a 16 percentage were local. them appealing, and consumers might point gap between the levels of similar- In all cases journalists selected click more often on stories that receive ity of the top news choices of journal- more news about politics, economics, major treatment by journalists. To try ists and consumers, and a 27 percent- business, and international matters to get past these interdependencies, we age point gap between the thematic than readers, who, in turn, were more conducted a second analysis on each composition of these similar choices. interested in topics such as sports, site that excluded the stories that were That initial study begged a follow- weather, entertainment, and crime. selected by both journalists and con- up question: do similar patterns arise On each data collection day, research sumers. Focusing on the stories that elsewhere? A second study showed assistants gathered information on the journalists chose irrespective of their that the mismatch also applies to the top 10 stories selected by journalists level of popularity among consumers, leading, elite media in the U.S. My col- and by consumers, respectively, as op- and those that consumers chose even laborator Limor Peer of Yale University erationalized in the previous study. A though they were not prominently dis- NGER I

ES and I conducted a study that compared comparison of the thematic composi- played on the homepage, would give us

D BI the concurrent news choices of jour- tion of journalists’ and consumers’ top a stronger measure of each group’s in- nalists and consumers of four leading, story preferences per site revealed 13 dependent preference. RAYMON

Y

B U.S.-based sites: CNN, Yahoo News, percentage point gaps on Seattle Post- The results suggest that indepen- ON I Chicago Tribune, and the now-defunct Intelligencer and Yahoo, 14 percentage dent from the influence of each other, Seattle Post-Intelligencer.3 We chose point gaps on CNN, and 17 percentage the choices of journalists and con- USTRAT

ILL these sites to represent broadcast, on- point gaps on Chicago Tribune. sumers would follow strikingly diver-

november 2010 | vol. 53 | no. 11 | communications of the acm 25 viewpoints

gent trajectories. The gaps between the course and the nature of consum- the thematic composition of the top er preferences does not change (and stories selected by journalists and It is unlikely that there is no reason to suspect it might), consumers increased by an average of the mismatch the mismatch between supply and de- 19 percentage points. mand will further erode their econom- Does the divergence between supply between supply ic sustainability. If they change course and demand depend on geography or and demand of news and give consumers more of what they ideology? Results from a third study, want, they will likely pay the price of undertaken in collaboration with in the elite media becoming a different kind of news or- Northwestern University graduate stu- began with the Web. ganization and having to compete in dents Eugenia Mitchelstein and Mar- an already crowded space of “populist tin Walter, show that the mismatch is media.” Either way, the future does not a widespread phenomenon that cuts bode well for them. across media from countries and re- The potential implications of these gions with disparate histories, cultural trends for democracy are not encour- makeup, and ideological positions.2 choices was 21 percentage points, on aging either. As noted earlier, the lead- This study deployed the design centrist/liberal sites this difference was ing, elite news organizations are major of the second study to examine 11 19 percentage points. contributors of the kind of informa- sites in Western Europe and Latin tion about political, economic, and America: The Guardian and The Times The Future of Media and Democracy international matters that is essential in the United Kingdom; El País and It is unlikely that the mismatch be- for well-informed democratic delibera- El Mundo in Spain; Die Welt and Der tween supply and demand of news in tion and participation. This informa- Tagesspiegel in Germany; La Reforma the elite media began with the Web. tion is much more difficult to find in and El Universal in Mexico; Clarín As the noted sociologist and former other sources such as tabloid media and La Nación in Argentina; and Fol- journalist Robert Park wrote many and even blogs (which largely amplify ha de Sao Paulo in Brazil (there was decades ago, “The things which most the information that originates in elite only one Brazilian site because data of us would like to publish are not the news organizations). from a second comparable site was things that most of us want to read. We Society’s appetite for informa- not publicly available). All the sites may be eager to get into print what is, tion might get satiated by news about were the online presence of leading or seems to be, edifying, but we want weather, sports, crime, gossip, and generalist, mainstream, and elite to read what is interesting.” But the entertainment. But the contributions newspapers with national reach in strong market position of these elite of these symbolic nutrients for the their respective countries. Moreover, media meant that because advertisers healthy functioning of the body politic in the five countries from which two had to go through them to reach po- will surely be lacking. If, in part, we are news sites were sampled, the pairs tential consumers, journalists could what we eat, we should be aware that had somewhat divergent ideological get away with fulfilling their sense of we also are the news that we consume. outlooks—either conservative and civic duty by disseminating “edify- And when the supply and demand of centrist or conservative and liberal. ing” news despite their limited appeal online news does not meet, it is not Once again, there was a sizable the- among the general public. just elite media organizations that matic gap between the supply and de- But in the highly competitive con- might suffer, but also all of us. mand of online news, with the journal- temporary media environment, few ists leaning more toward stories about news organizations enjoy the kind of References 1. Boczkowski, P. News at Work: Imitation in an Age of politics, business, economics, and in- natural monopoly or oligopoly position Information Abundance. University of Chicago Press, ternational matters than readers. The that newspapers and television net- Chicago, IL, 2010. 2. Boczkowski, P., Mitchelstein, E., and Walter, M. (in differences between the top news choic- works had in the past. Perhaps, none press). Convergence across divergence: Understanding es of journalists and consumers ranged do. Of all media markets, the Web is the gap in the online news choices of journalists and consumers in Western Europe and Latin America. from 30 percentage points in The Guard- the most competitive one because of Communication Research. ian to 9 percentage points in Clarín, with low geographic and distribution barri- 3. Boczkowski, P. and Peer, L. (in press). The choice gap: The divergent online news preferences of journalists an average of 19 percentage points. In ers and the very high number of players. and consumers. Journal of Communication. addition, there were no major patterns In addition, the Web enables orga- of variance in this gap by geographical nizations to automatically track the Pablo J. Boczkowski ([email protected]) is a number of clicks garnered by each professor in the Department of Communication Studies at region or ideological preference. First, Northwestern University and also, during the 2010–2011 journalists choose news about politics, story. This has meant that personnel academic year, a visiting scholar at the University of Chicago Booth School of Business. He is the author of Digitizing economics, business, and internation- at elite online news sites are deeply the News: Innovation in Online Newspapers (MIT Press, al matters 20 percentage points more aware of the extent to which supply 2004) and News at Work: Imitation in an Age of Information often than readers in Western Europe and demand don’t meet. They must Abundance (University of Chicago Press, 2010). and 19 percentage points more often confront the dilemma introduced I would like to thank Shane Greenstein for the invitation to in Latin America. Second, while on con- at the beginning of this column on a write this column and Shane and Eugenia Mitchelstein for servative sites the thematic difference daily basis. feedback on earlier versions. between journalists’ and consumers’ What should they do? If they stay Copyright held by author.

26 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints

Vdoi:10.1145/1839676.1839686 Stephen Cooper, Lance C. Pérez, and Daphne Rainey Education K–12 Computational Learning Enhancing student learning and understanding by combining theories of learning with the computer’s unique attributes.

n “Computational Thinking,”14 Jeannette Wing struck a chord that has resonated strongly (generating positive as well as negative responses) with many Icomputer scientists and non-comput- er scientists. Wing has subsequently defined computational thinking as the process of abstraction,15 guided by various engineering-type concerns including efficiency, correctness, and several software engineering “-ilities” (maintainability, usability, modifi- ability, and so forth). Some have inter- preted computational thinking as an attempt to capture the set of computer science skills essential to virtually ev- ery person in a technological society, while others view it as a new descrip- Berkeley public elementary school students on a field trip learn how computers enable tion of the fundamental discipline science. that represents computer science and its intersection with other fields. The to add a course, one must also propose other K–12 subjects would take advan- National Academies report1 captures a course to be removed. tage of school children who had been both of these views, as well as present- 2. Even as an elective topic, comput- trained in computational thinking. ing others. er science tends to be disproportionate- 5. The most common definitions of While we can live with such defini- ly available to those wealthy suburban computational thinking are confusing tions/descriptions in the higher educa- schools. Margolis et al.6 explore this when explained to non-computer scien-

b tion arena, we struggle with these no- situation in depth within the urban Los tists. And many K–12 computing teach- La l tions of computational thinking in the Angeles school district. ers are not computer scientists. at’ N K–12 arena (note that we primarily con- 3. Too few K–12 computing teachers We find the above definitions of ey l e k sider K–12 education within the U.S.). are available to implement a national- computational thinking not espe- Several concerns spring to mind: scale computing requirement. CUNY’s cially useful when considered in the 2

rence Ber 1. Computer science does not ap- ambitious 10,000 teacher project will context of K–12 education, and, more w pear within the core topics covered in not produce sufficient numbers of com- specifically, K–12 science, technol- t, La id

m high school. We would have a tough puting teachers required to instruct all ogy, engineering, and mathematics h

tsc time justifying a computer science schoolchildren in the U.S. It would not (STEM) education. We propose an al- l course, even the “great ideas” AP Princi- even get one qualified teacher into each ternative to the common definition oy Ka R y ples course (being developed as part of of the nation’s 30,000+ high schools of computational thinking we believe b h Denning4) replacing Algebra 2, Biology, (see http://nces.ed.gov/programs/di- is appropriate for operationalization or American Government. K–12 educa- gest/d09/tables/dt09_086.asp). in K–12 education and consider its otograp

Ph tion is a zero-sum game. If one wishes 4. It is not clear to us how teachers in broader implications.

november 2010 | vol. 53 | no. 11 | communications of the acm 27 viewpoints

A K–12 View of two strengths of the computer, namely, requiring interaction between the com- Computational Thinking their ability to present complex data puter and the human. We have struggled with how computa- sets, often visually, and their capacity ˲˲ In computational learning, the tional thinking might be different from for storing factual and relational knowl- computer can compensate for a hu- mathematical thinking, algorithmic edge. These four elements frame and man’s lack of factual and relational thinking, quantitative reasoning, de- establish the boundaries of the iterative knowledge and mathematical and sci- sign thinking, and several other models interaction between the human being entific sophistication. of math, science, and even engineering and the computer. Note that the accom- ˲˲ The computer’s ability to quickly to critical thinking and problem solv- panying figure does not explicitly in- compute multiple examples and pres- ing. It was after struggling with the lat- clude a teacher, not because we believe ent them via a modality appropriate to est type of thinking that we realized that teachers are unnecessary, but rather a human’s current development stage perhaps even the term “computational because the role of the teacher in this and level of meta-cognitive awareness thinking” was misleading (from a K–12 model is complex and requires further can leverage the human’s inherent, perspective), and we were approaching investigation. though perhaps not fully conscious, ca- the definition incorrectly. Rather than In developing this model, we ob- pacity for abstraction. considering computational thinking as served that it includes other extant This model for computational learn- a part of the process for problem solv- models in scientific learning and in- ing differs significantly from other pro- ing, we instead developed a model of quiry. For example, one can view com- posed notions of computational think- computational learning that empha- putational science as the interaction ing. For example, algorithmic thinking sizes the central role that a computer between the human and the computer does not require a computer and math- (and possibly its abstraction) can play in that is contained within the box where ematical thinking is almost solely de- enhancing the learning process and im- a human being formulates a problem pendent on the human’s formalization proving achievement of K–12 students and provides a representation suitable capacity for abstraction.2 in STEM and other courses. The figure for a computer. The computer then acts To better understand how our pro- here depicts our current working model on this representation and returns the posed computational learning model of computational learning. It should be results of these actions to the human can be operationalized, we present noted that this model is explicit in its being through, for example, a visual two examples: one from middle school use of a computer and specifically ex- representation. Computational learn- computing and one from high school cludes non-cognitive uses of technology ing expands this interaction by allow- biology. (Powerpoint, wikis, blogs, clickers, and ing the computer to add foundational so forth). knowledge, not just data, unknown to Digitizing Data Similar to Wing’s original vision of the human and by having the results of Cutler and Hutton3 modified a CS Un- computational thinking, we see com- the computer’s actions represented in plugged activity on image represen- putational learning as an iterative and a form compatible with the human’s tation (see http://csunplugged.org/ interactive process between the hu- current capacity for abstraction. In the sites/default/files/activity_pdfs_full/ man (the K–12 student in our case) and more interesting instances of computa- unplugged-02-image_representation. the computer (or, in a more theoretical tional learning, both of these processes pdf) to enable middle school students construct, a model of computation). are likely to be adaptive and personal- to work interactively with a computer We also make explicit the two conse- ized to the individual. program as they learn about how com- quences of the human cognitive pro- We make several observations about puters digitize images. The purpose of cess, namely, the capacity for abstrac- computational learning: these activities is to help students to un- tion and for problem formulation, and ˲˲ Computational learning is iterative, derstand what it means for a computer to digitally represent an image. More A view of computational learning. importantly, the students learn to move from concrete representations of im- ages to more abstract representations Capacity for Abstraction of those images (as a digital represen- tation), and from representation of im- Computational Science ages in 2D to representing objects in 3D. Computer Human And this ability to abstract is important

Memory Visualization Working Memory across all STEM disciplines. Students work interactively with the Computation Problem Knowledge computer program, receiving feedback Formulation to their attempts at digitizing their data. Over the series of lessons, they develop an initial ability to abstract away from the physical representation of an im- External Knowledge age to its digital representation. They are then further able to develop their ability to abstract as they move from 2D

28 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints images to 3D objects. Finally, students Other (though the computer obviously develop a further ability to abstract. As cannot think at a higher level than the part of the creation of a single 3D object, This model for student),11 or even fits within Newell say a chair, the students are then chal- computational and Simon’s Information Processing lenged to place many chairs in a room. Theory framework.8 Our hope is that by They need to be able to recognize that learning differs considering our model of computation- representing a 3D chair consists of two significantly from al learning, we can better educate and parts: the relative coordinates of each of prepare teachers to benefit from com- the parts of the chair, and the absolute other proposed puting in and outside the classroom, location of one part (say the bottom- notions of and that approaches and computing right corner of the front-right leg). tools can be identified and built to im- computational prove K–12 student STEM learning. Evolutionary Biology thinking. The second example involves the teach- References 1. committee for the Workshops on Computational ing of evolution using computational Thinking, National Research Council. Report of a learning. Our vision is of a 3D visualiza- Workshop on the Scope and Nature of Computational Thinking. National Academies Press, Washington, D.C., tion system that could simulate evolu- 2010. tion. A student could specify an organ- 2. cuny, J. Finding 10,000 teachers. CSTA Voice, 5, 6 9,10,13 (2010), 1–2; http://www.csta.acm.org/Communications/ ism, with primitive appendages (arms, behavior (see ). sub/CSTAVoice_Files/csta_voice_01_2010.pdf legs, joints, and other attributes) to ac- While we know of no tool that pro- 3. cutler, R. and Hutton, M. Digitizing data: Computational thinking for middle school students through computer complish locomotion. Then, by provid- vides the exact support/simulation we graphics. In Proceedings of the 31st Annual Conference ing an environment, the student could are describing, there are several avail- of the European Association for Computer Graphics EG 2010—Education Papers. (Norrköping, Sweden, May run the simulation to watch how the able visualization systems that can 2010), 17–24. organism’s ability to move evolves over simulate/model the world. Two of these 4. Denning, P. Beyond computational thinking. Commun. ACM 52, 6 (June 2009), 28–30. time as a function of its current loco- systems have helped to shape our vision 5. gilbert, J.K., Justi, R., and Aksela, M. The visualization motion capability coupled with the im- of the above-mentioned simulation: of models: A metacognitive competence in the learning of chemistry. Paper presented at the 4th Annual pact of that organism’s environment. The 3D visualization system, Fram- Meeting of the European Science Education Research sticks (www.framsticks.com) can be Association, Noordwijkerhout, The Netherlands, 2003. Students could change the appendag- 6. margolis, J. et al. Stuck in the Shallow End: Education, es and/or the environment to observe used for modeling evolution, and the Race, and Computing. The MIT Press, 2008. 7. national Science Foundation. Molecular visualization how such changes lead to a difference 2D simulation system, NetLogo (http:// in science education. Report from the molecular in the organism’s evolution over time. ccl.northwestern.edu/netlogo/) has visualization in science education workshop. NCSA access center, National Science Foundation, Arlington, In computer science terms, this exam- many available pre-built simulations, VA, 2001. ple is similar to passing a program and including those that model evolution 8. newell, A., and Simon, H.A. Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ, 1972. an initial state as input to a Universal albeit in a different manner than what 9. rotbain, Y., Marbach-Ad, G., and Stavy, R. Using a Turing Machine. we describe. computer animation to teach high school molecular biology. Journal of Science Education and Technology Such a simulation allows the stu- 17 (2008), 49–58. dent to work interactively with the Conclusion 10. sewell R., Stevens, R., and Lewis, D. Multimedia computer technology as a tool for teaching and computer program. The student learns Most papers we’ve seen on compu- assessment of biological science. Journal of Biological both from the impact of the changes tational thinking represent attempts Education 29 (1995), 27–32. 11. Vygotsky, L.S. Mind and Society: The Development of she makes to the initial configuration at repackaging computing science Higher Mental Processes. Harvard University Press, of the organism and to the initial en- concepts, especially in the form of al- Cambridge, MA, 1978. 12. Williamson V.M. and Abraham, M.R. The effects of vironment (which will lead to the or- gorithmic thinking and introductory computer animation on the particulate mental models ganism evolving the ability to move of college chemistry students. Journal of Research in programming, sometimes in other do- Science Teaching 32 (1995), 522–534. differently) as well as by the ability to mains. Though this may be useful in 13. Windschitl, M.A. A practical guide for incorporating computer-based simulations into science instruction. observe the simulation/visualization some contexts, it is unlikely such a sim- The American Biology Teacher 60 (1998), 92–97. as it is running. In science, researchers ple approach will have significant im- 14. Wing, J.M. Computational thinking. Commun. ACM 49, 3 (Mar. 2006), 33–35. have found that visualization is central pact on student learning—of computer 15. Wing, J.M. Computational thinking and thinking about to increasing conceptual understand- science or other disciplines—in the computing. Philosophical Transactions of the Royal ing and prompting the formation of K–12 setting. The proposed model of Society A, 366 (2008), 3717–3725. dynamic mental models of particulate computational learning combines the- 5,7,9,12). Vi- Stephen Cooper ([email protected]) is an associate matter and processes (see ories of learning with the computer’s professor (teaching) in the Computer Science Department sualization and computer interaction superiority in dealing with complexity at Stanford University. through animation allow students to and variability and its ability to present Lance C. Pérez ([email protected]) is an associate professor engage more in the cognitive process, results using modalities that appeal to in the Department of Electrical Engineering at the University of Nebraska-Lincoln where he also holds the and to select and organize more rele- the learner in order to enhance student position of Associate Vice Chancellor for Academic Affairs. vant information for problem-solving.7 learning and understanding. We believe Daphne Rainey ([email protected]) is a biologist Computer animations incorporated that computational learning can be currently serving in a temporary assignment as a program into interactive simulations offer the framed within various theories of learn- director at NSF’s Division of Undergraduate Education within its Education and Human Resources Directorate. user a chance to manipulate variables ing, where the computer plays a similar to observe the effect on the system’s role as Vygotsky’s More Knowledgeable Copyright held by author.

november 2010 | vol. 53 | no. 11 | communications of the acm 29 viewpoints

Vdoi:10.1145/1839676.1839687 Pamela Samuelson Legally Speaking Why Do Software Startups Patent (or Not)? Assessing the controversial results of a recent empirical study of the role of intellectual property in software startups.

wo-thirds of the approxi- no more than 10 years old before the tween 10%–15% of the software startup mately 700 software entre- survey was conducted. We drew our respondents among the D&B respon- preneurs who participated sample from a general population of dents were venture-backed firms. in the 2008 Berkeley Patent high-tech firms registered with Dun Among the software respondents, only Survey report that they nei- & Bradstreet (D&B) and from the Ven- 2% had experienced an initial public Tther have nor are seeking patents for tureXpert (VX) database that has a rich offering (IPO), while 9% had been ac- innovations embodied in their prod- data set on venture-backed startups. quired by another firm. ucts and services. These entrepre- (Just over 500 of the survey software re- Our interest in conducting this sur- neurs rate patents as the least impor- spondents were D&B firms; just under vey arose because high-technology tant mechanism among seven options 200 respondents were VX firms.) entrepreneurs have contributed sig- for attaining competitive advantage. Eighty percent of the software re- nificantly to economic growth in recent Even software startups that hold pat- spondents were either the CEOs or decades. They build firms that create ents regard them as providing only a CTOs of their firms, and most had new products, services, organizations, slight incentive to innovate. experience in previous startups. The and opportunities for complementary These are three of the most striking average software firm had 58 employ- economic activities. We were curious findings from a recently published ar- ees, half of whom were engineers. Be- to know the extent to which high-tech ticle, “High Technology Entrepreneurs and the Patent System: Results of the Measures of capturing “competitive advantage” from inventions. 2008 Berkeley Patent Survey.”1 After

providing some background about the How important or unimportant is each of the following in your company’s software ability to capture competitive advantage from its technology inventions? survey, I will discuss some key findings medical Devices about how software startup firms use Biotechnology and are affected by the patent system. First-mover While the three findings highlighted advantage above might seem to support a software Secrecy patent abolitionist position, it is sig- nificant that one-third of the software Patents entrepreneur respondents reported having or seeking patents, and that they Copyright perceive patents to be important to per- Trademark sons or firms from whom they hope to Reverse obtain financing. engineering

Complementary Some Background on the Survey assets More than 1,300 high-technology en- Not important Slightly Moderately Very trepreneurs in the software, biotech- at all important important important nology, medical devices, and computer Importance hardware fields completed the Berkeley Patent Survey. All of these firms were

30 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints

startups were utilizing the patent sys- tem, as well as to learn their reasons Calendar for choosing to avail themselves of the It is an article of patent system—or not. faith among many of Events The basic economic principle un- derlying the patent system is that IP lawyers that November 17–19 technology innovations are often ex- patents provide Asian Internet Engineering pensive, time-consuming, and risky Conference, significant incentives Bangkok, Thailand, V to develop, although once developed, Contact: Kanchanasut these innovations are often inexpen- for firms to engage Kanchana, sive and easy to copy; in the absence of Email: [email protected] intellectual property rights (IPRs), in- in R&D and develop November 17–19 novative high-tech firms may have in- new products. Advances in Computer sufficient incentives to invest in inno- Entertainment Technology vation insofar as they cannot recoup Conference, Taipei, Taiwan, their research and development (R&D) Contact: Duh Henry B.L., expenses and justify further invest- Email: [email protected] ments in innovation because of cheap copies that undermine the firms’ re- that the overall rate of software startup November 22–23 Conference on Decision and coupment strategy. patenting is lower than this given that Game Theory for Security, Although this economic principle ap- patent-holders may have been more Berlin, Germany, plies to all companies, early-stage tech- likely than non-patent-holders to take Contact: Tansu Alpcan, nology firms might, we conjectured, time to fill out a Berkeley Patent Survey. Email: [email protected] berlin.de be more sensitive to IPRs than more In striking contrast to the D&B re- mature firms. The former often lack spondents, over two-thirds of the VX November 22–24 various kinds of complementary assets software startup respondents in the The 17th ACM Symposium on (such as well-defined marketing chan- sample, all venture-backed, had or were Virtual Reality Software and Technology, nels and access to cheap credit) that the seeking patents. We cannot say why Hong Kong, latter are more likely to enjoy. We de- these venture-backed firms were more Contact: George Baciu, cided it would be worthwhile to test this likely to seek patents than other firms. Email: [email protected]. conjecture empirically. With generous Perhaps venture capitalists (VCs) are edu.hk funding from the Ewing Marion Kauff- urging firms they fund to seek patents; December 1–3 man Foundation, three colleagues and I or VCs may be choosing to fund the 9th International Conference designed and carried out the survey and development of software technologies on Mobile and Ubiquitous Multimedia, have begun analyzing the results. that VCs think are more amenable to Limassol, Cyprus, patenting. Contact: Angelides Marios, Why Startups Patent Interestingly, the rate of patenting Email: marios.angelides@ The most important reasons for seek- did not vary by the age of the firm (that brunel.ac.uk ing patents, as reported by software ex- is, older firms did not patent more than December 4–8 ecutives who responded to the Berkeley younger firms). The 43rd Annual IEEE/ACM Patent Survey, were these: to prevent International Symposium on competitors from copying the innova- Why Forego Patenting? Microarchitecture, Atlanta, GA, tion (2.3 on a 4 point scale, where 2 was The survey asked two questions about Sponsored: SIGMICRO, moderately important), to enhance the decisions to forego patenting: For the Contact: Sudhakar firms’ reputation (2.2), and to secure in- last innovation for which the firm chose Yalamanchili, vestment and improve the likelihood of not to seek a patent, what factors influ- Email: [email protected] an IPO (1.96 and 1.97 respectively). enced this decision, and what was the December 5–8 The importance of patents to inves- most important factor in the decision? Winter Simulation Conference, tors was also evident from survey data The costs of obtaining and of en- Baltimore, Maryland, showing striking differences in the rate forcing patents emerged as the first Sponsored: SIGSIM , Contact: Joe Hugan, of patenting among the VX and the D&B and second most frequent explana- Email: [email protected] software companies. tion. Twenty-eight percent of the soft- Three-quarters of the D&B firms had ware startup executives reported that December 12–13 no patents and were not seeking them. the costs of obtaining patents had Virtual Reality Continuum and its Applications in Industry, Because the D&B firms are, we believe, been the most important factor in this Seoul, Republic of Korea, fairly typical of the population of soft- decision, and 12% said that the costs of Sponsored: SIGGRAPH , ware startup firms in the U.S., their re- enforcing patents was the most impor- Contact: Hyunseung Yang, Email: [email protected] sponses may well be representative of tant factor. (They reported that average patenting rates among software start- cost of getting a software patent was ups generally. It is, in fact, possible just under $30,000.)

november 2010 | vol. 53 | no. 11 | communications of the acm 31 viewpoints

Ease of inventing around the innova- What Incentive Effects also expect, as we did, that high-tech tion and satisfaction with secrecy also Do Patents Have? startup companies would regard pat- influenced software startup decisions The Berkeley Patent survey asked start- ents as more important as an induce- not to seek patents, although only rarely up executives to rate the incentive ef- ment to innovation than large firms, were these factors considered the most fects of patents on a scale, where 0 = no given that the latter have lots of other important. incentive, 1 = weak incentive, 2 = moder- assets for achieving and maintaining Intriguingly, more than 40% of the ate incentive, and 3 = strong incentive, success in the marketplace. software respondents cited the unpat- for engaging in four types of innovation: Anecdotes highlighting the impor- entability of the invention as a factor in (1) inventing new products, processes, tance of patents to high-tech entrepre- decisions to forego patenting. Almost a or services, (2) conducting initial R&D, neurs are relatively easy to find. Because quarter of them rated this as the most (3) creating internal tools or processes, data from the Berkeley Patent Survey important factor. Indeed, unpatentabil- and (4) undertaking the risks and costs suggests that software entrepreneurs ity ranked just behind costs of obtain- of commercializing the innovation. regard patents as quite unimportant, ing patents as the most frequently cited We were surprised to discover the the reaction of some prominent patent “most important factor” for not seeking software respondents reported that pat- lawyers to our article about the survey patents. ents provide only weak incentives for has been sharply negative. We believe, It is difficult to know what to make of engaging in core activities, such as in- however, that our analysis is sound and the unpatentability finding. One expla- vention of new products (0.96) and com- these critiques are off-base. We encour- nation may be that the software respon- mercialization (0.93). By contrast, bio- age readers to read the full article and dents believed that patent standards of tech and medical device firms reported make their own judgments. novelty, non-obviousness, and the like just above 2 (moderate incentives) for are so rigorous that their innovation these same questions. Future Research might not have satisfied patent require- Interestingly, the results did not Over the next several years, the co-authors ments. Yet, because the patentability of change significantly when considering of the Berkeley Patent Survey article ex- software innovations has been conten- only responses from software entrepre- pect to analyze further data from this tious for decades, it may also be that a neurs whose firms hold at least one pat- survey and to report new findings. We will significant number of these entrepre- ent or application. Even patent-holding look more closely, for example, at differ- neurs have philosophical or practical software entrepreneurs reported that ences in patenting rates among those in objections to patents in their field. patents provide just above a weak incen- different sectors of the software industry tive for engaging in these innovation-re- and differences between patent holders How Important Are Patents to lated activities. and non-patent holders. We know already Competitive Advantage? that product innovators seek patents One of the most striking findings of our Resolving a Paradox more often than process innovators. study is that software firms ranked pat- If patents provide only weak incen- The findings reported here suggest ents dead last among seven strategies tives for investing in innovation among that software entrepreneurs do not find for attaining competitive advantage, as software startups, why did two-thirds persuasive the canonical story that pat- the accompanying figure shows. (The of the VX respondents and at least ents provide strong incentives to engage relative unimportance of patents for one-quarter of the D&B respondents in technology innovation. These execu- competitive advantage in the software seeking patents? The answer may lie tives regard first-mover advantage and field contrasts sharply with the per- in the perception among software en- complementary assets as more impor- ceived importance of patents in the bio- trepreneurs that patents may be im- tant than IPRs in conferring competitive tech industry, where patents are ranked portant to potential funders, such as advantage upon their firms. Moreover, the most important means of attaining VCs, angel investors, other firms, com- among IPRs, copyrights and trademarks such advantage.) mercial banks, and friends and fam- are perceived to be more important than As shown in the figure on page 30, ily. Sixty percent of software startup patents. Still, about one-third of the software startups regard first-mover respondents who had negotiated with software entrepreneur respondents re- advantage as the single most impor- VCs reported that they perceived VC ported having or seeking patents, and tant strategy for attaining competitive decisions about whether to make the their perception that their investors care advantage. The next most important investments to be affected by patents. about patents seems to be a key factor in strategy was complementary assets (for Between 40% and 50% of the software decisions to obtain patents. example, providing services for licensed respondents reported that patents software or offering a proprietary com- were important to other types of inves- Reference 1. graham, S.J.H., Merges, R.P., Samuelson, P., and plement to an open source program). tors, such as angels, investment banks, Sichelman, T. High technology entrepreneurs and the patent system: Results of the 2008 Berkeley patent Among IPRs, copyrights and trade- and other companies. survey. Berkeley Technology Law Journal 25, 4 (2010), marks—closely followed by secrecy 1255–1327; http://papers.ssrn.com/sol3/papers. and difficulties of reverse engineer- Controversy Over Survey Findings cfm?abstract_id=1429049.

ing—outranked patents as means of at- It is an article of faith among many IP Pamela Samuelson ([email protected]) is the taining competitive advantage among lawyers that patents provide significant Richard M. Sherman Distinguished Professor of Law and software respondents by a statistically incentives for firms to engage in R&D Information at the University of California, Berkeley. significant margin. and develop new products. Most would Copyright held by author.

32 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints

Vdoi:10.1145/1839676.1839688 Joel F. Brenner Privacy and Security Why Isn’t Cyberspace More Secure? Evaluating governmental actions—and inactions—toward improving cyber security and addressing future challenges.

n cyberspace it’s easy to get away with criminal fraud, easy to steal corporate intellectual property, and easy to pene- trate governmental networks. ILast spring the new Commander of USCYBERCOM, NSA’s General Keith Alexander, acknowledged for the first time that even U.S. classified networks have been penetrated.2 Not only do we fail to catch most fraud artists, IP thieves, and cyber spies—we don’t even know who most of them are. Yet every significant public and private activity—economic, social, govern- mental, military—depends on the se- curity of electronic systems. Why has so little happened in 20 years to alter the fundamental vulnerability of these systems? If you’re sure this insecurity is either (a) a hoax or (b) a highly desir- able form of anarchy, you can skip the rest of this column. Presidential Directives to Fix This Problem emerge dramatically like clockwork from the White House echo chamber, chronicling a history of ex- ecutive torpor. One of the following statements was made in a report to “The architecture of the Nation’s which is which.a In between, for the President Obama in 2009, the other by digital infrastructure, based largely on sake of nonpartisan continuity, Presi- President George H.W. Bush in 1990. the Internet, is not secure or resilient. Guess which is which: Without major advances in the security a The first quotation is from President G.H.W. Bush’s National Security Directive 42, July 5, “Telecommunications and informa- of these systems or significant change

NSON 1990, redacted for public release, April 1, 1992; H O http://www.fas.org/irp/offdocs/nsd/nsd_42.

J tion processing systems are highly sus- in how they are constructed or operat- A

LI ceptible to interception, unauthorized ed, it is doubtful that the United States htm. The second quotation is from the pref- CE

Y ace to “Cyberspace Policy Review: Assuring a

B electronic access, and related forms of can protect itself from the growing

ON Trusted and Resilient Information and Com- I technical exploitation, as well as other threat of cybercrime and state-spon- munications Infrastructure,” May 2009; http:// dimensions of the foreign intelligence sored intrusions and operations.”

USTRAT www.whitehouse.gov/assets/documents/Cyber-

ILL threat.” Actually, it doesn’t much matter space_Policy_Review_final.pdf.

november 2010 | vol. 53 | no. 11 | communications of the acm 33 viewpoints

dent Clinton warned of the insecurities Liability presumably makes these com- created by cyber-based systems and di- panies somewhat more vigilant in their rected in 1998 that “no later than five Deciding what level business practices, but it doesn’t make years from today the United States shall of imperfection is hardware and software more secure. have achieved and shall maintain the Many major banks and other com- ability to protect the nation’s critical acceptable is not panies already know they have been infrastructures from intentional acts a task you want persistently penetrated by highly that would significantly diminish” our skilled, stealthy, and anonymous ad- security.6 Five years later would have your Congressional versaries, very likely including foreign been 2003. representative intelligence services and their sur- In 2003, as if in a repeat perfor- rogates. These firms spend millions mance of a bad play, the second Presi- to perform. fending off attacks and cleaning their dent Bush stated that his cybersecurity systems, yet no forensic expert can objectives were to “[p]revent cyber at- honestly tell them that all advanced tacks against America’s critical infra- persistent intrusions have been de- structure; [r]educe national vulner- feated. (If you have an expert who will ability to cyber attacks; and [m]inimize the damages, say, from finding your say so, fire him right away.) damage and recovery time from cyber computer is an enslaved member of a In an effective liability regime, in- attacks that do occur.”7 botnet run out of Russia or Ukraine? surers play an important role in raising These Presidential pronouncements And how do you prove the problem was standards because they tie premiums will be of interest chiefly to historians caused by the software rather than your to good practices. Good automobile and to Congressional investigators who, own sloppy online behavior? drivers, for example, pay less for car in the aftermath of a disaster that we Asking Congress to make software insurance. Without a liability dynamic, can only hope will be relatively minor, manufacturers liable for defects would however, insurers play virtually no role will be shocked, shocked to learn that be asking for trouble: All software is in raising cyber security standards. the nation was electronically naked. defective, because it’s so astoundingly If liability hasn’t made cyberspace Current efforts in Washington to complicated that even the best of it more secure, what about market de- deal with cyber insecurity are promis- hides surprises. Deciding what level mand? The simple answer is that the ing—but so was Sisyphus’ fourth or of imperfection is acceptable is not consuming public buys on price and fifth trip up the hill. These efforts are a task you want your Congressional has not been willing to pay for more moving at a bureaucratically feverish representative to perform. Any such secure software. In some cases the af- pitch—which is to say, slowly—and legislation would probably drive some termath of identity theft is an ordeal. so far they have produced nothing creative developers out of the market. In most instances of credit card fraud, but more declarations of urgency and It would also slow down software devel- however, the bank absorbs 100% of the more paper. Why? opment—which would not be all bad if loss, so their customers have little in- it led to higher security. But the general centive to spend more for security. (In Lawsuits and Markets public has little or no understanding of Britain, where the customer rather than Change in the U.S. is driven by three the vulnerabilities inherent in poorly the bank usually pays, the situation is ar- things: liability, market demand, and developed applications. On the con- guably worse because banks are in a bet- regulatory (usually federal) action. The trary, the public clamors for rapidly ter position than customers to impose role and weight of these factors vary in developed apps with lots of bells and higher security requirements.) Most other countries, but the U.S. experience whistles, so an equipment vendor that companies also buy on price, especially may nevertheless be instructive trans- wants to control this proliferation of in the current economic downturn. nationally since most of the world’s in- vulnerabilities in the name of security Unfortunately we don’t know wheth- tellectual property is stored in the U.S., is in a difficult position. er consumers or corporate custom- and the rest of the world perceives U.S. Banks, merchants, and other hold- ers would pay more for security if they networks as more secure than we do.4 So ers of personal information do face lia- knew the relative insecurities of the let’s examine each of these three factors. bility for data breaches, and some have products on the market. As J. Alex Hal- Liability has been a virtually nonex- paid substantial sums for data losses derman of the University of Michigan istent factor in achieving greater Inter- under state and federal statutes grant- recently noted, “most customers don’t net security. This may be surprising un- ing liquidated damages for breaches. have enough information to accurately til you ask: Liability for what, and who In one of the best known cases, Heart- gauge software quality, so secure soft- should bear it? Software licenses are land Payments Systems may end up ware and insecure software tend to sell enforceable, whether shrink-wrapped paying approximately $100 million as a for about the same price.”3 This could or negotiated, and they nearly always result of a major breach, not to mention be fixed, but doing so would require limit the manufacturer’s liability to millions more in legal fees. But the de- agreed metrics for judging products the cost of the software. So suing the fendants in such cases are buyers, not and either the systematic disclosure of software manufacturer for allegedly makers and designers, of the hardware insecurities or a widely accepted test- lousy security would not be worth the and software whose deficiencies create ing and evaluation service that enjoyed money and effort expended. What are many (but not all) cyber insecurities. the public’s confidence. Consumer Re-

34 communications of the acm | november 2010 | vol. 53 | no. 11 viewpoints ports plays this role for automobiles disclose in the “Risk Factors” section ered for nine months over the package and many other consumer products, of their prospectuses whether the com- of excellent recommendations put on and it wields enormous power. The mand-and-control features of their his desk by a nonpolitical team of civil same day Consumer Reports issued a SCADA networks are connected to the servants from several departments “Don’t buy” recommendation for the Internet or other publicly accessible and agencies. The Administration’s 2010 Lexus GX 460, Toyota took the network. Issuers would scream about lack of interest was palpable; its hands vehicle off the market. If the engineer- this, even though a recent McAfee are full with a war, health care, and a ing and computer science professions study plainly indicates that many of bad economy. In difficult economic could organize a software security lab- them that do follow this risky practice times the President naturally prefers oratory along the lines of Consumer Re- think it creates an “unresolved security invisible risk to visible expense and is ports, it would be a public service. issue.”1 SCADA networks were built for understandably reluctant to increase isolated, limited access systems. Al- costs for business. In the best of times Federal Action lowing them to be controlled via pub- cross-departmental (or cross-ministe- Absent market- or liability-driven im- lic networks is rash. This point was rial) governance would be extremely provement, there are eight steps the driven home forcefully this summer difficult—and not just in the U.S. Do- U.S. federal government could take to by discovery of the “Stuxnet” computer ing it well requires an interdepartmen- improve Internet security, and none of worm, which was specifically designed tal organ of directive power that can them would involve creating a new bu- to attack SCADA systems.4 Yet public muscle entrenched and often parochi- reaucracy or intrusive regulation: utilities show no sign of ramping up al bureaucracies, and in the cyber are- 1. Use the government’s enormous their typically primitive systems. na, we simply don’t have it. The media, purchasing power to require higher se- 6. Increase support for research into which never tires of the cliché, told us curity standards of its vendors. These attribution techniques, verifiable soft- we were getting a cyber “czar,” but the standards would deal, for example, ware and firmware, and the benefits of newly created cyber “Coordinator” ac- with verifiable software and firmware, moving more security functions into tually has no directive power and has means of authentication, fault toler- hardware. yet to prove his value in coordinating, ance, and a uniform vocabulary and 7. Definitively remove the antitrust let alone governing, the many depart- taxonomy across the government in concern when U.S.-based firms collab- ments and agencies with an interest in purchasing and evaluation. The Feder- orate on researching, developing, or electronic networks. al Acquisition Regulations, guided by implementing security functions. And so cyber-enabled crime and po- the National Institute of Standards and 8. Engage like-minded governments litical and economic espionage contin- Technology, could drive higher secu- to create international authorities to ue apace, and the risk of infrastructure rity into the entire market by ensuring take down botnets and make naming- failure mounts. As for me, I’m already federal demand for better products. and-addressing protocols more diffi- drafting the next Presidential Direc- 2. Amend the Privacy Act to make cult to spoof. tive. It sounds a lot like the last one. it clear that Internet Service Providers (ISPs) must disclose to one another and Political Will References 1. Baker, S. et al. In the Crossfire: Critical Infrastructure to their customers when a customer’s These practical steps would not solve in the Age of Cyber War, CSIS and McAfee, (Jan. computer has become part of a bot- all problems of cyber insecurity but 28, 2010), 19; http://img.en25.com/Web/McAfee/ NA_CIP_RPT_REG_2840.pdf. See also P. Kurtz et net, regardless of the ISP’s customer they would dramatically improve it. al., Virtual Criminology Report 2009: Virtually Here: contract, and may disclose that fact to Nor would they involve government The Age of Cyber Warfare, McAfee and Good Harbor Consulting, 2009, 17; http://iom.invensys.com/EN/ a party that is not its own customer. snooping and or reengineering the pdfLibrary/McAfee/WP_McAfee_Virtual_Criminology_ ISPs may complain that such a service Internet or other grandiose schemes. Report_2009_03-10.pdf. 2. gertz, B. 2008 intrusion of networks spurred combined should be elective, at a price. That’s They would require a clear-headed units. The Washington Times, (June 3, 2010); http:// equivalent to arguing that cars should understanding of the risks to privacy, www.washingtontimes.com/news/2010/jun/3/2008- intrusion-of-networks-spurred-combined-units/. be allowed on the highway without intellectual property, and national se- 3. Halderman, J.Q. To strengthen security, change developers’ incentives. IEEE Security and Privacy brakes, lights, and seatbelts. This re- curity when an entire society relies for (Mar./Apr. 2010), 79. quirement would generate significant its commercial, governmental, and 4. Krebs, B. “Stuxnet” worm far more sophisticated than previously thought. Krebs on Security, Sept. 14, 2010; remedial business. military functions on a decades-old in- http://krebsonsecurity.com/2010/09/stuxnet-worm- 3. Define behaviors that would per- formation system designed for a small far-more-sophisticated-than-previously-thought/. 5. mcAfee. Unsecured Economies: Protecting Vital mit ISPs to block or sequester traffic number of university and government Information. 2009, 4, 13–14; http://www.cerias. from botnet-controlled addresses— researchers. purdue.edu/assets/pdf/mfe_unsec_econ_pr_rpt_fnl_ online_012109.pdf. not merely from the botnet’s com- Translating repeated diagnoses 6. presidential Decision Directive 63, (May 22, 1998); mand-and-control center. of insecurity into effective treatment http://www.fas.org/irp/offdocs/pdd/pdd-63.htm. 7. The National Strategy to Secure Cyberspace 2003. 4. Forbid federal agencies from do- would also require the political will to U.S. Department of Homeland Security. ing business with any ISP that is a hos- marshal the resources and effort nec- pitable host for botnets, and publicize essary to do something about it. The Joel F. Brenner ([email protected]) of the law firm Bush Administration came by that will Cooley LLP in Washington, D.C., was the U.S. National the list of such companies. Counterintelligence Executive from 2006–2009 and the 5. Require bond issuers that are too late in the game, and the Obama Inspector General of the National Security Agency from subject to the jurisdiction of the Fed- Administration has yet to acquire it. 2002–2006. eral Energy Regulatory Commission to After his inauguration, Obama dith- Copyright held by author.

november 2010 | vol. 53 | no. 11 | communications of the acm 35 viewpoints

Vdoi:10.1145/1839676.1839690 Matt Welsh Viewpoint Sensor Networks for the Sciences Lessons from the field derived from developing wireless sensor networks for monitoring active and hazardous volcanoes.

uch computer sci- ence research is inter- disciplinary, bringing together experts from multiple fields to solve Mchallenging problems in the sciences, engineering, and medicine. One area where the interface between computer scientists and domain scientists is es- pecially strong is wireless sensor net- works, which offer the opportunity to apply computer science concepts to obtaining measurements in challeng- ing field settings. Sensor networks have been applied to studying vibrations on the Golden Gate Bridge,1 tracking zebra movements,2 and understanding mi- croclimates in redwood canopies.4 Our own work on sensor networks for volcano monitoring6 has taught us some valuable lessons about what’s needed to make sensor networks suc- cessful for scientific campaigns. At the Harvard University Ph.D. student Konrad Lorincz installing sensors at Reventador volcano. same time, we find a number of myths that persist in the sensor network lit- hazardous volcanoes (see Figure 1). We for reliable, complete signal collection erature, possibly leading to invalid as- have deployed three sensor networks on over the lossy wireless network; and the sumptions about what field conditions two volcanoes in Ecuador: Tungurahua need to discern “interesting” signals are like, and what research problems and Reventador. In each case, wireless from noise. fall out of working with domain scien- sensors measured seismic and acoustic These deployments have taught us tists. We believe these lessons are of signals generated by the volcano, and many lessons about what works and broad interest to “applied computer digitized signals are collected at a cen- what doesn’t in the field, and what the scientists” beyond the specific area of tral base station located at the volcano important problems are from the per- z

sensor networks. observatory. This application pushes spective of the domain scientists. In- nc i

Our group at Harvard has been col- the boundaries of conventional sensor terestingly, many of these problems are Lor laborating with geophysicists at New network design in terms of the high not the focus of much of the computer d

Mexico Tech, UNC, and the Instituto data rates involved (100Hz or more per science research community. Our view y Konra b

Geofísico in Ecuador for the last five channel); the need for fine-grained time is that a sensor network should be treat- h years on developing wireless sensor synchronization to compare signals col- ed as a scientific instrument, and there- otograp h networks for monitoring active and lected across different nodes; the need fore subject to the same high standards p

36 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints

of data quality applied to conventional ters with a suitably designed antenna scientifi c instrumentation. confi guration. In volcanology, the prop- Working with domain agation speed of seismic waves (on the some myths scientists has taught order of kilometers per second) dictates First, let us dispel a few common myths sensor placements hundreds of meters about sensor network fi eld deploy- us some valuable apart or more, which is at the practical ments. lessons about sensor limit of the radio range. As a result, our Myth #1: Nodes are deployed ran- network design. networks have typically featured nodes V domly. A common assumption in sen- with at most two or three radio neigh- sor network papers is that nodes will be bors, with limited opportunities for randomly distributed over some spatial redundancy in the routing paths. Like- area (see Figure 2). An often-used idiom wise, the code-propagation protocol we is that of dropping sensor nodes from used worked well in a lab setting when an airplane. (Presumably, this implies all of the nodes were physically close that the packaging has been designed and humidity. The inexpensive sensors to each other; when spread across the to survive the impact and there is a used on many mote platforms many not volcano, the protocol fell over, prob- mechanism to orient the radio anten- be appropriate for scientifi c use, con- ably due to the much higher degree of nas vertically once they hit the ground.) founded by low resolution and the need packet loss. Such a haphazard approach to sen- for calibration. While the microphones sor siting would be unheard of in many used in our volcano sensor network cost Lessons Learned scientifi c campaigns. In volcano seis- pennies, seismometers cost upward of Working with domain scientists has mology, sensor locations are typically thousands of dollars. In our deploy- taught us some valuable lessons about chosen carefully to ensure good spatial ments, we use a combination of rela- sensor network design. Our original in- coverage and the ability to reconstruct tively inexpensive ($75 or so) geophones tentions were to leverage the collabora- the seismic fi eld. The resulting topolo- with limited sensitivity, and more ex- tion as a means of furthering our own gies are fairly irregular and do not exhib- pensive ($1,000) seismometers. The computer science research agenda, as- it the spatial uniformity often assumed instruments used by many volcano de- suming that whatever we did would be in papers. Moreover, positions for each ployments are in the tens of thousands satisfactory to the geophysicists. In ac- node must be carefully recorded using of dollars, so much that many research tuality, their data requirements ended GPS, to facilitate later data analysis. In groups borrow (rather than buy) them. up driving our research in several new our case, installing each sensor node Myth #3: The network is dense. Re- directions, none of which we anticipat- took nearly an hour (involving digging lated to the previous myths is the idea ed when we started the project. holes for the seismometer and antenna that node locations will be spatially ho- Lesson #1: It’s all about the data. mast), not to mention the four-hour mogeneous and dense, with each node This may seem obvious, but it’s interest- hike through the jungle just to reach the having on the order of 10 or more neigh- ing how often the actual data produced deployment site. bors in radio range. Routing protocols, by a sensor network is overlooked when Myth #2: Sensor nodes are cheap localization schemes, and failover tech- designing a clever new protocol or pro- and tiny. The original vision of sensor niques often leverage such high density gramming abstraction. To fi rst approxi- networks drew upon the idea of “smart through the power of many choices. mation, scientists simply want all of the dust” that could be literally blown onto This assumption depends on how data produced by all of the sensors, all of a surface. While such technology is still closely aligned the spatial resolution of the time. an active area of research, sensor net- the desired network matches the radio The approach taken by such sci- works have evolved around off-the-shelf range, which can be hundreds of me- entists is to go to the fi eld, install in- “mote” platforms that are substantially larger, more power hungry, and expen- sive than their hypothetically aerosol counterparts (“smart rocks” is a more apt metaphor). The notion that sensor nodes are disposable has led to much research that assumes it is possible to deploy many more sensor nodes than GPS receiver for time sync are strictly necessary to meet scientifi c requirements, leveraging redundancy to extend network battery lifetime and tolerate failures.

It should be emphasized that the Base station FreeWave cost of the attached sensor can outstrip at observatory radio modem Long-distance the mote itself. A typical mote costs ap- radio link (4km) proximately $100, sometimes with on- board sensors for temperature, light, figure 1. sensor network design for monitoring active volcanoes.

november 2010 | vol. 53 | no. 11 | communications of the acm 37 viewpoints

Figure 2a. Myth (taken from Sankarasubramaniam3). resource constraints, such as a target battery lifetime. Our work on the Lance system5 demonstrated it is possible to drive signal downloads from a sensor network in a manner that achieves near optimal data quality subject to these constraints. Figure 3 shows the rectifi- Event radius Sink cation of raw signals collected from the network. Inherent in this approach is the as- sumption that not all data is created equal: there must be some domain- specific assignment of “value” to the signals collected by the network to drive the process. In volcano seismol- ogy, scientists are interested in signals Figure 2b. Reality (Reventador volcano, 2005; from Werner-Allen et al6). corresponding to geophysical events (earthquakes, tremors, explosions) rather than the quiet lull that can last for hours or days between such events. Fortunately, a simple amplitude filter running on each sensor node can read- ily detect seismic events of interest. Lesson #2: Computer scientists and domain scientists need common ground. It should come as no surprise that the motivations of computer sci- entists and “real” scientists are not always aligned. Domain scientists are largely interested in obtaining high- quality data (see Lesson #1 above); whereas computer scientists are driven by the desire to do “cool stuff:” new protocols, new algorithms, new pro- gramming models. Our field thrives on novelty whereas domain scientists have an interest in measured conserva- tism. Anything new we computer scien- struments, collect as much data as the “average seismic signal” sampled tists throw into the system potentially possible, and then spend a consider- by multiple nodes in the network. We makes it harder, not easier, for the able amount of time analyzing it and advocate a two-pronged approach to domain scientists to publish papers writing journal papers. After all, data this problem. The first is to incorpo- based on the results. collection is expensive and time con- rate large flash memories onto sen- Finding common ground is essen- suming, and requires working in dirty sor nodes: it is now possible to build tial to making such collaborations places without a decent Internet con- multi-gigabyte SD or Compact Flash work. Starting small can help. Our nection. Scientists have a vested inter- memory into every node, allowing for first volcano deployment involved just est in getting as much “scientific val- months of continuous sensor data to three nodes running for two days, but ue” as possible out of a field campaign, be stored locally. in the process we learned an incred- even if this requires a great deal of ef- Though this converts the sensor ible amount about how volcanolo- fort to understand the data once it has node into a glorified data logger, it also gists do field work (and what a donkey been collected. In contrast, the sensor ensures that all of the data will be avail- will—and will not—carry on its back). network community has developed a able for (later) analysis and validation Our second deployment focused on wide range of techniques to perform of the network’s correct operation. It is collecting data with the goal of mak- data processing on the fly, aggregating often necessary to service nodes in the ing the geophysicists happy with the fi- and reducing the amount of data pro- field, such as to change batteries, offer- delity of the instrument. The third was duced by the network to satisfy band- ing an early opportunity to retrieve the largely driven by CS goals, but with an width and energy constraints. Many of data manually by swapping flash cards. eye toward meeting the scientists’ data these techniques are at odds with the The second approach is to perform requirements. Writing joint grant pro- domain scientists’ view of instrumen- data collection with the goal of maxi- posals can also help to get everyone on tation. No geophysicist is interested in mizing scientific value while satisfying the same page.

38 communications of the acm | November 2010 | vol. 53 | no. 11 viewpoints

Lesson #3: Don’t forget about the The vast majority of our development ger, monitor, and GUI. The base station base station! The base station is a criti- efforts focused on the sensor node soft- code underwent a major overhaul in the cal component of any sensor network ware, which is fairly complex and uses first two days after arriving in the field, architecture: it is responsible for coor- nonstandard programming languages mostly to add features (such as logging) dinating the network’s operation, mon- and tools. The base station, in our case that we didn’t anticipate needing dur- itoring its activity, and collecting the a laptop located at the volcano obser- ing our lab testing. We paid for the slap- sensor data itself. Yet it often gets short vatory, was mostly an afterthought: dash nature of the base station software. shrift, perhaps because of the false im- some slapped-together Perl scripts and One race condition in the Java code (for pression the base station code will be a monolithic Java program acting as a which the author takes full credit) led easy to write or that it is uninteresting. combined network controller, data log- to an 8-hour outage, while everyone was asleep. (We also assumed that the elec- Figure 3a. Original data: bad timing, unnormalized signals. tricity supply at the observatory would be fairly reliable, which turned out not to be true.) 1.2e+07 Our redesign for the 2007 Tungura- hua deployment involved modularizing 1e+07 the base station code, so that each com-

8e+06 ponent can fail independently. One pro- gram communicates with the network; 6e+06 another acts as a GUI; another logs the sensor data; and another runs the algo- 4e+06 rithm for scheduling downloads. Bugs 2e+06 can be fixed and each of these programs can be restarted at any time without dis- 0 rupting the other programs.

–2e+06 Conclusion –4e+06 Scientific discovery is increasingly driv- 1.1239e+09 1.1239e+09 1.1239e+09 1.1239e+09 1.1239e+09 1.1239e+09 1.1239e+ en by advances in computing technol- ogy, and sensor networks are an impor- tant tool to enhance data collection in many scientific domains. Still, there is a Figure 3b. Data after cleanup. gap between the stereotype of a sensor network in the literature and what many scientists need to obtain good field data. Working closely with domain scientists yields tremendous opportunities for 204 furthering a computer science research 213 agenda driven by real-world problems. 208

206 References 1. Kim, S. et al. Health monitoring of civil infrastructures 209 using wireless sensor networks. In Proceedings of 207 IPSN 2007 (Cambridge, MA, Apr. 2007). 2. Liu, T. et al. Implementing software on resource- 200 constrained mobile sensors: Experiences with Impala and ZebraNet. In Proceedings of the Second 201 International Conference on Mobile Systems, 250 Applications, and Services (MobiSYS’04), June 2004. 3. sankarasubramaniam, Y., Akan, O., and Akyildiz, I. 203 ESRT: Event-to-sink reliable transport in wireless sensor networks. In Proceedings of MobiHoc’03, 2003. 202 4. tolle, G. et al. A macroscope in the redwoods. 205 In Proceedings of the Third ACM Conference on Embedded Networked Sensor Systems (SenSys 2005). 212 5. Werner-Allen, G., Dawson-Haggerty, S., and Welsh, M. Lance: Optimizing high-resolution signal collection in 210 wireless sensor networks. In Proceedings of the 6th 251 ACM Conference on Embedded Networked Sensor Systems (SenSys’08), Nov. 2008. 214 6. Werner-Allen, G. et al. Fidelity and yield in a volcano monitoring sensor network. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2006), Nov. 2006. 03:45:10 03:46:10 03:45:20 03:45:50 03:46:20 03:45:30 03:46:30 03:45:40 30:44:50 03:45:00 03:44:30 03:46:00 03:44:40 Time (UTC) Matt Welsh ([email protected]) is a professor of computer science at Harvard University in Cambridge, MA.

Copyright held by author.

november 2010 | vol. 53 | no. 11 | communications of the acm 39 ACM, Advancing Computing as ACMa ,SAcidevnacnecainngdCaoPmropfuetsisniognas a Science and a Profession

Dear Colleague, Dear Colleague, The power of computing technology continues to drive innovation to all corners of the globe, bringing with it opportunities for economic development and job growth. ACM is ideally positioned The power of computing technology continues to drive innovation to all corners of the globe, to help computing professionals worldwide stay competitive in this dynamic community. bringing with it opportunities for economic development and job growth. ACM is ideally positioned to help computing professionals worldwide stay competitive in this dynamic community. ACM provides invaluable member benefits to help you advance your career and achieve success in your chosen specialty. Our international presence continues to expand and we have extended our online ACM provides invaluable member benefits to help you advance your career and achieve success in your resources to serve needs that span all generations of computing practitioners, educators, researchers, and chosen specialty. Our international presence continues to expand and we have extended our online students. resources to serve needs that span all generations of computing practitioners, educators, researchers, and students. ACM conferences, publications, educational efforts, recognition programs, digital resources, and diversity initiatives are defining the computing profession and empowering the computing professional. ACM conferences, publications, educational efforts, recognition programs, digital resources, and diversity initiatives are defining the computing profession and empowering the computing professional. This year we are launching Tech Packs, integrated learning packages on current technical topics created and reviewed by expert ACM members. The Tech Pack core is an annotated bibliography of resources from the This year we are launching Tech Packs, integrated learning packages on current technical topics created and renowned ACM Digital Library – articles from journals, magazines, conference proceedings, Special Interest reviewed by expert ACM members. The Tech Pack core is an annotated bibliography of resources from the Group newsletters, videos, etc. – and selections from our many online books and courses, as well an non- renowned ACM Digital Library – articles from journals, magazines, conference proceedings, Special Interest ACM resources where appropriate. Group newsletters, videos, etc. – and selections from our many online books and courses, as well an non- ACM resources where appropriate. BY BECOMING AN ACM MEMBER YOU RECEIVE:

BTYimBeElCyOaMccINeGssAtNo AreClMevManEMt iBnEfoRrYmOaUtRioEnCEIVE: Communications of the ACM magazine • ACM Tech Packs • TechNews email digest • Technical Interest Alerts and Timely access to relevant information ACM Bulletins • ACM journals and magazines at member rates • full access to the acmqueue website for practi- Communications of the ACM magazine • ACM Tech Packs • TechNews email digest • Technical Interest Alerts and tioners • ACM SIG conference discounts • the optional ACM Digital Library ACM Bulletins • ACM journals and magazines at member rates • full access to the acmqueue website for practi- tioners • ACM SIG conference discounts • the optional ACM Digital Library Resources that will enhance your career and follow you to new positions Career & Job Center • online books from Safari® featuring O’Reilly and Books24x7® • online courses in multiple Resources that will enhance your career and follow you to new positions languages • virtual labs • e-mentoring services • CareerNews email digest • access to ACM’s 34 Special Interest Career & Job Center • online books from Safari® featuring O’Reilly and Books24x7® • online courses in multiple Groups • an acm.org email forwarding address with spam filtering languages • virtual labs • e-mentoring services • CareerNews email digest • access to ACM’s 34 Special Interest Groups • an acm.org email forwarding address with spam filtering ACM’s worldwide network of more than 97,000 members ranges from students to seasoned professionals and includes many renowned leaders in the field. ACM members get access to this network and the advantages that ACM’s worldwide network of more than 97,000 members ranges from students to seasoned professionals and come from their expertise to keep you at the forefront of the technology world. includes many renowned leaders in the field. ACM members get access to this network and the advantages that come from their expertise to keep you at the forefront of the technology world. Please take a moment to consider the value of an ACM membership for your career and your future in the dynamic computing profession. Please take a moment to consider the value of an ACM membership for your career and your future in the dynamic computing profession. Sincerely, Sincerely,

Alain Chesnais APrleaisnidCehnetsnais Association for Computing Machinery President Association for Computing Machinery

Advancing Computing as a Science & Profession

Advancing Computing as a Science & Profession ACM, Advancing Computing as ACMa ,SAcidevnacnecainngdCaoPmropfuetsisniognas membership application & a Science and a Profession Advancing Computing as a Science & Profession digital library order form Priority Code: AD10

Dear Colleague, You can join ACM in several easy ways: Dear Colleague, Online Phone Fax The power of computing technology continues to drive innovation to all corners of the globe, http://www.acm.org/join +1-800-342-6626 (US & Canada) +1-212-944-1318 bringing with it opportunities for economic development and job growth. ACM is ideally positioned +1-212-626-0500 (Global) The power of computing technology continues to drive innovation to all corners of the globe, to help computing professionals worldwide stay competitive in this dynamic community. Or, complete this application and return with payment via postal mail bringing with it opportunities for economic development and job growth. ACM is ideally positioned to help computing professionals worldwide stay competitive in this dynamic community. ACM provides invaluable member benefits to help you advance your career and achieve success in your Special rates for residents of developing countries: Special rates for members of sister societies: chosen specialty. Our international presence continues to expand and we have extended our online http://www.acm.org/membership/L2-3/ http://www.acm.org/membership/dues.html ACM provides invaluable member benefits to help you advance your career and achieve success in your resources to serve needs that span all generations of computing practitioners, educators, researchers, and chosen specialty. Our international presence continues to expand and we have extended our online Please print clearly students. Purposes of ACM resources to serve needs that span all generations of computing practitioners, educators, researchers, and ACM is dedicated to: students. 1) advancing the art, science, engineering, Name ACM conferences, publications, educational efforts, recognition programs, digital resources, and diversity and application of information technology initiatives are defining the computing profession and empowering the computing professional. 2) fostering the open interchange of ACM conferences, publications, educational efforts, recognition programs, digital resources, and diversity information to serve both professionals and initiatives are defining the computing profession and empowering the computing professional. Address This year we are launching Tech Packs, integrated learning packages on current technical topics created and the public 3) promoting the highest professional and reviewed by expert ACM members. The Tech Pack core is an annotated bibliography of resources from the This year we are launching Tech Packs, integrated learning packages on current technical topics created and City State/Province Postal code/Zip ethics standards renowned ACM Digital Library – articles from journals, magazines, conference proceedings, Special Interest reviewed by expert ACM members. The Tech Pack core is an annotated bibliography of resources from the I agree with the Purposes of ACM: Group newsletters, videos, etc. – and selections from our many online books and courses, as well an non- renowned ACM Digital Library – articles from journals, magazines, conference proceedings, Special Interest Country E-mail address ACM resources where appropriate. Group newsletters, videos, etc. – and selections from our many online books and courses, as well an non- Signature ACM resources where appropriate. Area code & Daytime phone Fax Member number, if applicable BY BECOMING AN ACM MEMBER YOU RECEIVE: ACM Code of Ethics: http://www.acm.org/serving/ethics.html BTYimBeElCyOaMccINeGssAtNo AreClMevManEMt iBnEfoRrYmOaUtRioEnCEIVE: Communications of the ACM magazine • ACM Tech Packs • TechNews email digest • Technical Interest Alerts and choose one membership option: Timely access to relevant information ACM Bulletins • ACM journals and magazines at member rates • full access to the acmqueue website for practi- Communications of the ACM magazine • ACM Tech Packs • TechNews email digest • Technical Interest Alerts and PROFESSIONAL MEMBERSHIP: STUDENT MEMBERSHIP: tioners • ACM SIG conference discounts • the optional ACM Digital Library ACM Bulletins • ACM journals and magazines at member rates • full access to the acmqueue website for practi- ACM Professional Membership: $99 USD ACM Student Membership: $19 USD tioners • ACM SIG conference discounts • the optional ACM Digital Library o o Resources that will enhance your career and follow you to new positions ACM Professional Membership plus the ACM Digital Library: ACM Student Membership plus the ACM Digital Library: $42 USD Career & Job Center • online books from Safari® featuring O’Reilly and Books24x7® • online courses in multiple o o Resources that will enhance your career and follow you to new positions $198 USD ($99 dues + $99 DL) ACM Student Membership PLUS Print CACM Magazine: $42 USD languages • virtual labs • e-mentoring services • CareerNews email digest • access to ACM’s 34 Special Interest o Career & Job Center • online books from Safari® featuring O’Reilly and Books24x7® • online courses in multiple ACM Digital Library: $99 USD (must be an ACM member) ACM Student Membership w/Digital Library PLUS Print Groups • an acm.org email forwarding address with spam filtering o o languages • virtual labs • e-mentoring services • CareerNews email digest • access to ACM’s 34 Special Interest CACM Magazine: $62 USD Groups • an acm.org email forwarding address with spam filtering ACM’s worldwide network of more than 97,000 members ranges from students to seasoned professionals and includes many renowned leaders in the field. ACM members get access to this network and the advantages that ACM’s worldwide network of more than 97,000 members ranges from students to seasoned professionals and come from their expertise to keep you at the forefront of the technology world. payment: includes many renowned leaders in the field. ACM members get access to this network and the advantages that All new ACM members will receive an ACM membership card. come from their expertise to keep you at the forefront of the technology world. Payment must accompany application. If paying by check or Please take a moment to consider the value of an ACM membership for your career and your future in the For more information, please visit us at www.acm.org money order, make payable to ACM, Inc. in US dollars or foreign dynamic computing profession. currency at current exchange rate. Please take a moment to consider the value of an ACM membership for your career and your future in the Professional membership dues include $40 toward a subscription Visa/MasterCard American Express Check/money order dynamic computing profession. to Communications of the ACM. Member dues, subscriptions, o o o Sincerely, and optional contributions are tax-deductible under certain circumstances. Please consult with your tax advisor. Professional Member Dues ($99 or $198) $ ______Sincerely, o ACM Digital Library ($99) $ ______o RETURN COMPLETED APPLICATION TO: Student Member Dues ($19, $42, or $62) $ ______Association for Computing Machinery, Inc. o Alain Chesnais General Post Office Total Amount Due $ ______APrleaisnidCehnetsnais P.O. Box 30777 Association for Computing Machinery New York, NY 10087-0777 President Association for Computing Machinery Questions? E-mail us at [email protected] Card # Expiration date Or call +1-800-342-6626 to speak to a live representative

Satisfaction Guaranteed! Signature

Advancing Computing as a Science & Profession

Advancing Computing as a Science & Profession practice

doi:10.1145/1839676.1839691

Article development led by queue.acm.org Want to keep your users? Just make it easy for them to leave.

by Brian W. Fitzpatrick and JJ Lueck The Case Against Data Lock-in

Until recently, users rarely asked Engineers employ many different tactics to focus on the whether they could quickly and eas- user when writing software: for example, listening ily get their data out before they put reams of personal information into a to user feedback, fixing bugs, and adding features new Internet service. They were more that their users are clamoring for. Since Web-based likely to ask questions such as: “Are services have made it easier for users to move to new my friends using the service?” “How reliable is it?” and “What are the odds applications, it is becoming even more important that the company providing the service to focus on building and retaining user trust. We is going to be around in six months or a year?” Users are starting to realize, have found that an incredibly effective—although however, that as they store more of certainly counterintuitive—way to earn and maintain their personal data in services that are user trust is to make it easy for users to leave your not physically accessible, they run the risk of losing vast swaths of their on- product with their data in tow. This not only prevents line legacy if they do not have a means lock-in and engenders trust, but also forces your of removing their data. It is typically a lot easier for software team to innovate and compete on technical merit. engineers to pull data out of a service We call this data liberation. that they use than it is for regular us-

42 communications of the acm | november 2010 | vol. 53 | no. 11 ers. If APIs are available, we engineers they are more likely to return to your do not have to install software to use it; can cobble together a program to pull product line tomorrow. they do not have to upload data to use our data out. Without APIs, we can even Locking users in may suppress a it; they do not have to sign two-year con- whip up a screen scraper to get a copy of company’s need to innovate as rapidly tracts; and if they decide to try another the data. Unfortunately, for most users as possible. Instead, your company may search engine, they merely type it into this is not an option, and they are often decide—for business reasons—to slow their browser’s location bar, and they left wondering if they can get their data down development on your product are off and running. out at all. and move engineering resources to an- How has Google managed to get us- Locking your users in, of course, has other product. This makes your prod- ers to keep coming back to its search the advantage of making it more diffi- uct vulnerable to other companies that engine? By focusing obsessively on cult for them to leave you for a competi- innovate at a faster rate. Lock-in allows constantly improving the quality of its tor. Likewise, if your competitors lock your company to have the appearance results. The fact that it is so easy for their users in, it is harder for those users of continuing success when, without in- users to switch has instilled an incred- to move to your product. Nonetheless, novation, it may in fact be withering on ible sense of urgency in Google’s search ill it is far preferable to spend your engi- the vine. quality and ranking teams. At Google we neering effort on innovation than it is If you do not—or cannot—lock your think that if we make it easy for users to y gary ne b to build bigger walls and stronger doors users in, the best way to compete is leave any of our products, failure to im- on i that prevent users from leaving. Making to innovate at a breakneck pace. Let’s prove a product results in immediate it easier for users to experiment today use Google Search as an example. It’s a feedback to the engineers, who respond ustrat ill greatly increases their trust in you, and product that cannot lock users in: users by building a better product.

november 2010 | vol. 53 | no. 11 | communications of the acm 43 practice

What Data Liberation Looks Like product—that is, data that users cre- At Google, our attitude has always been ate in or import into Google products. that users should be able to control the This is all data stored intentionally via data they store in any of our products, a direct action—such as photos, email, and that means they should be able to documents, or ad campaigns—that us- get their data out of any product. Period. It is preferable ers would most likely need a copy of if There should be no additional mone- to spend your they wanted to take their business else- tary cost to do so, and perhaps most im- where. Data indirectly created as a side portantly, the amount of effort required engineering effort effect (for example, log data) falls out- to get the data out should be constant, on innovation than side of this mission, as it is not particu- regardless of the amount of data. Indi- larly relevant to lock-in. vidually downloading a dozen photos is it is to build bigger Another “non-goal” of data libera- no big inconvenience, but what if a user walls and stronger tion is to develop new standards: we al- had to download 5,000 photos, one at a low users to export in existing formats time, to get them out of an application? doors that prevent where we can, as in Google Docs where That could take weeks of their time. users can download word processing Even if users have a copy of their users from leaving. files in OpenOffice or Microsoft Office data, it can still be locked in if it is in a Making it easier for formats. For products where there is proprietary format. Some word proces- no obvious open format that can con- sor documents from 15 years ago can- users to experiment tain all of the information necessary, not be opened with modern software today greatly we provide something easily machine because they are stored in a proprietary readable such as XML (for example, format. It is important, therefore, not increases their for Blogger feeds, including posts and only to have access to data, but also to trust, and they comments, we use Atom), publicly have it in a format that has a publicly document the format, and, where pos- available specification. Furthermore, are more likely sible, provide a reference implementa- the specification must have reason- tion of a parser for the format (see the able license terms: for example, it to return to Google Blog Converters AppEngine should be royalty-free to implement. your product project for an examplea). We try to give If an open format already exists for the the data to the user in a format that exported data (for example, JPEG or line tomorrow. makes it easy to import into another TIFF for photos), then that should be product. Since Google Docs deals an option for bulk download. If there with word processing documents and is no industry standard for the data in spreadsheets that predate the rise of a product (for example, blogs do not the open Web, we provide a few differ- have a standard data format), then ent formats for export; in most prod- at the very least the format should be ucts, however, we assiduously avoid publicly documented—bonus points if the rat hole of exporting into every your product provides an open source known format under the sun. reference implementation of a parser for your format. The User’s View The point is that users should be in There are several scenarios where us- control of their data, which means they ers might want to get a copy of their need an easy way of accessing it. Provid- data from your product: they may have ing an API or the ability to download found another product that better suits 5,000 photos one at a time does not ex- their needs and they want to bring their actly make it easy for your average user data into the new product; you have an- to move data in or out of a product. nounced that you are going to stop sup- From the user-interface point of view, porting the product they are using; or, users should see data liberation merely worse, you may have done something to as a set of buttons for import and export lose their trust. of all data in a product. Of course, just because your users Google is addressing this problem want a copy of their data does not nec- through its Data Liberation Front, essarily mean they are abandoning your an engineering team whose goal is to product. Many users just feel safer hav- make it easier to move data in and out of Google products. The data libera- a http://code.google.com/p/google-blog-converters- tion effort focuses specifically on data appengine/wiki/BloggerExportTemplate; and that could hinder users from switch- http://code.google.com/apis/blogger/docs/2.0/ ing to another service or competing reference.html#LinkCommentsToPosts.

44 communications of the acm | november 2010 | vol. 53 | no. 11 practice ing a local copy of their data as a backup. Figure 1 is a sample of one Atom en- metadata in the transformed pages We saw this happen when we first liber- try that encapsulates a Web page within poses a problem during an import—it ated Blogger: many users started export- Sites. This can be retrieved by using the becomes unclear which elements of ing their blogs every week while continu- Content Feed to Google Sites. XHTML correspond to the pieces of the ing to host and write in Blogger. This We have highlighted (in red) the ac- original Atom entry. Luckily, we did not last scenario is more rooted in emotion tual data that is being exported, which have to invent our own metadata em- than logic. Most data that users have on includes an identifier, a last update time bedding technique; we simply used the their computers is not backed up at all, in ISO 8601 format, title, revision num- hAtom microformat. whereas hosted applications almost al- ber, and the actual Web-page content. To demonstrate the utility of micro- ways store multiple copies of user data Mandatory authorship elements and formats, Figure 2 shows the same sam- in multiple geographic locations, ac- other optional information included in ple after being converted into XHTML counting for hardware failure in addi- the entry have been removed to keep the with hAtom microformat embedded: tion to natural disasters. Whether users’ example short. The highlighted class attributes map concerns are logical or emotional, they Once the API was in place, the sec- directly to the original Atom elements, need to feel their data is safe: it’s impor- ond step was to implement the trans- making it very explicit how to recon- tant that your users trust you. formation from a set of Atom feeds struct the original Atom when import- into a collection of portable XHTML ing this information back into Sites. Case Study: Google Sites Web pages. To protect against losing The microformat approach also has the Google Sites is a Web site creator that any data from the original Atom, we side benefit that any Web page can be allows WYSIWYG editing through the chose to embed all of the metadata imported into Sites if the author is will- browser. We use this service inside of about each Atom entry right into the ing to add a few class attributes to data Google for our main project page, as it is transformed XHTML. Not having this within the page. This ability to reimport really convenient for creating or aggre- gating project documentation. We took Figure 1. Atom entry encapsulating a Web page within Sites. on the job of creating the import and ex- port capabilities for Sites in early 2009. Early in the design, we had to de- https://sites.google.com/feeds/content/site/... termine what the external format of a 2009-02-09T21:46:14.991Z Google Site should be. Considering that create and collaborate on Web sites, Maps API Examples we decided that the format best suited 2 for true liberation would be XHTML.

HTML, as the language of the Web, also ... PAGE CONTENT HERE ... makes it the most portable format for
a Web site: just drop the XHTML pages on your own Web server or upload them to your Web service provider. We want- ed to make sure this form of data por- tability was as easy as possible with as Figure 2. Atom entry converted into XHTML. little loss of data as possible. Sites uses its internal data format to encapsulate the data stored in a Web
in the site. The first step to liberating

Maps API Examples this data was to create a Google Data

API. A full export of a site is then pro-
vided through an open source Java cli-
ent tool that uses the Google Sites Data
... PAGE CONTENT HERE ... API and transforms the data into a set of
XHTML pages.
The Google Sites Data API, like all
Google Data APIs, is built upon the Updated on AtomPub specification. This allows for RPC (remote procedure call)-style pro- Feb 9, 2009 grammatic access to Google Sites data (Version 2) using Atom documents as the wire for- mat for the RPCs. Atom works well for
the Google Sites use case, as the data fits fairly easily into an Atom envelope.

november 2010 | vol. 53 | no. 11 | communications of the acm 45 practice

a user’s exported data in a lossless man- building code using those APIs to per- lenge. An extensive photo collection, ner is key to data liberation—it may take form cloud-to-cloud migration. These for example, which can easily scale into more time to implement, but we think types of migrations require multiple multiple gigabytes, can pose difficulties the result is worthwhile. RPCs between services to move the with delivery given the current transfer data piece by piece, and each of these speeds of most home Internet connec- Case Study: Blogger RPCs can be retried upon failure auto- tions. In this case, either we have a cli- One of the problems we often encoun- matically without user intervention. It ent for the product that can sync data ter when doing a liberation project is ca- is a much better model than the one to and from the service (such as Picasa), tering to the power user. These are our transaction import. It increases the or we rely on established protocols and favorite users. They are the ones who likelihood of total success and is an APIs (for example, POP and IMAP for love to use the service, put a lot of data all-around better experience for the Gmail) to allow users to sync incremen- into it, and want the comfort of being user. True cloud-to-cloud portability, tally or export their data. able to do very large imports or exports however, works only when each cloud of data at any time. Five years of jour- provides a liberated API for all of the Conclusion nalism through blog posts and photos, user’s data. We think cloud-to-cloud Allowing users to get a copy of their for example, can easily extend beyond portability is really good for users, and data is just the first step on the road to a few gigabytes of information, and at- it’s a tenet of the Data Liberation Front. data liberation: we have a long way to tempting to move that data in one fell go to get to the point where users can swoop is a real challenge. In an effort Challenges easily move their data from one prod- to make import and export as simple as As you have seen from these case stud- uct on the Internet to another. We look possible for users, we decided to imple- ies, the first step on the road to data forward to this future, where we as en- ment a one-click solution that would liberation is to decide exactly what us- gineers can focus less on schlepping provide the user with a Blogger export ers need to export. Once you have cov- data around and more on building in- file that contains all of the posts, com- ered data that users have imported or teresting products that can compete ments, static pages, and even settings created by themselves into your prod- on their technical merits—not by hold- for any Blogger blog. This file is down- uct, it starts to get complicated. Take ing users hostage. Giving users control loaded to the user’s hard drive and can Google Docs, for example: a user clear- over their data is an important part of be imported back into Blogger later ly owns a document that he or she cre- establishing user trust, and we hope or transformed and moved to another ated, but what about a document that more companies will see that if they blogging service. belongs to another user, then is edited want to retain their users for the long One mistake we made when creat- by the user currently doing the export? term, the best way to do that is by set- ing the import/export experience for What about documents to which the ting them free. Blogger was relying on one HTTP trans- user has only read access? The set of action for an import or an export. HTTP documents the user has read access Acknowledgments connections become fragile when the to may be considerably larger than the Thanks to Bryan O’Sullivan, Ben Col- size of the data you are transferring be- set of documents the user has actually lins-Sussman, Danny Berlin, Josh comes large. Any interruption in that read or opened if you take into account Bloch, Stuart Feldman, and Ben Laurie connection voids the action and can globally readable documents. Lastly, for reading drafts of this article. lead to incomplete exports or missing you have to take into account docu- data upon import. These are extremely ment metadata such as access control frustrating scenarios for users and, lists. This is just one example, but it Related articles on queue.acm.org unfortunately, much more prevalent applies to any product that lets users for power users with lots of blog data. share or collaborate on data. Other People’s Data Stephen Petschulat We neglected to implement any form Another important challenge to http://queue.acm.org/detail.cfm?id=1655240 of partial export as well, which means keep in mind involves security and power users sometimes need to resort authentication. When you are making Why Cloud Computing Will Never Be Free Dave Durkee to silly things such as breaking up their it very easy and fast for users to pull http://queue.acm.org/detail.cfm?id=1772130 export files by hand in order to have their data out of a product, you drasti- better success when importing. We cally reduce the time required for an Brian Fitzpatrick started Google’s Chicago engineering recognize this is a bad experience for attacker to make off with a copy of all office in 2005 and is the engineering manager for the Data Liberation Front and the Google Affiliate Network. users and are hoping to address it in a your data. This is why it’s a good idea to A published author, frequent speaker, and open source future version of Blogger. require users to re-authenticate before contributor for more than 12 years, Fitzpatrick is a member of the Apache Software Foundation and the Open A better approach, one taken by ri- exporting sensitive data (such as their Web Foundation, as well as a former engineer at Apple val blogging platforms, is not to rely search history), as well as over-commu- and CollabNet. on the user’s hard drive to serve as the nicating export activity back to the user JJ Lueck joined the software engineering party at Google in 2007. An MIT graduate and prior engineer at AOL, intermediary when attempting to mi- (for example, email notification that Bose, and startups Bang Networks and Reactivity, he grate lots of data between cloud-based an export has occurred). We are explor- enjoys thinking about problems such as cloud-to-cloud interoperability and exploring the depths and potentials of Blogging services. Instead, data lib- ing these mechanisms and more as we virtual machines. eration is best provided through APIs, continue liberating products. and data portability is best provided by Large data sets pose another chal- © 2010 ACM 0001-0782/10/1100 $10.00

46 communications of the acm | november 2010 | vol. 53 | no. 11 doi:10.1145/1839676.1839692

Article development led by queue.acm.org As storage systems grow larger and larger, protecting their data for long-term storage is becoming ever more challenging.

by David S.H. Rosenthal Keeping Bits Safe: How Hard Can It Be?

These days, we are all data pack rats. Storage is cheap, so if there is a chance the data could possibly be useful, we keep it. We know that storage isn’t completely reliable, so we keep backup copies as well. But the more data we keep, and the longer we keep it, the greater the chance that some of it will be unrecoverable when we need it.

There is an obvious question we should already multiple petabytes. be asking: how many copies in storage The state of our knowledge about systems with what reliability do we keeping bits safe can be summarized need to get a given probability that the as: data will be recovered when we need ˲˲ The more copies, the safer. As the it? This may be an obvious question size of the data increases, the per-copy to ask, but it is a surprisingly difficult cost increases, reducing the number of question to answer. Let’s look at the backup copies that can be afforded. reasons why. ˲˲ The more independent the copies, the To be specific, let’s suppose we need safer. As the size of the data increases, to keep a petabyte for a century and there are fewer affordable storage tech- have a 50% chance that every bit will nologies. Thus, the number of copies survive undamaged. This may sound in the same storage technology in- like a lot of data and a long time, but creases, decreasing the average level of there are already data collections big- independence. ger than a petabyte that are important ˲˲ The more frequently the copies are to keep forever. The Internet Archive is audited, the safer. As the size of the data

november 2010 | vol. 53 | no. 11 | communications of the acm 47 practice

increases, the time and cost needed for All that Sun was saying was if you after the start of the experiment. As each audit to detect and repair damage watched a large number of ST5800 Sirius did not start watching a batch increases, reducing their frequency. systems for a long time, recorded of SC5800s 2.8 million years ago, how At first glance, keeping a petabyte the time at which each of them first would they know? for a century is not difficult. Storage suffered a data loss, and then aver- Sirius says it will sell 2×104 SC5800s system manufacturers make claims aged these times, the result would be per year at $5×104 each (a $1 billion- for their products that far exceed the 2.4×106 years. Suppose Sun watched a-year business), and it expects the reliability we need. For example, Sun 10 ST5800s and noticed that three of product to be in the market for 10 claimed that its ST5800 Honeycomb them lost data during the first year, years. The SC5800 has a service life of product had an MTTDL (mean time to four of them lost data after 2.4×106 10 years. So if Sirius watched the entire data loss) of 2.4×106 years.a,41 Off-the- years, and the remaining three lost production of SC5800s ($1010 worth of shelf solutions appear so reliable that data after 4.8×106 years; Sun would be storage systems) over their entire ser- backups are unnecessary. Should we correct that the MTTDL was 2.4×106 vice life, the experiment would end believe these claims? Where do they years. But we would not consider a 20 years from now after accumulating come from? system with a 30% chance of data loss about 2×106 system-years of data. If its claim were correct, Sirius would have about a 17% chance of seeing a single data-loss event. In other words, Sirius claims the probability that no SC5800 will ever lose any data is more than 80%. Or, since each SC5800 stores 5×1013 bytes, there is an 80% probability that 1019 bytes of data will survive 10 years un- damaged. If one could believe Sirius’ claim, the petabyte would look pretty safe for a century. But even if Sirius were to base its claim on an actual experi- ment, it would not provide results for 20 years and even when it did, would not validate the number in question. In fact, claims such as those of Sun and Sirius are not the result of experimen- tation at all. No feasible experiment could validate them. They are projec- tions based on models of how compo- nents of the system such as disks and software behave.

Models The state of the art in this kind of mod- Before using Sun’s claim for the in the first year was adequate to keep eling is exemplified by the Pergamum ST5800 as an example, I should stipu- a petabyte safe for a century. A single project at UC Santa Cruz.39 Its model late that the ST5800 was an excellent MTTDL number is not a useful charac- includes disk failures at rates derived product. It represented the state of the terization of a solution. from measurements30,35 and sector fail- art in storage technology, and Sun’s Let’s look at the slightly more ures at rates derived from disk vendor marketing claims represented the state scientific claim made at the re- specifications. This system attempts to of the art in storage marketing. Nev- cent launch of the SC5800 by the conserve power by spinning the disks ertheless, Sun did not guarantee that marketing department of Sirius down whenever possible; it makes an data in the ST5800 would last 2.4×106 Cybernetics:b “SC5800 has an MTTDL allowance for the effect of doing so on years. Sun’s terms and conditions ex- of (2.4±0.4)×106 years.” Sirius implic- disk lifetime, but it is not clear upon plicitly disclaimed any liability whatso- itly assumes the failures are normally what this allowance is based. The Per- D ever for loss of, or damage to, the data distributed and thus claims that about gamum team reports that the simula- the ST5800 stores40 whenever it occurs. two-thirds of the failures would oc- tions were difficult: RAMPERSA cur between 2.0×106 and 2.8×106 years “This lack of data is due to the ex- TARAN

tremely high reliability of these con- Y a Numbers are expressed in powers-of-10 nota- H B tion to help readers focus on the scale of the b Purveyors of chatty doors, existential eleva- figurations—the simulator modeled problems and the extraordinary level of reli- tors, and paranoid androids to the nobility many failures, but so few caused data OTOGRAP

1 H ability required. and gentry of this galaxy. loss that the simulation ran very slowly. P

48 communications of the acm | november 2010 | vol. 53 | no. 11 practice

This behavior is precisely what we want hopes, accurately reflect its behavior. from an archival storage system: it can ˲˲ The data used to drive the simu- gracefully handle many failure events lation, which, one hopes, accurately without losing data. Even though we reflects the behavior of the system’s captured fewer data points for the tri- components. ple inter-parity configuration, we be- The more data we Under certain conditions, it is rea- lieve the reported MTTDL is a reason- keep, and the longer sonable to use these models to com- able approximation.”39 pare different storage-system technol- Although the Pergamum team’s ef- we keep it, the ogies. The most important condition is fort to obtain “a reasonable approxi- greater the chance that the models of the two systems use mation” to the MTTDL of its system the same data. A claim that modeling is praiseworthy, there are a number of that some of it will showed system A to be more reliable reasons to believe it overestimates the than system B when the data used to reliability of the system in practice: be unrecoverable model system A had much lower fail- ˲˲ The model draws its failures from when we need it. ure rates for components such as disk exponential distributions. The team drives would not be credible. thus assumes that both disk and sec- These models may well be the best tor failures are uncorrelated, although tools available to evaluate different all observations of actual failures5,42 techniques for preventing data loss, report significant correlations. Corre- but they aren’t good enough to an- lated failures greatly increase the prob- swer our question. We need to know ability of data loss.6,13 the maximum rate at which data will ˲˲ Other than a small reduction in be lost. The models assume things, disk lifetime from each power-on such as uncorrelated errors and bug- event, the Pergamum team assumes free software, that all real-world stud- that failure rates observed in always-on ies show are false. The models exclude disk usage translate to the mostly off most of the threats to which stored environment. A study43 published after data is subject. In those cases where the Pergamum paper reports a quan- similar claims, such as those for disk titative accelerated life test of data re- reliability,30,35 have been tested, they tention in almost-always-off disks. It have been shown to be optimistic. The shows that some of the 3.5-inch disks models thus provide an estimate of the anticipated by the Pergamum team minimum data loss rate to be expected. have data life dramatically worse in this usage mode than 2.5-inch disks Metrics using the same head and platter tech- Even if we believed the models, the nology. MTTDL number does not tell us how ˲˲ The team assumes that disk and much data was lost in the average data- sector failures are the only failures loss event. Is petabyte system A with an contributing to the system failures, MTTDL of 106 years better than a sim- although a study17 shows that other ilar-size system B with an MTTDL of hardware components contribute sig- 103 years? If the average data-loss event nificantly. in system A loses the entire petabyte, ˲˲ It assumes that its software is bug- where the average data-loss event in free, despite several studies of file and system B loses a kilobyte, it would be storage implementations14,20,31 that easy to argue that system B was 109 uniformly report finding bugs capa- times better. ble of causing data loss in all systems Mean time to data loss is not a use- studied. ful metric for how well a system stores ˲˲ It also ignores all other threats bits through time, because it relates to to stored data34 as possible causes of time but not to bits. Nor is the UBER data loss. Among these are operator er- (unrecoverable bit error rate) typically ror, insider abuse, and external attack. quoted by disk manufacturers; it is the Each of these has been the subject of probability that a bit will be read in- anecdotal reports of actual data loss. correctly regardless of how long it has What can such models tell us? been sitting on the disk. It relates to Their results depend on both of the bits but not to time. Thus, we see that following: we lack even the metric we would need ˲˲ The details of the simulation of to answer our question. the system being studied, which, one Let us oversimplify the problem to

november 2010 | vol. 53 | no. 11 | communications of the acm 49 practice

get a clearer picture. Suppose we had the experiments that would be needed eliminated all possible sources of cor- to be reasonably sure random bit rot related data loss, from operator error is not significant are too expensive, or to excess heat. All that remained would take too long, or both. be bit rot, a process that randomly flips the bits the system stores with a con- Our inability to Measuring Failures stant small probability per unit time. compute how many It wasn’t until 2007 that researchers In this model we can treat bits as ra- started publishing studies of the reli- dioactive atoms, so that the time after backup copies we ability that actual large-scale storage which there is a 50% probability that a systems were delivering in practice. bit will have flipped is the bit half-life. need to achieve a Enterprises such as Google9 and insti- The requirement of a 50% chance reliability target tutions such the Sloan Digital Sky Sur- that a petabyte will survive for a centu- vey37 and the Large Hadron Collider8 ry translates into a bit half-life of 8×1017 is something we were collecting petabytes of data with years. The current estimate of the age are just going to long-term value that had to remain of the universe is 1.4×1010 years, so this online to be useful. The annual cost is a bit half-life approximately 6×107 have to live with. of keeping a petabyte online was more times the age of the universe. We are not going than $1 million.27 It is easy to see why This bit half-life requirement clearly questions of the economics and reli- shows the high degree of difficulty of to have enough ability of storage systems became the the problem we have set for ourselves. focus of researchers’ attention. Suppose we want to know whether a backup copies, and Papers at the 2007 File and Storage system we are thinking of buying is stuff will get lost or Technologies (FAST) conference used good enough to meet the 50% chance data from NetApp35 and Google30 to of keeping a petabyte for a century. damaged. study disk-replacement rates in large Even if we are sublimely confident that storage farms. They showed that the every source of data loss other than manufacturer’s MTTF numbers were bit rot has been totally eliminated, we optimistic. Subsequent analysis of the still have to run a benchmark of the NetApp data17 showed that all other system’s bit half-life to confirm it is components contributed to the storage longer than 6×107 times the age of the system failures and that: universe. And this benchmark has to “Interestingly, [the earlier studies] be complete in a year or so; it can’t take found disks are replaced much more a century. frequently (2–4 times) than vendor- So we take 103 systems just like the specified [replacement rates]. But as one we want to buy, write a petabyte of this study indicates, there are other data into each so we have an exabyte of storage subsystem failures besides data altogether, wait a year, read the ex- disk failures that are treated as disk abyte back, and check it. If the system faults and lead to unnecessary disk re- is just good enough, we might see five placements.”17 bit flips. Or, because bit rot is a random Two studies, one at CERN (European process, we might see more, or less. We Organization for Nuclear Research)18 would need either a lot more than an and one using data from NetApp,5 exabyte of data or a lot more than a year greatly improved on earlier work using to be reasonably sure the bit half-life data from the Internet Archive.6,36 They was long enough for the job. But even studied silent data corruption—events an exabyte of data for a year costs 10 in which the content of a file in storage times as much as the system we want changes with no explanation or record- to buy. ed errors—in state-of-the-art storage What this thought-experiment tells systems. us is we are now dealing with such The NetApp study looked at the in- large numbers of bits for such a long cidence of silent storage corruption in time that we are never going to know individual disks in RAID arrays. The whether the systems we use are good data was collected over 41 months enough: from NetApp’s filers in the field, cov- ˲˲ The known causes of data loss are ering more than 1.5×106 drives. The too various and too highly correlated study found more than 4×105 silent for models to produce credible projec- corruption incidents. More than 3×104 tions. of them were not detected until RAID ˲˲ Even if we ignore all those causes, restoration and could thus have caused

50 communications of the acm | november 2010 | vol. 53 | no. 11 practice

data loss despite the replication and nately, adding the necessary inter-stor- we need to be, and thus the cost of the auditing provided by NetApp’s row-di- age-system replication and scrubbing necessary replication. At small scales agonal parity RAID.11 is expensive. the response to this uncertainty is to The CERN study used a program Cost figures from the San Diego add more replicas, but as the scale in- that wrote large files into CERN’s vari- Supercomputer Centerc in 2008 show creases this rapidly becomes unafford- ous data stores, which represent a that maintaining a single online copy able. broad range of state-of-the-art enter- of a petabyte for a year costs about Replicating among identical sys- prise storage systems (mostly RAID ar- $1.05×106. A single near-line copy on tems is much less effective than repli- rays), and checked them over a period tape costs about $4.2×105 a year. These cating among diverse systems. Iden- of six months. A total of about 9.7×1016 costs decrease with time, albeit not as tical systems are subject to common bytes was written and about 1.92×108 fast as raw disk costs. The British Li- mode failures—for example, those bytes were found to have suffered si- brary estimates a 30% per annum de- caused by a software bug in all the sys- lent corruption, of which about two- crease. Assuming this rate continues tems damaging the same data in each. thirds were persistent; rereading did for at least a decade, if you can afford On the other hand, purchasing and op- not return good data. In other words, about 3.3 times the first year’s cost to erating a number of identical systems about 1.2×10–9 of the data written to CERN’s storage was permanently cor- rupted within six months. We can place an upper bound on the bit half-life in this sample of current storage systems by assuming the data was written in- stantly at the start of the six months and checked instantly at the end; the result is 2×108 or about 10–2 times the age of the universe. Thus, to reach the petabyte for a century requirement we would need to improve the perfor- mance of current enterprise storage systems by a factor of at least 109.

Tolerating Failures Despite manufacturers’ claims, cur- rent research shows that state-of-the- art storage systems fall so many orders of magnitude below our bit preserva- tion requirements that we cannot ex- pect even dramatic improvements in technology to fill the gap. Maintaining a single replica in a single storage sys- tem is not an adequate solution to the bit preservation problem. Practical digital preservation sys- tems must therefore: store an extra replica for a decade, you will be considerably cheaper than oper- ˲˲ Maintain more than one copy by can afford to store it indefinitely. So, ating a set of diverse systems. replicating their data on multiple, ide- adding a second replica of a petabyte Each replica is vulnerable to loss ally different, storage systems. on disk would cost about $3.5×106 and and damage. Unless they are regularly ˲˲ Audit or (scrub) the replicas to de- on tape about $1.4×106. Adding cost to audited they contribute little to in- tect damage, and repair it by overwrit- a preservation effort to increase reli- creasing bit half-life. The bandwidth ing the known-bad copy with data from ability in this way is a two-edged sword: and processing capacity needed to another. doing so necessarily increases the risk scrub the data are both costly, and add- The more replicas and the more fre- that preservation will fail for economic ing these costs increases the risk of quently they are audited and repaired, reasons. failure. Custom hardware25 could com- the longer the bit half-life we can ex- Further, without detailed under- pute the SHA-128 checksum of a pet- D pect. This is, after all, the basis for the standing of the rates at which different abyte of data in a month, but doing so backups and checksums technique in mechanisms cause loss and damage, it requires impressive bandwidth—the RAMPERSA common use. In fact, current storage still is not possible to answer the ques- equivalent of three gigabit Ethernet TARAN

Y systems already use such techniques tion we started with and to know how interfaces running at full speed the en- H B internally—for example, in the form many replicas would make us as safe as tire month. User access to data in long- of RAID.29 Despite this, the bit half-life term storage is typically infrequent; it OTOGRAP

H 27 P they deliver is inadequate. Unfortu- c Figures for 2007 are in Moore et al. is therefore rarely architected to pro-

november 2010 | vol. 53 | no. 11 | communications of the acm 51 practice

vide such high-bandwidth read access. probabilities, but to illustrate the ad- of 2×105 stone DVDs. A lot can happen System cost increases rapidly with I/O vantage of using a model checker, and to a stack that big in 100 years. Truly bandwidth, and the additional access- discuss potential trade-offs between magic media that are utterly reliable es to the data (whether on disk or on different protection schemes.”20 would make the problems better, but tape) needed for scrubbing themselves Thus, although intersystem repli- they would not make them go away potentially increase the risk of failure. cation and scrubbing are capable of completely. The point of writing software that decreasing the incidence of data loss, I remember magnetic bubble mem- reads and verifies the data-systems they cannot eliminate it completely. ory, so I have a feeling of déjà vu, but it store in this way is to detect damage And the replication and scrubbing is starting to look possible that flash and exploit replication among systems software itself will contain bugs that memory, or possibly more exotic solid- to repair it, thereby increasing bit half- can cause data loss. It must be doubt- state technologies such as memristors life. How well can we do this? RAID is ful that we can implement these tech- or phase-change memory, may sup- an example of a software technique of niques well enough to increase the bit plant disks. There is a lot to like about this type applied to disks. In practice, half-life of systems with an affordable these technologies for long-term stor- the CERN study18 looking at real RAID number of replicas by 109. age, but will they improve storage reli- ability? Again, we don’t know the answer yet. Despite flash memory’s ubiquity, it is not even clear yet how to measure its UBER: “UBER values can be much better than 10−15 but UBER is a strong func- tion of program/erase cycling and subsequent retention time, so UBER specifications must be coupled with maximum specifications for these quantities.”26 In other words, it depends how you use it, which does not appear to be the case for disk. Flash memory used for long-term data storage, which is writ- ten once and read infrequently, should in principle perform very well. And the system-level effects of switching from hard disk to flash can be impressive: “FAWN [fast array of wimpy nodes] couples low-power embedded CPUs to small amounts of local flash storage, and balances computation and I/O ca- pabilities to enable efficient, massively parallel access to data. …FAWN clus- ters can handle roughly 350 key-value systems from the outside showed a Magic Media queries per joule of energy—two orders significant rate of silent data corrup- Considering the difficulties facing of magnitude more than a disk-based tion, and the NetApp study5 looking disk-drive technology,12 the reliabil- system.”3 at them from the inside showed a sig- ity storage systems achieve is astonish- Fast CPUs, fast RAM, and fast disks nificant rate of silent disk errors that ing, but it clearly isn’t enough. News all use lots of power, so the low power would lead to silent data corruption. sites regularly feature stories reporting draw of FAWN is not a surprise. But A study20 of the full range of current claims that some new storage medium the high performance comes from an- algorithms used to implement RAID has solved the problem of long-term other aspect of disk evolution. Table 1 found flaws leading to potential data data storage. Synthetic stone DVDs23 shows how long it would take to read loss in all of them. Both this study, and claimed to last 1,000 years were a re- the whole of a state-of-the-art disk of another from IBM,16 propose improve- cent example. These claims should be various generations. D ments to the RAID algorithms but nei- treated as skeptically as those of Sun Disks have been getting bigger but ther claims it can eliminate silent cor- and other storage system manufactur- they have not been getting equivalently RAMPERSA ruption, or even accurately predict its ers. It may well be that the media in faster. This is to be expected; the data TARAN

incidence: question are more reliable than their rate depends on the inverse of the Y “While we attempt to use as realistic competitors, but as we have seen, raw diameter of a bit, but the capacity de- H B probability numbers as possible, the media reliability is only a part of the pends on the inverse of the area of a bit. OTOGRAP H

goal is not to provide precise data-loss story. Our petabyte would be a stack FAWN nodes can read their entire con- P

52 communications of the acm | november 2010 | vol. 53 | no. 11 practice tents very quickly, useful for scrubbing, 2013 disk generation (that is, a con- as well as answering queries. sumer 3.5-inch drive holding 8TB). This is all encouraging, but once it But the business case for building it became possible to study the behavior is weak. The cost of the transition to of disk storage at a large scale, it became BPM in particular is daunting.24 Lap- clear that system-level reliability fell far Disks have been tops, netbooks, and now tablets are short of the media specifications. This getting bigger but destroying the market for the desktop should make us cautious about pre- boxes that 3.5-inch drives go into. And dicting a revolution from flash or any they have not been very few consumers fill up the 2009 other new storage technology. getting equivalently 2TB disk generation, so what value does having an 8TB drive add? Let Economics faster. This is to be alone the problem of how to back up Ever since Clayton Christensen pub- an 8TB drive on your desk! lished The Innovator’s Dilemma10 it expected; the data What is likely to happen—indeed, has been common knowledge that rate depends on is already happening—is that the con- disk-drive cost per byte halves every sumer market will transition rather two years. So you might argue that you the inverse of the quickly to 2.5-inch drives. This will don’t need to know how many copies diameter of a bit, eliminate the high-capacity $100 3.5- you need to keep your data safe for the inch drive, since it will no longer be long term, you just need to know how but the capacity produced in consumer quantities. many you need to keep it safe for the depends on the Consumers will still buy $100 drives, next few years. After that, you can keep but they will be 2.5 inches and have more copies. inverse of the area perhaps one-third the capacity. For a In fact, what has happened is the while the $/byte curve will at best flat- capacity at constant cost has been dou- of a bit. ten, and more likely go up. The prob- bling every two years, which is not quite lem this poses is that large-scale disk the same thing. As long as this expo- farms are currently built from consum- nential grows faster than you generate er 3.5-inch drives. The existing players new data, adding copies through time in the market have bet heavily on the is a feasible strategy. exponential cost decrease continuing; Alas, exponential curves can be de- if they’re wrong, it will be disruptive. ceiving. Moore’s Law has continued to deliver smaller and smaller transistors. The Bigger Picture A few years ago, however, it effectively Our inability to compute how many ceased delivering faster and faster CPU backup copies we need to achieve a re- clock rates. It turned out that, from a liability target is something we are just business perspective, there were more going to have to live with. In practice, important things to spend the extra we are not going to have enough back- transistors on than making a single up copies, and stuff will get lost or dam- CPU faster. Like putting multiple CPUs aged. This should not be a surprise, but on a chip. somehow it is. The fact that bits can be At a recent Library of Congress copied correctly leads to an expecta- meeting, Dave Anderson of Seagate tion that they always will be copied cor- warned4 that something similar is rectly, and then to an expectation that about to happen to hard disks. Tech- digital storage will be reliable. There is nologies—HAMR (heat-assisted mag- an odd cognitive dissonance between netic recording) and BPM (bit pattern this and people’s actual experience of media)—are in place to deliver the digital storage, which is that loss and damage are routine occurrences.22 Table 1. The time to read an entire disk of The fact that storage is not reliable various generations. enough to allow us to ignore the prob- lem of failures is just one aspect of a much bigger problem looming over 1990 240 computing as it continues to scale up. 2000 720 Current long-running petascale high- 2006 6450 performance computer applications 2009 8000 require complex and expensive check- 2013 12800 point and restart schemes, because the probability of a failure during ex- ecution is so high that restarting from

november 2010 | vol. 53 | no. 11 | communications of the acm 53 practice

Table 2. The four cases of message digest comparison. cases, as shown in Table 2, depending on whether the digest (b) is unchanged or not. The four cases illustrate two Digest Match No Match problems: Unchanged Data OK Data bad ˲˲ The bits forming the digest are Changed Deliberate alteration Data and/or digest bad no different from the bits forming the data; neither is magically incorrupt- ible. A malign or malfunctioning ser- vice could return bad data with a di- scratch is infeasible. This approach plication to the device, or while it was gest in the ETag header that matched will not scale to the forthcoming gen- sitting in the storage device, or being the data but was not the digest origi- eration: copied back to the application. Recent nally computed. Applications need “…it is anticipated that exascale research has shown the memory buf- to know whether the digest has been systems will experience various kinds fers44 and data paths17 between the ap- changed. A system for doing so with- of faults many times per day. It is also plication and the storage devices con- out incorruptible storage is described anticipated that the current approach tribute substantially to errors. in Haber et al.15 for resilience, which relies on auto- Let’s take the Amazon S3 (Simple ˲˲ Given the pricing structure for matic or application-level checkpoint- Storage Service) REST API2 as an exam- cloud storage services such as Ama- restart, will not work because the time ple to show that, while these develop- zon S3, it is too expensive to extract the for checkpointing and restarting will ments are welcome, they are far from entire data at intervals to confirm it is exceed the mean time to failure of a full a panacea. The PUT request supports being stored correctly. Some method system. … an optional (and recommended) Con- in which the service computes the “Some projections estimate that, tent-MD5 (Message-Digest algorithm digest of the data is needed, but sim- with the current technique, the time to 5) header containing the application’s ply asking the service to return the checkpoint and restart may exceed the digest of the data. The responses to digest of a stored object is not an ad- mean time to interrupt of top super- most requests, including PUT, include equate check.33 The service must be computers before 2015. This not only an ETag (entity tag) header with the challenged to prove its object is good. means that a computation will do little service’s MD5 of the object. The appli- The simplest way to do this is to ask progress; it also means that fault-han- cation can remember the digest it com- the service to compute the digest of dling protocols have to handle multi- puted before the PUT and, when the a nonce (a random string of bits) and ple errors—current solutions are often PUT returns, verify that the service’s the object; because the service cannot designed to handle single errors.”7 digest matches. predict the nonce, a correct response Just as with storage, the numbers Doing so is a wise precaution, but requires access to the data after the of components and interconnections all it really tells the application is that request is received. Systems using this are so large that the incidence of fail- the service knows what the application technique are described in Maniatis et ures is significant, and the available thinks is the correct digest. The service al.21 and Shah et al.38 bandwidths are relatively so low that knows this digest, not because it com- Early detection is a good thing: recovering from the failures is time puted the digest of the correct data, but the shorter the time between detec- consuming enough that multiple because the application provided it in tion and repair, the smaller the risk failure situations have to be handled. the Content-MD5 header. A malign or that a second error will compromise There is no practical, affordable way malfunctioning service could respond the repair. But detection is only part to mask the failures from the applica- correctly to PUT and HEAD requests by of the solution; the system also has to tions. Application programmers will remembering the application’s digest, be able to repair the damaged data. It need to pay much more attention to without ever storing the data or com- can do so only if it has replicated the detecting and recovering from errors puting its digest. data elsewhere—and some dedupli- in their environment. To do so they The application could try to detect cation layer has not optimized away will need both the APIs and the system a malign or malfunctioning service by this replication. environments implementing them to using a GET to obtain the stored data, become much more failure-aware. computing the digest (a) of the returned Conclusion data, and comparing that with (b) either It would be nice to end on an upbeat API Enhancements the digest in the response’s ETag head- note, describing some technological Storage APIs are starting to move in this er or the digest it computed before the fix that would allow applications to ig- direction. Recent interfaces to storage original PUT and remembered (which nore the possibility of failures in their services2 allow the application’s write should be the same). It might seem that environment, and specifically in long- call to provide not just a pointer to the there are two cases: if the two message term storage. Unfortunately, in the real data and a length, but also, optionally, digests match, then the data is OK;d oth- world, failures are inevitable. As sys- the application’s message digest of the erwise it isn’t. There are actually four tems scale up, failures become more data. This allows the storage system frequent. Even throwing money at the to detect whether the data was dam- d Assuming the digest algorithm has not been problem can only reduce the incidence aged during its journey from the ap- broken, not a safe assumption for MD5.19 of failures, not exclude them entirely.

54 communications of the acm | november 2010 | vol. 53 | no. 11 practice

Applications in the future will need to Hard Disk Drives: The Good, the Bad Conference on File and Storage Technologies, (2008). and the Ugly! 23. mearian, L. Start-up claims its DVDs last 1,000 years. be much more aware of, and careful in Computerworld, (Nov. 2009). responding to, failures. Jon Elerath 24. mellor, C. Drive suppliers hit capacity increase http://queue.acm.org/detail.cfm?id=1317403 difficulties.The Register, (July 2010). The high-performance computing 25. michail, H.E., Kakarountas, A.P., Theodoridis, G., community accurately describes what You Don’t Know Jack about Disks Goutis, C.E. A low-power and high-throughput Dave Anderson implementation of the SHA-1 hash function. In needs to be done: http://queue.acm.org/detail.cfm?id=864058 Proceedings of the 9th WSEAS International “We already mentioned the lack of Conference on Computers, (2005). 26. mielke, N., Marquart, T., Wu1, N., Kessenich, J., Belgal, coordination between software layers H., Schares, E., Trivedi, F., Goodness, E., Nevill, L.R. Bit References error rate in NAND flash memories. In 46th Annual with regards to errors and fault man- 1. adams, D. The Hitchhiker’s Guide to the Galaxy. British International Reliability Physics Symposium, (2008), Broadcasting Corp., 1978. agement. Currently, when a software 9–19. 2. amazon. Amazon S3 API Reference (Mar. 2006); 27. moore, R. L., D’Aoust, J., McDonald, R. H., Minor, D. Disk http://docs.amazonwebservices.com/AmazonS3/ layer or component detects a fault it and tape storage cost models. In Archiving 2007. latest/API/. 28. national Institute of Standards and Technology does not inform the other parts of the 3. andersen, D.G., Franklin, J., Kaminsky, M., (NIST). Federal Information Processing Standard Phanishayee, A., Tan, L., Vasudevan, V. FAWN: A fast software running on the system in a Publication 180-1: Secure Hash Standard, (Apr. 1995). array of wimpy nodes. In Proceedings of the ACM 29. patterson, D. A., Gibson, G., Katz, R.H. A case for consistent manner. As a consequence, SIGOPS 22nd Symposium on Operating Systems redundant arrays of inexpensive disks (RAID). In Principles (2009), 1–14. fault-handling actions taken by this Proceedings of the ACM SIGMOD International 4. anderson. D. Hard drive directions (Sept. 2009); Conference on Management of Data, (June 1988), software component are hidden to the http://www.digitalpreservation.gov/news/events/ 109–116. other_meetings/storage09/docs/2-4_Anderson- rest of the system. …In an ideal wor[l] 30. Pinheiro, E., Weber, W.-D., Barroso, L. A. Failure trends seagate-v3_HDtrends.pdf. in a large disk drive population. In Proceedings of 5th d, if a software component detects a 5. Bairavasundaram, L., Goodson, G., Schroeder, B., Usenix Conference on File and Storage Technologies, Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H. An potential error, then the information (Feb. 2007). analysis of data corruption in the storage stack. In 31. prabhakaran, V., Agrawal, N., Bairavasundaram, L., should propagate to other compo- Proceedings of 6th Usenix Conference on File and Gunawi, H., Arpaci-Dusseau, A.C., Arpaci-Dusseau, Storage Technologies, (2008). nents that may be affected by the error R.H. IRON file systems. InProceedings of the 20th 6. Baker, M., Shah, M., Rosenthal, D.S.H., Roussopoulos, Symposium on Operating Systems Principles, (2005). or that control resources that may be M., Maniatis, P., Giuli, T.J., Bungale, P. A fresh look 32. rosenthal, D.S.H. Bit preservation: A solved problem? 7 at the reliability of long-term digital storage. In responsible for the error.” International Journal of Digital Curation 1, 5 (2010). Proceedings of EuroSys2006, (Apr. 2006). 33. rosenthal, D.S.H. LOCKSS: Lots of copies keep stuff In particular, as regards storage, 7. cappello, F., Geist, A., Gropp, B., Kale, S., Kramer, B., safe. In NIST Digital Preservation Interoperability Snir, M. Toward exascale resilience. Technical Report APIs should copy Amazon’s S3 by pro- Framework Workshop, (Mar. 2010). TR-JLPC-09-01. INRIA-Illinois Joint Laboratory on 34. rosenthal, D.S.H., Robertson, T.S., Lipkis, T., Reich, Petascale Computing, (July 2009). viding optional data-integrity capabili- V., Morabito, S. Requirements for digital preservation 8. cern. Worldwide LHC Computing Grid, 2008; http:// systems: a bottom-up approach. D-Lib Magazine 11, 11 ties that allow applications to perform lcg.web.cern.ch/LCG/. (2005). 9. chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., end-to-end checks. These APIs should 35. schroeder, B., Gibson, G. Disk failures in the real world: Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., What does an MTTF of 1,000,000 hours mean to you? be enhanced to allow the application Grube, R.E. Bigtable: A distributed storage system In Proceedings of 5th Usenix Conference on File and for structured data. In Proceedings of the 7th to provide an optional nonce that is Storage Technologies (Feb. 2007). Usenix Symposium on Operating System Design and 36. schwarz, T., Baker, M., Bassi, S., Baumgart, B., Flagg, prepended to the object data before Implementation, (2006), 205–218. W., van Imngen, C., Joste, K., Manasse, M., Shah, M. 10. christensen, C.M. The Innovator’s Dilemma: When the message digest reported to the ap- Disk failure investigations at the Internet Archive. In New Technologies Cause Great Firms to Fail. Harvard Work-in-Progress Session, NASA/IEEE Conference on plication is computed. This would al- Business School Press (June 1997), Cambridge, MA. Mass Storage Systems and Technologies, (2006). 11. corbett, P., English, B., Goel, A., Grcanac, T., Kleiman, low applications to exclude the possi- 37. sDSS (Sloan Digital Sky Survey), 2008; http://www. S., Leong, J., Sankar, S. Row-diagonal parity for double sdss.org/. bility that the reported digest has been disk failure correction. In 3rd Usenix Conference on 38. shah, M.A., Baker, M., Mogul, J.C., Swaminathan, R. File and Storage Technologies (Mar. 2004). remembered rather than recomputed. Auditing to keep online storage services honest. In 12. elerath. J. Hard-disk drives: The good, the bad, and the 11th Workshop on Hot Topics in Operating Systems, ugly. Commun. ACM 52, 6 (June 2009). (May 2007). 13. elerath, J.G., Pecht, M. Enhanced reliability modeling Acknowledgments 39. storer, M.W., Greenan, K. M., Miller, E.L., Voruganti, of RAID storage systems. In Proceedings of the K. Pergamum: replacing tape with energy-efficient, Grateful thanks are due to Eric Allman, 37th Annual IEEE/IFIP International Conference on reliable, disk-based archival storage. In Proceedings Dependable Systems and Networks, (2007), 175–184. Kirk McKusick, Jim Gettys, Tom Lipkis, of 6th Usenix Conference on File and Storage 14. engler, D. A system’s hackers crash course: techniques Technologies, (2008). that find lots of bugs in real (storage) system code. Mark Compton, Petros Maniatis, Mary 40. Sun Microsystems. Sales Terms and Conditions, In Proceedings of 5th Usenix Conference on File and Section 11.2, (Dec. 2006); http://store.sun.com/ Baker, Fran Berman, Tsutomu Shimo- Storage Technologies, (Feb. 2007). CMTemplate/docs/legal_terms/TnC.jsp#11. 15. Haber, S., Stornetta, W.S. How to timestamp a digital mura, and the late Jim Gray. Some of 41. sun Microsystems. ST5800 presentation. Sun PASIG document. Journal of Cryptology 3, 2 (1991), 99–111. Meeting, (June 2008). this material was originally presented 16. Hafner, J.L., Deenadhayalan, V., Belluomini, W., Rao, K. 42. talagala, N. Characterizing large storage systems: Undetected disk errors in RAID arrays. IBM Journal at iPRES 2008 and subsequently pub- error behavior and performance benchmarks. Ph.D. of Research & Development 52, 4/5, (2008). thesis, Computer Science Division, University of lished in the International Journal of 17. Jiang, W., Hu, C., Zhou, Y., Kanevsky, A. Are disks California at Berkeley, (Oct. 1999). 32 the dominant contributor for storage failures? A Digital Curation. 43. Williams, P., Rosenthal, D. S. H., Roussopoulos, M., comprehensive study of storage subsystem failure Georgis, S. Predicting the archival life of removable This work was funded by the mem- characteristics. In Proceedings of 6th Usenix hard disk drives. In Archiving 2008, (June 2008). Conference on File and Storage Technologies, (2008). ber libraries of the LOCKSS Alliance, 44. Zhang, Y., Rajimwale, A., Arpaci-Dusseau, A.C., 18. Kelemen, P. Silent corruptions. In 8th Annual Workshop Arpaci-Dusseau, R.H. End-to-end data integrity for file and the Library of Congress’ National on Linux Clusters for Super Computing, (2007) systems: A ZFS case study. In 8th Usenix Conference 19. Klima, V. Finding MD5 collisions—A toy for a notebook. Digital Information Infrastructure on File and Storage Technologies, (2010). Cryptology ePrint Archive, Report 2005/075; http:// and Preservation Program. Errors and eprint.iacr.org/2005/075. 20. Krioukov, A., Bairavasundaram, L.N., Goodson, G.R., omissions are the author’s own. David S.H. Rosenthal has been an engineer in Srinivasan, K., Thelen, R., Arpaci-Dusseau, A.C., Silicon Valley for a quarter of a century, including as Arpaci-Dusseau, R.H. Parity lost and parity regained. a Distinguished Engineer at Sun Microsystems and In Proceedings of 6th Usenix Conference on File and employee #4 at NVIDIA. For the last decade he has been Storage Technologies, (2008). working on the problems of long-term digital preservation 21. maniatis, P., Roussopoulos, M., Giuli, T.J., Rosenthal, Related articles under the auspices of the Stanford Library. D.S.H., Baker, M., Muliadi, Y. Preserving peer replicas on queue.acm.org by rate-limited sampled voting. In Proceedings of Triple-Parity RAID and Beyond the 19th ACM Symposium on Operating Systems Principles, (Oct. 2003), 44–59. Adam Leventhal 22. marshall, C. “It’s like a fire. You just have to move on:” http://queue.acm.org/detail.cfm?id=1670144 Rethinking personal digital archiving. In 6th Usenix © 2010 ACM 0001-0782/10/1100 $10.00

november 2010 | vol. 53 | no. 11 | communications of the acm 55 practice

doi:10.1145/1839676.1839693 me I have a tough row to hoe, but I will Article development led by queue.acm.org attempt to argue that this time Pike is merely rearranging the deckchairs of the Titanic and that he missed the next To move forward with programming big thing by a wide margin. languages we must first break free from Pike got fed up with C++ and Java and did what any self-respecting hacker the tyranny of ASCII. would do: he created his own language— better than Java, better than C++, better by Poul-Henning Kamp than C—and he called it Go. But did he go far enough? The Go language does not in any way look substantially different from any of the other programming languages. Sir, Please Fiddle a couple of glyphs here and there and you have C, C++, Java, Python, Tcl, or whatever. Programmers are a picky bunch Step Away when it comes to syntax, and it is a so- bering thought that one of the most rap- idly adopted programming languages of all time, Perl, barely had one for the from the longest time. Ironically, what syntax de- signers are really fighting about is not so much the proper and best syntax for the expression of ideas in a machine-under- ASR-33! standable programming language as it is the proper and most efficient use of the ASCII table real estate.

It’s all ASCII to me… There used to be a programming lan- One of the naughty details of my Varnish software guage called ALGOL, the lingua franca of computer science back in its heyday. is that the configuration is written in a domain- ALGOL was standardized around 1960 specific language that is converted into C source and dictated about a dozen mathemat- code, compiled into a shared library, and executed ical glyphs such as ×, ÷, ¬, and the very readable subscripted 10 symbol, for at hardware speed. That obviously makes me a use in what today we call scientific no- programming language syntax designer, and just as tation. Back then computers were built obviously I have started to think more about how we by hand and had one-digit serial num- bers. Having a teletypewriter custom- express ourselves in these syntaxes. ized for your programming language Rob Pike recently said some very pointed words was the least of your worries. A couple of years later came the APL about the Java programming language, which if you programming language, which includ- think about it, sounded a lot like the pointed words ed an extended character set containing James Gosling had for C++, and remarkably similar to a lot of math symbols. I am told that APL still survives in certain obscure corners what Bjarne Stroustrup said about good ol’ C. of insurance and economics modeling. I have always admired Pike. He was already a giant in Then ASCII happened around 1963, and ever since, programming languages the field when I started, and his ability to foretell the have been trying to fit into it. (Wikipedia future has been remarkably consistent.1 In front of claims that ASCII grew the backslash [\]

56 communications of the acm | november 2010 | vol. 53 | no. 11 practice

specifically to support ALGOL’s /\ and \/ Boolean operators. No source is provid- ed for the claim.) The trouble probably started for real with the C programming language’s need for two kinds of and and or op- erators. It could have used just or and bitor, but | and || saved one and three characters, which on an ASR-33 teletype Result of the Go Language ASR33 Compatibility Test. amounts to 1/10 and 3/10 second, re- spectively. spite all its desirable properties—such supportive to the understanding of the It was certainly a fair trade-off—just as being created as a language to be main body of code? think about how fast you type yourself— embedded in tools—it has been widely And need I remind anybody that you but the price for this temporal frugality spurned, often with arguments about cannot buy a monochrome screen any- was a whole new class of hard-to-spot excessive use of, or difficult-to-figure- more? Syntax-coloring editors are the bugs in C code. out placement of, {} and []. default. Why not make color part of the Niklaus Wirth tried to undo some of My disappointment with Rob Pike’s syntax? Why not tell the compiler about the damage in Pascal, and the bickering Go language is that the rest of the protected code regions by putting them over begin and end would no } take. world has moved on from ASCII, but on a framed light gray background? Or C++ is probably the language that he did not. Why keep trying to cram an provide hints about likely and unlikely milks the ASCII table most by allow- expressive syntax into the straitjacket code paths with a green or red back- ing templates and operator overload- of the 95 glyphs of ASCII when Unicode ground tint? ing. Until you have inspected your data has been the new black for most of the For some reason computer people types, you have absolutely no idea what past decade? are so conservative we still find it more + might do to them (which is probably Unicode has the entire gamut of uncompromisingly important for our why there never was enough interest to Greek letters, mathematical and techni- source code to be compatible with a Tele- stage an International Obfuscated C++ cal symbols, brackets, brockets, sprock- type ASR-33 terminal and its 1963-vin- Code Contest, parallel to the IOCCC for ets, and weird and wonderful glyphs tage ASCII table than it is for us to be the C language). such as “Dentistry symbol light down able to express our intentions clearly. C++ stops short of allowing the pro- and horizontal with wave” (0x23c7). Why And, yes, me too: I wrote this in vi(1), grammer to create new operators. You do we still have to name variables Ome- which is why the article does not have cannot define :-: as an operator; you have gaZero when our computers now know all the fancy Unicode glyphs in the first to stick to the predefined set. If Bjarne how to render 0x03a9+0x2080 properly? place. Stroustrup had been more ambitious on The most recent programming lan- this aspect, C++ could have beaten Perl guage syntax development that had Related articles by 10 years to become the world’s sec- anything to do with character sets apart on queue.acm.org ond write-only programming language, from ASCII was when the ISO-C stan- after APL. A Conversation with Arthur Whitney dard committee adopted trigraphs to http://queue.acm.org/detail.cfm?id=1531242 How desperate the hunt for glyphs is make it possible to enter C source code in syntax design is exemplified by how on computers that do not have ASCII’s You’re Doing It Wrong Poul-Henning Kamp Guido van Rossum did away with the 95 characters available—a bold and de- http://queue.acm.org/detail.cfm?id=1814327 canonical scope delimiters in Python, cisive step in the wrong direction. How Not to Write Fortran in Any Language relying instead on indentation for this While we are at it, have you noticed Donn Seeley purpose. What could possibly be of such that screens are getting wider and wider http://queue.acm.org/detail.cfm?id=1039535 high value that a syntax designer would these days, and that today’s text pro- brave the controversy this caused? A cessing programs have absolutely no Reference high-value pair of matching glyphs, { 1. pike, R. Systems software research is irrelevant; problem with multiple columns, insert http://herpolhode.com/rob/utah2000.pdf. and }, for other use in his syntax could. displays, and hanging enclosures being (This decision also made it impossible placed in that space? Poul-Henning Kamp ([email protected]) has to write Fortran programs in Python, a But programs are still decisively ver- programmed computers for 26 years and is the inspiration behind bikeshed.org. His software has been widely laudable achievement in its own right.) tical, to the point of being horizontally adopted as “under the hood” building blocks. His most The best example of what happens challenged. Why can’t we pull minor recent project is the Varnish HTTP accelerator, which is used to speed up large Web sites such as Facebook. if you do the opposite is John Ouster- scopes and subroutines out in that hout’s Tcl programming language. De- right-hand space and thus make them © 2010 ACM 0001-0782/10/1100 $10.00

november 2010 | vol. 53 | no. 11 | communications of the acm 57 contributed articles

doi:10.1145/1839676.1839694 NVIDIA’s graphics processing units, or For workloads with abundant parallelism, GPUs, follow in the footsteps of earlier throughput-oriented processor designs GPUs deliver higher peak computational but have achieved far broader use in throughput than latency-oriented CPUs. commodity machines. Broadly speak- ing, they focus on executing parallel by Michael Garland and David B. Kirk workloads while attempting to maxi- mize total throughput, even though sacrificing the serial performance of a single task may be required. Though improving total throughput at the ex- Understanding pense of increased latency on individual tasks is not always a desirable trade-off, it is unquestionably the right design de- cision in many problem domains that Throughput- rely on parallel computations, includ- ing real-time computer graphics, video processing, medical-image analysis, molecular dynamics, astrophysical sim- ulation, and gene sequencing. Oriented Modern GPUs are fully programma- ble and designed to meet the needs of a problem domain—real-time computer graphics—with tremendous inherent Architectures parallelism. Furthermore, real-time graphics places a premium on the to- tal amount of work that can be accom- plished within the span of a single frame (typically lasting 1/30 second). Due to their historical development, GPUs have evolved as exemplars of throughput- oriented processor architecture. Their Much has been written about the transition of emphasis on throughput optimiza- tion and their expectation of abundant commodity microprocessors from single-core to available parallelism is more aggressive multicore chips, a trend most apparent in CPU than many other throughput-oriented processor families. Commodity PCs are now typically architectures. They are also widely avail- built with CPUs containing from two to eight cores, able and easily programmable. NVIDIA with even higher core counts on the horizon. These key insights chips aim to deliver higher performance by exploiting  Throughput-oriented processors tackle problems where parallelism is modestly parallel workloads arising from either the abundant, yielding design decisions need to execute multiple independent programs different from more traditional latency- oriented processors. or individual programs that themselves consist of  Due to their design, programming multiple parallel tasks, yet maintain the same level throughput-oriented processors requires much more emphasis on of performance as single-core chips on sequential parallelism and scalability than workloads. programming sequential processors.  GP Us are the leading exemplars A related architectural trend is the growing of modern throughput-oriented prominence of throughput-oriented microprocessor architecture, providing a ubiquitous commodity platform for exploring architectures. Processors like Sun’s Niagara and throughput-oriented programming.

58 communications of the acm | november 2010 | vol. 53 | no. 11 Figure 1. Throughput-oriented processors like the NVIDIA Tesla C2050 deliver substantially higher performance on intrinsically parallel computations, including molecular dynamics simulations. released its first GPU supporting the and AMBER recently added CUDA-ac- core and just over 17 times the through- CUDA parallel computing architecture celerated computations (http://am- put of all eight cores. in 2006 and is currently shipping its bermd.org/gpus/). Its Generalized Born Using the GPU as a case study, this third-generation CUDA architecture, implicit solvent calculation for this sys- article explores the fundamental archi- code-named “Fermi,”24 released in 2010 tem running on the eight cores of a dual tectural design decisions differentiating in the Tesla C2050 and other processors. four-core Intel Xeon E5462 executes at a throughput-oriented processors from Figure 1 is a nucleosome structure rate of 0.06 nanoseconds of simulation their more traditional latency-oriented (with 25,095 atoms) used in bench- time per day of computation. The same counterparts. These architectural differ- marking the AMBER suite of molecular calculation running on an NVIDIA Tesla ences also lead to an approach to paral- dynamics simulation programs. Many C2050 executes the simulation at a rate lel programming that is qualitatively dif- core computations performed in molec- of 1.04 ns/day, roughly 144 times more ferent from the parallel thread models ular dynamics are intrinsically parallel, work per day than a single sequential prevalent on today’s CPUs.

november 2010 | vol. 53 | no. 11 | communications of the acm 59 contributed articles

Throughput-Oriented Processors dividual throughput-oriented architec- workloads, all with interleaved multi- Two fundamental measures of proces- tures may not use all three architectural threading.21 Each is capable of switch- sor performance are task latency (time features just listed. Also worth noting is ing between threads at each cycle. Thus elapsed between initiation and comple- that several architectural strategies, in- the execution of threads is interleaved at tion of some task) and throughput (to- cluding pipelining, multiple issue, and extremely fine granularity, often at the tal amount of work completed per unit out-of-order execution, avoid task-level instruction level. time). Processor architects make many latency by improving instruction-level Blocking multithreading is a coarser- carefully calibrated trade-offs between throughput. grain strategy in which a thread might latency and throughput optimization, Hardware multithreading. A compu- run uninterrupted until encountering a since improving one could degrade the tation in which parallelism is abundant long-latency operation, at which point other. Real-world processors tend to em- can be decomposed into a collection of a different thread is selected for execu- phasize one over the other, depending concurrent sequential tasks that may tion. The streaming processors Imag- on the workloads they are expected to potentially be executed in parallel, or ine,16 Merrimac,9 and SPI Storm17 are encounter. simultaneously, across many threads. notable examples of throughput-orient- Traditional scalar microprocessors We view a thread as a virtualized sca- ed architectures adopting this strategy. are essentially latency-oriented archi- lar processor, typically meaning each These machines explicitly partition pro- tectures. Their goal is to minimize the thread has a program counter, register grams into bulk load/store operations running time of a single sequential file, and associated processor state. A on entire data blocks and “kernel” tasks program by avoiding task-level latency thread is thus able to execute the in- in which memory accesses are restrict- whenever possible. Many architectural struction stream corresponding to a ed to on-chip blocks loaded on their techniques, including out-of-order single sequential task. Note this model behalf. When a kernel finishes process- execution, speculative execution, and of threads says nothing about the way ing its on-chip data, a different task in sophisticated memory caches, have concurrent threads are scheduled; for which required memory blocks have been developed to help achieve it. This instance, whether they are scheduled been loaded onto the chip is executed. traditional design approach is predi- fairly (any thread ready to run is eventu- Overlapping the bulk data transfer for cated on the conservative assumption ally executed) is a separate issue. one or more tasks while another is ex- that the parallelism available in the It is well known that multithread- ecuting hides memory-access latency. workload presented to the processor is ing, whether in hardware31 or software,4 Strategic placement of kernel bound- fundamentally scarce. Single-core sca- provides a way of tolerating latency. If a aries where context switches occur can lar CPUs typified by the Intel Pentium given thread is prevented from running also substantially reduce the amount IV were aggressively latency-oriented. because it is waiting for an instruction to of state that must be retained between More recent multicore CPUs (such as make its way through a pipelined func- task executions. the Intel Core2 Duo and Core i7) re- tional unit, data to arrive from external A third strategy called simultane- flect a trend toward somewhat less-ag- DRAM or some other event, a multi- ous multithreading30 allows different gressive designs that expect a modest threaded system can allow another un- threads to simultaneously issue instruc- amount of parallelism. blocked thread to run. That is, the long- tions to independent functional units Throughput-oriented processors, latency operations of a single thread can and is used to improve the efficiency in contrast, arise from the assumption be hidden or covered by ready-to-run of superscalar sequential processors that they will be presented with work- work from another thread. This focus without having to find instruction-level loads in which parallelism is abundant. on tolerating latency, where processor parallelism within a single thread. It is This fundamental difference leads to utilization does not suffer simply be- likewise used by NVIDIA’s Fermi archi- architectures that differ from tradition- cause a fraction of the active threads are tecture24 in place of intra-thread dual is- al sequential machines. Broadly speak- blocked, is a hallmark of throughput- sue to achieve higher utilization. ing, throughput-oriented processors oriented processors. The design of the HEP,28 Tera,2 and rely on three key architectural features: Hardware multithreading as a de- NVIDIA G8022 processors highlights emphasis on many simple processing sign strategy for improving aggregate an instructive characteristic of some cores, extensive hardware multithread- performance on parallel workloads has throughput-oriented processors: none ing, and use of single-instruction, mul- a long history. The peripheral proces- provides a traditional cache for load/ tiple-data, or SIMD, execution. Aggres- sors of the Control Data Corp. CDC 6600 store operations on external memory, sively throughput-oriented processors, developed in the 1960s and the Het- unlike latency-oriented processors exemplified by the GPU, willingly sacri- erogeneous Element Processor (HEP) (such as typical CPUs) that expend fice single-thread execution speed to in- system28 developed in the late 1970s substantial chip area on sophisticated crease total computational throughput are notable examples of the early use of cache subsystems. These machines are across all threads. hardware multithreading. Many more able to achieve high throughput in the No successful processor can afford multithreaded processors have been absence of caches because they assume to optimize aggregate task throughput designed over the years31; for example, there is sufficient parallel work available while completely ignoring single-task la- the Tera,1,2 Sun Niagara,18 and NVIDIA to hide the latency of off-chip memory tency or vice versa. Different processors GPU22 architectures all use aggres- accesses. Unlike previous NVIDIA pro- may also vary in the degree they empha- sive multithreading to achieve high- cessors, the Fermi architecture provides size one over the other; for instance, in- throughput performance on parallel a cache hierarchy for external memory

60 communications of the acm | november 2010 | vol. 53 | no. 11 contributed articles accesses but still relies on extensive threading, SIMD execution has a long multithreading for latency tolerance. history dating to at least the 1960s. Many simple processing units. The Most SIMD machines can be classi- high transistor density in modern semi- fied into two basic categories. First is conductor technologies makes it feasi- the SIMD processor array, typified by the ble for a single chip to contain multiple Aggressively ILLIAC IV developed at the University of processing units, raising the question of throughput-oriented Illinois,7 the Thinking Machines CM-2,29 how to use the available area on the chip and the MasPar Computer Corp. MP-1.5 to achieve optimal performance: one processors, All consisted of a large array of process- very large processor, a handful of large exemplified ing elements (hundreds or thousands) processors, or many small processors? and a single control unit that would Designing increasingly large single- by the GPU, consume a single instruction stream. processor chips is unattractive.6 The willingly sacrifice The control unit would broadcast each strategies used to obtain progressively instruction to all processing elements higher scalar performance (such as single-thread that would then execute the instruction out-of-order execution and aggres- in parallel. sive speculation) come at the price of execution speed The second category is the vector rapidly increasing power consump- to increase total processor, exemplified by the Cray-125 tion; incremental performance gains and numerous other machines11 that incur increasingly large power costs.15 computational augment a traditional scalar instruc- Thus, while increasing the power con- throughput across tion set with additional vector instruc- sumption of a single-threaded core is tions operating on data vectors of some physically possible, the potential perfor- all threads. fixed width—64-element vectors in the mance improvement from more aggres- Cray-1 and four-element vectors in the sive speculation appears insignificant most current vector extensions (such as by comparison. This analysis has led to the x86 Streaming SIMD Extensions, or an industrywide transition toward mul- SSE). The operation of a vector instruc- ticore chips, though their designs re- tion, like vector addition, may be per- main fundamentally latency-oriented. formed in a pipelined fashion (as on the Individual cores maintain roughly com- Cray-1) or in parallel (as in current SSE parable scalar performance to earlier implementations). Several modern pro- generations of single-core chips. cessor families, including x86 proces- Throughput-oriented processors sors from Intel and AMD and the ARM achieve even higher levels of perfor- Cortex-A series, provide vector SIMD mance by using many simple, and hence instructions that operate in parallel small, processing cores.10 The individu- on 128-bit (such as four 32-bit integer) al processing units of a throughput-ori- values. Programmable GPUs have long ented chip typically execute instructions made aggressive use of SIMD; current in the order they appear in the program, NVIDIA GPUs have a SIMD width of 32. rather than trying to dynamically reor- Many recent research designs, includ- der instructions for out-of-order execu- ing the Vector IRAM,19 SCALE,20 and tion. They also generally avoid specula- Imagine and Merrimac streaming pro- tive execution and branch prediction. cessors,9,16 have also used SIMD archi- These architectural simplifications of- tectures to improve efficiency. ten reduce the speed with which a sin- SIMD execution is attractive because, gle thread completes its computation. among other things, it increases the However, the resulting savings in chip amount of resources that can be devot- area allow for more parallel processing ed to functional units rather than con- units and correspondingly higher total trol logic. For instance, 32 floating-point throughput on parallel workloads. arithmetic units coupled with a single SIMD execution. Parallel processors control unit takes less chip area than frequently employ some form of single- 32 arithmetic units with 32 separate instruction, multiple-data, or SIMD, control units. The desire to amortize execution12 to improve their aggregate the cost of control logic over numerous throughput. Issuing a single instruction functional units was the key motivating in a SIMD machine applies the given factor behind even the earliest SIMD operation to potentially many data op- machines.7 erands; SIMD addition might, for ex- However, devoting less space to con- ample, perform pairwise addition of two trol comes at a cost. SIMD execution de- 64-element sequences. As with multi- livers peak performance when parallel

november 2010 | vol. 53 | no. 11 | communications of the acm 61 contributed articles

tasks follow the same execution trace scene can easily paint millions of pixels streaming multiprocessors, or SMs.22,24 and can suffer when heterogeneous at a time, thus generating a great deal Figure 2 diagrams a representative tasks follow completely different ex- of completely parallel work. Further- Fermi-generation GPU like the GF100 ecution traces. The effi ciency of SIMD more, processing an element generally processor used in the Tesla C2050. Each architectures depends on the availabil- involves launching a thread to execute multiprocessor supports on the order of ity of suffi cient amounts of uniform a program—usually called a shader— a thousand co-resident threads and is work. In practice, suffi cient uniformity written by the developer. Consequently, equipped with a large register fi le, giv- is often present in abundantly parallel GPUs are specifi cally designed to ex- ing each thread its own dedicated set workloads, since it is more likely that ecute literally billions of small user-writ- of registers. A high-end GPU with many a pool of 10,000 concurrent tasks con- ten programs per second. SMs can thus sustain tens of thousands sists of a small number of task types Most real-time visual applications of threads simultaneously. Multiproces- rather than 10,000 completely dispa- are designed to run at a rate of 30–60 sors contain many scalar processing rate computations. frames per second. A graphics system is elements that execute the instructions therefore expected to generate, render, issued by the running threads. Each GPus and display images of visually complex multiprocessor also contains high- Programmable GPUs are the leading ex- worlds within 33ms. Since it must com- bandwidth, low-latency on-chip shared emplars of aggressively throughput-ori- plete many millions of independent memory, while at the same time provid- ented processors, taking the emphasis tasks within this timeframe, the time ing its threads with direct read/write ac- on throughput further than the vast ma- to complete any one of these tasks is cess to off-chip DRAM. The Fermi archi- jority of other processors and thus offer- relatively unimportant. But the total tecture can confi gure its 64KB of per-SM ing tremendous potential performance amount of work that can be completed memory as either a 16KB L1 cache and on massively parallel problems.13 within 33ms is of great importance, as 48KB RAM or a 48KB L1 cache and 16KB Historical perspective. Modern GPUs it is generally closely correlated with the RAM. It also provides a global 768KB L2 have evolved according to the needs of visual richness of the environment be- cache shared by all SMs. The table here real-time computer graphics, two as- ing displayed. summarizes the capacity of a single SM pects of which are of particular impor- Their role in accelerating real-time for the three generations of NVIDIA CU- tance to understanding the develop- graphics has also made it possible for DA-capable GPUs. ment of GPU designs: it is an extremely GPUs to become mass-market devices, The SM multiprocessor handles all parallel problem, and throughput is its and, unlike many earlier throughput- thread creation, resource allocation, paramount measure of performance. oriented machines, they are also widely and scheduling in hardware, inter- Visual applications generally model available. Since late 2006, NVIDIA has leaving the execution of threads at the the environments they display through a shipped almost 220 million CUDA-capa- instruction-level with essentially zero collection of geometric primitives, with ble GPUs—several orders of magnitude overhead. Allocation of dedicated regis- triangles the most common. The most more than historical massively parallel ters to all active threads means there is widely used techniques for producing architectures like the CM-2 and MasPar no state to save/restore when switching images from these primitives proceed machines. between threads. With all thread man- through several stages where processing NVIDIA GPU architecture. Beginning agement performed in hardware, the is performed on each triangle, triangle with the G80 processor released in late cost of employing many threads is mini- corner, and pixel covered by a triangle. 2006, all modern NVIDIA GPUs sup- mal. For example, a Tesla C2050 execut- At each stage, individual triangles/ port the CUDA architecture for parallel ing the increment() kernel in Figure 3 vertices/pixels can be processed inde- computing. They are built around an will create, execute, and retire threads at pendently of all others. An individual array of multiprocessors, referred to as a rate of roughly 13 billion threads/sec.

figure 2. nViDia GPu consisting of an array of multithreaded multiprocessors.

GPU Off-chip memory SM SM SM DRAM SIMT Control SIMT Control SIMT Control

Thread DRAM

scheduling Processing Processing Processing Elements Elements Elements

L1 On-chip memory L1 On-chip memory L1 On-chip memory Memory Interface DRAM PCIe Bus

Interconnection Network Host Interface

Global L2 Cache

62 communications of the acm | november 2010 | vol. 53 | no. 11 contributed articles

To manage its large population of customize the number of threads and indicating the function increment() threads efficiently, the GPU employs blocks launched for a particular kernel will be launched in parallel across B a single-instruction, multiple-thread, at each invocation. A thread block is a blocks of T threads each. The blocks or SIMT, architecture in which threads group of parallel threads that may syn- of a kernel are numbered using two-di- resident on a single SM are executed in chronize with one another at a per-block mensional indices visible to the kernel groups of 32, called warps, each execut- barrier and communicate among them- as the special variables blockIdx.x ing a single instruction at a time across selves through per-block shared mem- and blockIdx.y, ranging from 0 to all its threads. Warps are the basic unit ory. Threads from different blocks may gridDim.x-1 and gridDim.y-1, of thread scheduling, and in any given coordinate with one another via atomic respectively. Similarly, the threads cycle the SM is free to issue an instruc- operations on variables in the global of a block are numbered with three- tion from any runnable warp. The memory space visible to all threads. dimensional indices threadIdx.x, threads of a warp are free to follow their There is an implicit barrier between suc- threadIdx.y, threadIdx.z; the ex- own execution path, and all such execu- cessive dependent kernels launched by tent of the block in each dimension is tion divergence is handled automatical- the host program. given by blockDim.x, blockDim.y, ly in hardware. However, it is obviously The NVIDIA CUDA Toolkit (http:// and blockDim.z. more efficient for threads to follow the www.nvidia.com/cuda) includes a C The function parallel _ incre- same execution path for the bulk of the compiler equipped with a small set of ment() accepts an array x of n elements computation. Different warps may fol- language extensions for writing CUDA and launches a parallel kernel with low different execution paths without programs. Figure 3 sketches a simple at least one thread for each element penalty. CUDA program fragment illustrating organized into blocks of 256 threads While SIMT architectures share these extensions. The global modifi- each. Since the data in the example is many performance characteristics with er indicates the increment() function one-dimensional, the code in Figure SIMD vector machines, they are, from is a kernel entry point and may be called 3 uses one-dimensional indices. Also, the programmer’s perspective, quali- only when launching a kernel. Unmodi- since every thread in this computation tatively different. Vector machines are fied functions and those functions ex- is completely independent, deciding to typically programmed with either vector plicitly marked host are normal C use 256 threads per block in our imple- intrinsics explicitly operating on vectors functions. The host program launches mentation was largely arbitrary. Every of some fixed width or compiler auto- kernels using the function-call-like thread of the kernel computes a globally vectorization of loops. In contrast, SIMT syntax increment<<>>(...), unique index i from its local thread- machines are programmed by writing a scalar program describing the action of Figure 3. Trivial CUDA C kernel for incrementing each element of an array. a single thread. A SIMT machine implic- itly executes groups of independent sca- __global__ void increment(float *x, int n) lar threads in a SIMD fashion, whereas a { vector machine explicitly encodes SIMD // Each thread will process 1 element, which execution in the vector operations in the // is determined from the thread’s index. int i = blockIdx.x*blockDim.x + threadIdx.x; instruction stream it is given. CUDA programming model. The CUDA if( i>>(x, n); processor and one or more parallel ker- } nels that can be executed by the host thread(s) on a parallel device. Individual kernels execute a scalar sequential program across a set of par- Capacity of each SM over three GPU generations. allel threads. The programmer orga- nizes the kernel’s threads into thread blocks, specifying for each kernel G8x/G9x GT2xx GF100 launch the number of blocks and num- Registers (32-bit) 8192 16384 32768 ber of threads per block to be created. Co-resident threads 768 1024 1536 CUDA kernels are thus similar in style Independent warps 24 32 48 to a blocked form of the familiar sin- Shared memory (KB) 16 16 48/16 gle-program, multiple-data, or SPMD, L1 cache (KB) — — 16/48 paradigm. However, CUDA is somewhat L2 cache (KB per chip) — — 768 more flexible than most SPMD sys- tems in that the host program is free to

november 2010 | vol. 53 | no. 11 | communications of the acm 63 contributed articles

Idx and the blockIdx of its block. It dently, a simple CUDA implementation

increments the value of xi by 1 if i < n—a would assign a separate thread to each conditional check required since n need row. For large matrices, this could eas- not be a multiple of 256. ily expose millions of threads of paral- lelism. However, for smaller matrices Throughput-Oriented Programming GPUs are with only a few thousand rows, this level Scalability is the programmer’s central specifically of parallelism might be insufficient, so concern in designing efficient algo- an efficient implementation could in- rithms for throughput-oriented ma- designed to stead assign multiple threads to process chines. Today’s architectural trends execute literally each row. In the most extreme case, each clearly favor increasing parallelism, and non-zero element could be assigned to a effective algorithmic techniques must billions of small separate thread. scale with hardware parallelism. Some Figure 4 plots an experiment mea- techniques suitable for four parallel user-written suring the performance of three differ- threads may be entirely unsuitable for programs ent parallel granularities: one thread/ 4,000 parallel threads. Running thou- row, 32 threads/row, and one thread/ sands of threads at a time, GPUs are a per second. non-zero.3 These tests use synthetic ma- powerful platform for exploring scalable trices with a constant number of entries algorithms and a leading indicator for distributed across a variable number of algorithm design on future throughput- rows ranging from one row with four oriented architectures. million entries on the left to four mil- Abundant parallelism. Throughput- lion rows of one entry each on the right. oriented programs must expose sub- The maximal parallelism resulting from stantial amounts of fine-grain parallel- assigning one thread per non-zero ele- ism, fulfilling the expectations of the ment yields the most efficient imple- architecture. Exploiting multicore CPUs mentation when there are few rows obviously requires exposing parallel- but suffers from lower absolute perfor- ism as well, but a programmer’s mental mance due to its need for inter-thread model of parallelism on a throughput- synchronization. For intermediate row oriented processor is qualitatively dif- counts, assigning 32 threads per row is ferent from multicore. A four-core CPU the best solution, while assigning one can be fully utilized by four to eight thread per row is best when the number threads. Thread creation and schedul- of rows is sufficiently large. ing are computationally heavyweight, Calculation is cheap. Computation since they can involve the saving and generally costs considerably less than restoration of processor state and rela- memory transfers, particularly external tively expensive calls to the operating memory transfers that frequently re- system kernel. In contrast, a GPU typi- quire hundreds of cycles to complete. cally requires thousands of threads to The fact that the cost of memory access cover memory latency and reach full has continued to increase and is now utilization, while thread scheduling is quite high relative to the cost of compu- essentially without cost. tation is often referred to as the “mem- Consider computing the product ory wall.” The energy required to move y=Ax, where A is a n×n matrix, and x is an data between the chip and external n-element vector. For sparse problems, DRAM is also far higher than required to because the vast majority of matrix en- operate an on-chip functional unit. In a tries is 0, A is best represented using a 45nm process, a 64-bit integer addition data structure that stores only its non- unit expends roughly 1pJ (picojoule), zero elements. The algorithm for sparse and a 64-bit floating point fused multi- matrix-vector multiplication (SpMV) ply add, or FMA, unit requires around would look like this: 100pJ. In contrast, reading a 64-bit value from external DRAM requires on the or- procedure spmv(y, A, x): der of 2,000pJ.8 for each row i: The high relative cost of access- y[i] = 0 ing memory affects both latency and for each non-zero column throughput-oriented processors, since j: the cost is the result of the physical prop- y[i] += A[i,j] * x[j] erties of semiconductor technology. However, the performance consequenc- Since each row is processed indepen- es of external memory references for

64 communications of the acm | november 2010 | vol. 53 | no. 11 contributed articles throughput-oriented processors can be problem for which most computer sci- This approach is reminiscent of more significant; these processors are ence students learn a sequential solu- quicksort though more efficient since A designed to reach a higher peak com- tion like this: and B are both sorted. If s is drawn from putational throughput and may have a A, “partitioning” A is trivial, since the el- higher peak throughput-to-bandwidth function merge1(A, B): ements less than s are simply those pre- ratio than latency-oriented processors. if empty(A): return B ceding s, and the corresponding point More important, they seek to tolerate if empty(B): return A at which s splits B can be found through rather than avoid latency. To hide the binary search. latency of frequent movement of data if first(A)

θ for 256 unique values of θ. Tabulating a matter of recursively merging Ai with S = [s1, ..., sk] = select _ all 256 possible values would require Bi. The code for doing it would look like elements(A, B, k) little space, but accessing them from ex- this: A0, ..., Ak] = partition(A, ternal memory would require hundreds S) of cycles. In the same amount of time, a function merge2(A, B): [B0, ..., Bk] = partition(B, thread could execute perhaps 50 to 100 if empty(A): return B S) instructions that could be used to com- if empty(B): return A pute the result and leave the memory return merge3(A0,B0) + [s1] + bandwidth available for other uses. s = select _ an _ element(A,B) ... + [sk] + merge3(Ak,Bk) Divide and conquer. Divide-and-con- A1, A2 = partition(A, s) quer methods often yield effective paral- B1, B2 = partition(B, s) Since each recursive merge is inde- lel algorithms, even for apparently serial pendent, this is an intrinsically paral- problems. Consider the merging of two return merge2(A1,B1) + [s] + lel algorithm. Merge algorithms of this sorted sequences A and B, a common merge2(A2,B2) form have been used in the parallel pro- gramming literature for decades14 and Figure 4. Double-precision throughput of SpMV strategies on an NVIDIA GeForce GTX 285 can be used to build efficient merge sort GPU; all matrices have exactly 4 x 210 nonzeros but differing row counts. routines in CUDA.27 Hierarchical synchronization. More often than not, parallel threads must 1 thread/non-zero 32 threads/row 1 thread/row synchronize with one another at various 20 times, but excessive synchronization 18 can undermine the efficiency of paral-

16 lel programs. Synchronization must be treated carefully on massively parallel 14 throughput-oriented processors where 12 thousands of threads could potentially

O P/s 10 contend for a lock.

G F L To avoid unnecessary synchroniza- 8 tion, threads should synchronize hier- 6 archically, keeping parallel computa- 4 tions independent as long as possible. When parallel programs are decom- 2 posed hierarchically, as in divide-and- 0 conquer methods, the synchronization 1 4 16 64 256 1K 4K 16K 64K 256K 1M 4M of threads can also be organized in a hi- Matrix Rows erarchical fashion. For example, when evaluating merge3(A,B) earlier, each recursive parallel merge step can pro-

november 2010 | vol. 53 | no. 11 | communications of the acm 65 contributed articles

ceed independently of all the others. CPU) cannot afford to aggressively trade Supercomputing (Melbourne, Australia). ACM Press, New York, 1998, 425–432. Any synchronization required within for increased total performance at the 12. flynn, M.J. Very high-speed computing systems. these subtasks can be localized to the cost of single-thread performance. The Proceedings of the IEEE 54, 12 (Dec. 1966), 1901–1909. 13. garland, M., Grand, S.L., Nickolls, J., Anderson, J., subtasks. Only at the end, when all sub- spectrum of workloads presented to it Hardwick, J., Morton, S., Phillips, E., Zhang, Y., and sequent operations are merged, must is simply too broad, and not all compu- Volkov, V. Parallel computing experiences with CUDA. IEEE Micro 28, 4 (July 2008), 13–27. the parallel subtasks be synchronized tations are parallel. For computations 14. gavril, F. Merging with parallel processors. Commun. with one another. Organizing synchro- that are largely sequential, latency-ori- ACM 18, 10 (Oct. 1975), 588–591. 15. grochowski, E., Ronen, R., Shen, J., and Wang, H. Best nization hierarchically also aligns well ented processors perform better than of both latency and throughput. In Proceedings of the with the physical cost of synchronizing throughput-oriented processors. On IEEE international Conference on Computer Design (Oct. 11–13). IEEE Computer Society, Washington, threads spread across different sections the other hand, a processor specifically D.C., 2004, 236–243. 16. Kapasi, U., Dally, W.J., Rixner, S., Owens, J.D., of a given system. It is natural to expect intended for parallel computation can and Khailany, B. The Imagine stream processor. that threads executing on a single core accept this trade-off and realize signifi- In Proceedings of the 2002 IEEE International Conference on Computer Design (Sept. 16–18). IEEE can be synchronized much more cheap- cantly greater total throughput on paral- Computer Society, Washington, D.C., 2002, 282–288. ly than threads spread across an entire lel problems as a result. 17. Khailany, B.K., Williams, T., Lin, J., Long, E.P., Rygh, M., Tovey, D.W., and Dally, W.J. A programmable 512 processor, just as threads on a single As the differences between these GOPS stream processor for signal, image, and video machine can be synchronized more architectures appear durable rather processing. IEEE Journal of Solid-State Circuits 43, 1 (Jan. 2008), 202–213. cheaply than threads across the mul- than transient, the ideal system is thus 18. Kongetira, P., Aingaran, K., and Olukotun, K. Niagara: tiple nodes of a cluster. heterogeneous, where a latency-ori- A 32-way multithreaded Sparc processor. IEEE Micro 25, 2 (Mar./Apr. 2005), 21–29. ented processor (such as a CPU) and a 19. Kozyrakis, C. and Patterson, D. Vector vs. superscalar Conclusion throughput-oriented processor (such as and VLIW architectures for embedded multimedia benchmarks. In Proceedings of the 35th Annual ACM/ The transition from single-core to mul- a GPU) work in tandem to address the IEEE International Symposium on Microarchitecture ticore processors and the increasing use heterogeneous workloads presented to (Istanbul, Turkey, Nov. 18–22). IEEE Computer Society Press, Los Alamitos, CA, 2002, 283–293. of throughout-oriented architectures them. 20. Krashinsky, R., Batten, C., Hampton, M., Gerding, S., signal greater emphasis on parallelism Pharris, B., Casper, J., and Asanovic, K. The vector- thread architecture. SIGARCH Computer Architecture as the driving force for higher computa- References News 32, 2 (Mar. 2004), 52–63. 1. alverson, G., Alverson, R., Callahan, D., Koblenz, B., 21. Laudon, J., Gupta, A., and Horowitz, M. Interleaving: tional performance. Yet these two kinds Porterfield, A., and Smith, B. Exploiting heterogeneous A multithreading technique targeting multiprocessors of processors differ in the degree of par- parallelism on a multithreaded multiprocessor. In and workstations. In Proceedings of the Sixth Proceedings of the Sixth international Conference on International Conference on Architectural Support allelism they expect to encounter in a Supercomputing (Washington, D.C., July 19–24). ACM For Programming Languages and Operating Systems typical workload. Throughput-oriented Press, New York, 1992, 188–197. (San Jose, CA, Oct. 5–7). ACM Press, New York, 1994, 2. alverson, R., Callahan, D., Cummings, D., Koblenz, 308–318. processors assume parallelism is abun- B., Porterfield, A., and Smith, B. The Tera computer 22. Lindholm, E., Nickolls, J., Oberman, S., and Montrym, dant, rather than scarce, and their para- system. In Proceedings of the Fourth international J. NVIDIA Tesla: A unified graphics and computing Conference on Supercomputing (Amsterdam, The architecture. IEEE Micro 28, 2 (Mar./Apr. 2008), 39–55. mount design goal is maximizing total Netherlands, June 11–15). ACM Press, New York, 23. nickolls, J., Buck, I., Garland, M., and Skadron, K. throughput of all parallel tasks rather 1990, 1–6 Scalable parallel programming with CUDA. Queue 6, 2 3. Bell, N. and Garland, M. Implementing sparse (Mar./Apr. 2008), 40–53. than minimizing the latency of a single matrix-vector multiplication on throughput-oriented 24. nVIDIA. NVIDIA’s Next-Generation CUDA Compute processors. In Proceedings of the Conference on High sequential task. Architecture: Fermi, Oct. 2009; http://www.nvidia.com/ Performance Computing Networking, Storage and fermi Emphasizing total throughput over Analysis (Portland, OR, Nov. 14–20). ACM Press, New 25. russell, R.M. The Cray-1 computer system. Commun. York, 2009, 1–11. the running time of a single task leads ACM, 21, 1 (Jan. 1978), 63–72. 4. Birrell, A.D. An Introduction to Programming with 26. sanders, J. and Kandrot, E. CUDA By Example: An to a number of architectural design de- Threads. Research Report 35. Digital Equipment Corp. Introduction to General-Purpose GPU Programming. Systems Research, Palo Alto, CA, 1989. cisions. Among them, the three primary Addison-Wesley, July 2010. 5. Blank, T. The MasPar MP-1 architecture. In 27. satish, N., Harris, M., and Garland, M. Designing architectural trends typical of through- Proceedings of Compcon (San Francisco, CA, Feb. 26– efficient sorting algorithms for manycore GPUs. Mar. 2). IEEE Press, 1990, 20–24. In Proceedings of the 2009 IEEE international put-oriented processors are hardware 6. Borkar, S., Jouppi, N.P., and Stenstrom, P. Symposium on Parallel & Distributed Processing (May multithreading, many simple process- Microprocessors in the era of terascale integration. In 23–29). IEEE Computer Society, Washington, D.C., Proceedings of the Conference on Design, Automation 2009, 1–10. ing elements, and SIMD execution. and Test in Europe (Nice, France, Apr. 16–20). EDA 28. smith, B.J. Architecture and applications of the HEP Hardware multithreading makes man- Consortium, San Jose, CA, 2007, 237–242. multiprocessor computer system. Proceedings of the 7. Bouknight, W.J., Denenberg, S.A., McIntyre, D.E., International Society for Optical Engineering 298 (Aug. aging the expected abundant parallel- Randall, J.M., Sameh, A.H., and Slotnick, D.L. The 1981), 241–248. ism cheap. Simple in-order cores forgo Illiac IV system. Proceedings of the IEEE 60, 4 (Apr. 29. tucker, L.W. and Robertson, G.G. Architecture and 1972), 369–388. applications of the Connection Machine. Computer 21, out-of-order execution and speculation, 8. Dally, W. Power efficient supercomputing. Presented 8 (Aug. 1988), 26–38. and SIMD execution increases the ratio at the Accelerator-based Computing and Manycore 30. Tullsen, D.M., Eggers, S.J., and Levy, H.M. Workshop (Lawrence Berkeley National Laboratory, Simultaneous multithreading: maximizing on-chip of functional units to control logic. Sim- Berkeley, CA, Nov. 30–Dec. 2, 2009); http://www.lbl. parallelism. In Proceedings of the 22nd Annual gov/cs/html/Manycore_Workshop09/GPU Multicore ple core design and SIMD execution re- international Symposium on Computer Architecture SLAC 2009/dallyppt.pdf (S. Margherita Ligure, Italy, June 22–24). ACM Press, duce the area and power cost of control 9. Dally, W.J., Labonte, F., Das, A., Hanrahan, P., Ahn, J., New York, 1995, 392–403. Gummaraju, J., Erez, M., Jayasena, N., Buck, I., Knight, logic, leaving more resources for paral- 31. Ungerer, T., Robic˘, B., and Šilc, J. A survey of T. J., and Kapasi, U.J. Merrimac: Supercomputing processors with explicit multithreading. ACM lel functional units. with streams. In Proceedings of the 2003 ACM/IEEE Computing Surveys 35, 1 (Mar. 2003), 29–63. Conference on Supercomputing (Nov. 15–21). IEEE These design decisions are all predi- Computer Society, Washington, D.C., 2003. 10. Davis, J.D., Laudon, J., and Olukotun, K. Maximizing cated on the assumption that sufficient Michael Garland ([email protected]) is a senior CMP throughput with mediocre cores. In Proceedings research scientist in NVIDIA Research, Santa Clara, CA. parallelism exists in the workloads the of the 14th international Conference on Parallel Architectures and Compilation Techniques (Sept. David B. Kirk ([email protected]) is an NVIDIA Fellow and processor is expected to handle. The 17–21). IEEE Computer Society, Washington, D.C., former chief scientist of NVIDIA Research, Santa Clara, 2005, 51–62. performance of a program with insuf- CA. 11. espasa, R., Valero, M., and Smith, J.E. Vector ficient parallelism may therefore suffer. architectures: Past, present and future. In A fully general-purpose chip (such as a Proceedings of the 12th international Conference on © 2010 ACM 0001-0782/10/1100 $10.00

66 communications of the acm | november 2010 | vol. 53 | no. 11 doi:10.1145/1839676.1839695 Concerns about biased manipulation of search results may require intervention involving government regulation.

By Patrick Vogl and Michael Barrett Regulating the Information Gatekeepers

In 2003, 2bigfeet, an Internet business specializing in the sale of oversize shoes ranked among the top results in Google searches for its products. Its prime location on the virtual equivalent of New York’s high-end shopping mecca Fifth Avenue brought a steady stream of clicks and revenue. But success was fleeting:

That November, Google’s engineers mine the most relevant results in the modified their search engine’s algo- index. Although the precise workings rithms, an update later dubbed “Flor- of these algorithms are kept at least as ida” by the search-engine community. secret as Coca-Cola’s formula they are 2bigfeet’s rankings dropped abruptly usually based on two main functions: just before the Christmas selling sea- keyword analysis (for evaluating pages son, and this Internet success story along such dimensions as frequency was suddenly on the brink of bank- of specific words) and link analysis ruptcy.2 (based on the number of times a page Search engines have established is linked to from other sites and the themselves as critical gatekeepers rank of these other sites) (see Figure 1). of information. However, despite an increasingly monopolistic Internet key insights search market, they and the implicit filtering process in their rankings re-  With search engines guiding access to main largely beyond public scrutiny critical Web-based information flow, and control. This has inspired us to many users are increasingly concerned over possible targeted manipulation of explore an increasingly topical ques- search results. tion: Should search-engine ranking be  With markets and technology both regulated? unlikely to ensure unbiased results, Search engines generally work by regulation may be the only alternative. indexing the Web through so-called  O ne promising way forward is crawler programs. When a user types clearer guidelines for search-engine in a request, search algorithms deter- optimization through self-regulation.

november 2010 | vol. 53 | no. 11 | communications of the acm 67 contributed articles

Appreciating the value of top rank- rankings and German car manufac- sites to be expected to know the ‘rules’ ings, Webmasters have learned to turer BMW, received notable media at- about what they can or cannot do?”23 optimize their pages so big search tention. Many more cases of dropped Third, our research supports the engines rank them more highly. This ranking have been condemned to vir- idea that there is no established pro- has spawned a worldwide industry of tual silence, among them search-en- cess of announcement or appeal prior search-engine-optimization consul- gine optimizer Bigmouthmedia. to rank demotion. Companies af- tants, whose techniques are grouped Second, though market-leader fected usually realize their fate only into two categories: white-hat, ensur- Google has published guidelines on through a sudden loss of traffic or rev- ing that search engines easily analyze creating “Google-friendly” pages, the enue. In a personal interview [2008], a site and are accepted by search en- line between permitted and illicit the CEO of an educational company gines; and black-hat, including hid- practices is blurry at best.20 For exam- told us: “The office called me and told den text, as in white text on a white ple, Google’s guidelines rightly warn me [...] that revenue was down [...], so background, considered illicit by against cloaking, “the practice of pre- I checked our logs and our history [...] most search engines and upon discov- senting different content […] to users It was all on one day. We were up to 14 ery generally punished with degraded and search engines.”13 However, to a million pageviews per month, and on ranking. certain extent cloaking can be justified one day it dropped 70% and [stayed Search engines clearly have a le- and used with good intent by major there], and that was it.” gitimate interest in fighting inappro- sites without penalty.8 For instance, Fourth, options are limited for priate third-party optimization tech- the Wall Street Journal uses it to show companies affected by ranking demo- niques to ensure their search-result full versions of pay-per-view articles to tion. One interviewee recalls his com- quality; for instance, sites with no Google’s indexing program.8 pany got no response, even though he other purpose than linking to specific The difficulty of straddling the line personally went to the search engine sites to increase page rank (link farms) between permitted and illicit prac- firm’s headquarters for assistance. are black-hat and must be dealt with tices is further illustrated by a case Fifth, several allegations in the accordingly, though punishment can involving paid links: In February 2009 blogosphere claim large players are be problematic for multiple reasons: Google punished its subsidiary Google treated better than their less-powerful First, sudden ranking demotion Japan through a page rank demotion counterparts. For example, in May and resulting diminished inflow of vis- for paying for online reviews of a new 2008, Hewlett-Packard began offer- itors have major effects on businesses, widget.23 While Google’s attempt to ing free blog templates, including as illustrated by the cases of Skyfacet play by its own rules is positive the hidden links to its own pages, a quick (which reportedly lost $500,000 of rev- case highlights the difficulty of dis- way to gather “high-quality” links and enue in 2006) and MySolitaire (which tinguishing permitted from illicit op- clicks.25 However, there was no evi- reportedly lost $250,000 the same timization techniques. A leading U.S. dence of punishment by major search year14). Only a few cases, including the commentator in online search asked: engines, sparking significant contro- companies SearchKing18 and Kinder- “If Google itself […] found itself in versy in the community.25 start11 involving lawsuits over page this situation, how are ordinary Web Finally, search engines have pun- ished Web sites using search-engine Figure 1. How ranking works. optimization, as well as the search-en- gine-optimization companies them- selves. In the case of SearchKing, a search-engine consulting company Spider/Crawler in Oklahoma, U.S. courts have found Web that Google “knowingly and inten- Searches Web, storing pages in index tionally” dropped the company’s Web sites in its rankings to punish what it deemed illicit ranking manipulation Index that SearchKing had carried out for its clients.18

Rationale for Regulation Search engine software User Several researchers have pointed to the dangers of targeted manipulation, Algorithm accesses index, Makes request determining most relevant arguing it undermines values like free results for queries based on: speech, fairness, economic efficiency, ˲˲ keyword analysis (such as and autonomy, as well as the institu- word location and frequency) tion of democracy. Concerning de- ˲˲ link analysis (such as number of links from other pages and mocracy and free speech, Introna and rank of these pages) Nissenbaum17 argued a decade ago that search engines’ broad structural bias can lead to underrepresentation

68 communications of the acm | november 2010 | vol. 53 | no. 11 contributed articles of niches and minority interests and a search engines shaping user options loss of variety. Drawing on Anderson’s and information could limit user au- theory of ethic limitations to mar- tonomy.3 kets,1 they made a solid case that this Regulation and alternatives. Tra- lack of pluralism does not correspond ditional justification for regulation to society’s “liberal commitments to Companies affected includes control of monopoly power freedom, autonomy, and welfare.”1 In- usually notice their and excess profits, compensation for trona and Nissenbaum viewed the In- externalities, inadequate informa- ternet as a “political good” due to its fate only through tion, unequal bargaining power, and role as “conveyor of information” and a sudden loss of scarcity of essential products.4 While function like “traditional public spac- none perfectly fits the case of targeted es” as a “forum for political delibera- traffic or revenue. results manipulation, intervention tion.”17 Consequently, just as schools could be supported through two ratio- and national heritage are not left to nales: the mercy of free markets, they argued Information asymmetries. Elkin-Ko- the Internet is a public good requiring ren and Salzberger7 pointed out that special protection.17 Similarly, Bracha the Internet’s common market failure and Pasquale3 made the case in 2007 is an overload rather than undersup- that targeted manipulation of search ply of information.7 However, the so- engine results “threatens the open- lution to this problem—the Internet ness and diversity of the Internet as a search market—suffers from a lack system of public expression.”3 of information. Just as laymen cannot Another approach based on demo- easily assess the services of doctors or cratic values emphasizes the right of the effects of a particular medicine,4 free speech. Chandler5 argued it pro- users of search engines are unable to tects not only the ability to listen and adequately evaluate search services. speak, but that intermediaries can- Market power. While the Internet not impose “discriminatory filters search market shows monopolistic that the listener would not otherwise tendencies (extremely strong market have used.”5 Since extraneous bias positions of a few key players), there introduces different discrimination is no strong case for regulatory inter- criteria, free speech is undermined vention on the standard antitrust ar- by targeted search-engine-result ma- gument of abuse of monopoly power nipulation.5 (such as lack of competition leading Fairness might also be under- to excessive pricing). However, a case mined. Since search-engine rankings can be based on Breyer’s argument4 have enormous influence on business of an “unjustifiably discriminatory ex- performance, ranking manipulation ercise of personal power” combined can cause significant harm both arbi- with the “concentration of substantial trarily and more deliberately. Search social and political power” in a private engines as private entities are gener- entity that controls an “essential prod- ally free to conduct business as they uct”4 (see also Bracha and Pasquale3 wish within the limitations applicable for a similar view on the application of to all companies. However, Webmas- essential-facility arguments to search ters cannot simply opt out of their de- engines). These arguments were de- pendence on search engines, perhaps veloped in the early 20th century U.S. representing an “inescapable influ- business environment when a group ence,”10 taking their relationship from of railway companies in control of ac- the private to the public sphere where cess to the city of St. Louis prevented more dependable accountability is ex- competitors from offering services in pected.10 the same area.21 Even if these argu- Other concerns involve economic ments do not apply to users of search efficiency, deception, and autonomy. engines, they may apply to Webmas- Targeted manipulation can limit the ters unable to choose the search en- availability of information, causing gines the public uses to find informa- market inefficiencies and barriers to tion. entry.3 Manipulating search results, Regulation must be compared to while leading users “to believe that other solutions, particularly free mar- search results are based on relevancy kets. The basic market argument is alone,”16 could be deceptive, while that search engines have an incentive

november 2010 | vol. 53 | no. 11 | communications of the acm 69 contributed articles

to produce the most relevant search arguments emerge from a U.S.-centric results; otherwise, users would switch context, which, as the Microsoft anti- to their competitors.17 However, mar- trust case in the 1990s showed, is not kets alone are unlikely to address the the only legal arena for regulation. concerns of targeted manipulation for Moreover, in legal circles it has been three reasons: Markets alone suggested that the existing regula- Proprietary algorithms. Users would are unlikely to be tory frameworks may be inadequate not be able to detect targeted manipu- for something as groundbreaking as lation in most cases, as search engines sufficient to address Internet search. Given the state of the keep their algorithms secret.3 More- the concerns law, governments and multinational over, even if users are more aware of bodies may need to create a new regu- results manipulation, their expecta- about targeted latory framework. tion of what they are searching for is manipulation. shaped during the search process.17 Search and Its Stakeholders Consequently, they cannot objectively Who are the stakeholders and what evaluate search-engine quality17; are their interests? Mitchell wrote19 Concentrated market. While the that stakeholders are characterized by Internet search market is highly con- power, legitimacy, and urgency. Build- centrated, a less monopolistic mar- ing on a broad investigation of stake- ket is unlikely to emerge due to high holder interests in search-engine law economies of scale.3 Furthermore, in- by Grimmelmann,15 we see four main cumbents benefit from their existing actors—users, search engines, Web- user base by, say, collecting user data masters, and search-engine optimiz- through such products as the Google ers—that are, to some extent, charac- toolbar. Moreover, the emergence of terised by Mitchell’s attributes. new big players is also unlikely, since Table 1 is an overview of key stake- promising start-ups could be acquired holder interests in Internet search by such dominant search giants as bias and stakeholder recognition of Google and Microsoft; and possible manipulation. Search-engine User inertia. While switching might optimizers are particularly conflicted. seem easy (users simply type another On the one hand, they stand to gain address), evidence suggests that per- from greater transparency in Inter- sonal habit is a key factor when select- net search, as their business would ing a search engine22; moreover, new be easier and more efficient, and a technologies like personalized search clearer picture of accepted practices are likely to raise switching costs sig- would enable them to guarantee their nificantly. clients that search engines tolerate Technological development is also their techniques. On the other hand, often mentioned in the context of they profit from lack of transparency, search-engine bias.12 However, while as it raises the value of their key as- new technologies may alleviate some sets—expertise and inside knowledge. concern (such as reinforcement of Moreover, search-engine optimizers popular sites), no technology in sight with good contacts and community- is likely to cure the problem of target- wide attention can profit from their ed manipulation. influence with search-engine com- Two counterarguments often panies in “curing” cases of rank de- brought up against regulation of motion. Therefore, top performers search-engine bias are that search arguably have little interest in greater results are free speech and therefore transparency. cannot be regulated, and search en- Commentators have made several gines are not essential facilities, as proposals for regulating targeted ma- they do not fulfil the criteria of essen- nipulations. We comment first on the tial facilities accepted by U.S. courts. two most promising, then introduce a While these arguments have mer- new approach (outlined in Table 2). it, they are insufficient for reject- Obligation to provide reasons for ing intervention for several reasons: rank demotion. One proposal20 sug- Though the courts have acknowledged gested establishing an obligation to First Amendment rights for search provide a reason for rank demotion to engines, a number of legal scholars increase transparency in the relation- have argued against this view.3,5 Both ship between Webmasters and search

70 communications of the acm | november 2010 | vol. 53 | no. 11 contributed articles engines. In this context, Pasquale20 tween black-hat and white-hat search ters and optimizers a better idea of drew an interesting analogy with cred- engine optimizers are gray, often what to expect from search engines. it-reporting agencies providing rea- forcing Webmasters and optimizers The diminished gray area would lead sons for adverse credit information to to speculate as to which techniques to more consistent application of consumers.20 It would favor the inter- would be punished through ranking ranking degradation, as questionable ests of users and especially Webmas- degradation. In a personal interview sites would fall more clearly into one ters, because it would support Web- on search engine optimization, one category or the other. Moreover, new site optimization for search engines. search-engine consultant said the guidelines could also cover search Search-engine optimizers would key to differentiating between black- engines’ assessment of intent in ques- probably be divided between the less hat and white-hat optimization tech- tionable cases, further promoting influential that gain and top perform- niques in unclear cases is “implica- consistent treatment of market play- ers that lose from greater transpar- tion of intent.” However, in the 2008 ers. This approach is promising be- ency. Building on the credit-agency case of Hewlett-Packard mentioned cause it has advantages for all stake- analogy, Pasquale20 argued that the earlier, comments by search-engine holders (see Figure 2). cost to search engines and risk of al- optimizers indicate the existence of Clearer guidelines. A self-regulato- gorithmic reverse engineering would at least some bias distinguishing be- ry approach initiated by regulators be low.20 However, as the number of tween permitted and illicit optimiza- would be the easiest and most ef- potential queries on rank demotion is tion methods. ficient way to achieve clearer guide- arguably much higher than the num- One approach to increasing trans- lines. There is ample precedence that ber on adverse credit ratings, cost to parency in the relationship between self-regulation works in cyberspace; the search engines would likely be Webmasters and search engines is to for example, in the early days of Inter- significant. Moreover, search engines establish clearer guidelines distin- net search, many search engines did would likely oppose any obligation to guishing black-hat from white-hat op- not distinguish between organic and provide precise reasons for each rank timization. The gray area in between paid results (advertisements). Howev- demotion, as it would increase the would be diminished, giving Webmas- er, following a letter by the U.S. Feder- risk of lawsuits. Installation of ombudsmen and pro- Table 1. Stakeholder interests in the regulation of search bias. cess of appeal. Taking the previous proposal a step further, some have called for an appeal process against Interests Awareness of manipulation rank demotion.24 For example, For- Users ˲˲ High-quality content, particularly quick query Low response and inclusion of new pages; clear, 10 syth emphasized if search engines well-structured results; transparent construction were public entities with ensuing of results accountability, a process of appeal ˲˲ Pleasant search experience, easy usability would probably already have been Search Engines ˲˲ Protection of trade secrets, particularly High 10 algorithms established. While Google offers a ˲˲ Freedom to make changes to algorithms for way to submit pages for “reconsidera- preventing abuse of search-engine optimization tion,”8 such appeals are judged inter- ˲˲ Low cost of regulation nally without transparency, and, while Webmasters ˲˲ Consistent, transparent ranking of pages in Low, some among successful in some cases, a Webmas- search engines professionals ter’s only option might be to attract Search-Engine ˲˲ Top group: low transparency High Optimizers ˲˲ Low-level search-engine optimizers: enough publicity to get a search-en- high transparency gine representative to take up the case internally.9 Installing a formal, transparent ap- peals process would clearly be in the interest of both users and Webmas- Table 2. Regulatory proposals and stakeholder interests. ters, while assisting some search en- gine optimizers but diminishing the ++ strongly supportive — opposed value of top optimizer contacts. On + supportive — — strongly opposed the other hand, search engines would Search Content Search-Engine arguably incur somewhat higher costs Regulatory Proposal Users Engines Providers Optimizers for installing such a process. More- Obligation to provide reasons for rank ++ — ++ divided over, a formal process with a clear demotion chance of success would facilitate ap- Installation of ombudsmen and appeal ++ — ++ divided peals, thereby increasing numbers of process requests and probably encouraging Clearer guidelines for search-engine ++ + ++ divided “appeal gaming.” optimization Clearer guidelines for search engine optimization. Current guidelines be-

november 2010 | vol. 53 | no. 11 | communications of the acm 71 contributed articles

Figure 2. How clearer search-engine optimization guidelines would affect stakeholders. org/oti/articles/1201/icann.html 7. elkin-Koren, N. Law, Economics and Cyberspace. Edward Elgar Publishing, Cheltenham, U.K., 2004. 8. fishkin, R. White-hat cloaking: It exists, it’s permitted, it’s useful. SEOMoz Blog (June 30, 2008); http://www.seomoz.org/blog/white-hat-cloaking-it- Search-Engine exists-its-permitted-its-useful Search Engines 9. fontenot, D. Matt Cutts, Why am I still being Optimization Consultants punished? SEO Scoop (Jan. 24, 2008); http://www. + less risk of lawsuits seo-scoop.com/2008/01/24/matt-cutts-why-am-i- – slightly less freedom + Able to guarantee clients that still-being-punished/ fighting abusive their practices are accepted 10. forsyth, H. Google MP? How the Internet Is search-engine + less risk of sudden rank Challenging Our Notions of Political Power. optimization Clearer demotion Presentation at Pembroke College Ivory Tower Guidelines + optimization gets easier Society, Cambridge, U.K. (Jan. 28, 2008). 11. goldman, E. KinderStart v. Google dismissed: – less value of inside knowledge With sanctions against KinderStart’s counsel. and contacts Technology & Marketing Law Blog (Mar. 20, 2007); http://blog.ericgoldman.org/archives/2007/03/ kinderstart_v_g_2.htm 12. goldman, E. Search engine bias and the demise of Webmasters Users search engine utopianism. Yale Journal of Law & Technology 8 (Spring 2006), 188–200. + easier Website + increased relevance of results 13. google, Inc. Webmaster Tools Help Cloaking, Sneaky optimization for search engines Javascript Redirects, and Doorway Pages (Dec. 4, + less risk of sudden ranking demotion 2008); http://www.google.com/support/webmasters/ bin/answer.py?answer=66355&topic=15263. 14. greenberg, A. Condemned to Google hell. Forbes (Apr. 4, 2007); http://www.forbes.com/2007/04/29/ sanar-google-skyfacet-tech-cx_ag_0430googhell. html 15. grimmelmann, J. The structure of search engine law. al Trade Commission in 2002 recom- promote quick and inexpensive reso- Iowa Law Review 93, 1 (Nov. 2007), 3–63. 16. Hippsley, H. Letter from FTC to Search Engines 6 mending search engines ensure that lution of domain-name conflicts. A Regarding Commercial Alert Complaint Requesting “paid […] results are distinguished similar body could help establish new Investigation of Various Internet Search Engine Companies for Paid Placement and Paid Inclusion from non-paid results with clear and optimization guidelines and manage Programs. Federal Trade Commission, Washington, 16 D.C., June 27, 2002; http://www.ftc.gov/os/closings/ conspicuous disclosures,” all major the mediation process. staff/commercialalertattatch.shtm search engines implemented such 17. Introna, L.D. and Nissenbaum, H. Shaping the Web: Why the politics of search engines matters. The practices. Also, self-regulation would Conclusion Information Society 16, 3 (2000), 169–185. help the process of regulation keep up Here, we’ve argued for the need to 18. miles-LaGrange, V. SearchKing, Inc. v. Google Technology, Inc. CIV-02-1457-M.U.S. District Court with the pace of technological change. open a debate on how to regulate for the Western District of Oklahoma, 2003. Finally, as clearer guidelines would ar- targeted ranking manipulation that 19. mitchell, R.K., Agle, B.R., and Wood, D.J. Toward a theory of stakeholder identification and salience: guably favor all stakeholders, search hinders search-engine optimization. Defining the principle of who and what really counts. engines would at least join the public These practices threaten democracy The Academy of Management Review 22, 4 (Oct. 1997), 853–886. dialogue on self-regulation. and free speech, fairness, market ef- 20. Pasquale, F. Rankings, Reductionism, and Policymakers could signal the im- ficiency, autonomy, and freedom Responsibility. Seton Hall Public Law Research Paper. Seton Hall University, South Orange, NJ, Feb. 25, portance of targeted manipulation from deception. Making the case for 2006. and initiate a dialogue on self-regu- regulation can be based on the search 21. pitofsky, R. The Essential Facility Doctrine Under United States Antitrust Law. Federal Trade lation by creating a committee of key market’s failure of information asym- Commission, Washington, D.C., 2001; http://www. stakeholders to examine cases of rank metries and concentration of market ftc.gov/os/comments/intelpropertycomments/ pitofskyrobert.pdf demotion and recommend ways to power over an essential product in a 22. sherman, C. Search engine users: Loyal or blasé? Search Engine Watch (Apr. 19, 2004); http:// improve today’s optimization guide- private entity. Our analysis of specific searchenginewatch.com/3342041 lines. In addition, the topic of search- regulatory proposals and their impli- 23. sullivan, D. Google penalizes Google Japan for buying links. Search Engine Land (Feb. 11, 2009); http:// engine regulation could be put on the cations for stakeholders highlights searchengineland.com/google-penalizes-google- agenda of the next United Nations the benefits of establishing clearer japan-16541 24. Sullivan, D. Google ombudsman? Search Internet Governance Forum (www.in- guidelines for optimizers. ombudsman? Great idea: Bring them on! tgovforum.org). Search Engine Watch (July 2006); http://blog. searchenginewatch.com/blog/060706-075235 Self-regulation alone may not alle- 25. Wall, A. Strategic content as marketing for link References building. SEO Book (May 8, 2008); http://www. viate concern about rank demotion. 1. anderson, E. Value in Ethics and Economics. Harvard seobook.com/content-marketing-win One idea from the Internet’s early University Press, Cambridge, MA, 1993. 2. Battelle, J. The Search: How Google and Its Rivals days may chart another way forward. Rewrote the Rules of Business and Transformed Our As disputes over domain names be- Culture. Nicholas Brealey Publishing, London, 2005. Patrick Vogl ([email protected]) is a graduate of the MPhil 3. Bracha, O. and Pasquale, F. Federal Search in Technology Policy at the Judge Business School, came more heated in the 1990s and Commission? Access, Fairness and Accountability in University of Cambridge, U.K. U.S. trademark law proved insuffi- the Law of Search. University of Texas Public Law Research Paper No. 123, Austin, TX, 2007. Michael Barrett ([email protected]) is a Reader in cient, the Internet Corporation for 4. Breyer, S. Typical justifications for regulation. In A Information Technology and Innovation and Director of Reader on Regulation, R. Baldwin, C. Scott, and C. Programmes at the Judge Business School, University of Assigned Names and Numbers (www. Hood, Eds. Oxford University Press, New York, 1998, Cambridge, U.K. icann.org) and the World Intellectual 59–93. 5. chandler, J. A right to reach an audience: An Property Organization (www.wipo. approach to intermediary bias on the Internet. int) developed the Uniform Domain- Hofstra Law Review 35, 3 (Spring 2007), 1095–1139. 6. Davis, G. The ICANN uniform domain name dispute Name Dispute-Resolution Policy resolution policy after nearly two years of history. (www.icann.org/en/udrp/udrp.htm) to e-OnTheInternet (Jan./Feb. 2002); http://www.isoc. © 2010 ACM 0001-0782/10/1100 $10.00

72 communications of the acm | november 2010 | vol. 53 | no. 11 Introducing:

The ACM Magazine for Students

XRDS delivers the tools, resources, knowledge, and connections that computer science students need to succeed in their academic and professional careers!

The All-New XRDS: Crossroads is the official magazine for ACM student members featuring:

� Breaking ideas from top researchers and PhD students � Career advice from professors, HR managers, entrepreneurs, and others � Interviews and profiles of the biggest names in the field � First-hand stories from interns at internationally acclaimed research labs � Up-to-date information on the latest conferences, contests, and submission deadlines for grants, scholarships, fellowships, and more!

Also available The All-New XRDS.acm.org XRDS.acm.org is the new online hub of XRDS magazine where you can read the latest news and event announcements, comment on articles, plus share what’s happening at your ACM chapter, and more. Get involved by visiting today!

XRDS.acm.org

ACM_XRDS_Ad_Final.indd 1 4/21/10 12:41:51 PM review articles

doi:10.1145/1839676.1839696 such attacks. However, classic work Computational complexity may truly be from economics and political science proves that every reasonable election the shield against election manipulation. system sometimes gives voters an in- centive to vote insincerely (see Duggan17 by Piotr Faliszewski, Edith Hemaspaandra, and the references therein). Reasonable and Lane A. Hemaspaandra election systems cannot make manipu- lation impossible. However, they can make manipulation computationally infeasible. This article is a nontechnical intro- Using duction to a startling approach to pro- tecting elections: using computational complexity as a shield. This approach seeks to make the task of whoever is Complexity trying to affect the election computa- tionally prohibitive. To better under- stand the cases in which such protec- tion cannot be achieved, researchers to Protect in this area also spend much of their time working for the Dark Side: trying to build polynomial-time algorithms to attack election systems. Elections This complexity-based approach to protecting elections was pioneered in a stunning set of papers, about two de- cades ago, by Bartholdi, Orlin, Tovey, and Trick.2,3,5 The intellectual fire they lit smoldered for quite a while, but in recent years has burst into open flame. Computational complexity may For thousands of years, people—and more recently, truly be the key to defending elections electronic agents—have been conducting elections. from manipulation.

And surely for just as long, people—or more recently, Preliminaries and the Complexity electronic agents—have been trying to affect the of the Winner Problem outcomes of those elections. Such attempts take In the introduction, we focused on many forms. Often and naturally, actors may seek to key insights change the structure of the election, for example, by  A lgorithms can be used to seek attacks attracting new voters, suppressing turnout, recruiting on elections, and complexity can serve to protect elections from attacks. For candidates, or setting election district boundaries. some election systems, manipulation Sometimes voters may even be bribed to vote a certain has been proven NP-hard. way. And a voter may try to manipulate an election  Dichotomy theorems pinpoint what it is about an election system that by casting an insincere vote that may yield a more makes it computationally resistant to manipulation. apon favorable outcome than would the voter’s sincere vote: l

 It is natural to consider an election n ga Not all people who preferred Ralph Nader in the 2004 system's computational weaknesses and lvi U.S. presidential election actually voted for him. strengths as one factor, among many, when selecting a system for a given on By me i One might hope that by choosing a particularly task. In particular, one must consider

which types of attacks one most needs ustrat

wonderful election system, one can perfectly block to thwart. ill

74 communications of the acm | november 2010 | vol. 53 | no. 11 review articles

protecting elections, rather than on ping from (C, V ) to a “winner set” W, why and in what settings elections are Ø Í W Í C. Perhaps the most famous used for aggregating preferences in and common election system is plural- the fi rst place. The latter issue could ity, in which each candidate who most itself fi ll a survey—but not this survey. often comes at the top of voters’ orders However, before moving on we briefl y Voting was already is put into W. We will focus quite a bit mention a few varied examples of how very important on plurality in this article, since it has elections can be useful in aggregating been extensively studied with respect preference. In daily life, humans use before computers to using complexity to protect elec- elections to aggregate preferences in and the internet tions. Plurality is itself a special case tasks ranging from citizens choosing of a broad class of election systems their political representatives to an ac- existed, and in the known as scoring systems or scoring- ademic department’s faculty members rule systems. In these, each candidate selecting which job candidate to hire to modern world, gets from each voter a certain number conference business meeting attend- where multiagent of points based on where he or she ees selecting future locations for their falls in the voter’s ordering, and who- conference. In electronic settings, elec- settings abound, ever gets the most points wins. For tions often can take on quite different, the importance of example, the scoring point system for yet also interesting and important, plurality (in k-candidate elections) is challenges. For example, one can build voting is greater that a voter’s favorite candidate gets a metasearch engine based on combin- still. one point from that voter and the ing underlying search engines, in or- other k−1 candidates get zero points der to seek better results and be more from that voter. In the Borda election resistant to “Web spam.”18 One can system, proposed in the 18th century, use voting as an approach to building the points from favorite to least fa- recommender systems41 and to plan- vorite are k−1, k−2, . . . , 0. In veto elec- ning.20 Voting was already very impor- tions, the points are 1, 1, 1, . . . , 1, 0; tant before computers and the inter- that is, the voter in effect votes against net existed, and in the modern world, one candidate. Scoring systems are where multiagent settings abound, the a fl exible, important class of voting importance of voting is greater still. systems and, as we will see, they are a In this article, we will discuss the class whose manipulation complexity successes and failures to date in using (for fi xed numbers of candidates) is complexity to defend against three im- completely analyzed. There are many portant classes of attacks on election other important election systems, but systems: (structural) control attacks, to move the article along, we will intro- (voter) manipulation, and bribery. In duce them as we need. these three settings, high computa- An election system that immediately tional complexity is the goal. But fi rst, merits note is the Condorcet rule. In we briefl y discuss a case so surprising Condorcet elections, a candidate is a that one might not even think of it, winner exactly if he or she beats each namely, the case in which an election other candidate in head-to-head major- system is so complex that even deter- ity-rule elections under the voters’ pref- mining who won is intractable. erences. Consider the election shown We must fi rst introduce the model of in Figure 1. In that election there is no elections we will use throughout this ar- Condorcet winner, since is beaten by ticle. While doing so, we will also defi ne 3-to-1, is beaten by 4-to-0, and some election systems, such as plurality rule. An election consists of a candidate figure 1. an election. set C and a list V of votes (ballots) over those candidates. In almost all the elec- tion systems we discuss, a vote is simply a strict ordering of all the candidates, for example, “Nader > Gore > Bush” if the voter likes Nader most, Gore next most, and Bush least. An exception is approval voting, in which each vote is a bit-vector giving a thumbs-up or thumbs-down to each candidate. An election system is simply a map-

76 communications of the acm | november 2010 | vol. 53 | no. 11 review articles in a 2-to-2 tie, fails to beat . tem has one glaring fl aw: It is compu- polynomial-time greedy algorithms Although this example shows that tationally intractable to tell who won! correctly fi nd the Lewis Carroll winner Condorcet elections sometimes have no This was fi rst shown in a paper by Bar- all but an asymptotically exponentially winner, some election systems—the so- tholdi, Tovey, and Trick,4 who showed vanishing portion of the time when the called Condorcet-consistent systems— that this problem was NP-hard. Later, number of voters is more than qua- so value the naturalness of the notion Hemaspaandra, Hemaspaandra, and dratically larger than the number of of being a Condorcet winner that they Rothe30 precisely classifi ed the prob- candidates and the inputs are drawn ensure that when a Condorcet winner lem’s complexity as “complete” (that from the uniform probability distri- exists, he or she is the winner in their is, in a certain formal sense the hardest bution.35,39 In fact, that algorithm can system. One particularly interesting sys- problem) for the class of problems that even be made “self-knowingly cor- tem is the election system proposed by can be solved by parallel access to NP (a rect”—it almost always declares that p the famous author and mathematician class that forms the Θ2 level of the poly- its answer is correct, and when it does Lewis Carroll in 1876. Carroll took an nomial hierarchy).a so it is never wrong.35 Another way of ar- approach that should warm the hearts On its face, this result is a disaster guably bypassing the hardness results of computer scientists. He said, in ef- for Lewis Carroll’s election system. Al- for the Lewis Carroll winner problem fect, that whoever had the smallest edit though we want manipulation of elec- is through approximation algorithms. distance from being a Condorcet winner tions to be diffi cult, we do not want to For example, Caragiannis et al.9 have was the winner in his election system. achieve this by migrating to election recently developed two approximation His edit distance was with respect to systems so opaque that we cannot ef- algorithms for computing candidates’ the number of sequential exchanges of fi ciently compute who won. scores in Carroll’s system. And a third adjacent candidates in voter orderings. This disaster may not be quite as way to sidestep the hardness results is So in the Figure 1 example, and tie severe as it fi rst seems. Recent work to change the framework, namely, to as Carroll winners, since either of them on Lewis Carroll elections seeks to assume that the number of candidates with one adjacent exchange can become slip around the edges of the just-men- or the number of voters is bounded by a Condorcet winner (for example, if we by tioned intractability result. In particu- a fi xed constant, and to seek polynomi- one exchange turn voter 1’s preference lar, two recent papers show that simple al-time algorithms in that setting.b The list into > > > , then becomes seminal paper of Bartholdi, Tovey, and a We will not provide here a discussion of NP- 4 a Condorcet winner), but for example p Trick successfully pursued this line, as would take seven adjacent exchanges hardness/NP-completeness/Θ2-completeness, but suffi ce it to say that complexity theorists to become a Condorcet winner. broadly believe any problem that has any one b Many real-life settings have relatively few can- Lewis Carroll’s system is quite lovely of these properties is intractable, that is, does didates. And a particularly interesting setting in that it focuses on the closeness to not have a deterministic polynomial-time algo- with few voters but many candidates comes 18 Condorcet winnerhood. Carroll’s paper rithm. However, these notions are worst-case from Dwork et al., who suggested building a notions. In the section “Using Complexity to search engine for the Web that would simply has been included in books collecting Block Election Manipulation” we will discuss query other search engines and then conduct the most important social choice pa- how their worst-case nature is itself a worry an election given the search engines’ answers pers of all time. However, Carroll’s sys- when using them to protect election systems. as votes.

table 1. the computational complexity of control in condorcet, copeland, Llull, and plurality elections.

the results regarding constructive control in Condorcet and plurality elections are due to bartholdi et al.,5 the results on destructive control for Condorcet and plurality are due to hemaspaandra et al.,31 and the results regarding llull and Copeland are due to Faliszewski et al.25 Adding (unlimited number of) Candidates has not been explicitly studied in bartholdi et al.5 and hemaspaandra et al.,31 but the results on this for Condorcet and plurality elections are corollaries to these papers’ proofs.

election system condorcet copeland Llull Plurality const. Dest. const. Dest. const. Dest. const. Dest. control type control control control control control control control control Adding (unlimited number of) Candidates i v r v v v r r Adding Candidates i v r v r v r r deleting Candidates v i r v r v r r run-off Partition of Candidates (ties Promote) v i r v r v r r run-off Partition of Candidates (ties eliminate) v i r v r v r r Partition of Candidates (ties Promote) v i r v r v r r Partition of Candidates (ties eliminate) v i r v r v r r Partition of voters (ties eliminate) r v r r r r v v Partition of voters (ties Promote) r v r r r r r r Adding voters r v r r r r v v deleting voters r v r r r r v v

november 2010 | vol. 53 | no. 11 | communications of the acm 77 review articles

Between these types and the different The other control types similarly tie-breaking rules that can be used to are motivated as abstractions of real- decide which candidates move forward world actions—many far more savory from the preliminary rounds of two- than vote suppression. For example, round elections in the case of ties (that Control by Adding Voters abstracts is, whether all the tied people move for- such actions as get-out-the-vote drives, ward or none of them do), there now positive advertising campaigns, pro- are eleven types of control that are typi- viding vans to drive elderly people to cally studied—each having both a con- the polls, registration drives, and so structive and a destructive version. on. Control by Adding Candidates and For reasons of space we will not de- Control by Deleting Candidates refl ect fi ne all 11 types here. We will defi ne the effect of recruiting candidates one type explicitly and will mention into—and pressuring them to with- the motivations behind most of the draw from—the race. The memory of others. Let us consider Control by De- the 2000 U.S. presidential race sug- leting Voters. In this control scenario, gests that whether a given small-party figure 2. Ramon Llull, 13th-century mystic the input to the problem is the elec- candidate—say Ralph Nader—enters and philosopher. tion (C, V ), a candidate p ∈ C, and an a race can change the outcome. The integer k. The question is whether by partition models loosely capture other have more recent papers.7,11,24, 25 Howev- removing at most k votes from V one behaviors, such as gerrymandering. er, their polynomial-time algorithms can make p be the sole winner (for the Table 1 summarized the construc- sometimes involve truly astronomical constructive case) or can preclude p tive and destructive control results for multiplicative constants. from being a unique winner (for the four election systems whose behavior Finally, we mention that in the destructive case). Control by Delet- is completely known: Plurality, Con- years since the work showing Lewis ing Voters is loosely inspired by vote dorcet, Llull, and Copeland. The mys- Carroll’s election system to have a win- suppression: It is asking whether by tic and philosopher Ramon Llull (Fig- ner problem that is complete for paral- the targeted suppression of at most k ure 2) defi ned the Llull system in the lel access to NP, a number of other sys- votes the given goal can be reached. 1200s, and the Copeland system is a tems, most notably those of Kemeny (By discussing vote suppression we are closely related system defi ned in mod- and Young, have also been shown to be in no way endorsing it, and indeed we ern times. In both of these systems complete for parallel access to NP.33,43 are discussing paths toward making it one considers each pair of candidates Naturally, researchers have sought to computationally infeasible.) So, for a and awards one point to the winner in bypass these hardness results as well given election system E, we are inter- their head-to-head majority-rule con- (for examples, see6,9,12,36). ested in the complexity of the set com- test, and if the head-to-head contest posed of all inputs (C, V, p, k) for which is a tie, in Copeland each gets half a using complexity to Block the goal can be reached.c point but in Llull each still gets one election control point. So, for example, in Copeland Can our choice of election systems—and c As to who is seeking to do the control, that is ex- one gets ||C||−1 points exactly if one is not merely nasty ones with hard winner ternal to the model. For example, it can be some a Condorcet winner. The Llull/Cope- problems but rather natural ones with central authority or a candidate’s campaign land system is used in the group stage polynomial-time winner problems—be committee. In fact, in the real world there often used to make infl uencing election out- are competing control actors. But results we control actor knows all the votes of all the vot- will soon cover show that even a single control ers. But note that that just makes the “shield” comes costly? We start the discussion actor faces a computationally infeasible prob- results stronger: they show that even if one had of this issue by considering problems lem. Also, the reader may naturally feel uncom- perfect information about the votes, fi nding a of election control, introduced by Bar- fortable with the model’s assumption that the control action is still intractable. tholdi, Tovey, and Trick5 in 1992. In election control, some actor who has figure 3. the points assigned by the Llull/copeland systems in the head-to-head contests of the election of figure 1. complete knowledge of all the votes seeks to achieve a desired outcome—ei- ther making a favored candidate be the sole winner (“constructive control”) or 1. Llull 2. copeland precluding a despised candidate from : 0 : 0 : 0 : 0 : 0 : 0 being a unique winner (“destructive : 1 : 1 : 1 : 1 : 1 : 1 control”)—via changing the structure of the election. The types of structural : 1 : 1 : 0.5 : 1 changes that Bartholdi, Tovey, and : 1 : 0 : 0.5 : 0 Trick proposed are adding or deleting : 0 : 0 candidates, adding or deleting voters, : 1 : 1 or partitioning candidates or voters into a two-round election structure.

78 communications of the acm | november 2010 | vol. 53 | no. 11 review articles of the World Cup soccer tournament, To conclude our discussion of con- except there (after rescaling) wins get trol, we mention one other setting, that one point and ties get one third of a of choosing a whole assembly or com- point. Figure 3 shows how the election mittee through an election. Such assem- from Figure 1 comes out under the bly-election settings introduce a range Llull and Copeland systems. In Table The current push- of new challenges. For example, the 1, I (immunity) means one can never pull between using voters will have preferences over assem- change the outcome with that type of blies rather than over individual candi- control attack—a dream case; R (re- complexity as a dates. We point the reader to the work of sistance) means it is NP-hard to de- shield and seeking Meir et al.40 for results on the complexity termine whether a given instance can of controlling elections of this type. be successfully attacked—still a quite holes in and paths good case; and V (vulnerability) means Using Complexity to Block there is a polynomial-time algorithm around that shield is Election Manipulation to detect whether there is a successful a natural part of the Manipulation is often used informally attack of the given type (and indeed to as a broad term for attempts to affect produce the exact attack)—the case drama of science. election outcomes. But in the litera- we wish to avoid. ture, manipulation is also used to refer Remarkably, given that Llull cre- just to the particular attack in which a ated his system in the 1200s, among all voter or a coalition of voters seeks to natural systems based on preference cast their votes in such a way as to ob- orders, Llull and Copeland are the tain a desired outcome, for example, systems that currently have the great- making some candidate win. In formu- est numbers of proven resistances lating such problems, one often stud- to control. As one can see from Table ies the case in which each voter has a 1, Copeland is perfectly resistant to weight, as is the case in the electoral the constructive control types and to college and in stockholder votes. The all voter-related control types (but is input to such problems consists of the vulnerable to the destructive, can- weights of all voters, the votes of the didate-related control types). And nonmanipulators, and the candidate Llull’s 13th-century system is almost the manipulators are trying to make a as good. Ramon Llull, the mystic, truly winner. was ahead of his time. Manipulation problems have been If one wants an even greater num- studied more extensively than either ber of resistances than Copeland/Llull control or bribery problems, and so provides, one currently can do that in the literature is too broad to survey in two different ways. Recently, Erdélyi, any detail. But we now briefly mention Nowak, and Rothe22 showed that a vot- a few of the key themes in this study, ing system whose votes are in a differ- including using complexity to protect, ent, richer model—each voter provides using algorithms to attack, studying both an approval vector and a strict approximations to bypass protections, ordering—has a greater number and analyzing manipulation properties of control resistances, although in of random elections. achieving that it loses some of the The seminal papers on complex- particular resistances of Cope- ity of manipulation are those of Bar- land/Llull. And Hemaspaandra, tholdi, Orlin, Tovey, and Trick.2,3 Hemaspaandra, and Rothe32 con- Bartholdi, Tovey, and Trick3 gave structed a hybridization scheme that polynomial-time algorithms for ma- allows one to build an election system nipulation and proved a hardness-of- whose winner problem—like the win- manipulation result (regarding so- ner problem of all four systems from called second-order Copeland voting). Table 1—is computationally easy, yet Bartholdi and Orlin2 showed that for the system is resistant to all 22 control “single transferable vote,” a system attacks. Unfortunately, that election that is used for some countries’ elec- system is in a somewhat tricky manner tions, whether a given voter can ma- “built on top of” other systems each nipulate the election is NP-complete, of which will in some cases determine even in the unweighted case. the winner, and so the system lacks the Even if election systems are proven attractiveness and transparency that intractable to manipulate in general, it real-world voters reasonably expect. remains possible that if one allows only

november 2010 | vol. 53 | no. 11 | communications of the acm 79 review articles

a certain number of candidates, the which showed that scoring systems famously developed the theory of aver- manipulation problem becomes easy. are NP-complete to manipulate (in the age-case NP-hardness,37 and although Conitzer, Sandholm, and Lang15 pro- weighted setting) precisely if they al- that theory is difficult to apply and is vide a detailed study of this behavior, low “diversity of dislike” (that is, the tied to what distributions one uses, it showing for each of many election sys- point values for the second favorite would be extremely interesting to es- tems the exact number of candidates and least favorite candidates differ), tablish that the manipulation, control, necessary to make its (constructive, and that all other scoring systems are and bribery problems for important weighted, coalitional) manipulation easy to manipulate. From this it fol- election systems are average-case NP- problem computationally infeasible. lows that the only easily manipulable hard with respect to some appropriate For example, in this setting manipula- scoring systems are an infinite collec- and compellingly natural distribution. tion is easy for Borda with up to two can- tion of trivial systems, plurality, and A very exciting new path toward didates, but becomes infeasible when an infinite collection of systems that circumventing hardness-of-manipula- restricted even to three candidates. are disguised, transformed versions tion results (and, potentially, toward In contrast, it is well known that of plurality; all other scoring systems more generally circumventing hard- manipulation is simple for plurality are NP-hard to manipulate. ness results about election-related is- elections regardless of the number of There has been an intense effort sues) is to look at restricted domains candidates. That is unfortunate, since to circumvent such hardness results. for the collections of votes the elec- plurality elections are the most com- Indeed, the seminal paper on manipu- torate may cast. In particular, there mon and most important elections in lation3 provided a greedy single-voter is a very important political science the real world. manipulation algorithm that was later notion called “single-peaked prefer- What holds for scoring-rule elec- proved to also work in an interest- ences,” in which the candidates are tion systems other than plurality? One ing range of coalitional-manipulation modeled along an axis, such as liberal could try analyzing scoring systems settings.42,49 An influential paper of to conservative, and as one goes away one at a time to see which are sub- Conitzer and Sandholm14 shows that from each voter’s most preferred can- ject to manipulation, but it might be voting systems and distributions that didate in either of the axis’s directions a long slog since there are an infinite on a large probability weight of the in- the voter prefers the candidates less number of scoring systems. This mo- puts satisfy certain conditions have a and less. Walsh46 raised the fascinat- tivates us to look toward an excellent manipulability-detection algorithm ing question of whether hard election- general goal: finding a dichotomy the- that is correct on at least that same set manipulation problems remain hard orem that in one fell swoop pinpoints of inputs. A different line of research fo- even for electorates that follow the what it is about an election system cuses on analyzing the probability with single-peaked model, and he provided that makes it vulnerable to manipu- which a randomly selected election is natural examples in which manipula- lation or that makes manipulation susceptible to a given form of manipu- tion problems remain hard even when computationally prohibitive. For lation.16,28,47,48 In the standard probabi- restricted to single-peaked electorates. scoring systems, this was achieved in listic model used in this line of work,d In contrast, and inspired by a differ- Hemaspaandra and Hemaspaandra29 for many natural election systems the ent part of Walsh’s paper that showed (see also the closely related work15,42), probability that a voter can affect the some profile completion problems result of an election by simply casting a are easy for single-peaked electorates, Figure 4. An example of a weighted random vote is small but nonnegligible. a recent paper by Faliszewski et al.27 plurality election. This work is motivated by perhaps shows that for single-peaked elector- the greatest single worry related to us- ates many NP-hard manipulation and Each bar represents a weighted vote for a particular candidate. We can make p a win- ing NP-hardness to protect elections— control problems have polynomial- ner by bribing the weight-5 voter to vote for a worry that applies to NP-hardness re- time algorithms. The point of—and p, but bribing only the heaviest voter to vote sults not just about manipulation, but threat of—this research line is that for p would not be sufficient. also about control and bribery. That for electorates that are single-peaked, worry is that NP-hardness is a worst- case theory, and it is in concept pos- can be many-one polynomial-time reduced to) a set that is easy on overwhelmingly many 8 sible that NP-hard sets may be easily of its instances.21 Unfortunately, this does not 5 solved on many of their input instanc- necessarily imply that the original set is easy e 6 es even if P and NP differ. Levin has on overwhelmingly many of its instances. In fact, it is known that relative to a random “ora- d This model is called impartial culture. In im- cle” (black box), there are NP sets on which no 4 partial culture each vote is chosen uniformly polynomial-time heuristic algorithm can do 6 at random from the set of all permutations of well.34 Also, it is well known that if any NP-hard 4 the candidates. set has a polynomial-time heuristic algorithm 2 e There are a number of results in theoretical that is correct on all but a “sparse” amount of 2 computer science that are related to this issue, its input, then P = NP.44 However, “sparse” in 0 while as a practical matter not resolving it for that research line is so small as to not reassure the concrete cases we care about. For example, us here. And, finally, there has been much in- c c p 1 2 by an easy “padding” trick one can see that terest in distributions, problems, and settings every NP-hard set can have its instances trans- that remove the gap between worst-case and formed into questions about (in the jargon, average-case complexities.1,38

80 communications of the acm | november 2010 | vol. 53 | no. 11 review articles

NP-hardness results simply may fail to maspaandra,24 and started far more hand, if one changes one’s model and hold. And the reason that can happen recently than did the complexity-the- associates a cost not to each voter, but is the assumption of single-peaked oretic study of control and manipu- rather to each pairwise preference of preferences is so restrictive that it can lation of elections. Bribery comes in each voter (so the more one changes a rule out some of the collections of many variants, but the basic pattern given voter’s vote, the more one has to votes used in the outputs of reductions is just what the term brings to mind. pay—so-called “microbribery”), Llull in general-case NP-hardness proofs. The briber has a certain budget, the bribery (without weights) can be done, Yet another path toward circum- voters (who depending on the model in a slightly different model that al- venting hardness-of-manipulation may or may not have weights) each lows irrational preferences, in polyno- results leads to relaxing the notion have a price for which their vote can be mial time.25 of solving a manipulation problem. bought, and depending on the model Procaccia and Rosenschein42 initi- voters may or may not be required to Summary ated this approach by showing that each have unit cost (the former case In this article, we discussed some of the heuristic from the seminal work is referred to as the “without prices” the main streams—control, manipu- of Bartholdi, Tovey, and Trick,3 when case). And the question is whether lation, and bribery—in the study of extended to a coalitional manipula- the briber can achieve his or her how complexity can be used as a shield tion setting, works correctly on an goal—typically, to make a preferred to protect elections (see Faliszewski interesting class of scoring-system candidate p be a winner—within the et al.26 for a more technical survey). manipulation instances. By an even budget. Note that bribery has aspects This line was started by the striking more careful analysis, together with of both control and manipulation. insight of Bartholdi, Orlin, Tovey, and Zuckerman, they later extended this Like some types of control one has to Trick (see also Simon45 for even earlier result to a number of other election choose which collection of voters to roots) that although economics proves systems,49 and they obtained approxi- act on, but like manipulation one is we cannot make manipulation impos- mation results and results that for ma- altering votes. sible, we can seek to make it compu- nipulable instances are guaranteed to For reasons of space, we cover brib- tationally infeasible. As we have seen, return a manipulation that will work if ery only briefly. We do so by giving a few many hardness results have been ob- one is allowed to add a certain number examples focusing on plurality elec- tained, as have many polynomial-time and weight of additional manipula- tions and Llull elections. attacks. Election systems and settings tors. Brelsford et al.8 provide their own For plurality elections, the complex- vary greatly in the behaviors one can framework for studying approximabil- ity of bribery turns out to be very sensi- establish. It is natural to consider ity of manipulation problems (as well tive to the model. For plurality, brib- an election system’s computational as approximability of bribery and con- ery is NP-complete when voters have weaknesses and strengths, as one fac- trol problems) and for a large class of weights and prices, but is in polynomi- tor among many, when choosing an scoring systems gives approximation al time if voters have only weights, only election system for a given task, and algorithms for manipulation. prices, or neither weights nor prices.24,f in particular to choose a system care- Returning to playing defense, what For the weighted and the weighted- fully in light of the types of attacks one can we do if a system has a polynomial- and-priced cases, these results can be most needs it to thwart. Yet the work time manipulation algorithm? Can we extended to dichotomy theorems that on computational protection of elec- somehow feed the system a can of spin- completely classify which scoring-rule tions has also energized the search for ach and turn it fearsome? To a surpris- election systems have NP-complete end runs around that protection, such ing extent the answer is yes, as studied bribery problems and which have fea- as approximation algorithms and heu- in work of Conitzer and Sandholm13 sible bribery problems.24 Also, for plu- ristics having provably frequent good and Elkind and Lipmaa.19 They vari- rality, there is an efficient algorithm performance, and one must also worry ously do this by adding an elimination that can approximately solve the prob- about such potential end runs when “pre-round” (that may or may not be lem up to any given precision23—a so- making one’s election-system choice. based on a hypothetical one-way func- called fully polynomial-time approxi- This work all falls within the tion) or by changing the election into mation scheme. emerging area known as computa- a long series of rounds of candidate For Llull elections, the results again tional social choice (see Chevaleyre et elimination. The good news is that this are very sensitive to the model. On one al.10 for a superb survey), an area that approach often boosts the complexity, hand, both with and without weights, links AI, systems, and theory within and the bad news is that these multi- and both with and without voter pric- computer science, as well as econom- round election systems are simply not es, the bribery problem for Llull elec- ics, political science, mathematics, the same intuitively attractive animals tions is NP-complete. On the other and operations research. Elections that they are built from. have been important for thousands f The bribery algorithms are far from trivial. For of years, and with the current and Using Complexity to Block example, Figure 4 shows an election (without anticipated increase of electronic Bribery in Elections prices) where the very natural heuristic of first agency, elections become more im- bribing the heaviest voter yields a suboptimal The complexity-theoretic study of solution. Similarly, it is easy to find examples portant—and more open to attacks— bribery in elections was proposed by where bribing the heaviest voter of a current with each passing year. The current Faliszewski, Hemaspaandra, and He- winner does not lead to an optimal solution. push-pull between using complexity

november 2010 | vol. 53 | no. 11 | communications of the acm 81 review articles

as a shield and seeking holes in and manipulating elections. In Proceedings of the 23rd manipulated often. In Proceedings of the 49th IEEE AAAI Conference on Artificial Intelligence (July Symposium on Foundations of Computer Science (Oct. paths around that shield is a natural, 2008). AAAI Press, 44–49. 2008). IEEE Computer Society, 243–249. exciting part of the drama of science, 9. caragiannis, I., Covey, J., Feldman, M., Homan, 29. Hemaspaandra, E., Hemaspaandra, L. Dichotomy for C. Kaklamanis, C., Karanikolas, N., Procaccia, A. voting systems. Journal of Computer and System and is likely to continue for decades to and Rosenschein, J. On the approximability of Sciences 73, 1 (2007), 73–83. come as new models, techniques, and Dodgson and Young elections. In Proceedings of the 30. Hemaspaandra, E., Hemaspaandra, L. and Rothe, J. 20th Annual ACM-SIAM Symposium on Discrete Exact analysis of Dodgson elections: Lewis Carroll’s attacks are formulated and studied. Algorithms (Jan. 2009). Society for Industrial and 1876 voting system is complete for parallel access to This study will clearly benefit from the Applied Mathematics, 1058–1067. NP. Journal of the ACM 44, 6 (1997), 806–825. 10. chevaleyre, Y., Endriss, U., Lang, J. and Maudet, N. A 31. Hemaspaandra, E., Hemaspaandra, L. and Rothe, J. broadest possible participation, and short introduction to computational social choice. In Anyone but him: The complexity of precluding an we urge any interested readers—and Proceedings of the 33rd International Conference on alternative. Artificial Intelligence 171, 5–6 (2007), Current Trends in Theory and Practice of Computer 255–285. most especially those early in their Science. Lecture Notes in Computer Science #4362 32. Hemaspaandra, E., Hemaspaandra, L. and Rothe, careers—to bring their own time and (Jan. 2007). Springer-Verlag, 51–69. J. Hybrid elections broaden complexity-theoretic 11. christian, R., Fellows, M., Rosamond, F. and Slinko, resistance to control. Mathematical Logic Quarterly skills to bear on the many problems A. On complexity of lobbying in multiple referenda. 55, 4 (2009), 397–424. that glimmer in the young, important, Review of Economic Design 11, 3 (2007), 217–224. 33. Hemaspaandra, E., Spakowski, H. and Vogel, J. 12. conitzer, V., Davenport, A. and Kalagnanam, J. The complexity of Kemeny elections. Theoretical challenging study of the complexity of Improved bounds for computing Kemeny rankings. Computer Science 349, 3 (2005), 382–391. In Proceedings of the 21st National Conference 34. Hemaspaandra, L. and Zimand, M. Strong self- elections. on Artificial Intelligence (July 2006). AAAI Press, reducibility precludes strong immunity. Mathematical 620–626. Systems Theory 29, 5 (1996), 535–548. 13. conitzer, V. and Sandholm, T. Universal voting protocol 35. Homan, C. and Hemaspaandra, L. Guarantees for Acknowledgments tweaks to make manipulation hard. In Proceedings of the success frequency of an algorithm for finding We are deeply grateful to Preetjot the 18th International Joint Conference on Artificial Dodgson-election winners. Journal of Heuristics 15, 4 Intelligence (Aug. 2003). Morgan Kaufmann, 781–788. (2009), 403–423. Singh, Communications editors Georg 14. conitzer, V. and Sandholm, T. Nonexistence of 36. Kenyon-Mathieu, C. and Schudy, W. How to rank Gottlob, Moshe Vardi, and Andrew Yao, voting rules that are usually hard to manipulate. with few errors. In Proceedings of the 39th ACM In Proceedings of the 21st National Conference Symposium on Theory of Computing (June 2007). and the anonymous referees for help- on Artificial Intelligence (July 2006). AAAI Press, ACM Press, 95–103. ful suggestions and encouragement. 627–634. 37. Levin, L. Average case complete problems. SIAM 15. conitzer, V., Sandholm, T. and Lang, J. When are Journal on Computing 15, 1 (1986), 285–286. Piotr Faliszewski was supported elections with few candidates hard to manipulate? 38. Li, M. and Vitányi, P. Average case complexity in part by Polish Ministry of Science Journal of the ACM 54, 3 (2007), Article 14. under the universal distribution equals worst-case 16. Dobzinski, S. and Procaccia, A. Frequent manipulability complexity. Information Processing Letters 42, 3 and Higher Education grant N-N206- of elections: The case of two voters. In Proceedings (1992), 145–149. 378637, AGH-UST grant 11.11.120.865, of the 4th International Workshop on Internet and 39. mcCabe-Dansted, J., Pritchard, G. and Slinko, A. Network Economics. (Dec. 2008). Springer-Verlag Approximability of Dodgson’s rule. Social Choice and and by the Foundation for Polish Sci- Lecture Notes in Computer Science #5385, 653–664 Welfare 31, 2 (2008), 311–330. ence's Homing Program. Edith He- 17. Duggan, J. and Schwartz, T. Strategic manipulability 40. Meir, R., Procaccia, A., Rosenschein, J. and Zohar, A. without resoluteness or shared beliefs: Gibbard– The complexity of strategic behavior in multi-winner maspaandra was supported in part Satterthwaite generalized. Social Choice and Welfare elections. Journal of Artificial Intelligence Research by NSF grant IIS-0713061 and a Fried- 17, 1 (2000), 85–93 33 (2008), 149–178. 18. Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. Rank 41. pennock, D., Horvitz, E. and Giles, C. Social choice rich Wilhelm Bessel Research Award aggregation methods for the Web. In Proceedings of theory and recommender systems: Analysis of the from the Alexander von Humboldt the 10th International World Wide Web Conference axiomatic foundations of collaborative filtering. In (Mar. 2001). ACM Press, NY, 613–622. Proceedings of the 17th National Conference on Foundation. Lane A. Hemaspaandra 19. elkind, E. and Lipmaa, H. Hybrid voting protocols and Artificial Intelligence (July/Aug. 2000). AAAI Press, hardness of manipulation. In Proceedings of the 16th 729–734. was supported in part by NSF grants Annual International Symposium on Algorithms and 42. procaccia, A. and Rosenschein, J. Junta distributions CCF-0426761 and CCF-0915792 and Computation (Dec. 2005). Springer-Verlag Lecture and the average-case complexity of manipulating Notes in Computer Science #3872, 206–215. elections. Journal of Artificial Intelligence Research a Friedrich Wilhelm Bessel Research 20. ephrati, E. and Rosenschein, J. A heuristic technique 28 (2007), 157–181. Award from the Alexander von Hum- for multi-agent planning. Annals of Mathematics and 43. rothe, J., Spakowski, H. and Vogel, J. Exact complexity Artificial Intelligence 20, 1-4 (1997), 13–67. of the winner problem for Young elections. Theory of boldt Foundation. 21. erdélyi, G., Hemaspaandra, L., Rothe, J. and Computing Systems 36, 4 (2003), 375–386. Spakowski, H. Generalized juntas and NP-hard sets. 44. Schöning, U. Complete sets and closeness to Theoretical Computer Science 410, 38–40 (2009), complexity classes. Mathematical Systems Theory 19, 1 (1986), 29–42. References 3995–4000. 45. simon, H. The Sciences of the Artificial. MIT Press, 1. ajtai, M. Worst-case complexity, average-case 22. erdélyi, G., Nowak, M. and Rothe, J. Sincere-strategy 1969. Third edition, 1996. complexity and lattice problems. Documenta preference-based approval voting fully resists 46. Walsh, T. Uncertainty in preference elicitation and Mathematica, Extra Volume ICM III (1998), 421–428. constructive control and broadly resists destructive aggregation. In Proceedings of the 22nd AAAI 2. Bartholdi, III, J. and Orlin, J. Single transferable vote control. Mathematical Logic Quarterly 55, 4 (2009), Conference on Artificial Intelligence (July 2007). resists strategic voting. Social Choice and Welfare 8, 4 425–443. AAAI Press, 3–8. (1991) 341–354. 23. faliszewski, P. Nonuniform bribery (short paper). In 47. Xia, L. and Conitzer, V. Generalized scoring rules 3. Bartholdi, III, J., Tovey, C. and Trick, M. The Proceedings of the 7th International Conference on and the frequency of coalitional manipulability. In computational difficulty of manipulating an election. Autonomous Agents and Multiagent Systems (May Proceedings of the 9th ACM Conference on Electronic Social Choice and Welfare 6, 3 (1989) 227–241. 2008), 1569–1572. Commerce (July 2008). ACM Press, NY, 109–118. 4. Bartholdi, III, J., Tovey, C. and Trick, M. Voting 24. faliszewski, P., Hemaspaandra, E. and Hemaspaandra, 48. Xia, L. and Conitzer, V. A sufficient condition for voting schemes for which it can be difficult to tell who won L. How hard is bribery in elections? Journal of rules to be frequently manipulable. In Proceedings the election. Social Choice and Welfare 6, 2 (1989), Artificial Intelligence Research 35 (2009), 485–532. of the 9th ACM Conference on Electronic Commerce 157–165. 25. faliszewski, P., Hemaspaandra, E., Hemaspaandra, (July 2008). ACM Press, NY, 99–108. 5. Bartholdi, III, J., Tovey, C. and Trick, M. How hard is L. and Rothe, J. Llull and Copeland voting 49. Zuckerman, M. Procaccia, A. and Rosenschein, J. it to control an election? Mathematical and Computer computationally resist bribery and constructive Algorithms for the coalitional manipulation problem. Modeling 16, 8/9 (1992), 27–40. control. Journal of Artificial Intelligence Research 35 Artificial Intelligence 173, 2 (2009), 392–412. 6. Betzler, N., Fellows, M., Guo, J., Niedermeier, R. and (2009), 275–341. Rosamond, F. Fixed-parameter algorithms for Kemeny 26. faliszewski, P., Hemaspaandra, E., Hemaspaandra, L. scores. In Proceedings of the 4th International and Rothe, J. A richer understanding of the complexity Piotr Faliszewski ([email protected]) is an assistant Conference on Algorithmic Aspects in Information of election systems. In Fundamental Problems in professor at the AGH University of Science and and Management. Lecture Notes in Computer Science Computing: Essays in Honor of Professor Daniel J. #5034 (June 2008). Springer-Verlag, 60–71. Rosenkrantz. S. Ravi and S. Shukla, eds. Springer, Technology, Kraków, Poland. 7. Betzler, N., Guo, J., Niedermeier, R. Parameterized 2009, 375–406. Edith Hemaspaandra ([email protected]) is a professor at computational complexity of Dodgson and Young 27. faliszewski, P., Hemaspaandra, E., Hemaspaandra, the Rochester Institute of Technology, Rochester, NY. elections. In Proceedings of the 11th Scandinavian L. and Rothe, J. The shield that never was: Societies . Lecture Notes in with single-peaked preferences are more open to Workshop on Algorithm Theory Lane A. Hemaspaandra ([email protected]) is a Computer Science #5124 (July 2008). Springer- manipulation and control. In Proceedings of the 12th professor at the University of Rochester, Rochester, NY. Verlag, 402–413. Conference on Theoretical Aspects of Rationality and 8. Brelsford, E., Faliszewski, P., Hemaspaandra, E., Knowledge (July 2009). ACM Digital Library, 118–127. Schnoor, H., and Schnoor, I. Approximability of 28. friedgut, E., Kalai, G. and Nisan, N. Elections can be © 2010 ACM 0001-0782/10/1100 $10.00

82 communications of the acm | november 2010 | vol. 53 | no. 11 research highlights

p. 84 p. 85 Technical Goldilocks: Perspective Data Races are Evil A Race-Aware Java Runtime with No Exceptions By Tayfun Elmas, Shaz Qadeer, and Serdar Tasiran By Sarita Adve

p. 93 FastTrack: Efficient and Precise Dynamic Race Detection By Cormac Flanagan and Stephen N. Freund

november 2010 | vol. 53 | no. 11 | communications of the acm 83 research highlights

doi:10.1145/1839676.1839697

losing precision and performing even Technical Perspective better than Goldilocks. For the first time, these papers made it possible to Data Races are Evil believe that “data race as an exception” may be a viable solution to the vexing with No Exceptions debugging and semantics problems of By Sarita Adve shared memory. The absolute slowdown of both tech- Exploiting parallelism has become the racy code. Java’s safety requirements niques is still quite significant, but they primary means to higher performance. preclude the use of “undefined” behav- expose an exciting research agenda. An Shared memory is a pervasively used ior. The Java memory model, therefore, immediate challenge is to improve per- programming model, where parallel specifies very weak semantics for racy formance. Another concerns mapping tasks or threads communicate through programs, but is extremely complex and races detected in compiler-generated a global address space. Popular lan- currently has known bugs. code to source code. Others are explor- guages, such as C, C++, and Java have Despite the debugging and seman- ing hardware for similar goals.3 (or will soon have) standardized sup- tics difficulties, some prior work refers More broadly, these papers encour- port for shared-memory threads. Unfor- to certain data races (for example, some age a fundamental rethinking of pro- tunately, shared-memory programs are unsynchronized reads) as benign and gramming models, languages, compil- notoriously prone to subtle bugs, often useful for performance. Unfortunately, ers, and hardware. Should languages due to data races. reasonable semantics for programs with be designed to eliminate data races by Languages that allow data races ob- any type of data race remain elusive; C, design?2 Or should the runtime auto- fuscate true communication and syn- C++, and Java do not provide usable se- matically detect races? Or should the chronization. Since any load or store mantics with any (benign or not) data race. hardware?3 Can similar techniques be may communicate or synchronize with A natural conclusion is to strive for used for enforcing even stronger prop- another thread through a data race, it languages that eliminate data races. erties such as determinism and ato- becomes difficult to reason about sec- Although there is much progress, such micity? Does this impact how we view tions of code in isolation. Data races general-purpose languages,2 are not shared memory as a programming also create non-determinism since the yet commercially available. The follow- model? The answer to all these ques- ordering between two racing accesses ing two papers on Goldilocks and Fast- tions is probably “yes.” is timing-dependent. Racy programs Track suggest alternative solutions that It is unlikely that any of language, therefore require reasoning about use always-on data race detection for runtime, or hardware techniques alone many interleavings of individual mem- current languages. will strike the best balance between ory accesses and their executions are Goldilocks was the first work to pro- ease-of-use, generality, performance, difficult to reproduce. Data races have pose a data race be treated as a language- and complexity, but designing systems therefore been widely considered as level exception, just like null pointer that combine the strengths of such symptoms of bugs and there is much dereferences. This insight cleans up the techniques remains challenging. It is research to automatically detect them. most difficult part of the memory mod- also unclear what high-level language An arguably more fundamental els mess—executions are either sequen- properties such systems will finally en- problem with data races concerns se- tially consistent or throw an exception. sure; for example, race elimination is mantics. Every programming language No complicated semantics of Java and certainly a critical factor in ensuring de- must specify what value a load can re- no unsafe behavior of C/C++! terminism and atomicity. What is clear turn, also called the memory model. The challenge, compared to much is these papers provide key insights to It has been surprisingly difficult to prior work on race detection, is that shape the final solutions and are impor- specify an acceptable model that bal- an always-on language-level exception tant steps toward making parallel pro- ances ease-of-use and performance mechanism must be both fast and pre- gramming amenable to the masses. for languages that allow data races.1 cise. The well-established race detec- Programmers usually expect a sequen- tion algorithm based on vector clocks References 1. adve, S.V. and Boehm, H-J. Memory models: A case tial interleaving-based model called se- is precise, but incurs orders-of-magni- for rethinking parallel languages and hardware. quential consistency. For data-race-free Commun. ACM 53, 8 (Aug. 2010) 90–101. tude slowdown. Faster algorithms (for 2. Bocchino, R.L. et al. A type and effect system programs, it is straightforward to pro- example, based on a lockset approach) for deterministic parallel Java. In Proceedings of the International Conference on Object- vide sequential consistency and high exist, but produce false positives, which Oriented Programming, Systems, Languages, and performance. With racy code, however, are unacceptable for enforcing lan- Applications, 2009. 3. Lucia, B. et al. Conflict exceptions: Providing simple routine compiler and hardware opti- guage-level semantics. concurrent language semantics with precise hardware mizations can result in surprising be- Goldilocks extends the lockset ap- exceptions. In Proceedings of the International haviors. Consequently, the upcoming C proach to make it precise, at a much Symposium on Computer Architecture, 2010.

and C++ memory models specify “unde- faster speed than the previous precise Sarita Adve ([email protected]) is a professor in the fined” behavior for programs with data vector clock algorithm. FastTrack fol- Department of Computer Science at the University of races. This gives freedom to implement- lowed up by optimizing the vector clock Illinois at Urbana-Champaign. ers but poses challenges for debugging approach to make it run faster, without © 2010 ACM 0001-0782/10/1100 $10.00

84 communications of the acm | november 2010 | vol. 53 | no. 11 doi:10.1145/1839676.1839698 Goldilocks: A Race-Aware Java Runtime By Tayfun Elmas, Shaz Qadeer, and Serdar Tasiran

Abstract in particular, every read deterministically returns the value We present Goldilocks, a Java runtime that monitors of the “last” write. This semantics is widely considered to ­program executions and throws a DataRaceException be the only simple-enough model with which writing useful when a data race is about to occur. This prevents racy concurrent programs is practical. For executions containing accesses from taking place, and allows race conditions to race conditions, the semantics is either completely unde- be handled before they cause errors that may be difficult fined (as is the case for C++MM2) or is complicated enough to ­diagnose later. The DataRaceException is a valuable that writing a useful and correct program with “benign debugging tool, and, if supported with reasonable compu- races” is a challenge. tational overhead, can be an important safety feature for Detection and/or elimination of race conditions has been deployed ­programs. Experiments by us and others on race- an area of intense research activity. The work presented in aware Java runtimes indicate that the DataRaceException this paper (and initially presented in Elmas et al.6) makes may be a viable mechanism to enforce the safety of execu- two important contributions to this area. tions of multithreaded Java programs. First, for the first time in the literature, we propose that An important benefit of DataRaceException is that race conditions should be language-level exceptions just ­executions in our runtime are guaranteed to be race free like null pointer dereferences and indexing an array out of and thus sequentially consistent as per the Java Memory its bounds. The Goldilocks runtime for Java provides a Model. This strong guarantee provides an easy-to-use, clean new exception, DataRaceException,a that is thrown pre- semantics to programmers, and helps to rule out many cisely when an access that causes an actual race condition is ­concurrency-related possibilities as the cause of errors. To about to be executed. Since a racy execution is never allowed support the DataRaceException, our runtime incorpo- to take place, this guarantees that the execution remains rates the novel Goldilocks algorithm for precise dynamic race sequentially consistent. detection. The Goldilocks algorithm is general, intuitive, and The DataRaceException brings races to the program- can handle different synchronization patterns uniformly. mer’s attention explicitly. When this exception is caught, the recommended course of action is to terminate the program gracefully. This is because, for racy Java programs, a wide 1. INTRODUCTION range of compiler optimizations are allowed by the JMM, When two accesses by two different threads to a shared and this makes it complicated to relate program executions variable are enabled simultaneously, i.e., at the same pro- to source code. If the exception is not caught, the Goldilocks gram state, a race condition is said to occur. An equiva- runtime terminates the thread that threw the exception. lent definition involves an execution in which two threads While not recommended, the programmer could also make conflicting accesses to a variable without proper choose to make optimistic use of a DataRaceException synchronization actions being executed between the two by, for instance, retrying the code block that led to the race, accesses. A common way to ensure race freedom is to asso- hoping that the conflicting thread has performed the syn- ciate a lock with every shared variable, and to ensure that chronization necessary to avoid a race in the meantime. threads hold this lock when accessing the shared variable. Since the first paper on the Goldilocks runtime,6 the idea The lock release by one thread and the lock acquire by the that certain concurrency errors, especially ones that result next establish the required synchronization between the in sequential consistency violations, should result in excep- two threads. tions has gained significant support and several implemen- Data races are undesirable for two key reasons. First, a tations of the idea have been investigated.9, 11 race condition is often a symptom of a higher-level logical To support a DataRaceException, a runtime must error such as an atomicity violation. Thus, race detectors serve as a proxy for more general concurrency-error detec- a We define DataRaceException as a subclass of the Runtime­ tion when higher-level specifications such as atomicity Exception class in Java. annotations do not exist. Second, a race condition makes the outcome of certain shared variable accesses nondeter- ministic. For this and other reasons, both the Java Memory The original version of this paper was published in the Model (JMM)10 and the C++ Memory Model (C++MM)2 define Proceedings of the ACM SIGPLAN 2007 Conference on well-synchronized programs to be programs whose execu- Programming Language Design and Implementation (PLDI), tions are free of race conditions. For race-free executions, June 2007. these models guarantee sequentially consistent semantics;

november 2010 | vol. 53 | no. 11 | communications of the acm 85 research highlights

incorporate a precise yet efficient race detection mecha- but they cannot handle concurrency patterns implemented nism. In this context, false positives in race detection can- using volatile variables such as barrier synchronization. not be tolerated. The second contribution of our work is the There is a significant body of research on dynamic data- Goldilocks algorithm, a novel, precise, and general algo- race detection based on computing the happens-before rithm for detecting data races at runtime. In Elmas et al.,6 relation4, 7, 14, 15, 17 using vector clocks.12 Hybrid techniques14, 20 we presented an implementation of the Goldilocks algo- combine lockset and happens-before analysis. For exam- rithm in a Java Virtual Machine (JVM) called Kaffe.19 Our ple, RaceTrack20 uses a basic vector clock algorithm to experiments with Goldilocks on benchmarks brought up capture thread-local accesses to objects thereby eliminat- the new possibility that the overhead of post-deployment ing unnecessary and imprecise applications of the Eraser precise race detection in a runtime may be tolerable. There algorithm. Similarly, MultiRace14 presents djit+, a vector has been significant progress in the efficiency of precise race clock algorithm with several optimizations to reduce the detection since the Goldilocks runtime was first published number of checks at an access, including keeping distinct (see Flanagan and Freund, and Pozmiansky and Schuster,7, 14 vector clocks for reads and writes and using a lockset algo- for example) and this idea appears viable today. rithm as a fast-path check. To the best of our knowledge, The Goldilocks algorithm is based on an intuitive, gen- FastTrack,7 which builds on djit+, is the best-performing eral representation for the happens-before relationship as vector clock-based algorithm in the literature. By exploit- a generalized lockset (Goldilockset) for each variable. In ing some access patterns, FastTrack reduces the cost of the traditional use of the term, a lockset for a shared vari- vector clock updates to O(1) on average. We provide a quali- able x at a point in an execution contains a set of locks. A tative comparison of the Goldilocks and FastTrack algo- thread can perform a race-free access to x at that point by rithms in Section 4.3. Vector clock and Goldilocks are first acquiring a lock in this lockset. A Goldilockset is a gen- both precise, but the generalized locksets in Goldilocks eralization of a lockset. In Java, locks and volatile variables provide an intuitive representation of how shared variables are synchronization objects, and acquiring and releasing a are protected at each point the execution. lock, as well as reading from and writing to a volatile vari- Concurrency-Related Exceptions: Since proposed first by able, are synchronization operations. To reflect this, at each the authors in Elmas et al.,6 the idea that programming plat- point in an execution, the Goldilockset for a shared variable forms should be able to guarantee sequential consistency x may contain thread ids, locks, and volatile variables. A for all programs has gained significant support. Marino thread can perform a race-free access to x iff its thread id et.al.11 and Lucia et.al.9 provide platforms with explicit mem- is in the Goldilockset, or if it first acquires a lock that is in ory model exceptions. Both platforms define stronger but the Goldilockset, or reads a volatile variable that is in the simpler contracts than JMM and C++MM, which enable effi- Goldilockset. In other words, the Goldilockset indicates cient hardware implementations that support the memory the threads that have the ownership of x and the synchro- model exceptions. nization objects that protect access to x at that point. The Goldilockset is updated during the execution as synchroni- 2. A GENERIC MEMORY MODEL zation operations are performed. As a result, Goldilocksets In this section, we present a generic memory model and are a compact, intuitive way to precisely represent the express the JMM as a special case of it. This generic model happens-before relationship. Thread-local variables, vari- allows a uniform treatment of the various synchronization ables protected by different locks at different points of the constructs in Java. We also believe that memory models at execution, and event-based synchronization with condi- different levels (e.g., the hardware level) and for different tion variables are all uniformly handled by Goldilocks. languages (e.g., C++MM) can be expressed as instances of Furthermore, Goldilocksets can easily be generalized to this model. This allows Goldilocks to be applied in these handle other synchronization primitives such as software ­settings directly. transactions18 and adapted to handle memory models of Variables and Actions: Program variables are separated into languages other than Java, such as C++MM. To facilitate two categories: data variables (Data) and synchronization this, in this paper (differently from Elmas et al.6) we present variables (Sync). We use x, and o to refer to data and syn- Goldilocks on a generic memory model and then show chronization variables, respectively. Threads in a program how the algorithm can be specialized to JMM. execute actions from the following categories:

1.1. Related work • Data variable accesses: read(t, x, v) by thread t reads the Dynamic Race Detection: There are two approaches to current value v of a data variable x, and write(t, x, v) by dynamic data-race detection, one based on locksets and the thread t writes the value v to x. other based on the happens-before relation. Eraser16 is a • Synchronization operations: When threads synchronize

well-known lockset-based algorithm for detecting race con- using a synchronization mechanism, a thread ti executes ditions dynamically by enforcing the locking discipline that a notification action, which is then observed by other

every shared variable is protected by a unique lock. In spite threads tj. Such a notification–observation pair defines a of the numerous papers that refined the Eraser algorithm to “synchronizes-with” edge from the former action to the reduce the number of false alarms, there are still cases, such latter. We classify actions that serve as sources and sinks as dynamically changing locksets, that cannot be handled of a synchronizes-with edge as synchronization source precisely. Precise lockset algorithms exist for Cilk programs,3 and sink actions, respectively.

86 communications of the acm | november 2010 | vol. 53 | no. 11

– Synchronization source actions: sync-source(t, o) by • The union of the program orders for all t ∈ Tid and the thread t creates a synchronizes-with source by writing synchronization orders for all variables o ∈ Sync is a valid to a synchronization variable o. Lock releases and vola- partial order. During an execution, our data-race detec- tile variable writes in Java are synchronization source tion algorithm examines a linearization of this partial actions. order and identifies the happens-before edges between – Synchronization sink actions: sync-sink(t, o) by thread data accesses. t creates a synchronizes-with sink by reading from a synchronization variable o. Lock acquires and vola- Sequential Consistency: Sequential consistency is a prop- tile variable reads in Java are synchronization sink erty that allows programmers to use an interleaving model actions. of execution where accesses from different threads are inter- leaved into a total order, and every read sees the value of the Multithreaded Executions: An execution E is represented by most recent write. Sequential consistency is widely consid- a tuple E  Tid, Act, W, →po .,→so .. ered to be the only simple-enough model with which writing useful concurrent programs is practical. Formally, an execu- • Tid is the set of identifiers for threads involved in the tion E  Tid, A, W, →po ., →so . is sequentially consistent if there execution. Each newly forked thread is given a new exists a total order →SC over Act satisfying the following: unique id from Tid. SC • Act is the set of actions that occur in this execution. Act|t • For every thread t ∈ Tid, → respects the program order po p o SC is the set of actions performed by t ∈ Tid, and Act|x (resp. →t, i.e., →t ⊆ →. SC Act|o) are the sets of actions performed on data variable x • Every read a = read(x) sees the most recent write to x in →, (resp. synchronization variable o). i.e., there is no other b = write(x) such that W(a) →SC b →SC a. • W is a total function that maps each read(t, x, v) in Act to a write(u, x, v) in Act. W is used to model the write-seen Data Races: Two data variable accesses are called conflicting relationship between a read of x and the write to x it if they refer to the same shared data variable and at least one sees. In a race-free, sequentially consistent execution, of them is a write access. this is the last write before read(t, x, v). In order to make One frequently used definition of a race condition involves the function W total, we assume an initial write for each a program state in which two conflicting accesses by two dif- variable before any reads. ferent threads to a shared data variable are simultaneously p o • →t is the program order per thread t. For each thread t, enabled. To distinguish this definition from others, let us refer p o →t is a total order over Act|t and gives in which order the to this condition as a simultaneity race. The definition of a race actions were issued to execute. This order is sometimes condition used in most work on dynamic race detection is referred to as the observed execution order. what we call a happens-before race and involves two conflicting s o • →o is the synchronization order per synchronization accesses not ordered by the happens before relationship, i.e., s o variable o ∈ Sync. For each o ∈ Sync, →o is a total order not separated by proper synchronization operations. For C++,

over Act|o. these two definitions of a race condition have been shown to be equivalent.2 This proof also generalizes to Java executions. Synchronizes-With and Happens-Before: Given an execu- Formally, an execution E  Tid, Act, W, →po , →so . contains a tion with program and synchronization orders, we extract happens-before race if there are two conflicting actions, two additional orders called the synchronizes-with (→sw ) and a, b ∈ Act| accessing a data variable x, such that neither x ­happens-before (→hb ) orders. Data races are defined using a →hb b nor b →hb a holds. Conversely, the execution is race free these orders. if every pair of conflicting accesses to a data variable are

A synchronization operation a1 by thread t1 syn- ordered by happens-before. sw chronizes with a2 by thread t2, denoted a1 → a2, if a1 is a The well-formedness of an execution guarantees that if sync-source on some synchronization variable o, a2 is a the execution has no race conditions, then it is sequentially so sync-sink on o, and a1 →o a2. consistent. The Goldilocks runtime makes use of this and The happens-before partial order →hb on the execution E is the DataRaceException to guarantee for all programs po defined as the transitive closure of the program orders→ t for (whether racy or not) that every concurrent execution is all t ∈ Tid and the synchronizes-with order →sw . sequentially consistent at the byte-code level. This does not In this paper, we focus only on well-formed executions,10 restrict the Goldilocks runtime’s use as a debugging tool, which respect (i) the intra-thread semantics and (ii) the because, for the Java and C++ memory models, it has been semantics of the synchronization variables and operations. proven2, 10 that if a program has a racy execution, then it is In addition, well-formed executions satisfy two essential guaranteed to have at least one execution that is sequentially requirements for data-race detection: consistent and racy. Thus, it is sufficient to restrict one’s attention to looking for races in sequentially consistent exe- • Happens-before consistency: This property makes use of cutions only. the happens-before order to restrict the write-seen rela- tionship. For example, for a read action a, a →hb W(a) 3. THE GOLDILOCKS ALGORITHM cannot happen, and W(a) cannot be overwritten by In this section, we describe our algorithm for detecting data

another write action b such that W(a) →hb b →hb a. races in an execution E  Tid, Act, W, →po ., →so .. For simplicity

november 2010 | vol. 53 | no. 11 | communications of the acm 87 research highlights

of exposition, we initially do not distinguish between read Figure 1. Transferring ownership of x, and GLS(x). and write accesses. po po The Goldilocks algorithm processes the actions in Act sw one at a time, as a sequence. Before a thread t performs an po po tj p (j) action a in Act, t notifies the oldilocks G algorithm that p (i) a is about to occur. The order in which these notifications ti acquire(Lx) access(x) release(Lx) { t , Lx, t } { t } { t , Lx } from different threads are interleaved and processed by acquire(Lx) access(x) release(Lx) i j j j

Goldilocks is represented mathematically by p, where p(i) is { Lx, ti } { ti } { ti, Lx } the i-th action in the sequence. This linear order, by construc- (a) Direct ownership transfer using lock Lx tion, respects the program order for each thread, and the syn- po chronization total order for each synchronization variable.b sw po p (j) The Goldilocks algorithm maintains for each data vari- sw tj able a “Goldilockset”: a map GLS such that, for every data vari- po t volwrite(o2) access(x)   k able x, its Goldlilockset is a set GLS(x) Tid Sync. Roughly { t , o1, t , o2, t } { t } p (i) volread(o1) volwrite(o2) i k j j speaking, GLSi(x), the Goldilockset of x immediately before ti { ti, o1, tk } { ti, o1, tk, o2 } processing action p(i), consists of (i) the id’s of threads that can access(x) volwrite(o1) access x in a race-free way at that point in the execution, and (ii) { ti } { ti, o1 } the synchronization variables on which a thread can perform a (b) Transitive ownership transfer using volatiles o1 and o2 sync-sink access in order to gain race-free access to x. As Goldilocks processes each action p(i) from E, it updates the Goldilocksets of variables. Initially, the edges between consecutive actions are indicated in the Goldilockset GLS(x) is empty for all data variables, including ­figure. Figure 1a illustrates direct ownership transfer from

static ones. When a new object is created, the Goldilockset ti to tj. After accessing x, ti performs a sync-source opera- for all of its instance fields is initialized to the empty set. tion (lock release) on synchronization variable Lx. Later,

After every action, the Goldilockset of every data variable x tj obtains ownership of x by executing a sync-sink operation is potentially updated. For every data variable x, three sim- (lock acquire) on synchronization variable Lx. Figure 1b illus-

ple rules specify how GLS(x) is updated after p (i) based on trates transitive ownership transfer. Threads ti and tj do not whether p (i) is (1) a synchronization source, (2) a synchroni- synchronize on the same synchronization variable. Instead, zation sink, or (3) a read or write access to x, as shown in the the synchronization involves a chain of synchronizes-with procedure ApplyLocksetRules in Figure 2. edges between other threads and synchronization variables.

If the action p (i) is a synchronization operation on a vari- ti synchronizes with tk via synchronization variable o1 and,

able o, we update the lockset GLS(x) for every data variable later tk synchronizes with tj via synchronization variable o2. x in Data. If p (i) is a sync-source operation, rule 1 adds o to Rules 1 and 2 in Figure 2 require updating the lockset GLS(x) if it contains the id t of the current accessor thread. of each data variable. A naive implementation of this algo- Intuitively, this represents that a later sync-sink operation by rithm would be too expensive for programs that manipu- a thread u on synchronization variable o will be sufficient for late large heaps. In Section 4, we present an efficient way u to gain race-free access to x. This is formalized by rule 2. of implement our algorithm by representing Goldilocksets If p(i) is a sync-sink(o) operation, rule 2 checks whether the implicitly and by applying update rules lazily. synchronization variable o is in GLS(x). If this is the case, then t is added to the Goldilockset. Figure 2. The core lockset update rules. If the action p(i) is an access to a data variable x, rule 3 checks the Goldilockset of the variable GLS(x) to decide ApplyLocksetRules(p(i)): whether this access is race free. If GLS(x) is empty, it indi- cates that x is a fresh variable which has not been accessed 1. if p(i) = sync-source(t, o) so far and any access to x at this point is race free. If GLS(x) foreach x ∈ Data: is not empty, only threads whose id’s are in GLS(x) can per- if t ∈ GLS(x) form race-free accesses to x. If the accessing thread’s id t is GLS(x) := GLS(x)  {o} not in GLS(x) then we throw a DataRaceException on x. Otherwise, the access to x is race free and GLS(x) becomes 2. if p(i) = sync-sink(t, o) the singleton {t}, indicating that, without further synchroni- foreach x ∈ Data: zation operations, only t can access x in a race free manner. if o ∈ GLS(x) Figure 1 shows two cases where the ownership of a GLS(x) := GLS(x)  {t} data variable x is transferred from a thread t to another i 3. if p(i) = write(t, x) or p(i) = read(t, x) thread t , and indicates how the Goldilocksets evolve in j if t ∈ GLS(x) or GLS(x) = 0/ each case. Program order (→po ) and synchronizes-with (→sw ) GLS(x) := {t} else b A racy read may appear earlier in p than the write that it sees. If an execu- throw a DataRaceException on x tion contains a data race between a pair of accesses, Goldilocks declares a race at one of these accesses regardless of which linearization p is picked.

88 communications of the acm | november 2010 | vol. 53 | no. 11

Correctness: The following theorem expresses the fact at different points in this execution. The contained variable that the Goldilocks algorithm is both sound, i.e., x is protected by synchronization on the container object o, detects all actual races in a given execution, and pre- and during the execution, the lock (La or Lb) protecting o cise, i.e., never reports false alarms. The full proof of the and x changes depending on which variable (a or b) points original Goldilocks algorithm for Java can be found in to o. Notice that, T2 changes the protecting lock of the con- Elmas et al.5 tainer object o from La to Lb, without accessing x. Figure 3 shows the Goldilocks update rules applied on GLS(x) for Theorem 1 (Correctness). Let E  Tid, Act, W, →po .,→so . each action and the resulting value of GLS(x). be a well-formed execution, x a data variable, and p a linear Observe that the update rules allow a variable’s order on Act as described earlier. Let i < j, and let p(i) and p(j) Goldilockset to grow during the execution. This enables be two accesses to x performed by threads ti and tj, with no other them to represent threads transfering ownership using action p(k) in between (i < k < j) accessing x. Then tj ∈ GLSj(x), different synchronization variables during the execution. i.e., the access p(j) is declared to be race free by the Goldilocks In this example, this ownership transfer takes place with- algorithm iff p(i) →hb p(j). out the variable itself being accessed. For example, after T2 finishes in Figure 2, GLS(x) has the value {T1, La, T2, 3.1. Example: precise data-race detection Lb}, meaning that a thread can access x without data race In this section, we illustrate on an example the Goldilocks by locking either La or Lb. Then T3 makes Lb the only pro- algorithm and how Goldilocksets capture the synchroniza- tecter lock by acquiring Lb and accesses x. tion mechanism protecting access to a variable at each point in an execution. In this example, earlier lockset algorithms 3.2. Distinguishing read and write accesses would have erroneously declared a race condition. The basic Goldilocks algorithm in Figure 2 tracks the Consider the execution shown in Figure 3 in which all happens-before relationship between any two accesses to a actions of T1 happen first, followed by all actions of T2 and variable x. In order to perform race detection, we must check then of T3. This example mimics a scenario in which an the happens-before relationship only between conflicting object is created and initialized and then made visible glob- actions, i.e., at least one action in the pair must be a write ally by T1. This Int object (referred to as o from now on) is access. We extend the basic Goldilocks algorithm by keep- a container object for its data field (referred to as x from ing track of (i) GLSW (x), the “write Goldilockset of x”, and (ii) now on). The object o is referred to by different global vari- GLSR(t, x), the “read Goldilockset of t and x” for each thread t. ables (a and b) and local variables (tmp1,tmp2,and tmp3) The update rules in ApplyLocksetRules are adapted to main- tain these Goldilocksets, but have essentially the same form as the rules in Figure 2. In the extended algorithm, it is suf- Figure 3. Precise data-race detection example. ficient to check happens-before between the current access to x and the most recent accesses (in the linear order p) to Class Int { int data; } x. How this extension is performed for Java can be found in Int a, b; // Global variables Elmas et al.6

Execution Goldilockset update rule applied GLS(x) 3.3. Specializing Goldilocks to the JMM Thread 1 (T1): The JMM requires that all synchronization operations be tmp1 = new Int; Initialize lockset 0/ ordered by a total order →so , whereas in our execution model, tmp1.data = 0; First access {T1} so a separate total order →o per synchronization variable is acquire(La); La ∈ GLS(x) → add T1 to GLS(x) {T1} sufficient. a = tmp1; No access to x {T1} Data Variables and Operations: In Java, every data variable is release (La); T1 ∈ GLS(x) → add La to GLS(x) {T1,La} in the form of (o, d) where o is an object and d is a nonvola- Thread 2 (T2): tile field. The byte-code instructionsx load and xstore access acquire(La); La ∈ GLS(x) → add T2 to GLS(x) {T1,La,T2} memory to read from and write to fields of objects, respec- tmp2 = a; No access to x {T1,La,T2} tively (x changes depending on the type of the field). release(La); T2 ∈ GLS(x) → add La to GLS(x) {T1,La,T2} The JMM specifies three synchronization mechanisms: acquire(Lb); Lb ∈ GLS(x) → add T2 to GLS(x) {T1,La,T2} monitors, volatile fields, and fork/join operations.

b = tmp2; No acces s to x {T1,La,T2} Monitors: In Java, a monitor per object (denoted by mo) pro- release(Lb); T2 ∈ GLS(x) → add Lb to GLS(x) {T1,La,T2,Lb} vides a reentrant lock for each object o. Acquiring the lock of an object o (acquire(o) ) corresponds to a sync-sink opera- T3 Thread 3 ( ): tion on m , while releasing the lock of o (release(o) ) corre- acquire(Lb); Lb ∈ → T3 o GLS(x) add to GLS(x) sponds to a sync-source operation on m . Nested acquires {T1,La,T2,Lb,T3} o and releases of the same lock are treated as no-ops. In the b.data = 2; T3 ∈ → {T3} GLS(x) Race-free access JMM, each release(o) synchronizes with the next acquire(o) tmp3 = b; No access to x {T3} so operation in → mo. release(Lb); T3 ∈ → Lb {T3,Lb} GLS(x) add to GLS(x) Volatile Variables: Each volatile variable is denoted (o, v) tmp3.data = 3; T3 ∈ → {T3} GLS(x) Race-free access where o is an object, and v is a volatile field. Each read volread(o, v) from a volatile variable (o, v), and each write

november 2010 | vol. 53 | no. 11 | communications of the acm 89 research highlights

volwrite(o, v) to (o, v) is implemented by the xload and xstore to executions of the source-level program, which makes byte-code instructions, respectively. While volread(o, v) debugging hard. corresponds to a sync-sink, volwrite(o, v) corresponds to To use Goldilocks for debugging purposes, this diffi- sync-source operation on (o, v). In the JMM, there is a syn- culty can be remedied by disabling compiler optimizations. chronizes-with relationship between each volread(o, v) and For post-deployment use, a stronger memory model9, 11 that the volwrite(o, v) that it sees. is able to relate each (racy and race-free) execution of the Fork/Join: Creating a new thread with id t (fork(t) ) syn- compiled program to a sequentially consistent execution of chronizes with the first action of thread t, denoted start(t). the source program is needed. The last action of thread t, denoted end(t) ­synchronizes with the join operation on t, denoted join(t). For each 4. IMPLEMENTING GOLDILOCKS thead t, fork(t) and end(t) correspond to sync-source There are two published implementation of the Goldilocks operations on a (fictitious) synchronization variable t–, algorithm, both of which monitor the execution at the Java and start(t) and join(t) correspond to sync-sink opera- byte-code level. At this level, each variable access or syn- tions on t–. The JMM guarantees that for each thread t, chronization operation corresponds to a single byte-code so so so there exists an order →t– such that: fork(t) →t– start(t) →t– instruction, and each byte-code instruction can be associ- so end(t) →t– join(t). ated with a source code line and/or variable. Handling other Synchronization Mechanisms: Using the The first Goldilocks implementation, by the authors of lockset update rules above, Goldilocks is able to uni- this paper, was carried out in Kaffe,19 a clean-room imple- formly handle various approaches to synchronization such mentation of the Java virtual machine (JVM) in C. In Kaffe, as dynamically changing locksets, permanent or temporary we integrated Goldilocks into the interpreting mode of thread-locality of objects, container-protected objects, own- Kaffe’s runtime engine. Implementing the algorithm in the ership transfer of variable without accessing the variable (as JVM enables fast access to internal data structures of the in the example in Section 3.1). Furthermore, Goldilocks JVM that manage the layout of object in the memory and can also handle the synchronization idioms in the java. using the efficient mechanisms that exist in the JVM inter- util.concurrent package such as semaphores and bar- nally, such as fast mutex locks. riers, since these primitives are built using locks and volatile The second implementation of Goldilocks is by Flanagan variables. The happens-before edges induced by the wait/ and Freund and was carried out using the RoadRunner dy­­ notify(All) construct are computed by simply applying the namic program analysis tool.8 In RoadRunner, Goldilocks Goldilockset update rules to the acquire and release opera- is implemented in Java and injected by byte-code instrumen- tions invoked inside wait. tation at load-time of the program. This allows the algorithm to benefit from Java compiler optimizations and just-in-time 3.4. Race detection and sequential consistency compilation and to be portable to any JVM. Flanagan and The Java and C++ memory models provide the data-race free- Freund showed that this implementation is competitive with dom (DRF) property.2, 10 The DRF property guarantees that if ours in Kaffe for most of the common benchmarks.7 all sequentially consistent executions of a source program In the following, we present the most important imple- are race free, then the compiled program only exhibits these mentation features and optimizations. The implementation sequentially consistent executions of the source program, is described based on the core algorithm presented in Figure after any compiler and hardware optimizations permitted by 2. The extension of the implementation that distinguishes the memory model. The Goldilocks algorithm check races read and write accesses can be found in Elmas et al.6 by monitoring the executions of the compiled program, and assumes that the compiler and the runtime it is built on 4.1. Implicit representation and lazy evaluation (hardware or virtual machine) conform to the language and of Goldilocksets the memory model specifications. Therefore, if the source For programs with a large number of data variables, repre- program is race free, then any execution of the compiled senting Goldilocksets explicitly for each data variable and program corresponds to a sequentially consistent execu- implementing the Goldilocks algorithm as described in tion of the source program, and no DataRaceException Figure 2 may have high memory and computational cost. is thrown. We avoid the memory cost by representing the Goldilocksets If the source program has a race, the Goldilocks run- implicitly and the computational cost by evaluating time still ensures that all executions of the compiled pro- Goldilocksets lazily as described below. gram will run under the sequential consistency semantics, Instead of keeping a separate Goldilockset GLS(x) for each i.e., sequential consistency is guaranteed at the byte-code variable x, we represent GLS(x) implicitly as long as no access level. This is accomplished by preventing accesses that will to x happens and is computed temporarily when an access cause a data race and throwing a DataRaceException happens. At this point, the Goldilockset is a singleton, and we right before that access. However, in the case of a racy continue to represent it implicitly until the next access. For program, the JMM permits compiler optimizations that this, we keep the synchronization events in a single, global result in executions that are not sequentially consistent linked list called the synchronization-event list and repre- behaviors of the original source code. In this case, the JMM sent by its head and tail pointers in Figure 4. The ordering of and the DRF property are not strong enough to allow the these events in the list is consistent with the program order p o so Goldilocks runtime to relate byte-code level executions →t for each thread t and the synchronization orders →o for

90 communications of the acm | november 2010 | vol. 53 | no. 11

t is added to GLS(x) or the last event ( ) is processed. In the Figure 4. The synchronization event list. j an former case, no race is reported according to the rule 3 of pos(y) pos(z) Figure 2. In the latter case, a race is reported since tj ∉ GLS(x) head tail after the evaluation.

5(c): After the check, owner(x) is set to tj and pos(x) is set to the tail of the synchronization event list. pos(x) pos(w) The implementation does not use any extra threads for race detection. The algorithm is performed in a decentral- each synchronization variable o.c When a thread performs a ized manner by instrumented threads of the program being synchronization action a, it must append a corresponding checked. For each data variable x, we use a unique lock to event to the synchronization-event list atomically with the make atomic the Goldilockset update and the race-­freedom event. In Kaffe, we make sure this is the case by modifying check for each access to x and to serialize all the race-­freedom the implementations of the Java synchronization actions. checks for x. In order to represent GLS(x), each variable x in the pro- gram is associated with two bits of information regarding 4.2. Performance optimizations the most recent access to x: owner(x) stores the id of the Short-Circuit Checks: A cheap-to-evaluate sufficient con- thread that most recently accessed x, and pos(x) points to dition for a happens-before edge between the last two the last synchronization event in the list that was taken into accesses to a variable can reduce race-detection overhead. account when GLS(x) was last computed. We make use of two such conditions, called short-circuit Figure 4 shows four variables pointing to entries in checks, and bypass the traversal of the synchronization the synchronization-event list. Figure 5 shows how the event list when these checks succeed. In this case, the final Goldilockset GLS(x) is computed when x is accessed. Goldilockset of the variable consists of the id of the thread that accessed it last.

5(a): After each access to x by a thread ti, owner(x) is set to ti, and We employ two constant-time short-circuit checks. First, pos(x) is set to point the tail of the synchronization event list. when the last two accesses to a shared variable are per-

5(b): Right before an access to x by thread tj, temporarily, formed by the same thread t, the happens-before relation- we represent GLS(x) explicitly. GLS(x) is initially {owner(x)} ship is guaranteed by the program order of t. This is detected and is updated by processing the synchronization events by checking whether owner(t), the last accessor thread, is the between pos(x) (denoted by a1, …, an) and tail according to same as the thread performing the current access. the rules 1 and 2 of Figure 2. This process stops either when In the second short-circuit check, we determine whether the variable x is protected by the same lock during the last two accesses to x. For this, we associate with each variable x Figure 5. Lazy evaluation of Goldilockset GLS(x). a lock alock(x), which is randomly selected among the locks head tail held by the most recent accessor thread. When a thread t accesses x and if alock(x) is held by t, then that access is race free. Direct Ownership Transfer: A sound but imprecise third pos(x) optimization is to consider only the subset of synchroni-

owner(x) = ti zation events executed by the current and last accessing thread when examining the portion of the synchronization (a) After t accesses x i event list between pos(x) and tail. This check is not constant head tail time, but we found that it succeeds often enough to improve a1 a2 an Goldilocks overhead. Garbage Collection: The synchronization events list is peri- odically garbage-collected when there are entries in the pos(x) Events to consider owner(x) = ti in lockset evaluation beginning of the list that are not relevant for the Goldilockset computation of any variable. This is the case when an entry (b) Before t accesses x i in the list is not reachable from pos(x) for any data variable x, head tail and is tracked by maintaining incremental reference counts a1 a2 an for each list entry. Partially Eager Evaluation: Sometimes the synchronization pos(x) event list gets too long and it is not possible to garbage-­ collect the event list when variable x is accessed early in owner(x) = ti (c) After ti accesses x an execution but is not used afterwards. We address this problem by “partially eager” Goldilockset evaluation. We move pos(x) forward towards the tail to a new position c For Java, there is a total order on all synchronization operations, and the en- pos´(x), and partially evaluate a Goldilockset GLS(x) of x by tries in the list are in this order. processing events (i.e., running ApplyLocksetRules on them)

november 2010 | vol. 53 | no. 11 | communications of the acm 91 research highlights

between pos(x) and pos´(x). During the next access the evalu- remain sequentially consistent at the byte-code level. tion of GLS(x) starts from the stored Goldilockset, not from Experiments with Goldilocks have demonstrated that the {owner(x)}. runtime overhead of supporting a DataRaceException Sound Static Race Analysis: The runtime overhead of race can be made reasonable. detection is directly related to the number of data variable accesses checked and synchronization events that occur. Acknowledgments To reduce the number of accesses checked at runtime, we We would like to thank Hans Boehm, Cormac Flanagan, use static analysis at compile time to determine accesses Steve Freund, and Madan Musuvathi for their critique of that are guaranteed to be race free. While implementing this paper. This research was supported by the Software Goldilocks in Kaffe, we worked with two static analysis Reliability Research Group at Microsoft Research, tools for this purpose: Chord13 and RccJava.1 Redmond, WA, by the Scientific and Technical Research Council of Turkey (TUBITAK) under grant 104E058, and by 4.3. Race-detection overhead the Turkish Academy of Sciences (TUBA). At the time of the original Goldilocks work, the vector clock algorithm12 was the only precise dynamic-race-detection­ References algorithm in the literature. The vector clock algorithm, for 1. abadi, M, Flanagan, C., Freund, S.N. of the 32nd ACM SIGPLAN- Types for safe locking: Static race SIGACT Symposium on Principles an execution with n threads, requires for every thread and detection for Java. ACM Trans. of Programming Languages (POPL synchronization variable a separate vector clock (VC) of Program. Lang. Syst. (2006). 2005). 2. Boehm, H.-J., Adve, S.V. Foundations 11. marino, D., Singh, A., Millstein, T., size n and performs O(n) operations (merging or compar- of the C++ concurrency memory Musuvathi, M., Narayanasamy, S. ing two VCs) whenever a synchronization operation or data model. In Proceedings of the 2008 DRFx: A simple and efficient memory ACM SIGPLAN Conf. on Programming model for concurrent programming access happens. In preliminary research, compared to a Language Design and Implementation languages. In Proceedings of the straightforward implementation of vector clocks, we found (PLDI 2008) 2010 ACM SIGPLAN Conference on 3. cheng, G.-I., Feng, M., Leiserson, C.E., Programming Language Design and Goldilocks overhead to be significantly less.6 Randall, K.H., Stark, A.F. Detecting Implementation (PLDI 2010). 6 data races in Cilk programs that use 12. mattern, F. Virtual time and global In Elmas et al., we measured the overhead of the locks. In ACM Symposium on Parallel states of distributed systems. In Goldilocks implementation inside Kaffe on a set of widely Algorithms and Architectures (SPAA Proceedings of the International 1998). Workshop on Parallel and Distributed used Java benchmarks. This implementation required us to 4. christiaens, M. De Bosschere, K. Algorithms (1988). run all programs in interpreted (not just-in-time compiled) TRaDe: Data race detection for 13. naik, M., Aiken, A., Whaley, J. Effective Java. Proc. Intl. Conference on static race detection for Java. In mode. We found that, with powerful static analysis tools Computational Science. V. Alexandrov, Proceedings 2006 ACM SIGPLAN eliminating much of the monitoring, we were able to obtain J. Dongarra, B. Juliano, R. Renner, and Conference on Programming Language C. Tan, eds. (ICCS 2001). Design and Implementation (PLDI a slowdown of within approximately 2 for all benchmarks. 5. elmas, T., Qadeer, S., Tasiran, S. 2006). Without static elimination of some checks, overheads Goldilocks: Efficiently computing 14. pozniansky, E., Schuster, A. Multirace: the happens-before relation using Efficient on-the-fly data race detection remained high; some benchmarks experienced slowdowns locksets. Technical Report MSR- in multithreaded C++ programs: of over 15. The overhead results with static pre-elimination TR-2006–163, Microsoft Research Research articles. Concurr. Comput. (2006). Pract. Exp. (2007). were encouraging in that they showed precise race detection 6. elmas, T., Qadeer, S., Tasiran, S. 15. ronsse, M., Bosschere, K.D. RecPlay: to be a practical debugging tool, and they indicated that, Goldilocks: A race and transaction- A fully integrated practical record/ aware java runtime. In Proceedings of replay system. ACM Trans. Comput. with further optimizations, post-deployment runtime race the 2007 ACM SIGPLAN Conference Syst. (1999). DataRaceDetection on Programming Language Design and 16. savage, S., Burrows, M., Nelson, G., detection to support could be viable. Implementation (PLDI 2007). Sobalvarro, P., Anderson, T. Eraser: 7 Later work on FastTrack, a dynamic race detector 7. flanagan, C., Freund, S.N. FastTrack: A dynamic data race detector for Efficient and precise dynamic race multithreaded programs. ACM Trans. based on vector clocks, is able to avoid worst-case perfor- detection. In Proceedings of the Comput. Syst. (1997). mance of vector clocks much of the time using optimiza- 2009 ACM SIGPLAN Conference on 17. schonberg, E. On-the-fly detection 7 Programming Language Design and of access anomalies. In Proceedings tions for common cases. Flanagan and Freund compare Implementation (PLDI 2009). of the ACM SIGPLAN Conference on a number of race-detection algorithms, including just- 8. flanagan, C., Freund, S.N. The Programming Language Design and RoadRunner dynamic analysis Implementation (PLDI 1989). in-time compiled implementations of FastTrack and framework for concurrent programs, 18. shavit, N., Touitou, D. Software Goldilocks in RoadRunner. FastTrack achieves In ACM Workshop on Program transactional memory. In Symposium Analysis for Software Tools and on Principles of Distributed Computing significantly better overheads than both implementa- Engineering (PASTE 2010). (1995). tions of Goldilocks. The low overheads achieved by 9. Lucia, B., Ceze, L., Strauss, K., Qadeer, 19. Wilkinson, T. Kaffe: A JIT and S., Boehm H.-J. Conflict exceptions: interpreting virtual machine to run FastTrack provide further support that a practical Simplifying concurrent language Java code. http://www.transvirtual. race-aware runtime for deployed programs supporting semantics with precise hardware com (1998). exceptions for data-races. In 20. yu, Y., Rodeheffer, T., Chen, W. a DataRaceException can be built. It is reported in Proceedings of the 37th International RaceTrack: Efficient detection of data Flanagan and Freund7 that additional short-circuit checks Symposium on Computer Architecture race conditions via adaptive tracking. (ISCA 2010). In Proceedings of the 20th ACM similar to ones we discussed above dramatically reduce the 10. manson, J., Pugh, W., Adve, S.V. The Symposium on Operating Systems runtime of FastTrack. Most of these checks can be incor- Java memory model. In Proceedings Principles (SOSP 2005). porated into Goldilocks implementations as well. Tayfun Elmas ([email protected]), Koç Serdar Tasiran ([email protected]), Koç 5. CONCLUSION University, Istanbul, Turkey. University, Istanbul, Turkey. We have presented a race-aware runtime for Java incor- Shaz Qadeer ([email protected]), Microsoft Research, Redmond, WA. porating a novel algorithm, Goldilocks, for pre- cise dynamic race detection. The runtime provides a DataRaceException, and thus ensures that executions © 2010 ACM 0001-0782/10/1100 $10.00

92 communications of the acm | november 2010 | vol. 53 | no. 11 doi:10.1145/1839676.1839699 FastTrack: Efficient and Precise Dynamic Race Detection By Cormac Flanagan and Stephen N. Freund

Abstract they report false alarms. Precise race detectors never produce Multithreaded programs are notoriously prone to race con- false alarms. Instead, they compute a precise representation ditions. Prior work developed precise dynamic race detec- of the happens-before relation for the observed trace and tors that never report false alarms. However, these checkers report an error if and only if the observed trace has a race con- employ expensive data structures, such as vector clocks dition. Note that there are typically many possible traces for a (VCs), that result in significant performance overhead. particular program, depending on test inputs and scheduling This paper exploits the insight that the full generality of choices. Precise dynamic race detectors do not reason about VCs is not necessary in most cases. That is, we can replace all possible traces, however, and may not identify races that VCs with an adaptive lightweight representation that, occur only when other code paths are taken. While full cover- for almost all operations of the target program, requires age is desirable, it comes at the cost of potential false alarms constant space and supports constant-time operations. because of the undecidability of the halting problem. To avoid Experimental results show that the resulting race detection these false alarms, precise race detectors focus on detecting algorithm is over twice as fast as prior precise race detectors, only race conditions that occur on the observed trace. with no loss of precision. Typically, precise detectors represent the happens-before relation with vector clocks (VCs),14 as in the Djit+ race detec- tor.16 Vector clocks are expensive to maintain, however, 1. INTRODUCTION because a VC encodes information about each thread in a Multithreaded programs are prone to race conditions and system. Thus, if the target program has n threads, each VC other concurrency errors such as deadlocks and violations requires O(n) storage space and VC operations (such as com- of expected atomicity or determinism properties. The broad parison) require O(n) time. Since a VC must be maintained adoption of multicore processors is exacerbating these prob- for each memory location and modified on each access to lems, both by driving the development of multithreaded that location, this O(n) time and space overhead precludes software and by increasing the interleaving of threads in the use of VC-based race detectors in many settings. existing multithreaded systems. A variety of alternative imprecise race detectors have been A race condition occurs when two threads concurrently developed, which may provide improved performance (and perform memory accesses that conflict. Accesses conflict sometimes better coverage), but which report false alarms when they read or write the same memory location and at on some race-free programs. For example, Eraser’s LockSet least one of them is a write. In this situation, the order in algorithm18 enforces a lock-based synchronization disci- which the two conflicting accesses are performed can affect pline and reports an error if no lock is consistently held on the program’s subsequent state and behavior, likely with each access to a particular memory location. Eraser may unintended and erroneous consequences. report false alarms, however, on programs that use alter- Race conditions are notoriously problematic because native synchronization idioms such as fork/join or bar- they typically cause problems only on rare interleavings, rier synchronization. Some LockSet-based race detectors making them difficult to detect, reproduce, and eliminate. include limited happens-before reasoning to improve preci- Consequently, much prior work has focused on static and sion in such situations.15, 16, 22 dynamic analysis tools for detecting race conditions. Other optimizations include using static analyses or To maximize test coverage, race detectors use a very dynamic escape analyses3, 21 or using “accordion” VCs broad notion of when two conflicting accesses are consid- that reduce space overheads for programs with shortlived ered concurrent. The accesses need not be performed at threads.5 Alternative approaches record program events for exactly the same time. Instead, the central requirement is post-mortem race identification.1, 4, 17 that there is no “synchronization dependence” between the Although these imprecise tools successfully detect race two accesses, such as the dependence between a lock release conditions, their potential to generate many false alarms lim- by one thread and a subsequent lock acquire by a different its their effectiveness. Indeed, it has proven surprisingly dif- thread. These various kinds of synchronization dependen- ficult and time consuming to identify the real errors among cies form a partial order over the instructions in the execu- tion trace called the happens-before relation.13 Two memory The original version of this paper was published in the accesses are then considered to be concurrent if they are not Proceedings of the 2009 ACM SIGPLAN Conference on ordered by this happens-before relation. Programming Language Design and Implementation, June In this paper, we focus on online dynamic race detectors, 2009. which generally fall into two categories depending on whether

november 2010 | vol. 53 | no. 11 | communications of the acm 93 research highlights

the spurious warnings produced by some tools. Even if a code a number of concurrently executing threads, each with a block looks suspicious, it may still be race-free due to some thread identifiert ∈ Tid. These threads manipulate variables subtle synchronization discipline that is not (yet) under- x ∈ Var and locks m ∈ Lock. A trace a captures an execution stood by the current code maintainer. Even worse, additional of a multithreaded program by listing the sequence of opera- real bugs (e.g., deadlocks or performance problems) could tions performed by the various threads. The operations that be added while attempting to “fix” a spurious warning pro- a thread t can perform include: duced by these tools. Conversely, real race conditions could be ignored because they appear to be false alarms. • rd(t, x) and wr(t, x), which read and write a value from a This paper exploits the insight that, while VCs provide variable x a general mechanism for representing the happens-before • acq(t, m) and rel(t, m), which acquire and release a relation, their full generality is not actually necessary in lock m most cases. The vast majority of data in multithreaded pro- • fork(t, u), which forks a new thread u grams is either thread local, lock protected, or read shared. • join(t, u), which blocks until thread u terminates Our FastTrack analysis uses an adaptive representation for

the happens-before relation to provide constant-time and The happens-before relation (

read-shared) in order to guarantee no loss of precision. It also closed: that is, if a

2. PRELIMINARIES VV1 2 iff∀t . V1()t ≤ V 2 ()t VV = λt.( max V() t ,(V t)) 2.1. Multithreaded program traces 1 2 1 2 ⊥ = λt. 0 We begin with some terminology and definitions regard- V ing multithreaded execution traces. A program consists of inct () V = λu. if u = tthen V(u)+ 1 else V()u

94 communications of the acm | november 2010 | vol. 53 | no. 11

In Djit+, each thread has its own clock that is incre- edge shown above. At the second write, Djit+ compares mented at each lock release operation. Each thread t also the ­VCs: keeps a VC Ct such that, for any thread u, the clock entry W = á4, 0, ...ñ  á4, 8, ...ñ = C Ct(u) records the clock for the last operation of thread u x 1 that happens before the current operation of thread t. In addition, the algorithm maintains a VC Lm for each lock m. Since this check passes, the two writes are not concurrent, These VCs are updated on synchronization operations that and no race condition is reported. impose a happens-before order between operations of dif- ferent threads. For example, when thread u releases lock m, 3. THE FastTrack ALGORITHM + + the Djit algorithm updates Lm to be Cu. If a thread t subse- A limitation of VC-based race detectors such as Djit is their quently acquires m, the algorithm updates Ct to be Ct  Lm, performance, since each VC requires O(n) space and each VC since subsequent operations of thread t now happen after operation (copying, comparing, joining, etc.) requires O(n) that release operation. time. To identify conflicting accesses, the jitD + algorithm keeps Empirical benchmark data indicates that reads and two VCs, Rx and Wx, for each variable x. For any thread t, Rx(t) writes operations account for the vast majority (over and Wx(t) record the clock of the last read and write to x by 96%) of monitored operations. The key insight behind thread t. A read from x by thread u is race-free provided it FastTrack is that the full generality of VCs is not nec- happens after the last write of each thread, that is, Wx  Cu. essary in over 99% of these read and write operations: a A write to x by thread u is race-free provided that the write more lightweight representation of the happens-before happens after all previous accesses to that variable, that is, information can be used instead. Only a small fraction of

Wx  Cu and Rx  Cu. operations performed by the target program necessitate As an example, consider the execution trace fragment expensive VC operations. shown in Figure 1, where we include the relevant portion We begin by providing an overview of how our analysis + of the Djit instrumentation state: the VCs C0 and C1 for catches each type of race condition on a memory location. A threads 0 and 1; and the VCs Lm and Wx for the last release race condition is either: a write–write race condition (where a of lock m and the last write to variable x, respectively. We write is concurrent with a later write); a write–read race condi- show two components for each VC, but the target program tion (where a write is concurrent with a later read); or a read– may of course contain additional threads.a write race condition (where a read is concurrent with a later + At the write wr(0, x), Djit updates Wx with current write). clock of thread 0. At the release rel(0, m), Lm is updated Detecting Write–Write Races: We first consider how to with C0. At the acquire acq(1, m), C1 is joined with Lm, thus efficiently analyze write operations. At the second write ­capturing the dashed release-acquire happens-before operation in the trace in Figure 1, Djit+ compares the

VCs Wx  C1 to determine whether there is a race. A care- ful inspection reveals, however, that it is not necessary

Figure 1. Execution trace under Djit+. to record the entire VC á4, 0, …ñ from the first write to x. Assuming no races have been detected on x so far, then all writes to x are totally ordered by the happens-before C0 C1 Lm Wx relation, and the only critical information that needs to á4,0,…ñ á0,8,…ñ á0,0,…ñ á0,0,…ñ be recorded is the clock (4) and identity (thread 0) of the thread performing the last write. This information (clock wr(0,x) 4 of thread 0) is sufficient to determine if a subsequent á4,0,…ñ á0,8,…ñ á0,0,…ñ á4,0,…ñ write to x is in a race with any preceding write. We refer to a pair of a clock c and a thread t as an rel(0,m) epoch, denoted c@t. Although rather simple, epochs pro- á5,0,…ñ á0,8,…ñ á4,0,…ñ á4,0,…ñ vide the crucial lightweight representation for recording sufficiently-­precise aspects of the happens-before relation acq(1,m) efficiently. Unlike VCs, an epoch requires only constant á5,0,…ñ á4,8,…ñ á4,0,…ñ á4,0,…ñ space and supports constant-time operations. An epoch c@t happens before a VC V (c@t  V) if and only wr(1,x) if the clock of the epoch is less than or equal to the corre- sponding clock in the vector: á5,0,…ñ á4,8,…ñ á4,0,…ñ á4,8,…ñ c@t  V iff c ≤ V(t)

We use ^e to denote a minimal epoch 0@0. Using this optimized representation, FastTrack analyzes a for clarity, we present a variant of the Djit+ algorithm where some clocks are one less than in the original formulation.16 This revised algorithm has the trace from Figure 1 using a more compact instrumenta- the same performance as the original but is slightly simpler and more di- tion state that records only a write epoch Wx for variable x, rectly comparable to FastTrack. rather than the entire VC Wx, reducing space overhead, as

november 2010 | vol. 53 | no. 11 | communications of the acm 95 research highlights

• Lock-protected data, where a protecting lock is held on Figure 2. Execution trace under FastTrack. each access to a variable, and hence all access are totally ordered, either by program order (for accesses by the C C L W 0 1 m x same thread) or by synchronization order (for accesses á4,0,…ñ á0,8,…ñ á0,0,…ñ by different threads) ⊥e wr(0,x) Reads are typically unordered only when data is read- shared, that is, when the data is first initialized by one á4,0,…ñ á0,8,…ñ á0,0,…ñ 4@0 thread and then shared between multiple threads in a read- rel(0,m) only manner. FastTrack uses an adaptive representation for the read á5,0,…ñ á0,8,…ñ á4,0,…ñ 4@0 history of each variable that is optimized for the common acq(1,m) case of totally-ordered reads, while still retaining the full precision of VCs when necessary. á5,0,…ñ á4,8,…ñ á4,0,…ñ 4@0 In particular, if the last read to a variable happens after all preceding reads, then FastTrack records only the epoch of wr(1,x) this last read, which is sufficient to precisely detect whether á5,0,…ñ á4,8,…ñ á4,0,…ñ 8@1 a subsequent write to that variable conflicts withany preced- ing read in the entire program history. Thus, for thread-local and lock-protected data (which do exhibit totally-ordered reads), FastTrack requires only O(1) space for each allo- cated memory location and only O(1) time per memory shown in Figure 2. (C and L record the same information as access. C and L in Djit+.) In the less common case where reads are not totally At the first write to x, FastTrack performs an O(1)-time ordered, FastTrack stores the entire VC, but still handles

epoch write Wx := 4@0. FastTrack subsequently ensures read operations in O(1) time. Since such data is typically that the second write is not concurrent with the preceding read-shared, writes to such variables are rare, and their anal- write via the O(1)-time comparison: ysis overhead is negligible.

á ñ = 3.1. Analysis details Wx = 4@0  4, 8, ... C1 Based on the above intuition, we now describe the FastTrack algorithm in detail. Our analysis is an online To summarize, epochs reduce the space overhead for algorithm whose analysis state consists of four components: detecting write–write conflicts from O(n) to O(1) per allo-

cated memory location, and replaces the O(n)-time VC com- • Ct is the current VC of thread t.

parison “” with the O(1)-time comparison “”. • Lm is the VC of the last release of lock m.

Detecting Write–Read Races: Detecting write–read races • Rx is either the epoch of the last read from x, if all under the new representation is also straightforward. On other reads happened-before that read, or else is a

each read from x with current VC Ct, we check that the read VC that records the last read from x by multiple happens after the last write via the same O(1)-time compari- threads.

son Wx  Ct. • Wx is the epoch of the last write to x. Detecting Read–Write Races: Detecting read–write race con-

ditions is somewhat more difficult. Unlike write operations, The analysis starts with Ct = inct(^V), since the first opera- which are totally ordered in race-free programs, reads are tions of all threads are not ordered by happens-before. In

not necessarily totally ordered. Thus, a write to a variable x addition, initially Lm = ^V and Rx = Wx = ^e. could potentially conflict with the last read of x performed Figure 3 presents the key details of how FastTrack (left by any other thread, not just the last read in the entire trace column) and Djit+ (right column) handle read and write

seen so far. Thus, we may need to record an entire VC Rx, operations of the target program. For each read or write oper-

in which Rx(t) records the clock of the last read from x by ation, the relevant rules are applied in the order presented thread t. until one matches the current instrumentation state. If an We can avoid keeping a complete VC in many cases, assertion fails, a race condition exists. The figure shows the however. Our examination of data access patterns across a instruction frequencies observed in the programs described variety of multithreaded Java programs indicates that read in Section 4, as well as how frequently each rule was applied. operations are often totally ordered in practice, particularly For example, 82.3% of all memory and synchronization opera- in the following common situations: tions performed by our benchmarks were reads, and rule [FT read same epoch] was used to check 63.4% of those reads. • Thread-local data, where only one thread accesses a Expensive O(n)-time operations are highlighted in grey. variable, and hence these accesses are totally ordered Read Operations: The first four rules provide various alter- by program-order natives for analyzing a read operation rd(t, x). The first rule

96 communications of the acm | november 2010 | vol. 53 | no. 11

Figure 3. FastTrack race detection algorithm and its comparison to Djit+.

FastTrack State: Djit+ State: C : VC t Ct : VC L : VC m Lm : VC W : Epoch x Wx : VC R : Epoch  VC x Rx : VC

When Thread t performs rd(t, x): 82.3% of all Operations

[FT read same epoch] [Djit+ read same epoch] 63.4% of reads 78.0% of reads if Rx = Et then if Rx(t) = Ct(t) then skip skip endif endif

[FT read shared] 20.8% of reads if Rx ∈ VC then

assert Wx  Ct

Rx (t) := Ct(t) endif

[FT read exclusive] [Djit+ read] 15.7% of reads 22.0% of reads if Rx ∈ Epoch and Rx  Ct then if Rx(t) ≠ Ct(t) then

assert Wx  Ct assert Wx  Ct

Rx := Et Rx(t) := Ct(t) endif endif

[FT read share] 0.1% of reads if Rx ∈ Epoch then

let c@u = Rx

assert Wx  Ct

Rx := ⊥V[t  Ct(t), u  c] endif

When Thread t performs wr(t, x): 14.5% of all Operations

[FT write same epoch] [Djit+ write same epoch] 71.0% of writes 71.0% of writes if Wx = Et then if Wx(t) = Ct(t) then skip skip endif endif

[FT write exclusive] 28.9% of writes if Rx ∈ Epoch then

assert Rx  Ct

assert Wx  Ct

Wx := Et endif

[FT write shared] [Djit+ write] 0.1% of writes 29.0% of writes if Rx ∈ VC then if Wx(t) ≠ Ct(t) then

assert Rx  Ct assert Wx  Ct

assert Wx  Ct assert Rx  Ct

Wx := Et Wx(t) := Ct(t)

Rx := ⊥e endif endif

november 2010 | vol. 53 | no. 11 | communications of the acm 97 research highlights

[FT read same epoch] optimizes the case where x was a fast path for the 28.9% of writes for which Rx is an epoch, already read in this epoch. This fast path requires only a and this rule checks that the write happens after all previous

single epoch comparison and handles over 60% of all reads. accesses. In the case where Rx is a VC, [FT write shared]

We use Et to denote the current epoch c@t of thread t, where requires a full (slow) VC comparison, but this rule applies + c = Ct(t) is t’s current clock. Djit incorporates a comparable only to a tiny fraction (0.1%) of writes. In contrast, the cor- rule [Djit+ read same epoch]. responding Djit+ rule [Djit+ write] requires a VC compari- The remaining three read rules all check for write–read son on 29.0% of writes.

conflicts via the fast epoch-VC comparisonW x  Ct, and then Other Operations: Figure 4 shows how FastTrack handles

update Rx appropriately. If Rx is already a VC, then [FT read synchronization operations. These operations are rare, and the shared] simply updates the appropriate component of traditional analysis for these operations in terms of expensive that vector. Note that multiple reads of read-shared data VC operations is perfectly adequate. Thus, these FastTrack from the same epoch are all covered by this rule. We could rules are similar to those of Djit+ and other VC-based analyses. extend rule [FT read same epoch] to handle same-epoch Example: The execution trace in Figure 5 illustrates how

reads of read-shared data by matching the case that Rx ∈ VC FastTrack dynamically adapts the representation for the

and Rx(t) = Ct(t). The extended rule would cover 78% of all read history Rx of a variable x. Initially, Rx is ⊥e, indicating reads (the same as [Djit+ read same epoch]) but does not that x has not yet been read. After the first read operation

improve performance perceptibly. rd(1, x), Rx becomes the epoch 1@1 recording both the If the current read happens after the previous read epoch clock and the thread identifier of that read. The second (where that previous read may be either by the same thread read rd(0, x) at clock 8 is concurrent with the first read, and or by a different thread, presumably with interleaved syn- so FastTrack switches to the VC representation á8, 1, …ñ

chronization), [FT read exclusive] simply updates Rx with for Rx, recording the clocks of the last reads from x by both the accessing thread’s current epoch. For the more general threads 0 and 1. After the two threads join, the write opera- situation where the current read is concurrent with the pre- tion wr(0, x) happens after all reads. Hence, any later opera- vious read, [FT read share] allocates a VC to record the tion cannot be in a race with either read without also being epochs of both reads, since either read could subsequently in a race on that write operation, and so the rule [FT write

participate in a read–write race. shared] discards the read history of x by resetting Rx to

Of these three rules, the last rule is the most expen- ⊥e, which also switches x back into epoch mode and so sive but is rarely needed (0.1% of reads) and the first three rules provide commonly-executed, constant-time Figure 5. Adaptive read history representation. fast paths. In contrast, the corresponding rule [Djit+ read] always executes an O(n)-time VC comparison for C C W R these cases. 0 1 x x Write Operations: The next three FastTrack rules handle a á7,0,…ñ á0,1,…ñ ⊥e ⊥e write operation wr(t, x). Rule [FT write same epoch] opti- mizes the case where x was already written in this epoch, wr(0,x) + which applies to 71.0% of write operations, and Djit incor- á7,0,…ñ á0,1,…ñ 7@0 ⊥ porates a comparable rule. [FT write exclusive] provides e fork(0,1)

á8,0,…ñ á7,1,…ñ 7@0 ⊥e Figure 4. Synchronization, threading operations. rd(1,x) Other: 3.3% of all Operations á8,0,…ñ á7,1,…ñ 7@0 1@1 When Thread t performs acq(t, m): rd(0,x)

Ct := Ct  Lm á8,0,…ñ á7,1,…ñ 7@0 á8,1,…ñ When Thread t performs rel(t, m): rd(1,x) Lm := Ct á8,0,…ñ á7,1,…ñ 7@0 á8,1,…ñ Ct := inct(Ct) When Thread t performs fork(t, u): join(0,1) á8,1,…ñ á7,2,…ñ 7@0 á8,1,…ñ Cu := Cu  Ct

Ct := inct(Ct) wr(0,x) When Thread t performs join(t, u): á8,1,…ñ á7,2,…ñ 8@0 ⊥e C := C C t t  u rd(0,x)

Cu := incu(Cu) á8,1,…ñ á7,2,…ñ 8@0 8@0

98 communications of the acm | november 2010 | vol. 53 | no. 11

optimizes later accesses to x. The last read in the trace then – MultiRace, a hybrid LockSet/Djit+ race detector16 sets Rx to a nonminimal epoch. 4.1. Performance and precision 4. EVALUATION Table 1 lists the size, number of threads, and uninstru- To validate FastTrack, we implemented it as a component mented running times for a variety of benchmark programs of the RoadRunner dynamic analysis framework for mul- drawn from the Java Grande Forum,12 Standard Performance tithreaded Java programs.10 RoadRunner takes as input a Evaluation Corporation,19 and elsewhere.2,7,11,21 All timing compiled Java target program and inserts instrumentation measurements are the average of 10 test runs. Variability code into the target to generate an event stream of memory across runs was typically less than 10%. and synchronization operations. Back-end checking tools The “Instrumented Time” columns show the running process these events as the target executes. The FastTrack times of each program under each of the tools, reported as implementation extends the algorithm described so far the ratio to the uninstrumented running time. Thus, tar- to handle additional Java primitives, such as volatile get programs ran 4.1 times slower, on average, under the variables and wait(), as outlined previously.8 Some of the Empty tool. Most of this overhead is due to communicat- benchmarks contain faulty implementations of barrier ing all target program operations to the back-end checker. synchronization.9 FastTrack contains a specialized analy- The variations in slowdowns for different programs sis to compensate for these bugs. are not uncommon for dynamic race condition checkers. We compare FastTrack’s precision and performance Different programs exhibit different memory access and to six other analyses implemented in the same framework: synchronization patterns, some of which impact analysis performance more than others. In addition, instrumenta- – Empty, a trivial checker that performs no analysis tion can impact cache performance, class loading time, and and is used to measure the overhead of RoadRunner other low-level JVM operations. These differences can some- – Eraser,18 an imprecise race detector based on the times even make an instrumented program run slightly LockSet algorithm described in Section 1 faster than the uninstrumented (as in colt). – Goldilocks, a precise race detector based on an The last six columns show the number of warnings pro- extended notion of LockSets7 duced by each checker. The tools report at most one race – BasicVC, a traditional VC-based race detector that for each field of each class, and at most one race for each maintains a read and a write VC for each memory array access in the program source code. All eight warn- location and performs at least one VC comparison on ings from FastTrack reflect real race conditions. Some every memory access of these are benign (as in tsp, mtrt, and jbb) but oth- – Djit+, a high-performance VC-based race detector16 ers can impact program behavior (as in raytracer and described in Section 2 hedc).15, 20, 21

Table 1. Benchmark results. Programs marked with ‘*’ are not compute-bound and are excluded from average slowdowns.

Instrumented Time (slowdown) Warnings ace ace C C rack rack

Base R R V V T T Size Thread Time + + ulti ulti jit jit asic asic oldilocks oldilocks mpty raser raser ast ast D D Program (loc) Count (s) E E M G B F E M G B F colt 111,421 11 16.1 0.9 0.9 0.9 1.8 0.9 0.9 0.9 3 0 0 0 0 0 crypt 1,241 7 0.2 7.6 14.7 54.8 77.4 84.4 54.0 14.3 0 0 0 0 0 0 lufact 1,627 4 4.5 2.6 8.1 42.5 – 95.1 36.3 13.5 4 0 – 0 0 0 moldyn 1,402 4 8.5 5.6 9.1 45.0 17.5 111.7 39.6 10.6 0 0 0 0 0 0 montecarlo 3,669 4 5.0 4.2 8.5 32.8 6.3 49.4 30.5 6.4 0 0 0 0 0 0 mtrt 11,317 5 0.5 5.7 6.5 7.1 6.7 8.3 7.1 6.0 1 1 1 1 1 1 raja 12,028 2 0.7 2.8 3.0 3.2 2.7 3.5 3.4 2.8 0 0 0 0 0 0 raytracer 1,970 4 6.8 4.6 6.7 17.9 32.8 250.2 18.1 13.1 1 1 1 1 1 1 sparse 868 4 8.5 5.4 11.3 29.8 64.1 57.5 27.8 14.8 0 0 0 0 0 0 series 967 4 175.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1 0 0 0 0 0 sor 1,005 4 0.2 4.4 9.1 16.9 63.2 24.6 15.8 9.3 3 0 0 0 0 0 tsp 706 5 0.4 4.4 24.9 8.5 74.2 390.7 8.2 8.9 9 1 1 1 1 1 elevator* 1,447 5 5.0 1.1 1.1 1.1 1.1 1.1 1.1 1.1 0 0 0 0 0 0 philo* 86 6 7.4 1.1 1.0 1.1 7.2 1.1 1.1 1.1 0 0 0 0 0 0 hedc* 24,937 6 5.9 1.1 0.9 1.1 1.1 1.1 1.1 1.1 2 1 0 3 3 3 jbb* 30,491 5 72.9 1.3 1.5 1.6 2.1 1.6 1.6 1.4 3 1 – 2 2 2 Average slowdown/total warnings 4.1 8.6 21.7 31.6 89.8 20.2 8.5 27 5 3 8 8 8

november 2010 | vol. 53 | no. 11 | communications of the acm 99 research highlights

Eraser Comparison: Our reimplementation of Eraser RoadRunner. Because of these difficulties, oldilocks G incurs an overhead of 8.7x, which is competitive with similar reimplemented in RoadRunner incurred a slowdown of Eraser implementations built on top of unmodified JVMs.15 31.6x across our benchmarks, but ran out of memory on Surprisingly, FastTrack is slightly faster than Eraser on lufact. Our Goldilocks reimplementation missed three some programs, even though it performs a precise analysis races in hedc, due to an unsound performance optimiza- that traditionally has been considered more expensive. tion for handling thread-local data efficiently.7 We believe More significantly, Eraser reported many spurious warn- some performance improvements are possible, for both ings that do not correspond to actual races. Augmenting Goldilocks and the other tools, by integration into the vir- our Eraser implementation to reason about additional tual machine. synchronization constructs, such as fork/join or wait/ notify operations,16, 22 would eliminate some of these 4.2. Checking eclipse for race conditions spurious warnings, but not all. On hedc, Eraser reported To validate FastTrack in a more realistic setting, we also a spurious warning but also missed two of the real race applied it to five common operations in the Eclipse devel- conditions reported by FastTrack, due to an (intentional) opment environment.6 These include launching Eclipse, unsoundness in how the Eraser algorithm reasons about importing a project, rebuilding small and large workspaces, thread-local and read-shared data.18 and starting the debugger. The checking overhead for these BasicVC and Djit+ Comparison: Djit+ and BasicVC operations is as follows: reported exactly the same race conditions as FastTrack. That is, all three checkers provide identical precision. Instrumented Time (Slowdown) However, FastTrack outperforms the other checkers. It is Base Operation Time (s) Empty Eraser Djit+ FastTrack roughly 10x faster than BasicVC and 2.3x faster than Djit+. These performance improvements are due primarily to the Startup 6.0 13.0 16.0 17.3 16.0 reduction in the allocation and use of VCs. Across all bench- Import 2.5 7.6 14.9 17.1 13.1 marks, Djit+ allocated more over 790 million VCs, whereas Clean Small 2.7 14.1 16.7 24.4 15.2 FastTrack allocated only 5.1 million. Djit+ performed Clean Large 6.5 17.1 17.9 38.5 15.4 over 5.1 billion O(n)-time VC operations, while FastTrack Debug 1.1 1.6 1.7 1.7 1.6 performed only 17 million. The memory overhead for stor- ing the extra VCs leads to significant cache performance Eraser reported potential races on 960 distinct field and degradation in some programs, particularly those that ran- array accesses for these five tests, largely because Eclipse domly access large arrays. These tools are likely to incur uses many synchronization idioms that Eraser cannot even greater overhead when checking programs with larger handle, such as wait()and notify(), semaphores, and numbers of threads. readers-writer locks. FastTrack reported 27 distinct warn- MultiRace Comparison: MultiRace maintains Djit+’s ings, 4 of which were subsequently verified to be potentially instrumentation state, as well as a lock set for each memory destructive.9 Djit+ reported 28 warnings, which overlapped location.16 The checker updates the lock set for a location heavily with those reported by FastTrack, but schedul- on the first access in an epoch, and full VC comparisons ing differences led to several being missed and several new are performed only after this lock set becomes empty. This (benign) races being identified. Although our exploration of synthesis substantially reduces the number of VC opera- Eclipse is far from complete, these preliminary observations tions, but introduces the overhead of storing and updating are quite promising. FastTrack is able to scale to precisely lock sets. In addition, the use of Eraser’s unsound state check large applications with lower run-time and memory machine for thread-local and read-shared data leads to overheads than existing tools. imprecision. Our reimplementation of the MultiRace algorithm 5. CONCLUSION exhibited performance comparable to Djit+. Performance Race conditions are difficult to find and fix. Precise race of MultiRace (and, in fact, all of our other checkers) can detectors avoid the programmer-overhead of identifying be improved by adopting a coarse-grain analysis in which all and eliminating spurious warnings, which are particularly fields of an object are represented as a single “logical loca- problematic when using imprecise checkers on large pro- tion” in the instrumentation state.16, 22 grams with complex synchronization. Our FastTrack anal- Goldilocks Comparison: Goldilocks7 is a precise race ysis is a new precise race detection algorithm that achieves detector that does not use VCs to capture the happens-before better performance than existing algorithms by tracking relation. Instead, it maintains, for each memory location, a less information and dynamically adapting its represen- set of “synchronization devices” and threads. A thread in tation of the happens-before relation based on memory that set can safely access the memory location, and a thread access patterns. We have used FastTrack to identify data can add itself to the set (and possibly remove others) by per- races in programs as large as the Eclipse programming forming any of the operations described by the synchroniza- environment, and also to improve the performance of other tion devices in the set. analyses that rely on precise data race information, such Goldilocks is a complicated algorithm to optimize, and as serializability checkers.8 The FastTrack algorithm and ideally requires tight integration with the underlying virtual adaptive epoch representation is straightforward to imple- machine and garbage collector, which is not possible under ment and may be useful in other dynamic analyses for

100 communications of the acm | november 2010 | vol. 53 | no. 11

multithreaded software. 11. fleury, E., Sutre, G. Raja, version 17. ronsse, M., Bosschere, K.D. RecPlay: A 0.4.0-pre4. Available at http://raja. fully integrated practical record/replay sourceforge.net/, 2007. system. TCS 17, 2 (1999), 133–152. Acknowledgments 12. Java Grande Forum. Java Grande 18. savage, S., Burrows, M., Nelson, G., benchmark suite. Available at Sobalvarro, P., Anderson, T.E. Eraser: This work was supported in part by NSF Grants 0341179, http://www.javagrande.org/, 2008. A dynamic data race detector for 0341387, 0644130, and 0707885. We thank Ben Wood for 13. Lamport, L. Time, clocks, and the multi-threaded programs. TOCS 15, 4 ordering of events in a distributed (1997), 391–411. implementing Goldilocks in RoadRunner and for com- system. Commun. ACM 21, 7 (1978), 19. standard Performance Evaluation ments on a draft of this paper, and Tayfun Elmas, Shaz 558–565. Corporation. SPEC benchmarks. 14. mattern, F. Virtual time and global http://www.spec.org/, 2003. Qadeer, and Serdar Tasiran for their assistance with states of distributed systems. In 20. von Praun, C., Gross, T. Object race Workshop on Parallel and Distributed detection. In OOPSLA, 2001, Goldilocks. Algorithms, 1988. 70–82. 15. o’Callahan, R., Choi J.-D. Hybrid 21. von Praun, C., Gross, T. Static conflict dynamic data race detection. In analysis for multi-threaded object- References PPOPP (2003), 167–178. oriented programs. In PLDI (2003), 16. pozniansky, E., Schuster, A. MultiRace: 115–128. 1. adve, S.V., Hill, M.D., Miller, B.P., on Computational Science (2001), Efficient on-the-fly data race detection 22. yu, Y., Rodeheffer, T., Chen, W. Netzer, R.H.B. Detecting data races 761–770. in multithreaded C++ programs. RaceTrack: Efficient detection of data on weak memory systems. In ISCA 6. the Eclipse programming Concurrency and Computation: race conditions via adaptive tracking. (1991), 234–243. environment, version 3.4.0. Practice and Experience 19, 3 (2007), In SOSP (2005), 221–234. 2. cern. Colt 1.2.0. Available at Available at http://www.eclipse.org, 327–340. http://dsd.lbl.gov/~hoschek/colt/ 2009. (2007). 7. elmas, T., Qadeer, S., Tasiran, S. 3. choi, J.-D., Lee, K., Loginov, A., Goldilocks: A race and transaction- O’Callahan, R., Sarkar, V., Sridhara, aware Java runtime. In PLDI (2007), Cormac Flanagan, Computer Science Stephen N. Freund, Computer M. Efficient and precise datarace 245–255. Department, University of California at Science Department, Williams College, detection for multithreaded object- 8. flanagan, C., Freund, S.N. FastTrack: Santa Cruz, Santa Cruz, CA. Williamstown, MA. oriented programs. In PLDI (2002), Efficient and precise dynamic race 258–269. detection. In PLDI (2009), 121–133. 4. choi, J.-D., Miller, B.P., Netzer R.H.B. 9. flanagan, C., Freund, S.N. Adversarial Techniques for debugging parallel memory for detecting destructive programs with flowback analysis. races. In PLDI (2010), 244–254. TOPLAS 13, 4 (1991), 491–530. 10. flanagan, C., Freund, S.N. The 5. christiaens, M., Bosschere, K.D. RoadRunner dynamic analysis TRaDe: Data race detection for framework for concurrent programs. Java. In International Conference In PASTE (2010), 1–8. © 2010 ACM 0001-0782/10/1100 $10.00

CACM lifetime mem half page ad:Layout 1 2/3/10 2:21 PM Page 1

Take Advantage of ACM’s Lifetime Membership Plan!

◆ ACM Professional Members can enjoy the convenience of making a single payment for their entire tenure as an ACM Member, and also be protected from future price increases by taking advantage of ACM's Lifetime Membership option. ◆ ACM Lifetime Membership dues may be tax deductible under certain circumstances, so becoming a Lifetime Member can have additional advantages if you act before the end of 2010. (Please consult with your tax advisor.) ◆ Lifetime Members receive a certificate of recognition suitable for framing, and enjoy all of the benefits of ACM Professional Membership. Learn more and apply at: http://www.acm.org/life

november 2010 | vol. 53 | no. 11 | communications of the acm 101 Group Term Life Insurance

10- or 20-Year Group Term Life Insurance

Group Disability Income Insurance

Group Accidental Death & Dismemberment Insurance

Group Catastrophic Major Medical Insurance

Group Dental Plan

Long-Term Care Plan

Major Medical Insurance

Short-Term Medical Plan

Who has time to think about insurance?

Today, it’s likely you’re busier than ever. So, the last thing you probably have on your mind is whether or not you are properly insured.

But in about the same time it takes to enjoy a cup of coffee, you can learn more about your ACM-sponsored group insurance program — a special member benefit that can help provide you financial security at economical group rates.

Take just a few minutes today to make sure you’re properly insured.

Call Marsh U.S. Consumer, a service of Seabury & Smith, Inc., at 1-800-503-9230 or visit www.personal-plans.com/promo/acm/49771.

49771 (2010) ©Seabury & Smith, Inc. 2010 d/b/a in CA Seabury & Smith Insurance Program Management Administered by: CA Ins. Lic. #0633005 AR Ins. Lic. #245544

49771 ACM All Plans ad.indd 1 3/8/10 9:25 AM 49771 ACM AD (2010) Full Size: 8.125" x 10.875" Bleed: 8.375" x 11.125" Live: 7" x 9.5" Folds to: NA Perf: N/A Colors: 4C Stock: NA Postage: N/A

MARSH Misc: N/A careers

Appalachian State University favorably. Candidates must commit to, and excel that will be available to the successful candidate. Tenure Track Assistant Professor in, undergraduate teaching as well as scholarly The Department of Computer Science (www. achievement. A Ph.D. in computer science or a cs.dartmouth.edu) is home to 17 tenured and The Department of Computer Science at Appa- closely related field is required. tenure-track faculty whose research spans com- lachian State University invites applications for This position carries a 3-2 teaching load with putational biology, vision/graphics, machine a tenure-track faculty position at the rank of as- a full-salary one-semester research leave prior to learning, algorithms, theory, and systems. The sistant professor beginning August 2011. Quali- tenure review and generous sabbatical and fel- department has strong Ph.D. and M.S. programs, fications for this position include a Ph.D. degree lowship leaves for senior faculty. Please send a outstanding undergraduate majors and minors, in Computer Science or a closely related field. cover letter describing research and teaching and is affiliated with an M.D./Ph.D. program. Responsibilities include teaching undergraduate interests, curriculum vitae, statement on teach- Dartmouth is an Ivy League school situated in and graduate computer science courses, an active ing, undergraduate and graduate transcripts, and Hanover, on the Connecticut River, in the Upper program of scholarship, pursuit of external fund- three letters of recommendation to: Valley region of New Hampshire. It is a beautiful, ing, and participation in service activities. Search Committee, reference #001 historic campus, located in a scenic, year-round, The Department of Computer Science at Ap- Department of Mathematics and Computer outdoor recreational area. Dartmouth hosts an an- palachian State University has 10 tenured faculty Science nual film festival; renowned musical and theatrical members. Approximately 25 students are enrolled College of the Holy Cross performers; and convenient public transportation in the Master of Science program. The Bachelor of Worcester, MA 01610 to Boston and New York, as well as local airports. Science in Computer Science program is accredit- email: [email protected] Applicants are invited to send their CV, re- ed by the Computing Accreditation Commission search statement, teaching statement, and names of ABET and serves about 220 students. Review of applications will begin on Decem- of at least four references, one of whom should Appalachian State University is a member in- ber 15, 2010 and continue until the position has comment about teaching. All material should be stitution of the seventeen-campus University of been filled. sent to [email protected] by December North Carolina. Located in Boone, North Carolina, The College of the Holy Cross is a highly se- 1st, 2010. All letters of recommendation should the university has approximately 17,000 students, lective Catholic liberal arts college in the Jesuit be emailed or mailed to search, 6211 Sudikoff primarily in bachelors and masters programs in tradition. It enrolls about 2,700 students and is Lab, Computer Science Department, Dartmouth both liberal arts and applied fields. Additional located in a medium-sized city 45 miles west of College, Hanover, NH 03755 by the recommender information about the Department of Computer Boston. Holy Cross belongs to the Colleges of themselves. Science, the university, and the surrounding area Worcester Consortium (www.cowc.org) and the Direct inquiries may be sent to Professor Hany can be found at www.appstate.edu. New England Higher Education Recruitment Farid ([email protected]). Applicants must send: 1) statements on teach- Consortium (www.neherc.org). The College is an Dartmouth is an equal opportunity/affirma- ing, research, and interest/qualifications for this Equal Employment Opportunity Employer and tive action employer and encourages applications position, 2) a copy of their graduate transcript, complies with all Federal and Massachusetts laws from women and members of minority groups. 3) a curriculum vita, and 4) names and contact concerning equal opportunity and affirmative ac- information for three references. All application tion in the workplace. material should be sent by e-mail to search@ Additional information on the college, the de- Furman University cs.appstate.edu in a single PDF file attachment or partment, and this position available at: Assistant Professor of Computer Science by mail to the chair of the search committee: Dr. http://mathcs.holycross.edu/positionCS/ Rahman Tashakkori, Department of Computer CSPos11.html The Department of Computer Science invites Science, 525 Rivers Street, Boone, NC 28608. The applications for a tenure track position at the initial review of complete applications will begin Assistant Professor level to begin in the fall of on December 2, 2010 and will continue until the Dartmouth College 2011. Candidates must have a Ph.D. in Computer position is filled. Neukom Institute for Computational Science Science or a closely related field. The position re- Appalachian State University is an Affirmative Department of Computer Science quires teaching excellence, effective institutional Action/Equal Opportunity Employer. The univer- Assistant Professor service, and an ability to work with colleagues sity has a strong commitment to the principles of across disciplines. An ability to develop a pro- diversity and inclusion, and to maintaining work- The Neukom Institute for Computational Science gram of scholarly and professional activity involv- ing and learning environments that are free of all and the Department of Computer Science at Dart- ing undergraduates is a priority. Research special- forms of discrimination. mouth College invite applications for a tenure- ty areas being sought include (but are not limited Individuals with disabilities may request ac- track faculty position at the level of Assistant Pro- to) high performance computing, computational commodations in the application process by con- fessor in the Department of Computer Science. science, mathematical modeling, and bioin- tacting the chair of the search committee. Crimi- We seek candidates in the area of computational formatics. Of particular interest are candidates nal background checks will be conducted on all biology and bioinformatics whose research fo- willing to engage in collaborative research that finalists invited for on-campus interviews. cuses on the development and application of new bridges the computational and medical sciences. computational methods. Candidates will comple- The position will be initially funded by and is ex- ment a growing program in computational biol- pected to contribute to a major multi-disciplinary College of the Holy Cross ogy within the Departments of Biology, Computer and multi-organizational state-wide initiative Assistant Professor, Tenure Track Science, Engineering Sciences, and Mathematics, aimed at biofabrication of tissues and organs. as well as the Dartmouth Medical School. Furman is a highly selective, independent, top The Department of Mathematics and Computer The Neukom Institute for Computational 40 undergraduate liberal arts institution with an Science at the College of the Holy Cross invites Science (www.dartmouth.edu/~neukom) is an enrollment of approximately 2600 students. The applications for a full-time tenure-track ap- endowed institute whose broad mandate is to in- university is located in the vibrant and beautiful pointment to begin in August 2011. All research spire and support computational science across upstate region of South Carolina, offers generous specialties will be considered and breadth of in- the Dartmouth campus. The Institute has con- benefits to fulltime faculty, and subscribes toa terests within computer science will be viewed siderable financial and computing resources problem-solving, project-oriented, experience-

november 2010 | vol. 53 | no. 11 | communications of the acm 103 careers

based approach to education that is referred to as commitment to introductory undergraduate oration, and substantial support from the Harvard Engaged Learning. The Department of Computer Computer Science instruction. We are seeking School of Engineering and Applied Sciences. Science confers the B.S. degree with majors in applicants whose interests and experience focus We invite applications for a tenure-track pro- Computer Science, Information Technology, and on teaching, pedagogy, and classroom and labo- fessor within Computer Science, and strongly Computer Science/Mathematics. The successful ratory innovations. While we are considering an encourage applications from qualified women candidate will have the opportunity to teach in internal candidate for this position, we encourage and minority candidates. The appointment is ex- Furman’s First Year Seminar program. Furman applications from all well qualified individuals, pected to begin on July 1, 2011. University is an equal-opportunity employer. especially from women and minority candidates. We seek outstanding applicants in areas relat- Women and underrepresented minorities are The application deadline is December 1, ed to machine learning and artificial intelligence. strongly encouraged to apply. For the complete 2010. A Ph.D. and significant postdoctoral teach- We are particularly interested in applicants ad, please visit http://cs.furman.edu. ing experience are required. Applications includ- whose research examines computational issues Applicants should submit a curriculum vitae, ing a CV, a statement describing teaching experi- raised by very large data sets, broadly construed. statement of teaching philosophy, description of ence and philosophy, and at least three letters of Specific areas of interest include but are not lim- research interests, an official copy of most recent recommendation should be sent to: Computer ited to statistical machine learning, probabilistic transcripts, and have three letters of recommen- Science Lectureship Committee, c/o Tristen Hub- modeling, reinforcement learning and massively dation sent separately. Please send all materials bard, School of Engineering and Applied Sciences, parallel processing. Potential application areas of to Dr. Kevin Treu, Chair, Department of Com- Harvard University, 33 Oxford Street, Cambridge, interest include computational science and engi- puter Science, Furman University, 3300 Poinsett MA 02138. Email: [email protected]. neering and the social sciences. Hwy, Greenville, SC 29613. Materials may also be Harvard is an Equal Opportunity/Affirmative Candidates should have an outstanding re- sent in PDF format to [email protected]. Action employer and encourages applications search record and a strong commitment to un- Review of applications will continue until the po- from women and members of minority groups. dergraduate teaching and graduate training. sition is filled. Applicants must have completed a Ph.D. by Sep- ESTIMATE #UDEL_CCMCLUSTERtember 1, 2011. Information about Harvard’s cur- Harvard University SEAS rent faculty, research, and educational programs Harvard University SEAS Tenure-TrackPUBLICATION: Professor in Computer Science COMMUNICATIONSis available at http://www.seas.harvard.edu/teach OF THE ACM- Senior Lectureship in Computer Science ing-learning/areas/computer-science. Over the past several years, Harvard’s Computer Candidates should send a curriculum vitae, a The School of Engineering and Applied Sciences Science faculty has doubledDATE(S) in size, moved OF into RUN:list of publications, NOVEMBER a statement of research and (SEAS) at Harvard University expects to appoint a a state-of-the-art teaching and research facility, teaching interests, and up to three representa- Senior Lecturer in Computer Science, either full- and made a serious commitment to fosteringDEADLINE: col- tive papers SEPT (ideally24 as a single PDF document) to time or part-time, starting in the Fall of 2011. This laboration with other academic disciplines. The [email protected]. In addition, candi- position will be for five years and may be renewed. Computer Science program benefits from its out- dates should have at least three letters of refer- We are particularly interested in candidates standing undergraduate and graduate students, ence sent to the above address. who have demonstrated their excellence in and an excellent location, significant industrial collab- Alternatively, material may also be sent via surface mail to CS Search Committee, School of Engineering and Applied Sciences, Harvard Uni- versity, Maxwell Dworkin 153, 33 Oxford Street, Cambridge,Faculty MA Positions 02138. in Electrical and Computer ApplicationsEngineering will at be the reviewed University as they of Delaware are received. For full consideration, applications The Department of Electrical and Computer Engineering (ECE) at the shouldUniversity be received of Delaware by December (UD) invites 1, nominations 2010. and applications for One of the oldest institutions of higher education in this country, the Harvardtenure-track is facultyan Equal positions Opportunity/ at the Assistant, Affirmative Associate, and Full Professor ranks. University of Delaware today combines tradition and innovation, offer- Action Employer. Applications from women and ing students a rich heritage along with the latest in instructional and minoritiesThe ECE are Department strongly invites encouraged. candidates that complement the depart- research technology. The University of Delaware is a Land-Grant, Sea-Grant ment's traditional strengths in (1) Computer Engineering, (3) Signal Processing, Communications & Controls, and (3) Nanoelectronics, and Space-Grant institution with its main campus in Newark, DE, located halfway Electromagnetics, and Photonics. A particular emphasis is placed on can- between Washington, DC and New York City. Please visit our website at www.udel.edu. Hologic,didates aligned Inc. with UD's College of Engineering wide cluster searches in: (1) Energy, (2) Information Technologies, (3) Security, and (4) Biomedical SoftwareEngineering. Engineering Exceptional cases outside of these traditional strengths and Faculty Positions in Electrical and Computer Engineering focused areas may also be considered. Hologic,Successful Inc. applicants is a leading will share developer, our vision tomanufactur grow the department- into a The Department of Electrical and Computer Engineering (ECE) at the University of Delaware (UD) invites nom- er andleader supplier in research of andpremium educational diagnostics, programs. ECE medical initiatives are support- inations and applications for tenure-track faculty positions at the Assistant, Associate, and Full Professor ranks. ed by over 40,000 square feet of departmental facilities, including a state- imagingof-the systemsart 7,000 sq and ft clean surgical room for products nano-fabrication, dedicated and fueled with over The ECE Department invites candidates that complement the department's traditional strengths in (1) Computer to serving$10M/year the in healthcare research expenditures. needs of Applicants women. should Due tohold a Ph.D. in Engineering, (3) Signal Processing, Communications & Controls, and (3) Nanoelectronics, Electromagnetics, electrical and/or computer engineering, or closely related field in mathe- continuedmatics, biomedical, growth computer & new or products, physical sciences. Hologic Successful has candidates and Photonics. A particular emphasis is placed on candidates aligned with UD's College of Engineering wide multipleare expected openings to have in demonstrated our Danbury, excellence CT inand innovative New- research and cluster searches in: (1) Energy, (2) Information Technologies, (3) Security, and (4) Biomedical Engineering. ark, showDE facilitiesthe potential in for our high-quality Software teaching Engineering and mentoring. de The- University of Delaware offers very competitive salary and start-up packages, and has Exceptional cases outside of these traditional strengths and focused areas may also be considered. partment.a generous Medical benefit package. device The experience application reviews preferred. start November 1st. ApplyThe to search website; will continue www.hologic.com/careers. until the positions are filled; however, early appli- Successful applicants will share our vision to grow the department into a leader in research and education- cation is strongly encouraged. al programs. ECE initiatives are supported by over 40,000 square feet of departmental facilities, including a Hologic, Inc. is an state-of-the art 7,000 sq ft clean room for nano-fabrication, and fueled with over $10M/year in research ApplicantsEqual should Opportunity submit a curriculum Employer. vitae, a statement of research and teaching interests, and a list of at least four references to expenditures. Applicants should hold a Ph.D. in electrical and/or computer engineering, or closely related www.engr.udel.edu/facultysearch. Questions can be directed to f- field in mathematics, biomedical, computer or physical sciences. Successful candidates are expected to [email protected] - 1 Sr. orEmbedded ECE Faculty SW Search Eng. Committee, BA in Eng/ 140 Evans Hall, have demonstrated excellence in innovative research and show the potential for high-quality teaching and University of Delaware, Newark, DE 19716. Applications will be consid- Compered Sci until and the 5+ position yrs experience. is filled. The UniversityPWM motor of Delaware driv- is an equal mentoring. The University of Delaware offers very competitive salary and start-up packages, and has a gen- ers/PIDopportunity controllers. employer. Proficiency with assembly, C erous benefit package. The application reviews start November 1st. The search will continue until the posi- and C++ in embedded environment. Knowledge The UNIVERSITY OF DELAWARE is an Equal Opportunity Employer tions are filled; however, early application is strongly encouraged. of Freescale,which encourages .NET and applications failsafe from experience Minority Group useful. Members and (IRC #18231) Women. Applicants should submit a curriculum vitae, a statement of research and teaching interests, and a list of at least four references to www.engr.udel.edu/facultysearch. Questions can be directed to [email protected] 1 SW Eng. BA in EE/Comp Sci. Excellent or ECE Faculty Search Committee, 140 Evans Hall, University of Delaware, Newark, DE 19716. Applications C/C#/C++ & object oriented programming skills, will be considered until the position is filled. DICOM, med imaging & image processing. Use dev tools and all elements of standard dev envi- The UNIVERSITY OF DELAWARE is an Equal Opportunity Employer which encourages applications from ronment; source code control, compilers, config Minority Group Members and Women. mgmt. & defect tracking. (IRC #19788)

104 communications of the acm | november 2010 | vol. 53 | no. 11

1/3 Page ad in-column Cost: $4,750.00 Cost: $2,618.00 Web see below Web see below Proc. Charge $128.25 Proc. Charge $70.68 Total Cost: $4,878.25 Total Cost: $2,678.68 november 2010 | vol. 53 | no. 11 | communications of the acm 105 careers

Newark - 2 SW Eng (Embedded). BA in EE/ engineering such as bioinformatics and financial tions for two positions beginning in Fall 2011. Comp Sci, Masters preferred. 3 yrs software dev engineering. Strong candidates in core computer using C and C++ in an RTOS/embedded environ- science and engineering research areas will also Senior Faculty Position in Systems ment. Medical device/ medical imaging or driver be considered. Applicants at Assistant Professor This position is in systems, broadly encom- dev experience preferred. (IRC #19750) level should have an earned PhD degree and dem- passing parallel computing and architectures, 1 Sr. SW Eng. BA in Comp Eng/Eng. 5 years onstrated potential in teaching and research. distributed systems, cyberinfrastructure, and experience developing C/C#/C++ in Windows en- Salary is highly competitive and will be com- networking. Applicants are expected to have a vironment & object oriented programming skills. mensurate with qualifications and experience. well-established track record of substantial re- DICOM, med imaging & image processing. (IRC Fringe benefits include medical/dental benefits search contributions to their field, externally #19306) and annual leave. Housing will also be provided funded research, and leadership. where applicable. For appointment at Assistant Professor/Associate Professor level, initial ap- Faculty Position in Complex Networks The Hong Kong University of Science pointment will normally be on a three-year con- and Systems and Technology tract. A gratuity will be payable upon completion This position is at the junior level but outstanding Department of Computer Science and of contract. senior candidates may be considered. Research Engineering Applications should be sent through e-mail areas include complex networks, computational Faculty Positions including a cover letter, curriculum vitae (includ- biology and epidemiology, artificial life and ro- ing the names and contact information of at least botics, computational intelligence, bio-inspired The Department of Computer Science and Engi- three referees), a research statement and a teach- computing, large scale data modeling and simu- neering is one of the largest departments in the ing statement (all in PDF format) to csrecruit@ lation, and Web applications, with special inter- School of Engineering. The Department currently cse.ust.hk. Priority will be given to applications est in modeling the dynamics of complex infor- has 40 faculty members recruited from major received by 28 February 2011. Applicants will be mation networks, social networks and media, universities and research institutions around the promptly acknowledged through e-mail upon re- and the spread of ideas and disease in human and world, with about 1000 students (including 600 ceiving the electronic application material. social systems. undergraduate and 180 postgraduate students). (Information provided by applicants will be Applicants for either position should have a The medium of instruction is English. More in- used for recruitment and other employment-re- Ph.D.in Computer Science or another relevant formation on the Department can be found at lated purposes.) area and a well-established record (senior level) http://www.cse.ust.hk/. or demonstrable potential for excellence in re- The Department will have at least two tenure- search and teaching (junior level). track faculty openings at Assistant Professor/ Indiana University The IU Bloomington School of Informatics Associate Professor/Professor levels for the 2011- School of Informatics and Computing and Computing is the first of its kind and among 2012 academic year. We are looking for faculty the largest in the country, with a faculty of more candidates with interests in multidisciplinary re- The School of Informatics and Computing at In- than 60 full time members, more than 400 gradu- search areas related to computational science and diana University, Bloomington, invites applica- ate students, and strong undergraduate programs. Degrees offered include M.S. degrees in Com- puter Science, Bioinformatics, Human Computer Interaction Design, and Security Informatics, and Ph.D. degrees in Computer Science and in Infor- matics. The School has received public recogni- tion as a “top-ten program to watch” (Computer- world) thanks to its excellence and leadership in academic programs, interdisciplinary research, TEMASEK RESEARCH FELLOWSHIP (TRF placement, and outreach. The school offers excel- lent work conditions, including attractive salaries The Nanyang Technological University (NTU) and the National University of and research support, and low teaching loads in a Singapore (NUS) invite outstanding young researchers with a PhD Degree setting of strong student growth. in science or technology to apply for the prestigious TRF awards. Located in the wooded, rolling hills of south- The TRF scheme provides selected young researchers an opportunity to ern Indiana, Bloomington is a culturally thriv- conduct and lead research that is a relevant to defence. It offers: ing college town with a moderate cost of living • 3-year research grant, with an option to extend up to and the amenities for an active lifestyle. IU is a further 3 years, renowned for its top-ranked music school, high • possible tenure-track academic appointment with the performance computing and networking facili- university at the end of the TRF, ties, and performing and fine arts. • attractive and competitive remuneration. Applicants should submit a curriculum vitae, Fellows may lead and conduct research, and publish in these areas: a statement of research and teaching, and the 1. Biomimicry names of six references using the recruit link at 2. Cognitive Sciences http://hiring.soic.indiana.edu (preferred) or by 3. Cyber Security mail to the Chair of either the Systems or Com- 4. Computational Photography plex Networks and Systems Faculty Search Com- 5. Microsystem Technologies mittee, School of Informatics and Computing, Other fundamental areas of science or technology, where a breakthrough 919 E 10th Street, Bloomington, IN 47408. Ques- would be of interest to defence and security, will also be considered. tions concerning the Systems search may be sent Singapore is a globally connected cosmopolitan city-state with to [email protected]; a supportive environment and vibrant research culture. For more questions concerning the Complex Networks information and application procedure, please visit and Systems search to hiring-cnets@informatics. NTU – http://www3.ntu.edu.sg/trf/ indiana.edu. To receive full consideration com- NUS – http://www.nus.edu.sg/dpr/funding/trf.htm pleted applications must be received by Decem- ber 1, 2010. Closing date: 11 January 2011 Indiana University is an Equal Opportunity/ Shortlisted candidates will be invited to Singapore to present Affirmative Action employer. Applications from their research plans, meet local researchers and identify potential women and minorities are strongly encouraged. collaborators in April/May 2011. IU Bloomington is vitally interested in the needs of Dual Career couples.

106 communications of the acm | november 2010 | vol. 53 | no. 11 Michigan Technological University Montana State University vision and multimedia; machine learning; natu- Department of Computer Science RightNow Technologies Professorships in ral language processing; scientific computing; Department Chair Computer Science and verification and programming languages. Collaborative research with industry is facili- Michigan Technological University invites ap- The Montana State University Computer Science tated by geographic proximity to computer sci- plications and nominations for the position of Department is searching for two faculty members ence activities at AT&T, Google, IBM, Bell Labs, Chair of the Department of Computer Science. at either the Assistant, Associate or Full level, NEC, and Siemens. The chair will be expected to build on a strong based on experience. Candidates at the Associ- Please apply at https://cs.nyu.edu/webapps/ undergraduate degree program and continue the ate or Full level must have established or rising facapp/register development of graduate education and research prominence in their field. A three-year start-up To guarantee full consideration, applications programs. Candidates are expected to have a pro- package is being provided by RightNow Tech- should be submitted no later than December 1, fessional record of accomplishments commen- nologies. Montana State University is a Carnegie 2010; however, this is not a hard deadline, as all surate with the rank of full professor at Michigan Foundation RU/VH research university with an candidates will be considered to the full extent Tech, including a record of high quality publica- enrollment of approximately 13,000. The website feasible, until all positions are filled. Visiting po- tions and external funding. Candidates must also www.cs.montana.edu/faculty-vacancies has infor- sitions may also be available. have demonstrated administrative, supervisory, mation on position requirements and application New York University is an equal opportunity/ or leadership experience. procedures. ADA/EO/AA/Veterans Preference. affirmative action employer. The Computer Science Department has 325 undergraduate majors in three BS degree programs and 50 graduate students in MS and New Mexico State University North Carolina State University PhD degree programs in Computer Science Assistant Professor Department of Computer Science and in the Computational Science and Engi- Faculty Positions neering PhD program. The research interests of the The Computer Science Department at New Mex- 17 faculty include both core areas of computer sci- ico State University invites applications for a ten- The Department of Computer Science at NC State ence and interdisciplinary topics. The Department ure-track position at the assistant professor level, University (NCSU) seeks to fill multiple tenure has close ties to the Department of Electrical and with appointment starting in the Fall 2011 semes- track faculty positions starting August 16, 2011. Computer Engineering and offers many courses re- ter. We are seeking strong candidates in any areas Exceptional candidates in all areas of Computer quired by the Computer Engineering, Bioinformat- of Computer Science, although applications with Science will be considered, but of particular inter- ics, and Cheminformatics BS degree programs. expertise in computer architecture, operating est are candidates specializing in Computer and The University has approximately 7,000 stu- systems, compilers, and computer graphics/ani- Network Security. dents and 400 faculty with educational and re- mation are particularly encouraged. Applications Successful candidates must have a strong search programs that emphasize solving techno- from women and members of traditionally un- commitment to academic and research excel- logical problems in all aspects of life. Michigan der-represented groups are strongly encouraged. lence, and an outstanding research record com- Tech is located in Michigan’s scenic Upper Penin- For the complete announcement and applica- mensurate with the expectations of a major re- sula and is bounded by Lake Superior and nearby tion procedure, please visit http://www.cs.nmsu. search university. Required credentials include a forests. The community offers year-round recre- edu/~epontell/job.html. Apply URL: http://www/ doctorate in Computer Science or a related field. ational and cultural opportunities. This environ- cs/nmsu.edu/~cssearch While the department expects to hire faculty pri- ment, combined with a competitive compensation New Mexico State University is an EEO/AA marily at the Assistant Professor level, candidates package, provides an excellent quality of life. Employer. All university positions are contingent with exceptional research records are encouraged In addition to the present search, strategic upon availability of funding. All offers of employ- to apply for senior positions. The Department is faculty hiring initiatives with up to ten new posi- ment, oral and written, are contingent on the uni- one of the largest and oldest in the country. It is tions in “Next Generation Energy Systems” and versity’s verification of credentials and other in- part of NCSU’s College of Engineering, which has “Health: Basic Sciences, Technologies, and Medi- formation required by federal law, state law, and recently received significant increases in private cal Informatics” are in their second year. Quali- NMSU policies/procedures, and may include the and public funding, faculty positions, and facili- fied candidates are encouraged to send a separate completion of a criminal history check. ties that will assist the Department in achieving application, following the “How to Apply” guide- its goals. The department’s research expendi- lines at http://www.mtu.edu/sfhi. tures and recognition are growing steadily. For ex- Michigan Tech is an ADVANCE institution, New York University ample, we have one of the largest concentrations one of a limited number of universities in Courant Institute of Mathematical Sciences in the country of prestigious NSF Early Career receipt of NSF funds in support of our com- Department of Computer Science Award winners (total of 20). mitment to increase diversity and the Faculty NCSU is located in Raleigh, the capital of participation and advancement of women in North Carolina, which forms one vertex of the STEM. The department expects to have several regular world-famous Research Triangle Park (RTP). RTP Applications must include a vita, list of ref- faculty positions beginning in September 2011 is an innovative environment, both as a metropol- erences, and a cover letter that addresses the and invites candidates at all levels. We will con- itan area with one of the most diverse industrial candidate’s professional qualifications and ad- sider outstanding candidates in any area of com- bases in the world, and as a center of excellence ministrative philosophy. Applications received puter science, with systems and formal methods/ promoting technology and science. The Research by November 15, 2010, are assured of full consid- verification being high-priority areas. Triangle area is routinely recognized in nation- eration. Applications must be submitted by email Faculty members are expected to be outstand- wide surveys as one of the best places to live in the to [email protected]. ing scholars and to participate in teaching at all U.S. We enjoy outstanding public schools, afford- To learn more about this opportunity, please levels from undergraduate to doctoral. New ap- able housing, and great weather, all in the prox- visit: http://www.cs.mtu.edu/CSchairSearch.html pointees will be offered competitive salaries and imity to the mountains and the seashore. or contact: startup packages, with affordable housing within Applications will be reviewed as they are re- Prof. Steven Seidel a short walking distance of the department. New ceived. The positions will remain open until suit- [email protected] York University is located in Greenwich Village, able candidates are identified. Applicants are Department of Computer Science one of the most attractive residential areas of encouraged to apply by December 15, 2010. Ap- Michigan Technological University Manhattan. plicants should submit the following materials 1400 Townsend Drive The department has 32 regular faculty mem- online at http://jobs.ncsu.edu (reference position Houghton, MI 49931-1295 bers and several clinical, research, adjunct, and number 1091) cover letter, curriculum vitae, re- visiting faculty members. The department’s cur- search statement, teaching statement, and names Michigan Technological University is an equal rent research interests include algorithms, cryp- and complete contact information of four refer- opportunity educational institution/equal oppor- tography and theory; computational biology; dis- ences, including email addresses and phone num- tunity employer. tributed computing and networking; graphics, bers. Candidates can obtain information about

november 2010 | vol. 53 | no. 11 | communications of the acm 107 careers

the department and its research programs, as well candidates in all areas of computer science. We journalism, freedom of expression in the digital as more detail about the positions advertised here have particular interest in candidates with experi- age, the changing role of the media in campaigns at http://www.csc.ncsu.edu/. Inquiries may be sent ence in one or more of the areas: programming and elections, and the relation between news pro- via email to: [email protected]. language theory, formal verification, or security. gramming and informed citizenship. Applicants North Carolina State University is an equal op- The Department also has openings for two will be expected to teach at the graduate and portunity and affirmative action employer. In ad- joint lecturer/post-doctoral researcher positions undergraduate levels in both academic and pre- dition, NC State University welcomes all persons and a variety of research positions. Research posi- professional curricula. without regard to sexual orientation. Individuals tions are contingent on external funding. We seek an innovative intellectual leader with disabilities desiring accommodations in the Applicants must hold a Ph.D. or equivalent in with an interdisciplinary orientation whose work application process should contact the Depart- Computer Science or a related discipline, or must speaks to both the academic and professional ment of Computer Science at (919) 515-2858. complete the Ph.D. by November 1, 2011. Please communities. Applicants should have a record specify in your application if you are applying for of substantial research accomplishments in peer the assistant professor position, for one of the lec- reviewed publications. The successful applicant Oklahoma State University turer/postdoc positions, or for a research position. is expected to eventually assume the director- Assistant Professor We will begin evaluating applications on No- ship of the graduate program in journalism in the FACULTY SEARCH vember 15, 2010. Applications submitted after that department. University: Oklahoma State University date may be considered, but we would prefer that Applicants should send curriculum vitae, Department: Computer Science Department you complete your application by November 15. bibliography, and a brief statement of research Position: Faculty Position More details on these positions can be found at interest to: Professor James S. Fishkin, Chair, the Department’s web site http://compsci.rice.edu. Department of Communication, McClatchy Hall, Applications are invited for two anticipated full- Apply URL: http://csfacultyapplications.rice.edu Stanford University, Stanford, CA 94305-2050. For time, tenure-track Assistant Professor positions. Rice University is an Equal Opportunity/Affir- full consideration, materials must be received by The term of initial appointment will begin in mative Action Employer. January 15, 2011. The term of appointment would August 2011. begin September 1, 2011. Stanford University is an The Oklahoma State University Computer Sci- equal opportunity employer and is committed to ence Department is seeking applications from Saint Vincent College increasing the diversity of its faculty. It welcomes qualified candidates with teaching and research Tenure-track position nominations of, and applications from, women experience in any area of Computer Science. A and members of minority groups, as well as oth- Ph.D. or D.Sc. in Computer Science or a closely The Computing & Information Science Depart- ers who would bring additional dimensions to related area is required. ment at Saint Vincent College invites applications the university’s research and teaching missions. The positions being sought are for the Stillwater for a tenure track, assistant professor position be- campus and duties may be assigned to either Still- ginning in August 2011, contingent on funding water or Tulsa campuses or both. Please send cur- and staffing within the Department. Applicants Swarthmore College riculum vita, a statement of teaching and research should hold a graduate degree (preferably a PhD Visiting Assistant Professor experience, and names of three references to: or ABD) in information technology, information Chair, Faculty Search Committee, systems, computer science, or a closely-related Swarthmore College invites applications for a Computer Science Department discipline. Primary duties would include under- three-year faculty position in Computer Science, 219 MSCS Building graduate teaching and research in the areas of at the rank of Visiting Assistant Professor, be- Oklahoma State University networking, IT, computer security, and databases. ginning September 2011. Specialization is open. Stillwater, OK 74078-1053 Candidates should be prepared to teach introduc- Review of applications will begin January 1, 2011, tory CS classes, and the ability to teach some of and continue until the position is filled. For infor- Application via e-mail with pdf attachment(s) the following courses is a plus: programming lan- mation, see http://www.cs.swarthmore.edu/job. is preferred. guages, Java, graphics, and game development. Swarthmore College has a strong commit- Send e-mail to Saint Vincent College is a Catholic, Bene- ment to excellence through diversity in educa- [email protected] . dictine liberal arts and science college of about tion and employment and welcomes applications 1700 undergraduate students and 200 graduate from candidates with exceptional qualifications, For full consideration, applications should be students. It is located about forty miles east of particularly those with demonstrable commit- received by December 20, 2010, but applications Pittsburgh, Pennsylvania in a pleasant suburban/ ments to a more inclusive society and world. will be accepted until the positions are filled. rural environment near the foothills of the Laurel These positions are contingent upon avail- Mountains. Saint Vincent is an equal opportunity able funding. employer. Review of applications will begin on The University of Michigan - Dearborn Oklahoma State University is an Affirmative December 1, 2010 and continue until the position Department of Computer and Information Action/Equal Opportunity/E-Verify employer is filled. To apply, send a letter of application, cur- Science committed to diversity. OSU-Stillwater is a tobac- riculum vita, research statement, teaching state- Assistant/Associate/Full Professor co-free campus. ment, transcripts, and three letters of reference Oklahoma State University is a modern com- to: The Department of Computer and Information prehensive land grant university that serves the Human Resources Director Science (CIS) at the University of Michigan-Dear- state, national and international communities by Saint Vincent College born invites applications for a tenure-track faculty providing its students with exceptional academic 300 Fraser Purchase Road position in any of the following areas: computer experiences, by conducting scholarly research Latrobe, PA 15650-2690 and data security, digital forensics, and informa- and other creative activities that advance funda- www.stvincent.edu/hr2 tion assurance. Rank and salary will be commen- mental knowledge, and by disseminating knowl- surate with qualifications and experience. We of- edge to the people of Oklahoma and throughout fer competitive salaries and start-up packages. the world. Stanford University Qualified candidates must have, or expect to Tenured Professor Position in the have, a Ph.D. in CS or a closely related discipline Communication Dept by the time of appointment and will be expected Rice University, to do scholarly and sponsored research, as well as Computer Science Dept. The Department of Communication at Stanford teaching at both the undergraduate and graduate Tenure-track Assistant Professor University invites applications for a tenured pro- levels. Candidates at the associate or full profes- fessor position in the Department of Communi- sor ranks should already have an established Rice University’s Department of Computer Sci- cation. The areas of expertise of applicants can in- funded research program. The CIS Department ence seeks applications for a tenure-track assis- clude, but are not limited to, the changing forms offers several BS and MS degrees, and participates tant professor to start in July 2011. We welcome of journalism, the economics and regulation of in an interdisciplinary Ph.D. program in informa-

108 communications of the acm | november 2010 | vol. 53 | no. 11 tion systems engineering. The current research University of California, Los Angeles should also arrange to have three letters of recom- areas in the department include computer graph- Computer Science Department mendation sent to the department as soon as pos- ics and geometric modeling, database systems, sible. Electronic submission of materials as PDF networking, computer and network security, and The Computer Science Department of the Henry documents is preferred. Electronic copies should software engineering. These areas of research are Samueli School of Engineering and Applied Sci- be sent to [email protected]. Copies can also supported by several established labs. ence at the University of California, Los Angeles, be sent to: Dr. Andrew Sears, Chair of Faculty The University of Michigan-Dearborn is lo- invites applications for tenure-track positions in Search Committee, Information Systems Depart- cated in the southeastern Michigan area and of- all areas of Computer Science and Computer En- ment, UMBC, 1000 Hilltop Circle, Baltimore, MD fers excellent opportunities for faculty collabora- gineering. Applications are also encouraged from 21250-5398. For inquiries, please contact Barbara tion with many industries. We are one of three distinguished candidates at senior levels. Quality Morris at (410) 455-3795 or [email protected]. campuses forming the University of Michigan is our key criterion for applicant selection. Appli- Review of applications will begin immediately system and are a comprehensive university with cants should have a strong commitment both to and will continue until the position is filled. This over 8500 students. One of university’s strategic research and teaching and an outstanding record position is subject to the availability of funds. visions is to advance the future of manufacturing of research for their level of seniority. UMBC is an Affirmative Action/Equal Opportu- in a global environment. The University of California is an Equal Op- nity Employer and welcomes applications from mi- The University of Michigan-Dearborn is dedi- portunity/Affirmative Action Employer. The de- norities, women and individuals with disabilities. cated to the goal of building a culturally-diverse partment is committed to building a more diverse and pluralistic faculty committed to teaching faculty, staff and student body as it responds to and working in a multicultural environment, and the changing population and educational needs University at Buffalo, The State strongly encourages applications from minori- of California and the nation. To apply, please visit University of New York ties and women. http://www.cs.ucla.edu/recruit. Faculty applica- Faculty Position in Computer Science and A cover letter, curriculum vitae including e- tions received by January 15 will be given full con- Engineering mail address, teaching statement, research state- sideration. ment, and three letters of reference should be The CSE Department invites excellent candidates sent to, in all core areas of Computer science and Engi- Dr. William Grosky, Chair UMBC neering, especially Database Systems, Data Min- Department of Computer and Information University of Maryland Baltimore County ing, Information Retrieval, Machine Learning Science An Honors University in Maryland and Robotics areas, to apply for an opening at the University of Michigan-Dearborn Information Systems Department assistant professor level. 4901 Evergreen Road The department is affiliated with successful Dearborn, MI 48128-1491 The Information Systems Department at UMBC centers devoted to biometrics, bioinformatics, invites applications for a tenure-track faculty po- biomedical computing, cognitive science, docu- Email: [email protected] sition at the Assistant Professor level in the area ment analysis and recognition, high performance Internet: http://www.cis.umd.umich.edu of data mining starting August 2011. Outstanding computing, and information assurance. Phone: 313.583.6424 candidates in other areas will also be considered. Candidates are expected to have a Ph.D. in Fax: 313.593.4256 Candidates must have an earned PhD in In- Computer Science/Engineering or related field by formation Systems or a related field no later than August 2011, with an excellent publication record The University of Michigan-Dearborn is an August 2011. Applicants engaged in research in and potential for developing a strong funded re- equal opportunity/affirmative action employer. data mining with overlapping interest in artifi- search program. cial intelligence in areas including but not lim- Applications should be submitted by Decem- ited to privacy preserving data mining, cyberse- ber 31, 2010 electronically via http://www.ubjobs. Toyota Technological Institute curity, spatial and spatio-temporal data mining, buffalo.edu/ at Chicago and knowledge discovery in healthcare data, are The University at Buffalo is an Equal Opportu- Computer Science Faculty Positions of primary interest. Ideal candidates will be en- nity Employer/Recruiter. at All Levels gaged in research that spans two or more of these areas with preference given to those who can col- Toyota Technological Institute at Chicago (TTIC) laborate with current faculty. Candidates should The University of Alabama is a philanthropically endowed degree-granting have a strong potential for excellence in research, Tenure-Track Faculty Position institute for computer science located on the the ability to develop and sustain an externally Software Engineering Focus University of Chicago campus. The Institute is funded research program, and the ability to con- Department of Computer Science expected to reach a steady-state of 12 traditional tribute to our graduate and undergraduate teach- faculty (tenure and tenure track), and 12 limited ing mission. The Department of Computer Science at the Uni- term faculty. Applications are being accepted in The Department offers undergraduate degrees versity of Alabama invites applications for a new all areas, but we are particularly interested in in Information Systems and Business Technology tenure-track Assistant Professor position to be- Theoretical computer science Administration as well as both the MS and PhD gin August 2011. The general area of focus is in Speech processing in Information Systems. In addition, the Depart- software engineering, with a high priority area in Machine learning ment offers an MS and PhD in Human-Centered model-driven engineering. Candidates must have Computational linguistics Computing. Consistent with the UMBC vision, an earned Ph.D. in Computer Science or a related Computer vision the Department has excellent technical support field, with solid evidence of superior research Computational biology and teaching facilities as well as outstanding and scholarly accomplishments, as well as dem- Scientific computing laboratory space and state of the art technology. onstrated excellence in teaching. Applicants who UMBC’s Technology Center, Research Park, and specialize in software engineering are encour- Positions are available at all ranks, and we Center for Entrepreneurship are major indicators aged to apply. have a large number of limited term positions of active research and outreach. Further details The University of Alabama is considered the currently available. on our research, academic programs, and faculty Capstone of higher education in Alabama and is For all positions we require a Ph.D. Degree or can be found at http://www.is.umbc.edu/. Under- also the largest institution in the State. The Uni- Ph.D. candidacy, with the degree conferred prior represented groups including women and minor- versity was ranked in 2010 by US News and World to date of hire. Submit your application electroni- ities are especially encouraged to apply. Report as 34th among US public universities. The cally at: Applications will not be reviewed until the University led the nation in 2010 with ten USA To- http://ttic.uchicago.edu/facapp/ following materials are received: a cover letter, a day Academic All-Americans (record for any US one-page statement of teaching interests, a one- university) and ranks 10th in the nation among Toyota Technological Institute at Chicago is an page statement of research interests, one or more public universities in enrolling National Merit Equal Opportunity Employer sample research papers, and a CV. Applicants Scholars.

november 2010 | vol. 53 | no. 11 | communications of the acm 109 careers

The Department of Computer Science, University of Calgary tion, gender identity, religion, color, national or housed in the College of Engineering, currently Calgary, Alberta, Canada T2N 1N4 or ethnic origin, age, disability, or status as a Viet- has twenty-three faculty members (16 tenured/ [email protected] nam Era Veteran or disabled veteran in the ad- tenure track faculty, 6 of whom have interests in ministration of educational policies, programs or software engineering), roughly 200 undergradu- The applications will be reviewed beginning activities; admissions policies; scholarship and ates in an ABET accredited B.S. degree program, November 2010 and continue until the positions loan awards; athletic, or other University admin- and approximately 60 M.S. and Ph.D. students. are filled. istered programs or employment. The Penn CIS Throughout the 2010-2011 academic year, two All qualified candidates are encouraged to ap- Faculty is sensitive to “two–body problems” and postdocs in software engineering will be sup- ply; however, Canadians and permanent residents opportunities in the Philadelphia region. ported in the Department. A recent Department will be given priority. The University of Calgary re- of Education GAANN award provides extended spects, appreciates, and encourages diversity. fellowships to six Ph.D. students who are focus- University of Rochester ing on software engineering. Assistant to Full Professor The Department and College of Engineering University of Pennsylvania of Computer Science are undergoing a period of extensive growth. The Department of Computer and Information Department is housed in a new (opened August Science The UR Department of Computer Science seeks 2009) state-of-the art complex. Over the next three Faculty Position researchers in computer vision and/or machine years, the University will complete construction learning for a tenure-track faculty position begin- of the science and engineering complex, which The University of Pennsylvania invites applicants ning in Fall 2011. Outstanding applicants in other is comprised of four buildings focused on the for tenure-track appointments in computer areas may be considered. Candidates must have expansion of research in engineering and the graphics and animation to start July 1, 2011. Ten- a PhD in computer science or related discipline. sciences. Over 3000 square feet of new research ured appointments will also be considered. Senior candidates should have an extraordinary space has been constructed to support the efforts Faculty duties include teaching undergraduate record of scholarship, leadership, and funding. of the software engineering faculty. The software and graduate students and conducting high-quality The Department of Computer Science is a engineering faculty in the Department are PIs or research. Teaching duties will be aligned with two select research-oriented department, with an co-PIs on 13 active awards (over $3.25M) across programs: the Bachelor of Science and Engineering unusually collaborative culture and strong ties to five different funding agencies. in Digital Media Design, and the Master of Science cognitive science, linguistics, and electrical and For more information about the Software and Engineering in Computer Graphics and Game computer engineering. Over the past decade, a Engineering Group at UA, please visit http://soft- Technology (see http://cg.cis.upenn.edu). Research third of its PhD graduates have won tenure-track ware.eng.ua.edu and teaching will be enhanced by the recently reno- faculty positions, and its alumni include leaders Details regarding the application procedures vated SIG Center for Computer Graphics, which at major research laboratories such as Google, for this position are available at http://cs.ua.edu. For houses the largest motion capture facility in the re- Microsoft, and IBM. information about the position, please contact the gion, and is also the home of the Center for Human The University of Rochester is a private, Tier I Search Committee at [email protected]. Modeling and Simulation. Successful applicants research institution located in western New York Review of applications will begin late-Fall 2010 will find Penn to be a stimulating environment con- State. The University of Rochester consistently and will continue until the position is filled. The ducive to professional growth. ranks among the top 30 institutions, both public University of Alabama is an equal opportunity/af- The University of Pennsylvania is an Ivy and private, in federal funding for research and firmative action employer. Women and minority League University located near the center of Phil- development. Half of its undergraduates go on applicants are particularly encouraged to apply. adelphia, the 5th largest city in the US. Within to post-graduate or professional education. The walking distance of each other are its Schools of university includes the Eastman School of Music, Arts and Sciences, Engineering, Fine Arts, Medi- a premiere music conservatory, and the Univer- University of Calgary cine, the Wharton School, the Annenberg School sity of Rochester Medical Center, a major medical Department of Computer Science of Communication, Nursing, and Law. The Uni- school, research center, and hospital system. The Assistant Professor Positions versity campus and the Philadelphia area sup- Rochester area features a wealth of cultural and port a rich diversity of scientific, educational, and recreational opportunities, excellent public and The Department of Computer Science and the cultural opportunities, major technology-driven private schools, and a low cost of living. University of Calgary seeks outstanding candi- industries such as pharmaceuticals, finance, and Candidates should apply online at http://www. dates for two tenure track positions at the As- aerospace, as well as attractive urban and subur- cs.rochester.edu/recruit after Nov. 1, 2010. Review sistant Professor level. Applicants from areas of ban residential neighborhoods. Princeton and of applications will begin on Dec. 1, and continue Database Management and Human Computer New York City are within commuting distance. until all interview openings are filled. The Uni- Interaction/Information Visualization are of par- To apply, please complete the form located on versity of Rochester has a strong commitment ticular interest. Details for each position appear the Faculty Recruitment Web Site at: http://www. to diversity and actively encourages applications at: http://www.cpsc.ucalgary.ca/. cis.upenn.edu/departmental/facultyRecruiting. from candidates from groups underrepresented Applicants must possess a doctorate in Com- shtml. Electronic applications are strongly pre- in higher education. The University is an Equal puter Science or a related discipline at the time ferred, but hard-copy applications (including the Opportunity Employer. of appointment, and have a strong potential to names of at least four references) may alterna- develop an excellent research record. tively be sent to: The department is one of Canada’s leaders as Chair, Faculty Search Committee The University of Washington Bothell evidenced by our commitment to excellence in re- Department of Computer and Information Lecturer or Senior Lecturer search and teaching. It has an expansive graduate Science program and extensive state-of-the-art comput- School of Engineering and Applied Science The University of Washington Bothell Computing ing facilities. Calgary is a multicultural city that is University of Pennsylvania and Software Systems Program invites applica- the fastest growing city in Canada. Calgary enjoys Philadelphia, PA 19104-6389. tions for a full-time Lecturer or Senior Lecturer a moderate climate located beside the natural position to begin fall 2011. Duties include teach- beauty of the Rocky Mountains. Further informa- Applications should be received by January ing/mentoring undergraduate and graduate stu- tion about the department is available at http:// 15, 2011 to be assured full consideration. Applica- dents, including industry internships. Areas of www.cpsc.ucalgary.ca/. tions will be accepted until the position is filled. expertise include: security, networking, operat- Interested applicants should send a CV, a con- Questions can be addressed to faculty-search@ ing systems, architecture, databases, multimedia cise description of their research area and program, cis.upenn.edu. The University of Pennsylvania software development, computer engineering, a statement of teaching philosophy, and arrange to values diversity and seeks talented students, fac- and parallel/distributed computing. have at least three reference letters sent to: ulty and staff from diverse backgrounds. The Bothell campus was founded in 1990 as Dr. Carey Williamson The University of Pennsylvania does not dis- an innovative, interdisciplinary campus within Department of Computer Science criminate on the basis of race, sex, sexual orienta- the University of Washington system - one of the

110 communications of the acm | november 2010 | vol. 53 | no. 11 premier institutions of higher education in the women, members of visible minorities, native a closely related field by August 2011, have a com- US. Faculty members have full access to the re- peoples, and persons with disabilities. All quali- mitment to undergraduate teaching, and be able sources of a major research university, with the fied candidates are encouraged to apply; however, to contribute to curriculum development. culture and close relationships with students of a Canadian citizens and permanent residents will small liberal arts college. be given priority. Fall 2010. For additional information, including ap- Yale University plication procedures, please see our website at Senior Faculty http://www.uwb.edu/CSS/. All University faculty University of Wisconsin - Platteville engage in teaching, research, and service. The Assistant Professor - Software Engineering Yale University’s Electrical Engineering Depart- University of Washington, Bothell is an affirma- ment invites applications from qualified individu- tive action, equal opportunity employer. The Department of Computer Science and Software als for a senior faculty position in either computer Engineering at the University of Wisconsin-Platte- systems or signals & systems. Subfields of interest ville invites applications for a tenure-track faculty include wireless communications, networking, University of Waterloo position in Software Engineering starting Fall, 2011. systems on a chip, embedded systems, and emerg- David R. Cheriton School of Computer Science Candidates must have a Ph.D. in Software Engineer- ing computing methodologies inspired by advanc- Tenured and Tenure-Track Faculty Positions ing, Computer Science, or closely related field. es in the biological sciences, quantum computing, The primary duty for this position will be and other novel research directions. All candidates Applications are invited for several positions in teaching courses in an ABET-accredited software should be strongly committed to both teaching computer science: (a) Up to two senior, tenured engineering program. Candidates must have ex- and research and should be open to collaborative David R. Cheriton Chairs in Software Systems are cellent verbal and written communication skills research. Candidates should have distinguished open for candidates with outstanding research re- and a strong commitment to teaching. Commit- records of research accomplishments and should cords in software systems (very broadly defined). ment to scholarly and professional activities is be willing and able to take the lead in the shaping Successful applicants will be acknowledged lead- also required. The candidate must have a dem- of Yale’s expanding programs in either computer ers in their fields or have demonstrated the po- onstrated commitment to or experience with engineering or signals & systems. Yale University is tential to become such leaders. These positions racially diverse populations. Salary will be com- an Affirmative Action/Equal Opportunity Employ- include substantial research support and teaching petitive and commensurate with experience and er. Yale values diversity among its students, staff, reduction. (b) One tenured or tenure-track posi- qualifications. In addition, there are consulting and faculty and strongly welcomes applications tion is open in the area of Health Informatics, in- opportunities with local companies. from women and under represented minorities. cluding, but not limited to, healthcare IT, medical Applications must be submitted electronical- The review process will begin November 1, 2010. informatics, and biomedical systems. The success- ly. Send a letter of application, undergraduate & Applicants should send a curriculum vitae to: ful applicant will help develop a new graduate de- graduate transcripts, a statement illustrating your Chair gree program in health informatics. (c) One other commitment to fostering campus racial diversity, Electrical Engineering Search Committee tenured or tenure-track position is available for and a vita including references to stutenbm@ Yale University excellent candidates in any computing area, but uwplatt.edu. Visit our web site at http://www.uw- P.O. Box 208267 highest priority will be given to candidates special- platt.edu/csse for more information. Review of New Haven, CT 06520-8267 izing in systems software (operating systems, dis- applications will begin on December 6, 2010 and tributed systems, networks, etc.) and information continue until a suitable candidate is found. systems (e-commerce systems, enterprise resource The University of Wisconsin-Platteville, an planning systems, business intelligence, etc.). equal opportunity, affirmative action employer, Skidmore college Successful applicants who join the University seeks to build a diverse faculty and staff and en- The Department of Mathematics and Computer of Waterloo are expected to be leaders in research, courages applications from women and persons Science at Skidmore College invites applications for a tenure-track position in Computer Science beginning have an active graduate-student program, and con- of color. The names of all nominees and appli- September 2011. Qualifications include a Ph.D. in tribute to the overall development of the School. cants who have not requested in writing that their Computer Science. The appointment will be at the rank A Ph.D. in Computer Science, or equivalent, is re- identities be kept confidential, and of all finalists, of Assistant Professor. Salary will be competitive and commensurate with experience. In addition, startup quired, with evidence of excellence in teaching and will be released upon request. Employment re- funds and pre-tenure sabbaticals are available. research. Rank and salary will be commensurate quires a criminal background check. A commitment to quality instruction of undergraduates with experience, and appointments are expected and continuing scholarly achievement is essential. The department has expertise in the theory of computation, to commence during the 2011 calendar year. algorithms, artificial intelligence, computer vision and With over 70 faculty members, the University of US Air Force Academy graphics; we hope to widen our areas of expertise with Waterloo’s David R. Cheriton School of Computer Distinguished Visiting Professor this appointment. In addition to strong disciplinary teaching and scholarship, including collaborative Science is the largest in Canada. It enjoys an excel- & Visiting Chair research with students, the College encourages lent reputation in pure and applied research and interdisciplinary teaching and scholarship via participation in its First-Year Experience program and houses a diverse research program of international U.S. AIR FORCE ACADEMY Department of Com- other interdisciplinary programs such as Neuroscience, stature. Because of its recognized capabilities, the puter Science is accepting applications for our Environmental Studies, and Gender Studies. School attracts exceptionally well-qualified stu- Coleman-Richardson Chair and Distinguished Located in the vibrant community of Saratoga Springs, New York, Skidmore College is a highly dents at both undergraduate and graduate levels. Visiting Professor positions. See http://www.usa- selective liberal arts college of 2400 students. The In addition, the University has an enlightened in- fa.edu/df/dfcs/index.cfm or call (719) 333-7474 Department of Mathematics and Computer Science tellectual property policy which vests rights in the for details. U.S. Citizenship required. consists of eleven faculty members, offers both major and minor programs in Computer Science, and inventor: this policy has encouraged the creation has its own Linux network supported by a dedicated of many spin-off companies including iAnywhere system administrator within the department . For Solutions Inc., Maplesoft Inc., Open Text Corp., Valdosta State University more detailed information, please go to http://cms. skidmore.edu/mcs/. and Research in Motion. Please see our web site for Assistant Professor Review of applications will begin December 15, more information: http://www.cs.uwaterloo.ca. 2010 and will continue until the position is filled. To submit an application, please register at the Applications are invited for two ten-month tenure- Applications from members of underrepresented groups are especially encouraged. submission site: http://www.cs.uwaterloo.ca/fac- track faculty positions at the rank of Assistant Pro- To learn more about and apply for this position ulty-recruiting. Once registered, instructions will fessor with starting date of August 1, 2011. Responsi- please visit Skidmore’s website at: jobs.skidmore. edu/applicants/Central?quickFind=52518 be provided regarding how to submit your appli- bilities include teaching at the undergraduate level, Skidmore College is committed to being an inclusive cation. Although applications will be considered scholarly activity, and service to both the department campus community and, as an Equal Opportunity as soon as possible after they are complete and as and the university. For a detailed job description and Employer, does not discriminate in its hiring or employment practices on the basis of gender, race or long as positions are available, full consideration application instructions email Dr. Ashok Kumar, In- ethnicity, color, national origin, religion, age, disability, is assured for those received by November 30. terim Head, [email protected]. family, veteran or marital status, sexual orientation, The University of Waterloo encourages appli- Applicants must complete a doctorate in Com- gender identity or expression. cations from all qualified individuals, including puter Information Systems, Computer Science, or CREATIVE THOUGHT MATTERS.

november 2010 | vol. 53 | no. 11 | communications of the acm 111 last byte

DOI:10.1145/1839676.1839700 Peter Winkler Puzzled Rectangles Galore Welcome to three new puzzles. Solutions to the first two will be published next month; the third is (as yet) unsolved. In each, the issue is how your intuition matches up with the mathematics.

The hero of this column is the at designated spots in the simple, ordinary, axis-aligned room, blocking all possible rectangle. Looking out the shots by the enemy. How many window, how many do you students do you need? see? My view at the moment of You may assume for this Cambridge, MA, easily takes in purpose that you, your enemy, more than one thousand, and the students are all slim mostly windows. Asking new enough to be considered questions about an old figure points (viewed from above), helps us see it in a new light. rather than solid figures in 3D space. If, for example, you had Packing rectangles with six given dots at A large rectangle in the continuum many graduate their lower-left-hand corners. 1.plane is partitioned into students, you could place them The conjecture is that there a finite number of smaller around you in a circle, with the is always a way to choose the rectangles, each with either enemy outside. But you can do small rectangles so they cover integer width or integer height; better… at least half the area of the big that is, its width, height, or rectangle. Be the first ever to both width and height are Before you (on the plane) prove it. A far as I know, no whole numbers of units. Now 3.is a large rectangle one has succeeded in even prove that the large rectangle, containing a finite number of showing you can cover any likewise, has integer width or distinct dots, one of which is fixed fraction (say, 1/100) of the height (or both). at the rectangle’s lower-left- area of the original rectangle. hand corner. Your objective Alternatively, if you You are in a large is to pack smaller, disjoint reject the conjecture, find 2.rectangular room with rectangles into the big one a counterexample, a way to mirrored walls. Your mortal (with sides parallel to those of distribute the dots (be sure enemy, armed with a laser the big one) in such a way that to include the lower-left-hand gun, is elsewhere in the room. each small rectangle includes corner of the big rectangle) so As your only defense, you one of the dots as its own there is, provably, no way to do may summon a number of lower-left-hand corner; see the the packing so as to cover half graduate students to stand figure for an example. the area of the big rectangle.

All readers are encouraged to submit prospective puzzles for future columns to [email protected]. Peter Winkler ([email protected]) is Professor of Mathematics and of Computer Science and Albert Bradley Third Century Professor in the Sciences at Dartmouth College, Hanover, NH.

112 communications of the acm | november 2010 | vol. 53 | no. 11

Think Parallel..... It’s not just what we make. It’s what we make possible.

Advancing Technology Curriculum Driving Software Evolution Fostering Tomorrow’s Innovators

Learn more at: www.intel.com/thinkparallel

Copyright © 2009 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. *Other names and brands may be claimed as the property of others.