02.19 Computer | February 2019 | Volume 52, Number 2 | www.computer.org/computer

HOLDING US TOGETHER

Also in this issue: Magnetic-Induction Near-Field Communication (p. 63); Distributed Ledger Technology (p. 68)

B. Ramakrishna Rau Award: Call for Award Nominations. Deadline: 1 May 2019

Established in memory of Dr. B. (Bob) Ramakrishna Rau, this award recognizes his distinguished career in promoting and expanding the use of innovative computer microarchitecture techniques, including his innovation in compiler technology, his leadership in academic and industrial computer architecture, and his extremely high personal and ethical standards.

Award: A certificate and a $2,000 honorarium are awarded.

Presentation: The award is presented annually at the ACM/IEEE International Symposium on Microarchitecture.

2018 B. Ramakrishna Rau Award Recipient: Ravi Nair, IBM Thomas J. Watson Research Center

Nomination Requirements: The candidate will have made an outstanding innovative contribution or contributions to microarchitecture and use of novel microarchitectural techniques or compiler/architecture interfacing. It is hoped, but not required, that the winner will have also contributed to the computer microarchitecture community through teaching, mentoring, or community service.

Honoring contributions to the computer microarchitecture field. This award requires three endorsements. Nominations are being accepted electronically until 1 May 2019 at bit.ly/ramakrishna-rau. Questions? Visit bit.ly/ramakrishna-rau or email [email protected].

CONTENTS | FEBRUARY 2019

ABOUT THIS ISSUE: HOLDING US TOGETHER
The features in this issue draw from the past, offer a bit of present insight, and look to the future.

EIC'S MESSAGE
12 Holding Us Together
DAVID ALAN GRIER

FEATURES
14 A Pragmatic View on Code Complexity Management
VARD ANTINYAN, ANNA B. SANDBERG, AND MIROSLAW STARON
23 Software Systems With Antifragility to Downtime
KJELL J. HOLE AND CHRISTIAN OTTERSTAD
32 Algorithms: Law and Regulations
PHILIP TRELEAVEN, JEREMY BARNETT, AND ADRIANO KOSHIYAMA
41 KDD Cup 99 Data Sets: A Perspective on the Role of Data Sets in Network Intrusion Detection Research
KAMRAN SIDDIQUE, ZAHID AKHTAR, FARRUKH ASLAM KHAN, AND YANGWOO KIM
52 Rack-Scale Capabilities: Fine-Grained Protection for Large-Scale Memories
KIRK M. BRESNIKER, PAOLO FARABOSCHI, AVI MENDELSON, DEJAN MILOJICIC, TIMOTHY ROSCOE, AND ROBERT N.M. WATSON

COLUMNS
4 SPOTLIGHT ON TRANSACTIONS: Microexpressions: A Chance for Computers to Beat Humans at Detecting Hidden Emotions?
BJÖRN SCHULLER
9 50 & 25 YEARS AGO: Computer, February 1969 and 1994
ERICH NEUHOLD
63 THE IOT CONNECTION: NFMI: Connectivity for Short-Range IoT Applications
AMITANGSHU PAL AND KRISHNA KANT
68 CYBERTRUST: Rethinking Distributed Ledger Technology
RICK KUHN, DYLAN YAGA, AND JEFFREY VOAS

DEPARTMENTS
6 Elsewhere in the CS
LORI CAMERON

MEMBERSHIP NEWS
Cover 3 IEEE Computer Society Information

Circulation: Computer (ISSN 0018-9162) is published monthly by the IEEE Computer Society. IEEE Headquarters: Three Park Avenue, 17th Floor, New York, NY 10016-5997. IEEE Computer Society Publications Office: 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; voice +1 714 821 8380; fax +1 714 821 4010. IEEE Computer Society Headquarters: 2001 L Street NW, Suite 700, Washington, DC 20036. IEEE Computer Society membership includes a subscription to Computer magazine.

Postmaster: Send undelivered copies and address changes to Computer, IEEE Membership Processing Dept., 445 Hoes Lane, Piscataway, NJ 08855. Periodicals Postage Paid at New York, New York, and at additional mailing offices. Canadian GST #125634188. Canada Post Corporation (Canadian distribution) publications mail agreement number 40013885. Return undeliverable Canadian addresses to PO Box 122, Niagara Falls, ON L2E 6S8 Canada. Printed in USA.

See www.computer.org/computer-multimedia for multimedia content related to the features in this issue.

IEEE COMPUTER SOCIETY: http://computer.org // +1 714 821 8380
COMPUTER: http://computer.org/computer // [email protected]

EDITOR IN CHIEF: David Alan Grier, Djaghe LLC, [email protected]
ASSOCIATE EDITORS IN CHIEF: Elisa Bertino, Purdue University, [email protected]; Rohit Kapur, Synopsys, [email protected]
ASSOCIATE EDITOR IN CHIEF, PERSPECTIVES: Jean-Marc Jézéquel, University of Rennes, [email protected]
ASSOCIATE EDITOR IN CHIEF, COMPUTING PRACTICES: George K. Thiruvathukal, Loyola University Chicago, [email protected]
2019 IEEE COMPUTER SOCIETY PRESIDENT: Cecilia Metra, University of Bologna, Italy, [email protected]

AREA EDITORS
BIG DATA AND DATA ANALYTICS: Naren Ramakrishnan, Virginia Tech; Ravi Kumar, Google
CLOUD COMPUTING: Schahram Dustdar, TU Wien
COMPUTER ARCHITECTURES: David H. Albonesi, Cornell University; Greg Byrd, North Carolina State University
CYBER-PHYSICAL SYSTEMS: Oleg Sokolsky, University of Pennsylvania
DIGITAL HEALTH: Christopher Nugent, Ulster University
HIGH-PERFORMANCE COMPUTING: Erik DeBenedictis, Sandia National Laboratories; Vladimir Getov, University of Westminster
INTERNET OF THINGS: Michael Beigl, Karlsruhe Institute of Technology
SECURITY AND PRIVACY: Jeffrey M. Voas, IEEE Fellow
VISION, VISUALIZATION, AND AUGMENTATION: Mike J. Daily, HRL Laboratories

COLUMN AND DEPARTMENT EDITORS
AFTERSHOCK: Hal Berghel, University of Nevada, Las Vegas; Robert N. Charette, ITABHI Corporation; John L. King, University of Michigan
COMPUTING EDUCATION: Ann E.K. Sobel, Miami University
COMPUTING THROUGH TIME: Ergun Akleman, Texas A&M
CYBER-PHYSICAL SYSTEMS: Dimitrios Serpanos, University of Patras
CYBERTRUST: Jeffrey M. Voas, NIST
50 & 25 YEARS AGO: Erich Neuhold, University of Vienna
THE IOT CONNECTION: Trevor Pering, Google
OPEN SOURCE EXPANDED: Dirk Riehle, University of Erlangen–Nuremberg
OUT OF BAND: Hal Berghel, University of Nevada, Las Vegas
THE POLICY CORNER: Mina J. Hanna, Synopsys
REBOOTING COMPUTING: Erik DeBenedictis, Sandia National Laboratories
SPOTLIGHT ON TRANSACTIONS: Ron Vetter, University of North Carolina Wilmington
STANDARDS: Forrest "Don" Wright, Standards Strategies, LLC
WEB EDITOR: Zeljko Obrenovic, Incision

ADVISORY PANEL Doris L. Carver, Louisiana State University (EIC Emeritus) Carl K. Chang, Iowa State University (EIC Emeritus) Bob Colwell, Consultant Bill Schilit, Google Bruce Shriver, Consultant (EIC Emeritus) Ron Vetter, University of North Carolina Wilmington (EIC Emeritus) Alf Weaver, University of Virginia

CS PUBLICATIONS BOARD: Fabrizio Lombardi (VP of Publications), Alfredo Benso, Cristiana Bolchini, Javier Bruguera, Carl K. Chang, Fred Douglis, Sumi Helal, Shi-Min Hu, Sy-Yen Kuo, Ming C. Lin, Stefano Zanero, Daniel Zeng

MAGAZINE OPERATIONS COMMITTEE: Sumi Helal (Chair), Irena Bojanova, Jim X. Chen, Shu-Ching Chen, Gerardo Con Diaz, David Alan Grier, Lizy K. John, Marc Langheinrich, Torsten Möller, David Nicol, Ipek Ozkaya, George Pallis, VS Subrahmanian

COMPUTER STAFF
Senior Managing Editor: Geraldine Krolin-Taylor, [email protected]
Cover Design: Matthew Cooper
Senior Art Director: Janet Dudar
Publications Portfolio Manager: Carrie Clark
Senior Advertising Coordinator: Debbie Sims
Publisher: Robin Baldwin
Peer Review Administrator: [email protected]
IEEE Computer Society Membership Director: Erik Berkowitz
IEEE Computer Society Executive Director: Melissa Russell

IEEE PUBLISHING OPERATIONS
Senior Director, Publishing Operations: Dawn M. Melley
Director, Editorial Services: Kevin Lisankie
Director, Production Services: Peter M. Tuohy
Associate Director, Information Conversion and Editorial Support: Neelam Khinvasara



SPOTLIGHT ON TRANSACTIONS

Microexpressions: A Chance for Computers to Beat Humans at Detecting Hidden Emotions?

Björn Schuller, Imperial College London and University of Augsburg

This installment of Computer's series highlighting the work published in IEEE Computer Society journals comes from IEEE Transactions on Affective Computing.

Johnny English, the British MI7 spy portrayed on film by Rowan Atkinson, has already experienced the treacherous nature of microexpression (ME) analysis: in Johnny English Reborn, one movie in the blockbuster series, a high-frame-rate moving image of the spy's face, significantly slowed down, reveals his genuine feelings, otherwise professionally hidden. But in our world, where are such computers that automatically and reliably retrieve and interpret facial expressions?

In the recent Finnish–British contribution "Towards Reading Hidden Emotions: A Comparative Study of Spontaneous Micro-Expression Spotting and Recognition Methods" (IEEE Transactions on Affective Computing, vol. 9, no. 4, 2018, pp. 563–577), Xiaobai Li, Xiaopeng Hong, Antti Moilanen, Xiaohua Huang, Tomas Pfister, Guoying Zhao, and Matti Pietikäinen discuss automatic computational analysis of "rapid, involuntary facial expressions which reveal emotions that people do not intend to show." According to the authors, automatic ME analysis has so far been attempted on posed videos only, and reported ME recognition performance is low. As a step forward in this promising discipline, the authors introduce spotting of spontaneous MEs even in arbitrarily long video recordings without the need of training. As to recognition, their suggested framework surpasses previous efforts significantly on two demanding standard spontaneous ME databases. In additional tests, their techniques for automatic ME recognition significantly outperform humans at ME recognition and match human skills in combined ME spotting and recognition.

In the authors' framework, the ability to spot MEs relies on appearance-based feature descriptors. The two inner eye corners and a nasal spine point are located first and then tracked by the Kanade–Lucas–Tomasi algorithm. The points also serve for in-plane rotation and face-size normalization, but 3D head rotation is ignored. The points further serve to fix a 6 × 6 grid. Calculations are made of local binary pattern (LBP) histograms and, alternatively, histograms of optical flow per block. ME spotting is based on analyses of differences between the features of a current frame and the average facial features over a time window, with thresholding and peak detection. Once MEs are spotted, extracted features are fed into linear support vector machines for subject-independent classification into positive, negative, or surprised MEs. The authors compare LBP-type features with histograms of oriented gradients and of image-gradient orientation, based on near-infrared and RGB-color images. They also compare the effects of different temporal interpolation and motion magnification methods to counter the low intensity of MEs after face alignment. In their final system, they recommend LBP features for spotting. For recognition, they recommend four and 10 as parameters for magnification and temporal interpolation and HIGO-type features.

The task of analyzing spontaneous MEs is challenging as they tend to be brief and low in intensity. Nonetheless, computer systems should be able to match or surpass the human ability to read and interpret facial expressions. Thus, many expected applications for automatic ME recognition and analysis will soon appear in many fields, including forensics and psychotherapy. However, further efforts are needed to ensure that technologies for spotting MEs are not easily fooled by, for example, literal blinks of an eye. Also, advances are needed to make such technologies practical and reliable in real-world settings. Advances in deep learning may be useful in helping to achieve these goals sooner rather than later.

BJÖRN SCHULLER is a professor of artificial intelligence at Imperial College London and of embedded intelligence for health care and well-being at the University of Augsburg. Contact him at [email protected].

EDITOR RON VETTER, University of North Carolina Wilmington; [email protected]
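To make the spotting stage more concrete, here is a minimal sketch in Python of block-wise LBP histogram features and a feature-difference spotting signal with thresholding and peak picking, in the spirit of the framework summarized above. The 6 × 6 grid follows the article; the chi-square-style distance, the window length, and the threshold handling are illustrative assumptions rather than the authors' exact implementation.

import numpy as np

def lbp_codes(gray):
    # Basic 8-neighbor local binary pattern code for each interior pixel.
    g = gray.astype(np.int64)
    center = g[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes += (neighbor >= center).astype(np.int64) * (1 << bit)
    return codes  # values in 0..255

def frame_features(gray, grid=(6, 6)):
    # Concatenated per-block LBP histograms; the 6 x 6 grid follows the article.
    codes = lbp_codes(gray)
    h, w = codes.shape
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = codes[i * h // grid[0]:(i + 1) * h // grid[0],
                          j * w // grid[1]:(j + 1) * w // grid[1]]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

def spotting_signal(features, window=65):
    # Distance between each frame's features and the average features over a
    # sliding time window (a chi-square-style distance is assumed here).
    features = np.asarray(features, dtype=float)
    signal = np.zeros(len(features))
    for t in range(len(features)):
        lo, hi = max(0, t - window // 2), min(len(features), t + window // 2 + 1)
        avg = features[lo:hi].mean(axis=0)
        diff = features[t] - avg
        signal[t] = np.sum(diff * diff / (features[t] + avg + 1e-9))
    return signal

def spot_peaks(signal, threshold):
    # Frames where the signal is a local maximum above the (assumed) threshold.
    return [t for t in range(1, len(signal) - 1)
            if signal[t] > threshold
            and signal[t] >= signal[t - 1] and signal[t] >= signal[t + 1]]

Frames spotted this way would then be passed to a classifier (the authors use linear support vector machines) for the positive, negative, or surprised decision.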


ELSEWHERE IN THE CS
EDITOR LORI CAMERON, [email protected]

Computer Highlights Society Magazines

The IEEE Computer Society's lineup of 12 peer-reviewed technical magazines covers cutting-edge topics ranging from software design and computer graphics to Internet computing and security, from scientific applications and machine intelligence to visualization and microchip design. Here are highlights from recent issues.

OpenSpace: Bringing NASA Missions to the Public
This article from the September/October 2018 issue of IEEE Computer Graphics and Applications presents OpenSpace, an open-source astrovisualization software project designed to bridge the gap between scientific discoveries and their public dissemination. There is a wealth of data for space missions from NASA and other sources. OpenSpace brings this data together and combines it in a range of immersive settings. Through nonlinear storytelling and guided exploration, interactive immersive experiences help the public to engage with advanced space mission data and models and, thus, be better informed and educated about NASA missions, the solar system, and outer space. The authors demonstrate this capability by exploring the OSIRIS-REx mission.

Evidence-Based Reasoning in Intelligence Analysis: Structured Methodology and System
This article from the November/December 2018 issue of Computing in Science & Engineering presents a scientific-method-based approach and system that helps intelligence analysts and others reason better when addressing issues involving incomplete, contradictory, ambiguous, and missing information. The approach integrates the analyst's imagination and expertise with the computer's knowledge and critical reasoning.

Interview with Charles Bigelow
Charles Bigelow's career parallels the development of digital font technology. He has designed fonts and consulted about font technology with many of the companies that created desktop publishing systems. He has also written extensively on digital font technology and taught at the Rhode Island School of Design, Stanford, and the Rochester Institute of Technology. Read more in the July–September 2018 issue of IEEE Annals of the History of Computing.

Toward the Musicologist-Driven Mining of Handwritten Scores
For many decades, historical musicologists have been seeking objective and powerful techniques to collect, analyze, and verify their findings. The aim of this study, published in the July/August 2018 issue of IEEE Intelligent Systems, is to show the importance of such domain-specific problems to achieve actionable knowledge discovery in the real world. The focus is on finding evidence for the chronological ordering of J.S. Bach's manuscripts by proposing a musicologist-driven mining method for extracting quantitative information from early music manuscripts. Bach's C-clefs were extracted from a wide range of manuscripts under the direction of domain experts and, with these, C-clefs were classified. The proposed methods were evaluated on a data set containing more than 1,000 clefs extracted from Bach's manuscripts. The results show more than 70% accuracy for dating Bach's manuscripts. Dating of Bach's lost manuscripts was quantitatively hypothesized, providing a rough barometer to combine with other evidence to evaluate musicologists' hypotheses.

A Crossmodal Approach to Multimodal Fusion in Video Hyperlinking
With the recent resurgence of neural networks and the proliferation of massive amounts of unlabeled multimodal data, recommendation systems and multimodal retrieval systems based on continuous representation spaces and deep-learning methods are becoming of great interest. Multimodal representations are typically obtained with autoencoders that reconstruct multimodal data. In this article from the April–June 2018 issue of IEEE MultiMedia, the authors describe an alternative method to perform high-level multimodal fusion that leverages crossmodal translation by using symmetrical encoders cast into a bidirectional deep neural network (BiDNN). Using the lessons learned from multimodal retrieval, they present a BiDNN-based system that performs video hyperlinking and recommends interesting video segments to a viewer. Results established using TRECVID's 2016 video hyperlinking benchmarking initiative show that their method obtained the best score, thus defining the state of the art.

Efficient Cloud Provisioning for Video Transcoding: Review, Open Challenges, and Future Opportunities
Video transcoding is the process of encoding an initial video sequence into multiple sequences of different bit rates, resolutions, and video standards, to view it on devices of various capabilities and with various network access characteristics. Because video coding is a computationally expensive process and the amount of video in social media networks drastically increases every year, large media providers' demand for transcoding cloud services will continue to rise. This article, which appeared in the September/October 2018 issue of IEEE Internet Computing, surveys related state-of-the-art cloud services. It also summarizes research on video transcoding and provides indicative results for a transcoding scenario of interest related to Facebook. Finally, it illustrates open challenges in the field and outlines paths for future research.

Teaching Pervasive Computing in Liberal Arts Colleges
In this article from the July–September 2018 issue of IEEE Pervasive Computing, the authors reflect on the critical role of liberal arts education in fostering creative, collaborative, and ethical innovators for pervasive computing. They discuss why liberal arts education is important as a foundation for advanced studies and leadership in ubiquitous computing, and they share their experiences teaching pervasive computing in liberal arts colleges.

Not in Name Alone: A Memristive Memory Processing Unit for Real In-Memory Processing
Data movement between processing and memory is the root cause of the limited performance and energy efficiency in modern von Neumann systems. To overcome the data-movement bottleneck, the authors of this article from the September/October 2018 issue of IEEE Micro present the memristive memory-processing unit, a real processing-in-memory system in which the computation is done directly in the memory cells, thus eliminating the need for data transfers. Furthermore, with its enormous inner parallelism, this system is ideal for data-intensive applications that are based on single instruction, multiple data, therefore providing high throughput and energy efficiency.

Fingerprinting for Cyberphysical System Security: Device Physics Matters Too
Due to the increase in attacks against cyberphysical systems, it is important to develop novel solutions to secure these critical systems. System security can be improved by using the physics of process actuators (that is, devices). Device physics can be used to generate device fingerprints to increase the integrity of responses from process actuators. Read more in the September/October 2018 issue of IEEE Security & Privacy.

A Cognitive Assistance Framework for Supporting Human Workers in Industrial Tasks
Cognitive systems are capable of human-like actions such as perception, learning, planning, reasoning, self- and context-awareness, interaction, and performing actions in unstructured environments. The authors of this article from the September/October 2018 issue of IT Professional present an implemented framework for an interactive cognitive system that enables human-centered, adaptive industrial assistance in cooperative, complex human-in-the-loop assembly tasks. The functionality of the cognitive system includes enabling perception and awareness, understanding and interpreting situations, reasoning, decision making, and autonomous acting.

Software Engineering's Top Topics, Trends, and Researchers
This article on software engineering's 50th anniversary, which is part of the September/October 2018 issue of IEEE Software, offers an overview of the twists, turns, and numerous redirections seen over the years in the software engineering (SE) research literature. Nearly a dozen topics have dominated the past few decades of SE research, and they have been redirected many times. Some are gaining popularity, whereas others are becoming increasingly rare.

Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author's or firm's opinion. Inclusion in Computer does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space.

Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own webservers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copyediting, proofreading, and formatting added by IEEE. For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html. Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or [email protected]. Copyright © 2019 IEEE. All rights reserved.

Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

50 & 25 YEARS AGO

EDITOR ERICH NEUHOLD University of Vienna [email protected]

FEBRUARY 1969
In the early years, Computer was published only bimonthly. Therefore, we will have to skip our interesting and informative extractions for this month. The next column will appear in the March issue of Computer, and we hope you will eagerly wait for its publication.

FEBRUARY 1994
Is 1994 1984 All Over Again? (p. 8) "This year, we face the prospect of yet another profound development. Instead of micro, it is a new era of multimedia telecommunications—the convergence of micros with communications networks. … 1994 marks the beginning of a roller-coaster ride into yet another era of computing. We at Computer Magazine plan to be at the forefront of this activity with articles and columns that aim squarely at the core of these technologies. Our new "Computing Milieu" section tackles the sociological and philosophical underpinnings of this new era, our "Computing Practices" section illustrates how to apply the new technologies, and our regular research articles give you a look at what is to come."

Exploiting the Parallelism Available in Loops (p. 13) "An efficient approach for extracting this potential parallelism is to concentrate on the parallelism available in loops. Since the body of a loop may be executed many times, loops often comprise a large portion of a program's parallelism. A variety of parallel computer architectures and compilation techniques have been proposed to exploit this parallelism at different granularities." (p. 14) "We're interested in parallelism limitations due to data dependences, since, at least in theory, we could eliminate resource dependences with additional hardware. For simplicity, we'll limit the analysis to singly nested loops in which the loop index has been normalized to vary from 1 to n with a stride (increment between iterations) of 1." (p. 16) "Fine-grained parallel architectures exploit loop parallelism at the instruction set level by issuing several instructions or operations in a single cycle. At runtime, dependences between operations must be checked either statically by the compiler or dynamically by the hardware to ensure that only independent operations are issued simultaneously." (p. 19) "While fine-grained parallel architectures exploit parallelism at the instruction level, coarse-grained architectures exploit it by scheduling entire iterations on separate processors. On the shared-memory multiprocessor shown in Figure 10, for example, the independent tasks to be scheduled are the individual instantiations of the iterations, each with a unique value of the loop index." (Editor's note: The article evaluates various parallelization strategies for both coarse-grained and fine-grained approaches. However, it restricts its analysis to situations that appear in scientific programming, for example, matrix-based situations. In today's situations, statistical analysis of large amounts of data—for example, for information mining—is an important part of parallelization; therefore, novel distribution techniques had to be found.)

Defining Requirements for a Standard Simulation Environment (p. 28): "In this article, we define a reference model for general-purpose discrete-event simulation environments as well as the associated requirements for the model's functional layers, to be used as the basis for a future standard." (p. 30) "Reference model … The model consists of five distinct layers. The top layer, or application layer 4, can access all layers so that developers can add application-specific constructs to their environments. The lower layers include properties that enable implementation of similar features at higher levels. The lowest layer 0 provides the basic language-level support for the environment and can be accessed by all other layers. Layer 1 defines the requirements for model specification. Layer 2 deals with model knowledge management. Layer 3 is the system design layer." (Editor's note: The article proposes a reference architecture for simulation systems that, despite looking attractive, never managed to become the reference model in simulation. Simulink from MathWorks is widely used, but other approaches still play important roles.)


A Methodology for Design, Test, and Evaluation of Real-Time Systems (p. 35): "This methodology can reduce project costs and shorten schedules because it requires performance evaluation and integration testing early, when problems are generally easier and less costly to correct. … This article presents a methodology that is suitable for use as part of either a prototyping approach or a component-reuse approach. This methodology integrates modeling and simulation as well as developmental and operational testing over the life cycle. The type of systems or components we address operate in real time." (p. 36) "This methodology consists of an analytical approach for representing (1) a system and its environment, and (2) a hardware and software architecture. Both the methodology and the modeling-and-simulation/test-and-evaluation suite based on this architecture are called ASSET (an acronym for aerospace systems simulation, engineering, and test tool)." (Editor's note: The publication analyzes in detail a design/simulation/implementation approach for an airborne interferometer processing system.)

Ada System Dependency Analyzer Tool (p. 49): "Strongly typed languages like Ada promote significant extensibility and reusability because they support safe, error-free interfaces between different software packages. At the same time, this ability to embed diverse systems within an application, particularly commercial off-the-shelf (COTS) software, often unintentionally adds to architectural complexity. … With the large, complex software systems, automated tools are indispensable for identifying the architectural components, the structure that interconnects them, and other subtle dependencies. This article describes the construction of an Ada System Dependency Analyzer (SDA), a software architecture analysis tool that generates a quantitative snapshot of an Ada application's software architecture. The SDA can process thousands of Ada source files during a single run and report on them as a group of files comprising a single Ada system." (p. 54) "Our object-oriented design method provides the additional benefit of letting us add new features to the SDA in a straightforward manner. For example, to add a new capability, a handle routine is added to the parser, providing a point of call for the new construct when it is detected in the grammar." (Editor's note: The authors claim that the approach can easily be adopted to other high-level programming environments but do not indicate how this could be done, which is especially interesting in light of the ADA idiosyncrasies mentioned in the next article.)

Implementing a Software Configuration Management Environment (p. 56) "Configuration management is concerned with maintaining a product's integrity. To do so, a successful CM environment requires three basic capabilities: (1) version control, (2) a check-out/check-in facility, and (3) a buffered-compare program." (p. 59) "There are a few other items to consider when developing a software CM environment: (1) establishing a standards-checking program, (2) implementing an automated problem-reporting system, (3) automating the generation of configuration status accounting reports, and (4) providing an automated metrics acquisition and reporting capability." (Editor's note: The article develops and describes a software configuration management system not so different from others existing in 1994, but it also relates it to the ADA environment developed in the previous article.)

Computer Science for the Many (p. 62) "Traditional literacy courses were developed some years ago and are still taught at many universities throughout the world. These courses emphasize learning the vocabulary of computing; gaining some experience with software packages, such as word processing, spreadsheets, and systems; and studying the history and social impact of computing. … Unfortunately, literacy courses address the syntax and form of the field rather than the structure and ideas. They enable students to use machines, but they do not engage their intellects in the real excitement of computing. Memorized vocabulary will not last unless tied to a real understanding, and the applications packages students learn will soon be out of date." (p. 63) "The philosophy is that if students develop sufficient knowledge of what computers can do, and learn how to get started doing those things, they will have gained knowledge and skills of lasting value. The course teaches this material by introducing the students to programming and by teaching them the fundamental mechanisms of computer hardware and software." (p. 69) "Artificial intelligence. This topic divides into two parts, knowledge representation and reasoning. The knowledge representation lectures introduce several common representation schemes. … The reasoning lectures show several search methodologies and applications of problem solving. … In concluding the artificial intelligence study, instructors try to characterize the field's current level of success and warn against over optimism about the future. Students are told that almost any intelligent phenomenon—learning, problem solving, creativity, natural language processing, vision, and so forth—can be simulated to a very limited extent. However, no artificially intelligent phenomenon of this nature has been exhibited to a great degree, nor is this likely in the foreseeable future." (Editor's note: I consider this curriculum proposal to be well thought out and well argued. Not knowing the U.S. scene for such basic computer courses in academia, I am somewhat surprised that the subjects covered apparently are not contained in such courses. In Germany and Austria, such content was and is more or less standard.)

Former IBM Chief T.J. Watson, Jr., dies (p. 84) "In 1977, when he received an honorary Doctor of Civil Law degree from Oxford University, Watson summed up concerns that had occupied him in both his business and his public service roles: 'One of the most important problems we face today, as techniques and systems become more and more pervasive, is the risk of missing that fine, human point that may well make the difference between success and failure, fair and unfair, right and wrong. … No IBM computer has an education in the humanities. … There isn't a single one with moral standards, conscience, soul, or heart. … I say that for every step forward in the direction of more so-called scientific management, we must take one or more steps toward improved preservation of human values.'" (Editor's note: A statement I fully subscribe to. However, more than 40 years later, I sometimes have the strong impression that we have forgotten this fact and trust computers, clouds, big data, and deep learning more than our own human values, intelligence, and ethics.)

The Open Channel: Natural Talent for Computing (p. 120) "Two extremes. Today, I am teaching in the laboratory again. I'm moving between people who are convinced that they know everything about computers and people who are convinced that they will never know anything about computers. Somewhere in here are future computer science stars, but they're not so easy to spot anymore. I'm thinking, maybe we need fewer gun nuts and fewer computer nuts, and more people who can hit what they're aiming at. I'm beginning to believe that natural talent is a matter of a fresh, patient, and open mind." (Editor's note: What a true statement and one that is often completely disregarded by our "modern" recruiting methods.)


EIC'S MESSAGE
EDITOR IN CHIEF DAVID ALAN GRIER, Djaghe, LLC; [email protected]

Holding Us Together

David Alan Grier, Djaghe, LLC

The articles in this issue draw from the past, offer a bit of present insight, and look to the future. Taken together, you get a sense of what the entire field of computing is doing today and how it all holds together.

The fields of computer science, computer engineering, and software engineering are remarkably broad, and few people can fully encompass all three. Fewer still can clearly see how these fields interact with each other, how each borrows from the other and comments on a different approach to the problems of computation, and how they work together as a complex whole. In this issue, we present five articles that demonstrate the diversity of computation and point to the common goal of making computation useful.

If I were to identify a good entry point for this issue, I might suggest Antinyan, Sandberg, and Staron's article on code complexity, "A Pragmatic View on Code Complexity Management." The authors identify an important point in this article: There is a big difference between complexity that is essential and complexity that is a byproduct of the engineering process. All of us are engaged in the work of managing complexity and ensuring that our systems have enough elements to work properly but not so many that they become obscure or brittle.

Since complexity is often the enemy of agility and robustness, you may want to jump next to "Software Systems With Antifragility to Downtime" by Hole and Otterstad, which discusses how to think about systems that are robust and should operate continually. These authors approach the subject from a negative point of view when they write about "software systems that are antifragile against downtime." As a good editor, I normally would highlight the text and tell them to phrase their results in a positive way. However, they have good reason to approach the subject from a negative viewpoint, since antifragility really is more of a strategy for building good systems and avoiding the problems that bring software to a stop.

Failed systems, through either complexity or fragility, can indicate a failed engineering process and invite a legal investigation. However, there often are legal implications for software that is functioning properly. In "Algorithms: Law and Regulations," Treleaven, Barnett, and Koshiyama ask if there is a new legal status for algorithms and if we need to start thinking about how algorithms interact with legal institutions.

Legal structures are important because we don't always see the implications of our work. Siddique, Akhtar, Khan, and Kim, the authors of the next article, "KDD Cup 99 Data Sets: A Perspective on the Role of Data Sets in Network Intrusion Detection Research," look at a classic problem of computer science that was proposed at the 1999 Knowledge Discovery and Data Mining Conference (KDD). At that conference, DARPA proposed a contest to build an algorithm that could detect potential intruders from the patterns in network traffic. (The contest and its result are in IEEE Xplore at https://ieeexplore.ieee.org/document/821515.) This current article suggests that, while the contest may have been important to the KDD field, its data may not be the best example of current intruder traffic.

The last article, by Bresniker et al., "Rack-Scale Capabilities: Fine-Grained Protection for Large-Scale Memories," returns to the hardware roots of computing but does so with a modern twist. It looks at the problem of properly securing networked memory units, particularly the fabric-attached memory that has been part of Hewlett-Packard's research on new network architectures. Like all of the articles here, this one presents an interesting solution, identifies the strengths and weaknesses of that solution, and suggests the next steps for research.

Of course, that is the common feature of the articles in this issue. They draw from the past, give some insight into the present, and look to the future. That is what the entire field of computing is doing today.

DAVID ALAN GRIER is a principal with Djaghe, LLC. He is a Fellow of IEEE. Contact him at [email protected].

Call for Articles
IEEE Software seeks practical, readable articles that will appeal to experts and nonexperts alike. The magazine aims to deliver reliable information to software developers and managers to help them stay on top of rapid technology change. Submissions must be original and no more than 4,700 words, including 250 words for each table and figure.
Author guidelines: www.computer.org/software/author
Further details: [email protected]
www.computer.org/software

COVER FEATURE: HOLDING US TOGETHER

A Pragmatic View on Code Complexity Management

Vard Antinyan and Anna B. Sandberg, Volvo Car Group; Miroslaw Staron, University of Gothenburg

This article endeavors to underpin complexity understanding by scrutinizing how developers experience code complexity and how certain code characteristics impact complexity. The results provide a distinction between essential and accidental code characteristics and help in evaluating the influence of these characteristics on complexity increase.

Source code, unlike other artifacts, is structurally sophisticated, representationally abstract, and progressively versatile over time. Therefore, it is considered one of the most complex artifacts created by humans. The complexity of any living product code grows,1 but there are no well-accepted strategies to manage it. Code complexity measures created previously are not of much help in making code simpler because they cannot distinguish accidental complexity from one that is essential. The best that they can do is to predict maintainability and defects.2 A pivotal question that still needs to be answered is how can one identify the accidental part of complexity to reduce it?

COMPLEXITY
The concept of complexity emerges from systems theory to elucidate the difficulty of system understanding due to its elements and interconnections. Generally, it has been difficult to define complexity comprehensively. One of the most comprehensive definitions is provided by Maier:3 "Complexity is an emergent property of a system due to its elements and interconnections."

This definition assumes that complexity is an intrinsic quality of a system. More elements and further interconnections in a system create more difficulty in understanding how the system works. Because the difficulty of understanding emerges from an increasing number of elements and interconnections, by which complexity is defined, then that difficulty of understanding emerges from complexity.

Here we should notice that, even though understanding is a purely cognitive quality and complexity is a purely system quality, they are unshakably connected in their essence. In fact, it would be hard to describe what complexity is without considering that it is sensed and interpreted by the human brain. Hence, to scrutinize the impact of complexity on system development, one needs to understand such functions of the brain that sense information, manipulate it, and make decisions accordingly when the system is being developed.

COMPLEXITY IN THE BRAIN
Any information that can be obtained by human senses is filtered with the help of transient sensory stores in the brain.4 If the information is irrelevant, it tends to disappear from these sensory stores quickly. If the information is relevant, however, it is stored in short-term memory. Short-term memory is an intermediate memory storage system that can hold information for about 30 s to permit its use by working memory. Working memory, which is the executive function of the brain, then uses this information for either decision making or rehearsal for storing it in long-term memory. Long-term memory is a large memory storage that requires elaborate rehearsal of any information to store and activate it when working memory requires it.

Short-term memory has two important qualities that seem to have a crucial impact on the effectiveness of processing information and making decisions. The first is its remarkably small capacity. On average, short-term memory can hold no more than seven elementary concepts at a time. This means that the large amount of information obtained through the human senses cannot be stored and managed at once. To manage the continuous flow of information into short-term memory, working memory decomposes information into small and relatively isolated units. Because a decomposed unit is not completely independent, its linkages with the directly related units are also stored. The more isolated a unit is, the fewer linkages there are to remember, and, therefore, the easier it is to process the information. A summary of experiments on the capacity of short-term memory was reported by Miller,5 who also observed that, irrespective of the subject of information, this capacity remains in the same limit.

The second quality of short-term memory is that it gives quick access to its information. As opposed to long-term memory, which requires continuous (and sometimes long-lasting) exercises for accessing information, short-term memory permits access to its total content in milliseconds. To understand how short-term memory, long-term memory, and working memory deal with code complexity when maintaining code, we provide the following example.

A developer always works on a small unit of code at a time, and the only information the developer's short-term memory holds is a small number of elements and their direct interconnections with the rest of the code. When the necessary modifications are complete, the developer's short-term memory unloads the current information and reloads the next unit of information. At all times, the developer's short-term memory holds only the information of the elements and interconnections of a small unit of code. One can imagine that the developer's short-term memory slides over the elementary code statements, continuously loading/unloading elements and their interconnections and making the necessary modifications, at all times holding no more than seven elementary concepts.

If the code is difficult to decompose into small and well-isolated units, however, short-term memory gets loaded and unloaded more intensively, so working memory can obtain a thorough understanding of the unit and make decisions. In this case, working memory (whether by intention or not) conducts an elaborative rehearsal process, which activates a part of long-term memory and stores a larger content of relevant information there. Working memory then can access the newly stored information in long-term memory relatively more easily and manipulate it for decision making.

The activation of long-term memory, nonetheless, takes a substantial amount of time and effort. Therefore, the developer experiences a sense of difficulty and tiredness. As a result, the likelihood of making mistakes will grow, and maintenance time will increase. Consequently, it is relevant to scrutinize the elementary code characteristics that make the smallest units of code highly interconnected (complex). Two questions are central to this scrutiny.

›› Which are the elementary characteristics of code that increase complexity?
›› Which of these characteristics are accidental and thus avoidable, and which are essential and thus unavoidable?


The next section answers these two questions.

EVALUATING CODE CHARACTERISTICS
Brooks6 claimed that complexity of code is essential because it is a corollary of essential structural characteristics of code. According to Brooks, because code characteristics are essential, there is little chance to decrease complexity. It is known that there are also representational and evolutional characteristics of code that are often accidental and can have a major role in complexity decrease.7,8

In 2012, when we were developing decision-support systems for Ericsson and the Volvo Group, a question often came up during discussions with software engineers: What is it that makes code complex? This question was particularly important because, initially, the McCabe complexity measure was used9 for complexity assessment, but software engineers found it to be a simplistic measure. Thereafter, we discussed this question as part of numerous group meetings with 12 software engineers in the companies. Ten code characteristics were identified that are omnipresent in code and perceived to have distinctive impacts on complexity. The next section discusses how avoidable these characteristics are when coding.

Essential and accidental characteristics
Here, a postulate is that the more avoidable a characteristic, the more it is accidental. The 10 characteristics are presented in Table 1. The third column of the table gives explanations of why and to what extent the characteristics can be avoided in code.

The first two characteristics in the table seem to be strictly essential, because operators and variables are core constituents for implementing problem domain tasks. A defined minimal number of variables and operators should be used in accomplishing a given programming task.

The next two characteristics, calls and conditional statements, are mostly essential for the same reason as the former two. The extensive use of them, however, can create many logical paths and abstract away the operations of the function, making the function difficult to comprehend. In particular, case decomposition can be used to avoid such a scenario, but often decomposition can be performed only at the expense of creating more coupled functions.

When it comes to logically unrelated tasks, it is difficult to decide to what extent these tasks are logically unrelated; therefore, it is difficult to understand when to avoid them by decomposition. Here, there can be two extreme options: an isolated but noncohesive function versus a group of coupled but cohesive functions. Depending on the task to be programmed, the tradeoff between coupling and cohesion can be different.

Without changing code, it is not possible to conduct software maintenance or development. Changes that are too frequent, however, may indicate rather unstable code. It is usually possible to avoid too-frequent changes by having a well-modularized architectural design that will prevent change waves across different code areas.

If the same unit of code is modified by many developers simultaneously, each developer needs to understand how his or her individual changes are related to changes made by colleagues. It is usually possible to avoid many developers making changes in the same code by implementing a modular architecture and thoughtful task distribution.

It is possible to avoid misleading comments by attentive commenting and updating comments over the development time, so they do not become obsolete and thus misleading. Many software engineers in industry are inclined to believe that comments should be embedded in the names of variables and functions, so the risk of obsoleted comments can be reduced. In times of intensive maintenance, however, some code areas can be changed quickly and repeatedly, making the updates of comments a challenging task.

Deep nesting can usually be avoided by making an independent function out of the nested block, merging several conditions with Boolean operators, or using return, break, and continue or equivalent operators, which will make the logical flow more linear. Lack of code structure can always be avoided by using meaningful variable and function names, adhering to correct indentations, and keeping code line length to an optimal limit.10 A name must not be too short to deliver its meaning as comprehensively as possible and must not be too long to be grasped in one look.

The impact on code complexity
Evaluating the impact of code characteristics on complexity is valuable because it can indicate how much complexity may be decreased. To perform such an evaluation, an online survey was conducted. The target respondents were software engineers working in seven large software development companies (Axis Communications, Ericsson, Grundfos, Jeppesen, SAAB Defense and Security, Volvo Group Global, and Volvo Car Group) and two universities (the University of Gothenburg and the University of Chalmers).

TABLE 1. The code characteristics and the extent of avoidance.

Characteristic | Description | Avoidable?

ESSENTIAL COMPLEXITY
Many operators | Many mathematical operators in a unit of code | They cannot be avoided because they are essential for conducting operations.
Many variables | Many variable declarations and use in a unit of code | They cannot be avoided because they are essential for task specifications and operations.
Many calls | Many unique invocations of methods or functions in a unit of code | They mostly cannot be avoided because they are essential for code modularity. The use of decomposition, however, allows for avoiding calling many complex functions in a unit of code.
Many conditional statements | Many conditional statements in a unit of code (e.g., if, while, for) | They mostly cannot be avoided because they are essential for executing different scenarios under different conditions. The use of decomposition, however, allows the extensive use of them in a unit of code to be avoided.

ACCIDENTAL COMPLEXITY
Many independent tasks | Many logically unrelated tasks that are jointly solved in a unit of code (cohesion) | This may or may not need to be avoided. Too little decomposition increases the internal complexity of functions, and too much decomposition increases the outbound complexity of functions.
Many changes | Many modifications made in a unit of code over a specified period of time | They may be avoided by maintaining a well-modularized architecture and stable code. Many changes, however, can be a corollary of active maintenance activities.
Many developers | Many developers that make simultaneous changes in a given unit of code over a specified period of time | They usually can be avoided by using well-defined task distribution and a well-modularized code architecture.
Many misleading comments | Many misleading or obsoleted comments in a unit of code | They mostly can be avoided by keeping the comments brief and updating them with changing code.
Deep nesting | Deep nesting level of conditional statements in a unit of code | This mostly can be avoided by refactoring the block into separate functions, combining the conditional tests, or using early returns.
Lack of structure | Incorrect indentations, misleading naming, and lengthy lines | These can always be avoided by using clean coding rules. These characteristics are representational and thus are not essential for code execution.
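To illustrate the "deep nesting" and "lack of structure" rows of Table 1, the following short before-and-after sketch (in Python, with invented order-processing names; it is not taken from the article or from the studied code bases) shows how early returns flatten a nested block while leaving the behavior unchanged.

from dataclasses import dataclass

@dataclass
class Customer:
    is_active: bool
    credit_limit: float

@dataclass
class Order:
    items: list
    customer: Customer
    total: float

def ship(order):
    # Hypothetical downstream action standing in for the real work.
    return "shipped"

# Deeply nested version: every condition adds a nesting level, so a reader's
# short-term memory must track all of the enclosing branches at once.
def process_order_nested(order):
    if order is not None:
        if order.items:
            if order.customer.is_active:
                if order.total <= order.customer.credit_limit:
                    return ship(order)
    return None

# Refactored version: guard clauses (early returns) keep the logical flow
# linear, as recommended in the "deep nesting" row of Table 1.
def process_order_flat(order):
    if order is None or not order.items:
        return None
    if not order.customer.is_active:
        return None
    if order.total > order.customer.credit_limit:
        return None
    return ship(order)

Both functions compute the same result; the refactored version keeps the maximum nesting depth at one, which removes exactly the accidental share of the complexity.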

Because software engineers work with code directly and develop a variety of software, their collective perception could reveal profound knowledge regarding how code characteristics impact complexity. Eighty-nine participants from the companies and 11 participants from the universities responded to the survey. We developed the survey questions and improved them by conducting two pilot studies with nine software engineers. As a result, each of the survey questions was organized into six ordinal Likert scale values with an additional "not answered" option. Each question targeted assessing the influence of a specified characteristic on code complexity as perceived by the respondents. For example, the question specific for variables was formulated as "Generally many variables in a unit of code make that code…." The respondents were asked to fill in the ellipses by one of the following seven options:

›› no complex
›› little complex
›› somewhat complex
›› rather complex
›› quite complex
›› very complex
›› not answered (N/A).

The rest of the questions were organized in a similar fashion.

Figure 1 presents the results of the survey. The vertical axis represents the number of answers, and the horizontal axis represents the code characteristics. The number of answers per ordinal value is color coded, so we can see which ordinal values are high for a given characteristic. Additionally, the statistical modes per characteristic are emphasized by specifying the number of respondents on the color that represents the mode. The accidental characteristics are positioned toward the left side, and the essential characteristics are positioned toward the right side of the horizontal axis.

The most important result the figure reveals is that, according to software engineers' knowledge and experience, accidental code characteristics have substantially higher impact on complexity than essential ones. Lack of structure and nesting depth, in particular, seem to have a decisive impact on complexity. Their modes are "very complex" (36 answers) and "quite complex" (37 answers). Overall, more than 80% of respondents selected the highest three values of the Likert scale for these characteristics (the red-orange area on the bars). Conversely, three essential characteristics of code—calls, variables, and operators—are considered to have little impact on complexity. The only essential code characteristic considered to have a significant impact is conditional statements. However, conditional statements can be used nested or not nested, and this consideration was not specified in the survey question; therefore, its influence might be overestimated if the nesting factor is excluded.

Accidental Complexity Essential Complexity

100

90

80 34 70 34 60 27 26 50 (%) 22 30 40 37 32 25 30

20 36 10

0 Lack of Deep Many Many Many Frequent Many Many Many Many Structure Nesting Misleading Developers Tasks Changes Conditional Calls Variables Operators Comments Statements N/A No Complex Little Complex Somewhat Complex Rather Complex Quite Complex Very Complex

FIGURE 1. The code characteristics and their influence on complexity and assortment.

18 COMPUTER WWW.COMPUTER.ORG/COMPUTER rather accidental characteristics, many 2014. In mid-2013, the researchers THE PROBLEM WITH developers and misleading comments, had already finished the research CURRENT COMPLEXITY are also considered to have significant project and left the company, so the MEASURES impact. Overall, the red-orange area of particularities of the refactoring The main goal of a complexity measure the left half of the figure is many times were not observed. Later in 2016, how- is to identify complex units of code and larger than that of the right half. Hence, ever, the researchers could measure indicate the cause of high complexity it seems that, if the proposed acciden- the number of postunit test defects so the developers can refactor and sim- tal characteristics are managed in code, in the product during the years and plify the code. The most popular com- then a large part of complexity can be examine the impact of refactor- plexity measures, however, have not dissolved away. ing on defects. Figure 2 shows the been successful in fulfilling this goal.2 defects during the years beginning Two major problems in terms of this A CASE REPORT with 2003. From 2003 to 2013, defects lack of success are presented in the fol- In 2012, we conducted research at seem to have grown rapidly in a fluc- lowing two sections. Ericsson with the purpose of identify- tuating manner. ing defect-prone and unmaintainable As Figure 2 shows, the trend is bro- The problem of measuring areas of source code in a large soft- ken in 2013, after which the number of accidental code characteristics ware product.11 The reason was that a defects decreased dramatically until The first problem has to do with how well high number of defects were reported 2015. It was difficult to assess how complexity measures quantify acciden- by customers and system testing, and much of the defect reduction should tal code characteristics. The first row of the program leaders wanted to initiate be attributed to the reduction of acci- Table 2 presents the 10 code character- major refactoring activities. However, dental characteristics. It is certain, istics discussed in this article. The first the product was large, and develop- however, that refactoring activities column of Table 2 illustrates 10 software ers needed decision support on which were conducted for the files that the measures. The first eight measures are areas of code should be refactored. researchers provided to the develop- probably the most popular with research- Nine months of measurements and ers. Moreover, during the nine months ers and software engineers so far: size,9 biweekly reference group meetings of systematic investigations, develop- change,12 coupling (fan-in and fan-out) resulted in the understanding that ers understood the true drivers of com- and the measures of Henry and Kafura,9 most of the applied code complexity plexity, which influenced their tactics McCabe cyclomatic complexity,9 the measures and all of the applied code size of refactoring. object-oriented measures of Chidamber measures are inadequate for decision support (this is discussed in detail in the next section). Two measures, how- ever—maximum depth of nesting in a file and the number of developers who make simultaneous changes in a file— were shown to be sensitive measures of complexity. 
Even though these measures were not theoretically defined in the lit- erature, they still could be used by local measurement tools for indicating overly Number of Defects complex areas in the source code. Based on these measurements, program lead- ers were provided with the top-200 list of overly complex files, which were priori- 2002 2004 2006 2008 2010 2012 2014 2016 tized to be refactored in the coming year. Year Refactoring took place in the sec- FIGURE 2. The product defects over time. ond half of 2013 and continued until

FEBRUARY 2019 19 HOLDING US TOGETHER

and Kemerer,9 and Halstead measures of popular measures do not quantify the Overall the results of Table 2 are worri- software science.9 These measures have proposed accidental characteristics. The some because the most popular complex- often been evaluated for defect prediction final two measures, which are conform- ity measures poorly capture the acciden- and maintainability assessment, therefore ably designed by Buse and Weimer8 and tal code characteristics, even though many standard software measurement Harrison and Magel,7 are included to the accidental characteristics seem to textbooks include them. show that there have been endeavors to have multifold greater impact on com- Table 2 presents the following measure such accidental characteris- plexity than the essential ones. Most of information about these measures: tics as lack of structure and nesting. The the measures quantify the essential created measures, however, did not characteristics, which neither have big ››If a measure fully quantifies a fully quantify these characteristics. impact on complexity nor are reducible given characteristic, their inter- Measurement of nesting is probably due to their essentiality. Here, it becomes section cell has a green check difficult because it is hard to define the clear why it is unlikely that using these mark. measurement entity. Traditional mea- measures could help software engineers ››If a measure partly quantifies a surement entities are functions, files, to design simpler software. given characteristic, their inter- and components, for which most of the section cell has an exclamation measures are defined. Measuring nest- The problem of point. ing for such entities is not straightfor- size-based measurement ››If a measure does not quantify a ward because nesting is only defined for The second problem is that most of the given characteristic, their inter- code blocks. existing measures have mixed nature, section cell is left empty. As for the lack of structure, it is a more assessing both complexity and size at sophisticated characteristic, defined by the same time. Size is an essential code What is interesting in the table is that the length of identifiers, length of lines, characteristic, because generally big- its left side does not contain any green indentations, and so on; therefore, it is ger size indicates more functionality check marks, indicating that the most difficult to quantify it accurately. of software, and therefore it cannot be

TABLE 2. The complexity measures and code characteristics. Accidental Complexity Essential Complexity

Lack of Misleading Independent Conditional Other Nesting Developers Changes Calls Variables Operators Structure Comments Tasks Statements Characteristics Size Measures Revision Count Halstead

Fan-Out

Fan-In Henry and Kafura McCabe Childamber and Kemerer Buse and Weimer Harrison and Magel

20 COMPUTER WWW.COMPUTER.ORG/COMPUTER ABOUT THE AUTHORS

VARD ANTINYAN is a software quality engineer at Volvo Car Group. His research interests include software process and quality, software measure- ment, risk management, and cognitive psychological processes in complexity management. Antinyan received a Ph.D. in software engineering from the Uni- versity of Gothenburg. Contact him at [email protected]. reduced. For a complexity measure to be actionable, it should capture aspects ANNA B. SANDBERG is a senior director at Volvo Car Group. Her main research of complexity that are accidental. focus is to drive software change and improve productivity in product develop- When closely observing the measures ment. Sandberg received a Ph.D. in software engineering from the University of Table 2, one can understand that most of Gothenburg. She is a Member of the IEEE. Contact her at anna.sandberg@ are mixed in nature, indicating both volvocars.com. size and complexity at the same time. In essence, they are based on count- MIROSLAW STARON is a professor of software engineering at the University of ing occurrences of characteristics, Gothenburg. His research interests include software modeling, design, metrics, which makes them also size measures. and decision support. Staron received a Ph.D. in software engineering from the For example, the McCabe complexity Blekinge Institute of Technology. Contact him at [email protected]. measure is based on counting the num- ber of conditional statements. Halstead measures are based on counting the number of operators and operands. The be activated, drastically increas- ››Among the evaluated 10 char- Henry and Kafura measure is based on ing a developer’s effort and time acteristics, those that influence counting in and out invocations (fan-in consumption. complexity the most are more of and fan-out) and lines of code. A mea- ››Simultaneous changes: If many accidental characteristics and sure of Chidamber and Kemerer is based developers simultaneously therefore are manageable. on the number of methods in a class. change a piece of code, the ››The practical value of managing Although these measures are not as sim- information in that code is often accidental characteristics is the plistic as size measures, they contain, to destroyed. The consequence is reduced number of defects and a great extent, a size factor in them. An that any learning that stores increased maintainability for a increasing number of invocations, con- information in the long-term given size of software. ditional statements, and so forth indi- memory is often erased. There- ››Most of the well-known com- cates increasing size of code. fore, no retrieval of information plexity measures are based on There are only a few complexity mea- from long-term memory can be essential code characteristics, sures that are, to a large extent, size used. Generally, such situations which is why they are of little independent. Examples are nesting depth arise due to highly coupled archi- help in simplifying source code. of blocks9 and the number of developers tectural solutions, where there making simultaneous changes in a file. are central software modules cou- How can we significantly improve pled with many other modules. current practices of code complexity ››Nesting: A statement that is management? Two follow-up steps are in the lowest level of nesting Unfortunately, thus far, only size- straightforward. 
requires short-term memory to based measures orchestrate code remember all of the conditional complexity measurement discipline, ››Evaluate a more exhaustive list statements of the higher levels probably because they are easier to of code characteristics so the to understand how the current design. And, unfortunately, the neces- entire source of complexity can statement shall operate. If the sity of size-independent complexity be well understood. maximum level of nesting is measures has not been emphasized ››Design measures that are size more than the limit of short- enough in the literature to prompt independent and can quantify term memory (about seven), their development. accidental complexity of code. working memory has to go back Such measures can be created and forth over the nested condi- based on code nesting, lack of tional statements intensively to code structure, and number of make sense out of them. In this he following are the three key developers making simultane- case, long-term memory has to Tconclusions of this article. ous changes. FEBRUARY 2019 21 HOLDING US TOGETHER

REFERENCES information,” Psychol. Rev., vol. 63, Approach. Boca Raton, FL: CRC, 1. T. Mens, “On the complexity of soft- no. 2, pp. 81–97, 1956. 2014. ware systems,” IEEE Computer, vol. 6. F. Brooks, “No silver bullet—Essence and 10. G. J. Holzmann, “Code clarity,” IEEE 45, no. 8, pp. 79–81, 2012. accidents of software engineering,” IEEE Softw., vol. 33, no. 2, pp. 22–25, 2016. 2. N. E. Fenton and M. Neil, “Software Computer, vol. 20, no. 4, pp. 10–19, 1987. 11. V. Antinyan et al., “Identifying risky metrics: Roadmap,” in Proc. Conf. Future 7. W. A. Harrison and K. I. Magel, “A areas of software code in Agile/ Software Engineering, 2000, pp. 357–370. complexity measure based on nest- software development: An industrial 3. M. W. Maier, The Art of Systems Architect- ing level,” ACM SIGPLAN Not., vol. experience report,” in Proc. Conf. Soft- ing. Boca Raton, FL: CRC, 2009, pp. 5–6. 16, no. 3, pp. 63–74, 1981. ware Maintenance, Reengineering and 4. J. . Anderson, Cognitive Psychology 8. R. P. Buse and W. R. Weimer, “Learn- Reverse Engineering, 2014, pp. 154–163. and Its Implications. San Francisco, ing a metric for code readability,” 12. T. L. Graves, A. F. Karr, J. S. Mar- CA: Freeman, 2015, pp. 124–148. IEEE Trans. Softw. Eng., vol. 36, no. 4, ron, and H. Siy, “Predicting fault 5. G. A. Miller, “The magical number pp. 546–558, 2010. incidence using software change seven, plus or minus two: Some 9. N. Fenton and J. Bieman, Software history,” IEEE Trans. Softw. Eng., vol. limits on our capacity for processing Metrics: A Rigorous and Practical 26, no. 7, pp. 653–661, 2000.

CALL FOR ARTICLES IT Professional seeks original submissions on technology solutions for the enterprise. Topics include • emerging technologies, • social software, • cloud computing, • data management and mining, • Web 2.0 and services, • systems integration, • cybersecurity, • communication networks, • mobile computing, • datacenter operations, • green IT, • IT asset management, and • RFID, • health information technology. We welcome articles accompanied by web-based demos. For more information, see our author guidelines at www.computer.org/itpro/author.htm. WWW.COMPUTER.ORG/ITPRO

Digital Object Identifier 10.1109/MC.2019.2901995

22 COMPUTER WWW.COMPUTER.ORG/COMPUTER COVER FEATURE HOLDING US TOGETHER

Software Systems With Antifragility to Downtime

Kjell J. Hole and Christian Otterstad, Simula UiB

The authors discuss how to develop and operate large distributed software systems that embrace failures, learn from them, and adapt to maintain a very high system uptime.

espite the continued emphasis on predictive due to these incidents. We then explore why and how risk analysis followed by risk mitigation, software developers and system operators should create there are software systems of (inter)national and maintain so-called antifragile software systems2–4 importance, such as banking infrastructures that limit the impact of incidents and use them to learn Dand e-government platforms, that remain fragile to how to improve the systems over time, thus removing extended system downtime. In fact, countries’ vulner- the need for highly accurate risk management to curtail ability to downtime in software solutions is increasing downtime. According to the analysis, we should imple- as traditional systems are replaced. For example, when ment distributed software systems with antifragility nations dismantle their fixed-line phone systems, peo- to downtime by using separate and isolatable processes ple have no alternative means of reporting emergencies that run on multiple servers (physical machines) and when mobile phone systems go down. From this perspec- communicate via asynchronous message passing. Fur- tive, we studied essential software systems whose stake- thermore, we should inject local artificial failures into holders have skin in the game, meaning that they have the systems to detect vulnerabilities early before they much to lose when the systems go down for a long time.1 cause systemic failures. In this article, we first focus on rare incidents caused by coincidental errors or targeted attacks and explain SYSTEM MODEL why stakeholders cannot depend on traditional risk In this section, we model large software systems and management to avoid intolerably long system downtime their stakeholders to facilitate the later discussions on antifragility to downtime. A system design defines a software system’s components, interfaces, data formats, Digital Object Identifier 10.1109/MC.2018.2888772 Date of publication: 22 March 2019 dataflows, and storage solutions. Although software

COMPUTER 0018-9162/19©2019IEEE PUBLISHED BY THE IEEE COMPUTER SOCIETY FEBRUARY 2019 23 HOLDING US TOGETHER

including owners, developers, operators, foresee outages in a changing system of and users, as a complex adaptive sys- interacting parts that fail randomly and tem. The entities in this sociotechnical that is attacked by attackers exploit- system interact in involved ways. They ing, perhaps, unknown vulnerabili- adapt to each other and to the environ- ties to crash the system (see “Feedback Thick Tail ment to enable the system to survive Loops” for a more general explanation). Probability events with potentially huge negative A tail with unknown thickness makes Thin Tail impact. Incidents in the system are due it impossible to accurately estimate, or to coincidental errors and malfunctions bound, the probabilities of outliers in Length of Outage as well as hostile and targeted attacks. the form of very long outages because the system’s history will not contain Two PDFs for the duration of FIGURE 1. THE LIMITS OF TRADITIONAL enough such incidents.2 an outage, one with a thin tail and one RISK MANAGEMENT Even worse, risk analysts are likely with a thick tail. To explain why we need antifragility to overlook the possibility of very to downtime, we first use the com- long outages that have not happened systems consist of both hardware and plex system model to study experts’ before because a system’s history will software, we mostly consider the soft- inability to foresee downtime in big not contain any such incidents. In ware design due to space limitations. software systems. Although many dif- fact, the history of a complex adaptive A software system consists of soft- ferent incidents lead to downtime, we software system is of limited value ware modules that together provide the focus on the outages themselves and when analysts try to determine the functionality of the system. Each mod- not their primary causes. For simplic- risk because the system and its envi- ule contains a cohesive set of units, ity, we assume that the longer an out- ronment change over time. Because which can be subroutines, functions, age lasts, the more severe the financial rare extremely long outages dominate or classes. Although the modules of a cost to a group of stakeholders. We the cost to stakeholders, analysts can- typical software monolith are parts of a model the length of an outage by a sto- not depend on traditional risk man- layered design, the developers organize chastic variable. Figure 1 sketches two agement to predict and then mitigate the executable code into a single file. If a possible probability density functions intolerable outages.2 Instead, we must monolithic system must support many (PDFs). One PDF has a thin tail, and the develop and operate complex adap- simultaneous users, operators put a load other has a thick, or fat, tail. The tails tive software systems that avoid, or at balancer in front of multiple servers that determine the probabilities of outliers least severely limit the length of, unex- run identical copies of the executable file. in the form of very long outages. A thin pected outages. A distributed software system consists tail indicates that we can ignore overly of many executable processes that together lengthy outages because they are very WHY WE NEED realize the functionality of the system. unlikely. The same is not true for a ANTIFRAGILITY TO Each process realizes the functionality of thick tail; eventually, a prolonged out- DOWNTIME one or more software modules. 
The pro- age with excessive cost will occur.2,3 According to conventional wisdom, cesses run on multiple servers and com- To apply traditional risk manage- the opposite of a fragile system is a municate over an external network to ment to severely limit a distributed sys- robust system. Although stressors complete various tasks. The processes are tem’s total downtime during a future and perturbations easily damage frag- active, executing their operations, or inac- period, it is necessary first to describe ile systems, robust systems tolerate tive. Each process exists until it is termi- all possible incidents leading to out- rough treatment. In 2012, Nassim N. nated or crashes. A distributed software ages and then estimate the probabili- Taleb published his landmark book system that supports millions of users ties of outages of different lengths. The Antifragility: Things That Gain From may have thousands of processes that run duration of an outage is likely to have a Disorder,3 pointing out that the oppo- on many servers. PDF with an unknown thick tail.2,3 This site of a fragile system is a system We model a distributed or monolithic hidden fat tail is due to the inability of that needs stressors to thrive. Unlike software system and its stakeholders, system designers and risk analysts to robust systems, antifragile systems use

24 COMPUTER WWW.COMPUTER.ORG/COMPUTER FEEDBACK LOOPS

omplex adaptive systems contain feedback the global behavior. Although removing feedback loops.s1 A feedback loop is a sequence of loops can reduce the likelihood of extreme reac- Cinteracting processes that together cause a system tions, it is not possible to remove all loops because to adapt to the effect of its preceding behavior. they are needed to make complex systems adap- Feedback loops make complex systems adap- tive. Ideally, to ensure desirable global behavior, tive and generate global patterns or behaviors. complex adaptive software systems should have Positive feedback loops propagate and turn local only well-understood feedback loops. In practice, events into global events affecting whole systems, systems tend to contain hidden positive feedback whereas negative feedback loops limit the impact loops that were not introduced by the system de- of local events that influence system processes. signers but were created by chance as the systems Negative feedback typically stabilizes a system’s evolved. The hidden positive feedback loops make global behavior over a specific operating range, systems fragile to systemic failures, leading and positive feedback creates extreme global to downtime. behavior outside the normal operating range. It is hard to find all feedback loops in a complex Reference system because of the very many interactions S1. D. Helbing, “Systemic risks in society and econom- between its processes, and it is even harder to ics,” ETH Zürich, Switzerland, Oct. 2010. [Online]. understand all the ways feedback loops can affect Available: https://ssrn.com/abstract=2413205

events with negative impact to learn under stress. Interestingly, an anti- to achieve a very high uptime. Such a how to adjust themselves to limit the fragile system relies on the fragility system will gradually become more impact of future incidents and become of its components. Components that fragile to downtime as the system stronger in a continually chang- develop problems must fail fast and and its environment change in unpre- ing environment. give place to alternative components dictable ways. Fragility inevitably Although most of us think of robust for the system to improve. When com- accumulates below the surface of or resilient as the opposite of fragile, ponents fail slowly, there is likely to be any robust complex system, and ulti- these notions lie in the middle of a failure propagation that causes sys- mately the system will go down, per- spectrum going from fragile to anti- temic incidents. haps for a long time.5 It follows from fragile. A robust system is indifferent Complex adaptive software sys- the previous section that a complex to the volatility and uncertainty of its tems can be made antifragile to dif- adaptive software system must be able internal and external environments ferent types of impacts. The open-ac- to limit the impact of inevitable, sur- and remains the same. A resilient cess book Anti-Fragile ICT Systems4 prising incidents. The technical sys- system may change but will return discusses antifragility to malware tem itself or, more likely, the software to its original state. An antifragile spreading. Here, we focus on making developers and system operators must system is not only robust or resil- a system antifragile to downtime. It is use these incidents with small impact ient but also benefits from volatility not enough to develop a system that is to learn how to change and improve and uncertainty because it improves robust (or resilient) to known incidents the system to keep it running. In other

FEBRUARY 2019 25 HOLDING US TOGETHER

words, whenever there is a need for significant fragility remains if the cop- interfaces, and these interfaces must very high uptime in a complex adap- ies have the same flaws or bugs. be the only way the processes can tive software system, it must be anti- Since all large systems experience interact and share data. fragile to downtime. design flaws, implementation bugs, We propose that the distributed configuration errors, and hardware system architecture should consist of HOW TO DEVELOP failures, we need a system architec- weakly dependent processes, making it ANTIFRAGILE SYSTEMS ture that continues to function well possible to isolate any process display- We need to determine how to design in the presence of failures. To reduce ing suspicious or surprising behavior. and implement large software sys- the fragility to downtime compared These isolatable processes allow us to tems that limit the impact of inevita- with deployment monoliths, we sug- take a misbehaving process offline and ble, surprising incidents due to coin- gest a distributed system architecture replace it with a new instance with the cidental errors and malfunctions as of multiple separate processes with same functionality. If all instances of a well as hostile and targeted attacks. diverse functionality. The processes process misbehave, we can use an older This section suggests properties of a communicate over an external net- version of the process.4 suitable system architecture. It then work to achieve the system goals. This considers available software technol- architecture must have at least two Local persistent storage ogies to implement the architecture. physical machines that observe each A stateful process stores information in Finally, it discusses how to create other’s processes to restart a crashed a database or an in-memory data struc- implementations with adequate secu- process. If there is only one machine, ture that persists between process acti- rity. Although the computer science it is impossible to reboot a process that vations. Consider a distributed sys- literature contains the authors’ sug- crashed due to a hardware failure. tem where multiple stateful processes gestions, relatively few systems satisfy When there is a second machine, it can manipulate data in a central database. all of them. run the crashed process. If the database goes down, these pro- cesses are unable to carry out their Distributed system architecture Isolatable processes tasks. The processes are strongly depen- To determine a useful system archi- Consider a distributed software sys- dent on the database. Furthermore, the tecture, we first compare monolithic tem containing two communicating database is a single point of failure. and distributed systems. Software processes A and B. If A’s functionality Finally, process A is strongly dependent developers divide the program code of degrades when B stops working cor- on process B when A depends on B to a monolith into at least three layers that rectly, then A is strongly dependent on update information in the database. handle user interaction, business logic, B. When a distributed system includes Since a centralized database cre- and data storage. 
The layers are pack- processes that are strongly dependent ates strong dependencies between aged and deployed as a single executable, on process B, it is hard to isolate B by processes, we need to build distributed resulting in a deployment monolith that taking down its connections to other systems of weakly dependent pro- is independent of other applications. processes without severely damaging cesses with local persistent data stor- Monoliths are simple to develop, test, the functionality of the system. When age; that is, stateful processes must and deploy, and they scale by running B misbehaves, it can thus negatively maintain individual databases even multiple copies behind a load bal- affect other processes and eventually if multiple processes must store sim- ancer. However, although there exist take down the whole system. ilar data. A similar logic shows that massive monoliths with high uptime, If process A’s functionality is nearly weakly dependent processes cannot a monolith’s single executable incurs unaffected when process B malfunc- share any in-memory data structure. fragility to downtime. A design flaw or tions, then A is weakly dependent on implementation bug in the software B. When A is weakly dependent on B, Asynchronous message passing (e.g., a memory leak) can bring down and vice versa, a change to one process With synchronous communication, pro- the entire application. Although we can should not necessitate any modifica- cess A sends a request to process B. reduce the fragility to downtime by run- tion to the other. Weakly dependent The request blocks the operation of ning multiple copies of the executable, processes must have well-defined process A until process B completes

26 COMPUTER WWW.COMPUTER.ORG/COMPUTER its calculations and returns a response antifragility to downtime is essential use of virtualization makes it economi- to A. With asynchronous communica- because it is much harder than realiz- cally viable to deploy redundant and tion, process A sends a message stating ing a monolith. Software technologies diverse services to limit the impact of fail- that something happened and expects suited to implement the distributed ures. Netflix has developed a cloud-based process B to act on this information. system architecture must support microservice solution for streaming Process A does not wait for an answer many concurrent processes, asyn- movies and TV programs. This solution from process B but continues to carry chronous message passing, and local is antifragile to downtime (see a talk by out other tasks. Process A may receive persistent data storage. Furthermore, Adrian Cockcroft, https://www.youtube a response from process B later, but spawning of new processes and termi- .com/watch?v=dekV3Oq7pH8). often no response is expected. nation of existing processes must be The microservices can store data in It is easier to reason about synchro- lightweight operations because huge flat files, Structured Query Language nous (request/response) communica- systems use numerous processes that (SQL) databases, or NoSQL databases, tion than asynchronous (event-based) come and go. Finally, the technologies depending on their needs. Although communication because we know if a must allow processes to inform each microservices should never share a data- request received an answer. However, other of problems, including crashes. base, they may share a distributed and synchronous communication requires The functional language Erlang highly redundant cloud-based database highly reliable communication. When (https://www.erlang.org), together solution like Casandra or MongoDB,8 processes communicate synchronously with its runtime system, satisfies all given that different services access only over an external network between serv- the stated requirements.6 Alterna- isolated portions of the solution. ers, communication failures will occur tively, we can use the programming that cause processes to hang. Even language Elixir (https://elixir-lang Security of implementations without any network failure, there may .org), which uses the same runtime sys- Although a distributed system of sep- be a significant delay before processes tem as Erlang. Two other options are arate and isolatable microservices receive a response to a request. During to use the Akka toolkit (http://akka can limit the impact of coincidental this waiting time, the requesting pro- .io) together with the programming errors and malfunctions, it is perhaps cesses sit idly, wasting computational languages Java or Scala. Both languages less evident whether the system can resources. Hence, synchronous com- run on top of the Java virtual machine restrict the impact of hostile and tar- munication creates strong dependen- (JVM). The toolbox Akka reimplements geted attacks. cies between processes. Erlang functionality on the JVM. 
Today’s distributed systems, espe- We recommend the use of asynchro- At the time of this writing, it is increas- cially cloud-based systems, use virtu- nous message passing between pro- ingly common to implement software alization technologies that segregate cesses in the distributed architecture processes as microservices.7 Although microservices from the underlying because asynchronous messaging mit- there is no uniquely accepted definition operating systems. The process segre- igates the problems with delayed and of a microservice, it is a small process that gation of any virtualization technol- missing responses. If there is a network does one thing well. The limited and ogy is stronger than traditional process failure or a problem with the processing focused functionality of a microservice segregation, making it hard, but not of a message at the receiver, it is possi- allows a single developer to determine impossible, for an attacker to break out ble to resend the message later. The pro- what it does and how. Furthermore, of an encapsulated microservice. Con- cesses may not send pointers to inter- microservices are independently deploy- sider the following simplified but illus- nal functionality or data because this able and easy to update or replace entirely. trative example: Assume that there is would create not only strong dependen- It is possible to implement the micros- enough software diversity to ensure cies between the processes but severe ervices of a system using different pro- that services on different computers security vulnerabilities as well. gramming languages and storage solu- have no exploitable vulnerability in tions according to need. Developers often common. To control the underlying System implementations realize microservices as virtual machines server of a microservice, an attacker We should implement the outlined or Docker containers, which are isolated must then find and exploit two vulner- distributed architecture only when and replaced when they misbehave. The abilities, one to break into the targeted

FEBRUARY 2019 27 HOLDING US TOGETHER

microservice and another to escape the transaction; a less critical service would PDFs sketched in Figure 1. Here, we virtual machine to control the under- merely enable the attacker to view a list compare two different systems: one lying server. If the microservice-based of completed transactions. Every inter- with a thin-tailed PDF for the length system runs on n different machines, nal and external microservice must of an outage and another with a thick- then the attacker must develop 2n assume that any input is hostile. Since tailed PDF. The PDFs’ central areas exploits to control all services. the microservices communicate over illustrate that short outages can be less In practice, an attacker would not an insecure network and some may be frequent when the tail is thick, perhaps have to control all n machines in a dis- compromised, even duly authenticated because skilled software developers tributed system to shut down its main microservices should not be trusted to and system operators have managed to functionality. The important point is provide benign and correctly formatted remove the underlying causes of most that it is possible to develop a distrib- input in the future. short outages. However, the thick tail uted system forcing an attacker to tells us that there is a remaining non- implement a series of different attacks HOW TO OPERATE negligible risk of rare extreme outages. to damage the central functionality ANTIFRAGILE SYSTEMS As explained previously, it is hard to severely. The same is not necessar- This section discusses how to manage quantify this risk because it is difficult ily true when we scale a monolith by complex adaptive software systems to foresee rare incidents with a tremen- running identical copies on n servers, with antifragility to downtime. Even if dous impact on complex adaptive sys- because the same exploit may be used a system is highly robust to downtime tems. Furthermore, it is challenging to compromise all servers. when it is new, it will become increas- to learn from incidents when there are An internal microservice communi- ingly fragile to outages as the system only a few relatively minor incidents cates only with other services inside and its environment change. Even- between rare major incidents. the boundary of a system, whereas an tually, it will drift into failure.5 For a If a distributed system consists of external microservice communicates robust system to become antifragile separate and isolatable processes with with both external hosts and micros- to downtime, it must continuously use the necessary software diversity and ervices that are part of the system. An incidents with limited impact to learn process redundancy (see Chapters 4 extra effort should be made to secure all how to avoid system outages. and 5 in Anti-Fragile ICT Systems4), then external services as well as the internal it is possible to introduce deliberate services that guard valuable assets. An Learning from artificial failures failures into the system without suf- example of an internal service in need of To understand how to learn quickly fering painful consequences. These extra protection is a service that would and avoid increased fragility to artificial failures allow the system to allow an attacker to make a payment outages over time, we return to the detect and remove weaknesses and errors rapidly. The flow diagram in Figure 2 illustrates how we can intro- Develop and duce failures in a system to deter- Test Fix mine whether a change introduced new problems. 
The injection of fail- Yes ures shifts the focus from predictive Introduce risk analyses, possibly carried out by No Start Change to Negative third parties with limited knowledge Impact? System of a system, toward continuous testing performed by software developers and system administrators with intimate Introduce knowledge of the system. Monitor System Local Failure We can view failure injection as a form of hormesis in the software domain. Hormesis is a biological phe- FIGURE 2. A diagram of failure injection into a software system. nomenon by which a beneficial effect,

28 COMPUTER WWW.COMPUTER.ORG/COMPUTER such as improved health, tolerance to in computer science called chaos engi- to limit the impact of future incidents. stress, growth, or longevity, results neering (defined by a set of principles Although chaos engineering originated from exposure to small amounts of available at http://­principlesofchaos.org). at Netflix, it has spread to other compa- an agent that is otherwise toxic or The discipline experiments on a complex nies and may become a regular part of deadly when given in higher quanti- adaptive software system to uncover sys- site reliability engineering.10 ties.3 We gain much more from fail- temic weaknesses, remove them, and ure injection than we lose if we do it build confidence in the system’s ability ANTIFRAGILITY TO ATTACKS right (see a talk by Kolton Andrus, to withstand future incidents.9 Although According to its description,9 chaos available at https://www.usenix.org fault injection is an approach to testing engineering with the aforementioned /conference/srecon17americas/program one condition, chaos engineering is a tools provides resilience to the impact /presentation/andrus). much wider practice for generating new of coincidental errors and malfunc- information based on real-time metrics. tions in a system. We just argued that Examples of failure injection tools The obtained information often suggests chaos engineering in a distributed sys- As part of Netflix’s effort to develop a new directions of exploration to uncover tem of separate and isolatable processes cloud-based microservice system for additional weaknesses. creates not only resilience but antifra- media streaming, the company cre- To create failure scenarios, the Net- gility to downtime caused by accidental ated software tools that inject artifi- flix tool Failure Injection Testing (FIT) errors and breakdowns. However, since cial failures into its streaming service changes headers of requests at the edge a distributed system with antifragility to detect problems quickly.9 The Net- of the microservice system. FIT adds a to downtime must also prevent mali- flix tool Chaos Monkey shuts down scenario to a percentage of the head- cious and targeted attacks from crash- randomly selected virtual machines ers of a class of requests. When these ing the system, we need tools designed to ensure that the streaming service requests move through the system, specially to increase the security of a tolerates this frequent failure with- injection points between the microser- distributed system. out the customers experiencing any vices check for the failure scenario and Netflix has developed a set of secu- problems. Amazon has a similar tool act accordingly. FIT covers the impact rity tools to detect and mitigate security called Chaos Lambda. The company space between the small failures issues. Security Monkey continuously that maintains the website indeed. generated by Chaos Monkey and the monitors and detects potential anoma- com has created a tool called Sloth massive failures simulated by Chaos lies and unsafe configurations. Exam- to verify that its private cloud infra- Kong. In 2016, Netflix launched the ples include overly permissive firewall structure can withstand network Chaos Automation Platform (ChAP) to rules, load balancers that permit weak problems. Sloth is a daemon that runs run tool-based experiments automati- ciphers, and services that are open to the on all hosts in the infrastructure. The cally. To examine a microservice, ChAP Internet. 
In March 2017, the tool shipped tool causes network delays (see a launches experiment and control clus- with approximately 130 security checks. talk by Preetha Appan, available at ters of the service and applies a FIT The use of extensive security tools like https://www.usenix.org/conference­ scenario to the experimental group. Security Monkey is likely to increase /srecon17americas/program/presentation Chap then compares real-time metrics a system’s robustness to the impact of /appan). The Netflix tool Chaos Kong from the two groups to determine the attacks. However, no amount of tool simulates the outage of an entire region impact of the FIT scenario. It aborts the use can guarantee that an attacker will with multiple data centers, testing the experiment immediately if the cus- never manage to exploit, perhaps, a new system’s ability to transition service tomer pain becomes too strong.9 (zero-day) vulnerability to breach a dis- from one region to another. There are Netflix’s experience strongly indi- tributed system and then try to crash its other failure injection tools developed by cates that regular and frequent chaos processes from the inside. various organizations.9 engineering detects weaknesses, veri- fies robustness, and leads to antifragil- FAIL FAST DURING ATTACKS Chaos engineering ity to downtime in a changing system To better understand the impact of a Netflix’s effort to create failure injection when problems are removed promptly, system breach, we answer two ques- tools has developed into a new discipline for example, by adding countermeasures tions: If an attacker controls at least one

FEBRUARY 2019 29 HOLDING US TOGETHER

further limit the damage of the attacks. Although the individual processes ABOUT THE AUTHORS should fail fast, ideally, the overall dis- tributed system should not fail at all. KJELL J. HOLE is the director at the research center Simula UiB and a profes- sor in the Department of Informatics at the University of Bergen. His research interests include the development and operation of antifragile systems. Hole e studied complex adap- received a Ph.D. in computer science from the University of Bergen. Contact tive software systems that him at [email protected]. require very high uptime, Wand we have argued that such systems CHRISTIAN OTTERSTAD is a postdoctoral fellow at the research center Simula must have antifragility to downtime UiB. His research interests include low-level security exploits and mitigation. because traditional risk manage- Otterstad received a Ph.D. in computer science from the University of Bergen. ment is unable to foresee and mitigate Contact him at [email protected]. rare incidents that lead to prolonged downtime. An antifragile system, especially a customer-facing, Inter- net-scale system, should consist of process in a distributed system, will an because the malware can modify the separate and isolatable processes that intrusion detection system (IDS) detect IDS code and prevent it from reporting run on multiple servers and commu- any malicious activity? Furthermore, anomalies. A monolithic system with a nicate over an external network via should a process that receives mali- single executable would be under com- nonblocking asynchronous message cious data or is manipulated in some plete control of the attacker. However, passing. We suggest implementing other way crash immediately, or should for a distributed system with processes the processes as microservices with it try to handle the situation? running on multiple servers, only the local data storage. IDSs detect both abnormal network processes on an exploited server would Since it is hard to detect and debug traffic and unusual host activities. be controlled by the attacker. This both malicious and benevolent pro- Examples of open source IDSs are Snort, hardware-supported modularity pro- cesses that fail slowly, these processes Tripwire, and Logcheck. All IDSs report vides the defender with a better oppor- should be designed to fail immediately false positives and false negatives due tunity to react to a malicious infected and noisily when an error occurs. Misbe- to limitations of their evaluation func- process and, hopefully, prevent the having processes should be isolated and tions. Although signatures of known attack from spreading to other pro- their functionality replaced to limit the attacks can be formalized and therefore cesses and taking down the whole sys- impact of local incidents and avoid fail- identified, novel attacks are not known tem (see the earlier discussion on the ure propagation that causes extended beforehand and cannot readily be for- need for a series of different attacks). downtime. Engineering teams should malized and detected. 
Turing’s proof Since we are not always able to detect run experiments frequently that inject of the halting problem can be modified malicious activity, any process should artificial failures in a production sys- to show that it is not always possible to be fragile and should fail fast when it tem to detect and remove hidden vul- parse a sequence of CPU instructions receives unusual data or is manipu- nerabilities, thus improving the system and data to identify malicious activity lated in any way, even though our goal over time. Together, the introduction of (see Chapter 5 in Olav Lysne’s book11). is to create a distributed system with artificial failures to facilitate learning It is not feasible in practice to reli- very high uptime. A failing process A and the isolation of misbehaving pro- ably detect malicious activity in a should inform another process B about cesses result in an antifragile system system infected by malware. When the failure and let process B handle the whose outage PDF has a thin tail (see malware controls the execution flow consequences.6 Fragile processes that Figure 1)—that is, the likelihood of an of a server at the highest level of priv- fail fast in this way facilitate detection of intolerably long outage is negligible. ilege, even kernel mode functionality attacks, isolation of malicious processes, Although antifragile systems will fail to detect malicious activity and activation of countermeasures to become stronger by overcompensating

30 COMPUTER WWW.COMPUTER.ORG/COMPUTER for stressors, they are antifragile to 2. N. N. Taleb, The Black Swan: The Microservice Architecture: Aligning stressors only up to a certain point. Impact of the Highly Improbable, 2nd Principles, Practices, and Culture. Beyond that point, the systems may ed. New York: Random House, 2010. Sepastobol, CA: O’Reilly, 2016. be severely damaged or may even col- 3. N. N. Taleb, Antifragile: Things That 8. P. J. Sadalage and M. Fowler, NoSQL lapse. We cannot continue to add new Gain From Disorder. New York: Ran- Distilled: A Brief Guide to the Emerging processes to complex adaptive software dom House, 2012. World of Polyglot Persistence. Reading, systems without also creating unin- 4. K. J. Hole, Anti-Fragile ICT Systems. MA: Addison-Wesley, 2013. tended positive feedback loops (see New York: Springer, 2016. 9. C. Rosenthal, L. Hochstein, A. Bloho- “Feedback Loops”). Since it is hard to 5. S. Dekker, Drift Into Failure: From wiak, N. Jones, and A. Basiri, Chaos detect unintended or hidden positive Hunting Broken Components to Under- Engineering: Building Confidence in feedback loops, no system can continue standing Complex Systems. Surrey, System Behavior Through Experiments. to grow forever without incurring fra- U.K.: Ashgate, 2012. Sepastobol, CA: O’Reilly, 2017. gility to downtime. 6. J. Armstrong, Programming Erlang: 10. B. Beyer, C. Jones, J. Petoff, and N. R. Software for a Concurrent World, 2nd Murphy, Eds., Site Reliability Engineer- REFERENCES ed. Raleigh, NC: Pragmatic Book- ing: How Google Runs Production Sys- 1. N. N. Taleb, Skin in the Game: Hidden shelf, 2013. tems. Sepastobol, CA: O’Reilly, 2016. Asymmetries in Daily Life. New York: 7. I. Nadareishvili, R. Mitra, M. 11. O. Lysne, The and Snowden Random House, 2018. McLarty, and M. Amundsen, Questions. New York: Springer, 2018.

Call for Articles

IEEE Pervasive Computing

seeks accessible, useful papers on the latest

peer-reviewed developments in pervasive,

mobile, and ubiquitous computing. Topics

include hardware technology, software

infrastructure, real-world sensing and

Author guidelines: interaction, human-computer interaction, www.computer.org/mc/ and systems considerations, including pervasive/author.htm deployment, scalability, security, and privacy. Further details:

[email protected] www.computer.org/pervasive Digital Object Identifier 10.1109/MC.2019.2901996

FEBRUARY 2019 31 COVER FEATURE HOLDING US TOGETHER

Algorithms: Law and Regulations

Philip Treleaven, University College London Jeremy Barnett, St. Paul’s Chambers and Gough Square Chambers London Adriano Koshiyama, University College London

The legal status of AI and algorithms continues to be debated. Resume-sifting algorithms exhibit unethical, discriminatory, and illegal behavior; crime-sentencing algorithms are unable to justify their decisions; and autonomous vehicles’ predictive analytics software will make life and death decisions.

ecently, Elon Musk stated AI is an “existential Research Council of five ethical “principles for design- threat to humanity” and needs urgent regu- ers, builders, and users of robots”; and the Association lation “before it is too late.”1 In a similar vein, for Computing Machinery’s seven principles for algo- pioneering computer scientist Ben Shneider- rithmic transparency and accountability, published in man, in his 2017 Turing Lecture on algorithmic account- 2017. The IEEE has begun developing a new standard R 2 ability, called for a “national algorithm safety board.” to explicitly address ethical issues and the values of Over the years, there have been proposals to regu- potential future users.3 This new standard, IEEE P7000, late robots and algorithms. These include science fic- aims to establish a process model by which engineers tion writer Isaac Asimov’s famous 1942 proposal “Three and technologists can address ethical considerations Laws of Robotics”; the South Korean Government’s throughout the various stages of system initiation, proposal in 2007 of a Robot Ethics Charter; a 2011 pro- analysis, and design.4 posal from the U.K. Engineering and Physical Sciences In this era of ubiquitous, pervasive algorithms, AI and soon blockchain smart contracts, three axes of dis- cussion are emerging: when-to regulation, where-to Digital Object Identifier 10.1109/MC.2018.2888774 Date of publication: 22 March 2019 jurisdiction, and how-to technology.

32 COMPUTER PUBLISHED BY THE IEEE COMPUTER SOCIETY 0018-9162/19©2019IEEE Opinion is polarized on regulation. algorithms to be regulated and proved and unaccountable models in our lives, For instance, 1) code is law, i.e., this legal? For example, many machine-learn- ranging from obtaining car insurance, is progress, don’t stop innovation; 2) ing algorithms 1) are “black boxes” and, loans, and so on to reinforcing discrim- owner responsibility, i.e., the company thus, are unable to explain their deci- ination by unethically designed deci- is legally responsible, therefore, just sion-making process, a growing legal sions. Nick Bostrom’s Superintelligence7 apply existing laws; 3) code of conduct, requirement in consumer applications; addresses the threat that an unaligned i.e., self-regulation by programmers and 2) use proprietary Internet Protocol, “superintelligent” machine (reachable companies works best; 4) sectorial regu- which the developers are unwilling this century) can cast upon us. All of lations, i.e., leave regulation to specific to divulge; 3) evolve as they learn by these authors provide persuasive argu- sectors, such as financial regulators; 5) assimilating new data that change their ments about the imminent threat that algorithm safety board, i.e., a special behavior; and 4) operate increasingly in a lack of oversight, standards, and eth- national authority is required because algorithm-algorithm ecosystems where ics pose to humanity. of the pervasiveness of algorithms; and future behavior is dynamic and unpre- It total, there are three areas, eth- 6) AI is considered harmful, i.e., ban AI dictable. As previously discussed, this ics, legal, and technology, being inves- algorithms at least in certain critical article is intended to foster debate tigated by academia and industry with applications. One axis covers increas- concerning the legal status of “intelli- the goal of alleviating or avoiding the ingly sophisticated algorithms and the gent” algorithms in the computer sci- impact of intelligence algorithms. other covers increasing levels of regula- ence community. tion, as shown in Figure 1. Ethics With regard to jurisdiction, Europe CURRENT DEBATE There is a growing body of literature may decide to heavily regulate AI and In this section, we review the debate about the implications that unethi- algorithms, the United States may use on the role of algorithms and their cally designed intelligent algorithms existing legislation, and China may regulation and oversight. While many can have for society. Mittelstadt et do nothing. This will only encourage authors are raising public awareness, al.8 reveal the gap between those that “regulatory arbitrage,” as companies we can mention Cathy O’Neil’s Weap- design, operate, and use these algo- (and innovation) move to the most ons of Math Destruction5 and Frank rithms and society at large and then conducive environment. Pasquale’s The Black Box Society,6 two propose a debate across six types of Finally, we have technology. What popular science books that reveal deci- concerns prompted by algorithmic technology options are available for sions made by opaque, nonauditable, decision making. Making an algorithm

[Figure 1 is a matrix that maps example algorithm applications of increasing sophistication (chatbots, specialist advisor systems, control algorithms in consumer devices and infrastructure, self-driving vehicles, and intelligent robots) against increasing levels of regulation (apply current law, sector regulation, an algorithm safety board, self-regulation, and banning AI), indicating for each combination whether existing laws and regulators suffice, whether safety-board advice or specialist certification is appropriate, or whether certain systems, such as safety-critical "black-box" systems, fully autonomous vehicles, or "killer" robots, should be banned.]

FIGURE 1. AI and algorithms—when and how to regulate.


Making an algorithm fair is a paramount goal, and meaningful advances9 have been made to remove (and understand) any bias that is embedded in its decision-making process. A great deal of work in this sense can be found in the annual events on fairness, accountability, and transparency in machine learning (www.fatml.org).
This area is increasingly gaining institutional support; examples include the Future of Life Institute, the Future of Humanity Institute, the Center for the Study of Existential Risk, and the newest, the Ada Lovelace Institute in the United Kingdom, with US$7 million backing from the Nuffield Foundation. These organizations are devoted to building a shared understanding of the ethical questions raised by the application of data and intelligent algorithms as well as to developing an evidential basis for how these technologies affect society and different groups within it.

Legal
Debate on the legal aspects of automated decision making focuses on regulating the data rather than the algorithm component. The landmark legislation in Europe is the General Data Protection Regulation (GDPR),10 a major E.U. regulation that provides rights to individuals pertaining to their data usage and storage. It remains to be seen whether the United States and China will follow suit. Although the GDPR covers some parts of the algorithm component—such as the "right to explanation" of all automated decisions—Wachter et al.11 raise several reasons to doubt both the legal existence and feasibility of such a right.
In a recent paper, Tutt12 argues that U.S. criminal and tort regulatory systems will prove no match for the difficult regulatory puzzles that algorithms pose. He proposes an "FDA for algorithms," an expert regulator modeled on the U.S. Food and Drug Administration that develops guidance, standards, and expertise in partnership with industry to strike a balance between innovation and safety. An FDA for algorithms could draw knowledge from financial services regulation because stress testing, capital/margin requirements, risk management, circuit breaking, and so on are tools and practices already tested, implemented, and regulated.

Technology
The technology debate is mostly led by think tanks and the industry. Noteworthy is the research from the Machine Intelligence Research Institute,13 the AI safety units of Google,14,15 and OpenAI (openai.com/), backed by Elon Musk and many other entrepreneurs and tech companies. Public figures, such as the late Stephen Hawking,16 have also voiced AI safety concerns. Hawking called for a de-escalation of the IT arms race and demanded companies mitigate the risks of these systems being used against humanity. Research institutions and professional associations are now creating blueprints for studies that promote beneficial AI.4,17
Overall, the debate in each of these areas is far from a consensus. Action has been taken concerning "data" (e.g., the GDPR and other EU data protection laws), but we have yet to see concrete actions on the algorithms side. Any standards, protocols, and design aspects that emerge for algorithms are likely to focus on the performance, behavior, and "explainability" of intelligent algorithms.
For completeness, we now summarize four core algorithm technologies: AI, blockchain, the IoT, and big data (behavioral/predictive) analytics. These four technologies are intimately linked, i.e., AI provides the algorithms, blockchain provides the data storage and processing infrastructure, the IoT provides the data, and big data (behavioral/predictive) provides the analysis.

ALGORITHM TECHNOLOGIES AND ECOSYSTEMS
There are two broad classes of algorithm, which can be termed static algorithms (i.e., traditional programs that perform a fixed sequence of actions) and dynamic algorithms (i.e., those that embody machine learning and evolve). It is these latter intelligent algorithms that present complex technical challenges for testing and verification, which will underpin regulation.

AI technologies
AI provides computers with the ability to make decisions and learn without explicit programming. There are two main branches (a small code sketch contrasting them follows this list):

›› Knowledge-based systems: These are computer programs that reason, with knowledge explicitly represented as ontologies or rules rather than implicitly via code. For example, in rule-based systems, the knowledge base contains the domain knowledge coded in the form of IF-THEN or IF-THEN-ELSE rules. Rule-based systems can explain their decision making.
›› Machine learning: These are programs that have the ability to learn without explicit programming and can change when exposed to new data. For example: 1) supervised learning, i.e., where algorithms are trained with example data, and 2) unsupervised learning, i.e., where algorithms infer a function from unlabeled data. Most machine-learning algorithms are unable to explain their reasoning (e.g., black box).
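To make the contrast concrete, here is a minimal, illustrative Python sketch (the loan-decision rules, thresholds, and training examples are invented for this example): a tiny rule base that can state why it decided, next to a scikit-learn classifier whose decision function is learned from labeled data. A decision tree is used only for brevity; the opacity described above is far more pronounced in models such as deep neural networks.

```python
# Illustrative only: a knowledge-based (rule-based) decision versus a supervised
# machine-learning decision. Rules, thresholds, and training data are hypothetical.
from sklearn.tree import DecisionTreeClassifier

def rule_based_loan_decision(income, existing_debt):
    """Knowledge-based system: domain knowledge coded as IF-THEN rules, with an explanation."""
    if income <= 0:
        return "reject", "Rule 1: no verifiable income"
    if existing_debt / income > 0.5:
        return "reject", "Rule 2: debt-to-income ratio above 50%"
    return "approve", "Rule 3: all checks passed"

# Supervised machine learning: the decision function is inferred from example data.
X_train = [[30_000, 5_000], [20_000, 15_000], [60_000, 10_000], [15_000, 12_000]]
y_train = ["approve", "reject", "approve", "reject"]
model = DecisionTreeClassifier().fit(X_train, y_train)

decision, reason = rule_based_loan_decision(income=40_000, existing_debt=8_000)
print("Rule-based:", decision, "-", reason)                 # decision plus a stated reason
print("Learned:   ", model.predict([[40_000, 8_000]])[0])   # decision only
```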

Blockchain technologies
The core blockchain technologies18 are as follows (a toy sketch of both ideas appears after this list):

›› Distributed ledger: a decentralized database where transactions are kept in a shared, replicated, synchronized, distributed bookkeeping record, which is secured by cryptographic sealing. The key attributes are resilience, integrity, transparency, and unchangeability, i.e., the record is mostly "immutable."
›› Smart contracts: these are (possibly) computer programs that codify transactions and contracts, which, in turn, "legally" manage the records in a distributed ledger.

Intelligent algorithms and smart contracts will be critical "robo assistants" that run commerce and infrastructure based on distributed ledger technology.
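As a toy, single-node illustration of these two ideas (class names, record fields, and the payment rule are invented for this sketch; a real distributed ledger additionally involves replication and consensus across nodes), the following Python code chains ledger entries with cryptographic hashes and wraps one codified transaction rule around the ledger:

```python
# Toy illustration: an append-only ledger whose entries are chained by SHA-256 hashes,
# plus a "smart contract" function that codifies a simple payment rule before recording.
import hashlib, json, time

class MiniLedger:
    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"record": record, "prev_hash": prev_hash, "timestamp": time.time()}
        body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash; tampering with any earlier entry breaks the chain."""
        prev = "0" * 64
        for entry in self.entries:
            unsealed = dict(entry)
            unsealed.pop("hash")
            recomputed = hashlib.sha256(json.dumps(unsealed, sort_keys=True).encode()).hexdigest()
            if recomputed != entry["hash"] or entry["prev_hash"] != prev:
                return False
            prev = entry["hash"]
        return True

def smart_contract_pay(ledger: MiniLedger, payer: str, payee: str, amount: float, work_done: bool):
    """Codified business logic: record a payment only if the agreed work was confirmed."""
    if not work_done:
        return ledger.append({"type": "dispute", "payer": payer, "payee": payee})
    return ledger.append({"type": "payment", "payer": payer, "payee": payee, "amount": amount})

ledger = MiniLedger()
smart_contract_pay(ledger, "building_owner", "electrician", 120.0, work_done=True)
print("chain valid:", ledger.verify())
```

Because each entry's hash covers the previous entry's hash, altering any earlier record invalidates every later one; this is the "cryptographic sealing" the bullet list refers to.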
IoT
The IoT is becoming increasingly crucial because every device that has an on/off switch will have a unique identity, and a connection to the Internet will communicate with and be controlled by an algorithm. Devices range from individual lights in a smart building to domestic appliances to the national infrastructure. When a light fails, an algorithm runs that triggers (UBERises) an electrician to come and fix the light. The algorithm then pays for the service.
Two important IoT-ecosystem technologies are: 1) building information modeling, which is a digital model of a facility or infrastructure; aside from supporting computer-aided design during construction or refurbishment, the digital model will also be used to manage the facility or infrastructure in real time using IoT resources; and 2) blockchain smart contracts, which will be the algorithms that operate and control the infrastructure.

Big data (behavioral and predictive) analytics
Big data analytics is the process of examining vast and varied data sets to uncover hidden patterns, trends, customer preferences, and so forth. One of the most exciting areas for intelligent algorithms is behavioral and predictive analytics. Behavioral analytics focuses on providing insight into the actions of people, whereas predictive analytics extracts information from existing data sets to determine patterns and predict future outcomes and trends.
Consider the 2002 film Minority Report, an action thriller set in Washington D.C. in 2054, where police utilize algorithms to arrest and convict murderers before they commit their crimes. No longer science fiction, predictive analytics is already being used in the sentencing of offenders, with decisions being challenged in the courts. An example is Wisconsin v. Loomis, where Compas, a risk-assessment tool, contributed to the trial judge increasing Loomis's sentence. This ruling is being appealed because Compas is unable to explain its reasoning, and its creators are unwilling to divulge its methods due to intellectual property considerations.
In the next section, we look at the evolving algorithm ecosystem and discuss its impact on regulation and the law, that is, algorithms as assistants, competitors, controllers, judge/jury, technology options, and regulations.

ALGORITHMS AS ASSISTANTS
Traditional (static) algorithms are already prevalent, taken for granted, and perform critical tasks such as flying planes and controlling nuclear systems. The current debate mainly concerns intelligent (dynamic) algorithms, their continued evolution, and potential threat.
Beginning with virtual assistants, known as chatbots, these are algorithms designed to simulate a conversation with human users primarily over the Internet, where machine-learning algorithms perform a task, such as providing customer service or answering a question. Algorithms behind this technology have the capacity for learning, reasoning, and understanding. They range from search engines like Google, to increasingly sophisticated assistants such as Apple Siri, Samsung Bixby with image search, or smart devices such as Amazon Echo/Alexa and Google Home. Regarding liability and the law, if Google returns the wrong answer to a search, we try again. However, if Amazon's Alexa misinterprets a conversation or hears something on the television and makes an expensive purchase, where does the law stand?

Rogue algorithms
Rogue algorithms have emerged because of advertisers who abuse Amazon Alexa and Google Home. Recently, Burger King "hijacked" Google Home speakers by creating an ad that triggered devices to read its Wikipedia entry for the Whopper, edited beforehand to sound like marketing copy. Google quickly blocked the trigger but not before the restaurant chain had gained much free publicity and considerable Google consumer backlash.


Another example is that of Microsoft's Tay, a chatbot algorithm that was designed to learn from user interaction via Twitter. Tay proved a smash hit with racists, trolls, and online troublemakers who persuaded it to blithely use racial slurs, defend white supremacist propaganda, and even call for outright genocide. Even the heavily censored Chinese Internet is not immune: "Rogue Chatbots Deleted in China After Questioning Communist Party," read one recent headline. Two chatbots, BabyQ and XiaoBing, were pulled from a Chinese messaging app after they questioned the rule of the Communist Party and made unpatriotic comments. The bots were available on a messaging service with 800 million users run by Chinese Internet giant Tencent, before apparently going rogue!

Regulation and the law
It might be argued that self-regulation works well, based on the chatbot examples mentioned previously. However, it does highlight the significant challenge of testing these evolving machine-learning algorithms. Although concern has been expressed about the urgent need to police and regulate these rogue algorithms, there exists, through the current U.S. and U.K. criminal and civil laws, a considerable body of law that can be deployed where necessary. Toby Walsh suggests using chatbot-to-human warnings like the Red Flag Law that governed the early use of motor vehicles (arxiv.org/pdf/1510.09033.pdf).

ALGORITHMS AS COMPETITORS
We now examine algorithms that are displacing humans. The application of intelligent algorithms has been driven to a large extent by the highly competitive financial services industry, beginning with algorithmic trading (AT) and the rise of financial robo advisors.

AT
In electronic financial markets, AT refers to the use of algorithms to automate one or more stages of the trading process, e.g., pretrade analysis (data analysis), trading signal generation (buy and sell recommendations), and trade execution. Each stage of this trading process can be conducted entirely by algorithms or by algorithms plus humans.
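As a deliberately simplified illustration of the signal-generation stage only (the price series, window lengths, and crossover rule below are invented for this sketch; real AT systems add pretrade analytics, risk controls, and execution logic), a moving-average crossover can be expressed in a few lines of Python:

```python
# Simplified sketch of "trading signal generation": a moving-average crossover
# emits buy/sell recommendations. Prices and window sizes are hypothetical.
def moving_average(series, window):
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def crossover_signals(prices, short=3, long=5):
    short_ma = moving_average(prices, short)
    long_ma = moving_average(prices, long)
    offset = long - short                      # align the two averages on the same day
    signals = []
    for i in range(1, len(long_ma)):
        prev = short_ma[i - 1 + offset] - long_ma[i - 1]
        curr = short_ma[i + offset] - long_ma[i]
        if prev <= 0 < curr:
            signals.append((i, "BUY"))         # short average crosses above long average
        elif prev >= 0 > curr:
            signals.append((i, "SELL"))        # short average crosses below long average
    return signals

prices = [100, 101, 103, 102, 105, 107, 106, 104, 103, 101]
print(crossover_signals(prices))
```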
Rogue algorithms. AT, because of its magnitude and proliferation, has had a significant impact on financial markets. Notable is the 2010 Flash Crash, which wiped out US$600 billion in market value of U.S. corporate stocks in 20 min. However, the involvement and market impact of AT on flash crashes is still the subject of much debate.
Another interesting example, one that possibly involves a traditional (static) algorithm, is that of market-making firm Knight Capital. On 1 August 2012, Knight Capital deployed untested software in a production environment that contained an obsolete function. The rogue algorithm started pushing erratic trades through on roughly 150 different stocks and lost US$440 million in 30 min, resulting in the end of the company.

Professional robo advisors
We now look at how algorithms impact professions. Robo advisors are a class of advisor software providing professional financial advice or portfolio management with minimal human intervention, based on mathematical rules and AI algorithms. A typical robo collects information from clients about their financial situation, level of acceptable risk, and future goals through an online survey and then uses the data to offer advice and/or automatically invest client assets. There are two broad classes: 1) robo investors, which invest without offering any financial advice and are not formally regulated, and 2) robo advisors, which offer regulated advice to users and then use the responses to guide investment decisions.

Rogue algorithms. Warren Buffett, when discussing the aftermath of the 2008 financial crisis, warned, "Wall Street's beautifully designed risk algorithms contributed to the mass murder of [U.S.] $22 trillion." This comment highlights potential problems, namely that today's robo investor and robo advisor algorithms probably lack the necessary experience to manage assets during sustained periods of market turbulence and falling stock prices.

Financial regulations
Regulators are also exploring using algorithms to improve efficiency. Financial regulation is estimated by the Financial Times to cost firms billions and involve 10% of the workforce. Regulators face myriad pressures, such as increasing the monitoring of small firms and individuals, cross-border cybercrime (e.g., anti-money-laundering and binary options), political pressure to curb excesses (e.g., Libor), escalating international and EU regulations (e.g., Markets in Financial Instruments Directive II), governments that relax regulations to increase competitiveness (e.g., the Dodd–Frank Wall Street Reform and Consumer Protection Act), and so on.

The monitoring challenges faced by regulators are illustrated by the U.K. Financial Conduct Authority (FCA). Previously, the FCA monitored 25,000 large- and medium-size firms. With virtually the same resources, the FCA now must supervise an additional 30,000 small firms. Hence, regulators are looking to automate compliance and regulations using AI algorithms and blockchain, while also regulating algorithmically. Recently, the Financial Stability Board published a report19 about the impact of AI in financial services, pointing out that the lack of "auditability" of intelligent algorithms could become a macrolevel risk.

Regulation and the law. Financial algorithms are increasingly subject to compliance, with regulators requiring firms to demonstrate that trading algorithms have been thoroughly tested, demonstrate "best execution," and are not engaged in market disruption. Likewise, robo advisors are being regulated to benchmark their level of advice to ensure they are not engaged in market manipulation. The U.S. Securities and Exchange Commission requires robo advisors to be registered and compliant in three areas: 1) substance and presentation of disclosures, 2) provision of suitable advice, and 3) enacting effective compliance programs.
With respect to the law, these financial algorithms are not fiduciaries, nor do they currently fit under the traditional standard applied to human registered investment advisors.

ALGORITHMS IN CONTROL
Although much public debate centers on future intelligent robots, traditional control algorithms are already omnipresent. An example is an autopilot controlling an aircraft. Suddenly, Asimov's "Three Laws of Robotics" are a reality, with the proliferation of fully autonomous vehicles and military robots that are designed to kill. Autonomous vehicles have control systems that can analyze sensory data to distinguish between cars on the road, pedestrians, and other potential hazards, which is necessary for safe navigation but also can predict accidents. The moral dilemma for the software engineer, manufacturer, and regulator is that predictive algorithms may, in extreme situations, need to make life and death decisions. One possible avenue is that a regulator requires navigation software to pass an algorithm "driving test."

Rogue algorithms
A recent video on the Internet shows how Tesla's autopilot algorithm attempts to predict a car accident before it happens. However, not everything is perfect for autopilot algorithms. Nearly every autonomous vehicle company from Tesla and Google to Uber has had some car crashes, including the death of a Tesla driver. Some accidents are algorithm anomalies, some are caused by other drivers' unpredictable behaviors, and, in the future, some may be caused by malicious hacking.

Regulation and the law
The law is playing catch-up, with industrialized countries adjusting their laws to accommodate autonomous vehicles. For example, the U.K. Vehicle Technology and Aviation Bill imposes liability on the owner of an uninsured automated vehicle when driving itself and makes provisions for cases where the owner has made "unauthorized alterations" to the vehicle or failed to update its software. Likewise, no one expects the law to condone autonomous vehicles driving drunks home! However, challenging legal questions are already being posed as to what happens where there are collisions between two driverless cars and both appear to have acted appropriately. Further ethical issues arise when, e.g., a driverless car swerves to avoid a pedestrian and causes a fatal accident.

ALGORITHMS AS JUDGE AND JURY
Perhaps the most contentious area is algorithms making decisions regarding humans with little or no right of appeal. This includes: 1) consumer decisions, where applicants are judged by algorithms that are unable to explain their reasoning, ranging from CV-sifting software to people applying for loans and mortgages; 2) staff decisions, where algorithms select staff for jobs, decide remuneration, and determine whether they should be dismissed; and 3) defendant decisions, where the justice system is using algorithms to recommend sentencing of criminals. As previously discussed, one example is Wisconsin v. Loomis, where a black-box risk-assessment tool contributed to the trial judge increasing Loomis's sentence.

Algorithmic star chamber
In the workplace, algorithms are rapidly becoming a judge and jury "star chamber." Uber has been criticized in the media for only communicating with its drivers via algorithms that unilaterally decide on the level of revenue share, the driver's rating, and whether to terminate employment without the right of appeal. Online retailers live in fear of a drop in their Google search-engine ranking if they are judged by an algorithm to have done something fraudulently.


Rogue algorithms
Engineers creating algorithms face increasing ethical and legal challenges that can have severe consequences affecting individuals, groups, and whole societies. As an example, consider the following: you have been asked to code the algorithm for a self-driving car that can predict possible accidents. When a fatal accident appears unavoidable, does your algorithm 1) sacrifice the car, 2) sacrifice the pedestrian, 3) sacrifice passengers in other vehicles, 4) risk harming the occupants, or 5) risk an even greater accident?

BLOCKCHAIN SMART CONTRACTS
We now consider the emerging technology of smart contracts, one of the most contentious algorithm technologies for lawyers, and frequently disparaged as "not smart, not contracts." However, to quote Sean Murphy20 of Norton Rose Fulbright, "Smart contracts, in combination with distributed ledger technologies, have the potential to automate an extensive array of transactions and services within the service sector. Legal compliance can be built into the program logic, providing a way of transacting that maximizes operational efficiencies with the potential to reduce legal and regulatory cost and risk."

Regulation and the law
With regard to smart contract law, proponents fall into three legal camps: 1) code-is-contract, i.e., those who espouse encoding the entirety of a natural language contract; 2) hybrid-contract, i.e., those using a hybrid smart contract model under which natural language contract terms are connected to computer code via parameters (e.g., a smart contract template) that feed into computer systems for execution; and 3) code-is-business-logic, i.e., those who see smart contracts as consisting of digitizing business logic performance (e.g., payment), which may or may not be associated with a natural language contract.

TECHNOLOGY OPTIONS
Before discussing regulatory structures and possible changes to the law, it is appropriate to discuss technology options.

Algorithm testing
Depending on the nature of the system, techniques divide into the following:

›› Traditional testing, which can involve static code reviews or dynamic analysis with test sets, along with "white-box" internal workings and "black-box" functionality.
›› Algorithm formal verification, which proves or disproves correctness using a formal proof on an abstract mathematical model of the system, which corresponds accurately to the nature of the system (usually known before construction).
›› Algorithm cross-validation, which aims to run the same algorithm on an independent data set or scenario to evaluate potential risks (e.g., overfitting, sensitivity to noise, and so on) and measure its expected generalization accuracy.

Algorithm certification
Algorithm certification involves auditing whether the algorithm used during the life cycle 1) conforms to the protocol requirements (e.g., for correctness, completeness, consistency, and accuracy); 2) satisfies the standards, practices, and conventions; and 3) solves the right problem (e.g., correctly models physical laws) and satisfies the intended use and user needs in the operational environment.

Algorithm circuit breakers
For financial algorithms, circuit breakers are used to detect failures and encapsulate the logic by preventing a failure from regularly recurring during maintenance, temporary external system failures, or unexpected system difficulties. Circuit breakers are used by exchanges to curb panic selling and excessive volatility (i.e., large price swings in either direction) in individual securities. For machine-learning algorithms, circuit-breaker technology that monitors functionality may be the only option.
Arguably, algorithm testing and algorithm certification are unlikely to fully work for machine-learning algorithms. As a result, all critical intelligent algorithms may require circuit breakers.
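A minimal sketch of such a circuit breaker as a software pattern follows (class names, thresholds, and the anomaly test are invented for this example): a wrapper monitors an algorithm's outputs and trips, suspending the algorithm and falling back to a safe action, when anomalous behavior repeats, analogous to an exchange halting trading after large price swings.

```python
# Minimal sketch of an algorithm circuit breaker: monitor a model's decisions and
# trip after repeated anomalous outputs, falling back to a safe default until a
# human resets it. The thresholds and anomaly rule are placeholders.
class CircuitBreaker:
    def __init__(self, decide, is_anomalous, max_anomalies=3, safe_default="defer_to_human"):
        self.decide = decide                  # the wrapped (possibly learning) algorithm
        self.is_anomalous = is_anomalous      # monitoring rule supplied by the operator
        self.max_anomalies = max_anomalies
        self.safe_default = safe_default
        self.anomaly_count = 0
        self.tripped = False

    def __call__(self, request):
        if self.tripped:
            return self.safe_default
        decision = self.decide(request)
        if self.is_anomalous(request, decision):
            self.anomaly_count += 1
            if self.anomaly_count >= self.max_anomalies:
                self.tripped = True           # stop the algorithm; require manual review
                return self.safe_default
        return decision

    def reset(self):
        self.anomaly_count, self.tripped = 0, False

# Example: halt an automated pricing algorithm if it repeatedly quotes extreme prices.
breaker = CircuitBreaker(
    decide=lambda req: req["base_price"] * req["multiplier"],
    is_anomalous=lambda req, price: price > 10 * req["base_price"],
)
for multiplier in (1.1, 20, 30, 40, 1.0):
    print(breaker({"base_price": 100, "multiplier": multiplier}))
```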
Legal redress for algorithm failure seems straightforward: if something goes wrong with an algorithm, just sue the humans who deployed the algorithm. However, it may not be that simple. For example, if an autonomous vehicle causes death, does the lawsuit pursue the dealership, the manufacturer, the third party who developed the algorithm, the driver, or the other person's illegal behavior? This stimulates the debate of whether or not algorithms should be given a legal personality.
As we know, a legal person refers to a nonhuman entity that has legal standing in the eyes of the law. A graphic example of a company having legal personality is the offense of corporate manslaughter, which is the criminal offense of an act of homicide committed by a company or organization.

Another important principle of law is that of agency, in which a relationship is created whereby a principal gives legal authority to an agent to act on the principal's behalf when dealing with a third party. An agency relationship is a fiduciary relationship. It is a complex area of law with concepts such as apparent authority, where a reasonable third party would understand that the agent had authority to act.
As the combination of software and hardware continues to produce intelligent algorithms that learn from their environment and may become unpredictable, it is conceivable that, with the growth of multialgorithm systems, decisions will be made by algorithms that have far-reaching consequences for humans. It is this potential unpredictability that supports the argument that algorithms should have a separate legal identity so that due process can occur in cases where unfairness is present. The alternative to this approach would be to adopt a regime of strict liability for those who design or place dangerous algorithms on the market, so as to deter behaviors that appear or are proved to have been reckless. Is this a case of bolting the barn door after the horse has escaped?
We present three discussion points:

›› Algorithm circuit breakers: Critical intelligent algorithms may require mandatory circuit breakers for safe operation because algorithms with machine learning evolve dynamically and may prove unfeasible to rigorously test and verify.
›› National algorithm safety board: A special national board will be required to provide expert knowledge and advice.
›› Legal status of algorithms: Currently, algorithms are not considered to be artificial persons, i.e., "they are unable to own things so they are not worth suing." However, in cases where dynamic algorithms and robots develop beyond the intentions of the designers, where the owner/designer of an algorithm cannot be identified (perhaps through insolvency), or where a so-called decentralized autonomous organization builds a store of wealth, a body of opinion is forming in law to support the view that courts should have the ultimate authority to issue sanctions, which might include the power to fine or use a "kill switch" where necessary.

In summation, Ben Shneiderman has called for a national algorithm safety board.2 Such a board would have the expertise to 1) provide specialist advice on algorithms to sectors such as finance and 2) recommend changes to the law as algorithms and their ecosystem evolve. Proposed codes of ethics for designers of algorithms do not yet suggest how these principles can be enforced, giving rise to a need for a debate concerning potential sanctions for their breach.
Regarding giving algorithms a legal personality (i.e., artificial persons), a company having legal personality can, for example, be charged with corporate manslaughter, which is a criminal offence in law. Another controversial issue in law is that of agency, wherein algorithms are authorized to enter into contracts with humans or other algorithms and subsequently a dispute arises about the scope of the algorithms' authority. This article was written to stimulate discussion in the computer science and legal professions concerning algorithms, regulations, and the law, a subject of growing debate.

ACKNOWLEDGMENTS
We thank the reviewers for their comments and ideas that have helped us reshape and improve our article.


REFERENCES
1. O. Etzioni, "How to regulate artificial intelligence," NY Times, Sept. 1, 2017. [Online]. Available: http://www.nytimes.com/2017/09/01/opinion/artificial-intelligence-regulations-rules.htm
2. B. Shneiderman. (2017). "Turing lecture on algorithm accountability," Turing. Accessed on: Jan. 2019. [Online]. Available: www.turing.ac.uk/events/turing-lecture-algorithmic-accountability/
3. S. Spiekermann, "Artificial intelligence: Considering the ethics," Parliament Mag., Nov. 7, 2016. [Online]. Available: https://www.theparliamentmagazine.eu/articles/magazines/ieee-considering-ethics
4. IEEE. (2018). IEEE P7000—Engineering methodologies for ethical life-cycle concerns working group. IEEE. Piscataway, NJ. [Online]. Available: http://sites.ieee.org/sagroups-7000/
5. C. O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Largo, MD: Crown Books, 2017.
6. F. Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information. Cambridge, MA: Harvard Univ. Press, 2015.
7. N. Bostrom, Superintelligence: Paths, Dangers, Strategies. London, U.K.: Oxford Univ. Press, 2014.
8. B. D. Mittelstadt, P. Allo, M. Taddeo, S. Wachter, and L. Floridi, "The ethics of algorithms: Mapping the debate," Big Data Soc., vol. 3, no. 2, 2016.
9. M. J. Kusner, J. Loftus, C. Russell, and R. Silva, "Counterfactual fairness," Advances Neural Inform. Process. Syst., vol. 2016, pp. 4069–4079, 2017.
10. EUR-Lex. (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). European Union. Brussels, Belgium. [Online]. Available: http://eur-lex.europa.eu/eli/reg/2016/679/oj
11. S. Wachter, B. Mittelstadt, and L. Floridi, "Why a right to explanation of automated decision-making does not exist in the general data protection regulation," Int. Data Privacy Law, vol. 7, no. 2, pp. 76–99, 2017.
12. A. Tutt, "An FDA for algorithms," Administ. Law Rev., vol. 69, no. 83, pp. 83–125, 2017.
13. J. Taylor, E. Yudkowsky, P. LaVictoire, and A. Critch, Alignment for Advanced Machine Learning Systems. Berkeley, CA: Machine Intelligence Research Institute, 2016.
14. D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, "Concrete problems in AI safety," 2016. [Online]. Available: arXiv:1606.06565
15. J. Leike et al., "AI safety gridworlds," 2017. [Online]. Available: arXiv:1711.09883
16. S. Hawking, M. Tegmark, S. Russell, and F. Wilczek, "Transcending complacency on superintelligent machines," Huffington Post, June 19, 2014. [Online]. Available: http://www.huffingtonpost.com/stephen-hawking/artificial-intelligence_b_5174265.html
17. S. Russell, D. Dewey, and M. Tegmark, "Research priorities for robust and beneficial artificial intelligence," AI Mag., vol. 36, no. 4, pp. 105–114, 2015.
18. P. Treleaven, R. G. Brown, and D. Yang, "Blockchain technology in finance," Computer, vol. 50, no. 9, pp. 14–17, 2017.
19. Financial Stability Board. (2017). Artificial intelligence and machine learning in financial services. FSB. Basel, Switzerland. [Online]. Available: http://www.fsb.org/2017/11/artificial-intelligence-and-machine-learning-in-financial-service/
20. S. Murphy and C. Cooper, "Can smart contracts be legally binding contracts?" Norton Rose Fulbright, London, U.K., White Paper, 2016.
21. IEEE. (2016). Ethically aligned design: A vision for prioritizing human well-being with artificial intelligence and autonomous systems. IEEE. Piscataway, NJ. [Online]. Available: http://standards.ieee.org/develop/indconn/ec/ead_v1.pdf

ABOUT THE AUTHORS
PHILIP TRELEAVEN is a professor of computing at University College London and director at the U.K. Centre for Financial Computing & Analytics (www.financialcomputing.org). His research interests include data science, algorithms, and blockchain technologies. Treleaven received a Ph.D. from The University of Manchester. He is a Member of the IEEE and the IEEE Computer Society. Contact him at [email protected].
JEREMY BARNETT is a regulatory barrister and sits as a Recorder of the Crown and County Court, St. Paul's Chambers (www.stpaulschambers.com/jeremy-barnett-crime) and Gough Square Chambers. Barnett received his L.L.B. from the University of Liverpool. With a background in advanced computing, he is currently involved in research and development of blockchain and smart contracts. Contact him at [email protected].
ADRIANO KOSHIYAMA is a Ph.D. student at University College London in the Department of Computer Science. Koshiyama received his M.Sc. in electrical engineering from PUC-Rio. His research interests include computational finance and AI. He is a Student Member of the IEEE and the IEEE Computer Society. Contact him at [email protected].

COVER FEATURE HOLDING US TOGETHER

KDD Cup 99 Data Sets: A Perspective on the Role of Data Sets in Network Intrusion Detection Research

Kamran Siddique, Xiamen University Malaysia
Zahid Akhtar, University of Memphis
Farrukh Aslam Khan, Center of Excellence in Information Assurance, King Saud University
Yangwoo Kim, Dongguk University

Many consider the KDD Cup 99 data sets to be outdated and inadequate. Therefore, the extensive use of these data sets in recent studies to evaluate network intrusion detection systems is a matter of concern. We contribute to the literature by addressing these concerns.

Over the last decade, extensive research has been carried out to build efficient network intrusion detection systems (NIDSs) using novel techniques from multiple computing domains. This research is expected to grow continuously due to ongoing advancements in communication technologies and the concomitant rise in the number of network attacks. Knowledge Discovery and Data Mining (KDD) Cup 99 data sets—the most valuable and innovative resource for intrusion detection research—were initially launched in 1998. In this article, KDD refers to the family of data sets that includes DARPA, KDD CUP 99, and NSL_KDD, unless explicitly stated otherwise. These workloads have been extensively utilized, primarily in the fields of intrusion detection, machine learning, and data-stream research.

Digital Object Identifier 10.1109/MC.2018.2888764
Date of publication: 22 March 2019


In this article, we examine the use of KDD99 data sets from the perspective of NIDS evaluation, which generally intersects with machine-learning research. We observed that KDD99 data sets currently have three highly contradictory statuses among the intrusion detection research community, but we see only one of them as justified:

›› The data sets are considered inadequate for NIDS evaluations.1
›› The data sets are considered adequate for NIDS evaluations, that is, by the studies that used these data sets (Table 1).
›› The data sets are publicly available at the University of California, Irvine KDD Archive, but the archival authority does not recommend their use for NIDS evaluation.2

In this study, we advocate the first status, and our main objective is to strengthen academic NIDS research in real-world operational environments. From the perspective of security, it is a natural, complementary process to experience advancements on both sides, i.e., for the attacker and the defender. However, it is hard to comprehend the mechanics and rationale of the defender technologies, which claim to combat modern attacks but are being trained and evaluated using technology that is two decades old! To be more specific, the extensive use of KDD99 collections in recent studies is concerning and raises many questions that need to be analyzed critically. It has recently been reported that 125 research studies published in mainstream journals during 2010–2015 performed NIDS evaluations using KDD99.3 The authors3 did not include conference publications and restricted their search only to quality-assured journals. We believe that the statistics of their systematic analysis from the NIDS perspective raise serious concerns about these data sets, which are nearly 20 years old and have faced several pointed criticisms.4,5 Despite these criticisms, the data sets are often justified as being adequate by scholars seeking to publish their research based on these data sets in mainstream journals and conferences. Table 1 presents the performance evaluation results of representative studies in which these data sets were used. Hereafter, these studies are referred to as the reviewed studies in this article. Because there are a number of differences in the experimental approaches and methodologies used in the studies, we have selected their most common representative attributes in Table 1, such as the algorithms and techniques involved, accuracy, and false-alarm rate.

TABLE 1. Performance evaluations on KDD99 data sets or their subsets.*
Algorithm(s)/technique(s) | Accuracy (%) | False-alarm rate (%) | Year of publication
Multiclass classification multiple criteria linear programming classification | 99.14 | 0.01765 | 2017
k-means and random forest | 99.98 | 0.14 | 2017
Efficient proactive artificial immune-system-based anomaly detection and prevention system | 83.72–100 | 0.68–0.81 | 2016
J48 and REPTree | 99.9 | 0.0002 | 2016
Averaged one-dependence estimators | 96.77 | 0.0323 | 2015
Online sequential extreme-learning machine | 98.66 | 1.74 | 2015
SVM and genetic PCA | 99.96 | 0.49 | 2014
Radial basis function and SVM | 98.46 | NA | 2014
J48, PCA, SVM, self-organizing map | 99.59 | 1.27 | 2013
Weighted k-means and random forest | 98.3 | 1.6 | 2013
Ant colony algorithm and SVM | 98.62 | NA | 2012
Discriminative multinomial naive Bayes and N2B | 96.5 | 3 | 2010
Decision tree | 92.30 | 11.71 | 2007
SVM | 99.6 | 4.17 | 2005
NA: not applicable; PCA: principal component analysis; SVM: support vector machine; N2B: nominal to binary.
*See more details at https://goo.gl/8MQLNC.

The KDD99 data sets have not only been extensively criticized but are also considered to be inadequate and dead by many experts.1,6 More importantly, even the data set archival authority itself strongly discouraged their use in 20072 and recommended that all journals and conferences reject studies outright if the results were solely based on KDD99 data sets. As such, some scholars might suggest that a heavy reliance on the KDD99 data sets brings into question the progress of NIDS research in recent years. Although many researchers have already been convinced by such advice, studies from a large group of researchers show somewhat different understandings, as shown in Table 1. Most reviewed studies stated that they used these data sets because "these are the most widely used benchmark data sets,"3,8,18 "these are the de facto standard data sets,"3 or other noncompelling reasons. Such arguments and practices are far too common across many scientific disciplines, and they may not always be optimal. As Albert Einstein remarked, "What is right is not always popular, and what is popular is not always right."
Despite the seriousness of this issue, the topic has not yet been exclusively and systematically addressed, which is essentially the main cause of using outdated data sets on a massive scale. To the best of our knowledge, this is the first work that contributes to the literature by analyzing this subject. We

›› begin with a brief introduction to the KDD99 data set family
›› identify the issues and consequences associated with NIDS evaluations based on KDD99 data sets
›› discuss the importance of using appropriate data sets
›› highlight some adequate data sets and encourage their use for NIDS evaluations
›› perform an empirical analysis of the complexity of KDD and modern data sets
›› provide recommendations that motivate the community to continuously evaluate the usefulness of data sets to ensure that the right data are being used to solve the right problems and advance science.

THE KDD99 DATA SET FAMILY
In this section, we briefly describe the KDD99 data sets. More detailed information about these data sets can be found in Lippman et al.7 and Tavallaee et al.8 A timeline depicting significant events regarding KDD99 data sets, originating in 1998, is shown in Figure 1.

The DARPA data set
The Massachusetts Institute of Technology (MIT) Lincoln Laboratory, sponsored by DARPA and the Air Force Research Laboratory, generated the first standard corpora for the evaluation of NIDSs in 1998. After these data sets were used for some time, the quickly changing landscape of communication technology demanded updated data sets with more novel attacks (including a Windows NT target) while defining a security policy for the target network. This resulted in the creation of another version of the data set in 1999. These two data sets are referred to as DARPA98 and DARPA99, and they consist of raw tcpdump data (see Figure 1). These workloads contributed significantly to NIDS research due to their objectivity, formality, repeatability, and statistical significance. The DARPA99 data set consists of five weeks of normal and malicious traffic. There exist four types of attacks, i.e., probe, remote to local, denial of service (DoS), and user to root. The data sets are still publicly available for download from the Lincoln Laboratory repository (https://www.ll.mit.edu/ideval/data/index.html).

The KDD Cup 99 data set
The KDD Cup 99 data set stems from DARPA/MIT Lincoln Laboratory packet traces, and it is the most widely used data set for NIDS evaluation. This is a transformed version of the DARPA data containing 41 features that are considered suitable for machine-learning classification algorithms. The data set can be obtained as three partitions: a full training set, a 10% version of the training set, and a test set (https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html). Aside from the four categories of attacks within the DARPA data set, 17 new attacks were added in the test data. Because it is considered a large data set for most machine-learning algorithms, many researchers prefer to use sampled data. Moreover, the records duplication in both training and test data can produce biased results for frequent instances, and overcoming such issues led to the generation of the NSL_KDD data set.
The NSL_KDD data set
The NSL_KDD, from the Information Security Center of Excellence (ISCX), University of New Brunswick (UNB), is another distilled version of the KDD Cup 99 data sets. It was created in 2009 with the main aim of resolving the issue of redundant records found in KDD Cup 99 data sets,8 in which the ratio of duplicate records in the training and testing data was reported as being 78% and 75%, respectively. Such huge redundancy may cause learning algorithms to produce biased evaluation results and, in turn, prevent them from learning infrequent records. After cleaning and resampling, the resultant data set consisted of 125,973 and 22,544 records for training and testing, respectively. This is derived from the original 4,900,000 and 2,000,000 records, respectively, in the KDD Cup 99 data sets.


[Figure 1 charts a timeline of the KDD99 data-set family from 1997 to 2018: raw tcpdump traces produced DARPA987 and DARPA99;7 features were then extracted for classification with machine-learning algorithms to create KDD Cup 99 (1999); and duplicate removal with size reduction produced NSL_KDD8 (2009). The timeline also marks notable criticism (McHugh,4 Brugger and Thomas,20 Chow et al.,19 Sommer and Paxson,1 and Moustafa and Slay6), the UCI lab's 2007 warning and recommendation not to use the KDD Cup 99 data set,2 and the data sets' continued use, with 125 studies in SCIE/ESCI-indexed journals alone.]

FIGURE 1. A timeline of the KDD99 data sets. TCP: Transmission Control Protocol; SCIE: Science Citation Index Expanded; ESCI: Emerging Science Citation Index.

ANALYZING THE EVALUATIONS BASED ON KDD99
The DARPA and KDD Cup 99 data sets served as the first systematic approach to NIDS data generation, and they were an incredibly innovative and valuable resource when released in 1998 and 1999, respectively. However, these data sets do not inclusively reflect current network traffic trends and modern footprint attacks. For instance, the DoS attack types in KDD99 include land, Neptune, back, Smurf, teardrop, and pod attacks; none of these are seen in current network patterns. In such a situation, it is prudent to ask, "Are we training our intrusion detection systems (IDSs) to combat historical attacks?"
On the contrary, one may argue that if the classification module of an IDS is strong (e.g., a multiagent-based IDS or deep neural networks) and trained on KDD99, it will also be able to detect modern network traffic attacks without a drop-off in performance. In practice, there are very few such evaluations, e.g., the study by Sadhasivan and Balasubramanian,9 in which the system has been trained and tested independently on both the KDD99 data sets and a real-time data set called supervisory control and data acquisition, which contains comparatively recent and diverse attack vectors. Moreover, to the best of our knowledge, studies on the cross–data-set setting have not received much attention. This process, in which the system is trained on one data set and tested on another one, is an additional way to evaluate the interoperability and generalization capability of an IDS, which is important to real-world applications. However, most existing IDSs are either prone to overfitting or have low generalization/interoperability because they are often based on a priori known attacks. Developing cross–data-set methods that detect varying or previously unseen attacks, which is much more practical, is essential.
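The cross–data-set protocol is straightforward to express in code. The sketch below (file names, feature handling, and preprocessing are placeholders; the two sets are assumed to share a comparable feature representation and a binary label column) trains a classifier on one labeled data set and reports its accuracy on a different one, rather than on a held-out split of the same workload:

```python
# Sketch of a cross-data-set evaluation: train on one labeled data set, test on another.
# File names and columns are hypothetical; 'label' is assumed to be 0 (benign) or 1 (malicious).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

train_set = pd.read_csv("dataset_A.csv")     # e.g., a legacy workload
test_set = pd.read_csv("dataset_B.csv")      # e.g., a contemporary workload

features = [c for c in train_set.columns if c != "label"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(train_set[features], train_set["label"])

# Accuracy on the *other* data set indicates generalization beyond the training workload.
predictions = model.predict(test_set[features])
print("cross-data-set accuracy:", accuracy_score(test_set["label"], predictions))
```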

As shown in the reviewed studies on building IDSs, we found that more attention is given to devising new algorithms and techniques, conducting comparative evaluation, achieving better numerical results, and other aspects of the research than on clearly examining the suitability and function of the data sets. Regardless of the factors and constraints, they need to be reconsidered because the utilization of an inadequate data set does not contribute to IDS research even if the proposed solution reliably detects attacks while showing the best possible capabilities. For instance, Table 1 shows that the performance results of almost all IDSs against KDD99 are remarkable, but it is unlikely that such systems would perform so efficiently in real-world environments. Even the results become mostly irrelevant when cross-validated with synthetic contemporary workloads.6 This is critical because IDSs were developed with the main objective of coping with trending challenges and attacks. In fact, the evaluation results based on KDD99 have become imbued with high accuracy, thereby not only giving a false sense of progress but also impairing the ability to explore further knowledge horizons.
A recent cross-validation comparative study6 demonstrates that the performances of algorithms verified using KDD99 are severely degraded when evaluated on modern data sets. Specifically, the performances of five machine-learning algorithms (decision tree, linear regression, naive Bayes, artificial neural network, and expectation maximization) were compared using the KDD99 data set and a contemporary data set, UNSW-NB15 (https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-NB15-Datasets/). All five algorithms performed well on KDD99 (similar to the results shown in Table 1) but were considerably poorer on the newer data set. This is mainly because of the upgraded testbed architecture and the greater complexity introduced in the UNSW-NB15 data set. Similarly, the degraded performance of state-of-the-art machine-learning techniques on some other new data sets clearly indicates that a modern environment is much more challenging than the one represented by the older data set(s).10,11 This suggests that methods working almost perfectly on the de facto data sets are no longer relevant as they stand. Although accelerating the development of innovative algorithmic methods has its own importance, these methods will only contribute and be beneficial if given appropriate input.

APPROPRIATE DATA SET(S) FOR IDS EVALUATION
Current technology has evolved tremendously since the days when KDD99 data sets were launched. Innovative developments have greatly changed the basic nature of computer systems, and operating systems have gone through many fundamental revisions. For instance, the development of sophisticated network protocols and the widely adopted enhancement of 32-bit architecture to 64 bits have had a profound effect on system communications. These advancements have resulted in bringing a fundamental change in attack methodologies as well. For example, in KDD-era attacks, usually only a single process was targeted, leading to a bigger information presence in its system call trace. Consequently, IDS schemes optimized on these data sets attained monumental accuracy by leveraging any such features. Although present-day attacks no longer follow this pattern, evidence of them and their activities is spread among numerous system call traces. Thus, clustering algorithms perform poorly on such data sets because of the similarity between attack and normal patterns. This is very close to the contemporary hacking attacks that are launched as distributed patterns through multiple processes to compromise the system. The current real-world environment is much more challenging than the ones depicted by the outdated data sets.
KDD Cup 99 and NSL_KDD, the derived versions of DARPA data sets, are generally considered as an improvement when compared with their predecessors. However, the actual tradeoffs presented by such modifications are rarely investigated. The transformation from DARPA to KDD Cup 99 not only transposed the same serious flaws, but the extracted features also appeared to produce unfortunate artifacts.12 Similarly, obtaining the NSL_KDD data set via cleaning and resampling KDD Cup 99 further affected its originality. This modification destroyed the natural order of packet flow, which resulted in reducing data strength and often restricted researchers from devising novel solutions.13 Although these modified versions may be considered to work with basic machine-learning algorithms and similar techniques, they are undeniably not suitable for use as an evaluation benchmark.

THE QUEST FOR BETTER ALTERNATIVES
Arguably, the scarcity of adequate data sets is one of the major research challenges that compelled scholars to use the KDD99 data sets for several years. Nevertheless, the current availability of some better alternatives has countered every single reason to justify their use. Table 2 compares the KDD99 data sets with some adequate alternatives and highlights their appropriateness based on the characteristics used for qualifying data sets.14 We have attempted to include only those data sets that have been found to be better than the legacy workloads, and we briefly discuss them before reporting our experimental evaluation and giving some recommendations.


TABLE 2. A comparison of DARPA/KDD Cup 99 data sets with some repositories selected based on the key characteristics of an adequate data set.

Data set(s) | Content | Realistic network configuration | Realistic traffic | Labeled | Diverse attacks | Complete interaction capture | Anonymized
DARPA/KDD Cup 99 | Mixed | Yes | No | Yes | Yes | Yes | No
NGIDS-DS | Mixed | Yes | Yes | Yes | Yes | Yes | No
ISCX-UNB* | Mixed | Yes | Yes | Yes | Yes | Yes | No
TUIDS* | Mixed | Yes | Yes | Yes | Yes | Yes | No
UNSW-NB15 | Mixed | Yes | Yes | Yes | Yes | Yes | No
MAWILab | Mixed | Yes | Yes | Yes | Yes | Yes | Yes

Attribute | Description
Content | Network traffic is generally categorized as normal (benign) or malicious and is called mixed if it consists of both. The intrusion detection evaluations were generally performed using mixed contents.
Realistic network configuration | A network configuration may be real, simulated, or emulated. The configuration is called realistic if the process of setting a network's control, flow, and operation is performed realistically rather than simulated or emulated.
Realistic traffic | Network traffic may or may not be realistic. A trace is considered to be realistic if it is captured during the regular operations of the network and has not been altered after capture.
Labeled | The traces may or may not be labeled (often called the ground truth). Labeling is carried out to distinguish malicious activity from normal traffic to perform evaluation operations.
Diverse attacks | This refers to the attack vector or simply how many types of attacks the data set contains. It is preferable to have a diverse family of recent attacks in the obtained data sets.
Complete interaction capture | The value of this attribute indicates whether the traces contain complete information or data about all network interactions performed, as the information is useful for postevaluation operations.
Anonymized | The traces may or may not be anonymized, primarily as a result of privacy issues. Anonymized traces may lack information that is considered critical for the IDS.

*These data sets can be obtained by establishing email contact with the concerned authorities.

The next-generation IDS data set
The next-generation IDS data set (NGIDS-DS) has been recently generated in the next-generation cyber range infrastructure of the Australian Defense Force Academy, Canberra.15 Before data set generation, an evaluation metric based on a fuzzy logic system was proposed to provide a theory for evaluating the quality of realism of any IDS data set. It facilitates the process of selecting or generating a realistic data set for the design, development, and validation of a reliable IDS. Based on the proposed metric, the researchers generated the synthetically realistic NGIDS-DS using IXIA PerfectStorm hardware in conjunction with some commercial cybersecurity-test hardware platforms. The data set has a medium-high quality of realism, and an empirical evaluation demonstrates that it represents both the normal traffic dynamics and realistic attack behaviors of real-world networks. In the section "Experimental Evaluation," we used this data set to perform a comparative empirical evaluation with potentially outdated data sets to further emphasize the significance of using appropriate data sets.

The ISCX-UNB data set
The ISCX-UNB data set was built on the concept of profiles to reflect network traffic and intrusions,14 and it has become the benchmark data set obtained in seven days under practical and systematic conditions. Traces were collected using a real-time testbed by incorporating multistage attacks. The concept of profiles (i.e., α and β) was introduced to facilitate the reproduction of certain real-world events and behaviors seen on networks. The underlying notion of introducing profiles was to overcome the shortcomings of static data sets that suffer from various problems, such as being outdated, modifiable, inextensible, and irreproducible. Furthermore, various multistage attack scenarios were executed to generate malicious traces, such as infiltrating the network from the inside, HTTP, DoS, distributed DoS (DDoS) with an Internet relay chat (IRC) botnet, and brute-force secure shell (SSH). The data set is a significantly suitable representative as an adequate data set (as shown in Table 2) and a good replacement for legacy workloads.

The TUIDS data set
The Tezpur University Intrusion Detection System (TUIDS) data set was generated by the Network Security Lab at Tezpur University, India.16 An organized approach was followed to generate realistic network traffic by using both packet- and flow-level information. The data set is available in three categories: the TUIDS intrusion detection, TUIDS coordinated scan, and TUIDS DDoS data sets. Normal traffic was generated based on the daily activities of users, and malicious traffic was generated by launching attacks within the testbed environment. The generation process involved six attack scenarios, i.e., DoS using targa, probing using nmap, coordinated scanning using rnmap, user to root using brute-force SSH, DDoS using an agent-handler network, and DDoS using an IRC botnet. The scenarios launched 28 distinct attack types, among which 22 attack types were used for TUIDS intrusion detection, six for coordinated scanning, and six for the DDoS category. Table 2 shows that the TUIDS and ISCX-UNB data sets have similar characteristics, but unlike ISCX-UNB, which covered only packet-level traces, TUIDS provided both packet- and flow-level data.

The UNSW-NB15 data set
The researchers at the Australian Centre for Cyber Security at the University of New South Wales (UNSW) generated this data set, which consists of real-life, modern, normal, and synthesized attack activities.6 It represents nine major attack families (i.e., fuzzers, analysis, backdoors, DoS, exploits, generic, reconnaissance, shellcode, and worms) by utilizing the IXIA PerfectStorm platform. It has a rich set of 49 extracted features that have been developed with the Argus and Bro-IDS frameworks. Like the DARPA and KDD Cup 99 data sets, it was also created by establishing a synthetic environment. Nonetheless, the representation of contemporary traffic patterns makes it a far better choice than KDD99.

The MAWILab data set
Measurement and Analysis on the Wide Internet (MAWI) is a public collection of 15-min-long network traffic traces captured every day on a backbone link between Japan and the United States since 2001 (http://mawi.wide.ad.jp/mawi/), and the MAWI repository currently contains more than 16 years of traffic. This data set is not only a valuable resource to study and analyze the evolution of Internet traffic but also highlights the longitudinal characteristics of network traffic. Regarding intrusion detection research, it allows the examination of short- and long-lasting anomalies that have appeared since 2001. Moreover, MAWI traffic monitors several hundred thousand Internet Protocol addresses daily using numerous applications. The resultant traces thereby sense diverse anomalies ranging from well-known to other unknown malicious activities that are either hidden or still emerging. Although MAWI is an excellent source of contemporary network traffic information and could be very useful for examining how well a proposed IDS will work in practice, the lack of ground truth data makes the evaluation of anomaly detectors challenging. Moreover, having anonymized payloads is another drawback. Despite great efforts, the ground truth of MAWI traces is still questionable because it was obtained by combining the results of four unsupervised network anomaly detectors.17

EXPERIMENTAL EVALUATION
In this section, we describe an experimental evaluation based on the premise that the KDD99 data sets are outdated and do not contribute to NIDS research, even if the proposed system shows significant performance results. We analyzed the performance of widely used machine-learning algorithms on KDD99 data sets and a representative modern data set from among those listed in Table 2, i.e., the NGIDS-DS from UNSW at the Australian Defence Force Academy.15 We took into account one of the studies18 given in Table 1 to conduct comparative evaluation. The development environment and the definition of the algorithms, i.e., support vector machines (SVMs), random forest, J48, and REPTree, have been adopted from Rathore et al.18

Data set and experimental setup
The NGIDS-DS data set has more than 90 million records in comma-separated values format, including 88,791,734 and 1,262,426 records for benign and malicious activities, respectively. Each data set file has nine attributes: date, time, pro_id, path, sys_call, event_id, attack_cat, attack_subcat, and label, which indicates 0 for benign and 1 for malicious. The details of the training and testing sets for the selected machine-learning algorithms are given in Table 3. The experiments were conducted on an Intel Core i7-6500U CPU at 2.5 GHz with a 512-GB solid-state drive and 8 GB of random-access memory, with Apache Hadoop version 2.9.0 installed on Ubuntu 14.04.

TABLE 3. The distribution of training and testing data sets.
Data sets | Data files | Number of features/attributes | Benign | Malicious
Training | 1–59 | Nine | 62,484,485 | 701,507
Testing | 60–99 | Nine | 34,434,334 | 553,159

Evaluation metrics
The complexity of the data sets was measured using the following metrics.

›› Accuracy has traditionally been considered the most important performance evaluation metric. It is expressed as the percentage of true predictions:

Accuracy = (TP + TN) / (TP + TN + FP + FN),

where true positive (TP) is the number of intrusions or anomalies identified correctly, true negative (TN) is the number of normal network traces identified correctly, false positive (FP) is the number of normal traces incorrectly classified as intrusions, and false negative (FN) is the number of malicious traces incorrectly classified as normal—the most serious and dangerous state for an IDS.
›› False-positive rate (FPR) is the ratio of the number of normal traces incorrectly classified as intrusions to the total number of normal traces.
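As a quick illustration of these two metrics (this sketch is ours, not part of the original study, and the confusion counts are hypothetical), the following Python lines compute accuracy and FPR directly from the TP/TN/FP/FN definitions above.

# Illustrative only: compute accuracy and false-positive rate (FPR)
# from raw confusion counts; variable names and counts are hypothetical.

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all traces (normal and malicious) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of normal traces incorrectly flagged as intrusions."""
    return fp / (fp + tn)

tp, tn, fp, fn = 520_000, 34_000_000, 430_000, 33_000
print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"FPR      = {false_positive_rate(fp, tn):.4f}")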

RESULTS AND DISCUSSION
Table 4 presents the comparative evaluation results for the KDD99 and NGIDS-DS data sets for a few well-known machine-learning algorithms.

TABLE 4. Comparative performance evaluation of KDD99 data sets and a representative modern data set.
Machine-learning algorithm/technique | KDD99 data sets: TP (%) | KDD99 data sets: FP (%) | NGIDS-DS data set: Accuracy (%) | NGIDS-DS data set: False-alarm rate (%)
SVMs18 | 95.93 | 0.001 | 76.57 | 17.59
Random forest18 | 99.57 | 0.0007 | 81.23 | 14.84
J4818 | 99.90 | 0.00003 | 79.61 | 20.12
REPTree18 | 99.90 | 0.0002 | 73.54 | 19.42

It can be seen that J48 shows the highest accuracy, in terms of TP %, of 99.90 with an FPR of 0.00003 for the KDD99 data sets, whereas random forest achieves the highest accuracy on NGIDS-DS, 81.23, with a false-alarm rate of 14.84. A typical interpretation of these results is that KDD99 performs better than NGIDS-DS in terms of accuracy for all four algorithms; however, this scenario leads to a somewhat different perspective by revealing the inefficiencies of the KDD99 data sets. In fact, KDD99 is a less complex data set that includes a small number of high-footprint attacks and reflects easily differentiable normal computer activities. NGIDS-DS is a complex data set that contains a variety of modern, low-footprint attacks and reflects normal computer activities. The complexity differences not only caused traditional machine-learning techniques to degrade in accuracy but also generated more false alarms. There is a vast overlap between the patterns of normal and attack observations, which adds another dimension to the complexity of the NGIDS-DS data set. This particular feature is a very common trait in modern hacking attack samples that gives a distributed approach to system compromise by spreading the activity through multiple processes.

Machine-learning techniques generally rely on a frequency analysis to provide discriminative features;10 they thus perform poorly on complex data sets (e.g., NGIDS-DS) because of the similarities between attack and normal data, along with the loss of all positional data of the system call traces. Nonetheless, these algorithmic techniques are able to attain high classification accuracy on the KDD99 data set because it is limited to a single process that creates a larger and more detectable system call footprint. Specifically, the results clearly indicate that the traditional features and methods that have been selected and perfected for intrusion detection on the KDD99 data sets are no longer relevant or adequate to capture the nature of attacks, which originate from modern frameworks and are most likely to evolve as time passes.

Therefore, the NGIDS-DS data set is regarded as complex15 because of the similar behaviors of modern attacks and normal network traffic. Thus, use of the NGIDS-DS data set is advised for reliably evaluating the existing and novel techniques of NIDSs. All in all, the poor and rich performances on the NGIDS-DS and KDD99 data sets, respectively, suggest that to attain higher-order accuracy on NIDSs, researchers must first acknowledge that the modern environment is much more challenging than the KDD99 data set. Second, researchers must be encouraged to actively search for and devise either a new, richer data feature or a new decision engine on which to base the next generation of NIDSs.

If we were dealing with a situation where we could make only one recommendation on how to select an appropriate data set, it would be this: select the one that best matches current network traffic patterns and system architectures. We stress that this should serve as a prerequisite condition for qualification of the data set because it dominates almost all other characteristics of an adequate data set. In other words, if a data set is not representative of current system architectures and network traffic patterns, it is likely an unsuitable resource for NIDS evaluation. This can be clearly observed in the case of the DARPA/KDD Cup 99 data sets; they still hold almost all of the important characteristics, i.e., they have a realistic network configuration, contain a labeled data set, are nonanonymized, exhibit complete interaction capture, and contain diverse attack types. However, the inapplicability of our proposed prerequisite condition makes these data sets impractical.

Despite the importance of using appropriate data sets, there will always be analysis deficiencies because no data set is perfect. However, attempts must be made to select the best data set possible. Building on the points addressed in this article, we primarily recommend working with real-life traffic data, such as those provided by MAWILab. Although issues such as the missing ground truth and anonymization may exist in them (as pointed out earlier), they could still be useful in determining realistic IDS behavior and, possibly, achieving cross–data-set validation standards. In addition, considering extant academic IDS research constraints, we suggest using the publicly available data sets described in the section "The Quest for Better Alternatives" or similar workloads, which adequately satisfy the qualifying characteristics outlined in Table 2 and modern trends.

Finally, we strongly suggest that the NIDS community should build a consensus on the need to discard future IDS studies that rely solely on evaluation with KDD99. It would be more effective if this screening were done by the editorial boards of journals and conference committees rather than expecting a consolidated response at the research-field level. This will encourage the use of appropriate data sets and will ultimately be useful to strengthen academic NIDS research under real-world operational environments. The progress of NIDS research could be much better now if the same had been done in 2007 on the recommendation by Terry Brugger at the University of California.2 To avoid spinning our wheels for another decade, we must lay the KDD Cup 99 data set and all its variations to rest, with a 21-gun salute to symbolize the importance of these data sets in shaping the foundation of NIDS research.

ACKNOWLEDGMENTS
This research was supported by the Ministry of Science, Information and Communications Technology, Korea, under the Information Technology Research Center support program (IITP-2018-2016-0-00465) supervised by the Institute for Information and Communications Technology Promotion.

REFERENCES
1. R. Sommer and V. Paxson, "Outside the closed world: On using machine learning for network intrusion detection," in Proc. 2010 IEEE Symp. Security and Privacy, Washington, D.C., 2010, pp. 305–316.
2. T. Brugger, "KDD Cup '99 dataset (network intrusion) considered harmful," KDNuggets, no. 18, item 4, Sept. 15, 2007. Accessed on: July 28, 2017. [Online]. Available: http://www.kdnuggets.com/news/2007/n18/4i.html

3. A. Özgür and H. Erdem, "A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015," PeerJ Preprints, vol. 4, p. e1954v1, Apr. 2016.
4. J. McHugh, "Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory," ACM Trans. Inform. Syst. Security, vol. 3, no. 4, pp. 262–294, 2000. doi: 10.1145/382912.382923.
5. P. Gogoi, M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, "Packet and flow based network intrusion dataset," in Proc. Int. Conf. Contemporary Computing, 2012, pp. 332–334.
6. N. Moustafa and J. Slay, "The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set," Inform. Security J.: A Global Perspective, vol. 25, no. 1–3, pp. 18–31, 2016. doi: 10.1080/19393555.2015.1125974.
7. R. P. Lippman, J. W. Haines, D. J. Fried, J. Korba, and K. Das, "The 1999 DARPA off-line intrusion detection evaluation," Comp. Networks, vol. 34, no. 4, pp. 579–595, 2000. doi: 10.1016/S1389-1286(00)00139-0.
8. M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, "A detailed analysis of the KDD Cup 99 data set," in Proc. Second IEEE Symp. Computational Intelligence for Security and Defense Applications (CISDA '09), Ontario, Canada, 2009, pp. 53–58.
9. D. K. Sadhasivan and K. Balasubramanian, "A fusion of multiagent functionalities for effective intrusion detection system," Security Commun. Networks, Art. no. 6216078, Jan. 2017. doi: 10.1155/2017/6216078.
10. G. Creech and J. Hu, "Generation of a new IDS test dataset: Time to retire the KDD collection," in Proc. 2013 IEEE Wireless Communications and Networking Conf. (WCNC), Shanghai, China, 2013, pp. 4487–4492.
11. A. Vasudevan, E. Harshini, and S. Selvakumar, "SSENet-2011: A network intrusion detection system dataset and its comparison with KDD Cup 99 dataset," in Proc. 2011 Second Asian Himalayas Int. Conf. Internet (AH-ICI), 2011, pp. 1–5. [Online]. Available: https://ieeexplore.ieee.org/document/6113948
12. M. V. Mahoney and P. K. Chan, "An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection," in Recent Advances in Intrusion Detection, G. Vigna, E. Jonsson, and C. Kruegel, Eds. Berlin: Springer, 2003, pp. 220–237.
13. A. Iorliam, S. Tirunagari, A. T. S. Ho, S. Li, A. Waller, and N. Poh, "'Flow size difference' can make a difference: Detecting malicious TCP network flows based on Benford's law," 2016. [Online]. Available: https://arxiv.org/abs/1609.04214

14. A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, "Toward developing a systematic approach to generate benchmark datasets for intrusion detection," Comput. Security, vol. 31, no. 3, pp. 357–374, 2012. doi: 10.1016/j.cose.2011.12.012.
15. W. Haider, J. Hu, J. Slay, B. P. Turnbull, and Y. Xie, "Generating realistic intrusion detection system dataset based on fuzzy qualitative modeling," J. Network Comput. Appl., vol. 87, pp. 185–192, June 2017. doi: 10.1016/j.jnca.2017.03.018.
16. M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, "Towards generating real-life datasets for network intrusion detection," Int. J. Network Security, vol. 17, no. 6, pp. 683–701, 2015.
17. R. Fontugne, P. Borgnat, P. Abry, and K. Fukuda, "MAWILab: Combining diverse anomaly detectors for automated anomaly labeling and performance benchmarking," in Proc. 6th Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT 10), Philadelphia, PA, 2010.
18. M. M. Rathore, A. Ahmad, and A. Paul, "Real time intrusion detection system for ultra-high-speed big data environments," J. Supercomputing, vol. 72, no. 9, pp. 3489–3510, 2016. doi: 10.1007/s11227-015-1615-5.
19. S. T. Brugger and J. Chow, "An assessment of the DARPA IDS evaluation dataset using Snort," Dept. Comput. Sci., Univ. California, Davis, Rep. CSE-2007-1, 2005.
20. C. Thomas, V. Sharma, and N. Balakrishnan, "Usefulness of DARPA dataset for intrusion detection system evaluation," in Proc. SPIE—The Int. Society Optical Engineering, vol. 6973, p. 69730G, 2008.

ABOUT THE AUTHORS

KAMRAN SIDDIQUE is a research assistant professor at Xiamen University Malaysia. His research interests include cybersecurity, machine learning, and big data processing. Siddique received a Ph.D. in computer engineering from Dongguk University. Contact him at [email protected].

ZAHID AKHTAR is a research assistant professor at the University of Memphis. His research interests include machine learning and computer vision image processing with applications in biometrics, cybersecurity, and multimedia quality assessment. Akhtar received a Ph.D. in electronic and computer engineering from the University of Cagliari. He is a member of the IEEE Signal Processing Society. Contact him at [email protected].

FARRUKH ASLAM KHAN is an associate professor at the Center of Excellence in Information Assurance, King Saud University. His research interests include cybersecurity, body-sensor networks and e-health, bioinspired and evolutionary computation, and the Internet of Things. Khan received a Ph.D. in computer engineering from Jeju National University. He is a Senior Member of the IEEE. Contact him at [email protected].

YANGWOO KIM is the corresponding author for this article and a professor at Dongguk University. His research interests include parallel and distributed processing systems, cloud computing, and peer-to-peer computing. Kim received a Ph.D. from Syracuse University. Contact him at [email protected].



COVER FEATURE

Rack-Scale Capabilities: Fine-Grained Protection for Large-Scale Memories

Kirk M. Bresniker and Paolo Faraboschi, Hewlett Packard Labs
Avi Mendelson, Technion
Dejan Milojicic, Hewlett Packard Labs
Timothy Roscoe, ETH Zurich
Robert N.M. Watson, University of Cambridge

Rack-scale systems with large, shared, disaggregated, and persistent memory need solid protection and authorization techniques. Our solution uses a memory-side capability enforcement processor that gates memory accesses through extended capabilities, enables fine-grained access control beyond a single address space, and minimally disrupts the programming model.

At the crossover point between technology adoption curves (such as persistent memory, rack-scale disaggregated memory, and memory semantics fabrics), it is vital to understand the benefits of defying conventions. In this article, we examine the ramifications of persistent fabric-attached memories (FAMs) at the rack scale and how approaches that predate memory paging, such as capabilities, could enable a robust and scalable protection mechanism.

Memory has always been considered a scarce resource that needs to be shared among multiple programs. Since the inception of virtual memory in the 1960s,1 operating systems (OSs) have overcome the limitations of small physical memory by a variety of mechanisms to give users and programs the perception of unlimited memory. Virtual memory was—and remains today—a powerful mechanism with which to optimize memory allocation, simplify addressing, manage fragmentation, and allow oversubscription by means of paging out to slower media and paging in when needed.

52 COMPUTER PUBLISHED BY THE IEEE COMPUTER SOCIETY 0018-9162/19©2019IEEE allow oversubscription by means of tection, privileged execution of code certain workloads experience huge paging out to slower media and pag- may be achieved simply by gaining ac- performance benefits from superpages, ing in when needed. Over time, these cess to a piece of the memory address but others suffer—in practice, the shift operations have become so important space. For small page sizes this has is toward greater flexibility in the size that hardware support has appeared (historically) been considered an ac- of translation units within the con- in all modern processors in the form of ceptable compromise across security, straints of different page sizes.2 caches of the virtual memory tables, or performance, and hardware complex- Paging was not the only memory translation look-aside buffers (TLBs). ity considerations. protection concept developed in the Once the OS and hardware manage However, the technology has sub- 1960s. Segmentation and capabili- virtual memory as fixed size pages and stantially changed. Individual com- ties3 are two alternative approaches include all of the necessary structures puters can afford terabytes of physi- that support variable-size memory for virtual-to-physical address transla- cal memory, and rack-scale systems units, from a single byte to the whole tion, it becomes natural to extend these federating hundreds of elements are address space. Both approaches can tables to capture related concepts, such approaching petabytes. At this scale, coexist with paging, and, for a while, as access protection. For security, pri- organizing memory in kilobyte-sized some processors supported segmen- vacy, and error-containment reasons, pages requires billions of pages, and tation, while other systems sup- not all programs are allowed to access the overhead to manage the map- ported capabilities. all memory pages. Access-right infor- pings does not scale, as page tables and Capabilities are particularly rel- mation (typically read, write, or exe- page-table walks overflow TLBs and evant to this discussion: they are cute privileges) is stored as metadata caches. At the same time, this abun- unforgeable tokens of authority used associated with a page, cached in the dance of memory removes the orig- to protect memory at a fine granulari- TLB, and checked at page granular- inal motivation for virtual memory ty, down to a single-byte location. For ity. This arrangement makes the pro- (paging to disk), and most programs a full implementation, they require tection check fast because it is part of keep all of the data in memory because processor instruction set architecture the translation process of each mem- of performance issues. As a conse- (ISA) support to keep extra information ory-access instruction. It also enables quence, the trend is to shift toward large (hidden to application programming) precise exceptions to generate accurate (1–2 MiB) or huge (1–4 GiB) page sizes, associated with memory addresses notifications about the nature of the so that applications can allocate most (i.e., pointers) stored in registers and protection violation. This is import- of their working set right away, allow- memory. 
A capability-enhanced CPU ant to implement functionality, such ing page-table overhead to scale with can check this information upon every as on-demand paging, shared libraries, memory-size growth and move the OS individual memory access to ensure or copy-on-write, and it requires the overhead out of the way. that the access is allowed. The check ability to precisely restart a faulting Unfortunately, larger pages increase can be extended to manipulate the ca- memory instruction after an exception security risks and can expose the pabilities themselves, such as secure- occurs. However, it is a compromise page to errors (or malicious attacks) ly storing (and retrieving) them to because programs would naturally like because any address within a page can memory, while preventing access from to expose a different, often finer, gran- be accessed without additional fine- unauthorized code. ISA-supported ca- ularity protection, possibly at the indi- grained control. More importantly, the pabilities can be passed in user space vidual-object level, and not be tied to underlying problem comes from the without performance costs, but they an arbitrary page size. bundling of the two key memory con- require invasive hardware changes to Page-level protection also creates cepts, translation and protection, in the the memory hierarchy, the microarchi- the opportunity for malicious ex- same page structure. This is becoming tecture (e.g., extending the register file ploits, such as buffer and stack over- a primary cause of tension in the OS: and caches to store the metadata), and flows, when multiple tenants share the needs of a large translation unit the ISA itself. The software stack also a single page or execute code in a and a small protection granularity are needs to change to maintain and uti- shared library. Because all addresses fundamentally incompatible, and we lize capabilities effectively when describ- within a page inherit the same pro- need a different approach. In addition, ing data and code structures.
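To make the earlier page-count argument concrete, here is a small back-of-the-envelope calculation (ours, not the authors'); it assumes a conventional 8-byte page-table entry (PTE) and simply counts pages and PTE storage for different memory and page sizes.

# Illustrative arithmetic (ours): number of pages and raw PTE storage
# needed to map a given amount of memory, assuming 8-byte PTEs.

def pages_and_table_size(mem_bytes: int, page_bytes: int, pte_bytes: int = 8):
    pages = mem_bytes // page_bytes
    return pages, pages * pte_bytes

TiB, PiB = 2**40, 2**50
for label, mem in [("1 TiB", TiB), ("1 PiB", PiB)]:
    for page_bytes, pname in [(4 * 2**10, "4 KiB"), (2 * 2**20, "2 MiB"), (1 * 2**30, "1 GiB")]:
        n, size = pages_and_table_size(mem, page_bytes)
        print(f"{label}, {pname} pages: {n:,} pages, ~{size / 2**20:,.0f} MiB of PTEs")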


CHERI CAPABILITIES
Capability Hardware Enhanced RISC Instructions (CHERI) (www.cheri-cpu.org) is an example of an ISA-supported capability implemented as extended, or safe, pointers4 compatible with off-the-shelf software. Simple pointers are references to memory locations, and they contain (virtual) addresses. Capabilities are extended pointers that contain base, offset, length, and protection bits (Figure 1). Length defines the address range that the capability can access, counting from the base address. Offset represents the individual memory access target (the virtual address to access memory through the capability is base plus offset). The protection bits grant read, write, and execute permissions. A process owning a capability can derive other capabilities with reduced rights, in terms of space or access. This allows a process to subdivide a capability to provide access to only a subset of the initial address range, or to remove rights, such as execute or write (maintaining the monotonicity of capability derivation). The software tool chain (compiler and linker) and programmers can selectively manage access to memory regions by passing to other programs capabilities that refer to a region subset or limited access rights. For example, a memory manager can hand out access to parts of a memory buffer to clients, or a server can provide write access to only a single writer, while allowing other clients only read access.

Supporting capabilities requires changing the ISA and microarchitecture. CHERI extends capability registers to access memory and adds the supporting enforcement logic. Enforcement compares the contents of capability registers with the attempted access after the capability has been manipulated through typical pointer arithmetic operations. Capabilities are enforced on data access to support passive data checks and on instruction execution (e.g., procedure call/return, jumps) for active objects and compartmentalization. CHERI also adds privileged instructions to store/load to/from memory using capabilities. To prevent processes from forging capabilities stored in memory, a tag bit is maintained for each capability in memory, which is propagated through caches and the TLB into the capability registers. Any attempt to modify a memory location containing capabilities by unauthorized code clears the capability bit and effectively invalidates the capability, preventing it from accessing data. The tag enforces noncorruption and ensures valid provenance.

CHERI capabilities double the size of pointers from 64 to 128 bits (plus one tag bit). Using capabilities on a single node requires small changes to software and limited changes to the OS. Most changes can be hidden in libraries or directly implemented by the compiler and tool chain.

CAPABILITY ENFORCEMENT ACCELERATORS
CPUs and ISAs are evolving slowly. It takes several years for a new ISA feature to be implemented and even longer to reach the market, be supported by an industry-standard OS, and, finally, be adopted by application developers. ISA-supported capabilities are no exception. Moving some of the ISA support into a separate system (outside the CPU) could lower the adoption barrier.

Furthermore, ISA-supported capabilities exist within a single virtual address space. Sharing across address spaces (or persistent memory) requires additional OS support and incurs performance costs in crossing OS boundaries. This eliminates the performance advantage of ISA-supported capabilities in the user space when dealing with multiple processes or OSs at the rack scale.

The alternative to CPU-supported capabilities is a dedicated external component. In this case, we propose a memory-side capability-enforcement processor (CEP), a hardware controller (also called an accelerator) interposed on the load/store path between the CPU and the memory.


FIGURE 1. The format of CHERI capabilities compared with a simple pointer and (micro)architecture changes. IF: instruction fetch; ID: instruction decode; EX: execute; MEM: memory access; ALU: arithmetic logic unit.
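The check implied by this capability format is simple to model in software. The following Python sketch (ours; CHERI performs this in hardware on every access, and the field and method names here are illustrative) shows the bounds/permission check and the monotonic derivation of a reduced-rights capability described above.

# Simplified software model (ours) of the capability format in Figure 1.
from dataclasses import dataclass

READ, WRITE, EXECUTE = 1, 2, 4

@dataclass(frozen=True)
class Capability:
    base: int      # start of the accessible region
    length: int    # size of the region in bytes
    offset: int    # current pointer position relative to base
    perms: int     # bitmask of READ/WRITE/EXECUTE

    def derive(self, base_delta: int, length: int, perms: int) -> "Capability":
        """Derive a capability with equal or reduced rights (monotonicity)."""
        if base_delta < 0 or base_delta + length > self.length:
            raise ValueError("derived bounds must stay within the parent")
        if perms & ~self.perms:
            raise ValueError("derived permissions cannot exceed the parent")
        return Capability(self.base + base_delta, length, 0, perms)

    def check(self, access_size: int, needed_perm: int) -> int:
        """Return the target address if the access is allowed, else fault."""
        if not (0 <= self.offset and self.offset + access_size <= self.length):
            raise PermissionError("capability bounds violation")
        if not (self.perms & needed_perm):
            raise PermissionError("capability permission violation")
        return self.base + self.offset

buf = Capability(base=0x1000, length=256, offset=0, perms=READ | WRITE)
read_only = buf.derive(base_delta=64, length=64, perms=READ)
read_only.check(access_size=8, needed_perm=READ)   # allowed
# read_only.check(8, WRITE) would raise PermissionError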

The CEP acts as a secure memory controller, taking the responsibility of guarding the access to the memory it controls through a capability system. The CPU can use it by issuing specific CEP instructions to access memory or to manipulate the CEP-stored capabilities. In its straightforward implementation, the CEP can be used to replace the ISA support for capabilities, with minimal changes to the rest of the system. However, some functionality, such as compartmentalization, may require the CEP to rely on OS support. The CEP also provides support for capabilities not covered by the ISA capability model, such as memory sharing (intra- and internode) and persistent memory.

Figure 2(a) shows how ISA-supported capabilities enable secure access within an individual virtual address space. Applications can share memory, but capabilities cannot be stored in shared memory, nor can they be securely used to access shared memory. An ISA-based system cannot enforce capabilities across different virtual address spaces or different OS instances. The CEP overcomes these limitations, because it operates memory-side on the physical addresses, rather than virtual addresses, and introduces handles in the user space [Figure 2(b)]. The CEP tracks the handles and checks them when data are accessed so that only allowed processes can proceed. The CEP [Figure 2(c)] can also supplement ISA capability enforcement across virtual address spaces.

RACK-SCALE SYSTEMS AND CAPABILITIES
Enhancements in optical interconnects, memory semantics protocols, and the emergence of fabric-attached nonvolatile memory (NVM) are making rack-scale memory a reality. This enables the individual nodes in a rack-scale system to access all memory through a familiar load/store interface, with performance comparable to that of local memory access. The abundance of globally addressable memory enables new in-memory algorithms and nonpartitioned data structures that are impractical on traditional clusters due to performance, power, and cost limitations. Unfortunately, it also further widens the chasm between protection and translation, making the case for capabilities even stronger.

The concept of capabilities needs to evolve to support memory in rack-scale systems with many nodes running independent OSs. When rack-scale systems also include shared NVM, as some emerging paradigms combining memory and storage advocate, capabilities need to evolve accordingly.

Rack-scale systems consist of multiple nodes, each running its own OS instance in support of the scale-out model. They also have a stronger trust model—if a single OS is compromised, the node boundaries prevent propagation to other nodes.

FIGURE 2. The CEP. (a) ISA capabilities allow fine-grained protection within a single virtual address space. (b) Transition: the CEP fine-grained protection uses handles across the physical address space. (c) Vision: the CEP supplements the ISA in fine-grained protection across VAS/PAS and NVM. DRAM: dynamic random-access memory; NVRAM: nonvolatile random-access memory; VAS: virtual address space; PAS: physical address space; MMU: memory management unit; CEP: capability enforcement processor.


In a distributed multi-OS environment, revocation of capabilities becomes a complex task because we can no longer rely on a single OS (and single execution hardware) to have full control of a capability. Managing distributed capabilities requires the careful interaction of hardware support, OSs, and the application runtime.

An interesting programming paradigm of rack-scale systems organizes applications into microservices and containers. They benefit from fine-grained protection because they can be packaged much more densely than what a given page size allows. In addition, they benefit from both code and data protection by selectively allowing which components can be invoked from other components. Delegation in the case of microservices is a very powerful programming approach to selectively enable access to individual components of the data structure at fine granularity.

In this environment, threats can come from a compromised or buggy OS, application, or any other piece of system software. The major concern is with unauthorized writing to the memory. When memory is persistent and not cleared after reboot, the threats/bugs are exacerbated because contents may persist beyond the lifetime of the OS. To address these threats, we leverage different security models. ISA support deals with the individual virtual address space; OS capabilities enforce a node-level trust model; and at the rack-wide scale, we leverage the TOR manager, secure enclaves, and the networking components that enable access to the FAM.

The CEP model naturally expands to cross-node capabilities in rack-scale systems. Because the CEP resides close to memory, it is effective in enforcing policies and management of the data access from multiple nodes, following the self-protecting memory principle. Although the performance implications of checking accesses for very fast (node-local) memory would be severe, they become tolerable for slower devices (NVM byte-addressable technologies) or when the accesses traverse a multihop fabric (FAM at the rack scale).

Another way to look at this is from the perspective of address spaces. ISA-supported capabilities take a virtual-address-space view [Figure 3(a), left], and an OS takes the node view [Figure 3(a), right]; the rack-wide view addresses the rack scale because any part of the NVM could be mapped into a single node. Because of the size of FAM and the distance from each CPU, we can offload some of the capability enforcement from the CPU into accelerators closer to FAM [Figure 3(a), center].

Figure 3(b) presents a sample rack-scale configuration that uses FAM pooled and accessible by all nodes, as some approaches advocate.


FIGURE 3. Rack-scale capabilities. (a) Approaches to capability enforcement in rack-scale systems with fabric-attached memory: ISA-, OS-, and rack-supported. (b) Capabilities in a fabric-attached memory. SoC: system on chip.

56 COMPUTER WWW.COMPUTER.ORG/COMPUTER pooled and accessible by all nodes, they are stored in nonvolatile device or to be global [Figure 3(b)]. When global as some approaches advocate. Simi- anywhere outside the failure domain of capabilities are passed to other nodes, lar considerations apply if rack-scale the process that created them. Capabil- revocation complications arise. memory is distributed and accessi- ities derived from a persistent capabil- Rack-scale systems typically in­­ ble by more traditional mechanisms, ity can be ephemeral [e.g., they live in volve additional levels of memory such as RDMA remote direct memory memory that disappears with the pro- translation beyond the processor’s access or NVM express over fabrics. cess, like local dynamic random-ac- memory management unit. In addi- Adapting capabilities to the rack-scale cess memory (DRAM)], but the master tion to virtual (unique to a process) and environment is critical to reliable soft- capability needs to be persistent. The physical (unique to a node) addresses, ware development in that environ- opposite is not true: persistent capa- memory locations have a unique fab- ment, but it also requires extending bilities cannot point to process-local ric address. These can be made of node our notion of the model and imple- mentation of the underlying capabilities. There is always a lowest layer of the system software (kernel, supervisor, hypervisor, whatever runs on the TOR CAPABILITIES CAN BE STORED IN THE control processor, and so on) that mul- MEMORY OF OTHER NODES OR IN tiplexes the machine, and this is where GLOBALLY SHARED NVM THAT IS SAVED the software that controls the lowest level of capabilities lives, the rack- ACROSS OS REBOOTS. wide capability management system. Anything else (virtual machine, con- tainers, bare-metal OSs on a secure partition of the hardware) are above (volatile) memory; only ephemeral identifier and local addresses (for leg- this layer. capabilities can. acy networks) or built in the protocol Compared with single address- Local pointer bugs may corrupt itself (for new interconnects, such as space capabilities (such as CHERI), local data within a process, but the Gen-Z). Regardless of the mechanism, which live and die with the creation corruption is limited to the process fabric addresses are larger than indi- and termination of a process, capa- lifespan. With NVM, pointer bugs may vidual node addresses, and, ideally, bilities in a rack-scale system are long persist in memory indefinitely, lead- one would like to have a direct trans- lived. They can outlive not only the ing to corruption, regardless of pro- lation from 64-b virtual addresses to process that created them or was using gram restart or system reboot, mak- unique fabric addresses. However, them but also an OS reboot or even ing fine-grained pointer and memory when using ISA load/store instruc- reinstall. Capabilities can be stored in protection essential to the success of tions, the smaller physical address the memory of other nodes or in glob- NVM-based systems for nonmanaged (lower than 52 bits today) gets in the ally shared NVM that is saved across languages (and for the runtimes of way, causing a disconnect between OS reboots. There is a temporal aspect managed languages, frequently imple- the CPU and memory, resulting in of capability persistence that does not mented in C/C++). 
the need for memory-side translation exist with ephemeral capabilities (local Unlike local capabilities, rack-scale support (which makes a CEP approach capabilities that live in local memory capabilities can be named and accessed even more appealing). and a single process). In addition, when globally from any node, not just from Pursuing this kind of work requires a capability is stored in persistent data, the node where they were created. To the intersection of many areas of com- the capability itself has to be persistent accomplish this, we record the creation puter science. We six coauthors come for the system to be consistent. Because source node in the capability, so that from diverse and complementary back- the notion of persistence is always tied accessing memory can be appropri- grounds: microarchitecture, architecture, to a certain class of failures, capabili- ately directed. Similarly to persistence, distributed systems, system software, and ties can be considered persistent when a capability pointing to global data has security. This makes us ideal collaborators
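One simple way to picture a node-identifier-plus-local-address fabric address is as a packed integer. The following sketch is ours and purely illustrative: the 16/48-bit split is an assumption for the example, not a Gen-Z or product format.

# Hypothetical fabric-address encoding (ours): node identifier packed
# above a node-local physical address; field widths chosen for illustration.
NODE_BITS, LOCAL_BITS = 16, 48
LOCAL_MASK = (1 << LOCAL_BITS) - 1

def to_fabric_address(node_id: int, local_addr: int) -> int:
    assert node_id < (1 << NODE_BITS) and local_addr <= LOCAL_MASK
    return (node_id << LOCAL_BITS) | local_addr

def from_fabric_address(fabric_addr: int):
    return fabric_addr >> LOCAL_BITS, fabric_addr & LOCAL_MASK

fa = to_fabric_address(node_id=7, local_addr=0x1234_5678)
assert from_fabric_address(fa) == (7, 0x1234_5678)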


CAPABILITY REVOCATION
Being unforgeable, capabilities can be passed around to access resources (i.e., memory). In a rack-scale system, the other processes can be microservices that execute on other nodes through a distributed application programming interface. This creates an interesting complication: once the owner releases a resource, all of the capabilities representing that resource need to be revoked. Otherwise, a subsequent access will result in an error that would be difficult to debug and would require complicated client-side error-handling schemes. It would be as if someone decided to change the lock to a shared closet without telling everyone with a key that the key no longer works. One-sided revocation is nontrivial in rack-scale systems because capabilities can be dispersed, and it may take time, and a complicated distributed algorithm, to reach revocation closure. For nonarchitectural capabilities, the OS maintains data structures that track trees of derived capabilities, which are then parsed to revoke all derived capabilities. Even in a single system, revocation represents a complex activity that can cause performance penalties and is nontrivial to implement efficiently. In CHERI, the locations of capabilities can be tracked with assistance from the paging mechanism, but this requires sweeping through the memory with suitable atomicity properties. In a rack-scale system, with distributed state, revocation is extremely complex and must avoid the need for global operations to ensure adequate scalability and reliability.

An alternative is lazy revocation, which can be accomplished by extending derived capabilities with copies of a master capability representing the same memory. On revocation, the master capability and the memory it represents are both freed. On each access to memory using other copies of a revoked capability, a verification is first performed to determine whether the master capability is valid, followed by verification of the access right to memory. These two verifications can be conducted in parallel and be hardware accelerated [Figure 4(a)]. Another approach is to associate blocks of memory and the threads accessing memory with matching keys. Upon each memory access, keys are matched using hardware.5 If there is a match, access is allowed, and if not, an exception is raised [Figure 4(b)] and communicated back to the application. The application can rerequest the capabilities and reissue the access (if it still has permissions to the memory region), or it can signal a protection violation to the end user.

Implicit in lazy revocation are two important points: software needs to react to traps caused by overrevocation and reacquire underlying capabilities with new keys, and a genuine protection fault is likely to be caused by a bug or a malicious exploit attempt. A well-behaved application should not try to access memory after a revocation, so the protection mechanism is a backstop and hopefully is rarely invoked.

FIGURE 4. Approaches to revocation. (a) Redirection. (b) Key-based revocation.
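The lazy-revocation flow of Figure 4(a) can be summarized in a few lines of Python. This is a minimal software model of ours, not the hardware mechanism itself: derived copies keep a reference to their master, and every access first checks that the master is still valid before the bounds check and the memory request.

# Minimal model (ours) of lazy revocation: derived copies reference a
# master capability; an access proceeds only if the master is still valid.
class MasterCapability:
    def __init__(self, base: int, length: int):
        self.base, self.length, self.valid = base, length, True

    def revoke(self):
        """Free the region; all derived copies become unusable lazily."""
        self.valid = False

class DerivedCapability:
    def __init__(self, master: MasterCapability, offset: int, length: int):
        self.master, self.offset, self.length = master, offset, length

    def access(self, offset: int, size: int) -> int:
        if not self.master.valid:                               # is the master valid?
            raise PermissionError("capability revoked")
        if not (0 <= offset and offset + size <= self.length):  # are bounds OK?
            raise PermissionError("out of bounds")
        return self.master.base + self.offset + offset          # issue memory request

region = MasterCapability(base=0x10_0000, length=4096)
cap = DerivedCapability(region, offset=0, length=1024)
cap.access(0, 64)        # allowed
region.revoke()
# cap.access(0, 64) now raises PermissionError, even though the derived
# copy itself was never touched at revocation time.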

IMPLEMENTATION ASPECTS
Historically, the primary challenge in scaling capability-based systems was revocation. Deriving capabilities results in chains that need to be torn down during revocation. This is costly in single-node systems and unacceptable at the rack scale. The lazy approach we introduced addresses this challenge. Capabilities are invalidated, and verification is conducted every time capabilities are used. Memory-side accelerators allow verification at memory-access speed. The performance of the CEP is affected by the number of capabilities, which can be cached by the CEP if needed. Capability-based fine-grained memory protection fits well with policies for elastic scaling of memory regions, enabled by splitting and merging of capabilities and corresponding memory regions.

To extend trust among the nodes, we need to rely on a secure and scalable memory fabric that supports managing capabilities. New interconnect standards, such as Gen-Z (https://genzconsortium.org/), extend memory semantics across nodes within a rack and also provide basic support for copying capabilities around through privileged operations.

OTHER APPROACHES TO CAPABILITIES
There is a rich history of capabilities, which can be classified as hardware, OS, and language supported (see Table 1). Only hardware-supported systems provide fine granularity and persistency, for example, CAP, StarOS, and IBM System/38 (see Levy3 for details). OS-supported, but not rack-scale, systems targeted clusters (e.g., L4,6 KeyKOS,7 Barrelfish8). Language-supported approaches are more flexible but have lower performance. They rely on objects within a single process, for example, low-fat pointers,9 SoftBound,10 and CCured.11 Recently, vendors, such as Intel, introduced limited support for fine-grained memory protection, such as MPX. For additional discussion on use cases, see "Applications and Use Cases."

TABLE 1. Different approaches to capabilities.
Approach | Example systems | Distribution | Persistency | Revocation, GC | Granularity | HW/SW support
HW, ISA support | CAP, Plessey System 250, StarOS, IBM/38, iAPX432, Hardbound, low-fat pointers HW, CODOMs, M-Machine, and CHERI | Single process, except Plessey System 250, StarOS, and iAPX, which are multinode | No support, except in StarOS, IBM/38, Hardbound, low-fat pointers HW, CODOMs, and CHERI | Revocation in CODOMs and M-Machine; GC in StarOS and M-Machine | Fine | HW/SW, ISA, OS, microcode, and compiler
OS | Mach, Chorus, Amoeba, KeyKOS, EROS, L4, Barrelfish, and Composite | Multinode clusters, except for KeyKOS (multiprocess) and L4 (1 node) | Capability to pager (Mach, Chorus, L4), FS (Amoeba and EROS), and VAS (KeyKOS) | Revocation: yes, except Amoeba and KeyKOS; GC: no, except L4 and Composite (ref cnt) | Page, objects, and exceptionally fine | OS support and MMU
Languages | E, Joe-E, Caja, SoftBound, CCured, low-fat pointers SW, and Cyclone | Single process | No | No revocation, GC optional | Objects | Language runtime, fat pointers, and compiler
HW: hardware; ref: reference; SW: software; cnt: count; FS: file system; GC: garbage collection.


APPLICATIONS AND USE CASES

The Machine Research Program at Hewlett Packard Labs proposes a so-called memory-driven computing approach spanning from embedded through exascale computing. Hewlett Packard Enterprise recently demonstrated a rack-scale prototype of 160 TiB of memory attached to an optically connected memory semantics fabric. This prototype crosses an interesting threshold, offering significantly more memory than is addressable either by the physical addressing of industry-standard architectures or the virtual addressing of OS kernels. Although this was designed as a testbed for hardware, firmware, and OS investigations, the prototype has also afforded the opportunity to explore applications of rack-scale systems and how capabilities can enhance those applications. Two are briefly described here.

PETA-SCALE TIME-VARYING GRAPH DATA STORES WITH MULTIPLE ACCESS ROLES
A huge variety of problems arising from the study of complex economic, ecologic, and biologic systems are most naturally represented as graphs. The efficient algorithms of graph theory can find hidden correlations and allow us to make inferences from incomplete data as long as we can efficiently manipulate both the graph and its associated metadata. This is where conventional scale-out systems are challenged, since the data distribution, caching, and prefetching algorithms can be rendered ineffective by the random nature of the underlying relationships. Even if care is taken to optimally partition a graph for a given access pattern, as the graph varies with time, the partitioning rapidly becomes inefficient. If the access pattern is random, the vast majority of the accesses are remote, thus preventing any effective use of locality. The memory-driven organization of The Machine rack-scale infrastructure allows us to hold graph and metadata in a single shared memory pool, allowing distributed applications to access them at a fine, byte-level granularity. The use of graph theory across a longitudinal data set naturally invites multiple access roles: analysis versus evolution of the graph, either with or without the metadata. Capabilities enable enforcement of roles, which can survive and be revoked independent of the execution lifecycle of any particular process or the underlying OS.

HARDWARE/APPLICATION COMPOSITION WITH ACCESS TO DISAGGREGATED PERSISTENT OBJECTS
There is an interesting intersection between capabilities and the emerging category of composable hardware, which today involves composition of storage and networking with fixed, relatively stateless CPU and memory resources. Container-based application development and rack-scale infrastructure allow for the low-level commissioning of just the right hardware, inclusive of accelerators and memory, for a particular container. Add in persistent objects in disaggregated memory, inclusive of data, applications, and libraries, and you gain the ability to remove a majority of spin-up/spin-down time and replace virtualized input–output operations with much higher-performance shared memory operations. Capabilities allow all of those fabric-attached memory accesses, both sequential and simultaneous, to be authenticated and protected against errors while still allowing immediate access to in-memory objects as soon as fabric connectivity is established.

We motivated the need for rack-scale capabilities as a consequence of increasing memory capacity paired with fine-grained (load/store) access to FAM. We described how rack-scale capabilities are evolving from traditional ISA- and OS-supported capabilities. We discussed the CEP as an alternative (or supplement) to ISA support. Finally, we described capability revocation as a key challenge and presented two solutions for hardware support for revocation.

Many challenges remain for future work on rack-scale capabilities. ISA support is not extensible to the rack scale. Memory mapped from the FAM on one node may end up at different virtual addresses on other nodes. Self-referenceable structures or sophisticated ways of translating from virtual to physical to rack-scale address spaces become necessary. ISA support for capabilities is a long-term evolution, requiring more than five years to adoption. Providing similar functionality closer to FAM offers a faster pace of evolution and a more scalable and reliable solution. In addition to hardware, changes to the system software are required to support legacy applications. New classes of applications will evolve to fully utilize the benefits of memory-driven computing: load/store semantics and latency in accessing rack-scale fabric-attached NVM.12

We see many opportunities for deeper integration of hardware architecture, OSs, and programming models. The key technical question is how to balance the support across these three levels to achieve the desired performance, security, and flexibility.

ABOUT THE AUTHORS

KIRK M. BRESNIKER is a fellow and chief architect of systems research at Hewlett Packard Labs. His research interests include novel hardware and software system designs. Bresniker received a B.S. in electrical engineering from Santa Clara University. He is a Senior Member of the IEEE. Contact him at [email protected].

PAOLO FARABOSCHI is a fellow at Hewlett Packard Labs. His research interests include the intersection of architecture and software. Faraboschi received a Ph.D. from the University of Genoa, Italy. He is a Fellow of the IEEE. Contact him at [email protected].

AVI MENDELSON is a professor of computer science and electrical engineering at Technion. He earned his Ph.D. from the University of Massachusetts at Amherst. His research interests include computer architecture, operating systems, reliability, cloud computing, and high-performance computing. He is a Fellow of the IEEE. Contact him at [email protected].

DEJAN MILOJICIC is a distinguished technologist at Hewlett Packard Labs. His research interests include operating systems, distributed systems, and systems management. Milojicic received a Ph.D. from the University of Kaiserslautern. He is a Fellow of the IEEE and was the 2014 IEEE Computer Society president. Contact him at [email protected].

TIMOTHY ROSCOE is a professor of computer science at ETH Zurich. His research interests include networks, operating systems, and distributed systems. Roscoe received a Ph.D. from the University of Cambridge. Contact him at [email protected].

ROBERT N.M. WATSON is a university senior lecturer at the University of Cambridge Computer Laboratory. He received a Ph.D. from the University of Cambridge. Contact him at [email protected].


REFERENCES
1. F. J. Corbato and V. A. Vyssotsky, "Introduction and overview of the Multics system," in Proc. Fall Joint Computer Conf. (AFIPS '65), New York, 1965, pp. 185–196.
2. D. Milojicic and T. Roscoe, "Outlook on operating systems," IEEE Comput., vol. 49, no. 1, pp. 43–51, Jan. 2016. doi: 10.1109/MC.2016.19.
3. H. M. Levy, Capability-Based Computer Systems. Newton, MA: Butterworth-Heinemann, 1984.
4. R. N. M. Watson et al., "CHERI: A hybrid capability-system architecture for scalable software compartmentalization," in Proc. 36th IEEE Symp. Security and Privacy, May 2015, pp. 20–37.
5. R. Achermann, C. Dalton, P. Faraboschi, M. Hoffmann, D. Milojicic, and G. Ndu, "Separating translation from protection in address spaces with dynamic remapping," in Proc. 16th Workshop Hot Topics in Operating Systems (HotOS '17), 2017, pp. 118–124.
6. J. Liedtke, "On µ-kernel construction," in Proc. 15th ACM Symp. Operating Systems Principles, Copper Mountain Resort, CO, Dec. 1995, pp. 237–250.
7. N. Hardy, "KeyKOS architecture," SIGOPS Operating Syst. Rev., vol. 19, no. 4, pp. 8–25, 1985. doi: 10.1145/858336.858337.
8. A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, and S. Peter, "The multikernel: A new OS architecture for scalable multicore systems," in Proc. ACM 22nd Symp. Operating Systems Principles, Big Sky, MT, 2009, pp. 29–44.
9. A. Kwon, U. Dhawan, J. M. Smith, T. F. Knight, Jr., and A. DeHon, "Low-fat pointers: Compact encoding and efficient gate-level implementation of fat pointers for spatial safety and capability-based security," in Proc. 2013 ACM SIGSAC Conf. Computer and Communications Security, Berlin, Germany, 2013, pp. 721–732.
10. S. Nagarakatte, J. Zhao, M. M. K. Martin, and S. Zdancewic, "SoftBound: Highly compatible and complete spatial memory safety for C," in Proc. 30th ACM SIGPLAN Conf. Programming Language Design and Implementation, New York, NY, 2009, pp. 245–258.
11. G. C. Necula, S. McPeak, and W. Weimer, "CCured: Type-safe retrofitting of legacy code," ACM SIGPLAN Notices, vol. 37, no. 1, pp. 128–139, 2002. doi: 10.1145/565816.503286.
12. P. Faraboschi, K. Keeton, T. Marsland, and D. Milojicic, "Beyond processor-centric operating systems," in Proc. 15th Workshop Hot Topics in Operating Systems (HotOS '15), Kartause Ittingen, Switzerland, 2015.


THE IOT CONNECTION
EDITOR ROY WANT
Google; [email protected]

NFMI: Connectivity for Short-Range IoT Applications

Amitangshu Pal and Krishna Kant, Temple University

Magnetic-induction-based near-field communication is an emerging technology that is used for short-range, Internet-of-Things applications requiring high security, human safety, and low power; it operates in harsh environments, withstanding the presence of water, soil, and metals.

Electromagnetic waves are composed of mutually orthogonal electric and magnetic fields. Typical RF-based communication involves the propagation of such waves, as governed by Maxwell's equations. In free space, RF signal strength falls off as 1/r² with distance r, although in cluttered environments, the fall-off is often somewhat faster. In contrast, near-field magnetic induction (NFMI) transmits data through a modulated alternating magnetic field that induces a current in a receiver coil. The transmitter generates this magnetic field by modulating an alternating current in its own transmit coil. At first glance, this coupling has entirely different physics than that of propagating electromagnetic waves. The induction is a near-field (NF) phenomenon that applies to distances of less than λ/(2π), where λ is the wavelength of the transmit-side current. NFMI communication is based on the principle of resonant inductive coupling (RIC), which involves two matched coils, each forming an LC circuit with the same resonance frequency. RIC is commonly used in wireless power transfer and has numerous applications. For example, smartphone charging pads and the charging of moving electric cars' batteries operate on the same principle. NFMI communication modulates the magnetic field and forms the basis of near-field communications (NFCs) among NFMI devices. Because the electric field plays no role in this communication, the signal is almost purely magnetic and thus does not suffer from the usual fading and diffraction associated with electromagnetic waves.

Consider a pair of transmit-and-receive magnetic coils, with Kt and Kr turns and radii of ρt and ρr, respectively, separated by distance r. Suppose that the coils are immersed in a medium that features a relative permeability of μ (note: μ = 1 for air). Suppose that the receiver coil is oriented orthogonal to the line passing through the centers of the two coils. Then, if the transmit coil has current It flowing through it, the induced current in the receiver coil, that is, Ir, is given by


Consider a pair of transmit-and-receive magnetic coils, with K_t and K_r turns and radii of ρ_t and ρ_r, respectively, separated by distance r. Suppose that the coils are immersed in a medium that features a relative permeability of μ (note: μ = 1 for air). Suppose that the receiver coil is oriented orthogonal to the line passing through the centers of the two coils. Then, if the transmit coil has current I_t flowing through it, the induced current in the receiver coil, that is, I_r, is given by

I_r ∝ I_t π K_t K_r ρ_t² ρ_r² f_res / (2r³),

where f_res is the operating (resonance) frequency. We can draw several conclusions from this.

1. Since the power is proportional to I_r², the induced power decays as 1/r⁶ with distance r. This makes the technology inherently short range and, therefore, suitable for small personal area networks (PANs). Fortunately, the very rapid decay means that it is not possible to snoop on the signal beyond a certain range.
2. The current is directly proportional to the product of transmit-and-receive coil areas (ρ_t² × ρ_r²) and the number of turns for each of them (K_t × K_r). In other words, to transfer higher power, one must choose larger-sized coils and more turns. This provides a very flexible control on power for NFMI; however, it may also limit miniaturization when the size is crucial.
3. Because the induced current is proportional to the frequency f_res, the induced power is proportional to f_res². This means a higher power transfer can be achieved by increasing the frequency, but at the cost of decreasing the maximum range of c/(2πf_res), where c is the speed of light (see the numeric sketch below).
4. If the receive coil is not aligned as indicated, the induced current will be less, but this aspect is omitted for simplicity.

Recognizing the potential of NFMI communications, IEEE finalized the 1902.1 standard in 2009, which specifies an NF magnetic communication protocol called RuBee.9 RuBee operates in the lower frequency range of 30–900 kHz, and its purpose is to support low data-rate applications with coin-size batteries that last 5–10 years. Visible Assets, Inc. has introduced RuBee tags that operate at below 450 kHz (it typically operates at 131 kHz), which is compatible with low-frequency RFID. At 131 kHz, NF conditions occur for up to approximately 364 m, which is very long; however, because of low power and small coil antennas, the actual range is only a few tens of meters.

In a higher NFMI frequency range, 13.56 MHz is very popular because it is the same frequency used by high-frequency RFID and is employed for personal products, such as audio headphones from NXP6 and Freelinc.7 At this frequency, the NFMI range is only 3.5 m, which is adequate for body area network (BAN) applications but not for more general home automation. The higher frequency helps in terms of higher power transfer with tiny antennas.
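As a quick numeric illustration of these relations (a sketch only: the coil turns and radii below are arbitrary example values, and all constants of proportionality are dropped), the following Python fragment evaluates the relative induced current and the near-field range c/(2πf_res) at two operating frequencies:

```python
import math

C = 3.0e8  # speed of light (m/s)

def relative_induced_current(r, f_res, k_t=10, k_r=10, rho_t=0.01, rho_r=0.01):
    """Proportionality I_r/I_t ~ pi*K_t*K_r*rho_t^2*rho_r^2*f_res / (2*r^3).
    Coil turns and radii here are illustrative placeholders, not article values."""
    return math.pi * k_t * k_r * rho_t**2 * rho_r**2 * f_res / (2 * r**3)

def near_field_range(f_res):
    """Maximum near-field range, c / (2*pi*f_res)."""
    return C / (2 * math.pi * f_res)

# Induced power goes as the square of the current, hence ~1/r^6 and ~f_res^2.
for f in (1.3e6, 13.56e6):
    print(f"f_res = {f / 1e6:5.2f} MHz: NF range ~ {near_field_range(f):5.1f} m, "
          f"relative power at 1 m ~ {relative_induced_current(1.0, f) ** 2:.3e}")
```

Squaring the relative current reproduces the 1/r⁶ and f_res² power scaling noted in conclusions 1 and 3, while the printed near-field range shows the tradeoff of conclusion 3.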

LOW-POWER BLUETOOTH AND RELATED TECHNOLOGIES
Bluetooth (BT) is an RF-based wireless technology standard for exchanging data over short distances (typically 10 m for class 2 devices) using ultrahigh frequency (UHF) radio waves in the 2.4-GHz Industry, Science, Medicine (ISM) band. It is mainly used for communicating among a few devices in PANs. Bluetooth Low Energy (BTLE) is a slightly modified version of BT that features short connection times and devices that largely remain in sleep mode, which is of primary interest here. Zigbee is another similar technology designed to be at lower power and lower speed than BTLE and operates at lower frequencies. RF-based NFC is another relevant technology, although it is designed for operation over very short distances. Table 1 shows a comparison of the three technologies.

TABLE 1. A comparison of NFC, Zigbee, and BTLE.
Aspect | NFC | Zigbee | BTLE
Standardization body | International Organization for Standardization (ISO)/International Electrotechnical Commission | Zigbee Alliance | Bluetooth Special Interest Group
Network standard | ISO 13157, and so on | IEEE 802.15.4 | IEEE 802.15.1
Network type | Point-to-point | WPAN | WPAN
Cryptography | Not with RFID | Available | Available
Range | <0.2 m | 10–20 m | 50 m
Frequency | 13.56 MHz | 2,400/915/868 MHz | 2.4–2.5 GHz
Bit rate | 106/212/424 Kb/s | 110 Kb/s | 1 Mb/s
Setup time | <0.1 s | <6 s | <0.006 s
Peak current draw11 | 50 mA | 30 mA | 13 mA
WPAN: wireless personal area network.

These technologies are well established and work extremely well in open, uncluttered environments but do not work well in the presence of aqueous or plant/animal tissue media, which cause high signal absorption; or metallic clutter that causes diffraction or shielding of the signals; or underground/underwater operation that results in an extremely complex communications channel. Reducing absorption by choosing lower frequencies helps in attenuation; however, bigger antennas are required, which introduces the problem of undesirable size and potentially severe interference with nearby radios. For this reason, BTLE devices cannot be deeply implanted in the human body.

BTLE devices coexist with other products that use different protocols, such as Wi-Fi or Zigbee, but operate in the ISM 2.4-GHz band and thus may experience high interference. Additionally, RF radios consume more power because of the high sleep-mode power consumption. Because of the characteristics of far-field (FF) transmissions, the BTLE signals can be intercepted and decrypted by a remote eavesdropper. For this reason, the NSA has restricted its use in the U.S. Armed Forces.2 Furthermore, in the context of BANs, the use of BTLE has already raised serious concerns. BTLE-equipped implantable defibrillators, insulin pumps, and infusion pumps have all been hacked.3

OPPORTUNITIES FOR NFMI
The key benefit of NFMI is its better penetration performance (i.e., lower absorption) than RF through materials that are challenging for RF, such as underwater environments and communications through water-rich media such as the human body, fresh produce, meats, and so on. The reason for this is that water and most other water-rich materials have magnetic permeability similar to that of air. In other words, the relative magnetic permeability of most such materials is 1.0, which also includes austenitic stainless steel.12 This is demonstrated by tests conducted by the U.S. Department of Energy in which one NFMI radio is kept inside a sealed stainless steel drum and one remains outside.3 Furthermore, a sheet of mild steel or other form of iron placed in close proximity to an NFMI radio essentially acts like a mirror and strengthens the signal.4 Because magnetic signals are not affected by an aqueous or tissue medium, NFMI works well for communication with deeply implanted medical devices. NFMI communication protocols, including RuBee, have been certified by the U.S. Food and Drug Administration as a nonsignificant risk technology suitable for human use.1,2

Because NFMI operates in a low-frequency band, it significantly reduces RF absorption by biological tissues. The amount of RF absorption in the human body is often measured by the specific absorption rate (SAR), which is the power absorbed per mass of tissue, measured in units of watts per kilogram (W/kg). In the United States, the U.S. Federal Communications Commission requires mobile phones to have a SAR limit of at or below 1.6 W/kg. Similarly, the European Union has made the SAR limit 2 W/kg. The emissions from NFMI are far less than this specified limit. RuBee produces 40 nW of RF power compared to 4 W for UHF RFID systems; i.e., RuBee produces roughly one quadrillion (15 zeros) times less RF power than UHF RFID.8

The power consumption of NFMI is generally lower than that of BTLE. As reported in Abrams,1 the current NFMI battery-powered earpieces can operate for roughly 20 hours, as opposed to 3–4 hours in the case of BTLE. Aura Communications, Inc. has developed a system-on-chip magnetic communication system named LibertyLink, which draws 7 mA at 2.2 V to transmit full-duplex voice or data across a 1-m link, whereas typical RF solutions require at least 10 times that amount of power.11

The received power of the NFMI signal falls off as 1/r⁶ of the distance r, or 60 dB/decade between the coils (instead of 1/r² or 20 dB/decade for NFC RF). Figure 1 shows the propagation characteristics of NFMI (at 13.56 MHz) and BTLE (at 2.4 GHz) signals, which indicates the dual slope of NFMI communication and represents a simultaneous decay of the magnetic and electric waves. At 13.56 MHz, the NFMI signal has a wavelength of λ = 22.1 m; thus the crossover point between NF and FF occurs at approximately λ/(2π) ≈ 3.5 m. Beyond this point, it rolls off at 20 dB/decade (just like the FF propagation characteristics), as observed in Figure 1. With NFMI, the signal crosses the noise floor at approximately 3 m, as opposed to 20 km with BTLE. In the case of NFMI, this results in very little leakage outside of the intended range. The communication is invisible outside this range and thus adds a high level of security. Additionally, NFMI communication does not cause interference with other wireless networks such as Wi-Fi and BTLE. Because of its short range, the same frequency can be reused for other NFMI communications. Thus, in an overcrowded area, using an NFMI-based PAN is more efficient than using BTLE.
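The dual-slope roll-off just described can be approximated with a short sketch; the reference level p0_db and reference distance r0 below are assumed placeholders for illustration, not values read from the measured curves in Figure 1.

```python
import math

C = 3.0e8  # speed of light (m/s)

def nfmi_rolloff_db(r, f_res=13.56e6, p0_db=-40.0, r0=0.1):
    """Piecewise roll-off: 60 dB/decade in the near field (r < lambda/2pi),
    then 20 dB/decade beyond the crossover. p0_db at r0 is an assumed
    reference level used only for illustration."""
    crossover = C / (2 * math.pi * f_res)  # ~3.5 m at 13.56 MHz
    if r <= crossover:
        return p0_db - 60.0 * math.log10(r / r0)
    level_at_crossover = p0_db - 60.0 * math.log10(crossover / r0)
    return level_at_crossover - 20.0 * math.log10(r / crossover)

for d in (0.1, 1.0, 3.5, 10.0, 100.0):
    print(f"{d:6.1f} m: {nfmi_rolloff_db(d):8.1f} dB")
```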


Figure 1. The signal propagation characteristics of NFMI versus BTLE (available power at the receive antenna, in dBm, versus distance, in meters). M-Wave: magnetic wave; E-Wave: electronic wave; FF: far field. (Source: Audio Express,13 used with permission.)

Given that NFMI transmission through the body is safe and no different than transmission through the air, the technology provides the tantalizing possibility of secure, through-the-body communications. In other words, if all of the devices worn by a person are shielded to remove any through-the-air communication, all communications will be through the body and thus free from any interference or tampering. To enable these devices to communicate with another external device, the person would have to touch the device (with mutual authentication procedures used to prevent unwanted communication). One such scenario is that of a patient securely transferring medical data to a health-care worker via a physical touch, such as shaking hands. This applies to a wide range of bodily data collection and transfer, e.g., from routine data given to a trainer to data given to a doctor during a hospital round.

HURDLES OF NFMI
The magnetic field induced by an NFMI coil is necessarily orthogonal to the coil, and the field strength falls off as the cosine of the angle in other directions. This means that, to generate an omnidirectional signal, one would need three orthogonal coils, placed either concentrically or in close proximity to each other. The third dimension can be challenging in many applications where a thin, surface-mounted device is highly desirable (e.g., a wearable device such as a wristwatch or heart-rate monitor).

Another issue with magnetic communication is its small transmission range (a few meters) and much lower data rates than RF (400 Kb/s at 13.56 MHz, as opposed to a few megabits per second). The latter can be addressed to some extent by using multiple-input, multiple-output techniques, which essentially amount to using multiple coils operating on different channels. Increasing the range requires overcoming two limitations: 1) the need to keep the range below λ/(2π) to maintain NFC communication and 2) fast decay of the induced signal from a distance.

The problem of a low transmission range can be addressed by simply choosing a low operating frequency; for example, lowering the frequency from 13 to 1.3 MHz increases the range from 3.5 to 35 m, which is adequate for most applications. However, this frequency reduction would also decrease the induced current by a factor of 10, and to compensate for this decrease, we would need to increase the coil diameter and/or the number of turns. This may be reasonable for many large form-factor IoT devices but may be problematic for small embedded devices.

A team from the National Institute of Standards and Technology has proposed very low-frequency (VLF) magnetic communication using an ultrasensitive magnetic receiver based on emerging quantum magnetometer technology.5 This work suggests that "The best magnetic field sensitivity is obtained using quantum sensors." This technology will improve the receiver's ability to pick up VLF signals far beyond the range of conventional RF receivers. As a result, the team demonstrated sending a digitally encoded dc magnetic signal in the sub-kHz frequency band and detecting this faint signal at one picotesla magnetic field strength (i.e., one-millionth of Earth's magnetic field strength) across a distance of tens of meters in a magnetically noisy indoor environment. This is achieved by using an "optically pumped," highly sensitive magnetometer that relies on the quantum properties of rubidium atoms. The novel magnetometer uses polarized light as a detector to measure the "spin" of the rubidium atoms induced by the magnetic fields. The team also believes that its range can be further improved to hundreds of meters in a less noisy environment using improved sensor technology and signal modulation schemes.

NFMI technology provides some unique advantages that can be exploited in several emerging IoT applications where many small IoT devices must operate in close proximity in a rather harsh environment. However, NFMI technology is still not as well explored as RF, and we expect many challenges with using the technology reliably and integrating it with other wireless technologies. We hope that this article will inspire greater interest in examining and applying this technology to a wider set of emerging IoT applications.

REFERENCES
1. M. Abrams, "NFMI: Connectivity for the IoT," LinkedIn, Feb. 22, 2016. [Online]. Available: https://www.linkedin.com/pulse/nfmi-connectivity-iot-michael-abrams/
2. J. Steinberg, "Why your Bluetooth devices aren't as secure as you think," Inc., Oct. 20, 2015. [Online]. Available: https://www.inc.com/joseph-steinberg/are-your-bluetooth-devices-secure-maybe-not.html
3. R. T. Drayer, "Applications of current technology for continuous monitoring of spent fuel," SRNL, Aiken, SC, Rep. SRNL-STI-2013-00342, 2013.
4. A. Pal and K. Kant, "Magnetic induction-based sensing and localization for fresh food logistics," in Proc. 42nd IEEE Conf. Local Computer Networks (LCN), 2017, pp. 383–391.
5. V. Gerginov, F. C. S. da Silva, and D. Howe, "Prospects for magnetic field communications and location using quantum sensors," Rev. Sci. Instrum., vol. 88, no. 12, p. 125005, 2017. doi: 10.1063/1.5003821.
6. NXP. Accessed on: Feb. 2018. [Online]. Available: https://www.nxp.com/
7. Freelinc. Accessed on: Feb. 2018. [Online]. Available: http://www.freelinc.com/
8. Carpenter Technology Corporation, "Magnetic properties of stainless steels." Accessed on: Feb. 2018. [Online]. Available: https://www.cartech.com/en/alloy-techzone/technical-information/technical-articles/magnetic-properties-of-stainless-steels
9. RuBee. Accessed on: Feb. 2018. [Online]. Available: http://ru-bee.com
10. C. Bunszel, "Magnetic induction vs. RF: Power benefits, drawbacks," EE Times, Oct. 11, 2001. [Online]. Available: https://www.eetimes.com/document.asp?doc_id=1225281
11. P. Mannion, "Comparing low power wireless technologies (Part 2)," Digi-Key, Dec. 14, 2017. [Online]. Available: https://www.digikey.ie/en/articles/techzone/2017/dec/comparing-low-power-wireless-technologies-part-2
12. RuBee, "RuBee human safety summary." Accessed on: Feb. 2018. [Online]. Available: http://ru-bee.com/Academy/HumanSafe/index.html
13. M. Abrams, "Near field magnetic induction (NFMI): Dreams of wireless hearables," Audio Express, Dec. 27, 2017. [Online]. Available: https://www.audioxpress.com/article/near-field-magnetic-induction-nfmi-dreams-of-wireless-hearables

AMITANGSHU PAL is a postdoctoral scholar in the Computer and Information Systems Department at Temple University. Contact him at [email protected].

KRISHNA KANT is a professor in the Computer and Information Systems Department at Temple University. He is a Fellow of the IEEE. Contact him at [email protected].


CYBERTRUST

Rethinking Distributed Ledger Technology

Rick Kuhn and Dylan Yaga, National Institute of Standards and Technology; Jeffrey Voas, IEEE Fellow

Distributed ledger technology (DLT) offers new and unique advantages for information systems, but some of its features are not a good fit for many applications. We review the properties of DLT and show how two recently developed ideas can be used to retain its advantages while simplifying design.

Digital Object Identifier 10.1109/MC.2019.2898162
Date of publication: 22 March 2019

While most of the excitement around blockchain stems from its use in cryptocurrencies, designers are beginning to find interesting ways to solve system problems using it and other forms of DLT. The most commonly used data structure for distributed ledgers is the blockchain. A key feature of a blockchain-based system is the decentralized, replicated data synchronized among separate network nodes, which may be geographically dispersed. There is substantial discussion around some terms in DLT, public versus private in particular. We believe it is better to distinguish blockchain systems based on their permission model, permissioned or permissionless, because that is directly tied to the technology, whereas private or public may apply to the visibility of the network or ledger itself.

With its features providing distributed, trusted data using no central server, DLT seems to be a natural tool for many complex distributed systems, and a number of implementations have been proposed.


However, some environments and applications are not well suited to using an append-only ledger. For example, an analysis of DLT for the international banking consortium the Society for Worldwide Interbank Financial Telecommunication (SWIFT) found that the permissionless model used by Bitcoin and other cryptocurrencies "does not provide the level of trust, transparency, and accountability required by the financial industry."1 The SWIFT analysis noted that permissioned ledgers are helpful, but "existing implementations of permissioned ledgers remain basic." Of particular concern is the immutable aspect of transactions recorded in blockchains.2–4 As noted by the European Banking Institute, "Once an error is embedded in the blockchain, this may be highly problematic, legally, in that often law requires the ability to rectify errors as a matter of law in a way foreign to DLT."2 One option is to correct errors by issuing a new transaction that supersedes the older, erroneous transaction. In this way, the ledger provides a full history of events as they happened. While this is possible or desirable for some applications, privacy laws lead to additional complications, as discussed later.

In addition to complicating support for privacy rules, other properties of conventional blockchains are not a good match for applications beyond cryptocurrency,5,6 and modifications to distributed ledger designs are being developed to meet new needs. Blockchains are a valuable DLT for providing trust, but there are many ways to construct distributed ledgers. We propose an alternative that provides the trust features of blockchains with a more flexible data structure and ordering protocol.

DLT AND DATA MANAGEMENT
A distributed ledger, as the name suggests, is a distributed record of transactions maintained by consensus among a network of peer-to-peer nodes (possibly geographically dispersed). The most widely recognized form of DLT is the blockchain structure, which provides the basis for cryptocurrencies and a variety of other applications. Most currently available distributed ledger designs using blockchain provide certain properties:

›› Pseudo-anonymity: Especially for cryptocurrency, blockchains enable participation using only identifiers. Permissioned blockchains may not include this property.
›› Public access and transparency: Every participant can see all transactions on the blockchain, although they may be anonymized. This property may also not be provided in permissioned systems.
›› Small transaction size: Blockchains were originally designed for monetary transactions, so messages are assumed to be relatively small.
›› Immutable records: As a consequence of the linked chain of cryptographic hashes of records, a change to one record would cause the hash of subsequent records to be invalid, so changes require recomputing of the entire chain (see the sketch following Table 1). As a result, it is generally intractable to change any record in a blockchain.
›› Proof of work or other expensive consensus models: This is a consequence of the need to prevent double spending. Permissioned blockchains do not generally need this feature and can use simpler consensus.
›› Block ordering guarantee: The consensus mechanism ensures ordering of the blocks, and therefore transactions, preventing the possibility of double spending.
›› Decentralization: There is no central authority for records. With each update, records are dispersed simultaneously to peer nodes, who ensure that the updates are correct.
›› Replication and synchronization guarantee: Transactions are duplicated across all nodes of the network so that every node has an identical copy of all transaction records current to the most recent update cycle. Consensus protocols are designed so that when the consensus is complete, all nodes have an identical copy of the distributed ledger records.
›› Integrity protection: Cryptographic hashes are used to guarantee that records have not been changed.

We compare these properties with the needs of more typical applications of distributed data storage and retrieval in Table 1. Note that six of the nine blockchain properties designed for cryptocurrency are at odds with the requirements of many other applications.

TABLE 1. Comparing characteristics of DLT applications.
Cryptocurrency | Finance, supply chain, e-commerce, etc.
1. Pseudo-anonymity | ID required for contracts or government regulation
2. Public access, transparency | Controlled access
3. Small transaction size | Range of message sizes up to large documents and images
4. Immutable records | Changes and deletions, often required by law
5. Proof of work and other expensive consensus models | Flexible consensus models
6. Block ordering guarantee | Time stamps often required
7. Decentralization | Same in many applications
8. Replication and synchronization guarantee | Same in many applications
9. Integrity protection | Same in many applications
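As a toy illustration of the immutable-records property listed above (a generic hash chain, not the internals of any particular blockchain), the following Python sketch shows why altering one record invalidates every subsequent hash:

```python
import hashlib

def block_hash(prev_hash: str, payload: bytes) -> str:
    """Hash of a record chained to the hash of the previous record."""
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

# Build a toy chain of four records.
records = [b"r0", b"r1", b"r2", b"r3"]
hashes, prev = [], ""
for payload in records:
    prev = block_hash(prev, payload)
    hashes.append(prev)

# Alter one early record and recompute: every later hash changes, which is
# why in-place edits are intractable without redoing the rest of the chain.
records[1] = b"r1-edited"
rehashes, prev = [], ""
for payload in records:
    prev = block_hash(prev, payload)
    rehashes.append(prev)

assert hashes[0] == rehashes[0]                       # record 0 untouched
assert all(h != r for h, r in zip(hashes[1:], rehashes[1:]))  # all later hashes break
```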


NEED FOR AN ALTERNATIVE SOLUTION
The mismatch between blockchain properties and many application needs has led to a number of problems in applying blockchain designs to data management problems. For example, Bitcoin is designed to provide some degree of anonymity in transactions (i.e., only public identifiers, not real-world identities, are used), but the law may prohibit anonymity for many types of transactions and require participants to be identified for tax or other purposes. Laws that require the ability to delete privacy relevant information, such as the European Union General Data Protection Regulation (GDPR), may limit the type of information that can be stored in a blockchain.2,4

For system engineers, the price of distributed trust is often an added complexity. The design choices that were made to incorporate anonymity and prevent double spending in blockchains often lead to seemingly unnecessary complications when applied to areas beyond cryptocurrency. For example, immutability has resulted in designs where alterable records must be kept off of the blockchain, with only pointers to them stored in the blockchain itself. Alternatively, some designs involve encrypting data on the blockchain and then destroying the encryption key to delete the data. Neither of these options may be desirable for many applications, as the first option leads to unnecessary complications, and the second risks the data's being decrypted in the future, when data must be protected for decades. These are serious design issues for supporting privacy requirements, such as those of the GDPR, resulting in proposals, such as an editable blockchain,7 using new forms of hashing. For cryptocurrency, a consensus algorithm is needed to guarantee record ordering in the absence of a central time authority (i.e., transactions are ordered based on group consensus rather than the time of entry into a system), and this ordering is used to prevent double spending. Designs for access control using blockchain may involve tokenizing permissions, passing these to users, and spending down the value to remove a permission from a user. All of these strategies are needed to take advantage of blockchain's trust properties, but blockchains would probably not be used if a more conventional database could provide the desired distributed trust.

At first glance, blockchain solutions for applications, such as supply chains, financial settlements, and others, may appear to offer nothing more than added complexity in comparison with a conventional database. However, when more than one organization is involved, the decentralized trust of blockchains and other distributed ledgers can be a tremendous advantage. For example, consider regulated industries where auditing is a part of doing business. Every node on the system can have a full set of records detailing the movement of assets. Any shared database can keep track of asset movement, but DLT adds trust by maintaining current integrity-protected records at every organization, making it easy to audit the process. Thus, the financial industry views full traceability and simplified reconciliation of transactions among the key advantages of DLT.1 We can view DLT as adding a layer of distributed trust to the problem of data storage and retrieval, clearly a desirable property, but industry is still struggling with how to use DLT in practical ways.

A PERMISSIONED DISTRIBUTED LEDGER MODEL FOR DECENTRALIZED TRUST
Much of the current DLT research seems to center on how to bypass properties that were built into blockchain. Adaptations, such as faster consensus algorithms, are gradually moving DLT from its origin in cryptocurrency toward a more general-purpose database technology. However, instead of tweaking blockchain designs, we can rethink the idea of a distributed ledger to reflect the needs of data management applications, as discussed earlier.

Can we provide a simpler model that gives the decentralized trust of a blockchain but otherwise behaves as a conventional database? In this section, we describe an approach to achieving this goal using two recent proposals: a data block matrix8 and verified time.9 The data block matrix retains hash-based data integrity guarantees while allowing controlled modification or deletion of specified records, with integrity guarantees for all other records. A data block matrix can be implemented in a decentralized system to provide data replication among peers. The verified time protocol allows guaranteed time stamps to be used in place of consensus algorithms to ensure record ordering.

       0    1    2    3    4
 0     •    1    3    7   13    H0,–
 1     2    •    5    9   15    H1,–
 2     4    6    •   11   17    H2,–
 3     8   10   12    •   19    H3,–
 4    14   16   18   20    •    H4,–
     H–,0 H–,1 H–,2 H–,3 H–,4

Figure 1. A data block matrix with numbered cells.

TABLE 2. Blockchain and data block matrix features.
Blockchain: provides integrity and sequencing | Data block matrix: provides integrity and erasure
Integrity protection and no erasure possible | Integrity protection for all blocks not erased
Double-spend problem solved by distributed transaction ordering guarantees | Ability to erase values obviates need for ordering guarantees through consensus algorithms
Ordering guarantees require consensus algorithms | Ordering guarantees granted by time authority

The data block matrix uses an array of blocks, with hash values for each row and column. This structure makes it possible to delete or modify a particular block, with hash values assuring that other blocks have not been affected. An example is shown in Figure 1. Suppose that it is desired to delete block 12 by writing all zeroes to that block or otherwise modifying it. This change disrupts the hash values of H3,– and H–,2 for row 3 and column 2. However, the integrity of all blocks except the deleted one is still ensured by the other hash values. That is, other blocks of row 3 are included in the hashes for columns 0, 1, 3, and 4. Similarly, other blocks of column 2 are included in the hashes for rows 0, 1, 2, and 4. Thus, the integrity of blocks that have not been deleted is assured. Blocks can be deleted by overwriting with zeroes or other values, with one row and one column hash recalculated; specifically, after deleting block i, j, row i and column j hash values are recalculated. The data structure ensures the following properties:8

›› Balance: The upper half (above the diagonal) contains, at most, one additional cell more than the lower half does.
›› Hash sequence length: The number of blocks in a row or column hash is proportional to √N for a matrix with N blocks, by the balance property.
›› Number of blocks: The total number of data blocks in an N × N matrix is N² − N because the diagonal is null.
›› Block dispersal: No consecutive blocks appear in the same row or column.

Clearly, this data structure is not suited to all DLT applications, but it offers features that are difficult to provide with a conventional blockchain. Our goal is not to replace blockchains but to offer a new form of data storage structure that provides the integrity guarantees of blockchain with the addition of reversibility, which can be used in a wide range of applications. A comparison is shown in Table 2.

In distributed ledger designs, the role of time is often an afterthought. Some DLT systems have no inherent transaction time stamp to record when the transaction was submitted to the system. Rather, the transactions adopt the time when they were included into the ledger, which may occur after a significant amount of time has passed since being submitted. This approach has worked for applications where just having a transaction accepted is good enough (e.g., we do not need to know that a cryptocurrency transaction was submitted down to the millisecond, just that it was submitted and eventually recorded in the ledger).

However, when time-dependent situations arise, a time stamp becomes more important, and knowing when a transaction was submitted to a system may be more important than knowing when it was incorporated into the ledger. Often, systems will rely on local system time or a network time, and both may differ from one system to the next. Distributed ledgers must be able to operate in environments that include rules mandated by governments or contracts. For some applications, the ordering of transactions into blocks within the blockchain may not be enough, and there is a need for a global time stamp service, providing verified time.9 Time is a key component of this because things happen outside of the blockchain that matter for applications. Orders must be fulfilled by a specified date and time, legal papers must be filed on schedule, and so on, requiring time stamps of events that take place outside of the blockchain.

A global time-stamping approach would include an agreed-upon and accepted service. This approach could incorporate a high-resolution blockchain time mechanism, such as the open source Chainpoint protocol,10 into the distributed ledger to produce a final and agreed-upon time stamp. Chainpoint uses the Network Time Protocol with the National Institute of Standards and Technology Randomness Beacon to provide provable time stamps, and it might be adapted to the needs described here.
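To make the data block matrix described above concrete, here is a minimal Python sketch of the idea (an illustration of the concept only, not the NIST reference design; the cell layout, hash function, and erase marker are simplified assumptions). Erasing a block recomputes only the one row hash and one column hash it touches, mirroring the deletion of block 12 in Figure 1:

```python
import hashlib

class BlockMatrix:
    """Toy data block matrix: per-row and per-column hashes preserve
    integrity evidence for every block except ones deliberately erased."""

    def __init__(self, n):
        self.n = n
        self.blocks = [[b"" for _ in range(n)] for _ in range(n)]
        self.row_hash = [self._hash_row(i) for i in range(n)]
        self.col_hash = [self._hash_col(j) for j in range(n)]

    def _hash_row(self, i):
        h = hashlib.sha256()
        for j in range(self.n):
            h.update(self.blocks[i][j])
        return h.hexdigest()

    def _hash_col(self, j):
        h = hashlib.sha256()
        for i in range(self.n):
            h.update(self.blocks[i][j])
        return h.hexdigest()

    def write(self, i, j, data: bytes):
        if i == j:
            raise ValueError("diagonal cells are null in this sketch")
        self.blocks[i][j] = data
        self.row_hash[i] = self._hash_row(i)
        self.col_hash[j] = self._hash_col(j)

    def erase(self, i, j):
        """Overwrite block (i, j) and recompute only row i and column j hashes;
        every other row/column hash, and thus every other block, is untouched."""
        self.blocks[i][j] = b"\x00"
        self.row_hash[i] = self._hash_row(i)
        self.col_hash[j] = self._hash_col(j)

m = BlockMatrix(5)
m.write(3, 2, b"record to be deleted later")
m.write(0, 1, b"record that must stay verifiable")
before = (m.row_hash[0], m.col_hash[1])
m.erase(3, 2)                                    # analogous to deleting block 12
assert (m.row_hash[0], m.col_hash[1]) == before  # unrelated hashes unchanged
```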


The blockchain data structure and proof-of-work protocol were designed to solve the problem of double spending in cryptocurrencies. Although blockchain has found many applications outside of cryptocurrency, many of its features are not well suited to common data-management applications, leading many to argue that distributed ledgers are only databases with more complex features. As we have described, the added trust of distributed ledgers is a valuable feature, providing greatly simplified auditability and verification of actions among multiple parties in applications, such as supply chain and others.

The blockchain design for hash-based integrity verification provides trust at the cost of an inability to delete or update records, leading to design complications that would not arise with conventional database management systems. Similarly, the sequencing guarantees of blockchain consensus protocols are needed for cryptocurrency in the absence of a universal time stamp. Moreover, actions within the distributed ledger must be connected with other actions in the real world through accurate time stamps.

We have presented a new architecture that provides the trust features of blockchains with characteristics that allow for simpler designs and greater practicality in conventional data management problems. We believe this alternative can lead to new approaches to incorporating trust into distributed systems applications.

DISCLAIMER
The identification of any commercial product or trade name does not imply endorsement or recommendation by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.

REFERENCES
1. F. Le Borne, D. Treat, F. Dimidschstein, and C. Brodersen, "SWIFT on distributed ledger technologies: Delivering an industry-standard platform through community collaboration," SWIFT, Rep. 57189, Apr. 2016. [Online]. Available: https://www.tradefinance.training/library/files/SWIFT%20Position%20Paper%20(DLTs).pdf
2. D. A. Zetzsche, R. P. Buckley, and D. W. Arner, "The distributed liability of distributed ledgers: Legal risks of blockchain," Univ. of Illinois Law Rev., p. 1361, 2018.
3. M. Berberich and M. Steiner, "Blockchain technology and the GDPR," Eur. Data Protection Law Rev., vol. 2, no. 3, p. 422, 2016. doi: 10.21552/EDPL/2016/3/21.
4. O. Kharif, "Is your blockchain doomed?" Bloomberg Businessweek, Mar. 22, 2018. [Online]. Available: https://www.bloomberg.com/news/articles/2018-03-22/is-your-blockchain-business-doomed
5. T. Simonite, "Banks embrace Bitcoin's heart but not its soul," MIT Technology Review, Sept. 24, 2015. [Online]. Available: https://www.technologyreview.com/s/541686/banks-embrace-bitcoins-heart-but-not-its-soul
6. U.K. Government Chief Scientific Adviser, "Distributed ledger technology: Beyond block chain," Government Office for Science, 2016. [Online]. Available: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/492972/gs-16-1-distributed-ledger-technology.pdf
7. J. Condliffe, "Is an editable blockchain the future of finance?" MIT Technology Review, Sept. 20, 2016. [Online]. Available: https://www.technologyreview.com/s/602434/is-an-editable-blockchain-the-future-of-finance
8. D. R. Kuhn, "A data structure for integrity protection with erasure capability," NIST, Gaithersburg, MD, White Paper, 2018.
9. A. Stavrou and J. Voas, "Verified time," IEEE Computer, vol. 50, no. 3, pp. 78–82, 2017. doi: 10.1109/MC.2017.63.
10. Tierion, "Chainpoint: Innovations in blockchain timestamp proofs." Accessed on: Jan. 2019. [Online]. Available: https://blog.tierion.com/chainpoint-innovations-in-blockchain-timestamp-proofs

RICK KUHN is a computer scientist at the National Institute of Standards and Technology. He is a Fellow of the IEEE. Contact him at [email protected].

DYLAN YAGA is a computer scientist at the National Institute of Standards and Technology. Contact him at [email protected].

JEFFREY VOAS is a cofounder of Cigital. He is an IEEE Fellow. Contact him at [email protected].
