The Dawning of the Autonomic Computing Era
Total Page:16
File Type:pdf, Size:1020Kb
The dawning of the autonomic computing era by A. G. Ganek T. A. Corbi This issue of the IBM Systems Journal on solving it. More than any other I/T problem, explores a broad set of ideas and approaches this one—if it remains unsolved—will actually pre- to autonomic computing—some first steps in vent us from moving to the next era of comput- what we see as a journey to create more self- ing. The obstacle is complexity...Dealing with managing computing systems. Autonomic it is the single most important challenge facing the computing represents a collection and I/T industry. 1 integration of technologies that enable the creation of an information technology One month later, Irving Wladawsky-Berger, Vice computing infrastructure for IBM’s agenda for President of Strategy and Technology for the IBM Server Group, introduced the Server Group’s auto- the next era of computing—e-business on 2 demand. This paper presents an overview of nomic computing project (then named eLiza* ), with IBM’s autonomic computing initiative. It the goal of providing self-managing systems to ad- examines the genesis of autonomic dress those concerns. Thus began IBM’s commitment to deliver “autonomic computing”—a new company- computing, the industry and marketplace wide and, it is to be hoped, industry-wide, initiative drivers, the fundamental characteristics of targeted at coping with the rapidly growing complex- autonomic systems, a framework for how ity of operating, managing, and integrating comput- systems will evolve to become more self- ing systems. managing, and the key role for open industry standards needed to support autonomic We do not see a change in Moore’s law 3 that would behavior in heterogeneous system slow development as the main obstacle to further environments. Technologies explored in each progress in the information technology (IT) indus- of the papers presented in this issue are try. Rather, it is the IT industry’s exploitation of the introduced for the reader. technologies in accordance with Moore’s law that has led to the verge of a complexity crisis. Software de- velopers have fully exploited a four- to six-orders- of-magnitude increase in computational power— On March 8, 2001, Paul Horn, IBM Senior Vice Pres- producing ever more sophisticated software appli- ident and Director of Research, presented the theme cations and environments. There has been exponen- and importance of autonomic computing to the Na- tial growth in the number and variety of systems and tional Academy of Engineering at Harvard Univer- Copyright 2003 by International Business Machines Corpora- sity. His message was: tion. Copying in printed form for private use is permitted with- out payment of royalty provided that (1) each reproduction is done The information technology industry loves to without alteration and (2) the Journal reference and IBM copy- prove the impossible possible. We obliterate bar- right notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed riers and set records with astonishing regularity. royalty free without further permission by computer-based and But now we face a problem springing from the very other information-service systems. Permission to republish any core of our success—and too few of us are focused other portion of this paper must be obtained from the Editor. IBM SYSTEMS JOURNAL, VOL 42, NO 1, 2003 0018-8670/03/$5.00 © 2003 IBM GANEK AND CORBI 5 components. The value of database technology and commonly used number is: For every dollar to pur- the Internet has fueled significant growth in storage chase storage, you spend $9 to have someone man- subsystems to hold petabytes 4 of structured and un- age it.” 8 structured information. Networks have intercon- ● Aberdeen Group studies show that administrative nected the distributed, heterogeneous systems of the cost can account for 60 to 75 percent of the over- IT industry. Our information society creates unpre- all cost of database ownership (this includes ad- dictable and highly variable workloads on those net- ministrative tools, installation, upgrade and deploy- worked systems. And today, those increasingly valu- ment, training, administrator salaries, and service able, complex systems require more and more skilled and support from database suppliers). 9 IT professionals to install, configure, operate, tune, ● When you examine data on the root cause of com- and maintain them. puter system outages, you find that about 40 per- cent are caused by operator error, 10 and the rea- IBM is using the phrase “autonomic computing” 5 to son is not because operators are not well-trained represent the vision of how IBM, the rest of the IT or do not have the right capabilities. Rather, it is industry, academia, and the national laboratories can because the complexities of today’s computer sys- address this new challenge. By choosing the word tems are too difficult to understand, and IT oper- “autonomic,” IBM is making an analogy with the au- ators and managers are under pressure to make tonomic nervous system. The autonomic nervous sys- decisions about problems in seconds. 11 tem frees our conscious brain from the burden of ● A Yankee Group report 12 estimated that down- having to deal with vital but lower-level functions. time caused by security incidents cost as much as Autonomic computing will free system administra- $4,500,000 per hour for brokerages and $2,600,000 tors from many of today’s routine management and for banking firms. operational tasks. Corporations will be able to de- ● David J. Clancy, chief of the Computational Sci- vote more of their IT skills toward fulfilling the needs ences Division at the NASA Ames Research Cen- of their core businesses, instead of having to spend ter, underscored the problem of the increasing sys- an increasing amount of time dealing with the com- tems complexity issues: “Forty percent of the plexity of computing systems. group’s software work is devoted to test,” he said, and added, “As the range of behavior of a system 13 Need for autonomic computing grows, the test problem grows exponentially.” ● A recent Meta Group study looked at the impact As Frederick P. Brooks, Jr., one of the architects of of downtime by industry sector as shown in the IBM System/360*, observed, “Complexity is the Figure 1. business we are in, and complexity is what limits us.” 6 The computer industry has spent decades creating Although estimated, cost data such as shown in Fig- systems of marvelous and ever-increasing complex- ure 1 are indicative of the economic impact of sys- ity. But today, complexity itself is the problem. tem failures and downtime. According to a recent IT resource survey by the Merit Project of Computer The spiraling cost of managing the increasing com- Associates International, 1867 respondents grouped plexity of computing systems is becoming a signif- the most common causes of outages into four areas icant inhibitor that threatens to undermine the fu- of data center operations: systems, networks, data- ture growth and societal benefits of information base, and applications. 14 Most frequently cited out- technology. Simply stated, managing complex sys- ages included: tems has grown too costly and prone to error. Ad- ministering a myriad of system management details ● For systems: operational error, user error, third- is too labor-intensive. People under such pressure party software error, internally developed software make mistakes, increasing the potential of system problem, inadequate change control, lack of au- outages with a concurrent impact on business. And, tomated processes testing and tuning complex systems is becoming more ● For networks: performance overload, peak load difficult. Consider: problems, insufficient bandwidth ● For database: out of disk space, log file full, per- ● It is now estimated that one-third to one-half of formance overload a company’s total IT budget is spent preventing or ● For applications: application error, inadequate recovering from crashes. 7 change control, operational error, nonautomated ● Nick Tabellion, CTO of Fujitsu Softek, said: “The application exceptions 6 GANEK AND CORBI IBM SYSTEMS JOURNAL, VOL 42, NO 1, 2003 Figure 1 Downtime: Average hourly impact INDUSTRY SECTOR ENERGY TELECOMMUNICATIONS MANUFACTURING FINANCIAL INSTITUTIONS INFORMATION TECH INSURANCE RETAIL PHARMACEUTICALS BANKING 0 0.5 1 1.5 2 2.5 3 MILLIONS OF $ REVENUE/HOUR LOST Data from IT Performance Engineering and Measurement Strategies: Quantifying Performance Loss, Meta Group, Stamford, CT (October 2000). Well-engineered autonomic functions targeted at im- “There is only one answer: The technology needs to proving and automating systems operations, instal- manage itself. Now, I don’t mean any far out AI proj- lation, dependency management, and performance ect; what I mean is that we need to develop the right management can address many causes of these “most software, the right architecture, the right mecha- frequent” outages and reduce outages and downtime. nisms...Sothat instead of the technology behav- ing in its usual pedantic way and requiring a human A confluence of marketplace forces are driving the being to do everything for it, it starts behaving more industry toward autonomic computing. Complex het- like the ‘intelligent’ computer we all expect it to be, erogeneous infrastructures composed of dozens of and starts taking care of its own needs. If it doesn’t applications, hundreds of system components, and feel well, it does something. If someone is attacking thousands of tuning parameters are a reality. New it, the system recognizes it and deals with the attack. business models depend on the IT infrastructure be- If it needs more computing power, it just goes and ing available 24 hours a day, 7 days a week. In the gets it, and it doesn’t keep looking for human be- face of an economic downturn, there is an increas- ings to step in.” 15 ing management focus on “return on investment” and operational cost controls—while staffing costs What is autonomic computing? exceed the costs of technology.