THE MAGAZINE OF USENIX & SAGE April 2003 • volume 28 • number 2

inside: CONFERENCE REPORTS: WIESS '02, OSDI '02

The Advanced Computing Systems Association & The System Administrators Guild

conference reports

This issue's reports focus on WIESS '02 and on OSDI '02.

2nd Workshop on Industrial Experiences with Systems Software (WIESS '02)
BOSTON, MASSACHUSETTS
DECEMBER 9-11, 2002

OUR THANKS TO THE SUMMARIZERS: Scott Banachowski, Richard S. Cox, Steven Czerwinski, Himanshu Raj, Cristian Tapus, Charles P. Wright, Praveen Yalagandula, Wanghong Yuan, Nickolai Zeldovich, Ben Zhao, Yutao Zhong

KEYNOTE ADDRESS
Douglass J. Wilson, IBM

Summarized by Richard S. Cox

MIT's Technology Review recently ran a story titled "Why Software Is So Bad." The key is the problem of integration. CIOs spend 35% of their budgets on integration, because every new system must work with the existing infrastructure. The complexity of integration is driven up by the constraints of the business environment as well as those of the software.

Several lessons can be learned from studying systems usage. First, standards and componentization are proving ineffectual for complex systems. For example, LDAP is a fine protocol, but no two organizations use the same schema. Making matters worse, interoperability is poor due to differing interpretations of standards, edge conditions, and vendor-specific extensions. This is leading to a change from creating solutions by mixing "best-of-breed" products to using a single "best-of-suite" package. Unfortunately, much of the literature on building component systems is academic, failing to deal with the scale of large systems.

Second, systems will fail. Other industries have accepted this, but software engineers are just now realizing that failure is hard. The recovery design must fit the usage, which means the designer must understand the failure modes in practice. This may mean using less sophisticated algorithms that are better fitted to the purpose. It also means accepting that business redundancy may be at odds with IT redundancy. For example, a stock trading system may have several redundant pathways for entering a trade, to protect against trades being lost before they have been entered. For the IT infrastructure, this means the redundant pathways need to be synchronized at some point. This type of problem is rarely considered by researchers or product developers.

Third, error logging and reporting are important. As an industry, we currently support very primitive logging, with no mechanisms for root-cause analysis or correlation of failures. Error messages are often arcane or not useful, and "first-failure" capture is impossible. This is evidenced by a common, though unrealistic, request from support center staff: "Turn logging on and recreate the failure." Because logging events need to be correlated, error tracking and logging should be a basic service of the OS.

SESSION 2
Summarized by Wanghong Yuan

USING END-USER LATENCY TO MANAGE INTERNET INFRASTRUCTURE
Bradley Chen, Michael Perkowitz, Appliant

The problem addressed in this paper is that distributed application performance is important but hard to understand. CDN selection and CRM systems were offered as examples to illustrate the problem. The basic approach proposed is to use end-user latency analysis: (1) content (e.g., an HTML Web page) is tagged to collect data; (2) tagged data is observed on the desktop (end-client system); and (3) data is analyzed on the management server.

The challenges for this approach include (1) technique issues such as large data sets, heavy-tailed data, and the derivation of request properties, and (2) social and economic issues such as privacy. The results show that end-user latency analysis can monitor relevant information that is otherwise obscured.
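The server-side analysis step of the approach above can be sketched in a few lines. This is an illustrative toy, not Appliant's system: the record shape and page names are invented, and the median is used because heavy-tailed latency data makes averages misleading.

```python
from statistics import median

# Hypothetical latency reports a tagged page might send back to the
# management server: (page, end-user latency in milliseconds).
reports = [
    ("/checkout", 420), ("/checkout", 1800), ("/home", 95),
    ("/checkout", 510), ("/home", 130), ("/home", 88),
]

def summarize(reports):
    """Group reports by page and summarize each group's latency."""
    by_page = {}
    for page, ms in reports:
        by_page.setdefault(page, []).append(ms)
    return {page: {"n": len(ms), "median_ms": median(ms), "worst_ms": max(ms)}
            for page, ms in by_page.items()}

summary = summarize(reports)
```

With the sample reports, the /checkout page's 1800 ms worst case stands well apart from its 510 ms median, which is the kind of tail behavior the talk flagged as hard to see without end-user measurement.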

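The keynote's call for "correlation of failures" amounts to being able to gather the log events of one request across independent services. A minimal sketch, with every field name and message invented for illustration:

```python
# Hypothetical log events from three services handling the same requests.
events = [
    {"req": "r1", "svc": "web",  "msg": "500 returned"},
    {"req": "r1", "svc": "db",   "msg": "connection pool exhausted"},
    {"req": "r2", "svc": "web",  "msg": "200 ok"},
    {"req": "r1", "svc": "auth", "msg": "token verified"},
]

def correlate(events, req_id):
    """Collect every service's log lines for one request, so a failure can
    be read across components instead of one arcane message at a time."""
    return [(e["svc"], e["msg"]) for e in events if e["req"] == req_id]

trail = correlate(events, "r1")  # the web error and the db exhaustion line up
```

The point of making this an OS-level service, as the keynote argues, is that the shared request identifier has to exist before the failure occurs; it cannot be reconstructed afterward by "turning logging on and recreating the failure."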
82 Vol. 28, No. 2 ;login:

BUILDING AN "IMPOSSIBLE" VERIFIER ON A JAVA CARD
Damien Deville, Gilles Grimaud, Université des Sciences et Technologies de Lille

The smart card device environment imposes constraints on CPU, memory, and I/O. As a result, the Java Card Virtual Machine needs to be adapted to the smart card. The regular verification approaches do not fit, since unification is costly. The proposed approach addresses the above problems via (1) non-stressing encoding and (2) efficient fixed points using a software cache policy.

ENHANCEMENTS FOR HYPER-THREADING TECHNOLOGY IN THE OPERATING SYSTEM: SEEKING THE OPTIMAL
Jun Nakajima, Venkatesh Pallipadi, Intel

In this talk, Jun Nakajima first gave an overview of Hyper-Threading (HT) technology by comparing it with multiprocessors. The reason behind HT is that CPU units are not fully utilized. To fully utilize CPU units, the HT approach is to use two architectural sets, thereby executing two tasks simultaneously. The HT approach requires the OS scheduler to support HT-aware idle handling, processor-cache affinity, and scalability (per-package run queues). This paper proposes a micro-architecture scheduling assist (MASA) methodology to address the above problems, thereby achieving an optimal process placement.

INVITED TALK

SOFTWARE STRATEGY FROM THE "1980 TIME CAPSULE"
John R. Mashey

Summarized by Yutao Zhong

John Mashey reused the slides from a talk he gave 25 years ago titled "Small Is Beautiful and Other Thoughts on Programming Strategies." It is interesting to see from these old slides and the newly added comments what has changed and what hasn't. The previous talk was given in 1977, when the main computer models were IBM mainframes, the coming VAX, and PDP-11s, while C was taking the place of ASM and structured programming became the dominating idea.

Three approaches to building system software were introduced and compared: "Do it right," "Do it over," and "Do it small, with tools." "Do it right" emphasizes an optimistic up-front requirements analysis that assumes "we know what we are doing." "Do it over" puts more emphasis on early implementation but still starts from scratch. The last approach, by contrast, considers tools instead of systems and builds small and fast so that, if necessary, failures can happen quickly.

In order to see the effect of these strategies, Mashey discussed different metrics to qualitatively measure success and gave statistics and observations of projects in data processing. Figures and numbers showed the low percentage of complete success and indicated that the larger a project is, the higher the overhead it has to pay. Laws of program evolution also state that the entropy of a project increases with time and may result in a complex program used to solve a simple problem.

Several principles were offered to counteract these problems: "build it fast," "keep it small and simple," and "build for change." Existing tools should be utilized whenever possible. It would be good to build tools and consider the interfaces connecting tools. Some "small tactics," including "lifeboat theory," "sinking lifeboat theory," and other considerations about people and consolidation, were also discussed.

Even after 25 years of work, we need to keep these problems in mind, since system complexity is much higher nowadays; fortunately, people are increasingly aware of these issues. Mashey ended the talk by saying, "We have met the enemy and they are us."

SESSION 4
Summarized by Cristian Tapus

AN EXAMINATION OF THE TRANSITION OF THE ARJUNA DISTRIBUTED TRANSACTION PROCESSING SOFTWARE FROM RESEARCH TO PRODUCTS
M.C. Little, HP–Arjuna Labs; S.K. Shrivastava, Newcastle University

Arjuna started in 1986 as a research project at the University of Newcastle, England. Arjuna was a "vehicle for getting Ph.D. degrees." The decision to use C++ was a pragmatic one (expensive Euclid vs. free C++ from AT&T). Arjuna was designed to be a toolkit for the development of fault-tolerant applications which would provide persistence, concurrency control, and replication. Modularity was the key to the longevity of the system.

In 1994 Newcastle University asked them to implement a student registration system because "academic researchers are cheap." The system was supposed to run on multiple platforms, serve about 15,000 students over five days, and could not tolerate failures. There were problems, though. Assumptions were made about network partitions and recovery that made the system fail to distinguish dead machines from slow connections. Intuition is not a good approach to designing systems.

The year 1995 brought standards for transactions: the Object Transaction Service specification (OTSS) from the OMG. It shared many similarities with Arjuna, but it was only a two-phase commit protocol engine (persistence and concurrency control were required from elsewhere). At this time the OTSArjuna system was developed. With only slight changes to the interfaces between modules, the system was complying with OTS. JTSArjuna followed just two years later as the first Java transaction service.

In 1999 the Java and C++ transaction services were marketed; only one year later Bluestone took over Arjuna Solutions Limited and was, in turn, acquired by HP in 2001. When the system was

acquired by Bluestone, the need for real testing became a reality. For the previous decade only about 20 tests had been used, but this was increased to over 4000 tests in order to stretch every feature of the system. The previous method was to get a release out to the users, and users would then report problems back and bugs would be fixed. Not anymore. The industry method was different from many perspectives: write manuals and white papers and train other people. "I used to laugh at white papers, but I realized you need skills. An academic person cannot do it. Academic people do technical reports, which are different," Little said.

The talk continued by describing techniques used to obtain the final product. When you hit bedrock, try to rethink what you are doing, and observe the "rule of holes": if you are in one, stop digging. In the end, certain lessons were learned from the development process: designs and reviews are important, but reviews are not perfect; there needs to be a willingness to stop and change course when necessary and to throw code away, even if it works; and you need someone nearby who's close to the process but objective.

Was it worth it? YES. But it was stressful moving away from R&D. "If you have a family don't do it. If you are in the industry and you feel you are stressed up, move to academia." When asked what they would do differently, the reply was that they would (1) get somebody else to do the failure recovery and (2) make sure that they would have more than 20 tests.

TREE HOUSES AND REAL HOUSES: RESEARCH AND COMMERCIAL SOFTWARE
Susan LoVerso and Margo Seltzer, Sleepycat Software

Susan talked about the process they followed to make a commercial product out of BerkeleyDB. The main argument was that a research prototype is like a tree house – it doesn't last – while a commercial product is the real house. Sleepycat was founded in 1996 and transformed DB 1.85 into a real product. It added transactions, utilities, and recoverability while continuing to be open source. Sleepycat is a "distributed" company; with employees spread across the world, it is hard for them to interact with each other directly. But the heterogeneous environment makes the company more powerful. In order to produce quality software, however, Sleepycat must follow rigorous software practices. Designs and reviews are sent to the entire engineering staff (one advantage of being small), and there are strict coding standards (it is the law).

JOINT WIESS/OSDI PANEL: RESEARCH MEETS INDUSTRY
Chair: Noah Mendelsohn; Panelists: Ramon Caceres, ShieldIP; Mark Day, Cisco Systems; Charles Leiserson, MIT; Dick Flower, HP; Brian Bershad, University of Washington

Summarized by Nickolai Zeldovich

The joint panel discussed issues related to the bridge between research and industrial development and their correlation. Below are capsules of the discussion by the panel and the audience.

Brian Bershad: Knowing how to teach helps in the industry, as does having a degree. You need to know how to manage and motivate people in academia. Coming back to academia, you start asking questions like, "Who cares about this project?" "Will it scale?" and so on.

Someone Whose Name I Forgot: People in academia are generally not interested in details, testing, and usability, which are needed to take something from research to a product. The industry in general is also not very interested in research work, reading papers, and so on.

Mark Day: More incentives are needed to get industry and academia to interact. Currently there are almost no such incentives.

Andrew Hume: I do technology transfer at AT&T. The problem is enticing researchers, because you go for a while without publishing papers. On the other hand, you can then write a different kind of paper, about the real aspects of systems. Academia should care more about results having to do with real details.

Noah Mendelsohn: Academic papers don't line up with industrial interests. If conferences did accept industry papers, would companies write them?

Andrew Hume: Yes. Motivating factors are satisfaction and recognition, perhaps because this is rare.

Noah Mendelsohn: There's also an opportunity cost to writing a paper, of losing developer time.

Charles Leiserson: Students going into industry don't understand company culture; they are used to the academic environment.

Margo Seltzer: There needs to be motivation for companies to write papers. Engineers want to write papers, but they need to sell papers to managers, as a tool for marketing, for example.

Brian Bershad: Often companies don't want intellectual property published. Thus, commercial papers lack technical detail.

Charles Leiserson: Writing papers is usually as useful internally as getting them published. Papers help internal communication.

Roblis(?), Intel: At Intel Labs, writing papers is rewarded and expected. In the product groups, however, it is viewed as a net negative. It would be useful if conferences could accept/reject rough drafts to avoid wasted write-up efforts. Anthropologists studied engineers and found that usually there are a few "leaders" in engineering groups who go to conferences, lead the effort to turn something into a product, etc. Maybe we don't need people transfer; we just need to market things to these "leaders"?

Dick Flower(?): There are groups without leading individuals. Having an advanced development group of some sort could be useful, though.

Brian Bershad: I think some companies have reasonable expectations of the research world, and some companies don't.

John(?): The HotChips conference, for example, only produces presentations and not papers. It's much easier to get a presentation, rather than a paper, from a lead chip designer.

Mark T (MS Research): At PARC, of the people who went to industry, none ever came back for long. Can you ever come back from the industry?

Brian Bershad: No, it's not possible to come back and be the same. Your focus changes to short-term goals.

Charles Leiserson: In my lab, lots of people, including staff, did it OK. Doing so colors your interest, though. You learn about things like barriers to adoption, etc.

Mark Day: Do you mean returning to applied research or to academia?

Mark T (MS Research): PARC returnees were successful in the industry and kept going back to form new startups. Focus on doing something with impact in the world.

Noah Mendelsohn: Having gone to industry before grad school gave me a great perspective on reality, judging the realism of projects, etc. It's very hard to do research part-time.

Bradley Chen (Appliant): What do you think about requiring faculty to have industrial experience?

Charles Leiserson: It depends on the quality of the experience. But yes, there are things from industry to be taught to university people: management, leadership, motivation, educating about teamwork, working with each other.

Fred Douglis (IBM): Were there more core industrial papers back when USENIX took extended abstracts?

Chris Small: We seem to have a hangover after the dot-com boom. There was a huge flux of ideas from research to commerce.

Brian Bershad: The dot-com boom shows what happens when the barriers to adoption from research are removed. The result wasn't so great – too many worthless ideas, no industrial experience. Doing a startup is easier the second time around.

Someone from VMware: We were lucky to have good timing to submit our paper to OSDI – the submission deadline was a few months after an internal deadline, which gave us time to gather results. The community should be more receptive to papers about released or dead products; they are valuable.

Jun Nakajima (Intel): In this economy, R&D costs are being reduced and moved to China. For the cost of one engineer here you can get three to five engineers in China. How do you justify the three-to-five-times cost?

Charles Leiserson: Education. Also location – most other companies are located here.

Erez Zadok (Stony Brook): Academia is not preparing students for life in industry. It's difficult to convince universities to create courses with practical aspects.

5th Symposium on Operating Systems Design and Implementation (OSDI '02)

TECHNICAL SESSIONS

DECENTRALIZED STORAGE SYSTEMS
Summarized by Himanshu Raj

FARSITE: FEDERATED, AVAILABLE, AND RELIABLE STORAGE FOR AN INCOMPLETELY TRUSTED ENVIRONMENT
Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer, Microsoft Research

The goal of this research was to make a scalable serverless distributed file system while maintaining security against malicious attacks in untrusted systems. Byzantine protocols are used to deal with the untrusted infrastructure. The FARSITE solution is a virtual global file store of encrypted data that is replicated to facilitate availability. Storage is divided into two parts: file data and metadata. Metadata information is a hash-computed form of the actual file data. The system is built around an infrastructure that stores file data and metadata separately, and traditional Byzantine properties are applied only to machines storing metadata.

Since Byzantine operations are costly, they are not performed per file I/O. Instead, the result of a Byzantine operation is valid for the period of a lease. Various types of leases are available to suit different consistency requirements. Batching is another concept used to reduce cost. The system is implemented as a user-level service and a kernel-mode driver that routes the actual file system calls to NTFS. According to the results reported, the system performs better than a central file system, though worse than bare-bones NTFS. The system is not designed to address efficient large-scale write sharing, database semantics, or disconnected operations. The project link is http://research.microsoft.com/sn/Farsite/.
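The lease idea described above can be sketched as follows: the expensive agreement step runs only when no unexpired lease covers the file, so its cost is amortized over many accesses. This is an illustrative toy with invented names and an integer clock, not FARSITE's protocol.

```python
class LeaseTable:
    """Toy sketch of lease-based amortization of an expensive operation."""

    def __init__(self, lease_len):
        self.lease_len = lease_len
        self.expiry = {}            # path -> time the current lease expires
        self.agreement_rounds = 0   # stand-in for costly Byzantine operations

    def access(self, path, now):
        if now >= self.expiry.get(path, 0):   # no valid lease: pay the cost
            self.agreement_rounds += 1
            self.expiry[path] = now + self.lease_len
        # under a valid lease, the access proceeds with no agreement round

table = LeaseTable(lease_len=30)
for now in range(0, 100, 10):   # ten accesses spread over 100 "seconds"
    table.access("/a/file", now)
```

Here ten accesses trigger the expensive path only four times; a longer lease would trade fewer agreement rounds for a longer window in which the leased result may be stale, which is why FARSITE offers multiple lease types for different consistency requirements.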

TAMING AGGRESSIVE REPLICATION IN THE PANGAEA WIDE-AREA FILE SYSTEM
Yasushi Saito, Christos Karamanolis, Magnus Karlsson, Mallik Mahalingam, HP Labs

Pangaea is a scalable distributed file system targeted at the type of WAN infrastructure characteristic of multinational companies with overseas corporate offices and a need to share data. Design goals of Pangaea include hiding WAN link latencies, availability in a high-change environment, and network usage efficiency. The system assumes the presence of an available secure infrastructure, such as a VPN. The system provides eventual consistency, though manual open/close-style consistency could also be provided. The system employs pervasive replication to dynamically replicate each file/directory in the system independently. Benefits drawn from intensive replication are speed, availability, and network efficiency.

The system is implemented based on the SFS API. The NFS client at the kernel level routes I/O requests to the Pangaea service at the user level. The system uses graph-based replica management: a random graph is created for every file/directory in the system, and the edges of this graph are used for update propagation among replicas and for replica discovery. Since the system does not have a central lock manager, it uses a technique called harbingers to compute a spanning tree so that duplicate transmissions can be avoided. This technique also helps reduce the propagation delay. The project link is http://www.hpl.hp.com/research/ssh.

IVY: A READ/WRITE PEER-TO-PEER FILE SYSTEM
Athicha Muthitacharoen, Robert Morris, Thomer M. Gil, Benjie Chen, MIT

The main goal of Ivy is to build a highly available file system out of inexpensive infrastructure that can scale to multiple writers on the same data. The system leverages a DHT in the core, and provides weaker consistency guarantees for metadata for performance reasons. The main idea behind the system is to use a log per user and combine potentially multiple logs to serialize the updates made on a shared object. Serialization is based on a version-vectoring scheme rather than a timestamp technique. The evaluation compares Ivy's performance with a local file system to see the load characteristics, and runtime comparisons are made with NFS over a WAN. Results show that log operations tend to dominate the performance of the system over a WAN, and parallel fetching of log records can be used to hide latency.

The system addresses sharing among only a small number of writers and hence does not address the scalability issues involved with a large number of writers sharing an object. Merging of logs is performed later, as in the Coda file system, and conflict resolution is addressed then. The way to provide effective read sharing in Ivy is to use multiple file systems. The project link is http://pdos.lcs.mit.edu/ivy.

ROBUSTNESS
Summarized by Ben Zhao

DEFENSIVE PROGRAMMING: USING AN ANNOTATION TOOLKIT TO BUILD DOS-RESISTANT SOFTWARE
Xiaohu Qie, Ruoming Pang, Larry Peterson, Princeton University

Qie began by examining how typical DoS attacks work. One attacks a Web server, for example, by intentionally slowing down TCP, faking packet loss, and attempting to tie down as many TCP connections at the server end as possible. It is useful to classify resources as renewable (CPU, network bandwidth, disk bandwidth) or nonrenewable (processes, file descriptors, memory buffers). Renewable resources are vulnerable to "busy attacks," which try to request the resources faster than they can be allocated; the corresponding solution is protection via admission control. Nonrenewable resources are vulnerable to "claim-and-hold attacks," which attempt to request and hold on to them; the corresponding solution is to recycle resources when they are exhausted, reclaiming them from certain applications. Combinations of the two types of resource attacks are harder to deal with. For example, when file descriptors (nonrenewable) are recycled to protect against claim-and-hold attacks, they become a renewable resource and are therefore vulnerable to busy attacks.

The proposal is to use a toolkit containing "sensors" and "actuators" to protect both types of resources, with low programming burden. The toolkit is a combination of techniques from work in protection, static analysis, anomaly detection, and profiling. To protect renewable resources, the approach is to divide functionality into distinct services and balance resources among them, such that the impact of an attack on a single service is limited to that service. To protect nonrenewable resources, they need to be recycled when necessary. The algorithm that chooses the resource instance to reclaim can be driven by a timer, which can be set on idleness or on the length of the service lifetime. The work also proposes a user-defined progress metric (amount of data output or number of state transitions) for reclaiming resources from the "slowest" principal.

The toolkit is implemented as 11 C macros and library functions. The authors also modified gcc for auxiliary code generation at compile time. The evaluation contains case studies of the Flash Web server. The Web server was partitioned into 46 services; 60 annotations were added to the code. Under a slash attack, the annotated server's response time is 5.1 milliseconds, compared to a normal response time of 4.3 milliseconds, and is significantly lower than that of a non-annotated server under attack, which has a response time of 25 seconds. A possible limitation is that the toolkit's effectiveness depends on service granularity. The project link is http://www.cs.princeton.edu/nsg.
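The claim-and-hold defense described above amounts to evicting the least-productive holder when a nonrenewable resource runs out. The real toolkit is C macros woven into a server; the sketch below is a toy with invented names, using "bytes of output" as the user-defined progress metric:

```python
class RecyclingPool:
    """Toy sketch of recycling a nonrenewable resource (e.g., connection
    slots): when the pool is full, reclaim the slot of the principal
    making the least progress by a user-defined metric."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.progress = {}   # principal -> progress metric (bytes of output)

    def note_progress(self, principal, amount):
        self.progress[principal] = self.progress.get(principal, 0) + amount

    def acquire(self, principal):
        if principal not in self.progress and len(self.progress) >= self.capacity:
            victim = min(self.progress, key=self.progress.get)  # "slowest" principal
            del self.progress[victim]                           # recycle its slot
        self.progress.setdefault(principal, 0)

pool = RecyclingPool(capacity=2)
pool.acquire("well-behaved")
pool.note_progress("well-behaved", 4096)  # steadily produces output
pool.acquire("attacker")                  # claims a slot, then holds it idle
pool.acquire("new-client")                # pool full: the idle slot is reclaimed
```

The attacker acquires a slot and then makes no progress, so when a new client arrives at the full pool it is the attacker's slot that gets recycled, while the productive holder is untouched.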

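Ivy, summarized earlier in this session, serializes per-user log records with version vectors rather than timestamps. Ivy's real log format is more involved; this toy sketch, with an invented record shape, shows the two ingredients such a scheme needs: a causal comparison between vectors, and a deterministic tie-break for concurrent records.

```python
def dominates(v1, v2):
    """True if version vector v1 is causally at-or-after v2 in every component."""
    return all(v1.get(k, 0) >= v2.get(k, 0) for k in set(v1) | set(v2))

def serialize(records):
    """Order log records by causal depth, breaking ties between concurrent
    records deterministically by user id; records are (user, vector, op)."""
    return [op for _, _, op in
            sorted(records, key=lambda r: (sum(r[1].values()), r[0]))]

create  = ("alice", {"alice": 1}, "create f")
append1 = ("bob",   {"alice": 1, "bob": 1}, "append f v1")
append2 = ("carol", {"alice": 1, "carol": 1}, "append f v2")  # concurrent with bob's

ops = serialize([append2, append1, create])
```

Because neither append dominates the other, every replica must fall back on the same tie-break to agree on one order; that is the serialization step, while the later, Coda-style log merge handles any application-level conflict the concurrent writes caused.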
USING MODEL CHECKING TO DEBUG DEVICE FIRMWARE
Sanjeev Kumar, Kai Li, Princeton University

Device firmware is a piece of concurrent software that achieves high performance at the cost of software complexity. It contains subtle race conditions that make it difficult to debug using traditional debugging techniques. The problem is further compounded by the lack of debugging support on the devices. Model checking is a promising approach: it can systematically explore all possible scheduling orders and provide counter-examples for the bugs it finds. The general technique is to extract models from programs either manually or via a compiler. The authors extracted models for the Spin model checker from programs written in the ESP language, a language for programmable devices whose compiler is used to generate the models. In the evaluation, the techniques were applied to VMMC (a high-performance communication design that bypasses the OS for data transfers). The VMMC firmware was reimplemented using ESP. Several bugs were found using the abstract models, despite the global nature of some of the bugs (deadlock). These bugs would be hard to find without using a model. Where a full search of the state space is not possible, partial searches can minimize resource costs and still produce useful results.

CMC: A PRAGMATIC APPROACH TO MODEL CHECKING REAL CODE
Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler, David L. Dill, Stanford University

Many system errors do not emerge unless some intricate sequence of events occurs. In practice, this means that most systems have errors that only trigger after days or weeks of execution. Model checking is an effective way to find such subtle errors. This work contributes the C model checker (CMC), which links to code, emulates a real system, captures the states of the system, and analyzes the results. CMC schedules threads to emulate nodes in the network, where scheduling granularity is on the order of entire event handlers. This means handlers are treated atomistically, and synchronization bugs can be missed. CMC tries to search the entire space, but it can checkpoint at decision points and resume later where different states can be generated.

The work uses three optimizations to reduce the search space: hash compaction, downscaling, and state canonicalization. Hash compaction is the use of hashtables to store previously seen states so they are not examined again, computing hashed signatures for each state to reduce space requirements. Downscaling is the use of a small number of nodes in order to reduce the state space; complex interaction bugs can still be produced, but it might miss bugs seen only in large-scale interactions. State canonicalization is the simplification of similar states down to a single state, which is then evaluated. When applied to AODV routing protocol implementations, the CMC checker found 42 bugs (of which 34 are distinct, and one is a bug in the specification).

KERNELS
Summarized by Charles P. Wright

PRACTICAL, TRANSPARENT OPERATING SYSTEM SUPPORT FOR SUPERPAGES
Juan Navarro, Rice University and Universidad Católica de Chile; Sitaram Iyer, Peter Druschel, and Alan Cox, Rice University

Translation lookaside buffer (TLB) coverage has decreased by a factor of 1000 in 15 years. In 1985 the TLB miss overhead was less than 5%; today it is over 30%. This is primarily due to increases in the size of working sets, yet TLB size has remained constant. Many architectures allow the creation of superpages. A superpage TLB is like a normal TLB, but a size field is added. Navarro presented a practical implementation of superpages for FreeBSD 4.3.

There are three major issues when implementing superpages: (1) superpages require allocation of contiguous, aligned memory; (2) a superpage can be created out of several normal pages (promoted) or broken into several pages (demoted); and (3) internal fragmentation needs to be prevented. Each issue is dealt with in an opportunistic manner. For example, once an application touches the first page of a memory object it will quickly touch every page, so each superpage is created as large as possible and at the earliest point. To do this, reservation is employed, but a reservation may be broken if the memory is needed (the oldest reservations are broken first). The same type of opportunistic algorithm is applied to promotion and demotion. To keep fragmentation low, the page daemon restores contiguity, and wired pages are clustered. On the SPEC CPU2000 integer and floating-point benchmarks, a performance improvement of about 11% was observed. For a large matrix transposition, an improvement of over 600% was observed. More information is available at http://www.cs.rice.edu/~jnavarro/superpages/.

VERTIGO: AUTOMATIC PERFORMANCE-SETTING FOR LINUX
Krisztián Flautner, ARM Limited; Trevor Mudge, University of Michigan

Flautner presented a software framework that does energy management by setting the processor speed. The processor consumes 32% of the power budget on small devices (e.g., PDAs). Vertigo focuses on power management when the CPU is performing work, not when the CPU is idle. The underlying principle is to run just fast enough to meet deadlines, without using higher power consumption. An increase in performance creates an exponential increase in energy usage: it is better to use a smaller amount of computing power for a longer period of time than to use a large amount of power over a short period. Vertigo is a module that monitors system execution to determine

April 2003 ;login: OSDI ‘02 87 how fast things need to go. There are five to reduce overall power consumption), of aggregate queries (count, max, aver- hooks in the kernel (e.g., task switching, VFS modifications, and ext2 modifica- age, etc.). Madden asserts that most some system calls, and swapping) that tions. The goal of these modified com- common data-analysis operations are are used to determine activity. A policy ponents is to cluster I/O operations into aggregate operations. For example, the stack combines multiple simple algo- batches, thus leaving the drive idle for average temperature over all the sensors rithms to determine the best perfor- the longest period of time possible. For (or in a given sector) is a more interest- mance level. Each algorithm can be the Amp MP3 player, 150 lines of code ing indicator than the temperature at specified for a specific performance situ- were modified (the bit rate was used to each individual node. ation. For example, an interactive per- determine timeouts). While this modi- There are several methods that can be formance algorithm may monitor X fied Amp was running, an unmodified used to decrease communication. The server events. mail client using write was used. Using first method is to incrementally com- coop_read, a power consumption was Vertigo was compared to the Crusoe pute values using partial state records reduced to 210 joules from 373 joules. LongRun on-chip power saving system. (PSRs). For example, an average can be This is a better energy savings than an Using application-specific knowledge transmitted to a node’s parent as a sum “Oracle” policy and one which always was very effective. For example, the Cru- and a count, and the parent’s values can makes the right power decision based soe LongRun would cause spikes to full be inserted into this PSR. Additionally, upon a previous trace, without modify- power when the GNOME clock ticked. snooping or guesses can improve perfor- ing the timing of the I/O. 
When playing MPEG movies both Ver- mance. If the desired aggregate is the tigo and LongRun do not drop any max, a node does not need to communi- PHYSICAL INTERFACE frames, but Vertigo used 52% of the cate its own value if it hears a value peak performance level and LongRun Summarized by Charles P. Wright larger than its own. If the root knows the used 80%. The conclusion is that the TAG: A TINY AGGREGATION SERVICE FOR max value is at least 50, then it can kernel has lots of valuable information AD-HOC SENSOR NETWORKS reduce communication by communicat- that is lost on the chip. Samuel Madden, Michael J. Franklin, ing this value to other nodes. Joseph M. Hellerstein, University of COOPERATIVE I/O: A NOVEL I/O California, Berkeley; Wei Hong, Intel More information can be obtained at SEMANTICS FOR ENERGY-AWARE Research http://telegraph.cs.berkeley.edu/tinydb/. APPLICATIONS Sensor networks are a collection of FINE-GRAINED NETWORK TIME Andreas Weissel, Bjorn Beutel, and small, inexpensive battery-run devices SYNCHRONIZATION USING REFERENCE Frank Bellosa, University of Erlangen with sensors and RF interfaces. Pro- BROADCASTS Traditional operating system power gramming a sensor network is a difficult Jeremy Elson, Lewis Girod, Deborah management assumes the timings of task: It took two weeks for two experi- Estrin, UCLA disk operations by user applications are enced students to program a vehicle- To present a consistent view of informa- unknown and cannot be influenced. tracking sensor network. TAG eliminates tion, sensor networks need to have a Additionally, transitioning to a low- the need to program sensor networks by consistent view of time. This problem power mode will actually waste power if using an SQL-like declarative lan- has already been solved on the Internet the transition was unnecessary. 
Cooper- guage—using TAG, the same vehicle (e.g., NTP), but sensor networks do not ative I/O changes this assumption by position network was programmed in have the infrastructure available to introducing three new system calls: two minutes. Sensor networks are Internet hosts. Sensor applications also coop_read, coop_write, and coop_open. installed under harsh conditions (e.g., in have stronger time synchronization Along with the standard parameters for habitat- or earthquake-monitoring requirements than the Internet (tracking these calls, a timeout and an abortable applications). The primary metric used phenomena may require microsecond- flag are passed (e.g., a MPEG player may for sensor networks is power consump- level synchronization). specify that I/O can be deferred until the tion. Berkeley “Mica Motes” run for only frame actually needs to be decoded). two to three days when using full power Elson presented reference broadcast syn- This allows the operating system to but can last up to six months at a 2% chronization (RBS). Traditional syn- schedule I/O intelligently. duty cycle. Communication dominates chronization methods have lots of nondeterministic delay when sending There are three components to coopera- the power consumption cost, so they use packets (e.g., backoff timers or link-level tive I/O: a modified IDE driver that bytes sent as a metric. retransmission). Receiving a packet that shuts down the disk after the break-even To reduce the communications over- a host sent has much less variation than point (the number of seconds required head, TAG allows in-network processing the time it takes to actually send a packet
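The PSR mechanics are easy to see in miniature. The sketch below is illustrative Python, not the TinyDB API; the helper names (init_psr, merge_psrs, finalize) are invented for the example.

```python
# Sketch of TAG-style partial state records (PSRs) for AVERAGE.
# Illustrative only: these helpers are invented, not the TinyDB API.

def init_psr(reading):
    """A leaf node reports its reading as a (sum, count) pair."""
    return (reading, 1)

def merge_psrs(a, b):
    """A parent combines children's PSRs without losing information."""
    return (a[0] + b[0], a[1] + b[1])

def finalize(psr):
    """Only the root turns the merged PSR into the final average."""
    total, count = psr
    return total / count

# Three sensors report temperatures up a two-level routing tree.
leaf1, leaf2 = init_psr(20.0), init_psr(22.0)
parent = merge_psrs(leaf1, leaf2)          # parent forwards (42.0, 2)
root = merge_psrs(parent, init_psr(27.0))  # root holds (69.0, 3)
print(finalize(root))                      # 23.0
```

Because a (sum, count) pair merges losslessly, each node transmits one small record regardless of how many descendants it has; sending raw readings instead would grow linearly with subtree size.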

FINE-GRAINED NETWORK TIME SYNCHRONIZATION USING REFERENCE BROADCASTS
Jeremy Elson, Lewis Girod, Deborah Estrin, UCLA

To present a consistent view of information, sensor networks need to have a consistent view of time. This problem has already been solved on the Internet (e.g., by NTP), but sensor networks do not have the infrastructure available to Internet hosts. Sensor applications also have stronger time-synchronization requirements than the Internet: tracking phenomena may require microsecond-level synchronization.

Elson presented reference broadcast synchronization (RBS). Traditional synchronization methods suffer lots of nondeterministic delay when sending packets (e.g., backoff timers or link-level retransmission). Receiving a packet that a host sent has much less variation than the time it takes to actually send a packet (1 bit width for receive vs. 1,000 for send). Therefore, two hosts can each note the time at which they received a packet sent by a third host; the two receivers then know the difference between their clocks. Clock skew perturbs this observation, however, so a best-fit line is used to determine the difference.

RBS synchronized the clock on a Compaq iPaq to a precision of 6 microseconds, whereas NTP is only able to obtain a precision of 53 microseconds. The clock resolution on the Linux platform is only 1 microsecond; Elson believes that a more accurate clock would yield better results. The performance under a 6Mbps load shows even better results: RBS degrades to 8 microseconds, but NTP degrades to 1,542 microseconds.

RBS effectively removes sender nondeterminism from network time synchronization. This facilitates a wide range of applications, including acoustic ranging and collaborative signal detection.

SUPPORTING TIME-SENSITIVE APPLICATIONS ON A COMMODITY OS
Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole, Oregon Graduate Institute

Fast processors enable interactive real-time applications in software: for example, software radio, software modems, voice over IP, video conferencing, and accurate network traffic generators. These applications, however, need millisecond- to microsecond-level timing guarantees. It has long been accepted that providing such timing guarantees requires a special real-time OS and that general-purpose OSes need a complete redesign. Real-time operating systems have many disadvantages (e.g., nonstandard interfaces and small user communities). Goel presented time-sensitive Linux (TSL), which aims to provide real-time performance on commodity general-purpose operating systems using an evolutionary approach.

The requirements for TSL were fine-grained timers, a responsive kernel, and an accurate implementation of a good scheduler. There are two types of kernel timers: fine-grained (soft) and one-shot (firm) timers. There are two overheads to consider when evaluating timers: reprogramming and interrupts. Reprogramming the timer turns out to be inexpensive, but the interrupts are expensive. Soft timers, however, have a potentially unbounded latency. TSL uses firm timers. Firm timers insert checks into kernel paths (e.g., on system call entry and exit, to check the timer) but also use one-shot timers that are configured to overshoot the required delay. This provides some guarantee while, hopefully, reducing the number of interrupts.

Linux has already broken down the big kernel lock, but some locks are still large, so TSL uses voluntary lock yielding to increase kernel responsiveness. Finally, TSL implements a proportional-share scheduler that provides a constant-speed virtual machine. Even heavy-load software modems (which require 4-16-millisecond guarantees) are supported. TSL implements sub-millisecond timing guarantees on a general-purpose operating system. TSL imposes 1.5% overhead, which is low, for firm timers; the additional kernel preemption points introduce an overhead of 0.5%.

PANEL
Summarized by Steven Czerwinski

SELF-ORGANIZING NETWORKS FROM SENSOR NETS TO P2P: PANACEA OR PIPE-DREAM?
Co-Moderators: Peter Druschel, Rice University; David Culler, University of California, Berkeley/Intel; Panelists: Hari Balakrishnan, MIT; Yaneer Bar-Yam, NECSI; John D. Kubiatowicz, University of California, Berkeley

Are self-organized networks superior to traditionally engineered solutions, and are they necessary to solve today's problems? In Culler's opinion, self-organized networks are necessary but require engineering in order to build systems with the desired predictable global behaviors. They are necessary in sensor networks because of the near impossibility of administering and configuring millions of sensor nodes; they must be able to self-organize. However, as he showed with several different self-organizing methods to compute spanning trees, designing the local rules that lead to correct global behavior is difficult. This is where engineering needs to be applied.

Balakrishnan also advocated self-organized networks because they can eliminate human misconfiguration from distributed systems and allow such systems to adapt to errors and change. He argued that distributed systems are all about enabling autonomy at subsystems, but with this autonomy come problems with misconfiguration. Using traces, he showed how a significant portion of invalid DNS queries were caused by human misconfiguration.

Kubiatowicz used a thermodynamics analogy to argue the importance of self-organizing networks. Large systems can exhibit stability through statistics if they possess replicated components that interact and adjust to one another. Energy could be injected into the system through both passive and active correction mechanisms. He labeled such systems “thermospective.” With Moore's Law enabling redundancy and with the need to eliminate human configuration, he saw these systems as being the future.

Druschel presented a spectrum of current distributed systems, with decentralized approaches on one end and self-organizing ones on the other. He argued that natural (biological) systems are the only truly self-organizing systems, with sensor networks being fairly close; systems requiring ACID semantics have difficulty making it onto the self-organized end. He also noted that the systems we engineer are robust to both mundane failures and malicious attacks, while self-organizing ones are robust only to mundane failures. They would require (at the least) a trusted certificate authority to be robust to attacks.

Bar-Yam used analogies from biology to show that we already have the conceptual tools to demystify self-organizing networks. It may be hard to understand the progression of a mouse embryo from a macroscopic perspective, but that's because we don't understand the local rules or patterns of behavior of the smaller components. He showed how different types of patterns of behavior (such as local majority, two-dimensional condensation, and local activation/long-range inhibition) can lead to interesting phenomena, such as the stripes on a zebra's back.

Audience members pointed out the difficulties of creating such a system from an economic and business standpoint (who pays for all of this?) along with privacy concerns (do you really want your data going anywhere and everywhere?). Some also cautioned against the misuse of biology and other non-computer-science metaphors, which can encourage similarities being drawn where none exist.

VIRTUAL MACHINES
Summarized by Praveen Yalagandula

MEMORY RESOURCE MANAGEMENT IN VMWARE ESX SERVER
Carl A. Waldspurger, VMware

This won the Best Paper award. VMware ESX Server is a thin kernel that multiplexes hardware resources among virtual machines. The three main issues that arise in memory resource management are fairness, performance isolation among virtual machines, and efficient utilization of the available machine memory.

To efficiently reclaim memory from a virtual machine, Waldspurger proposes the ballooning technique, where a driver inside the virtual machine allocates some pages, forcing the guest OS to evict pages not in use or to swap some pages out. Experimental results show that there is only a small overhead of 1.4% to 4.4% in using this technique.

Efficient use of available machine memory is provided through memory sharing, where a single page on the machine is shared by multiple VMs (using copy-on-write semantics). A background process computes hashes of pages to determine the duplicate pages. For “best case” workloads, in which multiple Linux VMs are run, about 60% memory savings are observed. For real workloads, the savings ranged from 7% to 32%.

For a memory-allocation scheme that provides fairness among virtual machines while remaining efficient, the author proposes the concept of an “idle memory tax,” where idle pages are charged more than active pages. This new mechanism resulted in a 30% throughput increase for the workload considered in the experiments.

REVIRT: ENABLING INTRUSION ANALYSIS THROUGH VIRTUAL-MACHINE LOGGING AND REPLAY
George W. Dunlap, Samuel T. King, Sukru Cinar, Murtaza A. Basrai, and Peter M. Chen, University of Michigan

The aim of this work is to provide a way to perform post-mortem analysis of intrusions. Typical system logs are subverted by the intruder. The “CoVirt” project aims at enhancing security by running the target OS and all target services inside a virtual machine (VM) and then adding security services in the VM or host platform. ReVirt checkpoints and logs a VM's execution trace so that it can be replayed later. The virtual machine used is UMLinux, a Linux kernel that can be run on any other Linux machine.

To enable complete replay, checkpointing covers the memory, CPU, and disk states, and logging covers all keyboard and network events and interrupts, along with the data corresponding to these events. Replaying the interrupts is a hard problem; the authors use a tuple to uniquely identify the place in execution where each interrupt should happen. The overhead ranged from 1% to 58% for different workloads. The logging overhead on runtime is about 8%, and the log grew at a rate of 1.4GB/day in the worst-case workload and 0.04GB/day in the best case.

SCALE AND PERFORMANCE IN THE DENALI ISOLATION KERNEL
Andrew Whitaker, Marianne Shaw, and Steven D. Gribble, University of Washington

The goal of this work is to enable the execution of untrusted code while providing isolation so that the untrusted code does not interfere with any other process on the system. The Denali “isolation kernel” isolates untrusted software services in separate protection domains. The approach is to use virtual machines to provide isolation, with strategic modifications for scalability, simplicity, and performance.

Denali's virtual machine architecture achieves scalability and performance at the cost of giving up backward compatibility. It omits rarely used features like the BIOS, protection rings, etc.; revises interrupts and the MMU; and simplifies hardware I/O instructions. The resulting core kernel is an order of magnitude smaller than the bare-bones Linux 2.4.16 kernel.

For scalability, Denali employs the following techniques: (1) batched, asynchronous interrupts: instead of invoking a VM when an interrupt arrives, interrupts are batched together and applied when the corresponding VM is scheduled, thus reducing the overhead of context switches; and (2) an idle-with-timeout instruction, which allows a VM to specify how long it yields, thus leading to better scheduling. The first technique provided a 30% improvement in performance in experiments, and the second scheme yielded a 100% throughput improvement.
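Denali's first technique, queueing interrupts and delivering them in one batch when the target VM is next scheduled, can be sketched as a toy model. The class and method names below are invented for illustration; this is not Denali's actual interface.

```python
from collections import defaultdict, deque

class ToyIsolationKernel:
    """Toy model of Denali-style batched, asynchronous interrupts:
    interrupts are queued per VM and delivered together when the VM
    is scheduled, instead of forcing a context switch for each one."""

    def __init__(self):
        self.pending = defaultdict(deque)  # per-VM interrupt queues
        self.context_switches = 0

    def raise_interrupt(self, vm, irq):
        # No context switch here: just record the interrupt for later.
        self.pending[vm].append(irq)

    def schedule(self, vm):
        # One switch delivers the whole accumulated batch.
        self.context_switches += 1
        batch = list(self.pending[vm])
        self.pending[vm].clear()
        return batch

kernel = ToyIsolationKernel()
for irq in ("net-rx", "timer", "net-rx"):
    kernel.raise_interrupt("vm0", irq)
print(kernel.schedule("vm0"))    # ['net-rx', 'timer', 'net-rx']
print(kernel.context_switches)   # 1, instead of 3 with synchronous delivery
```

The trade-off is latency for throughput: a queued interrupt waits until its VM runs again, which is acceptable for the services Denali targets but would not be for interactive guests.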

CLUSTER RESOURCE MANAGEMENT
Summarized by Praveen Yalagandula

INTEGRATED RESOURCE MANAGEMENT FOR CLUSTER-BASED INTERNET SERVICES
Kai Shen, University of Rochester; Hong Tang, University of California, Santa Barbara; Tao Yang, University of California, Santa Barbara, and Ask Jeeves; Lingkun Chu, University of California, Santa Barbara

The challenges involved in hosting large-scale, resource-intensive Internet services on a server cluster are: (1) scalability and robustness, (2) timely response, (3) efficient resource utilization, (4) adaptive resource management, and (5) differentiated services. The goal of the Neptune project is to provide programming and runtime-environment support for effective management of services through partitioning, replication, and aggregation. Instead of using monolithic metrics such as throughput, mean response time, etc., the authors define the “quality-aware service yield” of a request as the economic benefit resulting from servicing that request in a timely fashion, and then try to maximize the aggregate service yield over all requests.

Service differentiation is based on service classes, where service accesses of a particular service class obtain the same level of service support. A service class can be a set of client identities, service types, or data partitions. In Neptune, request distribution and scheduling are done at two levels: gateways randomly poll the servers and try to achieve load balancing, and service differentiation is done at the servers. This two-level architecture provides scalability and robustness at the cost of less isolation and fairness. Within a server, a request scheduler schedules requests from the queues belonging to different classes onto several worker threads such that the aggregate yield is maximized. The offline optimal scheduling problem is NP-complete; hence, the authors use heuristics such as Earliest Deadline First (EDF), Yield-Inflated Deadline (YID), Greedy, and Adaptive techniques. The experimental results show that Adaptive outperforms all the other heuristics on a 16-node cluster.

RESOURCE OVERBOOKING AND APPLICATION PROFILING IN SHARED HOSTING PLATFORMS
Bhuvan Urgaonkar, Prashant Shenoy, University of Massachusetts; Timothy Roscoe, Intel

The goal of this work is to maximize the number of hosted applications on a server cluster while providing resource guarantees to the applications. Taking a worst-case load and assigning that amount of resources is not efficient, since the average load of an application is typically an order of magnitude less than the worst case. So the authors propose the scheme of overbooking resources and show that this scheme is feasible and maximizes the revenue generated by the available resources.

The authors define “capsules” as the components of an application that run on a node. To determine the resource requirements of a capsule, the authors perform “application profiling” using the Linux Trace Toolkit (for CPU and memory requirements) with well-known traces. From typical application profiles, the authors conclude that these capsules exhibit different degrees of burstiness, and they use token buckets to represent the resource requirements. A token bucket for a capsule, with two parameters s and p, states that the resource usage of that capsule over any time period t has to be <= s*t + p. Each capsule also specifies an overbooking tolerance parameter, O, to denote the probability with which the resource requirements of that capsule can be violated. Once capsules' resource requirements are estimated, they are mapped to nodes using a simple greedy algorithm; a capsule can be mapped to a node only if the node can satisfy the capsule's resource requirements. The experimental results show that there is a 100% improvement with just 1% overbooking.

AN INTEGRATED EXPERIMENTAL ENVIRONMENT FOR DISTRIBUTED SYSTEMS AND NETWORKS
Brian White, Jay Lepreau, Leigh Stoller, Robert Ricci, Shashi Guruprasad, Mac Newbold, Mike Hibler, Chad Barb, and Abhijeet Joglekar, University of Utah

Typically, network experiments are done through simulation, emulation, or live networks. While simulation is repeatable but not accurate, live-network experimentation is realistic but not repeatable. Emulation is a hybrid approach that creates a synthetic network environment but requires tedious manual configuration. Netbed complements existing experimental environments by spanning simulation, emulation, and live experimentation, integrating them into a common framework. The integration allows ease of use while remaining realistic. About 2,176 experiments were run on Netbed within the last 12 months by about 365 users.

Netbed uses a virtual-machine approach to network experimentation, and configuration time is improved through automation by two orders of magnitude. Network nodes are emulated using virtual machines on a cluster of nodes, and links, including WAN links, are emulated using VLANs and tunnels. The network topology to be emulated can be specified either in an ns-type Tcl-based specification or in a Java-based GUI. A global resource allocator then assigns local cluster resources to the different components of the requested network topology. Configuring a six-node dumbbell network took just 3 minutes on Netbed, compared to a 3.5-hour effort by a student with significant Linux system administration experience.

For more information, see http://www.netbed.org.
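The (s, p) token bucket used in the overbooking work bounds a capsule's usage over any window of length t by s*t + p. A hedged sketch of how such a profile might be derived from a measured usage trace follows; the function names and the brute-force window scan are mine, not the authors' profiling code.

```python
# For a per-second CPU-usage trace, find the smallest burst parameter p
# such that usage over every window of length t stays <= s*t + p, and
# measure how often a given (s, p) envelope would be violated.

def min_burst(trace, s):
    """Smallest p making usage(window) <= s*len(window) + p everywhere."""
    p = 0.0
    for i in range(len(trace)):
        window_sum = 0.0
        for j in range(i, len(trace)):
            window_sum += trace[j]
            t = j - i + 1
            p = max(p, window_sum - s * t)
    return p

def violation_rate(trace, s, p):
    """Fraction of windows exceeding the (s, p) envelope."""
    total = violations = 0
    for i in range(len(trace)):
        window_sum = 0.0
        for j in range(i, len(trace)):
            window_sum += trace[j]
            total += 1
            if window_sum > s * (j - i + 1) + p:
                violations += 1
    return violations / total

trace = [0.1, 0.1, 0.9, 0.1, 0.1]        # bursty capsule, average rate 0.26
p = min_burst(trace, s=0.3)
print(p)                                  # burst allowance needed on top of rate 0.3
print(violation_rate(trace, 0.3, p))      # 0.0 -- worst-case provisioning
```

Overbooking replaces the worst-case p with a smaller value whose violation rate stays below the capsule's tolerance O, which is what lets the platform pack in more applications.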

PEER-TO-PEER INFRASTRUCTURE
Summarized by Scott Banachowski

SCALABILITY AND ACCURACY IN A LARGE-SCALE NETWORK EMULATOR
Amin Vahdat, Ken Yocum, Kevin Walsh, Priya Mahadevan, Dejan Kostic, Jeff Chase, David Becker, Duke University

Yocum discussed a network traffic emulator designed to provide realistic scenarios for complex systems such as the Internet. Using the emulator, called ModelNet, has advantages over simulation because it allows execution of real code while still providing control over network conditions not possible with live deployment. The goals when developing ModelNet included support for 10K nodes with a 10Gbps bisection bandwidth, and realistic emulation of network failures and cross-traffic.

The emulator organizes networks into two types of nodes: (1) edge nodes that run the code being tested and connect through (2) core nodes that run ModelNet emulation code. A technique called “distillation” is the key to providing the scalability necessary for handling large numbers of nodes. Distillation transforms the topology of core nodes, which represent the Internet, into a smaller subset of nodes that preserves only interesting links, including the first and last hops of the edge nodes. In this approach, instead of injecting packets that incur processing overhead for an emulator, cross-traffic is simulated by changing the characteristics of the connections through the core nodes.

The ModelNet emulator was verified by reproducing experiments from a previously published study of the CFS storage system layered on the Chord distributed hash table. Running Chord/CFS on the edge nodes and substituting ModelNet for the network, the throughput of data transfers closely matched the previously published results. Yocum concluded with the assertion that ModelNet is effective for studying how your code behaves in a large-scale network running on its native OS. Questions from audience members revealed that it is not yet known exactly how far ModelNet scales, and that it does require a lot of storage.

More information is available at http://issg.cs.duke.edu/modelnet.html.

PASTICHE: MAKING BACKUP CHEAP AND EASY
Landon P. Cox, Christopher D. Murray, Brian D. Noble, University of Michigan

Users rarely, if ever, make backups of their personal systems, because it is expensive and time-consuming. Capitalizing on the trend that many disks are less than half full, Pastiche is a system for peer-to-peer backup of files on others' computers. Recognizing that many of the binaries on a disk are identical to the binaries of other users, much of the cost of transferring data is eliminated. The goal of Pastiche is efficient, cost-effective backup, while preserving individual privacy.

As its name implies, Pastiche is assembled from already existing technologies. Pastiche uses content-based indexing of data, the same technique employed by LBFS. Data is fingerprinted and divided into chunks, and a hash function uniquely identifies each chunk. Using only a subset of fingerprints from a disk (for example, a fingerprint from a Windows distribution), Pastiche can identify redundant copies of the data on other machines. To locate machines for backing up data, or “backup buddies,” Pastiche uses two overlay networks determined by Pastry, a peer-to-peer routing infrastructure. A mechanism called “lighthouse sweep” was added to Pastry to ensure a geographically diverse set of nodes.

When participating in Pastiche, your system may contain information that backs up your peers' systems, so the file system must ensure that this data is not deleted or modified. The Chunkstore file system views all data as chunks and assembles files for users in objects called “container files.” When data from a container is modified, it is written to a new chunk, preserving the older versions of the data. The performance of backup and restore operations is comparable to VFS copies.

The talk generated enough controversy that there were long lines at the questioning microphones, mostly people interested in more in-depth comparisons with other backup methods.

SECURE ROUTING FOR STRUCTURED PEER-TO-PEER OVERLAY NETWORKS
Miguel Castro, Microsoft Research; Peter Druschel, Rice University; Ayalvadi Ganesh, Antony Rowstron, Microsoft Research; Dan S. Wallach, Rice University

While peer-to-peer overlay networks are scalable, self-organizing, and robust with respect to node failure, they are susceptible to malicious participants. The talk presented several attacks on these overlays, followed by a discussion of defenses.

Castro began with an overview of the Pastry routing overlay and then described several attacks on this technique. In one type of attack, a node can choose its node ID so that, instead of being random, it is positioned to control another node's network access or prevent availability of objects. A defense against this attack would be to certify node IDs using keys from a trusted source. To prevent users from obtaining a large number of node ID certificates, it was suggested, certificates might require purchasing. Other attacks on overlays affect routing: for example, supplying peers with fake proximity information or bad routing-table information to increase the probability that messages travel through a malicious node. A defense against attacks on routing is to maintain a fallback table, with constrained and more verifiable routing, for use when the performance-based routing table fails. Finally, a malicious node may drop or misroute messages; a solution is to incorporate a routing test and, if it fails, rely on a redundant route.
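The value of redundant routes is easy to see with a back-of-the-envelope model. The sketch below assumes independent, node-disjoint routes and a uniformly random malicious fraction f that simply drops traffic; these are my simplifying assumptions, not the analysis from the talk.

```python
def delivery_prob(f, hops, routes):
    """P(message delivered) over `routes` node-disjoint paths of `hops`
    hops each, when a fraction f of nodes is malicious and drops traffic."""
    p_one_route = (1.0 - f) ** hops          # every hop on one path is honest
    return 1.0 - (1.0 - p_one_route) ** routes

# With 25% malicious nodes and 5-hop paths, a single route usually fails,
# but a handful of redundant routes pushes delivery probability well up.
print(round(delivery_prob(0.25, 5, 1), 2))   # 0.24
print(round(delivery_prob(0.25, 5, 4), 2))   # 0.66
```

This is why the defenses combine: constrained routing keeps hop counts (and thus exposure to malicious nodes) small, and redundant routes multiply the surviving delivery probability.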

Using these security techniques, peer-to-peer protocols may still work even when up to a quarter of the nodes of an overlay network are malicious, and they provide efficiency when the actual number of compromised nodes is small. In the question period, one audience member quipped that the idea of charging for certificates was the work of the presenter's employer and suggested that the alternative, real-world authentication based on a user's identity, is more viable.

WORK-IN-PROGRESS REPORTS
Summarized by Scott Banachowski

DISCOVERING BOTTLENECKS IN DISTRIBUTED SYSTEMS
Athicha Muthitacharoen, MIT; Jeffrey C. Mogul, Janet L. Wiener, HP Labs
Contact: Athicha Muthitacharoen, [email protected]

In large distributed systems it is not always possible to investigate the causes of performance bottlenecks created by internal, proprietary components, because discovering problems often requires instrumenting these components to measure statistics. MIT is developing a tool to identify critical paths using a passive trace of messages. Using the relationships between messages, the tool automatically infers the source of bottlenecks.

WITNESS: LEADER ELECTION WITHOUT MAJORITY
Haifeng Yu and Amin Vahdat, Duke University
Contact: Haifeng Yu, [email protected]

The title must have been inspired by the last presidential election. Many distributed algorithms require that a node be elected as leader, but under some kinds of failures it is impossible to guarantee that the elected leader is unique. The new election algorithm provides probabilistic guarantees of a unique leader and is based on choosing a random set of witnesses to participate in the protocol.

CONFIDENTIAL BYZANTINE FAULT-TOLERANCE
Jian Yin, Jean-Philippe Martin, Arun Venkataramani, Lorenzo Alvisi, Mike Dahlin, University of Texas, Austin
Contact: Arun Venkataramani, [email protected]

As replication systems add more servers and heterogeneity, they become increasingly vulnerable to attack, so providing confidentiality for replicated data is a difficult problem. This system increases the intrusion tolerance of a set of replication servers when a number of the servers fail.

NCRYPTFS: A SECURE AND CONVENIENT CRYPTOGRAPHIC FILE SYSTEM
Charles P. Wright, Michael C. Martino, and Erez Zadok, Stony Brook University
Contact: Charles P. Wright, [email protected]

NCryptfs is a stackable file system based on CryptFS from FiST. The low-level file system is transparent to applications. An attach maps an accessed directory to its associated encrypted directory (which stores the actual data in cipher form). Each attach keeps its own data and authorizations private, and on-exit callbacks purge the clear-text data from the kernel.

SUPPORTING MASSIVELY MULTIPLAYER GAMES WITH PEER-TO-PEER SYSTEMS
Wei Xu and Honghui Lu, University of Pennsylvania
Contact: Honghui Lu, [email protected]

A massively multiplayer game supports up to 200,000 players. Traditionally, games use a client-server architecture, but Wei Xu proposes using peer-to-peer protocols. The talk described a mapping of players to subsets of multicast groups. By trading consistency for performance, only “nearby” players need to synchronize their environments using P2P multicast groups. A prototype game was developed using Scribe.

INCREASING FILE SYSTEM BURSTINESS FOR ENERGY EFFICIENCY
Athanasios E. Papathanasiou, Michael L. Scott, University of Rochester
Contact: Athanasios Papathanasiou, [email protected]

This report describes a method to create longer idle times in disk traffic so that these idle periods may be exploited for power saving. The key is to increase the burstiness of accesses using aggressive prefetching combined with new disk-scheduling algorithms. Trace experiments show that the energy reduction from using this technique during MP3 playback reached 55%.

FAB: FEDERATED ARRAY OF BRICKS
Yasushi Saito, Svend Frolund, Arif Merchant, Susan Spence, Alastair Veitch, HP Labs
Contact: Yasushi Saito, [email protected]

The talk described a logical disk system that uses low-cost commodity CPUs and disks and is intended to replace high-end disk arrays. The decentralized system software, based on Petal, achieves high performance and fail-over ability by replicating disk blocks throughout the cluster.

KELIPS: A FAT BUT FAST DHT
Indranil Gupta, Prakash Linga, Dr. Kenneth Birman, Dr. Al Demers, Dr. Robbert Van Renesse, Cornell University
Contact: Indranil Gupta, [email protected]

Kelips is a peer-to-peer probabilistic protocol for group discovery, in which the lookup cost of a file is reduced by enabling the address of any file to be discovered within a single hop. This is achieved by increasing the size of the file-index tables on each peer and using background communication, or “gossiping,” between nodes to keep state updated.
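Gossip dissemination of the sort Kelips relies on can be simulated in a few lines. This is a generic push-gossip toy, with fanout and seed chosen arbitrarily for the example; it is not the Kelips protocol itself.

```python
import random

def gossip_rounds(n_nodes, fanout=3, seed=1):
    """Rounds of push gossip until every node has one file-index entry."""
    random.seed(seed)
    informed = {0}                 # node 0 inserts the new index entry
    rounds = 0
    while len(informed) < n_nodes:
        rounds += 1
        # Each informed node pushes the entry to `fanout` random peers.
        for node in list(informed):
            for _ in range(fanout):
                informed.add(random.randrange(n_nodes))
    return rounds

print(gossip_rounds(1000))   # the entry spreads in O(log n) rounds
```

The logarithmic spread is what lets Kelips afford its “fat” per-node index tables: keeping them fresh costs only low-rate background traffic rather than per-lookup messages.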

April 2003 ;login: OSDI '02 93

IMPROVISED NETWORK: AUTONOMOUSLY RECONFIGURABLE MOBILE NETWORK
Nobuhiko Nishio, Keio University, Japan
Contact: Nobuhiko Nishio, [email protected]

New applications are emerging that use a combination of wireless networks and distributed sensor nodes, as in cellular phones. In such an ad hoc network, both sensors and sink nodes may be mobile, so the research is developing ways to adapt to the changing environment without hurting performance.

PROBABILISTIC ENERGY SAVING IN SENSOR NETWORKS
Santashil PalChaudhuri and David B. Johnson, Rice University
Contact: Santashil PalChaudhuri, [email protected]

On mobile devices, idle and receive periods use about the same amount of energy, so if idle periods can be replaced with true inactivity, the device stands to save a lot of energy. According to the "birthday paradox," a relatively small number of people is enough to ensure a high probability that two of them share the same birthday. Applying this principle to communication, only a small number of nodes is needed to ensure that a sender and receiver are active simultaneously. Using a probabilistic protocol, the device pre-chooses its waking and sleeping periods, introducing some increase in communication latency but drastically reducing power consumption.

SOLAR: SUPPORTING CONTEXT-AWARE MOBILE APPLICATIONS
Guanling Chen and David Kotz, Dartmouth College
Contact: Guanling Chen, [email protected]

The goal of this research is to provide flexible and scalable pervasive computing. Solar is an infrastructure for context computation. An example is a mobile device that subscribes to a set of interesting events; in Solar, by moving the processing of these events to the infrastructure (called "planets"), applications that subscribe to the events remain lightweight. Sharing the computation among several applications reduces both development and network costs.

THE EXNODE DISTRIBUTION NETWORK
Jeremy Millar, University of Tennessee
Contact: Jeremy Millar, [email protected]

exNode is a content distribution network developed to provide access to time-limited data, such as the release of a software product, and is currently used by RedHat. The exNode architecture is effective at distributing load by implementing a highly distributed wide-area RAID system.

SCALABLE CONSTRAINED ROUTING IN OVERLAY NETWORKS
Xiaohui Gu and Klara Nahrstedt, University of Illinois, Urbana-Champaign
Contact: Xiaohui Gu, [email protected]

This system is a step toward value-added service overlays. In overlay networks, such as those used by peer-to-peer applications, it is desirable to satisfy some end-to-end constraints – for example, establishing a level of quality of service between endpoints. Qualay is a proposed overlay network designed to provide QoS constraints over paths. In the setup phase, service paths are chosen by probing nodes; in the runtime phase, faults are detected and paths are rerouted to maintain QoS.

REVERSE FIREWALLS IN DENALI
Marianne Shaw and Steve Gribble, University of Washington
Contact: Marianne Shaw, [email protected]

Shaw presented a way to introduce policies and mechanisms to protect the Internet from bad services. The system allows untrusted code to run in the network infrastructure on a virtual machine, with a reverse firewall that protects the Internet from malicious traffic generated by the VM. The flexible framework allows policies to be added on the fly; in the example provided in the talk, Shaw focused on a "don't speak unless spoken to" policy for containment of client-server code.

IMPROVING APPLICATION PERFORMANCE THROUGH SYSTEM CALL COMPOSITION
Amit Purohit, Joseph Spadavecchia, Charles Wright, Erez Zadok, Stony Brook University
Contact: Amit Purohit, [email protected]

A problem with application performance is the overhead incurred by system calls that move data across the kernel boundary. This system provides a solution that removes user-level bottlenecks by moving user code into the kernel. Using a tool called Cosy, combined with the gcc compiler, designated code is compiled into special code segments that can be loaded into the kernel at runtime. Static and dynamic checks ensure that kernel security is not violated, and adding preemption to the kernel protects against user segments monopolizing the CPU.

PERFORMANCE OF MACH-KERNEL
Igor Shmukler, OS Research
Contact: Igor Shmukler, [email protected]

Shmukler spoke about enhancements to the Mach kernel aimed at increasing its attractiveness to the user community. Although Mach introduced many good ideas, it never really caught on because it was never fine-tuned for common-case performance. Shmukler tried to clear Mach's bad name by discussing proposed improvements, including changing the memory management subsystem, optimizing the RPC implementation, adding new synchronization primitives, and stomping on a slew of bugs.
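The birthday-paradox arithmetic behind the sensor-network scheme is easy to verify. A short calculation (ours, not the paper's): the probability that at least two of n people share a birthday is one minus the probability that all n birthdays are distinct.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday:
    1 minus the probability that all n birthdays are distinct."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct

# With only 23 people the probability already exceeds one half.
p23 = p_shared_birthday(23)
```

The same calculation applies if "days" is read as the number of wake slots in a cycle and "people" as the randomly chosen active slots of a sender and receiver: a surprisingly small number of active slots yields a high probability of overlap.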

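The kernel-crossing overhead that Cosy targets can be observed directly from user space. This is a rough illustration of the cost being avoided (our measurement sketch, not the Cosy mechanism): writing the same megabyte to /dev/null as many small write() calls versus a single large one.

```python
import os
import time

def time_writes(chunks):
    """Time writing the given chunks to /dev/null with one write()
    call per chunk; more calls means more user/kernel crossings."""
    fd = os.open(os.devnull, os.O_WRONLY)
    start = time.perf_counter()
    for chunk in chunks:
        os.write(fd, chunk)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return elapsed

data = b"x" * (1 << 20)  # 1 MiB of payload
many = [data[i:i + 1024] for i in range(0, len(data), 1024)]
t_many = time_writes(many)   # 1024 system calls
t_one = time_writes([data])  # 1 system call
# t_many is typically far larger than t_one, purely from per-call
# crossing overhead -- the cost Cosy removes by running the loop
# inside the kernel.
```

The exact ratio varies by machine, but the per-call cost is what composition amortizes.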
94 Vol. 28, No. 2 ;login:

ELASTIC QUOTAS
Ozgur Can Leonard, Jason Nieh, Erez Zadok, Jeffrey Osborn, Ariye Shater, Charles P. Wright, Kiran-Kumar Muniswamy-Reddy, Stony Brook University
Contact: Jeffrey R. Osborn, [email protected]

"Elastic quotas" for disks are aimed at shared file servers, such as those used by university students, where each user receives a quota of space. By implementing elastic quotas, extra space may be allocated to users for their temporary use, but this space may later be reclaimed. The elastic quota service sets both global and user-assigned policies for how the space occupied by files designated as elastic will be reclaimed, using information such as size or creation time. The next step in their research is to determine whether users will embrace such a system.

A MAIL SERVICE ON OCEANSTORE
Steven Czerwinski, Anthony Joseph, John Kubiatowicz, University of California, Berkeley
Contact: Steven Czerwinski, [email protected]

The Mail Service uses the OceanStore file system to provide low-latency access to email, independent of a user's location. Goals of the system include data durability and relaxed consistency, with application-specific conflict resolution. Following this session, members of the project gave a demo of OceanStore.

NETWORK BEHAVIOR
Summarized by Kenneth Yocum

AN ANALYSIS OF INTERNET CONTENT DELIVERY SYSTEMS
Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, Henry M. Levy, University of Washington

There's a lot more than just Web content being served across the Internet. Now we have CDNs and peer-to-peer systems serving up audio, video clips, and movies. The authors studied HTTP Web traffic, the Akamai CDN, and the Kazaa and Gnutella networks. The basic result of the authors' trace, conducted at the University of Washington, is that peer-to-peer traffic constitutes a large fraction of the bytes, and it's very different from the Web. For example, it may be possible to cache 80–90% of the outbound traffic and 60% of the inbound traffic, but it takes a long time to warm up the cache (about a month). In both directions, P2P objects are three orders of magnitude larger than Web objects, and a small number of objects account for most of the bytes in P2P systems.

TCP NICE: A MECHANISM FOR BACKGROUND TRANSFERS
Arun Venkataramani, Ravi Kokku, Mike Dahlin, University of Texas, Austin

TCP NICE, a building block for background transfers, finds and uses spare bandwidth in the Internet to improve availability, reliability, latency, and consistency. As a new variant of TCP congestion control, TCP NICE is similar to TCP Vegas in that it monitors round-trip times, but it provides three changes: a more sensitive congestion detector, multiplicative reduction in response to increasing RTT, and the possibility of a congestion window of less than one. With NICE you can bound the interference caused by background flows. One use is prefetching: the authors found that NICE could improve performance by a factor of three in a case where using old-style TCP hurt performance by a factor of six.

THE EFFECTIVENESS OF REQUEST REDIRECTION ON CDN ROBUSTNESS
Limin Wang, Vivek Pai, Larry Peterson, Princeton University

We now use replication across geographic distance to deliver content. Client requests are delivered to the "best" candidate based on server load, content server closeness, and cache state. This work describes current schemes and introduces a new one that balances locality, load, and nearness (proximity). The new scheme was shown through simulation to improve system capacity by 60–90% while maintaining low request latencies for clients. One dynamic algorithm, Fine Dynamic Replication (FDR), is especially promising: it keeps fine-grained information on URL popularity to adjust the number of replicas. They're trying to deploy it on PlanetLab.

MIGRATION
Summarized by Richard S. Cox

THE DESIGN AND IMPLEMENTATION OF ZAP: A SYSTEM FOR MIGRATING COMPUTING ENVIRONMENTS
Steven Osman, Dinesh Subhraveti, Gong Su, and Jason Nieh, Columbia University

Zap supports the transparent migration of unmodified applications. The migration of network applications is supported without loss of connectivity, and Zap-migrated processes leave no residual state behind on the previous system. Implementing Zap involves minimal changes to a commodity operating system and requires low overhead.

Three problems must be solved to migrate processes: resource consistency, resource conflicts, and resource dependency. Zap's solution to all three is the process domain (pod). A pod is a private virtual space that may contain a single process, a process group, or a whole user session. As a private space, processes in a pod cannot interact with processes outside the pod. Pods are migrated as a unit. Zap contains pods by introducing a thin layer in the Linux kernel, virtualizing process IDs, IPC, memory, the file system, network, and devices. The overhead of this approach is minimal, and the pod images are small.

More information can be found at http://www.ncl.cs.columbia.edu/research/migrate.
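The elastic-quota reclamation policy described in the Stony Brook talk can be sketched as a simple pass over the elastic files. This is an illustration of the idea, not the authors' implementation; the policy key (oldest-first by creation time here, largest-first would be `key=lambda f: -f["size"]`) stands in for the global and user-assigned policies the service supports.

```python
def reclaim(elastic_files, bytes_needed, key=lambda f: f["ctime"]):
    """Illustrative elastic-quota reclamation pass: remove files
    designated as elastic, in policy order, until enough space is
    freed. Returns the victim names and the bytes freed."""
    victims, freed = [], 0
    for f in sorted(elastic_files, key=key):
        if freed >= bytes_needed:
            break
        victims.append(f["name"])
        freed += f["size"]
    return victims, freed

files = [
    {"name": "a.tmp", "size": 50, "ctime": 3},
    {"name": "b.tmp", "size": 30, "ctime": 1},
    {"name": "c.tmp", "size": 20, "ctime": 2},
]
# Oldest-first: b.tmp (30 bytes) then c.tmp (20 bytes) are reclaimed.
victims, freed = reclaim(files, bytes_needed=40)
```

Separating the mechanism (the pass) from the policy (the sort key) is what lets such a service offer both global and per-user reclamation rules.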

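The three TCP NICE changes described above can be sketched as a per-RTT sender update. This is an illustration of the ideas from the talk, not the published algorithm: an RTT that has climbed a fraction of the way from the minimum toward the maximum observed RTT is treated as early congestion, triggering multiplicative decrease, and the window is allowed to fall below one packet (the sender then waits multiple RTTs per packet).

```python
def nice_update(cwnd, rtt, min_rtt, max_rtt, threshold=0.2,
                min_cwnd=1.0 / 48):
    """Illustrative NICE-style window update. The early-congestion
    detector fires before loss, long before standard TCP would react;
    unlike standard TCP, cwnd may drop below one packet."""
    early_congestion = rtt > min_rtt + threshold * (max_rtt - min_rtt)
    if early_congestion:
        cwnd = max(cwnd / 2.0, min_cwnd)  # multiplicative decrease
    else:
        cwnd += 1.0                        # normal additive increase
    return cwnd

w = 4.0
w = nice_update(w, rtt=10.0, min_rtt=10.0, max_rtt=100.0)  # grows to 5.0
w = nice_update(w, rtt=95.0, min_rtt=10.0, max_rtt=100.0)  # halves to 2.5
```

The sub-one-packet floor is what bounds the interference a background flow can inflict on foreground traffic.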
OPTIMIZING THE MIGRATION OF VIRTUAL COMPUTERS
Constantine P. Sapuntzakis, Ramesh Chandra, Ben Pfaff, Jim Chow, Monica S. Lam, Mendel Rosenblum, Stanford University

By virtualizing the x86 architecture, the VMware GSX server enables an entire virtual machine's (VM) hardware state to be easily suspended and captured. Once saved, the state can be sent to another machine and resumed. However, capturing the entire state generates machine images, or capsules, that are gigabytes in size. This work applies several optimizations to reduce the capsules to a size that can be transferred over a DSL link in under 20 minutes, enabling applications such as user mobility and software updates. The two largest components of a capsule are the disk and memory images.

Using standard copy-on-write techniques, VMware can track the changes to a disk image and transfer only the differences if the target machine already has an old version of the disk image. By hashing each disk block and searching for a block with a matching hash value on the target system, the server can avoid transferring blocks whose contents already exist on the target system. Much of a VM's memory may not be in active use; thus, if VMware could request that the guest OS de-allocate inactive pages, the size of the memory image could be greatly reduced. This is the idea behind ballooning, which utilizes a driver added to the guest OS to reclaim low-priority memory pages prior to suspending the VM. Finally, by demand-paging the disk images, the time to resume the VM on the target can be reduced; demand-paging takes advantage of the disk-latency tolerance already built into modern OSes. Several macro-benchmarks show that the combination of these techniques is effective in reducing both the total data transferred to migrate a capsule and the time-to-start.

LUNA: A FLEXIBLE JAVA PROTECTION SYSTEM
Chris Hawblitzel, Dartmouth College; Thorsten von Eicken, Expertcity

Extensible applications require protection schemes that can isolate extensions while permitting lightweight communication. Java uses language-based approaches to enforce domain separation, enabling cheap communication because of the single address space. However, systems with Java extensions lack clear domain boundaries; all code and objects are stuck together. The resources used by an extension cannot be reclaimed if the extension is terminated, because they may be referenced by other parts of the system.

By introducing a task abstraction, extensions in a Java system can be strongly isolated. Tasks contain all the objects, threads, and code for an extension, and all cross-task communication is explicit. In Luna, regular (local) pointers are not allowed to reference objects in other tasks. Remote pointers, a new type of reference that is allowed to point to objects in other tasks, are Luna's mechanism for intertask communication. Remote pointers may be revoked at any time; if a revoked remote pointer is used, an exception is raised. This allows an entire extension to be removed from the system cleanly, without dangling references in other tasks. Remote pointers are implemented with a two-word structure: the first word is the memory address of the object; the second word is a pointer to the permit, which contains a revocation flag and is checked before each use. As an optimization that removes most checks in common cases, Luna can generate loop code that does not contain any checks. On revocation, threads using the object are suspended, and a breakpoint is placed where the check would have been. If and when the breakpoint is reached, an exception is raised, simulating the effect of the check. Micro-benchmarks, as well as an implementation of an extensible Squid Web cache, confirm that Luna's isolation imposes low overhead.
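The hash-based disk transfer optimization above can be sketched in a few lines. This is an illustration of the technique, not VMware's implementation: split the image into fixed-size blocks, hash each one, and send only blocks whose hash the target does not already hold.

```python
import hashlib

def blocks_to_send(image, target_hashes, block_size=4096):
    """Illustrative content-hash transfer: return (offset, block)
    pairs for every block whose hash the target does not already
    have; matching blocks need not be sent at all."""
    send = []
    for off in range(0, len(image), block_size):
        block = image[off:off + block_size]
        if hashlib.sha256(block).hexdigest() not in target_hashes:
            send.append((off, block))
    return send

old = b"A" * 4096 + b"B" * 4096   # version already on the target
new = b"A" * 4096 + b"C" * 4096   # updated image to migrate
target = {hashlib.sha256(old[i:i + 4096]).hexdigest()
          for i in range(0, len(old), 4096)}
# Only the changed second block needs to cross the network.
sends = blocks_to_send(new, target)
```

Because identical OS and application files recur across machine images, such hash matching can eliminate a large fraction of the bytes a capsule would otherwise transfer.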

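Luna's two-word remote pointer can be sketched directly. This is an illustration of the design described in the talk, not Luna's JVM implementation: one word references the target object, the other a shared permit whose revocation flag is checked on every use, so revoking the permit instantly invalidates every remote pointer that carries it.

```python
class RevokedError(Exception):
    """Raised when a revoked remote pointer is dereferenced."""

class Permit:
    def __init__(self):
        self.revoked = False

class RemotePointer:
    """Illustrative two-word remote pointer: an object reference
    plus a pointer to a shared, revocable permit."""
    def __init__(self, obj, permit):
        self.obj = obj         # first word: address of the object
        self.permit = permit   # second word: pointer to the permit

    def deref(self):
        if self.permit.revoked:  # the per-use check
            raise RevokedError("remote pointer was revoked")
        return self.obj

permit = Permit()
rp = RemotePointer({"data": 42}, permit)
value = rp.deref()["data"]   # works while the permit is live
permit.revoked = True        # terminating the extension revokes it
# rp.deref() now raises RevokedError instead of dangling.
```

Centralizing revocation in the permit is what lets an entire extension's objects be reclaimed cleanly, since no other task can reach them once the flag is set.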