motd As I type this, the master copies of the LISA Conference pro- by Rob Kolstad ceedings are printed and shortly to be sent to the publisher. The Dr. Rob Kolstad has subjects range across the spectrum from Alva Couch’s new the- long served as editor of ory work to Josh Simon’s and Liza Weissler’s super practical doc- ;login:. He is also head coach of the USENIX- ument repository implementation. sponsored USA Com- puting Olympiad. A new “lunch with the speakers” program will widen access to presenters, who will appear at informal tables in the courtyard shortly after their talks for lunch (or, in the case of afternoon presenters, late-afternoon drinks). Yet another reason why LISA is the premier conference for administrators. Several topics are appearing on my radar this month, and space won’t permit me to write 650 words about all of them. They are News from All Over interesting enough, though, that I hope you’ll think about them and let me know if you hear report-worthy developments. They On the SAGE front, I am excited to announce SAGE’s biggest are presented here in no particular order. current news: the largest reported barrier to joining SAGE has fallen. SAGE and USENIX dues are now separated! Anyone can Some Predict the Death of the join USENIX for $110/year; anyone can join SAGE for $40/year I’m hearing more pundits (including some traditionally associ- – and there is no longer a requirement for membership in ated with the USENIX community) bemoaning current Internet USENIX as a prerequisite for membership in SAGE. If you’re in problems. Of course, the recent worms/viruses have done noth- the computer administration world, please urge your friends ing to help, but spam and the general noise level (including pop- and associates to join. (See page 36 for details.) up ads for browsers) has prompted ever more to suggest that the One of the benefits of SAGE membership is access to the results usefulness of and the Web is declining. Some suggest that of the annual salary survey. This year’s survey was the largest yet solutions involving balkanization (creation of smaller [sub-]nets (we teamed with SANS and Sys Admin,with almost 10,000 par- that serve specific [potentially distributed] communities) is the ticipants). Results are now posted on the SAGE Web site. Enjoy! only answer. I imagine that will happen anyway, but more as an addition than as a partition. You probably know that USENIX (this year along with SANS) sponsors the USA Computing Olympiad, the premier high Some Predict Quicker Rise of Linux school computing competition (and I am the head coach). Goals Pundits, Information Week,and ofcourse the vocal evangelists including encouragement of careers in computing, in addition are saying that Linux will rise to become a/the dominant operat- to identification and recognition of outstanding programmers. ing system. Larger corporations (who really do move most of the In 2003, Don Piele (long-time USACO director) hosted the money in the IT world) are now in the “when” phase of Linux international programming championships (the “International deployments rather than the “if” phase, according to Information Olympiad on Informatics”) at his institutional home, the Uni- Week magazine polls and others. Why mention this now? versity of Wisconsin–Parkside. Fully 75 countries competed Because the security issues are really pushing management to (about 300 competitors) in the week-long event held August understand what’s going on in the machine rooms, performance 16th-23rd which included 10 hours of grueling programming and usability are now seen as assets, and the SCO lawsuit is pro- competition. viding publicity and visibility to Linux. Regrettably, the rise of Linux means the death-knell for the BSD line. For the first time ever, the USA team tied for highest honors, with two gold medals (for juniors Alex Schwendner and Tiankai Mixed Predictions on Future Job Markets Liu) and two silver medals (for junior Anders Kaseorg and MIT- While outsourcing comments led all others in the free-format bound Timothy Abbott). Being the host country, a second team section of the salary survey, many on the Net also voice concern also competed for honor and earned scores that otherwise would about future careers in IT, administration, software, etc. New have earned medals. Only one senior is graduating this year, so criticisms are appearing that academic education is diverging we’ve got great chances for next year, as well. Best of all, the from the needs of industry. The worries include: lack of curricu- competition was run flawlessly for the first time, with no glitches lum relevance means college graduates are not as employable, or major protests. Congratulations to the competitors, coaches, industry will have to step in to make its own program (which and organizers. will surely be more focused on the near term), enrollments in

2 Vol. 28, No. 5 ;login: traditional programs will drop even more (they’re already way down from highs dur- ing the dot-com boom), and foreign outsourcing will drive anyone still interested into other careers, anyway. More next time. EDITORIAL STAFF EDITOR: Results of the First Haiku Contest Rob Kolstad [email protected] CONTRIBUTING EDITOR: Two months ago, I asked people to send Stateless I wander Tina Darmohray [email protected] me haiku about the Internet. What a fab- My passage HTTP MANAGING EDITOR: ulous set of entries I received! I culled Infinite roaming Alain Hénon [email protected] them down to some of the best, shown My favorite of Tobias J. Kreidl’s submis- below. Thanks to all who wrote; the topic COPY EDITOR: sions ended with a twist, a sort of apoca- Steve Gilmartin of the next contest is at the end of this lyptic tone: article. PROOFREADER Slipping though a pipe jel [email protected] Matthew W. Hurlburt’s first entry worked Bits propagate through routers TYPESETTER: in a sort of anthropomorphism, raising Filling us with fright Festina Lente the specter of digital isolation: One of John Sellens’s entries sported a MEMBERSHIP, PUBLICATIONS, Alone in my room similar theme: AND CONFERENCES Digitally connected Contemplating bits USENIX Association To others like me 2560 Ninth Street, Suite 215 Intrusion detection works Matthew also sent a more traditional Berkeley, CA 94710 Machine is now owned Phone: 510 528 8649 haiku, that included both technology and My favorite (though it’s really hard to : 510 548 5738 a season: Email: [email protected] choose!) was Heather Fox’s more tradi- [email protected] Like a Spring rainstorm tional entry, with its digital, emotional, [email protected] The server stops and restarts and outdoor themes: WWW: http://www.usenix.org Over and over http://www.sage.org Shooting stars of bytes Steve . Johnson submitted several snip- breathing ether in and out pets; here are my favorites: this world has no sun Innocent action Heather wins the first haiku author polo violates segmentation shirt. it’s a null pointer This was great fun! For systems admin Professional adventure Let’s try for another topic. Compose a come visit LISA haiku that reveals the joys or frustrations that surround the process of developing UNIX eternal scripts or programs. Send it to From PDP-11 [email protected]. To my PDA Haiku are little three line poems with five Edward Chrzanowski sent in an interest- syllables in the first line, seven in the sec- ing set of haiku where changing spacing ond line, and five more in the last line. or single letters makes for new haiku with Syllabic stresses (accents) are not impor- new meanings. Consider changing “infi- tant. The brief haiku captures a thought, nite” to “in finite” or changing “wander” a moment, a scene, or a vision. The best to “wonder”: haiku amplify insight or even cause an “Aha!” or epiphany. Many haiku tradi- tionally describe the seasons of the year.

October 2003 ;login: 3 opinion Value Added by Tina Darmohray I was fortunate to land in a soup-to-nuts project the first job I had out of col- Tina Darmohray, lege. I had everything to learn and there was at least one of everyone to learn contributing editor of ;login:, is a com- from. We had scientists, kernel coders, application programmers, database pro- puter security and networking consult- grammers and administrators, documentation specialists, system administrators, ant. She was a founding member of networking and telecom jockeys, and so on. Despite all the different flavors of SAGE. She is cur- rently a Director of computer professionals on the project, I noticed one universal theme: Those who USENIX. understood the whole picture were the most valuable. [email protected] Since I had everything to learn and didn’t have a particular specialty yet, I benefited from every person on the project. While there, I did some programming, project man- agement, technical training, documentation, and database and system administration. I was like a sponge, and there was free-flowing knowledge running steadily through- out the hallways and conference rooms of that project. I did my best to come up to speed on my own, but when I had questions there were always folks to go to for answers. There was so much to learn and, luckily, so many really good people willing to let me learn from them. As I came out from under my initial input overload, I began to notice that I wasn’t the only staffer who sought help from colleagues. In fact, there was information exchange going on for everyone, regardless of their experience or seniority. After a while I noticed a trend, though: There were some folks who were the top question-answerers, and it was almost as if an unofficial org chart was reflecting it. It was these folks whom the experienced staff went to when they had questions. And it was these same people who had the last word in meetings and to whom everyone deferred when the strategic decisions were being made. Since their authority was clear, though not official or necessarily aligning with a title of “manager,”I began to assess what it was they had in common. Were they the scientists on the project? The kernel programmers? Soon I realized that it wasn’t their profession or title which they had in common but their body of knowledge. In every case these local sages were the people who knew more than just their area of specialty. For the programmers, that meant they knew the kernel as well as the applications and net- working. For the network folks, it was the guy who could talk with the kernel pro- grammers. For the sysadmins, it was the one who could administer the network applications as well as the local machine, and for the database folks it was the person who understood the OS and its administration, too. In every case, these people were able to bridge the gap to their peers. On this particular project, which drew on so many different disciplines, those who could do so were key. But I’ve observed this type of employee is always key, no matter where they work. It’s like they are translators in the tower of Babel, the hub of a communications wheel. For their ability to do and understand more than one thing, they are more valuable and more utilized than their peers. Thinking back 20 years, I realize that additional value hasn’t changed. The people who understand the way things work, top to bottom, are still the most valuable and influ- ential folks on any project. In this economy, being that person may mean you keep your job or get the next one. Take every opportunity to broaden your body of knowl- edge; not only will you bring added value to your project, you’ll also add value to your resume.

4 Vol. 28, No. 5 ;login: introducing voice over Internet

by Ryan Kereliuk Ryan Kereliuk is a protocol technology contract developer ECHNOLOGY

focusing on systems T and network soft- Overly optimistic marketing during the so-called communications revolution ware ranging from l has given voice over Internet protocol (VoIP) technology the stigma of device drivers to pro- tocol stacks. He being the next big thing that never materializes. I’ll risk the same mistake by holds a master’s degree in CS from suggesting that VoIP is emerging as a viable Internet application. the University of Alberta. Key VoIP Drivers Broadly speaking, three factors will motivate the adoption of VoIP: [email protected]

n Reduced ownership and operational costs n Simplification n A roadmap for building next-generation services Operational cost is paramount. Long distance charges are the first and most obvious expense. Skeptics suggest that VoIP has missed its window because of fierce competi- tion in the long distance market, but the market continues to grow, so the window remains open. Furthermore, even a few cents a minute is far more expensive than uti- lizing an otherwise unused – and already paid for – resource. Personnel costs also add up. Most companies large enough to have their own private branch exchange (PBX) are staffed with a telecom group. Since VoIP is premised on open, interoperable standards, telephone services become an application more akin to running an HTTP server than a traditional phone system. One is able, then, to leverage the competencies of an IT networking group and invest resources there, which is attractive given the versatility of those staff. Security is a feature that one gets “for free” with VoIP.VoIP is secured in the same way as other Internet services: by minimizing attack vectors, using strong authentication, and protecting important servers with a firewall. On the management front, IP phones can often be configured using DHCP and TFTP, for instance, which exploits services that usually exist already. Simplification might save money directly but is a benefit in its own right. VoIP con- verges voice and networking, reducing the number of service providers a company must deal with. (Of course, I’m not predicting the death of the public switched tele- phone network or PSTN in the short or medium term.) On one hand, ISPs offer straightforward billing plans for bit pipes. On the other, VoIP empowers people to take control of the server infrastructure that telcos use to extract complex fees for every move/add/change operation. The most important simplification, though, concerns the phones themselves. If VoIP is really just another Internet application, then are phones even required? So-called hard phones are available to recreate the experience of using a regular phone, but soft clients, ordinary PC applications that run on desktop computers, are making inroads. In time soft clients will dominate. VoIP will offer new features over legacy phone systems, namely rich media and conver- gent integration. Rich media rejects the assumption that a communications channel spans only one medium (e.g., audio). Instead, voice, video, and text can be shared simultaneously. VoIP protocol engineers, particularly those with the Internet Engi-

October 2003 ;login: VOICE OVER INTERNET PROTOCOL TECHNOLOGY l 5 Estimates suggest that VoIP neering Task Force (IETF), have shown tremendous commitment to generality, so today’s protocols are equipped to help people communicate in whichever ways serve can save 30% over conven- them best. tional telecom rollouts after Convergent integration – unified messaging – suggests that voice mail and should, like email, be accessible using any device connected anywhere on the Internet. accounting for long distance This unification might even extend to integrating VoIP software into customer rela- charges, personnel, servers, tionship management suites or e-commerce applications. The case for VoIP is closely tied to cost as well as potential new services. Estimates sug- and phones. gest that VoIP can save 30% over conventional telecom rollouts after accounting for long distance charges, personnel, servers, and phones. Thinking About Deployment A VoIP project, not surprisingly, begins with a requirements analysis that informs the trade-offs faced at every stage of the decision-making process. Key areas for considera- tion include cost, network requirements, protocol selection, client and server hardware and software, impact on existing infrastructure, and migration paths. Regardless of the underlying technology, certain questions govern subsequent choices and, in the end, serve as evaluation criteria for any deployment. Three that deserve consideration are:

n System utilization n Interoperability n Quality Utilization refers to the capacity of a given system and is related to the numbers of active and potential users: How many VoIP endpoints will be connected, and what percentage of those will be active at any one time? Interoperability is concerned with which users can connect with one another: Is the system meant for internal use only or for use with external parties? In the latter case, are external parties reached by using some (possibly different) VoIP protocol or by using a gateway to the PSTN? Finally, unlike legacy phone systems, VoIP offers the opportunity to trade quality for other benefits. What sort of quality do users expect in terms of system availability and media fidelity? Call quality is governed by the codec in use, assuming the network transport is able to keep up. The term codec is familiarly expanded to coder/decoder but, these days, com- pressor/decompresser is an equally valid meaning. Different media codecs require dif- ferent network resources but also provide different levels of quality. The G.711 codec, for example, provides quality equal to conventional telephone systems at a rate of 64Kbps. G.729A, by contrast, needs 8Kbps of but sacrifices quality. There is no substitute for making test calls with different codecs, but my subjective impression is that G.729A offers a quality markedly better than cellular telephone service. Choos- ing a codec or building a model of codec usage is of great importance to resource planning. Simplistically, the codec bandwidth can be multiplied by the number of active users or voice paths to compute bandwidth needs. A bandwidth number in hand, one can begin shopping for network service. Often for- gotten, though, is that bandwidth is just one measure of a network. Latency, jitter, (QoS), and availability are other important considerations. VoIP sys- tems are particularly sensitive to latency, which distorts the flow of conversation, because changes in speaking order are politely signaled by silence. High latency has the

6 Vol. 28, No. 5 ;login: effect of making multiple parties think the channel is open to new speakers – which of Any VoIP implementation course results in a collision. A rough rule of thumb is that end-to-end latency should not exceed 250 milliseconds, but this number should be minimized. Jitter is a measure needs to mesh with existing ECHNOLOGY of the difference in inter-packet latency between packets leaving the sender and arriv- security policies and T ing at the receiver. Introduced by network elements, jitter can cause increased packet l loss or perceptible delays, so its minimization is very desirable. infrastructure. Making guarantees about bandwidth, latency, and jitter – QoS in networking parlance – is a hard problem that is not adequately solved in contemporary networks. These days, poor man’s quality of service is achieved by overprovisioning, a very basic but effective technique. In summary, if one is outfitting a new location with network ser- vice or is changing service providers, it is a good time to ask harder questions than one might have in the past. Learn about a provider’s reliability by interviewing existing customers, get comfortable with their problem-tracking and resolution procedures, and sign a service level agreement (SLA) that captures the important requirements. Media codecs have direct implications for bandwidth. By contrast, calls are set up and torn down with a signaling protocol that is comparatively lightweight. When it comes to signaling, one is able to choose from a wide selection: Megaco, H.323, SKINNY, MGCP, SIP, etc. (to name a few); a comparative examination of these different proto- cols is beyond the scope of this article. The only protocol I work with today is the IETF’s Session Initiation Protocol (SIP), which I strongly recommend to anyone implementing VoIP. The signaling protocol one selects will obviously affect choices to do with client and server software and, to a lesser extent, client and server hardware. For SIP, many server choices are available, including high-quality open source and commercial packages. In terms of hardware, a good rule of thumb is that the machine used for an organization’s corporate HTTP server is comparable to the machine needed to run SIP services. On the client side, choosing between soft clients and hard clients is a key decision. Soft clients tend to be less expensive (or free), offer more features, and integrate more organically with existing desktop software. The downside, if there is one, is a learning curve similar to deploying any new application. Hard phones, predictably, offer a user experience that in most cases is identical or slightly improved over regular telephone handsets. Regardless of the particular protocol choice, the infrastructure associated with deploy- ing VoIP will be impacted. Hard phones, for example, don’t reuse the resources allo- cated for existing PC desktops. When deploying hard phones, additional switch ports are required for each endpoint, more cabling may be needed, and routers are a factor if additional subnets are required. Nodes (whether hard phones or desktops) configured with RFC 1918 private IP addresses will have issues communicating with people beyond the nearest network address translation (NAT) boundary. SIP, for example, includes routing information inside IP packet payloads, which means that vanilla translation systems do not work. Like FTP, it will be some time before NAT-enabled firewall devices are smart enough to rewrite these packets with appropriate translated address information. In the mean- time, options include using public IP addresses for SIP endpoints or using a SIP-spe- cific application-level gateway that is able to reconcile addresses inside SIP packets. Lastly, any VoIP implementation needs to mesh with existing security policies and infrastructure. This might mean adjustments to deployed security systems such as fire-

October 2003 ;login: VOICE OVER INTERNET PROTOCOL TECHNOLOGY l 7 walls and RADIUS servers. VoIP security is particularly important, especially consider- ing that authorized users can often access expensive resources such as PSTN lines through a gateway. Finally, finding a migration path will be of key importance if one already has a signifi- cant investment in legacy phone technology or cares about reaching PSTN users. PBX users are best advised to contact their PBX vendor, as some vendors have VoIP options in the form of line cards for existing equipment. If PSTN connectivity is needed for a VoIP installation, a variety of choices are available in a variety of sizes. Running a T1 line to a VoIP-enabled router, for example, allows 24 simultaneous calls to be gated to the PSTN. On a smaller scale, some routers can take a module implementing a foreign exchange office (FXO) interface, which connects one or two lines to a telco. In short, telephony companies are sensitive to migration issues, and solutions are generally available. An Example Implementation Building out a VoIP network is not as complex as it might seem. In this section I will give a high-level description of one setup I’ve used that addresses some different requirements. This scenario describes the VoIP solution for a multi-office distributed company. In this deployment, remote offices are connected to the Internet using business-grade DSL lines. When using this class of network connection, one generally doesn’t have the influence necessary to negotiate a favorable service level agreement, but the good news is that these links are more than adequate for serving small satellite offices. The central corporate phone system is served by a VoIP-dedicated one-megabit link with an SLA guaranteeing latencies of 70 milliseconds or less to other predefined points on the Internet. This is a connection capable of serving 12 calls with G.711 or 42 calls with G.729A, though both codecs are used in practice. This example uses the SIP signaling protocol exclusively, with a hodgepodge of differ- ent servers and clients. In terms of servers, for example, remote offices utilize open source SIP proxies, including VOCAL and SIP Express Router (SER), while the head- quarters uses a commercial SIP server that supports unified messaging functions. The clients deployed vary widely, because the organization is a software development firm in which developers are permitted to use the client of their choice. Typically, however, desks are equipped with hard phones such as Cisco 7960 handsets, while roaming users use soft clients such as Microsoft’s Messenger product. The reality is that VoIP is not currently sufficient to reach all potential business part- ners and customers. In this deployment, then, PSTN connectivity is supplied using Cisco routers. One satellite office uses a Cisco 2600 router equipped with an FXO module to connect two lines to the PSTN for local dialing. Headquarters, however, uses a Cisco 3600 router with a T1 interface card to provide PSTN direct dial numbers into 24 different VoIP endpoints. By terminating these latter PSTN lines at SIP phones, some of them remote, the organization achieves the effect of virtualizing its geograph- ically distributed operations at the head office. Both PSTN compatibility systems work very well.

On Timing There is no question that VoIP represents a major paradigm shift. The legacy teleph- ony model is very strong and thus will not be unseated easily – and change will not happen all at once even in an environment of enthusiastic adoption. So when, then?

8 Vol. 28, No. 5 ;login: Before I say when, let me describe how. The value of a communications medium cor- SELECTED POINTERS relates with the number of users reachable using that medium, so the adoption rate http://www.voip-calculator.com/ – An online will accelerate due to what economists call a network effect. This is developing in a script to calculate network requirements based on anticipated utilization. ECHNOLOGY

couple of ways. T

http://www.vovida.org/ – An open source com- l First, trends in the service provider sector augur well for the emergence of VoIP.We’re munity site that offers a variety of VoIP soft- presently seeing major providers size up the next generation of ware including the VOCAL SIP proxy. services. Looking to the competitive long distance market, we already see more agile http://www.iptel.org/ – Makers of the SIP providers using IP links to traffic international PSTN calls, thus gaining a competitive Express Router proxy server and providers of edge. personal SIP hosting services. There is a new breed of company, those offering local and long distance telephone ser- http://pulver.com/fwd/ – A free SIP service vice using VoIP technology. , for instance, offers flat-rate local and long dis- provider that currently connects over 40,000 users. tance calling in the , with the option of choosing a phone number from several American area codes. Early entrants like Vonage are aiming to capture business http://www.microsoft.com/windowsxp/ and consumer customers who aren’t in a position to manage their own server-side windowsmessenger/ – Watch here for the new Windows XP Messenger or download the old infrastructure or PSTN interconnectivity. This parallels managed Web services, which Messenger 4.7, which is also SIP compatible. coincided with the growth of that technology. http://www.vonage.com/ – A service provider Second, consumers, too, are seeing new reasons for VoIP, including the sort of man- offering SIP services with interconnection to aged services just described. Other reasons include network access and software avail- the PSTN. ability. As anecdotal evidence, I’ve been broadband connected for over five years, and http://www.resiprocate.org/ – An open source among my friends, family, and coworkers this is universal (although Canada is a bit SIP stack optimized for speed. ahead of the curve on this technology). And, now, inexpensive VoIP software is widely http://www.asteriskpbx.org/ – An open source, available to leverage such connections. By the time this article is printed, Microsoft, the multi-protocol PBX suite for Linux. desktop juggernaut, will have released the latest version of their Windows XP Messen- http://www.xten.com/ – Xten offers the lite ver- ger program, which includes a complete SIP client. This enhancement, to be released sion of their SIP soft phone as a free download. in version five at Microsoft’s Windows Update site, means that over 17 million PCs are potential VoIP nodes using the SIP protocol. http://www.sipphone.com/ – A SIP provider that sells hard phones for use with its free service. Remember the network effect? Within 36 months it will be possible to satisfy a healthy percentage of one’s calling needs, both personal and professional, using VoIP.

Conclusion It has been the intent of this article to raise consciousness about VoIP, a technology that has been rightly considered imaginary. I use VoIP systems every day for my tele- phone needs, so, to me, they hold great promise. I’m convinced that VoIP is, at once, both a solution for next-generation communications and a challenge for IT depart- ments, so it would be wise to keep VoIP in mind. The good news is that people com- fortable with networking and Internet services will catch on to VoIP very quickly. In a future article, I will focus on the Session Initiation Protocol for VoIP. In that arti- cle, I’ll look at some of the lower-level nuts-and-bolts related to implementing SIP ser- vices. I’ll describe the different actors in a SIP system, with examples drawing on open source SIP applications. SIP is just an application protocol that runs on the Internet, so, as developers, we can contemplate writing new and interesting phone services with- out investing huge amounts of time learning proprietary protocols. To that end, we’ll take a tour of one open source application built on an open source SIP stack.

October 2003 ;login: VOICE OVER INTERNET PROTOCOL TECHNOLOGY l 9 musings In my last column, I waxed enthusiastic about how superior Windows was when by Rik Farrow it came to the ease of writing keystroke sniffers. While there are hundreds more Rik Farrow provides keystroke sniffers for Windows, I have since learned of a new one that runs on UNIX and Internet security consulting Linux (2.2 and 2.4 kernels), Solaris, and OpenBSD. and training. He is the author of UNIX System Security and I was talking to Ed Balas, after being grabbed by Lance Spitzer and dragged over to a System Administra- tor’s Guide to System table staffed by the Honeynet Project at the 2003 BlackHat conference in Las Vegas. Ed V. explained to me that he was a “generalist, not a specialist,” and had just finished writ- ing, then porting, a kernel module that is designed for use in GenII Honeynet, which [email protected] he is writing about for the December 2003 annual security issue of ;login:. The original honeynets relied on packet sniffing, and that can be a problem when the first thing the attacker does is download and install a version of SSH. Now, everything that gets downloaded can travel across an encrypted link. And attackers have started to encrypt their tools on UNIX systems, something that has been done for a while with Windows trojans and viruses, so static analysis becomes much more difficult. But if the honeypot can capture all keystrokes, the potential for useful research becomes much greater for the honeypot owner. Sebec2 (see http://project.honeynet.org/papers/honeynet/tools/index.html) hooks into the read entry in the kernel’s system call table, so that anything that a program reads, whether it be a file or a command line, gets captured. That means that any encryption keys or passwords that an attacker uses will be recorded. Sebec2 then creates its own IP packets and writes them directly to the NIC device driver. Most applications that cre- ate their own packets (e.g., hping) write to a raw socket, and that makes it possible for snort or tcpdump to capture the packet before it gets written to the device driver. But Sebec2 prevents the packets from being detected by any application-level process – in fact, by anything other that a kernel module like Sebec2 itself. The server end of Sebec2 still sniffs the network to collect the packets. As Ed described his new tool, I couldn’t help but think that this would make a great rootkit. Sebec2, unlike xkey (see my August 2003 Musings), also reports the application that requested the read(), so an eavesdropper gets a much clearer picture than with xkey. And Sebec is listed as a rootkit at chkrootkit.org. Like many other security tools, the potential is there for both good and dark uses. I also heard about a really interesting study done by Cisco employees. They decided to take a closer look at the myths surrounding BGP vulnerabilities. When someone claims to be able to “bring down the Internet in a half hour,” they would be referring to flaws either in the routers or in the core routing infrastructure, and that means BGP. Matthew Franz and Sean Convery first presented their results (publicly) at NANOG and subsequently at the BlackHat conference. You can read their presentation at http://www.nanog.org/mtg-0306/pdf/franz.pdf. I wrote about the potential for “bad things happening” to BGP back in April of 2002, and nothing has changed since then. True, there are now two proposals for adding security to BGP updates, when before there was only one. But S-BGP (Secure-BGP) requires a PKI, the signing of routes by the originating router, and the checking of sig- natures by the router receiving the update: http://www.ietf.org/internet-drafts/draft- clynn-s-bgp-protocol-01.txt. I heard complaints about this concept from many people, who pointed out that han- dling public key encryption was not something router CPUs could manage while still

10 Vol. 28, No. 5 ;login: getting useful work done. The S-BGP proponents do not consider this an issue, but if Misconfigured routers are . . . you do use BGP, and S-BGP becomes mandatory, plan on updating the core of your ECURITY router so it will include the extra processing power necessary. The router company ver- a much bigger issue at S sion is called SoBGP, or Secure Origin BGP, and differs in that it uses a decentralized l certificate management system and relies on RADIUS servers (or something that does present than problems with not involve the router’s CPU) to handle public key encryption: http://www.ietf.org/ BGP. internet-drafts/draft-ng-sobgp-bgp-extensions-01.txt. Convery and Franz decided to take a hard look at the FUD surrounding BGP by creat- ing attack trees, then trying out each branch to see if any of these attacks was actually something to worry about. Recall, if you would, that BGP neighbors, or peers, keep a TCP connection going at all times. Once a minute, they exchange keep-alive messages, or they might send routing updates to their neighbors. If this TCP connection (to port 179) goes down, each router strives to recreate it quickly while also sending out updates reporting that the routes they know of that travel through the neighboring router are all invalid. That information percolates across the Internet, forcing any router with a complete set of BGP routes to recalculate its routing tables. In Convery and Franz’s attack tree, this leads to two branches: disrupt the TCP connection or insert phony routes (you can read the papers or presentation for much more detail). So they tried disrupting the TCP connections between seven different products/imple- mentations that all support BGP.And failed. They also tried hijacking BGP connec- tions. BGP neighbors can use MD5 authentication that gets included as a TCP option, and if this is used, hijacking means guessing the shared secret. But hijacking also means being able to sniff communication between BGP neighbors, which are often connected using a link dedicated to that purpose. The short answer is that hijacking a connection could be done if MD5 authentication is not used, and the connection can be sniffed. They also attempted inserting an update when MD5 was not used, and this resulted in an ACK storm (since the receiving system has sent an acknowledgment for a packet that the sending system has never heard of), with the eventual result that the connec- tion gets reset after five minutes. But during that five minutes, the phony routes spread through the Internet (unless blocked by BGP route filtering, but that’s another story). They tried fuzz testing using BGP messages in attempts to crash BGP; four flaws were found in 1200 attempts, and three of those worked only if received from a valid peer and/or valid AS. The fuzz testing included sending messages that varied from totally random data to up to eight valid fields followed by random data. So the implementa- tions appeared to be fairly robust. The most interesting test that Convery and Franz carried out was to use traceroute to 120,000 destinations, one for each CIDR block in the Internet, in an attempt to map out all the routers that might be BGP speakers. With a list of 115,466 IP addresses ready, they then tried SYN probes against each address, to ports 22, 23, 80, and 179 (SSH, Telnet, HTTP, and BGP). In theory, any access to administrative ports by “out- siders” should be filtered or ignored. In practice, 14.5% of all routers responded to at least one of the three administrative ports, SSH, Telnet, or HTTP. They concluded, based on other tests not mentioned here, that the best way to disrupt the Internet is to take over one or more routers and have them distribute misleading routes. If a trusted router gets compromised, signed updates (if these are eventually approved) would still be trusted. Misconfigured routers are, they concluded, a much bigger issue at present than problems with BGP.

October 2003 ;login: MUSINGS l 11 Each vulnerability stretches July 2003 also brought with it announcements of several new buffer overflow vulnera- bilities in Windows. One announcement affected all versions of Windows from 95 up way back in time, to much to Server 2003 (CA-2003-14 or MS03-023) and the module that handles Rich Text earlier versions, yet is still within HTML. The other involved only NT through Windows Server 2003, and requires access to port 135 TCP or UDP (CA-2003-16/19 or MS03-026). Windows present in the Windows Server 2003 is supposed to be the most secure version of Windows, so the existence of these vulnerabilities was considered an embarrassment for Microsoft. But I noticed Server 2003. something other than simple embarrassment in these vulnerabilities. Each vulnerability stretches way back in time, to much earlier versions, yet is still pres- ent in the Windows Server 2003. What that tells me is that Windows Server 2003 is still using code left over from the earliest days of Windows with a TCP/IP stack (the RTF- HTML bug) or from the earliest days of NT. I have always thought one of the biggest issues with Windows systems is the amount of legacy code each version contains, cru- cial to maintaining backwards compatibility. Microsoft built all of Server 2003 with stack canaries, similar to the StackGuard canaries for Linux, but these servers are still vulnerable. One reason is that stack canaries can only guard against stack overflows, not heap overflows. An even better reason is that some mistakes in code are so subtle that code analysis by programmers and by special tools will not catch them. The heap overflow found in Sendmail in March of 2003 was a good example, as understanding the source code meant single-stepping through it, a laborious process. I got a chance to read most of Secure Coding, a new O’Reilly book by Mark Graff and Ken van Wyk, both currently with CERT. Secure Coding is about principles and prac- tices, and it provides little in terms of code examples (and most examples are the mis- takes, not solutions). Their goal is not to write a programming guide, since other people have done that well, but instead to share their years of experience with vulnera- bilities. Problems offered go beyond programming mistakes, such as the all-too-fre- quent buffer overflows, to design and operational mistakes. They do include some great examples of failures and of good design. I liked their explanation of how one could log in to some rlogin servers using the login name -froot and get a root prompt with no password (a problem that occurred because rlogind calls the login program without checking the provided login name). Or how /etc/passwd file fragments started showing up in Solaris tar files in a new release. It turned out that some heap memory had just been freed, and then reallocated as the buffer used when writing out tape blocks, and that memory had previously been used in password file reading functions (getpwent()). Both of these issues are problems in composition – that is, neither occurred on its own, but required a combination or sequence of events for the bug to assert itself. In the case of tar, some code had been removed, moving the freeing of the memory used by getpwent() closer to the malloc() used for tar’s block buffers. Who would have ever guessed (much less figured this out) before it was discovered the hard way. Secure Coding is a relatively short book (177 pages), but not an easy one. It really helped me to understand why I was never a great programmer – my programming style was more prototyping than design. Prototyping was loads of fun, as I would occa- sionally knock together example programs thousands of lines long in a day when I was teaching people how to use a graphics library. I was proud that I could get such a mon- strosity to compile, link, and actually demonstrate potential solutions to the problems the client had. But the real problem arose after I left, when the client decided not to

12 Vol. 28, No. 5 ;login: toss those thousands of lines of disorganized code, but instead to use it as a starting I finally recognized that point for their application. Note that what I was doing was gluing together many short ECURITY example applications, not writing fresh (but unorganized) code. secure coding cannot be S l Graff and van Wyk provide methodologies and questions that you can use in architec- taught as an isolated subject ture, design, implementation, and operation. Having read their book, I finally recog- nized that secure coding cannot be taught as an isolated subject but needs to be woven but needs to be woven into into every area of software instruction. I certainly recommend this book if you have every area of software anything to do with software (beyond just running off-the-shelf apps), as security is important and not something that can be taken lightly (or worse, something that you instruction. can just expect will happen organically, without careful thought). As I finish writing this column, I am also getting ready to leave for the USENIX Secu- rity Symposium in Washington, DC, something I have been looking forward to for months. BlackHat is interesting, and some research does show up there, such as the Cisco presentation. BlackHat led off this year with a presentation by David Litchfield, who, in his pre-conference notes, planned on explaining the differences between exploits in Linux and Windows using stack-based buffer overflows in Oracle 9i’s XDB as the example. Instead, he chose to talk about the much more current issue of the overflows in Windows RPC. Litchfield certainly knows his stuff, that being the internals of Windows and Linux executables and libraries. Although his presentation was disjointed, he did manage to take the audience through a code debugging session of MS RPC, showing where the problem occurs and how it can be exploited. He could not demonstrate an exploit for Server 2003, which LSD claims to have written but not released (yet). But he did have parts of his audience absolutely mesmerized. A group of Microsoft employees were sitting in the front of the room, furiously taking notes. Litchfield described how the stack overflow protection works in Server 2003 and suggested methods for evading it, something LSD appears to have done. This vulnerability in MS server products has great potential for worm writers, as the MS-Blaster worm and its variants have shown. Microsoft, having been slammed by Slammer, had adopted a much less tolerant view of people who fail to patch their sys- tems. Everyone within Microsoft was given a deadline to install the patch for MS03- 026. If they failed to do so by the deadline, their network connection would be disabled. I asked a friendly Microsoft employee exactly what that meant, and he told me that internal security was aggressively scanning for the vulnerability, and would turn off the switch port of anyone not patched by the deadline. Also, anyone logging in remotely would also be tested before their authentication could be completed. Wise moves. Apparently, Microsoft’s customers did not take the issue seriously enough, as MS-Blaster has been moderately successful, although paling in comparison with the SoBig.F virus. When will they ever learn?

October 2003 ;login: MUSINGS l 13 ISPadmin

by Robert Haskins Managing and Providing ISP Services Robert D. Haskins is an independent con- Introduction sultant specializing In this edition of ISPadmin, I look at ways service providers manage and in the Internet Ser- vice Provider (ISP) provide mail, Web, dialup, and other types of services to their customers. If industry. I were bringing up any service provider type of business from scratch today, ISPMan (or a similar, usually custom-built package) would be at the core, along with an appropriate billing system.

Background [email protected] In most smaller “legacy” ISPs, shell scripts are used to provision services ordered by the customer. These scripts are custom written and driven by the “legacy” billing sys- tem. Such provisioning applications tend to take on a life of their own, causing a major maintenance headache for the ISP. For example, each time the provider offers a new type or classification of service, the custom provisioning system must be modified to include this new service. Also, when a new system is added, the provisioning system must be modified to include this new system. When the business model changes, there is a high likelihood that the provisioning system will also need to be changed. There are many operational issues as well. For example, non-LDAP-enabled systems (smaller providers who use stand-alone files for enabling dialup (RADIUS) accounts) experience a short amount of downtime when the RADIUS service is reloaded after updating the user list. Also, there is always a possibility the service won’t start up cor- rectly when the associated RADIUS daemon is reloaded after adding a user to the stand-alone file. Most larger service provider operations have resorted to writing custom scripts for this purpose. However, for a smaller provider (and even for larger ones just starting out), an application like ISPMan is a better option. While somewhat constrained by the inherent design in any application, it is written in so the provider can add func- tionality if need be.

Open Source Solutions Several open source systems manage and provide ISP services. Vishwakarma, perhaps the oldest service provider management system, doesn’t appear to have been modified in the past three years. It has a lot of the functionality that an ISP would require, such as virtual email, Web-hosting management, and reseller support, and it has an LDAP directory to tie everything together. However, it appears to have several pieces missing. The biggest holes would be lack of a distributed model for back-end services, in addi- tion to quota support. However, it is certainly worth a look. It is released under the GNU GPL. Another system in the same vein is a package called “vhost” (virtual host), which is also released under the GNU GPL. It has much of the same functionality as ISPMan, with support for virtual email and Web sites. However, its biggest drawback is the lack of an LDAP directory, which would allow a distributed model for enabling back-end services. Hopefully, this is an area of focus for future development. Imagine how the growing ISP would have to throw out the vhost system when they ran out of spare cycles on their existing hardware! It would be much easier, and more cost efficient, to simply add machines to the cluster in order to add capacity.

14 Vol. 28, No.5 ;login: Commercial Solutions DMIN

There are a number of commercial products in this space. Most companies have prod- A ucts in the “virtual server” space as well, which allows a service provider to make (and YS S sell) multiple virtual servers out of one “real” server. (The best known open source vir- l tual server project is vserver.) Many of these commercial products are based upon open source components for their back-end services. Some commercial solutions even manage Microsoft-based products such as Exchange and Windows Server 2000. (As an aside, if you are looking for an open source Exchange replacement, the SAGE members list had a great discussion on the topic back in June 2003.) The obvious benefit of a commercial solution is the improved support one should get utilizing such a product. Given the level of support available in open source applica- tions such as ISPMan, such differences are decreasing by the day. However, commercial implementations might be useful in certain environments and are certainly worth investigating.

What Is ISPMan? ISPMan consists of a collection of back-end services (all utilizing an LDAP directory for user information) and a set of custom Perl programs to manage those services. Like the other open source ISP managers, it is has been released under the GNU GPL. The major applications ISPMan uses to provide services to end users include:

n (mail exchange) n Cyrus IMAP (POP3/IMAP subscriber access) n Apache 1.x (Web services) n BIND 8/9 (name services) n PureFTP (FTP access) n OpenLDAP 2.x (directory services) n FreeRADIUS (RADIUS services) n Horde IMP (Webmail) The system is scalable, and redundant with appropriate additional hardware. It con- tains a Web interface for management/provisioning, as well as a command line inter- face for provisioning accounts via automated mechanisms (e.g., through the service providers billing system) and for other tasks. Any function that can be performed via the Web interface is available via the command line interface on any in the clus- ter! It is worthwhile noting that the developers of ISPMan plan on implementing a Simple Object Access Protocol (SOAP) interface for easier integration into the billing system, and other required interfaces as well. ISPMan has a very nice “reseller” feature, where the ability to add/change/delete ser- vices can be delegated to others (e.g., a wholesaler setting up a reseller to sell dialup accounts, Web hosting, etc.). Also, this interface could be used for a large enterprise, where the central IT department gave access to people in each division to add email or dialup accounts, perhaps even under their own subdomains (or completely different domains). This is discussed in “How an Enterprise Might Use ISPMan,” below.

How ISPMan Works The basis of ISPMan is, of course, the LDAP directory. Where standard attributes don’t exist, ISPMan defines its own schema that defines the data structure the ISPMan appli- cations use to communicate between cluster members (e.g., the location of a user’s mailbox), and for communications with machines outside the cluster (e.g., DNS data).

October 2003 ;login: ISPADMIN l 15 A user record, for example, has the following basic attributes (the schema source is in parentheses): uid (cosine.schema) uidNumber (nis.schema) homeDirectory (nis.schema) userPassword (core.schema) ISPMan currently makes use of the following schemas (the source is in parentheses): core.schema (openldap) cosine.schema (openldap) nis.schema (openldap) misc.schema (openldap) inetorgperson.schema (openldap) dnszone.schema (from bind9_sdb) pureftpd.schema (from pureftpd) ISPMan.schema (ISPMan) RADIUS-LDAPv3.schema (from FreeRadius) Each server in the ISPMan cluster runs the “ISPMan-agent” daemon. Its purpose is to communicate tasks to the LDAP server to execute. For example, if a node running the Web user interface manager has a request to add a user, the request gets communicated via the ISPMan-agent interface to the appropriate servers, which, in turn, provision the user on the appropriate servers. Some of the tasks it can perform are the following: create mailbox set mailbox quota delete mailbox create home directory delete home directory ISPMan Installation As with many projects, the most important piece for a complicated installation like ISPMan is to plan appropriately. Some questions you should ask when planning a project like this include:

n What types of services do I want to offer? n What equipment can I dedicate to this project? n How much redundancy do I want? n What level of high availability am I willing to pay for (99.9%, 99.99%, etc.)? n How many subscribers do I want the initial system sized for? How many subscribers do I think I will have in six months? In 12 months? n Which machines will have what function? n What is the interface into my billing system? n Are there any other interfaces required for add-on services, such as hosting MS Exchange services? n Do I have resellers selling my services? What are their requirements? ISPMan itself is relatively straightforward to install. However, due to the complex nature of these installations, the details are left as an “exercise for the reader.”

16 Vol. 28, No. 5 ;login: REFERENCES Amavis: http://www.amavis.org/ Additional components such as antivirus and anti-spam functionality can be added. Apache: http://httpd.apache.org/

These can be provisioned as optional additional services in order to build incremental DMIN Bynari InsightServer: http://www.bynari.net/ A revenue for the provider. YS S

Clam Anti-Virus: http://clamav.elektrapro.com/ l How an Enterprise Might Use ISPMan Cyrus IMAP server: The “holy grail” that ISPMan solves for the enterprise would be the “single sign on” http://asg.Web.cmu.edu/cyrus/imapd/ problem. The single sign on would essentially be the employee’s LDAP directory entry Ensim: http://www.ensim.com/ in the ISPMan application. This would be used for user authentication for every appli- cation in the enterprise that required it. It could also be used for a centralized “address FreeRADIUS: http://www.freeradius.org/ book” if so desired. Horde IMP: http://www.horde.org/imp/ In addition, ISPMan could be used to provision and provide email and dialup services H-sphere: for an enterprise. In fact, the reseller capability would be a nifty way to enable divi- http://www.psoft.net/h_sphere2_info.html sions of a large organization to add their own accounts, on their own subdomain or Introduction to ISPMan written by the author, even in conjunction with an entirely different domain! The central IT organization Atif Ghaffar: http://www.linuxfocus.org/ would enable IT or division employees to manage their own subset of accounts sepa- English/September2000/article173.shtml rate from the rest of the organization. ISC BIND: http://www.isc.org/products/BIND/ ISPMan could be combined with something like Bynari’s InsightServer to provide cal- ISPMan home page: http://www.ISPMan.org/ endaring, scheduling, contact-list management, etc., for a total enterprise solution at a Lotus Notes: http://www.lotus.com/ fraction of the price of MS Exchange or Lotus Notes. The added bonus is that the solu- tion is based on open protocols, utilizing the same “standard” interface (MS Outlook). Managing ISPMan’s logs: http://nakedape.cc/wiki/index.cgi/ Bynari even offers an add-on Web client that emulates much of the functionality of the ISPManLogStats MS Outlook client. While not for every enterprise, certainly the price alone makes it worth looking into. MS Exchange: http://www.microsoft.com/exchange/ I wish to thank Atif Ghaffar for his input into this article. Next time, please join me for MS Outlook: another installment of ISPadmin. In the meantime, I look forward to hearing your http://www.microsoft.com/office/outlook/ feedback! default.asp OpenAntiVirus: http://www.openantivirus.org/ OpenLDAP: http://www.openldap.org/ Postfix: http://www.postfix.org/ ProFTPD: http://proftpd.linux.co.uk/ PureFTPd: http://www.pureftpd.org/ SAGE members list archive: http://sageweb.sage.org/resources/mailarchive/ sage-members-archive/ Neil Schneider’s write-up on installing ISPMan: http://www.linuxgeek.net/ISPMan/ SOAP: http://www.w3.org/TR/SOAP/ SpamAssassin: http://au.spamassassin.org/ Sphera: http://www.sphera.com/ vhost: http://www.chaogic.com/vhost/ Virtuozzo: http://www.sw-soft.com/ Vishwakarma: http://www.kandalaya.com/vishwakarma.shtml vserver: http://www.solucorp.qc.ca/miscprj/s_context.hc

October 2003 ;login: ISPADMIN l 17 “eat your own dog food”

You’ve no doubt heard this phrase before; you’ve probably even used it by Adam Moskowitz once or twice. As a system administrator, do you eat your own dog food? Adam Moskowitz is a system administrator, From the exact same can as your users? If not, why not? programmer, manager of system administra- In the late 1980s, years before the “dog food” phrase was ever uttered, I worked in tors, certified barbecue judge, and author of Encore Computer Corporation’s software development group. Not only did the oper- the recently published ating system developers have to use the latest version of UMAX (our multiprocessor SAGE Short Topics booklet Budgeting for port of 4.3BSD), everyone in the group did, too. Talk about a good use of peer pres- SysAdmins. sure! I can’t prove that this practice actually improved the quality of our software, but when a user reported that something was broken, we tended to believe them, usually because we had already discovered the problem for ourselves. [email protected] When a user says, “Foo is broken,” I think the worst possible response a system admin- istrator can give is, “It works for me.” It’s even worse when the sysadmin is on a differ- ent hardware platform than most users, running a different , and using a different shell. Eating your own dog food won’t stop you from giving such a bad answer, but it at least mitigates your “sin.” On the other hand, I think users find it comforting when your response is, “Yes, I’ve noticed this, too; I’m already working on fixing it.” Eat your own dog food often enough and this is likely to become your usual response. Here’s how I think the concept should apply to system administrators: First, use the same hardware platform and operating system as your users. If there are several, and it’s not feasible to give every sysadmin in the group one of every platform, spread them evenly throughout the group. Apply the same patches to your machine as you apply to users’ machines: no more and no fewer. Upgrade the OS on your machine no more than one day before you upgrade users’ machines. The obvious complaint at this point is usually, “But I have to test new patches/operat- ing systems/whatever somewhere before we put them in production.” Yes, you do, and that’s what test machines are for. Treat your desktop system as a production (user) machine and do your testing elsewhere. Next, shells: These are very personal things, and one’s setting is even more so. I’ve always believed that an organization should support a (limited) number of shells, but to the greatest extent possible allow users to run whatever shell they want. By “sup- port” I mean that an effort will be made to make sure that all supported programs run on each of the supported shells, and problems in this area will be investigated (and, one hopes, fixed) by the appropriate part of the IT organization. In an ideal world, users will be able to run all supported programs with an empty shell rc file, because the system-wide default file will contain all the required settings. If that’s not possible, new users should be given an rc file that looks something like this: # Uncomment the next line if you plan to run ABC # . /etc/rcfiles/abc # Uncomment the next line if you plan to run DEF # . /etc/rcfiles/def I know it would be folly to suggest that system administrators be forced to use the same shell(s) as users. Instead, I propose that there be a test account for every shell, either without an rc file or with the same one given to new users. If it contains lines like those shown above, uncomment them as appropriate to test a given problem (or

18 Vol. 28, No.5 ;login: to match the users’ settings), then comment them out again when you’re done. Here’s a If you write tools, for use concrete example of what I mean: DMIN A

either by users or your fellow YS

joesixc:*:99901:23:CSH test account:/home/joesixb:/bin/csh S joesixk:*:99902:23:KSH test account:/home/joesixk:/bin/ksh system administrators, you l joesixs:*:99903:23:SH test account:/home/joesixs:/bin/sh should make it a point to use (I once worked on a project where “Joe Sixpack” was the name we gave to our target customer.) those tools and not the If a user reports that something isn’t working, and it isn’t something obvious like a underlying “raw” commands. malformed command, log in to the appropriate account on the affected machine, modify the rc file as necessary, and only then try to duplicate the user’s problem. If, after trying all this, it still “works for you,” say to the user: “I can’t seem to duplicate the problem, I’ll have to come to your office and work with you to figure out why you’re having the problem.” If you write tools, for use either by users or your fellow system administrators, you should make it a point to use those tools and not the underlying “raw” commands (if such commands exist). For example, I once wrote a set of Perl scripts to manipulate DNS files without having to use an editor, and some very simple Web forms to call those scripts using CGI. The other UNIX admins used the command line version, and the Windows and Macintosh guys used the Web forms; the idea was to allow anyone in the group to make DNS changes without having to worry (too much) about the per- snickety file formats. Even though there were three of us in the group who were per- fectly capable of editing the files by hand (and getting it right, at least most of the time), the agreement was that we would always use the tools I had written. There were several interesting results from this “experiment”:First,the number of times DNS broke because one of the senior admins had edited the files by hand and gotten them wrong (or forgot to increment the serial number) dropped significantly. Second, we found several bugs in the tools that the junior people wouldn’t have dis- covered for a while (if ever). Third, we came up with several modifications to the tools that made them even more useful to the junior staff. Clearly, eating my own dog food not only helped improve the quality of the tool I had written, it actually helped me reduce the number of mistakes I made while doing my job. That is, after all, the reason for writing such tools in the first place – but what good are they if you don’t use them? There are other ways in which the “dog food” concept can be applied to system admin- istration, but I trust that you can figure them out for yourself.

October 2003 ;login: “EAT YOUR OWN DOG FOOD” l 19 using C# properties and static members

public class TestProp1 { by Glen public static void Main() { McCluskey Prop1 p = new Prop1(1956); Glen McCluskey is a Console.WriteLine("year #1 = " + p.GetYear()); consultant with 20 years of experience //p.year = 1977; and has focused on p.SetYear(1977); programming lan- guages since 1988. Console.WriteLine("year #2 = " + p.GetYear()); He specializes in Java } and C++ perfor- } mance, testing, and technical documen- tation areas. The year field is private, meaning that it cannot be set directly from outside of the class (see the commented line in Main). If [email protected] the field is made public, then there’s no way to validate a new value that is set. The field is instead changed via the SetYear In this column we’re going to continue our examination method, and the proposed new value is checked in this method. of the C# programming language, and look at two par- ticular features of C# classes. This approach is very common and works pretty well, but it’s a little tedious to use, with every private field requiring a pair of We’ll start by considering the use of properties, which are kind get/set access methods. of a hybrid between data fields and methods. We’ll then go on and look at static class members. C# offers another approach to solving this problem, using what are called properties. A property looks like a data field in an Properties object, but access is controlled via internal get/set methods. The Imagine that you’re developing a C# class, and that class will property can be made public and accessed like a field, but there have a data field that represents a calendar year. The field is set is a layer of control that allows the class designer to interpose by the constructor and validated to ensure that the year is 1800 specific processing when the property is accessed. or later. You also want the ability to change the year after the Let’s look at an example: fact, in existing objects of the class. using System; Here’s some C# code that illustrates this approach: public class Prop2 { using System; private int year; public class Prop1 { private const int MINYEAR = 1800; private int year; public int Year { public Prop1(int y) { get { SetYear(y); return year; } } public int GetYear() { set { return year; if (value < MINYEAR) } throw new ArgumentException("year < " + MINYEAR); private const int MINYEAR = 1800; year = value; } public void SetYear(int y) { } if (y < MINYEAR) throw new ArgumentException("year < " + public Prop2(int y) { MINYEAR); Ye a r = y ; year = y; } } } } public class TestProp2 { public static void Main() { Prop2 p = new Prop2(1956); Console.WriteLine("year #1 = " + p.Year);

20 Vol. 28, No. 5 ;login: p.Year = -1977; using System; Console.WriteLine("year #2 = " + p.Year); public class TestGlobals { } public static void Main() { }

//Globals g = new Globals(); ROGRAMMING There’s still a private year field in this code, but also a public P Globals.glob1 = 500; l property “Year.” The property can be thought of as a “virtual Globals.glob2 = 600; field.” It’s treated like a data field when you’re programming with the class, but there are hooks such that the class can con- Console.WriteLine("glob1 = " + Globals.glob1); trol what happens when the property’s value is retrieved or set. Console.WriteLine("glob2 = " + Globals.glob2); } In the example above, when the property value is retrieved, the } private year field’s value is returned. When the property is set, Globals is a class with two public static data members. They can the proposed new value, represented by the keyword value, is be referenced by qualifying the member names with “Globals.” first checked to ensure that it’s at least 1800. Then the private Globals also has a private constructor, which cannot be called field is set. from outside the class. Defining a private constructor means Properties enable simple field access from a programmer’s per- that no instances of the class can be created. In other words, the spective while, at the same time, supporting data hiding so that class is used simply as a packaging vehicle for static data mem- access to private data can be controlled. bers.

Global Variables and Static Members This same approach can be used for packaging static methods, C# requires that all data fields be part of a class. This restriction such as methods that represent self-contained mathematical leads to an obvious question: How do you implement global functions and that have no meaning as conventional methods variables, variables that can be accessed from anywhere in your that operate on class instances. Let’s look at an example: application? Using such variables is not always a good idea, but using System; we’re going to assume that you really do want them for some public class CircleFuncs { purpose. private CircleFuncs() {} To answer this question, we need to consider what is meant by public static double GetCircumference(double r) { the concept of static class members. Normally, you define a class return 2.0 * Math.PI * r; and then create instances or objects of the class. For example, a } Point class that represents X,Y points will have various public static double GetArea(double r) { instances, such as one that represents the point 25,35. The X,Y return Math.PI * r * r; values in the instance are called instance members. } A static member can be thought of as belonging to the class } itself, and not to its instances. For example, a static data field is public class TestCircleFuncs { shared across all instances of a class. There may be one or a mil- public static void Main() { lion instances of the class in a running application, but there double radius = 10.0; will still be only one copy of the static data for the class. Console.WriteLine("circumference = " + Static members can be used to implement the equivalent of CircleFuncs.GetCircumference(radius)); global variables. Here’s an example: Console.WriteLine("area = " + CircleFuncs.GetArea(radius)); // file #1 } public class Globals { } private Globals() {} CircleFuncs is a class that groups together some methods used public static int glob1 = 100; to calculate properties of a circle, such as its circumference and public static int glob2 = 200; area. } Note that the Main method, the entry point to a C# application, // file #2 is static. It’s part of a class – in the example above, the class Test- CircleFuncs – but it doesn’t operate on instances of TestCircle- Funcs.

October 2003 ;login: USING C# PROPERTIES AND STATIC MEMBERS l 21 Constants uint last = (uint)Math.Sqrt(p); Constants are closely related to static members. If you’re doing for (uint i = 3; i <= last; i += 2) { C# programming, how do you define groups of constants for if (p % i == 0) return false; use in your program? Let’s look at a couple of examples that } illustrate some of the techniques that are available: return true; using System; } enum Color {RED = 1, GREEN = 2, BLUE = 3} static Primes() { PRIMES = new uint[NUMPRIMES]; public class Const { uint currvalue = 1; private Const() {} for (uint i = 0; i < NUMPRIMES; i++) { public const string RED = "red"; while (!IsPrime(currvalue)) public const string GREEN = "green"; currvalue++; public const string BLUE = "blue"; PRIMES[i] = currvalue++; } } } public class TestConst { } public static void Main() { Console.WriteLine("Color.GREEN = " + public class TestPrimes { Color.GREEN); public static void Main() { Console.WriteLine("Color.GREEN as int value = " + Console.Write("primes = "); (int)Color.GREEN); for (uint i = 0; i < Primes.NUMPRIMES; i++) Console.WriteLine("Const.GREEN = " + Console.Write(Primes.PRIMES[i] + " "); Const.GREEN); Console.WriteLine(); } } } } In this first example, we use an enumerated type, very similar to In this example, we want to compute a table of prime numbers. what C and C++ offer. This approach works as you would We want the table to be constant and thus not mutable after it’s expect, but is suitable only for integral types. initialized, but at the same time, we’d like to compute the values in the table at runtime, rather than actually listing them out (2, The Const class shows how to define a group of string con- 3, 5, 7, 11, ...) in the source code. stants. A private constructor is once again used, so that no instances of the Const class can be created. The fields are There are a couple of techniques that we use to implement this marked as Const, which means that they’re static and cannot be approach. We mark the PRIMES array as static and read-only. changed after initialization. The read-only qualifier means that the array can be modified in the constructor, but not afterwards. When you run this program, the output is: We also use a static constructor, which is called when the static Color.GREEN = GREEN data members for the Primes class are initialized. The class has Color.GREEN as int value = 2 both an instance constructor, which is private and used to disal- Const.GREEN = green low creation of class instances, and a static constructor, used to Here’s another slightly more complicated example: initialize the PRIMES field. using System; The output of this program is: public class Primes { primes = 2 3 5 7 11 13 17 19 23 29 public const uint NUMPRIMES = 10; public static readonly uint[] PRIMES; In future columns we’ll start looking at some broader issues with classes, such as programming with interfaces and abstract private Primes() {} classes. private static bool IsPrime(uint p) { if (p <= 2) return p == 2; if (p % 2 == 0) return false;

22 Vol. 28, No. 5 ;login: the tclsh spot tions only need to know how to talk to the master. This allows by Clif Flynt the slave tasks to be simpler applications. Clif Flynt is president The last choice is whether to control the slave applications using of Noumena Corp., command line arguments, pipes, or sockets. Again, the need to ROGRAMMING which offers training P

and consulting ser- run on multiple sets of hardware drives the design to a socket- l vices for Tcl/Tk and Internet applications. based client-server architecture. He is the author of Tcl/Tk for Real Pro- Using the Tcl socket command to coordinate multiple tasks is grammers and the fairly simple. The Tcl TCP socket implementation is possibly TclTutor instruction package. He has the easiest-to-use socket package available, and the callback been programming mechanism used to service clients makes it easy for a server to computers since 1970 and a Tcl advo- interact with several active clients simultaneously. cate since 1994. Tcl uses a channel abstraction for I/O. A channel is a handle [email protected] that references a source or destination for a stream of bytes. A Tcl channel is similar to the FILE pointer in C, abstracted a bit Client Server Sockets higher to include pipes and sockets. We open either a client or server socket with Tcl’s socket com- Previous Tclsh Spot articles described techniques for mand. A client-side socket is slightly simpler, so we’ll look at generating IP packets to simulate an attack on a fire- that first. wall, while the last Tclsh Spot article described using the Syntax: socket ?options? host port Expect extension to monitor a remote system’s log files. socket Open a client socket connection. This article will start to explore techniques for coordi- ?options? Options to specify the behavior of the socket. nating the attack-and-monitor activities for a firewall -myaddr addr Defines the address (as a name or exerciser. number) of the client side of the Several software architecture options exist for a system like this. socket. This can be used to specify The two obvious ones are a single application with attacking which of several interfaces and monitoring subsections and a set of cooperating applica- to use, and is not necessary if the tions where each application provides a subset of the function- client machine has only one network ality. interface. -myport port Defines the port number for the A single application is conceptually simpler, since there’s no server side to open. If this is not sup- need for interprocess communications. On the other hand, plied, then a port is assigned at ran- dealing with multiple sections that can require attention at dom from the available ports. undefined intervals is nearly as complex as interprocess com- -async Causes the socket command to munication. The real problem with a single application archi- return immediately, whether the tecture in this case is that it limits the system to a single connection has been completed or hardware platform. The validation application may need to run not. attacks and monitors from multiple sets of hardware. host The host to open a connection to. May be a name or Given that there will be multiple independent processes, the a numeric IP address. next question is whether they should be peers or operate in a port The number of the port to open a connection to on master-slave relationship. If all the processes were identical, it the host machine. would make sense to run a peer relationship. For a system The socket command will return a channel which can be used where each child task has a different purpose, a peer relation- with the puts and gets commands to send and receive data ship would mean that each child would need to know how to from the channel. communicate with every different type of application. With a master-slave architecture, only the master needs to know how to As a quick test of a Tcl client, we might write the code shown talk to many types of applications, and the individual applica- below, expecting to see the beginnings of a Sendmail conversa- tion.

October 2003 ;login: THE TCLSH SPOT l 23 set smtpSocket [socket 127.0.0.1 25] Syntax: fconfigure channelId ?name? ?value? set input [gets $smtpSocket] fconfigure Configure the behavior of a channel. puts "READ 1: $input" puts $smtpSocket "helo [email protected]" channelId The channel to modify. set input [gets $smtpSocket] name The name of a configuration field which includes: puts "READ 2: $input" -blocking boolean Ifset true (the default Unfortunately, this won’t quite work. This script generates a mode), a Tcl program will single line of output and then hangs: block on a gets, or read until data is available. If set READ 1: 220 vlad.cflynt.com ESMTP Sendmail false, gets, read, puts, flush, 8.11.6/8.11.6; Tue, 5 Aug 2003 20:48:36 -0400 and close commands will By default, Tcl channels use buffered I/O. The example above not block. just hangs forever with the string “helo [email protected]” sitting in -buffering newValue The newValue argument the client socket’s output buffer, while the Sendmail server waits may be set to: for input. full: The channel will use Tcl provides two solutions to this dilemma: the flush com- buffered I/O. mand, which will flush a buffer, or the fconfigure command, line: The buffer will be which allows an application to modify the behavior of a chan- flushed whenever a nel. full line is received. The simplest way to solve the problem is to follow each puts none: The channel will with a flush command. This works fine on small programs but flush whenever char- gets cumbersome on larger projects. acters are received. Syntax: flush channelId By using fconfigure to set the buffering to line mode, we don’t need the flush after each puts command. flush Flush the output buffer of a buffered channel. set smtpSocket [socket 127.0.0.1 25] channelId The channel to flush. fconfigure $smtpSocket -buffering line For example: set input [gets $smtpSocket] set smtpSocket [socket 127.0.0.1 25] puts "READ 1: $input" set input [gets $smtpSocket] puts $smtpSocket "helo example.com" puts "READ 1: $input" set input [gets $smtpSocket] puts $smtpSocket "helo [email protected]" puts "READ 2: $input" flush $smtpSocket This script generates output resembling this: set input [gets $smtpSocket] puts "READ 2: $input" READ 1: 220 vlad.cflynt.com ESMTP Sendmail 8.11.6/8.11.6; Tue, 5 Aug 2003 20:51:34 -0400 This generates the expected output of a simple conversation: READ 2: 250 vlad.cflynt.com Hello localhost [127.0.0.1], READ 1: 220 vlad.cflynt.com ESMTP Sendmail pleased to meet you 8.11.6/8.11.6; Tue, 5 Aug 2003 20:48:36 -0400 A server-side socket is a little different. Rather than opening a READ 2: 501 5.0.0 Invalid domain name connection to another system, a server waits until a client The better way to solve the buffered I/O problem is to figure out requests a connection to a particular port. When a client what style of buffering best suits your application and configure requests a connection, a new port is assigned for the conversa- the channel to use that buffering. For a challenge/response type tion, and a callback script defined in the socket -server com- interaction, this is probably line buffering; a character-based mand is evaluated. interactive application (like Telnet) would use no buffering, Syntax: socket -server procedureName ?options? port while an application moving lots of data (like an HTTP dae- socket mon) would use fully buffered I/O. -server Open a socket to watch for connections from clients.

24 Vol. 28, No. 5 ;login: procedureName A procedure to evaluate when a connection available to read when the script is called, thus the application attempt occurs. This procedure will be called never blocks. with three arguments: Syntax: fileevent channel direction ?script? n The channel to use for communication fileevent Defines a script to evaluate when a channel ROGRAMMING with the client. P readable or writable event occurs. l n The IP address of the client. channel The channel identifier returned by open or n The port number used by the client. socket. ?options? Options to specify the behavior of the socket. direction Defines whether the script should be evalu- -myaddr addr Defines the address (as a name ated when data becomes available (readable) or number) to be watched for or when the channel can accept data connections. This is not neces- (writable). sary if the client machine has ?script? Ifprovided,this is the script to evaluate when only one network interface. the channel event occurs. If this argument is port The number of the port to watch for connec- not present, Tcl returns any previously tions. defined script for this file event. The code to establish a server-side socket looks like this: Setting up a file event is commonly done on the server side socket -server openConnection $port within the openConnection script, and on a client side, imme- diately after opening the socket. The script that gets evaluated when a socket is opened (in this # Server side sample openConnection with fileevent case, the openConnection procedure) does whatever setup is required. This might include client validation, opening connec- proc openConnection {channel ip port} { tions to databases, configuring the socket for asynchronous fileevent $channel readable [list processLine $channel] read/write access, etc. fconfigure $channel -buffering line } The script has three arguments appended to it before being evaluated: the handle for the new channel, the IP address of the # Client sample open socket with fileevent client, and the port assigned to the client’s socket. set Client(sock) [socket 127.0.0.1 12345] fileevent $Client(sock) readable "processLine $Client(sock)" A simple server to report the current time and close the connec- tion looks like this: The last “Tclsh Spot” article described using expect to auto- mate examining a log file. We can use the challResp procedure #!/usr/local/bin/wish socket -server openConnection 12345 from that example to build a client that will automate verifying an FTP server. proc openConnection {channel ip port} { puts $channel [clock format [clock seconds]] The challResp procedure provides a framework for close $channel challenge/response interactions: } ############################################# This will open a connection, but doesn’t do anything useful. A # proc challResp {pattern response info}— more useful server would interact with the client. The server #Hold a single interchange challenge/response conversation. could use the blocking gets command to wait for input, but # Arguments while this paradigm works with single-client applications like #pattern:The pattern to wait for as a challenge. #response:The response to this pattern. Sendmail, it won’t work with multiple clients, any of which #info: Identifying information about this interac- might require service at any time. #tion for use with exception reporting. Tcl supports both the linear-program flow used with a block- # until-data-is-ready model, and an event-driven flow, which can # Results be used with a multiple simultaneous session model. proc challResp {pattern response info} { global spawn_id The fileevent command defines a script to evaluate when data expect { becomes available. Using fileevent guarantees that data will be $pattern {exp_send "$response\n"}

October 2003 ;login: THE TCLSH SPOT l 25 timeout {error "Timeout at $info" "Timeout at $info"} set fail [catch {challResp $challenge [subst eof {error "Eof at $info" "Eof at $info"} $response] $msg} result] } if {$fail} {break;} return "OK" } } array set lookup {0 "Success" 1 "Fail"} We could automate this FTP login conversation: puts $Client(output) [list RESULT: $lookup($fail) $< ftp 192.168.90.222 $result] Connected to 192.168.90.222. } 220 vmware2.cflynt.com FTP server (Version By using the associative array Client to hold the IP address, wu-2.6.2-5) ready. Name (192.168.90.222:clif): anonymous username, and password, it is easy to run multiple tests with 331 Guest login ok, send your complete e-mail address code like this: as password. array set Client {IP 192.168.99.99 User badIP Passwd Password: badPasswd} 230 Guest login ok, access restrictions apply. runTest with this code: array set Client {IP 192.168.90.222 User goodUser Passwd goodPasswd} spawn ftp 192.168.99.99 runTest challResp "Name" anonymous "Name prompt" challResp "word:" [email protected] "Password prompt" This set of code would create a stand-alone application with a challResp "Guest login ok" " " "FTP application prompt" hardcoded set of tests. The script would iterate through the tests and exit. If there is no FTP server running on 192.168.99.99, a timeout error will be generated with the string Name prompt, and if We can convert this into a client-server application by adding a anonymous logins are not supported, the error will include the procedure to process the data that’s read from the server and a string FTP application prompt. If anonymous logins are sup- few lines to open and configure the socket. The problem with ported, no error will be generated. this is that the script would open the socket, send an initial “Hello,” and then reach the end of the script and exit. This can be expanded and generalized into a procedure that keeps a list of arguments for challResp in a list, and iterates The vwait command is the solution to this problem. The vwait over them until the arguments are used up, or an error is command causes a script to wait until a variable changes value. thrown: The interpreter pauses at the vwait command and enters the event loop, processing events until the variable is assigned a new ############################################# value. After this the interpreter continues evaluating the com- # proc runTest {}— mands in the script. #Run an FTP login test # Arguments Syntax: vwait varName #NONE varName The variable name to watch. The script fol- # Results lowing the vwait command will be evaluated #Returns a list of result (Success/Fail) and optional after the variable’s value is modified. #failure message. # The simplest way to process the data from the server is to have proc runTest {} { the server always send Tcl commands, which can be evaluated global Client spawn_id errorInfo in the client using Tcl’s eval command. set errorInfo "" The eval command concatenates a set of strings into a single spawn ftp $Client(IP) string, and passes it to the command evaluation section of the set conversations { interpreter, just as lines in a script are evaluated. "Name" "$Client(User)" "Name prompt" Syntax: eval string1 ?string2...? "word:" "$Client(Passwd)" "Password prompt" string* Strings that will compose a command. "Guest login ok" {} "FTP application prompt" } proc processLine {channel} { global Client foreach {challenge response msg} $conversations {

26 Vol. 28, No. 5 ;login: set len [gets $channel line] {{anonymous/goodpwd} {IP 192.168.90.222 User anonymous Passwd [email protected]}} if {[eof $channel]} { } close $channel

return The openConnection procedure registers the fileevent script to ROGRAMMING } evaluate whenever the client sends data, configures the channel P l eval $line to be line buffered, and initializes a counter to step through the } tests for this client. set Client(output) [socket 127.0.0.1 23456] Note how this procedure uses an associative array index with fileevent $Client(output) readable "processLine two fields to distinguish the test counts for connections from $Client(output)" different clients. Using multiple fields in an array index pro- fconfigure $Client(output) -buffering line vides the same functionality in Tcl as multiple dimensioned # Let the server know we’re open for business. arrays provide in C and FORTRAN. puts $Client(output) ready proc openConnection {channel ip port} { global Server set Client(done) 0 fileevent $channel readable [list processLine $channel] vwait Client(done) fconfigure $channel -buffering line The server will send the client data like this: # initialize the test counter set Server($channel.testNum) 0 array set Client {IP 192.168.99.99 User badIP Passwd } badPasswd} runTest The processLine procedure starts the same as the client’s When the client receives data, the fileevent command causes processLine procedure by reading a line of data and checking the processLine procedure to be evaluated, which reads a line for an EOF condition. from the socket and evaluates it. The server does a rudimentary parse, looking to see if a line Notice the eof test after the gets. This will catch the spurious starts with the phrase “RESULT”.Ifthe line starts with read event generated by most TCP stacks when the other end of “RESULT”,it’s displayed. Once data is read, the runTest proce- a socket closes. dure is invoked. The server end of this pair includes a list of tests to run and proc processLine {channel} { global Server three procedures to coordinate the tests: set len [gets $channel line] openConnection Accepts new client connections. if {[eof $channel]} { processLine close $channel Reads data from the client. Displays results and calls unset Server($channel.testNum) runTest to start the next test in the client. return runTest } Sends the commands to the client. if {[string first RESULT $line] == 0} { puts "$Server(descript): $line" The list of tests can simply be an identifier and a set of array } indices and values to be sent to the client: runTest $channel set Server(tests) { } {{bad address } {IP 192.168.90.223 User badIP Finally, the runTest procedure checks to see whether there are Passwd badPasswd}} valid tests to be run. If there are, it sends the appropriate Tcl {{bad username} {IP 192.168.90.222 User badUser commands to the client and updates the counter. Passwd badPasswd}} {{good username} {IP 192.168.90.222 User goodUser proc runTest {channel} { Passwd goodPasswd}} global Server {{anonymous/badPasswd} {IP 192.168.90.222 User if {$Server($channel.testNum) = [llength anonymous Passwd badPasswd}} $Server(tests)]} {

October 2003 ;login: THE TCLSH SPOT l 27 puts $channel {set Client(done) 1} } else { foreach {Server(descript) params} \ [lindex $Server(tests) $Server($channel.testNum)] {} puts $channel "array set Client [list $params]" puts $channel "runTest" incr Server($channel.testNum) } } This pair of procedures implements a simple test framework that can be run with different sets of data to characterize an FTP server. It’s not sufficient to handle characterizing a firewall, but it’s getting closer. Sending scripts to the client to evaluate is a technique used by agent-style applications. This technique supports a great deal of customization at runtime. Tcl’s eval command creates safe sandboxed interpreters, which makes it an excellent choice for exploring agent style applications. The next “Tclsh Spot” article will look at building a server- agent architecture to perform more tests. As usual, the complete code that was described in this article is available from http://www.noucorp.com.

28 Vol. 28, No. 5 ;login: practical perl print submit(), reset(); print end_form(); by Adam Turoff Adam is a consultant print end_html(); who specializes in ROGRAMMING

using Perl to manage While that approach is easy to write and easy to understand, it P

big data. He is a l long-time Perl Mon- is also awkward and cumbersome. It is a simple translation of ger, a technical editor HTML syntax into Perl statements for building a static Web for The Perl Review, and a frequent pre- page. Modifying and debugging this program will be more diffi- senter at Perl confer- cult than necessary – programmers will need to keep a mental ences. model of the HTML page in mind while modifying code that [email protected] uses Perl syntax. Adding dynamic features to selectively display some components will turn this simple program into some- Using Object Factories thing complex very quickly. The above fragment deals with two primary tasks: building the In my last column, I demonstrated how to clean up a CGI Web page, and building the form components. In my experi- form-building program by refactoring it and using a hierarchy ence, the first part of this program is static and unchanging, of modules. Each module inherited from a common applica- while the second part is more dynamic. Therefore, the program tion-specific Field module and implemented specialized behav- can be simplified by separating the static HTML-building parts iors to produce a specific kind of CGI form field. This time, I from the more dynamic form-building parts. revisit the same problem and examine a different solution – object factories and factory methods. One way to simplify the form-building part of this program is to describe an HTML form with a list of hashes. Each hash in One of the best features of Perl is the principle of TMTOWTDI: this list represents a single form field. Building an HTML form “There’s More Than One Way to Do It.” Even if you have only a involves processing these field descriptions and converting them passing familiarity with Perl, you can use it to automate a into HTML form fields as necessary. The full Web page is then tedious task or write a small program to get your job done. You created by combining the static HTML elements with these can approach Perl as a shell programmer, or as a C or Java pro- dynamically generated form fields. A program written this way grammer, and still get your job done. might look something like this: However, Perl is a rich language unlike any other. While you are #!/usr/bin/perl -Tw free to use idioms from shell, C, or Java to accomplish your task, use strict; using Perl idioms can help you do it faster and with less effort. use CGI qw(:standard); My last column took a Java-flavored approach to cleaning up a CGI program. In this column, I’ll look at a more Perl-flavored my @fields = ( { approach that’s easier to write, maintain, and extend. -name => "name", -label=> "Name", Many Ways to Do It -size=>50, Consider the problem from the last column: a CGI program -maxlength=>50, that needs to create an HTML form. The simple and straight- -procedure=>\&CGI::textfield, forward approach is to use the CGI.pm HTML-building func- }, tions to build a Web page one piece at a time: ## more fields here ); #!/usr/bin/perl -Tw ... use strict; my @rows; use CGI qw(:standard); foreach my $row (@fields) { print header('text/html'); my $sub = $row->{‘-procedure’}; print start_html("Test Page"); push (@rows, Tr(td($row->{-label}), td($sub->(%$row)))); print start_form(); } print popup_menu(...); ...

October 2003 ;login: PRACTICAL PERL l 29 print start_form(), The Problem with Inheritance table(@rows), The interface provided by the Field::* modules certainly sim- submit(), reset(), plifies the job of creating a dynamic Web form. It works by end_form(); using a core Field class and subclasses like Field::Text to con- ... struct specific field types: While this approach is a step forward, it does have problems. package Field; The format for the field descriptions found in the @fields array use strict; are undocumented. They are values that will be passed to a CGI.pm function like CGI::textfield, but that knowledge is use CGI qw(:standard); hidden dozens or hundreds of lines away in the body of the sub init_field { foreach loop. my $self = shift; This approach also leads to a lot of duplicated information. The my %params = @_; -name and -label components contain similar values. Instead, ## Assign the key/value pairs for this object one could easily be derived from the other, reducing an oppor- while(my ($key, $value) = each %params) { tunity for bugs to creep in. $self->{$key} = $value; } Ideally, these field objects should be simple to create and use. ## Create the field name from the text label One way to do that is to create a group of Field modules to ease my $name = "\L$self->{-label}"; the process of defining fields to build an HTML form. Here is $name =~ s/ /_/g; an example of what that might look like, taken from the last $self->{-name} = $name; column: return $self; #!/usr/bin/perl -Tw } use strict; sub display_row { use Field; ## pull in all the Field::* packages my $self = shift; use CGI qw(:standard); my $sub = $self->{-procedure}; my @fields = ( return Tr(td($self->{-label}), td($sub->(%$self))); new Field::Text('Name'), } ## more fields here package Field::Text; ); use base 'Field'; ... use CGI; my @rows; sub new { foreach my $field (@fields) { my $class = shift; push (@rows, $field->display_row()); my $label = shift; } ## Create an object print start_form(), my $self = bless {}, $class; table(@rows), ## Finish initialization submit(), reset(), $self->init_field(-label => $label, end_form(); -size => 50, ... -maxlength => 50, -procedure => \&CGI::textfield); In this example, the interface for building a Web page is much } cleaner. Creating the @fields list is done by creating a series of objects that are defined by the Field module. Each object con- Field types that display multiple values share similar behaviors. structor uses sensible defaults and requires a minimal amount The Field::Group module helps to define field types like radio of information. Later on, the dynamic HTML form field gener- groups and checkbox groups: ation is accomplished by calling the display_row method on package Field::Group; each field object in turn. use base 'Field';

30 Vol. 28, No. 5 ;login: sub init_group { ence between these objects. In fact, the only real differences my $self = shift; between these objects are in the data they store. my $procedure = shift; my $label = shift; Because there are no behavioral differences between these

objects, there’s no necessity to create multiple class definitions. ROGRAMMING

my @values = @_; P

In fact, all of these objects could be created through different l $self->init(-label => $label, methods in a single class. After all, there’s nothing special about -procedure => $procedure, object constructors in Perl – they’re just class methods that hap- -values => \@values); } pen to create objects. package Field::RadioGroup; Refactoring the code to take advantage of this observation, we use base 'Field::Group'; can replace the entire class hierarchy with two classes: one to use CGI; display fields and one to create field objects. The new Field class is very easy to write; it contains all of the behaviors shared sub new { across field objects. Currently, this is only the display_row() my $class = shift; method, and a basic constructor: my $self = bless {}, $class; package Field; $self->init_group(\&CGI::radio_group, @_); use CGI qw(:standard); } sub new { package Field::CheckboxGroup; return bless {}, __PACKAGE__; use base 'Field::Group'; } use CGI; sub new { sub display_row { my $class = shift; my $self = shift; my $sub = $self->{-procedure}; my $self = bless {}, $class; return Tr(td($self->{-label}), td($sub->(%$self))); $self->init_group(\&CGI::checkbox_group, @_); } } The rest of the magic is in an object factory class, a class that Although this module hierarchy does aid in creating dynamic exists to create other objects. This class, FieldFactory, contains HTML forms, it has a Java-flavored design that leads to overly the methods for creating and customizing new Field objects, complex Perl code. In order to define three types of fields, five init_field() and init_group(). The init_field() method handles the classes are required in a hierarchy that is three levels deep. bulk of the initialization and customization of a new Field Extending this library isn’t difficult, but it is cumbersome. Each object, while the init_group() method handles tasks common to HTML form field type requires a new class definition. Each initializing group fields. class definition contains a package declaration, a use base dec- Here’s the start of the FieldFactory class. These two methods are laration, and a constructor method. While none of these almost exactly the same as the previous versions. The main dif- requirements are particularly odious, they obscure the intent: ference is that the init_field() method customizes a new object, identifying the differences between textboxes, radio groups, $obj, not the current object, $self (now a FieldFactory object): checkbox groups, and other HTML form fields. package FieldFactory; Using Object Factories use CGI; Looking at the code for the Field modules, there are two pri- sub new { mary tasks that need to be solved: creating and displaying field return bless {}, __PACKAGE__; objects. The process of creating fields is handled by a series of } constructor functions, and the process of displaying fields is sub init_field { handled by the display_row() method in the Field class. my $self = shift; Each type of field object is created by a different method. my %params = @_; Field::Text Textbox objects are created by the constructor of the my $obj = new Field; class, radio group objects are created by the constructor of the Field::RadioGroup class, and so on. But there’s very little differ- ## Assign the key/value pairs for this object while(my ($key, $value) = each %params) {

October 2003 ;login: PRACTICAL PERL l 31 $obj->{$key} = $value; With these changes to the Field module, the CGI program needs } some minor changes. First, we have to create a FieldFactory ## Create the field name from the text label object. Next, the constructor calls to create form fields need to my $name = "\L$self->{-label}"; be replaced with method calls on the factory object. The $name =~ s/ /_/g; updated code looks something like this: $obj->{-name} = $name; #!/usr/bin/perl -wT return $obj; use strict; } use Field; sub init_group { use FieldFactory; my $self = shift; my $factory = new FieldFactory; my $procedure = shift; my $label = shift; my @fields = ( my @values = @_; $factory->textbox('Name'), ... $self->init_field( ); -label => $label, -procedure => $procedure, my @rows; -values => \@values); foreach my $field (@fields) { } push (@rows, $field->display_row()); } The remainder of the FieldFactory class is composed of factory methods which call these two init methods to create Field print start_form(), objects. Here is the factory method that creates textbox fields. It table(@rows), submit(), reset(), is almost identical to the old Field::Text constructor: end_form(); sub textbox { ... my $self = shift; my $label = shift; Conclusion return $self->init_field( This program demonstrates there’s always more than one way -label => $label, to do it. Simple and straightforward programs may be easy -size => 50, to write initially, but they can lead to readability and maintain- -maxlength => 50, ability problems later on as they grow. Cleaning up with ad hoc -procedure => \&CGI::textfield); data structures can help some areas of a program and hurt } others. The real benefit comes from adding new factory methods. Here Restructuring programs to use modules is a very big win, and are a few more which create radio groups, checkbox groups, and there’s more than one way to factor out common code into pop-up menus. Note that all we need here is the code. The modules. One common technique is to create a hierarchy of extraneous package declarations and other magic incantations classes to solve a problem. Another is to create an object factory are no longer necessary: instead of a class hierarchy to describe differences between sub radio_group{ objects. Each technique has its benefits and its uses. In this my $self = shift; example, using an object factory helped simplify the implemen- $self->init_field_group(\&CGI::radio_group, @_); tation of an application-specific module with very little impact } on the main of the program. sub checkbox_group{ my $self = shift; $self->init_field_group(\&CGI::checkbox_group, @_); } sub popup_menu { my $self = shift; $self->init_field_group(\&CGI::popup_menu, @_); }

32 Vol. 28, No. 5 ;login: the bookworm BOOKS REVIEWED IN THIS COLUMN by Peter H. Salus There are things I disagree with, but anyone can cavil at anything. This is a THE ART OF UNIX PROGRAMMING Peter H. Salus is a ERIC S. RAYMOND member of the ACM, wonderful must-have. Buy one as soon the Early English Text as you can. Boston, MA: Addison-Wesley, 2003. Society, and the Trol- Pp. 550. ISBN 0-131-142901-9. lope Society, and is a And thank you, Eric. life member of the ABSOLUTELY OPENBSD American Oriental Society. He is Editorial More OS Stuff MICHAEL W. LUCAS Director at Matrix.net. San Francisco, CA: No Starch, 2003. He owns neither a dog Last year I read and enjoyed Lucas’ Pp. 528. ISBN 1-886411-99-9. nor a cat. Absolute BSD,which was on FreeBSD. Lucas has now produced a (smaller) MOVING TO LINUX [email protected] tome called Absolutely OpenBSD.If MARCEL GAGNÉ you’re really into security and excessively Boston, MA: Addison-Wesley, 2003. For about six months or so I’ve been Pp. 348. ISBN 0-321-15998-5. reading the same book, or actually, parts paranoid, OpenBSD is the system for of the same book, some of them several you. And as it’s not an “easy” system, LINUX SECURITY COOKBOOK times. This is because I’ve been reading Lucas’ new book is much needed. I espe- DANIEL J. BARRETT ET AL. The Art of UNIX Programming as Eric cially enjoyed the sections on installa- Sebastopol, CA: O’Reilly, 2003. Pp. 308. ISBN 0-596-00391-9. Raymond’s been writing it. And it has tion and configuration and on building gotten better and better as a variety of firewalls with pf. LINUX IN A NUTSHELL, 4TH ED. folks – including (alphabetically) Ken Gagné’s Moving to Linux is a straightfor- ELLEN SIEVER ET AL. Arnold, Steve Bellovin, Steve Johnson, ward exposition of just how a non- Sebastopol, CA: O’Reilly, 2003. Pp. 928. ISBN 0-596-00482-6. Doug McIlroy, Henry Spencer, and Ken hacker PC user can get rid of “the Blue Thompson – have put in their two cents Screen of Death.” If you have a friend, a INTRUSION DETECTION WITH SNORT (and, in several cases, at least a dime). co-worker, a significant other, or a rela- RAFEEQ UR REHMAN If you are an experienced user of things tive who periodically screams, sighs, Upper Saddle River, NJ: Prentice Hall, 2003. Pp. 256. ISBN 0-13-140733-3. UNIX-y, you’ll really enjoy Raymond’s bursts into tears, or asks for help, here’s work. If you’re a newbie, you may have the simple solution. It comes with a THE LINUX DEVELOPMENT PLAT- to really think about much of it; and if bootable CD of Knoppix, Klaus Knop- FORM you’re in the middle, you can taste, per’s variant of Debian. RAFEEQ UR REHMAN AND CHRISTOPHER PAUL savor, and enjoy. With Linux Security Cookbook,Barrett et Upper Saddle River, NJ: Prentice Hall, 2003. Pp. 320 + CD-ROM. ISBN 0-13-009115-4. I’ll assume that most of my readers are al. have done a nice job in presenting a mid- to high-level. If so, the best place to lot of security tools and techniques in a TCL/TK: A DEVELOPER’S GUIDE, 2D begin this valuable book is at Appendix brief book (barely over 300 pages). I’m ED. D, “Rootless Root.” The first time I read disappointed at the paucity of refer- CLIF FLYNT it, I laughed so hard I got hiccups. Once ences, but the information that’s actually San Francisco, CA: Morgan Kaufmann, 2003. you’ve read that, you might want to go here is first-rate. Pp. 758. ISBN 1-55860-802-8. to pp. 35–38 (in Chapter 1), because The second edition of Linux in a Nut- PRACTICAL PROGRAMMING IN everything that I think of as important is shell has lived near my desk for four- TCL/TK, 4TH ED. encapsulated there. Raymond has ren- and-a-half years. The new fourth edition BRENT WELCH ET AL. dered McIlroy’s, Pike’s, and Thompson’s has just supplanted it. The fourth edi- Upper Saddle River, NJ: Prentice Hall, 2003. versions of the UNIX philosophy into a tion is 1.5 times the size of the second. It Pp. 877 + CD-ROM. ISBN 0-13-038560-3. set of bullets, which might well be put seems to be more than 1.5 times as use- on a wall near every hacker’s screen. ful. Rather than take you through all of Ray- Bruce Perens, the former Debian project mond’s valuable pages, let me just leader, is series editor for a slew of open remark on how good certain pieces are, source books from Prentice Hall. I’ve like the 20 pages of OS comparisons; the read Intrusion Detection with SNORT chapters on languages (and mini-lan- and The Linux Development Platform, guages), editors, and tools are extraordi- and they are of an extremely high qual- nary.

Ocotber 2003 ;login: 33 ity. I trust the other volumes will be as In October 1983, there were 78 good. The snort volume provides a Networking email/news links in the UK being fed number of useful scripts to enable you from UKC by Peter Collinson. Also in to integrate snort with Apache, PHP, etc. Anniversaries 1983, EARN was created – a European The Development Platform does an by Peter H. Salus version of BITNET. (And while I’m talk- excellent job of limning just how to go ing about these things, it was early in about building a Linux development [email protected] 1984 that Jun Murai initiated JUNET in environment. It’s accompanied by a use- As we close in on the end of 2003, I want Japan.) ful CD. to point out the various and sundry FIDONET was also created in 1983. “Net” anniversaries this year has brought Tcl Me us. IBM’s VNET grew to 1001 when I’ve been a Tcl fan for a long time. And I Reykjavik was connected in 1983. admit that I am a friend both of Clif In 1968, BBN received the contract to build and deploy a packet-switching net- Networking was barely a teenager, but Flynt and of Brent Welch. I liked both had become a raging success all over the books when they first came out. Flynt’s work of four nodes. world. has improved somewhat for this second In 1973 (30 years ago), the ARPANET edition – and those of you who read his was made up of 35 hosts; by the end of Moving on, in 1988 the NSFNET T1 column in ;login: will understand where the year, Bob Metcalfe felt it necessary to backbone became operational much of it has come from. Welch’s book warn about security problems (RFC 602, (1.544Mbps!). Five years later, in 1993, has changed tremendously since the first “The Stockings Were Hung by the both PSINet and AlterNet deployed T3 edition nearly a decade ago. But so has Chimney with Care”). backbones. And DARPA became ARPA, Tcl since I first came in contact with it in again. 1989 or 1990. Tcl is a really fine script- A decade later, the Net had grown to 575 ing language, and these books will hosts and the DCA-enforced switch to enable you to use many enhancements, TCP/IP had been announced. A mere internationalization, and the toolkits. five years later (1988), there were 60,000 hosts. That’s only 15 years ago. Right now, my best guess is that the Internet is at 300 million hosts.

34 Vol. 28, No. 5 ;login: news USENIX MEMBER BENEFITS of the latter is election due to name As a member of the USENIX Association, 2004 USENIX recognition rather than skill, thus lead- EWS you receive the following benefits: ing to irretrievable mistakes born of FREE SUBSCRIPTION TO ;login:,the Association’s Nominating inexperience. In the case of the problem magazine, published six times a year, featur- of self-perpetuation, the solution is term USENIX N ing technical articles, system administration Committee limits, and USENIX indeed has them ● articles, tips and techniques, practical columns on such topics as security, Tcl, Perl, (http://www.usenix.org/about/bylaws. Java, and operating systems, book reviews, by Dan Geer html#art4). In the case of the problem of and summaries of sessions at USENIX con- getting well known but inappropriate ferences. Chair, Nominating Committee candidates, the solution is a Nominating ACCESS TO ;login: online from October 1997 Committee, and USENIX indeed has to last month: www.usenix.org/ one of those (http://www.usenix.org/ publications/login/. about/bylaws.html#art7). None of this is ACCESS TO PAPERS from the USENIX Confer- magic—democracy is messy, after all. ences online starting with 1993 www.usenix.org/publications/library/proceedings/ Having the right leaders for their time is hard to predict, but you have to try. THE RIGHT TO VOTE on matters affecting the More than anything else, it is short-term Association, its bylaws, election of its direc- [email protected] essential that there are enough good tors and officers. candidates to choose amongst: some DISCOUNTS on registration fees for all It is time for you to think about the next new and some familiar, spread over all of USENIX conferences. round of elections for the Board of USENIX’s sub-specialties, a history of DISCOUNTS on the purchase of proceedings Directors and for Officers of USENIX. and CD-ROMS from USENIX conferences. both devotion and delivery to the orga- nization, and the room to maneuver SPECIAL DISCOUNTS on a variety of products, USENIX has eight board members: four books, software, and periodicals. See are officers (President, Vice President, their private life to make USENIX a pri- for details. board members at large. All stand for essential that the best potential officers FOR MORE INFORMATION election every two years. be gotten onto the Board, so that next set of Officers can be chosen from those REGARDING MEMBERSHIP OR Organizations like USENIX have a diffi- with prior experience at the Board level. cult trade-off: Renewing their leadership BENEFITS, PLEASE SEE In other words, some new Board mem- by officially recognizing the candidates http://www.usenix.org/ bers every two years is a good thing, most likely to build the organization along with a convincing set of Officers membership/ versus the free-for-all. The risk of the known for their prior accomplishment former is self-perpetuation of the offi- OR CONTACT to be ready to take leadership roles. cers more than the organization; the risk [email protected] Phone: 510 528 8649

USENIX BOARD OF DIRECTORS DIRECTORS: Communicate directly with the USENIX Board Tina Darmohray, [email protected] of Directors by writing to [email protected]. John Gilmore, [email protected] Jon “maddog” Hall, [email protected] PRESIDENT: Avi Rubin, [email protected] Marshall Kirk McKusick, [email protected] VICE PRESIDENT: EXECUTIVE DIRECTOR: Michael B. Jones, [email protected] Ellie Young, [email protected] SECRETARY: Peter Honeyman, [email protected] TREASURER: Lois Bennett, [email protected]

Vol. 28, No. 5 ;login: 35 The Nominating Committee Chair is Please review the benefits and dues ❥ All USENIX Conference Proceed- appointed by the sitting Board. I am that structure listed below and on our Web ings in a downloadable format to Chair this time around. The Nominating site : http://www.usenix.org/membership/. your server, so that staff and stu- Committee must be both knowledgeable dents at your institution have any- We will have an electronic membership and wise, and must be willing to forego time access to all papers from our renewal process in place by this Fall. office themselves. That Committee will events Please look for renewal notices sent to be announced shortly. I will choose you via email. Supporting Members also receive: them and we will make recommenda- ❥ A free ad in ;login: once during the tions both for Board members at large USENIX Membership Benefits and for Officers. We will do our best and membership term we will very much hope you follow our INDIVIDUALS AND STUDENTS ❥ 10% discount on sponsorship and advice, as by the time we are done we MEMBERSHIP exhibit fees at USENIX events will believe in our decisions. ❥ Access to the latest USENIX Con- ❥ Ten member-priced conference reg- ference Proceedings on the istrations for your company staff If you are considering offering yourself USENIX Web site: http://www. during the membership term for election by petition, please keep in usenix.org/publications/library/ ❥ Your name and URL listed on our mind that the Board of USENIX is real proceedings/ Supporting Members Web site, as work, never trivial, and a great place to ❥ Free subscription to ;login: - the well as acknowledgment in printed make a difference. You should have Association’s magazine, published materials for events and in all issues pluck, vision, the ability to work with six times a year, bringing you tech- of ;login: others, and a thorough sense of some nical features, summaries of topic area that USENIX covers. You USENIX conferences, system SAGE Membership BENEFITS should be goal- rather than process- administration tips, reports on var- ❥ Access to the members-only area of oriented, have a sense of humor, and ious Standards activities, and book SAGEweb, which offers Web-based be fond of the idea that you get more reviews. ;login: is sent in print, and services, including employment accomplished if you don’t care who gets is available online a month later to and job placement services, a the credit. all current USENIX members speakers bureau, a mentoring pro- Send suggestions for nominees, com- ❥ Discounts on technical sessions reg- gram, and a discussion site: ments on existing Board members, or istration fees at all USENIX-spon- http://www.sageweb.sage.org just plain questions to me: [email protected]. sored and co-sponsored events ❥ The ability to join SAGE-only elec- Speaking as a former Board member ❥ Discounts on USENIX Conference tronic mailing lists and Officer, I can tell you that being on Proceedings and CD-ROMs ❥ The Annual System Administration the Board of USENIX is absolutely, posi- ❥ The right to vote on matters affect- Salary Survey tively worth the effort. ing the Association ❥ Discount on the registration fee for ❥ Savings on books, journals, and the annual LISA Conference MEMBERSHIP software from cooperating publish- ❥ The most recent “Short Topic in ers Systems Administration” series NEWS ❥ 15% off subscription to The Linux booklet – practical publications Journal covering system administration SAGE Dues ❥ 20% off O’Reilly & Associates pub- issues and techniques, with online You may now join SAGE for $40! While lications access to all booklets: it is no longer required that you be a ❥ $10 off subscription to Sys Admin http://sageweb.sage.org/resources/ member of USENIX to be a member of ❥ 20% discount off No Starch Press publications/short_topics.html SAGE, we hope you will remain or books ❥ The right to vote for the SAGE become a member of both organiza- Executive Committee and on other tions. Member support allows us to con- INSTITUTIONAL MEMBERSHIP SAGE matters (EDUCATIONAL, CORPORATE, AND tinue to offer some of the most highly ❥ A Code of Ethics for System respected conferences and publications SUPPORTING) Administrators, suitable for ❥ in the industry. Includes all of the benefits of an framing Individual membership, plus:

36 Vol. 28, No. 5 ;login: conference reports This issue’s reports focus on The USENIX Annual Technical (Metro Fiber Systems, a wide-area opti- cal-networking company), MCI, and USENIX Annual Technical Conference Conference EPORTS then Worldcom, rising to challenge the R and on the 15th Annual FIRST Confer- June 9–14, 2003 largest companies ence. San Antonio, Texas in America. He is co-author of !%@:: A

Directory of Electronic Mail Addressing & ONFERENCE

AWARDS C

Networks,published by O’Reilly Books. ● The USENIX Lifetime Achievement OUR THANKS TO THE SUMMARIZERS: He is also co-author of RFC-850, the Award (the Flame) was given to Rick Standard for Interchange of FOR USENIX ATC ‘03 Adams for implementing Serial Line IP Messages, which was updated to become William Acosta (SLIP) and founding UUNET, thereby RFC 1036 in 1987. Raya Budrevich making the Internet widely accessible. In 1982 Rick ran the first international The Software Tools User Group (STUG) Francis Manoj David UUCP email link at the machine seismo award recognizes significant contribu- Rik Farrow (owned by the Center for Seismic Stud- tions to the community that reflect the Shashi Guruprasad ies in Northern Virginia), which evolved spirit and character demonstrated by Hai Huang into the first (UUCP-based) UUNET. He those who came together in the STUG. James Nugent maintained “B” News (at one time the Recipients of the annual STUG award most popular Usenet News transport), conspicuously exhibit a contribution to Manish Prasad wrote the first implementation of SLIP the reusable code-base available to all Peter Salus (Serial Line IP), and defined the first and/or the provision of a significant, Benjamin A. Schmit protocol for running TCP/IP over ordi- enabling technology directly to users in Wenguang Wang nary serial ports (in particular, dial-up a widely available form. modems). The SLIP protocol was super- FOR FAST 2003 The 2003 award was given to CVS (the seded, years later, by PPP, which is still in Anne Bennett Concurrent Versioning System) and its use. Rick founded a nonprofit telecom- four main authors, Dick Grune, Brian munications company, UUNET, to Berliner, Jeff Polk, and Jim Klingmon. reduce the cost of mail and net- Without CVS, it wouldn’t be possible for news, particularly for rural sites in any number of people to work on the America. (UUNET was founded with a same code without interfering with each $50,000 loan from the USENIX Associa- other. It can be argued that without tion, which was subsequently repaid.) remote-collaboration tools such as CVS, UUNET became an official gateway most of the larger free and open source between UUCP mail and Internet email, software that is available today could not as well as between North America and have existed. While individuals can pro- Europe. It hosted many related services, duce significant software, collaborative such as Internet FTP access for its methods are often needed for complex UUCP clients and the comp.sources. and wide-ranging projects. unix archives. Rick spun out a for-profit company, UUNET Technologies, which Dick Grune: The original author of the was the second ISP in the United States. CVS shell script, written in July 1986, The for-profit company bought the Dick is also credited with many of the assets of the nonprofit, repaying it with a CVS conflict resolution algorithms. He share of the profits over the years. The developed the script at the Free Univer- nonprofit has spent that money for sity of Amsterdam (Vrije Universiteit), many UNIX-related charitable causes where he teaches principles of program- over the years, such as supporting the ming languages and compiler construc- Internet Software Consortium. The for- tion. He was involved in constructing profit ISP became a multi-billion-dollar Algol 68 compilers in the 1970s and par- company and was merged with MFS ticipated in the Amsterdam Compiler

October 2003 ;login: USENIX ATC ‘03 ● 37 Kit in the 1980s. He is co-author of three concerns “how the brain works.” In case “I’m going to make no thunderous pro- books: Programming Language you don’t recall, Descartes separated nouncements,”Stephenson pronounced. Essentials, Modern Compiler Design,and mind and body; Damasio – and, by “Novelists, or coders, or philosophers, or Parsing Techniques: A Practical Guide. extension, Stephenson – think this paleontologists don’t go on churning wrong. The separation actually dates out masses of filterable stuff, but succeed Brian Berliner:Coder and designer of back to Plato, and Stephenson sees it as by doing it right the first time. the first translation of the CVS scripts to the difference between Spock and libido the C language, in April 1989, Brian “What we do is a much more physical (Kirk or Bones). based his design on the original work kind of work and emotional kind of done by Dick Grune. Stephenson went on to talk of his pro- work than is believed. Pressure to just duction of The Big U (1984), his first churn out lines of code will not lead to Jeff Polk: Jeff rewrote most of the code opus, his creation of “steaming moun- good things in the end.” of CVS 1.2. He made just about every- tains of crap,”resulting in a process of thing dynamic (by using malloc), added Yep. cutting and distillation. He analogized a generic hashed list manager, rewrote this with coding and with “distilling the modules’ database parsing in a com- INVITED TALKS whiskey from beer.” patible but extended way, generalized ENGINEERING REUSABLE SOFTWARE LIBRARIES directory hierarchy recursion for virtu- When we execute there is a “foreground Kiem-Phong Vo, AT&T Labs – Research ally all the commands, generalized the process” and a “background process” loginfo file to be used for pre-commit which goes on “all the time.”That back- Summarized by Wenguang Wang checks and commit templates, wrote a ground process needs space to do what Phong Vo has 20 years of experience new and flexible RCS parser, fixed an it does – “which is a mystery.”What we building general purpose software uncountable number of bugs, and have is “a faculty of maintaining the libraries, which cover a wide range of helped in the design of future CVS fea- entire stream all the time, while access- computing areas such as I/O, memory tures. ing it only bit-by-bit at a time.” allocation, container data types, and data transforming. In this talk, he Jim Kingdon: While at Cygnus, in 1993 Stephenson feels that we should “kick showed how to build efficient, flexible, Jim made the first remote CVS, which down the Platonic model.”He is force- and portable software libraries using the ran over TCP or rsh or kerberos’d rsh, fully against PowerPoint. discipline and method architecture. and eventually over TCP/IP. The “I use a fountain pen,”he remarked, and remote-CVS protocol enabled real use of Vo first pointed out that traditional went on to say that he does not use CVS by the open source community; standard software libraries such as mal- “smileys.”He thinks came up before remote CVS, everyone had to log loc, stdio, curses, and libc have many with something in Perl because there’s in to a central server, copy their patches problems. For example, malloc frag- “more than one way to do something.” there, etc. Some years later, Jim formed ments memory; many container data Cyclic, a company which offered CVS Stephenson returned to Damasio and types have multiple incompatible inter- support and development. cited Einstein’s “clear images,”which are faces; inconsistent and inadequate inter- “visual and muscular,”though he admit- faces are common; programmers often KEYNOTE ADDRESS ted that he wasn’t certain what Einstein resort to hacks around problem areas, Neal Stephenson meant by “muscular.”(I’d guess “vigor- which cause problems in maintenance, porting, and performance. Summarized by Peter H. Salus ous” or “forceful” would be the appro- priate gloss: the German is “kraeftig.”) Neal Stephenson – sci-fi author extraor- Vo then discussed the characteristics of dinaire of The Big U (part funny, part Mathematical entities can be combined ideal standard libraries. These libraries silly, but worth it), Snow Crash,and in an infinite number of useless forms, should have good standard interfaces; Cryptonomicon, among others – began Stephenson said, but they are capable of address unsatisfied needs; be easy to his keynote by remarking that Bertrand leading us to mathematical truth. configure, use, and upgrade; and, most Russell and Stephen Jay Gould didn’t “Invention is choice.” importantly, have decent performance. rewrite: they had “no delete key.” These libraries should enable applica- Down with Plato or Descartes: “We pos- tions to tailor algorithms for specific He then discussed at length Antonio sess an emotional marking system.” needs and simplify library composition Damasio’s Descartes’ Error (1994), which to optimize resource usage. Vo argued

38 Vol. 28, No. 5 ;login: that with these ideal libraries in hand, icy. Typical methods include Vmpool, software collection built on top of some programmers could focus on writing which allocates objects of the same size; of these libraries is available at http:// EPORTS libraries instead of writing programs, Vmbest, which allocates objects based www.research.att.com/sw/download. R since a program consists of libraries plus on a best-fit policy; and Vmlast, which the application specifics. The productiv- allocates memory for a complex struc- THE CONVERGENCE OF UBIQUITY: THE FUTURE OF NETWORK SECURITY

ture and releases all ONFERENCE objects together in con- William A. Arbaugh, University of C ● stant time. Maryland at College Park Summarized by Rik Farrow These methods are pre- defined in the library Bill Arbaugh, who often works in the and cannot be extended wireless arena, started off his talk with a by programmers. They joke (I wish I would remember to do can be selected dynami- that). He showed slides of Free Software, cally by environment Free Willy, Free Kevin, Free Martha, Free variables. For example, Wireless, and even Free Beer. But what after the VMDEBUG Arbaugh really wanted to talk about was environment variable is the past, present, and future of wireless set to 1, Vmalloc uses the security. debugging-enabled Hotspots are a part of the present of USENIX ’03 Reception methods instead of the wireless networks. You can find maps of default-efficient meth- ity of programming should increase due wireless networks, collected by war driv- ods for memory allocation, which allows to the maximal library reuse and mini- ing, for most large American cities, and various memory allocation problems to mal special code. some of these networks are open, requir- be detected automatically. This feature ing no authorization or encryption key. In traditional software libraries, the han- makes the debugging and the profiling Arbaugh postulated that your next gen- dle plus operations model is often used, of applications very easy whenever eration cell phone will work over wire- where the handle represents resources needed. less networks when possible, and fall and the operations define resource man- Vo then used the Vcodex library to show back on slower, but more ubiquitous, agement functions. Although this model how to design a D&M library. The cell phone technology when necessary. can address immediate needs and is easy Vcodex library can transform data using As you walk out of a hotspot talking on to use, it causes the problems discussed compression/decompression, data dif- your cell phone, it will switch transpar- above. To address these limitations, Vo ferencing, encryption/decryption, etc. ently from the Internet to the cellular presented the discipline and method Designing a D&M library requires deter- network. Arbaugh quipped that the poor architecture (D&M), which was used mining resource types and general oper- sound quality provided by cell phones when he developed the Vmalloc, Sfio, ations, characterizing resources in a has made short breaks in communica- Cdt, and Vcodex libraries with other general discipline interface, and captur- tion and occasional garbling expected in researchers. ing algorithms/resource management in phone calls, and that will make Internet Vo used the Vmalloc library as an exam- methods. In Vcodex, bytestream is iden- use more acceptable. ple to demonstrate the D&M architec- tified as a resource type (i.e., discipline). The ghost of wireless security past is ture. Vmalloc is a popular memory All data-transforming operations (com- WEP, or Wired Equivalent Privacy. allocation library used to replace the pression/decompression, encryption/ Arbaugh described the author of WEP standard malloc interface. In Vmalloc, a decryption, etc.) are different methods as being in hiding, having created a discipline defines the type of memory since they all change some set of bytes to cryptographic protocol that failed in and the event handling of memory allo- produce other bytes. every imaginable way. cation. Typical disciplines include heap All the D&M libraries discussed in this and shared memory. Programmers can The present of wireless security is Wi-Fi talk (Sfio, Cdt, Vmalloc, Vcodex) can be define their own disciplines to extend Protected Access (WPA). WPA bolts on downloaded from http://www.research. the type of memory. A method in Vmal- over existing hardware and solves some att.com/sw/tools. The AST OpenSource of the problems inherent in WEP. For loc defines the memory-allocation pol-

October 2003 ;login: USENIX ATC ‘03 ● 39 example, keys will change with each an enterprise UNIX OS for the IA-64. correctly. In the example she described, packet sent, and packet integrity will SCO has indicated that they may go after the email system used the pads with the actually work, due to a different set of large-scale users of Linux and mailed XOR operation two times too many, algorithms (TKIP). Stronger access con- letters to that effect to 1500 companies. making it possible to extract the pads trol (unlike the MAC addresses used by Even if IBM is exonerated, it is not clear from each message (and decrypt the WEP) is included. WPA2 will follow, that this would prevent SCO from messages as well). require hardware changes, and support bringing suits against other companies. One of Perlman’s personal pet peeves is AES. This is a problem for small companies, screensavers. When her screen goes which lack the money for a well organ- Alas, WPA does not solve the current blank, she immediately assumes that her ized legal defense. denial-of-service issues. Arbaugh screensaver has kicked in, and types her demonstrated this by sending manage- Several legal issues were brought up. password. Of course, while this might be ment packets (which are never One of them, an obscure legal ruling the case, a power-saving feature might encrypted or authenticated), which from the late 1800s involving the sale of have just blanked the screen, so if you knocked people off the room’s wireless a pregnant cow when neither the seller see a weird set of characters in an email channels. nor the buyer knew this fact, may be from her, it just might have been a pass- used to allow SCO to continue, despite word that she just had to change. Arbaugh’s future includes smaller their releasing Linux code under the devices, with the transparent transition- Perlman lambasted systems such as one GPL. ing between hotspots and cellular men- where whenever you change your pass- tioned earlier, IPv6 addresses and SCO has not yet divulged in June where word, it gets emailed back to you. Or an routing, and always-on connections. All the proprietary code is. A panel of peo- SSL-protected online site that you devices will need personal firewalls in ple from the open source community is back a receipt that includes your credit addition to anti-virus – not that that will forming to look at the code under a very card information. protect users from the management restrictive NDA, thus making the panel Perlman also contrasted secret and pub- back doors that already exist in many of questionable value. lic keys. One big reason for Kerberos, cell phones today. Jon Hall brought up the point that a quipped Perlman, was the cost of public Obviously, we are in for a lot of sur- comparison of the SCO and Linux code key licenses. Kerberos requires a direc- prises, and interesting work, in the is insufficient, since code could have tory that must always be available and future of wireless. been inherited from a multitude of securely hold everyone’s secrets. Public other places, such as BSD. keys no longer have the onerous license INTELLECTUAL PROPERTY IN AN AGE fee but require Public Key Infrastruc- The impact of this suit on the future is OF COMMERCE: CORE ISSUES IN THE tures instead. SCO / LINUX IP SUIT unclear, although it could affect the will- Chris DiBona, Damage Studios ingness of companies to share code. She described the four types of PKIs: monopoly, oligarchy, anarchy, and bot- Summarized by James Nugent HOW TO BUILD AN INSECURE SYSTEM OUT tom-up. In monopoly, everyone pays Chris DiBona gave a highly informal OF PERFECTLY GOOD CRYPTOGRAPHY Verisign to play. In oligarchy, 80 or more discussion-format talk on the SCO v. , Sun Microsystems vendors all have self-signed keys in all IBM suit and what it may mean. Jon Laboratories browsers, with no mechanism for revo- Hall and Don Marti (Linux Journal) also Summarized by Rik Farrow cation. In anarchy, we use the PGP participated in a semi-panel format. Radia Perlman always seems like she is model, which scales poorly. Bottom-up First, a general discussion of the issue is having a great time, and this talk was no appeared the most reasonable, in which in order. On March 6, 2003, SCO filed a exception. She selected cryptography each organization has its own certifica- suit alleging that IBM had stolen some bloopers from a much larger collection tion authority (CA), and organizations code from them and placed it in the than she could present during one talk. sign the certificates for the CAs they Linux kernel; they asked for $1 billion She began with an email system based need to work with. [now $3 billion – ed.] in damages. This on a one-time pad. One-time pads are Perlman went on to rant about the was supposedly done as part of Project the strongest, most secure way of craziness in the SSL certificate scheme Monterey , a joint effort involving SCO, encrypting data, provided the pads (use of X.509), the “300 content-free IBM, and other companies to produce themselves are kept secure and are used

40 Vol. 28, No. 5 ;login: pages of ISAKMP framework” for IPSec, should dictate policy but that these matter of choice or desire that some and her top-ten favorite list of people issues should be decided by the market. people would like to have. An important EPORTS and things that get in the way of better counterpoint that came up was how to R The approach that New.net has taken security. During the question and handle Uniform Domain-Name Dis- does not involve configuration pains for answer period, a manager begged her to pute-Resolution Policy (UDRP) when ISPs or individuals and is backward

lobby for better security, but Perlman numerous TLDs exist. For example, how ONFERENCE

compatible with the existing system. C stated that the world just doesn’t care would they deal with someone wanting New.net domains can be enabled either ● enough. to open up a .boycott or .sucks domain (1) via recursives that rely on US gov- with organization names in it for people ernment root servers but augment with ALTERNATIVE TOP-LEVEL DOMAINS to criticize companies. It would also be New.net domains, or (2) through user (AKA THE GAME OF THE NAME) harder to enforce anti-cybersquatting machines that have the New.net client Steve Hotz, New.net laws. To the speaker’s position that we plug-in (which still relies on the US gov- Summarized by Shashi Guruprasad need a consortium rather than one ernment root servers). New.net cur- organization, one person responded that Steve Hotz, the CTO of New.net, gave rently offers 88 new domains. They ICANN was the consortium. Lastly, it one of the most controversial talks of currently have 150 million customers in was pointed out that New.net names will this year’s USENIX ATC. This was evi- total. They have tied up with five out of be invalid if DNS security extensions are dent from the audience’s hostility and the top seven US-based ISPs and with implemented. bitter reactions, including a spate of pro- many worldwide. The speaker also men- fanities hurled at New.net and a few tioned that their efforts to talk to even directed at the speaker himself. MODELING THE INTERNET ICANN and other organizations have Harry DeLano and Peter H. Salus, Matrix New.net is a domain name registrar not succeeded. They also avoid conflict- offering consumers a large number of NetSystems ing TLDs with ICANN or country-code Summarized by William Acosta new and alternative top-level domains TLDs. However, it remains to be seen if (TLDs) – e.g., .family, .kids, .shop – that As the Internet continues to grow, it ICANN will break the Internet by becomes harder to represent the net- are not ratified by the Internet Corpora- launching conflicting TLDs. New.net tion for Assigned Names and Numbers, work graphically using a geographical also plans to partner with other alterna- model. Such representations of the net- or ICANN, a technical coordination tive TLD providers that may bid on body composed of a broad coalition of work lack the ability to visually express TLDs with ICANN, such as .kids or such details as the location of the “big the Internet’s business, technical, aca- .golf. Some of the big companies such as demic, and user communities. Among pipes,”peering relations, and routing Microsoft, AOL, AT&T, Verisign, and information. To illustrate the increasing other things, ICANN is responsible for Nokia are potential new TLD providers. assignment of domain names. complexity of the visual models of the However, New.net is not really making Internet, the speakers presented a his- New.net’s position is that additional an effort in this direction. The reality of tory of “maps” of the Internet, starting TLDs provide more choices and richer “breaking the root” will occur only if with the earliest known maps of naming schemes and that a market for multiple non-cooperating businesses ARPANET from 1969 and continuing them exists. Currently, there is no real promote competing namespaces. The with the geographical maps through the Internet directory and nothing to fill the last point was that New.net has been 1970s and 1980s that included the first gap between whois and search engines. operational for the past two years, yet satellite links to both Hawaii and Europe An artificial scarcity exists and has not the Internet is not broken. and the integration of multiple networks been mitigated by ICANN. The speaker During the Q&A, one person expressed like Usenet to form what is now know as repeatedly stressed that this is not the the wish that New.net would fail. the Internet. most important problem that needs Another mentioned that New.net has solving in the Internet but just a place One of the fundamental questions that broken the fundamental principle of a modeling attempts to answer is, how where a market exists. Some of the issues single consistent naming system wher- in new TLDs were: how many, Internet does the Internet behave? This question ever you go. Someone else said he can be broken down into several compo- stability, trademarks, who will control believed that things would break only if them, and, finally, where does the money nents: What are the nodes and their con- there was significant adoption. Another nections? How can the Internet be ($1 billion per year) go? New.net’s per- person asked how it matters whether spective is that no single organization visualized? How can its health be moni- there is a .usenix or usenix.org. The tored? How can we detect and respond answer was that it is not a necessity but a to events? Some of the techniques used

October 2003 ;login: USENIX ATC ‘03 ● 41 for measurement analysis include the Someone pointed out that maps of the can “hold hands” with each other to traceroute program, BGP information, physical world are backed by a physical form larger objects. By controlling the and DNS data. Similarly, tools such as reality that is less mutable than the direction of the “hand-holding,”one can ping and the Internet Weather Report Internet. This prompted the question of create solids, liquids, or gases. Control of [now discontinued - ed.] are used for whether we need better mapping sci- such machines requires nanocomputers. monitoring the “health” of the Internet. ence. Other issues raised during the Nanocomputers, however, cannot be However, these tools are not perfect. As Q&A included a discussion on the scala- programmed using the usual techniques. an example, the speakers noted that bility of current measurement tools and The energy required to erase a bit, traceroute does not accurately reflect the the limitation on probing large numbers though small, is very significant at such existence of multiple parallel paths to of hosts frequently. small scales. Thus, these nanocomputers some destination. need to be programmed using reversible URLs programming techniques that avoid The speakers discussed several projects CAIDA: http://www.caida.org/ erasing bits. (from Lumeta, CAIDA, MIDS) which Graphviz: http://www.research.att.com/ analyze and track certain sets of Internet Flying cars are now possible. Airflow can sw/tools/graphviz/ information. In particular, the speakers be controlled over the skin of the car. Lumeta: http://www.lumeta.com/ showed the Internet averages of latency, Negative drag gives all the required MIDS: http://average.matrix.net/ packet loss, and reachability observed thrust. Thus, the car can be sleek and Rocketfuel: http://www.cs.washington. during events like the September 11, elegant. Instant houses? At bacterial edu/research/networking/rocketfuel/ 2001, terrorist attacks and the April replication speeds, nanorobots can turn

Fools’ Virus, to demonstrate how these NANOTECHNOLOGY: AS HARDWARE a pile of dirt into a house. However, tools tracked the effect these events had BECOMES SOFTWARE there exists the problem of runaway on the Internet. Additionally, the speak- J. Storrs Hall, Institute for Molecular nanorobots. There are several means to ers discussed tools, such as Graphviz Manufacturing control these nanorobots. Removal of from AT&T Labs and Walrus from fuel can stop replication. Also, a fixed Summarized by Francis Manoj David CAIDA, that help visualize the topology supply of replication unit kernels can of the Internet. What is nanotechnology? It is the con- control the replication. struction of self-replicating machines Currently, data collection is subjective with atomic precision. Molecular manu- Other interesting applications include and performed in an ad hoc manner. facturing is actually a better term to thin skins that can insulate and protect Additionally, the data sets obtained lack describe this process. The manufactur- humans. These skins can be so thin that context and may be incomplete. As an ing process is not straightforward. they wrap around a single strand of hair, example, it was noted that paths from Because of their size it is not possible to providing air management and sweat traceroute and BGP data contain incon- just pick atoms and place them together. management! Respirocytes are yet sistencies. The speakers suggested that A molecule can be used as a handle to another design for small nanomachines data collection needs to be more objec- move the atom to the desired site. A that function like red blood cells. They tive and that the interpretation of the bond-forming chemical reaction is used can store and release oxygen in the results deserves more scrutiny. To this to fuse the atom to the rest of the struc- bloodstream as and when necessary. end, the speakers suggested the forma- ture. Space travel can also be made cheaper by tion of an international body patterned the construction of an extremely tall after the Centers for Disease Control Self-replication is a key requirement for platform using artificial diamonds. Thus that would both monitor the health of nanomachines. This results in a sophisti- nanotechnology has tremendous poten- the Internet and be able to respond to cated product with exponential growth tial for the future of humankind. events such as disasters and virus out- in capital because of the 100% reinvest- URLs: http://www.imm.org/Parts/ breaks. ment. A parts assembly robot is used to build other robots. Just like a car factory, http://discuss.foresight.org/~josh/ During the Q&A session, John Quarter- pipelining and convergent assembly http://www.losthighways.org/radebaugh. man asked what are the fixed points lines are used to speed up the process. html (analogous to cities on traditional road http://www.moller.com/ maps) of Internet maps? Suggestions Josh went on to illustrate applications of from attendees included using Internet this technology. Nanomachines can be exchanges and the root nameservers. built to simulate molecules. One such design is “utility fog.”These machines

42 Vol. 28, No. 5 ;login: TECHNICAL SESSIONS – ROLE CLASSIFICATION OF HOSTS WITHIN refined versions of most of the algo- GENERAL TRACK ENTERPRISE NETWORKS BASED ON CONNEC- rithms described in this paper. EPORTS

TION PATTERNS R ADMINISTRATION MAGIC Godfrey Tan, John Guttag, and Frans A COOPERATIVE INTERNET BACKUP SCHEME Summarized by Francis Manoj David Kaashoek, MIT; Massimiliano Poletto, Mark Lillibridge, Hewlett-Packard Labs; Mazu Networks Sameh Elnikety, Rice University;

UNDO FOR OPERATORS: BUILDING AN ONFERENCE UNDOABLE EMAIL STORE This paper presents methods to auto- Andrew Birrell, Mike Burrows, and C Michael Isard, Microsoft Research ● Aaron B. Brown and David A. matically group hosts in a network Patterson, University of California at based on their communication patterns. Backup service providers on the Internet Berkeley Grouping of hosts can simplify network are costly. This paper describes a scheme This won the General Track Best Paper management and can also detect anom- by which individual computers on the award. Human error is the primary bar- alous conditions. For example, a devel- Internet can cooperate to back up each rier to building dependable systems. A oper’s machine behaving like a mana- other’s data. Each computer uses a set of small mistake such as misconfiguring an ger’s machine would be interesting partners to store its backup data. In email virus scanner can result in lots of information to an administrator. return, it holds part of each partner’s backup data. By making redundant lost email. This damage could be pre- The proposed scheme uses probe hard- vented if operators were able to undo copies of the backup on multiple com- ware to sniff all network traffic. A cen- puters, a high level of reliability is their mistakes. This paper presents an tral aggregator then collects the infor- implementation of undo capabilities for obtained. Thus, inexpensive backups can mation from all the probes. It is respon- be obtained. an email distribution system. sible for maintaining a “neighbor rela- Three Rs – rewind, repair, and replay – tionship” between hosts that commun- The backup scheme depends upon are necessary to achieve such capabili- icate regularly. Groups are then identi- cooperation between hosts. There are ties. Rewind brings back a system to a fied based on host similarities. Similarity always bound to be hosts that refuse to time in the past, repair fixes the mistake, is defined as the number of common cooperate. In its basic form, this scheme and replay ensures that new information neighbors. Group formation is treated has several pitfalls. Hosts may promise is not lost. Paradoxes can arise during a as a graph theory problem. Each node in to hold data and not keep their word. replay, though. This is because the sys- the graph is a host and each edge with This is discouraged by several mecha- tem might make changes to message weight e represents the fact that there are nisms, including periodic challenges to bodies or restore missing email to a e common neighbors between the hosts. ensure that partners are cooperating and mailbox. To explain this to the user, For a given value of similarity, say k,a k- a novel method called “disk-space wast- explanatory emails are sent. neighborhood graph is constructed with ing” designed to make cheating unprof- edges of weight k.Bi-connected compo- itable. Attacks aimed at disrupting the The architecture for the undoable email nents in this k-neighborhood graph are service are also possible in the basic store consists of rewindable storage and considered separate groups. Special cases scheme. The paper outlines some solu- a proxy that intercepts user events. All are used when the graph is a tree. Groups tions to prevent these attacks as well. user events are encoded as “verbs” which can also be merged into a larger group are logged. An undo manager is used to Results from an initial prototype show based on group similarities. For more that these techniques are feasible as far control all undo operations. Studies with details on this, please refer to the paper. a Java implementation show that the as performance is concerned. The costs time and space overheads are not signifi- The effectiveness of this scheme has involved are also quite low compared to cant. Responding to a question at the been studied by comparing the groups other Internet backup options. end of the presentation, Aaron also clar- generated automatically with groups of ified that the explanatory messages hosts determined by system administra- themselves are not treated as user events tors. Results of this study show that the and hence no verbs are generated for groups extracted automatically closely them. match the desired groups. This scheme shows that automatic role classification of hosts based on their communication patterns is feasible. Mazu Network’s PowerSecure software now incorporates

October 2003 ;login: USENIX ATC ‘03 ● 43 POWER ory nodes each process uses creates Machine (JVM), and by sharing Summarized by Hai Huang opportunities to power off a large per- resources among these processes, it centage of nodes in the system, thereby reduces resource consumption and CURRENTCY: A UNIFYING ABSTRACTION FOR saving a significant amount of energy. improves overall performance. Multi- EXPRESSING ENERGY MANAGEMENT POLICIES The authors introduced other tech- user MVM takes this a step further and Heng Zeng, Carla S. Ellis, Alvin R. Lebeck, niques – library aggregation and page allows multiple users to safely execute and Amin Vahdat, Duke University migration – to achieve energy savings. different processes on the same MVM. A system-level energy management These techniques are compatible with As if executing in a traditional OS, technique is discussed. Energy is classi- various DRAM architectures, including processes, files, and other resources fied as yet another resource that can be SDR, DDR, and RDRAM. belonging to the same user will not be managed by the operating system. An illegally accessed by another user. MVM- energy quota is allocated to each process GET VIRTUAL 2 extends MVM by having a separate periodically, and when each process Summarized by Hai Huang instance of Jlogin per user session to accesses an energy-consuming hardware manage user identity, environment, and device (e.g., disk, CPU, network inter- OPERATING SYSTEM SUPPORT FOR VIRTUAL other access control information. It was face), it is charged the amount of energy MACHINES shown that for a typical workload, that was spent accessing the device. Samuel T. King, George W. Dunlap, and MVM-2 only incurs a small perform- When all its energy quota is spent, it Peter M. Chen, University of Michigan ance overhead over MVM, while provid- cannot execute further until the operat- The authors describe several techniques ing a safe multi-user environment. ing system replenishes its quota. The to reduce the virtualization overhead authors were able to show that the sys- associated with type II virtual-machine NEEDLES AND HAYSTACKS tem is able to achieve desired runtime, if monitors (VMM) so they can achieve it is possible, with a high success rate. By performance similar to type I VMMs. In Summarized by Wenguang Wang managing the energy quota of each their experiments, they used UMLinux A LOGIC FILE SYSTEM application, it is easy to suppress the as the guest operating system. One of Yoann Padioleau and Olivier Ridoux, execution of the less important tasks, the techniques they use to improve per- IRISA / University of Rennes while giving more energy to the more formance is to move the VMM from the Organizing information using the hier- important ones to keep the system run- user-space to the kernel, so they can archical paradigm, as is done by tradi- ning longer when the energy supply is avoid high-overhead ptrace calls and tional file systems, can provide naviga- low. reduce the number of context switches tion functions but is rigid in that an when guest system calls/signals are object can only be reached via one path. DESIGN AND IMPLEMENTATION OF POWER- invoked. Other techniques include the The Boolean query paradigm used by AWARE VIRTUAL MEMORY use of segmentation registers to reduce search engines provides flexible key- Hai Huang, Padmanabhan Pillai, and overhead of guest user-to-system and word-based search capability but lacks a Kang G. Shin, University of Michigan guest system-to-user boundary crossing, navigation mechanism. In this talk, A novel technique to reduce the power and the use of multiple address spaces Yoann Padioleau presented a logic file dissipation in the DRAM was presented. for the virtual machine to reduce the system (LISFS) based on the Boolean As workloads become more data-cen- guest user-to-user switching overhead. query paradigm but extended to provide tric, more energy is spent to sustain the The performance of their system was navigation. Each file in LISFS is associ- ever-growing DRAM in systems. The comparable to that achieved by VMware ated with a conjunction of properties of intuitive basis of current power-man- Workstation. interest. The directories are used to rep- agement technology is that processes resent properties. One can navigate in execute one at a time, and when each is A MULTI-USER VIRTUAL MACHINE LISFS by giving the desired or undesired executing, it only uses a small fraction of Grzegorz Czajkowski and Laurent properties as directory names. the total system memory. By figuring Daynès, Sun Microsystems; Ben Titzer, out the memory nodes each process Purdue University A prototype of LISFS has been imple- uses, energy can be saved by putting This paper describes the implementa- mented by the authors. The data struc- unused memory nodes into a low energy tion of adding multi-user capability in a ture of the current implementation is state. Proactively allocating physical multi-tasking virtual machine (MVM). optimized for searching and navigation. pages to minimize the number of mem- A MVM allows a user to execute multi- It is somewhat slow for file and property ple processes in the same Java Virtual creation. The performance experiments

44 Vol. 28, No. 5 ;login: of the prototype showed efficient navi- They found that delta-encoding using achieves a significant reduction of run- gation execution but 4 to 34 times Rabin fingerprints can improve on sim- time when reading large files from the EPORTS longer creation time than ext2. The disk ple compression by up to a factor of two, server, given that a significant portion of R space overhead is 2–5KB per file, given depending on workload. A small frac- the file blocks can be found in the CAS. 50 properties. A prototype of LISFS and tion of objects can potentially account

more information on this project can be for a large portion of the space and CHANGE IS CONSTANT ONFERENCE downloaded at http://www.irisa.fr/LIS. bandwidth savings. More importantly, C

Summarized by Shashi Guruprasad ● when multiple files match the same SYSTEM SUPPORT FOR ONLINE APPLICATION-SPECIFIC DELTA-ENCODING VIA number of features, any file can be used RECONFIGURATION RESEMBLANCE DETECTION as a good base for computing delta. Fred Douglis and Arun Iyengar, IBM T.J. Craig A.N. Soules, Gregory R. Ganger, Carnegie Mellon University; Jonathan Watson Research Center OPPORTUNISTIC USE OF CONTENT Appavoo, Michael Stumm, and Kevin Delta-encoding is a data-compressing ADDRESSABLE STORAGE FOR DISTRIBUTED Hui, University of Toronto; Robert W. FILE SYSTEMS method that represents an object using Wisniewski, Dilma Da Silva, Orran Niraj Tolia, Mahadev Satyanarayanan, its difference relative to a similar object. Krieger, Marc Auslander, Michal Carnegie Mellon University and Intel In this talk, Fred Douglis showed how to Ostrowski, Bryan Rosenburg, and Jimi Research Pittsburgh; Michael Kozuch, use delta-encoding to compress a set of Xenidis, IBM T.J. Watson Research Brad Karp, Intel Research Pittsburgh; files without specific knowledge of these Center Thomas Bressoud, Denison University files in advance. Unlike previous and Intel Research Pittsburgh; Adrian Craig Soules spoke about their work to approaches, the resemblance between Perrig, Carnegie Mellon University extend and replace active OS compo- files is detected dynamically. A distributed file system nents to perform an online reconfigura- on WAN is often slow. tion. OSes are complex pieces of Niraj Tolia presented software that are required to work under their work on building a a variety of hardware resources and distributed file system, usage patterns. To overcome this prob- called CASPER, which lem of one size fits all, OSes are required exploits the readily avail- to be tuned with the right set of compo- able Content Address- nents to work well under many demand- able Storage (CAS) to ing conditions. The need to bring down cache the objects in a the system in order to reconfigure it is remote server so that the costly in terms of availability, human read traffic through time, and lost system state. The goal is to WAN can be reduced. perform such online reconfiguration with minimal overhead. Implementing Ted Hayelka, Jim Larson, and Bart Massey enjoying the File recipes, which are this in a traditional OS is difficult USENIX Reception the file content hashes because of unstructured component that describe the data boundaries, and thus it is hard to back- blocks composing the file, are computed Douglis first explained how to detect fit new code. An object-oriented OS for each file and stored in the server. resemblance between objects. The Rabin such as IBM’s K42 is a suitable fit and is When a client wants to read a large file, fingerprints are used to compute hashes used for this purpose. it first fetches the recipes of this file from of overlapping sequences of bytes in a the server. The client then uses these To support online reconfiguration, com- file. A subset of these fingerprints (i.e., recipes as keys to search the file blocks in ponent boundaries must be clearly features) are used to represent the file. CAS. If a block is found, the client defined, external references to the com- Files sharing many features would, retrieves it from CAS; otherwise, the ponent must be updated, state transfer hopefully, have similar contents. client reads it from the server. The client mechanisms must be defined between After two objects are detected as similar, finally reconstitutes the full file content the components being replaced, and delta-encoding is applied to compress using these blocks. The experimental components should be quiesced during the objects. The authors evaluated a results of CASPER showed that when reconfiguration so as to avoid state cor- number of parameters of this approach. client-server bandwidth is low, CASPER ruption. An object translation (indirec-

October 2003 ;login: USENIX ATC ‘03 ● 45 tion) table approach is used to take care application to achieve the goal. A tool CUP: CONTROLLED UPDATE PROPAGATION of external references. Components han- with an architecture-independent API IN PEER-TO-PEER NETWORKS dle their own state transfers, since the known as Dyninst is used for “hijacking” Mema Roussopoulos and Mary Baker, states of different components vary. An an application. It’s easier to hijack a Stanford University interposition phase interposes a media- dynamically linked library than a stati- Mema Roussopoulos presented work tor around an existing object that is cally linked one, which requires intro- addressing the problem of reducing being reconfigured. The mediator per- ducing the dynamic linker code into the search query latency in peer-to-peer forms the hot-swapping of the object in application’s process space. In order to (P2P) systems. Search queries are a per- three phases: forward, block, and trans- achieve a high level of transparency, the formance bottleneck in both structured fer. It uses a thread-generation mecha- system tries to automatically determine (Chord, CAN, Pastry, Tapestry) and nism to detect active calls, and the original X-server host so as to redi- unstructured (Gnutella, Freenet) P2P eventually blocks new calls and thus rect communication to the new X- systems. Recent work has used caching detects when a quiescent state is server. This is achieved by enumerating of metadata that is returned in response reached. Once new calls are blocked, the the process’s socket descriptors that are to these queries at intermediate nodes new object is hot-swapped and a state communicating with a peer port that along the path and is referred to as Path transfer is initiated. After this process, falls in the well-known X-server port Caching with Expiration (PCX). PCX the mediator is detached, the old com- range. This could fail, however, if tun- mechanisms are only a partial solution, ponent is destroyed, and all new calls neling is used, and is remedied by trying as there is no maintenance of the caches directly occur on the new component. to determine the right socket by a brute- along the path. force approach which involves connect- The performance overhead was around CUP addresses this problem by having ing with every port that the application 500 CPU cycles for the interposition asynchronous propagation of metadata is already connected to. A user could phase and 4000 cycles for the hot-swap- updates with independent local node also explicitly provide this information. ping phase on an IBM RS/6000 24 policies rather than global ones. It is Another step in the process is to find a 600Mhz Power-PC CPU-based server. independent of the underlying search suitable point in the X message stream Two benchmarks, Postmark and SDET, algorithm and uses incentive-based poli- so as not to corrupt or miss any mes- were used for the evaluation. Several cies. A node propagates an update only if sage. This is performed by walking well-known adaptive algorithms per- it has an incentive to do so. For example, through the process stack in search of X formed the online reconfiguration to a node would be willing to receive library stub functions and, if any exist, demonstrate the utility. Some of the updates as long as there are queries for a repeating the operation after a timeout. open issues include coordinated swap- particular piece of metadata, since the A limitation of this approach currently ping and recovery from broken compo- node would have to propagate the query is that it does not work with stripped nents that replace working ones. further if it did not maintain the cache statically linked binaries, for which alter- locally. Two policies are defined, proba- CHECKPOINTS OF GUI-BASED APPLICATIONS nate mechanisms are being worked out. bilistic and history-based. Separate logi- The last step is to retrieve the list of GUI Victor C. Zandy and Barton P. Miller, cal query and update channels are University of Wisconsin resources that exist on the source X- maintained and queries are coalesced. server and re-create them on the target Victor Zandy talked about transparently The number of overlay round-trip hops X-server. X-fonts present a problem migrating the GUI (X Windows) for queries to come back with answers because of lack of enough state on the belonging to unmodified applications starting from the originator is used as a application side to determine the font between hosts without premeditation, a metric for the cost of a query. Miss cost names. The current solution is to system that he calls “guievict.”GUI repli- is defined as the total number of hops retrieve all fonts and match the font cation as well as process migration is incurred by all misses. As long as the dif- geometry with the ones being used. This also possible using this system. Some of ference in miss costs between PCX and hideously slow approach will likely be the related work such as VNC or xmove CUP is larger than the overhead of solved with client-side fonts in the needs premeditation and uses proxies to update propagation, CUP recovers its future. Currently, this overhead is miti- achieve GUI migration. cost. gated in subsequent requests by caching The interesting part of the system is the font data. The Stanford Narses P2P flow-simulator use of program-editing techniques to is used to evaluate the algorithm with a URL: introduce new code into a running variety of cut-off policies, network scales http://www.cs.wisc.edu/~zandy/guievict/

46 Vol. 28, No. 5 ;login: and topologies, outgoing update capac- smarter load balancing, algorithm even override permissions found in the ity, and query arrival distributions. chaining within the kernel thread, using underlying file system (when mounted EPORTS

Based on their model, CUP recovers its the second processor as a cryptographic with VFS bypass permission). R cost by a factor of 2 to 300 under a vari- device in dual-processor systems, and NCryptfs outperformed CFS, TCFS, but ety of workloads described in detail in minimizing copying overhead. not BestCrypt in the Am-utils bench-

the paper. The various cut-off policies ONFERENCE

mark. In an I/O-intensive benchmark, C have comparable performance when the NCRYPTFS: A SECURE AND CONVENIENT CRYPTOGRAPHIC FILE SYSTEM NCryptfs was fastest. Future work ● query rates are high, while second- includes better key management, lock- chance history-based policy is better Charles P. Wright, Michael C. Martino, and Erez Zadok, Stony Brook University box mode, centralized key servers, and with low query rates. Charles Wright presented this paper by threshold secret serving. SECURITY MECHANISMS explaining motivation (providing pro- Matt Blaze was on his feet even before Summarized by Rik Farrow tection to data) and describing prior Wright finished. Blaze asked if Wright work. He mentioned that Matt Blaze’s was sure he didn’t have the CFS and THE DESIGN OF THE OPENBSD CFS suffered from network and data TCFS performance reversed? Wright CRYPTOGRAPHIC FRAMEWORK copy overhead, and I watched Blaze, just said that they fixed some problems with Angelos D. Keromytis, Columbia a few rows ahead of me, sit up a bit TCFS that had a performance impact. University; Jason L. Wright and Theo de straighter at that statement. Wright also Raadt, OpenBSD Project All three crypto file systems were using mentioned TCFS, Microsoft’s EFS Blowfish, but had different overhead Angelos Keromytis described the ratio- (which does not work over a network), outside of encryption. nale and mechanisms behind the API Cryptfs, a proof of concept by the same created within OpenBSD, and adopted authors, BestCrypt (commercial), and Marc Stavely asked, Why not use under- by FreeBSD and NetBSD, for integrating StegFS. lying file system checks? Wright said that hardware cryptographic devices. The there are no ACLs on lower-level file sys- basic concept was to provide an abstrac- Design goals include strong encryption; tems, and that their ACLs provide the tion layer so that programs and kernel convenience for users, system adminis- key needed to read/write files. routines could ignore the differences trators, and programmers; and perform- between devices, or even the lack of a ance. NCryptfs has three sets of players: A BINARY REWRITING DEFENSE AGAINST device. Previous implementations could system administrators, who set up STACK-BASED BUFFER OVERFLOW ATTACKS stall the CPU waiting for a device to NCryptfs through mounts but do not Manish Prasad and Tzi-cker Chiueh, respond. hold keys; owners, who hold encryption Stony Brook University keys; and others, called readers and writ- Manish Prasad explained that the goal Within the kernel, three functions han- ers, to whom access is delegated by own- was to instrument a program to defend dle creating, dispatching, and freeing a ers. All keys are stored using a long-term against stack-based buffer overflow crypto_session.The dispatch call key, and each owner uses a passphrase attacks without having access to the includes a callback parameter, so that (currently) to authenticate and access source. There are tools available, such as the crypto device can progress asynchro- his or her own keys. Etch, that can perform binary code nously. Out in user-land, /dev/crypto Borrowing from Blaze’s CFS, NCryptfs translations, but the authors are not provides a uniform entry point, con- aware of any that perform return trolled via sysctl() calls. The current ver- also uses an attach to set up encrypted directories for owners. An attach is like a address defense (RAD) based on binary sion of OpenBSD Cryptographic rewriting. Framework (OCF) is synchronous at the lightweight user-mode mount that, user level. unlike a regular mount, cannot hide files The authors use both linear instruction or directories under a mount point. disassembly and control flow analysis. Another big goal of OCF beyond trans- Only data gets encrypted, not the meta- Because Intel processors use variable- parency is performance. The new API data, so any underlying store can be length instructions, and 248 out of 256 does add overhead, but in systems with used (local disk, NFS, or CIFS). In an bytes are legitimate starting points for hardware cryptographic devices it interesting addition to this scheme, instructions, linear analysis has real improves performance because the owners may delegate their access to indi- problems distinguishing code from data. hardware devices can operate in parallel viduals and to arbitrary groups, simply The authors then follow with a second with the CPU. Future work includes by adding authorizations for several pass, starting with the code’s entry point individuals. And this delegation can

October 2003 ;login: USENIX ATC ‘03 ● 47 (as found in the program header), and ing I/O operations with significantly MULTIPROCESSOR SUPPORT FOR following all function calls and fewer system calls. It is implemented by EVENT-DRIVEN PROGRAMS branches. Even here there are problems, exposing certain elements of a connec- Nickolai Zeldovich, Stanford University; as GUI-based applications tend to use tion’s state to user-space over a piece of Alexander Yip, Frank Dabek, Frans callbacks instead of direct function calls. memory shared between kernel and Kaashoek, and Robert T. Morris, MIT; user-land. The amount of shared mem- David Mazières, New York University RAD replaces the prologue and epilogue ory required is just a small fraction of a The contribution of this paper is liba- of functions with its own instructions, typical Web server main memory (1MB, sync-smp, a library that allows event- storing the return address during the for application with 65K concurrent driven applications to take advantage of call and checking it when the function connections, 16 bytes per connection). multiprocessors by running code for returns. They instrumented several The authors implemented a user-level event handlers in parallel. Typically, Microsoft applications, and estimate select() wrapper called uselect(),which event-driven programs are structured as that they missed less than 1% of func- reads from the shared memory area a collection of callback functions which tions. When used against open source whenever possible, and calls select() a main loop calls as I/O events occur. Windows applications, such as gzip and only when the required information is However, to execute callbacks concur- Wget, they obtained similar results, but not available in user-space. rently on a multiprocessor requires run- not with Apache, which includes func- ning multiple copies of the application tions without any absolute addresses The other mechanism proposed is data- or fine-grained synchronization. (called from tables). stream splicing, which allows forwarding of data between server and client This paper maps this problem to graph- Overhead for an instrumented binary streams from within the kernel, thus coloring by allowing the programmer to varied 1–4%, and a test of an exploit reducing a lot of data-copy and context- choose a color for each callback, so that against an RAD-protected version of switching overhead. The two connec- callbacks with the same color are never winhlp32.exe worked. Prahad ended his tions (to client and to server) are spliced executed concurrently. Thus, event- last slide with a note: I’m looking for at the socket level. The novel features in driven servers can get more juice out of work! One person asked if safe lan- this mechanism are support for request a multiprocessor machine with minor guages would help. Prahad answered pipelining and persistent connections, modifications in code. As proof of con- that that is the best solution. A person content caching decoupled from client cept, they modified the SFS file server from CERT asked about how RAD han- aborts, and efficient splicing even for (about 90 lines of changed code out of a dles detected changes. Prahad said that short transfers. total of about 12,000), and observed that the program exits. the modified version on a four-CPU The proposed mechanisms were evalu- server ran 2.5 times as fast as an unmod- FAST SERVERS ated on a testbed comprising commod- ified SFS server running on one CPU. Summarized by Manish Prasad ity hardware. Experiments were done Further improvements were observed in using the Squid proxy server and bench- KERNEL SUPPORT FOR FASTER WEB PROXIES task-processing rates with thread-affin- marked using PolyGraph. The speaker ity optimizations. Marcel-Catalin Rosu and Daniela Rosu, presented CPU utilization results for dif- IBM T.J. Watson Research Center ferent request rates for a benchmark BIG DATA The paper addresses the challenges in biased toward small files, which showed Summarized by Manish Prasad Web proxy design that arise out of a best overhead reduction with a combi- large number of TCP connections. Han- nation of uselect and splicing. Splicing SENECA: REMOTE MIRRORING DONE WRITE dling a large number of network con- achieved very good results for large Minwen Ji, Alistair Veitch, and John nections causes the proxy server to incur objects. Also, the proxy hit response Wilkes, Hewlett-Packard Labs huge overheads due to context switches times showed substantial reductions Minwen Ji presented Seneca, a prototype and user/kernel data copies. The paper using a combination of uselect and remote-mirroring storage system. A pri- presents two mechanisms to ameliorate splice (10–30%). However, the miss mary contribution of the paper is a this overhead, namely, user-level con- response times reduced only by about comprehensive taxonomy of design nection tracking and data-stream splic- 0.5 to 1.3% using the same combination. choices for remote mirroring over vari- ing. ous axes: fault-coverage (component- User-level connection tracking allows an failure tolerance), degree of synchron- application to coordinate its non-block- ization (divergence), update propagation

48 Vol. 28, No. 5 ;login: and acknowledgment, and location of lows an access-based policy, which FAST, SCALABLE DISK IMAGING WITH FRISBEE data duplication. They also classify places a block into a cache when this Mike Hibler, Leigh Stoller, Jay Lepreau, EPORTS related work in the area, based on their block is accessed, so that the block that Robert Ricci, and Chad Barb, University R taxonomy. resides in the upper-level cache is also of Utah contained in the lower-level cache. Robert Ricci talked about Frisbee, a pro- The second contribution of the paper is

While this might be essential when totype disk-imaging system from Utah. ONFERENCE

the design of a robust asynchronous C upper-level caches are significantly

The paper motivates the use of raw disk ● remote-mirroring protocol that sup- smaller than the lower-level caches, it’s imaging over differential update which ports write coalescing, asynchronous not quite so important in a storage operates above the file system. Advan- propagation, and in-order delivery and cache hierarchy where storage-server tages offered by disk imaging include that provides resilience to many kinds and file-server (storage client) caches are generality across file systems, robustness and sequences of failures, low network comparable in size. An eviction-based to file-system corruption, and speed. bandwidth demands, and low (and tun- strategy places a block in the cache only Disk imaging, however, is low on band- able) data loss. The principal goal of when it is evicted from an upper level. width efficiency. Frisbee (derived from Seneca’s design is to make the data avail- “flying disks”) incorporates file-system- able and keep each copy consistent The paper uses idle distance as a metric aware , data segmenta- despite disk array failures, Seneca box for evaluating the effectiveness of a tion, and a custom application-level failures, network failures, and both tem- cache-placement policy. Idle distance is reliable multicast protocol. porary and permanent site outages. Its the period of time the block resides in levels of fallback divergence modes are the cache without being accessed. A bet- The talk proceeded with a description of time-bounded, log-space-bounded, and ter cache-placement policy will have the Emulab environment, where the unbounded. Updates are propagated lower average idle distances. The paper prototype was built and tested, which either atomically in order, asynchronous presents real-world storage-access traces, comprises a cluster of 168 commodity batched, or out of order. In Seneca, the which reveal that eviction-based strate- PCs. Being a time-shared setup (in the data duplication is done in the duplexed gies show lower average idle distances true sense of the term), every user has SAN appliance. than access-based policies. The speaker root access to all machines in the cluster. presented simulation results comparing This necessitates returning the cluster Seneca’s correctness is verified using a the two placement policies as applied in nodes to a known state with change of simulator which “approximates” the I/O combination with various cache- users. Thus a system like Frisbee turns automaton model. One of the key goals replacement strategies (LRU, frequency out to be a compelling requirement in of performance evaluation experiments based, 2Q, MQ), which showed that the an environment like Emulab. is to answer the question as to how eviction-based strategy always per- much network bandwidth can be saved Frisbee was evaluated for scalability with formed better. by delaying I/Os (asynchrony) and coa- an increasing number of cluster nodes lescing writes. Results show that sub- The work proposes a mechanism to (from 1 to 80). Experiments were also stantial reductions in WAN traffic transparently obtain eviction informa- conducted to evaluate scalability in the (5–40%) were observed for a batch size tion from client buffer caches, which is presence of various percentages of of 30 seconds. nontrivial because a client buffer cache packet loss. Frisbee has been in produc- always silently evicts a clean page and tion use for over 18 months. Speed was EVICTION-BASED CACHE PLACEMENT FOR only writes out dirty pages to the back- reported as the single most attractive STORAGE CACHES end storage system. The main idea is to feature of Frisbee, being able to write Zhifeng Chen and Yuanyuan Zhou, Uni- maintain a mapping between buffer 50GB of data to 80 disks in 34 seconds. versity of Illinois at Urbana-Champaign; addresses and the disk block that it On this note Ricci quoted colleague Kai Li, Princeton University houses, so that by intercepting read/ Mike Hibler: “Earlier I could go for The work addresses the problem of write I/O operations, any changes lunch while the disk reloads, but now I cache placement (not replacement) in detected in the mapping can be attrib- can’t even make it to the bathroom!” the context of buffer cache management uted to the block being evicted from the for lower-level caches (resident on the cache. Evaluation on real systems back-end storage server) in a multi-level showed a 22% improvement in cache hit storage hierarchy (client-file server-disk ratios and a 20% improvement in the array). Cache placement typically fol- transaction rate.

October 2003 ;login: USENIX ATC ‘03 ● 49 FREENIX TRACK and distributed anti-spam networks. involves extracting information from the Each of these methods has many draw- header and body of the message. For NETWORK SERVICES backs, however, including high percent- example, SpamAssassin uses hand-coded IMPLEMENTATION OF A MODERN WEB ages of false negatives. ASK proposes a features with linear weights combined SEARCH ENGINE CLUSTER new solution to the problem by using a with threshold values. More sophisti- Maxim Lifantsev and Tzi-cker Chiueh, challenge-authentication model. It cated techniques include information Stony Brook University applies authentication to the email with- gain and clustering. The simplest format Summarized by William Acosta out interpreting the content and focuses of features is binary, works on all algo- The authors presented the design and on validating the senders rather than the rithms, and is robust. There are a wide analysis of Yuntis, a prototype for a scal- message; this approach is highly effec- variety of machine-learning techniques. able and extensible Web search engine. tive, since spammers use fake addresses The authors chose techniques that were The design of Yuntis attempts to provide to hide their identity. The system con- simple, popular, and educational. The performance by using an event-driven sists of three lists: white, ignore, and general description of machine learning model with one main thread of execu- black. When a message is received into a involves gathering a large number of tion and “select”-like functionality to waiting queue, a confirmation is sent to training instances, measuring statistical avoid the context-switching overhead of the sender, and once a confirmation is properties with respect to a feature set, a multi-threaded design. Similarly, Yun- received the message is moved into the classifying target instances, and, finally, tis employs custom disk data storage and in-box and the of the measuring the accuracy of the classifica- intra-cluster communication mecha- sender is added to the white list; there- tion. nisms. after, a confirmation from that sender Bart Massey described five different will not be required. The overall process of creating the techniques for mail classification, focus- search database involves crawling the There are a few specific issues that the ing on only a few. The most simple is the network to retrieve documents, which software deals with. In order to catch minimal Hamming Distance Voting. are compressed and stored upon suc- spammers who impersonate the owner, Here a message is compared against the cessful retrieval. A page quality score is a “mailkey” can be added to one’s own messages in the corpus to determine its iteratively computed, keywords are mail. To avoid mail loops, the software classification; the classification time for extracted, and an index is created of the keeps the last few emails (the number is this method is huge and it is necessary extracted phrases. The prototype imple- variable) in a FIFO queue and checks to have a large variety in the data. The mentation consists of 12 cluster nodes whether a message occurs more than simplest neural net is a single neuron and a data set of 4 million crawled once. It is also possible to send com- call perceptron, but a more complex pages, with statistics and information mands to configure the queue remotely. neural net works even better. Basically, stored in 121 separate data tables. The A limitation of the software is dealing features are given as input and the net- content is partitioned over the entire with bounces: The software cannot dis- work’s output provides the classification. cluster. Future work includes workload tinguish between real and spam bounces There is a need for a representative data balancing of the cluster, scalability without accessing the MTA. Another set. Tests were run on different corpora, improvements, and fault-tolerance issue that needs to be addressed is that personal messages, and synthetic data. mechanisms. simple challenges can be defeated with a There were enough different characteris- smart auto-responder. tics to make a difference. The complex URL: http://www.ecsl.cs.sunysb.edu/ neural network seemed to have the best yuntis/ LEARNING SPAM: SIMPLE TECHNIQUES FOR results, with <1% false positives and FREELY AVAILABLE SOFTWARE 1–3% false negatives. Even humans can- MAIL Bart Massey, Raya Budrevich, Scott not achieve 0% misclassification rates. Summarized by Raya Budrevich Long, Mick Thomure, Portland State Machine learning is a key aspect of the University ASK: ACTIVE SPAM KILLER filtering. Marco Paganini Today there are many approaches to the problem of spam, including black/white Currently, there are three main preven- lists, laws, and automated filtering, tion approaches to the problem of which includes feature recognition and unwanted commercial email: real-time classification; these methods can all black hole lists, keyword identification, work together. Feature detection

50 Vol. 28, No. 5 ;login: NETWORK PROTOCOLS tion, rsync would be a good tool for syn- ing system, where a hashtable had been Summarized by Benjamin A. Schmit chronization with mobile devices. With designed too small for today’s require- EPORTS Network Programming for the approach presented, the update of a ments. R file can be done in place, without the the Rest of Us After these problems were solved, the creation of a temporary work copy. Glyph Lefkowitz, Twisted Matrix Labs; authors improved the NFS server per-

Itamar Shtull-Trauring, Zoteca ONFERENCE

A problem that needed to be solved, formance by introducing cursors, which C

Itamar Shtull-Trauring started the first however, was that data necessary for made the server able to recognize several ● talk by comparing network program- later rsync commands should not be concurrent file reads. On a heavily ming to driving a car: An automatic overwritten by earlier ones (as he loaded NFS server, these optimizations transmission frees the driver from hav- explained, rsync does its work by trying speed up file requests by a factor of up ing to make low-level choices, at the to find blocks from the old file some- to 2.4. possible cost of some performance. Sim- where in the new file). The solution here ilarly, programmers who want to do is the creation of a dependency graph; BIOS AND VIRTUAL DEVICES networking without caring for peak per- though it might contain cycles, these are Summarized by Shashi Guruprasad formance should use a networking broken up by retransmitting data over- FLEXIBILITY IN ROM: A STACKABLE OPEN toolkit. written by earlier commands, for which SOURCE BIOS two algorithms are implemented. The main design goal for the Twisted Adam Agnew, Adam Sulmicki, William networking framework was providing a The benchmarks, which update files typ- Arbaugh, University of Maryland at cross-platform solution that still allows ically used on handheld computers, College Park; Ronald Minnich, Los access to platform-specific features. It is show that there is very little difference in Alamos National Labs written in Python, a high-level language performance as compared to the original This won the Best Student Paper award. that helps beginners avoid common pro- rsync implementation. The new algo- Adam Agnew presented the first open gramming errors by providing, for rithm, however, needs more main mem- source PC BIOS that could boot any example, automatic boundary checking. ory for calculating the dependency modern OS. They leveraged earlier work The framework is built around an event graph. This problem can be solved by in Linux BIOS, which could only boot loop and allows programmers to easily introducing windowing, which is not yet Linux. Unlike Linux, some of the OSes add even complex services such as a Web implemented. rely on services such as video, hard server to their programs. drive, memory sizing, and a PCI table NFS TRICKS AND BENCHMARKING TRAPS provided by legacy BIOSes. Therefore In order to show how the Twisted Daniel Ellard and Margo Seltzer, Linux BIOS could not directly boot framework can help a programmer Harvard University these OSes. Legacy BIOSes do a poor job speed up development, Conch, an SSH The initial research goal of this paper of setting up a PC, requiring modern application, was implemented. The was to improve the performance of an OSes such as Linux to redo some of the implementation contains 5000 lines of NFS server by optimizing its read-ahead tasks, increasing bootstrap latency. The code written by a single person, as com- caching strategy. Daniel Ellard pointed advantages of using Linux BIOS are pared to 64,000 lines by 84 people in the out that 5–10% of the NFS requests numerous: (1) dramatic increase in boot case of OpenSSH. The framework is arrive at the server in the wrong order, speed – for example, they have achieved available for download from http://www. which could hinder it from doing read- a record of a three-second boot – the twistedmatrix.com/ and was released ahead properly. main bottleneck is the time required to under the LGPL. bring the IDE hard drive spin to normal The benchmarks designed to measure operational speed; (2) small size – an IN-PLACE RSYNC: FILE SYNCHRONIZATION performance improvements showed an image size of 36KB uncompressed offers FOR MOBILE AND WIRELESS DEVICES unusually large amount of variance. scope for more powerful tools to be part David Rasch and Randal Burns, Johns This was caused mainly by the fact that of BIOS; (3) open source – it’s easily Hopkins University modern hard disks have different data debuggable and royalty free and saves David Rasch identified the space require- transmission rates at different physical motherboard manufacturers $3–$5 per ment of rsync as a major problem for areas. New benchmarks that took this PC in royalties; (4) remote administra- mobile and wireless devices with little into account still showed problems, this tion is possible via console support. storage capacity. Without this restric- time located within the FreeBSD operat-

October 2003 ;login: USENIX ATC ‘03 ● 51 The proposed solution employs a com- in supporting denser clusters that have better BIOS integration, serial port emu- bination of Linux BIOS and Bochs x86 more machines per rack where it is nec- lation, and frame-buffer emulation for emulator, using a custom wrapper func- essary to support only the most essential OSes that do not support character- tionality known as Adhesive Loader of the I/O interfaces for reducing both based consoles. (ADLO), whixh doesn’t require the space requirements and the heat dissipa- Bochs emulator to be modified. The tion of each machine. An example of IMPLEMENTING CLONABLE NETWORK STACKS Bochs emulator has excellent PC BIOS such a cluster that researchers in the IN THE FREEBSD KERNEL interrupt support, and using an existing IBM Austin lab built is known as a Marko Zec, University of Zagreb mechanism avoided maintaining more superdense server with 200–300 x86 Virtual hosting, network simulation, and software. Together, these provide the servers in a 42U rack. These servers advanced VPN provisioning require missing legacy BIOS support except for already support an Ethernet interface support for multiple protocol stacks on the Video BIOS, which is extracted from and have no room for serial ports on the the same physical host. This paper the Video ROM and packaged along board. focuses on the design, implementation, with the above to complete the BIOS. and performance of an experimental Two different solutions for console over One of the advantages of such a solution clonable network stack in the FreeBSD Ethernet were developed. The first solu- is that different components can be kernel. By separate stacks, the author tion, which is a TCP/IP-based Linux stacked, leading to different configura- means independent states – routing console named “etherconsole,” uses a tions, of bootloaders, for example, or tables, interfaces, and firewall rules – kernel interface to the socket library to more sophisticated tools in the BIOS, rather than independent protocol code. send and receive console messages such as console support. In other words, this work does not sup- instead of performing low-level device port the creation of a separate protocol Possible future work was discussed, operations on the serial device. A major stack that is not already supported by which included TCPA, console over downside of this approach is that con- the OS. Virtualizing a network stack for other devices such as USB, authenticated sole support is possible only after the above applications is a more light- booting, virtual machine/virtual TCP/IP initialization. Problems that weight alternative to a traditional virtual machine monitor support in BIOS, occur before this phase will not be machine. transparent encrypted storage, and reported on the console and, therefore, transparent backup and intrusion detec- debugging such problems is difficult. A The key design goals were: API/ABI tion that cannot be turned off. There second solution involves the use of link (Application Binary Interface) compati- was also a demonstration of a quick- layer networking, that is, the sending of bility and low or negligible performance booting Windows 2000 OS. During the console messages using raw Ethernet overhead. A naïve approach to imple- Q&A, someone asked about menu-based frames with a special Ethernet type and ment clonable stacks is to add an array configuration support. The answer was a broadcast Ethernet address. At the dimension to the kernel networking data that currently no configuration support receiving end, a program captures these structures. Such an approach, however, is present and it is necessary to re-flash. frames and provides the console output. increases the amount of kernel code A combined approach was then used for modification. Instead, the approach used URLs: console support: the link-layer approach was to group the kernel data structures http://www.missl.cs.umd.edu/ until DHCP is performed, and the that maintain the state of the network http://Linuxbios.org/ TCP/IP approach afterwards. stack together and provide access http://bochs.sourceforge.net/ through an additional level of indirec- Security issues were discussed. The pos- tion. This grouping is termed a “virtual CONSOLE OVER ETHERNET sibility of break-ins and denial-of-ser- image” and has an instance of network Mike Kistler, Eric van Hensbergen, and vice is increased when the console is stack associated with it. Virtual images Freeman Rawson, IBM Austin Research available over the network. The sug- Laboratory are organized in a hierarchy similar to gested solutions included a separate pri- the UNIX process hierarchy that helps in Eric van Hensbergen talked about con- vate network for console and switch sole support over an Ethernet device. binding unmodified programs to differ- VLAN support. The latter does not ent network stacks. Traditional forms of console support are address denial-of-service. This was inte- through the serial devices that are aggre- grated with Linux BIOS into custom There was also a discussion on the gated through KVM switches in clusters. firmware at IBM and allowed full system implementation of CPU load and usage The main problem with serial console is administration and debugging. Some accounting/limiting per virtual image, future areas of research in this area were

52 Vol. 28, No. 5 ;login: which helps in resource management behaves when things go wrong, like over any existing mechanism of data and avoids receive livelock. The per- restarting an out-of-date SE or the fail- exchange (NFS, HTTP, FTP) and can EPORTS formance degradation is hardly notice- ure of an HE (manual failover), and also leverage existing IPSec infrastructure for R able, around 3.5% lower than an how the HE ensures that a quorum of secure client-file server communication. unmodified kernel where the test system SEs are in sync when one SE happens to The experimental platform comprised a

had one active network stack among 128 be slower than the other. ONFERENCE

set of machines with modest hardware C stack instances. In some tests, the per- Finally, the author presented some inter- running OpenBSD. Measurements using ● formance improved slightly (5.7%) due esting experimental results for read and Bonnie micro-benchmark showed that to better locality of kernel network stack write availability against various quo- write throughput was comparable to variables which led to higher CPU cache rum sizes and arrived at a quorum size that of NFSv2, although read through- hits. of two and a set of three replicating SEs put was observed to be lower than NFS. URL: http://www.tel.fer.hr/zec/BSD/ as a recommended configuration to The Postmark benchmark, representing vimage/index.html achieve good read and write availability. a heavy workload of small files, was used He concluded that SEs not in quorum for versions of DisCFS without any FILE SYSTEMS could be connected over the Internet, security and with and without credential Summarized by Manish Prasad thus eliminating the need for a dedi- caching. cated high-bandwidth low-latency link. STARFISH: HIGHLY AVAILABLE BLOCK The talk concluded with a discussion of STORAGE SECURE AND FLEXIBLE GLOBAL FILE SHARING future work, the most noteworthy con- Eran Gabber, Jeff Fellin, Michael Flaster, Stefan Miltchev, Jonathan M. Smith, cerning implementation of access con- Fengrui Gu, Bruce Hillyer, Wee Teck Sotiris Ioannidis, University of Pennsyl- trol beyond permission bits and Ng, Banu Özden, and Elizabeth Shriver, vania; Vassilis Prevelakis, Drexel Uni- avoiding the use of inode numbers as Lucent Technologies, Bell Labs versity; John Ioannidis, AT&T Labs – file handles because of security holes Michael Flaster presented this FREENIX Research; Angelos D. Keromytis, created by inode number reuse. award paper. The principal contribution Columbia University was the dissemination of a replicated Stefan Miltchev presented this work on THE CRYPTOGRAPHIC DISK DRIVER block storage system to the open source authentication in network file systems. Roland C. Dowdeswell, NetBSD Project; community, built from commodity The paper presents Distributed Creden- John Ioannidis, AT&T Labs – Research servers running FreeBSD, connected by tial FileSystem (DisCFS), which uses Dowdeswell presented the Crypto- standard high-speed IP networking gear. trust management credentials to Graphic Disk Driver (CGD), “a pseudo- StarFish is not a distributed file system. “directly authorize actions rather than device driver that sits below the buffer It assumes a single-owner scenario and divide the authorization task into cache and provides an encrypted view of thus doesn’t deal with issues resulting authentication and access control” and an underlying raw partition.”CGD tar- from multiple writers. “to identify files being stored; users; and gets the problems arising from laptops and other portables “growing legs” and StarFish architecture comprises a com- conditions under which their file access vanishing from public places! Here, modity server with an appropriate SCSI is allowed.”DisCFS is intended to “protection of data from other concur- or FC controller, called host element address the weaknesses of existing file rent users is not essential, but protection (HE), which helps the client host access access control mechanisms, which are against loss or theft is important.” data from a set of storage elements (SEs) either too coarse-grained (e.g., Web, that talk to the HE using TCP/IP.An SE FTP) or are unsuitable for use across Due to its positioning in the I/O stack forms a single unit of replication. The administrative domains (e.g., NFS). (just above the disk driver), it is com- SEs could be connected to the HE via DisCFS uses a direct binding between a pletely transparent to file systems. Some either a dedicated link or the Internet. public key and a set of authorizations. A of the goals of the work are to maintain Writes to StarFish are considered to be user can delegate trust to another, which good performance while providing ade- complete on receiving ACKs from all the results in creation of a chain of trust, quate security, ease of use, and seamless SEs that form the required quorum. similar to systems like SPKI. Only a sub- integration into the OS release. The in- kernel driver supports an ioctl interface The presenter projected steadfast relia- set of privileges may be delegated to to attach CGD to an underlying disk bility as the unique selling point of another user, making privilege escalation device or partition and configure the StarFish and presented in detail how it impossible. DisCFS can be implemented

October 2003 ;login: USENIX ATC ‘03 ● 53 required parameters such as encryption expressions were used to recognize the to understand the raw packet data. The algorithm, key length, IV method, etc. strings of grid numbers that defined a usefulness of this tool is difficult to They define a modular framework for stroke. An additional problem is that if overemphasize; it allows the network adding cryptographic algorithms. the device is held tilted, strokes will also trace to be examined in detail and prob- be tilted. The solution to this is that the lem areas to be picked out easily. The driver was evaluated on DEC Per- common horizontal strokes (backspace sonal Workstation 500a, Pentium The LBX proxy uses application-specific and space) were used to reorient the grid 4–based PC, and an IBM Thinkpad compression to reduce the size of when one was recognized. One nice fea- 600E. Results were presented for read- requests. In general, LBX was found to ture is that tilting the device enough to and-write throughput for various block be of very little utility in a modern set- cause incorrect recognition produces sizes, comparing various encryption ting: some of the requests it can com- garbage characters, alerting the user to algorithms – Blowfish, AES, and 3DES – press are obsolete, plus bandwidth for X correct the problem with backspace. against unencrypted I/O. As expected, requests is now a problem only for large More information and software are encrypted I/O throughput edges closer image files. SSH’s gzip compression, in available at http://www.xstroke.org. to raw I/O with the increase in block fact, was found to be superior in almost

size. The talk concluded with a discus- MATCHBOX: WINDOW MANAGEMENT NOT all cases. sion of future work, which included FOR THE DESKTOP In general, latency dominates band- addressing the problem of key revoca- Matthew Allum, OpenedHand Ltd. width, and most of the latency comes tion and leveraging hardware crypto- Matchbox is an X Window manager from synchronous requests. Many of graphic accelerators to achieve better designed for small devices that have low these can be eliminated by batching performance. memory, no keyboard, and limited CPU them, as with internAtom requests power. Matchbox is designed to be (already done by some toolkits). Finally, X WINDOW SYSTEM small, fast, flexible, and configurable. It restructuring applications can reduce Summarized by James Nugent has some restrictions specific to its envi- the effects of latency. Image files were XSTROKE: FULL-SCREEN GESTURE ronment: Only a single full-screen app is found to be the dominant reason for RECOGNITION FOR X supported at a time, and applications bandwidth usage. It was suggested that Carl D. Worth, University of Southern cannot resize windows or make windows using original image formats, which are California larger than the screen. A toolbar/virtual usually compressed, would be helpful. keyboard is always visible. The goal of Xstroke is to provide full- Client-side fonts were also studied; they screen gesture recognition for X Win- Matchbox’s focus is on providing win- had the effect of significantly reducing dows. A gesture is recognized and is dow management that is also compliant the round trips and, hence, application passed to the system as a keystroke or a with standards, most notably the ICCCM startup time. Bandwidth usage is about series of keystrokes. The focus is on standards. Matchbox is themeable, has the same, although compressing the handheld devices, so it is important that runtime and compile-time configura- glyphs for transport would reduce this. the entire screen can be used for recog- tion options, and is extensible via plug- nition, unlike some systems that have a ins. It also has a PDA-style app launcher. EXPERIENCES special “gesture area.”The size and loca- Summarized by Raya Budrevich tion of the gesture are also important. X WINDOW SYSTEM NETWORK Finally, the gesture “draws” on the screen PERFORMANCE BUILDING A WIRELESS COMMUNITY NET- with a transparent color, thus allowing Keith Packard and James Gettys, WORK IN THE NETHERLANDS the gesture to be seen as it is made, but Cambridge Research Laboratory, HP Rudi van Drunen, Dirk-Willem van not to obscure data beneath. Labs Gulik, Jasper Koolhaas, Huub Schuurmans, and Marten Vijn, Wireless One difficulty is toggling gesture recog- Several X clients were set up to talk to an X server attached to a passive packet- Leiden Foundation nition on and off. Several approaches The purpose is to provide free wireless were discussed, but none was ideal. The monitoring tool and then to low-band- width X (LBX) and SSH proxies with a access in historical cities in the Nether- Recognizer is a 3x3 grid based on the lands. The technology uses 802.11b with bounding box of the stroke. A grid has NISTNet router. (NISTNet is capable of producing a variety of delay/bandwidth- three available concurrent channels the problem that similar strokes may without interference. The network is on look different near corners; thus regular limited network conditions.) The xplot performance visualization tool was used ISM band, shared with ham radio,

54 Vol. 28, No. 5 ;login: microwave, and vehicle ID. It is an IPv4 for the pointers and sign the revision with a portable application-specific col- network with a private address space, records, we get end-to-end integrity and lection strategy. EPORTS routing by OSPF, with ISC-DHCP and auditability. SSL is used as the authenti- R ISC-DNS; the network is transparent to cation technology. Because of the per- FREE SOFTWARE AND HIGH-POWER the user and provides no user-level secu- missions setup of the system, anony- ROCKETRY: THE PORTLAND STATE AEROSPACE SOCIETY

rity. mous access is easy and convenient. ONFERENCE

James Perkins, Andrew Greenberg, C Currently, there is a 20km2 outdoor cov- Currently authentication and integrity Jamey Sharp, David Cassard, and Bart ● erage for the city of Leiden, 20+ nodes, work well. During the last two years Massey, Portland State University 300 private users, and three proxies to there have been only three breaches. The The research group consists of under- connect to the Internet; many local hazards include OpenSSL bugs yielding graduate and graduate students in sci- schools and libraries are also connected. an unreliable transport. Boehm-GC ence and engineering at Portland State There is a need to find free locations for flaws yield leaked memory and regular University, people in local industry, and the antennas, since there is no funding server hangs. local aerospace enthusiasts. The current to pay rental fees. At every site the Several decisions, during implementa- model reached 3.6km; the next-genera- strength of the other nodes and the tion, that seemed plausible led to errors tion model will leave the atmosphere. interference are measured. Network sim- later on: The goal is to put nanosatellites in orbit. ulation software is used to determine the construction of all sites. 1. Using a texty format for debugging – The active guidance computer on the This cost 30% more space and didn’t rocket determines the rocket’s current A node consists of multiple antennas compress, and file systems didn’t respect position, heading, and course; the com- with donated PCs or embedded systems. cases. Therefore one should plan ahead puter then steers the rocket to keep it on The systems run FreeBSD because it is for binary format. course. In order to determine this infor- stable, single distribution, and tagged mation the computer uses several sen- release. Industry-standard wireless cards 2. Server-side integrity checks – Files sors: GPS, inertial measurement unit are used. might be damaged when they reach the (IMU), magnetometers, and pressure server. The server needs to serialize/ The is available freely and optical sensors. deserialize, which leads to the server to all inhabitants of Leiden. The Wireless knowing the object schema. The launch tower is used to launch the Leiden Foundation is a nonprofit organ- rocket, view rocket status, and send ization that tries to create strong con- 3. Build a simple file-based store first – emergency commands to the rocket. The nections with schools and universities. One file per object was stored, using gzip on-board system includes the flight soft- The network has many uses, such as P2P, to compress, but in switching to binary ware avionics firmware. gaming, and VPN. On any given day, format, it was discovered that 60% of about 100 users are connected. the files were less than 500 bytes. It was difficult to decide which embed- ded operating system to use. There were Community acceptance, project man- 4. Build an event-driven server – Used several candidates but RedHat’s eCos agement, and interference were the main non-blocking I/O and an event loop, was selected. It has all of the needed cri- problems facing the project. which simplified the code and the server teria: RT, POSIX, free for use, small. but caused problems with SSL. OPENCM: EARLY EXPERIENCES AND LESSONS Commercial OEM GPS boards don’t LEARNED 5. Use GC and exception handling – work because of their software limita- Jonathan S. Shapiro, John Vanderburgh, This was well engineered for memory tions in determining acceleration veloc- and Jack Lloyd, Johns Hopkins management but has many platform ity and altitude. GPL-licensed firmware University dependencies, leaks memory in large for GPS receivers uses eCos. Differential OpenCM is a new configuration-man- objects, and doesn’t work on OpenSSL. GPS base station and receiver provide agement system that uses cryptographic To fix GC and exceptions a new manage- precision timing and altitude determina- authentication and high integrity. The ment, GC_SCOPEs, is introduced. Cur- tions. architecture uses cryptographic naming, rently, schema issues described in the Future work includes creating an and the content of a configuration file is paper are all fixed in the development enhanced inertial measurement unit, a DAG; every revision is a pointer to the branch. There’s still a need to modify developing next-generation navigation root of the DAG. If we use crypto hashes storage format and replace Boehm-GC algorithms, investigating loose coupling

October 2003 ;login: USENIX ATC ‘03 ● 55 of IMU and GPS, GPS aiding of the support relinquishing access privileges. For MAC, the FreeBSD kernel has been IMU unit, deep coupling of IMU/GPS This is no problem as long as the appli- extended by additional events. MAC- and other sensors, and adding a steer- cations do not contain any bugs (which aware applications register their interest able hybrid motor. could lead to root privileges for any local in some events, then get these events, user). and possibly de-register afterward. PRIVILEGE MANAGEMENT Access policies are divided into adapta- Privman is implemented as a user-space tions of traditional access policies, tradi- Summarized by Benjamin A. Schmit library which mirrors some calls to the tional MAC policies, and new ones like a POSIX ACCESS CONTROL LISTS ON LINUX libc as well as some system calls. A pro- port of the SELinux FLASK policy to gram linked to the library starts by fork- Andreas Gruenbacher, SuSE Linux AG FreeBSD. The POSIX.1 access permission model is ing into a privileged server and a client not always sufficient, especially when that gives up all its access privileges. The main benefit of MAC is that the interacting with Windows clients via When the client needs to make a privi- source code does not have to be modi- SAMBA. The abandoned standard leged call, it contacts the server on a fied to use it. The implementation still POSIX.1e (the last draft, 17, is available pipe. The server than decides, based on a lacks support for multi-threaded appli- to the public) is taken as the basis of policy file, whether the access should be cations and multi-threaded kernel most ACL implementations. granted. threads. Also, object labeling within the kernel currently decreases performance, The main advantage to Privman is that In the Linux implementation, which has which could be solved by introducing a few changes need to be added to the become an official part of the 2.5 kernel cache. version, access permissions (read, write, source code. The overhead induced by the wrapper calls can be huge for a sin- and execute) can be granted to the KERNEL gle system call, but because these calls owner, named users, the owning group, Summarized by Francis Manoj David named groups, and others. A mask is occur rarely, the performance of a typi- used to retain backward compatibility cal application decreases by only about USING READ-COPY-UPDATE TECHNIQUES for non-ACL-aware applications. 5%. Privman has been used to adapt FOR SYSTEM V IPC IN THE LINUX 2.5 Default permissions make it possible to OpenSSH and wu-ftpd for privilege KERNEL inherit permissions that were set at management. Andrea Arcangeli, SuSE; Mingming directory level. Cao, Paul McKenney, and Dipankar THE TRUSTEDBSD MAC FRAMEWORK: Sarma, IBM The Linux ACL implementation can be EXTENSIBLE KERNEL ACCESS CONTROL FOR Locking and synchronization are used for ACL-aware network protocols, FREEBSD 5.0 extremely important tools in multi- the most important ones being NFS Robert Watson, Wayne Morrison, and threaded environments. Atomic incre- (only version 4) and CIFS (formerly Chris Vance, Network Associates ment, the key to locking, is possible on SMB). Since these two protocols imple- Laboratories; Brian Feldman, FreeBSD current CPUs. However, newer proces- Project ment ACLs in a different way, a mapping sors (like the Pentium 4) consume sig- needs to be done here. The Linux imple- In his dense talk, Robert Watson nificantly more cycles than older mentation does not support granting explained the implementation of the processors during a single atomic incre- somebody other than the owner the Mandatory Access Control system ment. This is because the whole pipeline ability to change permissions. within FreeBSD 5.x. The MAC frame- needs to be flushed, and this is an work is implemented partly within the expensive operation. For data that is PRIVMAN: A LIBRARY FOR PARTITIONING kernel, partly in user-space. Common read-only, paying this penalty is unrea- APPLICATIONS applications require more than just the sonable. Douglas Kilpatrick, Network Associates standard UNIX security mechanisms. Laboratories Operating system support for access Read-Copy-Update (RCU) is a tech- In this talk, Douglas Kilpatrick described control is important; without it, applica- nique that “allows lock-free read-only the Privman privilege management tions would have to cope with several access to data structures that are concur- library. Some UNIX applications make different security mechanisms, the result rently modified on SMP systems.” To use of privileges that normal users can’t being mostly an intersection of the indi- illustrate RCU concepts, let us look at an acquire and, therefore, run with root vidual permissions. example. In a linked list, deletion of an privileges because UNIX does not yet element and traversal of the list cannot happen simultaneously. This is because

56 Vol. 28, No. 5 ;login: the traversal code might reference a some primitive like test and set or incre- PROVIDING A LINUX API ON THE SCALABLE pointer that is freed by the deletion ment a variable (e.g., count++ in C). In K42 KERNEL EPORTS code. The RCU-based solution is to order to guarantee that this code is Jonathan Appavoo, University of R defer the freeing of the pointer until it is called atomically, a user thread should Toronto; Marc Auslander, Dilma Da certain that no other code is traversing not be preempted in the middle of this Silva, David Edelsohn, Orran Krieger, Michal Ostrowski, Bryan Rosenburg,

the list. The paper references a couple of code. If it does get preempted, then the ONFERENCE more efficient solutions to this problem. code needs to be restarted. This restart- Robert W. Wisniewski, and Jimi Xeni- C ● ing would ensure that the operation is dis, IBM T.J. Watson Research Center The authors have successfully applied performed atomically. K42 is a new research OS kernel RCU techniques for data designed to be highly scalable and exten- structures in the Linux 2.5 kernel. There RAS has been implemented for the sible and written in object-oriented is a dynamic array called ipc_ids that NetBSD OS. The user thread registers style. It also supports online reconfigu- needs to be traversed for all semaphore the entry and exit points of the RAS ration of kernel services. K42 pushes a operations. This array is protected by a with the kernel. Whenever there is a lot of usual kernel functionality into global lock. This design prevents sema- context switch, the kernel checks to see user-space, enabling efficient IPC. Also, phore operations from proceeding in whether the execution was in the middle global data and locks are avoided in parallel. Using RCU, the global lock was of the RAS, and if it was, the RAS is order to improve performance. eliminated and deferred deletion of restarted when the thread returns. Thus, was implemented. The total atomic operations are provided. Also, The motivation for this work was to run number of new lines of code added to one cannot write arbitrary code for RAS all standard Linux programs on K42, the semaphore implementation was only and expect things to work correctly. An thus increasing the number of applica- 151. Micro-benchmarks show very good RAS should have the following proper- tions available to those working with results. In the future, the authors plan to ties. It should have a single entry and K42. This paper describes the challenges use RCU to optimize more code in the exit point, should not modify shared that the authors faced in providing the kernel. The current code is available data, and should not invoke any func- Linux API on K42. Linux processes were under the GPL license. tions. This ensures that, on restarting the supported by mapping them to K42 sequence, the thread is restored to a cor- processes. A K42 process called the URLs: rect state. Thus, the responsibility for ProcessLinuxServer maintains these http://www.rdrop.com/users/paulmck getting things to work correctly is shared relationships. Fork was implemented by http://sourceforge.net/projects/lse between the kernel and the user thread. placing most of the code in user-space. The file descriptor table was lazily repli- AN IMPLEMENTATION OF USER-LEVEL The user interface to the system is simi- cated for performance reasons. For sig- RESTARTABLE ATOMIC SEQUENCES ON THE lar to that provided by mmap. RASes are nals, all state was kept in the client. To NETBSD OPERATING SYSTEM registered with the kernel. The kernel provide a Linux environment, a new ver- Gregory McGarry checks for such sequences in the sion of glibc, targeted toward K42, was A Restartable Atomic Sequence (RAS) is cpu_switch function and restarts compiled. Exec was also implemented a mechanism used to efficiently imple- sequences if necessary. Performance in user-space. This involves an unload ment atomic operations on uniprocessor studies show that this approach is better operation followed by a reload opera- systems. It was designed to provide than the syscall approach. RAS is cur- tion. Namespace resolution for files was atomic operations on processors that rently transparently used in the NetBSD also implemented in user-space. The don’t provide them. On processors that pthread library for uniprocessor entire Linux emulation layer was pro- do provide them, RAS increases proces- machines. vided by a modified Linux kernel, with sor performance. K42 as the target architecture. Currently, System calls are one option when it the project has a fully functional imple- comes to performing atomic operations mentation for 64-bit architectures and is for user threads. This emulates memory- available for download under the LGPL interlocked instructions. However, this license. comes with a large overhead. RAS is a URL: http://www.research.ibm.com/K42/ better solution to this problem. An RAS is basically some code that provides

October 2003 ;login: USENIX ATC ‘03 ● 57 15th Annual Computer ■ IDS: IDS logs only known suspi- Active probing (port scanning), though Security Incident Handling cious traffic, but we often need useful during investigations, may tip off (FIRST) Conference complete logs. Also, some kinds of an intruder that you have noticed them. suspicious packets cannot be Make sure you have the authority to OTTAWA, CANADA logged: Out-of-sequence FINs can scan any machine you want to probe. JUNE 22–27, 2003 “turn off” IDS monitoring for a Some useful tools are: Summarized by Anne Bennett connection, and too many frag- The Forum of Incident Response and ments can overwhelm an IDS. ■ nmap: finds open ports, O/S finger- Security Teams (FIRST) is a global ■ System forensics: A compromised printing. organization whose aim is to facilitate system has often had its data ■ nc: text communication with tcp the sharing of security-related informa- altered, attempts to fix the problem port. tion and to foster cooperation in the may have destroyed data, and the ■ pnscan: a massively parallel scan- effective prevention and detection of, intruder may not have written any- ner, can grab banners, even pass text and recovery from, computer security thing to disk. to the port, but can sometimes miss incidents. It holds several technical col- ■ Sniffers: Often it is too late to turn hosts because of timeout problems. loquia each year open to members only, on the sniffer – the problem has ■ rpcinfo: identifies RPC services and one annual conference which is already happened. Also, because the running. open to all. entire packet is logged instead of ■ smbclient: NetBIOS info, comes just the header info, disk consump- with Samba. TUTORIALS tion is massive and there are more ■ dcetest: like rpcinfo but for DCE serious privacy issues. services. NETWORK FORENSICS ■ Firewalls: Usually they do not con- ■ ibnet: can craft specific packets. E. Larry Lidz, University of Chicago tain as much information as audit It is extremely important to keep system Looking at problems from the network: logs (except in the special case of logs in a trusted place (central log Often, we don’t have access to the application firewalls). machine we suspect of being involved in server) and watch them for unusual a compromise (the machine is physically All of the above have their place, but it is activity. still useful to have network audit logs to inaccessible, it belongs to a student, We were shown examples of how Net- help investigate break-ins, to determine etc.). Network audit logs can save the Flow and Argus are used to investigate how our network is being used for both day: If we are able to go back and show incidents. For example, a bunch of legitimate and illegitimate purposes, and where a problem came from, we can Solaris hosts were reported to be com- to get a picture of “normal” use of the quickly resolve the problem. Also, if it is promised. The investigators picked a rel- network. Also, the network audit logs necessary to turn over evidence to atively “quiet” one (fewer logs to look are (or should be!) on a trusted, hard- authorities, it can be legally easy to turn through) and found out that something ened system, so their information is over network audit logs, which tend not had scanned it for port 111 and con- quite reliable. to contain confidential information, nected to ttdbserver. Then another host whereas it can be really tricky to turn Well-known audit log systems: had connected to port 1524 (which was over the hard drives of compromised later determined to be a back-door port) ■ Cisco NetFlow logs straight from machines (because confidential infor- and rsh’d to a machine on the same net- Cisco (and some other vendors’) mation must be protected). work as the original “scanner” (this was routers and switches (can impact interpreted to mean that the victim was The speaker gave a crash course on performance). These logs are easy to downloading a rootkit); then IRC traffic TCP/IP, connection establishment and search. termination, IP addresses and network started up. ■ Argus logs from the spanning port masks, and well-known ports. of a switch and is able to log packet This analysis suggests that to find addi- Why use network audit logs if we already contents if desired. Argus is a free tional compromised machines, there are have one or more of an IDS (Intrusion product, and there is a more DoS- a few things that can be looked for: suc- Detection System), the ability to do sys- resistant and featureful related com- cessful connections to port 1524 locally, tem forensics later, the ability to sniff mercial product named Gargoyle. local machines rshing to the same offsite traffic as needed, or firewall logs? machine, and, possibly, IRC traffic

58 Vol. 28, No. 5 ;login: (unless there is lots of legitimate IRC Van Wyk strongly emphasized a mind- Creating a National Alerting and traffic). Ideally, you should cross-corre- set that needs to be adopted to have any Reporting Service EPORTS late multiple sources for better reliabil- chance of creating reasonably secure R Nienke van den Berg, GOVCERT.NL, the ity, and, of course, hand-verify hosts software. Try to look at your software Netherlands; Jeffrey Carpenter, likely to have been compromised. from the point of view of an attacker: CERT/CC, Carnegie Mellon University,

how can it be subverted? With that in ONFERENCE

Hints on what to look for in network USA; Graham Ingram, AusCERT, C

mind, it is possible to apply a set of ● logs when investigating an incident: Australia architectural principles and make sure ■ When was the first traffic to a back that the proposed design respects those The Internet is no longer just an aca- door (first successful connection to principles, such as: restrict privileges demic and research network, but has a port which previously had no traf- granted, assign responsibility to individ- become part of a national social and fic)? uals (make sure that actions can be economic infrastructure. Governments ■ When did IRC traffic start? traced back to their originators), log suf- are starting to look to national CSIRTs ■ Was there a sudden increase in traf- ficiently well that events can be recon- to help keep the Net viable as such an fic on a given machine? structed, and fail safely. infrastructure. Entering a “national pic- ■ Did unexpected traffic start shortly ture” has raised legal, process, alerting We were given examples of risk-assess- after a connection to a service run- (info provided to the public), and ment questions and broad categories of ning on the machine? reporting (info received from the pub- risk mitigation strategies, including ■ Once a back-door port has been lic) issues for existing (and new) CSIRTs. avoiding the risk by removing the cause found, what machines connected to or the consequence, limiting the risk by In deciding when to issue an alert, you it? detecting and responding to problems, should evaluate the particular vulnera- These techniques proved useful for transferring the risk to another party bility or event based on its technical Slammer Worm cases (where the symp- (e.g., by buying insurance), and assum- impact (a.k.a. objective impact – the toms were that the external net died and ing the risk (being prepared to deal with probability of technical and economic several internal routers crashed; Cisco the consequences of a problem). damage) and on its social impact (a.k.a. reported a network-wide DDoS) and in subjective impact – the perception of When maintaining or modifying exist- cases of a specific compromise. The ini- safety, influenced by the likelihood of ing software, be aware of the design tial Slammer analysis and identification media exposure). intent and the security model of the of an initial set of Slammer-compro- original, otherwise you risk introducing How to provide alerts: Web (need a mised machines was done in 15 min- problems you have no idea about. good content management system), utes! email (lists can get quite large), mass Near the end of the presentation, we Note: There are now tools (e.g., softflow) media, and SMS. were given some specific coding tips. to send flow logs off from UNIX The fairly low emphasis on this issue in Not all of these are used for each alert: machines which are acting as routers. the presentation maps well onto the There is a decision matrix that maps

SECURE CODING: PRINCIPLES AND PRACTICES amount of time that should be devoted technical and social impacts onto a Kenneth R. van Wyk, Tekmark to the implementation of a project, ver- “media mix” (set of channels through Technology Risk Management, USA; sus planning and design, which should which the alert is sent). A communica- Mark G. Graff receive the lion’s share of time. Most of tions advisor is used to ensure that alerts The intention of the authors of the those coding tips ought to be well- are “not too technical” when they are O’Reilly book of the same name was not known by now! aimed at the general public. to give tons of source code examples Finally, don’t neglect the environment in Email alerts are PGP-signed, of course, but, rather, to give programmers the which the code will be used: Secure the but since the general public may not be ability to think clearly about secure cod- OS and network, monitor logs, keep up in a position to check the signature, it ing. When we build bridges, we don’t to date with patches, and so on. And, of helps to have a very well–publicized Web make trucks drive over it, see if it col- course, don’t forget to test every compo- site address to reduce the chances that lapses, then if it does, move on to nent and test at every stage: Your envi- an impostor could hijack (forge) the dis- “bridge v.1.1.”We need a similar per- ronment changes. Automate your tribution channel. spective for the construction of software. testing.

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 59 Here are some questions to ask in decid- To evaluate potential (additional) your stated goals are in fact being ing what is worthy of advisories: What is impact, questions involve how many addressed in your processes. Also, pri- the current threat posed by this vulnera- people and/or machines are affected by vacy considerations dictate that you bility? How bad would it be if the vul- the vulnerability, the importance of the must have guidelines about what infor- nerability were publicly known? if the systems, how rapidly the problem is mation is communicated to whom and vulnerability were publicly known and spreading, and how complicated the under what conditions. Finally, the being exploited? How bad is the incident attack method is (i.e., how many people process makes clear what resources are already? How much worse can it get? could write an exploit, therefore how needed to do the work. likely the exploit is to appear soon). CERT/CC has developed formal metrics AusCERT described their complex to rank vulnerabilities according to the To set up an alerting service you need reporting service system. First of all, it severity of the impact on a “typical” site; money; an operational CSIRT (center of was necessary to collect incident data these metrics involve questions with operations); systems for Web service, online; the alternative of a call center numerical answers by which a weighted Web content management, email list staffed by 40 or so people was simply total is computed. This metric is used to management, and project and office not viable. In terms of security, the last help decide whether the problem justi- management; and technical, communi- thing a CSIRT wants is to have the press fies an email advisory or whether it is cation, and legal expertise. Legal issues report that the CSIRT suffered an intru- sufficient to post the info to the Web must be addressed (general terms and sion! It is equally important to keep data site. The metric is also used to decide conditions of service, privacy policy and segregated, so that one reporter does not which issues to work on first. The ques- disclaimers, contracts and service level have access to data submitted by another tions involve the ease of exploitation of agreements), PR must be handled, and reporter. The automated response sys- the vulnerability, how widely known the internal processes must be clear. tem must automatically do triage on the vulnerability is, the risk to the Internet incoming data: If something is reported The purpose of a reporting service (e.g., infrastructure, and how many systems for the first time, it probably requires AusCERT) is to collect, process, and would be affected and the severity of this human attention. The system must analyze computer security incident impact. assign “threads” to incidents so that cor- reports and share sanitized aggregate respondence and actions on a particular CERT/CC has also developed formal reporting with an appropriate audience. incident can be correlated. metrics to rank incidents (as opposed to This should provide meaningful intelli- vulnerabilities). With respect to incident gence about trends and modus operandi The types of requirements that must be activity relating to a particular vulnera- with respect to network attack activity. met in order to set up a reporting service bility, CERT/CC differentiates between are the same as for an alerting service, The goals of the service are to promote current activity and potential additional but the actual contents are different. the use of mitigation strategies, raise activity, not counting what has hap- Most CSIRTs need to use a Web form to awareness of computer security issues, pened so far (exploitation of a vulnera- collect data in order to automate the keep people up-to-date with threat bility), in order to determine the goals of process, including the initial analysis of activity and trends, provide information any action. incoming reports. The system must scale about attack data which they would oth- very well. A worm might result in thou- CERT/CC also adds a “uniqueness fac- erwise not be able to obtain, and provide sands of reports from the general public, tor” so that they don’t issue multiple value-added analysis of this data. for example, but if there is a new and advisories or incident notes on the same Any reporting service must be able to dangerous exploit nestled in among subject, even if related-incident activity protect the data of the reporters (and these thousands, the system must continues. reassure them that this will indeed be quickly pick it out for human attention. To evaluate current impact, questions the case). However, a reporting system as It must also be possible to quickly are asked about how many people a black hole is not a good idea: People change the Web form to respond to and/or machines are affected, at how are motivated to provide data by having emerging issues; for example, a new type many unique sites, what the importance access to sanitized and/or aggregated of incident might require the collection of the systems affected is, what the data and statistics. of a new category of data. impact during and after the actual activ- It is important to have a clear workflow How to handle anonymous reports? ity is, and how complicated the attack for the reporting process, not least AusCERT accepts and stores anonymous method is. because it forces you to make sure that

60 Vol. 28, No. 5 ;login: reports, but takes no action unless they ■ Vendor pressures: Vendors are mation. In addition, CSIRTs may be can later be correlated with reports from under economic pressure to ship slow, depending on their policies EPORTS validated sources. Don’t forget to try to new features fast, with the result and on the resources available to R collect consent from reporters with that testing is incomplete. Testing them. respect to sharing confidential informa- for security (the absence of unin- Clearly, not all information is equal, so it

tion with specific bodies when needed. tended functionality) is harder than ONFERENCE

helps to use multiple sources to ensure C

Arrange to refuse or otherwise deal with testing for intended functionality. ● speed, reliability, and completeness. You certain types of reports that are not 2. Sources of information about vulner- should verify this information by corre- computer-related or where liability abilities: lating the information from indepen- would be an issue – reports of pure dent sources, testing for yourself, and criminal activity such as murders, for ■ Incident reports are clearly a reli- using the trustworthiness of the source example, are not wanted in a computer able indication that there is a prob- as a criterion. security incident reporting scheme. lem! However, the information Reports of cybercrime accompanied by a obtained may have been obscured 3. CSIRT tasks with respect to vulnera- refusal to release information to law by the attackers and may be hard to bilities: enforcement authorities should also be interpret. ■ Planning: Plan in advance how to discouraged or refused. ■ Full-disclosure communities (e.g., BugTraq): Often the information is use information for the benefit of Alerting and reporting services tips and up to date but the quality is vari- your constituency and minimize tricks. Share knowledge and expertise able, and there is more emphasis on harm; there’s no point in sending with other CSIRTs; integrate the alerting problems than on solutions. out an advisory that says, “There’s a and reporting activities with the problem but there’s nothing you ■ Crackers (via published tools): The processes of the CSIRT; do alerting first information is very current, but can do.” ■ to build credibility for later report col- has to be handled with Distribution: Simply pass on exist- lection; keep close contact with target extreme care, since it may contain ing information, perhaps translat- groups to improve the quality of the additional attacks aside from the ing to a local language or sending alerts; start a newsletter service for peo- attack it claims to be performing. It only information that is relevant to ple who are not specifically interested in can be difficult to extract good a particular constituency. ■ alerts; and establish good national press information from the tools; reverse Interpretation: This can involve contacts in case escalation is needed. engineering is required. gathering information from multi- ple sources but, most importantly, ■ Vendors: Information can be of INTRODUCTION TO ADVISORIES very good quality, but vendors can interpreting the information to suit Andrew Cormack, UKERNA, UK be very slow, and, because of com- it to the skill level or common plat- 1. Why do vulnerabilities happen? peting motives, the information forms for your community, and/or adding your own introduction to ■ Laws of nature: Because computer may be incomplete or may under- networks are complex systems, they state the impact of the problem. place the information in context. When writing an advisory, list the will certainly contain errors, some ■ Commercial services such as anti- of which will have security implica- virus vendors, ISS, etc.: The quality important information in the first tions. of the information is generally high, paragraph: who is vulnerable, whether the problem is exploitable ■ Customer demands: People ask for and their advisories usually come computers which are “easy” to use; out before the vendors’ advisories. remotely, what the damage is, and rarely do they demand computers However, there may be restrictions the immediacy of the threat. Then which are “safe” to use. Therefore, on distribution, and, again, compet- discuss how to fix the problem, vendors sell systems with everything ing motives may affect the informa- mentioning any side effects of the turned on. When Sun tried the tion, for example, by overstating the fixes. IEEE 1044 describes ways to opposite, they received many “non- impact in order to sell their services. describe software anomalies in something like these terms. working” returns! And while users ■ Other CSIRTs have similar motiva- ■ will turn on what they need, they tions as us and are generally trust- Investigation: Be clear about why will rarely bother to turn off ser- worthy. But there may be restric- you are investigating a problem: to vices they don’t need. tions on distribution of the infor- better understand the problem?

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 61 because you intend to notify the Once the vendor has received and If other vendors are affected, either vendor? to check or create patches acknowledged a vulnerability report, it notify them directly (especially if the and workarounds? You can investi- needs to reproduce and verify the prob- number is small) or hand off to a CSIRT gate based on incident artifacts lem: What exactly is the problem? Deter- coordination center of some kind to deal (e.g., files left on a compromised mine workarounds, and whether they with contacting the other vendors. machine), source code if available, are effective and feasible. Find the fix. When the fix arrives, check it, and do or test systems (which are not on a Determine whether other vendors are regression testing to make sure that the public network!). affected. Determine if the problem fix doesn’t break anything else (unless ■ Coordination: This refers to work- deserves an advisory, based on ease of the fix is really trivial). Before releasing ing with vendors to resolve a prob- exploit, impact, and so on. all this, make sure, if the fix will require lem. This can be tricky; trust must The “reproduction and fixing” stage may a hardware upgrade, that this hardware be built and is easy to lose, and the take from a few hours to a few weeks, is available through the normal chan- motivations of the parties may depending on how much information nels. compete. Vendors don’t want bad the vendor has, the complexity of the publicity, but our constituency Finish off the advisory with information setup, and, of course, the difficulty of needs patches to prevent incidents, about the migration path and how to debugging. Also, teams may be small and so do other sites. (But will any obtain fixed hardware or software. Run and working on other cases already. publicity of our efforts increase the the early copy past developers; legal spe- risks to those other sites?) The advisory is the vendor’s official cialists, including export control people response to notification of the vulnera- for crypto; the PR department; and The presenter is a partner in the TRAN- bility. It is best to prepare the advisory selected groups of technical people – for SITS project, a European initiative to before you need it, in case events force example, the original reporter of the provide training on issues related to the you to publish prematurely. Use an vulnerability. provision of CSIRT services, and has informative title, the status (draft, documented the mechanics of how to Decide when to publish the advisory. Do interim, final), the date in GMT, a sum- write an advisory. The relevant links are, we have to wait for other vendors? What mary to be read by management and by respectively: day of the week is it in other parts of the techies to determine whether the advi- world? (Of course, if there is active http://www.ist-transits.org/ sory applies to them at all. exploitation, release as soon as possible.) http://www.ja.net/documents/ Other information that should appear in gn_advisories.pdf The advisory should use a vendor- the advisory: which hardware and soft- independent format (text, HTML, PDF) ware models are and are not affected; ADVISORIES and should be cryptographically signed. specific configurations that are affected Michael Caudill, Cisco Systems Ltd, UK It makes sense to release an advisory if applicable; what causes the problem; At Cisco, an advisory usually starts with internally first so that, for example, tech the symptoms of the problem (crash, some kind of notification of “bad news” support knows what the customers are performance slowdown, etc.); the actual from an independent researcher or a calling about! However, take into consequences (unauthorized access, customer. It is a good idea for the ven- account the possibility of leaks, so pre- DoS, information leak, etc.); who dis- dor to send some kind of response release only on a need-to-know basis, covered the problem (give credit where within 24 hours, because otherwise the and/or don’t allow much lead time; peo- due); whether the problem is being person will feel that their message has ple need to prepare themselves, but too exploited, including the names of not been heard/read; Caudill recom- much time will increase the likelihood of known exploit tools where applicable; mended using a PGP-signed reply. Nev- leaks. links to other advisories; CVE number. ertheless, vulnerability reporters should Determine what language(s) should be Once released, the advisory (at least the keep in mind that the vendor may be used in the advisory. version on the Web) needs to be kept up working different office hours, have dif- to date; there may be corrections and ferent national holidays, or be a small Obtaining the actual bug fix from devel- additional information. When making shop that has only one person receiving opers can be difficult, because of the changes, it is a good idea to update the vulnerabilities, so it is reasonable to commercial pressures to provide fea- revision number and keep a revision his- allow a week or so for the vendor to tures they are under. tory so that people can decide whether respond. they need to re-read the document.

62 Vol. 28, No. 5 ;login: KEYNOTE to two separate central offices, purchases tices”; protection of and from “clueless A GLOBAL CULTURE OF SECURITY and mergers among those companies users” making mistakes; the certification EPORTS Marcus H. Sachs, US Department of had had the result that most of the of network engineers; a mechanism for R Homeland Security, USA “redundant” connectivity went through information sharing about computer A “culture of security” is developing the same fiber bundles to the same C.O. security; and agreements between around the world which involves every- In most large cities, telephones are con- nations with respect to cybercrime. ONFERENCE nected to a single C.O., with little redun- C one, IT people and end users alike, and ● which is parallel to the “safety mind-set” dancy. On September 11, phone service TECHNICAL SESSIONS in New York basically failed. Interest- that makes us wear seat belts in cars. The WORST FEARS/WORKINGS OF A WORM use of computers and networks puts any ingly, people were still able to use the Roelof Temmingh, Sense Post, South country at risk; that vulnerability must Internet to communicate. It was also Africa first be recognized before it can be found that the co-location of the various There are over 1 million networked addressed. In the early 1980s, AT&T different utilities (water, gas, steam, elec- computers on Earth. The Internet covers ceased to have a telecommunications trical power) constituted a vulnerability. not just the Western world, though the monopoly. Since then, the US telecom- The US government decided that “secu- vast majority of connected computers munications network is no longer rity” had been divided into too many lit- are in the Western world, and Internet- domestic, terrestrial, and circuit-switched; tle offices; the Department of Homeland based services include not only email there is a diversity of circuit- and Security was created to bring these func- and the Web, though those are the major packet-switched technology, terrestrial, tions under one umbrella. This depart- services. Most of the computers are run- satellite, and wireless, supporting voice, ment published two major documents ning Microsoft products. About 90% of data, and other communications. in February 2003: “The National Strat- Web servers run either Apache or IIS, and 96% of browsers are MSIE. The In the late 1990s, a military study deter- egy to Secure Cyberspace” and “The state of security ofn those products is mined that the Department of Defense National Strategy for the Physical Pro- not reassuring. Many vulnerabilities could be reached from the Internet, and tection of Critical Infrastructures and have been revealed in the past few years could be attacked through that route. Key Assets,” which are available from and have been exploited by some of the Somehow, that information was at the http://www.whitehouse.gov/homeland/. major worms. In fact, while the recent time considered to be of only theoretical The Cyberspace Strategy is intended to worms have exploited previously known interest. A few months later, DOD com- be modular and to change as needed. vulnerabilities, most had no direct mali- puters became the subject of Net-based Why is it important to secure cyber- cious payload or had an effect limited to attacks. That got the attention of the space? The USA is fully dependent on DoS, and in most cases only a small per- leadership, and a task force was created cyberspace, and the range of threats is centage of targeted hosts were infected; to coordinate the defense of digital net- huge, from script kiddies to nation- nevertheless, their effect was large. works. At the end of that decade, there states. Recommendations include was a lot of cybercentric effort because addressing vulnerabilities, as opposed to What, then, is the worst-case scenario? of the impending Y2K, though for a threats, because threats are constant and Imagine a group of 15 or so program- while physical sectors of the infrastruc- everywhere. Also, the government alone mers working for six months in isola- ture were somewhat neglected. cannot secure cyberspace – individuals tion. September 11, 2001, made it clear that and industry must participate. The Phase 0: Perform reconnaissance work the “physical” sector ought not to be speaker went on to list and describe (scanning) to find vulnerable Web ignored. When the two towers of the many initiatives being taken by the US servers (assume that we have prepared World Trade Center and some adjacent government, and he listed requirements an exploit for an as-yet-unpublished buildings were destroyed, the telecom- for a secure Internet: accountable vulnerability that will break Apache and munications redundancy that had been addressing (such as IPv6); dependable IIS). Pick the 1000 busiest servers. provided for the New York Stock network services (routing, DNS); trust- Phase 1: First we stealthily infect a thou- Exchange turned out to be inadequate. worthy software; authenticated user sand machines that are important, high- Although connectivity had been pur- services (Web, email); a working public traffic Web servers (e.g., CNN). These chased from several different companies, key infrastructure; networks built to be “master servers” have huge log files and using several physical routes and going secure from the start, not as an after- thought; the adoption of “best prac- are constantly under attack anyway, so

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 63 our attack has a better chance of all kinds, corrupt or destroy the BIOS, includes the provision of actual services remaining undetected, which is very and pop up a box asking users to call related to this information, including important at this point. The code we their local help desks, thus swamping “out-of-Internet” alerting so that infor- inject into our victim “master servers” the help resources and perhaps even the mation can get through even when the does DNS lookups of random names phone lines. At the same time, the “mas- Internet is non-functional. using DNS servers on predetermined ters” remove all Web content, and start A code of conduct binds the participants controller hosts, and the return values of DDoSing Microsoft and Apache (to in the information exchange, and covers these DNS lookups contain commands inhibit their supplying any patches), the issues of cooperation, protection of for the master servers to execute. We root DNS servers, and random other intellectual property rights and confi- send each master 1/1000 of the list of IP sites. dential data, and contributions to the addresses of all vulnerable Web servers Long-term data destruction (e.g., of data goals of the project. as determined during phase 0. on backups) was also discussed, but it is The common language is based on IETF Phase 2: We now infect as many a bit more involved. initiatives Intrusion Detection Message machines as we can, still stealthily. We The results of such a worm would effec- Exchange Format, or IDMEF (to send send a signal to each master to inject tively be total chaos. To prevent such a attack information directly from the attack code into random Web pages (but worm from succeeding, several measures sensors), and Incident Object Descrip- not the home page) of the Web server it should be taken: tion Exchange Format, or IODEF (for has infected. Therefore, when a browser local CSIRTs to send a more high-level downloads that page, it also downloads ■ DMZ: Isolate outside-facing Web view of an incident). A Web form will the client version of the attack code. servers; don’t allow machines in the allow constituents who don’t yet support Using these “client servers,” we do recon- DMZ to make connections to the IODEF to send IODEF objects. IODEF naissance on the networks they are on inside. will also be used to send information to (possibly internal networks not directly ■ Tight filtering: A Web server should local CSIRTs. IDMEF incident data can connected to the Internet), remove any never initiate connections. If it is be included within IODEF incident antiviral software, collect any passwords unable to initiate Web connections, descriptions. we can find, and take control of any it cannot spread a worm that way, infrastructural machines (such as even if it is itself infected. eCSIRT.net has a clearinghouse func- routers) that we can. ■ Internal segmentation: In case part tion, to share as much information as of the organization gets infected legally can be shared; this is a low-prior- Phase 3: In a “pre-meltdown” phase, at a anyway, the other parts are pro- ity task which occurs after incidents are predetermined time (because communi- tected if the internal network is seg- closed, at which point they can be cation with infected clients may be mented by packet filters. processed for statistical purposes. The impossible), all “clients” look for vulner- ■ Filter any third parties that have output will be tailored to the recipient, able Web servers on the inside of that connectivity to your network. where participating CSIRTs will get network and infect them so that they too ■ Use personal firewalls on user PCs more detailed information, and the gen- will now carry the “browser worm,”at as a last line of defense. eral public will get just an overview. least for the next three hours. At the

same time, all “masters” carry out the AUTOMATIC EXCHANGE OF INCIDENT- Three types of statistical information infection of their predetermined lists of RELATED DATA AND ITS APPLICATION IN will be collected: externally connected vulnerable Web CSIRT OPERATIONS ■ Type 1 data are related to the work- servers. Klaus-Peter Kossakowski, Presecure load of each CSIRT: number of Three hours after the above (which is Consulting GmbH, Germany attacks reported; false positives; sys- short enough that it is unlikely that eCSIRT.net has a project whose goals are tems attacked and affected; time human intervention can take place in to improve the exchange of incident- spent analyzing, responding, docu- time to stop the process), the “clients” related data and to foster improved menting; and so on. start destroying the local machine and cooperation among CSIRTs. For this, all ■ Type 2 data concern incidents network; they send DoS packets to any groups involved have to agree on the themselves. This information will hosts that could not be infected, insert meanings of words so that statistics are be aggregated, and any identifying random bytes into data and text files of meaningful and a shared knowledge base is possible. The project also

64 Vol. 28, No. 5 ;login: information removed before it is should be run using HTTPS to improve and the public’s right to know more presented to anyone. security. than do the receivers, while the receivers EPORTS

■ Type 3 data pertain to Internet place a higher premium than do the R RTIR adds these functions to RT: events (which are not necessarily reporters on the avoidance of “FUD” intrusions). These events will be ■ An “incident” object ties together (fear, uncertainty, and doubt).

monitored using automated tech- the various reports that might come ONFERENCE

Only just over half of the receivers C

niques such as Argus (traffic moni- in about a single incident and vari- ● passed on the bug information to their tor), honeypots, and so on. The ous actions such as blocking net- developers to prevent further occur- event information will be accessible works and performing investigations. rences of that type of bug. to constituents via HTTPS and user ■ Incident response team–specific certificates. workflows, for example, automati- One-third of the receiving organizations cally opening incidents to notify the have a proactive publicity strategy for The incident (type 2) information is people responsible for each of a list cases where there is a publicity crisis highly controversial, because of trust of IP addresses about a vulnerability concerning vulnerabilities in their prod- and confidentiality issues; these issues discovered while scanning. ucts, and one-third of them have PR are being addressed. There is potential ■ “Clicky” metadata extraction and personnel who are familiar with vulner- that type 3 (event) information could tracking (to get more information ability issues and who have direct media have online processing resulting in alerts about IP addresses through such contacts. that can be sent to constituents. utilities as whois, traceroute, etc.). The vulnerability-reporting communi- There is technology to send out ■ Integration of whois information. cation seems too often to be one-way; encrypted mail, to make phone calls and ■ Separate email threads for separate two-way symmetrical communication is faxes, and so on. The idea is to free conversations about different tasks needed. A dialog between the parties humans to analyze problems instead of or components of the event. would improve mutual understanding, tying them up in the mechanics of the ■ High-level overviews. and vulnerability reporting policies transferring and storing of information. ■ Better searching tailored to inci- would also be helpful. dents so that multiple events can be for Incident Response correlated. PANEL DISCUSSION John Green, JANET-CERT, UK; Jesse ■ Simple scriptable actions. ASK THE EXPERTS Vincent, Best Practical Solutions, USA ■ New reporting functions. Cory Cohen, CERT/CC, Carnegie Mel- RTIR is a tool for incident handling, More information can be obtained from lon University, USA; Robert Hensing, which claims to be usable, cross-plat- http://www.bestpractical.com/. Microsoft, USA; Michael Warfield, ISS, form, open source, extensible, securable, USA; moderator: Roger Safian, and supported. COMMUNICATION IN SOFTWARE Northwestern University, USA VULNERABILITY PROCESS Q: Concerning Microsoft’s recent acqui- RT (the base software for RTIR) was Tiina Havana, Juha Roning, University sition of an anti-virus product vendor, is designed to track issues of any kind. It of Oulu, Finland Microsoft planning to automate anti- gets used for bug tracking, help desk and Reporting software vulnerabilities is virus download-and-security-patch customer service, network operation, “to central to software development, but the management into the same agent? do” lists, and so on. In its bare form, it communication process is problematic. does get used for incident response, but In 2002, a study was done where vulner- Rob: Microsoft is working on an anti- it is not ideal for this use. It is Web- ability reporters and the receivers of virus product of its own. It is also work- based, and the client side is designed to such reports were questioned. Recipients ing with VIA (the Virus Information work with just about any browser. reported contacting reporters more than Alliance), and working on other secu- rity-related services geared toward home RTIR is an extension to RT, which uses the reporters believed that they were users. I cannot comment on the details. the designed extension mechanisms; so, contacted. for example, it is possible to upgrade RT The values and beliefs of the two parties Q: In the case of an organization with without having to “repatch” the exten- differ. While they agree on the impor- no real security other than that provided sion. It is possible to add functionality, tance of security, precision and accuracy, by volunteers, what “glory words” can be change the user interface, and so on. It and non-malfeasance (avoiding harm to presented to financial folks to persuade others), reporters value public benefit

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 65 them to devote resources to a security Microsoft is chasing spammers legally’ Q: We have seen suspicious packets with team? We need high-profile prosecutions to a fixed window size. These were dis- discourage this type of activity. counted by CERT/CC, while ISS went Rob: To justify an IRT, use threat model- Microsoft is suing 13 spammers, of further and gave meaningful data. Give ing: What am I trying to protect? What which 11 are in the USA and 2 are in the us some idea of your internal processes. is it worth? A model called STRIDE UK. 60% of mail coming to Hotmail is (Spoofing, Tampering, Repudiation, Mike: ISS got caught with its pants spam! Information Disclosure, Damage Poten- down on this one. We had noticed pecu- tial, and Exploitability) assigns dollar Q: Lots of spam is fraud (eBay fraud, liar traffic, deferred looking into it for a values to each of the named items. trying to get credit card information). week, then looked into it after incident Chapter 4 in Writing Secure Code walks reports had appeared in various other Mike: eBay and Pay-Pal frauds get much through the STRIDE model. sources. At that point, we started press coverage, but those make up less detailed investigations, which are still in Andrew Cormack: Universities don’t than 1/10 of a percent of the incoming progress [as of 06/25/03], but postings work on dollars and cents – list assets, spam, while one-third of it is for Viagra have been made. liability. and other “personal enhancement” (!) technologies. Prosecutions need to pro- Rob: We [Microsoft] saw the traffic. No Q: For home users and the general pub- ceed on the fraud front; this type of one could figure out its origin, and this lic, a major problem is spam. Break-ins activity must become unprofitable. caused great concern: Was it the first big on DSL-connected hosts are often moti- kernel-mode Windows worm? [It turns vated by the desire to install spam distri- Q: Regarding security advisories, is there out that this is probably not the case.] bution agents. Various government any chance that advisory formats will organizations are interested in stopping converge so that automated means can Mike: This is a classic example of the this because it costs a lot of money. As be used to disseminate them, or their fact that our business gets interesting security people, should we do something existence, to our constituents? with the words “I never saw it do that radical to the email infrastructure to before!” Mike: Advisory formats are evolving, but address this issue? no wholesale conversion to a single Q: What are the pros and cons of proto- Mike: Honeynet analysis of scanning for common format should be expected. We col- and signature-based IDSes? What open proxies (squid, socks) shows that may see two or three different versions are some of your favorite IDSes? these scans are done mostly by spam- of the same information (text, Web, Cory: As a member of CERT/CC, I have mers wanting to inject their spam. These XML), of which XML is most amenable no favorites, but we did some research guys have crossed the line at that point to automated processing. But advisories on IDSes to assess their ability to really into illegal activities, therefore prosecu- must go through marketing and PR describe vulnerabilities using signature tions need to be done. Spammers are departments, which are less amenable to languages. We were disappointed about trying to sell something, so they must technical standards! what could be done to really describe leave ways to track them back. Our Cory: We are open to exploring stan- vulnerabilities. A lot of signature-based responsibility as security people consists dardized formats, but this is less impor- systems could not describe a recent Ker- of hardening email systems, implement- tant for advisories. We are more beros vulnerability, for example. There- ing authentication between servers, and interested in a standard format for the fore, I have doubts about signature- locking down open relays. exchange of vulnerability information, based systems. Cory: The growth of really usable cryp- which can then be customized with text Mike: Marketing people want us to tography could help. Membership in for advisories. believe that the two kinds are very dif- mailing lists should require use of cryp- Rob: Microsoft released two new bul- ferent. Protocol-based engines need tography to participate in them. letins just today. After the advisories more horsepower. Therefore it makes Roger Safian: The issues of someone have gone through the legal people, they sense to try for the cheaper signature- abusing your resources versus someone are mangled! If a standards body were to based methods first to skim off 80% of just sending spam should be kept sepa- come up with a standard format for the problem; then do protocol analysis rate. advisories, I would encourage Microsoft to catch more; then, finally, there’s the to probably play along. But Microsoft holy grail of “anomaly detection,”which Rob: Technical approaches to this bulletins are already huge; a list of mini- does not really exist yet. Each technique problem are seen here at conferences. mal information would be helpful.

66 Vol. 28, No. 5 ;login: has its own domain of applicability; all KEYNOTE Agency): eEurope 2005 build on the of them should be used. HE UROPEAN NITIATIVES IN ETWORK AND progress made by eEurope 2002: Inter- T E I N EPORTS net penetration in houses has doubled R Rob: I’m more interested in host-based INFORMATION SECURITY since then, there is a telecommunica- than network-based IDSes; the OS needs Andrea Servida, Information Society tions framework in place, and there is a to get more intelligent. Tripwire, for Directorate of the European

Commission, Belgium very fast research backbone network. ONFERENCE example. C

Why Europe needs to act on security: There are projects which affect CSIRTs, ● Q: Are there signature development 75% of European companies had no such as the European Information Secu- classes, tips on developing signatures? security strategy in 2002. It seems that rity Program, which aims at providing SMEs with the IT security solutions Mike: Get in good with the Snort guys, underestimating core business risks is needed to develop their e-commerce who are doing it all the time on our responsible for the low level of IT secu- business. eE2005 aims to: mailing list. rity investment in most European com- panies; there is a lack of awareness of the ■ Establish a cybersecurity task force Cory: I found it difficult to match up issues. The paradigm for security is by mid-2003. ENISA is not itself a individual signatures from various sys- changing from “security through obscu- CSIRT, but would have a facilitat- tems. rity” to “security in openness,”but this ing role, coordinating existing Roger Safian: We just completed testing openness and sharing of resources is a capabilities and resources in mem- eight IDSes of multiple kinds. All had challenge to manage. ber states. plusses and minuses. Sales people are Toward a European-integrated approach ■ Develop a “culture of security” by overzealous in their claims. to security: International cooperation is the end of 2005 (develop good practice and standards). Q: What about intrusion prevention needed and is occurring in the following ■ Secure communications between technology? areas: public services. Mike: I have similar comments as for ■ Economic, business, and social Ambient intelligence (wearable comput- “protocol versus signature”: there is a aspects of security in an informa- ers, computers inside our bodies) raises big area of overlap between technolo- tion society: These include business new security issues. Today’s technologies gies, and fuzzy definitions of them. opportunities and growth, individ- are already pervasive and intrusive, and Many vendors talk about putting things ual issues such as privacy and the have huge interdependencies which are in midstream and blocking based on protection of minors, dealing with not being managed well. The challenge findings. There are concerns about false the digital divide, and long-term for tomorrow is to develop a respectful positives; we need a bit more track preservation of knowledge and cul- approach; the ethics of privacy will be a record on these devices. ture. ■ Cybercrime,“homeland” security. key element in an information society. Rob: Intrusion prevention is another ■ External security and defense. way to say “host hardening”: You can’t ■ Security research. TECHNICAL SESSIONS attack what isn’t there! I’ve been told INTRODUCTION OF THE APCERT: NEW The role of the Commission: The Euro- that Windows listens on too many ports, FORUM FOR CSIRTSINASIA PACIFIC pean Commission proposes and orches- that we cannot turn the services off. But Yurie Ito, JPCERT/CC, Japan trates the development of a regulatory Win2K has IPSec, which can be used to There is now a new forum for coopera- block access to ports. framework, for example governing elec- tronic signatures, data protection in tion among CSIRTs in the Asia Pacific Roger Safian: SANS cornered Gartner. It electronic communications (consent to region: APCERT. turns out that intrusion detection collection of data, use of cookies), infor- Most CSIRTs in the AP region do direct devices that are actually deployed are mation, and network security. The Com- incident handling and coordination as generally not used to block traffic! mission also launches policy initiatives, well as issue warnings and alerts, techni- in particular to promote the develop- cal bulletins, and so on. However, there ment of various technologies or is little participation from IRTs in indus- approaches. try. Many CSIRTs in the AP region do eEurope 2005 and ENISA (European communicate directly with each other, Network and Information Security to share observations, data, and techni- cal information and to contact sites involved in an incident. While there are October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 67 commonalities between the various causal link to computer security inci- The following item apparently appeared CSIRTs, such as a narrow block of time dents is evident. on Slashdot the day of the presentation: zones and IP address blocks, there are The Palo Alto Unified School District What is under attack? There is an evolu- differences in the technological maturity had a wireless network which was not tion away from “computer security,” of the various participants. secured properly. A reporter for the Palo where we are protecting information but Alto Weekly parked in the parking lot There was a desire to coordinate the we don’t necessarily need to know what and was able to pick up, for each stu- interaction between the AP CSIRTs. A information it is we are protecting. We dent, full names and addresses, pictures, working group was formed in 1997, are now realizing that this information and in one case a psychological evalua- whose core members were three large has a meaning, and can be used to cause tion. While this could be characterized teams: CERTCC-KR, SingCERT, and serious damage to people and institu- as a malicious attack, in fact errors in JPCERT/CC. Over the years, it was tions. Attackers break into systems in system configuration made the task of decided to create APCERT, which was general because they are looking for obtaining this information trivial. established in February 2003. Its objec- information; privacy is an extension of tives: what CSIRTs have been doing as part of The speaker described several more pri- their jobs. vacy breach cases, including one in ■ Share security information among which used computers were sold with members. Identity theft is turning into one of the their disks incompletely wiped, so that ■ Handle security issues on a most prevalent crimes of this century. personal information about hospital regional basis. Criminals need anonymity (in the form patients (for example) was compro- ■ Support the establishment of of an identity other than their own), and mised. CSIRTs in countries not yet covered stealing an existing identity that has a by such services. history is much more useful to them Recommendations: Be prepared for the ■ Collaborate with other regional than creating a fake identity, which can implications of privacy breaches. Apply frameworks, such as FIRST and TF- be found to have “popped into exis- not only deductive reasoning (what hap- CERT. tence” recently and can therefore be pened to cause this breach?), but also flagged as fake. This has implications inductive reasoning (anticipating the There are currently 15 full members. beyond someone losing money from a impact of information loss). APCERT’s activities include an annual stolen credit card, and we can expect conference (APSIRC), and working civil and criminal liabilities for HONEYNETS APPLIED TO THE CSIRT groups on accreditation rules for “enablers.” SCENARIO APCERT membership, on training and Cristine Hoepers, Klaus Steding-Jessen, communications for CSIRTs, and on As the concept of “computer security” and Antonio Montes, NIC BR Security financing the APCERT effort. has evolved to “information security” Office, Brazil and then to “privacy,”the people respon- The APCERT involves other players, The Brazilian CSIRT set up a honeynet sible for working on securing that infor- such as users, system integrators and with the objectives of monitoring cur- mation have grown to include not only operators, regulatory bodies, the insur- rent attacks and intrusions, collecting sysadmins, but also CIOs and, finally, ance industry, law enforcement, and the data about opportunistic (“script-kid- individuals, who must safeguard their technology development and engineer- die”-type) activity, developing new own information. There is a plethora of ing communities. It encourages these tools, and evaluating the usefulness of laws affecting privacy at the national and players to cooperate and communicate honeynets to CSIRTs. Requirements for local levels. with each other; it facilitates mutual the honeynet included low cost and reli- trust and information sharing. The author suggested four basic cate- ability, as well as a high-quality data gories (a taxonomy) of privacy breaches: control mechanism. The team also PRIVACY INCIDENTS ON THE RISE: TAXON- wanted to make sure that it could pre- OMY AND RESPONSE ■ Malicious attack: often preceded by vent the use of the honeynet as a launch- Lance Hayden, Cisco Systems Inc., USA security breach ing platform to attack other sites. ■ What is privacy? As incident responders, Process breakdown: security Therefore, free software was used, and we are called upon to take a “first processes fail to protect data was stored in “libpcap” format to ■ responder” approach to privacy Human or system error: not mali- facilitate its analysis with existing tools. breaches, whether or not the direct cious, but still damaging ■ Other

68 Vol. 28, No. 5 ;login: This team’s honeynet started operations By maintaining a honeynet, a CSIRT can POLICY-BASED CONFIGURATION OF in late March 2002. provide an additional source of data to DISTRIBUTED IDS EPORTS

help understand what’s going on in a Olaf Gellert, Presecure Consulting R The honeynet’s topology includes an particular set of incidents, and can pin- GmbH, Germany administrative network, whose func- point and notify compromised machines IDS is a security component which tions are to prevent outgoing attacks, to

of constituents if they show up in the monitors system events for unwanted ONFERENCE log activity, and to store artifacts and C

honeynet logs. This work is also a great behavior, using the following methods: ● disk images of the honeypots. The hon- source of training material for log and eypot segment includes several honey- ■ Anomaly detection: As a first step, artifact analysis and forensic methods, pots running different OSes. the IDS gathers statistics about and can be used to capture attack tools normal behavior, and as a second Technologies used to control outgoing that would otherwise be only in the vic- step, it generates alerts on unusual data (to prevent attacks from within the tim host’s memory, since it is possible to behavior. honeynet) include firewall rules (e.g., to capture full network traffic on the hon- ■ Misuse detection: The IDS com- block spoofed packets), outgoing traffic eynet. pares events against patterns of normalization (to discard invalid pack- known attacks. ets), a tool called “sessionlimit” (which AN INTERNET ATTACK SIMULATOR USING THE can limit outgoing traffic based on fairly EXTENSIONS OF SSFNET There are different kinds of sensors: sophisticated state-based rules), band- Eul-Gyu Im, Jung-Taek Seo, and Cheol- ■ Network sensors (NIDS) are com- width limitation to prevent DoS attacks Won Lee, National Security Research Institute, Republic of Korea ponents which inspect network from the honeynet, and an outgoing fil- traffic. One sensor per subnet is ter (“hogwash”) to block traffic based on Because of the increase in Internet needed, and there is no feedback contents. Alerts and summaries are pro- attacks, there is great demand for from hosts. duced to report on activity in the hon- research on them and their effects. How- ■ Host sensors (HIDS) require one or eynet. ever, it is not always possible to study the attacks on a production network, for more sensors (one per host), but Activities seen in the past year included obvious reasons. This is why network they permit direct access to infor- lots of IRC traffic, a lot of worm activity simulators are useful. mation. (some new worms were captured), and An IDS also needs analyzing compo- denial-of-service attempts. Several new SSFNet (Scalable Simulation Frame- nents to collect the data from the sen- rootkits and exploit tools were captured. work, network module) is a freely avail- sors, to take action on logged data, and Statistics were produced on the top able network simulation tool which has to compile statistics. Management con- scanned ports (FTP, SSH, Telnet, and a process-based discrete event-oriented soles are the front-end for visualization portmap were at the top). Also, many kernel. The authors added extensions to of logged data. scans for open relays and open proxies SSFNet: a firewall and a packet manipu- were seen on ports 25, 1080, 3128, and lator. They then performed experiments All IDSes suffer from false positives, false 8080. with this setup; for example, they simu- negatives, and often offer no explanation lated a “smurf” attack in a network of of an anomaly. With distributed IDSes, a Worm activity concentrated on ports 80, 13,000 clients, 40 servers, and 270 large amount of generated data adds up: 443, and 1433. There is still a lot of routers, and were able to show the It’s important to correlate all alerts for Nimda and Code Red activity, which degradation of ping response times as a one attack into one single alert. With means that many compromised hosts function of the number of subnets par- lots of sensors, the problem of configur- are still active! ticipating in the attack; it turns out that ing all those sensors becomes significant. The origin of the problematic traffic was response times degrade drastically when There is a diversity of different specifica- graphed by country; for most cases, the 12 or more subnets are involved in the tion formalisms used for the configura- US was at the top, though not in as large attack. tion of IDSes, and different types proportion as the US’s presence on the describe different events. There are also Net. Back-door access and language of configuration differences, even among IRC conversations point to Romania as a several sensors of the same type, major source of malicious activity. depending on their placement. In addi- tion, anomaly-based sensors require fre-

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 69 quent updates to their attack-recogni- could be supported by automatic intruder, but in easy reach of the sysad- tion signatures. processes, using the output of the IDS, min. for example, or using scanning tools. There are several possible solutions to The stealth “master” itself is not con- Actual updates might still require man- this configuration complexity, including nected to the outside Internet, but can ual confirmation, as might the signature some unacceptable solutions such as make SSH connections to its monitored database, though in that case the IDS using only one vendor and/or only one clients. The fingerprint is stored by the vendors can help. Some unresolved type of sensor. There have been attempts monitor; there are no logs on the clients. problems: how to update rules without to specify a single configuration lan- restarting sensors? how to introduce STEALTH itself uses standard software guage for all types of sensors, but this advanced topology information? like “find” and “md5sum,”so it is highly does not solve the problem of different flexible and adaptable. The timing of configurations being required based on The status of this work: A tool to config- STEALTH runs is unpredictable. each sensor’s placement within the net- ure IPtables and Cisco access lists now Because the actual file signature calcula- work. The speaker suggested a solution: exists (written in awk), and nearly com- tions occur on the client, the monitor specify only a policy and generate rules pleted work in object-oriented Perl is can be any old hardware; resource automatically based on this policy. being done for handling subset-of-pol- requirements are small. STEALTH can icy objects. The policy description suggested con- be obtained at ftp://ftp.rug.nl/contrib/

tains users, hosts, services, access by THE STEALTH FILE INTEGRITY CHECKER frank/software/linux/stealth/. source/destination, and restrictions Frank Brokken, University of based on these combinations. In addi- Groningen, the Netherlands KEYNOTE tion, one needs a database of rule sets to STEALTH (SSH-based Trust Enforce- CYBER SECURITY IN CANADA describe known attacks (for signature- ment Acquired through a Locally James Harlick, OCIPEP, Canada based IDSes), a database of assets (exist- Trusted Host) is a file integrity scanner Canada created OCIPEP (Office of Crit- ing systems, installed software), and a that has the advantage of being stealthy. ical Infrastructure Protection and Emer- database of some kind gency Preparedness) in February 2001 to (not implemented yet). STEALTH’s mission is to ensure the security of our computers (integrity, enhance the safety and security of Cana- A policy is used to generate a rule out- availability, confidentiality of informa- dians in their physical and cyber envi- line (a description of allowed services), tion stored). Intruders may modify the ronments. The mandate has two based on which we use the asset data- integrity of the information on a host; components: base to generate rule specifications for our lines of defense include denying ■ Protection of critical infrastructure: each IDS based on their placement. access when an intrusion attempt is physical and cyber components of Finally, we use the contents of the signa- detected, keeping software patches up- the energy, utilities, communica- ture database (where applicable) to gen- to-date, monitoring system logs, and, tions, services, safety, and govern- erate particular rules for each sensor. We finally, knowing when relevant informa- ment sectors can also set the severity for sets of tion is altered. This last is the goal of a ■ Emergency preparedness for all events: for example, distinguishing file integrity checker. kinds of emergencies between “alert for this event” and “just collect stats on this event.” File integrity checkers work by creating a Currently, the critical information struc- “fingerprint” of the current state of a ture is highly interdependent and has The results of this work suggest that it is host, then detecting modifications of numerous vulnerabilities (e.g., a worm possible to configure all sensors cen- that fingerprint. The problem with the knocked out 911 service in part of the trally. Specific rules for each sensor traditional approach of storing the fin- US because they were using VoIP). reduce the number of false positives, and gerprint on the monitored computer Canada has pledged itself to be the most we see improved accuracy on the sever- itself is that the fingerprint itself is connected nation on earth by 2006, ity of alerts; it becomes easier to update therefore vulnerable to intruders. The which puts its infrastructure at great the rules. A side benefit is that the assets usual solution, to keep this state on risk. database can be used for other purposes. read-only media, makes updating the Canada’s cyber-security framework con- The maintenance of the asset database fingerprint difficult or costly. The solu- tains four ways to reduce risks: takes time, though this maintenance tion is to store the fingerprint on another computer, out of reach of the

70 Vol. 28, No. 5 ;login: ■ Strengthen policy framework: Gov- generate rules for attack detection. There with 34,000 peerings, yet despite this ernment departments and agencies are four types of agent: complexity, it works! The use of firewalls EPORTS

are required to report cyber-threats tends to interfere with the proper func- R ■ Manager agent: coordinates work and incidents to OCIPEP, and a tioning of the Net in many ways. and information flow. framework is created to support ■ Monitor agent: gets information Firewalls break the Internet by miscon-

information sharing and protection ONFERENCE

from sensors (a monitor agent figuration. For example, Path MTU dis- C

among various jurisdictions. ● should run on each host being covery, which permits the fragmentation ■ Enhance readiness and response: monitored). of packets where necessary so they can protect (issue alerts and advi- ■ Decision agent: uses information to reach a network with a “smaller” MTU, sories), detect (coordinate identifi- decide what to do. depends on particular ICMP control cation and analysis), respond ■ Action agent: generates alerts and messages getting through – yet people (establish incident response cen- heartbeats. There is a GUI that can do filter them (by filtering “all ICMP”). ters), and recover (provide incident show graphs of various measured Other examples where a misconfigured impact analysis, technical assis- parameters. firewall breaks things include jumbo tance). frames, fragmentation, and certain ■ Build capacities: This includes The decision agent, which is the heart of applications, such as FTP, IRC, H323, training and education (CSIRT the system, is based on fuzzy logic (fuzzy IPSec, IPv6, and multicast. training, training on malicious sets with “degrees of set membership” code analysis, IDS data analysis, between 0 and 1). Instead of hand-coded Another reason firewalls break things is assessment of vulnerabilities) and fuzzy rules, rules are generated automat- the added complexity they introduce: R&D (coordination with funding ically by “training” the decision agent Because communication is no longer councils of the government, direct with genetic algorithms, under normal “end-to-end,”it becomes difficult to find research projects). conditions and attack conditions. errors. Things no longer “just work” a ■ Build partnerships: It is necessary lot of the time. The increased number of Parameters monitored by MMDS for partnerships to be formed machines, people, and procedures include network activity (sent and internally, among the federal and involved means more opportunity for received bytes and packets), user activity provincial governments as well as things to go wrong. (logins, failed logins, number of users critical infrastructure owners and logged in), process information (total The time spent working around firewalls operators, and externally, with number of processes, number of root- is time wasted (and sometimes work- other governments and CSIRTs. owned processes, and number processes arounds are not available). For example, Already there is a daily “health in various states), system parameters H323 videoconferencing requires four check” information exchange (physical and virtual memory in use), extra machines to work through a fire- between OCIPEP and the and MAC-level network data (sequence wall! provinces, and there is a weekly numbers). conference call with critical infra- Firewalls are single points of failure and structure owners and operators in Once the decision agent had been will, of course, contain errors (since they the private sector. trained to recognize three attacks (SSH are developed using the same processes hack, nmap scan, and MAC spoofing on used for other software!), and yet we TECHNICAL SESSIONS a wireless network), the researchers place our trust in them. claimed very good results for detecting MULTI-LEVEL MONITORING AND DETECTION Firewalls limit network speed by creat- those attacks under test conditions. SYSTEMS (MMDS) ing a bottleneck: Will firewalls, especially Madhavi Latha Kaniganti, D. Dasgupta, FIRE YOUR FIREWALL those which must traverse the protocol J. Gomez, F. Gonzalez et al., University stack, keep up as networks speed up? of Memphis, USA Jan Meijer, Hans Trompert, SURFnet/CERT-NL, the Netherlands The presenter described an intrusion Firewalls consume scarce resources by detection system (MMDS) which is an The speaker’s intention was to show that requiring staff, equipment, time, and agent-based approach to monitoring firewalls cause more problems than they money to acquire and maintain. On a and detecting attacks. The design is a solve (which is not to say that anyone related matter, firewalls create bureau- hierarchy of specialized agents, with a has the right solution). The Internet (in cracy: policy, carefully kept rules, lots of fuzzy decision support system used to 2002) was composed of 11,000 networks administration.

October 2003 ;login: 15TH ANNUAL FIRST CONFERENCE● 71 Firewalls create a false sense of security: the fear of negative publicity and com- There is usually traffic which works petitive disadvantage, uncertainty as to around or tunnels through the firewall whether law enforcement is interested or (for example VPNs), and lots of traffic able to deal with the crime, not wanting on legitimate protocols (email, Web) is to “challenge” crackers, and not knowing still dangerous. Firewalls provide no who to call or what is worth reporting. denial-of-service protection. Too many The speaker reassured us that law configurations filter only inbound traf- enforcement tries to be discrete with fic; in this case, a “call-back tunnel” will information and does not seize victims’ get around the “problem,”and local computers; the victim does not lose con- users do set these up. Even if there is a trol but, rather, is consulted closely. Also, firewall, it is still necessary to patch, use the number of law enforcement agencies end-to-end encrypted communications, able to deal with computer crime issues switch off unneeded services, etc., meas- has increased at all levels in the US. ures which are too often neglected when There have been some success stories — there’s a firewall. for example, the successful prosecution

INCIDENT RESPONSE AND THE ROLE OF LAW of Mafiaboy (DDoS) and the Melissa ENFORCEMENT virus author. Kimberly Kiefer, Computer Crime and We are encouraged to report incidents Intellectual Property Section of the US even if the damages do not meet the Department of Justice police’s criteria for investigation, because The CCIPS, a 30-attorney section within our incident may be linked to others, the Criminal Division of the US Depart- which collectively do meet the require- ment of Justice, prosecutes computer ment. crime and criminal intellectual property (IP) cases, trains and counsels agents Tips on cooperating with law enforce- and prosecutors, develops policy on ment: computer crime and IP, provides input ■ Keep detailed notes and logs, on legislation, and represents the USA including records that will quantify on international bodies which address the damages caused by the incident. computer crime and IP issues. ■ Set up contacts with law enforce- Security incidents are generally under- ment before an incident occurs. detected and underreported. Some rea- sons for not reporting incidents include

72 Vol. 28, No. 5 ;login: