Proceedings of the 17 Large Installation Systems Administration Conference
Total Page:16
File Type:pdf, Size:1020Kb
USENIX Association Proceedings of the 17th Large Installation Systems Administration Conference San Diego, CA, USA October 26–31, 2003 THE ADVANCED COMPUTING SYSTEMS ASSOCIATION © 2003 by The USENIX Association All Rights Reserved For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. DryDock: A Document Firewall Deepak Giridharagopal – The University of Texas at Austin ABSTRACT Auditing a web site’s content is an arduous task. For any given page on a web server, system administrators are often ill-equipped to determine who created the document, why it’s being served, how long it’s been publicly viewable, and how it’s changed over time. To police our web site, we created a secure web publishing application, DryDock, that governs the replication of content from an internal, developmental web server to a stripped-down, external, production web server. DryDock codifies a formal approval process that forces management to approve all web site changes before they are pushed out to the external machine. Users never interact directly with the production machine; DryDock updates the production server on their behalf. This allows administrators to operate their production web server in a more secure and regimented network environment than normally feasible. DryDock audits documents, tracks revisions, and notifies users of changes via email. Managers can approve files for publication at their leisure without the risk of inappropriate content ever being publicly visible. Web authors can develop pages without intimate knowledge of security policies. And administrators can instantly know the complete history of any file that has ever been published. Introduction paper documents, not web pages. Their checks and balances were inadequate when applied to our web Information is valuable. While nearly all organi- architecture, which allowed users to publish pages zations go to great expense to protect their networks, a without even a cursory review. Many staff members much smaller percentage have formal safeguards were unaware that web pages even fell under these against the accidental dissemination of sensitive infor- guidelines. We found ourselves unable to track who mation. In a web server environment where many published specific files and who had deemed those users can update different parts of a web site at once, files fit for the public. Since much of ARL:UT’s auditing the server for inappropriate content becomes research is sensitive and proprietary, we needed to an increasingly difficult system administration task. strictly regulate the flow of information from inside Most system administrators can’t easily determine our organization onto our public web server. why a particular file exists on a web server, or who in management authorized its publication. How can To ensure that only material suitable for public administrators be expected to safeguard information if viewing appeared on our web site, we needed to force they can’t tell which documents are fit for the public documents to undergo an approval process – only files and which ones aren’t? that successfully complete the process will move to the web server. Furthermore, for any publicly view- This was the situation in our organization, able file, we needed a way to determine who autho- Applied Research Laboratories, The University of rized its publication, when the file was published, and Texas at Austin (ARL:UT). for what reason. Not only did we need this informa- Motivation tion available for currently published files, but for any Since 1994, ARL:UT has had a publicly accessible previous versions as well. A web publishing system web server. Initially, we served simple, static pages. that provided us with these features would enforce our There was relatively little web server traffic, and, like information security policies by ensuring publicly many organizations at the time, we didn’t concentrate on viewable content is acceptable and well accounted for. the security of our network or our information. Our web Due Diligence server resided on our internal network with full access to In late 2001, we searched for tools that would our file server and other intranet resources, and employ- put our new web publishing plan into service. Of the ees were trusted to only serve documents that were countless managed web publishing solutions on the appropriate for public viewing. market at the time, we found none that, out-of-the- By 2001, our web presence had grown in both box: traffic and size by several orders of magnitude. Our • implemented a role-based approval process appro- site’s much larger scale made it impossible for admin- priate for our organization’s managerial structure istrators to effectively police its content. Our formal • gave us the thorough revisioning and auditing publishing policy and guidelines were conceived for capabilities we needed 2003 LISA XVII – October 26-31, 2003 – San Diego, CA 41 DryDock: A Document Firewall Giridharagopal • didn’t mix approved and unapproved content ARL:UT, however, we needed a strict web presence on our public web server that was decidedly static and non-collaborative – this • had a friendly, web-based user interface that obviated many of these systems’ features. All we managers could understand wanted was software that would allow approved docu- • were minimally intrusive for both our system ments through to the web server, while blocking all administrators and our web developers other unauthorized updates. All of these packages • used technologies we were already familiar with required so much customization that it was easier for us to build our own solution, tailored specifically to Sweeping, monolithic content management sys- our environment. tems such as Vignette StoryServer [17] were inappro- Having resigned ourselves to writing a custom priate for our environment. ARL:UT is comprised of tool, we began looking at platforms upon which we many autonomous groups that have their own web could base our application. We were particularly inter- development methods and practices. While the need for ested in the Zope [16] application server. Written in a formal information security process was universally Python, Zope has been used to build many complex recognized, a sweeping content management system and dynamic web sites. Its features include user man- was both politically infeasible and far too expensive. agement, web-based administration, searching, clus- Portal and weblog systems such as PostNuke [7], tering, and syndication. Like the aforementioned pack- Tiki [20], or Plone [8] focus on creating highly ages, however, a great deal of Zope’s dynamic compo- dynamic, interactive web sites. Thus, they frequently nentry was of no use to us, and much of Zope’s func- offer collaborative features such as article syndication, tionality fell far outside the scope of simple publica- Wiki1 systems, forums, and user commentary. At tion oversight. Though these issues weren’t intractable, when combined with Zope’s steep learning 1‘‘Wiki Wiki Web is a set of pages of information that are curve, they led us to look at other less complex and open and free for anyone to edit as they wish, through a web less ambitious platforms. interface. The system creates cross-reference hyperlinks be- tween pages automatically. Anyone can change, delete, or We settled on WebKit, the Webware for Python add to anything they see.’’ [10] [22] application server. WebKit uses a design pattern Figure 1: Web publishing with DryDock. 42 2003 LISA XVII – October 26-31, 2003 – San Diego, CA Giridharagopal DryDock: A Document Firewall fashioned after Sun’s Java Servlet [15] architecture (a server setup, and by forcing a separation between cre- paradigm we were familiar with), and includes little ating content and approving its publication. extraneous, dynamic componentry we’d have to work Figure 1 details the process of web publishing around. We could implement all of the heavy-lifting using DryDock. DryDock’s dual web server setup is functionality in plain Python and use a small number of comprised of a production machine and a staging servlets to expose a web interface – we found such an machine, both of which have identical web server con- architecture much more workable via WebKit than Zope. figurations. The production server holds content suit- Using WebKit, we devised an application that able for the public and resides in a publicly accessible would give us the security and auditing features we DMZ2 [14], while the staging server resides behind the required. Several months later, we put DryDock into firewall as part of the internal network. production. DryDock has been managing our web The staging server houses a web tree containing publishing for over a year and a half now. files under development (the development tree) and a separate tree containing files authorized for publica- What Is DryDock? tion (the export tree). The development tree is DryDock tackles our information security prob- 2Demilitarized Zone: a ‘‘neutral zone’’ between a compa- lem in two main ways: by implementing a dual web ny’s private network and the outside public network. Figure 2: The DryDock directory listing screen. 2003 LISA XVII – October 26-31, 2003 – San Diego, CA 43 DryDock: A Document Firewall Giridharagopal accessible to web authors on the internal network, option sparingly; users must be vigilant about pre- while the export tree is solely managed by DryDock. approved files’ content. Managers (content approvers) use DryDock’s Web Developer Migration web interface (Figure 2) to browse the staging server. The web interface presents an integrated and easily Migrating web developers to DryDock’s web discernible view of the development tree and the publishing process should be painless.