Paper on Web Management


Managing the Web at its birthplace

Prepared for the South African Annual Conference on Web Applications (1999-09-09)
To be published in the South African Journal of Information Management

Maria P. Dimou - CERN
Dariusz J. Kogut (for the TORCH section)

Abstract

CERN, the European Laboratory for Particle Physics, is the place where the Web was born. It has the longest tradition in Web-related development; however, its main role is to do basic physics research. This ambiguous combination of interests makes the task of managing the Web at CERN most challenging. The large number of Web servers to consolidate, the users' high expectations of performance and centrally supported tools, and the expression of individual creativity within some corporate-image standards create a non-trivial Intranet operation. In this paper the configuration of the CERN central servers is discussed. The choice of commercial and public-domain software for the server, search engine, Web discussion forums and log statistics is presented. The guidelines and recommendations on Web authoring and site management are listed, as well as the criteria that led us to these conclusions.

Introduction

Managing a very large Intranet - and, incidentally, the oldest one - may seem at first glance to be a straightforward technical issue, but it is more than that. It touches upon very interesting human and social aspects such as:

• who can publish
• what can be published
• how to promote important information
• what is important information
• who is responsible for the quality of the information
• what is creative and what is bad taste
• how independent the data structures are from the content
• how uniform the Intranet appears

Our Intranet is far from being the most attractive and concise; however, the following statements are all true and explain or justify the situation:

• Our "lack of discipline" also has its advantages, as it led to achievements like the World Wide Web.
• The main objective of the CERN authors and web server managers is to do High Energy Physics research, so they rarely focus on their sites' aesthetics.
• Physicists are used to building entire projects by themselves and to solving problems across disciplines (physics, engineering, computing); they therefore dislike conforming to restrictions on corporate image that are accepted without problem in business environments.

Experience has shown that trying to over-centralise and control the presentation, location, platform or number of the CERN web pages, sites and servers is a utopia, and in any case contrary to the design and philosophy of the Web, the "universe of network-accessible information" (Tim Berners-Lee). What we found realistic and valuable was to offer solid and open solutions that would prove performant and simple to adopt or comply with. These solutions were made available on the central web server www.cern.ch for those who have no resources to set up a service of their own. They were also documented as tools, courses and guidelines for those who want to stay independent and run their own service, with the possibility of getting advice. This work covered the period 1997-98 and is summarised here. As people and strategies change, there is no commitment that the policy (or the URLs!) of this paper will remain valid in the distant future.

The choice of web server

CERN was, of course, using the home-made CERN httpd from the Web's birth until June 1998, when the central server www.cern.ch converted to Apache (some other servers on-site had done so earlier).

It was striking to observe the performance degradation during the last year of use of the CERN httpd, and it was very hard for us to identify the reason for the blockage. We upgraded the hardware, doubled the CPUs, added memory, increased the AFS cache and monitored the impact of the CGI scripts running on the host, only to witness in practice that the CERN httpd simply does not scale beyond a certain number of hits (~100K requests/day, or on average 4 requests/sec). This was suggested by Jes Sorensen (CERN) and proved to be true after we installed Apache and settled on the following values for the httpd.conf tuning parameters (collected into the configuration sketch below):

• MaxKeepAliveRequests 100
• MaxSpareServers 40
• MaxClients 250
• MaxRequestsPerChild 1000
• MinSpareServers 20
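Gathered in one place, and purely as an illustration, the corresponding fragment of an Apache 1.3-era httpd.conf would look roughly as follows. Only the five numeric values come from the list above; the surrounding directives and comments are assumptions added for context, not a copy of the actual CERN configuration:

    # Process-management section of httpd.conf (sketch, Apache 1.3 pre-fork model).
    # Only the five numeric values are the ones quoted in the text above.
    KeepAlive            On
    MaxKeepAliveRequests 100    # requests served per persistent connection
    MinSpareServers      20     # idle children kept ready for load peaks
    MaxSpareServers      40     # surplus idle children above this are killed off
    MaxClients           250    # hard ceiling on simultaneous child processes
    MaxRequestsPerChild  1000   # recycle children periodically to limit memory growth

In the pre-forking model of Apache 1.3, MaxClients is the directive that ultimately bounds memory use, since every simultaneous request occupies one child process.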
Apache proved to be a very powerful Web server. What we appreciated most in Apache were the virtual web servers, the Server Side Includes and the handling of page protections. The virtual web servers allowed us to host web sites that no longer had the resources to maintain their own system but wanted to keep their identity (their URLs) unchanged. The Server Side Includes helped us create and maintain a uniform look on our service pages; whenever site-wide changes were needed, the update had to be done in only one place.

Most information owners want to know who visits their pages and ask for access to the Apache logs so that they can extract the information that concerns them. We chose the analog package for this, making available to the users a list of fields and tailorable parameters from which they can select, so that they can produce statistics reports for the period, the criteria and the format that suit them.

Another preoccupation every system manager has when running a web server is the potential security risk of CGI scripts. To ensure some safety we decided to keep all scripts in one place. Their authors are not allowed to install them there themselves; they must submit them to a programmer of the web support team, who checks for potential security holes. Some documentation was recently made available on typical errors in Perl involving shell commands and file I/O.
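As an illustration of the kind of error that documentation warns about, the sketch below contrasts a shell command built from unchecked user input with a safer variant. It is a minimal example, not taken from any actual CERN script: the 'user' form field, the finger command and the validation pattern are all invented for the illustration.

    #!/usr/bin/perl -wT
    # -T switches on taint checking: data coming from the request may not
    # reach the shell or the file system until it has been validated.
    use strict;
    use CGI;

    # perlsec recipe: taint mode insists on a trustworthy environment
    # before any external program is run, even with an absolute path.
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $q    = CGI->new;
    my $user = $q->param('user');      # hypothetical form field
    print $q->header('text/plain');

    # Dangerous (shown for contrast): interpolating the raw value into a
    # shell command lets input like "alice; rm -rf ~" run extra commands.
    #   system("finger $user");

    # Safer: untaint with a strict pattern, then bypass the shell by
    # passing the program and its argument as separate list elements.
    if (defined $user && $user =~ /^(\w{1,16})\z/) {
        system('/usr/bin/finger', $1) == 0
            or print "finger failed\n";
    } else {
        print "Invalid user name\n";
    }

The same discipline - validate the input first, then keep it away from the shell - applies equally to file I/O, where an unchecked name such as "../../etc/passwd" could otherwise escape the intended directory.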
(Co*)authoring

(*) "L'enfer, c'est les autres" ("Hell is other people") - J.-P. Sartre

The challenges in this area depend on:

• the site's size (one page, a few pages, hundreds of pages?)
• the documents' history (newly created, legacy papers)
• the documents' "mission" (to print or to link)
• the documents' complexity (images, dynamic content)
• the number of (co-)authors
• their working platform
• their preferred authoring tool (the package used to edit pages and manage web sites)

The above do not concern page or site style, fonts and content, but only version consistency, document portability and write permissions. We started by making an inventory of the tools available on the market for page editing, site management and image processing. It was immediately obvious that keeping it up to date, evaluating all of the packages or supporting more than a handful of them is a never-ending task. We therefore evaluated a subset of these products against a list of criteria and concluded that:

• Legacy documents that traditionally existed on paper and should continue to be printable with a rigid format and page numbers should be published as PDF or PostScript.
• Users who have a very small number of pages to write should be left free to use their preferred editing tool and save the pages in a web-viewable format (e.g. HTML or PDF).
• Unix users are usually happy to write plain HTML and should be left alone until their sites become relatively large (over 50 pages).
• PC and Mac users enjoy the availability of a large number of editing and management tools and should choose the most standard and open ones, i.e. those creating portable code across platforms and server software.

The products we actually examined more carefully were:

• Macromedia Dreamweaver (for Windows and Mac)
• GoLive CyberStudio (excellent, but Mac-only at the time)
• MS FrontPage 98 (very good if one stays within Microsoft products)
• NetObjects Fusion (good for very large sites of several hundred pages, but expensive)
• Adobe PageMill (limited functionality)

Their page-editing facilities are very similar: link insertion and list, template and table creation are all quite easy. The worries start when proprietary add-ons reduce the portability of the end product. We needed to come up with a recommendation in a finite amount of time and suggested Macromedia Dreamweaver for medium-sized sites, not for its perfection (there are bugs!) but for its open, straightforward, W3C standards-compliant features. Some of them, neither exhaustive nor exclusive, are:

• HTML hiding
• Clean HTML production
• Easy link, table, image, font and metadata insertion/change
• Easy form creation
• Templates (local or remote) and Library elements
• Page result preview in the user's preferred browsers
• Preferred external editor invocation
• Link validity checking
• Easy site definition and uploading to the server
• Easy maintenance of any site on a remote Unix or Windows system
• Automatically built site map
• Safe editing by more than one author
• Help with scripting, JavaScript and DHTML
• Server Side Includes (SSI) support (also locally)
• Common plug-ins' insertion
• Cascading Style Sheets (CSS) support
• XML support / parser

From the page-usability point of view, one of the most serious problems we have, due to the large number of authors (over 1,000) and the lack of coordination, is the quality of the page content. Pages are written and forgotten, authors leave the organisation, and users do not know whom to contact to find out what is still valid. Some of our collaborators in the 20 Member States of the laboratory, or in the rest of the world, have modest network connectivity and cannot use sites heavily loaded with images and animations. For these reasons we issued guidelines to authors explaining that every page should have (see the sketch after this list):

• a signature for the readers' information
• a mail address for feedback
• a date when the page might expire or need review
• some concern for users with slow lines (no overloading of the page with pictures, etc.)
• the appropriate metadata for promoting the page to good ranks in search results
• a robots.txt file at the level of the site's document root that prevents search engines from unauthorised indexing
• good content in the TITLE tag that corresponds to the page's mission
• ALT attributes on IMG elements that provide a text description of each image (vital for interoperability with speech-based and text-only user agents).
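As a compact illustration of these guidelines, the following page skeleton carries the recommended elements in HTML of the period. It is only a sketch: the title, the author's name, the mail address and the dates are placeholders invented for the example, not CERN content.

    <!-- Sketch of a page following the authoring guidelines above.
         All names, addresses and dates are invented placeholders. -->
    <HTML>
    <HEAD>
      <TITLE>Web Services - CGI Script Submission Procedure</TITLE>
      <META NAME="description" CONTENT="How to submit a CGI script for security review">
      <META NAME="keywords"    CONTENT="CGI, security, web support">
    </HEAD>
    <BODY>
      <H1>CGI Script Submission Procedure</H1>
      <IMG SRC="procedure.gif" ALT="Flow chart of the submission procedure">
      <P>...page content, kept light for readers on slow lines...</P>
      <HR>
      <ADDRESS>
        Maintained by A. Author -
        <A HREF="mailto:[email protected]">[email protected]</A><BR>
        Last revised 1999-01-15; review due 1999-12-31
      </ADDRESS>
    </BODY>
    </HTML>

The robots.txt file mentioned in the list is separate from any page: it is a plain-text file at the document root whose Disallow lines name the paths that well-behaved crawlers should not index.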