On Applying Controlled Natural Languages for Ontology Authoring and Semantic Annotation


Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available.

Title: On Applying Controlled Natural Languages for Ontology Authoring and Semantic Annotation
Author(s): Davis, Brian Patrick
Publication Date: 2013-02-07
Item record: http://hdl.handle.net/10379/4176
Downloaded: 2021-09-25T01:14:11Z

Some rights reserved. For more information, please see the item record link above.

On Applying Controlled Natural Languages for Ontology Authoring and Semantic Annotation

Brian Patrick Davis

Submitted in fulfillment of the requirements for the degree of Doctor of Philosophy

Primary Supervisor: Prof. Dr. Siegfried Handschuh, National University of Ireland Galway
Secondary Supervisor: Professor Hamish Cunningham, University of Sheffield
Internal Examiner: Prof. Dr. Manfred Hauswirth, National University of Ireland Galway
External Examiners: Professor Chris Mellish, University of Aberdeen; Professor Laurette Pretorius, University of South Africa

Digital Enterprise Research Institute, National University of Ireland Galway. February 2013

Declaration

I declare that the work covered by this thesis is composed by myself, and that it has not been submitted for any other degree or professional qualification except as specified.

Brian Davis

The research contributions reported by this thesis were supported in part by the Lion project, funded by Science Foundation Ireland under Grant No. SFI/02/CE1/I131, and in part by the European project NEPOMUK, No. FP6-027705.

Abstract

Creating formal data is a high initial barrier for small organisations and individuals wishing to create ontologies and thus benefit from semantic technologies. Part of the solution comes from ontology authoring, but this often requires specialist skills in ontology engineering.
Defining a Controlled Natural Language (CNL) for formal data description can enable naive users to develop ontologies using a subset of natural language. However, despite the benefits of CNLs, users are still required to learn the correct syntactic structures in order to use the controlled language properly. This can be time consuming, annoying, and in certain cases may prevent uptake of the tool. The reversal of the CNL authoring process involves generation of the controlled language from an existing ontology using Natural Language Generation (NLG) techniques, which results in a round-trip ontology authoring environment: one can start with an existing imported ontology, (re)produce the CNL using NLG, modify or edit the text as required, and subsequently parse the text back into the ontology using the CNL authoring environment. By introducing language generation into the authoring process, the learning curve associated with the CNL can be reduced. While the creation of ontologies is critical for the Semantic Web, without a critical mass of richly interlinked metadata this vision cannot become a reality. Manual semantic annotation is a labor-intensive task requiring training in formal ontological descriptions for the otherwise non-expert user. Although automatic annotation tools attempt to ease this knowledge acquisition barrier, their development often requires access to specialists in Natural Language Processing (NLP). This challenges researchers to develop user-friendly annotation environments. While CNLs have been applied to ontology authoring, little research has focused on their application to semantic annotation. In summary, this research applies CNL techniques to both ontology authoring and semantic annotation, and provides solid empirical evidence that for certain scenarios applying CNLs to both tasks can be more user friendly than standard ontology authoring and manual semantic annotation tools respectively.
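The round-trip authoring cycle described in the abstract (import an ontology, verbalise it into CNL text with NLG, let the user edit the text, parse it back) can be sketched with a deliberately tiny, hypothetical controlled language of one sentence pattern. This is an illustrative assumption only: the thesis's actual CLOnE and ROA systems are built on GATE with a far richer grammar, and the tuple-based "ontology" below stands in for real OWL/RDF axioms.

```python
import re

# One-rule controlled language: "<Class> is a type of <Class>." maps to a
# subclass axiom. (Hypothetical grammar, not the CLOnE grammar itself.)
PATTERN = re.compile(r"(\w+) is a type of (\w+)\.")

def parse_cnl(sentence: str) -> tuple[str, str, str]:
    """CNL -> ontology: turn one controlled sentence into a (sub, rel, sup) axiom."""
    m = PATTERN.fullmatch(sentence.strip())
    if m is None:
        raise ValueError(f"Not in the controlled language: {sentence!r}")
    sub, sup = m.groups()
    return (sub, "subClassOf", sup)

def generate_cnl(axiom: tuple[str, str, str]) -> str:
    """Ontology -> CNL (the NLG half): verbalise an axiom back into editable text."""
    sub, rel, sup = axiom
    assert rel == "subClassOf"
    return f"{sub} is a type of {sup}."

# Round trip: import -> verbalise -> (user edits the text) -> parse back.
ontology = {parse_cnl("Student is a type of Person.")}
text = "\n".join(generate_cnl(a) for a in ontology)   # what the user sees and edits
edited = text + "\nLecturer is a type of Person."     # user appends a new sentence
ontology = {parse_cnl(line) for line in edited.splitlines()}
print(sorted(ontology))  # [('Lecturer', 'subClassOf', 'Person'), ('Student', 'subClassOf', 'Person')]
```

The point of the round trip is visible even at this scale: the user never touches the formal representation, only the generated text, which lowers the learning curve the abstract describes.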
Acknowledgements

I would first like to thank Professor Hamish Cunningham, University of Sheffield, for giving me the opportunity to pursue postgraduate study and start an academic career at DERI, NUI Galway. In particular, I want to thank him and the GATE team for their support and invaluable training. Most importantly, I would like to thank my supervisor, Prof. Dr. Siegfried Handschuh, for his patience, guidance and mentorship with respect to scientific writing, research project management, teaching, proposal writing, research leadership etc., and for giving me countless opportunities to develop my research skills and academic career. Other academics I would like to thank include Prof. Dr. Stefan Decker, for his inspiration, Dr. Conor Hayes for his practical advice on research project management, and Dr. Paul Buitelaar for being a veritable treasure chest of natural language processing experience. I would like to thank Prof. Chris Mellish, Prof. Laurette Pretorius (who made the long journey from South Africa) and Prof. Dr. Manfred Hauswirth for ensuring that the final submission is in pristine condition, such that in later years I will be able to glance through this thesis with pride! DERI (now INSIGHT) has been and continues to be a stimulating environment to study and work in. I have met so many wonderful, intelligent and helpful people and made so many friends and acquaintances over the years that to list them all would be impossible! I want in particular to thank the old Nepomuk/SMILE team past and present: Tudor, Knud, Laura, Cipri, Georgeta, Ioana, Ismael, Simon, VinhTuan, Alex, Pradeep, Vit, Manuela, Keith, Judy and Jeremy, who have all gone on to bigger and better things! A particular thanks to Behrang for always raising the NLP bar with his vast knowledge and wonderful discussions! Thanks to Luk, Laleh and Bahareh for tea, friendship and the occasional needed kick in the backside!
Thanks to the DERI Operations, Administrative and Technical team over the years, without which DERI would simply have ground to a halt: Hilda, Claire, Brian Wall, Ger and Andrew (and Caragh!) Gallagher, Carmel, Michelle and Sylvia and all the Irish, not forgetting Ed and Sean! You have put up with a lot over the years! To my other baby, OCD Ireland: special thanks to the team past and present and my good friend Leslie for her wisdom over the years and for taking on so much extra work in the last year so I could finish this thesis! A special mention to Prof. Fionnuala as well! To all my childhood friends in Dublin, Neil, Ian and Ross, Eoin and Mark, who had no idea what I was studying and why it took so long, but have made me feel extremely proud! To the staff of the former M. Davis and Co. for always making me feel welcome, not forgetting Rose and the infamous Mr Kitt! I want to thank my mother and father, Louis and Mary, for their unconditional love, support, understanding and patience over the years, as well as my brother Colm and sister Louise. I want to acknowledge Mark and Yvonne, and my army of nieces and nephews, Caragh, Eoghan, Amelie and my Goddaughters Niamh and Ella, and all the extended family!! I also want to thank my fiancée Alessandra, who has always been a steady force of love, patience and much needed motivation to finish this thesis, even when I resisted! Finally, I want to mention my father, Louis Davis, who passed away unexpectedly a year before this thesis was defended. I miss you very much Dad, and I wish you were here to celebrate with me.

Dedicated to my father, Louis Michael Davis (1940 - 2012).

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables

I Prelude
1 Introduction
1.1 Motivation and Problem Description
1.2 Research Questions
1.3 Contributions
1.4 Thesis Structure

II Foundations
2 Human Language Technology and the Semantic Web
2.1 The Semantic Web and Linked Data
2.2 Human Language Technology
2.3 Information Extraction
2.4 Ontology Based Information Extraction (OBIE)
2.5 Natural Language Generation
2.6 Natural Language Generation from Ontologies
2.7 Conclusions
3 Controlled Natural Languages and Semantic Annotation
3.1 Controlled Natural Languages for the Semantic Web
3.2 Semantic Annotation
3.3 Conclusions

III Core Research
4 CLOnE - Controlled Language for Ontology Editing
4.1 Introduction
4.2 Requirements and Implementation
4.3 Evaluation
4.4 Related Work
4.5 Conclusion
5 Round Trip Ontology Authoring
5.1 Introduction
5.2 Design and Implementation
5.3 Evaluation
5.4 Related Work
5.5 Conclusion & Discussion
6 Towards Controlled Natural Language for Semantic Annotation
6.1 Introduction
6.2 CLANN: Design and Implementation
6.3 Evaluation
6.4 Related Work
6.5 Conclusion and Future Work

IV Conclusion and Future Work
7 Conclusions and Future Work
7.1 Research Contributions
7.2 Open Questions
7.3 Ongoing Work with CLANN
7.4 Future Work
7.5 Summary

Bibliography

A Evaluation Documents for CLIE/CLOnE
A.1 Training Manual