
Spinning the

by TONY JOHNSON

Tony spins a tale of mystery and intrigue as he takes us on a futuristic journey around the electronic world known as "the Web."

IF THE IMPORTANCE of developments were to be measured solely in terms of their popular press coverage, then probably the most significant development to have sprung from the world of high energy physics in the last few years would not be the discovery of the top quark, or even the demise of the SSC, but rather the development of the World Wide Web. This tool (often referred to as WWW or simply as “the Web”) is able not only to access the entire spectrum of information available on the Internet, but also to present it to the user through a single, consistent, easy-to-use interface.

2 FALL 1994

This has opened up the network, previously viewed as the home of computer hackers (and crazed scientists), to a new audience, leading to speculation that the Internet could be the precursor to the much talked about “Information Super Highway.”

The ideas behind the World Wide Web were formulated at CERN in 1989, leading to a proposal submitted in November 1990 by Tim Berners-Lee and Robert Cailliau for a “universal hypertext system.” In the four years since the original proposal the growth of the World Wide Web has been phenomenal, expanding well beyond the high energy physics community into other academic disciplines, into the world of commerce, and even into people’s homes.

This article describes the basic concepts behind the World Wide Web, traces its development over the past four years with examples of its use both inside and outside of the high energy physics community, and goes on to describe some of the extensions under development as part of the World Wide Web project.

WORLD WIDE WEB CONCEPTS

The World Wide Web is designed around two key concepts: hypertext documents and network-based information retrieval. Hypertext documents are simple documents in which words or phrases act as links to other documents. Typically hypertext documents are presented to the user with text that can act as a link highlighted in some way, and the user is able to access the linked documents by clicking with a mouse on the highlighted areas.

The World Wide Web extends the well-established concept of hypertext by making it possible for the destination document to be located on a completely different computer from the source document, either one located anywhere on the network. This was made possible by exploiting the existing capabilities of the Internet, a world-wide network of interconnected computers developed over the preceding 20 years, to establish a rapid connection to any named computer on the network.

To achieve this, the World Wide Web uses a client-server architecture. A user who wants to access information runs a World Wide Web client (sometimes referred to as a browser) on his local computer. The client fetches documents from remote network nodes by connecting to a server on that node and requesting the document to be retrieved. A document typically can be requested and fetched in less than a second, even when it resides on the other side of the world from the requester. (Or at least it could be in the early days of the Web; one of the drawbacks of the enormous success of the Web is that sometimes transactions are not as fast now as they were in the earlier, less heavily trafficked days. One of the challenges of the Web’s future is to overcome these scaling problems.)

The client-server model offers advantages to both the information provider and the consumer. The information provider is able to keep control of the documents he maintains by keeping them on his own computer. Furthermore the documents can be maintained by the information provider in any form, so long as they can be transformed by the server software into the format the client software expects to receive. This model can naturally be extended to allow documents to be dynamically created in response to a request from users, for example by querying a database and translating the result of the query into a hypertext document.

From the information consumer’s perspective, all the documents on the Web are presented in the form of hypertext. The consumer remains blissfully ignorant of how the documents are maintained by the information provider and, unless he really wants to know, from where the documents are being accessed.

GROWTH OF THE WEB

The initial implementation of the Web client at CERN was for the NeXT platform. This earliest browser was able to display documents using multiple fonts and styles and was even able to edit documents, but access was limited to users fortunate enough to have a NeXT box on their desks. This was followed by development of the CERN “linemode” browser, which could run on many platforms but which displayed its output only on character-based terminals. These early browsers were followed by the first browsers designed for X-Windows, Viola developed at the University of California, Berkeley, and Midas developed at the Stanford Linear Accelerator Center.

Initially the growth of the World Wide Web was relatively slow. By the end of 1992 there were about 50 hypertext transfer protocol (HTTP) servers. At about the same time,

Gopher, a somewhat similar information retrieval tool to WWW but based on menus and plain text documents rather than hypertext, was expanding rapidly with several hundred servers.

During 1993 the situation changed dramatically, driven in large part by the development of the Mosaic client by a talented and extremely enthusiastic group at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. The Mosaic client for World Wide Web was originally developed for X-Windows under Unix, with subsequent versions released for both the Macintosh and PC platforms.

The Mosaic client software added a few new key features to the World Wide Web: the ability to display embedded images within documents, enabling authors to greatly enhance the aesthetics of their documents; the ability to incorporate links to simple multimedia items such as short movie and sound clips; and the ability to display forms. Forms greatly enhanced the original search mechanism built into WWW by allowing documents to contain fields that the user could fill in, or select from a list of choices, before clicking on a link to request further information. The introduction of forms to the WWW opened up new applications in which the World Wide Web acts not only as a way of viewing static documents, but also as a way of interacting with the information in a simple but flexible manner, enabling the design of Web-based graphical interfaces to databases and similar applications.

During 1993 the usage of WWW began to grow exponentially. As new people discovered the Web they often became information providers themselves, and as more information became available new users were attracted to the Web. The graph on this page shows the growth in World Wide Web (or more accurately HTTP) traffic over the National Science Foundation backbone since early 1993, in comparison to Gopher and FTP traffic during the same period (FTP, the File Transfer Protocol, was one of the earliest protocols developed for the Internet, and is still the most widely used for transferring large files). While the growth in WWW traffic is enormous, it is worth noting that it is still not the dominant protocol; in fact, FTP, e-mail and NNTP (Network News Transfer Protocol) traffic are all substantially larger.

Owing to the distributed management of the Internet and the World Wide Web, it is very difficult to obtain hard numbers about the size of the Web or the number of users. (The number of users on the Internet, often estimated to be in the tens of millions, is itself a contentious issue, with some estimates claiming this number to be an overestimate by perhaps as much as an order of magnitude.) One illustration of the size of the Web came in early 1994 when a server was set up to provide information and up-to-the-minute results from the Winter Olympics being held in Lillehammer, Norway. The implementation of the server wasn’t started until the day before the Olympics were scheduled to start, but two weeks later the server (together with a hastily arranged mirror server in the United States) had been accessed 1.3 million times, by users on somewhere between 20,000 and 30,000 different computers in 42 countries.

NCSA now estimates that more than a million copies of the Mosaic software have been taken from their distribution site, and approximate counts of the number of HTTP servers indicate there are more than 3000 servers currently operating (Stanford University alone has over 40 HTTP servers, not including one for the Stanford Shopping Center!).

[Graph: NSFNET Bytes per Month, logarithmic scale, plotted by month; curves labeled WWW and FTP.]

The dramatic increase of World Wide Web usage over the past year and a half is illustrated. While the growth rate is phenomenal, more traditional uses of the network such as file transfer and e-mail still dominate.

BEAM LINE 3

HISTORY OF THE WORLD WIDE WEB

March 1989: First proposal written at CERN by Tim Berners-Lee.
October 1990: Tim Berners-Lee and Robert Cailliau submit revised proposal at CERN.
November 1990: First prototype developed at CERN for the NeXT.
March 1991: Prototype linemode browser available at CERN.
January 1991: First HTTP servers outside of CERN set up including servers at SLAC and NIKHEF.
July 1992: Viola browser for X-windows developed by P. Wei at Berkeley.
November 1992: Midas browser (developed at SLAC) available for X-windows.
January 1993: Around 50 known HTTP servers.
August 1993: O’Reilly hosts first WWW Wizards Workshop in Cambridge, Mass. Approximately 40 attend.
February 1993: NCSA releases first alpha version of “Mosaic for X.”
September 1993: NCSA releases working versions of Mosaic browser for X-windows, PC/Windows and Macintosh platforms.
October 1993: Over 500 known HTTP servers.
December 1993: John Markoff writes a page and a half on WWW and Mosaic in the New York Times business section. Guardian (UK) publishes a page on WWW.
May 1994: First International WWW Conference, CERN, Geneva, Switzerland. Approximately 400 attend.
June 1994: Over 1500 registered HTTP servers.
July 1994: MIT/CERN agreement to start WWW Organization.
October 1994: Second International WWW Conference, Chicago, Illinois, with over 1500 attendees.

As the size of the Web has increased, so has the interest in the WWW from outside the academic community. One of the first companies to take an active interest in the World Wide Web was the publisher O’Reilly and Associates. For over a year they have provided an online service, the Global Network Navigator, using the World Wide Web. This includes regularly published articles about developments in the Internet, the “Whole Internet Catalog,” an index of information available on the Web, a travel section, business section, and even daily online comics and advertising, all illustrated with professionally designed icons.

The Global Network Navigator is now only one of many examples of commercial publishers making information available on the Web, including a number of print magazines and newspapers which are available partially or in their entirety on the Web.

Another interesting example of commercial use of the World Wide Web is the CommerceNet organization. This organization, based in northern California and funded by a consortium of large high technology companies with matching funds of $6 million from the U.S. government’s Technology Reinvestment Project, aims to actively encourage the development of commerce on the Internet using WWW as one of its primary enabling technologies. CommerceNet aims to encourage companies to do business on the Internet by making catalogs available and accepting electronic orders, and also by encouraging electronic collaboration between companies.

One specific way that CommerceNet is enhancing WWW is by the proposed introduction of a “secure-HTTP,” which would enable encrypted transactions between clients and servers. This would ensure privacy, but perhaps more interestingly would also enable the use of digital signatures, effectively ensuring that when you fill in an order form on the Internet and submit it, it really goes to the company you believe you are ordering from (and only them), and that they know when they receive the order that it really came from you (and can prove it at a later date if necessary). This mechanism also begins to address a problem of great interest to commercial publishers: that of billing for information accessed through the Web. CommerceNet has ambitious plans to incorporate thousands of member companies in the first year or two, primarily in Northern California, but eventually to expand towards the much broader horizons of the Internet.

USES OF WORLD WIDE WEB IN HIGH ENERGY PHYSICS

While the Web has spread far from its original HEP roots, it remains an extremely useful tool for disseminating information within the widely distributed international high energy physics community. One example of the use of World Wide Web within HEP is the access provided to the SPIRES databases at SLAC, a set of databases covering a wide range of topics of relevance to HEP such as experiments, institutes, publications, and particle data.

The largest of the SPIRES databases is the HEP preprints database, containing over 300,000 entries. In 1990 the only way to access the SPIRES databases was by logging in to the IBM/VM system at SLAC where the database resides, or by using the QSPIRES interface which could work only from remote BITNET nodes. In either case to access information you had to have at least a rudimentary knowledge of the somewhat esoteric SPIRES query language.

Since 1990, the introduction of the World Wide Web, coupled with the widespread adoption of Bulletin Boards as the primary means of distributing computer-readable versions of HEP preprints, has revolutionized the ease of access and usefulness of the information in the SPIRES databases.

The SPIRES WWW server was one of the very first WWW servers set up outside of CERN and one of the first to illustrate the power of interfacing WWW to an existing database, a task greatly simplified by WWW’s distributed client-server design. Using this interface it is now possible to look up papers within the database without any knowledge of the SPIRES query language, using simple fill-out forms (for SPIRES aficionados it is possible to use the SPIRES query language through the Web too). Access to more advanced features of SPIRES, such as obtaining citation indexes, can also be performed by clicking on hypertext links. Since the access to the database is through WWW it can be viewed from anywhere on the Internet.

In addition, by linking the entries in the SPIRES databases to the computer-readable papers submitted to electronic Bulletin Boards at Los Alamos and elsewhere, it is possible to follow hypertext links from the database search results to access either the abstract of a particular paper, or the full text of the paper, which can then be viewed online or sent to a nearby printer.

The WWW interface to SPIRES has now been extended to cover other databases including experiments in HEP, conferences, software, institutions, and information from the Lawrence Berkeley Laboratory Particle Data Group. There are now over 9000 publications available with full text, and more than 40,000 accesses per week to the SPIRES databases through WWW.

Another area in which WWW is ideally suited to HEP is in providing communication within large collaborations whose members are now commonly spread around the world. Most HEP experiments and laboratories today maintain Web documents that describe both their mission and results, aimed at readers from outside the HEP field, as well as detailed information about the experiment designed to keep collaborators up-to-date with data-taking, analysis and software changes.

In addition large HEP collaborations provide an ideal environment for trying the more interactive features of WWW available now, as well as those to be introduced in the future. An example is the data monitoring system set up by the SLD collaboration at SLAC. The facility uses WWW forms to provide interactive access to databases containing up-to-date information on the performance of the detector and the event filtering and reconstruction software.

Information can be extracted from the databases and used to produce plots of relevant data as well as displays of reconstructed events. Using these tools collaborators at remote institutes can be directly involved in monitoring the performance of the experiment on a day-by-day basis.

FUTURE DEVELOPMENTS

The size of the Web has increased by several orders of magnitude over the last two years, producing a number of scaling problems. One of the most obvious is the problem of discovering what is available on the Web, or finding information on a particular topic of interest.

A number of solutions to this problem are being tried. These range from robots which roam the Web each day sniffing out new information and inserting it into large databases which can themselves be searched through the Web, to more traditional types of digital libraries, where librarians for different subject areas browse the Web, collate information, and produce indexes of their subject areas. A number of indexes are already available along these lines, or spanning the space in between these two extremes. While these are quite effective, none of them truly solves the problems of keeping up-to-date with a constantly changing Web of information and truly being able to separate the relevant from the irrelevant. This is an active area of research at many sites, together with other problems associated with scalability of the Web, such as preventing links from breaking when information moves, separating up-to-date information from obsolete information, and maintaining multiple versions of documents, perhaps in different languages.

One new area of research is the development of a new Virtual Reality Markup Language (VRML). The idea behind VRML is to emulate the success of hypertext markup language (HTML) by creating a very simple language able to represent simple virtual reality scenarios. For example, the language might be able to describe a conference room by specifying the location of tables, chairs, and doors within a room. As with HTML the idea would be to have a language which can be translated into a viewable object on almost any platform, from small PC’s to high-end graphic workstations. While the amount of detail available would vary between the platforms, the essential elements of the room would be the same between the platforms. Users would be able to move between rooms, maybe by clicking on doors, would be able to see who else was in the room, and would be able to put documents from their local computer “on to the conference table” from where others could fetch the document and view it.

This type of model could be further enhanced by the ability to include active objects into HTML or VRML documents. Using this technique, already demonstrated in a number of prototypes, active objects such as spreadsheets or data plots can be embedded into documents. While older browsers would display these objects merely as static objects, newer browsers would allow the user to interact with the object, perhaps by rotating a three dimensional plot, or

World Wide Web Protocols

TECHNICALLY the World Wide Web hinges on three enabling protocols, the HyperText Markup Language (HTML) that specifies a simple markup language for describing hypertext pages, the Hypertext Transfer Protocol (HTTP) which is used by Web browsers to communicate with Web servers, and Uniform Resource Locators (URL’s) which are used to specify the links between documents.

Hypertext Markup Language

The hypertext pages on the Web are all written using the hypertext markup language (HTML), a simple language consisting of a small number of tags to delineate logical constructs within the text. Unlike a procedural language such as PostScript (move 1 inch to the right, 2 inches down, and create a green WWW in 15 point bold Helvetica font), HTML deals with higher level constructs such as “headings,” “lists,” “images,” and so on. This leaves individual browsers free to format text in the most appropriate way for their particular environment; for example, the same document can be viewed on a Mac, on a PC, or on a linemode terminal, and while the content of the document remains the same, the precise way it is displayed will vary between the different environments.

The earliest version of HTML (subsequently labeled HTML1) was deliberately kept very simple to make the task of browser developers easier. Subsequent versions of HTML will allow more advanced features. HTML2 (approximately what most browsers support today) includes the ability to embed images in documents, lay out fill-in forms, and nest lists to arbitrary depths. HTML3 (currently being defined) will allow more advanced features such as mathematical equations, tables, and figures with captions and flow-around text.

Hypertext Transfer Protocol

Although most Web browsers are able to communicate using a variety of protocols, such as FTP, Gopher and WAIS, the most common protocol in use on the Web is that designed specifically for the WWW project, the Hypertext Transfer Protocol. In order to give the fast response time needed for hypertext applications, a very simple protocol which uses a single round trip between the client and the server is used.

In the first phase of an HTTP transfer the browser sends a request for a document to the server. Included in this request is the description of the document being requested, as well as a list of document types that the browser is capable of handling. The Multipurpose Internet Mail Extensions (MIME) standard is used to specify the document types that the browser can handle, typically a variety of video, audio, and image formats in addition to plain text and HTML. The browser is able to specify weights for each document type, in order to inform the server about the relative desirability of different document types.

In response to a query the server returns the document to the browser using one of the formats acceptable to the browser. If necessary the server can translate the document from the stored format into a format acceptable to the browser. For example the server might have an image stored in the highly compressed JPEG image format, and if a browser capable of displaying JPEG images requested the image it would be returned in this format; however, if a browser capable of displaying images only if they are in GIF format requested still the same document the server would be able to translate the image and return the (larger) GIF image. This provides a way of introducing more sophisticated document formats in the future but still enabling an older or less advanced browser to access the same information.

In addition to the basic “GET” transaction described above, HTTP is also able to support a number of other transaction types, such as “POST” for sending the data for fill-out forms back to the server and “PUT” which might be used in the future to allow authors to save modified versions of documents back to the server.

Uniform Resource Locators

The final keys to the World Wide Web are the URLs which allow the hypertext documents to point to other documents located anywhere on the Web. A URL consists of three major components:

<protocol>://<node>/<location>

The first component specifies the protocol to be used to access the document, for example, HTTP, FTP, or Gopher. The second component specifies the node on the network from which the document is to be obtained, and the third component specifies the location of the document on the remote machine. The third component of the URL is passed without modification by the browser to the server, and the interpretation of this component is performed by the server, so while a document’s location is often specified as a Unix-like file specification, there is no requirement that this is how it is actually interpreted by the server.
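The three-part structure of a URL can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: it relies on the urlsplit function from Python's standard urllib.parse module, the helper name split_url is hypothetical, and the address used is the SLAC home page quoted elsewhere in this article.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into protocol, node, and location.

    The location is returned exactly as written (including its leading
    slash); a browser passes it on unmodified and leaves its
    interpretation to the server.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


protocol, node, location = split_url("http://www-slac.slac.stanford.edu/FIND/slac.html")
print(protocol)   # http
print(node)       # www-slac.slac.stanford.edu
print(location)   # /FIND/slac.html
```

Nothing in the sketch assumes that the location names a file; to the browser it is an opaque string that only the server gives meaning to.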

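[Diagram: Browsers (Dumb, PC, Mac, X, NeXT) reach HTTP, FTP, Gopher, and NNTP servers and gateways across the Internet through an addressing scheme, a common protocol, and format negotiation.]

The ability of Web browsers to communicate and negotiate with remote servers allows users on a wide variety of platforms to access information from many different sources around the world.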
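The format negotiation between browser and server can be sketched in a few lines of Python. This is a simplified model, not the HTTP wire format: the function name choose_format, the dictionary of weights, and the conversions table are all hypothetical, and the rule of preferring the stored format on a tie is an assumption of the sketch.

```python
def choose_format(accepted, stored, conversions):
    """Pick the MIME type a server should return, or None if none fits.

    accepted:    dict of MIME type -> weight stated by the browser
    stored:      MIME type in which the server holds the document
    conversions: dict of stored type -> set of types it can be translated into
    """
    available = {stored} | conversions.get(stored, set())
    # Sorted so that the choice does not depend on set iteration order.
    candidates = [t for t in sorted(available) if accepted.get(t, 0) > 0]
    if not candidates:
        return None
    # Highest browser weight wins; on a tie, prefer the stored format
    # because it needs no translation.
    return max(candidates, key=lambda t: (accepted[t], t == stored))


# Assumed capability: this server can translate JPEG images into GIF.
conversions = {"image/jpeg": {"image/gif"}}

jpeg_browser = choose_format({"image/jpeg": 1.0, "image/gif": 0.5}, "image/jpeg", conversions)
gif_browser = choose_format({"image/gif": 1.0}, "image/jpeg", conversions)
text_browser = choose_format({"text/plain": 1.0}, "image/jpeg", conversions)

print(jpeg_browser)   # image/jpeg
print(gif_browser)    # image/gif
print(text_browser)   # None
```

The second call corresponds to the JPEG and GIF case described in the protocols sidebar: the document is stored once, and the older browser still receives something it can display.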

expanding and rebinning a particular area of a data plot.

Currently the Web is viewed mainly as a tool for allowing access to a large amount of “published” information. The new features described here, together with the encryption features described earlier that will allow more sensitive data to be placed on the Web, will open up the Web to a whole new area, where it will be viewed more as a “collaborative tool” than purely an information retrieval system. Ideally it will be possible to take classes on the Web, to interact with the instructor and fellow pupils, to play chess on the Web, to browse catalogs and purchase goods, and to collaborate actively in real-time with colleagues around the world on such tasks as document preparation and data analysis.

CONCLUSION

Over the previous year the characteristics of the average Internet user have changed dramatically as many new people are introduced to the Net through services such as America Online, aimed primarily at home users. The current Web usage is likely to be insignificant in comparison to the potential for usage once the much vaunted “Information Super Highway” reaches into people’s homes.

It is perhaps unlikely that the services eventually offered domestically on the Information Super Highway will be direct descendants of the World Wide Web, but what is clear is that WWW offers an excellent testing ground for the types of services that will eventually be commonplace. As such, the WWW may play a key role in influencing how such systems develop. At worst such a system may just become a glorified video delivery system and integrated home shopping network with a built-in method of tracking your purchases and sending you personalized junk e-mail. At its best such a system could provide truly interactive capabilities, allowing not only large corporations and publishers but also individuals and communities to publish information and interact through the network, while maintaining individual privacy. The outcome will have a major impact on the quality of life in the 21st century, influencing the way we work, play, shop, and even how we are governed.

ELECTRONIC SOURCES

THE SPIRES database and SLD information featured in this article can be accessed from the SLAC home page at:
http://www-slac.slac.stanford.edu/FIND/slac.html

The illustrations of the Web in this article show the Midas WWW browser developed at SLAC. Information on obtaining and using this browser is available from:
http://www-midas.slac.stanford.edu/midas_latest/introduction.

Pointers to other pages mentioned in this article:

Global Network Navigator:
http://nearnet.gnn.com/gnn/GNNhome.html

CommerceNet:
http://www.commerce.net

Stanford Shopping Center:
http://netmedia.com/ims/ssc/ssc.html
