J-8594/18 15/7/97 11:58 AM Page 240

240 Chapter 18 The Internet

Blaise Cronin and Geoffrey McKim Indiana University, United States

In a remarkably short time, the Internet has evolved from an academic curiosity to a mass medium. It has been heralded as the basis of economic salvation for developing nations, as a new scholarly communications system and even as an entertainment alternative to television. However, the Internet has also thrown into relief controversial issues relating to censorship and freedom of expression, pornography and intellectual property rights that have profound ramifications for both individuals and nation-states. This chapter describes and seeks to explain the phenomenon that is the Internet.

Origins

The earliest experiments in what later became the Internet began in 1966 with the United States Department of Defense Advanced Research Projects Agency (DARPA). The first nodes in the resultant ARPANET were created in 1969. In 1977, the TCP/IP (Transmission Control Protocol/Internet Protocol) protocols that underlay the Internet were demonstrated for the first time. In 1986, the National Science Foundation (NSF) created the first NSFNET backbone and allowed regional networks, mostly supporting universities, to feed into this backbone. By 1990, the Internet was supporting commercial activities. Even after all this growth and development, the same basic TCP/IP protocols remain in use and still serve to unify the Internet. In March 1989, the first World Wide Web (WWW) proposal was elaborated and circulated at the European Laboratory for Particle Physics (CERN) in Geneva, Switzerland, and in November 1990 the first prototype Web browser was created (see Chapter 17).

Growth

The most comprehensive and regularly administered survey of Internet-connected computers, or hosts, is the Internet Domain Survey (Network Wizards, 1996). Figure 1, showing the number of Internet hosts from 1981 to 1995, is based on this survey.
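To make the host counts in Figure 1 easier to reason about, the following is a minimal Python sketch that turns them into year-over-year growth factors and an average doubling time. The pairing of each count with a year is an assumption: it supposes that the figure prints one value per year in ascending order from 1981 to 1996. The names `HOSTS_BY_YEAR`, `growth_factors` and `doubling_time` are illustrative and do not come from the survey.

```python
import math

# Host counts as printed in Figure 1 (Internet Domain Survey, Network Wizards).
# Assumption: one value per year, in ascending order from 1981 to 1996.
HOSTS_BY_YEAR = {
    1981: 213, 1982: 235, 1983: 562, 1984: 1_024,
    1985: 1_961, 1986: 2_308, 1987: 28_174, 1988: 56_000,
    1989: 159_000, 1990: 313_000, 1991: 617_000, 1992: 1_136_000,
    1993: 2_056_000, 1994: 3_864_000, 1995: 6_642_000, 1996: 12_881_000,
}


def growth_factors(hosts):
    """Return {year: count / previous count} for each pair of adjacent years."""
    years = sorted(hosts)
    return {y: hosts[y] / hosts[p] for p, y in zip(years, years[1:])}


def doubling_time(hosts, start, end):
    """Average doubling time in years between two survey years.

    Assumes steady exponential growth over the whole interval.
    """
    rate = math.log(hosts[end] / hosts[start]) / (end - start)
    return math.log(2) / rate


if __name__ == "__main__":
    for year, factor in growth_factors(HOSTS_BY_YEAR).items():
        print(f"{year}: x{factor:.2f}")
    years = doubling_time(HOSTS_BY_YEAR, 1990, 1996)
    print(f"Average doubling time 1990-1996: {years:.2f} years")
```

Under these assumptions the average doubling time between 1990 and 1996 comes out at a little over one year, which is consistent with roughly annual doubling of the host count.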


From the data, it can be seen that the number of host computers on the Internet doubles approximately annually. Additional statistics on Internet growth are provided by the Internet Society (1996) and Matrix Information and Directory Services (MIDS, 1996). The number of computers on the World Wide Web, currently the most popular portion of the Internet, is doubling every four or five months. The number of electronic mail messages sent over the Internet is doubling approximately every year (Internet Society, 1994). As of January 1996, there were an estimated 9,472,000 host computers on the Internet (Network Wizards, 1996). International growth is highly variable. There is also considerable variation in Internet presence for different industry sectors.

Fig. 1. Internet hosts by year (vertical axis: number of Internet hosts; horizontal axis: year).

Year    Number of Internet hosts
1981    213
1982    235
1983    562
1984    1 024
1985    1 961
1986    2 308
1987    28 174
1988    56 000
1989    159 000
1990    313 000
1991    617 000
1992    1 136 000
1993    2 056 000
1994    3 864 000
1995    6 642 000
1996    12 881 000

Infrastructures for information work

Organization and structure

A defining feature of the Internet is that no one person, company, government or organization has ultimate control. The Internet Society (ISOC), an international, non-governmental organization whose members consist of governments, corporations, individuals and not-for-profit organizations, co-ordinates many activities related to technical standards, globalization, administrative procedures, education and training, and scaling. The ISOC Board of Trustees is the governing body of the ISOC. The Internet Architecture Board (IAB), a technical advisory group to ISOC, is responsible for oversight of Internet technical standards, for the standards-making process and for all protocols and architectures used on the Internet. In addition, the IAB acts as a liaison with other national and international standards-making organizations, such as the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI), and publishes the Request for Comments (RFC) document series that effectively defines Internet standards and conventions. The IAB and the Federal Networking Council (FNC) have delegated responsibility for co-ordinating the management and dissemination of unique Internet host computer numbers, domain names and other parameters to the Internet Assigned Numbers Authority (IANA) at the University of Southern California. The Internet Network Information Center (InterNIC), maintained by AT&T and [...], provides site, host, domain and personal directory services to the Internet.

Protocols and standards are researched and developed by the Internet Engineering Task Force (IETF), which also administers the overall Internet standards-making process. An open organization of network designers, vendors and researchers, the IETF manages Internet standards through the RFC document series. The chair of the IETF, together with the IETF area directors, forms the Internet Engineering Steering Group (IESG), which handles policy issues related to protocol research and development.

RFCs, the official and published documents of the IETF (and thus the Internet), are divided into four different types: Standards Track, Informational, Experimental and Historic. Standards Track RFCs go through three phases: Proposed Standard, Draft Standard and Standard.

Access

Access to the Internet is often divided into three classes, a trichotomy first proposed by Matrix Information and Directory Services (1994): the Core Internet, consisting of those who can provide or distribute information over the Internet; the Consumer Internet, consisting of people who can receive information over the Internet; and the Matrix, consisting of users with access to electronic mail systems who can exchange mail with Internet users, including most proprietary, corporate e-mail systems. Until recently, the most common way to access the Internet was through a university or government agency. However, in the course of 1995, the number of hosts in the commercial domain exceeded the number of hosts in the educational domain for the first time.

Users with personal accounts generally access the Internet by dialling in with a modem, either through a commercial online service such as America Online, CompuServe or [...], or directly to the Internet through a local Internet Service Provider (ISP), otherwise known as a Point of Presence (POP). These commercial services provide additional proprietary information not available on the Internet as well as Internet access. ISPs can range in size from a couple of simultaneous connections operating from an individual's home to large, national providers such as PSI in the United States or I-Way, Pipex, U-Net and Demon Internet in the United Kingdom. In developed nations, the telephone call to the ISP is most often a local call. Almost all the United Kingdom and much of the United States are covered locally by ISPs. Recently, ISPs have also begun to appear in other countries.

In the United States, the Telecommunications Act of 1996 makes it likely that telephone companies, both local Regional Bell Operating Companies (RBOCs) and long-distance carriers, will begin


offering Internet connectivity as a standard service. France Telecom has also announced intentions to provide a consumer Internet service, one that will include (for an extra fee) access to its existing Minitel service. If this happens, it is unlikely that local ISPs will be able to survive without offering significant added value. In addition, some users have access to limited parts of the Internet (often just electronic mail) through local computer bulletin board systems (BBS) or community networks ('freenets'). An alternative means of connection is through what is called a 'shell account', in which a user dials into a remote computer connected to the Internet. Users in this case may have limited access to certain Internet services (e-mail, Usenet newsgroups, and even the World Wide Web), although they generally do not have access to graphics or many of the more advanced services. The advantage of this type of account, however, is that it requires only a low-end computer and a slow modem, and it is particularly popular in developing countries, where higher-end equipment is often unavailable.

Economics and pricing

Pricing models for Internet access are varied, and have been the subject of much study. Kahin (1995) describes the economics of the Internet in terms of the characteristics of its primary underlying technologies, leased lines and routers (computers used to direct data traffic), both of which are subject to large economies of scale. Additional factors to be taken into account are the continually declining costs of the computer hardware and the statistical multiplexing techniques used to combine the traffic from different sources into a steady average traffic stream, both of which serve to drive down marginal costs. MacKie-Mason and Varian (1995) approach Internet economics from the perspective of congestion control. They compare fixed-rate access to the Internet with the 'tragedy of the commons', wherein there is no penalty for increased use, resulting in 'overgrazing' of the resource (for example, excessive cross-posting of messages). They posit an Internet costing model based on: incremental packet cost, social cost of delay to others, network infrastructure fixed costs, incremental cost of connecting an additional user, and cost to expand network capacity.

Most authors, even those in favour of a more use-based Internet pricing model, agree that some subsidies for civic, educational and not-for-profit use are required. Kahin (1995) discusses the provision of subsidies to schools and public libraries for Internet access. Subsidies are not peculiar to the United States, however. For example, in Tarragona, Spain, TINET (Tarragona Internet) has begun offering users free basic Internet service (electronic mail and Usenet news) and full Internet service at below-market rates, and the Peruvian Scientific Network (RCP), seeded with monies from the United Nations Development Programme (UNDP), is providing subsidized Internet access and training to the public.

A Web site or page can also serve as the unit of economic analysis. Thus far, relatively few sites charge for the content they provide, with the information either being funded through advertising or provided as a loss-leader to entice the user to purchase a more complete version of the product. The Fourth World Wide Web Survey revealed that the number of people unwilling to pay anything for access to Web sites had increased to 31.8% from 22.6% in the previous survey (Georgia Institute of Technology, 1995).

Internet services

Internet services are combinations of protocols and software programs that allow people to use the Internet in different ways. A number of genres have emerged over the lifetime of the Internet, and most are still being used today, albeit in various incarnations. Usenet is a distributed network of computers, predating the Internet but now running almost entirely on the Internet infrastructure, that exchanges messages via a set of agreed-upon protocols


in collections of messages called newsgroups. These newsgroups can be thought of as electronic discussion groups, and are arranged in hierarchies. There are seven top-level international hierarchies of newsgroups, called comp (for computer-related discussions and information), sci (for the sciences), soc (for sociocultural issues), rec (for hobbies and recreational activities), news (for activities related to Usenet itself), talk (for debate-oriented activities) and misc (for activities not fitting into one of the existing categories or spanning categories). In addition to the global hierarchies, there also exist local, regional and national hierarchies (de for Germany, in for Indiana, etc.). Finally, there are alternative hierarchies that are carried by some news servers, including the anarchic alt hierarchy, which has been the subject of controversy owing in part to the sexually explicit nature of some of its newsgroups. While there is no central Usenet authority, a number of accepted rules and procedures have evolved that users and news server administrators abide by in the maintenance of Usenet newsgroups for the seven major global hierarchies. These procedures include calls for discussion about the creation of new newsgroups, calls for voting on the creation of such newsgroups, and protocols for the collection and counting of votes and subsequent action.

Gopher, developed at the University of Minnesota in the United States, was the first multimedia-oriented network navigation tool. Designed to simplify network navigation for the user by allowing providers to present their information in the form of navigable hierarchical menus, Gopher and its companion Internet search index, VERONICA, played a major role in increasing the accessibility of the Internet to the non-technical user. Although many Gopher-based servers still exist, Gopher has been largely superseded by the World Wide Web, which duplicates and significantly enhances its functionality.

Undoubtedly, the most significant Internet service is the World Wide Web, often referred to as the multimedia portion of the Internet. The World Wide Web is based on the concepts of hypertext and hypermedia. Information available via the World Wide Web is provided in the form of hypermedia pages, which look like pages from a magazine, combining graphics and text, but with the added feature that the user can follow links provided by the author to other documents. Users view these hypermedia pages with the aid of software programs known as Web browsers. While the first widely available Web browser was Mosaic, more recently Navigator has become the browser of choice for most people. Browsers on the World Wide Web access Web servers via HTTP, or HyperText Transfer Protocol. Information on the Web is generally marked up with HyperText Markup Language (HTML), an application of the Standard Generalized Markup Language (SGML). HTML provides facilities for the incorporation of text, graphics, sound, video and hypertext links into Web-based documents, as well as document formatting. To provide documents over the Web, information providers mark up these documents using HTML codes (or tags) and make them available via an HTTP server. HTML is a continuously evolving standard, and HTML 2.0 is the currently accepted version, supported by almost every browser. HTML 3.0 is currently under discussion, though many Web browsers have already implemented some of its features. Some browser developers, notably Netscape and [...], have implemented non-standard features, and a major discussion item among Web service developers is the degree to which these features should be utilized.

Most recent developments in Internet service are intended to fit within the World Wide Web and HTML framework, which has proven remarkably extensible and flexible. Virtual Reality Modeling Language (VRML) is a technology used to represent three-dimensional interactive objects and scenes.


The most successful applications of VRML to date have been in the areas of molecular modelling and architecture. Recently, the VRML standard has been extended by the VRML Architecture Group to incorporate motion, through the Moving Worlds standard. The most significant extension of the World Wide Web architecture has been the development of Java. Created by Sun Microsystems, Java is a fully object-oriented, distributed programming language. Instead of downloading static documents, a Web user can download active Java programs, which then execute in his or her Web browser (in a platform-independent manner). Applications range from cosmetic enhancements of Web pages through remote scientific instrumentation to dynamic software rental.

Specifying Internet resources

Uniform Resource Locators (URLs) are strings of characters that specify completely the information needed to retrieve a resource available on the Internet. They include the protocol used to access the resource ('http' for the Web, 'gopher' for Gopher, 'ftp' for FTP, 'telnet' for Telnet, 'mailto' for electronic mail, etc.), the Internet host on which the resource is accessible, the port number on the host through which the resource is being made available (usually this number is absent, and a default is assumed), and the location (usually the directory path name) within the host at which the resource may be found. The location may also be omitted; in this case, the resource retrieved is usually the primary home page available on the specified host. Examples of URLs include http://www.unesco.org/general/eng/about/constitution/index.html (UNESCO's Constitution), and telnet://infogate.ucs.indiana.edu (the Indiana University library catalogue). Web browsers use URLs both to retrieve documents directly and to link to documents from other pages.

The URL scheme has some significant limitations. First, as URLs are primarily instructions for retrieving a resource, they do not identify the content or title of the resource itself. Consequently, the contents of a document may change, but its URL will not change at all if the location remains constant. Second, multiple copies of a document in different locations may have entirely different URLs, providing no clue that they are indeed the same document. There have been efforts to develop a more consistent and location-independent scheme for referring to Internet resources (usually referred to as Uniform Resource Identifiers (URI)), but so far there has been neither agreement nor a standard implementation.

Navigation

Today's best-known navigation tools include Yahoo!, [...], WebCrawler, OpenText, AltaVista, Inktomi, and Magellan. Each has its own particular focus, way of gathering material to be indexed, search language and interface. Several also offer value-adding features, such as Yahoo!'s browsable ontology. These tools are typically funded in one of four ways: subsidized by a university (many search engines start out this way, and then become commercial); a fee levied for access (such as with InfoSeek, which has a two-tier structure: the first level is free to users, and the more advanced capability is charged on a subscription and per-search basis); as a demonstration of indexing software or hardware (OpenText, AltaVista); and, most significantly, by advertising. Many search engines are funded using the broadcasting model: the content is not so much the product as the bait to deliver users to advertisers' doors.

These navigation tools also differ in terms of the body of documents to which they provide access. Yahoo! sources much of its content directly from document owners. Its subject categorization, limited indexing and browsability make it ideal for initial investigation into the range of resources available on a topic, but less desirable for finding more obscure or specific information. Others, such as AltaVista and Inktomi, focus on speed and comprehensiveness. Some search engines, such as McKinley's Magellan, include reviews and ratings of many Web sites. Most of these search engines obtain indexable material through the use of a 'spider'. Also known as robots or crawlers, spiders are software agents that rove from site to site, retrieving information, indexing it and following all links recursively. This is a lengthy, computationally intensive and bandwidth-intensive process, and there are always more Web sites than have been visited by the spiders. There are several problems with this approach to indexing. The first is that sites which are not linked from any of the sites indexed by a spider may not be discovered by the spider. Second, many sites have changed since they were originally indexed, and thus the indexes are often out of date and contain many 'dead links'. Third, many users may not want their sites to be indexed by these publicly available search engines, considering it an invasion of privacy. In addition, from the user's perspective, these search indexes often generate a large number of false hits, which provide useless information.

Internet addressing and the domain name system

Each host on the Internet has a unique address, or hostname. These are arranged hierarchically in groups called domains. The largest domains, top-level domains, contain all of the hosts in a particular country, and are identified by the ISO 3166 two-letter country code. For example, the domain for Japan is jp, the domain for Brazil br, and for South Africa za. The full list of these country codes can be found at http://www.nw.com/zone/iso-country-codes. Although the United States has a top-level domain, us, it also has the additional top-level domains com, edu, org, gov, net and mil (for commercial organizations, higher education, not-for-profit organizations, government, network providers and the military, respectively). Within each of these top-level domains are other domains, usually representing a particular organization (a university, a government agency, a corporation). Within these may be Internet hosts, or subdomains, often representing particular organizational units. For example, the hostname of the primary Indiana University School of Library and Information Science Internet server is 'www-slis.lib.indiana.edu'. This means that the host is in the edu top-level domain, and thus belongs to a United States higher education institution. The 'indiana.edu' is a domain registered to Indiana University. The 'lib' is a subdomain within Indiana University, and 'www-slis' is the actual name of the computer.

Commercial use and users

The business potential of the Internet has been evident for some time. It has been estimated that use of the Web is growing at 40% per month. Of course, there are many outstanding technical issues, relating in particular to bandwidth and responsiveness, which affect perceptions of credibility and reliability (for example, gateway failures, capacity limitations, dead links and server overloads). As a market-place the Web is unusual. The number and range of suppliers is unlike any other market-place: it is a World's Fair, souk, shopping centre and direct mail catalogue rolled into one. Within the Web, marketing can be business-to-business, business-to-consumer or consumer-to-consumer. This plurality is a defining feature, and offers a mix of benefits for both producers and consumers.

Producer perspective

The generic attractions of the Web from a supplier perspective include (Cronin and McKim, 1996):
• Lower entry costs: Virtual markets are easy to penetrate.
• Re-purposing: A digitized product base can be configured in a variety of ways to create secondary product lines.
• Direct customer access: The Web creates direct


connections between producers and consumers without recourse to distributors or a sales network.
• Lower distribution costs: The separation of content from the storage medium eliminates several steps in traditional industry value chains.
• Indirect sales channels: Retailers can exploit the Web to generate referrals to conventional wholesale/retail outlets.
• Pre-segmented markets: The Web encourages self-branding/self-segmentation.
• Lower advertising costs: Merely to have a presence on the Web is to advertise.
• Lower transaction costs: For providers of certain categories of goods the costs of doing business drop significantly.
• Lower exit costs: The converse of low entry costs is low exit costs.
• Secondary markets: Additional revenue streams can be generated by selling advertising space or designing home pages.

Consumer perspective

The underpinning dynamic of the virtual market changes traditional relationships between suppliers and buyers in a number of ways (Cronin and McKim, 1996):
• Shift from push to pull: The Web gives consumers a voice and the option of drilling down into product information.
• Greater choice: The breadth and depth of product range that the Web encourages will translate into greater consumer choice.
• Transparency: The Web creates transparency by facilitating consumer-to-consumer information exchange.
• Disintermediation: The Web has been described as the instantiation of frictionless capitalism.
• Price drivers: Transparency in the market-place makes it harder to fool consumers.
• Convenience: Electronic shopping adds a new dimension to the concept of customer convenience.
• Customer feedback: Vendors will become highly sensitive to the voice of the consumer.
• Impersonality: Some consumers enjoy the sense of anonymity afforded by electronic shopping/trading.

Producer/consumer concerns

Many companies' reluctance to move quickly into electronic trading is a function of the perceived threat of break-ins to their internal networks by hackers. Other concerns have to do with the vulnerability of soft goods to piracy and the resultant loss of revenue. From a consumer perspective, Web markets raise issues of privacy. Consumers may seek safeguards that transaction meta-data will not be used for unauthorized purposes.

From Internet to Intranet

Many businesses, recognizing that the technologies of the Internet (and particularly the World Wide Web) are robust, easy to use, well-tested and flexible, have begun to use them not only in the construction of public Web-based presences, but also in the creation of internal corporate information-sharing networks. The Georgia Institute of Technology (1995) Fourth WWW User Survey notes intra-enterprise use of the Web as the most common commercial use. Such internal networks, often termed 'intranets', are a natural extension of the Internet, which has been used since its inception to facilitate discussion and the dissemination of information.

Electronic transactions

Models for secure commercial transaction over the Internet fall into three classes: those that seek merely to provide secure transportation of transaction information from purchaser to merchant; those that attempt to facilitate the actual funds authorization and transaction settlement process; and those that


aim to reproduce the essential features of money in digital form. The first class is concerned with the provision of secure transfer of information from a browser to a server. There are two competing standards for the provision of this service: Secure HTTP (S-HTTP) and Secure Sockets Layer (SSL). Although from time to time the security in such systems may be penetrated (for example, certain weak points can theoretically be exploited), in practical terms they are sufficiently secure for the purposes of ordinary commerce.

The second class is concerned with facilitating the entire electronic purchasing process. After an initial period of dispute, a draft standard for secure electronic transactions emerged in early 1996. Known as the Secure Electronic Transactions (SET) standard, it provides a framework within which confidentiality can be protected, payment integrity ensured, and both merchants and customers authenticated to each other. CyberCash also provides a secure, though not yet SET-compliant, transaction-facilitation service. Most existing secure transaction techniques depend on public-key cryptographic techniques, which do not require the sender and recipient of encrypted data to agree upon a secret encryption password beforehand. These crypto-systems can also be used to provide facilities for authentication and digital signatures. One of the primary impediments to the spread of secure transactions internationally is the ITAR (International Traffic in Arms Regulations), which restricts the export from the United States of software using strong cryptographic techniques. Countries such as France also have strong laws against the export or use of cryptographic software.

The DigiCash payment scheme is different in that the customer withdraws electronic cash from a DigiCash bank, and that electronic cash is actual money rather than just a credit card number. When the customer transfers DigiCash to the vendor, then, it is as though cash has been exchanged: the item of value itself has transferred from customer to vendor. The DigiCash scheme also provides another 'cash-like' feature, payer anonymity. When electronic cash is exchanged, the payer is not necessarily identified to the vendor (as would be the case if a credit card number were exchanged). This ensures additional customer privacy, and prevents the purchase-tracking and marketing information-gathering that is possible with credit card transactions. Finally, there are commerce models, such as that of First Virtual, which rely not on sending encrypted information over the Internet, but on e-mail verification and purchase confirmation.

Government applications

Government organizations have been leaders in making information available over the Internet. The United States Federal Government has been at the forefront, with Web sites such as THOMAS, a repository of current and past legislative information, the LC Marvel information system (of the US Library of Congress), and the National Aeronautics and Space Administration (NASA) Web site. The Bureau of the Census also makes extensive data available. The National Technical Information Service, through FedWorld, provides pointers to all United States Federal Government information resources. Government organizations as diverse as the Brazilian Ministry of Planning (http://www.seplan.gov.br), the Ministry of Interior Affairs in Latvia (http://www.ugdd.lv) and the Ministry of Information and Communication in the Republic of Korea (http://www.mic.go.kr) all provide information about their functions and services via their home pages. Similar enthusiasm can be seen among non-governmental organizations (NGOs). The United Nations itself has a Web site (http://www.un.org), with pointers to the sites of its departments and divisions, or to its Specialized Agencies such as UNESCO (http://www.unesco.org). A guide to the use of United Nations Internet-based resources has


been released. The World Bank (http://www.worldbank.org), too, has a well-developed Web presence.

Education, research and scholarship
Although the Web is relatively tiny today, containing only a fraction of the world’s publicly available data, it is quadrupling in size annually and in six or so years may grow a thousandfold. It would be short-sighted, however, to see the Web merely as a distributed document store and/or digital reference library, though it increasingly satisfies both these functions. The Web is much more than a virtual equivalent of existing archival and library institutions. It is a dynamic environment that supports new kinds of foraging and communication in which scholars are anything but passive participants. Moreover, the Web is as much a showcase for authors as a source of documents. In its far-sighted electronic publishing plan, the Association for Computing Machinery (ACM) acknowledges that many authors view their works as ‘living on the Web’ and see networks as opportunities ‘for collaborative authoring and for dynamic documents that incorporate other documents’ (Denning and Rous, 1995). Features and issues worth considering are: size and scope, cost, ease of use, novelty, community and legitimacy.

Size and scope
The bypassing of traditional (institutional) information suppliers and reference sources will be a consequence of progressive migration to the Web. Commercial publishers, for their part, are coming to recognize the importance of digital publishing, and are struggling to develop a business framework for online enterprise. First, materials located on a server in Addis Ababa, for example, need be no less accessible than those hosted by one’s own institution in Bloomington. Second, statistical data sets, image banks, textual archives, information services, entertainment and much else are available on the Web without any partitioning on the basis of content, format or nature of medium. Third, the boundary lines drawn by disciplinary groups are ignored by the infinitely extensible latticework of hypertextual links that give the Web its unique character. Fourth, ‘grey’ literature is no longer the stepchild of primary publishing; the Web entertains semi-published and vanity items, irrespective of provenance or pedigree.

Cost
Although the commercial character of the Web is developing rapidly, many organizations, including universities, research institutes and government agencies, are actively making materials available at zero cost to users. Scholars, in many cases, benefit from their parent institution’s willingness to provide subsidized and unmetered Internet access in support of the teaching and research functions. The general absence of direct or metered charges, coupled with the savings in time and effort afforded by desktop access to the World Wide Web, underscores the cost-effectiveness of the technology from the standpoint of time-pressed scholars with limited budgets for consumables and subscriptions.

Ease of use
Simplicity of use combined with interactivity makes for a powerful technology, and recent software developments, notably Java, offer new levels of dynamic interaction. The increasing availability of statistical data sets on the Web will allow scholars to acquire and interactively analyse remote data. The implications, however, extend beyond local convenience. The worldwide reach of the Web means that academics and researchers in less-developed nations, handicapped by lack of resources or unable to travel abroad and work in foreign research institutions, can compensate, in part, by connecting and interacting with remote data sets hosted by First World institutions. In fact, the Web makes possible new kinds of technology transfer for educational purposes between centre and periphery nations. Convenience,
combined with cost-attractiveness and local control, helps explain the success of non-conventional electronic publishing/storage ventures such as the Los Alamos Preprint Archive or the CERN Preprint Server in high-energy physics. These (and other) collectivist ventures have in a relatively short time established themselves as the primary information exchange/pre-publishing forums for international research communities, bypassing established mechanisms and procedures. Their success and transparency are such that concerns about legitimacy and institutional oversight seem to count for little, least of all with opinion leaders in the scientific cultures in question.

The search for novelty
Experientially, the World Wide Web offers scholars something new: a tool that eliminates distance, erodes arbitrary boundaries between domains and facilitates associative learning. Although the Web can be used as a document locator, its real strength may lie in the fact that it supports query-free browsing and promotes serendipity. The ability to forage for new ideas and insights in a hypernavigable and unbounded space is a singular aspect of the Web.

Cyber salons and digital communities
The Web functions as a global commons: a shared space that creates new forms of social interaction. Berghel (1995) uses the term ‘digital village’ to capture the defining characteristics of cyber communities. The Web, with its unparalleled capacity to link scattered communities, can be a powerful catalyst for highly intensive and participatory exchange across national boundaries and disciplinary borders, though the outcomes of these interactions will not always or necessarily be for the better. As Poster (1995) observes, segments of virtual social space differ from the public sphere in important ways: they can be places where ‘rational argument rarely prevails, and achieving consensus is widely seen as impossible’. Gresham’s Law seems, in some cases, to apply to the currency of digital discourse.

The evolution of communities of interest, of virtual communities not bound by geography, ranks among the most notable developments stimulated by the Internet. One of the earliest, and most influential, of these virtual communities was the WELL (Whole Earth ’Lectronic Link), an 8,000-member, San Francisco-based virtual community. While it is impossible to measure the number of these virtual communities, their impact is undeniable. They take many forms, including LISTSERVs, Usenet newsgroups and various Web-based forums. General social norms and guidelines for discussion groups and virtual communities on the Internet, often known as ‘netiquette’, have emerged.

Legitimacy
Many of the barriers to the use of the Web in scholarship relate to the perceived legitimacy of digital documents, that is, the acceptability of documents existing only in electronic form as a part of the scholarly record. The first concern relates to plagiarism. The ease of copying, coupled with the sheer number of potential electronic texts, creates unparalleled opportunity for plagiarism. The second obstacle has to do with the difficulty in establishing the authenticity and authorship of electronic documents. The technologies and protocols that enable authentication of documents and document authorship, digital signatures and public key cryptography in particular, do exist but, for a variety of technological and political reasons, public acceptance and implementation of these has been slow. The third problem is that of ephemerality. Documents on the Web may be here today, but gone tomorrow, if the host organization loses funding, the individual providers leave their organization, or the will to make older documents available is absent. For the scholarly community to accept digital documents, reliably managed archives that use digital signatures and public key cryptography to ensure the integrity of their holdings will be required. But perhaps the most serious obstacle is the problem of version control. Documents available on the Web can change regularly, without their corresponding references (e.g. the URLs) changing. A scholar may cite a document, but by the time the citation is checked, the Web document may have changed (often providing little or no indication of the changes made). Archives of digital documents will have to take into account the need to cite a document as it exists at a particular moment in time.

In the university sector, there is significant investment in the World Wide Web as an enterprise-wide utility to support a range of core functions: teaching, scholarship, administration and market positioning. Rates of adoption and development are differential (within and across both institutions and countries), but the Web is clearly seen as a means of enhancing and accelerating scholarly communication, fostering indigenous/local publication, facilitating computer-mediated teaching and underpinning distance learning strategies. Also, at a time of increasing competition for revenue and resources, the Web can act as a lever in gaining an edge in terms of advertising, branding and recruitment.

Disillusionment and controversy
There has been some evidence in recent months that use of the Internet may actually be slowing and frustration rising. Ironically, as bandwidth overall on the Internet increases, more and more people are accessing it from home using at best a 28.8 kbit/s modem, and thus have effectively less bandwidth. This problem is accentuated by the increasingly graphical nature of most Web pages, which slows the transmission of documents greatly. Add to this the still-greater bandwidth required by more advanced multimedia formats (video, animation, sound), the proliferation of graphically intensive advertisements that do not contribute to content, and the expectation of television-like responsiveness, and user frustration is bound to result.

The Internet has also provoked serious controversy. The original Internet users were primarily scholars and computer experts, whose prevailing ethos might be characterized as ‘anything goes’ and ‘information wants to be free’. Commerce was originally forbidden by the NSFNET usage guidelines, and even thereafter was strongly discouraged. However, as the Internet grew and became more tightly integrated with society in general, many governments attempted to regulate it as they did established media, by applying stringent copyright and anti-obscenity legislation. The result has been several well-publicized clashes. For example, the 1996 Communications Decency Act in the United States applies legally weak ‘indecency’ standards to traffic on the Internet, which has spawned high-profile public protest. Controversies on the Internet have ranged from clashes of cultures to conflicts of national law. In one case the book Le grand secret, which dealt with François Mitterrand’s battle with cancer, was banned in France by a judicial decision only to be posted on the Internet, thus infringing French copyright law. This event led some to consider stricter controls on Internet content. In another well-publicized case the Church of Scientology, an American-based religious sect, successfully obtained restraining orders and search warrants after a disaffected member posted copyrighted Church documents on the Internet.

The Internet and development
Although there are computers on the Internet in most countries, penetration is strongest in the developed world. The top seventeen nations in terms of number of Internet connections are all members of the Organisation for Economic Co-operation and Development (OECD). Countries such as Turkey, Brazil and Thailand, however, have made recent rapid advances in terms of Internet connectedness. It

Table 1. Internet hosts by country, January 1996¹

Country                         Hosts   Country              Hosts   Country              Hosts

United States               6 053 402   Ukraine              2 318   Monaco                  56
Germany                       452 997   Colombia             2 262   Guam                    55
United Kingdom                451 750   Croatia              2 230   Trinidad and Tobago     55
Canada                        372 891   China                2 146   Fiji                    52
Australia                     309 562   Philippines          1 771   Liechtenstein           44
Japan                         269 327   Luxembourg           1 756   Cayman Islands          42
Finland                       208 502   Latvia               1 631   Macedonia               39
Netherlands                   174 888   Costa Rica           1 495   Albania                 36
Sweden                        149 877   Kuwait               1 233   Uzbekistan              35
France                        137 217   Venezuela            1 165   Guatemala               27
Norway                         88 356   Bulgaria             1 013   Saudi Arabia            27
Switzerland                    85 844   Romania                954   Gibraltar               26
Italy                          73 364   Peru                   813   Belarus                 23
Spain                          53 707   India                  788   El Salvador             23
New Zealand                    53 610   Lithuania              630   Anguilla                23
Austria                        52 728   Uruguay                626   Jordan                  19
Denmark                        51 827   Bermuda                608   Nepal                   19
South Africa                   48 277   Egypt                  591   Pakistan                17
Belgium                        30 535   Faroe Islands          533   Kenya                   17
Israel                         29 503   Ecuador                504   Algeria                 16
Korea, Republic of             29 306   Cyprus                 384   Senegal                 14
Taiwan                         25 273   United Arab Emirates   365   Namibia                 11
Poland                         24 945   Bahamas                276   Moldova, Republic of    10
Singapore                      22 769   Iran                   271   Andorra                 10
Brazil                         20 113   Morocco                234   Solomon Islands          9
Hong Kong                      17 693   Kazakstan              187   Antarctica               7
Czech Republic                 16 786   Jamaica                164   Ghana                    6
International Organizations    15 570   Antigua and Barbuda    160   Sri Lanka                6
Ireland                        15 036   Brunei Darussalam      156   Côte d’Ivoire            3
Russian Federation             14 320   Panama                 148   Barbados                 2
Mexico                         13 787   Bahrain                142   Vatican City             2
Hungary                        11 486   Nicaragua              141   Guinea                   2
Portugal                        9 359   Dominican Republic     139   Swaziland                1
Chile                           9 027   Zimbabwe                93   New Caledonia            1
Greece                          8 787   San Marino              90   Belize                   1
Iceland                         8 719   Greenland               88   Azerbaijan               1
Slovenia                        5 870   Lebanon                 88   Ethiopia                 1
Turkey                          5 345   Tunisia                 82   Tonga                    1
Argentina                       5 312   Armenia                 77   Cuba                     1
Malaysia                        4 194   Malta                   68   Cook Islands             1
Estonia                         4 129   Bolivia                 66
Thailand                        4 055   Macao                   65
Slovakia                        2 913   Georgia                 60
Indonesia                       2 351   Uganda                  58

1. Data are from the Internet Domain Survey (http://www.nw.com).

is only in the United States and a few other OECD nations that users routinely have access to the Internet from their homes. Otherwise, access is provided almost entirely through universities, government agencies and businesses. Table 1 provides a breakdown of Internet hosts by country. Even assuming that a reliable telecommunications infrastructure and logistical support system exist, the prevailing culture, social structures, community values and established rhythms of life in many LDCs will challenge simplistic assumptions about the nature of technology transfer. How is indigenous knowledge shared and diffused throughout local communities from generation to generation, and how do these dissemination practices differ from the knowledge transfer process in industrialized countries? In their review of computing in North Africa, Danowitz et al. (1995) acknowledge that Internet connectivity, in particular, could weaken the enforcement of prevailing social values and hinder censorship of ideas and opinions inimical to ruling powers. To illustrate the importance of cultural relativism, it is only necessary to compare information access policies in, say, Sweden or the United States with those of China or Singapore.

In the United States, the present administration is committed to connecting public schools, libraries and hospitals to the Internet as part of its National Information Infrastructure (NII) initiative. If public libraries have Internet connections, so the logic goes, local citizens and community groups will become electronically empowered. Approximately 21% of American public libraries and 35% of public schools have some connection to the Internet, although such access is not equitably distributed. In many societies, pervasive networking may stimulate greater participation in the democratic process and, at the same time, add a further set of checks and balances on all levels of government. Networks can enable concerned citizens, local action groups or disaffected individuals to challenge authority directly, to source important background information and to mobilize support from like-minded, but often geographically dispersed, groups. But universal democracy comes with a price tag: the technology platforms which facilitate open exchange also support electronic eavesdropping and cyber surveillance of dissident voices by, for example, government departments, national security agencies, or corporations (see Chapter 20).

Of course, it is not an accident that Internet connections are scarce in closed societies. The perception among ruling élites is that real-time communication of news and views, whatever the medium, is potentially threatening. As Travica and Hogan (1992) noted, computer networks (particularly RELCOM and GlasNet) were a key source of otherwise inaccessible information at the time of the 1991 attempted coup in the Soviet Union and a means of mobilizing counter-action. Networking ruptures centralized control. Networks have the capability to destabilize autocratic regimes by diffusing and amplifying unorthodox views in both vertical and horizontal directions. A few governments have already expressed concern that the Internet will enable their citizens to obtain information from outside groups (in particular, dissident groups from outside the country) and are working on an infrastructure that will allow them much greater control over Internet content. In 1993, the Institute for Global Communications launched the PeaceNet World News Service, which offers news rarely found in the mainstream press. Currently, a group of human rights organizations (Amnesty International, Human Rights Watch, PEN) is exploring the possibility of establishing a communication system over the Internet.

Internet demographics
There have been very few reliable studies of Internet demographics. Most have been delivered through the Internet itself, and have thus been highly
skewed towards advanced computer users. In 1995, CommerceNet, an organization dedicated to promoting standards for commerce on the Internet, along with Nielsen Media Services conducted perhaps one of the first controlled, random-sample surveys on Internet demographics in the United States. Among other things, the survey found that people with access to the Internet fell into the following age-groups: 16–24 (22%); 25–34 (30%); 35–44 (26%); 45–54 (17%); 55+ (5%). Overall 64.5% were male, 88% had some college education, and they were primarily either professionals (37%) or full-time students (16%), while 55% had a household income of US$50,000 or higher. The survey also found that 17% of the total population of the United States and Canada had some access to the Internet, 8% had used the Web in the last three months, and 11% the Internet. Approximately 14% of all Internet users had purchased goods or services over the Internet.

General demographic surveys of Web users have also been carried out by the Georgia Institute of Technology (1995) for the past three years, and provide a snapshot of Web users’ lifestyles, behaviours and attitudes. The mean age of Web users is 32.7; approximately 70% are male; median income is US$63,000 (well above the $36,950 United States median income); 76.2% are from the United States, 10.2% from Canada and 9.8% from Europe; 31% work in computer-related and 24% in education-related fields. More than 40% use their browser for six to ten hours per week, with shopping a much less frequently cited activity than entertainment or accessing reference information. Some trends can be inferred by comparing data from the third Web survey with those from the current survey. The median income of Web users is dropping, indicating that use of the Web is becoming less socially exclusive. The proportion of women responding to the survey increased by 15%, although not by nearly as much outside the United States. The average age of the respondents was down from 35 to 32.7 years. Finally, the proportion of Web users from the United States is diminishing, as usage from Canada, Mexico, Europe, Latin America, Africa, the Middle East, Asia and Oceania increases.

Conclusions
Although the Internet growth curve must inevitably slacken from exponential to logistic, there are no signs yet that the rate of adoption is abating; indeed, predictions of a billion users by the year 2000 are commonplace. While congestion is often cited as a major impediment to sustained, widespread use, it is conceivable that the technology/capacity trajectory will keep pace with the demand curve. Another factor to take into account is the phenomenon of intelligent agency, and whether in fact the Internet will be roamed mostly by programs, not people. It may, therefore, be helpful to think in terms of three worlds: the Internet (public space), the intranet (closed communities), and what we have chosen to term the ‘infranet’ (the backgrounded portions of the public Internet increasingly inhabited by automated agents working on behalf of the great majority of ordinary users).

However, technical matters will not necessarily dominate. As transnational usage grows, a cluster of sociocultural issues will move dramatically to the fore. Primary among these will be concerns relating to censorship, social control, cultural contamination, linguistic hegemony and computer crime, though nations and individuals will, of course, differ markedly in the perspectives they bring to bear and their assessments of the benefits and drawbacks of open electronic communications: what one nation might consider an egregious example of censorship might well be considered wise social stewardship in another. More optimistically, there are those who view the Internet as a powerful tool for constructing identity, cultural self-awareness, and local self-sufficiency on an unprecedented scale.
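
The slackening of growth from exponential to logistic, noted in the Conclusions, can be made concrete with a small numerical sketch. The Python fragment below is an illustration only: the starting value, the growth rate and the saturation ceiling are assumptions chosen for the example, not figures drawn from the Internet Domain Survey or from any other source cited in this chapter.

```python
import math


def exponential(n0: float, r: float, t: float) -> float:
    """Unconstrained growth: N(t) = n0 * exp(r * t)."""
    return n0 * math.exp(r * t)


def logistic(n0: float, r: float, k: float, t: float) -> float:
    """Growth that levels off at a ceiling k:
    N(t) = k / (1 + ((k - n0) / n0) * exp(-r * t))."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


# Hypothetical parameters, for illustration only.
N0 = 1.0         # starting size, in arbitrary units
R = math.log(2)  # growth rate equivalent to doubling once per time unit
K = 1000.0       # assumed saturation level

# Print both curves side by side at a few points in time.
for t in range(0, 16, 3):
    print(f"t={t:2d}  exponential={exponential(N0, R, t):10.1f}  "
          f"logistic={logistic(N0, R, K, t):7.1f}")
```

With these assumed values the two curves are almost indistinguishable at first, which is consistent with the observation that adoption shows no sign of abating yet. They part company only as the logistic curve nears its ceiling.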

References

BERGHEL, H. 1995. Digital Village: Maiden Voyage. Communications of the ACM, Vol. 38, No. 11, pp. 25–7.
CRONIN, B.; MCKIM, G. 1996. Markets, Competition, and Intelligence on the World Wide Web. Competitive Intelligence Review, Vol. 7, No. 1, pp. 45–51.
DANOWITZ, A. K.; NASSEF, Y.; GOODMAN, S. E. 1995. Cyberspace across the Sahara: Computing in North Africa. Communications of the ACM, Vol. 38, No. 12, pp. 23–8.
DENNING, P. J.; ROUS, B. 1995. The ACM Electronic Publishing Plan. Communications of the ACM, Vol. 38, No. 4, pp. 97–103.
GEORGIA INSTITUTE OF TECHNOLOGY. 1995. GVU Center’s 4th WWW User Survey. (Available from URL: http://www.cc.gatech.edu/gvu/user_surveys/survey-10-1995.)
INTERNET SOCIETY. 1994. Growth of the Internet: Internet Messaging Traffic. (Available from URL: http://www.isoc.org/ftp/isoc/charts/90s-mail.txt.)
INTERNET SOCIETY. 1996. Internet Society Information Services. (Available from URL: http://info.isoc.org:80/infosvc/index.html.)
KAHIN, B. 1995. The Internet and the National Information Infrastructure. In: B. Kahin and J. Keller (eds.), Public Access to the Internet, pp. 3–23. Cambridge, Mass., MIT Press. 390 pp.
MACKIE-MASON, J.; VARIAN, H. 1995. Pricing the Internet. In: B. Kahin and J. Keller (eds.), Public Access to the Internet, pp. 269–314. Cambridge, Mass., MIT Press. 390 pp.
MATRIX INFORMATION AND DIRECTORY SERVICES. 1994. MIDS Press Release: New Data on the Size of the Internet and the Matrix. (Available from URL: http://www.tic.com.)
MIDS. 1996. MIDS Home Page. (Available from URL: http://www.mids.org.)
NETWORK WIZARDS. 1996. Internet Domain Survey. (Available from URL: http://www.nw.com.)
POSTER, M. 1995. The Net as a Public Sphere? Wired, Vol. 3, No. 11, pp. 136–7.
TRAVICA, B.; HOGAN, M. 1992. Computer Networks in the Former USSR: Technology, Uses and Social Effects. In: D. Shaw (ed.), ASIS ’92: Proceedings of the 55th ASIS Annual Meeting, Pittsburgh, PA, October 26–29, pp. 120–35. Washington, D.C., ASIS.

Blaise Cronin is Professor of Information Science at Indiana University and Dean of the School of Library and Information Science. He is also the BLCMP Visiting Professor of Information Science at Manchester Metropolitan University in the United Kingdom, and an Associate Consultant with Solon Consultants, London. From 1985 to 1991 he was Professor of Information Science and Head of the Department of Information Science, Strathclyde Business School, University of Strathclyde, United Kingdom. He has taught or consulted in more than thirty countries, and been an invited speaker at fifty universities worldwide. Dr Cronin is author or editor of more than 200 books, reports and articles on strategic information management, information marketing, scholarly communication and citation analysis. He is a Fellow of the Institute of Information Scientists, Institute of Management, and Library Association, and a member of several other professional associations. His editorial board memberships include Journal of Documentation, Library Quarterly, International Journal of Information Management and Revista Española de Documentación Científica, and he was Founding Editor of the Journal of Economic and Social Intelligence.

Geoffrey McKim, Manager of Information Systems at Indiana University’s School of Library and Information Science, has degrees in mathematics, and library and information science. A former network analyst for Indiana University Computing Services, he has been involved in the development and management of Internet resources for over seven years. He has taught courses in Web server design, Internet resource use and management, and information technology in organizations. He is author of a recent book, Internet Research Companion, and a member of the Internet Society, the Society for Social Studies of Science, the American Society for Information Science, and the Association for Computing Machinery.

Blaise Cronin
Dean, School of Library and Information Science
Indiana University
Bloomington
Indiana 47405-1801
United States
Tel: 812-855-2848
Fax: 812-855-0078
E-mail: [email protected]

Geoffrey McKim
Information Systems Manager
School of Library and Information Science
Indiana University
Bloomington
Indiana 47405-1801
United States
Tel: 812-855-2848
Fax: 812-855-0078