World Wide Web

UTeach CS Unit 6: Innovative Principles Technologies

WWW

How many times have you seen www. at the start of a URL? It is so ubiquitous that many web browsers and web sites will insert it into the URL even if you do not type it. The www. prefix is a conventional part of a domain’s address indicating that the server hosts content designed to meet the standards of the World Wide Web. Almost every online service you are likely to use is a part of the World Wide Web.

In fact, the World Wide Web is one of those things that most of us use on a regular basis without ever thinking about how it works or what problems it was originally created to solve. But, since its inception in the early 1990s, the Web has proven to be one of the most revolutionary and empowering inventions in history.

Not to be confused with the broader concept of the Internet, the World Wide Web is a content-oriented ecosystem built atop the globally networked infrastructure of the Internet. It was designed primarily as an open platform that gives users from all over the world a standard and accessible means of communicating and sharing information online.

Origins and Growth of the Web

"In those days, there was different information on different computers, but you had to log on to different computers to get at it. Also, sometimes you had to learn a different program on each computer. Often it was just easier to go and ask people when they were having coffee..." (Tim Berners-Lee)

While working at CERN near Geneva, Switzerland, British computer scientist Tim Berners-Lee recognized the potential of the Internet as a communications and computational medium and proposed the development of a platform that might help to overcome some of its limitations.

As the Internet became more established, the world’s many computers, servers, routers, and other computational devices gradually became networked together into a worldwide, interconnected ecosystem. However, in much the same way that the introduction of air travel in the early 1900s suddenly brought together people from far-off lands who spoke different languages, shared different customs, and adhered to different laws, the Internet also exposed similar differences and incompatibilities between the world’s various computing systems.

Berners-Lee proposed that a standardized set of protocols and tools be developed to ease the integration of these disparate computing systems and to improve communications between them. In short, he wanted to employ the ideas of abstraction to design a more generalized means of sharing information across the Internet, independent of any particular hardware or software that a user might be using. It is also important to keep in mind that abstractions can be combined. Lower-level abstractions can be blended to make higher-level abstractions, such as short message services (SMS) or email messages, images, audio files, and videos.

As a result of his efforts, Berners-Lee created the set of fundamental tools and technologies that make up what we now know as the World Wide Web*.

Web Applications:

Web browser: Client application that runs on an end-user’s computer and is used to request and view web pages.
Web server: Program that runs on a remote computer and that serves up web pages.

Web Technologies:

HTML (Hypertext Markup Language): A standardized set of formatting instructions that dictate how the content of a web page should be arranged and displayed by the client application (i.e., web browser).
URI (Uniform Resource Identifier): A unique address that identifies each resource on the Web, commonly referred to as a URL (Uniform Resource Locator).
HTTP (Hypertext Transfer Protocol): Standards for requesting and receiving linked resources from across the Web.
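
To make the relationship between a URL and HTTP a little more concrete, the following Python sketch splits a URL into its parts and builds the text of a minimal HTTP/1.1 GET request for it. The address www.example.com and the helper name build_get_request are assumptions chosen for illustration; a real browser sends many more headers than this.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    # An empty path means the site's top-level page, which is requested as "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"GET {path} HTTP/1.1\r\n"   # request line: method, resource, protocol version
        f"Host: {parts.netloc}\r\n"  # which web site on the server is being asked for
        "Connection: close\r\n"
        "\r\n"                       # a blank line ends the request headers
    )


# Hypothetical example address (not a page referenced by this lesson).
example_url = "http://www.example.com/docs/index.html"
parts = urlsplit(example_url)
print("scheme:", parts.scheme)
print("host:  ", parts.netloc)
print("path:  ", parts.path)
print(build_get_request(example_url))
```

The scheme names the protocol to use, the host names the web server to contact, and the path names the resource on that server.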

On August 6, 1991, Berners-Lee brought the world’s first web site online. It ran on a NeXT cube computer located in his lab at CERN and prominently displayed a sticker on the front of the machine which read, “This machine is a server. DO NOT POWER IT DOWN!!”

*Interestingly enough, “World Wide Web” was not the only name that Berners-Lee considered when choosing a name for his creation. He almost named it one of his other ideas: Information Mesh, Mine of Information, or Information Mine. Consider how the Mine might look today with URLs like moi.google.com or moi.facebook.com instead of our familiar www. prefix.

Hyperlinks

One of the key features that Berners-Lee incorporated into his invention is the use of hyperlinks to connect documents with one another in a non-linear way. While the pages of a book are arranged linearly in sequence (e.g., page 1, page 2, page 3, etc.), there is no such sequencing of documents in the World Wide Web. Instead, like the multiply connected computers of the Internet, the Web consists of a collection of massively interconnected pages of content.

Each web page is effectively a single, text-based document that has been “marked up” with embedded formatting instructions known as HTML (Hypertext Markup Language) tags. Each of these electronic documents is stored on a computer running a web server. The location of the file within the computer’s file system corresponds to the document’s URL (i.e., the address of the web page).
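
As a rough illustration of how a URL can correspond to a file location, the Python sketch below maps the path portion of a URL onto a folder on the server. The folder /var/www/html, the default file name index.html, and the function name url_to_file_path are all assumptions made for this example; real web servers are configured in many different ways, and many pages are generated by programs rather than read from a file.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical folder where a server keeps its web pages (an assumption).
DOCUMENT_ROOT = PurePosixPath("/var/www/html")


def url_to_file_path(url: str, root: PurePosixPath = DOCUMENT_ROOT) -> PurePosixPath:
    """Map a URL to the file a simple static web server might send back."""
    path = urlsplit(url).path
    # A URL that names a folder (or nothing at all) is given a default page.
    if path == "" or path.endswith("/"):
        path += "index.html"
    # Drop the leading slash so the path is joined underneath the root folder.
    # Note: this sketch does not guard against ".." segments in the path.
    return root / path.lstrip("/")


print(url_to_file_path("http://www.example.com/lessons/web.html"))
print(url_to_file_path("http://www.example.com/"))
```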

A hyperlink is a clickable bit of text, image, or other on-screen element within an HTML document that a user can select to request another, related document. Each link is designed to enable the user to selectively seek out, or browse, from one document to the next, following whatever sequence they choose. This non-linear approach to organizing and connecting information has opened up countless new ways for people to find, learn, and consume information.

Consider the following bit of HTML:

You can <a href="...">search</a> for something, <a href="...">tweet</a> a comment, or <a href="...">like</a> a friend’s post at these popular sites.

The above example would produce the following hyperlinked text within a web page:

You can search for something, tweet a comment, or like a friend’s post at these popular sites.

Here, you can see that “search,” “tweet,” and “like” have each been formatted to act as hyperlinks (linking to Google, Twitter, and Facebook, respectively). Each hyperlink is denoted with the use of an anchor (<a> ... </a>) tag that frames the text being linked (e.g., “search,” “tweet,” and “like”). Each anchor tag includes the URL of the other page or site that the hyperlink is referencing (e.g., href="...").
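
One way to see how software finds the hyperlinks in a page is to scan the HTML for anchor tags. The Python sketch below uses the standard library's HTMLParser to collect each link's href value and its visible text. The class name LinkCollector and the .example addresses are placeholders invented for this illustration; they are not the URLs used in the lesson's example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from the anchor tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs, in document order
        self._href = None  # href of the anchor currently being read, if any
        self._text = []    # pieces of text seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears between <a href=...> and </a>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


# Placeholder HTML with made-up addresses (assumptions for illustration).
SAMPLE_HTML = (
    'You can <a href="https://search.example/">search</a> for something, '
    '<a href="https://tweet.example/">tweet</a> a comment, or '
    '<a href="https://like.example/">like</a> a friend\'s post.'
)

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
for href, text in collector.links:
    print(text, "->", href)
```

A web browser does something similar when it decides which parts of a page should be clickable and where each click should lead.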

When a user clicks on any of these links, the web browser sends a request to the corresponding web server for the specified page (as referenced in the href attribute).

Exercise #1: Map the Web

Build a map of the World Wide Web. OK, maybe not all of it (it is rather large, after all). In this exercise, you will begin mapping out the interconnectedness of a very small portion of the Web.

1. Using your preferred search engine (Google, Bing, DuckDuckGo, etc.), conduct a search for your own name.
2. Record the URL of the first link that your search returns.
3. Visit that URL and count and record the total number of different links that you can find on that page.
4. Also record the URLs of up to three more of the hyperlinks on that page.
5. Continue repeating this process, counting and recording hyperlinks for each URL you record, for at least two more levels.

Using your findings, estimate the total number of different pages that could be reached if you were to start at the URL found from your original “vanity search” (i.e., searching for your own name) and followed a series of five clicks. What about 10 clicks? 20 clicks?
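
One simple way to approach this estimate is to assume that every page has about the same number of different links and that no two links lead to the same page. Under those assumptions, the count grows as a power of the number of links per page. The Python sketch below adds up the pages reachable at each level. The function name estimate_reachable_pages and the sample value of 40 links per page are assumptions for illustration; substitute the average you actually recorded. Because real pages often link to one another, this is an upper bound rather than an exact count.

```python
def estimate_reachable_pages(avg_links_per_page: int, clicks: int) -> int:
    """Upper-bound estimate of pages reachable within the given number of clicks.

    Assumes every page has the same number of links and that every link
    leads to a page not seen before, which overestimates the real Web.
    """
    # Level 0 is the starting page, level 1 is its links, level 2 is their links...
    return sum(avg_links_per_page ** level for level in range(clicks + 1))


# Hypothetical average; replace it with the average from your own recordings.
AVERAGE_LINKS = 40

for clicks in (5, 10, 20):
    print(clicks, "clicks:", estimate_reachable_pages(AVERAGE_LINKS, clicks))
```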

Exercise #2: Wikipedia Race

Your teacher will select a random topic for you to look up on Wikipedia. This will be your starting point for the race. Your teacher will then name a second topic. This will be your target. Your goal is to browse through Wikipedia to reach your target topic by only clicking on hyperlinks within the body of each Wikipedia article. What is the shortest path that you can find to get from the starting topic to the ending topic (i.e., following the fewest number of links)?
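
If the links between articles were known ahead of time, a computer could find the shortest path with a technique called breadth-first search, which explores every page one click away, then every page two clicks away, and so on. The Python sketch below runs that search on a tiny, made-up set of links. The article names and connections in TOY_LINKS are invented for illustration and do not reflect Wikipedia's actual links.

```python
from collections import deque


def shortest_link_path(links, start, target):
    """Return the shortest list of pages from start to target, or None if unreachable."""
    if start == target:
        return [start]
    previous = {start: None}  # page -> the page we clicked from to reach it
    queue = deque([start])    # pages waiting to be explored, nearest first
    while queue:
        page = queue.popleft()
        for neighbor in links.get(page, []):
            if neighbor in previous:
                continue      # already reached by a path that is at least as short
            previous[neighbor] = page
            if neighbor == target:
                # Walk backwards from the target to rebuild the path.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Invented link map for illustration only (not real Wikipedia data).
TOY_LINKS = {
    "Coffee": ["Switzerland", "Caffeine"],
    "Switzerland": ["Geneva", "CERN"],
    "Caffeine": ["Chemistry"],
    "Geneva": ["CERN"],
    "CERN": ["World Wide Web"],
    "Chemistry": [],
}

print(shortest_link_path(TOY_LINKS, "Coffee", "World Wide Web"))
```

Because the search visits pages in order of distance from the start, the first time it reaches the target it has found a path with the fewest possible clicks.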