History of Internet Searching

- The Internet was built in the late 1960s.
- Funded by the DoD for scientific research.
- First nodes (connections) on the Internet were at universities (UCLA, UCSB, Stanford, Univ. of Utah)

FTP - File Transfer Protocol
- Current protocol specification (RFC 959) published in 1985.
- FTP Servers provide files to FTP Clients
- Problems with FTP
  - No organization of FTP Servers
  - User had to know an FTP Server existed
  - User had to visit FTP Server to see files

Archie
- 1990 (No WWW)
- Developed at McGill Univ. in Montreal
- Searchable directory of FTP files
- Searched FTP Servers and indexed their files
- User searched the Index
- Required Telnet and FTP

Gopher
- 1991 (WWW Began)
- Paul Lindner & Mark P. McCahill of Univ. of Minnesota
- Named after the Univ. of Minn. Mascot
- Connected Gopher servers through the Gopher hierarchy (gopherspace)
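The Archie approach described above (search FTP servers, index their files, let the user search the index) can be sketched in a few lines. This is only an illustration under stated assumptions: the server names, file paths, and the `build_index` and `search` functions are hypothetical, and the listings are hard-coded rather than gathered from real FTP servers.

```python
from collections import defaultdict

# Hypothetical listings: server name -> file paths it offers.
# A real Archie-style tool would gather these by visiting each FTP server.
LISTINGS = {
    "ftp.example.edu": ["/pub/games/chess.tar", "/pub/docs/readme.txt"],
    "ftp.example.org": ["/pub/tools/chess-engine.zip", "/pub/docs/faq.txt"],
}


def build_index(listings):
    """Map each lowercase file name to the (server, path) pairs where it was found."""
    index = defaultdict(list)
    for server, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1].lower()
            index[filename].append((server, path))
    return index


def search(index, term):
    """Return every (server, path) whose file name contains the search term."""
    term = term.lower()
    hits = []
    for filename, locations in index.items():
        if term in filename:
            hits.extend(locations)
    return sorted(hits)


index = build_index(LISTINGS)
results = search(index, "chess")
```

The point of the sketch is that the user queries one central index and never has to know in advance which server holds a file, which addresses the FTP problems listed earlier.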

Veronica
- Released by the Univ. of Nevada (1992)
- Very Easy Rodent-Oriented Netwide Index
- GOPHER

Jughead
- Jonzy's Universal Gopher Hierarchy Excavation and Display
- The same as VERONICA but not as good

Wanderer (Matthew Gray's World Wide Web Wanderer)
- First WWW Index
- Designed to track the size of the WWW
- Captured URLs and entered them into a database (Wandex)
- First Robots ("bots")

- ARCHIE: The first Internet Search Tool
- VERONICA: The first Internet Search Engine
- Wanderer: The first WWW Index

Search Engine Technology

Three parts to a Search Engine
- Bots (Robots)
- Database
- User Interface

Bots (Robots)
- Also called Spiders
- Computer programs sent out by Query Servers
- Search the Internet for servers
- Identify servers & collect information
- Use links from websites to find other sites

Database
- Collects the information from the Query Server and organizes it.
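The bot behaviour described above (collect information from a site, then use its links to find other sites) might look roughly like the following. It is a minimal sketch, not how any particular search engine works: the `PAGES` dictionary is a hypothetical in-memory stand-in for the web, and `LinkCollector` and `crawl` are illustrative names. A real bot would fetch pages over the network.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML of that page.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://c.example/">C</a>',
    "http://c.example/": '<a href="http://a.example/">A</a> <a href="http://d.example/">D</a>',
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, pages):
    """Breadth-first crawl: record each page's links and follow the new ones."""
    seen = {start_url}
    queue = deque([start_url])
    database = {}  # URL -> links found on that page
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # discovered through a link, but the page could not be read
        parser = LinkCollector()
        parser.feed(html)
        database[url] = parser.links
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return database, seen


database, discovered = crawl("http://a.example/", PAGES)
```

The `seen` set keeps the bot from visiting the same site twice when pages link back to each other.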

User Interface
- Allows users to search the database and returns the information from it.

Relevance Ranking
- Search engine measures the relevance of the information found to your request
- First search engine to use Relevance Ranking was the Repository-Based Software Engineering (RBSE) spider in 1993

Relevance Ranking (Techniques)
- How often do the search terms appear
- How close are the search terms to each other
- Where do the search terms appear
- How often do the search terms appear compared to the length of the web page
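The four ranking questions above (how often, how close, where, and how often relative to page length) can each be turned into a simple measurement. The sketch below is illustrative only: the function names, the choice of "appears in the title" as the location signal, and the sample page are all assumptions, and it does not describe how any real search engine weights these signals.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance_signals(title, body, terms):
    """Compute four simple relevance signals for a page and a list of search terms."""
    terms = [t.lower() for t in terms]
    words = tokenize(body)
    title_words = set(tokenize(title))
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}

    # How often do the search terms appear?
    frequency = sum(len(p) for p in positions.values())

    # How close are the search terms to each other?
    # Smallest word distance between the first two terms; None if not measurable.
    proximity = None
    if len(terms) >= 2 and positions[terms[0]] and positions[terms[1]]:
        proximity = min(
            abs(i - j) for i in positions[terms[0]] for j in positions[terms[1]]
        )

    # Where do the search terms appear? Here: how many of them are in the title.
    in_title = sum(1 for t in terms if t in title_words)

    # How often do the terms appear compared to the length of the page?
    density = frequency / len(words) if words else 0.0

    return {
        "frequency": frequency,
        "proximity": proximity,
        "in_title": in_title,
        "density": density,
    }


signals = relevance_signals(
    "Search Engine History",
    "Archie was an early search tool. The first search engine for Gopher was Veronica.",
    ["search", "engine"],
)
```

How the four signals are combined into a single score is a design choice that the material above does not specify, so the sketch returns them separately.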

Six Myths of the Internet

1) All information is there.

2) It is always the best choice.

3) It is free.

4) It is easy and well organized.

5) It is trustworthy.

6) Search engines give you access to the entire WWW.
