How Does Google Work? Google As a Search Engine

Ethical Forum, University Foundation, November 2009
Vincent Blondel, Louvain School of Engineering, UCL

«Don’t be evil»

[Photos: Larry Page and Sergey Brin in 1998 and in 2009]

1996: Research project by Larry Page and Sergey Brin at Stanford University.
1998: Google Inc. is founded. 25 million webpages indexed.
2000: One billion webpages indexed.
2004: Google goes public.
2005: One billion images.
2006: "to google" added to the Oxford English Dictionary.

[Figure: example pages with PageRank 6 and PageRank 5]

Google employs a number of techniques to improve search quality including PageRank, anchor text, and proximity information.

• www.google.be: PageRank 10
• www.uclouvain.be: PageRank 9
• www.kbr.be: PageRank 8
• www.fnrs.be: PageRank 7
• www.fondationuniversitaire.be: PageRank 6

The web: a democracy of links
http://kvina.niva.no/booking/

23 billion webpages
http://www.worldwidewebsize.com/

PageRank democracy: let the links vote
Your webpage inherits a high PageRank if it is pointed to by pages that themselves have a high PageRank.

How is this done?
• Frequent updates (Googlebot)
• Storage: 36 data centers (19 in the U.S., 12 in Europe). About 450,000 computers
• Sophisticated distributed computation

To be googleable or not to be

Golden triangle (share of users who look at each result position)
• Position 1: 100%
• Position 2: 100%
• Position 3: 100%
• Position 4: 85%
• Position 5: 60%
• Position 6: 50%
• Position 7: 50%
• Position 8: 30%
• Position 9: 30%
• Position 10: 20%

Genesis 1:1 "In the beginning God created the heaven and the earth."

Organic search results
Sponsored links: paid for by the website owners.
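The "let the links vote" principle stated above (a page ranks high when high-ranking pages point to it) can be illustrated with a small power iteration. This is only a minimal sketch and not Google's production algorithm: the function name `pagerank`, the damping factor 0.85, the handling of pages without outgoing links, and the four-page toy web are all assumptions made for illustration. The scores it returns are probabilities that sum to 1, not the 0 to 10 values shown on the slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank scores by power iteration.

    links: dict mapping each page to the list of pages it links to.
    damping: assumed probability that a surfer follows a link
             instead of jumping to a random page.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page receives a small baseline share (the random jump).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links votes for everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: C is pointed to by A, B and D; nobody points to D.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, page C ends up with the highest score because three pages vote for it, and A comes next because its single incoming link comes from the highly ranked C. Page D, which no page links to, keeps only the baseline share.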
Google ranking robustness
• Google changes the algorithm and the ranking
  Kinderstart had traffic of 10 million a month; something caused its ranking to drop. Traffic dropped 70 percent and revenue dropped 80 percent. Kinderstart sued Google.
• Google removes webpages from index
• Google bombing
  Google: «miserable failure»
• Buy links

From: ***
Date: December 16, 2006 1:20:56 PM CST
To: ***
Subject: google

Dear Girish,
Vincent is the person I had mentioned who has ceased to exist now that his webpage is no longer reachable via Google! Can you please give him a pointer on whom to write to? I am copying him on this message.
Thanks,
--Kumar
--------------
P. R. Kumar
Franklin W. Woeltge Professor of Electrical and Computer Engineering, and Research Professor, Coordinated Science Lab

From: ***
Date: December 18, 2006 6:29
To: Vincent Blondel <[email protected]>
Subject: google

Hi Prof. Blondel,
I'm Prof. Kumar's ex-student, now working at Google. He told me about the problems you were having with your website listings under Google search. It seems your site is being searched properly now - it shows up at the top of the results while searching for your name.
Best,
Girish

Email from a UCL colleague (sometime in 2007).
«One day, I needed the CV of the rector of UCL to submit a project proposal. I used Google and typed "Bernard Coulie". Without paying attention, I actually entered "BernardCoulie", without a space, and landed on two links internal to UCL. They were two files listing the salaries of every member of the university. By accident, these files were poorly protected and accessible to everyone. I alerted the IT staff. General alarm. Overnight, everything was fixed. That left the caches, for which Google and the other search engines had to be contacted so that the information would disappear completely from the web.»

Two months ago
Two weeks ago
Two days ago

2356300965432378987 x 577810098750098318 = ?