Google Hacks
By Tara Calishain, Rael Dornfest
Publisher: O'Reilly
Pub Date: February 2003
ISBN: 0-596-00447-8

Preface

Search engines for large collections of data preceded the World Wide Web by decades. There were those massive library catalogs, hand-typed with painstaking precision on index cards and eventually, to varying degrees, automated. There were the large data collections of professional information companies such as Dialog and LexisNexis. Then there are the still-extant private, expensive medical, real estate, and legal search services.

Those data collections were not always easy to search, but with a little finesse and a lot of patience, it was always possible to search them thoroughly. Information was grouped according to established ontologies, and data was preformatted according to particular guidelines.

Then came the Web.

Information on the Web, as anyone who has ever looked at half a dozen web pages knows, is not all formatted the same way. Nor is it necessarily particularly accurate. Nor up to date. Nor spellchecked. Nonetheless, search engines cropped up, trying to make sense of the rapidly increasing index of information online. Eventually, special syntaxes were added for searching common parts of the average web page (such as title or URL). Search engines evolved rapidly, trying to encompass all the nuances of the billions of documents online, and they continue to evolve today.

Google™ threw its hat into the ring in 1998. The second incarnation of a search engine service known as BackRub, Google took its name from a play on the word "googol," a one followed by a hundred zeros. From the beginning, Google was different from the other major search engines online (AltaVista, Excite, HotBot, and others).

Was it the technology? Partially. The relevance of Google's search results was outstanding and worthy of comment. But more than that, Google's focus and more human face made it stand out online.

With its friendly presentation and its constantly expanding set of options, it's no surprise that Google continues to get lots of fans. There are weblogs devoted to it. Search engine newsletters, such as ResearchBuzz, spend a lot of time covering Google. Legions of devoted fans spend lots of time uncovering undocumented features, creating games (like Google whacking), and even coining new words (like "Googling," the practice of checking out a prospective date or hire via Google's search engine).

In April 2002, Google reached out to its fan base by offering the Google API. The Google API gives developers a legal way to access the Google search results with automated queries (any other way of accessing Google's search results with automated software is against Google's Terms of Service).

Why Google Hacks?

"Hacks" are generally considered to be "quick-n-dirty" solutions to programming problems or interesting techniques for getting a task done. But what does this kind of hacking have to do with Google?

Considering the size of the Google index, there are many times when you might want to do a particular kind of search and you get too many results for the search to be useful. Or you may want to do a search that the current Google interface does not support.

The idea of Google Hacks is not to give you some exhaustive manual of how every command in the Google syntax works, but rather to show you some tricks for making the best use of a search and to show applications of the Google API that perform searches that you can't perform using the regular Google interface. In other words, hacks.

Dozens of programs and interfaces have sprung up from the Google API. Both games and serious applications using Google's database of web pages are available from everybody from the serious programmer to the devoted fan (like me).

How This Book Is Organized

The combination of Google's API and over 3 billion pages of constantly shifting data can do strange things to your imagination and give you lots of new perspectives on how best to search. This book goes beyond the instruction page to the idea of "hacks": tips, tricks, and techniques you can use to make your Google searching experience more fruitful, more fun, or (in a couple of cases) just more weird. This book is divided into several chapters:

Chapter 1
This chapter describes the fundamentals of how Google's search properties work, with some tips for making the most of Google's syntaxes and specialty search offerings. Beyond the list of "this syntax means that," we'll take a look at how to eke every last bit of searching power out of each syntax, and how to mix syntaxes for some truly monster searches.

Chapter 2
Google goes beyond web searching into several different arenas, including images, USENET, and news. Did you know that these collections have their own syntaxes? As you'll learn in this section, Google's equally adroit at helping you holiday shop or search for current events.

Chapter 3
Not all the hacks are ones that you want to install on your desktop or web server. In this section, we'll take a look at third-party services that integrate the Google API with other applications or act as handy web tools, or even check Google by email!

Chapter 4
Google's API doesn't search all Google properties, but sometimes it'd be real handy to take that search for phone numbers or news stories and save it to a file. This collection of scrapers shows you how.

Chapter 5
We'll take a look under the hood at Google's API, considering several different languages and how Google works with each one. Hint: if you've always wanted to learn Perl but never knew what to "do with it," this is your section.

Chapter 6
Once you've got an understanding of the Google API, you'll start thinking of all kinds of ways you can use it.
Take inspiration from this collection of useful applications that use the Google API.

Chapter 7
All work and no play makes for a dull web surfer. This collection of pranks and games turns Google into a poet, a mirror, and a master chef. Well, a chef anyway. Or at least someone who throws ingredients together.

Chapter 8
If you're a web wrangler, you see Google from two sides: from the searcher side and from the side of someone who wants to get the best search ranking for a web site. In this section, you'll learn about Google's (in)famous PageRank, cleaning up for a Google visit, and how to make sure your pages aren't indexed by Google if you don't want them there.

How to Contact Us

We have tested and verified the information in this book to the best of our ability, but you may find that features have changed (or even that we have made mistakes!). As a reader of this book, you can help us to improve future editions by sending us your feedback. Please let us know about any errors, inaccuracies, bugs, misleading or confusing statements, and typos that you find anywhere in this book.

Please also let us know what we can do to make this book more useful to you. We take your comments seriously and will try to incorporate reasonable suggestions into future editions. You can write to us at:

O'Reilly & Associates, Inc.
1005 Gravenstein Hwy N.
Sebastopol, CA 95472
(800) 998-9938 (in the U.S. or Canada)
(707) 829-0515 (international/local)
(707) 829-0104 (fax)

To ask technical questions or to comment on the book, send email to:

[email protected]

The web site for Google Hacks lists examples, errata, and plans for future editions. You can find this page at:

http://www.oreilly.com/catalog/googlehks/

For more information about this book and others, see the O'Reilly web site:

http://www.oreilly.com

Gotta Hack?

To explore Hacks books online or to contribute a hack for future titles, visit:

http://hacks.oreilly.com

Chapter 1. Searching Google

Section 1.1. Hacks #1-28
Section 1.2. What Google Isn't
Section 1.3. What Google Is
Section 1.4. Google Basics
Section 1.5. The Special Syntaxes
Section 1.6. Advanced Search
Hack 1. Setting Preferences
Hack 2. Language Tools
Hack 3. Anatomy of a Search Result
Hack 4. Specialized Vocabularies: Slang and Terminology
Hack 5. Getting Around the 10 Word Limit
Hack 6. Word Order Matters
Hack 7. Repetition Matters
Hack 8. Mixing Syntaxes
Hack 9. Hacking Google URLs
Hack 10. Hacking Google Search Forms
Hack 11. Date-Range Searching
Hack 12. Understanding and Using Julian Dates
Hack 13. Using Full-Word Wildcards
Hack 14. inurl: Versus site:
Hack 15. Checking Spelling
Hack 16. Consulting the Dictionary
Hack 17. Consulting the Phonebook
Hack 18. Tracking Stocks
Hack 19. Google Interface for Translators
Hack 20. Searching Article Archives
Hack 21. Finding Directories of Information
Hack 22. Finding Technical Definitions
Hack 23. Finding Weblog Commentary
Hack 24. The Google Toolbar
Hack 25. The Mozilla Google Toolbar
Hack 26. The Quick Search Toolbar
Hack 27. GAPIS
Hack 28. Googling with Bookmarklets

1.1 Hacks #1-28

Google's front page is deceptively simple: a search form and a couple of buttons. Yet that basic interface, so alluring in its simplicity, belies the power of the Google engine underneath and the wealth of information at its disposal. And if you use Google's search syntax to its fullest, the Web is your research oyster.
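
As a rough illustration of what using the search syntax can look like in practice, here is a minimal Python sketch that mixes plain search terms with the site: and inurl: syntaxes named in the hack list above, then percent-encodes the result into a search URL. The function names (build_query, search_url), the example terms, and the base URL with its q parameter are assumptions made for illustration; they are not taken from this chapter. The sketch only builds a string and sends no automated queries.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, inurl=None):
    """Join plain search terms with optional site: and inurl: syntaxes.

    Hypothetical helper for illustration only.
    """
    parts = list(terms)
    if site:
        parts.append(f"site:{site}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Return a URL with the query percent-encoded in a q parameter.

    The base URL and the parameter name are assumptions.
    """
    return f"{base}?{urlencode({'q': query})}"


query = build_query(["google", "hacks"], site="oreilly.com", inurl="catalog")
url = search_url(query)
print(query)  # google hacks site:oreilly.com inurl:catalog
print(url)
```

Keeping the query text separate from the URL encoding makes it easy to read the mixed syntaxes on their own before they are escaped for the address bar.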