Search Engine Optimization with PHP


Professional Search Engine Optimization with PHP
A Developer’s Guide to SEO

Jaimie Sirovich
Cristian Darie

Professional Search Engine Optimization with PHP: A Developer’s Guide to SEO

Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2007 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-10092-9
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1

Library of Congress Cataloging-in-Publication Data:
Sirovich, Jaimie, 1981-
Professional search engine optimization with PHP : a developer's guide to SEO / Jaimie Sirovich, Cristian Darie.
p. cm.
Includes index.
ISBN 978-0-470-10092-9 (pbk.)
1. PHP (Computer program language) 2. Web sites--Design. 3. Search engines. I. Darie, Cristian. II. Title.
QA76.73.P224S525 2007
005.13'3--dc22
2007003317

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at http://www.wiley.com/go/permissions.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Microsoft and Excel are registered trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are the property of their respective owners.
Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

About the Authors

Jaimie Sirovich is a search engine marketing consultant. He works with his clients to build them powerful online presences. Officially Jaimie is a computer programmer, but he claims to enjoy marketing much more. He graduated from Stevens Institute of Technology with a BS in Computer Science. He worked under Barry Schwartz at RustyBrick, Inc., as lead programmer on e-commerce projects until 2005. At present, Jaimie consults for several organizations and administrates the popular search engine marketing blog, SEOEgghead.com.

Cristian Darie is a software engineer with experience in a wide range of modern technologies, and the author of numerous books and tutorials on AJAX, ASP.NET, PHP, SQL, and related areas. Cristian currently lives in Bucharest, Romania, studying distributed application architectures for his PhD. He’s getting involved with various commercial and research projects, and when not planning to buy Google, he enjoys his bit of social life. If you want to say “Hi,” you can reach Cristian through his personal web site at http://www.cristiandarie.ro.

Credits

Acquisitions Editor: Kit Kemper
Developmental Editor: Kenyon Brown
Technical Editor: Bogdan Brinzarea
Production Editor: Angela Smith
Copy Editor: Kim Cofer
Editorial Manager: Mary Beth Wakefield
Production Manager: Tim Tate
Vice President and Executive Group Publisher: Richard Swadley
Vice President and Executive Publisher: Joseph B. Wikert
Compositor: Laurie Stewart, Happenstance Type-O-Rama
Proofreader: Ian Golder
Indexer: Melanie Belkin
Anniversary Logo Design: Richard Pacifico

Acknowledgments

The authors would like to thank the following people and companies, listed alphabetically, for their invaluable assistance with the production of this book. Without their help, this book would not have been possible in its current form.

Dan Kramer of Volatile Graphix for generously providing his cloaking database to the public, and even adding some data to make our cloaking code examples work better.

Kim Krause Berg of The Usability Effect for providing assistance and insight where this book references usability and accessibility topics.

MaxMind, Inc., for providing their free GeoLite geo-targeting data, making our geo-targeting code examples possible.

Several authors of WordPress plugins including Arne Brachhold, Lester Chan, Peter Harkins, Matt Lloyd, and Thomas McMahon.

Family and friends of both Jaimie and Cristian, for tolerating the endless trail of empty cans of (caffeinated) soda left on the table while writing this book.

Contents

Acknowledgments
Introduction

Chapter 1: You: Programmer and Search Engine Marketer
  Who Are You?
  What Do You Need to Learn?
  SEO and the Site Architecture
  SEO Cannot Be an Afterthought
  Communicating Architectural Decisions
  Architectural Minutiae Can Make or Break You
  Preparing Your Playground
  Installing XAMPP
  Preparing the Working Folder
  Preparing the Database
  Summary

Chapter 2: A Primer in Basic SEO
  Introduction to SEO
  Link Equity
  Google PageRank
  A Word on Usability and Accessibility
  Search Engine Ranking Factors
  On-Page Factors
  Visible On-Page Factors
  Invisible On-Page Factors
  Time-Based Factors
  External Factors
  Potential Search Engine Penalties
  The Google “Sandbox Effect”
  The Expired Domain Penalty
  Duplicate Content Penalty
  The Google Supplemental Index
  Resources and Tools
  Web Analytics
  Market Research
  Researching Keywords
  Browser Plugins
  Community Forums
  Search Engine Blogs and Resources
  Summary

Chapter 3: Provocative SE-Friendly URLs
  Why Do URLs Matter?
  Static URLs and Dynamic URLs
  Static URLs
  Dynamic URLs
  URLs and CTR
  URLs and Duplicate Content
  URLs of the Real World
  Example #1: Dynamic URLs
  Example #2: Numeric Rewritten URLs
  Example #3: Keyword-Rich Rewritten URLs
  Maintaining URL Consistency
  URL Rewriting
  Installing mod_rewrite
  Testing mod_rewrite
  Introducing Regular Expressions
  URL Rewriting and PHP
  Rewriting Numeric URLs with Two Parameters
  Rewriting Keyword-Rich URLs
  Building a Link Factory
  Pagination and URL Rewriting
  Rewriting Images and Streaming Media
  Problems Rewriting Doesn’t Solve
  A Last Word of Caution
  Summary

Chapter 4: Content Relocation and HTTP Status Codes
  HTTP Status Codes
  Redirection Using 301 and 302
  301
  302
  Removing Deleted Pages Using 404
  Avoiding Indexing Error Pages Using 500
  Redirecting with PHP and mod_rewrite
  Using Redirects to Change File Names
  URL Correction
  Dealing with Multiple Domain Names Properly
  Using Redirects to Change Domain Names
  URL Canonicalization: www.example.com versus example.com
  URL Canonicalization: /index.php versus /
  Other Types of Redirects
  Summary

Chapter 5: Duplicate Content
  Causes and Effects of Duplicate Content
  Duplicate Content as a Result of Site Architecture
  Duplicate Content as a Result of Content Theft
  Excluding Duplicate Content
  Using the Robots Meta Tag
  robots.txt Pattern Exclusion
  Solutions for Commonly Duplicated Pages
  Print-Friendly Pages
  Navigation Links and Breadcrumb Navigation
  Similar Pages
  Pages with Duplicate Meta Tag or Title Values
  URL Canonicalization
  URL-Based Session IDs
  Other Navigational