Spidering Hacks

Total Page:16

File Type:pdf, Size:1020Kb

Spidering Hacks SPIDERING HACKSTM 100 Industrial-Strength Tips & Tools Kevin Hemenway & Tara Calishain Spidering Hacks Spidering Hacks By Tara Calishain, Kevin Hemenway Publisher: O'Reilly Pub Date: October 2003 ISBN: 0−596−00577−6 Pages: 424 Copyright Credits About the Authors Contributors Preface Why Spidering Hacks? How This Book Is Organized How to Use This Book Conventions Used in This Book How to Contact Us Got a Hack? Chapter 1. Walking Softly Hacks #1−7 Hack 1. A Crash Course in Spidering and Scraping Hack 2. Best Practices for You and Your Spider Hack 3. Anatomy of an HTML Page Hack 4. Registering Your Spider Hack 5. Preempting Discovery Hack 6. Keeping Your Spider Out of Sticky Situations Hack 7. Finding the Patterns of Identifiers Chapter 2. Assembling a Toolbox Hacks #8−32 Perl Modules Resources You May Find Helpful Hack 8. Installing Perl Modules Hack 9. Simply Fetching with LWP::Simple Hack 10. More Involved Requests with LWP::UserAgent Hack 11. Adding HTTP Headers to Your Request Hack 12. Posting Form Data with LWP Hack 13. Authentication, Cookies, and Proxies Hack 14. Handling Relative and Absolute URLs Hack 15. Secured Access and Browser Attributes Hack 16. Respecting Your Scrapee's Bandwidth Hack 17. Respecting robots.txt Hack 18. Adding Progress Bars to Your Scripts Hack 19. Scraping with HTML::TreeBuilder Hack 20. Parsing with HTML::TokeParser Hack 21. WWW::Mechanize 101 Hack 22. Scraping with WWW::Mechanize Hack 23. In Praise of Regular Expressions Hack 24. Painless RSS with Template::Extract Hack 25. A Quick Introduction to XPath Hack 26. Downloading with curl and wget Hack 27. More Advanced wget Techniques 1 Spidering Hacks Hack 28. Using Pipes to Chain Commands Hack 29. Running Multiple Utilities at Once Hack 30. Utilizing the Web Scraping Proxy Hack 31. Being Warned When Things Go Wrong Hack 32. Being Adaptive to Site Redesigns Chapter 3. Collecting Media Files Hacks #33−42 Hack 33. Detective Case Study: Newgrounds Hack 34. Detective Case Study: iFilm Hack 35. Downloading Movies from the Library of Congress Hack 36. Downloading Images from Webshots Hack 37. Downloading Comics with dailystrips Hack 38. Archiving Your Favorite Webcams Hack 39. News Wallpaper for Your Site Hack 40. Saving Only POP3 Email Attachments Hack 41. Downloading MP3s from a Playlist Hack 42. Downloading from Usenet with nget Chapter 4. Gleaning Data from Databases Hacks #43−89 Hack 43. Archiving Yahoo! Groups Messages with yahoo2mbox Hack 44. Archiving Yahoo! Groups Messages with WWW::Yahoo::Groups Hack 45. Gleaning Buzz from Yahoo! Hack 46. Spidering the Yahoo! Catalog Hack 47. Tracking Additions to Yahoo! Hack 48. Scattersearch with Yahoo! and Google Hack 49. Yahoo! Directory Mindshare in Google Hack 50. Weblog−Free Google Results Hack 51. Spidering, Google, and Multiple Domains Hack 52. Scraping Amazon.com Product Reviews Hack 53. Receive an Email Alert for Newly Added Amazon.com Reviews Hack 54. Scraping Amazon.com Customer Advice Hack 55. Publishing Amazon.com Associates Statistics Hack 56. Sorting Amazon.com Recommendations by Rating Hack 57. Related Amazon.com Products with Alexa Hack 58. Scraping Alexa's Competitive Data with Java Hack 59. Finding Album Information with FreeDB and Amazon.com Hack 60. Expanding Your Musical Tastes Hack 61. Saving Daily Horoscopes to Your iPod Hack 62. Graphing Data with RRDTOOL Hack 63. Stocking Up on Financial Quotes Hack 64. Super Author Searching Hack 65. Mapping O'Reilly Best Sellers to Library Popularity Hack 66. Using All Consuming to Get Book Lists Hack 67. Tracking Packages with FedEx Hack 68. Checking Blogs for New Comments Hack 69. Aggregating RSS and Posting Changes Hack 70. Using the Link Cosmos of Technorati Hack 71. Finding Related RSS Feeds Hack 72. Automatically Finding Blogs of Interest Hack 73. Scraping TV Listings Hack 74. What's Your Visitor's Weather Like? Hack 75. Trendspotting with Geotargeting Hack 76. Getting the Best Travel Route by Train Hack 77. Geographic Distance and Back Again 2 Spidering Hacks Hack 78. Super Word Lookup Hack 79. Word Associations with Lexical Freenet Hack 80. Reformatting Bugtraq Reports Hack 81. Keeping Tabs on the Web via Email Hack 82. Publish IE's Favorites to Your Web Site Hack 83. Spidering GameStop.com Game Prices Hack 84. Bargain Hunting with PHP Hack 85. Aggregating Multiple Search Engine Results Hack 86. Robot Karaoke Hack 87. Searching the Better Business Bureau Hack 88. Searching for Health Inspections Hack 89. Filtering for the Naughties Chapter 5. Maintaining Your Collections Hacks #90−93 Hack 90. Using cron to Automate Tasks Hack 91. Scheduling Tasks Without cron Hack 92. Mirroring Web Sites with wget and rsync Hack 93. Accumulating Search Results Over Time Chapter 6. Giving Back to the World Hacks #94−100 Hack 94. Using XML::RSS to Repurpose Data Hack 95. Placing RSS Headlines on Your Site Hack 96. Making Your Resources Scrapable with Regular Expressions Hack 97. Making Your Resources Scrapable with a REST Interface Hack 98. Making Your Resources Scrapable with XML−RPC Hack 99. Creating an IM Interface Hack 100. Going Beyond the Book Colophon Index Book: Spidering Hacks Copyright © 2004 O'Reilly & Associates, Inc. Printed in the United States of America. Published by O'Reilly & Associates, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O'Reilly & Associates books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998−9938 or [email protected]. Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly & Associates, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. The association between the image of a flex scraper and the topic of spidering is a trademark of O'Reilly & Associates, Inc. While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information 3 Spidering Hacks contained herein. The trademarks "Hacks Books" and "The Hacks Series," and related trade dress, are owned by O'Reilly & Associates, Inc., in the United States and other countries, and may not be used without written permission. All other trademarks are property of their respective owners. URL spiderhks−PREFACE−1 4 Spidering Hacks Book: Spidering Hacks Section: Credits About the Authors Kevin Hemenway, coauthor of Mac OS X Hacks (O'Reilly), is better known as Morbus Iff, the creator of disobey.com, which bills itself as "content for the discontented." Publisher and developer of more home cooking than you could ever imagine (like the popular open source aggregator AmphetaDesk, the best−kept gaming secret Gamegrene.com, the popular Ghost Sites and Nonsense Network, giggle−inducing articles at the O'Reilly Network, etc.), he has a fondness for expensive bags of beef jerky. Living in Concord, New Hampshire, he can be reached at [email protected]. Tara Calishain is the editor of the online newsletter ResearchBuzz (http://www.researchbuzz.com) and author or coauthor of several books, including the O'Reilly best seller Google Hacks. A search engine enthusiast for many years, she began her foray into the world of Perl when Google released its API in April 2002. URL /spiderhks−PREFACE−2−SECT−1 About the Authors 5 Spidering Hacks Book: Spidering Hacks Section: Credits Contributors The following people contributed their hacks, writing, and inspiration to this book: • Jacek Artymiak (http://www.artymiak.com) is a freelance consultant, developer and writer. He's been programming computers since 1986, starting with Sinclair ZX Spectrum. His interests include network security, computer graphics and animation, and multimedia. Jacek lives in Lublin, Poland, with his wife Gosia and can be reached at [email protected]. • Chris Ball (http://printf.net) holds a BSc in Computation from UMIST, England, and works on machine learning problems as a researcher in Cambridge University's Inference Group. In his spare time, he can be found not answering email and hanging rooks at chess tournaments. You can email him at [email protected]. • Paul Bausch (http://www.onfocus.com) is a web application developer and the cocreator of Blogger (http://www.blogger.com), the popular weblog software. His favorite site to spider is Amazon.com, the subject of his book Amazon Hacks (O'Reilly). He's also the coauthor of We Blog: Publishing Online with Weblogs (John Wiley & Sons), and he posts thoughts and photos almost daily to his personal weblog: onfocus. • Erik Benson (http://www.erikbenson.com) is the Technical Program Manager of Personalized Merchandizing at Amazon.com (http://www.amazon.com). He runs All Consuming (http://www.allconsuming.net) and has a weblog at http://www.erikbenson.com. • Adam Bregenzer is a programmer in Atlanta, GA. He has been programming for most of his life and is an advocate of Unix−like operating systems. In his free time, he plots to take over the world . and eat Mexican food. • Daniel Biddle spends far too much time reading abstruse technical documents and solving people's programming problems on IRC at irc.freenode.net, where he goes by the nickname deltab. He can be contacted at [email protected]. • Sean M. Burke is the author of O'Reilly's Perl & LWP, RTF Pocket Guide, and many of the articles in the Best of the Perl Journal volumes. An active member in the Perl open source community, he is one of CPAN's most prolific module authors and an authority on markup languages.
Recommended publications
  • [PDF] Beginning Raku
    Beginning Raku Arne Sommer Version 1.00, 22.12.2019 Table of Contents Introduction. 1 The Little Print . 1 Reading Tips . 2 Content . 3 1. About Raku. 5 1.1. Rakudo. 5 1.2. Running Raku in the browser . 6 1.3. REPL. 6 1.4. One Liners . 8 1.5. Running Programs . 9 1.6. Error messages . 9 1.7. use v6. 10 1.8. Documentation . 10 1.9. More Information. 13 1.10. Speed . 13 2. Variables, Operators, Values and Procedures. 15 2.1. Output with say and print . 15 2.2. Variables . 15 2.3. Comments. 17 2.4. Non-destructive operators . 18 2.5. Numerical Operators . 19 2.6. Operator Precedence . 20 2.7. Values . 22 2.8. Variable Names . 24 2.9. constant. 26 2.10. Sigilless variables . 26 2.11. True and False. 27 2.12. // . 29 3. The Type System. 31 3.1. Strong Typing . 31 3.2. ^mro (Method Resolution Order) . 33 3.3. Everything is an Object . 34 3.4. Special Values . 36 3.5. :D (Defined Adverb) . 38 3.6. Type Conversion . 39 3.7. Comparison Operators . 42 4. Control Flow . 47 4.1. Blocks. 47 4.2. Ranges (A Short Introduction). 47 4.3. loop . 48 4.4. for . 49 4.5. Infinite Loops. 53 4.6. while . 53 4.7. until . 54 4.8. repeat while . 55 4.9. repeat until. 55 4.10. Loop Summary . 56 4.11. if . ..
    [Show full text]
  • Perl Baseless Myths & Startling Realities
    http://xkcd.com/224/ 1 Perl Baseless Myths & Startling Realities by Tim Bunce, February 2008 2 Parrot and Perl 6 portion incomplete due to lack of time (not lack of myths!) Realities - I'm positive about Perl Not negative about other languages - Pick any language well suited to the task - Good developers are always most important, whatever language is used 3 DISPEL myths UPDATE about perl Who am I? - Tim Bunce - Author of the Perl DBI module - Using Perl since 1991 - Involved in the development of Perl 5 - “Pumpkin” for 5.4.x maintenance releases - http://blog.timbunce.org 4 Perl 5.4.x 1997-1998 Living on the west coast of Ireland ~ Myths ~ 5 http://www.bleaklow.com/blog/2003/08/new_perl_6_book_announced.html ~ Myths ~ - Perl is dead - Perl is hard to read / test / maintain - Perl 6 is killing Perl 5 6 Another myth: Perl is slow: http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results ~ Myths ~ - Perl is dead - Perl is hard to read / test / maintain - Perl 6 is killing Perl 5 7 Perl 5 - Perl 5 isn’t the new kid on the block - Perl is 21 years old - Perl 5 is 14 years old - A mature language with a mature culture 8 How many times Microsoft has changed developer technologies in the last 14 years... 9 10 You can guess where thatʼs leading... From “The State of the Onion 10” by Larry Wall, 2006 http://www.perl.com/pub/a/2006/09/21/onion.html?page=3 Buzz != Jobs - Perl5 hasn’t been generating buzz recently - It’s just getting on with the job - Lots of jobs - just not all in web development 11 Web developers tend to have a narrow focus.
    [Show full text]
  • Learning Perl Through Examples Part 2 L1110@BUMC 2/22/2017
    www.perl.org Learning Perl Through Examples Part 2 L1110@BUMC 2/22/2017 Yun Shen, Programmer Analyst [email protected] IS&T Research Computing Services Spring 2017 Tutorial Resource Before we start, please take a note - all the codes and www.perl.org supporting documents are accessible through: • http://rcs.bu.edu/examples/perl/tutorials/ Yun Shen, Programmer Analyst [email protected] IS&T Research Computing Services Spring 2017 Sign In Sheet We prepared sign-in sheet for each one to sign www.perl.org We do this for internal management and quality control So please SIGN IN if you haven’t done so Yun Shen, Programmer Analyst [email protected] IS&T Research Computing Services Spring 2017 Evaluation One last piece of information before we start: www.perl.org • DON’T FORGET TO GO TO: • http://rcs.bu.edu/survey/tutorial_evaluation.html Leave your feedback for this tutorial (both good and bad as long as it is honest are welcome. Thank you) Yun Shen, Programmer Analyst [email protected] IS&T Research Computing Services Spring 2017 Today’s Topic • Basics on creating your code www.perl.org • About Today’s Example • Learn Through Example 1 – fanconi_example_io.pl • Learn Through Example 2 – fanconi_example_str_process.pl • Learn Through Example 3 – fanconi_example_gene_anno.pl • Extra Examples (if time permit) Yun Shen, Programmer Analyst [email protected] IS&T Research Computing Services Spring 2017 www.perl.org Basics on creating your code How to combine specs, tools, modules and knowledge. Yun Shen, Programmer Analyst [email protected] IS&T Research Computing
    [Show full text]
  • Linux Lunacy V & Perl Whirl
    SPEAKERS Linux Lunacy V Nicholas Clark Scott Collins & Perl Whirl ’05 Mark Jason Dominus Andrew Dunstan Running Concurrently brian d foy Jon “maddog” Hall Southwestern Caribbean Andrew Morton OCTOBER 2ND TO 9TH, 2005 Ken Pugh Allison Randal Linux Lunacy V and Perl Whirl ’05 run concurrently. Attendees can mix and match, choosing courses from Randal Schwartz both conferences. Doc Searls Ted Ts’o Larry Wall Michael Warfield DAY PORT ARRIVE DEPART CONFERENCE SESSIONS Sunday, Oct 2 Tampa, Florida — 4:00pm 7:15pm, Bon Voyage Party Monday, Oct 3 Cruising The Caribbean — — 8:30am – 5:00pm Tuesday, Oct 4 Grand Cayman 7:00am 4:00pm 4:00pm – 7:30pm Wednesday, Oct 5 Costa Maya, Mexico 10:00am 6:00pm 6:00pm – 7:30pm Thursday, Oct 6 Cozumel, Mexico 7:00am 6:00pm 6:00pm – 7:30pm Friday, Oct 7 Belize City, Belize 7:30am 4:30pm 4:30pm – 8:00pm Saturday, Oct 8 Cruising The Caribbean — — 8:30am – 5:00pm Sunday, Oct 9 Tampa, Florida 8:00am — Perl Whirl ’05 and Linux Lunacy V Perl Whirl ’05 are running concurrently. Attendees can mix and match, choosing courses Seminars at a Glance from both conferences. You may choose any combination Regular Expression Mastery (half day) Programming with Iterators and Generators of full-, half-, or quarter-day seminars Speaker: Mark Jason Dominus Speaker: Mark Jason Dominus (half day) for a total of two-and-one-half Almost everyone has written a regex that failed Sometimes you’ll write a function that takes too (2.5) days’ worth of sessions. The to match something they wanted it to, or that long to run because it produces too much useful conference fee is $995 and includes matched something they thought it shouldn’t, and information.
    [Show full text]
  • EN-Google Hacks.Pdf
    Table of Contents Credits Foreword Preface Chapter 1. Searching Google 1. Setting Preferences 2. Language Tools 3. Anatomy of a Search Result 4. Specialized Vocabularies: Slang and Terminology 5. Getting Around the 10 Word Limit 6. Word Order Matters 7. Repetition Matters 8. Mixing Syntaxes 9. Hacking Google URLs 10. Hacking Google Search Forms 11. Date-Range Searching 12. Understanding and Using Julian Dates 13. Using Full-Word Wildcards 14. inurl: Versus site: 15. Checking Spelling 16. Consulting the Dictionary 17. Consulting the Phonebook 18. Tracking Stocks 19. Google Interface for Translators 20. Searching Article Archives 21. Finding Directories of Information 22. Finding Technical Definitions 23. Finding Weblog Commentary 24. The Google Toolbar 25. The Mozilla Google Toolbar 26. The Quick Search Toolbar 27. GAPIS 28. Googling with Bookmarklets Chapter 2. Google Special Services and Collections 29. Google Directory 30. Google Groups 31. Google Images 32. Google News 33. Google Catalogs 34. Froogle 35. Google Labs Chapter 3. Third-Party Google Services 36. XooMLe: The Google API in Plain Old XML 37. Google by Email 38. Simplifying Google Groups URLs 39. What Does Google Think Of... 40. GooglePeople Chapter 4. Non-API Google Applications 41. Don't Try This at Home 42. Building a Custom Date-Range Search Form 43. Building Google Directory URLs 44. Scraping Google Results 45. Scraping Google AdWords 46. Scraping Google Groups 47. Scraping Google News 48. Scraping Google Catalogs 49. Scraping the Google Phonebook Chapter 5. Introducing the Google Web API 50. Programming the Google Web API with Perl 51. Looping Around the 10-Result Limit 52.
    [Show full text]
  • Name Description
    Perl version 5.10.0 documentation - perlnewmod NAME perlnewmod - preparing a new module for distribution DESCRIPTION This document gives you some suggestions about how to go about writingPerl modules, preparing them for distribution, and making them availablevia CPAN. One of the things that makes Perl really powerful is the fact that Perlhackers tend to want to share the solutions to problems they've faced,so you and I don't have to battle with the same problem again. The main way they do this is by abstracting the solution into a Perlmodule. If you don't know what one of these is, the rest of thisdocument isn't going to be much use to you. You're also missing out onan awful lot of useful code; consider having a look at perlmod, perlmodlib and perlmodinstall before coming back here. When you've found that there isn't a module available for what you'retrying to do, and you've had to write the code yourself, considerpackaging up the solution into a module and uploading it to CPAN so thatothers can benefit. Warning We're going to primarily concentrate on Perl-only modules here, ratherthan XS modules. XS modules serve a rather different purpose, andyou should consider different things before distributing them - thepopularity of the library you are gluing, the portability to otheroperating systems, and so on. However, the notes on preparing the Perlside of the module and packaging and distributing it will apply equallywell to an XS module as a pure-Perl one. What should I make into a module? You should make a module out of any code that you think is going to beuseful to others.
    [Show full text]
  • Coleman-Coding-Freedom.Pdf
    Coding Freedom !" Coding Freedom THE ETHICS AND AESTHETICS OF HACKING !" E. GABRIELLA COLEMAN PRINCETON UNIVERSITY PRESS PRINCETON AND OXFORD Copyright © 2013 by Princeton University Press Creative Commons Attribution- NonCommercial- NoDerivs CC BY- NC- ND Requests for permission to modify material from this work should be sent to Permissions, Princeton University Press Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540 In the United Kingdom: Princeton University Press, 6 Oxford Street, Woodstock, Oxfordshire OX20 1TW press.princeton.edu All Rights Reserved At the time of writing of this book, the references to Internet Web sites (URLs) were accurate. Neither the author nor Princeton University Press is responsible for URLs that may have expired or changed since the manuscript was prepared. Library of Congress Cataloging-in-Publication Data Coleman, E. Gabriella, 1973– Coding freedom : the ethics and aesthetics of hacking / E. Gabriella Coleman. p. cm. Includes bibliographical references and index. ISBN 978-0-691-14460-3 (hbk. : alk. paper)—ISBN 978-0-691-14461-0 (pbk. : alk. paper) 1. Computer hackers. 2. Computer programmers. 3. Computer programming—Moral and ethical aspects. 4. Computer programming—Social aspects. 5. Intellectual freedom. I. Title. HD8039.D37C65 2012 174’.90051--dc23 2012031422 British Library Cataloging- in- Publication Data is available This book has been composed in Sabon Printed on acid- free paper. ∞ Printed in the United States of America 1 3 5 7 9 10 8 6 4 2 This book is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE !" We must be free not because we claim freedom, but because we practice it.
    [Show full text]
  • Teach Yourself Perl 5 in 21 Days
    Teach Yourself Perl 5 in 21 days David Till Table of Contents: Introduction ● Who Should Read This Book? ● Special Features of This Book ● Programming Examples ● End-of-Day Q& A and Workshop ● Conventions Used in This Book ● What You'll Learn in 21 Days Week 1 Week at a Glance ● Where You're Going Day 1 Getting Started ● What Is Perl? ● How Do I Find Perl? ❍ Where Do I Get Perl? ❍ Other Places to Get Perl ● A Sample Perl Program ● Running a Perl Program ❍ If Something Goes Wrong ● The First Line of Your Perl Program: How Comments Work ❍ Comments ● Line 2: Statements, Tokens, and <STDIN> ❍ Statements and Tokens ❍ Tokens and White Space ❍ What the Tokens Do: Reading from Standard Input ● Line 3: Writing to Standard Output ❍ Function Invocations and Arguments ● Error Messages ● Interpretive Languages Versus Compiled Languages ● Summary ● Q&A ● Workshop ❍ Quiz ❍ Exercises Day 2 Basic Operators and Control Flow ● Storing in Scalar Variables Assignment ❍ The Definition of a Scalar Variable ❍ Scalar Variable Syntax ❍ Assigning a Value to a Scalar Variable ● Performing Arithmetic ❍ Example of Miles-to-Kilometers Conversion ❍ The chop Library Function ● Expressions ❍ Assignments and Expressions ● Other Perl Operators ● Introduction to Conditional Statements ● The if Statement ❍ The Conditional Expression ❍ The Statement Block ❍ Testing for Equality Using == ❍ Other Comparison Operators ● Two-Way Branching Using if and else ● Multi-Way Branching Using elsif ● Writing Loops Using the while Statement ● Nesting Conditional Statements ● Looping Using
    [Show full text]
  • Advancing Perl Supporting Community
    Advancing Perl Supporting Community 2012 Year End Report Board of Directors Nathan Torkington Chairman Karen Pauley President Jim Brandt Secretary Dan Wright Treasurer Kurt DeMaagd Curtis “Ovid” Poe Allison Randal Kevin Lenzo Director Emeritus Committee Chairs Alberto Simões Grants Heath Bair Conferences Mark Keating Marketing Ya’akov Sloman Community Advocacy Karen Pauley Steering The Perl Foundation 340 S Lemon Ave #6055 Walnut, CA 91789 The Perl Foundation is a business alias for The Yet Another Society, a 501-c-3 charitable organization, incorporated in the state of Michigan. A Look Inside: TPF Inside: A Look President’s Message This has been a great year for The Perl Foundation. Thanks to the generosity of our donors and corporate partners, the commitment of our volunteers, and the vitality and energy of our community, we have been able to continue in our mission to advance the Perl programming language. As you know, TPF depends on donations, both of time and money, to operate. These donations have made it possible for us to provide financial support to key developers of Perl 5 and Perl 6, to continue our efforts in protecting our trademark, and to support many community programs. One of the pillars of Perl is its community. We recently launched an initiative, the Community Advocacy Committee, targeting the growth and health of the Perl community. With the help of this committee we plan to not only improve communication within our own community but to expand our links with other F/OSS communities. This year has been one pointing to a great future for Perl and those who use it.
    [Show full text]
  • Proceedings YAPC::Europe 2012 .Com Perl Software Development Services Table of Contents 
    Proceedings YAPC::Europe 2012 .com Perl Software Development Services Table of contents Foreword 5 FatPacker: understanding and appreciating the insanity 6 Vorbild::Beitrag::POD 8 CGI.pm MUST DIE - Together we shall annihilate CGI.pm! 9 CPANTS: Kwalitative website and its tools 10 Designing the Internet of Things: Arduino and Perl 11 Dancing with WebSockets 12 Dancer 2 - Official status: 14 Practical Dancer: moving away from CGI 16 Bringing Perl to a Younger Generation 18 Asynchronous programming FTW! 22 Mojolicious 24 Continuous deployment with Perl 36 Array programming for mere mortals 37 Ontology Aware Applications 42 Distributed Code Review in Perl - State of the Practice 48 address-sanitizer - A fast memory error detector 52 Exceptional Perl 6 55 The joy of breaking stuff 59 Macros in Rakudo 66 Why statement modifiers can harm maintainability! 70 A discussion on how to organize a Perl Mongers group 72 Building C/C++ libraries with Perl 74 8PSMET#1*O0OMJOF"DDPNNPEBUJPO3FTFSWBUJPOTBOETUJMMHSPXJOH 8FOFFE1FSM%FWFMPQFST .Z4RM%#"T 8FVTF1FSM 1VQQFU "QBDIF 4PGUXBSF%FWFMPQFST 4ZT"ENJOT .Z42- .FNDBDIF (JU -JOVY $JTDP 8FC%FTJHOFST 'SPOU&OE%FWFMPQFST +VOJQFSBOENPSF /FUXPSL&OHJOFFSTBOENPSFw /08)*3*/( (SFBUMPDBUJPOJOUIFDFOUFSPG"NTUFSEBN $PNQFUJUJWF4BMBSZ3FMPDBUJPO1BDLBHF *OUFSOBUJPOBM SFTVMUESJWFOEZOBNJDXPSLFOWJSPONFOU *OUFSFTUFE XXXCPPLJOHDPNKPCT Foreword 5 Welcome to YAPC::Europe 2012. This is the fourteenth European Perl conference! The Frankfurt 8PSMET#1*O0OMJOF"DDPNNPEBUJPO3FTFSWBUJPOTBOETUJMMHSPXJOH Perlmongers have great pleasure in hosting this event this year. We‘d like to welcome you here in Frankfurt. The city that is now the heart of the Perl community for at least days. We have attendees from more than 40 countries all over the world, so there is a rich mix of different cultures and different people.
    [Show full text]
  • Perlmonks.Com with a Gaim Plug-In SEEKING WISDOM
    PROGRAMMING Perl: A Gaim Plugin Get the news from perlmonks.com with a Gaim plug-in SEEKING WISDOM irst-time visitors to perlmonks. The Gaim project offers an instant messenger client that speaks a large com are rubbing their eyes in dis- Fbelief: High-caliber Perl hackers number of protocols. We’ll show you how to extend Gaim with Perl are jumping to answer even the simplest of newbie questions. The reason for this plugins. BY MICHAEL SCHILLI is that the community assigns XP (expe- rience) points for the best replies. And the more XP you have, the higher you answer a question correctly typically However, it can take a few seconds to climb in the ranking, from a novice, to a gets the most XP. Instead of pulling the download the content of a remote web monk, and slowly to expert status. web page with the latest questions time page. Both DNS name resolution and the and time again, it makes sense to script process of retrieving the content of the Best of Class the process and have the script let you requested web page can take some time, Due to the community dynamics on know when a new query arrives. during which the CPU should return to perlmonks.com, the The pmwatcher.pl script described other tasks. The tried-and-trusted POE first person to in this issue fetches the perlmonks. [3] framework provides exactly what we com page with the Newest Nodes at need. The POE kernel runs a single pro- regular intervals, remembering cess (and only a single thread), but uses older entries and sending out an cooperative multitasking between con- instant message when it discovers current tasks to ensure that each one is new postings.
    [Show full text]
  • The Perl Review Perl Mongers News 5 Perl Foundation News 5 Perlwar
    The Perl Review 2.1 The Perl Review Volume 2, Issue 1 Winter 2005 Perl Mongers News 5 Editor / Publisher: brian d foy Perl Foundation News 5 Design: Eric Maki PerlWar 6 Columnists: David H. Adler Andy Lester The Seven Sins of Alberto Simões Perl OO Programming 10 Copy Editors: Mike Fragassi Perl News 16 Thomas Mackenzie Haskell for Perlers 18 www.theperlreview.com [email protected] Hash Anti-Patterns 26 The Perl Review Book Review 29 5250 N. Broadway Suite 157 Chicago, IL, 60640 Book Review 30 The Perl Review (ISSN 1553-667X) is published quarterly by brian d foy. Publication Office: 5301 N. Kenmore Ave #2, Chicago, IL 60640. US subscriber rate is $16 per year (4 issues). Foreign subscriber rate is $30 US per year (4 issues). Single copies are $5 in the US, $8 elsewhere. Postmaster: Send address changes to The Perl Review, 5250 N. Broadway Suite 157, Chicago, IL, 60640. Subscribers: Subscribe online or send a check or money order drawn in US funds to the above address. Advertisers: Send email to [email protected] for rates and information. On the Cover Long before people had computers, they'd sit at home and stitch their programs into cloth. It was a simpler time when Copyright © 2005 The Perl Review people were more careful with their code (always using All rights reserved. Individual articles may strictures) since the code was much harder to fix once they be available under other licenses. had finished the program. Programmers faced other difficulties, such as running out of the right color thread for their syntax Printed in Chicago, IL, USA.
    [Show full text]