Web Development with Apache and Perl
Total Page:16
File Type:pdf, Size:1020Kb
Web Development with Apache and Perl Web Development with Apache and Perl THEO PETERSEN MANNING Greenwich (74° w. long.) For online information and ordering of this and other Manning books, go to www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact: Special Sales Department Manning Publications Co. 209 Bruce Park Avenue Fax: (203) 661-9018 Greenwich, CT 06830 email: [email protected] ©2002 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Manning Publications Co. Copyeditor: Joel Rosenthal 209 Bruce Park Avenue Typesetter: Dottie Marsico Greenwich, CT 06830 Cover designer: Leslie Haimes ISBN 1-930110-06-5 Printed in the United States of America 12345678910– VHG – 06 05 04 03 02 To Rachel for saying yes contents preface xii acknowledgments xvi author online xvii about the cover illustration xviii Part 1 Web site basics 1 1 Open Source 3 1.1 What is Open Source? 3 And it’s free! 5 1.2 Why choose Open Source 5 Support 5 ✦ Quality 8 ✦ Security 10 ✦ Innovation 11 1.3 A buyer’s guide to Open Source 14 Stable version and ongoing development 14 ✦ Support 15 Leveraging other investments 15 ✦ Ongoing costs 16 1.4 Open Source licensing 17 The GPL and LGPL 18 ✦ The BSD/MIT license 18 The Artistic License/”Perl terms” 19 ✦ The right license? 19 2The web server20 2.1 What makes a good web server 21 Hardware 21 ✦ Operating system 22 ✦ Re-evaluation and installation 23 2.2 Securing the site 24 2.3 The case for Apache 25 Installation 26 ✦ First tests 27 2.4 Apache configuration 28 The httpd.conf configuration file 28 ✦ Things you need to change 29 2.5 Production server 30 vii 2.6 Development server 33 Allow everything 33 ✦ Allow documents and scripts 34 Allow approved documents 34 2.7 Using apachectl 35 2.8 Serving documents 36 2.9 thttpd 37 3 CGI scripts 39 3.1 Why scripting 40 Scripting language choices 41 3.2 The case for Perl 42 Installing Perl 44 ✦ Testing a sample script 44 Updating Perl modules with CPAN 45 3.3 Inside CGI 46 Hello, Web! 48 ✦ Dynamic content 51 ✦ Interacting 54 HTML forms with CGI.pm 57 ✦ Taking action 63 3.4 Strictness, warnings, and taint checking 66 3.5 CGI modules 67 Part 2 Tools for web applications 69 4 Databases 71 4.1 Files 72 4.2 Address book 72 4.3 Hash files 75 Perl’s tie and hash files 76 ✦ Hash file address book 77 4.4 Relational databases 82 What’s relational? 82 ✦ Choosing a relational database 84 MySQL 84 ✦ PostgreSQL 85 ✦ Which to choose 87 4.5 Installing MySQL 87 Set the root password 87 ✦ Create a database 87 Add users and permissions 87 ✦ Create tables 88 Testing the server 88 ✦ Learning more 89 4.6 DBI, Perl’s database interface 89 Installing the Perl modules 89 ✦ Making a connection 90 CGI scripts with DBI 91 4.7 Data maintenance via CGI 97 WDBI 97 ✦ HTMLView, an alternative to WDBI 97 Installing WDBI 98 ✦ Creating a definition file 99 Using WDBI 100 ✦ Enhancing forms 102 viii CONTENTS 5 Better scripting 104 5.1 Why CGI is slow 105 Stateless protocol 106 ✦ Session-oriented persistent CGI 107 5.2 FastCGI 108 5.3 The case for mod_perl 109 Buyer’s guide 110 5.4 Installing mod_perl 111 Building and rebuilding 112 ✦ Apache run-time configuration 112 5.5 Scripting with mod_perl 114 Apache::Registry 115 ✦ Apache::DBI 117 When CGI attacks 118 5.6 Beyond CGI 119 Beyond CGI.pm? 121 5.7 mod_perl goodies 122 Apache::Status 123 5.8 Maintaining user state 123 Apache::Session 124 ✦ A to-do list 125 Cookie sessions 129 ✦ Session management and user management 132 6 Security and users 134 6.1 Listening in on the Web 134 6.2 Secure Sockets Layer (SSL) 135 Legal issues 136 6.3 OpenSSL and Apache 137 Apache-SSL 137 ✦ mod_ssl 138 ✦ Installing mod_ssl 138 Certificates 139 ✦ Configure and test 141 6.4 User authentication 143 Using HTTP authentication 144 ✦ Doing your own authentication 146 ✦ Do I need SSL for this? 152 6.5 User management 152 6.6 Login sessions 156 7 Combining Perl and HTML 163 7.1 HTML design 163 7.2 Server-side includes 164 SSI with mod_perl and Apache::Include 166 7.3 Scripting in HTML 167 Perl and HTML 168 ✦ Templates 170 CONTENTS ix 7.4 HTML::Mason 174 The case for Mason 174 ✦ Installation 175 ✦ Making pages from components 176 ✦ How Mason interprets components 180 Faking missing components with dhandlers 181 ✦ Autohandlers 182 Session management 186 ✦ Mason resources 189 7.5 The Template Toolkit 190 Template environment 191 ✦ Uploading a document 194 Viewing the upload directories 198 ✦ Template Toolkit resources 202 7.6 XML alternatives 202 Part 3 Example sites 205 8 Virtual communities 207 8.1 Serving the community 208 Community site examples 208 8.2 Implementing features 211 News 211 ✦ Forums 214 ✦ Chats 218 ✦ Search engines 221 8.3 Building a site 224 Installation 225 ✦ httpd.conf 226 ✦ mod_perl.conf 228 The front page 228 ✦ News 231 ✦ Forums 232 ✦ Chat 232 Searching 233 ✦ Improvements 233 8.4 Slash, the Slashdot code 234 8.5 Online resources 235 Search engine submission 235 9 Intranet applications 237 9.1 Documentation and file server 238 Documentation directory tree 238 ✦ File server 239 Generating index pages 240 ✦ mod_perl for indexing 241 Searching 242 9.2 Office applications 242 Email 243 ✦ Calendar 247 ✦ Project management 250 9.3 Interfaces to nonweb applications 251 Other Perl interface tools 252 ✦ Passwords 254 ✦ File management 255 9.4 System administration 256 WebMIN 257 9.5 Build your own portal 259 Maintaining links 260 ✦ UserDir, Redirect, and mod_rewrite for user maintenance 262 ✦ mod_perl for translation and redirection 264 9.6 Joining multiple intranets 269 VPNs 270 ✦ PPP 271 ✦ SSH 271 ✦ Put it all together, it spells... 271 x CONTENTS 10 The web storefront 273 10.1 E-commerce requirements 274 Security and privacy 274 ✦ Stress testing 275 10.2 Components of an e-commerce site 276 Catalog 277 ✦ Account data 282 ✦ Shopping cart 286 Taking the order 291 ✦ Tracking shipments 296 10.3 Feedback 299 Product reviews 299 ✦ Customer feedback and other services 302 10.4 Open Source e-commerce tools 303 Interchange 304 ✦ AllCommerce 308 10.5 Credit card processing 311 CCVS 313 Part 4 Site management 315 11 Content management 317 11.1 Development life cycle 318 Development, staging, production 319 A staging area on your production server 319 11.2 Tools for content management 326 FrontPage 326 ✦ rsync 327 ✦ Mason-CM 333 ✦ WebDAV 343 11.3 Managing a production site 345 Configuration 345 ✦ Development to staging 349 Staging to production 350 11.4 Backup and recovery 352 Backup 353 ✦ Recovery 355 ✦ Test and verify! 356 12 Performance Management 358 12.1 Victims of success 359 Monitoring system loads 361 ✦ Stress-testing your server 364 12.2 Tuning your server 368 Apache configuration 369 ✦ mod_perl issues 373 ✦ Socket tricks 380 12.3 Web farming 382 Reverse proxies 383 ✦ Squid accelerator 388 ✦ Load balancing 389 references 395 index 397 CONTENTS xi preface A quick look at your local bookstore’s Internet section will tell you that there are quite a few commercial packages out there for building web sites. What those books often fail to mention is that many of the world’s most popular web sites were built using freely available tools, and run on free operating systems (OS). They also tend to be served up by Apache, the world’s leading web server, which is also free. I don’t think this omission is due to some vast commercial software conspiracy; there are plenty of books about Linux, one of the free OSs in question. In fact, the success of Linux has drawn much-needed attention to the Open Source software movement, and in turn helped to make this book possible. If anything, the problem is lack of information about free tools, and mis- conceptions about Open Source solutions. My goal in writing this book is to make you aware of the amazing richness and quality of Open Source tools for building web sites. I’m not selling anything (other than the book) and I won’t profit from your buying decisions. While I will encourage you to consider the advantages of a free OS, chances are good you can use these tools on a commercial OS you already have. I should also point out that this is an idea book, not a comprehensive reference about any par- ticular programming tool or operating system. As part of the task of making you aware of what products are available and what you can do with them, I’ll encourage you to look at online resources and other books for more detailed information. Who should read this book? You can take any of several approaches to the material here, depending on what you want or need.