Software Engineering for Internet Applications Eve Andersson

computer science/software engineering

Software Engineering for Internet Applications
Eve Andersson, Philip Greenspun, and Andrew Grumet

After completing this self-contained course on server-based Internet applications software, students who start with only the knowledge of how to write and debug a computer program will have learned how to build Web-based applications on the scale of Amazon.com. Unlike the desktop applications that most students have already learned to build, server-based applications have multiple simultaneous users. This fact, coupled with the unreliability of networks, gives rise to the problems of concurrency and transactions, which students learn to manage by using the relational database system.

After working their way to the end of the book, students will have the skills to take vague and ambitious specifications and turn them into a system design that can be built and launched in a few months. They will be able to test prototypes with end-users and refine the application design. They will understand how to meet the challenge of extreme business requirements with automatic code generation and the use of open-source toolkits where appropriate. Students will understand HTTP, HTML, SQL, mobile browsers, VoiceXML, data modeling, page flow and interaction design, server-side scripting, and usability analysis.

The book, which originated as the text for an MIT course, is suitable for classroom use and will be a useful reference for software professionals developing multi-user Internet applications. It will also help managers evaluate such commercial software as Microsoft Sharepoint or Microsoft Content Management Server.

Eve Andersson is Senior Vice President and Chair of the Bachelor of Science in Computer Science at Neumont University, Salt Lake City.
Philip Greenspun, a software developer, author, teacher, pilot, and photographer, originated the Software Engineering for Internet Applications course at MIT. He is the author of Philip and Alex’s Guide to Web Publishing. Andrew Grumet received his Ph.D. in Electrical Engineering and Computer Science from MIT and builds Web applications as an independent software developer.

“Filled with practical advice for elegant and effective Web sites.”
—Edward Tufte, author of The Visual Display of Quantitative Information

The MIT Press
Massachusetts Institute of Technology
Cambridge, Massachusetts 02142
http://mitpress.mit.edu
ISBN 0-262-51191-6

Cover design by Erin Hasley

The MIT Press
Cambridge, Massachusetts
London, England

© 2006 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please email [email protected] or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142.

This book was set in Times New Roman on 3B2 by Asco Typesetters, Hong Kong, and printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Andersson, Eve Astrid.
Software engineering for Internet applications / Eve Andersson, Philip Greenspun, and Andrew Grumet.
p. cm.
Includes bibliographical references and index.
ISBN 0-262-51191-6 (pbk. : alk. paper)
1. Internet programming. 2. Application software. 3. Software engineering. I. Greenspun, Philip. II. Grumet, Andrew. III. Title.
QA76.625.A55 2006
005.2076—dc22 2005049144

10 9 8 7 6 5 4 3 2 1

Contents

Preface
Acknowledgments
1 Introduction
2 Basics
3 Planning
4 Software Structure
5 User Registration and Management
6 Content Management
7 Software Modularity
8 Discussion
9 Adding Mobile Users to Your Community
10 Voice (VoiceXML)
11 Scaling Gracefully
12 Search
13 Planning Redux
14 Distributed Computing with HTTP, XML, SOAP, and WSDL
15 Metadata (and Automatic Code Generation)
16 User Activity Analysis
17 Writeup

Reference Chapters
A HTML
B Engagement Management by Cesar Brea
C Grading Standards

Glossary
To the Instructor
Sample Contract (between Student Team and Client)
About the Authors
Index

Preface

This is the textbook for the MIT course “Software Engineering for Internet Applications.” The course is intended for juniors and seniors in computer science. We assume that they know how to write a computer program and debug it. We do not assume knowledge of any particular programming languages, standards, or protocols. The most concise statement of the course goal is that “The student finishes knowing how to build amazon.com by him or herself.”

Other people who might find this book useful include the following:

• professional software developers building online communities or other multi-user Internet applications
• managers who are evaluating packaged software aimed at supporting online communities—various chapters contain criteria for judging the features of products such as Microsoft Sharepoint or Microsoft Content Management Server
• university students and faculty looking to add some structure to a “capstone” project at the end of a computer science degree

If you’re confused by the “student knows how to build amazon.com” statement, we can break it down in terms of principles and skills.
The fundamental difference between server-based Internet applications and the desktop applications that students have already learned to build is that server-based applications have multiple simultaneous users. Coupled with the unreliability of networks, this gives rise to the problems of concurrency and transactions. Stateless communications protocols such as HTTP mean that the student must learn how to build a stateful user experience on top of stateless protocols. For persistence between clicks and management of concurrency and transactions, the student needs to learn how to use the relational database management system. Finally, though this goes beyond the simple stand-alone amazon.com-style service, students ought to learn about object-oriented distributed computing where each object is a Web service.

In addition to learning these principles, we’d like the student to learn some skills. This is a laboratory course, and we want students who graduate to be competent software engineers. We’d like our students to be able to take vague and ambitious specifications and turn them into a system design that can be built and launched within a few months, with the features most important to users and easiest to develop built first and the difficult bells and whistles deferred to a second version. We’d like our students to know how to test prototypes with end-users and refine their application design once or twice within even a three-month project. When business requirements are extreme, for example, “build me amazon.com by yourself in three months,” we want our students to understand how to cope with the challenge via automatic code generation and use of open-source toolkits where appropriate.

We can recast the “student knows how to build amazon.com” statement in terms of technologies used.
By the time someone has finished reading and doing the exercises in this book, he or she will understand HTTP, HTML, SQL, mobile browsers on telephones, VoiceXML, data modeling, page flow and interaction design, server-side scripting, and usability analysis.

Eve Andersson, Philip Greenspun, Andrew Grumet
Cambridge, Massachusetts
December 2005

Acknowledgments

The book is an outgrowth of six semesters of teaching experience at MIT and other universities. So our first thanks must go to our students, who taught us what worked and what didn’t work. It is a privilege to teach at MIT, and every instructor should have the opportunity once in a lifetime.

We did not teach alone. Hal Abelson and the late Michael Dertouzos were our partners on the lecture podium. Hal was Mr. Pedagogy and also pushed the distributed computing ideas to the fore. Michael gave us an early push into voice applications.

Lydia Sandon was our first teaching assistant. Ben Adida was our teaching assistant at MIT in the fall of 2003 when this book took its final pre-print shakedown cruise.

In semesters where we did not have a full-time teaching assistant, the students’ most valuable partners were their industry mentors, most of whom were MIT alumni volunteering their time: David Abercrombie, Tracy Adams, Ben Adida, Mike Bonnet, Christian Brechbuhler, James Buszard-Welcher, Bryan Che, Bruce Keilin, Chris McEniry, Henry Minsky, Neil Mayle, Dan Parker, Richard Perng, Lydia Sandon, Mike Shurpik, Steve Strassman, Jessica Wong, and certainly a few more whose names have slipped from our memory.

We’ve gotten valuable feedback from instructors at other universities using these materials, notably Aurelius Prochazka at Caltech and Oscar Bonilla at Universidad Galileo.

1 Introduction

The concern for man and his destiny must always be the chief interest of all technical effort. Never forget it between your diagrams and equations.
—Albert Einstein

A twelve-year-old can build a nice Web application using the tools that came standard with any Linux or Windows machine. Thus it is worth asking ourselves, “What is challenging, interesting, and inspiring about Internet-based applications?”

There are some easy-to-identify technology-related challenges. For example, in many situations it would be more convenient to interact with an information system by talking and listening. You’re in the bathtub reading the New Yorker. You want to know whether there are any early morning appointments on your calendar that would prevent you from staying in the tub and finishing an interesting article.
