Michael T. Nygard — «Release It! Design and Deploy Production-Ready Software
Total Page:16
File Type:pdf, Size:1020Kb
What readers are saying about Release It! Agile development emphasizes delivering production-ready code every iteration. This book finally lays out exactly what this really means for critical systems today. You have a winner here. Tom Poppendieck Poppendieck.LLC It’s brilliant. Absolutely awesome. This book would’ve saved [Really Big Company] hundreds of thousands, if not millions, of dollars in a recent release. Jared Richardson Agile Artisans, Inc. Beware! This excellent package of experience, insights, and patterns has the potential to highlight all the mistakes you didn’t know you have already made. Rejoice! Michael gives you recipes of how you redeem yourself right now. An invaluable addition to your Pragmatic bookshelf. Arun Batchu Enterprise Architect, netrii LLC Release It! Design and Deploy Production-Ready Software Michael T. Nygard The Pragmatic Bookshelf Raleigh, North Carolina Dallas, Texas Many of the designations used by manufacturers and sellers to distinguish their prod- ucts are claimed as trademarks. Where those designations appear in this book, and The Pragmatic Programmers, LLC was aware of a trademark claim, the designations have been printed in initial capital letters or in all capitals. The Pragmatic Starter Kit, The Pragmatic Programmer, Pragmatic Programming, Pragmatic Bookshelf and the linking g device are trademarks of The Pragmatic Programmers, LLC. Every precaution was taken in the preparation of this book. However, the publisher assumes no responsibility for errors or omissions, or for damages that may result from the use of information (including program listings) contained herein. Our Pragmatic courses, workshops, and other products can help you and your team create better software and have more fun. For more information, as well as the latest Pragmatic titles, please visit us at http://www.pragmaticprogrammer.com Copyright © 2007 Michael T. Nygard. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmit- ted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher. Printed in the United States of America. ISBN-10: 0-9787392-1-3 ISBN-13: 978-0-9787392-1-8 Printed on acid-free paper with 85% recycled, 30% post-consumer content. First printing, April 2007 Version: 2007-3-28 Contents Preface 10 Who Should Read This Book? ................... 11 How the Book Is Organized .................... 12 About the Case Studies ...................... 13 Acknowledgments .......................... 13 Introduction 14 1.1 Aiming for the Right Target ................ 15 1.2 Use the Force ........................ 15 1.3 Quality of Life ........................ 16 1.4 The Scope of the Challenge ................ 16 1.5 A Million Dollars Here, a Million Dollars There ..... 17 1.6 Pragmatic Architecture ................... 18 Part I—Stability 20 The Exception That Grounded an Airline 21 2.1 The Outage .......................... 22 2.2 Consequences ........................ 25 2.3 Post-mortem ......................... 27 2.4 The Smoking Gun ...................... 31 2.5 An Ounce of Prevention? .................. 34 Introducing Stability 35 3.1 Defining Stability ...................... 36 3.2 Failure Modes ........................ 37 3.3 Cracks Propagate ...................... 39 3.4 Chain of Failure ....................... 41 3.5 Patterns and Antipatterns ................. 42 CONTENTS 6 Stability Antipatterns 44 4.1 Integration Points ...................... 46 4.2 Chain Reactions ....................... 61 4.3 Cascading Failures ..................... 65 4.4 Users ............................. 68 4.5 Blocked Threads ...................... 81 4.6 Attacks of Self-Denial .................... 88 4.7 Scaling Effects ........................ 91 4.8 Unbalanced Capacities ................... 96 4.9 Slow Responses ....................... 100 4.10 SLA Inversion ........................ 102 4.11 Unbounded Result Sets .................. 106 Stability Patterns 110 5.1 Use Timeouts ........................ 111 5.2 Circuit Breaker ....................... 115 5.3 Bulkheads .......................... 119 5.4 Steady State ......................... 124 5.5 Fail Fast ........................... 131 5.6 Handshaking ........................ 134 5.7 Test Harness ......................... 136 5.8 Decoupling Middleware .................. 141 Stability Summary 144 Part II—Capacity 146 Trampled by Your Own Customers 147 7.1 Countdown and Launch .................. 147 7.2 Aiming for QA ........................ 148 7.3 Load Testing ......................... 152 7.4 Murder by the Masses ................... 155 7.5 The Testing Gap ....................... 157 7.6 Aftermath .......................... 158 Introducing Capacity 161 8.1 Defining Capacity ...................... 161 8.2 Constraints ......................... 162 8.3 Interrelations ........................ 165 CONTENTS 7 8.4 Scalability .......................... 165 8.5 Myths About Capacity ................... 166 8.6 Summary ........................... 174 Capacity Antipatterns 175 9.1 Resource Pool Contention ................. 176 9.2 Excessive JSP Fragments ................. 180 9.3 AJAX Overkill ........................ 182 9.4 Overstaying Sessions .................... 185 9.5 Wasted Space in HTML ................... 187 9.6 The Reload Button ..................... 191 9.7 Handcrafted SQL ...................... 193 9.8 Database Eutrophication ................. 196 9.9 Integration Point Latency ................. 199 9.10 Cookie Monsters ...................... 201 9.11 Summary ........................... 203 Capacity Patterns 204 10.1 Pool Connections ...................... 206 10.2 Use Caching Carefully ................... 208 10.3 Precompute Content .................... 210 10.4 TunetheGarbageCollector ................ 214 10.5 Summary ........................... 217 Part III—General Design Issues 218 Networking 219 11.1 Multihomed Servers .................... 219 11.2 Routing ............................ 222 11.3 Virtual IP Addresses .................... 223 Security 226 12.1 The Principle of Least Privilege .............. 226 12.2 Configured Passwords ................... 227 Availability 229 13.1 Gathering Availability Requirements ........... 229 13.2 Documenting Availability Requirements ......... 230 13.3 Load Balancing ....................... 232 13.4 Clustering .......................... 238 CONTENTS 8 Administration 240 14.1 “Does QA Match Production?” ............... 241 14.2 Configuration Files ..................... 243 14.3 Start-up and Shutdown .................. 247 14.4 Administrative Interfaces ................. 248 Design Summary 249 Part IV—Operations 251 Phenomenal Cosmic Powers, Itty-Bitty Living Space 252 16.1 Peak Season ......................... 252 16.2 Baby’s First Christmas ................... 253 16.3 Taking the Pulse ...................... 254 16.4 Thanksgiving Day ...................... 256 16.5 Black Friday ......................... 256 16.6 Vital Signs .......................... 257 16.7 Diagnostic Tests ....................... 259 16.8 Call in a Specialist ..................... 260 16.9 Compare Treatment Options ............... 262 16.10 Does the Condition Respond to Treatment? ....... 262 16.11 Winding Down ........................ 263 Transparency 265 17.1 Perspectives ......................... 267 17.2 Designing for Transparency ................ 275 17.3 Enabling Technologies ................... 276 17.4 Logging ............................ 276 17.5 Monitoring Systems ..................... 283 17.6 Standards, De Jure and De Facto ............ 289 17.7 Operations Database .................... 299 17.8 Supporting Processes .................... 305 17.9 Summary ........................... 309 Adaptation 310 18.1 Adaptation Over Time ................... 310 18.2 Adaptable Software Design ................ 312 18.3 Adaptable Enterprise Architecture ............ 319 18.4 Releases Shouldn’t Hurt .................. 327 18.5 Summary ........................... 334 CONTENTS 9 Bibliography 336 Index 339 Preface You’ve worked hard on the project for more than year. Finally, it looks like all the features are actually complete, and most even have unit tests. You can breathe a sigh of relief. You’re done. Or are you? Does “feature complete” mean “production ready”? Is your system really ready to be deployed? Can it be run by operations staff and face the hordes of real-world users without you? Are you starting to get that sinking feeling that you’ll be faced with late-night emergency phone calls or pager beeps? It turns out there’s a lot more to development than just getting all the features in. Too often, project teams aim to pass QA’s tests, instead of aiming for life in Production (with a capital P). That is, the bulk of your work probably focuses on passing testing. But testing—even agile, pragmatic, auto- mated testing—is not enough to prove that software is ready for the real world. The stresses and the strains of the real world, with crazy real users, globe-spanning traffic, and virus-writing mobs from coun- tries you’ve never even heard of, go well beyond what we could ever hope to test for. To make sure your software is ready for the harsh realities of the real world, you need to be prepared. I’m here to help show you where the problemslieandwhatyouneedtogetaroundthem.Butbeforewe begin, there are some popular misconceptions I’ll discuss. First, you need to accept that fact that despite your best laid plans, bad things will still