Efficient C++: Performance Programming Techniques

By Dov Bulka, David Mayhew

Publisher : Addison Wesley

Pub Date : November 03, 1999

ISBN : 0-201-37950-3

Pages : 336

Far too many programmers and software designers consider efficient C++ to be an oxymoron. They regard C++ as inherently slow and inappropriate for performance-critical applications. Consequently, C++ has had little success penetrating domains such as networking, operating system kernels, device drivers, and others.

Efficient C++ explodes that myth. Written by two authors with first-hand experience wringing the last ounce of performance from commercial C++ applications, this book demonstrates the potential of C++ to produce highly efficient programs. The book reveals practical, everyday object-oriented design principles and C++ coding techniques that can yield large performance improvements. It points out common pitfalls in both design and code that generate hidden operating costs.

This book focuses on combining C++'s power and flexibility with high performance and scalability, resulting in the best of both worlds. Specific topics include temporary objects, memory management, templates, inheritance, virtual functions, inlining, reference counting, STL, and much more.

With this book, you will have a valuable compendium of the best performance techniques at your fingertips.

Table of Contents

Copyright
Dedication
Preface
Introduction
    Roots of Software Inefficiency
    Our Goal
    Software Efficiency: Does It Matter?
    Terminology
    Organization of This Book
Chapter 1. The Tracing War Story
    Our Initial Trace Implementation
    Key Points
Chapter 2. Constructors and Destructors
    Inheritance
    Composition
    Lazy Construction
    Redundant Construction
    Key Points
Chapter 3. Virtual Functions
    Virtual Function Mechanics
    Templates and Inheritance
    Key Points
Chapter 4. The Return Value Optimization
    The Mechanics of Return-by-Value
    The Return Value Optimization
    Computational Constructors
    Key Points
Chapter 5. Temporaries
    Object Definition
    Type Mismatch
    Pass by Value
    Return by Value
    Eliminate Temporaries with op=()
    Key Points
Chapter 6. Single-Threaded Memory Pooling
    Version 0: The Global new() and delete()
    Version 1: Specialized Rational Memory Manager
    Version 2: Fixed-Size Object Memory Pool
    Version 3: Single-Threaded Variable-Size Memory Manager
    Key Points
Chapter 7. Multithreaded Memory Pooling
    Version 4: Implementation
    Version 5: Faster Locking
    Key Points
Chapter 8. Inlining Basics
    What Is Inlining?
    Method Invocation Costs
    Why Inline?
    Inlining Details
    Inlining Virtual Methods
    Performance Gains from Inlining
    Key Points
Chapter 9. Inlining—Performance Considerations
    Cross-Call Optimization
    Why Not Inline?
    Development and Compile-Time Inlining Considerations
    Profile-Based Inlining
    Inlining Rules
    Key Points
Chapter 10. Inlining Tricks
    Conditional Inlining
    Selective Inlining
    Recursive Inlining
    Inlining with Static Local Variables
    Architectural Caveat: Multiple Register Sets
    Key Points
Chapter 11. Standard Template Library
    Asymptotic Complexity
    Insertion
    Deletion
    Traversal
    Find
    Function Objects
    Better than STL?
    Key Points
Chapter 12. Reference Counting
    Implementation Details
    Preexisting Classes
    Concurrent Reference Counting
    Key Points
Chapter 13. Coding Optimizations
    Caching
    Precompute
    Reduce Flexibility
    80-20 Rule: Speed Up the Common Path
    Lazy Evaluation
    Useless Computations
    System Architecture
    Memory Management
    Library and System Calls
    Compiler Optimization
    Key Points
Chapter 14. Design Optimizations
    Design Flexibility
    Caching
    Efficient Data Structures
    Lazy Evaluation
    Useless Computations
    Obsolete Code
    Key Points
Chapter 15. Scalability
    The SMP Architecture
    Amdahl's Law
    Multithreaded and Synchronization Terminology
    Break Up a Task into Multiple Subtasks
    Cache Shared Data
    Share Nothing
    Partial Sharing
    Lock Granularity
    False Sharing
    Thundering Herd
    Reader/Writer Locks
    Key Points
Chapter 16. System Architecture Dependencies
    Memory Hierarchies
    Registers: Kings of Memory
    Disk and Memory Structures
    Cache Effects
    Cache Thrash
    Avoid Branching
    Prefer Simple Calculations to Small Branches
    Threading Effects
    Context Switching
    Kernel Crossing
    Threading Choices
    Key Points
Bibliography

Copyright

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

The authors and publishers have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers discounts on this book when ordered in quantity for special sales. For more information, please contact:

Corporate Government and Special Sales

Addison Wesley Longman, Inc.

One Jacob Way

Reading, Massachusetts 01867

Library of Congress Cataloging-in-Publication Data

Bulka, Dov.

Efficient C++ : performance programming techniques / Dov Bulka, David Mayhew.

p. cm.

Includes bibliographical references (p. ).

ISBN 0-201-37950-3

1. C++ (Computer program language) I. Mayhew, David. II. Title.

QA76.73.C153B85 1999

005.13'3—dc21 99-39175

CIP

Copyright © 2000 by Addison Wesley Longman, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher. Printed in the United States of America. Published simultaneously in Canada.

Text printed on recycled and acid-free paper.

1 2 3 4 5 6 7 8 9 10—CRS—03 02 01 00 99

First printing, October 1999

Dedication

To my mother, Rivka Bulka and to the memory of my father Yacov Bulka, survivor of the Auschwitz concentration camp. They could not take away his kindness, compassion and optimism, which was his ultimate triumph. He passed away during the writing of this book.

D.B.

To Ruth, the love of my life, who made time for me to write this. To the boys, Austin, Alex, and Steve, who missed their dad for a while. To my parents, Mom and Dad, who have always loved and supported me.

D.M.

Preface

If you conducted an informal survey of software developers on the issue o