
COMPUTER ORGANIZATION AND ARCHITECTURE: DESIGNING FOR PERFORMANCE, NINTH EDITION

William Stallings

Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Editorial Director: Marcia Horton
Executive Editor: Tracy Dunkelberger
Associate Editor: Carole Snyder
Director of Marketing: Patrice Jones
Marketing Manager: Yez Alayan
Marketing Coordinator: Kathryn Ferranti
Marketing Assistant: Emma Snider
Director of Production: Vince O’Brien
Managing Editor: Jeff Holcomb
Production Project Manager: Kayla Smith-Tarbox
Production Editor: Pat Brown
Buyer: Pat Brown
Creative Director: Jayne Conte
Designer: Bruce Kenselaar
Manager, Visual Research: Karen Sanatar
Manager, Rights and Permissions: Mike Joyce
Text Permission Coordinator: Jen Roach
Cover Art: Charles Bowman/Robert Harding
Lead Media Manager: Daniel Sandin
Full-Service Project Management: Shiny Rajesh/Integra Software Services Pvt. Ltd.
Composition: Integra Software Services Pvt. Ltd.
Printer/Binder: Edward Brothers
Cover Printer: Lehigh-Phoenix Color/Hagerstown
Text Font: Times Ten-Roman

Credits: Figure 2.14: reprinted with permission from The Language Company, Inc. Figure 17.10: Buyya, Rajkumar, High-Performance Cluster Computing: Architectures and Systems, Vol. I, 1st edition, ©1999. Reprinted and electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey. Figure 17.11: Reprinted with permission from Alliance.

Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear on the appropriate page within text.

Copyright © 2013, 2010, 2006 by Pearson Education, Inc., publishing as Prentice Hall. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to 201-236-3290.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data available upon request

10 9 8 7 6 5 4 3 2 1

ISBN 10: 0-13-293633-X
ISBN 13: 978-0-13-293633-0

To Tricia (ATS), my loving wife, the kindest and gentlest person

CONTENTS

Online Resources xi
Preface xiii
About the Author xxi
Chapter 0 Reader’s and Instructor’s Guide 1
0.1 Outline of the Book 2
0.2 A Roadmap for Readers and Instructors 2
0.3 Why Study Computer Organization and Architecture? 3
0.4 Internet and Web Resources 5

PART ONE OVERVIEW 6
Chapter 1 Introduction 6
1.1 Organization and Architecture 7
1.2 Structure and Function 8
1.3 Key Terms and Review Questions 14
Chapter 2 Computer Evolution and Performance 15
2.1 A Brief History of Computers 16
2.2 Designing for Performance 37
2.3 Multicore, MICs, and GPGPUs 43
2.4 The Evolution of the Intel x86 Architecture 44
2.5 Embedded Systems and the ARM 45
2.6 Performance Assessment 49
2.7 Recommended Reading 59
2.8 Key Terms, Review Questions, and Problems 60

PART TWO THE COMPUTER SYSTEM 65
Chapter 3 A Top-Level View of Computer Function and Interconnection 65
3.1 Computer Components 66
3.2 Computer Function 68
3.3 Interconnection Structures 84
3.4 Bus Interconnection 85
3.5 Point-to-Point Interconnect 93
3.6 PCI Express 98
3.7 Recommended Reading 108
3.8 Key Terms, Review Questions, and Problems 108
Chapter 4 Cache Memory 112
4.1 Computer Memory System Overview 113
4.2 Cache Memory Principles 120
4.3 Elements of Cache Design 123

4.4 Pentium 4 Cache Organization 141
4.5 ARM Cache Organization 144
4.6 Recommended Reading 146
4.7 Key Terms, Review Questions, and Problems 147
Appendix 4A Performance Characteristics of Two-Level Memories 152
Chapter 5 Internal Memory 159
5.1 Semiconductor Main Memory 160
5.2 Error Correction 170
5.3 Advanced DRAM Organization 174
5.4 Recommended Reading 180
5.5 Key Terms, Review Questions, and Problems 181
Chapter 6 External Memory 185
6.1 Magnetic Disk 186
6.2 RAID 195
6.3 Solid State Drives 205
6.4 Optical Memory 210
6.5 Magnetic Tape 215
6.6 Recommended Reading 217
6.7 Key Terms, Review Questions, and Problems 218
Chapter 7 Input/Output 221
7.1 External Devices 223
7.2 I/O Modules 226
7.3 Programmed I/O 228
7.4 Interrupt-Driven I/O 232
7.5 Direct Memory Access 240
7.6 I/O Channels and Processors 246
7.7 The External Interface: Thunderbolt and InfiniBand 248
7.8 IBM zEnterprise 196 I/O Structure 256
7.9 Recommended Reading 260
7.10 Key Terms, Review Questions, and Problems 260
Chapter 8 Operating System Support 265
8.1 Operating System Overview 266
8.2 Scheduling 277
8.3 Memory Management 283
8.4 Pentium Memory Management 294
8.5 ARM Memory Management 299
8.6 Recommended Reading 304
8.7 Key Terms, Review Questions, and Problems 304

PART THREE ARITHMETIC AND LOGIC 309
Chapter 9 Number Systems 309
9.1 The Decimal System 310
9.2 Positional Number Systems 311
9.3 The Binary System 312
9.4 Converting Between Binary and Decimal 312

9.5 Hexadecimal Notation 315
9.6 Recommended Reading 317
9.7 Key Terms and Problems 317
Chapter 10 Computer Arithmetic 319
10.1 The Arithmetic and Logic Unit 320
10.2 Integer Representation 321
10.3 Integer Arithmetic 326
10.4 Floating-Point Representation 341
10.5 Floating-Point Arithmetic 349
10.6 Recommended Reading 358
10.7 Key Terms, Review Questions, and Problems 359
Chapter 11 Digital Logic 364
11.1 Boolean Algebra 365
11.2 Gates 368
11.3 Combinational Circuits 370
11.4 Sequential Circuits 388
11.5 Programmable Logic Devices 397
11.6 Recommended Reading 401
11.7 Key Terms and Problems 401

PART FOUR THE CENTRAL PROCESSING UNIT 405
Chapter 12 Instruction Sets: Characteristics and Functions 405
12.1 Machine Instruction Characteristics 406
12.2 Types of Operands 413
12.3 Intel x86 and ARM Data Types 415
12.4 Types of Operations 418
12.5 Intel x86 and ARM Operation Types 431
12.6 Recommended Reading 441
12.7 Key Terms, Review Questions, and Problems 441
Appendix 12A Little-, Big-, and Bi-Endian 447
Chapter 13 Instruction Sets: Addressing Modes and Formats 451
13.1 Addressing Modes 452
13.2 x86 and ARM Addressing Modes 459
13.3 Instruction Formats 464
13.4 x86 and ARM Instruction Formats 473
13.5 Assembly Language 477
13.6 Recommended Reading 479
13.7 Key Terms, Review Questions, and Problems 479
Chapter 14 Processor Structure and Function 483
14.1 Processor Organization 484
14.2 Register Organization 486
14.3 Instruction Cycle 491
14.4 Instruction Pipelining 495
14.5 The x86 Processor Family 512

14.6 The ARM Processor 520
14.7 Recommended Reading 526
14.8 Key Terms, Review Questions, and Problems 527
Chapter 15 Reduced Instruction Set Computers 531
15.1 Instruction Execution Characteristics 533
15.2 The Use of a Large Register File 538
15.3 Compiler-Based Register Optimization 543
15.4 Reduced Instruction Set Architecture 545
15.5 RISC Pipelining 551
15.6 MIPS R4000 556
15.7 SPARC 562
15.8 RISC Versus CISC Controversy 568
15.9 Recommended Reading 569
15.10 Key Terms, Review Questions, and Problems 569
Chapter 16 Instruction-Level Parallelism and Superscalar Processors 573
16.1 Overview 574
16.2 Design Issues 579
16.3 Pentium 4 589
16.4 ARM Cortex-A8 595
16.5 Recommended Reading 603
16.6 Key Terms, Review Questions, and Problems 605

PART FIVE PARALLEL ORGANIZATION 611
Chapter 17 Parallel Processing 611
17.1 Multiple Processor Organizations 613
17.2 Symmetric Multiprocessors 615
17.3 Cache Coherence and the MESI Protocol 619
17.4 Multithreading and Chip Multiprocessors 626
17.5 Clusters 633
17.6 Nonuniform Memory Access 640
17.7 Vector Computation 644
17.8 Recommended Reading 656
17.9 Key Terms, Review Questions, and Problems 657
Chapter 18 Multicore Computers 664
18.1 Hardware Performance Issues 665
18.2 Software Performance Issues 669
18.3 Multicore Organization 674
18.4 Intel x86 Multicore Organization 676
18.5 ARM11 MPCore 679
18.6 IBM zEnterprise 196 Mainframe 684
18.7 Recommended Reading 687
18.8 Key Terms, Review Questions, and Problems 687

Appendix A Projects for Teaching Computer Organization and Architecture 691
A.1 Interactive Simulations 692
A.2 Research Projects 694
A.3 Simulation Projects 694
A.4 Assembly Language Projects 695
A.5 Reading/Report Assignments 696
A.6 Writing Assignments 696
A.7 Test Bank 696

Appendix B Assembly Language and Related Topics 697
B.1 Assembly Language 698
B.2 Assemblers 706
B.3 Loading and Linking 710
B.4 Recommended Reading 718
B.5 Key Terms, Review Questions, and Problems 719

ONLINE CHAPTERS1

PART SIX THE CONTROL UNIT 19-1
Chapter 19 Control Unit Operation 19-1
19.1 Micro-operations 19-3
19.2 Control of the Processor 19-13
19.3 Hardwired Implementation 19-30
19.4 Recommended Reading 19-35
19.5 Key Terms, Review Questions, and Problems 19-35
Chapter 20 Microprogrammed Control 20-1
20.1 Basic Concepts 20-3
20.2 Microinstruction Sequencing 20-16
20.3 Microinstruction Execution 20-26
20.4 TI 8800 20-45
20.5 Recommended Reading 20-59
20.6 Key Terms, Review Questions, and Problems 20-60

ONLINE APPENDICES

Appendix C Hash Tables

Appendix D Victim Cache
D.1 Victim Cache
D.2 Selective Victim Cache

1Online chapters, appendices, and other documents are Premium Content, available via the access card at the front of this book.

Appendix E Interleaved Memory

Appendix F The International Reference Alphabet

Appendix G Page Replacement Algorithms
G.1 Optimal
G.2 Least Recently Used
G.3 First-In-First-Out
G.4 Other Page Replacement Algorithms

Appendix H Recursive Procedures
H.1 Recursion
H.2 Activation Tree Representation
H.3 Stack Processing
H.4 Recursion and Iteration

Appendix I Additional Instruction Pipeline Topics
I.1 Pipeline Reservation Tables
I.2 Reorder Buffers
I.3 Tomasulo’s Algorithm
I.4

Appendix J Linear Tape Open
J.1 LTO Generations
J.2 LTO Format
J.3 LTO Operation

Appendix K DDR SDRAM

Appendix L Protocols and Protocol Architectures
L.1 Introduction
L.2 The TCP/IP Protocol Architecture
L.3 The Role of an Internet Protocol
L.4 IPv4
L.5 IPv6
L.6 The OSI Protocol Architecture

Appendix M Scrambling

Appendix N Timing

Appendix O Stacks
O.1 Stack Structure
O.2 Stack Implementation
O.3 Expression Evaluation

Glossary 723
References 733
Index 745

ONLINE RESOURCES

Site: Companion Website
Location: WilliamStallings.com/ComputerOrganization
Description: Student Resources link: Useful links and documents for students. Instructor Resources link: Useful links and documents for instructors.

Site: Premium Content
Location: Click on the Premium Content link at the Companion Website or at pearsonhighered.com/stallings and enter the student access code found on the card in the front of the book.
Description: Online chapters, appendices, and other documents that supplement the book.

Site: Instructor Resource Center (IRC)
Location: Click on the Pearson Resources for Instructors link at the Companion Website or on the Instructor Resource link at pearsonhighered.com/stallings.
Description: Solutions manual, projects manual, slides, and other useful documents.

Site: Computer Science Student Resource Site
Location: ComputerScienceStudent.com
Description: Useful links and documents for computer science students.

PREFACE

WHAT’S NEW IN THE NINTH EDITION

In the four years since the eighth edition of this book was published, the field has seen continued innovation and improvements. In this new edition, I try to capture these changes while maintaining a broad and comprehensive coverage of the entire field. To begin this process of revision, the eighth edition of this book was extensively reviewed by a number of professors who teach the subject and by professionals working in the field. The result is that, in many places, the narrative has been clarified and tightened, and illustrations have been improved.
Beyond these refinements to improve pedagogy and user-friendliness, there have been substantive changes throughout the book. Roughly the same chapter organization has been retained, but much of the material has been revised and new material has been added. The most noteworthy changes are as follows:
• Point-to-point interconnect: The traditional bus architecture has increasingly been replaced with high-speed point-to-point interconnect schemes. A new section explores this technology, using Intel’s QuickPath Interconnect (QPI) as an example.
• PCI Express: PCI Express (PCIe) has become a standard interconnect architecture, replacing PCI and other bus-based architectures. A new section covers PCIe.
• Solid state drives and flash memory: Solid state drives are increasingly displacing hard disk drives over a range of computers. A new section covers SSDs and the underlying flash memory technology.
• IEEE 754 Floating-Point Standard: The coverage of IEEE 754 has been updated to reflect the 2008 standard.
• Contemporary mainframe organization: Chapters 7 and 18 include sections on the zEnterprise 196, IBM’s latest offering (at the time of this writing), introduced in 2010.
• I/O standards: The book has been updated to reflect the latest developments, including Thunderbolt.
• Multicore architecture: The material on multicore architecture has been expanded significantly.
• Student study aids: Each chapter now begins with a list of learning objectives.


• Sample syllabus: The text contains more material than can be conveniently covered in one semester. Accordingly, instructors are provided with several sample syllabi that guide the use of the text within limited time (e.g., 16 weeks or 12 weeks). These samples are based on real-world experience by professors with the eighth edition.
• Test bank: A set of review questions, including yes/no, multiple choice, and fill in the blank, is provided for each chapter.
With each new edition it is a struggle to maintain a reasonable page count while adding new material. In part this objective is realized by eliminating obsolete material and tightening the narrative. For this edition, chapters and appendices that are of less general interest have been moved online, as individual PDF files. This has allowed an expansion of material without the corresponding increase in size and price.

OBJECTIVES

This book is about the structure and function of computers. Its purpose is to present, as clearly and completely as possible, the nature and characteristics of modern-day computer systems.
This task is challenging for several reasons. First, there is a tremendous variety of products that can rightly claim the name of computer, from single-chip microcomputers costing a few dollars to supercomputers costing tens of millions of dollars. Variety is exhibited not only in cost but also in size, performance, and application. Second, the rapid pace of change that has always characterized computer technology continues with no letup. These changes cover all aspects of computer technology, from the underlying technology used to construct computer components to the increasing use of parallel organization concepts in combining those components.
In spite of the variety and pace of change in the computer field, certain fundamental concepts apply consistently throughout. The application of these concepts depends on the current state of the technology and the price/performance objectives of the designer. The intent of this book is to provide a thorough discussion of the fundamentals of computer organization and architecture and to relate these to contemporary design issues.
The subtitle suggests the theme and the approach taken in this book. It has always been important to design computer systems to achieve high performance, but never has this requirement been stronger or more difficult to satisfy than today. All of the performance characteristics of computer systems, including processor speed, memory speed, memory capacity, and interconnection data rates, are increasing rapidly. Moreover, they are increasing at different rates. This makes it difficult to design a balanced system that maximizes the performance and utilization of all elements. Thus, computer design increasingly becomes a game of changing the structure or function in one area to compensate for a performance mismatch in another area. We will see this game played out in numerous design decisions throughout the book.
A computer system, like any system, consists of an interrelated set of components. The system is best characterized in terms of structure—the way in which components are interconnected, and function—the operation of the individual components. Furthermore, a computer’s organization is hierarchical. Each major component can be further described by decomposing it into its major subcomponents and describing their structure and function.

For clarity and ease of understanding, this hierarchical organization is described in this book from the top down:
• Computer system: Major components are processor, memory, I/O.
• Processor: Major components are control unit, registers, ALU, and instruction execution unit.
• Control unit: Provides control signals for the operation and coordination of all processor components. Traditionally, a microprogramming implementation has been used, in which major components are control memory, microinstruction sequencing logic, and registers. More recently, microprogramming has been less prominent but remains an important implementation technique.
The objective is to present the material in a fashion that keeps new material in a clear context. This should minimize the chance that the reader will get lost and should provide better motivation than a bottom-up approach.
Throughout the discussion, aspects of the system are viewed from the points of view of both architecture (those attributes of a system visible to a machine language programmer) and organization (the operational units and their interconnections that realize the architecture).

EXAMPLE SYSTEMS

This text is intended to acquaint the reader with the design principles and implementation issues of contemporary computer organization and architecture. Accordingly, a purely conceptual or theoretical treatment would be inadequate. To illustrate the concepts and to tie them to real-world design choices that must be made, two processor families have been chosen as running examples:
• Intel x86 architecture: The x86 architecture is the most widely used architecture for nonembedded computer systems. The x86 is essentially a complex instruction set computer (CISC) with some RISC features. Recent members of the x86 family make use of superscalar and multicore design principles. The evolution of features in the x86 architecture provides a unique case study of the evolution of most of the design principles in computer architecture.
• ARM: The ARM architecture is arguably the most widely used embedded processor, used in phones, iPods, remote sensor equipment, and many other devices. The ARM is essentially a reduced instruction set computer (RISC). Recent members of the ARM family make use of superscalar and multicore design principles.
Many, but by no means all, of the examples in this book are drawn from these two computer families. Numerous other systems, both contemporary and historical, provide examples of important computer architecture design features.

PLAN OF THE TEXT

The book is organized into six parts (see Chapter 0 for an overview):
• Overview
• The computer system
• Arithmetic and logic
• The central processing unit
• Parallel organization, including multicore
• The control unit
The book includes a number of pedagogic features, including the use of interactive simulations and numerous figures and tables to clarify the discussion. Each chapter includes a list of key words, review questions, homework problems, and suggestions for further reading. The book also includes an extensive glossary, a list of frequently used acronyms, and a bibliography.

INTENDED AUDIENCE

The book is intended for both an academic and a professional audience. As a textbook, it is intended as a one- or two-semester undergraduate course for computer science, computer engineering, and electrical engineering majors. It covers all the core topics in the body of knowledge category, Architecture and Organization, in the IEEE/ACM Computer Science Curriculum 2008: An Interim Revision to CS 2001. This book also covers the core area CE-CAO Computer Architecture and Organization from the IEEE/ACM Computer Engineering Curriculum Guidelines 2004. For the professional interested in this field, the book serves as a basic reference volume and is suitable for self-study.

INSTRUCTOR SUPPORT MATERIALS

Support materials for instructors are available at the Instructor Resource Center (IRC) for this textbook, which can be reached through the Publisher’s Website www.pearsonhighered.com/stallings or by clicking on the link labeled “Pearson Resources for Instructors” at this book’s Companion Website at WilliamStallings.com/ComputerOrganization. To gain access to the IRC, please contact your local Pearson sales representative via pearsonhighered.com/educator/replocator/requestSalesRep.page or call Pearson Faculty Services at 1-800-526-0485. The IRC provides the following materials:
• Projects manual: Project resources including documents and portable software, plus suggested project assignments for all of the project categories listed subsequently in this Preface.
• Solutions manual: Solutions to end-of-chapter Review Questions and Problems.
• PowerPoint slides: A set of slides covering all chapters, suitable for use in lecturing.
• PDF files: Copies of all figures and tables from the book.
• Test bank: A chapter-by-chapter set of questions.
• Sample syllabuses: The text contains more material than can be conveniently covered in one semester. Accordingly, instructors are provided with several sample syllabuses that guide the use of the text within limited time. These samples are based on real-world experience by professors with the eighth edition.

The Companion Website, at WilliamStallings.com/ComputerOrganization (click on Instructor Resources link) includes the following: • Links to Websites for other courses being taught using this book. • Sign-up for an Internet mailing list for instructors using this book to exchange information, suggestions, and questions with each other and with the author.

STUDENT RESOURCES

For this new edition, a tremendous amount of original supporting material for students has been made available online, at two Web locations. The Companion Website, at WilliamStallings.com/ComputerOrganization (click on Student Resources link), includes a list of relevant links organized by chapter and an errata sheet for the book.
Purchasing this textbook new grants the reader six months of access to the Premium Content Site, which includes the following materials:
• Online chapters: To limit the size and cost of the book, two chapters of the book are provided in PDF format. The chapters are listed in this book’s table of contents.
• Online appendices: There are numerous interesting topics that support material found in the text but whose inclusion is not warranted in the printed text. A total of 13 appendices cover these topics for the interested student. The appendices are listed in this book’s table of contents.
• Homework problems and solutions: To aid the student in understanding the material, a separate set of homework problems with solutions is available. Students can enhance their understanding of the material by working out the solutions to these problems and then checking their answers.
• Key papers: Several dozen papers from the professional literature, many hard to find, are provided for further reading.
• Supporting documents: A variety of other useful documents are referenced in the text and provided online.
Finally, I maintain the Computer Science Student Resource Site at WilliamStallings.com/StudentSupport.html.

PROJECTS AND OTHER STUDENT EXERCISES

For many instructors, an important component of a computer organization and architecture course is a project or set of projects by which the student gets hands-on experience to reinforce concepts from the text. This book provides an unparalleled degree of support for including a projects component in the course. The instructor’s support materials available through Prentice Hall not only include guidance on how to assign and structure the projects but also include a set of user’s manuals for various project types plus specific assignments, all written especially for this book. Instructors can assign work in the following areas:
• Interactive simulation assignments: Described subsequently.

• Research projects: A series of research assignments that instruct the student to research a particular topic on the Internet and write a report.
• Simulation projects: The IRC provides support for the use of the two simulation packages: SimpleScalar can be used to explore computer organization and architecture design issues. SMPCache provides a powerful educational tool for examining cache design issues for symmetric multiprocessors.
• Assembly language projects: A simplified assembly language, CodeBlue, is used and assignments based on the popular Core Wars concept are provided.
• Reading/report assignments: A list of papers in the literature, one or more for each chapter, that can be assigned for the student to read and then write a short report.
• Writing assignments: A list of writing assignments to facilitate learning the material.
• Test bank: Includes T/F, multiple choice, and fill-in-the-blanks questions and answers.
This diverse set of projects and other student exercises enables the instructor to use the book as one component in a rich and varied learning experience and to tailor a course plan to meet the specific needs of the instructor and students. See Appendix A in this book for details.

INTERACTIVE SIMULATIONS

An important feature in this edition is the incorporation of interactive simulations. These simulations provide a powerful tool for understanding the complex design features of a modern computer system. A total of 20 interactive simulations are used to illustrate key functions and algorithms in computer organization and architecture design. At the relevant point in the book, an icon indicates that a relevant interactive simulation is available online for student use. Because the animations enable the user to set initial conditions, they can serve as the basis for student assignments. The instructor’s supplement includes a set of assignments, one for each of the animations. Each assignment includes several specific problems that can be assigned to students. For access to the animations, click on the rotating globe at this book’s Website at http://williamstallings.com/ComputerOrganization.

ACKNOWLEDGMENTS

This new edition has benefited from review by a number of people, who gave generously of their time and expertise. The following professors and instructors reviewed all or a large part of the manuscript: Branson Murrill (Virginia Commonwealth University), Pan Deng (Florida International University), Bob Broeg (Western Oregon University), Curtis Meadow (University of Maine, Orono), Charles Weems (University of Massachusetts), and Mike Jochen (East Stroudsburg University).
Thanks also to the many people who provided detailed technical reviews of one or more chapters: Kauser Johar, Todd Bezenek (Quantum), Moustafa Mohamed (University of Colorado at Boulder), Dharmesh Parikh, Qigang Wang, Rajiv Dasmohapatra (WIPRO Ltd), Anup Holey (University of Minnesota, Twin Cities), Alexandre Keunecke Ignacio de Mendonca, Douglas Tiedt, Kursad Albayraktaroglu (Advanced Micro Devices), Nilanjan Goswami (University of Florida, Gainesville), Adnan Khaleel (, Inc.), Geri Lamble,

Liu Han, Mafijul Islam (Volvo Technology, Sweden), Roger Kahn, Brian Case, Mani Srinivasan, Abhishek Deb, Sushil Menon (University of Pennsylvania), Jigar Savla (Georgia Institute of Technology), Madhu Mutyam, Karl Stevens, Vineet Chadha (Intel Labs), Xingxing Jin (University of Saskatchewan), Jan Hoogerbrugge (NXP Semiconductors), Ninad Laxman Sawant, Aziz Eker (TOBB University of Economics and Technology, Ankara, Turkey), Bhupati Shukla, Niket Choudhary (North Carolina State University), and Oguz Ergin (TOBB University of Economics and Technology, Ankara, Turkey).
Professor Cindy Norris of Appalachian State University, Professor Bin Mu of the University of New Brunswick, and Professor Kenrick Mock of the University of Alaska kindly supplied homework problems.
Aswin Sreedhar of the University of Massachusetts developed the interactive simulation assignments and also wrote the test bank.
Professor Miguel Angel Vega Rodriguez, Professor Dr. Juan Manuel Sánchez Pérez, and Professor Dr. Juan Antonio Gómez Pulido, all of University of Extremadura, Spain, prepared the SMPCache problems in the instructor’s manual and authored the SMPCache User’s Guide.
Todd Bezenek of the University of Wisconsin and James Stine of Lehigh University prepared the SimpleScalar problems in the instructor’s manual, and Todd also authored the SimpleScalar User’s Guide.
Finally, I would like to thank the many people responsible for the publication of the book, all of whom did their usual excellent job. This includes the staff at Pearson Education, particularly my editor Tracy Dunkelberger, her assistant Carole Snyder, and production managers Kayla Smith-Tarbox and Pat Brown. I also thank Shiny Rajesh and the production staff at Integra for another excellent and rapid job. Thanks also to the marketing and sales staffs at Pearson, without whose efforts this book would not be in your hands.

ABOUT THE AUTHOR

Dr. William Stallings has made a unique contribution to understanding the broad sweep of technical developments in computer security, computer networking and computer architecture. He has authored 17 titles, and counting revised editions, a total of 42 books on various aspects of these subjects. His writings have appeared in numerous ACM and IEEE publications, including the Proceedings of the IEEE and ACM Computing Reviews. He has 10 times received the award for the best Computer Science textbook of the year from the Text and Academic Authors Association.
In over 30 years in the field, he has been a technical contributor, technical manager, and an executive with several high-technology firms. He has designed and implemented both TCP/IP-based and OSI-based protocol suites on a variety of computers and operating systems, ranging from microcomputers to mainframes. As a consultant, he has advised government agencies, computer and software vendors, and major users on the design, selection, and use of networking software and products.
He created and maintains the Computer Science Student Resource Site at WilliamStallings.com/StudentSupport.html. This site provides documents and links on a variety of subjects of general interest to computer science students (and professionals). He is a member of the editorial board of Cryptologia, a scholarly journal devoted to all aspects of cryptology.
Dr. Stallings holds a PhD from M.I.T. in Computer Science and a B.S. from Notre Dame in electrical engineering.

CHAPTER 0

READER’S AND INSTRUCTOR’S GUIDE

0.1 Outline of the Book
0.2 A Roadmap for Readers and Instructors
0.3 Why Study Computer Organization and Architecture?
0.4 Internet and Web Resources
    Web Sites for This Book
    Computer Science Student Resource Site
    Other Web Sites


This book, with its accompanying Web sites, covers a lot of material. In this chapter, we give the reader an overview.

0.1 OUTLINE OF THE BOOK

The book is organized into six parts:
Part One Overview: Provides an overview of computer organization and architecture and looks at how computer design has evolved.
Part Two The Computer System: Examines the major components of a computer and their interconnections, both with each other and the outside world. This part also includes a detailed discussion of internal and external memory and of input/output (I/O). Finally, the relationship between a computer’s architecture and the operating system running on that architecture is examined.
Part Three Arithmetic and Logic: This part begins with a chapter that reviews number systems. Chapter 10 is an extended discussion of computer arithmetic. Chapter 11 is a survey of digital logic.
Part Four The Central Processing Unit: Examines the internal architecture and organization of the processor. This part looks at the instruction set architecture. The remainder of the part deals with the structure and function of the processor, including a discussion of reduced instruction set computer (RISC) and superscalar approaches.
Part Five Parallel Organization: Deals with parallel organization, including symmetric multiprocessors, clusters, and multicore architecture.
Part Six The Control Unit: Discusses the internal structure of the processor’s control unit and the use of microprogramming.
A number of online chapters and appendices at this book’s Web site cover additional topics relevant to the book.
This text is intended to acquaint you with the design principles and implementation issues of contemporary computer organization and architecture. Accordingly, a purely conceptual or theoretical treatment would be inadequate. This book uses examples from a number of different machines to clarify and reinforce the concepts being presented. Many, but by no means all, of the examples are drawn from two computer families: the Intel x86 family and the ARM family. These two systems together encompass most of the current computer design trends. The Intel x86 architecture is essentially a complex instruction set computer (CISC) with some RISC features, while the ARM is essentially a RISC. Both systems make use of superscalar design principles, and both support multiple processor and multicore configurations.

0.2 A ROADMAP FOR READERS AND INSTRUCTORS

This book follows a top–down approach to the presentation of the material. As we discuss in more detail in Section 1.2, a computer system can be viewed as a hierarchical structure. At a top level, we are concerned with the major components

of the computers: processor, I/O, memory, and peripheral devices. Part Two examines these components and looks in some detail at each component except the processor. This approach allows us to see the external functional requirements that drive the processor design, setting the stage for Parts Three and Four. Part Three looks at the arithmetic and logic component of the processor in detail. Then Part Four examines the processor in great detail. Because we have the context provided by Part Two, we are able, in Part Four, to see the design decisions that must be made so that the processor supports the overall function of the computer system. Next, in Part Five, we examine systems with multiple processors, including clusters, multiprocessor computers, and multicore computers. Finally, Part Six looks at the control unit, which is at the heart of the processor. Again, the design of the control unit can best be explained in the context of the function it performs within the processor.

0.3 WHY STUDY COMPUTER ORGANIZATION AND ARCHITECTURE?

The IEEE/ACM Computer Science Curriculum 2008, prepared by the Joint Task Force on Computing Curricula of the IEEE (Institute of Electrical and Electronics Engineers) Computer Society and ACM (Association for Computing Machinery), lists computer architecture as one of the core subjects that should be in the curriculum of all students in computer science and computer engineering. The report says the following:

The computer lies at the heart of computing. Without it most of the computing disciplines today would be a branch of theoretical mathematics. A professional in any field of computing should not regard the computer as just a black box that executes programs by magic. All students of computing should acquire some understanding and appreciation of a computer system’s functional components, their characteristics, their performance, and their interactions. Students need to understand computer architecture in order to make best use of the software tools and computer languages they use to create programs. In this introduction the term architecture is taken to include instruction set architecture (the programmer’s abstraction of a computer), organization or microarchitecture (the internal implementation of a computer at the register and functional unit level), and system architecture (the organization of the computer at the cache and bus level). Students should also understand the complex trade-offs between CPU clock speed, cache size, bus organization, number of core processors, and so on. Computer architecture also underpins other areas of the computing curriculum such as operating systems (input/output, memory technology) and high-level languages (pointers, parameter passing).

Another publication of the task force, Computer Engineering 2004 Curriculum Guidelines, emphasized the importance of Computer Architecture and Organization as follows:

Computer architecture is a key component of computer engineering and the practicing computer engineer should have a practical understanding of this topic. It is concerned with all aspects of the design and organization of the central processing unit and the integration of the CPU into the computer system itself. Architecture extends upward into computer software because a processor’s architecture must cooperate with the operating system and system software. It is difficult to design an operating system well without knowledge of the underlying architecture. Moreover, the computer designer must have an understanding of software in order to implement the optimum architecture.
The computer architecture curriculum has to achieve multiple objectives. It must provide an overview of computer architecture and teach students the operation of a typical computing machine. It must cover basic principles, while acknowledging the diversity of existing commercial systems. Ideally, it should reinforce topics that are common to other areas of computer engineering; for example, teaching register indirect addressing reinforces the concept of pointers in C. Finally, students must understand how various peripheral devices interact with, and how they are interfaced to, a CPU.

[CLEM00] gives the following examples as reasons for studying computer architecture:

1. Suppose a graduate enters the industry and is asked to select the most cost-effective computer for use throughout a large organization. An understanding of the implications of spending more for various alternatives, such as a larger cache or a higher processor clock rate, is essential to making the decision.
2. Many processors are not used in PCs or servers but in embedded systems. A designer may program a processor in C that is embedded in some real-time or larger system, such as an intelligent automobile electronics controller. Debugging the system may require the use of a logic analyzer that displays the relationship between interrupt requests from engine sensors and machine-level code.
3. Concepts used in computer architecture find application in other courses. In particular, the way in which the computer provides architectural support for programming languages and operating system facilities reinforces concepts from those areas.

As can be seen by perusing the table of contents of this book, computer organization and architecture encompasses a broad range of design issues and concepts. A good overall understanding of these concepts will be useful both in other areas of study and in future work after graduation.

0.4 INTERNET AND WEB RESOURCES

There are a number of resources available on the Internet and the Web that support this book and help readers keep up with developments in this field.

Web Sites for This Book

Three Web sites provide additional resources for students and instructors. We maintain a Companion Web site for this book at http://williamstallings.com/ComputerOrganization. For students, this Web site includes a list of relevant links, organized by chapter, and an errata list for the book. For instructors, this Web site provides links to course pages by professors teaching from this book.
There is also an access-controlled Premium Content Web site that provides a wealth of supporting material, including additional online chapters, additional online appendices, a set of homework problems with solutions, copies of a number of key papers in this field, and a number of other supporting documents. See the card at the front of this book for access information.
Finally, additional material for instructors is available at the Instructor Resource Center (IRC) for this book. See Preface for details and access information.

Computer Science Student Resource Site

I also maintain the Computer Science Student Resource Site, at ComputerScienceStudent.com. The purpose of this site is to provide documents, information, and links for computer science students and professionals. Links and documents are organized into six categories:
• Math: Includes a basic math refresher, a queuing analysis primer, a number system primer, and links to numerous math sites.
• How-to: Advice and guidance for solving homework problems, writing technical reports, and preparing technical presentations.
• Research resources: Links to important collections of papers, technical reports, and bibliographies.
• Miscellaneous: A variety of other useful documents and links.
• Computer science careers: Useful links and documents for those considering a career in computer science.
• Humor and other diversions: You have to take your mind off your work once in a while.

Other Web Sites

Numerous Web sites provide information related to the topics of this book. The Companion Web site provides links to these sites, organized by chapter.

PART ONE OVERVIEW

CHAPTER 1

INTRODUCTION

1.1 Organization and Architecture
1.2 Structure and Function
    Function
    Structure
1.3 Key Terms and Review Questions


This book is about the structure and function of computers. Its purpose is to present, as clearly and completely as possible, the nature and characteristics of modern-day computers. This task is a challenging one for two reasons.
First, there is a tremendous variety of products, from single-chip microcomputers costing a few dollars to supercomputers costing tens of millions of dollars, that can rightly claim the name computer. Variety is exhibited not only in cost, but also in size, performance, and application. Second, the rapid pace of change that has always characterized computer technology continues with no letup. These changes cover all aspects of computer technology, from the underlying integrated circuit technology used to construct computer components to the increasing use of parallel organization concepts in combining those components.
In spite of the variety and pace of change in the computer field, certain fundamental concepts apply consistently throughout. To be sure, the application of these concepts depends on the current state of technology and the price/performance objectives of the designer. The intent of this book is to provide a thorough discussion of the fundamentals of computer organization and architecture and to relate these to contemporary computer design issues.
This chapter introduces the descriptive approach to be taken.

1.1 ORGANIZATION AND ARCHITECTURE

In describing computers, a distinction is often made between computer architecture and computer organization. Although it is difficult to give precise definitions for these terms, a consensus exists about the general areas covered by each (e.g., see [VRAN80], [SIEW82], and [BELL78a]); an interesting alternative view is presented in [REDD76].
Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical execution of a program. Computer organization refers to the operational units and their interconnections that realize the architectural specifications. Examples of architectural attributes include the instruction set, the number of bits used to represent various data types (e.g., numbers, characters), I/O mechanisms, and techniques for addressing memory. Organizational attributes include those hardware details transparent to the programmer, such as control signals; interfaces between the computer and peripherals; and the memory technology used.
For example, it is an architectural design issue whether a computer will have a multiply instruction. It is an organizational issue whether that instruction will be implemented by a special multiply unit or by a mechanism that makes repeated use of the add unit of the system. The organizational decision may be based on the anticipated frequency of use of the multiply instruction, the relative speed of the two approaches, and the cost and physical size of a special multiply unit. (A short software sketch of this trade-off appears at the end of this section.)
Historically, and still today, the distinction between architecture and organization has been an important one. Many computer manufacturers offer a family of computer models, all with the same architecture but with differences in organization. Consequently, the different models in the family have different price and performance characteristics. Furthermore, a particular architecture may span many years and

encompass a number of different computer models, its organization changing with changing technology. A prominent example of both these phenomena is the IBM System/370 architecture. This architecture was first introduced in 1970 and included a number of models. The customer with modest requirements could buy a cheaper, slower model and, if demand increased, later upgrade to a more expensive, faster model without having to abandon software that had already been developed. Over the years, IBM has introduced many new models with improved technology to replace older models, offering the customer greater speed, lower cost, or both. These newer models retained the same architecture so that the customer’s software investment was protected. Remarkably, the System/370 architecture, with a few enhancements, has survived to this day as the architecture of IBM’s mainframe product line.
In a class of computers called microcomputers, the relationship between architecture and organization is very close. Changes in technology not only influence organization but also result in the introduction of more powerful and more complex architectures. Generally, there is less of a requirement for generation-to-generation compatibility for these smaller machines. Thus, there is more interplay between organizational and architectural design decisions. An intriguing example of this is the reduced instruction set computer (RISC), which we examine in Chapter 15.
This book examines both computer organization and computer architecture. The emphasis is perhaps more on the side of organization. However, because a computer organization must be designed to implement a particular architectural specification, a thorough treatment of organization requires a detailed examination of architecture as well.
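To make the multiply example concrete, the following sketch (not from the book; the function names and the shift-and-add scheme are illustrative assumptions) contrasts two organizations of the same architectural multiply instruction: one backed by a dedicated multiplier, modeled here by C’s * operator, and one that reuses the add unit through repeated addition.

```c
#include <stdint.h>
#include <stdio.h>

/* Organization 1: a special multiply unit (modeled by the * operator). */
static uint32_t multiply_hw(uint32_t a, uint32_t b) {
    return a * b;
}

/* Organization 2: repeated use of the add unit (shift-and-add), one
   addition for each 1 bit of the multiplier. */
static uint32_t multiply_by_add(uint32_t a, uint32_t b) {
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u)          /* low bit of multiplier set? */
            product += a;    /* reuse the adder            */
        a <<= 1;             /* shift multiplicand left    */
        b >>= 1;             /* shift multiplier right     */
    }
    return product;
}

int main(void) {
    /* Both organizations produce the same architectural result. */
    printf("%u %u\n", (unsigned)multiply_hw(1234, 567),
                      (unsigned)multiply_by_add(1234, 567));
    return 0;
}
```

Either routine satisfies the same architectural specification; choosing between them, based on frequency of use, relative speed, and cost, is precisely the kind of organizational decision described above.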

1.2 STRUCTURE AND FUNCTION

A computer is a complex system; contemporary computers contain millions of elementary electronic components. How, then, can one clearly describe them? The key is to recognize the hierarchical nature of most complex systems, including the computer [SIMO96]. A hierarchical system is a set of interrelated subsystems, each of the latter, in turn, hierarchical in structure until we reach some lowest level of elementary subsystem.
The hierarchical nature of complex systems is essential to both their design and their description. The designer need only deal with a particular level of the system at a time. At each level, the system consists of a set of components and their interrelationships. The behavior at each level depends only on a simplified, abstracted characterization of the system at the next lower level. At each level, the designer is concerned with structure and function:
• Structure: The way in which the components are interrelated.
• Function: The operation of each individual component as part of the structure.
In terms of description, we have two choices: starting at the bottom and building up to a complete description, or beginning with a top view and decomposing the system into its subparts. Evidence from a number of fields suggests that the top-down approach is the clearest and most effective [WEIN75].

The approach taken in this book follows from this viewpoint. The computer system will be described from the top down. We begin with the major components of a computer, describing their structure and function, and proceed to successively lower layers of the hierarchy. The remainder of this section provides a very brief overview of this plan of attack.

Function

Both the structure and functioning of a computer are, in essence, simple. Figure 1.1 depicts the basic functions that a computer can perform. In general terms, there are only four:
• Data processing
• Data storage
• Data movement
• Control

[Figure 1.1 A Functional View of the Computer]

The computer, of course, must be able to process data. The data may take a wide variety of forms, and the range of processing requirements is broad. However, we shall see that there are only a few fundamental methods or types of data processing.
It is also essential that a computer store data. Even if the computer is processing data on the fly (i.e., data come in and get processed, and the results go out immediately), the computer must temporarily store at least those pieces of data that are being worked on at any given moment. Thus, there is at least a short-term data storage function. Equally important, the computer performs a long-term data storage function. Files of data are stored on the computer for subsequent retrieval and update.
The computer must be able to move data between itself and the outside world. The computer’s operating environment consists of devices that serve as either sources or destinations of data. When data are received from or delivered to a device that is directly connected to the computer, the process is known as input–output (I/O), and the device is referred to as a peripheral. When data are moved over longer distances, to or from a remote device, the process is known as data communications.
Finally, there must be control of these three functions. Ultimately, this control is exercised by the individual(s) who provides the computer with instructions. Within the computer, a control unit manages the computer’s resources and orchestrates the performance of its functional parts in response to those instructions.
At this general level of discussion, the number of possible operations that can be performed is few. Figure 1.2 depicts the four possible types of operations. The computer can function as a data movement device (Figure 1.2a), simply transferring data from one peripheral or communications line to another. It can also function as a data storage device (Figure 1.2b), with data transferred from the external environment to computer storage (read) and vice versa (write). The final two diagrams show operations involving data processing, on data either in storage (Figure 1.2c) or en route between storage and the external environment (Figure 1.2d).
The preceding discussion may seem absurdly generalized. It is certainly possible, even at a top level of computer structure, to differentiate a variety of functions, but, to quote [SIEW82],

There is remarkably little shaping of computer structure to fit the function to be performed. At the root of this lies the general-purpose nature of computers, in which all the functional specialization occurs at the time of programming and not at the time of design.

Structure

Figure 1.3 is the simplest possible depiction of a computer. The computer interacts in some fashion with its external environment. In general, all of its linkages to the external environment can be classified as peripheral devices or communication lines. We will have something to say about both types of linkages.

[Figure 1.2 Possible Computer Operations]

[Figure 1.3 The Computer]

But of greater concern in this book is the internal structure of the computer itself, which is shown in Figure 1.4. There are four main structural components:
• Central processing unit (CPU): Controls the operation of the computer and performs its data processing functions; often simply referred to as processor.
• Main memory: Stores data.
• I/O: Moves data between the computer and its external environment.
• System interconnection: Some mechanism that provides for communication among CPU, main memory, and I/O. A common example of system interconnection is by means of a system bus, consisting of a number of conducting wires to which all the other components attach.

There may be one or more of each of the aforementioned components. Traditionally, there has been just a single processor. In recent years, there has been increasing use of multiple processors in a single computer. Some design issues relating to multiple processors crop up and are discussed as the text proceeds; Part Five focuses on such computers.
Each of these components will be examined in some detail in Part Two. However, for our purposes, the most interesting and in some ways the most complex component is the CPU. Its major structural components are as follows:

• Control unit: Controls the operation of the CPU and hence the computer.
• Arithmetic and logic unit (ALU): Performs the computer’s data processing functions.
• Registers: Provides storage internal to the CPU.
• CPU interconnection: Some mechanism that provides for communication among the control unit, ALU, and registers.

[Figure 1.4 The Computer: Top-Level Structure]

Each of these components will be examined in some detail in Part Three, where we will see that complexity is added by the use of parallel and pipelined organizational techniques. Finally, there are several approaches to the implementation of the control unit; one common approach is a microprogrammed implementation. In essence, a microprogrammed control unit operates by executing microinstructions that define the functionality of the control unit. With this approach, the structure of the control unit can be depicted as in Figure 1.4. This structure will be examined in Part Four.

1.3 KEY TERMS AND REVIEW QUESTIONS

Key Terms

arithmetic and logic unit (ALU)
central processing unit (CPU)
computer architecture
computer organization
control unit
input–output (I/O)
main memory
processor
registers
system bus

Review Questions
1.1 What, in general terms, is the distinction between computer organization and computer architecture?
1.2 What, in general terms, is the distinction between computer structure and computer function?
1.3 What are the four main functions of a computer?
1.4 List and briefly define the main structural components of a computer.
1.5 List and briefly define the main structural components of a processor.

CHAPTER 2

COMPUTER EVOLUTION AND PERFORMANCE

2.1 A Brief History of Computers
    The First Generation: Vacuum Tubes
    The Second Generation: Transistors
    The Third Generation: Integrated Circuits
    Later Generations
2.2 Designing for Performance
    Microprocessor Speed
    Performance Balance
    Improvements in Chip Organization and Architecture
2.3 Multicore, MICs, and GPGPUs
2.4 The Evolution of the Intel x86 Architecture
2.5 Embedded Systems and the ARM
    Embedded Systems
    ARM Evolution
2.6 Performance Assessment
    Clock Speed and Instructions per Second
    Benchmarks
    Amdahl’s Law
    Little’s Law
2.7 Recommended Reading
2.8 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Present an overview of the evolution of computer technology from early digital computers to the latest microprocessors.
• Understand the key performance issues that relate to computer design.
• Explain the reasons for the move to multicore organization, and understand the trade-off between cache and processor resources on a single chip.
• Distinguish among multicore, MIC, and GPGPU organizations.
• Present an overview of the evolution of the x86 architecture.
• Define embedded systems and list some of the requirements and constraints that various embedded systems must meet.
• Summarize some of the issues in computer performance assessment.

We begin our study of computers with a brief history. This history is itself interesting and also serves the purpose of providing an overview of computer structure and function. Next, we address the issue of performance. A consideration of the need for balanced utilization of computer resources provides a context that is useful throughout the book. Finally, we look briefly at the evolution of the two systems that serve as key examples throughout the book: the Intel x86 and ARM processor families.

2.1 A BRIEF HISTORY OF COMPUTERS1

The First Generation: Vacuum Tubes

ENIAC The ENIAC (Electronic Numerical Integrator And Computer), designed and constructed at the University of Pennsylvania, was the world’s first general-purpose electronic digital computer. The project was a response to U.S. needs during World War II. The Army’s Ballistics Research Laboratory (BRL), an agency responsible for developing range and trajectory tables for new weapons, was having difficulty supplying these tables accurately and within a reasonable time frame. Without these firing tables, the new weapons and artillery were useless to gunners. The BRL employed more than 200 people who, using desktop calculators, solved the necessary ballistics equations. Preparation of the tables for a single weapon would take one person many hours, even days.
John Mauchly, a professor of electrical engineering at the University of Pennsylvania, and John Eckert, one of his graduate students, proposed to build a general-purpose computer using vacuum tubes for the BRL’s application. In 1943, the Army accepted this proposal, and work began on the ENIAC. The resulting

1This book’s Companion Web site contains several links to sites that provide photographs of many of the devices and components discussed in this section.

machine was enormous, weighing 30 tons, occupying 1500 square feet of floor space, and containing more than 18,000 vacuum tubes. When operating, it consumed 140 kilowatts of power. It was also substantially faster than any electromechanical computer, capable of 5000 additions per second.
The ENIAC was a decimal rather than a binary machine. That is, numbers were represented in decimal form, and arithmetic was performed in the decimal system. Its memory consisted of 20 accumulators, each capable of holding a 10-digit decimal number. A ring of 10 vacuum tubes represented each digit. At any time, only one vacuum tube was in the ON state, representing one of the 10 digits. The major drawback of the ENIAC was that it had to be programmed manually by setting switches and plugging and unplugging cables.
The ENIAC was completed in 1946, too late to be used in the war effort. Instead, its first task was to perform a series of complex calculations that were used to help determine the feasibility of the hydrogen bomb. The use of the ENIAC for a purpose other than that for which it was built demonstrated its general-purpose nature. The ENIAC continued to operate under BRL management until 1955, when it was disassembled.

THE VON NEUMANN MACHINE The task of entering and altering programs for the ENIAC was extremely tedious. But suppose a program could be represented in a form suitable for storing in memory alongside the data. Then, a computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory.
This idea, known as the stored-program concept, is usually attributed to the ENIAC designers, most notably the mathematician John von Neumann, who was a consultant on the ENIAC project. Alan Turing developed the idea at about the same time. The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Computer).2
In 1946, von Neumann and his colleagues began the design of a new stored-program computer, referred to as the IAS computer, at the Princeton Institute for Advanced Studies. The IAS computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers.3
Figure 2.1 shows the general structure of the IAS computer (compare to the middle portion of Figure 1.4). It consists of
• A main memory, which stores both data and instructions4
• An arithmetic and logic unit (ALU) capable of operating on binary data
• A control unit, which interprets the instructions in memory and causes them to be executed
• Input/output (I/O) equipment operated by the control unit

2The 1945 report on EDVAC is in the Premium Content section of this book’s Web site. 3A 1954 report [GOLD54] describes the implemented IAS machine and lists the final instruction set. It is provided in the Premium Content section of this book’s Web site. 4In this book, unless otherwise noted, the term instruction refers to a machine instruction that is directly interpreted and executed by the processor, in contrast to an instruction in a high-level language, such as Ada or C++, which must first be compiled into a series of machine instructions before being executed.

[Figure 2.1 Structure of the IAS Computer. The central processing unit (CPU), comprising the arithmetic-logic unit (CA) and the program control unit (CC), is connected to main memory (M) and to input–output equipment (I, O).]

This structure was outlined in von Neumann’s earlier proposal, which is worth quoting in part at this point [VONN45]:

2.2 First: Since the device is primarily a computer, it will have to perform the elementary operations of arithmetic most frequently. These are addition, subtraction, multiplication, and division. It is therefore reasonable that it should contain specialized organs for just these operations. It must be observed, however, that while this principle as such is probably sound, the specific way in which it is realized requires close scrutiny. At any rate a central arithmetical part of the device will probably have to exist, and this constitutes the first specific part: CA.
2.3 Second: The logical control of the device, that is, the proper sequencing of its operations, can be most efficiently carried out by a central control organ. If the device is to be elastic, that is, as nearly as possible all purpose, then a distinction must be made between the specific instructions given for and defining a particular problem, and the general control organs that see to it that these instructions—no matter what they are—are carried out. The former must be stored in some way; the latter are represented by definite operating parts of the device. By the central control we mean this latter function only, and the organs that perform it form the second specific part: CC.
2.4 Third: Any device that is to carry out long and complicated sequences of operations (specifically of calculations) must have a considerable memory…

The instructions which govern a complicated problem may constitute considerable material, particularly so, if the code is circumstantial (which it is in most arrangements). This material must be remembered. At any rate, the total memory constitutes the third specific part of the device: M.
2.6 The three specific parts CA, CC (together C), and M correspond to the associative neurons in the human nervous system. It remains to discuss the equivalents of the sensory or afferent and the motor or efferent neurons. These are the input and output organs of the device. The device must be endowed with the ability to maintain input and output (sensory and motor) contact with some specific medium of this type. The medium will be called the outside recording medium of the device: R.
2.7 Fourth: The device must have organs to transfer . . . information from R into its specific parts C and M. These organs form its input, the fourth specific part: I. It will be seen that it is best to make all transfers from R (by I) into M and never directly from C.
2.8 Fifth: The device must have organs to transfer . . . from its specific parts C and M into R. These organs form its output, the fifth specific part: O. It will be seen that it is again best to make all transfers from M (by O) into R, and never directly from C.

With rare exceptions, all of today’s computers have this same general structure and function and are thus referred to as von Neumann machines. Thus, it is worthwhile at this point to describe briefly the operation of the IAS computer [BURK46]. Following [HAYE98], the terminology and notation of von Neumann are changed in the following to conform more closely to modern usage; the examples and illustrations accompanying this discussion are based on that latter text.
The memory of the IAS consists of 1000 storage locations, called words, of 40 binary digits (bits) each.5 Both data and instructions are stored there. Numbers are represented in binary form, and each instruction is a binary code. Figure 2.2 illustrates these formats. Each number is represented by a sign bit and a 39-bit value. A word may also contain two 20-bit instructions, with each instruction consisting of an 8-bit operation code (opcode) specifying the operation to be performed and a 12-bit address designating one of the words in memory (numbered from 0 to 999).
The control unit operates the IAS by fetching instructions from memory and executing them one at a time. To explain this, a more detailed structure is

5There is no universal definition of the term word. In general, a word is an ordered set of bytes or bits that is the normal unit in which information may be stored, transmitted, or operated on within a given computer. Typically, if a processor has a fixed-length instruction set, then the instruction length equals the word length.

[Figure 2.2 IAS Memory Formats. (a) Number word: a sign bit (bit 0) followed by a 39-bit value (bits 1–39). (b) Instruction word: a left instruction (bits 0–19) and a right instruction (bits 20–39), each consisting of an 8-bit opcode followed by a 12-bit address.]

needed, as indicated in Figure 2.3. This figure reveals that both the control unit and the ALU contain storage locations, called registers, defined as follows:
• Memory buffer register (MBR): Contains a word to be stored in memory or sent to the I/O unit, or is used to receive a word from memory or from the I/O unit.
• Memory address register (MAR): Specifies the address in memory of the word to be written from or read into the MBR.
• Instruction register (IR): Contains the 8-bit opcode instruction being executed.
• Instruction buffer register (IBR): Employed to hold temporarily the right-hand instruction from a word in memory.
• Program counter (PC): Contains the address of the next instruction pair to be fetched from memory.
• Accumulator (AC) and multiplier quotient (MQ): Employed to hold temporarily operands and results of ALU operations. For example, the result of multiplying two 40-bit numbers is an 80-bit number; the most significant 40 bits are stored in the AC and the least significant in the MQ.
The IAS operates by repetitively performing an instruction cycle, as shown in Figure 2.4. Each instruction cycle consists of two subcycles. During the fetch cycle, the opcode of the next instruction is loaded into the IR and the address portion is loaded into the MAR. This instruction may be taken from the IBR, or it can be obtained from memory by loading a word into the MBR, and then down to the IBR, IR, and MAR. Why the indirection? These operations are controlled by electronic circuitry and result in the use of data paths. To simplify the electronics, there is only one register that is used to specify the address in memory for a read or write and only one register used for the source or destination.
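As a concrete illustration of the fetch subcycle just described, the following sketch models the registers and the decision of whether the next instruction is already waiting in the IBR. It is a simplified, hypothetical Python rendering of the flow shown in Figure 2.4, with invented class and variable names; it omits the case of a jump to the right half of a word and is not a complete simulator.

```python
# Simplified sketch of the IAS fetch subcycle (hypothetical model, not original text).
# Memory holds 1000 forty-bit words; each word packs two 20-bit instructions.
# Bit 0 is treated as the leftmost (most significant) bit, as in the IAS numbering.

class IAS:
    def __init__(self, memory):
        self.M = memory      # list of 40-bit words, stored as Python ints
        self.PC = 0          # program counter: address of the next instruction pair
        self.IBR = None      # instruction buffer register (buffered right-hand instruction)
        self.MBR = 0         # memory buffer register
        self.MAR = 0         # memory address register
        self.IR = 0          # instruction register (8-bit opcode)

    def fetch(self):
        """Load the next opcode into IR and its address field into MAR."""
        if self.IBR is not None:                  # next instruction already buffered?
            instr, self.IBR = self.IBR, None
        else:
            self.MAR = self.PC
            self.MBR = self.M[self.MAR]           # read the 40-bit word from memory
            instr = (self.MBR >> 20) & 0xFFFFF    # left instruction, bits 0:19
            self.IBR = self.MBR & 0xFFFFF         # buffer right instruction, bits 20:39
            self.PC += 1                          # point at the next instruction pair
        self.IR = (instr >> 12) & 0xFF            # 8-bit opcode
        self.MAR = instr & 0xFFF                  # 12-bit address
```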

[Figure 2.3 Expanded Structure of IAS Computer. The arithmetic-logic unit contains the AC, MQ, arithmetic-logic circuits, and MBR and connects to the input–output equipment; the program control unit contains the IBR, PC, IR, MAR, and control circuits, which issue control signals and addresses; both units connect to main memory M, which supplies instructions and data.]

Once the opcode is in the IR, the execute cycle is performed. Control circuitry interprets the opcode and executes the instruction by sending out the appropriate control signals to cause data to be moved or an operation to be performed by the ALU.
The IAS computer had a total of 21 instructions, which are listed in Table 2.1. These can be grouped as follows:
• Data transfer: Move data between memory and ALU registers or between two ALU registers.

[Figure 2.4 Partial Flowchart of IAS Operation, showing the fetch cycle (the instruction is taken from the IBR or fetched from memory via MBR into the IBR, IR, and MAR, with PC incremented) followed by the execution cycle for representative instructions. Legend: M(X) = contents of memory location whose address is X; (i:j) = bits i through j.]

• Unconditional branch: Normally, the control unit executes instructions in sequence from memory. This sequence can be changed by a branch instruction, which facilitates repetitive operations.
• Conditional branch: The branch can be made dependent on a condition, thus allowing decision points.
• Arithmetic: Operations performed by the ALU.
• Address modify: Permits addresses to be computed in the ALU and then inserted into instructions stored in memory. This allows a program considerable addressing flexibility.

Table 2.1 The IAS Instruction Set

Instruction Type | Opcode | Symbolic Representation | Description
Data transfer | 00001010 | LOAD MQ | Transfer contents of register MQ to the accumulator AC
Data transfer | 00001001 | LOAD MQ,M(X) | Transfer contents of memory location X to MQ
Data transfer | 00100001 | STOR M(X) | Transfer contents of accumulator to memory location X
Data transfer | 00000001 | LOAD M(X) | Transfer M(X) to the accumulator
Data transfer | 00000010 | LOAD –M(X) | Transfer –M(X) to the accumulator
Data transfer | 00000011 | LOAD |M(X)| | Transfer absolute value of M(X) to the accumulator
Data transfer | 00000100 | LOAD –|M(X)| | Transfer –|M(X)| to the accumulator
Unconditional branch | 00001101 | JUMP M(X,0:19) | Take next instruction from left half of M(X)
Unconditional branch | 00001110 | JUMP M(X,20:39) | Take next instruction from right half of M(X)
Conditional branch | 00001111 | JUMP + M(X,0:19) | If number in the accumulator is nonnegative, take next instruction from left half of M(X)
Conditional branch | 00010000 | JUMP + M(X,20:39) | If number in the accumulator is nonnegative, take next instruction from right half of M(X)
Arithmetic | 00000101 | ADD M(X) | Add M(X) to AC; put the result in AC
Arithmetic | 00000111 | ADD |M(X)| | Add |M(X)| to AC; put the result in AC
Arithmetic | 00000110 | SUB M(X) | Subtract M(X) from AC; put the result in AC
Arithmetic | 00001000 | SUB |M(X)| | Subtract |M(X)| from AC; put the remainder in AC
Arithmetic | 00001011 | MUL M(X) | Multiply M(X) by MQ; put most significant bits of result in AC, put least significant bits in MQ
Arithmetic | 00001100 | DIV M(X) | Divide AC by M(X); put the quotient in MQ and the remainder in AC
Arithmetic | 00010100 | LSH | Multiply accumulator by 2; that is, shift left one bit position
Arithmetic | 00010101 | RSH | Divide accumulator by 2; that is, shift right one position
Address modify | 00010010 | STOR M(X,8:19) | Replace left address field at M(X) by 12 rightmost bits of AC
Address modify | 00010011 | STOR M(X,28:39) | Replace right address field at M(X) by 12 rightmost bits of AC
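To make the instruction format concrete, the short sketch below unpacks a 40-bit IAS word into its two 20-bit instructions and splits each into an opcode and address, following the layout of Figure 2.2b and the opcodes of Table 2.1. It is an illustrative Python fragment with invented function names, not material from the original text.

```python
# Hypothetical illustration: decoding a 40-bit IAS word (layout of Figure 2.2b).
# Bit 0 is the leftmost (most significant) bit, as in the IAS numbering.

def decode_word(word):
    """Return ((opcode, address), (opcode, address)) for the left and right instructions."""
    left = (word >> 20) & 0xFFFFF         # bits 0:19
    right = word & 0xFFFFF                # bits 20:39
    def split(instr):
        return (instr >> 12) & 0xFF, instr & 0xFFF   # 8-bit opcode, 12-bit address
    return split(left), split(right)

# Example: left half = LOAD M(X) with X = 500, right half = ADD M(X) with X = 501.
word = (0b00000001 << 32) | (500 << 20) | (0b00000101 << 12) | 501
print(decode_word(word))   # ((1, 500), (5, 501))
```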

Table 2.1 presents instructions in a symbolic, easy-to-read form. Actually, each instruction must conform to the format of Figure 2.2b. The opcode portion (first 8 bits) specifies which of the 21 instructions is to be executed. The address portion (remaining 12 bits) specifies which of the 1000 memory locations is to be involved in the execution of the instruction.
Figure 2.4 shows several examples of instruction execution by the control unit. Note that each operation requires several steps. Some of these are quite elaborate. The multiplication operation requires 39 suboperations, one for each bit position except that of the sign bit.

COMMERCIAL COMPUTERS The 1950s saw the birth of the computer industry with two companies, Sperry and IBM, dominating the marketplace.

In 1947, Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation to manufacture computers commercially. Their first successful machine was the UNIVAC I (Universal Automatic Computer), which was commissioned by the Bureau of the Census for the 1950 census calculations. The Eckert-Mauchly Computer Corporation became part of the UNIVAC division of Sperry-Rand Corporation, which went on to build a series of successor machines.
The UNIVAC I was the first successful commercial computer. It was intended for both scientific and commercial applications. The first paper describing the system listed matrix algebraic computations, statistical problems, premium billings for a life insurance company, and logistical problems as a sample of the tasks it could perform.
The UNIVAC II, which had greater memory capacity and higher performance than the UNIVAC I, was delivered in the late 1950s and illustrates several trends that have remained characteristic of the computer industry. First, advances in technology allow companies to continue to build larger, more powerful computers. Second, each company tries to make its new machines backward compatible6 with the older machines. This means that the programs written for the older machines can be executed on the new machine. This is adopted in the hopes of retaining the customer base; that is, when a customer decides to buy a newer machine, he or she is likely to get it from the same company to avoid losing the investment in programs.
The UNIVAC division also began development of the 1100 series of computers, which was to be its major source of revenue. This series illustrates a distinction that existed at one time. The first model, the UNIVAC 1103, and its successors for many years were primarily intended for scientific applications, involving long and complex calculations. Other companies concentrated on business applications, which involved processing large amounts of text data. This split has largely disappeared, but it was evident for a number of years.
IBM, then the major manufacturer of punched-card processing equipment, delivered its first electronic stored-program computer, the 701, in 1953. The 701 was intended primarily for scientific applications [BASH81]. In 1955, IBM introduced the companion 702 product, which had a number of hardware features that suited it to business applications. These were the first of a long series of 700/7000 computers that established IBM as the overwhelmingly dominant computer manufacturer.

The Second Generation: Transistors

The first major change in the electronic computer came with the replacement of the vacuum tube by the transistor. The transistor is smaller, cheaper, and dissipates less heat than a vacuum tube but can be used in the same way as a vacuum tube to construct computers. Unlike the vacuum tube, which requires wires, metal plates, a glass capsule, and a vacuum, the transistor is a solid-state device, made from silicon.
The transistor was invented at Bell Labs in 1947 and by the 1950s had launched an electronic revolution. It was not until the late 1950s, however, that fully transistorized computers were commercially available. IBM again was not the

6Also called downward compatible. The same concept, from the point of view of the older system, is referred to as upward compatible, or forward compatible.

Table 2.2 Computer Generations

Generation | Approximate Dates | Technology | Typical Speed (operations per second)
1 | 1946–1957 | Vacuum tube | 40,000
2 | 1958–1964 | Transistor | 200,000
3 | 1965–1971 | Small- and medium-scale integration | 1,000,000
4 | 1972–1977 | Large-scale integration | 10,000,000
5 | 1978–1991 | Very-large-scale integration | 100,000,000
6 | 1991– | Ultra-large-scale integration | 1,000,000,000

first company to deliver the new technology. NCR and, more successfully, RCA were the front-runners with some small transistor machines. IBM followed shortly with the 7000 series.
The use of the transistor defines the second generation of computers. It has become widely accepted to classify computers into generations based on the fundamental hardware technology employed (Table 2.2). Each new generation is characterized by greater processing performance, larger memory capacity, and smaller size than the previous one.
But there are other changes as well. The second generation saw the introduction of more complex arithmetic and logic units and control units, the use of high-level programming languages, and the provision of system software with the computer. In broad terms, system software provided the ability to load programs, move data to peripherals, and libraries to perform common computations, similar to what a modern operating system such as Windows does.
The second generation is noteworthy also for the appearance of the Digital Equipment Corporation (DEC). DEC was founded in 1957 and, in that year, delivered its first computer, the PDP-1. This computer and this company began the minicomputer phenomenon that would become so prominent in the third generation.

THE IBM 7094 From the introduction of the 700 series in 1952 to the introduction of the last member of the 7000 series in 1964, this IBM product line underwent an evolution that is typical of computer products. Successive members of the product line show increased performance, increased capacity, and/or lower cost.
Table 2.3 illustrates this trend. The size of main memory, in multiples of 2^10 36-bit words, grew from 2K (1K = 2^10) to 32K words,7 while the time to access one word of memory, the memory cycle time, fell from 30 μs to 1.4 μs. The number of opcodes grew from a modest 24 to 185.
The final column indicates the relative execution speed of the central processing unit (CPU). Speed improvements are achieved by improved electronics (e.g., a transistor implementation is faster than a vacuum tube implementation) and more complex circuitry. For example, the IBM 7094 includes an Instruction

7A discussion of the uses of numerical prefixes, such as kilo and giga, is contained in a supporting docu- ment at the Computer Science Student Resource Site at ComputerScienceStudent.com. Speed to 701) (relative (relative tion Fetch Fetch Instruc- Overlap I/O Overlap Overlap yes yes 30 yes yes 50 (Channels) Point Floating- Hardwired Hardwired precision) precision) Number of Index Registers of Number Opcodes Memory Size (K) Cycle Time (μs) 30 2–4 24 0 no no no 1 Tech- nology Memory Electrostatic tubes Core 12 4–32 80 3 yes no no 2.5 Core 12 32 140 3 yes yes no 4 CPU Tech- nology tubes tubes tubes First Delivery Example members of the IBM 700/7000 Series Model Number 701 1952 Vacuum 704 1955 Vacuum 709 1958 Vacuum 7090 1960 Transistor Core 2.18 32 169 3 yes yes no 25 7094 I 1962 Transistor Core 2 32 185 7 yes (double 7094 II 1964 Transistor Core 1.4 32 185 7 yes (double Table 2.3 26 2.1 / A BRIEF HISTORY OF COMPUTERS 27

[Figure 2.5 An IBM 7094 Configuration, showing the CPU and memory connected through a multiplexor to several data channels, which in turn serve peripherals such as magnetic tape units, a card reader, a card punch, a line printer, a drum, disks, hypertapes, and teleprocessing equipment.]

Backup Register, used to buffer the next instruction. The control unit fetches two adjacent words from memory for an instruction fetch. Except for the occurrence of a branching instruction, which is relatively infrequent (perhaps 10 to 15%), this means that the control unit has to access memory for an instruction on only half the instruction cycles. This prefetching significantly reduces the average instruction cycle time. The remainder of the columns of Table 2.3 will become clear as the text proceeds.
Figure 2.5 shows a large (many peripherals) configuration for an IBM 7094, which is representative of second-generation computers [BELL71]. Several differences from the IAS computer are worth noting. The most important of these is the use of data channels. A data channel is an independent I/O module with its own processor and instruction set. In a computer system with such devices, the CPU does not execute detailed I/O instructions. Such instructions are stored in a main memory to be executed by a special-purpose processor in the data channel itself. The CPU initiates an I/O transfer by sending a control signal to the data channel, instructing it to execute a sequence of instructions in memory. The data channel performs its task independently of the CPU and signals the CPU when the operation is complete. This arrangement relieves the CPU of a considerable processing burden.
Another new feature is the multiplexor, which is the central termination point for data channels, the CPU, and memory. The multiplexor schedules access to the memory from the CPU and data channels, allowing these devices to act independently.
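A rough back-of-the-envelope check of the prefetching claim: if each memory access delivers a pair of instructions, and a fraction b of instructions are branches that discard the buffered partner instruction, then the number of instruction-fetch memory accesses per instruction executed is roughly (1 + b)/2. The short calculation below is an illustration of that reasoning only; the model and numbers are not from the text.

```python
# Hypothetical estimate of instruction-fetch accesses per instruction on the 7094.
# Assumes each fetch returns two instructions and a branch wastes the buffered one.
def fetch_accesses_per_instruction(branch_fraction):
    return (1 + branch_fraction) / 2

for b in (0.0, 0.10, 0.15):
    print(f"branch fraction {b:.2f}: {fetch_accesses_per_instruction(b):.3f} accesses/instruction")
# With 10-15% branches, instruction fetches touch memory on roughly 55-58% of
# instruction cycles, close to the "only half" figure quoted in the text.
```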

The Third Generation: Integrated Circuits

A single, self-contained transistor is called a discrete component. Throughout the 1950s and early 1960s, electronic equipment was composed largely of discrete components—transistors, resistors, capacitors, and so on. Discrete components were manufactured separately, packaged in their own containers, and soldered or wired together onto masonite-like circuit boards, which were then installed in computers, oscilloscopes, and other electronic equipment. Whenever an electronic device called for a transistor, a little tube of metal containing a pinhead-sized piece of silicon had to be soldered to a circuit board. The entire manufacturing process, from transistor to circuit board, was expensive and cumbersome.
These facts of life were beginning to create problems in the computer industry. Early second-generation computers contained about 10,000 transistors. This figure grew to the hundreds of thousands, making the manufacture of newer, more powerful machines increasingly difficult.
In 1958 came the achievement that revolutionized electronics and started the era of microelectronics: the invention of the integrated circuit. It is the integrated circuit that defines the third generation of computers. In this section, we provide a brief introduction to the technology of integrated circuits. Then we look at perhaps the two most important members of the third generation, both of which were introduced at the beginning of that era: the IBM System/360 and the DEC PDP-8.

MICROELECTRONICS Microelectronics means, literally, “small electronics.” Since the beginnings of digital electronics and the computer industry, there has been a persistent and consistent trend toward the reduction in size of digital electronic circuits. Before examining the implications and benefits of this trend, we need to say something about the nature of digital electronics. A more detailed discussion is found in Chapter 11.
The basic elements of a digital computer, as we know, must perform storage, movement, processing, and control functions. Only two fundamental types of components are required (Figure 2.6): gates and memory cells. A gate is a device that implements a simple Boolean or logical function, such as IF A AND B ARE TRUE THEN C IS TRUE (AND gate). Such devices are called gates because they control data flow in much the same way that canal gates control the flow of water. The memory cell is a device that can store one bit of data; that is, the device can be in one of two stable states at any time. By interconnecting large numbers of these

[Figure 2.6 Fundamental Computer Elements: (a) a gate, which produces an output as a Boolean logic function of its binary inputs when an activate signal is applied; (b) a memory cell, which stores one bit of data under the control of read and write signals.]

fundamental devices, we can construct a computer. We can relate this to our four basic functions as follows:
• Data storage: Provided by memory cells.
• Data processing: Provided by gates.
• Data movement: The paths among components are used to move data from memory to memory and from memory through gates to memory.
• Control: The paths among components can carry control signals. For example, a gate will have one or two data inputs plus a control signal input that activates the gate. When the control signal is ON, the gate performs its function on the data inputs and produces a data output. Similarly, the memory cell will store the bit that is on its input lead when the WRITE control signal is ON and will place the bit that is in the cell on its output lead when the READ control signal is ON.
Thus, a computer consists of gates, memory cells, and interconnections among these elements. The gates and memory cells are, in turn, constructed of simple digital electronic components.
The integrated circuit exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon. It is merely an extension of the solid-state art to fabricate an entire circuit in a tiny piece of silicon rather than assemble discrete components made from separate pieces of silicon into the same circuit. Many transistors can be produced at the same time on a single wafer of silicon. Equally important, these transistors can be connected with a process of metallization to form circuits.
Figure 2.7 depicts the key concepts in an integrated circuit. A thin wafer of silicon is divided into a matrix of small areas, each a few millimeters square. The identical circuit pattern is fabricated in each area, and the wafer is broken up into chips. Each chip consists of many gates and/or memory cells plus a number of input and output attachment points. This chip is then packaged in a housing that protects it and provides pins for attachment to devices beyond the chip. A number of these packages can then be interconnected on a printed circuit board to produce larger and more complex circuits.
Initially, only a few gates or memory cells could be reliably manufactured and packaged together. These early integrated circuits are referred to as small-scale integration (SSI). As time went on, it became possible to pack more and more components on the same chip. This growth in density is illustrated in Figure 2.8; it is one of the most remarkable technological trends ever recorded.8 This figure reflects the famous Moore’s law, which was propounded by Gordon Moore, cofounder of Intel, in 1965 [MOOR65]. Moore observed that the number of transistors that could be put on a single chip was doubling every year and correctly predicted that this pace would continue into the near future. To the surprise of many, including Moore, the pace continued year after year and decade after decade. The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since.

8Note that the vertical axis uses a log scale. A basic review of log scales is in the math refresher document at the Computer Science Student Resource Site at ComputerScienceStudent.com.
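As a rough sense of what a doubling every 18 months implies, the small calculation below projects the growth factor over a given span of years. It is an illustrative sketch of the arithmetic only, not data from the figure.

```python
# Illustrative arithmetic: growth factor under a doubling every 18 months.
def moores_law_factor(years, doubling_period_years=1.5):
    return 2 ** (years / doubling_period_years)

print(f"{moores_law_factor(10):,.0f}x over 10 years")   # roughly 100x
print(f"{moores_law_factor(30):,.0f}x over 30 years")   # 1,048,576x, i.e., 2**20
```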

[Figure 2.7 Relationship among Wafer, Chip, and Gate, showing a silicon wafer divided into chips, a single chip containing many gates, and a packaged chip.]

The consequences of Moore’s law are profound:
1. The cost of a chip has remained virtually unchanged during this period of rapid growth in density. This means that the cost of computer logic and memory circuitry has fallen at a dramatic rate.
2. Because logic and memory elements are placed closer together on more densely packed chips, the electrical path length is shortened, increasing operating speed.
3. The computer becomes smaller, making it more convenient to place in a variety of environments.
4. There is a reduction in power and cooling requirements.
5. The interconnections on the integrated circuit are much more reliable than solder connections. With more circuitry on each chip, there are fewer interchip connections.

[Figure 2.8 Growth in Transistor Count on Integrated Circuits, a log-scale plot from 1947 to 2011 (1 to roughly 100 billion transistors per chip), with markers for the first working transistor, the invention of the integrated circuit, and the promulgation of Moore’s law.]

IBM SYSTEM/360 By 1964, IBM had a firm grip on the computer market with its 7000 series of machines. In that year, IBM announced the System/360, a new family of computer products. Although the announcement itself was no surprise, it contained some unpleasant news for current IBM customers: the 360 product line was incompatible with older IBM machines. Thus, the transition to the 360 would be difficult for the current customer base. This was a bold step by IBM, but one IBM felt was necessary to break out of some of the constraints of the 7000 architecture and to produce a system capable of evolving with the new integrated circuit technology [PADE81, GIFF87]. The strategy paid off both financially and technically. The 360 was the success of the decade and cemented IBM as the overwhelmingly dominant computer vendor, with a market share above 70%. And, with some modifications and extensions, the architecture of the 360 remains to this day the architecture of IBM’s mainframe9 computers. Examples using this architecture can be found throughout this text.
The System/360 was the industry’s first planned family of computers. The family covered a wide range of performance and cost. Table 2.4 indicates some of the key characteristics of the various models in 1965 (each member of the family is distinguished by a model number). The models were compatible in the sense that a program written for one model should be capable of being executed by another model in the series, with only a difference in the time it takes to execute.
The concept of a family of compatible computers was both novel and extremely successful. A customer with modest requirements and a budget to match could start with the relatively inexpensive Model 30. Later, if the customer’s needs grew, it was possible to upgrade to a faster machine with more memory without

9The term mainframe is used for the larger, most powerful computers other than supercomputers. Typical characteristics of a mainframe are that it supports a large database, has elaborate I/O hardware, and is used in a central data processing facility.

Table 2.4 Key Characteristics of the System/360 Family

Characteristic | Model 30 | Model 40 | Model 50 | Model 65 | Model 75
Maximum memory size (bytes) | 64K | 256K | 256K | 512K | 512K
Data rate from memory (Mbytes/s) | 0.5 | 0.8 | 2.0 | 8.0 | 16.0
Processor cycle time (μs) | 1.0 | 0.625 | 0.5 | 0.25 | 0.2
Relative speed | 1 | 3.5 | 10 | 21 | 50
Maximum number of data channels | 3 | 3 | 4 | 6 | 6
Maximum data rate on one channel (Kbytes/s) | 250 | 400 | 800 | 1250 | 1250

sacrificing the investment in already-developed software. The characteristics of a family are as follows:
• Similar or identical instruction set: In many cases, the exact same set of machine instructions is supported on all members of the family. Thus, a program that executes on one machine will also execute on any other. In some cases, the lower end of the family has an instruction set that is a subset of that of the top end of the family. This means that programs can move up but not down.
• Similar or identical operating system: The same basic operating system is available for all family members. In some cases, additional features are added to the higher-end members.
• Increasing speed: The rate of instruction execution increases in going from lower to higher family members.
• Increasing number of I/O ports: The number of I/O ports increases in going from lower to higher family members.
• Increasing memory size: The size of main memory increases in going from lower to higher family members.
• Increasing cost: At a given point in time, the cost of a system increases in going from lower to higher family members.
How could such a family concept be implemented? Differences were achieved based on three factors: basic speed, size, and degree of simultaneity [STEV64]. For example, greater speed in the execution of a given instruction could be gained by the use of more complex circuitry in the ALU, allowing suboperations to be carried out in parallel. Another way of increasing speed was to increase the width of the data path between main memory and the CPU. On the Model 30, only 1 byte (8 bits) could be fetched from main memory at a time, whereas 8 bytes could be fetched at a time on the Model 75.
The System/360 not only dictated the future course of IBM but also had a profound impact on the entire industry. Many of its features have become standard on other large computers.

DEC PDP-8 In the same year that IBM shipped its first System/360, another momentous first shipment occurred: the PDP-8 from Digital Equipment

Corporation (DEC). At a time when the average computer required an air-conditioned room, the PDP-8 (dubbed a minicomputer by the industry, after the miniskirt of the day) was small enough that it could be placed on top of a lab bench or be built into other equipment. It could not do everything the mainframe could, but at $16,000, it was cheap enough for each lab technician to have one. In contrast, the System/360 series of mainframe computers introduced just a few months before cost hundreds of thousands of dollars.
The low cost and small size of the PDP-8 enabled another manufacturer to purchase a PDP-8 and integrate it into a total system for resale. These other manufacturers came to be known as original equipment manufacturers (OEMs), and the OEM market became and remains a major segment of the computer marketplace. The PDP-8 was an immediate hit and made DEC’s fortune. This machine and other members of the PDP-8 family that followed it (see Table 2.5) achieved a production status formerly reserved for IBM computers, with about 50,000 machines sold over the next dozen years. As DEC’s official history puts it, the PDP-8 “established the concept of minicomputers, leading the way to a multibillion dollar industry.” It also established DEC as the number one minicomputer vendor, and, by the time the PDP-8 had reached the end of its useful life, DEC was the number two computer manufacturer, behind IBM.
In contrast to the central-switched architecture (Figure 2.5) used by IBM on its 700/7000 and 360 systems, later models of the PDP-8 used a structure that is now virtually universal for microcomputers: the bus structure. This is illustrated in Figure 2.9. The PDP-8 bus, called the Omnibus, consists of 96 separate signal paths, used to carry control, address, and data signals. Because all system components share a common set of signal paths, their use can be controlled by the CPU. This architecture is highly flexible, allowing modules to be plugged into the bus to create various configurations.

Later Generations

Beyond the third generation there is less general agreement on defining generations of computers. Table 2.2 suggests that there have been a number of later generations, based on advances in integrated circuit technology. With the introduction of large-scale integration (LSI), more than 1000 components can be placed on a single integrated circuit chip. Very-large-scale integration (VLSI) achieved more than 10,000 components per chip, while current ultra-large-scale integration (ULSI) chips can contain more than one billion components.
With the rapid pace of technology, the high rate of introduction of new products, and the importance of software and communications as well as hardware, the classification by generation becomes less clear and less meaningful. It could be said that the commercial application of new developments resulted in a major change in the early 1970s and that the results of these changes are still being worked out. In this section, we mention two of the most important of these results.
The first application of integrated circuit technology to computers was construction of the processor (the control unit and the arithmetic and logic unit) out of integrated circuit chips. But it was also found that this same technology could be used to construct memories.
Table 2.5 Evolution of the PDP-8

Model | First Shipped | Cost of Processor + 4K 12-bit Words of Memory ($1000s) | Data Rate (words/μs) from Memory | Volume (cubic feet) | Innovations and Improvements
PDP-8 | 4/65 | 16.2 | 1.26 | 8.0 | Automatic wire-wrapping production
PDP-8/S | 9/66 | 8.79 | 0.08 | 3.2 | Serial instruction implementation
PDP-8/I | 4/68 | 11.6 | 1.34 | 8.0 | Medium-scale integrated circuits
PDP-8/L | 11/68 | 7.0 | 1.26 | 2.0 | Smaller cabinet
PDP-8/E | 3/71 | 4.99 | 1.52 | 2.2 | Omnibus
PDP-8/M | 6/72 | 3.69 | 1.52 | 1.8 | Half-size cabinet with fewer slots than 8/E
PDP-8/A | 1/75 | 2.6 | 1.34 | 1.2 | Semiconductor memory; floating-point processor


[Figure 2.9 PDP-8 Bus Structure, showing the console controller, CPU, main memory, and I/O modules all attached to a single shared bus, the Omnibus.]

SEMICONDUCTOR MEMORY In the 1950s and 1960s, most computer memory was constructed from tiny rings of ferromagnetic material, each about a sixteenth of an inch in diameter. These rings were strung up on grids of fine wires suspended on small screens inside the computer. Magnetized one way, a ring (called a core) represented a one; magnetized the other way, it stood for a zero. Magnetic-core memory was rather fast; it took as little as a millionth of a second to read a bit stored in memory. But it was expensive, bulky, and used destructive readout: The simple act of reading a core erased the data stored in it. It was therefore necessary to install circuits to restore the data as soon as it had been extracted.
Then, in 1970, Fairchild produced the first relatively capacious semiconductor memory. This chip, about the size of a single core, could hold 256 bits of memory. It was nondestructive and much faster than core. It took only 70 billionths of a second to read a bit. However, the cost per bit was higher than that of core.
In 1974, a seminal event occurred: The price per bit of semiconductor memory dropped below the price per bit of core memory. Following this, there has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density. This has led the way to smaller, faster machines with the memory of larger and more expensive machines from just a few years earlier. Developments in memory technology, together with developments in processor technology to be discussed next, changed the nature of computers in less than a decade. Although bulky, expensive computers remain a part of the landscape, the computer has also been brought out to the “end user,” with office machines and personal computers.
Since 1970, semiconductor memory has been through 13 generations: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M, 1G, 4G, and, as of this writing, 16 Gbits on a single chip (1K = 2^10, 1M = 2^20, 1G = 2^30). Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time.

MICROPROCESSORS Just as the density of elements on memory chips has continued to rise, so has the density of elements on processor chips. As time went on, more and more elements were placed on each chip, so that fewer and fewer chips were needed to construct a single computer processor.
A breakthrough was achieved in 1971, when Intel developed its 4004. The 4004 was the first chip to contain all of the components of a CPU on a single chip: The microprocessor was born. The 4004 can add two 4-bit numbers and can multiply only by repeated addition. By today’s standards, the 4004 is hopelessly primitive, but it marked the beginning of a continuing evolution of microprocessor capability and power.
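The remark that the 4004 could multiply only by repeated addition can be made concrete with a few lines of code. This is an illustrative sketch of the idea in Python, not 4004 code; the real chip worked on 4-bit quantities with a far more limited instruction set.

```python
# Illustrative sketch: multiplication by repeated addition,
# the only way an adder-only processor such as the 4004 could multiply.
def multiply_by_repeated_addition(a, b):
    product = 0
    for _ in range(b):      # add a to the running total, b times
        product += a
    return product

print(multiply_by_repeated_addition(7, 6))   # 42
```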

This evolution can be seen most easily in the number of bits that the processor deals with at a time. There is no clear-cut measure of this, but perhaps the best measure is the data bus width: the number of bits of data that can be brought into or sent out of the processor at a time. Another measure is the number of bits in the accumulator or in the set of general-purpose registers. Often, these measures coincide, but not always. For example, a number of microprocessors were developed that operate on 16-bit numbers in registers but can only read and write 8 bits at a time.
The next major step in the evolution of the microprocessor was the introduction in 1972 of the Intel 8008. This was the first 8-bit microprocessor and was almost twice as complex as the 4004.
Neither of these steps was to have the impact of the next major event: the introduction in 1974 of the Intel 8080. This was the first general-purpose microprocessor. Whereas the 4004 and the 8008 had been designed for specific applications, the 8080 was designed to be the CPU of a general-purpose microcomputer. Like the 8008, the 8080 is an 8-bit microprocessor. The 8080, however, is faster, has a richer instruction set, and has a large addressing capability.
About the same time, 16-bit microprocessors began to be developed. However, it was not until the end of the 1970s that powerful, general-purpose 16-bit microprocessors appeared. One of these was the 8086. The next step in this trend occurred in 1981, when both Bell Labs and Hewlett-Packard developed 32-bit, single-chip microprocessors. Intel introduced its own 32-bit microprocessor, the 80386, in 1985 (Table 2.6).

Table 2.6 Evolution of Intel Microprocessors

(a) 1970s Processors

 | 4004 | 8008 | 8080 | 8086 | 8088
Introduced | 1971 | 1972 | 1974 | 1978 | 1979
Clock speeds | 108 kHz | 108 kHz | 2 MHz | 5 MHz, 8 MHz, 10 MHz | 5 MHz, 8 MHz
Bus width | 4 bits | 8 bits | 8 bits | 16 bits | 8 bits
Number of transistors | 2300 | 3500 | 6000 | 29,000 | 29,000
Feature size (μm) | 10 | — | 6 | 3 | 6
Addressable memory | 640 bytes | 16 kB | 64 kB | 1 MB | 1 MB

(b) 1980s Processors

 | 80286 | 386TM DX | 386TM SX | 486TM DX CPU
Introduced | 1982 | 1985 | 1988 | 1989
Clock speeds | 6 MHz–12.5 MHz | 16 MHz–33 MHz | 16 MHz–33 MHz | 25 MHz–50 MHz
Bus width | 16 bits | 32 bits | 16 bits | 32 bits
Number of transistors | 134,000 | 275,000 | 275,000 | 1.2 million
Feature size (μm) | 1.5 | 1 | 1 | 0.8–1
Addressable memory | 16 MB | 4 GB | 16 MB | 4 GB
Virtual memory | 1 GB | 64 TB | 64 TB | 64 TB
Cache | — | — | — | 8 kB

Table 2.6 Continued

(c) 1990s Processors

 | 486TM SX | Pentium | Pentium Pro | Pentium II
Introduced | 1991 | 1993 | 1995 | 1997
Clock speeds | 16 MHz–33 MHz | 60 MHz–166 MHz | 150 MHz–200 MHz | 200 MHz–300 MHz
Bus width | 32 bits | 32 bits | 64 bits | 64 bits
Number of transistors | 1.185 million | 3.1 million | 5.5 million | 7.5 million
Feature size (μm) | 1 | 0.8 | 0.6 | 0.35
Addressable memory | 4 GB | 4 GB | 64 GB | 64 GB
Virtual memory | 64 TB | 64 TB | 64 TB | 64 TB
Cache | 8 kB | 8 kB | 512 kB L1 and 512 kB L2 | 1 MB L2

(d) Recent Processors

 | Pentium III | Pentium 4 | Core 2 Duo | Core i7 EE 990
Introduced | 1999 | 2000 | 2006 | 2011
Clock speeds | 450–660 MHz | 1.3–1.8 GHz | 1.06–1.2 GHz | 3.5 GHz
Bus width | 64 bits | 64 bits | 64 bits | 64 bits
Number of transistors | 9.5 million | 42 million | 167 million | 1170 million
Feature size (nm) | 250 | 180 | 65 | 32
Addressable memory | 64 GB | 64 GB | 64 GB | 64 GB
Virtual memory | 64 TB | 64 TB | 64 TB | 64 TB
Cache | 512 kB L2 | 256 kB L2 | 2 MB L2 | 1.5 MB L2/12 MB L3

2.2 DESIGNING FOR PERFORMANCE

Year by year, the cost of computer systems continues to drop dramatically, while the performance and capacity of those systems continue to rise equally dramatically. Today’s laptops have the computing power of an IBM mainframe from 10 or 15 years ago. Thus, we have virtually “free” computer power. Processors are so inexpensive that we now have microprocessors we throw away. The digital pregnancy test is an example (used once and then thrown away). And this continuing technological revolution has enabled the development of applications of astounding complexity and power. For example, desktop applications that require the great power of today’s microprocessor-based systems include
• Image processing
• Speech recognition
• Videoconferencing
• Multimedia authoring
• Voice and video annotation of files
• Simulation modeling

Workstation systems now support highly sophisticated engineering and scientific applications, as well as simulation systems, and have the ability to support image and video applications. In addition, businesses are relying on increasingly powerful servers to handle transaction and database processing and to support massive client/server networks that have replaced the huge mainframe computer centers of yesteryear.
What is fascinating about all this from the perspective of computer organization and architecture is that, on the one hand, the basic building blocks for today’s computer miracles are virtually the same as those of the IAS computer from over 50 years ago, while on the other hand, the techniques for squeezing the last iota of performance out of the materials at hand have become increasingly sophisticated.
This observation serves as a guiding principle for the presentation in this book. As we progress through the various elements and components of a computer, two objectives are pursued. First, the book explains the fundamental functionality in each area under consideration, and second, the book explores those techniques required to achieve maximum performance. In the remainder of this section, we highlight some of the driving factors behind the need to design for performance.

Microprocessor Speed

What gives Intel x86 processors or IBM mainframe computers such mind-boggling power is the relentless pursuit of speed by processor chip manufacturers. The evolution of these machines continues to bear out Moore’s law, mentioned previously. So long as this law holds, chipmakers can unleash a new generation of chips every three years—with four times as many transistors. In memory chips, this has quadrupled the capacity of dynamic random-access memory (DRAM), still the basic technology for computer main memory, every three years. In microprocessors, the addition of new circuits, and the speed boost that comes from reducing the distances between them, has improved performance four- or fivefold every three years or so since Intel launched its x86 family in 1978.
But the raw speed of the microprocessor will not achieve its potential unless it is fed a constant stream of work to do in the form of computer instructions. Anything that gets in the way of that smooth flow undermines the power of the processor. Accordingly, while the chipmakers have been busy learning how to fabricate chips of greater and greater density, the processor designers must come up with ever more elaborate techniques for feeding the monster. Among the techniques built into contemporary processors are the following:
• Pipelining: With pipelining, a processor can simultaneously work on multiple instructions. The processor overlaps operations by moving data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously. For example, while one instruction is being executed, the computer is decoding the next instruction.
• Branch prediction: The processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next. If the processor guesses right most of the time, it can prefetch the correct instructions and buffer them so that the processor is kept busy. The more sophisticated examples of this strategy predict not just the

next branch but multiple branches ahead. Thus, branch prediction increases the amount of work available for the processor to execute.
• Data flow analysis: The processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions. In fact, instructions are scheduled to be executed when ready, independent of the original program order. This prevents unnecessary delay.
• Speculative execution: Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations. This enables the processor to keep its execution engines as busy as possible by executing instructions that are likely to be needed.
These and other sophisticated techniques are made necessary by the sheer power of the processor. They make it possible to exploit the raw speed of the processor.

Performance Balance

While processor power has raced ahead at breakneck speed, other critical components of the computer have not kept up. The result is a need to look for performance balance: an adjusting of the organization and architecture to compensate for the mismatch among the capabilities of the various components.
Nowhere is the problem created by such mismatches more critical than in the interface between processor and main memory. While processor speed has grown rapidly, the speed with which data can be transferred between main memory and the processor has lagged badly. The interface between processor and main memory is the most crucial pathway in the entire computer because it is responsible for carrying a constant flow of program instructions and data between memory chips and the processor. If memory or the pathway fails to keep pace with the processor’s insistent demands, the processor stalls in a wait state, and valuable processing time is lost.
A system can attack this problem in a number of ways, all of which are reflected in contemporary computer designs. Consider the following examples:
• Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths.
• Change the DRAM interface to make it more efficient by including a cache10 or other buffering scheme on the DRAM chip.
• Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory. This includes the incorporation of one or more caches on the processor chip as well as on an off-chip cache close to the processor chip.
• Increase the interconnect bandwidth between processors and memory by using higher-speed buses and a hierarchy of buses to buffer and structure data flow.

10A cache is a relatively small fast memory interposed between a larger, slower memory and the logic that accesses the larger memory. The cache holds recently accessed data and is designed to speed up subsequent access to the same data. Caches are discussed in Chapter 4.
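The footnote's point about why a cache speeds up memory access can be seen with a common first-order model: if a fraction h of references hit in the fast cache, the average access time is roughly h·Tc + (1 − h)·Tm. The figures in the sketch below are invented purely for illustration; caches and their access-time models are developed properly in Chapter 4.

```python
# Illustrative sketch: average memory access time with a cache (first-order model).
# The hit ratio and the access times are made-up example values, not measured data.
def effective_access_time(hit_ratio, t_cache_ns, t_memory_ns):
    return hit_ratio * t_cache_ns + (1 - hit_ratio) * t_memory_ns

print(effective_access_time(0.95, 2, 60))   # 4.9 ns, versus 60 ns with no cache
```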

Another area of design focus is the handling of I/O devices. As computers become faster and more capable, more sophisticated applications are developed that support the use of peripherals with intensive I/O demands. Figure 2.10 gives some examples of typical peripheral devices in use on personal computers and workstations. These devices create tremendous data demands. While the current generation of processors can handle the data pumped out by these devices, there remains the problem of getting that data moved between processor and peripheral. Strategies here include caching and buffering schemes plus the use of higher-speed interconnection buses and more elaborate structures of buses. In addition, the use of multiple-processor configurations can aid in satisfying I/O demands.
The key in all this is balance. Designers constantly strive to balance the throughput and processing demands of the processor components, main memory, I/O devices, and the interconnection structures. This design must constantly be rethought to cope with two constantly evolving factors:
• The rate at which performance is changing in the various technology areas (processor, buses, memory, peripherals) differs greatly from one type of element to another.
• New applications and new peripheral devices constantly change the nature of the demand on the system in terms of typical instruction profile and the data access patterns.

[Figure 2.10 Typical I/O Device Data Rates, showing data rates on a logarithmic scale from about 10^1 to 10^9 bps for the keyboard, mouse, modem, floppy disk, laser printer, scanner, optical disk, Ethernet, hard disk, graphics display, and Gigabit Ethernet.]

Thus, computer design is a constantly evolving art form. This book attempts to present the fundamentals on which this art form is based and to present a survey of the current state of that art.

Improvements in Chip Organization and Architecture

As designers wrestle with the challenge of balancing processor performance with that of main memory and other computer components, the need to increase processor speed remains. There are three approaches to achieving increased processor speed:
• Increase the hardware speed of the processor. This increase is fundamentally due to shrinking the size of the logic gates on the processor chip, so that more gates can be packed together more tightly, and to increasing the clock rate. With gates closer together, the propagation time for signals is significantly reduced, enabling a speeding up of the processor. An increase in clock rate means that individual operations are executed more rapidly.
• Increase the size and speed of caches that are interposed between the processor and main memory. In particular, by dedicating a portion of the processor chip itself to the cache, cache access times drop significantly.
• Make changes to the processor organization and architecture that increase the effective speed of instruction execution. Typically, this involves using parallelism in one form or another.
Traditionally, the dominant factor in performance gains has been increases in clock speed and logic density. However, as clock speed and logic density increase, a number of obstacles become more significant [INTE04b]:
• Power: As the density of logic and the clock speed on a chip increase, so does the power density (Watts/cm²). The difficulty of dissipating the heat generated on high-density, high-speed chips is becoming a serious design issue [GIBB04, BORK03].
• RC delay: The speed at which electrons can flow on a chip between transistors is limited by the resistance and capacitance of the metal wires connecting them; specifically, delay increases as the RC product increases. As components on the chip decrease in size, the wire interconnects become thinner, increasing resistance. Also, the wires are closer together, increasing capacitance.
• Memory latency: Memory speeds lag processor speeds, as previously discussed.
Thus, there will be more emphasis on organization and architectural approaches to improving performance. These techniques are discussed in later chapters of the book.
Beginning in the late 1980s, and continuing for about 15 years, two main strategies have been used to increase performance beyond what can be achieved simply by increasing clock speed. First, there has been an increase in cache capacity. There are now typically two or three levels of cache between the processor and main memory. As chip density has increased, more of the cache memory has been incorporated on the chip, enabling faster cache access. For example, the original Pentium chip devoted about 10% of on-chip area to a cache. Contemporary chips devote over half of the chip area to caches.

Second, the instruction execution logic within a processor has become increasingly complex to enable parallel execution of instructions within the processor. Two noteworthy design approaches have been pipelining and superscalar. A pipeline works much as an assembly line in a manufacturing plant, enabling different stages of execution of different instructions to occur at the same time along the pipeline. A superscalar approach in essence allows multiple pipelines within a single processor so that instructions that do not depend on one another can be executed in parallel. By the mid to late 90s, both of these approaches were reaching a point of diminishing returns. The internal organization of contemporary processors is exceedingly complex and is able to squeeze a great deal of parallelism out of the instruction stream. It seems likely that further significant increases in this direction will be relatively modest [GIBB04]. With three levels of cache on the processor chip, each level providing substantial capacity, it also seems that the benefits from the cache are reaching a limit. However, simply relying on increasing clock rate for increased performance runs into the power dissipation problem already referred to. The faster the clock rate, the greater the amount of power to be dissipated, and some fundamental physical limits are being reached. Figure 2.11 illustrates the concepts we have been discussing.11 The top line shows that, as per Moore’s Law, the number of transistors on a single chip continues to grow exponentially.12 Meanwhile, the clock speed has leveled off, in order

Figure 2.11 Processor Trends (log-scale plot, 1970–2010, of transistor count in thousands, clock frequency in MHz, power in watts, and number of cores per chip)

11I am grateful to Professor Kathy Yelick of UC Berkeley, who provided this graph. 12The observant reader will note that the transistor count values in this figure are significantly less than those of Figure 2.8. That latter figure shows the transistor count for a form of main memory known as DRAM (discussed in Chapter 5), which supports higher transistor density than processor chips.

to prevent a further rise in power. To continue to increase performance, designers have had to find ways of exploiting the growing number of transistors other than simply building a more complex processor. The response in recent years has been the development of the multicore computer chip.

2.3 MULTICORE, MICS, AND GPGPUS

With all of the difficulties cited in the preceding paragraphs in mind, designers have turned to a fundamentally new approach to improving performance: placing multiple processors on the same chip, with a large shared cache. The use of multiple processors on the same chip, also referred to as multiple cores, or multicore, provides the potential to increase performance without increasing the clock rate. Studies indicate that, within a processor, the increase in performance is roughly proportional to the square root of the increase in complexity [BORK03]. But if the software can support the effective use of multiple processors, then doubling the number of processors almost doubles performance. Thus, the strategy is to use two simpler processors on the chip rather than one more complex processor. In addition, with two processors, larger caches are justified. This is important because the power consumption of memory logic on a chip is much less than that of processing logic. As the logic density on chips continues to rise, the trend to both more cores and more cache on a single chip continues. Two-core chips were quickly followed by four-core chips, then 8, then 16, and so on. As the caches became larger, it made performance sense to create two and then three levels of cache on a chip, with the first-level cache dedicated to an individual processor and levels two and three being shared by all the processors. Chip manufacturers are now in the process of making a huge leap forward in the number of cores per chip, with more than 50 cores per chip. The leap in performance as well as the challenges in developing software to exploit such a large number of cores have led to the introduction of a new term: many integrated core (MIC). The multicore and MIC strategy involves a homogeneous collection of general-purpose processors on a single chip. At the same time, chip manufacturers are pursuing another design option: a chip with multiple general-purpose processors plus graphics processing units (GPUs) and specialized cores for video processing and other tasks. In broad terms, a GPU is a core designed to perform parallel operations on graphics data. Traditionally found on a plug-in graphics card (display adapter), it is used to encode and render 2D and 3D graphics as well as process video. Since GPUs perform parallel operations on multiple sets of data, they are increasingly being used as vector processors for a variety of applications that require repetitive computations. This blurs the line between the GPU and the CPU [FATA08, PROP11]. When a broad range of applications are supported by such a processor, the term general-purpose computing on GPUs (GPGPU) is used. We explore design characteristics of multicore computers in Chapter 18.
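The trade-off described at the start of this section—single-core performance growing only with roughly the square root of the increase in complexity, versus near-linear gains from adding simpler cores when software can exploit them—can be illustrated with a short calculation. The following Python sketch is only a rough illustration of that rule of thumb; the efficiency parameter and the printed figures are illustrative assumptions, not measurements of any real chip.

import math

def single_core_performance(complexity):
    # Rule of thumb cited in the text: performance grows roughly with the
    # square root of the increase in processor complexity.
    return math.sqrt(complexity)

def multicore_performance(n_cores, parallel_efficiency=1.0):
    # Ideal case described in the text: doubling the number of simpler cores
    # almost doubles performance, provided the software can exploit them.
    # parallel_efficiency < 1.0 models imperfect scaling (assumed value).
    return n_cores * parallel_efficiency

# One core made twice as complex vs. two simpler cores in the same budget
print(single_core_performance(2.0))    # about 1.41x
print(multicore_performance(2, 0.95))  # about 1.9x with 95% efficiency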

2.4 THE EVOLUTION OF THE INTEL X86 ARCHITECTURE

Throughout this book, we rely on many concrete examples of computer design and implementation to illustrate concepts and to illuminate trade-offs. Numerous systems, both contemporary and historical, provide examples of important computer architecture design features. But the book relies on examples from two processor families: the Intel x86 and the ARM architecture. The current x86 offerings represent the results of decades of design effort on complex instruction set computers (CISCs). The x86 incorporates the sophisticated design principles once found only on mainframes and supercomputers and serves as an excellent example of CISC design. An alternative approach to processor design is the reduced instruction set computer (RISC). The ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best-designed RISC-based systems on the market. In this section and the next, we provide a brief overview of these two systems. In terms of market share, Intel has ranked as the number one maker of microprocessors for nonembedded systems for decades, a position it seems unlikely to yield. The evolution of its flagship microprocessor product serves as a good indicator of the evolution of computer technology in general. Table 2.6 shows that evolution. Interestingly, as microprocessors have grown faster and much more complex, Intel has actually picked up the pace. Intel used to develop microprocessors one after another, every four years. But Intel hopes to keep rivals at bay by trimming a year or two off this development time, and has done so with the most recent x86 generations.13 It is worthwhile to list some of the highlights of the evolution of the Intel product line:
• 8080: The world’s first general-purpose microprocessor. This was an 8-bit machine, with an 8-bit data path to memory. The 8080 was used in the first personal computer, the Altair.
• 8086: A far more powerful, 16-bit machine. In addition to a wider data path and larger registers, the 8086 sported an instruction cache, or queue, that prefetches a few instructions before they are executed. A variant of this processor, the 8088, was used in IBM’s first personal computer, securing the success of Intel. The 8086 is the first appearance of the x86 architecture.
• 80286: This extension of the 8086 enabled addressing a 16-MByte memory instead of just 1 MByte.
• 80386: Intel’s first 32-bit machine, and a major overhaul of the product. With a 32-bit architecture, the 80386 rivaled the complexity and power of minicomputers and mainframes introduced just a few years earlier. This was the first Intel processor to support multitasking, meaning it could run multiple programs at the same time.
• 80486: The 80486 introduced the use of much more sophisticated and powerful cache technology and sophisticated instruction pipelining. The 80486 also

13Intel refers to this as the tick-tock model. Using this model, Intel has successfully delivered next-generation silicon technology as well as new processor microarchitecture on alternating years for the past several years. See http://www.intel.com/content/www/us/en/silicon-innovations/intel-tick-tock-model-general.html

offered a built-in math coprocessor, offloading complex math operations from the main CPU.
• Pentium: With the Pentium, Intel introduced the use of superscalar techniques, which allow multiple instructions to execute in parallel.
• Pentium Pro: The Pentium Pro continued the move into superscalar organization begun with the Pentium, with aggressive use of register renaming, branch prediction, data flow analysis, and speculative execution.
• Pentium II: The Pentium II incorporated Intel MMX technology, which is designed specifically to process video, audio, and graphics data efficiently.
• Pentium III: The Pentium III incorporates additional floating-point instructions to support 3D graphics software.
• Pentium 4: The Pentium 4 includes additional floating-point and other enhancements for multimedia.14
• Core: This is the first Intel x86 microprocessor with a dual core, referring to the implementation of two processors on a single chip.
• Core 2: The Core 2 extends the architecture to 64 bits. The Core 2 Quad provides four processors on a single chip. More recent Core offerings have up to 10 processors per chip.
Over 30 years after its introduction in 1978, the x86 architecture continues to dominate the processor market outside of embedded systems. Although the organization and technology of the x86 machines have changed dramatically over the decades, the instruction set architecture has evolved to remain backward compatible with earlier versions. Thus, any program written on an older version of the x86 architecture can execute on newer versions. All changes to the instruction set architecture have involved additions of roughly one instruction per month to the architecture over the 30 years [ANTH08], with no subtractions, so that there are now over 500 instructions in the instruction set. The x86 provides an excellent illustration of the advances in computer hardware over the past 30 years. The 1978 8086 was introduced with a clock speed of 5 MHz and had 29,000 transistors. A quad-core Intel Core 2 introduced in 2008 operates at 3 GHz, a speedup of a factor of 600, and has 820 million transistors, about 28,000 times as many as the 8086. Yet the Core 2 is in only a slightly larger package than the 8086 and has a comparable cost.

2.5 EMBEDDED SYSTEMS AND THE ARM

The ARM architecture refers to a processor architecture that has evolved from RISC design principles and is used in embedded systems. Chapter 15 examines RISC design principles in detail. In this section, we give a brief overview of the concept of embedded systems and then look at the evolution of the ARM.

14With the Pentium 4, Intel switched from Roman numerals to Arabic numerals for model numbers.

Embedded Systems The term embedded system refers to the use of electronics and software within a product, as opposed to a general-purpose computer, such as a laptop or desktop system. The following is a good general definition:15

Embedded system. A combination of computer hardware and software, and perhaps additional mechanical or other parts, designed to perform a dedicated function. In many cases, embedded systems are part of a larger system or product, as in the case of an antilock braking system in a car.

Embedded systems far outnumber general-purpose computer systems, encompassing a broad range of applications (Table 2.7). These systems have widely varying requirements and constraints, such as the following [GRIM05]:
• Small to large systems, implying very different cost constraints, thus different needs for optimization and reuse

Table 2.7 Examples of Embedded Systems and Their Markets

Market                    Embedded Device
Automotive                Ignition system
                          Engine control
                          Brake system
Consumer electronics      Digital and analog televisions
                          Set-top boxes (DVDs, VCRs, cable boxes)
                          Personal digital assistants (PDAs)
                          Kitchen appliances (refrigerators, toasters, microwave ovens)
                          Automobiles
                          Toys/games
                          Telephones/cell phones/pagers
                          Cameras
                          Global positioning systems
Industrial control        Robotics and controls systems for manufacturing
                          Sensors
Medical                   Infusion pumps
                          Dialysis machines
                          Prosthetic devices
                          Cardiac monitors
Office automation         Fax machine
                          Photocopier
                          Printers
                          Monitors
                          Scanners

15Michael Barr, Embedded Systems Glossary. Netrino Technical Library. http://www.netrino.com/Embedded-Systems/Glossary

• Relaxed to very strict requirements and combinations of different quality requirements, for example, with respect to safety, reliability, real-time, and flexibility
• Short to long life times
• Different environmental conditions in terms of, for example, radiation, vibrations, and humidity
• Different application characteristics resulting in static versus dynamic loads, slow to fast speed, compute versus interface intensive tasks, and/or combinations thereof
• Different models of computation ranging from discrete-event systems to those involving continuous time dynamics (usually referred to as hybrid systems)
Often, embedded systems are tightly coupled to their environment. This can give rise to real-time constraints imposed by the need to interact with the environment. Constraints, such as required speeds of motion, required precision of measurement, and required time durations, dictate the timing of software operations. If multiple activities must be managed simultaneously, this imposes more complex real-time constraints. Figure 2.12, based on [KOOP96], shows in general terms an embedded system organization. In addition to the processor and memory, there are a number of elements that differ from the typical desktop or laptop computer:
• There may be a variety of interfaces that enable the system to measure, manipulate, and otherwise interact with the external environment.
• The human interface may be as simple as a flashing light or as complicated as real-time robotic vision.

Figure 2.12 Possible Organization of an Embedded System (a processor and memory surrounded by software, FPGA/ASIC logic, a human interface, a diagnostic port, A/D and D/A conversion, sensors and actuators coupled to the external environment, electromechanical backup and safety, and auxiliary systems such as power and cooling)

• The diagnostic port may be used for diagnosing the system that is being controlled—not just for diagnosing the computer.
• Special-purpose field programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), or even nondigital hardware may be used to increase performance or safety.
• Software often has a fixed function and is specific to the application.
ARM Evolution
ARM is a family of RISC-based microprocessors and microcontrollers designed by ARM Inc., Cambridge, England. The company doesn’t make processors but instead designs microprocessor and multicore architectures and licenses them to manufacturers. ARM chips are high-speed processors that are known for their small die size and low power requirements. They are widely used in PDAs and other handheld devices, including games and phones as well as a large variety of consumer products. ARM chips are the processors in Apple’s popular iPod and iPhone devices. ARM is probably the most widely used embedded processor architecture and indeed the most widely used processor architecture of any kind in the world. The origins of ARM technology can be traced back to the British-based Acorn Computers company. In the early 1980s, Acorn was awarded a contract by the British Broadcasting Corporation (BBC) to develop a new microcomputer architecture for the BBC Computer Literacy Project. The success of this contract enabled Acorn to go on to develop the first commercial RISC processor, the Acorn RISC Machine (ARM). The first version, ARM1, became operational in 1985 and was used for internal research and development as well as being used as a coprocessor in the BBC machine. Also in 1985, Acorn released the ARM2, which had greater functionality and speed within the same physical space. Further improvements were achieved with the release in 1989 of the ARM3. Throughout this period, Acorn used the company VLSI Technology to do the actual fabrication of the processor chips. VLSI was licensed to market the chip on its own and had some success in getting other companies to use the ARM in their products, particularly as an embedded processor. The ARM design matched a growing commercial need for a high-performance, low-power-consumption, small-size, and low-cost processor for embedded applications. But further development was beyond the scope of Acorn’s capabilities. Accordingly, a new company was organized, with Acorn, VLSI, and Apple Computer as founding partners, known as ARM Ltd. The Acorn RISC Machine became the Advanced RISC Machine.16 The new company’s first offering, an improvement on the ARM3, was designated ARM6. Subsequently, the company has introduced a number of new families, with increasing functionality and performance. Table 2.8 shows some characteristics of the various ARM architecture families. The numbers in this table are only approximate guides; actual values vary widely for different implementations.

16The company dropped the designation Advanced RISC Machine in the late 1990s. It is now simply known as the ARM architecture.

Table 2.8 ARM Evolution

Family    Notable Features                                                                     Cache                       Typical MIPS @ MHz
ARM1      32-bit RISC                                                                          None                        —
ARM2      Multiply and swap instructions; integrated memory management unit,
          graphics and I/O processor                                                           None                        7 MIPS @ 12 MHz
ARM3      First use of processor cache                                                         4 kB unified                12 MIPS @ 25 MHz
ARM6      First to support 32-bit addresses; floating-point unit                               4 kB unified                28 MIPS @ 33 MHz
ARM7      Integrated SoC                                                                       8 kB unified                60 MIPS @ 60 MHz
ARM8      5-stage pipeline; static branch prediction                                           8 kB unified                84 MIPS @ 72 MHz
ARM9      —                                                                                    16 kB/16 kB                 300 MIPS @ 300 MHz
ARM9E     Enhanced DSP instructions                                                            16 kB/16 kB                 220 MIPS @ 200 MHz
ARM10E    6-stage pipeline                                                                     32 kB/32 kB                 —
ARM11     9-stage pipeline                                                                     Variable                    740 MIPS @ 665 MHz
Cortex    13-stage superscalar pipeline                                                        Variable                    2000 MIPS @ 1 GHz
XScale    Applications processor; 7-stage pipeline                                             32 kB/32 kB L1, 512 kB L2   1000 MIPS @ 1.25 GHz

DSP = digital signal processor; SoC = system on chip

According to the ARM Web site arm.com, ARM processors are designed to meet the needs of three system categories:
• Embedded real-time systems: Systems for storage, automotive body and power-train, industrial, and networking applications
• Application platforms: Devices running open operating systems including Linux, Palm OS, Symbian OS, and Windows CE in wireless, consumer entertainment and digital imaging applications
• Secure applications: Smart cards, SIM cards, and payment terminals

2.6 PERFORMANCE ASSESSMENT

In evaluating processor hardware and setting requirements for new systems, performance is one of the key parameters to consider, along with cost, size, security, reliability, and, in some cases, power consumption. It is difficult to make meaningful performance comparisons among different processors, even among processors in the same family. Raw speed is far less important than how a processor performs when executing a given application. Unfortunately, application performance depends not just on the raw speed of the processor but also on the instruction set, choice of implementation language, efficiency of the compiler, and skill of the programming done to implement the application.

We begin this section with a look at some traditional measures of processor speed. Then we examine the most common approach to assessing processor and computer system performance. We follow this with a discussion of how to average results from multiple tests. Finally, we look at the insights produced by considering Amdahl’s law.
Clock Speed and Instructions per Second
THE SYSTEM CLOCK Operations performed by a processor, such as fetching an instruction, decoding the instruction, performing an arithmetic operation, and so on, are governed by a system clock. Typically, all operations begin with the pulse of the clock. Thus, at the most fundamental level, the speed of a processor is dictated by the pulse frequency produced by the clock, measured in cycles per second, or Hertz (Hz). Typically, clock signals are generated by a quartz crystal, which generates a constant signal wave while power is applied. This wave is converted into a digital voltage pulse stream that is provided in a constant flow to the processor circuitry (Figure 2.13). For example, a 1-GHz processor receives 1 billion pulses per second. The rate of pulses is known as the clock rate, or clock speed. One increment, or pulse, of the clock is referred to as a clock cycle, or a clock tick. The time between pulses is the cycle time. The clock rate is not arbitrary, but must be appropriate for the physical layout of the processor. Actions in the processor require signals to be sent from one processor element to another. When a signal is placed on a line inside the processor, it takes some finite amount of time for the voltage levels to settle down so that an accurate value (1 or 0) is available. Furthermore, depending on the physical layout of the processor circuits, some signals may change more rapidly than others. Thus, operations must be synchronized and paced so that the proper electrical signal (voltage) values are available for each operation.

From Computer Desktop Encyclopedia, 1998, The Computer Language Co.

Figure 2.13 System Clock

The execution of an instruction involves a number of discrete steps, such as fetching the instruction from memory, decoding the various portions of the instruction, loading and storing data, and performing arithmetic and logical operations. Thus, most instructions on most processors require multiple clock cycles to complete. Some instructions may take only a few cycles, while others require dozens. In addition, when pipelining is used, multiple instructions are being executed simultaneously. Thus, a straight comparison of clock speeds on different processors does not tell the whole story about performance.
INSTRUCTION EXECUTION RATE A processor is driven by a clock with a constant frequency f or, equivalently, a constant cycle time τ, where τ = 1/f. Define the instruction count, Ic, for a program as the number of machine instructions executed for that program until it runs to completion or for some defined time interval. Note that this is the number of instruction executions, not the number of instructions in the object code of the program. An important parameter is the average cycles per instruction (CPI) for a program. If all instructions required the same number of clock cycles, then CPI would be a constant value for a processor. However, on any given processor, the number of clock cycles required varies for different types of instructions, such as load, store, branch, and so on. Let CPIi be the number of cycles required for instruction type i and Ii be the number of executed instructions of type i for a given program. Then we can calculate an overall CPI as follows:

CPI = \frac{\sum_{i=1}^{n} (CPI_i \times I_i)}{I_c}     (2.1)

The processor time T needed to execute a given program can be expressed as

T = I_c \times CPI \times \tau

We can refine this formulation by recognizing that during the execution of an instruction, part of the work is done by the processor, and part of the time a word is being transferred to or from memory. In this latter case, the time to transfer depends on the memory cycle time, which may be greater than the processor cycle time. We can rewrite the preceding equation as

T = I_c \times [p + (m \times k)] \times \tau

where p is the number of processor cycles needed to decode and execute the instruction, m is the number of memory references needed, and k is the ratio between memory cycle time and processor cycle time. The five performance factors in the preceding equation (Ic, p, m, k, τ) are influenced by four system attributes: the design of the instruction set (known as instruction set architecture), compiler technology (how effective the compiler is in producing an efficient machine language program from a high-level language program), processor implementation, and cache and memory hierarchy. Table 2.9 is a matrix in which one dimension shows the five performance factors and the other dimension shows the four system attributes. An X in a cell indicates a system attribute that affects a performance factor. A common measure of performance for a processor is the rate at which instructions are executed, expressed as millions of instructions per second (MIPS),

Table 2.9 Performance Factors and System Attributes

                                 Ic    p    m    k    τ
Instruction set architecture      X    X
Compiler technology               X    X    X
Processor implementation               X              X
Cache and memory hierarchy                       X    X

referred to as the MIPS rate. We can express the MIPS rate in terms of the clock rate and CPI as follows:

MIPS rate = \frac{I_c}{T \times 10^6} = \frac{f}{CPI \times 10^6}     (2.2)

For example, consider the execution of a program that results in the execution of 2 million instructions on a 400-MHz processor. The program consists of four major types of instructions. The instruction mix and the CPI for each instruction type are given below based on the result of a program trace experiment:

Instruction Type                    CPI    Instruction Mix (%)
Arithmetic and logic                 1     60
Load/store with cache hit            2     18
Branch                               4     12
Memory reference with cache miss     8     10

The average CPI when the program is executed on a uniprocessor with the above trace results is CPI = (1 × 0.6) + (2 × 0.18) + (4 × 0.12) + (8 × 0.1) = 2.24. The corresponding MIPS rate is (400 × 10^6)/(2.24 × 10^6) ≈ 178. Another common performance measure deals only with floating-point instructions. These are common in many scientific and game applications. Floating-point performance is expressed as millions of floating-point operations per second (MFLOPS), defined as follows:

MFLOPS rate = \frac{\text{Number of executed floating-point operations in a program}}{\text{Execution time} \times 10^6}
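The calculation in the preceding example is easy to mechanize. The following Python sketch simply evaluates Equations (2.1) and (2.2) for the instruction mix and cycle counts of the example above; it is an illustration, not part of any standard tool.

# Instruction mix from the example: (fraction of executed instructions, CPI)
mix = [
    (0.60, 1),   # arithmetic and logic
    (0.18, 2),   # load/store with cache hit
    (0.12, 4),   # branch
    (0.10, 8),   # memory reference with cache miss
]
clock_rate_hz = 400e6   # 400-MHz processor

# Equation (2.1): overall CPI is the execution-weighted average of the per-type CPIs
cpi = sum(fraction * cycles for fraction, cycles in mix)

# Equation (2.2): MIPS rate = f / (CPI * 10^6)
mips = clock_rate_hz / (cpi * 1e6)

print(cpi)    # 2.24
print(mips)   # approximately 178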

Benchmarks
Measures such as MIPS and MFLOPS have proven inadequate for evaluating the performance of processors. Because of differences in instruction sets, the instruction execution rate is not a valid means of comparing the performance of different architectures. For example, consider this high-level language statement:
A = B + C /* assume all quantities in main memory */

With a traditional instruction set architecture, referred to as a complex instruction set computer (CISC), this instruction can be compiled into one processor instruction:
add mem(B), mem(C), mem(A)
On a typical RISC machine, the compilation would look something like this:
load mem(B), reg(1);
load mem(C), reg(2);
add reg(1), reg(2), reg(3);
store reg(3), mem(A)
Because of the nature of the RISC architecture (discussed in Chapter 15), both machines may execute the original high-level language instruction in about the same time. If this example is representative of the two machines, then if the CISC machine is rated at 1 MIPS, the RISC machine would be rated at 4 MIPS. But both do the same amount of high-level language work in the same amount of time. Further, the performance of a given processor on a given program may not be useful in determining how that processor will perform on a very different type of application. Accordingly, beginning in the late 1980s and early 1990s, industry and academic interest shifted to measuring the performance of systems using a set of programs. The same set of programs can be run on different machines and the execution times compared. [WEIC90] lists the following as desirable characteristics of a benchmark program:
1. It is written in a high-level language, making it portable across different machines.
2. It is representative of a particular kind of programming style, such as systems programming, numerical programming, or commercial programming.
3. It can be measured easily.
4. It has wide distribution.

SPEC BENCHMARKS The common need in industry and academic and research communities for generally accepted computer performance measurements has led to the development of standardized benchmark suites. A benchmark suite is a collection of programs, defined in a high-level language, that together attempt to provide a representative test of a computer in a particular application or system programming area. The best known such collection of benchmark suites is defined and maintained by the System Performance Evaluation Corporation (SPEC), an industry consortium. SPEC performance measurements are widely used for comparison and research purposes. The best known of the SPEC benchmark suites is SPEC CPU2006. This is the industry standard suite for processor-intensive applications. That is, SPEC CPU2006 is appropriate for measuring performance for applications that spend most of their time doing computation rather than I/O. The CPU2006 suite is based on existing applications that have already been ported to a wide variety of platforms by SPEC industry

members. It consists of 17 floating-point programs written in C, C++, and Fortran; and 12 integer programs written in C and C++. The suite contains over 3 million lines of code. This is the fifth generation of processor-intensive suites from SPEC, replacing SPEC CPU2000, SPEC CPU95, SPEC CPU92, and SPEC CPU89 [HENN07]. Other SPEC suites include the following:
• SPECjvm98: Intended to evaluate performance of the combined hardware and software aspects of the Java Virtual Machine (JVM) client platform
• SPECjbb2000 (Java Business Benchmark): A benchmark for evaluating server-side Java-based electronic commerce applications
• SPECweb99: Evaluates the performance of World Wide Web (WWW) servers
• SPECmail2001: Designed to measure a system’s performance acting as a mail server
AVERAGING RESULTS To obtain a reliable comparison of the performance of various computers, it is preferable to run a number of different benchmark programs on each machine and then average the results. For example, if there are m different benchmark programs, then a simple arithmetic mean can be calculated as follows:

R_A = \frac{1}{m} \sum_{i=1}^{m} R_i     (2.3)

where Ri is the high-level language instruction execution rate for the ith benchmark program. An alternative is to take the harmonic mean:

R_H = \frac{m}{\sum_{i=1}^{m} \frac{1}{R_i}}     (2.4)
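As a small illustration of the difference between the two means, the following Python sketch computes both for a set of benchmark execution rates; the rates used here are made up purely for the example.

# Hypothetical execution rates (e.g., MIPS) for m benchmark programs
rates = [100.0, 200.0, 400.0]
m = len(rates)

# Equation (2.3): arithmetic mean of the rates
r_a = sum(rates) / m

# Equation (2.4): harmonic mean of the rates
r_h = m / sum(1.0 / r for r in rates)

print(r_a)   # about 233.3 -- dominated by the fastest benchmark
print(r_h)   # about 171.4 -- the inverse of the average execution time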

Ultimately, the user is concerned with the execution time of a system, not its execution rate. If we take the arithmetic mean of the instruction rates of various benchmark programs, we get a result that is proportional to the sum of the inverses of execution times. But this is not inversely proportional to the sum of execution times. In other words, the arithmetic mean of the instruction rate does not cleanly relate to execution time. On the other hand, the harmonic mean instruction rate is the inverse of the average execution time. SPEC benchmarks do not concern themselves with instruction execution rates. Rather, two fundamental metrics are of interest: a speed metric and a rate metric. The speed metric measures the ability of a computer to complete a single task. SPEC defines a base runtime for each benchmark program using a reference machine. Results for a system under test are reported as the ratio of the reference run time to the system run time. The ratio is calculated as follows:

r_i = \frac{Tref_i}{Tsut_i}     (2.5)

where Trefi is the execution time of benchmark program i on the reference system and Tsuti is the execution time of benchmark program i on the system under test.

As an example of the calculation and reporting, consider the Sun Blade 6250, which consists of two chips with four cores, or processors, per chip. One of the SPEC CPU2006 integer benchmarks is 464.h264ref. This is a reference implementation of H.264/AVC (Advanced Video Coding), the latest state-of-the-art video compression standard. The Sun system executes this program in 934 seconds. The reference implementation requires 22,135 seconds. The ratio is calculated as: 22,135/934 = 23.7. Because the time for the system under test is in the denominator, the larger the ratio, the higher the speed. An overall performance measure for the system under test is calculated by averaging the values for the ratios for all 12 integer benchmarks. SPEC specifies the use of a geometric mean, defined as follows:

r_G = \left( \prod_{i=1}^{n} r_i \right)^{1/n}     (2.6)

where ri is the ratio for the ith benchmark program. For the Sun Blade 6250, the SPEC integer speed ratios were reported as shown below. The speed metric is calculated by taking the twelfth root of the product of the ratios:

Benchmark        Ratio     Benchmark         Ratio
400.perlbench    17.5      458.sjeng         17.0
401.bzip2        14.0      462.libquantum    31.3
403.gcc          13.7      464.h264ref       23.7
429.mcf          17.6      471.omnetpp        9.23
445.gobmk        14.7      473.astar         10.9
456.hmmer        18.6      483.xalancbmk     14.7

(17.5 × 14 × 13.7 × 17.6 × 14.7 × 18.6 × 17 × 31.3 × 23.7 × 9.23 × 10.9 × 14.7)^{1/12} = 18.5
The rate metric measures the throughput or rate of a machine carrying out a number of tasks. For the rate metrics, multiple copies of the benchmarks are run simultaneously. Typically, the number of copies is the same as the number of processors on the machine. Again, a ratio is used to report results, although the calculation is more complex. The ratio is calculated as follows:

r_i = \frac{N \times Tref_i}{Tsut_i}     (2.7)

where Trefi is the reference execution time for benchmark i, N is the number of copies of the program that are run simultaneously, and Tsuti is the elapsed time from the start of the execution of the program on all N processors of the system under test until the completion of all the copies of the program. Again, a geometric mean is calculated to determine the overall performance measure. SPEC chose to use a geometric mean because it is the most appropriate for normalized numbers, such as ratios. [FLEM86] demonstrates that the geometric mean has the property of performance relationships consistently maintained regardless of the computer that is used as the basis for normalization.
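The speed-metric calculation above is easy to reproduce. The following Python sketch evaluates Equation (2.5) for the 464.h264ref example and Equation (2.6) for the twelve ratios reported in the table; it is only an illustration of the formulas, not a SPEC tool.

import math

# Equation (2.5): per-benchmark ratio for the 464.h264ref example
t_ref = 22135.0   # reference execution time, seconds
t_sut = 934.0     # execution time on the system under test, seconds
print(round(t_ref / t_sut, 1))   # 23.7

# Equation (2.6): the speed metric is the geometric mean of the twelve ratios
ratios = [17.5, 14.0, 13.7, 17.6, 14.7, 18.6,
          17.0, 31.3, 23.7, 9.23, 10.9, 14.7]
speed_metric = math.prod(ratios) ** (1.0 / len(ratios))
print(speed_metric)              # geometric mean of the ratios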

Amdahl’s Law
When considering system performance, computer system designers look for ways to improve performance by improvement in technology or change in design. Examples include the use of parallel processors, the use of a memory cache, and speedup in memory access time and I/O transfer rate due to technology improvements. In all of these cases, it is important to note that a speedup in one aspect of the technology or design does not result in a corresponding improvement in performance. This limitation is succinctly expressed by Amdahl’s law. Amdahl’s law was first proposed by Gene Amdahl in [AMDA67] and deals with the potential speedup of a program using multiple processors compared to a single processor. Consider a program running on a single processor such that a fraction (1 – f) of the execution time involves code that is inherently serial and a fraction f that involves code that is infinitely parallelizable with no scheduling overhead. Let T be the total execution time of the program using a single processor. Then the speedup using a parallel processor with N processors that fully exploits the parallel portion of the program is as follows:

Speedup = \frac{\text{Time to execute program on a single processor}}{\text{Time to execute program on N parallel processors}}

= \frac{T(1 - f) + Tf}{T(1 - f) + \frac{Tf}{N}} = \frac{1}{(1 - f) + \frac{f}{N}}

This equation is illustrated in Figure 2.14. Two important conclusions can be drawn:
1. When f is small, the use of parallel processors has little effect.
2. As N approaches infinity, speedup is bound by 1/(1 – f), so that there are diminishing returns for using more processors.
These conclusions are too pessimistic, an assertion first put forward in [GUST88]. For example, a server can maintain multiple threads or multiple tasks to handle multiple clients and execute the threads or tasks in parallel up to the limit of the number of processors. Many database applications involve computations on massive amounts of data that can be split up into multiple parallel tasks. Nevertheless, Amdahl’s law illustrates the problems facing industry in the development of multicore machines with an ever-growing number of cores: The software that runs on such machines must be adapted to a highly parallel execution environment to exploit the power of parallel processing. Amdahl’s law can be generalized to evaluate any design or technical improvement in a computer system. Consider any enhancement to a feature of a system that results in a speedup. The speedup can be expressed as

Speedup = \frac{\text{Performance after enhancement}}{\text{Performance before enhancement}} = \frac{\text{Execution time before enhancement}}{\text{Execution time after enhancement}}     (2.8)

Figure 2.14 Amdahl’s Law for Multiprocessors (speedup versus number of processors, from 1 to 1000 on a logarithmic scale, for parallel fractions f = 0.5, 0.75, 0.90, and 0.95)
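The curves in Figure 2.14 follow directly from the multiprocessor form of the speedup equation. The following Python sketch is a simple illustration: it evaluates that formula for the values of f plotted in the figure and a few processor counts.

def amdahl_speedup(f, n):
    # Speedup = 1 / ((1 - f) + f/N), where f is the parallelizable fraction
    # of execution time and N is the number of processors.
    return 1.0 / ((1.0 - f) + f / n)

for f in (0.5, 0.75, 0.90, 0.95):
    # As N grows, the speedup approaches the asymptote 1 / (1 - f)
    print(f, [round(amdahl_speedup(f, n), 2) for n in (1, 10, 100, 1000)])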

Suppose that a feature of the system is used during execution a fraction of the time f, before enhancement, and that the speedup of that feature after enhancement is SUf. Then the overall speedup of the system is

Speedup = \frac{1}{(1 - f) + \frac{f}{SU_f}}

For example, suppose that a task makes extensive use of floating-point operations, with 40% of the time consumed by floating-point operations. With a new hardware design, the floating-point module is speeded up by a factor of K. Then the overall speedup is as follows:

Speedup = \frac{1}{0.6 + \frac{0.4}{K}}

Thus, independent of K, the maximum speedup is 1.67.
Little’s Law
A fundamental and simple relation with broad applications is Little’s Law [LITT61, LITT11].17 We can apply it to almost any system that is statistically in steady state, and in which there is no leakage. The general setup is that we have a steady state system to which items arrive at an average rate of λ items per unit

17The second reference is a retrospective article on his law that Little wrote 50 years after his original paper. That must be unique in the history of the technical literature.

time. The items stay in the system an average of W units of time. Finally, there is an average of L units in the system at any one time. Little’s Law relates these three variables as L = λW. Using queuing theory terminology, Little’s Law applies to a queuing system. The central element of the system is a server, which provides some service to items. Items from some population of items arrive at the system to be served. If the server is idle, an item is served immediately. Otherwise, an arriving item joins a waiting line, or queue. There can be a single queue for a single server, a single queue for multiple servers, or multiple queues, one for each of multiple servers. When a server has completed serving an item, the item departs. If there are items waiting in the queue, one is immediately dispatched to the server. The server in this model can represent anything that performs some function or service for a collection of items. Examples: A processor provides service to processes; a transmission line provides a transmission service to packets or frames of data; and an I/O device provides a read or write service for I/O requests. To understand Little’s formula, consider the following argument, which focuses on the experience of a single item. When the item arrives, it will find on average w items waiting ahead of it. When the item leaves the queue behind it to be serviced, it will leave behind on average the same number of items in the queue, namely w. To see this, note that while the item is waiting, the line in front of it shrinks until the item is at the front of the line; meanwhile, additional items arrive and get in line behind this item. When the item leaves the queue to be serviced, the number of items behind it, on average, is w, because w is defined as the average number of items waiting. Further, the average time that the item was waiting for service is Tw. Since items arrive at a rate of λ, we can reason that in the time Tw, a total of λTw items must have arrived. Thus, w = λTw. To summarize, under steady state conditions, the average number of items in a queuing system equals the average rate at which items arrive multiplied by the average time that an item spends in the system. This relationship requires very few assumptions. We do not need to know what the service time distribution is, what the distribution of arrival times is, or the order or priority in which items are served. Because of its simplicity and generality, Little’s Law is extremely useful and has experienced somewhat of a revival due to the interest in performance problems related to multicore computers. A very simple example, from [LITT11], illustrates how Little’s Law might be applied. Consider a multicore system, with each core supporting multiple threads of execution. At some level, the cores share a common memory. The cores share a common main memory and typically share a common cache memory as well. In any case, when a thread is executing, it may arrive at a point at which it must retrieve a piece of data from the common memory. The thread stops and sends out a request for that data. All such stopped threads are in a queue. If the system is being used as a server, an analyst can determine the demand on the system in terms of the rate of user requests, and then translate that into the rate of requests for data from the threads generated to respond to an individual user request. For this purpose, each user request is broken down into subtasks that are implemented as threads.
We then have λ = the average rate of total

thread processing required after all members’ requests have been broken down into whatever detailed subtasks are required. Define L as the average number of stopped threads waiting during some relevant time. Then W = average response time. This simple model can serve as a guide to designers as to whether user requirements are being met and, if not, provide a quantitative measure of the amount of improvement needed.
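A minimal numerical sketch of L = λW follows; the arrival rate and waiting time used here are made-up values, purely to illustrate how the three quantities relate.

# Little's Law: L = lambda * W, for any system in statistical steady state.
arrival_rate = 200.0   # lambda: memory requests arriving per millisecond (assumed)
waiting_time = 0.05    # W: average time a stopped thread waits, in milliseconds (assumed)

avg_waiting_threads = arrival_rate * waiting_time   # L
print(avg_waiting_threads)                  # 10.0 stopped threads, on average

# Rearranged, W = L / lambda: observing L and the arrival rate yields the
# average response time without timing individual requests.
print(avg_waiting_threads / arrival_rate)   # 0.05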

2.7 RECOMMENDED READING

A description of the IBM 7000 series can be found in [BELL71]. There is good coverage of the IBM 360 in [SIEW82] and of the PDP-8 and other DEC machines in [BELL78a]. These three books also contain numerous detailed examples of other computers spanning the history of computers through the early 1980s. [BLAA97] includes an excellent set of case studies of historical machines. A good history of the microprocessor is [BETK97]. [OLUK96], [HAMM97], and [SAKA02] discuss the motivation for multiple processors on a single chip. [BREY09] provides a good survey of the Intel microprocessor line. The Intel documentation itself is also good [INTE11]. [SING11] is an interesting short history of the x86. The most thorough documentation available for the ARM architecture is [SEAL00].18 [FURB00] is another excellent source of information. [SMIT08] is an interesting comparison of the ARM and x86 approaches to embedding processors in mobile wireless devices. For interesting discussions of Moore’s law and its consequences, see [FULL11], [HUTC96], [SCHA97], and [BOHR98]. [HENN06] provides a detailed description of each of the benchmarks in CPU2006. [SMIT88] discusses the relative merits of arithmetic, harmonic, and geometric means.

BELL71 Bell, C., and Newell, A. Computer Structures: Readings and Examples. New York: McGraw-Hill, 1971.
BELL78a Bell, C.; Mudge, J.; and McNamara, J. Computer Engineering: A DEC View of Hardware Systems Design. Bedford, MA: Digital Press, 1978.
BETK97 Betker, M.; Fernando, J.; and Whalen, S. “The History of the Microprocessor.” Bell Labs Technical Journal, Autumn 1997.
BLAA97 Blaauw, G., and Brooks, F. Computer Architecture: Concepts and Evolution. Reading, MA: Addison-Wesley, 1997.
BOHR98 Bohr, M. “Silicon Trends and Limits for Advanced Microprocessors.” Communications of the ACM, March 1998.
BREY09 Brey, B. The Intel Microprocessors: 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro Processor, Pentium II, Pentium III, Pentium 4 and Core2 with 64-bit Extensions. Upper Saddle River, NJ: Prentice Hall, 2009.

18Known in the ARM community as the “ARM ARM.”

FULL11 Fuller, S., and Millet, L., eds. The Future of Computing Performance: Game Over or Next Level? Washington, DC: National Academies Press, 2011. www.nap.edu
FURB00 Furber, S. ARM System-On-Chip Architecture. Reading, MA: Addison-Wesley, 2000.
HAMM97 Hammond, L.; Nayfay, B.; and Olukotun, K. “A Single-Chip Multiprocessor.” Computer, September 1997.
HENN06 Henning, J. “SPEC CPU2006 Benchmark Descriptions.” Computer Architecture News, September 2006.
HUTC96 Hutcheson, G., and Hutcheson, J. “Technology and Economics in the Semiconductor Industry.” Scientific American, January 1996.
INTE11 Intel Corp. Intel® 64 and IA-32 Intel Architectures Software Developer’s Manual (3 volumes). Denver, CO, 2011.
OLUK96 Olukotun, K., et al. “The Case for a Single-Chip Multiprocessor.” Proceedings, Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, 1996.
SAKA02 Sakai, S. “CMP on SoC: Architect’s View.” Proceedings of the 15th International Symposium on System Synthesis, 2002.
SCHA97 Schaller, R. “Moore’s Law: Past, Present, and Future.” IEEE Spectrum, June 1997.
SEAL00 Seal, D., ed. ARM Architecture Reference Manual. Reading, MA: Addison-Wesley, 2000.
SIEW82 Siewiorek, D.; Bell, C.; and Newell, A. Computer Structures: Principles and Examples. New York: McGraw-Hill, 1982.
SING11 Singh, G. “The IBM PC: The Silicon Story.” Computer, August 2011.
SMIT88 Smith, J. “Characterizing Computer Performance with a Single Number.” Communications of the ACM, October 1988.
SMIT08 Smith, B. “ARM and Intel Battle over the Mobile Chip’s Future.” Computer, May 2008.

2.8 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

accumulator (AC), Amdahl’s law, arithmetic and logic unit (ALU), arithmetic mean, benchmark, chip, clock cycle, clock rate, clock speed, clock tick, control unit, cycle time, data channel, embedded system, execute cycle, fetch cycle, general-purpose computing on GPUs (GPGPU), geometric mean, graphics processing unit (GPU), harmonic mean, input-output (I/O), instruction buffer register (IBR), instruction cycle, instruction register (IR), instruction set, integrated circuit (IC), main memory, many integrated core (MIC), memory address register (MAR), memory buffer register (MBR), microprocessor, MIPS rate, multicore, multiplexor, opcode, original equipment manufacturer (OEM), program counter (PC), rate metric, ratio, SPEC, speed metric, stored-program concept, upward compatible, von Neumann machine, wafer, word

Review Questions
2.1 What is a stored program computer?
2.2 What are the four main components of any general-purpose computer?
2.3 At the integrated circuit level, what are the three principal constituents of a computer system?
2.4 Explain Moore’s law.
2.5 List and explain the key characteristics of a computer family.
2.6 What is the key distinguishing feature of a microprocessor?

Problems
2.1 You are to write an IAS program to compute the results of the following equation:

Y = \sum_{X=1}^{N} X

Assume that the result of the computation does not cause arithmetic overflow and that X, Y, and N are positive integers with N ≥ 1. Note: The IAS did not have assembly language, only machine language.
a. Use the equation Sum(Y) = N(N + 1)/2 when writing the IAS program.
b. Do it the “hard way,” without using the equation from part (a).
2.2 a. On the IAS, what would the instruction look like to load the contents of memory address 2 to the accumulator?
b. How many trips to memory does the CPU need to make to complete this instruction during the instruction cycle?
2.3 On the IAS, describe in English the process that the CPU must undertake to read a value from memory and to write a value to memory in terms of what is put into the MAR, MBR, address bus, data bus, and control bus.
2.4 Given the memory contents of the IAS computer shown below,
Address    Contents
08A        010FA210FB
08B        010FA0F08D
08C        020FA210FB
show the assembly language code for the program, starting at address 08A. Explain what this program does.
2.5 In Figure 2.3, indicate the width, in bits, of each data path (e.g., between AC and ALU).

2.6 In the IBM 360 Models 65 and 75, addresses are staggered in two separate main memory units (e.g., all even-numbered words in one unit and all odd-numbered words in another). What might be the purpose of this technique?
2.7 With reference to Table 2.4, we see that the relative performance of the IBM 360 Model 75 is 50 times that of the 360 Model 30, yet the instruction cycle time is only 5 times as fast. How do you account for this discrepancy?
2.8 While browsing at Billy Bob’s computer store, you overhear a customer asking Billy Bob what is the fastest computer in the store that he can buy. Billy Bob replies, “You’re looking at our Macintoshes. The fastest Mac we have runs at a clock speed of 1.2 GHz. If you really want the fastest machine, you should buy our 2.4-GHz Intel Pentium IV instead.” Is Billy Bob correct? What would you say to help this customer?
2.9 The ENIAC was a decimal machine, where a register was represented by a ring of 10 vacuum tubes. At any time, only one vacuum tube was in the ON state, representing one of the 10 digits. Assuming that ENIAC had the capability to have multiple vacuum tubes in the ON and OFF state simultaneously, why is this representation “wasteful” and what range of integer values could we represent using the 10 vacuum tubes?
2.10 A benchmark program is run on a 40 MHz processor. The executed program consists of 100,000 instruction executions, with the following instruction mix and clock cycle count:

Instruction Type      Instruction Count    Cycles per Instruction
Integer arithmetic    45,000               1
Data transfer         32,000               2
Floating point        15,000               2
Control transfer       8,000               2

Determine the effective CPI, MIPS rate, and execution time for this program. 2.11 Consider two different machines, with two different instruction sets, both of which have a clock rate of 200 MHz. The following measurements are recorded on the two machines running a given set of benchmark programs:

Instruction Type           Instruction Count (millions)    Cycles per Instruction
Machine A
  Arithmetic and logic      8                               1
  Load and store            4                               3
  Branch                    2                               4
  Others                    4                               3
Machine B
  Arithmetic and logic     10                               1
  Load and store            8                               2
  Branch                    2                               4
  Others                    4                               3
a. Determine the effective CPI, MIPS rate, and execution time for each machine.
b. Comment on the results.
2.12 Early examples of CISC and RISC design are the VAX 11/780 and the IBM RS/6000, respectively. Using a typical benchmark program, the following machine characteristics result:

Processor       Clock Frequency (MHz)    Performance (MIPS)    CPU Time (seconds)
VAX 11/780       5                        1                    12x
IBM RS/6000     25                       18                     x

The final column shows that the VAX required 12 times longer than the IBM measured in CPU time.
a. What is the relative size of the instruction count of the machine code for this benchmark program running on the two machines?
b. What are the CPI values for the two machines?
2.13 Four benchmark programs are executed on three computers with the following results:

             Computer A    Computer B    Computer C
Program 1         1             10            20
Program 2      1000            100            20
Program 3       500           1000            50
Program 4       100            800           100

The table shows the execution time in seconds, with 100,000,000 instructions executed in each of the four programs. Calculate the MIPS values for each computer for each program. Then calculate the arithmetic and harmonic means assuming equal weights for the four programs, and rank the computers based on arithmetic mean and harmonic mean.
2.14 The following table, based on data reported in the literature [HEAT84], shows the execution times, in seconds, for five different benchmark programs on three machines.

                      Processor
Benchmark        R         M         Z
E               417       244       134
F                83        70        70
H                66       153       135
I            39,449    35,527    66,000
K               772       368       369

a. Compute the speed metric for each processor for each benchmark, normalized to machine R. That is, the ratio values for R are all 1.0. Other ratios are calculated using Equation (2.5) with R treated as the reference system. Then compute the arithmetic mean value for each system using Equation (2.3). This is the approach taken in [HEAT84].
b. Repeat part (a) using M as the reference machine. This calculation was not tried in [HEAT84].
c. Which machine is the slowest based on each of the preceding two calculations?
d. Repeat the calculations of parts (a) and (b) using the geometric mean, defined in Equation (2.6). Which machine is the slowest based on the two calculations?
2.15 To clarify the results of the preceding problem, we look at a simpler example.

                 Processor
Benchmark      X      Y      Z
1             20     10     40
2             40     80     20
a. Compute the arithmetic mean value for each system using X as the reference machine and then using Y as the reference machine. Argue that intuitively the three machines have roughly equivalent performance and that the arithmetic mean gives misleading results.
b. Compute the geometric mean value for each system using X as the reference machine and then using Y as the reference machine. Argue that the results are more realistic than with the arithmetic mean.
2.16 Consider the example in Section 2.6 for the calculation of average CPI and MIPS rate, which yielded the result of CPI = 2.24 and MIPS rate = 178. Now assume that the program can be executed in eight parallel tasks or threads with roughly equal number of instructions executed in each task. Execution is on an 8-core system with each core (processor) having the same performance as the single processor originally used. Coordination and synchronization between the parts adds an extra 25,000 instruction executions to each task. Assume the same instruction mix as in the example for each task, but increase the CPI for memory reference with cache miss to 12 cycles due to contention for memory.
a. Determine the average CPI.
b. Determine the corresponding MIPS rate.
c. Calculate the speedup factor.
d. Compare the actual speedup factor with the theoretical speedup factor determined by Amdahl’s law.
2.17 A processor accesses main memory with an average access time of T2. A smaller cache memory is interposed between the processor and main memory. The cache has a significantly faster access time of T1 < T2. The cache holds, at any time, copies of some main memory words and is designed so that the words more likely to be accessed in the near future are in the cache. Assume that the probability that the next word accessed by the processor is in the cache is H, known as the hit ratio.
a. For any single memory access, what is the theoretical speedup of accessing the word in the cache rather than in main memory?
b. Let T be the average access time. Express T as a function of T1, T2, and H. What is the overall speedup as a function of H?
c. In practice, a system may be designed so that the processor must first access the cache to determine if the word is in the cache and, if it is not, then access main memory, so that on a miss (opposite of a hit), memory access time is T1 + T2. Express T as a function of T1, T2, and H. Now calculate the speedup and compare to the result produced in part (b).
2.18 The owner of a shop observes that on average 18 customers per hour arrive and there are typically 8 customers in the shop. What is the average length of time each customer spends in the shop?
PART TWO THE COMPUTER SYSTEM
CHAPTER

A TOP-LEVEL VIEW OF COMPUTER FUNCTION AND INTERCONNECTION

3.1 Computer Components
3.2 Computer Function
    Instruction Fetch and Execute
    Interrupts
    I/O Function
3.3 Interconnection Structures
3.4 Bus Interconnection
    Bus Structure
    Multiple-Bus Hierarchies
    Elements of Bus Design
3.5 Point-to-Point Interconnect
    QPI Physical Layer
    QPI Link Layer
    QPI Routing Layer
    QPI Protocol Layer
3.6 PCI Express
    PCI Physical and Logical Architecture
    PCIe Physical Layer
    PCIe Transaction Layer
    PCIe Data Link Layer
3.7 Recommended Reading
3.8 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the basic elements of an instruction cycle and the role of interrupts.
• Describe the concept of interconnection within a computer system.
• Understand the difference between synchronous and asynchronous bus timing.
• Explain the need for multiple buses arranged in a hierarchy.
• Assess the relative advantages of point-to-point interconnection compared to bus interconnection.
• Present an overview of QPI.
• Present an overview of PCIe.

At a top level, a computer consists of CPU (central processing unit), memory, and I/O components, with one or more modules of each type. These components are interconnected in some fashion to achieve the basic function of the computer, which is to execute programs. Thus, at a top level, we can characterize a computer system by describing (1) the external behavior of each component, that is, the data and control signals that it exchanges with other components, and (2) the interconnection structure and the controls required to manage the use of the interconnection structure.

This top-level view of structure and function is important because of its explanatory power in understanding the nature of a computer. Equally important is its use to understand the increasingly complex issues of performance evaluation. An understanding of the top-level structure and function offers insight into system bottlenecks, alternate pathways, the magnitude of system failures if a component fails, and the ease of adding performance enhancements. In many cases, requirements for greater system power and fail-safe capabilities are being met by changing the design rather than merely increasing the speed and reliability of individual components.

This chapter focuses on the basic structures used for computer component interconnection. As background, the chapter begins with a brief examination of the basic components and their interface requirements. Then a functional overview is provided. We are then prepared to examine the use of buses to interconnect system components.

3.1 COMPUTER COMPONENTS

As discussed in Chapter 2, virtually all contemporary computer designs are based on concepts developed by John von Neumann at the Institute for Advanced Studies, Princeton. Such a design is referred to as the von Neumann architecture and is based on three key concepts:

• Data and instructions are stored in a single read–write memory.
• The contents of this memory are addressable by location, without regard to the type of data contained there.
• Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next.

The reasoning behind these concepts was discussed in Chapter 2 but is worth summarizing here. There is a small set of basic logic components that can be combined in various ways to store binary data and perform arithmetic and logical operations on that data. If there is a particular computation to be performed, a configuration of logic components designed specifically for that computation could be constructed. We can think of the process of connecting the various components in the desired configuration as a form of programming. The resulting "program" is in the form of hardware and is termed a hardwired program.

Now consider this alternative. Suppose we construct a general-purpose configuration of arithmetic and logic functions. This set of hardware will perform various functions on data depending on control signals applied to the hardware. In the original case of customized hardware, the system accepts data and produces results (Figure 3.1a). With general-purpose hardware, the system accepts data and control signals and produces results. Thus, instead of rewiring the hardware for each new program, the programmer merely needs to supply a new set of control signals.

How shall control signals be supplied? The answer is simple but subtle. The entire program is actually a sequence of steps. At each step, some arithmetic or

Figure 3.1 Hardware and Software Approaches. (a) Programming in hardware: data enter a sequence of arithmetic and logic functions, and results are produced. (b) Programming in software: an instruction interpreter converts instruction codes into control signals for general-purpose arithmetic and logic functions that operate on the data.

logical operation is performed on some data. For each step, a new set of control signals is needed. Let us provide a unique code for each possible set of control signals, and let us add to the general-purpose hardware a segment that can accept a code and generate control signals (Figure 3.1b).

Programming is now much easier. Instead of rewiring the hardware for each new program, all we need to do is provide a new sequence of codes. Each code is, in effect, an instruction, and part of the hardware interprets each instruction and generates control signals. To distinguish this new method of programming, a sequence of codes or instructions is called software.

Figure 3.1b indicates two major components of the system: an instruction interpreter and a module of general-purpose arithmetic and logic functions. These two constitute the CPU. Several other components are needed to yield a functioning computer. Data and instructions must be put into the system. For this we need some sort of input module. This module contains basic components for accepting data and instructions in some form and converting them into an internal form of signals usable by the system. A means of reporting results is needed, and this is in the form of an output module. Taken together, these are referred to as I/O components.

One more component is needed. An input device will bring instructions and data in sequentially. But a program is not invariably executed sequentially; it may jump around (e.g., the IAS jump instruction). Similarly, operations on data may require access to more than just one element at a time in a predetermined sequence. Thus, there must be a place to store temporarily both instructions and data. That module is called memory, or main memory, to distinguish it from external storage or peripheral devices. Von Neumann pointed out that the same memory could be used to store both instructions and data.

Figure 3.2 illustrates these top-level components and suggests the interactions among them. The CPU exchanges data with memory. For this purpose, it typically makes use of two internal (to the CPU) registers: a memory address register (MAR), which specifies the address in memory for the next read or write, and a memory buffer register (MBR), which contains the data to be written into memory or receives the data read from memory. Similarly, an I/O address register (I/OAR) specifies a particular I/O device. An I/O buffer register (I/OBR) is used for the exchange of data between an I/O module and the CPU.

A memory module consists of a set of locations, defined by sequentially numbered addresses. Each location contains a binary number that can be interpreted as either an instruction or data. An I/O module transfers data from external devices to CPU and memory, and vice versa. It contains internal buffers for temporarily holding these data until they can be sent on.

Having looked briefly at these major components, we now turn to an overview of how these components function together to execute programs.
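As an illustration of the register roles just described (this sketch is not from the text; the class and method names are invented), the following Python fragment models a processor reading a memory word through the MAR and MBR.

# Minimal sketch of the CPU-memory exchange via MAR and MBR (names invented).

class Memory:
    def __init__(self, size):
        self.words = [0] * size            # sequentially numbered locations

    def read(self, address):
        return self.words[address]

    def write(self, address, value):
        self.words[address] = value

class CPU:
    def __init__(self, memory):
        self.memory = memory
        self.MAR = 0    # memory address register: address of next read or write
        self.MBR = 0    # memory buffer register: data to or from memory

    def read_word(self, address):
        self.MAR = address                         # place the address in the MAR
        self.MBR = self.memory.read(self.MAR)      # the word arrives in the MBR
        return self.MBR

memory = Memory(4096)
memory.write(940, 0x0003)
cpu = CPU(memory)
print(hex(cpu.read_word(940)))                     # prints 0x3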

3.2 COMPUTER FUNCTION

The basic function performed by a computer is execution of a program, which consists of a set of instructions stored in memory. The processor does the actual work by executing instructions specified in the program. This section provides an overview of

Figure 3.2 Computer Components: Top-Level View. The CPU exchanges addresses and data with main memory (locations 0 through n - 1) over the system bus and communicates with an I/O module and its buffers. PC = program counter; IR = instruction register; MAR = memory address register; MBR = memory buffer register; I/O AR = input/output address register; I/O BR = input/output buffer register.

the key elements of program execution. In its simplest form, instruction processing consists of two steps: The processor reads (fetches) instructions from memory one at a time and executes each instruction. Program execution consists of repeating the process of instruction fetch and instruction execution. The instruction execution may involve several operations and depends on the nature of the instruction (see, for example, the lower portion of Figure 2.4).

The processing required for a single instruction is called an instruction cycle. Using the simplified two-step description given previously, the instruction cycle is depicted in Figure 3.3. The two steps are referred to as the fetch cycle and the execute cycle. Program execution halts only if the machine is turned off, some sort of unrecoverable error occurs, or a program instruction that halts the computer is encountered.

Instruction Fetch and Execute

At the beginning of each instruction cycle, the processor fetches an instruction from memory. In a typical processor, a register called the program counter (PC) holds the address of the instruction to be fetched next. Unless told otherwise, the

Figure 3.3 Basic Instruction Cycle (START, then repeated fetch cycles and execute cycles: fetch next instruction, execute instruction, until HALT).

processor always increments the PC after each instruction fetch so that it will fetch the next instruction in sequence (i.e., the instruction located at the next higher memory address). So, for example, consider a computer in which each instruction occupies one 16-bit word of memory. Assume that the program counter is set to memory location 300, where the location address refers to a 16-bit word. The processor will next fetch the instruction at location 300. On succeeding instruction cycles, it will fetch instructions from locations 301, 302, 303, and so on. This sequence may be altered, as explained presently.

The fetched instruction is loaded into a register in the processor known as the instruction register (IR). The instruction contains bits that specify the action the processor is to take. The processor interprets the instruction and performs the required action. In general, these actions fall into four categories:

• Processor-memory: Data may be transferred from processor to memory or from memory to processor.
• Processor-I/O: Data may be transferred to or from a peripheral device by transferring between the processor and an I/O module.
• Data processing: The processor may perform some arithmetic or logic operation on data.
• Control: An instruction may specify that the sequence of execution be altered. For example, the processor may fetch an instruction from location 149, which specifies that the next instruction be from location 182. The processor will remember this fact by setting the program counter to 182. Thus, on the next fetch cycle, the instruction will be fetched from location 182 rather than 150.

An instruction's execution may involve a combination of these actions.

Consider a simple example using a hypothetical machine that includes the characteristics listed in Figure 3.4. The processor contains a single data register, called an accumulator (AC). Both instructions and data are 16 bits long. Thus, it is convenient to organize memory using 16-bit words. The instruction format provides 4 bits for the opcode, so that there can be as many as 2^4 = 16 different opcodes, and up to 2^12 = 4096 (4K) words of memory can be directly addressed.

Figure 3.5 illustrates a partial program execution, showing the relevant portions of memory and processor registers.1

1Hexadecimal notation is used, in which each digit represents 4 bits. This is the most convenient notation for representing the contents of memory and registers when the word length is a multiple of 4. See Chapter 9 for a basic refresher on number systems (decimal, binary, hexadecimal).

Figure 3.4 Characteristics of a Hypothetical Machine
(a) Instruction format (16 bits): bits 0-3 hold the opcode; bits 4-15 hold the address.
(b) Integer format (16 bits): bit 0 holds the sign; bits 1-15 hold the magnitude.
(c) Internal CPU registers: program counter (PC) holds the address of an instruction; instruction register (IR) holds the instruction being executed; accumulator (AC) provides temporary storage.
(d) Partial list of opcodes: 0001 = Load AC from memory; 0010 = Store AC to memory; 0101 = Add to AC from memory.

Figure 3.5 Example of Program Execution (contents of memory and registers in hexadecimal)

Memory throughout: location 300 = 1940, 301 = 5941, 302 = 2941, 940 = 0003, 941 = 0002 (941 becomes 0005 in step 6).

Step 1 (fetch):   PC = 300, IR = 1940
Step 2 (execute): PC = 301, IR = 1940, AC = 0003
Step 3 (fetch):   PC = 301, IR = 5941, AC = 0003
Step 4 (execute): PC = 302, IR = 5941, AC = 0005 (3 + 2 = 5)
Step 5 (fetch):   PC = 302, IR = 2941, AC = 0005
Step 6 (execute): PC = 303, IR = 2941, AC = 0005, memory location 941 = 0005

The program fragment shown adds the contents of the memory word at address 940 to the contents of the memory word at address 941 and stores the result in the latter location. Three instructions, which can be described as three fetch and three execute cycles, are required:

1. The PC contains 300, the address of the first instruction. This instruction (the value 1940 in hexadecimal) is loaded into the instruction register IR, and the PC is incremented. Note that this process involves the use of a memory address register and a memory buffer register. For simplicity, these intermediate registers are ignored.
2. The first 4 bits (first hexadecimal digit) in the IR indicate that the AC is to be loaded. The remaining 12 bits (three hexadecimal digits) specify the address (940) from which data are to be loaded.
3. The next instruction (5941) is fetched from location 301, and the PC is incremented.
4. The old contents of the AC and the contents of location 941 are added, and the result is stored in the AC.
5. The next instruction (2941) is fetched from location 302, and the PC is incremented.
6. The contents of the AC are stored in location 941.

In this example, three instruction cycles, each consisting of a fetch cycle and an execute cycle, are needed to add the contents of location 940 to the contents of 941. With a more complex set of instructions, fewer cycles would be needed. Some older processors, for example, included instructions that contain more than one memory address. Thus, the execution cycle for a particular instruction on such processors could involve more than one reference to memory. Also, instead of memory references, an instruction may specify an I/O operation.

For example, the PDP-11 processor includes an instruction, expressed symbolically as ADD B,A, that stores the sum of the contents of memory locations B and A into memory location A. A single instruction cycle with the following steps occurs:

• Fetch the ADD instruction.
• Read the contents of memory location A into the processor.
• Read the contents of memory location B into the processor. In order that the contents of A are not lost, the processor must have at least two registers for storing memory values, rather than a single accumulator.
• Add the two values.
• Write the result from the processor to memory location A.

Thus, the execution cycle for a particular instruction may involve more than one reference to memory. Also, instead of memory references, an instruction may specify an I/O operation.

With these additional considerations in mind, Figure 3.6 provides a more detailed look at the basic instruction cycle of Figure 3.3. The figure is in the form of a state diagram. For any given instruction cycle, some states may be null and others may be visited more than once. The states can be described as follows:

Figure 3.6 Instruction Cycle State Diagram. The states are instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, operand address calculation, and operand store, with loops for multiple operands, multiple results, and a return for string or vector data; when an instruction is complete, the next instruction is fetched.

• Instruction address calculation (iac): Determine the address of the next instruction to be executed. Usually, this involves adding a fixed number to the address of the previous instruction. For example, if each instruction is 16 bits long and memory is organized into 16-bit words, then add 1 to the previous address. If, instead, memory is organized as individually addressable 8-bit bytes, then add 2 to the previous address.
• Instruction fetch (if): Read instruction from its memory location into the processor.
• Instruction operation decoding (iod): Analyze instruction to determine type of operation to be performed and operand(s) to be used.
• Operand address calculation (oac): If the operation involves reference to an operand in memory or available via I/O, then determine the address of the operand.
• Operand fetch (of): Fetch the operand from memory or read it in from I/O.
• Data operation (do): Perform the operation indicated in the instruction.
• Operand store (os): Write the result into memory or out to I/O.

States in the upper part of Figure 3.6 involve an exchange between the processor and either memory or an I/O module. States in the lower part of the diagram involve only internal processor operations. The oac state appears twice, because an instruction may involve a read, a write, or both. However, the action performed during that state is fundamentally the same in both cases, and so only a single state identifier is needed.

Also note that the diagram allows for multiple operands and multiple results, because some instructions on some machines require this. For example, the PDP-11 instruction ADD A,B results in the following sequence of states: iac, if, iod, oac, of, oac, of, do, oac, os.

Finally, on some machines, a single instruction can specify an operation to be performed on a vector (one-dimensional array) of numbers or a string (one-dimensional

Table 3.1 Classes of Interrupts

Program            Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, or reference outside a user's allowed memory space.
Timer              Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
I/O                Generated by an I/O controller, to signal normal completion of an operation, request service from the processor, or to signal a variety of error conditions.
Hardware Failure   Generated by a failure such as power failure or memory parity error.

array) of characters. As Figure 3.6 indicates, this would involve repetitive operand fetch and/or store operations.

Interrupts

Virtually all computers provide a mechanism by which other modules (I/O, memory) may interrupt the normal processing of the processor. Table 3.1 lists the most common classes of interrupts. The specific nature of these interrupts is examined later in this book, especially in Chapters 7 and 14. However, we need to introduce the concept now to understand more clearly the nature of the instruction cycle and the implications of interrupts on the interconnection structure. The reader need not be concerned at this stage about the details of the generation and processing of interrupts, but only focus on the communication between modules that results from interrupts.

Interrupts are provided primarily as a way to improve processing efficiency. For example, most external devices are much slower than the processor. Suppose that the processor is transferring data to a printer using the instruction cycle scheme of Figure 3.3. After each write operation, the processor must pause and remain idle until the printer catches up. The length of this pause may be on the order of many hundreds or even thousands of instruction cycles that do not involve memory. Clearly, this is a very wasteful use of the processor.

Figure 3.7a illustrates this state of affairs. The user program performs a series of WRITE calls interleaved with processing. Code segments 1, 2, and 3 refer to sequences of instructions that do not involve I/O. The WRITE calls are to an I/O program that is a system utility and that will perform the actual I/O operation. The I/O program consists of three sections:

• A sequence of instructions, labeled 4 in the figure, to prepare for the actual I/O operation. This may include copying the data to be output into a special buffer and preparing the parameters for a device command.
• The actual I/O command. Without the use of interrupts, once this command is issued, the program must wait for the I/O device to perform the requested function (or periodically poll the device). The program might wait by simply repeatedly performing a test operation to determine if the I/O operation is done.
• A sequence of instructions, labeled 5 in the figure, to complete the operation. This may include setting a flag indicating the success or failure of the operation.

Figure 3.7 Program Flow of Control without and with Interrupts: (a) no interrupts; (b) interrupts, short I/O wait; (c) interrupts, long I/O wait. In each case the user program (code segments 1, 2, 3) issues WRITE calls to the I/O program (code segment 4, the I/O command, and code segment 5); an asterisk marks a point at which an interrupt occurs during execution of the user program.


Because the I/O operation may take a relatively long time to complete, the I/O program is hung up waiting for the operation to complete; hence, the user program is stopped at the point of the WRITE call for some considerable period of time.

INTERRUPTS AND THE INSTRUCTION CYCLE

With interrupts, the processor can be engaged in executing other instructions while an I/O operation is in progress. Consider the flow of control in Figure 3.7b. As before, the user program reaches a point at which it makes a system call in the form of a WRITE call. The I/O program that is invoked in this case consists only of the preparation code and the actual I/O command. After these few instructions have been executed, control returns to the user program. Meanwhile, the external device is busy accepting data from computer memory and printing it. This I/O operation is conducted concurrently with the execution of instructions in the user program.

When the external device becomes ready to be serviced—that is, when it is ready to accept more data from the processor—the I/O module for that external device sends an interrupt request signal to the processor. The processor responds by suspending operation of the current program, branching off to a program to service that particular I/O device, known as an interrupt handler, and resuming the original execution after the device is serviced. The points at which such interrupts occur are indicated by an asterisk in Figure 3.7b.

Let us try to clarify what is happening in Figure 3.7. We have a user program that contains two WRITE commands. There is a segment of code at the beginning, then one WRITE command, then a second segment of code, then a second WRITE command, then a third and final segment of code. The WRITE command invokes the I/O program provided by the OS. Similarly, the I/O program consists of a segment of code, followed by an I/O command, followed by another segment of code. The I/O command invokes a hardware I/O operation.
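The structure just described can be sketched in code. The fragment below is purely illustrative (the function names and print statements are invented, not taken from the text): the WRITE call only prepares and starts the operation, and the interrupt handler later completes it while the user program continues in between.

# Sketch of interrupt-driven output (hypothetical code; names invented).

device_busy = False
pending_data = None

def write(data):                       # the WRITE call in the user program
    global device_busy, pending_data
    pending_data = data                # code segment 4: prepare the operation
    device_busy = True                 # issue the I/O command, then return at once

def interrupt_handler():               # invoked when the device signals completion
    global device_busy
    print("interrupt: device finished with", pending_data)   # code segment 5
    device_busy = False

write("first block")                   # user program: first WRITE
print("code segment 2 executes while the device is printing")
interrupt_handler()                    # device interrupts when ready for more data
write("second block")                  # second WRITE
print("code segment 3 executes while the device is printing")
interrupt_handler()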

(Layout of the two programs: the user program consists of code segment 1, a WRITE call, code segment 2, a WRITE call, and code segment 3; each WRITE call transfers control to the I/O program, which consists of code segment 4, the I/O command, and code segment 5, after which control returns to the user program.)

Figure 3.8 Transfer of Control via Interrupts. The user program (instructions 1 through M) is interrupted after instruction i; the interrupt handler executes, and control then returns to instruction i + 1 of the user program.

From the point of view of the user program, an interrupt is just that: an interruption of the normal sequence of execution. When the interrupt processing is completed, execution resumes (Figure 3.8). Thus, the user program does not have to contain any special code to accommodate interrupts; the processor and the operating system are responsible for suspending the user program and then resuming it at the same point. To accommodate interrupts, an interrupt cycle is added to the instruction cycle, as shown in Figure 3.9. In the interrupt cycle, the processor checks to see if any interrupts have occurred, indicated by the presence of an interrupt signal. If no interrupts are pending, the processor proceeds to the fetch cycle and fetches the next instruction of the current program. If an interrupt is pending, the processor does the following: • It suspends execution of the current program being executed and saves its context. This means saving the address of the next instruction to be executed

Figure 3.9 Instruction Cycle with Interrupts. Each instruction cycle consists of a fetch cycle, an execute cycle, and, if interrupts are enabled, an interrupt cycle in which the processor checks for and processes pending interrupts before fetching the next instruction; if interrupts are disabled, the processor proceeds directly to the next fetch. The cycle repeats from START until HALT.

(current contents of the program counter) and any other data relevant to the processor's current activity.
• It sets the program counter to the starting address of an interrupt handler routine.

The processor now proceeds to the fetch cycle and fetches the first instruction in the interrupt handler program, which will service the interrupt. The interrupt handler program is generally part of the operating system. Typically, this program determines the nature of the interrupt and performs whatever actions are needed. In the example we have been using, the handler determines which I/O module generated the interrupt and may branch to a program that will write more data out to that I/O module. When the interrupt handler routine is completed, the processor can resume execution of the user program at the point of interruption.

It is clear that there is some overhead involved in this process. Extra instructions must be executed (in the interrupt handler) to determine the nature of the interrupt and to decide on the appropriate action. Nevertheless, because of the relatively large amount of time that would be wasted by simply waiting on an I/O operation, the processor can be employed much more efficiently with the use of interrupts.

To appreciate the gain in efficiency, consider Figure 3.10, which is a timing diagram based on the flow of control in Figures 3.7a and 3.7b. In this figure, user program code segments are shaded green, and I/O program code segments are shaded

Figure 3.10 Program Timing: Short I/O Wait. (a) Without interrupts, the processor waits during each I/O operation. (b) With interrupts, code segments 2a/2b and 3a/3b execute concurrently with the I/O operations.

gray. Figure 3.10a shows the case in which interrupts are not used. The processor must wait while an I/O operation is performed. Figures 3.7b and 3.10b assume that the time required for the I/O operation is relatively short: less than the time to complete the execution of instructions between write operations in the user program. In this case, the segment of code labeled code segment 2 is interrupted. A portion of the code (2a) executes (while the I/O operation is performed) and then the interrupt occurs (upon the completion of the I/O operation). After the interrupt is serviced, execution resumes with the remainder of code segment 2 (2b).

The more typical case, especially for a slow device such as a printer, is that the I/O operation will take much more time than executing a sequence of user instructions. Figure 3.7c indicates this state of affairs. In this case, the user program reaches the second WRITE call before the I/O operation spawned by the first call is complete. The result is that the user program is hung up at that point. When the preceding I/O operation is completed, this new WRITE call may be processed, and a new I/O operation may be started. Figure 3.11 shows the timing for this situation with

Figure 3.11 Program Timing: Long I/O Wait. (a) Without interrupts, the processor waits for the full duration of each I/O operation. (b) With interrupts, each I/O operation partly overlaps with the execution of user instructions, but the processor must still wait for the remainder of the operation to complete.

and without the use of interrupts. We can see that there is still a gain in efficiency because part of the time during which the I/O operation is under way overlaps with the execution of user instructions. Figure 3.12 shows a revised instruction cycle state diagram that includes interrupt cycle processing.
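As a rough illustration of this gain, the following back-of-the-envelope calculation uses invented durations (none of these numbers appear in the text): each user code segment takes 2 ms, the I/O preparation and completion code take 0.5 ms each, and the device needs 6 ms per operation.

# Hypothetical timing comparison for Figures 3.10/3.11 (all durations invented).

user_segment = 2.0      # ms for each of code segments 1, 2, 3
io_code      = 0.5      # ms for each of code segments 4 and 5
device_time  = 6.0      # ms the device needs per I/O operation
n_writes     = 2

# Without interrupts, the processor idles for the full device time on every WRITE.
t_without = 3 * user_segment + n_writes * (2 * io_code + device_time)

# With interrupts (long I/O wait), each device operation overlaps the following
# user code segment; the processor waits only for the remaining device time.
overlap = min(device_time, user_segment)
t_with = 3 * user_segment + n_writes * (2 * io_code + device_time - overlap)

print(f"without interrupts: {t_without:.1f} ms")   # 20.0 ms
print(f"with interrupts:    {t_with:.1f} ms")      # 16.0 ms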

MULTIPLE INTERRUPTS

The discussion so far has focused only on the occurrence of a single interrupt. Suppose, however, that multiple interrupts can occur. For example, a program may be receiving data from a communications line and printing results. The printer will generate an interrupt every time it completes a print operation. The communication line controller will generate an interrupt every time a unit of data arrives. The unit could either be a single character or a block, depending on the nature of the communications discipline. In any case, it is possible for a communications interrupt to occur while a printer interrupt is being processed.

Two approaches can be taken to dealing with multiple interrupts. The first is to disable interrupts while an interrupt is being processed. A disabled interrupt simply means that the processor can and will ignore that interrupt request signal. If an interrupt occurs during this time, it generally remains pending and will be checked by the processor after the processor has enabled interrupts. Thus, when a user program is executing and an interrupt occurs, interrupts are disabled immediately. After the interrupt handler routine completes, interrupts are enabled before resuming the user program, and the processor checks to see if additional interrupts have occurred. This approach is nice and simple, as interrupts are handled in strict sequential order (Figure 3.13a).

The drawback to the preceding approach is that it does not take into account relative priority or time-critical needs. For example, when input arrives from the communications line, it may need to be absorbed rapidly to make room for more input. If the first batch of input has not been processed before the second batch arrives, data may be lost. A second approach is to define priorities for interrupts and to allow an interrupt of higher priority to cause a lower-priority interrupt handler to be itself interrupted (Figure 3.13b).

As an example of this second approach, consider a system with three I/O devices: a printer, a disk, and a communications line, with increasing priorities of 2, 4, and 5, respectively. Figure 3.14 illustrates a possible sequence. A user program begins at t = 0. At t = 10, a printer interrupt occurs; user information is placed on the system stack and execution continues at the printer interrupt service routine (ISR). While this routine is still executing, at t = 15, a communications interrupt occurs. Because the communications line has higher priority than the printer, the interrupt is honored. The printer ISR is interrupted, its state is pushed onto the stack, and execution continues at the communications ISR. While this routine is executing, a disk interrupt occurs (t = 20). Because this interrupt is of lower priority, it is simply held, and the communications ISR runs to completion.

When the communications ISR is complete (t = 25), the previous processor state is restored, which is the execution of the printer ISR. However, before even a single instruction in that routine can be executed, the processor honors the higher-priority disk interrupt and control transfers to the disk ISR.
Figure 3.12 Instruction Cycle State Diagram, with Interrupts. The state diagram of Figure 3.6 (instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, operand store, with returns for multiple operands, multiple results, and string or vector data) is extended with an interrupt check: if no interrupt is pending, the next instruction is fetched; otherwise the interrupt is processed before the next fetch.

Figure 3.13 Transfer of Control with Multiple Interrupts: (a) sequential interrupt processing, in which interrupt handler X runs to completion before interrupt handler Y is serviced; (b) nested interrupt processing, in which a higher-priority interrupt Y interrupts handler X.


Figure 3.14 Example Time Sequence of Multiple Interrupts. The user program runs from t = 0; the printer interrupt service routine starts at t = 10 and is interrupted at t = 15 by the communications interrupt service routine, which runs until t = 25; the disk interrupt service routine then runs from t = 25 to t = 35; the printer routine resumes and finishes at t = 40, when control returns to the user program.

Only when that routine is complete (t = 35) is the printer ISR resumed. When that routine completes (t = 40), control finally returns to the user program.

I/O Function

Thus far, we have discussed the operation of the computer as controlled by the processor, and we have looked primarily at the interaction of processor and memory. The discussion has only alluded to the role of the I/O component. This role is discussed in detail in Chapter 7, but a brief summary is in order here.

An I/O module (e.g., a disk controller) can exchange data directly with the processor. Just as the processor can initiate a read or write with memory, designating the address of a specific location, the processor can also read data from or write data to an I/O module. In this latter case, the processor identifies a specific device that is controlled by a particular I/O module. Thus, an instruction sequence similar in form to that of Figure 3.5 could occur, with I/O instructions rather than memory-referencing instructions.

In some cases, it is desirable to allow I/O exchanges to occur directly with memory. In such a case, the processor grants to an I/O module the authority to read from or write to memory, so that the I/O-memory transfer can occur without tying up the processor. During such a transfer, the I/O module issues read or write commands to memory, relieving the processor of responsibility for the exchange. This operation is known as direct memory access (DMA) and is examined in Chapter 7.
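The nested-interrupt sequence of Figure 3.14 can also be reproduced with a small simulation. The sketch below is invented (not from the text); the devices, priorities, and a 10-unit service time per routine are chosen to match the figure.

# Simulation sketch of the Figure 3.14 priority example (code invented).
# Higher priority number wins; a lower-priority interrupt is held until the
# running routine finishes.

interrupts = [(10, "printer", 2, 10),   # (arrival time, device, priority, service time)
              (15, "comm",    5, 10),
              (20, "disk",    4, 10)]

def simulate(interrupts, horizon=45):
    pending = sorted(interrupts)                    # not yet arrived
    ready = []                                      # arrived but not yet running
    stack = [("user program", 0, horizon)]          # (name, priority, remaining time)
    log = []
    for t in range(horizon):
        while pending and pending[0][0] <= t:       # deliver newly arrived interrupts
            _, name, prio, service = pending.pop(0)
            ready.append((prio, name, service))
        ready.sort(reverse=True)
        if ready and ready[0][0] > stack[-1][1]:    # preempt if priority is higher
            prio, name, service = ready.pop(0)
            stack.append((name, prio, service))
            log.append((t, f"start {name} ISR"))
        name, prio, remaining = stack[-1]           # run the top routine for one unit
        remaining -= 1
        if remaining == 0 and len(stack) > 1:
            log.append((t + 1, f"end {name} ISR"))
            stack.pop()
        else:
            stack[-1] = (name, prio, remaining)
    return log

for time, event in simulate(interrupts):
    print(f"t = {time:2d}: {event}")
# Expected: printer ISR starts at 10, comm ISR runs 15-25, disk ISR 25-35,
# printer ISR resumes and ends at 40, matching Figure 3.14.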

3.3 INTERCONNECTION STRUCTURES

A computer consists of a set of components or modules of three basic types (processor, memory, I/O) that communicate with each other. In effect, a computer is a network of basic modules. Thus, there must be paths for connecting the modules.

The collection of paths connecting the various modules is called the interconnection structure. The design of this structure will depend on the exchanges that must be made among modules.

Figure 3.15 suggests the types of exchanges that are needed by indicating the major forms of input and output for each module type2:

• Memory: Typically, a memory module will consist of N words of equal length. Each word is assigned a unique numerical address (0, 1, …, N - 1). A word of data can be read from or written into the memory. The nature of the operation

Figure 3.15 Computer Modules. For each module type the major forms of input and output are shown: the memory module (N words, addresses 0 to N - 1) accepts addresses, data, and read/write controls and returns data; the I/O module (M ports) accepts addresses, internal data, and read/write controls, exchanges external data with attached devices, and emits interrupt signals; the CPU accepts instructions, data, and interrupt signals and emits addresses, data, and control signals.

2The wide arrows represent multiple signal lines carrying multiple bits of information in parallel. Each narrow arrow represents a single signal line.

is indicated by read and write control signals. The location for the operation is specified by an address.
• I/O module: From an internal (to the computer system) point of view, I/O is functionally similar to memory. There are two operations, read and write. Further, an I/O module may control more than one external device. We can refer to each of the interfaces to an external device as a port and give each a unique address (e.g., 0, 1, …, M - 1). In addition, there are external data paths for the input and output of data with an external device. Finally, an I/O module may be able to send interrupt signals to the processor.
• Processor: The processor reads in instructions and data, writes out data after processing, and uses control signals to control the overall operation of the system. It also receives interrupt signals.

The preceding list defines the data to be exchanged. The interconnection structure must support the following types of transfers:

• Memory to processor: The processor reads an instruction or a unit of data from memory.
• Processor to memory: The processor writes a unit of data to memory.
• I/O to processor: The processor reads data from an I/O device via an I/O module.
• Processor to I/O: The processor sends data to the I/O device.
• I/O to or from memory: For these two cases, an I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access.

Over the years, a number of interconnection structures have been tried. By far the most common are (1) the bus and various multiple-bus structures, and (2) point-to-point interconnection structures with packetized data transfer. We devote the remainder of this chapter to a discussion of these structures.

3.4 BUS INTERCONNECTION

A bus is a communication pathway connecting two or more devices. A key characteristic of a bus is that it is a shared transmission medium. Multiple devices connect to the bus, and a signal transmitted by any one device is available for reception by all other devices attached to the bus. If two devices transmit during the same time period, their signals will overlap and become garbled. Thus, only one device at a time can successfully transmit.

Typically, a bus consists of multiple communication pathways, or lines. Each line is capable of transmitting signals representing binary 1 and binary 0. Over time, a sequence of binary digits can be transmitted across a single line. Taken together, several lines of a bus can be used to transmit binary digits simultaneously (in parallel). For example, an 8-bit unit of data can be transmitted over eight bus lines.

Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system hierarchy. A bus that

connects major computer components (processor, memory, I/O) is called a system bus. The most common computer interconnection structures are based on the use of one or more system buses.

Bus Structure

A system bus consists, typically, of from about fifty to hundreds of separate lines. Each line is assigned a particular meaning or function. Although there are many different bus designs, on any bus the lines can be classified into three functional groups (Figure 3.16): data, address, and control lines. In addition, there may be power distribution lines that supply power to the attached modules.

The data lines provide a path for moving data among system modules. These lines, collectively, are called the data bus. The data bus may consist of 32, 64, 128, or even more separate lines, the number of lines being referred to as the width of the data bus. Because each line can carry only 1 bit at a time, the number of lines determines how many bits can be transferred at a time. The width of the data bus is a key factor in determining overall system performance. For example, if the data bus is 32 bits wide and each instruction is 64 bits long, then the processor must access the memory module twice during each instruction cycle.

The address lines are used to designate the source or destination of the data on the data bus. For example, if the processor wishes to read a word (8, 16, or 32 bits) of data from memory, it puts the address of the desired word on the address lines. Clearly, the width of the address bus determines the maximum possible memory capacity of the system. Furthermore, the address lines are generally also used to address I/O ports. Typically, the higher-order bits are used to select a particular module on the bus, and the lower-order bits select a memory location or I/O port within the module. For example, on an 8-bit address bus, address 01111111 and below might reference locations in a memory module (module 0) with 128 words of memory, and address 10000000 and above refer to devices attached to an I/O module (module 1).

The control lines are used to control the access to and the use of the data and address lines. Because the data and address lines are shared by all components, there must be a means of controlling their use. Control signals transmit both command and timing information among system modules. Timing signals indicate the validity of data and address information. Command signals specify operations to be performed. Typical control lines include:

Figure 3.16 Bus Interconnection Scheme. The CPU, memory modules, and I/O modules all attach to a shared bus consisting of control lines, address lines, and data lines.

• Memory write: causes data on the bus to be written into the addressed location
• Memory read: causes data from the addressed location to be placed on the bus
• I/O write: causes data on the bus to be output to the addressed I/O port
• I/O read: causes data from the addressed I/O port to be placed on the bus
• Transfer ACK: indicates that data have been accepted from or placed on the bus
• Bus request: indicates that a module needs to gain control of the bus
• Bus grant: indicates that a requesting module has been granted control of the bus
• Interrupt request: indicates that an interrupt is pending
• Interrupt ACK: acknowledges that the pending interrupt has been recognized
• Clock: is used to synchronize operations
• Reset: initializes all modules.

The operation of the bus is as follows. If one module wishes to send data to another, it must do two things: (1) obtain the use of the bus, and (2) transfer data via the bus. If one module wishes to request data from another module, it must (1) obtain the use of the bus, and (2) transfer a request to the other module over the appropriate control and address lines. It must then wait for that second module to send the data.
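As a rough sketch of this two-step operation (obtain the bus, then transfer), the following hypothetical code models a master performing a memory read over a shared bus; the class names and methods are invented and do not correspond to any real bus standard.

# Sketch of a shared-bus read transaction (structure and names invented).

class Bus:
    def __init__(self):
        self.owner = None        # only one master may use the bus at a time
        self.address = None      # address lines
        self.data = None         # data lines

    def request(self, master):   # bus request / bus grant
        if self.owner is None:
            self.owner = master
            return True
        return False             # bus busy: the module must wait and retry

    def release(self):
        self.owner = None

class MemoryModule:
    def __init__(self, words):
        self.words = words

    def memory_read(self, bus):  # respond to the "memory read" control line
        bus.data = self.words[bus.address]

bus = Bus()
memory = MemoryModule({0x300: 0x1940})

assert bus.request("CPU")        # step 1: obtain the use of the bus
bus.address = 0x300              # step 2: place the address, assert memory read
memory.memory_read(bus)          # the addressed module puts the word on the data lines
print(hex(bus.data))             # 0x1940
bus.release()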

Multiple-Bus Hierarchies

If a great number of devices are connected to the bus, performance will suffer. There are two main causes:

1. In general, the more devices attached to the bus, the greater the bus length and hence the greater the propagation delay. This delay determines the time it takes for devices to coordinate the use of the bus. When control of the bus passes from one device to another frequently, these propagation delays can noticeably affect performance.
2. The bus may become a bottleneck as the aggregate data transfer demand approaches the capacity of the bus. This problem can be countered to some extent by increasing the data rate that the bus can carry and by using wider buses (e.g., increasing the data bus from 32 to 64 bits). However, because the data rates generated by attached devices (e.g., graphics and video controllers, network interfaces) are growing rapidly, this is a race that a single bus is ultimately destined to lose.

Accordingly, most bus-based computer systems use multiple buses, generally laid out in a hierarchy. A typical traditional structure is shown in Figure 3.17a. There is a local bus that connects the processor to a cache memory and that may support one or more local devices. The cache memory controller connects the cache not only to this local bus, but also to a system bus to which are attached all of the main memory modules. In contemporary systems, the cache is in the same chip as the processor, and

Figure 3.17 Example Bus Configurations. (a) Traditional bus architecture: the processor and cache share a local bus with a local I/O controller; main memory sits on the system bus; and an expansion bus interface connects the system bus to an expansion bus carrying network, SCSI, serial, and modem interfaces. (b) High-performance architecture: a cache/bridge connects the local bus and the system bus (with main memory) to a high-speed bus carrying SCSI, FireWire, graphics, video, and LAN interfaces, while lower-speed devices (FAX, serial, modem) remain on an expansion bus behind an expansion bus interface.

so an external bus or other interconnect scheme is not needed, although there may also be an external cache. As will be discussed in Chapter 4, the use of a cache structure insulates the processor from a requirement to access main memory frequently. Hence, main memory can be moved off of the local bus onto a system bus. In this way, I/O transfers to and from the main memory across the system bus do not interfere with the processor's activity.

It is possible to connect I/O controllers directly onto the system bus. A more efficient solution is to make use of one or more expansion buses for this purpose. An expansion bus interface buffers data transfers between the system bus and the I/O controllers on the expansion bus. This arrangement allows the system to support a wide variety of I/O devices and at the same time insulate memory-to-processor traffic from I/O traffic.

Figure 3.17a shows some typical examples of I/O devices that might be attached to the expansion bus. Network connections include local area networks (LANs) such as a 10-Mbps Ethernet and connections to wide area networks (WANs) such as a packet-switching network. SCSI (small computer system interface) is itself a type of bus used to support local disk drives and other peripherals. A serial port could be used to support a printer or scanner.

This traditional bus architecture is reasonably efficient but begins to break down as higher and higher performance is seen in the I/O devices. In response to these growing demands, a common approach taken by industry is to build a high-speed bus that is closely integrated with the rest of the system, requiring only a bridge between the processor's bus and the high-speed bus. This arrangement is sometimes known as a mezzanine architecture.

Figure 3.17b shows a typical realization of this approach. Again, there is a local bus that connects the processor to a cache controller, which is in turn connected to a system bus that supports main memory. The cache controller is integrated into a bridge, or buffering device, that connects to the high-speed bus. This bus supports connections to high-speed LANs, such as Fast Ethernet at 100 Mbps, video and graphics workstation controllers, as well as interface controllers to local peripheral buses, including SCSI and FireWire. The latter is a high-speed bus arrangement specifically designed to support high-capacity I/O devices. Lower-speed devices are still supported off an expansion bus, with an interface buffering traffic between the expansion bus and the high-speed bus.

The advantage of this arrangement is that the high-speed bus brings high-demand devices into closer integration with the processor and at the same time is independent of the processor. Thus, differences in processor and high-speed bus speeds and signal line definitions are tolerated. Changes in processor architecture do not affect the high-speed bus, and vice versa.

Elements of Bus Design

Although a variety of different bus implementations exist, there are a few basic parameters or design elements that serve to classify and differentiate buses. Table 3.2 lists key elements.

BUS TYPES

Bus lines can be separated into two generic types: dedicated and multiplexed. A dedicated bus line is permanently assigned either to one function or to a physical subset of computer components. An example of functional dedication is the use of separate dedicated address and data lines, which is common on many buses. However, it is not essential. For example, address and data information may be transmitted over the same set of lines using an Address Valid control line. At the beginning of a data transfer, the address is placed on the bus and the Address Valid line is activated. At this point,

Table 3.2 Elements of Bus Design

Type: dedicated; multiplexed
Method of Arbitration: centralized; distributed
Timing: synchronous; asynchronous
Bus Width: address; data
Data Transfer Type: read; write; read-modify-write; read-after-write; block

each module has a specified period of time to copy the address and determine if it is the addressed module. The address is then removed from the bus, and the same bus connections are used for the subsequent read or write data transfer. This method of using the same lines for multiple purposes is known as time multiplexing.

The advantage of time multiplexing is the use of fewer lines, which saves space and, usually, cost. The disadvantage is that more complex circuitry is needed within each module. Also, there is a potential reduction in performance because certain events that share the same lines cannot take place in parallel.

Physical dedication refers to the use of multiple buses, each of which connects only a subset of modules. A typical example is the use of an I/O bus to interconnect all I/O modules; this bus is then connected to the main bus through some type of I/O adapter module. The potential advantage of physical dedication is high throughput, because there is less bus contention. A disadvantage is the increased size and cost of the system.

METHOD OF ARBITRATION

In all but the simplest systems, more than one module may need control of the bus. For example, an I/O module may need to read or write directly to memory, without sending the data to the processor. Because only one unit at a time can successfully transmit over the bus, some method of arbitration is needed. The various methods can be roughly classified as being either centralized arbitration or distributed arbitration. In a centralized scheme, a single hardware device, referred to as a bus controller or arbiter, is responsible for allocating time on the bus. The device may be a separate module or part of the processor. In a distributed scheme, there is no central controller. Rather, each module contains access control logic and the modules act together to share the bus. With both methods of arbitration, the purpose is to designate one device, either the processor or an I/O module, as master. The master may then initiate a data transfer (e.g., read or write) with some other device, which acts as slave for this particular exchange.

TIMING

Timing refers to the way in which events are coordinated on the bus. Buses use either synchronous timing or asynchronous timing. With synchronous timing, the occurrence of events on the bus is determined by a clock. The bus includes a clock line upon which a clock transmits a regular sequence of alternating 1s and 0s of equal duration. A single 1–0 transmission is referred to as a clock cycle or bus cycle and defines a time slot. All other devices on the bus can read the clock line, and all events start at the beginning of a clock cycle. Figure 3.18 shows a typical, but simplified, timing diagram for synchronous read and write operations (see Appendix N for a description of timing diagrams). Other bus signals may change at the leading edge of the clock signal (with a slight reaction delay). Most events occupy a single clock cycle.

In this simple example, the processor places a memory address on the address lines during the first clock cycle and may assert various status lines. Once the address lines have stabilized, the processor issues an address enable signal. For a read operation, the processor issues a read command at the start of the second cycle. A memory module recognizes the address and, after a delay of one cycle, places the data on the data lines. The processor reads the data from the data lines and drops the read signal.
For a write operation, the processor puts the data on the data lines at the start of the second cycle and issues a write command after the data lines have stabilized. The memory module copies the information from the data lines during the third clock cycle.
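A minimal sketch of this clocked behavior is shown below (invented code, not from the text); it assumes, as in Figure 3.18, that the memory places data on the bus one cycle after the read command.

# Sketch of a synchronous bus read over three clock cycles (names invented).

class SyncBus:
    def __init__(self):
        self.address = None
        self.read = False
        self.data = None

bus = SyncBus()
memory = {0x2A: 0x1234}
read_issued_at = None

for cycle in (1, 2, 3):                       # clock cycles T1, T2, T3
    if cycle == 1:
        bus.address = 0x2A                    # T1: place address, assert address enable
    elif cycle == 2:
        bus.read = True                       # T2: assert the read command
        read_issued_at = cycle
    elif bus.read and cycle == read_issued_at + 1:
        bus.data = memory[bus.address]        # T3: memory responds one cycle later
        bus.read = False                      # processor latches data, drops read
    print(f"T{cycle}: address={bus.address}, read={bus.read}, data={bus.data}")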

Figure 3.18 Timing of Synchronous Bus Operations. Over clock cycles T1 through T3, the traces show the clock, the status lines, the address lines (stable address), the address enable signal, the data lines (valid data in for the read cycle, valid data out for the write cycle), and the read and write command lines.

With asynchronous timing, the occurrence of one event on a bus follows and depends on the occurrence of a previous event. In the simple read example of Figure 3.19a, the processor places address and status signals on the bus. After pausing for these signals to stabilize, it issues a read command, indicating the presence of valid address and control signals. The appropriate memory decodes the address and responds by placing the data on the data line. Once the data lines have stabilized, the memory module asserts the acknowledge line to signal the processor that the data are available. Once the master has read the data from the data lines, it deasserts the read signal. This causes the memory module to drop the data and acknowledge lines. Finally, once the acknowledge line is dropped, the master removes the address information.

Figure 3.19 Timing of Asynchronous Bus Operations. (a) System bus read cycle: status and address signals stabilize, the read line is asserted, the slave places valid data on the data lines and asserts acknowledge, and the signals are then released in turn. (b) System bus write cycle: status, address, and data signals stabilize, the write line is asserted, and the slave asserts acknowledge after copying the data.

Figure 3.19b shows a simple asynchronous write operation. In this case, the master places the data on the data line at the same time that it puts signals on the status and address lines. The memory module responds to the write command by copying the data from the data lines and then asserting the acknowledge line. The master then drops the write signal and the memory module drops the acknowledge signal.

Synchronous timing is simpler to implement and test. However, it is less flexible than asynchronous timing. Because all devices on a synchronous bus are tied to a fixed clock rate, the system cannot take advantage of advances in device performance. With asynchronous timing, a mixture of slow and fast devices, using older and newer technology, can share a bus.
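The asynchronous read handshake of Figure 3.19a can be written out as an ordered sequence of events, each triggered by the previous one rather than by a clock edge. The sketch below is an invented illustration, not an actual bus implementation.

# Event-ordered sketch of the asynchronous read handshake (code invented).

signals = {"address": None, "read": False, "data": None, "ack": False}
memory = {0x10: 0xBEEF}
trace = []

def step(actor, action):
    trace.append(f"{actor}: {action:30s} -> {signals}")

signals["address"] = 0x10;                      step("master", "place address and status")
signals["read"] = True;                         step("master", "assert read")
signals["data"] = memory[signals["address"]];   step("slave", "place data on data lines")
signals["ack"] = True;                          step("slave", "assert acknowledge")
latched = signals["data"]
signals["read"] = False;                        step("master", "latch data, drop read")
signals["data"] = None; signals["ack"] = False; step("slave", "drop data and acknowledge")
signals["address"] = None;                      step("master", "remove address")

print("\n".join(trace))
print("word read:", hex(latched))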

3.5 POINT-TO-POINT INTERCONNECT

The shared bus architecture was the standard approach to interconnection between the processor and other components (memory, I/O, and so on) for decades. But contemporary systems increasingly rely on point-to-point interconnection rather than shared buses.

The principal reason driving the change from bus to point-to-point interconnect was the electrical constraints encountered with increasing the frequency of wide synchronous buses. At higher and higher data rates, it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion. Further, with the advent of multicore chips, with multiple processors and significant memory on a single chip, it was found that the use of a conventional shared bus on the same chip magnified the difficulties of increasing bus data rate and reducing bus latency to keep up with the processors. Compared to the shared bus, the point-to-point interconnect has lower latency, higher data rate, and better scalability.

In this section, we look at an important and representative example of the point-to-point interconnect approach: Intel's QuickPath Interconnect (QPI), which was introduced in 2008. The following are significant characteristics of QPI and other point-to-point interconnect schemes:

• Multiple direct connections: Multiple components within the system enjoy direct pairwise connections to other components. This eliminates the need for arbitration found in shared transmission systems.
• Layered protocol architecture: As found in network environments, such as TCP/IP-based data networks, these processor-level interconnects use a layered protocol architecture, rather than the simple use of control signals found in shared bus arrangements.
• Packetized data transfer: Data are not sent as a raw bit stream. Rather, data are sent as a sequence of packets, each of which includes control headers and error control codes.

Figure 3.20 illustrates a typical use of QPI on a multicore computer. The QPI links (indicated by the green arrow pairs in the figure) form a switching fabric

Figure 3.20 Multicore Configuration Using QPI. Four cores (A, B, C, D), each with DRAM attached over a dedicated memory bus, are interconnected by QPI links; QPI links also connect the cores to two I/O hubs, which in turn reach I/O devices over PCI Express.

that enables data to move throughout the network. Direct QPI connections can be established between each pair of core processors. If core A in Figure 3.20 needs to access the memory controller in core D, it sends its request through either cores B or C, which must in turn forward that request on to the memory controller in core D. Similarly, larger systems with eight or more processors can be built using processors with three links and routing traffic through intermediate processors.

In addition, QPI is used to connect to an I/O module, called an I/O hub (IOH). The IOH acts as a switch directing traffic to and from I/O devices. Typically in newer systems, the link from the IOH to the I/O device controller uses an interconnect technology called PCI Express (PCIe), described later in this chapter. The IOH translates between the QPI protocols and formats and the PCIe protocols and formats. A core also links to a main memory module (typically the memory uses dynamic random access memory (DRAM) technology) using a dedicated memory bus.

QPI is defined as a four-layer protocol architecture,3 encompassing the following layers (Figure 3.21):

• Physical: Consists of the actual wires carrying the signals, as well as circuitry and logic to support ancillary features required in the transmission and receipt of the 1s and 0s. The unit of transfer at the Physical layer is 20 bits, which is called a Phit (physical unit).

3The reader unfamiliar with the concept of a protocol architecture will find a brief overview in Appendix L.

Figure 3.21 QPI Layers (peer Protocol layers exchange packets, peer Link layers exchange flits, and peer Physical layers exchange phits; the Routing layer sits between the Protocol and Link layers)

• Link: Responsible for reliable transmission and flow control. The Link layer's unit of transfer is an 80-bit flit (flow control unit).
• Routing: Provides the framework for directing packets through the fabric.
• Protocol: The high-level set of rules for exchanging packets of data between devices. A packet consists of an integral number of flits.
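The sizes of these units of transfer nest neatly: a 20-bit phit, an 80-bit flit (four phits), and a packet built from a whole number of flits. The short C fragment below simply records those sizes as constants and checks a candidate packet length; all names are invented for this sketch and do not come from any Intel-provided definition.

/* Illustrative constants for the QPI units of transfer described above.
   The names are made up for this sketch and do not come from any
   Intel-provided header. */
#include <assert.h>

#define PHIT_BITS 20                            /* Physical layer unit */
#define FLIT_BITS 80                            /* Link layer unit     */
#define PHITS_PER_FLIT (FLIT_BITS / PHIT_BITS)  /* = 4                 */

/* A packet at the Protocol layer consists of a whole number of flits. */
static int is_valid_packet_length(int packet_bits)
{
    return packet_bits > 0 && packet_bits % FLIT_BITS == 0;
}

int main(void)
{
    assert(PHITS_PER_FLIT == 4);
    assert(is_valid_packet_length(2 * FLIT_BITS));   /* a two-flit packet */
    assert(!is_valid_packet_length(100));            /* not a whole flit  */
    return 0;
}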

QPI Physical Layer Figure 3.22 shows the physical architecture of a QPI port. The QPI port consists of 84 individual links grouped as follows. Each data path consists of a pair of wires that transmits data one bit at a time; the pair is referred to as a lane. There are 20 data lanes

COMPONENT A Intel QuickPath Interconnect Port

Transmission Lanes Reception Lanes Rcv Clk Fwd Clk

Reception Lanes Transmission Lanes Fwd Clk Rcv Clk

Intel QuickPath Interconnect Port COMPONENT B

Figure 3.22 Physical Interface of the Intel QPI Interconnect 96 CHAPTER 3 / A TOP-LEVEL VIEW OF COMPUTER FUNCTION

in each direction (transmit and receive), plus a clock lane in each direction. Thus, QPI is capable of transmitting 20 bits in parallel in each direction. The 20-bit unit is referred to as a phit. Typical signaling speeds of the link in current products call for operation at 6.4 GT/s (gigatransfers per second). At 20 bits per transfer, that adds up to 16 GB/s, and since QPI links involve dedicated bidirectional pairs, the total capacity is 32 GB/s. The lanes in each direction are grouped into four quadrants of 5 lanes each. In some applications, the link can also operate at half or quarter widths in order to reduce power consumption or work around failures.

The form of transmission on each lane is known as differential signaling, or balanced transmission. With balanced transmission, signals are transmitted as a current that travels down one conductor and returns on the other. The binary value depends on the voltage difference. Typically, one line has a positive voltage value and the other line has zero voltage, and one line is associated with binary 1 and the other line is associated with binary 0. Specifically, the technique used by QPI is known as low-voltage differential signaling (LVDS). In a typical implementation, the transmitter injects a small current into one wire or the other, depending on the logic level to be sent. The current passes through a resistor at the receiving end, and then returns in the opposite direction along the other wire. The receiver senses the polarity of the voltage across the resistor to determine the logic level.

Another function performed by the physical layer is managing the translation between 80-bit flits and 20-bit phits using a technique known as multilane distribution. The flits can be considered as a bit stream that is distributed across the data lanes in a round-robin fashion (first bit to first lane, second bit to second lane, etc.), as illustrated in Figure 3.23. This approach enables QPI to achieve very high data rates by implementing the physical link between two ports as multiple parallel channels.
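The round-robin multilane distribution just described is easy to sketch in code. The C fragment below takes one 80-bit flit, represented for clarity as an array of single bits, and deals the bits across the 20 data lanes so that each lane carries 4 of them (one per phit); the data layout and function names are illustrative only, not part of any QPI implementation.

#include <stdio.h>
#include <stdint.h>

#define LANES          20                       /* data lanes, one direction */
#define FLIT_BITS      80                       /* link layer unit           */
#define PHITS_PER_FLIT (FLIT_BITS / LANES)      /* 4 phits carry one flit    */

/* Deal the bits of one flit across the lanes round-robin:
   bit 0 to lane 0, bit 1 to lane 1, ..., bit 20 back to lane 0, and so on.
   lanes[l][t] is the bit sent on lane l during phit (transfer) t. */
static void multilane_distribute(const uint8_t flit[FLIT_BITS],
                                 uint8_t lanes[LANES][PHITS_PER_FLIT])
{
    for (int i = 0; i < FLIT_BITS; i++)
        lanes[i % LANES][i / LANES] = flit[i];
}

int main(void)
{
    uint8_t flit[FLIT_BITS];
    uint8_t lanes[LANES][PHITS_PER_FLIT];

    for (int i = 0; i < FLIT_BITS; i++)          /* dummy payload bits */
        flit[i] = (uint8_t)(i & 1);

    multilane_distribute(flit, lanes);

    /* Each 20-bit column lanes[0..19][t] is one phit on the wire. */
    printf("phit 0 on lane 0 carries flit bit 0 = %u\n", (unsigned)lanes[0][0]);
    return 0;
}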

Figure 3.23 QPI Multilane Distribution (the flit bit stream is dealt out round-robin across lanes 0 through 19: bit #1 to lane 0, bit #2 to lane 1, and so on, with bit #n+1 returning to lane 0, where n is the number of lanes)

QPI Link Layer The QPI link layer performs two key functions: flow control and error control. These functions are performed as part of the QPI link layer protocol, and operate at the level of the flit (flow control unit). Each flit consists of a 72-bit message payload and an 8-bit error control code called a cyclic redundancy check (CRC). We discuss error control codes in Chapter 5.

A flit payload may consist of data or message information. The data flits transfer the actual bits of data between cores or between a core and an IOH. The message flits are used for such functions as flow control, error control, and cache coherence. We discuss cache coherence in Chapters 5 and 17.

The flow control function is needed to ensure that a sending QPI entity does not overwhelm a receiving QPI entity by sending data faster than the receiver can process the data and clear buffers for more incoming data. To control the flow of data, QPI makes use of a credit scheme. During initialization, a sender is given a set number of credits to send flits to a receiver. Whenever a flit is sent to the receiver, the sender decrements its credit counter by one. Whenever a buffer is freed at the receiver, a credit is returned to the sender for that buffer. Thus, the receiver controls the pace at which data are transmitted over a QPI link.

Occasionally, a bit transmitted at the physical layer is changed during transmission, due to noise or some other phenomenon. The error control function at the link layer detects and recovers from such bit errors, and so isolates higher layers from experiencing bit errors. The procedure works as follows for a flow of data from system A to system B:

1. As mentioned, each 80-bit flit includes an 8-bit CRC field. The CRC is a function of the value of the remaining 72 bits. On transmission, A calculates a CRC value for each flit and inserts that value into the flit.
2. When a flit is received, B calculates a CRC value for the 72-bit payload and compares this value with the incoming CRC value in the flit. If the two CRC values do not match, an error has been detected.
3. When B detects an error, it sends a request to A to retransmit the flit that is in error. However, A may have had sufficient credit to send a stream of flits, so that additional flits have been transmitted after the flit in error and before A receives the request to retransmit. Therefore, the request is for A to back up and retransmit the damaged flit plus all subsequent flits.
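Both link-layer mechanisms lend themselves to a compact sketch: a sender that spends a credit per flit and regains credits as the receiver frees buffers, and a receiver that recomputes the flit CRC and flags a mismatch so the sender can back up and retransmit. The C fragment below is such a sketch; the CRC-8 polynomial is a common one chosen only as a stand-in, and all structure and function names are ours rather than anything defined by the QPI specification.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative CRC-8 over the 72-bit (9-byte) flit payload. The polynomial
   used here (x^8 + x^2 + x + 1) is a common CRC-8 chosen as a placeholder;
   QPI's actual error control code is defined by Intel's specification. */
static uint8_t crc8(const uint8_t payload[9])
{
    uint8_t crc = 0;
    for (int i = 0; i < 9; i++) {
        crc ^= payload[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07) : (uint8_t)(crc << 1);
    }
    return crc;
}

struct flit {                     /* 80 bits: 72-bit payload + 8-bit CRC */
    uint8_t payload[9];
    uint8_t crc;
};

struct sender { int credits; };   /* credits granted at initialization */

/* Send one flit if a credit is available; each flit sent consumes one credit. */
static bool send_flit(struct sender *s, struct flit *f)
{
    if (s->credits == 0)
        return false;             /* must wait for the receiver to return a credit */
    f->crc = crc8(f->payload);
    s->credits--;
    return true;
}

static void credit_returned(struct sender *s) { s->credits++; }

/* Receiver: recompute the CRC; a mismatch means "ask the sender to back up
   and retransmit this flit and everything sent after it". */
static bool flit_ok(const struct flit *f) { return crc8(f->payload) == f->crc; }

int main(void)
{
    struct sender s = { .credits = 2 };
    struct flit f = { .payload = { 1, 2, 3, 4, 5, 6, 7, 8, 9 } };

    send_flit(&s, &f);
    printf("flit accepted: %d, credits left: %d\n", flit_ok(&f), s.credits);
    f.payload[0] ^= 1;                       /* simulate a bit error in transit */
    printf("flit accepted after bit error: %d\n", flit_ok(&f));
    credit_returned(&s);                     /* receiver freed a buffer */
    return 0;
}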

QPI Routing Layer The Routing layer is used to determine the course that a packet will traverse across the available system interconnects. Routing tables are defined by firmware and describe the possible paths that a packet can follow. In small configurations, such as a two-socket platform, the routing options are limited and the routing tables quite simple. For larger systems, the routing table options are more complex, giving the flexibility of routing and rerouting traffic depending on (1) how devices are populated in the platform, (2) how system resources are partitioned, and (3) how reliability events result in mapping around a failing resource.

QPI Protocol Layer In this layer, the packet is defined as the unit of transfer. The packet contents definition is standardized with some flexibility allowed to meet differing market

segment requirements. One key function performed at this level is a cache coherency protocol, which deals with making sure that main memory values held in multiple caches are consistent. A typical data packet payload is a block of data being sent to or from a cache.

3.6 PCI EXPRESS

The peripheral component interconnect (PCI) is a popular high-bandwidth, processor-independent bus that can function as a mezzanine or peripheral bus. Compared with other common bus specifications, PCI delivers better system performance for high-speed I/O subsystems (e.g., graphic display adapters, network interface controllers, and disk controllers). Intel began work on PCI in 1990 for its Pentium-based systems. Intel soon released all the patents to the public domain and promoted the creation of an industry association, the PCI Special Interest Group (SIG), to develop further and maintain the compatibility of the PCI specifications. The result is that PCI has been widely adopted and is finding increasing use in personal computer, workstation, and server systems. Because the specification is in the public domain and is supported by a broad cross section of the microprocessor and peripheral industry, PCI products built by different vendors are compatible.

As with the system bus discussed in the preceding sections, the bus-based PCI scheme has not been able to keep pace with the data rate demands of attached devices. Accordingly, a new version, known as PCI Express (PCIe), has been developed. PCIe, as with QPI, is a point-to-point interconnect scheme intended to replace bus-based schemes such as PCI.

A key requirement for PCIe is high capacity to support the needs of higher data rate I/O devices, such as Gigabit Ethernet. Another requirement deals with the need to support time-dependent data streams. Applications such as video-on-demand and audio redistribution are putting real-time constraints on servers too. Many communications applications and embedded PC control systems also process data in real time. Today's platforms must also deal with multiple concurrent transfers at ever-increasing data rates. It is no longer acceptable to treat all data as equal; it is more important, for example, to process streaming data first, since late real-time data is as useless as no data. Data needs to be tagged so that an I/O system can prioritize its flow throughout the platform.

PCIe Physical and Logical Architecture Figure 3.24 shows a typical configuration that supports the use of PCIe. A root complex device, also referred to as a chipset or a host bridge, connects the processor and memory subsystem to the PCI Express switch fabric, which comprises one or more PCIe devices and PCIe switch devices. The root complex acts as a buffering device, to deal with differences in data rates between I/O controllers and memory and processor components. The root complex also translates between PCIe transaction formats and the processor and memory signal and control requirements. The chipset will typically support multiple PCIe ports, some of which attach directly to a PCIe device and one or more of which attach to a switch that manages multiple PCIe streams.

Figure 3.24 Typical Configuration Using PCIe (core processors and memory attach to a chipset; PCIe links from the chipset go to a Gigabit Ethernet device, a PCIe–PCI bridge, and a PCIe switch, which fans out to legacy and PCIe endpoints)

PCIe links from the chipset may attach to the following kinds of devices that implement PCIe:

• Switch: The switch manages multiple PCIe streams.
• PCIe endpoint: An I/O device or controller that implements PCIe, such as a Gigabit Ethernet switch, a graphics or video controller, a disk interface, or a communications controller.
• Legacy endpoint: The legacy endpoint category is intended for existing designs that have been migrated to PCI Express, and it allows legacy behaviors such as use of I/O space and locked transactions. PCI Express endpoints are not permitted to require the use of I/O space at runtime and must not use locked transactions. By distinguishing these categories, it is possible for a system designer to restrict or eliminate legacy behaviors that have negative impacts on system performance and robustness.
• PCIe/PCI bridge: Allows older PCI devices to be connected to PCIe-based systems.

As with QPI, PCIe interactions are defined using a protocol architecture. The PCIe protocol architecture encompasses the following layers (Figure 3.25):

Figure 3.25 PCIe Protocol Layers (peer Transaction layers exchange transaction layer packets (TLPs), peer Data Link layers exchange data link layer packets (DLLPs), and the Physical layers connect to the link)

• Physical: Consists of the actual wires carrying the signals, as well as circuitry and logic to support ancillary features required in the transmission and receipt of the 1s and 0s.
• Data link: Is responsible for reliable transmission and flow control. Data packets generated and consumed by the DLL are called Data Link Layer Packets (DLLPs).
• Transaction: Generates and consumes data packets used to implement load/store data transfer mechanisms and also manages the flow control of those packets between the two components on a link. Data packets generated and consumed by the TL are called Transaction Layer Packets (TLPs).

Above the TL are software layers that generate read and write requests that are transported by the transaction layer to the I/O devices using a packet-based transaction protocol.

PCIe Physical Layer Similar to QPI, PCIe is a point-to-point architecture. Each PCIe port consists of a number of bidirectional lanes (note that in QPI, the lane refers to transfer in one direction only). Transfer in each direction in a lane is by means of differential signaling over a pair of wires. A PCIe port can provide 1, 4, 8, 16, or 32 lanes. In what follows, we refer to the PCIe 3.0 specification, introduced in late 2010.

As with QPI, PCIe uses a multilane distribution technique. Figure 3.26 shows an example for a PCIe port consisting of four lanes. Data are distributed to the four lanes 1 byte at a time using a simple round-robin scheme. At each physical lane, data are buffered and processed 16 bytes (128 bits) at a time. Each block of 128 bits is encoded into a unique 130-bit codeword for transmission; this is referred to as 128b/130b encoding. Thus, the effective data rate of an individual lane is reduced by a factor of 128/130.

Figure 3.26 PCIe Multilane Distribution (bytes B0, B1, B2, B3, ... of the byte stream are dealt round-robin to lanes 0 through 3; each lane buffers its bytes and applies 128b/130b encoding)

To understand the rationale for the 128b/130b encoding, note that unlike QPI, PCIe does not use its clock line to synchronize the bit stream. That is, the clock line is not used to determine the start and end point of each incoming bit; it is used for other signaling purposes only. However, it is necessary for the receiver to be synchronized with the transmitter, so that the receiver knows when each bit begins and ends. If there is any drift between the clocks used for bit transmission and reception at the transmitter and receiver, errors may occur. To compensate for the possibility of drift, PCIe relies on the receiver synchronizing with the transmitter based on the transmitted signal. As with QPI, PCIe uses differential signaling over a pair of wires. Synchronization can be achieved by the receiver looking for transitions in the data and synchronizing its clock to the transitions. However, consider that with a long string of 1s or 0s using differential signaling, the output is a constant voltage over a long period of time. Under these circumstances, any drift between the clocks of transmitter and receiver will result in loss of synchronization between the two.

A common approach, and the one used in PCIe 3.0, to overcoming the problem of a long string of bits of one value is scrambling. Scrambling, which does not increase the number of bits to be transmitted, is a mapping technique that tends to make the data appear more random. The scrambling tends to spread out the number of transitions so that they appear at the receiver more uniformly spaced, which is good for synchronization. Also, other transmission properties, such as spectral properties, are enhanced if the data are more nearly of a random nature rather than constant or repetitive. For more discussion of scrambling, see Appendix M.

Another technique that can aid in synchronization is encoding, in which additional bits are inserted into the bit stream to force transitions. For PCIe 3.0, each group of 128 bits of input is mapped into a 130-bit block by adding a 2-bit block sync header. The value of the header is 10 for a data block and 01 for what is called an ordered set block, which refers to a link-level information block.

Figure 3.27 illustrates the use of scrambling and encoding. Data to be transmitted are fed into a scrambler. The scrambled output is then fed into a 128b/130b encoder, which buffers 128 bits and then maps the 128-bit block into a 130-bit block. This block then passes through a parallel-to-serial converter and is transmitted one bit at a time using differential signaling. At the receiver, a clock is synchronized to the incoming data to recover the bit stream. This then passes through a serial-to-parallel converter to produce a stream of 130-bit blocks. Each block is passed through a 128b/130b decoder to recover the original scrambled bit pattern, which is then descrambled to produce the original bit stream.

Using these techniques, a data rate of 16 GB/s can be achieved. One final detail to mention: each transmission of a block of data over a PCIe link begins and ends with an 8-bit framing sequence intended to give the receiver time to synchronize with the incoming physical layer bit stream.
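The arithmetic behind the 16-GB/s figure is easy to reproduce. The short C program below assumes the PCIe 3.0 raw signaling rate of 8 GT/s per lane and a 16-lane link (one of the widths listed earlier), and applies the 128/130 encoding overhead discussed above; the variable names are ours, and the result is per direction.

#include <stdio.h>

int main(void)
{
    /* Assumptions (not stated in the text): PCIe 3.0 signals at 8 GT/s
       per lane, and the link is 16 lanes wide. */
    const double transfers_per_sec   = 8e9;             /* raw, per lane      */
    const int    lanes               = 16;
    const double encoding_efficiency = 128.0 / 130.0;   /* 128b/130b overhead */

    /* One transfer carries one bit per lane; divide by 8 for bytes. */
    double lane_bytes_per_sec = transfers_per_sec * encoding_efficiency / 8.0;
    double link_bytes_per_sec = lane_bytes_per_sec * lanes;

    printf("per lane : %.3f GB/s\n", lane_bytes_per_sec / 1e9);
    printf("x16 link : %.2f GB/s per direction\n", link_bytes_per_sec / 1e9);
    /* Prints roughly 0.985 GB/s per lane and 15.75 GB/s for the x16 link,
       i.e., the "16 GB/s" cited in the text. */
    return 0;
}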

PCIe Transaction Layer The transaction layer (TL) receives read and write requests from the software above the TL and creates request packets for transmission to a destination via the link layer. Most transactions use a split transaction technique, which works in the following fashion. A request packet is sent out by a source PCIe device, which then waits for a response, called a completion packet. The completion following a request is initiated by the completer only when it has the data and/or status ready for delivery. Each packet has a unique identifier that enables completion packets to be directed

Figure 3.27 PCIe Transmit and Receive Block Diagrams: (a) Transmitter (scrambler, 128b/130b encoding, parallel-to-serial conversion, differential driver); (b) Receiver (differential receiver with clock recovery, serial-to-parallel conversion, 128b/130b decoding, descrambler)

to the correct originator. With the split transaction technique, the completion is separated in time from the request, in contrast to a typical bus operation in which both sides of a transaction must be available to seize and use the bus. Between the request and the completion, other PCIe traffic may use the link. TL messages and some write transactions are posted transactions, meaning that no response is expected.

The TL packet format supports 32-bit memory addressing and extended 64-bit memory addressing. Packets also have attributes such as "no-snoop," "relaxed ordering," and "priority," which may be used to optimally route these packets through the I/O subsystem.

ADDRESS SPACES AND TRANSACTION TYPES The TL supports four address spaces:

• Memory: The memory space includes system main memory. It also includes PCIe I/O devices. Certain ranges of memory addresses map into I/O devices.
• I/O: This is used for legacy PCI devices, with reserved memory address ranges used to address legacy I/O devices.

• Configuration: This address space enables the TL to read/write configuration registers associated with I/O devices.
• Message: This address space is for control signals related to interrupts, error handling, and power management.

Table 3.3 shows the transaction types provided by the TL. For memory, I/O, and configuration address spaces, there are read and write transactions. In the case of memory transactions, there is also a read lock request function. Locked operations occur as a result of device drivers requesting atomic access to registers on a PCIe device. A device driver, for example, can atomically read, modify, and then write to a device register. To accomplish this, the device driver causes the processor to execute an instruction or set of instructions. The root complex converts these processor instructions into a sequence of PCIe transactions, which perform individual read and write requests for the device driver. If these transactions must be executed atomically, the root complex locks the PCIe link while executing the transactions. This locking prevents transactions that are not part of the sequence from occurring. This sequence of transactions is called a locked operation.

Table 3.3 PCIe TLP Transaction Types

Address Space: Memory
TLP Types: Memory Read Request; Memory Read Lock Request; Memory Write Request
Purpose: Transfer data to or from a location in the system memory map.

Address Space: I/O
TLP Types: I/O Read Request; I/O Write Request
Purpose: Transfer data to or from a location in the system memory map for legacy devices.

Address Space: Configuration
TLP Types: Config Type 0 Read Request; Config Type 0 Write Request; Config Type 1 Read Request; Config Type 1 Write Request
Purpose: Transfer data to or from a location in the configuration space of a PCIe device.

Address Space: Message
TLP Types: Message Request; Message Request with Data
Purpose: Provides in-band messaging and event reporting.

Address Space: Memory, I/O, Configuration
TLP Types: Completion; Completion with Data; Completion Locked; Completion Locked with Data
Purpose: Returned for certain requests.

The particular set of processor instructions that can cause a locked operation to occur depends on the system chipset and processor architecture.

To maintain compatibility with PCI, PCIe supports both Type 0 and Type 1 configuration cycles. A Type 1 cycle propagates downstream until it reaches the bridge interface hosting the bus (link) that the target device resides on. The configuration transaction is converted on the destination link from Type 1 to Type 0 by the bridge. Finally, completion messages are used with split transactions for memory, I/O, and configuration transactions.

TLP PACKET ASSEMBLY PCIe transactions are conveyed using transaction layer packets, which are illustrated in Figure 3.28a. A TLP originates in the

Figure 3.28 PCIe Protocol Data Unit Format: (a) Transaction Layer Packet (a 1-octet STP framing field appended by the physical layer; a 2-octet sequence number and 4-octet LCRC appended by the data link layer; and a 12- or 16-octet header, 0 to 4096 octets of data, and an optional 4-octet ECRC created by the transaction layer); (b) Data Link Layer Packet (Start framing, a 4-octet DLLP created by the DLL, a 2-octet CRC, and End framing)

transaction layer of the sending device and terminates at the transaction layer of the receiving device. Upper-layer software sends to the TL the information needed for the TL to create the core of the TLP, which consists of the following fields:

• Header: The Header describes the type of packet and includes information needed by the receiver to process the packet, including any needed routing information. The internal header format is discussed subsequently.
• Data: A Data field of up to 4096 bytes may be included in the TLP. Some TLPs do not contain a Data field.
• ECRC: An optional end-to-end CRC field enables the destination TL layer to check for errors in the Header and Data portions of the TLP.

An example of a TLP header format, used for a memory request transaction, is shown in Figure 3.29. The fields shaded green indicate fields that are present in all headers. In addition to fields reserved for future use (R), these fields include the following:

• Length: Length of the Data field in double words (DW), where one DW = 4 bytes.
• Attributes: Consists of two bits. The relaxed ordering bit indicates whether strict or relaxed ordering is used. With relaxed ordering, a transaction may be completed prior to other transactions that were already enqueued. The no snoop bit, when set, indicates that no cache coherency issues exist with respect to this TLP.
• EP: Poisoned data bit. If set, this bit indicates that the data in this TLP should be considered invalid, although the transaction is being allowed to complete normally.
• TD: TLP digest field present. If set, indicates that the ECRC field is present.
• Traffic Class: A 3-bit traffic class can be assigned to a traffic flow to enable PCIe to prioritize service.
• Type, Format: These two fields, totaling 7 bits, specify transaction type, header size, and whether a data field is present.

Figure 3.29 TLP Memory Request Format (a 16-octet header, 32 bits per row: Fmt, Type, Traffic Class, TD, EP, Attr, and Length; Requestor ID, Tag, Last DW BE, and First DW BE; Address [63:32]; Address [31:2] plus a reserved field)

• First DW Byte Enables: These four bits indicate, respectively, whether the corresponding byte in the first DW is valid.
• Last DW Byte Enables: These four bits indicate, respectively, whether the corresponding byte in the last DW is valid. This and the preceding field have the effect of allowing transfers smaller than a full DW and of offsetting the start and end addresses from the DW boundary.

Figure 3.29 shows a TLP header for a memory request transaction. The Requestor ID identifies the memory requestor, telling the completer where to send its response. The Tag is a number assigned to this transaction by the requestor; the completer must include this Tag in its response so that the requestor can match request and response. The Address field indicates the starting memory address to be read from.
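To make the Length and byte-enable fields concrete, the following C sketch computes them for a read of an arbitrary number of bytes starting at an arbitrary byte address, and records the Tag a requester would later use to match the completion. The structure is an unpacked, illustrative view of these fields, not the literal on-the-wire layout of the 16-octet header in Figure 3.29.

#include <stdint.h>
#include <stdio.h>

/* Illustrative, unpacked view of the memory read request fields discussed
   above; all names here are ours. */
struct mem_read_req {
    uint64_t address;        /* starting address, DW aligned        */
    uint16_t length_dw;      /* Length field: data size in DWs      */
    uint8_t  first_dw_be;    /* First DW Byte Enables (low 4 bits)  */
    uint8_t  last_dw_be;     /* Last DW Byte Enables  (low 4 bits)  */
    uint8_t  tag;            /* echoed back in the completion       */
};

/* Build a read of nbytes (>= 1) starting at an arbitrary byte address. */
static struct mem_read_req make_read(uint64_t byte_addr, uint32_t nbytes, uint8_t tag)
{
    struct mem_read_req r;
    uint32_t first_off = (uint32_t)(byte_addr & 3);   /* offset into first DW */
    uint32_t end       = first_off + nbytes;          /* bytes spanned        */
    uint32_t ndw       = (end + 3) / 4;               /* DWs touched          */
    uint32_t last_len  = end - (ndw - 1) * 4;         /* valid bytes, last DW */
    uint8_t  lead_mask = (uint8_t)((0x0F << first_off) & 0x0F);
    uint8_t  tail_mask = (uint8_t)(0x0F >> (4 - last_len));

    r.address   = byte_addr & ~(uint64_t)3;           /* DW-aligned address   */
    r.length_dw = (uint16_t)ndw;
    r.tag       = tag;
    if (ndw == 1) {              /* single DW: one mask does both jobs, and
                                    the Last DW BE field is left as 0000      */
        r.first_dw_be = lead_mask & tail_mask;
        r.last_dw_be  = 0;
    } else {
        r.first_dw_be = lead_mask;
        r.last_dw_be  = tail_mask;
    }
    return r;
}

int main(void)
{
    /* Read 5 bytes starting at byte address 0x1002: 2 DWs are touched,
       first DW enables = 1100b, last DW enables = 0111b. */
    struct mem_read_req r = make_read(0x1002, 5, /* tag = */ 7);
    printf("len=%u DW firstBE=0x%X lastBE=0x%X tag=%u\n",
           (unsigned)r.length_dw, (unsigned)r.first_dw_be,
           (unsigned)r.last_dw_be, (unsigned)r.tag);
    /* A completion that later arrives carrying tag 7 is matched by the
       requester to this outstanding request. */
    return 0;
}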

PCIe Data Link Layer The purpose of the PCIe data link layer is to ensure reliable delivery of packets across the PCIe link. The DLL participates in the formation of TLPs and also trans- mits DLLPs. DATA LINK LAYER PACKETS Data link layer packets originate at the data link layer of a transmitting device and terminate at the DLL of the device on the other end of the link. Figure 3.29b shows the format of a DLLP. There are three important groups of DLLPs used in managing a link: flow control packets, power management packets, and TLP ACK and NAK packets. Power management packets are used in managing power platform budgeting. Flow control packets regulate the rate at which TLPs and DLLPs can be transmitted across a link. The ACK and NAK packets are used in TLP processing, discussed in the following paragraphs. TRANSACTION LAYER PACKET PROCESSING The DLL adds two fields to the core of the TLP created by the TL (Figure 3.29a): a 16-bit sequence number and a 32-bit link-layer CRC (LCRC). Whereas the core fields created at the TL are only used at the destination TL, the two fields added by the DLL are processed at each intermediate node on the way from source to destination. When a TLP arrives at a device, the DLL strips off the sequence number and LCRC fields and checks the LCRC. There are two possibilities: 1. If no errors are detected, the core portion of the TLP is handed up to the local transaction layer. If this receiving device is the intended destination, then the TL processes the TLP. Otherwise, the TL determines a route for the TLP and passes it back down to the DLL for transmission over the next link on the way to the destination. 2. If an error is detected, the DLL schedules an NAK DLL packet to return back to the remote transmitter. The TLP is eliminated. When the DLL transmits a TLP, it retains a copy of the TLP. If it receives an NAK for the TLP with this sequence number, it retransmits the TLP. When it receives an ACK, it discards the buffered TLP. 108 CHAPTER 3 / A TOP-LEVEL VIEW OF COMPUTER FUNCTION

3.7 RECOMMENDED READING

[SING10] provides a good overview of QPI. For a thorough discussion, see [MADD09]. [KOLB05] is a good overview of PCIe. The clearest book-length description of PCIe is [WILE03].

KOLB05 Kolbehdari, M., et al. "The Emergence of PCI Express* in the Next Generation of Mobile Platforms." Intel Technology Journal, February 2005.
MADD09 Maddox, R., et al. Weaving High Performance Multiprocessor Fabric: Architectural Insights to the Intel QuickPath Interconnect. Hillsboro, OR: Intel Press, 2009.
SING10 Singh, G., et al. "The Feeding of High-Performance Processor Cores—QuickPath Interconnects and the New I/O Hubs." Intel Technology Journal, September 2010.
WILE03 Wilen, A.; Schade, J.; and Thornburg, R. Introduction to PCI Express—A Hardware and Software Developers Guide. Hillsboro, OR: Intel Press, 2003.

3.8 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

address bus, address lines, arbitration, asynchronous timing, balanced transmission, bus, bus width, centralized arbitration, control lines, data bus, data lines, differential signaling, disabled interrupt, distributed arbitration, error control function, execute cycle, fetch cycle, flit, flow control function, instruction cycle, interrupt, interrupt handler, interrupt service routine (ISR), lane, memory address register (MAR), memory buffer register (MBR), multilane distribution, packet, PCI Express (PCIe), peripheral component interconnect (PCI), phit, QuickPath Interconnect (QPI), root complex, synchronous timing, system bus

Review Questions
3.1 What general categories of functions are specified by computer instructions?
3.2 List and briefly define the possible states that define an instruction execution.
3.3 List and briefly define two approaches to dealing with multiple interrupts.
3.4 What types of transfers must a computer's interconnection structure (e.g., bus) support?
3.5 What is the benefit of using a multiple-bus architecture compared to a single-bus architecture?
3.6 List and briefly define the QPI protocol layers.
3.7 List and briefly define the PCIe protocol layers.

Problems
3.1 The hypothetical machine of Figure 3.4 also has two I/O instructions:
0011 = Load AC from I/O
0111 = Store AC to I/O
In these cases, the 12-bit address identifies a particular I/O device. Show the program execution (using the format of Figure 3.5) for the following program:
1. Load AC from device 5.
2. Add contents of memory location 940.
3. Store AC to device 6.
Assume that the next value retrieved from device 5 is 3 and that location 940 contains a value of 2.
3.2 The program execution of Figure 3.5 is described in the text using six steps. Expand this description to show the use of the MAR and MBR.
3.3 Consider a hypothetical 32-bit microprocessor having 32-bit instructions composed of two fields: the first byte contains the opcode and the remainder the immediate operand or an operand address.
a. What is the maximum directly addressable memory capacity (in bytes)?
b. Discuss the impact on the system speed if the microprocessor bus has
1. a 32-bit local address bus and a 16-bit local data bus, or
2. a 16-bit local address bus and a 16-bit local data bus.
c. How many bits are needed for the program counter and the instruction register?
3.4 Consider a hypothetical microprocessor generating a 16-bit address (for example, assume that the program counter and the address registers are 16 bits wide) and having a 16-bit data bus.
a. What is the maximum memory address space that the processor can access directly if it is connected to a "16-bit memory"?
b. What is the maximum memory address space that the processor can access directly if it is connected to an "8-bit memory"?
c. What architectural features will allow this microprocessor to access a separate "I/O space"?
d. If an input and an output instruction can specify an 8-bit I/O port number, how many 8-bit I/O ports can the microprocessor support? How many 16-bit I/O ports? Explain.
3.5 Consider a 32-bit microprocessor, with a 16-bit external data bus, driven by an 8-MHz input clock. Assume that this microprocessor has a bus cycle whose minimum duration equals four input clock cycles. What is the maximum data transfer rate across the bus that this microprocessor can sustain, in bytes/s? To increase its performance, would it be better to make its external data bus 32 bits or to double the external clock frequency supplied to the microprocessor? State any other assumptions you make, and explain. Hint: Determine the number of bytes that can be transferred per bus cycle.
3.6 Consider a computer system that contains an I/O module controlling a simple keyboard/printer teletype. The following registers are contained in the processor and connected directly to the system bus:
INPR: Input Register, 8 bits
OUTR: Output Register, 8 bits
FGI: Input Flag, 1 bit
FGO: Output Flag, 1 bit
IEN: Interrupt Enable, 1 bit
Keystroke input from the teletype and printer output to the teletype are controlled by the I/O module. The teletype is able to encode an alphanumeric symbol to an 8-bit word and decode an 8-bit word into an alphanumeric symbol.

a. Describe how the processor, using the first four registers listed in this problem, can achieve I/O with the teletype.
b. Describe how the function can be performed more efficiently by also employing IEN.
3.7 Consider two microprocessors having 8- and 16-bit-wide external data buses, respectively. The two processors are identical otherwise and their bus cycles take just as long.
a. Suppose all instructions and operands are two bytes long. By what factor do the maximum data transfer rates differ?
b. Repeat assuming that half of the operands and instructions are one byte long.
3.8 Figure 3.30 indicates a distributed arbitration scheme that can be used with an obsolete bus scheme known as Multibus I. Agents are daisy-chained physically in priority order. The left-most agent in the diagram receives a constant bus priority in (BPRN) signal indicating that no higher-priority agent desires the bus. If the agent does not require the bus, it asserts its bus priority out (BPRO) line. At the beginning of a clock cycle, any agent can request control of the bus by lowering its BPRO line. This lowers the BPRN line of the next agent in the chain, which is in turn required to lower its BPRO line. Thus, the signal is propagated the length of the chain. At the end of this chain reaction, there should be only one agent whose BPRN is asserted and whose BPRO is not. This agent has priority. If, at the beginning of a bus cycle, the bus is not busy (BUSY inactive), the agent that has priority may seize control of the bus by asserting the BUSY line.
It takes a certain amount of time for the BPR signal to propagate from the highest-priority agent to the lowest. Must this time be less than the clock cycle? Explain.
3.9 The VAX SBI bus uses a distributed, synchronous arbitration scheme. Each SBI device (i.e., processor, memory, I/O module) has a unique priority and is assigned a unique transfer request (TR) line. The SBI has 16 such lines (TR0, TR1, …, TR15), with TR0 having the highest priority. When a device wants to use the bus, it places a reservation for a future time slot by asserting its TR line during the current time slot. At the end of the current time slot, each device with a pending reservation examines the TR lines; the highest-priority device with a reservation uses the next time slot. A maximum of 17 devices can be attached to the bus. The device with priority 16 has no TR line. Why not?
3.10 On the VAX SBI, the lowest-priority device usually has the lowest average wait time. For this reason, the processor is usually given the lowest priority on the SBI. Why does the priority 16 device usually have the lowest average wait time? Under what circumstances would this not be true?
3.11 For a synchronous read operation (Figure 3.18), the memory module must place the data on the bus sufficiently ahead of the falling edge of the Read signal to allow for

Figure 3.30 Multibus I Distributed Arbitration (Masters 1 through 3 daisy-chained between bus terminators; each master's BPRO output feeds the next master's BPRN input, with Master 1 at the highest priority and Master 3 at the lowest)

signal settling. Assume a microprocessor bus is clocked at 10 MHz and that the Read signal begins to fall in the middle of the second half of T3.
a. Determine the length of the memory read instruction cycle.
b. When, at the latest, should memory data be placed on the bus? Allow 20 ns for the settling of data lines.
3.12 Consider a microprocessor that has a memory read timing as shown in Figure 3.18. After some analysis, a designer determines that the memory falls short of providing read data on time by about 180 ns.
a. How many wait states (clock cycles) need to be inserted for proper system operation if the bus clocking rate is 8 MHz?
b. To enforce the wait states, a Ready status line is employed. Once the processor has issued a Read command, it must wait until the Ready line is asserted before attempting to read data. At what time interval must we keep the Ready line low in order to force the processor to insert the required number of wait states?
3.13 A microprocessor has a memory write timing as shown in Figure 3.18. Its manufacturer specifies that the width of the Write signal can be determined by T – 50, where T is the clock period in ns.
a. What width should we expect for the Write signal if the bus clocking rate is 5 MHz?
b. The data sheet for the microprocessor specifies that the data remain valid for 20 ns after the falling edge of the Write signal. What is the total duration of valid data presentation to memory?
c. How many wait states should we insert if memory requires valid data presentation for at least 190 ns?
3.14 A microprocessor has an increment memory direct instruction, which adds 1 to the value in a memory location. The instruction has five stages: fetch opcode (four bus clock cycles), fetch operand address (three cycles), fetch operand (three cycles), add 1 to operand (three cycles), and store operand (three cycles).
a. By what amount (in percent) will the duration of the instruction increase if we have to insert two bus wait states in each memory read and memory write operation?
b. Repeat assuming that the increment operation takes 13 cycles instead of 3 cycles.
3.15 The microprocessor has a read bus timing similar to that of Figure 3.18, but requires four processor clock cycles. The valid data is on the bus for an amount of time that extends into the fourth processor clock cycle. Assume a processor clock rate of 8 MHz.
a. What is the maximum data transfer rate?
b. Repeat but assume the need to insert one wait state per byte transferred.
3.16 The Intel 8086 is a 16-bit processor similar in many ways to the 8-bit 8088. The 8086 uses a 16-bit bus that can transfer 2 bytes at a time, provided that the lower-order byte has an even address. However, the 8086 allows both even- and odd-aligned word operands. If an odd-aligned word is referenced, two memory cycles, each consisting of four bus cycles, are required to transfer the word. Consider an instruction on the 8086 that involves two 16-bit operands. How long does it take to fetch the operands? Give the range of possible answers. Assume a clocking rate of 4 MHz and no wait states.
3.17 Consider a 32-bit microprocessor whose bus cycle is the same duration as that of a 16-bit microprocessor. Assume that, on average, 20% of the operands and instructions are 32 bits long, 40% are 16 bits long, and 40% are only 8 bits long. Calculate the improvement achieved when fetching instructions and operands with the 32-bit microprocessor.
3.18 The microprocessor of Problem 3.14 initiates the fetch operand stage of the increment memory direct instruction at the same time that a keyboard activates an interrupt request line. After how long does the processor enter the interrupt processing cycle? Assume a bus clocking rate of 10 MHz.

CHAPTER 4
CACHE MEMORY

4.1 Computer Memory System Overview
Characteristics of Memory Systems
The Memory Hierarchy
4.2 Cache Memory Principles
4.3 Elements of Cache Design
Cache Addresses
Cache Size
Mapping Function
Replacement Algorithms
Write Policy
Line Size
Number of Caches
4.4 Pentium 4 Cache Organization
4.5 ARM Cache Organization
4.6 Recommended Reading
4.7 Key Terms, Review Questions, and Problems
Appendix 4A Performance Characteristics of Two-Level Memories
Locality
Operation of Two-Level Memory
Performance


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Present an overview of the main characteristics of computer memory systems and the use of a memory hierarchy.
• Describe the basic concepts and intent of cache memory.
• Discuss the key elements of cache design.
• Distinguish among direct mapping, associative mapping, and set-associative mapping.
• Explain the reasons for using multiple levels of cache.
• Understand the performance implications of multiple levels of memory.

Although seemingly simple in concept, computer memory exhibits perhaps the widest range of type, technology, organization, performance, and cost of any feature of a computer system. No single technology is optimal in satisfying the memory requirements for a computer system. As a consequence, the typical computer system is equipped with a hierarchy of memory subsystems, some internal to the system (directly accessible by the processor) and some external (accessible by the processor via an I/O module). This chapter and the next focus on internal memory elements, while Chapter 6 is devoted to external memory. To begin, the first section examines key characteristics of computer memories. The remainder of the chapter examines an essential element of all modern computer systems: cache memory.

4.1 COMPUTER MEMORY SYSTEM OVERVIEW

Characteristics of Memory Systems The complex subject of computer memory is made more manageable if we classify memory systems according to their key characteristics. The most important of these are listed in Table 4.1.

The term location in Table 4.1 refers to whether memory is internal or external to the computer. Internal memory is often equated with main memory. But there are other forms of internal memory. The processor requires its own local memory, in the form of registers (e.g., see Figure 2.3). Further, as we shall see, the control unit portion of the processor may also require its own internal memory. We will defer discussion of these latter two types of internal memory to later chapters. Cache is another form of internal memory. External memory consists of peripheral storage devices, such as disk and tape, that are accessible to the processor via I/O controllers.

An obvious characteristic of memory is its capacity. For internal memory, this is typically expressed in terms of bytes (1 byte = 8 bits) or words. Common word lengths are 8, 16, and 32 bits. External memory capacity is typically expressed in terms of bytes.

A related concept is the unit of transfer. For internal memory, the unit of transfer is equal to the number of electrical lines into and out of the memory

Table 4.1 Key Characteristics of Computer Memory Systems

Location
  Internal (e.g., processor registers, cache, main memory)
  External (e.g., optical disks, magnetic disks, tapes)
Capacity
  Number of words
  Number of bytes
Unit of Transfer
  Word
  Block
Access Method
  Sequential
  Direct
  Random
  Associative
Performance
  Access time
  Cycle time
  Transfer rate
Physical Type
  Semiconductor
  Magnetic
  Optical
  Magneto-optical
Physical Characteristics
  Volatile/nonvolatile
  Erasable/nonerasable
Organization
  Memory modules

module. This may be equal to the word length, but is often larger, such as 64, 128, or 256 bits. To clarify this point, consider three related concepts for internal memory:

• Word: The "natural" unit of organization of memory. The size of a word is typically equal to the number of bits used to represent an integer and to the instruction length. Unfortunately, there are many exceptions. For example, the CRAY C90 (an older model CRAY supercomputer) has a 64-bit word length but uses a 46-bit integer representation. The Intel x86 architecture has a wide variety of instruction lengths, expressed as multiples of bytes, and a word size of 32 bits.
• Addressable units: In some systems, the addressable unit is the word. However, many systems allow addressing at the byte level. In any case, the relationship between the length in bits A of an address and the number N of addressable units is 2^A = N.
• Unit of transfer: For main memory, this is the number of bits read out of or written into memory at a time. The unit of transfer need not equal a word or an addressable unit. For external memory, data are often transferred in much larger units than a word, and these are referred to as blocks.

Another distinction among memory types is the method of accessing units of data. These include the following:

• Sequential access: Memory is organized into units of data, called records. Access must be made in a specific linear sequence. Stored addressing information is used to separate records and assist in the retrieval process. A shared read–write mechanism is used, and this must be moved from its current location to the desired location, passing and rejecting each intermediate record. Thus, the time to access an arbitrary record is highly variable. Tape units, discussed in Chapter 6, are sequential access.

• Direct access: As with sequential access, direct access involves a shared read–write mechanism. However, individual blocks or records have a unique address based on physical location. Access is accomplished by direct access to reach a general vicinity plus sequential searching, counting, or waiting to reach the final location. Again, access time is variable. Disk units, discussed in Chapter 6, are direct access.
• Random access: Each addressable location in memory has a unique, physically wired-in addressing mechanism. The time to access a given location is independent of the sequence of prior accesses and is constant. Thus, any location can be selected at random and directly addressed and accessed. Main memory and some cache systems are random access.
• Associative: This is a random access type of memory that enables one to make a comparison of desired bit locations within a word for a specified match, and to do this for all words simultaneously. Thus, a word is retrieved based on a portion of its contents rather than its address. As with ordinary random-access memory, each location has its own addressing mechanism, and retrieval time is constant independent of location or prior access patterns. Cache memories may employ associative access.

From a user's point of view, the two most important characteristics of memory are capacity and performance. Three performance parameters are used:

• Access time (latency): For random-access memory, this is the time it takes to perform a read or write operation, that is, the time from the instant that an address is presented to the memory to the instant that data have been stored or made available for use. For non-random-access memory, access time is the time it takes to position the read–write mechanism at the desired location.
• Memory cycle time: This concept is primarily applied to random-access memory and consists of the access time plus any additional time required before a second access can commence. This additional time may be required for transients to die out on signal lines or to regenerate data if they are read destructively. Note that memory cycle time is concerned with the system bus, not the processor.
• Transfer rate: This is the rate at which data can be transferred into or out of a memory unit. For random-access memory, it is equal to 1/(cycle time). For non-random-access memory, the following relationship holds:

T_n = T_A + n/R    (4.1)

where
T_n = average time to read or write n bits
T_A = average access time
n = number of bits
R = transfer rate, in bits per second (bps)

A variety of physical types of memory have been employed. The most common today are semiconductor memory, magnetic surface memory, used for disk and tape, and optical and magneto-optical.
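Equation (4.1) is easy to exercise with made-up numbers. The C fragment below computes T_n for a hypothetical non-random-access device with a 10-ms average access time and a 40-MB/s transfer rate reading a 4-KB block; both device figures are invented for illustration and are not taken from the text.

#include <stdio.h>

int main(void)
{
    /* Hypothetical device parameters, invented for illustration. */
    double Ta = 10e-3;          /* average access time: 10 ms            */
    double R  = 40e6 * 8.0;     /* transfer rate: 40 MB/s in bits/second */
    double n  = 4096 * 8.0;     /* amount read: a 4-KB block, in bits    */

    double Tn = Ta + n / R;     /* Equation (4.1)                        */
    printf("T_n = %.4f ms\n", Tn * 1e3);   /* about 10.10 ms; the access
                                              time dominates the transfer */
    return 0;
}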

Several physical characteristics of data storage are important. In a volatile memory, information decays naturally or is lost when electrical power is switched off. In a nonvolatile memory, information once recorded remains without deterioration until deliberately changed; no electrical power is needed to retain information. Magnetic-surface memories are nonvolatile. Semiconductor memory (memory on integrated circuits) may be either volatile or nonvolatile. Nonerasable memory cannot be altered, except by destroying the storage unit. Semiconductor memory of this type is known as read-only memory (ROM). Of necessity, a practical nonerasable memory must also be nonvolatile.

For random-access memory, the organization is a key design issue. In this context, organization refers to the physical arrangement of bits to form words. The obvious arrangement is not always used, as is explained in Chapter 5.

The Memory Hierarchy The design constraints on a computer's memory can be summed up by three questions: How much? How fast? How expensive?

The question of how much is somewhat open ended. If the capacity is there, applications will likely be developed to use it. The question of how fast is, in a sense, easier to answer. To achieve greatest performance, the memory must be able to keep up with the processor. That is, as the processor is executing instructions, we would not want it to have to pause waiting for instructions or operands. The final question must also be considered. For a practical system, the cost of memory must be reasonable in relationship to other components.

As might be expected, there is a trade-off among the three key characteristics of memory: capacity, access time, and cost. A variety of technologies are used to implement memory systems, and across this spectrum of technologies, the following relationships hold:

• Faster access time, greater cost per bit
• Greater capacity, smaller cost per bit
• Greater capacity, slower access time

The dilemma facing the designer is clear. The designer would like to use memory technologies that provide for large-capacity memory, both because the capacity is needed and because the cost per bit is low. However, to meet performance requirements, the designer needs to use expensive, relatively lower-capacity memories with short access times.

The way out of this dilemma is not to rely on a single memory component or technology, but to employ a memory hierarchy. A typical hierarchy is illustrated in Figure 4.1. As one goes down the hierarchy, the following occur:

a. Decreasing cost per bit
b. Increasing capacity
c. Increasing access time
d. Decreasing frequency of access of the memory by the processor

Thus, smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories. The key to the success of this organization is item (d):

Figure 4.1 The Memory Hierarchy (from top to bottom: inboard memory consisting of registers, cache, and main memory; outboard storage consisting of magnetic disk, CD-ROM, CD-RW, DVD-RW, DVD-RAM, and Blu-ray; and off-line storage consisting of magnetic tape)

decreasing frequency of access. We examine this concept in greater detail when we discuss the cache, later in this chapter, and virtual memory in Chapter 8. A brief explanation is provided at this point.

The use of two levels of memory to reduce average access time works in principle, but only if conditions (a) through (d) apply. By employing a variety of technologies, a spectrum of memory systems exists that satisfies conditions (a) through (c). Fortunately, condition (d) is also generally valid.

The basis for the validity of condition (d) is a principle known as locality of reference [DENN68]. During the course of execution of a program, memory references by the processor, for both instructions and data, tend to cluster. Programs typically contain a number of iterative loops and subroutines. Once a loop or subroutine is entered, there are repeated references to a small set of instructions. Similarly, operations on tables and arrays involve access to a clustered set of data words. Over a long period of time, the clusters in use change, but over a short period of time, the processor is primarily working with fixed clusters of memory references.

Example 4.1 Suppose that the processor has access to two levels of memory. Level 1 contains 1000 words and has an access time of 0.01 μs; level 2 contains 100,000 words and has an access time of 0.1 μs. Assume that if a word to be accessed is in level 1, then the processor accesses it directly. If it is in level 2, then the word is first transferred to level 1 and then accessed by the processor. For simplicity, we ignore the time required for the processor to determine whether the word is in level 1 or level 2. Figure 4.2 shows the general shape of the curve that covers this situation. The figure shows the average access time to a two-level memory as a function of the hit ratio H, where H is defined as the fraction of all memory accesses that are found in the faster memory (e.g., the cache),1 T1 is the access time to level 1, and T2 is the access time to level 2. As can be seen, for high percentages of level 1 access, the average total access time is much closer to that of level 1 than that of level 2. In our example, suppose 95% of the memory accesses are found in level 1. Then the average time to access a word can be expressed as

(0.95)(0.01 μs) + (0.05)(0.01 μs + 0.1 μs) = 0.0095 + 0.0055 = 0.015 μs

The average access time is much closer to 0.01 μs than to 0.1 μs, as desired.
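The calculation in Example 4.1 generalizes to any hit ratio H as average time = H × T1 + (1 − H) × (T1 + T2), which is the curve plotted in Figure 4.2. The short C program below uses the example's T1 = 0.01 μs and T2 = 0.1 μs and reproduces the 0.015-μs result at H = 0.95; it restates the example rather than adding anything to it.

#include <stdio.h>

/* Average access time of the two-level memory in Example 4.1:
   a hit costs T1; a miss costs T1 + T2 (word moved to level 1, then read). */
static double avg_access(double hit_ratio, double t1, double t2)
{
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2);
}

int main(void)
{
    const double t1 = 0.01, t2 = 0.1;        /* microseconds, from the example */

    for (double h = 0.0; h <= 1.0; h += 0.25)
        printf("H = %.2f  ->  %.4f us\n", h, avg_access(h, t1, t2));
    printf("H = 0.95 ->  %.4f us\n", avg_access(0.95, t1, t2));   /* 0.0150 */
    return 0;
}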

Accordingly, it is possible to organize data across the hierarchy such that the percentage of accesses to each successively lower level is substantially less than that of the level above. Consider the two-level example already presented. Let level 2

Figure 4.2 Performance of Accesses Involving Only Level 1 (hit ratio): average access time plotted against the fraction of accesses satisfied in level 1, falling from T1 + T2 at a hit ratio of 0 to T1 at a hit ratio of 1

1If the accessed word is found in the faster memory, that is defined as a hit. A miss occurs if the accessed word is not found in the faster memory.

memory contains all program instructions and data. The current clusters can be temporarily placed in level 1. From time to time, one of the clusters in level 1 will have to be swapped back to level 2 to make room for a new cluster coming in to level 1. On average, however, most references will be to instructions and data contained in level 1.

This principle can be applied across more than two levels of memory, as suggested by the hierarchy shown in Figure 4.1. The fastest, smallest, and most expensive type of memory consists of the registers internal to the processor. Typically, a processor will contain a few dozen such registers, although some machines contain hundreds of registers. Main memory is the principal internal memory system of the computer. Each location in main memory has a unique address. Main memory is usually extended with a higher-speed, smaller cache. The cache is not usually visible to the programmer or, indeed, to the processor. It is a device for staging the movement of data between main memory and processor registers to improve performance.

The three forms of memory just described are, typically, volatile and employ semiconductor technology. The use of three levels exploits the fact that semiconductor memory comes in a variety of types, which differ in speed and cost. Data are stored more permanently on external devices, of which the most common are hard disk and removable media, such as removable magnetic disk, tape, and optical storage. External, nonvolatile memory is also referred to as secondary memory or auxiliary memory. These are used to store program and data files and are usually visible to the programmer only in terms of files and records, as opposed to individual bytes or words. Disk is also used to provide an extension to main memory known as virtual memory, which is discussed in Chapter 8.

Other forms of memory may be included in the hierarchy. For example, large IBM mainframes include a form of internal memory known as expanded storage. This uses a semiconductor technology that is slower and less expensive than that of main memory. Strictly speaking, this memory does not fit into the hierarchy but is a side branch: Data can be moved between main memory and expanded storage but not between expanded storage and external memory. Other forms of secondary memory include optical and magneto-optical disks.

Finally, additional levels can be effectively added to the hierarchy in software. A portion of main memory can be used as a buffer to hold data temporarily that is to be read out to disk. Such a technique, sometimes referred to as a disk cache,2 improves performance in two ways:

• Disk writes are clustered. Instead of many small transfers of data, we have a few large transfers of data. This improves disk performance and minimizes processor involvement.
• Some data destined for write-out may be referenced by a program before the next dump to disk. In that case, the data are retrieved rapidly from the software cache rather than slowly from the disk.

Appendix 4A examines the performance implications of multilevel memory structures.

2Disk cache is generally a purely software technique and is not examined in this book. See [STAL12] for a discussion.

4.2 CACHE MEMORY PRINCIPLES

Cache memory is designed to combine the memory access time of expensive, high-speed memory with the large memory size of less expensive, lower-speed memory. The concept is illustrated in Figure 4.3a. There is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of portions of main memory. When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there will be future references to that same memory location or to other words in the block.

Figure 4.3b depicts the use of multiple levels of cache. The L2 cache is slower and typically larger than the L1 cache, and the L3 cache is slower and typically larger than the L2 cache.

Figure 4.4 depicts the structure of a cache/main-memory system. Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address. For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each. That is, there are M = 2^n/K blocks in main memory. The cache consists of m blocks, called lines.3

[Figure 4.3 Cache and Main Memory: (a) a single cache between the CPU (fast word transfers) and main memory (slow block transfers); (b) a three-level cache organization with L1, L2, and L3 caches between the CPU and main memory]

3In referring to the basic unit of the cache, the term line is used, rather than the term block, for two reasons: (1) to avoid confusion with a main memory block, which contains the same number of data words as a cache line; and (2) because a cache line includes not only K words of data, just as a main memory block, but also includes tag and control bits.

[Figure 4.4 Cache/Main Memory Structure: (a) a cache of C lines, each holding a tag and a block of K words; (b) main memory of 2^n words viewed as blocks 0 through M–1 of K words each]

Each line also includes control bits (not shown), such as a bit to indicate whether the line has been modified since being loaded into the cache. The length of a line, not including tag and control bits, is the line size. The line size may be as small as 32 bits, with each "word" being a single byte; in this case the line size is 4 bytes. The number of lines is considerably less than the number of main memory blocks (m ≪ M). At any time, some subset of the blocks of memory resides in lines in the cache. If a word in a block of memory is read, that block is transferred to one of the lines of the cache. Because there are more blocks than lines, an individual line cannot be uniquely and permanently dedicated to a particular block. Thus, each line includes a tag that identifies which particular block is currently being stored. The tag is usually a portion of the main memory address, as described later in this section.
Figure 4.5 illustrates the read operation. The processor generates the read address (RA) of a word to be read. If the word is contained in the cache, it is delivered to the processor. Otherwise, the block containing that word is loaded into the cache, and the word is delivered to the processor. Figure 4.5 shows these last two operations occurring in parallel and reflects the organization shown in Figure 4.6, which is typical of contemporary cache organizations. In this organization, the cache connects to the processor via data, control, and address lines. The data and address lines also attach to data and address buffers, which attach to a system bus from

[Figure 4.5 Cache Read Operation: receive address RA from the CPU; if the block containing RA is in the cache, fetch the RA word and deliver it to the CPU; otherwise access main memory for the block containing RA, allocate a cache line, load the main memory block into the cache line, and deliver the RA word to the CPU]

which main memory is reached. When a cache hit occurs, the data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic. When a cache miss occurs, the desired address is loaded onto the system bus and the data are returned through the data buffer to both the cache and the processor. In other organizations, the cache is physically interposed between the processor and the main memory for all data, address, and control lines. In this latter case, for a cache miss, the desired word is first read into the cache and then transferred from cache to processor.
A discussion of the performance parameters related to cache use is contained in Appendix 4A.

[Figure 4.6 Typical Cache Organization: the processor connects to the cache over address, control, and data lines; address and data buffers connect these lines to the system bus]

4.3 ELEMENTS OF CACHE DESIGN

This section provides an overview of cache design parameters and reports some typical results. We occasionally refer to the use of caches in high-performance computing (HPC). HPC deals with supercomputers and their software, especially for scientific applications that involve large amounts of data, vector and matrix computation, and the use of parallel algorithms. Cache design for HPC is quite different than for other hardware platforms and applications. Indeed, many researchers have found that HPC applications perform poorly on computer architectures that employ caches [BAIL93]. Other researchers have since shown that a cache hierarchy can be useful in improving performance if the application software is tuned to exploit the cache [WANG99, PRES01].4
Although there are a large number of cache implementations, there are a few basic design elements that serve to classify and differentiate cache architectures. Table 4.2 lists key elements.

Cache Addresses

Almost all nonembedded processors, and many embedded processors, support virtual memory, a concept discussed in Chapter 8. In essence, virtual memory is a facility that allows programs to address memory from a logical point of view, without

4For a general discussion of HPC, see [DOWD98].

Table 4.2 Elements of Cache Design

Cache Addresses
   Logical
   Physical
Cache Size
Mapping Function
   Direct
   Associative
   Set associative
Replacement Algorithm
   Least recently used (LRU)
   First in first out (FIFO)
   Least frequently used (LFU)
   Random
Write Policy
   Write through
   Write back
Line Size
Number of Caches
   Single or two level
   Unified or split

regard to the amount of main memory physically available. When virtual memory is used, the address fields of machine instructions contain virtual addresses. For reads from and writes to main memory, a hardware memory management unit (MMU) translates each virtual address into a physical address in main memory.

[Figure 4.7 Logical and Physical Caches: (a) logical cache, placed between the processor and the MMU and accessed with logical (virtual) addresses; (b) physical cache, placed between the MMU and main memory and accessed with physical addresses]

When virtual addresses are used, the system designer may choose to place the cache between the processor and the MMU or between the MMU and main memory (Figure 4.7). A logical cache, also known as a virtual cache, stores data using virtual addresses. The processor accesses the cache directly, without going through the MMU. A physical cache stores data using main memory physical addresses.
One obvious advantage of the logical cache is that cache access speed is faster than for a physical cache, because the cache can respond before the MMU performs an address translation. The disadvantage has to do with the fact that most virtual memory systems supply each application with the same virtual memory address space. That is, each application sees a virtual memory that starts at address 0. Thus, the same virtual address in two different applications refers to two different physical addresses. The cache memory must therefore be completely flushed with each application context switch, or extra bits must be added to each line of the cache to identify which virtual address space this address refers to.
The subject of logical versus physical cache is a complex one, and beyond the scope of this book. For a more in-depth discussion, see [CEKL97] and [JACO08].

Cache Size

The first item in Table 4.2, cache size, has already been discussed. We would like the size of the cache to be small enough so that the overall average cost per bit is close to that of main memory alone and large enough so that the overall average access time is close to that of the cache alone. There are several other motivations for minimizing cache size. The larger the cache, the larger the number of gates involved in addressing the cache. The result is that large caches tend to be slightly slower than small ones—even when built with the same integrated circuit technology and put in the same place on chip and circuit board. The available chip and board area also limits cache size. Because the performance of the cache is very sensitive to the nature of the workload, it is impossible to arrive at a single "optimum" cache size. Table 4.3 lists the cache sizes of some current and past processors.

Mapping Function

Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines. Further, a means is needed for determining which main memory block currently occupies a cache line. The choice of the mapping function dictates how the cache is organized. Three techniques can be used: direct, associative, and set associative. We examine each of these in turn. In each case, we look at the general structure and then a specific example.

Example 4.2 For all three cases, the example includes the following elements:
• The cache can hold 64 Kbytes.
• Data are transferred between main memory and the cache in blocks of 4 bytes each. This means that the cache is organized as 16K = 2^14 lines of 4 bytes each.
• The main memory consists of 16 Mbytes, with each byte directly addressable by a 24-bit address (2^24 = 16M). Thus, for mapping purposes, we can consider main memory to consist of 4M blocks of 4 bytes each.
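As a quick check of the arithmetic in Example 4.2, the following small C sketch (not from the text; the variable names are illustrative) derives the number of cache lines and main memory blocks from the sizes given above.

#include <stdio.h>

int main(void) {
    unsigned long cache_bytes  = 64 * 1024;        /* 64-Kbyte cache       */
    unsigned long line_bytes   = 4;                /* 4-byte blocks/lines  */
    unsigned long memory_bytes = 16UL * 1024 * 1024; /* 16-Mbyte main memory */

    unsigned long lines  = cache_bytes / line_bytes;   /* 16K = 2^14 lines */
    unsigned long blocks = memory_bytes / line_bytes;  /* 4M blocks        */

    printf("cache lines: %lu (2^14 = %d)\n", lines, 1 << 14);
    printf("main memory blocks: %lu\n", blocks);
    return 0;
}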

Table 4.3 Cache Sizes of Some Processors

Processor              Type                            Year of Introduction   L1 Cache(a)         L2 Cache          L3 Cache
IBM 360/85             Mainframe                       1968                   16–32 kB            —                 —
PDP-11/70              Minicomputer                    1975                   1 kB                —                 —
VAX 11/780             Minicomputer                    1978                   16 kB               —                 —
IBM 3033               Mainframe                       1978                   64 kB               —                 —
IBM 3090               Mainframe                       1985                   128–256 kB          —                 —
Intel 80486            PC                              1989                   8 kB                —                 —
Pentium                PC                              1993                   8 kB/8 kB           256–512 kB        —
PowerPC 601            PC                              1993                   32 kB               —                 —
PowerPC 620            PC                              1996                   32 kB/32 kB         —                 —
PowerPC G4             PC/server                       1999                   32 kB/32 kB         256 kB to 1 MB    2 MB
IBM S/390 G6           Mainframe                       1999                   256 kB              8 MB              —
Pentium 4              PC/server                       2000                   8 kB/8 kB           256 kB            —
IBM SP                 High-end server/supercomputer   2000                   64 kB/32 kB         8 MB              —
CRAY MTA(b)            Supercomputer                   2000                   8 kB                2 MB              —
Itanium                PC/server                       2001                   16 kB/16 kB         96 kB             4 MB
Itanium 2              PC/server                       2002                   32 kB               256 kB            6 MB
IBM POWER5             High-end server                 2003                   64 kB               1.9 MB            36 MB
CRAY XD-1              Supercomputer                   2004                   64 kB/64 kB         1 MB              —
IBM POWER6             PC/server                       2007                   64 kB/64 kB         4 MB              32 MB
IBM z10                Mainframe                       2008                   64 kB/128 kB        3 MB              24–48 MB
Intel Core i7 EE 990   Workstation/server              2011                   6 × 32 kB/32 kB     1.5 MB            12 MB
IBM zEnterprise 196    Mainframe/server                2011                   24 × 64 kB/128 kB   24 × 1.5 MB       24 MB L3; 192 MB L4
Notes:
(a) Two values separated by a slash refer to instruction and data caches.
(b) Both caches are instruction only; no data caches.

DIRECT MAPPING The simplest technique, known as direct mapping, maps each block of main memory into only one possible cache line. The mapping is expressed as

i = j modulo m

where
   i = cache line number
   j = main memory block number
   m = number of lines in the cache

Figure 4.8a shows the mapping for the first m blocks of main memory. Each block of main memory maps into one unique line of the cache. The next m blocks

[Figure 4.8 Mapping from Main Memory to Cache: Direct and Associative: (a) direct mapping, in which the first m blocks of main memory (equal to the size of the cache) map one-to-one onto lines L0 through Lm–1, each line holding a t-bit tag and a b-bit block, where b is the length of a block in bits and t is the length of the tag in bits; (b) associative mapping, in which any one block of main memory may be loaded into any of the m cache lines]

of main memory map into the cache in the same fashion; that is, block Bm of main memory maps into line L0 of cache, block Bm+1 maps into line L1, and so on.
The mapping function is easily implemented using the main memory address. Figure 4.9 illustrates the general mechanism. For purposes of cache access, each main memory address can be viewed as consisting of three fields. The least significant w bits identify a unique word or byte within a block of main memory; in most contemporary machines, the address is at the byte level. The remaining s bits specify one of the 2^s blocks of main memory. The cache logic interprets these s bits as a tag of s - r bits (most significant portion) and a line field of r bits. This latter field identifies one of the m = 2^r lines of the cache. To summarize,
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w)/2^w = 2^s
• Number of lines in cache = m = 2^r
• Size of cache = 2^(r+w) words or bytes
• Size of tag = (s - r) bits

[Figure 4.9 Direct-Mapping Cache Organization: the (s + w)-bit memory address is split into a tag of s - r bits, a line field of r bits, and a word field of w bits; the line field selects a cache line, and the tag stored in that line is compared with the address tag (1 if match, giving a hit; 0 if no match, giving a miss)]

Example 4.2a Figure 4.10 shows our example system using direct mapping.5 In the example, m = 16K = 2^14 and i = j modulo 2^14. The mapping becomes

Cache Line      Starting Memory Address of Block
0               000000, 010000, …, FF0000
1               000004, 010004, …, FF0004
…               …
2^14 - 1        00FFFC, 01FFFC, …, FFFFFC

Note that no two blocks that map into the same line number have the same tag number. Thus, blocks with starting addresses 000000, 010000, …, FF0000 have tag numbers 00, 01, …, FF, respectively.
Referring back to Figure 4.5, a read operation works as follows. The cache system is presented with a 24-bit address. The 14-bit line number is used as an index into the cache to access a particular line. If the 8-bit tag number matches the tag number currently stored in that line, then the 2-bit word number is used to select one of the 4 bytes in that line. Otherwise, the 22-bit tag-plus-line field is used to fetch a block from main memory. The actual address that is used for the fetch is the 22-bit tag-plus-line concatenated with two 0 bits, so that 4 bytes are fetched starting on a block boundary.

5In this and subsequent figures, memory values are represented in hexadecimal notation. See Chapter 9 for a basic refresher on number systems (decimal, binary, hexadecimal).

[Figure 4.10 Direct Mapping Example: a 16K-line cache and a 16-Mbyte main memory; each 24-bit main memory address is interpreted as an 8-bit tag, a 14-bit line number, and a 2-bit word number, and each cache line stores the 8-bit tag plus 32 bits of data]

The effect of this mapping is that blocks of main memory are assigned to lines of the cache as follows:

Cache line      Main memory blocks assigned
0               0, m, 2m, …, 2^s - m
1               1, m + 1, 2m + 1, …, 2^s - m + 1
…               …
m - 1           m - 1, 2m - 1, 3m - 1, …, 2^s - 1

Thus, the use of a portion of the address as a line number provides a unique mapping of each block of main memory into the cache. When a block is actually

read into its assigned line, it is necessary to tag the data to distinguish it from other blocks that can fit into that line. The most significant s - r bits serve this purpose. The direct mapping technique is simple and inexpensive to implement. Its main disadvantage is that there is a fixed cache location for any given block. Thus, if a program happens to reference words repeatedly from two different blocks that map into the same line, then the blocks will be continually swapped in the cache, and the hit ratio will be low (a phenomenon known as thrashing).
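The address decomposition used by direct mapping is easy to express in code. The following C sketch is illustrative only and is not from the text; the field widths follow Example 4.2a (24-bit address, 8-bit tag, 14-bit line, 2-bit word), and the cache structure shown is a simplified software model of the hardware.

#include <stdint.h>
#include <stdbool.h>

#define WORD_BITS 2                       /* 4-byte lines               */
#define LINE_BITS 14                      /* 16K lines                  */
                                          /* tag = 24 - 14 - 2 = 8 bits */
typedef struct {
    bool     valid;
    uint32_t tag;
    uint8_t  data[1 << WORD_BITS];        /* 4 bytes of data            */
} cache_line_t;

static cache_line_t cache[1 << LINE_BITS];

/* Returns true on a hit; on a hit, *byte receives the addressed byte. */
bool direct_mapped_read(uint32_t addr, uint8_t *byte)
{
    uint32_t word = addr & ((1u << WORD_BITS) - 1);
    uint32_t line = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1);
    uint32_t tag  = addr >> (WORD_BITS + LINE_BITS);

    if (cache[line].valid && cache[line].tag == tag) {
        *byte = cache[line].data[word];
        return true;                      /* cache hit                          */
    }
    return false;                         /* miss: fetch the block, set the tag */
}

Note that the line field alone determines where a block may reside, which is exactly why two frequently used blocks that share a line number will thrash, as described above.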

Selective Victim Cache Simulator

One approach to lower the miss penalty is to remember what was discarded in case it is needed again. Since the discarded data has already been fetched, it can be used again at a small cost. Such recycling is possible using a victim cache. The victim cache was originally proposed as an approach to reduce the conflict misses of direct-mapped caches without affecting their fast access time. A victim cache is a fully associative cache, whose size is typically 4 to 16 cache lines, residing between a direct-mapped L1 cache and the next level of memory. This concept is explored in Appendix D.
ASSOCIATIVE MAPPING Associative mapping overcomes the disadvantage of direct mapping by permitting each main memory block to be loaded into any line of the cache (Figure 4.8b). In this case, the cache control logic interprets a memory address simply as a Tag and a Word field. The Tag field uniquely identifies a block of main memory. To determine whether a block is in the cache, the cache control logic must simultaneously examine every line's tag for a match. Figure 4.11 illustrates the logic.

[Figure 4.11 Fully Associative Cache Organization: the (s + w)-bit memory address is split into a tag of s bits and a word field of w bits; the address tag is compared simultaneously with the tag of every line in the cache (1 if match, giving a hit; 0 if no match, giving a miss)]

Example 4.2b Figure 4.12 shows our example using associative mapping. A main memory address consists of a 22-bit tag and a 2-bit byte number. The 22-bit tag must be stored with the 32-bit block of data for each line in the cache. Note that it is the leftmost (most significant) 22 bits of the address that form the tag. Thus, the 24-bit hexadecimal address 16339C has the 22-bit tag 058CE7. This is easily seen in binary notation:

memory address          0001 0110 0011 0011 1001 1100 (binary) = 16339C (hex)
tag (leftmost 22 bits)    00 0101 1000 1100 1110 0111 (binary) = 058CE7 (hex)

[Figure 4.12 Associative Mapping Example: a 16K-line cache and a 16-Mbyte main memory; each 24-bit main memory address is interpreted as a 22-bit tag and a 2-bit word number, and each cache line stores the 22-bit tag plus 32 bits of data]

Note that no field in the address corresponds to the line number, so that the number of lines in the cache is not determined by the address format. To summarize,
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w)/2^w = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits

With associative mapping, there is flexibility as to which block to replace when a new block is read into the cache. Replacement algorithms, discussed later in this section, are designed to maximize the hit ratio. The principal disadvantage of associative mapping is the complex circuitry required to examine the tags of all cache lines in parallel.
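A fully associative lookup has no line field, so the controller must compare the address tag against every stored tag. The C sketch below (illustrative only, not from the text; sizes follow Example 4.2b) models that search sequentially, whereas real hardware performs all of the comparisons in parallel, which is the source of the circuit complexity just mentioned.

#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES 16384                   /* 16K lines, as in Example 4.2b   */

typedef struct {
    bool     valid;
    uint32_t tag;                         /* 22-bit tag for a 24-bit address */
} assoc_line_t;

static assoc_line_t lines[NUM_LINES];

/* Returns the index of the matching line, or -1 on a miss. */
int associative_lookup(uint32_t addr)
{
    uint32_t tag = addr >> 2;             /* drop the 2-bit word field       */
    for (int i = 0; i < NUM_LINES; i++) {
        if (lines[i].valid && lines[i].tag == tag)
            return i;                     /* hit                              */
    }
    return -1;                            /* miss: a replacement algorithm picks the victim */
}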

Cache Time Analysis Simulator

SET-ASSOCIATIVE MAPPING Set-associative mapping is a compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages. In this case, the cache consists of a number of sets, each of which consists of a number of lines. The relationships are

m = v × k
i = j modulo v

where
   i = cache set number
   j = main memory block number
   m = number of lines in the cache
   v = number of sets
   k = number of lines in each set

This is referred to as k-way set-associative mapping. With set-associative mapping, block Bj can be mapped into any of the lines of set j. Figure 4.13a illustrates this mapping for the first v blocks of main memory. As with associative mapping, each word maps into multiple cache lines. For set-associative mapping, each word maps into all the cache lines in a specific set, so that main memory block B0 maps into set 0, and so on. Thus, the set-associative cache can be physically implemented as v associative caches. It is also possible to implement the set-associative cache as k direct mapping caches, as shown in Figure 4.13b. Each direct-mapped cache is referred to as a way, consisting of v lines. The first v lines of main memory are direct mapped into the v lines of each way; the next group of v lines of main memory are similarly mapped, and so on. The direct-mapped implementation is typically used

[Figure 4.13 Mapping from Main Memory to Cache: k-Way Set Associative: (a) the cache viewed as v associative-mapped caches, one per set of k lines, with the first v blocks of main memory (equal to the number of sets) mapping one block per set; (b) the cache viewed as k direct-mapped caches (ways) of v lines each]

for small degrees of associativity (small values of k) while the associative-mapped implementation is typically used for higher degrees of associativity [JACO08].
For set-associative mapping, the cache control logic interprets a memory address as three fields: Tag, Set, and Word. The d set bits specify one of v = 2^d sets. The s bits of the Tag and Set fields specify one of the 2^s blocks of main memory. Figure 4.14 illustrates the cache control logic. With fully associative mapping, the tag in a memory address is quite large and must be compared to the tag of every line in the cache. With k-way set-associative mapping, the tag in a memory address is much smaller and is only compared to the k tags within a single set. To summarize,
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes

[Figure 4.14 k-Way Set-Associative Cache Organization: the (s + w)-bit memory address is split into a tag of s - d bits, a set field of d bits, and a word field of w bits; the set field selects a set, and the address tag is compared with the k tags stored in that set to determine hit or miss]

• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w)/2^w = 2^s
• Number of lines in set = k
• Number of sets = v = 2^d
• Number of lines in cache = m = kv = k × 2^d
• Size of cache = k × 2^(d+w) words or bytes
• Size of tag = (s - d) bits

Example 4.2c Figure 4.15 shows our example using set-associative mapping with two lines in each set, referred to as two-way set-associative. The 13-bit set number identifies a unique set of two lines within the cache. It also gives the number of the block in main memory, modulo 2^13. This determines the mapping of blocks into lines. Thus, blocks 000000, 008000, …, FF8000 of main memory map into cache set 0. Any of those blocks can be loaded into either of the two lines in the set. Note that no two blocks that map into the same cache set have the same tag number. For a read operation, the 13-bit set number is used to determine which set of two lines is to be examined. Both lines in the set are examined for a match with the tag number of the address to be accessed.

[Figure 4.15 Two-Way Set-Associative Mapping Example: a 16K-line cache and a 16-Mbyte main memory; each 24-bit main memory address is interpreted as a 9-bit tag, a 13-bit set number, and a 2-bit word number, and each cache line stores the 9-bit tag plus 32 bits of data]
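Set-associative lookup combines the two previous ideas: the set field indexes the cache as in direct mapping, and the tags within that one set are then searched associatively. The C sketch below is illustrative only (not from the text); the field widths follow the two-way example above.

#include <stdint.h>
#include <stdbool.h>

#define WORD_BITS 2
#define SET_BITS  13                      /* 8K sets of 2 lines = 16K lines  */
#define WAYS      2

typedef struct {
    bool     valid;
    uint32_t tag;                         /* 9-bit tag for a 24-bit address  */
} way_t;

static way_t cache[1 << SET_BITS][WAYS];

/* Returns the way index of the matching line within the set, or -1 on a miss. */
int set_assoc_lookup(uint32_t addr)
{
    uint32_t set = (addr >> WORD_BITS) & ((1u << SET_BITS) - 1);
    uint32_t tag = addr >> (WORD_BITS + SET_BITS);

    for (int w = 0; w < WAYS; w++) {
        if (cache[set][w].valid && cache[set][w].tag == tag)
            return w;                     /* hit: only k tags are compared   */
    }
    return -1;                            /* miss                            */
}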


[Figure 4.16 Varying Associativity over Cache Size: hit ratio versus cache size (1k to 1M bytes) for direct-mapped and two-way, four-way, eight-way, and sixteen-way set-associative organizations]

In the extreme case of v = m, k = 1, the set-associative technique reduces to direct mapping, and for v = 1, k = m, it reduces to associative mapping. The use of two lines per set (v = m/2, k = 2) is the most common set-associative organization. It significantly improves the hit ratio over direct mapping. Four-way set associative (v = m/4, k = 4) makes a modest additional improvement for a relatively small additional cost [MAYB84, HILL89]. Further increases in the number of lines per set have little effect.
Figure 4.16 shows the results of one simulation study of set-associative cache performance as a function of cache size [GENU04]. The difference in performance between direct and two-way set associative is significant up to at least a cache size of 64 kB. Note also that the difference between two-way and four-way at 4 kB is much less than the difference in going from 4 kB to 8 kB in cache size. The complexity of the cache increases in proportion to the associativity, and in this case would not be justifiable against increasing cache size to 8 or even 16 Kbytes. A final point to note is that beyond about 32 kB, an increase in cache size brings no significant increase in performance.
The results of Figure 4.16 are based on simulating the execution of a GCC compiler. Different applications may yield different results. For example, [CANT01] reports on the results for cache performance using many of the CPU2000 SPEC benchmarks. The results of [CANT01] in comparing hit ratio to cache size follow the same pattern as Figure 4.16, but the specific values are somewhat different.

Cache Simulator
Multitask Cache Simulator

Replacement Algorithms

Once the cache has been filled, when a new block is brought into the cache, one of the existing blocks must be replaced. For direct mapping, there is only one possible line for any particular block, and no choice is possible. For the associative and set-associative techniques, a replacement algorithm is needed. To achieve high speed, such an algorithm must be implemented in hardware. A number of algorithms have been tried. We mention four of the most common. Probably the most effective is least recently used (LRU): Replace that block in the set that has been in the cache longest with no reference to it. For two-way set associative, this is easily implemented. Each line includes a USE bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the other line in that set is set to 0. When a block is to be read into the set, the line whose USE bit is 0 is used. Because we are assuming that more recently used memory locations are more likely to be referenced, LRU should give the best hit ratio. LRU is also relatively easy to implement for a fully associative cache. The cache mechanism maintains a separate list of indexes to all the lines in the cache. When a line is referenced, it moves to the front of the list. For replacement, the line at the back of the list is used. Because of its simplicity of implementation, LRU is the most popular replacement algorithm.
Another possibility is first-in-first-out (FIFO): Replace that block in the set that has been in the cache longest. FIFO is easily implemented as a round-robin or circular buffer technique. Still another possibility is least frequently used (LFU): Replace that block in the set that has experienced the fewest references. LFU could be implemented by associating a counter with each line. A technique not based on usage (i.e., not LRU, LFU, FIFO, or some variant) is to pick a line at random from among the candidate lines. Simulation studies have shown that random replacement provides only slightly inferior performance to an algorithm based on usage [SMIT82].

Write Policy

When a block that is resident in the cache is to be replaced, there are two cases to consider. If the old block in the cache has not been altered, then it may be overwritten with a new block without first writing out the old block. If at least one write operation has been performed on a word in that line of the cache, then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block. A variety of write policies, with performance and economic trade-offs, is possible. There are two problems to contend with. First, more than one device may have access to main memory. For example, an I/O module may be able to read-write directly to memory. If a word has been altered only in the cache, then the corresponding memory word is invalid. Further, if the I/O device has altered main memory, then the cache word is invalid. A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache. Then, if a word is altered in one cache, it could conceivably invalidate a word in other caches.
The simplest technique is called write through. Using this technique, all write operations are made to main memory as well as to the cache, ensuring that main memory is always valid. Any other processor–cache module can monitor traffic to main memory to maintain consistency within its own cache. The main disadvantage

of this technique is that it generates substantial memory traffic and may create a bottleneck. An alternative technique, known as write back, minimizes memory writes. With write back, updates are made only in the cache. When an update occurs, a dirty bit, or use bit, associated with the line is set. Then, when a block is replaced, it is written back to main memory if and only if the dirty bit is set. The problem with write back is that portions of main memory are invalid, and hence accesses by I/O modules can be allowed only through the cache. This makes for complex circuitry and a potential bottleneck. Experience has shown that the percentage of memory references that are writes is on the order of 15% [SMIT82]. However, for HPC applications, this number may approach 33% (vector-vector multiplication) and can go as high as 50% (matrix transposition).

Example 4.3 Consider a cache with a line size of 32 bytes and a main memory that requires 30 ns to transfer a 4-byte word. For any line that is written at least once before being swapped out of the cache, what is the average number of times that the line must be written before being swapped out for a write-back cache to be more efficient than a write-through cache?
For the write-back case, each dirty line is written back once, at swap-out time, taking 8 × 30 = 240 ns. For the write-through case, each update of the line requires that one word be written out to main memory, taking 30 ns. Therefore, if the average line that gets written at least once gets written more than 8 times before swap out, then write back is more efficient.
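The break-even point in Example 4.3 generalizes easily: write back pays off once the expected number of updates to a dirty line exceeds the number of words per line. A minimal C sketch of that calculation follows (illustrative only; the parameter names are not from the text).

#include <stdio.h>

int main(void) {
    double word_time_ns   = 30.0;         /* time to transfer one 4-byte word          */
    int    line_bytes     = 32;
    int    word_bytes     = 4;
    int    words_per_line = line_bytes / word_bytes;          /* 8                     */

    double write_back_cost = words_per_line * word_time_ns;   /* 240 ns, once at swap-out */
    double write_through_cost_per_update = word_time_ns;      /* 30 ns per update         */

    /* Write back wins once (updates × 30 ns) exceeds 240 ns, i.e. more than 8 updates. */
    printf("break-even updates per dirty line: %.0f\n",
           write_back_cost / write_through_cost_per_update);
    return 0;
}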

In a bus organization in which more than one device (typically a processor) has a cache and main memory is shared, a new problem is introduced. If data in one cache are altered, this invalidates not only the corresponding word in main memory, but also that same word in other caches (if any other cache happens to have that same word). Even if a write-through policy is used, the other caches may contain invalid data. A system that prevents this problem is said to maintain cache coherency. Possible approaches to cache coherency include the following:
• Bus watching with write through: Each cache controller monitors the address lines to detect write operations to memory by other bus masters. If another master writes to a location in shared memory that also resides in the cache memory, the cache controller invalidates that cache entry. This strategy depends on the use of a write-through policy by all cache controllers.
• Hardware transparency: Additional hardware is used to ensure that all updates to main memory via cache are reflected in all caches. Thus, if one processor modifies a word in its cache, this update is written to main memory. In addition, any matching words in other caches are similarly updated.
• Noncacheable memory: Only a portion of main memory is shared by more than one processor, and this is designated as noncacheable. In such a system, all accesses to shared memory are cache misses, because the shared memory is never copied into the cache. The noncacheable memory can be identified using chip-select logic or high-address bits.

Cache coherency is an active field of research. This topic is explored further in Part Five.

Line Size

Another design element is the line size. When a block of data is retrieved and placed in the cache, not only the desired word but also some number of adjacent words are retrieved. As the block size increases from very small to larger sizes, the hit ratio will at first increase because of the principle of locality, which states that data in the vicinity of a referenced word are likely to be referenced in the near future. As the block size increases, more useful data are brought into the cache. The hit ratio will begin to decrease, however, as the block becomes even bigger and the probability of using the newly fetched information becomes less than the probability of reusing the information that has to be replaced. Two specific effects come into play:
• Larger blocks reduce the number of blocks that fit into a cache. Because each block fetch overwrites older cache contents, a small number of blocks results in data being overwritten shortly after they are fetched.
• As a block becomes larger, each additional word is farther from the requested word and therefore less likely to be needed in the near future.
The relationship between block size and hit ratio is complex, depending on the locality characteristics of a particular program, and no definitive optimum value has been found. A size of 8 to 64 bytes seems reasonably close to optimum [SMIT87, PRZY88, PRZY90, HAND98]. For HPC systems, 64- and 128-byte cache line sizes are most frequently used.

Number of Caches

When caches were originally introduced, the typical system had a single cache. More recently, the use of multiple caches has become the norm. Two aspects of this design issue concern the number of levels of caches and the use of unified versus split caches.
MULTILEVEL CACHES As logic density has increased, it has become possible to have a cache on the same chip as the processor: the on-chip cache. Compared with a cache reachable via an external bus, the on-chip cache reduces the processor's external bus activity and therefore speeds up execution times and increases overall system performance. When the requested instruction or data is found in the on-chip cache, the bus access is eliminated. Because of the short data paths internal to the processor, compared with bus lengths, on-chip cache accesses will complete appreciably faster than would even zero-wait state bus cycles. Furthermore, during this period the bus is free to support other transfers.
The inclusion of an on-chip cache leaves open the question of whether an off-chip, or external, cache is still desirable. Typically, the answer is yes, and most contemporary designs include both on-chip and external caches. The simplest such organization is known as a two-level cache, with the internal cache designated as level 1 (L1) and the external cache designated as level 2 (L2). The reason for including an L2 cache is the following: If there is no L2 cache and the processor makes an access request for a memory location not in the L1 cache, then the processor must access DRAM or

ROM memory across the bus. Due to the typically slow bus speed and slow memory access time, this results in poor performance. On the other hand, if an L2 SRAM (static RAM) cache is used, then frequently the missing information can be quickly retrieved. If the SRAM is fast enough to match the bus speed, then the data can be accessed using a zero-wait state transaction, the fastest type of bus transfer.
Two features of contemporary cache design for multilevel caches are noteworthy. First, for an off-chip L2 cache, many designs do not use the system bus as the path for transfer between the L2 cache and the processor, but use a separate data path, so as to reduce the burden on the system bus. Second, with the continued shrinkage of processor components, a number of processors now incorporate the L2 cache on the processor chip, improving performance.
The potential savings due to the use of an L2 cache depends on the hit rates in both the L1 and L2 caches. Several studies have shown that, in general, the use of a second-level cache does improve performance (e.g., see [AZIM92], [NOVI93], [HAND98]). However, the use of multilevel caches does complicate all of the design issues related to caches, including size, replacement algorithm, and write policy; see [HAND98] and [PEIR99] for discussions.
Figure 4.17 shows the results of one simulation study of two-level cache performance as a function of cache size [GENU04]. The figure assumes that both caches have the same line size and shows the total hit ratio. That is, a hit is counted if the desired data appears in either the L1 or the L2 cache. The figure shows the impact of L2 on total hits with respect to L1 size. L2 has little effect on the total number of cache hits until it is at least double the L1 cache size. Note that the steepest part of the slope for an L1 cache of 8 Kbytes is for an L2 cache of 16 Kbytes. Again for an L1 cache of 16 Kbytes, the steepest part of the curve is for an L2 cache size of 32 Kbytes. Prior to that point, the L2 cache has little, if any, impact on total cache performance. The need for the L2 cache to be larger than the L1 cache to

[Figure 4.17 Total Hit Ratio (L1 and L2) for 8-Kbyte and 16-Kbyte L1: total hit ratio (0.78 to 0.98) versus L2 cache size (1k to 2M bytes) for L1 cache sizes of 8 kB and 16 kB]

affect performance makes sense. If the L2 cache has the same line size and capacity as the L1 cache, its contents will more or less mirror those of the L1 cache.
With the increasing availability of on-chip area for cache, most contemporary microprocessors have moved the L2 cache onto the processor chip and added an L3 cache. Originally, the L3 cache was accessible over the external bus. More recently, most microprocessors have incorporated an on-chip L3 cache. In either case, there appears to be a performance advantage to adding the third level (e.g., see [GHAI98]). Further, large systems, such as the IBM mainframe zEnterprise systems, now incorporate 3 on-chip cache levels and a fourth level of cache shared across multiple chips [CURR11].
UNIFIED VERSUS SPLIT CACHES When the on-chip cache first made an appearance, many of the designs consisted of a single cache used to store references to both data and instructions. More recently, it has become common to split the cache into two: one dedicated to instructions and one dedicated to data. These two caches both exist at the same level, typically as two L1 caches. When the processor attempts to fetch an instruction from main memory, it first consults the instruction L1 cache, and when the processor attempts to fetch data from main memory, it first consults the data L1 cache.
There are two potential advantages of a unified cache:
• For a given cache size, a unified cache has a higher hit rate than split caches because it balances the load between instruction and data fetches automatically. That is, if an execution pattern involves many more instruction fetches than data fetches, then the cache will tend to fill up with instructions, and if an execution pattern involves relatively more data fetches, the opposite will occur.
• Only one cache needs to be designed and implemented.
The trend is toward split caches at the L1 level and unified caches for higher levels, particularly for superscalar machines, which emphasize parallel instruction execution and the prefetching of predicted future instructions. The key advantage of the split cache design is that it eliminates contention for the cache between the instruction fetch/decode unit and the execution unit. This is important in any design that relies on the pipelining of instructions. Typically, the processor will fetch instructions ahead of time and fill a buffer, or pipeline, with instructions to be executed. Suppose now that we have a unified instruction/data cache. When the execution unit performs a memory access to load and store data, the request is submitted to the unified cache. If, at the same time, the instruction prefetcher issues a read request to the cache for an instruction, that request will be temporarily blocked so that the cache can service the execution unit first, enabling it to complete the currently executing instruction. This cache contention can degrade performance by interfering with efficient use of the instruction pipeline. The split cache structure overcomes this difficulty.
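The benefit of a second cache level discussed above can be estimated with the usual average-access-time calculation covered in Appendix 4A. The C sketch below applies it to a two-level hierarchy; the access times and hit ratios are illustrative values chosen here, not figures from the text.

#include <stdio.h>

int main(void) {
    double t_l1 = 1.0, t_l2 = 5.0, t_mem = 60.0;  /* access times in ns (illustrative)       */
    double h1 = 0.90;                             /* fraction of references satisfied by L1  */
    double h2 = 0.06;                             /* fraction satisfied by L2 (L1 miss, L2 hit) */

    /* Average time per reference: L1 hits, L2 hits, and misses that go to main memory. */
    double t_avg = h1 * t_l1
                 + h2 * (t_l1 + t_l2)
                 + (1.0 - h1 - h2) * (t_l1 + t_l2 + t_mem);

    printf("average access time: %.2f ns\n", t_avg);
    return 0;
}

As the L2 hit fraction grows (for example, because the L2 cache is made substantially larger than the L1 cache), the expensive main-memory term shrinks, which is the effect visible in Figure 4.17.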

4.4 PENTIUM 4 CACHE ORGANIZATION

The evolution of cache organization is seen clearly in the evolution of Intel microprocessors (Table 4.4). The 80386 does not include an on-chip cache. The 80486 includes a single on-chip cache of 8 Kbytes, using a line size of 16 bytes and a

Table 4.4 Intel Cache Evolution

Each entry lists the problem, the solution, and the processor on which the feature first appears.

386
   Problem: External memory slower than the system bus.
   Solution: Add external cache using faster memory technology.
486
   Problem: Increased processor speed results in external bus becoming a bottleneck for cache access.
   Solution: Move external cache on-chip, operating at the same speed as the processor.
486
   Problem: Internal cache is rather small, due to limited space on chip.
   Solution: Add external L2 cache using faster technology than main memory.
Pentium
   Problem: Contention occurs when both the Instruction Prefetcher and the Execution Unit simultaneously require access to the cache. In that case, the Prefetcher is stalled while the Execution Unit's data access takes place.
   Solution: Create separate data and instruction caches.
Pentium Pro
   Problem: Increased processor speed results in external bus becoming a bottleneck for L2 cache access.
   Solution: Create separate back-side bus that runs at higher speed than the main (front-side) external bus. The BSB is dedicated to the L2 cache.
Pentium II
   Problem: Increased processor speed results in external bus becoming a bottleneck for L2 cache access.
   Solution: Move L2 cache on to the processor chip.
Pentium III
   Problem: Some applications deal with massive databases and must have rapid access to large amounts of data. The on-chip caches are too small.
   Solution: Add external L3 cache.
Pentium 4
   Problem: Some applications deal with massive databases and must have rapid access to large amounts of data. The on-chip caches are too small.
   Solution: Move L3 cache on-chip.

four-way set-associative organization. All of the Pentium processors include two on-chip L1 caches, one for data and one for instructions. For the Pentium 4, the L1 data cache is 16 Kbytes, using a line size of 64 bytes and a four-way set-associative organization. The Pentium 4 instruction cache is described subsequently. The Pentium II also includes an L2 cache that feeds both of the L1 caches. The L2 cache is eight-way set associative with a size of 512 kB and a line size of 128 bytes. An L3 cache was added for the Pentium III and became on-chip with high-end versions of the Pentium 4.
Figure 4.18 provides a simplified view of the Pentium 4 organization, highlighting the placement of the three caches. The processor core consists of four major components:
• Fetch/decode unit: Fetches program instructions in order from the L2 cache, decodes these into a series of micro-operations, and stores the results in the L1 instruction cache.
• Out-of-order execution logic: Schedules execution of the micro-operations subject to data dependencies and resource availability; thus, micro-operations may be scheduled for execution in a different order than they were fetched from the instruction stream. As time permits, this unit schedules speculative execution of micro-operations that may be required in the future.

[Figure 4.18 Pentium 4 Block Diagram: instruction fetch/decode unit feeding an L1 instruction cache (12K micro-ops); out-of-order execution logic; integer and FP register files with simple and complex integer ALUs, FP/MMX and FP move units, and load/store address units; L1 data cache (16 kB); L2 cache (512 kB); L3 cache (1 MB); and the system bus (64-bit path to the bus, 256-bit path between L2 and L3)]

Table 4.5 Pentium 4 Cache Operating Modes

Control Bits        Operating Mode
CD      NW          Cache Fills     Write Throughs     Invalidates
0       0           Enabled         Enabled            Enabled
1       0           Disabled        Enabled            Enabled
1       1           Disabled        Disabled           Disabled

Note: CD = 0; NW = 1 is an invalid combination.

• Execution units: These units execute micro-operations, fetching the required data from the L1 data cache and temporarily storing results in registers.
• Memory subsystem: This unit includes the L2 and L3 caches and the system bus, which is used to access main memory when the L1 and L2 caches have a cache miss and to access the system I/O resources.
Unlike the organization used in all previous Pentium models, and in most other processors, the Pentium 4 instruction cache sits between the instruction decode logic and the execution core. The reasoning behind this design decision is as follows: As discussed more fully in Chapter 16, the Pentium processor decodes, or translates, Pentium machine instructions into simple RISC-like instructions called micro-operations. The use of simple, fixed-length micro-operations enables the use of superscalar pipelining and scheduling techniques that enhance performance. However, the Pentium machine instructions are cumbersome to decode; they have a variable number of bytes and many different options. It turns out that performance is enhanced if this decoding is done independently of the scheduling and pipelining logic. We return to this topic in Chapter 16.
The data cache employs a write-back policy: Data are written to main memory only when they are removed from the cache and there has been an update. The Pentium 4 processor can be dynamically configured to support write-through caching.
The L1 data cache is controlled by two bits in one of the control registers, labeled the CD (cache disable) and NW (not write-through) bits (Table 4.5). There are also two Pentium 4 instructions that can be used to control the data cache: INVD invalidates (flushes) the internal cache memory and signals the external cache (if any) to invalidate. WBINVD writes back and invalidates internal cache and then writes back and invalidates external cache.
Both the L2 and L3 caches are eight-way set-associative with a line size of 128 bytes.

4.5 ARM CACHE ORGANIZATION

The ARM cache organization has evolved with the overall architecture of the ARM family, reflecting the relentless pursuit of performance that is the driving force for all microprocessor designers. Table 4.6 shows this evolution. The ARM7 models used a unified L1 cache, while all subsequent models use a split instruction/data cache. All of the ARM

Table 4.6 ARM Cache Features

Core              Cache Type   Cache Size (kB)     Cache Line Size (words)   Associativity   Location   Write Buffer Size (words)
ARM720T           Unified      8                   4                         4-way           Logical    8
ARM920T           Split        16/16 D/I           8                         64-way          Logical    16
ARM926EJ-S        Split        4-128/4-128 D/I     8                         4-way           Logical    16
ARM1022E          Split        16/16 D/I           8                         64-way          Logical    16
ARM1026EJ-S       Split        4-128/4-128 D/I     8                         4-way           Logical    8
Intel StrongARM   Split        16/16 D/I           4                         32-way          Logical    32
Intel Xscale      Split        32/32 D/I           8                         32-way          Logical    32
ARM1136-JF-S      Split        4-64/4-64 D/I       8                         4-way           Physical   32

designs use a set-associative cache, with the degree of associativity and the line size varying. ARM cached cores with an MMU use a logical cache for processor families ARM7 through ARM10, including the Intel StrongARM and Intel Xscale processors. The ARM11 family uses a physical cache. The distinction between logical and physical cache is discussed earlier in this chapter (Figure 4.7).
An interesting feature of the ARM architecture is the use of a small first-in-first-out (FIFO) write buffer to enhance memory write performance. The write buffer is interposed between the cache and main memory and consists of a set of addresses and a set of data words. The write buffer is small compared to the cache, and may hold up to four independent addresses. Typically, the write buffer is enabled for all of main memory, although it may be selectively disabled at the page level. Figure 4.19, taken from [SLOS04], shows the relationship among the write buffer, cache, and main memory.

[Figure 4.19 ARM Cache and Write Buffer Organization: the ARM core (registers R0–R15) issues virtual addresses to the level 1 cache(s); address translation produces physical addresses, and the level 2 cache, the write buffer, and main memory sit beyond the level 1 cache(s)]

The write buffer operates as follows: When the processor performs a write to a bufferable area, the data are placed in the write buffer at processor clock speed and the processor continues execution. A write occurs when data in the cache are written back to main memory. Thus, the data to be written are transferred from the cache to the write buffer. The write buffer then performs the external write in parallel. If, however, the write buffer is full (either because there are already the maximum number of words of data in the buffer or because there is no slot for the new address) then the processor is stalled until there is sufficient space in the buffer. As non-write operations proceed, the write buffer continues to write to main memory until the buffer is completely empty.
Data written to the write buffer are not available for reading back into the cache until the data have transferred from the write buffer to main memory. This is the principal reason that the write buffer is quite small. Even so, unless there is a high proportion of writes in an executing program, the write buffer improves performance.
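A minimal software model of the behavior just described is sketched below in C. It is illustrative only and is not from the text or the ARM documentation; the real buffer is hardware, and this simplification stores one data word per buffered address. The key points it captures are that writes are accepted at processor speed, the processor stalls only when the buffer is full, and entries drain to main memory in the background.

#include <stdint.h>
#include <stdbool.h>

#define WB_SLOTS 4                    /* "may hold up to four independent addresses" */

typedef struct { uint32_t addr; uint32_t data; } wb_entry_t;

static wb_entry_t buf[WB_SLOTS];
static int head, tail, count;

/* Processor side: returns false (processor must stall) when the buffer is full. */
bool wb_enqueue(uint32_t addr, uint32_t data)
{
    if (count == WB_SLOTS)
        return false;                 /* stall until space is available            */
    buf[tail] = (wb_entry_t){ addr, data };
    tail = (tail + 1) % WB_SLOTS;
    count++;
    return true;
}

/* Memory side: called in the background to drain one entry to main memory. */
bool wb_drain(void (*write_to_memory)(uint32_t addr, uint32_t data))
{
    if (count == 0)
        return false;                 /* buffer empty                              */
    write_to_memory(buf[head].addr, buf[head].data);
    head = (head + 1) % WB_SLOTS;
    count--;
    return true;
}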

4.6 RECOMMENDED READING

[JACO08] is an excellent, up-to-date treatment of cache design. Another thorough treatment is [HAND98]. A classic paper that is still well worth reading is [SMIT82]; it surveys the various elements of cache design and presents the results of an extensive set of analyses. Another interesting classic is [WILK65], which is probably the first paper to introduce the concept of the cache. [GOOD83] also provides a useful analysis of cache behavior. Another worthwhile analysis is [BELL74]. [AGAR89] presents a detailed examination of a variety of cache design issues related to multiprogramming and multiprocessing. [HIGB90] provides a set of simple formulas that can be used to estimate cache performance as a function of various cache parameters.

AGAR89 Agarwal, A. Analysis of Cache Performance for Operating Systems and Multiprogramming. Boston: Kluwer Academic Publishers, 1989.
BELL74 Bell, J.; Casasent, D.; and Bell, C. "An Investigation into Alternative Cache Organizations." IEEE Transactions on Computers, April 1974.
GOOD83 Goodman, J. "Using Cache Memory to Reduce Processor-Memory Bandwidth." Proceedings, 10th Annual International Symposium on Computer Architecture, 1983. Reprinted in [HILL00].
HAND98 Handy, J. The Cache Memory Book. San Diego: Academic Press, 1998.
HIGB90 Higbie, L. "Quick and Easy Cache Performance Analysis." Computer Architecture News, June 1990.
JACO08 Jacob, B.; Ng, S.; and Wang, D. Memory Systems: Cache, DRAM, Disk. Boston: Morgan Kaufmann, 2008.
SMIT82 Smith, A. "Cache Memories." ACM Computing Surveys, September 1982.
WILK65 Wilkes, M. "Slave Memories and Dynamic Storage Allocation." IEEE Transactions on Electronic Computers, April 1965. Reprinted in [HILL00].

4.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

access time, associative mapping, cache hit, cache line, cache memory, cache miss, cache set, data cache, direct access, direct mapping, high-performance computing (HPC), hit, hit ratio, instruction cache, L1 cache, L2 cache, L3 cache, line, locality, logical cache, memory hierarchy, miss, multilevel cache, physical address, physical cache, random access, replacement algorithm, secondary memory, sequential access, set-associative mapping, spatial locality, split cache, tag, temporal locality, unified cache, virtual address, virtual cache, write back, write through

Review Questions
4.1 What are the differences among sequential access, direct access, and random access?
4.2 What is the general relationship among access time, memory cost, and capacity?
4.3 How does the principle of locality relate to the use of multiple memory levels?
4.4 What are the differences among direct mapping, associative mapping, and set-associative mapping?
4.5 For a direct-mapped cache, a main memory address is viewed as consisting of three fields. List and define the three fields.
4.6 For an associative cache, a main memory address is viewed as consisting of two fields. List and define the two fields.
4.7 For a set-associative cache, a main memory address is viewed as consisting of three fields. List and define the three fields.
4.8 What is the distinction between spatial locality and temporal locality?
4.9 In general, what are the strategies for exploiting spatial locality and temporal locality?

Problems
4.1 A set-associative cache consists of 64 lines, or slots, divided into four-line sets. Main memory contains 4K blocks of 128 words each. Show the format of main memory addresses.
4.2 A two-way set-associative cache has lines of 16 bytes and a total size of 8 Kbytes. The 64-Mbyte main memory is byte addressable. Show the format of main memory addresses.
4.3 For the hexadecimal main memory addresses 111111, 666666, BBBBBB, show the following information, in hexadecimal format:
a. Tag, Line, and Word values for a direct-mapped cache, using the format of Figure 4.10
b. Tag and Word values for an associative cache, using the format of Figure 4.12
c. Tag, Set, and Word values for a two-way set-associative cache, using the format of Figure 4.15

4.4 List the following values:
a. For the direct cache example of Figure 4.10: address length, number of addressable units, block size, number of blocks in main memory, number of lines in cache, size of tag
b. For the associative cache example of Figure 4.12: address length, number of addressable units, block size, number of blocks in main memory, number of lines in cache, size of tag
c. For the two-way set-associative cache example of Figure 4.15: address length, number of addressable units, block size, number of blocks in main memory, number of lines in set, number of sets, number of lines in cache, size of tag
4.5 Consider a 32-bit microprocessor that has an on-chip 16-Kbyte four-way set-associative cache. Assume that the cache has a line size of four 32-bit words. Draw a block diagram of this cache showing its organization and how the different address fields are used to determine a cache hit/miss. Where in the cache is the word from memory location ABCDE8F8 mapped?
4.6 Given the following specifications for an external cache memory: four-way set associative; line size of two 16-bit words; able to accommodate a total of 4K 32-bit words from main memory; used with a 16-bit processor that issues 24-bit addresses. Design the cache structure with all pertinent information and show how it interprets the processor's addresses.
4.7 The Intel 80486 has an on-chip, unified cache. It contains 8 Kbytes and has a four-way set-associative organization and a block length of four 32-bit words. The cache is organized into 128 sets. There is a single "line valid bit" and three bits, B0, B1, and B2 (the "LRU" bits), per line. On a cache miss, the 80486 reads a 16-byte line from main memory in a bus memory read burst. Draw a simplified diagram of the cache and show how the different fields of the address are interpreted.
4.8 Consider a machine with a byte addressable main memory of 2^16 bytes and block size of 8 bytes. Assume that a direct mapped cache consisting of 32 lines is used with this machine.
a. How is a 16-bit memory address divided into tag, line number, and byte number?
b. Into what line would bytes with each of the following addresses be stored?
0001 0001 0001 1011
1100 0011 0011 0100
1101 0000 0001 1101
1010 1010 1010 1010
c. Suppose the byte with address 0001 1010 0001 1010 is stored in the cache. What are the addresses of the other bytes stored along with it?
d. How many total bytes of memory can be stored in the cache?
e. Why is the tag also stored in the cache?
4.9 For its on-chip cache, the Intel 80486 uses a replacement algorithm referred to as pseudo least recently used. Associated with each of the 128 sets of four lines (labeled L0, L1, L2, L3) are three bits B0, B1, and B2. The replacement algorithm works as follows: When a line must be replaced, the cache will first determine whether the most recent use was from L0 and L1 or L2 and L3. Then the cache will determine which of the pair of blocks was least recently used and mark it for replacement. Figure 4.20 illustrates the logic.
a. Specify how the bits B0, B1, and B2 are set and then describe in words how they are used in the replacement algorithm depicted in Figure 4.20.
b. Show that the 80486 algorithm approximates a true LRU algorithm. Hint: Consider the case in which the most recent order of usage is L0, L2, L3, L1.
c. Demonstrate that a true LRU algorithm would require 6 bits per set.

[Figure 4.20 Intel 80486 On-Chip Cache Replacement Strategy: a decision tree. If not all four lines in the set are valid, a nonvalid line is replaced. Otherwise, B0 = 0 indicates that the L0/L1 pair was least recently used and B0 = 1 indicates the L2/L3 pair; then B1 = 0 selects L0 (B1 = 1 selects L1), or B2 = 0 selects L2 (B2 = 1 selects L3) for replacement.]
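The decision tree in Figure 4.20 translates directly into a few lines of code. The sketch below illustrates only the victim-selection step (how B0, B1, and B2 are updated on each access is the subject of Problem 4.9a); the structure and function names are invented for this illustration.

    /* Sketch of the victim-selection logic in Figure 4.20 (Intel 80486
     * pseudo-LRU).  Per the figure, B0 = 0 means the least recently used
     * line is in the L0/L1 pair; B1 and B2 then select within a pair.
     * Bit-update policy on hits is deliberately omitted. */
    #include <stdbool.h>

    typedef struct {
        bool valid[4];        /* valid bits for lines L0..L3          */
        bool b0, b1, b2;      /* pseudo-LRU bits for this set         */
    } cache_set_t;

    int select_victim(const cache_set_t *set)
    {
        /* Prefer an invalid line if one exists. */
        for (int i = 0; i < 4; i++)
            if (!set->valid[i])
                return i;

        if (!set->b0)                  /* L0/L1 pair least recently used */
            return set->b1 ? 1 : 0;    /* B1 = 0 -> L0, B1 = 1 -> L1     */
        else                           /* L2/L3 pair least recently used */
            return set->b2 ? 3 : 2;    /* B2 = 0 -> L2, B2 = 1 -> L3     */
    }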

4.10 A set-associative cache has a block size of four 16-bit words and a set size of 2. The cache can accommodate a total of 4096 words. The main memory size that is cacheable is 64K * 32 bits. Design the cache structure and show how the processor’s addresses are interpreted. 4.11 Consider a memory system that uses a 32-bit address to address at the byte level, plus a cache that uses a 64-byte line size. a. Assume a direct mapped cache with a tag field in the address of 20 bits. Show the address format and determine the following parameters: number of address- able units, number of blocks in main memory, number of lines in cache, size of tag. b. Assume an associative cache. Show the address format and determine the follow- ing parameters: number of addressable units, number of blocks in main memory, number of lines in cache, size of tag. c. Assume a four-way set-associative cache with a tag field in the address of 9 bits. Show the address format and determine the following parameters: number of ad- dressable units, number of blocks in main memory, number of lines in set, number of sets in cache, number of lines in cache, size of tag. 4.12 Consider a computer with the following characteristics: total of 1Mbyte of main memory; word size of 1 byte; block size of 16 bytes; and cache size of 64 Kbytes. a. For the main memory addresses of F0010, 01234, and CABBE, give the corre- sponding tag, cache line address, and word offsets for a direct-mapped cache. b. Give any two main memory addresses with different tags that map to the same cache slot for a direct-mapped cache. c. For the main memory addresses of F0010 and CABBE, give the corresponding tag and offset values for a fully-associative cache. d. For the main memory addresses of F0010 and CABBE, give the corresponding tag, cache set, and offset values for a two-way set-associative cache. 4.13 Describe a simple technique for implementing an LRU replacement algorithm in a four-way set-associative cache. 4.14 Consider again Example 4.3. How does the answer change if the main memory uses a block transfer capability that has a first-word access time of 30 ns and an access time of 5 ns for each word thereafter? 150 CHAPTER 4 / CACHE MEMORY

4.15 Consider the following code:

    for (i = 0; i < 20; i++)
        for (j = 0; j < 10; j++)
            a[i] = a[i] * j;

a. Give one example of the spatial locality in the code. b. Give one example of the temporal locality in the code.

4.16 Generalize Equations (4.2) and (4.3), in Appendix 4A, to N-level memory hierarchies.

4.17 A computer system contains a main memory of 32K 16-bit words. It also has a 4K-word cache divided into four-line sets with 64 words per line. Assume that the cache is initially empty. The processor fetches words from locations 0, 1, 2, . . . , 4351 in that order. It then repeats this fetch sequence nine more times. The cache is 10 times faster than main memory. Estimate the improvement resulting from the use of the cache. Assume an LRU policy for block replacement.

4.18 Consider a cache of 4 lines of 16 bytes each. Main memory is divided into blocks of 16 bytes each. That is, block 0 has bytes with addresses 0 through 15, and so on. Now consider a program that accesses memory in the following sequence of addresses: Once: 63 through 70. Loop ten times: 15 through 32; 80 through 95. a. Suppose the cache is organized as direct mapped. Memory blocks 0, 4, and so on are assigned to line 1; blocks 1, 5, and so on to line 2; and so on. Compute the hit ratio. b. Suppose the cache is organized as two-way set associative, with two sets of two lines each. Even-numbered blocks are assigned to set 0 and odd-numbered blocks are assigned to set 1. Compute the hit ratio for the two-way set-associative cache using the least recently used replacement scheme.

4.19 Consider a memory system with the following parameters:

    Tc = 100 ns     Cc = 10^-4 $/bit
    Tm = 1200 ns    Cm = 10^-5 $/bit

a. What is the cost of 1 Mbyte of main memory? b. What is the cost of 1 Mbyte of main memory using cache memory technology? c. If the effective access time is 10% greater than the cache access time, what is the hit ratio H?

4.20 a. Consider an L1 cache with an access time of 1 ns and a hit ratio of H = 0.95. Suppose that we can change the cache design (size of cache, cache organization) such that we increase H to 0.97, but increase access time to 1.5 ns. What conditions must be met for this change to result in improved performance? b. Explain why this result makes intuitive sense.

4.21 Consider a single-level cache with an access time of 2.5 ns, a line size of 64 bytes, and a hit ratio of H = 0.95. Main memory uses a block transfer capability that has a first-word (4 bytes) access time of 50 ns and an access time of 5 ns for each word thereafter. a. What is the access time when there is a cache miss? Assume that the cache waits until the line has been fetched from main memory and then re-executes for a hit. b. Suppose that increasing the line size to 128 bytes increases the H to 0.97. Does this reduce the average memory access time?

4.22 A computer has a cache, main memory, and a disk used for virtual memory. If a referenced word is in the cache, 20 ns are required to access it. If it is in main memory but not in the cache, 60 ns are needed to load it into the cache, and then the reference is started again. If the word is not in main memory, 12 ms are required to fetch the word from disk, followed by 60 ns to copy it to the cache, and then the reference is started again. The cache hit ratio is 0.9 and the main memory hit ratio is 0.6. What is the average time in nanoseconds required to access a referenced word on this system?

4.23 Consider a cache with a line size of 64 bytes. Assume that on average 30% of the lines in the cache are dirty. A word consists of 8 bytes. a. Assume there is a 3% miss rate (0.97 hit ratio). Compute the amount of main memory traffic, in terms of bytes per instruction, for both write-through and write-back policies. Memory is read into cache one line at a time. However, for write back, a single word can be written from cache to main memory. b. Repeat part a for a 5% rate. c. Repeat part a for a 7% rate. d. What conclusion can you draw from these results?

4.24 On the 68020 microprocessor, a cache access takes two clock cycles. Data access from main memory over the bus to the processor takes three clock cycles in the case of no wait state insertion; the data are delivered to the processor in parallel with delivery to the cache. a. Calculate the effective length of a memory cycle given a hit ratio of 0.9 and a clocking rate of 16.67 MHz. b. Repeat the calculations assuming insertion of two wait states of one cycle each per memory cycle. What conclusion can you draw from the results?

4.25 Assume a processor having a memory cycle time of 300 ns and an instruction processing rate of 1 MIPS. On average, each instruction requires one bus memory cycle for instruction fetch and one for the operand it involves. a. Calculate the utilization of the bus by the processor. b. Suppose the processor is equipped with an instruction cache and the associated hit ratio is 0.5. Determine the impact on bus utilization.

4.26 The performance of a single-level cache system for a read operation can be characterized by the following equation:

    Ta = Tc + (1 - H)Tm

where Ta is the average access time, Tc is the cache access time, Tm is the memory access time (memory to processor register), and H is the hit ratio. For simplicity, we assume that the word in question is loaded into the cache in parallel with the load to processor register. This is the same form as Equation (4.2). a. Define Tb = time to transfer a line between cache and main memory, and W = fraction of write references. Revise the preceding equation to account for writes as well as reads, using a write-through policy. b. Define Wb as the probability that a line in the cache has been altered. Provide an equation for Ta for the write-back policy.

4.27 For a system with two levels of cache, define Tc1 = first-level cache access time; Tc2 = second-level cache access time; Tm = memory access time; H1 = first-level cache hit ratio; H2 = combined first/second level cache hit ratio. Provide an equation for Ta for a read operation.

4.28 Assume the following performance characteristics on a cache read miss: one clock cycle to send an address to main memory and four clock cycles to access a 32-bit word from main memory and transfer it to the processor and cache. a. If the cache line size is one word, what is the miss penalty (i.e., additional time required for a read in the event of a read miss)? b. What is the miss penalty if the cache line size is four words and a multiple, nonburst transfer is executed? c. What is the miss penalty if the cache line size is four words and a burst transfer is executed, with one clock cycle per word transfer?

4.29 For the cache design of the preceding problem, suppose that increasing the line size from one word to four words results in a decrease of the read miss rate from 3.2% to 1.1%. For both the nonburst transfer and the burst transfer case, what is the average miss penalty, averaged over all reads, for the two different line sizes?

APPENDIX 4A PERFORMANCE CHARACTERISTICS OF TWO-LEVEL MEMORIES

In this chapter, reference is made to a cache that acts as a buffer between main memory and processor, creating a two-level internal memory. This two-level archi- tecture exploits a property known as locality to provide improved performance over a comparable one-level memory. The main memory cache mechanism is part of the computer architecture, implemented in hardware and typically invisible to the operating system. There are two other instances of a two-level memory approach that also exploit locality and that are, at least partially, implemented in the operating system: virtual memory and the disk cache (Table 4.7). Virtual memory is explored in Chapter 8; disk cache is beyond the scope of this book but is examined in [STAL12]. In this appendix, we look at some of the performance characteristics of two-level memories that are common to all three approaches. Locality The basis for the performance advantage of a two-level memory is a principle known as locality of reference [DENN68]. This principle states that memory ref- erences tend to cluster. Over a long period of time, the clusters in use change, but over a short period of time, the processor is primarily working with fixed clusters of memory references. Intuitively, the principle of locality makes sense. Consider the following line of reasoning: 1. Except for branch and call instructions, which constitute only a small fraction of all program instructions, program execution is sequential. Hence, in most cases, the next instruction to be fetched immediately follows the last instruc- tion fetched. 2. It is rare to have a long uninterrupted sequence of procedure calls followed by the corresponding sequence of returns. Rather, a program remains confined to a rather narrow window of procedure-invocation depth. Thus, over a short period of time references to instructions tend to be localized to a few procedures.

Table 4.7 Characteristics of Two-Level Memories

                               Main Memory Cache        Virtual Memory (paging)      Disk Cache
Typical access time ratios     5:1 (main memory         10^6:1 (main memory          10^6:1 (main memory
                               vs. cache)               vs. disk)                    vs. disk)
Memory management system       Implemented by           Combination of hardware      System software
                               special hardware         and system software
Typical block or page size     4 to 128 bytes           64 to 4096 bytes             64 to 4096 bytes
                               (cache block)            (virtual memory page)        (disk block or pages)
Access of processor to         Direct access            Indirect access              Indirect access
second level

Table 4.8 Relative Dynamic Frequency of High-Level Language Operations

Study       [HUCK83]     [KNUT71]     [PATT82a]           [TANE78]
Language    Pascal       FORTRAN      Pascal      C       SAL
Workload    Scientific   Student      System      System  System
Assign      74           67           45          38      42
Loop        4            3            5           3       4
Call        1            3            15          12      12
IF          20           11           29          43      36
GOTO        2            9            —           3       —
Other       —            7            6           1       6

3. Most iterative constructs consist of a relatively small number of instructions repeated many times. For the duration of the iteration, computation is therefore confined to a small contiguous portion of a program.

4. In many programs, much of the computation involves processing data structures, such as arrays or sequences of records. In many cases, successive references to these data structures will be to closely located data items.

This line of reasoning has been confirmed in many studies. With reference to point 1, a variety of studies have analyzed the behavior of high-level language programs. Table 4.8 includes key results, measuring the appearance of various statement types during execution, from the following studies. The earliest study of programming language behavior, performed by Knuth [KNUT71], examined a collection of FORTRAN programs used as student exercises. Tanenbaum [TANE78] published measurements collected from over 300 procedures used in operating-system programs and written in a language that supports structured programming (SAL). Patterson and Séquin [PATT82a] analyzed a set of measurements taken from compilers and programs for typesetting, computer-aided design (CAD), sorting, and file comparison. The programming languages C and Pascal were studied. Huck [HUCK83] analyzed four programs intended to represent a mix of general-purpose scientific computing, including fast Fourier transform and the integration of systems of differential equations. There is good agreement in the results of this mixture of languages and applications that branching and call instructions represent only a fraction of statements executed during the lifetime of a program. Thus, these studies confirm assertion 1.

With respect to assertion 2, studies reported in [PATT85a] provide confirmation. This is illustrated in Figure 4.21, which shows call-return behavior. Each call is represented by the line moving down and to the right, and each return by the line moving up and to the right. In the figure, a window with depth equal to 5 is defined. Only a sequence of calls and returns with a net movement of 6 in either direction causes the window to move. As can be seen, the executing program can remain within a stationary window for long periods of time. A study by the same analysts of C and Pascal programs showed that a window of depth 8 will need to shift only on less than 1% of the calls or returns [TAMI83].

[Figure 4.21 Example Call-Return Behavior of a Program: nesting depth plotted against time (in units of calls/returns), with a window of depth w = 5 that moves only when the call-return activity drifts beyond it.]

A distinction is made in the literature between spatial locality and temporal locality. Spatial locality refers to the tendency of execution to involve a number of memory locations that are clustered. This reflects the tendency of a processor to access instructions sequentially. Spatial locality also reflects the tendency of a program to access data locations sequentially, such as when processing a table of data. Temporal locality refers to the tendency for a processor to access memory locations that have been used recently. For example, when an iteration loop is executed, the processor executes the same set of instructions repeatedly.

Traditionally, temporal locality is exploited by keeping recently used instruction and data values in cache memory and by exploiting a cache hierarchy. Spatial locality is generally exploited by using larger cache blocks and by incorporating prefetching mechanisms (fetching items of anticipated use) into the cache control logic. Recently, there has been considerable research on refining these techniques to achieve greater performance, but the basic strategies remain the same.

Operation of Two-Level Memory

The locality property can be exploited in the formation of a two-level memory. The upper-level memory (M1) is smaller, faster, and more expensive (per bit) than the lower-level memory (M2). M1 is used as a temporary store for part of the contents of the larger M2. When a memory reference is made, an attempt is made to access the item in M1. If this succeeds, then a quick access is made. If not, then a block of memory locations is copied from M2 to M1 and the access then takes place via M1. Because of locality, once a block is brought into M1, there should be a number of accesses to locations in that block, resulting in fast overall service.

To express the average time to access an item, we must consider not only the speeds of the two levels of memory, but also the probability that a given reference can be found in M1. We have

    Ts = H × T1 + (1 - H) × (T1 + T2)
       = T1 + (1 - H) × T2                 (4.2)

where

    Ts = average (system) access time
    T1 = access time of M1 (e.g., cache, disk cache)
    T2 = access time of M2 (e.g., main memory, disk)
    H  = hit ratio (fraction of time reference is found in M1)

Figure 4.2 shows average access time as a function of hit ratio. As can be seen, for a high percentage of hits, the average total access time is much closer to that of M1 than M2.

Performance

Let us look at some of the parameters relevant to an assessment of a two-level memory mechanism. First consider cost. We have

    Cs = (C1 S1 + C2 S2) / (S1 + S2)       (4.3)

where

    Cs = average cost per bit for the combined two-level memory
    C1 = average cost per bit of upper-level memory M1
    C2 = average cost per bit of lower-level memory M2
    S1 = size of M1
    S2 = size of M2

We would like Cs ≈ C2. Given that C1 >> C2, this requires S1 << S2. Figure 4.22 shows the relationship.

Next, consider access time. For a two-level memory to provide a significant performance improvement, we need to have Ts approximately equal to T1 (Ts ≈ T1). Given that T1 is much less than T2 (T1 << T2), a hit ratio of close to 1 is needed.

So we would like M1 to be small to hold down cost, and large to improve the hit ratio and therefore the performance. Is there a size of M1 that satisfies both requirements to a reasonable extent? We can answer this question with a series of subquestions:
• What value of hit ratio is needed so that Ts ≈ T1?
• What size of M1 will assure the needed hit ratio?
• Does this size satisfy the cost requirement?

To get at this, consider the quantity T1/Ts, which is referred to as the access efficiency. It is a measure of how close the average access time (Ts) is to the M1 access time (T1). From Equation (4.2),

    T1/Ts = 1 / (1 + (1 - H) × T2/T1)      (4.4)
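As a quick numerical illustration of Equations (4.2) and (4.4), the following sketch tabulates Ts and the access efficiency T1/Ts over a range of hit ratios. The access times used (T1 = 10 ns, T2 = 100 ns, i.e., r = 10) are assumed values chosen only to make the trend visible; they are not taken from the text.

    /* Evaluate Equations (4.2) and (4.4) for a hypothetical two-level
     * memory with T1 = 10 ns and T2 = 100 ns (illustrative values). */
    #include <stdio.h>

    int main(void)
    {
        double t1 = 10.0;     /* access time of M1, in ns (assumed) */
        double t2 = 100.0;    /* access time of M2, in ns (assumed) */

        for (double h = 0.80; h <= 1.0001; h += 0.05) {
            double ts  = t1 + (1.0 - h) * t2;                 /* Eq. (4.2) */
            double eff = 1.0 / (1.0 + (1.0 - h) * t2 / t1);   /* Eq. (4.4) */
            printf("H = %.2f   Ts = %5.1f ns   T1/Ts = %.3f\n", h, ts, eff);
        }
        return 0;
    }

Running the sketch shows the point made below: only as H approaches 1 does the access efficiency approach 1, which is why a high hit ratio is essential.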

Figure 4.23 plots T1/Ts as a function of the hit ratio H, with the quantity T2/T1 as a parameter. Typically, on-chip cache access time is about 25 to 50 times faster than main memory access time (i.e., T2/T1 is 25 to 50), off-chip cache access time is about 5 to 15 times faster than main memory access time (i.e., T2/T1 is 5 to 15), and main

[Figure 4.22 Relationship of Average Memory Cost to Relative Memory Size for a Two-Level Memory: relative combined cost (Cs/C2) plotted against relative size of the two levels (S2/S1), with curves for (C1/C2) = 10, 100, and 1000.]

[Figure 4.23 Access Efficiency as a Function of Hit Ratio (r = T2/T1): access efficiency T1/Ts plotted against hit ratio H for r = 1, 10, 100, and 1000.]

[Figure 4.24 Hit Ratio as a Function of Relative Memory Size: hit ratio plotted against relative memory size (S1/S2) for no locality, moderate locality, and strong locality.]

memory access time is about 1000 times faster than disk access time (T2/T1 ≈ 1000). Thus, a hit ratio in the neighborhood of 0.9 would seem to be needed to satisfy the performance requirement.

We can now phrase the question about relative memory size more exactly. Is a hit ratio of, say, 0.8 or better reasonable for S1 << S2? This will depend on a number of factors, including the nature of the software being executed and the details of the design of the two-level memory. The main determinant is, of course, the degree of locality. Figure 4.24 suggests the effect that locality has on the hit ratio. Clearly, if M1 is the same size as M2, then the hit ratio will be 1.0: All of the items in M2 are always stored also in M1. Now suppose that there is no locality; that is, references are completely random. In that case the hit ratio should be a strictly linear function of the relative memory size. For example, if M1 is half the size of M2, then at any time half of the items from M2 are also in M1 and the hit ratio will be 0.5. In practice, however, there is some degree of locality in the references. The effects of moderate and strong locality are indicated in the figure. Note that Figure 4.24 is not derived from any specific data or model; the figure suggests the type of performance that is seen with various degrees of locality.

So if there is strong locality, it is possible to achieve high values of hit ratio even with relatively small upper-level memory size. For example, numerous studies have shown that rather small cache sizes will yield a hit ratio above 0.75 regardless of the size of main memory (e.g., [AGAR89], [PRZY88], [STRE83], and [SMIT82]). A cache in the range of 1K to 128K words is generally adequate, whereas main memory is now typically in the gigabyte range. When we consider virtual memory and

disk cache, we will cite other studies that confirm the same phenomenon, namely that a relatively small M1 yields a high value of hit ratio because of locality.

This brings us to the last question listed earlier: Does the relative size of the two memories satisfy the cost requirement? The answer is clearly yes. If we need only a relatively small upper-level memory to achieve good performance, then the average cost per bit of the two levels of memory will approach that of the cheaper lower-level memory. Please note that with L2 cache, or even L2 and L3 caches, involved, analysis is much more complex. See [PEIR99] and [HAND98] for discussions.

CHAPTER 5 INTERNAL MEMORY

5.1 Semiconductor Main Memory
    Organization
    DRAM and SRAM
    Types of ROM
    Chip Logic
    Chip Packaging
    Module Organization
    Interleaved Memory
5.2 Error Correction
5.3 Advanced DRAM Organization
    Synchronous DRAM
    Rambus DRAM
    DDR SDRAM
    Cache DRAM
5.4 Recommended Reading
5.5 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Present an overview of the principal types of semiconductor main memory.
• Understand the operation of a basic code that can detect and correct single-bit errors in 8-bit words.
• Summarize the properties of contemporary advanced DRAM organizations.

We begin this chapter with a survey of semiconductor main memory subsystems, including ROM, DRAM, and SRAM memories. Then we look at error control techniques used to enhance memory reliability. Following this, we look at more advanced DRAM architectures.

5.1 SEMICONDUCTOR MAIN MEMORY

In earlier computers, the most common form of random-access storage for com- puter main memory employed an array of doughnut-shaped ferromagnetic loops referred to as cores. Hence, main memory was often referred to as core, a term that persists to this day. The advent of, and advantages of, microelectronics has long since vanquished the magnetic core memory. Today, the use of semiconductor chips for main memory is almost universal. Key aspects of this technology are explored in this section. Organization The basic element of a semiconductor memory is the memory cell. Although a vari- ety of electronic technologies are used, all semiconductor memory cells share cer- tain properties: • They exhibit two stable (or semistable) states, which can be used to represent binary 1 and 0. • They are capable of being written into (at least once), to set the state. • They are capable of being read to sense the state. Figure 5.1 depicts the operation of a memory cell. Most commonly, the cell has three functional terminals capable of carrying an electrical signal. The select terminal, as the name suggests, selects a memory cell for a read or write opera- tion. The control terminal indicates read or write. For writing, the other terminal provides an electrical signal that sets the state of the cell to 1 or 0. For reading, that terminal is used for output of the cell’s state. The details of the internal organiza- tion, functioning, and timing of the memory cell depend on the specific integrated circuit technology used and are beyond the scope of this book, except for a brief summary. For our purposes, we will take it as given that individual cells can be selected for reading and writing operations. 5.1 / SEMICONDUCTOR MAIN MEMORY 161

[Figure 5.1 Memory Cell Operation: (a) Write, in which the select and control terminals plus a data-in terminal set the cell state; (b) Read, in which the select and control terminals cause the cell state to be sensed on the output terminal.]

DRAM and SRAM All of the memory types that we will explore in this chapter are random access. That is, individual words of memory are directly accessed through wired-in addressing logic. Table 5.1 lists the major types of semiconductor memory. The most common is referred to as random-access memory (RAM). This is, in fact, a misuse of the term, because all of the types listed in the table are random access. One distin- guishing characteristic of memory that is designated as RAM is that it is possible both to read data from the memory and to write new data into the memory easily and rapidly. Both the reading and writing are accomplished through the use of electrical signals. The other distinguishing characteristic of RAM is that it is volatile. A RAM must be provided with a constant power supply. If the power is interrupted, then the data are lost. Thus, RAM can be used only as temporary storage. The two tradi- tional forms of RAM used in computers are DRAM and SRAM. DYNAMIC RAM RAM technology is divided into two technologies: dynamic and static. A dynamic RAM (DRAM) is made with cells that store data as charge on capacitors. The presence or absence of charge in a capacitor is interpreted as a binary 1 or 0. Because capacitors have a natural tendency to discharge, dynamic

Table 5.1 Semiconductor Memory Types

Memory Type                           Category             Erasure                     Write Mechanism   Volatility
Random-access memory (RAM)            Read-write memory    Electrically, byte-level    Electrically      Volatile
Read-only memory (ROM)                Read-only memory     Not possible                Masks             Nonvolatile
Programmable ROM (PROM)               Read-only memory     Not possible                Electrically      Nonvolatile
Erasable PROM (EPROM)                 Read-mostly memory   UV light, chip-level        Electrically      Nonvolatile
Electrically Erasable PROM (EEPROM)   Read-mostly memory   Electrically, byte-level    Electrically      Nonvolatile
Flash memory                          Read-mostly memory   Electrically, block-level   Electrically      Nonvolatile

RAMs require periodic charge refreshing to maintain data storage. The term dynamic refers to this tendency of the stored charge to leak away, even with power continuously applied. Figure 5.2a is a typical DRAM structure for an individual cell that stores 1 bit. The address line is activated when the bit value from this cell is to be read or written. The transistor acts as a switch that is closed (allowing current to flow) if a voltage is applied to the address line and open (no current flows) if no voltage is present on the address line. For the write operation, a voltage signal is applied to the bit line; a high volt- age represents 1, and a low voltage represents 0. A signal is then applied to the address line, allowing a charge to be transferred to the capacitor. For the read operation, when the address line is selected, the transistor turns on and the charge stored on the capacitor is fed out onto a bit line and to a sense amplifier. The sense amplifier compares the capacitor voltage to a reference value and determines if the cell contains a logic 1 or a logic 0. The readout from the cell discharges the capacitor, which must be restored to complete the operation. Although the DRAM cell is used to store a single bit (0 or 1), it is essentially an analog device. The capacitor can store any charge value within a range; a thresh- old value determines whether the charge is interpreted as 1 or 0.

STATIC RAM In contrast, a static RAM (SRAM) is a digital device that uses the same logic elements used in the processor. In a SRAM, binary values are stored using traditional flip-flop logic-gate configurations (see Chapter 11 for a description of flip-flops). A static RAM will hold its data as long as power is supplied to it.

[Figure 5.2 Typical Memory Cell Structures: (a) Dynamic RAM (DRAM) cell, consisting of an address line, a transistor switch, and a storage capacitor connected to the bit line; (b) Static RAM (SRAM) cell, consisting of six transistors (T1 through T6) cross-connected between a dc voltage and ground, accessed via the address line and the bit lines B and B-bar.]

Figure 5.2b is a typical SRAM structure for an individual cell. Four transistors (T1, T2, T3, T4) are cross connected in an arrangement that produces a stable logic state. In logic state 1, point C1 is high and point C2 is low; in this state, T1 and T4 are off 1 and T2 and T3 are on. In logic state 0, point C1 is low and point C2 is high; in this state, T1 and T4 are on and T2 and T3 are off. Both states are stable as long as the direct current (dc) voltage is applied. Unlike the DRAM, no refresh is needed to retain data. As in the DRAM, the SRAM address line is used to open or close a switch. The address line controls two transistors (T5 and T6). When a signal is applied to this line, the two transistors are switched on, allowing a read or write operation. For a write operation, the desired bit value is applied to line B, while its complement is applied to line B. This forces the four transistors (T1, T2, T3, T4) into the proper state. For a read operation, the bit value is read from line B. SRAM VERSUS DRAM Both static and dynamic RAMs are volatile; that is, power must be continuously supplied to the memory to preserve the bit values. A dynamic memory cell is simpler and smaller than a static memory cell. Thus, a DRAM is more dense (smaller cells = more cells per unit area) and less expensive than a corresponding SRAM. On the other hand, a DRAM requires the supporting refresh circuitry. For larger memories, the fixed cost of the refresh circuitry is more than compensated for by the smaller variable cost of DRAM cells. Thus, DRAMs tend to be favored for large memory requirements. A final point is that SRAMs are somewhat faster than DRAMs. Because of these relative characteristics, SRAM is used for cache memory (both on and off chip), and DRAM is used for main memory. Types of ROM As the name suggests, a read-only memory (ROM) contains a permanent pattern of data that cannot be changed. A ROM is nonvolatile; that is, no power source is required to maintain the bit values in memory. While it is possible to read a ROM, it is not possible to write new data into it. An important application of ROMs is microprogramming, discussed in Part Four. Other potential applications include • Library subroutines for frequently wanted functions • System programs • Function tables For a modest-sized requirement, the advantage of ROM is that the data or program is permanently in main memory and need never be loaded from a secondary storage device. A ROM is created like any other integrated circuit chip, with the data actually wired into the chip as part of the fabrication process. This presents two problems: • The data insertion step includes a relatively large fixed cost, whether one or thousands of copies of a particular ROM are fabricated. • There is no room for error. If one bit is wrong, the whole batch of ROMs must be thrown out.

1 The circles associated with T3 and T4 in Figure 5.2b indicate signal negation.

When only a small number of ROMs with a particular memory content is needed, a less expensive alternative is the programmable ROM (PROM). Like the ROM, the PROM is nonvolatile and may be written into only once. For the PROM, the writing process is performed electrically and may be performed by a supplier or customer at a time later than the original chip fabrication. Special equipment is required for the writing or “programming” process. PROMs provide flexibility and convenience. The ROM remains attractive for high-volume production runs.

Another variation on read-only memory is the read-mostly memory, which is useful for applications in which read operations are far more frequent than write operations but for which nonvolatile storage is required. There are three common forms of read-mostly memory: EPROM, EEPROM, and flash memory.

The erasable programmable read-only memory (EPROM) is read and written electrically, as with PROM. However, before a write operation, all the storage cells must be erased to the same initial state by exposure of the packaged chip to ultraviolet radiation. Erasure is performed by shining an intense ultraviolet light through a window that is designed into the memory chip. This erasure process can be performed repeatedly; each erasure can take as much as 20 minutes to perform. Thus, the EPROM can be altered multiple times and, like the ROM and PROM, holds its data virtually indefinitely. For comparable amounts of storage, the EPROM is more expensive than PROM, but it has the advantage of the multiple update capability.

A more attractive form of read-mostly memory is electrically erasable programmable read-only memory (EEPROM). This is a read-mostly memory that can be written into at any time without erasing prior contents; only the byte or bytes addressed are updated. The write operation takes considerably longer than the read operation, on the order of several hundred microseconds per byte. The EEPROM combines the advantage of nonvolatility with the flexibility of being updatable in place, using ordinary bus control, address, and data lines. EEPROM is more expensive than EPROM and also is less dense, supporting fewer bits per chip.

Another form of semiconductor memory is flash memory (so named because of the speed with which it can be reprogrammed). First introduced in the mid-1980s, flash memory is intermediate between EPROM and EEPROM in both cost and functionality. Like EEPROM, flash memory uses an electrical erasing technology. An entire flash memory can be erased in one or a few seconds, which is much faster than EPROM. In addition, it is possible to erase just blocks of memory rather than an entire chip. Flash memory gets its name because the microchip is organized so that a section of memory cells are erased in a single action or “flash.” However, flash memory does not provide byte-level erasure. Like EPROM, flash memory uses only one transistor per bit, and so achieves the high density (compared with EEPROM) of EPROM.

Chip Logic

As with other integrated circuit products, semiconductor memory comes in packaged chips (Figure 2.7). Each chip contains an array of memory cells.

In the memory hierarchy as a whole, we saw that there are trade-offs among speed, capacity, and cost. These trade-offs also exist when we consider the organization of memory cells and functional logic on a chip. For semiconductor memories, one of the key design issues is the number of bits of data that may be read/written at a time. At one extreme is an organization in which the physical arrangement of cells in the array is the same as the logical arrangement (as perceived by the processor) of words in memory. The array is organized into W words of B bits each. For example, a 16-Mbit chip could be organized as 1M 16-bit words. At the other extreme is the so-called 1-bit-per-chip organization, in which data are read/written 1 bit at a time. We will illustrate memory chip organization with a DRAM; ROM organization is similar, though simpler.

Figure 5.3 shows a typical organization of a 16-Mbit DRAM. In this case, 4 bits are read or written at a time. Logically, the memory array is organized as four square arrays of 2048 by 2048 elements. Various physical arrangements are possible. In any case, the elements of the array are connected by both horizontal (row) and vertical (column) lines. Each horizontal line connects to the Select terminal of each cell in its row; each vertical line connects to the Data-In/Sense terminal of each cell in its column.

Address lines supply the address of the word to be selected. A total of log2 W lines are needed. In our example, 11 address lines are needed to select one of 2048 rows. These 11 lines are fed into a row decoder, which has 11 lines of input and 2048 lines for output. The logic of the decoder activates a single one of the 2048 outputs depending on the bit pattern on the 11 input lines (2^11 = 2048).

An additional 11 address lines select one of 2048 columns of 4 bits per column. Four data lines are used for the input and output of 4 bits to and from a data buffer. On input (write), the bit driver of each bit line is activated for a 1 or 0 according to the value of the corresponding data line. On output (read), the value of each bit line is passed through a sense amplifier and presented to the data lines. The row line selects which row of cells is used for reading or writing.

Because only 4 bits are read/written to this DRAM, there must be multiple DRAMs connected to the memory controller to read/write a word of data to the bus.

Note that there are only 11 address lines (A0–A10), half the number you would expect for a 2048 * 2048 array. This is done to save on the number of pins. The 22 required address lines are passed through select logic external to the chip and multiplexed onto the 11 address lines. First, 11 address signals are passed to the chip to define the row address of the array, and then the other 11 address signals are presented for the column address. These signals are accompanied by row address select (RAS) and column address select (CAS) signals to provide timing to the chip.

The write enable (WE) and output enable (OE) pins determine whether a write or read operation is performed. Two other pins, not shown in Figure 5.3, are ground (Vss) and a voltage source (Vcc).

As an aside, multiplexed addressing plus the use of square arrays result in a quadrupling of memory size with each new generation of memory chips.
One more pin devoted to addressing doubles the number of rows and columns, and so the size of the chip memory grows by a factor of 4. Figure 5.3 also indicates the inclusion of refresh circuitry. All DRAMs require a refresh operation. A simple technique for refreshing is, in effect, to disable the DRAM chip while all data cells are refreshed. The refresh counter steps through all of the row values. For each row, the output lines from the refresh counter are supplied to the row decoder and the RAS line is activated. The data are read out and written back into the same location. This causes each cell in the row to be refreshed.

RAS CAS WE OE

Timing and control

Refresh MUX counter

Row Row Memory array address de- • (2048 2048 4) A0 buffer coder • A1 •

• • • • • Data input Column buffer D1 A10 D2 address Refresh circuitry D3 buffer Data output D4 buffer Column decoder

Figure 5.3 Typical 16 DRAM (4M * 4) 5.1 / SEMICONDUCTOR MAIN MEMORY 167
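The address multiplexing just described can be summarized in a short sketch. The function below splits a 22-bit cell address into the 11-bit row and 11-bit column values that would be driven onto pins A0–A10 in two steps; the function name and the printed "signals" are invented stand-ins for real controller and timing logic, not part of any actual interface.

    /* Sketch of multiplexing a 22-bit DRAM address onto 11 address pins,
     * matching the 2048 x 2048 example above: the row half is presented
     * with RAS, then the column half with CAS. */
    #include <stdint.h>
    #include <stdio.h>

    #define ADDR_PINS 11
    #define PIN_MASK  ((1u << ADDR_PINS) - 1)

    void dram_present_address(uint32_t cell_addr)   /* 22-bit address */
    {
        uint32_t row = (cell_addr >> ADDR_PINS) & PIN_MASK;  /* upper 11 bits */
        uint32_t col =  cell_addr & PIN_MASK;                /* lower 11 bits */

        printf("drive A0-A10 = 0x%03X, assert RAS (row address)\n", row);
        printf("drive A0-A10 = 0x%03X, assert CAS (column address)\n", col);
    }

    int main(void)
    {
        dram_present_address(0x2ABCD);   /* any 22-bit value will do */
        return 0;
    }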

Chip Packaging

As was mentioned in Chapter 2, an integrated circuit is mounted on a package that contains pins for connection to the outside world. Figure 5.4a shows an example EPROM package, which is an 8-Mbit chip organized as 1M * 8. In this case, the organization is treated as a one-word-per-chip package. The package includes 32 pins, which is one of the standard chip package sizes. The pins support the following signal lines:
• The address of the word being accessed. For 1M words, a total of 20 (2^20 = 1M) pins are needed (A0–A19).
• The data to be read out, consisting of 8 lines (D0–D7).

• The power supply to the chip (Vcc).
• A ground pin (Vss).
• A chip enable (CE) pin. Because there may be more than one memory chip, each of which is connected to the same address bus, the CE pin is used to indicate whether or not the address is valid for this chip. The CE pin is activated by logic connected to the higher-order bits of the address bus (i.e., address bits above A19). The use of this signal is illustrated presently; a decoding sketch also follows this list.

• A program voltage (Vpp) that is supplied during programming (write operations). A typical DRAM pin configuration is shown in Figure 5.4b, for a 16-Mbit chip organized as 4M * 4. There are several differences from a ROM chip. Because a RAM can be updated, the data pins are input/output. The write enable (WE) and output enable (OE) pins indicate whether this is a write or read operation.
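A minimal sketch of the chip-enable decoding mentioned in the CE bullet above: it assumes, purely for illustration, a memory built from eight of the 1M × 8 chips of Figure 5.4a, so that the address bits above A19 select which chip's CE line is asserted while A0–A19 go to every chip. The configuration and names are assumptions made for this example.

    /* Chip-enable decoding sketch: eight 1M x 8 chips (assumed), so the
     * three address bits above A19 pick the chip and A19-A0 address the
     * word within every chip. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t addr   = 0x2A1234;              /* 23-bit byte address     */
        unsigned chip   = (addr >> 20) & 0x7;    /* bits above A19: chip id */
        unsigned offset = addr & 0xFFFFF;        /* A19-A0: word in chip    */

        printf("assert CE on chip %u, word address 0x%05X\n", chip, offset);
        return 0;
    }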

[Figure 5.4 Typical Memory Package Pins and Signals: (a) 8-Mbit EPROM, a 32-pin DIP with address pins A0–A19, data pins D0–D7, Vcc, Vss, Vpp, and CE; (b) 16-Mbit DRAM, a 24-pin DIP with address pins A0–A10, data pins D0–D3, Vcc, Vss, RAS, CAS, WE, OE, and NC.]

Because the DRAM is accessed by row and column, and the address is multiplexed, only 11 address pins are needed to specify the 4M row/column combinations (2^11 × 2^11 = 2^22 = 4M). The functions of the row address select (RAS) and column address select (CAS) pins were discussed previously. Finally, the no connect (NC) pin is provided so that there are an even number of pins.

Module Organization

If a RAM chip contains only 1 bit per word, then clearly we will need at least a number of chips equal to the number of bits per word. As an example, Figure 5.5 shows how a memory module consisting of 256K 8-bit words could be organized. For 256K words, an 18-bit address is needed and is supplied to the module from some external source (e.g., the address lines of a bus to which the module is attached). The address is presented to eight 256K × 1-bit chips, each of which provides the input/output of 1 bit.

[Figure 5.5 256-KByte Memory Organization: the 18-bit address from the memory address register drives eight 512 × 512-bit chips, one per bit position; 9 bits decode 1 of 512 rows and 9 bits decode 1 of 512 bit-sense columns, and the eight selected bits feed the memory buffer register.]

[Figure 5.6 1-Mbyte Memory Organization: four groups (A, B, C, D) of eight 512 × 512-bit chips; the two high-order bits of the memory address register select 1 of the 4 groups via the chip enable (E) lines, the remaining bits address a word within the selected group, and the result feeds the memory buffer register.]

This organization works as long as the size of memory equals the number of bits per chip. In the case in which larger memory is required, an array of chips is needed. Figure 5.6 shows the possible organization of a memory consisting of 1M word by 8 bits per word. In this case, we have four columns of chips, each column containing 256K words arranged as in Figure 5.5. For 1M word, 20 address lines are needed. The 18 least significant bits are routed to all 32 modules. The high-order 2 bits are input to a group select logic module that sends a chip enable signal to one of the four columns of modules.


Interleaved Memory

Main memory is composed of a collection of DRAM memory chips. A number of chips can be grouped together to form a memory bank. It is possible to organize the memory banks in a way known as interleaved memory. Each bank is independently able to service a memory read or write request, so that a system with K banks can service K requests simultaneously, increasing memory read or write rates by a factor of K. If consecutive words of memory are stored in different banks, then the transfer of a block of memory is speeded up. Appendix E explores the topic of interleaved memory.
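A minimal sketch of one common interleaving scheme (low-order interleaving) is shown below. The choice of K = 4 banks and the modulo mapping are assumptions made for illustration, not a description of any particular system; the point is simply that consecutive word addresses land in different banks.

    /* Low-order interleaving sketch: consecutive word addresses map to
     * different banks, so a block transfer can keep all banks busy. */
    #include <stdio.h>

    #define K 4                          /* number of memory banks (assumed) */

    int main(void)
    {
        for (unsigned addr = 0; addr < 8; addr++) {
            unsigned bank   = addr % K;  /* which bank holds this word       */
            unsigned offset = addr / K;  /* word position within that bank   */
            printf("word %u -> bank %u, offset %u\n", addr, bank, offset);
        }
        return 0;
    }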

5.2 ERROR CORRECTION

A semiconductor memory system is subject to errors. These can be categorized as hard failures and soft errors. A hard failure is a permanent physical defect so that the memory cell or cells affected cannot reliably store data but become stuck at 0 or 1 or switch erratically between 0 and 1. Hard errors can be caused by harsh environmental abuse, manufacturing defects, and wear. A soft error is a random, nondestructive event that alters the contents of one or more memory cells with- out damaging the memory. Soft errors can be caused by power supply problems or alpha particles. These particles result from radioactive decay and are distress- ingly common because radioactive nuclei are found in small quantities in nearly all materials. Both hard and soft errors are clearly undesirable, and most modern main memory systems include logic for both detecting and correcting errors. Figure 5.7 illustrates in general terms how the process is carried out. When data are to be written into memory, a calculation, depicted as a function f, is per- formed on the data to produce a code. Both the code and the data are stored. Thus, if an M-bit word of data is to be stored and the code is of length K bits, then the actual size of the stored word is M + K bits. When the previously stored word is read out, the code is used to detect and pos- sibly correct errors. A new set of K code bits is generated from the M data bits and compared with the fetched code bits. The comparison yields one of three results: • No errors are detected. The fetched data bits are sent out. • An error is detected, and it is possible to correct the error. The data bits plus error correction bits are fed into a corrector, which produces a corrected set of M bits to be sent out. • An error is detected, but it is not possible to correct it. This condition is reported. Codes that operate in this fashion are referred to as error-correcting codes. A code is characterized by the number of bit errors in a word that it can correct and detect.

[Figure 5.7 Error-Correcting Code Function: on a write, the M data bits pass through a function f that generates K check bits, and both are stored (M + K bits in all); on a read, the fetched data bits are passed through f again, the result is compared with the fetched check bits, and the comparison either passes the data out unchanged, drives a corrector that produces corrected data, or raises an error signal.]

[Figure 5.8 Hamming Error-Correcting Code: Venn diagrams with three intersecting circles A, B, and C. (a) The four data bits are placed in the inner compartments; (b) parity bits are added so that each circle contains an even number of 1s; (c) an error changes one data bit; (d) the parity checks of circles A and C fail, isolating the compartment, and hence the bit, in error.]

The simplest of the error-correcting codes is the Hamming code devised by Richard Hamming at Bell Laboratories. Figure 5.8 uses Venn diagrams to illustrate the use of this code on 4-bit words (M = 4). With three intersecting circles, there are seven compartments. We assign the 4 data bits to the inner compartments (Figure 5.8a). The remaining compartments are filled with what are called parity bits. Each parity bit is chosen so that the total number of 1s in its circle is even (Figure 5.8b). Thus, because circle A includes three data 1s, the parity bit in that circle is set to 1. Now, if an error changes one of the data bits (Figure 5.8c), it is easily found. By checking the parity bits, discrepancies are found in circle A and circle C but not in circle B. Only one of the seven compartments is in A and C but not B. The error can therefore be corrected by changing that bit.

To clarify the concepts involved, we will develop a code that can detect and correct single-bit errors in 8-bit words. To start, let us determine how long the code must be. Referring to Figure 5.7, the comparison logic receives as input two K-bit values. A bit-by-bit comparison is done by taking the exclusive-OR of the two inputs. The result is called the syndrome word. Thus, each bit of the syndrome is 0 or 1 according to whether there is or is not a match in that bit position for the two inputs.

The syndrome word is therefore K bits wide and has a range between 0 and 2^K - 1. The value 0 indicates that no error was detected, leaving 2^K - 1 values to indicate, if there is an error, which bit was in error. Now, because an error could occur on any of the M data bits or K check bits, we must have

    2^K - 1 ≥ M + K

Table 5.2 Increase in Word Length with Error Correction

             Single-Error Correction        Single-Error Correction/Double-Error Detection
Data Bits    Check Bits    % Increase       Check Bits    % Increase
8            4             50               5             62.5
16           5             31.25            6             37.5
32           6             18.75            7             21.875
64           7             10.94            8             12.5
128          8             6.25             9             7.03
256          9             3.52             10            3.91

This inequality gives the number of bits needed to correct a single bit error in a word containing M data bits. For example, for a word of 8 data bits (M = 8), we have
• K = 3: 2^3 - 1 < 8 + 3
• K = 4: 2^4 - 1 > 8 + 4
Thus, eight data bits require four check bits. The first three columns of Table 5.2 list the number of check bits required for various data word lengths.

For convenience, we would like to generate a 4-bit syndrome for an 8-bit data word with the following characteristics:
• If the syndrome contains all 0s, no error has been detected.
• If the syndrome contains one and only one bit set to 1, then an error has occurred in one of the 4 check bits. No correction is needed.
• If the syndrome contains more than one bit set to 1, then the numerical value of the syndrome indicates the position of the data bit in error. This data bit is inverted for correction.

To achieve these characteristics, the data and check bits are arranged into a 12-bit word as depicted in Figure 5.9. The bit positions are numbered from 1 to 12. Those bit positions whose position numbers are powers of 2 are designated as check bits. The check bits are calculated as follows, where the symbol ⊕ designates the exclusive-OR operation:

    C1 = D1 ⊕ D2 ⊕ D4 ⊕ D5 ⊕ D7
    C2 = D1 ⊕ D3 ⊕ D4 ⊕ D6 ⊕ D7
    C4 = D2 ⊕ D3 ⊕ D4 ⊕ D8
    C8 = D5 ⊕ D6 ⊕ D7 ⊕ D8

Bit position       12    11    10     9     8     7     6     5     4     3     2     1
Position number  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Data bit           D8    D7    D6    D5          D4    D3    D2          D1
Check bit                                  C8                      C4          C2    C1

Figure 5.9 Layout of Data Bits and Check Bits

Each check bit operates on every data bit whose position number contains a 1 in the same bit position as the position number of that check bit. Thus, data bit positions 3, 5, 7, 9, and 11 (D1, D2, D4, D5, D7) all contain a 1 in the least significant bit of their position number, as does C1; bit positions 3, 6, 7, 10, and 11 all contain a 1 in the second bit position, as does C2; and so on. Looked at another way, bit position n is checked by those check bits Ci such that Σi = n. For example, position 7 is checked by the bits in positions 4, 2, and 1; and 7 = 4 + 2 + 1.

Let us verify that this scheme works with an example. Assume that the 8-bit input word is 00111001, with data bit D1 in the rightmost position. The calculations are as follows:

    C1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
    C2 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
    C4 = 0 ⊕ 0 ⊕ 1 ⊕ 0 = 1
    C8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0

Suppose now that data bit 3 sustains an error and is changed from 0 to 1. When the check bits are recalculated, we have

    C1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
    C2 = 1 ⊕ 1 ⊕ 1 ⊕ 1 ⊕ 0 = 0
    C4 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
    C8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0

When the new check bits are compared with the old check bits, the syndrome word is formed:

        C8 C4 C2 C1
         0  1  1  1
    ⊕    0  0  0  1
         ----------
         0  1  1  0

The result is 0110, indicating that bit position 6, which contains data bit 3, is in error.

Figure 5.10 illustrates the preceding calculation. The data and check bits are positioned properly in the 12-bit word. Four of the data bits have a value 1 (shaded in the table), and their bit position values are XORed to produce the Hamming code 0111, which forms the four check digits. The entire block that is stored is

Bit position       12    11    10     9     8     7     6     5     4     3     2     1
Position number  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Data bit           D8    D7    D6    D5          D4    D3    D2          D1
Check bit                                  C8                      C4          C2    C1
Word stored as      0     0     1     1     0     1     0     0     1     1     1     1
Word fetched as     0     0     1     1     0     1     1     0     1     1     1     1
Position number  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Check bit                                   0                       0           0     1

Figure 5.10 Check Bit Calculation

[Figure 5.11 Hamming SEC-DED Code: a sequence of Venn diagrams (a) through (f) for a 4-bit data word, showing how two simultaneous errors defeat the single-error checking procedure and how an added overall parity bit allows the double error to be detected.]

001101001111. Suppose now that data bit 3, in bit position 6, sustains an error and is changed from 0 to 1. The resulting block is 001101101111, with a Hamming code of 0111. An XOR of the Hamming code and all of the bit position values for nonzero data bits results in 0110. The nonzero result detects an error and indicates that the error is in bit position 6. The code just described is known as a single-error-correcting (SEC) code. More commonly, semiconductor memory is equipped with a single-error-correcting, double-error-detecting (SEC-DED) code. As Table 5.2 shows, such codes require one additional bit compared with SEC codes. Figure 5.11 illustrates how such a code works, again with a 4-bit data word. The sequence shows that if two errors occur (Figure 5.11c), the checking procedure goes astray (d) and worsens the problem by creating a third error (e). To overcome the problem, an eighth bit is added that is set so that the total number of 1s in the diagram is even. The extra parity bit catches the error (f). An error-correcting code enhances the reliability of the memory at the cost of added complexity. With a 1-bit-per-chip organization, an SEC-DED code is generally considered adequate. For example, the IBM 30xx implementations used an 8-bit SEC- DED code for each 64 bits of data in main memory. Thus, the size of main memory is actually about 12% larger than is apparent to the user. The VAX computers used a 7-bit SEC-DED for each 32 bits of memory, for a 22% overhead. A number of contempo- rary DRAMs use 9 check bits for each 128 bits of data, for a 7% overhead [SHAR97].
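The SEC code developed above is small enough to express directly in code. The sketch below implements the check-bit equations of Figure 5.9 and replays the worked example (data word 0011 1001, an error in data bit D3, syndrome 0110); it is an illustration of the calculation only, not production error-correction logic.

    /* Single-error-correcting code for 8-bit data words, following the
     * check-bit equations of Figure 5.9.  Reproduces the worked example:
     * stored check bits 0111; flipping D3 yields syndrome 0110, i.e.,
     * bit position 6. */
    #include <stdio.h>
    #include <stdint.h>

    /* d holds the 8-bit data word with D1 in the least significant bit. */
    static unsigned check_bits(uint8_t d)
    {
        unsigned D[9];
        for (int i = 1; i <= 8; i++)
            D[i] = (d >> (i - 1)) & 1;

        unsigned c1 = D[1] ^ D[2] ^ D[4] ^ D[5] ^ D[7];
        unsigned c2 = D[1] ^ D[3] ^ D[4] ^ D[6] ^ D[7];
        unsigned c4 = D[2] ^ D[3] ^ D[4] ^ D[8];
        unsigned c8 = D[5] ^ D[6] ^ D[7] ^ D[8];

        return (c8 << 3) | (c4 << 2) | (c2 << 1) | c1;   /* C8 C4 C2 C1 */
    }

    int main(void)
    {
        uint8_t stored  = 0x39;            /* 0011 1001, as in the example  */
        uint8_t fetched = stored ^ 0x04;   /* data bit D3 flipped in memory */

        unsigned old_check = check_bits(stored);
        unsigned new_check = check_bits(fetched);
        unsigned syndrome  = old_check ^ new_check;

        printf("stored check bits = %X\n", old_check);   /* 7 (0111) */
        printf("syndrome          = %X\n", syndrome);    /* 6 (0110) */
        printf("error in bit position %u (data bit D3)\n", syndrome);
        return 0;
    }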

5.3 ADVANCED DRAM ORGANIZATION

As discussed in Chapter 2, one of the most critical system bottlenecks when using high-performance processors is the interface to main internal memory. This interface is the most important pathway in the entire computer system. The basic building block of main memory remains the DRAM chip, as it has for decades; until

Table 5.3 Performance Comparison of Some DRAM Alternatives

         Clock Frequency (MHz)   Transfer Rate (GB/s)   Access Time (ns)   Pin Count
SDRAM    166                     1.3                    18                 168
DDR      200                     3.2                    12.5               184
RDRAM    600                     4.8                    12                 162

recently, there had been no significant changes in DRAM architecture since the early 1970s. The traditional DRAM chip is constrained both by its internal architecture and by its interface to the processor's memory bus.

We have seen that one attack on the performance problem of DRAM main memory has been to insert one or more levels of high-speed SRAM cache between the DRAM main memory and the processor. But SRAM is much costlier than DRAM, and expanding cache size beyond a certain point yields diminishing returns.

In recent years, a number of enhancements to the basic DRAM architecture have been explored, and some of these are now on the market. The schemes that currently dominate the market are SDRAM, DDR-DRAM, and RDRAM. Table 5.3 provides a performance comparison. CDRAM has also received considerable attention. We examine each of these approaches in this section.

Synchronous DRAM

One of the most widely used forms of DRAM is the synchronous DRAM (SDRAM) [VOGL94]. Unlike the traditional DRAM, which is asynchronous, the SDRAM exchanges data with the processor synchronized to an external clock signal and running at the full speed of the processor/memory bus without imposing wait states.

In a typical DRAM, the processor presents addresses and control levels to the memory, indicating that a set of data at a particular location in memory should be either read from or written into the DRAM. After a delay, the access time, the DRAM either writes or reads the data. During the access-time delay, the DRAM performs various internal functions, such as activating the high capacitance of the row and column lines, sensing the data, and routing the data out through the output buffers. The processor must simply wait through this delay, slowing system performance.

With synchronous access, the DRAM moves data in and out under control of the system clock. The processor or other master issues the instruction and address information, which is latched by the DRAM. The DRAM then responds after a set number of clock cycles. Meanwhile, the master can safely do other tasks while the SDRAM is processing the request.

Figure 5.12 shows the internal logic of IBM's 64-Mb SDRAM [IBM01], which is typical of SDRAM organization, and Table 5.4 defines the various pin assignments. The SDRAM employs a burst mode to eliminate the address setup time and row and column line precharge time after the first access. In burst mode, a series of

[Figure 5.12 Synchronous Dynamic RAM (SDRAM): internal organization of the IBM 64-Mb SDRAM, with four 2-Mb x 8 DRAM cell-array banks (each with row decoder, column decoder, and sense amplifiers), address buffers (A0 to A13), data I/O buffers (DQ0 to DQ7), the mode register, refresh counter, column address counter, and the command decoder and control circuitry driven by CS, RAS, CAS, WE, CKE, CLK, and DQM.]

Table 5.4 SDRAM Pin Assignments

A0 to A13     Address inputs
CLK           Clock input
CKE           Clock enable
CS            Chip select
RAS           Row address strobe
CAS           Column address strobe
WE            Write enable
DQ0 to DQ7    Data input/output
DQM           Data mask

The SDRAM employs a burst mode to eliminate the address setup time and row and column line precharge time after the first access. In burst mode, a series of data bits can be clocked out rapidly after the first bit has been accessed. This mode is useful when all the bits to be accessed are in sequence and in the same row of the array as the initial access. In addition, the SDRAM has a multiple-bank internal architecture that improves opportunities for on-chip parallelism.

The mode register and associated control logic is another key feature differentiating SDRAMs from conventional DRAMs. It provides a mechanism to customize the SDRAM to suit specific system needs. The mode register specifies the burst length, which is the number of separate units of data synchronously fed onto the bus. The register also allows the programmer to adjust the latency between receipt of a read request and the beginning of data transfer.

The SDRAM performs best when it is transferring large blocks of data serially, such as for applications like word processing, spreadsheets, and multimedia.

Figure 5.13 shows an example of SDRAM operation. In this case, the burst length is 4 and the latency is 2. The burst read command is initiated by having CS and CAS low while holding RAS and WE high at the rising edge of the clock. The address inputs determine the starting column address for the burst, and the mode register sets the type of burst (sequential or interleave) and the burst length (1, 2, 4, 8, full page). The delay from the start of the command to when the data from the first cell appears on the outputs is equal to the value of the CAS latency that is set in the mode register.
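The timing relationship just described can be sketched with a few lines of arithmetic; the following is illustrative only (not taken from the SDRAM data sheet) and simply counts clock edges from the READ command using the CAS latency and burst length programmed in the mode register.

```python
# Illustrative sketch: the clock edges on which burst data appear follow by
# simple counting from the READ command, given the programmed CAS latency
# and burst length.

def burst_read_edges(cas_latency, burst_length, read_edge=0):
    """Clock edges (T0, T1, ...) on which each data word of the burst is driven."""
    first = read_edge + cas_latency
    return list(range(first, first + burst_length))

# Figure 5.13's case: READ at T0, CAS latency 2, burst length 4 -> data on T2..T5.
print(burst_read_edges(cas_latency=2, burst_length=4))   # [2, 3, 4, 5]
```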

[Figure 5.13 SDRAM Read Timing (burst length = 4, CAS latency = 2): a READ command on clock edge T0 followed by NOPs; the four data words DOUT A0 through DOUT A3 appear on the DQ lines on successive clock edges beginning two cycles after the command.]

There is now an enhanced version of SDRAM, known as double data rate SDRAM (DDR-SDRAM), that overcomes the once-per-cycle limitation. DDR-SDRAM can send data to the processor twice per clock cycle.

Rambus DRAM

RDRAM, developed by Rambus [FARM92, CRIS97], has been adopted by Intel for its Pentium and Itanium processors. It has become the main competitor to SDRAM. RDRAM chips are vertical packages, with all pins on one side. The chip exchanges data with the processor over 28 wires no more than 12 centimeters long. The bus can address up to 320 RDRAM chips and is rated at 1.6 GBps.

The special RDRAM bus delivers address and control information using an asynchronous block-oriented protocol. After an initial 480 ns access time, this produces the 1.6 GBps data rate. What makes this speed possible is the bus itself, which defines impedances, clocking, and signals very precisely. Rather than being controlled by the explicit RAS, CAS, R/W, and CE signals used in conventional DRAMs, an RDRAM gets a memory request over the high-speed bus. This request contains the desired address, the type of operation, and the number of bytes in the operation.

Figure 5.14 illustrates the RDRAM layout. The configuration consists of a controller and a number of RDRAM modules connected via a common bus. The controller is at one end of the configuration, and the far end of the bus is a parallel termination of the bus lines. The bus includes 18 data lines (16 actual data, two parity) cycling at twice the clock rate; that is, 1 bit is sent at the leading and following edge of each clock signal. This results in a signal rate on each data line of 800 Mbps. There is a separate set of 8 lines (RC) used for address and control signals. There is also a clock signal that starts at the far end from the controller, propagates to the controller end, and then loops back. An RDRAM module sends data to the controller synchronously to the clock to master, and the controller sends data to an RDRAM synchronously with the clock signal in the opposite direction. The remaining bus lines include a reference voltage, ground, and power source.

[Figure 5.14 RDRAM Structure: a controller and RDRAM modules 1 through n on a common bus terminated at the far end (Vterm), with INIT daisy-chain signals, bus data [18:0], RC [7:0] address/control lines, RClk and TClk clock lines, and Vref, Gnd, and Vd supply lines.]

DDR SDRAM

SDRAM is limited by the fact that it can only send data to the processor once per bus clock cycle. A new version of SDRAM, referred to as double-data-rate SDRAM, can send data twice per clock cycle, once on the rising edge of the clock pulse and once on the falling edge.

DDR DRAM was developed by the JEDEC Solid State Technology Association, the Electronic Industries Alliance's semiconductor-engineering-standardization body. Numerous companies make DDR chips, which are widely used in desktop computers and servers.

Figure 5.15 shows the basic timing for a DDR read. The data transfer is synchronized to both the rising and falling edge of the clock. It is also synchronized to a bidirectional data strobe (DQS) signal that is provided by the memory controller during a read and by the DRAM during a write.

[Figure 5.15 DDR SDRAM Read Timing: the clock, RAS, CAS, the row and column addresses, valid data on the DQ lines on both clock edges, and the DQS strobe. RAS = Row address select; CAS = Column address select; DQ = Data (in or out); DQS = DQ select.]

In typical implementations, the DQS is ignored during the read. An explanation of the use of DQS on writes is beyond our scope; see [JACO08] for details.

There have been two generations of improvement to the DDR technology. DDR2 increases the data transfer rate by increasing the operational frequency of the RAM chip and by increasing the prefetch buffer from 2 bits to 4 bits per chip. The prefetch buffer is a memory cache located on the RAM chip. The buffer enables the RAM chip to preposition bits to be placed on the data bus as rapidly as possible. DDR3, introduced in 2007, increases the prefetch buffer size to 8 bits.

Theoretically, a DDR module can transfer data at a clock rate in the range of 200 to 600 MHz; a DDR2 module transfers at a clock rate of 400 to 1066 MHz; and a DDR3 module transfers at a clock rate of 800 to 1600 MHz. In practice, somewhat smaller rates are achieved. Appendix K provides more detail on DDR technology.
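A rough way to relate these rates to module bandwidth is to multiply transfers per second by the width of the module. The sketch below assumes a conventional 64-bit (8-byte) DIMM and treats the quoted figures as effective data transfers per second; both are assumptions made for the example rather than statements from the text. The DDR-400 result (3.2 GB/s) is consistent with the DDR entry in Table 5.3.

```python
# Illustrative arithmetic: peak module bandwidth = transfers per second
# multiplied by the number of bytes moved per transfer (assumed 8 bytes
# for a 64-bit-wide module).

def peak_bandwidth_gb_per_s(megatransfers_per_s, bus_width_bytes=8):
    """Peak transfer rate in GB/s for a given effective transfer rate."""
    return megatransfers_per_s * 1e6 * bus_width_bytes / 1e9

for name, rate in [("DDR-400", 400), ("DDR2-800", 800), ("DDR3-1600", 1600)]:
    print(f"{name}: {peak_bandwidth_gb_per_s(rate):.1f} GB/s")
# DDR-400: 3.2 GB/s   DDR2-800: 6.4 GB/s   DDR3-1600: 12.8 GB/s
```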

Cache DRAM

Cache DRAM (CDRAM), developed by Mitsubishi [HIDA90, ZHAN01], integrates a small SRAM cache (16 Kb) onto a generic DRAM chip.

The SRAM on the CDRAM can be used in two ways. First, it can be used as a true cache, consisting of a number of 64-bit lines. The cache mode of the CDRAM is effective for ordinary random access to memory. The SRAM on the CDRAM can also be used as a buffer to support the serial access of a block of data. For example, to refresh a bit-mapped screen, the CDRAM can prefetch the data from the DRAM into the SRAM buffer. Subsequent accesses to the chip result in accesses solely to the SRAM.

5.4 RECOMMENDED READING

[PRIN97] provides a comprehensive treatment of semiconductor memory technologies, including SRAM, DRAM, and flash memories. [SHAR97] covers the same material, with more emphasis on testing and reliability issues. [SHAR03] and [PRIN02] focus on advanced DRAM and SRAM architectures. For an in-depth look at DRAM, see [JACO08] and [KEET01]. [CUPP01] provides an interesting performance comparison of various DRAM schemes. [BEZ03] is a comprehensive introduction to flash memory technology.

A good explanation of error-correcting codes is contained in [MCEL85]. For a deeper study, worthwhile book-length treatments are [ADAM91] and [BLAH83]. A readable theoretical and mathematical treatment of error-correcting codes is [ASH90]. [SHAR97] contains a good survey of codes used in contemporary main memories.

ADAM91 Adamek, J. Foundations of Coding. New York: Wiley, 1991.
ASH90 Ash, R. Information Theory. New York: Dover, 1990.
BEZ03 Bez, R., et al. "Introduction to Flash Memory." Proceedings of the IEEE, April 2003.
BLAH83 Blahut, R. Theory and Practice of Error Control Codes. Reading, MA: Addison-Wesley, 1983.

CUPP01 Cuppu, V., et al. "High Performance DRAMS in Workstation Environments." IEEE Transactions on Computers, November 2001.
JACO08 Jacob, B.; Ng, S.; and Wang, D. Memory Systems: Cache, DRAM, Disk. Boston: Morgan Kaufmann, 2008.
KEET01 Keeth, B., and Baker, R. DRAM Circuit Design: A Tutorial. Piscataway, NJ: IEEE Press, 2001.
MCEL85 McEliece, R. "The Reliability of Computer Memories." Scientific American, January 1985.
PRIN97 Prince, B. Semiconductor Memories. New York: Wiley, 1997.
PRIN02 Prince, B. Emerging Memories: Technologies and Trends. Norwell, MA: Kluwer, 2002.
SHAR97 Sharma, A. Semiconductor Memories: Technology, Testing, and Reliability. New York: IEEE Press, 1997.
SHAR03 Sharma, A. Advanced Semiconductor Memories: Architectures, Designs, and Applications. New York: IEEE Press, 2003.

5.5 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

cache DRAM (CDRAM)
dynamic RAM (DRAM)
electrically erasable programmable ROM (EEPROM)
erasable programmable ROM (EPROM)
error correcting code (ECC)
error correction
flash memory
Hamming code
hard failure
nonvolatile memory
programmable ROM (PROM)
RamBus DRAM (RDRAM)
read-mostly memory
read-only memory (ROM)
semiconductor memory
single-error-correcting (SEC) code
single-error-correcting, double-error-detecting (SEC-DED) code
soft error
static RAM (SRAM)
synchronous DRAM (SDRAM)
syndrome
volatile memory

Review Questions
5.1 What are the key properties of semiconductor memory?
5.2 What are two interpretations of the term random-access memory?
5.3 What is the difference between DRAM and SRAM in terms of application?
5.4 What is the difference between DRAM and SRAM in terms of characteristics such as speed, size, and cost?
5.5 Explain why one type of RAM is considered to be analog and the other digital.
5.6 What are some applications for ROM?
5.7 What are the differences among EPROM, EEPROM, and flash memory?
5.8 Explain the function of each pin in Figure 5.4b.

5.9 What is a parity bit?
5.10 How is the syndrome for the Hamming code interpreted?
5.11 How does SDRAM differ from ordinary DRAM?

Problems
5.1 Suggest reasons why RAMs traditionally have been organized as only 1 bit per chip whereas ROMs are usually organized with multiple bits per chip.
5.2 Consider a dynamic RAM that must be given a refresh cycle 64 times per ms. Each refresh operation requires 150 ns; a memory cycle requires 250 ns. What percentage of the memory's total operating time must be given to refreshes?
5.3 Figure 5.16 shows a simplified timing diagram for a DRAM read operation over a bus. The access time is considered to last from t1 to t2. Then there is a recharge time, lasting from t2 to t3, during which the DRAM chips will have to recharge before the processor can access them again.
   a. Assume that the access time is 60 ns and the recharge time is 40 ns. What is the memory cycle time? What is the maximum data rate this DRAM can sustain, assuming a 1-bit output?
   b. Constructing a 32-bit wide memory system using these chips yields what data transfer rate?
5.4 Figure 5.6 indicates how to construct a module of chips that can store 1 MByte based on a group of four 256-Kbyte chips. Let's say this module of chips is packaged as a single 1-Mbyte chip, where the word size is 1 byte. Give a high-level chip diagram of how to construct an 8-Mbyte computer memory using eight 1-Mbyte chips. Be sure to show the address lines in your diagram and what the address lines are used for.
5.5 On a typical Intel 8086-based system, connected via system bus to DRAM memory, for a read operation, RAS is activated by the trailing edge of the Address Enable signal (Figure 3.19). However, due to propagation and other delays, RAS does not go active until 50 ns after Address Enable returns to a low. Assume the latter occurs in the middle of the second half of state T1 (somewhat earlier than in Figure 3.19). Data are read by the processor at the end of T3. For timely presentation to the processor, however, data must be provided 60 ns earlier by memory.

[Figure 5.16 Simplified DRAM Read Timing: the address lines carry the row address and then the column address, RAS, CAS, and R/W are strobed, and valid data appear on the data lines; the access interval runs from t1 to t2 and the recharge interval from t2 to t3.]

This interval accounts for propagation delays along the data paths (from memory to processor) and processor data hold time requirements. Assume a clocking rate of 10 MHz.
   a. How fast (access time) should the DRAMs be if no wait states are to be inserted?
   b. How many wait states do we have to insert per memory read operation if the access time of the DRAMs is 150 ns?
5.6 The memory of a particular microcomputer is built from 64K * 1 DRAMs. According to the data sheet, the cell array of the DRAM is organized into 256 rows. Each row must be refreshed at least once every 4 ms. Suppose we refresh the memory on a strictly periodic basis.
   a. What is the time period between successive refresh requests?
   b. How long a refresh address counter do we need?
5.7 Figure 5.17 shows one of the early SRAMs, the 16 * 4 Signetics 7489 chip, which stores 16 4-bit words.

[Figure 5.17(a): pin diagram of the 16 x 4 Signetics 7489 SRAM. Pins: 1 A3, 2 CS, 3 R/W, 4 D3, 5 O3, 6 D2, 7 O2, 8 GND, 9 O1, 10 D1, 11 O0, 12 D0, 13 A0, 14 A1, 15 A2, 16 Vcc.]

(b) Truth table

Operating Mode            CS   R/W   Dn   On
Write                     L    L     L    L
                          L    L     H    H
Read                      L    H     X    Data
Inhibit writing           H    L     L    H
                          H    L     H    L
Store - disable outputs   H    H     X    H

H = high voltage level, L = low voltage level, X = don't care

[Figure 5.17(c): pulse train showing the waveforms applied to A0 through A3, R/W, and D0 through D3, together with a sequence of CS pulses labeled a through n.]

Figure 5.17 The Signetics 7489 SRAM

   a. List the mode of operation of the chip for each CS input pulse shown in Figure 5.17c.
   b. List the memory contents of word locations 0 through 6 after pulse n.
   c. What is the state of the output data leads for the input pulses h through m?
5.8 Design a 16-bit memory of total capacity 8192 bits using SRAM chips of size 64 * 1 bit. Give the array configuration of the chips on the memory board showing all required input and output signals for assigning this memory to the lowest address space. The design should allow for both byte and 16-bit word accesses.
5.9 A common unit of measure for failure rates of electronic components is the Failure unIT (FIT), expressed as a rate of failures per billion device hours. Another well known but less used measure is mean time between failures (MTBF), which is the average time of operation of a particular component until it fails. Consider a 1 MB memory of a 16-bit microprocessor with 256K * 1 DRAMs. Calculate its MTBF assuming 2000 FITS for each DRAM.
5.10 For the Hamming code shown in Figure 5.10, show what happens when a check bit rather than a data bit is in error.
5.11 Suppose an 8-bit data word stored in memory is 11000010. Using the Hamming algorithm, determine what check bits would be stored in memory with the data word. Show how you got your answer.
5.12 For the 8-bit word 00111001, the check bits stored with it would be 0111. Suppose when the word is read from memory, the check bits are calculated to be 1101. What is the data word that was read from memory?
5.13 How many check bits are needed if the Hamming error correction code is used to detect single bit errors in a 1024-bit data word?
5.14 Develop an SEC code for a 16-bit data word. Generate the code for the data word 0101000000111001. Show that the code will correctly identify an error in data bit 5.

CHAPTER 6 EXTERNAL MEMORY

6.1 Magnetic Disk
    Magnetic Read and Write Mechanisms
    Data Organization and Formatting
    Physical Characteristics
    Disk Performance Parameters
6.2 RAID
    RAID Level 0
    RAID Level 1
    RAID Level 2
    RAID Level 3
    RAID Level 4
    RAID Level 5
    RAID Level 6
6.3 Solid State Drives
    Flash Memory
    SSD Compared to HDD
    SSD Organization
    Practical Issues
6.4 Optical Memory
    Compact Disk
    Digital Versatile Disk
    High-Definition Optical Disks
6.5 Magnetic Tape
6.6 Recommended Reading
6.7 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the key properties of magnetic disks.
• Understand the performance issues involved in magnetic disk access.
• Explain the concept of RAID and describe the various levels.
• Compare and contrast hard disk drives and solid state drives.
• Describe in general terms the operation of flash memory.
• Understand the differences among the different optical disk storage media.
• Present an overview of magnetic tape storage technology.

This chapter examines a range of external memory devices and systems. We begin with the most important device, the magnetic disk. Magnetic disks are the foundation of external memory on virtually all computer systems. The next section examines the use of disk arrays to achieve greater performance, looking specifically at the family of systems known as RAID (Redundant Array of Independent Disks). An increasingly important component of many computer systems is the solid state disk, which is discussed next. Then, external optical memory is examined. Finally, magnetic tape is described.

6.1 MAGNETIC DISK

A disk is a circular platter constructed of nonmagnetic material, called the substrate, coated with a magnetizable material. Traditionally, the substrate has been an aluminum or aluminum alloy material. More recently, glass substrates have been introduced. The glass substrate has a number of benefits, including the following:

• Improvement in the uniformity of the magnetic film surface to increase disk reliability
• A significant reduction in overall surface defects to help reduce read-write errors
• Ability to support lower fly heights (described subsequently)
• Better stiffness to reduce disk dynamics
• Greater ability to withstand shock and damage

Magnetic Read and Write Mechanisms

Data are recorded on and later retrieved from the disk via a conducting coil named the head; in many systems, there are two heads, a read head and a write head. During a read or write operation, the head is stationary while the platter rotates beneath it.

The write mechanism exploits the fact that electricity flowing through a coil produces a magnetic field. Electric pulses are sent to the write head, and the resulting magnetic patterns are recorded on the surface below, with different patterns for positive and negative currents.

[Figure 6.1 Inductive Write/Magnetoresistive Read Head: a shielded magnetoresistive (MR) sensor carries the read current; the inductive write element carries the write current and defines the track width; the magnetization is recorded in the recording medium below.]

The write head itself is made of easily magnetizable material and is in the shape of a rectangular doughnut with a gap along one side and a few turns of conducting wire along the opposite side (Figure 6.1). An electric current in the wire induces a magnetic field across the gap, which in turn magnetizes a small area of the recording medium. Reversing the direction of the current reverses the direction of the magnetization on the recording medium.

The traditional read mechanism exploits the fact that a magnetic field moving relative to a coil produces an electrical current in the coil. When the surface of the disk passes under the head, it generates a current of the same polarity as the one already recorded. The structure of the head for reading is in this case essentially the same as for writing and therefore the same head can be used for both. Such single heads are used in floppy disk systems and in older rigid disk systems.

Contemporary rigid disk systems use a different read mechanism, requiring a separate read head, positioned for convenience close to the write head. The read head consists of a partially shielded magnetoresistive (MR) sensor. The MR material has an electrical resistance that depends on the direction of the magnetization of the medium moving under it. By passing a current through the MR sensor, resistance changes are detected as voltage signals. The MR design allows higher-frequency operation, which equates to greater storage densities and operating speeds.

Data Organization and Formatting

The head is a relatively small device capable of reading from or writing to a portion of the platter rotating beneath it. This gives rise to the organization of data on the platter in a concentric set of rings, called tracks. Each track is the same width as the head. There are thousands of tracks per surface.

[Figure 6.2 Disk Data Layout: concentric tracks divided into sectors, with intertrack gaps separating adjacent tracks and intersector gaps separating adjacent sectors.]

Figure 6.2 depicts this data layout. Adjacent tracks are separated by gaps. This prevents, or at least minimizes, errors due to misalignment of the head or simply interference of magnetic fields.

Data are transferred to and from the disk in sectors (Figure 6.2). There are typically hundreds of sectors per track, and these may be of either fixed or variable length. In most contemporary systems, fixed-length sectors are used, with 512 bytes being the nearly universal sector size. To avoid imposing unreasonable precision requirements on the system, adjacent sectors are separated by intratrack (intersector) gaps.

A bit near the center of a rotating disk travels past a fixed point (such as a read-write head) slower than a bit on the outside. Therefore, some way must be found to compensate for the variation in speed so that the head can read all the bits at the same rate. This can be done by increasing the spacing between bits of information recorded in segments of the disk. The information can then be scanned at the same rate by rotating the disk at a fixed speed, known as the constant angular velocity (CAV). Figure 6.3a shows the layout of a disk using CAV. The disk is divided into a number of pie-shaped sectors and into a series of concentric tracks. The advantage of using CAV is that individual blocks of data can be directly addressed by track and sector. To move the head from its current location to a specific address, it only takes a short movement of the head to a specific track and a short wait for the proper sector to spin under the head. The disadvantage of CAV is that the amount of data that can be stored on the long outer tracks is only the same as what can be stored on the short inner tracks.

[Figure 6.3 Comparison of Disk Layout Methods: (a) constant angular velocity; (b) multiple zoned recording.]

Because the density, in bits per linear inch, increases in moving from the outermost track to the innermost track, disk storage capacity in a straightforward CAV system is limited by the maximum recording density that can be achieved on the innermost track. To increase density, modern hard disk systems use a technique known as multiple zone recording, in which the surface is divided into a number of concentric zones (16 is typical). Within a zone, the number of bits per track is constant. Zones farther from the center contain more bits (more sectors) than zones closer to the center. This allows for greater overall storage capacity at the expense of somewhat more complex circuitry. As the disk head moves from one zone to another, the length (along the track) of individual bits changes, causing a change in the timing for reads and writes. Figure 6.3b suggests the nature of multiple zone recording; in this illustration, each zone is only a single track wide.

Some means is needed to locate sector positions within a track. Clearly, there must be some starting point on the track and a way of identifying the start and end of each sector. These requirements are handled by means of control data recorded on the disk. Thus, the disk is formatted with some extra data used only by the disk drive and not accessible to the user.

An example of disk formatting is shown in Figure 6.4. In this case, each track contains 30 fixed-length sectors of 600 bytes each.

[Figure 6.4 Winchester Disk Format (Seagate ST506): an index mark identifies the start of the track; each of the 30 physical sectors consists of a gap (17 bytes), an ID field (7 bytes: synch byte, track #, head #, sector #, CRC), a second gap (41 bytes), a data field (515 bytes: synch byte, 512 data bytes, CRC), and a third gap (20 bytes), for 600 bytes per sector.]

Each sector holds 512 bytes of data plus control information useful to the disk controller. The ID field is a unique identifier or address used to locate a particular sector. The SYNCH byte is a special bit pattern that delimits the beginning of the field. The track number identifies a track on a surface. The head number identifies a head, because this disk has multiple surfaces (explained presently). The ID and data fields each contain an error-detecting code.

Physical Characteristics

Table 6.1 lists the major characteristics that differentiate among the various types of magnetic disks. First, the head may either be fixed or movable with respect to the radial direction of the platter. In a fixed-head disk, there is one read-write head per track. All of the heads are mounted on a rigid arm that extends across all tracks; such systems are rare today. In a movable-head disk, there is only one read-write head. Again, the head is mounted on an arm. Because the head must be able to be positioned above any track, the arm can be extended or retracted for this purpose.

The disk itself is mounted in a disk drive, which consists of the arm, a spindle that rotates the disk, and the electronics needed for input and output of binary data.

A nonremovable disk is permanently mounted in the disk drive; the hard disk in a personal computer is a nonremovable disk. A removable disk can be removed and replaced with another disk. The advantage of the latter type is that unlimited amounts of data are available with a limited number of disk systems. Furthermore, such a disk may be moved from one computer system to another. Floppy disks and ZIP cartridge disks are examples of removable disks.

For most disks, the magnetizable coating is applied to both sides of the platter, which is then referred to as double sided. Some less expensive disk systems use single-sided disks.

Some disk drives accommodate multiple platters stacked vertically a fraction of an inch apart. Multiple arms are provided (Figure 6.5). Multiple-platter disks employ a movable head, with one read-write head per platter surface. All of the heads are mechanically fixed so that all are at the same distance from the center of the disk and move together.

Table 6.1 Physical Characteristics of Disk Systems

Head Motion
  Fixed head (one per track)
  Movable head (one per surface)

Disk Portability
  Nonremovable disk
  Removable disk

Sides
  Single sided
  Double sided

Platters
  Single platter
  Multiple platter

Head Mechanism
  Contact (floppy)
  Fixed gap
  Aerodynamic gap (Winchester)

[Figure 6.5 Components of a Disk Drive: multiple platters on a common spindle, with one read-write head per surface (surfaces 0 through 9) mounted on a boom that moves all of the heads together in the direction of arm motion.]

Thus, at any time, all of the heads are positioned over tracks that are of equal distance from the center of the disk. The set of all the tracks in the same relative position on the platter is referred to as a cylinder. For example, all of the shaded tracks in Figure 6.6 are part of one cylinder.

Finally, the head mechanism provides a classification of disks into three types. Traditionally, the read-write head has been positioned a fixed distance above the platter, allowing an air gap. At the other extreme is a head mechanism that actually comes into physical contact with the medium during a read or write operation. This mechanism is used with the floppy disk, which is a small, flexible platter and the least expensive type of disk.

[Figure 6.6 Tracks and Cylinders]

To understand the third type of disk, we need to comment on the relationship between data density and the size of the air gap. The head must generate or sense an electromagnetic field of sufficient magnitude to write and read properly. The narrower the head is, the closer it must be to the platter surface to function. A narrower head means narrower tracks and therefore greater data density, which is desirable. However, the closer the head is to the disk, the greater the risk of error from impurities or imperfections. To push the technology further, the Winchester disk was developed. Winchester heads are used in sealed drive assemblies that are almost free of contaminants. They are designed to operate closer to the disk's surface than conventional rigid disk heads, thus allowing greater data density. The head is actually an aerodynamic foil that rests lightly on the platter's surface when the disk is motionless. The air pressure generated by a spinning disk is enough to make the foil rise above the surface. The resulting noncontact system can be engineered to use narrower heads that operate closer to the platter's surface than conventional rigid disk heads.

Table 6.2 gives disk parameters for typical contemporary high-performance disks.

Disk Performance Parameters

The actual details of disk I/O operation depend on the computer system, the operating system, and the nature of the I/O channel and disk controller hardware. A general timing diagram of disk I/O transfer is shown in Figure 6.7.

When the disk drive is operating, the disk is rotating at constant speed. To read or write, the head must be positioned at the desired track and at the beginning of the desired sector on that track.

Table 6.2 Typical Hard Disk Drive Parameters

Characteristics       Seagate              Barracuda XT   Cheetah NS           Momentus
                      Constellation ES.2
Application           Enterprise           Desktop        Network attached     Laptop
                                                          storage, application
                                                          servers
Capacity              3 TB                 3 TB           400 GB               640 GB
Average seek time     8.5 ms read          N/A            3.9 ms read          13 ms
                      9.5 ms write                        4.2 ms write
Spindle speed         7200 rpm             7200 rpm       10,075 rpm           5400 rpm
Average latency       4.16 ms              4.16 ms        2.98 ms              5.6 ms
Maximum sustained     155 MB/s             149 MB/s       97 MB/s              300 MB/s
transfer rate
Bytes per sector      512                  512            512                  4096
Tracks per cylinder   8                    10             8                    4
(number of platter
surfaces)
Cache                 64 MB                64 MB          16 MB                8 MB

[Figure 6.7 Timing of a Disk I/O Transfer: wait for device, wait for channel, seek, rotational delay, data transfer, with the interval during which the device is busy indicated.]

Track selection involves moving the head in a movable-head system or electronically selecting one head on a fixed-head system. On a movable-head system, the time it takes to position the head at the track is known as seek time. In either case, once the track is selected, the disk controller waits until the appropriate sector rotates to line up with the head. The time it takes for the beginning of the sector to reach the head is known as rotational delay, or rotational latency. The sum of the seek time, if any, and the rotational delay equals the access time, which is the time it takes to get into position to read or write. Once the head is in position, the read or write operation is then performed as the sector moves under the head; this is the data transfer portion of the operation; the time required for the transfer is the transfer time.

In addition to the access time and transfer time, there are several queuing delays normally associated with a disk I/O operation. When a process issues an I/O request, it must first wait in a queue for the device to be available. At that time, the device is assigned to the process. If the device shares a single I/O channel or a set of I/O channels with other disk drives, then there may be an additional wait for the channel to be available. At that point, the seek is performed to begin disk access.

In some high-end systems for servers, a technique known as rotational positional sensing (RPS) is used. This works as follows: When the seek command has been issued, the channel is released to handle other I/O operations. When the seek is completed, the device determines when the data will rotate under the head. As that sector approaches the head, the device tries to reestablish the communication path back to the host. If either the control unit or the channel is busy with another I/O, then the reconnection attempt fails and the device must rotate one whole revolution before it can attempt to reconnect, which is called an RPS miss. This is an extra delay element that must be added to the timeline of Figure 6.7.

SEEK TIME Seek time is the time required to move the disk arm to the required track. It turns out that this is a difficult quantity to pin down. The seek time consists of two key components: the initial startup time, and the time taken to traverse the tracks that have to be crossed once the access arm is up to speed. Unfortunately, the traversal time is not a linear function of the number of tracks, but includes a settling time (time after positioning the head over the target track until track identification is confirmed).

Much improvement comes from smaller and lighter disk components. Some years ago, a typical disk was 14 inches (36 cm) in diameter, whereas the most common size today is 3.5 inches (8.9 cm), reducing the distance that the arm has to travel. A typical average seek time on contemporary hard disks is under 10 ms.

ROTATIONAL DELAY Disks, other than floppy disks, rotate at speeds ranging from 3600 rpm (for handheld devices such as digital cameras) up to, as of this writing, 20,000 rpm; at this latter speed, there is one revolution per 3 ms. Thus, on the average, the rotational delay will be 1.5 ms.

TRANSFER TIME The transfer time to or from the disk depends on the rotation speed of the disk in the following fashion:

T = b / (rN)

where
T = transfer time
b = number of bytes to be transferred
N = number of bytes on a track
r = rotation speed, in revolutions per second

Thus the total average access time can be expressed as

Ta = Ts + 1/(2r) + b/(rN)

where Ts is the average seek time. Note that on a zoned drive, the number of bytes per track is variable, complicating the calculation.1

A TIMING COMPARISON With the foregoing parameters defined, let us look at two different I/O operations that illustrate the danger of relying on average values. Consider a disk with an advertised average seek time of 4 ms, rotation speed of 15,000 rpm, and 512-byte sectors with 500 sectors per track. Suppose that we wish to read a file consisting of 2500 sectors for a total of 1.28 Mbytes. We would like to estimate the total time for the transfer.

First, let us assume that the file is stored as compactly as possible on the disk. That is, the file occupies all of the sectors on 5 adjacent tracks (5 tracks * 500 sectors/track = 2500 sectors). This is known as sequential organization. Now, the time to read the first track is as follows:

Average seek                4 ms
Average rotational delay    2 ms
Read 500 sectors            4 ms
                           10 ms

Suppose that the remaining tracks can now be read with essentially no seek time. That is, the I/O operation can keep up with the flow from the disk. Then, at most, we need to deal with rotational delay for each succeeding track. Thus each successive track is read in 2 + 4 = 6 ms. To read the entire file,

Total time = 10 + (4 * 6) = 34 ms = 0.034 seconds

1Compare the two preceding equations to Equation (4.1).

Now let us calculate the time required to read the same data using random access rather than sequential access; that is, accesses to the sectors are distributed randomly over the disk. For each sector, we have

Average seek          4 ms
Rotational delay      2 ms
Read 1 sector         0.008 ms
                      6.008 ms

Total time = 2500 * 6.008 = 15,020 ms = 15.02 seconds
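The two estimates above can be reproduced with a few lines of arithmetic; the following sketch simply encodes the parameters assumed in the example (4 ms average seek, 15,000 rpm, 512-byte sectors, 500 sectors per track, a 2500-sector file).

```python
# Sketch of the preceding timing comparison for sequential versus random access.
avg_seek_ms = 4.0
rpm = 15_000
sectors_per_track = 500
file_sectors = 2500

rotation_ms = 60_000 / rpm                          # 4 ms per revolution
avg_rotational_delay_ms = rotation_ms / 2           # 2 ms
track_read_ms = rotation_ms                         # one full track = one revolution
sector_read_ms = rotation_ms / sectors_per_track    # 0.008 ms

# Sequential: seek + rotational delay + track read for the first track, then
# only rotational delay + track read for each of the remaining tracks.
tracks = file_sectors // sectors_per_track
sequential_ms = (avg_seek_ms + avg_rotational_delay_ms + track_read_ms) + \
                (tracks - 1) * (avg_rotational_delay_ms + track_read_ms)

# Random: every sector pays seek + rotational delay + a one-sector read.
random_ms = file_sectors * (avg_seek_ms + avg_rotational_delay_ms + sector_read_ms)

print(sequential_ms)    # 34.0 ms
print(random_ms)        # 15020.0 ms, i.e., 15.02 seconds
```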

It is clear that the order in which sectors are read from the disk has a tremendous effect on I/O performance. In the case of file access in which multiple sectors are read or written, we have some control over the way in which sectors of data are deployed. However, even in the case of a file access, in a multiprogramming environment, there will be I/O requests competing for the same disk. Thus, it is worthwhile to examine ways in which the performance of disk I/O can be improved over that achieved with purely random access to the disk. This leads to a consideration of disk scheduling algorithms, which is the province of the operating system and beyond the scope of this book (see [STAL12] for a discussion).

RAID Simulator

6.2 RAID

As discussed earlier, the rate of improvement in secondary storage performance has been considerably less than the rate for processors and main memory. This mismatch has made the disk storage system perhaps the main focus of concern in improving overall computer system performance.

As in other areas of computer performance, disk storage designers recognize that if one component can only be pushed so far, additional gains in performance are to be had by using multiple parallel components. In the case of disk storage, this leads to the development of arrays of disks that operate independently and in parallel. With multiple disks, separate I/O requests can be handled in parallel, as long as the data required reside on separate disks. Further, a single I/O request can be executed in parallel if the block of data to be accessed is distributed across multiple disks.

With the use of multiple disks, there is a wide variety of ways in which the data can be organized and in which redundancy can be added to improve reliability. This could make it difficult to develop database schemes that are usable on a number of platforms and operating systems. Fortunately, industry has agreed on a standardized scheme for multiple-disk database design, known as RAID (Redundant Array of Independent Disks).

The RAID scheme consists of seven levels,2 zero through six. These levels do not imply a hierarchical relationship but designate different design architectures that share three common characteristics:

1. RAID is a set of physical disk drives viewed by the operating system as a single logical drive.
2. Data are distributed across the physical drives of an array in a scheme known as striping, described subsequently.
3. Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure.

The details of the second and third characteristics differ for the different RAID levels. RAID 0 and RAID 1 do not support the third characteristic.

The term RAID was originally coined in a paper by a group of researchers at the University of California at Berkeley [PATT88].3 The paper outlined various RAID configurations and applications and introduced the definitions of the RAID levels that are still used. The RAID strategy employs multiple disk drives and distributes data in such a way as to enable simultaneous access to data from multiple drives, thereby improving I/O performance and allowing easier incremental increases in capacity.

The unique contribution of the RAID proposal is to address effectively the need for redundancy. Although allowing multiple heads and actuators to operate simultaneously achieves higher I/O and transfer rates, the use of multiple devices increases the probability of failure. To compensate for this decreased reliability, RAID makes use of stored parity information that enables the recovery of data lost due to a disk failure.

We now examine each of the RAID levels. Table 6.3 provides a rough guide to the seven levels. In the table, I/O performance is shown both in terms of data transfer capacity, or ability to move data, and I/O request rate, or ability to satisfy I/O requests, since these RAID levels inherently perform differently relative to these two metrics. Figure 6.8 illustrates the use of the seven RAID schemes to support a data capacity requiring four disks with no redundancy. The figures highlight the layout of user data and redundant data and indicate the relative storage requirements of the various levels. We refer to these figures throughout the following discussion.

2 Additional levels have been defined by some researchers and some companies, but the seven levels described in this section are the ones universally agreed on.
3 In that paper, the acronym RAID stood for Redundant Array of Inexpensive Disks. The term inexpensive was used to contrast the small relatively inexpensive disks in the RAID array to the alternative, a single large expensive disk (SLED). The SLED is essentially a thing of the past, with similar disk technology being used for both RAID and non-RAID configurations. Accordingly, the industry has adopted the term independent to emphasize that the RAID array creates significant performance and reliability gains.

Table 6.3 RAID Levels (N = number of data disks; m proportional to log N)

Level 0 (Striping: Nonredundant)
  Disks required: N
  Data availability: Lower than single disk
  Large I/O data transfer capacity: Very high
  Small I/O request rate: Very high for both read and write

Level 1 (Mirroring: Mirrored)
  Disks required: 2N
  Data availability: Higher than RAID 2, 3, 4, or 5; lower than RAID 6
  Large I/O data transfer capacity: Higher than single disk for read; similar to single disk for write
  Small I/O request rate: Up to twice that of a single disk for read; similar to single disk for write

Level 2 (Parallel access: Redundant via Hamming code)
  Disks required: N + m
  Data availability: Much higher than single disk; comparable to RAID 3, 4, or 5
  Large I/O data transfer capacity: Highest of all listed alternatives
  Small I/O request rate: Approximately twice that of a single disk

Level 3 (Parallel access: Bit-interleaved parity)
  Disks required: N + 1
  Data availability: Much higher than single disk; comparable to RAID 2, 4, or 5
  Large I/O data transfer capacity: Highest of all listed alternatives
  Small I/O request rate: Approximately twice that of a single disk

Level 4 (Independent access: Block-interleaved parity)
  Disks required: N + 1
  Data availability: Much higher than single disk; comparable to RAID 2, 3, or 5
  Large I/O data transfer capacity: Similar to RAID 0 for read; significantly lower than single disk for write
  Small I/O request rate: Similar to RAID 0 for read; significantly lower than single disk for write

Level 5 (Independent access: Block-interleaved distributed parity)
  Disks required: N + 1
  Data availability: Much higher than single disk; comparable to RAID 2, 3, or 4
  Large I/O data transfer capacity: Similar to RAID 0 for read; lower than single disk for write
  Small I/O request rate: Similar to RAID 0 for read; generally lower than single disk for write

Level 6 (Independent access: Block-interleaved dual distributed parity)
  Disks required: N + 2
  Data availability: Highest of all listed alternatives
  Large I/O data transfer capacity: Similar to RAID 0 for read; lower than RAID 5 for write
  Small I/O request rate: Similar to RAID 0 for read; significantly lower than RAID 5 for write

[Figure 6.8 RAID Levels: (a) RAID 0 (Nonredundant), with strips 0 through 15 distributed across four disks; (b) RAID 1 (Mirrored), with the same strips duplicated on a second set of four disks; (c) RAID 2 (Redundancy through Hamming code), with data bits b0 through b3 on four disks plus Hamming-code bits f0(b), f1(b), and f2(b) on three additional disks.]

RAID Level 0

RAID level 0 is not a true member of the RAID family because it does not include redundancy to improve performance. However, there are a few applications, such as some on supercomputers, in which performance and capacity are primary concerns and low cost is more important than improved reliability.

For RAID 0, the user and system data are distributed across all of the disks in the array. This has a notable advantage over the use of a single large disk: If two different I/O requests are pending for two different blocks of data, then there is a good chance that the requested blocks are on different disks. Thus, the two requests can be issued in parallel, reducing the I/O queuing time.

But RAID 0, as with all of the RAID levels, goes further than simply distributing the data across a disk array: The data are striped across the available disks. This is best understood by considering Figure 6.9. All of the user and system data are viewed as being stored on a logical disk. The logical disk is divided into strips; these strips may be physical blocks, sectors, or some other unit. The strips are mapped round robin to consecutive physical disks in the RAID array. A set of logically consecutive strips that maps exactly one strip to each array member is referred to as a stripe. In an n-disk array, the first n logical strips are physically stored as the first strip on each of the n disks, forming the first stripe.

[Figure 6.8 RAID Levels (continued): (d) RAID 3 (Bit-interleaved parity), with data bits b0 through b3 on four disks and a single parity disk P(b); (e) RAID 4 (Block-level parity), with blocks striped across four disks and parity blocks P(0-3), P(4-7), and so on stored on a dedicated fifth disk; (f) RAID 5 (Block-level distributed parity), with the parity blocks rotated across all five disks; (g) RAID 6 (Dual redundancy), with two check blocks, P and Q, per stripe distributed across six disks.]

The second n strips are distributed as the second strips on each disk, and so on. The advantage of this layout is that if a single I/O request consists of multiple logically contiguous strips, then up to n strips for that request can be handled in parallel, greatly reducing the I/O transfer time.

Figure 6.9 indicates the use of array management software to map between logical and physical disk space. This software may execute either in the disk subsystem or in a host computer.

[Figure 6.9 Data Mapping for a RAID Level 0 Array: array management software maps logical strips 0 through 15 of a logical disk round robin onto four physical disks, so that strips 0-3 occupy the first strip of disks 0-3, strips 4-7 the second strip, and so on.]
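The round-robin mapping of Figure 6.9 can be captured in a one-line function. The sketch below is illustrative only; the strip size and unit are whatever the array management software chooses, and only the strip numbering is modeled.

```python
# Minimal sketch of RAID 0 round-robin strip mapping.

def raid0_map(logical_strip, n_disks):
    """Map a logical strip number to (physical disk, strip index on that disk)."""
    return logical_strip % n_disks, logical_strip // n_disks

# With four disks, logical strips 0..3 form stripe 0, strips 4..7 form stripe 1, ...
for i in range(8):
    disk, strip = raid0_map(i, n_disks=4)
    print(f"logical strip {i} -> disk {disk}, strip {strip}")
```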

RAID 0 FOR HIGH DATA TRANSFER CAPACITY The performance of any of the RAID levels depends critically on the request patterns of the host system and on the layout of the data. These issues can be most clearly addressed in RAID 0, where the impact of redundancy does not interfere with the analysis. First, let us consider the use of RAID 0 to achieve a high data transfer rate. For applications to experience a high transfer rate, two requirements must be met. First, a high transfer capacity must exist along the entire path between host memory and the individual disk drives. This includes internal controller buses, host system I/O buses, I/O adapters, and host memory buses. The second requirement is that the application must make I/O requests that drive the disk array efficiently. This requirement is met if the typical request is for large amounts of logically contiguous data, compared to the size of a strip. In this case, a single I/O request involves the parallel transfer of data from multiple disks, increasing the effective transfer rate compared to a single-disk transfer.

RAID 0 FOR HIGH I/O REQUEST RATE In a transaction-oriented environment, the user is typically more concerned with response time than with transfer rate. For an individual I/O request for a small amount of data, the I/O time is dominated by the motion of the disk heads (seek time) and the movement of the disk (rotational latency). In a transaction environment, there may be hundreds of I/O requests per second. A disk array can provide high I/O execution rates by balancing the I/O load across multiple disks. Effective load balancing is achieved only if there are typically multiple I/O requests outstanding. This, in turn, implies that there are multiple independent applications or a single transaction-oriented application that is capable of multiple asynchronous I/O requests. The performance will also be influenced by the strip size. If the strip size is relatively large, so that a single I/O request only involves a single disk access, then multiple waiting I/O requests can be handled in parallel, reducing the queuing time for each request.

RAID Level 1

RAID 1 differs from RAID levels 2 through 6 in the way in which redundancy is achieved. In these other RAID schemes, some form of parity calculation is used to introduce redundancy, whereas in RAID 1, redundancy is achieved by the simple expedient of duplicating all the data. As Figure 6.8b shows, data striping is used, as in RAID 0. But in this case, each logical strip is mapped to two separate physical disks so that every disk in the array has a mirror disk that contains the same data. RAID 1 can also be implemented without data striping, though this is less common.

There are a number of positive aspects to the RAID 1 organization:

1. A read request can be serviced by either of the two disks that contains the requested data, whichever one involves the minimum seek time plus rotational latency.
2. A write request requires that both corresponding strips be updated, but this can be done in parallel. Thus, the write performance is dictated by the slower of the two writes (i.e., the one that involves the larger seek time plus rotational latency). However, there is no “write penalty” with RAID 1. RAID levels 2 through 6 involve the use of parity bits. Therefore, when a single strip is updated, the array management software must first compute and update the parity bits as well as updating the actual strip in question.
3. Recovery from a failure is simple. When a drive fails, the data may still be accessed from the second drive.

The principal disadvantage of RAID 1 is the cost; it requires twice the disk space of the logical disk that it supports. Because of that, a RAID 1 configuration is likely to be limited to drives that store system software and data and other highly critical files. In these cases, RAID 1 provides real-time copy of all data so that in the event of a disk failure, all of the critical data are still immediately available.

In a transaction-oriented environment, RAID 1 can achieve high I/O request rates if the bulk of the requests are reads. In this situation, the performance of RAID 1 can approach double that of RAID 0. However, if a substantial fraction of the I/O requests are write requests, then there may be no significant performance gain over RAID 0. RAID 1 may also provide improved performance over RAID 0 for data transfer intensive applications with a high percentage of reads. Improvement occurs if the application can split each read request so that both disk members participate.

RAID Level 2

RAID levels 2 and 3 make use of a parallel access technique. In a parallel access array, all member disks participate in the execution of every I/O request. Typically, the spindles of the individual drives are synchronized so that each disk head is in the same position on each disk at any given time.

As in the other RAID schemes, data striping is used. In the case of RAID 2 and 3, the strips are very small, often as small as a single byte or word. With RAID 2, an error-correcting code is calculated across corresponding bits on each data disk, and the bits of the code are stored in the corresponding bit positions on multiple parity disks. Typically, a Hamming code is used, which is able to correct single-bit errors and detect double-bit errors.

Although RAID 2 requires fewer disks than RAID 1, it is still rather costly. The number of redundant disks is proportional to the log of the number of data disks. On a single read, all disks are simultaneously accessed. The requested data and the associated error-correcting code are delivered to the array controller. If there is a single-bit error, the controller can recognize and correct the error instantly, so that the read access time is not slowed. On a single write, all data disks and parity disks must be accessed for the write operation.

RAID 2 would only be an effective choice in an environment in which many disk errors occur. Given the high reliability of individual disks and disk drives, RAID 2 is overkill and is not implemented.

RAID Level 3

RAID 3 is organized in a similar fashion to RAID 2. The difference is that RAID 3 requires only a single redundant disk, no matter how large the disk array. RAID 3 employs parallel access, with data distributed in small strips. Instead of an error-correcting code, a simple parity bit is computed for the set of individual bits in the same position on all of the data disks.

REDUNDANCY In the event of a drive failure, the parity drive is accessed and data is reconstructed from the remaining devices. Once the failed drive is replaced, the missing data can be restored on the new drive and operation resumed.

Data reconstruction is simple. Consider an array of five drives in which X0 through X3 contain data and X4 is the parity disk. The parity for the ith bit is calculated as follows:

X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)

where ⊕ is the exclusive-OR function.

Suppose that drive X1 has failed. If we add X4(i) ⊕ X1(i) to both sides of the preceding equation, we get

X1(i) = X4(i) ⊕ X3(i) ⊕ X2(i) ⊕ X0(i)

Thus, the contents of each strip of data on X1 can be regenerated from the contents of the corresponding strips on the remaining disks in the array. This principle is true for RAID levels 3 through 6.

In the event of a disk failure, all of the data are still available in what is referred to as reduced mode. In this mode, for reads, the missing data are regenerated on the fly using the exclusive-OR calculation. When data are written to a reduced RAID 3 array, consistency of the parity must be maintained for later regeneration. Return to full operation requires that the failed disk be replaced and the entire contents of the failed disk be regenerated on the new disk.
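The parity and reconstruction rules above translate directly into a byte-wise XOR over strips. The following is a minimal sketch with made-up two-byte strips, not a model of any particular controller.

```python
# Minimal sketch of XOR parity and single-drive reconstruction.

def xor_strips(strips):
    """Byte-wise XOR of equal-length strips."""
    out = bytearray(len(strips[0]))
    for s in strips:
        for i, byte in enumerate(s):
            out[i] ^= byte
    return bytes(out)

x0, x1, x2, x3 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0", b"\x55\xaa"
x4 = xor_strips([x0, x1, x2, x3])        # parity strip: X4 = X3 xor X2 xor X1 xor X0

# Drive X1 fails: its contents are the XOR of the parity strip and the survivors.
recovered_x1 = xor_strips([x4, x3, x2, x0])
assert recovered_x1 == x1
```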

PERFORMANCE Because data are striped in very small strips, RAID 3 can achieve very high data transfer rates. Any I/O request will involve the parallel transfer of data from all of the data disks. For large transfers, the performance improvement is especially noticeable. On the other hand, only one I/O request can be executed at a time. Thus, in a transaction-oriented environment, performance suffers.

RAID Level 4

RAID levels 4 through 6 make use of an independent access technique. In an independent access array, each member disk operates independently, so that separate I/O requests can be satisfied in parallel. Because of this, independent access arrays are more suitable for applications that require high I/O request rates and are relatively less suited for applications that require high data transfer rates.

As in the other RAID schemes, data striping is used. In the case of RAID 4 through 6, the strips are relatively large. With RAID 4, a bit-by-bit parity strip is calculated across corresponding strips on each data disk, and the parity bits are stored in the corresponding strip on the parity disk.

RAID 4 involves a write penalty when an I/O write request of small size is performed. Each time that a write occurs, the array management software must update not only the user data but also the corresponding parity bits. Consider an array of five drives in which X0 through X3 contain data and X4 is the parity disk. Suppose that a write is performed that only involves a strip on disk X1. Initially, for each bit i, we have the following relationship:

X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)     (6.1)

After the update, with potentially altered bits indicated by a prime symbol:

X4'(i) = X3(i) ⊕ X2(i) ⊕ X1'(i) ⊕ X0(i)
       = X3(i) ⊕ X2(i) ⊕ X1'(i) ⊕ X0(i) ⊕ X1(i) ⊕ X1(i)
       = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i) ⊕ X1(i) ⊕ X1'(i)
       = X4(i) ⊕ X1(i) ⊕ X1'(i)

The preceding set of equations is derived as follows. The first line shows that a change in X1 will also affect the parity disk X4. In the second line, we add the terms [⊕ X1(i) ⊕ X1(i)]. Because the exclusive-OR of any quantity with itself is 0, this does not affect the equation. However, it is a convenience that is used to create the third line, by reordering. Finally, Equation (6.1) is used to replace the first four terms by X4(i).

To calculate the new parity, the array management software must read the old user strip and the old parity strip. Then it can update these two strips with the new data and the newly calculated parity. Thus, each strip write involves two reads and two writes.

In the case of a larger size I/O write that involves strips on all disk drives, parity is easily computed by calculation using only the new data bits. Thus, the parity drive can be updated in parallel with the data drives and there are no extra reads or writes.

In any case, every write operation must involve the parity disk, which therefore can become a bottleneck.
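The result of this derivation, X4'(i) = X4(i) ⊕ X1(i) ⊕ X1'(i), is what makes the two-read/two-write update possible. The sketch below, again with made-up strip contents, checks that the shortcut agrees with recomputing the parity from all four data strips.

```python
# Minimal sketch of the RAID 4/5 small-write parity update, applied byte-wise.

def updated_parity(old_parity, old_strip, new_strip):
    """New parity from the old parity and the old and new contents of one strip."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_strip, new_strip))

x0, x1, x2, x3 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0", b"\x55\xaa"
old_parity = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(x0, x1, x2, x3))

new_x1 = b"\x00\xff"
new_parity = updated_parity(old_parity, x1, new_x1)

# The shortcut agrees with recomputing the parity from all four data strips.
assert new_parity == bytes(a ^ b ^ c ^ d for a, b, c, d in zip(x0, new_x1, x2, x3))
```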

RAID Level 5 RAID 5 is organized in a similar fashion to RAID 4. The difference is that RAID 5 distributes the parity strips across all disks. A typical allocation is a round-robin scheme, as illustrated in Figure 6.8f. For an n-disk array, the parity strip is on a different disk for the first n stripes, and the pattern then repeats. The distribution of parity strips across all drives avoids the potential I/O bottleneck found in RAID 4.

RAID Level 6 RAID 6 was introduced in a subsequent paper by the Berkeley researchers [KATZ89]. In the RAID 6 scheme, two different parity calculations are carried out and stored in separate blocks on different disks. Thus, a RAID 6 array whose user data require N disks consists of N + 2 disks. Figure 6.8g illustrates the scheme. P and Q are two different data check algorithms. One of the two is the exclusive-OR calculation used in RAID 4 and 5. But the other is an independent data check algorithm. This makes it possible to regenerate data even if two disks containing user data fail. The advantage of RAID 6 is that it provides extremely high data availability. Three disks would have to fail within the MTTR (mean time to repair) interval to cause data to be lost. On the other hand, RAID 6 incurs a substantial write penalty, because each write affects two parity blocks. Performance benchmarks [EISC07] show a RAID 6 controller can suffer more than a 30% drop in overall write performance compared with a RAID 5 implementation. RAID 5 and RAID 6 read performance is comparable. Table 6.4 is a comparative summary of the seven levels.

Table 6.4 RAID Comparison

Level 0
Advantages: I/O performance is greatly improved by spreading the I/O load across many channels and drives; no parity calculation overhead is involved; very simple design; easy to implement.
Disadvantages: The failure of just one drive will result in all data in an array being lost.
Applications: Video production and editing; image editing; pre-press applications; any application requiring high bandwidth.

Level 1
Advantages: 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk; under certain circumstances, RAID 1 can sustain multiple simultaneous drive failures; simplest RAID storage subsystem design.
Disadvantages: Highest disk overhead of all RAID types (100%)—inefficient.
Applications: Payroll; financial; any application requiring very high availability.

(Continued)

Table 6.4 Continued

Level 2
Advantages: Extremely high data transfer rates possible; the higher the data transfer rate required, the better the ratio of data disks to ECC disks; relatively simple controller design compared to RAID levels 3, 4, and 5.
Disadvantages: Very high ratio of ECC disks to data disks with smaller word sizes—inefficient; entry level cost very high—requires very high transfer rate requirement to justify.
Applications: No commercial implementations exist / not commercially viable.

Level 3
Advantages: Very high read data transfer rate; very high write data transfer rate; disk failure has an insignificant impact on throughput; low ratio of ECC (parity) disks to data disks means high efficiency.
Disadvantages: Transaction rate equal to that of a single disk drive at best (if spindles are synchronized); controller design is fairly complex.
Applications: Video production and live streaming; image editing; video editing; prepress applications; any application requiring high throughput.

Level 4
Advantages: Very high read data transaction rate; low ratio of ECC (parity) disks to data disks means high efficiency.
Disadvantages: Quite complex controller design; worst write transaction rate and write aggregate transfer rate; difficult and inefficient data rebuild in the event of disk failure.
Applications: No commercial implementations exist / not commercially viable.

Level 5
Advantages: Highest read data transaction rate; low ratio of ECC (parity) disks to data disks means high efficiency; good aggregate transfer rate.
Disadvantages: Most complex controller design; difficult to rebuild in the event of a disk failure (as compared to RAID level 1).
Applications: File and application servers; database servers; web, e-mail, and news servers; intranet servers; most versatile RAID level.

Level 6
Advantages: Provides for an extremely high data fault tolerance and can sustain multiple simultaneous drive failures.
Disadvantages: More complex controller design; controller overhead to compute parity addresses is extremely high.
Applications: Perfect solution for mission critical applications.

6.3 SOLID STATE DRIVES

One of the most significant developments in computer architecture in recent years is the increasing use of solid state drives (SSDs) to complement or even replace hard disk drives (HDDs), both as internal and external secondary memory. The term solid state refers to electronic circuitry built with semiconductors. A solid state drive is a memory device made with solid state components that can be used as a replacement for a hard disk drive. The SSDs now on the market and coming on line use a type of semiconductor memory referred to as flash memory. In this section, we first provide an introduction to flash memory, and then look at its use in SSDs.

Flash Memory Flash memory is a type of semiconductor memory that has been around for a number of years and is used in many consumer electronic products, including smart phones, GPS devices, MP3 players, digital cameras, and USB devices. In recent years, the cost and performance of flash memory have evolved to the point where it is feasible to use flash memory drives to replace HDDs.
Figure 6.10 illustrates the basic operation of a flash memory. For comparison, Figure 6.10a depicts the operation of a transistor. Transistors exploit the properties of semiconductors so that a small voltage applied to the gate can be used to control the flow of a large current between the source and the drain. In a flash memory cell, a second gate—called a floating gate, because it is insulated by a thin oxide layer—is added to the transistor. Initially, the floating gate does not interfere with the operation of the transistor (Figure 6.10b). In this state, the cell is deemed to represent binary 1. Applying a large voltage across the oxide layer causes electrons to tunnel through it and become trapped on the floating gate, where they remain even if the power is disconnected (Figure 6.10c). In this state, the cell is deemed to represent binary 0. The state of the cell can be read by using external circuitry to test whether the transistor is working or not. Applying a large voltage in the opposite direction removes the electrons from the floating gate, returning it to a state of binary 1.
There are two distinctive types of flash memory, designated as NOR and NAND. In NOR flash memory, the basic unit of access is a bit, and the logical organization resembles a NOR logic device.4 For NAND flash memory, the basic unit is 16 or 32 bits, and the logical organization resembles NAND devices.
NOR flash memory provides high-speed random access. It can read and write data to specific locations, and can reference and retrieve a single byte. NOR

Figure 6.10 Flash Memory Operation: (a) transistor structure (control gate, source, drain, P-substrate); (b) flash memory cell in one state; (c) flash memory cell in zero state (floating gate between the control gate and the substrate holds trapped charge)

4See Chapter 11 for a discussion of NOR and NAND gates.

flash memory is used to store cell phone operating system code and on Windows computers for the BIOS program that runs at startup. NAND reads and writes in small blocks. It is used in USB flash drives, memory cards (in digital cameras, MP3 players, etc.), and in SSDs. NAND provides higher bit density than NOR and greater write speed. NAND flash does not provide a random-access external address bus so the data must be read on a blockwise basis (also known as page access), where each block holds hundreds to thousands of bits.

SSD Compared to HDD As the cost of flash-based SSDs has dropped and the performance and bit density increased, SSDs have become increasingly competitive with HDDs. Table 6.5 shows typical measures of comparison at the time of this writing. SSDs have the following advantages over HDDs:
• High-performance input/output operations per second (IOPS): Significantly increases performance of I/O subsystems.
• Durability: Less susceptible to physical shock and vibration.
• Longer lifespan: SSDs are not susceptible to mechanical wear.
• Lower power consumption: SSDs use as little as 2.1 watts of power per drive, considerably less than comparable-size HDDs.
• Quieter and cooler running capabilities: Less floor space required, lower energy costs, and a greener enterprise.
• Lower access times and latency rates: Over 10 times faster than the spinning disks in an HDD.
Currently, HDDs enjoy a cost per bit advantage and a capacity advantage, but these differences are shrinking.

SSD Organization Figure 6.11 illustrates a general view of the common architectural system components associated with any SSD system. On the host system, the operating system invokes file system software to access data on the disk. The file system, in turn, invokes I/O driver software. The I/O driver software provides host access to the particular SSD product. The interface component in Figure 6.11 refers to the physical and electrical interface between the host processor and the SSD peripheral device. If the device is

Table 6.5 Comparison of Solid State Drives and Disk Drives

                             NAND Flash Drives    Disk Drives
I/O per second (sustained)   Read: 45,000         300
                             Write: 15,000
Throughput (MB/s)            Read: 200+           up to 80
                             Write: 100+
Random access time (ms)      0.1                  4–10
Storage capacity             up to 256 GB         up to 4 TB

Figure 6.11 Solid State Drive Architecture (host system: operating system software, file system software, I/O driver software, and interface; SSD: interface, controller, addressing logic, data buffer/cache, error correction, and flash memory components)

an internal hard drive, a common interface is PCIe. For external devices, one common interface is USB. In addition to the interface to the host system, the SSD contains the following components:
• Controller: Provides SSD device level interfacing and firmware execution.
• Addressing: Logic that performs the selection function across the flash memory components.
• Data buffer/cache: High speed RAM memory components used for speed matching and to increase data throughput.

• Error correction: Logic for error detection and correction.
• Flash memory components: Individual NAND flash chips.

Practical Issues There are two practical issues peculiar to SSDs that are not faced by HDDs. First, SSD performance has a tendency to slow down as the device is used. To understand the reason for this, you need to know that files are stored on disk as a set of pages, typically 4 KB in length. These pages are not necessarily, and indeed not typically, stored as a contiguous set of pages on the disk. The reason for this arrangement is explained in our discussion of virtual memory in Chapter 8. However, flash memory is accessed in blocks, with a typical block size of 512 KB, so that there are typically 128 pages per block. Now consider what must be done to write a page onto a flash memory.
1. The entire block must be read from the flash memory and placed in a RAM buffer. Then the appropriate page in the RAM buffer is updated.
2. Before the block can be written back to flash memory, the entire block of flash memory must be erased—it is not possible to erase just one page of the flash memory.
3. The entire block from the buffer is now written back to the flash memory.
Now, when a flash drive is relatively empty and a new file is created, the pages of that file are written on to the drive contiguously, so that one or only a few blocks are affected. However, over time, because of the way virtual memory works, files become fragmented, with pages scattered over multiple blocks. As the drive becomes more occupied, there is more fragmentation, so the writing of a new file can affect multiple blocks. Thus, the writing of multiple pages from one block becomes slower, the more fully occupied the disk is. Manufacturers have developed a variety of techniques to compensate for this property of flash memory, such as setting aside a substantial portion of the SSD as extra space for write operations (called overprovisioning) and then erasing inactive pages during idle time to defragment the disk. Another technique is the TRIM command, which allows an operating system to inform a solid state drive (SSD) which blocks of data are no longer considered in use and can be wiped internally.5 A second practical issue with flash memory drives is that a flash memory becomes unusable after a certain number of writes. As flash cells are stressed, they lose their ability to record and retain values. A typical limit is 100,000 writes [GSOE08]. Techniques for prolonging the life of an SSD drive include front-ending the flash with a cache to delay and group write operations, using wear-leveling algorithms that evenly distribute writes across blocks of cells, and sophisticated bad-block management techniques. In addition, vendors are deploying SSDs in RAID configurations to further reduce the probability of data loss. Most flash devices are also capable of estimating their own remaining lifetimes so systems can anticipate failure and take preemptive action.
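The read-erase-rewrite sequence just described is what makes small writes expensive on flash. The sketch below mimics it in Python, using the 4-KB page and 512-KB block sizes quoted above; it illustrates the mechanism only and is not a model of any particular drive's firmware.

    PAGE_SIZE = 4 * 1024                        # 4-KB pages (typical)
    PAGES_PER_BLOCK = 128                       # so one block is 512 KB
    BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK

    def write_page(block, page_index, new_page):
        """Update one page: the whole block is read, erased, and rewritten."""
        buffer = bytearray(block)                            # 1. read block into RAM
        start = page_index * PAGE_SIZE
        buffer[start:start + PAGE_SIZE] = new_page           # 2. update the page in RAM
        return bytes(buffer)                                 # 3. erase block, write buffer back

    # Changing 4 KB of user data forces the full 512-KB block to be rewritten.
    block = bytes(BLOCK_SIZE)
    block = write_page(block, 5, b"\xff" * PAGE_SIZE)
    print(BLOCK_SIZE // PAGE_SIZE, "pages rewritten to update 1 page")   # 128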

5While TRIM is frequently spelled in capital letters, it is not an acronym; it is merely a command name.

6.4 OPTICAL MEMORY

In 1983, one of the most successful consumer products of all time was introduced: the compact disk (CD) digital audio system. The CD is a nonerasable disk that can store more than 60 minutes of audio information on one side. The huge commercial success of the CD enabled the development of low-cost optical-disk storage technology that has revolutionized computer data storage. A variety of optical-disk systems have been introduced (Table 6.6). We briefly review each of these.

Compact Disk
CD-ROM Both the audio CD and the CD-ROM (compact disk read-only memory) share a similar technology. The main difference is that CD-ROM players are more rugged and have error correction devices to ensure that data are properly transferred from disk to computer. Both types of disk are made the same way. The disk is formed from a resin, such as polycarbonate. Digitally recorded information (either music or computer data) is imprinted as a series of microscopic pits on the surface of the polycarbonate. This is done, first of all, with a finely focused, high-intensity laser to create a master disk. The master is used, in turn, to make a die to

Table 6.6 Optical Disk Products

CD Compact Disk. A nonerasable disk that stores digitized audio information. The standard system uses 12-cm disks and can record more than 60 minutes of uninterrupted playing time.

CD-ROM Compact Disk Read-Only Memory. A nonerasable disk used for storing computer data. The standard system uses 12-cm disks and can hold more than 650 Mbytes.

CD-R CD Recordable. Similar to a CD-ROM. The user can write to the disk only once.

CD-RW CD Rewritable. Similar to a CD-ROM. The user can erase and rewrite to the disk multiple times.

DVD Digital Versatile Disk. A technology for producing digitized, compressed representation of video information, as well as large volumes of other digital data. Both 8 and 12 cm diameters are used, with a double-sided capacity of up to 17 Gbytes. The basic DVD is read-only (DVD-ROM).

DVD-R DVD Recordable. Similar to a DVD-ROM. The user can write to the disk only once. Only one-sided disks can be used.

DVD-RW DVD Rewritable. Similar to a DVD-ROM. The user can erase and rewrite to the disk multiple times. Only one-sided disks can be used.

Blu-ray DVD High-definition video disk. Provides considerably greater data storage density than DVD, using a 405-nm (blue-violet) laser. A single layer on a single side can store 25 Gbytes.

Figure 6.12 CD Operation (cross-section of a CD: label, protective acrylic, aluminum reflective layer, and pits and lands in the polycarbonate, read from below by a laser transmit/receive assembly)

stamp out copies onto polycarbonate. The pitted surface is then coated with a highly reflective surface, usually aluminum or gold. This shiny surface is protected against dust and scratches by a top coat of clear acrylic. Finally, a label can be silkscreened onto the acrylic.
Information is retrieved from a CD or CD-ROM by a low-powered laser housed in an optical-disk player, or drive unit. The laser shines through the clear polycarbonate while a motor spins the disk past it (Figure 6.12). The intensity of the reflected light of the laser changes as it encounters a pit. Specifically, if the laser beam falls on a pit, which has a somewhat rough surface, the light scatters and a low intensity is reflected back to the source. The areas between pits are called lands. A land is a smooth surface, which reflects back at higher intensity. The change between pits and lands is detected by a photosensor and converted into a digital signal. The sensor tests the surface at regular intervals. The beginning or end of a pit represents a 1; when no change in elevation occurs between intervals, a 0 is recorded.
Recall that on a magnetic disk, information is recorded in concentric tracks. With the simplest constant angular velocity (CAV) system, the number of bits per track is constant. An increase in density is achieved with multiple zoned recording, in which the surface is divided into a number of zones, with zones farther from the center containing more bits than zones closer to the center. Although this technique increases capacity, it is still not optimal.
To achieve greater capacity, CDs and CD-ROMs do not organize information on concentric tracks. Instead, the disk contains a single spiral track, beginning near the center and spiraling out to the outer edge of the disk. Sectors near the outside of the disk are the same length as those near the inside. Thus, information is packed evenly across the disk in segments of the same size and these are scanned at the same rate by rotating the disk at a variable speed. The pits are then read by the laser at a constant linear velocity (CLV). The disk rotates more slowly for accesses near the outer edge than for those near the center. Thus, the capacity of a track and the rotational delay both increase for positions nearer the outer edge of the disk. The data capacity for a CD-ROM is about 680 MB.
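The effect of constant linear velocity on rotation rate can be estimated with a short calculation. The sketch below assumes a scan speed of about 1.3 m/s (a typical 1x audio-CD figure not given in the text) and radii of roughly 25 mm and 58 mm for the innermost and outermost parts of the recorded spiral; both numbers are assumptions used only to make the arithmetic concrete.

    import math

    LINEAR_VELOCITY = 1.3                     # metres per second past the head (assumed)

    def rpm_at_radius(radius_mm):
        """Rotation rate needed to keep the pit stream moving at constant speed."""
        circumference = 2 * math.pi * (radius_mm / 1000.0)   # metres per revolution
        return 60.0 * LINEAR_VELOCITY / circumference

    print(round(rpm_at_radius(25)))   # inner edge: roughly 500 rpm
    print(round(rpm_at_radius(58)))   # outer edge: roughly 200 rpm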

Figure 6.13 CD-ROM Block Format (SYNC, 12 bytes; ID, 4 bytes: MIN, SEC, Sector, Mode; Data, 2048 bytes; L-ECC, 288 bytes; 2352 bytes total)
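The field sizes in Figure 6.13 account for the full 2352-byte block, and they also explain the capacity of about 680 MB mentioned above. The short calculation below assumes a 74-minute disc read at 75 blocks per second; neither figure appears in the text, and both are included only to make the arithmetic concrete.

    SYNC, ID, DATA, L_ECC = 12, 4, 2048, 288      # bytes per field (Figure 6.13)
    assert SYNC + ID + DATA + L_ECC == 2352       # total block size

    blocks = 74 * 60 * 75                         # 74 min x 60 s x 75 blocks/s (assumed)
    capacity = blocks * DATA                      # mode 1: 2048 user bytes per block
    print(capacity / 1e6)                         # roughly 682 MB of user data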

Data on the CD-ROM are organized as a sequence of blocks. A typical block format is shown in Figure 6.13. It consists of the following fields:
• Sync: The sync field identifies the beginning of a block. It consists of a byte of all 0s, 10 bytes of all 1s, and a byte of all 0s.
• Header: The header contains the block address and the mode byte. Mode 0 specifies a blank data field; mode 1 specifies the use of an error-correcting code and 2048 bytes of data; mode 2 specifies 2336 bytes of user data with no error-correcting code.
• Data: User data.
• Auxiliary: Additional user data in mode 2. In mode 1, this is a 288-byte error-correcting code.
With the use of CLV, random access becomes more difficult. Locating a specific address involves moving the head to the general area, adjusting the rotation speed and reading the address, and then making minor adjustments to find and access the specific sector.
CD-ROM is appropriate for the distribution of large amounts of data to a large number of users. Because of the expense of the initial writing process, it is not appropriate for individualized applications. Compared with traditional magnetic disks, the CD-ROM has two advantages:
• The optical disk together with the information stored on it can be mass replicated inexpensively—unlike a magnetic disk. The database on a magnetic disk has to be reproduced by copying one disk at a time using two disk drives.
• The optical disk is removable, allowing the disk itself to be used for archival storage. Most magnetic disks are nonremovable. The information on nonremovable magnetic disks must first be copied to another storage medium before the disk drive/disk can be used to store new information.
The disadvantages of CD-ROM are as follows:
• It is read-only and cannot be updated.
• It has an access time much longer than that of a magnetic disk drive, as much as half a second.

CD RECORDABLE To accommodate applications in which only one or a small number of copies of a set of data is needed, the write-once read-many CD, known as the CD recordable (CD-R), has been developed. For CD-R, a disk is prepared in such a way that it can be subsequently written once with a laser beam of modest intensity. Thus, with a somewhat more expensive disk controller than for CD-ROM, the customer can write once as well as read the disk. The CD-R medium is similar to but not identical to that of a CD or CD-ROM. For CDs and CD-ROMs, information is recorded by the pitting of the surface of the medium, which changes reflectivity. For a CD-R, the medium includes a dye layer. The dye is used to change reflectivity and is activated by a high-intensity laser. The resulting disk can be read on a CD-R drive or a CD-ROM drive. The CD-R optical disk is attractive for archival storage of documents and files. It provides a permanent record of large volumes of user data.

CD REWRITABLE The CD-RW optical disk can be repeatedly written and overwritten, as with a magnetic disk. Although a number of approaches have been tried, the only pure optical approach that has proved attractive is called phase change. The phase change disk uses a material that has two significantly different reflectivities in two different phase states. There is an amorphous state, in which the molecules exhibit a random orientation that reflects light poorly; and a crystalline state, which has a smooth surface that reflects light well. A beam of laser light can change the material from one phase to the other. The primary disadvantage of phase change optical disks is that the material eventually and permanently loses its desirable properties. Current materials can be used for between 500,000 and 1,000,000 erase cycles. The CD-RW has the obvious advantage over CD-ROM and CD-R that it can be rewritten and thus used as a true secondary storage. As such, it competes with magnetic disk. A key advantage of the optical disk is that the engineering tolerances for optical disks are much less severe than for high-capacity magnetic disks. Thus, they exhibit higher reliability and longer life.

Digital Versatile Disk With the capacious digital versatile disk (DVD), the electronics industry has at last found an acceptable replacement for the analog VHS video tape. The DVD has replaced the videotape used in video cassette recorders (VCRs) and, more important for this discussion, replaced the CD-ROM in personal computers and servers. The DVD takes video into the digital age. It delivers movies with impressive picture quality, and it can be randomly accessed like audio CDs, which DVD machines can also play. Vast volumes of data can be crammed onto the disk, currently seven times as much as a CD-ROM. With DVD’s huge storage capacity and vivid quality, PC games have become more realistic and educational software incorporates more video. Following in the wake of these developments has been a new crest of traffic over the Internet and corporate intranets, as this material is incorporated into Web sites.
The DVD’s greater capacity is due to three differences from CDs (Figure 6.14):
1. Bits are packed more closely on a DVD. The spacing between loops of a spiral on a CD is 1.6 μm and the minimum distance between pits along the spiral is 0.834 μm.

Figure 6.14 CD-ROM and DVD-ROM: (a) CD-ROM, capacity 682 MB (label, protective acrylic layer, aluminum reflective layer, and polycarbonate substrate, 1.2 mm thick; the laser focuses on pits in front of the reflective layer); (b) DVD-ROM, double-sided, dual-layer, capacity 17 GB (polycarbonate substrates and layers with semireflective and fully reflective layers on each side, 1.2 mm thick; the laser focuses on pits in one layer on one side at a time, and the disk must be flipped to read the other side)

The DVD uses a laser with shorter wavelength and achieves a loop spacing of 0.74 μm and a minimum distance between pits of 0.4 μm. The result of these two improvements is about a seven-fold increase in capacity, to about 4.7 GB.
2. The DVD employs a second layer of pits and lands on top of the first layer. A dual-layer DVD has a semireflective layer on top of the reflective layer, and by adjusting focus, the lasers in DVD drives can read each layer separately. This technique almost doubles the capacity of the disk, to about 8.5 GB. The lower reflectivity of the second layer limits its storage capacity so that a full doubling is not achieved.
3. The DVD-ROM can be two sided, whereas data are recorded on only one side of a CD. This brings total capacity up to 17 GB.
As with the CD, DVDs come in writeable as well as read-only versions (Table 6.6).

High-Definition Optical Disks High-definition optical disks are designed to store high-definition videos and to provide significantly greater storage capacity compared to DVDs. The higher bit density is achieved by using a laser with a shorter wavelength, in the blue-violet

Figure 6.15 Optical Memory Characteristics (beam spot size, pit and track dimensions, and data-layer depth for CD, DVD, and Blu-ray; laser wavelengths: CD 780 nm, DVD 650 nm, Blu-ray 405 nm)

range. The data pits, which constitute the digital 1s and 0s, are smaller on the high-definition optical disks compared to DVD because of the shorter laser wavelength. Two disk formats and technologies initially competed for market acceptance: HD DVD and Blu-ray DVD. The Blu-ray scheme ultimately achieved market dominance. The HD DVD scheme can store 15 GB on a single layer on a single side. Blu-ray positions the data layer on the disk closer to the laser (shown on the right-hand side of each diagram in Figure 6.15). This enables a tighter focus and less distortion and thus smaller pits and tracks. Blu-ray can store 25 GB on a single layer. Three versions are available: read only (BD-ROM), recordable once (BD-R), and rerecordable (BD-RE).

6.5 MAGNETIC TAPE

Tape systems use the same reading and recording techniques as disk systems. The medium is flexible polyester (similar to that used in some clothing) tape coated with magnetizable material. The coating may consist of particles of pure metal in special binders or vapor-plated metal films. The tape and the tape drive are analogous to a home tape recorder system. Tape widths vary from 0.38 cm (0.15 inch) to 1.27 cm (0.5 inch). Tapes used to be packaged as open reels that have to be threaded through a second spindle for use. Today, virtually all tapes are housed in cartridges. Data on the tape are structured as a number of parallel tracks running lengthwise. Earlier tape systems typically used nine tracks. This made it possible to store

data one byte at a time, with an additional parity bit as the ninth track. This was followed by tape systems using 18 or 36 tracks, corresponding to a digital word or double word. The recording of data in this form is referred to as parallel recording. Most modern systems instead use serial recording, in which data are laid out as a sequence of bits along each track, as is done with magnetic disks. As with the disk, data are read and written in contiguous blocks, called physical records, on a tape. Blocks on the tape are separated by gaps referred to as interrecord gaps. As with the disk, the tape is formatted to assist in locating physical records. The typical recording technique used in serial tapes is referred to as serpentine recording. In this technique, when data are being recorded, the first set of bits is recorded along the whole length of the tape. When the end of the tape is reached, the heads are repositioned to record a new track, and the tape is again recorded on its whole length, this time in the opposite direction. That process continues, back and forth, until the tape is full (Figure 6.16a). To increase speed, the read-write head is capable of reading and writing a number of adjacent tracks simultaneously (typically two to eight tracks). Data are still recorded serially along individual tracks, but blocks in sequence are stored on adjacent tracks, as suggested by Figure 6.16b. A tape drive is a sequential-access device. If the tape head is positioned at record 1, then to read record N, it is necessary to read physical records 1 through

Figure 6.16 Typical Magnetic Tape Features: (a) serpentine reading and writing (tracks recorded alternately in opposite directions along the length of the tape); (b) block layout for a system that reads–writes four tracks simultaneously (blocks 1–20 assigned in sequence across tracks 0–3 in the direction of tape motion)

Table 6.7 LTO Tape Drives

                          LTO-1    LTO-2    LTO-3     LTO-4     LTO-5     LTO-6     LTO-7     LTO-8
Release date              2000     2003     2005      2007      2010      TBA       TBA       TBA
Compressed capacity       200 GB   400 GB   800 GB    1600 GB   3.2 TB    8 TB      16 TB     32 TB
Compressed transfer rate  40 MB/s  80 MB/s  160 MB/s  240 MB/s  280 MB/s  525 MB/s  788 MB/s  1.18 GB/s
Linear density (bits/mm)  4880     7398     9638      13250     15142
Tape tracks               384      512      704       896       1280
Tape length (m)           609      609      680       820       846
Tape width (cm)           1.27     1.27     1.27      1.27      1.27
Write elements            8        8        16        16        16
WORM?                     No       No       Yes       Yes       Yes       Yes       Yes       Yes
Encryption capable?       No       No       No        Yes       Yes       Yes       Yes       Yes
Partitioning?             No       No       No        No        Yes       Yes       Yes       Yes

N - 1, one at a time. If the head is currently positioned beyond the desired record, it is necessary to rewind the tape a certain distance and begin reading forward. Unlike the disk, the tape is in motion only during a read or write operation. In contrast to the tape, the disk drive is referred to as a direct-access device. A disk drive need not read all the sectors on a disk sequentially to get to the desired one. It must only wait for the intervening sectors within one track and can make successive accesses to any track. Magnetic tape was the first kind of secondary memory. It is still widely used as the lowest-cost, slowest-speed member of the memory hierarchy. The dominant tape technology today is a cartridge system known as linear tape-open (LTO). LTO was developed in the late 1990s as an open-source alternative to the various proprietary systems on the market. Table 6.7 shows parameters for the various LTO generations. See Appendix J for details.

6.6 RECOMMENDED READING

[JACO08] provides good coverage of magnetic disks. [GSOE08] is an introduction to solid state drives. For good technical descriptions of flash memory, see [PAVA97] and [OKLO08]. An excellent survey of RAID technology, written by the inventors of the RAID concept, is [CHEN94]. A good overview paper is [FRIE96]. A good performance comparison of the RAID architectures is [CHEN96]. A good survey of optical recording and reading technology is [MANS97]. [OSUN11] provides a detailed treatment of LTO.

CHEN94 Chen, P.; Lee, E.; Gibson, G.; Katz, R.; and Patterson, D. “RAID: High-Performance, Reliable Secondary Storage.” ACM Computing Surveys, June 1994.
CHEN96 Chen, S., and Towsley, D. “A Performance Evaluation of RAID Architectures.” IEEE Transactions on Computers, October 1996.
FRIE96 Friedman, M. “RAID Keeps Going and Going and…” IEEE Spectrum, April 1996.
HAUE08 Haeusser, B., et al. IBM System Storage Guide for Open Systems. IBM Redbook SG24-5946-05, October 2007. ibm.com/redbooks
JACO08 Jacob, B.; Ng, S.; and Wang, D. Memory Systems: Cache, DRAM, Disk. Boston: Morgan Kaufmann, 2008.
MANS97 Mansuripur, M., and Sincerbox, G. “Principles and Techniques of Optical Data Storage.” Proceedings of the IEEE, November 1997.
OKLO08 Oklobdzija, V., ed. Digital Design and Fabrication. Boca Raton, FL: CRC Press, 2008.
OSUN11 Osuna, A., et al. IBM System Storage Tape Library Guide for Open Systems. IBM Redbook SG24-5946-07, June 2011.
PAVA97 Pavan, P., et al. “Flash Memory Cells–An Overview.” Proceedings of the IEEE, August 1997.

6.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

access time, Blu-ray, CD, CD-R, CD-ROM, CD-RW, constant angular velocity (CAV), constant linear velocity (CLV), cylinder, DVD, DVD-R, DVD-ROM, DVD-RW, fixed-head disk, flash memory, floppy disk, gap, hard disk drive (HDD), head, land, magnetic disk, magnetic tape, magnetoresistive, movable-head disk, multiple zoned recording, nonremovable disk, optical memory, pit, platter, RAID, removable disk, rotational delay, sector, seek time, serpentine recording, solid state drive (SSD), striped data, substrate, track, transfer time

Review Questions
6.1 What are the advantages of using a glass substrate for a magnetic disk?
6.2 How are data written onto a magnetic disk?
6.3 How are data read from a magnetic disk?
6.4 Explain the difference between a simple CAV system and a multiple zoned recording system.
6.5 Define the terms track, cylinder, and sector.
6.6 What is the typical disk sector size?

6.7 Define the terms seek time, rotational delay, access time, and transfer time.
6.8 What common characteristics are shared by all RAID levels?
6.9 Briefly define the seven RAID levels.
6.10 Explain the term striped data.
6.11 How is redundancy achieved in a RAID system?
6.12 In the context of RAID, what is the distinction between parallel access and independent access?
6.13 What is the difference between CAV and CLV?
6.14 What differences between a CD and a DVD account for the larger capacity of the latter?
6.15 Explain serpentine recording.

Problems
6.1 Consider a disk with N tracks numbered from 0 to (N - 1) and assume that requested sectors are distributed randomly and evenly over the disk. We want to calculate the average number of tracks traversed by a seek.
a. First, calculate the probability of a seek of length j when the head is currently positioned over track t. Hint: This is a matter of determining the total number of combinations, recognizing that all track positions for the destination of the seek are equally likely.
b. Next, calculate the probability of a seek of length K. Hint: this involves the summing over all possible combinations of movements of K tracks.
c. Calculate the average number of tracks traversed by a seek, using the formula for expected value E[x] = Σ (i = 0 to N - 1) of i × Pr[x = i]. Hint: Use the equalities Σ (i = 1 to n) of i = n(n + 1)/2 and Σ (i = 1 to n) of i^2 = n(n + 1)(2n + 1)/6.
d. Show that for large values of N, the average number of tracks traversed by a seek approaches N/3.
6.2 Define the following for a disk system:
ts = seek time; average time to position head over track
r = rotation speed of the disk, in revolutions per second
n = number of bits per sector
N = capacity of a track, in bits
tA = time to access a sector
Develop a formula for tA as a function of the other parameters.
6.3 Consider a magnetic disk drive with 8 surfaces, 512 tracks per surface, and 64 sectors per track. Sector size is 1 kB. The average seek time is 8 ms, the track-to-track access time is 1.5 ms, and the drive rotates at 3600 rpm. Successive tracks in a cylinder can be read without head movement.
a. What is the disk capacity?
b. What is the average access time?
c. Estimate the time required to transfer a 5-MB file. Assume this file is stored in successive sectors and tracks of successive cylinders, starting at sector 0, track 0, of cylinder i.
d. What is the burst transfer rate?
6.4 Consider a single-platter disk with the following parameters: rotation speed: 7200 rpm; number of tracks on one side of platter: 30,000; number of sectors per track: 600; seek time: one ms for every hundred tracks traversed. Let the disk receive a request to access a random sector on a random track and assume the disk head starts at track 0.
a. What is the average seek time?
b. What is the average rotational latency?

c. What is the transfer time for a sector?
d. What is the total average time to satisfy a request?
6.5 A distinction is made between physical records and logical records. A logical record is a collection of related data elements treated as a conceptual unit, independent of how or where the information is stored. A physical record is a contiguous area of storage space that is defined by the characteristics of the storage device and operating system. Assume a disk system in which each physical record contains thirty 120-byte logical records. Calculate how much disk space (in sectors, tracks, and surfaces) will be required to store 300,000 logical records if the disk is fixed-sector with 512 bytes/sector, with 96 sectors/track, 110 tracks per surface, and 8 usable surfaces. Ignore any file header record(s) and track indexes, and assume that records cannot span two sectors.
6.6 Consider a disk that rotates at 3600 rpm. The seek time to move the head between adjacent tracks is 2 ms. There are 32 sectors per track, which are stored in linear order from sector 0 through sector 31. The head sees the sectors in ascending order. Assume the read/write head is positioned at the start of sector 1 on track 8. There is a main memory buffer large enough to hold an entire track. Data is transferred between disk locations by reading from the source track into the main memory buffer and then writing the data from the buffer to the target track.
a. How long will it take to transfer sector 1 on track 8 to sector 1 on track 9?
b. How long will it take to transfer all the sectors of track 8 to the corresponding sectors of track 9?
6.7 It should be clear that disk striping can improve data transfer rate when the strip size is small compared to the I/O request size. It should also be clear that RAID 0 provides improved performance relative to a single large disk, because multiple I/O requests can be handled in parallel. However, in this latter case, is disk striping necessary? That is, does disk striping improve I/O request rate performance compared to a comparable disk array without striping?
6.8 Consider a 4-drive, 200GB-per-drive RAID array. What is the available data storage capacity for each of the RAID levels 0, 1, 3, 4, 5, and 6?
6.9 For a compact disk, audio is converted to digital with 16-bit samples, and is treated as a stream of 8-bit bytes for storage. One simple scheme for storing this data, called direct recording, would be to represent a 1 by a land and a 0 by a pit. Instead, each byte is expanded into a 14-bit binary number. It turns out that exactly 256 (2^8) of the total of 16,384 (2^14) 14-bit numbers have at least two 0s between every pair of 1s, and these are the numbers selected for the expansion from 8 to 14 bits. The optical system detects the presence of 1s by detecting a transition from pit to land or land to pit. It detects 0s by measuring the distances between intensity changes. This scheme requires that there are no 1s in succession; hence the use of the 8-to-14 code. The advantage of this scheme is as follows. For a given laser beam diameter, there is a minimum pit size, regardless of how the bits are represented. With this scheme, this minimum pit size stores 3 bits, because at least two 0s follow every 1. With direct recording, the same pit would be able to store only one bit. Considering both the number of bits stored per pit and the 8-to-14 bit expansion, which scheme stores the most bits and by what factor?
6.10 Design a backup strategy for a computer system.
One option is to use plug-in external disks, which cost $150 for each 500 GB drive. Another option is to buy a tape drive for $2500, and 400 GB tapes for $50 apiece. (These were realistic prices in 2008.) A typical backup strategy is to have two sets of backup media onsite, with backups alternately written on them so that, in case the system fails while making a backup, the previous version is still intact. There's also a third set kept offsite, with the offsite set periodically swapped with an on-site set.
a. Assume you have 1 TB (1000 GB) of data to back up. How much would a disk backup system cost?
b. How much would a tape backup system cost for 1 TB?
c. How large would each backup have to be in order for a tape strategy to be less expensive?
d. What kind of backup strategy favors tapes?

CHAPTER 7 INPUT/OUTPUT

7.1 External Devices
    Keyboard/Monitor
    Disk Drive
7.2 I/O Modules
    Module Function
    I/O Module Structure
7.3 Programmed I/O
    Overview of Programmed I/O
    I/O Commands
    I/O Instructions
7.4 Interrupt-Driven I/O
    Interrupt Processing
    Design Issues
    Intel 82C59A Interrupt Controller
    The Intel 82C55A Programmable Peripheral Interface
7.5 Direct Memory Access
    Drawbacks of Programmed and Interrupt-Driven I/O
    DMA Function
    Intel 8237A DMA Controller
7.6 I/O Channels and Processors
    The Evolution of the I/O Function
    Characteristics of I/O Channels
7.7 The External Interface: Thunderbolt and InfiniBand
    Types of Interfaces
    Point-to-Point and Multipoint Configurations
    Thunderbolt
    InfiniBand
7.8 IBM zEnterprise 196 I/O Structure
7.9 Recommended Reading
7.10 Key Terms, Review Questions, and Problems

LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Explain the use of I/O modules as part of a computer organization.
• Understand the difference between programmed I/O and interrupt-driven I/O and discuss their relative merits.
• Present an overview of the operation of direct memory access.
• Explain the function and use of I/O channels.
• Present an overview of Thunderbolt.
• Present an overview of InfiniBand.


In addition to the processor and a set of memory modules, the third key element of a computer system is a set of I/O modules. Each module interfaces to the system bus or central switch and controls one or more peripheral devices. An I/O module is not simply a set of mechanical connectors that wire a device into the system bus. Rather, the I/O module contains logic for performing a communication function between the peripheral and the bus.
The reader may wonder why one does not connect peripherals directly to the system bus. The reasons are as follows:
• There are a wide variety of peripherals with various methods of operation. It would be impractical to incorporate the necessary logic within the processor to control a range of devices.
• The data transfer rate of peripherals is often much slower than that of the memory or processor. Thus, it is impractical to use the high-speed system bus to communicate directly with a peripheral.
• On the other hand, the data transfer rate of some peripherals is faster than that of the memory or processor. Again, the mismatch would lead to inefficiencies if not managed properly.
• Peripherals often use different data formats and word lengths than the computer to which they are attached.
Thus, an I/O module is required. This module has two major functions (Figure 7.1):
• Interface to the processor and memory via the system bus or central switch
• Interface to one or more peripheral devices by tailored data links

Figure 7.1 Generic Model of an I/O Module (system bus address, data, and control lines on one side of the I/O module; links to peripheral devices on the other)

We begin this chapter with a brief discussion of external devices, followed by an overview of the structure and function of an I/O module. Then we look at the various ways in which the I/O function can be performed in cooperation with the processor and memory: the internal I/O interface. Finally, we examine the external I/O interface, between the I/O module and the outside world.

7.1 EXTERNAL DEVICES

I/O operations are accomplished through a wide assortment of external devices that provide a means of exchanging data between the external environment and the computer. An external device attaches to the computer by a link to an I/O module (Figure 7.1). The link is used to exchange control, status, and data between the I/O module and the external device. An external device connected to an I/O module is often referred to as a peripheral device or, simply, a peripheral. We can broadly classify external devices into three categories:

• Human readable: Suitable for communicating with the computer user
• Machine readable: Suitable for communicating with equipment
• Communication: Suitable for communicating with remote devices

Examples of human-readable devices are video display terminals (VDTs) and printers. Examples of machine-readable devices are magnetic disk and tape systems, and sensors and actuators, such as are used in a robotics application. Note that we are viewing disk and tape systems as I/O devices in this chapter, whereas in Chapter 6 we viewed them as memory devices. From a functional point of view, these devices are part of the memory hierarchy, and their use is appropriately discussed in Chapter 6. From a structural point of view, these devices are controlled by I/O modules and are hence to be considered in this chapter. Communication devices allow a computer to exchange data with a remote device, which may be a human-readable device, such as a terminal, a machine-readable device, or even another computer. In very general terms, the nature of an external device is indicated in Figure 7.2. The interface to the I/O module is in the form of control, data, and status signals. Control signals determine the function that the device will perform, such as send data to the I/O module (INPUT or READ), accept data from the I/O module (OUTPUT or WRITE), report status, or perform some control function particular to the device (e.g., position a disk head). Data are in the form of a set of bits to be sent to or received from the I/O module. Status signals indicate the state of the device. Examples are READY/NOT-READY to show whether the device is ready for data transfer. Control logic associated with the device controls the device's operation in response to direction from the I/O module. The transducer converts data from electrical to other forms of energy during output and from other forms to electrical during input. Typically, a buffer is associated with the transducer to temporarily hold

data being transferred between the I/O module and the external environment; a buffer size of 8 to 16 bits is common. The interface between the I/O module and the external device will be examined in Section 7.7. The interface between the external device and the environment is beyond the scope of this book, but several brief examples are given here.

Figure 7.2 Block Diagram of an External Device (control signals from the I/O module, status signals to the I/O module, and data bits to and from the I/O module; control logic, buffer, and transducer; device-unique data to and from the environment)

Keyboard/Monitor The most common means of computer/user interaction is a keyboard/monitor arrangement. The user provides input through the keyboard. This input is then transmitted to the computer and may also be displayed on the monitor. In addition, the monitor displays data provided by the computer. The basic unit of exchange is the character. Associated with each character is a code, typically 7 or 8 bits in length. The most commonly used text code is the International Reference Alphabet (IRA).1 Each character in this code is represented by a unique 7-bit binary code; thus, 128 different characters can be represented. Characters are of two types: printable and control. Printable characters are the alphabetic, numeric, and special characters that can be printed on paper or displayed on a screen. Some of the control characters have to do with controlling the printing or displaying of characters; an example is carriage return. Other control characters are concerned with communications procedures. See Appendix F for details. For keyboard input, when the user depresses a key, this generates an electronic signal that is interpreted by the transducer in the keyboard and translated into the bit pattern of the corresponding IRA code. This bit pattern is then transmitted to the I/O module in the computer. At the computer, the text can be stored in the same IRA code. On output, IRA code characters are transmitted to an external device from the I/O module. The transducer at the device interprets this code and sends the required electronic signals to the output device either to display the indicated character or perform the requested control function.

Disk Drive A disk drive contains electronics for exchanging data, control, and status signals with an I/O module plus the electronics for controlling the disk read/write mechanism. In a fixed-head disk, the transducer is capable of converting between the magnetic patterns on the moving disk surface and bits in the device’s buffer (Figure 7.2). A moving-head disk must also be able to cause the disk arm to move radially in and out across the disk’s surface.

1IRA is defined in ITU-T Recommendation T.50 and was formerly known as International Alphabet Number 5 (IA5). The U.S. national version of IRA is referred to as the American Standard Code for Information Interchange (ASCII).

7.2 I/O MODULES

Module Function The major functions or requirements for an I/O module fall into the following categories:
• Control and timing
• Processor communication
• Device communication
• Data buffering
• Error detection
During any period of time, the processor may communicate with one or more external devices in unpredictable patterns, depending on the program’s need for I/O. The internal resources, such as main memory and the system bus, must be shared among a number of activities, including data I/O. Thus, the I/O function includes a control and timing requirement, to coordinate the flow of traffic between internal resources and external devices. For example, the control of the transfer of data from an external device to the processor might involve the following sequence of steps:
1. The processor interrogates the I/O module to check the status of the attached device.
2. The I/O module returns the device status.
3. If the device is operational and ready to transmit, the processor requests the transfer of data, by means of a command to the I/O module.
4. The I/O module obtains a unit of data (e.g., 8 or 16 bits) from the external device.
5. The data are transferred from the I/O module to the processor.
If the system employs a bus, then each of the interactions between the processor and the I/O module involves one or more bus arbitrations.
The preceding simplified scenario also illustrates that the I/O module must communicate with the processor and with the external device. Processor communication involves the following:
• Command decoding: The I/O module accepts commands from the processor, typically sent as signals on the control bus. For example, an I/O module for a disk drive might accept the following commands: READ SECTOR, WRITE SECTOR, SEEK track number, and SCAN record ID. The latter two commands each include a parameter that is sent on the data bus.
• Data: Data are exchanged between the processor and the I/O module over the data bus.
• Status reporting: Because peripherals are so slow, it is important to know the status of the I/O module. For example, if an I/O module is asked to send data to the processor (read), it may not be ready to do so because it is still working on the previous I/O command. This fact can be reported with a status signal.

Common status signals are BUSY and READY. There may also be signals to report various error conditions. • Address recognition: Just as each word of memory has an address, so does each I/O device. Thus, an I/O module must recognize one unique address for each peripheral it controls.

On the other side, the I/O module must be able to perform device communication. This communication involves commands, status information, and data (Figure 7.2). An essential task of an I/O module is data buffering. The need for this function is apparent from Figure 2.11. Whereas the transfer rate into and out of main memory or the processor is quite high, the rate is orders of magnitude lower for many peripheral devices and covers a wide range. Data coming from main memory are sent to an I/O module in a rapid burst. The data are buffered in the I/O module and then sent to the peripheral device at its data rate. In the opposite direction, data are buffered so as not to tie up the memory in a slow transfer operation. Thus, the I/O module must be able to operate at both device and memory speeds. Similarly, if the I/O device operates at a rate higher than the memory access rate, then the I/O module performs the needed buffering operation. Finally, an I/O module is often responsible for error detection and for subsequently reporting errors to the processor. One class of errors includes mechanical and electrical malfunctions reported by the device (e.g., paper jam, bad disk track). Another class consists of unintentional changes to the bit pattern as it is transmitted from device to I/O module. Some form of error-detecting code is often used to detect transmission errors. A simple example is the use of a parity bit on each character of data. For example, the IRA character code occupies 7 bits of a byte. The eighth bit is set so that the total number of 1s in the byte is even (even parity) or odd (odd parity). When a byte is received, the I/O module checks the parity to determine whether an error has occurred.
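Even parity, as in the example above, is simple to generate and check in software. A minimal sketch in Python, not tied to any particular I/O module:

    def add_even_parity(ch7):
        """Set the eighth bit so that the byte holds an even number of 1s."""
        ones = bin(ch7 & 0x7F).count("1")
        return ch7 | 0x80 if ones % 2 else ch7

    def parity_ok(byte):
        """Receiver check: an odd count of 1s signals a transmission error."""
        return bin(byte & 0xFF).count("1") % 2 == 0

    b = add_even_parity(ord("A"))      # 'A' = 0x41 has two 1 bits, so the parity bit stays 0
    assert parity_ok(b)
    assert not parity_ok(b ^ 0x08)     # any single-bit change is detected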

I/O Module Structure I/O modules vary considerably in complexity and the number of external devices that they control. We will attempt only a very general description here. (One specific device, the Intel 82C55A, is described in Section 7.4.) Figure 7.3 provides a general block diagram of an I/O module. The module connects to the rest of the computer through a set of signal lines (e.g., system bus lines). Data transferred to and from the module are buffered in one or more data registers. There may also be one or more status registers that provide current status information. A status register may also function as a control register, to accept detailed control information from the processor. The logic within the module interacts with the processor via a set of control lines. The processor uses the control lines to issue commands to the I/O module. Some of the control lines may be used by the I/O module (e.g., for arbitration and status signals). The module must also be able to recognize and generate addresses associated with the devices it controls. Each I/O module has a unique address or, if it controls more than one external device, a unique set of addresses. Finally, the I/O module contains logic specific to the interface with each device that it controls.

Figure 7.3 Block Diagram of an I/O Module (interface to the system bus through data registers, status/control registers, and I/O logic connected to data, address, and control lines; interface to each external device through external device interface logic carrying data, status, and control)

An I/O module functions to allow the processor to view a wide range of devices in a simple-minded way. There is a spectrum of capabilities that may be provided. The I/O module may hide the details of timing, formats, and the electromechanics of an external device so that the processor can function in terms of simple read and write commands, and possibly open and close file commands. In its simplest form, the I/O module may still leave much of the work of controlling a device (e.g., rewind a tape) visible to the processor. An I/O module that takes on most of the detailed processing burden, presenting a high-level interface to the processor, is usually referred to as an I/O channel or I/O processor. An I/O module that is quite primitive and requires detailed control is usually referred to as an I/O controller or device controller. I/O controllers are commonly seen on microcomputers, whereas I/O channels are used on mainframes. In what follows, we will use the generic term I/O module when no confusion results and will use more specific terms where necessary.

7.3 PROGRAMMED I/O

Three techniques are possible for I/O operations. With programmed I/O, data are exchanged between the processor and the I/O module. The processor executes a program that gives it direct control of the I/O operation, including sensing device status, sending a read or write command, and transferring the data. When the processor issues a command to the I/O module, it must wait until the I/O operation is complete. If the processor is faster than the I/O module, this is wasteful of processor time. With interrupt-driven I/O, the processor issues an I/O command, continues to execute

Table 7.1 I/O Techniques

                                           No Interrupts     Use of Interrupts
I/O-to-memory transfer through processor   Programmed I/O    Interrupt-driven I/O
Direct I/O-to-memory transfer                                Direct memory access (DMA)

other instructions, and is interrupted by the I/O module when the latter has completed its work. With both programmed and interrupt I/O, the processor is responsible for extracting data from main memory for output and storing data in main memory for input. The alternative is known as direct memory access (DMA). In this mode, the I/O module and main memory exchange data directly, without processor involvement. Table 7.1 indicates the relationship among these three techniques. In this section, we explore programmed I/O. Interrupt I/O and DMA are explored in the following two sections, respectively. Overview of Programmed I/O When the processor is executing a program and encounters an instruction relat- ing to I/O, it executes that instruction by issuing a command to the appropriate I/O module. With programmed I/O, the I/O module will perform the requested action and then set the appropriate bits in the I/O status register (Figure 7.3). The I/O module takes no further action to alert the processor. In particular, it does not interrupt the processor. Thus, it is the responsibility of the processor periodically to check the status of the I/O module until it finds that the operation is complete. To explain the programmed I/O technique, we view it first from the point of view of the I/O commands issued by the processor to the I/O module, and then from the point of view of the I/O instructions executed by the processor. I/O Commands To execute an I/O-related instruction, the processor issues an address, specifying the particular I/O module and external device, and an I/O command. There are four types of I/O commands that an I/O module may receive when it is addressed by a processor: • Control: Used to activate a peripheral and tell it what to do. For example, a magnetic-tape unit may be instructed to rewind or to move forward one record. These commands are tailored to the particular type of peripheral device. • Test: Used to test various status conditions associated with an I/O module and its peripherals. The processor will want to know that the peripheral of inter- est is powered on and available for use. It will also want to know if the most recent I/O operation is completed and if any errors occurred. • Read: Causes the I/O module to obtain an item of data from the peripheral and place it in an internal buffer (depicted as a data register in Figure 7.3). The processor can then obtain the data item by requesting that the I/O module place it on the data bus. • Write: Causes the I/O module to take an item of data (byte or word) from the data bus and subsequently transmit that data item to the peripheral. 230 CHAPTER 7 / INPUT/OUTPUT

Figure 7.4 Three Techniques for Input of a Block of Data: (a) Programmed I/O, (b) Interrupt-Driven I/O, (c) Direct Memory Access

Figure 7.4a gives an example of the use of programmed I/O to read in a block of data from a peripheral device (e.g., a record from tape) into memory. Data are read in one word (e.g., 16 bits) at a time. For each word that is read in, the processor must remain in a status-checking cycle until it determines that the word is available in the I/O module’s data register. This flowchart highlights the main disadvantage of this technique: it is a time-consuming process that keeps the processor busy needlessly. I/O Instructions With programmed I/O, there is a close correspondence between the I/O-related instructions that the processor fetches from memory and the I/O commands that the processor issues to an I/O module to execute the instructions. That is, the instruc- tions are easily mapped into I/O commands, and there is often a simple one-to-one relationship. The form of the instruction depends on the way in which external devices are addressed. Typically, there will be many I/O devices connected through I/O modules to the system. Each device is given a unique identifier or address. When the processor issues an I/O command, the command contains the address of the desired device. Thus, each I/O module must interpret the address lines to determine if the com- mand is for itself. 7.3 / PROGRAMMED I/O 231
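The flow of Figure 7.4a can be sketched in C as follows. This is a minimal, hypothetical illustration: the register addresses, bit masks, and command codes are assumptions made for the example, not those of any real I/O module.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical memory-mapped I/O module registers (addresses are assumptions). */
#define IO_DATA    (*(volatile uint16_t *)0x0200u)
#define IO_COMMAND (*(volatile uint16_t *)0x0202u)
#define IO_STATUS  (*(volatile uint16_t *)0x0204u)   /* bit 0 = data ready, bit 1 = error */
#define CMD_READ      0x0001u
#define STATUS_READY  0x0001u
#define STATUS_ERROR  0x0002u

/* Programmed I/O: read a block of 16-bit words, one word at a time.
   The processor busy-waits on the status register for every word. */
int programmed_io_read_block(uint16_t *buf, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        IO_COMMAND = CMD_READ;                  /* issue read command to the I/O module */
        while ((IO_STATUS & STATUS_READY) == 0)
            ;                                   /* not ready: keep checking status */
        if (IO_STATUS & STATUS_ERROR)
            return -1;                          /* error condition reported by the module */
        buf[i] = IO_DATA;                       /* read word from the I/O module */
    }                                           /* the store into buf[i] is the write to memory */
    return 0;                                   /* done: proceed to the next instruction */
}

Every word costs a full status-checking loop, which is exactly the needless busy waiting the flowchart highlights.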

When the processor, main memory, and I/O share a common bus, two modes of addressing are possible: memory mapped and isolated. With memory-mapped I/O, there is a single address space for memory locations and I/O devices. The processor treats the status and data registers of I/O modules as memory locations and uses the same machine instructions to access both memory and I/O devices. So, for example, with 10 address lines, a combined total of 2^10 = 1024 memory locations and I/O addresses can be supported, in any combination. With memory-mapped I/O, a single read line and a single write line are needed on the bus. Alternatively, the bus may be equipped with memory read and write plus input and output command lines. Now, the command line specifies whether the address refers to a memory location or an I/O device. The full range of addresses may be available for both. Again, with 10 address lines, the system may now support both 1024 memory locations and 1024 I/O addresses. Because the address space for I/O is isolated from that for memory, this is referred to as isolated I/O. Figure 7.5 contrasts these two programmed I/O techniques. Figure 7.5a shows how the interface for a simple input device such as a terminal keyboard might appear to a programmer using memory-mapped I/O. Assume a 10-bit address, with a memory of 512 locations (0-511) and up to 512 I/O addresses (locations 512-1023). Two addresses are dedicated to keyboard input from a particular terminal. Address 516 refers to the data register and address 517 refers to the status register, which also functions as a control register for receiving processor commands. The program shown will read 1 byte of data from the keyboard into an accumulator register in the processor. Note that the processor loops until the data byte is available.

Figure 7.5 Memory-Mapped and Isolated I/O
(Keyboard input data register at address 516; keyboard input status and control register at address 517: the sign bit reads 1 when ready and 0 when busy, and storing a 1 starts a read.)

(a) Memory-mapped I/O
ADDRESS   INSTRUCTION          OPERAND   COMMENT
200       Load AC              "1"       Load accumulator
          Store AC             517       Initiate keyboard read
202       Load AC              517       Get status byte
          Branch if Sign = 0   202       Loop until ready
          Load AC              516       Load data byte

(b) Isolated I/O
ADDRESS   INSTRUCTION          OPERAND   COMMENT
200       Load I/O             5         Initiate keyboard read
201       Test I/O             5         Check for completion
          Branch Not Ready     201       Loop until complete
          In                   5         Load data byte

With isolated I/O (Figure 7.5b), the I/O ports are accessible only by special I/O commands, which activate the I/O command lines on the bus. For most types of processors, there is a relatively large set of different instruc- tions for referencing memory. If isolated I/O is used, there are only a few I/O instructions. Thus, an advantage of memory-mapped I/O is that this large repertoire of instructions can be used, allowing more efficient programming. A disadvantage is that valuable memory address space is used up. Both memory-mapped and isolated I/O are in common use.
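As an illustration of the memory-mapped case, the program of Figure 7.5a can be rendered in C with volatile pointers. The addresses 516 and 517 and the sign-bit ready flag come from the figure; the pointer-cast style is simply the usual C idiom for memory-mapped registers and is an assumption of the sketch.

#include <stdint.h>

#define KBD_DATA   (*(volatile uint8_t *)516u)   /* keyboard input data register */
#define KBD_STATUS (*(volatile uint8_t *)517u)   /* status (bit 7: 1 = ready) and control register */

uint8_t read_keyboard_byte(void)
{
    KBD_STATUS = 1u;                      /* "Store AC 517": initiate keyboard read */
    while ((KBD_STATUS & 0x80u) == 0)     /* "Branch if Sign = 0": loop until ready */
        ;
    return KBD_DATA;                      /* "Load AC 516": load data byte */
}

The isolated I/O version of Figure 7.5b cannot be written in portable C: it needs the special In/Test I/O instructions (or compiler intrinsics) that drive the bus's I/O command lines, which is one reason memory-mapped I/O lets the full instruction repertoire be used.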

7.4 INTERRUPT-DRIVEN I/O

The problem with programmed I/O is that the processor has to wait a long time for the I/O module of concern to be ready for either reception or transmission of data. The processor, while waiting, must repeatedly interrogate the status of the I/O module. As a result, the level of the performance of the entire system is severely degraded. An alternative is for the processor to issue an I/O command to a module and then go on to do some other useful work. The I/O module will then interrupt the processor to request service when it is ready to exchange data with the proces- sor. The processor then executes the data transfer, as before, and then resumes its former processing. Let us consider how this works, first from the point of view of the I/O module. For input, the I/O module receives a READ command from the processor. The I/O module then proceeds to read data in from an associated peripheral. Once the data are in the module’s data register, the module signals an interrupt to the processor over a control line. The module then waits until its data are requested by the proc- essor. When the request is made, the module places its data on the data bus and is then ready for another I/O operation. From the processor’s point of view, the action for input is as follows. The proc- essor issues a READ command. It then goes off and does something else (e.g., the processor may be working on several different programs at the same time). At the end of each instruction cycle, the processor checks for interrupts (Figure 3.9). When the interrupt from the I/O module occurs, the processor saves the context (e.g., pro- gram counter and processor registers) of the current program and processes the interrupt. In this case, the processor reads the word of data from the I/O module and stores it in memory. It then restores the context of the program it was working on (or some other program) and resumes execution. Figure 7.4b shows the use of interrupt I/O for reading in a block of data. Compare this with Figure 7.4a. Interrupt I/O is more efficient than programmed I/O because it eliminates needless waiting. However, interrupt I/O still consumes a lot of processor time, because every word of data that goes from memory to I/O module or from I/O module to memory must pass through the processor. Interrupt Processing Let us consider the role of the processor in interrupt-driven I/O in more detail. The occurrence of an interrupt triggers a number of events, both in the processor 7.4 / INTERRUPT-DRIVEN I/O 233

hardware and in software. Figure 7.6 shows a typical sequence. When an I/O device completes an I/O operation, the following sequence of hardware events occurs: 1. The device issues an interrupt signal to the processor. 2. The processor finishes execution of the current instruction before responding to the interrupt, as indicated in Figure 3.9. 3. The processor tests for an interrupt, determines that there is one, and sends an acknowledgment signal to the device that issued the interrupt. The acknowl- edgment allows the device to remove its interrupt signal. 4. The processor now needs to prepare to transfer control to the interrupt routine. To begin, it needs to save information needed to resume the current program at the point of interrupt. The minimum information required is (a) the status of the processor, which is contained in a register called the program status word (PSW), and (b) the location of the next instruction to be executed, which is contained in the program counter. These can be pushed onto the system control stack.2 5. The processor now loads the program counter with the entry location of the interrupt-handling program that will respond to this interrupt. Depending on the computer architecture and operating system design, there may be a single

Figure 7.6 Simple Interrupt Processing. Hardware steps: device controller or other system hardware issues an interrupt; processor finishes execution of the current instruction; processor signals acknowledgment of the interrupt; processor pushes PSW and PC onto the control stack; processor loads the new PC value based on the interrupt. Software steps: save remainder of process state information; process the interrupt; restore process state information; restore old PSW and PC.

2See Appendix O for a discussion of stack operation.

program; one program for each type of interrupt; or one program for each device and each type of interrupt. If there is more than one interrupt-handling routine, the processor must determine which one to invoke. This information may have been included in the original interrupt signal, or the processor may have to issue a request to the device that issued the interrupt to get a response that contains the needed information. Once the program counter has been loaded, the processor proceeds to the next instruction cycle, which begins with an instruction fetch. Because the instruc- tion fetch is determined by the contents of the program counter, the result is that control is transferred to the interrupt-handler program. The execution of this pro- gram results in the following operations: 6. At this point, the program counter and PSW relating to the interrupted program have been saved on the system stack. However, there is other infor- mation that is considered part of the “state” of the executing program. In par- ticular, the contents of the processor registers need to be saved, because these registers may be used by the interrupt handler. So, all of these values, plus any other state information, need to be saved. Typically, the interrupt handler will begin by saving the contents of all registers on the stack. Figure 7.7a shows a simple example. In this case, a user program is interrupted after the instruction at location N. The contents of all of the registers plus the address of the next instruction (N + 1) are pushed onto the stack. The stack pointer is updated to point to the new top of stack, and the program counter is updated to point to the beginning of the interrupt service routine. 7. The interrupt handler next processes the interrupt. This includes an examina- tion of status information relating to the I/O operation or other event that caused an interrupt. It may also involve sending additional commands or acknowledgments to the I/O device. 8. When interrupt processing is complete, the saved register values are retrieved from the stack and restored to the registers (e.g., see Figure 7.7b). 9. The final act is to restore the PSW and program counter values from the stack. As a result, the next instruction to be executed will be from the previously interrupted program. Note that it is important to save all the state information about the interrupted program for later resumption. This is because the interrupt is not a routine called from the program. Rather, the interrupt can occur at any time and therefore at any point in the execution of a user program. Its occurrence is unpredictable. Indeed, as we will see in the next chapter, the two programs may not have anything in common and may belong to two different users.
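Steps 6 through 9 can be sketched as a generic interrupt handler. The skeleton below is purely illustrative: on a real machine the PSW and PC are saved by hardware and the general registers by compiler-generated prologue and epilogue code, so the explicit save/restore calls and every other name here are assumptions of the sketch.

#include <stdint.h>

/* Hypothetical hooks; on real hardware these are hardware actions or
   compiler-generated code, not ordinary function calls. */
extern void save_general_registers(void);       /* step 6: save remaining state */
extern void restore_general_registers(void);    /* step 8: restore remaining state */
extern uint16_t io_read_status(int module);
extern uint16_t io_read_data(int module);
extern void io_acknowledge(int module);
extern void deliver_to_buffer(uint16_t word);

void io_interrupt_handler(int module)
{
    save_general_registers();                 /* PSW and PC were already pushed by hardware */

    uint16_t status = io_read_status(module); /* step 7: examine status of the operation */
    if (status & 0x1u) {                      /* assumed "data ready" bit */
        uint16_t word = io_read_data(module); /* read the word from the I/O module */
        deliver_to_buffer(word);              /* store it in memory for the interrupted program */
    }
    io_acknowledge(module);                   /* let the device drop its interrupt request */

    restore_general_registers();              /* step 8 */
    /* Step 9: returning from the handler restores the old PSW and PC,
       resuming the previously interrupted program. */
}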

Design Issues Two design issues arise in implementing interrupt I/O. First, because there will almost invariably be multiple I/O modules, how does the processor determine which device issued the interrupt? And second, if multiple interrupts have occurred, how does the processor decide which one to process? 7.4 / INTERRUPT-DRIVEN I/O 235

Figure 7.7 Changes in Memory and Registers for an Interrupt: (a) interrupt occurs after the instruction at location N, so the return address N + 1 and the general registers are pushed onto the control stack, the stack pointer is updated, and the program counter is loaded with the start address Y of the interrupt-service routine; (b) return from interrupt, in which the saved values are restored from the stack and execution of the user program resumes at N + 1.

Let us consider device identification first. Four general categories of techniques are in common use: • Multiple interrupt lines • Software poll • Daisy chain (hardware poll, vectored) • Bus arbitration (vectored) The most straightforward approach to the problem is to provide multiple inter- rupt lines between the processor and the I/O modules. However, it is impractical to dedicate more than a few bus lines or processor pins to interrupt lines. Consequently, even if multiple lines are used, it is likely that each line will have multiple I/O mod- ules attached to it. Thus, one of the other three techniques must be used on each line. 236 CHAPTER 7 / INPUT/OUTPUT

One alternative is the software poll. When the processor detects an interrupt, it branches to an interrupt-service routine whose job it is to poll each I/O module to determine which module caused the interrupt. The poll could be in the form of a separate command line (e.g., TESTI/O). In this case, the processor raises TESTI/O and places the address of a particular I/O module on the address lines. The I/O mod- ule responds positively if it sets the interrupt. Alternatively, each I/O module could contain an addressable status register. The processor then reads the status register of each I/O module to identify the interrupting module. Once the correct module is identified, the processor branches to a device-service routine specific to that device. The disadvantage of the software poll is that it is time consuming. A more efficient technique is to use a daisy chain, which provides, in effect, a hardware poll. An example of a daisy-chain configuration is shown in Figure 3.30. For interrupts, all I/O modules share a common interrupt request line. The interrupt acknowledge line is daisy chained through the modules. When the processor senses an interrupt, it sends out an interrupt acknowledge. This signal propagates through a series of I/O modules until it gets to a requesting module. The requesting module typically responds by placing a word on the data lines. This word is referred to as a vector and is either the address of the I/O module or some other unique identifier. In either case, the processor uses the vector as a pointer to the appropriate device-service routine. This avoids the need to execute a general interrupt-service routine first. This technique is called a vectored interrupt. There is another technique that makes use of vectored interrupts, and that is bus arbitration. With bus arbitration, an I/O module must first gain control of the bus before it can raise the interrupt request line. Thus, only one module can raise the line at a time. When the processor detects the interrupt, it responds on the interrupt acknowledge line. The requesting module then places its vector on the data lines. The aforementioned techniques serve to identify the requesting I/O module. They also provide a way of assigning priorities when more than one device is request- ing interrupt service. With multiple lines, the processor just picks the interrupt line with the highest priority. With software polling, the order in which modules are polled determines their priority. Similarly, the order of modules on a daisy chain determines their priority. Finally, bus arbitration can employ a priority scheme, as discussed in Section 3.4. We now turn to two examples of interrupt structures.
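Before turning to those examples, the software poll can be made concrete with a short sketch. The status-register access, the pending-interrupt bit, and the table of device-service routines are assumptions of the example.

#define NUM_IO_MODULES 8

/* Hypothetical per-module interface. */
extern unsigned read_module_status(int module);            /* contents of the module's status register */
extern void (*device_service_routine[NUM_IO_MODULES])(void);

#define STATUS_INTERRUPT_PENDING 0x80u                      /* assumed "I raised the interrupt" bit */

/* Called from the general interrupt-service routine: poll each module in turn.
   The polling order itself defines the modules' relative priority. */
void poll_and_dispatch(void)
{
    for (int m = 0; m < NUM_IO_MODULES; m++) {
        if (read_module_status(m) & STATUS_INTERRUPT_PENDING) {
            device_service_routine[m]();                    /* branch to the device-specific routine */
            return;
        }
    }
}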

Intel 82C59A Interrupt Controller The Intel 80386 provides a single Interrupt Request (INTR) and a single Interrupt Acknowledge (INTA) line. To allow the 80386 to handle a variety of devices and priority structures, it is usually configured with an external interrupt arbiter, the 82C59A. External devices are connected to the 82C59A, which in turn connects to the 80386. Figure 7.8 shows the use of the 82C59A to connect multiple I/O modules for the 80386. A single 82C59A can handle up to eight modules. If control for more than eight modules is required, a cascade arrangement can be used to handle up to 64 modules. The 82C59A's sole responsibility is the management of interrupts. It accepts interrupt requests from attached modules, determines which interrupt has the highest priority, and then signals the processor by raising the INTR line. The processor acknowledges via the INTA line. This prompts the 82C59A to place the appropriate vector information on the data bus. The processor can then proceed to process the interrupt and to communicate directly with the I/O module to read or write data.

Figure 7.8 Use of the 82C59A Interrupt Controller (a master 82C59A drives the 80386 INTR line; slave 82C59A controllers cascaded on the master's IR inputs connect up to 64 external devices, numbered 00 through 63)

The 82C59A is programmable. The 80386 determines the priority scheme to be used by setting a control word in the 82C59A. The following interrupt modes are possible:
• Fully nested: The interrupt requests are ordered in priority from 0 (IR0) through 7 (IR7).

• Rotating: In some applications a number of interrupting devices are of equal priority. In this mode a device, after being serviced, receives the lowest prior- ity in the group. • Special mask: This allows the processor to inhibit interrupts from certain devices. The Intel 82C55A Programmable Peripheral Interface As an example of an I/O module used for programmed I/O and interrupt-driven I/O, we consider the Intel 82C55A Programmable Peripheral Interface. The 82C55A is a single-chip, general-purpose I/O module designed for use with the Intel 80386 processor. Figure 7.9 shows a general block diagram plus the pin assignment for the 40-pin package in which it is housed. The right side of the block diagram is the external interface of the 82C55A. The 24 I/O lines are programmable by the 80386 by means of the control register. The 80386 can set the value of the control register to specify a variety of operating modes and configurations. The 24 lines are divided into three 8-bit groups (A, B, C). Each group can function as an 8-bit I/O port. In addition, group C is subdivided into 4-bit groups (CA and CB), which may be used in conjunction with the A and B I/O ports. Configured in this manner, group C lines carry control and status signals. The left side of the block diagram is the internal interface to the 80386 bus. It includes an 8-bit bidirectional data bus (D0 through D7), used to transfer data to and from the I/O ports and to transfer control information to the control register. The two address lines specify one of the three I/O ports or the control register. A transfer takes place when the CHIP SELECT line is enabled together with either the READ or WRITE line. The RESET line is used to initialize the module.

Figure 7.9 The Intel 82C55A Programmable Peripheral Interface: (a) block diagram, showing the 8-bit internal bus, data buffers, read/write/control logic, control register, and the A, B, CA, and CB port groups; (b) pin layout of the 40-pin package

The control register is loaded by the processor to control the mode of operation and to define signals, if any. In Mode 0 operation, the three groups of eight exter- nal lines function as three 8-bit I/O ports. Each port can be designated as input or output. Otherwise, groups A and B function as I/O ports, and the lines of group C serve as control lines for A and B. The control signals serve two principal purposes: “handshaking” and interrupt request. Handshaking is a simple timing mechanism. One control line is used by the sender as a DATA READY line, to indicate when the data are present on the I/O data lines. Another line is used by the receiver as an ACKNOWLEDGE, indicating that the data have been read and the data lines may be cleared. Another line may be designated as an INTERRUPT REQUEST line and tied back to the system bus. Because the 82C55A is programmable via the control register, it can be used to control a variety of simple peripheral devices. Figure 7.10 illustrates its use to control

Figure 7.10 Keyboard/Display Interface to 82C55A (port A serves as the keyboard input port, lines A0-A5 plus shift and control; port B serves as the display output port, lines B0-B5 plus backspace and clear; group C lines carry data ready (C4) and acknowledge (C5) for the keyboard, data ready (C1), acknowledge (C2), blanking (C6), and clear line (C7) for the display, and interrupt request lines C3 and C0)

a keyboard/display terminal. The keyboard provides 8 bits of input. Two of these bits, SHIFT and CONTROL, have special meaning to the keyboard-handling pro- gram executing in the processor. However, this interpretation is transparent to the 82C55A, which simply accepts the 8 bits of data and presents them on the system data bus. Two handshaking control lines are provided for use with the keyboard. The display is also linked by an 8-bit data port. Again, two of the bits have spe- cial meanings that are transparent to the 82C55A. In addition to two handshaking lines, two lines provide additional control functions.
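The keyboard side of this handshake might be driven as in the following sketch. The port addresses and bit positions are illustrative assumptions; in particular, the sketch does not reproduce the 82C55A's actual control-word programming, only the DATA READY/ACKNOWLEDGE exchange carried on the group C lines of Figure 7.10.

#include <stdint.h>

/* Hypothetical memory-mapped views of the 82C55A ports (addresses assumed). */
#define PORT_A  (*(volatile uint8_t *)0x60u)   /* keyboard data (input port) */
#define PORT_C  (*(volatile uint8_t *)0x62u)   /* group C control/status lines */
#define C4_DATA_READY   0x10u                  /* keyboard DATA READY  (line C4) */
#define C5_ACKNOWLEDGE  0x20u                  /* keyboard ACKNOWLEDGE (line C5) */

/* Read one keystroke using the DATA READY / ACKNOWLEDGE handshake. */
uint8_t read_keystroke(void)
{
    while ((PORT_C & C4_DATA_READY) == 0)
        ;                                      /* wait until the keyboard signals data present */
    uint8_t ch = PORT_A;                       /* latch the 8 bits (including SHIFT and CONTROL) */
    PORT_C |= C5_ACKNOWLEDGE;                  /* tell the keyboard the data have been read */
    PORT_C &= (uint8_t)~C5_ACKNOWLEDGE;        /* drop the acknowledge, completing the handshake */
    return ch;
}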

7.5 DIRECT MEMORY ACCESS

Drawbacks of Programmed and Interrupt-Driven I/O Interrupt-driven I/O, though more efficient than simple programmed I/O, still requires the active intervention of the processor to transfer data between memory and an I/O module, and any data transfer must traverse a path through the proces- sor. Thus, both these forms of I/O suffer from two inherent drawbacks: 1. The I/O transfer rate is limited by the speed with which the processor can test and service a device. 2. The processor is tied up in managing an I/O transfer; a number of instructions must be executed for each I/O transfer (e.g., Figure 7.5). There is somewhat of a trade-off between these two drawbacks. Consider the transfer of a block of data. Using simple programmed I/O, the processor is dedi- cated to the task of I/O and can move data at a rather high rate, at the cost of doing nothing else. Interrupt I/O frees up the processor to some extent at the expense of the I/O transfer rate. Nevertheless, both methods have an adverse impact on both processor activity and I/O transfer rate. When large volumes of data are to be moved, a more efficient technique is required: direct memory access (DMA). DMA Function DMA involves an additional module on the system bus. The DMA module (Figure 7.11) is capable of mimicking the processor and, indeed, of taking over control of the system from the processor. It needs to do this to transfer data to and from memory over the system bus. For this purpose, the DMA module must use the bus only when the processor does not need it, or it must force the proces- sor to suspend operation temporarily. The latter technique is more common and is referred to as cycle stealing, because the DMA module in effect steals a bus cycle. When the processor wishes to read or write a block of data, it issues a command to the DMA module, by sending to the DMA module the following information: • Whether a read or write is requested, using the read or write control line between the processor and the DMA module • The address of the I/O device involved, communicated on the data lines 7.5 / DIRECT MEMORY ACCESS 241

Figure 7.11 Typical DMA Block Diagram (a data count register, data register, and address register connected to the data and address lines, plus control logic with request-to-DMA, acknowledge-from-DMA, interrupt, read, and write lines)

• The starting location in memory to read from or write to, communicated on the data lines and stored by the DMA module in its address register • The number of words to be read or written, again communicated via the data lines and stored in the data count register The processor then continues with other work. It has delegated this I/O opera- tion to the DMA module. The DMA module transfers the entire block of data, one word at a time, directly to or from memory, without going through the processor. When the transfer is complete, the DMA module sends an interrupt signal to the processor. Thus, the processor is involved only at the beginning and end of the transfer (Figure 7.4c). Figure 7.12 shows where in the instruction cycle the processor may be sus- pended. In each case, the processor is suspended just before it needs to use the bus. The DMA module then transfers one word and returns control to the processor. Note that this is not an interrupt; the processor does not save a context and do something else. Rather, the processor pauses for one bus cycle. The overall effect is to cause the processor to execute more slowly. Nevertheless, for a multiple-word I/O transfer, DMA is far more efficient than interrupt-driven or programmed I/O. The DMA mechanism can be configured in a variety of ways. Some possibili- ties are shown in Figure 7.13. In the first example, all modules share the same system bus. The DMA module, acting as a surrogate processor, uses programmed I/O to exchange data between memory and an I/O module through the DMA module. This configuration, while it may be inexpensive, is clearly inefficient. As with processor- controlled programmed I/O, each transfer of a word consumes two bus cycles. The number of required bus cycles can be cut substantially by integrating the DMA and I/O functions. As Figure 7.13b indicates, this means that there is a path 242 CHAPTER 7 / INPUT/OUTPUT

Figure 7.12 DMA and Interrupt Breakpoints during an Instruction Cycle (within the sequence of processor cycles, fetch instruction, decode instruction, fetch operand, execute instruction, store result, process interrupt, DMA breakpoints occur just before each bus use, while the interrupt breakpoint occurs only at the end of the instruction cycle)

Figure 7.13 Alternative DMA Configurations: (a) single-bus, detached DMA; (b) single-bus, integrated DMA-I/O; (c) I/O bus

between the DMA module and one or more I/O modules that does not include the system bus. The DMA logic may actually be a part of an I/O module, or it may be a separate module that controls one or more I/O modules. This concept can be taken one step further by connecting I/O modules to the DMA module using an I/O bus (Figure 7.13c). This reduces the number of I/O interfaces in the DMA module to one and provides for an easily expandable configuration. In both of these cases (Figures 7.13b and c), the system bus that the DMA module shares with the processor and memory is used by the DMA module only to exchange data with memory. The exchange of data between the DMA and I/O modules takes place off the system bus. Intel 8237A DMA Controller The Intel 8237A DMA controller interfaces to the 80 x 86 family of processors and to DRAM memory to provide a DMA capability. Figure 7.14 indicates the location of the DMA module. When the DMA module needs to use the system buses (data, address, and control) to transfer data, it sends a signal called HOLD to the proces- sor. The processor responds with the HLDA (hold acknowledge) signal, indicating that the DMA module can use the buses. For example, if the DMA module is to transfer a block of data from memory to disk, it will do the following: 1. The peripheral device (such as the disk controller) will request the service of DMA by pulling DREQ (DMA request) high. 2. The DMA will put a high on its HRQ (hold request), signaling the CPU through its HOLD pin that it needs to use the buses.

Figure 7.14 8237 DMA Usage of System Bus (the 8237 DMA chip sits alongside the CPU, main memory, and the disk controller on the data, address, and control buses, exchanging DREQ/DACK with the disk controller and HRQ/HLDA with the CPU; DACK = DMA acknowledge, DREQ = DMA request, HLDA = HOLD acknowledge, HRQ = HOLD request)

3. The CPU will finish the present bus cycle (not necessarily the present instruction) and respond to the DMA request by putting its HLDA (hold acknowledge) line high, thus telling the 8237 DMA that it can go ahead and use the buses to perform its task. HOLD must remain active high as long as DMA is performing its task.
4. DMA will activate DACK (DMA acknowledge), which tells the peripheral device that it will start to transfer the data.
5. DMA starts to transfer the data from memory to peripheral by putting the address of the first byte of the block on the address bus and activating MEMR, thereby reading the byte from memory into the data bus; it then activates IOW to write it to the peripheral. Then DMA decrements the counter and increments the address pointer and repeats this process until the count reaches zero and the task is finished.
6. After the DMA has finished its job it will deactivate HRQ, signaling the CPU that it can regain control over its buses.

While the DMA is using the buses to transfer data, the processor is idle. Similarly, when the processor is using the bus, the DMA is idle. The 8237 DMA is known as a fly-by DMA controller. This means that the data being moved from one location to another does not pass through the DMA chip and is not stored in the DMA chip. Therefore, the DMA can only transfer data between an I/O port and a memory address, but not between two I/O ports or two memory locations. However, as explained subsequently, the DMA chip can perform a memory-to- memory transfer via a register. The 8237 contains four DMA channels that can be programmed independ- ently, and any one of the channels may be active at any moment. These channels are numbered 0, 1, 2, and 3. The 8237 has a set of five control/command registers to program and control DMA operation over one of its channels (Table 7.2):

• Command: The processor loads this register to control the operation of the DMA. D0 enables a memory-to-memory transfer, in which channel 0 is used to transfer a byte into an 8237 temporary register and channel 1 is used to transfer the byte from the register to memory. When memory-to-memory is enabled, D1 can be used to disable increment/decrement on channel 0 so that a fixed value can be written into a block of memory. D2 enables or disables DMA.
• Status: The processor reads this register to determine DMA status. Bits D0-D3 are used to indicate if channels 0-3 have reached their TC (terminal count). Bits D4-D7 are used by the processor to determine if any channel has a DMA request pending.
• Mode: The processor sets this register to determine the mode of operation of the DMA. Bits D0 and D1 are used to select a channel. The other bits select various operation modes for the selected channel. Bits D2 and D3 determine if the transfer is from an I/O device to memory (write) or from memory to I/O (read), or a verify operation. If D4 is set, then the memory address register and the count register are reloaded with their original values at the end of a DMA data transfer. Bits D6 and D7 determine the way in which the 8237 is used. In single mode, a single byte of data is transferred. Block and demand modes are used for a block transfer, with the demand mode allowing for premature ending of the transfer. Cascade mode allows multiple 8237s to be cascaded to expand the number of channels to more than 4.
• Single Mask: The processor sets this register. Bits D0 and D1 select the channel. Bit D2 clears or sets the mask bit for that channel. It is through this register that the DREQ input of a specific channel can be masked (disabled) or unmasked (enabled). While the command register can be used to disable the whole DMA chip, the single mask register allows the programmer to disable or enable a specific channel.
• All Mask: This register is similar to the single mask register except that all four channels can be masked or unmasked with one write operation.

Table 7.2 Intel 8237A Registers (E/D = enable/disable; TC = terminal count)

Command register:      D0 memory-to-memory E/D; D1 channel 0 address hold E/D; D2 controller E/D; D3 normal/compressed timing; D4 fixed/rotating priority; D5 late/extended write selection; D6 DREQ sense active high/low; D7 DACK sense active high/low
Status register:       D0-D3 channel 0-3 has reached TC; D4-D7 channel 0-3 request
Mode register:         D0-D1 channel select; D2-D3 verify/write/read transfer; D4 auto-initialization E/D; D5 address increment/decrement select; D6-D7 demand/single/block/cascade mode select
Single Mask register:  D0-D1 select channel mask bit; D2 clear/set mask bit; D3-D7 not used
All Mask register:     D0-D3 clear/set channel 0-3 mask bit; D4-D7 not used

In addition, the 8237A has eight data registers: one memory address register and one count register for each channel. The processor sets these registers to indicate the location and size of the block of main memory to be affected by the transfers.
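Pulling these register descriptions together, the sketch below programs channel 1 of an 8237A for a single-mode, memory-to-peripheral (read) transfer. The I/O port numbers are the conventional PC/AT assignments for the first 8237A and are stated here as assumptions, as is the outb primitive; the external page register a PC needs for addresses above 64 KB is omitted.

#include <stdint.h>

extern void outb(uint16_t port, uint8_t value);   /* platform-supplied port output (assumed) */

/* Conventional PC/AT port numbers for the first 8237A (assumed for this sketch). */
#define DMA_ADDR_CH1    0x02u   /* channel 1 memory address register */
#define DMA_COUNT_CH1   0x03u   /* channel 1 count register */
#define DMA_SINGLE_MASK 0x0Au   /* single mask register */
#define DMA_MODE        0x0Bu   /* mode register */
#define DMA_CLEAR_FF    0x0Cu   /* clear byte-pointer flip-flop */

/* Program channel 1 for a single-mode, memory-to-I/O ("read") transfer. */
void dma_setup_ch1(uint16_t mem_addr, uint16_t nbytes)
{
    outb(DMA_SINGLE_MASK, 0x05u);            /* mask (disable) channel 1 while programming it */
    outb(DMA_MODE, 0x49u);                   /* single mode (D6-D7), read transfer (D2-D3), channel 1 (D0-D1) */

    outb(DMA_CLEAR_FF, 0x00u);               /* reset flip-flop so the low byte is written first */
    outb(DMA_ADDR_CH1, (uint8_t)(mem_addr & 0xFFu));         /* starting address, low byte */
    outb(DMA_ADDR_CH1, (uint8_t)(mem_addr >> 8));            /* starting address, high byte */

    outb(DMA_CLEAR_FF, 0x00u);
    outb(DMA_COUNT_CH1, (uint8_t)((nbytes - 1u) & 0xFFu));   /* count register holds length - 1 */
    outb(DMA_COUNT_CH1, (uint8_t)((nbytes - 1u) >> 8));

    outb(DMA_SINGLE_MASK, 0x01u);            /* unmask channel 1: a DREQ can now start the transfer */
}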

7.6 I/O CHANNELS AND PROCESSORS

The Evolution of the I/O Function As computer systems have evolved, there has been a pattern of increasing complex- ity and sophistication of individual components. Nowhere is this more evident than in the I/O function. We have already seen part of that evolution. The evolutionary steps can be summarized as follows:

1. The CPU directly controls a peripheral device. This is seen in simple micro- processor-controlled devices. 2. A controller or I/O module is added. The CPU uses programmed I/O without interrupts. With this step, the CPU becomes somewhat divorced from the spe- cific details of external device interfaces. 3. The same configuration as in step 2 is used, but now interrupts are employed. The CPU need not spend time waiting for an I/O operation to be performed, thus increasing efficiency. 4. The I/O module is given direct access to memory via DMA. It can now move a block of data to or from memory without involving the CPU, except at the beginning and end of the transfer. 5. The I/O module is enhanced to become a processor in its own right, with a specialized instruction set tailored for I/O. The CPU directs the I/O processor 7.6 / I/O CHANNELS AND PROCESSORS 247

to execute an I/O program in memory. The I/O processor fetches and executes these instructions without CPU intervention. This allows the CPU to specify a sequence of I/O activities and to be interrupted only when the entire sequence has been performed. 6. The I/O module has a local memory of its own and is, in fact, a computer in its own right. With this architecture, a large set of I/O devices can be controlled, with minimal CPU involvement. A common use for such an architecture has been to control communication with interactive terminals. The I/O processor takes care of most of the tasks involved in controlling the terminals. As one proceeds along this evolutionary path, more and more of the I/O function is performed without CPU involvement. The CPU is increasingly relieved of I/O-related tasks, improving performance. With the last two steps (5–6), a major change occurs with the introduction of the concept of an I/O mod- ule capable of executing a program. For step 5, the I/O module is often referred to as an I/O channel. For step 6, the term I/O processor is often used. However, both terms are on occasion applied to both situations. In what follows, we will use the term I/O channel. Characteristics of I/O Channels The I/O channel represents an extension of the DMA concept. An I/O channel has the ability to execute I/O instructions, which gives it complete con- trol over I/O operations. In a computer system with such devices, the CPU does not execute I/O instructions. Such instructions are stored in main memory to be executed by a special-purpose processor in the I/O channel itself. Thus, the CPU initiates an I/O transfer by instructing the I/O channel to execute a pro- gram in memory. The program will specify the device or devices, the area or areas of memory for storage, priority, and actions to be taken for certain error conditions. The I/O channel follows these instructions and controls the data transfer. Two types of I/O channels are common, as illustrated in Figure 7.15. A selector channel controls multiple high-speed devices and, at any one time, is dedicated to the transfer of data with one of those devices. Thus, the I/O chan- nel selects one device and effects the data transfer. Each device, or a small set of devices, is handled by a controller, or I/O module, that is much like the I/O mod- ules we have been discussing. Thus, the I/O channel serves in place of the CPU in controlling these I/O controllers. A multiplexor channel can handle I/O with mul- tiple devices at the same time. For low-speed devices, a byte multiplexor accepts or transmits characters as fast as possible to multiple devices. For example, the result- ant character stream from three devices with different rates and individual streams A1A2A3A4 …, B1B2B3B4 …, and C1C2C3C4 … might be A1B1C1A2C2A3B2C3A4, and so on. For high-speed devices, a block multiplexor interleaves blocks of data from several devices. 248 CHAPTER 7 / INPUT/OUTPUT
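The byte multiplexor's interleaving can be illustrated with a small sketch. The per-device has_byte/next_byte callbacks are hypothetical; the point is only the round-robin visiting that produces a merged stream such as A1B1C1A2C2A3B2C3A4 when the devices run at different rates.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one low-speed device attached to a byte multiplexor. */
struct mux_device {
    int     (*has_byte)(void *ctx);    /* nonzero if the device has a byte ready */
    uint8_t (*next_byte)(void *ctx);   /* fetch the next byte from the device */
    void    *ctx;
};

/* Merge characters from several devices into one output stream by visiting
   the devices round-robin and taking a byte from each one that is ready. */
size_t byte_multiplex(struct mux_device *dev, size_t ndev,
                      uint8_t *out, size_t max_out)
{
    size_t produced = 0;
    while (produced < max_out) {
        int any_ready = 0;
        for (size_t i = 0; i < ndev && produced < max_out; i++) {
            if (dev[i].has_byte(dev[i].ctx)) {
                out[produced++] = dev[i].next_byte(dev[i].ctx);
                any_ready = 1;
            }
        }
        if (!any_ready)
            break;                     /* no device had data on this pass; stop, for the sketch */
    }
    return produced;
}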

Figure 7.15 I/O Channel Architecture: (a) selector channel, with a data and address channel to main memory, a control signal path to the CPU, and several I/O controllers served one at a time; (b) multiplexor channel, similarly connected but handling multiple I/O controllers concurrently

7.7 THE EXTERNAL INTERFACE: THUNDERBOLT AND INFINIBAND

Types of Interfaces The interface to a peripheral from an I/O module must be tailored to the nature and operation of the peripheral. One major characteristic of the interface is whether it is serial or parallel (Figure 7.16). In a parallel interface, there are multiple lines connecting the I/O module and the peripheral, and multiple bits are transferred simultaneously, just as all of the bits of a word are transferred simultaneously over the data bus. In a serial interface, there is only one line used to transmit data, and bits must be transmitted one at a time. A parallel interface has traditionally been used for higher-speed peripherals, such as tape and disk, while the serial interface has traditionally been used for printers and terminals. With a new generation of high-speed serial interfaces, parallel interfaces are becoming much less common.

Figure 7.16 Parallel and Serial I/O (in both cases the I/O module contains a buffer between the system bus and the connection to the peripheral: (a) parallel I/O uses multiple lines to the peripheral; (b) serial I/O uses a single line)

In either case, the I/O module must engage in a dialogue with the peripheral. In general terms, the dialogue for a write operation is as follows:
1. The I/O module sends a control signal requesting permission to send data.
2. The peripheral acknowledges the request.
3. The I/O module transfers data (one word or a block depending on the peripheral).
4. The peripheral acknowledges receipt of the data.
A read operation proceeds similarly. Key to the operation of an I/O module is an internal buffer that can store data being passed between the peripheral and the rest of the system. This buffer allows the I/O module to compensate for the differences in speed between the system bus and its external lines. Point-to-Point and Multipoint Configurations The connection between an I/O module in a computer system and external devices can be either point-to-point or multipoint. A point-to-point interface provides a dedicated line between the I/O module and the external device. On small systems (PCs, workstations), typical point-to-point links include those to the keyboard, printer, and external modem. A typical example of such an interface is the EIA-232 specification (see [STAL11] for a description). Of increasing importance are multipoint external interfaces, used to support external mass storage devices (disk and tape drives) and multimedia devices

(CD-ROMs, video, audio). These multipoint interfaces are in effect external buses, and they exhibit the same type of logic as the buses discussed in Chapter 3. In this section, we look at two key examples: Thunderbolt and InfiniBand.

Thunderbolt The most recent, and fastest, peripheral connection technology to become available for general-purpose use is Thunderbolt, developed by Intel with collaboration from Apple. One Thunderbolt cable can manage the work previously required of multiple cables. The technology combines data, video, audio, and power into a single high-speed connection for peripherals such as hard drives, RAID (Redundant Array of Independent Disks) arrays, video-capture boxes, and network interfaces. It provides up to 10 Gbps throughput in each direction and up to 10 Watts of power to connected peripherals. Although the technology and its associated specifications have stabilized, the introduction of Thunderbolt-equipped devices into the marketplace has, as of this writing, only slowly begun to develop. This is because a Thunderbolt-compatible peripheral interface is considerably more complex than that of a simple USB device. The first generation of Thunderbolt products are primarily aimed at the prosumer (professional-consumer) market, such as audiovisual editors who want to be able to move large volumes of data quickly between storage devices and laptops. As the technology becomes cheaper, Thunderbolt will find mass consumer uses, such as enabling very high-speed data backups and editing high-definition photos. Thunderbolt is already a standard feature of Apple's MacBook Pro laptop and iMac desktop computers.

THUNDERBOLT CONFIGURATION Figure 7.17 shows a typical computer configuration that makes use of Thunderbolt. From the point of view of I/O, the central element in this configuration is the Thunderbolt controller, which is a high-performance, cross-bar switch. Unlike bus-based I/O architectures, each Thunderbolt port on a computer is capable of providing the full data transfer rate of the link in both directions with no sharing of data transmission capacity between ports or between upstream and downstream directions. For communication internal to the computer, the Thunderbolt controller includes one or more DisplayPort protocol adapter ports. DisplayPort is a digital dis- play interface standard now widely adopted for computer monitors, laptop displays, and other graphics and video interfaces. The controller also includes a PCI Express switch with up to four PCI Express protocol adapter ports for internal communication. The Thunderbolt controller provides access to external devices through one or more Thunderbolt connectors. Each connector can provide one or two full-duplex channels, with each channel providing up to 10 Gbps in each direction. The same connector can be used for electrical or optical cables. The electrical cable can extend up to 3 meters, while the optical cable can extend into the tens of meters. Users can connect high-performance peripherals to their PC over a cable, daisy chaining one after another, up to a total of 7 devices, 1 or 2 of which can be high-resolution DisplayPort displays (depending on the controller configuration in the host PC). Because Thunderbolt technology delivers two full-bandwidth chan- nels, the user can realize high bandwidth not only on the first device attached but on downstream devices as well. 7.7 / THE EXTERNAL INTERFACE: THUNDERBOLT AND INFINIBAND 251

Figure 7.17 Example Computer Configuration with Thunderbolt (within the computer, the processor, memory subsystem, and graphics connect to the platform controller hub (PCH), which feeds the Thunderbolt controller over a PCIe x4 link and a DisplayPort link; the Thunderbolt controller drives a Thunderbolt connector, up to 20 Gbps maximum, supporting a daisy chain of external devices)

THUNDERBOLT PROTOCOL ARCHITECTURE Figure 7.18 illustrates the Thunderbolt protocol architecture. The cable and connector layer provides transmission medium access. This layer specifies the physical and electrical attributes of the connector port. The Thunderbolt protocol physical layer is responsible for link maintenance including hot-plug3 detection and data encoding to provide highly efficient data transfer. The physical layer has been designed to introduce very minimal overhead and provides full-duplex 10 Gbps of usable capacity to the upper layers. The common transport layer is the key to the operation of Thunderbolt and what makes it attractive as a high-speed peripheral I/O technology. Some of the features include: • A high-performance, low-power, switching architecture. • A highly efficient, low-overhead packet format with flexible quality of service (QoS) support that allows multiplexing of bursty PCI Express transactions

3The term hot plug is defined as pulling out a component from a system and plugging in a new one while the main power is still on. It allows an external drive, network adapter, or other peripheral to be plugged in without having to power down the computer.

Figure 7.18 Thunderbolt Protocol Layers (from bottom to top: cable and connector layer; electrical/optical physical layer; common transport layer; application-specific protocol stacks such as the PCIe, DisplayPort, and other I/O protocols)

with DisplayPort communication on the same link. The transport layer has the ability to flexibly allocate link bandwidth using priority and bandwidth reser- vation mechanisms. • The use of small packet sizes to achieve low latency. • The use of credit-based flow control to achieve small buffer sizes. • A symmetric architecture that supports flexible topologies (star, tree, daisy chaining, etc.) and enables peer-to-peer communication (via software) between devices. • A novel time synchronization protocol that allows all the Thunderbolt prod- ucts connected in a domain to synchronize their time within 8ns of each other. The application layer contains I/O protocols that are mapped onto the trans- port layer. Initially, Thunderbolt provides full support for PCIe and DisplayPort protocols. This function is provided by a protocol adapter, which is responsible for efficient encapsulation of the mapped protocol information into transport layer packets. Mapped protocol packets between a source device and a destination device may be routed over a path that may cross multiple Thunderbolt controllers. At the destination device, a protocol adapter re-creates the mapped protocol in a way that is indistinguishable from what was received by the source device. The advantage of doing protocol mapping in this way is that Thunderbolt technology–enabled prod- uct devices appear as PCIe or DisplayPort devices to the operating system of the host computer, thereby enabling the use of standard drivers that are available in many operating systems today. 7.7 / THE EXTERNAL INTERFACE: THUNDERBOLT AND INFINIBAND 253

InfiniBand InfiniBand is a recent I/O specification aimed at the high-end server market.4 The first version of the specification was released in early 2001 and has attracted numer- ous vendors. The standard describes an architecture and specifications for data flow among processors and intelligent I/O devices. InfiniBand has become a popular interface for storage area networking and other large storage configurations. In essence, InfiniBand enables servers, remote storage, and other network devices to be attached in a central fabric of switches and links. The switch-based architecture can connect up to 64,000 servers, storage systems, and networking devices.

INFINIBAND ARCHITECTURE Although PCI is a reliable interconnect method and continues to provide increased speeds, up to 4 Gbps, it is a limited architecture compared to InfiniBand. With InfiniBand, it is not necessary to have the basic I/O interface hardware inside the server chassis. With InfiniBand, remote storage, networking, and connections between servers are accomplished by attaching all devices to a central fabric of switches and links. Removing I/O from the server chassis allows greater server density and allows for a more flexible and scalable data center, as independent nodes may be added as needed. Unlike PCI, which measures distances from a CPU in centimeters, InfiniBand's channel design enables I/O devices to be placed up to 17 meters away from the server using copper, up to 300 m using multimode optical fiber, and up to 10 km with single-mode optical fiber. Transmission rates as high as 30 Gbps can be achieved. Figure 7.19 illustrates the InfiniBand architecture. The key elements are as follows:
• Host channel adapter (HCA): Instead of a number of PCI slots, a typical server needs a single interface to an HCA that links the server to an InfiniBand switch. The HCA attaches to the server at a memory controller, which has access to the system bus and controls traffic between the processor and memory and between the HCA and memory. The HCA uses direct-memory access (DMA) to read and write memory.
• Target channel adapter (TCA): A TCA is used to connect storage systems, routers, and other peripheral devices to an InfiniBand switch.
• InfiniBand switch: A switch provides point-to-point physical connections to a variety of devices and switches traffic from one link to another. Servers and devices communicate through their adapters, via the switch. The switch's intelligence manages the linkage without interrupting the servers' operation.
• Links: The link between a switch and a channel adapter, or between two switches.
• Subnet: A subnet consists of one or more interconnected switches plus the links that connect other devices to those switches. Figure 7.19 shows a subnet with

4InfiniBand is the result of the merger of two competing projects: Future I/O (backed by Cisco, HP, Compaq, and IBM) and Next Generation I/O (developed by Intel and backed by a number of other companies).

Figure 7.19 InfiniBand Switch Fabric (a host server's CPU, memory controller, and system memory connect through an HCA and an IB link to an InfiniBand switch; target devices attach to the switch through TCAs, and routers link the subnet to other subnets; IB = InfiniBand, HCA = host channel adapter, TCA = target channel adapter)

a single switch, but more complex subnets are required when a large number of devices are to be interconnected. Subnets allow administrators to confine broadcast and multicast transmissions within the subnet.
• Router: Connects InfiniBand subnets, or connects an InfiniBand switch to a network, such as a local area network, wide area network, or storage area network.
The channel adapters are intelligent devices that handle all I/O functions without the need to interrupt the server's processor. For example, there is a control protocol by which a switch discovers all TCAs and HCAs in the fabric and assigns logical addresses to each. This is done without processor involvement. The InfiniBand switch temporarily opens up channels between the processor and devices with which it is communicating. The devices do not have to share a channel's capacity, as is the case with a bus-based design such as PCI, which requires that devices arbitrate for access to the processor. Additional devices are added to the configuration by hooking up each device's TCA to the switch.

INFINIBAND OPERATION Each physical link between a switch and an attached interface (HCA or TCA) can support up to 16 logical channels, called virtual lanes. One lane is reserved for fabric management and the other lanes for data transport. Data are sent in the form of a stream of packets, with each packet containing some portion of the total data to be transferred, plus addressing and control information. Thus, a set of communications protocols are used to manage the transfer of data. A virtual lane is temporarily dedicated to the transfer of data from one end node to another over the InfiniBand fabric. The InfiniBand switch maps traffic from an incoming lane to an outgoing lane to route the data between the desired end points.

Figure 7.20 InfiniBand Stack (client and server processes exchange transactions as IB operations through queue pairs in the host and target channel adapters; the transport engines exchange IB packets across the transport, network, link (packet relay), and physical layers of the fabric; WQE = work queue element, CQE = completion queue entry, QP = queue pair, IB = InfiniBand)

Figure 7.20 indicates the logical structure used to support exchanges over InfiniBand. To account for the fact that some devices can send data faster than another destination device can receive it, a pair of queues at both ends of each link temporarily buffers excess outbound and inbound data. The queues can be located in the channel adapter or in the attached device’s memory. A separate pair of queues is used for each virtual lane. The host uses these queues in the following fashion. The host places a transaction, called a work queue entry (WQE) into either the send or receive queue of the queue pair. The two most important WQEs are SEND and RECEIVE. For a SEND operation, the WQE specifies a block of data in the device’s memory space for the hardware to send to the destination. A RECEIVE WQE specifies where the hardware is to place data received from another device when that consumer executes a SEND operation. The channel adapter processes each posted WQE in the proper prioritized order and generates a completion queue entry (CQE) to indicate the completion status. Figure 7.20 also indicates that a layered protocol architecture is used, consist- ing of four layers: • Physical: The physical-layer specification defines three link speeds (1X, 4X, and 12X) giving transmission rates of 2.5, 10, and 30 Gbps, respectively (Table 7.3). The physical layer also defines the physical media, including cop- per and optical fiber. • Link: This layer defines the basic packet structure used to exchange data, including an addressing scheme that assigns a unique link address to every device in a subnet. This level includes the logic for setting up virtual lanes and for switching data through switches from source to destination within a subnet. The packet structure includes an error-detection code to provide reliability. 256 CHAPTER 7 / INPUT/OUTPUT

Table 7.3 InfiniBand Links and Data Throughput Rates

Link      Signal rate (unidirectional)   Usable capacity (80% of signal rate)   Effective data throughput (send + receive)
1-wide    2.5 Gbps                       2 Gbps (250 MBps)                      (250 + 250) MBps
4-wide    10 Gbps                        8 Gbps (1 GBps)                        (1 + 1) GBps
12-wide   30 Gbps                        24 Gbps (3 GBps)                       (3 + 3) GBps

• Network: The network layer routes packets between different InfiniBand subnets. • Transport: The transport layer provides a reliability mechanism for the end-to-end transfer of packets across one or more subnets.
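To make the queue-pair mechanism described before Table 7.3 concrete, the following Python sketch models a host and a target exchanging data through send, receive, and completion queues. It is only an illustration written for this discussion; the names QueuePair, post_send, post_receive, poll_completion, and channel_adapter_transfer are invented here and are not the InfiniBand verbs API.

from collections import deque

class QueuePair:
    """A send queue and a receive queue, plus a completion queue."""
    def __init__(self):
        self.send_queue = deque()        # posted SEND work queue entries (WQEs)
        self.receive_queue = deque()     # posted RECEIVE WQEs (buffers to fill)
        self.completion_queue = deque()  # completion queue entries (CQEs)

    def post_send(self, data):
        # A SEND WQE names a block of data for the hardware to transmit.
        self.send_queue.append(("SEND", data))

    def post_receive(self, buffer_id):
        # A RECEIVE WQE names where incoming data should be placed.
        self.receive_queue.append(("RECEIVE", buffer_id))

    def poll_completion(self):
        # The consumer polls the completion queue for finished work.
        return self.completion_queue.popleft() if self.completion_queue else None

def channel_adapter_transfer(sender: QueuePair, receiver: QueuePair):
    """Emulate the channel adapters processing one posted SEND/RECEIVE pair."""
    if sender.send_queue and receiver.receive_queue:
        _, data = sender.send_queue.popleft()
        _, buffer_id = receiver.receive_queue.popleft()
        # Data lands in the receiver's named buffer; both sides get a CQE.
        sender.completion_queue.append(("CQE", "send complete"))
        receiver.completion_queue.append(("CQE", f"data placed in {buffer_id}: {data!r}"))

host, target = QueuePair(), QueuePair()
target.post_receive("buf0")           # consumer pre-posts a receive buffer
host.post_send(b"block of data")      # host posts a SEND WQE
channel_adapter_transfer(host, target)
print(host.poll_completion())         # ('CQE', 'send complete')
print(target.poll_completion())       # ('CQE', "data placed in buf0: b'block of data'")

Running the sketch pairs one SEND WQE with a pre-posted RECEIVE WQE and delivers a CQE to each side, mirroring the flow suggested by Figure 7.20.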

7.8 IBM zENTERPRISE 196 I/O STRUCTURE

The zEnterprise 196 is IBM's latest mainframe computer offering (at the time of this writing), introduced in 2010. The system is based on the use of the z196 chip, which is a 5.2-GHz multicore chip with four cores. The z196 architecture can have a maximum of 24 processor chips for a total of 96 cores. In this section, we look at the I/O structure of the zEnterprise 196. Channel Structure The zEnterprise 196 has a dedicated I/O subsystem that manages all I/O operations, completely off-loading this processing and memory burden from the main processors. Figure 7.21 shows the logical structure of the I/O subsystem. Of the 96 core processors, up to 4 can be dedicated for I/O use, creating 4 channel subsystems (CSS). Each CSS is made up of the following elements: • System assist processor (SAP): The SAP is a core processor configured for I/O operation. Its role is to offload I/O operations and manage channels and the I/O operations queues. It relieves the other processors of all I/O tasks, allowing them to be dedicated to application logic. • Hardware system area (HSA): The HSA is a reserved part of the system memory containing the I/O configuration. It is used by SAPs. A fixed amount of 16 GB is reserved, which is not part of the customer-purchased memory. This provides for greater configuration flexibility and higher availability by eliminating planned and unplanned outages. • Logical partitions: A logical partition is a form of virtual machine, which is, in essence, a logical processor defined at the operating system level.5 Each CSS supports up to 16 logical partitions.

5A virtual machine is an instance of an operating system along with one or more applications running in an isolated memory partition within the computer. It enables different operating systems to run in the same computer at the same time and prevents applications from interfering with each other. See [STAL12] for a discussion of virtual machines.

Figure 7.21 IBM z196 I/O Channel Subsystem Structure (4 channel subsystems per system; up to 15 partitions per channel subsystem, with 60 partitions per system; each partition has its own subchannels; 256 channels per channel subsystem)

• Subchannels: A subchannel appears to a program as a logical device and contains the information required to perform an I/O operation. One subchannel exists for each I/O device addressable by the CSS. A subchannel is used by the channel subsystem code running on a partition to pass an I/O request to the channel subsystem. A subchannel is assigned for each device defined to the logical partition. Up to 196k subchannels are supported per CSS. • Channel path: A channel path is a single interface between a channel subsystem and one or more control units, via a channel. Commands and data are sent across a channel path to perform I/O requests. Each CSS can have up to 256 channel paths. • Channel: Channels are small processors that communicate with the I/O control units (CUs). They manage the data transfer between memory and the external devices.

This elaborate structure enables the mainframe to manage a massive number of I/O devices and communication links. All I/O processing is offloaded from the application and server processors, enhancing performance. The channel subsystem processors are somewhat general in configuration, enabling them to manage a wide variety of I/O duties and to keep up with evolving requirements. The channel processors are specifically programmed for the I/O control units to which they interface.
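As a rough illustration of the hierarchy just described, the following Python sketch models a channel subsystem as a container of logical partitions, subchannels, and channel paths. The class names and the limit checks are simplifications invented for this discussion, based only on the counts quoted in the text; they are not an IBM interface.

from dataclasses import dataclass, field
from typing import List

MAX_PARTITIONS_PER_CSS = 16      # per the text: up to 16 logical partitions per CSS
MAX_CHANNEL_PATHS_PER_CSS = 256  # per the text: up to 256 channel paths per CSS

@dataclass
class Subchannel:
    device_id: str               # one subchannel per addressable I/O device

@dataclass
class LogicalPartition:
    name: str
    subchannels: List[Subchannel] = field(default_factory=list)

@dataclass
class ChannelSubsystem:
    sap_core_id: int             # the system assist processor serving this CSS
    partitions: List[LogicalPartition] = field(default_factory=list)
    channel_paths: List[str] = field(default_factory=list)

    def add_partition(self, p: LogicalPartition):
        if len(self.partitions) >= MAX_PARTITIONS_PER_CSS:
            raise ValueError("CSS already has its maximum number of logical partitions")
        self.partitions.append(p)

    def add_channel_path(self, path_id: str):
        if len(self.channel_paths) >= MAX_CHANNEL_PATHS_PER_CSS:
            raise ValueError("CSS already has its maximum number of channel paths")
        self.channel_paths.append(path_id)

# One CSS with a partition that owns subchannels for two (hypothetical) devices.
css0 = ChannelSubsystem(sap_core_id=0)
lp = LogicalPartition("LPAR01", [Subchannel("DASD-0100"), Subchannel("TAPE-0200")])
css0.add_partition(lp)
css0.add_channel_path("CHPID-00")
print(len(css0.partitions), len(css0.channel_paths))   # 1 1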

I/O System Organization To explain the I/O system organization, we first need to briefly describe the physical layout of the zEnterprise 196. Figure 7.22 is a front view of the water-cooled version of the machine (there is an air-cooled version). The system has the following characteristics:
• Weight: 2185 kg (4817 lbs)
• Width: 1.534 m (5 ft)
• Depth: 1.375 m (4.5 ft)
• Height: 2.012 m (6.6 ft)
Not exactly a laptop. The system consists of two large bays, called frames, that house the various components of the zEnterprise 196. The right-hand A frame includes two large cages, plus room for cabling and other components. The upper cage is a processor cage, with four slots to house up to four processor books that are fully interconnected. Each book contains a multichip module (MCM), memory cards, and I/O cage connections. Each MCM is a board that houses six multicore chips and two storage control chips. The lower cage in the A frame is an I/O cage, which contains I/O hardware, including multiplexors and channels. The I/O cage is a fixed unit installed by IBM to the customer specifications at the factory.

Figure 7.22 IBM z196 I/O Frames — Front View (labeled components include internal batteries, power supplies, support elements, processor books with memory and MBA/HCA cards, an I/O cage with InfiniBand I/O interconnects, two water cooling units, I/O drawers, and optional Ethernet cables for the internal system LAN connecting the flexible service processor (FSP) cage controller cards)

The left-hand Z frame contains internal batteries and power supplies and room for one or more support elements, which are used by a system manager for platform management. The Z frame also contains slots for two or more I/O drawers. An I/O drawer contains similar components to an I/O cage. The differences are that the drawer is smaller and can easily be swapped in and out at the customer site to meet changing requirements. With this background, we now show a typical configuration of the zEnterprise 196 I/O system structure (Figure 7.23). The z196 processor book supports two internal (i.e., internal to the A and Z frames) I/O infrastructures: InfiniBand for I/O cages and I/O drawers, and PCI Express (PCIe) for I/O drawers. These channel controllers are referred to as fanouts. The InfiniBand connections from the processor book to the I/O cages and I/O drawers are via a Host Channel Adapter (HCA) fanout, which has InfiniBand links to InfiniBand multiplexors in the I/O cage or drawer. The InfiniBand multiplexors are used to interconnect servers, communications infrastructure equipment, storage, and embedded systems. In addition to interconnecting systems that themselves use InfiniBand, the InfiniBand multiplexor supports other I/O technologies. ESCON (Enterprise Systems Connection) supports connectivity to disks, tapes, and printer devices using a proprietary fiber-based technology. Ethernet connections provide 1-Gbps and 10-Gbps connections to a variety of devices that support this popular local area network technology. One noteworthy use of Ethernet is

Figure 7.23 IBM z196 I/O System Structure (a processor book's PCIe (8X) fanouts connect through PCIe switches to channels in a PCIe I/O drawer, while its HCA2-C (6X) fanouts connect through InfiniBand multiplexors to ports in an I/O cage or I/O drawer domain; attached controllers include Fibre Channel, ESCON, 1-Gbps Ethernet, and 10-Gbps Ethernet)

to construct large server farms, particularly to interconnect blade servers with each other and with other mainframes.6 The PCIe connections from the processor book to the I/O drawers are via a PCIe fanout to PCIe switches. The PCIe switches can connect to a number of I/O device controllers. Typical examples for zEnterprise 196 are 1-Gbps and 10-Gbps Ethernet and Fiber Channel. Each book contains a combination of up to 8 InfiniBand HCA and PCIe fanouts. Each fanout supports up to 32 connections, for a total maximum of 256 connections per processor book, each connection controlled by a channel processor.

7.9 RECOMMENDED READING

A good discussion of Intel I/O modules and architecture, including the 82C59A, 82C55A, and 8237A, can be found in [BREY09] and [MAZI10]. InfiniBand is covered in great detail in [SHAN03] and [FUTR01]. [KAGA01] provides a concise overview.

BREY09 Brey, B. The Intel Microprocessors: 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro Processor, Pentium II, Pentium III, Pentium 4 and Core2 with 64-bit Extensions. Upper Saddle River, NJ: Prentice Hall, 2009.
FUTR01 Futral, W. InfiniBand Architecture: Development and Deployment. Hillsboro, OR: Intel Press, 2001.
KAGA01 Kagan, M. "InfiniBand: Thinking Outside the Box Design." Communications System Design, September 2001. (www.csdmag.com)
MAZI10 Mazidi, M.; Mazidi, J.; and Causey, D. The x86 PC: Assembly Language, Design and Interfacing. Upper Saddle River, NJ: Prentice Hall, 2010.
SHAN03 Shanley, T. InfiniBand Network Architecture. Reading, MA: Addison-Wesley, 2003.

7.10 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

cycle stealing, direct memory access (DMA), InfiniBand, interrupt, interrupt-driven I/O, I/O channel, I/O command, I/O module, I/O processor, isolated I/O, memory-mapped I/O, multiplexor channel, parallel I/O, peripheral device, programmed I/O, selector channel, serial I/O, Thunderbolt

6A blade server is a server architecture that houses multiple server modules (blades) in a single chassis. It is widely used in data centers to save space and improve system management. Either self-standing or rack mounted, the chassis provides the power supply, and each blade has its own CPU, memory, and hard disk.

Review Questions
7.1 List three broad classifications of external, or peripheral, devices.
7.2 What is the International Reference Alphabet?
7.3 What are the major functions of an I/O module?
7.4 List and briefly define three techniques for performing I/O.
7.5 What is the difference between memory-mapped I/O and isolated I/O?
7.6 When a device interrupt occurs, how does the processor determine which device issued the interrupt?
7.7 When a DMA module takes control of a bus, and while it retains control of the bus, what does the processor do?

Problems
7.1 On a typical microprocessor, a distinct I/O address is used to refer to the I/O data registers and a distinct address for the control and status registers in an I/O controller for a given device. Such registers are referred to as ports. In the Intel 8088, two I/O instruction formats are used. In one format, the 8-bit opcode specifies an I/O operation; this is followed by an 8-bit port address. Other I/O opcodes imply that the port address is in the 16-bit DX register. How many ports can the 8088 address in each I/O addressing mode?
7.2 A similar instruction format is used in the Z8000 microprocessor family. In this case, there is a direct port addressing capability, in which a 16-bit port address is part of the instruction, and an indirect port addressing capability, in which the instruction references one of the 16-bit general purpose registers, which contains the port address. How many ports can the Z8000 address in each I/O addressing mode?
7.3 The Z8000 also includes a block I/O transfer capability that, unlike DMA, is under the direct control of the processor. The block transfer instructions specify a port address register (Rp), a count register (Rc), and a destination register (Rd). Rd contains the main memory address at which the first byte read from the input port is to be stored. Rc is any of the 16-bit general purpose registers. How large a data block can be transferred?
7.4 Consider a microprocessor that has a block I/O transfer instruction such as that found on the Z8000. Following its first execution, such an instruction takes five clock cycles to re-execute. However, if we employ a nonblocking I/O instruction, it takes a total of 20 clock cycles for fetching and execution. Calculate the increase in speed with the block I/O instruction when transferring blocks of 128 bytes.
7.5 A system is based on an 8-bit microprocessor and has two I/O devices. The I/O controllers for this system use separate control and status registers. Both devices handle data on a 1-byte-at-a-time basis. The first device has two status lines and three control lines. The second device has three status lines and four control lines.
a. How many 8-bit I/O control module registers do we need for status reading and control of each device?
b. What is the total number of needed control module registers given that the first device is an output-only device?
c. How many distinct addresses are needed to control the two devices?
7.6 For programmed I/O, Figure 7.5 indicates that the processor is stuck in a wait loop doing status checking of an I/O device. To increase efficiency, the I/O software could be written so that the processor periodically checks the status of the device. If the device is not ready, the processor can jump to other tasks. After some timed interval, the processor comes back to check status again.
a. Consider the above scheme for outputting data one character at a time to a printer that operates at 10 characters per second (cps). What will happen if its status is scanned every 200 ms?

b. Next consider a keyboard with a single character buffer. On average, characters are entered at a rate of 10 cps. However, the time interval between two consecutive key depressions can be as short as 60 ms. At what frequency should the keyboard be scanned by the I/O program?
7.7 A microprocessor scans the status of an output I/O device every 20 ms. This is accomplished by means of a timer alerting the processor every 20 ms. The interface of the device includes two ports: one for status and one for data output. How long does it take to scan and service the device given a clocking rate of 8 MHz? Assume for simplicity that all pertinent instruction cycles take 12 clock cycles.
7.8 In Section 7.3, one advantage and one disadvantage of memory-mapped I/O, compared with isolated I/O, were listed. List two more advantages and two more disadvantages.
7.9 A particular system is controlled by an operator through commands entered from a keyboard. The average number of commands entered in an 8-hour interval is 60.
a. Suppose the processor scans the keyboard every 100 ms. How many times will the keyboard be checked in an 8-hour period?
b. By what fraction would the number of processor visits to the keyboard be reduced if interrupt-driven I/O were used?
7.10 Consider a system employing interrupt-driven I/O for a particular device that transfers data at an average of 8 KB/s on a continuous basis.
a. Assume that interrupt processing takes about 100 μs (i.e., the time to jump to the interrupt service routine (ISR), execute it, and return to the main program). Determine what fraction of processor time is consumed by this I/O device if it interrupts for every byte.
b. Now assume that the device has two 16-byte buffers and interrupts the processor when one of the buffers is full. Naturally, interrupt processing takes longer, because the ISR must transfer 16 bytes. While executing the ISR, the processor takes about 8 μs for the transfer of each byte. Determine what fraction of processor time is consumed by this I/O device in this case.
c. Now assume that the processor is equipped with a block transfer I/O instruction such as that found on the Z8000. This permits the associated ISR to transfer each byte of a block in only 2 μs. Determine what fraction of processor time is consumed by this I/O device in this case.
7.11 In virtually all systems that include DMA modules, DMA access to main memory is given higher priority than CPU access to main memory. Why?
7.12 A DMA module is transferring characters to memory using cycle stealing, from a device transmitting at 9600 bps. The processor is fetching instructions at the rate of 1 million instructions per second (1 MIPS). By how much will the processor be slowed down due to the DMA activity?
7.13 Consider a system in which a bus cycle takes 500 ns. Transfer of bus control in either direction, from processor to I/O device or vice versa, takes 250 ns. One of the I/O devices has a data transfer rate of 50 KB/s and employs DMA. Data are transferred 1 byte at a time.
a. Suppose we employ DMA in a burst mode. That is, the DMA interface gains bus mastership prior to the start of a block transfer and maintains control of the bus until the whole block is transferred. For how long would the device tie up the bus when transferring a block of 128 bytes?
b. Repeat the calculation for cycle-stealing mode.
7.14 Examination of the timing diagram of the 8237A indicates that once a block transfer begins, it takes three bus clock cycles per DMA cycle.
During the DMA cycle, the 8237A transfers one byte of information between memory and I/O device.
a. Suppose we clock the 8237A at a rate of 5 MHz. How long does it take to transfer one byte?
b. What would be the maximum attainable data transfer rate?
c. Assume that the memory is not fast enough and we have to insert two wait states per DMA cycle. What will be the actual data transfer rate?

7.15 Assume that in the system of the preceding problem, a memory cycle takes 750 ns. To what value could we reduce the clocking rate of the bus without effect on the attainable data transfer rate?
7.16 A DMA controller serves four receive-only telecommunication links (one per DMA channel) having a speed of 64 Kbps each.
a. Would you operate the controller in burst mode or in cycle-stealing mode?
b. What priority scheme would you employ for service of the DMA channels?
7.17 A 32-bit computer has two selector channels and one multiplexor channel. Each selector channel supports two magnetic disk and two magnetic tape units. The multiplexor channel has two line printers, two card readers, and 10 VDT terminals connected to it. Assume the following transfer rates:
Disk drive: 800 Kbytes/s
Magnetic tape drive: 200 Kbytes/s
Line printer: 6.6 Kbytes/s
Card reader: 1.2 Kbytes/s
VDT: 1 Kbyte/s
Estimate the maximum aggregate I/O transfer rate in this system.
7.18 A computer consists of a processor and an I/O device D connected to main memory M via a shared bus with a data bus width of one word. The processor can execute a maximum of 10^6 instructions per second. An average instruction requires five machine cycles, three of which use the memory bus. A memory read or write operation uses one machine cycle. Suppose that the processor is continuously executing "background" programs that require 95% of its instruction execution rate but not any I/O instructions. Assume that one processor cycle equals one bus cycle. Now suppose the I/O device is to be used to transfer very large blocks of data between M and D.
a. If programmed I/O is used and each one-word I/O transfer requires the processor to execute two instructions, estimate the maximum I/O data-transfer rate, in words per second, possible through D.
b. Estimate the same rate if DMA is used.
7.19 A data source produces 7-bit IRA characters, to each of which is appended a parity bit. Derive an expression for the maximum effective data rate (rate of IRA data bits) over an R-bps line for the following:
a. Asynchronous transmission, with a 1.5-unit stop bit
b. Bit-synchronous transmission, with a frame consisting of 48 control bits and 128 information bits
c. Same as (b), with a 1024-bit information field
d. Character-synchronous, with 9 control characters per frame and 16 information characters
e. Same as (d), with 128 information characters
7.20 The following problem is based on a suggested illustration of I/O mechanisms in [ECKE90] (Figure 7.24): Two women are on either side of a high fence. One of the women, named Apple-server, has a beautiful apple tree loaded with delicious apples growing on her side of the fence; she is happy to supply apples to the other woman whenever needed. The other woman, named Apple-eater, loves to eat apples but has none. In fact, she must eat her apples at a fixed rate (an apple a day keeps the doctor away). If she eats them faster than that rate, she will get sick. If she eats them slower, she will suffer malnutrition. Neither woman can talk, and so the problem is to get apples from Apple-server to Apple-eater at the correct rate.
a. Assume that there is an alarm clock sitting on top of the fence and that the clock can have multiple alarm settings. How can the clock be used to solve the problem? Draw a timing diagram to illustrate the solution.

Figure 7.24 An Apple Problem

b. Now assume that there is no alarm clock. Instead Apple-eater has a flag that she can wave whenever she needs an apple. Suggest a new solution. Would it be helpful for Apple-server also to have a flag? If so, incorporate this into the solution. Discuss the drawbacks of this approach.
c. Now take away the flag and assume the existence of a long piece of string. Suggest a solution that is superior to that of (b) using the string.
7.21 Assume that one 16-bit and two 8-bit microprocessors are to be interfaced to a system bus. The following details are given:
1. All microprocessors have the hardware features necessary for any type of data transfer: programmed I/O, interrupt-driven I/O, and DMA.
2. All microprocessors have a 16-bit address bus.
3. Two memory boards, each of 64-Kbytes capacity, are interfaced with the bus. The designer wishes to use a shared memory that is as large as possible.
4. The system bus supports a maximum of four interrupt lines and one DMA line.
Make any other assumptions necessary, and
a. Give the system bus specifications in terms of number and types of lines.
b. Describe a possible protocol for communicating on the bus (i.e., read-write, interrupt, and DMA sequences).
c. Explain how the aforementioned devices are interfaced to the system bus.

CHAPTER 8 OPERATING SYSTEM SUPPORT

8.1 Operating System Overview
    Operating System Objectives and Functions
    Types of Operating Systems
8.2 Scheduling
    Long-Term Scheduling
    Medium-Term Scheduling
    Short-Term Scheduling
8.3 Memory Management
    Swapping
    Partitioning
    Paging
    Virtual Memory
    Translation Lookaside Buffer
    Segmentation
8.4 Pentium Memory Management
    Address Spaces
    Segmentation
    Paging
8.5 ARM Memory Management
    Memory System Organization
    Virtual Memory Address Translation
    Memory-Management Formats
    Access Control
8.6 Recommended Reading
8.7 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Summarize, at a top level, the key functions of an operating system (OS).
• Discuss the evolution of operating systems from early simple batch systems to modern complex systems.
• Explain the differences among long-, medium-, and short-term scheduling.
• Understand the reason for memory partitioning and explain the various techniques that are used.
• Assess the relative advantages of paging and segmentation.
• Define virtual memory.

Although the focus of this text is computer hardware, there is one area of software that needs to be addressed: the computer's OS. The OS is a program that manages the computer's resources, provides services for programmers, and schedules the execution of other programs. Some understanding of operating systems is essential to appreciate the mechanisms by which the CPU controls the computer system. In particular, the effects of interrupts and the management of the memory hierarchy are best explained in this context. The chapter begins with an overview and brief history of operating systems. The bulk of the chapter looks at the two OS functions that are most relevant to the study of computer organization and architecture: scheduling and memory management.

8.1 OPERATING SYSTEM OVERVIEW

Operating System Objectives and Functions An OS is a program that controls the execution of application programs and acts as an interface between applications and the computer hardware. It can be thought of as having two objectives: • Convenience: An OS makes a computer more convenient to use. • Efficiency: An OS allows the computer system resources to be used in an efficient manner. Let us examine these two aspects of an OS in turn.

THE OPERATING SYSTEM AS A USER/COMPUTER INTERFACE The hardware and software used in providing applications to a user can be viewed in a layered or hierarchical fashion, as depicted in Figure 8.1. The user of those applications, the end user, generally is not concerned with the computer's architecture. Thus the end user views a computer system in terms of an application. That application can be expressed in a programming language and is developed by an application programmer. To develop an application program as a set of processor instructions

Figure 8.1 Computer Hardware and Software Structure (layers, from top to bottom: application programs; libraries/utilities; operating system; execution hardware, memory translation, system interconnect (bus), I/O devices and networking, and main memory; the application programming interface, application binary interface, and instruction set architecture mark the boundaries between the software layers and the hardware)

that is completely responsible for controlling the computer hardware would be an overwhelmingly complex task. To ease this task, a set of systems programs is provided. Some of these programs are referred to as utilities. These implement frequently used functions that assist in program creation, the management of files, and the control of I/O devices. A programmer makes use of these facilities in developing an application, and the application, while it is running, invokes the utilities to perform certain functions. The most important system program is the OS. The OS masks the details of the hardware from the programmer and provides the programmer with a convenient interface for using the system. It acts as mediator, making it easier for the programmer and for application programs to access and use those facilities and services. Briefly, the OS typically provides services in the following areas:

• Program creation: The OS provides a variety of facilities and services, such as editors and debuggers, to assist the programmer in creating programs. Typically, these services are in the form of utility programs that are not actually part of the OS but are accessible through the OS. • Program execution: A number of steps need to be performed to execute a program. Instructions and data must be loaded into main memory, I/O devices and files must be initialized, and other resources must be prepared. The OS handles all of this for the user. • Access to I/O devices: Each I/O device requires its own specific set of instructions or control signals for operation. The OS takes care of the details so that the programmer can think in terms of simple reads and writes. • Controlled access to files: In the case of files, control must include an understanding of not only the nature of the I/O device (disk drive, tape drive) but also the file format on the storage medium. Again, the OS worries about the details. Further, in the case of a system with multiple simultaneous users, the OS can provide protection mechanisms to control access to the files.

• System access: In the case of a shared or public system, the OS controls access to the system as a whole and to specific system resources. The access function must provide protection of resources and data from unauthorized users and must resolve conflicts for resource contention. • Error detection and response: A variety of errors can occur while a computer system is running. These include internal and external hardware errors, such as a memory error, or a device failure or malfunction; and various software errors, such as arithmetic overflow, an attempt to access a forbidden memory location, and inability of the OS to grant the request of an application. In each case, the OS must make the response that clears the error condition with the least impact on running applications. The response may range from ending the program that caused the error, to retrying the operation, to simply reporting the error to the application. • Accounting: A good OS collects usage statistics for various resources and monitors performance parameters such as response time. On any system, this information is useful in anticipating the need for future enhancements and in tuning the system to improve performance. On a multiuser system, the information can be used for billing purposes. Figure 8.1 also indicates three key interfaces in a typical computer system: • Instruction set architecture (ISA): The ISA defines the repertoire of machine language instructions that a computer can follow. This interface is the boundary between hardware and software. Note that both application programs and utilities may access the ISA directly. For these programs, a subset of the instruction repertoire is available (user ISA). The OS has access to additional machine language instructions that deal with managing system resources (system ISA). • Application binary interface (ABI): The ABI defines a standard for binary portability across programs. The ABI defines the system call interface to the operating system and the hardware resources and services available in a system through the user ISA. • Application programming interface (API): The API gives a program access to the hardware resources and services available in a system through the user ISA supplemented with high-level language (HLL) library calls. Any system calls are usually performed through libraries. Using an API enables application software to be ported easily, through recompilation, to other systems that support the same API.
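The difference between the API and the lower-level system-call interface exposed through the ABI can be illustrated with a short example. Here Python stands in for any HLL runtime; this is only a sketch, and the behavior assumes a POSIX-style system.

import os

# High-level API: portable to any system whose runtime and libraries support it.
print("hello via the language API")

# Closer to the system-call interface: write bytes directly to file
# descriptor 1 (standard output), which maps onto the OS's write system call.
os.write(1, b"hello via the write system call\n")

Both calls ultimately reach the OS, but the application's portability comes from programming against the API rather than against the details of a particular system's ABI.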

THE OPERATING SYSTEM AS RESOURCE MANAGER A computer is a set of resources for the movement, storage, and processing of data and for the control of these functions. The OS is responsible for managing these resources. Can we say that the OS controls the movement, storage, and processing of data? From one point of view, the answer is yes: By managing the computer's resources, the OS is in control of the computer's basic functions. But this control is exercised in a curious way. Normally, we think of a control mechanism as something external to that which is controlled, or at least as something that is a distinct and separate part of that which is controlled. (For example, a residential heating system

is controlled by a thermostat, which is completely distinct from the heat-generation and heat-distribution apparatus.) This is not the case with the OS, which as a control mechanism is unusual in two respects: • The OS functions in the same way as ordinary computer software; that is, it is a program executed by the processor. • The OS frequently relinquishes control and must depend on the processor to allow it to regain control. Like other computer programs, the OS provides instructions for the processor. The key difference is in the intent of the program. The OS directs the processor in the use of the other system resources and in the timing of its execution of other programs. But in order for the processor to do any of these things, it must cease executing the OS program and execute other programs. Thus, the OS relinquishes control for the processor to do some "useful" work and then resumes control long enough to prepare the processor to do the next piece of work. The mechanisms involved in all this should become clear as the chapter proceeds. Figure 8.2 suggests the main resources that are managed by the OS. A portion of the OS is in main memory. This includes the kernel, or nucleus, which contains the most frequently used functions in the OS and, at a given time, other portions of the OS currently in use. The remainder of main memory contains user programs and data. The allocation of this resource (main memory) is controlled jointly by the OS and memory-management hardware in the processor, as we shall see. The OS decides when an I/O device can be used by a program in execution, and controls access to and

Figure 8.2 The Operating System as Resource Manager (the computer system comprises one or more processors; main memory holding the OS software, programs, and data; I/O controllers attached to devices such as printers, keyboards, and digital cameras; and storage holding the OS, programs, and data)

use of files. The processor itself is a resource, and the OS must determine how much processor time is to be devoted to the execution of a particular user program. In the case of a multiple-processor system, this decision must span all of the processors. Types of Operating Systems Certain key characteristics serve to differentiate various types of operating systems. The characteristics fall along two independent dimensions. The first dimension specifies whether the system is batch or interactive. In an interactive system, the user/programmer interacts directly with the computer, usually through a keyboard/display terminal, to request the execution of a job or to perform a transaction. Furthermore, the user may, depending on the nature of the application, communicate with the computer during the execution of the job. A batch system is the opposite of interactive. The user's program is batched together with programs from other users and submitted by a computer operator. After the program is completed, results are printed out for the user. Pure batch systems are rare today. However, it will be useful to the description of contemporary operating systems to examine batch systems briefly. An independent dimension specifies whether the system employs multiprogramming or not. With multiprogramming, the attempt is made to keep the processor as busy as possible, by having it work on more than one program at a time. Several programs are loaded into memory, and the processor switches rapidly among them. The alternative is a uniprogramming system that works on only one program at a time.

EARLY SYSTEMS With the earliest computers, from the late 1940s to the mid-1950s, the programmer interacted directly with the computer hardware; there was no OS. These processors were run from a console, consisting of display lights, toggle switches, some form of input device, and a printer. Programs in processor code were loaded via the input device (e.g., a card reader). If an error halted the program, the error condition was indicated by the lights. The programmer could proceed to examine registers and main memory to determine the cause of the error. If the program proceeded to a normal completion, the output appeared on the printer. These early systems presented two main problems: • Scheduling: Most installations used a sign-up sheet to reserve processor time. Typically, a user could sign up for a block of time in multiples of a half hour or so. A user might sign up for an hour and finish in 45 minutes; this would result in wasted computer idle time. On the other hand, the user might run into problems, not finish in the allotted time, and be forced to stop before resolving the problem. • Setup time: A single program, called a job, could involve loading the compiler plus the high-level language program (source program) into memory, saving the compiled program (object program), and then loading and linking together the object program and common functions. Each of these steps could involve mounting or dismounting tapes, or setting up card decks. If an error occurred, the hapless user typically had to go back to the beginning of the setup sequence. Thus a considerable amount of time was spent just in setting up the program to run.

This mode of operation could be termed serial processing, reflecting the fact that users have access to the computer in series. Over time, various system software tools were developed to attempt to make serial processing more efficient. These include libraries of common functions, linkers, loaders, debuggers, and I/O driver routines that were available as common software for all users. SIMPLE BATCH SYSTEMS Early processors were very expensive, and therefore it was important to maximize processor utilization. The wasted time due to scheduling and setup time was unacceptable. To improve utilization, simple batch operating systems were developed. With such a system, also called a monitor, the user no longer has direct access to the processor. Rather, the user submits the job on cards or tape to a computer operator, who batches the jobs together sequentially and places the entire batch on an input device, for use by the monitor. To understand how this scheme works, let us look at it from two points of view: that of the monitor and that of the processor. From the point of view of the monitor, the monitor controls the sequence of events. For this to be so, much of the monitor must always be in main memory and available for execution (Figure 8.3). That portion is referred to as the resident monitor. The rest of the monitor consists of utilities and common functions that are loaded as subroutines to the user program at the beginning of any job that requires them. The monitor reads in jobs one at a time from the input device (typically a card reader or magnetic tape drive). As it is read in, the current job is placed in the user program area, and control is passed to this job. When the job is completed, it returns control to the monitor, which immediately reads in the next job. The results of each job are printed out for delivery to the user.

Figure 8.3 Memory Layout for a Resident Monitor (the monitor region holds interrupt processing, device drivers, job sequencing, and the control language interpreter; a boundary separates it from the user program area)

Now consider this sequence from the point of view of the processor. At a certain point in time, the processor is executing instructions from the portion of main memory containing the monitor. These instructions cause the next job to be read in to another portion of main memory. Once a job has been read in, the processor will encounter in the monitor a branch instruction that instructs the processor to continue execution at the start of the user program. The processor will then execute the instructions in the user's program until it encounters an ending or error condition. Either event causes the processor to fetch its next instruction from the monitor program. Thus the phrase "control is passed to a job" simply means that the processor is now fetching and executing instructions in a user program, and "control is returned to the monitor" means that the processor is now fetching and executing instructions from the monitor program. It should be clear that the monitor handles the scheduling problem. A batch of jobs is queued up, and jobs are executed as rapidly as possible, with no intervening idle time. How about the job setup time? The monitor handles this as well. With each job, instructions are included in a job control language (JCL). This is a special type of programming language used to provide instructions to the monitor. A simple example is that of a user submitting a program written in FORTRAN plus some data to be used by the program. Each FORTRAN instruction and each item of data is on a separate card or a separate record on tape. In addition to FORTRAN and data lines, the job includes job control instructions, which are denoted by the beginning "$". The overall format of the job looks like this:

$JOB
$FTN
    ... FORTRAN instructions ...
$LOAD
$RUN
    ... Data ...
$END

To execute this job, the monitor reads the $FTN line and loads the appropriate compiler from its mass storage (usually tape). The compiler translates the user's program into object code, which is stored in memory or mass storage. If it is stored in memory, the operation is referred to as "compile, load, and go." If it is stored on tape, then the $LOAD instruction is required. This instruction is read by the monitor, which regains control after the compile operation. The monitor invokes the loader, which loads the object program into memory in place of the compiler and transfers control to it. In this manner, a large segment of main memory can be shared among different subsystems, although only one such subsystem could be resident and executing at a time. We see that the monitor, or batch OS, is simply a computer program. It relies on the ability of the processor to fetch instructions from various portions of main

memory in order to seize and relinquish control alternately. Certain other hardware features are also desirable: • Memory protection: While the user program is executing, it must not alter the memory area containing the monitor. If such an attempt is made, the processor hardware should detect an error and transfer control to the monitor. The monitor would then abort the job, print out an error message, and load the next job. • Timer: A timer is used to prevent a single job from monopolizing the system. The timer is set at the beginning of each job. If the timer expires, an interrupt occurs, and control returns to the monitor. • Privileged instructions: Certain instructions are designated privileged and can be executed only by the monitor. If the processor encounters such an instruction while executing a user program, an error interrupt occurs. Among the privileged instructions are I/O instructions, so that the monitor retains control of all I/O devices. This prevents, for example, a user program from accidentally reading job control instructions from the next job. If a user program wishes to perform I/O, it must request that the monitor perform the operation for it. If a privileged instruction is encountered by the processor while it is executing a user program, the processor hardware considers this an error and transfers control to the monitor. • Interrupts: Early computer models did not have this capability. This feature gives the OS more flexibility in relinquishing control to and regaining control from user programs. Processor time alternates between execution of user programs and execution of the monitor. There have been two sacrifices: Some main memory is now given over to the monitor and some processor time is consumed by the monitor. Both of these are forms of overhead. Even with this overhead, the simple batch system improves utilization of the computer.
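The following toy sketch, written only for this discussion (the instruction format, names, and limits are invented), illustrates two of these hardware features: the "hardware" check refuses stores into the monitor's region of memory, and a crude timer forces control back to the monitor when a job exceeds its allotted time.

MONITOR_BOUNDARY = 1000   # addresses below this belong to the resident monitor

class ProtectionError(Exception): pass
class TimerInterrupt(Exception): pass

def run_user_job(instructions, time_limit):
    """Execute simple (op, addr) instructions under hardware-style checks."""
    memory = {}
    for count, (op, addr) in enumerate(instructions, start=1):
        if count > time_limit:
            raise TimerInterrupt("time limit expired; control returns to monitor")
        if op == "STORE" and addr < MONITOR_BOUNDARY:
            raise ProtectionError("attempt to alter monitor memory")
        if op == "STORE":
            memory[addr] = 42    # arbitrary value; the point is the checks above
    return memory

def monitor(jobs):
    """The monitor runs jobs in sequence, aborting any that misbehave."""
    for name, instructions in jobs:
        try:
            run_user_job(instructions, time_limit=3)
            print(f"{name}: completed normally")
        except (ProtectionError, TimerInterrupt) as err:
            print(f"{name}: aborted ({err}); loading next job")

monitor([
    ("JOB1", [("STORE", 2000)]),         # well behaved
    ("JOB2", [("STORE", 500)]),          # violates memory protection
    ("JOB3", [("STORE", 2000)] * 5),     # exceeds its time slice
])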

MULTIPROGRAMMED BATCH SYSTEMS Even with the automatic job sequencing provided by a simple batch OS, the processor is often idle. The problem is that I/O devices are slow compared to the processor. Figure 8.4 details a representative calculation. The calculation concerns a program that processes a file of records and performs, on average, 100 processor instructions per record. In this example the computer spends over 96% of its time waiting for I/O devices to finish transferring data! Figure 8.5a illustrates this situation. The processor spends a certain amount of

Read one record from file    15 ms
Execute 100 instructions      1 ms
Write one record to file     15 ms
TOTAL                        31 ms

Percent CPU utilization = 1/31 = 0.032 = 3.2%

Figure 8.4 System Utilization Example
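The arithmetic of Figure 8.4 is easy to reproduce; the snippet below simply restates the calculation in Python.

# Reproducing the utilization calculation of Figure 8.4 (times in milliseconds).
read_ms, compute_ms, write_ms = 15, 1, 15
total_ms = read_ms + compute_ms + write_ms           # 31 ms per record
cpu_utilization = compute_ms / total_ms              # 1/31
print(f"{cpu_utilization:.3f} = {cpu_utilization:.1%}")   # 0.032 = 3.2%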

Figure 8.5 Multiprogramming Example (run/wait time lines for (a) uniprogramming with program A alone; (b) multiprogramming with programs A and B; (c) multiprogramming with programs A, B, and C; the combined time line shows the processor's wait time shrinking as programs are added)

time executing, until it reaches an I/O instruction. It must then wait until that I/O instruction concludes before proceeding. This inefficiency is not necessary. We know that there must be enough memory to hold the OS (resident monitor) and one user program. Suppose that there is room for the OS and two user programs. Now, when one job needs to wait for I/O, the processor can switch to the other job, which likely is not waiting for I/O (Figure 8.5b). Furthermore, we might expand memory to hold three, four, or more programs and switch among all of them (Figure 8.5c). This technique is known as multiprogramming, or multitasking.1 It is the central theme of modern operating systems.

1The term multitasking is sometimes reserved to mean multiple tasks within the same program that may be handled concurrently by the OS, in contrast to multiprogramming, which would refer to multiple processes from multiple programs. However, it is more common to equate the terms multitasking and multiprogramming, as is done in most standards dictionaries (e.g., IEEE Std 100-1992, The New IEEE Standard Dictionary of Electrical and Electronics Terms).
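A small calculation in the spirit of Figure 8.5 shows why multiprogramming helps. The sketch below uses an invented job mix and a deliberately idealized model in which each job's I/O overlaps fully with the computation of the others; it is meant only to illustrate the trend, not to model a real system.

def utilization(jobs, multiprogrammed):
    """jobs: list of (cpu_ms, io_ms) per job; returns processor utilization."""
    cpu_total = sum(cpu for cpu, _ in jobs)
    if not multiprogrammed:
        # Uniprogramming: jobs run one after another; CPU and I/O times add up.
        elapsed = sum(cpu + io for cpu, io in jobs)
    else:
        # Idealized multiprogramming: while one job waits on its own I/O device,
        # the processor runs the others, so elapsed time is limited by the
        # longest single job (or by the total CPU demand, whichever is larger).
        elapsed = max(cpu_total, max(cpu + io for cpu, io in jobs))
    return cpu_total / elapsed

jobs = [(1, 30), (1, 30)]   # two I/O-bound jobs: 1 ms compute, 30 ms I/O each
print(f"uniprogramming:   {utilization(jobs, False):.1%}")   # 3.2%
print(f"multiprogramming: {utilization(jobs, True):.1%}")    # 6.5%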

Table 8.1 Sample Program Execution Attributes

                       JOB1            JOB2        JOB3
Type of job            Heavy compute   Heavy I/O   Heavy I/O
Duration (min)         5               15          10
Memory required (M)    50              100         80
Need disk?             No              No          Yes
Need terminal?         No              Yes         No
Need printer?          No              No          Yes

Example 8.1 This example illustrates the benefit of multiprogramming. Consider a computer with 250 Mbytes of available memory (not used by the OS), a disk, a terminal, and a printer. Three programs, JOB1, JOB2, and JOB3, are submitted for execution at the same time, with the attributes listed in Table 8.1. We assume minimal processor requirements for JOB2 and JOB3 and continuous disk and printer use by JOB3. For a simple batch environment, these jobs will be executed in sequence. Thus, JOB1 completes in 5 minutes. JOB2 must wait until the 5 minutes is over and then completes 15 minutes after that. JOB3 begins after 20 minutes and completes at 30 minutes from the time it was initially submitted. The average resource utilization, throughput, and response times are shown in the uniprogramming column of Table 8.2. Device-by-device utilization is illustrated in Figure 8.6a. It is evident that there is gross underutilization for all resources when averaged over the required 30-minute time period. Now suppose that the jobs are run concurrently under a multiprogramming OS. Because there is little resource contention between the jobs, all three can run in nearly minimum time while coexisting with the others in the computer (assuming that JOB2 and JOB3 are allotted enough processor time to keep their input and output operations active). JOB1 will still require 5 minutes to complete, but at the end of that time, JOB2 will be one-third finished, and JOB3 will be half finished. All three jobs will have finished within 15 minutes. The improvement is evident when examining the multiprogramming column of Table 8.2, obtained from the histogram shown in Figure 8.6b.

As with a simple batch system, a multiprogramming batch system must rely on certain computer hardware features. The most notable additional feature that is useful for multiprogramming is the hardware that supports I/O interrupts and

Table 8.2 Effects of Multiprogramming on Resource Utilization

                            Uniprogramming   Multiprogramming
Processor use (%)           20               40
Memory use (%)              33               67
Disk use (%)                33               67
Printer use (%)             33               67
Elapsed time (min)          30               15
Throughput rate (jobs/hr)   6                12
Mean response time (min)    18               10

Figure 8.6 Utilization Histograms (CPU, memory, disk, terminal, and printer utilization and the job history over time for (a) uniprogramming, 0 to 30 minutes, and (b) multiprogramming, 0 to 15 minutes)

DMA. With interrupt-driven I/O or DMA, the processor can issue an I/O command for one job and proceed with the execution of another job while the I/O is carried out by the device controller. When the I/O operation is complete, the processor is interrupted and control is passed to an interrupt-handling program in the OS. The OS will then pass control to another job. Multiprogramming operating systems are fairly sophisticated compared to single-program, or uniprogramming, systems. To have several jobs ready to run, the jobs must be kept in main memory, requiring some form of memory management. In addition, if several jobs are ready to run, the processor must decide which one to run, which requires some algorithm for scheduling. These concepts are discussed later in this chapter. TIME-SHARING SYSTEMS With the use of multiprogramming, batch processing can be quite efficient. However, for many jobs, it is desirable to provide a mode in which the user interacts directly with the computer. Indeed, for some jobs, such as transaction processing, an interactive mode is essential. Today, the requirement for an interactive computing facility can be, and often is, met by the use of a dedicated microcomputer. That option was not available in the 1960s, when most computers were big and costly. Instead, time sharing was developed. Just as multiprogramming allows the processor to handle multiple batch jobs at a time, multiprogramming can be used to handle multiple interactive jobs. In this latter case, the technique is referred to as time sharing, because the processor's time is shared among multiple users. In a time-sharing system, multiple users

Table 8.3 Batch Multiprogramming versus Time Sharing

                                            Batch Multiprogramming                                Time Sharing
Principal objective                         Maximize processor use                                Minimize response time
Source of directives to operating system   Job control language commands provided with the job   Commands entered at the terminal

simultaneously access the system through terminals, with the OS interleaving the execution of each user program in a short burst or quantum of computation. Thus, if there are n users actively requesting service at one time, each user will only see on the average 1/n of the effective computer speed, not counting OS overhead. However, given the relatively slow human reaction time, the response time on a properly designed system should be comparable to that on a dedicated computer. Both batch multiprogramming and time sharing use multiprogramming. The key differences are listed in Table 8.3.

8.2 SCHEDULING

The key to multiprogramming is scheduling. In fact, four types of scheduling are typically involved (Table 8.4). We will explore these presently. But first, we introduce the concept of process. This term was first used by the designers of the Multics OS in the 1960s. It is a somewhat more general term than job. Many definitions have been given for the term process, including • A program in execution • The "animated spirit" of a program • That entity to which a processor is assigned This concept should become clearer as we proceed. Long-Term Scheduling The long-term scheduler determines which programs are admitted to the system for processing. Thus, it controls the degree of multiprogramming (number of processes in memory). Once admitted, a job or user program becomes a process and is added to the queue for the short-term scheduler. In some systems, a newly created process begins in a swapped-out condition, in which case it is added to a queue for the medium-term scheduler.

Table 8.4 Types of Scheduling

Long-term scheduling     The decision to add to the pool of processes to be executed
Medium-term scheduling   The decision to add to the number of processes that are partially or fully in main memory
Short-term scheduling    The decision as to which available process will be executed by the processor
I/O scheduling           The decision as to which process's pending I/O request shall be handled by an available I/O device

In a batch system, or for the batch portion of a general-purpose OS, newly submitted jobs are routed to disk and held in a batch queue. The long-term scheduler creates processes from the queue when it can. There are two decisions involved here. First, the scheduler must decide that the OS can take on one or more additional processes. Second, the scheduler must decide which job or jobs to accept and turn into processes. The criteria used may include priority, expected execution time, and I/O requirements. For interactive programs in a time-sharing system, a process request is generated when a user attempts to connect to the system. Time-sharing users are not simply queued up and kept waiting until the system can accept them. Rather, the OS will accept all authorized comers until the system is saturated, using some predefined measure of saturation. At that point, a connection request is met with a message indicating that the system is full and the user should try again later. Medium-Term Scheduling Medium-term scheduling is part of the swapping function, described in Section 8.3. Typically, the swapping-in decision is based on the need to manage the degree of multiprogramming. On a system that does not use virtual memory, memory management is also an issue. Thus, the swapping-in decision will consider the memory requirements of the swapped-out processes. Short-Term Scheduling The long-term scheduler executes relatively infrequently and makes the coarse-grained decision of whether or not to take on a new process, and which one to take. The short-term scheduler, also known as the dispatcher, executes frequently and makes the fine-grained decision of which job to execute next. PROCESS STATES To understand the operation of the short-term scheduler, we need to consider the concept of a process state. During the lifetime of a process, its status will change a number of times. Its status at any point in time is referred to as a state. The term state is used because it connotes that certain information exists that defines the status at that point. At minimum, there are five defined states for a process (Figure 8.7):

Figure 8.7 Five-State Process Model (states: new, ready, running, blocked, and exit; transitions: admit, dispatch, timeout, event wait, event occurs, and release)

• New: A program is admitted by the high-level scheduler but is not yet ready to execute. The OS will initialize the process, moving it to the ready state. • Ready: The process is ready to execute and is awaiting access to the processor. • Running: The process is being executed by the processor. • Waiting: The process is suspended from execution waiting for some system resource, such as I/O. • Halted: The process has terminated and will be destroyed by the OS. For each process in the system, the OS must maintain information indicating the state of the process and other information necessary for process execution. For this purpose, each process is represented in the OS by a process control block (Figure 8.8), which typically contains • Identifier: Each current process has a unique identifier. • State: The current state of the process (new, ready, and so on). • Priority: Relative priority level. • Program counter: The address of the next instruction in the program to be executed. • Memory pointers: The starting and ending locations of the process in memory. • Context data: These are data that are present in registers in the processor while the process is executing, and they will be discussed in Part Three. For now, it is enough to say that these data represent the "context" of the process. The context data plus the program counter are saved when the process leaves the running state. They are retrieved by the processor when it resumes execution of the process.

Figure 8.8 Process Control Block (fields: identifier, state, priority, program counter, memory pointers, context data, I/O status information, and accounting information)

• I/O status information: Includes outstanding I/O requests, I/O devices (e.g., tape drives) assigned to this process, a list of files assigned to the process, and so on. • Accounting information: May include the amount of processor time and clock time used, time limits, account numbers, and so on. When the scheduler accepts a new job or user request for execution, it creates a blank process control block and places the associated process in the new state. After the system has properly filled in the process control block, the process is transferred to the ready state. SCHEDULING TECHNIQUES To understand how the OS manages the scheduling of the various jobs in memory, let us begin by considering the simple example in Figure 8.9. The figure shows how main memory is partitioned at a given point in time. The kernel of the OS is, of course, always resident. In addition, there are a number of active processes, including A and B, each of which is allocated a portion of memory.

Figure 8.9 Scheduling Example: (a) A running, B ready; (b) A waiting, B ready; (c) A waiting, B running

We begin at a point in time when process A is running. The processor is exe- cuting instructions from the program contained in A’s memory partition. At some later point in time, the processor ceases to execute instructions in A and begins executing instructions in the OS area. This will happen for one of three reasons: 1. Process A issues a service call (e.g., an I/O request) to the OS. Execution of A is suspended until this call is satisfied by the OS. 2. Process A causes an interrupt. An interrupt is a hardware-generated signal to the processor. When this signal is detected, the processor ceases to execute A and transfers to the interrupt handler in the OS. A variety of events related to A will cause an interrupt. One example is an error, such as attempting to execute a privileged instruction. Another example is a timeout; to prevent any one process from monopolizing the processor, each process is only granted the processor for a short period at a time. 3. Some event unrelated to process A that requires attention causes an interrupt. An example is the completion of an I/O operation. In any case, the result is the following. The processor saves the current context data and the program counter for A in A’s process control block and then begins executing in the OS. The OS may perform some work, such as initiating an I/O operation. Then the short-term-scheduler portion of the OS decides which process should be executed next. In this example, B is chosen. The OS instructs the proces- sor to restore B’s context data and proceed with the execution of B where it left off. This simple example highlights the basic functioning of the short-term sched- uler. Figure 8.10 shows the major elements of the OS involved in the multiprogram- ming and scheduling of processes. The OS receives control of the processor at the

Figure 8.10 Key Elements of an Operating System for Multiprogramming

interrupt handler if an interrupt occurs and at the service-call handler if a service call occurs. Once the interrupt or service call is handled, the short-term scheduler is invoked to select a process for execution. To do its job, the OS maintains a number of queues. Each queue is simply a waiting list of processes waiting for some resource. The long-term queue is a list of jobs waiting to use the system. As conditions permit, the high-level scheduler will allocate memory and create a process for one of the waiting items. The short-term queue consists of all processes in the ready state. Any one of these processes could use the processor next. It is up to the short-term scheduler to pick one. Generally, this is done with a round-robin algorithm, giving each process some time in turn. Priority levels may also be used. Finally, there is an I/O queue for each I/O device. More than one process may request the use of the same I/O device. All processes waiting to use each device are lined up in that device’s queue. Figure 8.11 suggests how processes progress through the computer under the control of the OS. Each process request (batch job, user-defined interactive job) is placed in the long-term queue. As resources become available, a process request becomes a process and is then placed in the ready state and put in the short-term queue. The processor alternates between executing OS instructions and executing user processes. While the OS is in control, it decides which process in the short-term queue should be executed next. When the OS has finished its immediate tasks, it turns the processor over to the chosen process. As was mentioned earlier, a process being executed may be suspended for a variety of reasons. If it is suspended because the process requests I/O, then it

Figure 8.11 Queuing Diagram Representation of Processor Scheduling

is placed in the appropriate I/O queue. If it is suspended because of a timeout or because the OS must attend to pressing business, then it is placed in the ready state and put into the short-term queue. Finally, we mention that the OS also manages the I/O queues. When an I/O operation is completed, the OS removes the satisfied process from that I/O queue and places it in the short-term queue. It then selects another waiting process (if any) and signals for the I/O device to satisfy that process’s request.
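The queue traffic just described, with a round-robin pick from the head of the short-term queue, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: processes are represented only by identifiers, there is a single I/O queue, and the work of each time slice is simulated by print statements.

#include <stdio.h>

#define MAXQ 64

/* A simple FIFO queue of process identifiers. */
typedef struct { int items[MAXQ]; int head, tail, count; } queue_t;

static void enqueue(queue_t *q, int pid) {
    if (q->count < MAXQ) { q->items[q->tail] = pid; q->tail = (q->tail + 1) % MAXQ; q->count++; }
}
static int dequeue(queue_t *q) {                  /* returns -1 if the queue is empty */
    if (q->count == 0) return -1;
    int pid = q->items[q->head]; q->head = (q->head + 1) % MAXQ; q->count--; return pid;
}

/* Round-robin dispatch: take the process at the head of the short-term (ready)
 * queue, "run" it for one slice, then requeue it (timeout), move it to the I/O
 * queue (service call), or move it back when its I/O completes. */
int main(void) {
    queue_t ready = {0}, io = {0};
    for (int pid = 1; pid <= 3; pid++) enqueue(&ready, pid);   /* admit three processes */

    for (int slice = 0; slice < 6; slice++) {
        int pid = dequeue(&ready);
        if (pid < 0) { printf("slice %d: processor idle\n", slice); continue; }
        printf("slice %d: dispatch process %d\n", slice, pid);
        if (pid == 2) enqueue(&io, pid);          /* pretend process 2 requests I/O */
        else          enqueue(&ready, pid);       /* timeout: back to the ready queue */
        if (slice == 3) {                         /* pretend the pending I/O completes */
            int done = dequeue(&io);
            if (done > 0) enqueue(&ready, done);
        }
    }
    return 0;
}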

8.3 MEMORY MANAGEMENT

In a uniprogramming system, main memory is divided into two parts: one part for the OS (resident monitor) and one part for the program currently being executed. In a multiprogramming system, the “user” part of memory is subdivided to accom- modate multiple processes. The task of subdivision is carried out dynamically by the OS and is known as memory management. Effective memory management is vital in a multiprogramming system. If only a few processes are in memory, then for much of the time all of the processes will be waiting for I/O and the processor will be idle. Thus, memory needs to be allocated efficiently to pack as many processes into memory as possible.

Swapping

Referring back to Figure 8.11, we have discussed three types of queues: the long-term queue of requests for new processes, the short-term queue of processes ready to use the processor, and the various I/O queues of processes that are not ready to use the processor. Recall that the reason for this elaborate machinery is that I/O activities are much slower than computation and therefore the processor in a uniprogramming system is idle most of the time.

But the arrangement in Figure 8.11 does not entirely solve the problem. It is true that, in this case, memory holds multiple processes and that the processor can move to another process when one process is waiting. But the processor is so much faster than I/O that it will be common for all the processes in memory to be waiting on I/O. Thus, even with multiprogramming, a processor could be idle most of the time.

What to do? Main memory could be expanded so as to accommodate more processes. But there are two flaws in this approach. First, main memory is expensive, even today. Second, the appetite of programs for memory has grown as fast as the cost of memory has dropped. So larger memory results in larger processes, not more processes.

Another solution is swapping, depicted in Figure 8.12. We have a long-term queue of process requests, typically stored on disk. These are brought in, one at a time, as space becomes available. As processes are completed, they are moved out of main memory. Now the situation will arise that none of the processes in memory are in the ready state (e.g., all are waiting on an I/O operation). Rather than remain idle, the processor swaps one of these processes back out to disk into an intermediate queue. This is a queue of existing processes that have been temporarily kicked out

Figure 8.12 The Use of Swapping: (a) simple job scheduling; (b) swapping

of memory. The OS then brings in another process from the intermediate queue, or it honors a new process request from the long-term queue. Execution then continues with the newly arrived process.

Swapping, however, is an I/O operation, and therefore there is the potential for making the problem worse, not better. But because disk I/O is generally the fastest I/O on a system (e.g., compared with tape or printer I/O), swapping will usually enhance performance. A more sophisticated scheme, involving virtual memory, improves performance over simple swapping. This will be discussed shortly. But first, we must prepare the ground by explaining partitioning and paging.

Partitioning

The simplest scheme for partitioning available memory is to use fixed-size partitions, as shown in Figure 8.13. Note that, although the partitions are of fixed size, they need not be of equal size. When a process is brought into memory, it is placed in the smallest available partition that will hold it.

Figure 8.13 Example of Fixed Partitioning of a 64-Mbyte Memory: (a) equal-size partitions; (b) unequal-size partitions

Even with the use of unequal fixed-size partitions, there will be wasted memory. In most cases, a process will not require exactly as much memory as provided by the partition. For example, a process that requires 3 Mbytes of memory would be placed in the 4M partition of Figure 8.13b, wasting 1 Mbyte that could be used by another process. A more efficient approach is to use variable-size partitions. When a process is brought into memory, it is allocated exactly as much memory as it requires and no more.
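The fixed-partition placement rule mentioned above, namely placing an incoming process in the smallest available partition that will hold it, amounts to a best-fit search over a partition table. A minimal sketch follows; the partition sizes are loosely modeled on Figure 8.13b and the request sizes are arbitrary.

#include <stdio.h>

#define NPART 7
/* Unequal fixed partitions (in Mbytes); the resident OS region is not shown. */
static int size_mb[NPART]  = { 2, 4, 6, 8, 8, 12, 16 };
static int occupied[NPART] = { 0 };

/* Return the index of the smallest free partition that can hold need_mb, or -1. */
static int place(int need_mb) {
    int best = -1;
    for (int i = 0; i < NPART; i++)
        if (!occupied[i] && size_mb[i] >= need_mb &&
            (best < 0 || size_mb[i] < size_mb[best]))
            best = i;
    if (best >= 0) occupied[best] = 1;
    return best;
}

int main(void) {
    int requests[] = { 3, 7, 5, 14 };             /* incoming process sizes in Mbytes */
    for (int i = 0; i < 4; i++) {
        int p = place(requests[i]);
        if (p < 0) printf("%2d MB: no partition free\n", requests[i]);
        else printf("%2d MB -> %2d MB partition (internal fragmentation %d MB)\n",
                    requests[i], size_mb[p], size_mb[p] - requests[i]);
    }
    return 0;
}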

Example 8.2 An example, using 64 Mbytes of main memory, is shown in Figure 8.14. Initially, main memory is empty, except for the OS (a). The first three processes are loaded in, starting where the OS ends and occupying just enough space for each process (b, c, d). This leaves a “hole” at the end of memory that is too small for a fourth process. At some point, none of the processes in memory is ready. The OS swaps out process 2 (e), which leaves sufficient room to load a new process, process 4 (f). Because process 4 is smaller than process 2, another small hole is created. Later, a point is reached at which none of the processes in main memory is ready, but process 2, in the Ready-Suspend state, is available. Because there is insufficient room in memory for process 2, the OS swaps process 1 out (g) and swaps process 2 back in (h).

As this example shows, this method starts out well, but eventually it leads to a situation in which there are a lot of small holes in memory. As time goes on, memory becomes more and more fragmented, and memory utilization declines. One technique for overcoming this problem is compaction: From time to time, the OS shifts the processes in memory to place all the free memory together in one block. This is a time-consuming procedure, wasteful of processor time.

Before we consider ways of dealing with the shortcomings of partitioning, we must clear up one loose end. Consider Figure 8.14; it should be obvious that a process is not likely to be loaded into the same place in main memory each time it is swapped in. Furthermore, if compaction is used, a process may be shifted while in main memory. A process in memory consists of instructions plus data. The instructions will contain addresses for memory locations of two types:
• Addresses of data items
• Addresses of instructions, used for branching instructions

Figure 8.14 The Effect of Dynamic Partitioning

But these addresses are not fixed. They will change each time a process is swapped in. To solve this problem, a distinction is made between logical addresses and physical addresses. A logical address is expressed as a location relative to the beginning of the program. Instructions in the program contain only logical addresses. A physical address is an actual location in main memory. When the processor exe- cutes a process, it automatically converts from logical to physical address by adding the current starting location of the process, called its base address, to each logical address. This is another example of a processor hardware feature designed to meet an OS requirement. The exact nature of this hardware feature depends on the mem- ory management strategy in use. We will see several examples later in this chapter. Paging Both unequal fixed-size and variable-size partitions are inefficient in the use of memory. Suppose, however, that memory is partitioned into equal fixed-size chunks that are relatively small, and that each process is also divided into small fixed-size chunks of some size. Then the chunks of a program, known as pages, could be assigned to available chunks of memory, known as frames, or page frames. At most, then, the wasted space in memory for that process is a fraction of the last page. Figure 8.15 shows an example of the use of pages and frames. At a given point in time, some of the frames in memory are in use and some are free. The list of free frames is maintained by the OS. Process A, stored on disk, consists of four pages.

Figure 8.15 Allocation of Free Frames: (a) before; (b) after

When it comes time to load this process, the OS finds four free frames and loads the four pages of the process A into the four frames. Now suppose, as in this example, that there are not sufficient unused con- tiguous frames to hold the process. Does this prevent the OS from loading A? The answer is no, because we can once again use the concept of logical address. A simple base address will no longer suffice. Rather, the OS maintains a page table for each process. The page table shows the frame location for each page of the process. Within the program, each logical address consists of a page number and a relative address within the page. Recall that in the case of simple partitioning, a logical address is the location of a word relative to the beginning of the program; the processor translates that into a physical address. With paging, the logical- to-physical address translation is still done by processor hardware. The processor must know how to access the page table of the current process. Presented with a logical address (page number, relative address), the processor uses the page table to produce a physical address (frame number, relative address). An example is shown in Figure 8.16. This approach solves the problems raised earlier. Main memory is divided into many small equal-size frames. Each process is divided into frame-size pages: smaller processes require fewer pages, larger processes require more. When a process is brought in, its pages are loaded into available frames, and a page table is set up.
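The translation just described, splitting a logical address into a page number and an offset and substituting the frame number from the page table, can be written down directly. In the sketch below, the 1-Kbyte page size and the tiny four-entry page table (modeled on process A of Figure 8.15) are illustrative assumptions.

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE   1024u                 /* illustrative: 1-Kbyte pages */
#define PAGE_SHIFT  10
#define OFFSET_MASK (PAGE_SIZE - 1)

/* Page table for a small process: page_table[page] = frame, or -1 if not resident. */
static int page_table[] = { 18, 13, 14, 15 };

/* Translate a logical address to a physical address; return -1 on a fault. */
static int64_t translate(uint32_t logical) {
    uint32_t page   = logical >> PAGE_SHIFT;
    uint32_t offset = logical & OFFSET_MASK;
    if (page >= sizeof page_table / sizeof page_table[0] || page_table[page] < 0)
        return -1;                        /* illegal address or page fault */
    return ((int64_t)page_table[page] << PAGE_SHIFT) | offset;
}

int main(void) {
    uint32_t logical = (1u << PAGE_SHIFT) | 30;   /* page 1, offset 30, as in Figure 8.16 */
    printf("logical %u -> physical %lld\n", logical, (long long)translate(logical));
    return 0;
}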

Figure 8.16 Logical and Physical Addresses

Virtual Memory

With the use of paging, truly effective multiprogramming systems came into being. Furthermore, the simple tactic of breaking a process up into pages led to the development of another important concept: virtual memory. To understand virtual memory, we must add a refinement to the paging scheme just discussed. That refinement is demand paging, which simply means that each page of a process is brought in only when it is needed, that is, on demand.

Consider a large process, consisting of a long program plus a number of arrays of data. Over any short period of time, execution may be confined to a small section of the program (e.g., a subroutine), and perhaps only one or two arrays of data are being used. This is the principle of locality, which we introduced in Appendix 4A. It would clearly be wasteful to load in dozens of pages for that process when only a few pages will be used before the program is suspended. We can make better use of memory by loading in just a few pages. Then, if the program branches to an instruction on a page not in main memory, or if the program references data on a page not in memory, a page fault is triggered. This tells the OS to bring in the desired page.

Thus, at any one time, only a few pages of any given process are in memory, and therefore more processes can be maintained in memory. Furthermore, time is saved because unused pages are not swapped in and out of memory. However, the OS must be clever about how it manages this scheme. When it brings one page in, it must throw another page out; this is known as page replacement. If it throws out a page just before it is about to be used, then it will just have to go get that page again almost immediately. Too much of this leads to a condition known as thrashing: the processor spends most of its time swapping pages rather than executing instructions. The avoidance of thrashing was a major research area in the 1970s and led to a variety of complex but effective algorithms. In essence, the OS tries to guess, based on recent history, which pages are least likely to be used in the near future.
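As one illustration of this kind of recency-based guessing, the following sketch simulates least recently used (LRU) replacement over a short reference string; LRU itself is picked up in the next paragraph. The three-frame memory and the reference string are arbitrary choices for the example.

#include <stdio.h>

#define NFRAMES 3

int main(void) {
    int refs[] = { 2, 3, 2, 1, 5, 2, 4, 5, 3, 2 };
    int nrefs = sizeof refs / sizeof refs[0];
    int frame[NFRAMES], last_use[NFRAMES];
    int faults = 0;

    for (int i = 0; i < NFRAMES; i++) { frame[i] = -1; last_use[i] = -1; }

    for (int t = 0; t < nrefs; t++) {
        int page = refs[t], hit = -1, victim = 0;
        for (int i = 0; i < NFRAMES; i++)
            if (frame[i] == page) hit = i;
        if (hit >= 0) {
            last_use[hit] = t;                      /* hit: just refresh recency */
        } else {
            faults++;
            for (int i = 1; i < NFRAMES; i++)       /* victim = least recently used frame */
                if (last_use[i] < last_use[victim]) victim = i;
            frame[victim] = page;
            last_use[victim] = t;
        }
        printf("ref %d: %s\n", page, hit >= 0 ? "hit" : "fault");
    }
    printf("%d faults out of %d references\n", faults, nrefs);
    return 0;
}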


A discussion of page replacement algorithms is beyond the scope of this chapter. A potentially effective technique is least recently used (LRU), the same algorithm discussed in Chapter 4 for cache replacement. In practice, LRU is difficult to implement for a virtual memory scheme. Several alternative approaches that seek to approximate the performance of LRU are in use; see Appendix F for details.

With demand paging, it is not necessary to load an entire process into main memory. This fact has a remarkable consequence: It is possible for a process to be larger than all of main memory. One of the most fundamental restrictions in programming has been lifted. Without demand paging, a programmer must be acutely aware of how much memory is available. If the program being written is too large, the programmer must devise ways to structure the program into pieces that can be loaded one at a time. With demand paging, that job is left to the OS and the hardware. As far as the programmer is concerned, he or she is dealing with a huge memory, the size associated with disk storage. Because a process executes only in main memory, that memory is referred to as real memory. But a programmer or user perceives a much larger memory—that which is allocated on the disk. This latter is therefore referred to as virtual memory. Virtual memory allows for very effective multiprogramming and relieves the user of the unnecessarily tight constraints of main memory.

PAGE TABLE STRUCTURE The basic mechanism for reading a word from memory involves the translation of a virtual, or logical, address, consisting of page number and offset, into a physical address, consisting of frame number and offset, using a page table. Because the page table is of variable length, depending on the size of the process, we cannot expect to hold it in registers. Instead, it must be in main memory to be accessed. Figure 8.16 suggests a hardware implementation of this scheme. When a particular process is running, a register holds the starting address of the page table for that process. The page number of a virtual address is used to index that table and look up the corresponding frame number. This is combined with the offset portion of the virtual address to produce the desired real address.

In most systems, there is one page table per process. But each process can occupy huge amounts of virtual memory. For example, in the VAX architecture, each process can have up to 2^31 = 2 Gbytes of virtual memory. Using 2^9 = 512-byte pages, that means that as many as 2^22 page table entries are required per process. Clearly, the amount of memory devoted to page tables alone could be unacceptably high. To overcome this problem, most virtual memory schemes store page tables in virtual memory rather than real memory. This means that page tables are subject to paging just as other pages are. When a process is running, at least a part of its page table must be in main memory, including the page table entry of the currently executing page. Some processors make use of a two-level scheme to organize large page tables. In this scheme, there is a page directory, in which each entry points to a page table. Thus, if the length of the page directory is X, and if the maximum length of a page table is Y, then a process can consist of up to X × Y pages. Typically, the maximum length of a page table is restricted to be equal to one page. We will see an example of this two-level approach when we consider the Pentium II later in this chapter.

An alternative approach to the use of one- or two-level page tables is the use of an inverted page table structure (Figure 8.17). Variations on this approach are used on the PowerPC, UltraSPARC, and the IA-64 architecture. An implementation of the Mach OS on the RT-PC also uses this technique. In this approach, the page number portion of a virtual address is mapped into a hash value using a simple hashing function.² The hash value is a pointer to the inverted page table, which contains the page table entries.

² A hash function maps numbers in the range 0 through M into numbers in the range 0 through N, where M > N. The output of the hash function is used as an index into the hash table. Since more than one input maps into the same output, it is possible for an input item to map to a hash table entry that is already occupied. In that case, the new item must overflow into another hash table location. Typically, the new item is placed in the first succeeding empty space, and a pointer from the original location is provided to chain the entries together. See Appendix C for more information on hash functions.
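To make the footnote's hash-and-chain idea concrete for an inverted page table, the sketch below looks up a (process, page) pair in a table with one entry per physical frame, following the chain on collisions. The table size, hash function, and field widths are illustrative assumptions.

#include <stdio.h>
#include <stdint.h>

#define NFRAMES 8                  /* one inverted-table entry per physical frame */

typedef struct {
    int      valid;
    uint32_t pid, page;            /* the virtual page currently held by this frame */
    int      chain;                /* next frame index with the same hash, or -1 */
} ipt_entry_t;

static ipt_entry_t ipt[NFRAMES];
static int hash_head[NFRAMES];     /* hash value -> first frame on the chain */

static unsigned hash_vp(uint32_t pid, uint32_t page) {
    return (pid * 31u + page) % NFRAMES;        /* simple illustrative hash */
}

/* Return the frame holding (pid, page), or -1 if it is not resident. */
static int ipt_lookup(uint32_t pid, uint32_t page) {
    for (int f = hash_head[hash_vp(pid, page)]; f >= 0; f = ipt[f].chain)
        if (ipt[f].valid && ipt[f].pid == pid && ipt[f].page == page)
            return f;
    return -1;
}

int main(void) {
    for (int i = 0; i < NFRAMES; i++) hash_head[i] = -1;
    ipt[2] = (ipt_entry_t){ 1, 1, 5, -1 };        /* page 5 of process 1 lives in frame 2 */
    hash_head[hash_vp(1, 5)] = 2;
    printf("frame = %d\n", ipt_lookup(1, 5));     /* prints 2 */
    printf("frame = %d\n", ipt_lookup(1, 6));     /* prints -1: not resident */
    return 0;
}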

Figure 8.17 Inverted Page Table Structure (one entry for each physical memory frame)

There is one entry in the inverted page table for each real memory page frame rather than one per virtual page. Thus a fixed proportion of real memory is required for the tables regardless of the number of processes or virtual pages supported. Because more than one virtual address may map into the same hash table entry, a chaining technique is used for managing the overflow. The hashing technique results in chains that are typically short—between one and two entries. The page table's structure is called inverted because it indexes page table entries by frame number rather than by virtual page number.

Translation Lookaside Buffer

In principle, then, every virtual memory reference can cause two physical memory accesses: one to fetch the appropriate page table entry, and one to fetch the desired data. Thus, a straightforward virtual memory scheme would have the effect of doubling the memory access time. To overcome this problem, most virtual memory schemes make use of a special cache for page table entries, usually called a translation lookaside buffer (TLB). This cache functions in the same way as a memory cache and contains those page table entries that have been most recently used. Figure 8.18 is a flowchart that shows the use of the TLB. By the principle of locality, most virtual memory references will be to locations in recently used pages. Therefore, most references will involve page table entries in the cache. Studies of the VAX TLB have shown that this scheme can significantly improve performance [CLAR85, SATY81].
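The TLB-first lookup path that Figure 8.18 charts can be sketched as follows. The tiny, fully associative TLB searched linearly, the round-robin TLB replacement, and the sizes used are assumptions chosen to keep the example short; page fault handling is omitted.

#include <stdio.h>
#include <stdint.h>

#define TLB_ENTRIES 4
#define NPAGES      16
#define PAGE_SHIFT  12                       /* assume 4-Kbyte pages */

typedef struct { int valid; uint32_t page, frame; } tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static int next_victim;                      /* simple round-robin TLB replacement */
static int page_table[NPAGES] = { [0] = 7, [1] = 3, [2] = 9 };   /* frame per page */

static uint32_t translate(uint32_t vaddr) {
    uint32_t page = vaddr >> PAGE_SHIFT, offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    for (int i = 0; i < TLB_ENTRIES; i++)            /* 1. check the TLB */
        if (tlb[i].valid && tlb[i].page == page) {
            printf("TLB hit for page %u\n", page);
            return (tlb[i].frame << PAGE_SHIFT) | offset;
        }
    printf("TLB miss for page %u\n", page);          /* 2. walk the page table */
    uint32_t frame = (uint32_t)page_table[page];
    tlb[next_victim] = (tlb_entry_t){ 1, page, frame };   /* 3. update the TLB */
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    return (frame << PAGE_SHIFT) | offset;
}

int main(void) {
    uint32_t v = (2u << PAGE_SHIFT) | 0x10;
    printf("physical = 0x%x\n", translate(v));       /* miss, then TLB fill */
    printf("physical = 0x%x\n", translate(v));       /* hit on the second access */
    return 0;
}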

Figure 8.18 Operation of Paging and Translation Lookaside Buffer (TLB)

Note that the virtual memory mechanism must interact with the cache system (not the TLB cache, but the main memory cache). This is illustrated in Figure 8.19. A virtual address will generally be in the form of a page number, offset. First, the memory system consults the TLB to see if the matching page table entry is present. If it is, the real (physical) address is generated by combining the frame number with the offset. If not, the entry is accessed from a page table. Once the real address is generated, which is in the form of a tag and a remainder, the cache is consulted to see if the block containing that word is present (see Figure 4.5). If so, it is returned to the processor. If not, the word is retrieved from main memory.

The reader should be able to appreciate the complexity of the processor hardware involved in a single memory reference. The virtual address is translated into a real address. This involves reference to a page table, which may be in the TLB, in main memory, or on disk. The referenced word may be in cache, in main memory, or on disk. In the latter case, the page containing the word must be loaded into main memory and its block loaded into the cache. In addition, the page table entry for that page must be updated.

Figure 8.19 Translation Lookaside Buffer and Cache Operation

Segmentation

There is another way in which addressable memory can be subdivided, known as segmentation. Whereas paging is invisible to the programmer and serves the purpose of providing the programmer with a larger address space, segmentation is usually visible to the programmer and is provided as a convenience for organizing programs and data and as a means for associating privilege and protection attributes with instructions and data. Segmentation allows the programmer to view memory as consisting of multiple address spaces or segments. Segments are of variable, indeed dynamic, size. Typically, the programmer or the OS will assign programs and data to different segments. There may be a number of program segments for various types of programs as well as a number of data segments. Each segment may be assigned access and usage rights. Memory references consist of a (segment number, offset) form of address. This organization has a number of advantages to the programmer over a nonsegmented address space:

1. It simplifies the handling of growing data structures. If the programmer does not know ahead of time how large a particular data structure will become, it is not necessary to guess. The data structure can be assigned its own segment, and the OS will expand or shrink the segment as needed.
2. It allows programs to be altered and recompiled independently without requiring that an entire set of programs be relinked and reloaded. Again, this is accomplished using multiple segments.
3. It lends itself to sharing among processes. A programmer can place a utility program or a useful table of data in a segment that can be addressed by other processes.
4. It lends itself to protection. Because a segment can be constructed to contain a well-defined set of programs or data, the programmer or a system administrator can assign access privileges in a convenient fashion.

These advantages are not available with paging, which is invisible to the programmer. On the other hand, we have seen that paging provides for an efficient form of memory management. To combine the advantages of both, some systems are equipped with the hardware and OS software to provide both.
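For concreteness, here is a sketch of segmented address translation, including the limit check and a read/write permission check of the kind that makes segmentation attractive for protection. The segment table contents and the permission encoding are illustrative assumptions.

#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t base;       /* starting physical address of the segment */
    uint32_t limit;      /* segment length in bytes */
    int      writable;   /* 1 = read/write, 0 = read-only */
} seg_entry_t;

static seg_entry_t seg_table[] = {
    { 0x10000, 0x4000, 0 },     /* segment 0: code, read-only  */
    { 0x20000, 0x1000, 1 },     /* segment 1: data, read/write */
};

/* Translate (segment, offset); return -1 on a limit or protection violation. */
static int64_t seg_translate(unsigned seg, uint32_t offset, int is_write) {
    if (seg >= sizeof seg_table / sizeof seg_table[0]) return -1;
    if (offset >= seg_table[seg].limit) return -1;            /* outside the segment */
    if (is_write && !seg_table[seg].writable) return -1;      /* protection fault */
    return (int64_t)seg_table[seg].base + offset;
}

int main(void) {
    printf("read  (1, 0x20) -> %lld\n", (long long)seg_translate(1, 0x20, 0));
    printf("write (0, 0x20) -> %lld\n", (long long)seg_translate(0, 0x20, 1));   /* -1 */
    return 0;
}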

8.4 PENTIUM MEMORY MANAGEMENT

Since the introduction of the 32-bit architecture, microprocessors have evolved sophisticated memory management schemes that build on the lessons learned with medium- and large-scale systems. In many cases, the microprocessor versions are superior to their larger-system antecedents. Because the schemes were developed by the microprocessor hardware vendor and may be employed with a variety of operating systems, they tend to be quite general purpose. A representative example is the scheme used on the Pentium II. The Pentium II memory management hard- ware is essentially the same as that used in the Intel 80386 and 80486 processors, with some refinements.

Address Spaces The Pentium II includes hardware for both segmentation and paging. Both mech- anisms can be disabled, allowing the user to choose from four distinct views of memory:

• Unsegmented unpaged memory: In this case, the virtual address is the same as the physical address. This is useful, for example, in low-complexity, high-performance controller applications.
• Unsegmented paged memory: Here memory is viewed as a paged linear address space. Protection and management of memory is done via paging. This is favored by some operating systems (e.g., Berkeley UNIX).
• Segmented unpaged memory: Here memory is viewed as a collection of logical address spaces. The advantage of this view over a paged approach is that it affords protection down to the level of a single byte, if necessary. Furthermore, unlike paging, it guarantees that the translation table needed (the segment table) is on-chip when the segment is in memory. Hence, segmented unpaged memory results in predictable access times.
• Segmented paged memory: Segmentation is used to define logical memory partitions subject to access control, and paging is used to manage the allocation of memory within the partitions. Operating systems such as UNIX System V favor this view.

Segmentation

When segmentation is used, each virtual address (called a logical address in the Pentium II documentation) consists of a 16-bit segment reference and a 32-bit offset. Two bits of the segment reference deal with the protection mechanism, leaving 14 bits for specifying a particular segment. Thus, with unsegmented memory, the user's virtual memory is 2^32 = 4 Gbytes. With segmented memory, the total virtual memory space as seen by a user is 2^46 = 64 terabytes (Tbytes). The physical address space employs a 32-bit address for a maximum of 4 Gbytes. The amount of virtual memory can actually be larger than the 64 Tbytes. This is because the processor's interpretation of a virtual address depends on which process is currently active. Virtual address space is divided into two parts. One-half of the virtual address space (8K segments × 4 Gbytes) is global, shared by all processes; the remainder is local and is distinct for each process.

Associated with each segment are two forms of protection: privilege level and access attribute. There are four privilege levels, from most protected (level 0) to least protected (level 3). The privilege level associated with a data segment is its “classification”; the privilege level associated with a program segment is its “clearance.” An executing program may only access data segments for which its clearance level is lower than (more privileged) or equal to (same privilege) the privilege level of the data segment.

The hardware does not dictate how these privilege levels are to be used; this depends on the OS design and implementation. It was intended that privilege level 1 would be used for most of the OS, and level 0 would be used for that small portion of the OS devoted to memory management, protection, and access control. This leaves two levels for applications. In many systems, applications will reside at level 3, with level 2 being unused. Specialized application subsystems that must be protected because they implement their own security mechanisms are good candidates for level 2. Some examples are database management systems, office automation systems, and environments.

In addition to regulating access to data segments, the privilege mechanism limits the use of certain instructions. Some instructions, such as those dealing with memory-management registers, can only be executed in level 0. I/O instructions can only be executed up to a certain level that is designated by the OS; typically, this will be level 1.

The access attribute of a data segment specifies whether read/write or read-only accesses are permitted. For program segments, the access attribute specifies read/execute or read-only access.
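A short sketch of pulling apart a 16-bit segment selector as laid out in Figure 8.20a (a 13-bit index, a table indicator bit, and the 2-bit requested privilege level); the sample selector value is arbitrary.

#include <stdio.h>
#include <stdint.h>

/* Decode a Pentium II segment selector (Figure 8.20a):
 * bits 15..3 = segment number (index), bit 2 = table indicator, bits 1..0 = RPL. */
int main(void) {
    uint16_t selector = 0x001B;             /* arbitrary example value */
    unsigned index = selector >> 3;
    unsigned ti    = (selector >> 2) & 0x1;
    unsigned rpl   = selector & 0x3;
    printf("index = %u, table = %s, RPL = %u\n",
           index, ti ? "local" : "global", rpl);
    return 0;
}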

The address translation mechanism for segmentation involves mapping a vir- tual address into what is referred to as a linear address (Figure 8.20b). A virtual address consists of the 32-bit offset and a 16-bit segment selector (Figure 8.20a). The segment selector consists of the following fields: • Table Indicator (TI): Indicates whether the global segment table or a local segment table should be used for translation. • Segment Number: The number of the segment. This serves as an index into the segment table. • Requested Privilege Level (RPL): The privilege level requested for this access. Each entry in a segment table consists of 64 bits, as shown in Figure 8.20c. The fields are defined in Table 8.5. Paging Segmentation is an optional feature and may be disabled. When segmentation is in use, addresses used in programs are virtual addresses and are converted into linear addresses, as just described. When segmentation is not in use, linear addresses are used in programs. In either case, the following step is to convert that linear address into a real 32-bit address. To understand the structure of the linear address, you need to know that the Pentium II paging mechanism is actually a two-level table lookup operation. The first level is a page directory, which contains up to 1024 entries. This splits the 4-Gbyte linear memory space into 1024 page groups, each with its own page table, and each 4 Mbytes in length. Each page table contains up to 1024 entries; each entry corresponds to a single 4-Kbyte page. Memory management has the option of using one page directory for all processes, one page directory for each process, or some combination of the two. The page directory for the current task is always in main memory. Page tables may be in virtual memory. Figure 8.20 shows the formats of entries in page directories and page tables, and the fields are defined in Table 8.5. Note that access control mechanisms can be provided on a page or page group basis. The Pentium II also makes use of a translation lookaside buffer. The buffer can hold 32 page table entries. Each time that the page directory is changed, the buffer is cleared. Figure 8.21 illustrates the combination of segmentation and paging mecha- nisms. For clarity, the translation lookaside buffer and memory cache mechanisms are not shown. Finally, the Pentium II includes a new extension not found on the 80386 or 80486, the provision for two page sizes. If the PSE (page size extension) bit in con- trol register 4 is set to 1, then the paging unit permits the OS programmer to define a page as either 4 Kbyte or 4 Mbyte in size. When 4-Mbyte pages are used, there is only one level of table lookup for pages. When the hardware accesses the page directory, the page directory entry (Figure 8.20d) has the PS bit set to 1. In this case, bits 9 through 21 are ignored and bits 22 through 31 define the base address for a 4-Mbyte page in memory. Thus, there is a single page table. 8.4 / PENTIUM MEMORY MANAGEMENT 297

Figure 8.20 Pentium Memory-Management Formats: (a) segment selector; (b) linear address; (c) segment descriptor (segment table entry); (d) page directory entry; (e) page table entry

Table 8.5 Pentium II Memory Management Parameters

Segment Descriptor (Segment Table Entry)
Base: Defines the starting address of the segment within the 4-Gbyte linear address space.
D/B bit: In a code segment, this is the D bit and indicates whether operands and addressing modes are 16 or 32 bits.
Descriptor Privilege Level (DPL): Specifies the privilege level of the segment referred to by this segment descriptor.
Granularity bit (G): Indicates whether the Limit field is to be interpreted in units of one byte or 4 Kbytes.
Limit: Defines the size of the segment. The processor interprets the limit field in one of two ways, depending on the granularity bit: in units of one byte, up to a segment size limit of 1 Mbyte, or in units of 4 Kbytes, up to a segment size limit of 4 Gbytes.
S bit: Determines whether a given segment is a system segment or a code or data segment.
Segment Present bit (P): Used for nonpaged systems. It indicates whether the segment is present in main memory. For paged systems, this bit is always set to 1.
Type: Distinguishes between various kinds of segments and indicates the access attributes.

Page Directory Entry and Page Table Entry
Accessed bit (A): This bit is set to 1 by the processor in both levels of page tables when a read or write operation to the corresponding page occurs.
Dirty bit (D): This bit is set to 1 by the processor when a write operation to the corresponding page occurs.
Page Frame Address: Provides the physical address of the page in memory if the present bit is set. Since page frames are aligned on 4K boundaries, the bottom 12 bits are 0, and only the top 20 bits are included in the entry. In a page directory, the address is that of a page table.
Cache Disable bit (PCD): Indicates whether data from the page may be cached.
Page Size bit (PS): Indicates whether page size is 4 Kbyte or 4 Mbyte.
Page Write Through bit (PWT): Indicates whether write-through or write-back caching policy will be used for data in the corresponding page.
Present bit (P): Indicates whether the page table or page is in main memory.
Read/Write bit (RW): For user-level pages, indicates whether the page is read-only access or read/write access for user-level programs.
User/Supervisor bit (US): Indicates whether the page is available only to the operating system (supervisor level) or is available to both operating system and applications (user level).

Figure 8.21 Pentium Memory Address Translation Mechanisms

The use of 4-Mbyte pages reduces the memory-management storage requirements for large main memories. With 4-Kbyte pages, a full 4-Gbyte main memory requires about 4 Mbytes of memory just for the page tables. With 4-Mbyte pages, a single table, 4 Kbytes in length, is sufficient for page memory management.
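Putting numbers on the preceding paragraphs: a 32-bit linear address splits into a 10-bit directory index, a 10-bit table index, and a 12-bit offset (Figure 8.20b), and the page-table storage figures for 4-Kbyte versus 4-Mbyte pages can be checked directly. The 4-byte entry size assumed below is the usual size of a 32-bit entry, not something stated in the text.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* Split a 32-bit linear address as in Figure 8.20b. */
    uint32_t linear = 0x12345678;                 /* arbitrary example address */
    printf("directory = %u, table = %u, offset = %u\n",
           linear >> 22, (linear >> 12) & 0x3FF, linear & 0xFFF);

    /* Table storage needed to map a full 4-Gbyte space, assuming 4-byte entries. */
    uint64_t space = 1ull << 32;
    printf("4-KB pages: %llu entries, about %llu Mbytes of tables\n",
           (unsigned long long)(space >> 12),
           (unsigned long long)(((space >> 12) * 4) >> 20));
    printf("4-MB pages: %llu entries, about %llu Kbytes of tables\n",
           (unsigned long long)(space >> 22),
           (unsigned long long)(((space >> 22) * 4) >> 10));
    return 0;
}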

8.5 ARM MEMORY MANAGEMENT

ARM provides a versatile virtual memory system architecture that can be tailored to the needs of the embedded system designer. Memory System Organization Figure 8.22 provides an overview of the memory management hardware in the ARM for virtual memory. The virtual memory translation hardware uses one or two levels of tables for translation from virtual to physical addresses, as explained subsequently. The translation lookaside buffer (TLB) is a cache of recent page table entries. If an entry is available in the TLB, then the TLB directly sends a physical address to main memory for a read or write operation. As explained in Chapter 4, data is exchanged between the processor and main memory via the cache. If a logi- cal cache organization is used (Figure 4.7a), then the ARM supplies that address directly to the cache as well as supplying it to the TLB when a cache miss occurs. If a physical cache organization is used (Figure 4.7b), then the TLB must supply the physical address to the cache. 300 CHAPTER 8 / OPERATING SYSTEM SUPPORT

Figure 8.22 ARM Memory System Overview

Entries in the translation tables also include access control bits, which determine whether a given process may access a given portion of memory. If access is denied, access control hardware supplies an abort signal to the ARM processor.

Virtual Memory Address Translation

The ARM supports memory access based on either sections or pages:
• Supersections (optional): Consist of 16-MB blocks of main memory
• Sections: Consist of 1-MB blocks of main memory
• Large pages: Consist of 64-kB blocks of main memory
• Small pages: Consist of 4-kB blocks of main memory

Sections and supersections are supported to allow mapping of a large region of memory while using only a single entry in the TLB. Additional access control mechanisms are extended within small pages to 1-kB subpages, and within large pages to 16-kB subpages. The translation table held in main memory has two levels:
• First-level table: Holds section and supersection translations, and pointers to second-level tables
• Second-level tables: Hold both large and small page translations

The memory-management unit (MMU) translates virtual addresses generated by the processor into physical addresses to access main memory, and also derives and checks the access permission. Translations occur as the result of a TLB miss, and start with a first-level fetch. A section-mapped access only requires a first-level fetch, whereas a page-mapped access also requires a second-level fetch.

Figure 8.23 shows the two-level address translation process for small pages. There is a single level 1 (L1) page table with 4K 32-bit entries. Each L1 entry points to a level 2 (L2) page table with 256 32-bit entries. Each L2 entry points to a

Figure 8.23 ARM Virtual Memory Address Translation for Small Pages

4-kB page in main memory. The 32-bit virtual address is interpreted as follows: The most significant 12 bits are an index into the L1 page table. The next 8 bits are an index into the relevant L2 page table. The least significant 12 bits index a byte in the relevant page in main memory. A similar two-level lookup procedure is used for large pages. For sections and supersections, only the L1 page table lookup is required.

Memory-Management Formats

To get a better understanding of the ARM memory management scheme, we consider the key formats, as shown in Figure 8.24. The control bits shown in this figure are defined in Table 8.6. For the L1 table, each entry is a descriptor of how its associated 1-MB virtual address range is mapped. Each entry has one of four alternative formats:
• Bits [1:0] = 00: The associated virtual addresses are unmapped, and attempts to access them generate a translation fault.
• Bits [1:0] = 01: The entry gives the physical address of an L2 page table, which specifies how the associated virtual address range is mapped.

Figure 8.24 ARMv6 Memory-Management Formats: (a) alternative first-level descriptor formats; (b) alternative second-level descriptor formats; (c) virtual memory address formats

• Bits [1:0] = 10 and bit 19 = 0: The entry is a section descriptor for its associated virtual addresses.
• Bits [1:0] = 10 and bit 19 = 1: The entry is a supersection descriptor for its associated virtual addresses.
Entries with bits [1:0] = 11 are reserved. For memory structured into pages, a two-level page table access is required. Bits [31:10] of the L1 page entry contain a pointer to an L2 page table. For small

Table 8.6 ARM Memory-Management Parameters

Access Permission (AP), Access Permission Extension (APX): These bits control access to the corresponding memory region. If an access is made to an area of memory without the required permissions, a Permission Fault is raised.
Bufferable (B) bit: Determines, with the TEX bits, how the write buffer is used for cacheable memory.
Cacheable (C) bit: Determines whether this memory region can be mapped through the cache.
Domain: Collection of memory regions. Access control can be applied on the basis of domain.
not Global (nG): Determines whether the translation should be marked as global (0) or process specific (1).
Shared (S): Determines whether the translation is for not-shared (0) or shared (1) memory.
SBZ: Should be zero.
Type Extension (TEX): These bits, together with the B and C bits, control accesses to the caches, how the write buffer is used, and whether the memory region is shareable and therefore must be kept coherent.
Execute Never (XN): Determines whether the region is executable (0) or not executable (1).

pages, the L2 entry contains a 20-bit pointer to the base address of a 4-kB page in main memory. For large pages, the structure is more complex. As with virtual addresses for small pages, a virtual address for a large page structure includes a 12-bit index into the level one table and an 8-bit index into the L2 table. For the 64-kB large pages, the page index portion of the virtual address must be 16 bits. To accommodate all of these bits in a 32-bit format, there is a 4-bit overlap between the page index field and the L2 table index field. ARM accommodates this overlap by requiring that each page table entry in an L2 page table that supports large pages be replicated 16 times. In effect, the size of the L2 page table is reduced from 256 entries to 16 entries, if all of the entries refer to large pages. However, a given L2 page table can service a mixture of large and small pages, hence the need for the replication for large page entries.

For memory structured into sections or supersections, a one-level page table access is required. For sections, bits [31:20] of the L1 entry contain a 12-bit pointer to the base of the 1-MB section in main memory. For supersections, bits [31:24] of the L1 entry contain an 8-bit pointer to the base of the 16-MB section in main memory. As with large pages, page table entry replication is required. In the case of supersections, the L1 table index portion of the virtual address overlaps by 4 bits with the supersection index portion of the virtual address. Therefore, 16 identical L1 page table entries are required. The range of physical address space can be expanded by up to eight additional address bits (bits [23:20] and [8:5]). The number of additional bits is implementation dependent. These additional bits can be interpreted as extending the size of physical memory by as much as a factor of 2^8 = 256. Thus, physical memory may in fact be as much as 256 times as large as the memory space available to each individual process.

Access Control

The AP access control bits in each table entry control access to a region of memory by a given process. A region of memory can be designated as no access, read only, or read-write. Further, the region can be designated as privileged access only, reserved for use by the OS and not by applications.

ARM also employs the concept of a domain, which is a collection of sections and/or pages that have particular access permissions. The ARM architecture supports 16 domains. The domain feature allows multiple processes to use the same translation tables while maintaining some protection from each other. Each page table entry and TLB entry contains a field that specifies which domain the entry is in. A 2-bit field in the Domain Access Control Register controls access to each domain. Each field allows the access to an entire domain to be enabled and disabled very quickly, so that whole memory areas can be swapped in and out of virtual memory very efficiently. Two kinds of domain access are supported:
• Clients: Users of domains (execute programs and access data) that must observe the access permissions of the individual sections and/or pages that make up that domain
• Managers: Control the behavior of the domain (the current sections and pages in the domain, and the domain access), and bypass the access permissions for table entries in that domain
One program can be a client of some domains, and a manager of some other domains, and have no access to the remaining domains. This allows very flexible memory protection for programs that access different memory resources.
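Tying the ARM formats together, here is a sketch of the two-level small-page walk of Figure 8.23: a 12-bit L1 index, an 8-bit L2 index, and a 12-bit page offset. The descriptors are reduced to plain frame numbers; real L1/L2 descriptors also carry the domain, AP, TEX/C/B, and type bits of Figure 8.24, which are ignored here.

#include <stdio.h>
#include <stdint.h>

#define L1_ENTRIES 4096            /* 12-bit index: one entry per 1-MB region */
#define L2_ENTRIES 256             /* 8-bit index: one entry per 4-kB page    */

/* Simplified tables: an L1 entry points to an L2 table (or is NULL = fault),
 * and an L2 entry holds a physical frame number (or -1 = fault). */
static int l2_for_region0[L2_ENTRIES];
static int *l1_table[L1_ENTRIES] = { [0] = l2_for_region0 };

static int64_t walk(uint32_t vaddr) {
    uint32_t l1  = vaddr >> 20;                /* bits 31..20 */
    uint32_t l2  = (vaddr >> 12) & 0xFF;       /* bits 19..12 */
    uint32_t off = vaddr & 0xFFF;              /* bits 11..0  */
    if (!l1_table[l1] || l1_table[l1][l2] < 0) return -1;    /* translation fault */
    return ((int64_t)l1_table[l1][l2] << 12) | off;
}

int main(void) {
    for (int i = 0; i < L2_ENTRIES; i++) l2_for_region0[i] = -1;
    l2_for_region0[3] = 0x40;                  /* map virtual page 3 to frame 0x40 */
    printf("0x%llx\n", (long long)walk(0x00003123));   /* prints 0x40123 */
    printf("%lld\n",  (long long)walk(0x00100000));    /* prints -1: no L2 table */
    return 0;
}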

8.6 RECOMMENDED READING

[STAL12] covers the topics of this chapter in detail.

STAL12 Stallings, W. Operating Systems, Internals and Design Principles, Seventh Edition. Upper Saddle River, NJ: Prentice Hall, 2012.


8.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

batch system, demand paging, interactive operating system, interrupt, job control language (JCL), kernel, logical address, long-term scheduling, medium-term scheduling, memory management, memory protection, multiprogramming, multitasking, nucleus, operating system (OS), page table, paging, partitioning, physical address, privileged instruction, process, process control block, process state, real memory, resident monitor, segmentation, short-term scheduling, swapping, thrashing, time-sharing system, translation lookaside buffer (TLB), utility, virtual memory

Review Questions
8.1 What is an operating system?
8.2 List and briefly define the key services provided by an OS.
8.3 List and briefly define the major types of OS scheduling.
8.4 What is the difference between a process and a program?
8.5 What is the purpose of swapping?
8.6 If a process may be dynamically assigned to different locations in main memory, what is the implication for the addressing mechanism?
8.7 Is it necessary for all of the pages of a process to be in main memory while the process is executing?
8.8 Must the pages of a process in main memory be contiguous?
8.9 Is it necessary for the pages of a process in main memory to be in sequential order?
8.10 What is the purpose of a translation lookaside buffer?

Problems

8.1 Suppose that we have a multiprogrammed computer in which each job has identical characteristics. In one computation period, T, for a job, half the time is spent in I/O and the other half in processor activity. Each job runs for a total of N periods. Assume that a simple round-robin priority is used, and that I/O operations can overlap with processor operation. Define the following quantities:
• Turnaround time = actual time to complete a job
• Throughput = average number of jobs completed per time period T
• Processor utilization = percentage of time that the processor is active (not waiting)
Compute these quantities for one, two, and four simultaneous jobs, assuming that the period T is distributed in each of the following ways:
a. I/O first half, processor second half
b. I/O first and fourth quarters, processor second and third quarters

8.2 An I/O-bound program is one that, if run alone, would spend more time waiting for I/O than using the processor. A processor-bound program is the opposite. Suppose a short-term scheduling algorithm favors those programs that have used little processor time in the recent past. Explain why this algorithm favors I/O-bound programs and yet does not permanently deny processor time to processor-bound programs.

8.3 A program computes the row sums Ci = Σ (j = 1 to n) aij of an array A that is 100 by 100. Assume that the computer uses demand paging with a page size of 1000 words, and that the amount of main memory allotted for data is five page frames. Is there any difference in the page fault rate if A were stored in virtual memory by rows or columns? Explain.

8.4 Consider a fixed partitioning scheme with equal-size partitions of 2^16 bytes and a total main memory size of 2^24 bytes. A process table is maintained that includes a pointer to a partition for each resident process. How many bits are required for the pointer?

8.5 Consider a dynamic partitioning scheme. Show that, on average, the memory contains half as many holes as segments.

8.6 Suppose the page table for the process currently executing on the processor looks like the following. All numbers are decimal, everything is numbered starting from zero, and all addresses are memory byte addresses. The page size is 1024 bytes.

Virtual page number   Valid bit   Reference bit   Modify bit   Page frame number
0                     1           1               0           4
1                     1           1               1           7
2                     0           0               0           —
3                     1           0               0           2
4                     0           0               0           —
5                     1           0               1           0

a. Describe exactly how, in general, a virtual address generated by the CPU is trans- lated into a physical main memory address. b. What physical address, if any, would each of the following virtual addresses correspond to? (Do not try to handle any page faults, if any.) (i) 1052 (ii) 2221 (iii) 5499 8.7 Give reasons that the page size in a virtual memory system should be neither very small nor very large. 8.8 A process references five pages, A, B, C, D, and E, in the following order: A; B; C; D; A; B; E; A; B; C; D; E Assume that the replacement algorithm is first-in-first-out and find the number of page transfers during this sequence of references starting with an empty main memory with three page frames. Repeat for four page frames. 8.9 The following sequence of virtual page numbers is encountered in the course of execution on a computer with virtual memory: 3 4 2 6 4 7 1 3 2 6 3 5 1 2 3 Assume that a least recently used page replacement policy is adopted. Plot a graph of page hit ratio (fraction of page references in which the page is in main memory) as a function of main-memory page capacity n for 1 … n … 8. Assume that main memory is initially empty. 8.10 In the VAX computer, user page tables are located at virtual addresses in the system space. What is the advantage of having user page tables in virtual rather than main memory? What is the disadvantage? 8.11 Suppose the program statement for (i = 1; i 6 = n; i++) a[i] = b[i] + c[i]; is executed in a memory with page size of 1000 words. Let n = 1000. Using a machine that has a full range of register-to-register instructions and employs index registers, 8.7 / KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS 307

write a hypothetical program to implement the foregoing statement. Then show the sequence of page references during execution.
8.12 The IBM System/370 architecture uses a two-level memory structure and refers to the two levels as segments and pages, although the segmentation approach lacks many of the features described earlier in this chapter. For the basic 370 architecture, the page size may be either 2 Kbytes or 4 Kbytes, and the segment size is fixed at either 64 Kbytes or 1 Mbyte. For the 370/XA and 370/ESA architectures, the page size is 4 Kbytes and the segment size is 1 Mbyte. Which advantages of segmentation does this scheme lack? What is the benefit of segmentation for the 370?
8.13 Consider a computer system with both segmentation and paging. When a segment is in memory, some words are wasted on the last page. In addition, for a segment size s and a page size p, there are s/p page table entries. The smaller the page size, the less waste in the last page of the segment, but the larger the page table. What page size minimizes the total overhead?
8.14 A computer has a cache, main memory, and a disk used for virtual memory. If a referenced word is in the cache, 20 ns are required to access it. If it is in main memory but not in the cache, 60 ns are needed to load it into the cache, and then the reference is started again. If the word is not in main memory, 12 ms are required to fetch the word from disk, followed by 60 ns to copy it to the cache, and then the reference is started again. The cache hit ratio is 0.9 and the main-memory hit ratio is 0.6. What is the average time in ns required to access a referenced word on this system?
8.15 Assume a task is divided into four equal-sized segments and that the system builds an eight-entry page descriptor table for each segment. Thus, the system has a combination of segmentation and paging. Assume also that the page size is 2 Kbytes.
a. What is the maximum size of each segment?
b. What is the maximum logical address space for the task?
c. Assume that an element in physical location 00021ABC is accessed by this task. What is the format of the logical address that the task generates for it? What is the maximum physical address space for the system?
8.16 Assume a microprocessor capable of accessing up to 2^32 bytes of physical main memory. It implements one segmented logical address space of maximum size 2^31 bytes. Each instruction contains the whole two-part address. External memory management units (MMUs) are used, whose management scheme assigns contiguous blocks of physical memory of fixed size 2^22 bytes to segments. The starting physical address of a segment is always divisible by 1024. Show the detailed interconnection of the external mapping mechanism that converts logical addresses to physical addresses using the appropriate number of MMUs, and show the detailed internal structure of an MMU (assuming that each MMU contains a 128-entry directly mapped segment descriptor cache) and how each MMU is selected.
8.17 Consider a paged logical address space (composed of 32 pages of 2 Kbytes each) mapped into a 1-Mbyte physical memory space.
a. What is the format of the processor's logical address?
b. What is the length and width of the page table (disregarding the "access rights" bits)?
c. What is the effect on the page table if the physical memory space is reduced by half?
8.18 In IBM's mainframe operating system, OS/390, one of the major modules in the kernel is the System Resource Manager (SRM).
This module is responsible for the allocation of resources among address spaces (processes). The SRM gives OS/390 a degree of sophistication unique among operating systems. No other mainframe OS, and certainly no other type of OS, can match the functions performed by SRM. The concept of resource includes processor, real memory, and I/O channels. SRM accumulates statistics pertaining to utilization of processor, channel, and various key data structures. Its purpose is to provide optimum performance based on performance monitoring and analysis. The installation sets forth various performance objectives, and these

serve as guidance to the SRM, which dynamically modifies installation and job performance characteristics based on system utilization. In turn, the SRM provides reports that enable the trained operator to refine the configuration and parameter settings to improve user service. This problem concerns one example of SRM activity. Real memory is divided into equal-sized blocks called frames, of which there may be many thousands. Each frame can hold a block of virtual memory referred to as a page. SRM receives control approximately 20 times per second and inspects each and every page frame. If the page has not been referenced or changed, a counter is incremented by 1. Over time, SRM averages these numbers to determine the average number of seconds that a page frame in the system goes untouched. What might be the purpose of this and what action might SRM take?
8.19 For each of the ARM virtual address formats shown in Figure 8.24, show the physical address format.
8.20 Draw a figure similar to Figure 8.23 for ARM virtual memory translation when main memory is divided into sections.
PART THREE ARITHMETIC AND LOGIC CHAPTER

NUMBER SYSTEMS

9.1 The Decimal System
9.2 Positional Number Systems
9.3 The Binary System
9.4 Converting Between Binary and Decimal
    Integers
    Fractions
9.5 Hexadecimal Notation
9.6 Recommended Reading
9.7 Key Terms and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the basic concepts and terminology of positional number systems.
• Explain the techniques for converting between decimal and binary for both integers and fractions.
• Explain the rationale for using hexadecimal notation.

9.1 THE DECIMAL SYSTEM

In everyday life we use a system based on decimal digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) to represent numbers, and refer to the system as the decimal system. Consider what the number 83 means. It means eight tens plus three:
83 = (8 × 10) + 3
The number 4728 means four thousands, seven hundreds, two tens, plus eight:
4728 = (4 × 1000) + (7 × 100) + (2 × 10) + 8
The decimal system is said to have a base, or radix, of 10. This means that each digit in the number is multiplied by 10 raised to a power corresponding to that digit's position:
83 = (8 × 10^1) + (3 × 10^0)
4728 = (4 × 10^3) + (7 × 10^2) + (2 × 10^1) + (8 × 10^0)
The same principle holds for decimal fractions, but negative powers of 10 are used. Thus, the decimal fraction 0.256 stands for 2 tenths plus 5 hundredths plus 6 thousandths:
0.256 = (2 × 10^−1) + (5 × 10^−2) + (6 × 10^−3)
A number with both an integer and fractional part has digits raised to both positive and negative powers of 10:
442.256 = (4 × 10^2) + (4 × 10^1) + (2 × 10^0) + (2 × 10^−1) + (5 × 10^−2) + (6 × 10^−3)
In any number, the leftmost digit is referred to as the most significant digit, because it carries the highest value. The rightmost digit is called the least significant digit. In the preceding decimal number, the 4 on the left is the most significant digit and the 6 on the right is the least significant digit. Table 9.1 shows the relationship between each digit position and the value assigned to that position. Each position is weighted 10 times the value of the position to the right and one-tenth the value of the position to the left. Thus, positions represent successive powers of 10. If we number the positions as indicated in Table 9.1, then position i is weighted by the value 10^i.

Table 9.1 Positional Interpretation of a Decimal Number

Digit:      4            4            2            .   2            5            6
Value:      100s         10s          1s               tenths       hundredths   thousandths
Weight:     10^2         10^1         10^0             10^−1        10^−2        10^−3
Position:   position 2   position 1   position 0       position −1  position −2  position −3

In general, for the decimal representation of X = {... d_2 d_1 d_0 . d_−1 d_−2 d_−3 ...}, the value of X is
X = Σ_i (d_i × 10^i)     (9.1)
One other observation is worth making. Consider the number 509 and ask how many tens are in the number. Because there is a 0 in the tens position, you might be tempted to say there are no tens. But there are in fact 50 tens. What the 0 in the tens position means is that there are no tens left over that cannot be lumped into the hundreds, or thousands, and so on. Therefore, because each position holds only the leftover numbers that cannot be lumped into higher positions, each digit position needs to have a value of no greater than 9. Nine is the maximum value that a position can hold before it flips over into the next higher position.

9.2 POSITIONAL NUMBER SYSTEMS

In a positional number system, each number is represented by a string of digits in which each digit position i has an associated weight r^i, where r is the radix, or base, of the number system. The general form of a number in such a system with radix r is
(... a_3 a_2 a_1 a_0 . a_−1 a_−2 a_−3 ...)_r
where the value of any digit a_i is an integer in the range 0 ≤ a_i < r. The dot between a_0 and a_−1 is called the radix point. The number is defined to have the value
... + a_3 r^3 + a_2 r^2 + a_1 r^1 + a_0 r^0 + a_−1 r^−1 + a_−2 r^−2 + a_−3 r^−3 + ... = Σ_i (a_i × r^i)     (9.2)
The decimal system, then, is a special case of a positional number system with radix 10 and with digits in the range 0 through 9. As an example of another positional system, consider the system with base 7. Table 9.2 shows the weighting value for positions –1 through 4. In each position, the digit value ranges from 0 through 6.
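Equation (9.2) is easy to check numerically. The short Python sketch below is our own illustration (the function name and argument layout are not notation from the text); it evaluates a list of digit values in an arbitrary base, and the base-7 example anticipates Table 9.2.

# Sketch (our own illustration): evaluate Equation (9.2) for an arbitrary radix r.
# int_digits and frac_digits hold digit values, most significant digit first.
def positional_value(int_digits, frac_digits, radix):
    value = 0
    for d in int_digits:          # integer part, evaluated by repeated multiply-and-add
        value = value * radix + d
    weight = 1.0 / radix
    for d in frac_digits:         # fractional part uses negative powers of the radix
        value += d * weight
        weight /= radix
    return value

# (25.3) in base 7 = 2*7 + 5 + 3/7
print(positional_value([2, 5], [3], 7))   # 19.428571...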

Table 9.2 Positional Interpretation of a Number in Base 7

Position:                     4      3      2     1    0    −1
Value in Exponential Form:    7^4    7^3    7^2   7^1  7^0  7^−1
Decimal Value:                2401   343    49    7    1    1/7

9.3 THE BINARY SYSTEM

In the decimal system, 10 different digits are used to represent numbers with a base of 10. In the binary system, we have only two digits, 1 and 0. Thus, numbers in the binary system are represented to the base 2. To avoid confusion, we will sometimes put a subscript on a number to indicate its base. For example, 83_10 and 4728_10 are numbers represented in decimal notation or, more briefly, decimal numbers. The digits 1 and 0 in binary notation have the same meaning as in decimal notation:
0_2 = 0_10
1_2 = 1_10
To represent larger numbers, as with decimal notation, each digit in a binary number has a value depending on its position:
10_2 = (1 × 2^1) + (0 × 2^0) = 2_10
11_2 = (1 × 2^1) + (1 × 2^0) = 3_10
100_2 = (1 × 2^2) + (0 × 2^1) + (0 × 2^0) = 4_10
and so on. Again, fractional values are represented with negative powers of the radix:
1001.101_2 = 2^3 + 2^0 + 2^−1 + 2^−3 = 9.625_10
In general, for the binary representation of Y = {... b_2 b_1 b_0 . b_−1 b_−2 b_−3 ...}, the value of Y is

Y = Σ_i (b_i × 2^i)     (9.3)

9.4 CONVERTING BETWEEN BINARY AND DECIMAL

It is a simple matter to convert a number from binary notation to decimal notation. In fact, we showed several examples in the previous subsection. All that is required is to multiply each binary digit by the appropriate power of 2 and add the results. To convert from decimal to binary, the integer and fractional parts are handled separately.

Integers
For the integer part, recall that in binary notation, an integer represented by
b_(m−1) b_(m−2) ... b_2 b_1 b_0     where b_i = 0 or 1
has the value
(b_(m−1) × 2^(m−1)) + (b_(m−2) × 2^(m−2)) + ... + (b_1 × 2^1) + b_0

Suppose it is required to convert a decimal integer N into binary form. If we divide N by 2, in the decimal system, and obtain a quotient N1 and a remainder R0, we may write
N = 2 × N1 + R0     R0 = 0 or 1
Next, we divide the quotient N1 by 2. Assume that the new quotient is N2 and the new remainder R1. Then
N1 = 2 × N2 + R1     R1 = 0 or 1
so that
N = 2(2N2 + R1) + R0 = (N2 × 2^2) + (R1 × 2^1) + R0
If next
N2 = 2N3 + R2
we have
N = (N3 × 2^3) + (R2 × 2^2) + (R1 × 2^1) + R0
Because N > N1 > N2 ..., continuing this sequence will eventually produce a quotient N_(m−1) = 1 (except for the decimal integers 0 and 1, whose binary equivalents are 0 and 1, respectively) and a remainder R_(m−2), which is 0 or 1. Then
N = (1 × 2^(m−1)) + (R_(m−2) × 2^(m−2)) + ... + (R2 × 2^2) + (R1 × 2^1) + R0
which is the binary form of N. Hence, we convert from base 10 to base 2 by repeated divisions by 2. The remainders and the final quotient, 1, give us, in order of increasing significance, the binary digits of N. Figure 9.1 shows two examples.
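The repeated-division procedure translates directly into a few lines of Python; the function name is our own, and the two calls reproduce the examples of Figure 9.1.

# Sketch of the repeated-division conversion for integers (Section 9.4).
def decimal_int_to_binary(n):
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)    # divide by 2; the remainder is the next bit
        bits.append(str(remainder))    # remainders arrive least significant first
    return "".join(reversed(bits))

print(decimal_int_to_binary(11))   # '1011'  (Figure 9.1a)
print(decimal_int_to_binary(21))   # '10101' (Figure 9.1b)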

Fractions
For the fractional part, recall that in binary notation, a number with a value between 0 and 1 is represented by
0.b_−1 b_−2 b_−3 ...     where b_i = 0 or 1
and has the value
(b_−1 × 2^−1) + (b_−2 × 2^−2) + (b_−3 × 2^−3) + ...
This can be rewritten as
2^−1 × (b_−1 + 2^−1 × (b_−2 + 2^−1 × (b_−3 + ...)...))
This expression suggests a technique for conversion. Suppose we want to convert the number F (0 < F < 1) from decimal to binary notation. We know that F can be expressed in the form
F = 2^−1 × (b_−1 + 2^−1 × (b_−2 + 2^−1 × (b_−3 + ...)...))
If we multiply F by 2, we obtain
2 × F = b_−1 + 2^−1 × (b_−2 + 2^−1 × (b_−3 + ...)...)

Quotient   Remainder
11/2 =  5      1
 5/2 =  2      1
 2/2 =  1      0
 1/2 =  0      1
1011_2 = 11_10
(a) 11_10

Quotient   Remainder
21/2 = 10      1
10/2 =  5      0
 5/2 =  2      1
 2/2 =  1      0
 1/2 =  0      1
10101_2 = 21_10
(b) 21_10
Figure 9.1 Examples of Converting from Decimal Notation to Binary Notation for Integers

From this equation, we see that the integer part of (2 × F), which must be either 0 or 1 because 0 < F < 1, is simply b_−1. So we can say (2 × F) = b_−1 + F1, where 0 ≤ F1 < 1 and where
F1 = 2^−1 × (b_−2 + 2^−1 × (b_−3 + 2^−1 × (b_−4 + ...)...))

To find b−2, we repeat the process. Therefore, the conversion algorithm involves repeated multiplication by 2. At each step, the fractional part of the number from the previous step is multiplied by 2. The digit to the left of the decimal point in the product will be 0 or 1 and contributes to the binary representation, starting with the most significant digit. The fractional part of the product is used as the multiplicand in the next step. Figure 9.2 shows two examples. This process is not necessarily exact; that is, a decimal fraction with a finite number of digits may require a binary fraction with an infinite number of digits. In such cases, the conversion algorithm is usually halted after a prespecified number of steps, depending on the desired accuracy.

Product              Integer Part
0.81 × 2 = 1.62          1
0.62 × 2 = 1.24          1
0.24 × 2 = 0.48          0
0.48 × 2 = 0.96          0
0.96 × 2 = 1.92          1
0.92 × 2 = 1.84          1
(a) 0.81_10 = 0.110011_2 (approximately)

Product              Integer Part
0.25 × 2 = 0.5           0
0.5  × 2 = 1.0           1
(b) 0.25_10 = 0.01_2 (exactly)

Figure 9.2 Examples of Converting from Decimal Notation to Binary Notation for Fractions
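The repeated-multiplication procedure for fractions admits a similar sketch; the function name and the six-bit cut-off are our own choices, standing in for the "prespecified number of steps" mentioned above.

# Sketch of the repeated-multiplication conversion for fractions (Section 9.4).
def decimal_fraction_to_binary(f, max_bits=6):
    assert 0 <= f < 1
    bits = []
    while f != 0 and len(bits) < max_bits:   # halt after max_bits if not exact
        f *= 2
        bit = int(f)          # the integer part of the product is the next binary digit
        bits.append(str(bit))
        f -= bit              # the fractional part feeds the next step
    return "0." + "".join(bits)

print(decimal_fraction_to_binary(0.81))   # '0.110011' (approximately, Figure 9.2a)
print(decimal_fraction_to_binary(0.25))   # '0.01'     (exactly, Figure 9.2b)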

9.5 HEXADECIMAL NOTATION

Because of the inherent binary nature of digital computer components, all forms of data within computers are represented by various binary codes. However, no matter how convenient the binary system is for computers, it is exceedingly cumbersome for human beings. Consequently, most computer professionals who must spend time working with the actual raw data in the computer prefer a more compact notation. What notation to use? One possibility is the decimal notation. This is certainly more compact than binary notation, but it is awkward because of the tediousness of converting between base 2 and base 10. Instead, a notation known as hexadecimal has been adopted. Binary digits are grouped into sets of four bits, called a nibble. Each possible combination of four binary digits is given a symbol, as follows:

0000 = 0    0100 = 4    1000 = 8    1100 = C
0001 = 1    0101 = 5    1001 = 9    1101 = D
0010 = 2    0110 = 6    1010 = A    1110 = E
0011 = 3    0111 = 7    1011 = B    1111 = F

Because 16 symbols are used, the notation is called hexadecimal, and the 16 symbols are the hexadecimal digits. A sequence of hexadecimal digits can be thought of as representing an integer in base 16 (Table 9.3). Thus,
2C_16 = (2_16 × 16^1) + (C_16 × 16^0) = (2_10 × 16^1) + (12_10 × 16^0) = 44
Thus, viewing hexadecimal numbers as numbers in the positional number system with base 16, we have

Z = Σ_i (h_i × 16^i)     (9.4)
where 16 is the base and each hexadecimal digit h_i is in the decimal range 0 ≤ h_i ≤ 15, equivalent to the hexadecimal range 0 ≤ h_i ≤ F.

Table 9.3 Decimal, Binary, and Hexadecimal

Decimal (base 10)   Binary (base 2)    Hexadecimal (base 16)
0                   0000               0
1                   0001               1
2                   0010               2
3                   0011               3
4                   0100               4
5                   0101               5
6                   0110               6
7                   0111               7
8                   1000               8
9                   1001               9
10                  1010               A
11                  1011               B
12                  1100               C
13                  1101               D
14                  1110               E
15                  1111               F
16                  0001 0000          10
17                  0001 0001          11
18                  0001 0010          12
31                  0001 1111          1F
100                 0110 0100          64
255                 1111 1111          FF
256                 0001 0000 0000     100

Hexadecimal notation is not only used for representing integers but also used as a concise notation for representing any sequence of binary digits, whether they represent text, numbers, or some other type of data. The reasons for using hexadecimal notation are as follows:
1. It is more compact than binary notation.
2. In most computers, binary data occupy some multiple of 4 bits, and hence some multiple of a single hexadecimal digit.
3. It is extremely easy to convert between binary and hexadecimal notation.
As an example of the last point, consider the binary string 110111100001. This is equivalent to
1101 1110 0001 = DE1_16
  D    E    1
This process is performed so naturally that an experienced programmer can mentally convert visual representations of binary data to their hexadecimal equivalent without written effort.
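The nibble-by-nibble correspondence is just as mechanical in code; the helper names below are our own.

# Sketch: convert between binary strings and hexadecimal via 4-bit groups (nibbles).
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits):
    bits = bits.zfill((len(bits) + 3) // 4 * 4)         # pad on the left to whole nibbles
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)]    # one hexadecimal digit per nibble
                   for i in range(0, len(bits), 4))

def hex_to_binary(hex_string):
    return "".join(format(HEX_DIGITS.index(h), "04b")   # each hexadecimal digit -> 4 bits
                   for h in hex_string.upper())

print(binary_to_hex("110111100001"))   # 'DE1'
print(hex_to_binary("DE1"))            # '110111100001'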

9.6 RECOMMENDED READING

[KNUT98] provides an excellent discussion of positional number systems. [GREG98] also has a useful treatment of the subject.

GREG98 Gregg, J. Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of Sets. Piscataway, NJ: IEEE Press, 1998.
KNUT98 Knuth, D. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley, 1998.

9.7 KEY TERMS AND PROBLEMS

Key Terms

base, binary, decimal, fraction, hexadecimal, integer, least significant digit, most significant digit, nibble, positional number system, radix, radix point

Problems

9.1 Count from 1 to 20_10 in the following bases:
a. 8 b. 6 c. 5 d. 3
9.2 Order the numbers (1.1)_2, (1.4)_10, and (1.5)_16 from smallest to largest.

9.3 Perform the indicated base conversions:
a. 54_8 to base 5 b. 312_4 to base 7 c. 520_6 to base 7 d. 12212_3 to base 9
9.4 What generalizations can you draw about converting a number from one base to a power of that base, e.g., from base 3 to base 9 (3^2) or from base 2 to base 4 (2^2) or base 8 (2^3)?
9.5 Convert the following binary numbers to their decimal equivalents:
a. 001100 b. 000011 c. 011100 d. 111100 e. 101010
9.6 Convert the following binary numbers to their decimal equivalents:
a. 11100.011 b. 110011.10011 c. 1010101010.1
9.7 Convert the following decimal numbers to their binary equivalents:
a. 64 b. 100 c. 111 d. 145 e. 255
9.8 Convert the following decimal numbers to their binary equivalents:
a. 34.75 b. 25.25 c. 27.1875
9.9 Prove that every real number with a terminating binary representation (finite number of digits to the right of the binary point) also has a terminating decimal representation (finite number of digits to the right of the decimal point).
9.10 Express the following octal numbers (numbers with radix 8) in hexadecimal notation:
a. 12 b. 5655 c. 2550276 d. 76545336 e. 3726755
9.11 Convert the following hexadecimal numbers to their decimal equivalents:
a. C b. 9F c. D52 d. 67E e. ABCD
9.12 Convert the following hexadecimal numbers to their decimal equivalents:
a. F.4 b. D3.E c. 1111.1 d. 888.8 e. EBA.C
9.13 Convert the following decimal numbers to their hexadecimal equivalents:
a. 16 b. 80 c. 2560 d. 3000 e. 62,500
9.14 Convert the following decimal numbers to their hexadecimal equivalents:
a. 204.125 b. 255.875 c. 631.25 d. 10000.00390625
9.15 Convert the following hexadecimal numbers to their binary equivalents:
a. E b. 1C c. A64 d. 1F.C e. 239.4
9.16 Convert the following binary numbers to their hexadecimal equivalents:
a. 1001.1111 b. 110101.011001 c. 10100111.111011
CHAPTER

COMPUTER ARITHMETIC

10.1 The Arithmetic and Logic Unit
10.2 Integer Representation
    Sign-Magnitude Representation
    Twos Complement Representation
    Range Extension
    Fixed-Point Representation
10.3 Integer Arithmetic
    Negation
    Addition and Subtraction
    Multiplication
    Division
10.4 Floating-Point Representation
    Principles
    IEEE Standard for Binary Floating-Point Representation
10.5 Floating-Point Arithmetic
    Addition and Subtraction
    Multiplication and Division
    Precision Considerations
    IEEE Standard for Binary Floating-Point Arithmetic
10.6 Recommended Reading
10.7 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the distinction between the way in which numbers are represented (the binary format) and the algorithms used for the basic arithmetic operations.
• Explain twos complement representation.
• Present an overview of the techniques for doing basic arithmetic operations in twos complement notation.
• Understand the use of significand, base, and exponent in the representation of floating-point numbers.
• Present an overview of the IEEE 754 standard for floating-point representation.
• Understand some of the key concepts related to floating-point arithmetic, including guard bits, rounding, subnormal numbers, underflow and overflow.

We begin our examination of the processor with an overview of the arithmetic and logic unit (ALU). The chapter then focuses on the most complex aspect of the ALU, computer arithmetic. The logic functions that are part of the ALU are described in Chapter 12, and implementations of simple logic and arithmetic functions in digital logic are described in Chapter 11. Computer arithmetic is commonly performed on two very different types of numbers: integer and floating point. In both cases, the representation chosen is a crucial design issue and is treated first, followed by a discussion of arithmetic operations. This chapter includes a number of examples, each of which is highlighted in a shaded box.

10.1 THE ARITHMETIC AND LOGIC UNIT

The ALU is that part of the computer that actually performs arithmetic and logical operations on data. All of the other elements of the computer system—control unit, registers, memory, I/O—are there mainly to bring data into the ALU for it to process and then to take the results back out. We have, in a sense, reached the core or essence of a computer when we consider the ALU.
An ALU and, indeed, all electronic components in the computer are based on the use of simple digital logic devices that can store binary digits and perform simple Boolean logic operations.
Figure 10.1 indicates, in general terms, how the ALU is interconnected with the rest of the processor. Operands for arithmetic and logic operations are presented to the ALU in registers, and the results of an operation are stored in registers. These registers are temporary storage locations within the processor that are connected by signal paths to the ALU (e.g., see Figure 2.3). The ALU may also set flags as the result of an operation. For example, an overflow flag is set to 1 if the result of a computation exceeds the length of the register into which it is to be stored.


Figure 10.1 ALU Inputs and Outputs

The flag values are also stored in registers within the processor. The processor provides signals that control the operation of the ALU and the movement of the data into and out of the ALU.

10.2 INTEGER REPRESENTATION

In the binary number system,1 arbitrary numbers can be represented with just the digits zero and one, the minus sign (for negative numbers), and the period, or radix point (for numbers with a fractional component).

−1101.0101_2 = −13.3125_10

For purposes of computer storage and processing, however, we do not have the benefit of special symbols for the minus sign and radix point. Only binary digits (0 and 1) may be used to represent numbers. If we are limited to nonnegative integers, the representation is straightforward.

An 8-bit word can represent the numbers from 0 to 255, such as 00000000 = 0 00000001 = 1 00101001 = 41 10000000 = 128 11111111 = 255

In general, if an n-bit sequence of binary digits a_(n−1) a_(n−2) ... a_1 a_0 is interpreted as an unsigned integer A, its value is
A = Σ_{i=0}^{n−1} 2^i a_i

1See Chapter 9 for a basic refresher on number systems (decimal, binary, hexadecimal).

Sign-Magnitude Representation
There are several alternative conventions used to represent negative as well as positive integers, all of which involve treating the most significant (leftmost) bit in the word as a sign bit. If the sign bit is 0, the number is positive; if the sign bit is 1, the number is negative. The simplest form of representation that employs a sign bit is the sign-magnitude representation. In an n-bit word, the rightmost n − 1 bits hold the magnitude of the integer.

+18 = 00010010 -18 = 10010010 (sign magnitude)

The general case can be expressed as follows:

Sign Magnitude:  A = Σ_{i=0}^{n−2} 2^i a_i          if a_(n−1) = 0
                 A = −Σ_{i=0}^{n−2} 2^i a_i         if a_(n−1) = 1     (10.1)
There are several drawbacks to sign-magnitude representation. One is that addition and subtraction require a consideration of both the signs of the numbers and their relative magnitudes to carry out the required operation. This should become clear in the discussion in Section 10.3. Another drawback is that there are two representations of 0:

+0_10 = 00000000
−0_10 = 10000000 (sign magnitude)

This is inconvenient because it is slightly more difficult to test for 0 (an operation performed frequently on computers) than if there were a single representation. Because of these drawbacks, sign-magnitude representation is rarely used in implementing the integer portion of the ALU. Instead, the most common scheme is twos complement representation.2 Twos Complement Representation Like sign magnitude, twos complement representation uses the most significant bit as a sign bit, making it easy to test whether an integer is positive or negative. It differs from the use of the sign-magnitude representation in the way that the other bits are interpreted. Table 10.1 highlights key characteristics of twos complement representation and arithmetic, which are elaborated in this section and the next. Most treatments of twos complement representation focus on the rules for producing negative numbers, with no formal proof that the scheme is valid. Instead,

2In the literature, the terms two’s complement or 2’s complement are often used. Here we follow the practice used in standards documents and omit the apostrophe (e.g., IEEE Std 100-1992, The New IEEE Standard Dictionary of Electrical and Electronics Terms).

Table 10.1 Characteristics of Twos Complement Representation and Arithmetic

Range: −2^(n−1) through 2^(n−1) − 1
Number of Representations of Zero: One
Negation: Take the Boolean complement of each bit of the corresponding positive number, then add 1 to the resulting bit pattern viewed as an unsigned integer.

Expansion of Bit Length: Add additional bit positions to the left and fill in with the value of the original sign bit.
Overflow Rule: If two numbers with the same sign (both positive or both negative) are added, then overflow occurs if and only if the result has the opposite sign.

Subtraction Rule: To subtract B from A, take the twos complement of B and add it to A.

our presentation of twos complement integers in this section and in Section 10.3 is based on [DATT93], which suggests that twos complement representation is best understood by defining it in terms of a weighted sum of bits, as we did previously for unsigned and sign-magnitude representations. The advantage of this treatment is that it does not leave any lingering doubt that the rules for arithmetic operations in twos complement notation may not work for some special cases.
Consider an n-bit integer, A, in twos complement representation. If A is positive, then the sign bit, a_(n−1), is zero. The remaining bits represent the magnitude of the number in the same fashion as for sign magnitude:
A = Σ_{i=0}^{n−2} 2^i a_i     for A ≥ 0
The number zero is identified as positive and therefore has a 0 sign bit and a magnitude of all 0s. We can see that the range of positive integers that may be represented is from 0 (all of the magnitude bits are 0) through 2^(n−1) − 1 (all of the magnitude bits are 1). Any larger number would require more bits.
Now, for a negative number A (A < 0), the sign bit, a_(n−1), is one. The remaining n − 1 bits can take on any one of 2^(n−1) values. Therefore, the range of negative integers that can be represented is from −1 to −2^(n−1). We would like to assign the bit values to negative integers in such a way that arithmetic can be handled in a straightforward fashion, similar to unsigned integer arithmetic. In unsigned integer representation, to compute the value of an integer from the bit representation, the weight of the most significant bit is +2^(n−1). For a representation with a sign bit, it turns out that the desired arithmetic properties are achieved, as we will see in Section 10.3, if the weight of the most significant bit is −2^(n−1). This is the convention used in twos complement representation, yielding the following expression for negative numbers:
Twos Complement:  A = −2^(n−1) a_(n−1) + Σ_{i=0}^{n−2} 2^i a_i     (10.2)
Equation (10.2) defines the twos complement representation for both positive and negative numbers. For a_(n−1) = 0, the term −2^(n−1) a_(n−1) = 0 and the equation defines a nonnegative integer. When a_(n−1) = 1, the term 2^(n−1) is subtracted from the summation term, yielding a negative integer.

Table 10.2 Alternative Representations for 4-Bit Integers

Decimal          Sign-Magnitude    Twos Complement    Biased
Representation   Representation    Representation     Representation
+8               —                 —                  1111
+7               0111              0111               1110
+6               0110              0110               1101
+5               0101              0101               1100
+4               0100              0100               1011
+3               0011              0011               1010
+2               0010              0010               1001
+1               0001              0001               1000
+0               0000              0000               0111
−0               1000              —                  —
−1               1001              1111               0110
−2               1010              1110               0101
−3               1011              1101               0100
−4               1100              1100               0011
−5               1101              1011               0010
−6               1110              1010               0001
−7               1111              1001               0000
−8               —                 1000               —

Table 10.2 compares the sign-magnitude and twos complement representations for 4-bit integers. Although twos complement is an awkward representation from the human point of view, we will see that it facilitates the most important arithmetic operations, addition and subtraction. For this reason, it is almost universally used as the processor representation for integers.
A useful illustration of the nature of twos complement representation is a value box, in which the value on the far right in the box is 1 (2^0) and each succeeding position to the left is double in value, until the leftmost position, which is negated. As you can see in Figure 10.2a, the most negative twos complement number that can be represented is −2^(n−1); if any of the bits other than the sign bit is one, it adds a positive amount to the number. Also, it is clear that a negative number must have a 1 at its leftmost position and a positive number must have a 0 in that position. Thus, the largest positive number is a 0 followed by all 1s, which equals 2^(n−1) − 1. The rest of Figure 10.2 illustrates the use of the value box to convert from twos complement to decimal and from decimal to twos complement.
Range Extension
It is sometimes desirable to take an n-bit integer and store it in m bits, where m > n. This expansion of bit length is referred to as range extension, because the range of numbers that can be expressed is extended by increasing the bit length.

−128  64  32  16  8  4  2  1
(a) An eight-position twos complement value box

−128  64  32  16  8  4  2  1
   1   0   0   0  0  0  1  1
−128 + 2 + 1 = −125
(b) Convert binary 10000011 to decimal

−128  64  32  16  8  4  2  1
   1   0   0   0  1  0  0  0
−120 = −128 + 8
(c) Convert decimal −120 to binary
Figure 10.2 Use of a Value Box for Conversion between Twos Complement Binary and Decimal
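Equation (10.2), and hence the value box of Figure 10.2, can be checked with a short Python sketch; this is our own illustration, not part of the text.

# Sketch: interpret an n-bit string as a twos complement integer (Equation 10.2).
def twos_complement_value(bits):
    n = len(bits)
    value = -int(bits[0]) * (1 << (n - 1))     # the sign bit carries weight -2^(n-1)
    for i, b in enumerate(bits[1:], start=1):
        value += int(b) * (1 << (n - 1 - i))   # the remaining bits carry positive weights
    return value

print(twos_complement_value("10000011"))   # -125, as in Figure 10.2b
print(twos_complement_value("00010010"))   # +18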

In sign-magnitude notation, this is easily accomplished: simply move the sign bit to the new leftmost position and fill in with zeros.

+18 = 00010010 (sign magnitude, 8 bits) +18 = 0000000000010010 (sign magnitude, 16 bits) -18 = 10010010 (sign magnitude, 8 bits) -18 = 1000000000010010 (sign magnitude, 16 bits)

This procedure will not work for twos complement negative integers. Using the same example,

+18 = 00010010 (twos complement, 8 bits) +18 = 0000000000010010 (twos complement, 16 bits) -18 = 11101110 (twos complement, 8 bits) -32,658 = 1000000001101110 (twos complement, 16 bits) The next to last line is easily seen using the value box of Figure 10.2. The last line can be verified using Equation (10.2) or a 16-bit value box.

Instead, the rule for twos complement integers is to move the sign bit to the new leftmost position and fill in with copies of the sign bit. For positive numbers, fill in with zeros, and for negative numbers, fill in with ones. This is called sign extension.

-18 = 11101110 (twos complement, 8 bits) -18 = 1111111111101110 (twos complement, 16 bits)
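A minimal sketch of the sign-extension rule just stated, operating on bit strings (the function name is our own):

# Sketch: extend an n-bit twos complement pattern to m bits (sign extension).
def sign_extend(bits, m):
    assert m >= len(bits)
    return bits[0] * (m - len(bits)) + bits   # fill on the left with copies of the sign bit

print(sign_extend("00010010", 16))   # '0000000000010010'  (+18 remains +18)
print(sign_extend("11101110", 16))   # '1111111111101110'  (-18 remains -18)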

To see why this rule works, let us again consider an n-bit sequence of binary digits a_(n−1) a_(n−2) ... a_1 a_0 interpreted as a twos complement integer A, so that its value is
A = −2^(n−1) a_(n−1) + Σ_{i=0}^{n−2} 2^i a_i
If A is a positive number, the rule clearly works. Now, if A is negative and we want to construct an m-bit representation, with m > n, then
A = −2^(m−1) a_(m−1) + Σ_{i=0}^{m−2} 2^i a_i
The two values must be equal:
−2^(m−1) + Σ_{i=0}^{m−2} 2^i a_i = −2^(n−1) + Σ_{i=0}^{n−2} 2^i a_i
−2^(m−1) + Σ_{i=n−1}^{m−2} 2^i a_i = −2^(n−1)
2^(n−1) + Σ_{i=n−1}^{m−2} 2^i a_i = 2^(m−1)
1 + Σ_{i=0}^{n−2} 2^i + Σ_{i=n−1}^{m−2} 2^i a_i = 1 + Σ_{i=0}^{m−2} 2^i
Σ_{i=n−1}^{m−2} 2^i a_i = Σ_{i=n−1}^{m−2} 2^i
a_(m−2) = ... = a_(n−1) = 1
In going from the first to the second equation, we require that the least significant n − 1 bits do not change between the two representations. Then we get to the next to last equation, which is only true if all of the bits in positions n − 1 through m − 2 are 1. Therefore, the sign-extension rule works. The reader may find the rule easier to grasp after studying the discussion on twos complement negation at the beginning of Section 10.3.
Fixed-Point Representation
Finally, we mention that the representations discussed in this section are sometimes referred to as fixed point. This is because the radix point (binary point) is fixed and assumed to be to the right of the rightmost digit. The programmer can use the same representation for binary fractions by scaling the numbers so that the binary point is implicitly positioned at some other location.

10.3 INTEGER ARITHMETIC

This section examines common arithmetic functions on numbers in twos complement representation.

Negation In sign-magnitude representation, the rule for forming the negation of an integer is simple: invert the sign bit. In twos complement notation, the negation of an integer can be formed with the following rules: 1. Take the Boolean complement of each bit of the integer (including the sign bit). That is, set each 1 to 0 and each 0 to 1. 2. Treating the result as an unsigned binary integer, add 1. This two-step process is referred to as the twos complement operation, or the taking of the twos complement of an integer.

+18 = 00010010 (twos complement) bitwise complement = 11101101 + 1 11101110 = -18

As expected, the negative of the negative of that number is itself:

-18 = 11101110 (twos complement) bitwise complement = 00010001 + 1 00010010 = +18
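The two-step negation rule can be sketched on bit strings as follows (the helper name is our own); any carry out of the most significant bit is simply discarded, a point taken up in the special cases discussed next.

# Sketch of the twos complement operation: complement every bit, then add 1,
# keeping only the low n bits (a carry out of the sign position is discarded).
def twos_complement_negate(bits):
    n = len(bits)
    complemented = "".join("1" if b == "0" else "0" for b in bits)
    value = (int(complemented, 2) + 1) % (1 << n)   # add 1 as an unsigned integer, mod 2^n
    return format(value, "0{}b".format(n))

print(twos_complement_negate("00010010"))   # '11101110'  (+18 -> -18)
print(twos_complement_negate("11101110"))   # '00010010'  (-18 -> +18)
print(twos_complement_negate("00000000"))   # '00000000'  (negation of 0 is 0)
print(twos_complement_negate("10000000"))   # '10000000'  (the -128 anomaly)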

We can demonstrate the validity of the operation just described using the definition of the twos complement representation in Equation (10.2). Again, interpret an n-bit sequence of binary digits a_(n−1) a_(n−2) ... a_1 a_0 as a twos complement integer A, so that its value is
A = −2^(n−1) a_(n−1) + Σ_{i=0}^{n−2} 2^i a_i
Now form the bitwise complement, a′_(n−1) a′_(n−2) ... a′_0, and, treating this as an unsigned integer, add 1. Finally, interpret the resulting n-bit sequence of binary digits as a twos complement integer B, so that its value is
B = −2^(n−1) a′_(n−1) + 1 + Σ_{i=0}^{n−2} 2^i a′_i
Now, we want A = −B, which means A + B = 0. This is easily shown to be true:
A + B = −(a_(n−1) + a′_(n−1)) 2^(n−1) + 1 + Σ_{i=0}^{n−2} 2^i (a_i + a′_i)
      = −2^(n−1) + 1 + Σ_{i=0}^{n−2} 2^i
      = −2^(n−1) + 1 + (2^(n−1) − 1) = −2^(n−1) + 2^(n−1) = 0

The preceding derivation assumes that we can first treat the bitwise complement of A as an unsigned integer for the purpose of adding 1, and then treat the result as a twos complement integer. There are two special cases to consider. First, consider A = 0. In that case, for an 8-bit representation:

0 = 00000000 (twos complement) bitwise complement = 11111111 + 1 100000000 = 0

There is a carry out of the most significant bit position, which is ignored. The result is that the negation of 0 is 0, as it should be. The second special case is more of a problem. If we take the negation of the bit pattern of 1 followed by n - 1 zeros, we get back the same number. For example, for 8-bit words,

-128 = 10000000 (twos complement) bitwise complement = 01111111 + 1 10000000 = -128

Some such anomaly is unavoidable. The number of different bit patterns in an n-bit word is 2^n, which is an even number. We wish to represent positive and negative integers and 0. If an equal number of positive and negative integers are represented (sign magnitude), then there are two representations for 0. If there is only one representation of 0 (twos complement), then there must be an unequal number of negative and positive numbers represented. In the case of twos complement, for an n-bit length, there is a representation for −2^(n−1) but not for +2^(n−1).
Addition and Subtraction
Addition in twos complement is illustrated in Figure 10.3. Addition proceeds as if the two numbers were unsigned integers. The first four examples illustrate successful operations. If the result of the operation is positive, we get a positive number in twos complement form, which is the same as in unsigned-integer form. If the result of the operation is negative, we get a negative number in twos complement form. Note that, in some instances, there is a carry bit beyond the end of the word (indicated by shading), which is ignored.
On any addition, the result may be larger than can be held in the word size being used. This condition is called overflow. When overflow occurs, the ALU must signal this fact so that no attempt is made to use the result. To detect overflow, the following rule is observed:

OVERFLOW RULE: If two numbers are added, and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.

Figure 10.3 Addition of Numbers in Twos Complement Representation
Figures 10.3e and f show examples of overflow. Note that overflow can occur whether or not there is a carry. Subtraction is easily handled with the following rule:

SUBTRACTION RULE: To subtract one number (subtrahend) from another (minuend), take the twos complement (negation) of the subtrahend and add it to the minuend.

Thus, subtraction is achieved using addition, as illustrated in Figure 10.4. The last two examples demonstrate that the overflow rule still applies.

Figure 10.4 Subtraction of Numbers in Twos Complement Representation (M − S)
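The overflow and subtraction rules can be expressed compactly in code. The following Python sketch is our own illustration: it treats operands as n-bit patterns and inspects the sign bits exactly as the rules require.

# Sketch: n-bit twos complement addition with the overflow rule, and subtraction
# implemented as "negate the subtrahend and add" (as in Figures 10.3 and 10.4).
def add_n_bits(a, b, n):
    mask = (1 << n) - 1
    def sign(x):
        return (x >> (n - 1)) & 1
    result = (a + b) & mask                     # any carry out of bit n-1 is discarded
    overflow = sign(a) == sign(b) and sign(result) != sign(a)
    return result, overflow

def sub_n_bits(minuend, subtrahend, n):
    mask = (1 << n) - 1
    return add_n_bits(minuend, (~subtrahend + 1) & mask, n)   # add the twos complement

# Both operands positive but the 4-bit result pattern is negative: overflow.
print(add_n_bits(0b0101, 0b0100, 4))   # (9, True)   0101 + 0100 = 1001
# 2 - 7 = -5, which is the 4-bit twos complement pattern 1011 (decimal 11).
print(sub_n_bits(0b0010, 0b0111, 4))   # (11, False)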

Figure 10.5 Geometric Depiction of Twos Complement Integers: (a) 4-bit numbers; (b) n-bit numbers

Some insight into twos complement addition and subtraction can be gained by looking at a geometric depiction [BENH92], as shown in Figure 10.5. The circle in the upper half of each part of the figure is formed by selecting the appropriate segment of the number line and joining the endpoints. Note that when the numbers are laid out on a circle, the twos complement of any number is horizontally opposite that number (indicated by dashed horizontal lines). Starting at any number on the circle, we can add positive k (or subtract negative k) to that number by moving k positions clockwise, and we can subtract positive k (or add negative k) from that number by moving k positions counterclockwise. If an arithmetic operation results in traversal of the point where the endpoints are joined, an incorrect answer is given (overflow).

All of the examples of Figures 10.3 and 10.4 are easily traced in the circle of Figure 10.5.

Figure 10.6 suggests the data paths and hardware elements needed to accomplish addition and subtraction. The central element is a binary adder, which is presented two numbers for addition and produces a sum and an overflow indication. The binary adder treats the two numbers as unsigned integers. (A logic implementation of an adder is given in Chapter 11.) For addition, the two numbers are presented to the adder from two registers, designated in this case as A and B registers. The result may be stored in one of these registers or in a third. The overflow indication is stored in a 1-bit overflow flag (0 = no overflow; 1 = overflow). For subtraction, the subtrahend (B register) is passed through a twos complementer so that its twos complement is presented to the adder. Note that Figure 10.6 shows only the data paths; control signals are needed to control whether or not the complementer is used, depending on whether the operation is addition or subtraction.


Figure 10.6 Block Diagram of Hardware for Addition and Subtraction (OF = overflow bit; SW = switch that selects addition or subtraction)
Multiplication
Compared with addition and subtraction, multiplication is a complex operation, whether performed in hardware or software. A wide variety of algorithms have been used in various computers. The purpose of this subsection is to give the reader some feel for the type of approach typically taken. We begin with the simpler problem of multiplying two unsigned (nonnegative) integers, and then we look at one of the most common techniques for multiplication of numbers in twos complement representation.
UNSIGNED INTEGERS
Figure 10.7 illustrates the multiplication of unsigned binary integers, as might be carried out using paper and pencil. Several important observations can be made:
1. Multiplication involves the generation of partial products, one for each digit in the multiplier. These partial products are then summed to produce the final product.

      1011     Multiplicand (11)
    × 1101     Multiplier (13)
      1011
     0000
    1011       Partial products
   1011
  10001111     Product (143)

Figure 10.7 Multiplication of Unsigned Binary Integers

2. The partial products are easily defined. When the multiplier bit is 0, the partial product is 0. When the multiplier bit is 1, the partial product is the multiplicand.
3. The total product is produced by summing the partial products. For this operation, each successive partial product is shifted one position to the left relative to the preceding partial product.
4. The multiplication of two n-bit binary integers results in a product of up to 2n bits in length (e.g., 11 × 11 = 1001).
Compared with the pencil-and-paper approach, there are several things we can do to make computerized multiplication more efficient. First, we can perform a running addition on the partial products rather than waiting until the end. This eliminates the need for storage of all the partial products; fewer registers are needed. Second, we can save some time on the generation of partial products. For each 1 on the multiplier, an add and a shift operation are required; but for each 0, only a shift is required.
Figure 10.8a shows a possible implementation employing these measures. The multiplier and multiplicand are loaded into two registers (Q and M). A third register, the A register, is also needed and is initially set to 0. There is also a 1-bit C register, initialized to 0, which holds a potential carry bit resulting from addition.

(a) Block diagram (multiplicand in M, multiplier in Q; an n-bit adder and shift-and-add control logic shift C, A, Q right one bit per cycle)

(b) Example from Figure 10.7 (product in A, Q):
C   A      Q      M
0   0000   1101   1011   Initial values
0   1011   1101   1011   Add      First cycle
0   0101   1110   1011   Shift    First cycle
0   0010   1111   1011   Shift    Second cycle
0   1101   1111   1011   Add      Third cycle
0   0110   1111   1011   Shift    Third cycle
1   0001   1111   1011   Add      Fourth cycle
0   1000   1111   1011   Shift    Fourth cycle
Figure 10.8 Hardware Implementation of Unsigned Binary Multiplication
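A Python sketch of the shift-and-add procedure of Figures 10.8 and 10.9; the register names follow the figures, while the function itself is our own illustration.

# Sketch of the unsigned shift-and-add multiplier of Figures 10.8 and 10.9.
# M holds the multiplicand and Q the multiplier; C, A, Q are shifted right as a unit.
def unsigned_multiply(multiplicand, multiplier, n):
    mask = (1 << n) - 1
    M, Q, A, C = multiplicand & mask, multiplier & mask, 0, 0
    for _ in range(n):
        if Q & 1:                                  # Q0 = 1: add M to A, carry into C
            total = A + M
            C, A = total >> n, total & mask
        combined = (C << (2 * n)) | (A << n) | Q   # shift right C, A, Q one bit
        combined >>= 1
        C, A, Q = 0, (combined >> n) & mask, combined & mask
    return (A << n) | Q                            # 2n-bit product held in A, Q

print(unsigned_multiply(0b1011, 0b1101, 4))   # 143, i.e. 11 * 13 (Figure 10.7)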

[Flowchart: C, A ← 0; M ← Multiplicand; Q ← Multiplier; Count ← n. Repeat: if Q0 = 1 then C, A ← A + M; shift right C, A, Q; Count ← Count − 1; until Count = 0. Product in A, Q.]

Figure 10.9 Flowchart for Unsigned Binary Multiplication
The operation of the multiplier is as follows. Control logic reads the bits of the multiplier one at a time. If Q0 is 1, then the multiplicand is added to the A register and the result is stored in the A register, with the C bit used for overflow. Then all of the bits of the C, A, and Q registers are shifted to the right one bit, so that the C bit goes into A_(n−1), A_0 goes into Q_(n−1), and Q_0 is lost. If Q_0 is 0, then no addition is performed, just the shift. This process is repeated for each bit of the original multiplier. The resulting 2n-bit product is contained in the A and Q registers. A flowchart of the operation is shown in Figure 10.9, and an example is given in Figure 10.8b. Note that on the second cycle, when the multiplier bit is 0, there is no add operation.
TWOS COMPLEMENT MULTIPLICATION
We have seen that addition and subtraction can be performed on numbers in twos complement notation by treating them as unsigned integers. Consider
  1001
+ 0011
  1100
If these numbers are considered to be unsigned integers, then we are adding 9 (1001) plus 3 (0011) to get 12 (1100). As twos complement integers, we are adding −7 (1001) to 3 (0011) to get −4 (1100).

  1011
× 1101
  00001011     1011 × 1 × 2^0
  00000000     1011 × 0 × 2^1
  00101100     1011 × 1 × 2^2
  01011000     1011 × 1 × 2^3
  10001111

Figure 10.10 Multiplication of Two Unsigned 4-Bit Integers Yielding an 8-Bit Result

Unfortunately, this simple scheme will not work for multiplication. To see this, consider again Figure 10.7. We multiplied 11 (1011) by 13 (1101) to get 143 (10001111). If we interpret these as twos complement numbers, we have −5 (1011) times −3 (1101) equals −113 (10001111). This example demonstrates that straightforward multiplication will not work if both the multiplicand and multiplier are negative. In fact, it will not work if either the multiplicand or the multiplier is negative. To justify this statement, we need to go back to Figure 10.7 and explain what is being done in terms of operations with powers of 2. Recall that any unsigned binary number can be expressed as a sum of powers of 2. Thus,
1101 = 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 1 × 2^0 = 2^3 + 2^2 + 2^0
Further, the multiplication of a binary number by 2^n is accomplished by shifting that number to the left n bits. With this in mind, Figure 10.10 recasts Figure 10.7 to make the generation of partial products by multiplication explicit. The only difference in Figure 10.10 is that it recognizes that the partial products should be viewed as 2n-bit numbers generated from the n-bit multiplicand. Thus, as an unsigned integer, the 4-bit multiplicand 1011 is stored in an 8-bit word as 00001011. Each partial product (other than that for 2^0) consists of this number shifted to the left, with the unoccupied positions on the right filled with zeros (e.g., a shift to the left of two places yields 00101100).
Now we can demonstrate that straightforward multiplication will not work if the multiplicand is negative. The problem is that each contribution of the negative multiplicand as a partial product must be a negative number on a 2n-bit field; the sign bits of the partial products must line up. This is demonstrated in Figure 10.11, which shows the multiplication of 1001 by 0011. If these are treated as unsigned integers, the multiplication 9 × 3 = 27 proceeds simply. However, if 1001 is interpreted as the twos complement value −7, then each partial product must be a negative twos complement number of 2n (that is, 8) bits, as shown in Figure 10.11b. Note that this is accomplished by padding out each partial product to the left with binary 1s.

(a) Unsigned integers:
  1001  (9)
× 0011  (3)
  00001001     1001 × 2^0
  00010010     1001 × 2^1
  00011011     (27)

(b) Twos complement integers:
  1001  (−7)
× 0011  (3)
  11111001     (−7) × 2^0 = (−7)
  11110010     (−7) × 2^1 = (−14)
  11101011     (−21)

Figure 10.11 Comparison of Multiplication of Unsigned and Twos Complement Integers
If the multiplier is negative, straightforward multiplication also will not work. The reason is that the bits of the multiplier no longer correspond to the shifts or multiplications that must take place. For example, the 4-bit decimal number −3 is written 1101 in twos complement. If we simply took partial products based on each bit position, we would have the following correspondence:
1101 → −(1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 1 × 2^0) = −(2^3 + 2^2 + 2^0)
In fact, what is desired is −(2^1 + 2^0). So this multiplier cannot be used directly in the manner we have been describing.
There are a number of ways out of this dilemma. One would be to convert both multiplier and multiplicand to positive numbers, perform the multiplication, and then take the twos complement of the result if and only if the sign of the two original numbers differed. Implementers have preferred to use techniques that do not require this final transformation step. One of the most common of these is Booth's algorithm. This algorithm also has the benefit of speeding up the multiplication process, relative to a more straightforward approach.
Booth's algorithm is depicted in Figure 10.12 and can be described as follows. As before, the multiplier and multiplicand are placed in the Q and M registers,

[Flowchart: A ← 0, Q−1 ← 0; M ← Multiplicand; Q ← Multiplier; Count ← n. Repeat: examine Q0, Q−1; if the pair is 1–0 then A ← A − M, if it is 0–1 then A ← A + M (no change for 0–0 or 1–1); arithmetic shift right A, Q, Q−1; Count ← Count − 1; until Count = 0.]

Figure 10.12 Booth’s Algorithm for Twos Complement Multiplication

A      Q      Q−1   M
0000   0011   0     0111   Initial values
1001   0011   0     0111   A ← A − M   First cycle
1100   1001   1     0111   Shift       First cycle
1110   0100   1     0111   Shift       Second cycle
0101   0100   1     0111   A ← A + M   Third cycle
0010   1010   0     0111   Shift       Third cycle
0001   0101   0     0111   Shift       Fourth cycle

Figure 10.13 Example of Booth’s Algorithm (7 * 3)

respectively. There is also a 1-bit register placed logically to the right of the least significant bit (Q0) of the Q register and designated Q−1; its use is explained shortly. The results of the multiplication will appear in the A and Q registers. A and Q−1 are initialized to 0. As before, control logic scans the bits of the multiplier one at a time. Now, as each bit is examined, the bit to its right is also examined. If the two bits are the same (1–1 or 0–0), then all of the bits of the A, Q, and Q−1 registers are shifted to the right 1 bit. If the two bits differ, then the multiplicand is added to or subtracted from the A register, depending on whether the two bits are 0–1 or 1–0. Following the addition or subtraction, the right shift occurs. In either case, the right shift is such that the leftmost bit of A, namely A_(n−1), not only is shifted into A_(n−2), but also remains in A_(n−1). This is required to preserve the sign of the number in A and Q. It is known as an arithmetic shift, because it preserves the sign bit.
Figure 10.13 shows the sequence of events in Booth's algorithm for the multiplication of 7 by 3. More compactly, the same operation is depicted in Figure 10.14a. The rest of Figure 10.14 gives other examples of the algorithm. As can be seen, it works with any combination of positive and negative numbers. Note also the efficiency of the algorithm. Blocks of 1s or 0s are skipped over, with an average of only one addition or subtraction per block.

(a) (7) × (3) = (21):
  0111 × 0011 (0):  11111001 (1–0), 0000000 (1–1), 000111 (0–1) → 00010101 (21)
(b) (7) × (−3) = (−21):
  0111 × 1101 (0):  11111001 (1–0), 0000111 (0–1), 111001 (1–0) → 11101011 (−21)
(c) (−7) × (3) = (−21):
  1001 × 0011 (0):  00000111 (1–0), 0000000 (1–1), 111001 (0–1) → 11101011 (−21)
(d) (−7) × (−3) = (21):
  1001 × 1101 (0):  00000111 (1–0), 1111001 (0–1), 000111 (1–0) → 00010101 (21)
Figure 10.14 Examples Using Booth’s Algorithm
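Booth's algorithm as flowcharted in Figure 10.12 can be sketched directly in Python; the function below is our own illustration, and the calls reproduce the register trace of Figure 10.13 and the results of Figure 10.14.

# Sketch of Booth's algorithm (Figure 10.12) for n-bit twos complement operands.
def booth_multiply(multiplicand, multiplier, n):
    mask = (1 << n) - 1
    M, Q, A, Q_1 = multiplicand & mask, multiplier & mask, 0, 0
    for _ in range(n):
        pair = (Q & 1, Q_1)
        if pair == (1, 0):
            A = (A - M) & mask        # 1-0: first 1 of a block, subtract multiplicand
        elif pair == (0, 1):
            A = (A + M) & mask        # 0-1: end of a block, add multiplicand
        Q_1 = Q & 1                   # arithmetic shift right of A, Q, Q-1
        msb = A >> (n - 1)
        Q = ((A & 1) << (n - 1)) | (Q >> 1)
        A = (msb << (n - 1)) | (A >> 1)
    product = (A << n) | Q            # 2n-bit twos complement result in A, Q
    return product - (1 << (2 * n)) if product >> (2 * n - 1) else product

print(booth_multiply(7, 3, 4))                     # 21   (compare Figure 10.14a)
print(booth_multiply(7, (-3) & 0xF, 4))            # -21  (compare Figure 10.14b)
print(booth_multiply((-7) & 0xF, (-3) & 0xF, 4))   # 21   (compare Figure 10.14d)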

Why does Booth’s algorithm work? Consider first the case of a positive multiplier. In particular, consider a positive multiplier consisting of one block of 1s surrounded by 0s (e.g., 00011110). As we know, multiplication can be achieved by adding appropriately shifted copies of the multiplicand:

M × (00011110) = M × (2^4 + 2^3 + 2^2 + 2^1) = M × (16 + 8 + 4 + 2) = M × 30

The number of such operations can be reduced to two if we observe that
2^n + 2^(n−1) + ... + 2^(n−K) = 2^(n+1) − 2^(n−K)     (10.3)

M × (00011110) = M × (2^5 − 2^1) = M × (32 − 2) = M × 30

So the product can be generated by one addition and one subtraction of the multiplicand. This scheme extends to any number of blocks of 1s in a multiplier, including the case in which a single 1 is treated as a block.

M × (01111010) = M × (2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M × (2^7 − 2^3 + 2^2 − 2^1)

Booth’s algorithm conforms to this scheme by performing a subtraction when the first 1 of the block is encountered (1–0) and an addition when the end of the block is encountered (0–1).
To show that the same scheme works for a negative multiplier, we need to observe the following. Let X be a negative number in twos complement notation:
Representation of X = {1 x_(n−2) x_(n−3) ... x_1 x_0}
Then the value of X can be expressed as follows:
X = −2^(n−1) + (x_(n−2) × 2^(n−2)) + (x_(n−3) × 2^(n−3)) + ... + (x_1 × 2^1) + (x_0 × 2^0)     (10.4)
The reader can verify this by applying the algorithm to the numbers in Table 10.2.
The leftmost bit of X is 1, because X is negative. Assume that the leftmost 0 is in the kth position. Thus, X is of the form
Representation of X = {111...10 x_(k−1) x_(k−2) ... x_1 x_0}     (10.5)
Then the value of X is
X = −2^(n−1) + 2^(n−2) + ... + 2^(k+1) + (x_(k−1) × 2^(k−1)) + ... + (x_0 × 2^0)     (10.6)
From Equation (10.3), we can say that
2^(n−2) + 2^(n−3) + ... + 2^(k+1) = 2^(n−1) − 2^(k+1)

Rearranging,
−2^(n−1) + 2^(n−2) + 2^(n−3) + ... + 2^(k+1) = −2^(k+1)     (10.7)
Substituting Equation (10.7) into Equation (10.6), we have
X = −2^(k+1) + (x_(k−1) × 2^(k−1)) + ... + (x_0 × 2^0)     (10.8)
At last we can return to Booth’s algorithm. Remembering the representation of X [Equation (10.5)], it is clear that all of the bits from x_0 up to the leftmost 0 are handled properly because they produce all of the terms in Equation (10.8) but (−2^(k+1)) and thus are in the proper form. As the algorithm scans over the leftmost 0 and encounters the next 1 (2^(k+1)), a 1–0 transition occurs and a subtraction takes place (−2^(k+1)). This is the remaining term in Equation (10.8).

As an example, consider the multiplication of some multiplicand by (−6). In twos complement representation, using an 8-bit word, (−6) is represented as 11111010. By Equation (10.4), we know that
−6 = −2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1
which the reader can easily verify. Thus,
M × (11111010) = M × (−2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1)
Using Equation (10.7),
M × (11111010) = M × (−2^3 + 2^1)
which the reader can verify is still M × (−6). Finally, following our earlier line of reasoning,
M × (11111010) = M × (−2^3 + 2^2 − 2^1)

We can see that Booth’s algorithm conforms to this scheme. It performs a subtraction when the first 1 is encountered (1–0), an addition when (0–1) is encountered, and finally another subtraction when the first 1 of the next block of 1s is encountered. Thus, Booth’s algorithm performs fewer additions and subtractions than a more straightforward algorithm.
Division
Division is somewhat more complex than multiplication but is based on the same general principles. As before, the basis for the algorithm is the paper-and-pencil approach, and the operation involves repetitive shifting and addition or subtraction.
Figure 10.15 shows an example of the long division of unsigned binary integers. It is instructive to describe the process in detail. First, the bits of the dividend are examined from left to right, until the set of bits examined represents a number greater than or equal to the divisor; this is referred to as the divisor being able to divide the number. Until this event occurs, 0s are placed in the quotient from left to right. When the event occurs, a 1 is placed in the quotient and the divisor is subtracted from the partial dividend. The result is referred to as a partial remainder.

                00001101     Quotient
Divisor  1011 ) 10010011     Dividend
                1011
                001110
                  1011
                001111       Partial remainders
                   1011
                    100      Remainder

Figure 10.15  Example of Division of Unsigned Binary Integers

From this point on, the division follows a cyclic pattern. At each cycle, additional bits from the dividend are appended to the partial remainder until the result is greater than or equal to the divisor. As before, the divisor is subtracted from this number to produce a new partial remainder. The process continues until all the bits of the dividend are exhausted. Figure 10.16 shows a machine algorithm that corresponds to the long division process. The divisor is placed in the M register, the dividend in the Q register.

[Figure 10.16  Flowchart for Unsigned Binary Division. Initialize A ← 0, M ← Divisor, Q ← Dividend, Count ← n. Each cycle: shift A, Q left one bit; A ← A − M; if A < 0, set Q0 ← 0 and restore A ← A + M, otherwise set Q0 ← 1; decrement Count. When Count = 0, the quotient is in Q and the remainder is in A.]

Figure 10.17 Example of Restoring Twos Complement Division (7/3)

At each step, the A and Q registers together are shifted to the left 1 bit. M is subtracted from A³ to determine whether A divides the partial remainder. If it does, then Q0 gets a 1 bit. Otherwise, Q0 gets a 0 bit and M must be added back to A to restore the previous value. The count is then decremented, and the process continues for n steps. At the end, the quotient is in the Q register and the remainder is in the A register.

This process can, with some difficulty, be extended to negative numbers. We give here one approach for twos complement numbers. An example of this approach is shown in Figure 10.17.

The algorithm assumes that the divisor V and the dividend D are positive and that |V| < |D|. If |V| = |D|, then the quotient Q = 1 and the remainder R = 0. If |V| > |D|, then Q = 0 and R = D. The algorithm can be summarized as follows:

1. Load the twos complement of the divisor into the M register; that is, the M register contains the negative of the divisor. Load the dividend into the A, Q registers. The dividend must be expressed as a 2n-bit positive number. Thus, for example, the 4-bit 0111 becomes 00000111.
2. Shift A, Q left 1 bit position.
3. Perform A ← A − M. This operation subtracts the divisor from the contents of A.
4. a. If the result is nonnegative (most significant bit of A = 0), then set Q0 ← 1.
   b. If the result is negative (most significant bit of A = 1), then set Q0 ← 0 and restore the previous value of A.
5. Repeat steps 2 through 4 as many times as there are bit positions in Q.
6. The remainder is in A and the quotient is in Q.

3This is subtraction of unsigned integers. A result that requires a borrow out of the most significant bit is a negative result.
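For readers who want to experiment with the machine algorithm of Figure 10.16, the following minimal Python sketch implements the unsigned restoring loop just described; the function name restoring_divide_unsigned and the default width n = 8 are illustrative assumptions.

def restoring_divide_unsigned(dividend, divisor, n=8):
    # Restoring division of unsigned n-bit integers: returns (quotient, remainder).
    A, Q, M = 0, dividend, divisor
    for _ in range(n):
        # shift A, Q left 1 bit: the top bit of Q moves into A
        A = (A << 1) | ((Q >> (n - 1)) & 1)
        Q = (Q << 1) & ((1 << n) - 1)
        A = A - M                  # trial subtraction
        if A < 0:                  # failed: restore A and leave Q0 = 0
            A = A + M
        else:                      # succeeded: set Q0 = 1
            Q = Q | 1
    return Q, A

# Reproduces Figure 10.15: restoring_divide_unsigned(0b10010011, 0b1011) == (13, 4)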

To deal with negative numbers, we recognize that the remainder is defined by

D = Q × V + R

That is, the remainder is the value of R needed for the preceding equation to be valid. Consider the following examples of integer division with all possible combinations of signs of D and V:

D = 7;  V = 3   ⇒  Q = 2;   R = 1
D = 7;  V = −3  ⇒  Q = −2;  R = 1
D = −7; V = 3   ⇒  Q = −2;  R = −1
D = −7; V = −3  ⇒  Q = 2;   R = −1

The reader will note from Figure 10.17 that (−7)/(3) and (7)/(−3) produce different remainders. We see that the magnitudes of Q and R are unaffected by the input signs and that the signs of Q and R are easily derivable from the signs of D and V. Specifically, sign(R) = sign(D) and sign(Q) = sign(D) × sign(V). Hence, one way to do twos complement division is to convert the operands into unsigned values and, at the end, to account for the signs by complementation where needed. This is the method of choice for the restoring division algorithm [PARH10].
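A small sketch of that sign-fixup strategy might look like the following; the unsigned routine above (or, as a stand-in here, Python's // and % on the magnitudes) does the actual division, and the function name signed_divide is an assumption for the illustration.

def signed_divide(D, V):
    # Divide magnitudes, then fix signs so that sign(R) = sign(D)
    # and sign(Q) = sign(D) * sign(V).
    q, r = abs(D) // abs(V), abs(D) % abs(V)
    if (D < 0) != (V < 0):
        q = -q
    if D < 0:
        r = -r
    return q, r

# signed_divide(7, 3) == (2, 1);     signed_divide(7, -3) == (-2, 1)
# signed_divide(-7, 3) == (-2, -1);  signed_divide(-7, -3) == (2, -1)

Note that this convention differs from Python's own floor-division behavior for negative operands, which is exactly why the signs are handled explicitly here.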

10.4 FLOATING-POINT REPRESENTATION

Principles

With a fixed-point notation (e.g., twos complement) it is possible to represent a range of positive and negative integers centered on or near 0. By assuming a fixed binary or radix point, this format allows the representation of numbers with a fractional component as well.

This approach has limitations. Very large numbers cannot be represented, nor can very small fractions. Furthermore, the fractional part of the quotient in a division of two large numbers could be lost. For decimal numbers, we get around this limitation by using scientific notation. Thus, 976,000,000,000,000 can be represented as 9.76 × 10^14, and 0.0000000000000976 can be represented as 9.76 × 10^−14. What we have done, in effect, is dynamically to slide the decimal point to a convenient location and use the exponent of 10 to keep track of that decimal point. This allows a range of very large and very small numbers to be represented with only a few digits.

This same approach can be taken with binary numbers. We can represent a number in the form

±S × B^{±E}

This number can be stored in a binary word with three fields:
• Sign: plus or minus
• Significand S
• Exponent E

(a) Format: sign of significand (1 bit), biased exponent (8 bits), significand (23 bits)

(b) Examples:
 1.1010001 × 2^10100  = 0 10010011 10100010000000000000000 =  1.6328125 × 2^20
−1.1010001 × 2^10100  = 1 10010011 10100010000000000000000 = −1.6328125 × 2^20
 1.1010001 × 2^−10100 = 0 01101011 10100010000000000000000 =  1.6328125 × 2^−20
−1.1010001 × 2^−10100 = 1 01101011 10100010000000000000000 = −1.6328125 × 2^−20

Figure 10.18  Typical 32-Bit Floating-Point Format

The base B is implicit and need not be stored because it is the same for all numbers. Typically, it is assumed that the radix point is to the right of the leftmost, or most significant, bit of the significand. That is, there is one bit to the left of the radix point.

The principles used in representing binary floating-point numbers are best explained with an example. Figure 10.18a shows a typical 32-bit floating-point format. The leftmost bit stores the sign of the number (0 = positive, 1 = negative). The exponent value is stored in the next 8 bits. The representation used is known as a biased representation. A fixed value, called the bias, is subtracted from the field to get the true exponent value. Typically, the bias equals (2^{k−1} − 1), where k is the number of bits in the binary exponent. In this case, the 8-bit field yields the numbers 0 through 255. With a bias of 127 (2^7 − 1), the true exponent values are in the range −127 to +128. In this example, the base is assumed to be 2.

Table 10.2 shows the biased representation for 4-bit integers. Note that when the bits of a biased representation are treated as unsigned integers, the relative magnitudes of the numbers do not change. For example, in both biased and unsigned representations, the largest number is 1111 and the smallest number is 0000. This is not true of sign-magnitude or twos complement representation. An advantage of biased representation is that nonnegative floating-point numbers can be treated as integers for comparison purposes.

The final portion of the word (23 bits in this case) is the significand.4 Any floating-point number can be expressed in many ways.

The following are equivalent, where the significand is expressed in binary form:

0.110 × 2^5     110 × 2^2     0.0110 × 2^6

To simplify operations on floating-point numbers, it is typically required that they be normalized. A normal number is one in which the most significant digit of the significand is nonzero.

4The term mantissa, sometimes used instead of significand, is considered obsolete. Mantissa also means “the fractional part of a logarithm,” so is best avoided in this context.

For base 2 representation, a normal number is therefore one in which the most significant bit of the significand is one. As was mentioned, the typical convention is that there is one bit to the left of the radix point. Thus, a normal nonzero number is one in the form

±1.bbb…b × 2^{±E}

where b is either binary digit (0 or 1). Because the most significant bit is always one, it is unnecessary to store this bit; rather, it is implicit. Thus, the 23-bit field is used to store a 24-bit significand with a value in the half-open interval [1, 2). Given a number that is not normal, the number may be normalized by shifting the radix point to the right of the leftmost 1 bit and adjusting the exponent accordingly.

Figure 10.18b gives some examples of numbers stored in this format. For each example, on the left is the binary number; in the center is the corresponding bit pattern; on the right is the decimal value. Note the following features:
• The sign is stored in the first bit of the word.
• The first bit of the true significand is always 1 and need not be stored in the significand field.
• The value 127 is added to the true exponent to be stored in the exponent field.
• The base is 2.

For comparison, Figure 10.19 indicates the range of numbers that can be represented in a 32-bit word. Using twos complement integer representation, all of the integers from −2^31 to 2^31 − 1 can be represented, for a total of 2^32 different numbers. With the example floating-point format of Figure 10.18, the following ranges of numbers are possible:
• Negative numbers between −(2 − 2^−23) × 2^128 and −2^−127
• Positive numbers between 2^−127 and (2 − 2^−23) × 2^128
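The packing of such a number into the 32-bit word of Figure 10.18 can be sketched in a few lines of Python. The function below only illustrates the format just described; the name encode_example_format and its argument conventions are assumptions, not part of the text. It reproduces the first bit pattern of Figure 10.18b.

def encode_example_format(sign, significand, exponent, bias=127):
    # Pack (-1)**sign * significand * 2**exponent, with 1 <= significand < 2,
    # into 1 sign bit, an 8-bit biased exponent, and 23 stored fraction bits.
    fraction = round((significand - 1.0) * (1 << 23))  # drop the implied leading 1
    return (sign << 31) | ((exponent + bias) << 23) | fraction

# 1.6328125 * 2**20 (i.e., 1.1010001 binary with exponent 10100 binary):
print(format(encode_example_format(0, 1.6328125, 20), "032b"))
# prints 01001001110100010000000000000000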

[Figure 10.19  Expressible Numbers in Typical 32-Bit Formats. (a) Twos complement integers: a single range of expressible integers centered on 0. (b) Floating-point numbers: moving along the number line, the regions are negative overflow, expressible negative numbers, negative underflow, zero, positive underflow, expressible positive numbers, and positive overflow.]

Five regions on the number line are not included in these ranges:
• Negative numbers less than −(2 − 2^−23) × 2^128, called negative overflow
• Negative numbers greater than −2^−127, called negative underflow
• Zero
• Positive numbers less than 2^−127, called positive underflow
• Positive numbers greater than (2 − 2^−23) × 2^128, called positive overflow

The representation as presented will not accommodate a value of 0. However, as we shall see, actual floating-point representations include a special bit pattern to designate zero. Overflow occurs when an arithmetic operation results in an absolute value greater than can be expressed with an exponent of 128 (e.g., 2^120 × 2^100 = 2^220). Underflow occurs when the fractional magnitude is too small (e.g., 2^−120 × 2^−100 = 2^−220). Underflow is a less serious problem because the result can generally be satisfactorily approximated by 0.

It is important to note that we are not representing more individual values with floating-point notation. The maximum number of different values that can be represented with 32 bits is still 2^32. What we have done is to spread those numbers out in two ranges, one positive and one negative. In practice, most floating-point numbers that one would wish to represent are represented only approximately. However, for moderate sized integers, the representation is exact.

Also, note that the numbers represented in floating-point notation are not spaced evenly along the number line, as are fixed-point numbers. The possible values get closer together near the origin and farther apart as you move away, as shown in Figure 10.20. This is one of the trade-offs of floating-point math: Many calculations produce results that are not exact and have to be rounded to the nearest value that the notation can represent.

In the type of format depicted in Figure 10.18, there is a trade-off between range and precision. The example shows 8 bits devoted to the exponent and 23 to the significand. If we increase the number of bits in the exponent, we expand the range of expressible numbers. But because only a fixed number of different values can be expressed, we have reduced the density of those numbers and therefore the precision. The only way to increase both range and precision is to use more bits. Thus, most computers offer, at least, single-precision numbers and double-precision numbers. For example, a processor could support a single-precision format of 64 bits, and a double-precision format of 128 bits.

So there is a trade-off between the number of bits in the exponent and the number of bits in the significand. But it is even more complicated than that. The implied base of the exponent need not be 2. The IBM S/390 architecture, for example, uses a base of 16 [ANDE67b]. The format consists of a 7-bit exponent and a 24-bit significand.

[Figure 10.20  Density of Floating-Point Numbers: representable values are densest near 0 and spread farther apart at n, 2n, 4n, and beyond.]

In the IBM base-16 format,

0.11010001 × 2^10100 = 0.11010001 × 16^101

and the exponent is stored to represent 5 rather than 20.

The advantage of using a larger exponent base is that a greater range can be achieved for the same number of exponent bits. But remember, we have not increased the number of different values that can be represented. Thus, for a fixed format, a larger exponent base gives a greater range at the expense of less precision.

IEEE Standard for Binary Floating-Point Representation

The most important floating-point representation is defined in IEEE Standard 754, adopted in 1985 and revised in 2008. This standard was developed to facilitate the portability of programs from one processor to another and to encourage the development of sophisticated, numerically oriented programs. The standard has been widely adopted and is used on virtually all contemporary processors and arithmetic coprocessors. IEEE 754-2008 covers both binary and decimal floating-point representations. In this chapter, we deal only with binary representations.

IEEE 754-2008 defines the following different types of floating-point formats:
• Arithmetic format: All the mandatory operations defined by the standard are supported by the format. The format may be used to represent floating-point operands or results for the operations described in the standard.
• Basic format: This format covers five floating-point representations, three binary and two decimal, whose encodings are specified by the standard, and which can be used for arithmetic. At least one of the basic formats is implemented in any conforming implementation.
• Interchange format: A fully specified, fixed-length binary encoding that allows data interchange between different platforms and that can be used for storage.

The three basic binary formats have bit lengths of 32, 64, and 128 bits, with exponents of 8, 11, and 15 bits, respectively (Figure 10.21). Table 10.3 summarizes the characteristics of the three formats. The two basic decimal formats have bit lengths of 64 and 128 bits. All of the basic formats are also arithmetic format types (can be used for arithmetic operations) and interchange format types (platform independent).

[Figure 10.21  IEEE 754 Formats. (a) Binary32: 1 sign bit, 8-bit biased exponent, 23-bit trailing significand field. (b) Binary64: 1 sign bit, 11-bit biased exponent, 52-bit trailing significand field. (c) Binary128: 1 sign bit, 15-bit biased exponent, 112-bit trailing significand field.]

Several other formats are specified in the standard. The binary16 format is only an interchange format and is intended for storage of values when higher precision is not required. The binary{k} format and the decimal{k} format are interchange formats with total length k bits and with defined lengths for the significand and exponent. The format must be a multiple of 32 bits; thus formats are defined for k = 160, 192, and so on. These two families of formats are also arithmetic formats. In addition, the standard defines extended precision formats, which extend a supported basic format by providing additional bits in the exponent (extended range) and in the significand (extended precision). The exact format is implementation dependent, but the standard places certain constraints on the length of the exponent and significand. These formats are arithmetic format types but not interchange format types. The extended formats are to be used for intermediate calculations.

With their greater precision, the extended formats lessen the chance of a final result that has been contaminated by excessive roundoff error; with their greater range, they also lessen the chance of an intermediate overflow aborting a computation whose final result would have been representable in a basic format. An additional motivation for the extended format is that it affords some of the benefits of a larger basic format without incurring the time penalty usually associated with higher precision. Finally, IEEE 754-2008 defines an extendable precision format as a format with a precision and range that are defined under user control. Again, these formats may be used for intermediate calculations, but the standard places no constraint on format or length. Table 10.4 shows the relationship between defined formats and format types.

Table 10.3  IEEE 754 Format Parameters

Parameter                               Binary32            Binary64             Binary128
Storage width (bits)                    32                  64                   128
Exponent width (bits)                   8                   11                   15
Exponent bias                           127                 1023                 16383
Maximum exponent                        127                 1023                 16383
Minimum exponent                        −126                −1022                −16382
Approx normal number range (base 10)    10^−38, 10^+38      10^−308, 10^+308     10^−4932, 10^+4932
Trailing significand width (bits)*      23                  52                   112
Number of exponents                     254                 2046                 32766
Number of fractions                     2^23                2^52                 2^112
Number of values                        1.98 × 2^31         1.99 × 2^63          1.99 × 2^128
Smallest positive normal number         2^−126              2^−1022              2^−16382
Largest positive normal number          2^128 − 2^104       2^1024 − 2^971       2^16384 − 2^16271
Smallest subnormal magnitude            2^−149              2^−1074              2^−16494

Note: *not including implied bit and not including sign bit

Not all bit patterns in the IEEE formats are interpreted in the usual way; instead, some bit patterns are used to represent special values. Table 10.5 indicates the values assigned to various bit patterns. The exponent values of all zeros (0 bits) and all ones (1 bits) define special values. The following classes of numbers are represented:
• For exponent values in the range of 1 through 254 for the 32-bit format, 1 through 2046 for the 64-bit format, and 1 through 32766 for the 128-bit format, normal nonzero floating-point numbers are represented. The exponent is biased, so that the range of exponents is −126 through +127 for the 32-bit format, and so on. A normal number requires a 1 bit to the left of the binary point; this bit is implied, giving an effective 24-bit, 53-bit, or 113-bit significand. Because one of the bits is implied, the corresponding field in the binary format is referred to as the trailing significand field.
• An exponent of zero together with a fraction of zero represents positive or negative zero, depending on the sign bit. As was mentioned, it is useful to have an exact value of 0 represented.

Table 10.4  IEEE Formats

Format                               Arithmetic Format    Basic Format    Interchange Format
binary16                                                                   X
binary32                             X                    X                X
binary64                             X                    X                X
binary128                            X                    X                X
binary{k} (k = n × 32 for n > 4)     X                                     X
decimal64                            X                    X                X
decimal128                           X                    X                X
decimal{k} (k = n × 32 for n > 4)    X                                     X
extended precision                   X
extendable precision                 X

Table 10.5  Interpretation of IEEE 754 Floating-Point Numbers

(a) binary32 format

Class                      Sign     Biased Exponent    Fraction              Value
positive zero              0        0                  0                     0
negative zero              1        0                  0                     −0
plus infinity              0        all 1s             0                     +∞
minus infinity             1        all 1s             0                     −∞
quiet NaN                  0 or 1   all 1s             ≠ 0; first bit = 1    qNaN
signaling NaN              0 or 1   all 1s             ≠ 0; first bit = 0    sNaN
positive normal nonzero    0        0 < e < 255        f                     2^(e−127)(1.f)
negative normal nonzero    1        0 < e < 255        f                     −2^(e−127)(1.f)
positive subnormal         0        0                  f ≠ 0                 2^(e−126)(0.f)
negative subnormal         1        0                  f ≠ 0                 −2^(e−126)(0.f)

(b) binary64 format

Class                      Sign     Biased Exponent    Fraction              Value
positive zero              0        0                  0                     0
negative zero              1        0                  0                     −0
plus infinity              0        all 1s             0                     +∞
minus infinity             1        all 1s             0                     −∞
quiet NaN                  0 or 1   all 1s             ≠ 0; first bit = 1    qNaN
signaling NaN              0 or 1   all 1s             ≠ 0; first bit = 0    sNaN
positive normal nonzero    0        0 < e < 2047       f                     2^(e−1023)(1.f)
negative normal nonzero    1        0 < e < 2047       f                     −2^(e−1023)(1.f)
positive subnormal         0        0                  f ≠ 0                 2^(e−1022)(0.f)
negative subnormal         1        0                  f ≠ 0                 −2^(e−1022)(0.f)

(c) binary128 format

Class                      Sign     Biased Exponent    Fraction              Value
positive zero              0        0                  0                     0
negative zero              1        0                  0                     −0
plus infinity              0        all 1s             0                     +∞
minus infinity             1        all 1s             0                     −∞
quiet NaN                  0 or 1   all 1s             ≠ 0; first bit = 1    qNaN
signaling NaN              0 or 1   all 1s             ≠ 0; first bit = 0    sNaN
positive normal nonzero    0        0 < e < 32767      f                     2^(e−16383)(1.f)
negative normal nonzero    1        0 < e < 32767      f                     −2^(e−16383)(1.f)
positive subnormal         0        0                  f ≠ 0                 2^(e−16382)(0.f)
negative subnormal         1        0                  f ≠ 0                 −2^(e−16382)(0.f)

• An exponent of all ones together with a fraction of zero represents positive or negative infinity, depending on the sign bit. It is also useful to have a rep- resentation of infinity. This leaves it up to the user to decide whether to treat overflow as an error condition or to carry the value and proceed with what- ever program is being executed. • An exponent of zero together with a nonzero fraction represents a subnormal number. In this case, the bit to the left of the binary point is zero and the true exponent is -126 or -1022. The number is positive or negative depending on the sign bit. • An exponent of all ones together with a nonzero fraction is given the value NaN, which means Not a Number, and is used to signal various exception conditions. The significance of subnormal numbers and NaNs is discussed in Section 10.5.
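These interpretation rules are compact enough to sketch directly. The following Python fragment decodes a 32-bit pattern according to Table 10.5a; the function name classify_binary32 and the returned strings are illustrative choices, not part of the standard or the text.

def classify_binary32(bits):
    # Decode an unsigned 32-bit integer as an IEEE 754 binary32 pattern.
    sign = (bits >> 31) & 1
    e = (bits >> 23) & 0xFF               # biased exponent field
    f = bits & 0x7FFFFF                   # 23-bit trailing significand (fraction)
    if e == 0xFF:                         # all ones: infinity or NaN
        if f == 0:
            return "-infinity" if sign else "+infinity"
        return "qNaN" if (f >> 22) & 1 else "sNaN"
    if e == 0:                            # all zeros: zero or subnormal
        if f == 0:
            return "-0" if sign else "+0"
        return (-1) ** sign * (f / 2**23) * 2.0**-126      # subnormal: 0.f * 2^-126
    return (-1) ** sign * (1 + f / 2**23) * 2.0**(e - 127)  # normal: 1.f * 2^(e-127)

# classify_binary32(0x7F800000) returns '+infinity'
# classify_binary32(0x00000001) returns 2**-149, the smallest positive subnormal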

10.5 FLOATING-POINT ARITHMETIC

Table 10.6 summarizes the basic operations for floating-point arithmetic. For addition and subtraction, it is necessary to ensure that both operands have the same exponent value. This may require shifting the radix point on one of the operands to achieve alignment. Multiplication and division are more straightforward.

A floating-point operation may produce one of these conditions:
• Exponent overflow: A positive exponent exceeds the maximum possible exponent value. In some systems, this may be designated as +∞ or −∞.
• Exponent underflow: A negative exponent is less than the minimum possible exponent value (e.g., −200 is less than −127). This means that the number is too small to be represented, and it may be reported as 0.

Table 10.6  Floating-Point Numbers and Arithmetic Operations

Floating-Point Numbers:
X = X_S × B^{X_E}
Y = Y_S × B^{Y_E}

Arithmetic Operations (assuming X_E ≤ Y_E):
X + Y = (X_S × B^{X_E − Y_E} + Y_S) × B^{Y_E}
X − Y = (X_S × B^{X_E − Y_E} − Y_S) × B^{Y_E}
X × Y = (X_S × Y_S) × B^{X_E + Y_E}
X ÷ Y = (X_S ÷ Y_S) × B^{X_E − Y_E}

Examples:
X = 0.3 × 10^2 = 30
Y = 0.2 × 10^3 = 200

X + Y = (0.3 × 10^{2−3} + 0.2) × 10^3 = 0.23 × 10^3 = 230
X − Y = (0.3 × 10^{2−3} − 0.2) × 10^3 = (−0.17) × 10^3 = −170
X × Y = (0.3 × 0.2) × 10^{2+3} = 0.06 × 10^5 = 6000
X ÷ Y = (0.3 ÷ 0.2) × 10^{2−3} = 1.5 × 10^{−1} = 0.15

• Significand underflow: In the process of aligning significands, digits may flow off the right end of the significand. As we shall discuss, some form of rounding is required.
• Significand overflow: The addition of two significands of the same sign may result in a carry out of the most significant bit. This can be fixed by realignment, as we shall explain.

Addition and Subtraction

In floating-point arithmetic, addition and subtraction are more complex than multiplication and division. This is because of the need for alignment. There are four basic phases of the algorithm for addition and subtraction:
1. Check for zeros.
2. Align the significands.
3. Add or subtract the significands.
4. Normalize the result.

A typical flowchart is shown in Figure 10.22. A step-by-step narrative highlights the main functions required for floating-point addition and subtraction. We assume a format similar to those of Figure 10.21. For the addition or subtraction operation, the two operands must be transferred to registers that will be used by the ALU. If the floating-point format includes an implicit significand bit, that bit must be made explicit for the operation.

Phase 1: Zero check. Because addition and subtraction are identical except for a sign change, the process begins by changing the sign of the subtrahend if it is a subtract operation. Next, if either operand is 0, the other is reported as the result.

Phase 2: Significand alignment. The next phase is to manipulate the numbers so that the two exponents are equal.

To see the need for aligning exponents, consider the following decimal addition:

(123 × 10^0) + (456 × 10^−2)

Clearly, we cannot just add the significands. The digits must first be set into equivalent positions, that is, the 4 of the second number must be aligned with the 3 of the first. Under these conditions, the two exponents will be equal, which is the mathematical condition under which two numbers in this form can be added. Thus,

(123 × 10^0) + (456 × 10^−2) = (123 × 10^0) + (4.56 × 10^0) = 127.56 × 10^0

Alignment may be achieved by shifting either the smaller number to the right (increasing its exponent) or shifting the larger number to the left. Because either operation may result in the loss of digits, it is the smaller number that is shifted; any digits that are lost are therefore of relatively small significance.

[Figure 10.22  Floating-Point Addition and Subtraction (Z ← X ± Y)]

The alignment is achieved by repeatedly shifting the magnitude portion of the significand right 1 digit and incrementing the exponent until the two exponents are equal. (Note that if the implied base is 16, a shift of 1 digit is a shift of 4 bits.) If this process results in a 0 value for the significand, then the other number is reported as the result. Thus, if two numbers have exponents that differ significantly, the lesser number is lost.

Phase 3: Addition. Next, the two significands are added together, taking into account their signs. Because the signs may differ, the result may be 0. There is also the possibility of significand overflow by 1 digit. If so, the significand of the result is shifted right and the exponent is incremented. An exponent overflow could occur as a result; this would be reported and the operation halted. Phase 4: Normalization. The final phase normalizes the result. Normalization consists of shifting significand digits left until the most significant digit (bit, or 4 bits for base-16 exponent) is nonzero. Each shift causes a decrement of the exponent and thus could cause an exponent underflow. Finally, the result must be rounded off and then reported. We defer a discussion of rounding until after a discussion of multiplication and division.
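A compact way to see these four phases together is a toy model in Python. The sketch below works on decimal significands in the style of Table 10.6 (a fraction of magnitude less than 1 times a power of 10); the function name fp_add, the digit count, and the use of Python floats for the significand are simplifying assumptions, not the hardware procedure itself.

def fp_add(xs, xe, ys, ye, digits=4, base=10):
    # Phase 1: zero check
    if xs == 0:
        return ys, ye
    if ys == 0:
        return xs, xe
    # Phase 2: align -- shift the significand with the smaller exponent right
    if xe < ye:
        xs, xe, ys, ye = ys, ye, xs, xe
    ys = ys / base ** (xe - ye)
    # Phase 3: add the signed significands
    zs, ze = xs + ys, xe
    # Phase 4: normalize (significand overflow, then leading zeros) and round
    while abs(zs) >= 1:
        zs, ze = zs / base, ze + 1
    while 0 < abs(zs) < 1 / base:
        zs, ze = zs * base, ze - 1
    return round(zs, digits), ze

# Reproduces the example of Table 10.6: fp_add(0.3, 2, 0.2, 3) -> (0.23, 3), i.e., 230.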

Multiplication and Division

Floating-point multiplication and division are much simpler processes than addition and subtraction, as the following discussion indicates.

We first consider multiplication, illustrated in Figure 10.23. First, if either operand is 0, 0 is reported as the result. The next step is to add the exponents. If the exponents are stored in biased form, the exponent sum would have doubled the bias. Thus, the bias value must be subtracted from the sum. The result could be either an exponent overflow or underflow, which would be reported, ending the algorithm.

If the exponent of the product is within the proper range, the next step is to multiply the significands, taking into account their signs. The multiplication is performed in the same way as for integers. In this case, we are dealing with a sign-magnitude representation, but the details are similar to those for twos complement representation. The product will be double the length of the multiplier and multiplicand. The extra bits will be lost during rounding. After the product is calculated, the result is then normalized and rounded, as was done for addition and subtraction. Note that normalization could result in exponent underflow.

Finally, let us consider the flowchart for division depicted in Figure 10.24. Again, the first step is testing for 0. If the divisor is 0, an error report is issued, or the result is set to infinity, depending on the implementation. A dividend of 0 results in 0. Next, the divisor exponent is subtracted from the dividend exponent. This removes the bias, which must be added back in. Tests are then made for exponent underflow or overflow. The next step is to divide the significands. This is followed with the usual normalization and rounding.
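In the same toy style as the addition sketch above, multiplication reduces to adding exponents (subtracting the bias if biased exponents are used), multiplying significands, and normalizing. The bias argument and the name fp_multiply below are assumptions made for the illustration.

def fp_multiply(xs, xe, ys, ye, digits=4, base=10, bias=0):
    if xs == 0 or ys == 0:            # zero check
        return 0.0, 0
    ze = xe + ye - bias               # add exponents; remove the doubled bias
    zs = xs * ys                      # multiply significands
    while 0 < abs(zs) < 1 / base:     # normalize
        zs, ze = zs * base, ze - 1
    return round(zs, digits), ze

# fp_multiply(0.3, 2, 0.2, 3) -> (0.6, 4), i.e., 6000 after normalization
# (Table 10.6 lists the same product unnormalized as 0.06 * 10**5).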

[Figure 10.23  Floating-Point Multiplication (Z ← X × Y): check X and Y for 0; add exponents and subtract the bias; report exponent overflow or underflow if it occurs; otherwise multiply the significands, normalize, and round.]

Precision Considerations

GUARD BITS We mentioned that, prior to a floating-point operation, the exponent and significand of each operand are loaded into ALU registers. In the case of the significand, the length of the register is almost always greater than the length of the significand plus an implied bit. The register contains additional bits, called guard bits, which are used to pad out the right end of the significand with 0s.

The reason for the use of guard bits is illustrated in Figure 10.25. Consider numbers in the IEEE format, which has a 24-bit significand, including an implied 1 bit to the left of the binary point. Two numbers that are very close in value are x = 1.00…00 × 2^1 and y = 1.11…11 × 2^0. If the smaller number is to be subtracted from the larger, it must be shifted right 1 bit to align the exponents. This is shown in Figure 10.25a. In the process, y loses 1 bit of significance; the

[Figure 10.24  Floating-Point Division (Z ← X/Y): check for a zero dividend or divisor; subtract exponents and add the bias back in; report exponent overflow or underflow if it occurs; otherwise divide the significands, normalize, and round.]

(a) Binary example, without guard bits            (c) Hexadecimal example, without guard bits
  x =  1.000…00 × 2^1                               x =  .100000 × 16^1
 −y =  0.111…11 × 2^1                              −y =  .0FFFFF × 16^1
  z =  0.000…01 × 2^1                               z =  .000001 × 16^1
    =  1.000…00 × 2^−22                               =  .100000 × 16^−4

(b) Binary example, with guard bits                (d) Hexadecimal example, with guard bits
  x =  1.000…00 0000 × 2^1                          x =  .100000 00 × 16^1
 −y =  0.111…11 1000 × 2^1                         −y =  .0FFFFF F0 × 16^1
  z =  0.000…00 1000 × 2^1                          z =  .000000 10 × 16^1
    =  1.000…00 0000 × 2^−23                          =  .100000 00 × 16^−5

Figure 10.25  The Use of Guard Bits

result is 2^−22. The same operation is repeated in part (b) with the addition of guard bits. Now the least significant bit is not lost due to alignment, and the result is 2^−23, a difference of a factor of 2 from the previous answer. When the radix is 16, the loss of precision can be greater. As Figures 10.25c and d show, the difference can be a factor of 16.

ROUNDING Another detail that affects the precision of the result is the rounding policy. The result of any operation on the significands is generally stored in a longer register. When the result is put back into the floating-point format, the extra bits must be eliminated in such a way as to produce a result that is close to the exact result. This process is called rounding.

A number of techniques have been explored for performing rounding. In fact, the IEEE standard lists four alternative approaches:
• Round to nearest: The result is rounded to the nearest representable number.
• Round toward +∞: The result is rounded up toward plus infinity.
• Round toward −∞: The result is rounded down toward negative infinity.
• Round toward 0: The result is rounded toward zero.

Let us consider each of these policies in turn. Round to nearest is the default rounding mode listed in the standard and is defined as follows: The representable value nearest to the infinitely precise result shall be delivered.

If the extra bits, beyond the 23 bits that can be stored, are 10010, then the extra bits amount to more than one-half of the last representable bit position. In this case, the correct answer is to add binary 1 to the last representable bit, rounding up to the next representable number. Now consider that the extra bits are 01111. In this case, the extra bits amount to less than one-half of the last representable bit position. The correct answer is simply to drop the extra bits (truncate), which has the effect of rounding down to the next representable number.

The standard also addresses the special case of extra bits of the form 10000…. Here the result is exactly halfway between the two possible representable values. One possible technique here would be to always truncate, as this would be the simplest operation. However, the difficulty with this simple approach is that it introduces a small but cumulative bias into a sequence of computations. What is required is an unbiased method of rounding. One possible approach would be to round up or down on the basis of a random number so that, on average, the result would be unbiased. The argument against this approach is that it does not produce predictable, deterministic results. The approach taken by the IEEE standard is to force the result to be even: If the result of a computation is exactly midway between two representable numbers, the value is rounded up if the last representable bit is currently 1 and not rounded up if it is currently 0.
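The rule is easy to state in code. The following sketch applies round to nearest, ties to even, to a significand with a given number of extra bits; the function name and the way the extra bits are passed in are assumptions made for the illustration.

def round_to_nearest_even(stored_bits, extra_bits, n_extra):
    # stored_bits: the significand truncated to the representable width
    # extra_bits:  the n_extra low-order bits that do not fit
    half = 1 << (n_extra - 1)
    if extra_bits > half:        # more than one-half of the last bit position: round up
        return stored_bits + 1
    if extra_bits < half:        # less than one-half: truncate (round down)
        return stored_bits
    return stored_bits + (stored_bits & 1)   # exactly halfway: round to even

# With 5 extra bits: 0b10010 rounds up, 0b01111 truncates,
# and 0b10000 rounds up only if the last stored bit is 1.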

The next two options, rounding to plus and minus infinity, are useful in imple- menting a technique known as interval arithmetic. Interval arithmetic provides an efficient method for monitoring and controlling errors in floating-point computa- tions by producing two values for each result. The two values correspond to the lower and upper endpoints of an interval that contains the true result. The width of the interval, which is the difference between the upper and lower endpoints, indi- cates the accuracy of the result. If the endpoints of an interval are not represent- able, then the interval endpoints are rounded down and up, respectively. Although the width of the interval may vary according to implementation, many algorithms have been designed to produce narrow intervals. If the range between the upper and lower bounds is sufficiently narrow, then a sufficiently accurate result has been obtained. If not, at least we know this and can perform additional analysis. The final technique specified in the standard is round toward zero. This is, in fact, simple truncation: The extra bits are ignored. This is certainly the simplest technique. However, the result is that the magnitude of the truncated value is always less than or equal to the more precise original value, introducing a consistent bias toward zero in the operation. This is a serious bias because it affects every operation for which there are nonzero extra bits.

IEEE Standard for Binary Floating-Point Arithmetic

IEEE 754 goes beyond the simple definition of a format to lay down specific practices and procedures so that floating-point arithmetic produces uniform, predictable results independent of the hardware platform. One aspect of this has already been discussed, namely rounding. This subsection looks at three other topics: infinity, NaNs, and subnormal numbers.

INFINITY Infinity arithmetic is treated as the limiting case of real arithmetic, with the infinity values given the following interpretation:

−∞ < (every finite number) < +∞

With the exception of the special cases discussed subsequently, any arithmetic operation involving infinity yields the obvious result.

For example:

5 + (+∞) = +∞          (+∞) + (+∞) = +∞
5 − (+∞) = −∞          (−∞) + (−∞) = −∞
5 + (−∞) = −∞          (−∞) − (+∞) = −∞
5 − (−∞) = +∞          (+∞) − (−∞) = +∞
5 × (+∞) = +∞
5 ÷ (+∞) = +0
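Because Python floats are IEEE 754 binary64 values, these rules, as well as the quiet-NaN exceptions listed in Table 10.7 below, can be observed directly; the snippet is only a demonstration on a conforming platform, not part of the text.

import math

print(5 + math.inf, 5 - math.inf, 5 / math.inf)   # inf  -inf  0.0
print(math.inf + math.inf)                        # inf
print(math.inf - math.inf, 0.0 * math.inf)        # nan  nan  (quiet NaNs)
print(math.nan == math.nan, math.isnan(math.nan + 1))  # False  True: a quiet NaN propagates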

QUIET AND SIGNALING NANS A NaN is a symbolic entity encoded in floating-point format, of which there are two types: signaling and quiet. A signaling NaN signals an invalid operation exception whenever it appears as an operand. Signaling NaNs afford values for uninitialized variables and arithmetic-like enhancements that are not the subject of the standard. A quiet NaN propagates through almost every arithmetic operation without signaling an exception. Table 10.7 indicates operations that will produce a quiet NaN. Note that both types of NaNs have the same general format (Table 10.5): an exponent of all ones and a nonzero fraction. The actual bit pattern of the nonzero fraction is implementation dependent; the fraction values can be used to distinguish quiet NaNs from signaling NaNs and to specify particular exception conditions.

Table 10.7  Operations that Produce a Quiet NaN

Operation          Quiet NaN Produced By
Any                Any operation on a signaling NaN
Add or subtract    Magnitude subtraction of infinities:
                   (+∞) + (−∞);  (−∞) + (+∞);  (+∞) − (+∞);  (−∞) − (−∞)
Multiply           0 × ∞
Division           0/0 or ∞/∞
Remainder          x REM 0 or ∞ REM y
Square root        √x, where x < 0

SUBNORMAL NUMBERS Subnormal numbers are included in IEEE 754 to handle cases of exponent underflow. When the exponent of the result becomes too small (a negative exponent with too large a magnitude), the result is subnormalized by right shifting the fraction and incrementing the exponent for each shift until the exponent is within a representable range. Figure 10.26 illustrates the effect of including subnormal numbers.

[Figure 10.26  The Effect of IEEE 754 Subnormal Numbers. (a) 32-bit format without subnormal numbers: a gap between 0 and the smallest normal number 2^−126. (b) 32-bit format with subnormal numbers: uniform spacing of values between 0 and 2^−126.]

The representable numbers can be grouped into intervals of the form [2^n, 2^{n+1}]. Within each such interval, the exponent portion of the number remains constant while the fraction varies, producing a uniform spacing of representable numbers within the interval. As we get closer to zero, each successive interval is half the width of the preceding interval but contains the same number of representable numbers. Hence the density of representable numbers increases as we approach zero. However, if only normal numbers are used, there is a gap between the smallest normal number and 0. In the case of the 32-bit IEEE 754 format, there are 2^23 representable numbers in each interval, and the smallest representable positive number is 2^−126. With the addition of subnormal numbers, an additional 2^23 − 1 numbers are uniformly added between 0 and 2^−126.

The use of subnormal numbers is referred to as gradual underflow [COON81]. Without subnormal numbers, the gap between the smallest representable nonzero number and zero is much wider than the gap between the smallest representable nonzero number and the next larger number. Gradual underflow fills in that gap and reduces the impact of exponent underflow to a level comparable with roundoff among the normal numbers.
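Gradual underflow can be observed with ordinary Python floats, which are binary64 rather than the 32-bit format of the example; the constants below therefore follow the binary64 limits of Table 10.3.

import sys

smallest_normal = sys.float_info.min          # 2**-1022, smallest positive normal binary64
smallest_subnormal = smallest_normal / 2**52  # 2**-1074, smallest positive subnormal
print(smallest_normal, smallest_subnormal)    # 2.2250738585072014e-308  5e-324
print(smallest_subnormal / 2)                 # 0.0 -- below the subnormal range, underflows to zero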

10.6 RECOMMENDED READING

[ERCE04] and [PARH10] are excellent treatments of computer arithmetic, covering all of the topics in this chapter in detail. [FLYN01] is a useful discussion that focuses on practical design and implementation issues. For the serious student of computer arithmetic, a very use- ful reference is the two-volume [SWAR90]. Volume I was originally published in 1980 and provides key papers (some very difficult to obtain otherwise) on computer arithmetic funda- mentals. Volume II contains more recent papers, covering theoretical, design, and implemen- tation aspects of computer arithmetic. For floating-point arithmetic, [GOLD91] is well named: “What Every Computer Scientist Should Know About Floating-Point Arithmetic.” Another excellent treatment of the topic is contained in [KNUT98], which also covers integer computer arithmetic. The following more in-depth treatments are also worthwhile: [EVEN00a, OBER97a, OBER97b, SODE96]. [KUCK77] is a good discussion of rounding methods in floating- point arithmetic. [EVEN00b] examines rounding with respect to IEEE 754. For a thorough treatment of floating-point arithmetic, standards, and implementation, the book to read is [MULL10]. [SCHW99] describes the first IBM S/390 processor to integrate radix-16 and IEEE 754 floating-point arithmetic in the same floating-point unit. William Kahan, the principal architect of the IEEE 754 standard, has written a series of documents that provide a deeper understanding of the standard, including a detailed set of lecture notes, a paper on the rationale for a floating-point standard, and a tutorial on gradual underflow. All of these documents are available in the premium content section of this book’s Web site.

ERCE04 Ercegovac, M., and Lang, T. Digital Arithmetic. San Francisco: Morgan Kaufmann, 2004.
EVEN00a Even, G., and Paul, W. “On the Design of IEEE Compliant Floating-Point Units.” IEEE Transactions on Computers, May 2000.

EVEN00b Even, G., and Seidel, P. “A Comparison of Three Rounding Algorithms for IEEE Floating-Point Multiplication.” IEEE Transactions on Computers, July 2000. FLYN01 Flynn, M., and Oberman, S. Advanced Computer Arithmetic Design. New York: Wiley, 2001. GOLD91 Goldberg, D. “What Every Computer Scientist Should Know About Floating-Point Arithmetic.” ACM Computing Surveys, March 1991. KNUT98 Knuth, D. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley, 1998. KUCK77 Kuck, D.; Parker, D.; and Sameh, A. “An Analysis of Rounding Methods in Floating-Point Arithmetic.” IEEE Transactions on Computers, July 1977. MULL10 Muller, J., et al. Handbook of Floating-Point Arithmetic. Boston: Birkhauser, 2010. OBER97a Oberman, S., and Flynn, M. “Design Issues in Division and Other Floating- Point Operations.” IEEE Transactions on Computers, February 1997. OBER97b Oberman, S., and Flynn, M. “Division Algorithms and Implementations.” IEEE Transactions on Computers, August 1997. PARH10 Parhami, B. Computer Arithmetic: Algorithms and Hardware Design. Oxford: Oxford University Press, 2010. SCHW99 Schwarz, E., and Krygowski, C. “The S/390 G5 Floating-Point Unit.” IBM Journal of Research and Development, September/November 1999. SODE96 Soderquist, P., and Leeser, M. “Area and Performance Tradeoffs in Floating-Point Divide and Square-Root Implementations.” ACM Computing Surveys, September 1996. SWAR90 Swartzlander, E., ed. Computer Arithmetic, Volumes I and II. Los Alamitos, CA: IEEE Computer Society Press, 1990.

10.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

arithmetic and logic unit (ALU), arithmetic shift, base, biased representation, dividend, divisor, exponent, exponent overflow, exponent underflow, fixed-point representation, floating-point representation, guard bits, mantissa, minuend, multiplicand, multiplier, negative overflow, negative underflow, normal number, ones complement representation, overflow, partial product, positive overflow, positive underflow, product, quotient, radix point, remainder, rounding, sign bit, significand, significand overflow, significand underflow, sign-magnitude representation, subnormal number, subtrahend, twos complement representation

Review Questions 10.1 Briefly explain the following representations: sign magnitude, twos complement, biased. 10.2 Explain how to determine if a number is negative in the following representations: sign magnitude, twos complement, biased. 10.3 What is the sign-extension rule for twos complement numbers? 10.4 How can you form the negation of an integer in twos complement representation? 10.5 In general terms, when does the twos complement operation on an n-bit integer pro- duce the same integer? 10.6 What is the difference between the twos complement representation of a number and the twos complement of a number? 10.7 If we treat 2 twos complement numbers as unsigned integers for purposes of addition, the result is correct if interpreted as a twos complement number. This is not true for multiplication. Why? 10.8 What are the four essential elements of a number in floating-point notation? 10.9 What is the benefit of using biased representation for the exponent portion of a floating- point number? 10.10 What are the differences among positive overflow, exponent overflow, and significand overflow? 10.11 What are the basic elements of floating-point addition and subtraction? 10.12 Give a reason for the use of guard bits. 10.13 List four alternative methods of rounding the result of a floating-point operation.

Problems
10.1 Represent the following decimal numbers in both binary sign/magnitude and twos complement using 16 bits: +512; −29.
10.2 Represent the following twos complement values in decimal: 1101011; 0101101.
10.3 Another representation of binary integers that is sometimes encountered is ones complement. Positive integers are represented in the same way as sign magnitude. A negative integer is represented by taking the Boolean complement of each bit of the corresponding positive number.
a. Provide a definition of ones complement numbers using a weighted sum of bits, similar to Equations (10.1) and (10.2).
b. What is the range of numbers that can be represented in ones complement?
c. Define an algorithm for performing addition in ones complement arithmetic.
Note: Ones complement arithmetic disappeared from hardware in the 1960s, but still survives in checksum calculations for the Internet Protocol (IP) and the Transmission Control Protocol (TCP).
10.4 Add columns to Table 10.1 for sign magnitude and ones complement.
10.5 Consider the following operation on a binary word. Start with the least significant bit. Copy all bits that are 0 until the first 1 is reached and copy that bit, too. Then take the complement of each bit thereafter. What is the result?
10.6 In Section 10.3, the twos complement operation is defined as follows. To find the twos complement of X, take the Boolean complement of each bit of X, and then add 1.
a. Show that the following is an equivalent definition. For an n-bit integer X, the twos complement of X is formed by treating X as an unsigned integer and calculating (2^n − X).
b. Demonstrate that Figure 10.5 can be used to support graphically the claim in part (a), by showing how a clockwise movement is used to achieve subtraction.

10.7 The r’s complement of an n-digit number N in base r is defined as r^n − N for N ≠ 0 and 0 for N = 0. Find the tens complement of the decimal number 13,250.
10.8 Calculate (72,530 − 13,250) using tens complement arithmetic. Assume rules similar to those for twos complement arithmetic.
10.9 Consider the twos complement addition of two n-bit numbers:

z_{n−1} z_{n−2} … z_0 = x_{n−1} x_{n−2} … x_0 + y_{n−1} y_{n−2} … y_0

Assume that bitwise addition is performed with a carry bit c_i generated by the addition of x_i, y_i, and c_{i−1}. Let v be a binary variable indicating overflow when v = 1. Fill in the values in the table.

            x_{n−1}   0 0 0 0 1 1 1 1
Input       y_{n−1}   0 0 1 1 0 0 1 1
            c_{n−2}   0 1 0 1 0 1 0 1

Output      z_{n−1}
            v

10.10 Assume numbers are represented in 8-bit twos complement representation. Show the calculation of the following:
a. 6 + 13   b. −6 + 13   c. 6 − 13   d. −6 − 13
10.11 Find the following differences using twos complement arithmetic:
a.   111000        b.   11001100        c.   111100001111        d.   11000011
   − 110011           −  101110           − 110011110011           − 11101000
10.12 Is the following a valid alternative definition of overflow in twos complement arithmetic? If the exclusive-OR of the carry bits into and out of the leftmost column is 1, then there is an overflow condition. Otherwise, there is not.
10.13 Compare Figures 10.9 and 10.12. Why is the C bit not used in the latter?
10.14 Given x = 0101 and y = 1010 in twos complement notation (i.e., x = 5, y = −6), compute the product p = x × y with Booth’s algorithm.
10.15 Use the Booth algorithm to multiply 23 (multiplicand) by 29 (multiplier), where each number is represented using 6 bits.
10.16 Prove that the multiplication of two n-digit numbers in base B gives a product of no more than 2n digits.
10.17 Verify the validity of the unsigned binary division algorithm of Figure 10.16 by showing the steps involved in calculating the division depicted in Figure 10.15. Use a presentation similar to that of Figure 10.17.
10.18 The twos complement integer division algorithm described in Section 10.3 is known as the restoring method because the value in the A register must be restored following unsuccessful subtraction. A slightly more complex approach, known as nonrestoring, avoids the unnecessary subtraction and addition. Propose an algorithm for this latter approach.
10.19 Under computer integer arithmetic, the quotient J/K of two integers J and K is less than or equal to the usual quotient. True or false?
10.20 Divide −145 by 13 in binary twos complement notation, using 12-bit words. Use the algorithm described in Section 10.3.
10.21 a. Consider a fixed-point representation using decimal digits, in which the implied radix point can be in any position (to the right of the least significant digit, to the right of the most significant digit, and so on). How many decimal digits are needed to represent the approximations of both Planck’s constant (6.63 × 10^−27)

and Avogadro’s number (6.02 × 10^23)? The implied radix point must be in the same position for both numbers.
b. Now consider a decimal floating-point format with the exponent stored in a biased representation with a bias of 50. A normalized representation is assumed. How many decimal digits are needed to represent these constants in this floating-point format?
10.22 Assume that the exponent e is constrained to lie in the range 0 ≤ e ≤ X, with a bias of q, that the base is b, and that the significand is p digits in length.
a. What are the largest and smallest positive values that can be written?
b. What are the largest and smallest positive values that can be written as normalized floating-point numbers?
10.23 Express the following numbers in IEEE 32-bit floating-point format:
a. −5   b. −6   c. −1.5   d. 384   e. 1/16   f. −1/32
10.24 The following numbers use the IEEE 32-bit floating-point format. What is the equivalent decimal value?
a. 1 10000011 11000000000000000000000
b. 0 01111110 10100000000000000000000
c. 0 10000000 00000000000000000000000
10.25 Consider a reduced 7-bit IEEE floating-point format, with 3 bits for the exponent and 3 bits for the significand. List all 127 values.
10.26 Express the following numbers in IBM’s 32-bit floating-point format, which uses a 7-bit exponent with an implied base of 16 and an exponent bias of 64 (40 hexadecimal). A normalized floating-point number requires that the leftmost hexadecimal digit be nonzero; the implied radix point is to the left of that digit.
a. 1.0   b. 0.5   c. 1/64   d. 0.0   e. −15.0   f. 5.4 × 10^−79   g. 7.2 × 10^75   h. 65,535

10.27 Let 5BCA0000 be a floating-point number in IBM format, expressed in hexadecimal. What is the decimal value of the number?
10.28 What would be the bias value for
a. A base-2 exponent (B = 2) in a 6-bit field?
b. A base-8 exponent (B = 8) in a 7-bit field?
10.29 Draw a number line similar to that in Figure 10.19b for the floating-point format of Figure 10.21b.
10.30 Consider a floating-point format with 8 bits for the biased exponent and 23 bits for the significand. Show the bit pattern for the following numbers in this format:
a. −720   b. 0.645
10.31 The text mentions that a 32-bit format can represent a maximum of 2^32 different numbers. How many different numbers can be represented in the IEEE 32-bit format? Explain.
10.32 Any floating-point representation used in a computer can represent only certain real numbers exactly; all others must be approximated. If A′ is the stored value approximating the real value A, then the relative error, r, is expressed as

r = (A − A′)/A

Represent the decimal quantity +0.4 in the following floating-point format: base = 2; exponent: biased, 4 bits; significand, 7 bits. What is the relative error?
10.33 If A = 1.427, find the relative error if A is truncated to 1.42 and if it is rounded to 1.43.
10.34 When people speak about inaccuracy in floating-point arithmetic, they often ascribe errors to cancellation that occurs during the subtraction of nearly equal quantities.

But when X and Y are approximately equal, the difference X − Y is obtained exactly, with no error. What do these people really mean?
10.35 Numerical values A and B are stored in the computer as approximations A′ and B′. Neglecting any further truncation or roundoff errors, show that the relative error of the product is approximately the sum of the relative errors in the factors.
10.36 One of the most serious errors in computer calculations occurs when two nearly equal numbers are subtracted. Consider A = 0.22288 and B = 0.22211. The computer truncates all values to four decimal digits. Thus A′ = 0.2228 and B′ = 0.2221.
a. What are the relative errors for A′ and B′?
b. What is the relative error for C′ = A′ − B′?
10.37 To get some feel for the effects of denormalization and gradual underflow, consider a decimal system that provides 6 decimal digits for the significand and for which the smallest normalized number is 10^−99. A normalized number has one nonzero decimal digit to the left of the decimal point. Perform the following calculations and denormalize the results. Comment on the results.
a. (2.50000 × 10^−60) × (3.50000 × 10^−43)
b. (2.50000 × 10^−60) × (3.50000 × 10^−60)
c. (5.67834 × 10^−97) − (5.67812 × 10^−97)
10.38 Show how the following floating-point additions are performed (where significands are truncated to 4 decimal digits). Show the results in normalized form.
a. 5.566 × 10^2 + 7.777 × 10^2
b. 3.344 × 10^1 + 8.877 × 10^−2
10.39 Show how the following floating-point subtractions are performed (where significands are truncated to 4 decimal digits). Show the results in normalized form.
a. 7.744 × 10^−3 − 6.666 × 10^−3
b. 8.844 × 10^−3 − 2.233 × 10^−1
10.40 Show how the following floating-point calculations are performed (where significands are truncated to 4 decimal digits). Show the results in normalized form.
a. (2.255 × 10^1) × (1.234 × 10^0)
b. (8.833 × 10^2) ÷ (5.555 × 10^4)

CHAPTER 11
DIGITAL LOGIC

11.1 Boolean Algebra
11.2 Gates
11.3 Combinational Circuits
     Implementation of Boolean Functions
     Decoders
     Read-Only Memory
     Adders
11.4 Sequential Circuits
     Flip-Flops
     Registers
     Counters
11.5 Programmable Logic Devices
     Field-Programmable Gate Array
11.6 Recommended Reading
11.7 Key Terms and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the basic operations of Boolean algebra.
• Distinguish among the different types of flip-flops.
• Use a Karnaugh map to simplify a Boolean expression.
• Present an overview of programmable logic devices.

The operation of the digital computer is based on the storage and processing of binary data. Throughout this book, we have assumed the existence of storage elements that can exist in one of two stable states and of circuits that can operate on binary data under the control of control signals to implement the various computer functions. In this chapter, we suggest how these storage elements and circuits can be implemented in digital logic, specifically with combinational and sequential circuits. The chapter begins with a brief review of Boolean algebra, which is the mathematical foundation of digital logic. Next, the concept of a gate is introduced. Finally, combinational and sequential circuits, which are constructed from gates, are described.

11.1 BOOLEAN ALGEBRA

The digital circuitry in digital computers and other digital systems is designed, and its behavior is analyzed, with the use of a mathematical discipline known as Boolean algebra. The name is in honor of an English mathematician, George Boole, who proposed the basic principles of this algebra in 1854 in his treatise, An Investigation of the Laws of Thought on Which to Found the Mathematical Theories of Logic and Probabilities. In 1938, Claude Shannon, a research assistant in the Electrical Engineering Department at M.I.T., suggested that Boolean algebra could be used to solve problems in relay-switching circuit design [SHAN38].1 Shannon’s techniques were subsequently used in the analysis and design of electronic digital circuits. Boolean algebra turns out to be a convenient tool in two areas:
• Analysis: It is an economical way of describing the function of digital circuitry.
• Design: Given a desired function, Boolean algebra can be applied to develop a simplified implementation of that function.

As with any algebra, Boolean algebra makes use of variables and operations. In this case, the variables and operations are logical variables and operations. Thus, a variable may take on the value 1 (TRUE) or 0 (FALSE). The basic logical operations

¹The paper is available at this book's Web site.

are AND, OR, and NOT, which are symbolically represented by dot, plus sign, and overbar²:

A AND B = A · B
A OR B = A + B
NOT A = Ā

The operation AND yields true (binary value 1) if and only if both of its operands are true. The operation OR yields true if either or both of its operands are true. The unary operation NOT inverts the value of its operand. For example, consider the equation

D = A + (B̄ · C)

D is equal to 1 if A is 1 or if both B = 0 and C = 1. Otherwise D is equal to 0.

Several points concerning the notation are needed. In the absence of parentheses, the AND operation takes precedence over the OR operation. Also, when no ambiguity will occur, the AND operation is represented by simple concatenation instead of the dot operator. Thus,

A + B · C = A + (B · C) = A + BC

all mean: Take the AND of B and C; then take the OR of the result and A.

Table 11.1a defines the basic logical operations in a form known as a truth table, which lists the value of an operation for every possible combination of values of operands. The table also lists three other useful operators: XOR, NAND, and NOR. The exclusive-or (XOR) of two logical operands is 1 if and only if exactly one of the operands has the value 1. The NAND function is the complement (NOT) of the AND function, and the NOR is the complement of OR:

A NAND B = NOT (A AND B) = NOT (A · B)
A NOR B = NOT (A OR B) = NOT (A + B)

As we shall see, these three new operations can be useful in implementing certain digital circuits.

The logical operations, with the exception of NOT, can be generalized to more than two variables, as shown in Table 11.1b. Table 11.2 summarizes key identities of Boolean algebra. The equations have been arranged in two columns to show the complementary, or dual, nature of the AND and OR operations. There are two classes of identities: basic rules (or postulates), which are stated without proof, and other identities that can be derived from the basic postulates. The postulates define the way in which Boolean expressions are interpreted. One of the two distributive laws is worth noting because it differs from what we would find in ordinary algebra:

A + (B · C) = (A + B) · (A + C)

²Logical NOT is often indicated by an apostrophe: NOT A = A′.

Table 11.1 Boolean Operators (a) Boolean Operators of Two Input Variables

            NOT P   P AND Q   P OR Q    P NAND Q       P NOR Q        P XOR Q
P Q         (P̄)     (P · Q)   (P + Q)   (NOT(P · Q))   (NOT(P + Q))   (P ⊕ Q)

0 0          1         0         0           1              1            0
0 1          1         0         1           1              0            1
1 0          0         0         1           1              0            1
1 1          0         1         1           0              0            0

(b) Boolean Operators Extended to More than Two Inputs (A, B, . . .)

Operation Expression Output 1 if

AND    A · B · …           All of the set {A, B, …} are 1.
OR     A + B + …           Any of the set {A, B, …} are 1.
NAND   NOT(A · B · …)      Any of the set {A, B, …} are 0.
NOR    NOT(A + B + …)      All of the set {A, B, …} are 0.
XOR    A ⊕ B ⊕ …           The set {A, B, …} contains an odd number of ones.

The two bottommost expressions are referred to as DeMorgan's theorem. We can restate them as follows:

A NOR B = (NOT A) AND (NOT B)
A NAND B = (NOT A) OR (NOT B)

The reader is invited to verify the expressions in Table 11.2 by substituting actual values (1s and 0s) for the variables A, B, and C.
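That verification is easy to automate. The following Python sketch (my own illustration, not from the text) checks several of the identities of Table 11.2 by substituting every combination of 0s and 1s for A, B, and C; NOT x is written as 1 - x.

from itertools import product

# Each identity is a pair of functions that must agree for every input combination.
identities = [
    ("Commutative (AND)", lambda A, B, C: A & B, lambda A, B, C: B & A),
    ("Commutative (OR)",  lambda A, B, C: A | B, lambda A, B, C: B | A),
    ("Distributive (AND over OR)", lambda A, B, C: A & (B | C), lambda A, B, C: (A & B) | (A & C)),
    ("Distributive (OR over AND)", lambda A, B, C: A | (B & C), lambda A, B, C: (A | B) & (A | C)),
    ("DeMorgan (NAND)", lambda A, B, C: 1 - (A & B), lambda A, B, C: (1 - A) | (1 - B)),
    ("DeMorgan (NOR)",  lambda A, B, C: 1 - (A | B), lambda A, B, C: (1 - A) & (1 - B)),
]

for name, lhs, rhs in identities:
    assert all(lhs(*v) == rhs(*v) for v in product((0, 1), repeat=3)), name
print("All listed identities hold for every combination of 0s and 1s.")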

Table 11.2 Basic Identities of Boolean Algebra

Basic Postulates

A · B = B · A                       A + B = B + A                       Commutative Laws
A · (B + C) = (A · B) + (A · C)     A + (B · C) = (A + B) · (A + C)     Distributive Laws
1 · A = A                           0 + A = A                           Identity Elements
A · Ā = 0                           A + Ā = 1                           Inverse Elements

Other Identities

0 · A = 0                           1 + A = 1
A · A = A                           A + A = A
A · (B · C) = (A · B) · C           A + (B + C) = (A + B) + C           Associative Laws
NOT(A · B) = Ā + B̄                  NOT(A + B) = Ā · B̄                  DeMorgan's Theorem

11.2 GATES

The fundamental building block of all digital logic circuits is the gate. Logical functions are implemented by the interconnection of gates.

A gate is an electronic circuit that produces an output signal that is a simple Boolean operation on its input signals. The basic gates used in digital logic are AND, OR, NOT, NAND, NOR, and XOR. Figure 11.1 depicts these six gates. Each gate is defined in three ways: graphic symbol, algebraic notation, and truth table. The symbology used in this chapter is from the IEEE standard, IEEE Std 91. Note that the inversion (NOT) operation is indicated by a circle.

Each gate shown in Figure 11.1 has one or two inputs and one output. However, as indicated in Table 11.1b, all of the gates except NOT can have more than two inputs. Thus, (X + Y + Z) can be implemented with a single OR gate with three inputs. When one or more of the values at the input are changed, the correct output signal appears almost instantaneously, delayed only by the propagation time of signals through the gate (known as the gate delay). The significance of this delay is discussed in Section 11.3. In some cases, a gate is implemented with two outputs, one output being the negation of the other output.

Figure 11.1 Basic Logic Gates (graphical symbol, algebraic function, and truth table for the AND, OR, NOT, NAND, NOR, and XOR gates)

Here we introduce a common term: we say that to assert a signal is to cause a signal line to make a transition from its logically false (0) state to its logically true (1) state. The true (1) state is either a high or low voltage state, depending on the type of electronic circuitry.

Typically, not all gate types are used in implementation. Design and fabrication are simpler if only one or two types of gates are used. Thus, it is important to identify functionally complete sets of gates. This means that any Boolean function can be implemented using only the gates in the set. The following are functionally complete sets:

• AND, OR, NOT
• AND, NOT
• OR, NOT
• NAND
• NOR

It should be clear that AND, OR, and NOT gates constitute a functionally complete set, because they represent the three operations of Boolean algebra. For the AND and NOT gates to form a functionally complete set, there must be a way to synthesize the OR operation from the AND and NOT operations. This can be done by applying DeMorgan's theorem:

A + B = NOT (Ā · B̄)
A OR B = NOT ((NOT A) AND (NOT B))

Similarly, the OR and NOT operations are functionally complete because they can be used to synthesize the AND operation. Figure 11.2 shows how the AND, OR, and NOT functions can be implemented solely with NAND gates, and Figure 11.3 shows the same thing for NOR gates. For this reason, digital circuits can be, and frequently are, implemented solely with NAND gates or solely with NOR gates.

Figure 11.2 Some Uses of NAND Gates

Figure 11.3 Some Uses of NOR Gates
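As a small illustration of functional completeness (a sketch of mine, not taken from the text), the Python fragment below builds NOT, AND, and OR out of a single two-input NAND operation, mirroring the constructions of Figure 11.2, and checks them against the ordinary operators.

def nand(a: int, b: int) -> int:
    """Two-input NAND on bits (0 or 1)."""
    return 1 - (a & b)

def not_(a):                                 # NOT A  = A NAND A
    return nand(a, a)

def and_(a, b):                              # A AND B = NOT(A NAND B)
    return nand(nand(a, b), nand(a, b))

def or_(a, b):                               # A OR B = (NOT A) NAND (NOT B), by DeMorgan
    return nand(nand(a, a), nand(b, b))

for a in (0, 1):
    for b in (0, 1):
        assert not_(a) == 1 - a
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)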

With gates, we have reached the most primitive circuit level of computer hardware. An examination of the transistor combinations used to construct gates departs from that realm and enters the realm of electrical engineering. For our purposes, however, we are content to describe how gates can be used as building blocks to implement the essential logical circuits of a digital computer.

11.3 COMBINATIONAL CIRCUITS

A combinational circuit is an interconnected set of gates whose output at any time is a function only of the input at that time. As with a single gate, the appearance of the input is followed almost immediately by the appearance of the output, with only gate delays. In general terms, a combinational circuit consists of n binary inputs and m binary outputs. As with a gate, a combinational circuit can be defined in three ways:

• Truth table: For each of the 2ⁿ possible combinations of input signals, the binary value of each of the m output signals is listed.
• Graphical symbols: The interconnected layout of gates is depicted.
• Boolean equations: Each output signal is expressed as a Boolean function of its input signals.

Implementation of Boolean Functions

Any Boolean function can be implemented in electronic form as a network of gates. For any given function, there are a number of alternative realizations. Consider the Boolean function represented by the truth table in Table 11.3. We can express this function by simply itemizing the combinations of values of A, B, and C that cause F to be 1:

F = ĀBC̄ + ĀBC + ABC̄     (11.1)

Table 11.3 A Boolean Function of Three Variables

A B C   F
0 0 0   0
0 0 1   0
0 1 0   1
0 1 1   1
1 0 0   0
1 0 1   0
1 1 0   1
1 1 1   0

There are three combinations of input values that cause F to be 1, and if any one of these combinations occurs, the result is 1. This form of expression, for self-evident reasons, is known as the sum of products (SOP) form. Figure 11.4 shows a straightforward implementation with AND, OR, and NOT gates.

Another form can also be derived from the truth table. The SOP form expresses that the output is 1 if any of the input combinations that produce 1 is true. We can also say that the output is 1 if none of the input combinations that produce 0 is true. Thus,

F = NOT(Ā B̄ C̄) · NOT(Ā B̄ C) · NOT(A B̄ C̄) · NOT(A B̄ C) · NOT(A B C)

This can be rewritten using a generalization of DeMorgan's theorem:

NOT(X · Y · Z) = X̄ + Ȳ + Z̄

Figure 11.4 Sum-of-Products Implementation of Table 11.3

Figure 11.5 Product-of-Sums Implementation of Table 11.3

Thus,

F = (A + B + C) · (A + B + C̄) · (Ā + B + C) · (Ā + B + C̄) · (Ā + B̄ + C̄)     (11.2)

This is in the product of sums (POS) form, which is illustrated in Figure 11.5. For clarity, NOT gates are not shown. Rather, it is assumed that each input signal and its complement are available. This simplifies the logic diagram and makes the inputs to the gates more readily apparent.

Thus, a Boolean function can be realized in either SOP or POS form. At this point, it would seem that the choice would depend on whether the truth table contains more 1s or 0s for the output function: The SOP has one term for each 1, and the POS has one term for each 0. However, there are other considerations:

• It is often possible to derive a simpler Boolean expression from the truth table than either SOP or POS.
• It may be preferable to implement the function with a single gate type (NAND or NOR).

The significance of the first point is that, with a simpler Boolean expression, fewer gates will be needed to implement the function. Three methods that can be used to achieve simplification are

• Algebraic simplification
• Karnaugh maps
• Quine–McCluskey tables
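The SOP and POS forms can be read mechanically off a truth table. The sketch below (my own illustration, with names of my choosing) enumerates the minterms and maxterms of Table 11.3 and confirms that both forms reproduce the original function.

from itertools import product

TABLE_11_3 = {          # (A, B, C) -> F, transcribed from Table 11.3
    (0, 0, 0): 0, (0, 0, 1): 0, (0, 1, 0): 1, (0, 1, 1): 1,
    (1, 0, 0): 0, (1, 0, 1): 0, (1, 1, 0): 1, (1, 1, 1): 0,
}

rows = list(product((0, 1), repeat=3))
minterms = [r for r in rows if TABLE_11_3[r] == 1]   # one AND term per 1 (SOP)
maxterms = [r for r in rows if TABLE_11_3[r] == 0]   # one OR term per 0 (POS)

def sop(a, b, c):
    # Product term for minterm m is 1 only when the inputs equal m exactly.
    return int(any(all(v == lit for v, lit in zip((a, b, c), m)) for m in minterms))

def pos(a, b, c):
    # Sum term for maxterm m is 0 only when the inputs equal m exactly.
    return int(all(any(v != lit for v, lit in zip((a, b, c), m)) for m in maxterms))

assert all(TABLE_11_3[r] == sop(*r) == pos(*r) for r in rows)
print("minterms (F = 1):", minterms)   # [(0,1,0), (0,1,1), (1,1,0)]
print("maxterms (F = 0):", maxterms)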

Figure 11.6 Simplified Implementation of Table 11.3

ALGEBRAIC SIMPLIFICATION Algebraic simplification involves the application of the identities of Table 11.2 to reduce the Boolean expression to one with fewer elements. For example, consider again Equation (11.1). Some thought should convince the reader that an equivalent expression is

F = ĀB + BC̄     (11.3)

Or, even simpler,

F = B(Ā + C̄)

This expression can be implemented as shown in Figure 11.6. The simplification of Equation (11.1) was done essentially by observation. For more complex expressions, some more systematic approach is needed.

KARNAUGH MAPS For purposes of simplification, the Karnaugh map is a convenient way of representing a Boolean function of a small number (up to four) of variables. The map is an array of 2ⁿ squares, representing all possible combinations of values of n binary variables. Figure 11.7a shows the map of four squares for a function of two variables. It is essential for later purposes to list the combinations in the order 00, 01, 11, 10. Because the squares corresponding to the combinations are to be used for recording information, the combinations are customarily written above the squares. In the case of three variables, the representation is an arrangement of eight squares (Figure 11.7b), with the values for one of the variables to the left and for the other two variables above the squares. For four variables, 16 squares are needed, with the arrangement indicated in Figure 11.7c.

The map can be used to represent any Boolean function in the following way. Each square corresponds to a unique product in the sum-of-products form, with a 1 value corresponding to the variable and a 0 value corresponding to the NOT of that variable. Thus, the product AB̄ corresponds to the fourth square in Figure 11.7a. For each such product in the function, 1 is placed in the corresponding square. Thus, for the two-variable example, the map corresponds to ĀB + AB̄. Given the truth table of a Boolean function, it is an easy matter to construct the map: for each combination of values of variables that produce a result of 1 in the truth table, fill in the corresponding square of the map with 1. Figure 11.7b shows the result for the truth table of Table 11.3. To convert from a Boolean expression to a map, it is first necessary to put the expression into what is referred to as canonical form: each term in the expression must contain each variable. So, for example, if we have Equation (11.3), we must first expand it into the full form of Equation (11.1) and then convert this to a map.
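A quick exhaustive check confirms the algebraic simplification. This snippet (mine, not the book's) compares Equation (11.1) with Equation (11.3) over all eight input combinations; NOT x is written as 1 - x.

from itertools import product

def eq_11_1(a, b, c):   # F = (NOT A)·B·(NOT C) + (NOT A)·B·C + A·B·(NOT C)
    return int(((1 - a) & b & (1 - c)) | ((1 - a) & b & c) | (a & b & (1 - c)))

def eq_11_3(a, b, c):   # F = (NOT A)·B + B·(NOT C), i.e. B·(NOT A + NOT C)
    return int(((1 - a) & b) | (b & (1 - c)))

assert all(eq_11_1(*v) == eq_11_3(*v) for v in product((0, 1), repeat=3))
print("Equation (11.1) and Equation (11.3) agree on all inputs.")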

Figure 11.7 The Use of Karnaugh Maps to Represent Boolean Functions ((a) map for a function of two variables; (b) three-variable map for the function of Table 11.3; (c) four-variable map; (d) simplified labeling of the map)

The labeling used in Figure 11.7d emphasizes the relationship between variables and the rows and columns of the map. Here the two rows embraced by the symbol A are those in which the variable A has the value 1; the rows not embraced by the symbol A are those in which A is 0; similarly for B, C, and D.

Once the map of a function is created, we can often write a simple algebraic expression for it by noting the arrangement of the 1s on the map. The principle is as follows. Any two squares that are adjacent differ in only one of the variables. If two adjacent squares both have an entry of one, then the corresponding product terms differ in only one variable. In such a case, the two terms can be merged by eliminating that variable. For example, in Figure 11.8a, the two adjacent squares correspond to the two terms ĀBC̄D and ĀBCD. Thus, the function expressed is

ĀBC̄D + ĀBCD = ĀBD

This process can be extended in several ways. First, the concept of adjacency can be extended to include wrapping around the edge of the map. Thus, the top square of a column is adjacent to the bottom square, and the leftmost square of a row is adjacent to the rightmost square. These conditions are illustrated in Figures 11.8b and c. Second, we can group not just 2 squares but 2ⁿ adjacent squares (i.e., 2, 4, 8, etc.). The next three examples in Figure 11.8 show groupings of 4 squares. Note that in this case, two of the variables can be eliminated. The last three examples show groupings of 8 squares, which allow three variables to be eliminated.

We can summarize the rules for simplification as follows:

1. Among the marked squares (squares with a 1), find those that belong to a unique largest block of 1, 2, 4, or 8 and circle those blocks.

Figure 11.8 The Use of Karnaugh Maps (examples of groupings of 2, 4, and 8 adjacent squares, including groups that wrap around the edges of the map)

2. Select additional blocks of marked squares that are as large as possible and as few in number as possible, but include every marked square at least once. The results may not be unique in some cases. For example, if a marked square combines with exactly two other squares, and there is no fourth marked square to complete a larger group, then there is a choice to be made as to which of the two groupings to choose. When you are circling groups, you are allowed to use the same 1 value more than once.
3. Continue to draw loops around single marked squares, or pairs of adjacent marked squares, or groups of four, eight, and so on in such a way that every marked square belongs to at least one loop; then use as few of these blocks as possible to include all marked squares.

Figure 11.9a, based on Table 11.3, illustrates the simplification process. If any isolated 1s remain after the groupings, then each of these is circled as a group of 1s.

Figure 11.9 Overlapping Groups

Finally, before going from the map to a simplified Boolean expression, any group of 1s that is completely overlapped by other groups can be eliminated. This is shown in Figure 11.9b. In this case, the horizontal group is redundant and may be ignored in creating the Boolean expression.

One additional feature of Karnaugh maps needs to be mentioned. In some cases, certain combinations of values of variables never occur, and therefore the corresponding output never occurs. These are referred to as "don't care" conditions. For each such condition, the letter "d" is entered into the corresponding square of the map. In doing the grouping and simplification, each "d" can be treated as a 1 or 0, whichever leads to the simplest expression.

An example, presented in [HAYE98], illustrates the points we have been discussing. We would like to develop the Boolean expressions for a circuit that adds 1 to a packed decimal digit. For packed decimal, each decimal digit is represented by a 4-bit code, in the obvious way. Thus, 0 = 0000, 1 = 0001, …, 8 = 1000, and 9 = 1001. The remaining 4-bit values, from 1010 to 1111, are not used. This code is also referred to as Binary Coded Decimal (BCD).

Table 11.4 shows the truth table for producing a 4-bit result that is one more than a 4-bit BCD input. The addition is modulo 10. Thus, 9 + 1 = 0. Also, note that six of the input codes produce "don't care" results, because those are not valid BCD inputs. Figure 11.10 shows the resulting Karnaugh maps for each of the output variables. The d squares are used to achieve the best possible groupings.

THE QUINE–MCCLUSKEY METHOD For more than four variables, the Karnaugh map method becomes increasingly cumbersome. With five variables, two 4 × 4 maps are needed, with one map considered to be on top of the other in three dimensions to achieve adjacency. Six variables require the use of four 4 × 4 tables in four dimensions!

Table 11.4 Truth Table for the One-Digit Packed Decimal Incrementer

        Input                 Output
Number  A B C D     Number    W X Y Z
0       0 0 0 0     1         0 0 0 1
1       0 0 0 1     2         0 0 1 0
2       0 0 1 0     3         0 0 1 1
3       0 0 1 1     4         0 1 0 0
4       0 1 0 0     5         0 1 0 1
5       0 1 0 1     6         0 1 1 0
6       0 1 1 0     7         0 1 1 1
7       0 1 1 1     8         1 0 0 0
8       1 0 0 0     9         1 0 0 1
9       1 0 0 1     0         0 0 0 0
        1 0 1 0               d d d d   (don't care condition)
        1 0 1 1               d d d d
        1 1 0 0               d d d d
        1 1 0 1               d d d d
        1 1 1 0               d d d d
        1 1 1 1               d d d d

An alternative approach is a tabular technique, referred to as the Quine–McCluskey method. The method is suitable for programming on a computer to give an automatic tool for producing minimized Boolean expressions.

Figure 11.10 Karnaugh Maps for the Incrementer (one map for each of the outputs W, X, Y, and Z; the d squares are used to achieve the best groupings)

The method is best explained by means of an example. Consider the following expression:

ABCD + ABC̄D + ABC̄D̄ + AB̄CD + ĀBCD + ĀBCD̄ + ĀBC̄D + ĀB̄C̄D

Let us assume that this expression was derived from a truth table. We would like to produce a minimal expression suitable for implementation with gates.

The first step is to construct a table in which each row corresponds to one of the product terms of the expression. The terms are grouped according to the number of complemented variables. That is, we start with the term with no complements, if it exists, then all terms with one complement, and so on. Table 11.5 shows the list for our example expression, with horizontal lines used to indicate the grouping. For clarity, each term is represented by a 1 for each uncomplemented variable and a 0 for each complemented variable. Thus, we group terms according to the number of 1s they contain. The index column is simply the decimal equivalent and is useful in what follows.

The next step is to find all pairs of terms that differ in only one variable, that is, all pairs of terms that are the same except that one variable is 0 in one of the terms and 1 in the other. Because of the way in which we have grouped the terms, we can do this by starting with the first group and comparing each term of the first group with every term of the second group. Then compare each term of the second group with all of the terms of the third group, and so on. Whenever a match is found, place a check next to each term, combine the pair by eliminating the variable that differs in the two terms, and add that to a new list. Thus, for example, the terms ĀBCD̄ and ĀBCD are combined to produce ĀBC. This process continues until the entire original table has been examined. The result is a new table with the following entries:

Ā C̄ D        Ā B C
Ā B D ✓      A B C̄
B C̄ D ✓      A C D
B C D ✓      A B D ✓

Table 11.5 First Stage of Quine–McCluskey Method (for F = ABCD + ABC̄D + ABC̄D̄ + AB̄CD + ĀBCD + ĀBCD̄ + ĀBC̄D + ĀB̄C̄D)

Product Term Index A B C D

Ā B̄ C̄ D      1     0 0 0 1   ✓
Ā B C̄ D      5     0 1 0 1   ✓
Ā B C D̄      6     0 1 1 0   ✓
A B C̄ D̄     12     1 1 0 0   ✓
Ā B C D      7     0 1 1 1   ✓
A B̄ C D     11     1 0 1 1   ✓
A B C̄ D     13     1 1 0 1   ✓
A B C D     15     1 1 1 1   ✓

The new table is organized into groups, as indicated, in the same fashion as the first table. The second table is then processed in the same manner as the first. That is, terms that differ in only one variable are checked and a new term produced for a third table. In this example, the third table that is produced contains only one term: BD. In general, the process would proceed through successive tables until a table with no matches was produced. In this case, this has involved three tables.

Once the process just described is completed, we have eliminated many of the possible terms of the expression. Those terms that have not been eliminated are used to construct a matrix, as illustrated in Table 11.6. Each row of the matrix corresponds to one of the terms that have not been eliminated (has no check) in any of the tables used so far. Each column corresponds to one of the terms in the original expression. An X is placed at each intersection of a row and a column such that the row element is "compatible" with the column element. That is, the variables present in the row element have the same value as the variables present in the column element. Next, circle each X that is alone in a column. Then place a square around each X in any row in which there is a circled X. If every column now has either a squared or a circled X, then we are done, and those row elements whose Xs have been marked constitute the minimal expression. Thus, in our example, the final expression is

ĀBC + ACD + ABC̄ + ĀC̄D

In cases in which some columns have neither a circle nor a square, additional processing is required. Essentially, we keep adding row elements until all columns are covered.

Let us summarize the Quine–McCluskey method to try to justify intuitively why it works. The first phase of the operation is reasonably straightforward. The process eliminates unneeded variables in product terms. Thus, the expression ABC + ABC̄ is equivalent to AB, because

ABC + ABC̄ = AB(C + C̄) = AB

After the elimination of variables, we are left with an expression that is clearly equivalent to the original expression. However, there may be redundant terms in this expression, just as we found redundant groupings in Karnaugh maps. The matrix layout assures that each term in the original expression is covered and does so in a way that minimizes the number of terms in the final expression.
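The combining phase lends itself to a short program. The sketch below (my own, not from the book) implements the first phase described above for the example's eight minterms: terms that differ in exactly one variable are repeatedly merged, and terms that are never checked survive as prime implicants. A '-' marks an eliminated variable.

from itertools import combinations

def combine(t1: str, t2: str):
    """Merge two terms (e.g. '0101' and '0111') that differ in exactly one position."""
    diff = [i for i in range(len(t1)) if t1[i] != t2[i]]
    if len(diff) == 1 and '-' not in (t1[diff[0]], t2[diff[0]]):
        return t1[:diff[0]] + '-' + t1[diff[0] + 1:]
    return None

# The eight product terms of the example, as A B C D bit patterns (indices 1, 5, 6, 7, 11, 12, 13, 15).
terms = {f"{i:04b}" for i in (1, 5, 6, 7, 11, 12, 13, 15)}
prime_implicants = set()

while terms:
    used, next_terms = set(), set()
    for t1, t2 in combinations(sorted(terms), 2):
        merged = combine(t1, t2)
        if merged:
            next_terms.add(merged)
            used.update((t1, t2))
    prime_implicants |= (terms - used)      # unchecked terms are prime implicants
    terms = next_terms

print(sorted(prime_implicants))
# ['-1-1', '0-01', '011-', '1-11', '110-'], i.e. BD, A'C'D, A'BC, ACD, ABC'

The second phase (the covering matrix of Table 11.6) then selects the subset of these prime implicants that covers every original term.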

Table 11.6 Last Stage of Quine–McCluskey Method (for F = ABCD + ABC̄D + ABC̄D̄ + AB̄CD + ĀBCD + ĀBCD̄ + ĀBC̄D + ĀB̄C̄D)

        ABCD   ABC̄D   ABC̄D̄   AB̄CD   ĀBCD   ĀBCD̄   ĀBC̄D   ĀB̄C̄D
BD       X      X                      X               X
ĀC̄D                                                    X       ⊗
ĀBC                                    X       ⊗
ABC̄             X       ⊗
ACD      X                      ⊗

Figure 11.11 NAND Implementation of Table 11.3

NAND AND NOR IMPLEMENTATIONS Another consideration in the implementation of Boolean functions concerns the types of gates used. It is sometimes desirable to implement a Boolean function solely with NAND gates or solely with NOR gates. Although this may not be the minimum-gate implementation, it has the advantage of regularity, which can simplify the manufacturing process. Consider again Equation (11.3):

F = B(Ā + C̄)

Because the complement of the complement of a value is just the original value,

F = B(Ā + C̄) = NOT (NOT (ĀB + BC̄))

Applying DeMorgan's theorem,

F = NOT (NOT (ĀB) · NOT (BC̄))

which has the form of three NAND operations, as illustrated in Figure 11.11.

Multiplexers

The multiplexer connects multiple inputs to a single output. At any time, one of the inputs is selected to be passed to the output. A general block diagram representation is shown in Figure 11.12. This represents a 4-to-1 multiplexer. There are four input lines, labeled D0, D1, D2, and D3. One of these lines is selected to provide the output signal F.

Figure 11.12 4-to-1 Multiplexer Representation

Table 11.7 4-to-1 Multiplexer Truth Table

S2 S1   F
0  0    D0
0  1    D1
1  0    D2
1  1    D3
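In software terms, Table 11.7 is simply an indexed selection. The following sketch (mine, not from the text) models the 4-to-1 multiplexer: the two select lines form a 2-bit index that chooses which data input reaches the output.

def mux4(d0: int, d1: int, d2: int, d3: int, s2: int, s1: int) -> int:
    """4-to-1 multiplexer: route the data input selected by (S2, S1) to F."""
    return (d0, d1, d2, d3)[(s2 << 1) | s1]

# With S2 S1 = 1 0 the output follows D2, as in Table 11.7.
assert mux4(0, 0, 1, 0, s2=1, s1=0) == 1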

To select one of the four possible inputs, a 2-bit selection code is needed, and this is implemented as two select lines labeled S1 and S2. An example 4-to-1 multiplexer is defined by the truth table in Table 11.7. This is a simplified form of a truth table. Instead of showing all possible combinations of input variables, it shows the output as data from line D0, D1, D2, or D3. Figure 11.13 shows an implementation using AND, OR, and NOT gates. S1 and S2 are connected to the AND gates in such a way that, for any combination of S1 and S2, three of the AND gates will output 0. The fourth AND gate will output the value of the selected line, which is either 0 or 1. Thus, three of the inputs to the OR gate are always 0, and the output of the OR gate will equal the value of the selected input gate.

Using this regular organization, it is easy to construct multiplexers of size 8-to-1, 16-to-1, and so on.

Multiplexers are used in digital circuits to control signal and data routing. An example is the loading of the program counter (PC). The value to be loaded into the program counter may come from one of several different sources:

Figure 11.13 Multiplexer Implementation

Figure 11.14 Multiplexer Input to Program Counter

• A binary counter, if the PC is to be incremented for the next instruction
• The instruction register, if a branch instruction using a direct address has just been executed
• The output of the ALU, if the branch instruction specifies the address using a displacement mode

These various inputs could be connected to the input lines of a multiplexer, with the PC connected to the output line. The select lines determine which value is loaded into the PC. Because the PC contains multiple bits, multiple multiplexers are used, one per bit. Figure 11.14 illustrates this for 16-bit addresses.

Decoders

A decoder is a combinational circuit with a number of output lines, only one of which is asserted at any time. Which output line is asserted depends on the pattern of input lines. In general, a decoder has n inputs and 2ⁿ outputs. Figure 11.15 shows a decoder with three inputs and eight outputs.

Decoders find many uses in digital computers. One example is address decoding. Suppose we wish to construct a 1K-byte memory using four 256 × 8-bit RAM chips. We want a single unified address space, which can be broken down as follows:

Address       Chip
0000–00FF      0
0100–01FF      1
0200–02FF      2
0300–03FF      3

Each chip requires 8 address lines, and these are supplied by the lower-order 8 bits of the address. The higher-order 2 bits of the 10-bit address are used to select one of the four RAM chips. For this purpose, a 2-to-4 decoder is used whose output enables one of the four chips, as shown in Figure 11.16.

With an additional input line, a decoder can be used as a demultiplexer. The demultiplexer performs the inverse function of a multiplexer; it connects a single input to one of several outputs. This is shown in Figure 11.17. As before, n inputs are decoded to produce a single one of 2ⁿ outputs. All of the 2ⁿ output lines are ANDed with a data input line.

Figure 11.15 Decoder with 3 Inputs and 2³ = 8 Outputs

Figure 11.16 Address Decoding

Figure 11.17 Implementation of a Demultiplexer Using a Decoder

Thus, the n inputs act as an address to select a particular output line, and the value on the data input line (0 or 1) is routed to that output line.

The configuration in Figure 11.17 can be viewed in another way. Change the label on the new line from Data Input to Enable. This allows for the control of the timing of the decoder. The decoded output appears only when the encoded input is present and the enable line has a value of 1.

Read-Only Memory

Combinational circuits are often referred to as "memoryless" circuits, because their output depends only on their current input and no history of prior inputs is retained. However, there is one sort of memory that is implemented with combinational circuits, namely read-only memory (ROM).

Recall that a ROM is a memory unit that performs only the read operation. This implies that the binary information stored in a ROM is permanent and was created during the fabrication process. Thus, a given input to the ROM (address lines) always produces the same output (data lines). Because the outputs are a function only of the present inputs, the ROM is in fact a combinational circuit.

A ROM can be implemented with a decoder and a set of OR gates. As an example, consider Table 11.8. This can be viewed as a truth table with four inputs and four outputs. For each of the 16 possible input values, the corresponding set of values of the outputs is shown. It can also be viewed as defining the contents of a 64-bit ROM consisting of 16 words of 4 bits each. The four inputs specify an address, and the four outputs specify the contents of the location specified by the address. Figure 11.18 shows how this memory could be implemented using a 4-to-16 decoder and four OR gates. As with the PLA, a regular organization is used, and the interconnections are made to reflect the desired result.

Adders

So far, we have seen how interconnected gates can be used to implement such functions as the routing of signals, decoding, and ROM. One essential area not yet addressed is that of arithmetic. In this brief overview, we will content ourselves with looking at the addition function.

Binary addition differs from Boolean algebra in that the result includes a carry term. Thus,

Table 11.8 Truth Table for a ROM

Input Output

X1 X2 X3 X4     Z1 Z2 Z3 Z4
0  0  0  0      0  0  0  0
0  0  0  1      0  0  0  1
0  0  1  0      0  0  1  1
0  0  1  1      0  0  1  0
0  1  0  0      0  1  1  0
0  1  0  1      0  1  1  1
0  1  1  0      0  1  0  1
0  1  1  1      0  1  0  0
1  0  0  0      1  1  0  0
1  0  0  1      1  1  0  1
1  0  1  0      1  1  1  1
1  0  1  1      1  1  1  0
1  1  0  0      1  0  1  0
1  1  0  1      1  0  1  1
1  1  1  0      1  0  0  1
1  1  1  1      1  0  0  0
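Because a ROM's outputs depend only on its inputs, Table 11.8 can be modeled as a plain lookup table. The sketch below (my own illustration) stores the sixteen 4-bit words and reads one back by address.

# Contents of the 64-bit ROM of Table 11.8: 16 words of 4 bits, indexed by X1X2X3X4.
ROM = [0b0000, 0b0001, 0b0011, 0b0010, 0b0110, 0b0111, 0b0101, 0b0100,
       0b1100, 0b1101, 0b1111, 0b1110, 0b1010, 0b1011, 0b1001, 0b1000]

def read_rom(x1, x2, x3, x4):
    address = (x1 << 3) | (x2 << 2) | (x3 << 1) | x4
    return f"{ROM[address]:04b}"           # the stored word Z1Z2Z3Z4

print(read_rom(0, 1, 1, 0))                # address 0110 -> '0101', as in the table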

Figure 11.18 A 64-Bit ROM

Table 11.9 Binary Addition Truth Tables

(a) Single-Bit Addition

A B    Sum   Carry
0 0     0      0
0 1     1      0
1 0     1      0
1 1     0      1

(b) Addition with Carry Input

Cin A B    Sum   Cout
0   0 0     0      0
0   0 1     1      0
0   1 0     1      0
0   1 1     0      1
1   0 0     1      0
1   0 1     0      1
1   1 0     0      1
1   1 1     1      1

0 + 0 = 0     0 + 1 = 1     1 + 0 = 1     1 + 1 = 10

However, addition can still be dealt with in Boolean terms. In Table 11.9a, we show the logic for adding two input bits to produce a 1-bit sum and a carry bit. This truth table could easily be implemented in digital logic. However, we are not interested in performing addition on just a single pair of bits. Rather, we wish to add two n-bit numbers. This can be done by putting together a set of adders so that the carry from one adder is provided as input to the next. A 4-bit adder is depicted in Figure 11.19.

For a multiple-bit adder to work, each of the single-bit adders must have three inputs, including the carry from the next-lower-order adder. The revised truth table appears in Table 11.9b. The two outputs can be expressed:

Sum = ĀB̄C + ĀBC̄ + AB̄C̄ + ABC
Carry = AB + AC + BC

Figure 11.20 is an implementation using AND, OR, and NOT gates.
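The Sum and Carry expressions translate directly into code. The following sketch (my own, not the book's) models the single-bit adder of Table 11.9b and chains four of them into the ripple-carry adder of Figure 11.19; NOT x is written as 1 - x.

def full_adder(a: int, b: int, c: int):
    """One-bit adder with carry in (Table 11.9b)."""
    sum_  = ((1-a) & (1-b) & c) | ((1-a) & b & (1-c)) | (a & (1-b) & (1-c)) | (a & b & c)
    carry = (a & b) | (a & c) | (b & c)
    return sum_, carry

def ripple_adder(a_bits, b_bits, carry_in=0):
    """Add two equal-length bit lists (least significant bit first)."""
    s, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        bit, carry = full_adder(a, b, carry)
        s.append(bit)
    return s, carry

# 6 (0110) + 7 (0111) = 13 (1101), no carry out of 4 bits; lists are LSB first.
print(ripple_adder([0, 1, 1, 0], [1, 1, 1, 0]))   # ([1, 0, 1, 1], 0)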

Figure 11.19 4-Bit Adder

Figure 11.20 Implementation of an Adder

Thus we have the necessary logic to implement a multiple-bit adder such as shown in Figure 11.21. Note that because the output from each adder depends on the carry from the previous adder, there is an increasing delay from the least significant to the most significant bit. Each single-bit adder experiences a certain amount of gate delay, and this gate delay accumulates. For larger adders, the accumulated delay can become unacceptably high.

If the carry values could be determined without having to ripple through all the previous stages, then each single-bit adder could function independently, and delay would not accumulate. This can be achieved with an approach known as carry lookahead. Let us look again at the 4-bit adder to explain this approach.

We would like to come up with an expression that specifies the carry input to any stage of the adder without reference to previous carry values. We have

Figure 11.21 Construction of a 32-Bit Adder Using 8-Bit Adders

C0 = A0B0     (11.4)
C1 = A1B1 + A1A0B0 + B1A0B0     (11.5)

Following the same procedure, we get

C2 = A2B2 + A2A1B1 + A2A1A0B0 + A2B1A0B0 + B2A1B1 + B2A1A0B0 + B2B1A0B0

This process can be repeated for arbitrarily long adders. Each carry term can be expressed in SOP form as a function only of the original inputs, with no dependence on the carries. Thus, only two levels of gate delay occur regardless of the length of the adder.

For long numbers, this approach becomes excessively complicated. Evaluating the expression for the most significant bit of an n-bit adder requires an OR gate with 2ⁿ − 1 inputs and 2ⁿ − 1 AND gates with from 2 to n + 1 inputs. Accordingly, full carry lookahead is typically done only 4 to 8 bits at a time. Figure 11.21 shows how a 32-bit adder can be constructed out of four 8-bit adders. In this case, the carry must ripple through the four 8-bit adders, but this will be substantially quicker than a ripple through thirty-two 1-bit adders.
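The carry equations can be checked mechanically. The sketch below (my own, not from the book) writes out C0, C1, and C2 exactly as in Equations (11.4), (11.5), and the expression above, and confirms over all inputs that they agree with the carries produced by rippling through the stages.

from itertools import product

def explicit_carries(A, B):
    """C0, C1, C2 written out in two-level SOP form, per the text."""
    c0 = A[0] & B[0]
    c1 = (A[1] & B[1]) | (A[1] & A[0] & B[0]) | (B[1] & A[0] & B[0])
    c2 = ((A[2] & B[2]) | (A[2] & A[1] & B[1]) | (A[2] & A[1] & A[0] & B[0]) |
          (A[2] & B[1] & A[0] & B[0]) | (B[2] & A[1] & B[1]) |
          (B[2] & A[1] & A[0] & B[0]) | (B[2] & B[1] & A[0] & B[0]))
    return c0, c1, c2

def ripple_carries(A, B):
    carries, c = [], 0
    for a, b in zip(A, B):
        c = (a & b) | (a & c) | (b & c)      # carry out of each single-bit adder
        carries.append(c)
    return tuple(carries)

for bits in product((0, 1), repeat=6):
    A, B = bits[:3], bits[3:]
    assert explicit_carries(A, B) == ripple_carries(A, B)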

11.4 SEQUENTIAL CIRCUITS

Combinational circuits implement the essential functions of a digital computer. However, except for the special case of ROM, they provide no memory or state information, elements also essential to the operation of a digital computer. For the latter purposes, a more complex form of digital logic circuit is used: the sequential circuit. The current output of a sequential circuit depends not only on the current input, but also on the past history of inputs. Another and generally more useful way to view it is that the current output of a sequential circuit depends on the current input and the current state of that circuit.

In this section, we examine some simple but useful examples of sequential circuits. As will be seen, the sequential circuit makes use of combinational circuits.

Flip-Flops

The simplest form of sequential circuit is the flip-flop. There are a variety of flip-flops, all of which share two properties:

• The flip-flop is a bistable device. It exists in one of two states and, in the absence of input, remains in that state. Thus, the flip-flop can function as a 1-bit memory.
• The flip-flop has two outputs, which are always the complements of each other. These are generally labeled Q and Q̄.

THE S–R LATCH Figure 11.22 shows a common configuration known as the S–R flip-flop or S–R latch. The circuit has two inputs, S (Set) and R (Reset), and two outputs, Q and Q̄, and consists of two NOR gates connected in a feedback arrangement.

Figure 11.22 The S–R Latch Implemented with NOR Gates

First, let us show that the circuit is bistable. Assume that both S and R are 0 and that Q is 0. The inputs to the lower NOR gate are Q = 0 and S = 0. Thus, the output Q̄ = 1 means that the inputs to the upper NOR gate are Q̄ = 1 and R = 0, which has the output Q = 0. Thus, the state of the circuit is internally consistent and remains stable as long as S = R = 0. A similar line of reasoning shows that the state Q = 1, Q̄ = 0 is also stable for R = S = 0.

Thus, this circuit can function as a 1-bit memory. We can view the output Q as the "value" of the bit. The inputs S and R serve to write the values 1 and 0, respectively, into memory. To see this, consider the state Q = 0, Q̄ = 1, S = 0, R = 0. Suppose that S changes to the value 1. Now the inputs to the lower NOR gate are S = 1, Q = 0. After some time delay Δt, the output of the lower NOR gate will be Q̄ = 0 (see Figure 11.23). So, at this point in time, the inputs to the upper NOR gate become R = 0, Q̄ = 0. After another gate delay of Δt the output Q becomes 1. This is again a stable state. The inputs to the lower gate are now S = 1, Q = 1, which maintain the output Q̄ = 0. As long as S = 1 and R = 0, the outputs will remain Q = 1, Q̄ = 0. Furthermore, if S returns to 0, the outputs will remain unchanged.

The R input performs the opposite function. When R goes to 1, it forces Q = 0, Q̄ = 1 regardless of the previous state of Q and Q̄. Again, a time delay of 2Δt occurs before the final state is established (Figure 11.23).

The S–R latch can be defined with a table similar to a truth table, called a characteristic table, which shows the next state or states of a sequential circuit as a function of current states and inputs. In the case of the S–R latch, the state can be defined by the value of Q. Table 11.10a shows the resulting characteristic table. Observe that the inputs S = 1, R = 1 are not allowed, because these would produce an inconsistent output (both Q and Q̄ equal 0). The table can be expressed more compactly, as in Table 11.10b. An illustration of the behavior of the S–R latch is shown in Table 11.10c.

CLOCKED S–R FLIP-FLOP The output of the S–R latch changes, after a brief time delay, in response to a change in the input. This is referred to as asynchronous operation. More typically, events in the digital computer are synchronized to a clock pulse, so that changes occur only when a clock pulse occurs. Figure 11.24 shows this arrangement.

Figure 11.23 NOR S–R Latch Timing Diagram

Table 11.10 The S–R Latch

(a) Characteristic Table

Current Inputs SR   Current State Qn   Next State Qn+1
      00                  0                 0
      00                  1                 1
      01                  0                 0
      01                  1                 0
      10                  0                 1
      10                  1                 1
      11                  0                 —
      11                  1                 —

(b) Simplified Characteristic Table

S R   Qn+1
0 0    Qn
0 1    0
1 0    1
1 1    —

(c) Response to Series of Inputs

t      0 1 2 3 4 5 6 7 8 9
S      1 0 0 0 0 0 0 0 1 0
R      0 0 0 1 0 0 1 0 0 0
Qn+1   1 1 1 0 0 0 0 0 1 1
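The feedback behavior of the latch can be imitated by iterating the two NOR equations until the outputs stop changing. The following sketch (mine, not from the book) reproduces the response sequence of Table 11.10c.

def nor(a, b):
    return 1 - (a | b)

def sr_latch_step(s, r, q, q_bar):
    """Apply inputs S and R and let the cross-coupled NOR gates settle."""
    for _ in range(4):                        # a few gate delays are enough to settle
        q, q_bar = nor(r, q_bar), nor(s, q)   # upper gate drives Q, lower gate drives Q-bar
    return q, q_bar

# Input sequence from Table 11.10c
S = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
R = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
q, q_bar = 0, 1
outputs = []
for s, r in zip(S, R):
    q, q_bar = sr_latch_step(s, r, q, q_bar)
    outputs.append(q)
print(outputs)    # [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]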

Figure 11.24 Clocked S–R Flip-Flop

Figure 11.25 D Flip-Flop

This device is referred to as a clocked S–R flip-flop. Note that the R and S inputs are passed to the NOR gates only during the clock pulse.

D FLIP-FLOP One problem with the S–R flip-flop is that the condition R = 1, S = 1 must be avoided. One way to do this is to allow just a single input. The D flip-flop accomplishes this. Figure 11.25 shows a gate implementation of the D flip-flop. By using an inverter, the nonclock inputs to the two AND gates are guaranteed to be the opposite of each other.

The D flip-flop is sometimes referred to as the data flip-flop because it is, in effect, storage for one bit of data. The output of the D flip-flop is always equal to the most recent value applied to the input. Hence, it remembers and produces the last input. It is also referred to as the delay flip-flop, because it delays a 0 or 1 applied to its input for a single clock pulse. We can capture the logic of the D flip-flop in the following truth table:

D   Qn+1
0    0
1    1

J–K FLIP-FLOP Another useful flip-flop is the J–K flip-flop. Like the S–R flip-flop, it has two inputs. However, in this case all possible combinations of input values are valid. Figure 11.26 shows a gate implementation of the J–K flip-flop, and Figure 11.27 shows its characteristic table (along with those for the S–R and D flip-flops). Note that the first three combinations are the same as for the S–R flip-flop. With no input asserted, the output is stable. If only the J input is asserted, the result is a set function,

Figure 11.26 J–K Flip-Flop

causing the output to be 1; if only the K input is asserted, the result is a reset function, causing the output to be 0. When both J and K are 1, the function performed is referred to as the toggle function: the output is reversed. Thus, if Q is 1 and 1 is applied to J and K, then Q becomes 0. The reader should verify that the implementation of Figure 11.26 produces this characteristic function.

Figure 11.27 Basic Flip-Flops (graphical symbol and truth table for the S–R, J–K, and D flip-flops)

Figure 11.28 8-Bit Parallel Register

Registers

As an example of the use of flip-flops, let us first examine one of the essential elements of the CPU: the register. As we know, a register is a digital circuit used within the CPU to store one or more bits of data. Two basic types of registers are commonly used: parallel registers and shift registers.

PARALLEL REGISTERS A parallel register consists of a set of 1-bit memories that can be read or written simultaneously. It is used to store data. The registers that we have discussed throughout this book are parallel registers.

The 8-bit register of Figure 11.28 illustrates the operation of a parallel register using D flip-flops. A control signal, labeled load, controls writing into the register from signal lines, D11 through D18. These lines might be the output of multiplexers, so that data from a variety of sources can be loaded into the register.

SHIFT REGISTER A shift register accepts and/or transfers information serially. Consider, for example, Figure 11.29, which shows a 5-bit shift register constructed from clocked D flip-flops. Data are input only to the leftmost flip-flop. With each clock pulse, data are shifted to the right one position, and the rightmost bit is transferred out.

Shift registers can be used to interface to serial I/O devices. In addition, they can be used within the ALU to perform logical shift and rotate functions. In this latter capacity, they need to be equipped with parallel read/write circuitry as well as serial.

Figure 11.29 5-Bit Shift Register
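Behaviorally, a shift register is a short queue of D flip-flops clocked together. This sketch (my own, not from the text) models the 5-bit shift register of Figure 11.29: on each clock pulse a new bit enters on the left and the rightmost bit falls out.

class ShiftRegister:
    """Serial-in, serial-out shift register built from clocked D flip-flops."""
    def __init__(self, width: int = 5):
        self.bits = [0] * width

    def clock(self, serial_in: int) -> int:
        serial_out = self.bits[-1]
        self.bits = [serial_in] + self.bits[:-1]   # shift right by one position
        return serial_out

sr = ShiftRegister()
for bit in (1, 0, 1, 1, 0):
    sr.clock(bit)
print(sr.bits)     # [0, 1, 1, 0, 1]: the first bit shifted in is now rightmost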

Counters

Another useful category of sequential circuit is the counter. A counter is a register whose value is easily incremented by 1 modulo the capacity of the register; that is, after the maximum value is achieved the next increment sets the counter value to 0. Thus, a register made up of n flip-flops can count up to 2ⁿ − 1. An example of a counter in the CPU is the program counter.

Counters can be designated as asynchronous or synchronous, depending on the way in which they operate. Asynchronous counters are relatively slow because the output of one flip-flop triggers a change in the status of the next flip-flop. In a synchronous counter, all of the flip-flops change state at the same time. Because the latter type is much faster, it is the kind used in CPUs. However, it is useful to begin the discussion with a description of an asynchronous counter.

RIPPLE COUNTER An asynchronous counter is also referred to as a ripple counter, because the change that occurs to increment the counter starts at one end and "ripples" through to the other end. Figure 11.30 shows an implementation of a 4-bit counter using J–K flip-flops, together with a timing diagram that illustrates its behavior. The timing diagram is idealized in that it does not show the propagation delay that occurs as the signals move down the series of flip-flops. The output of the leftmost flip-flop (Q0) is the least significant bit. The design could clearly be extended to an arbitrary number of bits by cascading more flip-flops.

Figure 11.30 Ripple Counter ((a) sequential circuit; (b) timing diagram)

In the illustrated implementation, the counter is incremented with each clock pulse. The J and K inputs to each flip-flop are held at a constant 1. This means that, when there is a clock pulse, the output at Q will be inverted (1 to 0; 0 to 1). Note that the change in state is shown as occurring with the falling edge of the clock pulse; this is known as an edge-triggered flip-flop. Using flip-flops that respond to the transition in a clock pulse rather than the pulse itself provides better timing control in complex circuits. If one looks at patterns of output for this counter, it can be seen that it cycles through 0000, 0001, …, 1110, 1111, 0000, and so on.

SYNCHRONOUS COUNTERS The ripple counter has the disadvantage of the delay involved in changing value, which is proportional to the length of the counter. To overcome this disadvantage, CPUs make use of synchronous counters, in which all of the flip-flops of the counter change at the same time. In this subsection, we present a design for a 3-bit synchronous counter. In doing so, we illustrate some basic concepts in the design of a synchronous circuit.

For a 3-bit counter, three flip-flops will be needed. Let us use J–K flip-flops. Label the uncomplemented output of the three flip-flops A, B, and C, respectively, with C representing the least significant bit. The first step is to construct a truth table that relates the J–K inputs and outputs, to allow us to design the overall circuit. Such a truth table is shown in Figure 11.31a. The first three columns show the possible combinations of outputs A, B, and C. They are listed in the order that they will appear as the counter is incremented. Each row lists the current value of A, B, C and the inputs to the three flip-flops that will be required to reach the next value of A, B, C.

To understand the way in which the truth table of Figure 11.31a is constructed, it may be helpful to recast the characteristic table for the J–K flip-flop. Recall that this table was presented as follows:

J K   Qn+1
0 0    Qn
0 1    0
1 0    1
1 1    Q̄n

In this form, the table shows the effect that the J and K inputs have on the output. Now consider the following organization of the same information:

Qn   J K   Qn+1
0    0 d    0
0    1 d    1
1    d 1    0
1    d 0    1

In this form, the table provides the value of the next output when the inputs and the present output are known. This is exactly the information needed to design the counter or, indeed, any sequential circuit. In this form, the table is referred to as an excitation table.

Figure 11.31 Design of a Synchronous Counter ((a) truth table; (b) Karnaugh maps; (c) logic diagram)

Let us return to Figure 11.31a. Consider the first row. We want the value of A to remain 0, the value of B to remain 0, and the value of C to go from 0 to 1 with the next application of a clock pulse. The excitation table shows that to maintain an output of 0, we must have inputs of J = 0 and don't care for K. To effect a transition from 0 to 1, the inputs must be J = 1 and K = d. These values are shown in the first row of the table. By similar reasoning, the remainder of the table can be filled in.

Having constructed the truth table of Figure 11.31a, we see that the table shows the required values of all of the J and K inputs as functions of the current values of A, B, and C. With the aid of Karnaugh maps, we can develop Boolean expressions for these six functions. This is shown in part b of the figure. For example, the Karnaugh map for the variable Ja (the J input to the flip-flop that produces the A output) yields the expression Ja = BC. When all six expressions are derived, it is a straightforward matter to design the actual circuit, as shown in part c of the figure.
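The derived input equations can be checked by simulation. The sketch below (my own, not from the book) uses the J–K characteristic behavior together with the input expressions: the text gives Ja = Ka = BC, and the corresponding expressions Jb = Kb = C and Jc = Kc = 1 follow from the same derivation (my reading of the maps in Figure 11.31b). The simulation confirms that the three flip-flops step through the count sequence.

def jk_next(q, j, k):
    """Next output of a J-K flip-flop: hold, reset, set, or toggle."""
    return [q, 0, 1, 1 - q][(j << 1) | k]

def step(a, b, c):
    # Input equations for the 3-bit synchronous counter (C is the least significant bit).
    ja = ka = b & c          # Ja = Ka = BC
    jb = kb = c              # Jb = Kb = C
    jc = kc = 1              # Jc = Kc = 1
    return jk_next(a, ja, ka), jk_next(b, jb, kb), jk_next(c, jc, kc)

state = (0, 0, 0)
seq = []
for _ in range(8):
    seq.append(state)
    state = step(*state)
print(seq)    # (0,0,0), (0,0,1), (0,1,0), ..., (1,1,1); the next step returns to (0,0,0)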

11.5 PROGRAMMABLE LOGIC DEVICES

Thus far, we have treated individual gates as building blocks, from which arbitrary functions can be realized. The designer could pursue a strategy of minimizing the number of gates to be used by manipulating the corresponding Boolean expressions. As the level of integration provided by integrated circuits increases, other considerations apply. Early integrated circuits, using small-scale integration (SSI), provided from one to ten gates on a chip. Each gate is treated independently, in the building-block approach described so far. To construct a logic function, a number of these chips are laid out on a printed circuit board and the appropriate pin interconnections are made.

Increasing levels of integration made it possible to put more gates on a chip and to make gate interconnections on the chip as well. This yields the advantages of decreased cost, decreased size, and increased speed (because on-chip delays are of shorter duration than off-chip delays). A design problem arises, however. For each particular logic function or set of functions, the layout of gates and interconnections on the chip must be designed. The cost and time involved in such custom chip design is high. Thus, it becomes attractive to develop a general-purpose chip that can be readily adapted to specific purposes. This is the intent of the programmable logic device (PLD).

There are a number of different types of PLDs in commercial use. Table 11.11 lists some of the key terms and defines some of the most important types. In this section, we first look at one of the simplest such devices, the programmable logic array (PLA) and then introduce perhaps the most important and widely used type of PLD, the field-programmable gate array (FPGA).

Programmable Logic Array

The PLA is based on the fact that any Boolean function (truth table) can be expressed in a sum-of-products (SOP) form, as we have seen. The PLA consists of a regular arrangement of NOT, AND, and OR gates on a chip. Each chip input is passed through a NOT gate so that each input and its complement are available to each AND gate. The output of each AND gate is available to each OR gate, and the output of each OR gate is a chip output. By making the appropriate connections, arbitrary SOP expressions can be implemented.

Figure 11.32a shows a PLA with three inputs, eight gates, and two outputs. On the left is a programmable AND array. The AND array is programmed by establishing a connection between any PLA input or its negation and any AND gate input by connecting the corresponding lines at their point of intersection.

Table 11.11 PLD Terminology

Programmable Logic Device (PLD)
A general term that refers to any type of integrated circuit used for implementing digital hardware, where the chip can be configured by the end user to realize different designs. Programming of such a device often involves placing the chip into a special programming unit, but some chips can also be configured "in-system." Also referred to as a field-programmable device (FPD).

Programmable Logic Array (PLA)
A relatively small PLD that contains two levels of logic, an AND-plane and an OR-plane, where both levels are programmable.

Programmable Array Logic (PAL)
A relatively small PLD that has a programmable AND-plane followed by a fixed OR-plane.

Simple PLD (SPLD)
A PLA or PAL.

Complex PLD (CPLD)
A more complex PLD that consists of an arrangement of multiple SPLD-like blocks on a single chip.

Field-Programmable Gate Array (FPGA)
A PLD featuring a general structure that allows very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs (AND planes), FPGAs offer more narrow logic resources. FPGAs also offer a higher ratio of flip-flops to logic resources than do CPLDs.

Logic Block
A relatively small circuit block that is replicated in an array in an FPD. When a circuit is implemented in an FPD, it is first decomposed into smaller subcircuits that can each be mapped into a logic block. The term logic block is mostly used in the context of FPGAs, but it could also refer to a block of circuitry in a CPLD.

On the right is a programmable OR array, which involves connecting AND gate outputs to OR gate inputs. Most larger PLAs contain several hundred gates, 15 to 25 inputs, and 5 to 15 outputs. The connections from the inputs to the AND gates, and from the AND gates to the OR gates, are not specified until programming time.

PLAs are manufactured in two different ways to allow easy programming (making of connections). In the first, every possible connection is made through a fuse at every intersection point. The undesired connections can then be later removed by blowing the fuses. This type of PLA is referred to as a field-programmable logic array. Alternatively, the proper connections can be made during chip fabrication by using an appropriate mask supplied for a particular interconnection pattern. In either case, the PLA provides a flexible, inexpensive way of implementing digital logic functions. Figure 11.32b shows a programmed PLA that realizes two Boolean expressions.

Field-Programmable Gate Array

The PLA is an example of a simple PLD (SPLD). The difficulty with increasing capacity of a strict SPLD architecture is that the structure of the programmable logic-planes grows too quickly in size as the number of inputs is increased. The only feasible way to provide large capacity devices based on SPLD architectures is then to integrate multiple SPLDs onto a single chip and provide interconnect to programmably connect the SPLD blocks together. Many commercial PLD products exist on the market today with this basic structure, and are collectively referred to as Complex PLDs (CPLDs).

Figure 11.32 An Example of a Programmable Logic Array ((a) layout for a 3-input, 2-output PLA; (b) programmed PLA)

The most important type of CPLD is the FPGA. An FPGA consists of an array of uncommitted circuit elements, called logic blocks, and interconnect resources. An illustration of a typical FPGA architecture is shown in Figure 11.33. The key components of an FPGA are:

• Logic block: The configurable logic blocks are where the computation of the user's circuit takes place.
• I/O block: The I/O blocks connect I/O pins to the circuitry on the chip.
• Interconnect: These are signal paths available for establishing connections among I/O blocks and logic blocks.

The logic block can be either a combinational circuit or a sequential circuit. In essence, the programming of a logic block is done by downloading the contents of a truth table for a logic function. Figure 11.34 shows an example of a simple logic block consisting of a D flip-flop, a 2-to-1 multiplexer, and a 16-bit lookup table. The lookup table is a memory consisting of 16 1-bit elements, so that 4 input lines are required to select one of the 16 bits. Larger logic blocks have larger lookup tables and multiple interconnected lookup tables. The combinational logic realized by the lookup table can be output directly or stored in the D flip-flop and output synchronously. A separate one-bit memory controls the multiplexer to determine whether the output comes directly from the lookup table or from the flip-flop. By interconnecting numerous logic blocks, very complex logic functions can be easily implemented.

[Figure 11.33 Structure of an FPGA: a two-dimensional array of logic blocks and I/O blocks linked by interconnect resources.]

[Figure 11.34 A Simple FPGA Logic Block: inputs A0–A3 address a 16-bit lookup table; the table output feeds a 2-to-1 MUX either directly or through a D flip-flop driven by the block's clock input.]
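The behavior of the logic block in Figure 11.34 can be sketched in software. The following Python sketch is illustrative only (the class and method names are our own, not part of any vendor tool flow); it models a 4-input lookup table whose output is taken either combinationally or through the D flip-flop, with a one-bit setting standing in for the multiplexer control memory.

```python
class SimpleLogicBlock:
    """Model of the logic block of Figure 11.34: a 16-bit lookup table
    addressed by A3..A0, a D flip-flop, and a 2-to-1 output mux."""

    def __init__(self, lut_bits, registered_output=False):
        assert len(lut_bits) == 16          # one bit per input combination
        self.lut = list(lut_bits)           # the downloaded truth table
        self.ff = 0                         # D flip-flop state
        self.registered = registered_output # the one-bit mux control

    def lut_out(self, a3, a2, a1, a0):
        # The four inputs select one of the 16 stored bits.
        return self.lut[(a3 << 3) | (a2 << 2) | (a1 << 1) | a0]

    def clock(self, a3, a2, a1, a0):
        # On a clock edge the flip-flop captures the LUT output.
        self.ff = self.lut_out(a3, a2, a1, a0)

    def output(self, a3, a2, a1, a0):
        # Mux: either the direct LUT output or the registered copy.
        return self.ff if self.registered else self.lut_out(a3, a2, a1, a0)


# "Program" the block as a 4-input AND gate: only entry 15 (1111) is 1.
and4 = SimpleLogicBlock([0] * 15 + [1])
print(and4.output(1, 1, 1, 1))   # 1
print(and4.output(1, 0, 1, 1))   # 0
```

Programming a different function is simply a matter of downloading a different 16-bit truth table, which is exactly the point made in the text.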

11.6 RECOMMENDED READING

[GREG98] is an easy-to-read introduction to the concepts of this chapter. [STON96] is an excellent short introduction. A number of textbooks provide more in-depth treatment; these include [MANO08] and [FARH04]. [BROW96] is a worthwhile tutorial on programmable logic devices. [LEON08] looks at recent developments in FPGA devices, platforms, and applications.

BROW96 Brown, S., and Rose, J. "Architecture of FPGAs and CPLDs: A Tutorial." IEEE Design and Test of Computers, Vol. 13, No. 2, 1996.
FARH04 Farhat, H. Digital Design and Computer Organization. Boca Raton, FL: CRC Press, 2004.
GREG98 Gregg, J. Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of Sets. New York: Wiley, 1998.
LEON08 Leong, P. "Recent Trends in FPGA Architectures and Applications." Proceedings, 4th IEEE International Symposium on Electronic Design, Test, and Applications, 2008.
MANO08 Mano, M., and Kime, C. Logic and Computer Design Fundamentals. Upper Saddle River, NJ: Prentice Hall, 2008.
STON96 Stonham, T. Digital Logic Techniques. London: Chapman & Hall, 1996.

11.7 KEY TERMS AND PROBLEMS

Key Terms

adder, AND gate, assert, Boolean algebra, clocked S–R flip-flop, combinational circuit, complex PLD (CPLD), counter, D flip-flop, decoder, excitation table, field-programmable gate array (FPGA), flip-flop, gates, graphical symbol, J–K flip-flop, Karnaugh map, logic block, lookup table, multiplexer, NAND gate, NOR gate, OR gate, parallel register, product of sums (POS), programmable array logic (PAL), programmable logic array (PLA), programmable logic device (PLD), Quine–McCluskey method, read-only memory (ROM), register, ripple counter, sequential circuit, shift register, simple PLD (SPLD), S–R latch, sum of products (SOP), synchronous counter, truth table, XOR gate

Problems
11.1 Construct a truth table for the following Boolean expressions:
a. ABC + A B C
b. ABC + A B C + A B C
c. A(BC + BC)
d. (A + B)(A + C)(A + B)
11.2 Simplify the following expressions according to the commutative law:
a. A · B + B · A + C · D · E + C · D · E + E · C · D
b. A · B + A · C + B · A
c. (L · M · N)(A · B)(C · D · E)(M · N · L)
d. F · (K + R) + S · V + W · X + V · S + X · W + (R + K) · F
11.3 Apply DeMorgan's theorem to the following equations:
a. F = V + A + L
b. F = A + B + C + D
11.4 Simplify the following expressions:
a. A = S · T + V · W + R · S · T
b. A = T · U · V + X · Y + Y
c. A = F · (E + F + G)
d. A = (P · Q + R + S · T)T · S
e. A = D · D · E
f. A = Y · (W + X + Y + Z) · Z
g. A = (B · E + C + F) · C
11.5 Construct the operation XOR from the basic Boolean operations AND, OR, and NOT.
11.6 Given a NOR gate and NOT gates, draw a logic diagram that will perform the three-input AND function.
11.7 Write the Boolean expression for a four-input NAND gate.
11.8 A combinational circuit is used to control a seven-segment display of decimal digits, as shown in Figure 11.35. The circuit has four inputs, which provide the four-bit code used in packed decimal representation (0₁₀ = 0000, …, 9₁₀ = 1001). The seven outputs define which segments will be activated to display a given decimal digit. Note that some combinations of inputs and outputs are not needed.
a. Develop a truth table for this circuit.
b. Express the truth table in SOP form.
c. Express the truth table in POS form.
d. Provide a simplified expression.
11.9 Design an 8-to-1 multiplexer.

[Figure 11.35 Seven-Segment LED Display Example: (a) a combinational circuit with BCD-digit inputs x1–x4 and segment outputs Z1–Z7; (b) the seven-segment display layout driven by Z1–Z7.]

11.10 Add an additional line to Figure 11.15 so that it functions as a demultiplexer.
11.11 The Gray code is a binary code for integers. It differs from the ordinary binary representation in that there is just a single bit change between the representations of any two adjacent numbers. This is useful for applications such as counters or analog-to-digital converters where a sequence of numbers is generated. Because only one bit changes at a time, there is never any ambiguity due to slight timing differences. The first eight elements of the code are:

Binary Code   Gray Code
000           000
001           001
010           011
011           010
100           110
101           111
110           101
111           100

Design a circuit that converts from binary to Gray code.
11.12 Design a 5 × 32 decoder using four 3 × 8 decoders (with enable inputs) and one 2 × 4 decoder.
11.13 Implement the full adder of Figure 11.20 with just five gates. (Hint: Some of the gates are XOR gates.)
11.14 Consider Figure 11.20. Assume that each gate produces a delay of 10 ns. Thus, the sum output is valid after 30 ns and the carry output after 20 ns. What is the total add time for a 32-bit adder
a. Implemented without carry lookahead, as in Figure 11.19?
b. Implemented with carry lookahead and using 8-bit adders, as in Figure 11.21?
11.15 An alternative form of the S–R latch has the same structure as Figure 11.22 but uses NAND gates instead of NOR gates.
a. Redo Tables 11.10a and 11.10b for an S–R latch implemented with NAND gates.
b. Complete the following table, similar to Table 11.10c.

t  0 1 2 3 4 5 6 7 8 9
S  0 1 1 1 1 1 0 1 0 1
R  1 1 0 1 0 1 1 1 0 0

11.16 Consider the graphic symbol for the S–R flip-flop in Figure 11.27. Add additional lines to depict a D flip-flop wired from the S–R flip-flop.
11.17 Show the structure of a PLA with three inputs (C, B, A) and four outputs (O0, O1, O2, O3) with the outputs defined as follows:
O0 = A BC + AB + ABC
O1 = A BC + ABC
O2 = C
O3 = AB + ABC
11.18 An interesting application of a PLA is conversion from the old, obsolete punched card character codes to ASCII codes. The standard punched cards that were so popular with computers in the past had 12 rows and 80 columns where holes could be punched. Each column corresponded to one character, so each character had a 12-bit code. However, only 96 characters were actually used. Consider an application that reads punched cards and converts the character codes to ASCII.
a. Describe a PLA implementation of this application.
b. Can this problem be solved with a ROM? Explain.

PART FOUR THE CENTRAL PROCESSING UNIT

CHAPTER 12

INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

12.1 Machine Instruction Characteristics
    Elements of a Machine Instruction
    Instruction Representation
    Instruction Types
    Number of Addresses
    Instruction Set Design
12.2 Types of Operands
    Numbers
    Characters
    Logical Data
12.3 Intel x86 and ARM Data Types
    x86 Data Types
    ARM Data Types
12.4 Types of Operations
    Data Transfer
    Arithmetic
    Logical
    Conversion
    Input/Output
    System Control
    Transfer of Control
12.5 Intel x86 and ARM Operation Types
    x86 Operation Types
    ARM Operation Types
12.6 Recommended Reading
12.7 Key Terms, Review Questions, and Problems
Appendix 12A Little-, Big-, and Bi-Endian

LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Present an overview of essential characteristics of machine instructions.
• Describe the types of operands used in typical machine instruction sets.
• Present an overview of x86 and ARM data types.
• Describe the types of operations supported by typical machine instruction sets.
• Present an overview of x86 and ARM operation types.
• Understand the differences among big endian, little endian, and bi-endian.

Much of what is discussed in this book is not readily apparent to the user or programmer of a computer. If a programmer is using a high-level language, such as Pascal or Ada, very little of the architecture of the underlying machine is visible. One boundary where the computer designer and the computer programmer can view the same machine is the machine instruction set. From the designer’s point of view, the machine instruction set provides the functional requirements for the processor: implementing the processor is a task that in large part involves imple- menting the machine instruction set. The user who chooses to program in machine language (actually, in assembly language; see Appendix B) becomes aware of the register and memory structure, the types of data directly supported by the machine, and the functioning of the ALU. A description of a computer’s machine instruction set goes a long way toward explaining the computer’s processor. Accordingly, we focus on machine instructions in this chapter and the next.

12.1 MACHINE INSTRUCTION CHARACTERISTICS

The operation of the processor is determined by the instructions it executes, referred to as machine instructions or computer instructions. The collection of different instructions that the processor can execute is referred to as the processor's instruction set.

Elements of a Machine Instruction

Each instruction must contain the information required by the processor for execution. Figure 12.1, which repeats Figure 3.6, shows the steps involved in instruction execution and, by implication, defines the elements of a machine instruction. These elements are as follows:
• Operation code: Specifies the operation to be performed (e.g., ADD, I/O). The operation is specified by a binary code, known as the operation code, or opcode.
• Source operand reference: The operation may involve one or more source operands, that is, operands that are inputs for the operation.

[Figure 12.1 Instruction Cycle State Diagram (repeats Figure 3.6): instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, and operand store, with loops for multiple operands and multiple results, a return path for string or vector data, and a return to fetch the next instruction when the current instruction is complete.]

• Result operand reference: The operation may produce a result.
• Next instruction reference: This tells the processor where to fetch the next instruction after the execution of this instruction is complete.

The address of the next instruction to be fetched could be either a real address or a virtual address, depending on the architecture. Generally, the distinction is transparent to the instruction set architecture. In most cases, the next instruction to be fetched immediately follows the current instruction. In those cases, there is no explicit reference to the next instruction. When an explicit reference is needed, then the main memory or virtual memory address must be supplied. The form in which that address is supplied is discussed in Chapter 13.

Source and result operands can be in one of four areas:
• Main or virtual memory: As with next instruction references, the main or virtual memory address must be supplied.
• Processor register: With rare exceptions, a processor contains one or more registers that may be referenced by machine instructions. If only one register exists, reference to it may be implicit. If more than one register exists, then each register is assigned a unique name or number, and the instruction must contain the number of the desired register.
• Immediate: The value of the operand is contained in a field in the instruction being executed.
• I/O device: The instruction must specify the I/O module and device for the operation. If memory-mapped I/O is used, this is just another main or virtual memory address.

Instruction Representation

Within the computer, each instruction is represented by a sequence of bits. The instruction is divided into fields, corresponding to the constituent elements of the

[Figure 12.2 A Simple Instruction Format: a 16-bit instruction consisting of a 4-bit opcode followed by two 6-bit operand references.]

instruction. A simple example of an instruction format is shown in Figure 12.2. As another example, the IAS instruction format is shown in Figure 2.2. With most instruction sets, more than one format is used. During instruction execution, an instruction is read into an instruction register (IR) in the processor. The processor must be able to extract the data from the various instruction fields to perform the required operation.

It is difficult for both the programmer and the reader of textbooks to deal with binary representations of machine instructions. Thus, it has become common practice to use a symbolic representation of machine instructions. An example of this was used for the IAS instruction set, in Table 2.1. Opcodes are represented by abbreviations, called mnemonics, that indicate the operation. Common examples include:

ADD   Add
SUB   Subtract
MUL   Multiply
DIV   Divide
LOAD  Load data from memory
STOR  Store data to memory

Operands are also represented symbolically. For example, the instruction

ADD R, Y

may mean add the value contained in data location Y to the contents of register R. In this example, Y refers to the address of a location in memory, and R refers to a particular register. Note that the operation is performed on the contents of a location, not on its address.

Thus, it is possible to write a machine-language program in symbolic form. Each symbolic opcode has a fixed binary representation, and the programmer specifies the location of each symbolic operand. For example, the programmer might begin with a list of definitions:

X = 513
Y = 514

and so on. A simple program would accept this symbolic input, convert opcodes and operand references to binary form, and construct binary machine instructions.

Machine-language programmers are rare to the point of nonexistence. Most programs today are written in a high-level language or, failing that, assembly language, which is discussed in Appendix B. However, symbolic machine language remains a useful tool for describing machine instructions, and we will use it for that purpose.
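As a rough illustration of the "simple program" just mentioned, the Python sketch below translates symbolic instructions into the 16-bit format of Figure 12.2 (4-bit opcode, two 6-bit operand references). The opcode values and the symbol table are invented for this example; the addresses are deliberately small so that they fit in the 6-bit reference fields (an address such as 513 from the text would not fit this toy format).

```python
# Hypothetical opcode assignments; the mnemonics follow the text.
OPCODES = {"LOAD": 0x1, "STOR": 0x2, "ADD": 0x3, "SUB": 0x4,
           "MUL": 0x5, "DIV": 0x6}

# Programmer-supplied symbol table (addresses chosen to fit in 6 bits).
SYMBOLS = {"X": 13, "Y": 14, "R1": 1}

def assemble(line):
    """Convert one symbolic instruction, e.g. 'ADD R1, Y', into a
    16-bit word laid out as in Figure 12.2: opcode | ref1 | ref2."""
    mnemonic, _, rest = line.partition(" ")
    refs = [SYMBOLS[tok.strip()] for tok in rest.split(",") if tok.strip()]
    while len(refs) < 2:                 # unused reference fields are zero
        refs.append(0)
    return (OPCODES[mnemonic] << 12) | (refs[0] << 6) | refs[1]

for text in ["LOAD R1, X", "ADD R1, Y", "STOR R1, X"]:
    print(f"{text:<12} -> {assemble(text):016b}")
```

A real assembler adds labels, addressing modes, and error handling, but the essential step is the same: replace mnemonics and symbols with their binary encodings and pack them into instruction fields.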

Instruction Types

Consider a high-level language instruction that could be expressed in a language such as BASIC or FORTRAN. For example,

X = X + Y

This statement instructs the computer to add the value stored in Y to the value stored in X and put the result in X. How might this be accomplished with machine instructions? Let us assume that the variables X and Y correspond to locations 513 and 514. If we assume a simple set of machine instructions, this operation could be accomplished with three instructions:
1. Load a register with the contents of memory location 513.
2. Add the contents of memory location 514 to the register.
3. Store the contents of the register in memory location 513.

As can be seen, the single BASIC instruction may require three machine instructions. This is typical of the relationship between a high-level language and a machine language. A high-level language expresses operations in a concise algebraic form, using variables. A machine language expresses operations in a basic form involving the movement of data to or from registers.

With this simple example to guide us, let us consider the types of instructions that must be included in a practical computer. A computer should have a set of instructions that allows the user to formulate any data processing task. Another way to view it is to consider the capabilities of a high-level programming language. Any program written in a high-level language must be translated into machine language to be executed. Thus, the set of machine instructions must be sufficient to express any of the instructions from a high-level language. With this in mind we can categorize instruction types as follows:
• Data processing: Arithmetic and logic instructions
• Data storage: Movement of data into or out of register and/or memory locations
• Data movement: I/O instructions
• Control: Test and branch instructions

Arithmetic instructions provide computational capabilities for processing numeric data. Logic (Boolean) instructions operate on the bits of a word as bits rather than as numbers; thus, they provide capabilities for processing any other type of data the user may wish to employ. These operations are performed primarily on data in processor registers. Therefore, there must be memory instructions for moving data between memory and the registers. I/O instructions are needed to transfer programs and data into memory and the results of computations back out to the user. Test instructions are used to test the value of a data word or the status of a computation. Branch instructions are then used to branch to a different set of instructions depending on the decision made. We will examine the various types of instructions in greater detail later in this chapter.

Number of Addresses

One of the traditional ways of describing processor architecture is in terms of the number of addresses contained in each instruction. This dimension has become less significant with the increasing complexity of processor design. Nevertheless, it is useful at this point to draw and analyze this distinction.

What is the maximum number of addresses one might need in an instruction? Evidently, arithmetic and logic instructions will require the most operands. Virtually all arithmetic and logic operations are either unary (one source operand) or binary (two source operands). Thus, we would need a maximum of two addresses to reference source operands. The result of an operation must be stored, suggesting a third address, which defines a destination operand. Finally, after completion of an instruction, the next instruction must be fetched, and its address is needed. This line of reasoning suggests that an instruction could plausibly be required to contain four address references: two source operands, one destination operand, and the address of the next instruction. In most architectures, most instructions have one, two, or three operand addresses, with the address of the next instruction being implicit (obtained from the program counter). Most architectures also have a few special-purpose instructions with more operands. For example, the load and store multiple instructions of the ARM architecture, described in Chapter 13, designate up to 17 register operands in a single instruction.

Figure 12.3 compares typical one-, two-, and three-address instructions that could be used to compute Y = (A − B) / [C + (D × E)]. With three addresses, each instruction specifies two source operand locations and a destination operand location. Because we choose not to alter the value of any of the operand locations, a temporary location, T, is used to store some intermediate results. Note that there are four instructions and that the original expression had five operands.

(a) Three-address instructions
Instruction      Comment
SUB Y, A, B      Y ← A − B
MPY T, D, E      T ← D × E
ADD T, T, C      T ← T + C
DIV Y, Y, T      Y ← Y ÷ T

(b) Two-address instructions
Instruction      Comment
MOVE Y, A        Y ← A
SUB Y, B         Y ← Y − B
MOVE T, D        T ← D
MPY T, E         T ← T × E
ADD T, C         T ← T + C
DIV Y, T         Y ← Y ÷ T

(c) One-address instructions
Instruction      Comment
LOAD D           AC ← D
MPY E            AC ← AC × E
ADD C            AC ← AC + C
STOR Y           Y ← AC
LOAD A           AC ← A
SUB B            AC ← AC − B
DIV Y            AC ← AC ÷ Y
STOR Y           Y ← AC

Figure 12.3 Programs to Execute Y = (A − B) / [C + (D × E)]
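To make the one-address column of Figure 12.3 concrete, here is a small Python sketch of a hypothetical accumulator machine executing that eight-instruction program. The memory contents and instruction representation are invented for illustration; only the accumulator behavior mirrors the figure.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    ac = 0
    for op, addr in program:
        if op == "LOAD":   ac = memory[addr]
        elif op == "STOR": memory[addr] = ac
        elif op == "ADD":  ac += memory[addr]
        elif op == "SUB":  ac -= memory[addr]
        elif op == "MPY":  ac *= memory[addr]
        elif op == "DIV":  ac /= memory[addr]
    return memory

# Operands for Y = (A - B) / [C + (D * E)]
mem = {"A": 20, "B": 2, "C": 3, "D": 1, "E": 3, "Y": 0}

program = [("LOAD", "D"), ("MPY", "E"), ("ADD", "C"), ("STOR", "Y"),
           ("LOAD", "A"), ("SUB", "B"), ("DIV", "Y"), ("STOR", "Y")]

run_accumulator(program, mem)
print(mem["Y"])   # (20 - 2) / (3 + 1*3) = 3.0
```

Note how Y is used as the temporary location for the partial result C + (D × E), exactly as in Figure 12.3c, because the accumulator is the only register available.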

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references. With two- address instructions, and for binary operations, one address must do double duty as both an operand and a result. Thus, the instruction SUB Y, B carries out the calcu- lation Y - B and stores the result in Y. The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation. Our sample program expands to six instructions. Simpler yet is the one-address instruction. For this to work, a second address must be implicit. This was common in earlier machines, with the implied address being a processor register known as the accumulator (AC). The accumulator con- tains one of the operands and is used to store the result. In our example, eight instructions are needed to accomplish the task. It is, in fact, possible to make do with zero addresses for some instructions. Zero-address instructions are applicable to a special memory organization called a stack. A stack is a last-in-first-out set of locations. The stack is in a known loca- tion and, often, at least the top two elements are in processor registers. Thus, zero-address instructions would reference the top two stack elements. Stacks are described in Appendix O. Their use is explored further later in this chapter and in Chapter 13. Table 12.1 summarizes the interpretations to be placed on instructions with zero, one, two, or three addresses. In each case in the table, it is assumed that the address of the next instruction is implicit, and that one operation with two source operands and one result operand is to be performed. The number of addresses per instruction is a basic design decision. Fewer addresses per instruction result in instructions that are more primitive, requiring a less complex processor. It also results in instructions of shorter length. On the other hand, programs contain more total instructions, which in general results in longer execution times and longer, more complex programs. Also, there is an important threshold between one-address and multiple-address instructions. With one-address instructions, the programmer generally has available only one general-purpose reg- ister, the accumulator. With multiple-address instructions, it is common to have multiple general-purpose registers. This allows some operations to be performed

Table 12.1 Utilization of Instruction Addresses (Nonbranching Instructions)

Number of Addresses   Symbolic Representation   Interpretation
3                     OP A, B, C                A ← B OP C
2                     OP A, B                   A ← A OP B
1                     OP A                      AC ← AC OP A
0                     OP                        T ← (T − 1) OP T

AC = accumulator; T = top of stack; (T − 1) = second element of stack; A, B, C = memory or register locations

solely on registers. Because register references are faster than memory references, this speeds up execution. For reasons of flexibility and ability to use multiple reg- isters, most contemporary machines employ a mixture of two- and three-address instructions. The design trade-offs involved in choosing the number of addresses per instruc- tion are complicated by other factors. There is the issue of whether an address refer- ences a memory location or a register. Because there are fewer registers, fewer bits are needed for a register reference. Also, as we shall see in Chapter 13, a machine may offer a variety of addressing modes, and the specification of mode takes one or more bits. The result is that most processor designs involve a variety of instruction formats.

Instruction Set Design One of the most interesting, and most analyzed, aspects of computer design is instruction set design. The design of an instruction set is very complex because it affects so many aspects of the computer system. The instruction set defines many of the functions performed by the processor and thus has a significant effect on the implementation of the processor. The instruction set is the programmer’s means of controlling the processor. Thus, programmer requirements must be considered in designing the instruction set. It may surprise you to know that some of the most fundamental issues relat- ing to the design of instruction sets remain in dispute. Indeed, in recent years, the level of disagreement concerning these fundamentals has actually grown. The most important of these fundamental design issues include the following: • Operation repertoire: How many and which operations to provide, and how complex operations should be • Data types: The various types of data upon which operations are performed • Instruction format: Instruction length (in bits), number of addresses, size of various fields, and so on • Registers: Number of processor registers that can be referenced by instruc- tions, and their use • Addressing: The mode or modes by which the address of an operand is specified These issues are highly interrelated and must be considered together in design- ing an instruction set. This book, of course, must consider them in some sequence, but an attempt is made to show the interrelationships. Because of the importance of this topic, much of Part Three is devoted to instruction set design. Following this overview section, this chapter examines data types and operation repertoire. Chapter 13 examines addressing modes (which includes a consideration of registers) and instruction formats. Chapter 15 examines the reduced instruction set computer (RISC). RISC architecture calls into ques- tion many of the instruction set design decisions traditionally made in commercial computers. 12.2 / TYPES OF OPERANDS 413

12.2 TYPES OF OPERANDS

Machine instructions operate on data. The most important general categories of data are • Addresses • Numbers • Characters • Logical data We shall see, in discussing addressing modes in Chapter 13, that addresses are, in fact, a form of data. In many cases, some calculation must be performed on the operand reference in an instruction to determine the main or virtual memory address. In this context, addresses can be considered to be unsigned integers. Other common data types are numbers, characters, and logical data, and each of these is briefly examined in this section. Beyond that, some machines define spe- cialized data types or data structures. For example, there may be machine opera- tions that operate directly on a list or a string of characters. Numbers All machine languages include numeric data types. Even in nonnumeric data pro- cessing, there is a need for numbers to act as counters, field widths, and so forth. An important distinction between numbers used in ordinary mathematics and num- bers stored in a computer is that the latter are limited. This is true in two senses. First, there is a limit to the magnitude of numbers representable on a machine and second, in the case of floating-point numbers, a limit to their precision. Thus, the programmer is faced with understanding the consequences of rounding, overflow, and underflow. Three types of numerical data are common in computers: • Binary integer or binary fixed point • Binary floating point • Decimal We examined the first two in some detail in Chapter 10. It remains to say a few words about decimal numbers. Although all internal computer operations are binary in nature, the human users of the system deal with decimal numbers. Thus, there is a necessity to convert from decimal to binary on input and from binary to decimal on output. For applica- tions in which there is a great deal of I/O and comparatively little, comparatively simple computation, it is preferable to store and operate on the numbers in decimal form. The most common representation for this purpose is packed decimal.1

1 Textbooks often refer to this as binary coded decimal (BCD). Strictly speaking, BCD refers to the encoding of each decimal digit by a unique 4-bit sequence. Packed decimal refers to the storage of BCD-encoded digits using one byte for each two digits.

With packed decimal, each decimal digit is represented by a 4-bit code, in the obvious way, with two digits stored per byte. Thus, 0 = 0000, 1 = 0001, …, 8 = 1000, and 9 = 1001. Note that this is a rather inefficient code because only 10 of 16 possible 4-bit values are used. To form numbers, 4-bit codes are strung together, usually in multiples of 8 bits. Thus, the code for 246 is 0000 0010 0100 0110. This code is clearly less compact than a straight binary representation, but it avoids the conversion overhead. Negative numbers can be represented by including a 4-bit sign digit at either the left or right end of a string of packed decimal digits. Standard sign values are 1100 for positive (+) and 1101 for negative (−).

Many machines provide arithmetic instructions for performing operations directly on packed decimal numbers. The algorithms are quite similar to those described in Section 9.3 but must take into account the decimal carry operation.
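A short Python sketch of packed decimal encoding follows the conventions just described: two digits per byte, with the sign digit (1100 for +, 1101 for −) appended at the right end. This is purely illustrative and does not reproduce any particular machine's packed decimal instructions.

```python
def to_packed_decimal(n):
    """Encode a signed integer as packed decimal with a trailing sign digit."""
    sign = 0b1100 if n >= 0 else 0b1101          # standard + / - sign codes
    nibbles = [int(d) for d in str(abs(n))] + [sign]
    if len(nibbles) % 2:                         # pad to a whole number of bytes
        nibbles = [0] + nibbles
    out = bytearray()
    for hi, lo in zip(nibbles[0::2], nibbles[1::2]):
        out.append((hi << 4) | lo)               # two 4-bit codes per byte
    return bytes(out)

print(to_packed_decimal(246).hex())    # '246c' -> 0010 0100 0110 1100
print(to_packed_decimal(-7).hex())     # '7d'   -> 0111 1101
```

The unsigned example in the text (246 stored as 0000 0010 0100 0110) simply omits the sign nibble and pads on the left instead.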

Characters

A common form of data is text or character strings. While textual data are most convenient for human beings, they cannot, in character form, be easily stored or transmitted by data processing and communications systems. Such systems are designed for binary data. Thus, a number of codes have been devised by which characters are represented by a sequence of bits. Perhaps the earliest common example of this is the Morse code. Today, the most commonly used character code is the International Reference Alphabet (IRA), referred to in the United States as the American Standard Code for Information Interchange (ASCII; see Appendix F). Each character in this code is represented by a unique 7-bit pattern; thus, 128 different characters can be represented. This is a larger number than is necessary to represent printable characters, and some of the patterns represent control characters. Some of these control characters have to do with controlling the printing of characters on a page. Others are concerned with communications procedures. IRA-encoded characters are almost always stored and transmitted using 8 bits per character. The eighth bit may be set to 0 or used as a parity bit for error detection. In the latter case, the bit is set such that the total number of binary 1s in each character is always odd (odd parity) or always even (even parity).

Note in Table F.1 (Appendix F) that for the IRA bit pattern 011XXXX, the digits 0 through 9 are represented by their binary equivalents, 0000 through 1001, in the rightmost 4 bits. This is the same code as packed decimal. This facilitates conversion between 7-bit IRA and 4-bit packed decimal representation.

Another code used to encode characters is the Extended Binary Coded Decimal Interchange Code (EBCDIC). EBCDIC is used on IBM mainframes. It is an 8-bit code. As with IRA, EBCDIC is compatible with packed decimal. In the case of EBCDIC, the codes 11110000 through 11111001 represent the digits 0 through 9.
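The parity option just described is easy to express in code. This Python sketch (an illustrative helper, not a standard library feature) sets the eighth bit of a 7-bit IRA/ASCII character so that the stored 8-bit pattern has odd parity.

```python
def with_odd_parity(ch):
    """Return the 8-bit pattern for a 7-bit character with odd parity."""
    code = ord(ch) & 0x7F
    ones = bin(code).count("1")
    parity = 0 if ones % 2 == 1 else 1      # force an odd number of 1 bits
    return (parity << 7) | code

print(f"{with_odd_parity('A'):08b}")   # 'A' = 1000001 (two 1s)  -> 11000001
print(f"{with_odd_parity('C'):08b}")   # 'C' = 1000011 (three 1s) -> 01000011
```

A receiver that counts the 1 bits in each arriving character and finds an even total knows that at least one bit was corrupted in transit.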

Logical Data

Normally, each word or other addressable unit (byte, halfword, and so on) is treated as a single unit of data. It is sometimes useful, however, to consider an n-bit unit as consisting of n 1-bit items of data, each item having the value 0 or 1. When data are viewed this way, they are considered to be logical data.

There are two advantages to the bit-oriented view. First, we may sometimes wish to store an array of Boolean or binary data items, in which each item can take on only the values 1 (true) and 0 (false). With logical data, memory can be used most efficiently for this storage. Second, there are occasions when we wish to manipulate the bits of a data item. For example, if floating-point operations are implemented in software, we need to be able to shift significant bits in some operations. Another example: To con- vert from IRA to packed decimal, we need to extract the rightmost 4 bits of each byte. Note that, in the preceding examples, the same data are treated sometimes as logical and other times as numerical or text. The “type” of a unit of data is deter- mined by the operation being performed on it. While this is not normally the case in high-level languages, it is almost always the case with machine language.

12.3 INTEL x86 AND ARM DATA TYPES

x86 Data Types

The x86 can deal with data types of 8 (byte), 16 (word), 32 (doubleword), 64 (quadword), and 128 (double quadword) bits in length. To allow maximum flexibility in data structures and efficient memory utilization, words need not be aligned at even-numbered addresses; doublewords need not be aligned at addresses evenly divisible by 4; quadwords need not be aligned at addresses evenly divisible by 8; and so on. However, when data are accessed across a 32-bit bus, data transfers take place in units of doublewords, beginning at addresses divisible by 4. The processor converts the request for misaligned values into a sequence of requests for the bus transfer. As with all of the Intel 80x86 machines, the x86 uses the little-endian style; that is, the least significant byte is stored in the lowest address (see Appendix 12A for a discussion of endianness).

The byte, word, doubleword, quadword, and double quadword are referred to as general data types. In addition, the x86 supports an impressive array of specific data types that are recognized and operated on by particular instructions. Table 12.2 summarizes these types.

Figure 12.4 illustrates the x86 numerical data types. The signed integers are in twos complement representation and may be 16, 32, or 64 bits long. The floating-point type actually refers to a set of types that are used by the floating-point unit and operated on by floating-point instructions. The three floating-point representations conform to the IEEE 754 standard.

The packed SIMD (single-instruction-multiple-data) data types were introduced to the x86 architecture as part of the extensions of the instruction set to optimize performance of multimedia applications. These extensions include MMX (multimedia extensions) and SSE (streaming SIMD extensions). The basic concept is that multiple operands are packed into a single referenced memory item and that these multiple operands are operated on in parallel. The data types are as follows:
• Packed byte and packed byte integer: Bytes packed into a 64-bit quadword or 128-bit double quadword, interpreted as a bit field or as an integer
• Packed word and packed word integer: 16-bit words packed into a 64-bit quadword or 128-bit double quadword, interpreted as a bit field or as an integer

Table 12.2 x86 Data Types

General: Byte, word (16 bits), doubleword (32 bits), quadword (64 bits), and double quadword (128 bits) locations with arbitrary binary contents.
Integer: A signed binary value contained in a byte, word, or doubleword, using twos complement representation.
Ordinal: An unsigned integer contained in a byte, word, or doubleword.
Unpacked binary coded decimal (BCD): A representation of a BCD digit in the range 0 through 9, with one digit in each byte.
Packed BCD: Packed byte representation of two BCD digits; value in the range 0 to 99.
Near pointer: A 16-bit, 32-bit, or 64-bit effective address that represents the offset within a segment. Used for all pointers in a nonsegmented memory and for references within a segment in a segmented memory.
Far pointer: A logical address consisting of a 16-bit segment selector and an offset of 16, 32, or 64 bits. Far pointers are used for memory references in a segmented memory model where the identity of a segment being accessed must be specified explicitly.
Bit field: A contiguous sequence of bits in which the position of each bit is considered as an independent unit. A bit string can begin at any bit position of any byte and can contain up to 32 bits.
Bit string: A contiguous sequence of bits, containing from zero to 2³² − 1 bits.
Byte string: A contiguous sequence of bytes, words, or doublewords, containing from zero to 2³² − 1 bytes.
Floating point: See Figure 12.4.
Packed SIMD (single instruction, multiple data): Packed 64-bit and 128-bit data types.

• Packed doubleword and packed doubleword integer: 32-bit doublewords packed into a 64-bit quadword or 128-bit double quadword, interpreted as a bit field or as an integer
• Packed quadword and packed quadword integer: Two 64-bit quadwords packed into a 128-bit double quadword, interpreted as a bit field or as an integer
• Packed single-precision floating-point and packed double-precision floating-point: Four 32-bit floating-point or two 64-bit floating-point values packed into a 128-bit double quadword

ARM Data Types

ARM processors support data types of 8 (byte), 16 (halfword), and 32 (word) bits in length. Normally, halfword accesses should be halfword aligned and word accesses should be word aligned. For nonaligned access attempts, the architecture supports three alternatives.
• Default case:
– The address is treated as truncated, with address bits[1:0] treated as zero for word accesses, and address bit[0] treated as zero for halfword accesses.

[Figure 12.4 x86 Numeric Data Formats: unsigned integers of 8, 16, 32, and 64 bits; twos complement signed integers of 8, 16, 32, and 64 bits; and single-precision (32-bit), double-precision (64-bit), and double extended-precision (80-bit) floating-point formats, each consisting of a sign bit, an exponent field, and a significand field.]

– Load single word ARM instructions are architecturally defined to rotate right the word-aligned data transferred by a non-word-aligned address one, two, or three bytes depending on the value of the two least significant address bits.
• Alignment checking: When the appropriate control bit is set, a data abort signal indicates an alignment fault for attempting unaligned access.
• Unaligned access: When this option is enabled, the processor uses one or more memory accesses to generate the required transfer of adjacent bytes transparently to the programmer.

For all three data types (byte, halfword, and word) an unsigned interpretation is supported, in which the value represents an unsigned, nonnegative integer. All three data types can also be used for twos complement signed integers.

The majority of ARM processor implementations do not provide floating-point hardware, which saves power and area. If floating-point arithmetic is required in such processors, it must be implemented in software. ARM does support an optional floating-point coprocessor that supports the single- and double-precision floating-point data types defined in IEEE 754.

[Figure 12.5 ARM Endian Support (Word Load/Store with E-Bit): the same four data bytes in memory, byte 0 at the lowest address through byte 3, map into an ARM register as byte 3 | byte 2 | byte 1 | byte 0 when the program status register E-bit = 0 and as byte 0 | byte 1 | byte 2 | byte 3 when the E-bit = 1.]

ENDIAN SUPPORT A state bit (E-bit) in the system control register is set and cleared under program control using the SETEND instruction. The E-bit defines which endianness is used to load and store data. Figure 12.5 illustrates the functionality associated with the E-bit for a word load or store operation. This mechanism enables efficient dynamic data load/store for system designers who know they need to access data structures in the opposite endianness to their OS/environment. Note that the address of each data byte is fixed in memory. However, the byte lane in a register is different.
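The effect of the E-bit can be mimicked with Python's struct module: the same four bytes in memory are interpreted as different 32-bit words depending on the byte order selected. This is a host-language analogy, not ARM code.

```python
import struct

mem = bytes([0x78, 0x56, 0x34, 0x12])   # byte 0 .. byte 3 at ascending addresses

little = struct.unpack("<I", mem)[0]    # E-bit = 0 style: byte 0 is least significant
big    = struct.unpack(">I", mem)[0]    # E-bit = 1 style: byte 0 is most significant

print(hex(little))   # 0x12345678
print(hex(big))      # 0x78563412
```

The bytes themselves never move; only the significance assigned to each byte position changes, which is exactly what Figure 12.5 depicts for the register byte lanes.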

12.4 TYPES OF OPERATIONS

The number of different opcodes varies widely from machine to machine. However, the same general types of operations are found on all machines. A useful and typical categorization is the following:
• Data transfer
• Arithmetic
• Logical
• Conversion
• I/O
• System control
• Transfer of control

Table 12.3 (based on [HAYE98]) lists common instruction types in each category. This section provides a brief survey of these various types of operations, together with a brief discussion of the actions taken by the processor to execute a particular type of operation (summarized in Table 12.4). The latter topic is examined in more detail in Chapter 14.

Table 12.3 Common Instruction Set Operations

Data transfer
  Move (transfer): Transfer word or block from source to destination
  Store: Transfer word from processor to memory
  Load (fetch): Transfer word from memory to processor
  Exchange: Swap contents of source and destination
  Clear (reset): Transfer word of 0s to destination
  Set: Transfer word of 1s to destination
  Push: Transfer word from source to top of stack
  Pop: Transfer word from top of stack to destination

Arithmetic
  Add: Compute sum of two operands
  Subtract: Compute difference of two operands
  Multiply: Compute product of two operands
  Divide: Compute quotient of two operands
  Absolute: Replace operand by its absolute value
  Negate: Change sign of operand
  Increment: Add 1 to operand
  Decrement: Subtract 1 from operand

Logical
  AND: Perform logical AND
  OR: Perform logical OR
  NOT (complement): Perform logical NOT
  Exclusive-OR: Perform logical XOR
  Test: Test specified condition; set flag(s) based on outcome
  Compare: Make logical or arithmetic comparison of two or more operands; set flag(s) based on outcome
  Set Control Variables: Class of instructions to set controls for protection purposes, interrupt handling, timer control, etc.
  Shift: Left (right) shift operand, introducing constants at end
  Rotate: Left (right) shift operand, with wraparound end

Transfer of control
  Jump (branch): Unconditional transfer; load PC with specified address
  Jump Conditional: Test specified condition; either load PC with specified address or do nothing, based on condition
  Jump to Subroutine: Place current program control information in known location; jump to specified address
  Return: Replace contents of PC and other register from known location
  Execute: Fetch operand from specified location and execute as instruction; do not modify PC
  Skip: Increment PC to skip next instruction
  Skip Conditional: Test specified condition; either skip or do nothing based on condition
  Halt: Stop program execution
  Wait (hold): Stop program execution; test specified condition repeatedly; resume execution when condition is satisfied
  No operation: No operation is performed, but program execution is continued

Input/output
  Input (read): Transfer data from specified I/O port or device to destination (e.g., main memory or processor register)
  Output (write): Transfer data from specified source to I/O port or device
  Start I/O: Transfer instructions to I/O processor to initiate I/O operation
  Test I/O: Transfer status information from I/O system to specified destination

Conversion
  Translate: Translate values in a section of memory based on a table of correspondences
  Convert: Convert the contents of a word from one form to another (e.g., packed decimal to binary)

Table 12.4 Processor Actions for Various Types of Operations

Data transfer: Transfer data from one location to another. If memory is involved: determine memory address; perform virtual-to-actual-memory address transformation; check cache; initiate memory read/write.
Arithmetic: May involve data transfer, before and/or after. Perform function in ALU. Set condition codes and flags.
Logical: Same as arithmetic.
Conversion: Similar to arithmetic and logical. May involve special logic to perform conversion.
Transfer of control: Update program counter. For subroutine call/return, manage parameter passing and linkage.
I/O: Issue command to I/O module. If memory-mapped I/O, determine memory-mapped address.

Data Transfer The most fundamental type of machine instruction is the data transfer instruction. The data transfer instruction must specify several things. First, the location of the source and destination operands must be specified. Each location could be memory, a register, or the top of the stack. Second, the length of data to be transferred must be indicated. Third, as with all instructions with operands, the mode of addressing for each operand must be specified. This latter point is discussed in Chapter 13. The choice of data transfer instructions to include in an instruction set exem- plifies the kinds of trade-offs the designer must make. For example, the general location (memory or register) of an operand can be indicated in either the specifica- tion of the opcode or the operand. Table 12.5 shows examples of the most common IBM EAS/390 data transfer instructions. Note that there are variants to indicate 12.4 / TYPES OF OPERATIONS 421

Table 12.5 Examples of IBM EAS/390 Data Transfer Operations

Operation Mnemonic   Name             Number of Bits Transferred   Description
L                    Load             32                           Transfer from memory to register
LH                   Load Halfword    16                           Transfer from memory to register
LR                   Load             32                           Transfer from register to register
LER                  Load (short)     32                           Transfer from floating-point register to floating-point register
LE                   Load (short)     32                           Transfer from memory to floating-point register
LDR                  Load (long)      64                           Transfer from floating-point register to floating-point register
LD                   Load (long)      64                           Transfer from memory to floating-point register
ST                   Store            32                           Transfer from register to memory
STH                  Store Halfword   16                           Transfer from register to memory
STC                  Store Character  8                            Transfer from register to memory
STE                  Store (short)    32                           Transfer from floating-point register to memory
STD                  Store (long)     64                           Transfer from floating-point register to memory

the amount of data to be transferred (8, 16, 32, or 64 bits). Also, there are different instructions for register to register, register to memory, memory to register, and memory to memory transfers. In contrast, the VAX has a move (MOV) instruction with variants for different amounts of data to be moved, but it specifies whether an operand is register or memory as part of the operand. The VAX approach is some- what easier for the programmer, who has fewer mnemonics to deal with. However, it is also somewhat less compact than the IBM EAS/390 approach because the loca- tion (register versus memory) of each operand must be specified separately in the instruction. We will return to this distinction when we discuss instruction formats in Chapter 13. In terms of processor action, data transfer operations are perhaps the simplest type. If both source and destination are registers, then the processor simply causes data to be transferred from one register to another; this is an operation internal to the processor. If one or both operands are in memory, then the processor must per- form some or all of the following actions: 1. Calculate the memory address, based on the address mode (discussed in Chapter 13). 2. If the address refers to virtual memory, translate from virtual to real memory address. 3. Determine whether the addressed item is in cache. 4. If not, issue a command to the memory module. 422 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

Arithmetic

Most machines provide the basic arithmetic operations of add, subtract, multiply, and divide. These are invariably provided for signed integer (fixed-point) numbers. Often they are also provided for floating-point and packed decimal numbers.

Other possible operations include a variety of single-operand instructions; for example,

• Absolute: Take the absolute value of the operand. • Negate: Negate the operand. • Increment: Add 1 to the operand. • Decrement: Subtract 1 from the operand.

The execution of an arithmetic instruction may involve data transfer operations to position operands for input to the ALU, and to deliver the output of the ALU. Figure 3.5 illustrates the movements involved in both data transfer and arithmetic operations. In addition, of course, the ALU portion of the processor performs the desired operation.

Logical

Most machines also provide a variety of operations for manipulating individual bits of a word or other addressable units, often referred to as "bit twiddling." They are based upon Boolean operations (see Chapter 11).

Some of the basic logical operations that can be performed on Boolean or binary data are shown in Table 12.6. The NOT operation inverts a bit. AND, OR, and Exclusive-OR (XOR) are the most common logical functions with two operands. EQUAL is a useful binary test.

These logical operations can be applied bitwise to n-bit logical data units. Thus, if two registers contain the data

(R1) = 10100101 (R2) = 00001111

then

(R1) AND (R2) = 00000101

Table 12.6 Basic Logical Operations

P Q   NOT P   P AND Q   P OR Q   P XOR Q   P = Q
0 0     1        0         0        0        1
0 1     1        0         1        1        0
1 0     0        0         1        1        0
1 1     0        1         1        0        1

where the notation (X) means the contents of location X. Thus, the AND operation can be used as a mask that selects certain bits in a word and zeros out the remaining bits.

As another example, if two registers contain

(R1) = 10100101
(R2) = 11111111

then

(R1) XOR (R2) = 01011010

With one word set to all 1s, the XOR operation inverts all of the bits in the other word (ones complement).

In addition to bitwise logical operations, most machines provide a variety of shifting and rotating functions. The most basic operations are illustrated in Figure 12.6. With a logical shift, the bits of a word are shifted left or right. On one end, the bit shifted out is lost. On the other end, a 0 is shifted in. Logical shifts are useful primarily for isolating fields within a word. The 0s that are shifted into a word displace unwanted information that is shifted off the other end.

[Figure 12.6 Shift and Rotate Operations: (a) logical right shift, (b) logical left shift, (c) arithmetic right shift, (d) arithmetic left shift, (e) right rotate, (f) left rotate.]

As an example, suppose we wish to transmit characters of data to an I/O device 1 character at a time. If each memory word is 16 bits in length and contains two characters, we must unpack the characters before they can be sent. To send the two characters in a word:
1. Load the word into a register.
2. Shift to the right eight times. This shifts the remaining character to the right half of the register.
3. Perform I/O. The I/O module reads the lower-order 8 bits from the data bus.

The preceding steps result in sending the left-hand character. To send the right-hand character:
1. Load the word again into the register.
2. AND with 0000000011111111. This masks out the character on the left.
3. Perform I/O.

The arithmetic shift operation treats the data as a signed integer and does not shift the sign bit. On a right arithmetic shift, the sign bit is replicated into the bit position to its right. On a left arithmetic shift, a logical left shift is performed on all bits but the sign bit, which is retained. These operations can speed up certain arithmetic operations. With numbers in twos complement notation, a right arithmetic shift corresponds to a division by 2, with truncation for odd numbers. Both an arithmetic left shift and a logical left shift correspond to a multiplication by 2 when there is no overflow. If overflow occurs, arithmetic and logical left shift operations produce different results, but the arithmetic left shift retains the sign of the number. Because of the potential for overflow, many processors do not include this instruction, including PowerPC and Itanium. Others, such as the IBM EAS/390, do offer the instruction. Curiously, the x86 architecture includes an arithmetic left shift but defines it to be identical to a logical left shift.

Rotate, or cyclic shift, operations preserve all of the bits being operated on. One use of a rotate is to bring each bit successively into the leftmost bit, where it can be identified by testing the sign of the data (treated as a number).

As with arithmetic operations, logical operations involve ALU activity and may involve data transfer operations. Table 12.7 gives examples of all of the shift and rotate operations discussed in this subsection.

Table 12.7 Examples of Shift and Rotate Operations

Input Operation Result

10100110   Logical right shift (3 bits)      00010100
10100110   Logical left shift (3 bits)       00110000
10100110   Arithmetic right shift (3 bits)   11110100
10100110   Arithmetic left shift (3 bits)    10110000
10100110   Right rotate (3 bits)             11010100
10100110   Left rotate (3 bits)              00110101
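The operations in Table 12.7 can be reproduced with a few lines of Python on 8-bit values; the helper names below are our own, chosen only to match the table's row labels.

```python
MASK = 0xFF  # operate on 8-bit values, as in Table 12.7

def logical_right(x, n):
    return (x & MASK) >> n

def logical_left(x, n):
    return (x << n) & MASK

def arithmetic_right(x, n):
    sign = x & 0x80               # replicate the sign bit on the left
    for _ in range(n):
        x = sign | (x >> 1)
    return x & MASK

def arithmetic_left(x, n):
    # shift everything except the sign bit, which is retained
    return (x & 0x80) | ((x << n) & 0x7F)

def rotate_right(x, n):
    return ((x >> n) | (x << (8 - n))) & MASK

def rotate_left(x, n):
    return ((x << n) | (x >> (8 - n))) & MASK

x = 0b10100110
for op in (logical_right, logical_left, arithmetic_right,
           arithmetic_left, rotate_right, rotate_left):
    print(f"{op.__name__:17} {op(x, 3):08b}")
```

Running the loop with a shift amount of 3 prints exactly the six result patterns shown in Table 12.7.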

Conversion

Conversion instructions are those that change the format or operate on the format of data. An example is converting from decimal to binary. An example of a more complex editing instruction is the EAS/390 Translate (TR) instruction. This instruction can be used to convert from one 8-bit code to another, and it takes three operands:

TR R1 (L), R2

The operand R2 contains the address of the start of a table of 8-bit codes. The L bytes starting at the address specified in R1 are translated, each byte being replaced by the contents of a table entry indexed by that byte. For example, to translate from EBCDIC to IRA, we first create a 256-byte table in storage locations, say, 1000–10FF hexadecimal. The table contains the characters of the IRA code in the sequence of the binary representation of the EBCDIC code; that is, the IRA code is placed in the table at the relative location equal to the binary value of the EBCDIC code of the same character. Thus, locations 10F0 through 10F9 will contain the values 30 through 39, because F0 is the EBCDIC code for the digit 0, and 30 is the IRA code for the digit 0, and so on through digit 9. Now suppose we have the EBCDIC for the digits 1984 starting at location 2100 and we wish to translate to IRA. Assume the following:
• Locations 2100–2103 contain F1 F9 F8 F4.
• R1 contains 2100.
• R2 contains 1000.

Then, if we execute

TR R1 (4), R2

locations 2100–2103 will contain 31 39 38 34.

Input/Output

Input/output instructions were discussed in some detail in Chapter 7. As we saw, there are a variety of approaches taken, including isolated programmed I/O, memory-mapped programmed I/O, DMA, and the use of an I/O processor. Many implementations provide only a few I/O instructions, with the specific actions specified by parameters, codes, or command words.

System Control

System control instructions are those that can be executed only while the processor is in a certain privileged state or is executing a program in a special privileged area of memory. Typically, these instructions are reserved for the use of the operating system. Some examples of system control operations are as follows. A system control instruction may read or alter a control register; we discuss control registers in Chapter 14. Another example is an instruction to read or modify a storage protection key, such as is used in the EAS/390 memory system. Another example is access to process control blocks in a multiprogramming system.
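Returning to the TR example in the Conversion subsection above: the table-driven translation it performs is essentially a byte-indexed lookup, which Python can express directly with bytes.translate. The tiny EBCDIC-to-IRA table below covers only the digit codes F0–F9 and is zero elsewhere, purely for illustration.

```python
# Build a 256-entry table; EBCDIC digit codes 0xF0-0xF9 map to IRA 0x30-0x39.
table = bytearray(256)
for d in range(10):
    table[0xF0 + d] = 0x30 + d

data = bytes([0xF1, 0xF9, 0xF8, 0xF4])       # EBCDIC for the digits 1984
translated = data.translate(bytes(table))    # each byte indexes the table

print(translated.hex())      # '31393834'
print(translated.decode())   # '1984'
```

The result bytes 31 39 38 34 match the values the text says TR would leave in locations 2100–2103.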

Transfer of Control For all of the operation types discussed so far, the next instruction to be performed is the one that immediately follows, in memory, the current instruction. However, a significant fraction of the instructions in any program have as their function chang- ing the sequence of instruction execution. For these instructions, the operation per- formed by the processor is to update the program counter to contain the address of some instruction in memory. There are a number of reasons why transfer-of-control operations are required. Among the most important are the following: 1. In the practical use of computers, it is essential to be able to execute each instruction more than once and perhaps many thousands of times. It may require thousands or perhaps millions of instructions to implement an applica- tion. This would be unthinkable if each instruction had to be written out sepa- rately. If a table or a list of items is to be processed, a program loop is needed. One sequence of instructions is executed repeatedly to process all the data. 2. Virtually all programs involve some decision making. We would like the com- puter to do one thing if one condition holds, and another thing if another condition holds. For example, a sequence of instructions computes the square root of a num- ber. At the start of the sequence, the sign of the number is tested. If the number is negative, the computation is not performed, but an error condition is reported. 3. To compose correctly a large or even medium-size computer program is an exceedingly difficult task. It helps if there are mechanisms for breaking the task up into smaller pieces that can be worked on one at a time. We now turn to a discussion of the most common transfer-of-control opera- tions found in instruction sets: branch, skip, and procedure call.

BRANCH INSTRUCTIONS A branch instruction, also called a jump instruction, has as one of its operands the address of the next instruction to be executed. Most often, the instruction is a conditional branch instruction. That is, the branch is made (update program counter to equal address specified in operand) only if a certain condition is met. Otherwise, the next instruction in sequence is executed (increment program counter as usual). A branch instruction in which the branch is always taken is an unconditional branch.

There are two common ways of generating the condition to be tested in a conditional branch instruction. First, most machines provide a 1-bit or multiple-bit condition code that is set as the result of some operations. This code can be thought of as a short user-visible register. As an example, an arithmetic operation (ADD, SUBTRACT, and so on) could set a 2-bit condition code with one of the following four values: 0, positive, negative, overflow. On such a machine, there could be four different conditional branch instructions:

BRP X   Branch to location X if result is positive.
BRN X   Branch to location X if result is negative.
BRZ X   Branch to location X if result is zero.
BRO X   Branch to location X if overflow occurs.

Figure 12.7 Branch Instructions (diagram not reproduced; it shows a conditional branch BRZ 211 at location 203 and an unconditional branch BR 202 at location 210 forming a loop, plus a three-operand conditional branch BRE R1, R2, 235 at location 225)

In all of these cases, the result referred to is the result of the most recent operation that set the condition code. Another approach that can be used with a three-address instruction format is to perform a comparison and specify a branch in the same instruction. For example,

BRE R1, R2, X   Branch to X if contents of R1 = contents of R2.

Figure 12.7 shows examples of these operations. Note that a branch can be either forward (an instruction with a higher address) or backward (lower address). The example shows how an unconditional and a conditional branch can be used to create a repeating loop of instructions. The instructions in locations 202 through 210 will be executed repeatedly until the result of subtracting Y from X is 0.

SKIP INSTRUCTIONS Another form of transfer-of-control instruction is the skip instruction. The skip instruction includes an implied address. Typically, the skip implies that one instruction be skipped; thus, the implied address equals the address of the next instruction plus one instruction length. Because the skip instruction does not require a destination address field, it is free to do other things. A typical example is the increment-and-skip-if-zero (ISZ) instruction. Consider the following program fragment:

301
  .
  .
  .
309 ISZ R1
310 BR 301
311

In this fragment, the two transfer-of-control instructions are used to implement an iterative loop. R1 is set with the negative of the number of iterations to be performed. At the end of the loop, R1 is incremented. If it is not 0, the program branches back to the beginning of the loop. Otherwise, the branch is skipped, and the program continues with the next instruction after the end of the loop.
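The logic of this ISZ loop can be sketched in C. The iteration count and the loop body here are placeholders; the do/while plays the role of the BR 301 instruction, and falling out of the loop corresponds to the skip.

#include <stdio.h>

int main(void)
{
    int iterations = 5;          /* hypothetical iteration count          */
    int r1 = -iterations;        /* R1 holds the negative of the count    */

    do {
        /* loop body (locations 302..308 in the fragment) would go here   */
        printf("in loop, r1 = %d\n", r1);

        r1++;                    /* ISZ R1: increment ...                 */
    } while (r1 != 0);           /* ... branch back unless it reached 0   */

    return 0;
}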

PROCEDURE CALL INSTRUCTIONS Perhaps the most important innovation in the development of programming languages is the procedure. A procedure is a self-contained computer program that is incorporated into a larger program. At any point in the program the procedure may be invoked, or called. The processor is instructed to go and execute the entire procedure and then return to the point from which the call took place.

The two principal reasons for the use of procedures are economy and modularity. A procedure allows the same piece of code to be used many times. This is important for economy in programming effort and for making the most efficient use of storage space in the system (the program must be stored). Procedures also allow large programming tasks to be subdivided into smaller units. This use of modularity greatly eases the programming task.

The procedure mechanism involves two basic instructions: a call instruction that branches from the present location to the procedure, and a return instruction that returns from the procedure to the place from which it was called. Both of these are forms of branching instructions.

Figure 12.8a illustrates the use of procedures to construct a program. In this example, there is a main program starting at location 4000. This program includes a call to procedure PROC1, starting at location 4500. When this call instruction is encountered, the processor suspends execution of the main program and begins execution of PROC1 by fetching the next instruction from location 4500. Within PROC1, there are two calls to PROC2 at location 4800. In each case, the execution of PROC1

Figure 12.8 Nested Procedures: (a) Calls and returns; (b) Execution sequence. (Diagram not reproduced; the main program at 4000 calls Proc1 at 4500 from location 4100, Proc1 calls Proc2 at 4800 from locations 4600 and 4650, and each procedure ends with RETURN.)

is suspended and PROC2 is executed. The return causes the processor to go back to the calling program and continue execution at the instruction after the corresponding CALL instruction. This behavior is illustrated in Figure 12.8b.

Three points are worth noting:
1. A procedure can be called from more than one location.
2. A procedure call can appear in a procedure. This allows the nesting of procedures to an arbitrary depth.
3. Each procedure call is matched by a return in the called program.

Because we would like to be able to call a procedure from a variety of points, the processor must somehow save the return address so that the return can take place appropriately. There are three common places for storing the return address:
• Register
• Start of called procedure
• Top of stack

Consider a machine-language instruction CALL X, which stands for call procedure at location X. If the register approach is used, CALL X causes the following actions:

RN ← PC + Δ
PC ← X

where RN is a register that is always used for this purpose, PC is the program counter, and Δ is the instruction length. The called procedure can now save the contents of RN to be used for the later return.

A second possibility is to store the return address at the start of the procedure. In this case, CALL X causes

X ← PC + Δ
PC ← X + 1

This is quite handy. The return address has been stored safely away.

Both of the preceding approaches work and have been used. The only limitation of these approaches is that they complicate the use of reentrant procedures. A reentrant procedure is one in which it is possible to have several calls open to it at the same time. A recursive procedure (one that calls itself) is an example of the use of this feature (see Appendix H). If parameters are passed via registers or memory for a reentrant procedure, some code must be responsible for saving the parameters so that the registers or memory space are available for other procedure calls.

A more general and powerful approach is to use a stack (see Appendix O for a discussion of stacks). When the processor executes a call, it places the return address on the stack. When it executes a return, it uses the address on the stack. Figure 12.9 illustrates the use of the stack.

In addition to providing a return address, it is also often necessary to pass parameters with a procedure call. These can be passed in registers. Another possibility is to store the parameters in memory just after the CALL instruction. In this case, the return must be to the location following the parameters. Again, both of

Figure 12.9 Use of Stack to Implement Nested Subroutines of Figure 12.8: stack contents (a) initially, (b) after CALL Proc1, (c) after the first CALL Proc2, (d) after RETURN, (e) after the second CALL Proc2, (f) after RETURN, (g) after RETURN. (Diagram not reproduced.)

these approaches have drawbacks. If registers are used, the called program and the calling program must be written to assure that the registers are used properly. The storing of parameters in memory makes it difficult to exchange a variable number of parameters. Both approaches prevent the use of reentrant procedures. A more flexible approach to parameter passing is the stack. When the proc- essor executes a call, it not only stacks the return address, it stacks parameters to be passed to the called procedure. The called procedure can access the parameters from the stack. Upon return, return parameters can also be placed on the stack. The entire set of parameters, including return address, that is stored for a procedure invocation is referred to as a stack frame. An example is provided in Figure 12.10. The example refers to procedure P in which the local variables x1 and x2 are declared, and procedure Q, which P can call and in which the local variables y1 and y2 are declared. In this figure, the return

Figure 12.10 Stack Frame Growth Using Sample Procedures P and Q: (a) P is active; (b) P has called Q. (Diagram not reproduced; each frame holds a return point, the old frame pointer, and the local variables, x1 and x2 for P and y1 and y2 for Q, with the stack pointer and frame pointer marking the current frame.)

point for each procedure is the first item stored in the corresponding stack frame. Next is stored a pointer to the beginning of the previous frame. This is needed if the number or length of parameters to be stacked is variable.
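A minimal C sketch of the return-address part of this stack discipline follows, using the hypothetical addresses of Figures 12.8 and 12.9. It is illustrative only; a real processor maintains the stack in hardware or microcode as a side effect of CALL and RETURN, not through user-visible routines.

#include <stdio.h>

/* A tiny model of a return-address stack, in the spirit of Figure 12.9:
 * push an address on CALL, pop it on RETURN.                            */
#define STACK_MAX 16

static unsigned stack[STACK_MAX];
static int sp = 0;                        /* number of entries on stack  */

static void call(unsigned return_addr) { stack[sp++] = return_addr; }
static unsigned ret(void)              { return stack[--sp]; }

int main(void)
{
    call(4101);                        /* main CALLs Proc1; return to 4101 */
    call(4601);                        /* Proc1 CALLs Proc2; return to 4601 */
    printf("return to %u\n", ret());   /* Proc2 RETURNs -> 4601             */
    call(4651);                        /* Proc1 CALLs Proc2; return to 4651 */
    printf("return to %u\n", ret());   /* Proc2 RETURNs -> 4651             */
    printf("return to %u\n", ret());   /* Proc1 RETURNs -> 4101             */
    return 0;
}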

12.5 INTEL x86 AND ARM OPERATION TYPES

x86 Operation Types The x86 provides a complex array of operation types, including a number of special- ized instructions. The intent was to provide tools for the compiler writer to produce optimized machine language translation of high-level language programs. Table 12.8 lists the types and gives examples of each. Most of these are the conventional instructions found in most machine instruction sets, but several types of instructions are tailored to the x86 architecture and are of particular interest. Appendix A of

Table 12.8 x86 Operation Types (with Examples of Typical Operations)

Instruction Description Data Movement MOV Move operand, between registers or between register and memory. PUSH Push operand onto stack. PUSHA Push all registers on stack. MOVSX Move byte, word, dword, sign extended. Moves a byte to a word or a word to a doubleword with twos-complement sign extension. LEA Load effective address. Loads the offset of the source operand, rather than its value to the destination operand. XLAT Table lookup translation. Replaces a byte in AL with a byte from a user-coded transla- tion table. When XLAT is executed, AL should have an unsigned index to the table. XLAT changes the contents of AL from the table index to the table entry. IN, OUT Input, output operand from I/O space. Arithmetic ADD Add operands. SUB Subtract operands. MUL Unsigned integer multiplication, with byte, word, or double word operands, and word, doubleword, or quadword result. IDIV Signed divide. Logical AND AND operands. BTS Bit test and set. Operates on a bit field operand. The instruction copies the current value of a bit to flag CF and sets the original bit to 1. BSF Bit scan forward. Scans a word or doubleword for a 1-bit and stores the number of the first 1-bit into a register. SHL/SHR Shift logical left or right. SAL/SAR Shift arithmetic left or right. (continued) 432 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

Table 12.8 Continued Instruction Description ROL/ROR Rotate left or right. SETcc Sets a byte to zero or one depending on any of the 16 conditions defined by status flags. Control Transfer JMP Unconditional jump. CALL Transfer control to another location. Before transfer, the address of the instruction following the CALL is placed on the stack. JE/JZ Jump if equal/zero. LOOPE/LOOPZ Loops if equal/zero. This is a conditional jump using a value stored in register ECX. The instruction first decrements ECX before testing ECX for the branch condition. INT/INTO Interrupt/Interrupt if overflow. Transfer control to an interrupt service routine. String Operations MOVS Move byte, word, dword string. The instruction operates on one element of a string, indexed by registers ESI and EDI. After each string operation, the registers are auto- matically incremented or decremented to point to the next element of the string. LODS Load byte, word, dword of string. High-Level Language Support

ENTER Creates a stack frame that can be used to implement the rules of a block-structured high-level language.
LEAVE Reverses the action of the previous ENTER.
BOUND Check array bounds. Verifies that the value in operand 1 is within lower and upper limits. The limits are in two adjacent memory locations referenced by operand 2. An interrupt occurs if the value is out of bounds. This instruction is used to check an array index.
Flag Control
STC Set carry flag.
LAHF Load AH register from flags. Copies SF, ZF, AF, PF, and CF bits into AH register.
Segment Register
LDS Load pointer into DS and another register.
System Control
HLT Halt.
LOCK Asserts a hold on shared memory so that the Pentium has exclusive use of it during the instruction that immediately follows the LOCK.
ESC Processor extension escape. An escape code that indicates the succeeding instructions are to be executed by a numeric coprocessor that supports high-precision integer and floating-point calculations.
WAIT Wait until BUSY# negated. Suspends Pentium program execution until the processor detects that the BUSY pin is inactive, indicating that the numeric coprocessor has finished execution.
Protection
SGDT Store global descriptor table.
LSL Load segment limit. Loads a user-specified register with a segment limit.
VERR/VERW Verify segment for reading/writing.

Instruction Description Cache Management INVD Flushes the internal cache memory. WBINVD Flushes the internal cache memory after writing dirty lines to memory. INVLPG Invalidates a translation lookaside buffer (TLB) entry.

[CART06] lists the x86 instructions, together with the operands for each and the effect of the instruction on the condition codes. Appendix B of the NASM assembly language manual provides a more detailed description of each x86 instruction. Both documents are available at this book’s Web site.

CALL/RETURN INSTRUCTIONS The x86 provides four instructions to support procedure call/return: CALL, ENTER, LEAVE, RETURN. It will be instructive to look at the support provided by these instructions. Recall from Figure 12.10 that a common means of implementing the procedure call/return mechanism is via the use of stack frames. When a new procedure is called, the following must be performed upon entry to the new procedure: • Push the return point on the stack. • Push the current frame pointer on the stack. • Copy the stack pointer as the new value of the frame pointer. • Adjust the stack pointer to allocate a frame. The CALL instruction pushes the current instruction pointer value onto the stack and causes a jump to the entry point of the procedure by placing the address of the entry point in the instruction pointer. In the 8088 and 8086 machines, the typical procedure began with the sequence

PUSH EBP
MOV  EBP, ESP
SUB  ESP, space_for_locals

where EBP is the frame pointer and ESP is the stack pointer. In the 80286 and later machines, the ENTER instruction performs all the aforementioned operations in a single instruction. The ENTER instruction was added to the instruction set to provide direct sup- port for the compiler. The instruction also includes a feature for support of what are called nested procedures in languages such as Pascal, COBOL, and Ada (not found in C or FORTRAN). It turns out that there are better ways of handling nested procedure calls for these languages. Furthermore, although the ENTER instruc- tion saves a few bytes of memory compared with the PUSH, MOV, SUB sequence (4 bytes versus 6 bytes), it actually takes longer to execute (10 clock cycles versus 6 clock cycles). Thus, although it may have seemed a good idea to the instruction set designers to add this feature, it complicates the implementation of the processor while providing little or no benefit. We will see that, in contrast, a RISC approach 434 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

to processor design would avoid complex instructions such as ENTER and might produce a more efficient implementation with a sequence of simpler instructions.

MEMORY MANAGEMENT Another set of specialized instructions deals with memory segmentation. These are privileged instructions that can only be executed from the operating system. They allow local and global segment tables (called descriptor tables) to be loaded and read, and for the privilege level of a segment to be checked and altered. The special instructions for dealing with the on-chip cache were discussed in Chapter 4.

STATUS FLAGS AND CONDITION CODES Status flags are bits in special registers that may be set by certain operations and used in conditional branch instructions. The term condition code refers to the settings of one or more status flags. In the x86 and many other architectures, status flags are set by arithmetic and compare operations. The compare operation in most languages subtracts two operands, as does a subtract operation. The difference is that a compare operation only sets status flags, whereas a subtract operation also stores the result of the subtraction in the destination operand. Some architectures also set status flags for data transfer instructions.

Table 12.9 lists the status flags used on the x86. Each flag, or combinations of these flags, can be tested for a conditional jump. Table 12.10 shows the condition codes (combinations of status flag values) for which conditional jump opcodes have been defined.

Several interesting observations can be made about this list. First, we may wish to test two operands to determine if one number is bigger than another. But this will depend on whether the numbers are signed or unsigned. For example, the 8-bit number 11111111 is bigger than 00000000 if the two numbers are interpreted as unsigned integers (255 > 0) but is less if they are considered as 8-bit twos complement numbers (-1 < 0). Many assembly languages therefore introduce two sets of terms to distinguish the two cases: If we are comparing two numbers as signed integers, we use the terms less than and greater than; if we are comparing them as unsigned integers, we use the terms below and above.

A second observation concerns the complexity of comparing signed integers. A signed result is greater than or equal to zero if (1) the sign bit is zero and there is no overflow (S = 0 AND O = 0), or (2) the sign bit is one and there is an overflow.
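The distinction can be made concrete with a small sketch that models the four relevant flags for an 8-bit compare (a - b) and then evaluates the unsigned "above" and signed "greater" conditions of Table 12.10. The flag formulas follow the usual textbook definitions and are illustrative; they are not a description of actual x86 circuitry.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

int main(void)
{
    uint8_t a = 0xFF, b = 0x00;            /* 255 unsigned, -1 signed     */

    uint8_t diff = (uint8_t)(a - b);
    bool C = a < b;                        /* borrow out of the top bit   */
    bool Z = diff == 0;
    bool S = (diff & 0x80) != 0;           /* sign bit of the result      */
    /* Overflow on subtraction: operands have different signs and the
       result's sign differs from the sign of the first operand.          */
    bool O = ((a ^ b) & (a ^ diff) & 0x80) != 0;

    bool above   = !C && !Z;               /* unsigned: a > b (A, NBE)    */
    bool greater = !Z && (S == O);         /* signed:   a > b (G, NLE)    */

    printf("above=%d greater=%d\n", above, greater);  /* above=1 greater=0 */
    return 0;
}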

Table 12.9 x86 Status Flags

Status Bit Name Description C Carry Indicates carrying or borrowing out of the left-most bit position following an arithmetic operation. Also modified by some of the shift and rotate operations. P Parity Parity of the least-significant byte of the result of an arithmetic or logic operation. 1 indicates even parity; 0 indicates odd parity. A Auxiliary Carry Represents carrying or borrowing between half-bytes of an 8-bit arithmetic or logic operation. Used in binary-coded decimal arithmetic. Z Zero Indicates that the result of an arithmetic or logic operation is 0. S Sign Indicates the sign of the result of an arithmetic or logic operation. O Overflow Indicates an arithmetic overflow after an addition or subtraction for twos complement arithmetic. 12.5 / INTEL x86 AND ARM OPERATION TYPES 435

Table 12.10 x86 Condition Codes for Conditional Jump and SETcc Instructions

Symbol | Condition Tested | Comment
A, NBE | C = 0 AND Z = 0 | Above; Not below or equal (greater than, unsigned)
AE, NB, NC | C = 0 | Above or equal; Not below (greater than or equal, unsigned); Not carry
B, NAE, C | C = 1 | Below; Not above or equal (less than, unsigned); Carry set
BE, NA | C = 1 OR Z = 1 | Below or equal; Not above (less than or equal, unsigned)
E, Z | Z = 1 | Equal; Zero (signed or unsigned)
G, NLE | [(S = 1 AND O = 1) OR (S = 0 AND O = 0)] AND [Z = 0] | Greater than; Not less than or equal (signed)
GE, NL | (S = 1 AND O = 1) OR (S = 0 AND O = 0) | Greater than or equal; Not less than (signed)
L, NGE | (S = 1 AND O = 0) OR (S = 0 AND O = 1) | Less than; Not greater than or equal (signed)
LE, NG | (S = 1 AND O = 0) OR (S = 0 AND O = 1) OR (Z = 1) | Less than or equal; Not greater than (signed)
NE, NZ | Z = 0 | Not equal; Not zero (signed or unsigned)
NO | O = 0 | No overflow
NS | S = 0 | Not sign (not negative)
NP, PO | P = 0 | Not parity; Parity odd
O | O = 1 | Overflow
P | P = 1 | Parity; Parity even
S | S = 1 | Sign (negative)

A study of Figure 10.4 should convince you that the conditions tested for the various signed operations are appropriate.

X86 SIMD INSTRUCTIONS In 1996, Intel introduced MMX technology into its Pentium product line. MMX is a set of highly optimized instructions for multimedia tasks. There are 57 new instructions that treat data in a SIMD (single-instruction, multiple-data) fashion, which makes it possible to perform the same operation, such as addition or multiplication, on multiple data elements at once. Each instruction typically takes a single clock cycle to execute. For the proper application, these fast parallel operations can yield a speedup of two to eight times over comparable algorithms that do not use the MMX instructions [ATKI96]. With the introduction of 64-bit x86 architecture, Intel has expanded this extension to include double quadword (128 bits) operands and floating-point operations. In this subsection, we describe the MMX features.

The focus of MMX is multimedia programming. Video and audio data are typically composed of large arrays of small data types, such as 8 or 16 bits, whereas conventional instructions are tailored to operate on 32- or 64-bit data. Here are some examples: In graphics and video, a single scene consists of an array of pixels,2 and

2A pixel, or picture element, is the smallest element of a digital image that can be assigned a gray level. Equivalently, a pixel is an individual dot in a dot-matrix representation of a picture.

there are 8 bits for each pixel or 8 bits for each pixel color component (red, green, blue). Typical audio samples are quantized using 16 bits. For some 3D graphics algorithms, 32 bits are common for basic data types. To provide for parallel opera- tion on these data lengths, three new data types are defined in MMX. Each data type is 64 bits in length and consists of multiple smaller data fields, each of which holds a fixed-point integer. The types are as follows:

• Packed byte: Eight bytes packed into one 64-bit quantity
• Packed word: Four 16-bit words packed into 64 bits
• Packed doubleword: Two 32-bit doublewords packed into 64 bits

Table 12.11 lists the MMX instruction set. Most of the instructions involve parallel operation on bytes, words, or doublewords. For example, the PSLLW instruction performs a left logical shift separately on each of the four words in the packed word operand; the PADDB instruction takes packed byte operands as input and performs parallel additions on each byte position independently to produce a packed byte output. One unusual feature of the new instruction set is the introduction of satura- tion arithmetic for byte and 16-bit word operands. With ordinary unsigned arith- metic, when an operation overflows (i.e., a carry out of the most significant bit), the extra bit is truncated. This is referred to as wraparound, because the effect of the truncation can be, for example, to produce an addition result that is smaller than the two input operands. Consider the addition of the two words, in hexadecimal, F000h and 3000h. The sum would be expressed as

  F000h = 1111 0000 0000 0000
+ 3000h = 0011 0000 0000 0000
         10010 0000 0000 0000 = 2000h

If the two numbers represented image intensity, then the result of the addition is to make the combination of two dark shades turn out to be lighter. This is typi- cally not what is intended. With saturation arithmetic, if addition results in over- flow or subtraction results in underflow, the result is set to the largest or smallest value representable. For the preceding example, with saturation arithmetic, we have

  F000h = 1111 0000 0000 0000
+ 3000h = 0011 0000 0000 0000
         10010 0000 0000 0000 → 1111 1111 1111 1111 = FFFFh
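The same contrast can be shown in scalar form. The following sketch is illustrative only (MMX applies the operation to several packed lanes at once, as in PADDW versus PADDUSW); it computes the wraparound and saturating sums of the two 16-bit values used above.

#include <stdio.h>
#include <stdint.h>

/* Unsigned 16-bit addition with wraparound (ordinary modular arithmetic)
 * and with saturation (clamp to FFFFh on overflow).                      */
static uint16_t add_wrap(uint16_t x, uint16_t y)
{
    return (uint16_t)(x + y);                  /* carry out is discarded  */
}

static uint16_t add_saturate(uint16_t x, uint16_t y)
{
    uint32_t sum = (uint32_t)x + y;
    return sum > 0xFFFF ? 0xFFFF : (uint16_t)sum;
}

int main(void)
{
    printf("wraparound: %04X\n", (unsigned)add_wrap(0xF000, 0x3000));     /* 2000 */
    printf("saturating: %04X\n", (unsigned)add_saturate(0xF000, 0x3000)); /* FFFF */
    return 0;
}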

To provide a feel for the use of MMX instructions, we look at an example, taken from [PELE97]. A common video application is the fade-out, fade-in effect, in which one scene gradually dissolves into another. Two images are combined with a weighted average:

Result_pixel = A_pixel * fade + B_pixel * (1 - fade)
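Before looking at the MMX version, it may help to see the per-pixel calculation written out as ordinary scalar code. The sketch below is illustrative; the choice of an 8-bit fade value scaled to 0-255 and the buffer names are assumptions, not part of the original example, and the rearrangement B + (A - B)*fade mirrors the subtract/multiply/add steps of Figure 12.11.

#include <stdint.h>
#include <stddef.h>

/* Scalar per-pixel fade: result = A*fade + B*(1 - fade), with fade scaled
 * so that 255 means "all image A" and 0 means "all image B".  The MMX
 * sequence performs the equivalent on several components in parallel.    */
void fade_blend(const uint8_t *a, const uint8_t *b, uint8_t *result,
                size_t n, uint8_t fade)
{
    for (size_t i = 0; i < n; i++) {
        int diff = (int)a[i] - (int)b[i];             /* A - B             */
        result[i] = (uint8_t)(b[i] + (diff * fade) / 255);  /* B + diff*fade */
    }
}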

Table 12.11 MMX Instruction Set

Category Instruction Description PADD [B, W, D] Parallel add of packed eight bytes, four 16-bit words, or two 32-bit doublewords, with wraparound. PADDS [B, W] Add with saturation. PADDUS [B, W] Add unsigned with saturation. PSUB [B, W, D] Subtract with wraparound. PSUBS [B, W] Subtract with saturation. Arithmetic PSUBUS [B, W] Subtract unsigned with saturation. PMULHW Parallel multiply of four signed 16-bit words, with high- order 16 bits of 32-bit result chosen. PMULLW Parallel multiply of four signed 16-bit words, with low- order 16 bits of 32-bit result chosen. PMADDWD Parallel multiply of four signed 16-bit words; add together adjacent pairs of 32-bit results. PCMPEQ [B, W, D] Parallel compare for equality; result is mask of 1s if true Comparison or 0s if false. PCMPGT [B, W, D] Parallel compare for greater than; result is mask of 1s if true or 0s if false. PACKUSWB Pack words into bytes with unsigned saturation. PACKSS [WB, DW] Pack words into bytes, or doublewords into words, with signed saturation. Conversion PUNPCKH [BW, WD, DQ] Parallel unpack (interleaved merge) high-order bytes, words, or doublewords from MMX register. PUNPCKL [BW, WD, DQ] Parallel unpack (interleaved merge) low-order bytes, words, or doublewords from MMX register. PAND 64-bit bitwise logical AND

Logical PNDN 64-bit bitwise logical AND NOT POR 64-bit bitwise logical OR PXOR 64-bit bitwise logical XOR PSLL [W, D, Q] Parallel logical left shift of packed words, doublewords, or quadword by amount specified in MMX register or immediate value. Shift PSRL [W, D, Q] Parallel logical right shift of packed words, doublewords, or quadword. PSRA [W, D] Parallel arithmetic right shift of packed words, double- words, or quadword. Data transfer MOV [D, Q] Move doubleword or quadword to/from MMX register. State mgt EMMS Empty MMX state (empty FP registers tag bits).

Note: If an instruction supports multiple data types [byte (B), word (W), doubleword (D), quadword (Q)], the data types are indicated in brackets.

This calculation is performed on each pixel position in A and B. If a series of video frames is produced while gradually changing the fade value from 1 to 0 (scaled appropriately for an 8-bit integer), the result is to fade from image A to image B. 438 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

Figure 12.11 shows the sequence of steps required for one set of pixels. The 8-bit pixel components are converted to 16-bit elements to accommodate the MMX 16-bit multiply capability. If these images use 640 * 480 resolution, and the dissolve technique uses all 255 possible values of the fade value, then the total number of instructions executed using MMX is 535 million. The same calculation, performed without the MMX instructions, requires 1.4 billion instruction execu- tions [INTE98].

Figure 12.11 (diagram portion, not reproduced): 1. Unpack byte R pixel components from images A and B; 2. Subtract image B from image A; 3. Multiply result by fade value; 4. Add image B pixels; 5. Pack new composite pixels back to bytes.

MMX code sequence performing this operation:
pxor      mm7, mm7       ;zero out mm7
movq      mm3, fad_val   ;load fade value replicated 4 times
movd      mm0, imageA    ;load 4 red pixel components from image A
movd      mm1, imageB    ;load 4 red pixel components from image B
punpcklbw mm0, mm7       ;unpack 4 pixels to 16 bits
punpcklbw mm1, mm7       ;unpack 4 pixels to 16 bits
psubw     mm0, mm1       ;subtract image B from image A
pmulhw    mm0, mm3       ;multiply the subtract result by fade values
paddw     mm0, mm1       ;add result to image B
packuswb  mm0, mm7       ;pack 16-bit results back to bytes
Figure 12.11 Image Compositing on Color Plane Representation

ARM Operation Types The ARM architecture provides a large collection of operation types. The following are the principal categories: • Load and store instructions: In the ARM architecture, only load and store instructions access memory locations; arithmetic and logical instructions are performed only on registers and immediate values encoded in the instruction. This limitation is characteristic of RISC design and it is explored further in Chapter 15. The ARM architecture supports two broad types of instruction that load or store the value of a single register, or a pair of registers, from or to memory: (1) load or store a 32-bit word or an 8-bit unsigned byte, and (2) load or store a 16-bit unsigned halfword, and load and sign extend a 16-bit halfword or an 8-bit byte. • Branch instructions: ARM supports a branch instruction that allows a condi- tional branch forwards or backwards up to 32 MB. As the program counter is one of the general-purpose registers (R15), a branch or jump can also be generated by writing a value to R15. A subroutine call can be performed by a variant of the standard branch instruction. As well as allowing a branch forward or backward up to 32 MB, the Branch with Link (BL) instruction preserves the address of the instruction after the branch (the return address) in the LR (R14). Branches are determined by a 4-bit condition field in the instruction. • Data-processing instructions: This category includes logical instructions (AND, OR, XOR), add and subtract instructions, and test and compare instructions. • Multiply instructions: The integer multiply instructions operate on word or halfword operands and can produce normal or long results. For example, there is a multiply instruction that takes two 32-bit operands and produces a 64-bit result. • Parallel addition and subtraction instructions: In addition to the normal data processing and multiply instructions, there are a set of parallel addition and subtraction instructions, in which portions of two operands are operated on in parallel. For example, ADD16 adds the top halfwords of two registers to form the top halfword of the result and adds the bottom halfwords of the same two registers to form the bottom halfword of the result. These instruc- tions are useful in image processing applications, similar to the x86 MMX instructions. • Extend instructions: There are several instructions for unpacking data by sign or zero extending bytes to halfwords or words, and halfwords to words. • Status register access instructions: ARM provides the ability to read and also to write portions of the status register.

CONDITION CODES The ARM architecture defines four condition flags that are stored in the program status register: N, Z, C, and V (Negative, Zero, Carry and oVerflow), with meanings essentially the same as the S, Z, C, and V flags 440 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

Table 12.12 ARM Conditions for Conditional Instruction Execution

Code | Symbol | Condition Tested | Comment
0000 | EQ | Z = 1 | Equal
0001 | NE | Z = 0 | Not equal
0010 | CS/HS | C = 1 | Carry set/unsigned higher or same
0011 | CC/LO | C = 0 | Carry clear/unsigned lower
0100 | MI | N = 1 | Minus/negative
0101 | PL | N = 0 | Plus/positive or zero
0110 | VS | V = 1 | Overflow
0111 | VC | V = 0 | No overflow
1000 | HI | C = 1 AND Z = 0 | Unsigned higher
1001 | LS | C = 0 OR Z = 1 | Unsigned lower or same
1010 | GE | N = V [(N = 1 AND V = 1) OR (N = 0 AND V = 0)] | Signed greater than or equal
1011 | LT | N ≠ V [(N = 1 AND V = 0) OR (N = 0 AND V = 1)] | Signed less than
1100 | GT | (Z = 0) AND (N = V) | Signed greater than
1101 | LE | (Z = 1) OR (N ≠ V) | Signed less than or equal
1110 | AL | — | Always (unconditional)
1111 | — | — | This instruction can only be executed unconditionally

in the x86 architecture. These four flags constitute a condition code in ARM. Table 12.12 shows the combination of conditions for which conditional execution is defined. There are two unusual aspects to the use of condition codes in ARM: 1. All instructions, not just branch instructions, include a condition code field, which means that virtually all instructions may be conditionally executed. Any combination of flag settings except 1110 or 1111 in an instruction’s condition code field signifies that the instruction will be executed only if the condition is met. 2. All data processing instructions (arithmetic, logical) include an S bit that signi- fies whether the instruction updates the condition flags. The use of conditional execution and conditional setting of the condition flags helps in the design of shorter programs that use less memory. On the other hand, all instructions include 4 bits for the condition code, so there is a trade-off in that fewer bits in the 32-bit instruction are available for opcode and operands. Because the ARM is a RISC design that relies heavily on register addressing, this seems to be a reasonable trade-off. 12.7 / KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS 441

12.6 RECOMMENDED READING

The x86 instruction set is well covered by [BREY09]. The ARM instruction set is covered in [SLOS04] and [KNAG04]. [INTE04c] describes software considerations related to micro- processor Endian architecture and discusses guidelines for developing Endian-neutral code (paper available in the premium content section for this book).

BREY09 Brey, B. The Intel Microprocessors: 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro Processor, Pentium II, Pentium III, Pentium 4 and Core2 with 64-bit Extensions. Upper Saddle River, NJ: Prentice Hall, 2009.
INTE04c Intel Corp. Endianness White Paper. November 15, 2004.
KNAG04 Knaggs, P., and Welsh, S. ARM: Assembly Language Programming. Bournemouth University, School of Design, Engineering, and Computing, August 31, 2004. www.freetechbooks.com/arm-assembly-language-programming-t729.html
SLOS04 Sloss, A.; Symes, D.; and Wright, C. ARM System Developer's Guide. San Francisco: Morgan Kaufmann, 2004.

12.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

accumulator, address, arithmetic shift, bi-endian, big endian, branch, conditional branch, instruction set, jump, little endian, logical shift, machine instruction, operand, operation, packed decimal, pop, procedure call, procedure return, push, reentrant procedure, reverse Polish notation, rotate, skip, stack

Review Questions 12.1 What are the typical elements of a machine instruction? 12.2 What types of locations can hold source and destination operands? 12.3 If an instruction contains four addresses, what might be the purpose of each address? 12.4 List and briefly explain five important instruction set design issues. 12.5 What types of operands are typical in machine instruction sets? 12.6 What is the relationship between the IRA character code and the packed decimal representation? 12.7 What is the difference between an arithmetic shift and a logical shift? 12.8 Why are transfer of control instructions needed? 442 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

12.9 List and briefly explain two common ways of generating the condition to be tested in a conditional branch instruction. 12.10 What is meant by the term nesting of procedures? 12.11 List three possible places for storing the return address for a procedure return. 12.12 What is a reentrant procedure? 12.13 What is reverse Polish notation? 12.14 What is the difference between big endian and little endian?

Problems
12.1 Show in hex notation: a. The packed decimal format for 23 b. The ASCII characters 23
12.2 For each of the following packed decimal numbers, show the decimal value: a. 0111 0011 0000 1001 b. 0101 1000 0010 c. 0100 1010 0110
12.3 A given microprocessor has words of 1 byte. What is the smallest and largest integer that can be represented in the following representations: a. Unsigned b. Sign-magnitude c. Ones complement d. Twos complement e. Unsigned packed decimal f. Signed packed decimal
12.4 Many processors provide logic for performing arithmetic on packed decimal numbers. Although the rules for decimal arithmetic are similar to those for binary operations, the decimal results may require some corrections to the individual digits if binary logic is used. Consider the decimal addition of two unsigned numbers. If each number consists of N digits, then there are 4N bits in each number. The two numbers are to be added using a binary adder. Suggest a simple rule for correcting the result. Perform addition in this fashion on the numbers 1698 and 1786.
12.5 The tens complement of the decimal number X is defined to be 10^N - X, where N is the number of decimal digits in the number. Describe the use of tens complement representation to perform decimal subtraction. Illustrate the procedure by subtracting (0326)10 from (0736)10.
12.6 Compare zero-, one-, two-, and three-address machines by writing programs to compute X = (A + B * C)/(D - E * F) for each of the four machines. The instructions available for use are as follows:

0 Address | 1 Address | 2 Address        | 3 Address
PUSH M    | LOAD M    | MOVE (X ← Y)     | MOVE (X ← Y)
POP M     | STORE M   | ADD (X ← X + Y)  | ADD (X ← Y + Z)
ADD       | ADD M     | SUB (X ← X - Y)  | SUB (X ← Y - Z)
SUB       | SUB M     | MUL (X ← X * Y)  | MUL (X ← Y * Z)
MUL       | MUL M     | DIV (X ← X/Y)    | DIV (X ← Y/Z)
DIV       | DIV M     |                  |

12.7 Consider a hypothetical computer with an instruction set of only two n-bit instruc- tions. The first bit specifies the opcode, and the remaining bits specify one of the 2n-1 n-bit words of main memory. The two instructions are as follows: SUBS X Subtract the contents of location X from the accumulator, and store the result in location X and the accumulator. JUMP X Place address X in the program counter. A word in main memory may contain either an instruction or a binary number in twos complement notation. Demonstrate that this instruction repertoire is reasonably complete by specifying how the following operations can be programmed: a. Data transfer: Location X to accumulator, accumulator to location X b. Addition: Add contents of location X to accumulator c. Conditional branch d. Logical OR e. I/O Operations 12.8 Many instruction sets contain the instruction NOOP, meaning no operation, which has no effect on the processor state other than incrementing the program counter. Suggest some uses of this instruction. 12.9 In Section 12.4, it was stated that both an arithmetic left shift and a logical left shift correspond to a multiplication by 2 when there is no overflow, and if overflow occurs, arithmetic and logical left shift operations produce different results, but the arithmetic left shift retains the sign of the number. Demonstrate that these statements are true for 5-bit twos complement integers. 12.10 In what way are numbers rounded using arithmetic right shift (e.g., round toward + , round toward -, toward zero, away from 0)? 12.11 Suppose a stack is to be used by the processor to manage procedure calls and returns. Can the program counter be eliminated by using the top of the stack as a program counter? 12.12 The x86 architecture includes an instruction called Decimal Adjust after Addition (DAA). DAA performs the following sequence of instructions:

if ((AL AND 0FH) > 9) OR (AF = 1) then
    AL ← AL + 6;
    AF ← 1;
else
    AF ← 0;
endif;
if (AL > 9FH) OR (CF = 1) then
    AL ← AL + 60H;
    CF ← 1;
else
    CF ← 0;
endif.

“H” indicates hexadecimal. AL is an 8-bit register that holds the result of addition of two unsigned 8-bit integers. AF is a flag set if there is a carry from bit 3 to bit 4 in the result of an addition. CF is a flag set if there is a carry from bit 7 to bit 8. Explain the function performed by the DAA instruction. 12.13 The x86 Compare instruction (CMP) subtracts the source operand from the destina- tion operand; it updates the status flags (C, P, A, Z, S, O) but does not alter either of the operands. The CMP instruction can be used to determine if the destination oper- and is greater than, equal to, or less than the source operand. 444 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

a. Suppose the two operands are treated as unsigned integers. Show which status flags are relevant to determine the relative size of the two integer and what values of the flags correspond to greater than, equal to, or less than. b. Suppose the two operands are treated as twos complement signed integers. Show which status flags are relevant to determine the relative size of the two integer and what values of the flags correspond to greater than, equal to, or less than. c. The CMP instruction may be followed by a conditional Jump (Jcc) or Set Con- dition (SETcc) instruction, where cc refers to one of the 16 conditions listed in Table 12.12. Demonstrate that the conditions tested for a signed number comparison are correct. 12.14 Suppose we wished to apply the x86 CMP instruction to 32-bit operands that con- tained numbers in a floating-point format. For correct results, what requirements have to be met in the following areas? a. The relative position of the significand, sign, and exponent fields. b. The representation of the value zero. c. The representation of the exponent. d. Does the IEEE format meet these requirements? Explain. 12.15 Many microprocessor instruction sets include an instruction that tests a condition and sets a destination operand if the condition is true. Examples include the SETcc on the x86, the Scc on the Motorola MC68000, and the Scond on the National NS32000. a. There are a few differences among these instructions: • SETcc and Scc operate only on a byte, whereas Scond operates on byte, word, and doubleword operands. • SETcc and Scond set the operand to integer one if true and to zero if false. Scc sets the byte to all binary ones if true and all zeros if false. What are the relative advantages and disadvantages of these differences? b. None of these instructions set any of the condition code flags, and thus an explicit test of the result of the instruction is required to determine its value. Discuss whether condition codes should be set as a result of this instruction. c. A simple IF statement such as IF a 7 b THEN can be implemented using a numerical representation method, that is, making the Boolean value manifest, as opposed to a flow of control method, which represents the value of a Boolean expression by a point reached in the program. A compiler might implement IF a 7 ssb THEN with the following x86 code: SUB CX, CX ;set register CX to 0 MOV AX, B ;move contents of location B to register AX CMP AX, A ;compare contents of register AX and location A JLE TEST ;jump if A … B INC CX ;add 1 to contents of register CX TEST JCXZ OUT ;jump if contents of CX equal 0 THEN OUT

The result of (A 7 B) is a Boolean value held in a register and available later on, outside the context of the flow of code just shown. It is convenient to use register CX for this, because many of the branch and loop opcodes have a built-in test for CX. Show an alternative implementation using the SETcc instruction that saves memory and execution time. (Hint: No additional new x86 instructions are needed, other than the SETcc.) d. Now consider the high-level language statement: A: = (B 7 C) OR (D = F) 12.7 / KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS 445

A compiler might generate the following code: MOV EAX, B ;move contents of location B to register EAX CMP EAX, C ;compare contents of register EAX and location C MOV BL, 0 ;0 represents false JLE N1 ;jump if (B … C) MOV BL, 1 ;1 represents false N1 MOV EAX, D CMP EAX, F MOV BH, 0 JNE N2 MOV BH, 1 N2 OR BL, BH Show an alternative implementation using the SETcc instruction that saves memory and execution time. 12.16 Suppose that two registers contain the following hexadecimal values: AB0890C2, 4598EE50. What is the result of adding them using MMX instructions: a. for packed byte b. for packed word Assume saturation arithmetic is not used. 12.17 Appendix O points out that there are no stack-oriented instructions in an instruction set if the stack is to be used only by the processor for such purposes as procedure handling. How can the processor use a stack for any purpose without stack-oriented instructions? 12.18 Convert the following formulas from reverse Polish to infix: a. AB + C + D * b. AB/CD/ + c. ABCDE +**/ d. ABCDE + F/ + G - H/ *+ 12.19 Convert the following formulas from infix to reverse Polish: a. A + B + C + D + E b. (A + B) * (C + D) + E c. (A * B) + (C * D) + E d. (A - B) * (((C - D * E)/F)/G) * H 12.20 Convert the expression A + B - C to postfix notation using Dijkstra’s algorithm. Show the steps involved. Is the result equivalent to (A + B) - C or A + (B - C)? Does it matter? 12.21 Using the algorithm for converting infix to postfix defined in Appendix O, show the steps involved in converting the expression of Figure O.3 into postfix. Use a presenta- tion similar to Figure O.5. 12.22 Show the calculation of the expression in Figure O.5, using a presentation similar to Figure O.4. 12.23 Redraw the little-endian layout in Figure 12.13 so that the bytes appear as numbered in the big-endian layout. That is, show memory in 64-bit rows, with the bytes listed left to right, top to bottom. 12.24 For the following data structures, draw the big-endian and little-endian layouts, using the format of Figure 12.13, and comment on the results. a. struct { double i; //0x1112131415161718 } s1; b. struct { int i; //0x11121314 int j; //0x15161718 } s2; 446 CHAPTER 12 / INSTRUCTION SETS: CHARACTERISTICS AND FUNCTIONS

c. struct { short i; //0x1112 short j; //0x1314 short k; //0x1516 short l; //0x1718 } s3; 12.25 The IBM Power architecture specification does not dictate how a processor should im- plement little-endian mode. It specifies only the view of memory a processor must have when operating in little-endian mode. When converting a data structure from big endian to little endian, processors are free to implement a true byte-swapping mechanism or to use some sort of an address modification mechanism. Current Power processors are all default big-endian machines and use address modification to treat data as little-endian. Consider the structure s defined in Figure 12.13. The layout in the lower-right portion of the figure shows the structure s as seen by the processor. In fact, if structures is compiled in little-endian mode, its layout in memory is shown in Figure 12.12. Explain the mapping that is involved, describe an easy way to implement the mapping, and dis- cuss the effectiveness of this approach. 12.26 Write a small program to determine the endianness of machine and report the results. Run the program on a computer available to you and turn in the output. 12.27 The MIPS processor can be set to operate in either big-endian or little-endian mode. Consider the Load Byte Unsigned (LBU) instruction, which loads a byte from memory into the low-order 8 bits of a register and fills the high-order 24 bits of the register with zeros. The description of LBU is given in the MIPS reference manual using a register-transfer language as mem d LoadMemory(…) d byte VirtualAddress1..0 if CONDITION then d 24 GPR[rt] 0 }mem31 – 8 * byte .. 24 – 8 * byte else d 24 GPR[rt] 0 }mem7 + 8 * byte .. 8 * byte endif where byte refers to the two low-order bits of the effective address and mem refers to the value loaded from memory. In the manual, instead of the word CONDITION, one of the following two words is used: BigEndian, LittleEndian. Which word is used? 12.28 Most, but not all, processors use big- or little-endian bit ordering within a byte that is consistent with big- or little-endian ordering of bytes within a multibyte scalar. Let us

Figure 12.12 Power Architecture Little-Endian Structure s in Memory (diagram not reproduced)

consider the Motorola 68030, which uses big-endian byte ordering. The documentation of the 68030 concerning formats is confusing. The user’s manual explains that the bit order- ing of bit fields is the opposite of bit ordering of integers. Most bit field operations operate with one endian ordering, but a few bit field operations require the opposite ordering. The following description from the user’s manual describes most of the bit field operations: A bit operand is specified by a base address that selects one byte in memory (the base byte), and a bit number that selects the one bit in this byte. The most significant bit is bit seven. A bit field operand is specified by: (1) a base address that selects one byte in memory; (2) a bit field offset that indicates the leftmost (base) bit of the bit field in relation to the most significant bit of the base byte; and (3) a bit field width that determines how many bits to the right of the base byte are in the bit field. The most significant bit of the base byte is bit field offset 0, the least significant bit of the base byte is bit field offset 7. Do these instructions use big-endian or little-endian bit ordering?

APPENDIX 12A LITTLE-, BIG-, AND BI-ENDIAN

An annoying and curious phenomenon relates to how the bytes within a word and the bits within a byte are both referenced and represented. We look first at the prob- lem of byte ordering and then consider that of bits. Byte Ordering The concept of endianness was first discussed in the literature by Cohen [COHE81]. With respect to bytes, endianness has to do with the byte ordering of multibyte sca- lar values. The issue is best introduced with an example. Suppose we have the 32-bit hexadecimal value 12345678 and that it is stored in a 32-bit word in byte-addressable memory at byte location 184. The value consists of 4 bytes, with the least significant byte containing the value 78 and the most significant byte containing the value 12. There are two obvious ways to store this value:

Address Value Address Value 184 12 184 78 185 34 185 56 186 56 186 34 187 78 187 12

The mapping on the left stores the most significant byte in the lowest numerical byte address; this is known as big endian and is equivalent to the left-to-right order of writing in Western culture languages. The mapping on the right stores the least significant byte in the lowest numerical byte address; this is known as little endian and is reminiscent of the right-to-left order of arithmetic operations in arithmetic units.3 For a given multibyte scalar value, big endian and little endian are byte-reversed mappings of each other.

3The terms big endian and little endian come from Part I, Chapter 4 of Jonathan Swift's Gulliver's Travels. They refer to a religious war between two groups, one that breaks eggs at the big end and the other that breaks eggs at the little end.
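One way to see which mapping a particular machine uses is to store a multibyte scalar and then examine memory a byte at a time, as in the following sketch (illustrative only; any 32-bit value with distinct bytes would serve as well as the 12345678 used above).

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t value = 0x12345678;
    const unsigned char *p = (const unsigned char *)&value;

    /* On a big-endian host this prints 12 34 56 78 (most significant byte
       at the lowest address); on a little-endian host it prints 78 56 34 12. */
    for (int i = 0; i < 4; i++)
        printf("%02X ", p[i]);
    printf("\n");
    return 0;
}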

The concept of endianness arises when it is necessary to treat a multiple-byte entity as a single data item with a single address, even though it is composed of smaller addressable units. Some machines, such as the Intel 80x86, x86, VAX, and Alpha, are little-endian machines, whereas others, such as the IBM System 370/390, the Motorola 680x0, Sun SPARC, and most RISC machines, are big endian. This presents problems when data are transferred from a machine of one endian type to the other and when a programmer attempts to manipulate individual bytes or bits within a multibyte scalar. The property of endianness does not extend beyond an individual data unit. In any machine, aggregates such as files, data structures, and arrays are composed of multiple data units, each with endianness. Thus, conversion of a block of memory from one style of endianness to the other requires knowledge of the data structure. Figure 12.13 illustrates how endianness determines addressing and byte order. The C structure at the top contains a number of data types. The memory layout in the lower left results from compilation of that structure for a big-endian machine, and that in the lower right for a little-endian machine. In each case, mem- ory is depicted as a series of 64-bit rows. For the big-endian case, memory typically is viewed left to right, top to bottom, whereas for the little-endian case, memory typically is viewed as right to left, top to bottom. Note that these layouts are arbi- trary. Either scheme could use either left to right or right to left within a row; this is a matter of depiction, not memory assignment. In fact, in looking at programmer manuals for a variety of machines, a bewildering collection of depictions is to be found, even within the same manual.

struct {
    int    a;      //0x1112_1314                  word
    int    pad;    //
    double b;      //0x2122_2324_2526_2728        doubleword
    char*  c;      //0x3132_3334                  word
    char   d[7];   //'A','B','C','D','E','F','G'  byte array
    short  e;      //0x5152                       halfword
    int    f;      //0x6162_6364                  word
} s;

Figure 12.13 Example C Data Structure and Its Endian Maps: big-endian address mapping (left) and little-endian address mapping (right). (Diagram not reproduced.)

We can make several observations about this data structure: • Each data item has the same address in both schemes. For example, the address of the doubleword with hexadecimal value 2122232425262728 is 08. • Within any given multibyte scalar value, the ordering of bytes in the little- endian structure is the reverse of that for the big-endian structure. • Endianness does not affect the ordering of data items within a structure. Thus, the four-character word c exhibits byte reversal, but the seven-character byte array d does not. Hence, the address of each individual element of d is the same in both structures. The effect of endianness is perhaps more clearly demonstrated when we view memory as a vertical array of bytes, as shown in Figure 12.14.

Figure 12.14 Another View of Figure 12.13: (a) Big endian; (b) Little endian. (Diagram not reproduced.)

There is no general consensus as to which is the superior style of endianness.4 The following points favor the big-endian style: • Character-string sorting: A big-endian processor is faster in comparing integer- aligned character strings; the integer ALU can compare multiple bytes in parallel. • Decimal/IRA dumps: All values can be printed left to right without causing confusion. • Consistent order: Big-endian processors store their integers and character strings in the same order (most significant byte comes first). The following points favor the little-endian style: • A big-endian processor has to perform addition when it converts a 32-bit inte- ger address to a 16-bit integer address, to use the least significant bytes. • It is easier to perform higher-precision arithmetic with the little-endian style; you don’t have to find the least-significant byte and move backward. The differences are minor and the choice of endian style is often more a mat- ter of accommodating previous machines than anything else. The PowerPC is a bi-endian processor that supports both big-endian and little-endian modes. The bi-endian architecture enables software developers to choose either mode when migrating operating systems and applications from other machines. The operating system establishes the endian mode in which processes execute. Once a mode is selected, all subsequent memory loads and stores are deter- mined by the memory-addressing model of that mode. To support this hardware feature, 2 bits are maintained in the machine state register (MSR) maintained by the operating system as part of the process state. One bit specifies the endian mode in which the kernel runs; the other specifies the processor’s current operating mode. Thus, mode can be changed on a per-process basis. Bit Ordering In ordering the bits within a byte, we are immediately faced with two questions: 1. Do you count the first bit as bit zero or as bit one? 2. Do you assign the lowest bit number to the byte’s least significant bit (little endian) or to the bytes most significant bit (big endian)? These questions are not answered in the same way on all machines. Indeed, on some machines, the answers are different in different circumstances. Furthermore, the choice of big- or little-endian bit ordering within a byte is not always consistent with big- or little-endian ordering of bytes within a multibyte scalar. The program- mer needs to be concerned with these issues when manipulating individual bits. Another area of concern is when data are transmitted over a bit-serial line. When an individual byte is transmitted, does the system transmit the most signifi- cant bit first or the least significant bit first? The designer must make certain that incoming bits are handled properly. For a discussion of this issue, see [JAME90].

4 The prophet revered by both groups in the Endian Wars of Gulliver's Travels had this to say: "All true Believers shall break their Eggs at the convenient End." Not much help!

CHAPTER 13
INSTRUCTION SETS: ADDRESSING MODES AND FORMATS

13.1 Addressing Modes
     Immediate Addressing
     Direct Addressing
     Indirect Addressing
     Register Addressing
     Register Indirect Addressing
     Displacement Addressing
     Stack Addressing
13.2 x86 and ARM Addressing Modes
     x86 Addressing Modes
     ARM Addressing Modes
13.3 Instruction Formats
     Instruction Length
     Allocation of Bits
     Variable-Length Instructions
13.4 x86 and ARM Instruction Formats
     x86 Instruction Formats
     ARM Instruction Formats
13.5 Assembly Language
13.6 Recommended Reading
13.7 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:

• Describe the various types of addressing modes common in instruction sets.
• Present an overview of x86 and ARM addressing modes.
• Summarize the issues and trade-offs involved in designing an instruction format.
• Present an overview of x86 and ARM instruction formats.
• Understand the distinction between machine language and assembly language.

In Chapter 12, we focused on what an instruction set does. Specifically, we examined the types of operands and operations that may be specified by machine instructions. This chapter turns to the question of how to specify the operands and operations of instructions. Two issues arise. First, how is the address of an operand specified, and second, how are the bits of an instruction organized to define the operand addresses and operation of that instruction?

13.1 ADDRESSING MODES

The address field or fields in a typical instruction format are relatively small. We would like to be able to reference a large range of locations in main memory or, for some systems, virtual memory. To achieve this objective, a variety of addressing techniques has been employed. They all involve some trade-off between address range and/or addressing flexibility, on the one hand, and the number of memory references in the instruction and/or the complexity of address calculation, on the other. In this section, we examine the most common addressing techniques, or modes:

• Immediate
• Direct
• Indirect
• Register
• Register indirect
• Displacement
• Stack

These modes are illustrated in Figure 13.1. In this section, we use the following notation:

A = contents of an address field in the instruction
R = contents of an address field in the instruction that refers to a register

[Figure 13.1 Addressing Modes: schematic diagrams of (a) immediate, (b) direct, (c) indirect, (d) register, (e) register indirect, (f) displacement, and (g) stack addressing]

EA = actual (effective) address of the location containing the referenced operand
(X) = contents of memory location X or register X

Table 13.1 indicates the address calculation performed for each addressing mode.

Table 13.1 Basic Addressing Modes

Mode                Algorithm           Principal Advantage     Principal Disadvantage
Immediate           Operand = A         No memory reference     Limited operand magnitude
Direct              EA = A              Simple                  Limited address space
Indirect            EA = (A)            Large address space     Multiple memory references
Register            EA = R              No memory reference     Limited address space
Register indirect   EA = (R)            Large address space     Extra memory reference
Displacement        EA = A + (R)        Flexibility             Complexity
Stack               EA = top of stack   No memory reference     Limited applicability
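To make the algorithms of Table 13.1 concrete, the following C sketch models a tiny machine with a word-addressed memory array and a register file and computes the effective address (or the operand itself) for each basic mode. The machine model, names, and sizes are illustrative assumptions, not any particular architecture.

#include <stdint.h>

#define MEM_WORDS 1024
#define NUM_REGS  8

uint32_t mem[MEM_WORDS];   /* main memory, word addressed            */
uint32_t reg[NUM_REGS];    /* general-purpose registers              */
uint32_t sp;               /* stack pointer (top-of-stack address)   */

/* Effective address for each memory-referencing mode of Table 13.1.
   A is the address field of the instruction; r names a register.    */
uint32_t ea_direct(uint32_t A)              { return A; }            /* EA = A     */
uint32_t ea_indirect(uint32_t A)            { return mem[A]; }       /* EA = (A)   */
uint32_t ea_register_indirect(int r)        { return reg[r]; }       /* EA = (R)   */
uint32_t ea_displacement(uint32_t A, int r) { return A + reg[r]; }   /* EA = A+(R) */
uint32_t ea_stack(void)                     { return sp; }           /* EA = top   */

/* Immediate and register modes yield the operand directly, with no
   memory address involved at all.                                   */
uint32_t operand_immediate(uint32_t A)      { return A; }            /* Operand = A */
uint32_t operand_register(int r)            { return reg[r]; }       /* Operand in R */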

Before beginning this discussion, two comments need to be made. First, virtually all computer architectures provide more than one of these addressing modes. The question arises as to how the processor can determine which address mode is being used in a particular instruction. Several approaches are taken. Often, different opcodes will use different addressing modes. Also, one or more bits in the instruction format can be used as a mode field. The value of the mode field determines which addressing mode is to be used.

The second comment concerns the interpretation of the effective address (EA). In a system without virtual memory, the effective address will be either a main memory address or a register. In a virtual memory system, the effective address is a virtual address or a register. The actual mapping to a physical address is a function of the memory management unit (MMU) and is invisible to the programmer.

Immediate Addressing

The simplest form of addressing is immediate addressing, in which the operand value is present in the instruction:

Operand = A

This mode can be used to define and use constants or set initial values of variables. Typically, the number will be stored in twos complement form; the leftmost bit of the operand field is used as a sign bit. When the operand is loaded into a data register, the sign bit is extended to the left to the full data word size. In some cases, the immediate binary value is interpreted as an unsigned nonnegative integer.

The advantage of immediate addressing is that no memory reference other than the instruction fetch is required to obtain the operand, thus saving one memory or cache cycle in the instruction cycle. The disadvantage is that the size of the number is restricted to the size of the address field, which, in most instruction sets, is small compared with the word length.

Direct Addressing

A very simple form of addressing is direct addressing, in which the address field contains the effective address of the operand:

EA = A

The technique was common in earlier generations of computers but is not common on contemporary architectures. It requires only one memory reference and no special calculation. The obvious limitation is that it provides only a limited address space.

Indirect Addressing

With direct addressing, the length of the address field is usually less than the word length, thus limiting the address range. One solution is to have the address field refer to the address of a word in memory, which in turn contains a full-length address of the operand. This is known as indirect addressing:

EA = (A)

As defined earlier, the parentheses are to be interpreted as meaning contents of. The obvious advantage of this approach is that for a word length of N, an address space of 2^N is now available. The disadvantage is that instruction execution requires two memory references to fetch the operand: one to get its address and a second to get its value.

Although the number of words that can be addressed is now equal to 2^N, the number of different effective addresses that may be referenced at any one time is limited to 2^K, where K is the length of the address field. Typically, this is not a burdensome restriction, and it can be an asset. In a virtual memory environment, all the effective address locations can be confined to page 0 of any process. Because the address field of an instruction is small, it will naturally produce low-numbered direct addresses, which would appear in page 0. (The only restriction is that the page size must be greater than or equal to 2^K.) When a process is active, there will be repeated references to page 0, causing it to remain in real memory. Thus, an indirect memory reference will involve, at most, one page fault rather than two.

A rarely used variant of indirect addressing is multilevel or cascaded indirect addressing:

EA = (...(A)...)

In this case, one bit of a full-word address is an indirect flag (I). If the I bit is 0, then the word contains the EA. If the I bit is 1, then another level of indirection is invoked. There does not appear to be any particular advantage to this approach, and its disadvantage is that three or more memory references could be required to fetch an operand.

Register Addressing

Register addressing is similar to direct addressing. The only difference is that the address field refers to a register rather than a main memory address:

EA = R

To clarify, if the contents of a register address field in an instruction is 5, then register R5 is the intended address, and the operand value is contained in R5. Typically, an address field that references registers will have from 3 to 5 bits, so that a total of from 8 to 32 general-purpose registers can be referenced.

The advantages of register addressing are that (1) only a small address field is needed in the instruction, and (2) no time-consuming memory references are required. As was discussed in Chapter 4, the memory access time for a register internal to the processor is much less than that for a main memory address. The disadvantage of register addressing is that the address space is very limited.

If register addressing is heavily used in an instruction set, this implies that the processor registers will be heavily used. Because of the severely limited number of registers (compared with main memory locations), their use in this fashion makes sense only if they are employed efficiently. If every operand is brought into a register from main memory, operated on once, and then returned to main memory, then a wasteful intermediate step has been added. If, instead, the operand in a register remains in use for multiple operations, then a real savings is achieved. An example is the intermediate result in a calculation. In particular, suppose that the algorithm for twos complement multiplication were to be implemented in software. The location labeled A in the flowchart (Figure 10.12) is referenced many times and should be implemented in a register rather than a main memory location.

It is up to the programmer or compiler to decide which values should remain in registers and which should be stored in main memory. Most modern processors employ multiple general-purpose registers, placing a burden for efficient execution on the assembly-language programmer (e.g., compiler writer).

Register Indirect Addressing

Just as register addressing is analogous to direct addressing, register indirect addressing is analogous to indirect addressing. In both cases, the only difference is whether the address field refers to a memory location or a register. Thus, for register indirect addressing,

EA = (R)

The advantages and limitations of register indirect addressing are basically the same as for indirect addressing. In both cases, the address space limitation (limited range of addresses) of the address field is overcome by having that field refer to a word-length location containing an address. In addition, register indirect addressing uses one less memory reference than indirect addressing.

Displacement Addressing

A very powerful mode of addressing combines the capabilities of direct addressing and register indirect addressing. It is known by a variety of names depending on the context of its use, but the basic mechanism is the same. We will refer to this as displacement addressing:

EA = A + (R)

Displacement addressing requires that the instruction have two address fields, at least one of which is explicit. The value contained in one address field (value = A) is used directly. The other address field, or an implicit reference based on the opcode, refers to a register whose contents are added to A to produce the effective address.

We will describe three of the most common uses of displacement addressing:

• Relative addressing
• Base-register addressing
• Indexing

RELATIVE ADDRESSING For relative addressing, also called PC-relative addressing, the implicitly referenced register is the program counter (PC). That is, the next instruction address is added to the address field to produce the EA. Typically, the address field is treated as a twos complement number for this operation. Thus, the effective address is a displacement relative to the address of the instruction. Relative addressing exploits the concept of locality that was discussed in Chapters 4 and 8. If most memory references are relatively near to the instruction being executed, then the use of relative addressing saves address bits in the instruction.

BASE-REGISTER ADDRESSING For base-register addressing, the interpretation is the following: The referenced register contains a main memory address, and the address field contains a displacement (usually an unsigned integer representation) from that address. The register reference may be explicit or implicit. Base-register addressing also exploits the locality of memory references. It is a convenient means of implementing segmentation, which was discussed in Chapter 8. In some implementations, a single segment-base register is employed and is used implicitly. In others, the programmer may choose a register to hold the base address of a segment, and the instruction must reference it explicitly. In this latter case, if the length of the address field is K and the number of possible registers is N, then one instruction can reference any one of N areas of 2^K words.

INDEXING For indexing, the interpretation is typically the following: The address field references a main memory address, and the referenced register contains a positive displacement from that address. Note that this usage is just the opposite of the interpretation for base-register addressing. Of course, it is more than just a matter of user interpretation. Because the address field is considered to be a memory address in indexing, it generally contains more bits than an address field in a comparable base-register instruction. Also, we shall see that there are some refinements to indexing that would not be as useful in the base-register context. Nevertheless, the method of calculating the EA is the same for both base-register addressing and indexing, and in both cases the register reference is sometimes explicit and sometimes implicit (for different processor types).

An important use of indexing is to provide an efficient mechanism for performing iterative operations. Consider, for example, a list of numbers stored starting at location A. Suppose that we would like to add 1 to each element on the list. We need to fetch each value, add 1 to it, and store it back. The sequence of effective addresses that we need is A, A + 1, A + 2, . . . , up to the last location on the list. With indexing, this is easily done. The value A is stored in the instruction's address field, and the chosen register, called an index register, is initialized to 0. After each operation, the index register is incremented by 1.

Because index registers are commonly used for such iterative tasks, it is typical that there is a need to increment or decrement the index register after

each reference to it. Because this is such a common operation, some systems will automatically do this as part of the same instruction cycle. This is known as autoindexing. If certain registers are devoted exclusively to indexing, then autoindexing can be invoked implicitly and automatically. If general-purpose registers are used, the autoindex operation may need to be signaled by a bit in the instruction. Autoindexing using increment can be depicted as follows:

EA = A + (R)
(R) ← (R) + 1

In some machines, both indirect addressing and indexing are provided, and it is possible to employ both in the same instruction. There are two possibilities: the indexing is performed either before or after the indirection. If indexing is performed after the indirection, it is termed postindexing:

EA = (A) + (R)

First, the contents of the address field are used to access a memory location containing a direct address. This address is then indexed by the register value. This technique is useful for accessing one of a number of blocks of data of a fixed format. For example, it was described in Chapter 8 that the operating system needs to employ a process control block for each process. The operations performed are the same regardless of which block is being manipulated. Thus, the addresses in the instructions that reference the block could point to a location (value = A) containing a variable pointer to the start of a process control block. The index register contains the displacement within the block.

With preindexing, the indexing is performed before the indirection:

EA = (A + (R))

An address is calculated as with simple indexing. In this case, however, the calculated address contains not the operand, but the address of the operand. An example of the use of this technique is to construct a multiway branch table. At a particular point in a program, there may be a branch to one of a number of locations depending on conditions. A table of addresses can be set up starting at location A. By indexing into this table, the required location can be found. Typically, an instruction set will not include both preindexing and postindexing.
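The difference between the two orderings is easy to state in code. The following C sketch, using the same word-addressed memory model assumed earlier (an illustrative model, not a specific machine), computes the effective address for postindexing and preindexing; the branch-table use of preindexing corresponds to A naming the start of the table and (R) selecting an entry whose contents are the branch target.

#include <stdint.h>

#define MEM_WORDS 1024
uint32_t mem[MEM_WORDS];          /* word-addressed memory model          */

/* Postindexing: the address field A points at a word holding a base
   pointer (e.g., to a process control block); the register then selects
   a field within that block.           EA = (A) + (R)                    */
uint32_t ea_postindex(uint32_t A, uint32_t R) {
    return mem[A] + R;
}

/* Preindexing: the register first selects an entry in a table starting
   at A; the selected word holds the operand's (or branch target's)
   address.                              EA = (A + (R))                   */
uint32_t ea_preindex(uint32_t A, uint32_t R) {
    return mem[A + R];
}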

Stack Addressing

The final addressing mode that we consider is stack addressing. As defined in Appendix O, a stack is a linear array of locations. It is sometimes referred to as a pushdown list or last-in-first-out queue. The stack is a reserved block of locations. Items are appended to the top of the stack so that, at any given time, the block is partially filled. Associated with the stack is a pointer whose value is the address of the top of the stack. Alternatively, the top two elements of the stack may be in processor registers, in which case the stack pointer references the third element of the stack. The stack pointer is maintained in a register. Thus, references to stack locations in memory are in fact register indirect addresses.

The stack mode of addressing is a form of implied addressing. The machine instructions need not include a memory reference but implicitly operate on the top of the stack.
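A minimal C sketch of this idea follows. The direction of stack growth (downward, toward lower addresses) and the memory size are assumptions made only for illustration; the point is that push and pop carry no address field at all, and every access is in effect a register indirect reference through the stack pointer.

#include <stdint.h>

#define MEM_WORDS 1024
uint32_t mem[MEM_WORDS];
uint32_t sp = MEM_WORDS;          /* stack grows downward in this sketch  */

/* The stack pointer register supplies the address implicitly. */
void push(uint32_t value) { mem[--sp] = value; }
uint32_t pop(void)        { return mem[sp++]; }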

13.2 x86 AND ARM ADDRESSING MODES

x86 Addressing Modes

Recall from Figure 8.21 that the x86 address translation mechanism produces an address, called a virtual or effective address, that is an offset into a segment. The sum of the starting address of the segment and the effective address produces a linear address. If paging is being used, this linear address must pass through a page-translation mechanism to produce a physical address. In what follows, we ignore this last step because it is transparent to the instruction set and to the programmer.

The x86 is equipped with a variety of addressing modes intended to allow the efficient execution of high-level languages. Figure 13.2 indicates the logic involved. The segment register determines the segment that is the subject of the reference. There are six segment registers; the one being used for a particular reference depends on the context of execution and the instruction. Each segment register

[Figure 13.2 x86 Addressing Mode Calculation: a segment register (CS, DS, ES, SS, FS, or GS) selects a segment descriptor register holding the segment's base address, limit, and access rights; the effective address is formed from a base register, an index register scaled by 1, 2, 4, or 8, and a displacement held in the instruction (0, 8, or 32 bits); adding the effective address to the segment base gives the linear address]

holds an index into the segment descriptor table (Figure 8.20), which holds the starting address of the corresponding segments. Associated with each user-visible segment register is a segment descriptor register (not programmer visible), which records the access rights for the segment as well as the starting address and limit (length) of the segment. In addition, there are two registers that may be used in constructing an address: the base register and the index register.

Table 13.2 lists the x86 addressing modes. Let us consider each of these in turn.

For the immediate mode, the operand is included in the instruction. The operand can be a byte, word, or doubleword of data.

For register operand mode, the operand is located in a register. For general instructions, such as data transfer, arithmetic, and logical instructions, the operand can be one of the 32-bit general registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), one of the 16-bit general registers (AX, BX, CX, DX, SI, DI, SP, BP), or one of the 8-bit general registers (AH, BH, CH, DH, AL, BL, CL, DL). There are also some instructions that reference the segment selector registers (CS, DS, ES, SS, FS, GS).

The remaining addressing modes reference locations in memory. The memory location must be specified in terms of the segment containing the location and the offset from the beginning of the segment. In some cases, a segment is specified explicitly; in others, the segment is specified by simple rules that assign a segment by default.

In the displacement mode, the operand's offset (the effective address of Figure 13.2) is contained as part of the instruction as an 8-, 16-, or 32-bit displacement. With segmentation, all addresses in instructions refer merely to an offset in a segment. The displacement addressing mode is found on few machines because, as mentioned earlier, it leads to long instructions. In the case of the x86,

Table 13.2 x86 Addressing Modes

Mode                                        Algorithm
Immediate                                   Operand = A
Register Operand                            LA = R
Displacement                                LA = (SR) + A
Base                                        LA = (SR) + (B)
Base with Displacement                      LA = (SR) + (B) + A
Scaled Index with Displacement              LA = (SR) + (I) * S + A
Base with Index and Displacement            LA = (SR) + (B) + (I) + A
Base with Scaled Index and Displacement     LA = (SR) + (I) * S + (B) + A
Relative                                    LA = (PC) + A

LA = linear address
(X) = contents of X
SR = segment register
PC = program counter
A = contents of an address field in the instruction
R = register
B = base register
I = index register
S = scaling factor

the displacement value can be as long as 32 bits, making for a 6-byte instruction. Displacement addressing can be useful for referencing global variables. The remaining addressing modes are indirect, in the sense that the address portion of the instruction tells the processor where to look to find the address. The base mode specifies that one of the 8-, 16-, or 32-bit registers contains the effective address. This is equivalent to what we have referred to as register indirect addressing. In the base with displacement mode, the instruction includes a displacement to be added to a base register, which may be any of the general-purpose registers. Examples of uses of this mode are as follows:

• Used by a compiler to point to the start of a local variable area. For example, the base register could point to the beginning of a stack frame, which contains the local variables for the corresponding procedure.
• Used to index into an array when the element size is not 1, 2, 4, or 8 bytes and which therefore cannot be indexed using an index register. In this case, the displacement points to the beginning of the array, and the base register holds the results of a calculation to determine the offset to a specific element within the array.
• Used to access a field of a record. The base register points to the beginning of the record, while the displacement is an offset to the field.

In the scaled index with displacement mode, the instruction includes a displacement to be added to a register, in this case called an index register. The index register may be any of the general-purpose registers except the one called ESP, which is generally used for stack processing. In calculating the effective address, the contents of the index register are multiplied by a scaling factor of 1, 2, 4, or 8, and then added to a displacement. This mode is very convenient for indexing arrays. A scaling factor of 2 can be used for an array of 16-bit integers. A scaling factor of 4 can be used for 32-bit integers or floating-point numbers. Finally, a scaling factor of 8 can be used for an array of double-precision floating-point numbers.

The base with index and displacement mode sums the contents of the base register, the index register, and a displacement to form the effective address. Again, the base register can be any general-purpose register and the index register can be any general-purpose register except ESP. As an example, this addressing mode could be used for accessing a local array on a stack frame. This mode can also be used to support a two-dimensional array; in this case, the displacement points to the beginning of the array, and each register handles one dimension of the array.

The base with scaled index and displacement mode sums the contents of the index register multiplied by a scaling factor, the contents of the base register, and the displacement. This is useful if an array is stored in a stack frame; in this case, the array elements would be 2, 4, or 8 bytes each in length. This mode also provides efficient indexing of a two-dimensional array when the array elements are 2, 4, or 8 bytes in length.

Finally, relative addressing can be used in transfer-of-control instructions. A displacement is added to the value of the program counter, which points to the next instruction. In this case, the displacement is treated as a signed byte, word, or doubleword value, and that value either increases or decreases the address in the program counter.
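The arithmetic behind the richer x86 modes is simple enough to state directly. The following C sketch computes the offset portion of the "base with scaled index and displacement" calculation from Table 13.2 (the segment base is omitted for brevity, and the function name is illustrative); this is the same arithmetic a compiler relies on when it emits something like MOV EAX, [EBX + ESI*4 + 16] to fetch a 32-bit array element inside a structure.

#include <stdint.h>

/* Offset part of: LA = (SR) + (B) + (I) * S + A
   scale must be 1, 2, 4, or 8 on the x86.                               */
uint32_t x86_scaled_offset(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}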

ARM Addressing Modes

Typically, a RISC machine, unlike a CISC machine, uses a simple and relatively straightforward set of addressing modes. The ARM architecture departs somewhat from this tradition by providing a relatively rich set of addressing modes. These modes are most conveniently classified with respect to the type of instruction.1

LOAD/STORE ADDRESSING Load and store instructions are the only instructions that reference memory. This is always done indirectly through a base register plus offset. There are three alternatives with respect to indexing (Figure 13.3):

• Offset: For this addressing method, indexing is not used. An offset value is added to or subtracted from the value in the base register to form the memory address. As an example, Figure 13.3a illustrates this method with the assembly language instruction STRB r0, [r1, #12]. This is the store byte instruction. In this case the base address is in register r1 and the displacement is an immediate value of decimal 12. The resulting address (base plus offset) is the location where the least significant byte from r0 is to be stored.
• Preindex: The memory address is formed in the same way as for offset addressing. The memory address is also written back to the base register. In other words, the base register value is incremented or decremented by the offset value. Figure 13.3b illustrates this method with the assembly language instruction STRB r0, [r1, #12]!. The exclamation point signifies preindexing.
• Postindex: The memory address is the base register value. An offset is added to or subtracted from the base register value and the result is written back to the base register. Figure 13.3c illustrates this method with the assembly language instruction STRB r0, [r1], #12.

Note that what ARM refers to as a base register acts as an index register for preindex and postindex addressing. The offset value can either be an immediate value stored in the instruction or it can be in another register. If the offset value is in a register, another useful feature is available: scaled register addressing. The value in the offset register is scaled by one of the shift operators: Logical Shift Left, Logical Shift Right, Arithmetic Shift Right, Rotate Right, or Rotate Right Extended (which includes the carry bit in the rotation). The amount of the shift is specified as an immediate value in the instruction.

DATA PROCESSING INSTRUCTION ADDRESSING Data processing instructions use either register addressing or a mixture of register and immediate addressing. For register addressing, the value in one of the register operands may be scaled using one of the five shift operators defined in the preceding paragraph.

BRANCH INSTRUCTIONS The only form of addressing for branch instructions is immediate addressing. The branch instruction contains a 24-bit value. For address calculation, this value is shifted left 2 bits, so that the address is on a word boundary. Thus the effective address range is ±32 MB from the program counter.
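The three load/store indexing alternatives differ only in when (and whether) the base register is updated. The following C sketch models the behavior of the three STRB forms used above; the byte-array memory model and function names are illustrative assumptions, not ARM definitions.

#include <stdint.h>

uint8_t memory[0x400];

/* Offset:    STRB r0, [r1, #12]   — address = r1 + 12, r1 unchanged     */
void strb_offset(uint32_t r0, uint32_t *r1, int32_t offset) {
    memory[*r1 + offset] = (uint8_t)r0;
}

/* Preindex:  STRB r0, [r1, #12]!  — address = r1 + 12, written back     */
void strb_preindex(uint32_t r0, uint32_t *r1, int32_t offset) {
    *r1 += offset;                 /* base register updated first         */
    memory[*r1] = (uint8_t)r0;
}

/* Postindex: STRB r0, [r1], #12   — address = r1, then r1 written back  */
void strb_postindex(uint32_t r0, uint32_t *r1, int32_t offset) {
    memory[*r1] = (uint8_t)r0;     /* use the original base address       */
    *r1 += offset;                 /* then update the base register       */
}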

1 As with our discussion of x86 addressing, we ignore the translation from virtual to physical address in the following discussion.

[Figure 13.3 ARM Indexing Methods: (a) offset — STRB r0, [r1, #12]; (b) preindex — STRB r0, [r1, #12]!; (c) postindex — STRB r0, [r1], #12. Each panel shows the original base register r1 = 0x200, the offset 0xC, the resulting address 0x20C, and whether the updated base address is written back to r1]

LOAD/STORE MULTIPLE ADDRESSING Load Multiple instructions load a subset (possibly all) of the general-purpose registers from memory. Store Multiple instructions store a subset (possibly all) of the general-purpose registers to memory. The list of registers for the load or store is specified in a 16-bit field in the instruction, with each bit corresponding to one of the 16 registers. Load and Store Multiple addressing modes produce a sequential range of memory addresses. The lowest-numbered register is stored at the lowest memory address and the highest-numbered register at the highest memory address. Four addressing modes are used

[Figure 13.4 ARM Load/Store Multiple Addressing: LDMxx r10, {r0, r1, r4} and STMxx r10, {r0, r1, r4} with base register r10 = 0x20C, showing the memory words occupied by r0, r1, and r4 for increment after (IA), increment before (IB), decrement after (DA), and decrement before (DB)]

(Figure 13.4): increment after, increment before, decrement after, and decrement before. A base register specifies a main memory address where register values are stored in or loaded from in ascending (increment) or descending (decrement) word locations. Incrementing or decrementing starts either before or after the first memory access. These instructions are useful for block loads or stores, stack operations, and procedure exit sequences.

13.3 INSTRUCTION FORMATS

An instruction format defines the layout of the bits of an instruction, in terms of its constituent fields. An instruction format must include an opcode and, implicitly or explicitly, zero or more operands. Each explicit operand is referenced using one of the addressing modes described in Section 13.1. The format must, implicitly or explicitly, indicate the addressing mode for each operand. For most instruction sets, more than one instruction format is used.

The design of an instruction format is a complex art, and an amazing variety of designs have been implemented. We examine the key design issues, looking briefly at some designs to illustrate points, and then we examine the x86 and ARM solutions in detail.

Instruction Length

The most basic design issue to be faced is the instruction format length. This decision affects, and is affected by, memory size, memory organization, bus structure, processor complexity, and processor speed. This decision determines the richness and flexibility of the machine as seen by the assembly-language programmer.

The most obvious trade-off here is between the desire for a powerful instruction repertoire and a need to save space. Programmers want more opcodes, more operands, more addressing modes, and greater address range. More opcodes and more operands make life easier for the programmer, because shorter programs can be written to accomplish given tasks. Similarly, more addressing modes give the programmer greater flexibility in implementing certain functions, such as table manipulations and multiple-way branching. And, of course, with the increase in main memory size and the increasing use of virtual memory, programmers want to be able to address larger memory ranges. All of these things (opcodes, operands, addressing modes, address range) require bits and push in the direction of longer instruction lengths. But longer instruction length may be wasteful. A 64-bit instruction occupies twice the space of a 32-bit instruction but is probably less than twice as useful.

Beyond this basic trade-off, there are other considerations. Either the instruction length should be equal to the memory-transfer length (in a bus system, data-bus length) or one should be a multiple of the other. Otherwise, we will not get an integral number of instructions during a fetch cycle. A related consideration is the memory transfer rate. This rate has not kept up with increases in processor speed. Accordingly, memory can become a bottleneck if the processor can execute instructions faster than it can fetch them. One solution to this problem is to use cache memory (see Section 4.3); another is to use shorter instructions. Thus, 16-bit instructions can be fetched at twice the rate of 32-bit instructions but probably can be executed less than twice as rapidly.

A seemingly mundane but nevertheless important feature is that the instruction length should be a multiple of the character length, which is usually 8 bits, and of the length of fixed-point numbers. To see this, we need to make use of that unfortunately ill-defined word, word [FRAI83]. The word length of memory is, in some sense, the "natural" unit of organization. The size of a word usually determines the size of fixed-point numbers (usually the two are equal). Word size is also typically equal to, or at least integrally related to, the memory transfer size. Because a common form of data is character data, we would like a word to store an integral number of characters. Otherwise, there are wasted bits in each word when storing multiple characters, or a character will have to straddle a word boundary. The importance of this point is such that IBM, when it introduced the System/360 and wanted to employ 8-bit characters, made the wrenching decision to move from the 36-bit architecture of the scientific members of the 700/7000 series to a 32-bit architecture.

Allocation of Bits

We've looked at some of the factors that go into deciding the length of the instruction format. An equally difficult issue is how to allocate the bits in that format. The trade-offs here are complex.

For a given instruction length, there is clearly a trade-off between the number of opcodes and the power of the addressing capability. More opcodes obviously mean more bits in the opcode field. For an instruction format of a given length, this reduces the number of bits available for addressing. There is one interesting refinement to this trade-off, and that is the use of variable-length opcodes. In this approach, there is a minimum opcode length but, for some opcodes, additional operations may be specified by using additional bits in the instruction. For a fixed-length instruction, this leaves fewer bits for addressing. Thus, this feature is used for those instructions that require fewer operands and/or less powerful addressing.

The following interrelated factors go into determining the use of the addressing bits.

• Number of addressing modes: Sometimes an addressing mode can be indicated implicitly. For example, certain opcodes might always call for indexing. In other cases, the addressing modes must be explicit, and one or more mode bits will be needed.
• Number of operands: We have seen that fewer addresses can make for longer, more awkward programs (e.g., Figure 10.3). Typical instruction formats on today's machines include two operands. Each operand address in the instruction might require its own mode indicator, or the use of a mode indicator could be limited to just one of the address fields.
• Register versus memory: A machine must have registers so that data can be brought into the processor for processing. With a single user-visible register (usually called the accumulator), one operand address is implicit and consumes no instruction bits. However, single-register programming is awkward and requires many instructions. Even with multiple registers, only a few bits are needed to specify the register. The more that registers can be used for operand references, the fewer bits are needed. A number of studies indicate that a total of 8 to 32 user-visible registers is desirable [LUND77, HUCK83]. Most contemporary architectures have at least 32 registers.
• Number of register sets: Most contemporary machines have one set of general-purpose registers, with typically 32 or more registers in the set. These registers can be used to store data and can be used to store addresses for displacement addressing. Some architectures, including that of the x86, have a collection of two or more specialized sets (such as data and displacement). One advantage of this latter approach is that, for a fixed number of registers, a functional split requires fewer bits to be used in the instruction. For example, with two sets of eight registers, only 3 bits are required to identify a register; the opcode or mode register will determine which set of registers is being referenced.
• Address range: For addresses that reference memory, the range of addresses that can be referenced is related to the number of address bits. Because this imposes a severe limitation, direct addressing is rarely used. With displacement addressing, the range is opened up to the length of the address register. Even so, it is still convenient to allow rather large displacements from the register address, which requires a relatively large number of address bits in the instruction.
• Address granularity: For addresses that reference memory rather than registers, another factor is the granularity of addressing. In a system with 16- or 32-bit words, an address can reference a word or a byte at the designer's choice. Byte addressing is convenient for character manipulation but requires, for a fixed-size memory, more address bits.

Thus, the designer is faced with a host of factors to consider and balance. How critical the various choices are is not clear. As an example, we cite one study [CRAG79] that compared various instruction format approaches, including the use of a stack, general-purpose registers, an accumulator, and only memory-to-register approaches. Using a consistent set of assumptions, no significant difference in code space or execution time was observed.

Let us briefly look at how two historical machine designs balance these various factors.
PDP-8 One of the simplest instruction designs for a general-purpose computer was for the PDP-8 [BELL78b]. The PDP-8 uses 12-bit instructions and operates on 12-bit words. There is a single general-purpose register, the accumulator.

Despite the limitations of this design, the addressing is quite flexible. Each memory reference consists of 7 bits plus two 1-bit modifiers. The memory is divided into fixed-length pages of 2^7 = 128 words each. Address calculation is based on references to page 0 or the current page (page containing this instruction) as determined by the page bit. The second modifier bit indicates whether direct or indirect addressing is to be used. These two modes can be used in combination, so that an indirect address is a 12-bit address contained in a word of page 0 or the current page. In addition, 8 dedicated words on page 0 are autoindex "registers." When an indirect reference is made to one of these locations, preindexing occurs.

Figure 13.5 shows the PDP-8 instruction format. There are a 3-bit opcode and three types of instructions. For opcodes 0 through 5, the format is a single-address memory reference instruction including a page bit and an indirect bit. Thus, there are only six basic operations. To enlarge the group of operations, opcode 7 defines

[Figure 13.5 PDP-8 Instruction Formats: memory reference instructions (3-bit opcode, D/I direct/indirect bit, Z/C page 0/current page bit, 7-bit displacement); input/output instructions (6-bit device field, 3-bit opcode); and register reference microinstructions in three groups, with individual bits for operations such as CLA (clear accumulator), CLL (clear link), CMA (complement accumulator), CML (complement link), RAR/RAL (rotate accumulator right/left), BSW (byte swap), IAC (increment accumulator), SMA/SZA/SNL (skip on minus/zero accumulator, nonzero link), RSS (reverse skip sense), OSR (or with switch register), HLT (halt), and MQA/MQL (multiplier quotient operations)]

a register reference or microinstruction. In this format, the remaining bits are used to encode additional operations. In general, each bit defines a specific operation (e.g., clear accumulator), and these bits can be combined in a single instruction. The microinstruction strategy was used as far back as the PDP-1 by DEC and is, in a sense, a forerunner of today's microprogrammed machines, to be discussed in Part Four. Opcode 6 is the I/O operation; 6 bits are used to select one of 64 devices, and 3 bits specify a particular I/O command.

The PDP-8 instruction format is remarkably efficient. It supports indirect addressing, displacement addressing, and indexing. With the use of the opcode extension, it supports a total of approximately 35 instructions. Given the constraints of a 12-bit instruction length, the designers could hardly have done better.

PDP-10 A sharp contrast to the instruction set of the PDP-8 is that of the PDP-10. The PDP-10 was designed to be a large-scale time-shared system, with an emphasis on making the system easy to program, even if additional hardware expense was involved. Among the design principles employed in designing the instruction set were the following [BELL78c]:

• Orthogonality: Orthogonality is a principle by which two variables are independent of each other. In the context of an instruction set, the term indicates that other elements of an instruction are independent of (not determined by) the opcode. The PDP-10 designers use the term to describe the fact that an address is always computed in the same way, independent of the opcode. This is in contrast to many machines, where the address mode sometimes depends implicitly on the operator being used.
• Completeness: Each arithmetic data type (integer, fixed-point, floating-point) should have a complete and identical set of operations.
• Direct addressing: Base plus displacement addressing, which places a memory organization burden on the programmer, was avoided in favor of direct addressing.

Each of these principles advances the main goal of ease of programming.

The PDP-10 has a 36-bit word length and a 36-bit instruction length. The fixed instruction format is shown in Figure 13.6. The opcode occupies 9 bits, allowing up to 512 operations. In fact, a total of 365 different instructions are defined. Most instructions have two addresses, one of which is one of 16 general-purpose registers. Thus, this operand reference occupies 4 bits. The other operand reference starts with an 18-bit memory address field. This can be used as an immediate operand or a memory address. In the latter usage, both indexing and indirect addressing are allowed. The same general-purpose registers are also used as index registers.
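Because the format is fixed, decoding a PDP-10 instruction amounts to a handful of shifts and masks. The following C sketch extracts the fields using the layout shown in Figure 13.6 (9-bit opcode, 4-bit register, indirect bit, 4-bit index register, 18-bit address); holding the 36-bit word in the low bits of a 64-bit integer, with PDP-10 bit 0 as the most significant bit, is an assumption made for illustration.

#include <stdint.h>

typedef struct {
    unsigned opcode;   /* bits 0-8   : operation code        */
    unsigned ac;       /* bits 9-12  : accumulator/register  */
    unsigned indirect; /* bit 13     : indirect bit I        */
    unsigned index;    /* bits 14-17 : index register        */
    unsigned addr;     /* bits 18-35 : memory address        */
} pdp10_insn;

pdp10_insn decode_pdp10(uint64_t w) {
    pdp10_insn d;
    d.opcode   = (unsigned)((w >> 27) & 0x1FF);
    d.ac       = (unsigned)((w >> 23) & 0xF);
    d.indirect = (unsigned)((w >> 22) & 0x1);
    d.index    = (unsigned)((w >> 18) & 0xF);
    d.addr     = (unsigned)( w        & 0x3FFFF);
    return d;
}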

[Figure 13.6 PDP-10 Instruction Format: 9-bit opcode (bits 0–8), 4-bit register field (bits 9–12), indirect bit I (bit 13), 4-bit index register field (bits 14–17), and 18-bit memory address (bits 18–35)]

A 36-bit instruction length is true luxury. There is no need to do clever things to get more opcodes; a 9-bit opcode field is more than adequate. Addressing is also straightforward. An 18-bit address field makes direct addressing desirable. For memory sizes greater than 2^18, indirection is provided. For the ease of the programmer, indexing is provided for table manipulation and iterative programs. Also, with an 18-bit operand field, immediate addressing becomes attractive.

The PDP-10 instruction set design does accomplish the objectives listed earlier [LUND77]. It eases the task of the programmer or compiler at the expense of an inefficient utilization of space. This was a conscious choice made by the designers and therefore cannot be faulted as poor design.

Variable-Length Instructions

The examples we have looked at so far have used a single fixed instruction length, and we have implicitly discussed trade-offs in that context. But the designer may choose instead to provide a variety of instruction formats of different lengths. This tactic makes it easy to provide a large repertoire of opcodes, with different opcode lengths. Addressing can be more flexible, with various combinations of register and memory references plus addressing modes. With variable-length instructions, these many variations can be provided efficiently and compactly.

The principal price to pay for variable-length instructions is an increase in the complexity of the processor. Falling hardware prices, the use of microprogramming (discussed in Part Four), and a general increase in understanding the principles of processor design have all contributed to making this a small price to pay. However, we will see that RISC and superscalar machines can exploit the use of fixed-length instructions to provide improved performance.

The use of variable-length instructions does not remove the desirability of making all of the instruction lengths integrally related to the word length. Because the processor does not know the length of the next instruction to be fetched, a typical strategy is to fetch a number of bytes or words equal to at least the longest possible instruction. This means that sometimes multiple instructions are fetched. However, as we shall see in Chapter 14, this is a good strategy to follow in any case.

PDP-11 The PDP-11 was designed to provide a powerful and flexible instruction set within the constraints of a 16-bit minicomputer [BELL70].

The PDP-11 employs a set of eight 16-bit general-purpose registers. Two of these registers have additional significance: one is used as a stack pointer for special-purpose stack operations, and one is used as the program counter, which contains the address of the next instruction.

Figure 13.7 shows the PDP-11 instruction formats. Thirteen different formats are used, encompassing zero-, one-, and two-address instruction types. The opcode can vary from 4 to 16 bits in length. Register references are 6 bits in length: 3 bits identify the register, and the remaining 3 bits identify the addressing mode. The PDP-11 is endowed with a rich set of addressing modes. One advantage of linking the addressing mode to the operand rather than the opcode, as is sometimes done, is that any addressing mode can be used with any opcode. As was mentioned, this independence is referred to as orthogonality.
[Figure 13.7 Instruction Formats for the PDP-11: thirteen formats; the numbers below the fields indicate bit length; source and destination fields each contain a 3-bit addressing mode and a 3-bit register number; R indicates one of the general-purpose registers, FP one of four floating-point registers, and CC the condition code field]


PDP-11 instructions are usually one word (16 bits) long. For some instructions, one or two memory addresses are appended, so that 32-bit and 48-bit instructions are part of the repertoire. This provides for further flexibility in addressing.

The PDP-11 instruction set and addressing capability are complex. This increases both hardware cost and programming complexity. The advantage is that more efficient or compact programs can be developed.

VAX Most architectures provide a relatively small number of fixed instruction formats. This can cause two problems for the programmer. First, addressing mode and opcode are not orthogonal. For example, for a given operation, one operand must come from a register and another from memory, or both from registers, and so on. Second, only a limited number of operands can be accommodated: typically up to two or three. Because some operations inherently require more operands, various strategies must be used to achieve the desired result using two or more instructions.

To avoid these problems, two criteria were used in designing the VAX instruction format [STRE78]:

1. All instructions should have the "natural" number of operands.
2. All operands should have the same generality in specification.

The result is a highly variable instruction format. An instruction consists of a 1- or 2-byte opcode followed by from zero to six operand specifiers, depending on the opcode. The minimal instruction length is 1 byte, and instructions up to 37 bytes can be constructed. Figure 13.8 gives a few examples.

The VAX instruction begins with a 1-byte opcode. This suffices to handle most VAX instructions. However, as there are over 300 different instructions, 8 bits are not enough. The hexadecimal codes FD and FF indicate an extended opcode, with the actual opcode being specified in the second byte.

The remainder of the instruction consists of up to six operand specifiers. An operand specifier is, at minimum, a 1-byte format in which the leftmost 4 bits are the address mode specifier. The only exception to this rule is the literal mode, which is signaled by the pattern 00 in the leftmost 2 bits, leaving space for a 6-bit literal. Because of this exception, a total of 12 different addressing modes can be specified.

An operand specifier often consists of just one byte, with the rightmost 4 bits specifying one of 16 general-purpose registers. The length of the operand specifier can be extended in one of two ways. First, a constant value of one or more bytes may immediately follow the first byte of the operand specifier. An example of this is the displacement mode, in which an 8-, 16-, or 32-bit displacement is used. Second, an index mode of addressing may be used. In this case, the first byte of the operand specifier consists of the 4-bit addressing mode code of 0100 and a 4-bit index register identifier. The remainder of the operand specifier consists of the base address specifier, which may itself be one or more bytes in length.

The reader may be wondering, as the author did, what kind of instruction requires six operands. Surprisingly, the VAX has a number of such instructions. Consider

ADDP6 OP1, OP2, OP3, OP4, OP5, OP6

[Figure 13.8 Example of VAX Instructions: four instructions shown as hexadecimal byte sequences with assembler notation and descriptions — RSB (1-byte opcode, return from subroutine); CLRL R9 (opcode plus a register-mode operand specifier for R9); MOVW 356(R4), 25(R11) (opcode plus word-displacement and byte-displacement operand specifiers); and ADDL3 #5, R0, @A[R2] (opcode, short literal 5, register mode R0, and an indexed, indirect PC-relative specifier for the destination)]

This instruction adds two packed decimal numbers. OP1 and OP2 specify the length and starting address of one decimal string; OP3 and OP4 specify a second string. These two strings are added and the result is stored in the decimal string whose length and starting location are specified by OP5 and OP6.

The VAX instruction set provides for a wide variety of operations and addressing modes. This gives a programmer, such as a compiler writer, a very powerful and flexible tool for developing programs. In theory, this should lead to efficient machine-language compilations of high-level language programs and, in general, to effective and efficient use of processor resources. The penalty to be paid for these benefits is the increased complexity of the processor compared with a processor with a simpler instruction set and format. We return to these matters in Chapter 15, where we examine the case for very simple instruction sets.

13.4 x86 AND ARM INSTRUCTION FORMATS

x86 Instruction Formats

The x86 is equipped with a variety of instruction formats. Of the elements described in this subsection, only the opcode field is always present. Figure 13.9 illustrates the general instruction format. Instructions are made up of from zero to four optional instruction prefixes, a 1- or 2-byte opcode, an optional address specifier (which consists of the ModR/M byte and the Scale Index Base byte), an optional displacement, and an optional immediate field.

Let us first consider the prefix bytes:

• Instruction prefixes: The instruction prefix, if present, consists of the LOCK prefix or one of the repeat prefixes. The LOCK prefix is used to ensure exclusive use of shared memory in multiprocessor environments. The repeat prefixes specify repeated operation of a string, which enables the x86 to process strings much faster than with a regular software loop. There are five different repeat prefixes: REP, REPE, REPZ, REPNE, and REPNZ. When the absolute REP prefix is present, the operation specified in the instruction is executed repeatedly on successive elements of the string; the number of repetitions is specified in register CX. The conditional REP prefix causes the instruction to repeat until the count in CX goes to zero or until the condition is met.
• Segment override: Explicitly specifies which segment register an instruction should use, overriding the default segment-register selection generated by the x86 for that instruction.

[Figure 13.9 x86 Instruction Format: up to four optional 1-byte prefixes (instruction prefix, segment override, operand size override, address size override); a 1-, 2-, or 3-byte opcode; optional ModR/M byte (Mod, Reg/Opcode, R/M fields) and SIB byte (Scale, Index, Base fields); a displacement of 0, 1, 2, or 4 bytes; and an immediate of 0, 1, 2, or 4 bytes]

• Operand size: An instruction has a default operand size of 16 or 32 bits, and the operand size prefix switches between 32-bit and 16-bit operands.
• Address size: The processor can address memory using either 16- or 32-bit addresses. The address size determines the displacement size in instructions and the size of address offsets generated during effective address calculation. One of these sizes is designated as default, and the address size prefix switches between 32-bit and 16-bit address generation.

The instruction itself includes the following fields:

• Opcode: The opcode field is 1, 2, or 3 bytes in length. The opcode may also include bits that specify if data is byte- or full-size (16 or 32 bits depending on context), direction of data operation (to or from memory), and whether an immediate data field must be sign extended.
• ModR/M: This byte, and the next, provide addressing information. The ModR/M byte specifies whether an operand is in a register or in memory; if it is in memory, then fields within the byte specify the addressing mode to be used. The ModR/M byte consists of three fields: The Mod field (2 bits) combines with the R/M field to form 32 possible values: 8 registers and 24 indexing modes; the Reg/Opcode field (3 bits) specifies either a register number or three more bits of opcode information; the R/M field (3 bits) can specify a register as the location of an operand, or it can form part of the addressing-mode encoding in combination with the Mod field.
• SIB: Certain encodings of the ModR/M byte specify the inclusion of the SIB byte to specify the addressing mode fully. The SIB byte consists of three fields: The Scale field (2 bits) specifies the scale factor for scaled indexing; the Index field (3 bits) specifies the index register; the Base field (3 bits) specifies the base register.
• Displacement: When the addressing-mode specifier indicates that a displacement is used, an 8-, 16-, or 32-bit signed integer displacement field is added.
• Immediate: Provides the value of an 8-, 16-, or 32-bit operand.

Several comparisons may be useful here. In the x86 format, the addressing mode is provided as part of the opcode sequence rather than with each operand. Because only one operand can have address-mode information, only one memory operand can be referenced in an instruction. In contrast, the VAX carries the address-mode information with each operand, allowing memory-to-memory operations. The x86 instructions are therefore more compact. However, if a memory-to-memory operation is required, the VAX can accomplish this in a single instruction.

The x86 format allows the use of not only 1-byte, but also 2-byte and 4-byte offsets for indexing. Although the use of the larger index offsets results in longer instructions, this feature provides needed flexibility. For example, it is useful in addressing large arrays or large stack frames. In contrast, the IBM S/370 instruction format allows offsets no greater than 4 Kbytes (12 bits of offset information), and the offset must be positive. When a location is not in reach of this offset, the compiler must generate extra code to generate the needed address. This problem is especially apparent in dealing with stack frames that have local variables occupying in excess of 4 Kbytes.
As [DEWA90] puts it, “generating code for the 370 is so painful as a result of that restriction that there have even been compilers for the 370 that simply chose to limit the size of the stack frame to 4 Kbytes.”
As can be seen, the encoding of the x86 instruction set is very complex. This has to do partly with the need to be backward compatible with the 8086 machine and partly with a desire on the part of the designers to provide every possible assistance to the compiler writer in producing efficient code. It is a matter of some debate whether an instruction set as complex as this is preferable to the opposite extreme of the RISC instruction sets.
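To make the ModR/M and SIB encoding described above more concrete, the short C sketch below unpacks both bytes with shifts and masks. It is only an illustration: the function name and the sample byte values are hypothetical, and a real decoder would also have to account for prefixes, 16-bit addressing, and the opcode context.

```c
#include <stdio.h>
#include <stdint.h>

/* Extract the subfields of the x86 ModR/M and SIB bytes.
   ModR/M: Mod (bits 7-6), Reg/Opcode (bits 5-3), R/M (bits 2-0)
   SIB:    Scale (bits 7-6), Index (bits 5-3), Base (bits 2-0)   */
static void decode_modrm(uint8_t modrm, uint8_t sib)
{
    unsigned mod = (modrm >> 6) & 0x3;
    unsigned reg = (modrm >> 3) & 0x7;
    unsigned rm  =  modrm       & 0x7;

    printf("Mod=%u Reg/Opcode=%u R/M=%u\n", mod, reg, rm);

    /* In 32-bit addressing, Mod != 11 together with R/M == 100 signals
       that an SIB byte follows to complete the addressing mode.        */
    if (mod != 0x3 && rm == 0x4) {
        unsigned scale = (sib >> 6) & 0x3;
        unsigned index = (sib >> 3) & 0x7;
        unsigned base  =  sib       & 0x7;
        printf("SIB: scale=%u (x%u) index=%u base=%u\n",
               scale, 1u << scale, index, base);
    }
}

int main(void)
{
    decode_modrm(0x44, 0x8D);   /* example byte values only */
    return 0;
}
```

Running it on the example bytes prints the three ModR/M fields and, because Mod is not 11 while R/M is 100, also decodes the SIB byte.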

ARM Instruction Formats
All instructions in the ARM architecture are 32 bits long and follow a regular format (Figure 13.10). The first four bits of an instruction are the condition code. As discussed in Chapter 12, virtually all ARM instructions can be conditionally executed. The next three bits specify the general type of instruction. For most instructions other than branch instructions, the next five bits constitute an opcode and/or modifier bits for the operation. The remaining 20 bits are for operand addressing. The regular structure of the instruction formats eases the job of the instruction decode units.
Figure 13.10 ARM Instruction Formats (32-bit layouts for the data processing immediate shift, data processing register shift, data processing immediate, load/store immediate offset, load/store register offset, load/store multiple, and branch/branch with link formats)
S = For data processing instructions, signifies that the instruction updates the condition codes
S = For load/store multiple instructions, signifies whether instruction execution is restricted to supervisor mode
P, U, W = Bits that distinguish among different types of addressing mode
B = Distinguishes between an unsigned byte (B==1) and a word (B==0) access
L = For load/store instructions, distinguishes between a load (L==1) and a store (L==0)
L = For branch instructions, determines whether a return address is stored in the link register

Figure 13.11 Examples of Use of ARM Immediate Constants:
ror #0—range 0 through 0x000000FF—step 0x00000001
ror #8—range 0 through 0xFF000000—step 0x01000000
ror #30—range 0 through 0x000003FC—step 0x00000004

IMMEDIATE CONSTANTS To achieve a greater range of immediate values, the data processing immediate format specifies both an immediate value and a rotate value. The 8-bit immediate value is expanded to 32 bits and then rotated right by a number of bits equal to twice the 4-bit rotate value. Several examples are shown in Figure 13.11.
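The rotation scheme just described is easy to model in a few lines of C. The sketch below (a hypothetical helper, not part of any ARM toolchain) expands an 8-bit immediate and 4-bit rotate field into the 32-bit constant, reproducing the ranges of Figure 13.11.

```c
#include <stdio.h>
#include <stdint.h>

/* Expand an ARM data-processing immediate: an 8-bit value is zero-extended
   to 32 bits and rotated right by twice the 4-bit rotate field.            */
static uint32_t arm_expand_imm(uint8_t imm8, uint8_t rotate4)
{
    uint32_t value  = imm8;
    unsigned amount = (rotate4 & 0xF) * 2;   /* 0, 2, 4, ..., 30 */
    if (amount == 0)
        return value;
    return (value >> amount) | (value << (32 - amount));
}

int main(void)
{
    /* Reproduce the three cases of Figure 13.11 with imm8 = 0xFF. */
    printf("ror #0  -> 0x%08X\n", arm_expand_imm(0xFF, 0));    /* 0x000000FF */
    printf("ror #8  -> 0x%08X\n", arm_expand_imm(0xFF, 4));    /* 0xFF000000 */
    printf("ror #30 -> 0x%08X\n", arm_expand_imm(0xFF, 15));   /* 0x000003FC */
    return 0;
}
```

The three calls correspond to the ror #0, ror #8, and ror #30 examples of Figure 13.11.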

THUMB INSTRUCTION SET The Thumb instruction set is a re-encoded subset of the ARM instruction set. Thumb is designed to increase the performance of ARM implementations that use a 16-bit or narrower memory data bus and to allow better code density than provided by the ARM instruction set. The Thumb instruction set contains a subset of the ARM 32-bit instruction set recoded into 16-bit instructions. The savings is achieved in the following way:
1. Thumb instructions are unconditional, so the condition code field is not used. Also, all Thumb arithmetic and logic instructions update the condition flags, so that the update-flag bit is not needed. Savings: 5 bits.
2. Thumb has only a subset of the operations in the full instruction set and uses only a 2-bit opcode field, plus a 3-bit type field. Savings: 2 bits.
3. The remaining savings of 9 bits comes from reductions in the operand specifications. For example, Thumb instructions reference only registers r0 through r7, so only 3 bits are required for register references, rather than 4 bits. Immediate values do not include a 4-bit rotate field.
The ARM processor can execute a program consisting of a mixture of Thumb instructions and 32-bit ARM instructions. A bit in the processor control register determines which type of instruction is currently being executed. Figure 13.12 shows an example. The figure shows both the general format and a specific instance of an instruction in both 16-bit and 32-bit formats.
Figure 13.12 Expanding a Thumb ADD Instruction into its ARM Equivalent (the 16-bit Thumb add/subtract/compare/move immediate format, illustrated by ADD r3, #19, expands to the 32-bit ARM data processing immediate format, illustrated by ADDS r3, r3, #19 with the condition field set to "always," the update-condition-flags bit set, and a zero rotation)

13.5 ASSEMBLY LANGUAGE

A processor can understand and execute machine instructions. Such instructions are simply binary numbers stored in the computer. If a programmer wished to program directly in machine language, then it would be necessary to enter the program as binary data.
Consider the simple BASIC statement
N = I + J + K
Suppose we wished to program this statement in machine language and to initialize I, J, and K to 2, 3, and 4, respectively. This is shown in Figure 13.13a. The program starts in location 101 (hexadecimal). Memory is reserved for the four variables starting at location 201. The program consists of four instructions:
1. Load the contents of location 201 into the AC.
2. Add the contents of location 202 to the AC.
3. Add the contents of location 203 to the AC.
4. Store the contents of the AC in location 204.
This is clearly a tedious and very error-prone process.
A slight improvement is to write the program in hexadecimal rather than binary notation (Figure 13.13b). We could write the program as a series of lines. Each line contains the address of a memory location and the hexadecimal code of the binary value to be stored in that location. Then we need a program that will accept this input, translate each line into a binary number, and store it in the specified location.
For more improvement, we can make use of the symbolic name or mnemonic of each instruction. This results in the symbolic program shown in Figure 13.13c. Each line of input still represents one memory location. Each line consists of three fields, separated by spaces.

(a) Binary program                      (b) Hexadecimal program
Address  Contents                       Address  Contents
101      0010 0010 0000 0001            101      2201
102      0001 0010 0000 0010            102      1202
103      0001 0010 0000 0011            103      1203
104      0011 0010 0000 0100            104      3204
201      0000 0000 0000 0010            201      0002
202      0000 0000 0000 0011            202      0003
203      0000 0000 0000 0100            203      0004
204      0000 0000 0000 0000            204      0000

(c) Symbolic program                    (d) Assembly program
Address  Instruction                    Label    Operation  Operand
101      LDA 201                        FORMUL   LDA        I
102      ADD 202                                 ADD        J
103      ADD 203                                 ADD        K
104      STA 204                                 STA        N
201      DAT 2                          I        DATA       2
202      DAT 3                          J        DATA       3
203      DAT 4                          K        DATA       4
204      DAT 0                          N        DATA       0

Figure 13.13 Computation of the Formula N = I + J + K

The first field contains the address of a location. For an instruction, the second field contains the three-letter symbol for the opcode. If it is a memory-referencing instruction, then a third field contains the address. To store arbitrary data in a location, we invent a pseudoinstruction with the symbol DAT. This is merely an indication that the third field on the line contains a hexadecimal number to be stored in the location specified in the first field. For this type of input we need a slightly more complex program. The program accepts each line of input, generates a binary number based on the second and third (if present) fields, and stores it in the location specified by the first field.
The use of a symbolic program makes life much easier but is still awkward. In particular, we must give an absolute address for each word. This means that the program and data can be loaded into only one place in memory, and we must know that place ahead of time. Worse, suppose we wish to change the program some day by adding or deleting a line. This will change the addresses of all subsequent words.
A much better system, and one commonly used, is to use symbolic addresses. This is illustrated in Figure 13.13d. Each line still consists of three fields. The first field is still for the address, but a symbol is used instead of an absolute numerical address. Some lines have no address, implying that the address of that line is one more than the address of the previous line. For memory-reference instructions, the third field also contains a symbolic address.
With this last refinement, we have an assembly language. Programs written in assembly language (assembly programs) are translated into machine language by an assembler. This program must not only do the symbolic translation discussed earlier but also assign some form of memory addresses to symbolic addresses.
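To make the role of the assembler concrete, here is a toy two-pass assembler for the hypothetical machine of Figure 13.13: the opcode values (LDA = 2, ADD = 1, STA = 3) and the word layout follow the figure, but everything else, including the fixed-size tables, the in-memory source text, the placement of data immediately after the code, and the absence of error handling, is a simplification for illustration only.

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Minimal two-pass assembler sketch for the hypothetical one-address machine
   of Figure 13.13.  A word is opcode<<12 | address; DATA reserves a word
   holding its operand value.  Line format: [label] operation operand.       */

#define MAXSYM 32
static struct { char name[16]; int addr; } symtab[MAXSYM];
static int nsym = 0;

static int lookup(const char *name)
{
    for (int i = 0; i < nsym; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    return -1;
}

static int opcode_of(const char *op)
{
    if (strcmp(op, "LDA") == 0) return 2;
    if (strcmp(op, "ADD") == 0) return 1;
    if (strcmp(op, "STA") == 0) return 3;
    return -1;                               /* DATA pseudo-instruction */
}

int main(void)
{
    const char *src[] = {                    /* the program of Figure 13.13d */
        "FORMUL LDA I", "ADD J", "ADD K", "STA N",
        "I DATA 2", "J DATA 3", "K DATA 4", "N DATA 0"
    };
    const int nlines = 8, origin = 0x101;
    char op[8][16], operand[8][16];

    /* Pass 1: assign an address to each line and record label definitions. */
    for (int i = 0; i < nlines; i++) {
        char a[16], b[16], c[16];
        int n = sscanf(src[i], "%15s %15s %15s", a, b, c);
        if (n == 3) {                        /* "label op operand"           */
            strcpy(symtab[nsym].name, a);
            symtab[nsym++].addr = origin + i;
            strcpy(op[i], b); strcpy(operand[i], c);
        } else {                             /* "op operand", no label       */
            strcpy(op[i], a); strcpy(operand[i], b);
        }
    }

    /* Pass 2: emit one 16-bit word per line, resolving symbolic operands.  */
    for (int i = 0; i < nlines; i++) {
        int word, code = opcode_of(op[i]);
        if (code >= 0)
            word = (code << 12) | lookup(operand[i]);
        else
            word = atoi(operand[i]);
        printf("%03X  %04X\n", origin + i, word);
    }
    return 0;
}
```

Pass 1 does nothing but assign an address to each line and record label definitions; pass 2 can then resolve every symbolic operand, which is why forward references to I, J, K, and N cause no difficulty.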

The development of assembly language was a major milestone in the evolution of computer technology. It was the first step to the high-level languages in use today. Although few programmers use assembly language, virtually all machines provide one. They are used, if at all, for systems programs such as compilers and I/O routines. Appendix B provides a more detailed examination of assembly language.

13.6 RECOMMENDED READING

The references cited in Chapter 12 are equally applicable to the material of this chapter. [BLAA97] contains a detailed discussion of instruction formats and addressing modes. In addition, the reader may wish to consult [FLYN85] for a discussion and analysis of instruction set design issues, particularly those relating to formats.

BLAA97 Blaauw, G., and Brooks, F. Computer Architecture: Concepts and Evolution. Reading, MA: Addison-Wesley, 1997.
FLYN85 Flynn, M.; Johnson, J.; and Wakefield, S. “On Instruction Sets and Their Formats.” IEEE Transactions on Computers, March 1985.

13.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

autoindexing, base-register addressing, direct addressing, displacement addressing, effective address, immediate addressing, indexing, indirect addressing, instruction format, postindexing, preindexing, register addressing, register indirect addressing, relative addressing, word

Review Questions
13.1 Briefly define immediate addressing.
13.2 Briefly define direct addressing.
13.3 Briefly define indirect addressing.
13.4 Briefly define register addressing.
13.5 Briefly define register indirect addressing.
13.6 Briefly define displacement addressing.
13.7 Briefly define relative addressing.
13.8 What is the advantage of autoindexing?
13.9 What is the difference between postindexing and preindexing?
13.10 What facts go into determining the use of the addressing bits of an instruction?
13.11 What are the advantages and disadvantages of using a variable-length instruction format?

Problems
13.1 Given the following memory values and a one-address machine with an accumulator, what values do the following instructions load into the accumulator?
• Word 20 contains 40.
• Word 30 contains 50.
• Word 40 contains 60.
• Word 50 contains 70.
a. LOAD IMMEDIATE 20
b. LOAD DIRECT 20
c. LOAD INDIRECT 20
d. LOAD IMMEDIATE 30
e. LOAD DIRECT 30
f. LOAD INDIRECT 30
13.2 Let the address stored in the program counter be designated by the symbol X1. The instruction stored in X1 has an address part (operand reference) X2. The operand needed to execute the instruction is stored in the memory word with address X3. An index register contains the value X4. What is the relationship between these various quantities if the addressing mode of the instruction is (a) direct; (b) indirect; (c) PC relative; (d) indexed?
13.3 An address field in an instruction contains decimal value 14. Where is the corresponding operand located for
a. immediate addressing?
b. direct addressing?
c. indirect addressing?
d. register addressing?
e. register indirect addressing?
13.4 Consider a 16-bit processor in which the following appears in main memory, starting at location 200:

Address   Contents
200       Load to AC | Mode
201       500
202       Next instruction

The first part of the first word indicates that this instruction loads a value into an accumulator. The Mode field specifies an addressing mode and, if appropriate, indicates a source register; assume that when used, the source register is R1, which has a value of 400. There is also a base register that contains the value 100. The value of 500 in location 201 may be part of the address calculation. Assume that location 399 contains the value 999, location 400 contains the value 1000, and so on. Determine the effective address and the operand to be loaded for the following address modes:

a. Direct
b. Immediate
c. Indirect
d. PC relative
e. Displacement
f. Register
g. Register indirect
h. Autoindexing with increment, using R1

13.5 A PC-relative mode branch instruction is 3 bytes long. The address of the instruction, in decimal, is 256028. Determine the branch target address if the signed displacement in the instruction is -31.
13.6 A PC-relative mode branch instruction is stored in memory at address 620 (decimal). The branch is made to location 530 (decimal). The address field in the instruction is 10 bits long. What is the binary value in the instruction?

13.7 How many times does the processor need to refer to memory when it fetches and executes an indirect-address-mode instruction if the instruction is (a) a computation requiring a single operand; (b) a branch?
13.8 The IBM 370 does not provide indirect addressing. Assume that the address of an operand is in main memory. How would you access the operand?
13.9 In [COOK82], the author proposes that the PC-relative addressing modes be eliminated in favor of other modes, such as the use of a stack. What is the disadvantage of this proposal?
13.10 The x86 includes the following instruction:
IMUL op1, op2, immediate
This instruction multiplies op2, which may be either register or memory, by the immediate operand value, and places the result in op1, which must be a register. There is no other three-operand instruction of this sort in the instruction set. What is the possible use of such an instruction? (Hint: Consider indexing.)
13.11 Consider a processor that includes a base with indexing addressing mode. Suppose an instruction is encountered that employs this addressing mode and specifies a displacement of 1970, in decimal. Currently the base and index registers contain the decimal numbers 48,022 and 8, respectively. What is the address of the operand?
13.12 Define: EA = (X)+ is the effective address equal to the contents of location X, with X incremented by one word length after the effective address is calculated; EA = -(X) is the effective address equal to the contents of location X, with X decremented by one word length before the effective address is calculated; EA = (X)- is the effective address equal to the contents of location X, with X decremented by one word length after the effective address is calculated. Consider the following instructions, each in the format (Operation Source Operand, Destination Operand), with the result of the operation placed in the destination operand.
a. OP X, (X)
b. OP (X), (X)+
c. OP (X)+, (X)
d. OP -(X), (X)
e. OP -(X), (X)+
f. OP (X)+, (X)+
g. OP (X)-, (X)
Using X as the stack pointer, which of these instructions can pop the top two elements from the stack, perform the designated operation (e.g., ADD source to destination and store in destination), and push the result back on the stack? For each such instruction, does the stack grow toward memory location 0 or in the opposite direction?
13.13 Assume a stack-oriented processor that includes the stack operations PUSH and POP. Arithmetic operations automatically involve the top one or two stack elements. Begin with an empty stack. What stack elements remain after the following instructions are executed?
PUSH 4
PUSH 7
PUSH 8
ADD
PUSH 10
SUB
MUL
13.14 Justify the assertion that a 32-bit instruction is probably much less than twice as useful as a 16-bit instruction.
13.15 Why was IBM's decision to move from 36 bits to 32 bits per word wrenching, and to whom?

13.16 Assume an instruction set that uses a fixed 16-bit instruction length. Operand specifiers are 6 bits in length. There are K two-operand instructions and L zero-operand instructions. What is the maximum number of one-operand instructions that can be supported?
13.17 Design a variable-length opcode to allow all of the following to be encoded in a 36-bit instruction:
• instructions with two 15-bit addresses and one 3-bit register number
• instructions with one 15-bit address and one 3-bit register number
• instructions with no addresses or registers
13.18 Consider the results of Problem 10.6. Assume that M is a 16-bit memory address and that X, Y, and Z are either 16-bit addresses or 4-bit register numbers. The one-address machine uses an accumulator, and the two- and three-address machines have 16 registers and instructions operating on all combinations of memory locations and registers. Assuming 8-bit opcodes and instruction lengths that are multiples of 4 bits, how many bits does each machine need to compute X?
13.19 Is there any possible justification for an instruction with two opcodes?
13.20 The 16-bit Zilog Z8001 has the following general instruction format:
Mode | Opcode | w/b | Operand 2 | Operand 1        (bit positions 15 through 0)

The mode field specifies how to locate the operands from the operand fields. The w/b field is used in certain instructions to specify whether the operands are bytes or 16-bit words. The operand 1 field may (depending on the mode field contents) specify one of 16 general-purpose registers. The operand 2 field may specify any general-purpose register except register 0. When the operand 2 field is all zeros, each of the original opcodes takes on a new meaning.
a. How many opcodes are provided on the Z8001?
b. Suggest an efficient way to provide more opcodes and indicate the trade-off involved.

CHAPTER 14 PROCESSOR STRUCTURE AND FUNCTION

14.1 Processor Organization
14.2 Register Organization
     User-Visible Registers
     Control and Status Registers
     Example Microprocessor Register Organizations
14.3 Instruction Cycle
     The Indirect Cycle
     Data Flow
14.4 Instruction Pipelining
     Pipelining Strategy
     Pipeline Performance
     Pipeline Hazards
     Dealing with Branches
     Intel 80486 Pipelining
14.5 The x86 Processor Family
     Register Organization
     Interrupt Processing
14.6 The ARM Processor
     Processor Organization
     Processor Modes
     Register Organization
     Interrupt Processing
14.7 Recommended Reading
14.8 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Distinguish between user-visible and control/status registers, and discuss the purposes of registers in each category.
• Summarize the instruction cycle.
• Discuss the principle behind instruction pipelining and how it works in practice.
• Compare and contrast the various forms of pipeline hazards.
• Present an overview of the x86 processor structure.
• Present an overview of the ARM processor structure.

This chapter discusses aspects of the processor not yet covered in Part Three and sets the stage for the discussion of RISC and superscalar architecture in Chapters 15 and 16. We begin with a summary of processor organization. Registers, which form the internal memory of the processor, are then analyzed. We are then in a position to return to the discussion (begun in Section 3.2) of the instruction cycle. A description of the instruction cycle and a common technique known as instruction pipelining complete our description. The chapter concludes with an examination of some aspects of the x86 and ARM organizations.

14.1 PROCESSOR ORGANIZATION

To understand the organization of the processor, let us consider the requirements placed on the processor, the things that it must do:
• Fetch instruction: The processor reads an instruction from memory (register, cache, main memory).
• Interpret instruction: The instruction is decoded to determine what action is required.
• Fetch data: The execution of an instruction may require reading data from memory or an I/O module.
• Process data: The execution of an instruction may require performing some arithmetic or logical operation on data.
• Write data: The results of an execution may require writing data to memory or an I/O module.
To do these things, it should be clear that the processor needs to store some data temporarily. It must remember the location of the last instruction so that it can know where to get the next instruction. It needs to store instructions and data temporarily while an instruction is being executed. In other words, the processor needs a small internal memory.

Figure 14.1 The CPU with the System Bus (the processor's registers, ALU, and control unit connect to the address, data, and control buses that make up the system bus)

Figure 14.1 is a simplified view of a processor, indicating its connection to the rest of the system via the system bus. A similar interface would be needed for any of the interconnection structures described in Chapter 3. The reader will recall that the major components of the processor are an arithmetic and logic unit (ALU) and a control unit (CU). The ALU does the actual computation or processing of data. The control unit controls the movement of data and instructions into and out of the processor and controls the operation of the ALU. In addition, the figure shows a minimal internal memory, consisting of a set of storage locations, called registers. Figure 14.2 is a slightly more detailed view of the processor. The data transfer and logic control paths are indicated, including an element labeled internal

Figure 14.2 Internal Structure of the CPU (the ALU elements, including status flags, a shifter, a complementer, and arithmetic and Boolean logic, together with the registers, are linked by an internal CPU bus and directed by control paths from the control unit)

processor bus. This element is needed to transfer data between the various registers and the ALU because the ALU in fact operates only on data in the internal processor memory. The figure also shows typical basic elements of the ALU. Note the similarity between the internal structure of the computer as a whole and the internal structure of the processor. In both cases, there is a small collection of major elements (computer: processor, I/O, memory; processor: control unit, ALU, registers) connected by data paths.

14.2 REGISTER ORGANIZATION

As we discussed in Chapter 4, a computer system employs a memory hierarchy. At higher levels of the hierarchy, memory is faster, smaller, and more expensive (per bit). Within the processor, there is a set of registers that function as a level of memory above main memory and cache in the hierarchy. The registers in the processor perform two roles:
• User-visible registers: Enable the machine- or assembly language programmer to minimize main memory references by optimizing use of registers.
• Control and status registers: Used by the control unit to control the operation of the processor and by privileged, operating system programs to control the execution of programs.
There is not a clean separation of registers into these two categories. For example, on some machines the program counter is user visible (e.g., x86), but on many it is not. For purposes of the following discussion, however, we will use these categories.
User-Visible Registers
A user-visible register is one that may be referenced by means of the machine language that the processor executes. We can characterize these in the following categories:
• General purpose
• Data
• Address
• Condition codes
General-purpose registers can be assigned to a variety of functions by the programmer. Sometimes their use within the instruction set is orthogonal to the operation. That is, any general-purpose register can contain the operand for any opcode. This provides true general-purpose register use. Often, however, there are restrictions. For example, there may be dedicated registers for floating-point and stack operations.
In some cases, general-purpose registers can be used for addressing functions (e.g., register indirect, displacement). In other cases, there is a partial or clean separation between data registers and address registers. Data registers may be used only to hold data and cannot be employed in the calculation of an operand address.

Address registers may themselves be somewhat general purpose, or they may be devoted to a particular addressing mode. Examples include the following:
• Segment pointers: In a machine with segmented addressing (see Section 8.3), a segment register holds the address of the base of the segment. There may be multiple registers: for example, one for the operating system and one for the current process.
• Index registers: These are used for indexed addressing and may be autoindexed.
• Stack pointer: If there is user-visible stack addressing, then typically there is a dedicated register that points to the top of the stack. This allows implicit addressing; that is, push, pop, and other stack instructions need not contain an explicit stack operand.
There are several design issues to be addressed here. An important issue is whether to use completely general-purpose registers or to specialize their use. We have already touched on this issue in the preceding chapter because it affects instruction set design. With the use of specialized registers, it can generally be implicit in the opcode which type of register a certain operand specifier refers to. The operand specifier must only identify one of a set of specialized registers rather than one out of all the registers, thus saving bits. On the other hand, this specialization limits the programmer's flexibility.
Another design issue is the number of registers, either general purpose or data plus address, to be provided. Again, this affects instruction set design because more registers require more operand specifier bits. As we previously discussed, somewhere between 8 and 32 registers appears optimum [LUND77]. Fewer registers result in more memory references; more registers do not noticeably reduce memory references (e.g., see [WILL90]). However, a new approach, which finds advantage in the use of hundreds of registers, is exhibited in some RISC systems and is discussed in Chapter 15.
Finally, there is the issue of register length. Registers that must hold addresses obviously must be at least long enough to hold the largest address. Data registers should be able to hold values of most data types. Some machines allow two contiguous registers to be used as one for holding double-length values.
A final category of registers, which is at least partially visible to the user, holds condition codes (also referred to as flags). Condition codes are bits set by the processor hardware as the result of operations. For example, an arithmetic operation may produce a positive, negative, zero, or overflow result. In addition to the result itself being stored in a register or memory, a condition code is also set. The code may subsequently be tested as part of a conditional branch operation.
Condition code bits are collected into one or more registers. Usually, they form part of a control register. Generally, machine instructions allow these bits to be read by implicit reference, but the programmer cannot alter them.
Many processors, including those based on the IA-64 architecture and the MIPS processors, do not use condition codes at all. Rather, conditional branch instructions specify a comparison to be made and act on the result of the comparison, without storing a condition code. Table 14.1, based on [DERO87], lists key advantages and disadvantages of condition codes.

Table 14.1 Condition Codes

Advantages
1. Because condition codes are set by normal arithmetic and data movement instructions, they should reduce the number of COMPARE and TEST instructions needed.
2. Conditional instructions, such as BRANCH, are simplified relative to composite instructions, such as TEST AND BRANCH.
3. Condition codes facilitate multiway branches. For example, a TEST instruction can be followed by two branches, one on less than or equal to zero and one on greater than zero.
4. Condition codes can be saved on the stack during subroutine calls along with other register information.

Disadvantages
1. Condition codes add complexity, both to the hardware and software. Condition code bits are often modified in different ways by different instructions, making life more difficult for both the microprogrammer and compiler writer.
2. Condition codes are irregular; they are typically not part of the main data path, so they require extra hardware connections.
3. Often condition code machines must add special non-condition-code instructions for special situations anyway, such as bit checking, loop control, and atomic semaphore operations.
4. In a pipelined implementation, condition codes require special synchronization to avoid conflicts.

In some machines, a subroutine call will result in the automatic saving of all user-visible registers, to be restored on return. The processor performs the saving and restoring as part of the execution of call and return instructions. This allows each subroutine to use the user-visible registers independently. On other machines, it is the responsibility of the programmer to save the contents of the relevant user-visible registers prior to a subroutine call, by including instructions for this purpose in the program.
Control and Status Registers
There are a variety of processor registers that are employed to control the operation of the processor. Most of these, on most machines, are not visible to the user. Some of them may be visible to machine instructions executed in a control or operating system mode.
Of course, different machines will have different register organizations and use different terminology. We list here a reasonably complete list of register types, with a brief description. Four registers are essential to instruction execution:
• Program counter (PC): Contains the address of an instruction to be fetched.
• Instruction register (IR): Contains the instruction most recently fetched.
• Memory address register (MAR): Contains the address of a location in memory.
• Memory buffer register (MBR): Contains a word of data to be written to memory or the word most recently read.

Not all processors have internal registers designated as MAR and MBR, but some equivalent buffering mechanism is needed whereby the bits to be transferred

to the system bus are staged and the bits to be read from the data bus are temporarily stored. Typically, the processor updates the PC after each instruction fetch so that the PC always points to the next instruction to be executed. A branch or skip instruction will also modify the contents of the PC. The fetched instruction is loaded into an IR, where the opcode and operand specifiers are analyzed. Data are exchanged with memory using the MAR and MBR. In a bus-organized system, the MAR connects directly to the address bus, and the MBR connects directly to the data bus. User-visible registers, in turn, exchange data with the MBR.
The four registers just mentioned are used for the movement of data between the processor and memory. Within the processor, data must be presented to the ALU for processing. The ALU may have direct access to the MBR and user-visible registers. Alternatively, there may be additional buffering registers at the boundary to the ALU; these registers serve as input and output registers for the ALU and exchange data with the MBR and user-visible registers.
Many processor designs include a register or set of registers, often known as the program status word (PSW), that contain status information. The PSW typically contains condition codes plus other status information. Common fields or flags include the following (a short example of how such flags can be computed appears after the list):

• Sign: Contains the sign bit of the result of the last arithmetic operation.
• Zero: Set when the result is 0.
• Carry: Set if an operation resulted in a carry (addition) into or borrow (subtraction) out of a high-order bit. Used for multiword arithmetic operations.
• Equal: Set if a logical compare result is equality.
• Overflow: Used to indicate arithmetic overflow.
• Interrupt Enable/Disable: Used to enable or disable interrupts.
• Supervisor: Indicates whether the processor is executing in supervisor or user mode. Certain privileged instructions can be executed only in supervisor mode, and certain areas of memory can be accessed only in supervisor mode.
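The following C sketch shows one way such flags can be derived, using an 8-bit addition; the structure and function names are ours and do not correspond to any particular processor.

```c
#include <stdio.h>
#include <stdint.h>

/* Compute PSW-style flags for the 8-bit addition a + b. */
struct flags { int sign, zero, carry, overflow; };

static struct flags add8_flags(uint8_t a, uint8_t b, uint8_t *result)
{
    uint16_t wide = (uint16_t)a + (uint16_t)b;    /* 9-bit true sum          */
    uint8_t  r    = (uint8_t)wide;
    struct flags f;

    f.sign     = (r & 0x80) != 0;                 /* sign bit of the result  */
    f.zero     = (r == 0);                        /* result is zero          */
    f.carry    = (wide & 0x100) != 0;             /* carry out of bit 7      */
    /* Signed overflow: operands share a sign, result has the other sign.   */
    f.overflow = (~(a ^ b) & (a ^ r) & 0x80) != 0;

    *result = r;
    return f;
}

int main(void)
{
    uint8_t r;
    struct flags f = add8_flags(0x70, 0x70, &r);  /* 112 + 112 overflows signed 8 bits */
    printf("result=0x%02X S=%d Z=%d C=%d V=%d\n",
           r, f.sign, f.zero, f.carry, f.overflow);
    return 0;
}
```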

A number of other registers related to status and control might be found in a particular processor design. There may be a pointer to a block of memory containing additional status information (e.g., process control blocks). In machines using vectored interrupts, an interrupt vector register may be provided. If a stack is used to implement certain functions (e.g., subroutine call), then a system stack pointer is needed. A page table pointer is used with a virtual memory system. Finally, registers may be used in the control of I/O operations.
A number of factors go into the design of the control and status register organization. One key issue is operating system support. Certain types of control information are of specific utility to the operating system. If the processor designer has a functional understanding of the operating system to be used, then the register organization can to some extent be tailored to the operating system.
Another key design decision is the allocation of control information between registers and memory. It is common to dedicate the first (lowest) few hundred or

Figure 14.3 Example Microprocessor Register Organizations: (a) MC68000 (data registers D0–D7, address registers A0–A7', program counter, status register); (b) 8086 (general registers AX, BX, CX, DX; pointer and index registers SP, BP, SI, DI; segment registers CS, DS, SS, ES; flags; instruction pointer); (c) 80386–Pentium 4 (extended general registers EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI; FLAGS register; instruction pointer)

thousand words of memory for control purposes. The designer must decide how much control information should be in registers and how much in memory. The usual trade-off of cost versus speed arises.
Example Microprocessor Register Organizations
It is instructive to examine and compare the register organization of comparable systems. In this section, we look at two 16-bit microprocessors that were designed at about the same time: the Motorola MC68000 [STRI79] and the Intel 8086 [MORS78]. Figures 14.3a and b depict the register organization of each; purely internal registers, such as a memory address register, are not shown.
The MC68000 partitions its 32-bit registers into eight data registers and nine address registers. The eight data registers are used primarily for data manipulation and are also used in addressing as index registers. The width of the registers allows 8-, 16-, and 32-bit data operations, determined by opcode. The address registers contain 32-bit (no segmentation) addresses; two of these registers are also used as stack pointers, one for users and one for the operating system, depending on the current execution mode. Both registers are numbered 7, because only one can be used at a time. The MC68000 also includes a 32-bit program counter and a 16-bit status register.
The Motorola team wanted a very regular instruction set, with no special-purpose registers. A concern for code efficiency led them to divide the registers into

two functional components, saving one bit on each register specifier. This seems a reasonable compromise between complete generality and code compaction.
The Intel 8086 takes a different approach to register organization. Every register is special purpose, although some registers are also usable as general purpose. The 8086 contains four 16-bit data registers that are addressable on a byte or 16-bit basis, and four 16-bit pointer and index registers. The data registers can be used as general purpose in some instructions. In others, the registers are used implicitly. For example, a multiply instruction always uses the accumulator. The four pointer registers are also used implicitly in a number of operations; each contains a segment offset. There are also four 16-bit segment registers. Three of the four segment registers are used in a dedicated, implicit fashion, to point to the segment of the current instruction (useful for branch instructions), a segment containing data, and a segment containing a stack, respectively. These dedicated and implicit uses provide for compact encoding at the cost of reduced flexibility. The 8086 also includes an instruction pointer and a set of 1-bit status and control flags.
The point of this comparison should be clear. There is no universally accepted philosophy concerning the best way to organize processor registers [TOON81]. As with overall instruction set design and so many other processor design issues, it is still a matter of judgment and taste.
A second instructive point concerning register organization design is illustrated in Figure 14.3c. This figure shows the user-visible register organization for the Intel 80386 [ELAY85], which is a 32-bit microprocessor designed as an extension of the 8086.¹ The 80386 uses 32-bit registers. However, to provide upward compatibility for programs written on the earlier machine, the 80386 retains the original register organization embedded in the new organization. Given this design constraint, the designers of the 32-bit processors had limited flexibility in designing the register organization.

14.3 INSTRUCTION CYCLE

In Section 3.2, we described the processor’s instruction cycle (Figure 3.9). To recall, an instruction cycle includes the following stages: • Fetch: Read the next instruction from memory into the processor. • Execute: Interpret the opcode and perform the indicated operation. • Interrupt: If interrupts are enabled and an interrupt has occurred, save the current process state and service the interrupt. We are now in a position to elaborate somewhat on the instruction cycle. First, we must introduce one additional stage, known as the indirect cycle.

¹Because the MC68000 already uses 32-bit registers, the MC68020 [MACD84], which is a full 32-bit architecture, uses the same register organization.

Figure 14.4 The Instruction Cycle (fetch, indirect, execute, and interrupt cycles)

The Indirect Cycle
We have seen, in Chapter 13, that the execution of an instruction may involve one or more operands in memory, each of which requires a memory access. Further, if indirect addressing is used, then additional memory accesses are required.
We can think of the fetching of indirect addresses as one more instruction stage. The result is shown in Figure 14.4. The main line of activity consists of alternating instruction fetch and instruction execution activities. After an instruction is fetched, it is examined to determine if any indirect addressing is involved. If so, the required operands are fetched using indirect addressing. Following execution, an interrupt may be processed before the next instruction fetch.
Another way to view this process is shown in Figure 14.5, which is a revised version of Figure 3.12. This illustrates more correctly the nature of the instruction cycle. Once an instruction is fetched, its operand specifiers must be identified. Each input operand in memory is then fetched, and this process may require indirect addressing. Register-based operands need not be fetched. Once the opcode is executed, a similar process may be needed to store the result in main memory.
Figure 14.5 Instruction Cycle State Diagram (instruction address calculation, instruction fetch, and instruction operation decoding are followed by operand address calculation and operand fetch, with optional indirection; after the data operation, operand address calculation and operand store follow, again with optional indirection, and then an interrupt check; return paths handle multiple operands, multiple results, and string or vector data)
Data Flow
The exact sequence of events during an instruction cycle depends on the design of the processor. We can, however, indicate in general terms what must happen. Let us assume a processor that employs a memory address register (MAR), a memory buffer register (MBR), a program counter (PC), and an instruction register (IR).
During the fetch cycle, an instruction is read from memory. Figure 14.6 shows the flow of data during this cycle. The PC contains the address of the next instruction to be fetched. This address is moved to the MAR and placed on the address bus. The control unit requests a memory read, and the result is placed on the data bus and copied into the MBR and then moved to the IR. Meanwhile, the PC is incremented by 1, in preparation for the next fetch.


Figure 14.6 Data Flow, Fetch Cycle (the PC passes its contents to the MAR and onto the address bus; the memory read returns over the data bus to the MBR and then to the IR; MBR = memory buffer register, MAR = memory address register, IR = instruction register, PC = program counter)

Once the fetch cycle is over, the control unit examines the contents of the IR to determine if it contains an operand specifier using indirect addressing. If so, an indirect cycle is performed. As shown in Figure 14.7, this is a simple cycle. The rightmost N bits of the MBR, which contain the address reference, are transferred to the MAR. Then the control unit requests a memory read, to get the desired address of the operand into the MBR.
The fetch and indirect cycles are simple and predictable. The execute cycle takes many forms; the form depends on which of the various machine instructions is in the IR. This cycle may involve transferring data among registers, reading from or writing to memory or I/O, and/or invoking the ALU.
Like the fetch and indirect cycles, the interrupt cycle is simple and predictable (Figure 14.8). The current contents of the PC must be saved so that the processor can resume normal activity after the interrupt. Thus, the contents of the PC are

Figure 14.7 Data Flow, Indirect Cycle (the address bits of the MBR pass to the MAR and onto the address bus; the memory read returns the operand's address to the MBR)

Figure 14.8 Data Flow, Interrupt Cycle (the PC contents pass through the MBR to memory at a save location supplied via the MAR; the PC is then loaded with the address of the interrupt routine)

transferred to the MBR to be written into memory. The special memory location reserved for this purpose is loaded into the MAR from the control unit. It might, for example, be a stack pointer. The PC is loaded with the address of the interrupt routine. As a result, the next instruction cycle will begin by fetching the appropriate instruction.
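The register transfers described for the fetch and indirect cycles can be summarized in code. The sketch below defines a toy 16-bit machine (4-bit opcode, 12-bit address field, and an arbitrarily chosen "load indirect" opcode) purely to trace how the PC, MAR, MBR, and IR interact; it is not a model of any real processor.

```c
#include <stdio.h>
#include <stdint.h>

/* Toy machine: 16-bit words, 4-bit opcode, 12-bit address field.
   Registers follow the text: PC, MAR, MBR, IR.                    */
static uint16_t memory[4096];
static uint16_t PC, MAR, MBR, IR;

#define OPCODE(w)  ((w) >> 12)
#define ADDR(w)    ((w) & 0x0FFF)
#define OP_LOAD_I  0x5            /* hypothetical "load indirect" opcode */

static void fetch_cycle(void)
{
    MAR = PC;                     /* address of next instruction to MAR   */
    MBR = memory[MAR];            /* memory read arrives in the MBR       */
    IR  = MBR;                    /* instruction moved to the IR          */
    PC  = PC + 1;                 /* PC advanced for the next fetch       */
}

static void indirect_cycle(void)
{
    MAR = ADDR(IR);               /* address field of the instruction     */
    MBR = memory[MAR];            /* fetch the operand's real address     */
    IR  = (IR & 0xF000) | ADDR(MBR);   /* replace with the direct address */
}

int main(void)
{
    memory[0x100] = (OP_LOAD_I << 12) | 0x200;   /* load indirect via 200 */
    memory[0x200] = 0x0300;                      /* pointer to operand    */
    memory[0x300] = 0x1234;                      /* the operand itself    */

    PC = 0x100;
    fetch_cycle();
    if (OPCODE(IR) == OP_LOAD_I)
        indirect_cycle();
    printf("PC=%03X IR=%04X operand=%04X\n", PC, IR, memory[ADDR(IR)]);
    return 0;
}
```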

14.4 INSTRUCTION PIPELINING

As computer systems evolve, greater performance can be achieved by taking advantage of improvements in technology, such as faster circuitry. In addition, organizational enhancements to the processor can improve performance. We have already seen some examples of this, such as the use of multiple registers rather than a single accumulator, and the use of a cache memory. Another organizational approach, which is quite common, is instruction pipelining.
Pipelining Strategy
Instruction pipelining is similar to the use of an assembly line in a manufacturing plant. An assembly line takes advantage of the fact that a product goes through various stages of production. By laying the production process out in an assembly line, products at various stages can be worked on simultaneously. This process is also referred to as pipelining, because, as in a pipeline, new inputs are accepted at one end before previously accepted inputs appear as outputs at the other end.
To apply this concept to instruction execution, we must recognize that, in fact, an instruction has a number of stages. Figure 14.5, for example, breaks the instruction cycle up into 10 tasks, which occur in sequence. Clearly, there should be some opportunity for pipelining.
As a simple approach, consider subdividing instruction processing into two stages: fetch instruction and execute instruction. There are times during the execution of an instruction when main memory is not being accessed. This time could be used to fetch the next instruction in parallel with the execution of the current

Figure 14.9 Two-Stage Instruction Pipeline: (a) simplified view; (b) expanded view, showing wait states, the feedback of a new address to the fetch stage, and the discarding of a prefetched instruction after a taken branch

one. Figure 14.9a depicts this approach. The pipeline has two independent stages. The first stage fetches an instruction and buffers it. When the second stage is free, the first stage passes it the buffered instruction. While the second stage is executing the instruction, the first stage takes advantage of any unused memory cycles to fetch and buffer the next instruction. This is called instruction prefetch or fetch overlap. Note that this approach, which involves instruction buffering, requires more registers. In general, pipelining requires registers to store data between stages.
It should be clear that this process will speed up instruction execution. If the fetch and execute stages were of equal duration, the instruction cycle time would be halved. However, if we look more closely at this pipeline (Figure 14.9b), we will see that this doubling of execution rate is unlikely for two reasons:
1. The execution time will generally be longer than the fetch time. Execution will involve reading and storing operands and the performance of some operation. Thus, the fetch stage may have to wait for some time before it can empty its buffer.
2. A conditional branch instruction makes the address of the next instruction to be fetched unknown. Thus, the fetch stage must wait until it receives the next instruction address from the execute stage. The execute stage may then have to wait while the next instruction is fetched.
Guessing can reduce the time loss from the second reason. A simple rule is the following: When a conditional branch instruction is passed on from the fetch to the execute stage, the fetch stage fetches the next instruction in memory after the branch instruction. Then, if the branch is not taken, no time is lost. If the branch is taken, the fetched instruction must be discarded and a new instruction fetched.
While these factors reduce the potential effectiveness of the two-stage pipeline, some speedup occurs. To gain further speedup, the pipeline must have more stages. Let us consider the following decomposition of the instruction processing.

• Fetch instruction (FI): Read the next expected instruction into a buffer.
• Decode instruction (DI): Determine the opcode and the operand specifiers.
• Calculate operands (CO): Calculate the effective address of each source operand. This may involve displacement, register indirect, indirect, or other forms of address calculation.
• Fetch operands (FO): Fetch each operand from memory. Operands in registers need not be fetched.
• Execute instruction (EI): Perform the indicated operation and store the result, if any, in the specified destination operand location.
• Write operand (WO): Store the result in memory.
With this decomposition, the various stages will be of more nearly equal duration. For the sake of illustration, let us assume equal duration. Using this assumption, Figure 14.10 shows that a six-stage pipeline can reduce the execution time for 9 instructions from 54 time units to 14 time units.
Several comments are in order: The diagram assumes that each instruction goes through all six stages of the pipeline. This will not always be the case. For example, a load instruction does not need the WO stage. However, to simplify the pipeline hardware, the timing is set up assuming that each instruction requires all six stages. Also, the diagram assumes that all of the stages can be performed in parallel. In particular, it is assumed that there are no memory conflicts. For example, the FI, FO, and WO stages involve a memory access. The diagram implies that all these accesses can occur simultaneously. Most memory systems will not permit that. However, the desired value may be in cache, or the FO or WO stage may be null. Thus, much of the time, memory conflicts will not slow down the pipeline.

Figure 14.10 Timing Diagram for Instruction Pipeline Operation (nine instructions flow through the six stages FI, DI, CO, FO, EI, and WO; a new instruction enters the pipeline each time unit, and all nine complete after 14 time units)

Figure 14.11 The Effect of a Conditional Branch on Instruction Pipeline Operation (instruction 3 is a conditional branch to instruction 15; instructions 4 through 7 enter the pipeline but are discarded when the branch is resolved at the end of time unit 7, creating a branch penalty before instructions 15 and 16 proceed)

Several other factors serve to limit the performance enhancement. If the six stages are not of equal duration, there will be some waiting involved at various pipeline stages, as discussed before for the two-stage pipeline. Another difficulty is the conditional branch instruction, which can invalidate several instruction fetches. A similar unpredictable event is an interrupt. Figure 14.11 illustrates the effects of the conditional branch, using the same program as Figure 14.10. Assume that instruction 3 is a conditional branch to instruction 15. Until the instruction is executed, there is no way of knowing which instruction will come next. The pipeline, in this example, simply loads the next instruction in sequence (instruction 4) and proceeds. In Figure 14.10, the branch is not taken, and we get the full performance benefit of the enhancement. In Figure 14.11, the branch is taken. This is not determined until the end of time unit 7. At this point, the pipeline must be cleared of instructions that are not useful. During time unit 8, instruction 15 enters the pipeline. No instructions complete during time units 9 through 12; this is the performance penalty incurred because we could not anticipate the branch. Figure 14.12 indicates the logic needed for pipelining to account for branches and interrupts.
Other problems arise that did not appear in our simple two-stage organization. The CO stage may depend on the contents of a register that could be altered by a previous instruction that is still in the pipeline. Other such register and memory conflicts could occur. The system must contain logic to account for this type of conflict.
To clarify pipeline operation, it might be useful to look at an alternative depiction. Figures 14.10 and 14.11 show the progression of time horizontally across the figures, with each row showing the progress of an individual instruction. Figure 14.13 shows the same sequence of events, with time progressing vertically down the figure and each row showing the state of the pipeline at a given point in time.

Figure 14.12 Six-Stage CPU Instruction Pipeline (flowchart: fetch instruction, decode instruction, calculate operands; on an unconditional branch, update the PC and empty the pipe; otherwise fetch operands, execute instruction, and write operands; after each instruction, a taken branch or an interrupt likewise updates the PC and empties the pipe)

In Figure 14.13a (which corresponds to Figure 14.10), the pipeline is full at time 6, with 6 different instructions in various stages of execution, and remains full through time 9; we assume that instruction I9 is the last instruction to be executed. In Figure 14.13b (which corresponds to Figure 14.11), the pipeline is full at times 6 and 7. At time 7, instruction 3 is in the execute stage and executes a branch to instruction 15. At this point, instructions I4 through I7 are flushed from the pipeline, so that at time 8, only two instructions are in the pipeline, I3 and I15.
From the preceding discussion, it might appear that the greater the number of stages in the pipeline, the faster the execution rate. Some of the IBM S/360 designers

Figure 14.13 An Alternative Pipeline Depiction: (a) no branches, with the pipeline full from time 6 through time 9 and instruction I9 completing at time 14; (b) with a conditional branch, where instructions I4 through I7 are flushed at time 8, leaving only I3 and I15 in the pipeline, followed by I16

pointed out two factors that frustrate this seemingly simple pattern for high-performance design [ANDE67a], and they remain elements that designers must still consider:
1. At each stage of the pipeline, there is some overhead involved in moving data from buffer to buffer and in performing various preparation and delivery functions. This overhead can appreciably lengthen the total execution time of a single instruction. This is significant when sequential instructions are logically dependent, either through heavy use of branching or through memory access dependencies.
2. The amount of control logic required to handle memory and register dependencies and to optimize the use of the pipeline increases enormously with the number of stages. This can lead to a situation where the logic controlling the gating between stages is more complex than the stages being controlled.
Another consideration is latching delay: It takes time for pipeline buffers to operate, and this adds to instruction cycle time.
Instruction pipelining is a powerful technique for enhancing performance but requires careful design to achieve optimum results with reasonable complexity.
Pipeline Performance
In this subsection, we develop some simple measures of pipeline performance and relative speedup (based on a discussion in [HWAN93]). The cycle time τ of an instruction pipeline is the time needed to advance a set of instructions one stage through the pipeline; each column in Figures 14.10 and 14.11 represents one cycle time. The cycle time can be determined as

τ = max[τ_i] + d = τ_m + d        1 ≤ i ≤ k

where
τ_i = time delay of the circuitry in the ith stage of the pipeline
τ_m = maximum stage delay (delay through the stage that experiences the largest delay)
k = number of stages in the instruction pipeline
d = time delay of a latch, needed to advance signals and data from one stage to the next
In general, the time delay d is equivalent to a clock pulse and τ_m >> d. Now suppose that n instructions are processed, with no branches. Let T_k,n be the total time required for a pipeline with k stages to execute n instructions. Then

T_k,n = [k + (n - 1)]τ        (14.1)

A total of k cycles are required to complete the execution of the first instruction, and the remaining n - 1 instructions require n - 1 cycles.² This equation is easily verified from Figure 14.10. The ninth instruction completes at time cycle 14:

14 = [6 + (9 - 1)]

Now consider a processor with equivalent functions but no pipeline, and assume that the instruction cycle time is kτ. The speedup factor for the instruction pipeline compared to execution without the pipeline is defined as

S_k = T_1,n / T_k,n = nkτ / [k + (n - 1)]τ = nk / [k + (n - 1)]        (14.2)

Figure 14.14a plots the speedup factor as a function of the number of instructions that are executed without a branch. As might be expected, at the limit (n → ∞), we have a k-fold speedup. Figure 14.14b shows the speedup factor as a function of the number of stages in the instruction pipeline.³ In this case, the speedup factor approaches the number of instructions that can be fed into the pipeline without branches. Thus, the larger the number of pipeline stages, the greater the potential for speedup. However, as a practical matter, the potential gains of additional pipeline stages are countered by increases in cost, delays between stages, and the fact that branches will be encountered, requiring the flushing of the pipeline.
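Equations (14.1) and (14.2) are easy to check numerically. The short program below (the function names are ours) evaluates both for the six-stage, nine-instruction case of Figure 14.10, assuming τ equal to one cycle.

```c
#include <stdio.h>

/* Total time (in cycles) for n instructions on a k-stage pipeline, Eq. (14.1),
   assuming tau = 1 cycle and no branches.                                      */
static int pipeline_time(int k, int n)  { return k + (n - 1); }

/* Speedup over an unpipelined processor with cycle time k*tau, Eq. (14.2). */
static double speedup(int k, int n)     { return (double)(n * k) / (k + (n - 1)); }

int main(void)
{
    int k = 6, n = 9;                               /* the case of Figure 14.10 */
    printf("T_%d,%d = %d cycles\n", k, n, pipeline_time(k, n));   /* 14   */
    printf("S_%d = %.2f\n", k, speedup(k, n));                    /* 3.86 */
    return 0;
}
```

Letting n grow while k stays fixed drives the speedup toward k, which is the limiting behavior shown in Figure 14.14a.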
Pipeline Hazards
In the previous subsection, we mentioned some of the situations that can result in less than optimal pipeline performance. In this subsection, we examine this issue in

2We are being a bit sloppy here. The cycle time will only equal the maximum value of τ when all the stages are full. At the beginning, the cycle time may be less for the first one or few cycles.
3Note that the x-axis is logarithmic in Figure 14.14a and linear in Figure 14.14b.

Figure 14.14 Speedup Factors with Instruction Pipelining: (a) speedup factor versus number of instructions (log scale), for k = 6, 9, and 12 stages; (b) speedup factor versus number of stages, for n = 10, 20, and 30 instructions

a more systematic way. Chapter 16 revisits this issue, in more detail, after we have introduced the additional complexities found in superscalar pipeline organizations.
A pipeline hazard occurs when the pipeline, or some portion of the pipeline, must stall because conditions do not permit continued execution. Such a pipeline stall is also referred to as a pipeline bubble. There are three types of hazards: resource, data, and control.
RESOURCE HAZARDS A resource hazard occurs when two (or more) instructions that are already in the pipeline need the same resource. The result is that the instructions must be executed in serial rather than parallel for a portion of the pipeline. A resource hazard is sometimes referred to as a structural hazard.
Let us consider a simple example of a resource hazard. Assume a simplified five-stage pipeline, in which each stage takes one clock cycle. Figure 14.15a shows the ideal case, in which a new instruction enters the pipeline each clock cycle. Now assume that main memory has a single port and that all instruction fetches and data reads and writes must be performed one at a time. Further, ignore the cache.

Clock cycle:     1    2    3    4    5    6    7    8    9
(a) Five-stage pipeline, ideal case
I1               FI   DI   FO   EI   WO
I2                    FI   DI   FO   EI   WO
I3                         FI   DI   FO   EI   WO
I4                              FI   DI   FO   EI   WO
(b) I1 source operand in memory
I1               FI   DI   FO   EI   WO
I2                    FI   DI   FO   EI   WO
I3                         Idle FI   DI   FO   EI   WO
I4                                   FI   DI   FO   EI   WO
Figure 14.15 Example of Resource Hazard

In this case, an operand read from or write to memory cannot be performed in parallel with an instruction fetch. This is illustrated in Figure 14.15b, which assumes that the source operand for instruction I1 is in memory, rather than a register. Therefore, the fetch instruction stage of the pipeline must idle for one cycle before beginning the instruction fetch for instruction I3. The figure assumes that all other operands are in registers.
Another example of a resource conflict is a situation in which multiple instructions are ready to enter the execute instruction phase and there is a single ALU. One solution to such resource hazards is to increase available resources, such as having multiple ports into main memory and multiple ALU units.
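The single-memory-port constraint behind Figure 14.15b can also be expressed as a short scheduling sketch. The following simplified model (Python; the helper name and instruction labels are illustrative) assumes each stage takes one cycle and that the memory port is used by every instruction fetch and by the operand fetch (FO) of any instruction whose source operand is in memory; with I1's operand in memory, I3's fetch idles one cycle, as in the figure.

STAGES = ["FI", "DI", "FO", "EI", "WO"]

def schedule(mem_operand):
    """mem_operand[i] is True if instruction Ii reads its source operand from memory."""
    busy = set()            # cycles in which the memory port is taken by an FO
    start = 0               # FI cycle of the previous instruction
    rows = []
    for i, needs_mem in enumerate(mem_operand):
        fi = start + 1 if i else 1
        while fi in busy:   # the fetch stage idles while the port serves an earlier FO
            fi += 1
        if needs_mem:
            busy.add(fi + 2)          # this instruction's FO occupies the memory port
        rows.append({s: fi + off for off, s in enumerate(STAGES)})
        start = fi
    return rows

if __name__ == "__main__":
    # I1's source operand is in memory, as in Figure 14.15b; I3's fetch idles one cycle.
    for i, row in enumerate(schedule([True, False, False, False]), start=1):
        print(f"I{i}", row)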

Reservation Table Analyzer

One approach to analyzing resource conflicts and aiding in the design of pipelines is the reservation table. We examine reservation tables in Appendix I.
DATA HAZARDS A data hazard occurs when there is a conflict in the access of an operand location. In general terms, we can state the hazard in this form: Two instructions in a program are to be executed in sequence and both access a particular memory or register operand.

If the two instructions are executed in strict sequence, no problem occurs. However, if the instructions are executed in a pipeline, then it is possible for the operand value to be updated in such a way as to produce a different result than would occur with strict sequential execution. In other words, the program produces an incorrect result because of the use of pipelining.
As an example, consider the following x86 machine instruction sequence:
ADD EAX, EBX /* EAX = EAX + EBX
SUB ECX, EAX /* ECX = ECX - EAX
The first instruction adds the contents of the 32-bit registers EAX and EBX and stores the result in EAX. The second instruction subtracts the contents of EAX from ECX and stores the result in ECX. Figure 14.16 shows the pipeline behavior. The ADD instruction does not update register EAX until the end of stage 5, which occurs at clock cycle 5. But the SUB instruction needs that value at the beginning of its stage 2, which occurs at clock cycle 4. To maintain correct operation, the pipeline must stall for two clock cycles. Thus, in the absence of special hardware and specific avoidance algorithms, such a data hazard results in inefficient pipeline usage.
There are three types of data hazards (a short sketch below illustrates the distinction):
• Read after write (RAW), or true dependency: An instruction modifies a register or memory location and a succeeding instruction reads the data in that memory or register location. A hazard occurs if the read takes place before the write operation is complete.
• Write after read (WAR), or antidependency: An instruction reads a register or memory location and a succeeding instruction writes to the location. A hazard occurs if the write operation completes before the read operation takes place.
• Write after write (WAW), or output dependency: Two instructions both write to the same location. A hazard occurs if the write operations take place in the reverse order of the intended sequence.
The example of Figure 14.16 is a RAW hazard. The other two hazards are best discussed in the context of superscalar organization, discussed in Chapter 16.
CONTROL HAZARDS A control hazard, also known as a branch hazard, occurs when the pipeline makes the wrong decision on a branch prediction and therefore brings instructions into the pipeline that must subsequently be discarded. We discuss approaches to dealing with control hazards next.
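The three cases can be distinguished mechanically from the sets of locations each instruction reads and writes. The following is a small sketch (Python; the function name and register sets are illustrative) applied to the ADD/SUB pair above, which it classifies as a RAW hazard.

def classify(first_reads, first_writes, second_reads, second_writes):
    """Classify the dependency between two instructions executed in program order."""
    hazards = []
    if first_writes & second_reads:
        hazards.append("RAW (true dependency)")
    if first_reads & second_writes:
        hazards.append("WAR (antidependency)")
    if first_writes & second_writes:
        hazards.append("WAW (output dependency)")
    return hazards or ["none"]

if __name__ == "__main__":
    # ADD EAX, EBX  reads EAX and EBX, writes EAX
    # SUB ECX, EAX  reads ECX and EAX, writes ECX
    print(classify({"EAX", "EBX"}, {"EAX"}, {"ECX", "EAX"}, {"ECX"}))
    # ['RAW (true dependency)']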

Figure 14.16 Example of Data Hazard (the SUB instruction idles until the ADD instruction has updated EAX)

Dealing with Branches
One of the major problems in designing an instruction pipeline is assuring a steady flow of instructions to the initial stages of the pipeline. The primary impediment, as we have seen, is the conditional branch instruction. Until the instruction is actually executed, it is impossible to determine whether the branch will be taken or not. A variety of approaches have been taken for dealing with conditional branches:
• Multiple streams
• Prefetch branch target
• Loop buffer
• Branch prediction
• Delayed branch

MULTIPLE STREAMS A simple pipeline suffers a penalty for a branch instruction because it must choose one of two instructions to fetch next and may make the wrong choice. A brute-force approach is to replicate the initial portions of the pipeline and allow the pipeline to fetch both instructions, making use of two streams. There are two problems with this approach: • With multiple pipelines there are contention delays for access to the registers and to memory. • Additional branch instructions may enter the pipeline (either stream) before the original branch decision is resolved. Each such instruction needs an addi- tional stream. Despite these drawbacks, this strategy can improve performance. Examples of machines with two or more pipeline streams are the IBM 370/168 and the IBM 3033. PREFETCH BRANCH TARGET When a conditional branch is recognized, the target of the branch is prefetched, in addition to the instruction following the branch. This target is then saved until the branch instruction is executed. If the branch is taken, the target has already been prefetched. The IBM 360/91 uses this approach. LOOP BUFFER A loop buffer is a small, very-high-speed memory maintained by the instruction fetch stage of the pipeline and containing the n most recently fetched instructions, in sequence. If a branch is to be taken, the hardware first checks whether the branch target is within the buffer. If so, the next instruction is fetched from the buffer. The loop buffer has three benefits: 1. With the use of prefetching, the loop buffer will contain some instruction sequentially ahead of the current instruction fetch address. Thus, instructions fetched in sequence will be available without the usual memory access time. 2. If a branch occurs to a target just a few locations ahead of the address of the branch instruction, the target will already be in the buffer. This is use- ful for the rather common occurrence of IF–THEN and IF–THEN–ELSE sequences. 506 CHAPTER 14 / PROCESSOR STRUCTURE AND FUNCTION

Figure 14.17 Loop Buffer (256 bytes; the 8 least significant bits of the branch address index the buffer, and the most significant address bits are compared to determine a hit)

3. This strategy is particularly well suited to dealing with loops, or iterations; hence the name loop buffer. If the loop buffer is large enough to contain all the instructions in a loop, then those instructions need to be fetched from memory only once, for the first iteration. For subsequent iterations, all the needed instructions are already in the buffer. The loop buffer is similar in principle to a cache dedicated to instructions. The differences are that the loop buffer only retains instructions in sequence and is much smaller in size and hence lower in cost. Figure 14.17 gives an example of a loop buffer. If the buffer contains 256 bytes, and byte addressing is used, then the least significant 8 bits are used to index the buffer. The remaining most significant bits are checked to determine if the branch target lies within the environment captured by the buffer. Among the machines using a loop buffer are some of the CDC machines (Star- 100, 6600, 7600) and the CRAY-1. A specialized form of loop buffer is available on the Motorola 68010, for executing a three-instruction loop involving the DBcc (dec- rement and branch on condition) instruction (see Problem 14.14). A three-word buffer is maintained, and the processor executes these instructions repeatedly until the loop condition is satisfied.
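The hit test described for Figure 14.17 amounts to a comparison of high-order address bits. The following is a minimal sketch (Python; the function name, sample addresses, and the assumption that the buffered block is aligned on a 256-byte boundary are illustrative simplifications) of that check for a 256-byte buffer.

BUFFER_SIZE = 256          # bytes, so the low 8 bits of an address index the buffer

def loop_buffer_lookup(branch_target, buffer_base):
    """Return the buffer offset on a hit, or None on a miss."""
    if (branch_target // BUFFER_SIZE) == (buffer_base // BUFFER_SIZE):
        return branch_target % BUFFER_SIZE     # instruction fetched from the buffer
    return None                                # miss: fetch from memory instead

if __name__ == "__main__":
    base = 0x0001_2300                             # hypothetical block currently buffered
    print(loop_buffer_lookup(0x0001_23A4, base))   # hit: offset 164
    print(loop_buffer_lookup(0x0001_2456, base))   # miss: None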

Branch Prediction Simulator Branch Target Buffer

BRANCH PREDICTION Various techniques can be used to predict whether a branch will be taken. Among the more common are the following:
• Predict never taken
• Predict always taken
• Predict by opcode
• Taken/not taken switch
• Branch history table

The first three approaches are static: they do not depend on the execution his- tory up to the time of the conditional branch instruction. The latter two approaches are dynamic: They depend on the execution history. The first two approaches are the simplest. These either always assume that the branch will not be taken and continue to fetch instructions in sequence, or they always assume that the branch will be taken and always fetch from the branch tar- get. The predict-never-taken approach is the most popular of all the branch predic- tion methods. Studies analyzing program behavior have shown that conditional branches are taken more than 50% of the time [LILJ88], and so if the cost of prefetching from either path is the same, then always prefetching from the branch target address should give better performance than always prefetching from the sequential path. However, in a paged machine, prefetching the branch target is more likely to cause a page fault than prefetching the next instruction in sequence, and so this per- formance penalty should be taken into account. An avoidance mechanism may be employed to reduce this penalty. The final static approach makes the decision based on the opcode of the branch instruction. The processor assumes that the branch will be taken for certain branch opcodes and not for others. [LILJ88] reports success rates of greater than 75% with this strategy. Dynamic branch strategies attempt to improve the accuracy of prediction by recording the history of conditional branch instructions in a program. For example, one or more bits can be associated with each conditional branch instruction that reflect the recent history of the instruction. These bits are referred to as a taken/ not taken switch that directs the processor to make a particular decision the next time the instruction is encountered. Typically, these history bits are not associated with the instruction in main memory. Rather, they are kept in temporary high- speed storage. One possibility is to associate these bits with any conditional branch instruction that is in a cache. When the instruction is replaced in the cache, its his- tory is lost. Another possibility is to maintain a small table for recently executed branch instructions with one or more history bits in each entry. The processor could access the table associatively, like a cache, or by using the low-order bits of the branch instruction’s address. With a single bit, all that can be recorded is whether the last execution of this instruction resulted in a branch or not. A shortcoming of using a single bit appears in the case of a conditional branch instruction that is almost always taken, such as a loop instruction. With only one bit of history, an error in prediction will occur twice for each use of the loop: once on entering the loop, and once on exiting. If two bits are used, they can be used to record the result of the last two instances of the execution of the associated instruction, or to record a state in some other fashion. Figure 14.18 shows a typical approach (see Problem 14.13 for other possibilities). Assume that the algorithm starts at the upper-left-hand corner of the flowchart. As long as each succeeding conditional branch instruction that is encountered is taken, the decision process predicts that the next branch will be taken. If a single prediction is wrong, the algorithm continues to predict that the next branch is taken. 
Only if two successive branches are not taken does the algorithm shift to the right-hand side of the flowchart. Subsequently, the algorithm

Figure 14.18 Branch Prediction Flowchart

will predict that branches are not taken until two branches in a row are taken. Thus, the algorithm requires two consecutive wrong predictions to change the pre- diction decision. The decision process can be represented more compactly by a finite-state machine, shown in Figure 14.19. The finite-state machine representation is com- monly used in the literature. The use of history bits, as just described, has one drawback: If the decision is made to take the branch, the target instruction cannot be fetched until the tar- get address, which is an operand in the conditional branch instruction, is decoded. Greater efficiency could be achieved if the instruction fetch could be initiated as soon as the branch decision is made. For this purpose, more information must be saved, in what is known as a branch target buffer, or a branch history table. The branch history table is a small cache memory associated with the instruc- tion fetch stage of the pipeline. Each entry in the table consists of three elements: the address of a branch instruction, some number of history bits that record the state of use of that instruction, and information about the target instruction. In most proposals and implementations, this third field contains the address of the target instruction. Another possibility is for the third field to actually contain the target instruction. The trade-off is clear: Storing the target address yields a smaller 14.4 / INSTRUCTION PIPELINING 509

table but a greater instruction fetch time compared with storing the target instruction [RECH98].

Figure 14.19 Branch Prediction State Diagram

Figure 14.20 contrasts this scheme with a predict-never-taken strategy. With the former strategy, the instruction fetch stage always fetches the next sequential address. If a branch is taken, some logic in the processor detects this and instructs that the next instruction be fetched from the target address (in addition to flushing the pipeline). The branch history table is treated as a cache. Each prefetch triggers a lookup in the branch history table. If no match is found, the next sequential address is used for the fetch. If a match is found, a prediction is made based on the state of the instruction: Either the next sequential address or the branch target address is fed to the select logic. When the branch instruction is executed, the execute stage signals the branch history table logic with the result. The state of the instruction is updated to reflect a correct or incorrect prediction. If the prediction is incorrect, the select logic is redirected to the correct address for the next fetch. When a conditional branch instruction is encountered that is not in the table, it is added to the table and one of the existing entries is discarded, using one of the cache replacement algorithms discussed in Chapter 4.
A refinement of the branch history approach is referred to as two-level or correlation-based branch history [YEH91]. This approach is based on the assumption that whereas in loop-closing branches, the past history of a particular branch instruction is a good predictor of future behavior, with more complex control-flow structures, the direction of a branch is frequently correlated with the direction of related branches. An example is an if-then-else or case structure. There are a number of strategies possible. Typically, recent global branch history (i.e., the history of the most recent branches, not just of this branch instruction) is used in addition to the history of the current branch instruction. The general structure is defined as an (m, n) correlator, which uses the behavior of the last m branches to choose from 2^m n-bit branch predictors for the current branch instruction.
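As an illustration, the following is a minimal sketch (Python; the class and method names are illustrative) of an (m, n) = (2, 2) correlator: the last two global branch outcomes select one of four 2-bit saturating counters kept per branch address, and each counter behaves like the four-state scheme of Figure 14.19 (values 0-1 predict not taken, 2-3 predict taken).

class CorrelatingPredictor:
    def __init__(self, m=2):
        self.m = m
        self.history = 0                       # last m outcomes, one bit each
        self.counters = {}                     # (branch_addr, history) -> 0..3

    def predict(self, addr):
        counter = self.counters.get((addr, self.history), 2)  # start "weakly taken"
        return counter >= 2                    # True means predict taken

    def update(self, addr, taken):
        key = (addr, self.history)
        counter = self.counters.get(key, 2)
        # saturating 2-bit counter: two wrong predictions are needed to flip the decision
        self.counters[key] = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & ((1 << self.m) - 1)

if __name__ == "__main__":
    p = CorrelatingPredictor()
    outcomes = [True, True, True, False] * 4   # a loop-like taken/taken/taken/not-taken pattern
    correct = 0
    for taken in outcomes:
        correct += p.predict(0x4000) == taken
        p.update(0x4000, taken)
    print(f"{correct}/{len(outcomes)} predictions correct")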

Figure 14.20 Dealing with Branches: (a) predict never taken strategy; (b) branch history table strategy

In other words, an n-bit history is kept for a given branch for each possible combination of branches taken by the most recent m branches.
DELAYED BRANCH It is possible to improve pipeline performance by automatically rearranging instructions within a program, so that branch instructions occur later than actually desired. This intriguing approach is examined in Chapter 15.
Intel 80486 Pipelining
An instructive example of an instruction pipeline is that of the Intel 80486. The 80486 implements a five-stage pipeline:

• Fetch: Instructions are fetched from the cache or from external memory and placed into one of the two 16-byte prefetch buffers. The objective of the fetch stage is to fill the prefetch buffers with new data as soon as the old data have been consumed by the instruction decoder. Because instructions are of variable length (from 1 to 11 bytes not counting prefixes), the status of the prefetcher relative to the other pipeline stages varies from instruction to instruction. On average, about five instructions are fetched with each 16-byte load [CRAW90]. The fetch stage operates independently of the other stages to keep the prefetch buffers full. • Decode stage 1: All opcode and addressing-mode information is decoded in the D1 stage. The required information, as well as instruction-length informa- tion, is included in at most the first 3 bytes of the instruction. Hence, 3 bytes are passed to the D1 stage from the prefetch buffers. The D1 decoder can then direct the D2 stage to capture the rest of the instruction (displacement and immediate data), which is not involved in the D1 decoding. • Decode stage 2: The D2 stage expands each opcode into control signals for the ALU. It also controls the computation of the more complex addressing modes. • Execute: This stage includes ALU operations, cache access, and register update. • Write back: This stage, if needed, updates registers and status flags modified during the preceding execute stage. If the current instruction updates memory, the computed value is sent to the cache and to the bus-interface write buffers at the same time. With the use of two decode stages, the pipeline can sustain a throughput of close to one instruction per clock cycle. Complex instructions and conditional branches can slow down this rate. Figure 14.21 shows examples of the operation of the pipeline. Figure 14.21a shows that there is no delay introduced into the pipeline when a memory access is required. However, as Figure 14.21b shows, there can be a delay for values used to compute memory addresses. That is, if a value is loaded from memory into a register and that register is then used as a base register in the next instruction, the processor will stall for one cycle. In this example, the processor accesses the cache in the EX stage of the first instruction and stores the value retrieved in the register during the WB stage. However, the next instruction needs this register in its D2 stage. When the D2 stage lines up with the WB stage of the previous instruction, bypass signal paths allow the D2 stage to have access to the same data being used by the WB stage for writing, saving one pipeline stage. Figure 14.21c illustrates the timing of a branch instruction, assuming that the branch is taken. The compare instruction updates condition codes in the WB stage, and bypass paths make this available to the EX stage of the jump instruction at the same time. In parallel, the processor runs a speculative fetch cycle to the target of the jump during the EX stage of the jump instruction. If the processor determines a false branch condition, it discards this prefetch and continues execution with the next sequential instruction (already fetched and decoded). 512 CHAPTER 14 / PROCESSOR STRUCTURE AND FUNCTION

Figure 14.21 80486 Instruction Pipeline Examples: (a) no data load delay in the pipeline (MOV Reg1, Mem1; MOV Reg1, Reg2; MOV Mem2, Reg1); (b) pointer load delay (MOV Reg1, Mem1; MOV Reg2, (Reg1)); (c) branch instruction timing (CMP Reg1, Imm; Jcc Target; Target)

14.5 THE x86 PROCESSOR FAMILY

The x86 organization has evolved dramatically over the years. In this section we examine some of the details of the most recent processor organizations, concen- trating on common elements in single processors. Chapter 16 looks at superscalar aspects of the x86, and Chapter 18 examines the multicore organization. An over- view of the Pentium 4 processor organization is depicted in Figure 4.18. Register Organization The register organization includes the following types of registers (Table 14.2): • General: There are eight 32-bit general-purpose registers (see Figure 14.3c). These may be used for all types of x86 instructions; they can also hold operands for address calculations. In addition, some of these registers also serve special purposes. For example, string instructions use the contents of the ECX, ESI, and EDI registers as operands without having to reference these registers explicitly in the instruction. As a result, a number of instructions can be encoded more compactly. In 64-bit mode, there are 16 64-bit general-purpose registers. • Segment: The six 16-bit segment registers contain segment selectors, which index into segment tables, as discussed in Chapter 8. The code segment (CS) register references the segment containing the instruction being executed. The stack segment (SS) register references the segment containing a user-visible stack. The remaining segment registers (DS, ES, FS, GS) enable the user to reference up to four separate data segments at a time. • Flags: The 32-bit EFLAGS register contains condition codes and various mode bits. In 64-bit mode, this register is extended to 64 bits and referred 14.5 / THE x86 PROCESSOR FAMILY 513

Table 14.2 x86 Processor Registers (a) Integer Unit in 32-bit Mode

Type                   Number   Length (bits)   Purpose
General                8        32              General-purpose user registers
Segment                6        16              Contain segment selectors
EFLAGS                 1        32              Status and control bits
Instruction Pointer    1        32              Instruction pointer

(b) Integer Unit in 64-bit Mode

Type                   Number   Length (bits)   Purpose
General                16       64              General-purpose user registers
Segment                6        16              Contain segment selectors
RFLAGS                 1        64              Status and control bits
Instruction Pointer    1        64              Instruction pointer

(c) Floating-Point Unit

Type                   Number   Length (bits)   Purpose
Numeric                8        80              Hold floating-point numbers
Control                1        16              Control bits
Status                 1        16              Status bits
Tag Word               1        16              Specifies contents of numeric registers
Instruction Pointer    1        48              Points to instruction interrupted by exception
Data Pointer           1        48              Points to operand interrupted by exception

to as RFLAGS. In the current architecture definition, the upper 32 bits of RFLAGS are unused. • Instruction pointer: Contains the address of the current instruction. There are also registers specifically devoted to the floating-point unit: • Numeric: Each register holds an extended-precision 80-bit floating-point number. There are eight registers that function as a stack, with push and pop operations available in the instruction set. • Control: The 16-bit control register contains bits that control the operation of the floating-point unit, including the type of rounding control; single, double, or extended precision; and bits to enable or disable various exception condi- tions. • Status: The 16-bit status register contains bits that reflect the current state of the floating-point unit, including a 3-bit pointer to the top of the stack; condition codes reporting the outcome of the last operation; and exception flags. • Tag word: This 16-bit register contains a 2-bit tag for each floating-point numeric register, which indicates the nature of the contents of the corresponding register. 514 CHAPTER 14 / PROCESSOR STRUCTURE AND FUNCTION

The four possible values are valid, zero, special (NaN, infinity, denormalized), and empty. These tags enable programs to check the contents of a numeric register without performing complex decoding of the actual data in the register. For example, when a context switch is made, the processor need not save any floating-point registers that are empty.
The use of most of the aforementioned registers is easily understood. Let us elaborate briefly on several of the registers.
EFLAGS REGISTER The EFLAGS register (Figure 14.22) indicates the condition of the processor and helps to control its operation. It includes the six condition codes defined in Table 12.9 (carry, parity, auxiliary, zero, sign, overflow), which report the results of an integer operation. In addition, there are bits in the register that may be referred to as control bits:
• Trap flag (TF): When set, causes an interrupt after the execution of each instruction. This is used for debugging.
• Interrupt enable flag (IF): When set, the processor will recognize external interrupts.
• Direction flag (DF): Determines whether string processing instructions increment or decrement the 16-bit half-registers SI and DI (for 16-bit operations) or the 32-bit registers ESI and EDI (for 32-bit operations).
• I/O privilege flag (IOPL): When set, causes the processor to generate an exception on all accesses to I/O devices during protected-mode operation.
• Resume flag (RF): Allows the programmer to disable debug exceptions so that the instruction can be restarted after a debug exception without immediately causing another debug exception.
• Alignment check (AC): Activates if a word or doubleword is addressed on a nonword or nondoubleword boundary.
• Identification flag (ID): If this bit can be set and cleared, then this processor supports the CPUID instruction. This instruction provides information about the vendor, family, and model.
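For reference, the condition codes and control bits above occupy fixed positions in the 32-bit EFLAGS value. The following small sketch (Python; the function name and sample value are illustrative) extracts them; the bit assignments are the standard x86 ones.

EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,     # condition codes
    "TF": 8, "IF": 9, "DF": 10, "OF": 11,
    "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}

def decode_eflags(value):
    """Return a dictionary of flag name -> set/clear for a 32-bit EFLAGS value."""
    flags = {name: bool(value & (1 << bit)) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11             # 2-bit I/O privilege level
    return flags

if __name__ == "__main__":
    # Example value with ZF, PF, and IF set and IOPL = 0
    print(decode_eflags(0x0000_0246))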

Figure 14.22 Pentium II EFLAGS Register
ID = Identification flag; VIP = Virtual interrupt pending; VIF = Virtual interrupt flag; AC = Alignment check; VM = Virtual 8086 mode; RF = Resume flag; NT = Nested task flag; IOPL = I/O privilege level; OF = Overflow flag; DF = Direction flag; IF = Interrupt enable flag; TF = Trap flag; SF = Sign flag; ZF = Zero flag; AF = Auxiliary carry flag; PF = Parity flag; CF = Carry flag

In addition, there are 4 bits that relate to operating mode. The Nested Task (NT) flag indicates that the current task is nested within another task in protected- mode operation. The Virtual Mode (VM) bit allows the programmer to enable or disable virtual 8086 mode, which determines whether the processor runs as an 8086 machine. The Virtual Interrupt Flag (VIF) and Virtual Interrupt Pending (VIP) flag are used in a multitasking environment. CONTROL REGISTERS The x86 employs four control registers (register CR1 is unused) to control various aspects of processor operation (Figure 14.23). All of the registers except CR0 are either 32 bits or 64 bits long, depending on whether the implementation supports the x86 64-bit architecture. The CR0 register contains system control flags,

Figure 14.23 x86 Control Registers (shaded area indicates reserved bits)
OSXSAVE = XSAVE enable bit; SMXE = Enable safer mode extensions; VMXE = Enable virtual machine extensions; OSXMMEXCPT = Support unmasked SIMD FP exceptions; OSFXSR = Support FXSAVE, FXSTOR; PCE = Performance counter enable; PGE = Page global enable; MCE = Machine check enable; PAE = Physical address extension; PSE = Page size extensions; DE = Debug extensions; TSD = Time stamp disable; PVI = Protected-mode virtual interrupts; VME = Virtual 8086 mode extensions; PCD = Page-level cache disable; PWT = Page-level writes transparent; PG = Paging; CD = Cache disable; NW = Not write through; AM = Alignment mask; WP = Write protect; NE = Numeric error; ET = Extension type; TS = Task switched; EM = Emulation; MP = Monitor coprocessor; PE = Protection enable

which control modes or indicate states that apply generally to the processor rather than to the execution of an individual task. The flags are as follows:
• Protection Enable (PE): Enable/disable protected mode of operation.
• Monitor Coprocessor (MP): Only of interest when running programs from earlier machines on the x86; it relates to the presence of an arithmetic coprocessor.
• Emulation (EM): Set when the processor does not have a floating-point unit, and causes an interrupt when an attempt is made to execute floating-point instructions.
• Task Switched (TS): Indicates that the processor has switched tasks.
• Extension Type (ET): Not used on the Pentium and later machines; used to indicate support of math coprocessor instructions on earlier machines.
• Numeric Error (NE): Enables the standard mechanism for reporting floating-point errors on external bus lines.
• Write Protect (WP): When this bit is clear, read-only user-level pages can be written by a supervisor process. This feature is useful for supporting process creation in some operating systems.
• Alignment Mask (AM): Enables/disables alignment checking.
• Not Write Through (NW): Selects mode of operation of the data cache. When this bit is set, the data cache is inhibited from cache write-through operations.
• Cache Disable (CD): Enables/disables the internal cache fill mechanism.
• Paging (PG): Enables/disables paging.
When paging is enabled, the CR2 and CR3 registers are valid. The CR2 register holds the 32-bit linear address of the last page accessed before a page fault interrupt. The leftmost 20 bits of CR3 hold the 20 most significant bits of the base address of the page directory; the remainder of the address contains zeros. Two bits of CR3 are used to drive pins that control the operation of an external cache. The page-level cache disable (PCD) enables or disables the external cache, and the page-level writes transparent (PWT) bit controls write through in the external cache (see the sketch following the CR4 list below).
Nine additional control bits are defined in CR4:
• Virtual-8086 Mode Extension (VME): Enables support for the virtual interrupt flag in virtual-8086 mode.
• Protected-mode Virtual Interrupts (PVI): Enables support for the virtual interrupt flag in protected mode.
• Time Stamp Disable (TSD): Disables the read time stamp counter (RDTSC) instruction, which is used for debugging purposes.
• Debugging Extensions (DE): Enables I/O breakpoints; this allows the processor to interrupt on I/O reads and writes.
• Page Size Extensions (PSE): Enables large page sizes (2- or 4-MByte pages) when set; restricts pages to 4 KBytes when clear.
• Physical Address Extension (PAE): Enables address lines A35 through A32 whenever a special new addressing mode, controlled by the PSE, is enabled.

• Machine Check Enable (MCE): Enables the machine check interrupt, which occurs when a data parity error occurs during a read bus cycle or when a bus cycle is not successfully completed. • Page Global Enable (PGE): Enables the use of global pages. When PGE = 1 and a task switch is performed, all of the TLB entries are flushed with the excep- tion of those marked global. • Performance Counter Enable (PCE): Enables the execution of the RDPMC (read performance counter) instruction at any privilege level. Two perform- ance counters are used to measure the duration of a specific event type and the number of occurrences of a specific event type.
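A short sketch (Python; names and sample values are illustrative) ties the CR0 and CR3 descriptions together: CR3 is consulted only when CR0.PG is set, its upper 20 bits give the page-directory base, and bits 4 and 3 (the standard x86 positions) are the PCD and PWT cache-control bits.

CR0_PE = 1 << 0           # protection enable
CR0_PG = 1 << 31          # paging enable

def decode_cr3(cr0, cr3):
    """Interpret CR3, which is valid only while paging is enabled in CR0."""
    if not (cr0 & CR0_PG):
        return None                                  # paging disabled: CR3 not in use
    return {
        "page_directory_base": cr3 & 0xFFFF_F000,    # 20 most significant bits
        "PCD": bool(cr3 & (1 << 4)),                 # page-level cache disable
        "PWT": bool(cr3 & (1 << 3)),                 # page-level writes transparent
    }

if __name__ == "__main__":
    cr0 = CR0_PE | CR0_PG                   # protected mode with paging on
    cr3 = 0x0010_2000 | (1 << 3)            # hypothetical directory base with PWT set
    print(decode_cr3(cr0, cr3))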

MMX REGISTERS Recall from Section 10.3 that the x86 MMX capability makes use of several 64-bit data types. The MMX instructions make use of 3-bit register address fields, so that eight MMX registers are supported. In fact, the processor does not include specific MMX registers. Rather, the processor uses an aliasing technique (Figure 14.24). The existing floating-point registers are used to store MMX values. Specifically, the low-order 64 bits (mantissa) of each floating-point register are used to form the eight MMX registers. Thus, the older 32-bit x86 architecture is easily extended to support the MMX capability. Some key characteristics of the MMX use of these registers are as follows: • Recall that the floating-point registers are treated as a stack for floating-point operations. For MMX operations, these same registers are accessed directly. • The first time that an MMX instruction is executed after any floating-point operations, the FP tag word is marked valid. This reflects the change from stack operation to direct register addressing.

Figure 14.24 Mapping of MMX Registers to Floating-Point Registers

• The EMMS (Empty MMX State) instruction sets bits of the FP tag word to indicate that all registers are empty. It is important that the programmer insert this instruction at the end of an MMX code block so that subsequent floating- point operations function properly. • When a value is written to an MMX register, bits [79:64] of the corresponding FP register (sign and exponent bits) are set to all ones. This sets the value in the FP register to NaN (not a number) or infinity when viewed as a floating- point value. This ensures that an MMX data value will not look like a valid floating-point value. Interrupt Processing Interrupt processing within a processor is a facility provided to support the operat- ing system. It allows an application program to be suspended, in order that a variety of interrupt conditions can be serviced and later resumed. INTERRUPTS AND EXCEPTIONS Two classes of events cause the x86 to suspend execution of the current instruction stream and respond to the event: interrupts and exceptions. In both cases, the processor saves the context of the current process and transfers to a predefined routine to service the condition. An interrupt is generated by a signal from hardware, and it may occur at random times during the execution of a program. An exception is generated from software, and it is provoked by the execution of an instruction. There are two sources of interrupts and two sources of exceptions: 1. Interrupts • Maskable interrupts: Received on the processor’s INTR pin. The processor does not recognize a maskable interrupt unless the interrupt enable flag (IF) is set. • Nonmaskable interrupts: Received on the processor’s NMI pin. Recognition of such interrupts cannot be prevented. 2. Exceptions • Processor-detected exceptions: Results when the processor encounters an error while attempting to execute an instruction. • Programmed exceptions: These are instructions that generate an exception (e.g., INTO, INT3, INT, and BOUND).

INTERRUPT VECTOR TABLE Interrupt processing on the x86 uses the interrupt vector table. Every type of interrupt is assigned a number, and this number is used to index into the interrupt vector table. This table contains 256 32-bit interrupt vectors, each of which is the address (segment and offset) of the interrupt service routine for that interrupt number. Table 14.3 shows the assignment of numbers in the interrupt vector table; shaded entries represent interrupts, while nonshaded entries are exceptions. The NMI hardware interrupt is type 2. INTR hardware interrupts are assigned numbers in the range of 32 to 255; when an INTR interrupt is generated, it must be accompanied on the bus with the interrupt vector number for this interrupt. The remaining vector numbers are used for exceptions.

Table 14.3 x86 Exception and Interrupt Vector Table

Vector Number   Description
0        Divide error; division overflow or division by zero
1        Debug exception; includes various faults and traps related to debugging
2        NMI pin interrupt; signal on NMI pin
3        Breakpoint; caused by INT 3 instruction, which is a 1-byte instruction useful for debugging
4        INTO-detected overflow; occurs when the processor executes INTO with the OF flag set
5        BOUND range exceeded; the BOUND instruction compares a register with boundaries stored in memory and generates an interrupt if the contents of the register is out of bounds
6        Undefined opcode
7        Device not available; attempt to use ESC or WAIT instruction fails due to lack of external device
8        Double fault; two interrupts occur during the same instruction and cannot be handled serially
9        Reserved
10       Invalid task state segment (TSS); segment describing a requested task is not initialized or not valid
11       Segment not present; required segment not present
12       Stack fault; limit of stack segment exceeded or stack segment not present
13       General protection; protection violation that does not cause another exception (e.g., writing to a read-only segment)
14       Page fault
15       Reserved
16       Floating-point error; generated by a floating-point arithmetic instruction
17       Alignment check; access to a word stored at an odd byte address or a doubleword stored at an address not a multiple of 4
18       Machine check; model specific
19–31    Reserved
32–255   User interrupt vectors; provided when INTR signal is activated

Unshaded: exceptions Shaded: interrupts

If more than one exception or interrupt is pending, the processor services them in a predictable order. The location of vector numbers within the table does not reflect priority. Instead, priority among exceptions and interrupts is organized into five classes. In descending order of priority, these are
• Class 1: Traps on the previous instruction (vector number 1)
• Class 2: External interrupts (2, 32–255)
• Class 3: Faults from fetching next instruction (3, 14)
• Class 4: Faults from decoding the next instruction (6, 7)
• Class 5: Faults on executing an instruction (0, 4, 5, 8, 10–14, 16, 17)

INTERRUPT HANDLING Just as with a transfer of execution using a CALL instruction, a transfer to an interrupt-handling routine uses the system stack to store the processor state. When an interrupt occurs and is recognized by the processor, a sequence of events takes place:
1. If the transfer involves a change of privilege level, then the current stack segment register and the current extended stack pointer (ESP) register are pushed onto the stack.
2. The current value of the EFLAGS register is pushed onto the stack.
3. Both the interrupt (IF) and trap (TF) flags are cleared. This disables INTR interrupts and the trap or single-step feature.
4. The current code segment (CS) pointer and the current instruction pointer (IP or EIP) are pushed onto the stack.
5. If the interrupt is accompanied by an error code, then the error code is pushed onto the stack.
6. The interrupt vector contents are fetched and loaded into the CS and IP or EIP registers. Execution continues from the interrupt service routine.
To return from an interrupt, the interrupt service routine executes an IRET instruction. This causes all of the values saved on the stack to be restored; execution resumes from the point of the interrupt.
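The entry sequence above can be summarized schematically. The following sketch (Python; register and structure names are illustrative, and the simple case of no privilege-level change is assumed) mirrors steps 2 through 6 and the corresponding IRET unwinding; it is a schematic model, not processor-accurate microcode.

def enter_interrupt(cpu, stack, vector_table, vector, error_code=None):
    stack.append(cpu["EFLAGS"])            # step 2: save EFLAGS
    cpu["IF"] = cpu["TF"] = 0              # step 3: mask INTR and single-step
    stack.append(cpu["CS"])                # step 4: save return code segment...
    stack.append(cpu["EIP"])               #         ...and return instruction pointer
    if error_code is not None:
        stack.append(error_code)           # step 5: error code, if any
    cpu["CS"], cpu["EIP"] = vector_table[vector]   # step 6: transfer to the handler

def iret(cpu, stack):
    # Restore the saved state in reverse order (a real handler pops any error code first).
    cpu["EIP"] = stack.pop()
    cpu["CS"] = stack.pop()
    cpu["EFLAGS"] = stack.pop()

if __name__ == "__main__":
    cpu = {"EFLAGS": 0x246, "IF": 1, "TF": 0, "CS": 0x08, "EIP": 0x1000}
    stack, vectors = [], {2: (0x08, 0x2000)}      # vector 2 = NMI handler (no error code)
    enter_interrupt(cpu, stack, vectors, vector=2)
    print(hex(cpu["EIP"]))                        # 0x2000: now executing the handler
    iret(cpu, stack)
    print(hex(cpu["EIP"]), stack)                 # 0x1000 and an empty stack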

14.6 THE ARM PROCESSOR

In this section, we look at some of the key elements of the ARM architecture and organization. We defer a discussion of more complex aspects of organization and pipelining until Chapter 16. For the discussion in this section and in Chapter 16, it is useful to keep in mind key characteristics of the ARM architecture. ARM is primarily a RISC system with the following notable attributes:
• A moderate array of uniform registers, more than are found on some CISC systems but fewer than are found on many RISC systems.
• A load/store model of data processing, in which operations only perform on operands in registers and not directly in memory. All data must be loaded into registers before an operation can be performed; the result can then be used for further processing or stored into memory.
• A uniform fixed-length instruction of 32 bits for the standard set and 16 bits for the Thumb instruction set.
• To make each data processing instruction more flexible, either a shift or rotation can preprocess one of the source registers. To efficiently support this feature, there are separate arithmetic logic unit (ALU) and shifter units.
• A small number of addressing modes, with all load/store addresses determined from registers and instruction fields. Indirect or indexed addressing involving values in memory is not used.
• Auto-increment and auto-decrement addressing modes are used to improve the operation of program loops.

• Conditional execution of instructions minimizes the need for conditional branch instructions, thereby improving pipeline efficiency, because pipeline flushing is reduced.

Processor Organization The ARM processor organization varies substantially from one implementation to the next, particularly when based on different versions of the ARM architecture. However, it is useful for the discussion in this section to present a simplified, generic ARM organization, which is illustrated in Figure 14.25. In this figure, the arrows indi- cate the flow of data. Each box represents a functional hardware unit or a storage unit. Data are exchanged with the processor from external memory through a data bus. The value transferred is either a data item, as a result of a load or store instruction, or an instruction fetch. Fetched instructions pass through an instruc- tion decoder before execution, under control of a control unit. The latter includes

Figure 14.25 Simplified ARM Organization (external memory; memory address and memory buffer registers; incrementer; R15 (PC); sign extend; user register file R0–R15; instruction register; instruction decoder; barrel shifter; ALU; multiply/accumulate unit; control unit; CPSR)

pipeline logic and provides control signals (not shown) to all the hardware ele- ments of the processor. Data items are placed in the register file, consisting of a set of 32-bit registers. Byte or halfword items treated as twos-complement numbers are sign-extended to 32 bits. ARM data processing instructions typically have two source registers, Rn and Rm, and a single result or destination register, Rd. The source register values feed into the ALU or a separate multiply unit that makes use of an additional register to accumulate partial results. The ARM processor also includes a hardware unit that can shift or rotate the Rm value before it enters the ALU. This shift or rotate occurs within the cycle time of the instruction and increases the power and flexibility of many data processing operations. The results of an operation are fed back to the destination register. Load/store instructions may also use the output of the arithmetic units to generate the memory address for a load or store.

Processor Modes
It is quite common for a processor to support only a small number of processor modes. For example, many operating systems make use of just two modes: a user mode and a kernel mode, with the latter mode used to execute privileged system software. In contrast, the ARM architecture provides a flexible foundation for operating systems to enforce a variety of protection policies.
The ARM architecture supports seven execution modes. Most application programs execute in user mode. While the processor is in user mode, the program being executed is unable to access protected system resources or to change mode, other than by causing an exception to occur.
The remaining six execution modes are referred to as privileged modes. These modes are used to run system software. There are two principal advantages to defining so many different privileged modes: (1) The OS can tailor the use of system software to a variety of circumstances, and (2) certain registers are dedicated for use by each of the privileged modes, which allows swifter changes in context.
Five of the privileged modes are known as exception modes. These have full access to system resources, can change modes freely, and are entered when specific exceptions occur. Each of these modes has some dedicated registers that substitute for some of the user mode registers, and which are used to avoid corrupting User mode state information when the exception occurs. The exception modes are as follows:
• Supervisor mode: Usually what the OS runs in. It is entered when the processor encounters a software interrupt instruction. Software interrupts are a standard way to invoke operating system services on ARM.
• Abort mode: Entered in response to memory faults.
• Undefined mode: Entered when the processor attempts to execute an instruction that is supported neither by the main integer core nor by one of the coprocessors.
• Fast interrupt mode: Entered whenever the processor receives an interrupt signal from the designated fast interrupt source. A fast interrupt cannot be interrupted, but a fast interrupt may interrupt a normal interrupt.

• Interrupt mode: Entered whenever the processor receives an interrupt signal from any other interrupt source (other than fast interrupt). An interrupt may only be interrupted by a fast interrupt. The remaining privileged mode is the System mode. This mode is not entered by any exception and uses the same registers available in User mode. The System mode is used for running certain privileged operating system tasks. System mode tasks may be interrupted by any of the five exception categories. Register Organization Figure 14.26 depicts the user-visible registers for the ARM. The ARM processor has a total of 37 32-bit registers, classified as follows: • Thirty-one registers referred to in the ARM manual as general-purpose registers. In fact, some of these, such as the program counters, have special purposes. • Six program status registers. Registers are arranged in partially overlapping banks, with the current proc- essor mode determining which bank is available. At any time, sixteen numbered registers and one or two program status registers are visible, for a total of 17 or 18 software-visible registers. Figure 14.26 is interpreted as follows: • Registers R0 through R7, register R15 (the program counter) and the current program status register (CPSR) are visible in and shared by all modes. • Registers R8 through R12 are shared by all modes except fast interrupt, which has its own dedicated registers R8_fiq through R12_fiq. • All the exception modes have their own versions of registers R13 and R14. • All the exception modes have a dedicated saved program status register (SPSR) GENERAL-PURPOSE REGISTERS Register R13 is normally used as a stack pointer and is also known as the SP. Because each exception mode has a separate R13, each exception mode can have its own dedicated program stack. R14 is known as the link register (LR) and is used to hold subroutine return addresses and exception mode returns. Register R15 is the program counter (PC). PROGRAM STATUS REGISTERS The CPSR is accessible in all processor modes. Each exception mode also has a dedicated SPSR that is used to preserve the value of the CPSR when the associated exception occurs. The 16 most significant bits of the CPSR contain user flags visible in User mode, and which can be used to affect the operation of a program (Figure 14.27). These are as follows: • Condition code flags: The N, Z, C, and V flags, which are discussed in Chapter 12. • Q flag: used to indicate whether overflow and/or saturation has occurred in some SIMD-oriented instructions. • J bit: indicates the use of special 8-bit instructions, known as Jazelle instruc- tions, which are beyond the scope of our discussion. • GE[3:0] bits: SIMD instructions use bits [19:16] as Greater than or Equal (GE) flags for individual bytes or halfwords of the result. 524 CHAPTER 14 / PROCESSOR STRUCTURE AND FUNCTION

Figure 14.26 ARM Register Organization. For each mode, the figure shows registers R0–R15, the CPSR, and (for the exception modes) an SPSR; shading in the original figure indicates where the normal register used by User or System mode is replaced by an alternative register specific to the exception mode (R8_fiq–R14_fiq for Fast interrupt; R13 and R14 for the Supervisor, Abort, Undefined, and Interrupt modes). SP = stack pointer; LR = link register; PC = program counter; CPSR = current program status register; SPSR = saved program status register
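The banking shown in Figure 14.26 can be summarized as a simple mapping from a processor mode and register number to the physical register that is actually accessed. The following is a minimal sketch (Python; the function and mode names are illustrative).

BANKED = {
    "fiq":        range(8, 15),    # R8_fiq .. R14_fiq
    "irq":        range(13, 15),   # R13_irq, R14_irq
    "supervisor": range(13, 15),   # R13_svc, R14_svc
    "abort":      range(13, 15),
    "undefined":  range(13, 15),
}
SUFFIX = {"supervisor": "svc", "abort": "abt", "undefined": "und",
          "irq": "irq", "fiq": "fiq"}

def physical_register(mode, n):
    """Map (mode, Rn) to the physical register selected, per Figure 14.26."""
    if mode in BANKED and n in BANKED[mode]:
        return f"R{n}_{SUFFIX[mode]}"
    return f"R{n}"                      # shared with User/System mode

if __name__ == "__main__":
    print(physical_register("user", 13))        # R13     (the User-mode stack pointer)
    print(physical_register("supervisor", 13))  # R13_svc (a separate Supervisor stack)
    print(physical_register("fiq", 10))         # R10_fiq (FIQ also banks R8-R12)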

The 16 least significant bits of the CPSR contain system control flags that can only be altered when the processor is in a privileged mode. The fields are as follows:
• E bit: Controls load and store endianness for data; ignored for instruction fetches.
• Interrupt disable bits: The A bit disables imprecise data aborts when set; the I bit disables IRQ interrupts when set; and the F bit disables FIQ interrupts when set.
• T bit: Indicates whether instructions should be interpreted as normal ARM instructions or Thumb instructions.
• Mode bits: Indicates the processor mode.

Figure 14.27 Format of ARM CPSR and SPSR (user flags: N, Z, C, V, Q, J, and GE[3:0]; system control flags: E, A, I, F, T, and M[4:0]; remaining bits reserved)

Interrupt Processing
As with any processor, the ARM includes a facility that enables the processor to interrupt the currently executing program to deal with exception conditions. Exceptions are generated by internal and external sources to cause the processor to handle an event. The processor state just before handling the exception is normally preserved so that the original program can be resumed when the exception routine has completed. More than one exception can arise at the same time.
The ARM architecture supports seven types of exception. Table 14.4 lists the types of exception and the processor mode that is used to process each type. When an exception occurs, execution is forced

Table 14.4 ARM Interrupt Vector

Exception Type / Mode / Normal Entry Address / Description
Reset / Supervisor / 0x00000000 / Occurs when the system is initialized.
Data abort / Abort / 0x00000010 / Occurs when an invalid memory address has been accessed, such as if there is no physical memory for an address or the correct access permission is lacking.
FIQ (fast interrupt) / FIQ / 0x0000001C / Occurs when an external device asserts the FIQ pin on the processor. FIQ is designed to support a data transfer or channel process, and has sufficient private registers to remove the need for register saving in such applications, therefore minimizing the overhead of context switching. A fast interrupt cannot be interrupted.
IRQ (interrupt) / IRQ / 0x00000018 / Occurs when an external device asserts the IRQ pin on the processor. An interrupt cannot be interrupted except by an FIQ.
Prefetch abort / Abort / 0x0000000C / Occurs when an attempt to fetch an instruction results in a memory fault. The exception is raised when the instruction enters the execute stage of the pipeline.
Undefined instructions / Undefined / 0x00000004 / Occurs when an instruction not in the instruction set reaches the execute stage of the pipeline.
Software interrupt / Supervisor / 0x00000008 / Generally used to allow user mode programs to call the OS. The user program executes a SWI instruction with an argument that identifies the function the user wishes to perform.

from a fixed memory address corresponding to the type of exception. These fixed addresses are called the exception vectors. If more than one interrupt is outstanding, they are handled in priority order. Table 14.4 lists the exceptions in priority order, highest to lowest. When an exception occurs, the processor halts execution after the current instruc- tion. The state of the processor is preserved in the SPSR that corresponds to the type of exception, so that the original program can be resumed when the exception routine has completed. The address of the instruction the processor was just about to execute is placed in the link register of the appropriate processor mode. To return after han- dling the exception, the SPSR is moved into the CPSR and R14 is moved into the PC.
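A compact way to summarize Table 14.4 and the priority rule above is a dispatch sketch. The following (Python; names are illustrative, and SPSR and link-register handling is shown only schematically) selects the highest-priority pending exception, preserves the CPSR in the corresponding SPSR, and forces execution from the fixed vector address.

VECTORS = {  # exception type: (mode entered, fixed vector address), per Table 14.4
    "reset":              ("supervisor", 0x00000000),
    "data abort":         ("abort",      0x00000010),
    "fiq":                ("fiq",        0x0000001C),
    "irq":                ("irq",        0x00000018),
    "prefetch abort":     ("abort",      0x0000000C),
    "undefined":          ("undefined",  0x00000004),
    "software interrupt": ("supervisor", 0x00000008),
}
PRIORITY = list(VECTORS)     # Table 14.4 lists exceptions highest priority first

def take_exception(cpu, pending):
    """Handle the highest-priority pending exception."""
    exc = min(pending, key=PRIORITY.index)
    mode, vector = VECTORS[exc]
    cpu["SPSR_" + mode] = cpu["CPSR"]        # preserve the current status register
    cpu["LR_" + mode] = cpu["PC"]            # return address in the banked link register
    cpu["mode"], cpu["PC"] = mode, vector    # switch mode and fetch from the vector
    return exc

if __name__ == "__main__":
    cpu = {"CPSR": 0x10, "PC": 0x8000, "mode": "user"}
    print(take_exception(cpu, {"irq", "fiq"}))   # fiq wins: higher priority
    print(hex(cpu["PC"]), cpu["mode"])           # 0x1c fiq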

14.7 RECOMMENDED READING

[PATT01] and [MOSH01] provide excellent coverage of the pipelining issues discussed in this chapter. [HENN91] contains a detailed discussion of pipelining. [SOHI90] provides an excellent, detailed discussion of the hardware design issues involved in an instruction pipeline. [RAMA77] is a classic paper on the subject, still well worth reading.
[EVER01] examines the evolution of branch prediction strategies. [CRAG92] is a detailed study of branch prediction in instruction pipelines. [DUBE91] and [LILJ88] examine various branch prediction strategies that can be used to enhance the performance of instruction pipelining. [KAEL91] examines the difficulty introduced into branch prediction by instructions whose target address is variable.
[BREY09] provides good coverage of interrupt processing on the x86. [FOG08b] provides a detailed discussion of pipeline architecture for the x86 family.

BREY09 Brey, B. The Intel Microprocessors: 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro Processor, Pentium II, Pentium III, Pentium 4 and Core2 with 64-bit Extensions. Upper Saddle River, NJ: Prentice Hall, 2009.
CRAG92 Cragon, H. Branch Strategy Taxonomy and Performance Models. Los Alamitos, CA: IEEE Computer Society Press, 1992.
DUBE91 Dubey, P., and Flynn, M. "Branch Strategies: Modeling and Optimization." IEEE Transactions on Computers, October 1991.
EVER01 Evers, M., and Yeh, T. "Understanding Branches and Designing Branch Predictors for High-Performance Microprocessors." Proceedings of the IEEE, November 2001.
FOG08b Fog, A. The Microarchitecture of Intel and AMD CPUs. Copenhagen University College of Engineering, 2008. www.agner.org/optimize/
HENN91 Hennessy, J., and Jouppi, N. "Computer Technology and Architecture: An Evolving Interaction." Computer, September 1991.
KAEL91 Kaeli, D., and Emma, P. "Branch History Table Prediction of Moving Target Branches Due to Subroutine Returns." Proceedings, 18th Annual International Symposium on Computer Architecture, May 1991.
LILJ88 Lilja, D. "Reducing the Branch Penalty in Pipelined Processors." Computer, July 1988.
MOSH01 Moshovos, A., and Sohi, G. "Microarchitectural Innovations: Boosting Microprocessor Performance Beyond Semiconductor Technology Scaling." Proceedings of the IEEE, November 2001.

PATT01 Patt, Y. "Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution." Proceedings of the IEEE, November 2001.
RAMA77 Ramamoorthy, C. "Pipeline Architecture." Computing Surveys, March 1977.
SOHI90 Sohi, G. "Instruction Issue Logic for High-Performance Interruptable, Multiple Functional Unit, Pipelined Computers." IEEE Transactions on Computers, March 1990.

14.8 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

branch prediction, condition code, delayed branch, flag, instruction cycle, instruction pipeline, instruction prefetch, program status word (PSW)

Review Questions
14.1 What general roles are performed by processor registers?
14.2 What categories of data are commonly supported by user-visible registers?
14.3 What is the function of condition codes?
14.4 What is a program status word?
14.5 Why is a two-stage instruction pipeline unlikely to cut the instruction cycle time in half, compared with the use of no pipeline?
14.6 List and briefly explain various ways in which an instruction pipeline can deal with conditional branch instructions.
14.7 How are history bits used for branch prediction?

Problems
14.1 a. If the last operation performed on a computer with an 8-bit word was an addition in which the two operands were 00000010 and 00000011, what would be the value of the following flags?
• Carry
• Zero
• Overflow
• Sign
• Even Parity
• Half-Carry
b. Repeat for the addition of -1 (twos complement) and +1.
14.2 Repeat Problem 14.1 for the operation A - B, where A contains 11110000 and B contains 0010100.
14.3 A microprocessor is clocked at a rate of 5 GHz.
a. How long is a clock cycle?
b. What is the duration of a particular type of machine instruction consisting of three clock cycles?
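As a companion to Problems 14.1 and 14.2, the following C sketch shows one common way such flags can be computed for an 8-bit addition. Flag definitions vary from processor to processor, so treat these particular formulas as illustrative assumptions rather than a specification.

    #include <stdio.h>
    #include <stdint.h>

    /* Compute typical condition codes for the 8-bit addition a + b (illustrative). */
    static void add8_flags(uint8_t a, uint8_t b)
    {
        uint16_t wide = (uint16_t)a + (uint16_t)b;
        uint8_t  r    = (uint8_t)wide;

        int carry    = wide > 0xFF;                       /* carry out of bit 7           */
        int zero     = (r == 0);
        int sign     = (r & 0x80) != 0;                   /* bit 7 of the result          */
        int overflow = (~(a ^ b) & (a ^ r) & 0x80) != 0;  /* twos-complement overflow     */
        int half     = ((a & 0x0F) + (b & 0x0F)) > 0x0F;  /* carry from bit 3 into bit 4  */

        int ones = 0;                                     /* count 1 bits for parity      */
        for (uint8_t v = r; v != 0; v >>= 1)
            ones += v & 1;
        int even_parity = (ones % 2) == 0;

        printf("%02X + %02X = %02X  C=%d Z=%d V=%d S=%d P=%d H=%d\n",
               a, b, r, carry, zero, overflow, sign, even_parity, half);
    }

    int main(void)
    {
        add8_flags(0x02, 0x03);   /* operands of Problem 14.1a                 */
        add8_flags(0xFF, 0x01);   /* -1 + 1 in twos complement (Problem 14.1b) */
        return 0;
    }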

14.4 A microprocessor provides an instruction capable of moving a string of bytes from one area of memory to another. The fetching and initial decoding of the instruction takes 10 clock cycles. Thereafter, it takes 15 clock cycles to transfer each byte. The microprocessor is clocked at a rate of 10 GHz.
a. Determine the length of the instruction cycle for the case of a string of 64 bytes.
b. What is the worst-case delay for acknowledging an interrupt if the instruction is noninterruptible?
c. Repeat part (b) assuming the instruction can be interrupted at the beginning of each byte transfer.
14.5 The Intel 8088 consists of a bus interface unit (BIU) and an execution unit (EU), which form a 2-stage pipeline. The BIU fetches instructions into a 4-byte instruction queue. The BIU also participates in address calculations, fetches operands, and writes results in memory as requested by the EU. If no such requests are outstanding and the bus is free, the BIU fills any vacancies in the instruction queue. When the EU completes execution of an instruction, it passes any results to the BIU (destined for memory or I/O) and proceeds to the next instruction.
a. Suppose the tasks performed by the BIU and EU take about equal time. By what factor does pipelining improve the performance of the 8088? Ignore the effect of branch instructions.
b. Repeat the calculation assuming that the EU takes twice as long as the BIU.
14.6 Assume an 8088 is executing a program in which the probability of a program jump is 0.1. For simplicity, assume that all instructions are 2 bytes long.
a. What fraction of instruction fetch bus cycles is wasted?
b. Repeat if the instruction queue is 8 bytes long.
14.7 Consider the timing diagram of Figure 14.10. Assume that there is only a two-stage pipeline (fetch, execute). Redraw the diagram to show how many time units are now needed for four instructions.
14.8 Assume a pipeline with four stages: fetch instruction (FI), decode instruction and calculate addresses (DA), fetch operand (FO), and execute (EX). Draw a diagram similar to Figure 14.10 for a sequence of 7 instructions, in which the third instruction is a branch that is taken and in which there are no data dependencies.
14.9 A pipelined processor has a clock rate of 2.5 GHz and executes a program with 1.5 million instructions. The pipeline has five stages, and instructions are issued at a rate of one per clock cycle. Ignore penalties due to branch instructions and out-of-sequence executions.
a. What is the speedup of this processor for this program compared to a nonpipelined processor, making the same assumptions used in Section 14.4?
b. What is the throughput (in MIPS) of the pipelined processor?
14.10 A nonpipelined processor has a clock rate of 2.5 GHz and an average CPI (cycles per instruction) of 4. An upgrade to the processor introduces a five-stage pipeline. However, due to internal pipeline delays, such as latch delay, the clock rate of the new processor has to be reduced to 2 GHz.
a. What is the speedup achieved for a typical program?
b. What is the MIPS rate for each processor?
14.11 Consider an instruction sequence of length n that is streaming through the instruction pipeline. Let p be the probability of encountering a conditional or unconditional branch instruction, and let q be the probability that execution of a branch instruction I causes a jump to a nonconsecutive address.
Assume that each such jump requires the pipeline to be cleared, destroying all ongoing instruction processing, when I emerges from the last stage. Revise Equations (14.1) and (14.2) to take these probabilities into account.
14.12 One limitation of the multiple-stream approach to dealing with branches in a pipeline is that additional branches will be encountered before the first branch is resolved. Suggest two additional limitations or drawbacks.
14.13 Consider the state diagrams of Figure 14.28.
a. Describe the behavior of each.
b. Compare these with the branch prediction state diagram in Section 14.4. Discuss the relative merits of each of the three approaches to branch prediction.

[Figure 14.28 Two Branch Prediction State Diagrams: each diagram shows a four-state predictor whose transitions are labeled "taken" and "not taken"; each state is labeled with the prediction ("predict taken" or "predict not taken") made while in that state.]
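For comparison with the two diagrams of Figure 14.28, the conventional two-bit scheme described in Section 14.4 can be sketched as follows. This is a minimal C illustration of that scheme, not a reproduction of either diagram in the figure.

    /* Conventional 2-bit saturating-counter branch predictor (illustrative sketch). */
    enum { STRONG_NOT = 0, WEAK_NOT = 1, WEAK_TAKEN = 2, STRONG_TAKEN = 3 };

    typedef struct { int state; } predictor2bit;

    /* Prediction: taken in the two upper states, not taken in the two lower ones. */
    int predict(const predictor2bit *p)
    {
        return p->state >= WEAK_TAKEN;
    }

    /* Update: move one state toward the actual outcome, saturating at the ends.
       Two successive mispredictions are needed to flip a strong prediction. */
    void update(predictor2bit *p, int taken)
    {
        if (taken && p->state < STRONG_TAKEN)
            p->state++;
        else if (!taken && p->state > STRONG_NOT)
            p->state--;
    }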

14.14 The Motorola 680x0 machines include the instruction Decrement and Branch According to Condition, which has the following form:
DBcc Dn, displacement
where cc is one of the testable conditions, Dn is a general-purpose register, and displacement specifies the target address relative to the current address. The instruction can be defined as follows:
if (cc = False) then
begin
    Dn := (Dn) - 1;
    if Dn ≠ -1 then PC := (PC) + displacement
end
else PC := (PC) + 2;
When the instruction is executed, the condition is first tested to determine whether the termination condition for the loop is satisfied. If so, no operation is performed and execution continues at the next instruction in sequence. If the condition is false, the specified data register is decremented and checked to see if it is less than zero. If it is less than zero, the loop is terminated and execution continues at the next instruction in sequence. Otherwise, the program branches to the specified location. Now consider the following assembly-language program fragment:
AGAIN   CMPM.L  (A0)+, (A1)+
        DBNE    D1, AGAIN
        NOP
Two strings addressed by A0 and A1 are compared for equality; the string pointers are incremented with each reference. D1 initially contains the number of longwords (4 bytes) to be compared.
a. The initial contents of the registers are A0 = $00004000, A1 = $00005000, and D1 = $000000FF (the $ indicates hexadecimal notation). Memory between $4000 and $6000 is loaded with words $AAAA. If the foregoing program is run, specify the number of times the DBNE loop is executed and the contents of the three registers when the NOP instruction is reached.
b. Repeat (a), but now assume that memory between $4000 and $4FEE is loaded with $0000 and between $5000 and $6000 is loaded with $AAAA.

Table 14.5 Branch Behavior in Sample Applications

Occurrence of branch classes:
    Type 1: Branch                              72.5%
    Type 2: Loop control                         9.8%
    Type 3: Procedure call, return              17.7%

Type 1 branch (where it goes)                Scientific   Commercial   Systems
    Unconditional (100% go to target)           20%          40%         35%
    Conditional, went to target                 43.2%        24.3%       32.5%
    Conditional, did not go to target (inline)  36.8%        35.7%       32.5%

Type 2 branch (all environments)
    That go to target                           91%
    That go inline                               9%

Type 3 branch
    100% go to target

14.15 Redraw Figure 14.19c, assuming that the conditional branch is not taken.
14.16 Table 14.5 summarizes statistics from [MACD84] concerning branch behavior for various classes of applications. With the exception of type 1 branch behavior, there is no noticeable difference among the application classes. Determine the fraction of all branches that go to the branch target address for the scientific environment. Repeat for the commercial and systems environments.
14.17 Pipelining can be applied within the ALU to speed up floating-point operations. Consider the case of floating-point addition and subtraction. In simplified terms, the pipeline could have four stages: (1) Compare the exponents; (2) Choose the exponent and align the significands; (3) Add or subtract significands; (4) Normalize the results. The pipeline can be considered to have two parallel threads, one handling exponents and one handling significands, and could start out like this:

[Diagram: two parallel pipeline threads, one for the exponents (inputs a, b) and one for the significands (inputs A, B), each feeding a set of registers labeled R that hold temporary results.]

In this figure, the boxes labeled R refer to a set of registers used to hold temporary results. Complete the block diagram that shows at a top level the structure of the pipeline.

CHAPTER 15
REDUCED INSTRUCTION SET COMPUTERS

15.1 Instruction Execution Characteristics
    Operations
    Operands
    Procedure Calls
    Implications
15.2 The Use of a Large Register File
    Register Windows
    Global Variables
    Large Register File versus Cache
15.3 Compiler-Based Register Optimization
15.4 Reduced Instruction Set Architecture
    Why CISC
    Characteristics of Reduced Instruction Set Architectures
    CISC versus RISC Characteristics
15.5 RISC Pipelining
    Pipelining with Regular Instructions
    Optimization of Pipelining
15.6 MIPS R4000
    Instruction Set
    Instruction Pipeline
15.7 SPARC
    SPARC Register Set
    Instruction Set
    Instruction Format
15.8 RISC versus CISC Controversy
15.9 Recommended Reading
15.10 Key Terms, Review Questions, and Problems

LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Provide an overview of research results on instruction execution characteristics that motivated the development of the RISC approach.
• Summarize the key characteristics of RISC machines.
• Understand the design and performance implications of using a large register file.
• Understand the use of compiler-based register optimization to improve performance.
• Discuss the implications of a RISC architecture for pipeline design and performance.
• List and explain key approaches to pipeline optimization on a RISC machine.

Since the development of the stored-program computer around 1950, there have been remarkably few true innovations in the areas of computer organization and architecture. The following are some of the major advances since the birth of the computer:
• The family concept: Introduced by IBM with its System/360 in 1964, followed shortly thereafter by DEC, with its PDP-8. The family concept decouples the architecture of a machine from its implementation. A set of computers is offered, with different price/performance characteristics, that presents the same architecture to the user. The differences in price and performance are due to different implementations of the same architecture.
• Microprogrammed control unit: Suggested by Wilkes in 1951 and introduced by IBM on the S/360 line in 1964. Microprogramming eases the task of designing and implementing the control unit and provides support for the family concept.
• Cache memory: First introduced commercially on the IBM S/360 Model 85 in 1968. The insertion of this element into the memory hierarchy dramatically improves performance.
• Pipelining: A means of introducing parallelism into the essentially sequential nature of a machine-instruction program. Examples are instruction pipelining and vector processing.
• Multiple processors: This category covers a number of different organizations and objectives.
• Reduced instruction set computer (RISC) architecture: This is the focus of this chapter. When it appeared, RISC architecture was a dramatic departure from the historical trend in processor architecture. An analysis of the RISC architecture brings into focus many of the important issues in computer organization and architecture.

Although RISC architectures have been defined and designed in a variety of ways by different groups, the key elements shared by most designs are these:
• A large number of general-purpose registers, and/or the use of compiler technology to optimize register usage
• A limited and simple instruction set
• An emphasis on optimizing the instruction pipeline
Table 15.1 compares several RISC and non-RISC systems. We begin this chapter with a brief survey of some results on instruction sets, and then examine each of the three topics just listed. This is followed by a description of two of the best-documented RISC designs.

15.1 INSTRUCTION EXECUTION CHARACTERISTICS

One of the most visible forms of evolution associated with computers is that of programming languages. As the cost of hardware has dropped, the relative cost of software has risen. Along with that, a chronic shortage of programmers has driven up software costs in absolute terms. Thus, the major cost in the life cycle of a system is software, not hardware. Adding to the cost, and to the inconvenience, is the element of unreliability: it is common for programs, both system and application, to continue to exhibit new bugs after years of operation.
The response from researchers and industry has been to develop ever more powerful and complex high-level programming languages. These high-level languages (HLLs): (1) allow the programmer to express algorithms more concisely, (2) allow the compiler to take care of details that are not important in the programmer's expression of algorithms, and (3) often support naturally the use of structured programming and/or object-oriented design.
Alas, this solution gave rise to a perceived problem, known as the semantic gap, the difference between the operations provided in HLLs and those provided in computer architecture. Symptoms of this gap are alleged to include execution inefficiency, excessive machine program size, and compiler complexity. Designers responded with architectures intended to close this gap. Key features include large instruction sets, dozens of addressing modes, and various HLL statements implemented in hardware. An example of the latter is the CASE machine instruction on the VAX. Such complex instruction sets are intended to
• Ease the task of the compiler writer.
• Improve execution efficiency, because complex sequences of operations can be implemented in microcode.
• Provide support for even more complex and sophisticated HLLs.
Meanwhile, a number of studies have been done over the years to determine the characteristics and patterns of execution of machine instructions generated from HLL programs. The results of these studies inspired some researchers to look for a different approach: namely, to make the architecture that supports the HLL simpler, rather than more complex.

Table 15.1 Characteristics of Some CISCs, RISCs, and Superscalar Processors

                            Complex Instruction Set      Reduced Instruction      Superscalar
                            Computer (CISC)              Set Computer (RISC)
Characteristic              IBM       VAX      Intel     SPARC     MIPS           PowerPC   Ultra     MIPS
                            370/168   11/780   80486               R4000                    SPARC     R10000
Year developed              1973      1978     1989      1987      1991           1993      1996      1996
Number of instructions      208       303      235       69        94             225       —         —
Instruction size (bytes)    2–6       2–57     1–11      4         4              4         4         4
Addressing modes            4         22       11        1         1              2         1         1
Number of general-purpose
registers                   16        16       8         40–520    32             32        40–520    32
Control memory size (Kbits) 420       480      246       —         —              —         —         —
Cache size (Kbytes)         64        64       8         32        128            16–32     32        64

To understand the line of reasoning of the RISC advocates, we begin with a brief review of instruction execution characteristics. The aspects of computation of interest are as follows:
• Operations performed: These determine the functions to be performed by the processor and its interaction with memory.
• Operands used: The types of operands and the frequency of their use determine the memory organization for storing them and the addressing modes for accessing them.
• Execution sequencing: This determines the control and pipeline organization.
In the remainder of this section, we summarize the results of a number of studies of high-level-language programs. All of the results are based on dynamic measurements. That is, measurements are collected by executing the program and counting the number of times some feature has appeared or a particular property has held true. In contrast, static measurements merely perform these counts on the source text of a program. They give no useful information on performance, because they are not weighted relative to the number of times each statement is executed.
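The distinction can be made concrete with a trivial fragment. In the C sketch below (the numbers are arbitrary), the assignment inside the loop appears once in the source text, so its static count is 1, but it contributes 100 occurrences to a dynamic measurement.

    #include <stdio.h>

    int main(void)
    {
        int sum = 0;
        long dynamic_count = 0;          /* incremented each time the statement runs */

        for (int i = 0; i < 100; i++) {
            sum += i;                    /* static count: 1 occurrence in the source */
            dynamic_count++;             /* dynamic count: 100 executions            */
        }
        printf("sum=%d, dynamic occurrences of the assignment=%ld\n", sum, dynamic_count);
        return 0;
    }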

Operations
A variety of studies have been made to analyze the behavior of HLL programs. Table 4.8, discussed in Chapter 4, includes key results from a number of studies. There is quite good agreement in the results of this mixture of languages and applications. Assignment statements predominate, suggesting that the simple movement of data is of high importance. There is also a preponderance of conditional statements (IF, LOOP). These statements are implemented in machine language with some sort of compare and branch instruction. This suggests that the sequence control mechanism of the instruction set is important.
These results are instructive to the machine instruction set designer, indicating which types of statements occur most often and therefore should be supported in an "optimal" fashion. However, these results do not reveal which statements use the most time in the execution of a typical program. That is, we want to answer the question: Given a compiled machine-language program, which statements in the source language cause the execution of the most machine-language instructions and what is the execution time of these instructions?
To get at this underlying phenomenon, the Patterson programs [PATT82a], described in Appendix 4A, were compiled on the VAX, PDP-11, and Motorola 68000 to determine the average number of machine instructions and memory references per statement type. The second and third columns in Table 15.2 show the relative frequency of occurrence of various HLL statements in a variety of programs; the data were obtained by observing the occurrences in running programs rather than just the number of times that statements occur in the source text. Hence these metrics capture dynamic behavior. To obtain the data in columns four and five (machine-instruction weighted), each value in the second and third columns is multiplied by the number of machine instructions produced by the compiler. These results are then normalized so that columns four and five show the relative frequency of occurrence, weighted by the number of machine instructions per HLL

Table 15.2 Weighted Relative Dynamic Frequency of HLL Operations [PATT82a]

           Dynamic              Machine-Instruction      Memory-Reference
           Occurrence           Weighted                 Weighted
           Pascal      C        Pascal      C            Pascal      C
ASSIGN     45%         38%      13%         13%          14%         15%
LOOP        5%          3%      42%         32%          33%         26%
CALL       15%         12%      31%         33%          44%         45%
IF         29%         43%      11%         21%           7%         13%
GOTO       —            3%      —           —            —           —
OTHER       6%          1%       3%          1%           2%          1%

statement. Similarly, the sixth and seventh columns are obtained by multiplying the frequency of occurrence of each statement type by the relative number of memory references caused by each statement. The data in columns four through seven provide surrogate measures of the actual time spent executing the various statement types. The results suggest that the procedure call/return is the most time-consuming operation in typical HLL programs.
The reader should be clear on the significance of Table 15.2. This table indicates the relative performance impact of various statement types in an HLL, when that HLL is compiled for a typical contemporary instruction set architecture. Some other architecture could conceivably produce different results. However, this study produces results that are representative for contemporary complex instruction set computer (CISC) architectures. Thus, they can provide guidance to those looking for more efficient ways to support HLLs.
Operands
Much less work has been done on the occurrence of types of operands, despite the importance of this topic. There are several aspects that are significant.
The Patterson study already referenced [PATT82a] also looked at the dynamic frequency of occurrence of classes of variables (Table 15.3). The results, consistent between Pascal and C programs, show that most references are to simple scalar variables. Further, more than 80% of the scalars were local (to the procedure) variables. In addition, each reference to an array or a structure requires a reference to an index or pointer, which again is usually a local scalar. Thus, there is a preponderance of references to scalars, and these are highly localized.
The Patterson study examined the dynamic behavior of HLL programs, independent of the underlying architecture. As discussed before, it is necessary

Table 15.3 Dynamic Percentage of Operands

                    Pascal      C       Average
Integer Constant    16%         23%     20%
Scalar Variable     58%         53%     55%
Array/Structure     26%         24%     25%

to deal with actual architectures to examine program behavior more deeply. One study, [LUND77], examined DEC-10 instructions dynamically and found that each instruction on the average references 0.5 operand in memory and 1.4 registers. Similar results are reported in [HUCK83] for C, Pascal, and FORTRAN programs on S/370, PDP-11, and VAX. Of course, these figures depend highly on both the architecture and the compiler, but they do illustrate the frequency of operand accessing.
These latter studies suggest the importance of an architecture that lends itself to fast operand accessing, because this operation is performed so frequently. The Patterson study suggests that a prime candidate for optimization is the mechanism for storing and accessing local scalar variables.
Procedure Calls
We have seen that procedure calls and returns are an important aspect of HLL programs. The evidence (Table 15.2) suggests that these are the most time-consuming operations in compiled HLL programs. Thus, it will be profitable to consider ways of implementing these operations efficiently. Two aspects are significant: the number of parameters and variables that a procedure deals with, and the depth of nesting.
Tanenbaum's study [TANE78] found that 98% of dynamically called procedures were passed fewer than six arguments and that 92% of them used fewer than six local scalar variables. Similar results were reported by the Berkeley RISC team [KATE83], as shown in Table 15.4. These results show that the number of words required per procedure activation is not large. The studies reported earlier indicated that a high proportion of operand references is to local scalar variables. These studies show that those references are in fact confined to relatively few variables.
The same Berkeley group also looked at the pattern of procedure calls and returns in HLL programs. They found that it is rare to have a long uninterrupted sequence of procedure calls followed by the corresponding sequence of returns. Rather, they found that a program remains confined to a rather narrow window of procedure-invocation depth. This is illustrated in Figure 4.21, which was discussed in Chapter 4. These results reinforce the conclusion that operand references are highly localized.
Implications
A number of groups have looked at results such as those just reported and have concluded that the attempt to make the instruction set architecture close to HLLs

Table 15.4 Procedure Arguments and Local Scalar Variables

Percentage of Executed                            Compiler, Interpreter,    Small Nonnumeric
Procedure Calls With                              and Typesetter            Programs
>3 arguments                                      0–7%                      0–5%
>5 arguments                                      0–3%                      0%
>8 words of arguments and local scalars           1–20%                     0–6%
>12 words of arguments and local scalars          1–6%                      0–3%

is not the most effective design strategy. Rather, the HLLs can best be supported by optimizing performance of the most time-consuming features of typical HLL programs.
Generalizing from the work of a number of researchers, three elements emerge that, by and large, characterize RISC architectures. First, use a large number of registers or use a compiler to optimize register usage. This is intended to optimize operand referencing. The studies just discussed show that there are several references per HLL statement and that there is a high proportion of move (assignment) statements. This, coupled with the locality and predominance of scalar references, suggests that performance can be improved by reducing memory references at the expense of more register references. Because of the locality of these references, an expanded register set seems practical.
Second, careful attention needs to be paid to the design of instruction pipelines. Because of the high proportion of conditional branch and procedure call instructions, a straightforward instruction pipeline will be inefficient. This manifests itself as a high proportion of instructions that are prefetched but never executed.
Finally, an instruction set consisting of high-performance primitives is indicated. Instructions should have predictable costs (measured in execution time and code size, and increasingly, in energy dissipation) and be consistent with a high-performance implementation (which harmonizes with predictable execution-time cost).

15.2 THE USE OF A LARGE REGISTER FILE

The results summarized in Section 15.1 point out the desirability of quick access to operands. We have seen that there is a large proportion of assignment statements in HLL programs, and many of these are of the simple form A ← B. Also, there is a significant number of operand accesses per HLL statement. If we couple these results with the fact that most accesses are to local scalars, heavy reliance on register storage is suggested.
The reason that register storage is indicated is that it is the fastest available storage device, faster than both main memory and cache. The register file is physically small, on the same chip as the ALU and control unit, and employs much shorter addresses than addresses for cache and memory. Thus, a strategy is needed that will allow the most frequently accessed operands to be kept in registers and to minimize register-memory operations.
Two basic approaches are possible, one based on software and the other on hardware. The software approach is to rely on the compiler to maximize register usage. The compiler will attempt to assign registers to those variables that will be used the most in a given time period. This approach requires the use of sophisticated program-analysis algorithms. The hardware approach is simply to use more registers so that more variables can be held in registers for longer periods of time.
In this section, we will discuss the hardware approach. This approach has been pioneered by the Berkeley RISC group [PATT82a]; was used in the first commercial RISC product, the Pyramid [RAGA83]; and is currently used in the popular SPARC architecture.

Register Windows
On the face of it, the use of a large set of registers should decrease the need to access memory. The design task is to organize the registers in such a fashion that this goal is realized.
Because most operand references are to local scalars, the obvious approach is to store these in registers, with perhaps a few registers reserved for global variables. The problem is that the definition of local changes with each procedure call and return, operations that occur frequently. On every call, local variables must be saved from the registers into memory, so that the registers can be reused by the called procedure. Furthermore, parameters must be passed. On return, the variables of the calling procedure must be restored (loaded back into registers) and results must be passed back to the calling procedure.
The solution is based on two other results reported in Section 15.1. First, a typical procedure employs only a few passed parameters and local variables (Table 15.4). Second, the depth of procedure activation fluctuates within a relatively narrow range (Figure 4.21). To exploit these properties, multiple small sets of registers are used, each assigned to a different procedure. A procedure call automatically switches the processor to use a different fixed-size window of registers, rather than saving registers in memory. Windows for adjacent procedures are overlapped to allow parameter passing.
The concept is illustrated in Figure 15.1. At any time, only one window of registers is visible and is addressable as if it were the only set of registers (e.g., addresses 0 through N - 1). The window is divided into three fixed-size areas. Parameter registers hold parameters passed down from the procedure that called the current procedure and hold results to be passed back up. Local registers are used for local variables, as assigned by the compiler. Temporary registers are used to exchange parameters and results with the next lower level (procedure called by current procedure). The temporary registers at one level are physically the same as the parameter registers at the next lower level. This overlap permits parameters to be passed without the actual movement of data. Keep in mind that, except for the overlap, the registers at two different levels are physically distinct. That is, the parameter and local registers at level J are disjoint from the local and temporary registers at level J + 1.
To handle any possible pattern of calls and returns, the number of register windows would have to be unbounded. Instead, the register windows can be used to hold the few most recent procedure activations. Older activations must be saved

[Figure 15.1 Overlapping Register Windows: the window at level J consists of parameter registers, local registers, and temporary registers; on a call or return, the temporary registers of level J are physically the same as the parameter registers of the window at level J + 1.]

[Figure 15.2 Circular-Buffer Organization of Overlapped Windows: six windows, w0 through w5, arranged in a circle. Procedures A, B, C, and D occupy w0 through w3 (A.temp overlapping B.param, B.temp overlapping C.param, and so on), with the current window pointer at D's window (w3) and the saved window pointer at w0. A call advances the current window pointer toward the windows that would hold E and F; a return backs it up.]

in memory and later restored when the nesting depth decreases. Thus, the actual organization of the register file is as a circular buffer of overlapping windows. Two notable examples of this approach are Sun's SPARC architecture, described in Section 15.7, and the IA-64 architecture used in Intel's Itanium processor.
The circular organization is shown in Figure 15.2, which depicts a circular buffer of six windows. The buffer is filled to a depth of 4 (A called B; B called C; C called D) with procedure D active. The current-window pointer (CWP) points to the window of the currently active procedure. Register references by a machine instruction are offset by this pointer to determine the actual physical register. The saved-window pointer (SWP) identifies the window most recently saved in memory. If procedure D now calls procedure E, arguments for E are placed in D's temporary registers (the overlap between w3 and w4) and the CWP is advanced by one window.
If procedure E then makes a call to procedure F, the call cannot be made with the current status of the buffer. This is because F's window overlaps A's window. If F begins to load its temporary registers, preparatory to a call, it will overwrite the parameter registers of A (A.in). Thus, when CWP is incremented (modulo 6) so that it becomes equal to SWP, an interrupt occurs, and A's window is saved. Only the first two portions (A.in and A.loc) need be saved. Then, the SWP is incremented and the call to F proceeds. A similar interrupt can occur on returns. For example, subsequent to the activation of F, when B returns to A, CWP is decremented and becomes equal to SWP. This causes an interrupt that results in the restoration of A's window.
From the preceding, it can be seen that an N-window register file can hold only N - 1 procedure activations. The value of N need not be large. As was mentioned in Appendix 4A, one study [TAMI83] found that, with 8 windows, a save or restore is needed on only 1% of the calls or returns. The Berkeley RISC computers use 8 windows of 16 registers each. The Pyramid computer employs 16 windows of 32 registers each.
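The overflow and underflow behavior just described can be summarized in a short simulation. The following C sketch is a deliberately simplified model: it counts resident and spilled activations rather than modeling the registers themselves, and the window count follows Figure 15.2. In hardware, the same conditions are detected by comparing the current window pointer with the saved window pointer.

    #include <stdio.h>

    #define N_WINDOWS 6         /* circular buffer of windows, as in Figure 15.2 */

    static int cwp = 0;         /* current window pointer, advances modulo N_WINDOWS        */
    static int resident = 1;    /* procedure activations currently held in register windows */
    static int saved = 0;       /* activations whose windows have been spilled to memory    */

    /* Procedure call: advance CWP; at most N-1 activations can be resident at once. */
    void window_call(void)
    {
        cwp = (cwp + 1) % N_WINDOWS;
        if (resident == N_WINDOWS - 1) {
            printf("window overflow: save oldest window to memory\n");
            saved++;
        } else {
            resident++;
        }
    }

    /* Procedure return: back up CWP; reload the caller's window if it was spilled. */
    void window_return(void)
    {
        cwp = (cwp + N_WINDOWS - 1) % N_WINDOWS;
        if (resident > 1) {
            resident--;
        } else if (saved > 0) {
            printf("window underflow: restore caller's window from memory\n");
            saved--;
        }
    }

    int main(void)
    {
        int depth = 8;          /* nest deeper than N-1 windows to force spills */
        for (int i = 0; i < depth; i++) window_call();
        for (int i = 0; i < depth; i++) window_return();
        return 0;
    }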

Global Variables
The window scheme just described provides an efficient organization for storing local scalar variables in registers. However, this scheme does not address the need to store global variables, those accessed by more than one procedure.
Two options suggest themselves. First, variables declared as global in an HLL can be assigned memory locations by the compiler, and all machine instructions that reference these variables will use memory-reference operands. This is straightforward, from both the hardware and software (compiler) points of view. However, for frequently accessed global variables, this scheme is inefficient.
An alternative is to incorporate a set of global registers in the processor. These registers would be fixed in number and available to all procedures. A unified numbering scheme can be used to simplify the instruction format. For example, references to registers 0 through 7 could refer to unique global registers, and references to registers 8 through 31 could be offset to refer to physical registers in the current window. There is an increased hardware burden to accommodate the split in register addressing. In addition, the linker must decide which global variables should be assigned to registers.
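As a concrete illustration of the unified numbering scheme, the following C sketch maps a logical register number to a physical register, using the 8-global / 24-windowed split mentioned above. The window size and window count are assumptions for illustration, and the overlap between adjacent windows is omitted.

    #define N_GLOBAL     8      /* logical registers 0-7 are global             */
    #define WINDOW_SIZE  24     /* logical registers 8-31 map into the window   */
    #define N_WINDOWS    8      /* physical windows in the circular buffer      */

    /* Map a logical register number (0-31) to a physical register index,
       given the current window pointer (0 .. N_WINDOWS-1). */
    int physical_register(int logical, int cwp)
    {
        if (logical < N_GLOBAL)
            return logical;                                  /* global register */
        /* Windowed register: offset into the current window. */
        return N_GLOBAL + cwp * WINDOW_SIZE + (logical - N_GLOBAL);
    }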

Large Register File versus Cache
The register file, organized into windows, acts as a small, fast buffer for holding a subset of all variables that are likely to be used the most heavily. From this point of view, the register file acts much like a cache memory, although a much faster memory. The question therefore arises as to whether it would be simpler and better to use a cache and a small traditional register file.
Table 15.5 compares characteristics of the two approaches. The window-based register file holds all the local scalar variables (except in the rare case of window overflow) of the most recent N - 1 procedure activations. The cache holds a selection of recently used scalar variables. The register file should save time, because all local scalar variables are retained. On the other hand, the cache may make more efficient use of space, because it is reacting to the situation dynamically. Furthermore, caches generally treat all memory references alike, including instructions and other types of data. Thus, savings in these other areas are possible with a cache and not a register file.

Table 15.5 Characteristics of Large-Register-File and Cache Organizations

Large Register File                                        Cache
All local scalars                                          Recently used local scalars
Individual variables                                       Blocks of memory
Compiler-assigned global variables                         Recently used global variables
Save/restore based on procedure nesting depth              Save/restore based on cache replacement algorithm
Register addressing                                        Memory addressing
Multiple operands addressed and accessed in one cycle      One operand addressed and accessed per cycle

A register file may make inefficient use of space, because not all procedures will need the full window space allotted to them. On the other hand, the cache suffers from another sort of inefficiency: Data are read into the cache in blocks. Whereas the register file contains only those variables in use, the cache reads in a block of data, some or much of which will not be used.
The cache is capable of handling global as well as local variables. There are usually many global scalars, but only a few of them are heavily used [KATE83]. A cache will dynamically discover these variables and hold them. If the window-based register file is supplemented with global registers, it too can hold some global scalars. However, when program modules are separately compiled, it is impossible for the compiler to assign global values to registers; the linker must perform this task.
With the register file, the movement of data between registers and memory is determined by the procedure nesting depth. Because this depth usually fluctuates within a narrow range, the use of memory is relatively infrequent. Most cache memories are set associative with a small set size. Thus, there is the danger that other data or instructions will compete for cache residency.
Based on the discussion so far, the choice between a large window-based register file and a cache is not clear-cut. There is one characteristic, however, in which the register approach is clearly superior and which suggests that a cache-based system will be noticeably slower. This distinction shows up in the amount of addressing overhead experienced by the two approaches.
Figure 15.3 illustrates the difference. To reference a local scalar in a window-based register file, a "virtual" register number and a window number are used. These can pass through a relatively simple decoder to select one of the physical registers. To reference a memory location in cache, a full-width memory address must be generated. The complexity of this operation depends on the addressing mode. In a set associative cache, a portion of the address is used to read a number of words and tags equal to the set size. Another portion of the address is compared with the tags, and one of the words that were read is selected. It should be clear that even if the cache is as fast as the register file, the access time will be considerably longer. Thus, from the point of view of performance, the window-based register file is superior for local scalars. Further performance improvement could be achieved by the addition of a cache for instructions only.

[Figure 15.3 Referencing a Scalar: (a) window-based register file, in which the register number R and window number W# from the instruction pass through a decoder to select one of the physical registers; (b) cache, in which the full address A from the instruction is split, one portion selecting a set of tags and data, after which the tags are compared and the matching word is selected.]
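The difference in addressing overhead can also be seen by sketching the two access paths in code. The following C fragment is illustrative only; the window size, set count, associativity, and line size are assumed values, and timing is not modeled.

    #include <stdint.h>
    #include <stdbool.h>

    /* (a) Window-based register file: window number + register number select a
       physical register directly; a simple decode, no address or tag compare.  */
    #define WINDOW_SIZE 16
    int phys_regs[8 * WINDOW_SIZE];

    int read_local_scalar(int window, int reg)
    {
        return phys_regs[window * WINDOW_SIZE + reg];
    }

    /* (b) Set-associative cache: a full memory address must be formed, a set read,
       and tags compared before the data can be selected (word-sized lines assumed). */
    #define SETS 64
    #define WAYS 2
    struct line { bool valid; uint32_t tag; int data; };
    struct line cache[SETS][WAYS];

    bool read_cached_scalar(uint32_t address, int *out)
    {
        uint32_t set = (address >> 2) % SETS;        /* index portion of the address */
        uint32_t tag = (address >> 2) / SETS;        /* tag portion                  */
        for (int way = 0; way < WAYS; way++) {
            if (cache[set][way].valid && cache[set][way].tag == tag) {
                *out = cache[set][way].data;         /* hit                          */
                return true;
            }
        }
        return false;                                /* miss: go to memory           */
    }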

15.3 COMPILER-BASED REGISTER OPTIMIZATION

Let us assume now that only a small number (e.g., 16–32) of registers is available on the target RISC machine. In this case, optimized register usage is the responsibility of the compiler. A program written in a high-level language has, of course, no explicit references to registers (the C-language keyword register notwithstanding). Rather, program quantities are referred to symbolically. The objective of the compiler is to keep the operands for as many computations as possible in registers rather than main memory, and to minimize load-and-store operations.
In general, the approach taken is as follows. Each program quantity that is a candidate for residing in a register is assigned to a symbolic or virtual register. The compiler then maps the unlimited number of symbolic registers into a fixed number of real registers. Symbolic registers whose usage does not overlap can share the same real register. If, in a particular portion of the program, there are more quantities to deal with than real registers, then some of the quantities are assigned to memory locations. Load-and-store instructions are used to position quantities in registers temporarily for computational operations.

The essence of the optimization task is to decide which quantities are to be assigned to registers at any given point in the program. The technique most commonly used in RISC compilers is known as graph coloring, which is a technique borrowed from the discipline of topology [CHAI82, CHOW86, COUT86, CHOW90].
The graph coloring problem is this. Given a graph consisting of nodes and edges, assign colors to nodes such that adjacent nodes have different colors, and do this in such a way as to minimize the number of different colors. This problem is adapted to the compiler problem in the following way. First, the program is analyzed to build a register interference graph. The nodes of the graph are the symbolic registers. If two symbolic registers are "live" during the same program fragment, then they are joined by an edge to depict interference. An attempt is then made to color the graph with n colors, where n is the number of registers. Nodes that share the same color can be assigned to the same register. If this process does not fully succeed, then those nodes that cannot be colored must be placed in memory, and loads and stores must be used to make space for the affected quantities when they are needed.
Figure 15.4 is a simple example of the process. Assume a program with six symbolic registers to be compiled into three actual registers. Figure 15.4a shows the time sequence of active use of each symbolic register. The dashed horizontal lines indicate successive instruction executions. Figure 15.4b shows the register interference graph (shading and cross-hatching are used instead of colors). A possible coloring with three colors is indicated. Because symbolic registers A and D do not interfere, the compiler can assign both of these to physical register R1. Similarly, symbolic registers C and E can be assigned to register R3. One symbolic register, F, is left uncolored and must be dealt with using loads and stores.
In general, there is a trade-off between the use of a large set of registers and compiler-based register optimization. For example, [BRAD91a] reports on a study

[Figure 15.4 Graph Coloring Approach: (a) time sequence of active use of the symbolic registers A through F; (b) the resulting register interference graph, colored with the three actual registers R1, R2, and R3 (A and D map to R1, B to R2, C and E to R3, and F is left uncolored).]

that modeled a RISC architecture with features similar to the Motorola 88000 and the MIPS R2000. The researchers varied the number of registers from 16 to 128, and they considered both the use of all general-purpose registers and registers split between integer and floating-point use. Their study showed that with even simple register optimization, there is little benefit to the use of more than 64 registers. With reasonably sophisticated register optimization techniques, there is only marginal performance improvement with more than 32 registers. Finally, they noted that with a small number of registers (e.g., 16), a machine with a shared register organization executes faster than one with a split organization. Similar conclusions can be drawn from [HUGU91], which reports on a study that is primarily concerned with optimizing the use of a small number of registers rather than comparing the use of large register sets with optimization efforts.
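The coloring step itself can be illustrated in a few lines of code. The following C sketch applies a simple greedy coloring to a hypothetical interference graph; the edge list is an assumption chosen for illustration (it is not read off Figure 15.4), and production allocators in the Chaitin style use more elaborate orderings and spill heuristics.

    #include <stdio.h>

    #define N_SYM  6    /* symbolic registers A..F, as in Figure 15.4 */
    #define N_REAL 3    /* real registers R1..R3                      */

    int interfere[N_SYM][N_SYM];   /* 1 if two symbolic registers are live at the same time */
    int color[N_SYM];              /* assigned real register (0..N_REAL-1), or -1 to spill  */

    /* Greedy coloring: give each node the lowest color not used by an already-colored
       neighbor; nodes that cannot be colored are spilled to memory. */
    void allocate(void)
    {
        for (int i = 0; i < N_SYM; i++) {
            int used[N_REAL] = {0};
            for (int j = 0; j < i; j++)
                if (interfere[i][j] && color[j] >= 0)
                    used[color[j]] = 1;
            color[i] = -1;
            for (int c = 0; c < N_REAL; c++)
                if (!used[c]) { color[i] = c; break; }
        }
    }

    int main(void)
    {
        /* Hypothetical interference edges, chosen so that one value (F) cannot be
           colored with three registers.  0=A, 1=B, ... 5=F.                        */
        int edges[][2] = {{0,1},{0,2},{1,2},{0,3},{1,4},{0,5},{1,5},{2,5}};
        for (unsigned k = 0; k < sizeof edges / sizeof edges[0]; k++)
            interfere[edges[k][0]][edges[k][1]] = interfere[edges[k][1]][edges[k][0]] = 1;

        allocate();
        for (int i = 0; i < N_SYM; i++) {
            if (color[i] >= 0) printf("%c -> R%d\n", 'A' + i, color[i] + 1);
            else               printf("%c -> memory (spill)\n", 'A' + i);
        }
        return 0;
    }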

15.4 REDUCED INSTRUCTION SET ARCHITECTURE

In this section, we look at some of the general characteristics of and the motivation for a reduced instruction set architecture. Specific examples will be seen later in this chapter. We begin with a discussion of motivations for contemporary complex instruction set architectures.

Why CISC
We have noted the trend to richer instruction sets, which include a larger number of instructions and more complex instructions. Two principal reasons have motivated this trend: a desire to simplify compilers and a desire to improve performance. Underlying both of these reasons was the shift to HLLs on the part of programmers; architects attempted to design machines that provided better support for HLLs.
It is not the intent of this chapter to say that the CISC designers took the wrong direction. Indeed, because technology continues to evolve and because architectures exist along a spectrum rather than in two neat categories, a black-and-white assessment is unlikely ever to emerge. Thus, the comments that follow are simply meant to point out some of the potential pitfalls in the CISC approach and to provide some understanding of the motivation of the RISC adherents.
The first of the reasons cited, compiler simplification, seems obvious, but it is not. The task of the compiler writer is to build a compiler that generates good (fast, small, fast and small) sequences of machine instructions for HLL programs (i.e., the compiler views individual HLL statements in the context of surrounding HLL statements). If there are machine instructions that resemble HLL statements, this task is simplified. This reasoning has been disputed by the RISC researchers ([HENN82], [RADI83], [PATT82b]). They have found that complex machine instructions are often hard to exploit because the compiler must find those cases that exactly fit the construct. The task of optimizing the generated code to minimize code size, reduce instruction execution count, and enhance pipelining is much more difficult with a complex instruction set. As evidence of this, studies cited earlier in this chapter indicate that most of the instructions in a compiled program are the relatively simple ones.

The other major reason cited is the expectation that a CISC will yield smaller, faster programs. Let us examine both aspects of this assertion: that programs will be smaller and that they will execute faster.
There are two advantages to smaller programs. First, because the program takes up less memory, there is a savings in that resource. With memory today being so inexpensive, this potential advantage is no longer compelling. More important, smaller programs should improve performance, and this will happen in three ways. First, fewer instructions means fewer instruction bytes to be fetched. Second, in a paging environment, smaller programs occupy fewer pages, reducing page faults. Third, more instructions fit in cache(s).
The problem with this line of reasoning is that it is far from certain that a CISC program will be smaller than a corresponding RISC program. In many cases, the CISC program, expressed in symbolic machine language, may be shorter (i.e., fewer instructions), but the number of bits of memory occupied may not be noticeably smaller. Table 15.6 shows results from three studies that compared the size of compiled C programs on a variety of machines, including RISC I, which has a reduced instruction set architecture. Note that there is little or no savings using a CISC over a RISC. It is also interesting to note that the VAX, which has a much more complex instruction set than the PDP-11, achieves very little savings over the latter. These results were confirmed by IBM researchers [RADI83], who found that the IBM 801 (a RISC) produced code that was 0.9 times the size of code on an IBM S/370. The study used a set of PL/I programs.
There are several reasons for these rather surprising results. We have already noted that compilers on CISCs tend to favor simpler instructions, so that the conciseness of the complex instructions seldom comes into play. Also, because there are more instructions on a CISC, longer opcodes are required, producing longer instructions. Finally, RISCs tend to emphasize register rather than memory references, and the former require fewer bits. An example of this last effect is discussed presently.
So the expectation that a CISC will produce smaller programs, with the attendant advantages, may not be realized. The second motivating factor for increasingly complex instruction sets was that instruction execution would be faster. It seems to make sense that a complex HLL operation will execute more quickly as a single machine instruction rather than as a series of more primitive instructions. However, because of the bias toward the use of those simpler instructions, this may not be so.

Table 15.6 Code Size Relative to RISC I

                [PATT82a]          [KATE83]           [HEAT84]
                11 C Programs      12 C Programs      5 C Programs
RISC I          1.0                1.0                1.0
VAX-11/780      0.8                0.67
M68000          0.9                                   0.9
Z8002           1.2                                   1.12
PDP-11/70       0.9                                   0.71

The entire control unit must be made more complex, and/or the microprogram control store must be made larger, to accommodate a richer instruction set. Either factor increases the execution time of the simple instructions.
In fact, some researchers have found that the speedup in the execution of complex functions is due not so much to the power of the complex machine instructions as to their residence in high-speed control store [RADI83]. In effect, the control store acts as an instruction cache. Thus, the hardware architect is in the position of trying to determine which subroutines or functions will be used most frequently and assigning those to the control store by implementing them in microcode. The results have been less than encouraging. On S/390 systems, instructions such as Translate and Extended-Precision-Floating-Point-Divide reside in high-speed storage, while the sequences involved in setting up procedure calls or initiating an interrupt handler are in slower main memory.
Thus, it is far from clear that a trend to increasingly complex instruction sets is appropriate. This has led a number of groups to pursue the opposite path.

Characteristics of Reduced Instruction Set Architectures
Although a variety of different approaches to reduced instruction set architecture have been taken, certain characteristics are common to all of them:
• One instruction per cycle
• Register-to-register operations
• Simple addressing modes
• Simple instruction formats
Here, we provide a brief discussion of these characteristics. Specific examples are explored later in this chapter.
The first characteristic listed is that there is one machine instruction per machine cycle. A machine cycle is defined to be the time it takes to fetch two operands from registers, perform an ALU operation, and store the result in a register. Thus, RISC machine instructions should be no more complicated than, and execute about as fast as, microinstructions on CISC machines (discussed in Part Four). With simple, one-cycle instructions, there is little or no need for microcode; the machine instructions can be hardwired. Such instructions should execute faster than comparable machine instructions on other machines, because it is not necessary to access a microprogram control store during instruction execution.
A second characteristic is that most operations should be register to register, with only simple LOAD and STORE operations accessing memory. This design feature simplifies the instruction set and therefore the control unit. For example, a RISC instruction set may include only one or two ADD instructions (e.g., integer add, add with carry); the VAX has 25 different ADD instructions. Another benefit is that such an architecture encourages the optimization of register use, so that frequently accessed operands remain in high-speed storage.
This emphasis on register-to-register operations is notable for RISC designs. Contemporary CISC machines provide such instructions but also include memory-to-memory and mixed register/memory operations. Attempts to compare these

approaches were made in the 1970s, before the appearance of RISCs. Figure 15.5a illustrates the approach taken. Hypothetical architectures were evaluated on program size and the number of bits of memory traffic. Results such as this one led one researcher to suggest that future architectures should contain no registers at all [MYER78]. One wonders what he would have thought, at the time, of the RISC machine once produced by Pyramid, which contained no less than 528 registers!
What was missing from those studies was a recognition of the frequent access to a small number of local scalars and that, with a large bank of registers or an optimizing compiler, most operands could be kept in registers for long periods of time. Thus, Figure 15.5b may be a fairer comparison.
A third characteristic is the use of simple addressing modes. Almost all RISC instructions use simple register addressing. Several additional modes, such as displacement and PC-relative, may be included. Other, more complex modes can be synthesized in software from the simple ones. Again, this design feature simplifies the instruction set and the control unit.
A final common characteristic is the use of simple instruction formats. Generally, only one or a few formats are used. Instruction length is fixed and aligned on word boundaries. Field locations, especially the opcode, are fixed. This design feature has a number of benefits. With fixed fields, opcode decoding and register operand accessing can occur simultaneously. Simplified formats simplify the control unit. Instruction fetching is optimized because word-length units are fetched. Alignment on a word boundary also means that a single instruction does not cross page boundaries.
Taken together, these characteristics can be assessed to determine the potential performance benefits of the RISC approach. A certain amount of "circumstantial

(a) A ← B + C
Memory-to-memory: one instruction, Add B, C, A, with an 8-bit opcode and three 16-bit memory addresses: I = 8 + 3 × 16 = 56; D = 3 × 32 = 96; M = 152.
Register-to-memory: Load RB, B; Load RC, C; Add RA, RB, RC; Store RA, A (each load and the store is 8 + 4 + 16 = 28 bits, the register add is 8 + 3 × 4 = 20 bits): I = 104; D = 96; M = 200.
(b) A ← B + C; B ← A + C; D ← D - B
Memory-to-memory: Add B, C, A; Add A, C, B; Sub B, D, D, three instructions of 56 bits each: I = 168; D = 288; M = 456.
Register-to-memory: Add RA, RB, RC; Add RB, RA, RC; Sub RD, RD, RB, with all operands already held in registers, so no loads, stores, or data traffic are needed: I = 60; D = 0; M = 60.
I = number of bits occupied by executed instructions; D = number of bits of data operands accessed in memory; M = total memory traffic = I + D.
Figure 15.5 Two Comparisons of Register-to-Register and Memory-to-Memory Approaches

I number of bytes occupied by executed instructions D number of bytes occupied by data M total memory traffic I D Figure 15.5 Two Comparisons of Register-to-Register and Memory-to-Memory Approaches 15.4 / REDUCED INSTRUCTION SET ARCHITECTURE 549 evidence” can be presented. First, more effective optimizing compilers can be devel- oped. With more-primitive instructions, there are more opportunities for mov- ing functions out of loops, reorganizing code for efficiency, maximizing register utilization, and so forth. It is even possible to compute parts of complex instructions at compile time. For example, the S/390 Move Characters (MVC) instruction moves a string of characters from one location to another. Each time it is executed, the move will depend on the length of the string, whether and in which direction the locations overlap, and what the alignment characteristics are. In most cases, these will all be known at compile time. Thus, the compiler could produce an optimized sequence of primitive instructions for this function. A second point, already noted, is that most instructions generated by a com- piler are relatively simple anyway. It would seem reasonable that a control unit built specifically for those instructions and using little or no microcode could execute them faster than a comparable CISC. A third point relates to the use of instruction pipelining. RISC researchers feel that the instruction pipelining technique can be applied much more effectively with a reduced instruction set. We examine this point in some detail presently. A final, and somewhat less significant, point is that RISC processors are more responsive to interrupts because interrupts are checked between rather elemen- tary operations. Architectures with complex instructions either restrict interrupts to instruction boundaries or must define specific interruptible points and implement mechanisms for restarting an instruction. The case for improved performance for a reduced instruction set architecture is strong, but one could perhaps still make an argument for CISC. A number of studies have been done but not on machines of comparable technology and power. Further, most studies have not attempted to separate the effects of a reduced instruction set and the effects of a large register file. The “circumstantial evidence,” however, is suggestive. CISC versus RISC Characteristics After the initial enthusiasm for RISC machines, there has been a growing realiza- tion that (1) RISC designs may benefit from the inclusion of some CISC features and that (2) CISC designs may benefit from the inclusion of some RISC features. The result is that the more recent RISC designs, notably the PowerPC, are no lon- ger “pure” RISC and the more recent CISC designs, notably the Pentium II and later Pentium models, do incorporate some RISC characteristics. An interesting comparison in [MASH95] provides some insight into this issue. Table 15.7 lists a number of processors and compares them across a number of char- acteristics. For purposes of this comparison, the following are considered typical of a classic RISC: 1. A single instruction size. 2. That size is typically 4 bytes. 3. A small number of data addressing modes, typically less than five. This parameter is difficult to pin down. In the table, register and literal modes are not counted and different formats with different offset sizes are counted separately. 550

Table 15.7 Characteristics of Some Processors

Columns: Processor; number of instruction sizes; max instruction size in bytes; number of addressing modes; indirect addressing; load/store combined with arithmetic; max number of memory operands; unaligned addressing allowed; max number of MMU uses; number of bits for integer register specifier; number of bits for FP register specifier.

AMD29000      1     4     1      no     no    1   no      1    8     3(a)
MIPS R2000    1     4     1      no     no    1   no      1    5     4
SPARC         1     4     2      no     no    1   no      1    5     4
MC88000       1     4     3      no     no    1   no      1    5     4
HP PA         1     4     10(a)  no     no    1   no      1    5     4
IBM RT/PC     2(a)  4     1      no     no    1   no      1    4(a)  3(a)
IBM RS/6000   1     4     4      no     no    1   yes     1    5     5
—             1     4     4      no     no    1   no      1    5     4
IBM 3090      4     8     2(b)   no(b)  yes   2   yes     4    4     2
Intel 80486   12    12    15     no(b)  yes   2   yes     4    3     3
NSC 32016     21    21    23     yes    yes   2   yes     4    3     3
MC68040       11    22    44     yes    yes   2   yes     8    4     3
VAX           56    56    22     yes    yes   6   yes     24   4     0
Clipper       4(a)  8(a)  9(a)   no     no    1   0       2    4(a)  3(a)
Intel 80960   2(a)  8(a)  9(a)   no     no    1   yes(a)  —    5     3(a)

Notes: (a) RISC that does not conform to this characteristic. (b) CISC that does not conform to this characteristic.

4. No indirect addressing that requires you to make one memory access to get the address of another operand in memory.
5. No operations that combine load/store with arithmetic (e.g., add from memory, add to memory).
6. No more than one memory-addressed operand per instruction.
7. Does not support arbitrary alignment of data for load/store operations.
8. Maximum number of uses of the memory management unit (MMU) for a data address in an instruction.
9. Number of bits for integer register specifier equal to five or more. This means that at least 32 integer registers can be explicitly referenced at a time.
10. Number of bits for floating-point register specifier equal to four or more. This means that at least 16 floating-point registers can be explicitly referenced at a time.

Items 1 through 3 are an indication of instruction decode complexity. Items 4 through 8 suggest the ease or difficulty of pipelining, especially in the presence of virtual memory requirements. Items 9 and 10 are related to the ability to take good advantage of compilers. In the table, the first eight processors are clearly RISC architectures, the next five are clearly CISC, and the last two are processors often thought of as RISC that in fact have many CISC characteristics.

15.5 RISC PIPELINING

Pipelining with Regular Instructions

As we discussed in Section 12.4, instruction pipelining is often used to enhance performance. Let us reconsider this in the context of a RISC architecture. Most instructions are register to register, and an instruction cycle has the following two stages:

• I: Instruction fetch.
• E: Execute. Performs an ALU operation with register input and output.

For load and store operations, three stages are required:

• I: Instruction fetch.
• E: Execute. Calculates memory address.
• D: Memory. Register-to-memory or memory-to-register operation.

Figure 15.6a depicts the timing of a sequence of instructions using no pipelining. Clearly, this is a wasteful process. Even very simple pipelining can substantially improve performance. Figure 15.6b shows a two-stage pipelining scheme, in which the I and E stages of two different instructions are performed simultaneously. The two stages of the pipeline are an instruction fetch stage, and an execute/memory stage that executes the instruction, including register-to-memory and memory-to-register operations.

[Figure 15.6 The Effects of Pipelining: timing diagrams for the sequence Load rA ← M; Load rB ← M; Add rC ← rA + rB; Store M ← rC; Branch X under (a) sequential execution, (b) two-stage pipelined timing, (c) three-stage pipelined timing, and (d) four-stage pipelined timing, with NOOPs inserted as needed.]

Thus we see that the instruction fetch stage of the second instruction can be performed in parallel with the first part of the execute/memory stage. However, the execute/memory stage of the second instruction must be delayed until the first instruction clears the second stage of the pipeline. This scheme can yield up to twice the execution rate of a serial scheme. Two problems prevent the maximum speedup from being achieved. First, we assume that a single-port memory is used and that only one memory access is possible per stage. This requires the insertion of a wait state in some instructions. Second, a branch instruction interrupts the sequential flow of execution. To accommodate this with minimum circuitry, a NOOP instruction can be inserted into the instruction stream by the compiler or assembler.

Pipelining can be improved further by permitting two memory accesses per stage. This yields the sequence shown in Figure 15.6c. Now, up to three instructions can be overlapped, and the improvement is as much as a factor of 3. Again, branch instructions cause the speedup to fall short of the maximum possible. Also, note that data dependencies have an effect. If an instruction needs an operand that is altered by the preceding instruction, a delay is required. Again, this can be accomplished by a NOOP.

The pipelining discussed so far works best if the three stages are of approximately equal duration. Because the E stage usually involves an ALU operation, it may be longer. In this case, we can divide it into two substages:

• E1: Register file read
• E2: ALU operation and register write

Because of the simplicity and regularity of a RISC instruction set, the design of the phasing into three or four stages is easily accomplished. Figure 15.6d shows the result with a four-stage pipeline. Up to four instructions at a time can be under way, and the maximum potential speedup is a factor of 4. Note again the use of NOOPs to account for data and branch delays.
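The factors of 2, 3, and 4 quoted above are ideal upper bounds. As a rough aside (a minimal sketch under the assumption of an idealized pipeline with no memory-port conflicts, data hazards, or branches; the instruction count below is made up), they follow from the observation that N instructions on a k-stage pipeline complete in k + (N - 1) stage times rather than N × k:

#include <stdio.h>

/* Ideal pipeline timing, ignoring wait states, data hazards, and branches:
   N instructions on a k-stage pipeline need k + (N - 1) stage times, versus
   N * k stage times when the stages are executed strictly sequentially.     */
static double ideal_speedup(unsigned k, unsigned n)
{
    return (double)(n * k) / (double)(k + (n - 1));
}

int main(void)
{
    unsigned n = 1000;                    /* assumed number of instructions */
    unsigned stages[] = { 2, 3, 4 };      /* the schemes of Figure 15.6b-d  */

    for (int i = 0; i < 3; i++) {
        unsigned k = stages[i];
        printf("%u-stage pipeline: %u stage times, speedup %.2f\n",
               k, k + (n - 1), ideal_speedup(k, n));
    }
    return 0;
}

For large N the speedup approaches k, which is why the two-, three-, and four-stage schemes offer up to twice, three times, and four times the serial execution rate.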

Optimization of Pipelining

Because of the simple and regular nature of RISC instructions, it is easier for a hardware designer to implement a simple, fast pipeline. There are few variations in instruction execution duration, and the pipeline can be tailored to reflect this. However, we have seen that data and branch dependencies reduce the overall execution rate.

DELAYED BRANCH To compensate for these dependencies, code reorganization techniques have been developed. First, let us consider branching instructions. Delayed branch, a way of increasing the efficiency of the pipeline, makes use of a branch that does not take effect until after execution of the following instruction (hence the term delayed). The instruction location immediately following the branch is referred to as the delay slot. This strange procedure is illustrated in Table 15.8. In the column labeled "normal branch," we see a normal symbolic instruction machine-language program. After 102 is executed, the next instruction to be executed is 105. To regularize the pipeline, a NOOP is inserted after this branch. However, increased performance is achieved if the instructions at 101 and 102 are interchanged.

Figure 15.7 shows the result. Figure 15.7a shows the traditional approach to pipelining, of the type discussed in Chapter 14 (e.g., see Figures 14.11 and 14.12). The JUMP instruction is fetched at time 3. At time 4, the JUMP instruction is executed at the same time that instruction 103 (ADD instruction) is fetched. Because a JUMP occurs, which updates the program counter, the pipeline must be cleared of instruction 103; at time 5, instruction 105, which is the target of the JUMP, is loaded. Figure 15.7b shows the same pipeline handled by a typical RISC organization. The timing is the same. However, because of the insertion of the NOOP instruction, we do not need special circuitry to clear the pipeline; the NOOP simply executes with no effect. Figure 15.7c shows the use of the delayed branch. The JUMP instruction is fetched at time 2, before the ADD instruction, which is fetched at time 3. Note, however, that the ADD instruction is fetched before the execution of the JUMP instruction has a chance to alter the program counter. Therefore, during time 4, the ADD instruction is executed at the same time that instruction 105 is fetched.

Table 15.8 Normal and Delayed Branch

Address   Normal Branch    Delayed Branch    Optimized Delayed Branch
100       LOAD X, rA       LOAD X, rA        LOAD X, rA
101       ADD 1, rA        ADD 1, rA         JUMP 105
102       JUMP 105         JUMP 106          ADD 1, rA
103       ADD rA, rB       NOOP              ADD rA, rB
104       SUB rC, rB       ADD rA, rB        SUB rC, rB
105       STORE rA, Z      SUB rC, rB        STORE rA, Z
106                        STORE rA, Z

[Figure 15.7 Use of the Delayed Branch: pipeline timing diagrams over times 1-7 for the three versions of the program in Table 15.8. (a) Traditional pipeline: 100 LOAD X, rA; 101 ADD 1, rA; 102 JUMP 105; 103 ADD rA, rB (fetched, then flushed); 105 STORE rA, Z. (b) RISC pipeline with inserted NOOP: 100 LOAD X, rA; 101 ADD 1, rA; 102 JUMP 106; 103 NOOP; 106 STORE rA, Z. (c) Reversed instructions: 100 LOAD X, rA; 101 JUMP 105; 102 ADD 1, rA; 105 STORE rA, Z.]

Thus, the original semantics of the program are retained but one less clock cycle is required for execution. This interchange of instructions will work successfully for unconditional branches, calls, and returns. For conditional branches, this procedure cannot be blindly applied. If the condition that is tested for the branch can be altered by the immediately preceding instruction, then the compiler must refrain from doing the interchange and instead insert a NOOP. Otherwise, the compiler can seek to insert a useful instruction after the branch. The experience with both the Berkeley RISC and IBM 801 systems is that the majority of conditional branch instructions can be optimized in this fashion ([PATT82a], [RADI83]).

DELAYED LOAD A similar sort of tactic, called the delayed load, can be used on LOAD instructions. On LOAD instructions, the register that is to be the target of the load is locked by the processor. The processor then continues execution of the instruction stream until it reaches an instruction requiring that register, at which point it idles until the load is complete. If the compiler can rearrange instructions

so that useful work can be done while the load is in the pipeline, efficiency is increased.
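The interchange rule for conditional branches described above can be phrased as a small predicate. The sketch below is illustrative only; the instruction record and the helper name are invented for this example and are not part of any real compiler:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction summary: which registers it writes and reads
   (bit i set = register i), and whether it alters the tested condition.   */
typedef struct {
    uint32_t writes;
    uint32_t reads;
    bool     sets_condition;
} Instr;

/* May the instruction 'prev', which currently precedes a conditional branch
   'br', be moved into the branch's delay slot?  Only if it cannot change the
   outcome of the branch; otherwise the compiler must leave a NOOP there.    */
static bool can_fill_delay_slot(const Instr *prev, const Instr *br)
{
    if (prev->sets_condition)
        return false;              /* would alter the tested condition        */
    if (prev->writes & br->reads)
        return false;              /* branch reads a register 'prev' writes   */
    return true;                   /* safe to interchange, as in Table 15.8   */
}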

Loop Unrolling Simulator

LOOP UNROLLING Another compiler technique to improve instruction parallelism is loop unrolling [BACO94]. Unrolling replicates the body of a loop some number of times called the unrolling factor (u) and iterates by step u instead of step 1. Unrolling can improve the performance by

• reducing loop overhead
• increasing instruction parallelism by improving pipeline performance
• improving register, data cache, or TLB locality

Figure 15.8 illustrates all three of these improvements in an example. Loop overhead is cut in half because two iterations are performed before the test and branch at the end of the loop. Instruction parallelism is increased because the second assignment can be performed while the results of the first are being stored and the loop variables are being updated. If array elements are assigned to registers, register locality will improve because a[i] and a[i + 1] are used twice in the loop body, reducing the number of loads per iteration from three to two.

do i=2, n-1
    a[i] = a[i] + a[i-1] * a[i+1]
end do

(a) Original loop

do i=2, n-2, 2
    a[i] = a[i] + a[i-1] * a[i+1]
    a[i+1] = a[i+1] + a[i] * a[i+2]
end do

if (mod(n-2, 2) = 1) then
    a[n-1] = a[n-1] + a[n-2] * a[n]
end if

(b) Loop unrolled twice

Figure 15.8 Loop Unrolling
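The same transformation can be expressed in C. The sketch below is a hand-unrolled rendering of Figure 15.8 (it assumes, as the figure does, that elements a[1] through a[n] are valid); the function names are invented for illustration:

/* Figure 15.8a, original loop: for i = 2 .. n-1, a[i] += a[i-1] * a[i+1] */
void update(double a[], int n)
{
    for (int i = 2; i <= n - 1; i++)
        a[i] = a[i] + a[i - 1] * a[i + 1];
}

/* Figure 15.8b, unrolled twice: two iterations per pass, plus a cleanup
   iteration when the trip count n-2 is odd.                              */
void update_unrolled(double a[], int n)
{
    int i;
    for (i = 2; i <= n - 2; i += 2) {
        a[i]     = a[i]     + a[i - 1] * a[i + 1];
        a[i + 1] = a[i + 1] + a[i]     * a[i + 2];
    }
    if ((n - 2) % 2 == 1)          /* one leftover iteration */
        a[n - 1] = a[n - 1] + a[n - 2] * a[n];
}

The unrolled version performs the loop test and branch half as often, and the two assignments in the body become candidates for overlap in the pipeline, which is exactly the benefit described above.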

As a final note, we should point out that the design of the instruction pipeline should not be carried out in isolation from other optimization techniques applied to the system. For example, [BRAD91b] shows that the scheduling of instructions for the pipeline and the dynamic allocation of registers should be considered together to achieve the greatest efficiency.

15.6 MIPS R4000

One of the first commercially available RISC chip sets was developed by MIPS Technology Inc. The system was inspired by an experimental system, also using the name MIPS, developed at Stanford [HENN84]. In this section we look at the MIPS R4000. It has substantially the same architecture and instruction set as the earlier MIPS designs: the R2000 and R3000. The most significant difference is that the R4000 uses 64 rather than 32 bits for all internal and external data paths and for addresses, registers, and the ALU.

The use of 64 bits has a number of advantages over a 32-bit architecture. It allows a bigger address space—large enough for an operating system to map more than a terabyte of files directly into virtual memory for easy access. With 1-terabyte and larger disk drives now common, the 4-gigabyte address space of a 32-bit machine becomes limiting. Also, the 64-bit capacity allows the R4000 to process data such as IEEE double-precision floating-point numbers and character strings, up to eight characters in a single action.

The R4000 processor chip is partitioned into two sections, one containing the CPU and the other containing a coprocessor for memory management. The processor has a very simple architecture. The intent was to design a system in which the instruction execution logic was as simple as possible, leaving space available for logic to enhance performance (e.g., the entire memory-management unit).

The processor supports thirty-two 64-bit registers. It also provides for up to 128 Kbytes of high-speed cache, half each for instructions and data. The relatively large cache (the IBM 3090 provides 128 to 256 Kbytes of cache) enables the system to keep large sets of program code and data local to the processor, off-loading the main memory bus and avoiding the need for a large register file with the accompanying windowing logic.

Instruction Set

Table 15.9 lists the basic instruction set for all MIPS R series processors. All processor instructions are encoded in a single 32-bit word format. All data operations are register to register; the only memory references are pure load/store operations.

The R4000 makes no use of condition codes. If an instruction generates a condition, the corresponding flags are stored in a general-purpose register. This avoids the need for special logic to deal with condition codes as they affect the pipelining mechanism and the reordering of instructions by the compiler. Instead, the mechanisms already implemented to deal with register-value dependencies are employed.

Table 15.9 MIPS R-Series Instruction Set

Load/Store Instructions: LB Load Byte; LBU Load Byte Unsigned; LH Load Halfword; LHU Load Halfword Unsigned; LW Load Word; LWL Load Word Left; LWR Load Word Right; SB Store Byte; SH Store Halfword; SW Store Word; SWL Store Word Left; SWR Store Word Right

Arithmetic Instructions (ALU Immediate): ADDI Add Immediate; ADDIU Add Immediate Unsigned; SLTI Set on Less Than Immediate; SLTIU Set on Less Than Immediate Unsigned; ANDI AND Immediate; ORI OR Immediate; XORI Exclusive-OR Immediate; LUI Load Upper Immediate

Arithmetic Instructions (3-operand, R-type): ADD Add; ADDU Add Unsigned; SUB Subtract; SUBU Subtract Unsigned; SLT Set on Less Than; SLTU Set on Less Than Unsigned; AND AND; OR OR; XOR Exclusive-OR; NOR NOR

Shift Instructions: SLL Shift Left Logical; SRL Shift Right Logical; SRA Shift Right Arithmetic; SLLV Shift Left Logical Variable; SRLV Shift Right Logical Variable; SRAV Shift Right Arithmetic Variable

Multiply/Divide Instructions: MULT Multiply; MULTU Multiply Unsigned; DIV Divide; DIVU Divide Unsigned; MFHI Move From HI; MTHI Move To HI; MFLO Move From LO; MTLO Move To LO

Jump and Branch Instructions: J Jump; JAL Jump and Link; JR Jump to Register; JALR Jump and Link Register; BEQ Branch on Equal; BNE Branch on Not Equal; BLEZ Branch on Less Than or Equal to Zero; BGTZ Branch on Greater Than Zero; BLTZ Branch on Less Than Zero; BGEZ Branch on Greater Than or Equal to Zero; BLTZAL Branch on Less Than Zero And Link; BGEZAL Branch on Greater Than or Equal to Zero And Link

Coprocessor Instructions: LWCz Load Word to Coprocessor; SWCz Store Word to Coprocessor; MTCz Move To Coprocessor; MFCz Move From Coprocessor; CTCz Move Control To Coprocessor; CFCz Move Control From Coprocessor; COPz Coprocessor Operation; BCzT Branch on Coprocessor z True; BCzF Branch on Coprocessor z False

Special Instructions: SYSCALL System Call; BREAK Break

Further, conditions mapped onto the register files are subject to the same compile-time optimizations in allocation and reuse as other values stored in registers.

As with most RISC-based machines, the MIPS uses a single 32-bit instruction length. This single instruction length simplifies instruction fetch and decode, and it also simplifies the interaction of instruction fetch with the virtual memory management unit (i.e., instructions do not cross word or page boundaries). The three instruction formats (Figure 15.9) share common formatting of opcodes and register references, simplifying instruction decode. The effect of more complex instructions can be synthesized at compile time.

Only the simplest and most frequently used memory-addressing mode is implemented in hardware. All memory references consist of a 16-bit offset from a 32-bit register. For example, the "load word" instruction is of the form

lw r2, 128(r3) /* load word at address 128 offset from register 3 into register 2

Each of the 32 general-purpose registers can be used as the base register. One register, r0, always contains 0.

The compiler makes use of multiple machine instructions to synthesize typical addressing modes in conventional machines. Here is an example from [CHOW87], which uses the instruction lui (load upper immediate). This instruction loads the upper half of a register with a 16-bit immediate value, setting the lower half to zero. Consider an assembly-language instruction that uses a 32-bit immediate argument:

lw r2, #imm(r4)    /* load word at address using a 32-bit immediate offset #imm
                   /* offset from register 4 into register 2

I-type (immediate):  Operation (6 bits) | rs (5) | rt (5) | Immediate (16)

J-type (jump):       Operation (6 bits) | Target (26)

R-type (register):   Operation (6 bits) | rs (5) | rt (5) | rd (5) | Shift (5) | Function (6)

Operation = operation code; rs = source register specifier; rt = source/destination register specifier; Immediate = immediate, branch, or address displacement; Target = jump target address; rd = destination register specifier; Shift = shift amount; Function = ALU/shift function specifier.

Figure 15.9 MIPS Instruction Formats
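Because the field boundaries shown in Figure 15.9 are fixed, decoding an instruction word amounts to a few shifts and masks. The C sketch below is only an illustration of the field layout (the type and function names are invented); it is not a description of the actual R-series decode hardware:

#include <stdint.h>

/* Extract bits hi..lo (inclusive) of a 32-bit instruction word. */
static inline uint32_t bits(uint32_t w, int hi, int lo)
{
    return (w >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

typedef struct {
    uint32_t op;        /* bits 31-26: operation code                   */
    uint32_t rs, rt;    /* bits 25-21, 20-16: source / source-dest regs */
    uint32_t rd;        /* bits 15-11: destination register (R-type)    */
    uint32_t shamt;     /* bits 10-6:  shift amount (R-type)            */
    uint32_t funct;     /* bits 5-0:   ALU/shift function (R-type)      */
    uint32_t imm;       /* bits 15-0:  immediate/displacement (I-type)  */
    uint32_t target;    /* bits 25-0:  jump target (J-type)             */
} MipsFields;

static MipsFields decode(uint32_t w)
{
    MipsFields f;
    f.op     = bits(w, 31, 26);
    f.rs     = bits(w, 25, 21);
    f.rt     = bits(w, 20, 16);
    f.rd     = bits(w, 15, 11);
    f.shamt  = bits(w, 10, 6);
    f.funct  = bits(w, 5, 0);
    f.imm    = bits(w, 15, 0);
    f.target = bits(w, 25, 0);
    return f;
}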

This instruction can be compiled into the following MIPS instructions:

lui r1, #imm-hi       /* where #imm-hi is the high-order 16 bits of #imm
addu r1, r1, r4       /* add unsigned #imm-hi to r4 and put in r1
lw r2, #imm-lo(r1)    /* where #imm-lo is the low-order 16 bits of #imm

Instruction Pipeline

With its simplified instruction architecture, the MIPS can achieve very efficient pipelining. It is instructive to look at the evolution of the MIPS pipeline, as it illustrates the evolution of RISC pipelining in general.

The initial experimental RISC systems and the first generation of commercial RISC processors achieve execution speeds that approach one instruction per system clock cycle. To improve on this performance, two classes of processors have evolved to offer execution of multiple instructions per clock cycle: superscalar and superpipelined architectures. In essence, a superscalar architecture replicates each of the pipeline stages so that two or more instructions at the same stage of the pipeline can be processed simultaneously. A superpipelined architecture is one that makes use of more, and more fine-grained, pipeline stages. With more stages, more instructions can be in the pipeline at the same time, increasing parallelism.

Both approaches have limitations. With superscalar pipelining, dependencies between instructions in different pipelines can slow down the system. Also, overhead logic is required to coordinate these dependencies. With superpipelining, there is overhead associated with transferring instructions from one stage to the next. Chapter 16 is devoted to a study of superscalar architecture. The MIPS R4000 is a good example of a RISC-based superpipeline architecture.

MIPS R3000 Five-Stage Pipeline Simulator

Figure 15.10a shows the instruction pipeline of the R3000. In the R3000, the pipeline advances once per clock cycle. The MIPS compiler is able to reorder instructions to fill delay slots with code 70 to 90% of the time. All instructions follow the same sequence of five pipeline stages:

• Instruction fetch
• Source operand fetch from register file
• ALU operation or data operand address generation
• Data memory reference
• Write back into register file

As illustrated in Figure 15.10a, there is not only parallelism due to pipelining but also parallelism within the execution of a single instruction.

[Figure 15.10 Enhancing the R3000 Pipeline: (a) detailed R3000 pipeline, with each clock cycle split into two phases across the IF, RD, ALU, MEM, and WB stages; (b) modified R3000 pipeline with reduced latencies; (c) optimized R3000 pipeline with parallel TLB and cache accesses. Abbreviations: IF = instruction fetch; RD = read; MEM = memory access; WB = write back to register file; I-Cache = instruction cache access; RF = fetch operand from register; D-Cache = data cache access; ITLB = instruction address translation; IDEC = instruction decode; IA = compute instruction address; DA = calculate data virtual address; DTLB = data address translation; TC = data cache tag check.]

The 60-ns clock cycle is divided into two 30-ns stages. The external instruction and data access operations to the cache each require 60 ns, as do the major internal operations (OP, DA, IA). Instruction decode is a simpler operation, requiring only a single 30-ns stage, overlapped with register fetch in the same instruction. Calculation of an address for a branch instruction also overlaps instruction decode and register fetch, so that a branch at instruction i can address the ICACHE access of instruction i + 2. Similarly, a load at instruction i fetches data that are immediately used by the OP of instruction i + 1, while an ALU/shift result gets passed directly into instruction i + 1 with no delay. This tight coupling between instructions makes for a highly efficient pipeline. In detail, then, each clock cycle is divided into separate stages, denoted as φ1 and φ2. The functions performed in each stage are summarized in Table 15.10.

The R4000 incorporates a number of technical advances over the R3000. The use of more advanced technology allows the clock cycle time to be cut in half, to 30 ns, and for the access time to the register file to be cut in half. In addition, there is greater density on the chip, which enables the instruction and data caches to be incorporated on the chip. Before looking at the final R4000 pipeline, let us consider how the R3000 pipeline can be modified to improve performance using R4000 technology. Figure 15.10b shows a first step. Remember that the cycles in this figure are half as long as those in Figure 15.10a. Because they are on the same chip, the instruction and data cache stages take only half as long; so they still occupy only one clock cycle. Again, because of the speedup of the register file access, register read and write still occupy only half of a clock cycle.

Table 15.10 R3000 Pipeline Stages

Pipeline Stage   Phase      Function
IF               φ1         Using the TLB, translate an instruction virtual address to a physical address (after a branching decision).
IF               φ2         Send the physical address to the instruction cache.
RD               φ1         Return instruction from instruction cache. Compare tags and validity of fetched instruction.
RD               φ2         Decode instruction. Read register file. If branch, calculate branch target address.
ALU              φ1 + φ2    If register-to-register operation, the arithmetic or logical operation is performed.
ALU              φ1         If a branch, decide whether the branch is to be taken or not. If a memory reference (load or store), calculate data virtual address.
ALU              φ2         If a memory reference, translate data virtual address to physical using the TLB.
MEM              φ1         If a memory reference, send physical address to data cache.
MEM              φ2         If a memory reference, return data from data cache, and check tags.
WB               φ1         Write to register file.

Because the R4000 caches are on-chip, the virtual-to-physical address translation can delay the cache access. This delay is reduced by implementing virtually indexed caches and going to a parallel cache access and address translation. Figure 15.10c shows the optimized R3000 pipeline with this improvement. Because of the compression of events, the data cache tag check is performed separately on the next cycle after cache access. This check determines whether the data item is in the cache.

In a superpipelined system, existing hardware is used several times per cycle by inserting pipeline registers to split up each pipe stage. Essentially, each superpipeline stage operates at a multiple of the base clock frequency, the multiple depending on the degree of superpipelining. The R4000 technology has the speed and density to permit superpipelining of degree 2. Figure 15.11a shows the optimized R3000 pipeline using this superpipelining. Note that this is essentially the same dynamic structure as Figure 15.10c.

Further improvements can be made. For the R4000, a much larger and specialized adder was designed. This makes it possible to execute ALU operations at twice the rate. Other improvements allow the execution of loads and stores at twice the rate. The resulting pipeline is shown in Figure 15.11b.

The R4000 has eight pipeline stages, meaning that as many as eight instructions can be in the pipeline at the same time. The pipeline advances at the rate of two stages per clock cycle. The eight pipeline stages are as follows:

• Instruction fetch first half: Virtual address is presented to the instruction cache and the translation lookaside buffer.
• Instruction fetch second half: Instruction cache outputs the instruction and the TLB generates the physical address.

[Figure 15.11 Theoretical R3000 and Actual R4000 Superpipelines: (a) superpipelined implementation of the optimized R3000 pipeline (IC1, IC2, RF, ALU, ALU, DC1, DC2, TC1, TC2, WB); (b) the R4000 pipeline (IF, IS, RF, EX, DF, DS, TC, WB), advancing two stages per clock cycle. Abbreviations: IF = instruction fetch first half; IS = instruction fetch second half; RF = fetch operands from register; EX = instruction execute; IC = instruction cache; DC = data cache; DF = data cache first half; DS = data cache second half; TC = tag check; WB = write back to register file.]

• Register file: Three activities occur in parallel:
  — Instruction is decoded and check made for interlock conditions (i.e., this instruction depends on the result of a preceding instruction).
  — Instruction cache tag check is made.
  — Operands are fetched from the register file.
• Instruction execute: One of three activities can occur:
  — If the instruction is a register-to-register operation, the ALU performs the arithmetic or logical operation.
  — If the instruction is a load or store, the data virtual address is calculated.
  — If the instruction is a branch, the branch target virtual address is calculated and branch conditions are checked.
• Data cache first: Virtual address is presented to the data cache and TLB.
• Data cache second: The TLB generates the physical address, and the data cache outputs the data.
• Tag check: Cache tag checks are performed for loads and stores.
• Write back: Instruction result is written back to register file.

15.7 SPARC

SPARC (Scalable Processor Architecture) refers to an architecture defined by Sun Microsystems. Sun developed its own SPARC implementation but also licenses the architecture to other vendors to produce SPARC-compatible machines. The SPARC architecture is inspired by the Berkeley RISC I machine, and its instruction set and register organization are based closely on the Berkeley RISC model.

SPARC Register Set

As with the Berkeley RISC, the SPARC makes use of register windows. Each window gives addressability to 24 registers, and the total number of windows is implementation dependent and ranges from 2 to 32 windows. Figure 15.12 illustrates an implementation that supports 8 windows, using a total of 136 physical registers; as the discussion in Section 15.2 indicates, this seems a reasonable number of windows. Physical registers 0 through 7 are global registers shared by all procedures. Each procedure sees logical registers 0 through 31. Logical registers 24 through 31, referred to as ins, are shared with the calling (parent) procedure; and logical registers 8 through 15, referred to as outs, are shared with any called (child) procedure. These two portions overlap with other windows.

[Figure 15.12 SPARC Layout with Three Procedures: physical registers 0-7 hold the globals, seen as logical R0-R7 by every procedure. Procedure A's window occupies physical registers 112-135 (ins R24A-R31A = 128-135, locals R16A-R23A = 120-127, outs R8A-R15A = 112-119); procedure B's window occupies 96-119, with its ins R24B-R31B being the same physical registers as A's outs; procedure C's window occupies 80-103 in the same overlapping fashion.]

Logical registers 16 through 23, referred to as locals, are not shared and do not overlap with other windows. Again, as the discussion of Section 12.1 indicates, the availability of 8 registers for parameter passing should be adequate in most cases (e.g., see Table 15.4).

Figure 15.13 is another view of the register overlap. The calling procedure places any parameters to be passed in its outs registers; the called procedure treats these same physical registers as its ins registers. The processor maintains a current window pointer (CWP), located in the processor status register (PSR), that points to the window of the currently executing procedure. The window invalid mask (WIM), also in the PSR, indicates which windows are invalid.

With the SPARC register architecture, it is usually not necessary to save and restore registers for a procedure call. The compiler is simplified because the compiler need be concerned only with allocating the local registers for a procedure in an efficient manner and need not be concerned with register allocation between procedures.

Instruction Set

Table 15.11 lists the instructions for the SPARC architecture. Most of the instructions reference only register operands. Register-to-register instructions have three operands and can be expressed in the form

Rd ← RS1 op S2

[Figure 15.13 Eight Register Windows Forming a Circular Stack in SPARC: windows w0 through w7 are arranged in a circle, each window's outs overlapping the ins of the next; the current window pointer (CWP) marks the active window and the window invalid mask (WIM) marks invalid windows.]
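To make the overlap in Figures 15.12 and 15.13 concrete, the following C sketch maps a (window, logical register) pair to a physical register number. It is a minimal illustration that simply reproduces the 8-window, 136-register numbering of Figure 15.12 (window 0 corresponds to procedure A); it is not drawn from any particular SPARC implementation:

#include <stdio.h>

/* Map logical register r (0-31) of window w (0-7) to a physical register,
   following the numbering of Figure 15.12: physical 0-7 are the shared
   globals, and each window's outs (logical 8-15) are the same physical
   registers as the next window's ins (logical 24-31).                     */
static int phys_reg(int w, int r)
{
    if (r < 8)
        return r;                                   /* globals */
    return 8 + (((96 - 16 * w + r) % 128) + 128) % 128;
}

int main(void)
{
    /* Window 0's outs coincide with window 1's ins (both physical 112): */
    printf("w0 r8  -> %d   w1 r24 -> %d\n", phys_reg(0, 8), phys_reg(1, 24));
    /* Window 0's ins occupy physical 128-135, as in Figure 15.12:       */
    printf("w0 r24 -> %d   w0 r31 -> %d\n", phys_reg(0, 24), phys_reg(0, 31));
    return 0;
}

Because the windows form a circular buffer, a call simply advances the window pointer (SAVE) and a return moves it back (RESTORE); only on window overflow or underflow does any saving or restoring to memory occur.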

Table 15.11 SPARC Instruction Set

Load/Store Instructions: LDSB Load signed byte; LDSH Load signed halfword; LDUB Load unsigned byte; LDUH Load unsigned halfword; LD Load word; LDD Load doubleword; STB Store byte; STH Store halfword; STD Store word; STDD Store doubleword

Shift Instructions: SLL Shift left logical; SRL Shift right logical; SRA Shift right arithmetic

Boolean Instructions: AND AND; ANDCC AND, set icc; ANDN NAND; ANDNCC NAND, set icc; OR OR; ORCC OR, set icc; ORN NOR; ORNCC NOR, set icc; XOR XOR; XORCC XOR, set icc; XNOR Exclusive NOR; XNORCC Exclusive NOR, set icc

Arithmetic Instructions: ADD Add; ADDCC Add, set icc; ADDX Add with carry; ADDXCC Add with carry, set icc; SUB Subtract; SUBCC Subtract, set icc; SUBX Subtract with carry; SUBXCC Subtract with carry, set icc; MULSCC Multiply step, set icc

Jump/Branch Instructions: BCC Branch on condition; FBCC Branch on floating-point condition; CBCC Branch on coprocessor condition; CALL Call procedure; JMPL Jump and link; TCC Trap on condition; SAVE Advance register window; RESTORE Move windows backward; RETT Return from trap

Miscellaneous Instructions: SETHI Set high 22 bits; UNIMP Unimplemented instruction (trap); RD Read a special register; WR Write a special register; IFLUSH Instruction cache flush

where Rd and RS1 are register references; S2 can refer either to a register or to a 13-bit immediate operand. Register zero (R0) is hardwired with the value 0. This form is well suited to typical programs, which have a high proportion of local scalars and constants.

The available ALU operations can be grouped as follows:

• Integer addition (with or without carry)
• Integer subtraction (with or without carry)
• Bitwise Boolean AND, OR, XOR and their negations
• Shift left logical, right logical, or right arithmetic

All of these instructions, except the shifts, can optionally set the four condition codes (ZERO, NEGATIVE, OVERFLOW, CARRY). Signed integers are represented in 32-bit twos complement form.

Only simple load and store instructions reference memory. There are separate load and store instructions for word (32 bits), doubleword, halfword, and byte. For the latter two cases, there are instructions for loading these quantities as signed or unsigned numbers. Signed numbers are sign extended to fill out the 32-bit destination register. Unsigned numbers are padded with zeros.

The only available addressing mode, other than register, is a displacement mode. That is, the effective address (EA) of an operand consists of a displacement from an address contained in a register:

EA = (RS1) + S2
or EA = (RS1) + (RS2)

depending on whether the second operand is immediate or a register reference. To perform a load or store, an extra stage is added to the instruction cycle. During the second stage, the memory address is calculated using the ALU; the load or store occurs in a third stage.

This single addressing mode is quite versatile and can be used to synthesize other addressing modes, as indicated in Table 15.12. It is instructive to compare the SPARC addressing capability with that of the MIPS. The MIPS makes use of a 16-bit offset, compared with a 13-bit offset on the SPARC. On the other hand, the MIPS does not permit an address to be constructed from the contents of two registers.

Instruction Format

As with the MIPS R4000, SPARC uses a simple set of 32-bit instruction formats (Figure 15.14). All instructions begin with a 2-bit opcode. For most instructions, this is extended with additional opcode bits elsewhere in the format. For the Call instruction, a 30-bit immediate operand is extended with two zero bits to the right to form a 32-bit PC-relative address in twos complement form. Instructions are aligned on a 32-bit boundary so that this form of addressing suffices.

The Branch instruction includes a 4-bit condition field that corresponds to the four standard condition code bits, so that any combination of conditions can be tested. The 22-bit PC-relative address is extended with two zero bits on the right to form a 24-bit twos complement relative address. An unusual feature of the Branch instruction is the annul bit.

Table 15.12 Synthesizing Other Addressing Modes with SPARC Addressing Modes

Instruction Type       Addressing Mode     Algorithm       SPARC Equivalent
Register-to-register   Immediate           operand = A     S2
Load, store            Direct              EA = A          R0 + S2
Register-to-register   Register            EA = R          RS1, RS2
Load, store            Register Indirect   EA = (R)        RS1 + 0
Load, store            Displacement        EA = (R) + A    RS1 + S2

Note: S2 = either a register operand or a 13-bit immediate operand.

Call format:            Op (2 bits) | PC-relative displacement (30)

Branch format:          Op (2 bits) | a (1) | Cond (4) | Op2 (3) | PC-relative displacement (22)

SETHI format:           Op (2 bits) | Dest (5) | Op2 (3) | Immediate constant (22)

Floating-point format:  Op (2 bits) | Dest (5) | Op3 (6) | Src-1 (5) | FP-op (9) | Src-2 (5)

General formats:        Op (2 bits) | Dest (5) | Op3 (6) | Src-1 (5) | 0 (1) | Ignored (8) | Src-2 (5)
                        Op (2 bits) | Dest (5) | Op3 (6) | Src-1 (5) | 1 (1) | Immediate constant (13)

Figure 15.14 SPARC Instruction Formats

When the annul bit is not set, the instruction after the branch is always executed, regardless of whether the branch is taken. This is the typical delayed branch operation found on many RISC machines and described in Section 15.5 (see Figure 15.7). However, when the annul bit is set, the instruction following the branch is executed only if the branch is taken. The processor suppresses the effect of that instruction even though it is already in the pipeline. This annul bit is useful because it makes it easier for the compiler to fill the delay slot following a conditional branch. The instruction that is the target of the branch can always be put in the delay slot, because if the branch is not taken, the instruction can be annulled. The reason this technique is desirable is that conditional branches are generally taken more than half the time.

The SETHI instruction is a special instruction used to form a 32-bit constant. This feature is needed to form large data constants; for example, it can be used to form a large offset for a load or store instruction. The SETHI instruction sets the 22 high-order bits of a register with its 22-bit immediate operand, and zeros out the low-order 10 bits. An immediate constant of up to 13 bits can be specified in one of the general formats, and such an instruction could be used to fill in the remaining 10 bits of the register. A load or store instruction can also be used to achieve a direct addressing mode. To load a value from location K in memory, we could use the following SPARC instructions:

sethi %hi(K), %r8         ;load high-order 22 bits of address of location
                          ;K into register r8
ld [%r8 + %lo(K)], %r8    ;load contents of location K into r8
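The arithmetic behind this two-instruction sequence can be sketched in C (an illustration only; the address value is a made-up example and the variable names are not assembler internals). SETHI supplies bits 31 through 10 of the constant, and the remaining 10 bits travel in the load's 13-bit signed displacement, where they always fit without any sign adjustment:

#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint32_t K = 0x00FACE40u;        /* assumed 32-bit address of location K */

    uint32_t hi22 = K >> 10;         /* %hi(K): the 22 high-order bits       */
    uint32_t lo10 = K & 0x3FFu;      /* %lo(K): the 10 low-order bits        */

    /* sethi %hi(K), %r8 leaves hi22 shifted into bits 31-10 of r8; the ld
       then adds lo10 as its displacement, reconstructing the full address. */
    uint32_t r8 = hi22 << 10;
    assert((r8 | lo10) == K);
    return 0;
}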

The macros %hi and %lo are used to define immediate operands consisting of the appropriate address bits of a location. This use of SETHI is similar to the use of the lui instruction on the MIPS. The floating-point format is used for floating-point operations. Two source and one destination registers are designated. Finally, all other operations, including loads, stores, arithmetic, and logical operations use one of the last two formats shown in Figure 15.14. One of the formats makes use of two source registers and a destination register, while the other uses one source register, one 13-bit immediate operand, and one destination register.

15.8 RISC VERSUS CISC CONTROVERSY

For many years, the general trend in computer architecture and organization has been toward increasing processor complexity: more instructions, more addressing modes, more specialized registers, and so on. The RISC movement represents a fundamental break with the philosophy behind that trend. Naturally, the appearance of RISC systems, and the publication of papers by its proponents extolling RISC virtues, led to a reaction from those involved in the design of CISC architectures.

The work that has been done on assessing merits of the RISC approach can be grouped into two categories:

• Quantitative: Attempts to compare program size and execution speed of programs on RISC and CISC machines that use comparable technology
• Qualitative: Examines issues such as high-level language support and optimum use of VLSI real estate

Most of the work on quantitative assessment has been done by those working on RISC systems [PATT82b, HEAT84, PATT84], and it has been, by and large, favorable to the RISC approach. Others have examined the issue and come away unconvinced [COLW85a, FLYN87, DAVI87]. There are several problems with attempting such comparisons [SERL86]:

• There is no pair of RISC and CISC machines that are comparable in life-cycle cost, level of technology, gate complexity, sophistication of compiler, operating system support, and so on.
• No definitive test set of programs exists. Performance varies with the program.
• It is difficult to sort out hardware effects from effects due to skill in compiler writing.
• Most of the comparative analysis on RISC has been done on "toy" machines rather than commercial products. Furthermore, most commercially available machines advertised as RISC possess a mixture of RISC and CISC characteristics. Thus, a fair comparison with a commercial, "pure-play" CISC machine (e.g., VAX, Pentium) is difficult.

The qualitative assessment is, almost by definition, subjective. Several researchers have turned their attention to such an assessment [COLW85a, WALL85], but the results are, at best, ambiguous, and certainly subject to rebuttal [PATT85b] and, of course, counterrebuttal [COLW85b].

In more recent years, the RISC versus CISC controversy has died down to a great extent. This is because there has been a gradual convergence of the technologies. As chip densities and raw hardware speeds increase, RISC systems have become more complex. At the same time, in an effort to squeeze out maximum performance, CISC designs have focused on issues traditionally associated with RISC, such as an increased number of general-purpose registers and increased emphasis on instruction pipeline design.

15.9 RECOMMENDED READING

Two classic overview papers on RISC are [PATT85a] and [HENN84]. Another survey article is [STAL88]. Accounts of two pioneering RISC efforts are provided by [RADI83] and [PATT82a]. [KANE92] covers the commercial MIPS machine in detail. [MIRA92] provides a good overview of the MIPS R4000. [BASH91] discusses the evolution from the R3000 pipeline to the R4000 superpipeline. The SPARC is covered in some detail in [DEWA90].

BASH91 Bashteen, A.; Lui, I.; and Mullan, J. "A Superpipeline Approach to the MIPS Architecture." Proceedings, COMPCON Spring '91, February 1991.
DEWA90 Dewar, R., and Smosna, M. Microprocessors: A Programmer's View. New York: McGraw-Hill, 1990.
HENN84 Hennessy, J. "VLSI Processor Architecture." IEEE Transactions on Computers, December 1984.
KANE92 Kane, G., and Heinrich, J. MIPS RISC Architecture. Englewood Cliffs, NJ: Prentice Hall, 1992.
MIRA92 Mirapuri, S.; Woodacre, M.; and Vasseghi, N. "The MIPS R4000 Processor." IEEE Micro, April 1992.
PATT82a Patterson, D., and Sequin, C. "A VLSI RISC." Computer, September 1982.
PATT85a Patterson, D. "Reduced Instruction Set Computers." Communications of the ACM, January 1985.
RADI83 Radin, G. "The 801 Minicomputer." IBM Journal of Research and Development, May 1983.
STAL88 Stallings, W. "Reduced Instruction Set Computer Architecture." Proceedings of the IEEE, January 1988.

15.10 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

complex instruction set computer (CISC)
delayed branch
delayed load
high-level language (HLL)
reduced instruction set computer (RISC)
register file
register window
SPARC

Review Questions

15.1 What are some typical distinguishing characteristics of RISC organization?
15.2 Briefly explain the two basic approaches used to minimize register-memory operations on RISC machines.
15.3 If a circular register buffer is used to handle local variables for nested procedures, describe two approaches for handling global variables.
15.4 What are some typical characteristics of a RISC instruction set architecture?
15.5 What is a delayed branch?

Problems

15.1 Considering the call-return pattern in Figure 4.21, how many overflows and underflows (each of which causes a register save/restore) will occur with a window size of
     a. 5?
     b. 8?
     c. 16?
15.2 In the discussion of Figure 15.2, it was stated that only the first two portions of a window are saved or restored. Why is it not necessary to save the temporary registers?
15.3 We wish to determine the execution time for a given program using the various pipelining schemes discussed in Section 15.5. Let
     N = number of executed instructions
     D = number of memory accesses
     J = number of jump instructions
     For the simple sequential scheme (Figure 15.6a), the execution time is 2N + D stages. Derive formulas for two-stage, three-stage, and four-stage pipelining.
15.4 Reorganize the code sequence in Figure 15.6d to reduce the number of NOOPs.
15.5 Consider the following code fragment in a high-level language:
     for I in 1..100 loop
         S := S + Q(I).VAL
     end loop;
     Assume that Q is an array of 32-byte records and the VAL field is in the first 4 bytes of each record. Using x86 code, we can compile this program fragment as follows:
         MOV ECX, 1          ;use register ECX to hold I
     LP: IMUL EAX, ECX, 32   ;get offset in EAX
         MOV EBX, Q[EAX]     ;load VAL field
         ADD S, EBX          ;add to S
         INC ECX             ;increment I
         CMP ECX, 101        ;compare to 101
         JNE LP              ;loop until I = 100
     This program makes use of the IMUL instruction, which multiplies the second operand by the immediate value in the third operand and places the result in the first operand (see Problem 10.13). A RISC advocate would like to demonstrate that a clever compiler can eliminate unnecessarily complex instructions such as IMUL. Provide the demonstration by rewriting the above x86 program without using the IMUL instruction.
15.6 Consider the following loop:
     S := 0;
     for K := 1 to 100 do
         S := S - K;

     A straightforward translation of this into a generic assembly language would look something like this:
         LD R1, 0           ;keep value of S in R1
         LD R2, 1           ;keep value of K in R2
     LP  SUB R1, R1, R2     ;S := S - K
         BEQ R2, 100, EXIT  ;done if K = 100
         ADD R2, R2, 1      ;else increment K
         JMP LP             ;back to start of loop
     A compiler for a RISC machine will introduce delay slots into this code so that the processor can employ the delayed branch mechanism. The JMP instruction is easy to deal with, because this instruction is always followed by the SUB instruction; therefore, we can simply place a copy of the SUB instruction in the delay slot after the JMP. The BEQ presents a difficulty. We can't leave the code as is, because the ADD instruction would then be executed one too many times. Therefore, a NOP instruction is needed. Show the resulting code.
15.7 A RISC machine's compiler may do both a mapping of symbolic registers to actual registers and a rearrangement of instructions for pipeline efficiency. An interesting question arises as to the order in which these two operations should be done. Consider the following program fragment:
         LD SR1, A          ;load A into symbolic register 1
         LD SR2, B          ;load B into symbolic register 2
         ADD SR3, SR1, SR2  ;add contents of SR1 and SR2 and store in SR3
         LD SR4, C
         LD SR5, D
         ADD SR6, SR4, SR5
     a. First do the register mapping and then any possible instruction reordering. How many machine registers are used? Has there been any pipeline improvement?
     b. Starting with the original program, now do instruction reordering and then any possible mapping. How many machine registers are used? Has there been any pipeline improvement?
15.8 Add entries for the following processors to Table 15.7:
     a. Pentium II
     b. ARM
15.9 In many cases, common machine instructions that are not listed as part of the MIPS instruction set can be synthesized with a single MIPS instruction. Show this for the following:
     a. Register-to-register move
     b. Increment, decrement
     c. Complement
     d. Negate
     e. Clear
15.10 A SPARC implementation has K register windows. What is the number N of physical registers?
15.11 SPARC is lacking a number of instructions commonly found on CISC machines. Some of these are easily simulated using either register R0, which is always set to 0, or a constant operand. These simulated instructions are called pseudoinstructions and are recognized by the SPARC assembler. Show how to simulate the following pseudoinstructions, each with a single SPARC instruction. In all of these, src and dst refer to registers. (Hint: A store to R0 has no effect.)
     a. MOV src, dst         d. NOT dst     g. DEC dst
     b. COMPARE src1, src2   e. NEG dst     h. CLR dst
     c. TEST src1            f. INC dst     i. NOP

15.12 Consider the following code fragment:
     if K > 10
         L := K + 1
     else
         L := K - 1;
     A straightforward translation of this statement into SPARC assembler could take the following form:
         sethi %hi(K), %r8           ;load high-order 22 bits of address of location
                                     ;K into register r8
         ld [%r8 + %lo(K)], %r8      ;load contents of location K into r8
         cmp %r8, 10                 ;compare contents of r8 with 10
         ble L1                      ;branch if (r8) ≤ 10
         sethi %hi(K), %r9
         ld [%r9 + %lo(K)], %r9      ;load contents of location K into r9
         inc %r9                     ;add 1 to (r9)
         sethi %hi(L), %r10
         st %r9, [%r10 + %lo(L)]     ;store (r9) into location L
         b L2
         nop
     L1: sethi %hi(K), %r11
         ld [%r11 + %lo(K)], %r12    ;load contents of location K into r12
         dec %r12                    ;subtract 1 from (r12)
         sethi %hi(L), %r13
         st %r12, [%r13 + %lo(L)]    ;store (r12) into location L
     L2:
     The code contains a nop after each branch instruction to permit delayed branch operation.
     a. Standard compiler optimizations that have nothing to do with RISC machines are generally effective in being able to perform two transformations on the foregoing code. Notice that two of the loads are unnecessary and that the two stores can be merged if the store is moved to a different place in the code. Show the program after making these two changes.
     b. It is now possible to perform some optimizations peculiar to SPARC. The nop after the ble can be replaced by moving another instruction into that delay slot and setting the annul bit on the ble instruction (expressed as ble,a L1). Show the program after this change.
     c. There are now two unnecessary instructions. Remove these and show the resulting program.

CHAPTER 16
INSTRUCTION-LEVEL PARALLELISM AND SUPERSCALAR PROCESSORS

16.1 Overview
     Superscalar versus Superpipelined
     Constraints
16.2 Design Issues
     Instruction-Level Parallelism and Machine Parallelism
     Instruction Issue Policy
     Register Renaming
     Machine Parallelism
     Branch Prediction
     Superscalar Execution
     Superscalar Implementation
16.3 Pentium 4
     Front End
     Out-of-Order Execution Logic
     Integer and Floating-Point Execution Units
16.4 ARM Cortex-A8
     Instruction Fetch Unit
     Instruction Decode Unit
     Integer Execute Unit
     SIMD and Floating-Point Pipeline
16.5 Recommended Reading
16.6 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:

• Explain the difference between superscalar and superpipelined approaches.
• Define instruction-level parallelism.
• Discuss dependencies and resource conflicts as limitations to instruction-level parallelism.
• Present an overview of the design issues involved in instruction-level parallelism.
• Compare and contrast techniques of improving pipeline performance in RISC machines and superscalar machines.

A superscalar implementation of a processor architecture is one in which common instructions—integer and floating-point arithmetic, loads, stores, and conditional branches—can be initiated simultaneously and executed independently. Such implementations raise a number of complex design issues related to the instruction pipeline.

Superscalar design arrived on the scene hard on the heels of RISC architecture. Although the simplified instruction set architecture of a RISC machine lends itself readily to superscalar techniques, the superscalar approach can be used on either a RISC or CISC architecture.

Whereas the gestation period for the arrival of commercial RISC machines from the beginning of true RISC research with the IBM 801 and the Berkeley RISC I was seven or eight years, the first superscalar machines became commercially available within just a year or two of the coining of the term superscalar. The superscalar approach has now become the standard method for implementing high-performance microprocessors.

In this chapter, we begin with an overview of the superscalar approach, contrasting it with superpipelining. Next, we present the key design issues associated with superscalar implementation. Then we look at several important examples of superscalar architecture.

16.1 OVERVIEW

The term superscalar, first coined in 1987 [AGER87], refers to a machine that is designed to improve the performance of the execution of scalar instructions. In most applications, the bulk of the operations are on scalar quantities. Accordingly, the superscalar approach represents the next step in the evolution of high-performance general-purpose processors.

The essence of the superscalar approach is the ability to execute instructions independently and concurrently in different pipelines. The concept can be further exploited by allowing instructions to be executed in an order different from the program order. Figure 16.1 compares, in general terms, the scalar and superscalar approaches. In a traditional scalar organization, there is a single pipelined functional unit for integer operations and one for floating-point operations.

[Figure 16.1 Superscalar Organization Compared to Ordinary Scalar Organization: (a) scalar organization, with a single pipelined integer functional unit and a single pipelined floating-point functional unit, each with its own register file, connected to memory; (b) superscalar organization, with multiple pipelined integer and floating-point functional units.]

Parallelism is achieved by enabling multiple instructions to be at different stages of the pipeline at one time. In the superscalar organization, there are multiple functional units, each of which is implemented as a pipeline. Each individual functional unit provides a degree of parallelism by virtue of its pipelined structure. The use of multiple functional units enables the processor to execute streams of instructions in parallel, one stream for each pipeline. It is the responsibility of the hardware, in conjunction with the compiler, to assure that the parallel execution does not violate the intent of the program.

Many researchers have investigated superscalar-like processors, and their research indicates that some degree of performance improvement is possible. Table 16.1 presents the reported performance advantages.

Table 16.1 Reported Speedups of Superscalar-Like Machines

Reference    Speedup
[TJAD70]     1.8
[KUCK77]     8
[WEIS84]     1.58
[ACOS86]     2.7
[SOHI90]     1.8
[SMIT89]     2.3
[JOUP89b]    2.2
[LEE91]      7

The differences in the results arise from differences both in the hardware of the simulated machine and in the applications being simulated.

Superscalar versus Superpipelined

An alternative approach to achieving greater performance is referred to as superpipelining, a term first coined in 1988 [JOUP88]. Superpipelining exploits the fact that many pipeline stages perform tasks that require less than half a clock cycle. Thus, a doubled internal clock speed allows the performance of two tasks in one external clock cycle. We have seen one example of this approach with the MIPS R4000.

Figure 16.2 compares the two approaches. The upper part of the diagram illustrates an ordinary pipeline, used as a base for comparison.

[Figure 16.2 Comparison of Superscalar and Superpipeline Approaches: successive instructions shown over time in base cycles (0-9) for a simple 4-stage pipeline, a superpipelined implementation, and a superscalar implementation. Key: instruction fetch, decode, execute, write.]

The base pipeline issues one instruction per clock cycle and can perform one pipeline stage per clock cycle. The pipeline has four stages: instruction fetch, operation decode, operation execution, and result write back. The execution stage is crosshatched for clarity. Note that although several instructions are executing concurrently, only one instruction is in its execution stage at any one time.

The next part of the diagram shows a superpipelined implementation that is capable of performing two pipeline stages per clock cycle. An alternative way of looking at this is that the functions performed in each stage can be split into two nonoverlapping parts and each can execute in half a clock cycle. A superpipeline implementation that behaves in this fashion is said to be of degree 2. Finally, the lowest part of the diagram shows a superscalar implementation capable of executing two instances of each stage in parallel. Higher-degree superpipeline and superscalar implementations are of course possible.

Both the superpipeline and the superscalar implementations depicted in Figure 16.2 have the same number of instructions executing at the same time in the steady state. The superpipelined processor falls behind the superscalar processor at the start of the program and at each branch target.

Constraints

The superscalar approach depends on the ability to execute multiple instructions in parallel. The term instruction-level parallelism refers to the degree to which, on average, the instructions of a program can be executed in parallel. A combination of compiler-based optimization and hardware techniques can be used to maximize instruction-level parallelism. Before examining the design techniques used in superscalar machines to increase instruction-level parallelism, we need to look at the fundamental limitations to parallelism with which the system must cope. [JOHN91] lists five limitations:

• True data dependency
• Procedural dependency
• Resource conflicts
• Output dependency
• Antidependency

We examine the first three of these limitations in the remainder of this section. A discussion of the last two must await some of the developments in the next section.

TRUE DATA DEPENDENCY Consider the following sequence:1

ADD EAX, ECX  ;load register EAX with the contents of ECX plus the contents of EAX
MOV EBX, EAX  ;load EBX with the contents of EAX

The second instruction can be fetched and decoded but cannot execute until the first instruction executes. The reason is that the second instruction needs data

1For the Intel x86 assembly language, a semicolon starts a comment field.

Figure 16.3 Effect of Dependencies (no dependency; data dependency, where i1 uses data computed by i0; procedural dependency following a branch; resource conflict, where i0 and i1 use the same functional unit)

produced by the first instruction. This situation is referred to as a true data dependency (also called flow dependency or read after write [RAW] dependency).
Figure 16.3 illustrates this dependency in a superscalar machine of degree 2. With no dependency, two instructions can be fetched and executed in parallel. If there is a data dependency between the first and second instructions, then the second instruction is delayed as many clock cycles as required to remove the dependency. In general, any instruction must be delayed until all of its input values have been produced.
In a simple pipeline, such as illustrated in the upper part of Figure 16.2, the aforementioned sequence of instructions would cause no delay. However, consider the following, in which one of the loads is from memory rather than from a register:

MOV EAX, eff  ;load register EAX with the contents of effective memory address eff
MOV EBX, EAX  ;load EBX with the contents of EAX

A typical RISC processor takes two or more cycles to perform a load from memory when the load is a cache hit. It can take tens or even hundreds of cycles for a cache miss on all cache levels, because of the delay of an off-chip memory access. One way to compensate for this delay is for the compiler to reorder instructions so that one or more subsequent instructions that do not depend on the memory load can begin flowing through the pipeline. This scheme is less effective in the case of a superscalar pipeline: The independent instructions executed during the load are likely to be executed on the first cycle of the load, leaving the processor with nothing to do until the load completes.
PROCEDURAL DEPENDENCIES As was discussed in Chapter 14, the presence of branches in an instruction sequence complicates the pipeline operation. The instructions following a branch (taken or not taken) have a procedural dependency on the branch and cannot be executed until the branch is executed. Figure 16.3 illustrates the effect of a branch on a superscalar pipeline of degree 2.
As we have seen, this type of procedural dependency also affects a scalar pipeline. The consequence for a superscalar pipeline is more severe, because a greater magnitude of opportunity is lost with each delay.
If variable-length instructions are used, then another sort of procedural dependency arises. Because the length of any particular instruction is not known, it must be at least partially decoded before the following instruction can be fetched. This prevents the simultaneous fetching required in a superscalar pipeline. This is one of the reasons that superscalar techniques are more readily applicable to a RISC or RISC-like architecture, with its fixed instruction length.
RESOURCE CONFLICT A resource conflict is a competition of two or more instructions for the same resource at the same time. Examples of resources include memories, caches, buses, register-file ports, and functional units (e.g., ALU adder).
In terms of the pipeline, a resource conflict exhibits similar behavior to a data dependency (Figure 16.3). There are some differences, however. For one thing, resource conflicts can be overcome by duplication of resources, whereas a true data dependency cannot be eliminated. Also, when an operation takes a long time to complete, resource conflicts can be minimized by pipelining the appropriate functional unit.
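These constraints, together with the output dependencies and antidependencies taken up in Section 16.2, can be stated in terms of which registers an instruction reads and writes. The following Python sketch is purely illustrative (the instruction encoding and helper names are assumptions, not from the text); it scans a short register-transfer fragment pairwise and reports RAW (true data), WAW (output), and WAR (anti) relationships. The pairwise scan is deliberately naive: it does not track intervening redefinitions of a register.

# A minimal sketch (not from the text): classify register dependencies between
# pairs of instructions. Each instruction is modeled as (dest, [sources]).
# RAW = true data dependency, WAW = output dependency, WAR = antidependency
# (the last two are discussed in Section 16.2).

def classify_dependencies(instructions):
    """Return a list of (earlier_index, later_index, kind) tuples."""
    deps = []
    for i, (dest_i, srcs_i) in enumerate(instructions):
        for j in range(i + 1, len(instructions)):
            dest_j, srcs_j = instructions[j]
            if dest_i in srcs_j:
                deps.append((i, j, "RAW"))   # the later instruction reads what i wrote
            if dest_i == dest_j:
                deps.append((i, j, "WAW"))   # both instructions write the same register
            if dest_j in srcs_i:
                deps.append((i, j, "WAR"))   # the later instruction overwrites a source of i
    return deps

# The four-instruction fragment used later in the chapter (I1 through I4).
fragment = [
    ("R3", ["R3", "R5"]),   # I1: R3 <- R3 op R5
    ("R4", ["R3"]),         # I2: R4 <- R3 + 1
    ("R3", ["R5"]),         # I3: R3 <- R5 + 1
    ("R7", ["R3", "R4"]),   # I4: R7 <- R3 op R4
]

for i, j, kind in classify_dependencies(fragment):
    print(f"I{i+1} -> I{j+1}: {kind}")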

16.2 DESIGN ISSUES

Instruction-Level Parallelism and Machine Parallelism

[JOUP89a] makes an important distinction between the two related concepts of instruction-level parallelism and machine parallelism. Instruction-level parallelism exists when instructions in a sequence are independent and thus can be executed in parallel by overlapping.
As an example of the concept of instruction-level parallelism, consider the following two code fragments [JOUP89b]:

Load  R1 ← R2                 Add   R3 ← R3, "1"
Add   R3 ← R3, "1"            Add   R4 ← R3, R2
Add   R4 ← R4, R2             Store [R4] ← R0

The three instructions on the left are independent, and in theory all three could be executed in parallel. In contrast, the three instructions on the right cannot be executed in parallel because the second instruction uses the result of the first, and the third instruction uses the result of the second.
The degree of instruction-level parallelism is determined by the frequency of true data dependencies and procedural dependencies in the code. These factors, in turn, are dependent on the instruction set architecture and on the application. Instruction-level parallelism is also determined by what [JOUP89a] refers to as operation latency: the time until the result of an instruction is available for use as an operand in a subsequent instruction. The latency determines how much of a delay a data or procedural dependency will cause.
Machine parallelism is a measure of the ability of the processor to take advantage of instruction-level parallelism. Machine parallelism is determined by the number of instructions that can be fetched and executed at the same time (the number of parallel pipelines) and by the speed and sophistication of the mechanisms that the processor uses to find independent instructions.
Both instruction-level and machine parallelism are important factors in enhancing performance. A program may not have enough instruction-level parallelism to take full advantage of machine parallelism. The use of a fixed-length instruction set architecture, as in a RISC, enhances instruction-level parallelism. On the other hand, limited machine parallelism will limit performance no matter what the nature of the program.

Instruction Issue Policy

As was mentioned, machine parallelism is not simply a matter of having multiple instances of each pipeline stage. The processor must also be able to identify instruction-level parallelism and orchestrate the fetching, decoding, and execution of instructions in parallel. [JOHN91] uses the term instruction issue to refer to the process of initiating instruction execution in the processor's functional units and the term instruction issue policy to refer to the protocol used to issue instructions. In general, we can say that instruction issue occurs when an instruction moves from the decode stage of the pipeline to the first execute stage of the pipeline.
In essence, the processor is trying to look ahead of the current point of execution to locate instructions that can be brought into the pipeline and executed. Three types of orderings are important in this regard:
• The order in which instructions are fetched
• The order in which instructions are executed
• The order in which instructions update the contents of register and memory locations
The more sophisticated the processor, the less it is bound by a strict relationship between these orderings. To optimize utilization of the various pipeline elements, the processor will need to alter one or more of these orderings with respect to the ordering to be found in a strict sequential execution. The one constraint on the processor is that the result must be correct. Thus, the processor must accommodate the various dependencies and conflicts discussed earlier.

In general terms, we can group superscalar instruction issue policies into the following categories:
• In-order issue with in-order completion
• In-order issue with out-of-order completion
• Out-of-order issue with out-of-order completion
IN-ORDER ISSUE WITH IN-ORDER COMPLETION The simplest instruction issue policy is to issue instructions in the exact order that would be achieved by sequential execution (in-order issue) and to write results in that same order (in-order completion). Not even scalar pipelines follow such a simple-minded policy. However, it is useful to consider this policy as a baseline for comparing more sophisticated approaches.
Figure 16.4a gives an example of this policy. We assume a superscalar pipeline capable of fetching and decoding two instructions at a time, having three separate functional units (e.g., two integer arithmetic and one floating-point arithmetic), and having two instances of the write-back pipeline stage. The example assumes the following constraints on a six-instruction code fragment:
• I1 requires two cycles to execute.
• I3 and I4 conflict for the same functional unit.
• I5 depends on the value produced by I4.
• I5 and I6 conflict for a functional unit.
Instructions are fetched two at a time and passed to the decode unit. Because instructions are fetched in pairs, the next two instructions must wait until the pair of decode pipeline stages has cleared. To guarantee in-order completion, when there is a conflict for a functional unit or when a functional unit requires more than one cycle to generate a result, the issuing of instructions temporarily stalls. In this example, the elapsed time from decoding the first instruction to writing the last results is eight cycles.
IN-ORDER ISSUE WITH OUT-OF-ORDER COMPLETION Out-of-order completion is used in scalar RISC processors to improve the performance of instructions that require multiple cycles. Figure 16.4b illustrates its use on a superscalar processor. Instruction I2 is allowed to run to completion prior to I1. This allows I3 to be completed earlier, with the net result of a savings of one cycle.
With out-of-order completion, any number of instructions may be in the execution stage at any one time, up to the maximum degree of machine parallelism across all functional units. Instruction issuing is stalled by a resource conflict, a data dependency, or a procedural dependency.
In addition to the aforementioned limitations, a new dependency, which we referred to earlier as an output dependency (also called write after write [WAW] dependency), arises. The following code fragment illustrates this dependency (op represents any operation):

I1: R3 ← R3 op R5
I2: R4 ← R3 + 1
I3: R3 ← R5 + 1
I4: R7 ← R3 op R4

Figure 16.4 Superscalar Instruction Issue and Completion Policies: (a) in-order issue and in-order completion; (b) in-order issue and out-of-order completion; (c) out-of-order issue and out-of-order completion (each part shows the decode, execute, and write stages occupied by instructions I1 through I6 in each cycle; part (c) also shows the instruction window)

Instruction I2 cannot execute before instruction I1, because it needs the result in register R3 produced in I1; this is an example of a true data dependency, as described in Section 16.1. Similarly, I4 must wait for I3, because it uses a result produced by I3. What about the relationship between I1 and I3? There is no data dependency here, as we have defined it. However, if I3 executes to completion prior to I1, then the wrong value of the contents of R3 will be fetched for the execution of I4. Consequently, I3 must complete after I1 to produce the correct output values. To ensure this, the issuing of the third instruction must be stalled if its result might later be overwritten by an older instruction that takes longer to complete.

Out-of-order completion requires more complex instruction issue logic than in-order completion. In addition, it is more difficult to deal with instruction interrupts and exceptions. When an interrupt occurs, instruction execution at the current point is suspended, to be resumed later. The processor must assure that the resumption takes into account that, at the time of interruption, instructions ahead of the instruction that caused the interrupt may already have completed.
OUT-OF-ORDER ISSUE WITH OUT-OF-ORDER COMPLETION With in-order issue, the processor will only decode instructions up to the point of a dependency or conflict. No additional instructions are decoded until the conflict is resolved. As a result, the processor cannot look ahead of the point of conflict to subsequent instructions that may be independent of those already in the pipeline and that may be usefully introduced into the pipeline.
To allow out-of-order issue, it is necessary to decouple the decode and execute stages of the pipeline. This is done with a buffer referred to as an instruction window. With this organization, after a processor has finished decoding an instruction, it is placed in the instruction window. As long as this buffer is not full, the processor can continue to fetch and decode new instructions. When a functional unit becomes available in the execute stage, an instruction from the instruction window may be issued to the execute stage. Any instruction may be issued, provided that (1) it needs the particular functional unit that is available, and (2) no conflicts or dependencies block this instruction. Figure 16.5 suggests this organization.
The result of this organization is that the processor has a lookahead capability, allowing it to identify independent instructions that can be brought into the execute stage. Instructions are issued from the instruction window with little regard for their original program order. As before, the only constraint is that the program execution behaves correctly.
Figure 16.4c illustrates this policy. During each of the first three cycles, two instructions are fetched into the decode stage. During each cycle, subject to the constraint of the buffer size, two instructions move from the decode stage to the instruction window. In this example, it is possible to issue instruction I6 ahead of I5 (recall that I5 depends on I4, but I6 does not). Thus, one cycle is saved in both the execute and write-back stages, and the end-to-end savings, compared with Figure 16.4b, is one cycle.
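The issue logic just described can be approximated in a few lines of code. The following Python sketch is a highly simplified model (not Stallings' simulation and not any real machine): it ignores the fetch, decode, and write-back stages and simply issues, each cycle, up to two instructions from a window whenever their producing instructions have finished and the functional unit they name is free. Run on the six-instruction example, it issues I6 ahead of the stalled I5, which is the essential behavior of out-of-order issue; the absolute cycle counts differ from Figure 16.4 because of the omitted stages.

# A minimal sketch of out-of-order issue from an instruction window.
# Functional-unit names, latencies, and the issue width are illustrative choices.

from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    unit: str                                  # functional unit required
    latency: int                               # execution cycles
    deps: list = field(default_factory=list)   # names of producing instructions

def simulate(window, issue_width=2):
    finish = {}       # instruction name -> cycle in which its result is available
    unit_free = {}    # functional unit -> first cycle it can accept a new instruction
    pending = list(window)
    cycle = 0
    while pending:
        cycle += 1
        issued = 0
        for ins in list(pending):
            data_ready = all(finish.get(d, 10**9) < cycle for d in ins.deps)
            unit_ready = unit_free.get(ins.unit, 0) <= cycle
            if data_ready and unit_ready and issued < issue_width:
                finish[ins.name] = cycle + ins.latency - 1
                unit_free[ins.unit] = cycle + ins.latency
                pending.remove(ins)
                issued += 1
                print(f"cycle {cycle}: issue {ins.name} on {ins.unit}")
    return finish

# The six-instruction example: I1 takes two cycles, I3 and I4 share a unit,
# I5 needs I4's result, and I5/I6 share a unit.
program = [
    Instr("I1", "int0", 2),
    Instr("I2", "int1", 1),
    Instr("I3", "fp",   1),
    Instr("I4", "fp",   1),
    Instr("I5", "int0", 1, deps=["I4"]),
    Instr("I6", "int0", 1),
]
simulate(program)   # note that I6 is issued before I5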

Figure 16.5 Organization for Out-of-Order Issue with Out-of-Order Completion (an in-order front end of fetch, decode, rename, and dispatch stages feeding a buffer of instructions; out-of-order issue, register read, execute, and write back; in-order commit)

The instruction window is depicted in Figure 16.4c to illustrate its role. However, this window is not an additional pipeline stage. An instruction being in the window simply implies that the processor has sufficient information about that instruction to decide when it can be issued.
The out-of-order issue, out-of-order completion policy is subject to the same constraints described earlier. An instruction cannot be issued if it violates a dependency or conflict. The difference is that more instructions are available for issuing, reducing the probability that a pipeline stage will have to stall. In addition, a new dependency, which we referred to earlier as an antidependency (also called write after read [WAR] dependency), arises. The code fragment considered earlier illustrates this dependency:

I1: R3 ← R3 op R5
I2: R4 ← R3 + 1
I3: R3 ← R5 + 1
I4: R7 ← R3 op R4

Instruction I3 cannot complete execution before instruction I2 begins execution and has fetched its operands. This is so because I3 updates register R3, which is a source operand for I2. The term antidependency is used because the constraint is similar to that of a true data dependency, but reversed: Instead of the first instruction producing a value that the second instruction uses, the second instruction destroys a value that the first instruction uses.

Reorder Buffer Simulator
Tomasulo's Algorithm Simulator
Alternative Simulation of Tomasulo's Algorithm

One common technique that is used to support out-of-order completion is the reorder buffer. The reorder buffer is temporary storage for results completed out of order that are then committed to the register file in program order. A related concept is Tomasulo's algorithm. Appendix I examines these concepts.

Register Renaming

When out-of-order instruction issuing and/or out-of-order instruction completion are allowed, we have seen that this gives rise to the possibility of WAW dependencies and WAR dependencies. These dependencies differ from RAW data dependencies and resource conflicts, which reflect the flow of data through a program and the sequence of execution. WAW dependencies and WAR dependencies, on the other hand, arise because the values in registers may no longer reflect the sequence of values dictated by the program flow.
When instructions are issued in sequence and complete in sequence, it is possible to specify the contents of each register at each point in the execution. When out-of-order techniques are used, the values in registers cannot be fully known at each point in time just from a consideration of the sequence of instructions dictated by the program. In effect, values are in conflict for the use of registers, and the processor must resolve those conflicts by occasionally stalling a pipeline stage.
Antidependencies and output dependencies are both examples of storage conflicts. Multiple instructions are competing for the use of the same register locations, generating pipeline constraints that retard performance. The problem is made more acute when register optimization techniques are used (as discussed in Chapter 15), because these compiler techniques attempt to maximize the use of registers, hence maximizing the number of storage conflicts.
One method for coping with these types of storage conflicts is based on a traditional resource-conflict solution: duplication of resources. In this context, the technique is referred to as register renaming. In essence, registers are allocated dynamically by the processor hardware, and they are associated with the values needed by instructions at various points in time. When a new register value is created (i.e., when an instruction executes that has a register as a destination operand), a new register is allocated for that value. Subsequent instructions that access that value as a source operand in that register must go through a renaming process: the register references in those instructions must be revised to refer to the register containing the needed value. Thus, the same original register reference in several different instructions may refer to different actual registers, if different values are intended.
Let us consider how register renaming could be used on the code fragment we have been examining:

I1: R3b ← R3a op R5a
I2: R4b ← R3b + 1
I3: R3c ← R5a + 1
I4: R7b ← R3c op R4b

The register reference without the subscript refers to the logical register reference found in the instruction. The register reference with the subscript refers to a hardware register allocated to hold a new value. When a new allocation is made for a particular logical register, subsequent instruction references to that logical register as a source operand are made to refer to the most recently allocated hardware register (recent in terms of the program sequence of instructions).
In this example, the creation of register R3c in instruction I3 avoids the WAR dependency on the second instruction and the WAW dependency on the first instruction, and it does not interfere with the correct value being accessed by I4. The result is that I3 can be issued immediately; without renaming, I3 cannot be issued until the first instruction is complete and the second instruction is issued.
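The renaming process itself is mechanical, and a short Python sketch can make it concrete (illustrative only; the instruction format and the unbounded supply of physical registers are assumptions, not from the text): each logical destination is mapped to a fresh physical register, and source operands are rewritten to the most recent mapping in program order.

# A minimal register-renaming sketch applied to the fragment I1 through I4.
from itertools import count

def rename(program, num_logical=8):
    free_tags = count(0)                                   # endless supply of physical registers
    mapping = {f"R{i}": f"P{next(free_tags)}" for i in range(num_logical)}
    renamed = []
    for dest, op, sources in program:                      # (dest, operation, [sources])
        new_sources = [mapping[s] for s in sources]        # read the most recent mapping
        mapping[dest] = f"P{next(free_tags)}"              # allocate a new physical register
        renamed.append((mapping[dest], op, new_sources))
    return renamed

fragment = [
    ("R3", "op", ["R3", "R5"]),   # I1: R3 <- R3 op R5
    ("R4", "+1", ["R3"]),         # I2: R4 <- R3 + 1
    ("R3", "+1", ["R5"]),         # I3: R3 <- R5 + 1
    ("R7", "op", ["R3", "R4"]),   # I4: R7 <- R3 op R4
]

for i, (dest, op, srcs) in enumerate(rename(fragment), start=1):
    print(f"I{i}: {dest} <- {op} {', '.join(srcs)}")
# I3 now writes a different physical register than I1, so the WAW and WAR
# conflicts disappear and only the true (RAW) dependencies remain.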

Scoreboarding Simulator

An alternative to register renaming is scoreboarding. In essence, scoreboarding is a bookkeeping technique that allows instructions to execute whenever they are not dependent on previous instructions and no structural hazards are present. See Appendix I for a discussion.
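As a concrete illustration of that bookkeeping, the following Python sketch (an illustrative toy, not Appendix I's algorithm; all names are assumptions) checks whether an instruction may begin execution given the registers with outstanding writes and the busy functional units:

# A minimal scoreboarding-style issue check.
def can_issue(instr, pending_writes, busy_units):
    dest, unit, sources = instr
    if unit in busy_units:                                 # structural hazard
        return False
    if any(s in pending_writes for s in sources):          # RAW hazard on a source
        return False
    if dest in pending_writes:                             # WAW hazard on the destination
        return False
    return True

# Example: R3 is still being written by an earlier, still-executing instruction.
pending_writes = {"R3"}
busy_units = {"FP-MUL"}
print(can_issue(("R4", "INT-ALU", ["R3"]), pending_writes, busy_units))  # False: waits for R3
print(can_issue(("R6", "INT-ALU", ["R5"]), pending_writes, busy_units))  # True: independent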

Machine Parallelism

In the preceding discussion, we have looked at three hardware techniques that can be used in a superscalar processor to enhance performance: duplication of resources, out-of-order issue, and renaming. One study that illuminates the relationship among these techniques was reported in [SMIT89]. The study made use of a simulation that modeled a machine with the characteristics of the MIPS R2000, augmented with various superscalar features. A number of different program sequences were simulated.
Figure 16.6 shows the results. In each of the graphs, the vertical axis corresponds to the mean speedup of the superscalar machine over the scalar machine. The horizontal axis shows the results for four alternative processor organizations. The base machine does not duplicate any of the functional units, but it can issue instructions out of order. The second configuration duplicates the load/store functional unit that accesses a data cache. The third configuration duplicates the ALU, and the fourth configuration duplicates both load/store and ALU. In each graph, results are shown for instruction window sizes of 8, 16, and 32 instructions, which dictates the amount of lookahead the processor can do. The difference between the two graphs is that, in the second, register renaming is allowed. This is equivalent to saying that the first graph reflects a machine that is limited by all dependencies, whereas the second graph corresponds to a machine that is limited only by true dependencies.
The two graphs, combined, yield some important conclusions. The first is that it is probably not worthwhile to add functional units without register renaming.

Figure 16.6 Speedups of Various Machine Organizations without Procedural Dependencies (speedup versus machine organization: base, ld/st, alu, both; window sizes of 8, 16, and 32 instructions; shown without renaming and with renaming)

There is some slight improvement in performance, but at the cost of increased hardware complexity. With register renaming, which eliminates antidependencies and output dependencies, noticeable gains are achieved by adding more functional units. Note, however, that there is a significant difference in the amount of gain achievable between using an instruction window of 8 versus a larger instruction window. This indicates that if the instruction window is too small, data dependencies will prevent effective utilization of the extra functional units; the processor must be able to look quite far ahead to find independent instructions to utilize the hardware more fully.

Pipeline with Static vs. Dynamic Scheduling—Simulator

Branch Prediction

Any high-performance pipelined machine must address the issue of dealing with branches. For example, the Intel 80486 addressed the problem by fetching both the next sequential instruction after a branch and speculatively fetching the branch target instruction. However, because there are two pipeline stages between prefetch and execution, this strategy incurs a two-cycle delay when the branch gets taken.
With the advent of RISC machines, the delayed branch strategy was explored. This allows the processor to calculate the result of conditional branch instructions before any unusable instructions have been prefetched. With this method, the processor always executes the single instruction that immediately follows the branch. This keeps the pipeline full while the processor fetches a new instruction stream.
With the development of superscalar machines, the delayed branch strategy has less appeal. The reason is that multiple instructions need to execute in the delay slot, raising several problems relating to instruction dependencies. Thus, superscalar machines have returned to pre-RISC techniques of branch prediction. Some, like the PowerPC 601, use a simple static branch prediction technique. More sophisticated processors, such as the PowerPC 620 and the Pentium 4, use dynamic branch prediction based on branch history analysis.

Superscalar Execution

We are now in a position to provide an overview of superscalar execution of programs; this is illustrated in Figure 16.7. The program to be executed consists of a linear sequence of instructions. This is the static program as written by the programmer or generated by the compiler. The instruction fetch stage, which includes branch prediction, is used to form a dynamic stream of instructions. This stream is examined for dependencies, and the processor may remove artificial dependencies. The processor then dispatches the instructions into a window of execution. In this window, instructions no longer form a sequential stream but are structured according to their true data dependencies. The processor executes each instruction in an order determined by the true data dependencies and hardware resource availability. Finally, instructions are conceptually put back into sequential order and their results are recorded.

Figure 16.7 Conceptual Depiction of Superscalar Processing (static program; instruction fetch and branch prediction; instruction dispatch into a window of execution; instruction issue; instruction execution; instruction reorder and commit)

The final step mentioned in the preceding paragraph is referred to as committing, or retiring, the instruction. This step is needed for the following reason. Because of the use of parallel, multiple pipelines, instructions may complete in an order different from that shown in the static program. Further, the use of branch prediction and speculative execution means that some instructions may complete execution and then must be abandoned because the branch they represent is not taken. Therefore, permanent storage and program-visible registers cannot be updated immediately when instructions complete execution. Results must be held in some sort of temporary storage that is usable by dependent instructions and then made permanent when it is determined that the sequential model would have executed the instruction.

Superscalar Implementation

Based on our discussion so far, we can make some general comments about the processor hardware required for the superscalar approach. [SMIT95] lists the following key elements:
• Instruction fetch strategies that simultaneously fetch multiple instructions, often by predicting the outcomes of, and fetching beyond, conditional branch instructions. These functions require the use of multiple pipeline fetch and decode stages, and branch prediction logic.
• Logic for determining true dependencies involving register values, and mechanisms for communicating these values to where they are needed during execution.
• Mechanisms for initiating, or issuing, multiple instructions in parallel.
• Resources for parallel execution of multiple instructions, including multiple pipelined functional units and memory hierarchies capable of simultaneously servicing multiple memory references.
• Mechanisms for committing the process state in correct order.

16.3 PENTIUM 4

Although the concept of superscalar design is generally associated with the RISC architecture, the same superscalar principles can be applied to a CISC machine. Perhaps the most notable example of this is the Pentium. The evolution of superscalar concepts in the Intel line is interesting to note. The 386 is a traditional CISC nonpipelined machine. The 486 introduced the first pipelined x86 processor, reducing the average latency of integer operations from between two and four cycles to one cycle, but still limited to executing a single instruction each cycle, with no superscalar elements. The original Pentium had a modest superscalar component, consisting of the use of two separate integer execution units. The Pentium Pro introduced a full-blown superscalar design with out-of-order execution. Subsequent x86 models have refined and enhanced the superscalar design.
A general block diagram of the Pentium 4 was shown in Figure 4.18. Figure 16.8 depicts the same structure in a way more suitable for the pipeline discussion in this section. The operation of the Pentium 4 can be summarized as follows:
1. The processor fetches instructions from memory in the order of the static program.
2. Each instruction is translated into one or more fixed-length RISC instructions, known as micro-operations, or micro-ops.

Figure 16.8 Pentium 4 Block Diagram (BTB = branch target buffer; D-TLB = data translation lookaside buffer; I-TLB = instruction translation lookaside buffer)

Figure 16.9 Pentium 4 Pipeline (a 20-stage pipeline whose stages are labeled, in order, TC Nxt IP, TC Fetch, Drive, Alloc, Rename, Que, Sch, Disp, RF, Ex, Flgs, Br Ck, Drive)
Key: TC Next IP = trace cache next instruction pointer; TC Fetch = trace cache fetch; Alloc = allocate; Rename = register renaming; Que = micro-op queuing; Sch = micro-op scheduling; Disp = dispatch; RF = register file; Ex = execute; Flgs = flags; Br Ck = branch check

3. The processor executes the micro-ops on a superscalar pipeline organization, so that the micro-ops may be executed out of order.
4. The processor commits the results of each micro-op execution to the processor's register set in the order of the original program flow.
In effect, the Pentium 4 architecture implements a CISC instruction set architecture on a RISC microarchitecture. The inner RISC micro-ops pass through a pipeline with at least 20 stages (Figure 16.9); in some cases, the micro-op requires multiple execution stages, resulting in an even longer pipeline. This contrasts with the five-stage pipeline (Figure 14.21) used on the earlier Intel x86 processors and on the Pentium.
We now trace the operation of the Pentium 4 pipeline, using Figure 16.10 to illustrate its operation.

Front End

GENERATION OF MICRO-OPS The Pentium 4 organization includes an in-order front end (Figure 16.10a) that can be considered outside the scope of the pipeline depicted in Figure 16.9. This front end feeds into an L1 instruction cache, called the trace cache, which is where the pipeline proper begins. Usually, the processor operates from the trace cache; when a trace cache miss occurs, the in-order front end feeds new instructions into the trace cache.
With the aid of the branch target buffer and the instruction lookaside buffer (BTB & I-TLB), the fetch/decode unit fetches x86 machine instructions from the L2 cache 64 bytes at a time. As a default, instructions are fetched sequentially, so that each L2 cache line fetch includes the next instruction to be fetched. Branch prediction via the BTB & I-TLB unit may alter this sequential fetch operation. The ITLB translates the linear instruction pointer address given it into physical addresses needed to access the L2 cache. Static branch prediction in the front-end BTB is used to determine which instructions to fetch next.
Once instructions are fetched, the fetch/decode unit scans the bytes to determine instruction boundaries; this is a necessary operation because of the variable length of x86 instructions. The decoder translates each machine instruction into from one to four micro-ops, each of which is a 118-bit RISC instruction. Note for comparison that most pure RISC machines have an instruction length of just 32 bits. The longer micro-op length is required to accommodate the more complex x86 instructions. Nevertheless, the micro-ops are easier to manage than the original instructions from which they derive. The generated micro-ops are stored in the trace cache.

Figure 16.10 Pentium Pipeline Operation: (a) generation of micro-ops; (b) trace cache next instruction pointer; (c) trace cache fetch; (d) drive; (e) allocate, register renaming; (f) micro-op queuing

TRACE CACHE NEXT INSTRUCTION POINTER The first two pipeline stages (Figure 16.10b) deal with the selection of instructions in the trace cache and involve a separate branch prediction mechanism from that described in the previous section. The Pentium 4 uses a dynamic branch prediction strategy based on the

Figure 16.10 Pentium Pipeline Operation (continued): (g) micro-op scheduling; (h) dispatch; (i) register file; (j) execute, flags; (k) branch check; (l) branch check result

history of recent executions of branch instructions. A branch target buffer (BTB) is maintained that caches information about recently encountered branch instructions. Whenever a branch instruction is encountered in the instruction stream, the BTB is checked. If an entry already exists in the BTB, then the prediction is guided

by the history information for that entry in determining whether to predict that the branch is taken. If a branch is predicted, then the branch destination address associated with this entry is used for prefetching the branch target instruction. Once the instruction is executed, the history portion of the appropriate entry is updated to reflect the result of the branch instruction. If this instruction is not represented in the BTB, then the address of this instruction is loaded into an entry in the BTB; if necessary, an older entry is deleted.
The description of the preceding two paragraphs fits, in general terms, the branch prediction strategy used on the original Pentium model, as well as the later Pentium models, including Pentium 4. However, in the case of the Pentium, a relatively simple 2-bit history scheme is used. The later Pentium models have much longer pipelines (20 stages for the Pentium 4 compared with 5 stages for the Pentium) and therefore the penalty for misprediction is greater. Accordingly, the later Pentium models use a more elaborate branch prediction scheme with more history bits to reduce the misprediction rate.
The Pentium 4 BTB is organized as a four-way set-associative cache with 512 lines. Each entry uses the address of the branch as a tag. The entry also includes the branch destination address for the last time this branch was taken and a 4-bit history field. This use of four history bits contrasts with the 2 bits used in the original Pentium and used in most superscalar processors. With 4 bits, the Pentium 4 mechanism can take into account a longer history in predicting branches. The algorithm that is used is referred to as Yeh's algorithm [YEH91]. The developers of this algorithm have demonstrated that it provides a significant reduction in misprediction compared to algorithms that use only 2 bits of history [EVER98].
Conditional branches that do not have a history in the BTB are predicted using a static prediction algorithm, according to the following rules:

• For branch addresses that are not IP relative, predict taken if the branch is a return and not taken otherwise.
• For IP-relative backward conditional branches, predict taken. This rule reflects the typical behavior of loops.
• For IP-relative forward conditional branches, predict not taken.
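These static rules are simple enough to express directly in code. The following Python sketch is illustrative only (the branch description fields are assumptions, not Intel's implementation); it returns the prediction for a conditional branch that has no history in the BTB.

# A minimal sketch of the static prediction rules listed above.
def static_predict(branch):
    """branch: dict with keys 'ip_relative' (bool), 'is_return' (bool),
    'backward' (bool). Returns True if the branch is predicted taken."""
    if not branch["ip_relative"]:
        return branch["is_return"]      # taken only if the branch is a return
    if branch["backward"]:
        return True                     # backward IP-relative: typical loop, predict taken
    return False                        # forward IP-relative: predict not taken

# Example: a loop-closing backward branch with no BTB history is predicted taken.
print(static_predict({"ip_relative": True, "is_return": False, "backward": True}))   # True
print(static_predict({"ip_relative": True, "is_return": False, "backward": False}))  # False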

TRACE CACHE FETCH The trace cache (Figure 16.10c) takes the already-decoded micro-ops from the instruction decoder and assembles them into program-ordered sequences of micro-ops called traces. Micro-ops are fetched sequentially from the trace cache, subject to the branch prediction logic.
A few instructions require more than four micro-ops. These instructions are transferred to microcode ROM, which contains the series of micro-ops (five or more) associated with a complex machine instruction. For example, a string instruction may translate into a very large (even hundreds), repetitive sequence of micro-ops. Thus, the microcode ROM is a microprogrammed control unit in the sense discussed in Part Four. After the microcode ROM finishes sequencing micro-ops for the current Pentium instruction, fetching resumes from the trace cache.

DRIVE The fifth stage (Figure 16.10d) of the Pentium 4 pipeline delivers decoded instructions from the trace cache to the rename/allocator module.

Out-of-Order Execution Logic

This part of the processor reorders micro-ops to allow them to execute as quickly as their input operands are ready.
ALLOCATE The allocate stage (Figure 16.10e) allocates resources required for execution. It performs the following functions:
• If a needed resource, such as a register, is unavailable for one of the three micro-ops arriving at the allocator during a clock cycle, the allocator stalls the pipeline.
• The allocator allocates a reorder buffer (ROB) entry, which tracks the completion status of one of the 126 micro-ops that could be in process at any time.2
• The allocator allocates one of the 128 integer or floating-point register entries for the result data value of the micro-op, and possibly a load or store buffer used to track one of the 48 loads or 24 stores in the machine pipeline.
• The allocator allocates an entry in one of the two micro-op queues in front of the instruction schedulers.
The ROB is a circular buffer that can hold up to 126 micro-ops and also contains the 128 hardware registers. Each buffer entry consists of the following fields:
• State: Indicates whether this micro-op is scheduled for execution, has been dispatched for execution, or has completed execution and is ready for retirement.
• Memory Address: The address of the Pentium instruction that generated the micro-op.
• Micro-op: The actual operation.
• Alias Register: If the micro-op references one of the 16 architectural registers, this entry redirects that reference to one of the 128 hardware registers.
Micro-ops enter the ROB in order. Micro-ops are then dispatched from the ROB to the Dispatch/Execute unit out of order. The criterion for dispatch is that the appropriate execution unit and all necessary data items required for this micro-op are available. Finally, micro-ops are retired from the ROB in order. To accomplish in-order retirement, micro-ops are retired oldest first after each micro-op has been designated as ready for retirement.
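The in-order allocate, out-of-order complete, in-order retire behavior of a reorder buffer can be illustrated with a minimal Python sketch (a behavioral model only, not the Pentium 4 design; the class and field names are assumptions):

# A minimal circular-reorder-buffer sketch: entries are allocated in program
# order, may be marked complete out of order, and retire strictly oldest-first.
from collections import deque

class ReorderBuffer:
    def __init__(self, capacity=126):          # 126 matches the in-flight micro-op limit above
        self.capacity = capacity
        self.entries = deque()                  # each entry: dict with micro-op and state

    def allocate(self, micro_op):
        if len(self.entries) == self.capacity:
            return None                         # the allocator would stall the pipeline
        entry = {"uop": micro_op, "done": False}
        self.entries.append(entry)              # in-order allocation
        return entry

    def complete(self, entry):
        entry["done"] = True                    # execution finished, possibly out of order

    def retire(self):
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["uop"])   # commit oldest first
        return retired

rob = ReorderBuffer()
a = rob.allocate("uop A")
b = rob.allocate("uop B")
rob.complete(b)                     # B finishes first...
print(rob.retire())                 # ...but nothing retires: [] (A is still pending)
rob.complete(a)
print(rob.retire())                 # ['uop A', 'uop B'] retire in program order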

REGISTER RENAMING The rename stage (Figure 16.10e) remaps references to the 16 architectural registers (8 floating-point registers, plus EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP) into a set of 128 physical registers. The stage removes false dependencies caused by a limited number of architectural registers while preserving the true data dependencies (reads after writes).

MICRO-OP QUEUING After resource allocation and register renaming, micro-ops are placed in one of two micro-op queues (Figure 16.10f), where they are held until there is room in the schedulers. One of the two queues is for memory operations

2See Appendix I for a discussion of reorder buffers.

(loads and stores) and the other for micro-ops that do not involve memory references. Each queue obeys a FIFO (first-in-first-out) discipline, but no order is maintained between queues. That is, a micro-op may be read out of one queue out of order with respect to micro-ops in the other queue. This provides greater flexibility to the schedulers.

MICRO-OP SCHEDULING AND DISPATCHING The schedulers (Figure 16.10g) are responsible for retrieving micro-ops from the micro-op queues and dispatching these for execution. Each scheduler looks for micro-ops whose status indicates that the micro-op has all of its operands. If the execution unit needed by that micro-op is available, then the scheduler fetches the micro-op and dispatches it to the appropriate execution unit (Figure 16.10h). Up to six micro-ops can be dispatched in one cycle. If more than one micro-op is available for a given execution unit, then the scheduler dispatches them in sequence from the queue. This is a sort of FIFO discipline that favors in-order execution, but by this time the instruction stream has been so rearranged by dependencies and branches that it is substantially out of order.
Four ports attach the schedulers to the execution units. Port 0 is used for both integer and floating-point instructions, with the exception of simple integer operations and the handling of branch mispredictions, which are allocated to Port 1. In addition, MMX execution units are allocated between these two ports. The remaining ports are for memory loads and stores.
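A simplified Python sketch of the scheduling decision follows (the port names, micro-op names, and per-cycle limit of six are taken from the text; everything else is an assumption, and the model collapses queuing and scheduling into one step): a micro-op is dispatched only when its operands are ready and the port it needs has not already accepted a micro-op this cycle.

# A minimal dispatch-cycle sketch for ready micro-ops.
def dispatch_cycle(queues, ready, max_dispatch=6):
    """queues: list of FIFO lists of (uop, port) pairs; ready: set of micro-ops
    whose source operands are available. Returns the micro-ops dispatched."""
    dispatched = []
    busy_ports = set()
    for queue in queues:                          # one memory queue, one non-memory queue
        for item in list(queue):
            uop, port = item
            if len(dispatched) == max_dispatch:   # per-cycle dispatch limit
                return dispatched
            if uop in ready and port not in busy_ports:
                busy_ports.add(port)              # each port accepts one micro-op per cycle here
                queue.remove(item)
                dispatched.append(uop)
    return dispatched

# Hypothetical micro-ops: the load is dispatched even though an older ALU
# micro-op in the other queue is still waiting for its operands.
alu_queue = [("add1", "port0"), ("add2", "port1")]
mem_queue = [("load1", "port2")]
print(dispatch_cycle([mem_queue, alu_queue], ready={"load1", "add2"}))   # ['load1', 'add2']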

Integer and Floating-Point Execution Units

The integer and floating-point register files are the source for pending operations by the execution units (Figure 16.10i). The execution units retrieve values from the register files as well as from the L1 data cache (Figure 16.10j). A separate pipeline stage is used to compute flags (e.g., zero, negative); these are typically the input to a branch instruction.
A subsequent pipeline stage performs branch checking (Figure 16.10k). This function compares the actual branch result with the prediction. If a branch prediction turns out to have been wrong, then there are micro-operations in various stages of processing that must be removed from the pipeline. The proper branch destination is then provided to the branch predictor during a drive stage (Figure 16.10l), which restarts the whole pipeline from the new target address.

16.4 ARM CORTEX-A8

Recent implementations of the ARM architecture have seen the introduction of superscalar techniques in the instruction pipeline. In this section, we focus on the ARM Cortex-A8, which provides a good example of a RISC-based superscalar design.
The Cortex-A8 is in the ARM family of processors that ARM refers to as application processors. An ARM application processor is an embedded processor running complex operating systems for wireless, consumer and imaging applications.

Figure 16.11 Architectural Block Diagram of ARM Cortex-A8 (13-stage integer pipeline: instruction fetch, 2 stages; instruction decode, 5 stages; instruction execute and load/store, 6 stages; plus the NEON unit with a 10-stage SIMD pipeline, the L2 cache, and the bus interface unit)

The Cortex-A8 targets a wide variety of mobile and consumer applications including mobile phones, set-top boxes, gaming consoles and automotive navigation/entertainment systems.
Figure 16.11 shows a logical view of the Cortex-A8 architecture, emphasizing the flow of instructions among functional units. The main instruction flow is through three functional units that implement a dual, in-order-issue, 13-stage pipeline. The Cortex designers decided to stay with in-order issue to keep additional power required to a minimum. Out-of-order issue and retire can require extensive amounts of logic consuming extra power.

Figure 16.12 ARM Cortex-A8 Integer Pipeline: (a) instruction fetch pipeline (stages F0 to F2); (b) instruction decode pipeline (stages D0 to D4); (c) instruction execute and load/store pipeline (stages E0 to E5)

Figure 16.12 shows the details of the main Cortex-A8 pipeline. There is a separate SIMD (single-instruction-multiple-data) unit that implements a 10-stage pipeline.

Instruction Fetch Unit

The instruction fetch unit predicts the instruction stream, fetches instructions from the L1 instruction cache, and places the fetched instructions into a buffer for consumption by the decode pipeline. The instruction fetch unit also includes the L1 instruction cache. Because there can be several unresolved branches in the pipeline, instruction fetches are speculative, meaning there is no guarantee that they are

executed. A branch or exceptional instruction in the code stream can cause a pipeline flush, discarding the currently fetched instructions. The instruction fetch unit can fetch up to four instructions per cycle, and goes through the following stages:

F0 The address generation unit (AGU) generates a new virtual address. Normally, this address is the next address sequentially from the preceding fetch address. The address can also be a branch target address provided by a branch prediction for a previous instruction. F0 is not counted as part of the 13-stage pipeline, because ARM processors have traditionally defined instruction cache access as the first stage.
F1 The calculated address is used to fetch instructions from the L1 instruction cache. In parallel, the fetch address is used to access the branch prediction arrays to determine if the next fetch address should be based on a branch prediction.
F2 Instruction data are placed into the instruction queue. If an instruction results in branch prediction, the new target address is sent to the address generation unit.

To minimize the branch penalties typically associated with a deeper pipeline, the Cortex-A8 processor implements a two-level global history branch predictor, consisting of the branch target buffer (BTB) and the global history buffer (GHB). These data structures are accessed in parallel with instruction fetches. The BTB indicates whether or not the current fetch address will return a branch instruction and its branch target address. It contains 512 entries. On a hit in the BTB a branch is predicted and the GHB is accessed. The GHB consists of 4096 2-bit counters that encode the strength and direction information of branches. The GHB is indexed by the 10-bit history of the direction of the last ten branches encountered and 4 bits of the PC. In addition to the dynamic branch predictor, a return stack is used to predict subroutine return addresses. The return stack has eight 32-bit entries that store the link register value in r14 and the ARM or Thumb state of the calling function. When a return-type instruction is predicted taken, the return stack provides the last pushed address and state.
The instruction fetch unit can fetch and queue up to 12 instructions. It issues instructions to the decode unit two at a time. The queue enables the instruction fetch unit to prefetch ahead of the rest of the integer pipeline and build up a backlog of instructions ready for decoding.
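The behavior of a global-history predictor of this kind can be sketched briefly in Python. The sizing follows the text (4096 2-bit counters, a 10-bit global history, 4 bits of the PC), but the exact way the index is formed from history and PC is an assumption, not the Cortex-A8 design.

# A minimal global-history branch predictor sketch.
class GlobalHistoryPredictor:
    def __init__(self):
        self.counters = [1] * 4096      # 2-bit saturating counters, start weakly not-taken
        self.history = 0                # 10-bit global history of recent branch outcomes

    def _index(self, pc):
        # Assumption: fold 10 history bits and 4 PC bits into a 12-bit index.
        return ((self.history & 0x3FF) ^ ((pc & 0xF) << 6)) & 0xFFF

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2      # taken if the counter is 2 or 3

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & 0x3FF   # shift in the outcome

# Train on an always-taken branch at a hypothetical PC: once the 10-bit history
# saturates with taken outcomes, the counter for that history/PC combination
# trains up and the branch is predicted taken.
p = GlobalHistoryPredictor()
for _ in range(12):
    p.update(0x8000, taken=True)
print(p.predict(0x8000))    # True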

Instruction Decode Unit

The instruction decode unit decodes and sequences all ARM and Thumb instructions. It has a dual pipeline structure, called pipe0 and pipe1, so that two instructions can progress through the unit at a time. When two instructions are issued from the instruction decode pipeline, pipe0 will always contain the older instruction in program order. This means that if the instruction in pipe0 cannot issue, then the instruction in pipe1 will not issue. All issued instructions progress in order down the execution pipeline with results written back into the register file at the end of the execution pipeline. This in-order instruction issue and retire prevents WAR hazards and keeps the tracking of WAW hazards and recovery from flush conditions straightforward. Thus, the main concern of the instruction decode pipeline is the prevention of RAW hazards.
Each instruction goes through five stages of processing.
D0 Thumb instructions are decompressed into 32-bit ARM instructions. A preliminary decode function is performed.
D1 The instruction decode function is completed.
D2 This stage writes instructions into and reads instructions from the pending/replay queue structure.
D3 This stage contains the instruction scheduling logic. A scoreboard predicts register availability using static scheduling techniques.3 Hazard checking is also done at this stage.
D4 Performs the final decode for all the control signals required by the integer execute and load/store units.
In the first two stages, the instruction type, the source and destination operands, and resource requirements for the instruction are determined. A few less commonly used instructions are referred to as multicycle instructions. The D1 stage breaks these instructions down into multiple instruction opcodes that are sequenced individually through the execution pipeline.
The pending queue serves two purposes. First, it prevents a stall signal from D3 from rippling any further up the pipeline. Second, by buffering instructions, there should always be two instructions available for the dual pipeline. In the case where only one instruction is issued, the pending queue enables two instructions to proceed down the pipeline together, even if they were originally sent from the fetch unit in different cycles.
The replay operation is designed to deal with the effects of the memory system on instruction timing. Instructions are statically scheduled in the D3 stage based on a prediction of when the source operand will be available. Any stall from the memory system can result in a minimum of an 8-cycle delay. This 8-cycle delay minimum is balanced with the minimum number of possible cycles to receive data from the L2 cache in the case of an L1 load miss. Table 16.2 gives the most common cases that can result in an instruction replay because of a memory system stall. To deal with these stalls, a recovery mechanism is used to flush all subsequent instructions in the execution pipeline and reissue (replay) them. To support replay, instructions are copied into the replay queue before they are issued and removed as they write back their results and retire. If a replay signal is issued, instructions are retrieved from the replay queue and reenter the pipeline.
The decode unit issues two instructions in parallel to the execution unit, unless it encounters an issue restriction. Table 16.3 shows the most common restriction cases.
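The copy-on-issue, remove-on-retire behavior of the replay queue can be sketched in a few lines of Python (a behavioral illustration only; the class and method names are assumptions, not ARM's design):

# A minimal replay-queue sketch.
from collections import deque

class ReplayQueue:
    def __init__(self):
        self.in_flight = deque()            # issued but not yet retired, in program order

    def issue(self, instr):
        self.in_flight.append(instr)        # a copy is kept in case a replay is needed
        return instr

    def retire(self):
        if self.in_flight:
            return self.in_flight.popleft() # retired instructions leave the queue

    def replay(self):
        # On a replay event (e.g., an L1 load miss discovered after static
        # scheduling), all un-retired instructions reenter the pipeline.
        return list(self.in_flight)

rq = ReplayQueue()
for name in ["LDR r0", "ADD r1", "STR r1"]:
    rq.issue(name)
rq.retire()                 # the load retires normally
print(rq.replay())          # ['ADD r1', 'STR r1'] are reissued after a stall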

Integer Execute Unit

The instruction execute unit consists of two symmetric arithmetic logic unit (ALU) pipelines, an address generator for load and store instructions, and the multiply

3See Appendix I for a discussion of scoreboarding.

Table 16.2 Cortex-A8 Memory System Effects on Instruction Timings

Replay Event: Load data miss
Delay: 8 cycles
Description:
1. A load instruction misses in the L1 data cache.
2. A request is then made to the L2 data cache.
3. If a miss also occurs in the L2 data cache, then a second replay occurs. The number of stall cycles depends on the external system memory timing. The minimum time required to receive the critical word for an L2 cache miss is approximately 25 cycles, but can be much longer because of L3 memory latencies.

Replay Event: Data TLB miss
Delay: 24 cycles
Description:
1. A table walk because of a miss in the L1 TLB causes a 24-cycle delay, assuming the translation table entries are found in the L2 cache.
2. If the translation table entries are not present in the L2 cache, the number of stall cycles depends on the external system memory timing.

Replay Event: Store buffer full
Delay: 8 cycles plus latency to drain fill buffer
Description:
1. A store instruction miss does not result in any stalls unless the store buffer is full.
2. In the case of a full store buffer, the delay is at least eight cycles. The delay can be more if it takes longer to drain some entries from the store buffer.

Replay Event: Unaligned load or store request
Delay: 8 cycles
Description:
1. If a load instruction address is unaligned and the full access is not contained within a 128-bit boundary, there is an 8-cycle penalty.
2. If a store instruction address is unaligned and the full access is not contained within a 64-bit boundary, there is an 8-cycle penalty.

pipeline. The execute pipelines also perform register write back. The instruction execute unit:
• Executes all integer ALU and multiply operations, including flag generation
• Generates the virtual addresses for loads and stores and the base write-back value, when required
• Supplies formatted data for stores and forwards data and flags
• Processes branches and other changes of instruction stream and evaluates instruction condition codes
For ALU instructions, either pipeline can be used, consisting of the following stages:
E0 Access register file. Up to six registers can be read from the register file for two instructions.
E1 The shift unit (see Figure 14.25) performs its function, if needed.
E2 The ALU unit (see Figure 14.25) performs its function.
E3 If needed, this stage completes saturation arithmetic used by some ARM data processing instructions.

Table 16.3 Cortex-A8 Dual-Issue Restrictions

Restriction Type: Load/store resource hazard
Description: There is only one LS pipeline. Only one LS instruction can be issued per cycle. It can be in pipeline 0 or pipeline 1.
Example (instruction, cycle, restriction):
  LDR r5, [r6]        1
  STR r7, [r8]        2    Wait for LS unit
  MOV r9, r10         2    Dual issue possible

Restriction Type: Multiply resource hazard
Description: There is only one multiply pipeline, and it is only available in pipeline 0.
Example (instruction, cycle, restriction):
  ADD r1, r2, r3      1
  MUL r4, r5, r6      2    Wait for pipeline 0
  MUL r7, r8, r9      3    Wait for multiply unit

Restriction Type: Branch resource hazard
Description: There can be only one branch per cycle. It can be in pipeline 0 or pipeline 1. A branch is any instruction that changes the PC.
Example (instruction, cycle, restriction):
  BX r1               1
  BEQ 0x1000          2    Wait for branch
  ADD r1, r2, r3      2    Dual issue possible

Restriction Type: Data output hazard
Description: Instructions with the same destination cannot be issued in the same cycle. This can happen with conditional code.
Example (instruction, cycle, restriction):
  MOVEQ r1, r2        1
  MOVNE r1, r3        2    Wait because of output dependency
  LDR r5, [r6]        2    Dual issue possible

Restriction Type: Data source hazard
Description: Instructions cannot be issued if their data is not available. See the scheduling tables for source requirements and stages results.
Example (instruction, cycle, restriction):
  ADD r1, r2, r3      1
  ADD r4, r1, r6      2    Wait for r1
  LDR r7, [r4]        4    Wait two cycles for r4

Restriction Type: Multi-cycle instructions
Description: Multi-cycle instructions must issue in pipeline 0 and can only dual issue in their last iteration.
Example (instruction, cycle, restriction):
  MOV r1, r2          1
  LDM r3, {r4-r7}     2    Wait for pipeline 0, transfer r4
  LDM (cycle 2)       3    Transfer r5, r6
  LDM (cycle 3)       4    Transfer r7
  ADD r8, r9, r10     4    Dual issue possible on last transfer

E4 Any change in control flow, including branch misprediction, exceptions, and memory system replays, is prioritized and processed.
E5 Results of ARM instructions are written back into the register file.

Instructions that invoke the multiply unit (see Figure 14.25) are routed to pipe0; the multiply operation is performed in stages E1 through E3, and the multiply accumulate operation in stage E4.

The load/store pipeline runs in parallel with the integer pipeline. The stages are as follows:

E1 The memory address is generated from the base and index register.
E2 The address is applied to the cache arrays.
E3 In the case of a load, data are returned and formatted for forwarding to the ALU or MUL unit. In the case of a store, the data are formatted and ready to be written into the cache.
E4 Performs updates to the L2 cache, if required.
E5 Results of ARM instructions are written back into the register file.
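To make the dual-issue restrictions of Table 16.3 concrete, the following is a minimal C sketch of an issue checker. The instruction record and its fields are invented for illustration, and the check is deliberately conservative: it ignores the forwarding cases in which a dependent pair can still dual issue (such as a result that is not needed until E3).

#include <stdbool.h>

/* Simplified instruction classes, mirroring the resource hazards of Table 16.3. */
enum iclass { ALU, LOAD_STORE, MULTIPLY, BRANCH };

struct insn {
    enum iclass cls;
    int dest;            /* destination register, -1 if none */
    int src[2];          /* source registers, -1 if unused   */
};

/* Can 'second' be issued in the same cycle as 'first'?  A rough model:
 * only one load/store per cycle, only one branch per cycle, a multiply
 * must go to pipeline 0 (so it cannot be the second of a pair), no two
 * instructions may share a destination, and the second instruction may
 * not consume a result the first produces in that same cycle. */
static bool can_dual_issue(const struct insn *first, const struct insn *second)
{
    if (first->cls == LOAD_STORE && second->cls == LOAD_STORE)
        return false;                        /* single LS pipeline        */
    if (first->cls == BRANCH && second->cls == BRANCH)
        return false;                        /* one branch per cycle      */
    if (second->cls == MULTIPLY)
        return false;                        /* multiply only in pipe 0   */
    if (first->dest >= 0 && first->dest == second->dest)
        return false;                        /* data output (WAW) hazard  */
    for (int i = 0; i < 2; i++)
        if (second->src[i] >= 0 && second->src[i] == first->dest)
            return false;                    /* data source (RAW) hazard  */
    return true;
}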

Table 16.4 Cortex-A8 Example Dual Issue Instruction Sequence for Integer Pipeline

Cycle | Program Counter | Instruction | Timing Description
1  | 0x00000ed0 | BX r14                   | Dual issue pipeline 0
1  | 0x00000ee4 | CMP r0,#0                | Dual issue in pipeline 1
2  | 0x00000ee8 | MOV r3,#3                | Dual issue pipeline 0
2  | 0x00000eec | MOV r0,#0                | Dual issue in pipeline 1
3  | 0x00000ef0 | STREQ r3,[r1,#0]         | Dual issue in pipeline 0, r3 not needed until E3
3  | 0x00000ef4 | CMP r2,#4                | Dual issue in pipeline 1
4  | 0x00000ef8 | LDRLS pc,[pc,r2,LSL #2]  | Single issue pipeline 0, +1 cycle for load to pc, no extra cycle for shift since LSL #2
5  | 0x00000f2c | MOV r0,#1                | Dual issue with 2nd iteration of load in pipeline 1
6  | 0x00000f30 | B {pc}+8 #0xf38          | Dual issue pipeline 0
6  | 0x00000f38 | STR r0,[r1,#0]           | Dual issue pipeline 1
7  | 0x00000f3c | LDR pc,[r13],#4          | Single issue pipeline 0, +1 cycle for load to pc
8  | 0x0000017c | ADD r2,r4,#0xc           | Dual issue with 2nd iteration of load in pipeline 1
9  | 0x00000180 | LDR r0,[r6,#4]           | Dual issue pipeline 0
9  | 0x00000184 | MOV r1,#0xa              | Dual issue pipeline 1
12 | 0x00000188 | LDR r0,[r0,#0]           | Single issue pipeline 0: r0 produced in E3, required in E1, so +2 cycle stall
13 | 0x0000018c | STR r0,[r4,#0]           | Single issue pipeline 0 due to LS resource hazard, no extra delay for r0 since produced in E3 and consumed in E3
14 | 0x00000190 | LDR r0,[r4,#0xc]         | Single issue pipeline 0 due to LS resource hazard
15 | 0x00000194 | LDMFD r13!,{r4-r6,r14}   | Load multiple: loads r4 in 1st cycle, r5 and r6 in 2nd cycle, r14 in 3rd cycle, 3 cycles total
17 | 0x00000198 | B {pc}+0xda8 #0xf40      | Dual issue in pipeline 1 with 3rd cycle of LDM
18 | 0x00000f40 | ADD r0,r0,#2             | ARM, single issue in pipeline 0
19 | 0x00000f44 | ADD r0,r1,r0             | ARM, single issue in pipeline 0, no dual issue due to hazard on r0 produced in E2 and required in E2

Table 16.4 shows a sample code segment and indicates how the processor might schedule it.

SIMD and Floating-Point Pipeline

All SIMD and floating-point instructions pass through the integer pipeline and are processed in a separate 10-stage pipeline (Figure 16.13).

[Figure 16.13 (diagram): instructions flow from a 16-entry instruction queue through decode, scoreboard, and issue logic to several execution pipes that write back to the NEON register file: an integer ALU/MAC/shift pipe, a non-IEEE FMUL pipe, a non-IEEE FADD pipe, an IEEE single/double-precision VFP unit, and a load/store and permute pipe with an 8-entry store queue.]

Figure 16.13 ARM Cortex-A8 NEON and Floating-Point Pipeline

This unit, referred to as the NEON unit, handles packed SIMD instructions and provides two types of floating-point support. If implemented, a vector floating-point (VFP) coprocessor performs floating-point operations in compliance with IEEE 754. If the coprocessor is not present, then separate multiply and add pipelines implement the floating-point operations.

16.5 RECOMMENDED READING

Two good book-length treatments of superscalar design are [SHEN05] and [OMON99]. Worthwhile survey articles on the subject are [SMIT95] and [SIMA97]. [JOUP89a] examines instruction-level parallelism, looks at various techniques for maximizing parallelism, and compares superscalar and superpipelined approaches using simulation. Recent papers that provide good coverage of superscalar design issues include [SIMA04], [PATT01], and [MOSH01].
[POPE91] provides a detailed look at a proposed superscalar machine. It also provides an excellent tutorial on the design issues related to out-of-order instruction policies. Another look at a proposed system is found in [KUGA91]; this article raises and considers most of the important design issues for superscalar implementation. [LEE91] examines software techniques that can be used to enhance superscalar performance. [WALL91] is an interesting study of the extent to which instruction-level parallelism can be exploited in a superscalar processor.

Volume I of [INTE04a] provides a general description of the Pentium 4 pipeline; more detail is provided in [INTE01a] and [INTE01b]. Another detailed treatment is [FOG08b].
[JOHN08] and [ARM08a] provide thorough coverage of the ARM Cortex-A8 pipeline. [RICH07] is a good overview.

ARM08a ARM Limited. Cortex-A8 Technical Reference Manual. ARM DDI 0344E, 2008. www.arm.com
FOG08b Fog, A. The Microarchitecture of Intel and AMD CPUs. Copenhagen University College of Engineering, 2008. http://www.agner.org/optimize/
HINT01 Hinton, G., et al. "The Microarchitecture of the Pentium 4 Processor." Intel Technology Journal, Q1 2001. http://developer.intel.com/technology/itj/
INTE01a Intel Corp. Intel Pentium 4 Processor Optimization Reference Manual. Document 248966-04, 2001. http://developer.intel.com/design/Pentium4/documentation.htm
INTE01b Intel Corp. Desktop Performance and Optimization for Intel Pentium 4 Processor. Document 248966-04, 2001. http://developer.intel.com/design/Pentium4/documentation.htm
INTE04a Intel Corp. IA-32 Intel Architecture Software Developer's Manual (4 volumes). Document 253665 through 253668, 2004. http://developer.intel.com/design/Pentium4/documentation.htm
JOHN08 John, E., and Rubio, J. Unique Chips and Systems. Boca Raton, FL: CRC Press, 2008.
JOUP89a Jouppi, N., and Wall, D. "Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines." Proceedings, Third International Conference on Architectural Support for Programming Languages and Operating Systems, April 1989.
KUGA91 Kuga, M.; Murakami, K.; and Tomita, S. "DSNS (Dynamically-hazard resolved, Statically-code-scheduled, Nonuniform Superscalar): Yet Another Superscalar Processor Architecture." Computer Architecture News, June 1991.
LEE91 Lee, R.; Kwok, A.; and Briggs, F. "The Floating Point Performance of a Superscalar SPARC Processor." Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.
MOSH01 Moshovos, A., and Sohi, G. "Microarchitectural Innovations: Boosting Microprocessor Performance Beyond Semiconductor Technology Scaling." Proceedings of the IEEE, November 2001.
OMON99 Omondi, A. The Microarchitecture of Pipelined and Superscalar Computers. Boston: Kluwer, 1999.
PATT01 Patt, Y. "Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution." Proceedings of the IEEE, November 2001.
POPE91 Popescu, V., et al. "The Metaflow Architecture." IEEE Micro, June 1991.
RICH07 Riches, S., et al. "A Fully Automated High Performance Implementation of ARM Cortex-A8." IQ Online, Vol. 6, No. 3, 2007. www.arm.com/iqonline
SHEN05 Shen, J., and Lipasti, M. Modern Processor Design: Fundamentals of Superscalar Processors. New York: McGraw-Hill, 2005.
SIMA97 Sima, D. "Superscalar Instruction Issue." IEEE Micro, September/October 1997.
SIMA04 Sima, D. "Decisive Aspects in the Evolution of Microprocessors." Proceedings of the IEEE, December 2004.

SMIT95 Smith, J., and Sohi, G. "The Microarchitecture of Superscalar Processors." Proceedings of the IEEE, December 1995.
WALL91 Wall, D. "Limits of Instruction-Level Parallelism." Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.

16.6 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

antidependency
branch prediction
commit
flow dependency
in-order completion
in-order issue
instruction issue
instruction-level parallelism
instruction window
machine parallelism
micro-operations
micro-ops
out-of-order completion
out-of-order issue
output dependency
procedural dependency
read-write dependency
register renaming
resource conflict
retire
superpipelined
superscalar
true data dependency
write-read dependency
write-write dependency

Review Questions
16.1 What is the essential characteristic of the superscalar approach to processor design?
16.2 What is the difference between the superscalar and superpipelined approaches?
16.3 What is instruction-level parallelism?
16.4 Briefly define the following terms:
• True data dependency
• Procedural dependency
• Resource conflicts
• Output dependency
• Antidependency
16.5 What is the distinction between instruction-level parallelism and machine parallelism?
16.6 List and briefly define three types of superscalar instruction issue policies.
16.7 What is the purpose of an instruction window?
16.8 What is register renaming and what is its purpose?
16.9 What are the key elements of a superscalar processor organization?

Problems
16.1 When out-of-order completion is used in a superscalar processor, resumption of execution after interrupt processing is complicated, because the exceptional condition may have been detected as an instruction that produced its result out of order. The program cannot be restarted at the instruction following the exceptional instruction, because subsequent instructions have already completed, and doing so would cause these instructions to be executed twice. Suggest a mechanism or mechanisms for dealing with this situation.

16.2 Consider the following sequence of instructions, where the syntax consists of an opcode followed by the destination register followed by one or two source registers:

0  ADD R3, R1, R2
1  LOAD R6, [R3]
2  AND R7, R5, 3
3  ADD R1, R6, R7
4  SRL R7, R0, 8
5  OR R2, R4, R7
6  SUB R5, R3, R4
7  ADD R0, R1, 10
8  LOAD R6, [R5]
9  SUB R2, R1, R6
10 AND R3, R7, 15

Assume the use of a four-stage pipeline: fetch, decode/issue, execute, write back. Assume that all pipeline stages take one clock cycle except for the execute stage. For simple integer arithmetic and logical instructions, the execute stage takes one cycle, but for a LOAD from memory, five cycles are consumed in the execute stage.
If we have a simple scalar pipeline but allow out-of-order execution, we can construct the following table for the execution of the first seven instructions:

Instruction | Fetch | Decode | Execute | Write Back
0           | 0     | 1      | 2       | 3
1           | 1     | 2      | 4       | 9
2           | 2     | 3      | 5       | 6
3           | 3     | 4      | 10      | 11
4           | 4     | 5      | 6       | 7
5           | 5     | 6      | 8       | 10
6           | 6     | 7      | 9       | 12

The entries under the four pipeline stages indicate the clock cycle at which each instruction begins each phase. In this program, the second ADD instruction (instruction 3) depends on the LOAD instruction (instruction 1) for one of its operands, R6. Because the LOAD instruction takes five clock cycles, and the issue logic encounters the dependent ADD instruction after two clocks, the issue logic must delay the ADD instruction for three clock cycles. With an out-of-order capability, the processor can stall instruction 3 at clock cycle 4, and then move on to issue the following three independent instructions, which enter execution at clocks 6, 8, and 9. The LOAD finishes execution at clock 9, and so the dependent ADD can be launched into execution on clock 10.
a. Complete the preceding table.
b. Redo the table assuming no out-of-order capability. What is the savings using the capability?
c. Redo the table assuming a superscalar implementation that can handle two instructions at a time at each stage.
16.3 Consider the following assembly language program:

I1: Move R3, R7    /R3 ← (R7)/
I2: Load R8, (R3)  /R8 ← Memory (R3)/
I3: Add R3, R3, 4  /R3 ← (R3) + 4/
I4: Load R9, (R3)  /R9 ← Memory (R3)/
I5: BLE R8, R9, L3 /Branch if (R9) > (R8)/

This program includes WAW, RAW, and WAR dependencies. Show these.
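For working through dependencies like these, the following is a small illustrative C routine, not part of the text's problem set, that scans an instruction list represented with invented destination and source fields and reports every RAW, WAR, and WAW pair. It deliberately ignores intervening redefinitions, so it lists all dependent pairs rather than only the nearest ones.

#include <stdio.h>

#define MAXSRC 2

struct insn {
    int dest;             /* destination register, -1 if none */
    int src[MAXSRC];      /* source registers, -1 if unused   */
};

/* Print every RAW, WAR, and WAW dependency of a later instruction j on an
 * earlier instruction i. */
static void report_dependencies(const struct insn *prog, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            for (int s = 0; s < MAXSRC; s++) {
                if (prog[i].dest >= 0 && prog[j].src[s] == prog[i].dest)
                    printf("RAW: I%d reads R%d written by I%d\n",
                           j, prog[i].dest, i);
                if (prog[j].dest >= 0 && prog[i].src[s] == prog[j].dest)
                    printf("WAR: I%d writes R%d read by I%d\n",
                           j, prog[j].dest, i);
            }
            if (prog[i].dest >= 0 && prog[i].dest == prog[j].dest)
                printf("WAW: I%d and I%d both write R%d\n", i, j, prog[i].dest);
        }
    }
}

For the five-instruction program above, each entry would be filled in by hand; for example, I1 (Move R3, R7) becomes a record with destination 3 and single source 7.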

[Figure 16.14 (diagram): a cycle-by-cycle chart, for cycles 1 through 9, showing which of instructions I1 through I6 occupy the decode, execute, and write stages under in-order issue with in-order completion. I1 and I2 are decoded in cycle 1; I2 does not begin execution until cycle 4; I1 and I2 write in cycle 5; I5 and I6 do not reach the write stage until cycle 9.]

Figure 16.14 An In-Order Issue, In-Order-Completion Execution Sequence

16.4 a. Identify the RAW, WAR, and WAW dependencies in the following instruction sequence:

I1: R1 = 100
I2: R1 = R2 + R4
I3: R2 = R4 - 25
I4: R4 = R1 + R3
I5: R1 = R1 + 30

b. Rename the registers from part (a) to prevent dependency problems. Identify references to initial register values by using the subscript "a" on the register reference.
16.5 Consider the "in-order-issue/in-order-completion" execution sequence shown in Figure 16.14.
a. Identify the most likely reason why I2 could not enter the execute stage until the fourth cycle. Will "in-order issue/out-of-order completion" or "out-of-order issue/out-of-order completion" fix this? If so, which?
b. Identify the reason why I6 could not enter the write stage until the ninth cycle. Will "in-order issue/out-of-order completion" or "out-of-order issue/out-of-order completion" fix this? If so, which?
16.6 Figure 16.15 shows an example of a superscalar processor organization. The processor can issue two instructions per cycle if there is no resource conflict and no data dependence problem. There are essentially two pipelines, with four processing stages (fetch, decode, execute, and store). Each pipeline has its own fetch, decode, and store unit. Four functional units (multiplier, adder, logic unit, and load unit) are available for use in the execute stage and are shared by the two pipelines on a dynamic basis. The two store units can be dynamically used by the two pipelines, depending on availability at a particular cycle. There is a lookahead window with its own fetch and decoding logic. This window is used for instruction lookahead for out-of-order instruction issue. Consider the following program to be executed on this processor:

I1: Load R1, A   /R1 ← Memory (A)/
I2: Add R2, R1   /R2 ← (R2) + (R1)/
I3: Add R3, R4   /R3 ← (R3) + (R4)/
I4: Mul R4, R5   /R4 ← (R4) × (R5)/
I5: Comp R6      /R6 ← (R6)/
I6: Mul R6, R7   /R6 ← (R6) × (R7)/

a. What dependencies exist in the program?
b. Show the pipeline activity for this program on the processor of Figure 16.15 using in-order issue with in-order completion policies and using a presentation similar to Figure 16.2.

[Figure 16.15 (diagram): two pipelines, each with its own fetch unit (f1, f2), decode unit (d1, d2), and store/write-back unit (s1, s2); a lookahead window with its own fetch (f3) and decode (d3) logic; and four shared execute-stage functional units: a multiplier (stages m1, m2, m3), an adder (stages a1, a2), a logic unit (e1), and a load unit (e2).]

Figure 16.15 A Dual-Pipeline Superscalar Processor

c. Repeat for in-order issue with out-of-order completion.
d. Repeat for out-of-order issue with out-of-order completion.
16.7 Figure 16.16 is from a paper on superscalar design. Explain the three parts of the figure, and define w, x, y, and z.
16.8 Yeh's dynamic branch prediction algorithm, used on the Pentium 4, is a two-level branch prediction algorithm. The first level is the history of the last n branches. The second level is the branch behavior of the last s occurrences of that unique pattern of the last n branches. For each conditional branch instruction in a program, there is an entry in a Branch History Table (BHT). Each entry consists of n bits corresponding to the last n executions of the branch instruction, with a 1 if the branch was taken

[Figure 16.16 Figure for Problem 16.7 (diagram): three parts, (a), (b), and (c), each showing an arrangement of paths from a source labeled "From w" to destinations labeled "To x," "To y," and "To z."]

[Figure 16.17 Figure for Problem 16.8 (diagram): five branch-prediction state machines, labeled (a) through (e), with states labeled 0/N, 1/N, 1/T, 2/T, and 3/T and transitions labeled T (taken) and N (not taken).]

and a 0 if the branch was not. Each BHT entry indexes into a Pattern Table (PT) that has 2^n entries, one for each possible pattern of n bits. Each PT entry consists of s bits that are used in branch prediction, as was described in Chapter 14 (e.g., Figure 14.19). When a conditional branch is encountered during instruction fetch and decode, the address of the instruction is used to retrieve the appropriate BHT entry, which shows the recent history of the instruction. Then, the BHT entry is used to retrieve the appropriate PT entry for branch prediction. After the branch is executed, the BHT entry is updated, and then the appropriate PT entry is updated.
a. In testing the performance of this scheme, Yeh tried five different prediction schemes, illustrated in Figure 16.17. Identify which three of these schemes correspond to those shown in Figures 14.19 and 14.28. Describe the remaining two schemes.
b. With this algorithm, the prediction is not based on just the recent history of this particular branch instruction. Rather, it is based on the recent history of all patterns of branches that match the n-bit pattern in the BHT entry for this instruction. Suggest a rationale for such a strategy.
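To make the two-level mechanism concrete, here is a minimal C sketch of a Yeh-style predictor with a single global pattern table of 2-bit saturating counters indexed by an n-bit per-branch history. The table sizes, the hashing of the branch address, and the use of a simple 2-bit counter (one of the automata of Figure 16.17) are illustrative assumptions, not details given in the problem.

#include <stdbool.h>
#include <stdint.h>

#define HIST_BITS   4                       /* n: bits of per-branch history        */
#define BHT_ENTRIES 1024                    /* branch history table entries         */
#define PT_ENTRIES  (1 << HIST_BITS)        /* one pattern table entry per pattern  */

static uint8_t bht[BHT_ENTRIES];            /* last n outcomes of each branch       */
static uint8_t pt[PT_ENTRIES];              /* 2-bit saturating counters (0..3)     */

static unsigned bht_index(uint32_t pc) { return (pc >> 2) % BHT_ENTRIES; }

/* First level: fetch the branch's recent history; second level: use that
 * pattern to select a counter, and predict taken if the counter is 2 or 3. */
bool predict(uint32_t pc)
{
    uint8_t hist = bht[bht_index(pc)];
    return pt[hist] >= 2;
}

/* After the branch resolves, update the selected counter, then shift the
 * actual outcome into the branch's history register. */
void update(uint32_t pc, bool taken)
{
    unsigned i = bht_index(pc);
    uint8_t hist = bht[i];

    if (taken  && pt[hist] < 3) pt[hist]++;
    if (!taken && pt[hist] > 0) pt[hist]--;

    bht[i] = (uint8_t)(((hist << 1) | (taken ? 1 : 0)) & (PT_ENTRIES - 1));
}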

PART FIVE PARALLEL ORGANIZATION

CHAPTER 17

PARALLEL PROCESSING

17.1 Multiple Processor Organizations
    Types of Parallel Processor Systems
    Parallel Organizations
17.2 Symmetric Multiprocessors
    Organization
    Multiprocessor Operating System Design Considerations
17.3 Cache Coherence and the MESI Protocol
    Software Solutions
    Hardware Solutions
    The MESI Protocol
17.4 Multithreading and Chip Multiprocessors
    Implicit and Explicit Multithreading
    Approaches to Explicit Multithreading
    Example Systems
17.5 Clusters
    Cluster Configurations
    Operating System Design Issues
    Cluster Computer Architecture
    Blade Servers
    Clusters Compared to SMP
17.6 Nonuniform Memory Access
    Motivation
    Organization
    NUMA Pros and Cons
17.7 Vector Computation
    Approaches to Vector Computation
    IBM 3090 Vector Facility
17.8 Recommended Reading
17.9 Key Terms, Review Questions, and Problems

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

• Summarize the types of parallel processor organizations.
• Present an overview of design features of symmetric multiprocessors.
• Understand the issue of cache coherence in a multiple processor system.
• Explain the key features of the MESI protocol.
• Explain the difference between implicit and explicit multithreading.
• Summarize key design issues for clusters.
• Explain the concept of nonuniform memory access.
• Present an overview of vector computation.

Traditionally, the computer has been viewed as a sequential machine. Most computer programming languages require the programmer to specify algorithms as sequences of instructions. Processors execute programs by executing machine instructions in a sequence and one at a time. Each instruction is executed in a sequence of operations (fetch instruction, fetch operands, perform operation, store results).
This view of the computer has never been entirely true. At the micro-operation level, multiple control signals are generated at the same time. Instruction pipelining, at least to the extent of overlapping fetch and execute operations, has been around for a long time. Both of these are examples of performing functions in parallel. This approach is taken further with superscalar organization, which exploits instruction-level parallelism. With a superscalar machine, there are multiple execution units within a single processor, and these may execute multiple instructions from the same program in parallel.
As computer technology has evolved, and as the cost of computer hardware has dropped, computer designers have sought more and more opportunities for parallelism, usually to enhance performance and, in some cases, to increase availability. After an overview, this chapter looks at some of the most prominent approaches to parallel organization. First, we examine symmetric multiprocessors (SMPs), one of the earliest and still the most common example of parallel organization. In an SMP organization, multiple processors share a common memory. This organization raises the issue of cache coherence, to which a separate section is devoted. Next, the chapter examines multithreaded processors and chip multiprocessors. Then we describe clusters, which consist of multiple independent computers working together. Clusters have become increasingly common to support workloads that are beyond the capacity of a single SMP. Another approach to the use of multiple processors that we examine is that of nonuniform memory access (NUMA) machines. The NUMA approach is relatively new and not yet proven in the marketplace, but is often considered as an alternative to the SMP or cluster approach. Finally, this chapter looks at hardware organizational approaches to vector computation. These approaches optimize the ALU for processing vectors or arrays of floating-point numbers. They are common on the class of systems known as supercomputers.

17.1 MULTIPLE PROCESSOR ORGANIZATIONS

Types of Parallel Processor Systems
A taxonomy first introduced by Flynn [FLYN72] is still the most common way of categorizing systems with parallel processing capability. Flynn proposed the following categories of computer systems:

• Single instruction, single data (SISD) stream: A single processor executes a single instruction stream to operate on data stored in a single memory. Uniprocessors fall into this category.
• Single instruction, multiple data (SIMD) stream: A single machine instruction controls the simultaneous execution of a number of processing elements on a lockstep basis. Each processing element has an associated data memory, so that instructions are executed on different sets of data by different processors. Vector and array processors fall into this category, and are discussed in Section 17.7.
• Multiple instruction, single data (MISD) stream: A sequence of data is transmitted to a set of processors, each of which executes a different instruction sequence. This structure is not commercially implemented.
• Multiple instruction, multiple data (MIMD) stream: A set of processors simultaneously execute different instruction sequences on different data sets. SMPs, clusters, and NUMA systems fit into this category.

With the MIMD organization, the processors are general purpose; each is able to process all of the instructions necessary to perform the appropriate data transformation. MIMDs can be further subdivided by the means in which the processors communicate (Figure 17.1). If the processors share a common memory, then each processor accesses programs and data stored in the shared memory, and processors communicate with each other via that memory. The most common form of such a system is known as a symmetric multiprocessor (SMP), which we examine in Section 17.2. In an SMP, multiple processors share a single memory or pool of memory by means of a shared bus or other interconnection mechanism; a distinguishing feature is that the memory access time to any region of memory is approximately the same for each processor. A more recent development is the nonuniform memory access (NUMA) organization, which is described in Section 17.6. As the name suggests, the memory access time to different regions of memory may differ for a NUMA processor. A collection of independent uniprocessors or SMPs may be interconnected to form a cluster. Communication among the computers is either via fixed paths or via some network facility.

Parallel Organizations
Figure 17.2 illustrates the general organization of the taxonomy of Figure 17.1. Figure 17.2a shows the structure of an SISD. There is some sort of control unit (CU) that provides an instruction stream (IS) to a processing unit (PU). The processing unit operates on a single data stream (DS) from a memory unit (MU).

[Figure 17.1 A Taxonomy of Parallel Processor Architectures (diagram): processor organizations divide into SISD (uniprocessor), SIMD (vector processor, array processor), MISD, and MIMD; MIMD in turn divides into shared memory (tightly coupled), comprising the symmetric multiprocessor (SMP) and nonuniform memory access (NUMA) organizations, and distributed memory (loosely coupled), comprising clusters.]

[Figure 17.2 Alternative Computer Organizations (diagram): (a) SISD, a control unit (CU) feeding an instruction stream (IS) to a processing unit (PU) that operates on a data stream (DS) from a memory unit (MU); (b) SIMD with distributed memory, one CU feeding the same IS to multiple PUs, each with its own local memory (LM); (c) MIMD with shared memory, multiple CU/PU pairs accessing a shared memory; (d) MIMD with distributed memory, multiple CU/PU/LM nodes communicating over an interconnection network.]

With an SIMD, there is still a single control unit, now feeding a single instruction stream to multiple PUs. Each PU may have its own dedicated memory (illustrated in Figure 17.2b), or there may be a shared memory. Finally, with the MIMD, there are multiple control units, each feeding a separate instruction stream to its own PU. The MIMD may be a shared-memory multiprocessor (Figure 17.2c) or a distributed-memory multicomputer (Figure 17.2d).
The design issues relating to SMPs, clusters, and NUMAs are complex, involving issues relating to physical organization, interconnection structures, interprocessor communication, operating system design, and application software techniques. Our concern here is primarily with organization, although we touch briefly on operating system design issues.

17.2 SYMMETRIC MULTIPROCESSORS

Until fairly recently, virtually all single-user personal computers and most workstations contained a single general-purpose microprocessor. As demands for performance increase and as the cost of microprocessors continues to drop, vendors have introduced systems with an SMP organization. The term SMP refers to a computer hardware architecture and also to the operating system behavior that reflects that architecture. An SMP can be defined as a standalone computer system with the following characteristics:

1. There are two or more similar processors of comparable capability.
2. These processors share the same main memory and I/O facilities and are interconnected by a bus or other internal connection scheme, such that memory access time is approximately the same for each processor.
3. All processors share access to I/O devices, either through the same channels or through different channels that provide paths to the same device.
4. All processors can perform the same functions (hence the term symmetric).
5. The system is controlled by an integrated operating system that provides interaction between processors and their programs at the job, task, file, and data element levels.

Points 1 to 4 should be self-explanatory. Point 5 illustrates one of the contrasts with a loosely coupled multiprocessing system, such as a cluster. In the latter, the physical unit of interaction is usually a message or complete file. In an SMP, individual data elements can constitute the level of interaction, and there can be a high degree of cooperation between processes.
The operating system of an SMP schedules processes or threads across all of the processors. An SMP organization has a number of potential advantages over a uniprocessor organization, including the following:

• Performance: If the work to be done by a computer can be organized so that some portions of the work can be done in parallel, then a system with multiple processors will yield greater performance than one with a single processor of the same type (Figure 17.3).

[Figure 17.3 Multiprogramming and Multiprocessing (diagram): time lines for three processes, each alternating between blocked and running, under (a) interleaving (multiprogramming, one processor) and (b) interleaving and overlapping (multiprocessing, two processors).]

• Availability: In a symmetric multiprocessor, because all processors can perform the same functions, the failure of a single processor does not halt the machine. Instead, the system can continue to function at reduced performance.
• Incremental growth: A user can enhance the performance of a system by adding an additional processor.
• Scaling: Vendors can offer a range of products with different price and performance characteristics based on the number of processors configured in the system.

It is important to note that these are potential, rather than guaranteed, benefits. The operating system must provide tools and functions to exploit the parallelism in an SMP system.
An attractive feature of an SMP is that the existence of multiple processors is transparent to the user. The operating system takes care of scheduling of threads or processes on individual processors and of synchronization among processors.

Organization
Figure 17.4 depicts in general terms the organization of a multiprocessor system. There are two or more processors. Each processor is self-contained, including a control unit, ALU, registers, and, typically, one or more levels of cache. Each processor has access to a shared main memory and the I/O devices through some form of interconnection mechanism. The processors can communicate with each other through memory (messages and status information left in common data areas). It may also be possible for processors to exchange signals directly.

[Figure 17.4 (diagram): two or more processors and several I/O modules connected through an interconnection network to a shared main memory.]

Figure 17.4 Generic Block Diagram of a Tightly Coupled Multiprocessor

The memory is often organized so that multiple simultaneous accesses to separate blocks of memory are possible. In some configurations, each processor may also have its own private main memory and I/O channels in addition to the shared resources.
The most common organization for personal computers, workstations, and servers is the time-shared bus. The time-shared bus is the simplest mechanism for constructing a multiprocessor system (Figure 17.5). The structure and interfaces are basically the same as for a single-processor system that uses a bus interconnection. The bus consists of control, address, and data lines. To facilitate DMA transfers from I/O subsystems to processors, the following features are provided:

• Addressing: It must be possible to distinguish modules on the bus to determine the source and destination of data.
• Arbitration: Any I/O module can temporarily function as "master." A mechanism is provided to arbitrate competing requests for bus control, using some sort of priority scheme (a minimal code sketch of one such scheme appears after this list).
• Time-sharing: When one module is controlling the bus, other modules are locked out and must, if necessary, suspend operation until bus access is achieved.

These uniprocessor features are directly usable in an SMP organization. In this latter case, there are now multiple processors as well as multiple I/O processors all attempting to gain access to one or more memory modules via the bus.
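As a minimal sketch of the arbitration feature, the routine below grants the bus to the highest-priority module currently requesting it; the fixed-priority scheme and the module count are assumptions made purely for illustration (real buses may use rotating or more elaborate priority schemes).

/* Request lines for each module attached to the bus; lower index means
 * higher priority in this illustrative fixed-priority scheme. */
#define NUM_MODULES 8

/* Return the index of the module to be granted the bus, or -1 if no
 * module is currently requesting it. */
int arbitrate(const int request[NUM_MODULES])
{
    for (int m = 0; m < NUM_MODULES; m++)
        if (request[m])
            return m;       /* highest-priority active requester wins */
    return -1;              /* bus stays idle                          */
}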

[Figure 17.5 (diagram): several processors, each with its own L1 and L2 cache, attached to a shared bus together with main memory and an I/O subsystem containing multiple I/O adapters.]

Figure 17.5 Symmetric Multiprocessor Organization

The bus organization has several attractive features:

• Simplicity: This is the simplest approach to multiprocessor organization. The physical interface and the addressing, arbitration, and time-sharing logic of each processor remain the same as in a single-processor system.
• Flexibility: It is generally easy to expand the system by attaching more processors to the bus.
• Reliability: The bus is essentially a passive medium, and the failure of any attached device should not cause failure of the whole system.

The main drawback to the bus organization is performance. All memory references pass through the common bus. Thus, the bus cycle time limits the speed of the system. To improve performance, it is desirable to equip each processor with a cache memory. This should reduce the number of bus accesses dramatically. Typically, workstation and PC SMPs have two levels of cache, with the L1 cache internal (same chip as the processor) and the L2 cache either internal or external. Some processors now employ an L3 cache as well.
The use of caches introduces some new design considerations. Because each local cache contains an image of a portion of memory, if a word is altered in one

cache, it could conceivably invalidate a word in another cache. To prevent this, the other processors must be alerted that an update has taken place. This problem is known as the cache coherence problem and is typically addressed in hardware rather than by the operating system. We address this issue in Section 17.3.

Multiprocessor Operating System Design Considerations
An SMP operating system manages processor and other computer resources so that the user perceives a single operating system controlling system resources. In fact, such a configuration should appear as a single-processor multiprogramming system. In both the SMP and uniprocessor cases, multiple jobs or processes may be active at one time, and it is the responsibility of the operating system to schedule their execution and to allocate resources. A user may construct applications that use multiple processes or multiple threads within processes without regard to whether a single processor or multiple processors will be available. Thus, a multiprocessor operating system must provide all the functionality of a multiprogramming system plus additional features to accommodate multiple processors. Among the key design issues:

• Simultaneous concurrent processes: OS routines need to be reentrant to allow several processors to execute the same OS code simultaneously. With multiple processors executing the same or different parts of the OS, OS tables and management structures must be managed properly to avoid deadlock or invalid operations.
• Scheduling: Any processor may perform scheduling, so conflicts must be avoided. The scheduler must assign ready processes to available processors.
• Synchronization: With multiple active processes having potential access to shared address spaces or shared I/O resources, care must be taken to provide effective synchronization. Synchronization is a facility that enforces mutual exclusion and event ordering (a minimal illustration in code follows this list).
• Memory management: Memory management on a multiprocessor must deal with all of the issues found on uniprocessor machines, as is discussed in Chapter 8. In addition, the operating system needs to exploit the available hardware parallelism, such as multiported memories, to achieve the best performance. The paging mechanisms on different processors must be coordinated to enforce consistency when several processors share a page or segment and to decide on page replacement.
• Reliability and fault tolerance: The operating system should provide graceful degradation in the face of processor failure. The scheduler and other portions of the operating system must recognize the loss of a processor and restructure management tables accordingly.
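As a minimal illustration of mutual exclusion on a shared-memory multiprocessor, the following C11 sketch implements a spinlock with an atomic test-and-set flag. It is illustrative only; a real SMP operating system would add refinements such as back-off, fairness, and interaction with interrupts and scheduling.

#include <stdatomic.h>

/* One lock word shared by all processors. */
static atomic_flag lock = ATOMIC_FLAG_INIT;

/* Spin until the flag is acquired; the atomic test-and-set guarantees that
 * only one processor at a time leaves this loop holding the lock. */
void spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ;   /* busy-wait: another processor holds the lock */
}

/* Release the lock so another processor can enter the critical section. */
void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);
}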

17.3 CACHE COHERENCE AND THE MESI PROTOCOL

In contemporary multiprocessor systems, it is customary to have one or two levels of cache associated with each processor. This organization is essential to achieve reasonable performance. It does, however, create a problem known as the cache

coherence problem. The essence of the problem is this: Multiple copies of the same data can exist in different caches simultaneously, and if processors are allowed to update their own copies freely, an inconsistent view of memory can result. In Chapter 4 we defined two common write policies:

• Write back: Write operations are usually made only to the cache. Main memory is only updated when the corresponding cache line is flushed from the cache.
• Write through: All write operations are made to main memory as well as to the cache, ensuring that main memory is always valid.

It is clear that a write-back policy can result in inconsistency. If two caches contain the same line, and the line is updated in one cache, the other cache will unknowingly have an invalid value. Subsequent reads to that invalid line produce invalid results. Even with the write-through policy, inconsistency can occur unless other caches monitor the memory traffic or receive some direct notification of the update.
In this section, we will briefly survey various approaches to the cache coherence problem and then focus on the approach that is most widely used: the MESI (modified/exclusive/shared/invalid) protocol. A version of this protocol is used on both the Pentium 4 and PowerPC implementations.
For any cache coherence protocol, the objective is to let recently used local variables get into the appropriate cache and stay there through numerous reads and writes, while using the protocol to maintain consistency of shared variables that might be in multiple caches at the same time. Cache coherence approaches have generally been divided into software and hardware approaches. Some implementations adopt a strategy that involves both software and hardware elements. Nevertheless, the classification into software and hardware approaches is still instructive and is commonly used in surveying cache coherence strategies.
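The two write policies defined above differ only in when main memory is updated. The following C sketch, with an invented cache-line structure and a toy memory array, shows the write-hit path under each policy and why write back opens a window in which memory (and hence other caches) can hold stale data.

#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 1024
static uint32_t memory[MEM_WORDS];       /* toy word-addressed main memory    */

struct cache_line {
    uint32_t addr;        /* address of the cached word                       */
    uint32_t data;
    bool     dirty;       /* meaningful only under write back                 */
};

enum write_policy { WRITE_THROUGH, WRITE_BACK };

/* Handle a write hit.  Under write through, main memory is updated on every
 * write and so always holds a valid copy.  Under write back, only the cache
 * line is updated and marked dirty; memory is not updated until the line is
 * later evicted, which is exactly the window in which another cache can
 * read a stale value. */
void write_hit(struct cache_line *line, uint32_t value, enum write_policy policy)
{
    line->data = value;
    if (policy == WRITE_THROUGH)
        memory[line->addr % MEM_WORDS] = value;
    else
        line->dirty = true;
}

/* On eviction of a write-back line, the deferred update finally reaches memory. */
void evict(struct cache_line *line)
{
    if (line->dirty) {
        memory[line->addr % MEM_WORDS] = line->data;
        line->dirty = false;
    }
}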

Software Solutions
Software cache coherence schemes attempt to avoid the need for additional hardware circuitry and logic by relying on the compiler and operating system to deal with the problem. Software approaches are attractive because the overhead of detecting potential problems is transferred from run time to compile time, and the design complexity is transferred from hardware to software. On the other hand, compile-time software approaches generally must make conservative decisions, leading to inefficient cache utilization.
Compiler-based coherence mechanisms perform an analysis on the code to determine which data items may become unsafe for caching, and they mark those items accordingly. The operating system or hardware then prevents noncacheable items from being cached.
The simplest approach is to prevent any shared data variables from being cached. This is too conservative, because a shared data structure may be exclusively used during some periods and may be effectively read-only during other periods. It is only during periods when at least one process may update the variable and at least one other process may access the variable that cache coherence is an issue.

More efficient approaches analyze the code to determine safe periods for shared variables. The compiler then inserts instructions into the generated code to enforce cache coherence during the critical periods. A number of techniques have been developed for performing the analysis and for enforcing the results; see [LILJ93] and [STEN90] for surveys.

Hardware Solutions
Hardware-based solutions are generally referred to as cache coherence protocols. These solutions provide dynamic recognition at run time of potential inconsistency conditions. Because the problem is only dealt with when it actually arises, there is more effective use of caches, leading to improved performance over a software approach. In addition, these approaches are transparent to the programmer and the compiler, reducing the burden.
Hardware schemes differ in a number of particulars, including where the state information about data lines is held, how that information is organized, where coherence is enforced, and the enforcement mechanisms. In general, hardware schemes can be divided into two categories: directory protocols and snoopy protocols.

DIRECTORY PROTOCOLS Directory protocols collect and maintain information about where copies of lines reside. Typically, there is a centralized controller that is part of the main memory controller, and a directory that is stored in main memory. The directory contains global state information about the contents of the various local caches. When an individual cache controller makes a request, the centralized controller checks and issues necessary commands for data transfer between memory and caches or between caches. It is also responsible for keeping the state information up to date; therefore, every local action that can affect the global state of a line must be reported to the central controller.
Typically, the controller maintains information about which processors have a copy of which lines. Before a processor can write to a local copy of a line, it must request exclusive access to the line from the controller. Before granting this exclusive access, the controller sends a message to all processors with a cached copy of this line, forcing each processor to invalidate its copy. After receiving acknowledgments back from each such processor, the controller grants exclusive access to the requesting processor. When another processor tries to read a line that is exclusively granted to another processor, it will send a miss notification to the controller. The controller then issues a command to the processor holding that line that requires the processor to do a write back to main memory. The line may now be shared for reading by the original processor and the requesting processor.
Directory schemes suffer from the drawbacks of a central bottleneck and the overhead of communication between the various cache controllers and the central controller. However, they are effective in large-scale systems that involve multiple buses or some other complex interconnection scheme.
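The exclusive-access handshake just described can be sketched in code. The directory entry layout, function names, and single-owner field below are simplifying assumptions made for illustration; a real directory protocol tracks more state and handles many more message types.

#include <stdbool.h>

#define NUM_PROCESSORS 16

/* One directory entry per memory line: which processors hold a cached
 * copy, and which (if any) holds it exclusively. */
struct dir_entry {
    bool present[NUM_PROCESSORS];
    int  owner;                     /* -1 if no exclusive owner */
};

/* Stubs standing in for the interconnect; a real controller would send
 * messages to the named caches and wait for their replies. */
static void send_invalidate(int processor, long line) { (void)processor; (void)line; }
static void wait_for_acknowledgments(void) { }

/* A processor requests exclusive (write) access to a line.  The
 * centralized controller first invalidates every other cached copy,
 * waits for the acknowledgments, and only then grants ownership. */
void request_exclusive(struct dir_entry *e, long line, int requester)
{
    for (int p = 0; p < NUM_PROCESSORS; p++) {
        if (p != requester && e->present[p]) {
            send_invalidate(p, line);
            e->present[p] = false;
        }
    }
    wait_for_acknowledgments();

    e->present[requester] = true;
    e->owner = requester;
}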

SNOOPY PROTOCOLS Snoopy protocols distribute the responsibility for maintaining cache coherence among all of the cache controllers in a multiprocessor. A cache must recognize when a line that it holds is shared with other caches.

When an update action is performed on a shared cache line, it must be announced to all other caches by a broadcast mechanism. Each cache controller is able to "snoop" on the network to observe these broadcasted notifications, and react accordingly. Snoopy protocols are ideally suited to a bus-based multiprocessor, because the shared bus provides a simple means for broadcasting and snooping. However, because one of the objectives of the use of local caches is to avoid bus accesses, care must be taken that the increased bus traffic required for broadcasting and snooping does not cancel out the gains from the use of local caches.
Two basic approaches to the snoopy protocol have been explored: write invalidate and write update (or write broadcast). With a write-invalidate protocol, there can be multiple readers but only one writer at a time. Initially, a line may be shared among several caches for reading purposes. When one of the caches wants to perform a write to the line, it first issues a notice that invalidates that line in the other caches, making the line exclusive to the writing cache. Once the line is exclusive, the owning processor can make cheap local writes until some other processor requires the same line. With a write-update protocol, there can be multiple writers as well as multiple readers. When a processor wishes to update a shared line, the word to be updated is distributed to all others, and caches containing that line can update it.
Neither of these two approaches is superior to the other under all circumstances. Performance depends on the number of local caches and the pattern of memory reads and writes. Some systems implement adaptive protocols that employ both write-invalidate and write-update mechanisms.
The write-invalidate approach is the most widely used in commercial multiprocessor systems, such as the Pentium 4 and PowerPC. It marks the state of every cache line (using two extra bits in the cache tag) as modified, exclusive, shared, or invalid. For this reason, the write-invalidate protocol is called MESI. In the remainder of this section, we will look at its use among local caches across a multiprocessor. For simplicity in the presentation, we do not examine the mechanisms involved in coordinating among both level 1 and level 2 caches locally as well as at the same time coordinating across the distributed multiprocessor. This would not add any new principles but would greatly complicate the discussion.

The MESI Protocol
To provide cache consistency on an SMP, the data cache often supports a protocol known as MESI. For MESI, the data cache includes two status bits per tag, so that each line can be in one of four states:

• Modified: The line in the cache has been modified (different from main memory) and is available only in this cache.
• Exclusive: The line in the cache is the same as that in main memory and is not present in any other cache.
• Shared: The line in the cache is the same as that in main memory and may be present in another cache.
• Invalid: The line in the cache does not contain valid data.

Table 17.1 MESI Cache Line States

                               M (Modified)         E (Exclusive)        S (Shared)                      I (Invalid)
This cache line valid?         Yes                  Yes                  Yes                             No
The memory copy is ...         out of date          valid                valid                           —
Copies exist in other caches?  No                   No                   Maybe                           Maybe
A write to this line ...       does not go to bus   does not go to bus   goes to bus and updates cache   goes directly to bus

Table 17.1 summarizes the meaning of the four states. Figure 17.6 displays a state diagram for the MESI protocol. Keep in mind that each line of the cache has its own state bits and therefore its own realization of the state diagram. Figure 17.6a shows the transitions that occur due to actions initiated by the processor attached to this cache. Figure 17.6b shows the transitions that occur due to events that are snooped on the common bus. This presentation of separate state diagrams for processor-initiated and bus-initiated actions helps to clarify the logic of the MESI protocol.

[Figure 17.6 MESI State Transition Diagram (diagram): two state diagrams over the states Invalid, Shared, Exclusive, and Modified; (a) shows transitions of a line in the cache at the initiating processor, and (b) shows transitions of a line in a snooping cache. Transitions are labeled RH (read hit), RMS (read miss, shared), RME (read miss, exclusive), WH (write hit), WM (write miss), SHR (snoop hit on read), and SHW (snoop hit on write or read-with-intent-to-modify); additional arc annotations indicate dirty line copyback, invalidate transactions, read-with-intent-to-modify, and cache line fill.]

At any time a cache line is in a single state. If the next event is from the attached processor, then the transition is dictated by Figure 17.6a, and if the next event is from the bus, the transition is dictated by Figure 17.6b. Let us look at these transitions in more detail.

READ MISS When a read miss occurs in the local cache, the processor initiates a memory read to read the line of main memory containing the missing address. The processor inserts a signal on the bus that alerts all other processor/cache units to snoop the transaction. There are a number of possible outcomes:

• If one other cache has a clean (unmodified since read from memory) copy of the line in the exclusive state, it returns a signal indicating that it shares this line. The responding processor then transitions the state of its copy from exclusive to shared, and the initiating processor reads the line from main memory and transitions the line in its cache from invalid to shared.
• If one or more caches have a clean copy of the line in the shared state, each of them signals that it shares the line. The initiating processor reads the line and transitions the line in its cache from invalid to shared.
• If one other cache has a modified copy of the line, then that cache blocks the memory read and provides the line to the requesting cache over the shared bus. The responding cache then changes its line from modified to shared.1 The line sent to the requesting cache is also received and processed by the memory controller, which stores the block in memory.
• If no other cache has a copy of the line (clean or modified), then no signals are returned. The initiating processor reads the line and transitions the line in its cache from invalid to exclusive.

READ HIT When a read hit occurs on a line currently in the local cache, the processor simply reads the required item. There is no state change: The state remains modified, shared, or exclusive.

WRITE MISS When a write miss occurs in the local cache, the processor initiates a memory read to read the line of main memory containing the missing address. For this purpose, the processor issues a signal on the bus that means read-with-intent-to-modify (RWITM). When the line is loaded, it is immediately marked modified. With respect to other caches, two possible scenarios precede the loading of the line of data.
First, some other cache may have a modified copy of this line (state = modify). In this case, the alerted processor signals the initiating processor that another processor has a modified copy of the line. The initiating processor surrenders the bus and waits. The other processor gains access to the bus, writes the modified cache

1In some implementations, the cache with the modified line signals the initiating processor to retry. Meanwhile, the processor with the modified copy seizes the bus, writes the modified line back to main memory, and transitions the line in its cache from modified to shared. Subsequently, the requesting processor tries again and finds that one or more processors have a clean copy of the line in the shared state, as described in the preceding point.

line back to main memory, and transitions the state of the cache line to invalid (because the initiating processor is going to modify this line). Subsequently, the initiating processor will again issue the RWITM signal to the bus and then read the line from main memory, modify the line in the cache, and mark the line in the modified state.
The second scenario is that no other cache has a modified copy of the requested line. In this case, no signal is returned, and the initiating processor proceeds to read in the line and modify it. Meanwhile, if one or more caches have a clean copy of the line in the shared state, each cache invalidates its copy of the line, and if one cache has a clean copy of the line in the exclusive state, it invalidates its copy of the line.

WRITE HIT When a write hit occurs on a line currently in the local cache, the effect depends on the current state of that line in the local cache:

• Shared: Before performing the update, the processor must gain exclusive ownership of the line. The processor signals its intent on the bus. Each processor that has a shared copy of the line in its cache transitions the sector from shared to invalid. The initiating processor then performs the update and transitions its copy of the line from shared to modified.
• Exclusive: The processor already has exclusive control of this line, and so it simply performs the update and transitions its copy of the line from exclusive to modified.
• Modified: The processor already has exclusive control of this line and has the line marked as modified, and so it simply performs the update.
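The processor-initiated behavior just described (read miss, read hit, write miss, and write hit) can be collected into a small state machine. The following C sketch is illustrative only: it covers a single line's local transitions and the kind of bus transaction each one triggers, and it omits the snooped (bus-initiated) transitions of Figure 17.6b and all actual data movement.

#include <stdbool.h>

/* MESI states for a single cache line. */
enum mesi { INVALID, SHARED, EXCLUSIVE, MODIFIED };

/* Bus transactions a local access may have to generate. */
enum bus_action { NO_BUS_ACTION, BUS_READ, BUS_INVALIDATE, BUS_RWITM };

/* Processor-side transition for a read.  'other_copies' is what the bus
 * snoop reports on a miss: true if any other cache holds the line. */
enum mesi local_read(enum mesi state, bool other_copies, enum bus_action *action)
{
    if (state == INVALID) {                     /* read miss */
        *action = BUS_READ;
        return other_copies ? SHARED : EXCLUSIVE;
    }
    *action = NO_BUS_ACTION;                    /* read hit: no state change */
    return state;
}

/* Processor-side transition for a write.  Every local write leaves the
 * line modified; what differs is the bus traffic needed to get there. */
enum mesi local_write(enum mesi state, enum bus_action *action)
{
    switch (state) {
    case INVALID:   *action = BUS_RWITM;      break;  /* write miss               */
    case SHARED:    *action = BUS_INVALIDATE; break;  /* gain exclusive ownership */
    case EXCLUSIVE:
    case MODIFIED:  *action = NO_BUS_ACTION;  break;  /* already sole owner       */
    }
    return MODIFIED;
}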

L1-L2 CACHE CONSISTENCY We have so far described cache coherency protocols in terms of the cooperative activity among caches connected to the same bus or other SMP interconnection facility. Typically, these caches are L2 caches, and each processor also has an L1 cache that does not connect directly to the bus and that therefore cannot engage in a snoopy protocol. Thus, some scheme is needed to maintain coherence across both levels of cache and across all caches in the SMP configuration.
The strategy is to extend the MESI protocol (or any cache coherence protocol) to the L1 caches. Thus, each line in the L1 cache includes bits to indicate the state. In essence, the objective is the following: for any line that is present in both an L2 cache and its corresponding L1 cache, the L1 line state should track the state of the L2 line. A simple means of doing this is to adopt the write-through policy in the L1 cache; in this case the write through is to the L2 cache and not to the memory. The L1 write-through policy forces any modification to an L1 line out to the L2 cache and therefore makes it visible to other L2 caches. The use of the L1 write-through policy requires that the L1 content must be a subset of the L2 content. This in turn suggests that the associativity of the L2 cache should be equal to or greater than the L1 associativity. The L1 write-through policy is used in the IBM S/390 SMP.
If the L1 cache has a write-back policy, the relationship between the two caches is more complex. There are several approaches to maintaining coherence. For example, the approach used on the Pentium II is described in detail in [SHAN05].

17.4 MULTITHREADING AND CHIP MULTIPROCESSORS

The most important measure of performance for a processor is the rate at which it executes instructions. This can be expressed as

MIPS rate = f × IPC

where f is the processor clock frequency, in MHz, and IPC (instructions per cycle) is the average number of instructions executed per cycle. Accordingly, designers have pursued the goal of increased performance on two fronts: increasing clock frequency and increasing the number of instructions executed or, more properly, the number of instructions that complete during a processor cycle.
As we have seen in earlier chapters, designers have increased IPC by using an instruction pipeline and then by using multiple parallel instruction pipelines in a superscalar architecture. With pipelined and multiple-pipeline designs, the principal problem is to maximize the utilization of each pipeline stage. To improve throughput, designers have created ever more complex mechanisms, such as executing some instructions in a different order from the way they occur in the instruction stream and beginning execution of instructions that may never be needed. But as was discussed in Section 2.2, this approach may be reaching a limit due to complexity and power consumption concerns.
An alternative approach, which allows for a high degree of instruction-level parallelism without increasing circuit complexity or power consumption, is called multithreading. In essence, the instruction stream is divided into several smaller streams, known as threads, such that the threads can be executed in parallel.
The variety of specific multithreading designs, realized in both commercial systems and experimental systems, is vast. In this section, we give a brief survey of the major concepts.
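As a purely illustrative example of the MIPS relation above: a processor clocked at 2,000 MHz that sustains an average of 1.5 instructions per cycle delivers 2,000 × 1.5 = 3,000 MIPS, and doubling either factor doubles the rate, which is why designers attack both clock frequency and IPC.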

Implicit and Explicit Multithreading
The concept of thread used in discussing multithreaded processors may or may not be the same as the concept of software threads in a multiprogrammed operating system. It will be useful to define terms briefly:

• Process: An instance of a program running on a computer. A process embodies two key characteristics:
  — Resource ownership: A process includes a virtual address space to hold the process image; the process image is the collection of program, data, stack, and attributes that define the process. From time to time, a process may be allocated control or ownership of resources, such as main memory, I/O channels, I/O devices, and files.
  — Scheduling/execution: The execution of a process follows an execution path (trace) through one or more programs. This execution may be interleaved with that of other processes. Thus, a process has an execution state (Running, Ready, etc.) and a dispatching priority and is the entity that is scheduled and dispatched by the operating system.

• Process switch: An operation that switches the processor from one process to another, by saving all the process control data, registers, and other information for the first and replacing them with the process information for the second.2
• Thread: A dispatchable unit of work within a process. It includes a processor context (which includes the program counter and stack pointer) and its own data area for a stack (to enable subroutine branching). A thread executes sequentially and is interruptible so that the processor can turn to another thread.
• Thread switch: The act of switching processor control from one thread to another within the same process. Typically, this type of switch is much less costly than a process switch.

Thus, a thread is concerned with scheduling and execution, whereas a process is concerned with both scheduling/execution and resource ownership. The multiple threads within a process share the same resources. This is why a thread switch is much less time consuming than a process switch. Traditional operating systems, such as earlier versions of UNIX, did not support threads. Most modern operating systems, such as Linux, other versions of UNIX, and Windows, do support threads. A distinction is made between user-level threads, which are visible to the application program, and kernel-level threads, which are visible only to the operating system. Both of these may be referred to as explicit threads, defined in software.
All of the commercial processors and most of the experimental processors so far have used explicit multithreading. These systems concurrently execute instructions from different explicit threads, either by interleaving instructions from different threads on shared pipelines or by parallel execution on parallel pipelines. Implicit multithreading refers to the concurrent execution of multiple threads extracted from a single sequential program. These implicit threads may be defined either statically by the compiler or dynamically by the hardware. In the remainder of this section we consider explicit multithreading.

Approaches to Explicit Multithreading
At minimum, a multithreaded processor must provide a separate program counter for each thread of execution to be executed concurrently. The designs differ in the amount and type of additional hardware used to support concurrent thread execution. In general, instruction fetching takes place on a thread basis. The processor treats each thread separately and may use a number of techniques for optimizing single-thread execution, including branch prediction, register renaming, and superscalar techniques. What is achieved is thread-level parallelism, which may provide for greatly improved performance when married to instruction-level parallelism.
Broadly speaking, there are four principal approaches to multithreading:

• Interleaved multithreading: This is also known as fine-grained multithreading. The processor deals with two or more thread contexts at a time, switching from one thread to another at each clock cycle. If a thread is blocked because

2 The term context switch is often found in OS literature and textbooks. Unfortunately, although most of the literature uses this term to mean what is here called a process switch, other sources use it to mean a thread switch. To avoid ambiguity, the term is not used in this book.

of data dependencies or memory latencies, that thread is skipped and a ready thread is executed. • Blocked multithreading: This is also known as coarse-grained multithreading. The instructions of a thread are executed successively until an event occurs that may cause delay, such as a cache miss. This event induces a switch to another thread. This approach is effective on an in-order processor that would stall the pipeline for a delay event such as a cache miss. • Simultaneous multithreading (SMT): Instructions are simultaneously issued from multiple threads to the execution units of a superscalar processor. This combines the wide superscalar instruction issue capability with the use of mul- tiple thread contexts. • Chip multiprocessing: In this case, the entire processor is replicated on a sin- gle chip and each processor handles separate threads. The advantage of this approach is that the available logic area on a chip is used effectively without depending on ever-increasing complexity in pipeline design. This is referred to as multicore; we examine this topic separately in Chapter 18. For the first two approaches, instructions from different threads are not executed simultaneously. Instead, the processor is able to rapidly switch from one thread to another, using a different set of registers and other context information. This results in a better utilization of the processor’s execution resources and avoids a large penalty due to cache misses and other latency events. The SMT approach involves true simultaneous execution of instructions from different threads, using replicated execution resources. Chip multiprocessing also enables simultaneous execution of instructions from different threads. Figure 17.7, based on one in [UNGE02], illustrates some of the possible pipe- line architectures that involve multithreading and contrasts these with approaches that do not use multithreading. Each horizontal row represents the potential issue slot or slots for a single execution cycle; that is, the width of each row corresponds to the maximum number of instructions that can be issued in a single clock cycle.3 The vertical dimension represents the time sequence of clock cycles. An empty (shaded) slot represents an unused execution slot in one pipeline. A no-op is indicated by N. The first three illustrations in Figure 17.7 show different approaches with a scalar (i.e., single-issue) processor: • Single-threaded scalar: This is the simple pipeline found in traditional RISC and CISC machines, with no multithreading. • Interleaved multithreaded scalar: This is the easiest multithreading approach to implement. By switching from one thread to another at each clock cycle, the pipeline stages can be kept fully occupied, or close to fully occupied. The hardware must be capable of switching from one thread context to another between cycles.

3 Issue slots are the positions from which instructions can be issued in a given clock cycle. Recall from Chapter 16 that instruction issue is the process of initiating instruction execution in the processor's functional units. This occurs when an instruction moves from the decode stage of the pipeline to the first execute stage of the pipeline.

[Figure 17.7 Approaches to Executing Multiple Threads: issue-slot diagrams for (a) single-threaded scalar, (b) interleaved multithreading scalar, (c) blocked multithreading scalar, (d) superscalar, (e) interleaved multithreading superscalar, (f) blocked multithreading superscalar, (g) VLIW, (h) interleaved multithreading VLIW, (i) blocked multithreading VLIW, (j) simultaneous multithreading (SMT), and (k) chip multiprocessor (multicore). Each row represents one clock cycle and each column an issue slot; letters A through D identify threads, N marks a no-op, and shaded slots, thread switches, and latency cycles are indicated in the diagrams.]

• Blocked multithreaded scalar: In this case, a single thread is executed until a latency event occurs that would stop the pipeline, at which time the processor switches to another thread. Figure 17.7c shows a situation in which the time to perform a thread switch is one cycle, whereas Figure 17.7b shows that thread switching occurs in zero cycles.

In the case of interleaved multithreading, it is assumed that there are no control or data dependencies between threads, which simplifies the pipeline design and therefore should allow a thread switch with no delay. However, depending on the specific design and implementation, blocked multithreading may require a clock cycle to perform a thread switch, as illustrated in Figure 17.7. This is true if a fetched instruction triggers the thread switch and must be discarded from the pipeline [UNGE03]. Although interleaved multithreading appears to offer better processor utilization than blocked multithreading, it does so at the sacrifice of single-thread performance. The multiple threads compete for cache resources, which raises the probability of a cache miss for a given thread. More opportunities for parallel execution are available if the processor can issue multiple instructions per cycle. Figures 17.7d through 17.7i illustrate a number of variations among processors that have hardware for issuing four instructions per cycle. In all these cases, only instructions from a single thread are issued in a single cycle. The following alternatives are illustrated: • Superscalar: This is the basic superscalar approach with no multithreading. Until relatively recently, this was the most powerful approach to providing parallelism within a processor. Note that during some cycles, not all of the available issue slots are used. During these cycles, less than the maximum number of instructions is issued; this is referred to as horizontal loss. During other instruction cycles, no issue slots are used; these are cycles when no instructions can be issued; this is referred to as vertical loss. • Interleaved multithreading superscalar: During each cycle, as many instructions as possible are issued from a single thread. With this technique, potential delays due to thread switches are eliminated, as previously discussed. However, the number of instructions issued in any given cycle is still limited by dependencies that exist within any given thread. • Blocked multithreaded superscalar: Again, instructions from only one thread may be issued during any cycle, and blocked multithreading is used. • Very long instruction word (VLIW): A VLIW architecture, such as IA-64, places multiple instructions in a single word. Typically, a VLIW is constructed by the compiler, which places operations that may be executed in parallel in the same word. In a simple VLIW machine (Figure 17.7g), if it is not possible to completely fill the word with instructions to be issued in parallel, no-ops are used. • Interleaved multithreading VLIW: This approach should provide similar efficiencies to those provided by interleaved multithreading on a superscalar architecture. • Blocked multithreaded VLIW: This approach should provide similar efficiencies to those provided by blocked multithreading on a superscalar architecture. The final two approaches illustrated in Figure 17.7 enable the parallel, simultaneous execution of multiple threads: • Simultaneous multithreading: Figure 17.7j shows a system capable of issuing 8 instructions at a time. If one thread has a high degree of instruction-level parallelism, it may on some cycles be able to fill all of the horizontal slots. On

other cycles, instructions from two or more threads may be issued. If sufficient threads are active, it should usually be possible to issue the maximum number of instructions on each cycle, providing a high level of efficiency. • Chip multiprocessor (multicore): Figure 17.7k shows a chip containing four processors, each of which has a two-issue superscalar processor. Each processor is assigned a thread, from which it can issue up to two instructions per cycle. We discuss multicore computers in Chapter 18. Comparing Figures 17.7j and 17.7k, we see that a chip multiprocessor with the same instruction issue capability as an SMT cannot achieve the same degree of instruction-level parallelism. This is because the chip multiprocessor is not able to hide latencies by issuing instructions from other threads. On the other hand, the chip multiprocessor should outperform a superscalar processor with the same instruction issue capability, because the horizontal losses will be greater for the superscalar processor. In addition, it is possible to use multithreading within each of the processors on a chip multiprocessor, and this is done on some contemporary machines.
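The explicit threads discussed in this section are ordinary software threads. The following minimal sketch in C with POSIX threads (the thread function and the work it does are invented purely for illustration) shows two explicit threads that share their process's address space and other resources while each keeps its own stack and program counter, which is precisely the per-thread context a multithreaded processor must hold:

    #include <pthread.h>
    #include <stdio.h>

    /* Shared process resource: both threads see the same counter. */
    static long shared_counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* Each thread runs this function on its own stack, with its own
       program counter and register context. */
    static void *worker(void *arg)
    {
        const char *name = (const char *)arg;
        for (int i = 0; i < 1000; i++) {
            pthread_mutex_lock(&lock);
            shared_counter++;            /* process-wide shared data */
            pthread_mutex_unlock(&lock);
        }
        printf("%s finished\n", name);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        /* Two explicit (software) threads within one process. */
        pthread_create(&t1, NULL, worker, (void *)"thread A");
        pthread_create(&t2, NULL, worker, (void *)"thread B");

        /* Switching between these threads is cheap; a process switch
           would also have to change the address space. */
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        printf("counter = %ld\n", shared_counter);
        return 0;
    }

Whether these two threads are interleaved on one pipeline, run on the two logical processors of an SMT core, or run on separate cores of a chip multiprocessor is invisible to the program itself.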

Example Systems PENTIUM 4 More recent models of the Pentium 4 use a multithreading technique that the Intel literature refers to as hyperthreading [MARR02]. In essence, the Pentium 4 approach is to use SMT with support for two threads. Thus, the single multithreaded processor is logically two processors. IBM POWER5 The IBM Power5 chip, which is used in high-end PowerPC products, combines chip multiprocessing with SMT [KALL04]. The chip has two separate processors, each of which is a multithreaded processor capable of supporting two threads concurrently using SMT. Interestingly, the designers simulated various alternatives and found that having two two-way SMT processors on a single chip provided superior performance to a single four-way SMT processor. The simulations showed that additional multithreading beyond the support for two threads might decrease performance because of cache thrashing, as data from one thread displaces data needed by another thread. Figure 17.8 shows the IBM Power5's instruction flow diagram. Only a few of the elements in the processor need to be replicated, with separate elements dedicated to separate threads. Two program counters are used. The processor alternates fetching instructions, up to eight at a time, between the two threads. All the instructions are stored in a common instruction cache and share an instruction translation facility, which does a partial instruction decode. When a conditional branch is encountered, the branch prediction facility predicts the direction of the branch and, if possible, calculates the target address. For predicting the target of a subroutine return, the processor uses a return stack, one for each thread. Instructions then move into two separate instruction buffers. Then, on the basis of thread priority, a group of instructions is selected and decoded in parallel. Next, instructions flow through a register-renaming facility in program order. Logical registers are mapped to physical registers. The Power5 has 120 physical general-purpose registers and 120 physical floating-point registers. The instructions are then moved into issue queues.
[Figure 17.8 Power5 Instruction Data Flow: per-thread program counters and instruction buffers; a shared instruction cache, instruction translation, branch prediction (branch history tables, return stacks, target cache), group formation/instruction decode and dispatch, shared register mappers, and shared issue queues; shared execution units (two load/store units LSU, two fixed-point units FXU, two floating-point units FPU, a branch execution unit BXU, and a condition register logical unit CRL) with shared read and write register files; and a data translation path, data cache, store queue, L2 cache, and group completion stage, all shared by the two threads.]
From the issue queues, instructions are issued

using simultaneous multithreading (SMT). That is, the processor has a superscalar architecture and can issue instructions from one or both threads in parallel. At the end of the pipeline, separate thread resources are needed to commit the instructions.

17.5 CLUSTERS

An important and relatively recent development in computer system design is clustering. Clustering is an alternative to symmetric multiprocessing as an approach to providing high performance and high availability and is particularly attractive for server applications. We can define a cluster as a group of interconnected, whole computers working together as a unified computing resource that can create the illusion of being one machine. The term whole computer means a system that can run on its own, apart from the cluster; in the literature, each computer in a cluster is typically referred to as a node. [BREW97] lists four benefits that can be achieved with clustering. These can also be thought of as objectives or design requirements: • Absolute scalability: It is possible to create large clusters that far surpass the power of even the largest standalone machines. A cluster can have tens, hundreds, or even thousands of machines, each of which is a multiprocessor. • Incremental scalability: A cluster is configured in such a way that it is possible to add new systems to the cluster in small increments. Thus, a user can start out with a modest system and expand it as needs grow, without having to go through a major upgrade in which an existing small system is replaced with a larger system. • High availability: Because each node in a cluster is a standalone computer, the failure of one node does not mean loss of service. In many products, fault tolerance is handled automatically in software. • Superior price/performance: By using commodity building blocks, it is possible to put together a cluster with equal or greater computing power than a single large machine, at much lower cost.

Cluster Configurations In the literature, clusters are classified in a number of different ways. Perhaps the simplest classification is based on whether the computers in a cluster share access to the same disks. Figure 17.9a shows a two-node cluster in which the only interconnection is by means of a high-speed link that can be used for message exchange to coordinate cluster activity. The link can be a LAN that is shared with other computers that are not part of the cluster or the link can be a dedicated interconnection facility. In the latter case, one or more of the computers in the cluster will have a link to a LAN or WAN so that there is a connection between the server cluster and remote client systems. Note that in the figure, each computer is depicted as being a multiprocessor. This is not necessary but does enhance both performance and availability. In the simple classification depicted in Figure 17.9, the other alternative is a shared-disk cluster. In this case, there generally is still a message link between

[Figure 17.9 Cluster Configurations: (a) standby server with no shared disk, showing two multiprocessor nodes (processors P, memory M, I/O) joined only by a high-speed message link; (b) shared disk, showing the same two nodes joined by the high-speed message link and both cabled to a common RAID disk subsystem.]

nodes. In addition, there is a disk subsystem that is directly linked to multiple computers within the cluster. In this figure, the common disk subsystem is a RAID system. The use of RAID or some similar redundant disk technology is common in clusters so that the high availability achieved by the presence of multiple computers is not compromised by a shared disk that is a single point of failure. A clearer picture of the range of cluster options can be gained by looking at functional alternatives. Table 17.2 provides a useful classification along functional lines, which we now discuss. A common, older method, known as passive standby, is simply to have one computer handle all of the processing load while the other computer remains inactive, standing by to take over in the event of a failure of the primary. To coordinate the machines, the active, or primary, system periodically sends a "heartbeat" message to the standby machine. Should these messages stop arriving, the standby assumes that the primary server has failed and puts itself into operation. This approach increases availability but does not improve performance. Further, if the only information that is exchanged between the two systems is a heartbeat message,

Table 17.2 Clustering Methods: Benefits and Limitations

Passive Standby
Description: A secondary server takes over in case of primary server failure.
Benefits: Easy to implement.
Limitations: High cost because the secondary server is unavailable for other processing tasks.

Active Secondary
Description: The secondary server is also used for processing tasks.
Benefits: Reduced cost because secondary servers can be used for processing.
Limitations: Increased complexity.

Separate Servers
Description: Separate servers have their own disks. Data is continuously copied from primary to secondary server.
Benefits: High availability.
Limitations: High network and server overhead due to copying operations.

Servers Connected to Disks
Description: Servers are cabled to the same disks, but each server owns its disks. If one server fails, its disks are taken over by the other server.
Benefits: Reduced network and server overhead due to elimination of copying operations.
Limitations: Usually requires disk mirroring or RAID technology to compensate for risk of disk failure.

Servers Share Disks
Description: Multiple servers simultaneously share access to disks.
Benefits: Low network and server overhead. Reduced risk of downtime caused by disk failure.
Limitations: Requires lock manager software. Usually used with disk mirroring or RAID technology.

and if the two systems do not share common disks, then the standby provides a functional backup but has no access to the databases managed by the primary. The passive standby is generally not referred to as a cluster. The term cluster is reserved for multiple interconnected computers that are all actively doing processing while maintaining the image of a single system to the outside world. The term active secondary is often used in referring to this configuration. Three classifications of clustering can be identified: separate servers, shared nothing, and shared disk. In one approach to clustering, each computer is a separate server with its own disks and there are no disks shared between systems (Figure 17.9a). This arrangement provides high performance as well as high availability. In this case, some type of management or scheduling software is needed to assign incoming client requests to servers so that the load is balanced and high utilization is achieved. It is desirable to have a failover capability, which means that if a computer fails while executing an application, another computer in the cluster can pick up and complete the application. For this to happen, data must constantly be copied among systems so that each system has access to the current data of the other systems. The overhead of this data exchange ensures high availability at the cost of a performance penalty. To reduce the communications overhead, most clusters now consist of servers connected to common disks (Figure 17.9b). In one variation on this approach, called shared nothing, the common disks are partitioned into volumes, and each volume is owned by a single computer. If that computer fails, the cluster must be reconfigured so that some other computer has ownership of the volumes of the failed computer.

It is also possible to have multiple computers share the same disks at the same time (called the shared disk approach), so that each computer has access to all of the volumes on all of the disks. This approach requires the use of some type of locking facility to ensure that data can only be accessed by one computer at a time.
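To make the heartbeat coordination used in the passive standby approach concrete, the following is a minimal sketch in C of the standby side, using POSIX UDP sockets; the port number, timeout, miss threshold, and take_over() routine are all invented for illustration, and real cluster products use considerably more elaborate membership protocols:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define HEARTBEAT_PORT 5000   /* hypothetical port */
    #define MAX_MISSED     3      /* heartbeats missed before failover */

    static void take_over(void)   /* placeholder for real failover actions */
    {
        printf("primary presumed failed: standby taking over\n");
    }

    int main(void)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(HEARTBEAT_PORT);
        bind(sock, (struct sockaddr *)&addr, sizeof(addr));

        /* Wait at most 2 seconds for each heartbeat datagram. */
        struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
        setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

        int missed = 0;
        char buf[64];
        for (;;) {
            ssize_t n = recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL);
            if (n > 0) {
                missed = 0;              /* primary is alive */
            } else if (++missed >= MAX_MISSED) {
                take_over();             /* assume the primary has failed */
                break;
            }
        }
        close(sock);
        return 0;
    }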

Operating System Design Issues Full exploitation of a cluster hardware configuration requires some enhancements to a single-system operating system. FAILURE MANAGEMENT How failures are managed by a cluster depends on the clustering method used (Table 17.2). In general, two approaches can be taken to dealing with failures: highly available clusters and fault-tolerant clusters. A highly available cluster offers a high probability that all resources will be in service. If a failure occurs, such as a system goes down or a disk volume is lost, then the queries in progress are lost. Any lost query, if retried, will be serviced by a different computer in the cluster. However, the cluster operating system makes no guarantee about the state of partially executed transactions. This would need to be handled at the application level. A fault-tolerant cluster ensures that all resources are always available. This is achieved by the use of redundant shared disks and mechanisms for backing out uncommitted transactions and committing completed transactions. The function of switching applications and data resources over from a failed system to an alternative system in the cluster is referred to as failover. A related function is the restoration of applications and data resources to the original system once it has been fixed; this is referred to as failback. Failback can be automated, but this is desirable only if the problem is truly fixed and unlikely to recur. If not, automatic failback can cause subsequently failed resources to bounce back and forth between computers, resulting in performance and recovery problems. LOAD BALANCING A cluster requires an effective capability for balancing the load among available computers. This includes the requirement that the cluster be incrementally scalable. When a new computer is added to the cluster, the load-balancing facility should automatically include this computer in scheduling applications. Middleware mechanisms need to recognize that services can appear on different members of the cluster and may migrate from one member to another. PARALLELIZING COMPUTATION In some cases, effective use of a cluster requires executing software from a single application in parallel. [KAPP00] lists three general approaches to the problem: • Parallelizing compiler: A parallelizing compiler determines, at compile time, which parts of an application can be executed in parallel. These are then split off to be assigned to different computers in the cluster. Performance depends on the nature of the problem and how well the compiler is designed. In general, such compilers are difficult to develop. • Parallelized application: In this approach, the programmer writes the application from the outset to run on a cluster, and uses message passing to move data, as required, between cluster nodes. This places a high burden on the programmer but may be the best approach for exploiting clusters for some applications (a minimal message-passing sketch appears after this list).

• Parametric computing: This approach can be used if the essence of the application is an algorithm or program that must be executed a large number of times, each time with a different set of starting conditions or parameters. A good example is a simulation model, which will run a large number of different scenarios and then develop statistical summaries of the results. For this approach to be effective, parametric processing tools are needed to organize, run, and manage the jobs in an effective manner.
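As an illustration of the message-passing style used by a parallelized application, and of a parametric run farmed out across nodes, the sketch below uses MPI, one widely used message-passing library that is assumed to be available here rather than prescribed by the text; simulate() and its parameter are hypothetical stand-ins for real work:

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical simulation kernel: one run per parameter value. */
    static double simulate(double parameter)
    {
        return parameter * parameter;   /* stand-in for real work */
    }

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each cluster node runs the model with its own starting condition. */
        double local = simulate(1.0 + rank);

        /* Collect a statistical summary (here, just the sum) at node 0. */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum over %d nodes = %f\n", size, total);

        MPI_Finalize();
        return 0;
    }

Run under the cluster's job scheduler with one process per node, each rank executes the same program on its own parameters, and only the small summary value crosses the network.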

Cluster Computer Architecture Figure 17.10 shows a typical cluster architecture. The individual computers are connected by some high-speed LAN or switch hardware. Each computer is capable of operating independently. In addition, a middleware layer of software is installed in each computer to enable cluster operation. The cluster middleware provides a unified system image to the user, known as a single-system image. The middleware is also responsible for providing high availability, by means of load balancing and responding to failures in individual components. [HWAN99] lists the following as desirable cluster middleware services and functions: • Single entry point: A user logs onto the cluster rather than to an individual computer. • Single file hierarchy: The user sees a single hierarchy of file directories under the same root directory. • Single control point: There is a default workstation used for cluster management and control. • Single virtual networking: Any node can access any other point in the cluster, even though the actual cluster configuration may consist of multiple interconnected networks. There is a single virtual network operation. • Single memory space: Distributed shared memory enables programs to share variables. • Single job-management system: Under a cluster job scheduler, a user can submit a job without specifying the host computer to execute the job. • Single user interface: A common graphic interface supports all users, regardless of the workstation from which they enter the cluster. • Single I/O space: Any node can remotely access any I/O peripheral or disk device without knowledge of its physical location. • Single process space: A uniform process-identification scheme is used. A process on any node can create or communicate with any other process on a remote node. • Checkpointing: This function periodically saves the process state and intermediate computing results, to allow rollback recovery after a failure. • Process migration: This function enables load balancing. The last four items on the preceding list enhance the availability of the cluster. The remaining items are concerned with providing a single system image.
[Figure 17.10 Cluster Computer Architecture [BUYY99a]: sequential and parallel applications run above a parallel programming environment and a layer of cluster middleware (single system image and availability infrastructure), which spans multiple PCs/workstations, each with its communications software and network interface hardware, interconnected by a high-speed network/switch.]


[Figure 17.11 Example 100-Gbps Ethernet Configuration for Massive Blade Server Site: blade computers connect to Ethernet switches over 10-Gbps and 40-Gbps links, and the switches are interconnected by N × 100 Gbps uplinks.]

Returning to Figure 17.10, a cluster will also include software tools for enabling the efficient execution of programs that are capable of parallel execution. Blade Servers A common implementation of the cluster approach is the blade server. A blade server is a server architecture that houses multiple server modules ("blades") in a single chassis. It is widely used in data centers to save space and improve system management. Either self-standing or rack mounted, the chassis provides the power supply, and each blade has its own processor, memory, and hard disk. An example of the application is shown in Figure 17.11, taken from [NOWE07]. The trend at large data centers, with substantial banks of blade servers, is the deployment of 10-Gbps ports on individual servers to handle the massive multimedia traffic provided by these servers. Such arrangements are stressing the on-site Ethernet switches needed to interconnect large numbers of servers. A 100-Gbps rate provides the bandwidth required to handle the increased traffic load. The 100-Gbps Ethernet switches are deployed in switch uplinks inside the data center as well as providing interbuilding, intercampus, wide area connections for enterprise networks. Clusters Compared to SMP Both clusters and symmetric multiprocessors provide a configuration with multiple processors to support high-demand applications. Both solutions are commercially available, although SMP schemes have been around far longer. The main strength of the SMP approach is that an SMP is easier to manage and configure than a cluster. The SMP is much closer to the original single-processor

model for which nearly all applications are written. The principal change required in going from a uniprocessor to an SMP is to the scheduler function. Another benefit of the SMP is that it usually takes up less physical space and draws less power than a comparable cluster. A final important benefit is that the SMP products are well established and stable. Over the long run, however, the advantages of the cluster approach are likely to result in clusters dominating the high-performance server market. Clusters are far superior to SMPs in terms of incremental and absolute scalability. Clusters are also superior in terms of availability, because all components of the system can readily be made highly redundant.

17.6 NONUNIFORM MEMORY ACCESS

In terms of commercial products, the two common approaches to providing a multiple-processor system to support applications are SMPs and clusters. For some years, another approach, known as nonuniform memory access (NUMA), has been the subject of research and commercial NUMA products are now available. Before proceeding, we should define some terms often found in the NUMA literature. • Uniform memory access (UMA): All processors have access to all parts of main memory using loads and stores. The memory access time of a processor to all regions of memory is the same. The access times experienced by different processors are the same. The SMP organization discussed in Sections 17.2 and 17.3 is UMA. • Nonuniform memory access (NUMA): All processors have access to all parts of main memory using loads and stores. The memory access time of a processor differs depending on which region of main memory is accessed. The last statement is true for all processors; however, for different processors, which memory regions are slower and which are faster differ. • Cache-coherent NUMA (CC-NUMA): A NUMA system in which cache coherence is maintained among the caches of the various processors. A NUMA system without cache coherence is more or less equivalent to a cluster. The commercial products that have received much attention recently are CC-NUMA systems, which are quite distinct from both SMPs and clusters. Usually, but unfortunately not always, such systems are in fact referred to in the commercial literature as CC-NUMA systems. This section is concerned only with CC-NUMA systems. Motivation With an SMP system, there is a practical limit to the number of processors that can be used. An effective cache scheme reduces the bus traffic between any one processor and main memory. As the number of processors increases, this bus traffic also increases. Also, the bus is used to exchange cache-coherence signals, further adding to the burden. At some point, the bus becomes a performance bottleneck. Performance degradation seems to limit the number of processors in an SMP configuration to somewhere between 16 and 64 processors. For example, Silicon Graphics' Power Challenge SMP is limited to 64 R10000 processors in a single system; beyond this number performance degrades substantially. The processor limit in an SMP is one of the driving motivations behind the development of cluster systems. However, with a cluster, each node has its own private main memory; applications do not see a large global memory. In effect, coherency is maintained in software rather than hardware. This memory granularity affects performance and, to achieve maximum performance, software must be tailored to this environment. One approach to achieving large-scale multiprocessing while retaining the flavor of SMP is NUMA. For example, the Silicon Graphics Origin NUMA system is designed to support up to 1024 MIPS R10000 processors [WHIT97] and the Sequent NUMA-Q system is designed to support up to 252 Pentium II processors [LOVE96]. The objective with NUMA is to maintain a transparent system wide memory while permitting multiple multiprocessor nodes, each with its own bus or other internal interconnect system. Organization Figure 17.12 depicts a typical CC-NUMA organization. There are multiple independent nodes, each of which is, in effect, an SMP organization. Thus, each node contains multiple processors, each with its own L1 and L2 caches, plus main memory. The node is the basic building block of the overall CC-NUMA organization.
For example, each Silicon Graphics Origin node includes two MIPS R10000 processors; each Sequent NUMA-Q node includes four Pentium II processors. The nodes are interconnected by means of some communications facility, which could be a switching mechanism, a ring, or some other networking facility. Each node in the CC-NUMA system includes some main memory. From the point of view of the processors, however, there is only a single addressable memory, with each location having a unique system wide address. When a processor initiates a memory access, if the requested memory location is not in that processor's cache, then the L2 cache initiates a fetch operation. If the desired line is in the local portion of the main memory, the line is fetched across the local bus. If the desired line is in a remote portion of the main memory, then an automatic request is sent out to fetch that line across the interconnection network, deliver it to the local bus, and then deliver it to the requesting cache on that bus. All of this activity is automatic and transparent to the processor and its cache. In this configuration, cache coherence is a central concern. Although implementations differ as to details, in general terms we can say that each node must maintain some sort of directory that gives it an indication of the location of various portions of memory and also cache status information. To see how this scheme works, we give an example taken from [PFIS98]. Suppose that processor 3 on node 2 (P2-3) requests a memory location 798, which is in the memory of node 1. The following sequence occurs: 1. P2-3 issues a read request on the snoopy bus of node 2 for location 798. 2. The directory on node 2 sees the request and recognizes that the location is in node 1.

[Figure 17.12 CC-NUMA Organization: N nodes, each containing processors 1 through m (each with its own L1 and L2 cache), a directory, I/O, and local main memory, joined by an interconnect network.]

3. Node 2's directory sends a request to node 1, which is picked up by node 1's directory. 4. Node 1's directory, acting as a surrogate of P2-3, requests the contents of 798, as if it were a processor. 5. Node 1's main memory responds by putting the requested data on the bus. 6. Node 1's directory picks up the data from the bus. 7. The value is transferred back to node 2's directory. 8. Node 2's directory places the data back on node 2's bus, acting as a surrogate for the memory that originally held it. 9. The value is picked up and placed in P2-3's cache and delivered to P2-3.
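The directory bookkeeping implied by the numbered sequence above can be modeled in a few lines of C; the data structure and node count below are invented purely for illustration, since real directories are hardware tables rather than software:

    #include <stdbool.h>
    #include <stdio.h>

    #define NODES 4

    /* One directory entry per memory line: where the line lives and
       which remote nodes currently hold a cached copy. */
    struct dir_entry {
        int  home_node;          /* node whose local memory holds the line */
        bool cached_at[NODES];   /* remote copies, for later invalidation  */
    };

    /* Model of a remote read: the home node's directory records the new
       sharer before returning the data (steps 3 through 7 above). */
    static void remote_read(struct dir_entry *e, int requesting_node)
    {
        e->cached_at[requesting_node] = true;
        printf("node %d fetched line from home node %d\n",
               requesting_node, e->home_node);
    }

    /* Model of a later write at the home node: every recorded sharer
       must be purged (invalidated). */
    static void invalidate_sharers(struct dir_entry *e)
    {
        for (int n = 0; n < NODES; n++)
            e->cached_at[n] = false;
    }

    int main(void)
    {
        struct dir_entry line798 = { .home_node = 1 };
        remote_read(&line798, 2);        /* P2-3 reads location 798 */
        invalidate_sharers(&line798);    /* a write purges remote copies */
        return 0;
    }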

The preceding sequence explains how data are read from a remote memory using hardware mechanisms that make the transaction transparent to the processor. On top of these mechanisms, some form of cache coherence protocol is needed. Various systems differ on exactly how this is done. We make only a few general remarks here. First, as part of the preceding sequence, node 1’s directory keeps a record that some remote cache has a copy of the line containing location 798. Then, there needs to be a cooperative protocol to take care of modifications. For exam- ple, if a modification is done in a cache, this fact can be broadcast to other nodes. Each node’s directory that receives such a broadcast can then determine if any local cache has that line and, if so, cause it to be purged. If the actual memory location is at the node receiving the broadcast notification, then that node’s directory needs to maintain an entry indicating that that line of memory is invalid and remains so until a write back occurs. If another processor (local or remote) requests the invalid line, then the local directory must force a write back to update memory before providing the data. NUMA Pros and Cons The main advantage of a CC-NUMA system is that it can deliver effective perfor- mance at higher levels of parallelism than SMP, without requiring major software changes. With multiple NUMA nodes, the bus traffic on any individual node is lim- ited to a demand that the bus can handle. However, if many of the memory accesses are to remote nodes, performance begins to break down. There is reason to believe that this performance breakdown can be avoided. First, the use of L1 and L2 caches is designed to minimize all memory accesses, including remote ones. If much of the software has good temporal locality, then remote memory accesses should not be excessive. Second, if the software has good spatial locality, and if virtual memory is in use, then the data needed for an application will reside on a limited number of frequently used pages that can be initially loaded into the memory local to the run- ning application. The Sequent designers report that such spatial locality does appear in representative applications [LOVE96]. Finally, the virtual memory scheme can be enhanced by including in the operating system a page migration mechanism that will move a virtual memory page to a node that is frequently using it; the Silicon Graphics designers report success with this approach [WHIT97]. Even if the performance breakdown due to remote access is addressed, there are two other disadvantages for the CC-NUMA approach [PFIS98]. First, a CC-NUMA does not transparently look like an SMP; software changes will be required to move an operating system and applications from an SMP to a CC-NUMA system. These include page allocation, already mentioned, process allo- cation, and load balancing by the operating system. A second concern is that of availability. This is a rather complex issue and depends on the exact implementation of the CC-NUMA system; the interested reader is referred to [PFIS98].
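On Linux-based CC-NUMA systems, the page-placement concerns just described can also be addressed explicitly from user code. A minimal sketch using the libnuma library (assuming libnuma is installed; link with -lnuma) allocates a buffer in a chosen node's local memory and binds the calling thread to that node so that most accesses stay local:

    #include <numa.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA policy not supported on this system\n");
            return 1;
        }

        int node = 0;                        /* target node, illustrative */
        size_t bytes = 64 * 1024 * 1024;

        /* Place the data in node 0's local memory ... */
        double *buf = numa_alloc_onnode(bytes, node);
        if (buf == NULL) {
            fprintf(stderr, "numa_alloc_onnode failed\n");
            return 1;
        }

        /* ... and run this thread on the same node, so accesses are local. */
        numa_run_on_node(node);

        size_t n = bytes / sizeof(double);
        for (size_t i = 0; i < n; i++)
            buf[i] = 1.0;                    /* touches only local memory */

        numa_free(buf, bytes);
        return 0;
    }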


17.7 VECTOR COMPUTATION

Although the performance of mainframe general-purpose computers continues to improve relentlessly, there continue to be applications that are beyond the reach of the contemporary mainframe. There is a need for computers to solve mathematical problems of physical processes, such as occur in disciplines including aerodynamics, seismology, meteorology, and atomic, nuclear, and plasma physics. Typically, these problems are characterized by the need for high precision and a program that repetitively performs floating-point arithmetic operations on large arrays of numbers. Most of these problems fall into the category known as continuous-field simulation. In essence, a physical situation can be described by a surface or region in three dimensions (e.g., the flow of air adjacent to the surface of a rocket). This surface is approximated by a grid of points. A set of differential equations defines the physical behavior of the surface at each point. The equations are represented as an array of values and coefficients, and the solution involves repeated arithmetic operations on the arrays of data. Supercomputers were developed to handle these types of problems. These machines are typically capable of billions of floating-point operations per second. In contrast to mainframes, which are designed for multiprogramming and intensive I/O, the supercomputer is optimized for the type of numerical calculation just described. The supercomputer has limited use and, because of its price tag, a limited market. Comparatively few of these machines are operational, mostly at research centers and some government agencies with scientific or engineering functions. As with other areas of computer technology, there is a constant demand to increase the performance of the supercomputer. Thus, the technology and performance of the supercomputer continues to evolve. There is another type of system that has been designed to address the need for vector computation, referred to as the array processor. Although a supercomputer is optimized for vector computation, it is a general-purpose computer, capable of handling scalar processing and general data processing tasks. Array processors do not include scalar processing; they are configured as peripheral devices by both mainframe and minicomputer users to run the vectorized portions of programs. Approaches to Vector Computation The key to the design of a supercomputer or array processor is to recognize that the main task is to perform arithmetic operations on arrays or vectors of floating-point numbers. In a general-purpose computer, this will require iteration through each element of the array. For example, consider two vectors (one-dimensional arrays) of numbers, A and B. We would like to add these and place the result in C. In the example of Figure 17.13, this requires six separate additions. How could we speed up this computation? The answer is to introduce some form of parallelism. Several approaches have been taken to achieving parallelism in vector computation. We illustrate this with an example. Consider the vector multiplication C = A * B, where A, B, and C are N * N matrices. The formula for each element of C is

    c(i, j) = Σ (k = 1 to N) a(i, k) * b(k, j)

where A, B, and C have elements a(i, j), b(i, j), and c(i, j), respectively.

[Figure 17.13 Example of Vector Addition: A = (1.5, 7.1, 6.9, 100.5, 0, 59.7), B = (2.0, 39.7, 1000.003, 11, 21.1, 19.7), C = A + B = (3.5, 46.8, 1006.903, 111.5, 21.1, 79.4)]

Figure 17.14a shows a FORTRAN program for this computation that can be run on an ordinary scalar processor. One approach to improving performance can be referred to as vector processing. This assumes that it is possible to operate on a one-dimensional vector of data. Figure 17.14b is a FORTRAN program with a new form of instruction that allows

      DO 100 I = 1, N
        DO 100 J = 1, N
          C(I, J) = 0.0
          DO 100 K = 1, N
            C(I, J) = C(I, J) + A(I, K) * B(K, J)
  100 CONTINUE
(a) Scalar processing

      DO 100 I = 1, N
        C(I, J) = 0.0 (J = 1, N)
        DO 100 K = 1, N
          C(I, J) = C(I, J) + A(I, K) * B(K, J) (J = 1, N)
  100 CONTINUE
(b) Vector processing

      DO 50 J = 1, N - 1
        FORK 100
   50 CONTINUE
      J = N
  100 DO 200 I = 1, N
        C(I, J) = 0.0
        DO 200 K = 1, N
          C(I, J) = C(I, J) + A(I, K) * B(K, J)
  200 CONTINUE
      JOIN N
(c) Parallel processing

Figure 17.14 Matrix Multiplication (C = A * B)

vector computation to be specified. The notation (J = 1, N) indicates that operations on all indices J in the given interval are to be carried out as a single operation. How this can be achieved is addressed shortly. The program in Figure 17.14b indicates that all the elements of the ith row are to be computed in parallel. Each element in the row is a summation, and the summations (across K) are done serially rather than in parallel. Even so, only N² vector multiplications are required for this algorithm as compared with N³ scalar multiplications for the scalar algorithm. Another approach, parallel processing, is illustrated in Figure 17.14c. This approach assumes that we have N independent processors that can function in parallel. To utilize processors effectively, we must somehow parcel out the computation to the various processors. Two primitives are used. The primitive FORK n causes an independent process to be started at location n. In the meantime, the original process continues execution at the instruction immediately following the FORK. Every execution of a FORK spawns a new process. The JOIN instruction is essentially the inverse of the FORK. The statement JOIN N causes N independent processes to be merged into one that continues execution at the instruction following the JOIN. The operating system must coordinate this merger, and so the execution does not continue until all N processes have reached the JOIN instruction. The program in Figure 17.14c is written to mimic the behavior of the vector-processing program (a rough threads-based analogue of this fork/join pattern is sketched after the list below). In the parallel processing program, each column of C is computed by a separate process. Thus, the elements in a given row of C are computed in parallel. The preceding discussion describes approaches to vector computation in logical or architectural terms. Let us turn now to a consideration of types of processor organization that can be used to implement these approaches. A wide variety of organizations have been and are being pursued. Three main categories stand out:

• Pipelined ALU
• Parallel ALUs
• Parallel processors
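The FORK and JOIN primitives of Figure 17.14c map naturally onto thread creation and joining on modern systems. A minimal sketch in C with POSIX threads, using one thread per column purely to mirror the figure (a real program would use far fewer threads), is the rough analogue referred to above:

    #include <pthread.h>
    #include <stdio.h>

    #define N 4

    static double A[N][N], B[N][N], C[N][N];

    /* Each thread computes one column of C, as in Figure 17.14c. */
    static void *column_worker(void *arg)
    {
        int j = (int)(long)arg;
        for (int i = 0; i < N; i++) {
            C[i][j] = 0.0;
            for (int k = 0; k < N; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[N];

        for (int i = 0; i < N; i++)               /* simple test matrices */
            for (int j = 0; j < N; j++) {
                A[i][j] = i + 1;
                B[i][j] = (i == j) ? 1.0 : 0.0;   /* identity matrix */
            }

        /* "FORK": start one worker thread per column. */
        for (int j = 0; j < N; j++)
            pthread_create(&tid[j], NULL, column_worker, (void *)(long)j);

        /* "JOIN N": wait until all N workers have finished. */
        for (int j = 0; j < N; j++)
            pthread_join(tid[j], NULL);

        printf("C[0][0] = %.1f (expect 1.0)\n", C[0][0]);
        return 0;
    }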

Figure 17.15 illustrates the first two of these approaches. We have already dis- cussed pipelining in Chapter 14. Here the concept is extended to the operation of the ALU. Because floating-point operations are rather complex, there is opportunity for decomposing a floating-point operation into stages, so that different stages can operate on different sets of data concurrently. This is illustrated in Figure 17.16a. Floating-point addition is broken up into four stages (see Figure 10.22): compare, shift, add, and normalize. A vector of numbers is presented sequentially to the first stage. As the processing proceeds, four different sets of numbers will be operated on concurrently in the pipeline. It should be clear that this organization is suitable for vector processing. To see this, consider the instruction pipelining described in Chapter 14. The processor goes through a repetitive cycle of fetching and processing instructions. In the absence of branches, the processor is continuously fetching instructions from sequential locations. Consequently, the pipeline is kept full and a savings in time is achieved. Similarly, a pipelined ALU will save time only if it is fed a stream of data from 17.7 / VECTOR COMPUTATION 647

[Figure 17.15 Approaches to Vector Computation: (a) pipelined ALU, in which memory and input registers feed a single pipelined ALU whose results go to an output register; (b) parallel ALUs, in which memory and input registers feed several ALUs operating in parallel, with results collected in an output register.]

sequential locations. A single, isolated floating-point operation is not speeded up by a pipeline. The speedup is achieved when a vector of operands is presented to the ALU. The control unit cycles the data through the ALU until the entire vector is processed. The pipeline operation can be further enhanced if the vector elements are available in registers rather than from main memory. This is in fact suggested by Figure 17.15a. The elements of each vector operand are loaded as a block into a vector register, which is simply a large bank of identical registers. The result is also placed in a vector register. Thus, most operations involve only the use of registers, and only load and store operations and the beginning and end of a vector operation require access to memory. The mechanism illustrated in Figure 17.16 could be referred to as pipelining within an operation. That is, we have a single arithmetic operation (e.g., C = A + B) that is to be applied to vector operands, and pipelining allows multiple vector elements to be processed in parallel. This mechanism can be augmented with pipelining across operations. In this latter case, there is a sequence of arithmetic vector operations, and instruction pipelining is used to speed up processing. One approach

[Figure 17.16 Pipelined Processing of Floating-Point Operations: (a) pipelined ALU, in which each operand pair (xi, yi) passes through compare exponent (C), shift significand (S), add significands (A), and normalize (N) stages to produce zi, with successive pairs entering one cycle apart; (b) four parallel ALUs, with successive operand pairs distributed across four such pipelines.]

to this, referred to as chaining, is found on the Cray supercomputers. The basic rule for chaining is this: A vector operation may start as soon as the first element of the operand vector(s) is available and the functional unit (e.g., add, subtract, multiply, divide) is free. Essentially, chaining causes results issuing from one functional unit to be fed immediately into another functional unit and so on. If vector registers are used, intermediate results do not have to be stored into memory and can be used even before the vector operation that created them runs to completion. For example, when computing C = (s * A) + B, where A, B, and C are vectors and s is a scalar, the Cray may execute three instructions at once. Elements fetched for a load immediately enter a pipelined multiplier, the products are sent to a pipelined adder, and the sums are placed in a vector register as soon as the adder completes them: 1. Vector load A → Vector Register (VR1) 2. Vector load B → VR2

3. Vector multiply s * VR1 → VR3 4. Vector add VR3 + VR2 → VR4 5. Vector store VR4 → C Instructions 2 and 3 can be chained (pipelined) because they involve different memory locations and registers. Instruction 4 needs the results of instructions 2 and 3, but it can be chained with them as well. As soon as the first elements of vector registers 2 and 3 are available, the operation in instruction 4 can begin. Another way to achieve vector processing is by the use of multiple ALUs in a single processor, under the control of a single control unit. In this case, the control unit routes data to ALUs so that they can function in parallel. It is also possible to use pipelining on each of the parallel ALUs. This is illustrated in Figure 17.16b. The example shows a case in which four ALUs operate in parallel. As with pipelined organization, a parallel ALU organization is suitable for vector processing. The control unit routes vector elements to ALUs in a round-robin fashion until all elements are processed. This type of organization is more complex than a single-ALU CPU. Finally, vector processing can be achieved by using multiple parallel processors. In this case, it is necessary to break the task up into multiple processes to be executed in parallel. This organization is effective only if the software and hardware for effective coordination of parallel processors is available. We can expand our taxonomy of Section 17.1 to reflect these new structures, as shown in Figure 17.17. Computer organizations can be distinguished by the presence of one or more control units. Multiple control units imply multiple processors. Following our previous discussion, if the multiple processors can function cooperatively on a given task, they are termed parallel processors. The reader should be aware of some unfortunate terminology likely to be encountered in the literature. The term vector processor is often equated with a pipelined ALU organization, although a parallel ALU organization is also designed for vector processing, and, as we have discussed, a parallel processor organization may also be designed for vector processing. Array processing is sometimes used to refer to a parallel ALU, although, again, any of the three organizations is optimized for the processing of arrays. To make matters worse, array processor usually refers to an auxiliary processor attached to a general-purpose processor and used to perform vector computation. An array processor may use either the pipelined or parallel ALU approach. At present, the pipelined ALU organization dominates the marketplace. Pipelined systems are less complex than the other two approaches. Their control

[Figure 17.17 A Taxonomy of Computer Organizations: under a single control unit, the uniprocessor, pipelined ALU, and parallel ALU organizations; under multiple control units, the multiprocessor and parallel processor organizations.]

unit and operating system design are well developed to achieve efficient resource allocation and high performance. The remainder of this section is devoted to a more detailed examination of this approach, using a specific example. IBM 3090 Vector Facility A good example of a pipelined ALU organization for vector processing is the vector facility developed for the IBM 370 architecture and implemented on the high-end 3090 series [PADE88, TUCK87]. This facility is an optional add-on to the basic system but is highly integrated with it. It resembles vector facilities found on super- computers, such as the Cray family. The IBM facility makes use of a number of vector registers. Each register is actually a bank of scalar registers. To compute the vector sum C = A + B, the vec- tors A and B are loaded into two vector registers. The data from these registers are passed through the ALU as fast as possible, and the results are stored in a third vec- tor register. The computation overlap, and the loading of the input data into the reg- isters in a block, results in a significant speeding up over an ordinary ALU operation. ORGANIZATION The IBM vector architecture, and similar pipelined vector ALUs, provides increased performance over loops of scalar arithmetic instructions in three ways: • The fixed and predetermined structure of vector data permits housekeep- ing instructions inside the loop to be replaced by faster internal (hardware or microcoded) machine operations. • Data-access and arithmetic operations on several successive vector elements can proceed concurrently by overlapping such operations in a pipelined design or by performing multiple-element operations in parallel. • The use of vector registers for intermediate results avoids additional storage reference. Figure 17.18 shows the general organization of the vector facility. Although the vector facility is seen to be a physically separate add-on to the processor, its archi- tecture is an extension of the System/370 architecture and is compatible with it. The vector facility is integrated into the System/370 architecture in the following ways: • Existing System/370 instructions are used for all scalar operations. • Arithmetic operations on individual vector elements produce exactly the same result as do corresponding System/370 scalar instructions. For example, one design decision concerned the definition of the result in a floating-point DIVIDE operation. Should the result be exact, as it is for scalar floating-point division, or should an approximation be allowed that would permit higher- speed implementation but could sometimes introduce an error in one or more low-order bit positions? The decision was made to uphold complete compati- bility with the System/370 architecture at the expense of a minor performance degradation. • Vector instructions are interruptible, and their execution can be resumed from the point of interruption after appropriate action has been taken, in a manner compatible with the System/370 program-interruption scheme. 17.7 / VECTOR COMPUTATION 651

[Figure 17.18 IBM 3090 with Vector Facility: main memory feeds a cache; the instruction decoder routes scalar values to the scalar processor (the 3090 CPU) and vector instructions, along with vector elements from memory, to the optional vector processor.]

• Arithmetic exceptions are the same as, or extensions of, exceptions for the scalar arithmetic instructions of the System/370, and similar fix-up routines can be used. To accommodate this, a vector interruption index is employed that indicates the location in a vector register that is affected by an exception (e.g., overflow). Thus, when execution of the vector instruction resumes, the proper place in a vector register is accessed. • Vector data reside in virtual storage, with page faults being handled in a standard manner. This level of integration provides a number of benefits. Existing operating systems can support the vector facility with minor extensions. Existing application programs, language compilers, and other software can be run unchanged. Software that could take advantage of the vector facility can be modified as desired. REGISTERS A key issue in the design of a vector facility is whether operands are located in registers or memory. The IBM organization is referred to as register to register, because the vector operands, both input and output, can be staged in vector registers. This approach is also used on the Cray supercomputer. An alternative approach, used on Control Data machines, is to obtain operands directly from memory. The main disadvantage of the use of vector registers is that the programmer or compiler must take them into account for good performance. For example, suppose that the length of the vector registers is K and the length of the vectors to be processed is N > K. In this case, a vector loop must be performed, in which the operation is performed on K elements at a time and the loop is repeated N/K times. The main advantage of the vector register approach is that the operation is decoupled from slower main memory and instead takes place primarily with registers.
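The vector loop described above, often called strip-mining, processes a long vector K register-length elements at a time. A minimal sketch in C follows (K and the element operation are illustrative; on the 3090 the sectioning is handled by the vector instructions themselves):

    #include <stddef.h>

    #define K 128   /* illustrative vector-register length, in elements */

    /* Add two vectors of arbitrary length n using registers of length K:
       each outer iteration handles one K-element section, and the final
       iteration handles the n mod K leftover elements. */
    void vector_add(size_t n, const double *a, const double *b, double *c)
    {
        for (size_t start = 0; start < n; start += K) {
            size_t len = (n - start < K) ? (n - start) : K;
            for (size_t i = 0; i < len; i++)      /* one vector section */
                c[start + i] = a[start + i] + b[start + i];
        }
    }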

FORTRAN ROUTINE:

          DO 100 J = 1, 50
              CR(J) = AR(J) * BR(J) - AI(J) * BI(J)
    100       CI(J) = AR(J) * BI(J) + AI(J) * BR(J)

Figure 17.19 Alternative Programs for Vector Calculation (the routine is compiled four ways, with per-iteration costs: (a) storage to storage, 18 cycles; (b) register to register, 12 cycles; (c) storage to register, 10 cycles; (d) compound instruction, 8 cycles. Vi = vector registers; AR, BR, AI, BI = operands in memory; Ti = temporary locations in memory)

The speedup that can be achieved using registers is demonstrated in Figure 17.19. The FORTRAN routine multiplies vector A by vector B to produce vector C, where each vector has a real part (AR, BR, CR) and an imaginary part (AI, BI, CI). The 3090 can perform one main-storage access per processor, or clock, cycle (either read or write); has registers that can sustain two accesses for reading and one for writing per cycle; and produces one result per cycle in its arithmetic unit. Let us assume the use of instructions that can specify two source operands and a result.4 Part (a) of the figure shows that, with memory-to-memory instructions, each itera- tion of the computation requires a total of 18 cycles. With a pure register-to-register

4For the 370/390 architecture, the only three-operand instructions (register and storage instructions, RS) specify two operands in registers and one in memory. In part (a) of the example, we assume the existence of three-operand instructions in which all operands are in main memory. This is done for purposes of comparison and, in fact, such an instruction format could have been chosen for the vector architecture.

architecture (part (b)), this time is reduced to 12 cycles. Of course, with register-to-register operation, the vector quantities must be loaded into the vector registers prior to computation and stored in memory afterward. For large vectors, this fixed penalty is relatively small. Figure 17.19c shows that the ability to specify both storage and register operands in one instruction further reduces the time to 10 cycles per iteration. This latter type of instruction is included in the vector architecture.5

Figure 17.20 Registers for the IBM 3090 Vector Facility (diagram: sixteen vector registers, numbered 0 through 15, each of 128 32-bit elements; adjacent register pairs can be coupled into 64-bit registers; a 128-bit vector-mask register; a vector-status register; and a vector-activity count)

Figure 17.20 illustrates the registers that are part of the IBM 3090 vector facility. There are sixteen 32-bit vector registers. The vector registers can also be coupled to form eight 64-bit vector registers. Any register element can hold an integer or floating-point value. Thus, the vector registers may be used for 32-bit and 64-bit integer values, and 32-bit and 64-bit floating-point values.

5Compound instructions, discussed subsequently, afford a further reduction.

The architecture specifies that each register contains from 8 to 512 scalar elements. The choice of actual length involves a design trade-off. The time to do a vector operation consists essentially of the overhead for pipeline startup and reg- ister filling plus one cycle per vector element. Thus, the use of a large number of register elements reduces the relative startup time for a computation. However, this efficiency must be balanced against the added time required for saving and restoring vector registers on a process switch and the practical cost and space limits. These considerations led to the use of 128 elements per register in later 3090 implementations. Three additional registers are needed by the vector facility. The vector-mask register contains mask bits that may be used to select which elements in the vector registers are to be processed for a particular operation. The vector-status register contains control fields, such as the vector count, that determine how many elements in the vector registers are to be processed. The vector-activity count keeps track of the time spent executing vector instructions.
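As a rough illustration of how the mask and count fields govern an operation, here is a scalar C model of a masked vector add. The parameter names and the byte-per-element mask representation are assumptions made for readability, not the 3090's actual register formats.

    #define VLEN 128            /* elements per vector register */

    /* Scalar model of a masked vector add: the vector count says how many
       elements to process, and the mask bit for each element says whether
       the result is actually stored.                                      */
    void masked_vadd(const double *va, const double *vb, double *vc,
                     int vcount, const unsigned char *vmask)
    {
        for (int i = 0; i < vcount && i < VLEN; i++)
            if (vmask[i])
                vc[i] = va[i] + vb[i];
    }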

COMPOUND INSTRUCTIONS As was discussed previously, instruction execution can be overlapped using chaining to improve performance. The designers of the IBM vector facility chose not to include this capability for several reasons. The System/370 architecture would have to be extended to handle complex interruptions (including their effect on virtual memory management), and corresponding changes would be needed in the software. A more basic issue was the cost of including the additional controls and register access paths in the vector facility for generalized chaining. Instead, three operations are provided that combine into one instruction (one opcode) the most common sequences in vector computation, namely multiplication followed by addition, subtraction, or summation. The storage-to-register MULTIPLY-AND-ADD instruction, for example, fetches a vector from storage, multiplies it by a vector from a register, and adds the product to a third vector in a register. By use of the compound instructions MULTIPLY-AND-ADD and MULTIPLY-AND-SUBTRACT in the example of Figure 17.19, the total time for the iteration is reduced from 10 to 8 cycles. Unlike chaining, compound instructions do not require the use of additional registers for temporary storage of intermediate results, and they require one less register access. For example, consider the following chain:

    A → VR1
    VR1 + VR2 → VR1

In this case, two stores to the vector register VR1 are required. In the IBM architecture there is a storage-to-register ADD instruction. With this instruction, only the sum is placed in VR1. The compound instruction also avoids the need to reflect in the machine-state description the concurrent execution of a number of instructions, which simplifies status saving and restoring by the operating system and the handling of interrupts.
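The effect of the compound operations on the complex-multiply example of Figure 17.19 can be sketched in C as follows. The fma() calls (C99, <math.h>) merely mark where a product is combined with a third operand in a single step, with no named temporary for the intermediate product; they are not meant to represent the 3090 instruction encoding.

    #include <math.h>

    /* Complex multiply C = A * B, written so that each statement corresponds
       to one MULTIPLY-AND-SUBTRACT or MULTIPLY-AND-ADD style compound step. */
    void cmul(const double *ar, const double *ai,
              const double *br, const double *bi,
              double *cr, double *ci, int n)
    {
        for (int j = 0; j < n; j++) {
            cr[j] = fma(ar[j], br[j], -(ai[j] * bi[j]));  /* multiply-and-subtract */
            ci[j] = fma(ar[j], bi[j],  ai[j] * br[j]);    /* multiply-and-add      */
        }
    }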

Table 17.3 IBM 3090 Vector Facility: Arithmetic and Logical Instructions (summary; the full table lists, for each operation, the supported data types and operand-location variants)

Operations: Add, Subtract, Multiply, Divide, Compare, Multiply and Add, Multiply and Subtract, Multiply and Accumulate, Complement, Positive Absolute, Negative Absolute, Maximum, Maximum Absolute, Minimum, Shift Left Logical, Shift Right Logical, And, Or, Exclusive-Or
Data types: FL = long floating point; FS = short floating point; BI = binary integer; LO = logical
Operand locations: V = vector register; S = storage; Q = scalar (general or floating-point register); P = partial sums in vector register

THE INSTRUCTION SET Table 17.3 summarizes the arithmetic and logical operations that are defined for the vector architecture.

In addition, there are memory-to-register load and register-to-memory store instructions. Note that many of the instructions use a three-operand format. Also, many instructions have a number of variants, depending on the location of the operands. A source operand may be a vector register (V), storage (S), or a scalar register (Q). The target is always a vector register, except for comparison, the result of which goes into the vector-mask register. With all these variants, the total number of opcodes (distinct instructions) is 171. This rather large number, however, is not as expensive to implement as might be imagined. Once the machine provides the arithmetic units and the data paths to feed operands from storage, scalar registers, and vector registers to the vector pipelines, the major hardware cost has been incurred. The architecture can, with little difference in cost, provide a rich set of variants on the use of those registers and pipelines.
Most of the instructions in Table 17.3 are self-explanatory. The two summation instructions warrant further explanation. The accumulate operation adds together the elements of a single vector (ACCUMULATE) or the elements of the product of two vectors (MULTIPLY-AND-ACCUMULATE). These instructions present an interesting design problem. We would like to perform this operation as rapidly as possible, taking full advantage of the ALU pipeline. The difficulty is that the sum of two numbers put into the pipeline is not available until several cycles later. Thus, the third element in the vector cannot be added to the sum of the first two elements until those two elements have gone through the entire pipeline. To overcome this problem, the elements of the vector are added in such a way as to produce four partial sums. In particular, elements 0, 4, 8, 12, . . . , 124 are added in that order to produce partial sum 0; elements 1, 5, 9, 13, . . . , 125 to partial sum 1; elements 2, 6, 10, 14, . . . , 126 to partial sum 2; and elements 3, 7, 11, 15, . . . , 127 to partial sum 3. Each of these partial sums can proceed through the pipeline at top speed, because the delay in the pipeline is roughly four cycles. A separate vector register is used to hold the partial sums. When all elements of the original vector have been processed, the four partial sums are added together to produce the final result. The performance of this second phase is not critical, because only four vector elements are involved.
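A scalar C model of the four-partial-sum technique is shown below. The interleaving factor of 4 and the vector length of 128 come from the text; the function itself is only an illustration of the idea, not the hardware algorithm.

    /* ACCUMULATE modeled in scalar code: elements 0,4,8,... feed partial sum 0,
       elements 1,5,9,... feed partial sum 1, and so on, so that consecutive
       additions are independent and can overlap in the ALU pipeline.          */
    double accumulate(const double *v, int n)   /* n is 128 in the 3090 case */
    {
        double partial[4] = {0.0, 0.0, 0.0, 0.0};
        for (int i = 0; i < n; i++)
            partial[i % 4] += v[i];
        /* Second, non-critical phase: combine the four partial sums. */
        return (partial[0] + partial[1]) + (partial[2] + partial[3]);
    }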

17.8 RECOMMENDED READING

[MILE00] is an overview of cache coherence algorithms and techniques for multiprocessors, with an emphasis on performance issues. Another survey of the issues relating to cache coherence in multiprocessors is [LILJ93]. [TOMA93] contains reprints of many of the key papers on the subject. [UNGE02] is an excellent survey of the concepts of multithreaded processors and chip multiprocessors. [UNGE03] is a lengthy survey of both proposed and current multithreaded processors that use explicit multithreading. A thorough treatment of clusters can be found in [BUYY99a] and [BUYY99b]. [WEYG01] is a less technical survey of clusters, with good commentary on various commercial products. [DESA05] describes IBM's blade server architecture. Good discussions of vector computation can be found in [STON93] and [HWAN93].

BUYY99a Buyya, R. High-Performance Cluster Computing: Architectures and Systems. Upper Saddle River, NJ: Prentice Hall, 1999.
BUYY99b Buyya, R. High-Performance Cluster Computing: Programming and Applications. Upper Saddle River, NJ: Prentice Hall, 1999.
DESA05 Desai, D., et al. "BladeCenter System Overview." IBM Journal of Research and Development, November 2005.
LILJ93 Lilja, D. "Cache Coherence in Large-Scale Shared-Memory Multiprocessors: Issues and Comparisons." ACM Computing Surveys, September 1993.
MILE00 Milenkovic, A. "Achieving High Performance in Bus-Based Shared-Memory Multiprocessors." IEEE Concurrency, July-September 2000.
TOMA93 Tomasevic, M., and Milutinovic, V. The Cache Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions. Los Alamitos, CA: IEEE Computer Society Press, 1993.
UNGE02 Ungerer, T.; Rubic, B.; and Silc, J. "Multithreaded Processors." The Computer Journal, No. 3, 2002.
UNGE03 Ungerer, T.; Rubic, B.; and Silc, J. "A Survey of Processors with Explicit Multithreading." ACM Computing Surveys, March 2003.
WEYG01 Weygant, P. Clusters for High Availability. Upper Saddle River, NJ: Prentice Hall, 2001.

17.9 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

active standby, cache coherence, cluster, directory protocol, failback, failover, MESI protocol, multiprocessor, nonuniform memory access (NUMA), passive standby, snoopy protocol, symmetric multiprocessor (SMP), uniform memory access (UMA), uniprocessor, vector facility

Review Questions
17.1 List and briefly define three types of computer system organization.
17.2 What are the chief characteristics of an SMP?
17.3 What are some of the potential advantages of an SMP compared with a uniprocessor?
17.4 What are some of the key OS design issues for an SMP?
17.5 What is the difference between software and hardware cache coherent schemes?
17.6 What is the meaning of each of the four states in the MESI protocol?
17.7 What are some of the key benefits of clustering?
17.8 What is the difference between failover and failback?
17.9 What are the differences among UMA, NUMA, and CC-NUMA?

Problems 17.1 Let a be the percentage of program code that can be executed simultaneously by n processors in a computer system. Assume that the remaining code must be executed sequentially by a single processor. Each processor has an execution rate of x MIPS. a. Derive an expression for the effective MIPS rate when using the system for exclu- sive execution of this program, in terms of n, a, and x. b. If n = 16 and x = 4 MIPS, determine the value of a that will yield a system per- formance of 40 MIPS. 17.2 A multiprocessor with eight processors has 20 attached tape drives. There are a large number of jobs submitted to the system that each require a maximum of four tape drives to complete execution. Assume that each job starts running with only three tape drives for a long period before requiring the fourth tape drive for a short period toward the end of its operation. Also assume an endless supply of such jobs. a. Assume the scheduler in the OS will not start a job unless there are four tape drives available. When a job is started, four drives are assigned immediately and are not released until the job finishes. What is the maximum number of jobs that can be in progress at once? What are the maximum and minimum number of tape drives that may be left idle as a result of this policy? b. Suggest an alternative policy to improve tape drive utilization and at the same time avoid system deadlock. What is the maximum number of jobs that can be in progress at once? What are the bounds on the number of idling tape drives? 17.3 Can you foresee any problem with the write-once cache approach on bus-based mul- tiprocessors? If so, suggest a solution. 17.4 Consider a situation in which two processors in an SMP configuration, over time, re- quire access to the same line of data from main memory. Both processors have a cache and use the MESI protocol. Initially, both caches have an invalid copy of the line. Figure 17.21 depicts the consequence of a read of line x by Processor P1. If this is the start of a sequence of accesses, draw the subsequent figures for the following sequence: 1. P2 reads x. 2. P1 writes to x (for clarity, label the line in P1’s cache x=).

Figure 17.21 MESI Example: Processor 1 Reads Line x (diagram: Processor 1's memory access brings line x from main memory into its cache in the exclusive (E) state; Processor 2's cache, which snoops the access, still holds the line in the invalid (I) state)

Figure 17.22 Two Cache Coherence Protocols (two state diagrams for a given line in cache i: protocol (a) uses the states Invalid and Valid; protocol (b) uses the states Invalid, Shared, and Exclusive. Transition labels: R(i) = read of the line by processor i; W(i) = write by processor i; Z(i) = displacement of the line by cache i; R(j), W(j), Z(j) = the corresponding actions by another processor j ≠ i)

3. P1 writes to x (label the line in P1’s cache x). 4. P2 reads x. 17.5 Figure 17.22 shows the state diagrams of two possible cache coherence protocols. Deduce and explain each protocol, and compare each to MESI. 17.6 Consider an SMP with both L1 and L2 caches using the MESI protocol. As explained in Section 17.3, one of four states is associated with each line in the L2 cache. Are all four states also needed for each line in the L1 cache? If so, why? If not, explain which state or states can be eliminated. 17.7 An earlier version of the IBM mainframe, the S/390 G4, used three levels of cache. As with the z990, only the first level was on the processor chip [called the processor unit (PU)]. The L2 cache was also similar to the z990. An L3 cache was on a separate chip that acted as a memory controller, and was interposed between the L2 caches and the memory cards. Table 17.4 shows the performance of a three-level cache ar- rangement for the IBM S/390. The purpose of this problem is to determine whether the inclusion of the third level of cache seems worthwhile. Determine the access penalty (average number of PU cycles) for a system with only an L1 cache, and nor- malize that value to 1.0. Then determine the normalized access penalty when both an L1 and L2 cache are used, and the access penalty when all three caches are used. Note the amount of improvement in each case and state your opinion on the value of the L3 cache. 17.8 a. Consider a uniprocessor with separate data and instruction caches, with hit ratios of Hd and Hi, respectively. Access time from processor to cache is c clock cycles, and transfer time for a block between memory and cache is b clock cycles. Let fi

Table 17.4 Typical Cache Hit Rate on S/390 SMP Configuration [MAK97]

Memory Subsystem    Access Penalty (PU cycles)    Cache Size    Hit Rate (%)
L1 cache            1                             32 KB         89
L2 cache            5                             256 KB        5
L3 cache            14                            2 MB          3
Memory              32                            8 GB          3

Figure 17.23 Diagram for Problem 17.9 (four organization sketches, labeled (a) through (d))

be the fraction of memory accesses that are for instructions, and fd is the fraction of dirty lines in the data cache among lines replaced. Assume a write-back policy and determine the effective memory access time in terms of the parameters just defined. b. Now assume a bus-based SMP in which each processor has the characteristics of part (a). Every processor must handle in addition to memory reads and writes. This affects effective memory access time. Let finv be the fraction of data references that cause invalidation signals to be sent to other data caches. The processor sending the signal requires t clock cycles to complete the invalida- tion operation. Other processors are not involved in the invalidation operation. Determine the effective memory access time. 17.9 What organizational alternative is suggested by each of the illustrations in Figure 17.23? 17.10 In Figure 17.7, some of the diagrams show horizontal rows that are partially filled. In other cases, there are rows that are completely blank. These represent two different types of loss of efficiency. Explain. 17.11 Consider the pipeline depiction in Figure 14.13b, which is redrawn in Figure 17.24a, with the fetch and decode stages ignored, to represent the execution of thread A. Figure 17.24b illustrates the execution of a separate thread B. In both cases, a simple pipelined processor is used. a. Show an instruction issue diagram, similar to Figure 17.7a, for each of the two threads. b. Assume that the two threads are to be executed in parallel on a chip multiproces- sor, with each of the two processors on the chip using a simple pipeline. Show an instruction issue diagram similar to Figure 17.7k. Also show a pipeline execution diagram in the style of Figure 17.24. c. Assume a two-issue superscalar architecture. Repeat part (b) for an interleaved multithreading superscalar implementation, assuming no data dependencies. Note: There is no unique answer; you need to make assumptions about latency and priority. d. Repeat part (c) for a blocked multithreading superscalar implementation. e. Repeat for a four-issue SMT architecture. 17.12 The following code segment needs to be executed 64 times for the evaluation of the vector arithmetic expression: D(I) = A(I) + B(I) * C(I) for 0 … I … 63. Load R1, B(I) >R1 d Memory (a + I)> Load R2, C(I) >R2 d Memory (b + I)> 17.9 / KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS 661

Figure 17.24 Two Threads of Execution (cycle-by-cycle pipeline diagrams through the CO, FO, EI, and WO stages: (a) instructions A1-A16 of thread A, with stall cycles; (b) instructions B1-B7 of thread B)

Multiply R1, R2 >R1 d (R1) * (R2)> Load R3, A(I) >R3 d Memory (g + I)> Add R3, R1 >R3 d (R3) + (R1)> Store D1, R3 >Memory (u + I) d (R3)> where R1, R2, and R3 are processor registers, and a, b, g, u are the starting main mem- ory addresses of arrays B(I), C(I), A(I), and D(I), respectively. Assume four clock cycles for each Load or Store, two cycles for the Add, and eight cycles for the Multi- plier on either a uniprocessor or a single processor in an SIMD machine. a. Calculate the total number of processor cycles needed to execute this code seg- ment repeatedly 64 times on a SISD uniprocessor computer sequentially, ignoring all other time delays. b. Consider the use of an SIMD computer with 64 processing elements to execute the vector operations in six synchronized vector instructions over 64-compo- nent vector data and both driven by the same-speed clock. Calculate the total execution time on the SIMD machine, ignoring instruction broadcast and other delays. c. What is the speedup gain of the SIMD computer over the SISD computer? 17.13 Produce a vectorized version of the following program: DO 20 I = 1, N B(I, 1) = 0 DO 10 J = 1, M A(I) = A(I) + B(I, J) * C(I, J) 10 CONTINUE D(I) = E(I) + A(I) 20 CONTINUE 17.14 An application program is executed on a nine-computer cluster. A benchmark pro- gram took time T on this cluster. Further, it was found that 25% of T was time in 662 CHAPTER 17 / PARALLEL PROCESSING

which the application was running simultaneously on all nine computers. The remain- ing time, the application had to run on a single computer. a. Calculate the effective speedup under the aforementioned condition as compared to executing the program on a single computer. Also calculate a, the percentage of code that has been parallelized (programmed or compiled so as to use the cluster mode) in the preceding program. b. Suppose that we are able to effectively use 17 computers rather than 9 comput- ers on the parallelized portion of the code. Calculate the effective speedup that is achieved. 17.15 The following FORTRAN program is to be executed on a computer, and a parallel version is to be executed on a 32-computer cluster. L1: DO 10 I = 1, 1024 L2: SUM(I) = 0 L3: DO 20 J = 1, I L4: 20 SUM(I) = SUM(I) + I L5: 10 CONTINUE Suppose lines 2 and 4 each take two machine cycle times, including all proces- sor and memory-access activities. Ignore the overhead caused by the software loop control statements (lines 1, 3, 5) and all other system overhead and resource conflicts. a. What is the total execution time (in machine cycle times) of the program on a single computer? b. Divide the I-loop iterations among the 32 computers as follows: Computer 1 ex- ecutes the first 32 iterations (I = 1 to 32), processor 2 executes the next 32 itera- tions, and so on. What are the execution time and speedup factor compared with part (a)? (Note that the computational workload, dictated by the J-loop, is unbal- anced among the computers.) c. Explain how to modify the parallelizing to facilitate a balanced parallel execution of all the computational workload over 32 computers. By a balanced load is meant an equal number of additions assigned to each computer with respect to both loops. d. What is the minimum execution time resulting from the parallel execution on 32 computers? What is the resulting speedup over a single computer? 17.16 Consider the following two versions of a program to add two vectors:

L1:    DO 10 I = 1, N                       DOALL K = 1, M
L2:      A(I) = B(I) + C(I)                   DO 10 I = L(K - 1) + 1, KL
L3: 10 CONTINUE                                 A(I) = B(I) + C(I)
L4:    SUM = 0                           10   CONTINUE
L5:    DO 20 J = 1, N                         SUM(K) = 0
L6:      SUM = SUM + A(J)                     DO 20 J = 1, L
L7: 20 CONTINUE                                 SUM(K) = SUM(K) + A(L(K - 1) + J)
                                         20   CONTINUE
                                            ENDALL

a. The program on the left executes on a uniprocessor. Suppose each line of code L2, L4, and L6 takes one processor clock cycle to execute. For simplicity, ignore the time required for the other lines of code. Initially all arrays are already loaded in main memory and the short program fragment is in the instruction cache. How many clock cycles are required to execute this program?
b. The program on the right is written to execute on a multiprocessor with M processors. We partition the looping operations into M sections with L = N/M elements per section. DOALL declares that all M sections are executed in parallel. The result of this program is to produce M partial sums. Assume that k clock cycles are needed for each interprocessor communication operation via the shared memory and that therefore the addition of each partial sum requires k cycles. An l-level binary adder tree can merge all the partial sums, where l = log2 M. How many cycles are needed to produce the final sum?
c. Suppose N = 2^20 elements in the array and M = 256. What is the speedup achieved by using the multiprocessor? Assume k = 200. What percentage is this of the theoretical speedup of a factor of 256?

CHAPTER 18 MULTICORE COMPUTERS

18.1 Hardware Performance Issues
     Increase in Parallelism and Complexity
     Power Consumption
18.2 Software Performance Issues
     Software on Multicore
     Application Example: Valve Game Software
18.3 Multicore Organization
18.4 Intel x86 Multicore Organization
     Intel Core Duo
     Intel Core i7-990X
18.5 ARM11 MPCore
     Interrupt Handling
     Cache Coherency
18.6 IBM zEnterprise 196 Mainframe
18.7 Recommended Reading
18.8 Key Terms, Review Questions, and Problems


LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Understand the hardware performance issues that have driven the move to multicore computers.
• Understand the software performance issues posed by the use of multithreaded multicore computers.
• Have an appreciation of the use of multicore organization on embedded systems, PCs and servers, and mainframes.

A multicore computer, also known as a chip multiprocessor, combines two or more processors (called cores) on a single piece of silicon (called a die). Typically, each core consists of all of the components of an independent processor, such as registers, ALU, pipeline hardware, and control unit, plus L1 instruction and data caches. In addition to the multiple cores, contemporary multicore chips also include L2 cache and, increasingly, L3 cache. This chapter provides an overview of multicore systems. We begin with a look at the hardware performance factors that led to the development of multicore com- puters and the software challenges of exploiting the power of a multicore system. Next, we look at multicore organization. Finally, we examine three examples of multicore products, covering personal computer and workstation systems (Intel), embedded systems (ARM), and mainframes (IBM).

18.1 HARDWARE PERFORMANCE ISSUES

As we discuss in Chapter 2, microprocessor systems have experienced a steady, exponential increase in execution performance for decades. Figure 2.12 shows that this increase is due partly to refinements in the organization of the processor on the chip, and partly to the increase in the clock frequency. Increase in Parallelism and Complexity The organizational changes in processor design have primarily been focused on increasing instruction-level parallelism, so that more work could be done in each clock cycle. These changes include, in chronological order (Figure 18.1): • Pipelining: Individual instructions are executed through a pipeline of stages so that while one instruction is executing in one stage of the pipeline, another instruction is executing in another stage of the pipeline. • Superscalar: Multiple pipelines are constructed by replicating execution resources. This enables parallel execution of instructions in parallel pipelines, so long as hazards are avoided. • Simultaneous multithreading (SMT): Register banks are replicated so that multiple threads can share the use of pipeline resources. 666 CHAPTER 18 / MULTICORE COMPUTERS

Figure 18.1 Alternative Chip Organizations ((a) superscalar: a single program counter and single-thread register file feed the instruction fetch unit, issue logic, execution units and queues, L1 instruction and data caches, and an L2 cache; (b) simultaneous multithreading: n program counters and n register sets share the same pipeline and caches; (c) multicore: n processors, each superscalar or SMT with its own L1-I and L1-D caches, share an L2 cache)

For each of these innovations, designers have over the years attempted to increase the performance of the system by adding complexity. In the case of pipelin- ing, simple three-stage pipelines were replaced by pipelines with five stages, and then many more stages, with some implementations having over a dozen stages. There is a practical limit to how far this trend can be taken, because with more stages, there is the need for more logic, more interconnections, and more control signals. With superscalar organization, increased performance can be achieved by increasing the number of parallel pipelines. Again, there are diminishing returns as the number of pipelines increases. More logic is required to manage hazards and to stage instruction resources. Eventually, a single thread of execution reaches the point where hazards and resource dependencies prevent the full use of the multiple 18.1 / HARDWARE PERFORMANCE ISSUES 667

pipelines available. This same point of diminishing returns is reached with SMT, as the complexity of managing multiple threads over a set of pipelines limits the number of threads and number of pipelines that can be effectively utilized.

Figure 18.2 Some Intel Hardware Trends (upper graph: relative performance, 1985-2003, on a logarithmic scale from 0.1 to 10,000; lower graph: relative performance per cycle, 0 to 0.45, over the same period)

Figure 18.2, from [OLUK05], is instructive in this context. The upper graph shows the exponential increase in Intel processor performance over the years.1 The lower graph is calculated by combining Intel's published SPEC CPU figures and processor clock frequencies to give a measure of the extent to which performance improvement is due to increased exploitation of instruction-level parallelism. There is a flat region in the late 1980s before parallelism was exploited extensively. This is followed by a steep rise as designers were able to increasingly exploit pipelining, superscalar techniques, and SMT. But, beginning about 2000, a new flat region of the curve appears, as the limits of effective exploitation of instruction-level parallelism are reached.

1The data are based on published SPEC CPU figures from Intel, normalized across varying suites.

There is a related set of problems dealing with the design and fabrication of the computer chip. The increase in complexity to deal with all of the logical issues related to very long pipelines, multiple superscalar pipelines, and multiple SMT register banks means that increasing amounts of the chip area are occupied with coordinating and signal transfer logic. This increases the difficulty of designing, fabricating, and debug- ging the chips. The increasingly difficult engineering challenge related to processor logic is one of the reasons that an increasing fraction of the processor chip is devoted to the simpler memory logic. Power issues, discussed next, provide another reason. Power Consumption To maintain the trend of higher performance as the number of transistors per chip rise, designers have resorted to more elaborate processor designs (pipelining, super- scalar, SMT) and to high clock frequencies. Unfortunately, power requirements have grown exponentially as chip density and clock frequency have risen. This is shown in the Figure 18.3, which repeats Figure 2.11. One way to control power density is to use more of the chip area for cache memory. Memory transistors are smaller and have a power density an order of mag- nitude lower than that of logic (see Figure 18.4). Further, as chip transistor density has increased, the percentage of chip area devoted to memory has grown, and is now well over half the chip area. By 2015, we can expect to see microprocessor chips with about 100 billion transistors on a 300 mm2 die. Assuming about 50–60% of the chip area is devoted to memory, the chip will support cache memory of about 100 MB and leave over 1 billion transistors available for logic. How to use all those logic transistors is a key design issue. As discussed earlier in this section, there are limits to the effective use of such techniques as superscalar

Figure 18.3 Processor Trends (logarithmic plot, 1970-2010: transistors (thousands), clock frequency (MHz), power (W), and number of cores)

Figure 18.4 Power and Memory Considerations (power density, in watts/cm², versus feature size from 0.25 μm down to 0.10 μm, for logic and for memory)

and SMT. In general terms, the experience of recent decades has been encapsulated in a rule of thumb known as Pollack’s rule [POLL99], which states that performance increase is roughly proportional to square root of increase in complexity. In other words, if you double the logic in a processor core, then it delivers only 40% more performance. In principle, the use of multiple cores has the potential to provide near-linear performance improvement with the increase in the number of cores. Power considerations provide another motive for moving toward a multicore organization. Because the chip has such a huge amount of cache memory, it becomes unlikely that any one thread of execution can effectively use all that memory. Even with SMT, you are multithreading in a relatively limited fashion and cannot therefore fully exploit a gigantic cache, whereas a number of relatively independent threads or processes has a greater opportunity to take full advantage of the cache memory.

18.2 SOFTWARE PERFORMANCE ISSUES

A detailed examination of the software performance issues related to multicore organization is beyond our scope. In this section, we first provide an overview of these issues, and then look at an example of an application designed to exploit multicore capabilities.

Software on Multicore

The potential performance benefits of a multicore organization depend on the ability to effectively exploit the parallel resources available to the application. Let us focus first on a single application running on a multicore system. Recall from Chapter 2 that Amdahl's law states that:

    Speedup = (time to execute program on a single processor) /
              (time to execute program on N parallel processors)
            = 1 / [(1 - f) + f/N]                                        (18.1)

Figure 18.5 Performance Effect of Multiple Cores ((a) relative speedup versus number of processors, 1 to 8, for sequential portions of 0%, 2%, 5%, and 10%; (b) relative speedup with overheads, for sequential portions of 5%, 10%, 15%, and 20%, showing speedup peaking and then declining)

The law assumes a program in which a fraction (1 - f) of the execution time involves code that is inherently serial and a fraction f that involves code that is infinitely parallelizable with no scheduling overhead. This law appears to make the prospect of a multicore organization attractive. But as Figure 18.5a shows, even a small amount of serial code has a noticeable impact. If only 10% of the code is inherently serial (f = 0.9), running the program on a multicore system with 8 processors yields a performance gain of only a factor of 4.7. In addition, software typically incurs overhead as a result of communication and distribution of work among multiple processors and as a result of cache coherence overhead. This results in a curve where performance peaks and then begins to degrade because of the increased burden of the overhead of using multiple processors (e.g., coordination and OS management). Figure 18.5b, from [MCDO05], is a representative example.
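As a quick check of Equation (18.1), the short C program below evaluates the speedup for a few values of f; with f = 0.9 and N = 8 it reproduces the factor of about 4.7 cited above. The second value of f is included only for comparison.

    #include <stdio.h>

    /* Speedup from Equation (18.1): 1 / ((1 - f) + f/N). */
    static double amdahl(double f, int n)
    {
        return 1.0 / ((1.0 - f) + f / n);
    }

    int main(void)
    {
        printf("f = 0.90, N = 8:  speedup = %.2f\n", amdahl(0.90, 8));  /* about 4.7 */
        printf("f = 0.98, N = 8:  speedup = %.2f\n", amdahl(0.98, 8));
        return 0;
    }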

However, software engineers have been addressing this problem and there are numerous applications in which it is possible to effectively exploit a multicore system. [MCDO05] analyzes the effectiveness of multicore systems on a set of database applications, in which great attention was paid to reducing the serial fraction within hardware architectures, operating systems, middleware, and the database application software. Figure 18.6 shows the result. As this example shows, database management systems and database applications are one area in which multicore systems can be used effectively. Many kinds of servers can also effectively use the parallel multicore organization, because servers typically handle numerous relatively independent transactions in parallel.
In addition to general-purpose server software, a number of classes of applications benefit directly from the ability to scale throughput with the number of cores. [MCDO06] lists the following examples:
• Multithreaded native applications: Multithreaded applications are characterized by having a small number of highly threaded processes. Examples of threaded applications include Lotus Domino or Siebel CRM (Customer Relationship Manager).
• Multiprocess applications: Multiprocess applications are characterized by the presence of many single-threaded processes. Examples of multi-process applications include the Oracle database, SAP, and PeopleSoft.
• Java applications: Java applications embrace threading in a fundamental way. Not only does the Java language greatly facilitate multithreaded applications, but the Java Virtual Machine is a multithreaded process that

Figure 18.6 Scaling of Database Workloads on Multiple-Processor Hardware (throughput scaling versus number of CPUs, 0 to 64, compared with perfect scaling, for four workloads: Oracle DSS 4-way join, TMC data, DB2 DSS scan & aggregation, and Oracle ad hoc insurance OLTP)

provides scheduling and memory management for Java applications. Java applications that can benefit directly from multicore resources include application servers such as Sun’s Java Application Server, BEA’s Weblogic, IBM’s Websphere, and the open-source Tomcat application server. All applications that use a Java 2 Platform, Enterprise Edition (J2EE platform) application server can immediately benefit from multicore technology. • Multi-instance applications: Even if an individual application does not scale to take advantage of a large number of threads, it is still possible to gain from multicore architecture by running multiple instances of the application in par- allel. If multiple application instances require some degree of isolation, virtu- alization technology (for the hardware of the operating system) can be used to provide each of them with its own separate and secure environment. Application Example: Valve Game Software Valve is an entertainment and technology company that has developed a number of popular games, as well as the Source engine, one of the most widely played game engines available. Source is an animation engine used by Valve for its games and licensed for other game developers. In recent years, Valve has reprogrammed the Source engine software to use multithreading to exploit the power of multicore processor chips from Intel and AMD [REIM06]. The revised Source engine code provides more powerful support for Valve games such as Half Life 2. From Valve’s perspective, threading granularity options are defined as follows [HARR06]: • Coarse threading: Individual modules, called systems, are assigned to individ- ual processors. In the Source engine case, this would mean putting rendering on one processor, AI () on another, physics on another, and so on. This is straightforward. In essence, each major module is single threaded and the principal coordination involves synchronizing all the threads with a timeline thread. • Fine-grained threading: Many similar or identical tasks are spread across mul- tiple processors. For example, a loop that iterates over an array of data can be split up into a number of smaller parallel loops in individual threads that can be scheduled in parallel. • Hybrid threading: This involves the selective use of fine-grain threading for some systems and single threading for other systems. Valve found that through coarse threading, it could achieve up to twice the performance across two processors compared to executing on a single processor. But this performance gain could only be achieved with contrived cases. For real- world gameplay, the improvement was on the order of a factor of 1.2. Valve also found that effective use of fine-grain threading was difficult. The time per work unit can be variable, and managing the timeline of outcomes and consequences involved complex programming. Valve found that a hybrid threading approach was the most promising and would scale the best as multicore systems with eight or sixteen processors became 18.2 / SOFTWARE PERFORMANCE ISSUES 673

Figure 18.7 Hybrid Threading for Rendering Module (thread hierarchy: a top-level render thread spawns threads for the skybox, main view, monitor views, and other scenes; each walks the scene list and, for each object, spawns particle simulate-and-draw work and character bone-setup and draw work)

available. Valve identified systems that operate very effectively when assigned to a single processor permanently. An example is sound mixing, which has little user interaction, is not constrained by the frame configuration of windows, and works on its own set of data. Other modules, such as scene rendering, can be organized into a number of threads so that the module can execute on a single processor but achieve greater performance as it is spread out over more and more processors. Figure 18.7 illustrates the thread structure for the rendering module. In this hierarchical structure, higher-level threads spawn lower-level threads as needed. The rendering module relies on a critical part of the Source engine, the world list, which is a database representation of the visual elements in the game’s world. The first task is to determine what are the areas of the world that need to be rendered. The next task is to determine what objects are in the scene as viewed from multi- ple angles. Then comes the processor-intensive work. The rendering module has to work out the rendering of each object from multiple points of view, such as the player’s view, the view of TV monitors, and the point of view of reflections in water. Some of the key elements of the threading strategy for the rendering module are listed in [LEON07] and include the following: • Construct scene-rendering lists for multiple scenes in parallel (e.g., the world and its reflection in water). • Overlap graphics simulation. 674 CHAPTER 18 / MULTICORE COMPUTERS

• Compute character bone transformations for all characters in all scenes in parallel. • Allow multiple threads to draw in parallel. The designers found that simply locking key databases, such as the world list, for a thread was too inefficient. Over 95% of the time, a thread is trying to read from a data set, and only 5% of the time at most is spent in writing to a data set. Thus, a concurrency mechanism known as the single-writer-multiple-readers model works effectively.
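The single-writer-multiple-readers pattern maps naturally onto a readers-writer lock. The following C sketch uses the POSIX pthread_rwlock primitives to illustrate the idea; the function and data names are invented for the example and are not taken from Valve's code.

    #include <pthread.h>

    /* Illustrative world list protected by a readers-writer lock: many renderer
       threads may read concurrently; the single writer excludes all readers.   */
    static pthread_rwlock_t world_lock = PTHREAD_RWLOCK_INITIALIZER;

    void render_from_world(void)        /* the common (~95%) read-only path */
    {
        pthread_rwlock_rdlock(&world_lock);
        /* ... traverse the world list and build a scene-rendering list ... */
        pthread_rwlock_unlock(&world_lock);
    }

    void update_world(void)             /* the rare (~5%) write path */
    {
        pthread_rwlock_wrlock(&world_lock);
        /* ... modify the world list ... */
        pthread_rwlock_unlock(&world_lock);
    }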

18.3 MULTICORE ORGANIZATION

At a top level of description, the main variables in a multicore organization are as follows: • The number of core processors on the chip • The number of levels of cache memory • The amount of cache memory that is shared Figure 18.8 shows four general organizations for multicore systems. Figure 18.8a is an organization found in some of the earlier multicore computer chips and is still seen in embedded chips. In this organization, the only on-chip cache is L1 cache, with each core having its own dedicated L1 cache. Almost invariably, the L1 cache is divided into instruction and data caches. An example of this organization is the ARM11 MPCore. The organization of Figure 18.8b is also one in which there is no on-chip cache sharing. In this, there is enough area available on the chip to allow for L2 cache. An example of this organization is the AMD Opteron. Figure 18.8c shows a similar allocation of chip space to memory, but with the use of a shared L2 cache. The Intel Core Duo has this organization. Finally, as the amount of cache memory available on the chip continues to grow, performance considerations dictate splitting off a separate, shared L3 cache, with dedicated L1 and L2 caches for each core processor. The Intel Core i7 is an example of this organization. The use of a shared L2 cache on the chip has several advantages over exclusive reliance on dedicated caches: 1. Constructive interference can reduce overall miss rates. That is, if a thread on one core accesses a main memory location, this brings the frame containing the referenced location into the shared cache. If a thread on another core soon thereafter accesses the same memory block, the memory locations will already be available in the shared on-chip cache. 2. A related advantage is that data shared by multiple cores is not replicated at the shared cache level. 3. With proper frame replacement algorithms, the amount of shared cache allo- cated to each core is dynamic, so that threads that have a less locality can employ more cache. 18.3 / MULTICORE ORGANIZATION 675

Figure 18.8 Multicore Organization Alternatives ((a) dedicated L1 cache: each core has split L1-I/L1-D caches, with the L2 cache off chip; (b) dedicated L2 cache: each core has its own L1 and L2 caches; (c) shared L2 cache: per-core L1 caches with a single on-chip L2 cache; (d) shared L3 cache: per-core L1 and L2 caches with a shared on-chip L3 cache; in each case the chip connects to main memory and I/O)

4. Interprocessor communication is easy to implement, via shared memory locations.
5. The use of a shared L2 cache confines the cache coherency problem to the L1 cache level, which may provide some additional performance advantage.
A potential advantage to having only dedicated L2 caches on the chip is that each core enjoys more rapid access to its private L2 cache. This is advantageous for threads that exhibit strong locality. As both the amount of memory available and the number of cores grow, the use of a shared L3 cache combined with either a shared L2 cache or dedicated per-core L2 caches seems likely to provide better performance than simply a massive shared L2 cache.
Another organizational design decision in a multicore system is whether the individual cores will be superscalar or will implement simultaneous multithreading (SMT). For example, the Intel Core Duo uses superscalar cores, whereas the Intel Core i7 uses SMT cores. SMT has the effect of scaling up the number of hardware-level threads that the multicore system supports. Thus, a multicore system with four cores and SMT that supports four simultaneous threads in each core appears the same to the application level as a multicore system with 16 cores. As software is developed to more fully exploit parallel resources, an SMT approach appears to be more attractive than a superscalar approach.
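One practical consequence is that the operating system simply reports the product of cores and hardware threads per core as its count of logical processors. The C fragment below uses the widely supported sysconf(_SC_NPROCESSORS_ONLN) query to discover that count; on a hypothetical 4-core chip with 4-way SMT it would print 16.

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* The OS reports logical processors: physical cores times hardware
           threads per core. A 4-core chip with 4-way SMT shows up as 16.   */
        long logical = sysconf(_SC_NPROCESSORS_ONLN);
        printf("logical processors visible to software: %ld\n", logical);
        return 0;
    }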

18.4 INTEL x86 MULTICORE ORGANIZATION

Intel has introduced a number of multicore products in recent years. In this section, we look at two examples: the Intel Core Duo and the Intel Core i7-990X.

Intel Core Duo

The Intel Core Duo, introduced in 2006, implements two x86 superscalar processors with a shared L2 cache (Figure 18.8c). The general structure of the Intel Core Duo is shown in Figure 18.9. Let us consider the key elements starting from the top of the figure. As is common in multicore systems, each core has its own dedicated L1 cache. In this case, each core has a 32-kB instruction cache and a 32-kB data cache.
Each core has an independent thermal control unit. With the high transistor density of today's chips, thermal management is a fundamental capability, especially for laptop and mobile systems. The Core Duo thermal control unit is designed to manage chip heat dissipation to maximize processor performance within thermal constraints. Thermal management also improves ergonomics with a cooler system and lower fan acoustic noise. In essence, the thermal management unit monitors digital sensors for high-accuracy die temperature measurements. Each core can be defined as an independent thermal zone.

Figure 18.9 Intel Core Duo Block Diagram (two cores, each with its architectural state, execution resources, 32-kB L1 caches, thermal control, and local APIC; shared power management logic; a 2-MB shared L2 cache; and a bus interface to the front-side bus)

The maximum temperature for each thermal zone is reported separately via dedicated registers that can be polled by software. If the temperature in a core exceeds a threshold, the thermal control unit reduces the clock rate for that core to reduce heat generation.
The next key element of the Core Duo organization is the Advanced Programmable Interrupt Controller (APIC). The APIC performs a number of functions, including the following:
1. The APIC can provide interprocessor interrupts, which allow any process to interrupt any other processor or set of processors. In the case of the Core Duo, a thread in one core can generate an interrupt, which is accepted by the local APIC, routed to the APIC of the other core, and communicated as an interrupt to the other core.
2. The APIC accepts I/O interrupts and routes these to the appropriate core.
3. Each APIC includes a timer, which can be set by the OS to generate an interrupt to the local core.
The power management logic is responsible for reducing power consumption when possible, thus increasing battery life for mobile platforms, such as laptops. In essence, the power management logic monitors thermal conditions and CPU activity and adjusts voltage levels and power consumption appropriately. It includes an advanced power-gating capability that allows for an ultra fine-grained logic control that turns on individual processor logic subsystems only if and when they are needed.
The Core Duo chip includes a shared 2-MB L2 cache. The cache logic allows for a dynamic allocation of cache space based on current core needs, so that one core can be assigned up to 100% of the L2 cache. The L2 cache includes logic to support the MESI cache coherence protocol for the attached L1 caches. The key point to consider is when a cache write is done at the L1 level. A cache line gets the M state when a processor writes to it; if the line is not in E or M state prior to writing it, the cache sends a Read-For-Ownership (RFO) request that ensures that the line exists in the L1 cache and is in the I state in the other L1 cache. The Intel Core Duo extends this protocol to take into account the case when there are multiple Core Duo chips organized as a symmetric multiprocessor (SMP) system. The L2 cache controller allows the system to distinguish between a situation in which data are shared by the two local cores, but not with the rest of the world, and a situation in which the data are shared by one or more caches on the die as well as by an agent on the external bus (which can be another processor). When a core issues an RFO, if the line is shared only by the other cache within the local die, we can resolve the RFO internally very fast, without going to the external bus at all. Only if the line is shared with another agent on the external bus do we need to issue the RFO externally.
The bus interface connects to the external bus, known as the Front Side Bus, which connects to main memory, I/O controllers, and other processor chips.
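The write path just described can be summarized in pseudocode. The following C sketch illustrates only the decision logic (write hit in M or E versus an on-die or external RFO); the names and structure are assumptions for illustration, not Intel's implementation.

    #include <stdio.h>

    typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

    /* Stand-ins for the two RFO paths; in hardware these are bus transactions. */
    static void resolve_rfo_on_die(void)   { puts("RFO resolved inside the die"); }
    static void issue_rfo_externally(void) { puts("RFO issued on the front-side bus"); }

    /* Write path for one L1 line, following the description above. */
    void l1_write(mesi_t *line_state, int shared_only_with_local_core)
    {
        if (*line_state == MODIFIED || *line_state == EXCLUSIVE) {
            *line_state = MODIFIED;               /* write hit: no RFO needed */
            return;
        }
        if (shared_only_with_local_core)
            resolve_rfo_on_die();                 /* fast on-die invalidation */
        else
            issue_rfo_externally();               /* involve the external bus */
        *line_state = MODIFIED;                   /* other L1 copies go to I  */
    }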

Intel Core i7-990X

The Intel Core i7-990X, introduced in 2011, implements six x86 SMT processors, each with a dedicated L2 cache, and with a shared L3 cache (Figure 18.8d).

Figure 18.10 Intel Core i7-990X Block Diagram (six cores, Core 0 through Core 5, each with 32-kB L1-I and 32-kB L1-D caches and a dedicated 256-kB L2 cache; a shared 12-MB L3 cache; DDR3 memory controllers with three 8-byte channels at 1.33 GT/s; and QuickPath Interconnect with four 20-byte links at 6.4 GT/s)

The general structure of the Intel Core i7-990X is shown in Figure 18.10. Each core has its own dedicated L2 cache and the six cores share a 12-MB L3 cache. One mechanism Intel uses to make its caches more effective is prefetching, in which the hardware examines memory access patterns and attempts to fill the caches speculatively with data that's likely to be requested soon. It is interesting to compare the performance of this three-level on-chip cache organization with a comparable two-level organization from Intel. Table 18.1 shows the cache access latency, in terms of clock cycles, for two Intel multicore systems running at the same clock frequency. The Core 2 Quad has a shared L2 cache, similar to the Core Duo. The Core i7 improves on L2 cache performance with the use of the dedicated L2 caches, and provides a relatively high-speed access to the L3 cache.
The Core i7-990X chip supports two forms of external communications to other chips. The DDR3 memory controller brings the memory controller for the DDR main memory2 onto the chip. The interface supports three channels that are 8 bytes wide for a total bus width of 192 bits, for an aggregate data rate of up to 32 GB/s. With the memory controller on the chip, the Front Side Bus is eliminated.

Table 18.1 Cache Latency (in clock cycles)

CPU           Clock Frequency    L1 Cache    L2 Cache     L3 Cache
Core 2 Quad   2.66 GHz           3 cycles    15 cycles    —
Core i7       2.66 GHz           4 cycles    11 cycles    39 cycles

2The DDR synchronous RAM memory is discussed in Chapter 5.

The QuickPath Interconnect (QPI) is a cache-coherent, point-to-point link-based electrical interconnect specification for Intel processors and chipsets. It enables high-speed communications among connected processor chips. The QPI link operates at 6.4 GT/s (transfers per second). At 16 bits per transfer, that adds up to 12.8 GB/s, and since QPI links involve dedicated bidirectional pairs, the total bandwidth is 25.6 GB/s. Section 3.5 covers QPI in some detail.

18.5 ARM11 MPCORE

The ARM11 MPCore is a multicore product based on the ARM11 processor family. The ARM11 MPCore can be configured with up to four processors, each with its own L1 instruction and data caches, per chip. Table 18.2 lists the configurable options for the system, including the default values. Figure 18.11 presents a block diagram of the ARM11 MPCore. The key elements of the system are as follows:

• Distributed interrupt controller (DIC): Handles interrupt detection and inter- rupt prioritization. The DIC distributes interrupts to individual processors. • Timer: Each CPU has its own private timer that can generate interrupts. • Watchdog: Issues warning alerts in the event of software failures. If the watch- dog is enabled, it is set to a predetermined value and counts down to 0. It is periodically reset. If the watchdog value reaches zero, an alert is issued. • CPU interface: Handles interrupt acknowledgment, interrupt masking, and interrupt completion acknowledgement. • CPU: A single ARM11 processor. Individual CPUs are referred to as MP11 CPUs. • Vector floating-point (VFP) unit: A coprocessor that implements floating- point operations in hardware. • L1 cache: Each CPU has its own dedicated L1 data cache and L1 instruction cache. • Snoop control unit (SCU): Responsible for maintaining coherency among L1 data caches.

Table 18.2 ARM11 MPCore Configurable Options

Feature                                                  Range of Options                       Default Value
Processors                                               1 to 4                                 4
Instruction cache size per processor                     16 kB, 32 kB, or 64 kB                 32 kB
Data cache size per processor                            16 kB, 32 kB, or 64 kB                 32 kB
Master ports                                             1 or 2                                 2
Width of interrupt bus                                   0 to 224 by increments of 32 pins      32 pins
Vector floating-point (VFP) coprocessor per processor    Included or not                        Included

Figure 18.11  ARM11 MPCore Processor Block Diagram (a distributed interrupt controller receives a configurable number of hardware interrupt lines and per-CPU private fast interrupt (FIQ) lines, and drives an IRQ line to each of four CPU interfaces; each of the four MP11 CPU/VFP cores has a private timer, watchdog, CPU interface, and L1 cache, connected by a 64-bit instruction and data bus with coherency control bits to the snoop control unit (SCU); the SCU provides a 64-bit read/write port and an optional second 64-bit port)

Interrupt Handling

The Distributed Interrupt Controller (DIC) collates interrupts from a large number of sources. It provides

• Masking of interrupts
• Prioritization of the interrupts
• Distribution of the interrupts to the target MP11 CPUs
• Tracking the status of interrupts
• Generation of interrupts by software

The DIC is a single functional unit that is placed in the system alongside MP11 CPUs. This enables the number of interrupts supported in the system to

be independent of the MP11 CPU design. The DIC is memory mapped; that is, control registers for the DIC are defined relative to a main memory base address. The DIC is accessed by the MP11 CPUs using a private interface through the SCU. The DIC is designed to satisfy two functional requirements:

• Provide a means of routing an interrupt request to a single CPU or multiple CPUs, as required.
• Provide a means of interprocessor communication so that a thread on one CPU can cause activity by a thread on another CPU.

As an example that makes use of both requirements, consider a multithreaded application that has threads running on multiple processors. Suppose the application allocates some virtual memory. To maintain consistency, the operating system must update memory translation tables on all processors. The OS could update the tables on the processor where the virtual memory allocation took place, and then issue an interrupt to all the other processors running this application. The other processors could then use this interrupt's ID to determine that they need to update their memory translation tables (a code sketch of this pattern appears after the following list). The DIC can route an interrupt to one or more CPUs in the following three ways:

• An interrupt can be directed to a specific processor only.
• An interrupt can be directed to a defined group of processors. The MPCore views the first processor to accept the interrupt, typically the least loaded, as being best positioned to handle the interrupt.
• An interrupt can be directed to all processors.
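As a concrete sketch of the page-table update scenario described above (update locally, then interrupt every other CPU), the following C fragment uses the "all but self" style of routing. The function send_ipi, the constant IPI_UPDATE_TLB, and the CPU numbering are illustrative names only; they are not the MPCore's or any operating system's actual interfaces.

/* Illustrative "update locally, then interrupt all other CPUs" pattern. */
#include <stdio.h>

#define NUM_CPUS       4
#define IPI_UPDATE_TLB 7                       /* one of the 16 software interrupt IDs */

void send_ipi(int target_cpu, int ipi_id)      /* stand-in for writing to the DIC */
{
    printf("IPI %d -> CPU %d\n", ipi_id, target_cpu);
}

void update_translation_tables(int self)
{
    /* 1. Update the tables locally on the CPU that did the allocation. */
    printf("CPU %d: translation tables updated locally\n", self);

    /* 2. Route the interrupt to every CPU except this one. */
    for (int cpu = 0; cpu < NUM_CPUS; cpu++)
        if (cpu != self)
            send_ipi(cpu, IPI_UPDATE_TLB);
}

int main(void) { update_translation_tables(0); return 0; }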

From the point of view of software running on a particular CPU, the OS can generate an interrupt to all but self, to self, or to specific other CPUs. For communication between threads running on different CPUs, the interrupt mechanism is typically combined with shared memory for message passing. Thus, when a thread is interrupted by an interprocessor communication interrupt, it reads from the appropriate block of shared memory to retrieve a message from the thread that triggered the interrupt. A total of 16 interrupt IDs per CPU are available for interprocessor communication. From the point of view of an MP11 CPU, an interrupt can be

• Inactive: An Inactive interrupt is one that is nonasserted, or that in a multiprocessing environment has been completely processed by that CPU but can still be either Pending or Active in some of the CPUs to which it is targeted, and so might not have been cleared at the interrupt source.
• Pending: A Pending interrupt is one that has been asserted, and for which processing has not started on that CPU.
• Active: An Active interrupt is one that has been started on that CPU, but whose processing is not complete. An Active interrupt can be pre-empted when a new interrupt of higher priority interrupts MP11 CPU interrupt processing.

Interrupts come from the following sources:

• Interprocessor interrupts (IPIs): Each CPU has private interrupts, ID0-ID15, that can only be triggered by software. The priority of an IPI depends on the receiving CPU, not the sending CPU.
• Private timer and/or watchdog interrupts: These use interrupt IDs 29 and 30.
• Legacy FIQ line: In legacy IRQ mode, the legacy FIQ pin, on a per-CPU basis, bypasses the Interrupt Distributor logic and directly drives interrupt requests into the CPU.
• Hardware interrupts: Hardware interrupts are triggered by programmable events on associated interrupt input lines. CPUs can support up to 224 interrupt input lines. Hardware interrupts start at ID32.
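The ID ranges in the list above can be summarized as a simple classifier. The ranges follow the text; the function itself is only an illustration, and IDs not described in the text are simply labeled as such.

/* Classifying interrupt IDs according to the ranges given in the text. */
#include <stdio.h>

const char *irq_source(int id)
{
    if (id >= 0 && id <= 15)   return "interprocessor (software) interrupt";
    if (id == 29 || id == 30)  return "private timer/watchdog interrupt";
    if (id >= 32)              return "hardware interrupt";
    return "other/reserved";   /* remaining IDs are not described in the text */
}

int main(void)
{
    int ids[] = { 3, 29, 31, 45 };
    for (int i = 0; i < 4; i++)
        printf("ID %d: %s\n", ids[i], irq_source(ids[i]));
    return 0;
}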

Figure 18.12 is a block diagram of the DIC. The DIC is configurable to support between 0 and 224 hardware interrupt inputs. The DIC maintains a list of interrupts, showing their priority and status. The Interrupt Distributor transmits to each CPU Interface the highest-priority Pending interrupt for that interface. It receives back the information that the interrupt has been acknowledged, and can then change the status of the corresponding interrupt. The CPU Interface also transmits end of interrupt (EOI) information, which enables the Interrupt Distributor to update the status of this interrupt from Active to Inactive.

Figure 18.12  Interrupt Distributor Block Diagram (the interrupt list, holding an interrupt number, priority, and status for each source, feeds a prioritization and selection stage; the top-priority pending interrupt is delivered as an IRQ request to each of the four MP11 CPU interfaces, and core acknowledge, read/write, and end-of-interrupt (EOI) information is returned over the private bus)
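The interrupt life cycle implied by this description can be sketched as a tiny state machine: the Distributor offers a Pending interrupt, the CPU's acknowledge makes it Active, and the EOI returns it to Inactive. The C fragment below is a simplified model only; the real DIC has additional states and rules.

/* Simplified Pending -> Active -> Inactive life cycle of one interrupt. */
#include <stdio.h>

enum irq_state { INACTIVE, PENDING, ACTIVE };

struct irq { int id; int priority; enum irq_state state; };

void assert_irq(struct irq *q)      { q->state = PENDING;  }   /* source raises the interrupt */
void cpu_acknowledge(struct irq *q) { q->state = ACTIVE;   }   /* CPU starts handling it      */
void cpu_eoi(struct irq *q)         { q->state = INACTIVE; }   /* handler signals completion  */

int main(void)
{
    struct irq timer = { 29, 5, INACTIVE };
    assert_irq(&timer);
    cpu_acknowledge(&timer);
    cpu_eoi(&timer);
    printf("final state: %d (0 = Inactive)\n", timer.state);
    return 0;
}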

Cache Coherency

The MPCore's Snoop Control Unit (SCU) is designed to resolve most of the traditional bottlenecks related to access to shared data and the scalability limitation introduced by coherence traffic. The L1 cache coherency scheme is based on the MESI protocol described in Chapter 17. The SCU monitors operations with shared data to optimize MESI state migration. The SCU introduces three types of optimization: direct data intervention, duplicated tag RAMs, and migratory lines.

Direct data intervention (DDI) enables copying clean data from one CPU L1 data cache to another CPU L1 data cache without accessing external memory. This reduces read-after-read activity from the Level 1 cache to the Level 2 cache. Thus, a local L1 cache miss is resolved in a remote L1 cache rather than by an access to the shared L2 cache.

Recall that the main memory location of each line within a cache is identified by a tag for that line. The tags can be implemented as a separate block of RAM of the same length as the number of lines in the cache. In the SCU, duplicated tag RAMs are duplicated versions of the L1 tag RAMs used by the SCU to check for data availability before sending coherency commands to the relevant CPUs. Coherency commands are sent only to CPUs that must update their coherent data cache. This reduces the power consumption and performance impact from snooping into and manipulating each processor's cache on each memory update. Having tag data available locally lets the SCU limit cache manipulations to processors that have cache lines in common.

The migratory lines feature enables moving dirty data from one CPU to another without writing to L2 and reading the data back in from external memory. The operation can be described as follows. In a typical MESI protocol, when one processor has a modified line and another processor attempts to read that line, the following actions occur:

1. The line contents are transferred from the cache holding the modified line to the processor that initiated the read.
2. The line contents are written back to main memory.
3. The line is put in the shared state in both caches.

The MPCore SCU handles this situation differently. The SCU monitors the system for a migratory line. If one processor has a modified line, and another processor reads then writes to it, the SCU assumes such a location will experience this same operation in the future. As this operation starts again, the SCU will automatically move the cache line directly to an invalid state rather than expending energy moving it first into the shared state. This optimization also causes the processor to transfer the cache line directly to the other processor without intervening external memory operations.
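The difference between the standard MESI response and the migratory-line optimization can be shown as a toy model of the state changes. The C sketch below models only the states and the presence or absence of a memory write-back; it is not the MPCore SCU logic.

/* Toy model: standard MESI remote read of a Modified line vs. migratory transfer. */
#include <stdio.h>

enum mesi { MODIFIED, EXCLUSIVE, SHARED, INVALID };

struct line { enum mesi state; int data; };

/* Standard MESI: the owner writes back and both caches end up Shared. */
void remote_read_standard(struct line *owner, struct line *reader, int *memory)
{
    *memory = owner->data;          /* dirty line written back to memory */
    reader->data = owner->data;
    owner->state = SHARED;
    reader->state = SHARED;
}

/* Migratory optimization: the dirty line moves directly to the reader (which will
   modify it) and the owner's copy is invalidated; no memory traffic is generated. */
void remote_read_migratory(struct line *owner, struct line *reader)
{
    reader->data = owner->data;
    reader->state = MODIFIED;
    owner->state = INVALID;
}

int main(void)
{
    int memory = 0;
    struct line a = { MODIFIED, 42 }, b = { INVALID, 0 };
    remote_read_standard(&a, &b, &memory);
    struct line c = { MODIFIED, 7 }, d = { INVALID, 0 };
    remote_read_migratory(&c, &d);
    printf("standard: a=%d b=%d mem=%d; migratory: c=%d d=%d\n",
           a.state, b.state, memory, c.state, d.state);
    return 0;
}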

18.6 IBM ZENTERPRISE 196 MAINFRAME

In this section, we look at a mainframe computer organization that uses multicore processor chips. The example we use is the IBM zEnterprise 196 mainframe computer [CURR11, WHIT11], which began shipping in late 2010. Section 7.8 provides a general overview of the z196, together with a discussion of its I/O structure.

Organization

The principal building block of the mainframe is the multichip module (MCM), a glass ceramic module that houses 8 chips. The key components of the configuration are shown in Figure 18.13:

• Processor unit (PU): There are six 5.2-GHz processor PU chips, each containing four processor cores plus three levels of cache. The PUs have external connections to main memory via memory control units and to I/O via host channel adapters. Thus, each MCM includes 24 core processors.
• Storage control (SC): The two SC chips contain an additional level of cache plus interconnection logic for connecting to three other MCMs.

Figure 18.13  IBM z196 Processor Node Structure (the MCM holds six four-core PU chips and two SC chips; each PU chip connects to a memory control unit (MCU) and a host channel adapter (HCA), and the SC chips carry fabric book connectivity (FBC) links to other books)

The microprocessor core features a wide superscalar, out-of-order pipeline that can decode three z/Architecture CISC instructions per clock cycle and execute up to five operations per cycle. The instruction execution path is predicted by branch direction and target prediction logic. Each core has six execution units: two integer units, one floating-point unit, two load/store units, and one decimal unit.

Cache Structure

The z196 incorporates a four-level cache structure, which IBM states is the industry's first four-level cache. We look at each level in turn (Figure 18.14).

Each core has a dedicated 192-kB level 1 cache, divided into a 128-kB data cache and a 64-kB instruction cache. The L1 cache is designed as a store-through cache to L2; that is, altered data are also stored to the next level of memory. These caches are 8-way set associative.

Each core also has a dedicated 1.5-MB level 2 cache, which is also store-through to L3. The L2 cache is 12-way set associative.

Each 4-core processor unit chip includes a 24-MB level 3 cache shared by all four processors. Because the L1 and L2 caches are store-through, the L3 cache must process every store generated by the four cores on its chip. This feature maintains data availability during a core failure. The L3 cache is 12-way set associative. The z196 implements embedded DRAM (eDRAM) as the L3 cache memory on the chip.

Figure 18.14  IBM z196 Cache Hierarchy (each core: a 64-kB L1 instruction cache, a 128-kB L1 data cache, and a 1.5-MB L2 cache; each four-core PU chip: a 24-MB L3 cache; each SC chip: a 96-MB L4 cache)

While this eDRAM memory is slower than the static RAM (SRAM) normally used to implement cache memory, considerably more of it can be placed in a given chip area. For many workloads, having more memory closer to the processor is more important than having faster memory.

Finally, all 6 PUs on an MCM share a 192-MB level 4 cache, which is split into one 96-MB cache on each SC. The principal motivation for incorporating a level 4 cache is that the very high clock speed of the core processors results in a significant mismatch with main memory speed. The fourth cache layer is needed to keep the cores running efficiently. The large shared L3 and L4 caches are suited to transaction-processing workloads exhibiting a high degree of data sharing and task swapping. The L4 cache is 24-way set associative. The SC chip, which houses the L4 cache, also acts as an L4 cache cross-point switch for L4-to-L4 traffic to up to three remote books3 by three bidirectional data buses. L4 is the coherence manager, meaning that all memory fetches must be in the L4 cache before that data can be used by the processor. All four caches use a line size of 256 bytes.

The z196 is an interesting study in design trade-offs and in the difficulty of exploiting the increasingly powerful processors available with current technology. The large L4 cache is intended to drive the need for access to main memory down to the bare minimum. However, the distance to the off-chip L4 cache costs a number of instruction cycles. Thus, the on-chip area devoted to cache is as large as possible, even to the point of having fewer cores than possible on the chip. The L1 caches are small, to minimize distance from the processor and ensure that access can occur in one cycle. The L2 cache is dedicated to a single core, in an attempt to maximize the amount of access that can occur without resort to a shared cache. The L3 cache is shared by all four processors on a chip and is as large as possible, to minimize the need to go to the L4 cache.

Because all of the books of the zEnterprise 196 share the workload, the four L4 caches on the four books form a single pool of L4 cache memory. Thus, access to L4 means not only going off-chip but perhaps off-book, further increasing access delay. This means relatively large distances exist between the higher-level caches in the processors and the L4 cache content. To overcome the delays inherent to the book design and to save cycles in accessing off-book L4 content, the designers try to keep instructions and data as close to the processors as possible by directing as much of a given logical partition's workload as possible to the processors located in the same book as the L4 cache. This is achieved by having the system resource manager/scheduler and the z/OS dispatcher work together to keep as much work as possible within the boundaries of as few processors and as little L4 cache space (ideally within a single book boundary) as can be achieved without affecting throughput and response times. Constraining the resource manager/scheduler and the dispatcher from spreading a workload across any available processor, and keeping the workload within as small a portion of the system as possible, helps overcome latency in a high-frequency processor design such as the z196.

3Recall from Chapter 7 that a z196 book consists of an MCM, memory cards, and I/O cage connections.
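The cache parameters quoted in this section (capacity, associativity, and the common 256-byte line size) determine each cache's number of sets via the usual relation sets = capacity / (associativity x line size). The C sketch below simply evaluates that formula for the z196 figures given in the text; the formula itself is standard cache arithmetic, not an IBM specification.

/* Number of sets implied by the z196 cache parameters given in the text. */
#include <stdio.h>

long sets(long capacity_bytes, int ways, int line_bytes)
{
    return capacity_bytes / ((long)ways * line_bytes);
}

int main(void)
{
    const int line = 256;                                   /* bytes per line */
    printf("L1 D (128 kB, 8-way):  %ld sets\n", sets(128L * 1024, 8, line));
    printf("L2   (1.5 MB, 12-way): %ld sets\n", sets(1536L * 1024, 12, line));
    printf("L3   (24 MB, 12-way):  %ld sets\n", sets(24L * 1024 * 1024, 12, line));
    printf("L4   (96 MB, 24-way):  %ld sets\n", sets(96L * 1024 * 1024, 24, line));
    return 0;
}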

18.7 RECOMMENDED READING

Two books that provide good coverage of the issues in this chapter are [OLUK07] and [BAER10]. [GOCH06] and [MEND06] describe the Intel Core Duo. [FOG08b] provides a detailed description of the Core Duo pipeline architecture. [ARM08b] provides thorough coverage of the ARM11 MPCore. [HIRA07] and [GOOD05] are good overview articles.

ARM08b  ARM Limited. ARM11 MPCore Processor Technical Reference Manual. ARM DDI 0360E, 2008. www.arm.com
BAER10  Baer, J. Microprocessor Architecture: From Simple Pipelines to Chip Multiprocessors. New York: Cambridge University Press, 2010.
FOG08b  Fog, A. The Microarchitecture of Intel and AMD CPUs. Copenhagen University College of Engineering, 2008. http://www.agner.org/optimize/
GOCH06  Gochman, S., et al. "Introduction to Intel Core Duo Processor Architecture." Intel Technology Journal, May 2006.
GOOD05  Goodacre, J., and Sloss, A. "Parallelism and the ARM Instruction Set Architecture." Computer, July 2005.
HIRA07  Hirata, K., and Goodacre, J. "ARM MPCore: The Streamlined and Scalable ARM11 Processor Core." Proceedings, 2007 Conference on Asia South Pacific Design Automation, 2007.
MEND06  Mendelson, A., et al. "CMP Implementation in Systems Based on the Intel Core Duo Processor." Intel Technology Journal, May 2006.
OLUK07  Olukotun, K.; Hammond, L.; and Laudon, J. Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency. San Rafael, CA: Morgan & Claypool, 2007.

18.8 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

Amdahl's law • chip multiprocessor • multicore • simultaneous multithreading (SMT) • superscalar

Review Questions
18.1 Summarize the differences among simple instruction pipelining, superscalar, and simultaneous multithreading.
18.2 Give several reasons for the choice by designers to move to a multicore organization rather than increase parallelism within a single processor.
18.3 Why is there a trend toward giving an increasing fraction of chip area to cache memory?
18.4 List some examples of applications that benefit directly from the ability to scale throughput with the number of cores.

18.5 At a top level, what are the main design variables in a multicore organization?
18.6 List some advantages of a shared L2 cache among cores compared to separate dedicated L2 caches for each core.

Problems
18.1 Consider the following problem. A designer has a chip available and must decide what fraction of the chip will be devoted to cache memory (L1, L2, L3). The remainder of the chip can be devoted to a single complex superscalar and/or SMT core or to multiple somewhat simpler cores. Define the following parameters:

n = maximum number of cores that can be contained on the chip
k = actual number of cores implemented (1 ≤ k ≤ n, where r = n/k is an integer)
perf(r) = sequential performance gain from using the resources equivalent to r cores to form a single processor, where perf(1) = 1
f = fraction of software that is parallelizable across multiple cores

Thus, if we construct a chip with n cores, we expect each core to provide sequential performance of 1 and the n cores to be able to exploit parallelism up to a degree of n parallel threads. Similarly, if the chip has k cores, then each core should exhibit a performance of perf(r), and the chip is able to exploit parallelism up to a degree of k parallel threads. We can modify Amdahl's law (Equation 18.1) to reflect this situation as follows:

Speedup = 1 / [ (1 - f)/perf(r) + (f * r)/(perf(r) * n) ]
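For readers who want to experiment with this model numerically before turning to the plots on the Premium Content site, the following C sketch evaluates the modified speedup formula with perf(r) = √r (Pollack's rule, as used in part (b) below). The values of n, f, and r chosen in the loop are illustrative only.

/* Modified Amdahl speedup with perf(r) = sqrt(r); compile with -lm. */
#include <math.h>
#include <stdio.h>

double speedup(double f, double r, double n)
{
    double perf = sqrt(r);                              /* perf(r) under Pollack's rule */
    return 1.0 / ((1.0 - f) / perf + f * r / (perf * n));
}

int main(void)
{
    double n = 16.0;
    double fs[] = { 0.5, 0.9, 0.975, 0.99, 0.999 };
    for (int i = 0; i < 5; i++) {
        for (double r = 1.0; r <= n; r *= 2.0)           /* r must divide n */
            printf("f=%.3f r=%2.0f speedup=%6.2f\n", fs[i], r, speedup(fs[i], r, n));
        printf("\n");
    }
    return 0;
}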

a. Justify this modification of Amdahl's law.
b. Using Pollack's rule, we set perf(r) = √r. Let n = 16. We want to plot speedup as a function of r for f = 0.5, f = 0.9, f = 0.975, f = 0.99, and f = 0.999. The results are available in a document at this book's Premium Content site (multicore-performance.). What conclusions can you draw?
c. Repeat part (b) for n = 256.

18.2 The technical reference manual for the ARM11 MPCore says that the Distributed Interrupt Controller is memory mapped. That is, the core processors use memory-mapped I/O to communicate with the DIC. Recall from Chapter 7 that with memory-mapped I/O, there is a single address space for memory locations and I/O devices. The processor treats the status and data registers of I/O modules as memory locations and uses the same machine instructions to access both memory and I/O devices. Based on this information, what path through the block diagram of Figure 18.11 is used for the core processors to communicate with the DIC?

18.3 In this question we analyze the performance of the following C program on a multithreaded architecture. You should assume that arrays A, B, and C do not overlap in memory.

for (i=0; i<328; i++) {
    A[i] = A[i]*B[i];
    C[i] = C[i]+A[i];
}

Our machine is a single-issue, in-order processor. It switches to a different thread every cycle using fixed round-robin scheduling. Each of the N threads executes one instruction every N cycles. We allocate the code to the threads such that every thread executes every Nth iteration of the original C code.

Integer instructions take 1 cycle to execute, floating-point instructions take 4 cycles, and memory instructions take 3 cycles. All execution units are fully pipelined. If an instruction cannot issue because its data is not yet available, it inserts a bubble into the pipeline and retries after I cycles. Below is our program in assembly code for this machine, for a single thread executing the entire loop.

loop: ld   f1, 0(r1)   ;f1 = A[i]
      ld   f2, 0(r2)   ;f2 = B[i]
      fmul f4, f2, f1  ;f4 = f1*f2
      st   f4, 0(r1)   ;A[i] = f4
      ld   f3, 0(r3)   ;f3 = C[i]
      fadd f5, f4, f3  ;f5 = f4 + f3
      st   f5, 0(r3)   ;C[i] = f5
      add  r1, r1, 4   ;i++
      add  r2, r2, 4
      add  r3, r3, 4
      add  r4, r4, -1
      bnez r4, loop    ;loop

a. We allocate the assembly code of the loop to N threads such that every thread executes every Nth iteration of the original loop. Write the assembly code that one of the N threads would execute on this multithreaded machine.
b. What is the minimum number of threads this machine needs to remain fully utilized, issuing an instruction every cycle, for our program?
c. Could we reach peak performance running this program using fewer threads by rearranging the instructions? Explain briefly.
d. What will be the peak performance in flops/cycle for this program?

APPENDIX A

PROJECTS FOR TEACHING COMPUTER ORGANIZATION AND ARCHITECTURE

A.1 Interactive Simulations
A.2 Research Projects
A.3 Simulation Projects
    SimpleScalar
    SMPCache
A.4 Assembly Language Projects
A.5 Reading/Report Assignments
A.6 Writing Assignments
A.7 Test Bank


Many instructors believe that research or implementation projects are crucial to the clear understanding of the concepts of computer organization and architecture. Without projects, it may be difficult for students to grasp some of the basic concepts and interactions among components. Projects reinforce the concepts introduced in the book, give students a greater appreciation of the inner workings of processors and computer systems, and can motivate students and give them confidence that they have mastered the material. In this text, I have tried to present the concepts of computer organization and architecture as clearly as possible and have provided numerous homework problems to reinforce those concepts. Many instructors will wish to supplement this material with projects. This appendix provides some guidance in that regard and describes support material available in the Instructor's Resource Center (IRC) for this book, accessible by instructors online from Prentice Hall. The support material covers six types of projects and other student exercises:

• Interactive simulations
• Research projects
• Simulation projects
• Assembly language projects
• Reading/report assignments
• Writing assignments
• Test bank

A.1 INTERACTIVE SIMULATIONS

Interactive simulations provide a powerful tool for understanding the complex design features of a modern computer system. Today's students want to be able to visualize the various complex mechanisms of computer systems on their own computer screen. A total of 20 simulations are used to illustrate key functions and algorithms in computer organization and architecture design. Table A.1 lists the simulations by chapter. At the relevant point in the book, an icon indicates that a relevant interactive simulation is available online for student use.

Because the simulations enable the user to set initial conditions, they can serve as the basis for student assignments. The IRC for this book includes a set of assignments, one set for each of the interactive simulations. Each assignment includes several specific problems that can be assigned to students.

The interactive simulations were developed under the direction of Professor Israel Koren, at the University of Massachusetts Department of Electrical and Computer Engineering. Aswin Sreedhar of the University of Massachusetts developed the interactive simulation assignments. For access to the animations, click on the rotating globe at this book's web site at http://williamstallings.com/ComputerOrganization.

Table A.1 Computer Organization and Architecture—Interactive Simulations by Chapter

Chapter 4—Cache Memory
  Cache Simulator: Emulates small-sized caches based on a user-input cache model and displays the cache contents at the end of the simulation cycle, based on an input sequence that is entered by the user or randomly generated if so selected.
  Cache Time Analysis: Demonstrates Average Memory Access Time analysis for the cache parameters you specify.
  Multitask Cache Demonstrator: Models cache on a system that supports multitasking.
  Selective Victim Cache Simulator: Compares three different cache policies.
Chapter 5—Internal Memory
  Interleaved Memory Simulator: Demonstrates the effect of interleaving memory.
Chapter 6—External Memory
  RAID: Determines storage efficiency and reliability.
Chapter 7—Input/Output
  I/O System Design Tool: Evaluates comparative cost and performance of different I/O systems.
Chapter 8—OS Support
  Page Replacement Algorithms: Compares LRU, FIFO, and Optimal.
  More Page Replacement Algorithms: Compares a number of policies.
Chapter 14—CPU Structure and Function
  Reservation Table Analyzer: Evaluates reservation tables, which are a way of representing the task flow pattern of a pipelined system.
  Branch Prediction: Demonstrates three different branch prediction schemes.
  Branch Target Buffer: Combined branch predictor/branch target buffer simulator.
Chapter 15—Reduced Instruction Set Computers
  MIPS 5-Stage Pipeline: Simulates the pipeline.
  Loop Unrolling: Simulates the loop unrolling software technique for exploiting instruction-level parallelism.
Chapter 16—Instruction-Level Parallelism and Superscalar Processors
  Pipeline with Static vs. Dynamic Scheduling: A more complex simulation of the MIPS pipeline.
  Reorder Buffer Simulator: Simulates instruction reordering in a RISC pipeline.
  Scoreboarding Technique for Dynamic Scheduling: Simulation of an instruction scheduling technique used in a number of processors.
  Tomasulo's Algorithm: Simulation of another instruction scheduling technique.
  Alternative Simulation of Tomasulo's Algorithm: Another simulation of Tomasulo's algorithm.
Chapter 17—Parallel Processing
  Vector Processor Simulation: Demonstrates execution of vector processing instructions.

A.2 RESEARCH PROJECTS

An effective way of reinforcing basic concepts from the course and for teaching students research skills is to assign a research project. Such a project could involve a literature search as well as a Web search of vendor products, research lab activities, and standardization efforts. Projects could be assigned to teams or, for smaller projects, to individuals. In any case, it is best to require some sort of project proposal early in the term, giving the instructor time to evaluate the proposal for appropriate topic and appropriate level of effort. Student handouts for research projects should include

• A format for the proposal
• A format for the final report
• A schedule with intermediate and final deadlines
• A list of possible project topics

The students can select one of the listed topics or devise their own comparable project. The IRC includes a suggested format for the proposal and final report as well as a list of possible research topics.

A.3 SIMULATION PROJECTS

An excellent way to obtain a grasp of the internal operation of a processor and to study and appreciate some of the design trade-offs and performance implications is by simulating key elements of the processor. Two tools that are useful for this purpose are SimpleScalar and SMPCache. Compared with actual hardware implementation, simulation provides two advantages for both research and educational use:

• With simulation, it is easy to modify various elements of an organization, to vary the performance characteristics of various components, and then to analyze the effects of such modifications.
• Simulation provides for detailed performance statistics collection, which can be used to understand performance trade-offs.

SimpleScalar

SimpleScalar [BURG97, MANJ01a, MANJ01b] is a set of tools that can be used to simulate real programs on a range of modern processors and systems. The tool set includes a compiler, assembler, linker, and simulation and visualization tools. SimpleScalar provides processor simulators that range from an extremely fast functional simulator to a detailed out-of-order issue, superscalar processor simulator that supports nonblocking caches and speculative execution. The instruction set architecture and organizational parameters may be modified to create a variety of experiments. The IRC for this book includes a concise introduction to SimpleScalar for students, with instructions on how to load and get started with SimpleScalar. The manual also includes some suggested project assignments.

SimpleScalar is a portable software package that runs on most UNIX platforms. The SimpleScalar software can be downloaded from the SimpleScalar Web site. It is available at no cost for noncommercial use.

SMPCache

SMPCache is a trace-driven simulator for the analysis and teaching of cache memory systems on symmetric multiprocessors [RODR01]. The simulation is based on a model built according to the basic architectural principles of these systems. The simulator has a full graphical and user-friendly interface. Some of the parameters that can be studied with the simulator are: program locality; influence of the number of processors, cache coherence protocols, schemes for bus arbitration, mapping, replacement policies, cache size (blocks in cache), number of cache sets (for set-associative caches), and number of words per block (memory block size).

The IRC for this book includes a concise introduction to SMPCache for students, with instructions on how to load and get started with SMPCache. The manual also includes some suggested project assignments.

SMPCache is a portable software package that runs on PC systems with Windows. The SMPCache software can be downloaded from the SMPCache Web site. It is available at no cost for noncommercial use.

A.4 ASSEMBLY LANGUAGE PROJECTS

Assembly language programming is often used to teach students low-level hardware components and computer architecture. CodeBlue is a simplified assembly language program developed at the U.S. Air Force Academy. The goal of the work was to develop and teach assembly language concepts using a visual simulator that students can learn in a single class. The developers also wanted students to find the language motivational and fun to use. The CodeBlue language is much simpler than most simplified architecture instruction sets, such as the SC123. Still, it allows students to develop interesting assembly-level programs that compete in tournaments, similar to the far more complex SPIMbot simulator. Most important, through CodeBlue programming, students learn fundamental computer architecture concepts such as instruction and data co-residence in memory, control structure implementation, and addressing modes.

To provide a basis for projects, the developers have built a visual development environment that allows students to create a program, see its representation in memory, step through the program's execution, and simulate a battle of competing programs in a visual memory environment.

Projects can be built around the concept of a Core War tournament. Core War is a programming game introduced to the public in the early 1980s, which was popular for a period of 15 years or so. Core War has four main components: a memory array of 8000 addresses, a simplified assembly language Redcode, an executive program called MARS (an acronym for Memory Array Redcode Simulator), and the set of contending battle programs. Two battle programs are entered into the memory array at randomly chosen positions; neither program knows where the

other one is. MARS executes the programs in a simple version of time-sharing. The two programs take turns: a single instruction of the first program is executed, then a single instruction of the second, and so on. What a battle program does during the execution cycles allotted to it is entirely up to the programmer. The aim is to destroy the other program by ruining its instructions. The CodeBlue environment substitutes CodeBlue for Redcode and provides its own interactive execution interface. The IRC includes the CodeBlue environment, a user's manual for students, other supporting material, and suggested assignments.

A.5 READING/REPORT ASSIGNMENTS

Another excellent way to reinforce concepts from the course and to give students research experience is to assign papers from the literature to be read and analyzed. The IRC includes a suggested list of papers to be assigned, organized by chapter. The Premium Content Web site provides a copy of each of the papers. The IRC also includes a suggested assignment wording.

A.6 WRITING ASSIGNMENTS

Writing assignments can have a powerful multiplier effect in the learning process in a technical discipline such as computer organization and architecture. Adherents of the Writing Across the Curriculum (WAC) movement (http://wac.colostate.edu/) report substantial benefits of writing assignments in facilitating learning. Writing assignments lead to more detailed and complete thinking about a particular topic. In addition, writing assignments help to overcome the tendency of students to pursue a subject with a minimum of personal engagement, just learning facts and problem-solving techniques without obtaining a deep understanding of the subject matter. The IRC contains a number of suggested writing assignments, organized by chapter. Instructors may ultimately find that this is the most important part of their approach to teaching the material. I would greatly appreciate any feedback on this area and any suggestions for additional writing assignments.

A.7 TEST BANK

A test bank is available at the IRC site for this book. For each chapter, the test bank includes true/false, multiple-choice, and fill-in-the-blank questions. The test bank is an effective way to assess student comprehension of the material.

APPENDIX B

ASSEMBLY LANGUAGE AND RELATED TOPICS

B.1 Assembly Language
    Assembly Language Elements
    Type of Assembly Language Statements
    Example: Greatest Common Divisor Program
B.2 Assemblers
    Two-Pass Assembler
    One-Pass Assembler
    Example: Prime Number Program
B.3 Loading and Linking
    Loading
    Linking
B.4 Recommended Reading
B.5 Key Terms, Review Questions, and Problems


The topic of assembly language was briefly introduced in Chapter 13. This appendix provides more detail and also covers a number of related topics. There are a number of reasons why it is worthwhile to study assembly language programming (as compared with programming in a higher-level language), including the following:

1. It clarifies the execution of instructions.
2. It shows how data is represented in memory.
3. It shows how a program interacts with the operating system, processor, and the I/O system.
4. It clarifies how a program accesses external devices.
5. Understanding assembly language makes students better high-level language (HLL) programmers, by giving them a better idea of the target language that the HLL must be translated into.

We begin this appendix with a study of the basic elements of an assembly language, using the x86 architecture for our examples.1 Next, we look at the operation of the assembler. This is followed by a discussion of linkers and loaders. Table B.1 defines some of the key terms used in this appendix.

B.1 ASSEMBLY LANGUAGE

Assembly language is a programming language that is one step away from machine language. Typically, each assembly language instruction is translated into one machine instruction by the assembler. Assembly language is hardware dependent, with a different assembly language for each type of processor. In particular, assembly language instructions can make reference to specific registers in the processor, include all of the opcodes of the processor, and reflect the bit length of the various registers of the processor and operands of the machine language. An assembly language programmer must therefore understand the computer's architecture.

Programmers rarely use assembly language for applications or even systems programs. HLLs provide an expressive power and conciseness that greatly eases the programmer's tasks. The disadvantages of using an assembly language rather than an HLL include the following [FOG08a]:

1. Development time. Writing code in assembly language takes much longer than writing in a high-level language.
2. Reliability and security. It is easy to make errors in assembly code. The assembler does not check whether calling conventions and register save conventions are obeyed, and nobody checks for you whether the number of PUSH and POP instructions is the same in all possible branches and paths. There are so many possibilities for hidden errors in assembly code that the reliability and security of the project suffer unless you have a very systematic approach to testing and verifying.

1There are a number of assemblers for the x86 architecture. Our examples use NASM (Netwide Assembler), an open source assembler. A copy of the NASM manual is at this book's Premium Content site.

Table B.1 Key Terms for this Appendix

Assembler A program that translates assembly language into machine code.

Assembly Language A symbolic representation of the machine language of a specific processor, augmented by additional types of statements that facilitate program writing and that provide instructions to the assembler.

Compiler A program that converts another program from some source language (or programming language) to machine language (object code). Some compilers output assembly language, which is then converted to machine language by a separate assembler. A compiler is distinguished from an assembler by the fact that each input statement does not, in general, correspond to a single machine instruction or fixed sequence of instructions. A compiler may support such features as automatic allocation of variables, arbitrary arithmetic expressions, control structures such as FOR and WHILE loops, variable scope, input/output operations, higher-order functions, and portability of source code.

Executable Code The machine code generated by a source code language processor such as an assembler or compiler. This is software in a form that can be run in the computer.

Instruction Set The collection of all possible instructions for a particular computer; that is, the collection of machine language instructions that a particular processor understands.

Linker A utility program that combines one or more files containing object code from separately compiled program modules into a single file containing loadable or executable code.

Loader A program routine that copies an executable program into memory for execution.

Machine Language, or Machine Code The binary representation of a computer program which is actually read and interpreted by the computer. A program in machine code consists of a sequence of machine instructions (possibly interspersed with data). Instructions are binary strings which may be either all the same size (e.g., one 32-bit word for many modern RISC microprocessors) or of different sizes.

Object Code The machine language representation of programming source code. Object code is created by a compiler or assembler and is then turned into executable code by the linker.

3. Debugging and verifying. Assembly code is more difficult to debug and verify because there are more possibilities for errors than in high-level code.
4. Maintainability. Assembly code is more difficult to modify and maintain because the language allows unstructured spaghetti code and all kinds of tricks that are difficult for others to understand. Thorough documentation and a consistent programming style are needed.
5. Portability. Assembly code is platform-specific. Porting to a different platform is difficult.

6. System code can use intrinsic functions instead of assembly. The best modern C++ compilers have intrinsic functions for accessing system control registers and other system instructions. Assembly code is no longer needed for device drivers and other system code when intrinsic functions are available.
7. Application code can use intrinsic functions or vector classes instead of assembly. The best modern C++ compilers have intrinsic functions for vector operations and other special instructions that previously required assembly programming.
8. Compilers have been improved a lot in recent years. The best compilers are now quite good. It takes a lot of expertise and experience to optimize better than the best C++ compiler.

Yet there are still some advantages to the occasional use of assembly language, including the following [FOG08a]:

1. Debugging and verifying. Looking at compiler-generated assembly code or the disassembly window in a debugger is useful for finding errors and for checking how well a compiler optimizes a particular piece of code.
2. Making compilers. Understanding assembly coding techniques is necessary for making compilers, debuggers, and other development tools.
3. Embedded systems. Small embedded systems have fewer resources than PCs and mainframes. Assembly programming can be necessary for optimizing code for speed or size in small embedded systems.
4. Hardware drivers and system code. Accessing hardware, system control registers, and so on may sometimes be difficult or impossible with high-level code.
5. Accessing instructions that are not accessible from high-level language. Certain assembly instructions have no high-level language equivalent.
6. Self-modifying code. Self-modifying code is generally not profitable because it interferes with efficient code caching. It may, however, be advantageous, for example, to include a small compiler in math programs where a user-defined function has to be calculated many times.
7. Optimizing code for size. Storage space and memory are so cheap nowadays that it is not worth the effort to use assembly language for reducing code size. However, cache size is still such a critical resource that it may be useful in some cases to optimize a critical piece of code for size in order to make it fit into the code cache.
8. Optimizing code for speed. Modern C++ compilers generally optimize code quite well in most cases. But there are still cases where compilers perform poorly and where dramatic increases in speed can be achieved by careful assembly programming.
9. Function libraries. The total benefit of optimizing code is higher in function libraries that are used by many programmers.
10. Making function libraries compatible with multiple compilers and operating systems. It is possible to make library functions with multiple entries that are compatible with different compilers and different operating systems. This requires assembly programming.

The terms assembly language and machine language are sometimes, erroneously, used synonymously. Machine language consists of instructions directly executable by the processor. Each machine language instruction is a binary string containing an opcode, operand references, and perhaps other bits related to execution, such as flags. For convenience, instead of writing an instruction as a bit string, it can be written symbolically, with names for opcodes and registers. An assembly language makes much greater use of symbolic names, including assigning names to specific main memory locations and specific instruction locations. Assembly language also includes statements that are not directly executable but serve as instructions to the assembler that produces machine code from an assembly language program.

Assembly Language Elements

A statement in a typical assembly language has the form shown in Figure B.1. It consists of four elements: label, mnemonic, operand, and comment.

LABEL If a label is present, the assembler defines the label as equivalent to the address into which the first byte of the object code generated for that instruction will be loaded. The programmer may subsequently use the label as an address or as data in another instruction's address field. The assembler replaces the label with the assigned value when creating an object program. Labels are most frequently used in branch instructions. As an example, here is a program fragment:

L2: SUB EAX, EDX  ;subtract contents of register EDX from
                  ;contents of EAX and store result in EAX
    JG L2         ;jump to L2 if result of subtraction is
                  ;positive

The program will continue to loop back to location L2 until the result is zero or negative. Thus, when the JG instruction is executed, if the result is positive, the processor places the address equivalent to the label L2 in the program counter. Reasons for using a label include the following:

1. A label makes a program location easier to find and remember.
2. The label can easily be moved to correct a program. The assembler will automatically change the address in all instructions that use the label when the program is reassembled.
3. The programmer does not have to calculate relative or absolute memory addresses, but just uses labels as needed.

Figure B.1  Assembly-Language Statement Structure (Label: optional; Mnemonic: opcode name, directive name, or macro name; Operand(s): zero or more; ;comment: optional)

MNEMONIC The mnemonic is the name of the operation or function of the assembly language statement. As discussed subsequently, a statement can correspond to a machine instruction, an assembler directive, or a macro. In the case of a machine instruction, a mnemonic is the symbolic name associated with a particular opcode. Table 12.8 lists the mnemonic, or instruction name, of many of the x86 instructions. Appendix A of [CART06] lists the x86 instructions, together with the operands for each and the effect of the instruction on the condition codes. Appendix B of the NASM manual provides a more detailed description of each x86 instruction. Both documents are available at this book's Premium Content site.

OPERAND(S) An assembly language statement includes zero or more operands. Each operand identifies an immediate value, a register value, or a memory location. Typically, the assembly language provides conventions for distinguishing among the three types of operand references, as well as conventions for indicating addressing mode. For the x86 architecture, an assembly language statement may refer to a register operand by name. Figure B.2 illustrates the general-purpose x86 registers, with their symbolic names and their bit encodings. The assembler will translate the symbolic name into the binary identifier for the register.

Figure B.2  Intel x86 Program Execution Registers (32-bit general-purpose registers and their encodings: EAX (000), ECX (001), EDX (010), EBX (011), ESP (100), EBP (101), ESI (110), EDI (111); the 16-bit forms AX, BX, CX, DX and their 8-bit halves AH/AL, BH/BL, CH/CL, DH/DL; 16-bit segment registers CS, DS, SS, ES, FS, GS)

As discussed in Section 11.2, the x86 architecture has a rich set of addressing modes, each of which must be expressed symbolically in the assembly language. Here we cite a few of the common examples. For register addressing, the name of the register is used in the instruction. For example, MOV ECX, EBX copies the contents of register EBX into register ECX. Immediate addressing indicates that the value is encoded in the instruction. For example, MOV EAX, 100H copies the hexadecimal value 100 into register EAX. The immediate value can be expressed as a binary number with the suffix B or as a decimal number with no suffix. Thus, equivalent statements to the preceding one are MOV EAX, 100000000B and MOV EAX, 256. Direct addressing refers to a memory location and is expressed as a displacement from the DS segment register. This is best explained by example. Assume that the 16-bit data segment register DS contains the value 1000H. Then the following sequence occurs:

MOV AX, 1234H
MOV [3518H], AX

First the 16-bit register AX is initialized to 1234H. Then, in line two, the contents of AX are moved to the logical address DS:3518H. This address is formed by shifting the contents of DS left 4 bits and adding 3518H to form the 32-bit logical address 13518H.

COMMENT All assembly languages allow the placement of comments in the program. A comment can either occur at the right-hand end of an assembly statement or can occupy an entire text line. In either case, the comment begins with a special character that signals to the assembler that the rest of the line is a comment and is to be ignored by the assembler. Typically, assembly languages for the x86 architecture use a semicolon (;) for the special character.

Type of Assembly Language Statements

Assembly language statements are one of four types: instruction, directive, macro definition, and comment. A comment statement is simply a statement that consists entirely of a comment. The remaining types are briefly described in this section.

INSTRUCTIONS The bulk of the noncomment statements in an assembly language program are symbolic representations of machine language instructions. Almost invariably, there is a one-to-one relationship between an assembly language instruction and a machine instruction. The assembler resolves any symbolic references and translates the assembly language instruction into the binary string that comprises the machine instruction.

DIRECTIVES Directives, also called pseudo-instructions, are assembly language statements that are not directly translated into machine language instructions. Instead, directives are instructions to the assembler to perform specified actions during the assembly process. Examples include the following:

• Define constants
• Designate areas of memory for data storage
• Initialize areas of memory

• Place tables or other fixed data in memory
• Allow references to other programs

Table B.2 lists some of the NASM directives.

Table B.2 Some NASM Assembly-Language Directives

(a) Letters for RESx and Dx Directives
Unit                     Letter
byte                     B
word (2 bytes)           W
double word (4 bytes)    D
quad word (8 bytes)      Q
ten bytes                T

(b) Directives
DB, DW, DD, DQ, DT: Initialize locations.
    Example: L6 DD 1A92H          ;doubleword at L6 initialized to 1A92H
RESB, RESW, RESD, RESQ, REST: Reserve uninitialized locations.
    Example: BUFFER RESB 64       ;reserve 64 bytes starting at BUFFER
INCBIN: Include binary file.
    Example: INCBIN "file.dat"    ;include this file in output
EQU: Define a symbol to a given constant value.
    Example: MSGLEN EQU 25        ;the constant MSGLEN equals decimal 25
TIMES: Repeat instruction multiple times.
    Example: ZEROBUF TIMES 64 DB 0 ;initialize 64-byte buffer to all zeros

As an example, consider the following sequence of statements:

L1   DB "A"        ;byte initialized to ASCII code for A (65)
     MOV AL, [L1]  ;copy byte at L1 into AL
     MOV EAX, L1   ;store address of byte at L1 in EAX
     MOV [L1], AH  ;copy contents of AH into byte at L1

If a plain label is used, it is interpreted as the address (or offset) of the data. If the label is placed inside square brackets, it is interpreted as the data at the address.

MACRO DEFINITIONS A macro definition is similar to a subroutine in several ways. A subroutine is a section of a program that is written once and can be used multiple times by calling the subroutine from any point in the program. When a program is compiled or assembled, the subroutine is loaded only once. A call to the subroutine transfers control to the subroutine, and a return instruction in the subroutine returns control to the point of the call. Similarly, a macro definition is a section of code that the programmer writes once and then can use many times. The main difference is that when the assembler encounters a macro call, it replaces the macro call with the macro itself. This process is called macro expansion. So, if a macro is defined in an assembly language program and invoked 10 times, then 10 instances of the macro will appear in the assembled code. In essence, subroutines are handled by the hardware at run time, whereas macros are handled by the assembler at assembly time. Macros provide the same advantage as subroutines in terms of modular programming, but without the runtime overhead of a subroutine call and return. The tradeoff is that the macro approach uses more space in the object code.

In NASM and many other assemblers, a distinction is made between a single-line macro and a multiline macro. In NASM, single-line macros are defined using the %DEFINE directive. Here is an example in which multiple single-line macros are expanded. First, we define two macros:

%DEFINE B(X) 2*X
%DEFINE A(X) 1 + B(X)

At some point in the assembly language program, the following statement appears:

MOV AX, A(8)

The assembler expands this statement to:

MOV AX, 1+2*8

which assembles to a machine instruction to move the immediate value 17 to register AX.

Multiline macros are defined using the %MACRO directive. Here is an example of a multiline macro definition:

%MACRO PROLOGUE 1
    PUSH EBP      ;push contents of EBP onto stack
                  ;pointed to by ESP and
                  ;decrement contents of ESP by 4
    MOV EBP, ESP  ;copy contents of ESP to EBP
    SUB ESP, %1   ;subtract first parameter value from ESP
%ENDMACRO

The number 1 after the macro name in the %MACRO line defines the number of parameters the macro expects to receive. The use of %1 inside the macro definition refers to the first parameter to the macro call. The macro call

MYFUNC: PROLOGUE 12

expands to the following lines of code:

MYFUNC: PUSH EBP
        MOV EBP, ESP
        SUB ESP, 12

Example: Greatest Common Divisor Program

As an example of the use of assembly language, we look at a program to compute the greatest common divisor of two integers. We define the greatest common divisor of the integers a and b as follows:

gcd(a, b) = max[k, such that k divides a and k divides b]

where we say that k divides a if there is no remainder. Euclid's algorithm for the greatest common divisor is based on the following theorem. For any nonnegative integers a and b,

gcd(a, b) = gcd(b, a mod b)

Here is a C language program that implements Euclid's algorithm:

unsigned int gcd (unsigned int a, unsigned int b)
{
    if (a == 0 && b == 0)
        b = 1;
    else if (b == 0)
        b = a;
    else if (a != 0)
        while (a != b)
            if (a < b)
                b = b - a;
            else
                a = a - b;
    return b;
}

Figure B.3 shows two x86 assembly language versions of this function: (a) the code produced by a compiler, and (b) a version written directly in assembly language.

B.2 ASSEMBLERS

The assembler is a software utility that takes an assembly program as input and produces object code as output. The object code is a binary file. The assembler views this file as a block of memory starting at relative location 0. There are two general approaches to assemblers: the two-pass assembler and the one-pass assembler.

Two-Pass Assembler

We look first at the two-pass assembler, which is more common and somewhat easier to understand. The assembler makes two passes through the source code (Figure B.4):

(a) Compiled program
gcd:  mov  ebx,eax
      mov  eax,edx
      test ebx,ebx
      jne  L1
      test edx,edx
      jne  L1
      mov  eax,1
      ret
L1:   test eax,eax
      jne  L2
      mov  eax,ebx
      ret
L2:   test ebx,ebx
      je   L5
L3:   cmp  ebx,eax
      je   L5
      jae  L4
      sub  eax,ebx
      jmp  L3
L4:   sub  ebx,eax
      jmp  L3
L5:   ret

(b) Written directly in assembly language
gcd:  neg  eax
      je   L3
L1:   neg  eax
      xchg eax,edx
L2:   sub  eax,edx
      jg   L2
      jne  L1
L3:   add  eax,edx
      jne  L4
      inc  eax
L4:   ret

Figure B.3 Assembly Programs for Greatest Common Divisor

FIRST PASS In the first pass, the assembler is only concerned with label definitions. The first pass is used to construct a symbol table that contains a list of all labels and their associated location counter (LC) values. The first byte of the object code will have the LC value of 0. The first pass examines each assembly statement. Although the assembler is not yet ready to translate instructions, it must examine each instruction sufficiently to determine the length of the corresponding machine instruction and therefore how much to increment the LC. This may require not only examining the opcode but also looking at the operands and the addressing modes. Directives such as DQ and REST (see Table B.2) cause the location counter to be adjusted according to how much storage is specified. When the assembler encounters a statement with a label, it places the label into the symbol table, along with the current LC value. The assembler continues until it has read all of the assembly language statements.

SECOND PASS The second pass reads the program again from the beginning. Each instruction is translated into the appropriate binary machine code. Translation includes the following operations:

1. Translate the mnemonic into a binary opcode.
2. Use the opcode to determine the format of the instruction and the location and length of the various fields in the instruction.
3. Translate each operand name into the appropriate register or memory code.

Figure B.4 Flowchart of Two-Pass Assembler

4. Translate each immediate value into a binary string.
5. Translate any references to labels into the appropriate LC value using the symbol table.
6. Set any other bits in the instruction that are needed, including addressing mode indicators, condition code bits, and so on.

A simple example, using the ARM assembly language, is shown in Figure B.5. The ARM assembly language instruction ADDS r3, r3, #19 is translated into the binary machine instruction 1110 0010 1001 0011 0011 0000 0001 0011.

Figure B.5 Translating an ARM Assembly Instruction into a Binary Machine Instruction
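To make the symbol table bookkeeping concrete, here is a minimal C sketch (hypothetical structure and routine names; the text does not prescribe an implementation). Pass 1 would call the define routine for each labeled statement, and pass 2 would call the lookup routine to translate label operands into LC values:

    #include <string.h>

    #define MAX_SYMBOLS 1024

    /* One entry per label: its name and the location counter (LC) value
       in effect when the label was defined. */
    struct symbol {
        char         name[32];
        unsigned int lc;
    };

    static struct symbol symtab[MAX_SYMBOLS];
    static int nsymbols = 0;

    /* Pass 1: a statement carries a label; record it.
       (No table-overflow checking in this sketch.) */
    void pass1_define(const char *label, unsigned int lc)
    {
        strncpy(symtab[nsymbols].name, label, sizeof symtab[nsymbols].name - 1);
        symtab[nsymbols].name[sizeof symtab[nsymbols].name - 1] = '\0';
        symtab[nsymbols].lc = lc;
        nsymbols++;
    }

    /* Pass 2: translate a label operand into its LC value; returns 0 if undefined. */
    int pass2_lookup(const char *label, unsigned int *lc_out)
    {
        for (int i = 0; i < nsymbols; i++) {
            if (strcmp(symtab[i].name, label) == 0) {
                *lc_out = symtab[i].lc;
                return 1;
            }
        }
        return 0;   /* undefined symbol: an assembly error */
    }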

ZEROTH PASS Most assembly languages include the ability to define macros. When macros are present, there is an additional pass that the assembler must make before the first pass. Typically, the assembly language requires that all macro definitions appear at the beginning of the program. The assembler begins this “zeroth pass” by reading all macro definitions. Once all the macros are recognized, the assembler goes through the source code and expands the macros with their associated parameters whenever a macro call is encountered. The macro processing pass generates a new version of the source code with all of the macro expansions in place and all of the macro definitions removed.

One-Pass Assembler It is possible to implement an assembler that makes only a single pass through the source code (not counting the macro processing pass). The main difficulty in trying to assemble a program in one pass involves forward references to labels. Instruction operands may be symbols that have not yet been defined in the source program. Therefore, the assembler does not know what relative address to insert in the translated instruction. In essence, the process of resolving forward references works as follows. When the assembler encounters an instruction operand that is a symbol that is not yet defined, the assembler does the following:
1. It leaves the instruction operand field empty (all zeros) in the assembled binary instruction.
2. The symbol used as an operand is entered in the symbol table. The table entry is flagged to indicate that the symbol is undefined.
3. The address of the operand field in the instruction that refers to the undefined symbol is added to a list of forward references associated with the symbol table entry.
When the symbol definition is encountered, so that an LC value can be associated with it, the assembler inserts the LC value in the appropriate entry in the symbol table. If there is a forward reference list associated with the symbol, then the assembler inserts the proper address into any instruction previously generated that is on the forward reference list.

Example: Prime Number Program We now look at an example that includes directives. This example looks at a program that finds prime numbers. Recall that prime numbers are evenly divisible by only 1 and themselves.

    unsigned guess;     /* current guess for prime */
    unsigned factor;    /* possible factor of guess */
    unsigned limit;     /* find primes up to this value */

    printf("Find primes up to: ");
    scanf("%u", &limit);
    printf("2\n");      /* treat first two primes as */
    printf("3\n");      /* special case */
    guess = 5;          /* initial guess */
    while (guess <= limit) {
        /* look for a factor of guess */
        factor = 3;
        while (factor * factor < guess && guess % factor != 0)
            factor += 2;
        if (guess % factor != 0)
            printf("%d\n", guess);
        guess += 2;     /* only look at odd numbers */
    }

Figure B.6 C Program for Testing Primality

There is no formula for doing this. The basic method this program uses is to find the factors of all odd numbers below a given limit. If no factor can be found for an odd number, it is prime. Figure B.6 shows the basic algorithm written in C. Figure B.7 shows the same algorithm written in NASM assembly language.

B.3 LOADING AND LINKING

The first step in the creation of an active process is to load a program into main memory and create a process image (Figure B.8). Figure B.9 depicts a scenario typical for most systems. The application consists of a number of compiled or assembled modules in object-code form. These are linked to resolve any references between modules. At the same time, references to library routines are resolved. The library routines themselves may be incorporated into the program or referenced as shared code that must be supplied by the operating system at run time. In this section, we summarize the key features of linkers and loaders. First, we discuss the concept of relocation. Then, for clarity in the presentation, we describe the loading task when a single program module is involved and no linking is required. We can then look at the linking and loading functions as a whole.

Relocation In a multiprogramming system, the available main memory is generally shared among a number of processes. Typically, it is not possible for the programmer to know in advance which other programs will be resident in main memory at the time of execution of his or her program. In addition, we would like to be able to swap active processes in and out of main memory to maximize processor utilization by providing a large pool of ready processes to execute. Once a program has been swapped out to disk, it would be quite limiting to declare that when it is next swapped back in, it must be placed in the same main memory region as before.

    %include "asm_io.inc"

    segment .data
    Message         db      "Find primes up to: ", 0

    segment .bss
    Limit           resd    1                 ; find primes up to this limit
    Guess           resd    1                 ; the current guess for prime

    segment .text
            global  _asm_main
    _asm_main:
            enter   0,0                       ; setup routine
            pusha

            mov     eax, Message
            call    print_string
            call    read_int                  ; scanf("%u", &limit);
            mov     [Limit], eax

            mov     eax, 2                    ; printf("2\n");
            call    print_int
            call    print_nl
            mov     eax, 3                    ; printf("3\n");
            call    print_int
            call    print_nl

            mov     dword [Guess], 5          ; Guess = 5;
    while_limit:                              ; while (Guess <= Limit)
            mov     eax, [Guess]
            cmp     eax, [Limit]
            jnbe    end_while_limit           ; use jnbe since numbers are unsigned

            mov     ebx, 3                    ; ebx is factor = 3;
    while_factor:
            mov     eax, ebx
            mul     eax                       ; edx:eax = eax*eax
            jo      end_while_factor          ; if answer won't fit in eax alone
            cmp     eax, [Guess]
            jnb     end_while_factor          ; if !(factor*factor < guess)
            mov     eax, [Guess]
            mov     edx, 0
            div     ebx                       ; edx = edx:eax % ebx
            cmp     edx, 0
            je      end_while_factor          ; if !(guess % factor != 0)
            add     ebx, 2                    ; factor += 2;
            jmp     while_factor
    end_while_factor:
            je      end_if                    ; if !(guess % factor != 0)
            mov     eax, [Guess]              ; printf("%u\n")
            call    print_int
            call    print_nl
    end_if:
            add     dword [Guess], 2          ; guess += 2
            jmp     while_limit
    end_while_limit:
            popa
            mov     eax, 0                    ; return back to C
            leave
            ret

Figure B.7 Assembly Program for Testing Primality

Figure B.8 The Loading Function

Instead, we may need to relocate the process to a different area of memory. Thus, we cannot know ahead of time where a program will be placed, and we must allow that the program may be moved about in main memory due to swapping.

Figure B.9 A Linking and Loading Scenario

Figure B.10 Addressing Requirements for a Process

These facts raise some technical concerns related to addressing, as illustrated in Figure B.10. The figure depicts a process image. For simplicity, let us assume that the process image occupies a contiguous region of main memory. Clearly, the operating system will need to know the location of process control information and of the execution stack, as well as the entry point to begin execution of the program for this process. Because the operating system is managing memory and is responsible for bringing this process into main memory, these addresses are easy to come by. In addition, however, the processor must deal with memory references within the program. Branch instructions contain an address to reference the instruction to be executed next. Data reference instructions contain the address of the byte or word of data referenced. Somehow, the processor hardware and operating system software must be able to translate the memory references found in the code of the program into actual physical memory addresses, reflecting the current location of the program in main memory.

Loading In Figure B.9, the loader places the load module in main memory starting at location x. In loading the program, the addressing requirement illustrated in Figure B.10 must be satisfied. In general, three approaches can be taken:
• Absolute loading
• Relocatable loading
• Dynamic run-time loading

ABSOLUTE LOADING An absolute loader requires that a given load module always be loaded into the same location in main memory. Thus, in the load module

presented to the loader, all address references must be to specific, or absolute, main memory addresses. For example, if x in Figure B.9 is location 1024, then the first word in a load module destined for that region of memory has address 1024.

The assignment of specific address values to memory references within a program can be done either by the programmer or at compile or assembly time (Table B.3a). There are several disadvantages to the former approach. First, every programmer would have to know the intended assignment strategy for placing modules into main memory. Second, if any modifications are made to the program that involve insertions or deletions in the body of the module, then all of the addresses will have to be altered. Accordingly, it is preferable to allow memory references within programs to be expressed symbolically and then resolve those symbolic references at the time of compilation or assembly. This is illustrated in Figure B.11. Every reference to an instruction or item of data is initially represented by a symbol. In preparing the module for input to an absolute loader, the assembler or compiler will convert all of these references to specific addresses (in this example, for a module to be loaded starting at location 1024), as shown in Figure B.11b.

Table B.3 Address Binding

(a) Loader

Binding Time               Function
Programming time           All actual physical addresses are directly specified by the programmer in the program itself.
Compile or assembly time   The program contains symbolic address references, and these are converted to actual physical addresses by the compiler or assembler.
Load time                  The compiler or assembler produces relative addresses. The loader translates these to absolute addresses at the time of program loading.
Run time                   The loaded program retains relative addresses. These are converted dynamically to absolute addresses by processor hardware.

(b) Linker

Linkage Time               Function
Programming time           No external program or data references are allowed. The programmer must place into the program the source code for all subprograms that are referenced.
Compile or assembly time   The assembler must fetch the source code of every subroutine that is referenced and assemble them as a unit.
Load module creation       All object modules have been assembled using relative addresses. These modules are linked together and all references are restated relative to the origin of the final load module.
Load time                  External references are not resolved until the load module is to be loaded into main memory. At that time, referenced dynamic link modules are appended to the load module, and the entire package is loaded into main or virtual memory.
Run time                   External references are not resolved until the external call is executed by the processor. At that time, the process is interrupted and the desired module is linked to the calling program.

Figure B.11 Absolute and Relocatable Load Modules: (a) Object module (symbolic addresses); (b) Absolute load module (absolute addresses); (c) Relative load module (relative addresses); (d) Relative load module loaded into main memory starting at location x

RELOCATABLE LOADING The disadvantage of binding memory references to specific addresses prior to loading is that the resulting load module can only be placed in one region of main memory. However, when many programs share main memory, it may not be desirable to decide ahead of time into which region of memory a particular module should be loaded. It is better to make that decision at load time. Thus we need a load module that can be located anywhere in main memory.

To satisfy this new requirement, the assembler or compiler produces not actual main memory addresses (absolute addresses) but addresses that are relative to some known point, such as the start of the program. This technique is illustrated in Figure B.11c. The start of the load module is assigned the relative address 0, and all other memory references within the module are expressed relative to the beginning of the module.

With all memory references expressed in relative format, it becomes a simple task for the loader to place the module in the desired location. If the module is to be loaded beginning at location x, then the loader must simply add x to each memory reference as it loads the module into memory. To assist in this task, the load module must include information that tells the loader where the address references are and how they are to be interpreted (usually relative to the program origin, but also possibly relative to some other point in the program, such as the current location). This set of information is prepared by the compiler or assembler and is usually referred to as the relocation dictionary.
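As an illustration of this step, the following C sketch (hypothetical structures and field names, not part of the text) applies a relocation dictionary by adding the load address x to every recorded address field as the module is placed in memory. It assumes, for simplicity, 4-byte address fields stored in native byte order:

    #include <stdint.h>
    #include <string.h>

    /* A load module in relative form: its contents plus a relocation dictionary
       listing every offset that holds an address relative to the module origin. */
    struct load_module {
        uint8_t  *image;          /* module contents, addresses relative to 0 */
        size_t    length;
        size_t   *reloc_offsets;  /* offsets of address fields needing adjustment */
        size_t    reloc_count;
    };

    /* Copy the module to main memory at address x and fix up each reference. */
    void relocatable_load(const struct load_module *m, uint8_t *memory, uint32_t x)
    {
        memcpy(memory + x, m->image, m->length);
        for (size_t i = 0; i < m->reloc_count; i++) {
            uint8_t *field = memory + x + m->reloc_offsets[i];
            uint32_t ref;
            memcpy(&ref, field, sizeof ref);   /* relative address from the assembler */
            ref += x;                          /* now an absolute main memory address */
            memcpy(field, &ref, sizeof ref);
        }
    }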

DYNAMIC RUN-TIME LOADING Relocatable loaders are common and provide obvious benefits relative to absolute loaders. However, in a multiprogramming environment, even one that does not depend on virtual memory, the relocatable loading scheme is inadequate. We have referred to the need to swap process images in and out of main memory to maximize the utilization of the processor. To maximize main memory utilization, we would like to be able to swap the process image back into different locations at different times. Thus, a program, once loaded, may be swapped out to disk and then swapped back in at a different location. This would be impossible if memory references had been bound to absolute addresses at the initial load time.

The alternative is to defer the calculation of an absolute address until it is actually needed at run time. For this purpose, the load module is loaded into main memory with all memory references in relative form (Figure B.11c). It is not until an instruction is actually executed that the absolute address is calculated. To assure that this function does not degrade performance, it must be done by special processor hardware rather than software. This hardware is described in Chapter 8.

Dynamic address calculation provides complete flexibility. A program can be loaded into any region of main memory. Subsequently, the execution of the program can be interrupted and the program can be swapped out of main memory, to be later swapped back in at a different location.
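The run-time calculation itself is simple. In C-like terms, and only as a sketch of what base-register hardware does (the names below are hypothetical; the actual mechanism is the processor hardware described in Chapter 8):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Per-process values loaded by the operating system when the process is dispatched. */
    struct mapping {
        uint32_t base;    /* where the process image currently starts */
        uint32_t limit;   /* size of the image, used for protection */
    };

    /* Every relative address in an executing instruction is translated on the fly. */
    uint32_t translate(const struct mapping *map, uint32_t relative_addr)
    {
        if (relative_addr >= map->limit) {
            fprintf(stderr, "addressing exception\n");
            exit(EXIT_FAILURE);
        }
        return map->base + relative_addr;
    }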

Linking The function of a linker is to take as input a collection of object modules and produce a load module, consisting of an integrated set of program and data modules, to be passed to the loader. In each object module, there may be address references to locations in other modules. Each such reference can only be expressed symbolically in an unlinked object module. The linker creates a single load module that is the contiguous joining of all of the object modules. Each intermodule reference must be changed from a symbolic address to a reference to a location within the overall load module. For example, module A in Figure B.12a contains a procedure invocation of module B. When these modules are combined in the load module, this symbolic reference to module B is changed to a specific reference to the location of the entry point of B within the load module.

LINKAGE EDITOR The nature of this address linkage will depend on the type of load module to be created and when the linkage occurs (Table B.3b). If, as is usually the case, a relocatable load module is desired, then linkage is usually done in the following fashion. Each compiled or assembled object module is created with references relative to the beginning of the object module. All of these modules are put together into a single relocatable load module with all references relative to the origin of the load module. This module can be used as input for relocatable loading or dynamic run-time loading. A linker that produces a relocatable load module is often referred to as a linkage editor. Figure B.12 illustrates the linkage editor function.

DYNAMIC LINKER As with loading, it is possible to defer some linkage functions. The term dynamic linking is used to refer to the practice of deferring the linkage of some external modules until after the load module has been created.

Figure B.12 The Linking Function: (a) Object modules; (b) Load module

Thus, the load module contains unresolved references to other programs. These references can be resolved either at load time or run time.

For load-time dynamic linking (involving the upper dynamic library in Figure B.9), the following steps occur. The load module (application module) to be loaded is read into memory. Any reference to an external module (target module) causes the loader to find the target module, load it, and alter the reference to a relative address in memory from the beginning of the application module. There are several advantages to this approach over what might be called static linking:
• It becomes easier to incorporate changed or upgraded versions of the target module, which may be an operating system utility or some other general-purpose routine. With static linking, a change to such a supporting module would require the relinking of the entire application module. Not only is this inefficient, but it may be impossible in some circumstances. For example, in the personal computer field, most commercial software is released in load module form; source and object versions are not released.
• Having target code in a dynamic link file paves the way for automatic code sharing. The operating system can recognize that more than one application is using the same target code because it loaded and linked that code. It can use that information to load a single copy of the target code and link it to both applications, rather than having to load one copy for each application.

• It becomes easier for independent software developers to extend the functionality of a widely used operating system such as Linux. A developer can come up with a new function that may be useful to a variety of applications and package it as a dynamic link module.

With run-time dynamic linking (involving the lower dynamic library in Figure B.9), some of the linking is postponed until execution time. External references to target modules remain in the loaded program. When a call is made to the absent module, the operating system locates the module, loads it, and links it to the calling module. Such modules are typically shareable. In the Windows environment, these are called dynamic-link libraries (DLLs). Thus, if one process is already making use of a dynamically linked shared module, then that module is in main memory and a new process can simply link to the already-loaded module.

The use of DLLs can lead to a problem commonly referred to as DLL hell. DLL hell occurs if two or more processes are sharing a DLL module but expect different versions of the module. For example, an application or system function might be re-installed and bring in with it an older version of a DLL file.

We have seen that dynamic loading allows an entire load module to be moved around; however, the structure of the module is static, being unchanged throughout the execution of the process and from one execution to the next. However, in some cases, it is not possible to determine prior to execution which object modules will be required. This situation is typified by transaction-processing applications, such as an airline reservation system or a banking application. The nature of the transaction dictates which program modules are required, and they are loaded as appropriate and linked with the main program. The advantage of the use of such a dynamic linker is that it is not necessary to allocate memory for program units unless those units are referenced. This capability is used in support of segmentation systems.

One additional refinement is possible: An application need not know the names of all the modules or entry points that may be called. For example, a charting program may be written to work with a variety of plotters, each of which is driven by a different driver package. The application can learn the name of the plotter that is currently installed on the system from another process or by looking it up in a configuration file. This allows the user of the application to install a new plotter that did not exist at the time the application was written.
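To summarize the linkage editor's address arithmetic described earlier in this section, here is a minimal C sketch (hypothetical structure and function names, a sketch rather than a definitive implementation). Each object module is assigned an origin within the combined load module; every reference into that module is then restated as the module's origin plus the offset within the module, as Table B.3b describes for load module creation:

    #include <stdint.h>

    /* Each compiled or assembled object module is created with references
       relative to its own origin. */
    struct object_module {
        const char *name;
        uint32_t    length;     /* size of the module's code and data */
    };

    /* Lay the modules out contiguously and record where each one begins.
       Returns the total length of the resulting load module. */
    uint32_t layout_load_module(const struct object_module *mods, int n, uint32_t *origin)
    {
        uint32_t lc = 0;
        for (int i = 0; i < n; i++) {
            origin[i] = lc;     /* module i starts here in the load module */
            lc += mods[i].length;
        }
        return lc;
    }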

B.4 RECOMMENDED READING

[SALO93] covers the design and implementation of assemblers and loaders. The topics of linking and loading are covered in many books on program development, computer architecture, and operating systems. A particularly detailed treatment is [BECK97]. [CLAR98] also contains a good discussion. A thorough practical discussion of this topic, with numerous OS examples, is [LEVI00]. [BART03] is an excellent treatment for learning assembly language for x86 processors; suitable for self-study. [CART06] covers assembly language for x86 machines. For the serious x86 programmer, [FOG08a] is highly useful. [KNAG04] is a thorough treatment of ARM assembly language.

BART03 Bartlett, J. Programming from the Ground Up. 2003. Available at this book's Premium Content site.
BECK97 Beck, L. System Software. Reading, MA: Addison-Wesley, 1997.
CART06 Carter, P. PC Assembly Language. July 23, 2006. Available at this book's Web site.
CLAR98 Clarke, D., and Merusi, D. System Software Programming: The Way Things Work. Upper Saddle River, NJ: Prentice Hall, 1998.
FOG08a Fog, A. Optimizing Subroutines in Assembly Language: An Optimization Guide for x86 Platforms. Copenhagen University College of Engineering, 2008. http://www.agner.org/optimize/
KNAG04 Knaggs, P., and Welsh, S. ARM: Assembly Language Programming. Bournemouth University School of Design, Engineering & Computing. August 31, 2004. Available at this book's Premium Content site.
LEVI00 Levine, J. Linkers and Loaders. San Francisco: Morgan Kaufmann, 2000.
SALO93 Salomon, D. Assemblers and Loaders. Ellis Horwood Ltd, 1993. Available at this book's Premium Content site.

B.5 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

assembler, assembly language, comment, directive, dynamic linker, instruction, label, linkage editor, linking, load-time dynamic linking, loading, macro, mnemonic, one-pass assembler, operand, relocation, run-time dynamic linking, two-pass assembler

Review Questions
B.1 List some reasons why it is worthwhile to study assembly language programming.
B.2 What is an assembly language?
B.3 List some disadvantages of assembly language compared to high-level languages.
B.4 List some advantages of assembly language compared to high-level languages.
B.5 What are the typical elements of an assembly language statement?
B.6 List and briefly define four different kinds of assembly language statements.
B.7 What is the difference between a one-pass assembler and a two-pass assembler?

Problems
B.1 Core War is a programming game introduced to the public in the early 1980s [DEWD84], which was popular for a period of 15 years or so. Core War has four main components: a memory array of 8000 addresses, a simplified assembly language Redcode, an executive program called MARS (an acronym for Memory Array Redcode

Simulator) and the set of contending battle programs. Two battle programs are entered into the memory array at randomly chosen positions; neither program knows where the other one is. MARS executes the programs in a simple version of time-sharing. The two programs take turns: a single instruction of the first program is executed, then a single instruction of the second, and so on. What a battle program does during the execution cycles allotted to it is entirely up to the programmer. The aim is to destroy the other program by ruining its instructions. In this problem and the next several, we use an even simpler language, called CodeBlue, to explore some Core War concepts.
     CodeBlue contains only five assembly language statements and uses three addressing modes (Table B.4). Addresses wrap around, so that for the last location in memory, the relative address of +1 refers to the first location in memory. For example, ADD #4, 6 adds 4 to the contents of relative location 6 and stores the results in location 6; JUMP @5 transfers execution to the memory address contained in the location five slots past the location of the current JUMP instruction.
     a. The program Imp is the single instruction COPY 0, 1. What does it do?
     b. The program Dwarf is the following sequence of instructions:
          ADD #4, 3
          COPY 2, @2
          JUMP -2
          DATA 0
        What does it do?
     c. Rewrite Dwarf using symbols, so that it looks more like a typical assembly language program.
B.2 What happens if we pit Imp against Dwarf?
B.3 Write a "carpet bombing" program in CodeBlue that zeros out all of memory (with the possible exception of the program locations).
B.4 How would the following program fare against Imp?

Table B.4 CodeBlue Assembly Language

(a) Instruction Set

Format         Meaning
DATA           set at current location
COPY A, B      copies source A to destination B
ADD A, B       adds A to B, putting result in B
JUMP A         transfer execution to A
JUMPZ A, B     if B = 0, transfer to A

(b) Addressing Modes

Mode       Format                 Meaning
Literal    # followed by value    This is an immediate mode; the operand value is in the instruction.
Relative   Value                  The value represents an offset from the current location, which contains the operand.
Indirect   @ followed by value    The value represents an offset from the current location; the offset location contains the relative address of the location that contains the operand.

          Loop  COPY #0, -1
                JUMP -1
     Hint: Remember that instruction execution alternates between the two opposing programs.
B.5 a. What is the value of the C status flag after the following sequence:
          mov al, 3
          add al, 4
     b. What is the value of the C status flag after the following sequence:
          mov al, 3
          sub al, 4
B.6 Consider the following NASM instruction:
          cmp vleft, vright
     For signed integers, there are three status flags that are relevant. If vleft = vright, then ZF is set. If vleft > vright, ZF is unset (set to 0) and SF = OF. If vleft < vright, ZF is unset and SF ≠ OF. Why does SF = OF if vleft > vright?
B.7 Consider the following NASM code fragment:
          mov al, 0
          cmp al, al
          je next
     Write an equivalent program consisting of a single instruction.
B.8 Consider the following C program:
          /* a simple C program to average 3 integers */
          main ()
          {
              int avg;
              int i1 = 20;
              int i2 = 13;
              int i3 = 82;
              avg = (i1 + i2 + i3)/3;
          }
     Write an NASM version of this program.
B.9 Consider the following C code fragment:
          if (EAX == 0)
              EBX = 1;
          else
              EBX = 2;
     Write an equivalent NASM code fragment.
B.10 The initialize data directives can be used to initialize multiple locations. For example,
          db 0x55,0x56,0x57
     reserves three bytes and initializes their values. NASM supports the special token $ to allow calculations to involve the current assembly position. That is, $ evaluates to the assembly position at the beginning of the line containing the expression. With the preceding two facts in mind, consider the following sequence of directives:

          message  db  'hello, world'
          msglen   equ $-message
     What value is assigned to the symbol msglen?
B.11 Assume the three symbolic variables V1, V2, V3 contain integer values. Write an NASM code fragment that moves the smallest value into integer ax. Use only the instructions mov, cmp, and jbe.
B.12 Describe the effect of this instruction:
          cmp eax, 1
     Assume that the immediately preceding instruction updated the contents of eax.
B.13 The xchg instruction can be used to exchange the contents of two registers. Suppose that the x86 instruction set did not support this instruction.
     a. Implement xchg ax, bx using only push and pop instructions.
     b. Implement xchg ax, bx using only the xor instruction (do not involve other registers).
B.14 In the following program, assume that a, b, x, y are symbols for main memory locations. What does the program do? You can answer the question by writing the equivalent logic in C.
          mov eax,a
          mov ebx,b
          xor eax,x
          xor ebx,y
          or  eax,ebx
          jnz L2
     L1:  ;sequence of instructions ...
          jmp L3
     L2:  ;another sequence of instructions ...
     L3:

B.15 Section B.1 includes a C program that calculates the greatest common divisor of two integers.
     a. Describe the algorithm in words and show how the program does implement the Euclid algorithm approach to calculating the greatest common divisor.
     b. Add comments to the assembly program of Figure B.3a to clarify that it implements the same logic as the C program.
     c. Repeat part (b) for the program of Figure B.3b.
B.16 a. A two-pass assembler can handle future symbols, and an instruction can therefore use a future symbol as an operand. This is not always true for directives. The EQU directive, for example, cannot use a future symbol. The directive "A EQU B+1" is easy to execute if B is previously defined, but impossible if B is a future symbol. What is the reason for this?
     b. Suggest a way for the assembler to eliminate this limitation such that any source line could use future symbols.
B.17 Consider a symbol directive MAX of the following form:
          symbol MAX list of expressions
     The label is mandatory and is assigned the value of the largest expression in the operand field. Example:
          MSGLEN MAX A, B, C   ;where A, B, C are defined symbols
     How is MAX executed by the assembler, and in what pass?

GLOSSARY

absolute address An address in a computer language that identifies a storage location or a device without the use of any intermediate reference. accumulator The name of the CPU register in a single-address instruction format. The accumulator, or AC, is implicitly one of the two operands for the instruction. address bus That portion of a system bus used for the transfer of an address. Typically, the address identifies a main memory location or an I/O device. address space The range of addresses (memory, I/O) that can be referenced. arithmetic and logic unit (ALU) A part of a computer that performs arithmetic operations, logic operations, and related operations. ASCII American Standard Code for Information Interchange. ASCII is a 7-bit code used to repre- sent numeric, alphabetic, and special printable characters. It also includes codes for control characters, which are not printed or displayed but specify some control function. assembly language A computer-oriented language whose instructions are usually in one-to-one correspondence with computer instructions and that may provide facilities such as the use of macro- instructions. Synonymous with computer-dependent language. associative memory A memory whose storage locations are identified by their contents, or by a part of their contents, rather than by their names or positions. asynchronous timing A technique in which the occurrence of one event on a bus follows and depends on the occurrence of a previous event. autoindexing A form of indexed addressing in which the index register is automatically incremented or decremented with each memory reference. base In the numeration system commonly used in scientific papers, the number that is raised to the power denoted by the exponent and then multiplied by the mantissa to determine the real number represented (e.g., the number 10 in the expression 2.7= 102 = 270). base address A numeric value that is used as a reference in the calculation of addresses in the execu- tion of a computer program. binary operator An operator that represents an operation on two and only two operands. bit In the pure binary numeration system, either of the digits 0 and 1. block multiplexor channel A multiplexer channel that interleaves blocks of data. See also byte multi- plexor channel. Contrast with selector channel. branch prediction A mechanism used by the processor to predict the outcome of a program branch prior to its execution. buffer Storage used to compensate for a difference in rate of flow of data, or time of occurrence of events, when transferring data from one device to another. 723 724 GLOSSARY bus A shared communications path consisting of one or a collection of lines. In some computer sys- tems, CPU, memory, and I/O components are connected by a common bus. Since the lines are shared by all components, only one component at a time can successfully transmit. bus arbitration The process of determining which competing bus master will be permitted access to the bus. bus master A device attached to a bus that is capable of initiating and controlling communication on the bus. byte A sequence of eight bits. Also referred to as an octet. byte multiplexor channel A multiplexer channel that interleaves bytes of data. See also block multi- plexor channel. Contrast with selector channel. cache A relatively small fast memory interposed between a larger, slower memory and the logic that accesses the larger memory. 
The cache holds recently accessed data, and is designed to speed up subsequent access to the same data. cache coherence protocol A mechanism to maintain data validity among multiple caches so that every data access will always acquire the most recent version of the contents of a main memory word. cache line A block of data associated with a cache tag and the unit of transfer between cache and memory. cache memory A special buffer storage, smaller and faster than main storage, that is used to hold a copy of instructions and data in main storage that are likely to be needed next by the processor and that have been obtained automatically from main storage.

CD-ROM Compact Disk Read-Only Memory. A nonerasable disk used for storing computer data. The standard system uses 12-cm disks and can hold more than 550 Mbytes. central processing unit (CPU) That portion of a computer that fetches and executes instructions. It consists of an Arithmetic and Logic Unit (ALU), a control unit, and registers. Often simply referred to as a processor. cluster A group of interconnected, whole computers working together as a unified computing resource that can create the illusion of being one machine. The term whole computer means a system that can run on its own, apart from the cluster. combinational circuit A logic device whose output values, at any given instant, depend only upon the input values at that time. A combinational circuit is a special case of a sequential circuit that does not have a storage capability. Synonymous with combinatorial circuit. compact disk (CD) A nonerasable disk that stores digitized audio information. computer architecture Those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical execution of a program. Examples of archi- tectural attributes include the instruction set, the number of bits used to represent various data types (e.g., numbers, characters), I/O mechanisms, and techniques for addressing memory. computer instruction An instruction that can be recognized by the processing unit of the computer for which it is designed. Synonymous with machine instruction. GLOSSARY 725 computer instruction set A complete set of the operators of the instructions of a computer together with a description of the types of meanings that can be attributed to their operands. Synonymous with machine instruction set. computer organization Refers to the operational units and their interconnections that realize the architectural specifications. Organizational attributes include those hardware details transparent to the programmer, such as control signals; interfaces between the computer and peripherals; and the memory technology used. conditional jump A jump that takes place only when the instruction that specifies it is executed and specified conditions are satisfied. Contrast with unconditional jump. condition code A code that reflects the result of a previous operation (e.g., arithmetic). A CPU may include one or more condition codes, which may be stored separately within the CPU or as part of a larger control register. Also known as a flag. control bus That portion of a system bus used for the transfer of control signals. control registers CPU registers employed to control CPU operation. Most of these registers are not user visible. control storage A portion of storage that contains microcode. control unit That part of the CPU that controls CPU operations, including ALU operations, the movement of data within the CPU, and the exchange of data and control signals across external inter- faces (e.g., the system bus). daisy chain A method of device interconnection for determining interrupt priority by connecting the interrupt sources serially. data bus That portion of a system bus used for the transfer of data. Data transfer between devices. The term generally excludes I/O. decoder A device that has a number of input lines of which any number may carry signals and a number of output lines of which not more than one may carry a signal, there being a one-to-one cor- respondence between the outputs and the combinations of input signals. 
demand paging The transfer of a page from auxiliary storage to real storage at the moment of need. direct access The capability to obtain data from a storage device or to enter data into a storage device in a sequence independent of their relative position, by means of addresses that indicate the physical location of the data. direct address An address that designates the storage location of an item of data to be treated as operand. Synonymous with one-level address. direct memory access (DMA) A form of I/O in which a special module, called a DMA module, controls the exchange of data between main memory and an I/O module. The CPU sends a request for the transfer of a block of data to the DMA module and is interrupted only after the entire block has been transferred. disabled interrupt A condition, usually created by the CPU, during which the CPU will ignore inter- rupt request signals of a specified class. diskette A flexible magnetic disk enclosed in a protective container. Synonymous with flexible disk. 726 GLOSSARY An assembly of magnetic disks that can be removed as a whole from a disk drive, together with a container from which the assembly must be separated when operating. disk stripping A type of disk array mapping in which logically contiguous blocks of data, or strips, are mapped round-robin to consecutive array members. A set of logically consecutive strips that maps exactly one strip to each array member is referred to as a stripe. dynamic RAM A RAM whose cells are implemented using capacitors. A dynamic RAM will gradu- ally lose its data unless it is periodically refreshed. emulation The imitation of all or part of one system by another, primarily by hardware, so that the imitating system accepts the same data, executes the same programs, and achieves the same results as the imitated system. enabled interrupt A condition, usually created by the CPU, during which the CPU will respond to interrupt request signals of a specified class. erasable optical disk A disk that uses optical technology but that can be easily erased and rewritten. Both 3.25-inch and 5.25-inch disks are in use. A typical capacity is 650 Mbytes. error-correcting code A code in which each character or signal conforms to specific rules of construc- tion so that deviations from these rules indicate the presence of an error and in which some or all of the detected errors can be corrected automatically. error-detecting code A code in which each character or signal conforms to specific rules of construc- tion so that deviations from these rules indicate the presence of an error. execute cycle That portion of the instruction cycle during which the CPU performs the operation specified by the instruction opcode. fetch cycle That portion of the instruction cycle during which the CPU fetches from memory the instruction to be executed.

firmware Microcode stored in read-only memory.

fixed-point representation system A radix numeration system in which the radix point is implicitly fixed in the series of digit places by some convention upon which agreement has been reached.

flip-flop A circuit or device containing active elements, capable of assuming either one of two stable states at a given time. Synonymous with bistable circuit, toggle.

floating-point representation system A numeration system in which a real number is represented by a pair of distinct numerals, the real number being the product of the fixed-point part, one of the numerals, and a value obtained by raising the implicit floating-point base to a power denoted by the exponent in the floating-point representation, indicated by the second numeral.

G Prefix meaning 230. gate An electronic circuit that produces an output signal that is a simple Boolean operation on its input signals. general-purpose register A register, usually explicitly addressable, within a set of registers, that can be used for different purposes, for example, as an accumulator, as an index register, or as a special handler of data. GLOSSARY 727 global variable A variable defined in one portion of a computer program and used in at least one other portion of that computer program. high-performance computing (HPC) A research area dealing with supercomputers and the software that runs on supercomputers. The emphasis is on scientific applications, which may involve heavy use of vector and matrix computation, and parallel algorithms. immediate address The contents of an address part that contains the value of an operand rather than an address. Synonymous with zero-level address. indexed address An address that is modified by the content of an index register prior to or during the execution of a computer instruction. indexing A technique of address modification by means of index registers. index register A register whose contents can be used to modify an operand address during the execu- tion of computer instructions; it can also be used as a counter. An index register may be used to control the execution of a loop, to control the use of an array, as a switch, for table lookup, or as a pointer. indirect address An address of a storage location that contains an address. indirect cycle That portion of the instruction cycle during which the CPU performs a memory access to convert an indirect address into a direct address. input-output (I/O) Pertaining to either input or output, or both. Refers to the movement of data between a computer and a directly attached peripheral. instruction address register A special-purpose register used to hold the address of the next instruction to be executed. instruction cycle The processing performed by a CPU to execute a single instruction. instruction format The layout of a computer instruction as a sequence of bits. The format divides the instruction into fields, corresponding to the constituent elements of the instruction (e.g., opcode, operands). instruction issue The process of initiating instruction execution in the processor’s functional units. This occurs when an instruction moves from the decode stage of the pipeline to the first execute stage of the pipeline instruction register A register that is used to hold an instruction for interpretation. integrated circuit (IC) A tiny piece of solid material, such as silicon, upon which is etched or imprinted a collection of electronic components and their interconnections. interrupt A suspension of a process, such as the execution of a computer program, caused by an event external to that process, and performed in such a way that the process can be resumed. Synonymous with interruption. interrupt cycle That portion of the instruction cycle during which the CPU checks for interrupts. If an enabled interrupt is pending, the CPU saves the current program state and resumes processing at an interrupt-handler routine. interrupt-driven I/O A form of I/O. The CPU issues an I/O command, continues to execute subse- quent instructions, and is interrupted by the I/O module when the latter has completed its work. 728 GLOSSARY

I/O channel A relatively complex I/O module that relieves the CPU of the details of I/O operations. An I/O channel will execute a sequence of I/O commands from main memory without the need for CPU involvement. I/O controller A relatively simple I/O module that requires detailed control from the CPU or an I/O channel. Synonymous with device controller. I/O module One of the major component types of a computer. It is responsible for the control of one or more external devices (peripherals) and for the exchange of data between those devices and main memory and/or CPU registers. I/O processor An I/O module with its own processor, capable of executing its own specialized I/O instructions or, in some cases, general-purpose machine instructions. isolated I/O A method of addressing I/O modules and external devices. The I/O address space is treated separately from main memory address space. Specific I/O machine instructions must be used. Compare memory-mapped I/O. k Prefix meaning 2^10 = 1024. Thus, 2 kb = 2048 bits. local variable A variable that is defined and used only in one specified portion of a computer program. locality of reference The tendency of a processor to access the same set of memory locations repetitively over a short period of time.

M Prefix meaning 220 = 1,048,576. Thus, 2 Mb = 2,097,152 bits. magnetic disk A flat circular plate with a magnetizable surface layer, on one or both sides of which data can be stored. magnetic tape A tape with a magnetizable surface layer on which data can be stored by magnetic recording. mainframe A term originally referring to the cabinet containing the central processor unit or “main frame” of a large batch machine. After the emergence of smaller minicomputer designs in the early 1970s, the traditional larger machines were described as mainframe computers, mainframes. Typical characteristics of a mainframe are that it supports a large database, has elaborate I/O hardware, and is used in a central data processing facility. main memory Program-addressable storage from which instructions and other data can be loaded directly into registers for subsequent execution or processing. memory address register (MAR) A register, in a processing unit, that contains the address of the storage location being accessed. memory buffer register (MBR) A register that contains data read from memory or data to be written to memory. memory cycle time The inverse of the rate at which memory can be accessed. It is the minimum time between the response to one access request (read or write) and the response to the next access request. memory-mapped I/O A method of addressing I/O modules and external devices. A single address space is used for both main memory and I/O addresses, and the same machine instructions are used both for memory read/write and for I/O. GLOSSARY 729 microcomputer A computer system whose processing unit is a microprocessor. A basic microcomputer includes a microprocessor, storage, and an input/output facility, which may or may not be on one chip. microinstruction An instruction that controls data flow and sequencing in a processor at a more fun- damental level than machine instructions. Individual machine instructions and perhaps other functions may be implemented by microprograms. micro-operation An elementary CPU operation, performed during one clock pulse. microprocessor A processor whose elements have been miniaturized into one or a few integrated circuits. microprogram A sequence of microinstructions that are in special storage where they can be dynami- cally accessed to perform various functions. microprogrammed CPU A CPU whose control unit is implemented using microprogramming. microprogramming language An instruction set used to specify microprograms. multiplexer A combinational circuit that connects multiple inputs to a single output. At any time, only one of the inputs is selected to be passed to the output. multiplexor channel A channel designed to operate with a number of I/O devices simultaneously. Several I/O devices can transfer records at the same time by interleaving items of data. See also byte multiplexor channel, block multiplexor channel. multiprocessor A computer that has two or more processors that have common access to a main storage. multiprogramming A mode of operation that provides for the interleaved execution of two or more computer programs by a single processor. multitasking A mode of operation that provides for the concurrent performance or interleaved ex- ecution of two or more computer tasks. The same as multiprogramming, using different terminology. nonuniform memory access (NUMA) multiprocessor A shared-memory multiprocessor in which the access time from a given processor to a word in memory varies with the location of the memory word. 
nonvolatile memory Memory whose contents are stable and do not require a constant power source. nucleus That portion of an operating system that contains its basic and most frequently used func- tions. Often, the nucleus remains resident in main memory. ones complement representation Used to represent binary integers. A positive integer is represented as in sign magnitude. A negative integer is represented by reversing each bit in the representation of a positive integer of the same magnitude. opcode Abbreviated form for operation code. operand An entity on which an operation is performed. operating system Software that controls the execution of programs and that provides services such as resource allocation, scheduling, input/output control, and . operation code A code used to represent the operations of a computer. Usually abbreviated to opcode. 730 GLOSSARY orthogonality A principle by which two variables or dimensions are independent of one another. In the context of an instruction set, the term is generally used to indicate that other elements of an instruction (address mode, number of operands, length of operand) are independent of (not deter- mined by) opcode. page In a virtual storage system, a fixed-length block that has a virtual address and that is transferred as a unit between real storage and auxiliary storage. page fault Occurs when the page containing a referenced word is not in main memory. This causes an interrupt and requires the operating system to bring in the needed page. page frame An area of main storage used to hold a page. parity Bit A binary digit appended to a group of binary digits to make the sum of all the digits either always odd (odd parity) or always even (even parity). peripheral equipment In a computer system, with respect to a particular processing unit, any equip- ment that provides the processing unit with outside communication. Synonymous with peripheral device. pipeline A processor organization in which the processor consists of a number of stages, allowing multiple instructions to be executed concurrently. predicated execution A mechanism that supports the conditional execution of individual instructions. This makes it possible to execute speculatively both branches of a branch instruction and retain the results of the branch that is ultimately taken. process A program in execution. A process is controlled and scheduled by the operating system. process control block The manifestation of a process in an operating system. It is a data structure containing information about the characteristics and state of the process. processor In a computer, a functional unit that interprets and executes instructions. A processor con- sists of at least an instruction control unit and an arithmetic unit. processor cycle time The time required for the shortest well-defined CPU micro-operation. It is the basic unit of time for measuring all CPU actions. Synonymous with machine cycle time. program counter Instruction address register. programmable logic array (PLA) An array of gates whose interconnections can be programmed to perform a specific logical function. programmable read-only memory (PROM) Semiconductor memory whose contents may be set only once. The writing process is performed electrically and may be performed by the user at a time later than original chip fabrication. programmed I/O A form of I/O in which the CPU issues an I/O command to an I/O module and must then wait for the operation to be complete before proceeding. 
program status word (PSW) An area in storage used to indicate the order in which instructions are executed, and to hold and indicate the status of the computer system. Synonymous with processor status word. random-access memory (RAM) Memory in which each addressable location has a unique addressing mechanism. The time to access a given location is independent of the sequence of prior access. GLOSSARY 731 read-only memory (ROM) Semiconductor memory whose contents cannot be altered, except by destroying the storage unit. Nonerasable memory. redundant array of independent disks (RAID) A disk array in which part of the physical storage capacity is used to store redundant information about user data stored on the remainder of the stor- age capacity. The redundant information enables regeneration of user data in the event that one of the array’s member disks or the access path to it fails. registers High-speed memory internal to the CPU. Some registers are user visible; that is, available to the programmer via the machine instruction set. Other registers are used only by the CPU, for control purposes. scalar A quantity characterized by a single value. secondary memory Memory located outside the computer system itself; that is, it cannot be processed directly by the processor. It must first be copied into main memory. Examples include disk and tape. selector channel An I/O channel designed to operate with only one I/O device at a time. Once the I/O device is selected, a complete record is transferred one byte at a time. Contrast with block multiplexor channel, multiplexor channel. semiconductor A solid crystalline substance, such as silicon or germanium, whose electrical con- ductivity is intermediate between insulators and good conductors. Used to fabricate transistors and solid-state components. sequential circuit A digital logic circuit whose output depends on the current input plus the state of the circuit. Sequential circuits thus possess the attribute of memory. sign–magnitude representation Used to represent binary integers. In an N-bit word, the leftmost bit is the sign (0 = positive, 1 = negative) and the remaining N - 1 bits comprise the magnitude of the number. solid-state component A component whose operation depends on the control of electric or magnetic phenomena in solids (e.g., transistor crystal diode, ferrite core). speculative execution The execution of instructions along one path of a branch. If it later turns out that this branch was not taken, then the results of the speculative execution are discarded. stack An ordered list in which items are appended to and deleted from the same end of the list, known as the top. That is, the next item appended to the list is put on the top, and the next item to be removed from the list is the item that has been in the list the shortest time. This method is characterized as last-in-first-out. static RAM A RAM whose cells are implemented using flip-flops. A static RAM will hold its data as long as power is supplied to it; no periodic refresh is required. superpipelined processor A processor design in which the instruction pipeline consists of many very small stages, so that more than one pipeline stage can be executed during one clock cycle and so that a large number of instructions may be in the pipeline at the same time. superscalar processor A processor design that includes multiple-instruction pipelines, so that more than one instruction can be executing in the same pipeline stage simultaneously. 
symmetric multiprocessing (SMP) A form of multiprocessing that allows the operating system to execute on any available processor or on several available processors simultaneously.
synchronous timing A technique in which the occurrence of events on a bus is determined by a clock. The clock defines equal-width time slots, and events begin only at the beginning of a time slot.
system bus A bus used to interconnect major computer components (CPU, memory, I/O).
truth table A table that describes a logic function by listing all possible combinations of input values and indicating, for each combination, the output value.
twos complement representation Used to represent binary integers. A positive integer is represented as in sign magnitude. A negative number is represented by taking the Boolean complement of each bit of the corresponding positive number, then adding 1 to the resulting bit pattern viewed as an unsigned integer.
unary operator An operator that represents an operation on one and only one operand.
unconditional jump A jump that takes place whenever the instruction that specified it is executed.
uniprocessing Sequential execution of instructions by a processing unit, or independent use of a processing unit in a multiprocessing system.
user-visible registers CPU registers that may be referenced by the programmer. The instruction-set format allows one or more registers to be specified as operands or addresses of operands.
vector A quantity usually characterized by an ordered set of scalars.
very long instruction word (VLIW) Refers to the use of instructions that contain multiple operations. In effect, multiple instructions are contained in a single word. Typically, a VLIW is constructed by the compiler, which places operations that may be executed in parallel in the same word.
virtual storage The storage space that may be regarded as addressable main storage by the user of a computer system in which virtual addresses are mapped into real addresses. The size of virtual storage is limited by the addressing scheme of the computer system and by the amount of auxiliary storage available, and not by the actual number of main storage locations.
volatile memory A memory in which a constant electrical power source is required to maintain the contents of memory. If the power is switched off, the stored information is lost.
word An ordered set of bytes or bits that is the normal unit in which information may be stored, transmitted, or operated on within a given computer. Typically, if a processor has a fixed-length instruction set, then the instruction length equals the word length.

REFERENCES

ABBREVIATIONS

ACM Association for Computing Machinery
IBM International Business Machines Corporation
IEEE Institute of Electrical and Electronics Engineers
ACOS86 Acosta, R.; Kjelstrup, J.; and Torng, H. “An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors.” IEEE Transactions on Computers, September 1986. ADAM91 Adamek, J. Foundations of Coding. New York: Wiley, 1991. AGAR89 Agarwal, A. Analysis of Cache Performance for Operating Systems and Multiprogramming. Boston: Kluwer Academic Publishers, 1989. AGER87 Agerwala, T., and Cocke, J. High Performance Reduced Instruction Set Processors. Technical Report RC12434 (#55845). Yorktown, NY: IBM Thomas J. Watson Research Center, January 1987. AMDA67 Amdahl, G. “Validity of the Single-Processor Approach to Achieving Large-Scale Computing Capability.” Proceedings of the AFIPS Conference, 1967. ANDE67a Anderson, D.; Sparacio, F.; and Tomasulo, R. “The IBM System/360 Model 91: Machine Philosophy and Instruction Handling.” IBM Journal of Research and Development, January 1967. ANDE67b Anderson, S., et al. “The IBM System/360 Model 91: Floating-Point Execution Unit.” IBM Journal of Research and Development, January 1967. Reprinted in [SWAR90, Volume 1]. ANTH08 Anthes, G. “What’s Next for the x86?” ComputerWorld, June 16, 2008. ARM08a ARM Limited. Cortex-A8 Technical Reference Manual. ARM DDI 0344E, 2008, www.arm.com ARM08b ARM Limited. ARM11 MPCore Processor Technical Reference Manual. ARM DDI 0360E, 2008, www.arm.com ASH90 Ash, R. Information Theory. New York: Dover, 1990. ATKI96 , M. “PC Software Performance Tuning.” IEEE Computer, August 1996. AZIM92 Azimi, M.; Prasad, B.; and Bhat, K. “Two Level Cache Architectures.” Proceedings, COMPCON ’92, February 1992. BACO94 Bacon, F.; Graham, S.; and Sharp, O. “Compiler Transformations for High-Performance Computing.” ACM Computing Surveys, December 1994. BAER10 Baer, J. Microprocessor Architecture: From Simple Pipelines to Chip Multiprocessors. New York: Cambridge University Press, 2010. BAIL93 Bailey, D. “RISC Microprocessors and Scientific Computing.” Proceedings, Supercomputing ’93, 1993. BASH81 Bashe, C.; Buchholz, W.; Hawkins, G.; Ingram, J.; and Rochester, N. “The Architecture of IBM’s Early Computers.” IBM Journal of Research and Development, September 1981. BASH91 Bashteen, A.; Lui, I.; and Mullan, J. “A Superpipeline Approach to the MIPS Architecture.” Proceedings, COMPCON Spring ’91, February 1991. BELL70 Bell, C.; Cady, R.; McFarland, H.; Delagi, B.; O’Loughlin, J.; and Noonan, R. “A New Architecture for Minicomputers—The DEC PDP-11.” Proceedings, Spring Joint Computer Conference, 1970. BELL71 Bell, C., and Newell, A. Computer Structures: Readings and Examples. New York: McGraw-Hill, 1971.

BELL74 Bell, J.; Casasent, D.; and Bell, C. “An Investigation into Alternative Cache Organiza- tions.” IEEE Transactions on Computers, April 1974. BELL78a Bell, C.; Mudge, J.; and McNamara, J. Computer Engineering: A DEC View of Hardware Systems Design. Bedford, MA: Digital Press, 1978. BELL78b Bell, C.; Newell, A.; and Siewiorek, D. “Structural Levels of the PDP-8.” In [BELL78a]. BELL78c Bell, C.; Kotok, A.; Hastings, T.; and Hill, R. “The Evolution of the DEC System-10.” Communications of the ACM, January 1978. BENH92 Benham, J. “A Geometric Approach to Presenting Computer Representations of Inte- gers.” SIGCSE Bulletin, December 1992. BETK97 Betker, M.; Fernando, J.; and Whalen, S. “The History of the Microprocessor.” Bell Labs Technical Journal, Autumn 1997. BEZ03 Bez, R., et al. “Introduction to Flash Memory.” Proceedings of the IEEE, April 2003. BLAA97 Blaauw, G., and Brooks, F. Computer Architecture: Concepts and Evolution. Reading, MA: Addison-Wesley, 1997. BLAH83 Blahut, R. Theory and Practice of Error Control Codes. Reading, MA: Addison-Wesley, 1983. BOHR98 Bohr, M. “Silicon Trends and Limits for Advanced Microprocessors.” Communications of the ACM, March 1998. BORK03 Borkar, S. “Getting Gigascale Chips: Challenges and Opportunities in Continuing Moore’s Law.” ACM Queue, October 2003. BORK07 Borkar, S. “Thousand Core Chips—A Technology Perspective.” Proceedings, ACM/ IEEE Design Automation Conference, 2007. BRAD91a Bradlee, D.; Eggers, S.; and Henry, R. “The Effect on RISC Performance of Register Set Size and Structure Versus Code Generation Strategy.” Proceedings, 18th Annual International Symposium on Computer Architecture, May 1991. BRAD91b Bradlee, D.; Eggers, S.; and Henry, R. “Integrating Register Allocation and Instruction Scheduling for RISCs.” Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991. BREW97 Brewer, E. “Clustering: Multiply and Conquer.” Data Communications, July 1997. BREY09 Brey, B. The Intel Microprocessors: 8086/8066, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro Processor, Pentium II, Pentium III, Pentium 4 and Core2 with 64-bit Exten- sions. Upper Saddle River, NJ: Prentice Hall, 2009. BROW96 Brown, S., and Rose, S. “Architecture of FPGAs and CPLDs: A Tutorial.” IEEE Design and Test of Computers, Vol. 13, No. 2, 1996. BURK46 Burks, A.; Goldstine, H.; and von Neumann, J. Preliminary Discussion of the Logical Design of an Electronic Computer Instrument. Report prepared for U.S. Army Ordnance Department, 1946, reprinted in [BELL71]. BUYY99a Buyya, R. High Performance Cluster Computing: Architectures and Systems. Upper Saddle River, NJ: Prentice Hall, 1999. BUYY99b Buyya, R. High Performance Cluster Computing: Programming and Applications. Upper Saddle River, NJ: Prentice Hall, 1999. CANT01 Cantin, J., and Hill, H. “Cache Performance for Selected SPEC CPU2000 Benchmarks.” Computer Architecture News, September 2001. CART06 Carter, P. PC Assembly Language, July 23, 2006. Available at this book’s Web site. CEKL97 Cekleov, M., and Dubois, M. “Virtual-Address Caches, Part 1: Problems and Solutions in Uniprocessors.” IEEE Micro, September/October 1997. CHAI82 Chaitin, G. “Register Allocation and Spilling via Graph Coloring.” Proceedings, SIG- PLAN Symposium on Compiler Construction, June 1982. CHEN94 Chen, P.; Lee, E.; Gibson, G.; Katz, R.; and Patterson, D. “RAID: High-Performance, Reliable Secondary Storage.” ACM Computing Surveys, June 1994. REFERENCES 735

CHEN96 Chen, S., and Towsley, D. “A Performance Evaluation of RAID Architectures.” IEEE Transactions on Computers, October 1996. CHOW86 Chow, F.; Himmelstein, M.; Killian, E.; and Weber, L. “Engineering a RISC Compiler System.” Proceedings, COMPCON Spring ’86, March 1986. CHOW87 Chow, F.; Correll, S.; Himmelstein, M.; Killian, E.; and Weber, L. “How Many Addressing Modes Are Enough?” Proceedings, Second International Conference on Architectural Support for Programming Languages and Operating Systems, October 1987. CHOW90 Chow, F., and Hennessy, J. “The Priority-Based Coloring Approach to Register Alloca- tion.” ACM Transactions on Programming Languages, October 1990. CLAR85 Clark, D., and Emer, J. “Performance of the VAX-11/780 Translation Buffer: Simulation and Measurement.” ACM Transactions on Computer Systems, February 1985. CLEM00 Clemenwts, A. “The Undergraduate Curriculum in Computer Architecture.” IEEE Micro, May/June 2000. COHE81 Cohen, D. “On Holy Wars and a Plea for Peace.” Computer, October 1981. COLW85a Colwell, R.; Hitchcock, C.; Jensen, E.; Brinkley-Sprunt, H.; and Kollar, C. “Computers, Complexity, and Controversy.” Computer, September 1985. COLW85b Colwell, R.; Hitchcock, C.; Jensen, E.; and Sprunt, H. “More Controversy About ‘Computers, Complexity, and Controversy.’” Computer, December 1985. COOK82 Cook, R., and Dande, N. “An Experiment to Improve Operand Addressing.” Proceed- ings, Symposium on Architecture Support for Programming Languages and Operating Systems, March 1982. COON81 Coonen J. “Underflow and Denormalized Numbers.” IEEE Computer, March 1981. COUT86 Coutant, D.; Hammond, C.; and Kelley, J. “Compilers for the New Generation of Hewlett-Packard Computers.” Proceedings, COMPCON Spring ’86, March 1986. CRAG79 Cragon, H. “An Evaluation of Code Space Requirements and Performance of Various Architectures.” Computer Architecture News, February 1979. CRAG92 Cragon, H. Branch Strategy Taxonomy and Performance Models. Los Alamitos, CA: IEEE Computer Society Press, 1992. CRAW90 Crawford, J. “The CPU: Executing Instructions in One Clock Cycle.” IEEE Micro, February 1990. CRIS97 Crisp, R. “Direct RAMBUS Technology: The New Main Memory Standard.” IEEE Micro, November/December 1997. CUPP01 Cuppu, V., et al. “High Performance DRAMS in Workstation Environments.” IEEE Transactions on Computers, November 2001. CURR11 Curran, B., et al. “The zEnterprise 196 System and Microprocessor.” IEEE Micro, March/April 2011. DATT93 Dattatreya, G. “A Systematic Approach to Teaching Binary Arithmetic in a First Course.” IEEE Transactions on Education, February 1993. DAVI87 Davidson, J., and Vaughan, R. “The Effect of Instruction Set Complexity on Program Size and Memory Performance.” Proceedings, Second International Conference on Ar- chitectural Support for Programming Languages and Operating Systems, October 1987. DENN68 Denning, P. “The Working Set Model for Program Behavior.” Communications of the ACM, May 1968. DERO87 DeRosa, J., and Levy, H. “An Evaluation of Branch Architectures.” Proceedings, Four- teenth Annual International Symposium on Computer Architecture, 1987. DESA05 Desai, D., et al. “BladeCenter System Overview.” IBM Journal of Research and Devel- opment, November 2005. DEWA90 Dewar, R., and Smosna, M. Microprocessors: A Programmer’s View. New York: McGraw-Hill, 1990. DOWD98 Dowd, K., and Severance, C. High Performance Computing. Sebastopol, CA: O’Reilly, 1998. 736 REFERENCES

DUBE91 Dubey, P., and Flynn, M. “Branch Strategies: Modeling and Optimization.” IEEE Trans- actions on Computers, October 1991. ECKE90 Eckert, R. “Communication Between Computers and Peripheral Devices—An Analogy.” ACM SIGCSE Bulletin, September 1990. EISC07 Eischen, C. “RAID 6 Covers More Bases.” Network World, April 9, 2007. ELAY85 El-Ayat, K., and Agarwal, R. “The Intel 80386—Architecture and Implementation.” IEEE Micro, December 1985. ERCE04 Ercegovac, M., and Lang, T. Digital Arithmetic. San Francisco: Morgan Kaufmann, 2004. EVEN00a Even, G., and Paul, W. “On the Design of IEEE Compliant Floating-Point Units.” IEEE Transactions on Computers, May 2000. EVEN00b Even, G., and Seidel, P. “A Comparison of Three Rounding Algorithms for IEEE Floating-Point Multiplication.” IEEE Transactions on Computers, July 2000. EVER98 Evers, M., et al. “An Analysis of Correlation and Predictability: What Makes Two- Level Branch Predictors Work.” Proceedings, 25th Annual International Symposium on Microarchitecture, July 1998. EVER01 Evers, M., and Yeh, T. “Understanding Branches and Designing Branch Predictors for High-Performance Microprocessors.” Proceedings of the IEEE, November 2001. FARH04 Farhat, H. Digital Design and Computer Organization. Boca Raton, FL: CRC Press, 2004. FARM92 Farmwald, M., and Mooring, D. “A Fast Path to One Memory.” IEEE Spectrum, October 1992. FATA08 Fatahalian, K., and Houston, M. “A Closer Look at GPUs.” Communications of the ACM, October 2008. FLEM86 Fleming, P., and Wallace, J. “How Not to Lie with Statistics: The Correct Way to Summarize Benchmark Results.” Communications of the ACM, March 1986. FLYN72 Flynn, M. “Some Computer Organizations and Their Effectiveness.” IEEE Transactions on Computers, September 1972. FLYN85 Flynn, M.; Johnson, J.; and Wakefield, S. “On Instruction Sets and Their Formats.” IEEE Transactions on Computers, March 1985. FLYN87 Flynn, M.; Mitchell, C.; and Mulder, J. “And Now a Case for More Complex Instruction Sets.” Computer, September 1987. FLYN01 Flynn, M., and Oberman, S. Advanced Computer Arithmetic Design. New York: Wiley, 2001. FOG08a Fog, A. Optimizing Subroutines in Assembly Language: An Optimization Guide for x86 Platforms. Copenhagen University College of Engineering, 2008, http://www.agner .org/optimize/ FOG08b Fog, A. The Microarchitecture of Intel and AMD CPUs. Copenhagen University College of Engineering, 2008, http://www.agner.org/optimize/ FRAI83 Frailey, D. “Word Length of a Computer Architecture: Definitions and Applications.” Computer Architecture News, June 1983. FRIE96 Friedman, M. “RAID Keeps Going and Going and…” IEEE Spectrum, April 1996. FULL11 Fuller, S., and Millet, L., eds. The Future of Computing Performance: Game Over or Next Level? Washington, DC: National Academies Press, 2011, www.nap.edu FURB00 Furber, S. ARM System-on-Chip Architecture. Reading, MA: Addison-Wesley, 2000. FUTR01 Futral, W. InfiniBand Architecture: Development and Deployment. Hillsboro, OR: Intel Press, 2001. GENU04 Genu, P. A Cache Primer. Application Note AN2663. , Inc., 2004 (available in Premium Content Document section). GHAI98 Ghai, S.; Joyner, J.; and John, L. Investigating the Effectiveness of a Third Level Cache. Technical Report TR-980501-01, Laboratory for Computer Architecture, University of Texas at Austin, 1998, http://lca.ece.utexas.edu/pubs-by-type.html REFERENCES 737

GIBB04 Gibbs, W. “A Split at the Core.” Scientific American, November 2004. GIFF87 Gifford, D., and Spector, A. “Case Study: IBM’s System/360-370 Architecture.” Communications of the ACM, April 1987. GOCH06 Gochman, S., et al. “Introduction to Intel Core Duo Processor Architecture.” Intel Tech- nology Journal, May 2006. GOLD54 Goldstine, H.; Pomerene, J.; and Smith, C. Final Progress Report on the Physical Real- ization of an Electronic Computing Instrument. Princeton: The Institute for Advanced Study Electronic Computer Project, 1954. GOLD91 Goldberg, D. “What Every Computer Scientist Should Know About Floating-Point Arithmetic.” ACM Computing Surveys, March 1991. GOOD83 Goodman, J. “Using Cache Memory to Reduce Processor-Memory Bandwidth.” Proceedings, 10th Annual International Symposium on Computer Architecture, 1983. Reprinted in [HILL00]. GOOD05 Goodacre, J., and Sloss, A. “Parallelism and the ARM Instruction Set Architecture.” Computer, July 2005. GREG98 Gregg, J. Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of Sets. New York: Wiley, 1998. GRIM05 Grimheden, M., and Torngren, M. “What Is Embedded Systems and How Should It Be Taught?—Results from a Didactic Analysis.” ACM Transactions on Embedded Computing Systems, August 2005. GSOE08 Gsoedl, J. “Solid State: New Frontier in Storage.” Storage, July 2008. GUST88 Gustafson, J. “Reevaluating Amdahl’s Law.” Communications of the ACM, May 1988. HAMM97 Hammond, L.; Nayfay, B.; and Olukotun, K. “A Single-Chip Multiprocessor.” Computer, September 1997. HAND98 Handy, J. The Cache Memory Book. San Diego: Academic Press, 1998. HARR06 Harris, W. “Multi-Core in the Source Engine.” bit-tech.net technical paper, November 2, 2006, bit-tech.net/gaming/2006/11/02/Multi_core_in_the_Source_Engin/1 HAYE98 Hayes, J. Computer Architecture and Organization. New York: McGraw-Hill, 1998. HEAT84 Heath, J. “Re-Evaluation of RISC 1.” Computer Architecture News, March 1984. HENN82 Hennessy, J., et al. “Hardware/Software Tradeoffs for Increased Performance.” Proceed- ings, Symposium on Architectural Support for Programming Languages and Operating Systems, March 1982. HENN84 Hennessy, J. “VLSI Processor Architecture.” IEEE Transactions on Computers, December 1984. HENN91 Hennessy, J., and Jouppi, N. “Computer Technology and Architecture: An Evolving Interaction.” Computer, September 1991. HENN06 Henning, J. “SPEC CPU2006 Benchmark Descriptions.” Computer Architecture News, September 2006. HENN07 Henning, J. “SPEC CPU Suite Growth: An Historical Perspective.” Computer Architec- ture News, March 2007. HIDA90 Hidaka, H.; Matsuda, Y.; Asakura, M.; and Kazuyasu, F. “The Cache DRAM Architec- ture: A DRAM with an on-Chip Cache Memory.” IEEE Micro, April 1990. HIGB90 Higbie, L. “Quick and Easy Cache Performance Analysis.” Computer Architecture News, June 1990. HILL89 Hill, M. “Evaluating Associativity in CPU Caches.” IEEE Transactions on Computers, December 1989. HILL00 Hill, M.; Jouppi, N.; and Sohi, G. Readings in Computer Architecture. San Francisco: Morgan Kaufmann, 2000. HINT01 Hinton, G., et al. “The Microarchitecture of the Pentium 4 Processor.” Intel Technology Journal, Q1, 2001, http://developer.intel.com/technology/itj/ 738 REFERENCES

HIRA07 Hirata, K., and Goodacre, J. “ARM MPCore: The Streamlined and Scalable ARM11 Processor Core.” Proceedings, 2007 Conference on Asia South Pacific Design Automa- tion, 2007. HUCK83 Huck, T. Comparative Analysis of Computer Architectures. Stanford University Techni- cal Report No. 83-243, May 1983. HUGU91 Huguet, M., and Lang, T. “Architectural Support for Reduced Register Saving/Restoring in Single-Window Register Files.” ACM Transactions on Computer Systems, February 1991. HUTC96 Hutcheson, G., and Hutcheson, J. “Technology and Economics in the .” Scientific American, January 1996. HWAN93 Hwang, K. Advanced Computer Architecture. New York: McGraw-Hill, 1993. HWAN99 Hwang, K, et al. “Designing SSI Clusters with Hierarchical Checkpointing and Single I/O Space.” IEEE Concurrency, January–March 1999. IBM01 International Business Machines, Inc. 64 Mb Synchronous DRAM. IBM Data Sheet 364164, January 2001. INTE98 Intel Corp. Pentium Pro and Pentium II Processors and Related Products. Aurora, CO, 1998. INTE01a Intel Corp. Intel Pentium 4 Processor Optimization Reference Manual. Document 248966-04, 2001, http://developer.intel.com/design/Pentium4/documentation.htm INTE01b Intel Corp. Desktop Performance and Optimization for Intel Pentium 4 Processor. Docu- ment 248966-04, 2001, http://developer.intel.com/design/Pentium4/documentation.htm INTE04a Intel Corp. IA-32 Intel Architecture Software Developer’s Manual (4 volumes). Document 253665 through 253668. 2004, http://developer.intel.com/design/Pentium4/ documentation.htm INTE04b Intel Research and Development. Architecting the Era of Tera. Intel White Paper, February 2004, http://www.intel.com/labs/teraera/index.htm INTE04c Intel Corp. Endianness White Paper, November 15, 2004. INTE11 Intel Corp. Intel® 64 and IA-32 Intel Architectures Software Developer’s Manual (3 volumes). Denver, CO, 2011. JACO08 Jacob, B.; Ng, S.; and Wang, D. Memory Systems: Cache, DRAM, Disk. Boston: Morgan Kaufmann, 2008. JAME90 James, D. “Multiplexed Buses: The Endian Wars Continue.” IEEE Micro, June 1990. JOHN91 Johnson, M. Superscalar Microprocessor Design. Englewood Cliffs, NJ: Prentice Hall, 1991. JOHN08 John, E., and Rubio, J. Unique Chips and Systems. Boca Raton, FL: CRC Press, 2008. JOUP88 Jouppi, N. “Superscalar versus Superpipelined Machines.” Computer Architecture News, June 1988. JOUP89a Jouppi, N., and Wall, D. “Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines.” Proceedings, Third International Conference on Architec- tural Support for Programming Languages and Operating Systems, April 1989. JOUP89b Jouppi, N. “The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance.” IEEE Transactions on Computers, December 1989. KAEL91 Kaeli, D., and Emma, P. “Branch History Table Prediction of Moving Target Branches Due to Subroutine Returns.” Proceedings, 18th Annual International Symposium on Computer Architecture, May 1991. KAGA01 Kagan, M. “InfiniBand: Thinking Outside the Box Design.” Communications System Design, September 2001, www.csdmag.com KALL04 Kalla, R.; Sinharoy, B.; and Tendler, J. “IBM Power5 Chip: A Dual-Core Multithreaded Processor.” IEEE Micro, March–April 2004. KANE92 Kane, G., and Heinrich, J. MIPS RISC Architecture. Englewood Cliffs, NJ: Prentice Hall, 1992. REFERENCES 739

KAPP00 Kapp, C. “Managing Cluster Computers.” Dr. Dobb’s Journal, July 2000. KATE83 Katevenis, M. Reduced Instruction Set Computer Architectures for VLSI. PhD dissertation, Computer Science Department, University of California at Berkeley, October 1983. Reprinted by MIT Press, Cambridge, MA, 1985. KATZ89 Katz, R.; Gibson, G.; and Patterson, D. “Disk System Architecture for High Performance Computing.” Proceedings of the IEEE, December 1989. KEET01 Keeth, B., and Baker, R. DRAM Circuit Design: A Tutorial. Piscataway, NJ: IEEE Press, 2001. KNAG04 Knaggs, P., and Welsh, S. ARM: Assembly Language Programming. Bournemouth University, School of Design, Engineering, and Computing, August 31, 2004, www.freetechbooks.com/arm-assembly-language-programming-t729.html KNUT71 Knuth, D. “An Empirical Study of FORTRAN Programs.” Software Practice and Experience, Vol. 1, 1971. KNUT98 Knuth, D. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley, 1998. KOLB05 Kolbehdari, M., et al. “The Emergence of PCI Express* in the Next Generation of Mobile Platforms.” Intel Technology Journal, February 2005. KOOP96 Koopman, P. “Embedded System Design Issues (the Rest of the Story).” Proceedings, 1996 International Conference on Computer Design, 1996. KUCK77 Kuck, D.; Parker, D.; and Sameh, A. “An Analysis of Rounding Methods in Floating-Point Arithmetic.” IEEE Transactions on Computers, July 1977. KUGA91 Kuga, M.; Murakami, K.; and Tomita, S. “DSNS (Dynamically-hazard resolved, Statically-code-scheduled, Nonuniform Superscalar): Yet Another Superscalar Processor Architecture.” Computer Architecture News, June 1991. LEE91 Lee, R.; Kwok, A.; and Briggs, F. “The Floating Point Performance of a Superscalar SPARC Processor.” Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991. LEON07 Leonard, T. “Dragged Kicking and Screaming: Source Multicore.” Proceedings, Game Developers Conference 2007, March 2007. LEON08 Leong, P. “Recent Trends in FPGA Architectures and Applications.” Proceedings, 4th IEEE International Symposium on Electronic Design, Test, and Applications, 2008. LILJ88 Lilja, D. “Reducing the Branch Penalty in Pipelined Processors.” Computer, July 1988. LILJ93 Lilja, D. “Cache Coherence in Large-Scale Shared-Memory Multiprocessors: Issues and Comparisons.” ACM Computing Surveys, September 1993. LITT61 Little, J. “A Proof for the Queuing Formula: L = λW.” Operations Research, May–June, 1961. LITT11 Little, J. “Little’s Law as Viewed on Its 50th Anniversary.” Operations Research, May–June, 2011. LOVE96 Lovett, T., and Clapp, R. “Implementation and Performance of a CC-NUMA System.” Proceedings, 23rd Annual International Symposium on Computer Architecture, May 1996. LUND77 Lunde, A. “Empirical Evaluation of Some Features of Instruction Set Processor Architectures.” Communications of the ACM, March 1977. MACD84 MacDougall, M. “Instruction-level Program and Process Modeling.” IEEE Computer, July 1984. MADD09 Maddox, R., et al. Weaving High Performance Multiprocessor Fabric: Architectural Insights to the Intel QuickPath Interconnect. Hillsboro, OR: Intel Press, 2009. MAK04 Mak, P., et al. “Processor Subsystem Interconnect for a Large Symmetric Multiprocessing System.” IBM Journal of Research and Development, May/July 2004. MAK97 Mak, P., et al. “Shared-Cache Clusters in a System with a Fully Shared Memory.” IBM Journal of Research and Development, July/September 1997.

MANO08 Mano, M., and Kime, C. Logic and Computer Design Fundamentals. Upper Saddle River, NJ: Prentice Hall, 2008. MANS97 Mansuripur, M., and Sincerbox, G. “Principles and Techniques of Optical Data Storage.” Proceedings of the IEEE, November 1997. MARR02 Marr, D., et al. “Hyper-Threading Technology Architecture and Microarchitecture.” Intel Technology Journal, First Quarter, 2002. MASH95 Mashey, J. “CISC vs. RISC (or what is RISC really).” comp.arch newsgroup, article 46782, February 1995. MAYB84 Mayberry, W., and Efland, G. “Cache Boosts Multiprocessor Performance.” Computer Design, November 1984. MAZI10 Mazidi, M.; Mazidi, J.; and Causey, D. The x86 PC: Assembly Language, Design and Interfacing. Upper Saddle River, NJ: Prentice Hall, 2010. MCDO05 McDougall, R. “Extreme Software Scaling.” ACM Queue, September 2005. MCDO06 McDougall, R., and Laudon, J. “Multi-Core Microprocessors Are Here.” ;login, October 2006. MCEL85 McEliece, R. “The Reliability of Computer Memories.” Scientific American, January 1985. MEND06 Mendelson, A., et al. “CMP Implementation in Systems Based on the Intel Core Duo Processor.” Intel Technology Journal, May 2006. MILE00 Milenkovic, A. “Achieving High Performance in Bus-Based Shared-Memory Multipro- cessors.” IEEE Concurrency, July–September 2000. MIRA92 Mirapuri, S.; Woodacre, M.; and Vasseghi, N. “The MIPS R4000 Processor.” IEEE Micro, April 1992. MOOR65 Moore, G. “Cramming More Components Onto Integrated Circuits.” Electronics Maga- zine, April 19, 1965. MORS78 Morse, S.; Pohlman, W.; and Ravenel, B. “The Intel 8086 Microprocessor: A 16-bit Evo- lution of the 8080.” Computer, June 1978. MOSH01 Moshovos, A., and Sohi, G. “Microarchitectural Innovations: Boosting Microprocessor Performance Beyond Semiconductor Technology Scaling.” Proceedings of the IEEE, November 2001. MULL10 Muller, J., et al. Handbook of Floating-Point Arithmetic. Boston: Birkhauser, 2010. MYER78 Myers, G. “The Evaluation of Expressions in a Storage-to-Storage Architecture.” Computer Architecture News, June 1978. NOVI93 Novitsky, J.; Azimi, M.; and Ghaznavi, R. “Optimizing Systems Performance Based on Pentium Processors.” Proceedings COMPCON ’92, February 1993. NOWE07 Nowell, M.; Vusirikala, V.; and Hays, R. “Overview of Requirements and Applications for 40 Gigabit and .” Ethernet Alliance White Paper, August 2007. OBER97a Oberman, S., and Flynn, M. “Design Issues in Division and Other Floating-Point Operations.” IEEE Transactions on Computers, February 1997. OBER97b Oberman, S., and Flynn, M. “Division Algorithms and Implementations.” IEEE Trans- actions on Computers, August 1997. OKLO08 Oklobdzija, V., ed. Digital Design and Fabrication. Boca Raton, FL: CRC Press, 2008. OLUK96 Olukotun, K., et al. “The Case for a Single-Chip Multiprocessor.” Proceedings, Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, 1996. OLUK05 Olukotun, K., and Hammond, L. “The Future of Microprocessors.” ACM Queue, September 2005. OLUK07 Olukotun, K.; Hammond, L.; and Laudon, J. Chip Multiprocessor Architecture: Tech- niques to Improve Throughput and Latency. San Rafael, CA: Morgan & Claypool, 2007. OMON99 Omondi, A. The Microarchitecture of Pipelined and Superscalar Computers. Boston: Kluwer, 1999. REFERENCES 741

OSUN11 Osuna, A., et al. IBM System Storage Tape Library Guide for Open Systems. IBM Redbook SG24-5946-07, June 2011. PADE81 Padegs, A. “System/360 and Beyond.” IBM Journal of Research and Development, September 1981. PADE88 Padegs, A.; Moore, B.; Smith, R.; and Buchholz, W. “The IBM System/370 Vector Architecture: Design Considerations.” IEEE Transactions on Communications, May 1988. PARH10 Parhami, B. Computer Arithmetic: Algorithms and Hardware Design. Oxford: Oxford University Press, 2010. PATT82a Patterson, D., and Sequin, C. “A VLSI RISC.” Computer, September 1982. PATT82b Patterson, D., and Piepho, R. “Assessing RISCs in High-Level Language Support.” IEEE Micro, November 1982. PATT84 Patterson, D. “RISC Watch.” Computer Architecture News, March 1984. PATT85a Patterson, D. “Reduced Instruction Set Computers.” Communications of the ACM, January 1985. PATT85b Patterson, D., and Hennessy, J. “Response to ‘Computers, Complexity, and Controversy.’ ” Computer, November 1985. PATT88 Patterson, D.; Gibson, G.; and Katz, R. “A Case for Redundant Arrays of Inexpensive Disks (RAID).” Proceedings, ACM SIGMOD Conference of Management of Data, June 1988. PATT01 Patt, Y. “Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution.” Proceedings of the IEEE, November 2001. PAVA97 Pavan, P., et al. “Flash Memory Cells—An Overview.” Proceedings of the IEEE, August 1997. PEIR99 Peir, J.; Hsu, W.; and Smith, A. “Functional Implementation Techniques for CPU Cache Memories.” IEEE Transactions on Computers, February 1999. PELE97 Peleg, A.; Wilkie, S.; and Weiser, U. “Intel MMX for Multimedia PCs.” Communications of the ACM, January 1997. PFIS98 Pfister, G. In Search of Clusters. Upper Saddle River, NJ: Prentice Hall, 1998. POLL99 Pollack, F. “New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies (keynote address).” Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture, 1999. POPE91 Popescu, V., et al. “The Metaflow Architecture.” IEEE Micro, June 1991. PRES01 Pressel, D. “Fundamental Limitations on the Use of Prefetching and Stream Buffers for Scientific Applications.” Proceedings, ACM Symposium on Applied Computing, March 2001. PRIN97 Prince, B. Semiconductor Memories. New York: Wiley, 1997. PRIN02 Prince, B. Emerging Memories: Technologies and Trends. Norwell, MA: Kluwer, 2002. PROP11 Prophet, G. “Use GPUs to Boost Acceleration.” IDN, December 2, 2011. PRZY88 Przybylski, S.; Horowitz, M.; and Hennessy, J. “Performance Trade-Offs in Cache Design.” Proceedings, Fifteenth Annual International Symposium on Computer Archi- tecture, June 1988. PRZY90 Przybylski, S. “The Performance Impact of Block Size and Fetch Strategies.” Proceed- ings, 17th Annual International Symposium on Computer Architecture, May 1990. RADI83 Radin, G. “The 801 Minicomputer.” IBM Journal of Research and Development, May 1983. RAGA83 Ragan-Kelley, R., and Clark, R. “Applying RISC Theory to a Large Computer.” Computer Design, November 1983. RAMA77 Ramamoorthy, C. “Pipeline Architecture.” Computing Surveys, March 1977. RECH98 Reches, S., and Weiss, S. “Implementation and Analysis of Path History in Dynamic Branch Prediction Schemes.” IEEE Transactions on Computers, August 1998. 742 REFERENCES

REDD76 Reddi, S., and Feustel, E. “A Conceptual Framework for Computer Architecture.” Computing Surveys, June 1976. REIM06 Reimer, J. “Valve Goes Multicore.” ars technica, November 5, 2006, arstechnica.com/articles/paedia/cpu/valve-multicore.ars RICH07 Riches, S., et al. “A Fully Automated High Performance Implementation of ARM Cortex-A8.” IQ Online, Vol. 6, No. 3, 2007, www.arm.com/iqonline SAKA02 Sakai, S. “CMP on SoC: Architect’s View.” Proceedings, 15th International Symposium on System Synthesis, 2002. SATY81 Satyanarayanan, M., and Bhandarkar, D. “Design Trade-Offs in VAX-11 Translation Buffer Organization.” Computer, December 1981. SCHA97 Schaller, R. “Moore’s Law: Past, Present, and Future.” IEEE Spectrum, June 1997. SCHW99 Schwarz, E., and Krygowski, C. “The S/390 G5 Floating-Point Unit.” IBM Journal of Research and Development, September/November 1999. SEAL00 Seal, D., ed. ARM Architecture Reference Manual. Reading, MA: Addison-Wesley, 2000. SERL86 Serlin, O. “MIPS, Dhrystones, and Other Tales.” Datamation, June 1, 1986. SHAN38 Shannon, C. “Symbolic Analysis of Relay and Switching Circuits.” AIEE Transactions, Vol. 57, 1938. SHAN03 Shanley, T. InfiniBand Network Architecture. Reading, MA: Addison-Wesley, 2003. SHAN05 Shanley, T. The Unabridged Pentium 4: IA32 Processor Genealogy. Reading, MA: Addison-Wesley, 2005. SHAR97 Sharma, A. Semiconductor Memories: Technology, Testing, and Reliability. New York: IEEE Press, 1997. SHAR03 Sharma, A. Advanced Semiconductor Memories: Architectures, Designs, and Applications. New York: IEEE Press, 2003. SHEN05 Shen, J., and Lipasti, M. Modern Processor Design: Fundamentals of Superscalar Processors. New York: McGraw-Hill, 2005. SIEG04 Siegel, T.; Pfeffer, E.; and Magee, A. “The IBM z990 Microprocessor.” IBM Journal of Research and Development, May/July 2004. SIEW82 Siewiorek, D.; Bell, C.; and Newell, A. Computer Structures: Principles and Examples. New York: McGraw-Hill, 1982. SIMA97 Sima, D. “Superscalar Instruction Issue.” IEEE Micro, September/October 1997. SIMA04 Sima, D. “Decisive Aspects in the Evolution of Microprocessors.” Proceedings of the IEEE, December 2004. SIMO96 Simon, H. The Sciences of the Artificial. Cambridge, MA: MIT Press, 1996. SING10 Singh, G., et al. “The Feeding of High-Performance Processor Cores—Quickpath Interconnects and the New I/O Hubs.” Intel Technology Journal, September 2010. SING11 Singh, G. “The IBM PC: The Silicon Story.” Computer, August 2011. SLOS04 Sloss, A.; Symes, D.; and Wright, C. ARM System Developer’s Guide. San Francisco: Morgan Kaufmann, 2004. SMIT82 Smith, A. “Cache Memories.” ACM Computing Surveys, September 1982. SMIT87 Smith, A. “Line (Block) Size Choice for CPU Cache Memories.” IEEE Transactions on Communications, September 1987. SMIT88 Smith, J. “Characterizing Computer Performance with a Single Number.” Communications of the ACM, October 1988. SMIT89 Smith, M.; Johnson, M.; and Horowitz, M. “Limits on Multiple Instruction Issue.” Proceedings, Third International Conference on Architectural Support for Programming Languages and Operating Systems, April 1989. SMIT95 Smith, J., and Sohi, G. “The Microarchitecture of Superscalar Processors.” Proceedings of the IEEE, December 1995. SMIT08 Smith, B. “ARM and Intel Battle over the Mobile Chip’s Future.” Computer, May 2008.

SODE96 Soderquist, P., and Leeser, M. “Area and Performance Tradeoffs in Floating-Point Divide and Square-Root Implementations.” ACM Computing Surveys, September 1996. SOHI90 Sohi, G. “Instruction Issue Logic for High-Performance Interruptable, Multiple Func- tional Unit, Pipelined Computers.” IEEE Transactions on Computers, March 1990. STAL88 Stallings, W. “Reduced Instruction Set Computer Architecture.” Proceedings of the IEEE, January 1988. STAL11 Stallings, W. Data and Computer Communications, Ninth Edition. Upper Saddle River, NJ: Prentice Hall, 2011. STAL12 Stallings, W. Operating Systems, Internals and Design Principles, Seventh Edition. Upper Saddle River, NJ: Prentice Hall, 2012. STEN90 Stenstrom, P. “A Survey of Cache Coherence Schemes of Multiprocessors.” Computer, June 1990. STEV64 Stevens, W. “The Structure of System/360, Part II: System Implementation.” IBM Sys- tems Journal, Vol. 3, No. 2, 1964. Reprinted in [SIEW82]. STON93 Stone, H. High-Performance Computer Architecture. Reading, MA: Addison-Wesley, 1993. STON96 Stonham, T. Digital Logic Techniques. London: Chapman & Hall, 1996. STRE78 Strecker, W. “VAX-11/780: A Virtual Address Extension to the DEC PDP-11 Family.” Proceedings, National Computer Conference, 1978. STRE83 Strecker, W. “Transient Behavior of Cache Memories.” ACM Transactions on Computer Systems, November 1983. STRI79 Stritter, E., and Gunter, T. “A Microprocessor Architecture for a Changing World: The Motorola 68000.” Computer, February 1979. SWAR90 Swartzlander, E., ed. Computer Arithmetic, Volumes I and II. Los Alamitos, CA: IEEE Computer Society Press, 1990. TAMI83 Tamir, Y., and Sequin, C. “Strategies for Managing the Register File in RISC.” IEEE Transactions on Computers, November 1983. TANE78 Tanenbaum, A. “Implications of Structured Programming for Machine Architecture.” Communications of the ACM, March 1978. TJAD70 Tjaden, G., and Flynn, M. “Detection and Parallel Execution of Independent Instruc- tions.” IEEE Transactions on Computers, October 1970. TOMA93 Tomasevic, M., and Milutinovic, V. The Cache Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions. Los Alamitos, CA: IEEE Computer Society Press, 1993. TOON81 Toong, H., and Gupta, A. “An Architectural Comparison of Contemporary 16-Bit Microprocessors.” IEEE Micro, May 1981. TUCK87 Tucker, S. “The IBM 3090 System Design with Emphasis on the Vector Facility.” Pro- ceedings, COMPCON Spring ’87, February 1987. UNGE02 Ungerer, T.; Rubic, B.; and Silc, J. “Multithreaded Processors.” The Computer Journal, No. 3, 2002. UNGE03 Ungerer, T.; Rubic, B.; and Silc, J. “A Survey of Processors with Explicit Multithreading.” ACM Computing Surveys, March 2003. VOGL94 Vogley, B. “800 Per Second Systems Via Use of Synchronous DRAM.” Proceedings, COMPCON ’94, March 1994. VONN45 Von Neumann, J. First Draft of a Report on the EDVAC. Moore School, University of Pennsylvania, 1945. Reprinted in IEEE Annals on the , No. 4, 1993. VRAN80 Vranesic, Z., and Thurber, K. “Teaching Computer Structures.” Computer, June 1980. WALL85 Wallich, P. “Toward Simpler, Faster Computers.” IEEE Spectrum, August 1985. WALL91 Wall, D. “Limits of Instruction-Level Parallelism.” Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991. 744 REFERENCES

WANG99 Wang, G., and Tafti, D. “Performance Enhancement on Microprocessors with Hierarchical Memory Systems for Solving Large Sparse Linear Systems.” International Journal of Supercomputing Applications, Vol. 13, 1999. WEIC90 Weicker, R. “An Overview of Common Benchmarks.” Computer, December 1990. WEIN75 Weinberg, G. An Introduction to General Systems Thinking. New York: Wiley, 1975. WEIS84 Weiss, S., and Smith, J. “Instruction Issue Logic in Pipelined Supercomputers.” IEEE Transactions on Computers, November 1984. WEYG01 Weygant, P. Clusters for High Availability. Upper Saddle River, NJ: Prentice Hall, 2001. WHIT97 Whitney, S., et al. “The SGI Origin Software Environment and Application Performance.” Proceedings, COMPCON Spring ’97, February 1997. WHIT11 White, B., et al. IBM zEnterprise 196 Technical Guide. IBM Redbook SG24-5946-07, June 2011. WILE03 Wilen, A.; Schade, J.; and Thronburg, R. Introduction to PCI Express—A Hardware and Software Developers Guide. Hillsboro, OR: Intel Press, 2003. WILK65 Wilkes, M. “Slave Memories and Dynamic Storage Allocation.” IEEE Transactions on Electronic Computers, April 1965. Reprinted in [HILL00]. WILL90 Williams, F., and Steven, G. “Address and Data Register Separation on the M68000 Family.” Computer Architecture News, June 1990. YEH91 Yeh, T., and Patt, Y. “Two-Level Adaptive Training Branch Prediction.” Proceedings, 24th Annual International Symposium on Microarchitecture, 1991. ZHAN01 Zhang, Z.; Zhu, Z.; and Zhang, X. “Cached DRAM for ILP Processor Memory Access Latency Reduction.” IEEE Micro, July–August 2001.

INDEX

A Advanced RISC Machine (ARM), 2, 45–49, Absolute address, 478, 715 144–146, 299–304, 416–417, 439–440, Access control, 304 462–464, 475–477, 520–526, 595–603, Access time (latency), 115, 193–194 679–683 Accumulator (AC), 20, 71, 411 access control, 304 Active secondary clustering method, 635–636 addressing mode, 462–464 Active standby, 635 ARM11 MPCore, 679–683 Adders, 384–388 cache memory, 144–146 Addition, 328–331, 350–353 condition codes, 439–440 floating-point numbers, 349–353 Cortex-A8 processor, 595–603 twos complement integers, 328–331 CPU instruction sets, 416–417, 439–440, Addressable units, 114 462–464, 475–477 Address bus, 86 current program status registers (CPSR), Addresses, 123–124, 286–288, 294–295, 523–525 300–301, 410–412 data types, 416–417 accumulator (AC), 411 embedded systems and, 45–48 Address generation sequencing, 598 evolution of, 47–49 Address lines, 86 formats for memory management, 301–304 Address modify instructions, 22–23 I/O memory management, 299–304 Address registers, 487 instruction format, 475–477 Address space, 294–295 instruction-level parallelism and, 595–603 ARM translation, 300–301 interrupt processing, 525–526 base, 287 machine instructions, 416–417, 439–440 cache memory, 123–124 memory management unit (MMU) for, fields, 296 300–301 I/O memory management, 286–287, 294–295, memory system organization, 299–300 300–301 modes, 522–523 logical, 287, 288 multicore computers, 679–683 machine instructions, 407–412 operations (opcode), 439–440 number of, 407–412 page tables for, 300–301 page tables for, 288, 300–301 parameters for memory management, 303 partitioning, 286–287 processor organization, 520–526 Pentium II translation mechanisms, 294–299 register organization, 523–525 physical, 287, 288 superscalar processor design, 595–603 relative, 288 translation lookaside buffer (TLB), 299–300 spaces, 294–295 virtual memory address translation, 300–301 virtual memory and, 297, 300–301 Allocation, Pentium 4 processor, 594 Addressing modes, 452–464, 547–548 Amdahl’s law, 56–57, 669 Advanced RISC Machine (ARM), 462–464 American Standard Code for Information CPU instruction sets, 451–464 Interchange (ASCII), 225 direct, 453–455 AND gate, 381 displacement, 453–454, 456–458 Antidependency, 584–585 immediate, 453–454 Arbiter (bus controller), 90 indirect, 453–454 Arbitration, 90 Intel x86, 459–461 interconnection method of, 90 register, 453–456 Arithmetic and logic unit (ALU), 12, 17, register indirect, 453–454, 456 309–317, 319–358, 654–656 RISC simplicity, 547–548 addition, 328–331, 349–353 stack, 453–454, 458–459 computer functions, 12, 320–321 Advanced programmable interrupt controller development of, 17 (APIC), 677 division, 338–341, 353–356

745 746 INDEX

Arithmetic and logic unit (ALU) (continued) Boolean (logic) instructions, 409 fixed-point notation, 326–341 Booth’s algorithm, 336–338 floating-point notation, 341–349 Branch target buffer (BTB), 592–593, 598 IBM 3090 vector facility, 654–656 Branches, 22–23, 409, 426–427, 462, 498–499, integers, 322–341 504–510, 552–554, 587 multiplication, 331–338, 353–356 conditional instructions, 22–23, 426–427, precision considerations, 354–356 498–499 in second generation, 25 control hazard, 504 semiconductor memory, 33 delayed, 510, 552–554 subtraction, 328–331, 350–353 instructions, 409, 426–427, 462 twos complement notation, 323–325, 326–341 loop buffer for, 504–505 Arithmetic instructions, 22–23, 409 multiple streams for, 504 Arithmetic mean, 54 pipelining and, 498–499, 504–510, Arithmetic operations (opcode), 418, 422 552–554 Arithmetic shift, 336, 424 prediction, 505–510, 587 Array processor, 644, 649 prefetch target, 504 Assemblers, 706–710 RISC instructions, 552–554 Assembly language, 477–478, 695, 698–706 superscalar processors, 587 Asserting, signal, 369 unconditional instructions, 22–23, 426–427 Associative mapping, 130–132 Buffers, 69 Associative memory, 115 Bus arbitration technique, I/O, 236 Asynchronous timing, 92–93 Bus interconnection, 12, 85–93 Autoindexing, 458 address lines, 86 control lines, 86 B data lines, 85–86 , 24 design elements, 89–93 Balanced transmission, 96 method of arbitration, 90 Base, 342 multiple hierarchies, 87–89 Base address, 287 structure, 86–87 Base-register addressing, 457–458 timing, 90–93 Base representation, 673 width of data bus, 86 Batch operating system (OS), 270, 271–276 Bus master, 138 job control language (JCL), 272 Bus watching approach, 138 monitor (simple), 271–273 Bus width, 36 multiprogramming, 273–276 Byte, 106 Batch system, 270 Byte multiplexor, 247 Benchmark programs, 53–55 Byte ordering, endian, 447–450 Biased representation, 342 Big endian ordering, 447–450 Binary operator. See Binary system C Binary system, 312 Cache, 39 Bit, 35–37 Cache coherence, 619–625, 683 Bit allocation, 465–469 directory protocols, 621 Bit-interleaved parity disk performance hardware solutions, 621–622 (RAID level 3), 197–199, 202–203 MESI (modified, exclusive, shared, Bit length conversion, 325–327 or in valid) protocol, 622–625 Bit ordering, endian, 450 multicore computers, 683 Blade servers, 639 parallel processing and, 619–625 Block multiplexor, 247 problem of, 619 Blocked multithreading, 628–630 read hit/miss, 624 Block-level distributed parity disk performance snoopy protocols, 621–622 (RAID level 5), 197–199, 204 software solutions, 620–621 Block-level parity disk performance two-level cache consistency, 625 (RAID level 4), 197–199, 203–204 write hit/miss, 625 Blu-ray DVD, 210, 215 Cache-coherent nonuniform memory access Boolean algebra, 365–367 (CC-NUMA), 640 INDEX 747

Cache DRAM (CDRAM), 180 operations (opcode), 406, 418–431, 535–536 Cache hit, 122 processor structure and function, 12, Cache line, 120 483–530 Cache memory, 112–158, 532, 541–543, 625, reduced instruction set computers (RISC), 679, 683 531–572 addresses, 123–125 registers, 12, 486–491, 512–520, 523–526 ARM organization, 144–146, 679, 683 stacks, 441–395 design elements, 123–141 superscalar processors, 534, 573–609 development of, 532 Centralized arbitration, 90 disk performance, 152 Chaining, 648 hierarchy level, 116–117 Character data operands, 414–415 high-level language (HLL) Chip multiprocessing, 628–633. operations, 153–154 See also Multicore computers high-performance computing (HPC), 123–124 IBM Power5, 631–633 lines, 120–121, 139 multithreading and, 628–633 locality of reference, 152–154 Pentium 4 processor, 631 main (physical) performance, 123–125, 152 Chip multiprocessor, 665 management unit (MMU), 123–125 Chips, 33–35, 41–43, 164–169 mapping functions, 126–136 computer processing and, 41–43 multicore computers, 679, 683 development of, 33–35 multilevel, 139–141 DRAM for, 174–180 Pentium 4 organization, 141–144 enable (CE) pin, 167–168 register increase compared to, EPROM package, 167–168 541–543 interleaved (memory bank) memory, 169 replacement algorithms, 137 internal memory and, 164–169 size, 125, 139 logic, 164–166 SMP shared, 625 main memory cells, 164–169 snoop control unit (SCU), 679, 683 organization, 41–43, 168–169 split, 141 packaging, 167–168 structure, 120–123 Clock (bus) cycle, 50–51, 90–91 tags, 121 Clocked S–R flip-flop, 389–391 two-level, 152–158, 625 Clock speed, 50–52 unified, 141 Clock tick, 50 virtual (logical) performance, 123–125, 152 Clusters, 633–640 write policy, 137–139 active secondary, 635 Cache miss, 122 blade servers, 639 Cache set, 132 computer architecture, 637–639 Call/return instructions, 433 configurations, 633–636 CD-ROM, 210 design requirements, 633 CD-RW, 210 OS system design, 636–637 Central processing unit (CPU), 12, 25–27, 68, passive standby, 634–635 405–626 shared disk approach, 635–636 addressing modes, 451–464 shared nothing approach, 635 arithmetic and logic unit (ALU), 12, 405, SMP compared to, 639–640 319–404 Combinational circuits, 370–388 complex instruction set computer (CISC), adders, 384–388 534, 536, 568–569 algebraic simplification, 373 component functions of, 68–69 Boolean function in, 370–372 data types, 415–418 decoders, 382–384 development of, 25–27 Karnaugh map, 373–376 endian orders, 447–450 multiplexers, 380–382 instruction formats, 464–472 NAND implementation, 380 instruction-level parallelism, 573–609 NOR implementation, 380 instruction sets, 405–450 overview, 370 machine instruction, 405–415, 533–538, 547 Quine–McCluskey method, 376–379 operands, 406–407, 413–415, 536–537 read-only memory (ROM), 384 748 INDEX

Commands, 229–230 data, 10 programmed I/O, 229–230 design, 37–43 Comment, assembly language, 703 evolution of, 15–37, 43–49 Commercial computers, 23–24 function, 8–13 Committing (retiring) instructions, 17 instructions, 588 integrated circuits, 28–33 Compact disk (CD), 210, 210–214 Intel x86 system, 2, 44–45 constant linear velocity (CLV), 211 introduction to, 6–14 digital versatile disk (DVD) microprocessors, 35–37 compared to, 213–214 organization, 1–14 read-only (CD-ROM), 210–212 performance, 37–59 recordable (CD-R), 210, 212–213 reader’s and instructor’s guide, 2–5 rewritable (CD-RW), 210, 213 reasons for study of, 3–4 Compaction, I/O memory, 286 semiconductors, 33–35 Compiler-based register structure, 8, 10–13 optimization, 543–545 transistors, 24–27 Complex instruction set computer (CISC), 2, vacuum tubes, 16–24 534, 536, 545–551, 568–569 Web site resources, 5 high-level language (HLL) and, 536, 545–547 Condition codes, 426–427, 434, 439–440, reduced instruction set computer (RISC) 487–488 architecture compared to, 549–551, 568–569 Advanced RISC Machine (ARM), 439–440 superscalar and RISC systems compared branch instructions and, 426–427 to, 534 Intel x86 system, 434 Complex PLD (CPLD), 400 program status register, 523 Compound instructions, IBM 3090 vector registers for, 487–488 facility, 650 Conditional branch instructions, 22–23, 426–427, Computer architecture, 7 498–499 Computer arithmetic. See Arithmetic and logic Conditional jump, 434 unit (ALU) Constant angular velocity (CAV), 188, 211 Computer instruction, 406 Constant linear velocity (CLV), 211 Computer organization, 7 Continuous-field simulation, 644 Computer performance, 37–59 Control, 224–225, 420, 426–431, 532 Advanced RISC Machine (ARM), 45–49 branch instructions, 426–427 Amdahl’s law, 56–57 bus, 226 benchmark programs, 52–55 I/O modules, 224–225 chip organization, 41–43 logic, 224 clock speed, 50–52 memory, 534 designing for, 38–43 microinstructions, 468, 547 embedded systems, 45–49 operations (opcode), 420, 426–431 Intel x86 system, 44–45 procedure call instructions, 427–431 logic-memory balance, 39–41 registers, 515–517 microprocessor speed, 38–39 skip instructions, 427 Computer systems, 8–13, 65–308 stack implementation, 430–431 cache memory, 65, 112–158 storage, 11 external memory, 185–220 system operations, 420, 426 functions, 8–13, 66–85 transfer of, 420, 426–431 input/output (I/O), 68, 221–264 Control registers, 486, 487–490, 515–517 interconnections, 12, 85–107 Control hazards, pipelining, 504 internal memory, 159–184 Control lines, 86 operating system (OS) support, 265–308 Control signals, 224 peripheral component (PCI), 98–107 input/output (I/O), 224 top-level structure, 13–14, 65–111 Control unit (CU), 12, 532 Computers, 1–62 branch instruction, 22 Advanced RISC Machine (ARM), 2, 46–49 IBM 7094, 27 architecture, 1–13 in IAS computer, 17–18 components, 10–12 instruction execution, 23 INDEX 749
