0-9, and Symbols A

Index

Note: Online information is listed by print page number and a period followed by “e” with online page number (54.e1). Page references preceded by a single letter with hyphen refer to appendices. Page references followed by “f,” “t,” and “b” refer to figures, tables, and boxes, respectively.

0-9, and symbols

1-bit ALU, A-26–A-29. See also Arithmetic logic unit (ALU)
    adder, A-27f
    CarryOut, A-28
    for most significant bit, A-33f
    illustrated, A-29f
    logical unit for AND/OR, A-27f
    performing AND, OR, and addition, A-31, A-33f
64-bit ALU, A-29–A-31. See also Arithmetic logic unit (ALU)
    from 63 copies of 1-bit ALU, A-34f
    with 64 1-bit ALUs, A-30f
    defining in Verilog, A-36–A-37
    illustrated, A-35f
    ripple carry adder, A-29
7090/7094 hardware, 248.e6

A

Absolute references, 127
Abstractions
    hardware/software interface, 22
    principle, 22
    to simplify design, 11
Accumulator architectures, 173.e1–173.e2
Acronyms, 9
Active matrix, 18
add (add), 64f
addi (add immediate), 64f, 72, 84
Addition, 172–175. See also Arithmetic
    binary, 172b–173b
    floating-point, 196–199, 204
    operands, 173, 173
    significands, 195b–196b
    speed, 175b
Address interleaving, 370–371
Address select logic, C-24, C-25f
Address space, 418, 421b
    extending, 467b
    flat, 467
    ID (ASID), 436
    inadequate, 497.e5–497.e6
    shared, 507–508
    single physical, 507, 507–508
    virtual, 436
Address translation
    for ARM Cortex-A53, 458f
    defined, 418–419
    fast, 428–430
    for Intel Core i7, 458f
    TLB for, 428–430
Address-control lines, C-26f
Addresses
    base, 69
    byte, 70
    defined, 68
    memory, 78b
    virtual, 418–419, 438, 439b
Addressing
    base, 118f
    in branches, 115–117
    displacement, 118
    immediate, 118f
    PC-relative, 115–116, 118f
    register, 118f
    RISC-V modes, 117–118
    x86 modes, 151
Addressing modes
    desktop architectures, D-5–D-6
Advanced Vector Extensions (AVX), 216, 217
AGP, B-9–B-10
Algol-60, 173.e6
Aliasing, 434
Alignment restriction, 70
All-pairs N-body algorithm, B-65
Alpha architecture
    bit count instructions, D-29
    floating-point instructions, D-28–D-29
    instructions, D-27–D-29
    no divide, D-28
    PAL code, D-28
    unaligned load-store, D-28
    VAX floating-point formats, D-29
ALU control, 249–251. See also Arithmetic logic unit (ALU)
    bits, 250–251, 250f
    logic, C-6–C-7
    mapping to gates, C-4–C-7
    truth tables, C-5f, C-5f
ALU control block, 253
    defined, C-4–C-6
    generating ALU control bits, C-6f
ALUOp, 250, C-6b–C-7b
    bits, 250, 251
    control signal, 253
Amazon Web Services (AWS), 415b
AMD Opteron X4 (Barcelona), 533, 534f
AMD64, 148, 148, 215, 173.e5
Amdahl’s law, 391, 493–494
    corollary, 49
    defined, 49
    fallacy, 546
and (and), 64f
AND gates, A-12–A-13, C-7
AND operation, 90, A-6
andi (and immediate), 64f
Annual failure rate (AFR), 408–409
    versus MTTF of disks, 408b–409b
Antidependence, 325
Antifuse, A-77
Apple computer, 54.e6
Apple iPad 2 A1395, 20f
    logic board of, 20f
    processor integrated circuit of, 21f
Application binary interface (ABI), 22
Application programming interfaces (APIs)
    defined, B-4
    graphics, B-14
Architectural registers, 335–336
Arithmetic, 170
    addition, 172–175
    addition and subtraction, 172–175
    division, 181–189
    fallacies and pitfalls, 220–223
    floating-point, 189–214
    historical perspective, 225
    multiplication, 175–181
    parallelism and, 214–215
    Streaming SIMD Extensions and advanced vector extensions in x86, 215–216
    subtraction, 172–175
    subword parallelism, 214–215
    subword parallelism and matrix multiply, 216–220
Arithmetic instructions. See also Instructions
    desktop RISC, D-11f, D-11f
    embedded RISC, D-13f
    logical, 241–242
    operands, 67–74
Arithmetic intensity, 531–532
Arithmetic logic unit (ALU). See also ALU control; Control units
    1-bit, A-26–A-29
    64-bit, A-29–A-31
    before forwarding, 297f
    branch datapath, 244–245
    hardware, 174
    memory-reference instruction use, 235
    for register values, 242
    R-format operations, 243f
    signed-immediate input, 300
ARM Cortex-A53, 234, 332–340
    address translation for, 458f
    caches in, 459f
    data cache miss rates for, 460f
    memory hierarchies of, 457
    performance of, 460–462
    specification, 333f
    TLB hardware for, 458f
ARPAnet, 54.e9
Arrays, 405f
    logic elements, A-18–A-20
    multiple dimension, 210
    pointers versus, 141–144
    procedures for setting to zero, 141f
ASCII
    binary numbers versus, 109b
    character representation, 108f
    defined, 108–109
    symbols, 111
Assemblers, 125–127
    defined, 14
    function, 125–127
    microcode, C-30
    number acceptance, 126
    object file, 126
Assembly language, 15f
    defined, 14, 125
    floating-point, 205f
    illustrated, 15f
    programs, 125
    RISC-V, 64f, 85b–86b
    translating into machine language, 85b–86b
Asserted signals, 240, A-4
Associativity
    in caches, 395b–396b
    degree, increasing, 394–396, 442
    increasing, 399–400
    set, tag size versus, 399b–400b
Atomic compare and swap, 123b
Atomic exchange, 122
Atomic fetch-and-increment, 123b
Atomic memory operation, B-21
Attribute interpolation, B-43–B-44
auipc’s effect, 156
Automobiles, computer application in, 4
Average memory access time (AMAT), 392
    calculating, 392b

B

Bandwidth, 29–30
    bisection, 525
    external to DRAM, 388
    memory, 388
    network, 523–524
Barrier synchronization, B-18
    defined, B-20
    for thread communication, B-34
Base addressing, 69, 118
Base registers, 69
Basic block, 95b
Benchmarks, 528–538
    defined, 46
    Linpack, 528, 248.e2–248.e3, 248.e3
    multiprocessor, 528–538
    NAS parallel, 530
    parallel, 529f
    PARSEC suite, 530
    SPEC CPU, 46–48
    SPEC power, 48–49
    SPECrate, 528
    Stream, 538b
Biased notation, 81, 193
Binary numbers, 82
    ASCII versus, 109b
    conversion to decimal numbers, 77b
    defined, 74
Bisection bandwidth, 525
Bit maps
    defined, 18
    goal, 18
    storing, 18
Bit-Interleaved Parity (RAID 3), 481.e4
Bits
    ALUOp, 250, 251
    defined, 14
    dirty, 428b
    guard, 212
    patterns, 212b–213b
    reference, 426b
    rounding, 212
    sign, 75
    state, C-8–C-10
    sticky, 212
    valid, 374–376
Blocking assignment, A-24
Blocking factor, 404
Block-Interleaved Parity (RAID 4), 481.e4–481.e5
Blocks
    combinational, A-4–A-5
    defined, 365–366
    finding, 442–443
    flexible placement, 392–396
    least recently used (LRU), 399
    locating in cache, 397–399
    miss rate and, 381f
    multiword, mapping addresses to, 380b–381b
    placement locations, 441
    placement strategies, 394
    replacement selection, 399
    replacement strategies, 444
    spatial locality exploitation, 381
    state, A-4–A-5
    valid data, 374–376
Bonding, 28
Boolean algebra, A-6–A-7
Bounds check shortcut, 96
Branch datapath
    ALU, 244–245
    operations, 244–245
Branch if Equal (beq), A-32
Branch if greater than or equal, unsigned (bgeu), 95–96
Branch if less than (blt) instruction, 95–96
Branch if less than, unsigned (bltu), 95–96
Branch instructions
    pipeline impact, 306f
Branch not taken
    assumption, 305–306
    defined, 244
Branch prediction
    buffers, 308
    as control hazard solution, 272
    defined, 271–272
    dynamic, 272, 308–312
    static, 322
Branch predictors
    accuracy, 310
    correlation, 310–311
    information from, 310–311
    tournament, 311–312
Branch table, 97–98
Branch taken
    cost reduction, 306–307
    defined, 244
Branch target
    addresses, 244
    buffers, 310
Branches. See also Conditional branches
    addressing in, 115–117
    compiler creation, 93–94
    decision, moving up, 306–307
    delayed, 272, 306–308
    ending, 95b
    execution in ID stage, 307
    pipelined, 308b
    target address, 306–307
Branch-on-zero instruction, 258–259
Bubble Sort, 140

C

C language (Continued)
    compiling assignment with registers, 67b–68b
    compiling while loops in, 94b–95b
    sort algorithms, 141f
    translation hierarchy, 124f
    translation to RISC-V assembly language, 65
    variables, 104b
C.mmp, 577.e3–577.e4
C++ language, 173.e7, 150.e26
Cache blocking and matrix multiply, 463–466
Cache coherence, 452–456
    coherence, 452
    consistency, 452
    enforcement schemes, 454
    implementation techniques, 482.e10–482.e11
    migration, 454
    problem, 452, 453f, 456b
    protocol example, 482.e11–482.e15
    protocols, 454
    replication, 454
    snooping protocol, 454–456
    snoopy, 482.e16
    state diagram, 482.e15f
Cache coherency protocol, 482.e11–482.e15
    finite-state transition diagram, 482.e14f
    functioning, 482.e13f
    mechanism, 482.e13f
    state diagram, 482.e15f
    states, 482.e12
    write-back cache, 482.e14f
Cache controllers, 457
    coherent cache implementation techniques, 482.e10–482.e11
    implementing, 482.e1
    snoopy cache coherence, 482.e16
Cache misses (Continued)
    set-associative cache, 395
    steps, 383
    in write-through cache, 383
Cache performance, 388–408
    calculating, 390b–391b
    hit time and, 391–392
    impact on processor performance, 390–391
Cache-aware instructions, 470
Caches, 373–388. See also Blocks
    accessing, 376–382
    in ARM Cortex-A53, 459f
    associativity in, 395b–396b
    bits in, 380b
    bits needed for, 380
    contents illustration, 377f
    defined, 19–22, 373–374
    direct-mapped, 374, 375f, 380, 392
    empty, 376
    FSM for controlling, 447–452
    fully associative, 393
    GPU, B-38
    inconsistent, 383
    index, 378
    in Intel Core i7, 459f
    Intrinsity FastMATH example, 385–387
    locating blocks in, 397–399
    locations, 375f
    multilevel, 388, 400–403
    nonblocking, 458
    physically addressed, 434–435
    physically indexed, 434b–435b
    physically tagged, 434b–435b
    primary, 400, 407–408
    secondary, 400, 407–408
    set-associative, 393
    simulating, 466b
    size
