027_evans_IX.fm Page 505 Monday, April 7, 2003 10:03 PM

Index

This index begins with symbolic and numeric items, listed in accordance with the ASCII character code ordering (see Table 2-3), followed by the A–Z items.

" " characters
    enclosing string data, 60
    enclosing string parameters, 470
# character (preprocessor), 201
$ character
    beginning a system-defined macro, 56
    in symbols, 58
% character
    in format strings, 270
    in parameters for asm function, 482
& character
    logical AND operator, 63
    reference indicator (C language), 210
() characters, clarifying precedence, 63
* character
    multiplication operator, 63
+ character
    addition operator, 63
    unary positive operator, 63
, character, separating specifiers, 55
- character
    introducing an option (Unix/), 53
    subtraction operator, 63
    unary negation operator, 63
. character
    introducing a directive, 56
    in symbols, 58
    symbol for location counter, 62
/ character
    division operator, 63
    double, for comment delimiter, 14, 56
    examine command (adb), 72
@ character
    @gprel, 15
    in floating-point class mnemonics, 247
    in macros, 475
: character
    in adb commands, 72, 74
    double, 55
    terminating a label, 55
; character
    double, 15, 55
    marking instruction dependencies, 55, 138–140
= character for symbol definition, 59
[] characters in register indirect addressing, 111
\ character



    in macros, 467–470
    in repeat blocks, 462–463
^ character for exclusive OR operator, 63
_ character in symbols, 58
| character, logical OR operator, 63
~ character, binary complement operator, 63
0 prefix for octal, 58–59
0b prefix for binary, 58–59
0x prefix for hexadecimal, 59
1620 computer, 3
360 computer, 1
4004 processor, 7
8008 processor, 7
8080 processor, 7–8
8085 processor, 7–8
8086 processor, 7–8, 26
8088 processor, 8, 26
80286 processor, 8

A

a access mode, 279
a flag, 473
-a option for requesting assembly listing file, 67
A class of instruction, 86–88
a.out (C default output file), 180
Absolute path (see Path)
Absolute symbols (see Symbols)
Absolute value (floating-point), 239
aCC command (HP ), 52, 343
Access mode, 279
Accumulator register, 33
Accuracy (floating-point), 38
acq completer, 388–390
Actual parameters, 467–468
Ada language as a 3GL, 3, 405
adb debugger (Unix), 53, 70–71, 73–74
add instruction, 88–89, 94–96
Addition
    floating-point, 240
    integer, 88–89
    logical basis, 162–163
addl instruction, 88–89, 94–96
Address space, 29
Addresses
    comparing as unsigned values, 129
    definition of, 27
    number of, 32–34
    symbolic, 56
Addressing modes
    autodecrement, 116, 190
    autodecrement deferred, 118
    autoincrement, 111, 116, 190
    autoincrement deferred, 117–118
    base addressing, 117
    branch, 135–136
    direct, 110–111, 117
    displacement, 117
    displacement deferred, 118
    immediate, 89, 110, 112
    indirect, 111
    , 110–112
    memory direct, 110–111
    memory indirect, 111
    other architectures, 116–118
    performance issues, 325
    postincrement, 101, 102, 111–112
    register direct, 89, 101–102, 110–112
    register indirect, 101–102, 111–112, 116
    register indirect deferred, 117
    relative, 117
    relative deferred, 117
    for stacks (see Stacks)
Addressing range for branches, 135–136
addp4 instruction, 107
adds instruction, 88–89, 94
Advanced load (see Load types)
-Ae option for HP-UX , 53, 422
AINT function (), 249
ALAT (advanced load address table), 308–309, 312
.align directive, 57
Alignment
    of floating-point data, 38, 40, 236–237
    of integer data, 101–102
    for pipeline efficiency, 298
alloc instruction, 179–180, 198–199, 313
Allocation, storage (see Storage allocation)


Alpha architecture, 7, 9, 34, 84, 301
Alphanumeric characters, 41–44
Altivec extensions (PowerPC), 407
.altrp directive, 212–213
ALU (see Arithmetic logic unit)
Ampersand character (see & character)
and compare type, 161
and instruction, 157–158
AND logical function, 157
and.orcm compare type, 161
andcm compare type, 161
andcm instruction, 157–158
ANSI/IEEE 754 (see IEEE floating-point numbers)
Answers for selected exercises, 495–502
Antidependency, 298
Application registers (see Registers)
Approximation instructions
    reciprocal, 253–254
    reciprocal square root, 254–255
APPROXPI program, 256–260
ar.bsp register, 212
ar.bspstore register, 212
ar.ccv register, 389
ar.csd register, 389–390
ar.ec register, 313–316
ar.fpsr register, 212, 242–243
ar.itc register, 480–482
ar.lc register, 143–145, 212
ar.pfs register, 180–181, 198–199, 212, 456
ar.rnat register, 212
ar.unat register, 212
Architecture
    Alpha, 7, 9, 301
    CISC, 6–7, 84
    CRAY-1, 378–379
    defined, 1
    EPIC, 6, 10–11, 301
    extensions to, 393–394
    IA-16, IA-32, IA-64, 10
    instruction-level parallelism, 300–301
    Itanium, 9–11, 34–35
    lifetime, 396
    load/store, 33–34
    one-address, 33
    PA-RISC, 10–11, 34
    PDP-8 architecture, 23, 32
    PDP-11, 7, 9, 33–34, 83
    Pentium, 35
    of the piano, 2–3
    PowerPC, 296, 301
    RISC, 6–7, 9–11, 84, 300–301, 340–341
    SPARC, 197–198
    stack-based, 32–33
    superscalar, 35, 295–296
    three-address, 33
    two-address, 33
    UltraSparc, 384
    VAX, 7, 9, 33–34, 84
    VLIW, 296, 301
    zero-address, 32–33
Argument passing
    for C, 210
    for COBOL, 210
    by descriptor, 209–210
    for FORTRAN, 210
    locations, 208–209
    methods, 209–210
    by reference, 209–210
    by value, 209–210
Argument return registers, 214, 451
Arguments for conditional assembly, 464–465
Arguments for macros (see Macros)
Aries (PA-RISC to Itanium migration), 34
Arithmetic instructions
    as a category, 30
    floating-point, 240–243
    integer, 88–93
    parallel floating-point, 385
    parallel integer, 380
    special cases, 91–92
Arithmetic logic unit, 156
Arithmetic overflow
    floating-point, 243
    integer, 89–90
Arithmetic shift, 165
Arrays
    record structures, 104–105
    shladd instruction for indexing, 91


AS/400, 34
ASCII characters, 41–44
ASCII codes, table of, 43
.ascii directive, 60
ASCII file (see Text file)
ASCII mode (ftp), 415
.asciz directive, 60
_Asm functions (HP-UX C), 4, 479–481
asm keyword (C language), 4, 481–482
Assembler directives, 57, 60
Assemblers, 3, 49, 57, 66–67, 142
Assembly language
    advantages and disadvantages, 4, 49–50
    architectural dependence of, 4
    compared with other languages, 3–4
    for Itanium, 14–15
    lack of portability of, 50
    operators, 56–57
    statement types, 54–55
    why study, 4
Assembly process, 66–67
Asterisk character (see * character)
At character (see @ character)
Atomic instructions, 386–390
Authors, 503
.auto directive, 460
Autodecrement addressing (see Addressing modes)
Autodecrement deferred addressing (see Addressing modes)
Autoincrement addressing
    postincrementing as Itanium equivalent, 111
    for stacks, 190
Autoincrement deferred addressing (see Addressing modes)
Automatic registers (see Registers)
Automatic variables, 191–192

B

/b byte display mode (debuggers), 72
:b command (adb), 72
b suffix, temporary labels, 142–143
B class of instruction, 86–88
B-unit, 87–88
Backing store, 199–200
Backside cache (see Cache)
BACKWARD program, 181–183
Banked registers (see Registers)
Base addressing (see Addressing modes)
Base for number systems
    binary, 16
    decimal, 16
    hexadecimal, 16
    for machine language, 3
    octal, 16
BASIC language as a 3GL, 3
BBEdit, 427
BetterTelnet, 427
Biased exponent, 37–41, 234–235
Bibliography, 485–493
Big-endian convention, 36–37
bin subdirectory, 53, 416
Binary files, 288
Binary mode (ftp), 416
Binary multiples
    names for, 5
    prefixes for, 5
Binary operators, arithmetic and logical, 62–63
Bit encoding of instructions, 93–96
Bit-field layouts for instructions, 85–86
Bits, numbering of, 36
.body directive, 57, 145, 180–181
Boolean functions, 156–157
Boolean operations
    AND, 157
    NAND, 157
    NOR, 157
    NOT, 92, 157
    NXOR, 157
    OR, 157
    XOR, 157
booth function, 215–216
Booth’s algorithm for multiplication, 170–173
br instruction, 130–131
br.call instruction, 179–180, 205
br.cexit instruction, 315
br.cloop instruction, 143–145
br.ctop instruction, 315
br.ia instruction, 408


br.ret instruction, 179–180, 205
br.wexit instruction, 315–316
br.wtop instruction, 315–316
Branch addressing, 135–136
Branch deallocation hint, 130–131
Branch instructions
    call, 205
    completers, 130–131
    conditional, 130–131
    ordinary, 130–131
    pipeline behavior, 298–299
    return, 205
    unconditional, 130–131
Branch offset, IP-relative, 135–136
Branch penalty
    in general, 131, 143
    Itanium 2 processor, 297
    Itanium processor, 402
Branch prediction, 299, 404
Branch prefetch hint, 130
Branch range, 135–136
Branch registers (see Registers)
Branch type, 130
Branch whether hint, 130
break command (gdb), 71–73
break statement (C language), 149
Breakpoints, 71–74
brl instruction, 136, 405
brp instruction, 404
brtype completer (branch instructions), 130
Bubble sort
    integers (SORTINT), 284–288
    strings (SORTSTR), 273–277
Bubbles in a pipeline, 294–300
Bundle template (see Template)
Bundle of instructions, 35, 85
Bus, 26
bwh completer (branch instructions), 130
.byte directive, 60
Byte-addressable memory (see Memory)
Bytes, numbering of, 36

C

/c display mode (debuggers), 72
:c command (adb), 72
.c file (see File types)
C language
    as a 3GL, 3
    argument passing, 210
    compiler output, 345–349
    library functions for I/O, 269, 278
    SQUARES program, 12–13
C++ language as a 4GL, 3
Cache
    backside, 99
    hit ratio, 120–121, 325
    Itanium 2 processor, 98–100
    Itanium processor, 399–400
    levels, 29, 98–100
    line size, 98–99
    structures, 98–100, 399–400
    subsystem, 29, 98–100
call branch type, 130, 205
Call to procedure, 205–207
Calling conventions, 203–214
Caret character (see ^ character)
Carriage return character, 415
Case sensitivity (Unix/Linux), 53
Case structures, 148–150
cat command (Unix/Linux), 416
cc command (HP compiler), 52, 343, 422
cc_bundled command (HP compiler), 52, 343, 422
cd command (DOS, Unix/Linux), 416
CDC 6600, 340
Central processing unit, 26–27
cexit branch type, 130, 315
cfm register, 199, 205–207
Characters
    ASCII codes, 41–44
    byte storage for, 44
    control, 44
    cr (for line termination), 415
    lf (for line termination), 415
Check load (see Load types)


chk.a instruction, 309, 397, 405
chk.s instruction, 310–311, 397, 405
chrget routine, 178–179
chrput routine, 178–179
Circumflex character (see ^ character)
CISC architecture (see Architecture)
CISC instructions, power of, 325–326
Classes of instructions (see Instruction classes)
Classification of floating-point value, 247–248
Clearing a register, 92, 239
Client , 426–428
Clipping saturation (see Saturation)
clock function (C), 333–334
CLOCKS_PER_SEC, 333
cloop branch type, 130, 143–145
Closed routines, 465
Closing a file, 279
clr deallocation hint, 130–131, 309
cmp instruction, 126–129
cmp4 instruction, 126–129
cmpxchg instruction, 389
COBOL language
    as a 3GL, 3, 405
    argument passing, 210
    SQUARES program, 13–14
Codes
    for operations, 56–57
    for pseudo-operations, 56–57
Colon character (see : character)
COM_C program, 345
COM_F program, 345
Comma character (see , character)
Command-line programming environments, 50, 413–417
Command-line prompt, 414
Commands, DOS and Unix/Linux, 416
Comment field
    in assembly language, 56
    importance of, 77–78
.common directive, 371
Comparative instructions
    as a category, 30
    floating-point, 246
    integer, 126–129
Compare instructions
    completers for, 127–129
    floating-point, 246
    parallel floating-point, 385
    parallel integer, 379
    parallel logical types, 160–161
    signed integer, 127–128
    unconditional, 146
    unsigned integer, 128–129
Compare relationship
    floating-point, 246
    signed integer, 128
    unsigned integer, 129
Compare type, 128, 146
Comparing addresses, 129
Comparing source files, 54
Compiler warning messages, 360–361
Compilers, 3, 49, 339–341
Complementation of a Boolean, 92
Completers, 55, 100 (also see individual instructions)
Complex instruction set computer (see Architecture, CISC)
Computer architecture (see Architecture)
Computer languages (see Languages)
Computer structures
    central processing unit, 26–27
    input and output devices, 29–30
    memory, 29
cond branch type, 130
Condition codes, 124–125
Conditional assembly, 464–465
Conditional branch instruction, 130–131
Constants, 58
    assembler syntax, 58–59
    loading into registers, 91, 103–104
Contents of an information unit, 27
continue command (gdb), 72
Control characters, 44
Control dependency, 310–311
Control instructions
    branches, 123–126
    as a category, 30
Control path, 27


Control registers (see Registers)
Control speculation (see Speculation)
Control statements (see Statement types)
Control string for formatted I/O, 270, 280
Control structures
    case structures
    if-based structures, 131–134
    loop structures, 134–135
Conventions for argument passing, 208–210
Conventions for register use, 204, 450–455
Conversion instructions, 248–251
Copying, register to register, 91
Coroutines, 202–203
Counted loops (see Loops)
Counter, location (see Location counter)
CPU (see Central processing unit)
cpuid registers, 409
cr character for line termination, 415
CRAY-1 architecture (see Architecture)
crel completer (compare instructions), 128–129
Cross-reference table, 64
ctop branch type, 130, 315
ctype completer (compare instructions), 128
czx instruction, 380, 392

D

/d display mode (debuggers), 72
:d command (adb), 72
.data directive, 57
Data access instructions, 98, 101–105
Data conversion instructions, 248–251
Data dependency, 140–142, 297–300, 307–310
Data movement instruction
    as a category, 30
    mov pseudo-op (floating-point), 239
    mov pseudo-op (integer), 91
    movl instruction, 103–104
    (see also Load, Store)
Data speculation (see Speculation)
Data stalls, 298
Data types
    alphanumeric characters, 41–44
    floating-point numbers, 37–41
    integers, 37
    logical, 156
data1 directive, 60
data2 directive, 60
data4 directive, 60
data8 directive, 60
Datapath, 6, 27, 84
+DD64 option (HP-UX compilers), 53, 423
Dead code, 361
Deallocation hint, 130–131
Debuggers
    adb example, 73–74
    breakpoints, 71–74
    capabilities of, 71
    disassembly, 76–77
    gdb example, 71–73
    and optimization, 369–370
    role of, 69–70
    single-stepping, 75–76
    watchpoints, 74–75
Declarative statements (see Statement types)
DECNUM program, 175–178
DECNUM2 program, 194–196
DECNUM3 program, 216–218
.default directive, 460
del command (DOS), 416
Delay slot (RISC), 301
delete command (gdb), 72
dep instruction, 167–168
dep.z instruction, 167–168
Denormal numbers, 233
Dependency
    control (see Control dependency)
    data (see Data dependency)
Deposit instructions, 167
Depth of pipeline (see Pipeline depth)
%DESCR function (FORTRAN), 210
Descriptor, passing argument by, 209–210
DET (exception detection pipeline stage), 296–297, 401–402
Device drivers, 30
dh completer (branch instructions), 130–131
diff command (Linux/Unix), 54
Digital Equipment Corporation processors


    Alpha, 7, 9
    PDP-8, 23, 32
    PDP-11, 7, 9, 33
    VAX, 7, 9, 33
dir command (DOS), 416
Direct addressing (see Addressing modes)
Direct assignment of a symbol, 58
Direct memory access controllers, 30, 200
Directives, assembler, 56–57
Directories, 266–267
disas command (gdb), 76–77
Disassembly, 76–77
Displacement addressing (see Addressing modes)
Displacement addressing deferred (see Addressing modes)
Display modes for debuggers, 71–72
Display output, 268–270
Division
    floating-point, 255–256
    floating-point software routines, 256
    IEEE requirement, 253
    integer, 93, 173–175, 219–220
    integer software routines, 219–220
    shr instruction for powers of 2, 166
    by ten, 173–175
    by zero (floating-point), 243
DMA controllers (see Direct memory access controllers)
Dollar sign character (see $ character)
DOS commands, 416
Dot character (see Period character)
Dot product (see Scalar product)
DOTCLOOP program, 143–145
DOTCTOP program, 316–321
DOTCTOP2 program, 321–324
DOTLOOP program, 136–137
DOTPROD program, 107–109, 113–116
.double directive, 60
Double precision, 38–40
Double word, 35–36
dpnt branch whether hint, 130, 404
dptk branch whether hint, 130, 404
+DSblended option, 407
+DSitanium option, 407
+DSitanium2 option, 407
+DSmckinley option, 407
Dual issue, 401
Dynamic binding, 219
Dynamic optimization, 369

E

/e display mode (adb), 72
ecc command ( C/C++ compiler), 53, 342, 422
Editors, use of, 52
efc command (Intel FORTRAN compiler), 342, 422
Effective address, 111–112, 116
Ei (symbol for binary prefix), 5
ELF binary file format, 210
.else directive, 464
Encapsulation, 178–179
End of file, 278, 280, 283
Endian conventions (big- and little-), 36–37
.endif directive, 464–465
.endm directive, 466–467
.endp directive, 57
.endr directive, 460–463
Enforced lazy mode (RSE), 199
EOF condition, 278, 280, 283
EPIC architecture (see Architecture)
EPIC instruction-level parallelism, 301
Epilog counter, 205, 313–316
Epilog phase of loop, 314
Epilogue, 210–211
eq compare relationship, 128
Equal sign (see = character)
.err directive, 472
Error detection and printing, 288
exa- (decimal prefix), 5
exbi- (binary prefix), 5
Exceptions, 243
excl completer, 326–327
EXE (execute pipeline stage), 296–297, 401–402
Executable file, 49
Execution control with debugger, 71
Execution units (B, F, I, M), 87–88, 305
exit branch predict hint, 404


.exitm directive, 472
EXP (instruction dispersal pipeline stage), 296–297, 401–402
.explicit directive, 460
Explicit parallelism, 302
Explicitly parallel instruction computer (see Architecture, EPIC)
Exponent, 37–41, 234–235
Expressions, 62–63
External symbol (see Symbols)
extr instruction, 166–168
extr.u instruction, 166–168
Extract instructions, 166–167

F

f command (adb), 72
.f file (see File types)
-f option (Linux compilers), 342
f suffix, temporary labels, 142–143
F class of instruction, 86–88
F-unit, 87–88
f90 command (HP compiler), 343, 422
fabs instruction, 239
Factorial function, 337, 471–472
fadd instruction, 240
famax instruction, 242
famin instruction, 242
fand instruction, 252
fandcm instruction, 252
fault completer, 326–327
fclass instruction, 247–248
fclose function (C), 278–279
fcmp instruction, 246
fcvt instruction, 249–250
Feature size, 395
FET (instruction prefetch pipeline stage), 401–402
Fetch, 427
fetchadd instruction, 388–390
few prefetch hint, 130
.fframe directive, 212–213
fgets function (C), 278, 279–280
fib function, 370–373
FIB1 function, 329–331
FIB2 function, 331–332
Fibonacci numbers, 328–334, 337, 370–373
Fields in assembly language statements
    comments, 56, 77–78
    labels, 55
    operators, 55
    specifiers, 55
Fields in record structures, 104–105
Figure of merit for loops, 140
File name, 278–279
File pointer, 278–279
File storage (logical and physical), 265–267
File systems, 266–267
File types, default naming of
    .c (Unix/Linux), 225
    .f (Unix/Linux), 225
    .o (Unix/Linux), 356
    .s (Unix/Linux), 52
Fill form of load instruction
    floating-point, 237–238
    integer, 101–102
Flag value on stack, 195–196
Floating-point instructions
    arithmetic, 240–243
    compare, 246
    compared to integer instructions, 232
    conversion, 248–251
    load, 236–238
    logical, 252–253
    store, 235–236
Floating-point numbers
    double extended precision, 234
    double precision, 38–41, 234
    IEEE special values, 233
    IEEE standards, 37–41
    memory representation, 37–41
    natural alignment of, 38, 40
    register representation, 39–40, 234–235
    single-precision, 39–40, 234
Floating-point registers (see Registers)
Floating-point status register (FPSR), 242–243
Flynn’s classification, 378–379
fma instruction, 241
fmax instruction, 242
fmerge instruction, 239


fmin instruction, 242
fmix instruction, 385
fmpy instruction, 240
fms instruction, 241
fneg instruction, 239
fnegabs instruction, 239
fnma instruction, 241
fnmpy instruction, 240
fnorm instruction, 241–242
fopen function (C), 278–279
for instruction, 252
Formal parameter, 462, 466–467
Format
    assembly language statements, 55–56
    Itanium instructions, 85–86
Format control string for I/O, 270, 280
Formatted I/O, 270, 280
FORTRAN language
    as a 3GL, 3
    argument passing, 210
    compiler output, 349–352
    SQUARES program, 13
Forwarding, 299
fpabs instruction, 385
fpack instruction, 385–386
fpamax instruction, 385
fpamin instruction, 385
fpcmp instruction, 385
fpcvt instruction, 385
fpma instruction, 385
fpmax instruction, 385
fpmerge instruction, 385
fpmin instruction, 385
fpmpy instruction, 385
fpms instruction, 385
fpneg instruction, 385
fpnegabs instruction, 385
fpnma instruction, 385
fpnmpy instruction, 385
fprcpa instruction, 385
__fpreg data type, 480
fprintf function (C), 278, 280
fprsqrta instruction, 385
FPSR (floating-point status register), 242–243
fpswap instruction, 385
fpsxt instruction, 385
fputs function (C), 278–280
Fraction, 37–41
Frame marker, 205–207
Frame size, 191–192
frcpa instruction, 253–254
FreeBSD, 425
Free Software Foundation, 417
frsqrta instruction, 254–255
fscanf function (C), 278, 280
fselect instruction, 252–253
fsub instruction, 240
fswap instruction, 385
fsxt instruction, 385
ftp program, use of, 415–416
Full adder, 162–163
Full stop (see . character)
Functional units (see Execution units)
Functions, 203
Fused multiply–add instruction, 241
fxor instruction, 252

G

/g display mode (debuggers), 72
g77 command (Linux compiler), 341
gcc command (Linux compiler), 53, 341
gdb debugger (Linux), 53, 71–73
ge compare relationship, 128
GEM compilers, 341
General registers (see Registers)
get (ftp command), 415
getchar function (C), 178–179
getf instruction, 250
getput (encapsulated C routines), 178–179
gets function (C), 269–270
gettimeofday function (Unix/Linux), 222–224
Gettysburg address, 283–284
geu compare relationship, 129
Gi (symbol for binary prefix), 5
gibi- (binary prefix), 5
giga- (decimal prefix), 5
.global directive, 57


Global label (see Label)
Global pointer (see Registers)
Global symbols (see Symbols)
GNU Project, 417
goto instructions, 123
gp register, 103, 113, 451
@gprel assembler indicator, 15, 218–219
Groups of instructions, 137–138
gt compare relationship, 128
gtu compare relationship, 129
Guard bits, 234

H

h completer, 380
/h display mode (adb), 72
Half adder, 162
Harvard architecture, 99–100
Hazards, pipeline
    branch effects, 298–299
    data stalls, 298
    multiple-issue effects, 300
    producer–consumer effects, 299–300
Hewlett-Packard processors
    7xxx series, 11
    8xxx series, 11
    PA-RISC, 10–11, 23, 34
HEXNUM program, 96–98
HEXNUM2 program, 163–164
Hidden bit, 37–41
High-level languages (see also Languages)
    defined, 3–4
    portability of, 49–50
    standardization of, 49–50
Hints
    for branch instructions, 130–131
    for load instructions, 102
    for store instructions, 101
Hints for exercises, 495–502
Hit ratio (see Cache)
HORNER program, 243–245
Horner’s rule, 243–244
HP-UX software, 422–423
Hyphen character (see - character)

I

/i display mode (debuggers), 72
I class of instruction, 86–88
I-unit, 87–88
ia branch type, 130, 408
IA-16, IA-32, IA-64 architectures, 10
IA-32 instruction set mode, 408–409
ias (Intel assembler), 422
IBM processors
    1620, 3
    360, 1, 7
    801, 340
    AS/400, 34
    System/36, 34
    System/38, 34
Identifier (synonym for symbol), 58–59
IEEE floating-point numbers
    ANSI/IEEE specification, 38
    denormal, 233
    double precision, 38–41
    infinity, 233
    NaN (not a number), 38, 233
    single precision, 39–41
    special values, 38, 233
    zero, 233
.if directive, 464–465
.ifdef directive, 464–465
.ifndef directive, 464
.ifnotdef directive, 464
if…then…else structures, 131–134, 146–147
ILP (see Instruction-level parallelism)
ILP32 scheme, 422
Immediate addressing (see Addressing modes)
Immediate data in instructions, 86, 95–96
imp importance hint, 404
Imperative statements (see Statement types)
Implementation
    changes to, 394–397
    defined, 1
    of the piano, 2–3
    version, determining, 409
IN instruction for I/O, 30
in0–in7 stacked registers, 198, 205–207, 451
#include directive (preprocessor), 201


.include directive, 64
Indefinite repeat block, 462–463
Indexing of arrays (see Arrays)
Indirect addressing (see Addressing modes)
@inf mnemonic for fclass, 247
Infinity as IEEE number, 233
Information units
    address of, 27–28
    bytes, 35–36
    contents of, 27
    double words, 35–36
    quad words, 35–36
    size of, 28–29, 35
    words, 35–36
Inline assembly, 479–483
Inline functions, 327, 366–369
INLINE program, 367
Inner product (see Scalar product)
Input stacked registers, 198
Input/output
    C functions, 268–270, 277–280
    system, 29–30
Instruction architectures
    load/store, 33–34
    one-address, 33
    stack-based, 32–33
    three-address, 33
    two-address, 33
    zero-address, 32–33
Instruction bundles, 35, 85
Instruction categories
    arithmetic, 30, 88–93, 380, 385
    comparative, 30, 380, 385
    control, 30
    data access, 98, 101–105
    data movement, 30
    logical, 30
    semaphore, 31, 386–390
Instruction classes (A, B, F, I, M, X), 86–88
Instruction completers (see Completers)
Instruction components
    operand specifiers, 31
    operation code (opcode), 31
Instruction encoding at the bit level, 93–96
Instruction execution cycle, 31–32
Instruction groups (see Groups)
Instruction issue, 295, 306–307
Instruction-level parallelism
    difficult outside loops, 406
    EPIC, 301
    RISC, 300–301
    throughput advantage, 300
    VLIW, 301
Instruction pipelining, 294–295
Instruction pointer, 27, 31–32, 449–450
Instruction power, 325–326
Instruction reordering, 327
Instruction retiring, 295
Instruction set architecture, 5–6, 10, 32–34
Instruction size, 325
Instruction slots, 85
Instruction templates, 302–307
Instruction widths, 83–85
Instructions (see Itanium instruction set)
int data type (C), 178–179
Integer division (see Division)
Integer multiplication (see Multiplication)
Integer sizes
    byte, 37
    double word, 37
    quad word, 37
    word, 37
Integers
    in floating-point registers, 235
    representations of, 16–20, 37
    signed, 18–20, 37
    unsigned, 17–18, 37
Intel assembler (ias), 422
Intel processors
    4004, 7
    8008, 7
    8080, 7–8
    8085, 8
    8086, 7–8, 26
    8088, 7, 26
    80286, 8
    Intel386, 8, 46
    Intel486, 8


    Itanium, 9–11, 397–405
    Itanium 2, 10–11
    Pentium, 8
Intel386 CPU, 8, 46
Intel486 CPU, 8
Interface, user-visible, 1
Interrupt handling, 200
Interval time counter, 480–482
Intrinsic functions, 479–481
invala instruction, 309–310
I/O (see Input/output)
IO_C program, 179–180
-ip option (Intel compilers), 342
IP (see Instruction pointer)
IPF (see Itanium Processor Family)
IPG (IP generation pipeline stage), 296–297, 401–402
-ipo option (Intel compilers), 342
.irp directive, 462
.irpc directive, 463
ISA (instruction set architecture), 5–6, 10
ISAM files, 288
Issue, multiple (see Multiple issue)
Issue ports, 400–401
Issuing of instructions, 295
Itanium architecture, 9–11
Itanium instruction set
    by function, 430–437
    by opcode, 438–447
Itanium 2 processor
    cache, 98–100
    contrasted with first Itanium processor, 397–399
    execution units, 87–88
    latency factors, 402–403
    pipelines, 296–297
Itanium processor
    cache, 399–400
    contrasted with Itanium 2 processor, 397–399
    execution units, 400–401
    latency factors, 402–403
    pipelines, 401–402
Itanium Processor Family, 10–11

J

/j display mode (adb), 72
Java, 3, 42
JMPE instruction (IA-32 mode), 408
Jump table, 148

K

Kernel phase of loop, 314
Keyboard input, 268–270
Keyword parameters, 469–470
Ki (symbol for binary prefix), 5
kibi- (binary prefix), 5
kilo- (decimal prefix), 5

L

l completer, 380
Label
    global, 55
    local, 56
    local in macros, 470–471
    macro-generated, 475
    temporary, 142–143
Label field, 55
Languages
    1GL, 2GL, 3GL, 4GL, 3
    ANSI C, 3
    artificial intelligence, 3
    assembly, 3–4
    database access, 3
    high-level, 3–4
    machine, 3
    natural language, 3
    object-oriented, 3
Latency, 131, 140–142, 295, 299, 402–403
LC, location counter, 64–66
.lcomm directive, 60
ld instructions, 101–102, 389–390
LDC instruction (PA-RISC), 387
ldf instructions, 236–237
le compare relationship, 128
leu compare relationship, 129
lf character for line termination, 415
lfetch instructions, 326–327


Libraries in linking process, 53
lincoln.txt test file, 283
Line feed character, 415
Line numbers in listing file, 64–66
Line prefetch instructions, 326–327
Line size (see Cache)
Line terminators in text files, 279–280, 415
Linear congruential method, 221
Link map, 68–69
Linkers, 49, 68–69
Linking process, 68–69
Linux
    client software, 426
    commands, 53, 416
    I/O software, 267
    line terminator, 415
    online documentation, 417
    support for Unicode, 42
LISP language as a 4GL, 3
Listing file, 51, 57, 64–67
Little-endian convention, 36–37
Live quantity in software pipeline, 317
Load hints
    floating-point, 237–238
    integer, 102
Load instructions
    floating-point, 236–238
    floating-point pair, 238
    integer, 101–102
    semaphore, 389–390
Load types
    advanced, 308–309
    check, 308
    floating-point, 236–238
    integer, 102
    speculative, 310–312
    speculative advanced, 311–312
loc0–loc127 stacked registers, 198, 205–207, 451
Local labels (see Label)
Local stacked registers, 198
Local variables, 205–207, 328
Locality, 136
Location counter
    . character as symbol for, 62
    in listing file, 64–66
    maintained by assembler, 57
    multiple instances, 61–62
    in object file sections, 473
Logging in and out, 413–414
Logical data, 156
Logical difference, 158
Logical functions
    binary, 156–157
    unary, 156–157
Logical instructions
    as a category, 30
    floating-point, 252–253
    integer, 157–158
Logical mask, 159, 163–164
Logical product, 158
Logical shift, 165
Logical sum, 158
.long directive, 60
long long int (C data type), 178–179
Long shift instruction, 166
loop branch predict hint, 404
Loop count register, 143–145, 315
Loop structures, 134–135, 314–316
Loops
    counted, 134, 314–315
    modulo-scheduled, 313–324
    unrolling, 312–313, 361–366
    while, 315–316
Lower case usage, 53
Lower case, converting to upper, 159
LP64 scheme, 423
ls command (Unix/Linux), 416
lt compare relationship, 128
ltu compare relationship, 129

M

-M option for requesting map file, 68
-m option (Linux compilers), 342
M class of instruction, 86–88
M-unit, 87–88
Macintosh


client software, 426–427 Memory direct addressing (see Addressing modes) line terminator, 415 Memory indirect addressing .macro directive, 466–467 (see Addressing modes) Macro libraries, 51 Memory-mapped I/O, 30 Macros Memory stacks, 190–194 actual parameters, 466–470 Merced code name, 10–11, 398 assembler, 200, 465–472 Merge (see fpmerge instruction) default values, 469–470 Mi (symbol for binary prefix), 5 defining, 466–467 mi (Macintosh text editor), 427 formal parameters, 466–470 Minimum invoking, 467–468 floating-point, 242 keyword parameters, 469–470 parallel floating-point, 385 names, 466–467 parallel integer, 380 positional parameters, 468–469 -minline-divide option (gcc), 341 purposes, 466 Minus sign character (see - character) recursive, 471–472 MIMD computing systems, 378–379 self-redefining, 471 MIPS compilers, 340 string parameters, 470 MISD computing systems, 378 MacSFTP, 427 mix instruction, 380 MacSSH, 427 mkdir command (Unix/Linux), 416 Maintainability, writing for, 77–78 MMX instructions, 378, 407 man command (Unix/Linux), 417 Modular programming, 200–203 many prefetch hint, 130 Modulo-scheduling of loops, 313–324 Map file, 51, 68–69 MONEY macro, 473–476 Masking, 106, 159, 163–164 Moore’s law, 395 Math coprocessor, 231 more command, 416 MATRIX program, 376 Motion video instructions, 407 MAX-2 additions to PA-RISC, 384 Motorola processors Maximum 680x0 series, 231 floating-point, 242 PowerPC, 29, 296, 301 parallel floating-point, 385 mov instruction, 105 parallel integer, 380 mov pseudo-instruction MAXIMUM program, 150–151 floating-point, 239 McKinley code name, 10, 398 integer, 57, 91 md command (DOS), 416 movl instruction, 103–104 MDMX extensions (MIPS), 407 Multimedia instructions, 379–381, 407 mebi- (binary prefix), 5 Multiple-issue effects, 300 mega- (decimal prefix), 5 Multiplication Memory Booth’s algorithm, 170–173 as an array, 28 floating-point, 240–241 byte-addressable, 28 integer, 92–93, 170–173, 251–252, 381–384 holding instructions and 
data, 27 parallel instructions for 32-bits, 381–384 information units, 27–29 pmpy2 instruction for 16-bits, 92–93 word size, 28 scalar product of vectors, 107–109

shl instruction for powers of 2, 166 nt1 completer shladd instruction for special cases, 90–91 floating-point load instructions, 237 unsigned integers, 173 integer load instructions, 102 Multiply-defined symbols, 67 line prefetch instruction, 327 Multiway branching, 147–150 semaphore instructions, 388–389 mux instruction, 380 nt2 completer mv command (Unix/Linux), 416 line prefetch instruction, 327 nta completer floating-point load instructions, 237 N floating-point store instructions, 236 NaN (see Not a number) integer load instructions, 102 NAND function, 157 integer store instructions, 101 @nat mnemonic for fclass, 247 line prefetch instruction, 327 NaT bit, 310–311, 450 semaphore instructions, 388–389 Natural alignment NUE (native user environment), 423–425 of floating-point data, 38, 40, 236–237 NUL character, terminating a string, 60 of integer data, 101–102 Number of addresses within instructions, 32–34 NaTVal (see Not a thing value) Number conversion, 175–178 nc completer (check load), 308–309 Number systems ne compare relationship, 128 base, for machine language, 3 @neg mnemonic for fclass, 247 binary, 16–18 Negation decimal, 16–17 of a floating-point value, 239 floating-point, 37–41 of an integer value, 91 hexadecimal, 17–18 Nesting IEEE floating-point, 37–41 of conditionals, 146–147 integers, 16–20 of macros, 467 octal, 16 of parentheses in expressions, 63 one’s complement, 18–19 Newline character, 269–270 sign and magnitude, 18–19 nge compare relationship, 246 signed integers, 18–20 ngt compare relationship, 246 two’s complement, 18–20 NiftyTelnet SSH, 427 Numeric ranges nle compare relationship, 246 floating-point numbers, 39 nlt compare relationship, 246 integers, 37 nm command (Unix/Linux), 52–53, 69–70 NXOR function, 157 -nolib_inline option (Intel compilers), 343 nz test bit type, 160 nop instructions, 66, 305–306 NOR function, 157 @norm mnemonic for fclass, 247 O Normalization, floating-point, 241 .o file, 356 Not a number, 38, 233 -o option 
(Unix/Linux), 53 Not a thing value, 235 +O0 option (HP-UX compilers), 343 NOT operation, 92, 157 -O0 option (Linux compilers), 53, 342 NotePad, 427 +O1 option (HP-UX compilers), 343

-O1 option (Linux compilers), 342 post-compilation, 369 +O2 option (HP-UX compilers), 343 profile-guided, 369 -O2 option (Linux compilers), 342–343 static, 369 +O3 option (HP-UX compilers), 343 Options for Unix/Linux commands, 53 -O3 option (Linux compilers), 342–343 or compare type, 161 +O4 option (HP-UX compilers), 343 or instruction, 157–158 Object file, 49, 51, 57, 69 OR logical function, 157 Object file sections, 473 or.andcm compare type, 161 Object libraries, 51 orcm compare type, 161 Object-oriented languages, 3 ord compare relationship, 246 od command (Unix/Linux), 417 Ordinateur (French for computer), 183 +Ofast option (HP compilers), 343 Organization of a computer, 1 +Ofaster option (HP compilers), 343 -Os option (Linux compilers), 342 Offset +Osize option (HP compilers), 343 in branch instructions, 135–136 OUT instruction for I/O, 30 in displacement addressing, 117 out0–out7 stacked registers, 199, 205–207, 451 location counter value as, 62 Output dependency, 298 +Olimit option (HP compilers), 343 Output stacked registers, 199 On-chip cache, 98–100, 399–400 Overflow One-address instruction set, 33 floating-point, 243 One’s complement representation, 18–19 integer, 89–90 Opcode, 31, 56–57, 84–86, 87–88 Opcode extension fields, 87–88, 93–96 Open routines, 466 P Opening a file, 279 PA-RISC processors, 10–11, 46 OpenVMS operating system, 415, 426 pack instruction, 380 Operand, 31 Packing data not encouraged, 78 Operand specifiers, 31, 55 padd instruction, 380 Operation code (see Opcode) Palindromes, 188 Operator field, 55 Parallel logical compare instructions, 160–161 Operators Parallel operations assembly language, 56–57 floating-point, 384–386 binary and unary, 62–63 integer, 379–381 Optimization Parameters and debugging, 369–370 actual, 466–470 dynamic, 369 default values, 469–470 enabling, 356–361 formal, 462, 466–470 factors in a program, 325–328 keyword, 469–470 inline functions, 366–369 null, 469 inhibition of, 53, 345 positional, 468–469 levels for 
cc, aCC, and f90, 343 Parentheses, clarifying precedence, 63 levels for ecc and efc, 342–343 Pascal language as a 3GL, 3 levels for gcc and g77, 341–342 Passing arguments (see Argument passing) loop unrolling, 361–366 passwd command (Unix/Linux), 414

Passwords, 414 pmpy2 instruction, 92–93, 380 Path (for directory searches), 416 pmpyshr2 instruction, 380 pavg instruction, 380 Pointers, 103–104, 107, 422–423 pavgsub instruction, 380 Polynomial evaluation, 241, 243–245 PC (see Program counter) Pop item from a stack, 190 PCI bus, 30 popcnt instruction, 380, 392 pcmp instruction, 380 Portability of programs, 50 PDP-8 architecture, 23, 32 @pos mnemonic for fclass, 247 PDP-11 architecture, 7, 9, 33–34, 83 Position-independent code, 218–219 PDP-11 emulation, 34 Positional coefficients, 16 pebi- (binary prefix), 5 Positional parameters, 468–469 Pentium CPU, 8 Postcompilation optimization, 369 Performance considerations, 293 Postincrement addressing (see Addressing) Performance for loops, 138–140 Postincrementing Period character (see . character) with floating-point load, 236–238 perror function (C), 269, 288 with floating-point load pair, 238 peta- (decimal prefix), 5 with floating-point store, 235–236 PFE (Programmer’s File Editor), 427 with integer load, 102 pfm (previous frame marker), 205–207 with integer store, 101 ph completer (branch instructions), 130 PowerPC (see Motorola processors) Pi (symbol for binary prefix), 5 pr register, 212 Piano architecture and implementation, 2–3 Precedence of operators, 63 Pipeline bubbles, 294–300 Precision, floating-point, 39–40, 240 Pipeline depth, 294 Precision completer, 240, 250 Pipeline hazards (see Hazards) Predicate (see Qualifying predicate) Pipeline stages Predicate registers (see Registers) for a generic processor, 294–295 Predication, 131–135, 245–246, 316–323 Itanium 2 processor, 296–297 Prediction, branch (see Branch prediction) Itanium processor (initial implementation), Prefetch hint, 130 401–402 Prefetching (see Advanced loads, Line prefetch Pipelined loops, 312–316 instructions, Speculative loads) Pipelining Preprocessors, 201 bubbles, 295 Preserved registers (see Registers) floating-point, 296 print command (gdb), 72–73 hardware, 294–300 printenv command, 417 
hazards, 297 printf function (C), 269–270 software, 313 @priunat preservation, 212 stalls, 295 Privilege level, 205 superpipelining, 295 .proc directive, 57 superscalar, 295 Procedural dependency (see Control dependency) PL/I language as a 3GL, 3 Procedure frame, 191–192 Plus sign character (see + character) Procedures, 203 pmax instruction, 380 Producer–consumer effects, 299–300 pmin instruction, 380 Profile-guided optimization, 369

progbits type, 473 SORTINT, 284–288 Program counter, 27, 31 SORTSTR, 273–277 , 293 SQUARES, 12–15, 71, 73–74, 95–96, Program segmentation, 200–203 112–115 Program size, 326 TESTFIB, 333–334 Programming environments Prolog phase of loop, 314 command-line, 50 .prologue directive, 145, 180–181, 212 tools, 52–53 Prologue section, 145, 210–213 Programs Prompt, command-line (see Command line) APPROXPI, 256–260 pr.rot (rotating predicate registers), 452 BACKWARD, 181–183 psad1 instruction, 380 booth (function), 215–216 Pseudo-operations, 56–57, 89, 91 chrget (function), 178–179 Pseudo-random numbers, 220–221, 256–260 chrput (function), 178–179 pshl instruction, 380 COM_C, 345 pshladd2 instruction, 380 COM_F, 345 pshr instruction, 380 DECNUM, 175–178 pshradd2 instruction, 380 DECNUM2, 194–196 psp (previous stack pointer), 212 DECNUM3, 216–218 psub instruction, 380 DOTCLOOP, 143–145 Push item onto a stack, 190 DOTCTOP, 316–321 put (ftp command), 415 DOTCTOP2, 321–324 putchar function (C), 178–179 DOTLOOP, 136–137 puts function (C), 269–270 DOTPROD, 107–109, 113–116 pwd command (Unix/Linux), 416 fib function, 370–373 FIB1 function, 329–331 Q FIB2 function, 331–332 q command (adb), 72 getput (encapsulated C routines), 178–179 @qnan mnemonic for fclass, 247 HEXNUM, 96–98 -Qoption option (ecc), 68 HEXNUM2, 163–164 qp (qualifying predicate), 55, 86 HORNER, 243–245 .quad directive, 60 INLINE, 367 Quad word, 35–36 IO_C, 179–180 Qualifying predicate, 55, 86, 315–316 MATRIX, 376 quit command (gdb), 72 MAXIMUM, 150–151 Quotation marks (see " character) MONEY (macro), 473–476 Quotient in division, 219–220 RANDC, 225 RANDF, 225–226 random (function), 222–224 R random_ (function) for FORTRAN, r access mode, 279 222–224 :r command (adb), 72 SCANFILE, 280–284 r command (adb), 72 SCANTERM, 270–273 r completer, 380 SCANTEXT, 168–169 ra command (adb), 72

Radix control, 58–59 automatic, 450–451 RANDC, 225 banked, 200, 451 RANDF, 225–226 branch, 130, 452 random (function), 222–224 constant, 450–451, 453 random_ (function) for FORTRAN, 222–224 conventions for use, 204, 450–455 Random numbers, 220–226, 257–260 in the CPU, 27 RAR dependency, 307 cpuid, 409 RAW dependency, 307 floating-point, 27, 39–40, 212–213, 453 rd command (DOS), 416 function returned values, 214 readelf (Linux utility), 210–211 general, 450–451 real4 directive, 60 global pointer, 103, 113, 451 real8 directive, 60 integer, 27, 450–451 Reciprocal predicate, 126, 451–452 approximate, 253–254 preserved, 204 approximate square root, 254–255 read-only, 454 known, for division, 173–175 register rename base, 314 Record structures, 104–105 rotating, 199, 313–314 Recurrence relationship, 329 scratch, 204 Recursion, 327–329, 333–334, 370–373 special, 450–451 Recursive macros, 471–472 stack pointer, 190–192, 451 Reduced instruction set computer (see stacked, 198–199, 205–209, 451 Architecture, RISC) state management, 455–457 %REF function (FORTRAN), 210 system control, 457–458 Reference, passing argument by, 209–210 system information, 457 REG (register read pipeline stage), 296–297, .regstk directive, 214 401–402 rel completer Register-level programming, 27 integer store instructions, 101 Register direct addressing (see Addressing modes) semaphore instructions, 388–390 Register indirect addressing (see Relative addressing (see Addressing modes) Addressing modes) Relative deferred addressing Register indirect deferred addressing (see (see Addressing modes) Addressing modes) Relative path (see Path) Register-level programming, 27 Relocatable symbols (see Symbols) Register naming Remainder, in division, 175, 219–220 gdb debugger, 71 REN (rename registers pipeline stage), 296–297, Itanium assemblers, 14 401–402 Register renaming, 298, 314 ren command (DOS), 416 Register stack, 198 Repeat blocks Register stack engine,
199–200 indefinite, 462–463 Register windows (SPARC), 197–198 simple, 460–461 Registers Representation of numbers adding to an architecture, 407 integers, 16–20 application, 105, 143, 453–455 one’s complement, 18–19 for argument passing, 205–207 sign and magnitude, 18–19

signed integers, 18–20 Self-redefining macros, 471 two’s complement, 18–20 Semaphore instructions .rept directive, 461 as a category, 31 Reserved opcodes, 88 Itanium support, 386–390 .restore directive, 212 ordering completers, 388–389 ret branch type, 205 Semicolon character (see ; character) ret0–ret3 function return value registers, 214, set command (gdb), 72 451 setf instruction, 250 Retiring of instructions, 295 Shared library functions, 203 Return from procedure, 205–207 Shell program, 417 RISC architecture (see Architecture) Shift instructions, 165–166, 380 RISC instruction-level parallelism, 300–301 shl instruction, 165 rm command (Unix/Linux), 416 shladd instruction, 90–91 rmdir command (Unix/Linux), 416 shladdp4 instruction, 107 ROT (rotate instructions pipeline stage), 296–297, shr instruction, 165 401–402 shr.u instruction, 165 Rotate using shrp instruction, 166 shrp instruction, 166 Rotating stacked registers (see Registers) si command (see stepi command) Rounding, 242–243 SI units, 5 rp register, 212, 452 Sign and magnitude representation rrb register, 314, 456 floating-point numbers, 37–41, 234–235 RSE (see Register stack engine) integers, 18–19 run command (gdb), 72 Sign extension, 106 Sign-manipulation instructions, 105–107 S Signed integers (see Integers) .s file type (see File types) Significand, 37–41, 234–235 -S option (Linux/Unix compilers), 345 Signum function, 154, 262 :s command (adb), 72 SIMD computing systems, 378–379 /s display mode (debuggers), 72 , 421 Saturation, clipping, 381 .single directive, 60 .save directive, 144–145, 180–181, 212–213 Single-stepping with debugger, 75–76 .sbttl directive, 64 SISD computing systems, 378 Scalar product of vectors, 107–109, 136–137, Ski simulator, 420–421, 423–425 143–145, 316–324 .skip directive, 57, 60 scanf function (C), 269–270 Slash character (see / character) SCANFILE program, 280–284 Slots for instructions in bundle, 85 SCANTERM program, 270–273 Smalltalk language as a 4GL, 3 SCANTEXT
program, 168–169 @snan mnemonic for fclass, 247 Scope, frame and stack, 190–192 sof field (cfm register), 199, 206–207 Scratch area, stack, 191–192 Software-pipelined loops, 312–316 Scratch registers (see Registers) sol field (cfm register), 199, 206–207 .section directive, 473 sor field (cfm register), 199, 207 Secure client software, 426–428 Sorting Seed, 221 integers, 284–288

strings, 273–277 user-defined, 192–194 SORTINT program, 284–288 Stalls, data (see Data stalls) SORTSTR program, 273–277 Standard C library, 268 Source files, comparing variants, 54 Standard error, 268 Source program, 49 Standard input, 268 sp register, 190–192, 451 Standard output, 268 Space character, use in statements, 55–56 Starting address, 32 SPARC register windows, 197–198 State examination with debugger, 71 Special registers (see Registers) State management by an architecture, 125–126 Special values (IEEE), 38, 233 State management registers (see Registers) Specifier field, 55 Statement format, assembly language, 55–56 Speculation Statement types, assembly language data, 307–310 control, 55, 64 control, 310–312 declarative, 54 Speculative load (see Load types) imperative, 54 .spill directive, 213 Static binding, 219 Spill form of store instruction Static initialization, 474 floating-point, 236 Status field completer, 240 integer, 101 stderr, 268 Split issue, 304–305 stdin, 268 spnt branch whether hint, 130 header file, 178, 268 sptk branch whether hint, 130 stdout, 268 SQL language as a 4GL, 3 stepi command (gdb), 72 Square brackets (see [] characters) Stepwise development, 50–53 Square root stf instructions, 235–236 floating-point software routines, 256 Stop (double semicolon), 15, 138–140, 460 IEEE requirement, 253 Storage allocation, 49, 60 SQUARES program, 12–15, 71, 73–74, 95–96, Store hints, 101, 236 112–115 Store instructions sscanf function (C), 290 floating-point, 235–236 SSE extensions, 379,
407 integer, 101 st instructions, 101, 390 semaphore, 389–390 Stack-based instruction set, 32–33 Store types Stack pointer (see Registers) floating-point, 235–236 Stack unwinding (see Unwind tables) integer, 101 Stacked registers (see Registers) Streaming single-instruction, multiple data extensions (see SSE extensions) Stacks addressing, 190–194 String of characters, 44 CISC architectures, 190 string directive, 60 for argument passing, 208–209 String parameters, 470 Itanium architecture, 191–192 stringz directive, 60 load/store architectures, 190–191 sub instruction, 88–89, 94 memory, 190–194 Subroutines, 201–202 register, 196–200 Subtraction

floating-point, 240 Terminators, line (see Line terminators) integer, 88–89 Terms in expressions, 63 Subword instructions, 384 Test relationship, 310 Superpipelining, 295 TESTFIB program, 333–334 Superscalar parallelism, 35, 87, 295–296 .text directive, 57 switch statement (C language), 149 Text editors, 414–415 sxt instruction, 106 Text file I/O, 277–280 Symbol table, 57, 64, 66–67 Text files, comparing variants, 54 Symbolic addresses, 56 Text mode (ftp), 415 Symbolic assembler (see Assemblers) TextWrangler, 427 Symbolic debugger (see Debuggers) Threads, 203 Symbols Three-address instruction set, 33 absolute, 64 Ti (symbol for binary prefix), 5 characters used in, 58 Tilde character (see ~ character) external, 67, 68 Time from operating system, 222–224 global, 57, 64, 67 .title directive, 64 multiply-defined error, 67 tnat instruction, 310 relocatable, 64 tp register, 451 scope of, 59 trel completer (test bit instruction), 160 undefined, 67–68 trunc completer, 249 System/36, 34 Truncation System/38, 34 floating-point, 242–243, 249 System calls for I/O integer, 166 Linux, 267 Two-address instruction set, 33 Unix, 267 Two-pass assembler, 66–67 System control registers (see Registers) Two’s complement representation, 18–20 System information registers (see Registers) type command (DOS), 416 System libraries in linking process, 53 .type directive, 180 Typed language, 61 T Types of instructions (see Instruction classes) Tab character, use in statements, 56 tbit instruction, 160 U tebi- (binary prefix), 5 uint64_rem_min_lat (Intel open source), telnet program, use of, 413–414 220, 224 Template UltraSparc, 384 assembler-selected, 303 Unary operators, arithmetic and logical, 63 codes, 302 Unbiased rounding, 242–243 in instruction bundle, 85, 302–307 unc compare type, 146, 246 manually assigned, 303 Unconditional branch, 130–131 Temporary label (see Label) Unconditional compare, 146 tera- (decimal prefix), 5 Underflow, floating-point, 243 Tera Term, 427–428 Underscore 
character (see _ character) Terminal I/O, 268–270 Unformatted line I/O, 269–270, 279–280

Unicode, 42 -w2 option (ecc), 53, 360 Unix operating system -Wa option (gcc), 67 commands, 416 -Wall option (gcc), 53, 360 I/O software, 267 WAR dependency, 307 line terminators, 415 Warnings from compilers, 360–361 on-line documentation, 417 watch command (gdb), 72 unord compare relationship, 246 Watchpoints, 71, 74–75 @unorm mnemonic for fclass, 247 WAW dependency, 307 unpack instruction, 380 Weights of digits, 16 Unrolling of loops (see Loop unrolling) wexit branch type, 130, 315 Unsigned integers (see Integers) While loops (see Loops) Unwind information, 181, 210–211 Width of instructions, 83–85 Uploading files, 415 Windows Upper case, converting to lower, 159 64-bit, 425 Upper case usage, 53 client software, 427–428 Usernames, 414 line terminators, 415 UTF-8, UTF-16, UTF-32, 42 Wired or, 160 -Wl option (gcc), 68 V WLD (word-line decide pipeline stage), 401–402 %VAL function (FORTRAN), 210 Word (data width), 35–36 Value, passing argument by, 209–210 Word counting, 168–169 Value of a number, 16 .word directive, 60 Variable-width instructions, 84 -wp-ipo option (Intel compilers), 342 VAX architecture, 7, 9, 33–34, 84 WRB (write-back pipeline stage), 296–297, 401–402 Vectors, scalar product (see Scalar product) Writing programs, conventions for, 77–78 Vertical bar character (see | character) wtop branch type, 130, 315 Very long instruction word (see Architecture, VLIW) .vframe directive, 212–213 X Virtual addresses, 35 -x option (nm command), 53 Virtual machine, 420–421 /x display mode (debuggers), 72 Virtual PC, 420–421 x command (gdb), 72–73 VIS extensions (UltraSparc), 407 X class of instruction, 86–88 VLIW architecture (see Architecture) XCHG instruction (IA-32 architecture), 387 VLIW instruction-level parallelism, 301 xchg instruction, 388 VMS (see OpenVMS) xma instruction, 251 von Neumann architecture, 27, 99–100 xmpy instruction, 251 xor instruction, 157–158 W XOR logical function, 157 w access mode, 279 +w option (acc), 53, 369 Y /w display mode (gdb), 72
Y2K problem, 405

Z Zero-address instruction set, 32–33 -Z option (gcc), 472 Zero extension, 106–107 z test bit type, 160 Zero latency, 131, 140 @zero mnemonic for fclass, 247 zxt instruction, 106–107 Zero as IEEE number, 38, 40, 233