Computer Architecture 10
Instruction Set Architectures
Made with OpenOffice.org

Computer Application Areas
Desktop computing
  focus on both integer and FP processing
  program size does not matter
  power consumption may be an issue (laptops)
Servers
  essentially only integer performance matters
  code size and power consumption are irrelevant
Embedded applications
  cost and power consumption are critical
  code size is important (hand-crafted code)
  focus on worst-case performance

Instruction Set Architecture (ISA)
Target applications
  general-purpose vs dedicated
Programming language
  compiler vs assembler coding
Technology
  memory & I/O performance
Legacy issues
  backward compatibility

Taxonomy of ISAs
Ordered from past to present:
  Stack
  Accumulator
  Memory-Memory
  Register-Memory
  Register-Register

Stack Architecture
[Diagram: Processor (with SP), Memory]
Internal stack
All operands implicit
Push-Pop memory operations
Advantages:
  fast processing
  efficient and small code
  calculation-oriented; matches stack-oriented programming languages
Disadvantage:
  suited only to specific applications

Accumulator Architecture
[Diagram: Processor, Memory]
Special-purpose register(s)
Implicit source (one) and destination operands
Advantages:
  simple hardware, low cost
  simple instructions
  easy code generation
Disadvantage:
  low performance
Made wi th OpenOffi ce.org 6 Memory-Memory Architecture
NoNo internalinternal registersregisters Processor ExplicitExplicit operandsoperands
Advantages:Advantages:
most flexible code Memory generation most efficient programs DisadvantageDisadvantage long instructions memory performance
Made wi th OpenOffi ce.org 7 Register-Memory Architecture
General/SpecialGeneral/Special purposepurpose Processor registersregisters ExplicitExplicit sourcesource andand destinationdestination operandsoperands
Advantages:Advantages: flexible code generation Memory efficient programs DisadvantageDisadvantage complex hardware complex instructions Made wi th OpenOffi ce.org 8 Register-Register Architecture
[Diagram: Processor, Memory]
General-purpose registers
Explicit operands
Load-Store memory operations
Advantages:
  fast processing
  simple/short instructions
Disadvantages:
  long programs
  code optimization issues

Code Comparison
Operation: C ← A + B
A, B and C are memory addresses of the operands

Stack     Acc.      Mem-Mem      Reg-Mem        Reg-Reg
PUSH A    LOAD A    ADD A,B,C    LOAD A,R1      LOAD A,R1
PUSH B    ADD B                  ADD R1,B,R2    LOAD B,R2
ADD       STORE C                STORE R2,C     ADD R1,R2,R3
POP C                                           STORE R3,C
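
The Stack and Reg-Reg sequences for C ← A + B can be traced with a small interpreter. The Python sketch below is an illustration only: the function names `run_stack` and `run_load_store`, the tuple encoding of instructions, and the dictionary standing in for memory are assumptions made for this example, not part of the original material.

```python
def run_stack(program, memory):
    """Run a stack-machine program; ADD takes its operands implicitly from the stack."""
    stack = []
    for op, *args in program:
        if op == "PUSH":              # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "POP":             # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "ADD":             # no explicit operands
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown stack instruction: {op}")
    return memory


def run_load_store(program, memory):
    """Run a register-register program; only LOAD and STORE touch memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":              # LOAD addr,Rd
            addr, rd = args
            regs[rd] = memory[addr]
        elif op == "STORE":           # STORE Rs,addr
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":             # ADD Rs1,Rs2,Rd
            rs1, rs2, rd = args
            regs[rd] = regs[rs1] + regs[rs2]
        else:
            raise ValueError(f"unknown load-store instruction: {op}")
    return memory


STACK_PROGRAM = [("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")]
REG_REG_PROGRAM = [
    ("LOAD", "A", "R1"),
    ("LOAD", "B", "R2"),
    ("ADD", "R1", "R2", "R3"),
    ("STORE", "R3", "C"),
]

stack_result = run_stack(STACK_PROGRAM, {"A": 2, "B": 3, "C": 0})
reg_result = run_load_store(REG_REG_PROGRAM, {"A": 2, "B": 3, "C": 0})
print(stack_result["C"], reg_result["C"])
```

Both programs take four instructions and three memory accesses; the difference is that the stack version names no operands for ADD, while the Reg-Reg version names all three.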

Internal Registers
Special-purpose registers (extended accumulator)
  dedicated hardware units
  simpler internal circuitry
  limitation on code generation
General-purpose registers
  orthogonal instruction set
  flexibility in code generation
  more complex internal circuitry
Some special-purpose registers are always needed
How many general-purpose registers are sufficient?

Instruction Operands
2 vs 3 operands
memory vs register operands

Number of memory operands   Maximum number of all operands   Architecture   Examples
0                           3                                Reg-Reg        MIPS, SPARC, Alpha, PowerPC, ARM
1                           2                                Reg-Mem        Intel x86, Motorola 68k, TI TMS320
2                           2                                Mem-Mem        VAX
3                           3                                Mem-Mem        VAX

Type                 Advantages                                                       Disadvantages
Reg-Reg (0-3)        simple, short, fixed-length instructions; pipelining possible    long code
Reg-Mem (1-2)        flexible data access, compact code                               source operand overwritten, variable execution time
Mem-Mem (2-2, 3-3)   most compact and efficient code                                  long instructions, memory bottleneck

The notation (m-n) means m memory operands out of n operands in total.

Some Conclusions
Reg-Reg architectures are favored nowadays for desktop and high-performance computers
  simple, short and fixed-length instructions
  load-store architecture
  fast processing (pipelining)
  code optimization techniques
  large and cheap memories (large programs)
  fast cache memories
  multi-processor environments

Addressing Modes
Addressing modes of modern architectures
  combined modes possible
  PC-relative modes

Addressing mode       Example              Meaning                    Application
Immediate             Mov Rx,#imm          Rx = imm                   For constants
Direct (absolute)     Mov Rx,(adr)         Rx = Mem[adr]              Sometimes useful for accessing static data; the address constant may be large
Register              Mov Rx,Ry            Rx = Ry                    When a value (variable) is in a register
Register indirect     Mov Rx,(Ry)          Rx = Mem[Ry]               Accessing using a pointer or a computed address
R.i. + Displacement   Mov Rx,100(Ry)       Rx = Mem[100+Ry]           Accessing local data/variables
R.i. Indexed          Mov Rx,(Ry,Rz)       Rx = Mem[Ry+Rz]            Sometimes useful in array addressing
Memory indirect       Mov Rx,@(Ry)         Rx = Mem[Mem[Ry]]          If Ry is the address of a pointer p, the mode yields *p
Autoincrement         Mov Rx,(Ry)+         Rx = Mem[Ry], Ry = Ry+d    Useful for stepping through arrays or for stack implementation
Autodecrement         Mov Rx,-(Ry)         Ry = Ry-d, Rx = Mem[Ry]    Same use as autoincrement
Scaled                Mov Rx,100(Ry)[Rz]   Rx = Mem[100+Ry+Rz*d]      Used to index arrays

Here d is the size of the accessed operand in bytes.

Addressing Modes - Tests
Most frequently used by compilers:
  register indirect variations
  immediate addressing
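
To make the "Meaning" column of the addressing-mode table concrete, the following Python sketch computes the source operand for immediate addressing, the register indirect variations, and a few of the other modes. The function name `fetch_operand`, its keyword parameters, and the dictionaries used for registers and memory are hypothetical names chosen for this illustration.

```python
def fetch_operand(mode, regs, mem, *, imm=0, disp=0, base=None, index=None, size=1):
    """Return the source operand for a few of the tabulated addressing modes."""
    if mode == "immediate":            # Rx = imm
        return imm
    if mode == "register":             # Rx = Ry
        return regs[base]
    if mode == "register_indirect":    # Rx = Mem[Ry]
        return mem[regs[base]]
    if mode == "displacement":         # Rx = Mem[disp + Ry]
        return mem[disp + regs[base]]
    if mode == "indexed":              # Rx = Mem[Ry + Rz]
        return mem[regs[base] + regs[index]]
    if mode == "memory_indirect":      # Rx = Mem[Mem[Ry]]
        return mem[mem[regs[base]]]
    if mode == "scaled":               # Rx = Mem[disp + Ry + Rz*d]
        return mem[disp + regs[base] + regs[index] * size]
    raise ValueError(f"unsupported addressing mode: {mode}")


# Example machine state (values are arbitrary).
regs = {"Ry": 8, "Rz": 2}
mem = {8: 200, 10: 11, 108: 42, 116: 7, 200: 99}

local_var = fetch_operand("displacement", regs, mem, disp=100, base="Ry")
array_item = fetch_operand("scaled", regs, mem, disp=100, base="Ry", index="Rz", size=4)
pointee = fetch_operand("memory_indirect", regs, mem, base="Ry")
print(local_var, array_item, pointee)
```

Each additional term in the address (displacement, index, scale) is one more addition or shift in hardware, which is one reason the simple modes dominate in compiled code.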
Figure: CISC architecture. Source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Displacement Size
The majority of displacement values are small
Displacements range over the whole addressing space
Figure: 16-bit displacement architecture. Source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Immediate Size
Small immediate values correspond to data operations
Long immediates correspond to address operations
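
Whether an immediate (or a displacement) counts as small or long comes down to how many bits its two's-complement encoding needs. A minimal Python sketch, assuming signed fields; the helper names `signed_bits_needed` and `fits_signed` are made up for this example:

```python
def signed_bits_needed(value):
    """Smallest two's-complement field width (in bits) that can hold value."""
    if value >= 0:
        return value.bit_length() + 1      # one extra bit for the sign
    return (~value).bit_length() + 1       # ~value equals -value - 1


def fits_signed(value, bits):
    """True if value is representable in a signed field of the given width."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


loop_step = 100          # typical small data constant
big_address = 40000      # address-like constant, too large for 16 signed bits
print(signed_bits_needed(loop_step), fits_signed(big_address, 16))
```

A value that does not fit the instruction's field has to be built with extra instructions or a longer instruction format.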
Figure: 16-bit displacement architecture. Source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Immediate Ops. Frequency
About ¼ of all instructions have an immediate operand
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

DSP Addressing Modes
Dedicated for DSP algorithms
  Modulo or circular (in|de)crement addressing mode
    ● to deal with continuous streams of data held in circular buffers
  Bit-reverse addressing mode
    ● only for the FFT, which is the fundamental DSP algorithm
Specific addressing modes are used in assembler routines and are not generated by compilers
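
The address arithmetic behind the two DSP modes can be sketched in a few lines of Python. This is only a software model of what a DSP address generator does in hardware; the function names and the buffer parameters are assumptions for the illustration.

```python
def circular_increment(addr, step, base, length):
    """Advance addr by step, wrapping inside the buffer [base, base + length)."""
    return base + (addr - base + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT data reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit to the other end
        index >>= 1
    return result


# An 8-entry circular buffer starting at address 0x100.
wrapped_forward = circular_increment(0x107, 1, base=0x100, length=8)
wrapped_backward = circular_increment(0x100, -1, base=0x100, length=8)

# Access order of an 8-point FFT under bit-reverse addressing.
fft_order = [bit_reverse(i, 3) for i in range(8)]
print(hex(wrapped_forward), hex(wrapped_backward), fft_order)
```

On a general-purpose processor each of these updates costs several instructions per access, which is why DSPs fold them into the addressing mode.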

Some Conclusions
Emphasis on simple addressing modes
  immediate + register indirect with displacement
Small displacements are most frequent
  separate small & fast instructions
Longer displacements are fairly common
  size should cover the available addressing space
Small and big immediates are equally common
  data and address arithmetic instructions
Specific addressing modes for DSPs

Size of Operands
Alignment of operands in memory
Treatment of operands in registers
Types of operands
  Bytes: string processing (ASCII, UTF-8), BCD
  Half-words: Unicode, short integer
  Word: integer, single-precision FP
  Double: double-precision FP, long integer
Business vs scientific computations
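
Alignment means that an operand of size s bytes is stored at an address that is a multiple of s. The Python sketch below checks and enforces this; the byte sizes in `OPERAND_SIZES` assume a 32-bit word (word = 4 bytes), and all names are chosen for this example only.

```python
# Assumed operand sizes in bytes for a machine with a 32-bit word.
OPERAND_SIZES = {"byte": 1, "half-word": 2, "word": 4, "double": 8}


def is_aligned(address, size):
    """True if address is a multiple of the operand size."""
    return address % size == 0


def align_up(address, size):
    """Round address up to the next multiple of size (size must be a power of two)."""
    return (address + size - 1) & ~(size - 1)


word = OPERAND_SIZES["word"]
print(is_aligned(0x1004, word), is_aligned(0x1006, word), hex(align_up(0x1006, word)))
```

A misaligned operand may straddle two memory words, so it either needs two accesses or is rejected by the hardware, depending on the architecture.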

Distribution of Operand Use
Strong position of byte-size operations
64-bit IEEE 754 for FP computations
32- and 64-bit integer operations
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Operations in Instruction Set
Necessary types of instructions
  Arithmetic and logical
    ● Integer arithmetic and logical operations: add, subtract, and, or, multiply, divide
  Data transfer
    ● Loads-stores (move instructions on computers with memory addressing)
  Control
    ● Branch, jump, procedure call and return, traps
  System
    ● Operating system call, mode of operation
  Application specific (DSP)

Operations in Instruction Set
Optional types of instructions
  Floating point
    ● Floating-point operations: add, multiply, divide, compare
  Decimal
    ● Decimal add, decimal multiply, decimal-to-character conversions
  String
    ● String move, string compare, string search
  Graphics & multimedia
    ● Pixel and vertex operations, compression/decompression operations

Quantitative Example
Simple instructions dominate
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Control Flow Instructions
Branches (conditional): dominate
Jumps (unconditional)
Procedure calls & returns
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Addressing Modes, Displacement
PC-relative modes: position-independent code
Indirect modes: jumps to a target given at run time
Branch (PC-relative) displacements mostly fit in fewer than 12 bits
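
A PC-relative branch stores only a short signed displacement; the target is the PC plus that displacement after sign extension. The Python sketch below assumes a 12-bit field counted in bytes and added to the address of the branch itself (real instruction sets often scale the field or use the address of the next instruction); the function names are hypothetical.

```python
def sign_extend(field, bits):
    """Interpret the low `bits` bits of field as a two's-complement number."""
    field &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (field ^ sign_bit) - sign_bit


def branch_target(pc, field, bits=12):
    """PC-relative target: the branch address plus the sign-extended displacement."""
    return pc + sign_extend(field, bits)


backward = branch_target(0x4000, 0xFF0)   # field 0xFF0 encodes -16
forward = branch_target(0x4000, 0x010)    # field 0x010 encodes +16

# Position independence: the same field works wherever the code is loaded.
offsets = [branch_target(base + 0x40, 0xFF0) - base for base in (0x1000, 0x8000)]
print(hex(backward), hex(forward), offsets)
```

Because only the distance is encoded, relocating the code changes neither the instruction bits nor the relative position of the target.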
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Condition Testing
Three primary techniques
  CCR (condition code register), condition register, test & branch

Condition Types
The majority of conditions are comparisons
Most comparisons are simple, many of them with 0
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Some Conclusions
Processing should support all the data sizes: 8, 16, 32 and 64 bits (bytes are important!)
Compilers favor simple instructions; some advanced ones might be of value
Implementation of FP instructions in all general-purpose processors is more for competition than for real use
PC-relative branches dominate; a displacement of 8-12 bits is enough
Simple tests are favored in programs
There are alternative condition testing methods

Instruction Encoding
Requirements:
  number of instructions (opcode)
  number of registers
  variable operand size
  several addressing modes
  long displacements and immediates
Limitations:
  alignment of instructions in memory
  pipeline processing
  code density
  decoding complexity

Encoding Choices
Variable
  all addressing modes for all operands
  many instruction formats
  variable instruction length; compact code
Fixed
  addressing mode inside opcode
  fixed-size instructions (for pipelining); large code
  reduced number of instructions
Hybrid
  some variation in size is allowed
  compression of instructions

Implementations
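
As one illustration of the fixed choice, the Python sketch below packs a register-register instruction into a single 32-bit word using the MIPS R-type field layout (6-bit opcode, three 5-bit register numbers, 5-bit shift amount, 6-bit function code). The helper names `encode_r_type` and `decode_r_type` are assumptions made for this example.

```python
# (field name, width in bits), most significant field first; widths sum to 32.
R_TYPE_FIELDS = (("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one fixed-size 32-bit instruction word."""
    values = {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit word back into its fields; every field sits at a fixed position."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# add $t0, $t1, $t2  ->  rd = 8, rs = 9, rt = 10, funct = 0x20
add_word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(hex(add_word), decode_r_type(add_word))
```

Decoding needs only shifts and masks at fixed bit positions, which is what makes fixed formats convenient for pipelined implementations; the price is that every instruction occupies a full word.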

Compiler Support
Almost all programming is done in high-level languages
The compiler affects code performance
The architecture affects compiler complexity
How does compiler technology influence computer architectures?
How can the architecture help the compiler?

Compiler Operation

Optimization Examples

Register Allocation
Register allocation is the key architecture-dependent optimization
Allocation is an NP-complete problem (how to use limited resources so that dependent operations use different containers: the graph coloring problem)
Heuristics work well, in nearly linear time
The number of registers must be at least 16
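
The graph coloring view can be sketched as follows: variables are nodes, an edge joins two variables that are live at the same time, and registers are the colors. The Python code below is a simple greedy heuristic written for this illustration, not the allocator of any particular compiler; the function name and the example graph are assumptions.

```python
def color_registers(interference, num_registers):
    """Greedy coloring: map each variable to a register index, or None if it must be spilled."""
    # Handle the most constrained (highest-degree) variables first.
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    assignment = {}
    for var in order:
        # Registers already taken by neighbors that have been assigned one.
        taken = {assignment[n] for n in interference[var] if assignment.get(n) is not None}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[var] = free[0] if free else None
    return assignment


# Hypothetical interference graph: a, b and c are live together; d overlaps only c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

with_three = color_registers(graph, 3)
with_two = color_registers(graph, 2)
print(with_three, with_two)
```

With three registers every variable gets one; with two, one of the three mutually interfering variables is left without a register and would have to be kept in memory, which is why a larger register file makes the heuristic's job easier.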

Optimization Example
Figure source: Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Impact of Compilers
Variable allocation and addressing
  Stack: local scalar variables, addressed SP-relative
  Global data area: indexed data structures
  Dynamic allocation area: access with pointers
Register allocation
  effective only for stack-allocated data
Rare/exotic instruction support

Impact of Architecture
Make the frequent case fast and the infrequent case correct
Architecture regularity
  orthogonal instruction set
Favor primitives, not solutions
  simple instructions, not matching the high-level language functionality
Reduce trade-offs among alternatives
  performance vs size
  register allocation vs cache performance

Some Conclusions
Fixed instruction encoding: performance
Variable instruction encoding: code size
Software/Hardware Co-design

Summary
Architecture class
Addressing modes
Data types
Instruction types
Flow control
Encoding
Compiler issues