
Computer Architecture 10

Instruction Set Architectures

Made with OpenOffice.org

Application Areas

Desktop computing
  focus on both integer and FP processing
  program size does not matter
  power consumption may be an issue (laptops)
Servers
  basically only integer performance matters
  code size and power consumption are irrelevant
Embedded applications
  cost and power consumption are critical
  code size is important (hand-crafted code)
  focus on worst-case performance

Instruction Set Architecture (ISA)

Target applications
  general-purpose vs dedicated
Programming language
  compiler vs assembler coding
Technology
  memory & I/O performance
Legacy issues
  backward compatibility

Taxonomy of ISAs

past
  Stack
  Accumulator
  Memory-Memory
  Register-Memory
  Register-Register
present

Stack Architecture

Internal stack
All operands implicit
Push-Pop memory operations
Advantages:
  fast processing
  efficient and small code
  calculation-oriented – match with stack-oriented programming languages
Disadvantage
  specific applications
[Figure: Processor with stack pointer (SP), connected to Memory]

Accumulator Architecture

Special-purpose register(s)
Implicit source (one) and destination operands
Advantages:
  simple hardware – low cost
  simple instructions
  easy code generation
Disadvantage
  low performance
[Figure: Processor and Memory]

Memory-Memory Architecture

No internal registers
Explicit operands
Advantages:
  most flexible code generation
  most efficient programs
Disadvantage
  long instructions
  memory performance
[Figure: Processor and Memory]

Register-Memory Architecture

General/special-purpose registers
Explicit source and destination operands
Advantages:
  flexible code generation
  efficient programs
Disadvantage
  complex hardware
  complex instructions
[Figure: Processor and Memory]

Register-Register Architecture

General-purpose registers
Explicit operands
Load-Store memory operations
Advantages:
  fast processing
  simple/short instructions
Disadvantage
  long programs
  code optimization issues
[Figure: Processor and Memory]

Code Comparison

Operation: C ← A + B
A, B and C are the memory addresses of the operands

Stack     Acc.      Mem-Mem     Reg-Mem        Reg-Reg
PUSH A    LOAD A    ADD A,B,C   LOAD A,R1      LOAD A,R1
PUSH B    ADD B                 ADD R1,B,R2    LOAD B,R2
ADD       STORE C               STORE R2,C     ADD R1,R2,R3
POP C                                          STORE R3,C
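
The Stack and Reg-Reg columns can be traced with a small interpreter. The Python sketch below is only an illustration under stated assumptions: the function names run_stack and run_reg_reg, the tuple encoding of instructions, and the dictionary used as memory are hypothetical and not part of the lecture.

```python
def run_stack(program, mem):
    """Execute a stack-architecture program; ADD has implicit operands."""
    stack = []
    for op, *args in program:
        if op == "PUSH":            # memory -> top of stack
            stack.append(mem[args[0]])
        elif op == "ADD":           # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "POP":           # top of stack -> memory
            mem[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    return mem


def run_reg_reg(program, mem):
    """Execute a load-store program; only LOAD and STORE touch memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":            # LOAD addr,Rd
            addr, rd = args
            regs[rd] = mem[addr]
        elif op == "ADD":           # ADD Rs1,Rs2,Rd
            rs1, rs2, rd = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "STORE":         # STORE Rs,addr
            rs, addr = args
            mem[addr] = regs[rs]
        else:
            raise ValueError(f"unknown opcode {op}")
    return mem


# The two instruction sequences from the comparison table.
STACK_CODE = [("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")]
REG_REG_CODE = [("LOAD", "A", "R1"), ("LOAD", "B", "R2"),
                ("ADD", "R1", "R2", "R3"), ("STORE", "R3", "C")]
```

Both programs have four instructions here, but every stack instruction except ADD accesses memory, while the Reg-Reg version confines memory traffic to LOAD and STORE.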

Internal Registers

Special-purpose registers (extended accumulator)
  dedicated hardware units
  simpler internal circuitry
  limitation on code generation
General-purpose registers
  orthogonal instruction set
  flexibility in code generation
  more complex internal circuitry
Some special-purpose registers are always needed
How many general-purpose registers are sufficient?

Instruction Operands

2 vs 3 operands
memory vs register operands

Number of memory   Maximum number    Architecture   Examples
operands           of all operands
0                  3                 Reg-Reg        MIPS, Sparc, Alpha, PowerPC, ARM
1                  2                 Reg-Mem        Intel i86, Motorola 68k, TI TMS320
2                  2                 Mem-Mem        VAX
3                  3                 Mem-Mem        VAX

Type                Advantages                                  Disadvantages
Reg-Reg (0-3)       simple, short, fixed-length instructions;   long code
                    pipelining possible
Reg-Mem (1-2)       flexible data access; compact code          source operand overwritten;
                                                                variable execution time
Mem-Mem (2-2, 3-3)  most compact and efficient code             long instructions; memory bottleneck

Some Conclusions

Reg-Reg architectures are favored nowadays for desktop and high-performance computers
  simple, short and fixed-length instructions
  load-store architecture
  fast processing (pipelining)
  code optimization techniques
  large and cheap memories (large programs)
  fast cache memories
  multi-processor environments

Addressing Modes

Addressing modes of modern architectures
  combined modes possible
  PC-relative modes

Addressing mode      Example             Meaning                   Application
Immediate            Mov Rx,#imm         Rx = imm                  For constants
Direct (absolute)    Mov Rx,(adr)        Rx = Mem[adr]             Sometimes useful for accessing static data, large address constant
Register             Mov Rx,Ry           Rx = Ry                   When a value (variable) is in a register
Register indirect    Mov Rx,(Ry)         Rx = Mem[Ry]              Accessing using a pointer or a computed address
R.i. + Displacement  Mov Rx,100(Ry)      Rx = Mem[100+Ry]          Accessing local data/variables
R.i. Indexed         Mov Rx,(Ry,Rz)      Rx = Mem[Ry+Rz]           Sometimes useful in array addressing
Memory indirect      Mov Rx,@(Ry)        Rx = Mem[Mem[Ry]]         If Ry is the address of a pointer p, then mode yields *p
Autoincrement        Mov Rx,(Ry)+        Rx = Mem[Ry], Ry = Ry+d   Useful for stepping through arrays or stack implementation
Autodecrement        Mov Rx,-(Ry)        Ry = Ry-d, Rx = Mem[Ry]   Same use as autoincrement
Scaled               Mov Rx,100(Ry)[Rz]  Rx = Mem[100+Ry+Rz*d]     Used to index arrays

Addressing Modes - Tests

Most frequently used by compilers:
  Register Indirect variations
  Immediate Addressing

CISC architecture; Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Displacement Size

Majority of small displacement values
Displacements match the whole addressing space

16-bit displacement architecture; Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Immediate Size

Small immediate values correspond to data oper.
Long immediates correspond to address oper.

16-bit displacement architecture; Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Immediate Ops. Frequency

About ¼ of all instructions have an immediate operand

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

DSP Addressing Modes

Dedicated to DSP algorithms
  Modulo or circular (in|de)crement
    ● to deal with continuous streams of data in circular buffers
  Bit-reverse addressing mode
    ● only for FFT, which is the fundamental DSP algorithm

Specific addressing modes are used in assembler routines, and not generated by compilers
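
The index arithmetic behind the two DSP modes can be sketched in a few lines of Python. This is a software illustration only (a DSP performs the same update in its address-generation hardware); the function names circular_increment and bit_reverse are hypothetical.

```python
def circular_increment(index, step, length):
    """Advance a buffer index with wrap-around (modulo addressing).

    A negative step gives the circular decrement.
    """
    return (index + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (bit-reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit across
        index >>= 1
    return result


# Data ordering used by an 8-point radix-2 FFT (3 address bits).
FFT8_ORDER = [bit_reverse(i, 3) for i in range(8)]
```

In an 8-entry buffer the index following 7 is 0 again, and FFT8_ORDER evaluates to [0, 4, 2, 6, 1, 5, 3, 7].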

Some Conclusions

Emphasis on simple addressing modes
  immediate + register indirect with displacement
Small displacements are most frequent
  separate small & fast instructions
Longer displacements are fairly common
  size should cover the available addressing space
Small and big immediates are equally common
  data and address arithmetic instructions
Specific addressing modes for DSPs
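
To make the difference between the simple and the more elaborate modes concrete, the "Meaning" column of the addressing-mode table can be written out as one Python function. This is a sketch under assumptions: the name fetch, the keyword parameters, the dictionary model of registers and memory, and the default operand size d = 4 are all hypothetical.

```python
def fetch(mode, regs, mem, *, imm=0, adr=0, ry=None, rz=None, disp=0, d=4):
    """Return the value loaded into Rx for the given addressing mode.

    regs maps register names to values, mem maps addresses to values,
    d is the operand size in bytes (assumed to be 4 by default).
    """
    if mode == "immediate":
        return imm
    if mode == "direct":
        return mem[adr]
    if mode == "register":
        return regs[ry]
    if mode == "register_indirect":
        return mem[regs[ry]]
    if mode == "displacement":
        return mem[disp + regs[ry]]
    if mode == "indexed":
        return mem[regs[ry] + regs[rz]]
    if mode == "memory_indirect":
        return mem[mem[regs[ry]]]
    if mode == "autoincrement":      # read first, then advance the pointer
        value = mem[regs[ry]]
        regs[ry] += d
        return value
    if mode == "autodecrement":      # move the pointer back, then read
        regs[ry] -= d
        return mem[regs[ry]]
    if mode == "scaled":
        return mem[disp + regs[ry] + regs[rz] * d]
    raise ValueError(f"unknown addressing mode {mode}")
```

Only the last four branches need more than one memory access or a register update, which is one way to see why the immediate and displacement modes are the cheap ones to implement.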

Size of Operands

Alignment of operands in memory
Treatment of operands in registers
Types of operands
  Bytes: string processing: ASCII, UTF-8, BCD
  Half-words: Unicode, short integer
  Word: integer, single-precision FP
  Double: double-precision FP, long integer
Business vs Scientific computations
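
The alignment issue can be illustrated with a short Python sketch. It assumes natural alignment (an operand's address is a multiple of its size) and the byte sizes 1, 2, 4 and 8 for the four operand types; both are assumptions for illustration, and the names OPERAND_SIZES, is_aligned and align_up are hypothetical.

```python
# Operand sizes in bytes (assumed: 32-bit word, 64-bit double).
OPERAND_SIZES = {"byte": 1, "half-word": 2, "word": 4, "double": 8}


def is_aligned(address, operand_type):
    """True if address is a multiple of the operand size."""
    return address % OPERAND_SIZES[operand_type] == 0


def align_up(address, operand_type):
    """Round address up to the next naturally aligned address."""
    size = OPERAND_SIZES[operand_type]
    return (address + size - 1) // size * size
```

Under these assumptions a word at address 6 is misaligned, and the next aligned word address is 8.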

Distribution of Operand Use

Strong position of byte-size operations
64-bit IEEE754 for FP computations
32- and 64-bit integer operations

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Operations in Instruction Set

Necessary types of instructions
  Arithmetic and logical
    ● Integer arithmetic and logical operations: add, subtract, and, or, multiply, divide
  Data transfer
    ● Loads-stores (move instructions on computers with memory addressing)
  Control
    ● Branch, jump, procedure call and return, traps
  System
    ● Operating system call, mode of operation
  Application specific (DSP)

Operations in Instruction Set

Optional types of instructions
  Floating point
    ● Floating-point operations: add, multiply, divide, compare
  Decimal
    ● Decimal add, decimal multiply, decimal-to-character conversions
  String
    ● String move, string compare, string search
  Graphics & multimedia
    ● Pixel and vertex operations, compression/decompression operations

Quantitative Example

Simple instructions dominate

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Control Flow Instructions

Branches (conditional) - dominate
Jumps (unconditional)
Procedure calls & returns

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Addressing Modes, Displacement

PC-relative modes – position independent code
Indirect modes – jumps to target given at run-time
Branch (PC-relative) displacement < 12 bits:

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Condition Testing

Three primary techniques
  CCR (condition code register), condition register, test & branch
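
The three techniques differ only in where the comparison result lives before the branch uses it. The Python sketch below models a branch-if-equal in each style; the function names and the mnemonics in the comments (CMP, BEQ, SEQ, BNEZ) are hypothetical and not tied to a specific instruction set.

```python
def branch_with_ccr(a, b):
    """Condition codes: a compare sets flags, a later branch reads them."""
    flags = {"Z": a == b}                # CMP a,b   (sets the zero flag)
    return flags["Z"]                    # BEQ target


def branch_with_condition_register(a, b):
    """Condition register: the result lands in an ordinary register."""
    r1 = 1 if a == b else 0              # SEQ R1,a,b
    return r1 != 0                       # BNEZ R1,target


def compare_and_branch(a, b):
    """Test & branch: comparison and branch are a single instruction."""
    return a == b                        # BEQ a,b,target
```

Each function returns True when the branch would be taken, so all three agree on every input; what changes is the number of instructions and the state (flags or a register) that sits between them.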

Condition Types

Majority of conditions are comparisons
Most comparisons are simple, many with 0

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Some Conclusions

Processing should support all the data sizes: 8, 16, 32 and 64 bits (bytes are important!)
Compilers favor simple instructions; some advanced ones might be of value
Implementation of FP instructions in all general-purpose processors is more for competition than for real use
PC-relative branches dominate, with a displacement of 8-12 bits being enough
Simple tests are favored in programs
There are alternative condition testing methods

Instruction Encoding

Requirements:
  number of instructions (opcode)
  number of registers
  variable operand size
  several addressing modes
  long displacements and immediates
Limitations
  alignment of instructions in memory
  pipeline processing
  code density
  decoding complexity

Encoding Choices

Variable
  all addressing modes for all operands
  many instruction formats
  variable instruction length – compact code
Fixed
  addressing mode inside opcode
  fixed-size instructions (for pipeline) – large code
  reduced number of instructions
Hybrid
  some variation in size is allowed
  compression of instructions

Implementations

Compiler Support

Almost all programming is done in high-level languages
Compiler affects the code performance
Architecture affects the compiler complexity

How does compiler technology influence computer architectures?
How can the architecture help the compiler?

Compiler Operation

Optimization Examples

Register Allocation

Register allocation is the key architecture-dependent optimization
Allocation is an NP-complete problem (how to use limited resources so that dependent operations use different containers – graph coloring problem)
Heuristics work well – in nearly linear time
Number of registers must be at least 16
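
The graph-coloring view can be sketched with a simple greedy heuristic in Python. This is an illustrative assumption, not the allocator of any particular compiler: the function name allocate_registers, the interference-graph representation, and the highest-degree-first ordering are all chosen here for brevity.

```python
def allocate_registers(interference, num_registers):
    """Greedy graph coloring: interfering values get different registers.

    interference maps each value to the set of values that are live at
    the same time. Returns (assignment, spilled); spilled values did not
    fit into the register file and would have to stay in memory.
    """
    assignment, spilled = {}, []
    # Color the most constrained (highest-degree) values first.
    for value in sorted(interference, key=lambda v: -len(interference[v])):
        taken = {assignment[n] for n in interference[value] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        if free:
            assignment[value] = free[0]
        else:
            spilled.append(value)
    return assignment, spilled


# Hypothetical interference graph: a, b, c are mutually live; d overlaps only c.
EXAMPLE_GRAPH = {"a": {"b", "c"}, "b": {"a", "c"},
                 "c": {"a", "b", "d"}, "d": {"c"}}
```

With three registers the example graph is colored without spills; with two registers one of the three mutually interfering values has to be spilled, which hints at why a larger register file makes the heuristic's job easier.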

Optimization Example

Hennessy, Patterson, Computer Architecture: A Quantitative Approach

Impact of Compilers

Variable allocation and addressing
  Stack – local scalar variables, SP-relative addressing
  Global data area – indexed data structures
  Dynamic allocation area – access with pointers
Register allocation
  effective only for stack-allocated data
Rare/exotic instruction support

Impact of Architecture

Make the frequent case fast and the infrequent case correct
Architecture regularity
  orthogonal instruction set
Favor primitives, not solutions
  simple instructions, not matching the high-level language functionality
Reduce trade-offs among alternatives
  performance vs size
  register allocation vs cache performance

Some Conclusions

Fixed instruction encoding – performance
Variable instruction encoding – code size
Software/Hardware Co-design
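
Why fixed encoding helps performance can be seen in how little work decoding takes. The Python sketch below packs a three-register instruction into a 32-bit word using a hypothetical layout (6-bit opcode, three 5-bit register fields, 11 unused low bits); the layout and the names encode and decode are assumptions for illustration, not the format of a specific processor.

```python
OPCODE_BITS, REG_BITS = 6, 5   # hypothetical field widths


def encode(opcode, rd, rs1, rs2):
    """Pack one three-register instruction into a fixed 32-bit word."""
    for value, bits in ((opcode, OPCODE_BITS), (rd, REG_BITS),
                        (rs1, REG_BITS), (rs2, REG_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return (opcode << 26) | (rd << 21) | (rs1 << 16) | (rs2 << 11)


def decode(word):
    """Extract the fields with fixed shifts and masks; no length decoding."""
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F,
            (word >> 16) & 0x1F, (word >> 11) & 0x1F)
```

Every field sits at a known bit position, so all fields can be extracted in parallel; the cost is that each instruction occupies 32 bits even when some of them go unused, which is the code-size side of the trade-off.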

Summary

Architecture class
Addressing modes
Data types
Instruction types
Flow control
Encoding
Compiler issues
