EE108B Lecture2 MIPSAssemblyLanguageI

ChristosKozyrakis StanfordUniversity http://eeclass.stanford.edu/ee108b

C.Kozyrakis EE108bLecture2 1 Announcements

– EEundergrads:EE108AandCS106B – Everybodyelse:E40andCS106B(orequivalent) • EE108BProblemSession – Fridays2.15pm– 3.05pm,Thornton102,LiveonE4 • HW1outtoday • LAB1andPA1outnextweek – Makesureyouregisterfortheclass – Registerateeclass.stanford.edu/ee108b • Detailsaboutlabarrangementstobeannouncedsoon – Checkyouremailandwebpageannouncements

C.Kozyrakis EE108bLecture2 2 Review

• Integratedcircuits(Moore’sLaw) – 2xtransistorsevery18months – ~300milliontransistorstoday – Slowingdownduetocostoffabricationplants • Processors – HidecomplexityofICdesignwithsequentialprogrammingmodel – Untilrecently60%performanceimprovementperyear – Slowingsownduetohighpower,highcomplexity,andlackofideas • Whystudycomputerdesign? – Processorsareimportant – Goodexampleofdigitalsystemwithgeneraldesignprinciples – Helpwritebetter(fewerbugs,higherperformance)programs

C.Kozyrakis EE108bLecture2 3 Today’sTopic:MIPSInstructionSetArchitecture

• Textbookreading – 2.1–2.6 – Lookathowinstructionsaredefinedandrepresented

• Whatisaninstructionsetarchitecture(ISA)? • InterplayofCandMIPSISA • ComponentsofMIPSISA – Registeroperands – Memoryoperands – Arithmeticoperations – Controlflowoperations

C.Kozyrakis EE108bLecture2 4 5componentsofanyComputer

Computer Keyboard, Mouse Processor Memory Devices Disk Control Input

Datapath Output

Display Stored-program concept Printer

C.Kozyrakis EE108bLecture2 5 Computers(AllDigitalSystems) AreAtTheirCorePrettySimple

• Computersonlyworkwithbinarysignals – Signalonawireiseither0,or1 • Usuallycalleda“bit” – Morecomplexstuff(numbers,characters,strings,pictures) • Mustbebuiltfrommultiplebits

• Builtoutofsimplelogicgatesthatperformboolean logic – AND,OR,NOT,… • andmemorycellsthatpreservebitsovertime – Flipflops,registers,SRAMcells,DRAMcells,…

• Togethardwaretodoanything,needtobreakitdowntobits – Stringsofbitsthattellthehardwarewhattodoarecalled instructions – Asequenceofinstructionscalled machine language program (machinecode)

C.Kozyrakis EE108bLecture2 6 Hardware/SoftwareInterface Application (Firefox) Operating Compiler System Software Assembler (Linux) Instruction Set Hardware Processor Memory I/O system Architecture Datapath & Control • TheInstructionSetArchitecture(ISA)defineswhatinstructions do • MIPS,IA32(x86),SunSPARC,PowerPC,IBM390,IntelIA64 – TheseareallISAs • ManydifferentimplementationscanimplementsameISA(family) – 8086,386,486,Pentium,PentiumII,Pentium4implementIA32 – Ofcoursetheycontinuetoextendit,whilemaintainingbinarycompatibility • ISAslastalongtime – x86hasbeeninusesincethe70s C.Kozyrakis EE108bLecture2 7 – IBM390startedasIBM360 in60s MIPSISA

• MIPS– semiconductorcompanythatbuiltoneofthefirst commercialRISCarchitectures – FoundedbyJ.Hennessy • WewillstudytheMIPSarchitectureinsomedetailinthis class • WhyMIPSinsteadofIntel80x86? – MIPSissimple,elegantandeasytounderstand – x86isuglyandcomplicatedtoexplain – x86isdominantondesktop – MIPSisprevalentinembeddedapplicationsas processorcoreofsystemonchip(SOC)

C.Kozyrakis EE108bLecture2 8 MIPSProcessorHistory

Model - Clock Year Instruction Set Cache (I+D) Transistor Count Rate (MHz) External:4K+4K 1987 R200016 MIPSI 115thousand to32K+32K External:4K+4K 1990 33 120thousand to64K+64K 1991 100 MIPSIII 8K+8K 1.35million

1993 R4400150 16K+16K 2.3million

R4600100 16K+16K 1.9million

1995 Vr4300133 16K+8K 1.7million

1996 200 MIPSIV 32K+32K 3.9million

R10000200 32K+32K 6.8million

1999 R12000300 32K+32K 7.2million

2002 R14000600 32K+32K 7.2million

C.Kozyrakis EE108bLecture2 9 Cvs MIPSIISA ProgrammersInterface

•C • MIPSIISA – Variables – Registers(processorstate) • locals • 3232binteger,R0=0 • globals • 3232bsingleFP,1664bdoubleFP – Datatypes • PCandotherspecialregisters • int,short,char,unsigned – Memory • float,double • 232 lineararrayofbytes • Aggregatedatatypes – Datatypes • pointers • word(32b),byte(8b),half – Operators word(16b), • +,,*,%,++,<,etc. • singleFP(32b),doubleFP(64b) – Controlstructures – Operators • Ifelse,while,dowhile,for,switch, • add,sub,lw,sw,slt,etc. procedurecall,return – Control • Branches,jumps,jumpandlink

C.Kozyrakis EE108bLecture2 10 RunningAnApplication

temp = v[k]; High Level Language v[k] = v[k+1]; Program v[k+1] = temp; Compiler lw $15, 0($2) Assembly Language lw $16, 4($2) Program sw $16, 0($2) Assembler sw $15, 4($2) 0000 1001 1100 0110 1010 1111 0101 1000 Machine Language 1010 1111 0101 1000 0000 1001 1100 0110 Program 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111 Machine Interpretation

Control Signal High/Low on control lines Specification

C.Kozyrakis EE108bLecture2 11 WhyHaveRegisters?

• MemorymemoryISA – AllHLLvariablesdeclaredinmemory – Whynotoperatedirectlyonmemoryoperands? Memory – E.g.DigitalEquipmentCorp(DEC)VAXISA 232 Bytes • Benefitsofregisters – Smallerisfaster – Multipleconcurrentaccesses – Shorternames Load Store

• LoadStoreISA 32 Registers – Arithmeticoperationsonlyuseregisteroperands 27 Bytes – Dataisloadedintoregisters,operatedon,and storedbacktomemory Arithmetic – AllRISCinstructionsets unit

C.Kozyrakis EE108bLecture2 12 UsingRegisters

• Registersareafiniteresourcethatneedstobemanaged – Programmer – Compiler:registerallocation • Goals – Keepdatainregistersasmuchaspossible – Alwaysusedatastillinregistersifpossible • Issues – Finitenumberofregistersavailable • Spillregisterstomemorywhenallregistersinuse – Arrays • Dataistoolargetostoreinregisters • What’stheimpactoffewerormoreregisters?

C.Kozyrakis EE108bLecture2 13 RegisterNaming

• Registersareidentifiedbya $

• Byconvention,wealsogivethemnames – $zero containsthehardwiredvalue0 – $v0, $v1 areforresultsandexpressionevaluation – $a0–$a3 areforarguments – $s0, $s1, … $s7 areforsavevalues – $t0, $t1, … $t9 arefortemporaryvalues – Theotherswillbeintroducedasappropriate • SeeFig3.13p.140fordetails

• Compilersusetheseconventionstosimplifylinking

C.Kozyrakis EE108bLecture2 14 AssemblyInstructions

• Thebasictypeofinstructionhasfourcomponents: 1. Operationname 2. 1stsourceoperand 3. 2ndsourceoperand 4. Destinationoperand

• add dst, src1, src2 # dst = src1 + src2

• dst,src1,andsrc2areregisternames($) • Whatdotheseinstructionsdo? – add$1,$1,$1 – add$1,$1,$1

C.Kozyrakis EE108bLecture2 15 CExample

1: int sum_pow2(int b, int c) 2: { 3: int pow2 [8] = {1, 2, 4, 8, 16, 32, 64, 128}; 4: int a, ret; 5: a = b + c; 6: if (a < 8) 7: ret = pow2[a]; 8: else 9: ret = 0; 10: return(ret); 11:}

C.Kozyrakis EE108bLecture2 16 ArithmeticOperators

• Considerline5,Coperationforaddition a = b + c; • Assumethevariablesareinregisters $1 $3 respectively • The add operatorusingregisters add $1, $2, $3 # a = b + c

• Usethe sub operatorfor a=b–c inMIPS sub $1, $2, $3 # a = b - c

• Butweknowthata,b,andcreallyrefertomemorylocations

C.Kozyrakis EE108bLecture2 17 ComplexOperations

• Whataboutmorecomplexstatements? a = b + c + d - e;

• Breakintomultipleinstructions add $t0, $s1, $s2 # $t0 = b + c add $t1, $t0, $s3 # $t1 = $t0 + d sub $s0, $t1, $s4 # a = $t1 - e

C.Kozyrakis EE108bLecture2 18 Signed&UnsignedNumber

• Ifgivenb[n1:0]inaregisterorinmemory

• Unsignedvalue n−1 value b i = ∑ i 2 i=0

• Signedvalue(2’scomplement)

n−2 value b n−1 b i = −( n−1 2 ) + ∑ i 2 i=0

C.Kozyrakis EE108bLecture2 19 Unsigned&SignedNumbers

X unsigned signed 0000 • Examplevalues 0001 – 4bits 0010 – Unsigned:[0,2 4 – 1] 0011 3 3 0100 – Signed:[ 2 ,2 1] 0101 • Equivalence 0110 – Sameencodingsfornon 0111 negativevalues 1000 • Uniqueness 1001 – Everybitpatternrepresents 1010 uniqueintegervalue 1011 1100 – Nottruewithsignmagnitude 1101 1110 1111 C.Kozyrakis EE108bLecture2 20 ArithmeticOverflow

Whenthesumoftwonbitnumberscannotberepresentedinnbits

Unsigned Signed • WrapsAround True Sum – Iftruesum≥2 n 0 n – Atmostonce Modular Sum n True Sum 0 n

0 n

1 n

n Modular Sum 1

C.Kozyrakis EE108bLecture2 21 Constants

• Oftenwanttobeabletospecifyoperandintheinstruction:immediateor literal • Usethe addi instruction

addi dst, src1, immediate

• Theimmediateisa16bitsignedvaluebetween215 and2 15 1 • Signextendedto32bits

• ConsiderthefollowingCcode a++;

• The addi operator addi $s0, $s0, 1 # a = a + 1

C.Kozyrakis EE108bLecture2 22 MIPSSimpleArithmetic

Instruction Example Meaning Comments add add$1,$2,$3 $1=$2+$3 3operands;Overflow subtract sub$1,$2,$3 $1=$2– $3 3operands;Overflow addimmediate addi $1,$2,100 $1=$2+100 +constant;Overflow addunsigned addu $1,$2,$3 $1=$2+$3 3operands;Nooverflow subtractunsign subu $1,$2,$3 $1=$2– $3 3operands;Nooverflow addimm unsign addiu $1,$2,100 $1=$2+100 +constant;Nooverflow

HowdoesCtreatoverflow?

C.Kozyrakis EE108bLecture2 23 MemoryDataTransfer

• Datatransferinstructionsareusedtomovedatatoandfrom memory • Aloadoperationmovesdatafromamemorylocationtoaregister andastoreoperationmovesdatafromaregistertoamemory location

address (32)

Memory CPU data (32) 232 Bytes load store

C.Kozyrakis EE108bLecture2 24 DataTransferInstructions:Loads

• Datatransferinstructionshavethreeparts – Operatorname(transfersize) – Destinationregister – Baseregisteraddressandconstantoffset

lw dst, offset(base)

– Offsetvalueisasignedconstant

C.Kozyrakis EE108bLecture2 25 MemoryAccess

• Allmemoryaccesshappensthroughloadsandstores • Alignedwords,halfwords,andbytes • FloatingPointloadsandstoresforaccessingFPregisters • Displacement based addressing mode

Immediate

Registers Memory Data to load/ location to + store into Base

C.Kozyrakis EE108bLecture2 26 LoadingDataExample

• Considertheexample a = b + *c;

• Usethe lw instructiontoload Assumea($s0),b($s1),c($s2)

lw $t0, 0($s2) # $t0 = Memory[c] add $s0, $s1, $t0 # a = b + *c

C.Kozyrakis EE108bLecture2 27 AccessingArrays

• Arraysarereallypointerstothebaseaddressinmemory – AddressofelementA[0] • Useoffsetvaluetoindicatewhichindex • Rememberthataddressesareinbytes,somultiplybythesizeof the element – Considertheintegerarraywhere pow2 isthebaseaddress – Withthiscompileronthisarchitecture,eachint requires4bytes – Thedatatobeaccessedisatindex5: pow2[5] – Thentheaddressfrommemoryis pow2 + 5 * 4

• UnlikeC,assemblydoesnothandlepointerarithmeticforyou!

C.Kozyrakis EE108bLecture2 28 ArrayMemoryDiagram

1032 1028 pow2[7] 1024 pow2[6] 1020 pow2[5] 1016 pow2[4] 1012 pow2[3] 1008 pow2[2] 1004 pow2[1] 4 bytes 1000 pow2[0]

pow2

C.Kozyrakis EE108bLecture2 29 ArrayExample

• Considertheexample a = b + pow2[7];

• Usethe lw instructionoffset,assume $s3 = 1000

lw $t0, 28($s3) # $t0 = Memory[pow2[7]] add $s0, $s1, $t0 # a = b + pow2[7]

C.Kozyrakis EE108bLecture2 30 ComplexArrayExample

• Considerline7fromsum_pow2() ret = pow2[a];

• Firstfindthecorrectoffset,againassume $s3 = 1000

sll $t0, $s0, 2 # $t0 = 4 * a : shift left by 2 add $t1, $s3, $t0 # $t1 = pow2 + 4*a lw $v0, 0($t1) # $v0 = Memory[pow2[a]]

C.Kozyrakis EE108bLecture2 31 StoringData

• Storingdataisjustthereverseandtheinstructionisnearlyidentical

• Usethe sw instructiontocopyawordfromthesourceregistertoan addressinmemory

sw src, offset(base)

• Offsetvalueissigned

C.Kozyrakis EE108bLecture2 32 StoringDataExample

• Considertheexample *a = b + c;

• Usethe sw instructiontostore

add $t0, $s1, $s2 # $t0 = b + c sw $t0, 0($s0) # Memory[s0] = b + c

C.Kozyrakis EE108bLecture2 33 StoringtoanArray

• Considertheexample a[3] = b + c;

• Usethesw instructionoffset

add $t0, $s1, $s2 # $t0 = b + c sw $t0, 12($s0) # Memory[a[3]] = b + c

C.Kozyrakis EE108bLecture2 34 ComplexArrayStorage

• Considertheexample a[i] = b + c;

• Usethe sw instructionoffset add $t0, $s1, $s2 # $t0 = b + c sll $t1, $s3, 2 # $t1 = 4 * i add $t2, $s0, $t1 # $t2 = a + 4*i sw $t0, 0($t2) # Memory[a[i]] = b + c

C.Kozyrakis EE108bLecture2 35 A“short”ArrayExample

• ANSICrequiresashorttobeatleast16bitsandnolongerthanan int,butdoesnotdefinetheexactsize • Forourpurposes,treatashortas2bytes • So,withashortarrayc[7]isatc+7*2,shiftleftby1 1016 1014 c[ 7] 1012 c[ 6] 1010 c[ 5] 1008 1006 c[ 4] 1004 c[ 3] 1002 c[ 2] 1000 2 bytes c[ 1] c[ 0] c C.Kozyrakis EE108bLecture2 36 MIPSIntegerLoad/Store

Instruction Example Meaning Comments storeword sw $1,8($2) Mem[8+$2]=$1 Storeword storehalf sh $1,6($2) Mem[6+$2]=$1 Storesonlylower16bits storebyte sb $1,5($2) Mem[5+$2]=$1 Storesonlylowestbyte storefloat sf $f1,4($2) Mem[4+$2]=$f1 StoreFPword loadword lw $1,8($2) $1=Mem[8+$2] Loadword loadhalfword lh $1,6($2) $1=Mem[6+$2] Loadhalf;signextend loadhalfunsign lhu $1,6($2) $1=Mem[8+$2] Loadhalf;zeroextend loadbyte lb$1,5($2) $1=Mem[5+$2] Loadbyte;signextend loadbyteunsign lbu $1,5($2) $1=Mem[5+$2] Loadbyte;zeroextend

C.Kozyrakis EE108bLecture2 37 AlignmentRestrictions

• InMIPS,dataisrequiredtofallonaddressesthataremultiplesofthedata size 0 1 2 3 Aligned

Not Aligned

• Considerword(4byte)memoryaccess

0 4 8 12 16

0 4 8 12 16

C.Kozyrakis EE108bLecture2 38 AlignmentRestrictions(cont)

• CExample struct foo { char sm; Whatisthesizeof short med; thisstructure? char sm1; int lrg; } Byte offset 0 1 2 3 4 5 7 8 11 sm x med sm1 x lrg

• Historically – Earlymachines(IBM360in1964)requiredalignment – Removedin1970storeduceimpactonprogrammers – ReintroducedbyRISCtoimproveperformance • Alsointroduceschallengeswithmemoryorganizationwithvirtual memory,etc.

C.Kozyrakis EE108bLecture2 39 MemoryMappedI/O

• DatatransferinstructionscanbeusedtomovedatatoandfromI/O deviceregisters • AloadoperationmovesdatafromaanI/OdeviceregistertoaCPU registerandastoreoperationmovesdatafromaCPUregisterto a I/Odeviceregister

address (8)

Memory CPU data (8) 27 Bytes

I/O register

I/O register at address 0x80 (128)

C.Kozyrakis EE108bLecture2 40 ChangingControlFlow

• Oneofthedistinguishingcharacteristicsofcomputersistheability toevaluateconditionsandchangecontrolflow •C – Ifthenelse – Loops – Casestatements • MIPS – Conditionalbranchinstructionsareknownas branches – Unconditionalchangesinthecontrolflowarecalled jumps – Thetargetofthebranch/jumpisalabel

C.Kozyrakis EE108bLecture2 41 Conditional:Equality

• Thesimplestconditionaltestisthe beq instructionforequality beq reg1, reg2, label

• Considerthecode if (a == b) goto L1; // Do something L1: // Continue

• Usethe beq instruction beq $s0, $s1, L1 # Do something L1: # Continue

C.Kozyrakis EE108bLecture2 42 Conditional:NotEqual

• The bne instructionfornotequal bne reg1, reg2, label

• Considerthecode if (a != b) goto L1; // Do something L1: // Continue

• Usethe bne instruction bne $s0, $s1, L1 # Do something L1: # Continue

C.Kozyrakis EE108bLecture2 43 Unconditional:Jumps

• The j instructionjumpstoalabel j label

C.Kozyrakis EE108bLecture2 44 IfthenelseExample

(true) (false) • Considerthecode i == j? i == j i != j if (i == j) f = g + h; else f = g – h; f=g+h f=g-h

Exit

C.Kozyrakis EE108bLecture2 45 IfthenelseSolution

• Createlabelsanduseequalityinstruction

beq $s3, $s4, True # Branch if i == j sub $s0, $s1, $s2 # f = g – h j Exit # Go to Exit True: add $s0, $s1, $s2 # f = g + h Exit:

C.Kozyrakis EE108bLecture2 46 OtherComparisons

• Otherconditionalarithmeticoperatorsareusefulinevaluating conditionalexpressionsusing<,>,<=,>= • Registeris“set”to1whenconditionismet

• ConsiderthefollowingCcode if (f < g) goto Less;

• Solution slt $t0, $s0, $s1 # $t0 = 1 if $s0 < $s1 bne $t0, $zero, Less # Goto Less if $t0 != 0

C.Kozyrakis EE108bLecture2 47 MIPSComparisons

Instruction Example Meaning Comments setlessthan slt $1, $2, $3 $1=($2<$3) complessthansigned setlessthanimm slti $1, $2, 100 $1=($2<100) compw/constsigned setlessthanuns sltu $1, $2, $3 $1=($2<$3) comp<unsigned setl.t.imm.uns sltiu $1, $2, 100 $1=($2<100) comp<constunsigned

•C if (a < 8)

• Assembly

slti $v0,$a0,8 # $v0 = a < 8 beq $v0,$zero, Exceed # goto Exceed if $v0 == 0

C.Kozyrakis EE108bLecture2 48 CExample

1: int sum_pow2(int b, int c) 2: { 3: int pow2 [8] = {1, 2, 4, 8, 16, 32, 64, 128}; 4: int a, ret; 5: a = b + c; 6: if (a < 8) 7: ret = pow2[a]; 8: else 9: ret = 0; 10: return(ret); 11:}

C.Kozyrakis EE108bLecture2 49 sum_pow2: # $a0 = b, $a1 = c addu $a0,$a0,$a1 # a = b + c, $a0 = a slti $v0,$a0,8 # $v0 = a < 8 beq $v0,$zero, Exceed # goto Exceed if $v0 == 0 addiu $v1,$sp,8 # $v1 = pow2 address sll $v0,$a0,2 # $v0 = a*4 addu $v0,$v0,$v1 # $v0 = pow2 + a*4 lw $v0,0($v0) # $v0 = pow2[a] j Return # gotoReturn Exceed: addu $v0,$zero,$zero # $v0 = 0 Return: jr ra # return sum_pow2

C.Kozyrakis EE108bLecture2 50