Review Memory Disambiguation Review Explicit Register Renaming

CS252 Graduate Computer Architecture
Lecture …
Instruction Level Parallelism: Getting the CPI < 1
September …
Prof John Kubiatowicz

CS252/Kubiatowicz, Lec …

Review: Reorder Buffer (ROB)
• Use of reorder buffer
  – In-order issue, out-of-order execution, in-order commit
  – Holds results until they can be committed in order
    » Serves as source of values until instructions committed
  – Provides support for precise exceptions/speculation: simply throw out instructions later than excepted instruction
  – Commits user-visible state in instruction order
  – Stores sent to memory system only when they reach head of buffer
• In-Order-Commit is important because:
  – Allows the generation of precise exceptions
  – Allows speculation across branches

Review: Memory Disambiguation
• Question: Given a load that follows a store in program order, are the two related?
  – Trying to detect RAW hazards through memory
  – Stores commit in order (ROB), so no WAR/WAW memory hazards
• Implementation
  – Keep queue of stores, in prog order
  – Watch for position of new loads relative to existing stores
• When have address for load, check store queue:
  – If any store prior to load is waiting for its address, stall load
  – If load address matches earlier store address (associative lookup), then we have a memory-induced RAW hazard:
    » store value available ⇒ return value
    » store value not available ⇒ return ROB number of source
  – Otherwise, send out request to memory
• Will relax exact dependency checking in later lecture

Review: Explicit Register Renaming
• Make use of a physical register file that is larger than number of registers specified by ISA
• Key insight: Allocate a new physical destination register for every instruction that writes
  – Removes all chance of WAR or WAW hazards
  – Similar to compiler transformation called Static Single Assignment
    » Like hardware-based dynamic compilation?
• Mechanism? Keep a translation table:
  – ISA register ⇒ physical register mapping
  – When register written, replace entry with new register from free list
  – Physical register is free when not used by any active instructions
• Advantages of explicit renaming:
  – Decouples renaming from scheduling
  – Allows data to be fetched from single register file
    » No need to bypass values from reorder buffer
    » Better utilization of register ports

Instruction Level Parallelism
• High speed execution based on instruction level parallelism (ilp): potential of short instruction sequences to execute in parallel
• High-speed microprocessors exploit ILP by:
  1) pipelined execution: overlap instructions
  2) Out-of-order execution (commit in-order)
  3) Multiple issue: issue and execute multiple instructions per clock cycle
  4) Vector instructions: many independent ops specified with a single instruction
• Memory accesses for high-speed microprocessor?
  – Data Cache possibly multiported, multiple levels

Getting CPI < 1: Issuing Multiple Instructions/Cycle
• Superscalar: varying no. instructions/cycle (1 to 8), scheduled by compiler or by HW (Tomasulo)
  – IBM PowerPC, Sun UltraSparc, DEC Alpha, HP 8000
• (Very) Long Instruction Words (V)LIW: fixed number of instructions (4-16) scheduled by the compiler; put ops into wide templates
  – Joint HP/Intel agreement in 1999/2000?
  – Intel Architecture-64 (IA-64) 64-bit address
    » Style: “Explicitly Parallel Instruction Computer (EPIC)”
  – New SUN MAJIC Architecture: VLIW for Java
• Vector Processing: Explicit coding of independent loops as operations on large vectors of numbers
  – Multimedia instructions being added to many processors
• Anticipated success led to use of Instructions Per Clock cycle (IPC) vs. CPI

Getting CPI < 1: Issuing Multiple Instructions/Cycle
• Superscalar DLX: 2 instructions, 1 FP & 1 anything else
  – Fetch 64 bits/clock cycle; Int on left, FP on right
  – Can only issue 2nd instruction if 1st instruction issues
  – More ports for FP registers to do FP load & FP op in a pair

  Type               Pipe Stages
  Int. instruction   IF  ID  EX  MEM WB
  FP instruction     IF  ID  EX  MEM WB
  Int. instruction       IF  ID  EX  MEM WB
  FP instruction         IF  ID  EX  MEM WB
  Int. instruction           IF  ID  EX  MEM WB
  FP instruction             IF  ID  EX  MEM WB

• 1 cycle load delay expands to 3 instructions in SS
  – instruction in right half can’t use it, nor instructions in next slot

Review: Unrolled Loop that Minimizes Stalls for Scalar
  LD to ADDD: 1 Cycle
  ADDD to SD: 2 Cycles

  Loop: LD   F0,0(R1)
        LD   F6,-8(R1)
        LD   F10,-16(R1)
        LD   F14,-24(R1)
        ADDD F4,F0,F2
        ADDD F8,F6,F2
        ADDD F12,F10,F2
        ADDD F16,F14,F2
        SD   0(R1),F4
        SD   -8(R1),F8
        SD   -16(R1),F12
        SUBI R1,R1,#32
        BNEZ R1,LOOP
        SD   8(R1),F16     ; 8-32 = -24

  14 clock cycles, or 3.5 per iteration

Loop Unrolling in Superscalar
        Integer instruction   FP instruction     Clock cycle
  Loop: LD   F0,0(R1)                            1
        LD   F6,-8(R1)                           2
        LD   F10,-16(R1)      ADDD F4,F0,F2      3
        LD   F14,-24(R1)      ADDD F8,F6,F2      4
        LD   F18,-32(R1)      ADDD F12,F10,F2    5
        SD   0(R1),F4         ADDD F16,F14,F2    6
        SD   -8(R1),F8        ADDD F20,F18,F2    7
        SD   -16(R1),F12                         8
        SD   -24(R1),F16                         9
        SUBI R1,R1,#40                           10
        BNEZ R1,LOOP                             11
        SD   8(R1),F20                           12   ; 8-40 = -32

• Unrolled 5 times to avoid delays (+1 due to SS)
• 12 clocks, or 2.4 clocks per iteration (1.5X)

Dynamic Scheduling in Superscalar: The easy way
• How to issue two instructions and keep in-order instruction issue for Tomasulo?
  – Assume 1 integer + 1 floating point
  – 1 Tomasulo control for integer, 1 for floating point
• Issue 2X Clock Rate, so that issue remains in order
• Only FP loads might cause dependency between integer and FP issue:
  – Replace load reservation station with a load queue; operands must be read in the order they are fetched
  – Load checks addresses in Store Queue to avoid RAW violation
  – Store checks addresses in Load Queue to avoid WAR, WAW
  – Called “decoupled architecture” (compare with Smith paper)

Multiple Issue Challenges
• While Integer/FP split is simple for the HW, get CPI of 0.5 only for programs with:
  – Exactly 50% FP operations
  – No hazards
• If more instructions issue at same time, greater difficulty of decode and issue:
  – Even 2-scalar => examine 2 opcodes, 6 register specifiers, & decide if 1 or 2 instructions can issue
  – Register file: need 2x reads and 1x writes/cycle
  – Rename logic: must be able to rename same register multiple times in one cycle! For instance, consider 4-way issue:

        add r1, r2, r3          add p11, p4, p7
        sub r4, r1, r2    ⇒     sub p22, p11, p4
        lw  r1, 4(r4)           lw  p23, 4(p22)
        add r5, r1, r2          add p12, p23, p4

    Imagine doing this transformation in a single cycle!
  – Result buses: Need to complete multiple instructions/cycle
    » So, need multiple buses with associated matching logic at every reservation station
    » Or, need multiple forwarding paths

VLIW: Very Large Instruction Word
• Each “instruction” has explicit coding for multiple operations
  – In EPIC, grouping called a “packet”
  – In Transmeta, grouping called a “molecule” (with “atoms” as ops)
• Tradeoff instruction space for simple decoding
  – The long instruction word has room for many operations
  – By definition, all the operations the compiler puts in the long instruction word are independent => execute in parallel
  – E.g., 2 integer operations, 2 FP ops, 2 Memory refs, 1 branch
    » 16 to 24 bits per field => 7*16 or 112 bits to 7*24 or 168 bits wide
  – Need compiling technique that schedules across several branches

Loop Unrolling in VLIW
  Memory           Memory           FP                FP                Int. op/         Clock
  reference 1      reference 2      operation 1       op. 2             branch
  LD F0,0(R1)      LD F6,-8(R1)                                                          1
  LD F10,-16(R1)   LD F14,-24(R1)                                                        2
  LD F18,-32(R1)   LD F22,-40(R1)   ADDD F4,F0,F2     ADDD F8,F6,F2                      3
  LD F26,-48(R1)                    ADDD F12,F10,F2   ADDD F16,F14,F2                    4
                                    ADDD F20,F18,F2   ADDD F24,F22,F2                    5
  SD 0(R1),F4      SD -8(R1),F8     ADDD F28,F26,F2                                      6
  SD -16(R1),F12   SD -24(R1),F16                                                        7
  SD -32(R1),F20   SD -40(R1),F24                                       SUBI R1,R1,#56   8
  SD 8(R1),F28                                                          BNEZ R1,LOOP     9

• Unrolled 7 times to avoid delays
• 7 results in 9 clocks, or 1.3 clocks per iteration (1.8X)
• Average: 2.5 ops per clock, 50% efficiency
• Note: Need more registers in VLIW (15 vs. 6 in SS)

Recall: Software Pipelining
• Observation: if iterations from loops are independent, then can get more ILP by taking instructions from different iterations
• Software pipelining: reorganizes loops so that each iteration is made from instructions chosen from different iterations of the original loop (≈ Tomasulo in SW)

[Figure: iterations 0 to 4 of the original loop overlapped in time, with one software-pipelined iteration drawn from all of them]

Recall: Software Pipelining Example
  Before: Unrolled 3 times        After: Software Pipelined
  LD   F0,0(R1)                   SD   0(R1),F4     ; Stores M[i]
  ADDD F4,F0,F2                   ADDD F4,F0,F2     ; Adds to M[i-1]
  SD   0(R1),F4                   LD   F0,-16(R1)   ; Loads M[i-2]
  LD   F6,-8(R1)                  SUBI R1,R1,#8
  ADDD F8,F6,F2                   BNEZ R1,LOOP
  SD   -8(R1),F8
  LD   F10,-16(R1)
  ADDD F12,F10,F2
  SD   -16(R1),F12
  SUBI R1,R1,#24
  BNEZ R1,LOOP

[Figure: overlapped ops vs. time for the software-pipelined loop and for the unrolled loop]

• Symbolic Loop Unrolling
  – Maximize result-use distance
  – Less code space than unrolling
  – Fill & drain pipe only once per loop vs. once per each unrolled iteration in loop unrolling

Software Pipelining with Loop Unrolling in VLIW
  Memory           Memory           FP                FP       Int. op/         Clock
  reference 1      reference 2      operation 1       op. 2    branch
  LD F0,-48(R1)    ST 0(R1),F4      ADDD F4,F0,F2                               1
  LD F6,-56(R1)    ST -8(R1),F8     ADDD F8,F6,F2              SUBI R1,R1,#24   2
  LD F10,-40(R1)   ST 8(R1),F12     ADDD F12,F10,F2            BNEZ R1,LOOP     3

• Software pipelined across 9 iterations of original loop
  – In each iteration of above loop, we:
    » Store to m,m-8,m-16 (iterations I-3,I-2,I-1)
    » Compute for m-24,m-32,m-40 (iterations I,I+1,I+2)
    » Load from m-48,m-56,m-64 (iterations I+3,I+4,I+5)
• 9 results in 9 cycles, or 1 clock per iteration
• Average: 3.3 ops per clock, 66% efficiency
• Note: Need fewer registers for software pipelining (only using 7 registers …)

Trace Scheduling
• Parallelism across IF branches vs. LOOP branches
• Two steps:
  – Trace Selection
    » Find likely sequence of basic blocks (trace) of (statically predicted or profile predicted) long sequence of straight-line code
  – Trace Compaction
    » Squeeze trace into few VLIW instructions
    » Need bookkeeping code in case prediction is wrong
• This is a form of compiler-generated speculation
  – Compiler must generate “fixup” code to handle cases in which trace is not the taken branch
  – Needs extra registers: undoes bad guess by discarding
• Subtle compiler bugs mean wrong answer vs. poorer performance; no hardware interlocks
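
The software-pipelined kernel (SD, then ADDD, then LD, each working on a different original iteration) can be checked against the plain loop with a small sketch. This is only an illustration under stated assumptions: the function names `add_scalar_plain` and `add_scalar_pipelined` are hypothetical, the array is indexed upward rather than downward through R1, and the fill and drain code that the slides leave implicit is written out as a prologue and an epilogue.

```python
def add_scalar_plain(m, c):
    """Reference loop: load, add, store, one element per iteration."""
    out = list(m)
    for i in range(len(out)):
        f0 = out[i]      # LD
        f4 = f0 + c      # ADDD
        out[i] = f4      # SD
    return out


def add_scalar_pipelined(m, c):
    """Software-pipelined form: each kernel pass stores element i,
    adds for element i+1, and loads element i+2."""
    n = len(m)
    if n < 3:
        # Too short to fill the pipe; fall back to the plain loop.
        return add_scalar_plain(m, c)
    out = list(m)
    # Prologue (fill): start iterations 0 and 1.
    f0 = out[0]          # LD   for element 0
    f4 = f0 + c          # ADDD for element 0
    f0 = out[1]          # LD   for element 1
    # Kernel: same three operations as the slide, from three iterations.
    for i in range(n - 2):
        out[i] = f4      # SD   stores element i
        f4 = f0 + c      # ADDD for element i+1
        f0 = out[i + 2]  # LD   for element i+2
    # Epilogue (drain): finish the last two elements.
    out[n - 2] = f4
    out[n - 1] = f0 + c
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0]
assert add_scalar_pipelined(data, 10.0) == add_scalar_plain(data, 10.0)
```

The prologue and epilogue run once per loop, which is the "fill & drain pipe only once per loop" point; an unrolled loop pays that cost in every unrolled body.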
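
The store-queue check from the memory disambiguation review can be sketched as a single function. This is a hedged illustration, not the hardware design: `StoreEntry`, `check_load`, and the result labels are hypothetical names, and picking the youngest matching store when several match is an assumption (the slides only say "matches earlier store address").

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoreEntry:
    rob: int                      # ROB number of the store
    addr: Optional[int] = None    # None while the address is still unknown
    value: Optional[int] = None   # None while the store data is not ready


def check_load(load_addr: int,
               older_stores: List[StoreEntry]) -> Tuple[str, Optional[int]]:
    """Decide what a load does, given the stores before it in program order."""
    # Any earlier store with an unknown address might alias: stall the load.
    if any(s.addr is None for s in older_stores):
        return ("stall", None)
    # Search from the youngest earlier store backwards (assumed policy).
    for s in reversed(older_stores):
        if s.addr == load_addr:
            if s.value is not None:
                return ("forward", s.value)   # store value available
            return ("wait_on_rob", s.rob)     # return ROB number of source
    # No earlier store touches this address: go to memory.
    return ("memory", None)


stores = [StoreEntry(rob=3, addr=0x100, value=7),
          StoreEntry(rob=5, addr=0x200)]
assert check_load(0x100, stores) == ("forward", 7)
assert check_load(0x200, stores) == ("wait_on_rob", 5)
assert check_load(0x300, stores) == ("memory", None)
```

The list scan stands in for the associative lookup; real hardware would compare the load address against all queue entries in parallel.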

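
The translation-table mechanism from the explicit register renaming review can be sketched in a few lines. This is a minimal illustration under assumptions: `rename_group` and the tuple encoding of instructions are hypothetical, the initial mappings for r1, r4 and r5 are made up, and the free list is ordered so that the output reproduces the 4-way issue example in the slides (r2 in p4, r3 in p7 are taken from that example).

```python
from collections import deque


def rename_group(instrs, table, free_list):
    """Rename (op, dest, sources) instructions in program order.

    table maps ISA register -> physical register; free_list supplies
    new physical destination registers. Returns (renamed, new_table).
    """
    table = dict(table)          # do not mutate the caller's table
    free = deque(free_list)
    renamed = []
    for op, dest, sources in instrs:
        # Read source mappings first, so an instruction that reads and
        # writes the same ISA register sees the old physical register.
        phys_sources = [table[s] for s in sources]
        new_dest = free.popleft()    # allocate a fresh destination
        table[dest] = new_dest       # later readers see the new mapping
        renamed.append((op, new_dest, phys_sources))
    return renamed, table


group = [
    ("add", "r1", ["r2", "r3"]),
    ("sub", "r4", ["r1", "r2"]),
    ("lw",  "r1", ["r4"]),
    ("add", "r5", ["r1", "r2"]),
]
# Initial mappings for r1, r4, r5 are placeholders chosen for this sketch.
table = {"r1": "p1", "r2": "p4", "r3": "p7", "r4": "p2", "r5": "p3"}
renamed, new_table = rename_group(group, table, ["p11", "p22", "p23", "p12"])
assert renamed[1] == ("sub", "p22", ["p11", "p4"])
assert renamed[3] == ("add", "p12", ["p23", "p4"])
```

The loop processes the group one instruction at a time; the hardware difficulty the slides point at is producing the same result for all four instructions within one cycle, including the two writes to r1.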