Performance Characterization • Delay analysis • Transistor sizing • Logical effort • Power analysis
ECE 261 James Morizio 1 Delay Definitions
• tpdr: rising propagation delay
– From input to rising output crossing VDD/2
• tpdf: falling propagation delay
– From input to falling output crossing VDD/2
• tpd: average propagation delay
– tpd = (tpdr + tpdf)/2
• tr: rise time
– From output crossing 0.2 VDD to 0.8 VDD
• tf: fall time
– From output crossing 0.8 VDD to 0.2 VDD
ECE 261 James Morizio 2 Simulated Inverter Delay • Solving differential equations by hand is too hard • SPICE simulator solves the equations numerically – Uses more accurate I-V models too! • But simulations take time to write 2.0
1.5
1.0 (V) t = 66ps t = 83ps V pdf pdr in V 0.5 out
0.0
0.0 200p 400p 600p 800p 1n t(s)
ECE 261 James Morizio 3 Delay Estimation
• We would like to be able to easily estimate delay – Not as accurate as simulation – But easier to ask “What if?” • The step response usually looks like a 1st order RC response with a decaying exponential. • Use RC delay models to estimate delay – C = total capacitance on output node – Use effective resistance R
– So that tpd = RC • Characterize transistors by finding their effective R – Depends on average current as gate switches
ECE 261 James Morizio 4 RC Delay Models • Use equivalent circuits for MOS transistors – Ideal switch + capacitance and ON resistance – Unit nMOS has resistance R, capacitance C – Unit pMOS has resistance 2R, capacitance C • Capacitance proportional to width • Resistance inversely proportional to width d s kC kC R/k d 2R/k d g k g kC g k g s kC kC kC s s d
ECE 261 James Morizio 5 Example: 3-input NAND
• Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
ECE 261 James Morizio 6 Example: 3-input NAND
• Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
ECE 261 James Morizio 7 Example: 3-input NAND • Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). 2 2 2
3 3 3
ECE 261 James Morizio 8 3-input NAND Caps • Annotate the 3-input NAND gate with gate and diffusion capacitance.
2 2 2
3
3
3
ECE 261 James Morizio 9 3-input NAND Caps • Annotate the 3-input NAND gate with gate and diffusion capacitance. 2C 2C 2C 2C 2C 2C 2 2 2 2C 2C 2C
3C 3 3C 3C 3 3C 3C 3 3C 3C
ECE 261 James Morizio 10 3-input NAND Caps • Annotate the 3-input NAND gate with gate and diffusion capacitance. 2 2 2
3 9C 5C 3 3C 5C 3 3C 5C
ECE 261 James Morizio 11 Elmore Delay • ON transistors look like resistors • Pullup or pulldown network modeled as RC ladder • Elmore delay of RC ladder t pd ≈ ƒ Ri−to−sourceCi nodes i
= R1C1 + (R1 + R2 )C2 + ... + (R1 + R2 + ... + RN )CN
R1 R2 R3 RN
C1 C2 C3 CN
ECE 261 James Morizio 12 Example: 2-input NAND • Estimate worst-case rising and falling delay of 2- input NAND driving h identical gates.
2 2 Y A 2 h copies x B 2
ECE 261 James Morizio 13 Example: 2-input NAND
• Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC x h copies B 2 2C
ECE 261 James Morizio 14 Example: 2-input NAND • Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC x h copies B 2 2C
R Y t = (6+4h)C pdr
ECE 261 James Morizio 15 Example: 2-input NAND • Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC x B 2 2C h copies
R Y t = (6 + 4h)RC (6+4h)C pdr
ECE 261 James Morizio 16 Example: 2-input NAND • Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC x h copies B 2 2C
ECE 261 James Morizio 17 Example: 2-input NAND • Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC h copies x B 2 2C
R/2 x Y t = R/2 2C (6+4h)C pdf
ECE 261 James Morizio 18 Example: 2-input NAND • Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
2 2 Y A 2 6C 4hC x h copies B 2 2C
t = (2C)(R )+ »(6 + 4h)Cÿ(R + R ) x R/2 Y pdf 2 ⁄ 2 2 2C (6+4h)C R/2 = (7 + 4h)RC
ECE 261 James Morizio 19 Delay Components
• Delay has two parts – Parasitic delay • 6 or 7 RC • Independent of load – Effort delay • 4h RC • Proportional to load capacitance
ECE 261 James Morizio 20 Contamination Delay • Best-case (contamination) delay can be substantially less than propagation delay. • Ex: If both inputs fall simultaneously
2 2 Y A 2 6C 4hC x B 2 2C
R R Y t = (3+ 2h)RC (6+4h)C cdr
ECE 261 James Morizio 21 Diffusion Capacitance • We assumed contacted diffusion on every s / d. • Good layout minimizes diffusion area • Ex: NAND3 layout shares one diffusion contact – Reduces output capacitance by 2C – Merged uncontacted diffusion might help too
2C 2C Shared Contacted Diffusion Isolated Contacted 2 2 2 Merged Diffusion Uncontacted 3 7C Diffusion 3 3C 3C 3C 3C 3 3C
ECE 261 James Morizio 22 Layout Comparison
• Which layout is better? VDD VDD A B A B
Y Y
GND GND
ECE 261 James Morizio 23 Resizing the Inverter Minimum-sized 2λ transistor: W=3λ, L=2λ To get equal rise and fall times, Ω βn = βp Wp = 3Wn, assuming 3λ n-diffusion that electron mobility is three times that of holes
poly Wp=9λ
2λ Sometimes the function being implemented 9λ p-diffusion makes resizing unnecessary!
poly
ECE 261 James Morizio 24 Analyzing the NAND Gate VDD
βp1 1 βp2 βp3 a βn, eff = b c 1 + 1 + 1 F βn1 βn2 βn3 a βn1 Resistances are in series (conductances b β n2 are in parallel)
c βn3 If βn1 = βn2 = βn3 = βn then βn, eff = βn/3 • Pull-down circuit has three times resistance, Why not consider reGsisntdances in parallel?one-third times the conductance
For pull-up, only one transistor has to be on, βp, eff = min{βp1,βp2,βp3} Ω If βp1 = βp2 = βp3 = βp = βn/3 then βn, eff = βp no resizing is necessary
ECE 261 James Morizio 25 Analyzing the NOR Gate VDD 1 β = p, eff 1 a + 1 + 1 βp1 βp1 βp2 βp3 b βp2 Resistances are in series (conductances c are in parallel) βp3
βn1 If β = β = β = β then β = β /3 βn2 βn3 p1 p2 p3 p p, eff p a b c • Pull-up circuit has three times resistance, Gnd one-third times the conductance
For pull-down, only one transistor has to be on, βn, eff = min{βn1,βn2,βn3} Ω If βn1 = βn2 = βn3 = βn = 3βp then βn,eff=9βp,eff considerable resizing is necessary Wp = 9Wn!
ECE 261 James Morizio 26 Effect of Series Transistors
Diffusion Diffusion L poly poly
L poly 3L
L poly
W W
ECE 261 James Morizio 27 Effect of Series Transistors
VDD Transistor resizing Resize the pull-up transistors to a example βp make pull-up times equal c βp b After resizing: βp a: 2βp, b: 2βp, c: βp
Pull-down
ECE 261 James Morizio 28 Transistor Placement (Series Stack) How to order transistors in a series stack?
Body effect: δVt ∝ √Vsb • At time t = 0, a=b=c=0, f=1, capacitances are charged Pull-up • Ideally V = V = V ≈ 0.8V stack ta tb tc • However, V > V > V because of F ta tb tc a t body effect a Ca b • If a, b, c become 1 at the same time, which tb Cb transistor will switch on first? c • tc will switch on first (Vsb for tc is zero), Cc will Cc discharge, pulling Vsb for tb to zero tc • If signals arrive at different times, how should the Gnd transistors be ordered? • Design strategy: place latest arriving signal nearest to output-early signals will discharge internal nodes
ECE 261 James Morizio 29 Transistor Placement Pull-up stack 2 a F 2 ta C 2 a
2 tb b Cb c Primary Cc Pull-up t inputs c stack (change Gnd 2 2 b simultaneously) 2 ta C F 2 a
tb a Cb
c Cc tc
ECE 261 James Morizio 30 Some Design Guidelines • Use NAND gates (instead of NOR) wherever possible • Placed inverters (buffers) at high fanout nodes to improve drive capability • Avoid use of NOR completely in high-speed
circuits: A1 + A2 + … + An = A1.A2….An
ECE 261 James Morizio 31 Some Design Guidelines
• Use limited fan-in (<10): high fan-in Ω long series stacks • Use minimum-sized gates on high fan-out nodes: minimize load presented to driving gate
ECE 261 James Morizio 32 Logical Effort
• Chip designers face a bewildering array of choices ? ? ? – What is the best circuit topology for a function? – How many stages of logic give least delay? – How wide should the transistors be?
• Logical effort is a method to make these decisions – Uses a simple model of delay – Allows back-of-the-envelope calculations – Helps make rapid comparisons between alternatives – Emphasizes remarkable symmetries
ECE 261 James Morizio 33 Example
• Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a A[3:0] A[3:0] register file. 32 bits 4 : 1 1 6 6
• Decoder specifications:
w D 16 o e r d c Register File s – 16 word register file o d e – Each word is 32 bits wide r – Each bit presents load of 3 unit-sized transistors – True and complementary address inputs A[3:0] – Each input may drive 10 unit-sized transistors • Ben needs to decide: – How many stages to use? – How large should each gate be? – How fast can decoder operate?
ECE 261 James Morizio 34 Delay in a Logic Gate • Express delays in process-independent unit d τ = 3RC d = abs τ ≈ 12 ps in 180 nm process 40 ps in 0.6 µm process
ECE 261 James Morizio 35 Delay in a Logic Gate • Express delays in process-independent unit d d = abs τ • Delay has two components d = f + p
ECE 261 James Morizio 36 Delay in a Logic Gate
• Express delays in process-independent unit d d = abs τ • Delay has two components d = f + p • Effort delay f = gh (a.k.a. stage effort) – Again has two components
ECE 261 James Morizio 37 Delay in a Logic Gate • Express delays in process-independent unit d d = abs τ • Delay has two components d = f + p • Effort delay f = gh (a.k.a. stage effort) – Again has two components • g: logical effort – Measures relative ability of gate to deliver current – g ≡ 1 for inverter
ECE 261 James Morizio 38 Delay in a Logic Gate • Express delays in process-independent unit d d = abs τ • Delay has two components d = f + p • Effort delay f = gh (a.k.a. stage effort) – Again has two components
• h: electrical effort = Cout / Cin – Ratio of output to input capacitance – Sometimes called fanout
ECE 261 James Morizio 39 Delay in a Logic Gate • Express delays in process-independent unit d d = abs τ • Delay has two components d = f + p
• Parasitic delay p – Represents delay of gate driving no load – Set by internal parasitic capacitance
ECE 261 James Morizio 40 Delay Plots 2-input NAND Inverter 6 d = f + p g = d : 5 y p = a = gh + p l
e d =
D 4 g = d
e p = z i l 3
a d = m r
o 2 N
1
0 0 1 2 3 4 5 Electrical Effort:
h = Cout / Cin
ECE 261 James Morizio 41 Delay Plots 2-input NAND Inverter 6 d = f + p g = 4/3 d : 5 y p = 2 a = gh + p l
e d = (4/3)h + 2
D 4 g = 1 d
e p = 1 z i l 3
a d = h + 1
• What about m r
o 2 Effort Delay: f NOR2? N 1 Parasitic Delay: p 0 0 1 2 3 4 5 Electrical Effort:
h = Cout / Cin
ECE 261 James Morizio 42 Computing Logical Effort • DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. • Measure from delay vs. fanout plots • Or estimate by counting transistor widths 2 2 A 4 Y 2 B 4 A 2 A Y Y 1 B 2 1 1
Cin = 3 Cin = 4 Cin = 5 g = 3/3 g = 4/3 g = 5/3
ECE 261 James Morizio 43 Catalog of Gates
• Logical effort of common gates Gate type Number of inputs 1 2 3 4 n Inverter 1 NAND 4/3 5/3 6/3 (n+2)/3 NOR 5/3 7/3 9/3 (2n+1)/3 Tristate / mux 2 2 2 2 2
ECE 261 James Morizio 44 Catalog of Gates • Parasitic delay of common gates
– In multiples of pinv (≈1) Gate type Number of inputs 1 2 3 4 n Inverter 1 NAND 2 3 4 n NOR 2 3 4 n Tristate / mux 2 4 6 8 2n XOR, XNOR 4 6 8
ECE 261 James Morizio 45 Example: Ring Oscillator
• Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = Electrical Effort: h = Parasitic Delay: p = Stage Delay: d =
Frequency: fosc =
ECE 261 James Morizio 46 Example: Ring Oscillator
• Estimate the frequency of an N-stage ring oscillator
31 stage ring oscillator in Logical Effort: g = 1 0.6 µm process has Electrical Effort: h = 1 frequency of ~ 200 MHz Parasitic Delay: p = 1 Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/4N
ECE 261 James Morizio 47 Example: FO4 Inverter • Estimate the delay of a fanout-of-4 (FO4) inverter d
Logical Effort: g = Electrical Effort: h = Parasitic Delay: p = Stage Delay: d =
ECE 261 James Morizio 48 Example: FO4 Inverter • Estimate the delay of a fanout-of-4 (FO4) inverter d
Logical Effort: g = 1 Electrical Effort: h = 4 The FO4 delay is about Parasitic Delay: p = 1 200 ps in 0.6 µm process 60 ps in a 180 nm process Stage Delay: d = 5 f/3 ns in an f µm process
ECE 261 James Morizio 49 Multistage Logic Networks • Logical effort generalizes to multistage networks • Path Logical Effort G = ∏ gi
Cout-path • Path Electrical Effort H = Cin-path
• Path Effort F = ∏ fi = ∏ gihi
10 x y z 20 g1 = 1 g2 = 5/3 g3 = 4/3 g4 = 1 h1 = x/10 h2 = y/x h3 = z/y h4 = 20/z
ECE 261 James Morizio 50 Multistage Logic Networks • Logical effort generalizes to multistage networks • Path Logical Effort G = ∏ gi
Cout path • Path Electrical Effort H = − Cin− path
• Path Effort F = ∏ fi = ∏ gihi
• Can we write F = GH?
ECE 261 James Morizio 51 Paths that Branch
• No! Consider paths that branch: 15 90 5 G = 15 H = 90 GH =
h1 =
h2 = F = GH?
ECE 261 James Morizio 52 Paths that Branch
• No! Consider paths that branch: 15 90 5 G = 1 15 H = 90 / 5 = 18 90 GH = 18
h1 = (15 +15) / 5 = 6
h2 = 90 / 15 = 6
F = g1g2h1h2 = 36 = 2GH
ECE 261 James Morizio 53 Branching Effort • Introduce branching effort – Accounts for branching between stages in path C + C b = on path off path Con path Note: B = ∏bi ∏hi = BH • Now we compute the path effort – F = GBH
ECE 261 James Morizio 54 Multistage Delays
• Path Effort Delay DF = ƒ fi
• Path Parasitic Delay P = ƒ pi
• Path Delay D = ƒdi = DF + P
ECE 261 James Morizio 55 Designing Fast Circuits
D = ƒdi = DF + P • Delay is smallest when each stage bears same effort
1 ˆ N f = gihi = F • Thus minimum delay of N stage path is 1 D = NF N + P • This is a key result of logical effort – Find fastest possible delay – Doesn’t require calculating gate sizes
ECE 261 James Morizio 56 Gate Sizes • How wide should the gates be for least delay? fˆ = gh = g Cout Cin g C Ω C = i outi ini fˆ
• Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives. • Check work by verifying input cap spec is met.
ECE 261 James Morizio 57 Example: 3-stage path
• Select gate sizes x and y for least delay from A to B x
y x 45 A 8 x y B 45
ECE 261 James Morizio 58 Example: 3-stage path
x
y x 45 A 8 x y B 45
Logical Effort G = Electrical Effort H = Branching Effort B = Path Effort F = Best Stage Effort fˆ = Parasitic Delay P = Delay D =
ECE 261 James Morizio 59 Example: 3-stage path x
y x 45 A 8 x y B 45
Logical Effort G = (4/3)*(5/3)*(5/3) = 100/27 Electrical Effort H = 45/8 Branching Effort B = 3 * 2 = 6 Path Effort F = GBH = 125 Best Stage Effort fˆ = 3 F = 5 Parasitic Delay P = 2 + 3 + 2 = 7 Delay D = 3*5 + 7 = 22 = 4.4 FO4
ECE 261 James Morizio 60 Example: 3-stage path
• Work backward for sizes y = x = x
y x 45 A 8 x y B 45
ECE 261 James Morizio 61 Example: 3-stage path • Work backward for sizes y = 45 * (5/3) / 5 = 15 x = (15*2) * (5/3) / 5 = 10
45 A P: 4 P: 4 N: 4 P: 12 N: 6 B N: 3 45
ECE 261 James Morizio 62 Best Number of Stages • How many stages should a path use? – Minimizing number of stages is not always fastest • Example: drive 64-bit datapath with unit inverter Initial Driver 1 1 1 1
D =
Datapath Load 64 64 64 64
N: 1 2 3 4 f: D:
ECE 261 James Morizio 63 Best Number of Stages • How many stages should a path use? – Minimizing number of stages is not always fastest • Example: drive 64-bit datapath with unit inverter
Initial Driver 1 1 1 1
8 4 2.8 1/N D = NF + P 16 8 = N(64)1/N + N 23
Datapath Load 64 64 64 64
N: 1 2 3 4 f: 64 8 4 2.8 D: 65 18 15 15.3 Fastest
ECE 261 James Morizio 64 Derivation • Consider adding inverters to end of path – How many give least delay? N - n1 Extra Inverters Logic Block: n n Stages 1 1 1 N Path Effort F D = NF + ƒ pi + (N − n1 )pinv i=1 ∂D 1 1 1 = −F N ln F N + F N + p = 0 ∂N inv 1 • Define best stage effort ρ = F N
pinv + ρ (1− ln ρ )= 0
ECE 261 James Morizio 65 Best Stage Effort
pinv + ρ (1− ln ρ )= 0 • has no closed-form solution
• Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
• For pinv = 1, solve numerically for ρ = 3.59
ECE 261 James Morizio 66 Review of Definitions Term Stage Path number of stages 1 N g logical effort G = ∏ gi C h = Cout H = out-path electrical effort Cin Cin-path
Con-path +Coff-path b = C B = b branching effort on-path ∏ i effort f = gh F = GBH D = f effort delay f F ƒ i parasitic delay p P = ƒ pi D d D P delay d = f + p = ƒ i = F +
ECE 261 James Morizio 67 Method of Logical Effort 1) Compute path effort F = GBH
2) Estimate best number of stages N = log4 F 3) Sketch path with N stages 1 4) Estimate least delay D = NF N + P 1 ˆ N 5) Determine best stage effort f = F g C i outi Cin = 6) Find gate sizes i fˆ
ECE 261 James Morizio 68 Limits of Logical Effort
• Chicken and egg problem – Need path to compute G – But don’t know number of stages without G • Simplistic delay model – Neglects input rise time effects • Interconnect – Iteration required in designs with wire • Maximum speed only – Not minimum area/power for constrained delay
ECE 261 James Morizio 69 Summary
• Logical effort is useful for thinking of delay in circuits – Numeric logical effort characterizes gates – NANDs are faster than NORs in CMOS – Paths are fastest when effort delays are ~4 – Path delay is weakly sensitive to stages, sizes – But using fewer stages doesn’t mean faster paths
– Delay of path is about log4F FO4 inverter delays – Inverters and NAND2 best for driving large caps • Provides language for discussing fast circuits – But requires practice to master
ECE 261 James Morizio 70 Power and Energy
• Power is drawn from a voltage source attached to
the VDD pin(s) of a chip. P(t) = iDD (t)VDD • Instantaneous Power: T T E = — P(t)dt = — iDD (t)VDDdt 0 0 • Energy: E 1 T Pavg = = — iDD (t)VDDdt T T 0 • Average Power:
ECE 261 James Morizio 71 Dynamic Power • Dynamic power is required to charge and discharge load capacitances when transistors switch. • One cycle involves a rising and falling output.
• On rising output, charge Q = CVDD is required • On falling output, charge is dumped to GND
• This repeats Tfsw times over an interval of T VDD iDD(t)
C fsw
ECE 261 James Morizio 72 Dynamic Power Cont.
Pdynamic =
VDD
iDD(t)
C fsw
ECE 261 James Morizio 73 Dynamic Power Cont. 1 T Pdynamic = — iDD (t)VDDdt T 0 T VDD = — iDD (t)dt T 0
VDD = [Tf CV ] VDD T sw DD iDD(t) 2 = CVDD fsw
C fsw
ECE 261 James Morizio 74 Activity Factor
• Suppose the system clock frequency = f
• Let fsw = αf, where α = activity factor – If the signal is a clock, α = 1 – If the signal switches once per cycle, α = ½ – Dynamic gates: • Switch either 0 or 2 times per cycle, α = ½ – Static gates: • Depends on design, but typically α = 0.1
2 • Dynamic power: Pdynamic = αCVDD f
ECE 261 James Morizio 75 Short Circuit Current
• When transistors switch, both nMOS and pMOS networks may be momentarily ON at once • Leads to a blip of “short circuit” current. • < 10% of dynamic power if rise/fall times are comparable for input and output
ECE 261 James Morizio 76 Example
• 200 Mtransistor chip – 20M logic transistors • Average width: 12 λ – 180M memory transistors • Average width: 4 λ – 1.2 V 100 nm process
– Cg = 2 fF/µm
ECE 261 James Morizio 77 Dynamic Example
• Static CMOS logic gates: activity factor = 0.1 • Memory arrays: activity factor = 0.05 (many banks!)
• Estimate dynamic power consumption per MHz. Neglect wire capacitance and short-circuit current.
ECE 261 James Morizio 78 Dynamic Example • Static CMOS logic gates: activity factor = 0.1 • Memory arrays: activity factor = 0.05 (many banks!) • Estimate dynamic power consumption per MHz. Neglect wire capacitance.
6 Clogic = (20 ×10 )(12λ)(0.05µm / λ)(2 fF / µm) = 24nF
6 Cmem = (180 ×10 )(4λ)(0.05µm / λ)(2 fF / µm) = 72nF » ÿ 2 Pdynamic = 0.1Clogic + 0.05Cmem ⁄(1.2) f = 8.6 mW/MHz
ECE 261 James Morizio 79 Static Power • Static power is consumed even when chip is quiescent. – Ratioed circuits burn power in fight between ON transistors – Leakage draws power from nominally OFF devices
V −V gs t » −Vds ÿ nvT vT Ids = Ids0e …1− e Ÿ … ⁄Ÿ
Vt = Vt0 −ηVds + γ ( φs +Vsb − φs )
ECE 261 James Morizio 80 Ratio Example • The chip contains a 32 word x 48 bit ROM – Uses pseudo-nMOS decoder and bitline pullups – On average, one wordline and 24 bitlines are high • Find static power drawn by the ROM – β = 75 µA/V2
– Vtp = -0.4V
ECE 261 James Morizio 81 Ratio Example • The chip contains a 32 word x 48 bit ROM – Uses pseudo-nMOS decoder and bitline pullups – On average, one wordline and 24 bitlines are high • Find static power drawn by the ROM – β = 75 µA/V2 – V = -0.4V tp 2 (VDD − Vtp ) • Solution: I = β = 24A pull-up 2
Ppull-up = VDD Ipull-up = 29W
Pstatic = (31+ 24)Ppull-up = 1.6 mW
ECE 261 James Morizio 82 Leakage Example
• The process has two threshold voltages and two oxide thicknesses. • Subthreshold leakage:
– 20 nA/µm for low Vt
– 0.02 nA/µm for high Vt • Gate leakage: – 3 nA/µm for thin oxide – 0.002 nA/µm for thick oxide • Memories use low-leakage transistors everywhere • Gates use low-leakage transistors on 80% of logic
ECE 261 James Morizio 83 Leakage Example Cont.
• Estimate static power:
ECE 261 James Morizio 84 Leakage Example Cont. • Estimate static power:
– High leakage: (20 ×106 )(0.2)(12λ)(0.05µm / λ) = 2.4 ×106 µm – Low leakage: (20 ×106 )(0.8)(12λ)(0.05µm / λ)+ (180 ×106 )(4λ)(0.05µm / λ) = 45.6 ×106 µm 6 » ÿ Istatic = (2.4 ×10 µm) (20nA/ µm)/ 2 + (3nA/ µm)⁄+
6 (45.6 ×10 µm) »(0.02nA/ µm)/ 2 + (0.002nA/ µm)⁄ÿ = 32mA
Pstatic = IstaticVDD = 38mW
ECE 261 James Morizio 85 Leakage Example Cont.
• Estimate static power: (20 ×106 )(0.2)(12λ)(0.05µm / λ) = 2.4 ×106 µm – High leakage: (20 ×106 )(0.8)(12λ)(0.05µm / λ)+ – Low leakage: 6 6 (180 ×10 )(4λ)(0.05µm / λ) = 45.6 ×10 µm 6 » ÿ Istatic = (2.4 ×10 µm) (20nA/ µm)/ 2 + (3nA/ µm)⁄+
6 (45.6 ×10 µm) »(0.02nA/ µm)/ 2 + (0.002nA/ µm)⁄ÿ = 32mA
Pstatic = IstaticVDD = 38mW
• If no low leakage devices, Pstatic = 749 mW (!)
ECE 261 James Morizio 86 Low Power Design
• Reduce dynamic power – α: – C:
– VDD: – f: • Reduce static power
ECE 261 James Morizio 87 Low Power Design
• Reduce dynamic power – α: clock gating, sleep mode – C:
– VDD: – f: • Reduce static power
ECE 261 James Morizio 88 Low Power Design
• Reduce dynamic power – α: clock gating, sleep mode – C: small transistors (esp. on clock), short wires
– VDD: – f: • Reduce static power
ECE 261 James Morizio 89 Low Power Design
• Reduce dynamic power – α: clock gating, sleep mode – C: small transistors (esp. on clock), short wires
– VDD: lowest suitable voltage – f: • Reduce static power
ECE 261 James Morizio 90 Low Power Design
• Reduce dynamic power – α: clock gating, sleep mode – C: small transistors (esp. on clock), short wires
– VDD: lowest suitable voltage – f: lowest suitable frequency • Reduce static power
ECE 261 James Morizio 91 Low Power Design
• Reduce dynamic power – α: clock gating, sleep mode – C: small transistors (esp. on clock), short wires
– VDD: lowest suitable voltage – f: lowest suitable frequency • Reduce static power – Selectively use ratioed circuits
– Selectively use low Vt devices – Leakage reduction: stacked devices, body bias, low temperature
ECE 261 James Morizio 92