<<

Memory Design

• Memory Types •Memoryyg Organization • ROM design • RAM design • PLA design

Adapted from J. M. Rabaey, A. Chandrakasan and B. Nikolic, Digital Integrated Circuits, 2nd ed. Copyright 2003 Prentice Hall/Pearson.

ECE 261 James Morizio 1 Classification

Non-Volatile Read-Write Memory Read-Write Read-Only Memory Memory

Random Non-Random EPROM Mask-Programmed Access Access 2 E PROM Pbl(PROM)Programmable (PROM)

SRAM FIFO FLASH

DRAM LIFO Shift Register CAM

ECE 261 James Morizio 2 MTiiDfiiiMemory Timing: Definitions

Read cycle

READ

Write cycle Read access Read access WRITE

Write access valid

DATA

Data written

ECE 261 James Morizio 3 Memory Architecture: M Decoders M bits S S0 0 Word 0 Word 0 S1 Word 1 A 0 Word 1 S2 Storage Storage Word 2 Word 2 cell A 1 cell

words A K2 1 N SN2 2 Decoder Word N2 2 Word N2 2 SN2 1 Word N2 1 Word N2 1 K 5 log2N

Input-Output Input-Output (M bits) (M bits) Intuitive architecture for N x M memory Decoder reduces the number of select signals Too many select signals: K = log N NdN words == NltilN select signals 2

ECE 261 James Morizio 4 Array-Structured Memory Architecture Problem: ASPECT RATIO or HEIGHT >> WIDTH

2L 2 K line Storage cell

A K

A K11 Word line Decoder

A L 21 Row

M.2K Amplify swing to Sense amplifiers / Drivers rail-to-rail amplitude

A 0 Column decoder Selects appropriate A K21 word

Inppput-Output (M bits)

ECE 261 James Morizio 5 Hierarchical Memory Architecture Block 0 Block i Block P 21

Row address

Column address Block address

Global data bus Control Block selector Global circuitry amplifier/driver

Advantages: I/O 1. Shorter wires within blocks 2. Block address activates only 1 block => power savings ECE 261 James Morizio 6 RdRead-OlOnly M emory C Cllells BL BL BL

VDD WL WL WL 1

BL BL BL

WL WL WL 0

GND

Diode ROM MOS ROM 1 MOS ROM 2

ECE 261 James Morizio 7 MOS OR ROM BL[0] BL[1] BL[2] BL[3]

WL[0]

VDD WL[1]

WL[2]

VDD

WL[3]

Vbias

Pull-down loads

ECE 261 James Morizio 8 ROM Example

• 4-word x 6-bit ROM Word 0: 010101 – Represented with dot diagram Word 1: 011001 – Dots indicate 1’s in ROM Word 2: 100101 weak Word 3: 101010 A1 A0 pseudo-nMOS pullups

2:4 DEC

ROM Array

Y5 Y4 Y3 Y2 Y1 Y0 Looks like 6 4-input pseudo-nMOS NORs

ECE 261 James Morizio 9 MOS NOR ROM

VDD Pull-up devices

WL[0]

GND WL [1]

WL [2]

GND WL [3]

BL [0] BL [1] BL [2] BL [3]

ECE 261 James Morizio 10 MOS NOR ROM Layout

Cell (9.5λ x 7λ)

Programmming using the Active Layyyer Only

Polysilicon Metal1 Diffusion Metal1 on Diffusion

ECE 261 James Morizio 11 MOS NOR ROM Layout Cell (11λ x7x 7λ)

Programmming using the Contact Layer Only

Polysilicon Metal1 Diffusion Metal1 on Diffusion

ECE 261 James Morizio 12 MOS NAND ROM

VDD Pull-up devices

BL[0] BL[1] BL[2] BL[3]

WL[0]

WL[1]

WL[2]

WL[3]

All word lines high by default with exception of selected row

ECE 261 James Morizio 13 MOS NAND ROM Layout Cell (8λ x 7λ) PiiProgrammming using the Metal-1 Layer Only

No contact to VDD or GND necessary; drastically reduced cell size Loss in performance compared to NOR ROM

Polysilicon Diffusion Metal1 on Diffusion

ECE 261 James Morizio 14 NAND ROM Layout Cell (5λ x 6λ)

PiiProgrammming using Implants Only

Polysilicon Threshold-altering implant Metal1 on Diffusion

ECE 261 James Morizio 15 Decreasinggy Word Line Delay

Driver WL Polysilicon word line

Metal word line (a) Driving the word line from both sides

Metal bypass

WL K cells Polysilicon word line (b) Using a metal bypass

ECE 261 James Morizio 16 Precharged MOS NOR ROM

V f pre DD

Precharge devices

WL[0]

GND WL[1]

WL[2] GND WL[3]

BL[0] BL[1] BL[2] BL[3]

PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design .

ECE 261 James Morizio 17 Read-Write Memories (RAM)

‰ STATIC (SRAM) DtData s tore d as long as supp ly is app lidlied Large (6 transistors/cell) Fast Differential

‰ DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended

ECE 261 James Morizio 18 6-transistor CMOS SRAM Cell

WL

VDD M 2 M 4 Q Q M M 5 6

M 1 M 3

BL BL

ECE 261 James Morizio 19 6T-SRAM — Layout

V M2 M4 DD

Q Q

M1 M3

GND M5 M6 WL

BL BL

ECE 261 James Morizio 20 Statue of Goethe and Schiller: the German National Theater, Weimar

ECE 261 James Morizio 21 3-Transistor DRAM Cell

BL1 BL2

WWL

RWL WWL

M 3 RWL

M 1 X X M 2

CS BL 1

BL 2

No constraints on device ratios Reads are non-destructive Value stored at node X when writing a “1” = VWWL-VTn ECE 261 James Morizio 22 3T-DRAM — Layout

BL2 BL1 GND

RWL M3

M2

WWL M1

ECE 261 James Morizio 23 1-Transistor DRAM Cell BL WL Write 1 Read 1 WL

M 1 X GND VDD 2 VT CS

VDD BL V /2 V /2 DD sensing DD

CBL

Write: C S is charged or discharged by asserting WL and BL. Read: Charge redistribution takes places between bit line and storage capacitance

C ------S ΔV ==VBL –VVPRE BIT – VPRE CS + CBL Voltage swing is small; typically around 250 mV.

ECE 261 James Morizio 24 DRAM Cell Observations

‰ 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out. ‰ DRAM s are s ing le-ended i n contrast to SRAM cells. ‰The read-out of the 1T DRAM cell is destructive; read and refres h operati ons are necessary f or correct operati on. ‰ Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design. ‰ When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD

ECE 261 James Morizio 25 1-T DRAM Cell

Capacitor

Metal word line M1 word line SiO2 Poly n+ n+ Field Oxide Inversion layer Diffused Poly induced by bit line plate bias Polysilicon Polysilicon gate plate Cross-section Layout Uses Polysilicon-Diffusion Capacitance Expensive in Area (trend now is to use trench capacitors

ECE 261 James Morizio 26 Perippyhery

‰ Decoders ‰ Sense Amplifiers ‰ Input/Output Buffers ‰ Control / Timing Circuitry

ECE 261 James Morizio 27 RDdRow Decoders Collection of 2M complex logic gates Organized in regular and dense fashion

(N)AND Decoder

NOR Decoder

ECE 261 James Morizio 28 Hierarchical Decoders Multi-stage implementation improves performance

•••

WL 1

WL 0

A 0A 1 A 0A 1 A 0A 1 A 0A 1 A 2A 3 A 2A 3 A 2A 3 A 2A 3

••• NAND decoder using 22--inputinput prepre--decodersdecoders A 1 A 0 A 0 A 1 A 3 A 2 A 2 A 3

ECE 261 James Morizio 29 Dy namic Decoders

Precharge devices GND GND VDD

WL3 VDD WL3

WL WL 2 2 VDD

WL1 WL 1 VDD

WL0

WL 0

VDD φ A0 A0 A1 A1 A0 A0 A1 A1 φ

2-input NOR decoder 2-input NAND decoder

ECE 261 James Morizio 30 4-to-1 tree based column decoder

BL 0 BL 1 BL 2 BL 3

A 0

A 0

A1

A 1

D Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders Solutions: buffers progressive sizing combinati on of t ree and pass t ransi st or approach es

ECE 261 James Morizio 31 PLA versus ROM ‰ Programmable Logic Array structured approach to random logic “two level logic implementation” NOR-NOR (product of sums) NAND-NAND (sum of products)

SIMILAR TO ROM ‰ Main difference ROM: fully populated PLA: one element per minterm

Note: Imppyortance of PLA’s has drastically reduced 1. slow 2. better software techniques (mutli-level logic synthesis) BtBut …

ECE 261 James Morizio 32 Programmable Logic Array PdPseudo-NMOS PLA

V DD GND GND GND GND GND

GND

GND

V DD X 0 X 0 X 1 X 1 X 2 X 2 f0 f1 AND-plane OR-plane

ECE 261 James Morizio 33 Dynamic PLA

f AND

GND V DD

f OR

f OR f AND V DD X 0 X 0 X 1 X 1 X 2 X 2 f0 f1 GND

AND-plane OR-plane

ECE 261 James Morizio 34 PLA Layout And-Plane Or-Plane VDD φ GND

x x x x x x 0 0 1 1 2 2 f0 f1 Pull-up devices Pull-up devices

ECE 261 James Morizio 35 CAMs

• Ext ensi on of ordi nary memory (e.g. SRAM) – Read and write memory as usual – Also match to see which words contain a key adr data/key

read CAM match write

ECE 261 James Morizio 36 10T CAM Cell

• Add four match transistors to 6T SRAM – 56 x 43 λ unit cell bit bit_ b word cell cell_b

match

ECE 261 James Morizio 37 CAM Cell Operation

• Read and write like ordinary SRAM CAM cell • For matching: clk weak

row decoder row miss – Leave wordline low address match0 – Precharge matchlines match1 – Place key on bitlines match2 – Matchlines evaluate match3 read/write column circuitry • Miss line data – Pseudo-nMOS NOR of match lines – Goes high if no words match

ECE 261 James Morizio 38