Lecture 14: Memory Elements

Lecture 14: Memory Elements Mark McDermott Electrical and Computer Engineering The University of Texas at Austin 10/23/18 VLSI-1 Class Notes Memory Elements § Memory arrays § SRAMs § Serial Memories § Dynamic memories Pentium-4 (Willamette) Yellow boxes are memory arrays 10/23/18 VLSI-1 Class Notes Page 2 Memory Arrays Memory Arrays Random Access Memory Serial Access Memory Content Addressable Memory (CAM) Read/Write Memory Read Only Memory (RAM) (ROM) Shift Registers Queues (Volatile) (Nonvolatile) Serial In Parallel In First In Last In Static RAM Dynamic RAM Parallel Out Serial Out First Out First Out (SRAM) (DRAM) (SIPO) (PISO) (FIFO) (LIFO) Mask ROM Programmable Erasable Electrically Flash ROM ROM Programmable Erasable (PROM) ROM Programmable (EPROM) ROM (EEPROM) 10/23/18 VLSI-1 Class Notes Page 3 1D Memory Architecture m bits m bits S S 0 Word 0 0 Word 0 S S 1 Word 1 1 Word 1 S S 2 Word 2 Storage A0 2 Word 2 Storage S S 3 Cell A1 3 Cell Ak-1 Decoder n words S S n-2 Word n-2 n-2 Word n-2 S S n-1 Word n-1 n-1 Word n-1 Input/Output Input/Output n words ® n select signals Decoder reduces # of inputs k = log2 n 10/23/18 VLSI-1 Class Notes Page 4 2D Memory Architecture 2k-j bit line word line Aj Aj+1 storage Ak-1 (RAM) cell Row Address Row Row DecoderRow m2j A 0 selects appropriate word A1 Column Decoder from memory row Column Address Aj-1 Sense Amplifiers amplifies bit line swing Read/Write Circuits Input/Output (m bits) 10/23/18 VLSI-1 Class Notes Page 5 Array Architecture § 2n words of 2m bits each § If n >> m, fold by 2k into fewer rows of more columns bitline conditioning wordlines bitlines row decoder row memory cells: 2n-k rows x 2m+k columns n-k k column circuitry n column decoder 2m bits § Good regularity – easy to design § Very high density if good cells are used 10/23/18 VLSI-1 Class Notes Page 6 SRAM Block Diagram Precharge Precharge 2N Addr Address Row Wordline WL Memory Memory <N+M> Buffer Decoder Driver Cell Cell Bitlines 2M M Column YSEL# <0> Column <2 -1> Column Chip Select Decoder Select Select Differential Column Mux Sense Amplifier saout Input Output Write Enable Buffer Buffer Data_in Data_out 10/23/18 VLSI-1 Class Notes Page 7 SRAM Simulation Cross Section BLPC# WL<0> WL Drivers Keepers WL<2 N-1> Row Decoder Bitcell Column Decoder Col<2> Col<1> Col<0> SAPC# Data_out Wrt Driver Dout <0 SenseAmp wsel > WE*CS SAE Data_in 10/23/18 VLSI-1 Class Notes Page 8 6T SRAM Cell § Cell size accounts for most of array size – Reduce cell size at expense of complexity § 6T SRAM Cell PG: pass gate – Used in most commercial chips PU: pull-up – Data stored in cross-coupled inverters PD: pull-down § Read: – Precharge bit, bit_b Wordline – Raise wordline PU PU § Write: – Drive data onto bit, bit_b PG PG – Raise wordline PD PD Bit Bit# 10/23/18 VLSI-1 Class Notes Page 9 Moore’s Law Scaling for Memory Source: Kelin Kuhn, Intel 10/23/18 VLSI-1 Class Notes Page 10 SRAM Memory Cell Improvements 10/23/18 VLSI-1 Class Notes Page 11 12T SRAM Cell § Basic building block: SRAM Cell – Holds one bit of information, like a latch – Must be read and written § 12-transistor (12T) SRAM cell – Use a simple latch connected to bitline – 46 x 75 l unit cell bit write write_b read read_b 10/23/18 VLSI-1 Class Notes Page 12 SRAM Read § Improve performance when bit-line capacitance is high § Precharge both bitlines high bit bit_b § Then turn on wordline word P1 P2 § One of the two bitlines N2 N4 will be pulled down by the cell A A_b N1 N3 § Ex: A = 0, A_b = 1 – bit discharges, bit_b stays high A_b bit_b – But A bumps up slightly 1.5 1.0 bit § Read stability word – A must not flip 0.5 A – N1 >> N2 0.0 0 100 200 300 400 500 600 time (ps) 10/23/18 VLSI-1 Class Notes Page 13 Read Timing phi2 CLK phi1 WL BL/BL# DV differential BL equalized SAE Dt tsu YSEL# BLPC# BL precharge SAPC# SA precharge SAOUT DATA_OUT evaluate/latch time 10/23/18 VLSI-1 Class Notes Read Requirements § Pre-charge & equalize bit-lines from previous cycle § Minimum “Design Margin” before next READ begins § Delay requirement to allow sufficient bit-line voltage development BL Equalization BL# Degradation Due to & Restore Spec Quiet WL(s) + coupling VDD SA Offset BL# Design Margin Requirement BL VSS Write Restore Read Dt delay § NOTE: The Dt delay can be generated by a chain of inverter delays or by replica “dummy” row and column composed of bitcells. 10/23/18 VLSI-1 Class Notes SRAM Write § Drive one bitline high, the other low § Then turn on wordline bit bit_b § Bitlines overpower cell with new word value P1 P2 N2 N4 § Ex: A = 0, A_b = 1, A A_b N1 N3 bit = 1, bit_b = 0 – Force A_b low, then A rises high A_b 1.5 A § Writability bit_b – Must overpower 1.0 feedback inverter – N4 >> P2 0.5 word – N2 >> P1 (symmetry) 0.0 0 100 200 300 400 500 600 700 time (ps) 10/23/18 VLSI-1 Class Notes Page 16 Write Timing write mode from latch read mode WREN phi2 CLK phi1 WL WSEL write pulsewidth BL/BL# BL equalized BLPC# BL precharge DATA data arrives early from previous cycle DATA_OUT holds state from previous READ time 10/23/18 VLSI-1 Class Notes Write Requirements § Pre-charge & equalize bit-lines from previous cycle § WRITE can begin as soon as word-line is available § Must guarantee minimum write pulse-width, data valid time and write recovery; internal “high node” reaches say 90% of VDD BL Equalization WRITE Recovery Design Margin & Restore Spec VDD 90% of VDD BL# data# data and data# are the 6T bitcell state nodes. WL 50% of VDD WRITE Pulse Width BL internal cell nodes cross VSS TTW data § NOTE: Write pulse-width margin increases with lower frequency 10/23/18 VLSI-1 Class Notes SRAM Sizing § High bitlines must not overpower inverters during reads § But low bitlines must write new value into cell bit bit_b word weak med med A A_b strong Wdn > Wpass for read stability Wpass > Wup to enable writes 10/23/18 VLSI-1 Class Notes Page 19 6-Transistor SRAM Cell Layout transistor LEFT RIGHT PG 1.00 1.00 PD 1.38 1.38 PU 0.44 0.44 PFET NFET BL #BL 0.07/.055 0.07/.055 55 0.16/.055 0.16/.055 160 70 120 (m3) poly 70 360 220 0.22/.055 0.22/.055 70 55 160 (m3) 110 WL 1040 PASS Units are in nanometers {nm} GATE In 45nm CMOS, a typical µ 6T bit-cell area = 0.38 µm2 1.04 m (45nm) VLSI-1 Class Notes 10/23/18 Decoders § n:2n decoder consists of 2n n-input AND gates – One needed for each row of memory – Build AND from NAND or NOR gates Static CMOS Pseudo-nMOS A1 A0 A1 A0 word0 1 1 8 word0 1/2 4 16 word word A1 1 4 A0 A1 2 8 word1 word1 1 1 A0 1 word2 word2 word3 word3 10/23/18 VLSI-1 Class Notes Page 21 Decoder Layout § Decoders must be pitch-matched to SRAM cell – Requires very skinny gates A3 A3 A2 A2 A1 A1 A0 A0 VDD word GND NAND gate buffer inverter 10/23/18 VLSI-1 Class Notes Page 22 Large Decoders § For n > 4, NAND gates become slow – Break large gates into multiple smaller gates A3 A2 A1 A0 word0 word1 word2 word3 word15 10/23/18 VLSI-1 Class Notes Page 23 Predecoding § Many of these gates are redundant – Factor out common A3 gates into pre-decoder – Saves area A2 – Same path effort A1 A0 predecoders 1 of 4 hot predecoded lines word0 word1 word2 word3 word15 10/23/18 VLSI-1 Class Notes Page 24 Column Circuitry § Some circuitry is required for each column – Bitline conditioning – Sense amplifiers – Column multiplexing bitline conditioning wordlines bitlines row decoder row memory cells: 2n-k rows x 2m+k columns n-k k column circuitry n column decoder 2m bits 10/23/18 VLSI-1 Class Notes Page 25 Bitline Conditioning § Precharge bitlines high before reads f bit bit_b § Equalize bitlines to minimize voltage f difference when using sense amplifiers bit bit_b 10/23/18 VLSI-1 Class Notes Page 26 Sense Amplifiers § Bitlines have many cells attached – Ex: 32-kbit SRAM has 256 rows x 128 cols – 128 cells on each bitline § tpd µ (C/I) DV – Even with shared diffusion contacts, 64C of diffusion capacitance (big C) – Discharged slowly through small transistors (small I) § Sense amplifiers are triggered on small voltage swing (reduce DV) BL Equalization BL# Degradation Due to & Restore Spec Quiet WL(s) + coupling VDD Sense Amp BL# Design Margin Offset requirement BL VSS Write Restore Read Dt delay 10/23/18 VLSI-1 Class Notes Page 27 Differential Pair Amp § Differential pair requires no clock § But always dissipates static power P1 P2 sense_b sense bit N1 N2 bit_b N3 10/23/18 VLSI-1 Class Notes Page 28 Clocked Sense Amp § Clocked sense amp saves power § Requires timing the sense_clk signal to arrive after enough bitline swing § Isolation transistors cut off large bitline capacitance bit bit_b sense_clk isolation transistors regenerative feedback sense sense_b 10/23/18 VLSI-1 Class Notes Page 29 De-coupled Sense Amplifier § With SAE low; Data and Data# are pre-charge high § When SAE goes high; source-coupled pair acts as differential amplifier § Cross coupled inverters amplify and latch any voltage difference Bit Bit# Data# Data Out Out# SAE 10/23/18 VLSI-1 Class Notes Page 30 Twisted Bitlines § Sense amplifiers also amplify noise – Coupling noise is severe in modern processes – Try to couple equally onto bit and bit_b (common mode) – Done by twisting bitlines b0 b0_b b1 b1_b b2 b2_b b3 b3_b 10/23/18 VLSI-1 Class Notes Page 31 Column Multiplexing § Recall that array may be folded for good aspect ratio § Ex: 2 kword x 16 folded into 256 rows x 128 columns – Must select 16 output bits from the 128 columns – Requires 16 8:1 column multiplexers Precharge Precharge 2N Addr Address Row Wordline WL Memory Memory <N+M> Buffer Decoder Driver Cell Cell Bitlines M 2 <0> <2M-1> Column YSEL# Column Column Chip Select Decoder Select Select

Load more