Graduate Institute of Electronics Engineering, NTU

Memory Hierarchy

Lecturer: Chih-hao Chao Advisor: Prof. An-Yeu Wu Date: 2009.4.29 Wednesday

Adapted from Prof. Wu's Computer Architecture (計算機結構) Lecture Notes, ACCESS IC LAB

Outline

v Review of memory basics
v Memory hierarchy
v Cache overview
v Measuring and improving cache performance


Review of Memory Basics

Memory Classification & Metrics

Memory classification:
v Read-Write Memory, Random Access: SRAM, DRAM
v Read-Write Memory, Non-Random Access: FIFO, LIFO
v Non-Volatile Read-Write Memory: EPROM, EEPROM, FLASH
v Read-Only Memory: Mask-Programmed

v Key Design Metrics
1. Memory Density (number of bits/µm²) and Size
2. Access Time (time to read or write) and Throughput
3. Power Dissipation

Memory Array Architecture

Latch and Register Based Memory

[Figure: positive latch, negative latch, and a register-based memory.]

v Works fine for small memory blocks v Simple memory model, simple timing v Inefficient in area for large memories v Density is the key metric in large memory circuits

Static RAM (SRAM) Cell (6-T Cell)

v Logic state held by cross-coupled inverters (M1, M2; M3, M4) v Retains state as long as the power supply is on v Feedback must be overdriven to write into the memory

Dynamic RAM (DRAM) Cell

Write: set Bit Line (BL) to 0 or VDD & enable Word Line (WL)

Read: set Bit Line (BL) to 0 or VDD/2 & enable Word Line (WL)

v DRAM relies on charge stored in a capacitor to hold logic state v Used in all high-density memories (one transistor per bit) v Must be “refreshed” or state will be lost - high overhead

Interacting with a Memory Device

v Address pins drive row and column decoders
v Data pins are bidirectional and shared by reads and writes
v Output Enable gates the chip’s tristate driver
v Write Enable sets the memory’s read/write mode
v Chip Enable/Chip Select acts as a master switch

Asynchronous SRAM

v Basic memory, e.g., MCM6264C 8K x 8 SRAM (shown on the outside and on the inside)
v Bidirectional data bus for read/write
v Chip Enables (E1 and E2): E1=1'b0, E2=1'b1 to enable the chip
v Write Enable (W): active-low when chip is enabled
v Output Enable (G): active-low when chip is enabled


Asynchronous SRAM Read Operation

v Read cycle begins when all enable signals are active (E1,E2,G) v Data is valid after read access time v Data bus is tristated shortly after G or E1 goes high (inactive)

Address Controlled Reads

v Perform multiple reads without disabling the chip (G=1'b0) v Data bus follows the address bus after some delay v Note the bus enable time, access time, contamination time, and bus tristate time


Asynchronous SRAM Write Operation

v Data is latched when W or E1 goes high v Data must be stable at this time v Address must be stable before W goes low v Write waveforms are very important v Glitches on the address can cause a write to an unexpected address

Synchronous SRAM

v Uses synchronization registers to provide synchronous inputs and more reliable operation at high speed

Asynchronous DRAM Operation

v Usually the address is separated into a row address and a column address v Manipulation of RAS and CAS can provide efficient operating modes such as early-write, read-write, hidden-refresh, etc.

Key Messages on Memory Devices

v DRAM vs. SRAM
v SRAM holds state as long as the power supply is on; DRAM must be refreshed → results in complicated control
v DRAM has much higher density, but requires special capacitor technology

v Handling memory operations
v Primary inputs of a memory should be registered for synchronization and to reduce glitches
v It's a bad idea to enable two tri-state drivers on the bus at the same time
v An SRAM doesn't need to be refreshed while a DRAM does
v A synchronous memory can result in higher throughput


Memory Hierarchy

Where are we now?

Technology Trends

          Capacity         Speed (latency)
Logic:    2x in 3 years    2x in 3 years
DRAM:     4x in 3 years    2x in 10 years
Disk:     4x in 3 years    2x in 10 years

DRAM generations (capacity improved 1000:1 while cycle time improved only 2:1):

Year    Size      Cycle Time
1980    64 Kb     250 ns
1983    256 Kb    220 ns
1986    1 Mb      190 ns
1989    4 Mb      165 ns
1992    16 Mb     145 ns
1995    64 Mb     120 ns
1998    256 Mb    100 ns
2001    1 Gb      80 ns

Processor-Memory Latency Gap

[Figure: processor-memory performance gap, 1980-2000. CPU performance grows ~60%/yr (2x/1.5 yr, "Moore's Law") while DRAM latency improves only ~9%/yr (2x/10 yrs); the processor-memory performance gap grows about 50% per year.]

What Does the Gap Mean?

v We use the pipelined MIPS as an example:

Clock period will be bounded by Memory, not Logic!!

Deep Pipeline in Modern Desktop uP

Memory Access Pattern

v Model the memory access address and access time v Not fully random (e.g., uniformly distributed) v Usually has some pattern → here comes the chance!


Memory Hierarchy

v A memory hierarchy consists of multiple levels of memory with different speeds and sizes
v Guideline: build memory as a hierarchy of levels, with the fastest memory close to the processor and the slower, less expensive memory below that
v Goal: present the user with as much memory as is available in the cheapest technology, while providing access at the speed offered by the fastest memory
v Three major technologies are used to construct the memory hierarchy:

Memory technology    Typical access time          $ per GB in 2004
SRAM                 0.5 - 5 ns                   $4000 - $10000
DRAM                 50 - 70 ns                   $100 - $200
Magnetic disk        5,000,000 - 20,000,000 ns    $0.5 - $2

General Principles of Memory

v Definitions v Upper: memory closer to processor v Block: minimum unit that is present or not present v Block address: location of block in memory

v Locality + smaller HW is faster = memory hierarchy v Levels: each smaller, faster, more expensive/byte than the level below v Inclusive: data found in the upper level is also found in the lower level


Memory Hierarchy: How Does it Work?

v Temporal Locality (Locality in Time): => Keep most recently accessed data items closer to the processor v Spatial Locality (Locality in Space): => Move blocks consisting of contiguous words to the upper levels

[Figure: blocks move between the upper-level memory (Blk X) and the lower-level memory (Blk Y); data flows to and from the processor through the upper level.]

Memory Hierarchy: Terminology

v Hit: data appears in some block in the upper level (example: Block X)
v Hit Rate: the fraction of memory accesses found in the upper level
v Hit Time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
v Miss: data needs to be retrieved from a block in the lower level (Block Y)
v Miss Rate = 1 - (Hit Rate)
v Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
v Hit Time << Miss Penalty


Memory Hierarchy of a Modern Computer System

[Figure: the processor (control, datapath, registers, and on-chip cache) is backed by a second-level cache (SRAM), main memory (DRAM), secondary storage (disk), and tertiary storage (tape).

Speed (ns):   1s | 10s | 100s | 10,000,000s (10s ms) | 10,000,000,000s (10s sec)
Size (bytes): 100s | Ks | Ms | Gs | Ts]

A Typical Memory Hierarchy of a Modern Computer System


Cache Overview

Inside a Cache

The Basics of Cache

The Basics of Cache (2)

v Cache: a safe place for hiding or storing things.
v Example: before the request, the cache contains a collection of recent references X1, X2, ..., Xn-1, and the processor requests a word Xn that is not in the cache. This request results in a miss, and the word Xn is brought from memory into the cache. → Replacement Policy

[Figure: cache contents before and after the reference to Xn - after the miss, Xn joins X1 ... Xn-1 in the cache.]

Four Questions for Memory Hierarchy Designers

v Q1: Where can a block be placed in the upper level? (Block placement)

v Q2: How is a block found if it is in the upper level? (Block identification)

v Q3: Which block should be replaced on a miss? (Block replacement)

v Q4: What happens on a write? (Write strategy)

Q1: Where can a block be placed?

v Direct Mapped: Each block has only one place that it can appear in the cache. v Fully associative: Each block can be placed anywhere in the cache. v Set associative: Each block can be placed in a restricted set of places in the cache. v If there are n blocks in a set, the cache placement is called n-way set associative

v What is the associativity of a direct mapped cache?

Placement Policy


Associative structures

Q2: How Is a Block Found?

v The address can be divided into two main parts
v Block offset: selects the data from the block; offset size = log2(block size)
v Block address: tag + index
Ø index: selects the set in the cache; index size = log2(#blocks/associativity)
Ø tag: compared to the tag in the cache to determine a hit; tag size = address size - index size - offset size
v Each block has a valid bit that tells if the block is valid - the block is in the cache if the tags match and the valid bit is set.
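The tag/index/offset split above can be sketched in Python (a hypothetical helper, not from the slides; it follows the log2 field-width formulas given here and assumes power-of-two sizes):

```python
import math

def split_address(addr, block_size_bytes, num_blocks, associativity=1, addr_bits=32):
    """Split a byte address into (tag, index, offset) per the slide's formulas.

    offset size = log2(block size); index size = log2(#blocks / associativity);
    tag = the remaining high-order bits of the address.
    """
    offset_bits = int(math.log2(block_size_bytes))
    index_bits = int(math.log2(num_blocks // associativity))
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

For a direct-mapped cache with 2^11 blocks of 32 bytes (the configuration used in a later example), this yields a 5-bit offset, 11-bit index, and 16-bit tag.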

Valid Bit of Cache

v Add a “valid bit” to indicate whether an entry contains a valid address.

v Replacement policy: recently accessed words replace less-recently referenced words (exploits temporal locality).

Direct Mapped Cache Architecture


2-Way Set-Associative Cache Architecture

Fully Associative Cache Architecture

Q3: Which Block Should be Replaced on a Miss?

v Easy for direct mapped - only one candidate
v Set associative or fully associative:
v Random - easier to implement
v Least Recently Used (LRU) - harder to implement
Ø true implementation only feasible for small sets (2-way)
Ø cache state must be updated on every access
v First-In, First-Out (FIFO) or Round-Robin
Ø usually used in highly associative sets
v Not Least Recently Used
Ø FIFO with an exception for the most recently used blocks

Example

v Miss rates for caches with different size, associativity and replacement algorithm.

Size      2-way LRU   2-way Random   4-way LRU   4-way Random   8-way LRU   8-way Random
16 KB     5.18%       5.69%          4.67%       5.29%          4.39%       4.96%
64 KB     1.88%       2.01%          1.54%       1.66%          1.39%       1.53%
256 KB    1.15%       1.17%          1.13%       1.13%          1.12%       1.12%

For caches with low miss rates, random is almost as good as LRU.

Q4: What Happens on a Write ($Hit)?

v Write through: the information is written to both the block in the cache and the block in the lower-level memory.
v Write back: the information is written only to the block in the cache. The modified cache block is written to main memory only when it is replaced.
v Is the block clean or dirty? (add a dirty bit to each block)
v Pros and cons of each:
v Write through
Ø Easier to implement
Ø Always combined with write buffers to avoid memory latency
Ø A read miss will not result in writes to memory (for the replaced block)
v Write back
Ø Fewer memory accesses
Ø Performs writes at the speed of the cache

Q4: What Happens on a Write ($Miss)?

v Since data does not have to be brought into the cache on a write miss, there are two options:

v Write allocate Ø The block is brought into the cache on a write miss Ø Hope subsequent writes to the block hit in cache Ø Low miss rate, complex control, on write-back caches

v No-write allocate Ø The block is modified in memory, but not brought into the cache Ø Writes have to go to memory anyway, so no need to bring the block into the cache Ø High miss rate, simple control, on write-through caches

Calculating Bits in Cache

v How many total bits are needed for a direct-mapped cache with 64 KBytes of data and 8-word blocks, assuming a 32-bit address?

v 64 KBytes = 2^14 words = 2^14 / 8 = 2^11 blocks
v block size = 32 bytes → offset size = 5 bits
v #sets = #blocks = 2^11 → index size = 11 bits
v tag size = address size - index size - offset size = 32 - 11 - 5 = 16 bits
v bits/block = data bits + tag bits + valid bit = 8x32 + 16 + 1 = 273 bits
v bits in cache = #blocks x bits/block = 2^11 x 273 = 559,104 bits ≈ 68.25 KBytes
v Increasing block size => fewer bits in cache
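The bookkeeping above can be checked with a short Python function (an illustrative sketch; the parameter names are ours, and 4-byte words and power-of-two sizes are assumed):

```python
import math

def cache_bits(data_kbytes, block_words, associativity=1, addr_bits=32):
    """Total cache bits = #blocks x (data bits + tag bits + 1 valid bit)."""
    word_bytes = 4
    num_blocks = data_kbytes * 1024 // (block_words * word_bytes)
    offset_bits = int(math.log2(block_words * word_bytes))
    index_bits = int(math.log2(num_blocks // associativity))
    tag_bits = addr_bits - index_bits - offset_bits
    return num_blocks * (block_words * 32 + tag_bits + 1)

total = cache_bits(64, 8)   # 2^11 blocks x 273 bits = 559,104 bits (68.25 KBytes)
```

The variants on the next slide come out the same way: cache_bits(64, 1) = 802,816 bits (98 KBytes) and cache_bits(64, 1, associativity=4) = 835,584 bits (102 KBytes).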

Calculating Bits in Cache

v How many total bits are needed for a direct-mapped cache with 64 KBytes of data and one-word blocks, assuming a 32-bit address?
v 64 KBytes = 16K words = 2^14 words = 2^14 blocks
v block size = 4 bytes → offset size = 2 bits
v #sets = #blocks = 2^14 → index size = 14 bits
v tag size = address size - index size - offset size = 32 - 14 - 2 = 16 bits
v bits/block = data bits + tag bits + valid bit = 32 + 16 + 1 = 49
v bits in cache = #blocks x bits/block = 2^14 x 49 = 802,816 bits = 98 KBytes
v How many total bits would be needed for a 4-way set-associative cache to store the same amount of data?
v block size and #blocks do not change
v #sets = #blocks/4 = 2^14 / 4 = 2^12 → index size = 12 bits
v tag size = address size - index size - offset = 32 - 12 - 2 = 18 bits
v bits/block = data bits + tag bits + valid bit = 32 + 18 + 1 = 51
v bits in cache = #blocks x bits/block = 2^14 x 51 = 835,584 bits = 102 KBytes
v Increasing associativity → more bits in cache


Measuring and Improving Cache Performance

Measuring Cache Performance

CPU time = (CPU execution clock cycles + Memory-stall clock cycles) * Clock cycle time

Memory-stall clock cycles come primarily from cache misses:

Memory-stall clock cycles = Read-stall cycles + Write-stall cycles
Read-stall cycles = (reads/program) * read miss rate * read miss penalty
Write-stall cycles = ((writes/program) * write miss rate * write miss penalty) + write buffer stalls

v For a write-back scheme:
v There are potential additional stalls arising from the need to write a cache block back to memory when the block is replaced.
v For a write-through scheme:
v A write miss requires that we fetch the block before continuing the write.
v Write buffer stalls: occur when the write buffer is full as a write arrives.


v In most write-through schemes, we assume:
v Read and write miss penalties are the same.
v Write buffer stalls are negligible.

(1) Memory-stall clock cycles = (Memory accesses/Program) * Miss rate * Miss penalty

(2) Memory-stall clock cycles = (Instructions/Program) * (Misses/Instruction) * Miss penalty


Calculating Cache Performance

v (question) How much faster would a processor run with a perfect cache that never missed?
v (answer) Assumptions:
v Instruction cache miss rate = 2%
v Data cache miss rate = 4%
v CPI = 2 without any memory stalls
v Miss penalty = 100 cycles for all misses
v Use the instruction frequencies for SPECint2000 from Chapter 3, Fig. 3.26 on page 228 (loads and stores are 36% of instructions)
v Instruction count = I


Calculating cache performance

Instruction miss cycles = I * 2% * 100 = 2.00 I
Data miss cycles = I * 36% * 4% * 100 = 1.44 I
Total memory-stall cycles = 2.00 I + 1.44 I = 3.44 I
The CPI with memory stalls is 2 + 3.44 = 5.44
Since there is no change in instruction count or clock rate, the ratio of the CPU execution times is:

CPU time with stalls / CPU time with perfect cache
= (I * CPI_stall * clock cycle) / (I * CPI_perfect * clock cycle)
= CPI_stall / CPI_perfect = 5.44 / 2 = 2.72
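The stall arithmetic above fits in a few lines of Python (a sketch, not from the slides; the 36% memory-reference frequency is the SPECint2000 figure quoted in the assumptions):

```python
def cpi_with_stalls(base_cpi, i_miss_rate, d_miss_rate, mem_refs_per_instr, miss_penalty):
    """CPI including instruction- and data-cache stall cycles per instruction."""
    i_stall = i_miss_rate * miss_penalty                       # 2% * 100 = 2.00
    d_stall = mem_refs_per_instr * d_miss_rate * miss_penalty  # 36% * 4% * 100 = 1.44
    return base_cpi + i_stall + d_stall

cpi = cpi_with_stalls(2.0, 0.02, 0.04, 0.36, 100)  # 5.44
speedup_of_perfect_cache = cpi / 2.0               # 2.72
```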


Calculating Cache Performance

v Example: suppose we speed up the computer in the previous example by reducing the CPI from 2 to 1.

v The system with cache misses has a CPI of 1 + 3.44 = 4.44.
v The system with the perfect cache is 4.44 / 1 = 4.44 times faster.
v The amount of execution time spent on memory stalls rises from 3.44 / 5.44 = 63% to 3.44 / 4.44 = 77%.

Calculating Cache Performance

v Example: cache performance with increased clock rate
(question) Suppose we double the clock rate. How much faster will the computer be with the faster clock, assuming the same miss rates as the previous example?
(answer)
v The new miss penalty = 200 clock cycles
v Total miss cycles per instruction = (2% * 200) + 36% * (4% * 200) = 6.88
v CPI = 2 + 6.88 = 8.88
v The relative performance = (performance with fast clock) / (performance with slow clock)
  = (IC * CPI_slow * clock cycle) / (IC * CPI_fast * (clock cycle / 2))
  = 5.44 / (8.88 / 2) = 1.23

The computer with the faster clock is about 1.2 times faster rather than 2 times faster, which it would have been if we ignored cache misses.

Reducing Cache Misses by More Flexible Placement of Blocks

v (1) Direct-mapped cache: a block can go in exactly one place in the cache.
v (2) Fully associative cache: a cache structure in which a block can be placed in any location in the cache.
v (3) Set-associative cache: a cache that has a fixed number of locations (at least two) where each block can be placed.

v In a direct-mapped cache, the position of a memory block is given by (block number) modulo (number of cache blocks)
v In a set-associative cache, the set containing a memory block is given by (block number) modulo (number of sets in the cache)

Block-Size Trade-Off

[Figure: miss rate vs. block size (bytes).]

Increasing block size tends to decrease miss rate.

Block-Size Trade-Off

v Larger block size → spatial locality (good) → reduces miss ratio
v But larger miss penalty
v Block too large → fewer blocks in the cache → miss rate goes up
v Avg access time = hit time * (1 - miss rate) + miss penalty * miss rate

Increase Memory Bandwidth to Reduce Miss Penalty

Misses and Associativity in Caches

v Example:
v Assume there are three small caches, each consisting of four one-word blocks: fully associative, two-way set associative, and direct-mapped. Find the number of misses for each organization given the following sequence of block addresses: 0, 8, 0, 6, 8.

Misses in Direct-Mapped Cache

Sequence of block addresses: 0, 8, 0, 6, 8.

Block address    Cache block
0                (0 modulo 4) = 0
6                (6 modulo 4) = 2
8                (8 modulo 4) = 0

Address of memory    Hit or    Contents of cache blocks after reference
block accessed       miss      Block 0      Block 1   Block 2      Block 3
0                    miss      Memory[0]
8                    miss      Memory[8]
0                    miss      Memory[0]
6                    miss      Memory[0]              Memory[6]
8                    miss      Memory[8]              Memory[6]

The direct-mapped cache generates 5 misses for the five accesses.


Misses in 2-Way Set-Associative Cache

Block address    Cache set
0                (0 modulo 2) = 0
6                (6 modulo 2) = 0
8                (8 modulo 2) = 0

Address of memory    Hit or    Contents of cache blocks after reference
block accessed       miss      Set 0        Set 0        Set 1   Set 1
0                    miss      Memory[0]
8                    miss      Memory[0]    Memory[8]
0                    hit       Memory[0]    Memory[8]
6                    miss      Memory[0]    Memory[6]
8                    miss      Memory[8]    Memory[6]

The two-way set-associative cache has 4 misses.

Misses in Fully Associative Cache

Address of memory    Hit or    Contents of cache blocks after reference
block accessed       miss      Block 0      Block 1      Block 2
0                    miss      Memory[0]
8                    miss      Memory[0]    Memory[8]
0                    hit       Memory[0]    Memory[8]
6                    miss      Memory[0]    Memory[8]    Memory[6]
8                    hit       Memory[0]    Memory[8]    Memory[6]

The fully associative cache only has 3 misses: the best one
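The three traces above can be reproduced with a small LRU cache simulator (an illustrative sketch, not from the slides; it models four one-word blocks at each associativity):

```python
def count_misses(block_addrs, num_blocks, associativity):
    """Count misses for an LRU cache of `num_blocks` one-word blocks."""
    num_sets = num_blocks // associativity
    sets = [[] for _ in range(num_sets)]      # each set ordered from LRU to MRU
    misses = 0
    for addr in block_addrs:
        s = sets[addr % num_sets]             # (block number) modulo (number of sets)
        if addr in s:
            s.remove(addr)                    # hit: refresh recency below
        else:
            misses += 1
            if len(s) >= associativity:
                s.pop(0)                      # evict the least recently used block
        s.append(addr)                        # mark as most recently used
    return misses

seq = [0, 8, 0, 6, 8]
# count_misses(seq, 4, 1) -> 5 (direct-mapped)
# count_misses(seq, 4, 2) -> 4 (two-way set associative)
# count_misses(seq, 4, 4) -> 3 (fully associative)
```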

Four-Way Set-Associative Cache


Size of Tags vs. Set Associativity

v Question:
v Assume a cache of 4K blocks, a four-word block size, and a 32-bit address. Find the total number of sets and the total number of tag bits for caches that are direct-mapped, two-way and four-way set associative, and fully associative.
v Answer:
v Direct-mapped:
Ø The bits for index and tag = 32 - 4 = 28 (block offset = 4 bits for 16-byte blocks)
Ø The number of sets = the number of blocks = 4K
Ø The bits for index = log2(4K) = 12
Ø The total number of tag bits = (28 - 12) * 4K = 64K bits


v Two-way set associative:
Ø The number of sets = (the number of blocks) / 2 = 2K
Ø The total number of tag bits = (28 - 11) * 2 * 2K = 68K bits

v Four-way set associative: Ø The number of sets = (the number of blocks) / 4 = 1K Ø The total number of tag bits = (28 - 10) * 4 * 1K = 72K bits

v Fully associative:
Ø The number of sets = 1
Ø The total number of tag bits = 28 * 4K * 1 = 112K bits
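The tag-bit totals can be checked the same way (a sketch; it assumes a power-of-two number of sets and the 28 block-address bits from the question):

```python
import math

def total_tag_bits(num_blocks, associativity, block_addr_bits=28):
    """Tag bits = (block address bits - index bits) * associativity * #sets."""
    sets = num_blocks // associativity
    index_bits = int(math.log2(sets))   # 0 when fully associative (one set)
    return (block_addr_bits - index_bits) * associativity * sets
```

For the 4K-block example: total_tag_bits(4096, 1) = 64K bits, (4096, 2) = 68K, (4096, 4) = 72K, and (4096, 4096) = 112K, matching the slide.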

v Least recently used (LRU): A replacement scheme in which the block replaced is the one that has been unused for the longest time.

Multilevel Cache

v Multilevel cache: a memory hierarchy with multiple levels of caches, rather than just a cache and main memory.
v Example:
v Suppose we have a processor with a base CPI of 1.0, assuming all references hit in the primary cache, and a clock rate of 5 GHz.
v Assume a main memory access time of 100 ns, including all the miss handling. Suppose the miss rate per instruction at the primary cache is 2%.
v How much faster will the processor be if we add a secondary cache that has a 5 ns access time for either a hit or a miss, and is large enough to reduce the miss rate to main memory to 0.5%?

Multilevel Cache (cont'd)

v For the processor with one level of cache:
The miss penalty to main memory = 100 ns / 0.2 ns (1 / 5 GHz) = 500 clock cycles.
Total CPI = base CPI + memory-stall cycles per instruction = 1.0 + 2% * 500 = 11.0

v For the processor with two levels of cache:
The miss penalty for an access satisfied by the second-level cache = 5 ns / 0.2 ns = 25 clock cycles.
Total CPI = base CPI + primary stalls per instruction + secondary stalls per instruction = 1.0 + 2% * 25 + 0.5% * 500 = 4.0
v The processor with the secondary cache is faster by 11.0 / 4.0 ≈ 2.8
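The two CPI computations can be sketched as follows (illustrative Python; the cycle counts follow the 5 GHz / 0.2 ns clock of the example, and the function name is ours):

```python
def total_cpi(base_cpi, l1_miss_rate, l1_miss_penalty,
              l2_global_miss_rate=0.0, l2_miss_penalty=0.0):
    """Base CPI + primary stalls + secondary stalls, per instruction."""
    return (base_cpi
            + l1_miss_rate * l1_miss_penalty
            + l2_global_miss_rate * l2_miss_penalty)

one_level = total_cpi(1.0, 0.02, 500)             # 1.0 + 10.0 = 11.0
two_level = total_cpi(1.0, 0.02, 25, 0.005, 500)  # 1.0 + 0.5 + 2.5 = 4.0
speedup = one_level / two_level                   # 2.75, which the slide rounds to 2.8
```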

Embed Cache Into Pipelined MIPS
