04 Cache Memory
Total Page:16
File Type:pdf, Size:1020Kb
1 + Chapter 4 Cache Memory 2 Location Performance Internal (e.g. processor registers, cache, Access time main memory) Cycle time External (e.g. optical disks, magnetic disks, Transfer rate tapes) Physical Type Capacity Semiconductor Number of words Magnetic Number of bytes Optical Unit of Transfer Magneto-optical Word Physical Characteristics Block Volatile/nonvolatile Access Method Erasable/nonerasable Sequential Organization Direct Memory modules Random Associative Table 4.1 Key Characteristics of Computer Memory Systems + 3 Characteristics of Memory Systems Location Refers to whether memory is internal and external to the computer Internal memory is often equated with main memory Processor requires its own local memory, in the form of registers Cache is another form of internal memory External memory consists of peripheral storage devices that are accessible to the processor via I/O controllers Capacity Memory is typically expressed in terms of bytes Unit of transfer For internal memory the unit of transfer is equal to the number of electrical lines into and out of the memory module Location of memory Processor requires its own local memory( register) Internal memory : the main memory External memory : memory on peripheral storage devices (disk, tape, etc.) Memory Capacity Memory capacity of internal memory is typically expressed in terms of bytes or words Common word lengths are 8, 16, and 32 bits External memory capacity is typically expressed in terms of bytes Unit of Transfer of Memory For internal memory, The unit of transfer is equal to the word length, but is often larger, such as 64, 128, or 256 bits Usually governed by data bus width For external memory, The unit of transfer is usually a block which is much larger than a word For addressable unit, The unit of transfer is the smallest location which can be uniquely addressed Word internally Access Methods (1) Sequential access Start at the beginning and read through in order Access time depends on location of data and previous location e.g. tape Direct access Individual blocks have unique address Access is by jumping to vicinity plus sequential search Access time depends on location and previous location e.g. disk Access Methods (2) Random access Individual addresses identify locations exactly Access time is independent of location or previous access e.g. RAM Associative Data is located by a comparison with contents of a portion of the store Access time is independent of location or previous access e.g. cache Access Methods (3) : Summary 9 Sequential Direct Random Associative access access access Each addressable A word is retrieved Memory is organized into Involves a shared read- location in memory has a based on a portion of its units of data called write mechanism unique, physically wired- contents rather than its records in addressing mechanism address Each location has its own The time to access a Individual blocks or addressing mechanism Access must be made in given location is records have a unique and retrieval time is a specific linear independent of the address based on constant independent of sequence sequence of prior physical location location or prior access accesses and is constant patterns Any location can be Cache memories may selected at random and Access time is variable Access time is variable employ associative directly addressed and access accessed Main memory and some cache systems are random access Performance Access time(latency) For random access memory, this is the time it takes to perform a read or write operation That is, the time between presenting the address and getting the valid data For non-random access memory, access time is the time it takes to position the read-write mechanism at the desired location Memory Cycle time Memory cycle time = access time + any additional time required before a second access can commence Additional time ? The time of transient signal dies out on signal lines The time to regenerate data if they are read destructively Memory cycle time >= Access time Performance Transfer Rate (Bandwidth) Rate at which data can be transferred into or out of a memory unit For random access memory, Transfer rate = (1/cycle time) For non-random access memory, the following relationship holds TN=TA+(N/R) TN : average time to read or write N bits TA : average access time N : number of bits R : transfer rate (bps) Physical Types Semiconductor RAM Magnetic Disk & Tape Optical CD & DVD Others Bubble (failed technology in 1980s) Holographic data storage (new technology) Tradeoffs in Memory Performance For memory, Faster access time, greater cost per bit Greater capacity, smaller cost per bit Greater capacity, slower access time The system designer may confront a problem to meet optimal solution The way to solve this problem is “the memory hierarchy” + Memory 14 The most common forms are: Semiconductor memory Magnetic surface memory Optical Magneto-optical Several physical characteristics of data storage are important: Volatile memory Information decays naturally or is lost when electrical power is switched off Nonvolatile memory Once recorded, information remains without deterioration until deliberately changed No electrical power is needed to retain information Magnetic-surface memories Are nonvolatile Semiconductor memory May be either volatile or nonvolatile Nonerasable memory Cannot be altered, except by destroying the storage unit Semiconductor memory of this type is known as read-only memory (ROM) For random-access memory the organization is a key design issue Organization refers to the physical arrangement of bits to form words + 15 Memory Hierarchy Design constraints on a computer’s memory can be summed up by three questions: How much, how fast, how expensive There is a trade-off among capacity, access time, and cost Faster access time, greater cost per bit Greater capacity, smaller cost per bit Greater capacity, slower access time The way out of the memory dilemma is not to rely on a single memory component or technology, but to employ a memory hierarchy Memory Hierarchy Registers In CPU Internal or Main memory May include one or more levels of cache “RAM” External memory Backing store Memory Hierarchy - Diagram - Decreasing cost per bit - Increasing access time - Increasing capacity - Decreasing frequency of access of the memory by the processor Hierarchy List Registers L1 Cache L2 Cache Main memory Disk cache Disk Optical Tape So do you want fast memory architecture ? It is possible to build a computer which uses only static RAM (see later) This would be very fast This would need no cache This would cost a very large amount Locality of Reference During the course of the execution of a program, memory references tend to cluster (group !) e.g. loops Cache Small amount of fast memory Sits between normal main memory and CPU May be located on CPU chip or module T1 + T2 T2 e m i t s s e c c a e g a r e v A T1 0 1 Fraction of accesses involving only Level 1 (Hit ratio) Figure 4.2 Performance of a Simple Two-Level Memory 22 + 23 Memory The use of three levels exploits the fact that semiconductor memory comes in a variety of types which differ in speed and cost Data are stored more permanently on external mass storage devices External, nonvolatile memory is also referred to as secondary memory or auxiliary memory Disk cache A portion of main memory can be used as a buffer to hold data temporarily that is to be read out to disk A few large transfers of data can be used instead of many small transfers of data Data can be retrieved rapidly from the software cache rather than slowly from the disk Block Transfer 24 Word Transfer CPU Cache Main Memory Fast Slow (a) Single cache Level 1 Level 2 Level 3 Main CPU (L1) cache (L2) cache (L3) cache Memory Fastest Fast Less Slow fast (b) Three-level cache organization Figure 4.3 Cache and Main Memory Cache operation – overview CPU requests contents of memory location Check cache for this data If present, get from cache (fast) If not present, read required block from main memory to cache Then deliver from cache to CPU Cache includes tags to identify which block of main memory is in each cache slot Line Memory Number Tag Block address 26 0 0 1 1 2 2 Block 0 3 (K words) C – 1 Block Length (K Words) (a) Cache Block M – 1 2n – 1 Word Length (b) Main memory Figure 4.4 Cache/Main-Memory Structure START 27 Receive address RA from CPU Is block No Access main containing RA memory for block in cache? containing RA Yes Fetch RA word Allocate cache and deliver line for main to CPU memory block Load main Deliver RA word memory block to CPU into cache line DONE Figure 4.5 Cache Read Operation Read address (RA) Cache Design Parameters Size Mapping Function Replacement Algorithm Write Policy Block Size Number of Caches Size does matter Cost More cache is expensive Speed More cache is faster (up to a point) Cf. The larger the cache, the larger the number of gates involved in addressing the cache. The result is that large caches tend to be slightly slower than small ones. Cache Addressing Where does cache sit? Between processor and virtual memory management unit Between MMU and main memory Logical cache (virtual cache) stores data using virtual addresses Stores data using virtual addresses The processor accesses the cache directly, without going through the MMU. Cf. a physical cache stores data using main memory physical addresses Advantage of the logical cache is that cache access faster than for a physical cache, before MMU address translation Disadvantage has to do with the fact that most virtual memory systems supply each application with the same virtual memory address space.