MINIMIZING MEMORY COST IN SET-TOP BOXES
Jeff Mitchell, Rambus Inc.
Abstract

Memory is the largest contributor to the cost of a set-top box, typically comprising half or more of the component cost. This paper examines methods of minimizing this cost by using commodity memory parts consumed in large volumes by the PC industry and by taking advantage of new developments in high-density, high-bandwidth DRAMs. Architectural methods of extracting bandwidth from conventional DRAM are examined, along with the performance requirements of emerging set-top box architectures. Alternatives are presented for meeting product performance goals while preserving low cost.

THE EFFECT OF MEMORY ON SYSTEM PERFORMANCE

Digital set-top boxes require tremendous amounts of bandwidth in order to deliver the interactive experience desired by consumers. MPEG-2 decoding, stereo audio, computational 3D graphics and interactive user interfaces demand performance levels previously unheard of in consumer products. Yet the cost constraints of the consumer market demand that inexpensive, relatively low-performance Dynamic Random Access Memory (DRAM) be used for memory. In the past, computer designers have used many DRAMs in parallel in order to increase memory bandwidth. This approach works well in PCs or workstations with many megabytes (MB) of memory, but it is not effective in a set-top box where the total memory may be as little as two to four MB. Two or four MB requires only one or two DRAM devices, whereas four or more devices are typically needed to provide sufficient bandwidth from conventional DRAMs. This presents the system designer with a difficult choice: add more memory and increase cost, or accept lower performance.

Recently there have been new developments in the DRAM industry that offer solutions to this memory size/bandwidth dichotomy. Two new types of DRAM devices, Rambus(TM) DRAM (RDRAM) and Synchronous DRAM (SDRAM), offer improved bandwidth over traditional devices while still maintaining an inexpensive DRAM cost structure. Both are designed for personal computer main memory, the largest user of DRAM; over half of the world's annual DRAM shipments go into PCs. Interactive set-top boxes that use these new high-bandwidth memories can meet their performance objectives while riding the PC cost/volume curves. This allows the consumer to have the best of both worlds: high performance at low cost.

MEMORY ALTERNATIVES

DRAM has always been the memory type of choice in cost-sensitive applications. Although DRAM performance is poor relative to other memory devices such as Static RAM (SRAM), DRAM has the virtue of being both dense and cheap. System designers go to great lengths to get the performance they need from DRAM and avoid using other types of memory in order to keep product cost down.

The conventional method of increasing performance in DRAM systems is to put several devices in parallel. If a single device cannot provide the required bandwidth, two devices used in parallel will double the performance. This also doubles the total amount of memory and nearly doubles the number of interface pins on the memory controller. The effect on the system of using DRAMs in parallel for increased performance is shown in TABLE 1. This example uses 16 Megabit (Mb) page-mode DRAM devices in a 16-bit-wide organization (1M x 16).

1996 NCTA Technical Papers -57-

DRAMs   Bandwidth (MB/s)   Controller Pins   Total Memory
  1            66                 40              2 MB
  2           133                 60              4 MB
  4           266                120              8 MB

TABLE 1. SYSTEM EFFECTS OF OPERATING DRAMS IN PARALLEL

Memory Granularity

In order to achieve a desired level of bandwidth, some number of DRAMs must be used in parallel. The minimum amount of memory required to be used in parallel is called the memory granularity. As devices are paralleled to increase performance, the total system memory grows proportionately. Large memory granularity can be a problem in applications where cost requires that the total memory be kept to a minimum.

In past generations of DRAM technology, memory granularity was usually not an issue because DRAMs were less dense, with fewer total memory bits in each device. Four 4 Mb devices in parallel yield the same total bandwidth as shown in TABLE 1, but the memory granularity would be only 2 MB, a factor of four less than with 16 Mb DRAMs.

So why not just use lower-density devices to get the needed performance? Besides the increased board space, power, and the sheer number of pins needed on the memory controller, DRAM economics dictate that the cheapest price per bit will be found on the densest device.

When designing a new product, the most cost-effective memory to use is the densest DRAM technology available. This will yield the lowest total memory cost. FIGURE 1 shows the relative cost over time of various densities of DRAM. For new products it is best to target the densest device that will be cost-effective in the time frame that the product reaches mass production. For example, a product starting design today would target the 16 Mb generation, while also considering how to use 64 Mb devices should the anticipated lifetime of the product extend beyond 1998 or 1999.

[FIGURE 1. DRAM PRICING TRENDS -- relative price versus year (1994-1998) for several DRAM densities]

Considering that a 64 Mb DRAM is 8 megabytes of memory in a single chip, it is easy to see that the conventional practice of using DRAMs in parallel to obtain more bandwidth has become impractical in small memory systems. There are two alternatives to this practice: wider DRAMs and faster DRAMs.

Wide DRAMs

One alternative to using several DRAMs in parallel is for the DRAM manufacturers to simply make wider devices, with data bus widths of 16, 32 or 64 bits. This provides the same benefit as parallelism, but without increasing the total system memory size. This approach works up to a point, but then becomes both financially unattractive and technically difficult.

As a DRAM is made wider, the die size increases and the package gets larger and more costly. With more I/O pins, it also becomes more expensive to test. These factors tend to negate the cost advantages of DRAM.

A wide DRAM also cannot be operated as fast as a narrower device. The increased number of output pins causes more noise and ground bounce. The remedy to this problem is to run the device slower, which offsets the performance advantage of being wider. Wider parts must also have more pins providing power and ground connections, which again increases cost.

Wider DRAMs are a partial solution to achieving higher performance, but at some width around 32 bits this approach reaches diminishing returns.

Fast DRAMs

An alternative to making a wider DRAM is to make a faster DRAM. Here the objective is to keep the device width down to a manageable size, but increase the speed at which it operates.

There are two types of "fast" DRAMs: Synchronous DRAM (SDRAM) and Rambus DRAM (RDRAM). These two DRAM derivatives are similar in that they both use a conventional memory core and run the external interface at a high speed. This provides the economic advantages of a conventional DRAM while providing much higher performance.

Synchronous DRAM (SDRAM)

An SDRAM is a conventional DRAM mated to a synchronous interface. The synchronous interface aligns data transfers into and out of the part with an external clock reference. Synchronizing the data transfer to a clock allows for tighter timing parameters and therefore a higher operating speed. SDRAMs can run in systems at speeds up to 66 MHz, about double the speed of a conventional DRAM. Doubling the interface speed means that only half as many devices are needed for a given bandwidth. This reduces the memory granularity to half that of a conventional DRAM.

Rambus DRAM (RDRAM)

As with SDRAM, RDRAM is a conventional DRAM mated to a synchronous interface. An RDRAM has a 64-bit-wide internal bus running at 75 MHz. The RDRAM connects to a memory controller, which also has a 64-bit-wide internal bus running at 75 MHz. These wide internal busses narrow to only 8 bits externally without any impact on performance. This gives an RDRAM the performance of a 64-bit-wide DRAM while retaining all of the cost advantages of a narrow 8-bit external bus.

TABLE 2 summarizes the number of signals, package pins, and relative bandwidth of DRAM, wide DRAM, SDRAM and RDRAM.

Type     Bus Width   Package Pins   Bandwidth (MB/s)
DRAM      16 bits         42               66
DRAM      32 bits        100              133
SDRAM     16 bits         50              133
RDRAM      8 bits         32              600

TABLE 2. PIN/BANDWIDTH COMPARISON

With each type of DRAM there is a straight-line relationship between bandwidth and memory granularity. This relationship makes it straightforward to approximate the memory granularity for a given level of system performance. If the system performance requirement lies above the line shown in FIGURE 2 for a particular type of DRAM, either another type of DRAM will have to be used or the bus width will have to be increased, with a corresponding increase in memory granularity and system cost.

[FIGURE 2. Bandwidth (MB/s) versus memory granularity (MB) for each DRAM type]

For example, an application which requires 250 MB/s of system bandwidth can be designed one of three ways, depending upon whether DRAM, SDRAM, or RDRAM is used. FIGURE 3 shows block diagrams of
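The scaling rule behind TABLE 1 can be sketched numerically. This is an illustration, not code from the paper: the function name and constants are invented here, the per-device figures are read from TABLE 1 (a 1M x 16 page-mode DRAM delivering roughly 66 MB/s and holding 2 MB), and products of those figures are taken, so results differ slightly from the table's rounded bandwidth entries.

```python
# Sketch of the TABLE 1 scaling rule: putting n page-mode DRAMs in
# parallel multiplies both the bandwidth and the total memory (the
# "granularity") by n. Per-device figures are taken from TABLE 1.

PER_DEVICE_BW_MBS = 66   # one 1M x 16 page-mode DRAM, approx. MB/s
PER_DEVICE_MB = 2        # a 16 Mbit device holds 2 MB

def parallel_system(n_devices):
    """Approximate bandwidth (MB/s) and minimum memory (MB) for
    n identical DRAMs operated in parallel."""
    bandwidth = PER_DEVICE_BW_MBS * n_devices
    granularity = PER_DEVICE_MB * n_devices  # memory you must buy
    return bandwidth, granularity

for n in (1, 2, 4):
    bw, mem = parallel_system(n)
    print(f"{n} device(s): ~{bw} MB/s, {mem} MB minimum memory")
```

The point of the sketch is that bandwidth and granularity are locked together: you cannot double one without doubling the other.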
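The 250 MB/s sizing exercise can likewise be worked out from TABLE 2. The sketch below is an assumption-laden illustration, not the paper's method: the helper name is invented, per-device bandwidths come from TABLE 2, and a 2 MB (16 Mbit generation) capacity is assumed for every device type.

```python
import math

# For a target bandwidth, the device count is the target divided by one
# device's bandwidth, rounded up; the memory granularity is that count
# times the per-device capacity (2 MB assumed for all types).

DEVICE_MB = 2  # 16 Mbit generation, assumed for every type
PER_DEVICE_BW = {"DRAM": 66, "SDRAM": 133, "RDRAM": 600}  # MB/s, TABLE 2

def granularity_for(target_mbs, dram_type):
    """Devices needed and resulting granularity (MB) for a target MB/s."""
    n = math.ceil(target_mbs / PER_DEVICE_BW[dram_type])
    return n, n * DEVICE_MB

for t in ("DRAM", "SDRAM", "RDRAM"):
    n, mb = granularity_for(250, t)
    print(f"250 MB/s with {t}: {n} device(s), {mb} MB granularity")
```

Under these assumptions, hitting 250 MB/s takes four conventional DRAMs (8 MB), two SDRAMs (4 MB), or a single RDRAM (2 MB), which is the cost trade-off the three designs illustrate.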
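The claim that a narrow external bus preserves a wide bus's performance at lower packaging cost can be quantified as bandwidth per package pin, a derived metric not stated in the paper; the figures below are copied from TABLE 2.

```python
# Bandwidth per package pin for each TABLE 2 entry (a derived comparison,
# computed here for illustration from the table's figures).

table2 = [
    ("DRAM x16", 42, 66),    # (name, package pins, MB/s)
    ("DRAM x32", 100, 133),
    ("SDRAM x16", 50, 133),
    ("RDRAM x8", 32, 600),
]

ratios = {name: bw / pins for name, pins, bw in table2}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} MB/s per package pin")
```

By this measure the 8-bit RDRAM interface delivers roughly an order of magnitude more bandwidth per pin than the page-mode devices, which is the economic argument the section is making.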