AMD-K6 Series
The AMD-K6 processor is a high-performance sixth-generation processor that is physically installable in a P5 (Pentium) motherboard. It was essentially designed for AMD by NexGen and was first known as the Nx686. The NexGen version never appeared because NexGen was purchased by AMD before the chip was due to be released. The AMD-K6 delivers performance levels somewhere between the Pentium and Pentium II processors as a result of its unique hybrid design.

The K6 processor contains an industry-standard, high-performance implementation of the MMX multimedia instruction set, enabling a high level of multimedia performance for the time period. The K6-2 introduced an upgrade to MMX that AMD calls 3DNow!, which adds even more graphics and sound instructions.

AMD designed the K6 processor to fit the low-cost, high-volume Socket 7 infrastructure. Initially, it used AMD's 0.35-micron, five-metal-layer process technology; later, a 0.25-micron process was used. The smaller process reduced the die size, which increased production quantities, and also decreased power consumption.

AMD-K6 processor technical features include
- Sixth-generation internal design, fifth-generation external interface
- Internal RISC core, translates x86 to RISC instructions
- Superscalar parallel execution units (seven)
- Dynamic execution
- Branch prediction
- Speculative execution
- Large 64KB L1 cache (32KB instruction cache plus 32KB write-back dual-ported data cache)
- Built-in floating-point unit
- Industry-standard MMX instruction support
- System Management Mode
- Ceramic pin grid array (CPGA) Socket 7 design
- Manufactured using 0.35-micron and 0.25-micron, five-layer designs

The AMD-K6 processor architecture is fully x86 binary code compatible, which means it runs software written for Intel x86 processors, including code that uses MMX instructions. To make up for the lower L2 cache performance of the Socket 7 design, AMD beefed up the internal L1 cache to 64KB total, twice the size of the L1 cache in the Pentium II or III.
This larger L1 cache, plus the dynamic execution capability, enabled the K6 to outperform the Pentium and come close to the Pentium II and III in performance for a given clock rate.

There were two subsequent additions to the K6 family: the K6-2 and the K6-3. The K6-2 offered higher clock speeds, higher bus speeds (up to 100MHz), and support for the 3DNow! instruction set. The K6-3 added 256KB of on-die full-core speed L2 cache. The addition of the full-speed L2 cache in the K6-3 was significant because it enabled the K6 series to fully compete with the Intel Pentium III processor family, although it did cause the processor to run very hot, resulting in its discontinuation after a relatively brief period.

The original K6 has 8.8 million transistors and is built on a 0.35-micron, five-layer process. The die is 12.7mm on each side, or about 162 square mm. The K6-3 uses a 0.25-micron process and incorporates 21.3 million transistors on a die only 10.9mm on each side, or about 118 square mm.

AMD K7 Processors

The successor to the AMD K6-2/K6-III processor family, the AMD K7 family is the seventh generation of x86 processors. The line of K7 processors consists of:
- High-performance desktop processors: Athlon, Athlon XP, and Athlon MP
- Budget desktop processors: Duron and Sempron (only Socket A Sempron processors are based on the K7 architecture)
- High-performance mobile processors: Athlon 4 and Athlon XP-M
- Budget mobile processors: Mobile Duron

The K7 processors feature multiple superscalar integer and floating-point units, higher bus speeds (up to 400 MHz), a larger L1 cache, multiprocessing (Athlon MP), and additional SIMD instructions (3DNow! and SSE in Athlon XP/MP). Initially produced in June of 1999 using 0.25-micron technology at speeds of 500 MHz and higher, the AMD K7-based processors are currently manufactured using 0.13-micron technology and have rated speeds of more than 3 GHz.

AMD Athlon

The Athlon is AMD's successor to the K6 series.
The Athlon was designed as a new chip from the ground up and does not interface via the Socket 7 or Super7 sockets like AMD's previous chips. In the initial Athlon versions, AMD used a cartridge design, called Slot A, almost exactly like that of the Intel Pentium II and III (see Figure 3.42). This was because the original Athlons used 512KB of external L2 cache, which was mounted on the processor cartridge board. The external cache ran at one-half, two-fifths, or one-third of the core speed, depending on which speed processor you had.

In June 2000, AMD introduced a revised version of the Athlon (code-named Thunderbird) that incorporates 256KB of L2 cache directly on the processor die. This on-die cache runs at full-core speed and eliminates a bottleneck in the original Athlon systems. Along with the change to on-die L2 cache, the Athlon was also introduced in a version for AMD's own Socket A (Socket 462), which replaced the Slot A cartridge.

Although the Slot A cartridge looks a lot like the Intel Slot 1, and the Socket A looks like Intel's Socket 370, the pinouts are completely different and the AMD chips do not work in the same motherboards as the Intel chips. This was by design because AMD was looking for ways to improve its chip architecture and distance itself from Intel. Special blocked pins in either socket or slot design prevent accidentally installing the chip in the wrong orientation or wrong slot. Figure 3.38 shows the Athlon in the Slot A cartridge. Socket A versions of the Athlon closely resemble the Duron.

The Athlon was manufactured in speeds from 500MHz up to 1.4GHz and uses a 200MHz or 266MHz processor (front-side) bus called the EV6 to connect to the motherboard North Bridge chip as well as other processors. Licensed from Digital Equipment, the EV6 bus is the same as that used for the Alpha 21264 processor, later owned by Compaq.
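
The effective bus speeds and the peak throughput they imply can be checked with a little arithmetic. The sketch below assumes a 64-bit (8-byte) bus that transfers data twice per clock; the function name and the use of decimal units (1GB = 1,000MB) are illustrative choices, not anything AMD specifies.

```python
def ev6_peak_throughput(base_clock_mhz, transfers_per_clock=2, bus_width_bytes=8):
    """Return (effective rate in MHz, peak throughput in MB/s).

    Assumes decimal megabytes and a bus that moves bus_width_bytes
    on every transfer; this is a peak figure, not sustained bandwidth.
    """
    effective_mhz = base_clock_mhz * transfers_per_clock
    return effective_mhz, effective_mhz * bus_width_bytes


for base_clock in (100, 133):
    effective, mb_per_s = ev6_peak_throughput(base_clock)
    print(f"{base_clock}MHz clock: {effective}MHz effective, "
          f"{mb_per_s / 1000:.1f}GBps peak")
# 100MHz clock: 200MHz effective, 1.6GBps peak
# 133MHz clock: 266MHz effective, 2.1GBps peak
```

The same function can be called with other base clocks to estimate the peak rate of faster versions of the bus.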
The EV6 bus uses a clock speed of 100MHz or 133MHz but double-clocks the data, transferring data twice per cycle, for an effective speed of 200MHz or 266MHz. Because the bus is 8 bytes (64 bits) wide, this results in a throughput of 8 bytes times 200MHz or 266MHz, which amounts to 1.6GBps or 2.1GBps. This bus is ideal for supporting PC1600 or PC2100 DDR memory, which also runs at those speeds. The AMD bus design eliminates a potential bottleneck between the chipset and processor and enables more efficient transfers compared to other processors. The use of the EV6 bus is one of the primary reasons the Athlon and Duron (covered later) chips perform so well.

AMD Duron

The AMD Duron processor (originally code-named Spitfire) was announced in June 2000 and is a derivative of the AMD Athlon processor in the same fashion as the Celeron is a derivative of the Pentium II and III. Basically, the Duron is an Athlon with less L2 cache; all other capabilities are essentially the same. It is designed to be a lower-cost version with less cache but only slightly less performance. In keeping with the low-cost theme, the Duron contains 64KB of on-die L2 cache and is designed for Socket A, a socket version of the Athlon Slot A. Except for the Duron markings, the Duron is almost identical externally to the Socket A versions of the original Athlon.

Essentially, the Duron was designed to compete against the Intel Celeron in the low-cost PC market, just as the Athlon was designed to compete in the higher-end Pentium III market. The Duron has since been discontinued and replaced by the Sempron.

AMD Athlon XP

As mentioned earlier, the most recent version of the Athlon is called the Athlon XP. This is basically an evolution of the previous Athlon, with improvements in the instruction set so it can execute Intel SSE instructions and a new marketing scheme that directly competed with the Pentium 4. Later Athlon XP models also adopted a larger (512KB) full-speed on-die L2 cache.
AMD uses the term QuantiSpeed (a marketing term, not a technical term) to refer to the architecture of the Athlon XP. AMD defines this as including the following:
- A nine-issue superscalar, fully pipelined microarchitecture: This provides more pathways for instructions to be sent into the execution sections of the CPU and includes three floating-point execution units, three integer units, and three address calculation units.
- A superscalar, fully pipelined floating-point calculation unit: This provides faster operations per clock cycle and cures a long-time deficiency of AMD processors versus Intel processors.
- A hardware data prefetch: This gathers the data needed from system memory and places it in the processor's Level 1 cache to save time.
- Improved translation look-aside buffers (TLBs): These enable the storage of data where the processor can access it more quickly without duplication or stalling for lack of fresh information.

These design improvements wring more work out of each clock cycle, enabling a "slower" Athlon XP to beat a "faster" Pentium 4 processor in doing actual work.

The first models of the Athlon XP used the Palomino core, which is also shared by the Athlon 4 mobile (laptop) processor. Later models have used the Thoroughbred core, which was later revised to improve thermal characteristics. The different Thoroughbred cores are sometimes referred to as Thoroughbred-A and Thoroughbred-B. The most recent Athlon XP processors use a core, known as Barton, with 512KB of on-die full-speed L2 cache.

Additional features include the following:
- 3DNow! Professional multimedia instructions (adding compatibility with the 70 additional SSE instructions in the Pentium III but not the 144 additional SSE2 instructions in the Pentium 4)
- 266MHz or 333MHz FSB
- 128KB Level 1 and 256KB or 512KB on-die Level 2 memory caches running at full CPU speed
- Copper interconnects (instead of aluminum) for more electrical efficiency and less heat

Also new to the Athlon XP is the use of a thinner, lighter organic chip packaging compound similar to that used by recent Intel processors.
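
One way to see the practical difference between these instruction-set extensions is to check which ones a processor reports. The following is a minimal sketch in Python. It assumes a space-separated flag string in the style that Linux exposes in /proc/cpuinfo (with names such as mmx, 3dnow, sse, and sse2); the function name is hypothetical and the sample string is made up for illustration rather than captured from a real Athlon XP.

```python
def simd_support(flags):
    """Map each SIMD extension discussed here to True/False.

    `flags` is a space-separated string of CPU feature names.
    Whole-token matching is used so that "sse" does not match "sse2".
    """
    present = set(flags.lower().split())
    return {
        "MMX": "mmx" in present,
        "3DNow!": "3dnow" in present,
        "SSE": "sse" in present,
        "SSE2": "sse2" in present,
    }


# Hypothetical flag string for illustration only (an assumption,
# not output recorded from actual hardware).
sample_flags = "fpu mmx 3dnow 3dnowext sse"

for name, supported in simd_support(sample_flags).items():
    print(f"{name}: {'yes' if supported else 'no'}")
```

For the sample string, SSE is reported as present and SSE2 as absent, which matches the description of 3DNow! Professional given in the feature list.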