3 METRICS FOR TECHNOLOGY PERFORMANCE
A customer pays for value. We often associate value with functionality, and performance with how well the product or service carries out that function. Though this is often true, it is important to keep in mind that what we think is of value to the customer may not always be what the customer actually values. This is illustrated in Figure 3-1. In the context of information handling devices, performance is tied to the specific type of information handling the device performs.
© 1982 by Sidney Harris – “What’s So Funny About Computers?”, William Kaufmann, Inc./ ICE, "Roadmaps of Packaging Technology" 16065
Figure 3-1.
INTEGRATED CIRCUIT ENGINEERING CORPORATION 3-1 Metrics for Technology Performance
INFORMATION HANDLING TECHNOLOGIES
Our society has always had a thirst for information, and technology has been fueling the revolution in how we access information since the Gutenberg press first churned out 42-line Bibles in 1454. Information is handled in four basic ways:
• processed
• transmitted
• stored
• interfaced with the physical world
Every task performed by every electronic device falls into one of these four categories. Figure 3-2 lists examples of each of these activities. Computers, communications and consumer products are huge markets for information handling.
Information Processing
• Data compression
• Image morphing
• Database searching
• Translation

Information Transmission
• Telephone transmission over twisted pair
• TV transmission over radio waves
• PDA to PC over IR link
• IDE controller to hard drive over SCSI bus

Information Storage
• Floppy drive
• Hard drive
• CD
• Magnetic tape

Information Interface
• CRT monitor
• Keyboard
• Mouse
• Speakers
Source: ICE, "Roadmaps of Packaging Technology" 22164
Figure 3-2. Four Information Operations
Information processing is the transformation of data into information. This pervasive task happens in virtually all electronic systems, from supercomputers to digital watches embedded in pens or rings. It takes only a few gates to process information, and when a 4-bit embedded microcontroller costs 25¢ in high volume, it is only a matter of time before any device that runs on a battery or plugs into the wall will be capable of information processing.
Information transmission is the transport of data from one location to another. Every electronic interconnect interface is devoted to either power or information transmission. This task spans from transistor to transistor on a chip, over fine-line aluminum wires 2 microns long, to repeater station to repeater station over fiber-optic cables 45 kilometers long, two miles deep in the Atlantic Ocean. On a human scale, we are touched by information transmission daily: over telephone lines, by electromagnetic waves to radios, and by infrared links from a remote control to a TV.
Information storage is the placement of data in a static medium where it can be located and retrieved at a later time. This task spans the scale from on-chip registers that may be only 16 bits wide to optical disk farms that may contain 10,000 CDs, each with 10Gbits of data. There are three electronic media currently used for information storage: semiconductor, in various forms of random access memory (RAM) such as dynamic (DRAM), static (SRAM), video (VRAM), flash, etc.; magnetic, such as magnetic tape, floppy disks and hard disks; and optical, such as compact disks (CDs) and digital video disks (DVDs).
Information interface refers to the transfer of information from the physical world to the electronic world, either as input or as output. The Man-Machine interface is a specialized case. The most common output interfaces encompass visual display devices such as CRT (cathode ray tube), LCD (liquid crystal display) and printers, or sound generation through speakers. Vibration is also popular as an output for pagers, transmitting one bit of information. The most popular input devices today are keyboards, mice, pen-touch screens and microphones for use with voice recognition software. In addition to the Man-Machine interfaces, there is a whole universe of sensors and actuators that are used in monitor and control applications for automotive, home and industrial environments.
Though only information processing and transmission are discussed below, the packaging technologies used in all four applications are discussed throughout this book.
INFORMATION PROCESSING
The Migration from Super Computer to Shirt Pocket
There has never been, and will never be, enough processing power available to the individual. The functions performed by super computers today will eventually be performed by personal computers and PDAs. Functions only dreamed of now will some day be performed by the leading-edge super computers.
Information processing, or computing power, is an intrinsic feature of every electronic device we use. We call some of these devices computers, such as a mainframe, server, personal computer or laptop. And some we do not recognize as computers, yet have information processing as their
foundation, such as digital cell phones, cameras, personal digital assistants, TVs, washing machines, and sewing machines. Computers have become embedded into virtually every electronic device that plugs into the wall or is powered by a battery.
The performance of a computer is not the factor that classifies it as a mainframe or a microcomputer. Any table that listed the MIPS (millions of instructions per second) or FLOPS (Floating Point Operations Per Second) rating of a "typical" super computer would be out of date within a few years of its introduction.
For example, in 1982, a super computer was defined as a computer of about 20MegaFLOPS or higher. In 1989, the Intel 33MHz 486DX had a peak speed of 27MIPS, roughly equivalent to 20MegaFLOPS, the threshold for super computer speed. Also in 1989, NEC introduced the SX-X, at 22GigaFLOPS, three orders of magnitude higher than the super computer threshold. This ever increasing trend in performance is shown in Figure 3-3. The performance of any computer family is a constantly increasing quantity.
[Figure 3-3 plot: peak speed (millions of operations per second, log scale from 0.1 to 10^5) versus year (1955 to 2000). Leading-edge machines plotted include the IBM 704, IBM 7090, IBM 7094, IBM Stretch, CDC 6600, CDC 7600, IBM 360/90, Illiac IV, Cray 1, Cyber 205, Cray X-MP, Cray 2, Intel iPSC, TMC CM-1, Cray Y-MP, Cray C90, Cray T3D, Cray T3E and Cray T90, with the DEC PDP1, PDP6, IBM 360, DEC PDP11 and DEC VAX 780 shown in parentheses. Price annotations: $5M and $12M.]
Source: Physics Today/ICE, "Roadmaps of Packaging Technology" 21981
Figure 3-3. Leading Edge Growth of Computer Technologies Since 1955
Rather than performance capability, a computer or information appliance is defined by its form factor. The form factor reflects the physical size of the product, the number of people it can serve and its shape. These factors also define general price ranges for each form factor. A variety of
form factors have emerged for computer systems and information appliances over the last few decades (Figure 3-4). As electronics technology evolves, these form factors and market prices stay relatively the same. What changes is the performance capability of each product.
[Figure 3-4 chart: form factors ordered by size, price and relative performance, from Super Computer (100 inches, $10M) through Mainframe Computer, Server, Workstation, PC, HDTV, Laptop, Fax, Notebook, Camera, Telephone and Calculator down to Watch (1 inch, $1).]
Source: ICE, "Roadmaps of Packaging Technology" 15778A
Figure 3-4. Form Factors for Electronic Systems
The real revolutions in new products are occurring in the largest form factors and in the smallest form factors. At the high end, the total processing capability available to run an individual program opens up new problems to simulation in a reasonable time period. For example, simulating and predicting the local weather for the next day in less than one day's worth of computation time requires an estimated performance of 100GigaFLOPS.
Higher levels of chip integration have allowed what was PC performance to migrate into the shirt pocket. This is seen in the new generation of "personal information managers" such as the Zaurus, the Pilot and the Wizard, as well as in notebooks from vendors such as Toshiba, Compaq, and IBM. This evolution of microprocessor performance is illustrated in Figure 3-5.
[Figure 3-5 plot: peak speed (millions of operations per second, log scale from 0.1 to 10^3) versus year (1955 to 2000) for affordable systems: DEC PDPs, Altair 8800, Apple II, IBM PC, Apollo WS, Sun WS, Macintosh, PC, Pentium PC and Pentium Pro PC. Price annotations: $30K and $5K.]
Source: Physics Today/ICE, "Roadmaps of Packaging Technology" 21980
Figure 3-5. Functional/Affordable Growth of Computer Technology Since 1955
In between these two extreme form factors there is a steady migration of features and performance from the large systems into the smaller ones. Patrick Gelsinger, who led Intel's 80486 design team, has proposed a new law of computing,
"Every concept proven useful in mainframes or minicomputers has migrated onto the microprocessor."
We might even generalize this more and propose that,
"Any function performed by a large size computer will eventually be offered in the smallest computers."
Driving Forces on Computing Devices
The universal driving force for all these devices is more processing power, in a smaller volume, at lower price. This has often been summarized with the phrase, "Faster, smaller, cheaper." This march of ever increasing performance density per unit cost is propelled by finding engineering solutions that pack more gates into each form factor, switching at higher speeds, at a lower manufacturing cost.
At the high end, the high power dissipation is accepted, and adequate means of dealing with it are implemented. At the low end, where portability is key, low power consumption becomes another critical factor, and the driving force is more performance density per unit cost per watt.
When the information processing capability that can fit in a form factor reaches the level to encompass a new task, at an acceptable market price, a new product may be created. The digital watch was created when the sophisticated timing, alarm, and display functions could be integrated onto one chip. The smart card and hand calculator were made possible with the first generation microprocessor, which integrated about 2000 gates on one chip. These gates, switching at a few hundred kilohertz, provided enough processing power to perform floating point operations and transcendental calculations in less than a second.
Over the next few years, two revolutions will drive the need for more computing power in devices used by the individual, such as desktop computers and information appliances: "social" interfaces and real time virtual reality displays.
Social interfaces take advantage of information processing to lower the technical barriers between users and computers or information appliances. Two major elements of a social interface are voice recognition and artificial intelligence. With these two features, users may be able to interact with a computer in plain English, without special training.
Virtual reality is a term used to describe the generation of a simulated environment that approximates our real world and responds to the user. This includes interfaces to all five of our senses. In its simpler form, it is a visual and audio medium that has 3D objects in a 3D world. Just creating the 3D images, with realistic shadowing and textures, responding in real time to the changing view of the user, is driving information processing. Some of the applications for virtual reality are listed in Figure 3-6.
• Training
• Real-time, on-line maintenance manuals
• Tele-operation of equipment
• Medical diagnosis
• Entertainment
• Architectural design
• Generic man-machine interface
Source: ICE, "Roadmaps of Packaging Technology" 22165
Figure 3-6. Applications for Virtual Reality
Implementing Information Processing
The chips that perform the information processing have evolved as well. The range of information processing devices includes:
• motherboard based CPU
• chip set based CPU
• single chip microprocessor unit (MPU)
• microcontroller unit (MCU)
• digital signal processor (DSP)
At the high end, the CPU (central processing unit) is usually composed of a chip set. For example, in the Fujitsu VP2000, the CPU is one or more boards, each containing over 100 ECL gate arrays, as shown in Figure 3-7.
Source: Fujitsu/ICE, "Roadmaps of Packaging Technology" 22166
Figure 3-7. Fujitsu VP2000 CPU Board and Cold Plate
In the case of servers and high end workstations, a chip set is usually composed of a CPU, a memory management unit (MMU), an integer processor unit (IPU), and associated level 2 cache SRAMs. One example, the Ross Hypersparc processor used in Sun Microsystems workstations, is shown in Figure 3-8.
Source: nChip/ICE, "Roadmaps of Packaging Technology" 16062
Figure 3-8. Ross/nCHIP RISC, Thin-Film MCM in Cofired Package
The personal computer revolution, which began in 1980, was enabled by the introduction of the single chip microprocessor. The first IBM PC, for example, used the Intel 8088 processor, a variant of the 8086 with an 8-bit external data bus. The single chip processor has increased in processing power, as shown in Figure 3-9. This is a direct result of the advance in the number of transistors that can be economically implemented on a chip, and the increase in operating clock frequency.
Two other types of chips have become an integral part of the information processing revolution: microcontroller units (MCUs) and digital signal processors (DSPs). A microcontroller typically integrates on one chip, many of the functions of an MPU, but with lower performance. Where current MPUs operate with 32-bit and 64-bit data streams, current MCUs operate with 8-bit and 16-bit data streams. An MCU has a microprocessor at its core, with on chip ROM, RAM and multiple I/O ports.
Microcontrollers are embedded in more products than all other processor chips. Figure 3-10 illustrates the increasing market volume for MCUs over MPUs. Most PDAs use microcontrollers as their CPU, because fewer peripheral chips are needed, and the MCU has sufficient processing power for simple functions. Figure 3-11 lists the CPUs for some common PDAs.
[Figure 3-9 plot: performance (MIPS, log scale from 0.01 to 1,000) versus year (1970 to 2000) for Intel microprocessors. Data labels:]
Processor      Performance (MIPS)   Clock
4004           0.06                 108KHz
8008           0.06                 200KHz
8080           0.64                 2MHz
8085           0.37                 5MHz
8086           0.75                 10MHz
80286          2.66                 12MHz
Intel 386DX    6                    16MHz
Intel 386DX    7                    20MHz
Intel 386DX    8.5                  25MHz
Intel 386DX    11.4                 30MHz
Intel 486DX    27                   33MHz
Intel 486DX    41                   50MHz
Pentium        112                  66MHz
P6*            300                  100MHz
P6             600                  200MHz
*ICE estimate
Source: ICE, "Roadmaps of Packaging Technology" 21602B
Figure 3-9. Intel Microprocessor Clock Frequency
A DSP chip is a highly specialized chip which performs a numeric transform of an input data stream to result in an output data stream. The most common function is the Fast Fourier Transform (FFT). This operation takes a time domain data stream and converts it into a frequency spectrum. In this respect, a DSP is a hardware accelerator for some mathematical transforms.
[Figure 3-10 chart: MPU and MCU shipments (millions of units) and revenue (millions of dollars) by year.]
Year            1991     1992     1993     1994     1995     1996 (EST)
Units (M)
  MCU           1,722    1,902    2,221    2,659    3,067    3,470
  MPU           136      143      167      170      212      245
Dollars ($M)
  MCU           4,850    5,245    6,560    8,275    10,735   11,615
  MPU           3,565    5,460    8,590    10,995   14,280   17,510
Source: WSTS/ICE, "Roadmaps of Packaging Technology" 20318C
Figure 3-10. Comparison of the MCU and MPU Markets
PDA                           CPU                     Clock Frequency
Casio Z-7000                  x86                     7.5MHz
Apple Newton                  ARM610                  20MHz
Sharp ZR5000                  16bit custom            n/a
Sony Magic Link PIC-1000      Zilog Z85180            14.3MHz
Motorola Envoy Communicator   Motorola Dragon 68349   16MHz
Source: ICE, "Roadmaps of Packaging Technology" 22167
Figure 3-11. Microcontrollers Inside Popular PDAs
Typically, blocks of 1024 points of data are sampled. If this is performed for voice processing, the data stream may be the microphone voltage, sampled and digitized every 100 microseconds. A 1024 point data stream will represent 0.1sec of speech. When an FFT is performed on this data set,
the output will be the amplitude and phase of the spectrum, from 10Hz, up to 5KHz, at 10Hz intervals. Many algorithms, such as voice recognition and signal recovery in the presence of noise, operate more efficiently in the frequency domain than in the time domain.
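The sampling arithmetic in this example can be checked with a short sketch. The Python below is illustrative only: the recursive radix-2 `fft` function, the 1KHz test tone and all variable names are assumptions introduced here, and the code is not taken from any DSP vendor's library. It assumes one sample every 100 microseconds and 1024 points per block, as in the text.

```python
import cmath
import math

SAMPLE_PERIOD_S = 100e-6   # one sample every 100 microseconds (from the text)
N_POINTS = 1024            # block size (from the text)


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])   # transform of even-indexed samples
    odd = fft(samples[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


sample_rate_hz = 1.0 / SAMPLE_PERIOD_S          # 10KHz sampling rate
block_duration_s = N_POINTS * SAMPLE_PERIOD_S   # about 0.1 sec of speech
bin_spacing_hz = sample_rate_hz / N_POINTS      # just under 10Hz per bin
max_frequency_hz = sample_rate_hz / 2           # spectrum extends to 5KHz

# Hypothetical input: a pure 1KHz tone standing in for the microphone voltage.
TONE_HZ = 1000.0
signal = [math.sin(2 * math.pi * TONE_HZ * i * SAMPLE_PERIOD_S)
          for i in range(N_POINTS)]

spectrum = fft(signal)
# Only the first half of the bins is unique for a real-valued input.
magnitudes = [abs(c) for c in spectrum[: N_POINTS // 2]]
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * bin_spacing_hz

print(f"block duration : {block_duration_s:.4f} sec")
print(f"bin spacing    : {bin_spacing_hz:.2f} Hz")
print(f"top frequency  : {max_frequency_hz:.0f} Hz")
print(f"strongest bin  : {peak_hz:.1f} Hz")
```

The strongest bin lands within one bin spacing of the test tone, which is the sense in which the FFT turns a block of time samples into an amplitude spectrum. A production DSP would use an optimized in-place routine rather than this recursive form.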
DSPs are typically rated by how quickly they perform an FFT operation on a vector of 1024 data points, each of 16- or 32-bit size. For example, the ADSP-21060 DSP from Analog Devices can perform the FFT in 0.046 seconds. Two devices could work in parallel and provide real time, continuous display of an audio channel.
Applications for the FFT function have exploded in the past few years. They appear in the obvious applications, such as video processors, multi-media, audio, modem/fax chip sets, and wireless communications. In addition, they are being used as motor controllers. Every device with a motor will soon have a DSP chip. This includes disk drives, dishwashers, air conditioners, and autos. An example of the market breadth for DSP chips is illustrated in Figure 3-12, for the $2B market in 1995.
[Figure 3-12 chart: DSP market ($M, axis from 0 to 10,000) by year.]
Year              1991   1992   1993   1994   1995    1996    1997    1998    1999    2000    2001
DSP Market ($M)   337    447    674    998    1,729   2,460   3,380   4,400   5,735   7,440   9,110
Source: ICE, "Roadmaps of Packaging Technology" 20435B
Figure 3-12. DSP Market Trends ($M)
Just as microprocessor functions have become embedded on chips as cores, the DSP function has also become embedded on ASIC cores, as well as high end FPGAs (field programmable gate arrays).
For specialized operations, a DSP based chip can outperform the fastest MPUs. For example, TI introduced the Multimedia Video Processor (MVP). It is a four million transistor chip with four 32-bit DSPs, a 32-bit RISC processor with 100MFLOPS capability, a transfer controller, two video controllers, and 50Kbytes of SRAM. It is capable of 2,000MIPS when performing specialized video processing.
An Introduction To Performance Metrics
Performance, in general, is an ambiguous and poorly defined term for describing computer systems. More often than not, it is used as a sales and marketing tool to justify the price of a new product. It makes sense only when the specific test that is used to measure the performance is explicitly stated, and then only if the test is related to the intended use.
There are two different methods of quantifying performance: the actual measured execution time of running special, well defined reference programs, called benchmarks, or the rate at which well defined instructions, either integer or floating point, can be processed.
The execution speed of instructions will depend on the type of code that is running, and how well it has been optimized for the architecture of the computer. It is important to be aware of the conditions of the test run to rate the performance of a computer. Performance ratings can vary over a factor of 10 for the same computer because of variations in test conditions.
Benchmarks As Performance Metrics
There are a number of standardized programs that have been written and accepted by the industry as rulers for performance. Their variety reflects the different types of tasks a computer performs. The most common historical benchmarks are:
1. Linpack; solves 100 simultaneous linear equations with 100 unknowns, exercising matrix manipulation abilities.
2. Whetstone; most general test, composed of a collection of FORTRAN-like routines that include floating point and integer calculations, transcendental functions, array manipulation, and conditional jumps and loops.
3. Dhrystone; non-numeric test to simulate a high-level C program that contains memory assignments, control statements, and function calls.
4. TP1; used to simulate transaction analysis; fetches a record from a database, processes it, and rewrites it.
To evaluate the performance of a computer, these programs are run and the number of times they can be repeated per second is reported. The units are in Whetstones/sec, Linpacks/sec, etc. For example, the Sun 4/260 is rated at 19,000 Dhrystones/sec.
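The idea of reporting "runs per second" can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: `toy_workload` is a hypothetical stand-in kernel, not Dhrystone, Whetstone or any other published benchmark, and `runs_per_second` is a helper name invented here.

```python
import time


def toy_workload():
    """Hypothetical stand-in for a benchmark kernel (integer arithmetic and a loop)."""
    total = 0
    for i in range(1000):
        total += (i * i) % 7
    return total


def runs_per_second(workload, min_seconds=0.2):
    """Repeat `workload` for at least `min_seconds`; return completed runs per second."""
    runs = 0
    start = time.perf_counter()
    elapsed = 0.0
    while elapsed < min_seconds:
        workload()
        runs += 1
        elapsed = time.perf_counter() - start
    return runs / elapsed


rate = runs_per_second(toy_workload)
print(f"{rate:,.0f} runs/sec")
```

The figure printed depends entirely on the machine, the interpreter and the workload chosen, which is the reason a rating is only meaningful when the benchmark and test conditions are stated alongside it.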
The final application will influence which is the most critical benchmark to use. For example, a computer for scientific work should be rated in Whetstones.
Figure 3-13 lists the performance ratings of various computer systems against the standard metrics.
COMPUTER                           CLOCK PERIOD (nsec)   MIPS   MFLOPS
Compaq Deskpro 486/25              50                    15     n/a
Intel i860                         25                    65     5.4
IBM 320 (RS/6000)                  50                    27.5   7.4
VAX 9000                           16                    30     125
Amdahl 5990-700 (Dual Processor)   10                    63     n/a
NEC SX-3                           2.9                   n/a    5,500
Source: ICE, "Roadmaps of Packaging Technology" 15781
Figure 3-13. Performance Ratings of Selected Computers
MIPS and FLOPS
A measure of intrinsic processing speed falls into one of two categories: the number of instructions (such as add, subtract, move, branch, and fetch) that are executed per second, and the number of floating point operations per second. A floating point operation, involving real numbers and operations such as multiply and divide, typically requires 0.1 to 10 instructions to execute, depending on the logic architecture and the code.
When reporting the speed in instructions per second, convention has adopted the units of MIPS, Millions of Instructions Per Second. When reporting the speed for Floating Point Operations Per Second, the units are FLOPS, often given in units of MegaFLOPS or GigaFLOPS.
If the user's applications will be mostly floating point operations, the FLOPS are a better metric for real world performance than the MIPS. The number of FLOPS at which a system operates can be maximized by the design of the logic architecture and optimized code. With a co-processor or vector processor, common with super computers, the MegaFLOPS may be from 0.5 to 10 times the MIPS rating.
To realistically measure the performance of a computer, the MIPS or FLOPS rating should be measured while running a benchmark. The benchmark run should reflect the final intended applications. Even then, the way the benchmark is coded can greatly influence the measured speed. For example, a CRAY XMP-4, running a standard Linpack benchmark, will do 40MegaFLOPS. Running code optimized for matrix algorithm solutions on the Cray will allow the same Linpack to run at 800MegaFLOPS.
The MIPS rating is basically the number of instructions executed per cycle times the clock frequency in MHz. For most RISC-based systems, in which one instruction is performed per cycle, the MIPS is very nearly the clock frequency. For example, the Hyperstone computer, using a novel mixed RISC-CISC processor, has a 25MHz clock and executes one instruction per cycle with one processor. It is rated at 25MIPS.
The computer architecture and microcode have had a profound effect on the number of instructions per cycle. In Figure 3-9, the evolution of the Intel processors is shown. The 100MHz Pentium, for example, can execute 3 instructions per cycle and has a rating of 300MIPS.
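The two ratings quoted above follow from one multiplication: MIPS is the clock frequency in MHz times the average number of instructions completed per cycle. The sketch below restates that arithmetic; the function names are illustrative, and the 8086 figures are the ones listed in Figure 3-9.

```python
def mips_rating(clock_mhz, instructions_per_cycle):
    """Peak MIPS = clock frequency (MHz) x instructions completed per cycle."""
    return clock_mhz * instructions_per_cycle


def implied_instructions_per_cycle(mips, clock_mhz):
    """Invert the relation: average instructions per cycle implied by a rating."""
    return mips / clock_mhz


hyperstone = mips_rating(25, 1)        # 25MHz clock, one instruction per cycle
pentium_100mhz = mips_rating(100, 3)   # 100MHz clock, three instructions per cycle
print(hyperstone, pentium_100mhz)      # 25 300

# Older processors needed many cycles per instruction. Using the Figure 3-9
# values for the 8086 (0.75 MIPS at 10MHz):
ipc_8086 = implied_instructions_per_cycle(0.75, 10)
print(f"8086: about {1 / ipc_8086:.0f} cycles per instruction")
```

Read in the other direction, the same relation shows how much of the growth in Figure 3-9 came from architecture (more instructions per cycle) rather than from clock frequency alone.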
Some processors are still rated in VAX MIPS, a unit that takes the performance of a VAX 11/780 as its reference. The VAX MIPS rating of a computer is how many times faster it executes the same code than a VAX 11/780. The VAX 11/780, introduced in 1977, is roughly a 1MIPS machine.
Because of the inherent ambiguities in quantifying performance, and the optimism of most marketing organizations, which has cast confusion in the popular trade journals, it has been said that MIPS really stands for "Meaningless Indication of Performance."
SPECmarks
The System Performance and Evaluation Cooperative (SPEC) is a group formed in 1989 to establish benchmarks specifically for RISC-based systems. The spec is updated periodically. It was first introduced in 1992, and results are referred to in SPEC92 units. There are two test suites, one consisting of programs dealing with integer operations, and a second suite of programs dealing with floating point operations.
The single performance number for each suite, in units of SPECmarks, is the geometric mean, taken over the programs in the suite, of the ratio of the rate at which each program runs to the rate at which it runs on a reference platform. For the 1992 test spec, the reference platform was the VAX 11/780. Thus, one SPECmark92 is a VAX MIPS, which is roughly one MIPS.
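A geometric mean of speed ratios can be sketched as follows. This is a minimal illustration, not SPEC's own tooling: the function name `spec_rating` and the run times are hypothetical, chosen so that the three programs run 2, 4 and 8 times faster than on the reference platform.

```python
import math


def spec_rating(reference_times, measured_times):
    """Geometric mean of per-program speed ratios (reference time / measured time)."""
    ratios = [ref / t for ref, t in zip(reference_times, measured_times)]
    # Average the logarithms, then exponentiate: the n-th root of the product.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical run times in seconds for a three-program suite.
reference_platform = [100.0, 200.0, 400.0]
machine_under_test = [50.0, 50.0, 50.0]

rating = spec_rating(reference_platform, machine_under_test)
print(f"rating: {rating:.2f}")   # ratios 2, 4 and 8 give a geometric mean of about 4
```

The geometric mean is used rather than the arithmetic mean so that no single program with a very large speedup dominates the result; here the arithmetic mean of the same ratios would be about 4.67.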
Ratings are reported in terms of the SPECint92 or SPECfp92, corresponding to the integer or the floating point test suites, respectively. A higher number means the processor completes the test programs in less time. For example, a PowerPC 601 operating at 100MHz has a 112 SPECint92 rating. This is roughly 112MIPS. A comparison of the SPECmark rating of a variety of workstation class processors is shown in Figure 3-14.
Chip Vendor    Chip                      Clock (MHz)   SPECint92   SPECfp92   SPECint95   SPECfp95
DEC            Alpha 21164               291           319.3       602.2      7.03        9.64
                                         333           405.9       518        8.08        12.1
                                         350           405.9       518        n/a         n/a
HP             PA-RISC 7200              120           169.6       270.5      4.41        7.45
IBM/Motorola   PowerPC 604               100           n/a         n/a        3.47        3.11
                                         133           156.8       144.8      n/a         n/a
Intel          Pentium                   133           149.3       116        3.9         3.28
               Pentium Pro               200           320.2       283.2      6.75        8.09
Fujitsu        SPARC64                   118           212.6       282.8      n/a         n/a
Ross           hyperCACHE (HyperSPARC)   150           n/a         n/a        4.05        4.89
Sun            UltraSPARC*               167           252         351        n/a         n/a
                                         200           332         505        n/a         n/a
               UltraSPARC II**           300           n/a         n/a        8.5         15
MIPS           R4400MC                   250           181         n/a        4.39        n/a
* Supplied by Sun Microelectronics
** Estimated performance, due out fall '96
Source: Computer Design/ICE, "Roadmaps of Packaging Technology" 21974
Figure 3-14. SPEC Benchmarks
SPECmarks provide an unambiguous scale with which to compare the performance of different processors. Figure 3-15 illustrates the neck and neck race between the Intel family and the PowerPC family of processors for SPECint92 performance. For the same clock frequency, the PowerPC performance is slightly ahead of the Intel processor.
The SPECmark test suites were updated in 1995, and the tests are listed in Figure 3-16. The reference platform was changed with this test spec. It is now the Sun Microsystems SPARCstation 10/40, a 40MHz Supersparc based workstation. A SPECmark95 of 10 means the computer will execute the test suite of programs 10 times faster than a SPARCstation 10 with a 40MHz processor. The SS10 is approximately 65 times faster than a VAX 11/780, so SPEC95 ratings for a computer will be about 65 times smaller than SPEC92 ratings. Partly because of the psychological impact of smaller numbers to represent higher performance, and because this test suite is still very new, most current test results are still reported in terms of the SPEC92 test suite. However, Figure 3-17 lists the SPEC95 ratings for various processors.
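The relationship between the two scales can be examined using the processors in Figure 3-14 that list both a SPECint92 and a SPECint95 rating. The sketch below simply divides one by the other; the dictionary and function names are illustrative. For these five entries the ratio comes out between roughly 38 and 48, so the factor of about 65 quoted above is best read as a rough guide. The two suites run different programs, and no single constant converts one rating into the other.

```python
# (SPECint92, SPECint95) pairs as listed in Figure 3-14.
SPECINT_PAIRS = {
    "DEC Alpha 21164 (291MHz)": (319.3, 7.03),
    "HP PA-RISC 7200 (120MHz)": (169.6, 4.41),
    "Intel Pentium (133MHz)": (149.3, 3.9),
    "Intel Pentium Pro (200MHz)": (320.2, 6.75),
    "MIPS R4400MC (250MHz)": (181.0, 4.39),
}


def spec92_per_spec95(spec92, spec95):
    """How many SPEC92 units correspond to one SPEC95 unit for a given processor."""
    return spec92 / spec95


ratios = {chip: spec92_per_spec95(*pair) for chip, pair in SPECINT_PAIRS.items()}
for chip, ratio in ratios.items():
    print(f"{chip:28s} SPECint92/SPECint95 = {ratio:5.1f}")
```

Because the ratio varies from processor to processor, published SPEC92 and SPEC95 numbers are better compared within one suite than converted across suites.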
[Figure 3-15 plot: SPECint92 (log scale) versus date of first volume shipments (1994 to 1998), comparing Intel processors (P54/P55, P6, P6+, P7??) with PowerPC processors (603, 604, 620 and the new 603, 604 and 620 cores). Points are labeled with clock frequencies from 80MHz to 200MHz.]
Source: MicroDesign Resources/ICE, "Roadmaps of Packaging Technology" 21975
Figure 3-15. Pentium Versus PowerPC Performance
Suite    Name          Comments
CINT95   099.go        AI game. Plays "Go."
         124.m88ksim   Motorola 88k chip simulator with test program.
         126.gcc       New version of GCC, compiles SPARC code.
         129.compress  Compresses/decompresses file in memory.
         130.li        Lisp interpreter.
         132.ijpeg     Graphic JPEG compression/decompression.
         134.perl      Perl code that manipulates strings, prime numbers.
         147.vortex    Data-base program.
CFP95    101.tomcatv   Mesh-generation program.
         102.swim      Shallow-water model (1024 x 1024 grid).
         103.su2cor    Quantum physics; Monte Carlo simulation.
         104.hydro2d   Astrophysics; Hydrodynamical Navier Stokes equations.
         107.mgrid     Multi-grid solver in 3-D potential field.
         110.applu     Parabolic/elliptic partial differential equations.
         125.turb3d    Simulates isotropic, homogeneous turbulence in cube program.
         141.apsi      Solves problems in distribution of pollutants.
         145.fppp      Quantum chemistry.
         146.wave5     Plasma physics; electromagnetic particle simulation.
Source: Computer Design/ICE, "Roadmaps of Packaging Technology" 21976
Figure 3-16. SPEC Benchmark Application Programs
Model Name                 CPU Type             SPECint_base95   SPECfp_base95   SPECint95   SPECfp95
IBM POWERserver 591        77MHz POWER2         3.67             11.20           n/a         12
DEC AlphaSta 600 5/300     300MHz 21164         7.33             11.59           7.33        12
DEC AlphaSta 600 5/266     266MHz 21164         6.43             10.64           6.43        11
HP K400 4-CPU              100MHz PA-7200       n/a              10.20           n/a         10
HP J210 MP                 120MHz PA-7200       n/a              9.91            n/a         10
HP K400 3-CPU              100MHz PA-7200       n/a              9.65            n/a         10
IBM POWERserver 39H        66.7MHz POWER2       3.28             9.44            n/a         10
HP K400 2-CPU              100MHz PA-7200       n/a              8.38            n/a         8
HP J200 MP                 100MHz PA-7200       n/a              8.12            n/a         8
HP J210                    120MHz PA-7200       4.37             7.54            4.37        7
DEC 3000 Model 900         275MHz 21064A        4.24             6.29            n/a         n/a
HP J200                    100MHz PA-7200       3.64             6.28            3.64        6
HP K400                    100MHz PA-7200       3.58             6.21            3.58        6
DEC AlphaSta 250 4/266     266MHz 21064A        4.18             5.78            4.18        5
DEC 3000 Model 700         225MHz 21064A        3.66             5.71            n/a         n/a
HP 735/125                 125MHz PA-7150       4.04             4.55            4.04        4
HP 735/99                  99MHz PA-7100        3.27             3.98            3.27        4
DEC 3000 Model 500         150MHz 21064         2.15             3.65            n/a         n/a
IBM POWERserver C20        120MHz PowerPC 604   3.85             3.50            n/a         n/a
HP 715/100                 100MHz PA-7100LC     2.89             3.47            n/a         n/a
IBM POWERstation 43P       133MHz PowerPC 604   4.45             3.31            n/a         n/a
IBM POWERserver C10        80MHz PowerPC 601    2.37             2.97            n/a         n/a
Intel 1110/133             Pentium Processor    3.64             2.37            3.68        3
Sun SPARCstation 20/71     75MHz SuperSPARCII   2.46             2.14            n/a         n/a
Sun SPARCstation 10/40*    40MHz SuperSPARC     1.0              1.0             n/a         n/a
Source: Computer Design/ICE, "Roadmaps of Packaging Technology" 21977
Figure 3-17. Representative Sample of SPEC95 Benchmarks
Subtle Factors Influencing Measured Performance
In general, "hard coding" a benchmark to be optimized for a particular computer's logic architecture, memory architecture, and peripheral access can improve the benchmark performance by 10x. It is common for different computers using the same microprocessor, operating at the same clock frequency, to be rated at different MIPS. This is due to the different tests that were run and how the memory and other peripherals were accessed by the code.
It is clear there can be ambiguities in using the MIPS of a machine to measure performance. The performance rating depends on a number of subtle factors, rarely stated explicitly:
1. The program being run.
2. The optimization of the code for the specific computer.
3. The logic design of the computer.
4. The memory architecture of the computer.
5. The use of other peripherals and their access times.
INFORMATION TRANSMISSION
Analog and Digital Networks
Two types of networks are in use today: analog and digital. All communication between computers uses digitally encoded information. The plain old telephone system (POTS) is still analog based in most locations. The first cellular phone implementations were analog based; they are now being switched to digital encoding. Radio and TV transmission are still analog based. However, it is only a matter of time before TV is transmitted digitally, especially as it becomes integrated into the Information Superhighway. Only digitally encoded information can become part of the Information Superhighway.
In digital networks, the information is encoded as bits. In a wire or optical fiber, the bit stream is most commonly encoded as a voltage or intensity level. For wireless, and for some leading edge, ultra high bandwidth fiber systems, a carrier frequency is modulated. The most common schemes are amplitude modulation (AM), frequency modulation (FM) or frequency shift keying (FSK), and phase shift keying (PSK) and its derivatives, such as quadrature phase shift keying (QPSK).
The information transmission rate is defined as the number of bits per second transported in the interconnect. Though the information may be encoded as digital bits, the actual signal that propagates is still an analog signal, representing a varying voltage level or light intensity level. Ultimately, analog voltage levels are measured and compared to a threshold to recover the digital information of high and low.
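The threshold comparison described above can be sketched in a few lines. This is an illustration only: the voltage samples, the 0 to 5 V swing and the 2.5 V threshold are hypothetical values, and a real receiver would also need to decide when to sample each bit.

```python
def recover_bits(samples_volts, threshold_volts):
    """Map each sampled analog level to 1 (high) or 0 (low)."""
    return [1 if v > threshold_volts else 0 for v in samples_volts]

# Hypothetical example: bits sent on a nominal 0 V / 5 V line, received
# after noise and attenuation have distorted the levels.
sent_bits = [1, 0, 1, 1, 0]
received_volts = [4.6, 0.7, 3.9, 4.8, 1.1]
THRESHOLD_VOLTS = 2.5  # assumed midpoint of the signal swing

recovered = recover_bits(received_volts, THRESHOLD_VOLTS)
print(recovered)
```

As long as noise does not push a sample across the threshold, the recovered bits match what was sent even though no received voltage equals its nominal level.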
The use of digital communications allows a much more versatile network, in which the information being transmitted carries its own address. This increases the efficiency of the switching system, which routes the information to the correct end user. With the use of DSP, error correcting codes and compression, digital signals can carry the highest information density that the available bandwidth of the interconnect allows. All networks will move toward digital encoding in the future.
In digital networks, there are four elements: the transmission medium that carries the information from point to point, the switching system that routes the information into the various transmission channels, the transmitter, and the receiver. Often the switching system also acts as a repeater, receiving, routing and re-transmitting the digital signals.
Interconnect Media
Digital networks typically use four media for transmitting information: wire, such as printed circuit board traces, twisted pair, ribbon cable or coax cable; fiber-optic cable, either multimode or single mode; infrared (IR) through free space; and wireless, as radio waves, rf or microwaves. When free space radio waves are used, the FCC regulates which parts of the spectrum can be used for which purpose, to avoid interference between channels. No such restrictions are placed on wire or fiber-optic networks, since they are dedicated lines which do not radiate or interfere. IR is typically used for communications between devices on one desktop, or at most within one room, and interference between devices can be easily controlled.
The maximum rate at which bits of information can flow in a channel was first described by Claude Shannon in 1948. It has since become known as Shannon's Law:
capacity (bits/sec) = BW x log2 (1+SNR)
The bandwidth (BW) is the span in frequency that is used by the signal. When a carrier frequency is used, as in rf transmission, the bandwidth is roughly the modulation frequency of the carrier. The bandwidth in frequency space is closely regulated by the FCC to minimize interference between users. In wire or fiber based digital networks, the bandwidth is the highest sine wave frequency component present in the signal. This is related to the edge rate of the signal and is approximately 0.35/rise time. In a typical digital system network, the bandwidth is about 5x the clock frequency. This is reviewed in detail in Chapter 7.
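The two rules of thumb above can be written out as a short sketch. The 1 ns rise time and the 100 MHz clock are hypothetical inputs chosen only to show the arithmetic, and the function names are illustrative.

```python
def bandwidth_from_rise_time(rise_time_s: float) -> float:
    """Approximate signal bandwidth in Hz as 0.35 / rise time."""
    return 0.35 / rise_time_s

def bandwidth_from_clock(clock_hz: float, factor: float = 5.0) -> float:
    """Approximate signal bandwidth in Hz as about 5x the clock frequency."""
    return factor * clock_hz

# Hypothetical inputs for illustration.
bw_edge = bandwidth_from_rise_time(1e-9)   # 1 ns rise time
bw_clock = bandwidth_from_clock(100e6)     # 100 MHz clock

print(f"from 1 ns rise time:   {bw_edge / 1e6:.0f} MHz")
print(f"from 100 MHz clock:    {bw_clock / 1e6:.0f} MHz")
```

With these inputs a 1 ns edge corresponds to roughly 350 MHz of bandwidth, and a 100 MHz clock to roughly 500 MHz.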
To increase the transmitted bit rate, either the bandwidth must be increased, which means the clock frequency increases, or the signal to noise ratio must increase. Both of these methods are being used to increase the carrying capacity of networks.
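A minimal numerical sketch of Shannon's Law shows how each of the two levers changes the capacity. SNR in the formula is a linear power ratio, so a value quoted in dB is converted first. The 3 kHz bandwidth and 30 dB signal to noise ratio are illustrative figures roughly in the range of an analog voice channel; they are assumptions, not values from this report.

```python
import math

def db_to_linear(snr_db: float) -> float:
    """Convert a signal to noise ratio from dB to a linear power ratio."""
    return 10 ** (snr_db / 10)

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity in bits/sec: BW x log2(1 + SNR)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Illustrative baseline (assumed): 3 kHz bandwidth, 30 dB SNR.
base = shannon_capacity(3_000, db_to_linear(30))
wider = shannon_capacity(6_000, db_to_linear(30))    # double the bandwidth
cleaner = shannon_capacity(3_000, db_to_linear(40))  # raise SNR by 10 dB

print(f"baseline:         {base:,.0f} bits/sec")
print(f"2x bandwidth:     {wider:,.0f} bits/sec")
print(f"+10 dB SNR:       {cleaner:,.0f} bits/sec")
```

Doubling the bandwidth doubles the capacity, while raising the signal to noise ratio helps only logarithmically: in this example a tenfold increase in SNR adds about a third to the capacity.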
There is a fundamental trade-off between bit rate capacity and distance for an interconnect. As the length of an interconnect increases, effects such as attenuation decrease the signal to noise ratio, and reflections and distortions from impedance discontinuities decrease the bandwidth. The choice of medium depends on the distance to be traveled and the required bandwidth. As digital signal processing technology becomes more sophisticated, the data carrying capacity of an interconnect will steadily increase. Figure 3-18 shows the variation of carrying capacity and distance for a variety of interconnect media.
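The attenuation half of this trade-off can be sketched by feeding a length-dependent signal to noise ratio into Shannon's Law. This is a deliberately simplified model: it assumes a constant loss per meter and a fixed noise level, and it holds the bandwidth constant, so it ignores the bandwidth reduction from reflections and distortion. All numeric values are hypothetical and are not read from Figure 3-18.

```python
import math

def capacity_at_length(bandwidth_hz, snr_db_at_source, loss_db_per_m, length_m):
    """Shannon capacity in bits/sec after attenuation over length_m meters.

    Simplifying assumptions: constant loss per meter, fixed noise level,
    and a bandwidth that does not change with length.
    """
    snr_db = snr_db_at_source - loss_db_per_m * length_m
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical interconnect: 100 MHz bandwidth, 40 dB SNR at the
# transmitter, 0.05 dB of loss per meter.
lengths_m = [1, 10, 100, 1000]
capacities = [capacity_at_length(100e6, 40.0, 0.05, m) for m in lengths_m]

for m, c in zip(lengths_m, capacities):
    print(f"{m:>5} m: {c / 1e6:8.1f} Mbits/sec")
```

Even in this reduced model the capacity falls steadily as the run gets longer, which is the qualitative behavior the text describes for every medium.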