Pentium Pro Processor Workstation Performance Brief

June 1997 Order Number: 242999-003 ® Pro Processor Workstation Performance Brief

Information in this document is provided in connection with products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

The Pentium® Pro processor may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained from: Intel Corporation P.O. Box 7641 Mt. Prospect IL 60056-7641

or call 1-800-879-4683 or visit Intel’s website at http:\\www.intel.com Copyright © Intel Corporation 1996, 1997. * Third-party brands and names are the property of their respective owners.

2 Pentium® Pro Processor Workstation Performance Brief

CONTENTS

PAGE

INTRODUCTION ...... 5

THE INTEL PENTIUM® PRO PROCESSOR...... 5

PRODUCT FEATURE HIGHLIGHTS...... 6

SPEC...... 7

APPLICATION PERFORMANCE BENCHMARKS...... 12 SYSmark* for Windows NT* ...... 12 LINPACK* 100x100 Benchmark ...... 14 MicroStation* 95 Performance Benchmark...... 15 MSC/NASTRAN* ...... 16 BCELL* ...... 18 ANSYS*...... 20 Parametric Technology Corporation – Pro/ENGINEER*...... 23 HDL Simulators ...... 25

GRAPHICS PERFORMANCE SUMMARY ...... 27 Viewperf* ...... 27 Viewperf CDRS* Viewset ...... 29 Viewperf Design Review* Viewset...... 30 Viewperf Data Explorer* Viewset ...... 32 Advanced Visualizer* Viewset (AWadvs-01) ...... 34 Lightscape* Viewset (Light-01) ...... 36

SUMMARY...... 37

APPENDIX A — TEST CONFIGURATIONS ...... 38 Website References ...... 40

3 Pentium® Pro Processor Workstation Performance Brief

FIGURES Figure 1. SPECint_base95* Benchmark ...... 9 Figure 2. SPECfp_base95* Benchmark...... 10 Figure 3. SPECint_rate95* ...... 11 Figure 4. SPECfp_rate95* Benchmark ...... 12 Figure 5. SYSmark* for Windows NT* Benchmark ...... 14 Figure 6. LINPACK MFLOPS Performance...... 15 Figure 7. MicroStation* 95 Performance Benchmark ...... 16 Figure 8. MSC/NASTRAN* Power Train Vibrational Model Performance...... 17 Figure 9. MSC/NASTRAN* Power Train Vibrational Model Performance...... 18 Figure 10. MSC/NASTRAN* BCELL* Model Performance ~10000 Degrees of Freedom...... 19 Figure 11. MSC/NASTRAN* BCELL* Model Performance ~20000 Degrees of Freedom...... 20 Figure 12. ANSYS* LS Performance...... 21 Figure 13. ANSYS* Performance ...... 22 Figure 14. ANSYS* 5.3/Intergraph & ANSYS* 5.2/non-Intel systems...... 23 Figure 15. PTC Pro/ENGINEER* Performance...... 25 Figure 16. Raynet-gate and TI-Mult...... 26 Figure 17. CPU-Mix and Dmult...... 27 Figure 18. CDRS* Composite Scores...... 30 Figure 19. Design Review* Composite Scores...... 32 Figure 20. Data Explorer* Composite Scores...... 33 Figure 21. AWadvs-01 Composite Numbers...... 35 Figure 22. Light-01 Composite Scores...... 37

TABLES Table 1. Summary of CDRS* Tests ...... 29 Table 2. Summary of Design Review* Tests...... 31 Table 3. Summary of Data Explorer* Tests ...... 33 Table 4. Summary of Advanced Visualizer* Tests...... 35 Table 5. Summary of Lightscape* Tests...... 36

4 Pentium® Pro Processor Workstation Performance Brief

INTRODUCTION

1This performance brief compares the performance of Pentium® Pro processor systems to other non-Intel processor-based workstations, such as those based on the SPARC*, MIPS*, PowerPC*, PA-RISC*, and Alpha* architectures. Traditionally, Intel Architecture processors have been used in personal computers running operating systems like DOS* and Windows* for business and personal productivity applications such as word processing, spreadsheets, and presentations. Workstations have been used for scientific and technical applications such as mechanical computer-aided design and analysis (MCAD), electronic computer-aided design (ECAD), scientific modeling and analysis, high performance graphics and imaging, and other computationally intensive tasks. Workstations often include two-dimensional and three- dimensional graphics accelerators.

The distinction between a personal computer and a workstation has been based on processor performance, features, graphics capabilities, and applications. Today, that distinction is being blurred by the availability of high performance processors that are being used in workstation and server products from a variety of vendors. The widespread adoption of PCI as a high speed peripheral interconnect has allowed workstation-class graphics accelerators to be attached to Pentium and Pentium Pro processor-based systems. The availability of *, NT*, SCSI disks and high-speed network connectivity on the Intel architecture provides robust multitasking capabilities. The high performance of Intel processors has caused many traditional workstation applications to migrate to the Intel platform. The Pentium and Pentium Pro processors provide workstation capabilities at competitive performance levels and superior price/performance.

This performance summary documents the performance of the new Pentium Pro processors on a variety of workstation benchmarks. Competitive results are included for other non-Intel processor-based workstations for which performance data is available. This performance summary does not attempt to provide benchmark results for all workstations throughout the price range of the workstations available. Instead it focuses on those in the price range of an Intel architecture-based workstation. Hence the justification for including a particular workstation in this report is based on the list price of the workstation platform. Intel Pentium Pro processor-based workstations are targeted for the introductory and the volume workstation markets. This performance brief examines those platforms from other non-Intel processor- based workstations that are considered to be in these markets. In some performance comparisons throughout this brief, platforms or architectures may appear that exceed this qualification. These are included for reference only since the configuration or the architecture is not considered a workstation platform. In these cases, an Intel processor-based workstation may have been configured to match certain characteristics of the other platforms. Intel Pentium Pro processor-based systems are denoted in the graphs with a “↓” above the system’s bar when non-Intel processor-based platforms are included. In the graphics portion of the paper, this symbol is used to represent a PCI graphics accelerator card tested in a Pentium Pro processor-based platform.

THE INTEL PENTIUM® PRO PROCESSOR

The Pentium Pro processor family is the next generation of Intel’s processor technology. The Pentium Pro processor family currently consists of the following processors:

Endnotes

1 System configurations, including software and hardware design, may affect actual performance.

5 Pentium® Pro Processor Workstation Performance Brief

• Pentium Pro processor at 200 MHz (512K L2 cache) • Pentium Pro processor at 200 MHz (256K L2 cache) • Pentium Pro processor at 180 MHz (256K L2 cache) • Pentium Pro processor at 166 MHz (512K L2 cache)

While fully compatible with all existing Intel architecture-compatible software, the Pentium Pro processor is specifically optimized to run 32-bit software. This includes demanding software like CAD/CAM, numerical analysis, 3D and multimedia authoring applications running on workstations. The Pentium Pro processor delivers superior performance over the wide-range of applications that exist today and new applications that are coming in the future. It achieves this superior performance through an innovative architectural technique called “Dynamic Execution”. Dynamic Execution removes software bottlenecks via incorporated features such as out-of-order execution and register-renaming. The Pentium Pro processor design also includes glueless logic to support up to four processors in a multiprocessor design. Other high-performance features include the tightly coupled L2 cache and an advanced processor bus.

Manufactured using Intel’s high-volume and well-proven 0.35 µm technology, the 200 MHz Pentium Pro processor delivers outstanding integer and floating-point performance. (See Figures 1-4.) The following numbers are for the peak rating of the SPEC benchmarks and represent the results for a Fujitsu ICL superserver J650I* containing a single Pentium Pro processor running at 200 MHz with 512K of L2 cache. • SPECint95* rating is 8.71 • SPECfp95* rating is 6.68 The Pentium Pro processor has been designed with balanced integer and floating-point performance. This balanced integer and floating-point performance design gives it an edge in workstation applications. For example, MSC/NASTRAN*, a state-of-the-art mechanical engineering tool that is very floating point intensive, runs particularly well on the Pentium Pro processor due to the processor’s balanced design. Results for MSC/NASTRAN on Intel processor-based workstations are reported and contrasted with other workstations later in this document.

PRODUCT FEATURE HIGHLIGHTS • Fully compatible with the entire library of software based on operating systems such as Windows NT*, UNIX* SVR4*, SCO UNIX*, NEXTSTEP*, Solaris*, OS/2*, as well as Windows 95*, Windows for Workgroups 3.11*, Windows 3.1*, and DOS. • The onboard, high-speed secondary cache and an advanced processor bus provide the bandwidth needed for computer-intensive applications. • Glueless 4-way MP delivers superior performance and provides cost-effective upgrade path. • Support for enhanced data integrity and reliability features for mission-critical applications: ECC (Error Checking and Correction), Fault Analysis & Recovery, and Functional Redundancy Checking. • Dynamic Execution: − Multiple Branch prediction allows many branches to be outstanding − Dataflow analysis enables out-of-order execution over tens of instructions − Speculative Execution enables many instructions to be executed beyond the program counter • Headroom for performance growth through process improvement. The 200 MHz Pentium Pro processor is manufactured using the well-proven 0.35 µm process.

6 Pentium® Pro Processor Workstation Performance Brief

Dynamic Execution involves optimally adjusting instruction execution by predicting program flow, analyzing the program’s dataflow graph to choose the best order to execute the instructions, then speculatively executing instructions in the preferred order.

Branch prediction is a concept found in many mainframe and high-speed architectures. It allows the processor to decode instructions beyond branches to keep the instruction full. In the Pentium Pro processor, the instruction fetch/decode unit uses a highly optimized branch prediction algorithm to predict the direction of the instruction stream through multiple levels of branches, procedure calls, and returns.

Pipe-lined machines must fetch the next instruction before they have completely executed the previous instruction. If the previous instruction was a branch, then the next-instruction fetch could have been to the wrong place. Branch prediction is a technique that attempts to infer the correct next instruction address, knowing only the current one. Typically it uses a Branch Target Buffer (BTB), a small separate memory that watches the instruction cache index and attempts to predict which index should be accessed next, based on branch history.

Dynamic data flow analysis involves real-time analysis of the flow of data through the processor to determine data and register dependencies and to detect opportunities for out-of- order instruction execution. The Pentium Pro processor dispatch/execute unit can simultaneously monitor many instructions and execute these instructions in the order that optimizes the use of the processor’s multiple execution units, while maintaining the integrity of the data being operated on. This out-of-order execution keeps the execution units busy even when cache misses and data dependencies among instructions occur. Speculative execution refers to the processor’s ability to execute instructions ahead of the program counter but ultimately to commit the results in the order of the original instruction stream. To make speculative execution possible, the Pentium Pro processor microarchitecture decouples the dispatching and executing of instructions from the commitment of results. The processor’s dispatch/execute unit uses data-flow analysis to execute all available instructions in the instruction pool and store the results in temporary registers. The retirement unit then linearly searches the instruction pool for completed instructions that no longer have data dependencies found, the retirement unit commits the results of these instructions to memory and/or the Intel Architecture registers (the processor’s eight general-purpose registers and eight floating-point registers) in the order they were originally issued and retires the instructions from the instruction pool. Through deep branch prediction, dynamic data-flow analysis, and speculative execution, Dynamic Execution removes the constraint of linear instruction sequencing between the traditional fetch and execute phases of instruction execution. It allows instructions to be decoded deep into multi-level branches to keep the instruction pipeline full. It promotes out-of- order instruction execution to keep the processor’s six instruction execution units running at full capacity. Finally, it commits the results of executed instructions in original program order to maintain data integrity and program coherency. In summary, Dynamic Execution technology optimally adjusts instruction execution by predicting program flow by analyzing the program’s dataflow graph to choose the best order to execute the instructions, then having the ability to speculatively execute instructions in the preferred order. The Pentium Pro processor dynamically adjusts its work, as defined by the incoming instruction stream, to minimize overall execution time.

SPEC

SPEC95* is a software benchmark product produced by the Standard Performance Evaluation Corp. (SPEC), a non-profit group of computer vendors, system integrators, universities, research organizations, publishers and consultants throughout the world. It provides measures of performance for comparing compute-intensive workloads on different computer systems. SPEC95 consists of two suites of benchmarks: CINT95* for measuring and comparing

7 Pentium® Pro Processor Workstation Performance Brief compute-intensive integer performance, and CFP95* for measuring and comparing compute- intensive floating-point performance. The two suites provide component-level benchmarks that measure the performance of the computer’s processor, memory architecture and compiler. SPEC benchmarks are selected from existing application and benchmark source code running across multiple platforms. Each benchmark is tested on different platforms to obtain fair performance results across competing hardware and software systems.

SPEC95 is the third major version of the SPEC benchmark suites, which in 1989 became the first widely accepted standard for comparing compute-intensive performance across various architectures. The new release replaces SPEC92*, which has been phased out. SPEC no longer publishes SPEC92 results and has discontinued selling the benchmark suite. Performance results from SPEC95 cannot be compared to those from SPEC92, since new benchmarks have been added and existing ones changed.

The CINT95 suite, written in C language, contains eight CPU-intensive integer benchmarks. It measures and calculates the following metrics: • SPECint95 — The geometric mean of eight normalized ratios (one for each integer benchmark) when compiled with aggressive optimization for each benchmark. • SPECint_base95* — The geometric mean of eight normalized ratios when compiled with the conservative optimization for each benchmark. The CFP95 suite, written in FORTRAN* language, contains ten CPU-intensive floating-point benchmarks. It measures and calculates the following metrics: • SPECfp95 — The geometric mean of 10 normalized ratios (one for each floating-point benchmark) when compiled with aggressive optimization for each benchmark. • SPECfp_base95* — The geometric mean of 10 normalized ratios when compiled with conservative optimization for each benchmark. In the following figures, the Intel Pentium Pro processor performance is compared with other non-Intel processors. Figures 1-4 show the SPECint95 and SPECfp95 performance of the Pentium Pro processors. As the following figures show, the 200 MHz Pentium Pro processor- based system from Intergraph Computer Systems delivers comparable performance with today’s highest rated non-Intel processor-based systems.

8 Pentium® Pro Processor Workstation Performance Brief

14.00

12.00 SPECint_base95* 10.00 ↓ SPECint95* 8.00 ↓

6.00 SPEC result

4.00

2.00 12.30 10.40 9.82 9.78 8.04 7.22 7.72 6.26 4.72 4.41 5.23 4.10 3.67 9.77 9.45 8.38 8.36 8.04 7.22 6.90 5.58 4.55 4.41 4.31 4.00 3.67 0.00 * Intergraph Intergraph TDZ300/180* 1200* TDZ300/200* HP 9000 C160* 500/333* HP 9000 C110* 255/300* SGI INDY HP 9000 C100* 500/400 600 5/333* 600 43P-133* R5000SC/180MHz* SUN Ultra 2 Series Ultra SUN 2 Digital AlphaStation Digital AlphaStation Digital Digital AlphaStation Digital Digital AlphaStation SUN Ultra 1 model 170* Ultra SUN 1 model IBM RISC System/6000 RISC IBM

Results for Intel processors have been submitted to SPEC for publication; all others obtained from the SPEC website, August 1996.

Figure 1. SPECint_base95* and SPECint95* Benchmarks

9 Pentium® Pro Processor Workstation Performance Brief

18.00

16.00 SPECfp_base95* 14.00 SPECfp95* 12.00

10.00

8.00 SPEC result ↓ ↓ 6.00

4.00

2.00 16.30 13.40 14.10 12.50 10.20 9.06 7.45 6.20 6.41 5.81 5.76 4.71 4.78 3.76 15.20 12.60 12.80 11.80 9.61 8.45 7.45 6.20 5.82 5.38 5.25 4.73 4.72 3.59 0.00 model 170* model SUN Ultra1 SGI INDY 500/400* 500/333* 255/300* HP 9000 C160* 9000 HP C110* 9000 HP C100* 9000 HP 600 5/333* 300/3CT* 43P-133* Model 151* Model R5000SC/180MHz* Digital AlphaStation Digital AlphaStation Digital AlphaStation Digital AlphaStation SUN SPARCstation 20 IBM RISC System/6000 RISC IBM System/6000 RISC IBM Intergraph TDZ300/200* Intergraph TDZ300/180*

Results for Intel processors have been submitted to SPEC for publication; all others obtained from the SPEC website, August 1996.

Figure 2. SPECfp_base95* and SPECfp95* Benchmarks

The SPEC organization also publishes results for throughput (rate) measurement. The benchmark numbers that represent throughput are known as SPECint_rate95*, SPECfp_rate95*, SPECint_rate_base95* and SPECfp_rate_base95*. With this method, called the “homogeneous capacity method,” several copies of a given benchmark are executed. This method is particularly suitable for multiprocessor systems. The results express how many jobs of a particular type (characterized by a particular benchmark) can be executed in a given time. The SPEC rates therefore characterize the capacity of a system for compute-intensive jobs of similar characteristics. Because of the different units, the values SPECint95/SPECfp95 and SPECrate_int95/SPECrate_fp95 cannot be compared directly. The SPEC_rate results are shown for an Intel Pentium Pro processor-based system and several of today’s leading non-Intel processor-based workstations.

10 Pentium® Pro Processor Workstation Performance Brief

120.00 SPECint_rate_base95*

100.00 SPECint_rate95* 80.00 ↓ ↓ 60.00 SPEC result 40.00

20.00 93.60 88.40 111.00 88.00 72.40 64.90 47.10 39.70 33.00 85.00 75.50 88.00 75.20 72.40 64.90 38.80 39.70 33.00 0.00 Intergraph Intergraph TDZ300/200* TDZ300/180* 500/333* 255/300* 500/400* HP 9000 C160* 9000 HP C110* 9000 HP C100* 9000 HP 600 5/333* 600 Digital AlphaStation Digital Digital AlphaStation Digital AlphaStation Digital AlphaStation Digital

Results for Intel processors have been submitted to SPEC for publication; all others obtained from the SPEC website, August 1996.

Figure 3. SPECint_rate_base95* and SPECint_rate95* Benchmarks

11 Pentium® Pro Processor Workstation Performance Brief

160.00 SPECfp_base_rate95* 140.00 SPECfp_rate95* 120.00

100.00

80.00

SPEC result ↓ 60.00 ↓

40.00 147.00 120.00 127.00 113.00 67.10 57.70 55.80 52.30 51.80 20.00 136.00 113.00 115.00 106.00 67.10 52.40 55.80 48.40 47.20 0.00 Intergraph Intergraph TDZ300/200* TDZ300/180* 500/400* 500/333* 255/300* 600 5/333* HP 9000 C110* HP 9000 C100* HP HP 9000 C160* HP Digital AlphaStation Digital AlphaStation Digital AlphaStation Digital AlphaStation Digital

Results for Intel processors have been submitted to SPEC for publication; all others obtained from the SPEC website, August 1996.

Figure 4. SPECfp_base_rate95* and SPECfp_rate95* Benchmarks

APPLICATION PERFORMANCE BENCHMARKS

In this section, we present the Pentium Pro processor performance on Windows NT* application-based benchmarks such as MicroStation* 95, Pro/E* and MSC/NASTRAN*.

SYSmark* for Windows NT* SYSmark* for Windows NT* is a suite of application software and associated benchmark scripts that have been developed by Business Applications Performance Corporation (BAPCo) in order to provide a tool for accurate and realistic measurement of system performance of personal computers running popular business-oriented applications in the Windows NT operating environment. The scripts are developed to reflect usage patterns of PC users in a business-oriented environment.

Workloads for SYSmark for Windows NT were developed based on BAPCo’s standardized practice of surveying users to determine how they exercise popular applications in day-to-day work. The applications selected for testing had to be able to run across all three popular architectures (Intel, MIPS and Alpha). SYSmark for Windows NT can generate performance metrics as a composite of all the different applications or for a specific application, such as word processing or spreadsheets.

12 Pentium® Pro Processor Workstation Performance Brief

• WORD PROCESSING − Microsoft Word* 6.0 − (native 32-bit on all architectures) • SPREADSHEET − Microsoft Excel* 5.0 − (native 32-bit on all architectures) • PROJECT MANAGEMENT − Welcom Software Technology Texim Project* 2.0e − (native 32-bit on all architectures) • COMPUTER-AIDED DESIGN − Orcad MaxEDA* 6.0 (PCB design tool) − (native 32-bit on all architectures) • PRESENTATION GRAPHICS − Microsoft PowerPoint* 4.0 − (16-bit Windows emulation) Figure 5 illustrates the SYSmark for Windows NT Performance on the Intel Pentium Pro processor.

13 Pentium® Pro Processor Workstation Performance Brief

800 ↓ ↓ SYSmark* for Windows NT* 700 ↓ ↓ 600

500

400

300 SYSmark result NT*

200

100 703 690 665 602 529 449

0 600 5/300* 600 5/266* (PPro 200 MHz) GLZ1T*-modified DEC AlphaStation DEC AlphaStation GLZ1T* (PPro 180) (PPro GLZ1T* Intergraph TDZ300 Intergraph Intergraph TDZ300 Intergraph TD-400* HP Vectra*HP 6/200 XU (dual PPro 200 MHz) (dual PPro 200 MHz)

Results for all systems obtained from BAPCo website - June 1996.

Figure 5. SYSmark* for Windows NT* Benchmark

LINPACK* 100x100 Benchmark LINPACK* is a linear equation solver written in FORTRAN. The LINPACK program consists of floating-point addition and multiplication of matrices. 100x100 LINPACK solves a 100x100 matrix of simultaneous linear equations. Source code changes are not allowed so that the results may be used to evaluate the compiler’s ability to optimize for the target system. The LINPACK benchmark measures the execution rate in MFLOPS (millions of floating-point operations per second). When running, the benchmark depends on memory-bandwidth and gives little weight to I/O. Therefore, when LINPACK data fits into system cache, performance may be higher. Figure 6 shows the LINPACK performance of two Pentium Pro processor-based systems and other non-Intel processor-based systems.

14 Pentium® Pro Processor Workstation Performance Brief

160.00

140.00 LINPACK* 100x100 double precision 120.00

100.00

80.00 ↓ Mflops ↓ 60.00

40.00

15.10 20.00

0.00 156.30 153.00 140.00 133.60 67.00 62.00 53.30 27.80 22.40 200/25S* 300/3CT* 43P-133* Intergraph IBM RISC IBM IBM RISC IBM IBM RISC IBM Intergraph TDZ300/180* TDZ300/200* System/6000 System/6000 System/6000 255/300* 500/333* C160* 9000 HP 600 5/333* 600 IBM RISC IBM Digital AlphaStation Digital Digital AlphaStation Digital AlphaStation Digital System/6000 42W/T*

Results for non-Intel processors obtained from each of the manufacturer’s websites or Dr. Jack Dongarra’s report - May through June 1996.

Figure 6. LINPACK* MFLOPS Performance

MicroStation* 95 Performance Benchmark This benchmark replaces CADbench* which was a fully-automated benchmark that ran with MicroStation*. The new benchmark contains 10 tests, which use nine files. The basic operation consists of copying each design file to a temporary directory before doing the manipulations, recording the times for each test, and then deleting all but one newly-created design file. The copying of files is not included in the times that are reported. The 10 tests are grouped into three sets of tests to better test certain aspects of the system. Within Group 1, three tests form the 2D heavy video test. Results yield very good representations of the power of the processor and video subsystem. Within Group 2 are two tests that add the disk I/O subsystem to the test results. Group 3 switches to 3D, and the processor must be good at floating-point calculations. For this test MicroStation 95 Performance Benchmark was run seven times for each different configuration with a warm boot in between each run. The warm boot was used to reset all operating system “states” to the same value prior to each run. Results were then averaged and the standard deviation calculated. When possible, the exact same combination of drives loaded, memory and its use, MicroStation setup and resolution of monitors was used. The benchmark was run on Pentium Pro processor-based systems and DEC* Alpha-based systems. The chart shown in this report contains only those systems containing the Intel 200 MHz Pentium Pro processor using 64 MB of memory compared with systems using the DEC Alpha processor running at both 300 MHz and 333 MHz, also with 64 MB of memory. The actual benchmark data in the article references other systems as well, but these were chosen to represent the highest available performance available for the Pentium Pro processor

15 Pentium® Pro Processor Workstation Performance Brief

and the consistent 64 MB of memory across the platforms. One DEC Alpha system used 128 MB of memory. This particular system was not chosen since the amount of memory can affect the time of the result. The Symbion Pro* and the Xi Pro* used the Intel Pentium Pro processor.

Figure 7 shows the results of the MicroStation 95 benchmark from MicroStation Manager magazine comparing Pentium Pro processor-based systems with non-Intel processor-based systems.

140 ↓ ↓ MicroStation* 95 Performance Benchmark 120

100

80

60 Time in Seconds 40

20 104.6 107.4 127.91 125.61 118.58 0 Pro 200 MHz 200 Pro MHz 200 Pro ® ® Symbion164* AXP Symbion Pro* 200 Pro* Symbion Predator AXP 4-300* DECMHz Alpha* 300 DECMHz Alpha* 300 DEC Alpha*DEC MHz 366 DeskStation Raptor3* DeskStation Xi Pro* MiniTower SP* MiniTower Pro* Xi Pentium Pentium

Results from MicroStation* Manager Magazine, May 1996

Figure 7. MicroStation* 95 Performance Benchmark.

MSC/NASTRAN* MSC/NASTRAN*, the principal product of the MacNeal Schwendler Corporation, is the industry's leading Finite Element Analysis (FEA) program. MSC/NASTRAN offers a wide variety of analysis types, including linear statics, normal modes, buckling, heat transfer, frequency response, transient response, random response, response spectrum analysis, and aeroelasticity. Virtually any material type can be modeled, including composites and hyper elastic materials.

16 Pentium® Pro Processor Workstation Performance Brief

Sparse matrix numerical methods, incorporated in every analysis type, greatly increase the solution speed and reduce the amount of disk space, making the numerical processing fast and efficient.

The following benchmarks from the MSC/NASTRAN suite demonstrate that the NT version of the code, running on the Intel Pentium Pro processor, provides the type of performance that users have come to expect from technical workstations.

MSC/NASTRAN Version 68.2 has been ported to the Windows NT environment running on Intel platforms. This is a released product and the results presented show that the Intel platforms, running Windows NT, represent a good choice for MSC/NASTRAN.

POWER TRAIN VIBRATIONAL MODEL A vibration analysis of a power train model with approximately 60,000 degrees of freedom has been run on a variety of platforms as a comparative reference. In order to find the natural frequencies in the range 0 to 100 Hz, MSC/NASTRAN performed two decompositions and 18 solves, and extracted 18 modes. Figure 8 shows the MSC/NASTRAN performance of the Pentium Pro processor-based systems on what might be considered a typical workstation configuration with 128 MB of memory or less. Figure 9 shows the MSC/NASTRAN performance on larger memory configurations. These include a Cray supercomputer and workstations configured with at least 256 MB of memory. This chart is included for reference only to represent configurations that are of definitely higher cost than those focused on in this performance brief.

MSC/NASTRAN* Power Train Vibrational Model < 256MB 2500

2000 ↓ ↓ ↓

1500 I/O 1338.1 969.3 1050.8 1012.0 1000 CPU Time in Seconds in Time

500 639.2 735.7 828.0 1098.9 0 Intergraph DEC Celebris NeTpower IBM TDZ400* XL6200* Calisto* RS6000/375*

Results for non-Intel processors provided by the MacNeal Schwendler Corporation.

Figure 8. MSC/NASTRAN* Power Train Vibrational Model Performance

17 Pentium® Pro Processor Workstation Performance Brief

MSC/NASTRAN* Power Train Vibrational Model >= 256MB 1200.00 120.00 1000.00 ↓ ↓ 119.5 146.8 800.00 I/O

600.00 184.00 CPU

400.00 Time in Seconds in Time

200.00 360.00 913.00 641.2 733.5

0.00 CRAY* C90 HP Intergraph DEC 9000/819* TDZ400* Celebris XL6200*

Results for non-Intel processors provided by the MacNeal Schwendler Corporation.

Figure 9. MSC/NASTRAN* Power Train Vibrational Model Performance

BCELL* The BCELL* benchmark has been used for many past versions of MSC/NASTRAN. This benchmark represents the overall CPU time in seconds for the calculations involved. The model in the BCELL benchmark is a cube of solids with a plate on each surface of each solid. The tests run represent ~10000 and ~22000 degrees of freedom. These models were chosen to be an upper bound of resource requirements. All runs were done by MSC at their facility except the result for the DEC Celebris XL6200* system. This run was done in the Workstation Technology Lab at Intel. Figures 10 and 11 show the performance of Intel Pentium Pro processor-based systems compared with non-Intel processor-based systems on the BCELL benchmark using ~10000 and ~22000 degrees of freedom.

18 Pentium® Pro Processor Workstation Performance Brief

450

400 BCELL* Degrees of Freedom ~10000 350

300

250 200 ↓

CP Time (secs) Time CP 150

100

50 90.00 166.60 180.00 180.00 190.00 200.00 205.00 210.00 210.00 250.00 300.00 310.00 400.00 420.00 0 EV4 980/98B* SGI R4400* SGI SGI R4000* SGI EV4 SGI R8000* SGI XL6200* IBM RS6000 IBM HP 9000/810* HP 9000/770* HP 9000/735* HP DEC Celebris Alpha 3000/500* Alpha Alpha* 3000/800 Alpha* IBM RS6000 580* RS6000 IBM Sun Ultra Sparc* Ultra Sun Alpha* 2100 4/275 2100 Alpha* IBM RS6000 590* RS6000 IBM

Results for non-Intel processors provided by the MacNeal Schwendler Corporation.

Figure 10. MSC/NASTRAN* BCELL* Model Performance ~10000 Degrees of Freedom

19 Pentium® Pro Processor Workstation Performance Brief

450

400 BCELL* Degrees of Freedom ~10000 350

300

250 200 ↓

CP Time (secs) Time CP 150

100

50 90.00 166.60 180.00 180.00 190.00 200.00 205.00 210.00 210.00 250.00 300.00 310.00 400.00 420.00 0 980/98B* XL6200* EV4 EV4 SGI R8000* SGI R4400* SGI R4000* IBM RS6000 HP 9000/810* HP 9000/770* HP 9000/735* DEC CelebrisDEC Alpha* 3000/800 Alpha 3000/500* Sun Ultra Sparc* IBM RS6000IBM 590* IBM RS6000IBM 580* Alpha* 2100 4/275 Alpha* 2100

Results for non-Intel processors provided by the MacNeal Schwendler Corporation.

Figure 11. MSC/NASTRAN* BCELL* Model Performance ~22000 Degrees of Freedom

ANSYS* ANSYS* is one of the most widely used Finite Element Analysis packages in the world today. From nonlinear and linear, structural, thermal, piezoelectrics and acoustics analysis to optional modules for electromagnetic field and computational fluid dynamics studies, the ANSYS program provides the tools to meet almost any engineering need. ANSYS has designed a set of benchmark programs, known as the LS benchmarks, which have data that demonstrates how well large ANSYS problems perform on computer systems. The data contains five static analysis examples using the ANSYS frontal solver. These examples were deliberately set with a large wavefront to stress the computer systems. The analysis examples represent a cantilevered plate with one element through the thickness. The mesh size is varied for the different cases. A force loading applied to the plate at the free end. The ANSYS 3D solid isoparametric element, SOLID45, is used in these static examples.

Disk No. of No. of Active Degrees Maximum RMS Requirements Name Elements Nodes of Freedom Wavefront Wavefront (MB) LS1 1,000 2,222 6,060 618 573.3 100 LS2 2,000 4,422 12,060 1,218 1,131.9 200 LS3 3,000 6,622 18,060 1,818 1,690.5 400 LS4 4,000 8,822 24,060 2,418 2,249.0 600 LS5 6,000 12,642 36,120 1,818 1,754.1 700

20 Pentium® Pro Processor Workstation Performance Brief

Following are the results obtained on four commercially available Pentium Pro processor workstations. These machines come with a variety of memory architectures, disk drives and number of processors. All reported Intel results are from ANSYS Version 5.3, released on July 10, 1996.

CP Time LS1 LS2 LS3 LS4 LS5

Calisto* 40.749 257.941 865.795 2035.537 1767.101

Celebris* 38.075 235.208 900.945 1975.661 1570.158

TDZ-400* 34.141 207.609 709.391 1512.125 1403.813

Vectra* 38.906 247.625 861.25 1866.125 1659.672

2500

CP Time

2000

1500 Calisto* Celebris* Intergraph TDZ-400*

CP Seconds 1000 Vectra*

500

0 LS1 LS2 LS3 LS4 LS5

Figure 12. ANSYS* LS Performance

Comparative results of elapsed time are less reliable. The current routine for reporting elapsed time in ANSYS reports in even seconds. Further, elapsed time numbers, because of the reliance of ANSYS on reading and writing scratch files, depend upon the structure and fullness of the file system plus other system and network activity. In fact, consecutive runs on the same machine can show deviations in the elapsed time of several percent.

Run Time LS1 LS2 LS3 LS4 LS5 Calisto* 50 303 987 2267 2119 Celebris* 46 284 1022 2205 1877 TDZ-400* 41 219 832 1814 1790 Vectra* 46 301 988 2113 1983

21 Pentium® Pro Processor Workstation Performance Brief

2500 Elapsed Time

2000

Calisto* 1500 Celebris*

Intergraph TDZ-400*

Time (secs) 1000 Vectra*

500

0 LS1 LS2 LS3 LS4 LS5

Figure 13. ANSYS* Performance

Following is a chart of ANSYS performance compared with several other available workstations. The Intel processor-based machine was an Intergraph TDZ400* 200 MHz Pentium Pro processor-based system with 256 MB of memory. The results are shown in the chart compared to results for other non-Intel processor-based systems running Release 5.2. ANSYS has not published numbers for Release 5.3 on the large scale benchmark. These numbers are for reference only.

22 Pentium® Pro Processor Workstation Performance Brief

3500 ANSYS* Large Scale Benchmarks 3000

2500 ↓ LS1

2000 LS2 LS3 1500 LS4

CP Time (secs) Time CP LS5 1000

500

0 128MB 128 MB 128 128MB HP 712/100* HP HP J210 XC* J210 HP 128 MB 192 MB HP 715/100 XC* 715/100 HP 192 MB 192 256 MB R8000* - 256 MB 256 - R8000* 200MHz - 256 MB 200MHz SGI Power Indigo2 Intergraph TDZ-400* Sun UltraSPARC 170* UltraSPARC Sun Alpha* Station 500/333 Station Alpha* Alpha* Station 600 5/333 Station Alpha* Alpha* Server 2100 4/275 2100 Server Alpha*

ANSYS* Version 5.2 results for non-Intel based systems obtained from ANSYS* website dated June 14, 1996.

Figure 14. ANSYS* 5.3/Intergraph & ANSYS* 5.2/non-Intel systems

Parametric Technology Corporation – Pro/ENGINEER* Parametric Technology Corporation’s Pro/ENGINEER* is a suite of integrated software products that automate the mechanical design-through-manufacturing process. This unique, fully-associative suite of mechanical design automation software includes application-specific products that address the complete spectrum of product-development activities. Within the Pro/ENGINEER family, more than 50 modules address every aspect of product development: design, drafting, manufacturing. In addition, specialized tools meet your requirements for advanced surfacing, assembly management, design optimization, cable and pipe routing, die design, data management, and more. Since all applications are fully associative, engineering teams work in parallel to develop product designs and their manufacturing processes concurrently. The Systems Group at Texas Instruments has developed a benchmark for Parametric Technology Corporation’s Pro/ENGINEER. This benchmark has been made publicly available via Pro/E: The Magazine. The results of their benchmarking has been published in three issues to date, most recently in the March/April 1996 edition. The results published were obtained using Release 15 of Pro/ENGINEER. One machine was tested with a beta version of Release 16. The Texas Instruments group developed the Pro/ENGINEER benchmark to test overall system performance on complex assembly models and assembly drawings. Texas Instruments engineers modeled real parts and assemblies to create an accurate manufacturing design work environment. The models represented a wide range of sizes and complexities, while the benchmark tested many of the advanced capabilities of Pro/ENGINEER such as holes, chamfers, rounds, and attachment hardware. The goal was not to stress a single task, but to test a cross section of tasks and operations that make up a true work environment. As a result, the benchmark measured 59 tasks including those only performed occasionally such as plotting

23 Pentium® Pro Processor Workstation Performance Brief

PostScript* files and creating exploded views, as well as more frequently executed tasks such as model spinning, panning or zooming, and rotation. The benchmark’s developers determined the mix of operations as well as the relative weighting of those operations by observing the engineers’ usage patterns over a period of several weeks. Evaluation of output files known as “trail files” led to the determination of the weighting factors. These trail files conta in the actual commands chosen during the time a design engineer is using Pro/ENGINEER.

The test exercises the graphics capabilities as well as the computational capabilities of the platform being tested. It also contains some occasionally performed tasks, such as generating shaded color PostScript files and measuring mass properties. Weighting factors were determined to represent how often a user might use these commands in a single eight-hour day. Adequate memory was used in each machine to ensure there was enough RAM to avoid excessive paging out to disk. The RAM requirement for this benchmark is 192 MB.

The results presented here are for Release 16 of Pro/ENGINEER. The SGI result is for a beta release of Release 16 while the NeTpower result is for a production release of Pro/ENGINEER Release 16. Release 17 of Pro/ENGINEER is scheduled for deployment later in 1996 and preliminary results for Pentium Pro processor-based platforms running Windows NT are demonstrating improved performance. In the Texas Instruments Pro/E benchmark a lower weighted score is better.

24 Pentium® Pro Processor Workstation Performance Brief

Pro/Engineer* rel 16 200

180 ↓

160

140

120

100

80

Weighted Total Time Total Weighted 60 191

40 173

20

0

SGI Indigo2* NeTpower Solid Calisto* IMPACT 200MHz Pentium Pro

SGI result from Pro/E: The Magazine January/February 1996. NeTpower result generated by NeTpower April 96.

Figure 15. PTC Pro/ENGINEER* Performance

HDL Simulators A benchmark report from DA Solutions provides detailed comparisons of execution speeds and memory requirements for a variety of simulators. The intent of the benchmark, which focuses on performance rather than ease-of-use issues, is to help users choose simulators while minimizing the amount of evaluation needed. The benchmark exercise included VHDL simulators from Cadence, Ikos, Mentor Graphics, Synopsis and Veda Design Automation, along with Verilog simulators from Fintronic, Mentor Graphics and Intergraph. It encompassed some 19 benchmark circuits ranging up to 3200 gates at the gate level and 48,000 lines of code at the register-transfer level (RTL). This benchmark exercise was carried out over a nine-month period. While independent evaluators from DA Solutions ran all the tests, vendors provided information about the best options and switches to use for various circuits, and vendor representatives monitored some of the tests. In the following charts the lower the score the better the performance.

25 Pentium® Pro Processor Workstation Performance Brief

60

Raynet-gate & TI-Mult

50 TI-Mult with unit timing 40

Raynet-gate

30

TotalElapsed Time 20 ↓ 20 10 60 28 15 7 4 6 22 0 Sparc 10* UltraSparc 167* Pentium® Pro 200 Alpha* 333

Benchmark results available from DA Solutions Limited - 10 Gale Moor Avenue - Gosport PO12 2SJ, UK. Results appearing here from EETimes, June 3rd 1996 edition.

Figure 16. Raynet-gate and TI-Mult

26 Pentium® Pro Processor Workstation Performance Brief

300

CPU-Mix & DMult

250 Dmult with unit timing 200 CPU-Mix with unit timing 150

Total Elapsed Time 100 ↓

50 52 78 53 259 34 32 27 118 0 Sparc 10* UltraSparc 167* Pentium® Pro 200 Alpha* 333

Benchmark results available from DA Solutions Limited - 10 Gale Moor Avenue - Gosport PO12 2SJ, UK. Results appearing here from EETimes, June 3rd 1996 edition.

Figure 17. CPU-Mix and Dmult

GRAPHICS PERFORMANCE SUMMARY

Viewperf* Viewperf* is a benchmark which measures the 3D rendering performance of systems running under OpenGL* which was written and maintained by the OpenGL Performance Characterization (OPC) group. The OPC organization began in 1993 as an ad-hoc project group aimed at establishing graphics performance benchmarks for systems running under the OpenGL application programming interface (API). The group joined the GPC committee in the summer of 1994. Viewperf measures a mixture of graphics primitives, drawn with different mixtures of primitive attributes. Results of independent tests are then combined into a weighted average to create a composite score for the test. Because of this, it is almost impossible to project a Viewperf score for an accelerator, it must be measured. Improving any one aspect of performance for a card will cause the score to increase. A doubling in performance of triangles won’t necessarily translate to a doubling in performance for Viewperf scores. Adding advanced features to an accelerator like anti-aliasing or hardware support for texture acceleration, on the other hand, could cause a score to double even if the shaded triangle rate is held constant. Since applications actually use these advanced features, this still represents a growth in performance for the application so the increase in the Viewperf metric is a correct indicator. The OPC project group has worked with Independent Software Vendors (ISVs) to obtain tests, data sets and weights that constitute what is called a Viewset. Each Viewset represents the

27 Pentium® Pro Processor Workstation Performance Brief graphics rendering portion of an actual application. Currently, Viewperf5* comprises five standard OPC viewsets: 1. Parametric Technology's CDRS*, which contains seven different Viewperf tests, is a modeling and rendering application for computer-aided industrial design. 2. IBM's Data Explorer* (DX), which has 10 different tests, is a visualization application. 3. Intergraph's Design Review*, which has 10 different tests, is a 3D computer model review package. 4. Alias/Wavefront’s Advanced Visualizer*, with 10 tests, is an animation application. 5. Lightscape Technology’s Lightscape Visualization System*, with 4 tests, is a radiosity visualization application.

All five Viewsets represent relatively high-end applications. These types of applications typically render large data sets. They almost always include lighting, smooth shading, blending, line anti-aliasing, z-buffering, and some texture mapping.

The ISVs that develop OPC Viewsets have provided percentage weights for each test for which a performance number is reported. ISVs have defined these percentages to indicate the relative importance of a test within the overall application. Viewperf offers the following characteristics: • It provides a single-source code for apples-to-apples comparison and performance tuning across different hardware platforms. • It runs on multiple operating systems, including OS/2, UNIX and Windows NT. • It runs across different processors, including Alpha, Intel, MIPS and PowerPC. • It runs on multiple windowing environments, including Presentation Manager*, Windows X* and Windows. • It encompasses a wide variety of OpenGL features and rendering techniques. • It is easily accessible through the OPC project subcommittee, ftp and through OpenGL sample disk distribution. Several factors make Viewperf unique from other benchmarks: • It uses datasets that are designed for and used by real applications. • It uses rendering parameters and models selected by ISVs and graphics users. • It produces numbers based on frames per second, a measurement with which users can readily identify. • It provides one number for each rendering path using one data set. Viewperf measures performance for the following entities: • 3D primitives, including points, lines, line strip, line loop, triangles, triangle strip, triangle fan, quads and polygons. • Attributes per vertex, per primitive and per frame. • Lighting • Texture mapping • Alpha blending • Fogging • Anti-aliasing • Depth buffering

28 Pentium® Pro Processor Workstation Performance Brief

Viewperf CDRS* Viewset CDRS is Parametric Technology's modeling and rendering software for Computer-Aided Industrial Design (CAID). It is used to create concept models of automobile exteriors and interiors, other vehicles, consumer electronics, appliances and other products that have challenging free-form shapes. The users of CDRS are typically creative designers with job titles such as automotive designer, products designer or industrial designer. There are seven tests specified that represent different types of operations performed within CDRS. Five of the tests use a triangle strip data set from a lawnmower model created using CDRS. The other two tests show the representation of the lawnmower as a wireframe. The tests are weighted and show the following CDRS functionality:

Table 1. Summary of CDRS* Tests1 Test Weight CDRS Functionality Represented 1 50% Vectors used in designing the model. Represents most of the design work done in CDRS. Anti-aliasing turned on to allow the designer to see a cleaner version of the model. 2 20% Surfaces shown as polygons, but all with a single surface color. 3 15% Surfaces grouped with different colors per group. 4 8% Textures added to groups of polygons. 5 5% Texture used to evaluate surface quality. 6 2% Color added per vertex to show the curvature of the surface. 7 0% Same as Test Number 1, but without the anti-aliasing. Note 1. All results obtained from SPEC/GPC/OPC web page containing results for Viewperf5. Results obtained on or before August 1996.

29 Pentium® Pro Processor Workstation Performance Brief

35.00

30.00 ↓ CDRS-03* Viewset

25.00

20.00

15.00 ↓ ↓ ↓ ↓ Mean Composite Score ↓ ↓ 10.00 ↓ ↓ 5.00 ↓ ↓ 32.09 27.88 23.52 23.47 23.24 13.91 13.11 12.05 11.58 11.03 9.90 9.48 7.90 4.28 4.26 4.20 4.05 3.61 3.57 3.21 0.00 Pro Pro ® ® Pro 200 MHz 200 Pro MHz 200 Pro MHz 200 Pro MHz 200 Pro ® ® ® ® Solid IMPACT* RealiZm Z13-T* RealiZm High IMPACT*High RealiZm Z13-GT* RealiZm Oxygen 102* 200 MHzGLZ1 200 MHz GLZ1T* MHz 200 POWER GXT500* POWER SGI Indigo2* 200 MHz R4400 MHz 200 Indigo2* SGI Intergraph TDZ310 200 MHz* 200 TDZ310 Intergraph SGI Indigo2* 200 MHz R4400 MHz 200 Indigo2* SGI Intergraph TDZ310* 200 MHz 200 TDZ310* Intergraph Intergraph Intense 3D-T* Intense Intergraph Dell Dimension XPS* 200n DPI 200n XPS* Dimension Dell Matrox Millenium* 4MB (8 bit) (8 4MB Millenium* Matrox Matrox Millenium* 4MB (16 bit) (16 4MB Millenium* Matrox bit) (24 4MB Millenium* Matrox 100 MHz POWER GXT250P* POWER MHz 100 GXT150P* POWER MHz 100 133 MHz POWER GXT250P* POWER MHz 133 GXT150P* POWER MHz 133 Intergraph TDZ300 Pentium TDZ300 Intergraph POWER GXT1000* Model 002 Model GXT1000* POWER Intergraph TDZ300 Pentium TDZ300 Intergraph IBM PC360 Pentium PC360 IBM Pentium PC360 IBM IBM PC360 Pentium PC360 IBM IBM PC360 Pentium PC360 IBM IBM RISC System/6000 Model 42T* Model System/6000 RISC IBM IBM RISC System/6000 Model 43P* Model System/6000 RISC IBM 43P* Model System/6000 RISC IBM IBM RISC System/6000 Model 43P* Model System/6000 RISC IBM 43P* Model System/6000 RISC IBM IBM RISC System/6000 Model 3BT* Model System/6000 RISC IBM IBM RISC System/6000 Model 43P* Model System/6000 RISC IBM AccelGraphics Pro/T* (200 MHz) (HP) MHz) (200 Pro/T* AccelGraphics Dell Dimension XPS* 200n DPI V192* DPI 200n XPS* Dimension Dell 133 MHz POWER GXT1000* Model 002 Model GXT1000* POWER MHz 133 All results obtained from SPEC/OPC web page containing results for Viewperf5*. Results obtained August 1996. Those not appearing have been submitted for inclusion.

Figure 18. CDRS* Composite Scores

Viewperf Design Review* Viewset

Design Review* is a 3D computer model review package specifically tailored for plant design models consisting of piping, equipment and structural elements such as I-beams, HVAC ducting, and electrical raceways. It allows flexible viewing and manipulation of the model for helping the design team visually track progress, identify interference, locate components, and facilitate project approvals by presenting clear presentations that technical and non-technical audiences can understand.

On the construction site, Design Review can display construction status and sequencing through vivid graphics that complement blueprints. After construction is complete, Design Review continues as a valuable tool for planning retrofits and maintenance. Design Review is a multi-threaded application that is available for both UNIX and Windows NT.

The model in this Viewset is a subset of the 3D plant model made for the GYDA offshore oil production platform located in the North Sea on the southwest coast of Norway.

A special thanks goes to British Petroleum, which has given the OPC subcommittee permission to use the geometric data as sample data for this Viewset. Use of this data is restricted to this Viewset.

30 Pentium® Pro Processor Workstation Performance Brief

Design Review works from a memory-resident representation of the model that is composed of high-order objects such as pipes, elbows valves, and I-beams. During a plant walk-through, each view is rendered by transforming these high-order objects to triangle strips or line strips. Tolerancing of each object is done dynamically and only triangles that are front facing are generated. This is apparent in the Viewset model as it is rotated.

Most Design Review models are greater than 50 MB and are stored as high-order objects. For this reason, and for the benefit of dynamic tolerancing and face culling, display lists are not used.

There are 10 tests specified by the Viewset that represent the most common operations performed by Design Review. These tests are as follows:

Table 2. Summary of Design Review* Tests1 Test Weight DRV Functionality Represented 1 38% Walk-through rendering of curved surfaces. Each curved object is rendered as a triangle mesh, depth-buffered, smooth-shaded, with one light and a different color per object 2 30% Walk-through rendering of flat surfaces. This is treated as a different test than Number 1 because normals are sent per facet and a flat shade model is used 3 8% Walk-throughs require many objects to be clipped. This is the same as Number 1 excep clipping is forced. 4 5% Again, clipping of test Number 2. 5 5% To easily spot rendered objects within a complex model, the objects to be identified are rendered as solid and the rest of the view is rendered as wireframe (line strips). The line strips are depth-buffered, flat-shaded, and unlit. Colors are sent per primitive 6 5% As an additional way to help visual identification and location of objects, the model may have "screen door" transparency applied. This adds polygon stippling to test Number 2 7 4% Two other views are present on the screen to help the user select a model orientation These views display the position and orientation of the viewer. A wireframe, orthographic projection of the model is used. Depth buffering is not used, so multithreading cannot be used in order to preserve draw order. 8 3% For more realism, objects in the model can be textured. Decal texturing with linea blending and mipmaps is used. 9 1% Same as test Number 5, except clipping is forced. 10 1% Same as test Number 8, except clipping is forced. Note 1. All results obtained from SPEC/GPC/OPC web page containing results for Viewperf5. Results obtained on or before August 1996.

31 Pentium® Pro Processor Workstation Performance Brief

6.00 ↓ 5.00 ↓ ↓ DRV-04 Viewset 4.00 ↓

3.00 ↓ ↓

Mean CompositeMean Score 2.00 ↓ 1.00 ↓ ↓ ↓ ↓

0.00 5.14 4.52 4.41 4.13 3.34 3.32 3.30 3.29 2.85 2.56 2.39 1.35 0.93 0.85 0.71 0.70 0.68 0.65 0.63 0.59 0.45 Pro ® 200 MHz GLZ1T* GXT250P* GXT150P* GXT250P* Intergraph TDZ400* GXT150P* Intense 3D-T* Intense 200n DPI V192* Solid IMPACT* Solid Solid IMPACT* MHz RealiZm Z13-T* MHz RealiZm Z13-GT* Model 3BT* POWER 3BT* Model MHz RealiZm Z13-T* R4400 High IMPACT* SGI Indigo2* 200 MHz GXT1000*002 Model 200MHz Intergraph 200MHz MHz RealiZm Z25-GT* Intergraph TDZ410*200 Intergraph TDZ310* 200 Intergraph TDZ310* 200 IBMSystem/6000 RISC Intergraph TDZ410* 200 200n DPI Oxygen 102* GXT1000* Model 002 Dell Dimension XPS Pro Millenium* 4MB (8 bit) IBM RISC System/6000 RISC IBM IBM RISC System/6000 RISC IBM IBM PC360 Pentium PC360 IBM Millenium* 4MB (16 bit) Dell Dimension XPS* Pro IBM RISC System/6000 IBM RISC System/6000 IBM RISC System/6000 Millenium*4MBbit) (24 IBM RISC System/6000 RISC IBM SGI Indigo2SGI IMPACT 10000* Model42T* POWERGXT500* IBM PC360200 MHz Matrox IBM PC360200 MHz Matrox Model 43P* 100MHz POWER IBM PC360200 MHz Matrox Model 43P* 100MHz POWER SGI Indigo2* 200 MHz R4400 Model 43P* 133MHzPOWER Model 43P* 133MHzPOWER Model 43P* 133MHzPOWER

All results obtained from SPEC/OPC web page containing results for Viewperf5*. Results obtained August 1996. Those not appearing have been submitted for inclusion.

Figure 19. Design Review* Composite Scores

Viewperf Data Explorer* Viewset

The IBM Visualization Data Explorer* (DX) is a general-purpose software package for scientific data visualization and analysis. It employs a data-flow driven client-server execution model and is currently available on UNIX workstations from Silicon Graphics, IBM, Sun, Hewlett-Packard and Digital Equipment. The OpenGL port of Data Explorer was completed with the recent release of DX 2.1. The tests visualize a set of particle traces through a vector flow field. The width of each tube represents the magnitude of the velocity vector at that location. Data such as this might result from simulations of fluid flow through a constriction. The object represented contains about 1,000 triangle meshes containing approximately 100 vertices each. This is a medium-sized data set for DX. All tests assume z-buffering with one light in addition to specification of a color at every vertex. Triangle meshes are the primary primitives for this Viewset. While Data Explorer allows for many other modes of interaction, these assumptions cover the majority of user interaction.

32 Pentium® Pro Processor Workstation Performance Brief

Table 3. Summary of Data Explorer* Tests1 Test Weight DX Functionality Represented 1 40% TMESH's immediate mode 2 20% LINE's immediate mode 3 10% TMESH's display listed 4 8% POINT's immediate mode 5 5% LINE's display listed 6 5% TMESH's list with facet normals 7 5% TMESH's with polygon stippling 8 2.5% TMESH's with two sided lighting 9 2.5% TMESH's clipped 10 2% POINT's direct rendering display listed Note 1. All results obtained from SPEC/GPC/OPC web page containing results for Viewperf5*. Results obtained on or before August 1996.

8.00

7.00

DX-03 Viewset 6.00 ↓

5.00

4.00 ↓ ↓ ↓ 3.00

Mean Composite Score ↓ ↓

2.00 ↓ ↓ ↓ 1.00

0.00 7.33 7.32 6.49 5.88 5.75 3.24 3.15 3.12 2.88 2.44 2.37 1.49 1.27 1.26 1.22 1.15 0.90 0.73 Pro ® 4MB * Pro 200 Pro Pro 200 Pro200 ® ® ® (24 bit) (8 bit) (8 (16 bit) 200 MHz GLZ1* MHz 200 Intergraph TDZ300* GXT150P* GXT150P* GXT250P* GXT250P* Intense 3D-T* Intense RealiZm Z13* RealiZm High IMPACT* Solid IMPACT* 200n DPI V192* Model 3BT* POWER 3BT* Model GXT1000* Model 002 RealiZm Z13-G*RealiZm 200 MHz Intergraph MHz 200 IBM RISC System/6000 GXT1000*002 Model 200n DPI OxygenDPI 102* 200n IBM RISC System/6000 IBM PC360 PentiumIBM PC360 IBM RISC System/6000 Dell Dimension XPS* Pro XPS* Dimension Dell Pro XPS* Dimension Dell IBM RISC System/6000 IBM System/6000 RISC IBM System/6000 RISC IBM System/6000 RISC MHz Matrox Millenium* 4MB Millenium* Matrox MHz MHz Matrox Millenium* 4MB Millenium* Matrox MHz Model 42T* POWER GXT500* POWER 42T* Model SGI Indigo2* 200 MHz R4400 200 MHz Indigo2* SGI Intergraph TDZ310*Intergraph 200 MHz SGI Indigo2* 200 MHz R4400 MHz Matrox Millenium Matrox MHz Model 43P* 133 MHz POWER MHz 133 43P* Model Model 43P* 100 MHz POWER MHz 100 43P* Model IBM PC360 Pentium IBM Model 43P* 133 MHz POWER MHz 133 43P* Model Model 43P* 100 MHz POWER MHz 100 43P* Model Model 43P* 133 MHz POWER MHz 133 43P* Model IBM PC360 Pentium IBM PC360 Pentium Intergraph TDZ310* 200 MHz

All results obtained from SPEC/OPC web page containing results for Viewperf5*. Results obtained August 1996. Those not appearing have been submitted for inclusion.

Figure 20. Data Explorer* Composite Scores

33 Pentium® Pro Processor Workstation Performance Brief

Advanced Visualizer* Viewset (AWadvs-01) Advanced Visualizer* from Alias/Wavefront is an integrated workstation-based 3D animation system that offers a comprehensive set of tools for 3D modeling, animation, rendering, image composition, and video output.

The Advanced Visualizer provides: • Geometric, analysis, and motion data importation from a wide range of CAD, dynamics and structural systems. • Automatic object simplification and switching tools for working with ultra-large production data sets. • Motion for an unlimited number of object, cameras, and lights with Wavefront SmartCurve* editing technique. • Interactive test rendering and high-quality, free-form surface rendering. • Software rotoscoping for matching computer animation with live action background footage. • Realistic imaging effects, including soft shadows, reflection, refraction, textures, and displacement maps, using Wavefront’s fast hybrid scanline/raytracing renderer. • Interactive image layering and output to video with Wavefront’s new Recording Composer*. • Powerful scripting and customization tools, open file formats, and user-defined interfaces. All operations within Advanced Visualizer are performed in immediate mode with double- buffered windows. There are four basic modes of operation within Advanced Visualizer: 1. 55% material shading (textured, z-buffered, backface-culled, 2 local lights) − 95% perspective, 80% trilinear mipmapped, modulated (41.8%) − 95% perspective, 20% nearest, modulated (10.45%) − 5% ortho, 80% trilinear mipmapped, modulated (2.2%) − 5% ortho, 20% nearest, modulated (0.55%) 2. 30% wireframe (no z-buffering, no lighting) − 95% perspective (28.5%) − 5% ortho (1.5%) 3. 10% smooth shading (z-buffered, backface-culled, 2 local lights) − 95% perspective (9.5%) − 5% ortho (0.5%) 4. 5% flat shading (z-buffered, backface-culled, 2 local lights) − 95% perspective (4.75%) − 5% ortho (0.25%)

34 Pentium® Pro Processor Workstation Performance Brief

These are the 10 tests specified by the Viewset that represent the most common operations performed by Advanced Visualizer. The tests are as follows:

Table 4. Summary of Advanced Visualizer* Tests1 Test Weight Advanced Visualizer* Functionality Represented 1 41.8% Material shading of polygonal animation model with highest interactive image fidelity and perspective projection 2 28.5% Wireframe rendering of polygonal animation model with perspective projection 3 10.45% Material shading of polygonal animation model with lowest interactive image fidelity and perspective projection 4 9.5% Smooth shading of polygonal animation model with perspective projection 5 4.75% Flat shading of polygonal animation model with perspective projection 6 2.2% Material shading of polygonal animation model with highest interactive image fidelity and orthogonal projection 7 1.5% Wireframe rendering of polygonal animation model with orthogonal projection 8 .55% Material shading of polygonal animation model with lowest interactive image fidelity and orthogonal projection 9 .5% Smooth shading of polygonal animation model with orthogonal projection 10 .25% Flat shading of polygonal animation model with orthogonal projection Note 1. All results obtained from SPEC/GPC/OPC web page containing results for Viewperf5*. Results obtained on or before August 1996.

12.00

10.00

↓ AWadvs-01 Viewset

8.00

6.00 ↓ ↓ ↓ 4.00 Mean Composite Score Composite Mean ↓

2.00 ↓ ↓ ↓↓

0.00 10.57 8.67 6.34 6.29 4.01 4.01 3.91 2.53 2.47 1.52 1.51 1.45 1.42 Pro ® (8 bit) (8 (24 bit) (16 bit) (16 GXT500* Model 002 Intergraph TDZ300* GLZ1T* (200 MHz) Intense 3D-T* Intense MHz 200 PC360* IBM MHz 200 PC360* IBM 200n DPI V192* DPI 200n Model POWER42T* MHz RealiZm Z13-T* MHz RealiZm IBM PC360* 200 MHz 200 PC360* IBM Model 002 (133 MHz) (133 002 Model 200 MHz Intergraph 4MB Millenium* Matrox 4MB Millenium* Matrox Intergraph TDZ310* 200 IBM RISC System/6000 RISC IBM IMPACT* (200 (200 MHz) IMPACT* IBM RISC System/6000 RISC IBM IBM RISC System/6000 RISC IBM 43P* POWERGXT1000* 3BT* POWER GXT1000* POWER 3BT* Matrox Millenium* 4MB Millenium* Matrox MHz RealiZm RealiZm MHz Z13-GT* Intergraph TDZ310* 200 200n DPI Oxygen 102* Oxygen DPI 200n SGI Indigo2*SGI R4400 High IBM PC360* Pentium PC360* IBM Dell Dimension XPS* Pro XPS* Dimension Dell Pro XPS* Dimension Dell

All results obtained from SPEC/OPC web page containing results for Viewperf5*. Results obtained August 1996. Those not appearing have been submitted for inclusion. Figure 21. AWadvs-01* Composite Numbers

35 Pentium® Pro Processor Workstation Performance Brief

Lightscape* Viewset (Light-01) The Lightscape Visualization System* from Lightscape Technologies, Inc. represents a new generation of computer graphics technology that combines proprietary radiosity algorithms with a physically-based lighting interface.

LIGHTING The most significant feature of Lightscape is its ability to accurately simulate global illumination effects. The system contains two integrated visualization components. The primary component utilizes progressive radiosity techniques and generates view-independent simulations of the diffuse light propagation within an environment. Subtle but significant effects are captured, including indirect illumination, soft shadows, and color bleeding between surfaces. A post process using ray tracing techniques adds specular highlights, reflection, and transparency effects to specific views of the radiosity solution.

INTERACTIVITY

Most rendering programs calculate the shading of surfaces at the time the image is generated. Lightscape’s radiosity component precalculates the diffuse energy distribution in an environment and stores the lighting distribution as part of the 3D model. The resulting lighting “mesh” can then be rapidly displayed. Using OpenGL display routines, Lightscape takes full advantage of the advanced 3D graphics capabilities of Pentium Pro processor-based systems with OpenGL-compliant graphic acceleration boards. Lightscape allows you to interactively move through fully simulated environments.

PROGRESSIVE REFINEMENT Lightscape utilizes a progressive refinement radiosity algorithm that produces useful visual results almost immediately upon processing. The quality of the visualization improves as the process continues. In this way, the user has total control over the quality (vs. time) desired to perform a given task. At any point in the solution process, users can alter the characteristic of a light source or surface material and the system will rapidly compensate and display the new results without the need for restarting the solution. This flexibility and performance allow users to rapidly test various lighting and material combinations to obtain precisely the visual effect desired. There are four tests specified by the Viewset that represent the most common operations performed by the Lightscape Visualization System . The four tests are as follows:

Table 5. Summary of Lightscape* Tests1 Test Weight Lightscape* Functionality Represented 1 25% Walk-through wireframe rendering of “Cornell Box” model using line loops with colors supplied per vertex 2 25% Full-screen walk-through solid rendering of “Cornell Box” model using smooth-shaded z-buffered quads with colors supplied per vertex 3 25% Walk-through wireframe rendering of 750K-quad Parliament Building model using line loops with colors supplied per vertex 4 25% Full-screen walk-through solid rendering of 750K-quad Parliament Building mode using smooth-shaded z-buffered quads with colors supplied per vertex Note 1. All results obtained from SPEC/GPC/OPC web page containing results for Viewperf5*. Results obtained on or before August 1996.

36 Pentium® Pro Processor Workstation Performance Brief

0.90

0.80 Light-01 Viewset 0.70 ↓ 0.60 ↓ 0.50 ↓ 0.40 ↓ 0.30 Mean Composite Score Composite Mean

0.20

0.10 0.87 0.61 0.52 0.50 0.48 0.47 0.32 0.31

0.00 Pro ® (250 MHz) GXT500* (133 MHz) Solid IMPACT* Solid Intense 3D-T* SGI Indigo2* R4400 Intergraph TDZ310* Model 42T* POWER 42T* Model Model 3BT* POWER Model 43P* POWER 43P* Model 200 MHz RealiZm Z13* RealiZm MHz 200 200 MHz Intergraph GXT1000* Model 002 Model GXT1000* GXT1000* Model 002 Model GXT1000* MHz RealiZm Z13-G* RealiZm MHz IBM RISC System/6000 RISC IBM IBM RISC System/6000 RISC IBM IBM RISC System/6000 RISC IBM 200n DPI Oxygen 102* Oxygen DPI 200n Intergraph TDZ310* 200 TDZ310* Intergraph IBM PC360 Pentium Dell Dimension XPS*Dell Dimension Pro

All results obtained from SPEC/OPC web page containing results for Viewperf5*. Results obtained August 1996. Those not appearing have been submitted for inclusion.

Figure 22. Light-01* Composite Scores

SUMMARY

The Pentium Pro processor family offers superior performance for high-end to entry-level workstations. All the Pentium Pro processor systems listed in this performance brief are priced under $25,000. The performance results show that these systems compare very favorably to higher-priced workstations from non-Intel architecture systems. With SPECint95 rating of 8.71 and SPECfp95 rating of 6.68, the 200 MHz Pentium Pro processor is one of the fastest in the industry. For even higher performance, the Pentium Pro processor offers 4-way glueless multiprocessing. Superior performance, coupled with the availability of advanced operating systems and peripherals, makes the Pentium Pro processor an ideal platform for the most demanding workstation applications. In particular, the Pentium Pro processor performs extremely well on CAD, scientific modeling and analysis, and other computationally intensive tasks. With the availability of workstation applications such as MSC/NASTRAN, MicroStation and Pro/ENGINEER on the Intel Architecture, the Pentium Pro processor is the platform of choice for mechanical CAD users. The performance on graphics related benchmarks demonstrates that the Pentium Pro processor provides an excellent cost/performance choice for the digital content creation market as well. The outstanding integer performance and the comprehensive library of tools available also make the Pentium Pro processor an ideal platform for the software development environment.

37 Pentium® Pro Processor Workstation Performance Brief

APPENDIX A — TEST CONFIGURATIONS Note 1. Specific Non-Intel system configuration information may be obtained from the various website URL’s referenced and/or other references noted. DEC Celebris XL6200* HP Vectra* XU/200

Pentium ® Pro Processor 200 MHz 200 MHz

Number of processors 12

Memory 128 MB / 256 MB 128 MB

Primary Cache 16 KB (8 KB I + 8 KB D) 16 KB (8 KB I + 8 KB D)

Secondary Cache 256 KB 256 KB

Motherboard Chipset, Stepping Intel 82450 Orion Intel 82450 B0 Orion

Operating System Windows NT* Workstation Windows NT Workstation

3.51 (Build 1057) 3.51 (Build 1057)

Hard Disk Controller NCR SDMS PCI SCSI Adaptec AIC 7880* SCSI

Hard Disk Seagate ST15150N* Seagate ST32550N* (2 GB)

Internal transfer rate (Mbit/sec) 47.5 to 72 49.4 to 72

External transfer rate (MB/sec) 10 (burst) 10 (burst)

Avg (R/W) Seek (msec) 8/9 8/9

Avg Latency (msec) 4.17 4.17

Graphics accelerator AccelGraphics AccelGraphics Matrox Elsa* AG/300* 3Dpro* Millennium* Racer/TX*

Tested Resolution (pixels) at 75 1280x1024x16 1024x768x15 1280x1024x24 1152x800x15 1152x800x15 Hz

Intergraph TDZ-400* NeTpower Calisto*

Pentium ® Pro Processor 200 MHz 200 MHz

Number of processors 21

Memory 128 MB / 256 MB 128 MB

Primary Cache 16 KB (8 KB I + 8 KB D) 16 KB (8 KB I + 8 KB D)

Secondary Cache 256 KB 256 KB

Motherboard Chipset, Stepping Intel 82450 Orion Intel 82450 B0 Orion

Operating System Windows NT* Workstation Windows NT Workstation 3.51 (Build 1057)

3.51 (Build 1057)

Hard Disk Controller Adaptec AIC 7850* SCSI Adaptec AHA 2940* SCSI

Hard Disk Seagate ST31250* Seagate ST32550N* (2 GB)

Internal transfer rate (Mbit/sec) 49.4 - 72 49.4 to 72

External transfer rate (MB/sec) 20 (burst) 10 (burst)

Avg (R/W) Seek (msec) 8.5 8/9

Avg Latency (msec) 4.17 4.17

Graphics accelerator GLZ2* AccelGraphics AG/300* Elite 2*

Tested Resolution (pixels) at 75 1280x1024x24 1280x1024x16 1152x800x15 Hz

38 Pentium® Pro Processor Workstation Performance Brief

NeTpower Calisto*

Pentium ® Pro Processor 200 MHz

Number of processors 1

Memory 256 MB

Primary Cache 16 KB (8 KB I + 8 KB D)

Secondary Cache 256 KB

Motherboard Chipset, Stepping Intel 82450 Orion

Operating System Windows NT* Workstation 3.51 (Build 1057)

Swap Space 400MB

Hard Disk Controller Adaptec AHA 2940* SCSI

Hard Disk Seagate ST32550N* (2 GB)

Internal transfer rate (Mbit/sec) 49.4 to 72

External transfer rate (MB/sec) 10 (burst)

Avg (R/W) Seek (msec) 8/9

Avg Latency (msec) 4.17

Pro/E release & build 16/9610

Graphics accelerator Elite2*

Tested Resolution (pixels) at 75 Hz 1024x768x256

The following lists the configurations used for testing the IHV cards running the Viewperf 5.0 tests: The Intergraph Intense 3D-T* runs were on a: IBM PC 360* 200 MHz Pentium Pro processor 64 MB memory 2 GB disk The system configuration for the Dynamic Pictures* runs was: Dell XPS Pro/200* 200 MHz Pentium Pro processor 64 MB memory 2.5 GB disk The AccelGraphics AccelPro TX* was run on: HP Vectra* XU/200 200 MHz Pentium Pro processor 64 MB memory 1 GB disk

39 Pentium® Pro Processor Workstation Performance Brief

Website References

This list shows URL’s where data is available that is contained in this document. It also shows the date of the last update if available on the webpage.

Individual vendor websites: http://www.alphastation.digital.com/products/a255/performance.html (July 11, 1996) http://www.alphastation.digital.com/products/a500/performance.html (June 25, 1996) http://www.alphastation.digital.com/products/a600/Performance1.html (July 11, 1996) http://www.alphastation.digital.com/products/a600/Performance2.html (July 11, 1996) http://www.alphastation.digital.com/products/alphaxl/Performance.html (April 29, 1996) http://www.sun.com/products-n-solutions/hw/wstns/index.html http://www.sgi.com/Products/Indigo2/IMPACT/Products/i2techspecs.html http://www.austin.ibm.com/cgi-bin/systems/300series.pl#performance http://www.austin.ibm.com/cgi-bin/systems/4142series.pl#topic9 http://www.austin.ibm.com/cgi-bin/systems/43p.pl#performance http://www.austin.ibm.com/cgi-bin/systems/200series.pl#topic3 http://www.hp.com:80/wsg/products/cclstech.html (June 26, 1996) http://www.hp.com:80/wsg/products/viz_ctec.html (June 4, 1996)

SPEC results are available at: http://www.specbench.org/osg/cpu95/results/results.html (July 24, 1996) http://www.sgi.com/Products/Indy/Tech/r5kspecs.html http://www.mips.com/r5000/R5000_B.html

BAPCo SYSmark NT results are available at: http://www.bapco.com/ntrslts.htm (July 16, 1996)

Viewperf 5.0 results are available via the GPC newsletter & at the following URL: http://www.specbench.org/gpc/opc/ (June 12, 1996)

LINPACK results are available at: http://netlib2.cs.utk.edu/benchmark/

ANSYS Large Scale Benchmark results are available at: http://www.ansys.com/ (June 14, 1996) Follow the product pointer to reach the benchmark results.

MicroStation 95 Benchmark results: Taken from MicroStation Manager magazine, May 1996 issue. Configuration information is available in this article as well as the test results reported here.

40