A Case Study of 3 Internet Benchmarks on 3 Superscalar Machines + + + Yue Luo , Pattabi Seshadri*, Juan Rubio , Lizy Kurian John and Alex Mericas* + Laboratory for Computer Architecture *IBM Corporation The University of Texas at Austin 11400 Burnet Road Austin, TX 78712 Austin, TX 78758 {luo,jrubio,ljohn}@ece.utexas.edu
[email protected] The phenomenal growth of the World Wide Web has resulted in the emergence and popularity of several information technology related computer applications. These applications are typically executed on computer systems that contain state-of-the-art superscalar microprocessors. Superscalar microprocessors can fetch, decode, and execute multiple instructions in each clock cycle. They contain multiple functional units and generally employ large caches. Most of these superscalar processors execute instructions in an order different from the instruction sequence that is fed to them. In order to finish the job as soon as possible, they look further down into the instruction stream and execute instructions from places where sequential execution flow has not reached yet. With the aid of sophisticated branch predictors, they identify the potential path of program flow in order to find instructions to execute in advance. At times, the predictions are wrong and the processor nullifies the extra work that it speculatively performed. Most of the microprocessors that are executing today’s internet workloads were designed before the advent of these emerging workloads. The SPEC CPU benchmarks (See sidebar on SPEC CPU benchmarks) have been used widely in performance evaluation during the last 12 years, but they are different in functionality from the emerging commercial applications.