•The Better a Computer Performs, the Better It Is! •But What Is Better?

•The Better a Computer Performs, the Better It Is! •But What Is Better?

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Performance Lecture #38 Performance •The better a computer 2006-12-01 performs, the better it is! Titleless TA Aaron Staley •But what is better? inst.eecs./~cs61c-tc Ancient Greeks used Computers ⇒ Mechanical device had 37 gear wheels which could be used to add, subtract, multiply, and divide http://www.nytimes.com/2006/11/30/science/30compute.html CS61C L3 Performance (1) Staley, Fall 2005 © UCB CS61C L3 Performance (2) Staley, Fall 2005 © UCB Why Performance? Two Notions of “Performance” • Purchasing Perspective: given a DC to Top Passen- Throughput Plane collection of machines (or upgrade Paris Speed gers (pmph) options), which has the Boeing 6.5 610 - best performance ? 470 286,700 - least cost ? 747 hours mph - best performance / cost ? BAD/Sud 3 1350 132 178,200 • Computer Designer Perspective: faced Concorde hours mph with design options, which has the •Which has higher performance? - best performance improvement ? •Interested in time to deliver 100 passengers? - least cost ? •Interested in delivering as many passengers per day as possible? - best performance / cost ? •In a computer, time for one task called • All require basis for comparison and Response Time or Execution Time metric for evaluation! •In a computer, tasks per unit time called Throughput or Bandwidth •Solid metrics lead to solid progress! CS61C L3 Performance (3) Staley, Fall 2005 © UCB CS61C L3 Performance (4) Staley, Fall 2005 © UCB Definitions Example of Response Time v. Throughput • Performance is in units of things per sec • Time of Concorde vs. Boeing 747? • bigger is better • Concord is 6.5 hours / 3 hours = 2.2 times faster • If we are primarily concerned with • Throughput of Boeing vs. Concorde? response time • Boeing 747: 286,700 pmph / 178,200 pmph • performance(x) = 1 = 1.6 times faster execution_time(x) • Boeing is 1.6 times (“60%”) faster in " F(ast) is n times faster than S(low) " means… terms of throughput performance(F) execution_time(S) • Concord is 2.2 times (“120%”) faster in terms of flying time (response time) n = = We will focus primarily on response performance(S) execution_time(F) time. CS61C L3 Performance (5) Staley, Fall 2005 © UCB CS61C L3 Performance (6) Staley, Fall 2005 © UCB Words, Words, Words… What is Time? • Straightforward definition of time: • Will (try to) stick to “n times faster”; • Total time to complete a task, including disk its less confusing than “m % faster” accesses, memory accesses, I/O activities, operating system overhead, ... • As faster means both decreased • “real time”, “response time” or execution time and increased “elapsed time” performance, to reduce confusion we • Alternative: just time processor (CPU) will (and you should) use is working only on your program (since “improve execution time” or multiple processes running at same time) “improve performance” • “CPU execution time” or “CPU time” • Often divided into system CPU time (in OS) and user CPU time (in user program) CS61C L3 Performance (7) Staley, Fall 2005 © UCB CS61C L3 Performance (8) Staley, Fall 2005 © UCB How to Measure Time? Measuring Time using Clock Cycles (1/2) • Real Time ⇒ Actual time elapsed • CPU execution time for a program • CPU Time: Computers constructed using a clock that runs at a constant = Clock Cycles for a program rate and determines when events take x Clock Period place in the hardware • or • These discrete time intervals called clock cycles (or informally clocks or = Clock Cycles for a program cycles) Clock Rate • Length of clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns) and clock rate (e.g., 500 megahertz, or 500 MHz), which is the inverse of the clock period; use these! CS61C L3 Performance (9) Staley, Fall 2005 © UCB CS61C L3 Performance (10) Staley, Fall 2005 © UCB Measuring Time using Clock Cycles (2/2) Performance Calculation (1/2) • One way to define clock cycles: • CPU execution time for program = Clock Cycles for program Clock Cycles for program x Clock Cycle Time = Instructions for a program • Substituting for clock cycles: (called “Instruction Count”) CPU execution time for program x Average Clock cycles Per Instruction = (Instruction Count x CPI) (abbreviated “CPI”) x Clock Cycle Time • CPI one way to compare two machines = Instruction Count x CPI x Clock Cycle Time with same instruction set, since Instruction Count would be the same CS61C L3 Performance (11) Staley, Fall 2005 © UCB CS61C L3 Performance (12) Staley, Fall 2005 © UCB Performance Calculation (2/2) How Calculate the 3 Components? CPU time = Instructions x Cycles x Seconds • Clock Cycle Time: in specification of computer (Clock Rate in advertisements) Program Instruction Cycle • Instruction Count: CPU time = Instructions x Cycles x Seconds • Count instructions in loop of small program Program Instruction Cycle • Use simulator to count instructions CPU time = Instructions x Cycles x Seconds • Hardware counter in spec. register Program Instruction Cycle - (Pentium II,III,4) CPU time = Seconds • CPI: Program • Calculate: Execution Time / Clock cycle time • Product of all 3 terms: if missing a term, can’t Instruction Count predict time, the real measure of performance • Hardware counter in special register (PII,III,4) CS61C L3 Performance (13) Staley, Fall 2005 © UCB CS61C L3 Performance (14) Staley, Fall 2005 © UCB Calculating CPI Another Way Example (RISC processor) Op Freq CPI Prod (% Time) • First calculate CPI for each individual i i instruction (add, sub, and, etc.) ALU 50% 1 .5 (23%) Load 20% 5 1.0 (45%) • Next calculate frequency of each individual instruction Store 10% 3 .3 (14%) Branch 20% 2 .4 (18%) • Finally multiply these two for each 2.2 instruction and add them up to get Instruction Mix (Where time spent) final CPI (the weighted sum) • What if Branch instructions twice as fast? CS61C L3 Performance (15) Staley, Fall 2005 © UCB CS61C L3 Performance (16) Staley, Fall 2005 © UCB What Programs Measure for Comparison? Benchmarks • Ideally run typical programs with typical input before purchase, • Obviously, apparent speed of or before even build machine processor depends on code used to test it • Called a “workload”; For example: • Engineer uses compiler, spreadsheet • Need industry standards so that • Author uses word processor, drawing different processors can be fairly program, compression software compared • In some situations its hard to do • Companies exist that create these • Don’t have access to machine to benchmarks: “typical” code used to “benchmark” before purchase evaluate systems • Don’t know workload in future • Need to be changed every 2 or 3 years since designers could (and do!) target • Next: benchmarks & PC-Mac showdown! for these standard benchmarks CS61C L3 Performance (17) Staley, Fall 2005 © UCB CS61C L3 Performance (18) Staley, Fall 2005 © UCB Example Standardized Benchmarks (1/2) Example Standardized Benchmarks (2/2) • SPEC • Standard Performance Evaluation Corporation (SPEC) SPEC CPU2000 • Benchmarks distributed in source code • Members of consortium select workload • CINT2000 12 integer (gzip, gcc, crafty, perl, ...) - 30+ companies, 40+ universities • CFP2000 14 floating-point (swim, mesa, art, ...) • Compiler, machine designers target • All relative to base machine benchmarks, so try to change every 3 years Sun 300MHz 256Mb-RAM Ultra5_10, which gets score of 100 • An Example of SPEC 2000: • They measure CFP2000 CINT2000 gzip C Compression wupwise Fortran77 Physics / Quantum Chromodynamics swim Fortran77 Shallow Water Modeling vpr C FPGA Circuit Placement and - System speed (SPECint2000) Routing mgrid Fortran77 Multi-grid Solver: 3D Potential Field gcc C C Programming Language Compiler applu Fortran77 Parabolic / Elliptic Partial Differential Equations mcf C Combinatorial Optimization - System throughput (SPECint_rate2000) crafty C Game Playing: Chess mesa C 3-D Graphics Library parser C Word Processing galgel Fortran90 Computational Fluid Dynamics eon C++ Computer Visualization art C Image Recognition / Neural Networks perlbmk C PERL Programming Language equake C Seismic Wave Propagation Simulation facerec Fortran90 Image Processing: Face Recognition •Newer one: www.spec.org/osg/cpu2006/ gap C Group Theory, Interpreter vortex C Object-oriented Database ammp C Computational Chemistry bzip2 C Compression lucas Fortran90 Number Theory / Primality Testing fma3d Fortran90 Finite-element Crash Simulation twolf C Place and Route Simulator sixtrack Fortran77 High Energy Nuclear Physics Accelerator Design CS61C L3 Performance (19) Staley, Fall 2005 © UCB CS61C L3 Performance (20) apsi Fortran77 Meteorology: Pollutant DistrSitabluetyi,o Fnall 2005 © UCB Another Benchmark Performance Evaluation: An Aside Demo • PCs: Ziff-Davis Benchmark Suite If we’re talking about performance, let’s discuss the ways shady salespeople • “Business Winstone is a system-level, have fooled consumers application-based benchmark that measures (so that you don’t get taken!) a PC's overall performance when running today's top-selling Windows-based 32-bit 5. Never let the user touch it applications… it doesn't mimic what these 4. Only run the demo through a script packages do; it runs real applications through a series of scripted activities and 3. Run it on a stock machine in which uses the time a PC takes to complete those “no expense was spared” activities to produce its performance scores. 2. Preprocess all available data • Also tests for CDs, Content-creation, Audio, 3D graphics, battery life 1. Play a movie http://www.etestinglabs.com/benchmarks/ CS61C L3 Performance

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    5 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us