
Architecture

Overview

Prof. Tien-Fu Chen Dept. of Computer Science National Chung Cheng Univ Spring 2002

Overview-1 TF Chen@CCU

Computer Architecture Course Focus

Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of programmable processors in the 21st century

Influences on computer architecture (figure): Technology, Programming Languages, Applications, Operating Systems, Measurement & Evaluation, History

Computer Architecture:
• Instruction Set Design (ISA), the interface design
• Organization
• Hardware

Topic Coverage

Textbook: Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 2nd Ed., 1996.

• Fundamentals of Computer Architecture (Ch. 1)
• Instruction Set Architecture (Ch. 2)
• Pipelining (Ch. 3)
• Pipelining and Instruction-Level Parallelism (Ch. 4)
• Memory Hierarchy (Ch. 5)
• Input/Output and Storage, RAID (Ch. 6)
• Networks and Interconnection Technology (Ch. 7), Multiprocessors (Ch. 8)
• Vector, SIMD, MMX, VLIW, EPIC and DSPs
• Configurable processors and computing
• Design space exploration and embedded processors

Computer Architecture (Graduate): Administrative Information

Instructor: Prof. Tien-Fu Chen
Office: 412 Computer Science Dept., Tel: ext-33111, [email protected]
T.A.:
Class: Wed 10:10-11:00, Fri 10:10-12:00, Room 101
Text: Computer Architecture: A Quantitative Approach, 2nd Edition (1996)
Web page: http://www.cs.ccu.edu.tw/~chen/arch2002
Newsgroup: arch2002

ABC of Computer Architecture:

Performance, Cost, Power


The Performance Metric

"X is n times faster than Y" means

$$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$

• Speed of Concorde vs. Boeing 747

• Throughput of Boeing 747 vs. Concorde
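The two bullets contrast latency (time per trip) with throughput (work per unit time). A minimal sketch of that comparison follows; the aircraft figures are commonly quoted illustrative values that do not appear in these notes, so treat them, and the helper names `times_faster` and `throughput`, as assumptions.

```python
def times_faster(ex_time_y, ex_time_x):
    """Return n where "X is n times faster than Y": n = ExTime(Y) / ExTime(X)."""
    return ex_time_y / ex_time_x

# ASSUMED illustrative figures; they are not given in the notes.
concorde = {"trip_hours": 3.0, "speed_mph": 1350, "passengers": 132}
boeing_747 = {"trip_hours": 6.5, "speed_mph": 610, "passengers": 470}

def throughput(plane):
    """Passenger-miles per hour, one possible throughput measure."""
    return plane["passengers"] * plane["speed_mph"]

# Latency view: which plane finishes one trip sooner?
speed_ratio = times_faster(boeing_747["trip_hours"], concorde["trip_hours"])

# Throughput view: which plane moves more passengers per hour of flight?
throughput_ratio = throughput(boeing_747) / throughput(concorde)

print(f"Concorde is {speed_ratio:.2f}x faster per trip")
print(f"Boeing 747 has {throughput_ratio:.2f}x the throughput")
```

Under these assumed numbers the "faster" machine depends on which metric is chosen, which is the point of the two bullets.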

Metrics of Performance

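System levels, from top to bottom (figure): Application, Programming Language, Compiler, ISA, Control, Function Units, Transistors / Wires / Pins

Metrics, from the application level down to the hardware level:
• Answers per month
• Operations per second
• (Millions) of instructions per second: MIPS
• (Millions) of (FP) operations per second: MFLOP/s
• Megabytes per second
• Cycles per second (clock rate)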


Performance

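• Performance Measures
  – Elapsed time
  – CPU time, e.g. "% time" reporting 90.7u 12.9s 2:39 65% (90.7 s user CPU time, 12.9 s system CPU time, 2:39 elapsed, 65% of elapsed time spent on the CPU)
  – Cycle time
  – Law of Performance
  – CPI
  – MIPS
  – MFLOPS
  – SPECmarks
• Benchmarks
• Averaging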
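The sample timing line can be checked by hand. The sketch below reads it as csh-style `time` output (an interpretation: 90.7 s user CPU time, 12.9 s system CPU time, 2:39 elapsed) and recomputes the percentage; the function name `cpu_utilization` is hypothetical.

```python
def cpu_utilization(user_s, system_s, elapsed_s):
    """Fraction of elapsed (wall-clock) time that was spent on the CPU."""
    return (user_s + system_s) / elapsed_s

# Values taken from the sample line "90.7u 12.9s 2:39 65%".
user_s, system_s = 90.7, 12.9
elapsed_s = 2 * 60 + 39            # 2:39 is 159 seconds

cpu_s = user_s + system_s          # total CPU time, about 103.6 s
utilization = cpu_utilization(user_s, system_s, elapsed_s)

print(f"CPU time {cpu_s:.1f} s of {elapsed_s} s elapsed = {utilization:.0%}")
```

The remaining elapsed time is not CPU time for this program; it may include I/O waits or time spent running other programs.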

Aspects of CPU Performance

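$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

                Inst Count    CPI    Clock Rate
Program             X
Compiler            X         (X)
Inst. Set           X          X
Organization                   X         X
Technology                               X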
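The three factors multiply, so a change that helps one can be offset by another. The sketch below evaluates the CPU time equation for a hypothetical program; the instruction counts, CPI values, and clock rate are made-up numbers chosen only for illustration.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instructions * cycles/instruction * seconds/cycle."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical baseline: 1e9 instructions, CPI of 2.0, 500 MHz clock.
baseline = cpu_time(1e9, 2.0, 500e6)

# Hypothetical compiler change: 10% fewer instructions, slightly higher CPI.
tuned = cpu_time(0.9e9, 2.1, 500e6)

print(f"baseline: {baseline:.2f} s, tuned: {tuned:.2f} s")
print(f"speedup: {baseline / tuned:.3f}")
```

Here the compiler affects both the instruction count and, to a lesser degree, the CPI, matching the X and (X) entries in the table.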


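Cycles Per Instruction

"Average cycles per instruction":

$$\text{CPI} = \frac{\text{Cycles}}{\text{Instruction Count}} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}}$$

$$\text{CPU time} = \text{CycleTime} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$$

"Instruction frequency":

$$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i \quad \text{where} \quad F_i = \frac{I_i}{\text{Instruction Count}}$$

Invest resources where time is spent!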
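A small sketch of the frequency-weighted CPI follows. The instruction mix is hypothetical (it is not given in these notes), and the names `average_cpi` and `time_share` are illustrative. The per-class share of cycles shows where time is spent.

```python
# Hypothetical instruction mix: frequency F_i and cycles CPI_i per class.
mix = {
    "alu":    {"freq": 0.5, "cpi": 1},
    "load":   {"freq": 0.2, "cpi": 5},
    "store":  {"freq": 0.1, "cpi": 3},
    "branch": {"freq": 0.2, "cpi": 2},
}

def average_cpi(mix):
    """CPI = sum over classes of CPI_i * F_i."""
    return sum(c["freq"] * c["cpi"] for c in mix.values())

def time_share(mix):
    """Fraction of all cycles spent in each instruction class."""
    total = average_cpi(mix)
    return {name: c["freq"] * c["cpi"] / total for name, c in mix.items()}

cpi = average_cpi(mix)
shares = time_share(mix)
hottest = max(shares, key=shares.get)

print(f"CPI = {cpi:.2f}")
for name, share in shares.items():
    print(f"{name:>6}: {share:.0%} of cycles")
print(f"most time is spent in: {hottest}")
```

In this assumed mix, loads are only 20% of instructions but account for the largest share of cycles, so they are the first place to invest.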

MIPS - Million instructions per second

$$\text{MIPS} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{program}}{\text{time}} \times 10^{-6} = \frac{\text{inst count}}{\text{exec time} \times 10^{6}} = \frac{\text{clock rate}}{\text{CPI} \times 10^{6}}$$

• Problems:
  – Depends on the instruction set
  – Depends on the program being run
  – Can vary inversely with performance
• Marketing metrics
• MIPS = Meaningless Indicator of Performance??
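To see how MIPS can vary inversely with performance, the sketch below compares two hypothetical versions of the same program on the same clock: one using hardware floating point (few, slow instructions) and one using software floating point (many, fast instructions). All numbers are assumptions chosen for illustration.

```python
CLOCK_HZ = 100e6  # assumed 100 MHz clock for both versions

def exec_time(instruction_count, cpi, clock_hz=CLOCK_HZ):
    """Execution time in seconds."""
    return instruction_count * cpi / clock_hz

def mips_from_time(instruction_count, exec_time_s):
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (exec_time_s * 1e6)

def mips_from_cpi(cpi, clock_hz=CLOCK_HZ):
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)

# Hypothetical versions of one program.
hw_fp = {"instructions": 10e6, "cpi": 4.0}   # hardware FP
sw_fp = {"instructions": 60e6, "cpi": 1.5}   # software FP routines

hw_time = exec_time(hw_fp["instructions"], hw_fp["cpi"])
sw_time = exec_time(sw_fp["instructions"], sw_fp["cpi"])
hw_mips = mips_from_time(hw_fp["instructions"], hw_time)
sw_mips = mips_from_time(sw_fp["instructions"], sw_time)

print(f"hardware FP: {hw_time:.2f} s, {hw_mips:.1f} MIPS")
print(f"software FP: {sw_time:.2f} s, {sw_mips:.1f} MIPS")
```

The software version reports the higher MIPS rating yet takes longer to finish, which is why execution time is the safer comparison.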


MFLOPS - Million floating-point operations per second

$$\text{MFLOPS} = \frac{\text{FP ops}}{\text{program}} \times \frac{\text{program}}{\text{time}} \times 10^{-6}$$

• Problems:
  – The program must be floating-point intensive
  – Some programs perform no FP operations
  – The rating changes not only with the mix of integer and FP operations, but also with the mix of fast and slow FP operations
• Just another Meaningless Indicator of Performance??
• Peak MFLOPS: the performance the manufacturer guarantees you won't exceed

SPEC: System Performance Evaluation Cooperative

• First Round 1989
  – 10 programs yielding a single number ("SPECmarks")
• Second Round 1992
  – SPECInt92 (6 integer programs) and SPECfp92 (14 floating point programs)
    » Compiler flags unlimited. March 93 of DEC 4000 Model 610:
      spice: unix.:/def=(sysv,has_bcopy,”bcopy(a,b,c)= memcpy(b,a,c)”
      wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200
      nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas
• Third Round 1995
  – New set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point)
  – "Benchmarks useful for 3 years"
  – Single flag setting for all programs: SPECint_base95, SPECfp_base95


Summarizing Results: Average

• Arithmetic mean: for times, CPI

$$\text{time} = \frac{1}{n} \sum_{i=1}^{n} \text{time}_i$$

• Harmonic mean: for rates, MIPS, MFLOPS

$$\text{rate} = \frac{n}{\sum_{i=1}^{n} \frac{1}{\text{rate}_i}}$$

• Geometric mean: for normalized numbers

$$\text{ratio} = \left( \prod_{i=1}^{n} \text{ratio}_i \right)^{1/n}$$
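The three means can be sketched directly from their definitions. The sample values below are made up for illustration, and the function names are hypothetical; Python's standard `statistics` module offers equivalents, but the explicit versions keep the formulas visible.

```python
import math

def arithmetic_mean(times):
    """For quantities that add up, such as execution times or CPI."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """For rates such as MIPS or MFLOPS."""
    return len(rates) / sum(1.0 / r for r in rates)

def geometric_mean(ratios):
    """For normalized numbers (ratios to a reference machine)."""
    return math.prod(ratios) ** (1.0 / len(ratios))

# Hypothetical measurements.
times = [2.0, 4.0, 6.0]        # seconds
rates = [10.0, 20.0, 40.0]     # MFLOPS
ratios = [1.0, 2.0, 4.0]       # normalized performance

am = arithmetic_mean(times)
hm = harmonic_mean(rates)
gm = geometric_mean(ratios)

print(f"arithmetic mean of times: {am:.2f}")
print(f"harmonic mean of rates:   {hm:.2f}")
print(f"geometric mean of ratios: {gm:.2f}")
```

The harmonic mean of the rates is lower than their arithmetic mean would be, because the slowest rate dominates total time.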

Speedup

• Speedup

$$\text{Speedup} = \frac{\text{old time}}{\text{new time}} = \frac{\text{new rate}}{\text{old rate}}$$

• Amdahl's Law: the improvement to be gained from using a faster mode is limited by the fraction of the time the faster mode can be used. Consider an enhancement x that speeds up a fraction f_x of a task by a factor S_x:

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - f_x) + f_x / S_x}$$

• Example: f_x = 90%, S_x = 5

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.9) + 0.9 / 5} = 3.6$$


Amdahl's Law

Speedup due to enhancement E:

$$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected

Amdahl’s Law

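$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$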
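The two formulas translate directly into code. The sketch below uses hypothetical function names and checks them against the numeric examples that appear in these notes (90% of a task sped up 5x, and 10% sped up 2x); the 100-second old execution time is an assumed value.

```python
def new_exec_time(old_time, fraction_enhanced, speedup_enhanced):
    """ExTime_new = ExTime_old * ((1 - F) + F / S)."""
    return old_time * ((1.0 - fraction_enhanced)
                       + fraction_enhanced / speedup_enhanced)

def overall_speedup(fraction_enhanced, speedup_enhanced):
    """Speedup_overall = 1 / ((1 - F) + F / S)."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

# 90% of the task runs 5x faster.
big_fraction = overall_speedup(0.90, 5.0)

# Only 10% of the task (e.g. FP instructions) runs 2x faster.
small_fraction = overall_speedup(0.10, 2.0)

# Assumed old execution time of 100 s for the second case.
new_time = new_exec_time(100.0, 0.10, 2.0)

print(f"F=90%, S=5: overall speedup {big_fraction:.2f}")
print(f"F=10%, S=2: overall speedup {small_fraction:.3f}")
print(f"100 s becomes {new_time:.1f} s")
```

Even a 2x improvement yields only about a 5% gain when it applies to a tenth of the execution time.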


Amdahl’s Law

• Floating-point instructions are improved to run 2x faster, but only 10% of actual instructions are FP

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{\text{old}}$$

$$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$

Law of diminishing returns: focus on the common case!

Amdahl's Law Corollary

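• Increasing speedup S_x with f_x = 50%:

$$S_x = 1.10 \rightarrow \text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.5) + (0.5/1.1)} = 1.05$$

$$S_x = 10 \rightarrow \text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.5) + (0.5/10)} = 1.818$$

$$S_x = \infty \rightarrow \text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.5) + (0.5/\infty)} = 2.0$$

• Bound:

$$\text{Speedup}_{\text{overall}} < \frac{1}{1 - f_x}$$

• Examples:

f_x            1%      5%      10%     20%     50%
1/(1 - f_x)    1.01    1.05    1.11    1.25    2.00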
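The corollary can be checked numerically by sweeping S_x toward infinity and tabulating the bound 1/(1 - f_x). This is a minimal sketch; the function names and the extra sweep value S_x = 100 are illustrative choices.

```python
import math

def overall_speedup(f_x, s_x):
    """Amdahl's Law; f_x / inf evaluates to 0.0, giving the limiting case."""
    return 1.0 / ((1.0 - f_x) + f_x / s_x)

def speedup_bound(f_x):
    """Upper bound on overall speedup, reached only as S_x goes to infinity."""
    return 1.0 / (1.0 - f_x)

# Sweep the enhancement speedup for a fixed fraction of 50%.
f_x = 0.5
sweep = {s_x: overall_speedup(f_x, s_x) for s_x in (1.1, 10.0, 100.0, math.inf)}

# Bound for several fractions.
bounds = {f: speedup_bound(f) for f in (0.01, 0.05, 0.10, 0.20, 0.50)}

for s_x, speedup in sweep.items():
    print(f"S_x = {s_x:>6}: overall speedup {speedup:.3f}")
for f, bound in bounds.items():
    print(f"f_x = {f:.0%}: bound {bound:.2f}")
```

No matter how large S_x becomes, the overall speedup for f_x = 50% never exceeds 2.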

Cost/Performance

What is the Relationship of Cost to Price?

• Recurring Costs
  – Component costs
  – Direct costs (add 25% to 40%): recurring costs such as labor, purchasing, scrap, warranty
• Non-Recurring Costs or Gross Margin (add 82% to 186%): R&D, equipment maintenance, rental, marketing, sales, financing cost, pretax profits, taxes
• Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup

Breakdown of list price (list price = average discount + average selling price):

Average Discount     25% to 40%
Gross Margin         34% to 39%
Direct Cost          6% to 8%
Component Cost       15% to 33%

Energy/Power

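• Power dissipation: the rate at which energy is taken from the supply (power source) and transformed into heat, P = E/t
• Energy dissipation for a given instruction depends on the type of instruction (and the state of the processor)

$$P = \frac{1}{\text{CPU time}} \sum_{i=1}^{n} E_i \times I_i$$

where E_i is the energy dissipated per instruction of type i and I_i is the number of instructions of that type.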
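The power formula sums energy over instruction types and divides by CPU time. The sketch below assumes E_i is the energy per instruction of type i and I_i is the count of that type; the energy values, counts, and CPU time are hypothetical numbers chosen for illustration.

```python
def average_power(energy_per_instr_j, instr_counts, cpu_time_s):
    """P = (1 / CPU time) * sum of E_i * I_i, in watts."""
    total_energy_j = sum(energy_per_instr_j[kind] * count
                         for kind, count in instr_counts.items())
    return total_energy_j / cpu_time_s

# Hypothetical energy per instruction, in joules.
energy_per_instr_j = {"alu": 1e-9, "load": 3e-9}

# Hypothetical instruction counts for one program run.
instr_counts = {"alu": 5e8, "load": 2e8}

cpu_time_s = 2.0  # assumed CPU time in seconds

power_w = average_power(energy_per_instr_j, instr_counts, cpu_time_s)
print(f"average power: {power_w:.2f} W")
```

Because energy depends on the instruction type, two programs with the same CPU time can dissipate different power.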


Summary, #2

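• Amdahl’s Law:

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$

• CPI Law:

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$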

• Execution time is the REAL measure of performance!
• Good products are created when you have:
  – Good benchmarks, good ways to summarize performance
• A different set of metrics applies to embedded systems
