
Multi-core Programming
Speedup

Based on slides from Intel Software College and Multi-Core Programming – increasing performance through software multi-threading by Shameem Akhter and Jason Roberts.

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Agenda
• Concurrency vs. Parallelism
• Characterization
• Hardware & Parallel Computing
• Threading Tools Overview
• Thread Communication

Concurrency vs. Parallelism
• Concurrency: two or more threads are in progress at the same time
  (Figure: timelines for Thread 1 and Thread 2)
• Parallelism: two or more threads are executing at the same time
  (Figure: timelines for Thread 1 and Thread 2)
  • Multiple cores needed

Speedup (Simple)
Measure of how much faster the computation executes versus the best serial code
• Serial time divided by parallel time
Example: Painting a picket fence
• 30 minutes of preparation (serial)
• One minute to paint a single picket
• 30 minutes of cleanup (serial)
Thus, 300 pickets take 360 minutes (serial time)

Computing Speedup

Number of painters   Time (minutes)        Speedup
1                    30 + 300 + 30 = 360   1.0X
2                    30 + 150 + 30 = 210   1.7X
10                   30 + 30 + 30 = 90     4.0X
100                  30 + 3 + 30 = 63      5.7X
Infinite             30 + 0 + 30 = 60      6.0X

Illustrates Amdahl’s Law: potential speedup is restricted by the serial portion

What if the fence owner uses a spray gun to paint 300 pickets in one hour?
• Better serial algorithm
• If no spray guns are available for multiple workers, what is the maximum parallel speedup?

Amdahl’s Law
Describes the upper bound of parallel execution speedup
• P = parallel fraction
• n = number of processors
T_parallel = {(1-P) + P/n} T_serial
Speedup = T_serial / T_parallel
Example with P = 0.5 and T_serial = 1.0:
• n = 2: T_parallel = 0.5 + 0.25 = 0.75, Speedup = 1.0/0.75 = 1.33
• n = ∞: T_parallel = 0.5 + 0.0 = 0.5, Speedup = 1.0/0.5 = 2.0
Serial code limits speedup
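
The picket-fence arithmetic and the Amdahl’s Law formula above can be checked with a short Python sketch. The function names `fence_time` and `amdahl_speedup` are illustrative and not from the slides, and the sketch assumes the painting work divides evenly among painters with no coordination cost.

```python
def fence_time(painters, pickets=300, prep=30, cleanup=30, minutes_per_picket=1):
    """Total minutes: serial prep and cleanup, plus painting split evenly among painters."""
    return prep + pickets * minutes_per_picket / painters + cleanup


def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's Law for parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)


serial = fence_time(1)  # 360 minutes with a single painter
for painters in (1, 2, 10, 100):
    t = fence_time(painters)
    print(f"{painters:>4} painters: {t:6.1f} min, speedup {serial / t:.1f}X")

# The fence job spends 300 of its 360 serial minutes on parallelizable work,
# so its parallel fraction is P = 300/360 and the limit is 1/(1-P).
p_fence = 300 / 360
print(f"upper bound for the fence: {1.0 / (1.0 - p_fence):.1f}X")
print(f"P = 0.5, n = 2: {amdahl_speedup(0.5, 2):.2f}X")
```

Feeding the fence's own parallel fraction into `amdahl_speedup` reproduces the speedup column of the table, which is the sense in which the example illustrates Amdahl’s Law.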

Amdahl’s Law (Ideal)
(Figure: speedup versus number of processors, with curves for f = 0.0, f = 0.01, f = 0.05, and f = 0.1)

Amdahl’s Law (Continued): Actual
(Figure: speedup versus number of CPUs, with curves for f = 0.0, f = 0.01, and actual speedup)

Consequences of Amdahl’s Law
• Increasing the number of cores only affects the parallel portion
• If code is only 10% parallel, the best one can do is run it in 90% of the time
• It is more important to reduce the proportion of the code that is sequential than to increase the number of cores
• P = 0.3 (30% parallel); here S is the run time as a fraction of the serial time, so smaller is better
  • Running on dual-core: S = (1-P) + P/n = .7 + .3/2 = .85
  • Running on quad-core: S = (1-P) + P/n = .7 + .3/4 = .775
• Running on dual-core with double the amount of parallelized code: P = 0.6 (60% parallel)
  • Running on dual-core: S = (1-P) + P/n = .4 + .6/2 = .7
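
The three cases above can be reproduced with a few lines of Python. This is a minimal sketch: `relative_time` is a hypothetical helper name, and it returns the same quantity the slide calls S (run time as a fraction of serial time), with the corresponding speedup printed as its reciprocal.

```python
def relative_time(p, n):
    """Run time as a fraction of serial time: (1 - P) + P/n."""
    return (1.0 - p) + p / n


cases = [
    ("P = 0.3, dual-core", 0.3, 2),
    ("P = 0.3, quad-core", 0.3, 4),
    ("P = 0.6, dual-core", 0.6, 2),
]

for label, p, n in cases:
    s = relative_time(p, n)
    print(f"{label}: run time {s:.3f} of serial, speedup {1.0 / s:.2f}X")
```

Doubling the parallel fraction on two cores gives a shorter run time (0.7) than doubling the core count at the original fraction (0.775), which is the point of the slide.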

Modification of Amdahl’s Law for thread overhead
• Need to consider the overhead of adding threads
  Speedup = 1/[(1-P) + P/n + H(n)], where H(n) is thread overhead
• Note that overhead is not linear on a good parallel machine
• Sources of overhead
  • Actual OS overhead
  • Inter-thread activities such as synchronization and communication
• If overhead is large enough, speedup can be less than 1
  • Threading can actually slow performance
  • Important to design multi-threaded applications well

Efficiency
Measure of how effectively computation resources (threads) are kept busy
• Speedup divided by number of threads
• Expressed as average percentage of non-idle time

Number of painters   Time (minutes)        Speedup   Efficiency
1                    360                   1.0X      100%
2                    30 + 150 + 30 = 210   1.7X      85%
10                   30 + 30 + 30 = 90     4.0X      40%
100                  30 + 3 + 30 = 63      5.7X      5.7%
Infinite             30 + 0 + 30 = 60      6.0X      very low

Amdahl’s Law and Hyper-Threading
• With HT, get a 30% performance gain
• A thread on an HT-enabled processor runs slower
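
The overhead-adjusted formula and the efficiency definition from the preceding slides can be sketched in Python as follows. The slides do not give a form for H(n), so `example_overhead` is a made-up model chosen only to show the effect; the other function names are also illustrative.

```python
def speedup_with_overhead(p, n, overhead):
    """Speedup = 1 / ((1 - P) + P/n + H(n)), with H supplied by the caller."""
    return 1.0 / ((1.0 - p) + p / n + overhead(n))


def efficiency(speedup, threads):
    """Speedup divided by the number of threads."""
    return speedup / threads


def example_overhead(n):
    # Hypothetical H(n): 5% of the serial run time for each thread beyond the first.
    # This is an assumption for illustration, not a measured or documented model.
    return 0.05 * (n - 1)


for n in (1, 2, 4, 8):
    s = speedup_with_overhead(0.3, n, example_overhead)
    print(f"n = {n}: speedup {s:.2f}X, efficiency {efficiency(s, n):.0%}")

# Efficiency for the picket-fence table, using the unrounded speedups.
for painters, minutes in ((1, 360), (2, 210), (10, 90), (100, 63)):
    s = 360 / minutes
    print(f"{painters:>4} painters: efficiency {efficiency(s, painters):.1%}")
```

With this assumed overhead and P = 0.3, the speedup rises slightly at two threads and drops below 1 at eight, matching the warning that threading can slow a program down. The fence efficiencies computed from unrounded speedups differ slightly from the table, which divides the rounded speedup (for example 1.7X / 2 = 85%).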

Repeal of Amdahl’s Law
• Amdahl’s work seemed to limit the applicability of parallel computing
• Amdahl’s Law is based on assumptions
  • Best serial algorithm is limited by CPU cycles available
    • A multi-core system may have multiple caches, so more of the data may be in cache, reducing memory latency
  • Serial algorithm is best
    • Sometimes a parallel solution requires fewer computational steps
  • Problem size is fixed
    • Usually size increases with the resources available

Gustafson’s Law
Consider scaling the problem size as the processor count increases
Ts = serial execution time
Tp(N,W) = parallel execution time for same problem, size W, on N CPUs
S(N,W) = speedup on problem size W, N CPUs
S(N,W) = (Ts + Tp(1,W)) / (Ts + Tp(N,W))
Consider the case where Tp(N,W) ~ W*W/N
S(N,W) -> (N*Ts + N*W*W) / (N*Ts + W*W)
If W -> ∞ as N -> ∞ then S(N,W) -> N
Gustafson’s Law provides hope for parallel applications to deliver on their promise.
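
The scaled-speedup expression above can be evaluated numerically to see the limit emerge. This is a minimal sketch under the slide's example case, taking Tp(N,W) to be exactly W*W/N (a proportionality constant of 1 is assumed); the function name `scaled_speedup` and the sample values of Ts, N, and W are illustrative.

```python
def scaled_speedup(n, w, ts):
    """S(N, W) = (Ts + Tp(1, W)) / (Ts + Tp(N, W)) with Tp(N, W) = W*W/N (assumed)."""
    tp_one = w * w        # Tp(1, W): parallelizable work on a single CPU
    tp_n = w * w / n      # Tp(N, W): the same work spread over N CPUs
    return (ts + tp_one) / (ts + tp_n)


ts = 100.0   # fixed serial time, arbitrary units
n = 16       # number of CPUs

for w in (1, 10, 100, 1000):
    print(f"W = {w:>4}: S({n}, W) = {scaled_speedup(n, w, ts):.2f}")
```

For a small problem the fixed serial time dominates and the speedup stays near 1; as W grows, the parallel work swamps Ts and the speedup approaches N.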