Multi-Core Programming Speedup

Multi-Core Programming Speedup

Multi-core Programming Speedup Based on slides from Intel Software College and Multi-Core Programming – increasing performance through software multi-threading by Shameem Akhter and Jason Roberts, Intel® Software College Agenda Concurrency vs. Parallelism Characterization Hardware & Parallel Computing Threading Tools Overview Thread Communication Threading Basics 2 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 1 Intel® Software College Agenda Concurrency vs. Parallelism Characterization Hardware & Parallel Computing Threading Tools Overview Thread Communication Threading Basics 3 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Concurrency vs. Parallelism • Concurrency: two or more threads are in progress at the same time: Thread 1 Thread 2 • Parallelism: two or more threads are executing at the same time Thread 1 Thread 2 • Multiple cores needed Threading Basics 4 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 2 Intel® Software College Agenda Concurrency vs. Parallelism Characterization Hardware & Parallel Computing Threading Tools Overview Thread Communication Threading Basics 5 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Speedup (Simple) Measure of how much faster the computation executes versus the best serial code • Serial time divided by parallel time Example: Painting a picket fence • 30 minutes of preparation (serial) • One minute to paint a single picket • 30 minutes of cleanup (serial) Thus, 300 pickets takes 360 minutes (serial time) Threading Basics 6 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 3 Intel® Software College Computing Speedup Number of Time Speedup painters 1 30 + 300 + 30 = 360 1.0X 2 30 + 150 + 30 = 210 1.7X Illustrates Amdahl’s Law 10 30 + 30 + 30 = 90 4.0X 100 30 + 3 + 30 = 63 5.7X Potential speedup is restricted by Infinite 30 + 0 + 30 = 60 6.0X serial portion What if fence owner uses spray gun to paint 300 pickets in one hour? • Better serial algorithm • If no spray guns are available for multiple workers, what is maximum parallel speedup? Threading Basics 7 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Amdahl’s Law Describes the upper bound of parallel execution speedup P is parallel fraction n = ∞2 0.5 + 0.250.0 P T = {(1-P) + P/n} T P/2 parallel serial … n = number of processors serial T P/ ∞ 1.0/0.51.0/0.75 = = 2.0 1.33 Speedup = Tserial / Tparallel (1-P) (1-P) Serial code limits speedup Threading Basics 8 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 4 Intel® Software College Amdahl’s Law Amdahl's Law (Ideal) 45 40 35 30 f=0.0 25 f=0.01 20 f=0.05 f=0.1 Speedup 15 10 5 0 10 20 30 40 Number of Processors Threading Basics 9 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Amdahl’s Law (Continued) Amdahl's Law (Actual) 50 40 30 f=0.0 f=0.01 20 Speedup Actual 10 0 10 20 30 40 Number CPUs Threading Basics 10 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 5 Intel® Software College Consequences of Amdahl’s Law • Increasing the number of cores only affects the parallel portion • If code only 10% parallel, best one can do is run it in 90% of time • It is more important to reduce the proportion of the code that is sequential than to increase the number of cores • P= 0.3 (30% parallel) • Running on dual-core S = (1-P) + P/n = .7 + .3/2 = .85 • Running on quad-core S = (1-P) + P/n = .7 + .3/4 = .775 • Running on dual-core with double amount of parallelized code P = 0.6 (60% parallel) • Running on dual core S = (1-P) + P/n = .4 + .6/2 = .7 Threading Basics 11 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Modification of Amdahl’s Law for thread overhead • Need to consider overhead of adding threads Speedup = 1/[ (1-P) + P/n + H(n)] where H(n) is thread overhead • Note that overhead is not linear on good parallel machine • Source of overhead • Actual OS overhead • Inter-thread activities such as synchronization and communication • If overhead large enough speedup can be less than 1 • Threading can actually slow performance • Important to design multi-threaded applications well Threading Basics 12 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 6 Intel® Software College Efficiency Measure of how effectively computation resources (threads) are kept busy • Speedup divided by number of threads • Expressed as average percentage of non-idle time Number of Time Speedup Efficiency painters 1 360 1.0X 100% 2 30 + 150 + 30 = 210 1.7X 85% 10 30 + 30 + 30 = 90 4.0X 40% 100 30 + 3 + 30 = 63 5.7X 5.7% Infinite 30 + 0 + 30 = 60 6.0X very low Threading Basics 13 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Amdahl’s Law and Hyper-Threading • With HT get 30% performance gain • A thread on HT enabled processor runs slowe Threading Basics 14 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 7 Intel® Software College Repeal of Amdahl’s Law • Amdahl’s work seemed to limit applicability of parallel computing • Amdahl’s Law based on assumptions • Best serial algorithm limited by CPU cycles available • In multi-core may have multiple caches, so more of data may be in cache reducing memory latency • Serial algorithm best • sometimes parallel solution requires less computational steps • Problem size fixed • Usually size increases with resources available Threading Basics 15 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel® Software College Gustafson’s Law Consider scaling problem size as processor count increased Ts = serial execution time Tp(N,W) = parallel execution time for same problem, size W, on N CPUs S(N,W) = Speedup on problem size W, N CPUs S(N,W) = (Ts + Tp(1,W) )/( Ts + Tp(N,W) ) Consider case where Tp(N,W) ~ W*W/N S(N,W) -> (N*Ts + N*W*W)/(N*Ts + W*W) If W-> ∞ as N -> ∞ then S(N,W) -> N Gustafson’s Law provides hope for parallel applications to deliver on their promise. Threading Basics 16 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. 8.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    8 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us