Parallel Processing in Python

COSMOS - 1/28/2020
By Joseph Kready

LAYOUT
  • What is Parallel Processing
  • History of Parallel Computation
  • Parallel Processing and Python
  • Google Colab Example

SERIAL PROCESSING VS. PARALLEL PROCESSING
  • Serial processing: one object at a time
  • Parallel processing: multiple objects at a time

CPU ARCHITECTURE

FREQUENCY SCALING
  • Serial computation: frequency scaling was the method used for improving computer performance from the 1980s to the 2000s.
  • Frequency scaling equation: power consumption = capacitance * voltage^2 * frequency
  • The resulting increase in power consumption led to the demise of frequency scaling.

SUPERSCALAR ARCHITECTURE
  • Executes more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different execution units on the processor.

MULTI-CORE PROCESSORS
  • Each CPU is made of independent 'cores' that can access the same memory concurrently.
  • Moore's Law: the number of transistors per processor doubles roughly every 18-24 months, which now shows up as more cores rather than higher clock speeds.
  • Operating systems ensure programs run on available cores, but developers must design their programs to take advantage of parallel processing.

PROCESS VS. THREAD VS. MULTI-THREADING VS. HYPER-THREADING
  • Processes are made of threads.
  • Threads of the same process share memory.
  • Processes run in separate memory.
  • Hyper-threading: allows two threads to be scheduled on one physical CPU core at the same time.
  • Multiple instructions operate on separate data in parallel.

THE GLOBAL INTERPRETER LOCK!!!
  • Global Interpreter Lock (GIL): every thread in a Python process must hold the GIL to execute Python code, so threads execute one at a time.
  • With the GIL, the interpreter is faster in the single-threaded case.
  • It is faster in the multi-threaded case for I/O-bound programs (threads waiting on I/O do not hold the GIL).
  • It is faster in the multi-threaded case for CPU-bound programs that do their compute-intensive work in C libraries, e.g. NumPy.
  • The GIL only becomes a problem when doing CPU-intensive work in pure Python.
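The I/O-bound case from the GIL slide can be sketched with the standard library alone. This is a minimal illustration, not code from the deck: `fake_io_task` is a hypothetical stand-in for a network or disk call, and `time.sleep` is used because a sleeping thread does not hold the GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.1):
    """Hypothetical stand-in for an I/O-bound call; sleeping releases the GIL."""
    time.sleep(delay)
    return task_id * 2


def run_serial(task_ids):
    """Run the tasks one after another in the calling thread."""
    return [fake_io_task(t) for t in task_ids]


def run_threaded(task_ids, max_workers=4):
    """Run the tasks in a thread pool; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_io_task, task_ids))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(4))
    serial_result, serial_time = timed(run_serial, ids)
    threaded_result, threaded_time = timed(run_threaded, ids)
    print(f"serial:   {serial_time:.2f}s")
    print(f"threaded: {threaded_time:.2f}s")
```

With four tasks and four workers, the threaded run should take roughly one delay instead of four. Replacing the sleep with a pure-Python loop would be expected to remove most of that gain, since only one thread can then execute at a time.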
Note: not all implementations of Python use a GIL. Jython, IronPython, and PyPy-STM do not.

YOUR CHOICES

MULTI-THREADING
  • Threads act as sub-tasks of a single process.
  • Threads share the same memory space.
  • Great for background tasks / waiting for asynchronous functions.
  • Can lead to conflicts (race conditions) when writing to the same memory location at the same time.

MULTI-PROCESSING
  • Separate processes act as individual jobs.
  • Processes run in their own memory space.
  • Great for complex calculations / running multiple instances of a whole project.
  • Higher overhead, but more secure.

WHY DON'T WE JUST USE THREADS?

Problem
  • Race conditions: multiple threads reading and writing the same object will cause unexpected results.
  • The operating system schedules threads dynamically. There is no way to guarantee the order in which they run.

Solution
  • Synchronization using a lock: create a threading.Lock and wrap the critical part of your function in 'with lock:', so that only one thread runs that part at a time.
  • This can lead to deadlocks, where the lock is not released properly, or where you call sub-functions that need a lock already held by another thread.

WHY DON'T WE JUST USE PROCESSES?

Problem
  • Serialization using pickle: converting Python objects to byte streams.
  • Individual processes run in separate memory spaces and need a way to communicate. This is done through pickle.
  • Pickle has limitations: for example, it cannot serialize lambdas, nested functions, or classes that are not defined at the top level of a module. Therefore, you must design your functions with pickling in mind.

Solution
  • Serialization using dill.
  • dill extends pickle, allowing you to send a much wider range of classes and functions as byte streams. dill can pickle most Python objects, including lambdas and nested functions.
  • The pathos.multiprocessing library is a fork of Python's multiprocessing that uses dill instead of pickle.
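The pickling limitation described above can be checked directly with the standard `pickle` module. The sketch below is illustrative only; `is_picklable` is a hypothetical helper name, not part of any library.

```python
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different errors for
        # objects that have no importable name or no pickle support.
        return False
    return True


if __name__ == "__main__":
    print(is_picklable({"a": [1, 2, 3]}))   # plain data
    print(is_picklable(len))                # built-in, pickled by reference
    print(is_picklable(lambda x: x + 1))    # lambda has no importable name
```

Plain data and built-in functions pass, while the lambda does not, which is why a function handed to a process pool normally has to be defined at module level. With dill installed, `dill.dumps` offers the same call shape as `pickle.dumps` and is intended to cover cases like the lambda.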
AMDAHL'S LAW
  • The small part of a program that cannot be parallelized will limit the overall speedup.
  • S_latency(s) = 1 / ((1 - p) + p / s)
  • S_latency is the potential speedup in latency of the execution of the whole task.
  • s is the speedup in latency of the execution of the parallelizable part of the task.
  • p is the fraction of the execution time of the whole task that the parallelizable part took before parallelization.

GOOGLE COLAB EXAMPLE
https://colab.research.google.com/drive/1TkjjiIrzq5wE1BF2DbOAqTKhmRgzSYVh
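As a worked illustration of Amdahl's law, the standard form S_latency(s) = 1 / ((1 - p) + p / s) can be evaluated for a few values. The function names below are hypothetical, and treating s as the number of cores assumes the parallel part scales perfectly, which real programs rarely achieve.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the runtime is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def max_speedup(p):
    """Upper bound on the overall speedup as s grows without limit (p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.9  # assume 90% of the runtime can be parallelized
    for cores in (2, 4, 8, 16, 1024):
        print(cores, round(amdahl_speedup(p, cores), 2))
    print("limit", round(max_speedup(p), 2))
```

With p = 0.9 the speedup approaches 10 however many cores are added, because the remaining 10% of serial work dominates the runtime.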
