Introduction to Parallel Processing with Python
What is Parallel Computing?

Serial Computing
Source - https://computing.llnl.gov/tutorials/parallel_comp/

Parallel Computing: breaking a problem into multiple pieces and processing the pieces simultaneously on multiple processors

Parallelized Hardware
Nearly all processors now have parallel processing architectures
Eight-core CPUs are now sold to mainstream consumers
Intel® Core™ i7-5960X: $1000 (2014)
AMD Ryzen 2700X: $300 (2018)

HPCs – Built for Parallelization
• HPCs often employ 2-4 server-grade CPUs per node
• 8-16 processor cores per CPU
• Shared memory on each node for all processors
• Distributed memory architecture
• Nodes are connected via a 56-100 Gbps network
• Memory is shared between nodes through an API, most commonly MPI

Global Interpreter Lock
• The Python interpreter is not fully thread-safe.
• To support multi-threaded Python programs, there is a global lock, called the global interpreter lock or GIL, that the current thread must hold before it can safely access Python objects.
• Without the lock, even the simplest operations could cause problems in a multi-threaded program.
• For example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
• Therefore, only one thread executes Python bytecode at a time.

So how can one effectively parallelize their code?
Enter: multiprocessing
Time to switch over to Jupyter Notebook

Installing a Conda Environment for Keras and TensorFlow with Jupyter Support
$ module load python/3.6.1-2-anaconda
$ conda create --name py3.6-multiprocess --clone root
$ source activate py3.6-multiprocess
$ conda install -c conda-forge multiprocess
$ ipython kernel install --user --name py3.6-multiprocess --display-name="Custom"
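
The slides name multiprocessing as the way around the GIL but leave the details to the notebook. As a minimal sketch of the idea, the example below splits a CPU-bound job into chunks and hands each chunk to a separate worker process, so each worker runs in its own interpreter with its own GIL. The workload (a sum of squares) and the function names sum_of_squares, make_chunks, and parallel_sum_of_squares are hypothetical and chosen only for illustration. The sketch assumes the standard-library multiprocessing module, not the multiprocess package installed in the environment above.

```python
import multiprocessing as mp


def sum_of_squares(bounds):
    """CPU-bound work for one chunk: sum i*i for start <= i < stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def make_chunks(n, pieces):
    """Split range(n) into `pieces` contiguous (start, stop) intervals."""
    step, extra = divmod(n, pieces)
    chunks, start = [], 0
    for k in range(pieces):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if k < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(n, workers=4):
    """Map the chunks onto a pool of worker processes and combine the results.

    Each worker is a separate process with its own interpreter and GIL,
    so the chunks can run on different cores at the same time.
    """
    with mp.Pool(processes=workers) as pool:
        partial_sums = pool.map(sum_of_squares, make_chunks(n, workers))
    return sum(partial_sums)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on
    # platforms that start workers by re-importing the main module.
    n = 200_000
    assert parallel_sum_of_squares(n) == sum_of_squares((0, n))
    print("parallel and serial results agree")
```

The worker function is defined at module level because the pool must be able to pickle a reference to it. Inside a Jupyter notebook this can be a limitation on some platforms, which is one reason a notebook-oriented setup might prefer a package such as multiprocess. Any speedup depends on the chunk size being large enough to outweigh the cost of starting processes and passing data between them.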