Fall 2012 FDI Short Courses
Overview

This fall, ARC will offer six two-session short courses through the Virginia Tech Faculty Development Institute (FDI). Courses will generally be held on Monday afternoons from 3:00-4:45 in Torgersen 3060. Course descriptions are below.

Parallel MATLAB (with ICAM)

1] Parallel Matlab at VT and PARFOR (Monday, September 10, 3-4:45pm, Torgersen 3060)

This short course is the first in a two-part series on parallel programming in Matlab. We present an overview of Matlab parallel computing resources at VT and discuss several approaches to parallelism, including: 1) PARallel FOR loops, 2) Single Program Multiple Data constructs, and 3) task computing. We then focus on parfor loops and discuss several examples that illustrate how to identify bottlenecks (candidates for parallelism) and several limitations on the use of parfor. A variety of example codes are available for student use.

2] Parallel Matlab and Single Program Multiple Data (Tuesday, September 11, 3-4:45pm, Torgersen 3060)

This short course is the second in a two-part series on parallel programming in Matlab. It focuses on Single Program Multiple Data (SPMD) constructs. We discuss SPMD workspaces, the scope of variables, and composite arrays. Construction and use of distributed (and codistributed) arrays is covered, along with data exchange among the multiple workers. A variety of example codes are available for student use.

Scientific Programming with Python

1] Scientific Programming with Python I (Monday, September 17, 3-4:45pm, Torgersen 3060)

Python is a high-level, object-oriented, interpreted programming language that combines the power and flexibility of compiled languages with the ease of use of scripting, interpreted languages. Like C++ and Java, but unlike MATLAB, Python provides - in addition to mathematical libraries - a rich selection of libraries, from web and database programming to visualization.
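To give a flavor of that ease of use, here is a minimal sketch (not from the course materials; all names are illustrative) touching the built-in types and control flow the course introduces:

```python
# Core built-in types: lists, tuples, dictionaries, strings, sets.
temps = [68, 71, 64, 70]            # list: mutable sequence
point = (3.0, 4.0)                  # tuple: immutable sequence
course = {"name": "Python I", "room": "Torgersen 3060"}  # dictionary
letters = set("parallel")           # set: unique elements only

# Control flow: a loop with a conditional, building a new list.
warm_days = [t for t in temps if t >= 70]

# Strings carry their own methods.
title = course["name"].upper()
```

Each of these types is a first-class object, so the same iteration and indexing idioms apply across all of them.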
In the first part of this two-part course, we first discuss the essentials of Python programming, including lists, tuples, dictionaries, strings, sets, file objects, and control flow. Then we give an overview of the NumPy (fast, multidimensional arrays) and Matplotlib (2D and 3D plotting) packages. Throughout the course, we use IPython for interactive program development and Matplotlib for plotting.

• Course Slides

2] Scientific Programming with Python II (Tuesday, September 18, 3-4:45pm, Torgersen 3060)

In the second part of this two-part course, we delve into the NumPy, SciPy, and PyMPI packages. First, we discuss NumPy's array objects: n-dimensional arrays, array slicing, indexing, iterators, and arithmetic and comparison operations. Then we describe SciPy, including linear system solvers, LU factorization, matrix functions, special matrices, sparse matrix storage, sparse direct and iterative linear system solvers, and statistical functions. Finally, we show how to use the Message Passing Interface (MPI) from Python. Throughout the course, we use Matplotlib for plotting and IPython for interactive program development.

• Course Slides

Parallel Programming with OpenMP and MPI

1] Shared Memory Parallel Programming in OpenMP (Monday, September 24, 3-4:45pm, Torgersen 3060)

This short course, the first in a two-part series on parallel programming in an HPC environment, will introduce the basics of OpenMP. It will describe and provide examples of parallel programming using OpenMP in a shared memory environment and describe key considerations when choosing and designing an algorithm for this approach. The course is designed for programmers with little or no experience with parallel computing.
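OpenMP itself targets C, C++, and Fortran; as a loose conceptual analogue in Python (illustrative only, not course material), several workers update one shared total, with a lock playing the role of an OpenMP critical section:

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Shared-memory parallel loop with a reduction, sketched in Python:
# each worker sums its own chunk privately, then updates the shared
# total under a lock (the analogue of OpenMP's "critical" section).
total = 0
lock = threading.Lock()

def partial_sum(chunk):
    global total
    s = sum(chunk)      # private work: no sharing, no synchronization
    with lock:          # protected update of the shared variable
        total += s

data = list(range(1000))
chunks = [data[i::4] for i in range(4)]   # split iterations among 4 workers
with ThreadPoolExecutor(max_workers=4) as pool:
    pool.map(partial_sum, chunks)
```

The key design point carries over directly to OpenMP: do as much work as possible in private variables, and keep the protected (serialized) region as small as possible.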
• Course Slides

2] Parallel Programming in MPI (Monday, October 1, 3-4:45pm, Torgersen 3060)

This short course, the second in a two-part series on parallel programming in an HPC environment, will introduce the basics of MPI programming for distributed memory systems, as well as hybrid approaches using MPI and OpenMP. The course will describe key considerations when choosing and designing an algorithm using each of these approaches. It is designed for programmers with little or no experience with parallel computing.

• Course Slides

Visual Computing

1] Deep Media (Monday, October 8, 3-4:45pm, Torgersen 3060)

This class, the first in a two-part series on visual computing, is a high-level exploration of the possibilities for visual communication using a variety of technologies and display venues. Through demonstration and discussion, we will consider Deep Media's applicability to current research and pedagogical challenges. We will examine the various tools and data formats involved in the publishing pipelines for interactive 3D. Participants will critically examine the features and methods of different publishing approaches and will leave with an understanding of the ecology of tools available to translate and publish their content. We will also dive into the authoring process for 3D and 4D environments. From text editors to open-source/freeware tools to commercial packages, participants will examine how to build basic objects and environments with geometry and appearance types, lighting, animation, and interaction, and publish them to the web.

2] High-Performance Visualization (Monday, October 15, 3-4:45pm, Torgersen 3060)

This class, the second in a two-part series on visual computing, will present recent research in human perception and cognition that can inform designers as they make choices about representations of various information types.
This session will introduce two powerful, open-source visualization software packages in use at the cutting edge of science: VisIt and ParaView. We will run through tutorials for each package, including loading data, mapping multiple variables to visual form, and producing high-resolution imagery and movies. We will also examine how new graphics hardware at VT enables such scaling, including the new GPU cluster Athena and the new Visionarium VisCube. Issues and techniques for interactive (and offline) remote rendering and immersive visualization will be demonstrated and discussed.

GPU Programming

1] GPU Programming Using CUDA (Monday, October 22, 3-4:45pm, Torgersen 3060)

The use of graphics processing units (GPUs) has become an increasingly accessible strategy for accelerating scientific calculations. CUDA is a programming language designed to target NVIDIA GPUs such as those on Hokiespeed. This session, the first in a two-part series on CUDA programming, provides a basic introduction to the CUDA programming language and the concepts needed to achieve high performance.

• Course Slides

2] Hybrid Programming Using CUDA, OpenMP and MPI (Monday, October 29, 3-4:45pm, Torgersen 3060)

Supercomputing systems such as Hokiespeed use heterogeneous nodes, with multiple CPU cores and multiple GPUs on each node. To fully harness the capabilities of this setup, hybrid programming strategies are necessary. In this session, the second in a two-part series on CUDA programming, we address hybrid implementations of CUDA that use OpenMP to scale across cores within a node and MPI to scale across nodes.

• Course Slides

Parallel Programming with Intel Cilk Plus

1] Parallel Programming with Intel Cilk Plus 1 (Monday, November 5, 3-4:45pm, Torgersen 3060)

In the multi-core and vector-mathematics era, parallel programming has become ubiquitous. Intel Cilk Plus is an extension of C/C++ that supports task parallelism and data parallelism.
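Cilk Plus expresses task parallelism in C/C++ (via its cilk_spawn and cilk_sync keywords); purely as a conceptual fork-join sketch, not course material, the same idea can be drawn in Python: two subproblems run as independent tasks and are then joined.

```python
from concurrent.futures import ThreadPoolExecutor

# Fork-join task parallelism, sketched in Python (illustrative analogue
# of cilk_spawn / cilk_sync): spawn two subtasks, then join their results.
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

with ThreadPoolExecutor(max_workers=2) as pool:
    left = pool.submit(fib, 10)    # "spawn": may run in parallel with caller
    right = pool.submit(fib, 9)
    result = left.result() + right.result()   # "sync": join both tasks
```

The recursive structure matters: each level of the recursion can spawn further tasks, which is exactly the pattern Cilk's runtime is designed to schedule efficiently.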
In this course, we describe parallel programming patterns that model essential parallel programming strategies and show how to apply these patterns to write effective, hardware-agnostic parallel programs using Cilk Plus. In the first part of this two-part course, we first describe structured parallel programming patterns and then discuss the Cilk Plus constructs for SIMD parallelism (vectorization) and loop parallelization.

2] Parallel Programming with Intel Cilk Plus 2 (Wednesday, November 7, 3-4:45pm, Torgersen 3060)

In the multi-core and vector-mathematics era, parallel programming has become ubiquitous. Intel Cilk Plus is an extension of C/C++ that supports task parallelism and data parallelism. In this course, we describe parallel programming patterns that model essential parallel programming strategies and show how to apply these patterns to write effective, hardware-agnostic parallel programs using Cilk Plus. In the second part of this two-part course, we first describe and illustrate the task parallelism constructs of Cilk Plus, then discuss how Cilk Plus achieves dynamic load balancing using work-stealing, and finally show how Cilk Plus avoids determinacy races using lock-free reducer hyperobjects.
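The idea behind a reducer hyperobject can be sketched without Cilk Plus at all: give each worker its own private accumulator, and combine the partial results in one deterministic final step, so no two workers ever write the same variable. A conceptual Python sketch (illustrative only; Cilk Plus automates this bookkeeping in C/C++):

```python
from concurrent.futures import ThreadPoolExecutor

# Race-free parallel reduction: each worker accumulates into its own
# private partial result; the partials are merged ("reduced") at the end.
# No shared mutable state, hence no locks and no determinacy races.
def local_sum(chunk):
    return sum(chunk)              # private accumulation per worker

data = list(range(10000))
chunks = [data[i::8] for i in range(8)]   # split work among 8 workers
with ThreadPoolExecutor(max_workers=8) as pool:
    partials = list(pool.map(local_sum, chunks))
grand_total = sum(partials)        # deterministic final reduction
```

Because the merge step is associative, the result is the same regardless of how the runtime schedules the workers - which is precisely the guarantee reducer hyperobjects provide.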