HPC-AI Competition NAMD Benchmark Guideline

1 About the applications and benchmarks

1.1 About NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. It is based on Charm++ parallel objects. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.

For more information about NAMD, please refer to http://www.ks.uiuc.edu/Research/namd/

1.2 About Charm++

Charm++ is a generalized approach to writing parallel programs; it is an alternative to the likes of MPI, UPC, GA, etc. Charm++ represents:

• The style of writing parallel programs
• The runtime system
• And the entire ecosystem that surrounds it

Its three design principles are Overdecomposition, Migratability, and Asynchrony.

For more information about Charm++, please refer to http://charm.cs.uiuc.edu/research/charm

1.3 About UCX

UCX is a framework (a collection of libraries and interfaces) that provides an efficient and relatively easy way to construct widely used HPC protocols: MPI tag matching, RMA operations, rendezvous protocols, stream, fragmentation, remote atomic operations, etc.

For more information about OpenUCX, please refer to https://www.openucx.org/

1.4 About the STMV benchmark

Developing biomolecular model inputs for Petascale simulations is an extensive intellectual effort in itself, often involving experimental collaboration. By using synthetic benchmarks for performance measurement, James C. Phillips and his team have made a known stable simulation that is freely distributable, allowing others to replicate their work. Two synthetic benchmarks were assembled by replicating a fully solvated 1.06M-atom satellite tobacco mosaic virus (STMV) model with a cubic periodic cell of dimension 216.832 Å. The 20stmv benchmark is a 5 × 2 × 2 grid containing 21M atoms, representing the smaller end of Petascale simulations. The 210stmv benchmark is a 7 × 6 × 5 grid containing 224M atoms, representing the largest NAMD simulations. Both simulations employ a 2 fs timestep, enabled by a rigid water model and constrained lengths for bonds involving hydrogen, and a 12 Å cutoff. PME full electrostatics is evaluated every three steps and pressure control is enabled.

Reported timings of the STMV benchmark cases are the median of the last five NAMD "Benchmark time" outputs, which are generated at 120-step intervals after initial load balancing.
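For reference, that median can be computed from a run log with standard shell tools. The following is only a sketch: the log file name is a placeholder, and the field position assumes the usual "Info: Benchmark time: N CPUs X s/step Y days/ns ..." output format.

# Median of the last five "Benchmark time" s/step values in a NAMD log (placeholder file name).
grep "Info: Benchmark time:" stmv-run.log | tail -n 5 | awk '{print $6}' \
    | sort -g | awk '{v[NR]=$1} END {print "median s/step:", v[int((NR+1)/2)]}'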


2 How to get NAMD codes and dependency files

2.1 Clone the code gits, get tar files

To build the NAMD binary files, source code from multiple dependency projects is required, including:

• Charm++
• NAMD
• UCX
• OpenMPI / Intel MPI

2.1.1 Clone code gits

Charm++ git:

git clone --bare https://github.com/UIUC-PPL/charm.git \
    $HOME/github/charm.git

NAMD git:

git clone --bare https://charm.cs.illinois.edu/gerrit/namd.git \
    $HOME/github/namd.git

2.1.2 Get tar files

FFTW3 tar file:

wget http://www.fftw.org/fftw-3.3.8.tar.gz \
    -O $HOME/code/fftw-3.3.8.tar.gz

HPC-X 2.6 tar file:

wget http://content.mellanox.com/hpc/hpc-x/v2.6/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64.tbz \
    -O $HOME/code/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64.tbz

2.2 Check out Charm++ and NAMD code; untar HPC-X OpenMPI

2.2.1 Charm++

Check Charm++ branches and tags:

CODE_NAME=charm \
GIT_DIR=$HOME/github/$CODE_NAME.git \
bash -c '
git --bare --git-dir=$GIT_DIR \
    fetch --all --prune;
git --bare --git-dir=$GIT_DIR \
    show HEAD FETCH_HEAD --quiet --pretty=format:%H%n%cd;
git --bare --git-dir=$GIT_DIR \
    branch --list --all;
git --bare --git-dir=$GIT_DIR \
    tag
'

Checkout Charm++ v6.10.1 (proven workable) or FETCH_HEAD codes:

CODE_NAME=charm \
CODE_GIT_TAG=FETCH_HEAD \
CODE_GIT_TAG=v6.10.1 \
GIT_DIR=$HOME/github/$CODE_NAME.git \
GIT_WORK_TREE=$HOME/cluster/thor/code \
CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG-$(date +%y-%m-%d) \
bash -c '
git --bare --git-dir=$GIT_DIR \
    fetch --all --prune;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    reset --mixed $CODE_GIT_TAG;
git --bare --git-dir=$GIT_DIR --work-tree=$CODE_DIR \
    clean -fxdn;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    checkout-index --force --all --prefix=$CODE_DIR/
'

Additional optimization options (not limited by this):

Code: Old release version of Charm++ code

2.2.2 NAMD

Check NAMD branches and tags:

CODE_NAME=namd \
GIT_DIR=$HOME/github/$CODE_NAME.git \
CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG \
bash -c '
git --bare --git-dir=$GIT_DIR \
    fetch --all --prune;
git --bare --git-dir=$GIT_DIR \
    show HEAD FETCH_HEAD --quiet --pretty=format:%H%n%cd;
git --bare --git-dir=$GIT_DIR \
    branch --list --all;
git --bare --git-dir=$GIT_DIR \
    tag
'

Checkout NAMD 2.13:

CODE_NAME=namd \
CODE_GIT_TAG=FETCH_HEAD \
GIT_DIR=$HOME/github/$CODE_NAME.git \
GIT_WORK_TREE=$HOME/cluster/thor/code \
CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG-$(date +%y-%m-%d) \
bash -c '
git --bare --git-dir=$GIT_DIR \
    fetch --all --prune;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    reset --mixed $CODE_GIT_TAG;
git --bare --git-dir=$GIT_DIR --work-tree=$CODE_DIR \
    clean -fxdn;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    checkout-index --force --all --prefix=$CODE_DIR/
'

Additional optimization options (not limited by this):

Code: NAMD FETCH_HEAD code

2.2.3 Untar HPC-X 2.6

APP_MPI_PATH=$HOME/cluster/thor/application/mpi \
HPCX_TAR=$HOME/code/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64.tbz \
bash -c ' tar xf $HPCX_TAR -C $APP_MPI_PATH '


3 How to Build NAMD executable files

3.1 Build FFTW3

Build optimized FFTW3 libraries with the GNU C and Intel C compilers:

CODE_NAME=fftw \
CODE_TAG=3.3.8 \
PSXE_DIR=/global/software/centos-7/modules/langs/intel/2020.1.217 \
ICC_DIR=$PSXE_DIR/compilers_and_libraries_2020.1.217 \
INTEL_LICENSE_FILE+=:[email protected] \
CODE_BASE_DIR=$HOME/cluster/thor/code \
CODE_DIR=$CODE_BASE_DIR/$CODE_NAME-$CODE_TAG \
INSTALL_DIR=$HOME/cluster/thor/application/libs/fftw \
CMAKE_PATH=/global/software/centos-7/modules/tools/cmake/3.16.4/bin/cmake \
GCC_PATH=/global/software/centos-7/modules/langs/gcc/8.4.0/bin/gcc \
ICC_PATH=$ICC_DIR/linux/bin/intel64/icc \
NATIVE_GCC_FLAGS='"-march=native -mtune=native -mavx2 -msse4.2 -O3 -DNDEBUG"' \
GCC_FLAGS='"-march=broadwell -mtune=broadwell -mavx2 -msse4.2 -O3 -DNDEBUG"' \
ICC_FLAGS='"-xBROADWELL -axBROADWELL,CORE-AVX2,SSE4.2 -O3 -DNDEBUG"' \
bash -c '

CMD_REBUILD_CODE_DIR="rm -fr $CODE_DIR \
    && tar xf $HOME/code/$CODE_NAME-$CODE_TAG.tar.gz -C $CODE_BASE_DIR"

### To build shared (single precision) with GNU Compiler
BUILD_LABEL=$CODE_TAG-shared-gcc840-avx2-broadwell \
CMD_BUILD_SHARED_GCC=" \
    mkdir $CODE_DIR/build-$BUILD_LABEL; \
    cd $CODE_DIR/build-$BUILD_LABEL \
    && $CMAKE_PATH .. \
        -DBUILD_SHARED_LIBS=ON -DENABLE_FLOAT=ON \
        -DENABLE_OPENMP=OFF -DENABLE_THREADS=OFF \
        -DCMAKE_C_COMPILER=$GCC_PATH -DCMAKE_CXX_COMPILER=$GCC_PATH \
        -DENABLE_AVX2=ON -DENABLE_AVX=ON \
        -DENABLE_SSE2=ON -DENABLE_SSE=ON \
        -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR/$BUILD_LABEL \
        -DCMAKE_C_FLAGS_RELEASE=$GCC_FLAGS \
        -DCMAKE_CXX_FLAGS_RELEASE=$GCC_FLAGS \
    && time -p make VERBOSE=1 V=1 install -j \
    && cd $INSTALL_DIR/$BUILD_LABEL && ln -s lib64 lib | tee $BUILD_LABEL.log "

### To build shared library (single precision) with Intel C Compiler
BUILD_LABEL=$CODE_TAG-shared-icc20-avx2-broadwell \
CMD_BUILD_SHARED_ICC=" \
    mkdir $CODE_DIR/build-$BUILD_LABEL; \
    cd $CODE_DIR/build-$BUILD_LABEL \
    && $CMAKE_PATH .. \
        -DBUILD_SHARED_LIBS=ON -DENABLE_FLOAT=ON \
        -DENABLE_OPENMP=OFF -DENABLE_THREADS=OFF \
        -DCMAKE_C_COMPILER=$ICC_PATH -DCMAKE_CXX_COMPILER=$ICC_PATH \
        -DENABLE_AVX2=ON -DENABLE_AVX=ON \
        -DENABLE_SSE2=ON -DENABLE_SSE=ON \
        -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR/$BUILD_LABEL \
        -DCMAKE_C_FLAGS_RELEASE=$ICC_FLAGS \
        -DCMAKE_CXX_FLAGS_RELEASE=$ICC_FLAGS \
    && time -p make VERBOSE=1 V=1 install -j \
    && cd $INSTALL_DIR/$BUILD_LABEL && ln -s lib64 lib | tee $BUILD_LABEL.log "

eval $CMD_REBUILD_CODE_DIR;
eval $CMD_BUILD_SHARED_GCC &
eval $CMD_BUILD_SHARED_ICC &
wait
echo $CMD_REBUILD_CODE_DIR;
echo $CMD_BUILD_SHARED_GCC
echo $CMD_BUILD_SHARED_ICC

' | tee fftw3buildlog 2>&1

Run a small FFTW benchmark with and without SIMD:

[pengzhiz@thor001 fftw-3.3.8]$ ./build-3.3.8-shared-icc20-avx2-broadwell/bench -o patient -o nosimd 10240
Problem: 10240, setup: 5.07 s, time: 91.15 us, ``mflops'': 7483.2
[pengzhiz@thor001 fftw-3.3.8]$ ./build-3.3.8-shared-gcc840-avx2-broadwell/bench -o patient -o nosimd 10240
Problem: 10240, setup: 5.29 s, time: 93.61 us, ``mflops'': 7286.5
[pengzhiz@thor001 fftw-3.3.8]$ ./build-3.3.8-shared-icc20-avx2-broadwell/bench -o patient 10240
Problem: 10240, setup: 8.97 s, time: 18.33 us, ``mflops'': 37219
[pengzhiz@thor001 fftw-3.3.8]$ ./build-3.3.8-shared-gcc840-avx2-broadwell/bench -o patient 10240
Problem: 10240, setup: 8.84 s, time: 18.26 us, ``mflops'': 37362

Additional optimization options (not limited by this):

SIMD: avx avx2 avx512 SSE4.2 FMA

3.2 Build Charm++

CODE_NAME=charm \
CODE_GIT_TAG=FETCH_HEAD \
CODE_GIT_TAG=v6.10.1 \
GIT_DIR=$HOME/github/$CODE_NAME.git \
GIT_WORK_TREE=$HOME/cluster/thor/code \
CHARM_CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG-$(date +%y-%m-%d) \
CHARM_DIR=$CHARM_CODE_DIR \
APP_MPI_PATH=$HOME/cluster/thor/application/mpi \
HPCX_FILES_DIR=$APP_MPI_PATH/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64 \
HPCX_MPI_DIR=$HPCX_FILES_DIR/ompi \
HPCX_UCX_DIR=$HPCX_FILES_DIR/ucx \
UCX_DIR=$SELF_BUILT_DIR \
UCX_DIR=$HPCX_UCX_DIR \
GCC_DIR=/global/software/centos-7/modules/langs/gcc/8.4.0/bin \
NATIVE_GCC_FLAGS="-march=native -mtune=native -mavx2 -msse4.2 -O3 -DNDEBUG" \
GCC_FLAGS="-static-libstdc++ -static-libgcc -march=broadwell -mtune=broadwell -mavx2 -msse4.2 -O3 -DNDEBUG" \
ICC_FLAGS="-static-intel -xBROADWELL -axBROADWELL,CORE-AVX2,SSE4.2 -O3 -DNDEBUG" \
PSXE_DIR=/global/software/centos-7/modules/langs/intel/2020.1.217 \
INTEL_LICENSE_FILE+=:[email protected] \
INTEL_COMPILER_DIR=$PSXE_DIR/compilers_and_libraries_2020.1.217/linux/bin \
bash -c '

CMD_REBUILD_BUILD_DIR="rm -fr $CHARM_DIR/built && mkdir $CHARM_DIR/built;"

### To build UCX with HPC-X OpenMPI + GCC8.4.0
CMD_BUILD_UCX_CHARM_GCC="
    module purge && module load gcc/8.4.0 \
    && cd $CHARM_DIR/built \
    && time -p ../build charm++ ucx-linux-x86_64 ompipmix \
        -j --with-production \
        --basedir=$HPCX_MPI_DIR \
        --basedir=$UCX_DIR \
        gcc gfortran $GCC_FLAGS \
    && module purge;"

### To build MPI executables with HPC-X OpenMPI + GCC8.4.0
CMD_BUILD_MPI_CHARM_GCC="
    module purge && module load gcc/8.4.0 \
    && . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh \
    && hpcx_load \
    && cd $CHARM_DIR/built \
    && time -p ../build charm++ mpi-linux-x86_64 \
        -j --with-production \
        --basedir=$HPCX_MPI_DIR \
        gcc gfortran $GCC_FLAGS \
    && hpcx_unload && module purge;"

### To build UCX executables with HPC-X OpenMPI + ICC20u1
CMD_BUILD_UCX_CHARM_ICC="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && cd $CHARM_DIR/built \
    && time -p ../build charm++ ucx-linux-x86_64 ompipmix \
        -j --with-production \
        --basedir=$HPCX_MPI_DIR \
        --basedir=$UCX_DIR \
        icc ifort $ICC_FLAGS;"

### To build MPI executables with HPC-X OpenMPI + ICC20u1
CMD_BUILD_MPI_CHARM_ICC="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh \
    && hpcx_load \
    && cd $CHARM_DIR/built \
    && time -p ../build charm++ mpi-linux-x86_64 \
        -j --with-production \
        --basedir=$HPCX_MPI_DIR \
        icc ifort $ICC_FLAGS \
    && hpcx_unload;"

eval $CMD_REBUILD_BUILD_DIR;
eval $CMD_BUILD_UCX_CHARM_GCC &
eval $CMD_BUILD_MPI_CHARM_GCC &
eval $CMD_BUILD_UCX_CHARM_ICC &
eval $CMD_BUILD_MPI_CHARM_ICC &
wait
echo $CMD_REBUILD_BUILD_DIR;
echo $CMD_BUILD_UCX_CHARM_GCC;
echo $CMD_BUILD_MPI_CHARM_GCC;
echo $CMD_BUILD_UCX_CHARM_ICC;
echo $CMD_BUILD_MPI_CHARM_ICC;

' | tee charmbuildlog 2>&1

Additional optimization options (not limited by this):

Targets: mpi-linux-x86_64 ucx-linux-x86_64 ompipmix
Compiler: icc ifort
SIMD: avx avx2 avx512


3.3 Build NAMD

CHARM_ARCH_UCX_GCC=ucx-linux-x86_64-gfortran-ompipmix-gcc \
CHARM_ARCH_UCX_ICC=ucx-linux-x86_64-ifort-ompipmix-icc \
CHARM_ARCH_MPI_GCC=mpi-linux-x86_64-gfortran-gcc \
CHARM_ARCH_MPI_ICC=mpi-linux-x86_64-ifort-icc \
CODE_NAME=charm \
CODE_GIT_TAG=FETCH_HEAD \
CODE_GIT_TAG=v6.10.1 \
GIT_WORK_TREE=$HOME/cluster/thor/code \
CHARM_CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG-$(date +%y-%m-%d) \
CHARM_BASE=$CHARM_CODE_DIR/built \
FFTW3_LIB_DIR=$HOME/cluster/thor/application/libs/fftw \
GCC_FFTW3_LIB_DIR=$FFTW3_LIB_DIR/3.3.8-shared-gcc840-avx2-broadwell \
ICC_FFTW3_LIB_DIR=$FFTW3_LIB_DIR/3.3.8-shared-icc20-avx2-broadwell \
APP_MPI_DIR=$HOME/cluster/thor/application/mpi \
HPCX_FILES_DIR=$APP_MPI_DIR/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64 \
PSXE_DIR=/global/software/centos-7/modules/langs/intel/2020.1.217 \
INTEL_LICENSE_FILE+=:[email protected] \
INTEL_COMPILER_DIR=$PSXE_DIR/compilers_and_libraries_2020.1.217/linux/bin \
MKL_DIR=$PSXE_DIR/compilers_and_libraries_2020.1.217/linux/mkl \
ICC_DIR=$PSXE_DIR/compilers_and_libraries_2020.1.217 \
ICC_PATH="$INTEL_COMPILER_DIR/intel64/icc" \
ICPC_PATH='"$INTEL_COMPILER_DIR/intel64/icpc -std=c++11"' \
ICC_FLAGS='"-static-intel -xBROADWELL -axBROADWELL,CORE-AVX2,SSE4.2 -O3 -DNDEBUG"' \
GCC_DIR=/global/software/centos-7/modules/langs/gcc/8.4.0 \
GCC_PATH='"$GCC_DIR/bin/gcc "' \
GXX_PATH='"$GCC_DIR/bin/g++ -std=c++0x"' \
NATIVE_GCC_FLAGS='"-static-libstdc++ -static-libgcc -march=native -mtune=native -mavx2 -msse4.2 -O3 -DNDEBUG"' \
GCC_FLAGS='"-static-libstdc++ -static-libgcc -march=broadwell -mtune=broadwell -mavx2 -msse4.2 -O3 -DNDEBUG"' \
CODE_NAME=namd \
CODE_GIT_TAG=FETCH_HEAD \
GIT_DIR=$HOME/github/$CODE_NAME.git \
GIT_WORK_TREE=$HOME/cluster/thor/code \
NAMD_CODE_DIR=$GIT_WORK_TREE/$CODE_NAME-$CODE_GIT_TAG-$(date +%y-%m-%d) \
NAMD_DIR=$NAMD_CODE_DIR \
bash -c '
cd $NAMD_DIR;

### To build NAMD with Charm++ HPC-X UCX + GCC8.4.0 + FFTW3
CMD_BUILD_UCX_NAMD_GCC_FFTW3="
    PATH=$GCC_DIR/bin:$PATH \
    module purge && module load gcc/8.4.0 && \
    ./config Linux-x86_64-g++ --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_UCX_GCC \
        --with-fftw3 --fftw-prefix $GCC_FFTW3_LIB_DIR \
        --cc $GCC_PATH --cc-opts $GCC_FLAGS \
        --cxx $GXX_PATH --cxx-opts $GCC_FLAGS \
    && cd Linux-x86_64-g++ && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-g++ Linux-x86_64-g++-ucx-fftw3 \
    && module purge"

### To build NAMD with Charm++ HPC-X UCX + GCC8.4.0 + MKL
CMD_BUILD_UCX_NAMD_GCC_MKL="
    PATH=$GCC_DIR/bin:$PATH \
    module purge && module load gcc/8.4.0 && \
    ./config Linux-x86_64-g++ --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_UCX_GCC \
        --with-mkl --mkl-prefix $MKL_DIR \
        --cc $GCC_PATH --cc-opts $GCC_FLAGS \
        --cxx $GXX_PATH --cxx-opts $GCC_FLAGS \
    && cd Linux-x86_64-g++ && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-g++ Linux-x86_64-g++-ucx-mkl \
    && module purge"

### To build NAMD with Charm++ HPC-X OpenMPI + GCC8.4.0 + FFTW3
CMD_BUILD_MPI_NAMD_GCC_FFTW3="
    module purge && module load gcc/8.4.0 && \
    . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh && hpcx_load \
    && PATH=$GCC_DIR/bin:$PATH \
    ./config Linux-x86_64-g++ --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_MPI_GCC \
        --with-fftw3 --fftw-prefix $GCC_FFTW3_LIB_DIR \
        --cc $GCC_PATH --cc-opts $GCC_FLAGS \
        --cxx $GXX_PATH --cxx-opts $GCC_FLAGS \
    && cd Linux-x86_64-g++ && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-g++ Linux-x86_64-g++-mpi-fftw3 \
    && hpcx_unload && module purge"

### To build NAMD with Charm++ HPC-X OpenMPI + GCC8.4.0 + MKL
CMD_BUILD_MPI_NAMD_GCC_MKL="
    module purge && module load gcc/8.4.0 && \
    . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh && hpcx_load \
    && PATH=$GCC_DIR/bin:$PATH \
    ./config Linux-x86_64-g++ --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_MPI_GCC \
        --with-mkl --mkl-prefix $MKL_DIR \
        --cc $GCC_PATH --cc-opts $GCC_FLAGS \
        --cxx $GXX_PATH --cxx-opts $GCC_FLAGS \
    && cd Linux-x86_64-g++ && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-g++ Linux-x86_64-g++-mpi-mkl \
    && hpcx_unload && module purge"

### To build NAMD with Charm++ HPC-X UCX + ICC20u1 + FFTW3
CMD_BUILD_UCX_NAMD_ICC_FFTW3="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && ./config Linux-x86_64-icc --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_UCX_ICC \
        --with-fftw3 --fftw-prefix $ICC_FFTW3_LIB_DIR \
        --cc $ICC_PATH --cc-opts $ICC_FLAGS \
        --cxx $ICPC_PATH --cxx-opts $ICC_FLAGS \
    && cd Linux-x86_64-icc && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-icc Linux-x86_64-icc-ucx-fftw3;"

### To build NAMD with Charm++ HPC-X UCX + ICC20u1 + MKL
CMD_BUILD_UCX_NAMD_ICC_MKL="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && ./config Linux-x86_64-icc --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_UCX_ICC \
        --with-mkl --mkl-prefix $MKL_DIR \
        --cc $ICC_PATH --cc-opts $ICC_FLAGS \
        --cxx $ICPC_PATH --cxx-opts $ICC_FLAGS \
    && cd Linux-x86_64-icc && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-icc Linux-x86_64-icc-ucx-mkl;"

### To build NAMD with Charm++ HPC-X OpenMPI + ICC20u1 + FFTW3
CMD_BUILD_MPI_NAMD_ICC_FFTW3="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh && hpcx_load \
    && ./config Linux-x86_64-icc --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_MPI_ICC \
        --with-fftw3 --fftw-prefix $ICC_FFTW3_LIB_DIR \
        --cc $ICC_PATH --cc-opts $ICC_FLAGS \
        --cxx $ICPC_PATH --cxx-opts $ICC_FLAGS \
    && cd Linux-x86_64-icc && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-icc Linux-x86_64-icc-mpi-fftw3 \
    && hpcx_unload"

### To build NAMD with Charm++ HPC-X OpenMPI + ICC20u1 + MKL
CMD_BUILD_MPI_NAMD_ICC_MKL="
    . $INTEL_COMPILER_DIR/compilervars.sh -arch intel64 -platform linux \
    && . $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh && hpcx_load \
    && ./config Linux-x86_64-icc --with-memopt \
        --charm-base $CHARM_BASE --charm-arch $CHARM_ARCH_MPI_ICC \
        --with-mkl --mkl-prefix $MKL_DIR \
        --cc $ICC_PATH --cc-opts $ICC_FLAGS \
        --cxx $ICPC_PATH --cxx-opts $ICC_FLAGS \
    && cd Linux-x86_64-icc && time -p make -j \
    && cd $NAMD_DIR && mv Linux-x86_64-icc Linux-x86_64-icc-mpi-mkl \
    && hpcx_unload"

eval $CMD_BUILD_UCX_NAMD_GCC_FFTW3
eval $CMD_BUILD_MPI_NAMD_GCC_FFTW3
eval $CMD_BUILD_UCX_NAMD_ICC_FFTW3
eval $CMD_BUILD_MPI_NAMD_ICC_FFTW3
eval $CMD_BUILD_UCX_NAMD_GCC_MKL;
eval $CMD_BUILD_MPI_NAMD_GCC_MKL;
eval $CMD_BUILD_UCX_NAMD_ICC_MKL;
eval $CMD_BUILD_MPI_NAMD_ICC_MKL;
wait
echo $CMD_BUILD_UCX_NAMD_GCC_FFTW3
echo $CMD_BUILD_MPI_NAMD_GCC_FFTW3
echo $CMD_BUILD_UCX_NAMD_ICC_FFTW3
echo $CMD_BUILD_MPI_NAMD_ICC_FFTW3
echo $CMD_BUILD_UCX_NAMD_GCC_MKL;
echo $CMD_BUILD_MPI_NAMD_GCC_MKL;
echo $CMD_BUILD_UCX_NAMD_ICC_MKL;
echo $CMD_BUILD_MPI_NAMD_ICC_MKL;

' | tee namdbuildlog 2>&1

Additional optimization options (not limited by this):

SIMD: avx avx2 avx512

3.4 Check the NAMD executable files

Carefully check the shared library dependencies of the built executables.

$ ldd */namd2 | grep -e mkl -e fft -e libuc -e mpi -e libopen
Linux-x86_64-g++-mpi-fftw3/namd2:
    libfftw3f.so.3.5.7 => not found
    libmpi.so.40 => not found
Linux-x86_64-g++-mpi-mkl/namd2:
    libmkl_intel_lp64.so => not found
    libmkl_sequential.so => not found
    libmkl_core.so => not found
    libmpi.so.40 => not found
Linux-x86_64-g++-ucx-fftw3/namd2:
    libucp.so.0 => not found
    libuct.so.0 => not found
    libucs.so.0 => not found
    libucm.so.0 => not found
    libopen-pal.so.40 => not found
    libopen-rte.so.40 => not found
    libfftw3f.so.3.5.7 => not found
Linux-x86_64-g++-ucx-mkl/namd2:
    libucp.so.0 => not found
    libuct.so.0 => not found
    libucs.so.0 => not found
    libucm.so.0 => not found
    libopen-pal.so.40 => not found
    libopen-rte.so.40 => not found
    libmkl_intel_lp64.so => not found
    libmkl_sequential.so => not found
    libmkl_core.so => not found
Linux-x86_64-icc-mpi-fftw3/namd2:
    Linux-x86_64-icc-mpi-fftw3/namd2: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by Linux-x86_64-icc-mpi-fftw3/namd2)
    Linux-x86_64-icc-mpi-fftw3/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by Linux-x86_64-icc-mpi-fftw3/namd2)
    Linux-x86_64-icc-mpi-fftw3/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by Linux-x86_64-icc-mpi-fftw3/namd2)
    libfftw3f.so.3.5.7 => not found
    libmpi.so.40 => not found
Linux-x86_64-icc-mpi-mkl/namd2:
    Linux-x86_64-icc-mpi-mkl/namd2: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by Linux-x86_64-icc-mpi-mkl/namd2)
    Linux-x86_64-icc-mpi-mkl/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by Linux-x86_64-icc-mpi-mkl/namd2)
    Linux-x86_64-icc-mpi-mkl/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by Linux-x86_64-icc-mpi-mkl/namd2)
    libmkl_intel_lp64.so => not found
    libmkl_sequential.so => not found
    libmkl_core.so => not found
    libmpi.so.40 => not found
Linux-x86_64-icc-ucx-fftw3/namd2:
    Linux-x86_64-icc-ucx-fftw3/namd2: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by Linux-x86_64-icc-ucx-fftw3/namd2)
    Linux-x86_64-icc-ucx-fftw3/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by Linux-x86_64-icc-ucx-fftw3/namd2)
    Linux-x86_64-icc-ucx-fftw3/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by Linux-x86_64-icc-ucx-fftw3/namd2)
    libucp.so.0 => not found
    libuct.so.0 => not found
    libucs.so.0 => not found
    libucm.so.0 => not found
    libopen-pal.so.40 => not found
    libopen-rte.so.40 => not found
    libfftw3f.so.3.5.7 => not found
Linux-x86_64-icc-ucx-mkl/namd2:
    Linux-x86_64-icc-ucx-mkl/namd2: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by Linux-x86_64-icc-ucx-mkl/namd2)
    Linux-x86_64-icc-ucx-mkl/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by Linux-x86_64-icc-ucx-mkl/namd2)
    Linux-x86_64-icc-ucx-mkl/namd2: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by Linux-x86_64-icc-ucx-mkl/namd2)
    libucp.so.0 => not found
    libuct.so.0 => not found
    libucs.so.0 => not found
    libucm.so.0 => not found
    libopen-pal.so.40 => not found
    libopen-rte.so.40 => not found
    libmkl_intel_lp64.so => not found
    libmkl_sequential.so => not found
    libmkl_core.so => not found

Because "-static-intel" and "-static-libstdc++ -static-libgcc" were used when building and linking namd2, no GNU or Intel compiler runtime libraries need to be loaded to run the built executables. The MKL, FFTW, UCX, and MPI shared libraries are still needed at runtime because these components are not statically linked into the generated namd2 executable.

Note that at runtime, some of the shared libraries (in the above case, the OpenMPI and UCX libraries) depend on newer versions of libstdc++ and libgcc even though the main body of the project is statically built. In this case, newer releases of these libraries must be loaded at runtime to provide an appropriate environment for such shared libraries. A very simple way to do this is to load the relevant environment module files before running the application, because a module load prepares and exports the LD_LIBRARY_PATH variable for the running bash shell and its child processes. The command is as follows; the module name needs to be adjusted to the configuration of your cluster:

module load gcc/8.4.0

For performance results to be reproducible, the runtime library environment, such as LD_LIBRARY_PATH, must be clearly documented. Alternatively, build every component of the final executable statically.

More information about static builds of libraries such as MKL, FFTW, and MPI can be found in the software vendors' user manuals.


3.5 Optional operations

3.5.1 Build UCX from the latest master source code on GitHub

Clone UCX git:

git clone --bare https://github.com/openucx/ucx.git \
    $HOME/github/ucx.git

Check UCX branches and tags:

GIT_DIR=$HOME/github/ucx.git \
GIT_WORK_TREE=$HOME/github/ucx \
bash -c '
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    branch --list --all;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    tag
'

Fetch and checkout UCX master:

GIT_DIR=$HOME/github/ucx.git \
GIT_WORK_TREE=$HOME/cluster/thor/code \
CODE_DIR=$GIT_WORK_TREE/FETCH_HEAD \
UCX_GIT_TAG=FETCH_HEAD \
bash -c '
git --bare --git-dir=$GIT_DIR \
    fetch --all --prune;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    reset --mixed $UCX_GIT_TAG;
git --bare --git-dir=$GIT_DIR --work-tree=$CODE_DIR \
    clean -fxdn;
git --bare --git-dir=$GIT_DIR --work-tree=$GIT_WORK_TREE \
    checkout-index --force --all --prefix=$CODE_DIR/
'

Build UCX master code:

CODE_PATH=$HOME/cluster/thor/code/ \
UCX_CODE_PATH=$CODE_PATH/ucx170 \
UCX_LIB_PATH=$HOME/cluster/thor/application/libs/ucx \
bash -c 'cd $UCX_CODE_PATH;
./autogen.sh
./configure CC=gcc CXX=gcc \
    --disable-logging --disable-debug --disable-assertions \
    --disable-params-check --enable-devel-headers \
    --without-java --with-knem --enable-mt --with-avx --with-march=avx2 \
    --prefix=$UCX_LIB_PATH/FETCH_HEAD-gcc485-mt-avx
time -p make -j install;
make clean;
' | tee ucx170buildlog 2>&1

Additional optimization options (not limited by this):

Compiler: icc/gcc
Build type: static/shared build
SIMD: avx avx2 avx512

4 How to run NAMD simulations and benchmarks

4.1 Download the example simulations

mkdir $HOME/benchmarks; cd $HOME/benchmarks


wget https://www.ks.uiuc.edu/Research/namd/utilities/apoa1.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/f1atpase.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv_sc14.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/tiny.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/ramd-5-examples.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/ramd-4.1-examples.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/bpti_imd.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/er-gre.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/alanin.tar.gz
wget https://www.ks.uiuc.edu/Research/namd/utilities/tclforces.tar.gz
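After downloading, the archives can be unpacked in place. A minimal sketch, assuming everything was downloaded into $HOME/benchmarks:

# Unpack all downloaded benchmark archives; each extracts into its own subdirectory (e.g. apoa1/).
cd $HOME/benchmarks
for f in *.tar.gz; do
    tar xzf "$f"
done
ls -d */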

4.2 Run in batch files

The most advisable way to run NAMD on an HPC cluster is to use a batch script. Here is a SLURM script for running the NAMD apoa1 simulation.

[pengzhiz@login01 ~]$ cat run.sh
#!/bin/bash
#SBATCH -J NAMD_apoa1
#SBATCH -N 8
#SBATCH --tasks-per-node=32
#SBATCH -o namd-apoa1-8n256T-%j.out
#SBATCH -t 01:00:00
#SBATCH -p thor
#SBATCH --exclusive
#SBATCH -d singleton

cd $SLURM_SUBMIT_DIR

APP_MPI_PATH=$HOME/cluster/thor/application/mpi
HPCX_FILES_DIR=$APP_MPI_PATH/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64
HPCX_MPI_DIR=$HPCX_FILES_DIR/ompi
PSXE_PATH=/global/software/centos-7/modules/langs/intel/2020.0.166
MKL_PATH=$PSXE_PATH/compilers_and_libraries_2020.0.166/linux/mkl
NAMD_DIR=$HOME/cluster/thor/code/namd$(date +%y-%m-%d)
UCX_NAMD_DIR=$NAMD_DIR/Linux-x86_64-g++-ucx
BENCHMARK_DIR=$HOME/benchmarks/
BENCHMARK_INPUT=$BENCHMARK_DIR/apoa1/apoa1.namd

. $MKL_PATH/bin/mklvars.sh intel64;
. $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh;
hpcx_load;

echo "Running on $SLURM_JOB_NODELIST" echo "Nnodes = $SLURM_JOB_NUM_NODES" echo "Ntasks = $SLURM_NTASKS" echo "Launch command: time -p $HPCX_MPI_DIR/bin/mpirun -np $SLURM_NTASKS -- map-by core -report-bindings -x UCX_NET_DEVICES=mlx5_0:1,mlx5_1:1 $UCX_NAMD_DIR/namd2 $BENCHMARK_INPUT"

time -p $HPCX_MPI_DIR/bin/mpirun -np $SLURM_NTASKS --map-by core -report-bindings \
    -x UCX_NET_DEVICES=mlx5_0:1,mlx5_1:1 $UCX_NAMD_DIR/namd2 $BENCHMARK_INPUT
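The script is then submitted to SLURM in the usual way; for example:

# Submit the batch script; output lands in namd-apoa1-8n256T-<jobid>.out as set by #SBATCH -o.
sbatch run.sh
squeue -u $USER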

4.3 Run in a "bash CLI"

On a SLURM-managed cluster, it is strongly recommended (and sometimes mandatory) to run applications by submitting batch scripts with "sbatch" to the cluster manager. If a "bash command line interface" style of execution is still preferred, here is a rough, runnable example that loops small NAMD simulations over different NAMD target binaries.

for NAMD_TARGET in Linux-x86_64-icc-ucx Linux-x86_64-icc-mpi; do
for NAMD_CASE in apoa1 stmv tiny; do
for NUM_OF_NODES in 1 2 3 4; do
NAMD_CASE=${NAMD_CASE} \
NAMD_TARGET=${NAMD_TARGET} \
PSXE_PATH=/global/software/centos-7/modules/langs/intel/2020.0.166 \
MKL_PATH=$PSXE_PATH/compilers_and_libraries_2020.0.166/linux/mkl \
HPCX_FILES_DIR=~/cluster/helios/application/mpi/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64/ \
MPI_OPT="\
    --report-bindings \
    --map-by core --rank-by core --bind-to core \
    -mca io ompio -x UCX_NET_DEVICES=mlx5_0:1,mlx5_2:1" \
MPI_EXE=\$HPCX_MPI_DIR/bin/mpirun \
APP_EXE=/global/home/users/pengzhiz/cluster/helios/code/namd20-03-27/${NAMD_TARGET}/namd2 \
APP_OPT=/global/home/users/pengzhiz/benchmarks/namd/allinone/${NAMD_CASE}.mod.namd \
CMD="$MPI_EXE $MPI_OPT $APP_EXE $APP_OPT" \
bash -c 'echo "#!/bin/bash
. $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh;
. $MKL_PATH/bin/mklvars.sh intel64;
hpcx_load;
$CMD 2>&1 | tee ~/ascii/\${SLURM_JOB_PARTITION}-namd/${NAMD_CASE}-${NAMD_TARGET}-\${SLURM_JOB_PARTITION}-\${SLURM_JOB_NUM_NODES}.\${SLURM_JOB_ID}.output
"' | \
sbatch \
    --partition=helios \
    --output ${NAMD_CASE}-${NAMD_TARGET}-%j.out \
    --open-mode=truncate \
    --exclusive --requeue \
    --job-name=mpiceshi \
    --nodes=${NUM_OF_NODES}-${NUM_OF_NODES} \
    --sockets-per-node=2 \
    --cores-per-socket=20 \
    --threads-per-core=1 \
    --extra-node-info=2-2:10-10:1-1 \
    --mincpus=20 \
    --cpus-per-task=1 \
    --ntasks-per-core=1 \
    --ntasks-per-node=40 \
    --ntasks-per-socket=20

done; done; done;

4.4 Run the simulations using UCX NAMD + HPC-X OpenMPI

APP_MPI_PATH=$HOME/cluster/thor/application/mpi \
HPCX_FILES_DIR=$APP_MPI_PATH/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-redhat7.7-x86_64 \
HPCX_MPI_DIR=$HPCX_FILES_DIR/ompi \
PSXE_PATH=/global/software/centos-7/modules/langs/intel/2020.0.166 \
MKL_PATH=$PSXE_PATH/compilers_and_libraries_2020.0.166/linux/mkl \
NAMD_DIR=$HOME/cluster/thor/code/namd$(date +%y-%m-%d) \
UCX_NAMD_DIR=$NAMD_DIR/Linux-x86_64-g++-ucx \
BENCHMARK_DIR=$HOME/benchmarks/ \
BENCHMARK_INPUT=$BENCHMARK_DIR/apoa1/apoa1.namd \
bash -c '
cd $NAMD_DIR;
. $MKL_PATH/bin/mklvars.sh intel64;
. $HPCX_FILES_DIR/hpcx-mt-init-ompi.sh;
hpcx_load;
time -p $HPCX_MPI_DIR/bin/mpirun -n $(nproc) \
    $UCX_NAMD_DIR/namd2 $BENCHMARK_INPUT
'

Output:

Info: Benchmark time: 32 CPUs 0.0247267 s/step 0.286189 days/ns 761.227 MB memory
TIMING: 500 CPU: 13.6592, 0.0245156/step Wall: 13.668, 0.0245319/step, 0 hours remaining, 761.226562 MB of memory in use.
======WallClock: 14.876899 CPUTime: 14.818383 Memory: 762.234375 MB
[Partition 0][Node 0] End of program
real 16.50
user 426.86
sys 84.39

Info: Benchmark time: 20 CPUs 0.0387181 s/step 0.448126 days/ns 746.914 MB memory
TIMING: 500 CPU: 20.3969, 0.0385311/step Wall: 20.4048, 0.0385447/step, 0 hours remaining, 746.914062 MB of memory in use.

4.5 How to evaluate the simulation benchmarks

As the "numsteps 500" and "outputtiming 20" settings in apoa1.namd indicate, the apoa1 example case finishes after 500 simulation steps, and the days/ns performance is printed every 20 steps.

At the end of the simulation, the timing information for the whole benchmark run is printed to the runtime log.

In the apoa1 example above, the untuned HPC-X OpenMPI run finished in 14.876899 seconds, while the wall time for the same simulation launched with charmrun was 15.038658 seconds.

For both the WallClock and days/ns (days per nanosecond) performance metrics, lower values are better.
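These values can be pulled out of the run log with standard shell tools. The following is only a sketch: the log file name is a placeholder, and the awk field positions assume the "Info: Benchmark time: N CPUs X s/step Y days/ns ..." format shown above.

# Print the final WallClock line and the last five days/ns and s/step readings from a NAMD log.
LOG=namd-apoa1-8n256T-12345.out       # placeholder log file name
grep "WallClock:" "$LOG" | tail -n 1
grep "Info: Benchmark time:" "$LOG" | tail -n 5 | awk '{print $8, "days/ns,", $6, "s/step"}'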

4.6 About configuration and evaluation of the stmv_sc14 benchmark cases

For the STMV benchmark cases, including 1stmv2fs.namd, 20stmv2fs.namd, and 210stmv2fs.namd, full information on how to run them, how to evaluate them, and the memopt requirements is given at the STMV link below:

https://www.ks.uiuc.edu/Research/namd/utilities/stmv_sc14/

When tuning code performance, there is no need to run an overly long benchmark. Adjust numsteps to a small value such as 1, 2, 3, 4, 5, 10, 30, 60, or 120 to save time for more valuable tuning and optimization ideas.
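For example, numsteps can be overridden in a working copy of the benchmark input. This is only a sketch; the file names are examples and assume the input contains a plain "numsteps N" line.

# Make a short test copy of the benchmark input with numsteps reduced to 120 (example file names).
cp 20stmv2fs.namd 20stmv2fs.short.namd
sed -i 's/^numsteps .*/numsteps 120/' 20stmv2fs.short.namd
grep "^numsteps" 20stmv2fs.short.namd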

When using these files for benchmarking the code, it will likely be necessary to adjust "numsteps" to obtain a sufficiently long, but not overly long, simulation. There is a fixed startup cost before the actual simulation begins, due to reading input files and building internal data structures. Also, within the first 500 steps the measurement-based load balancer is executed, which might slightly redistribute the computational workload to improve performance; after that, the timings should begin to show representative performance.

In practice, we advise adjusting "numsteps" so that the entire execution takes around 300 seconds (5 minutes) of wall-clock time, shown as "WallClock:" at the very end of the log. Note that this total timing includes the startup cost. Simulation performance should be taken from the final "TIMING: Wall:" value, which measures the time for just the simulation; the startup cost can be assessed by subtracting the final "TIMING: Wall:" value from the final "WallClock:" value.
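The subtraction can be done directly from the run log; a minimal sketch follows (the log file name is a placeholder, and the field handling assumes the output format shown in section 4.4).

# Startup cost = final "WallClock:" value minus final "TIMING: ... Wall:" value (placeholder log name).
LOG=stmv-run.out
WALLCLOCK=$(awk '{for (i=1;i<=NF;i++) if ($i ~ /WallClock:$/) w=$(i+1)} END {print w}' "$LOG")
SIMWALL=$(awk '/^TIMING:/ {for (i=1;i<=NF;i++) if ($i=="Wall:") w=$(i+1)} END {gsub(",","",w); print w}' "$LOG")
echo "WallClock=$WALLCLOCK  simulation Wall=$SIMWALL  startup=$(echo "$WALLCLOCK - $SIMWALL" | bc -l) s"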
