ADIOS Tutorial
Dresden, Germany, 10/11/16 – 10/13/16
S. A. Klasky (ORNL, GT, UTK)

Too many contributors to mention, but: Norbert Podhorszki, Viraj Bhat, Chuck Atkins, Qing Liu, Rob Ross, Axel Huebl, Matt Wolf, Garth Gibson, Tahsin Kurc, Karsten Schwan, Chen Jin, Joel Saltz, Manish Parashar, Roselyne Tchoua, Jai Dayal, Ciprian Docan, C. S. Chang, Qian Sun, Hasan Abbasi, Stephane Ethier, Mark Ainsworth, Fang Zheng, Michael Bussmann, William Tang, Fan Zhang, Wei Xue, Jeroen Tromp, Tong Jin, Nagiza Samatova, Weikuan Yu, Jong Choi, Bing Xie, Dave Pugmire, Jeremy Logan, George Ostrouchov, Yuan Tian, James Kress, Arie Shoshani, Eric Suchyta, John Wu

Agenda
• Day 1: Introduction (55 min), ADIOS APIs (20 min), Break (15 min), Writing-XML (60 min), BPLS (15 min), Exercises (15 min)
• Day 2: Review (10 min), Schema (20 min), Plotter (15 min), Break (15 min), Reading (60 min), Exercises (15 min)
• Day 3: Review (10 min), SKEL (20 min), Write-NO (45 min), Python (30 min), Break (15 min), Staging (60 min), Visit (60 min), Transforms (45 min)

Where is ORNL? / A Little About ORNL…
[Map and lab-overview slides.]

DOE's Office of Science Computation User Facilities
• DOE is the leader in open high-performance computing
• Provides the world's most powerful computational tools for open science
• Access is free to researchers who publish
• Boosts US competitiveness
• Attracts the best and brightest researchers
• NERSC: Edison, 2.57 PF; ALCF: Mira, 10 PF; OLCF: Titan, 27 PF

What is the Leadership Computing Facility (LCF)?
• Collaborative DOE Office of Science user-facility program at ORNL and ANL
• Mission: provide the computational and data resources required to solve the most challenging problems
• Two centers with two architectures to address the diverse and growing computational needs of the scientific community
• Highly competitive user allocation programs (INCITE, ALCC)
• Projects receive 10x to 100x more resource than at other generally available centers
• LCF centers partner with users to enable science and engineering breakthroughs (Liaisons, Catalysts)

Three primary user programs for access to LCF
Distribution of allocable hours:
• 60% INCITE
• 30% ALCC (ASCR Leadership Computing Challenge)
• 10% Director's Discretionary

Our science requires that we continue to advance our computational capability over the next decade on the roadmap to Exascale
• Since clock-rate scaling ended in 2003, HPC performance has been achieved through increased parallelism; Jaguar scaled to 300,000 cores.
• Titan and beyond deliver hierarchical parallelism with very powerful nodes: MPI plus thread-level parallelism through OpenACC or OpenMP, plus vectors.
• Jaguar (2010): 2.3 PF, multi-core CPU, 7 MW
• Titan (2013): 27 PF, hybrid GPU/CPU, 9 MW
• Summit (2017, CORAL system): 5–10x Titan, hybrid GPU/CPU, 10 MW
• OLCF5 (2022): 5–10x Summit, ~20 MW

Notes to change for the talk
• I need to add to this, and have more examples, since I have an hour.
• Scott will do this…

What is ADIOS?
• An extendable framework that allows developers to plug in:
  – Data management services
  – Multi-resolution methods, data compression methods, and data indexing methods (FastBit)
  – File formats: HDF5, netCDF, …
  – Stream format: ADIOS-BP
  – Plug-ins: analytics, visualization
  – Indexing: FastBit, ISABELA-QA
• Incorporates the "best" practices in the I/O middleware layer
• Incorporates self-describing data streams and files (see the read sketch below)
• Released twice a year, now at 1.9, under the completely free BSD license
• https://www.olcf.ornl.gov/center-projects/adios/, https://github.com/ornladios/ADIOS
• Available at ALCF, OLCF, NERSC, CSCS, Tianhe-1, Tianhe-2, Pawsey SC, Ostrava
• Applications are supported through the OLCF INCITE program
• Outreach via on-line manuals and live tutorials
[Architecture diagram: interface to applications for description of data; I/O methods (aggregate, POSIX, MPI) with buffering, scheduling, and feedback; compression, decompression, and indexing services (FastBit); plug-ins to the hybrid staging area (workflow engine, runtime engine, data movement, provenance); analysis and visualization plug-ins; output as ADIOS-BP, IDX, HDF5, pnetcdf, "raw" data, or image data to the parallel and distributed file system or a viz client.]
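Because BP data is self-describing, a reader can discover a variable's type and shape from the file itself before reading it. The following is a minimal sketch using the ADIOS 1.x read API; the file name "restart.bp" and variable name "t" are illustrative, not from the slides:

    /* A minimal ADIOS 1.x read sketch (C): open a self-describing BP file,
     * discover a variable's shape from the metadata, then read it.
     * File name "restart.bp" and variable name "t" are illustrative. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <adios_read.h>

    int main (int argc, char **argv)
    {
        MPI_Init (&argc, &argv);
        MPI_Comm comm = MPI_COMM_WORLD;

        adios_read_init_method (ADIOS_READ_METHOD_BP, comm, "");
        ADIOS_FILE *f = adios_read_open_file ("restart.bp",
                                              ADIOS_READ_METHOD_BP, comm);

        /* Introspect: the file itself tells us the type and dimensions */
        ADIOS_VARINFO *v = adios_inq_var (f, "t");
        uint64_t start = 0, count = v->dims[0];      /* whole 1-D array  */
        ADIOS_SELECTION *sel = adios_selection_boundingbox (v->ndim,
                                                            &start, &count);

        double *data = malloc (count * sizeof (double));
        adios_schedule_read (f, sel, "t", 0, 1, data); /* step 0, 1 step */
        adios_perform_reads (f, 1);                    /* 1 = blocking   */

        free (data);
        adios_selection_delete (sel);
        adios_free_varinfo (v);
        adios_read_close (f);
        adios_read_finalize_method (ADIOS_READ_METHOD_BP);
        MPI_Finalize ();
        return 0;
    }

The same introspection is available on the command line through the bpls utility covered later in the tutorial.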
ADIOS applications
1. Accelerators: PIConGPU, Warp
2. Astronomy: SKA
3. Astrophysics: Chimera
4. Combustion: S3D
5. CFD: FINE/Turbo, OpenFOAM
6. Fusion: XGC, GTC, GTC-P, M3D, M3D-C1, M3D-K, Pixie3D
7. Geoscience: SPECFEM3D_GLOBE, AWP-ODC, RTM
8. Materials science: QMCPack, LAMMPS
9. Medical imaging: cancer pathology
10. Quantum turbulence: QLG2Q
11. Relativity: Maya
12. Weather: GRAPES
13. Visualization: ParaView, VisIt, VTK, ITK, OpenCV, VTK-m
(LCF/NERSC codes in red)

Impact on industry:
• NUMECA (FINE/Turbo): allowed time-varying interaction of turbomachinery-related aerodynamic phenomena
• TOTAL (RTM): allowed running of higher-fidelity seismic simulations
• FM Global (OpenFOAM): allowed running of higher-fidelity fire-propagation simulations

Over 1B LCF hours from ADIOS-enabled applications in 2015; over 1,500 citations.

Impact from running large-scale applications at scale
• Typical applications on LCF machines observed a 10x performance improvement in I/O.

Impact at the HPC user facilities
• ALCF • OLCF • NERSC • Tianhe-1A • Tianhe-2 • BlueLight • Singapore • KAIST • Ostrava • Dresden • ERDC • CSCS • Blue Waters • EPFL • Barcelona Supercomputing Center

How to use ADIOS
• ADIOS is provided as a library to users; use it like other I/O libraries, except:
  – ADIOS takes a declarative approach to I/O
  – The user defines the "what" and the "when" in the application source code: every process declares what data to output or read, and when
  – ADIOS takes care of the "how" (see the write sketch below)
• Biggest hurdle for users: forget all of your manual tricks for gaining I/O performance on your particular target system at your target scale; just say what you want to write or read, and trust ADIOS to deliver the performance
• Performance portability: write once, perform well anywhere; it comes naturally with ADIOS, which provides many different I/O methods (strategies)
• Predictable performance: staging and I/O throttling allow scientists to use different computational technologies to achieve good, predictable performance
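To make the declarative write path concrete, here is a minimal sketch using the ADIOS 1.x C write API. The group name "restart", the file names, and the contents of config.xml are assumed for illustration; the point is that the XML file, not the code, selects the I/O method:

    /* A minimal ADIOS 1.x write sketch (C). The "what" (variables) and the
     * "when" (this call site) are declared by the application; the "how"
     * (POSIX, MPI, aggregation, staging, ...) is selected in config.xml.
     * Group name "restart", file names, and config.xml are illustrative. */
    #include <mpi.h>
    #include <stdint.h>
    #include <adios.h>

    int main (int argc, char **argv)
    {
        int      rank, nx = 10;
        double   t[10];
        int64_t  fd;                       /* ADIOS output handle */
        uint64_t groupsize, totalsize;
        MPI_Comm comm = MPI_COMM_WORLD;

        MPI_Init (&argc, &argv);
        MPI_Comm_rank (comm, &rank);
        for (int i = 0; i < nx; i++)
            t[i] = rank * nx + i;          /* each process writes its own block */

        adios_init ("config.xml", comm);   /* XML declares group "restart" */
        adios_open (&fd, "restart", "restart.bp", "w", comm);
        groupsize = sizeof (int) + nx * sizeof (double);
        adios_group_size (fd, groupsize, &totalsize); /* sizes the buffer */
        adios_write (fd, "NX", &nx);
        adios_write (fd, "t", t);
        adios_close (fd);                  /* data leaves the node here */

        adios_finalize (rank);
        MPI_Finalize ();
        return 0;
    }

Switching from, say, POSIX output to aggregated MPI output is a one-line change to the transport method in config.xml; the code above stays the same.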
ADIOS project goals and current status
• Utilize the best practices for ALL of the platforms DOE researchers use
  – File-system, topology, and memory optimizations
• Domain-specific optimizations
  – PIC simulations, Monte Carlo simulations, data-driven workloads (visualization, medical imaging, …)
• Predictable-performance optimizations
  – Hybrid staging techniques, I/O throttling
• Reduce the I/O load
  – In situ indexing, queries, executable code, data refactoring
• In situ infrastructure for code coupling and in situ/in-transit processing
  – Hybrid staging, burst buffers, learning techniques for caching
• Usability and sustainability
  – Partnering with Kitware, along with user surveys, to create better software for our users

ADIOS Research-Development-Production cycle
[Diagram: applications drive a cycle of research (ORNL, universities), research & development, development (ORNL), and production (ORNL, Kitware), fed by influencers and technology collaborators such as HDF5.]

R&D necessary to support petascale apps
• I/O for checkpoint/restart files, for large writes per process
• Small writes at high velocities
• Reading for different I/O patterns
• Domain-specific methods
• I/O variability

Key ideas for good performance of ADIOS for large writes
• Avoid the latency of small writes: buffer data for large bursts (e.g., to a burst buffer)
• Avoid global communication: ADIOS uses it for metadata only, and even that can be postponed to post-processing
• Later: topology-aware data movement
  – Find the closest I/O node to each writer
  – Minimize data movement across racks/mid-planes (on BlueGene/Q)
• The ADIOS-BP stream/file format:
  – Allows data from each node to be written independently, with metadata
  – Can create a separate metadata file when "sub-files" are generated
  – Allows variables to be individually compressed
  – Has a schema to introspect the information on each process
  – Has workflows embedded into the data streams
  – Serves both "data-in-motion" and "data-at-rest"

Checkpoint restart file writes solved for Chimera & GTC
• J. F. Lofstead, S. Klasky, K. Schwan, N. Podhorszki, C. Jin, "Flexible IO and integration for scientific codes through the Adaptable IO System (ADIOS)," in Proceedings of the 6th International Workshop on Challenges of Large Applications in Distributed Environments, ACM, pp. 15–24.
• J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Input/output APIs and data organization for high performance scientific computing," in Petascale Data Storage Workshop (PDSW '08), IEEE, pp. 1–6.
• J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Adaptable, metadata rich IO methods for portable high performance IO," in IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009), IEEE, pp. 1–10.
• Z. Lin, Y. Xiao, I. Holod, W. Zhang, W. Deng, S. Klasky, J. Lofstead, C. Kamath, N. Wichmann, "Advanced simulation of electron heat transport in fusion plasmas," Journal of Physics: Conference Series 180 (2009), 012059.
• C. Chang, S. Ku, P. Diamond, M. Adams, R. Barreto, Y. Chen, J. Cummings, E. D'Azevedo, G. Dif-Pradalier, S. Ethier, et al., "Whole-volume integrated gyrokinetic simulation of plasma turbulence in realistic diverted-tokamak geometry," Journal of Physics: Conference Series