The Deal.II Library, Version 9.3, 2021
Recommended publications
Accelerating the LOBPCG Method on GPUs Using a Blocked Sparse Matrix Vector Product
Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product. Hartwig Anzt, Stanimire Tomov and Jack Dongarra, Innovative Computing Lab, University of Tennessee, Knoxville, USA.
Abstract: This paper presents a heterogeneous CPU-GPU algorithm design and optimized implementation for an entire sparse iterative eigensolver – the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) – starting from low-level GPU data structures and kernels to the higher-level algorithmic choices and overall heterogeneous design. Most notably, the eigensolver leverages the high performance of a new GPU kernel developed for the simultaneous multiplication of a sparse matrix and a set of vectors (SpMM). This is a building block that serves as a backbone not only for block-Krylov methods, but also for other methods relying on blocking for acceleration in general. The heterogeneous LOBPCG developed here reveals the potential of this type of eigensolver by highly optimizing all of its components, and can be viewed as a benchmark for other SpMM-dependent applications.
… the computing power of today's supercomputers, often accelerated by coprocessors like graphics processing units (GPUs), becomes challenging. While there exist numerous efforts to adapt iterative linear solvers like Krylov subspace methods to coprocessor technology, sparse eigensolvers have so far remained outside the main focus. A possible explanation is that many of those combine sparse and dense linear algebra routines, which makes porting them to accelerators more difficult. Aside from the power method, algorithms based on the Krylov subspace idea are among the most commonly used general eigensolvers [1]. When targeting symmetric positive definite eigenvalue problems, the recently developed Locally Optimal …
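The SpMM operation named in the abstract, multiplying one sparse matrix by a set of vectors at once, can be illustrated with a small reference routine. This is only a plain-Python sketch assuming the common CSR (compressed sparse row) layout; the function name `spmm_csr` and the example matrix are hypothetical and are not taken from the paper, whose GPU kernel and blocked storage format are not shown in this excerpt.

```python
def spmm_csr(indptr, indices, data, X):
    """Multiply a CSR sparse matrix by a block of vectors.

    indptr, indices, data: the usual CSR arrays of the sparse matrix A.
    X: dense block stored as a list of rows; X[j][v] is entry j of vector v.
    Returns Y = A @ X in the same row-list layout.
    """
    n_rows = len(indptr) - 1
    n_vecs = len(X[0])
    Y = [[0.0] * n_vecs for _ in range(n_rows)]
    for i in range(n_rows):
        # Each stored nonzero A[i][j] is read once and reused for every vector,
        # which is the data reuse that blocking is meant to exploit.
        for k in range(indptr[i], indptr[i + 1]):
            a, j = data[k], indices[k]
            for v in range(n_vecs):
                Y[i][v] += a * X[j][v]
    return Y


# Hypothetical 3x3 example: A = [[2, 0, 1], [0, 3, 0], [4, 0, 5]]
INDPTR = [0, 2, 3, 5]
INDICES = [0, 2, 1, 0, 2]
DATA = [2.0, 1.0, 3.0, 4.0, 5.0]
X_BLOCK = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # two vectors, stored column-wise
```

Calling `spmm_csr(INDPTR, INDICES, DATA, X_BLOCK)` applies the matrix to both vectors in a single pass over its nonzeros, instead of one pass per vector as repeated sparse matrix-vector products would need.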
Development of a Coupling Approach for Multi-Physics Analyses of Fusion Reactors
Development of a coupling approach for multi-physics analyses of fusion reactors. Dissertation approved by the Department of Mechanical Engineering of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Engineering (Dr.-Ing.), by Yuefeng Qiu. Date of oral examination: 12 May 2016. Referee: Prof. Dr. Stieglitz. Co-referee: Prof. Dr. Möslang. This document is licensed under the Creative Commons Attribution – Share Alike 3.0 DE License (CC BY-SA 3.0 DE): http://creativecommons.org/licenses/by-sa/3.0/de/
Abstract: Fusion reactors are complex systems built of many complex components and sub-systems with irregular geometries. Their design involves many interdependent multi-physics problems which require coupled neutronic, thermal hydraulic (TH) and structural mechanical (SM) analyses. In this work, an integrated system has been developed to achieve coupled multi-physics analyses of complex fusion reactor systems. An advanced Monte Carlo (MC) modeling approach was first developed for converting complex models to MC models with hybrid constructive solid and unstructured mesh geometries. A Tessellation-Tetrahedralization approach has been proposed for generating accurate and efficient unstructured meshes for describing MC models. For coupled multi-physics analyses, a high-fidelity coupling approach has been developed for the physically conservative mapping of data from MC meshes to TH and SM meshes. Interfaces have been implemented for the MC codes MCNP5/6, TRIPOLI-4 and Geant4, the CFD codes CFX and Fluent, and the FE analysis platform ANSYS Workbench. Furthermore, these approaches have been implemented and integrated into the SALOME simulation platform. The result is a coupling system that covers the entire analysis cycle of CAD design, neutronic, TH and SM analyses.
Present and Future Leadership Computers at OLCF
Present and Future Leadership Computers at OLCF. Buddy Bland, OLCF Project Director. Presented at: SC'14, November 17-21, 2014, New Orleans. ORNL is managed by UT-Battelle for the US Department of Energy.
Oak Ridge Leadership Computing Facility (OLCF) Mission: Deploy and operate the computational resources required to tackle global challenges
• Providing world-leading computational and data resources and specialized services for the most computationally intensive problems
• Providing stable hardware/software path of increasing scale to maximize productive applications development
• Providing the resources to investigate otherwise inaccessible systems at every scale: from galaxy formation to supernovae to earth systems to automobiles to nanomaterials
• With our partners, deliver transforming discoveries in materials, biology, climate, energy technologies, and basic science
SC'14 Summit - Bland
Our Science requires that we continue to advance OLCF's computational capability over the next decade on the roadmap to Exascale. Since clock-rate scaling ended in 2003, HPC performance has been achieved through increased parallelism. Jaguar scaled to 300,000 cores. Titan and beyond deliver hierarchical parallelism with very powerful nodes: MPI plus thread-level parallelism through OpenACC or OpenMP, plus vectors.
• Jaguar (2010): 2.3 PF, multi-core CPU, 7 MW
• Titan (2012): 27 PF, hybrid GPU/CPU, 9 MW
• Summit (2017, CORAL system): 5-10x Titan, hybrid GPU/CPU, 10 MW
• OLCF5 (2022): 5-10x Summit, ~20 MW
Today's Leadership System - Titan: Hybrid CPU/GPU architecture, Hierarchical Parallelism. Vendors: Cray™ / NVIDIA™
• 27 PF peak
• 18,688 compute nodes, each with
  – 1.45 TF peak
  – NVIDIA Kepler™ GPU: 1,311 GF, 6 GB GDDR5 memory
  – AMD Opteron™: 141 GF, 32 GB DDR3 memory
  – PCIe2 link between GPU and CPU
• Cray Gemini 3-D Torus Interconnect
• 32 PB / 1 TB/s Lustre® file system
Scientific Progress at all Scales
Fusion Energy: A Princeton Plasma Physics Laboratory team led by C.S. …
Liquid Crystal Film Stability: ORNL Postdoctoral fellow Trung Nguyen …
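The Titan figures quoted in this excerpt are mutually consistent, which a short calculation makes visible. The sketch below uses only the numbers from the slide (1,311 GF per GPU, 141 GF per CPU, 18,688 nodes); the constant and function names are hypothetical, and the calculation assumes that node peak is simply the sum of GPU and CPU peak.

```python
# Per-node and system peak for Titan, from the figures quoted on the slide.
GPU_PEAK_GF = 1311      # NVIDIA Kepler GPU, gigaflops
CPU_PEAK_GF = 141       # AMD Opteron CPU, gigaflops
COMPUTE_NODES = 18688


def node_peak_tf():
    """Peak of one node in teraflops, assuming GPU and CPU peaks add."""
    return (GPU_PEAK_GF + CPU_PEAK_GF) / 1000


def system_peak_pf():
    """Peak of the whole machine in petaflops (1 PF = 1e6 GF)."""
    return COMPUTE_NODES * (GPU_PEAK_GF + CPU_PEAK_GF) / 1e6


print(f"node peak:   {node_peak_tf():.2f} TF")
print(f"system peak: {system_peak_pf():.1f} PF")
```

Under that assumption the node peak comes out at about 1.45 TF and the system peak at about 27 PF, matching the rounded values on the slide. It also shows that roughly 90% of the peak comes from the GPUs.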
Intel Xeon W-1200 Workstation Processors Product Brief
PRODUCT BRIEF | Intel® Xeon® W-1200 Workstation Processors
PROFESSIONAL PERFORMANCE: POWER AN ENTRY-LEVEL PROFESSIONAL WORKSTATION WITH AN INTEL® XEON® W-1200 PROCESSOR
Intel® Xeon® W-1200 processors (succeeding the Intel® Xeon® E-2200 processors) deliver great performance for entry workstation users with integrated processor graphics alongside the added reliability and confidence of Error Correcting Code (ECC) memory. Get outstanding performance plus best-in-class manageability features and support for groundbreaking technologies that enable you to visualize, simulate, research and work with greater accuracy than ever before.
PROFESSIONAL PERFORMANCE WHEN IT MATTERS
• Up to 10 Cores | Up to 20 Threads
• Up to 4.1 GHz Base
• Up to 5.3 GHz with Intel® Thermal Velocity Boost
• NEW Intel® Turbo Boost Max Technology 3.0
• Support for up to 128 GB DDR4-2933 ECC Memory
• Intel® Wi-Fi AX202 (Gig+) support using CNVi
FEATURED TECHNOLOGIES
• Intel® Hyper-Threading Technology
• Up to 40 processor PCIe* lanes
• Error-correcting code (ECC) memory support
• Thunderbolt™ 3 support
• Intel® Optane™ technology support
• Intel vPro® platform support
A NEW LEVEL OF PERFORMANCE
Designed to deliver an entry-level platform for professionals requiring a true workstation, Intel® Xeon® W-1200 processors are specially optimized for a wide range of workflows and industries such as health and life sciences, financial services, architecture, engineering and construction (AEC).
INCREASED CAPABILITY. NEW: Intel® Thermal Velocity Boost Technology. Even the most complex workflows won't …
ENHANCED PERFORMANCE. NEW: Up to 5.3 GHz. Get up to a blazing 5.3 GHz clock speed, right out of the box for fast performance.
FAST CONNECTIVITY. NEW: 2.5G Intel® Ethernet Controller i225 support. Network speed is essential in today's …
Exploring Capabilities Within ForTrilinos by Solving the 3D Burgers Equation
Scientific Programming 20 (2012) 275–292, DOI 10.3233/SPR-2012-0353, IOS Press.
Exploring capabilities within ForTrilinos by solving the 3D Burgers equation. Karla Morris (a), Damian W.I. Rouson (a), M. Nicole Lemaster (a) and Salvatore Filippone (b). (a) Sandia National Laboratories, Livermore, CA, USA; (b) Università di Roma "Tor Vergata", Roma, Italy.
Abstract. We present the first three-dimensional, partial differential equation solver to be built atop the recently released, open-source ForTrilinos package (http://trilinos.sandia.gov/packages/fortrilinos). ForTrilinos currently provides portable, object-oriented Fortran 2003 interfaces to the C++ packages Epetra, AztecOO and Pliris in the Trilinos library and framework [ACM Trans. Math. Softw. 31(3) (2005), 397–423]. Epetra provides distributed matrix and vector storage and basic linear algebra calculations. Pliris provides direct solvers for dense linear systems. AztecOO provides iterative sparse linear solvers. We demonstrate how to build a parallel application that encapsulates the Message Passing Interface (MPI) without requiring the user to make direct calls to MPI except for startup and shutdown. The presented example demonstrates the level of effort required to set up a high-order, finite-difference solution on a Cartesian grid. The example employs an abstract data type (ADT) calculus [Sci. Program. 16(4) (2008), 329–339] that empowers programmers to write serial code that lower-level abstractions resolve into distributed-memory, parallel implementations. The ADT calculus uses compilable Fortran constructs that resemble the mathematical formulation of the partial differential equation of interest.
Keywords: ForTrilinos, Trilinos, Fortran 2003/2008, object oriented programming
1. Introduction … Burgers [4].
Official preCICE Adapters for Standard Open-Source Solvers
Proceedings of the 7th GACM Colloquium on Computational Mechanics for Young Scientists from Academia and Industry, October 11-13, 2017, in Stuttgart, Germany.
Official preCICE Adapters for Standard Open-Source Solvers. Benjamin Uekermann*, Hans-Joachim Bungartz, Lucia Cheung Yau, Gerasimos Chourdakis and Alexander Rusch, Department of Informatics, Technical University of Munich, Garching, Germany (*corresponding author).
Micro Abstract: To deal with the increasing complexity of today's multiphysics applications, the reuse of existing simulation software often becomes a necessity. Coupling to open-source simulation codes, in particular, is a time-efficient way to tackle new applications. The open-source coupling library preCICE enables such coupling in a minimally invasive way. In this contribution, we give an overview of ready-to-use preCICE adapters for standard open-source solvers, namely CalculiX, Code_Aster, OpenFOAM, and SU2.
Introduction: The preCICE library [1] makes it possible to couple existing simulation software into overall multi-physics simulations in a flexible way at run-time. To this end, preCICE offers methods for equation coupling, such as quasi-Newton acceleration schemes; methods for data mapping, such as radial-basis function interpolation; and fully parallel communication means, based on either MPI or TCP/IP. preCICE is implemented in C++, but offers an API in most standard scientific computing languages. The software is actively developed at the Technical University of Munich and the University of Stuttgart and is published under the free LGPL license on GitHub (http://www.precice.org, https://github.com/precice). A recent article [8] summarizes the development of preCICE over the last ten years for a general audience.
Loads, Load Factors and Load Combinations
Training Course on Civil/Structural Codes and Inspection. BMA Engineering, Inc.
Overall Outline
1000. Introduction
2000. Federal Regulations, Guides, and Reports
3000. Site Investigation
4000. Loads, Load Factors, and Load Combinations
5000. Concrete Structures and Construction
6000. Steel Structures and Construction
7000. General Construction Methods
8000. Exams and Course Evaluation
9000. References and Sources
BMA Engineering, Inc. – 4000
4000. Loads, Load Factors, and Load Combinations
• Objective and Scope
  – Introduce loads, load factors, and load combinations for nuclear-related civil & structural design and construction
  – Present and discuss
    • Types of loads and their computational principles
    • Load factors
    • Load combinations
    • Focus on seismic loads
    • Computer aided analysis and design (brief)
Scope: Primary Documents Covered
• Minimum Design Loads for Buildings and Other Structures [ASCE Standard 7-05]
• Seismic Analysis of Safety-Related Nuclear Structures and Commentary [ASCE Standard 4-98]
• Design Loads on Structures During Construction [ASCE Standard 37-02]
Load Types (ASCE 7-05)
• D = dead load
• Di = weight of ice
• E = earthquake load
• F = load due to fluids with well-defined pressures and maximum heights
• Fa = flood load
• H = load due to lateral earth pressure, ground water pressure, …
• Lr = roof live load
• R = rain load
• S = snow load
• T = self-straining force
• W = wind load
• Wi = wind-on-ice loads
Overview of Popular CAD, CAM, and CAE Systems
CADblog.pl: SPECIAL ISSUE of the online magazine for users of CAD/CAM/CAE systems. CADblog.pl no. 7(08) 2009 | www.cadblog.pl | www.cadraport.pl | www.cadglobe.com | www.swblog.pl | PDF edition
Overview of popular CAD, CAM, and CAE systems available on the Polish market, 2009
In this issue: The history of computer-aided engineering systems. Simulations at OBRUM. What's new in Solid Edge with Synchronous Technology 2. What's new in SolidWorks 2010. News, events...
From the editor: In place of an editorial...
This is an unusual issue: unusually late, and unusual in its length and content. This time I have prepared nearly 90 pages of material for you, although most of it will seem familiar to attentive readers, if only because practically all of it could already be found on CADblog.pl and its sister sites CADraport.pl and SWblog.pl. Nevertheless, the pieces form a coherent whole, which is why they make up the present issue. As for the length: well, in a sense it is meant to make up for the nearly two-month break in the publishing cycle :). That is a joke, of course... Break or no break, it did not affect the CADblog websites mentioned above, that is, the fully virtual form of my initiative. They have kept growing without interruption (over 10,000 unique users and over 7,000 page visits in October...), but I need to devote a little more attention to the idea of a free online magazine, which I seem to have neglected somewhat lately. To the point: this issue contains a compilation of the history of CAD systems, since its continuation will appear in the November issue.
The Latest in Tpetra: Trilinos' Parallel Sparse Linear Algebra
The latest in Tpetra: Trilinos' parallel sparse linear algebra. Mark Hoemmen, Sandia National Laboratories, 23 Apr 2019. SAND2019-4556 C (UUR). Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
Tpetra: parallel sparse linear algebra
§ "Parallel": MPI + Kokkos
§ Tpetra implements
  § Sparse graphs & matrices, & dense vecs
  § Parallel kernels for solving Ax=b & Ax=λx
  § MPI communication & (re)distribution
§ Tpetra supports many linear solvers
§ Key Tpetra features
  § Can manage > 2 billion (10^9) unknowns
  § Can pick the type of values:
    § Real, complex, extra precision
    § Automatic differentiation
    § Types for stochastic PDE discretizations
Over 15 years of Tpetra
[Figure: timeline from 2004 to 2020 marking Trilinos releases 9.0 through 13.0 alongside stages of Tpetra development]
Warthog: a MOOSE-Based Application for the Direct Code Coupling of BISON and PROTEUS (MS-15OR04010310)
ORNL/TM-2015/532. Warthog: A MOOSE-Based Application for the Direct Code Coupling of BISON and PROTEUS (MS-15OR04010310). Alexander J. McCaskey, Stuart Slattery, Jay Jay Billings. September 2015. Approved for public release. Distribution is unlimited.
DOCUMENT AVAILABILITY
Reports produced after January 1, 1996, are generally available free via US Department of Energy (DOE) SciTech Connect. Website: http://www.osti.gov/scitech/
Reports produced before January 1, 1996, may be purchased by members of the public from the following source: National Technical Information Service, 5285 Port Royal Road, Springfield, VA 22161. Telephone: 703-605-6000 (1-800-553-6847). TDD: 703-487-4639. Fax: 703-605-6900. Website: http://www.ntis.gov/help/ordermethods.aspx
Reports are available to DOE employees, DOE contractors, Energy Technology Data Exchange representatives, and International Nuclear Information System representatives from the following source: Office of Scientific and Technical Information, PO Box 62, Oak Ridge, TN 37831. Telephone: 865-576-8401. Fax: 865-576-5728. Website: http://www.osti.gov/contact.html
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof.
Getting Started with Trilinos
Getting Started with Trilinos. Michael A. Heroux, Sandia National Laboratories, USA. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND NO. 2011-XXXXP
Outline
- Some (Very) Basics
- Overview of Trilinos: What it can do for you.
- Trilinos Software Organization.
- Overview of Packages.
- Package Management.
- Documentation, Building, Using Trilinos.
Online Resource
§ Trilinos Website: https://trilinos.org
  § Portal to Trilinos resources.
  § Documentation.
  § Mail lists.
  § Downloads.
§ WebTrilinos
  § Build & run Trilinos examples in your web browser!
  § Need username & password (will give these out later)
  § https://code.google.com/p/trilinos/wiki/TrilinosHandsOnTutorial
  § Example codes: https://code.google.com/p/trilinos/w/list
WHY USE MATHEMATICAL LIBRARIES?
§ A farmer had chickens and pigs. There was a total of 60 heads and 200 feet. How many chickens and how many pigs did the farmer have?
§ Let x be the number of chickens, y be the number of pigs.
§ Then: x + y = 60 and 2x + 4y = 200
§ From the first equation x = 60 – y, so replace x in the second equation: 2(60 – y) + 4y = 200
§ Solve for y: 120 – 2y + 4y = 200, so 2y = 80 and y = 40
§ Solve for x: x = 60 – 40 = 20.
§ The farmer has 20 chickens and 40 pigs.
§ A restaurant owner purchased one box of frozen chicken and another box of frozen pork for $60. Later the owner purchased 2 boxes of chicken and 4 boxes of pork for $200.
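The chickens-and-pigs system in this excerpt, x + y = 60 and 2x + 4y = 200, can be solved once by a small general-purpose routine rather than by hand. The following is a minimal sketch using Cramer's rule with exact rational arithmetic; the function name `solve_2x2` is hypothetical and this is not Trilinos code, only an illustration of why a reusable solver is convenient.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve [a11 a12; a21 a22] [x; y] = [b1; b2] exactly by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("matrix is singular; no unique solution")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y


# Farmer problem: heads x + y = 60, feet 2x + 4y = 200.
chickens, pigs = solve_2x2(1, 1, 60, 2, 4, 200)
print(f"chickens = {chickens}, pigs = {pigs}")
```

The routine reproduces the hand-derived answer of 20 chickens and 40 pigs. The restaurant purchase described at the end of the excerpt leads to the same coefficients (one box of each for $60, two and four boxes for $200), so the same call would answer it without redoing the algebra.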
A Three-Dimensional Finite Element Mesh Generator with Built-In Pre- and Post-Processing Facilities
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, Int. J. Numer. Meth. Engng 2009; 0:1–24. This is a preprint of an article accepted for publication in the International Journal for Numerical Methods in Engineering, Copyright © 2009 John Wiley & Sons, Ltd.
Gmsh: a three-dimensional finite element mesh generator with built-in pre- and post-processing facilities. Christophe Geuzaine (1), Jean-François Remacle (2). (1) Université de Liège, Department of Electrical Engineering and Computer Science, Montefiore Institute, Liège, Belgium. (2) Université catholique de Louvain, Institute for Mechanical, Materials and Civil Engineering, Louvain-la-Neuve, Belgium.
SUMMARY: Gmsh is an open-source three-dimensional finite element grid generator with a built-in CAD engine and post-processor. Its design goal is to provide a fast, light and user-friendly meshing tool with parametric input and advanced visualization capabilities. This paper presents the overall philosophy, the main design choices and some of the original algorithms implemented in Gmsh.
Key words: Computer Aided Design, Mesh generation, Post-Processing, Finite Element Method, Open Source Software
1. Introduction: When we started the Gmsh project in the summer of 1996, our goal was to develop a fast, light and user-friendly interactive software tool to easily create geometries and meshes that could be used in our three-dimensional finite element solvers [8], and then visualize and export the computational results with maximum flexibility. At the time, no open-source software combining a CAD engine, a mesh generator and a post-processor was available: the existing integrated tools were expensive commercial packages [41], and the freeware or shareware tools were limited to either CAD [29], two-dimensional mesh generation [44], three-dimensional mesh generation [53, 21, 30], or post-processing [31].