HDF5 Overview.Pptx

The HDF Group
HDF5 Overview
Elena Pourmal [email protected]
The HDF Group
10/17/15 ICALEPCS 2015 www.hdfgroup.org

Outline
• The HDF Group company
• Products and services
• Overview of HDF5
• What is coming in HDF5 1.10.0 release?
• Future directions

THE HDF GROUP COMPANY

Champaign, Illinois, USA

The HDF Group
www.hdfgroup.org
• Not-for-profit company (since 2006), ex-NCSA at University of Illinois
• Offices in 5 states
• About 40 employees (more than 50% growth in the past 9 years)
  - Core software developers
  - Domain specialists
  - Documentation team
  - Technical support
• Mission-driven

The HDF Group Mission
To ensure long-term accessibility of HDF data through sustainable development and support of HDF technologies.

The HDF Group philosophy
• Committed to Open Source
  • HDF software is free
  • BSD type of license
• Community involvement
  • Testing
  • Patches
  • New features (e.g., CMake support)
• Serving diverse user base
  • Remote sensing, HPC, non-destructive testing, medical records, scientific modeling, etc.

Revenue by Source
[Pie chart, 2014. Labels: Light Sources, Earth science, Finance, General, NASA, NOAA, National Labs, HPC, Oil & gas, Particle science. Shares shown: 3%, 0%, 4%, 28%, 62%, 3%, 0%]

Revenue by Project Type
[Pie chart: revenues by type of project. R&D 22%, Development 24%, Premium support 1%, Enterprise support 45%; Consulting and Training and other outreach: 8% and 0%]

PRODUCTS AND SERVICES

The HDF Group products
• Main product: HDF Technology Suite
  - For managing high volume complex, heterogeneous data
  - Flagship: HDF5 data store
  - Flexible and efficient storage and I/O
  - Portable
  - Highly customizable
  - Misc. tools
  - Specialized software and tools (e.g., JPSS)

HDF5 IN 5 MINUTES
Data challenges addressed by HDF5

HDF5 Technology Platform
• HDF5 Abstract Data Model
  • Defines the “building blocks” for data organization and specification
  • Files, Groups, Links, Datasets, Attributes, Datatypes, Dataspaces
• HDF5 Software
  • Tools
  • Language Interfaces (C, Fortran, C++, Java)
  • HDF5 Library
• HDF5 Binary File Format
  • Bit-level organization of HDF5 file
  • Defined by HDF5 File Format Specification
• HDF5 Ecosystem
  • Tools and services (h5py, MATLAB, IDL, OPeNDAP, etc.)
  • Communities (Earth Sciences, medical imaging, modeling and visualization)
  • Community standards (NeXus, HDF-EOS5, h5part, CGNS)
  • Institutional support and endorsement (NASA, NOAA, DOE)

Members of the HDF community

Success stories
• Petabytes of NASA remote sensing data in HDF4 and HDF5 file formats
• New NASA/JPSS missions chose HDF5 format for data archiving

Need to organize complex collections of data
Long term data preservation

  lat | lon | temp
  ----|-----|-----
  12  | 23  | 3.1
  15  | 24  | 4.2
  17  | 21  | 3.6

Efficient, scalable storage and access

Success story: Trillion Particle Simulation
• Physics plasma simulation at NERSC Cray XE6
• Simulation ran on 120,000 cores using
  - 80% of computing resources
  - 90% of available memory
  - 50% of Lustre scratch system
  and writing 10 one-trillion particle dumps of 30-42 TBs in HDF5 files; sustained ~27 GB/sec; total 350 TBs in HDF5

The HDF Group services
• Helpdesk and mailing lists
  - [email protected]
  - [email protected]
  - Open to all users of HDF
• HDF5 Documentation https://www.hdfgroup.org/HDF5/doc/index.html
• HDF Examples (C, Fortran, C++, Java, Python, MATLAB) https://www.hdfgroup.org/HDF5/examples/

The HDF Group services
• Standard support
  • Assistance in general areas of HDF usage
• Premium support
  • Access to our consulting and training resources
  • Limited consulting hours are included
• Enterprise support
  • Help with developing common strategies for managing HDF data within organization
  • Organization shares consulting/troubleshooting services
• Training
• Consulting, custom development and support

HDF5 1.10.0 RELEASE
New Upcoming Features

PERSISTENT FILE FREE SPACE TRACKING
Reusing free file space in a file

Unused space in HDF5 file
• HDF5 library currently only tracks free space while file is open
  • Space from deleted objects
  • Space from resized compressed chunks
• Free space in the file is “lost” after file is closed
  • h5repack is used to remove “holes” in the file
• New function H5Pset_file_space
  • Sets a property to track free space in the file that can be reused when file is reopened
  • Allows fine tuning space tracking

SCALABLE CHUNK INDEXING
Improving performance and saving space

Optimizing chunking storage and performance
• HDF5 has an ability to add more data to existing datasets (data arrays)
  • Special storage mechanism: chunked storage
  • B-trees are used to index chunks in the file
  • O(log n) lookup time
• HDF5 takes advantage of the access pattern and properties of the datasets
  • O(1) lookup time
  • File space savings when storing HDF5 metadata

Optimizing chunking storage and performance
• B-tree implementation was reworked to use less space in the file
  • Used for datasets with more than one unlimited dimension
• New indexing structures were introduced to achieve O(1) performance and storage savings in special cases

Optimizing chunking storage and performance
• Examples of O(1) lookup access:
  • Fixed-size chunked dataset with no compression filters
    • Algorithmic lookup
  • Fixed-size chunked dataset with compression filters
    • Array to index chunks
  • Fixed-size dataset stored in one chunk (i.e., we now allow compression for contiguous dataset)
    • No index
  • Dataset with one unlimited dimension
    • Extensible array to index chunks

CONCURRENCY: SINGLE-WRITER/MULTIPLE-READER

Concurrent Access to Data
New data elements are added to a dataset in the file, which can be read by a reader, with no IPC necessary.
[Diagram: Writer, HDF5 File, Reader]

VIRTUAL DATASET (VDS)
Managing data stored across HDF5 files

VDS Use Case with NPP satellite data
4 granules in 9 GMODO-SVM07… files
Visualization with IDV

VDS Use Case with NPP satellite data
One virtual dataset with 36 granules stored in one file
Visualization with IDV

VDS use case: Percival detector
[Diagram: four writers each write a series of images to Dataset A, Dataset B, Dataset C and Dataset D in a.h5, b.h5, c.h5 and d.h5. A reader accesses the Virtual Dataset in VDS.h5, in which images A, B, C and D are interleaved (t1, t2, t3, t4, ..., t1+4k, t3+4k).]

VDS: Conceptual View

METADATA CACHE IMAGE
Performance boost when opening and closing HDF5 files

Problem: Metadata Cache Image
• HDF5 metadata is typically small and scattered throughout the file.
• The resulting many small I/Os are a major problem for parallel file systems.
• Metadata cache minimizes this during normal operation, but must still populate cache on file open, and flush it on file close.
• This is a problem if files are opened and closed often.

Solution: Metadata Cache Image
• Store the contents of the metadata cache in a single block at file close, and then populate the cache with the stored entries on file open.
• If the access pattern is similar across close and reopen, this should save a significant number of small I/O operations.
• This solution is implemented in the metadata cache image feature.

Metadata Cache Image
• To enable, set cache image FAPL property on file create or open:
    H5AC_cache_image_config_t cache_image_config = {H5AC__CURR_CACHE_IMAGE_CONFIG_VERSION, TRUE, 0};
    fapl_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
    H5Pset_mdc_image_config(fapl_id, &cache_image_config);
• Then create or open file as usual.

Metadata Cache Image
• Metadata cache image is read and deleted automatically on file open.
• Must set cache image FAPL property again if a new cache image is desired on file close.
• Earlier versions of HDF5 that don't understand the cache image will refuse to open the file.
• A lightweight utility can remove the caching info, making the file compatible with 1.8.
• Prototype implementation showed an order-of-magnitude speedup on parallel systems.

DATA AGGREGATION AND PAGE BUFFERING
Performance improvements

Page buffering / Data aggregation
Aggregate and align metadata and small data, perform I/O in aligned pages

Data and Metadata Aggregators
The new aggregators pack small raw data and metadata allocations into aligned