Hierarchical Data Format 5 (HDF5): Why and How to Use It



Andrea Negri, 22 February 2013

Goals of this seminar
• provide an overview of what HDF5 can do;
• describe the usage and the spread of HDF5 in the scientific community;
• stimulate interest in this format;
• present some references for COMPLETE documentation (both official and unofficial);
• show some simple examples of writing an HDF5 file.
I will NOT talk about:
• the internal structure of the library (B-tree structure, etc.);
• details of advanced HDF5 usage such as parallel I/O, etc.

The necessity of a good data container
Common cases in astronomy:
• observations: tons of huge ASCII files with tables (portability preferred): slow I/O, no compression, no optimization of disk usage; FITS files do not support storage of very large data;
• simulations: binary files of very large numerical arrays, often written in parallel (I/O efficiency preferred): portability problems, sometimes no optimization of disk usage (it depends on the user!).
A unified and very versatile container would be great. A single format for different needs: HDF5!

What is HDF5?
Basically, HDF5 = open file format + open-source software + data model:
• a versatile data model that can represent very complex data objects and a wide variety of metadata;
• a completely portable file format with no limit on the number or size of data objects in the collection; the file format is defined by the HDF5 File Format Specification;
• a software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces;
• a rich set of integrated performance features that allow for access-time and storage-space optimizations;
• tools and applications for managing, manipulating, viewing, and analyzing the data in the collection;
• a completely open format and open-source software.

A little of history
• 1987: the graphics task force at NCSA began work on an architecture-independent format and library, HDF;
• 1994: NASA selected HDF as the standard format for the Earth Observing System;
• 1996-1998: the DOE tri-labs and NCSA, with additional support from NASA, developed HDF5, initially called "BigHDF";
• 2005: NASA funded development of netCDF-4, a new version of netCDF that uses the HDF5 file format;
• 2006: The HDF Group, a non-profit corporation, spun off from NCSA and the University of Illinois.
Currently, The HDF Group develops and maintains HDF5. HDF4 is still maintained by The HDF Group, but its use is marginal; its most severe limitation: no file bigger than 2 GB. HDF5 is a completely different data model, file format, and library. The library can be downloaded from http://www.hdfgroup.org/HDF5/release/obtain5.html

Features of the file format
• defined by the HDF5 File Format Specification (open);
• designed for high-volume (1 TB in a file!) and complex data;
• POSIX-style access to objects (i.e. /fields/density ...);
• completely self-describing: maximum possible portability;
• mainly binary, with random access;
• optimized and tunable I/O efficiency (chunked data and other techniques);
• good for HPC;
• wide range of native and user-defined datatypes supported;
• long-term access to data;
• possibility of compression (standard Gzip; other compressions are possible);
• very strong support for multidimensional arrays;
• parallel I/O, through MPI-2 I/O;
• partial reads/writes on arrays (see the sketch right after this list);
• extensible data;
• runs on many platforms, from my netbook to a BlueGene/Q;
• a file format designed to work well with other technologies;
• short timescales for bug fixing.
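"Partial reads/writes on arrays" deserves a concrete illustration: you can pull an arbitrary sub-block of a dataset off disk without touching the rest, which in the Fortran API is done with a hyperslab selection. The sketch below is my own minimal example, not from the slides; the file name data.h5 and the dataset path /fields/density are hypothetical, and the dataset is assumed to be a 2-D double-precision array of at least 60x60 elements.

    program hyperslab_read
      use hdf5
      implicit none
      integer(HID_T) :: file_id, dset_id, fspace_id, mspace_id
      integer(HSIZE_T), dimension(2) :: offset = (/50, 50/)  ! 0-based start of the patch
      integer(HSIZE_T), dimension(2) :: count  = (/10, 10/)  ! size of the patch
      double precision, dimension(10,10) :: patch
      integer :: error

      call h5open_f(error)
      call h5fopen_f("data.h5", H5F_ACC_RDONLY_F, file_id, error)
      call h5dopen_f(file_id, "/fields/density", dset_id, error)

      ! Select a 10x10 region of the dataset in the file...
      call h5dget_space_f(dset_id, fspace_id, error)
      call h5sselect_hyperslab_f(fspace_id, H5S_SELECT_SET_F, offset, count, error)
      ! ...and describe the in-memory buffer that will receive it.
      call h5screate_simple_f(2, count, mspace_id, error)

      ! Only the selected patch is transferred from disk.
      call h5dread_f(dset_id, H5T_NATIVE_DOUBLE, patch, count, error, &
                     mspace_id, fspace_id)

      call h5sclose_f(mspace_id, error)
      call h5sclose_f(fspace_id, error)
      call h5dclose_f(dset_id, error)
      call h5fclose_f(file_id, error)
      call h5close_f(error)
    end program hyperslab_read

On a chunked dataset the library only touches the chunks that intersect the selection, which is what makes partial access cheap even for very large files.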
Users
HDF5 can be used alone (as I actually do), but many organizations and groups use it as an efficient base on which to develop their own data models and containers: all the advantages of the HDF5 format, tuned to your personal needs!

Who uses HDF5?
Users are spread around the world! HDF5 is used in astrophysics, biology, optics, meteorology, oceanography, medical image processing, bioengineering, and crystallography, and to store radio data from LOFAR. Some users:
• NASA: both HDF5 and EOS, for aerospace engineering;
• Los Alamos laboratories;
• Lucasfilm;
• National Oceanographic Data Center (NOAA).
An incomplete list: http://www.hdfgroup.org/HDF5/users5.html

Explore an HDF5 file: tools from hdfgroup.org
There are extremely useful command-line tools; the most important are:
• h5ls: lists the datasets in a file;
• h5dump: enables the user to examine the contents of an HDF5 file and dump those contents to an ASCII file;
• h5diff: compares HDF5 files;
• h5import: imports ASCII or binary data into HDF5;
• h5check: validation tool; ensures the possibility of long-term access;
• h5repack: repacks a file for better usage of unused space, and changes properties of datasets;
• h5perf and h5perf_serial: measure HDF5 serial and parallel performance;
and more. HDFView is a visual tool for browsing and editing HDF4 and HDF5 files. Conversion tools exist between HDF5 and HDF4, EOS5, netCDF4, and the GIF image format.

APIs
APIs officially supported by The HDF Group: C and Fortran 90/2003 (both low and high level), Java (high level). Third-party bindings: GNU Data Language, IDL, MATLAB, Scilab, Mathematica, Perl, Python (h5py, PyTables), CGNS, R, and others.
Two different philosophies: low-level and high-level APIs:
• low-level: arrays of any size and type;
• high-level: image, table, packet table, dimension scale.
Naming convention: a C function is name_command(args); the corresponding Fortran subroutine is name_command_f(args, error), with an extra integer error argument.

HDF5 structure: a quick look
(figure)

HDF5 structure: a dataset
(figure)

Datasets, datatypes, and dataspaces
HDF5 Datasets organize and contain the "raw" data values. HDF5 Datatypes describe the individual data elements in an HDF5 dataset. A wide range of datatypes is supported:
• integer, float, double, unsigned, bitfield, user-defined, any KIND in Fortran 2003;
• variable-length types (e.g., strings);
• references to objects and to dataset regions;
• opaque types;
• arrays;
• compound types.
HDF5 Dataspaces describe the logical layout of the elements in an HDF5 dataset.
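To see how a datatype, a dataspace, and the tunable storage features fit together, here is a minimal sketch of my own (the file name, group name, and sizes are made up, not from the slides) that creates a chunked, Gzip-compressed 2-D dataset at the POSIX-style path /fields/density:

    program chunked_create
      use hdf5
      implicit none
      integer(HID_T) :: file_id, grp_id, dspace_id, plist_id, dset_id
      integer(HSIZE_T), dimension(2) :: dims  = (/1000, 1000/)
      integer(HSIZE_T), dimension(2) :: chunk = (/100, 100/)
      integer :: error

      call h5open_f(error)
      call h5fcreate_f("compressed.h5", H5F_ACC_TRUNC_F, file_id, error)
      call h5gcreate_f(file_id, "fields", grp_id, error)  ! the hierarchy: a group
      call h5screate_simple_f(2, dims, dspace_id, error)  ! the dataspace: 1000x1000

      ! Dataset-creation property list: chunked layout plus Gzip level 6.
      call h5pcreate_f(H5P_DATASET_CREATE_F, plist_id, error)
      call h5pset_chunk_f(plist_id, 2, chunk, error)
      call h5pset_deflate_f(plist_id, 6, error)

      ! Datatype (native double) + dataspace + properties = dataset.
      call h5dcreate_f(grp_id, "density", H5T_NATIVE_DOUBLE, dspace_id, &
                       dset_id, error, plist_id)

      call h5dclose_f(dset_id, error)
      call h5pclose_f(plist_id, error)
      call h5sclose_f(dspace_id, error)
      call h5gclose_f(grp_id, error)
      call h5fclose_f(file_id, error)
      call h5close_f(error)
    end program chunked_create

Chunking and compression are pure storage properties: readers use the same read calls whether or not they are enabled.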
How to write a simple dataset

    use hdf5
    CALL h5open_f(error)
    CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error)
    ! begin repetitive task
    call h5screate_simple_f(rank, dims, dspace_id, error)
    call h5dcreate_f(file_id, dsetname, H5T_NATIVE_DOUBLE, dspace_id, &
                     dset_id, error)
    call h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, dset, dims, error)
    call h5dclose_f(dset_id, error)    ! end access to the dataset
    call h5sclose_f(dspace_id, error)  ! terminate access to the dataspace
    ! end repetitive task
    CALL h5fclose_f(file_id, error)
    CALL h5close_f(error)

Now let's see some real code (a complete, compilable version of this snippet is sketched at the end of this summary).

To learn more
• Official documentation and tutorials: www.hdfgroup.org
• A very interesting series of slides: http://www.lofar.org/wiki/doku.php?id=public:hdf5
• CINECA will hold a two-day course (16-17 May 2013), "Parallel I/O and management of large scientific data" @ CINECA: http://events.prace-ri.eu/conferenceDisplay.py?confId=126
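As promised, here is how the calls on the "How to write a simple dataset" slide assemble into a complete program. This is a minimal sketch: the declarations, the file name example.h5, the dataset name temperature, and the dummy data are my own additions, not from the slides.

    program write_dataset
      use hdf5
      implicit none
      character(len=*), parameter :: filename = "example.h5"   ! hypothetical name
      character(len=*), parameter :: dsetname = "temperature"  ! hypothetical name
      integer, parameter :: rank = 2
      integer(HSIZE_T), dimension(rank) :: dims = (/4, 6/)
      double precision, dimension(4,6) :: dset
      integer(HID_T) :: file_id, dspace_id, dset_id
      integer :: error

      dset = 1.0d0                      ! fill the array with dummy data

      call h5open_f(error)              ! initialize the Fortran interface
      call h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error)

      call h5screate_simple_f(rank, dims, dspace_id, error)
      call h5dcreate_f(file_id, dsetname, H5T_NATIVE_DOUBLE, dspace_id, &
                       dset_id, error)
      call h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, dset, dims, error)
      call h5dclose_f(dset_id, error)
      call h5sclose_f(dspace_id, error)

      call h5fclose_f(file_id, error)
      call h5close_f(error)             ! close the Fortran interface
    end program write_dataset

It is typically built with the compiler wrapper shipped with the library, e.g. h5fc write_dataset.f90, and the resulting file can be inspected with h5dump example.h5.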
Recommended publications
  • Parallel Data Analysis Directly on Scientific File Formats
    Parallel Data Analysis Directly on Scientific File Formats
    Spyros Blanas (1), Kesheng Wu (2), Surendra Byna (2), Bin Dong (2), Arie Shoshani (2); (1) The Ohio State University, (2) Lawrence Berkeley National Laboratory. [email protected], {kwu, sbyna, dbin, ashosani}@lbl.gov
    ABSTRACT: Scientific experiments and large-scale simulations produce massive amounts of data. Many of these scientific datasets are arrays, and are stored in file formats such as HDF5 and NetCDF. Although scientific data management systems, such as SciDB, are designed to manipulate arrays, there are challenges in integrating these systems into existing analysis workflows. Major barriers include the expensive task of preparing and loading data before querying, and converting the final results to a format that is understood by the existing post-processing and visualization tools. As a consequence, integrating a data management system into an existing scientific data analysis workflow is time-consuming and requires extensive user involvement.
    From the introduction: "… and physics, produce massive amounts of data. The size of these datasets typically ranges from hundreds of gigabytes to tens of petabytes. For example, the Intergovernmental Panel on Climate Change (IPCC) multi-model CMIP-5 archive, which is used for the AR-5 report [22], contains over 10 petabytes of climate model data. Scientific experiments, such as the LHC experiment, routinely store many gigabytes of data per second for future analysis. As the resolution of scientific data is increasing rapidly due to novel measurement techniques for experimental data and computational advances for simulation data, the data volume is expected to grow even further in the near future. Scientific data are often stored in data formats that sup…"
  • IPython: A System for Interactive Scientific Computing
    PYTHON: BATTERIES INCLUDED
    IPython: A System for Interactive Scientific Computing
    Python offers basic facilities for interactive work and a comprehensive library on top of which more sophisticated systems can be built. The IPython project provides an enhanced interactive environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.
    The backbone of scientific computing is mostly a collection of high-performance code written in Fortran, C, and C++ that typically runs in batch mode on large systems, clusters, and supercomputers. However, over the past decade, high-level environments that integrate easy-to-use interpreted languages, comprehensive numerical libraries, and visualization facilities have become extremely popular in this field. As hardware becomes faster, the critical bottleneck in scientific computing isn't always the computer's processing time; the scientist's time is also a consideration. For this reason, systems that allow rapid algorithmic exploration, data analysis, and visualization have become a staple of daily scientific work.
    All these systems offer an interactive command line in which code can be run immediately, without having to go through the traditional edit/compile/execute cycle. This flexible style matches well the spirit of computing in a scientific context, in which determining what computations must be performed next often requires significant work. An interactive environment lets scientists look at data, test new ideas, combine algorithmic approaches, and evaluate their outcome directly. This process might lead to a final result, or it might clarify how they need to build a more static, large-scale production code. As this article shows, Python (www.python.org) is an excellent tool for such a workflow.
  • SSWGDL Installation on Ubuntu 10.04 Using VMWARE Player
    SSWGDL Installation on Ubuntu 10.04 using VMware Player
    1. Install Ubuntu 10.04 on VMware:
       a) ubuntu-10.04.1-desktop-i386.iso, 32-bit;
       b) configure with 1-2 GB memory, 20-40 GB disk;
       c) VMware tools;
       d) shared folders;
       e) do not enable multiple processors even if your machine supports many;
       f) password: yourpassword;
       g) /home/yourname - that's the way I did it;
       h) login name is your choice.
    2. Configure Ubuntu:
       a) do the default system update via the system update manager;
       b) install VMware tools using easy install; run the perl script (.pl), let it compile and install;
       c) use the Ubuntu Software Center;
       d) cvs, plplot X11 driver, tcsh, wxWidgets - I grabbed the wx2.8 dev and lib packages; see package-manager-installs.txt for details.
    3. Download and install GDL with dependencies:
       a) download and unpack the 0.90 release tar.gz into gdl-0.9 (use the current release of GDL);
       b) get dependencies using sudo apt-get build-dep gnudatalanguage;
       c) cd to gdl-0.9;
       d) configure using "./configure --with-Magick=no --with-python=no --with-openmp=no --with-hdf=no";
       e) does anyone know how to install numarray so we don't have to use the python=no switch?
       f) here is the message of success after configure:
          GDL - GNU Data Language
          • ----- compilation options: ---------------------------
          • System: i686-pc-linux-gnu
          • Installation prefix: /usr/local
          • C++ compiler: g++ -g -O2
          • OpenMP support: no
          • Build type: standalone (other: Python module)
          • ----- optional libraries (consult README/INSTALL): ---
          • wxWidgets: yes
          • Magick: no
          • NetCDF: yes
          • HDF4: no
          • HDF5: yes
          • FFTW: yes
          • libproject: no (see also
  • A Metadata Based Approach for Supporting Subsetting Queries Over Parallel HDF5 Datasets
    A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets
    Thesis presented in partial fulfillment of the requirements for the degree Master of Science in the Graduate School of The Ohio State University, by Vignesh Santhanagopalan, B.S., Graduate Program in Computer Science and Engineering, The Ohio State University, 2011. Thesis committee: Dr. Gagan Agrawal, Advisor; Dr. Radu Teodorescu.
    ABSTRACT: A key challenge in scientific data management is coping with data sizes that are growing at a rapid pace. Scientific datasets are typically stored using low-level formats that keep the data as binary, which makes specifying and processing the data very hard. Also, as the volume of data is huge, parallel configurations must be used to process the data to enable efficient access. We have developed a data virtualization approach for supporting subsetting queries on scientific datasets stored in native format. The data is stored in Hierarchical Data Format (HDF5), which is one of the popular formats for storing scientific data. Our system supports SQL queries using the Select, From and Where clauses. We support queries based on the dimensions of the dataset, and also queries based on both the dimensions and the attributes (which provide extra information about the dataset). In order to support the different types of queries, we have pre-processing and post-processing modules. We also parallelize the selection queries involving the dimensions and the attributes. Our system offers the following advantages: we provide an SQL-like abstraction for specifying subsets of interest, which is a powerful mechanism.
  • GDL Language for Scientific Data Processing (Jazyk GDL na spracovanie vedeckých dát)
    GDL LANGUAGE FOR SCIENTIFIC DATA PROCESSING
    ŠECHNÝ, Martin (SK)
    Abstract. GNU Data Language (GDL) is a language for scientific data processing and also an environment for running programs in that language. GDL is free software compatible with the commercially licensed Interactive Data Language (IDL). GDL is a platform-independent environment and uses other available installed libraries and applications. The GDL language can process keyboard input, data files and images, and can visualize data with tables, charts and pictures. GDL is effective for numerical data analysis, vector representation, and the use of mathematical functions and procedures. This tool is suitable for wide use in science and research, and as an alternative to well-known mathematical and visualization tools.
    Key words and phrases. GDL, IDL, scientific data, programming, visualization.
    1 Introduction
    GNU Data Language (GDL) is a language for scientific data processing and also an environment (an interpreter and incremental compiler) for running programs written in that language.
  • HUDDL for Description and Archive of Hydrographic Binary Data
    HUDDL for description and archive of hydrographic binary data
    Giuseppe Masetti and Brian R. Calder, Center for Coastal and Ocean Mapping & Joint Hydrographic Center, University of New Hampshire, Durham (USA)
    Recommended Citation: G. Masetti and B. R. Calder, "HUDDL for description and archive of hydrographic binary data", Canadian Hydrographic Conference 2014, April 14-17, 2014, St. John's, NL, Canada.
    Abstract: Many of the attempts to introduce a universal hydrographic binary data format have failed or have been only partially successful. In essence, this is because such formats either have to simplify the data to such an extent that they only support the lowest common subset of all the formats covered, or they attempt to be a superset of all formats and quickly become cumbersome. Neither choice works well in practice.
  • Towards Interactive, Reproducible Analytics at Scale on HPC Systems
    Towards Interactive, Reproducible Analytics at Scale on HPC Systems
    Shreyas Cholia, Matthew Henderson, Drew Paine, Ludovico Bianchi, Devarshi Ghoshal, Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory, Berkeley, USA); Lindsey Heagy, Jon Hays, Fernando Pérez (University of California, Berkeley, USA)
    Abstract—The growth in scientific data volumes has resulted in a need to scale up processing and analysis pipelines using High Performance Computing (HPC) systems. These workflows need interactive, reproducible analytics at scale. The Jupyter platform provides core capabilities for interactivity but was not designed for HPC systems. In this paper, we outline our efforts that bring together core technologies based on the Jupyter Platform to create interactive, reproducible analytics at scale on HPC systems. Our work is grounded in a real world science use case - applying geophysical simulations and inversions for imaging the subsurface.
    From the introduction: "… analytics at scale. Our work addresses these gaps. There are three key components driving our work:
    • Reproducible Analytics: Reproducibility is a key component of the scientific workflow. We wish to capture the entire software stack that goes into a scientific analysis, along with the code for the analysis itself, so that this can then be re-run anywhere. In particular it is important to be able to reproduce workflows on HPC systems, against …"
  • Parsing Hierarchical Data Format (HDF) Files Karl Nyberg Grebyn Corporation P
    Parsing Hierarchical Data Format (HDF) Files
    Karl Nyberg, Grebyn Corporation, P. O. Box 47, Sterling, VA 20167-0047, 703-406-4161, [email protected]
    ABSTRACT: This paper presents a description of the creation of a library to parse Hierarchical Data Format (HDF) files in Ada. It describes a "work in progress" with discussion of current performance, limitations and future plans.
    Categories and Subject Descriptors: D.2.2 [Design Tools and Techniques]: Software Libraries; E.2 [Data Storage Representations]: Object Representation; I.2.10 [Vision and Scene Understanding]: Representations, data structures, and transforms.
    General Terms: Algorithms, Performance, Experimentation, Data Formats.
    Keywords: Ada, HDF.
    1. INTRODUCTION: Large quantities of scientific data are published each year by NASA. These data are often accompanied by metadata files that describe the contents of individual files of the data. One example of this data is ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) [1]. Each file of data (consisting of satellite imagery) in HDF (Hierarchical Data Format) [2] is accompanied by a metadata file in XML (Extensible Markup Language) [3], encoded according to a published DTD (Document Type Description) that indicates the components and types of data in the metadata file. Each ASTER data file consists of an image taken by the satellite as it passes over the earth [4]. Information on the location of the data collected as the satellite passes is contained in the metadata file. Over time, multiple images of the same location on earth are obtained. For many purposes of analysis (erosion, building patterns, deforestation, glacier movement, etc.), these images of the same location are compared over time.
  • NetCDF-4: A New Data Model, Programming Interface, and Format Using HDF5
    NetCDF-4: A New Data Model, Programming Interface, and Format Using HDF5
    Russ Rew, Ed Hartnett, John Caron (UCAR Unidata Program Center); Mike Folk, Robert McGrath, Quincey Kozial (NCSA and The HDF Group, Inc.). Final Project Review, August 9, 2005.
    Motivation: Why is this area of work important? While the commercial world has standardized on the relational data model and SQL, no single standard or tool has critical mass in the scientific community. There are many parallel and competing efforts to build these tool suites - at least one per discipline. Data interchange outside each group is problematic. "In the next decade, as data interchange among scientific disciplines becomes increasingly important, a common HDF-like format and package for all the sciences will likely emerge." - Jim Gray, Distinguished Engineer at Microsoft, 1998 Turing Award winner. ("Scientific Data Management in the Coming Decade," Jim Gray, David T. Liu, Maria A. Nieto-Santisteban, Alexander S. Szalay, Gerd Heber, David DeWitt, Cyberinfrastructure Technology Watch Quarterly, Volume 1, Number 2, February 2005.)
    Preservation of scientific data: "… the ephemeral nature of both data formats and storage media threatens our very ability to maintain scientific, legal, and cultural continuity, not on the scale of centuries, but considering the unrelenting pace of technological change, from one decade to the next. … And that's true not just for the obvious items like images, documents, and audio files, but also for scientific images, … and simulations. In the scientific research community, standards are emerging here and there—HDF (Hierarchical Data Format), NetCDF (network Common Data Form), FITS (Flexible Image Transport System)—but much work remains to be done to define a common cyberinfrastructure." - MacKenzie Smith, Associate Director for Technology at the MIT Libraries, Project director at MIT for DSpace, a groundbreaking …
  • A Very Useful Enhancement for MSC Nastran and Patran
    Tips & Tricks
    HDF5: A Very Useful Enhancement for MSC Nastran and Patran
    Let's face it, as FEA engineers we've all been in that situation where the finite element model has been prepared, boundary conditions have been checked and the model has been solved, but then it comes time to investigate the results. It's not unusual on a large project to produce different outputs for different purposes: an XDB or OP2 for Patran, a Punch file to pass to Excel, and the f06 to read the printed output. In addition, your company may have programs that read one of these files to perform special purpose calculations. Managing all these files becomes a project of its own. Well let me tell you, there is a better way to do this!!
    As a premier FEA solver that is widely used by the Aerospace and Automotive industries, MSC Nastran now takes advantage of the HDF5 file to manage and store your FEA data. This can simplify your post processing tasks and make data management much easier and simpler (see the figure "Before HDF5 / After HDF5").
    So what is HDF5? It is an open source file format, data model, and library developed by the HDF Group. HDF5 has three significant advantages compared to previous result file formats: 1) the HDF5 file is smaller than XDB and OP2, 2) accessing results is significantly faster with HDF5, and 3) the input and output datablocks are stored in a single, high precision file. With input and output in the same file, you eliminate confusion and potential errors keeping … Third-party extensions are available for Python, .Net, and many other languages.
  • Status of GDL-GNU Data Language
    Astronomical Data Analysis Software and Systems XIX, ASP Conference Series, Vol. XXX, 2009, Y. Mizumoto, K.-I. Morita, and M. Ohishi, eds. (O14.3; arXiv:1101.0679v1 [astro-ph.IM], 4 Jan 2011)
    Status of GDL - GNU Data Language
    A. Coulais (LERMA, Obs. de Paris, ENS, UPMC, UCP, CNRS, Paris, France); M. Schellens (head of the project); J. Gales (Goddard Space Flight Center, Greenbelt, MD, USA); S. Arabas (Institute of Geophysics, University of Warsaw, Poland); M. Boquien (University of Massachusetts, Dep. of Astronomy, Amherst, MA, USA); P. Chanial; P. Messmer, D. Fillmore (Tech-X GmbH, Zurich, Switzerland; Tech-X Corp, Boulder, CO, USA); O. Poplawski (Colorado Div. (CoRA) of NorthWest Res. Ass. Inc., Boulder, CO, USA); S. Maret (LAOG, Obs. de Grenoble, UJF, CNRS, Grenoble, France); G. Marchal, N. Galmiche, T. Mermet (former students at LERMA CNRS and Observatoire de Paris)
    Abstract. GNU Data Language (GDL) is an open-source interpreted language aimed at numerical data analysis and visualisation. It is a free implementation of the Interactive Data Language (IDL) widely used in Astronomy. GDL has full syntax compatibility with IDL, and includes a large set of library routines targeting advanced matrix manipulation, plotting, time-series and image analysis, mapping, and data input/output including numerous scientific data formats. We will present the current status of the project, the key accomplishments, and the weaknesses - areas where contributions are welcome!
    1. Dependencies
    GDL is written in C++ and can be compiled on systems with GCC (≥ 3.4) and X11 or equivalents. The code, under the GNU GPL, is hosted by SourceForge.
  • GNU Data Language (GDL) - a Free and Open-Source Implementation of IDL
    Geophysical Research Abstracts, Vol. 12, EGU2010-924-1, 2010, EGU General Assembly 2010, © Author(s) 2009
    GNU Data Language (GDL) - a free and open-source implementation of IDL
    Sylwester Arabas (1), Marc Schellens, Alain Coulais (2), Joel Gales (3), and Peter Messmer (4)
    (1) Institute of Geophysics, University of Warsaw, Warsaw, Poland ([email protected] / +48225546882), (2) LERMA, CNRS and Observatoire de Paris, Paris, France, (3) NASA Goddard Space Flight Center, Greenbelt, Maryland, USA, (4) Tech-X Corporation, Boulder, Colorado, USA
    GNU Data Language (GDL) is developed with the aim of providing an open-source drop-in replacement for ITTVIS's Interactive Data Language (IDL). It is free software developed by an international team of volunteers led by Marc Schellens, the project's founder (a list of contributors is available on the project's website). The development is hosted on SourceForge, where GDL continuously ranks in the 99th percentile of most active projects. GDL with its library routines is designed as a tool for numerical data analysis and visualisation. Like its proprietary counterparts (IDL and PV-WAVE), GDL is used particularly in geosciences and astronomy. GDL is dynamically-typed, vectorized, and has object-oriented programming capabilities. The library routines handle numerical calculations, data visualisation, signal/image processing, interaction with the host OS, and data input/output. GDL supports several data formats such as netCDF, HDF4, HDF5, GRIB, PNG, TIFF, DICOM, etc. Graphical output is handled by X11, PostScript, SVG or z-buffer terminals, the last one allowing output to be saved in a variety of raster graphics formats.