DATA PROCESSING AND ANALYSIS
TOOLS FOR DATA ANALYSIS

13 NOV 2018 | SABINE SCHRÖDER, IEK-8

OUTLINE
1. Motivation
2. Tools and data standards
3. Commands, Interpreter, Programming: When to use what?
4. Tools in command line operators/viewers
5. Tools in interpreted languages
6. Tools in compiled languages
7. Summary

13 November 2018 Sabine Schröder Tools for data analysis MOTIVATION

TOOLS AND DATA STANDARDS

• A chosen data format influences the variety of available tools.
• Sticking to data standards enhances the availability of reusable tools.
• Keep up to date: not only data formats develop, but also tools!

[Diagram: a standard data format feeding tool1, tool2, tool3, tool4, …, tooln]

COMMANDS, INTERPRETERS, COMPILED PROGRAMS

• Ease of learning

command line tool:
    > print_helloWorld

interpreter:
    >>> print("Hello, world!")

programming:
    PROGRAM Hello
      WRITE(*,*) "Hello, world!"
    END PROGRAM Hello

• Getting up and running

command line tool:
    • download

interpreter:
    • download and installation of interpreter
    • learn syntax of interpreter

programming:
    • license compiler
    • install compiler
    • learn about compiler
    • learn syntax of programming language
    • after programming, compile, then run

• Speed: a productivity vs. performance tradeoff

command line tool (0.16 s):
    > ncwa --no_tmp_fl -y max -v tpot test.nc max_tpot.nc

interpreter (0.05 s):
    >>> from netCDF4 import Dataset
    >>> rootgrp = Dataset("test.nc","r")
    >>> print(rootgrp.variables["tpot"][:,:,:,:].max())
    >>> rootgrp.close()

programming (0.03 s):
    PROGRAM Maxi
      USE netcdf
      st=nf90_open("test.nc", NF90_NOWRITE, ncid)
      st=nf90_inq_varid(ncid, "tpot", tpotId)
      st=nf90_get_var(ncid, tpotId, tpot)
      write(*,*) maxval(tpot)
    END PROGRAM Maxi

• Scope of use

• Requirements

TOOLS IN COMMAND LINE OPERATORS/VIEWERS
NCDUMP / NCGEN

From CDL to
• netCDF-3
• netCDF-4
• C/F77/JAVA program

[Figure labels: Header, Variable, Record]

TOOLS IN COMMAND LINE VIEWERS
NCVIEW

http://meteora.ucsd.edu/~pierce/ncview_home_page.html

TOOLS IN COMMAND LINE VIEWERS
PANOPLY

https://www.giss.nasa.gov/tools/panoply/

TOOLS IN COMMAND LINE OPERATORS
NCO (NETCDF OPERATORS)
http://nco.sourceforge.net/

• standalone, command-line programs to
  . derive new fields
  . compute statistics
  . hyperslab
  . manipulate metadata
  . regrid
• input:
  . netCDF, HDF, DAP
  . flat files
  . GODAD (Group-Oriented Data Analysis and Distribution)

TOOLS IN COMMAND LINE OPERATORS
NCO (NETCDF OPERATORS)

Operators:
    ncap2 (arithmetic processor (version 2))
    ncatted (attribute editor)
    ncbo (binary operator)
    ncclimo (climatology generator)
    ncecat (ensemble concatenator)
    nces (ensemble statistics)
    ncflint (file interpolator)
    ncks (kitchen sink)
    ncpdq (permute dimensions quickly)
    ncra (record average)
    ncrcat (record concatenator)
    ncremap (remapper)
    ncrename (renamer)
    ncwa (weighted average)

ncap2
• arithmetically processes netCDF files
• instructions from command line or from file
• treats missing values
• definition of dimensions possible
• can link to the GNU Scientific Library (GSL)
• can create derived fields

example:
    ncap2 -v -s 'a=3;b=4;c=sqrt(a^2+b^2)' in.nc out.nc

ncatted
append, create, delete, modify, and overwrite attributes

example:
    ncatted -a history,global,a,c,'Data version 2.0\n' in.nc

ncbo
four operations: Addition, Subtraction, Multiplication, Division

example:
    ncbo --op_typ=sub 86_0112.nc 85_0112.nc 86m85_0112.nc
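What a binary operation such as --op_typ=sub amounts to can be sketched in Python: corresponding elements of two equally shaped fields are combined, and a missing value in either input stays missing. The function name `binary_op`, the list-based fields, and the marker `MISSING` are assumptions made for illustration; this is not NCO's implementation.

```python
import operator

MISSING = -9999.0  # hypothetical missing-value marker for this sketch

# The four operations named on the slide
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}

def binary_op(op_typ, field1, field2, missing=MISSING):
    """Combine two equally long fields element by element."""
    if len(field1) != len(field2):
        raise ValueError("fields must have the same length")
    op = OPS[op_typ]
    # A missing value in either operand propagates to the result
    return [missing if missing in (a, b) else op(a, b)
            for a, b in zip(field1, field2)]

t86 = [280.5, 281.0, MISSING, 279.0]
t85 = [280.0, 282.0, 278.0, 279.0]
diff = binary_op("sub", t86, t85)
print(diff)
```

Real files carry multi-dimensional variables; the flat lists here only stand in for one variable's values.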

ncclimo
• Climatology modes: annual, monthly, daily
• seasons: jfm,amj,jas,ond,on,fm,djf,mam,jja,son,ann
• number of timesteps-per-day in output
• automatic filename creation and splitting of output files

example:
    ncclimo -C ann -m cism -h h -c caseid -s 1851 -e 1900 -i drc_in -o drc_out

ncecat
create one output file:
• Record Aggregation – NetCDF3
• Group Aggregation – NetCDF4

example:
    ncecat -u realization 85_0[1-5].nc 85.nc

nces
gridpoint statistics on variables across
• an ensemble of input-files
• input groups within each file

statistics:
    avg     Mean value
    sqravg  Square of the mean
    avgsqr  Mean of sum of squares
    max     Maximum value
    min     Minimum value
    mabs    Maximum absolute value
    mebs    Mean absolute value
    mibs    Minimum absolute value
    rms     Root-mean-square (normalized by N)
    rmssdn  Root-mean square (normalized by N-1)
    sqrt    Square root of the mean
    tabs    Sum of absolute values
    ttl     Sum of values

example:
    nces 85_0[1-5].nc 85.nc
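The statistics named above can be written out for a single grid point, where `values` holds that point's value in each ensemble member. This is a minimal sketch following the descriptions on the slide; the function name `nco_statistics` is hypothetical, missing values are ignored, and the sketch assumes at least two values and a non-negative mean (needed for `rmssdn` and `sqrt`).

```python
import math

def nco_statistics(values):
    """Statistics from the slide for one grid point across N inputs."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    mean = total / n
    abs_values = [abs(v) for v in values]
    return {
        "avg": mean,                            # mean value
        "sqravg": mean ** 2,                    # square of the mean
        "avgsqr": sum_sq / n,                   # mean of sum of squares
        "max": max(values),
        "min": min(values),
        "mabs": max(abs_values),                # maximum absolute value
        "mebs": sum(abs_values) / n,            # mean absolute value
        "mibs": min(abs_values),                # minimum absolute value
        "rms": math.sqrt(sum_sq / n),           # normalized by N
        "rmssdn": math.sqrt(sum_sq / (n - 1)),  # normalized by N-1
        "sqrt": math.sqrt(mean),                # square root of the mean
        "tabs": sum(abs_values),                # sum of absolute values
        "ttl": total,                           # sum of values
    }

stats = nco_statistics([1.0, 2.0, 3.0, 4.0])
print(stats["avg"], stats["sqravg"], stats["avgsqr"])
```

Note the difference between `sqravg` (square the mean) and `avgsqr` (average the squares); the two coincide only when all values are equal.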

ncflint
linear combination of input files:
• weighted average
• normalized weighted average
• interpolation

example:
    ncflint -i time,86 85.nc 87.nc 86.nc
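The interpolation case can be sketched as a linear combination whose two weights follow from the target coordinate value. The helper names and the short lists standing in for file contents are assumptions for illustration, not NCO code.

```python
def interpolation_weights(x1, x2, x):
    """Weights for linear interpolation to x between x1 and x2."""
    if x1 == x2:
        raise ValueError("x1 and x2 must differ")
    w1 = (x2 - x) / (x2 - x1)
    return w1, 1.0 - w1

def linear_combination(field1, field2, w1, w2):
    """Element-wise w1*field1 + w2*field2."""
    return [w1 * a + w2 * b for a, b in zip(field1, field2)]

# Interpolate to time 86 from inputs valid at time 85 and time 87
w1, w2 = interpolation_weights(85, 87, 86)
field86 = linear_combination([10.0, 20.0], [14.0, 30.0], w1, w2)
print(w1, w2, field86)
```

With a target halfway between the two inputs, both weights are 0.5 and the result is the plain average of the two fields.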

ncks
file creation and conversion with special features:
• extract
• hyperslab
• multi-slab
• sub-set
• translate

example:
    ncks -d time,5 -d lat,,0.0 -d lon,260.0,45.0 -d lev,1000.0 in.nc out.nc

ncpdq
two distinct functions:
• packing (pdq: Pack Data Quietly)
• dimension permutation (pdq: Permute Dimensions Quickly)

example:
packing:
    ncpdq in.nc out.nc
dimension permutation:
    ncpdq -a lon,-lat in.nc out.nc
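Packing stores floating-point data as small integers together with a scale_factor and an add_offset, so that unpacked = packed * scale_factor + add_offset. The sketch below shows one common way to choose the two parameters for 16-bit integers; the exact choice made here (value range mapped symmetrically onto the integer range) is an assumption and may differ from what ncpdq does by default.

```python
def pack(values, nbits=16):
    """Linearly pack floats into signed integers of width nbits."""
    vmin, vmax = min(values), max(values)
    ndrv = 2 ** nbits - 2                        # usable integer steps
    scale_factor = (vmax - vmin) / ndrv or 1.0   # 1.0 for constant fields
    add_offset = 0.5 * (vmin + vmax)             # centre of the value range
    packed = [round((v - add_offset) / scale_factor) for v in values]
    return packed, scale_factor, add_offset

def unpack(packed, scale_factor, add_offset):
    """Invert the packing using the netCDF packing convention."""
    return [p * scale_factor + add_offset for p in packed]

values = [250.0, 273.15, 300.0]
packed, scale_factor, add_offset = pack(values)
restored = unpack(packed, scale_factor, add_offset)
print(packed, restored)
```

Packing is lossy: each restored value can differ from the original by up to half a scale_factor.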

ncra
computes statistics of record variables across an arbitrary number of input-files

statistics (definitions as for nces):
    avg, sqravg, avgsqr, max, min, mabs, mebs, mibs, rms, rmssdn, sqrt, tabs, ttl

example:
    ncra -d time,11,13 85.nc 86.nc 87.nc 8512_8602.nc

ncrcat
concatenates record variables across an arbitrary number of input-files

example:
    ncrcat 85.nc 86.nc 87.nc 88.nc 89.nc 8589.nc

ncremap
remap to grid specified by a map file (weight-file), grid destination file, or a template file (data file on destination grid)

example:
    ncremap -d dst.nc in.nc out.nc

ncrename
renames dimensions, variables, attributes, and groups

example:
    ncrename -d lon,longitude -v lon,longitude in.nc

ncwa
computes statistics on variables in a single file over arbitrary dimensions, with options to specify weights, masks, and normalization

statistics (definitions as for nces):
    avg, sqravg, avgsqr, max, min, mabs, mebs, mibs, rms, rmssdn, sqrt, tabs, ttl

example:
    ncwa -y max -v tpot test.nc max_tpot.nc
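A weighted, masked average over one dimension can be sketched as follows. The function name `weighted_average`, the cosine-of-latitude weights, and the sample values are assumptions chosen for illustration; they do not reproduce ncwa's options or normalization choices.

```python
import math

def weighted_average(values, weights, mask=None):
    """Weighted mean; entries whose mask is False are skipped."""
    if mask is None:
        mask = [True] * len(values)
    num = sum(w * v for v, w, m in zip(values, weights, mask) if m)
    den = sum(w for w, m in zip(weights, mask) if m)
    if den == 0:
        raise ValueError("sum of weights is zero")
    return num / den

lats = [-60.0, 0.0, 60.0]
temps = [270.0, 300.0, 270.0]
# Area weighting on a regular lat-lon grid: weight by cos(latitude)
weights = [math.cos(math.radians(lat)) for lat in lats]

mean = weighted_average(temps, weights)
tropics_only = weighted_average(temps, weights, mask=[False, True, False])
print(mean, tropics_only)
```

The unweighted mean of the three sample values would be 280; the cosine weighting gives the equatorial point more influence.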

TOOLS IN COMMAND LINE OPERATORS
CDO (CLIMATE DATA OPERATORS)
https://code.mpimet.mpg.de/projects/cdo

• collection of command-line operators to
  . manipulate
  . analyse
  climate and NWP model data (more than 600 operators available)

• input:
  . GRIB 1/2
  . netCDF 3/4
  . SERVICE, EXTRA and IEG

TOOLS IN COMMAND LINE OPERATORS
CDO (CLIMATE DATA OPERATORS)

Collections:
• Information
• File operations
• Selection/Conditional selection
• Comparison
• Modification
• Arithmetic
• Statistical values/Correlation/Regression/EOF
• Interpolation
• Transformation
• Import/Export
• Miscellaneous/NCL

Information:
• Dataset information
• Comparison of two datasets
• Number of parameters/levels/years/months/dates/timesteps/…
• Show standard_names/attributes/levels/date information/…
• Grid description

CDO has a Fortran interface: CDI

File operations:
• Copy datasets
• Concatenate datasets
• Replace variables
• Merge datasets
• Split by codenumber/levels/grids/hours/days/…/time selection
• Distribute/collect horizontal grid

Selection/Conditional selection:
• Select/delete fields
• Select parameters/levels/grids
• Select timesteps/hours/days/…
• Select lat-lon-box/index-box
• Select/delete grid cells
• Resample grid
• Use mask file with conditions (ifthen/ifnotthen/ifthenelse/…)
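The conditional operators can be read as element-wise selections driven by a mask field. The sketch below illustrates the idea for ifthen and ifthenelse on plain Python lists; using `None` as the missing value and the function signatures shown are assumptions for illustration, not CDO's interface.

```python
MISSING = None  # hypothetical stand-in for a missing value

def ifthen(cond, field):
    """Keep field where cond is non-zero and not missing, else missing."""
    return [f if c not in (0, MISSING) else MISSING
            for c, f in zip(cond, field)]

def ifthenelse(cond, field_then, field_else):
    """Take field_then where cond is non-zero, field_else where it is zero."""
    out = []
    for c, a, b in zip(cond, field_then, field_else):
        if c is MISSING:
            out.append(MISSING)   # a missing condition gives a missing result
        elif c != 0:
            out.append(a)
        else:
            out.append(b)
    return out

mask = [1, 0, MISSING, 2]
print(ifthen(mask, [10, 20, 30, 40]))
print(ifthenelse(mask, [10, 20, 30, 40], [-1, -2, -3, -4]))
```

The mask itself is typically produced by a comparison operator, so selection and comparison are often chained.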

Comparison:
    eq   Equal
    ne   Not equal
    le   Less equal
    lt   Less than
    ge   Greater equal
    gt   Greater than
    eqc  Equal constant
    nec  Not equal constant
    lec  Less equal constant
    ltc  Less than constant
    gec  Greater equal constant
    gtc  Greater than constant
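Each comparison operator yields a mask field holding 1 where the comparison is true and 0 where it is false. A minimal sketch of both variants (field against field, field against constant) follows; the function names are hypothetical and missing values are not handled here.

```python
import operator

# Operator names from the list above mapped to Python comparisons
COMPARE = {
    "eq": operator.eq, "ne": operator.ne,
    "le": operator.le, "lt": operator.lt,
    "ge": operator.ge, "gt": operator.gt,
}

def compare_fields(name, field1, field2):
    """Element-wise comparison of two fields, e.g. name='le'."""
    cmp = COMPARE[name]
    return [1.0 if cmp(a, b) else 0.0 for a, b in zip(field1, field2)]

def compare_const(name, field, constant):
    """Comparison against a constant, e.g. name='gtc'."""
    cmp = COMPARE[name[:-1]]  # drop the trailing 'c'
    return [1.0 if cmp(v, constant) else 0.0 for v in field]

above_freezing = compare_const("gtc", [271.0, 273.15, 280.0], 273.15)
print(above_freezing)
```

The resulting mask can be summed to count events or fed into a conditional selection.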

Modification:
• Modify
  . metadata
  . fields or part of a field in a dataset
• Set attributes/date/time/bounds/grids/levels/missing value/valid range
• Invert latitudes/levels
• Shift x/y
• Mask regions/boxes

Arithmetic:
• arithmetically process datasets via expression (scripts)
• operators (abs,sqrt,acos,log10,…)
• Operate on two fields (add/sub/min/…)
• Monthly/multiyear [hourly/daily/monthly/seasonal] arithmetics (add, sub, mul, div)
• Days per month (add, sub, …)

Statistical values/Correlation/Regression/EOF:
• Cumulative values
• Ensemble/field/zonal/meridional/gridbox/vertical/time selection/running/time/hourly/monthly/yearly/seasonal/multiyear statistics
• Correlation in grid/time
• Covariance in grid/time
• Regression
• Trends
• EOF calculations
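Two of the time-series statistics in this collection, a running mean and a linear trend, can be sketched for a single grid point. The function names, the unit time step, and the sample series are assumptions for illustration; the sketch does not mirror CDO's handling of missing values or time axes.

```python
def running_mean(series, nts):
    """Mean over every window of nts consecutive timesteps."""
    if nts < 1 or nts > len(series):
        raise ValueError("window must fit into the series")
    return [sum(series[i:i + nts]) / nts
            for i in range(len(series) - nts + 1)]

def linear_trend(series):
    """Least-squares intercept and slope for timesteps 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return y_mean - slope * t_mean, slope

smoothed = running_mean([1.0, 2.0, 3.0, 4.0, 5.0], 3)
intercept, slope = linear_trend([1.0, 3.0, 5.0, 7.0])
print(smoothed, intercept, slope)
```

Applied to gridded data, the same calculation would run independently at every grid point.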

Interpolation:
• horizontal fields to a new grid
• interpolation of 3D variables from hybrid model levels to height or pressure levels
• interpolation in time between time steps and years
• linear/bilinear/cubic interpolation
• nearest neighbor/distance-weighted average remapping

Transformation:
• spectral to gridpoint and vice versa
• divergence and vorticity to U and V wind and vice versa
• D and V to velocity potential and stream function

Import/Export:
import and export data files which cannot be read or written directly with CDO

Miscellaneous/NCL:
• Filtering (band/low/highpass)
• Create pressure/temperature values for hydrostatic atmosphere
• Potential temperature to in-situ temperature (and vice versa)
• Histogram
• Frost/strong wind/strong breeze/strong gale/hurricane days
• GrADS data descriptor file
• ECHAM post processor

TOOLS IN INTERPRETED LANGUAGES
NCL (NCAR COMMAND LANGUAGE)

    ncl 0> a=addfile("test.nc","r")
(also OPeNDAP possible)

NCL function categories:

• General NCL routines
  . Array creation, manipulation, query
  . Group creation, query
  . List routines
  . String
  . System
  . Type conversion
  . Variable query, manipulation
• Input/output
  . File input/output
  . Printing
• Math and statistics
  . General applied math
  . Bootstrap
  . Cumulative distribution functions
  . Empirical orthogonal functions
  . ESMF regridding
  . Extreme values
  . Heat stress
  . Interpolation
  . Ngmath routines
  . Random number generators
  . Regridding
  . Singular value decomposition
  . Spherical harmonics
  . Statistics
• Earth Science
  . Climatology
  . CESM
  . Crop
  . Heat-stress
  . Date
  . Drought
  . Lat/lon functions
  . Metadata/missing values
  . Oceanography
  . RIP functions
  . WRF functions
• Visualization
  . Graphics routines
  . Color
  . Object manipulation
  . Workstation

[Figure: Climatology]

TOOLS IN INTERPRETED LANGUAGES
GRADS (GRID ANALYSIS AND DISPLAY SYSTEM)

    ga-> sdfopen test.nc

Data Descriptor File
    DSET            STID
    CHSUB           TVAR
    DTYPE           TOFFVAR
    INDEX           CACHESIZE
    STNMAP          OPTIONS
    TITLE           PDEF
    UNDEF           XDEF
    UNPACK          YDEF
    FILEHEADER      ZDEF
    XYHEADER        TDEF
    XYTRAILER       EDEF
    THEADER         VECTORPAIRS
    HEADERBYTES     VARS
    TRAILERBYTES    ENDVARS
    XVAR            ATTRIBUTE METADATA
    YVAR            COMMENTS
    ZVAR

TOOLS IN INTERPRETED LANGUAGES
(PY)FERRET

    yes? use test.nc
(USE is an alias for SET DATA/FORMAT=CDF; also OPeNDAP possible)

To output a variable in NetCDF:
    yes? LIST/FORMAT=CDF variable_name
(If a filename is not specified for writing, Ferret will generate one.)

TOOLS IN INTERPRETED LANGUAGES
R

    > library(ncdf4)
    > ncin <- nc_open('test.nc')

TOOLS IN INTERPRETED LANGUAGES
PYTHON (http://unidata.github.io/netcdf4-python/)

    >>> from netCDF4 import Dataset
    >>> rootgrp = Dataset("test.nc","r")

Functions:
    chartostring, date2index, date2num, getlibversion, num2date, stringtoarr, stringtochar

Classes:
    CompoundType, Dataset, Dimension, EnumType, Group, MFDataset, MFTime, VLType, Variable

Class CompoundType:
    __init__

Class Dataset:
    __init__, close, createCompoundType, createDimension, createEnumType, createGroup, createVLType, createVariable, delncattr, filepath, get_variables_by_attributes, getncattr, isopen, ncattrs, renameAttribute, renameDimension, renameGroup, renameVariable, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_fill_off, set_fill_on, setncattr, setncattr_string, setncatts, sync

Class Dimension:
    __init__, group, isunlimited

Class EnumType:
    __init__

Class Group:
    same methods as listed for Dataset

Class MFDataset:
    same methods as listed for Dataset

Class MFTime:
    __init__, ncattrs, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, typecode

Class VLType:
    __init__

Class Variable:
    __init__, assignValue, chunking, delncattr, endian, filters, getValue, get_dims, get_var_chunk_cache, getncattr, group, ncattrs, renameAttribute, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_collective, set_var_chunk_cache, setncattr, setncattr_string, setncatts, use_nc_get_vars

TOOLS IN INTERPRETED LANGUAGES
PYTHON (EXCURSUS: PANGEO)

• Foster collaboration around the open source scientific python ecosystem for ocean / atmosphere / land / climate science.
• Support the development with domain-specific geoscience packages.
• Improve scalability of these tools to handle petabyte-scale datasets on HPC and cloud platforms.

[Figure: The Python Data Stack (Jake VanderPlas, “The State of the Stack,” SciPy Keynote (SciPy 2015))]
[Figure: The Pangeo Platform (Abernathey et al. (2017), “Pangeo: An Open Source Big Data Climate Science Platform”, NSF award 1740648)]

TOOLS IN INTERPRETED LANGUAGES
IDL (INTERACTIVE DATA LANGUAGE)

proprietary programming tool from Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation

    IDL> result=ncdf_open("test.nc")

Simplified Interface
    NCDF_GET - Retrieve variables and attributes from a NetCDF file.
    NCDF_LIST - Print out a list of variables and attributes from a NetCDF file.
    NCDF_PUT - Create or modify a NetCDF file.

Creating NetCDF Files
    NCDF_CREATE: Call this procedure to begin creating a new file. The new file is put into define mode.
    NCDF_DIMDEF: Create dimensions for the file.
    NCDF_VARDEF: Define the variables to be used in the file.
    NCDF_ATTPUT: Optionally, use attributes to describe the data.
    NCDF_CONTROL, /ENDEF: Call NCDF_CONTROL and set the ENDEF keyword to leave define mode and enter data mode.
    NCDF_VARPUT: Write the appropriate data to the netCDF file.
    NCDF_CLOSE: Close the file.

IDL (INTERACTIVE DATA LANGUAGE): Reading NetCDF Files

NCDF_IS_NCDF: Check if one or more input files are in NetCDF-3 format.
NCDF_OPEN: Open an existing netCDF file.
NCDF_PARSE: Return an ordered hash containing object information and data from a NetCDF-3 file.
NCDF_INQUIRE: Call this function to find the format of the netCDF file.
NCDF_DIMINQ: Retrieve the names and sizes of dimensions in the file.
NCDF_VARINQ: Retrieve the names, types, and sizes of variables in the file.
NCDF_ATTNAME: Optionally, retrieve attribute names.
NCDF_ATTINQ: Optionally, retrieve the types and lengths of attributes.
NCDF_ATTGET: Optionally, retrieve the attributes.
NCDF_VARGET: Read the data from the variables.
NCDF_CLOSE: Close the file.

TOOLS IN COMPILED LANGUAGES

FORTRAN90 (HTTPS://WWW.UNIDATA.UCAR.EDU/SOFTWARE/NETCDF/NETCDF-4/NEWDOCS/NETCDF-F90.HTML)

USE netcdf
st=nf90_open("test.nc", NF90_NOWRITE, ncid)
st=nf90_inq_varid(ncid, "tpot", tpotId)
st=nf90_get_var(ncid, tpotId, tpot)

Sections: Datasets, Groups, Dimensions, User Defined Data Types, Compound Types, Variable Length Array, Opaque Type, Enum Type, Variables, Attributes

FORTRAN90: Datasets
NF90_STRERROR, NF90_INQ_LIBVERS, NF90_CREATE, NF90_OPEN, NF90_REDEF, NF90_ENDDEF, NF90_CLOSE, NF90_INQUIRE Family, NF90_SYNC, NF90_ABORT, NF90_SET_FILL

FORTRAN90: Groups
NF90_INQ_NCID, NF90_INQ_GRPS, NF90_INQ_VARIDS, NF90_INQ_DIMIDS, NF90_INQ_GRPNAME_LEN, NF90_INQ_GRPNAME, NF90_INQ_GRPNAME_FULL, NF90_INQ_GRP_PARENT, NF90_DEF_GRP

FORTRAN90: Dimensions
NF90_DEF_DIM, NF90_INQ_DIMID, NF90_INQUIRE_DIMENSION, NF90_RENAME_DIM

FORTRAN90: User Defined Data Types
NF90_INQ_TYPEIDS, NF90_INQ_TYPE, NF90_INQ_USER_TYPE

FORTRAN90: Compound Types
NF90_DEF_COMPOUND, NF90_INSERT_COMPOUND, NF90_INSERT_ARRAY_COMPOUND, NF90_INQ_COMPOUND, NF90_INQ_COMPOUND_FIELD

FORTRAN90: Variable Length Array
NF90_DEF_VLEN, NF90_INQ_VLEN, NF90_FREE_VLEN

FORTRAN90: Opaque Type
NF90_DEF_OPAQUE, NF90_INQ_OPAQUE

FORTRAN90: Enum Type
NF90_DEF_ENUM, NF90_INSERT_ENUM, NF90_INQ_ENUM, NF90_INQ_ENUM_MEMBER, NF90_INQ_ENUM_IDENT

FORTRAN90: Variables
NF90_DEF_VAR, NF90_DEF_VAR_CHUNKING, NF90_INQ_VAR_CHUNKING, NF90_DEF_VAR_FILL, NF90_INQ_VAR_FILL, NF90_DEF_VAR_DEFLATE, NF90_INQ_VAR_DEFLATE, NF90_DEF_VAR_FLETCHER32, NF90_INQ_VAR_FLETCHER32, NF90_DEF_VAR_ENDIAN, NF90_INQ_VAR_ENDIAN, NF90_INQUIRE_VARIABLE, NF90_PUT_VAR, NF90_GET_VAR, NF90_RENAME_VAR

FORTRAN90: Attributes
NF90_PUT_ATT, NF90_INQUIRE_ATTRIBUTE, NF90_GET_ATT, NF90_COPY_ATT, NF90_RENAME_ATT, NF90_DEL_ATT

SUMMARY/OUTLOOK

• No single tool meets all requirements for working with atmospheric data.

• Don’t use tools blindly!

• Keep up-to-date!

This talk focused on files in NetCDF format. For other data formats, refer to the additional material.

Data format: CSV
• Tools: MS Excel (proprietary); many public-domain tools (such as csvkit, …) are available
• csv: Python module (import csv); reference: http://docs.python.org/library/csv
• No special FORTRAN API needed (list-directed input and/or formatted output will do the trick)

Data format: NASA Ames
• nastools: Python module (import nastools); reference: https://files.pythonhosted.org/packages/e3/3c/3bbdd20ad05f737e4c4df3d8ac5b8ee7ad172d6fcf16096500e2f5dfb3f1/NAStools-0.1.2.tar.gz
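As a small illustration of the csv Python module named for the CSV format, the sketch below reads a column from CSV text with a header row and returns its maximum. The sample data, the column names "time" and "tpot", and the helper name column_max are hypothetical and stand in for a real file.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of a real CSV file
SAMPLE = (
    "time,tpot\n"
    "2018-11-13T00:00,280.0\n"
    "2018-11-13T06:00,295.5\n"
    "2018-11-13T12:00,288.25\n"
)


def column_max(text, column):
    """Return the maximum of a numeric column in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))  # header row supplies the keys
    return max(float(row[column]) for row in reader)


print(column_max(SAMPLE, "tpot"))
```

For a file on disk, the io.StringIO object would be replaced by a file opened with open(path, newline=""), as the csv documentation recommends.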

Data format: BUFR
• ecCodes: BUFR tools (bufr_count, bufr_dump, bufr_ls, bufr_get, bufr_compare, bufr_copy, bufr_filter); reference: https://confluence.ecmwf.int/display/ECC
• ecCodes: Python module (import eccodes)
• ecCodes: F90 module (use eccodes; linking with -leccodes_f90 -leccodes)

Data format: GRIB
• ecCodes: GRIB tools (grib_compare, grib_copy, grib_count, grib_dump, grib_filter, grib_get, grib_get_data, grib_index_build, grib_ls, grib_set, grib_to_netcdf); reference: https://confluence.ecmwf.int/display/ECC
• Tools: CDO; reference: https://code.mpimet.mpg.de/projects/cdo/
• ecCodes: Python module (import eccodes)
• ecCodes: F90 module (use eccodes; linking with -leccodes_f90 -leccodes)
