
DATA PROCESSING AND ANALYSIS
TOOLS FOR DATA ANALYSIS
13 NOV 2018 | SABINE SCHRÖDER, IEK-8

OUTLINE
1. Motivation
2. Tools and data standards
3. Commands, Interpreter, Programming: When to use what?
4. Tools in command line operators/viewers
5. Tools in interpreted languages
6. Tools in compiled languages
7. Summary

13 November 2018 Sabine Schröder Tools for data analysis

MOTIVATION
[Two figure-only slides]

TOOLS AND DATA STANDARDS
[Diagram: a standard data format feeding tool1, tool2, tool3, tool4, …, tooln]
• A chosen data format influences the variety of available tools.
• Sticking to data standards enhances the availability of reusable tools.
• Keep up to date: not only data formats develop, but also tools!

COMMANDS, INTERPRETERS, COMPILED PROGRAMS
• Ease of learning
• Getting up and running
• Speed: a productivity vs. performance tradeoff
• Scope of use
• Requirements

Ease of learning

command line tool:
    > print_helloWorld

interpreter:
    >>> print("Hello, world!")

programming:
    PROGRAM Hello
    WRITE(*,*) "Hello, world!"
    END PROGRAM Hello

Getting up and running

command line tool:
• download

interpreter:
• download and installation of interpreter
• learn syntax of interpreter

programming:
• license compiler
• install compiler
• learn about compiler
• learn syntax of programming language
• after programming, compile, then run

Speed: a productivity vs. performance tradeoff

command line tool:
    > ncwa --no_tmp_fl -y max -v tpot test.nc max_tpot.nc

interpreter:
    >>> from netCDF4 import Dataset
    >>> rootgrp = Dataset("test.nc","r")
    >>> print(rootgrp.variables["tpot"][:,:,:,:].max())
    >>> rootgrp.close()

programming:
    PROGRAM Maxi
    USE netcdf
    st=nf90_open("test.nc", NF90_NOWRITE, ncid)
    st=nf90_inq_varid(ncid, "tpot", tpotId)
    st=nf90_get_var(ncid, tpotId, tpot)
    write(*,*) maxval(tpot)
    END PROGRAM Maxi

Run times shown on the slide (in extraction order): 0.16 s, 0.05 s, 0.03 s

TOOLS IN COMMAND LINE OPERATORS/VIEWERS
NCDUMP, NCGEN
From CDL to
• netCDF-3
• netCDF-4
• C/F77/JAVA program
[Figure labels: Header, Variable, Record]

TOOLS IN COMMAND LINE VIEWERS
NCVIEW
http://meteora.ucsd.edu/~pierce/ncview_home_page.html

TOOLS IN COMMAND LINE VIEWERS
PANOPLY
https://www.giss.nasa.gov/tools/panoply/

TOOLS IN COMMAND LINE OPERATORS
NCO (NETCDF OPERATORS)
http://nco.sourceforge.net/
• standalone, command-line programs to
  . derive new fields
  . compute statistics
  . hyperslab
  . manipulate metadata
  . regrid
• input:
  . netCDF, HDF, DAP
  . flat files
  . GODAD (Group-Oriented Data Analysis and Distribution)

Operators:
ncap2 (arithmetic processor (version 2))
ncatted (attribute editor)
ncbo (binary operator)
ncclimo (climatology generator)
ncecat (ensemble concatenator)
nces (ensemble statistics)
ncflint (file interpolator)
ncks (kitchen sink)
ncpdq (permute dimensions quickly)
ncra (record average)
ncrcat (record concatenator)
ncremap (remapper)
ncrename (renamer)
ncwa (weighted average)

NCO (NETCDF OPERATORS): ncap2
• arithmetically processes netCDF files
• instructions from command line or from file
• treats missing values
• definition of dimensions possible
• can link to the GNU Scientific Library (GSL)
• can create derived fields
example:
    ncap2 -v -s 'a=3;b=4;c=sqrt(a^2+b^2)' in.nc out.nc

NCO (NETCDF OPERATORS): ncatted
• append, create, delete, modify, and overwrite attributes
example:
    ncatted -a history,global,a,c,'Data version 2.0\n' in.nc

NCO (NETCDF OPERATORS): ncbo
• four operations: Addition, Subtraction, Multiplication, Division
example:
    ncbo --op_typ=sub 86_0112.nc 85_0112.nc 86m85_0112.nc

NCO (NETCDF OPERATORS): ncclimo
• Climatology modes: annual, monthly, daily
• seasons: jfm,amj,jas,ond,on,fm,djf,mam,jja,son,ann
• number of timesteps-per-day in output
• automatic filename creation and splitting of output files
example:
    ncclimo -C ann -m cism -h h -c caseid -s 1851 -e 1900 -i drc_in -o drc_out

NCO (NETCDF OPERATORS): ncecat
create one output file:
• Record Aggregation: NetCDF3
• Group Aggregation: NetCDF4
example:
    ncecat -u realization 85_0[1-5].nc 85.nc

NCO (NETCDF OPERATORS): nces
gridpoint statistics on variables across
• an ensemble of input-files
• input groups within each file
statistics:
    avg      Mean value
    sqravg   Square of the mean
    avgsqr   Mean of sum of squares
    max      Maximum value
    min      Minimum value
    mabs     Maximum absolute value
    mebs     Mean absolute value
    mibs     Minimum absolute value
    rms      Root-mean-square (normalized by N)
    rmssdn   Root-mean-square (normalized by N-1)
    sqrt     Square root of the mean
    tabs     Sum of absolute values
    ttl      Sum of values
example:
    nces 85_0[1-5].nc 85.nc

NCO (NETCDF OPERATORS): ncflint
linear combination of input files:
• weighted average
• normalized weighted average
• interpolation
example:
    ncflint -i time,86 85.nc 87.nc 86.nc

NCO (NETCDF OPERATORS): ncks
file creation and conversion with special features:
• extract
• hyperslab
• multi-slab
• sub-set
• translate
example:
    ncks -d time,5 -d lat,,0.0 -d lon,260.0,45.0 -d lev,1000.0 in.nc out.nc
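The nces statistics listed above can be made concrete with a small sketch in plain Python. It is only an illustration of the definitions as worded on the slide, applied to the values of one grid point across the ensemble members; it is not NCO's implementation. The function name `ensemble_stats` and the sample values are hypothetical, and the sketch assumes no missing values and no weighting.

```python
import math


def ensemble_stats(values):
    """Statistics for one grid point across ensemble members.

    Keys follow the nces operation names on the slide; the formulas are an
    assumption based on the slide's one-line descriptions.
    """
    n = len(values)
    total = sum(values)
    mean = total / n
    sum_sq = sum(v * v for v in values)
    abs_vals = [abs(v) for v in values]
    return {
        "avg": mean,                      # mean value
        "sqravg": mean ** 2,              # square of the mean
        "avgsqr": sum_sq / n,             # mean of the squared values
        "max": max(values),
        "min": min(values),
        "mabs": max(abs_vals),            # maximum absolute value
        "mebs": sum(abs_vals) / n,        # mean absolute value
        "mibs": min(abs_vals),            # minimum absolute value
        "rms": math.sqrt(sum_sq / n),     # normalized by N
        # normalized by N-1; undefined for a single member
        "rmssdn": math.sqrt(sum_sq / (n - 1)) if n > 1 else float("nan"),
        # square root of the mean; undefined for a negative mean
        "sqrt": math.sqrt(mean) if mean >= 0 else float("nan"),
        "tabs": sum(abs_vals),            # sum of absolute values
        "ttl": total,                     # sum of values
    }


# Hypothetical values of one variable at one grid point in four input files.
stats = ensemble_stats([1.0, -2.0, 3.0, 2.0])
print(stats["avg"], stats["avgsqr"], stats["mabs"], stats["ttl"])
```

For a whole field, the same function would be applied point by point, for example with `zip(*members)` over equally shaped flattened arrays.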