Final Presentation
• Wednesday (February 23) 11:00-2:00
• Carlson Learning Center 76-1275
• Please review the Submission Guidelines
• Also please include a PDF file of your final report
  – First create a PostScript file of your report
  – Then use the ps2pdf utility on the center machines to generate a PDF file
• Create a gzipped tar file of your entire project.

Tip of the Day
• You have a file that you misplaced and want to find it quickly
• You want to use the find command

AWK

find command example
% find ~ -name "final_exam.txt"

Another find command example
% find . -name "*.pro" -ls

Yet another find command example
% find / -name "*junk*" -exec rm {} \;
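Because the last form deletes whatever it matches, it can be worth rehearsing the find commands above in a scratch directory first. The sketch below is written for a Bourne-style shell (sh) rather than the csh prompt of the slides. The directory and file names are made up for illustration, and the search starts at the scratch directory instead of ~, . or /.

```shell
# Rehearse the find examples inside a throwaway directory (hypothetical names).
scratch=$(mktemp -d)
mkdir -p "$scratch/sub/deeper"
touch "$scratch/sub/final_exam.txt" \
      "$scratch/sub/deeper/plot.pro" \
      "$scratch/sub/old_junk_1.dat"

# 1. Locate a file by exact name; find prints the path of each match.
found_exam=$(find "$scratch" -name "final_exam.txt")

# 2. Locate files by wildcard; the quotes keep the shell from expanding *.pro itself.
found_pro=$(find "$scratch" -name "*.pro")

# 3. Preview what a delete would hit, then delete with -exec rm.
junk_before=$(find "$scratch" -name "*junk*")
find "$scratch" -name "*junk*" -exec rm {} \;
junk_after=$(find "$scratch" -name "*junk*")

echo "exam: $found_exam"
echo "pro:  $found_pro"
echo "junk before delete: $junk_before"
echo "junk after delete:  ${junk_after:-<none>}"

# Remove the scratch directory when done.
rm -rf "$scratch"
```

Running the search without -exec first, as in step 3, shows exactly which files the destructive version would remove.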

• The first command (find ~ -name "final_exam.txt") will search for the file "final_exam.txt" in all subdirectories under your home directory
  – When found, it will print out the full path file name
• The second command (find . -name "*.pro") will search for all your IDL files in all subdirectories under your current working directory
  – When found, it will print out an ls listing of the file
• The third command (find / -name "*junk*") will search for all files in the entire directory tree that contain the pattern junk in the file name
  – When found, the file will be deleted

Aho, Weinberger, Kernighan
• AWK is a text processing utility that can efficiently process and extract text data with minimal programming

Example #1
• I have a column of numbers (input.dat) that require conversion, e.g., square root, Centigrade to Fahrenheit, etc.
1
2
…
100

Solutions
• You can write an IDL program, or a program in another language, to do this.
• Transfer the data over to a spreadsheet
• Or write a one-line AWK program

Syntax of AWK
/pattern/ {action}

Simplest AWK program
% gawk '{print $0}' input.dat
This simply prints out (echoes) the input file.

How Does AWK Work?
• AWK is based on the concept of pattern matching
• Think of AWK as a filter program
  – It looks for key "patterns" and processes the records that match them.

To take the square root
% gawk '{print sqrt($0)}' input.dat

If you have two columns of data and you want to add them up
% gawk '{print $1+$2}' input.dat

Meaning of the fields
$0 - represents the entire input line
$1 - represents the first field
$2 - represents the second field
Etc.
NF - number of fields
NR - record number
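The one-liners above can be checked on a small made-up input. This sketch calls awk from a Bourne-style shell and assumes that the gawk of the slides accepts the same programs, since only standard awk features are used. The sample numbers are invented for illustration.

```shell
# Two-column sample data (made-up values), one record per line.
data='1 4
9 16
25 36'

# $1+$2 adds the first two fields of every record.
sums=$(printf '%s\n' "$data" | awk '{print $1+$2}')

# sqrt() applied to the first field of every record.
roots=$(printf '%s\n' "$data" | awk '{print sqrt($1)}')

# NR is the current record number, NF the number of fields in that record.
counts=$(printf '%s\n' "$data" | awk '{print NR, NF}')

printf '%s\n' "$sums" "$roots" "$counts"
```

Because there is no /pattern/ in front of the braces, each action runs on every input record.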

Suppose you had headers on the top of your file which you wanted to ignore
% gawk '/[0-9]/ {print $0}' input.dat

What about patterns?
. - matches any single character
* - matches zero or more repetitions of the preceding character (so .* matches anything)
? - matches zero or one occurrence of the preceding character
[0-9] - matches a single character that is a number
[A-Z] - matches a single character that is an upper case letter

Matching
/pattern/ - tries to match the pattern
/^pattern/ - makes sure the pattern starts at the beginning of a line
/pattern$/ - makes sure the pattern ends at the end of a line
$1 ~ /pattern/ - tries to match the first field to a pattern
$1 !~ /pattern/ - tries to NOT match the first field to a pattern

Real Life Problem 1: ASD Spectra Conversion
[Figure: MISI image example at 2000' AGL, 4' pixel, with a legend marking the MISI flight area, Boston Whaler, pier team, canoe, kayak, radiometer, thermistors, ASD, Secchi depth, truth water samples and panels; a "Water Quality Samples" table with columns ID, Chlor, SS, CDOM and sample labels B1, P1, P2]

Conversion of wavelength units from nanometers to microns for a spectral file (water.ref)
400.350 0.0509975
410.170 0.0502359
419.990 0.0474999
…
683.900 0.0215759
693.440 0.0214323
702.980 0.0213168

Removing comments #
gawk '$0 !~ /^#/'
• Above works for # at the beginning of line
gawk '$0 !~ /^ *#/'
• Better pattern
  – Also works when the # at the beginning of the line is preceded by whitespace
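To see how the two comment patterns differ, the sketch below runs them on a few invented lines modeled on water.ref. It uses awk from a Bourne-style shell, on the assumption that gawk behaves identically here. When a pattern is given without an action, awk prints the matching record.

```shell
# Invented sample: a comment, an indented comment, and two data lines.
sample='# Wavelength Reflectance
   # indented comment
400.350 0.0509975
410.170 0.0502359'

# /^#/ only catches a # in column one, so the indented comment survives.
strict=$(printf '%s\n' "$sample" | awk '$0 !~ /^#/')

# /^ *#/ also allows leading spaces before the #.
better=$(printf '%s\n' "$sample" | awk '$0 !~ /^ *#/')

# Count the records left by each pattern.
strict_count=$(printf '%s\n' "$strict" | awk 'END {print NR}')
better_count=$(printf '%s\n' "$better" | awk 'END {print NR}')
echo "strict keeps $strict_count lines, better keeps $better_count lines"
```

Note that / */ matches spaces only; a comment indented with tabs would need a character class such as [ \t]* instead.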

Conversion AWK
% gawk '{print $1/1000.0, $2}' water.ref > water.ref.microns

What if you have multiple files
Water_0001.ref
Water_0002.ref
…
Soil_0001.ref
Soil_0002.ref
…
Cement_1000.ref

How do we repeatedly apply the AWK script
• We would use the foreach statement of the C shell (csh).
• The form of the foreach statement
% foreach shell_variable (word_list)
unix_statements
unix_statements
…
unix_statements
end
  – The word list is often a file name wildcard pattern such as water*.ref

Processing only the water files
% foreach i (water*.ref)
foreach? echo "Processing $i"
foreach? gawk '{print $1/1000.0, $2}' $i > $i.microns
foreach? end

Renaming a set of files
• Suppose you had a set of files
Water_0001.ref.microns
Water_0002.ref.microns
…
Water_0100.ref.microns
• You want to rename them back to
Water_0001.ref
Water_0002.ref

We need tools to extract file name components
• Given the sample file
water_0001.ref.microns
• Need to extract the file name extension(s)
.ref.microns
.microns
• Need to extract the file name base
Water_0001
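For readers working in a Bourne-style shell (sh, bash) instead of csh, the same idea can be sketched with a for loop, and the trailing extension can be removed with the shell's ${name%suffix} expansion. This is a swapped-in equivalent, not the csh syntax of the slides. The scratch directory and the two sample files are invented for illustration.

```shell
# Work in a throwaway directory with two made-up spectra (wavelengths in nm).
work=$(mktemp -d)
printf '400.350 0.0509975\n410.170 0.0502359\n' > "$work/water_0001.ref"
printf '683.900 0.0215759\n693.440 0.0214323\n' > "$work/water_0002.ref"

# Convert every water file from nanometers to microns.
for f in "$work"/water*.ref; do
    echo "Processing $f"
    awk '{print $1/1000.0, $2}' "$f" > "$f.microns"
done

# Move the originals aside so the rename below does not overwrite them.
mkdir "$work/orig"
mv "$work"/water*.ref "$work/orig/"

# Rename the converted files: ${f%.microns} drops the trailing .microns.
for f in "$work"/water*.microns; do
    echo "Renaming $f to ${f%.microns}"
    mv "$f" "${f%.microns}"
done

first_line=$(head -n 1 "$work/water_0001.ref")
echo "first converted line: $first_line"
```

The expansion ${f%.microns} removes only the named suffix, so a file called water_0001.ref.microns ends up as water_0001.ref.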

Shell Filename Modifiers
h  Remove a trailing pathname component, leaving only the head.
r  Remove a trailing suffix of the form .xxx, leaving the basename.
e  Remove all but the trailing suffix.
t  Remove all leading pathname components, leaving the tail.

Sample output of the modifiers
% set a=/usr/tmp/water_00001.ref.microns
% echo $a
/usr/tmp/water_00001.ref.microns
% echo $a:h
/usr/tmp
% echo $a:r
/usr/tmp/water_00001.ref
% echo $a:e
microns
% echo $a:t
water_00001.ref.microns

Renaming the water files
% foreach i (water*.microns)
foreach? echo "Renaming $i to $i:r"
foreach? mv $i $i:r
foreach? end

foreach statement can extract elements of a shell variable
% set a='0.0 0.1 0.2'
% foreach i ($a)
foreach? echo $i
foreach? end
0.0
0.1
0.2

Real Life Problem 2: MODTRAN Output
• How do you extract a single value out of a 40 page output?
1 ***** MODTRAN 3.5 Version 1.1 Jan 97 *****
0 CARD 1 *****t0 7 2 2 1 0 0 0 0 0 0 1 1 0 0.000 0.00
0 CARD 1B *****T 8F 0 360.000
0 CARD 2 ***** 1 1 0 0 0 0 30.00000 0.00000 0.00000 0.00000 0.31500
0 GNDALT = 0.31500
0 CARD 2C ***** 15 0 0AUG01
MODEL ATMOSPHERE NO. 7 ICLD = 0
MODEL 0 / 7 USER INPUT DATA
0.315 9.842E+02 3.230E+01 7.545E-01 0.000E+00 0.000E+00 ABD2222222 22222
0.554 9.581E+02 2.720E+01 6.765E-01 0.000E+00 0.000E+00 ABD2222222 2

What do we want?
• "H2O" value
Z P T REL H H2O CLD AMT RAIN RATE AEROSOL
(KM) (MB) (K) (%) (GM M-3) (GM M-3) (MM HR-1) TYPE PROFILE
0.315 984.200 305.45 2.20 7.545E-01 0.000E+00 0.000E+00 RURAL RURAL
0.554 958.100 300.35 2.60 6.765E-01 0.000E+00 0.000E+00 RURAL
…
H2O O3 CO2 CO CH4 N2O
( ATM CM )
2.2208E+02 1.3433E-01 2.6589E+02 8.2446E-02 1.1924E+00 2.2553E-01
…
Z P T REL H H2O CLD AMT RAIN RATE AEROSOL
(KM) (MB) (K) (%) (GM M-3) (GM M-3) (MM HR-1) TYPE PROFILE
0.315 984.200 305.45 2.20 7.545E-01 0.000E+00 0.000E+00 RURAL RURAL
0.554 958.100 300.35 2.60 6.765E-01 0.000E+00 0.000E+00 RURAL

What do we know?
• We know that the value we want sits under the table heading "H2O", which is the first field of its line.
Z P T REL H H2O CLD AMT RAIN RATE AEROSOL
(KM) (MB) (K) (%) (GM M-3) (GM M-3) (MM HR-1) TYPE PROFILE
0.315 984.200 305.45 2.20 7.545E-01 0.000E+00 0.000E+00 RURAL RURAL
0.554 958.100 300.35 2.60 6.765E-01 0.000E+00 0.000E+00 RURAL
…
H2O O3 CO2 CO CH4 N2O
( ATM CM )
2.2208E+02 1.3433E-01 2.6589E+02 8.2446E-02 1.1924E+00 2.2553E-01
…
Z P T REL H H2O CLD AMT RAIN RATE AEROSOL
(KM) (MB) (K) (%) (GM M-3) (GM M-3) (MM HR-1) TYPE PROFILE
0.315 984.200 305.45 2.20 7.545E-01 0.000E+00 0.000E+00 RURAL RURAL
0.554 958.100 300.35 2.60 6.765E-01 0.000E+00 0.000E+00 RURAL

Using grep to help analyze pattern
% grep H2O output.tp6
Z P T REL H H2O CLD AMT RAIN RATE AEROSOL
I Z P H2O O3 CO2 CO CH4 N2O O2 NH3 NO NO2 SO2 HNO3
1 J Z H2O O3 CO2 CO CH4 N2O O2 NH3 NO NO2 SO2
H2O O3 CO2 CO CH4 N2O
1 J Z H2O O3 CO2 CO CH4 N2O O2 NH3 NO NO2 SO2
H2O O3 CO2 CO CH4 N2O

Need to Identify Unique Pattern Property
• Several H2O's in the file
• The desired H2O is in the first column
• Need to specify "first column"-only matches
$1 ~ /H2O/

Need to skip to the value and extract the value
• Based on the following pattern
H2O O3 CO2 CO CH4 N2O
( ATM CM )
2.2208E+02 1.3433E-01 2.6589E+02 8.2446E-02 1.1924E+00 2.2553E-01
• We need to "skip" to the third line and get the first field
• This can be accomplished by the getline command

Putting it all together
gawk '$1 ~ /H2O/ { getline; getline; getline; \
print ($1*18.015/22413.83) }' input_modtran.dat
• Action is a unit conversion of water vapor value
print ($1*18.015/22413.83)

Can be made into a script (get_water_vapor.csh)
#!/bin/csh
gawk '$1 ~ /H2O/ { getline; getline; getline; \
print ($1*18.015/22413.83) }' $1
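The getline step can be tried on a cut-down listing. The sketch below uses an invented three-line excerpt in which the values sit on the second line after the H2O heading, so it calls getline twice. How many getline calls are needed depends on how many lines separate the heading from the values in the actual file, so count them in your own output before reusing this. It is run with awk from a Bourne-style shell, on the assumption that gawk behaves the same.

```shell
# Cut-down excerpt modeled on the listing above (the exact layout is an assumption).
excerpt='H2O O3 CO2 CO CH4 N2O
( ATM CM )
2.2208E+02 1.3433E-01 2.6589E+02 8.2446E-02 1.1924E+00 2.2553E-01'

# When the first field is H2O, read forward two records with getline;
# after each getline, $0 and $1 refer to the newly read record.
raw=$(printf '%s\n' "$excerpt" |
    awk '$1 ~ /H2O/ { getline; getline; print $1 }')

# Same skip, followed by the unit conversion used in the slides.
converted=$(printf '%s\n' "$excerpt" |
    awk '$1 ~ /H2O/ { getline; getline; printf "%.4f\n", $1*18.015/22413.83 }')

echo "raw value: $raw"
echo "converted: $converted"
```

A plain getline inside an action replaces the current record without re-testing the pattern, which is what lets the action step past the units line.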

From within IDL
IDL> spawn, 'get_water_vapor.csh input.dat', results

What is this file?
400.350 0.0509975
410.170 0.0502359
419.990 0.0474999
…
683.900 0.0215759
693.440 0.0214323
702.980 0.0213168

Stripping Out Comments in IDL

Commented File
# Water reflectance data file
# ASD Reflectance May 20, 1999 11:31 PM
# Local
# Wavelength [Nanometers] Reflectance
# [unitless]
400.350 0.0509975
410.170 0.0502359
419.990 0.0474999
…
702.980 0.0213168

Comment Stripping Routine
pro strip_out_comments, input_file_name, output_file_name
  ; Open the input for reading and the output for writing
  openr, input_file, input_file_name, /get_lun
  openw, output_file, output_file_name, /get_lun
  original_string = ''
  while ( NOT EOF(input_file)) do begin
    readf, input_file, original_string
    ; Remove leading and trailing blanks, then look for a comment marker
    input_string=strtrim( original_string, 2 )
    comment_position = strpos(input_string,'#')
    if( comment_position eq -1 and input_string ne '' )then begin
      ; No comment and not blank: copy the line
      printf, output_file, input_string
    endif else if( comment_position gt 0 ) then begin
      ; Trailing comment: keep only the text before the #
      printf,output_file,strmid(input_string,0,comment_position )
    endif
  endwhile
  free_lun, input_file, output_file
end
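For comparison with the AWK comment patterns shown earlier, the logic of the IDL routine (trim the line, drop full-line comments and blank lines, keep the text before a trailing #) can be sketched in awk. This is an illustrative translation, not part of the original material. The sample lines are invented, and the script is run with awk from a Bourne-style shell.

```shell
# Invented sample with a full-line comment, a blank line, and a trailing comment.
commented='# Water reflectance data file

400.350 0.0509975
410.170 0.0502359 # suspect reading'

stripped=$(printf '%s\n' "$commented" | awk '{
    sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "")   # like strtrim(original_string, 2)
    pos = index($0, "#")                      # 0 means no # (IDL strpos gives -1)
    if (pos == 0 && $0 != "")
        print                                 # plain data line
    else if (pos > 1)
        print substr($0, 1, pos - 1)          # text before the trailing comment
}')

printf '%s\n' "$stripped"
```

Like the IDL version, this keeps any blank that sat between the data and the # on lines with a trailing comment.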