Becoming an efficient DB2 LUW DBA by leveraging awk and sed in UNIX

Pavan Kristipati Huntington Bank Session Code: D11 05/25/2016 1:00 PM | Platform: DB2 for Linux, Unix and Windows

Photo by Steve from Austin, TX, USA Pavan Kristipati • IBM Champion 2015, 2016 • IDUG North America Conference Planning Committee • DB2 LUW DBA since 2005 • Presented at IDUG in 2014 and 2015 • IBM Certified Advanced Database Administrator • Technical blogging • www.db2talk.com – Owner • www.db2commerce.com – Occasional guest blogger • https://www.linkedin.com/in/pavankristipati • @pkristipati @db2talk

2 Agenda • Objectives • Motivation and Introduction • Learning how to use awk and sed in regular DBA activities • Usage examples • Questions

3 Objectives • Introduction to awk and sed • If you haven't considered using awk and sed until now, my goal is to motivate you to do so • Share 4 things about awk and 2 things about sed that make a DB2 DBA efficient • Show that awk and sed are not that clumsy! • Show how sed and awk together can be powerful tools

4 Setting the stage • No heavy attention to syntax • There are multiple ways to get the same thing done; only one approach is discussed • Focus is on breadth, not depth

5 Motivation to look into awk and sed • Reduce (boring) typing • Do, wait, do, wait, do... doesn't scale • Increased confidence in scripting/automating over time leads to a self-sufficient DBA • UNIX scripts to make your job simpler • Automation using scripts • Up to the challenge of learning seemingly arcane utilities

6 So, what are awk and sed? • Two of the most under-appreciated utilities in UNIX • Invaluable if you ever have to: • Make repetitive changes to large pieces of code or text • Analyze some text • Part of most UNIX offerings

7 Introduction to awk • Originally designed and implemented in 1977 by Alfred Aho, Peter Weinberger, and Brian Kernighan – the language takes its name from the authors' initials • Several implementations exist (with names that rhyme with awk, e.g., nawk, gawk, mawk) • A general-purpose programming language in UNIX that handles text (strings) as easily as numbers • This makes awk one of the most powerful of the UNIX utilities • Easy manipulation of structured data • awk processes fields, while sed processes lines

8 awk basics and terminology

9 awk - Input and Output • Input • Content in a file - $ cat filename | awk 'command' • Output from a command - $ db2 list db directory | awk 'command' • Output • Fields in a file • Single value • Functions or other actions on content • Open to imagination
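For example (the file name and the exact 'list db directory' output layout here are assumptions, not from the slides):
$ cat tabslist | awk '{print $1}'                             ## input coming from a file
$ db2 list db directory | awk '/Database name/ {print $4}'    ## input coming from a command's output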

10 awk - Input and Output • What can we do with the output? • Send it to a file • print "expression" > "file name" (Think SQL files) • Send it through a pipe for further action • print "expression" | "command" (One-liners)
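A rough illustration of both forms (the file name dbnames.txt is a hypothetical example); the redirection and the pipe happen inside awk itself:
$ db2 list db directory | awk '/Database name/ {print $4 > "dbnames.txt"}'   ## send to a file
$ db2 list db directory | awk '/Database name/ {print $4 | "sort -u"}'       ## send through a pipe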

11 Let's get the basics right - Fields and Delimiters in awk • Think of a field as a column • Default delimiter: 'space' or 'tab'

• $1 = 1st field • $2 = 2nd field • $0 = the entire line

12 Let's get the basics right - Fields and Delimiters in awk • awk's most basic function – printing text (don't underestimate it) • Print the 1st field (print only schema names) $ cat file1 | awk '{print $1}'

• Notice the difference: • cat file1 | awk '{print "$1"}' – prints $1 instead of the value of the 1st field • Text in double quotes is printed literally
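A quick illustration with an assumed two-column file1 (schema and table name per line):
$ cat file1
SCHEMA1 TABLE1
SCHEMA2 TABLE2
$ cat file1 | awk '{print $1}'
SCHEMA1
SCHEMA2
$ cat file1 | awk '{print "$1"}'
$1
$1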

13 Unleashing the power of awk

14 awk Tip #1

15 In tip #1, we will look at: • Generating DB2 commands • Combining SQL and awk • Performing a set of actions on: • a static file containing a list of tables • a dynamic list of tables

16 Power of awk – Generating DB2 Commands Scenario: Runstats on multiple tables • Table list in a file

• Do-Wait-Do-Wait... is not for efficient DBAs • Use awk to be efficient • 2 steps • Generate the commands (we will use UNIX's 'cat' in this step) • Run the commands (and take a coffee break)

17 Power of awk – Generating DB2 Commands Template: $ cat tabslist | awk '{print "runstats command" variables}'

Step 1: $ cat tabslist | awk '{print "runstats on table "$1"."$2";"}' > stats.sql
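Assuming tabslist holds one schema and table name per line (e.g., 'SCHEMA1 TABLE1'), the generated stats.sql would look like:
$ cat stats.sql
runstats on table SCHEMA1.TABLE1;
runstats on table SCHEMA2.TABLE2;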

Step 2: $ db2 -tvf stats.sql
runstats on table SCHEMA1.TABLE1
DB20000I The RUNSTATS command completed successfully.
runstats on table SCHEMA2.TABLE2
DB20000I The RUNSTATS command completed successfully.
...
18 awk one-liners • Could this be done in 1 step? Yes

• Just send awk's output to the DB2 prompt
• $ cat tabslist | awk '{print "runstats on table "$1"."$2}' | db2 -v
db2 => runstats on table SCHEMA1.TABLE1
DB20000I The RUNSTATS command completed successfully.
db2 => runstats on table SCHEMA2.TABLE2
DB20000I The RUNSTATS command completed successfully.

19 Power of awk – Generating Commands from a dynamic list
Two questions:
• Does this scale to a large number of tables? Yes
• Does the table list have to be in a file? No
3 actions in one line:
• Generate a list
• Pass the list to awk as input, and
• Pass the entire output to the DB2 prompt
$ db2 -x "select char(tabschema,20), char(tabname, 50) from syscat.tables where type = 'T' and tabschema = 'SYSIBM'" | awk '{print "runstats on table "$1"."$2}' | db2 -v
Runstats on all tables in the SYSIBM schema (144 tables in DB2 10.5)

20 awk Tip #2

21 Combining SQL and awk In tip #2, we will look at: • Passing awk fields into SQL • Generating SQL statements using awk • Handling single quotes in awk in SQL's 'where' clause

22 Combining SQL and awk • Example: Print column names • Where clause has table and schema names in single quotes SQL to print column names: • select char(tabschema, 20), char(tabname, 60), char(colnames,200) from syscat.indexes where tabschema='SCHEMA1' and tabname='TABLE1'

23 Handling quotes in awk – What doesn't work
Quotes need extra attention and handling
Goal: Enclose SCHEMA and TABLE in single quotes
None of these work:
$ cat list | awk '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where ... (where clause variants below)
Enclosing fields in single quotes:
× where tabschema = '$1' and tabname = '$2';"}'
Using a combination of double and single quotes:
× where tabschema = "'$1'" and tabname = "'$2'";"}'
Escaping single quotes:
× where tabschema = "\'$1\'" and tabname = "\'$2\'";"}'

24 Handling quotes in awk – What works
Solution – awk's assignment operator '-v':
• $ cat list | awk -v x="'" '{print "select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema="x$1x" and tabname ="x$2x}'
• Looks complex? Actually it is not!
• Just place the variable holding the quote (x) before and after the fields ($1 and $2)
• http://db2talk.com/2015/07/31/dba-tip-easy-way-to-handle-single-quotes-in-awk/
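With a hypothetical list file containing 'SCHEMA1 TABLE1', the one-liner above would emit:
select char(tabschema, 20), char(tabname, 60), char(colnames, 200) from syscat.indexes where tabschema='SCHEMA1' and tabname ='TABLE1'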

25 Assignment Operator – Add date/time to vmstat data • By default, AIX's vmstat output does not have a good way to record date/time values. • vmstat in RHEL has a '-t' option for a date/timestamp. • How do we add date/time values? • Use awk's assignment operator.

26 Adding date/time to vmstat data using awk $ vmstat 15 | parse_vmstat.sh ## Pass vmstat output to the parsing script

## parse_vmstat.sh – snippet
## Step 1 – Capture the current date and time in a shell variable
cur_time=$(date '+%m/%d/%Y %H:%M:%S')
## Step 2 – Pass this variable to awk and print it along with the vmstat output
echo "$line" | awk -v cTime="$cur_time" '{print cTime, $0}'
# output
03/05/2016 15:31:12 1 0 0 3311268 327304 3757292 0 0 12367 58 13 24 17 2 80 0 0
03/05/2016 15:31:27 1 0 0 3311268 327304 3757292 0 0 12367 58 13 24 17 2 80 0 0
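The full parse_vmstat.sh is not reproduced on the slide; a minimal sketch, assuming the script simply reads vmstat's output from stdin line by line, could look like this:
#!/bin/sh
## parse_vmstat.sh – prefix every vmstat line read from stdin with the current date/time
while read line
do
    cur_time=$(date '+%m/%d/%Y %H:%M:%S')
    echo "$line" | awk -v cTime="$cur_time" '{print cTime, $0}'
done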

27 awk Tip #3

28 In tip #3, we will look at: • Printing output in a better format • Separating fields in the output for clarity

29 Applying makeup to output – Tip #3

• Print output in a formatted way – use printf (fancy print) • Format specifiers let us separate fields in the output for clarity (see the example below)

30 Applying makeup to output – Tip #3
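As an illustration of printf (the file name and column widths are assumptions), fields can be printed in fixed-width, left-justified columns:
$ cat tabslist | awk '{printf "%-20s %-40s\n", $1, $2}'
SCHEMA1              TABLE1
SCHEMA2              TABLE2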

31 awk Tip #4

32 In tip #4, we will look at: • Handling non-default field separators • Other special variables in awk

33 Custom Field Separator in awk • Until now, we worked with files that have fields separated by space or tab. • What if fields are separated by other characters?

Request from a developer: Hey DBA, could you drop these tables? 34 Custom Field Separator in awk

• Use awk's '-F' option or the 'FS' variable to set the field separator.
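For instance, if the developer's list were comma-separated (an assumption for this sketch, along with the file name droplist), -F sets the separator and awk generates the drop statements:
$ cat droplist
DB2INST1,TAB1
DB2INST1,TAB2
$ cat droplist | awk -F ',' '{print "drop table "$1"."$2";"}' > drop.sql
$ db2 -tvf drop.sql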

35 Other Special Variables in awk

• FS – The input field separator (default is a blank).
• OFS – The output field separator (default is a space).

• RS – The input record separator (default is a newline character).
• ORS – The output record separator (default is a newline character).

• NF – The number of fields in the current record.
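A short sketch of OFS and NF in action (assuming the same space-separated tabslist used earlier):
$ cat tabslist | awk 'BEGIN {OFS=","} {print $1, $2}'    ## comma-separated output
SCHEMA1,TABLE1
SCHEMA2,TABLE2
$ cat tabslist | awk '{print NF, $0}'                    ## prefix each line with its field count
2 SCHEMA1 TABLE1
2 SCHEMA2 TABLE2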

36 sed

37 Introduction to sed • stream editor (sed) • Perfect for applying a series of edits to a number of files or to output of a command • Processes lines while awk processes fields

How does it work? Input text flows through the sed program, is modified, and is directed to output

Input from the command prompt / a script file / a text file → sed command → output

38 sed – Input and Output • By default, 'sed' operates on each line. • By default, all 'sed' commands are applied to the pattern buffer, so the input file remains unchanged. • GNU 'sed' provides a way to modify the input file in place.
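For example, with GNU sed's -i option (OLDVALUE, NEWVALUE, and the file name are placeholders):
$ sed -i 's/OLDVALUE/NEWVALUE/g' ddlfile        ## edits ddlfile in place
$ sed -i.bak 's/OLDVALUE/NEWVALUE/g' ddlfile    ## same, but keeps a backup copy in ddlfile.bak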

39 sed tip #1 The most popular substitute command

40 sed’s most popular substitute command – #1

Goal: To delete double quotes from a DDL file

$ cat ddlfile | sed 's/"//g' > newddlfile
More variations of this command are in the notes

41 sed’s most popular substitute command - #2

Goal: To replace EDWDV with EDWQA – think QA migration

$ cat ddlfile | sed 's/EDWDV/EDWQA/g' > newddlfile

What if we wanted quotes to go away as well?
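One possible way (shown here as a sketch) is to chain substitutions with -e; the next tip shows a cleaner approach when there are many edits:
$ cat ddlfile | sed -e 's/EDWDV/EDWQA/g' -e 's/"//g' > newddlfile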

42 sed tip #2 – handling a large number of substitutions

43 Handling multiple substitutions – the efficient way • If you have a large number of sed commands, you can put them into a file and use: sed -f sedscript oldfile > newfile

44 Multiple substitutions using sed – delete quotes and lines with CONNECT, COMMIT, RESET, and TERMINATE

45 Multiple substitutions using sed – how did this happen? Starting file: dv.ddl

sed -f sed.file dv.ddl > qa.ddl
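The contents of sed.file are not reproduced here; a sketch that matches the stated goal (assuming db2look-style DDL where CONNECT, COMMIT, RESET, and TERMINATE statements appear on their own lines) might be:
s/"//g
s/EDWDV/EDWQA/g
/CONNECT/d
/COMMIT/d
/RESET/d
/TERMINATE/d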

Simpler example: case-insensitive replace
s/[Aa]/A/g
s/[Ee]/E/g
s/[Ii]/I/g
s/[Oo]/O/g
s/[Uu]/U/g
46 awk and sed together

47 Example #1

48 Handling $# =1 or $# =2

Scenario: There could be 1 or 2 arguments. How to handle?

$#=1

$ grantdev SCHEMA.TABLE (argument can be copied and pasted as one string) vs. $ grantdev SCHEMA TABLE (more effort to copy and paste the arguments)

$#=2 Goal: Come up with a way for the script to handle $#=1 or $#=2

49 Handling $# =1 or $# =2
$ grantdev DB2INST1.EMPLOYEE
$ grantdev DB2INST1 EMPLOYEE
Goal: Both should work
## Code snippet for grantdev
period='.'     ## assign '.' to a shell variable
delim=' '      ## default delimiter is 'space'
schema=$(echo $1 $2)
if [[ "$schema" == *"$period"* ]] ; then    ## check whether $schema contains '.'
    delim=$period                           ## if yes, use '.' as the delimiter
fi
SCHEMA=$(echo $schema | awk -v delim1=$delim -F"$delim" '{print toupper($1)}' | sed 's/^ *//;s/ *$//')
TBNAM=$(echo $schema | awk -v delim1=$delim -F"$delim" '{print toupper($2)}' | sed 's/^ *//;s/ *$//')
P.S.: Refer to the notes for comments on the above 2 lines

50 Example #2

51 Granting privileges – multiple tables – one-liner Goal: Grant privileges with ease for a large number of tables

$ cat $ddl | grep -i "create table" | sed 's/(.*//g' | awk '{print $3}' | awk -F '.' '{print "grantdev "$1" "$2}' | sh | tee -a grant.log
sed – gets rid of all characters after the table name
    example: CREATE TABLE DB2INST1.TAB1(col1 integer...)
awk –
• print SCHEMA.TABLE
• set the delimiter to '.'
• print "grantdev SCHEMA TABLE"

52 Example #3

53 Simplifying reorg operation after altering tables • Some 'alter table' statements require a reorg • Imagine you just altered a bunch of tables – how do you know which ones to reorg? • What's the most efficient way? Use awk and sed

## First get a list of all (unique) tables
$ cat ddlfile | grep -i "alter table" | awk '{print $3}' | sort -u | awk -F '.' '{print $1" "$2}' > $$.tmp
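With a hypothetical ddlfile that altered two tables, $$.tmp would then contain one schema/table pair per line, e.g.:
$ cat $$.tmp
DB2INST1 EMPLOYEE
DB2INST1 DEPARTMENT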

54 Simplifying reorg operation after altering tables
• Less typing
• No guesswork on which table is in reorg-pending
while read line
do
    SCHEMA=$(echo $line | awk '{print $1}' | tr a-z A-Z)
    TABLE=$(echo $line | awk '{print $2}' | tr a-z A-Z)
    checkx=$(db2 -x "select distinct(reorg_pending) from table(sysproc.admin_get_tab_info('$SCHEMA','$TABLE')) as t")
    echo $checkx | grep -q Y        ## exits with zero status if 'Y' is found
    returncode=$(echo $?)

if [ "$returncode" -eq "0" ]; then echo "Table $SCHEMA.$TABLE is in reorg-pending mode" db2 -v "reorg table $SCHEMA.$TABLE" else echo "reorg not needed for table $SCHEMA.$TABLE" fi done < $$.tmp 55 Cool things we did with awk and sed

56 Some of the cool things we do with awk and sed

• Alert on invalid objects • Alert on reorg-pending objects • Alert if backups run too long • Keep track of table size growth, database growth, etc. • Keep track of log utilization and generate alerts • Mix those with tail, top, and a job... and we can get a great reporting system • Simplify DDL changes in a DPF environment, and lots of other things that do not tie a DBA to the screen

57 58 Pavan Kristipati Huntington Bank [email protected] Please fill out your session evaluation before leaving!

D11: Becoming an efficient DB2 LUW DBA by leveraging awk and sed in UNIX
