
Practical Extraction and Report Language (PERL)

Introduction

• What is PERL?
  – Practical Extraction and Report Language.
  – It is an interpreted language optimized for scanning arbitrary text files, extracting information from them, and printing reports based on that information.
  – Very powerful string handling features.
  – Available on almost all platforms.

Internet & Web Based Technology

Main Advantages

• Speed of development
  – You can enter the program in a text file and just run it. Perl is an interpreted language; no compiler is needed.
• It is powerful
  – The regular expressions of Perl are extremely powerful.
  – Uses sophisticated pattern matching techniques to scan large amounts of data very quickly.
• Portability
  – Perl is a standard language and is available on almost all platforms.
  – Free versions are available on the Internet.
• Editing Perl programs
  – No sophisticated editing tool is needed.
  – Any simple text editor like Notepad or vi will do.
• Flexibility
  – Perl does not limit the size of your data.
  – If memory is available, Perl can handle the whole file as a single string.
  – Allows one to write simple programs to perform complex tasks.

How to run Perl?

• Perl can be downloaded from the Internet.
  – Available on almost all platforms.
• Assumptions:
  – For Windows, you can run Perl programs from the command prompt.
    • Run “cmd” to get the command prompt window.
  – For Unix/Linux, you can run directly from the shell prompt.

Working through an example

• Recommended steps:
  – Create a directory/folder where you will be storing the Perl files.
  – Using any text editor, create a file “test.pl” with the following content:

    print "Good day\n";
    print "This is my first Perl program\n";

  – Execute the program by typing the following at the command prompt:

    perl test.pl

• On Unix/Linux, an additional line giving the path of the Perl interpreter is placed at the beginning of the program, so that the script can also be run directly as an executable file.

    #!/usr/bin/perl
    print "Good day\n";
    print "This is my first Perl program\n";

Variables

• Scalar variables
  – A scalar variable holds a single value.
  – Other variable types are also available (array and associative array), to be discussed later.
  – A ‘$’ is used before the name of a variable to indicate that it is a scalar variable.

    $xyz = 20;

• Some examples:

    $a = 10;
    $name = "Indranil Sen Gupta";
    $average = 28.37;

  – Variables do not have any fixed types.
  – Variables can be printed as:

    print "My name is $name, the average temperature is $average\n";

• Data types:
  – Perl does not specify the types of variables.
    • It is a loosely typed language.
    • Languages like C or Java are strongly typed.

Variable Interpolation

• A powerful feature
  – Variable names are automatically replaced by values when they appear in double-quoted strings.
• An example:

    $stud = "Rupak";
    $marks = 75;
    print "Marks obtained by $stud is $marks\n";
    print 'Marks obtained by $stud is $marks\n';

  – The program will give the following output:

    Marks obtained by Rupak is 75
    Marks obtained by $stud is $marks\n

  – What do we see:
    • If we need to do variable interpolation, use double quotes; otherwise, use single quotes.
    • Escape sequences such as \n are not interpreted inside single quotes either, so the second line ends with the two characters ‘\’ and ‘n’ rather than a newline.

• Another example:

    $Expense = '$100';
    print "The expenditure is $Expense.\n";    # prints: The expenditure is $100.
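The two quoting styles can be compared in a short script. This is a minimal sketch rather than part of the original slides: the variables $unit and $joined, and the ${unit} brace form for the case where text follows a variable name directly, are additions for illustration.

```perl
use strict;
use warnings;

my $stud  = "Rupak";
my $marks = 75;

# Double quotes interpolate variables; single quotes keep the text as typed.
my $interpolated = "Marks obtained by $stud is $marks";
my $literal      = 'Marks obtained by $stud is $marks';

# Braces mark where the variable name ends when letters follow it directly.
my $unit   = "kilo";
my $joined = "${unit}gram";

print "$interpolated\n";
print "$literal\n";
print "$joined\n";
```

Without the braces, Perl would look for a variable named $unitgram, which does not exist.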

Expressions with Scalars

• Illustrated through examples (syntax similar to C)

    $abc = 10;
    $abc++;
    $total--;
    $a = $b ** 10;                     # exponentiation
    $a = $b % 10;                      # modulus
    $balance = $balance + $deposit;
    $balance += $deposit;

• Operations on strings:
  – Concatenation: the dot (.) is used.

    $a = "Good";
    $b = " day";
    $c = "\n";
    $total = $a.$b.$c;     # concatenate the strings

    $a .= " day\n";        # add to the string $a

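  – Increment operation on strings

    $a = "bat";
    $b = $a;
    $b++;
    print $a, " and ", $b;

    will print bat and bau

  – The auto-increment operator (++) advances the last character of an alphanumeric string, with carry (e.g., "az" becomes "ba").
    • The ‘+’ operator does not behave this way: $a + 1 converts "bat" to the number 0 and gives 1.
    • May not always be meaningful.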

  – String repetition operator (x).

    $a = $b x 3;

    will concatenate three copies of $b and assign it to $a.

    print "Ba" . "na" x 2;

    will print the string “Banana”.

String as a Number

• A string can be used in an arithmetic expression.
  – How is the value evaluated?
  – When converting a string to a number, Perl skips any leading spaces, then takes an optional minus sign and as many digits as it can find (including a decimal point) at the beginning of the string, and ignores everything else.

    "23.54"         evaluates to 23.54
    "123Hello25"    evaluates to 123
    "banana"        evaluates to 0
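The conversion rule can be checked by forcing each string into numeric context. The sketch below is illustrative only; the fourth input ("  -7 apples") is an added example, and the numeric warnings are switched off because "use warnings" would otherwise report each partly numeric string.

```perl
use strict;
use warnings;
no warnings 'numeric';   # these strings are deliberately not clean numbers

my @inputs  = ("23.54", "123Hello25", "banana", "  -7 apples");
my @numbers = map { $_ + 0 } @inputs;   # adding 0 forces numeric conversion

# Print each string next to the number Perl derives from it.
print "'$inputs[$_]' -> $numbers[$_]\n" for 0 .. $#inputs;
```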

Escaping

• The character ‘\’ is used as the escape character.
  – It escapes all of Perl’s special characters (e.g., $, @, #, etc.).

    $num = 20;
    print "Value of \$num is $num\n";

    print "The windows path is c:\\perl\\";

Line Oriented Quoting

• Perl supports specification of a string spanning multiple lines.
  – Use the marker ‘<<’.
  – Follow it immediately (without a space) by a string, which is used to terminate the quoted material.
• Example:

    print <<terminator;
    Hello, how are you?
    Good day.
    terminator

• Another example:

    print "<HTML><HEAD>\n";
    print "<TITLE>Test page</TITLE>\n";
    print "</HEAD><BODY>\n";
    print "<P>This is a test document.</P>\n";
    print "</BODY></HTML>";

• The same page, using line oriented quoting:

    print <<EOM;
    <HTML><HEAD>
    <TITLE>Test page</TITLE>
    </HEAD><BODY>
    <P>This is a test document.</P>
    </BODY></HTML>
    EOM

Lists and Arrays

Basic Difference

• A list is an ordered collection of scalars.
• An array is a variable that holds a list.
• Each element of an array is a scalar.
• The size of an array:
  – Lower limit: 0
  – Upper limit: no specific limit; depends on virtual memory.

List Literal

• Examples:

    (10, 20, 50, 100)
    ('red', "blue", "green")
    ("a", 1, 2, 3, 'b')
    ($a, 12)
    ()              # empty list
    (10..20)        # range operator: the list 10, 11, ..., 20
    ('A'..'Z')      # same, for letters

Specifying Array Variable

• We use the special character ‘@’.

    @months         # denotes an array

• The individual elements of the array are scalars, and can be referred to as:

    $months[0]      # first element of @months
    $months[1]      # second element of @months
    ……

Initializing an Array

• Two ways:
  – Specify values, separated by commas.

    @color = ('red', 'green', "blue", "black");

  – Use the quote words (qw) operator, which uses whitespace as the delimiter:

    @color = qw(red green blue black);

Array Assignment

  – Assign from a list of literals.

    @numbers = (1, 2, 3);
    @colors = ("red", "green", "blue");

  – From the contents of another array.

    @array1 = @array2;

  – Using the qw operator:

    @word = qw(Hello good morning);

  – Combination of above:

    @allcolors = ("white", @colors, "brown");

  – Some other examples:

    @xyz = (2..5);        # (2, 3, 4, 5)
    @xyz = (1, @xyz);     # (1, 2, 3, 4, 5)
    @xyz = (@xyz, 6);     # (1, 2, 3, 4, 5, 6)

Multiple Assignments

    ($x, $y, $z) = (10, 20, 30);

    ($x, $y) = ($y, $x);                     # swap elements

    ($a, @col) = ('red', 'green', 'blue');
    # $a gets the value 'red'
    # @col gets the value ('green', 'blue')

    ($first, @val, $last) = (1, 2, 3, 4);
    # $first gets the value 1
    # @val gets the value (2, 3, 4)
    # $last is undefined

Number of Elements in Array

• Two ways:

    $size = scalar @colors;
    $size = @colors;         # an array assigned to a scalar gives its size

Accessing Elements

    @list = (1, 2, 3, 4);

    $first = $list[0];

    $fourth = $list[3];

    $list[1]++;          # array becomes (1, 3, 3, 4)

    $x = $list[5];       # $x gets the value undef

    $list[2] = "Go";     # array becomes (1, 3, "Go", 4)

• $#array is the index of the last element of the array @array.

    @value = (1, 2, 3, 4, 5);

    print "$#value \n";    # prints 4

• For an empty array, $#value is -1.

shift and unshift

• They operate on the front of the array.
  – ‘shift’ removes the first element of the array.
  – ‘unshift’ adds an element at the start of the array.
• Example:

    @color = qw(red blue green black);

    $first = shift @color;         # $first gets "red", and @color becomes
                                   # (blue, green, black)

    unshift (@color, "white");     # @color becomes (white, blue, green, black)

pop and push

• They operate on the end of the array.
  – ‘pop’ removes the last element of the array.
  – ‘push’ adds an element at the end of the array.
• Example:

    @color = qw(red blue green black);

    $last = pop @color;         # $last gets "black", and @color becomes
                                # (red, blue, green)

    push (@color, "white");     # @color becomes (red, blue, green, white)
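The four functions can be combined in one run to see how the array changes at each step. This is a small sketch for illustration; the extra colours "grey" and "pink" are not from the slides, and they show that push (like unshift) accepts several values at once.

```perl
use strict;
use warnings;

my @color = qw(red blue green black);

my $first = shift @color;       # removes "red"    -> (blue green black)
unshift @color, "white";        # adds at the front -> (white blue green black)
my $last  = pop @color;         # removes "black"  -> (white blue green)
push @color, "grey", "pink";    # adds at the end   -> (white blue green grey pink)

print "first=$first last=$last\n";
print "@color\n";
```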

Reversing an Array

• By using the ‘reverse’ function.

    @names = ("Mina", "Tina", "Rina");

    @rev = reverse @names;       # Reversed list stored in @rev.

    @names = reverse @names;     # Original array is reversed.

Printing an Array

• Example:

    @colors = qw(red green blue);

    print @colors;       # prints without spaces: redgreenblue

    print "@colors";     # prints with spaces: red green blue

Sort the Elements of an Array

• Using the ‘sort’ function, by default we can sort the elements of an array lexicographically.
  – Elements are considered as strings.

    @colors = qw(red blue green black);
    @sort_col = sort @colors;
    # Array @sort_col is (black blue green red)

  – Another example:

    @num = qw(10 2 5 22 7 15);
    @new = sort @num;
    # @new will contain (10 15 2 22 5 7)

  – How do we sort numerically?

    @num = qw(10 2 5 22 7 15);
    @new = sort {$a <=> $b} @num;
    # @new will contain (2 5 7 10 15 22)
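The comparison block given to sort decides the order. As a hedged sketch (the array names @ascending, @descending and @as_strings are chosen here for illustration), swapping $a and $b reverses the order, and the cmp operator gives the default string comparison explicitly.

```perl
use strict;
use warnings;

my @num = qw(10 2 5 22 7 15);

my @ascending  = sort { $a <=> $b } @num;   # numeric, smallest first
my @descending = sort { $b <=> $a } @num;   # numeric, largest first
my @as_strings = sort { $a cmp $b } @num;   # string comparison, same as plain sort

print "@ascending\n";
print "@descending\n";
print "@as_strings\n";
```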

The ‘splice’ function

• Arguments to the ‘splice’ function:
  – The first argument is an array.
  – The second argument is an offset (index number of the list element to begin splicing at).
  – The third argument is the number of elements to remove.

    @colors = ("red", "green", "blue", "black");
    @middle = splice (@colors, 1, 2);
    # @middle contains the elements removed: (green, blue)
    # @colors becomes (red, black)
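splice also accepts further arguments after the count: a list of elements to put in place of the removed ones. The slides do not cover this form, so the sketch below is an addition for illustration; with a count of 0 it inserts without removing anything.

```perl
use strict;
use warnings;

my @colors = ("red", "green", "blue", "black");

# Remove two elements starting at index 1.
my @middle = splice(@colors, 1, 2);          # @colors is now (red black)

# Insert two new elements at index 1, removing none.
splice(@colors, 1, 0, "white", "grey");      # @colors is now (red white grey black)

print "removed: @middle\n";
print "result:  @colors\n";
```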

File Handling

Interacting with the user

• Read from the keyboard (standard input).
  – Use the file handle <STDIN>.
  – Very simple to use.

    print "Enter your name: ";
    $name = <STDIN>;     # Read from keyboard
    print "Good morning, $name. \n";

  – $name also contains the newline character.
    • Need to chop it off.

The ‘chop’ Function

• The ‘chop’ function removes the last character of whatever it is given to chop.
• In the following example, it chops the newline.

    print "Enter your name: ";
    chop ($name = <STDIN>);     # Read from keyboard and chop newline
    print "Good morning, $name. \n";

• ‘chop’ removes the last character irrespective of whether it is a newline or not.
  – Sometimes dangerous.

Safe chopping: ‘chomp’

• The ‘chomp’ function works similarly to ‘chop’, with the difference that it removes the last character only if it is a newline.

    print "Enter your name: ";
    chomp ($name = <STDIN>);     # Read from keyboard and chomp newline
    print "Good morning, $name. \n";
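The difference between the two functions shows up when the input has no trailing newline. The following sketch uses fixed strings instead of keyboard input so that it can be run unattended; the variable names are illustrative.

```perl
use strict;
use warnings;

my $with_newline    = "Rupak\n";
my $without_newline = "Rupak";

# chomp removes a trailing newline only.
my ($chomped1, $chomped2) = ($with_newline, $without_newline);
chomp $chomped1;
chomp $chomped2;

# chop removes the last character, whatever it is.
my ($chopped1, $chopped2) = ($with_newline, $without_newline);
chop $chopped1;
chop $chopped2;

print "chomp: [$chomped1] [$chomped2]\n";
print "chop:  [$chopped1] [$chopped2]\n";
```

With chop, the second string loses its final letter, which is the danger mentioned above.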

File Operations

• Opening a file
  – The ‘open’ command opens a file and associates it with a file handle.
  – For standard input, we have a predefined handle <STDIN>.

    $fname = "/home/isg/report.txt";
    open XYZ, $fname;
    while (<XYZ>) {
        print "Line number $. : $_";
    }

  – Checking the error code:

    $fname = "/home/isg/report.txt";
    open XYZ, $fname or die "Error in open: $!";
    while (<XYZ>) {
        print "Line number $. : $_";
    }

  – $. holds the current line number (starting at 1)
  – $_ holds the line just read (the default variable)
  – $! holds the error code/message

• Reading from a file:
  – The last example also illustrates file reading.
  – The angle brackets (< >) are the line input operators.
    • The data read goes into $_

• Writing into a file:

    $out = "/home/isg/out.txt";
    open XYZ, ">$out" or die "Error in write: $!";
    for $i (1..20) {
        print XYZ "$i :: Hello, the time is ", scalar(localtime), "\n";
    }

• Appending to a file:

    $out = "/home/isg/out.txt";
    open XYZ, ">>$out" or die "Error in write: $!";
    for $i (1..20) {
        print XYZ "$i :: Hello, the time is ", scalar(localtime), "\n";
    }

• Closing a file:

    close XYZ;

  where XYZ is the file handle of the file being closed.

• Printing a file:
  – This is very easy to do in Perl.

    $input = "/home/isg/report.txt";
    open IN, $input or die "Error in open: $!";
    while (<IN>) {
        print;
    }
    close IN;
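The write, append and read steps can be exercised together on a scratch file. The sketch below departs from the slides in two labeled ways: it uses a temporary file from the core File::Temp module instead of /home/isg/out.txt, and it uses the three-argument form of open with a file handle held in a scalar variable, which is a commonly recommended alternative to the bareword handles shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration (assumption: any writable path would do).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write three lines, replacing any previous content.
open(my $out, '>', $path) or die "Error in write: $!";
print $out "Line $_\n" for 1 .. 3;
close $out;

# Append one more line.
open($out, '>>', $path) or die "Error in append: $!";
print $out "Line 4\n";
close $out;

# Read the file back, prefixing each line with its line number ($.).
open(my $in, '<', $path) or die "Error in open: $!";
my @lines;
while (my $line = <$in>) {
    chomp $line;
    push @lines, "$.: $line";
}
close $in;

print "$_\n" for @lines;
```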

Command Line Arguments

• Perl uses a special array called @ARGV.
  – List of arguments passed along with the script name on the command line.
  – Example: if you invoke Perl as:

    perl test.pl red blue green

    then @ARGV will be (red blue green).
  – Printing the command line arguments:

    foreach (@ARGV) {
        print "$_ \n";
    }

Standard File Handles

• STDIN
  – Read from standard input (keyboard).
• STDOUT
  – Print to standard output (screen).
• STDERR
  – For outputting error messages.
• ARGV
  – Reads the names of the files from the command line and opens them all.
  – The @ARGV array contains the text after the program’s name on the command line.
    • <ARGV> takes each file in turn.
    • If there is nothing specified on the command line, it reads from the standard input.
  – Since this is very commonly used, Perl provides an abbreviation for <ARGV>, namely, <>
  – An example is shown.

    $lineno = 1;
    while (<>) {
        print "$lineno: $_";
        $lineno++;
    }

  – In this program, the name of the file has to be given on the command line.

    perl list_lines.pl file1.txt
    perl list_lines.pl a.txt b.txt c.txt

Control Structures

Introduction

• There are many control constructs in Perl.
  – Similar to those in C.
  – Illustrated through examples.
  – The available constructs:
    • for
    • foreach
    • if/elsif/else
    • while
    • do, etc.

Concept of Block

• A statement block is a sequence of statements enclosed in a matching pair of { and }.

    if ($year == 2000) {
        print "You have entered the new millennium.\n";
    }

• Blocks may be nested within other blocks.

Definition of TRUE in Perl

• In Perl, only three things are considered as FALSE:
  – The value 0 (including the string "0")
  – The empty string ("")
  – undef
• Everything else in Perl is TRUE.
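These rules can be checked directly. The sketch below uses a small helper, truth, which is a hypothetical name introduced only for this illustration; it also tries a few values that look "empty" but count as TRUE.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl treats a value used as a condition.
sub truth {
    my ($value) = @_;
    return $value ? "TRUE" : "FALSE";
}

my $nothing;   # never assigned, so it holds undef

print 'number 0     : ', truth(0),        "\n";
print 'string "0"   : ', truth("0"),      "\n";
print 'empty string : ', truth(""),       "\n";
print 'undef        : ', truth($nothing), "\n";
print 'string "0.0" : ', truth("0.0"),    "\n";
print 'single space : ', truth(" "),      "\n";
print 'string "00"  : ', truth("00"),     "\n";
```

The first four lines report FALSE and the last three report TRUE: "0.0", " " and "00" are non-empty strings other than "0".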

if .. else

• General syntax:

    if (test expression) {
        # if TRUE, do this
    }
    else {
        # if FALSE, do this
    }

• Examples:

    if ($name eq 'isg') {
        print "Welcome Indranil. \n";
    } else {
        print "You are somebody else. \n";
    }

    if ($flag == 1) {
        print "There has been an error. \n";
    }
    # The else block is optional

elsif

• Example:

    print "Enter your id: ";
    chomp ($name = <STDIN>);
    if ($name eq 'isg') {
        print "Welcome Indranil. \n";
    } elsif ($name eq 'bkd') {
        print "Welcome Bimal. \n";
    } elsif ($name eq 'akm') {
        print "Welcome Arun. \n";
    } else {
        print "Sorry, I do not know you. \n";
    }

while

• Example: (Guessing the correct word)

    $your_choice = '';
    $secret_word = 'India';
    while ($your_choice ne $secret_word) {
        print "Enter your guess: \n";
        chomp ($your_choice = <STDIN>);
    }

    print "Congratulations! Mera Bharat Mahan.";

for

• Syntax same as in C.
• Example:

    for ($i=1; $i<10; $i++) {
        print "Iteration number $i \n";
    }

foreach

• Very commonly used loop construct that iterates over a list.
• Example:

    @colors = qw(red blue green);
    foreach $name (@colors) {
        print "Color is $name. \n";
    }

• We can use ‘for’ in place of ‘foreach’.

• Example: Counting odd numbers in a list

    @xyz = qw(10 15 17 28 12 77 56);
    $count = 0;

    foreach $number (@xyz) {
        if (($number % 2) == 1) {
            print "$number is odd. \n";
            $count++;
        }
    }
    print "Number of odd numbers is $count. \n";

Breaking out of a loop

• The statement ‘last’, if it appears in the body of a loop, will cause Perl to immediately exit the loop.
  – Used with a conditional.

    last if ($i > 10);

Skipping to end of loop

• For this we use the statement ‘next’.
  – When executed, the remaining statements in the loop will be skipped, and the next iteration will begin.
  – Also used with a conditional.
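Both statements can be seen at work in one loop. This sketch reuses the number list from the odd-number example; the threshold of 50 and the array name @odd are chosen here only for illustration.

```perl
use strict;
use warnings;

my @xyz = qw(10 15 17 28 12 77 56);
my @odd;

foreach my $number (@xyz) {
    next if $number % 2 == 0;   # skip even numbers, go to the next iteration
    last if $number > 50;       # leave the loop at the first odd number above 50
    push @odd, $number;
}

print "Collected: @odd\n";
```

The loop collects 15 and 17, skips 28 and 12, and stops when it reaches 77, so 56 is never examined.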

Relational Operators

The Operators Listed

    Comparison          Numeric    String
    Equal               ==         eq
    Not equal           !=         ne
    Greater than        >          gt
    Less than           <          lt
    Greater or equal    >=         ge
    Less or equal       <=         le

Logical Connectives

• If $a and $b are logical expressions, then the following connectives are supported by Perl:
  – $a and $b        $a && $b
  – $a or $b         $a || $b
  – not $a           ! $a
• The two alternatives in each row have the same meaning; the first one is more readable, but has lower precedence than the symbolic form.

String Functions

The Split Function

• ‘split’ is used to split a string into multiple pieces using a delimiter, and create a list out of it.

    $_ = 'Red:Blue:Green:White:255';
    @details = split /:/, $_;
    foreach (@details) {
        print "$_\n";
    }

  – The first parameter to ‘split’ is a regular expression that specifies what to split on.
  – The second specifies what to split.

• Another example:

    $_ = 'Indranil [email protected] 283496';
    ($name, $email, $phone) = split / /, $_;

• By default (when no pattern is given), ‘split’ breaks a string on whitespace.
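Two further forms of split are often useful; neither appears in the slides, so treat this as an illustrative sketch. A third argument limits the number of pieces, and the special pattern ' ' (a single space in quotes) skips leading whitespace and splits on runs of whitespace. The sample strings are made up for the demonstration.

```perl
use strict;
use warnings;

my $record = "Red:Blue:Green:White:255";

my @all            = split /:/, $record;       # five fields
my ($head, $rest)  = split /:/, $record, 2;    # at most two pieces

# ' ' as the pattern: ignore leading spaces, split on any run of whitespace.
my $line  = "  Indranil   283496 ";
my @words = split ' ', $line;

print scalar(@all), " fields\n";
print "head=$head rest=$rest\n";
print join('|', @words), "\n";
```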

The Join Function

• ‘join’ is used to concatenate several elements into a single string, with a specified delimiter in between.

    $new = join ' ', $x1, $x2, $x3, $x4, $x5, $x6;

    $sep = '::';
    $new = join $sep, $x1, $x2, $x3, @abc, $x4, $x5;

Regular Expressions

Introduction

• One of the most useful features of Perl.
• What is a regular expression (RegEx)?
  – Refers to a pattern that follows the rules of syntax.
  – Basically specifies a chunk of text.
  – Very powerful way to specify string patterns.

An Example: without RegEx

    $found = 0;
    $_ = "Hello good morning everybody";
    $search = "every";
    foreach $word (split) {
        if ($word eq $search) {
            $found = 1;
            last;
        }
    }
    if ($found) {
        print "Found the word 'every' \n";
    }

Using RegEx

    $_ = "Hello good morning everybody";

    if ($_ =~ /every/) {
        print "Found the word 'every' \n";
    }

• Very easy to use.
• The text between the forward slashes defines the regular expression.
• If we use “!~” instead of “=~”, it means that the pattern is not present in the string.
• The previous example illustrates literal texts as regular expressions.
  – Simplest form of regular expression.
• Point to remember:
  – When performing the matching, all the characters in the string are considered to be significant, including punctuation and white spaces.
    • For example, /every / will not match in the previous example.

Another Simple Example

    $_ = "Welcome to IIT Kharagpur, students";
    if (/IIT K/) {
        print "'IIT K' is present in the string\n";
    }
    if (/Kharagpur students/) {
        print "This will not match\n";
    }

Types of RegEx

• Basically two types:
  – Matching
    • Checking if a string contains a substring.
    • The symbol ‘m’ is used (optional if forward slash used as delimiter).
  – Substitution
    • Replacing a substring by another substring.
    • The symbol ‘s’ is used.

Matching

The =~ Operator

• Tells Perl to apply the regular expression on the right to the value on the left.
• The regular expression is contained within delimiters (forward slash by default).
  – If some other delimiter is used, then a preceding ‘m’ is essential.

Examples

    $string = "Good day";

    if ($string =~ m/day/) {
        print "Match successful \n";
    }

    if ($string =~ /day/) {
        print "Match successful \n";
    }

• Both forms are equivalent.
• The ‘m’ in the first form is optional.

    $string = "Good day";

    if ($string =~ m@day@) {
        print "Match successful \n";
    }

    if ($string =~ m[day]) {
        print "Match successful \n";
    }

• Both forms are equivalent.
• The character following ‘m’ is the delimiter; bracket characters are used in matching pairs.

Character Class

• Use square brackets to specify “any value in the list of possible values”.

    my $string = "Some test string 1234";
    if ($string =~ /[0123456789]/) {
        print "Found a number \n";
    }
    if ($string =~ /[aeiou]/) {
        print "Found a vowel \n";
    }
    if ($string =~ /[0123456789ABCDEF]/) {
        print "Found a hex digit \n";
    }

Character Class Negation

• Use ‘^’ at the beginning of the character class to specify “any single element that is not one of these values”.

    my $string = "Some test string 1234";
    if ($string =~ /[^aeiou]/) {
        print "Found a character that is not a lowercase vowel\n";
    }

Pattern Abbreviations

• Useful in common cases

    .     Anything except newline (\n)
    \d    A digit, same as [0-9]
    \w    A word character, [0-9a-zA-Z_]
    \s    A space character (tab, space, etc)
    \D    Not a digit, same as [^0-9]
    \W    Not a word character
    \S    Not a space character

    $string = "Good and bad days";
    if ($string =~ /d..s/) {
        print "Found something like days\n";
    }
    if ($string =~ /\w\w\w\w\s/) {
        print "Found a four-letter word!\n";
    }

Anchors

• Three ways to define an anchor:

    ^     anchors to the beginning of the string
    $     anchors to the end of the string
    \b    anchors to a word boundary

    if ($string =~ /^\w/)         does the string start with a word character?
    if ($string =~ /\d$/)         does the string end with a digit?
    if ($string =~ /\bGood\b/)    does the string contain the word “Good”?

Multipliers

• There are three multiplier characters.

    *    Find zero or more occurrences
    +    Find one or more occurrences
    ?    Find zero or one occurrence

• Some example usages:

    $string =~ /^\w+/;
    $string =~ /\d?/;
    $string =~ /\b\w+\s+/;
    $string =~ /\w+\s?$/;
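The effect of each multiplier is easiest to see on small test strings. In the sketch below, the strings ("color", "ac", "abbbc", and so on) and the hash name %result are made up for illustration; each entry records whether the match succeeded.

```perl
use strict;
use warnings;

my %result = (
    'optional u present' => ("colour"  =~ /^colou?r$/) ? 1 : 0,   # ? allows one u
    'optional u absent'  => ("color"   =~ /^colou?r$/) ? 1 : 0,   # ? allows no u
    'two u characters'   => ("colouur" =~ /^colou?r$/) ? 1 : 0,   # ? allows at most one
    'star with no b'     => ("ac"      =~ /^ab*c$/)    ? 1 : 0,   # * allows zero
    'plus with no b'     => ("ac"      =~ /^ab+c$/)    ? 1 : 0,   # + needs at least one
    'plus with three b'  => ("abbbc"   =~ /^ab+c$/)    ? 1 : 0,   # + allows many
);

foreach my $case (sort keys %result) {
    print "$case: ", ($result{$case} ? "match" : "no match"), "\n";
}
```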

Substitution

Basic Usage

• Uses the ‘s’ character.
• Basic syntax is:

    $new =~ s/pattern_to_match/new_pattern/;

• What does this do?
  – Looks for pattern_to_match in $new and, if found, replaces it with new_pattern.
  – It looks for the pattern once. That is, only the first occurrence is replaced.
  – There is a way to replace all occurrences (to be discussed shortly).

Examples

$xyz = "Rama and Lakshman went to the forest";

$xyz =~ s/Lakshman/Bharat/;

$xyz =~ s/R\w+a/Bharat/;

$xyz =~ s/[aeiou]/i/;

$abc = "A year has 11 months \n";

$abc =~ s/\d+/12/;

$abc =~ s /\n$/ /;

Common Modifiers

• Two such modifiers are defined:

    /i    ignore case
    /g    match/substitute all occurrences

    $string = "Ram and Shyam are very honest";
    if ($string =~ /RAM/i) {
        print "Ram is present in the string";
    }

    $string =~ s/m/j/g;
    # Ram -> Raj, Shyam -> Shyaj

Use of Memory in RegEx

• We can use parentheses to capture a piece of matched text for later use.
  – Perl memorizes the matched texts.
  – Multiple sets of parentheses can be used.
• How to recall the captured text?
  – Use \1, \2, \3, etc. if still in RegEx.
  – Use $1, $2, $3 if after the RegEx.

Examples

    $string = "Ram and Shyam are honest";

    $string =~ /^(\w+)/;
    print $1, "\n";         # prints "Ram\n"

    $string =~ /(\w+)$/;
    print $1, "\n";         # prints "honest\n"

    $string =~ /^(\w+)\s+(\w+)/;
    print "$1 $2\n";        # prints "Ram and\n"

    $string = "Ram and Shyam are very poor";
    if ($string =~ /(\w)\1/) {
        print "found 2 in a row\n";
    }
    if ($string =~ /(\w+).*\1/) {
        print "found repeat\n";
    }

    $string =~ s/(\w+) and (\w+)/$2 and $1/;
    # $string becomes "Shyam and Ram are very poor"
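The captured pieces can also be copied into ordinary variables, which avoids relying on $1 and $2 after a later match has overwritten them. This is a hedged sketch built on the same sentence; the variable names are illustrative.

```perl
use strict;
use warnings;

my $string = "Ram and Shyam are very poor";

# Copy the captures only if the match succeeded.
my ($first, $second) = ('', '');
if ($string =~ /^(\w+)\s+(\w+)/) {
    ($first, $second) = ($1, $2);
}

# \1 inside the pattern refers back to the first capture: a doubled character.
my $doubled = ($string =~ /(\w)\1/) ? $1 : '';

# Work on a copy so that the original string stays unchanged.
my $swapped = $string;
$swapped =~ s/(\w+) and (\w+)/$2 and $1/;

print "first=$first second=$second doubled=$doubled\n";
print "$swapped\n";
```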

Example 1

• Validating user input

    print "Enter age (or 'q' to quit): ";
    chomp (my $age = <STDIN>);

    exit if ($age =~ /^q$/i);

    if ($age =~ /\D/) {
        print "$age is a non-number!\n";
    }

Example 2: validation contd.

• File has 2 columns, name and age, delimited by one or more spaces. Can also have blank lines or commented lines (start with #).

    open IN, $file or die "Cannot open $file: $!";
    while (my $line = <IN>) {
        chomp $line;
        next if ($line =~ /^\s*$/ or $line =~ /^\s*#/);
        my ($name, $age) = split /\s+/, $line;
        print "The age of $name is $age. \n";
    }

Some Special Variables

$&, $` and $'

• What is $&?
  – It represents the string matched by the last successful pattern match.
• What is $`?
  – It represents the string preceding whatever was matched by the last successful pattern match.
• What is $'?
  – It represents the string following whatever was matched by the last successful pattern match.
  – Example:

    $_ = 'abcdefghi';
    /def/;
    print "$`:$&:$'\n";     # prints abc:def:ghi

• So actually ….
  – $` represents pre match
  – $& represents present match
  – $' represents post match

Associative Arrays

Introduction

• Associative arrays, also known as hashes.
  – Similar to a list
    • Every list element consists of a pair, a hash key and a value.
    • Hash keys must be unique.
  – Accessing an element
    • Unlike an array, an element value can be found out by specifying the hash key value.
    • Associative search.
  – A hash array name must begin with a ‘%’.

Specifying Hash Array

• Two ways to specify:
  – Specifying hash keys and values, in proper sequence.

    %directory = (
        "Rabi", "258345",
        "Chandan", "325129",
        "Atul", "445287",
        "Sruti", "237221"
    );

  – Using the => operator.

    %directory = (
        Rabi => "258345",
        Chandan => "325129",
        Atul => "445287",
        Sruti => "237221"
    );

  – A bare word appearing on the left hand side of ‘=>’ is automatically treated as a quoted string.

Conversion Array <=> Hash

• An array can be converted to a hash.

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;

• A hash can be converted to an array:

    @list = %directory;

Accessing a Hash Element

• Given the hash key, the value can be accessed using ‘{ }’.
• Example:

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    print "Atul's number is $directory{Atul} \n";

Modifying a Value

• By simple assignment:

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;

    $directory{Sruti} = "453322";
    $directory{'Chandan'}++;

Deleting an Entry

• A (hash key, value) pair can be deleted from a hash array using the “delete” function.
  – Hash key has to be specified.

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    delete $directory{Atul};

Swapping Keys and Values

• Why needed?
  – Suppose we want to search for a person, given the phone number.

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;

    %revdir = reverse %directory;
    print "$revdir{237221} \n";

Using Functions ‘keys’, ‘values’

• ‘keys’ returns all the hash keys as a list.
• ‘values’ returns all the values as a list.

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;

    @all_names = keys %directory;
    @all_phones = values %directory;

An Example

• List all person names and telephone numbers.

    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;

    foreach $name (keys %directory) {
        print "$name \t $directory{$name} \n";
    }
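‘keys’ returns the hash keys in no particular order, so a listing meant for reading is usually sorted. The sketch below is an illustration, not part of the slides; it also uses ‘exists’ to test whether a key is present, and the name "Mina" is used only as an example of an absent key.

```perl
use strict;
use warnings;

my %directory = (
    Rabi    => "258345",
    Chandan => "325129",
    Atul    => "445287",
    Sruti   => "237221",
);

# Build the report in alphabetical order of names.
my @report;
foreach my $name (sort keys %directory) {
    push @report, "$name\t$directory{$name}";
}
print "$_\n" for @report;

# exists checks for a key without creating it.
my $has_atul = exists $directory{Atul} ? "yes" : "no";
my $has_mina = exists $directory{Mina} ? "yes" : "no";
print "Atul listed: $has_atul, Mina listed: $has_mina\n";
```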

Subroutines

Introduction

• A subroutine …
  – Is a user-defined function.
  – Allows code reuse.
  – Define once, use multiple times.

How to use?

• Defining a subroutine

    sub test_sub {
        # the body of the subroutine goes here
        # ……..
    }

• Calling a subroutine
  – Use the ‘&’ prefix to call a subroutine.

    &test_sub;
    &gcd ($val1, $val2);     # Two parameters

  – However, the ‘&’ is optional.

Subroutine Return Values

• Use the ‘return’ statement.
  – This is also optional.
  – If the keyword ‘return’ is omitted, Perl functions return the last value evaluated.
• A subroutine can also return a non-scalar.
• Some examples are given next.

Example 1

    $name = 'Indranil';
    welcome();           # call the first sub
    welcome_name();      # call the second sub
    exit;

    sub welcome {
        print "hi there\n";
    }

    sub welcome_name {
        print "hi $name\n";     # uses global $name variable
    }

Example 2

    # Return a non-scalar
    sub return_alpha_and_beta {
        return ($alpha, $beta);
    }

    $alpha = 15;
    $beta = 25;

    @c = return_alpha_and_beta;     # @c gets (15, 25)

Passing Arguments

• All arguments are passed into a Perl function through the special array @_.
  – Thus, we can send as many arguments as we want.
• Individual arguments can also be accessed as $_[0], $_[1], $_[2], etc.

Example 3

    # Two different ways to write a subroutine to add two numbers
    sub add_ver1 {
        ($first, $second) = @_;
        return ($first + $second);
    }

    sub add_ver2 {
        return $_[0] + $_[1];     # $_[0] and $_[1] are the first two
                                  # elements of @_
    }

Example 4

    $total = find_total (5, 10, -12, 7, 40);

    sub find_total {
        # adds all numbers passed to the sub
        $sum = 0;
        for $num (@_) {
            $sum += $num;
        }
        return $sum;
    }

‘my’ variables

• We can define local variables using the ‘my’ keyword.
  – Confines a variable to a region of code (within a block { } ).
  – A ‘my’ variable’s storage is freed whenever the variable goes out of scope.
  – All variables in Perl are by default ‘global’.

Example 5

    $sum = 7;
    $total = add_any (20, 10, -15);     # $total gets 15

    sub add_any {
        # local variable, won't interfere
        # with global $sum
        my $sum = 0;

        for my $num (@_) {
            $sum += $num;
        }
        return $sum;
    }
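Combining ‘my’ with the @_ array gives subroutines that do not touch any global variable. The two subroutines below are illustrative sketches: gcd is one possible body (Euclid's algorithm) for the gcd call mentioned earlier, which the slides never define, and min_max is a hypothetical helper that returns a list.

```perl
use strict;
use warnings;

# Greatest common divisor by Euclid's algorithm (an assumed implementation).
sub gcd {
    my ($x, $y) = @_;                        # private copies of the arguments
    ($x, $y) = ($y, $x % $y) while $y;       # repeat until the remainder is 0
    return $x;
}

# Returns a two-element list: the smallest and the largest argument.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $divisor      = gcd(48, 18);
my ($min, $max)  = min_max(5, 10, -12, 7, 40);

print "gcd is $divisor\n";
print "min is $min, max is $max\n";
```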

Writing CGI Scripts in Perl

Introduction

• Perl provides a number of facilities for writing CGI scripts.
  – Standard library modules.
    • Included as part of the Perl distribution.
    • No need to install them separately.

    #!/usr/bin/perl
    use CGI qw(:standard);

• Some of the functions included in the CGI.pm (.pm is optional) are:
  – header
    • This prints out the “Content-type” header.
    • With no arguments, the type is assumed to be “text/html”.
  – start_html
    • This prints out the <html>, <head>, and <body> tags.
    • Accepts optional arguments.
  – end_html
    • This prints out the closing HTML tags, </body>, </html>.
• Typical usages and arguments are illustrated through examples.

Example 1 (without using CGI.pm)

    #!/usr/bin/perl
    print <<TO_END;
    Content-type: text/html

    <HTML> <HEAD> <TITLE> Server Details </TITLE> </HEAD>
    <BODY>
    Server name: $ENV{SERVER_NAME} <BR>
    Server port number: $ENV{SERVER_PORT} <BR>
    Server protocol: $ENV{SERVER_PROTOCOL}
    </BODY> </HTML>
    TO_END

Example 2 (using CGI.pm)

    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    print header ("text/html");
    print start_html ("Hello World");
    print "<P>Hello, world!</P>\n";
    print end_html;

Example 3: Decoding Form Input

    sub parse_form_data {
        my %form_data;
        my $name_value;
        my @nv_pairs = split /&/, $ENV{QUERY_STRING};

        if ( $ENV{REQUEST_METHOD} eq 'POST' ) {
            my $query = "";
            read (STDIN, $query, $ENV{CONTENT_LENGTH});
            push @nv_pairs, split /&/, $query;
        }

        foreach $name_value (@nv_pairs) {
            my ($name, $value) = split /=/, $name_value;

            $name =~ tr/+/ /;
            $name =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
            $value =~ tr/+/ /;
            $value =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;

            $form_data{$name} = $value;
        }
        return %form_data;
    }
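The decoding steps can be tried outside a web server by passing the query string as an argument instead of reading it from %ENV. The sketch below is an adaptation for illustration: the subroutine name parse_query_string and the sample query string are made up, and a field with no value is treated as an empty string.

```perl
use strict;
use warnings;

# Decodes "name=value&name=value" text into a hash (illustrative variant).
sub parse_query_string {
    my ($query) = @_;
    my %form_data;

    foreach my $name_value (split /&/, $query) {
        my ($name, $value) = split /=/, $name_value, 2;
        $value = '' unless defined $value;

        # Apply the same two decoding steps to the name and to the value.
        for ($name, $value) {
            tr/+/ /;                                   # '+' encodes a space
            s/%([\da-f][\da-f])/chr(hex($1))/egi;      # %XX encodes one character
        }
        $form_data{$name} = $value;
    }
    return %form_data;
}

my %form = parse_query_string('name=Indranil+Sen+Gupta&dept=CSE%26IT&note=');

foreach my $field (sort keys %form) {
    print "$field = [$form{$field}]\n";
}
```

Here %26 is the encoded form of ‘&’, so it survives the initial split on ‘&’ and is restored only afterwards.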

Using CGI.pm

• The decoded form value can be directly accessed as:

    $value = param ('fieldname');

• The equivalent of the last example, written using CGI.pm, is shown below.

Example 4

    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    my %form_data;
    foreach my $name (param()) {
        $form_data{$name} = param($name);
    }

Example 5: sending mail

    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    print header;
    print start_html ("Response to Guestbook");
    $ENV{PATH} = "/usr/sbin";     # to locate sendmail
    open (MAIL, "| /usr/sbin/sendmail -oi -t");     # open the pipe to sendmail
    my $recipient = '[email protected]';
    print MAIL "To: $recipient\n";
    print MAIL "From: isg\@cse.iitkgp.ac.in\n";
    print MAIL "Subject: Submitted data\n\n";

    foreach my $xyz (param()) {
        print MAIL "$xyz = ", param($xyz), "\n";
    }
    close (MAIL);
    print <<EOM;
    <H2>Thanks for the comments</H2>
    <P>Hope you visit again.</P>
    EOM
    print end_html;
