Introduction to Perl
Part I, II, and III
By: Bridget Thomson McInnes
20 January 2004

What is Perl?
Perl is a portable scripting language
  – No compiling is needed.
  – Runs on Windows, UNIX and Linux
Fast and easy text processing capability
Fast and easy file handling capability
Written by Larry Wall
“Perl is the language for getting your job done.”

How to Access Perl
On the school network
  – Located on the csdev machines at: /usr/local/bin/perl
To install at home
  – www.perl.com has RPMs for Linux
  – www.activestate.com has binaries for Windows
Latest version is 5.8
To check that Perl is working and see the version number
    % perl -v

Resources for Perl
Books:
  – Learning Perl
      By Randal L. Schwartz and Tom Phoenix
      Published by O'Reilly
  – Programming Perl
      By Larry Wall, Tom Christiansen and Jon Orwant
      Published by O'Reilly
Web site
  – http://safari.oreilly.com
      Contains both Learning Perl and Programming Perl in ebook form

Web Sources for Perl
Web
  – www.perl.com
  – www.perldoc.com
  – www.perl.org
  – www.perlmonks.org

The Basic Hello World Program
Program:
    #!/usr/local/bin/perl -w
    print "Hello World!\n";
Save this as “hello.pl”
Give it executable permissions
  – chmod ug+x hello.pl
Run it as follows:
  – ./hello.pl

“Hello World” Observations
The “.pl” extension is optional but is commonly used
The first line “#!/usr/local/bin/perl” tells UNIX where to find Perl
“-w” switches on warnings: not required, but a really good idea
Second line: parentheses are not needed around the argument of the print function

Numerical Literals
  – 6           Integer
  – 12.6        Floating point
  – 1e10        Scientific notation
  – 6.4E-33     Scientific notation
  – 4_348_348   Underscores instead of commas for long numbers

String Literals
  – "There is more than one way to do it!"
  – 'Just don\'t create a file called -rf.'
  – "Beauty?\nWhat's that?\n"
  – ""
  – "Real programmers can write assembly in any language."
Quotes from Larry Wall

Types of Variables
Types of variables:
  – Scalar variables : $a, $b, $c
  – Array variables : @array
  – Hash variables : %hash
  – File handles : STDIN, SRC, DEST
Variables do not need to be declared
Variable type (int, char, ...) is decided at run time
    $a = 5;        # now an integer
    $a = "perl";   # now a string

Operators on Scalar Variables
Numeric and logic operators
  – Typical : +, -, *, /, %, ++, --, +=, -=, *=, /=, ||, &&, ! etc.
  – Not typical: ** for exponentiation
String operators
  – Concatenation: “.” - similar to strcat
    $first_name = "Larry";
    $last_name = "Wall";
    $full_name = $first_name . " " . $last_name;

Equality Operators for Strings
Equality / inequality : eq and ne
    $language = "Perl";
    if ($language == "Perl") ...    # Wrong! == compares numerically, and non-numeric strings count as 0
    if ($language eq "Perl") ...    # Correct
  – Use eq / ne rather than == / != for strings

Relational Operators for Strings
Greater than
  – Numeric : >     String : gt
Greater than or equal to
  – Numeric : >=    String : ge
Less than
  – Numeric : <     String : lt
Less than or equal to
  – Numeric : <=    String : le
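
The difference between the numeric and string operators is easiest to see when the same values are compared both ways. The following is a minimal sketch; the variable names ($num_less, $str_less, $same) are made up for illustration and are not from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Numeric operators compare values as numbers;
# string operators compare them character by character.
my $num_less = (10 < 9)      ? 1 : 0;   # 10 is not numerically less than 9
my $str_less = ("10" lt "9") ? 1 : 0;   # but the string "10" sorts before "9"

# eq is the right equality test for strings.
my $language = "Perl";
my $same = ($language eq "Perl") ? "yes" : "no";

print "10 <  9 : $num_less\n";   # prints 10 <  9 : 0
print "10 lt 9 : $str_less\n";   # prints 10 lt 9 : 1
print "eq test : $same\n";       # prints eq test : yes
```

The string comparison succeeds because "1" comes before "9" in character order, regardless of how long either string is.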

String Functions
Convert to upper case
  – $name = uc($name);
Convert only the first char to upper case
  – $name = ucfirst($name);
Convert to lower case
  – $name = lc($name);
Convert only the first char to lower case
  – $name = lcfirst($name);

A String Example Program
    #!/usr/local/bin/perl
    $var1 = "larry";
    $var2 = "moe";
    $var3 = "shemp";
    print ucfirst($var1);       # Prints 'Larry'
    print uc($var2);            # Prints 'MOE'
    print lcfirst(uc($var3));   # Prints 'sHEMP'

Variable Interpolation
Perl looks for variables inside strings and replaces them with their value
    $stooge = "Larry";
    print "$stooge is one of the three stooges.\n";
Produces the output:
    Larry is one of the three stooges.
This does not happen when you use single quotes
    print '$stooge is one of the three stooges.\n';
Produces the output:
    $stooge is one of the three stooges.\n

Character Interpolation
List of character escapes that are recognized when using double-quoted strings
  – \n   newline
  – \t   tab
  – \r   carriage return
Common example :
    print "Hello\n";   # prints Hello and then a newline

Numbers and Strings are Interchangeable
If a scalar variable looks like a number and Perl needs a number, it will use it as a number
    $a = 4;            # a number
    print $a + 18;     # prints 22
    $b = "50";         # looks like a string, but ...
    print $b - 10;     # will print 40!
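
The conversion works in both directions: the operator, not the variable, decides whether a scalar is treated as a number or as a string. A small sketch, with the variable names ($count, $diff, $text) chosen only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = "50";          # assigned as a string

my $diff = $count - 10;    # "-" is numeric, so "50" is used as the number 50
my $text = $count . 10;    # "." is a string operator, so 10 becomes "10"

print "$diff\n";           # prints 40
print "$text\n";           # prints 5010
```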

If ... else ... statements
Similar to C/C++ - except the scope braces are REQUIRED!!
    if ( $os eq "Linux" ) {
        print "Sweet!\n";
    }
    elsif ( $os eq "Windows" ) {
        print "Time to move to Linux, buddy!\n";
    }
    else {
        print "Hmm...!\n";
    }

Unless ... else Statements
Unless statements are the opposite of if ... else statements.
    unless ($os eq "Linux") {
        print "Time to move to Linux, buddy!\n";
    }
    else {
        print "Sweet!\n";
    }
And again remember the braces are required!

While Loop
While loop: Similar to C/C++ but again the braces are required!!
Example :
    $i = 0;
    while ( $i <= 1000 ) {
        print "$i\n";
        $i++;
    }

Until Loop
The until loop executes its block repeatedly until a specific condition is met.
Example:
    $i = 0;
    until ($i == 1000) {
        print "$i\n";
        $i++;
    }

For Loops
Like C/C++
  – Example :
    for ( $i = 0; $i <= 1000; $i++ ) {
        print "$i\n";
    }
Another way to create a for loop
  – Example
    for $i (0..1000) {
        print "$i\n";
    }

Moving around in a Loop
Where you would use continue in C, use next.
Where you would use break in C, use last.
What is the output for the following code snippet:
    for ( $i = 0; $i < 10; $i++) {
        if ($i == 1 || $i == 3) { next; }
        if ($i == 5) { last; }
        print "$i\n";
    }
Answer
    0
    2
    4

Arrays
Array variable is denoted by the @ symbol
    @array = ( "Larry", "Curly", "Moe" );
To access the whole array, use the whole array
    print @array;     # prints : LarryCurlyMoe
    print "@array";   # prints : Larry Curly Moe
Notice that you do not need to loop through the whole array to print it – Perl does this for you

Arrays cont...
To access one element of the array : use $
  – Why? Because every element in the array is a scalar
    print "$array[0]\n";   # prints : Larry
Question:
  – What happens if we access $array[3] ?
Answer : No error. The value is undef, so nothing is printed (with -w, Perl warns about an uninitialized value).

Arrays cont ...
To find the index of the last element in the array
    print $#array;   # prints 2 in the previous example
To find the number of elements in the array:
    $array_size = @array;
  – $array_size now has 3 in the above example because there are 3 elements in the array
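
The last index and the element count are easy to confuse, so here is a short sketch that puts them side by side and also checks an out-of-range element. The variable names ($last_index, $size, $missing) are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array = ("Larry", "Curly", "Moe");

my $last_index = $#array;          # index of the last element
my $size       = scalar(@array);   # number of elements (array in scalar context)

# Reading past the end is not an error; the element is simply undefined.
my $missing = defined $array[3] ? "defined" : "undef";

print "Last index: $last_index\n";   # prints Last index: 2
print "Size: $size\n";               # prints Size: 3
print "Element 3 is $missing\n";     # prints Element 3 is undef
```

Writing scalar(@array) makes the scalar context explicit, which can be clearer than relying on the assignment to a scalar variable.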

Sorting Arrays
Perl has a built-in sort function
Two ways to sort:
  – Default : sorts in standard string comparison order
      sort LIST
  – Usersub: create your own subroutine that returns an integer less than, equal to or greater than 0
      sort USERSUB LIST
      The <=> and cmp operators make creating sorting subroutines very easy

Numerical Sorting Example
    #!/usr/local/bin/perl -w
    @unsortedArray = (3, 10, 76, 23, 1, 54);
    @sortedArray = sort numeric @unsortedArray;
    print "@unsortedArray\n";   # prints 3 10 76 23 1 54
    print "@sortedArray\n";     # prints 1 3 10 23 54 76
    sub numeric {
        $a <=> $b
    }

String Sorting Example
    #!/usr/local/bin/perl -w
    @unsortedArray = ("Larry", "Curly", "moe");
    @sortedArray = sort { lc($a) cmp lc($b) } @unsortedArray;
    print "@unsortedArray\n";   # prints Larry Curly moe
    print "@sortedArray\n";     # prints Curly Larry moe
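
To see why a comparison block matters for numbers, it may help to sort the same list with and without one. This is a minimal sketch; the array names (@as_strings, @as_numbers, @descending) are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @numbers = (3, 10, 76, 23, 1, 54);

# Default sort compares elements as strings, so "10" comes before "3".
my @as_strings = sort @numbers;

# <=> compares numerically; swapping $a and $b reverses the order.
my @as_numbers = sort { $a <=> $b } @numbers;
my @descending = sort { $b <=> $a } @numbers;

print "@as_strings\n";   # prints 1 10 23 3 54 76
print "@as_numbers\n";   # prints 1 3 10 23 54 76
print "@descending\n";   # prints 76 54 23 10 3 1
```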

Foreach
Foreach allows you to iterate over an array
Example:
    foreach $element (@array) {
        print "$element\n";
    }
This is similar to :
    for ($i = 0; $i <= $#array; $i++) {
        print "$array[$i]\n";
    }

Sorting with Foreach
The sort function sorts the array and returns the list in sorted order.
Example :
    @array = ( "Larry", "Curly", "Moe" );
    foreach $element (sort @array) {
        print "$element ";
    }
Prints the elements in sorted order:
    Curly Larry Moe

Strings to Arrays : split
Split a string into words and put into an array
    @array = split( / /, "Larry Curly Moe" );
    # creates the same array as we saw previously
Split into characters
    @stooge = split( //, "curly" );
    # array @stooge has 5 elements: c, u, r, l, y

Split cont..
Split on any character
    @array = split( /:/, "10:20:30:40" );
    # array has 4 elements : 10, 20, 30, 40
Split on multiple white space
    @array = split( /\s+/, "this   is a    test" );
    # array has 4 elements : this, is, a, test
More on '\s+' later

Arrays to Strings
Array to space separated string
    @array = ("Larry", "Curly", "Moe");
    $string = join( " ", @array );
    # string = "Larry Curly Moe"
Array of characters to string
    @stooge = ("c", "u", "r", "l", "y");
    $string = join( "", @stooge );
    # string = "curly"

Joining Arrays cont...
Join with any character you want
    @array = ( "10", "20", "30", "40" );
    $string = join( ":", @array );
    # string = "10:20:30:40"
Join with multiple characters
    @array = ( "10", "20", "30", "40" );
    $string = join( "->", @array );
    # string = "10->20->30->40"
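
split and join are often used as a pair: break a delimited record into fields, work on the fields, then reassemble them. A short sketch of that round trip, where $record, @fields, $total and $rejoined are names chosen for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $record = "10:20:30:40";

# Break the record into fields on the colon delimiter.
my @fields = split(/:/, $record);

# Work on the individual fields: here, add them up.
my $total = 0;
$total += $_ for @fields;

# Put the fields back together with a different separator.
my $rejoined = join("->", @fields);

print "Total: $total\n";   # prints Total: 100
print "$rejoined\n";       # prints 10->20->30->40
```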

Arrays as Stacks and Lists
To append to the end of an array :
    @array = ( "Larry", "Curly", "Moe" );
    push (@array, "Shemp" );
    print $array[3];   # prints "Shemp"
To remove the last element of the array (LIFO)
    $element = pop @array;
    print $element;    # prints "Shemp"
  – @array now has the original elements ("Larry", "Curly", "Moe")

Arrays as Stacks and Lists
To prepend to the beginning of an array
    @array = ( "Larry", "Curly", "Moe" );
    unshift @array, "Shemp";
    print $array[3];   # prints "Moe"
    print $array[0];   # prints "Shemp"
To remove the first element of the array
    $element = shift @array;
    print $element;    # prints "Shemp"
  – The array now contains only : "Larry", "Curly", "Moe"
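
Combining these four functions gives the two classic access patterns: push with pop behaves as a stack (last in, first out), while push with shift behaves as a queue (first in, first out). A minimal sketch, with @stack, @queue, $top and $front as illustrative names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stack: add and remove at the same end.
my @stack = ("Larry", "Curly", "Moe");
push(@stack, "Shemp");
my $top = pop(@stack);        # the most recently added element

# Queue: add at the end, remove from the front.
my @queue = ("Larry", "Curly", "Moe");
push(@queue, "Shemp");
my $front = shift(@queue);    # the oldest element

print "Stack gave back: $top\n";     # prints Stack gave back: Shemp
print "Queue gave back: $front\n";   # prints Queue gave back: Larry
print "Queue now: @queue\n";         # prints Queue now: Curly Moe Shemp
```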

Hashes
Hashes are like arrays: they store collections of scalars
... but unlike arrays, indexing is by name
Two components to each hash entry:
  – Key     example : name
  – Value   example : phone number
Hashes denoted with %
  – Example : %phoneDirectory
Elements are accessed using {} (like [] in arrays)

Hashes continued ...
Adding a new key-value pair
    $phoneDirectory{"Shirly"} = 7267975;
  – Note the $ to specify “scalar” context!
Each key can have only one value
    $phoneDirectory{"Shirly"} = 7265797;   # overwrites previous assignment
Multiple keys can have the same value
Accessing the value of a key
    $phoneNumber = $phoneDirectory{"Shirly"};
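
Two further built-ins round out basic hash handling: exists tests whether a key is present, and delete removes a key and its value. The sketch below reuses the phone directory entry from the slide; the key "Moe" and the variable names are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %phoneDirectory = ( "Shirly" => 7265797 );

# exists checks for the key itself, without creating it.
my $has_shirly = exists $phoneDirectory{"Shirly"} ? 1 : 0;
my $has_moe    = exists $phoneDirectory{"Moe"}    ? 1 : 0;

# delete removes the key-value pair from the hash.
delete $phoneDirectory{"Shirly"};
my $remaining = scalar(keys %phoneDirectory);

print "Shirly listed: $has_shirly\n";       # prints Shirly listed: 1
print "Moe listed: $has_moe\n";             # prints Moe listed: 0
print "Entries remaining: $remaining\n";    # prints Entries remaining: 0
```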

Hashes and Foreach
Foreach works in hashes as well!
    foreach $person (keys %phoneDirectory) {
        print "$person: $phoneDirectory{$person}\n";
    }
Never depend on the order you put key/values in the hash: keys come back in no particular order. Perl has its own magic to make hashes amazingly fast!!

Hashes and Sorting
The sort function works with hashes as well
Sorting on the keys
    foreach $person (sort keys %phoneDirectory) {
        print "$person : $phoneDirectory{$person}\n";
    }
  – This will print the phoneDirectory hash table in alphabetical order based on the name of the person, i.e. the key.

Hash and Sorting cont...
Sorting by value
    foreach $person (sort {$phoneDirectory{$a} <=> $phoneDirectory{$b}} keys %phoneDirectory) {
        print "$person : $phoneDirectory{$person}\n";
    }
  – Prints the person and their phone number in the order of their respective phone numbers, i.e. the value.

A Quick Program using Hashes
Count the number of Republicans in an array
    %seen = ();   # initialize hash to empty
    @politArray = ( "R", "R", "D", "I", "D", "R", "G" );
    foreach $politician (@politArray) {
        $seen{$politician}++;
    }
    print "Number of Republicans = $seen{'R'}\n";

Slightly more advanced program
Count the number of parties represented, and by how much!
    %seen = ();   # initialize hash to empty
    @politArray = ( "R", "R", "D", "I", "D", "R", "G" );
    foreach $politician (@politArray) {
        $seen{$politician}++;
    }
    foreach $party (keys %seen) {
        print "Party : $party. Num reps: $seen{$party}\n";
    }
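
Because keys returns the parties in no particular order, the output of the program above can vary from run to run. One way to make it predictable, sketched here as a variation rather than as part of the original program, is to sort the keys by count and break ties by name. The array @ranked is a name introduced for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @politArray = ( "R", "R", "D", "I", "D", "R", "G" );

# Tally each party.
my %seen;
$seen{$_}++ for @politArray;

# Highest count first; parties with equal counts in alphabetical order.
my @ranked = sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen;

foreach my $party (@ranked) {
    print "Party : $party. Num reps: $seen{$party}\n";
}
# prints:
# Party : R. Num reps: 3
# Party : D. Num reps: 2
# Party : G. Num reps: 1
# Party : I. Num reps: 1
```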

Command Line Arguments
Command line arguments in Perl are extremely easy.
@ARGV is the array that holds all arguments passed in from the command line.
  – Example:
      % ./prog.pl arg1 arg2 arg3
  – @ARGV would contain ('arg1', 'arg2', 'arg3')
$#ARGV returns the index of the last command line argument, which is one less than the number of arguments passed.
  – Remember $#array is the last index of the array, not its size! Use scalar(@ARGV) for the count.
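
In Perl 5, $#ARGV holds the index of the last argument, so it is one less than the argument count (and -1 when there are no arguments). The sketch below wraps the two values in a helper so they can be compared; arg_summary is a hypothetical subroutine written for this example, not a Perl built-in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (count, last index) for a list of arguments.
sub arg_summary {
    my @args = @_;
    my $count      = scalar(@args);   # number of arguments
    my $last_index = $#args;          # index of the last one, or -1 if none
    return ($count, $last_index);
}

my ($count, $last_index) = arg_summary(@ARGV);
print "Arguments: $count, last index: $last_index\n";
```

Run as ./prog.pl arg1 arg2 arg3, this reports 3 arguments and a last index of 2.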

Quick Program with @ARGV
Simple program called log.pl that takes in a number and prints the log base 2 of that number:
    #!/usr/local/bin/perl -w
    $log = log($ARGV[0]) / log(2);
    print "The log base 2 of $ARGV[0] is $log.\n";
Run the program as follows:
  – % log.pl 8
This will return the following:
  – The log base 2 of 8 is 3.

Another Example Program
You want to print the binary form of an integer
    #!/usr/local/bin/perl -w
    foreach $integer (@ARGV) {
        # converts the integer to a 32 bit binary number
        @binary = split //, unpack("B32", pack("N", $integer));
        # Store the last 4 elements of @binary into @bits
        @bits = @binary[28..$#binary];
        # Print the integer and its binary form
        print "$integer : @bits\n";
    }

File Handles
Very simple compared to C/C++ !!!
Are not prefixed with a symbol ($, @, %, etc.)
Opening a file:
    open (SRC, "my_file.txt");
Reading from a file
    $line = <SRC>;   # reads up to and including a newline character
Closing a file
    close (SRC);

File Handles cont...
Opening a file for output:
    open (DST, ">my_file.txt");
Opening a file for appending
    open (DST, ">>my_file.txt");
Writing to a file:
    print DST "Printing my first line.\n";
Safeguarding against opening a non-existent file
    open (SRC, "file.txt") || die "Could not open file.\n";

File Test Operators
Check to see if a file exists:
    if ( -e "file.txt" ) {
        # The file exists!
    }
Other file test operators:
    -r   readable
    -x   executable
    -d   is a directory
    -T   is a text file

Quick Program with File Handles
Program to copy a file to a destination file
    #!/usr/local/bin/perl -w
    open(SRC, "file.txt") || die "Could not open source file.\n";
    open(DST, ">newfile.txt");
    while ( $line = <SRC> ) {
        print DST $line;
    }
    close SRC;
    close DST;
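
The same copy can also be written with lexical file handles and the three-argument form of open, which keeps the open mode separate from the file name. This is an alternative sketch, not the form used in the slides; copy_file is a hypothetical subroutine, and the example creates its own temporary files with the core File::Temp module so that it can run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copies $src_path to $dst_path line by line
# and returns the number of lines copied.
sub copy_file {
    my ($src_path, $dst_path) = @_;
    open(my $src, '<', $src_path) or die "Could not open $src_path: $!\n";
    open(my $dst, '>', $dst_path) or die "Could not open $dst_path: $!\n";
    my $lines = 0;
    while (my $line = <$src>) {
        print {$dst} $line;
        $lines++;
    }
    close($src);
    close($dst) or die "Could not close $dst_path: $!\n";
    return $lines;
}

# Build a small source file in a temporary directory, then copy it.
my $dir    = tempdir(CLEANUP => 1);
my $source = "$dir/file.txt";
open(my $out, '>', $source) or die "Could not create $source: $!\n";
print {$out} "first line\nsecond line\n";
close($out);

my $copied = copy_file($source, "$dir/newfile.txt");
print "Copied $copied lines\n";   # prints Copied 2 lines
```

Including $! in the die message reports why the open failed, for example a missing file or a permissions problem.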

Some Default File Handles
STDIN : Standard input
    $line = <STDIN>;   # takes input from stdin
STDOUT : Standard output
    print STDOUT "File handling in Perl is sweet!\n";
STDERR : Standard error
    print STDERR "Error!!\n";

The <> File Handle
The “empty” file handle takes the command line file(s) or STDIN:
  – $line = <>;
If program is run ./prog.pl file.txt, this will automatically open file.txt and read the first line.
If program is run ./prog.pl file1.txt file2.txt, this will first read in file1.txt and then file2.txt ... you will not know when one ends and the other begins.

The <> File Handle cont...
If program is run ./prog.pl, the program will wait for you to enter text at the prompt, and will continue until you enter the EOF character
  – CTRL-D in UNIX

Example Program with STDIN
Suppose you want to determine if you are one of the three stooges
    #!/usr/local/bin/perl
    %stooges = (larry => 1, moe => 1, curly => 1 );
    print "Enter your name: ? ";
    $name = <STDIN>; chomp $name;
    if($stooges{lc($name)}) {
        print "You are one of the Three Stooges!!\n";
    } else {
        print "Sorry, you are not a Stooge!!\n";
    }

Chomp and Chop
Chomp : function that deletes a trailing newline from the end of a string.
    $line = "this is the first line of text\n";
    chomp $line;    # removes the newline character
    print $line;    # prints "this is the first line of text" without a trailing newline
Chop : function that chops off the last character of a string.
    $line = "this is the first line of text";
    chop $line;
    print $line;    # prints "this is the first line of tex"
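
The practical difference shows up when the string has no trailing newline: chomp leaves it alone, while chop removes a character regardless. A small sketch, with variable names chosen for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# chomp removes a trailing newline if there is one.
my $with_newline = "text\n";
chomp $with_newline;

# With no trailing newline, chomp changes nothing.
my $without_newline = "text";
chomp $without_newline;

# chop always removes the last character, whatever it is.
my $chopped = "text";
chop $chopped;

print "[$with_newline]\n";      # prints [text]
print "[$without_newline]\n";   # prints [text]
print "[$chopped]\n";           # prints [tex]
```

This is why chomp is usually the safer choice for cleaning up lines read from a file or from STDIN.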

$_
Perl's default scalar variable, used when a variable is not explicitly specified.
Can be used in
  – For loops
  – File handling
  – Regular expressions – discussed later

$_ and For Loops
Example using $_ in a for loop
    @array = ( "Perl", "C", "Java" );
    for (@array) {
        print $_ . " is a language I know.\n";
    }
  – Output :
      Perl is a language I know.
      C is a language I know.
      Java is a language I know.

$_ and File Handles
Example using $_ when reading in a file:
    while( <> ) {
        chomp $_;                 # remove the newline char
        @array = split / /, $_;   # split the line on white space
                                  # and store the data in an array
    }
Note:
  – The line read in from the file is automatically stored in the default scalar variable $_

$_ and File Handling cont..
Another example similar to the previous example:
    while(<>) {
        chomp;                # removes trailing newline chars
        @array = split / /;   # splits the line on white space
                              # and stores the data in the array
    }
Notes:
  – The functions chomp and split automatically perform their respective operations on $_.

Example Program
Count the number of words in a text and display the top 10 most frequent words.
    #!/usr/local/bin/perl
    %vocab = (); $counter = 0;
    while(<>) {
        chomp;
        foreach $element (split/ /) { $vocab{$element}++; }
    }
    foreach $word (sort {$vocab{$b}<=>$vocab{$a}} keys %vocab) {
        if($counter == 10) { last; }
        print "$word $vocab{$word}\n";
        $counter++;
    }
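
The counting and ranking steps can also be packaged as a subroutine, which makes the limit easy to change and the result easy to check. This is a sketch under a few assumptions: top_words is a hypothetical name, the input is passed in as a list of lines rather than read with <>, words are split on any run of white space, and ties are broken alphabetically so the output is predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns up to $limit "word count" strings,
# most frequent first, ties in alphabetical order.
sub top_words {
    my ($limit, @lines) = @_;
    my %vocab;
    for my $line (@lines) {
        # split ' ' splits on runs of white space and skips leading space
        $vocab{$_}++ for split ' ', $line;
    }
    my @ranked = sort { $vocab{$b} <=> $vocab{$a} || $a cmp $b } keys %vocab;
    $#ranked = $limit - 1 if @ranked > $limit;   # truncate to the top $limit
    return map { "$_ $vocab{$_}" } @ranked;
}

print "$_\n" for top_words(2, "the cat and the hat", "the hat");
# prints:
# the 3
# hat 2
```

Sorting keys %vocab, rather than the hash itself, ensures that only words are ranked and not their counts.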

Regular Expressions
What are regular expressions? A few definitions.
  – A regular expression specifies a class of strings; the languages that can be described this way are called regular languages
  – In other words, a formula for matching strings that follow a specified pattern.
Some things you can do with regular expressions
  – Parse the text
  – Add and/or replace subsections of text
  – Remove pieces of the text

Regular Expressions cont..
A regular expression characterizes a regular language
A familiar pattern-matching analogy in UNIX is the shell wildcard (strictly a glob rather than a regular expression):
  – ls *.c
      Lists all the files in the current directory whose names end in '.c'
  – ls *.txt
      Lists all the files in the current directory whose names end in '.txt'

Simple Example for ... ? Clarity
In the simplest form, a regular expression is a string of characters that you are looking for
We want to find all the words that contain the string 'ing' in our text.
The regular expression we would use :
    /ing/

Simple Example cont...
What would our program then look like:
    #!/usr/local/bin/perl
    while(<>) {
        chomp;
        @words = split / /, $_;
        foreach $word (@words) {
            if ($word =~ m/ing/) { print "$word\n"; }
        }
    }

Regular Expressions Types
Regular expressions are composed of two types of characters:
  – Literals
      Normal text characters
      Like what we saw in the previous program ( /ing/ )
  – Metacharacters
      Special characters
      Add a great deal of flexibility to your search
Metacharacters
Match more than just characters
Match line position
– ^   start of a line   ( caret )
– $   end of a line     ( dollar sign )
Match any character in a list : [ ... ]
Example :
– /[Bb]ridget/   matches Bridget or bridget
– /Mc[Ii]nnes/   matches McInnes or Mcinnes
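A small sketch of how the anchors and the bracketed list combine, assuming a few made-up input lines:

```perl
use strict;
use warnings;

# Sample lines (invented for this example)
my @lines = ('Bridget wrote this', 'written by bridget', 'a bridge too far');

# The name anywhere in the line, either capitalization
my @anywhere = grep { /[Bb]ridget/ } @lines;

# The name only at the start of the line
my @at_start = grep { /^[Bb]ridget/ } @lines;

# The name only at the end of the line
my @at_end = grep { /[Bb]ridget$/ } @lines;

print scalar(@anywhere), " ", scalar(@at_start), " ", scalar(@at_end), "\n";
```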
Our Simple Example Revisited
Now suppose we only want to match words that end in 'ing' rather than just contain 'ing'.
How would we change our regular expression to accomplish this:
– Previous Regular Expression:
  $word =~ m/ing/
– New Regular Expression:
  $word =~ m/ing$/
Ranges of Regular Expressions
Ranges can be specified in Regular Expressions
Valid Ranges
– [A-Z]      Upper Case Roman Alphabet
– [a-z]      Lower Case Roman Alphabet
– [A-Za-z]   Upper or Lower Case Roman Alphabet
– [A-F]      Upper Case A through F Roman Characters
Invalid or Misleading Ranges
– [a-Z]   Not Valid (a comes after Z in ASCII)
– [F-A]   Not Valid (the range runs backwards)
– [A-z]   Accepted by Perl, but it also matches the punctuation characters that sit between Z and a in ASCII, so use [A-Za-z] instead
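The sketch below tries a few ranges on single characters. It assumes ASCII ordering, where the underscore falls between Z and a. The backwards range is built from a string at run time so that eval can catch the error Perl raises for it.

```perl
use strict;
use warnings;

# Each test yields 1 for a match and 0 otherwise
my $upper      = ('Q' =~ /^[A-Z]$/) ? 1 : 0;   # Q is an upper case letter
my $hex_letter = ('C' =~ /^[A-F]$/) ? 1 : 0;   # C lies within A through F
my $too_far    = ('G' =~ /^[A-F]$/) ? 1 : 0;   # G lies outside A through F

# [A-z] compiles, but it also covers the characters between Z and a
my $underscore = ('_' =~ /^[A-z]$/) ? 1 : 0;

# A backwards range is rejected when the pattern is compiled
my $pattern  = '[F-A]';
my $compiled = eval { my $re = qr/$pattern/; 1 } ? 1 : 0;

print "$upper $hex_letter $too_far $underscore $compiled\n";
```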
Ranges cont.
Ranges of Digits can also be specified
– [0-9]   Valid
– [9-0]   Invalid
Negating Ranges
– /[^0-9]/
  Match anything except a digit
– /[^a]/
  Match anything except an a
– /^[^A-Z]/
  Match any line that starts with something other than a single upper case letter
  First ^ : start of line
  Second ^ : negation
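To make the two roles of ^ concrete, here is a short sketch over a handful of sample strings (the strings are assumptions for illustration). Inside the brackets ^ negates the list. At the front of the pattern it anchors to the start of the line.

```perl
use strict;
use warnings;

my @samples = ('42', 'a1', 'Perl', 'perl');

# Strings containing at least one character that is not a digit
my @has_non_digit = grep { /[^0-9]/ } @samples;

# Strings whose first character is not an upper case letter
my @no_upper_start = grep { /^[^A-Z]/ } @samples;

print "@has_non_digit\n";
print "@no_upper_start\n";
```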
Our Simple Example Again
Now suppose we want to create a list of all the words in our text that do not end in 'ing'
How would we change our regular expression to accomplish this:
– Previous Regular Expression:
  $word =~ m/ing$/
– New Regular Expression:
  $word !~ m/ing$/
– Note: a negated list such as [^ing]$ does not do this. It only tests the last character, so it rejects every word ending in i, n, or g.
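One way to select words that do not end in 'ing' is to negate the whole match with the !~ operator. The sketch below compares that with a negated character list on a made-up word list. The list form looks only at the final character, so it drops a word like 'dog' as well.

```perl
use strict;
use warnings;

my @words = qw(sing dog walking ring cat);

# Negated match: keep words that do NOT end in 'ing'
my @not_ing = grep { $_ !~ /ing$/ } @words;

# Negated list: keeps only words whose last character is not i, n, or g
my @last_char_test = grep { /[^ing]$/ } @words;

print "@not_ing\n";
print "@last_char_test\n";
```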
Literal Metacharacters
Suppose that you actually want to look for all strings that equal '^' in your text
– Use the \ symbol
– /\^/   Regular expression to search for
What does the following Regular Expression match?
/[A-Z^]\^/
– Matches any line that contains ( A-Z or ^ ) followed by ^
– Inside [ ], a ^ that is not the first character is a literal. Outside [ ], it must be escaped with \ to be a literal.
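A brief sketch of searching for a literal caret, using a made-up string. The backslash turns ^ into an ordinary character outside a bracketed list.

```perl
use strict;
use warnings;

# Sample text containing several carets (invented for this example)
my $text = 'x^2 + A^ + ^^';

# Every literal caret in the text
my @carets = $text =~ /\^/g;

# An upper case letter or a caret, followed by a literal caret
my @pairs = $text =~ /([A-Z^]\^)/g;

print scalar(@carets), "\n";
print "@pairs\n";
```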
Patterns provided in Perl
Some Patterns
– \d   [0-9]
– \w   [a-zA-Z0-9_]
– \s   [ \r\t\n\f]   (white space pattern)
– \D   [^0-9]
– \W   [^a-zA-Z0-9_]
– \S   [^ \r\t\n\f]
Example : /19\d\d/
– Looks for any year in the 1900's
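The year example can be tried out as follows. The sample sentence is invented, and the /g modifier is used here to collect every match instead of only the first.

```perl
use strict;
use warnings;

# Sample text with a tab in the middle (an assumption for illustration)
my $text = "Born in 1987,\tmoved in 2004 and 1999.";

# Every four digit number beginning with 19
my @years = $text =~ /(19\d\d)/g;

# \s+ treats the tab and the spaces alike when splitting
my @fields = split /\s+/, $text;

print "@years\n";
print scalar(@fields), "\n";
```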
Using Patterns in our Example
Commonly words are not separated by just a single space but by tabs, returns, etc.
Let's modify our split function to handle multiple white space characters
#!/usr/local/bin/perl
while (<>) {
    chomp;
    # Split on any run of white space, not just a single space
    @words = split /\s+/, $_;
    foreach $word (@words) {
        if ($word =~ m/ing/) { print "$word\n"; }
    }
}
Word Boundary Metacharacter
Regular Expression to match the start or the end of a 'word' : \b
Examples:
– /Jeff\b/     Match Jeff but not Jefferson
– /Carol\b/    Match Carol but not Caroline
– /Rollin\b/   Match Rollin but not Rolling
– /\bform/     Match form or formation but not information
– /\bform\b/   Match form but neither information nor formation
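The 'form' examples can be checked with a sketch like this one, which runs three boundary patterns over one made-up string. The \w* parts are added here only so that the whole word is captured.

```perl
use strict;
use warnings;

my $text = 'form formation information reform';

# Words that begin with 'form'
my @starts_with = $text =~ /\b(form\w*)/g;

# The word 'form' on its own
my @whole_word = $text =~ /\b(form)\b/g;

# Words that end with 'form'
my @ends_with = $text =~ /(\w*form)\b/g;

print "@starts_with\n";
print "@whole_word\n";
print "@ends_with\n";
```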
DOT Metacharacter
The DOT Metacharacter, '.', symbolizes any character except a new line
/b.bble/
– Would possibly return : bobble, babble, bubble
/.oat/
– Would possibly return : boat, coat, goat
Note: remember '.*' usually means a bunch of anything. This can be handy, but because it matches as much text as it can, it can also have hidden ramifications.
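The sketch below shows the dot on a small word list, and then one of the ramifications of '.*': it grabs as much as possible. The non-greedy form '.*?' is not covered in these slides. It is standard Perl and is shown here only for contrast. All sample data is invented.

```perl
use strict;
use warnings;

my @words = qw(bobble babble bubble bbble boat coat oat);

# The dot must match exactly one character
my @bble = grep { /^b.bble$/ } @words;
my @oat  = grep { /^.oat$/ }   @words;

my $markup = '<b>bold</b> and <i>italic</i>';

# Greedy: runs from the first '<' to the LAST '>'
my ($greedy) = $markup =~ /<(.*)>/;

# Non-greedy: stops at the first '>' it can
my ($lazy) = $markup =~ /<(.*?)>/;

print "@bble\n@oat\n$greedy\n$lazy\n";
```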
PIPE Metacharacter
The PIPE Metacharacter is used for alternation
/Bridget (Thomson|McInnes)/
– Match Bridget Thomson or Bridget McInnes, but NOT the full name Bridget Thomson McInnes (only its first two words match)
/B|bridget/
– Match B or bridget
/^(B|b)ridget/
– Match Bridget or bridget at the beginning of a line
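Because | splits everything on either side of it, the parentheses matter. A minimal sketch with made-up sample strings:

```perl
use strict;
use warnings;

my @samples = ('Bridget', 'bridget', 'Bob', 'midget');

# Without parentheses: a 'B' anywhere, or 'bridget' anywhere
my @ungrouped = grep { /B|bridget/ } @samples;

# With parentheses: only the first letter alternates
my @grouped = grep { /^(B|b)ridget/ } @samples;

# The parentheses also capture which alternative matched
my ($surname) = 'Bridget Thomson McInnes' =~ /Bridget (Thomson|McInnes)/;

print "@ungrouped\n@grouped\n$surname\n";
```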
Our Simple Example
Now with our example, suppose that we want to get not only all words that end in 'ing' but also those that end in 'ed'.
How would we change our regular expression to accomplish this:
– Previous Regular Expression:
  $word =~ m/ing$/
– New Regular Expression:
  $word =~ m/(ing|ed)$/
The ? Metacharacter
The metacharacter, ?, indicates that the character immediately preceding it occurs zero or one time
Examples:
– /worl?ds/
  Match either 'worlds' or 'words'
– /m?ethane/
  Match either 'methane' or 'ethane'
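A quick check of the two examples, using a made-up word list. The ^ and $ anchors are added so that each pattern has to cover the whole word.

```perl
use strict;
use warnings;

my @words = qw(words worlds worllds methane ethane);

# The 'l' may appear once or not at all
my @world_like = grep { /^worl?ds$/ } @words;

# The leading 'm' may appear once or not at all
my @ethane_like = grep { /^m?ethane$/ } @words;

print "@world_like\n@ethane_like\n";
```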
The * Metacharacter
The metacharacter, *, indicates that the character immediately preceding it occurs zero or more times
Example :
– /ab*c/   Match 'ac', 'abc', 'abbc', 'abbbc', etc.
– Matches an a, optionally followed by a sequence of b's, followed by a c.
Sometimes called the Kleene star
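The same kind of check for *, again with invented strings and with anchors added so the pattern must cover the whole string:

```perl
use strict;
use warnings;

my @strings = qw(ac abc abbbc adc abcb);

# An 'a', then any number of 'b' (including none), then a 'c'
my @matched = grep { /^ab*c$/ } @strings;

print "@matched\n";
```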
Our Simple Example again
Now suppose we want to create a list of all the words in our text that end in 'ing' or 'ings'
How would we change our regular expression to accomplish this:
– Previous Regular Expression:
  $word =~ m/ing$/
– New Regular Expression:
  $word =~ m/ings?$/
Modifying Text
Match
– Up to this point, we have only attempted to match a given regular expression
– Example : $variable =~ m/regex/
Substitution
– Takes match one step further : if there is a match, then replace it with the given string
– Example : $variable =~ s/regex/replacement/
  $var =~ s/Thomson/McInnes/;
  $var =~ s/Bridgette/Bridget/;
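A minimal sketch of substitution in action, with made-up strings. By default s/// replaces only the first match. The /g modifier, which is standard Perl, replaces every match, and s/// returns the number of replacements made.

```perl
use strict;
use warnings;

my $var = 'Bridgette Thomson';
$var =~ s/Bridgette/Bridget/;
$var =~ s/Thomson/McInnes/;

# Without /g only the first 'ing' is replaced
my $first_only = 'sing a song of spring';
$first_only =~ s/ing/ed/;

# With /g every 'ing' is replaced, and the count is returned
my $every = 'sing a song of spring';
my $count = ($every =~ s/ing/ed/g);

print "$var\n$first_only\n$every\n$count\n";
```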
Substitution Example
Suppose when we find all our words that end in 'ing' we want to replace the 'ing' with 'ed'.
#!/usr/local/bin/perl -w
while (<>) {
    chomp $_;
    @words = split /\s+/, $_;
    foreach $word (@words) {
        # s/// is true only when a replacement was made
        if ($word =~ s/ing$/ed/) { print "$word\n"; }
    }
}
Special Variables Modified by a Match
$&
– A copy of the text matched by the regex
$`
– A copy of the target text in front of the match
$'
– A copy of the target text after the match
$1, $2, $3, etc.
– The text matched by the 1st, 2nd, etc., set of parentheses. Note : $0 is not included here
$+
– A copy of the highest numbered $1, $2, $3, etc. that took part in the match
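To see what each variable holds after a successful match, a sketch like the following may help. The sample string and the lexical variable names are assumptions for illustration. In Perl, $` holds the text before the match and $' holds the text after it.

```perl
use strict;
use warnings;

my $text = 'Bridget Thomson McInnes';
my ($before, $matched, $after, $first, $second, $last_group);

if ($text =~ /(Thom)(son)/) {
    $before     = $`;   # text in front of the match
    $matched    = $&;   # text the regex matched
    $after      = $';   # text after the match
    $first      = $1;   # first set of parentheses
    $second     = $2;   # second set of parentheses
    $last_group = $+;   # highest numbered group that matched
}

print "[$before][$matched][$after][$first][$second][$last_group]\n";
```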
Our Simple Example once again
Now let's revise our program to find words that end in 'ing' without splitting our line of text into an array of words
#!/usr/local/bin/perl -w
while (<>) {
    chomp $_;
    # Prints the first word ending in 'ing' on each line
    if ($_ =~ /([A-Za-z]*ing\b)/) { print "$&\n"; }
}
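A single match reports one word per line. If every such word on a line is wanted, one possible approach is to match in list context with the /g modifier, which returns all the captured pieces at once. The sample line below is made up.

```perl
use strict;
use warnings;

my $line = 'singing and dancing, then resting';

# In list context, /g returns every captured match on the line
my @ing_words = $line =~ /([A-Za-z]*ing)\b/g;

print "$_\n" for @ing_words;
```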
Thank you