Introduction to Perl Part II By: Cédric Notredame (Adapted from BT McInnes)

Post on 15-Dec-2015

Introduction to Perl

Part II

By: Cédric Notredame (Adapted from BT McInnes)

Passing Arguments To Your Program

Command Line Arguments

Command line arguments in Perl are extremely easy. @ARGV is the array that holds all arguments passed in from the command line.

Example:

    ./prog.pl arg1 arg2 arg3

@ARGV would contain ('arg1', 'arg2', 'arg3').

$#ARGV returns the index of the last command line argument, which is one less than the number of arguments. Remember: $#array is the last index of the array, not its size; scalar(@array) gives the size.
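A minimal sketch of the difference between the element count and the last index. The helper name describe_args is hypothetical and exists only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: summarise an argument list such as @ARGV.
sub describe_args {
    my @args       = @_;
    my $count      = scalar @args;   # number of elements
    my $last_index = $#args;         # index of the last element (-1 if empty)
    return "count=$count last_index=$last_index";
}

print describe_args(@ARGV), "\n";

# Walk the real command line arguments by index.
foreach my $i (0 .. $#ARGV) {
    print "ARGV[$i] = $ARGV[$i]\n";
}
```

Run as ./prog.pl arg1 arg2 arg3, the first line printed should report a count of 3 and a last index of 2.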

Reading/Writing Files

File Handles

Opening a file:

    open (SRC, "my_file.txt");

Reading from a file:

    $line = <SRC>;   # reads up to and including a newline character

Closing a file:

    close (SRC);

File Handles cont...

Opening a file for output:

    open (DST, ">my_file.txt");

Opening a file for appending:

    open (DST, ">>my_file.txt");

Writing to a file:

    print DST "Printing my first line.\n";

Safeguarding against opening a nonexistent file:

    open (SRC, "file.txt") || die "Could not open file.\n";
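The same operations can also be written with the three-argument form of open and lexical file handles, a style commonly recommended in current Perl. The sketch below is an alternative to the slides' style, not the slides' own code; the file name is hypothetical, and $! holds the system error message when open fails.

```perl
use strict;
use warnings;

# Hypothetical file name, used only for this sketch.
my $path = "open_demo_$$.txt";

# Write mode ('>') creates or truncates the file.
open(my $out, '>', $path) or die "Could not open $path for writing: $!\n";
print $out "Printing my first line.\n";
close($out);

# Append mode ('>>') adds to the end of the file.
open(my $app, '>>', $path) or die "Could not open $path for appending: $!\n";
print $app "Printing my second line.\n";
close($app);

# Read mode ('<'); in list context <$in> returns all remaining lines.
open(my $in, '<', $path) or die "Could not open $path for reading: $!\n";
my @lines = <$in>;
close($in);

unlink $path;    # remove the demo file
print scalar(@lines), " lines read\n";
```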

File Test Operators

Check to see if a file exists:

    if ( -e "file.txt" ) {
        # The file exists!
    }

Other file test operators:

    -r   readable
    -x   executable
    -d   is a directory
    -T   is a text file

Quick Program with File Handles

Program to copy a file to a destination file:

    #!/usr/bin/perl -w

    open(SRC, "file.txt") || die "Could not open source file.\n";
    open(DST, ">newfile.txt") || die "Could not open destination file.\n";

    # Copy the source to the destination one line at a time.
    while ( $line = <SRC> )
    {
        print DST $line;
    }

    close SRC;
    close DST;

Some Default File Handles

STDIN : Standard input

    $line = <STDIN>;   # takes input from stdin

STDOUT : Standard output

    print STDOUT "File handling in Perl is sweet!\n";

STDERR : Standard error

    print STDERR "Error!!\n";

The <> File Handle

The "empty" file handle takes the command line file(s) or STDIN:

    $line = <>;

If the program is run as ./prog.pl file.txt, this will automatically open file.txt and read the first line.

If the program is run as ./prog.pl file1.txt file2.txt, this will first read file1.txt and then file2.txt. The lines arrive as one continuous stream, so you will not see where one file ends and the other begins.

The <> File Handle cont...

If the program is run as ./prog.pl with no arguments, it will wait for you to enter text at the prompt, and will continue until you enter the EOF character (CTRL-D in UNIX).
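When the boundary between files matters, Perl does offer a way to detect it: while reading through <>, the variable $ARGV holds the name of the current file, and eof (without parentheses) is true at the end of each file. A minimal sketch follows; to stay self-contained it creates two hypothetical files and assigns their names to @ARGV, as if they had been given on the command line.

```perl
use strict;
use warnings;

# Create two hypothetical demo files and pretend they were the arguments.
my @demo    = ("diamond_a_$$.txt", "diamond_b_$$.txt");
my %content = ($demo[0] => "1\n2\n", $demo[1] => "a\n");
foreach my $file (@demo) {
    open(my $fh, '>', $file) or die "Could not create $file: $!\n";
    print $fh $content{$file};
    close($fh);
}
@ARGV = @demo;

my @report;
while (my $line = <>) {
    chomp $line;
    push @report, "$ARGV: $line";                # $ARGV is the current file name
    push @report, "-- end of $ARGV --" if eof;   # true at the end of each file
}

unlink @demo;    # remove the demo files
print "$_\n" for @report;
```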

Example Program with STDIN

Suppose you want to determine if you are one of the Three Stooges:

    #!/usr/local/bin/perl

    %stooges = (larry => 1, moe => 1, curly => 1);

    print "Enter your name: ? ";
    $name = <STDIN>;
    chomp $name;

    if ($stooges{ lc($name) }) {
        print "You are one of the Three Stooges!!\n";
    } else {
        print "Sorry, you are not a Stooge!!\n";
    }

Combining File Content

Given the two following files:

    File1.txt
    1
    2
    3

and

    File2.txt
    a
    b
    c

write a program that takes the two files as arguments and outputs a third file that looks like:

    File3.txt
    1
    a
    2
    b
    3
    c

Tip: ./mix_files File1.txt File2.txt File3.txt

Combining File Content

    #!/usr/bin/perl

    open (F, "$ARGV[0]")  || die "Could not open $ARGV[0].\n";
    open (G, "$ARGV[1]")  || die "Could not open $ARGV[1].\n";
    open (H, ">$ARGV[2]") || die "Could not open $ARGV[2].\n";

    # Read one line from each file in turn until either file runs out.
    while ( defined($l1 = <F>) && defined($l2 = <G>) )
    {
        print H "$l1$l2";
    }

    close (F); close (G); close (H);

Chomp and Chop

Chomp : function that deletes a trailing newline from the end of a string.

    $line = "this is the first line of text\n";
    chomp $line;     # removes the newline character
    print $line;     # prints "this is the first line of text"
                     # without a line return

Chop : function that chops off the last character of a string.

    $line = "this is the first line of text";
    chop $line;
    print $line;     # prints "this is the first line of tex"
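A small side-by-side sketch of the two functions. It also shows why chomp is the safer choice for cleaning input: when there is no trailing newline it leaves the string alone, whereas chop always removes a character. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $with_newline = "first line of text\n";
my $without      = "first line of text";

my $chomped = $with_newline;  chomp $chomped;   # newline removed
my $safe    = $without;       chomp $safe;      # nothing to remove, unchanged
my $chopped = $without;       chop  $chopped;   # last character removed

print "[$chomped] [$safe] [$chopped]\n";
```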

Matching Regular Expressions

Regular Expressions

What are regular expressions? A few definitions:

    A regular expression specifies a class of strings; the set of strings it describes is a formal (regular) language.
    In other words, it is a formula for matching strings that follow a specified pattern.

Some things you can do with regular expressions:

    Parse the text
    Add and/or replace subsections of text
    Remove pieces of the text

Regular Expressions cont..

A regular expression characterizes a regular language.

Familiar examples of pattern matching in UNIX (shell wildcards are a simpler notation than regular expressions, but the idea is the same):

    ls *.c
        Lists all the files in the current directory whose names end in '.c'

    ls *.txt
        Lists all the files in the current directory whose names end in '.txt'

Simple Example for Clarity

In its simplest form, a regular expression is a string of characters that you are looking for.

We want to find all the words that contain the string 'ing' in our text.

The regular expression we would use:

    /ing/

The Match Operator

What would our program then look like?

    if ($word =~ m/ing/) { print "$word\n"; }

Exercise:

Download any text you wish from the internet and count all the words it contains that include 'ing'.

    wget "http://www.trinity.edu/~mkearl/family.html"

Exercise:

    #!/usr/local/bin/perl

    while (<>)
    {
        chomp;
        @words = split / /;
        foreach $word (@words)
        {
            if ($word =~ m/ing/) { print "$word\n"; $ing++; }
        }
    }

    print "$ing Words in ing\n";

Regular Expressions Types

Regular expressions are composed of two types of characters:

    Literals
        Normal text characters
        Like what we saw in the previous program ( /ing/ )

    Metacharacters
        Special characters
        Add a great deal of flexibility to your search

Metacharacters

Match more than just characters.

Match line position:

    ^   start of a line ( caret )
    $   end of a line ( dollar sign )

Match any one of the characters in a list: [ ... ]

Examples:

    /[Bb]ridget/   matches Bridget or bridget
    /Mc[Ii]nnes/   matches McInnes or Mcinnes
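A short sketch combining the position anchors with a character list. The sample lines are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample lines.
my @lines = ("Bridget sings", "ask bridget", "Fridget");

my @any   = grep { /[Bb]ridget/  } @lines;   # either spelling, anywhere
my @start = grep { /^[Bb]ridget/ } @lines;   # only at the start of the line
my @end   = grep { /[Bb]ridget$/ } @lines;   # only at the end of the line

print "anywhere: @any\n";
print "at start: @start\n";
print "at end:   @end\n";
```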

Our Simple Example Revisited

Now suppose we only want to match words that end in 'ing' rather than just contain 'ing'.

How would we change our regular expression to accomplish this?

Previous regular expression:

    $word =~ m/ing/

New regular expression:

    $word =~ m/ing$/

Ranges of Regular Expressions

Ranges can be specified in regular expressions.

Valid ranges:

    [A-Z]      Upper case Roman alphabet
    [a-z]      Lower case Roman alphabet
    [A-Za-z]   Upper or lower case Roman alphabet
    [A-F]      Upper case A through F
    [A-z]      Valid, but be careful: in ASCII it also matches the characters [ \ ] ^ _ ` that sit between Z and a

Invalid ranges:

    [a-Z]      Not valid
    [F-A]      Not valid

Ranges cont ...

Ranges of digits can also be specified:

    [0-9]   Valid
    [9-0]   Invalid

Negating ranges:

    /[^0-9]/     Match any character except a digit
    /[^a]/       Match any character except an a
    /^[^A-Z]/    Match any line whose first character is not an upper case letter
                 First ^ : start of line
                 Second ^ : negation
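The sketch below tries out a few ranges and negated ranges. It also illustrates why [A-z] deserves care: in ASCII the underscore lies between 'Z' and 'a', so it falls inside that range. The sample strings are made up.

```perl
use strict;
use warnings;

# '_' is matched by [A-z] but not by [A-Za-z].
my $underscore_in_Az   = ("_" =~ /[A-z]/)    ? 1 : 0;
my $underscore_in_AZaz = ("_" =~ /[A-Za-z]/) ? 1 : 0;

# Lines whose first character is not an upper case letter.
my @no_upper_start = grep { /^[^A-Z]/ } ("Perl", "perl", "5 camels");

# Strings containing at least one non-digit character.
my @has_non_digit = grep { /[^0-9]/ } ("2015", "15-Dec");

print "[A-z] matches '_': $underscore_in_Az\n";
print "[A-Za-z] matches '_': $underscore_in_AZaz\n";
print "no upper case start: @no_upper_start\n";
print "has a non-digit: @has_non_digit\n";
```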

Our Simple Example Again

Now suppose we want to create a list of all the words in our text that do not end in 'ing'.

How would we change our regular expression to accomplish this?

Previous regular expression:

    $word =~ m/ing$/

New regular expression:

    !($word =~ m/(ing)$/)

The same test can also be written with the negated match operator: $word !~ m/ing$/

Matching Interrogations

Each pattern captures, in $1, a run of text that ends in a question mark:

    $string =~ /([^.?]+\?)/
    $string =~ /[.?]([A-Z0-9][^.?]+\?)/
    $string =~ /([\w\s]+\?)/
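A minimal sketch of the first pattern at work. The sample sentence is invented; note that the captured text keeps the space that follows the previous full stop, because [^.?] accepts any character other than '.' and '?'.

```perl
use strict;
use warnings;

# Hypothetical sample text for this sketch.
my $string = "It rained all day. Did you bring an umbrella? I did not.";

my $question;
if ($string =~ /([^.?]+\?)/) {
    $question = $1;    # includes the leading space after the full stop
}
print "[$question]\n";
```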

Removing HTML Tags

    $string =~ s/\<[^>]+\>/ /g

g : substitute EVERY instance
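A sketch of the substitution applied to a small, made-up HTML fragment, followed by an optional tidy-up of the white space left behind. This is a quick text-cleaning trick rather than a real HTML parser: it can be fooled, for example, by a '>' inside an attribute value.

```perl
use strict;
use warnings;

# Hypothetical HTML fragment.
my $html = "<p>Perl is <b>fun</b></p>";

my $text    = $html;
my $removed = ($text =~ s/<[^>]+>/ /g);   # s///g returns the number of replacements

(my $tidy = $text) =~ s/\s+/ /g;          # collapse runs of white space
$tidy =~ s/^\s|\s$//g;                    # trim both ends

print "$removed tags replaced: [$tidy]\n";
```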

Literal Metacharacters

Suppose that you actually want to look for all strings that contain '$' in your text.

Use the \ symbol:

    /\$/    Regular expression to search for a literal $

What do the following regular expressions match?

    /[ABCDEFGHIJKLMNOP$]\$/
    /[A-P$]\$/

Both match any line that contains ( A-P or $ ) followed by $.

Patterns provided in Perl

Some patterns:

    \d   [0-9]
    \w   [a-zA-Z0-9_]
    \s   [ \r\t\n\f]        (white space pattern)
    \D   [^0-9]
    \W   [^a-zA-Z0-9_]
    \S   [^ \r\t\n\f]

Example: ( 19\d\d ) looks for any year in the 1900's.
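A sketch of the year pattern in use. In list context a match with /g returns every captured value. The \b word boundary on each side is an addition to the slide's pattern, used here so that the '1987' inside a longer number is not picked up; the sample text is made up.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $text = "Dates seen: 1987, 1994, 2015, and the serial number 21987.";

# Collect every four-digit number that starts with 19.
my @years = $text =~ /\b(19\d\d)\b/g;

print "Years in the 1900's: @years\n";
```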

Using Patterns in our Example

Commonly, words are not separated by just a single space but by tabs, returns, etc.

Let's modify our split function to handle multiple white space characters:

    #!/usr/local/bin/perl

    while (<>) {
        chomp;
        @words = split /\s+/, $_;
        foreach $word (@words) {
            if ($word =~ m/ing$/) { print "$word\n"; }
        }
    }

Word Boundary Metacharacter

Regular expression to match the start or the end of a 'word' : \b

Examples:

    /Jeff\b/     Match Jeff but not Jefferson
    /Carol\b/    Match Carol but not Caroline
    /Rollin\b/   Match Rollin but not Rolling
    /\bform/     Match form or formation but not information
    /\bform\b/   Match form but neither information nor formation
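The examples above can be checked with a short sketch. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @words = qw(form formation information);

my @starts = grep { /\bform/   } @words;   # 'form' at the start of a word
my @whole  = grep { /\bform\b/ } @words;   # 'form' as a complete word

my $jeff      = ("Jeff"      =~ /Jeff\b/) ? 1 : 0;
my $jefferson = ("Jefferson" =~ /Jeff\b/) ? 1 : 0;

print "starts with form: @starts\n";
print "exactly form: @whole\n";
print "Jeff: $jeff, Jefferson: $jefferson\n";
```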

DOT Metacharacter

The DOT metacharacter, '.', matches any single character except a newline.

    /b.bble/   Would possibly return: bobble, babble, bubble
    /.oat/     Would possibly return: boat, coat, goat

Note: remember that '.*' usually means a bunch of anything. This can be handy but can also have hidden ramifications.

PIPE Metacharacter

The PIPE metacharacter, '|', is used for alternation.

    /Bridget (Thomson|McInnes)/
        Match Bridget Thomson or Bridget McInnes but NOT Bridget Thomson McInnes

    /B|bridget/
        Match B or bridget

    /^(B|b)ridget/
        Match Bridget or bridget at the beginning of a line
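A sketch of why the parentheses matter. Without them the alternation splits the whole pattern, so /B|bridget/ accepts any string containing a capital B. The test strings are made up.

```perl
use strict;
use warnings;

# Parentheses limit the alternation and capture the chosen alternative.
my $surname;
$surname = $1 if "Bridget McInnes" =~ /Bridget (Thomson|McInnes)/;

# Ungrouped: 'B' alone is one complete alternative.
my $ungrouped = ("Bob" =~ /B|bridget/)    ? 1 : 0;

# Grouped: only the first letter alternates.
my $grouped   = ("Bob" =~ /^(B|b)ridget/) ? 1 : 0;

print "surname: $surname\n";
print "ungrouped matches Bob: $ungrouped, grouped matches Bob: $grouped\n";
```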

Our Simple Example

Now with our example, suppose that we want to get not only all words that end in 'ing' but also those that end in 'ed'.

How would we change our regular expression to accomplish this?

Previous regular expression:

    $word =~ m/ing$/

New regular expression:

    $word =~ m/(ing|ed)$/

The ? Metacharacter

The metacharacter ? indicates that the character immediately preceding it occurs zero or one time.

Examples:

    /worl?ds/     Match either 'worlds' or 'words'
    /m?ethane/    Match either 'methane' or 'ethane'

The * Metacharacter

The metacharacter * indicates that the character immediately preceding it occurs zero or more times.

Example:

    /ab*c/    Match 'ac', 'abc', 'abbc', 'abbbc', etc.

Matches any string containing an a, optionally followed by a sequence of b's, and then a c.

Sometimes called the Kleene star.
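A quick sketch of both quantifiers. The patterns are anchored with ^ and $ here (an addition to the slide examples) so that each test word must match in full; the word lists are made up.

```perl
use strict;
use warnings;

# '?' : the preceding character is optional.
my @optional = grep { /^worl?ds$/ } qw(worlds words wolds);

# '*' : the preceding character may repeat any number of times, including zero.
my @repeated = grep { /^ab*c$/ } qw(ac abc abbbc abd adc);

print "worl?ds accepts: @optional\n";
print "ab*c accepts: @repeated\n";
```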

Our Simple Example again

Now suppose we want to create a list of all the words in our text that end in 'ing' or 'ings'.

How would we change our regular expression to accomplish this?

Previous regular expression:

    $word =~ m/ing$/

New regular expression:

    $word =~ m/ings?$/

Exercise

For each of the strings 1-5, say which of the patterns 1-12 it matches. Where there is a match, what would be the values of $MATCH, $1, $2, etc.?

Strings:

    1) the quick brown fox jumped over the lazy dog
    2) The Sea! The Sea!
    3) (.+)\s*\1
    4) 9780471975632
    5) C:\DOS\PATH\NAME

Patterns:

    1)  /[a-z]/
    2)  /(\W+)/
    3)  /\W*/
    4)  /^\w+$/
    5)  /[^\w+$]/
    6)  /\d/
    7)  /(.+)\s*\1/
    8)  /((.+)\s*\1)/
    9)  /(.+)\s*((\1))/
    10) /\DOS/
    11) /\\DOS/
    12) /\\\DOS/

Exercise

For each of the strings 1-5, say which of the patterns 1-12 it matches. Where there is a match, what would be the values of $MATCH, $1, $2, etc.?

Strings, with the patterns each one matches:

    1) the quick brown fox jumped over the lazy dog    1, 2, 3, 5
    2) The Sea! The Sea!                               1, 2, 3, 5, 7, 9
    3) (.+)\s*\1                                       1, 2, 3, 5, 6
    4) 9780471975632                                   3, 4, 6
    5) C:\DOS\PATH\NAME                                2, 3, 5, 10, 11, 12

Patterns, with the strings each one matches:

    1)  /[a-z]/             1, 2, 3
    2)  /(\W+)/             1, 2, 3, 5
    3)  /\W*/               1, 2, 3, 4, 5 (it can match the empty string)
    4)  /^\w+$/             4
    5)  /[^\w+$]/           1, 2, 3, 5
    6)  /\d/                3, 4
    7)  /(.+)\s*\1/         2
    8)  /((.+)\s*\1)/       (none)
    9)  /(.+)\s*((\1))/     2
    10) /\DOS/              5
    11) /\\DOS/             5
    12) /\\\DOS/            5

Modifying Text With Regular Expressions

Modifying Text

Match
    Up to this point, we have only attempted to match a given regular expression.
    Example: $variable =~ m/regex/

Substitution
    Takes matching one step further: if there is a match, then replace it with the given string.
    Example: $variable =~ s/regex/replacement/

    $var =~ s/Cedric/Notredame/g;
    $var =~ s/ing/ed/;

Substitution Example

Suppose that when we find words that end in 'ing' we want to replace the 'ing' with 'ed'.

    #!/usr/local/bin/perl -w

    while (<>) {
        chomp $_;
        @words = split /\s+/, $_;
        foreach $word (@words) {
            if ($word =~ s/ing$/ed/) { print "$word\n"; }
        }
    }
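The if test in the program works because s/// returns the number of substitutions it made, which is false when nothing matched. A small sketch of that return value, and of the difference the g modifier makes, on a made-up string:

```perl
use strict;
use warnings;

my $first_only = "singing ring";
my $all        = "singing ring";

my $n_first = ($first_only =~ s/ing/ed/);    # replaces the first match only
my $n_all   = ($all        =~ s/ing/ed/g);   # replaces every match

print "$n_first substitution: $first_only\n";
print "$n_all substitutions: $all\n";
```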

Special Variables Modified by a Match

    $target = "I have 25 apples";
    $target =~ /(\d+)/;

$&    => "25"
    A copy of the text matched by the regex

$`    => "I have "
    A copy of the target text before the match

$'    => " apples"
    A copy of the target text after the match

$1, $2, $3, etc.    $1 => "25"
    The text matched by the 1st, 2nd, etc., set of parentheses. Note: $0 is not included here.

$+
    A copy of the highest-numbered $1, $2, $3, etc. that was set by the match
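A runnable sketch of these variables using the same target string. The values are copied into ordinary variables straight after the match, since the special variables are overwritten by the next successful match. The variable names are illustrative.

```perl
use strict;
use warnings;

my $target = "I have 25 apples";

my ($before, $matched, $after, $first_group);
if ($target =~ /(\d+)/) {
    $before      = $`;   # text before the match
    $matched     = $&;   # text matched by the whole regex
    $after       = $';   # text after the match
    $first_group = $1;   # text matched by the first set of parentheses
}

print "before=[$before] matched=[$matched] after=[$after] \$1=[$first_group]\n";
```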

Our Simple Example once again

Now let's revise our program to find all the words that end in 'ing' without splitting our line of text into an array of words:

    #!/usr/local/bin/perl -w

    while (<>) {
        chomp $_;
        # With /g, each pass of the inner loop finds the next match on the line.
        while ($_ =~ /([A-Za-z]*ing\b)/g) { print "$&\n"; }
    }

Example

    #!/usr/local/bin/perl

    $exp = <STDIN>;
    chomp $exp;

    if ($exp =~ /^([A-Za-z+\s]*)\bcrave\b([\sA-Za-z]+)/)
    {
        print "$1\n";
        print "$2\n";
    }

Run the program with the string: I crave to rule the world!

Results:

    "I "
    " to rule the world"

The final '!' is not part of $2 because the second character class contains only letters and white space.

Example

    #!/usr/local/bin/perl

    $exp = <STDIN>;
    chomp $exp;

    if ($exp =~ /\bcrave\b/)
    {
        print "$`\n";
        print "$&\n";
        print "$'\n";
    }

Run the program with the string: I crave to rule the world!

Results:

    "I "
    "crave"
    " to rule the world!"

Thank you