Perl Part 3: (1) Subroutines, (2) Pattern Matching and Regular Expressions
Post on 20-Dec-2015
(1) Subroutines
• Subroutines provide a way for programmers to group a set of statements, set them aside, and turn them into mini-programs within a larger program.
• These mini-programs can be executed several times from different places in the overall program.
Working with Subroutines
• You can create a subroutine by placing a group of statements into the following format:
sub subroutine_name {
    set of statements
}
• For example, an outputTableRow subroutine:
sub outputTableRow {
    print '<TR><TD>One</TD><TD>Two</TD></TR>';
}
• Execute the statements in this subroutine by preceding its name with an ampersand:
&outputTableRow;
Subroutine Example Program
#!/usr/bin/perl
use CGI ':standard';
print header, start_html( 'Subroutine Example' );
print 'Here is simple table <TABLE BORDER=1>';
&outputTableRow;
&outputTableRow;
&outputTableRow;
print '</TABLE>', end_html;

sub outputTableRow {
    print '<TR><TD>One</TD><TD>Two</TD></TR>';
}
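The example program always prints the same row. As a minimal sketch that goes slightly beyond the slides, a subroutine can also receive arguments through the special array @_ and return a value. The name table_row and its arguments are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds one HTML table row from a list of cell values.
# Arguments passed by the caller arrive in the special array @_.
sub table_row {
    my @cells = @_;
    my $row   = '<TR>';
    foreach my $cell (@cells) {
        $row .= "<TD>$cell</TD>";
    }
    return $row . '</TR>';
}

# The same subroutine is called several times with different arguments.
print table_row( 'One',   'Two' ),  "\n";
print table_row( 'Three', 'Four' ), "\n";
```

Returning the string instead of printing it inside the subroutine lets the caller decide where the row ends up.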
(2) Pattern matching and regular expressions
• Use Perl pattern matching and regular expressions to filter input data
• Work with files to enable a program to store and retrieve data
Patterns in String Variables
• Many programming problems require matching, changing, or manipulating patterns in string variables.
– An important use is verifying the input fields of a form, which helps provide security against accidental or malicious attacks.
• For example, if expecting a form field to provide a telephone number as input, your program needs a way to verify that the input comprises a string of seven digits.
Four Different Constructs
– The match operator enables your program to look for patterns in strings.
– The substitute operator enables your program to change patterns in strings.
– The split function enables your program to split strings into separate variables based on a pattern.
– Regular expressions provide a pattern matching language that can work with these operators and functions to work on string variables.
The Match Operator
• The match operator tests whether a pattern appears in a string.
– It is used with the binding operator (=~) to see whether a variable contains a particular pattern.
if ( $name =~ m/edu/ ) {
    set of statements to execute
}
• The binding operator (=~) tells Perl to examine the contents of $name.
• The match operator tries to match the pattern between the slashes (/); in this case, the pattern is edu.
• The statements execute if edu appears ANYWHERE in the contents of the string variable $name.
Other Delimiters?
• The slash (/) is the most common match pattern delimiter.
– Others are possible. For example, both of the following use valid match operator syntax:
– if ( $name =~ m!Dave! ) {
– if ( $name =~ m<Dave> ) {
• The negated binding operator (!~) tests whether a pattern is NOT found:
if ( $color !~ m/blue/ ) {
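A short sketch of the two binding operators in use. The subroutine names contains_edu and lacks_blue, and the sample strings, are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 if the pattern edu appears anywhere in $text, otherwise 0.
sub contains_edu {
    my ($text) = @_;
    return ( $text =~ m/edu/ ) ? 1 : 0;
}

# Returns 1 if the pattern blue is NOT found in $text, otherwise 0.
sub lacks_blue {
    my ($text) = @_;
    return ( $text !~ m/blue/ ) ? 1 : 0;
}

foreach my $name ( 'www.school.edu', 'education', 'www.mysite.com' ) {
    print "$name: ", ( contains_edu($name) ? 'match' : 'no match' ), "\n";
}
foreach my $color ( 'red', 'light blue' ) {
    print "$color: ", ( lacks_blue($color) ? 'no blue found' : 'blue found' ), "\n";
}
```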
The Substitute Operator
• The substitute operator replaces the first occurrence of the search pattern with the change pattern in the string variable.
• For example, the following changes the first occurrence of t to T:
$name = "tom turtle";
$name =~ s/t/T/;
print "Name=$name";
• The output of this code would be
Name=Tom turtle
Changing All Occurrences
• You can place a g (for global substitution) at the end of the substitution expression to change all occurrences of the target pattern in the search string. For example,
– $name = "tom turtle";
– $name =~ s/t/T/g;
– print "Name=$name";
• The output of this code would be
– Name=Tom TurTle
Using Translate
• A similar operator is tr (for "translate"). It is useful for translating characters from uppercase to lowercase, and vice versa.
– The tr operator lets you specify a range of characters to translate from and a range of characters to translate to. The ranges are written without square brackets, because tr treats brackets as literal characters:
$name = "smokeY";
$name =~ tr/a-z/A-Z/;
print "Name=$name";
• This would output the following:
Name=SMOKEY
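The three operations can be compared side by side. This is a minimal sketch; the subroutine names replace_first, replace_all, and to_upper are hypothetical wrappers around s/t/T/, s/t/T/g, and tr/a-z/A-Z/.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace only the first lowercase t with T.
sub replace_first {
    my ($text) = @_;
    $text =~ s/t/T/;
    return $text;
}

# Replace every lowercase t with T (g = global).
sub replace_all {
    my ($text) = @_;
    $text =~ s/t/T/g;
    return $text;
}

# Translate each lowercase letter to its uppercase counterpart.
sub to_upper {
    my ($text) = @_;
    $text =~ tr/a-z/A-Z/;
    return $text;
}

print replace_first('tom turtle'), "\n";    # Tom turtle
print replace_all('tom turtle'),   "\n";    # Tom TurTle
print to_upper('smokeY'),          "\n";    # SMOKEY
```

Each subroutine works on a copy of its argument, so the caller's string is left unchanged.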
The Alternation Operator
• The alternation operator (|) looks for alternative strings within a pattern.
– That is, you use it to indicate that the program should match one pattern OR the other. For example, the pattern /com|edu/ matches any string that contains either com or edu.
Parentheses For Groupings
• You use parentheses within regular expressions to specify groupings. For example, the following matches a $name value of Dave or David.
• Match Statement:
if ( $name =~ /Dav(e|id)/ ) {
    print "$name came home from school\n";
}
Example Alternation Operator
• Match statement:
if ( $address =~ /com|edu/ ) {
• Possible matching string values for $address:
"www.mysite.com", "Welcome to my site", "Time for education", "www.mysite.edu"
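The alternation and grouping patterns can be tried against a few values. In this sketch the subroutine names and the non-matching sample strings ("www.mysite.org", "Davy") are assumptions added for contrast.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when $address contains com OR edu anywhere in the string.
sub has_com_or_edu {
    my ($address) = @_;
    return ( $address =~ /com|edu/ ) ? 1 : 0;
}

# True when $name contains Dave or David (Dav followed by e or id).
sub is_dave_or_david {
    my ($name) = @_;
    return ( $name =~ /Dav(e|id)/ ) ? 1 : 0;
}

foreach my $address ( 'www.mysite.com', 'Welcome to my site', 'www.mysite.org' ) {
    print "$address: ", ( has_com_or_edu($address) ? 'match' : 'no match' ), "\n";
}
foreach my $name ( 'Dave', 'David', 'Davy' ) {
    print "$name: ", ( is_dave_or_david($name) ? 'match' : 'no match' ), "\n";
}
```

Note that "Welcome to my site" matches only because the letters com happen to appear inside "Welcome", which is why anchoring a pattern often matters.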
Using Regular Expressions
• Regular expressions enable programs to match patterns more precisely.
– They make up a small language of special matching operators that can be employed to enhance Perl string pattern matching.
Special Character Classes
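• Perl has a special set of character classes for shorthand pattern matching.
• For example, consider these two statements. The first matches only a literal space character; the second matches any whitespace character (space, tab, newline, return, or formfeed):
if ( $name =~ m/ / ) {
if ( $name =~ m/\s/ ) {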
Special Character Classes
Character Class    Meaning
\s    Matches a single whitespace character (space, tab, newline, return, or formfeed). For example, the following matches "Apple Core", "Alle y", and "Here you go"; it does not match "Alone":
      if ( $name =~ m/e\s/ ) {
\S    Matches any single character that is not whitespace (not a space, tab, newline, return, or formfeed). For example, the following matches "ZT", "YT", and ";T":
      if ( $part =~ m/\ST/ ) {
Special Character Classes - II
Character Class    Meaning
\w    Matches any word character (uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "Apple", "Time", "Part time", "time_to_go", " Time", and "1234"; it does not match "#%^&":
      if ( $part =~ m/\w/ ) {
\W    Matches any nonword character (not uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "A*B" and "A{B", but not "A**B", "AB*", "AB101", or "1234":
      if ( $part =~ m/A\WB/ ) {
Special Character Classes - III
Character Class    Meaning
\d    Matches any valid numerical digit (that is, any number 0–9). For example, the following matches "B12abc", "The B1 product is late", "I won bingo with a B9", and "Product B00121"; it does not match "B 0", "Product BX 111", or "Be late 1":
      if ( $part =~ m/B\d/ ) {
\D    Matches any non-numerical character (that is, any character not a digit 0–9). For example, the following matches "AB1234", "Product number 1111", "Number VG928321212", "The number_A1234", and "Product 1212"; it does not match "1212" or "PR12":
      if ( $part =~ m/\D\D\d\d\d\d/ ) {
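A small sketch that exercises the character classes. The subroutine name has_part_number and the tab-separated test string are assumptions; the pattern and the other sample strings come from the table.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when $part contains two non-digits followed by four digits.
sub has_part_number {
    my ($part) = @_;
    return ( $part =~ m/\D\D\d\d\d\d/ ) ? 1 : 0;
}

foreach my $part ( 'AB1234', 'Product 1212', '1212', 'PR12' ) {
    print "$part: ", ( has_part_number($part) ? 'match' : 'no match' ), "\n";
}

# A tab is whitespace but not a literal space, so only \s matches it.
my $tabbed = "Apple\tCore";
print 'literal space: ', ( $tabbed =~ m/ /  ? 'match' : 'no match' ), "\n";
print 'whitespace \s: ', ( $tabbed =~ m/\s/ ? 'match' : 'no match' ), "\n";
```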
Setting Specific Patterns with Quantifiers
• Character quantifiers let you look for very specific patterns.
• For example, use the dollar sign ($) to match if a string ends with a specified pattern.
if ( $Name =~ /Jones$/ ) {
• This matches "John Jones" and "The guilty party is Jones", but not "Jones is here".
Selected Perl Character Quantifiers
Character Quantifier    Meaning
^    Matches when the pattern that follows starts the string. For example, the following matches "Smith is OK", "Smithsonian", and "Smith, Black":
     if ( $name =~ m/^Smith/ ) {
$    Matches when the pattern that precedes it ends the string. For example, the following matches "the end", "Tend", and "Time to Bend":
     if ( $part =~ m/end$/ ) {
Selected Perl Character Quantifiers (continued)
Character Quantifier    Meaning
+    Matches one or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", and "ABBB101 is the right part":
     if ( $part =~ m/^AB+101/ ) {
*    Matches zero or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", "A101", and "A101 is broke":
     if ( $part =~ m/^AB*101/ ) {
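The difference between + and * shows up on a string with no B at all. A minimal sketch, with hypothetical subroutine names and one extra sample string ("XAB101") added to show the effect of the ^ anchor:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when $part starts with A, then one or more B, then 101.
sub matches_ab_plus {
    my ($part) = @_;
    return ( $part =~ m/^AB+101/ ) ? 1 : 0;
}

# True when $part starts with A, then zero or more B, then 101.
sub matches_ab_star {
    my ($part) = @_;
    return ( $part =~ m/^AB*101/ ) ? 1 : 0;
}

foreach my $part ( 'AB101', 'ABBB101 is the right part', 'A101', 'XAB101' ) {
    print "$part: plus=", matches_ab_plus($part),
        " star=", matches_ab_star($part), "\n";
}
```

"A101" satisfies only the * pattern, and "XAB101" satisfies neither because ^ requires the match to begin at the start of the string.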
Building Regular Expressions that Work
1. Determine the precise field rules.
2. Get the form and form-handling programs working.
3. Start with the most specific term possible.
4. Anchor and refine. (Use ^ and $ when possible.)
– if ( $date =~ m{^\d\d/\d\d/\d\d\d\d$} ) {
– This pattern starts with 2 digits, has 2 digits in the middle, and ends with 4 digits, with a slash between the groups.
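Applying these steps to the seven-digit telephone field mentioned earlier gives a fully anchored pattern. This is a sketch under assumptions: the subroutine names and sample inputs are hypothetical, and the field rule is taken to be exactly seven digits with no punctuation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when $phone is exactly seven digits.
# Note: $ also matches just before a single trailing newline.
sub is_seven_digits {
    my ($phone) = @_;
    return ( $phone =~ m/^\d\d\d\d\d\d\d$/ ) ? 1 : 0;
}

# True when $date has the form 2 digits / 2 digits / 4 digits.
sub is_date_format {
    my ($date) = @_;
    return ( $date =~ m{^\d\d/\d\d/\d\d\d\d$} ) ? 1 : 0;
}

foreach my $phone ( '5551234', '555-1234', '55512345' ) {
    print "$phone: ", ( is_seven_digits($phone) ? 'valid' : 'invalid' ), "\n";
}
foreach my $date ( '12/20/2015', '12/20/15', 'on 12/20/2015' ) {
    print "$date: ", ( is_date_format($date) ? 'valid' : 'invalid' ), "\n";
}
```

Without the ^ and $ anchors, "55512345" and "on 12/20/2015" would both pass, because the pattern would be allowed to match anywhere inside the string.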
Regular Expression Special Variables
• Perl regular expressions set several special scalar variables:
– $& will be equal to the first matching text,
– $` will be the text before the match, and
– $' will be the text after the first match.
$name = '*****Marty';
if ( $name =~ m/\w/ ) {
    print "match at=$& ";
    print "B4=$` after=$'";
} else {
    print "Not match";
}
• Output: match at=M B4=***** after=arty
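The same three variables can be used to cut a string into the parts around a match. A minimal sketch; the subroutine name first_number_parts and the sample string are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Finds the first run of digits in $text and returns the text before it,
# the matched digits, and the text after it. Returns an empty list if
# $text contains no digit.
sub first_number_parts {
    my ($text) = @_;
    if ( $text =~ m/\d+/ ) {
        return ( $`, $&, $' );
    }
    return;
}

my ( $before, $match, $after ) = first_number_parts('Part AB101 is late');
print "before=[$before] match=[$match] after=[$after]\n";
```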
Drivedate4.pl Example Program
#!/usr/bin/perl
use CGI ':standard';
print header, start_html('Date Check');
$date = param('udate');
if ( $date =~ m{^\d\d/[0-3]\d/2\d\d\d$} ) {
    print 'Valid date=', $date;
} else {
    print 'Invalid date=', $date;
}
print end_html;
A Pattern Matching Example
#!/usr/bin/perl
use CGI ':standard';
print header, start_html('Command Search');
@PartNums = ( 'XX1234', 'XX1892', 'XX9510' );
$com  = param('command');
$prod = param('uprod');
if ( $com eq "ORDER" || $com eq "RETURN" ) {
    $prod =~ s/xx/XX/g;    # switch xx to XX
    if ( $prod =~ /XX/ ) {
        foreach $item ( @PartNums ) {
            if ( $item eq $prod ) {
                print "VALIDATED command=$com prodnum=$prod";
                $found = 1;
            }
        }
        if ( $found != 1 ) {
            print br, "Sorry Prod Num=$prod NOT FOUND";
        }
    } else {
        print br, "Sorry that prod num prodnum=$prod looks wrong";
    }
} else {
    print br, "Invalid command=$com did not receive ORDER or RETURN";
}
print end_html;
The Split Function
• split() breaks a string into different pieces based on a field separator. It takes 2 arguments:
– a pattern to match (which can contain regular expressions)
– and a string variable to split (the string is divided at each place the pattern matches).
@output = split( /\s+/, $names );
• /\s+/ is the regular expression to match.
• $names is the string variable to split.
• @output is a list variable that will contain the resulting pieces.
split() Example
$line = "Please , pass thepepper";
@result = split( /\s+/, $line );
• The pattern /\s+/ matches 1 or more whitespace characters, $line is the variable to split, and the results go into a list.
• Sets list variable @result with the following:
$result[0] = "Please";
$result[1] = ",";
$result[2] = "pass";
$result[3] = "thepepper";
Another split() Example
$line = "Baseball, hot dogs, apple pie";
@newline = split( /,/, $line );
print "newline= @newline";
• These lines will have the following output (each piece after the first keeps the space that followed its comma, so two spaces appear between items):
– newline= Baseball  hot dogs  apple pie
The Split Function
• When you know how many matches to expect:
$line = "AA1234:Hammer:122:12";
($partno, $name, $id, $cost) = split( /:/, $line );
print "Part#: $partno; Name: $name; ID: $id; Cost: $cost";
• Would output the following:
Part#: AA1234; Name: Hammer; ID: 122; Cost: 12
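Both split() styles can be wrapped in small subroutines. This is a sketch; the names parse_part_line and split_list are hypothetical, and the /,\s*/ pattern is one possible way to drop the spaces that follow each comma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Splits a colon-separated record into its four expected fields.
sub parse_part_line {
    my ($line) = @_;
    my ( $partno, $name, $id, $cost ) = split( /:/, $line );
    return ( $partno, $name, $id, $cost );
}

# Splits on a comma followed by zero or more whitespace characters,
# so the pieces do not keep a leading space. Call in list context.
sub split_list {
    my ($line) = @_;
    return split( /,\s*/, $line );
}

my ( $partno, $name, $id, $cost ) = parse_part_line('AA1234:Hammer:122:12');
print "Part#: $partno; Name: $name; ID: $id; Cost: $cost\n";

my @items = split_list('Baseball, hot dogs, apple pie');
print "items= @items\n";
```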
Summary
– Perl supports a set of operators and functions that are useful for working with string variables and verifying input data.
• The match operator, the substitute operator, the translate operator, and the split function.
– Perl uses regular expressions to enable a program to look for specific characters (such as numbers, words, or letters) in specific places in any string.
• You can use them to verify form input, thereby providing a first line of defense against accidental or malicious input.