Introduction to Programming the WWW I CMSC 10100-1 Winter 2004 Lecture 12

Upload: harold-brown

Post on 19-Jan-2016


Today’s Topics

• CGI module (cont’d)

• Matching patterns

• Regular expressions

Midterm Results

• Total points of the paper: 50

• Highest grade: 47.5

• Avg. grade: 42.6

• 10 papers submitted; 9 papers with grade >= 40

Review: Using CGI.pm to generate HTML

• The CGI.pm module provides several functions that can be used to concisely output HTML tags

• For example,

    $mypage = 'It is a New Day';
    print "<HTML><HEAD><TITLE> $mypage </TITLE></HEAD><BODY>";

can also be written as:

    $mypage = 'It is a New Day';
    print start_html($mypage);

Review: Three Basic CGI.pm Functions

• start_html creates starting HTML tags

• header creates the MIME Content-type line

• end_html creates ending HTML tags

    #!/usr/local/bin/perl
    use CGI ':standard';
    print header;
    print start_html;
    print '<FONT size=4 color="blue">';
    print 'Welcome <I>humans</I> to my site</FONT>';
    print end_html;

Review: CGI.pm Basic Functions

• The various CGI.pm functions accept three basic syntactic formats:

No argument format

• functions that can be used without any arguments

Positional argument format

• functions that can accept comma-separated arguments within parentheses

Name-value argument format

• functions that accept parameters submitted as name-and-value pairs

Review: Some No-Argument Functions

CGI.pm Function | Example of Use | Example Output
--- | --- | ---
header: the MIME Content-type line | print header; | Content-type: text/html\n\n
start_html: tags to start an HTML document | print start_html; | <HTML><HEAD><TITLE></TITLE></HEAD><BODY>
br: output <BR> tag | print br; | <BR>
hr: generate horizontal rule | print hr; | <HR>
end_html: end an HTML document | print end_html; | </BODY></HTML>

Review: Some Positional Functions

CGI.pm Function | Example of Use | Example Output
--- | --- | ---
start_html(): tags needed to start an HTML document | print start_html('My Page'); | <HTML><HEAD><TITLE> My Page </TITLE></HEAD><BODY>
h1(): header level 1 tags (also h2(), h3(), h4()) | print h1('Hello There'); | <H1>Hello There </H1>
strong(): outputs the argument in strong | print strong('Now'); | <STRONG>Now</STRONG>
p(): creates a paragraph | print p('Time to move'); | <P>Time to move </P>
b(): prints the argument in bold | print b('Exit'); | <B>Exit</B>
i(): prints the argument in italics | print i('Quick'); | <I>Quick</I>

Review: Some name/value functions

CGI.pm Function | Example Usage | Example Output
--- | --- | ---
start_html: start HTML document | print start_html({ -title=>'my title', -bgcolor=>'red' }); | <HTML><HEAD><TITLE>my title</TITLE></HEAD> <BODY BGCOLOR="RED">
img: inserts an image | print img({-src=>'myfile.gif', -alt=>'picture'}); | <IMG SRC="myfile.gif" alt="picture">
a: establishes links | print a({ -href => 'http://www.mysite.com'}, 'Click Here'); | <A HREF="http://www.mysite.com"> Click Here </A>
font(): creates <FONT> ... </FONT> tags | print font({ -color=>'BLUE', -size=>'4'}, 'Lean, and mean.'); | <FONT SIZE="4" COLOR="BLUE"> Lean, and mean. </FONT>

Review: How Web Applications Work with CGI

[Figure: a Web browser on your PC (Internet connected) shows a form, "Please Enter A Phone Number", with Submit and Erase buttons. The Web server (Internet connected) receives the request and starts up the CGI-based computer program, which sends results back. The browser then shows "Phone Query Results: That is John Doe's Phone Number".]

Review: HTML Forms

• HTML forms are used to collect different kinds of user input, defined with the <form> tag

• A form contains form elements that allow the user to enter information
  - text fields
  - textarea fields
  - drop-down menus
  - radio buttons
  - checkboxes, etc.

Review: <form> Tag Attributes

• action attribute
  - Gives the URL of the program (CGI) to receive and process the form’s data

• method attribute
  - Sets the HTTP method by which the browser sends the form data to the program; the value can be GET or POST
  - Avoid the GET method in favor of POST for security reasons

Review: <input> Tag

• Used to define any one of a number of common form “controls”
  - Text fields (including password, hidden fields)
  - Multiple-choice lists
  - Clickable images
  - Submission buttons

• Only the type and name attributes are required

• No ending tag (no </input>)

Review: Text Fields

• Single line of text
  <input type=text name=XXX>

• Set type to password to mask text like a password

• Set type to hidden to create a hidden field

• size and maxlength attributes

• value attribute to give default text

Review: Multi-line Text Area

• The <textarea> tag

• Attributes
  - cols
  - rows
  - wrap: values are off, virtual (default), physical

Review: Check Boxes

• Check boxes for “check all that apply” questions
  <input type=checkbox name=XXX value=XXX>
  - Make sure the name is identical among a group of checkboxes
  - checked attribute

• When the form is submitted, the names and values of the checked boxes are sent

Review: Radio Buttons

• Similar to checkboxes, but only one in the group may be selected
  <input type=radio name=XXX value=XXX>

Review: Multiple Choice Elements

• The <select> tag creates either pull-down menus or scrolling lists

• The <option> tag defines each item within a <select> tag

• <select> tag attributes
  - size attribute: number of rows visible at the same time
  - multiple attribute: if set, allows multiple selections
  - name attribute

Review: Action Buttons

• What are they?
  - Submit buttons: <input type=submit name=XXX value=XXX>
  - Reset buttons: <input type=reset name=XXX value=XXX>
  - Regular buttons: <input type=button name=XXX value=XXX>
  - Image buttons (will send form content like a submit button): <input type=image name=XXX src=XXX>
  - *File buttons (need to deal with the enctype attribute): <input type=file name=XXX accept="text/*">

Using CGI.pm with HTML forms

CGI.pm Function, Example Usage, Example Output

start_form: start HTML form element
    Usage:  print start_form({ -method=>'post', -action=>'http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl' });
    Output: <form method="post" action=http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl>

textfield, password_field: inserts a text field or password field
    Usage:  print textfield(-name=>'textfield1', -size=>'50', -maxlength=>'50');
    Output: <input type="text" name="textfield1" size=50 maxlength=50 />

scrolling_list: inserts a multiple-selection list
    Usage:  print scrolling_list(-name=>'list1', -values=>['eenie', 'minie', 'moe'], -default=>['eenie','moe'], -size=>5, -multiple=>'true');
    Output: <select name="list1" size=5 multiple><option selected value="eenie">eenie</option><option value="minie">minie</option><option selected value="moe">moe</option></select>

textarea: inserts a text area
    Usage:  print textarea(-name=>'large_field_name', -rows=>10, -columns=>50);
    Output: <textarea name="large_field_name" rows=10 cols=50></textarea>

Using CGI.pm with HTML forms (cont’d)

CGI.pm Function, Example Usage, Example Output

checkbox_group: inserts a group of checkboxes
    Usage:  print checkbox_group(-name=>'color', -values=>['red ','orange ','yellow '], -default=>['red ']);
    Output: <input type="checkbox" name="color" value="red " checked />red <input type="checkbox" name="color" value="orange " />orange <input type="checkbox" name="color" value="yellow " />yellow

radio_group: inserts a group of radio buttons
    Usage:  print radio_group(-name=>'color blind', -values=>['Yes','No'], -default=>'No');
    Output: <input type="radio" name="color blind" value="Yes" />Yes<input type="radio" name="color blind" value="No" checked />No

submit, reset: insert a submit or reset button
    Usage:  print submit('submit', 'Submit'); print reset;
    Output: <input type="submit" name="submit" value="Submit" /><input type="reset" />

endform: prints the end form tag
    Usage:  print endform();
    Output: </form>

Perl CGI Reference

A CGI Form Example

http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi

Receiving HTML Form Arguments

• Within the CGI program, call the param() function
  - Input variables into CGI/Perl programs are called CGI variables
  - Values are received through your Web server as input from a Web browser, usually filled in a form

• To use param():

    $thecolor = param('color');

  - The CGI variable name goes in quotation marks.
  - The value of the CGI variable is assigned to $thecolor.

Receiving HTML Form Arguments

The Calling HTML Form

    <FORM ACTION="cgiform1_checker.cgi" METHOD="POST">
    print "What is your favourite color?";
    print checkbox_group(-name=>'color',
        -values=>['red ','orange ','yellow ','green ','blue ','indigo ','violet '],
        -default=>['red ','blue ']);
    .
    .
    .
    </FORM>

• ACTION gives the URL of the program to send form output to.

• The name of the argument from the checkbox group is color.

The Receiving CGI/Perl Program

    #!/usr/local/bin/perl
    use CGI ":standard";
    print header;
    print "Your favourite color: ", param('color');
    . . .
    print end_html;

• param('color') gets the value of the form element called color.

http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi

Sending Arguments

• You can send arguments to your CGI program directly from the URL address of a browser

http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red

• The part before the "?" is the URL of the CGI program to start.

• The "?" signals that an argument follows.

• The argument name is color. Its value is red.

Sending Multiple Arguments

• Precede the first argument with ?

• Precede each following argument with &

http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red&secret=nothing

Debug CGI Program in Command Line

• To start the program and send it an argument from the command line, execute the following:

    perl cgiform1_checker.cgi color=red

• Enclose blank spaces or multiple arguments in quotation marks:

    perl cgiform1_checker.cgi 'color=rose red'

    perl cgiform1_checker.cgi 'color=red&secret=none'

Check CGI Variables' Values

• Perl provides a simple way to test whether a parameter was received or is null:

    $var = param('some_cgi_variable');
    if ($var) {
        statement(s) to execute when $var has a value
    } else {
        statement(s) to execute when $var has no value
    }
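One caveat worth seeing in code: a plain truth test treats the string "0" as false, so a form value of 0 looks like "no value". The sketch below compares the plain test with a stricter check. The helper name has_value is hypothetical (it is not part of CGI.pm), and the inputs stand in for what param() might return.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): true when a value was
# supplied and is not the empty string.
sub has_value {
    my ($var) = @_;
    return defined($var) && length($var) > 0 ? 1 : 0;
}

# Stand-ins for values that param() might return.
for my $input ('red', '0', '', undef) {
    my $shown   = defined $input ? "'$input'" : 'undef';
    my $plain   = $input ? 'true' : 'false';            # the plain truth test
    my $checked = has_value($input) ? 'true' : 'false'; # the stricter test
    print "$shown: if (\$var) is $plain, has_value is $checked\n";
}
```

The two tests disagree only for '0', which is the case to watch for when a field can legitimately hold a zero.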

Combining Program Files

• Applications so far have required two separate files: one file to generate the form, and the other to process the form
  - Example: cgiform1.cgi and cgiform1_checker.cgi
  - Can test the return value of param() to combine these

• At least two advantages
  - With one file, it is easier to change arguments
  - It is easier to maintain one file

Combining Program Files

    if ( !param() ) {
        &create_form();
    } else {
        &process_form();
    }

• !param() checks to see if there are any parameters.

• If there are no parameters, then this is the first time for the program. Call create_form to create the form.

• Otherwise there must be some parameters to process, so call process_form.

http://people.cs.uchicago.edu/~hai/hw4/cgiform2.cgi

Patterns in String Variables

• Many programming problems require matching, changing, or manipulating patterns in string variables.
  - An important use is verifying input fields of a form
  - This helps provide security against accidental or malicious attacks.

• For example, if expecting a form field to provide a telephone number as input, your program needs a way to verify that the input comprises a string of seven digits.
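A minimal sketch of the seven-digit check, using the ^ and $ anchors and the \d digit class that this lecture introduces later. The helper name is_seven_digit_phone and the sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the whole input is seven digits.
sub is_seven_digit_phone {
    my ($input) = @_;
    return $input =~ /^\d\d\d\d\d\d\d$/ ? 1 : 0;
}

for my $input ('5551234', '555-1234', '555123', '5551234x') {
    my $verdict = is_seven_digit_phone($input) ? 'valid' : 'invalid';
    print "$input: $verdict\n";
}
```

Only the first input passes; the others contain a non-digit or have the wrong length.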

Four Different Constructs

• Will look at 4 different Perl string manipulation constructs:
  - The match operator enables your program to look for patterns in strings.
  - The substitute operator enables your program to change patterns in strings.
  - The split function enables your program to split strings into separate variables based on a pattern. (already covered)
  - Regular expressions provide a pattern matching language that can work with these operators and functions to work on string variables.

The Match Operator

• The match operator is used to test if a pattern appears in a string.
  - It is used with the binding operator (“=~”) to see whether a variable contains a particular pattern.

    if ( $name =~ m/edu/ ) {
        set of statements to execute
    }

• The binding operator indicates to examine the contents of $name.

• The match operator tries to match the pattern inside the slashes "/". In this case the pattern is "edu".

• These statements execute if 'edu' is ANYWHERE in the contents of the string variable $name.

Possible Values of $name

Value of $name | Test from Figure 7.1
--- | ---
'www.myschool.edu' | True because the string contains edu
'www.myschool.com' | False because edu is not in the string
'I like my education' | True because the string contains edu
'I Like My Education' | False because matching is case sensitive
'I liked umbrellas' | False because edu is not in the string
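The five cases above can be checked directly. This is a small sketch; the helper name contains_edu is made up for illustration.

```perl
use strict;
use warnings;

# Returns 1 when 'edu' appears anywhere in the string, else 0.
sub contains_edu {
    my ($name) = @_;
    return $name =~ m/edu/ ? 1 : 0;
}

my @names = ('www.myschool.edu', 'www.myschool.com', 'I like my education',
             'I Like My Education', 'I liked umbrellas');
for my $name (@names) {
    printf "%-22s %s\n", "'$name'", contains_edu($name) ? 'True' : 'False';
}
```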

Other Delimiters?

• Slash (“/”) is the most common match pattern delimiter
  - Others are possible. For example, both of the following use valid match operator syntax:

    if ( $name =~ m!Dave! ) {
    if ( $name =~ m<Dave> ) {

• The reverse binding operator tests if the pattern is NOT found:

    if ( $color !~ m/blue/ ) {
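A short sketch showing that the delimiter choice does not change the result, and that !~ is true when the pattern is absent. The values of $name and $color are made up for illustration.

```perl
use strict;
use warnings;

my $name  = 'Dave Smith';
my $color = 'red';

# The same pattern written with three different delimiters.
my $slash   = ($name =~ m/Dave/) ? 1 : 0;
my $bang    = ($name =~ m!Dave!) ? 1 : 0;
my $bracket = ($name =~ m<Dave>) ? 1 : 0;
print "slash=$slash bang=$bang bracket=$bracket\n";

# !~ is true when the pattern is NOT found.
my $not_blue = ($color !~ m/blue/) ? 1 : 0;
print "not_blue=$not_blue\n";
```

An alternate delimiter is most useful when the pattern itself contains slashes, such as a URL.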

The Substitution Operator

• Similar to the match operator but also enables you to change the matched string.
  - Use with the binding operator (“=~”) to test whether a variable contains a pattern

    $stringvar =~ s/ABC/abc/;

• $stringvar is the string variable to search and potentially substitute the pattern in.

• ABC is the pattern to search for.

• abc is the pattern to change to if there is a match.

How It Works

• Replaces the first occurrence of the search pattern in the string variable with the change pattern.

• For example, the following changes the first occurrence of t to T:

    $name = "tom turtle";
    $name =~ s/t/T/;
    print "Name=$name";

• The output of this code would be

    Name=Tom turtle

Changing All Occurrences

• You can place a g (for global substitution) at the end of the substitution expression to change all occurrences of the target pattern string in the search string. For example,

    $name = "tom turtle";
    $name =~ s/t/T/g;
    print "Name=$name";

• The output of this code would be

    Name=Tom TurTle
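The two forms can be compared side by side. This sketch also uses the fact that a successful s///g returns the number of substitutions it made; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $first = 'tom turtle';
my $all   = 'tom turtle';

$first =~ s/t/T/;                 # replaces only the first t
my $count = ($all =~ s/t/T/g);    # replaces every t; returns how many were changed

print "first=$first\n";
print "all=$all count=$count\n";
```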

Using Translate

• A similar function is called tr (for “translate”). It is useful for translating characters from uppercase to lowercase, and vice versa.
  - The tr function allows you to specify a range of characters to translate from and a range of characters to translate to:

    $name = "smokeY";
    $name =~ tr/a-z/A-Z/;
    print "name=$name";

• Would output the following

    name=SMOKEY
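A brief sketch of tr in both directions. Note that tr takes plain character lists rather than regular expressions, so ranges are written without square brackets. With an empty replacement list, tr leaves the string unchanged and returns a count of the matching characters. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $name = 'smokeY';

(my $upper = $name) =~ tr/a-z/A-Z/;    # copy $name, then translate the copy
(my $lower = $name) =~ tr/A-Z/a-z/;
my $vowels = ($name =~ tr/aeiou//);    # empty replacement list: count only

print "upper=$upper lower=$lower vowels=$vowels\n";
```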

A Full Pattern Matching Example

    #!/usr/local/bin/perl
    use CGI ':standard';
    print header, start_html('Command Search');
    @PartNums = ( 'XX1234', 'XX1892', 'XX9510' );
    $com  = param('command');
    $prod = param('uprod');
    if ($com eq "ORDER" || $com eq "RETURN") {
        $prod =~ s/xx/XX/g;    # switch xx to XX
        if ($prod =~ /XX/ ) {
            foreach $item ( @PartNums ) {
                if ( $item eq $prod ) {
                    print "VALIDATED command=$com prodnum=$prod";
                    $found = 1;
                }
            }
            if ( $found != 1 ) {
                print br, "Sorry Prod Num=$prod NOT FOUND";
            }
        } else {
            print br, "Sorry that prod num prodnum=$prod looks wrong";
        }
    } else {
        print br, "Invalid command=$com did not receive ORDER or RETURN";
    }
    print end_html;

Would Output The Following ...

[Figure: browser screenshot of the program's output]

Using Regular Expressions

• Regular expressions enable programs to match patterns more completely.
  - They actually make up a small language of special matching operators that can be employed to enhance Perl string pattern matching.

The Alternation Operator

• The alternation operator looks for alternative strings for matching within a pattern.
  - That is, you use it to indicate that the program should match one pattern OR the other.
  - The following shows a match statement using the alternation operator and some possible matches based on the contents of $address; this pattern matches either com or edu.

Example Alternation Operator

Match statement:

    if ( $address =~ /com|edu/ ) {

Possible matching string values for $address:

    “www.mysite.com”, “Welcome to my site”, “Time for education”, “www.mysite.edu”

Parentheses for Groupings

• You use parentheses within regular expressions to specify groupings. For example, the following matches a $name value of Dave or David.

Match statement:

    if ( $name =~ /Dav(e|id)/ ) {

Possible matching string values for $name:

    “Dave”, “David”, “Dave was here”, “How long before David comes home”

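The alternation and grouping patterns can be tried together. In this sketch the helper names and the non-matching inputs ("www.mysite.org", "Davy Jones") are made up for illustration. Note that "Welcome to my site" matches only because "Welcome" happens to contain com.

```perl
use strict;
use warnings;

sub has_com_or_edu    { my ($s) = @_; return $s =~ /com|edu/    ? 1 : 0; }
sub has_dave_or_david { my ($s) = @_; return $s =~ /Dav(e|id)/ ? 1 : 0; }

for my $address ('www.mysite.com', 'Welcome to my site', 'www.mysite.org') {
    print "$address: ", has_com_or_edu($address) ? 'match' : 'no match', "\n";
}

for my $name ('Dave was here', 'David', 'Davy Jones') {
    print "$name: ", has_dave_or_david($name) ? 'match' : 'no match', "\n";
}
```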
Special Character Classes

• Perl has a special set of character classes for shorthand pattern matching

• For example, consider these two statements

    if ( $name =~ m/ / ) {

  will match $name with an embedded space character

    if ( $name =~ m/\s/ ) {

  will match $name with an embedded space, tab, or newline
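The difference between the two statements shows up as soon as the separator is not a plain space. A small sketch, with a hypothetical helper space_tests and made-up inputs:

```perl
use strict;
use warnings;

# Returns a two-element list: (literal-space match, \s match), each 1 or 0.
sub space_tests {
    my ($name) = @_;
    my $literal = ($name =~ m/ /)  ? 1 : 0;
    my $class   = ($name =~ m/\s/) ? 1 : 0;
    return ($literal, $class);
}

for my $name ("Apple Core", "Apple\tCore", "Apple\nCore", "AppleCore") {
    my ($literal, $class) = space_tests($name);
    (my $shown = $name) =~ s/\t/\\t/;   # make the tab visible when printing
    $shown =~ s/\n/\\n/;                # likewise for the newline
    print "$shown: m/ /=$literal m/\\s/=$class\n";
}
```

Both tests agree for a plain space and for no whitespace at all; only \s matches the tab and the newline.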

Special Character Classes

Character Class: Meaning

\s  Matches a single whitespace character (space, tab, newline, return, or formfeed). For example, the following matches “Apple Core”, “Alle y”, and “Here you go”; it does not match “Alone”:

    if ( $name =~ m/e\s/ ) {

\S  Matches any character that is not a space, tab, newline, return, or formfeed. For example, the following matches “ZT”, “YT”, and “;T”:

    if ( $part =~ m/\ST/ ) {

Special Character Classes - II

Character Class: Meaning

\w  Matches any word character (uppercase or lowercase letters, digits, or the underscore character). For example, the following matches “Apple”, “Time”, “Part time”, “time_to_go”, “ Time”, and “1234”; it does not match “#%^&”:

    if ( $part =~ m/\w/ ) {

\W  Matches any nonword character (not uppercase or lowercase letters, digits, or the underscore character). For example, the following matches “A*B” and “A{B”, but not “A**B”, “AB*”, “AB101”, or “1234”:

    if ( $part =~ m/A\WB/ ) {

Special Character Classes - III

Character Class: Meaning

\d  Matches any valid numerical digit (that is, any number 0-9). For example, the following matches “B12abc”, “The B1 product is late”, “I won bingo with a B9”, and “Product B00121”; it does not match “B 0”, “Product BX 111”, or “Be late 1”:

    if ( $part =~ m/B\d/ ) {

\D  Matches any non-numerical character (that is, any character not a digit 0-9). For example, the following matches “AB1234”, “Product number 1111”, “Number VG928321212”, “The number_A1234”, and “Product 1212”; it does not match “1212” or “PR12”:

    if ( $part =~ m/\D\D\d\d\d\d/ ) {
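A small sketch of the \D\D\d\d\d\d pattern. It matches anywhere in the string, and \D accepts any non-digit, including a space, which is why “Product 1212” matches. Adding the ^ and $ anchors (covered on the following slides) restricts the match to the whole string. The helper name has_part_number is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: two non-digits followed by four digits, anywhere in the string.
sub has_part_number {
    my ($part) = @_;
    return $part =~ m/\D\D\d\d\d\d/ ? 1 : 0;
}

for my $part ('AB1234', 'Product 1212', '1212', 'PR12') {
    print "$part: ", has_part_number($part) ? 'match' : 'no match', "\n";
}

# Anchoring the same classes rejects surrounding text.
my $anchored_ok  = ('AB1234'       =~ m/^\D\D\d\d\d\d$/) ? 1 : 0;
my $anchored_bad = ('Product 1212' =~ m/^\D\D\d\d\d\d$/) ? 1 : 0;
print "anchored: AB1234=$anchored_ok 'Product 1212'=$anchored_bad\n";
```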

Setting Specific Patterns w/ Quantifiers

• Character quantifiers let you look for very specific patterns

• For example, use the dollar sign (“$”) to match if a string ends with a specified pattern.

    if ( $Name =~ /Jones$/ ) {

• Matches “John Jones” and “The guilty party is Jones”, but not “Jones is here”.

Selected Perl Character Quantifiers I

Character Quantifier: Meaning

^  Matches when the following character starts the string. For example, the following matches “Smith is OK”, “Smithsonian”, and “Smith, Black”:

    if ( $name =~ m/^Smith/ ) {

$  Matches when the preceding character ends the string. For example, the following matches “the end”, “Tend”, and “Time to Bend”:

    if ( $part =~ m/end$/ ) {
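A quick sketch of both anchors. The helper names and the non-matching inputs ("Mr. Smith", "endless") are made up for illustration.

```perl
use strict;
use warnings;

sub starts_with_smith { my ($s) = @_; return $s =~ m/^Smith/ ? 1 : 0; }
sub ends_with_end     { my ($s) = @_; return $s =~ m/end$/   ? 1 : 0; }

for my $name ('Smithsonian', 'Mr. Smith') {
    print "$name: ", starts_with_smith($name) ? 'match' : 'no match', "\n";
}

for my $part ('Time to Bend', 'endless') {
    print "$part: ", ends_with_end($part) ? 'match' : 'no match', "\n";
}
```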

Selected Perl Character Quantifiers II

Character Quantifier: Meaning

+  Matches one or more occurrences of the preceding character. For example, the following matches “AB101”, “ABB101”, and “ABBB101 is the right part”:

    if ( $part =~ m/^AB+101/ ) {

*  Matches zero or more occurrences of the preceding character. For example, the following matches “AB101”, “ABB101”, “A101”, and “A101 is broke”:

    if ( $part =~ m/^AB*101/ ) {

Selected Perl Character Quantifiers III

Character Quantifier: Meaning

.  A wildcard symbol that matches any one character. For example, the following matches “Stop”, “Soap”, “Szxp”, and “Soap is good”; it does not match “Sxp”:

    if ( $name =~ m/^S..p/ ) {
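The +, *, and . symbols can be compared on the same inputs. This is a sketch; the helper names are made up for illustration.

```perl
use strict;
use warnings;

sub one_or_more_b  { my ($s) = @_; return $s =~ m/^AB+101/ ? 1 : 0; }
sub zero_or_more_b { my ($s) = @_; return $s =~ m/^AB*101/ ? 1 : 0; }
sub s_two_any_p    { my ($s) = @_; return $s =~ m/^S..p/   ? 1 : 0; }

for my $part ('AB101', 'ABBB101', 'A101') {
    print "$part: plus=", one_or_more_b($part), " star=", zero_or_more_b($part), "\n";
}

for my $name ('Stop', 'Soap is good', 'Sxp') {
    print "$name: ", s_two_any_p($name) ? 'match' : 'no match', "\n";
}
```

The only input where + and * disagree is “A101”, which has zero B characters.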

Building Regular Expressions That Work

• Regular expressions are very powerful, but they can also be virtually unreadable.
  - When building one, start with a simple regular expression and then refine it incrementally.
  - Build a piece and then test

• The following example will build a regular expression for a date checker
  - mm/dd/yyyy format (for example, 05/05/2002 but not 5/12/01).

Building Regular Expressions That Work

1. Determine the precise field rules.
  - What is valid input and what is not valid input?
  - E.g., for a date field, think through the valid and invalid rules for the field. You might allow 09/09/2002 but not 9/9/2002 or Sep/9/2002.
  - Work through several examples as follows:

Work through several examples

Rule | Reject These
--- | ---
05/05/2002: use / as a separator | 05-05-2002 (require slash delimiters)
05/05/2002: use a four-digit year | 05/05/02 (four-digit years only)
05/05/2001: contain only a date | The date is 05/05/2002 (only date fields); 05/05/2002 is my date (only date fields)
05/05/2001: two digits for months and days | 5/05/2002 (two-digit months only); 05/5/2002 (two-digit days only); 5/5/2002 (two-digit days and months only)

Building Regular Expressions that Work

2. Get form and form-handling programs working
  - Build a sending form with the input field
  - Build the receiving program that accepts the field.
  - For example, a first-cut receiving program:

    $date = param('udate');
    if ( $date =~ m/.+/ ) {       # .+ matches any sequence of characters
        print 'Valid date=', $date;
    } else {
        print 'Invalid date=', $date;
    }

Building Regular Expressions that Work

3. Start with the most specific term possible.
  - For example, slashes must always separate two characters (for the month), followed by two more characters (for the day), followed by four characters (for the year).

    if ( $date =~ m{../../....} ) {

  - Any 2 characters, a slash, any 2 characters, a slash, then any 4 characters

Building Regular Expressions that Work

4. Anchor and refine. (Use ^ and $ when possible)

    if ( $date =~ m{^\d\d/\d\d/\d\d\d\d$} ) {

  - Starts with 2 digits, has 2 digits in the middle, and ends with 4 digits

Building Regular Expressions that Work

5. Get more specific if possible.
  - The first digit of the day can be only 0, 1, 2, or 3. For example, 05/55/2002 is clearly an illegal date.
  - Only years that start with 2 are allowed, because we don’t care about dates like 05/05/1999 or 05/05/3003.

Building Regular Expressions that Work

• Add these rules below

    if ( $date =~ m{^\d\d/[0-3]\d/2\d\d\d$} ) {

  - Day (the second field) starts with 0-3
  - Year starts with a 2

• Now the regular expression recognizes input like 09/99/2001 and 05/05/4000 as illegal.
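To see what each refinement buys, the four patterns from steps 2 through 5 can be run against the same inputs. This is a sketch: the stage labels, the helper name passes, and the choice of sample inputs are assumptions for illustration. Each output line shows 1 or 0 for the four stages in order.

```perl
use strict;
use warnings;

# The four patterns from steps 2-5, least to most specific.
my @stages = (
    [ 'any text'        => qr{.+} ],
    [ 'slashes'         => qr{../../....} ],
    [ 'anchored digits' => qr{^\d\d/\d\d/\d\d\d\d$} ],
    [ 'refined'         => qr{^\d\d/[0-3]\d/2\d\d\d$} ],
);

sub passes {
    my ($date, $re) = @_;
    return $date =~ $re ? 1 : 0;
}

for my $date ('05/05/2002', '5/5/2002', 'The date is 05/05/2002',
              '09/99/2001', '05/05/4000') {
    my @results;
    push @results, passes($date, $_->[1]) for @stages;
    print "$date: @results\n";
}
```

Even the final pattern is only a format check: it still accepts an input such as 13/39/2002, so a real date validator would need further rules.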

Tip: Regular Expression Special Variables

• Perl regexes set several special scalar variables:
  - $& will be equal to the first matching text
  - $` will be the text before the match, and
  - $' will be the text after the first match.

    $name = '*****Marty';
    if ( $name =~ m/\w/ ) {
        print "got match at=$& ";
        print "B4=$` after=$'";
    } else {
        print "Not match";
    }

• would output: got match at=M B4=***** after=arty
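A variation on the example above, sketched with \w+ instead of \w so that the match covers the whole word. The special variables are overwritten by the next successful match, so the sketch copies them into ordinary variables (the names $before, $match, and $after are made up for illustration) right away.

```perl
use strict;
use warnings;

my $name = '*****Marty';
my ($before, $match, $after) = ('', '', '');

if ( $name =~ m/\w+/ ) {
    # Copy right away: the next successful match overwrites these variables.
    ($before, $match, $after) = ($`, $&, $');
}

print "before=$before match=$match after=$after\n";
```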

Full Example Program

    #!/usr/local/bin/perl
    use CGI ':standard';
    print header, start_html('Date Check');
    $date = param('udate');
    if ($date =~ m{^\d\d/[0-3]\d/2\d\d\d$}) {
        print 'Valid date=', $date;
    } else {
        print 'Invalid date=', $date;
    }
    print end_html;

Would Output The Following ...

[Figure: browser screenshot of the program's output]