introduction to programming the www i cmsc 10100-1 winter 2004 lecture 12
Introduction to Programming the WWW I
CMSC 10100-1
Winter 2004
Lecture 12
Today’s Topics
• CGI module (cont’d)
• Matching patterns
• Regular expressions
Midterm Results
• Total points of the paper: 50
• Highest grade: 47.5
• Avg. grade: 42.6
• 10 submitted papers; 9 papers with grade >= 40
Review: Using CGI.pm to generate HTML
• The CGI.pm module provides several functions that can be used to concisely output HTML tags
• For example,
    $mypage = 'It is a New Day';
    print "<HTML><HEAD><TITLE> $mypage </TITLE></HEAD><BODY>";
  can also be written as:
    $mypage = 'It is a New Day';
    print start_html($mypage);
Review: Three Basic CGI.pm Functions
• start_html creates starting HTML tags
• header creates the MIME Content-type line
• end_html creates ending HTML tags
    #!/usr/local/bin/perl
    use CGI ':standard';
    print header;
    print start_html;
    print '<FONT size=4 color="blue">';
    print 'Welcome <I>humans</I> to my site</FONT>';
    print end_html;
Review: CGI.pm Basic Functions
• The various CGI.pm functions accept 3 basic syntactic formats:
  - No argument format: functions that can be used without any arguments
  - Positional argument format: functions that can accept comma-separated arguments within parentheses
  - Name-value argument format: functions that accept parameters submitted as name-and-value pairs
Review: Some No-Argument Functions
CGI.pm Function | Example of Use | Example Output
header: the MIME Content-type line | print header; | Content-type: text/html\n\n
start_html: tags to start an HTML document | print start_html; | <HTML><HEAD><TITLE></TITLE></HEAD><BODY>
br: outputs a <BR> tag | print br; | <BR>
hr: generates a horizontal rule | print hr; | <HR>
end_html: ends an HTML document | print end_html; | </BODY></HTML>
Review: Some Positional Functions
CGI.pm Function | Example of Use | Example Output
start_html(): tags needed to start an HTML document | print start_html('My Page'); | <HTML><HEAD><TITLE>My Page</TITLE></HEAD><BODY>
h1(): header level 1 tags (also h2(), h3(), h4()) | print h1('Hello There'); | <H1>Hello There</H1>
strong(): outputs the argument in strong emphasis | print strong('Now'); | <STRONG>Now</STRONG>
p(): creates a paragraph | print p('Time to move'); | <P>Time to move</P>
b(): prints the argument in bold | print b('Exit'); | <B>Exit</B>
i(): prints the argument in italics | print i('Quick'); | <I>Quick</I>
Review: Some name/value functions
CGI.pm Function | Example Usage | Example Output
start_html: starts an HTML document | print start_html({ -title=>'my title', -bgcolor=>'red' }); | <HTML><HEAD><TITLE>my title</TITLE></HEAD><BODY BGCOLOR="red">
img: inserts an image | print img({ -src=>'myfile.gif', -alt=>'picture' }); | <IMG SRC="myfile.gif" ALT="picture">
a: establishes links | print a({ -href=>'http://www.mysite.com' }, 'Click Here'); | <A HREF="http://www.mysite.com">Click Here</A>
font(): creates <FONT> ... </FONT> tags | print font({ -color=>'BLUE', -size=>'4' }, 'Lean, and mean.'); | <FONT SIZE="4" COLOR="BLUE">Lean, and mean.</FONT>
Review: How Web Applications Work with CGI
[Figure: A Web browser on your PC (Internet connected) shows a form, "Please Enter A Phone Number", with Submit and Erase buttons. The Web server (Internet connected) receives the request and starts up the CGI program. The CGI-based computer program sends results back, and the Web browser shows "Phone Query Results: That is John Doe's Phone Number".]
Review: HTML Forms
• HTML forms are used to collect different kinds of user input, and are defined with the <form> tag
• A form contains form elements that allow the user to enter information: text fields, textarea fields, drop-down menus, radio buttons, checkboxes, etc.
Review: <form> Tag Attributes
• action attribute: gives the URL of the program (CGI) that receives and processes the form’s data
• method attribute: sets the HTTP method by which the browser sends the form data to the program; the value can be GET or POST
  - Avoid the GET method in favor of POST for security reasons: GET appends the form data to the URL, where it is visible
Review: <input> Tag
• Defines any one of a number of common form “controls”: text fields (including password and hidden fields), multiple-choice lists, clickable images, submission buttons
• Only the type and name attributes are required
• No ending tag (no </input>)
Review: Text Fields
• Single line of text: <input type=text name=XXX>
• Set type to password to mask the text like a password
• Set type to hidden to create a hidden field
• size and maxlength attributes
• value attribute to give default text
Review: Multi-line Text Area
• The <textarea> tag
• Attributes: cols, rows, wrap
  - wrap values: off, virtual (default), physical
Review: Check Boxes
• Check boxes are for “check all that apply” questions: <input type=checkbox name=XXX value=XXX>
  - Make sure the name is identical among a group of checkboxes
  - checked attribute
• When the form is submitted, the names and values of the checked boxes are sent
Review: Radio Buttons
• Similar to checkboxes, but only one in the group may be selected: <input type=radio name=XXX value=XXX>
Review: Multiple Choice Elements
• The <select> tag creates either pull-down menus or scrolling lists
• The <option> tag defines each item within a <select> tag
• <select> tag attributes
  - size attribute: number of rows visible at the same time
  - multiple attribute: if set, allows multiple selections
  - name attribute
Review: Action Buttons
• What are they?
  - Submit buttons: <input type=submit name=XXX value=XXX>
  - Reset buttons: <input type=reset name=XXX value=XXX>
  - Regular buttons: <input type=button name=XXX value=XXX>
  - Image buttons (send the form content, as a submit button does): <input type=image name=XXX src=XXX>
  - *File buttons (need to deal with the enctype attribute): <input type=file name=XXX accept="text/*">
Using CGI.pm with HTML forms
CGI.pm Function | Example Usage | Example Output
start_form: starts an HTML form element | print start_form({ -method=>'post', -action=>'http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl' }); | <form method="post" action="http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl">
textfield, password_field: insert a text field or password field | print textfield(-name=>'textfield1', -size=>'50', -maxlength=>'50'); | <input type="text" name="textfield1" size=50 maxlength=50 />
scrolling_list: inserts a scrolling list (here allowing multiple selections) | print scrolling_list(-name=>'list1', -values=>['eenie', 'minie', 'moe'], -default=>['eenie', 'moe'], -size=>5, -multiple=>'true'); | <select name="list1" size=5 multiple><option selected value="eenie">eenie</option><option value="minie">minie</option><option selected value="moe">moe</option></select>
textarea: inserts a text area | print textarea(-name=>'large_field_name', -rows=>10, -columns=>50); | <textarea name="large_field_name" rows=10 cols=50></textarea>
Using CGI.pm with HTML forms (cont’d)
CGI.pm Function | Example Usage | Example Output
checkbox_group: inserts a group of checkboxes | print checkbox_group(-name=>'color', -values=>['red ','orange ','yellow '], -default=>['red ']); | <input type="checkbox" name="color" value="red " checked />red <input type="checkbox" name="color" value="orange " />orange <input type="checkbox" name="color" value="yellow " />yellow
radio_group: inserts a group of radio buttons | print radio_group(-name=>'color blind', -values=>['Yes','No'], -default=>'No'); | <input type="radio" name="color blind" value="Yes" />Yes <input type="radio" name="color blind" value="No" checked />No
submit, reset: insert a submit or reset button | print submit('submit', 'Submit'); print reset; | <input type="submit" name="submit" value="Submit" /> <input type="reset" />
endform: prints the end form tag | print endform(); | </form>
Perl CGI Reference
A CGI Form Example
http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi
Receiving HTML Form Arguments
• Within the CGI program, call the param() function
  - Input variables into CGI/Perl programs are called CGI variables
    • Values received through your Web server as input from a Web browser, usually filled in a form
  - To use param():
    $thecolor = param('color');
  - The CGI variable name goes in quotation marks.
  - The value of the CGI variable is assigned to $thecolor.
Receiving HTML Form Arguments
The Calling HTML Form (ACTION is the URL of the program to send form output to; the name of the argument from the checkbox is color):
    <FORM ACTION="cgiform1_checker.cgi" METHOD="POST">
    print "What is your favourite color?";
    print checkbox_group(-name=>'color',
        -values=>['red ','orange ','yellow ','green ','blue ','indigo ','violet '],
        -default=>['red ','blue ']);
    .
    .
    .
    </FORM>
The Receiving CGI/Perl Program:
    #!/usr/local/bin/perl
    use CGI ":standard";
    print header;
    print "Your favourite color: ", param('color');    # get the value of the form element called color
    ...
    print end_html;
http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi
Sending Arguments
• You can send arguments to your CGI program directly from the URL address of a browser
    http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red
  - The part before the "?" is the URL of the CGI program to start.
  - The "?" signals that an argument follows.
  - The argument name is color. Its value is red.
Sending Multiple Arguments
• Precede the first argument with ?
• Precede each next argument with &
    http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red&secret=nothing
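To make the ? and & rules concrete, here is a minimal sketch in plain Perl that takes such a URL apart with split. It is only an illustration: the helper name parse_query is hypothetical, it assumes no URL-encoded characters (such as + or %20 for spaces), and in a real CGI program param() from CGI.pm does this work for you.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): split a URL into name/value pairs.
# Assumes the names and values contain no URL-encoded characters.
sub parse_query {
    my ($url) = @_;
    my %args;
    # Everything after the first "?" is the argument list.
    my (undef, $query) = split /\?/, $url, 2;
    return %args unless defined $query;
    # "&" separates arguments; "=" separates each name from its value.
    foreach my $pair (split /&/, $query) {
        next if $pair eq '';
        my ($name, $value) = split /=/, $pair, 2;
        $args{$name} = defined $value ? $value : '';
    }
    return %args;
}

my %args = parse_query(
    'http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red&secret=nothing');
foreach my $name (sort keys %args) {
    print "$name = $args{$name}\n";
}
```

Run as written, this prints one line per argument (color, then secret), which mirrors what param('color') and param('secret') would return inside the CGI program.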
Debug CGI Program in Command Line
• To start the CGI program from the command line and send it an argument, execute the following:
    perl cgiform1_checker.cgi color=red
• Enclose blank spaces or multiple arguments in quotation marks:
    perl cgiform1_checker.cgi 'color=rose red'
    perl cgiform1_checker.cgi 'color=red&secret=none'
Check CGI Variables’ Values
• Perl provides a simple way to test whether a parameter was received or is empty:
    $var = param('some_cgi_variable');
    if ($var) {
        statement(s) to execute when $var has a value
    } else {
        statement(s) to execute when $var has no value
    }
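One caveat worth knowing: if ($var) tests Perl truth, so a submitted value of 0 or an empty string is treated the same as no value at all. The sketch below compares that test with a stricter one. The helper names has_true_value and has_any_value are made up for illustration, and the plain scalars stand in for what param() might return.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; $var stands in for a param() result.
sub has_true_value {
    my ($var) = @_;
    return $var ? 1 : 0;                            # the simple truth test
}

sub has_any_value {
    my ($var) = @_;
    return (defined $var && length $var) ? 1 : 0;   # received and non-empty
}

foreach my $var ('red', '0', '', undef) {
    my $shown = defined $var ? "'" . $var . "'" : 'undef';
    print $shown, ': if ($var) says ', has_true_value($var),
          ', defined-and-length says ', has_any_value($var), "\n";
}
```

The two tests disagree only for '0', so the simple form is fine unless 0 is a legitimate answer for the field.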
Combining Program Files
• Applications so far have required two separate files: one to generate the form, and the other to process the form
  - Example: cgiform1.cgi and cgiform1_checker.cgi
  - Can test the return value of param() to combine these
• At least two advantages
  - With one file, it is easier to change arguments
  - It is easier to maintain one file
Combining Program Files
    if ( !param() ) {        # Check to see if there are any parameters.
        &create_form();      # No parameters, so this is the first time for the program: create the form.
    } else {
        &process_form();     # Must be some parameters to process, so call process_form.
    }
http://people.cs.uchicago.edu/~hai/hw4/cgiform2.cgi
Patterns in String Variables
• Many programming problems require matching, changing, or manipulating patterns in string variables.
  - An important use is verifying input fields of a form
    • This helps provide security against accidental or malicious attacks.
    • For example, if expecting a form field to provide a telephone number as input, your program needs a way to verify that the input comprises a string of seven digits.
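As a preview of where the lecture is heading, the seven-digit check can be sketched with a single match. The helper name is_seven_digits and the sample inputs are assumptions for illustration; the ^, $ and \d symbols it uses are explained in the regular expression slides later in this lecture.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $input is exactly seven digits.
sub is_seven_digits {
    my ($input) = @_;
    return $input =~ m/^\d\d\d\d\d\d\d$/ ? 1 : 0;
}

foreach my $input ('5551234', '555-1234', '55512345', 'call me') {
    print "$input: ", ( is_seven_digits($input) ? 'accepted' : 'rejected' ), "\n";
}
```

Only the first sample is accepted; the others contain a non-digit or the wrong number of digits.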
Four Different Constructs
• Will look at 4 different Perl string manipulation constructs:
  - The match operator enables your program to look for patterns in strings.
  - The substitute operator enables your program to change patterns in strings.
  - The split function enables your program to split strings into separate variables based on a pattern. (already covered)
  - Regular expressions provide a pattern matching language that can work with these operators and functions to work on string variables.
The Match Operator
• The match operator is used to test if a pattern appears in a string.
  - It is used with the binding operator (“=~”) to see whether a variable contains a particular pattern.
    if ( $name =~ m/edu/ ) {
        set of statements to execute
    }
  - The binding operator indicates to examine the contents of $name.
  - The match operator tries to match the pattern inside the slashes ("/"), in this case the pattern "edu".
  - The statements execute if 'edu' is ANYWHERE in the contents of the string variable $name.
Possible Values of $name
Value of $name | Test from Figure 7.1
'www.myschool.edu' | True because the string contains edu
'www.myschool.com' | False because edu is not in the string
'I like my education' | True because the string contains edu
'I Like My Education' | False because matching is case sensitive
'I liked umbrellas' | False because edu is not in the string
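The table can be reproduced by running the same match over each value. This is a small sketch in plain Perl; the helper name contains_edu is hypothetical and simply wraps the m/edu/ test.

```perl
use strict;
use warnings;

my @names = (
    'www.myschool.edu',
    'www.myschool.com',
    'I like my education',
    'I Like My Education',
    'I liked umbrellas',
);

# Hypothetical helper wrapping the match operator test.
sub contains_edu {
    my ($name) = @_;
    return $name =~ m/edu/ ? 'True' : 'False';
}

foreach my $name (@names) {
    print "$name: ", contains_edu($name), "\n";
}
```

Note the last value: "liked umbrellas" has the letters e, d and u in order, but a space separates them, so the pattern does not match.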
Other Delimiters?
• Slash (“/”) is the most common match pattern delimiter
  - Others are possible. For example, both of these use valid match operator syntax:
    if ( $name =~ m!Dave! ) {
    if ( $name =~ m<Dave> ) {
• The negated binding operator (“!~”) tests whether the pattern is NOT found:
    if ( $color !~ m/blue/ ) {
The Substitution Operator
• Similar to the match operator but also enables you to change the matched string.
  - Use with the binding operator (“=~”) to test whether a variable contains a pattern and, if it does, replace it
    $stringvar =~ s/ABC/abc/;
  - $stringvar: string variable to search and potentially substitute the pattern in.
  - ABC: pattern to search for.
  - abc: replacement to use if there is a match.
How It Works
• Replaces the first occurrence of the search pattern in the string variable with the change pattern.
• For example, the following changes the first occurrence of t to T:
    $name = "tom turtle";
    $name =~ s/t/T/;
    print "Name=$name";
• The output of this code would be
    Name=Tom turtle
Changing All Occurrences
• You can place a g (for global substitution) at the end of the substitution expression to change all occurrences of the target pattern string in the search string. For example,
    $name = "tom turtle";
    $name =~ s/t/T/g;
    print "Name=$name";
• The output of this code would be
    Name=Tom TurTle
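The two forms can be compared side by side. In this sketch the helper names replace_first and replace_all are hypothetical; each works on a copy of its argument so the original string is left alone.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns a changed copy of its argument.
sub replace_first {
    my ($text) = @_;
    $text =~ s/t/T/;       # first occurrence only
    return $text;
}

sub replace_all {
    my ($text) = @_;
    $text =~ s/t/T/g;      # every occurrence
    return $text;
}

my $name = 'tom turtle';
print 'first=', replace_first($name), "\n";   # first=Tom turtle
print 'all=',   replace_all($name),   "\n";   # all=Tom TurTle
```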
Using Translate
• A similar operator is called tr (for “translate”). It is useful for translating characters from uppercase to lowercase, and vice versa.
  - tr allows you to specify a range of characters to translate from and a range of characters to translate to:
    $name = "smokeY";
    $name =~ tr/a-z/A-Z/;
    print "name=$name";
  - Would output the following:
    name=SMOKEY
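A short sketch of tr in use, assuming the helper names to_upper and count_lowercase (both made up for this example). Besides translating, tr returns the number of characters it matched, which makes it handy for counting.

```perl
use strict;
use warnings;

# Hypothetical helper: return an uppercased copy of $text using tr.
sub to_upper {
    my ($text) = @_;
    $text =~ tr/a-z/A-Z/;
    return $text;
}

# With an empty replacement list tr leaves the string unchanged,
# so this only counts the lowercase letters.
sub count_lowercase {
    my ($text) = @_;
    my $count = ( $text =~ tr/a-z// );
    return $count;
}

my $name = 'smokeY';
print 'name=', to_upper($name), "\n";                       # name=SMOKEY
print 'lowercase letters=', count_lowercase($name), "\n";   # lowercase letters=5
```

For plain case conversion, Perl's built-in uc and lc functions are the more common choice; tr is the tool when you need an arbitrary character-to-character mapping.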
A Full Pattern Matching Example
    #!/usr/local/bin/perl
    use CGI ':standard';
    print header, start_html('Command Search');
    @PartNums = ( 'XX1234', 'XX1892', 'XX9510' );
    $com   = param('command');
    $prod  = param('uprod');
    $found = 0;                      # becomes 1 once the product number is validated
    if ( $com eq "ORDER" || $com eq "RETURN" ) {
        $prod =~ s/xx/XX/g;          # switch xx to XX
        if ( $prod =~ /XX/ ) {
            foreach $item ( @PartNums ) {
                if ( $item eq $prod ) {
                    print "VALIDATED command=$com prodnum=$prod";
                    $found = 1;
                }
            }
            if ( $found != 1 ) {
                print br, "Sorry Prod Num=$prod NOT FOUND";
            }
        } else {
            print br, "Sorry that prod num prodnum=$prod looks wrong";
        }
    } else {
        print br, "Invalid command=$com did not receive ORDER or RETURN";
    }
    print end_html;
Would Output The Following ...
[Figure: browser screenshot of the program's output]
Using Regular Expressions
• Regular expressions enable programs to match patterns more completely.
  - They actually make up a small language of special matching operators that can be employed to enhance the Perl string pattern matching.
The Alternation Operator
• The alternation operator ("|") looks for alternative strings for matching within a pattern.
  - That is, you use it to indicate that the program should match one pattern OR the other.
  - For example, a match statement with the pattern com|edu matches either com or edu; the next slide lists some possible matches based on the contents of $address.
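A quick way to see alternation at work is to try a pattern against a few strings. In this sketch the helper name has_com_or_edu is hypothetical, and 'www.mysite.org' is an extra sample added here as a non-matching case.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping a match that uses the alternation operator.
sub has_com_or_edu {
    my ($address) = @_;
    return $address =~ /com|edu/ ? 1 : 0;
}

foreach my $address ('www.mysite.com', 'Welcome to my site',
                     'Time for education', 'www.mysite.org') {
    print "$address: ", ( has_com_or_edu($address) ? 'match' : 'no match' ), "\n";
}
```

"Welcome to my site" matches because the letters com appear inside "Welcome"; the pattern is not limited to the end of a domain name.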
Example Alternation Operator
Match Statement | Possible Matching String Values for $address
if ( $address =~ /com|edu/ ) { | "www.mysite.com", "Welcome to my site", "Time for education", "www.mysite.edu"
Parentheses For Groupings
• You use parentheses within regular expressions to specify groupings. For example, the following matches a $name value of Dave or David.
Match Statement | Possible Matching String Values for $name
if ( $name =~ /Dav(e|id)/ ) { | "Dave", "David", "Dave was here", "How long before David comes home"
Special Character Classes
• Perl has a special set of character classes for shorthand pattern matching
• For example, consider these two statements:
    if ( $name =~ m/ / ) {
  will match $name with an embedded space character
    if ( $name =~ m/\s/ ) {
  will match $name with an embedded space, tab, or newline
Special Character Classes
Character Class | Meaning
\s | Matches a single whitespace character (space, tab, newline, return, or formfeed). For example, the following matches "Apple Core", "Alle y", and "Here you go"; it does not match "Alone": if ( $name =~ m/e\s/ ) {
\S | Matches any single character that is not a space, tab, newline, return, or formfeed. For example, the following matches "ZT", "YT", and ";T": if ( $part =~ m/\ST/ ) {
Special Character Classes - II
Character Class | Meaning
\w | Matches any word character (uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "Apple", "Time", "Part time", "time_to_go", " Time", and "1234"; it does not match "#%^&": if ( $part =~ m/\w/ ) {
\W | Matches any nonword character (not uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "A*B" and "A{B", but not "A**B", "AB*", "AB101", or "1234": if ( $part =~ m/A\WB/ ) {
Special Character Classes - III
Character Class | Meaning
\d | Matches any valid numerical digit (that is, any number 0-9). For example, the following matches "B12abc", "The B1 product is late", "I won bingo with a B9", and "Product B00121"; it does not match "B 0", "Product BX 111", or "Be late 1": if ( $part =~ m/B\d/ ) {
\D | Matches any non-numerical character (that is, any character not a digit 0-9). For example, the following matches "AB1234", "Product number 1111", "Number VG928321212", "The number_A1234", and "Product 1212"; it does not match "1212" or "PR12": if ( $part =~ m/\D\D\d\d\d\d/ ) {
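The claims in the character class tables are easy to check by running the patterns. This sketch stores a few of them with qr//, Perl's way of keeping a compiled pattern in a variable (not covered in this lecture, so treat it as an assumption of the example); the helper name matches is also hypothetical.

```perl
use strict;
use warnings;

# Each entry: [ pattern, string, result expected from the tables ]
my @checks = (
    [ qr/e\s/,          'Apple Core',     1 ],
    [ qr/e\s/,          'Alone',          0 ],
    [ qr/\ST/,          ';T',             1 ],
    [ qr/A\WB/,         'A*B',            1 ],
    [ qr/A\WB/,         'A**B',           0 ],
    [ qr/B\d/,          'Product B00121', 1 ],
    [ qr/B\d/,          'B 0',            0 ],
    [ qr/\D\D\d\d\d\d/, 'AB1234',         1 ],
    [ qr/\D\D\d\d\d\d/, 'PR12',           0 ],
);

# Hypothetical helper: 1 if $string matches $pattern, else 0.
sub matches {
    my ($pattern, $string) = @_;
    return $string =~ $pattern ? 1 : 0;
}

foreach my $check (@checks) {
    my ($pattern, $string, $expected) = @$check;
    my $got = matches($pattern, $string);
    print $string, ' against ', $pattern, ': ',
          ( $got ? 'match' : 'no match' ),
          ( $got == $expected ? '' : '  (differs from the table)' ), "\n";
}
```

Adding your own rows to @checks is a cheap way to test your understanding of a class before using it in a form checker.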
Setting Specific Patterns w/ Quantifiers
• Character quantifiers let you look for very specific patterns
• For example, use the dollar sign (“$”) to match if a string ends with a specified pattern.
    if ( $Name =~ /Jones$/ ) {
• Matches “John Jones” and “The guilty party is Jones”, but not “Jones is here”.
Selected Perl Character Quantifiers I
Character Quantifier | Meaning
^ | Matches when the pattern that follows it starts the string. For example, the following matches "Smith is OK", "Smithsonian", and "Smith, Black": if ( $name =~ m/^Smith/ ) {
$ | Matches when the pattern that precedes it ends the string. For example, the following matches "the end", "Tend", and "Time to Bend": if ( $part =~ m/end$/ ) {
Selected Perl Character Quantifiers II
Character Quantifier | Meaning
+ | Matches one or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", and "ABBB101 is the right part": if ( $part =~ m/^AB+101/ ) {
* | Matches zero or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", "A101", and "A101 is broke": if ( $part =~ m/^AB*101/ ) {
Selected Perl Character Quantifiers III
Character Quantifier | Meaning
. | A wildcard symbol that matches any one character (except a newline). For example, the following matches "Stop", "Soap", "Szxp", and "Soap is good"; it does not match "Sxp": if ( $name =~ m/^S..p/ ) {
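The anchors, quantifiers and wildcard can be exercised the same way. In the sketch below the patterns come from the tables above, stored with qr// (Perl's syntax for a compiled pattern held in a variable). The non-matching strings 'Mr. Smith', 'endless' and 'AC101' are extra samples invented for this example, and the helper name count_surprises is hypothetical.

```perl
use strict;
use warnings;

# Each entry: [ pattern, strings that should match, strings that should not ]
my @cases = (
    [ qr/^Smith/,  [ 'Smith is OK', 'Smithsonian' ],        [ 'Mr. Smith' ] ],
    [ qr/end$/,    [ 'the end', 'Time to Bend' ],            [ 'endless' ] ],
    [ qr/^AB+101/, [ 'AB101', 'ABBB101 is the right part' ], [ 'A101' ] ],
    [ qr/^AB*101/, [ 'AB101', 'A101 is broke' ],             [ 'AC101' ] ],
    [ qr/^S..p/,   [ 'Stop', 'Soap is good' ],               [ 'Sxp' ] ],
);

# Hypothetical helper: count results that disagree with the expectations above.
sub count_surprises {
    my $surprises = 0;
    foreach my $case (@cases) {
        my ($pattern, $should, $should_not) = @$case;
        foreach my $s (@$should)     { $surprises++ unless $s =~ $pattern; }
        foreach my $s (@$should_not) { $surprises++ if     $s =~ $pattern; }
    }
    return $surprises;
}

print 'unexpected results: ', count_surprises(), "\n";   # unexpected results: 0
```

The only difference between the + and * rows is the string A101: with + at least one B is required, with * none is.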
Building Regular Expressions That Work
• Regular expressions are very powerful, but they can also be virtually unreadable.
  - When building one, start with a simple regular expression and then refine it incrementally.
    • Build a piece and then test
  - The following example will build a regular expression for a date checker
    • mm/dd/yyyy format (for example, 05/05/2002 but not 5/12/01).
Building Regular Expressions That Work
1. Determine the precise field rules.
  - What is valid input and what is not valid input?
  - E.g., for a date field, think through the valid and invalid rules for the field.
  - You might allow 09/09/2002 but not 9/9/2002 or Sep/9/2002.
  - Work through several examples as follows:
Work through several examples
Rule | Reject These
05/05/2002: use / as a separator | 05-05-2002: require slash delimiters
05/05/2002: use a four-digit year | 05/05/02: four-digit years only
05/05/2001: contain only a date | "The date is 05/05/2002": only date fields; "05/05/2002 is my date": only date fields
05/05/2001: two digits for months and days | 5/05/2002: two-digit months only; 05/5/2002: two-digit days only; 5/5/2002: two-digit days and months only
Building Regular Expressions that Work
2. Get form and form-handling programs working
  - Build a sending form with the input field
  - Build the receiving program that accepts the field.
  - For example, a first-cut receiving program:
    $date = param('udate');
    if ( $date =~ m/.+/ ) {          # .+ matches any sequence of characters
        print 'Valid date=', $date;
    } else {
        print 'Invalid date=', $date;
    }
Building Regular Expressions that Work
3. Start with the most specific term possible.
  - For example, slashes must always separate two characters (for the month), followed by two more characters (for the day), followed by four characters (for the year).
    if ( $date =~ m{../../....} ) {
  - The pattern reads: any 2 characters, a slash, any 2 characters, a slash, any 4 characters
Building Regular Expressions that Work
4. Anchor and refine. (Use ^ and $ when possible)
    if ( $date =~ m{^\d\d/\d\d/\d\d\d\d$} ) {
  - Starts with 2 digits, has 2 digits in the middle, and ends with 4 digits
Building Regular Expressions that Work
5. Get more specific if possible.
  - The first digit of the day can be only 0, 1, 2 or 3. For example, 05/55/2002 is clearly an illegal date.
  - Only years that start with 2 are allowed, because we don’t care about dates like 05/05/1999 or 05/05/3003.
Building Regular Expressions that Work
• Add these rules below
    if ( $date =~ m{^\d\d/[0-3]\d/2\d\d\d$} ) {
  - Day starts with a digit from 0 to 3
  - Year starts with a “2”
• Now the regular expression recognizes input like 09/99/2001 and 05/05/4000 as illegal.
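To see what each refinement buys, the patterns from steps 2 to 5 can be run against the same inputs. This is a sketch in plain Perl: the step labels and the helper name accepted_by are made up for illustration, and qr// is used only to hold each pattern in a list.

```perl
use strict;
use warnings;

# The patterns from steps 2 to 5, from loosest to strictest.
my @steps = (
    [ 'any characters',   qr{.+} ],
    [ 'slashes in place', qr{../../....} ],
    [ 'anchored digits',  qr{^\d\d/\d\d/\d\d\d\d$} ],
    [ 'range limits',     qr{^\d\d/[0-3]\d/2\d\d\d$} ],
);

# Hypothetical helper: return the labels of the steps that accept $date.
sub accepted_by {
    my ($date) = @_;
    return map { $_->[0] } grep { $date =~ $_->[1] } @steps;
}

foreach my $date ('05/05/2002', '5/05/2002', 'The date is 05/05/2002',
                  '09/99/2001', '05/05/4000') {
    my @labels = accepted_by($date);
    print "$date: ", ( @labels ? join(', ', @labels) : 'rejected by every step' ), "\n";
}
```

Only 05/05/2002 passes all four patterns. Even the last pattern is still an approximation: it accepts impossible dates such as 99/39/2000, so further refinement (or a separate range check on the numbers) would be needed for strict validation.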
Tip: Regular Expression Special Variables
• Perl regexes set several special scalar variables:
  - $& will be equal to the first matching text,
  - $` will be the text before the match, and
  - $' will be the text after the first match.
    $name = '*****Marty';
    if ( $name =~ m/\w/ ) {
        print "got match at=$& ";
        print "B4=$` after=$'";
    } else {
        print "Not match";
    }
• would output: got match at=M B4=***** after=arty
Full Example Program
    #!/usr/local/bin/perl
    use CGI ':standard';
    print header, start_html('Date Check');
    $date = param('udate');
    if ( $date =~ m{^\d\d/[0-3]\d/2\d\d\d$} ) {
        print 'Valid date=', $date;
    } else {
        print 'Invalid date=', $date;
    }
    print end_html;
Would Output The Following ...
[Figure: browser screenshot of the program's output]