
Post on 13-Jan-2016


1

Programming in Unix

Regular Expressions

These expressions are used in grep, sed, awk, ed, vi and the various shells

2

Regular Expressions

A regular expression is a pattern to be matched

Perl's regular expression syntax is roughly a superset of what these tools offer

Most regular expressions used in Unix tools can be used in Perl, sometimes with small syntax changes

3

Regular Expressions

The string abc can be a regular expression by enclosing the string in slashes:

$_ = "I know my abc's";
if (/abc/) {
    print $_;
}

4

Regular Expressions

Single character patterns - a character in the expression must match a single character in the string

The dot “.” matches any single character other than “\n”

/r.g/ would match rug or rag

5

Regular Expressions

Metacharacters or escape sequences allow you to match certain conditions in a string. \ | ( ) [ { ^ $ * + ? . are all metacharacters

A backslash in front of any metacharacter makes it non-special

To match 5.18, use /5\.18/

To match 01\20\03, use /01\\20\\03/
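As a minimal sketch of the escaping rule, the script below compares an unescaped dot with an escaped one and then uses Perl's built-in quotemeta to escape a literal string automatically. The sample strings and variable names are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my $version   = "perl 5.18 released";
my $lookalike = "perl 5x18 released";

# An unescaped dot matches any character, so both strings match.
my $loose_hits = grep { /5.18/ } ($version, $lookalike);

# An escaped dot matches only a literal period, so one string matches.
my $strict_hits = grep { /5\.18/ } ($version, $lookalike);

# quotemeta puts a backslash in front of every metacharacter in a string.
my $literal      = quotemeta("01\\20\\03");   # the text 01\20\03
my $path_matches = ("date 01\\20\\03" =~ /$literal/) ? 1 : 0;

print "loose=$loose_hits strict=$strict_hits path=$path_matches\n";
# prints: loose=2 strict=1 path=1
```

Using quotemeta (or \Q...\E inside the pattern) avoids hand-counting backslashes when the text to match comes from a variable.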

6

Regular Expressions

Some escape sequences you might see

\a An alarm bell

\d A digit between 0 and 9

\D A non-digit character

\e The character generated by pressing Escape

\f A form feed

7

Regular Expressions

\l Lowercases the next character

\r A carriage return

\s A whitespace character

\S A non whitespace character

\U Uppercases the characters that follow, up to \E (\u uppercases only the next character)

8

Regular Expressions

Pattern /m./ matches any two character pattern that starts with m

my or me would be examples of matches

9

Regular Expressions

A character class uses a list of possible characters enclosed in brackets [ ]

It will match any one character listed within the brackets

[a-z] will match any single lowercase letter (a range can be used with the hyphen)

A negated character class, written with ^ right after the opening bracket as in [^a-z], matches any single character not in the list
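The following sketch shows a character class and its negated form side by side. The word list is a made-up sample, and the ^, $ and + used to anchor and repeat the class are explained on later slides.

```perl
use strict;
use warnings;

# Hypothetical sample words, chosen only for illustration.
my @words = qw(cat Cat c4t dog);

# [a-z] matches one lowercase letter; ^...+$ requires the whole word
# to consist of such letters.
my @all_lower = grep { /^[a-z]+$/ } @words;

# [^a-z] matches one character that is NOT a lowercase letter.
my @has_other = grep { /[^a-z]/ } @words;

print "all lowercase: @all_lower\n";    # prints: all lowercase: cat dog
print "has other chars: @has_other\n";  # prints: has other chars: Cat c4t
```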

10

Regular Expressions

Grouping Patterns - one or more of...

Sequence - e.g., abc means a followed by b followed by c

Multipliers

* means zero or more of the immediately previous character

+ means one or more of the immediately previous character

? means zero or one of the immediately previous character

11

Regular Expressions

General Multiplier

$_ = "fred xxxxxxxxxx barney";
/x{5,10}/; #would look for 5 to 10 repetitions of the letter x
s/x{5,10}/and/; #would substitute and for the x's
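The multipliers can be compared on a few short strings. This is only a sketch: the sample strings are assumptions, and each pattern is anchored with ^ and $ (covered on a later slide) so that the whole string has to match.

```perl
use strict;
use warnings;

# Hypothetical samples: a, then zero, one, or three b's, then c.
my @samples = ("ac", "abc", "abbbc");

my @star  = grep { /^ab*c$/ } @samples;   # zero or more b's
my @plus  = grep { /^ab+c$/ } @samples;   # one or more b's
my @maybe = grep { /^ab?c$/ } @samples;   # zero or one b

print "*: @star\n";    # prints: *: ac abc abbbc
print "+: @plus\n";    # prints: +: abc abbbc
print "?: @maybe\n";   # prints: ?: ac abc

# General multiplier: 5 to 10 x's are replaced by the word "and".
my $text = "fred xxxxxxxxxx barney";
(my $replaced = $text) =~ s/x{5,10}/and/;
print "$replaced\n";   # prints: fred and barney
```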

12

Regular Expressions

Parentheses

(a) matches an a

([a-z]) matches any single lowercase letter

Alternation

Matches exactly one of the alternatives: a|b|c

/[abc]/ works the same way for single characters

13

Regular Expressions

Anchoring Patterns

Generally, when a pattern is matched against a string, it is evaluated from left to right, matching at the first opportunity

\b anchor requires a word boundary at the indicated point

\B requires that there is not a word boundary

^ matches the beginning of the string

$ matches the end of the string

14

Regular Expressions

/fred\b/; #matches fred but not frederick

/\bmo/; #matches moe but not Elmo

/\bFred\b/;#matches Fred but not Freddy or AlFred

/\b\+\b/; #matches "x+y" but not "++" or " + "
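A short sketch can check the word-boundary and string anchors against the words named on this slide. The words barney and rebar used for ^ and $ are extra samples assumed for illustration.

```perl
use strict;
use warnings;

# \b after fred: the match must end at a word boundary.
my @fred_end = grep { /fred\b/ } qw(fred frederick);

# \b before mo: the match must start at a word boundary.
my @mo_start = grep { /\bmo/ } qw(moe Elmo);

# \b on both sides: Fred must stand alone as a word.
my @fred_word = grep { /\bFred\b/ } qw(Fred Freddy AlFred);

# ^ and $ anchor to the start and end of the string.
my @starts = grep { /^bar/ } qw(barney rebar);
my @ends   = grep { /bar$/ } qw(barney rebar);

print "@fred_end | @mo_start | @fred_word | @starts | @ends\n";
# prints: fred | moe | Fred | barney | rebar
```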

15

Regular Expressions

Precedence (highest to lowest)

Parentheses ( )

Quantifiers * + ? { }

Anchors and sequence ^ $ \b \B

Alternation |

16

Regular Expressions

Matches with m// (m not needed when using //)

A search using /pattern/ is actually a shortcut for m/pattern/

You may choose any pair of delimiters to quote the contents

Where you used /fred/ you can use m(fred) or m,fred, or m<fred> or m!fred!

17

Regular Expressions

To use a delimiter other than the slash (/), put the letter m in front of the new delimiter, e.g. m@/usr/etc@

18

Regular Expressions

Binding Operator

=~ selects a different target: it tells Perl to match the pattern on the right against the string on the left (instead of matching $_)

Ignoring case with /i

[yY] matches either upper or lower case y

/^procedure/i #matches procedure at the start of the string in any mix of upper and lower case

19

Regular Expressions

Case shifting

$_ = "I saw Barney with Fred.";
s/(fred|barney)/\U$1/gi;
#Now $_ is "I saw BARNEY with FRED."

20

Regular Expressions

The split operator will break up a string according to a separator. This is useful for tab-separated or colon-separated data

@fields = split /:/, "abc:def:g:h";
Gives you ("abc", "def", "g", "h")

@fields = split /:/, "abc:def::g:h";
Gives you ("abc", "def", "", "g", "h")

21

Regular Expressions

It is common to split on whitespace using /\s+/ as the pattern

All whitespace runs are treated as equal to a single space

$input = "This is a \t test.\n";
split /\s+/, $input;
will give you the result "This", "is", "a", "test."
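The split behaviours described here can be checked by counting the fields each call returns. This is a sketch with made-up input strings; the last two calls go slightly beyond the slide and show how leading whitespace is handled, including Perl's special single-space pattern ' '.

```perl
use strict;
use warnings;

# An empty field between two separators is kept.
my @fields = split /:/, "abc:def::g:h";

# Runs of whitespace count as one separator; the trailing newline
# produces an empty trailing field, which split drops.
my @words = split /\s+/, "This  is a \t test.\n";

# Leading whitespace produces an empty leading field with /\s+/ ...
my @leading = split /\s+/, "  leading space";

# ... while the special pattern ' ' skips leading whitespace first.
my @skipped = split ' ', "  leading space";

print scalar(@fields), " ", scalar(@words), " ",
      scalar(@leading), " ", scalar(@skipped), "\n";
# prints: 5 4 3 2
```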

22

Regular Expressions

Substitutions

$_ = "foot fool buffoon";
s/foo/bar/; #$_ is now "bart fool buffoon"

s/// will make just one replacement

s/foo/bar/g; #$_ is now "bart barl bufbarn"

/g globally replaces on all possible matches

23

Regular Expressions

The join function takes a list of values and glues them together. Performs the opposite of split.

For example

$info = join("\n", "Name", "Address", "Zip Code");

print $info will display

Name
Address
Zip Code

24

Regular Expressions

Or take a list

@values = (2, 4, 6, 8, 10);
$new_value = join "-", @values;
# $new_value looks like "2-4-6-8-10"
$new_value = join ":", @values;
# $new_value looks like "2:4:6:8:10"
$new_value = join "-", "cat", @values;
# $new_value looks like "cat-2-4-6-8-10"
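Because join performs the opposite of split, a record can be taken apart, changed, and put back together. The sketch below assumes a made-up colon-separated record; the field meanings are hypothetical.

```perl
use strict;
use warnings;

my $record = "mike:x:501:20";          # hypothetical colon-separated record

my @parts   = split /:/, $record;      # ("mike", "x", "501", "20")
my $rebuilt = join ":", @parts;        # same text as $record

$parts[2] = 502;                       # change one field
my $updated = join ":", @parts;

my $dashed = join "-", "cat", 2, 4, 6; # extra scalars join like list items

print "$rebuilt\n$updated\n$dashed\n";
# prints:
# mike:x:501:20
# mike:x:502:20
# cat-2-4-6
```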

25

Filehandles and File Tests

What is a filehandle?

An I/O connection between your Perl process and the outside world.

Filehandle names are like the names for labeled blocks.

They are easy to confuse with future reserved words, so the recommendation is to use all UPPERCASE letters in your filehandle names.

26

Filehandles and File Tests

The syntax is like:

open (FILEHANDLE, "somename");

FILEHANDLE is the new filehandle and somename is the external filename (such as a file or device)

To open a file for writing, use the same open statement but prefix the filename with a greater-than sign (caution: this will overwrite any existing file with the same name)

open (OUT, ">outfile");

27

Filehandles and File Tests

Syntax continued:

To open a file to append data to it:

open (LOGFILE, ">>mylogfile");

All forms of open return true for success and false for failure

When finished with a filehandle you close it:

close(LOGFILE);

Reopening a filehandle will close the previous version

28

Filehandles and File Tests

When a filehandle does not open successfully you can use the die function to report that an error has occurred

The unless statement can be used as a logical or: unless (this) { that; } is equivalent to this || that;

The unless statement used to report a failed open:

unless (open (DATAPLACE, ">/tmp/dataplace")) {
    print "Sorry, I couldn't create your file";
} else {
    #the rest of your program
}

29

Filehandles and File Tests

Or, make it even simpler with:

unless (open DATAPLACE, ">/tmp/dataplace") {
    die "Sorry, I couldn't create your file";
}

or

open (DATAPLACE, ">/tmp/dataplace") ||
    die "Sorry, I couldn't create your file";
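The write, append, and read modes can be exercised together in one short script. This is a sketch under stated assumptions: it uses the three-argument form of open with lexical filehandles (a common alternative to the bareword handles shown on the slides), a temporary directory from File::Temp, and a made-up file name.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory; "mylogfile" is a hypothetical name.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, "mylogfile");

# ">" creates the file, or overwrites it if it already exists.
open(my $out, ">", $path) or die "Cannot create $path: $!";
print $out "first line\n";
close($out) or die "Cannot close $path: $!";

# ">>" appends to the end of the existing file.
open(my $log, ">>", $path) or die "Cannot append to $path: $!";
print $log "second line\n";
close($log) or die "Cannot close $path: $!";

# "<" opens the file for reading.
open(my $in, "<", $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);

print scalar(@lines), " lines\n";   # prints: 2 lines
```

Including $! in the die message reports the operating system's reason for the failure, which the fixed "Sorry" text cannot do.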

30

Filehandles and File Tests

The -x File Tests

Suppose you wanted to make sure that there wasn't a file by that name (so you don't blow away valuable data) when you open and write to a file

Use file tests (see page 157-8)

-e tests whether a file or directory exists
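One way to apply the -e test is to refuse to open a file for writing when something by that name already exists. In the sketch below, create_if_absent is a hypothetical helper written for this example, and the file name and temporary directory are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: returns 1 if it created the file,
# 0 if a file or directory by that name already existed.
sub create_if_absent {
    my ($path) = @_;
    return 0 if -e $path;    # -e is true when the name already exists
    open(my $out, ">", $path) or die "Cannot create $path: $!";
    print $out "new data\n";
    close($out) or die "Cannot close $path: $!";
    return 1;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, "dataplace");

my $first  = create_if_absent($path);   # file is created
my $second = create_if_absent($path);   # existing file is left alone

print "first=$first second=$second\n";  # prints: first=1 second=0
```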

31

Formats

Helps you generate simple, formatted reports and charts

Keeps track of number of lines per page, current page

Use “format” to declare and “write” to execute

32

Declaring a Format

format MYNAME =
FORMLIST
.

Note: if MYNAME is omitted, the format is named STDOUT

FORMLIST is a list containing the following:

A comment (start the line with #)

A "picture" giving the output for one output line

An argument line supplying values to plug into the previous "picture" line

33

Special Values

FORMAT_NAME_TOP defines text that will appear at the top of each page

FORMAT_NAME section defines format and variables for each line that should print as the body of the report

You should define the format and format_top together somewhere in your program (often seen at the end).

34

Example

# a report on the /etc/passwd file
format MY_REPORT_TOP =
Password File Report
Name Login Uid Gid Shell Home
-------------------------------------------------------------------
.

35

Example

#how to send output to the screen
format STDOUT =
Password File Report
Name Login Uid Gid Shell Home
-------------------------------------------------------------------
.

write; # STDOUT is already open, so no open call is needed

36

Example (cont...)

format MY_REPORT =
@<<<<< @||||||| @<<<< @>>>> @>>>> @<<<<<<<<<<<<
$name, $login, $uid, $gid, $shell, $home
.

Then to print this when you want:

write MY_REPORT;

Example of Code

#!/usr/local/bin/perl -w

print "This is an address label program\n";

print "Enter your name: \n";

$name=<>;

print "Enter your street address: \n";

$street=<>;

print "Enter your City, State, and Zip: \n";

$therest=<>;

open (AddressLabel,">myaddrlist");

write (AddressLabel);

format AddressLabel =

==================================

| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |

$name

| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |

$street

| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |

$therest

==================================

.

38

Example Entering Data

# addrlabel.pl

This is an address label program

Enter your name:

Mike

Enter your street address:

14590 Roller Coaster Rd

Enter your City, State, and Zip:

Denver, CO 80931

39

Example Output to File

# cat myaddrlist

==================================

| Mike |

| 14590 Roller Coaster Rd |

| Denver, CO 80931 |

==================================

40

Format Pictures

@ or ^ indicates substitution at run-time

< left justify

> right justify

| centering

If the variable has more characters than the format picture, it will be truncated

To avoid truncating use "@*" on a format line by itself.

41

The ^ Picture

Starting a field with ^ allows you to print part of the text with the first call

The next time you reference it, the string will only contain that part of the string that has not been printed and the next n characters will be printed and so on...

Warning!: this does destroy the original value of the variable so store it off if you will need it again.

42

Example of the ^

# a report from a bug report form
format BUG_REPORT =
Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$subject
From: @<<<<<<<<<<<<<< Priority: @<<<<<<<<<<
$from, $priority
Description:
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<...
$description
.

43

Special Variables

$~ contains $FORMAT_NAME

$^ contains $FORMAT_NAME_TOP

$% contains the current output page number

$= contains number of lines per page

$- contains lines remaining on current page (set to zero to force a new page)

44

To Use Special Variables

You can use these by "selecting":

$myform = select(MYFORMAT);
$~ = "My_Other_Format";
$^ = "My_Top_Format";
select($myform);
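The select pattern above can be tried end to end by attaching named formats to a filehandle and reading back what write produced. This is a sketch, not the slides' own program: the format names LABEL and LABEL_TOP, the sample data, and the temporary file from File::Temp are all assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Package variables read by the format; hypothetical sample data.
our ($name, $login) = ("Mike", "mike");

format LABEL_TOP =
Name       Login
----------------
.

format LABEL =
@<<<<<<<<< @<<<<<
$name,     $login
.

my ($out, $path) = tempfile(UNLINK => 1);

my $previous = select($out);   # make $out the currently selected handle
$^ = "LABEL_TOP";              # top-of-page format for the selected handle
$~ = "LABEL";                  # body format for the selected handle
select($previous);             # restore the previously selected handle

write($out);                   # emits the top-of-page text, then one body line
close($out) or die "Cannot close $path: $!";

open(my $in, "<", $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);

print @lines;
```

Setting $~ and $^ only affects the handle that is selected at that moment, which is why the previous handle is saved and restored around the assignments.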
