PRX Functions: There is Hardly Anything Regular About Them!
Ken Borowiak

Regular Expressions
A string that describes a PATTERN

Why Should You Care About Regex?
• Flexibility
– INDEX
– Colon modifier
– LIKE operator in a WHERE clause
• Ubiquity
– SAS V9
– Oracle 10g
– Java
– Perl, grep, sed
– Text editors: SAS Enhanced Editor, TextPad, etc.
– Applications: ODS tagsets, and more
• Portable syntax
• Tons of documentation
• Assert your geekness, nerdness, and coolness

What Can You Do With Regex?
• Match
– Subsetting
– Conditional logic
– Validation

ODM – ISO Time Validation
<xs:simpleType name="time">
  <xs:restriction base="xs:time">
    <xs:pattern value="(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)" />
  </xs:restriction>
</xs:simpleType>
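The same pattern can be tried out in a DATA step. The following is a minimal sketch, not part of the original slides: the data set name `time_check`, the variable names `time` and `valid`, and the test values are all made up for illustration. A start anchor and an end anchor that tolerates trailing blanks are added, on the assumption that the schema pattern is meant to cover the whole value.

```sas
data time_check;
  length time $20;
  infile datalines truncover;
  input time $20.;
  /* Pattern copied from the xs:pattern value. The leading caret and the
     trailing whitespace-plus-dollar are additions: a Perl regex is not
     implicitly anchored, and SAS pads character values with blanks. */
  valid = prxmatch(
    '/^(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)\s*$/',
    time) > 0;
  datalines;
09:30:00
23:59:59.123
14:05:00Z
08:15:00+05:30
24:00:00
9:30:00
12:60:00
;
run;

proc print data=time_check;
run;
```

The first four values are intended to pass (valid=1). The last three are intended to fail: the hour is out of range, the hour has one digit, and the minute is out of range.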

What Can You Do With Regex?
• Match
• Extract
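Extraction is not demonstrated in these slides. As a hedged sketch, one common approach uses capture groups with PRXPARSE, PRXMATCH, and PRXPOSN. The text value is borrowed from the sample data used later in the deck. The variable names `re`, `numer`, and `denom` are hypothetical.

```sas
data _null_;
  length text $40 numer denom $8;
  text = 'MINI-ME(1/8th size of dr evil)';
  /* Two capture groups: digits before and after a literal slash. */
  re = prxparse('/(\d+)\/(\d+)/');
  /* PRXPOSN reads the capture buffers filled by the preceding PRXMATCH. */
  if prxmatch(re, text) then do;
    numer = prxposn(re, 1, text);
    denom = prxposn(re, 2, text);
  end;
  put numer= denom=;
run;
```

With this input the log should show numer=1 and denom=8.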

What Can You Do With Regex?
• Match
• Extract
• Substitution (Find-&-Replace)
– Compression
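Substitution is not shown in code in these slides. A minimal sketch with PRXCHANGE follows. It reads "compression" as collapsing runs of blanks, which is an assumption about what the slide means. The variable names `raw`, `fixed`, and `squeezed` and the input value are made up for illustration.

```sas
data _null_;
  length raw fixed squeezed $40;
  raw = 'mr    bIgglesWorTH';
  /* Find-and-replace: swap a leading 'mr' (any case) for 'Mr.', once. */
  fixed = prxchange('s/^mr\b/Mr./i', 1, raw);
  /* Compression, read here as collapsing each run of blanks into one
     blank; a times argument of -1 replaces every occurrence. */
  squeezed = prxchange('s/ +/ /', -1, fixed);
  put fixed= squeezed=;
run;
```

The second call leaves a single blank between 'Mr.' and 'bIgglesWorTH'.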

PRX* Functions
• New in SAS V9
• Regex engine of Perl 5.6.1

Sample Data
obs name
1 MR Bigglesworth
2 Mini-mr biggggleswerth
3 Mr. Austin D. Powers
4 dr evil
5 MINI-ME(1/8th size of dr evil)
6 mr bIgglesWorTH
7 Mi$$e$ Vanessa Kensington
8 Sc0tt Evil
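To run the PROC PRINT steps in this deck, the sample data has to exist as a data set. The slides do not show how it was built, so the following is one possible sketch. The data set name `characters` and the variable `name` come from the later code; the length of 40 is an assumption.

```sas
data characters;
  length name $40;
  infile datalines truncover;
  /* Read each line as one name, up to 40 characters. */
  input name $40.;
  datalines;
MR Bigglesworth
Mini-mr biggggleswerth
Mr. Austin D. Powers
dr evil
MINI-ME(1/8th size of dr evil)
mr bIgglesWorTH
Mi$$e$ Vanessa Kensington
Sc0tt Evil
;
run;
```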

Matching via PRXMATCH
proc print data=characters label ;
  where prxmatch('/Mr/', name) > 0 ;
run ;
RESULT
obs name
3 Mr. Austin D. Powers
IMPORTANT POINT
Default setting is case-sensitive

Match 'M' followed by 'R' or 'r'
proc print data=characters label ;
  where prxmatch('/M[Rr]/', name) ;
run ;
CHARACTER CLASS: [Rr] matches either 'R' or 'r'
RESULT
obs name
1 MR Bigglesworth
3 Mr. Austin D. Powers

Match 'M' followed by 'R' or 'rs'
proc print data=characters label ;
  where prxmatch('/M(R|rs)/', name) ;
run ;
ALTERNATION: (R|rs) matches either 'R' or 'rs'
RESULT
obs name
1 MR Bigglesworth

Case Insensitive Search for ‘MR’
proc print data=characters label ;
  where prxmatch('/MR/i', name) ;
run ;
MODIFIER: the i after the closing delimiter makes the match case-insensitive
RESULT
obs name
1 MR Bigglesworth
2 Mini-mr biggggleswerth
3 Mr. Austin D. Powers
6 mr bIgglesWorTH

Case Insensitive Search for ‘MR’ at Start of Field
proc print data=characters label ;
  where prxmatch('/^MR/i', name) ;
run ;
ANCHOR: ^ ties the match to the start of the field
RESULT
obs name
1 MR Bigglesworth
3 Mr. Austin D. Powers
6 mr bIgglesWorTH

Metacharacters
• [ Beginning of character class
• ] End of character class
• ^ Beginning-of-field anchor (when in the 1st position of the regex)
• [^ ] Negated character class
• ( Beginning of grouping for alternation
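The negated character class is listed above but not demonstrated in the slides. As a hedged sketch, it can flag names containing any character other than a letter, blank, period, or hyphen. The data set name `names` and the choice of allowed characters are assumptions made for illustration. The rows are taken from the deck's sample data.

```sas
data names;
  length name $40;
  infile datalines truncover;
  input name $40.;
  datalines;
Mr. Austin D. Powers
Mini-mr biggggleswerth
MINI-ME(1/8th size of dr evil)
Mi$$e$ Vanessa Kensington
Sc0tt Evil
;
run;

/* [^A-Za-z .-] matches one character that is NOT a letter, blank,
   period, or hyphen (a hyphen placed last in the class is literal). */
proc print data=names;
  where prxmatch('/[^A-Za-z .-]/', name);
run;
```

The last three names should be printed, because they contain parentheses and digits, dollar signs, and a zero respectively.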

More Metacharacters
• . Match any character
QUANTIFIERS
• ? Match preceding subexpression 0 or 1 times
• * Match preceding subexpression 0 or more times
• + Match preceding subexpression 1 or more times
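A small sketch can make the difference between the three quantifiers concrete. The test strings and the variable names `zero_or_one`, `zero_or_more`, and `one_or_more` are made up for illustration. Each pattern is anchored at both ends so that the count of 'g' characters decides the outcome.

```sas
data _null_;
  length text $12;
  do text = 'biles', 'bigles', 'biggles';
    /* The trailing whitespace allowance absorbs the blank padding
       of the $12 variable before the end anchor. */
    zero_or_one  = prxmatch('/^big?les\s*$/', text) > 0;
    zero_or_more = prxmatch('/^big*les\s*$/', text) > 0;
    one_or_more  = prxmatch('/^big+les\s*$/', text) > 0;
    put text= zero_or_one= zero_or_more= one_or_more=;
  end;
run;
```

'biles' (no 'g') should satisfy ? and * but not +. 'bigles' should satisfy all three. 'biggles' should satisfy * and + but not ?.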

Matching a Metacharacter
Case Insensitive Search for ‘MR.’
proc print data=characters label ;
  where prxmatch('/MR./i', name) ;
run ;
RESULT
obs name
1 MR_Bigglesworth
2 Mini-mr_biggggleswerth
3 Mr. Austin D. Powers
6 mr_bIgglesWorTH
The unescaped '.' matches any character, so it also matches the blank after ‘MR’ (marked here with '_')
proc print data=characters label ;
  where prxmatch('/MR\./i', name) ;
run ;
The '.' is ‘backwhacked’ (preceded by a backslash), or masked, so it matches only a literal period
RESULT
obs name
3 Mr. Austin D. Powers

Quantifiers
Find misspellings of ‘bigglesworth’
obs name
1 MR Bigglesworth
2 Mini-mr biggggleswerth
6 mr bIgglesWorTH
'/bigg+lesw(o|e)rth/i'
The + quantifier applies only to the second ‘g’
'/big{2,}lesw(o|e)rth/i'
{2,} matches at least 2 ‘g’s
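The two patterns above can be checked side by side. This is a minimal sketch: the first three names come from the sample data, while 'mr biglesworth' (a single 'g') is a made-up value added to show a non-match. The variable names `plus_form` and `range_form` are hypothetical.

```sas
data _null_;
  length name $40;
  do name = 'MR Bigglesworth', 'Mini-mr biggggleswerth',
            'mr bIgglesWorTH', 'mr biglesworth';
    /* 'gg+' and 'g{2,}' both require two or more consecutive g characters. */
    plus_form  = prxmatch('/bigg+lesw(o|e)rth/i', name) > 0;
    range_form = prxmatch('/big{2,}lesw(o|e)rth/i', name) > 0;
    put name= plus_form= range_form=;
  end;
run;
```

Both flags should be 1 for the first three names and 0 for the last one, since the two patterns are equivalent.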

Predefined Character Classes
• \d Any digit [0-9]
• \D Any non-digit [^0-9]
• [[:digit:]] POSIX bracketed expression
• \w Any word character [A-Za-z0-9_]

Search for a Digit
prxmatch('/\d/', name);
RESULT
obs name
5 MINI-ME(1/8th size of dr evil)
8 Sc0tt Evil
prxmatch('/[[:digit:]]/', name);
RESULT
obs name
5 MINI-ME(1/8th size of dr evil)
8 Sc0tt Evil

Quiz
Rewrite the following with PRX
where substr( ATC, 1, 3 )
  in ( 'C01' 'C03' 'C07' 'C08' 'C09' ) ;
Solution
prxmatch( '/^C0[13789]/' , ATC ) ;
prxmatch( '/^C0[137-9]/' , ATC ) ;
prxmatch( '/^C0(1|3|7|8|9)/' , ATC ) ;
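One way to convince yourself that the three solutions agree with the original SUBSTR condition is to evaluate all four on the same values. This is a sketch under assumptions: the ATC codes are illustrative test values, not data from the presentation, and the flag variable names are hypothetical.

```sas
data _null_;
  length ATC $7;
  do ATC = 'C01AA05', 'C03CA01', 'C08CA01',
           'C02AC01', 'A10BA02', 'C10AA01';
    /* Original condition and the three regex rewrites, as 0/1 flags. */
    by_substr = substr(ATC, 1, 3) in ('C01' 'C03' 'C07' 'C08' 'C09');
    by_class  = prxmatch('/^C0[13789]/', ATC) > 0;
    by_range  = prxmatch('/^C0[137-9]/', ATC) > 0;
    by_alt    = prxmatch('/^C0(1|3|7|8|9)/', ATC) > 0;
    put ATC= by_substr= by_class= by_range= by_alt=;
  end;
run;
```

The four flags should agree on every row: 1 for the first three codes and 0 for the rest.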

SUMMARY
• PRX* functions are powerful
• Learning curve can be steep
– Start with an easy task
• They shine in the face of difficult tasks