perl6 grammars

Post on 18-Nov-2014

Regexes and Grammars in Perl 6

Preface

Synopsis 5: Regexes and Rules

S05

Damian Conway, Allison Randal, Patrick Michaud, Larry Wall, Moritz Lenz

Created: 24 Jun 2002
Last Modified: 30 Aug 2010
Version: 132

54 pages

Part I

Regexes

Random facts and terminology

Regular expressions in Perl 5 were not regular

Regular expressions in Perl 6 are called regexes

Which means “kinda like a regular expression”

Match object contains the result of matching

$/

Capture variable indexes start with 0

$0

$0, $1, etc. are part of $/

my $q = "Hotels in Berlin";
$q ~~ /in\s(.*)/;

say $0;    # Berlin
say $/[0]; # Berlin
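The relation between $0 and $/ can be checked with a minimal sketch like the one below. It reuses the query string from the slide; the explicit ~ stringification and the .from call are additions for illustration, since recent Rakudo versions print a Match with extra brackets around it.

```raku
my $q = "Hotels in Berlin";

if $q ~~ /in \s (.*)/ {
    # $0 is shorthand for $/[0]: both name the first positional capture.
    say ~$0;        # Berlin
    say ~$/[0];     # Berlin
    # The Match object also remembers where the match started.
    say $/.from;    # 7
}
```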

Metacharacters

are everything except Unicode letters or numbers or underscore

Quotes

may be used for creating atoms

'I will never use PHP again. '*

Repetition

(\d+ \s?) ** 3

(\d+ \s?) ** 5..10

\d+ % ','
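A short sketch of the repetition forms, written for a current Rakudo. Note the assumption about versions: the separator form is spelled with % today, while older S05 revisions used ** followed by the separator. The sample strings are made up for illustration.

```raku
# Exactly three repetitions of "digits plus optional space".
say so "10 20 30" ~~ / ^ [\d+ \s?] ** 3 $ /;    # True

# A range of repetitions: between 2 and 4 digits.
say so "2014"  ~~ / ^ \d ** 2..4 $ /;           # True
say so "20145" ~~ / ^ \d ** 2..4 $ /;           # False

# Digits separated by commas.
say ~("1,22,333" ~~ / \d+ % ',' /);             # 1,22,333
```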

/x modifier gone

"ab" ~~ / a b /;
say $/; # ab

/s, /m modifiers gone

"a1\nb2\nc3" ~~ /^^ .2 $$/;

"a1\nb2\nc3" ~~ /\N+/;

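A small sketch of what replaces /s and /m: ^^ and $$ anchor at line boundaries, and \N matches anything except a newline. The variable name $text and the .comb call are illustrative additions, not part of the talk.

```raku
my $text = "a1\nb2\nc3";

# ^^ and $$ match at the start and end of any line.
say ~($text ~~ / ^^ . 2 $$ /);     # b2

# \N+ stops at the first newline.
say ~($text ~~ / \N+ /);           # a1

# Collect every line-like chunk.
say $text.comb(/ \N+ /).elems;     # 3
```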
/e modifier gone

$str ~~ s/pattern/{action()}/;

Modifier syntax

@names = $str ~~ m:i/MiSteR \s (\w+)/;

Brackets

Capturing group

( . . . )

Non-capturing group

[ . . . ]

Character class

<[ . . . ]>

Embedded closure

{ . . . }

> "500" ~~ /(\d+) {$0 < 200 or fail}/
===SORRY!===

Named rule or token

<. . .>
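The difference between the bracket types can be seen in a sketch like this one. The sample string "room 42b" is invented for the example.

```raku
my $s = "room 42b";

# ( ) captures, [ ] only groups.
if $s ~~ / (\d+) [a|b] / {
    say ~$/;              # 42b
    say ~$0;              # 42
    say $/.list.elems;    # 1
}

# <[ ]> is a character class.
say ~($s ~~ / <[a..z]>+ /);    # room
```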

Part II

Grammars

Keywords

grammar, rule, token, proto, TOP

grammar Grammar {
    rule TOP {...}
    rule some_rule {...}
    token some_token {...}
}

Syntax is similar to class definition

Grammar.parse($string);

Example. Step by step

Executed by Rakudo

rakudo.org

Sometimes it fails

City

grammar SearchQuery {
}

grammar SearchQuery {
    rule TOP { }
}

grammar SearchQuery {
    rule TOP { ^ $ }
}

grammar SearchQuery {
    rule TOP { ^ <query> $ }
}

Easy, isn't it?

Grammars are part of the language

grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query { }
}

grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query { <city> }
}

grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query { <city> }
    token city { }
}

N. B.

rules

token

token is a "word"

rule is a "phrase"
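The word versus phrase distinction comes down to whitespace handling: in a rule, whitespace in the pattern matches whitespace in the input, while a token ignores it. A minimal sketch, where the grammar name Demo and the names tight and loose are hypothetical:

```raku
grammar Demo {
    # token: whitespace between atoms is not significant
    token tight { 'Hotels' 'in' }
    # rule: whitespace after an atom becomes an implicit <.ws>
    rule  loose { 'Hotels' 'in' }
}

say so Demo.parse("Hotelsin",  :rule<tight>);    # True
say so Demo.parse("Hotels in", :rule<tight>);    # False
say so Demo.parse("Hotels in", :rule<loose>);    # True
```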

grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query { <city> }
    token city { <capital> }
}

grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query { <city> }
    token city { <capital> }
    token capital { }
}

my $result = SearchQuery.parse("Amsterdam");
say $result.perl;

Match.new(
    from => 0,
    orig => "Amsterdam",
    to => 9,
    named => {
        query => Match.new(
            from => 0,
            orig => "Amsterdam",
            to => 9,
            named => {
                city => Match.new(
                    from => 0,
                    orig => "Amsterdam",
                    to => 9,
                    named => {
                        capital => Match.new(
                            from => 0,
                            orig => "Amsterdam",
                            to => 9,
                        ),
                    },
                ),
            },
        ),
    },
)

Matched text: each Match records orig together with the from and to offsets of what it matched.

The nested named entries mirror the grammar: query comes from rule query {}, city from token city {}, capital from token capital {}.
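A runnable sketch of the grammar at this stage. The slides leave the body of token capital empty, so the three city names below are an assumption made only to get something that parses.

```raku
grammar SearchQuery {
    rule  TOP     { ^ <query> $ }
    rule  query   { <city> }
    token city    { <capital> }
    # Assumption: a tiny hard-coded list, for illustration only.
    token capital { 'Amsterdam' | 'Berlin' | 'Tirana' }
}

my $result = SearchQuery.parse("Amsterdam");
say ~$result;                            # Amsterdam
say ~$result<query><city><capital>;      # Amsterdam
say $result.from, '..', $result.to;      # 0..9

# A string that is not in the list does not parse.
say so SearchQuery.parse("Gotham");      # False
```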

Country

rule query {
    <city> | <country>
}

rule country {
    'Afghanistan' | 'Akrotiri' | 'Albania' | 'Algeria' | 'American Samoa' | 'Andorra' . . .
}

my $result = SearchQuery.parse("Amsterdam");
say $result.perl;

$result = SearchQuery.parse("China");
say $result.perl;

rule query {
    <city> ',' <ws>? <country>
    | <city>
    | <country>
}

SearchQuery.parse("Tirana, Albania");

Capturing and accessing

Everything goes to the Match object

$/

SearchQuery.parse("Tirana, Albania");
say $<query><city>;
say $<query><country>;

Tirana

Albania

Shortcut:

say $<query><city>;
say $<query><country>;

Full syntax:

say $/<query><city>;
say $/<query><country>;
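Both spellings can be tried side by side with a sketch like the following. The city and country lists are shortened stand-ins (an assumption; the talk uses much longer lists), and the variable $m is added for illustration.

```raku
grammar SearchQuery {
    rule  TOP     { ^ <query> $ }
    rule  query   { <city> ',' <ws>? <country> | <city> | <country> }
    # Assumption: short illustrative lists.
    token city    { 'Tirana' | 'Amsterdam' }
    token country { 'Albania' | 'Netherlands' }
}

# .parse returns the Match and also stores it in $/.
my $m = SearchQuery.parse("Tirana, Albania");
say ~$m<query><city>;        # Tirana
say ~$m<query><country>;     # Albania

# Shortcut and full syntax refer to the same capture.
say ~$<query><city>;         # Tirana
say ~$/<query><city>;        # Tirana
```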

rule query {
    'Hotels in'?
    [
        <city> ',' <ws>? <country>
        | <city>
        | <country>
    ]
}

SearchQuery.parse("Tirana, Albania");
say $<query><city>;
say $<query><country>;

SearchQuery.parse("Hotels in Tirana, Albania");
say $<query><city>;
say $<query><country>;

rule date {
    <day> <month>
}
token day {
    \d+ ['st' | 'nd' | 'th']?
}
token month {
    'January' | 'February' | 'March' | 'April' . . .

SearchQuery.parse("Hotels in Tirana, Albania from 25th December");

SearchQuery.parse("Hotels in Tirana, Albania from 25 December");

What will $<query><date> print?

25th December or 25 December

token day { (\d+) {$0 <= 31 or fail} }

How to check days
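In current Rakudo, one way to express such a check is a code assertion, <?{ ... }>, which makes the token fail when the block returns a false value. This is a sketch of that alternative technique, not the form shown on the slide; the grammar name DayCheck is hypothetical.

```raku
grammar DayCheck {
    # The token matches digits, then fails if the number exceeds 31.
    token day { (\d+) <?{ $0 <= 31 }> }
}

say so DayCheck.parse("25", :rule<day>);    # True
say so DayCheck.parse("32", :rule<day>);    # False
```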

rule query {
    'Hotels in '?
    [
        <city> ',' <ws>? <country>
        | <city>
        | <country>
    ]
    [ 'from' <date> 'to' <date> ]?
    [ 'for' <guest_number> ]?
}

token guest_number {
    \d | 'one' | 'two' | 'three' | 'four' | 'five'
}

"Hotels in Tirana, Albania from 25 December to 7 January for two"

rule date {
    'today' | 'tomorrow' | [ <day> <month> ]
}

$ perl6 10-all.pl
Hotels in Amsterdam, Netherlands from 1 January to 5 February for three
City: Amsterdam
Country: Netherlands
From: 1 January
To: 5 February
Guests: three
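The pieces above can be put together in a sketch like the one below. It is not the 10-all.pl script from the talk: the city, country and month lists are shortened stand-ins (an assumption), and the .trim calls are added because a rule also consumes the whitespace after its last atom.

```raku
grammar SearchQuery {
    rule TOP { ^ <query> $ }
    rule query {
        'Hotels in'?
        [ <city> ',' <country> | <city> | <country> ]
        [ 'from' <date> 'to' <date> ]?
        [ 'for' <guest_number> ]?
    }
    rule  date         { 'today' | 'tomorrow' | <day> <month> }
    token day          { \d+ [ 'st' | 'nd' | 'th' ]? }
    token guest_number { \d | 'one' | 'two' | 'three' | 'four' | 'five' }
    # Assumption: shortened lists so that the sketch stays small.
    token month        { 'January' | 'February' | 'December' }
    token city         { 'Amsterdam' | 'Tirana' }
    token country      { 'Netherlands' | 'Albania' }
}

my $q = SearchQuery.parse(
    "Hotels in Amsterdam, Netherlands from 1 January to 5 February for three"
)<query>;

say "City: ",    ~$q<city>;               # City: Amsterdam
say "Country: ", ~$q<country>;            # Country: Netherlands
# <date> occurs twice in the rule, so its captures form a list.
say "From: ",    $q<date>[0].Str.trim;    # From: 1 January
say "To: ",      $q<date>[1].Str.trim;    # To: 5 February
say "Guests: ",  ~$q<guest_number>;       # Guests: three
```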

__END__

Andrew Shitov talks.shitov.ru | andy@shitov.ru
