PATR II Compiler
Advanced Prolog Course, Summer Semester 2000
Heinrich-Heine-Universität Düsseldorf
Christof Rumpf
22.05.2000 PATR II Compiler 2
Notational Conventions
• Instantiation mode of arguments
  – Blue: input arguments
  – Red: output arguments
• Cut
  – red cut !
  – green cut !
• Predicate definitions
  – complete
  – to be continued
Directives
% external resources
:- [tokenize].                 % load tokenizer
% operators
:- op(510, xfy, : ).           % attr:val
:- op(600, xfx, ===).          % path equation
:- op(1100,xfx,'--->').        % syntax rule, lexical entry
:- op(1200,xfx,'::').          % description annotation
3 Compiler Components
• Tokenizer
  – Input: PATR II grammar
  – Output: token lines
• Preprocessor
  – Input: token lines
  – Output: token sentences
• Syntax compiler
  – Input: token sentences
  – Output: Prolog clauses
compile_grammar(File):-
    clear_grammar,
    tokenize_file(File),
    read_sentences,
    compile_sentences.
Tokenizer Input
; Shieb1.ptr
; Sample grammar one from Shieber 1986

; Grammar Rules
; ------------------------------------------------------------

Rule {sentence formation}
  S --> NP VP:
 <S head> = <VP head>
 <VP head subject> = <NP head>.

Rule {trivial verb phrase}
  VP --> V:
 <VP head> = <V head>.

; Lexicon
; ----------------------------------------------------------------

Word uther:
 <cat> = NP
 <head agreement gender> = masculine
 <head agreement person> = third
 <head agreement number> = singular.
Tokenizer Output = Preprocessor Input
line(1,[o($;$),b(1),u($Shieb1$),o($.$),l($ptr$)]).
line(2,[o($;$),b(1),u($Sample$),b(1),l($grammar$),b(1),l($one$),b(1),l($from$),b(1), ...
line(3,[ ]).
line(4,[ ]).
line(5,[o($;$),b(1),u($Grammar$),b(1),u($Rules$)]).
line(6,[o($;$),b(1),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$),o($-$), ...
line(7,[ ]).
line(8,[u($Rule$),b(1),o(${$),l($sentence$),b(1),l($formation$),o($}$)]).
line(9,[b(2),u($S$),b(1),o($-$),o($-$),o($>$),b(1),u($NP$),b(1),u($VP$),o($:$)]).
line(10,[b(1),o($<$),u($S$),b(1),l($head$),o($>$),b(1),o($=$),b(1),o($<$),u($VP$),b(1), ...
line(11,[b(1),o($<$),u($VP$),b(1),l($head$),b(1),l($subject$),o($>$),b(1),o($=$),b(1), ...
line(12,[b(1)]).
line(13,[u($Rule$),b(1),o(${$),l($trivial$),b(1),l($verb$),b(1),l($phrase$),o($}$)]).
line(14,[b(2),u($VP$),b(1),o($-$),o($-$),o($>$),b(1),u($V$),o($:$)]).
line(15,[b(1),o($<$),u($VP$),b(1),l($head$),o($>$),b(1),o($=$),b(1),o($<$),u($V$),b(1),...
...
line(41,[b(1),o($<$),l($head$),b(1),l($subject$),b(1),l($agreement$),b(1),l($number$),...
line(42,[eof]).
Preprocessor Output = Compiler Input
sentence( 1,11,[u($Rule$),o(${$),l($sentence$),l($formation$),o($}$),...
sentence(12,15,[u($Rule$),o(${$),l($trivial$),l($verb$),l($phrase$),o($}$),...
sentence(16,24,[u($Word$),l($uther$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
sentence(25,30,[u($Word$),l($knights$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
sentence(31,36,[u($Word$),l($sleeps$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
sentence(37,41,[u($Word$),l($sleep$),o($:$),o($<$),l($cat$),o($>$),o($=$),...
sentence(42,42,[eof]).
The preprocessor removes comments and blanks, and joins each sentence, which is terminated by a full stop and may span several lines, into a single token list. The compiler proper can then concentrate on the essentials.
Preprocessor: Main Loop
read_sentences:-
    abolish(cnt/1),
    write('preprocessing...'), nl,
    repeat,
    count(I),
    read_sentence(N,M,S),
    assert(sentence(N,M,S)),
    put(13), tab(3), write(I), write(' sentences preprocessed'),
    S = [eof], !, nl.

read_sentence(N,M,S):-
    retract(line(N,L)),
    read_sentence(L,N,M,S), !.

Backtracking: as long as S = [eof] fails, execution backtracks to repeat and the next sentence is read.
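The repeat-driven loop can be tried in isolation. The following is a minimal sketch for SWI-Prolog, not the course code: the slides do not show count/1, so the definition here is a hypothetical reconstruction based on the cnt/1 fact that the main loop abolishes, and item/1 and process_items/1 are illustrative stand-ins for line/2 and read_sentences/0.

```prolog
:- dynamic cnt/1, item/1.

% count(-I): hypothetical reconstruction; yields 1, 2, 3, ... on successive calls.
count(I) :-
    ( retract(cnt(I0)) -> true ; I0 = 0 ),
    I is I0 + 1,
    assertz(cnt(I)).

% Stand-in for the line/2 facts: three items, the last one marks the end.
item(first).
item(second).
item(eof).

% process_items(-Count): repeat-driven loop in the style of read_sentences/0.
process_items(Count) :-
    retractall(cnt(_)),
    repeat,
        count(Count),
        once(retract(item(X))),   % deterministic, like read_sentence/3 with its cut
        X == eof,                 % fails until the end marker is reached
    !.
```

Calling process_items(C) consumes all three items and binds C to the number of passes through the loop. The once/1 wrapper matters: without it, backtracking would re-enter retract/1 directly and skip the counter.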
Preprocessor: Reading a Sentence
read_sentence([eof],N,N,[eof]):- !.           % end of file
read_sentence([o($.$)|_],N,N,[]):- !.         % end of sentence
read_sentence([o($;$)|_],N,M,S):- !,          % skip comment
    N1 is N+1,
    retract(line(N1,L)),                      % next line
    read_sentence(L,N1,M,S).
read_sentence([],N,M,S):- !,                  % end of line
    N1 is N+1,
    retract(line(N1,L)),                      % next line
    read_sentence(L,N1,M,S).
read_sentence([b(_)|T1],N,M,T2):- !,          % skip blanks
    read_sentence(T1,N,M,T2).
read_sentence([H|T1],N,M,[H|T2]):-            % collect tokens
    read_sentence(T1,N,M,T2).
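To see the preprocessor clauses at work outside Arity Prolog, here is a small sketch for SWI-Prolog. It assumes that tokens carry quoted atoms instead of `$...$` strings, uses four hand-written line/2 facts rather than real tokenizer output, and factors the repeated "next line" goals into a hypothetical helper next_line/3.

```prolog
:- dynamic line/2.

% Token lines for a comment followed by a two-line rule.
% Atoms stand in for Arity Prolog's $...$ strings (an assumption of this sketch).
line(1, [o(';'), b(1), u('Grammar'), b(1), u('Rules')]).
line(2, [u('Rule'), b(1), u('VP'), b(1), o('-'), o('-'), o('>'), b(1), u('V'), o(':')]).
line(3, [b(1), o('<'), u('VP'), b(1), l(head), o('>'), b(1), o('='), b(1),
         o('<'), u('V'), b(1), l(head), o('>'), o('.')]).
line(4, [eof]).

% read_sentence(-First, -Last, -Tokens): consume lines up to the next full stop.
read_sentence(N, M, S) :-
    retract(line(N, L)),
    read_sentence(L, N, M, S), !.

read_sentence([eof], N, N, [eof]) :- !.            % end of file
read_sentence([o('.')|_], N, N, []) :- !.          % end of sentence
read_sentence([o(';')|_], N, M, S) :- !,           % skip comment
    next_line(N, N1, L),
    read_sentence(L, N1, M, S).
read_sentence([], N, M, S) :- !,                   % end of line
    next_line(N, N1, L),
    read_sentence(L, N1, M, S).
read_sentence([b(_)|T1], N, M, T2) :- !,           % skip blanks
    read_sentence(T1, N, M, T2).
read_sentence([H|T1], N, M, [H|T2]) :-             % collect tokens
    read_sentence(T1, N, M, T2).

% next_line/3 is a hypothetical helper, not part of the original code.
next_line(N, N1, L) :-
    N1 is N + 1,
    retract(line(N1, L)).
```

The first call to read_sentence(N, M, S) spans lines 1 to 3 and returns the rule's tokens without blanks, the comment, or the final full stop; the second call returns the [eof] sentence for line 4.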
Compiler: Main Loop
compile_sentences:-
    abolish(cnt/1),
    write('compiling...'), nl,
    retract(sentence(N,M,S)),
    compile_sentence((N,M),C,S,[]),
    assert(C),
    count(I), put(13), tab(3), write(I), write(' sentences compiled'),
    S = [eof], !,
    nl.

Backtracking: as long as S = [eof] fails, execution backtracks into retract/1, which fetches the next sentence.
Compiler: Sentence Types
% compile_sentence(Position,Clause,Sentence,Rest)
compile_sentence(_,C) --> [eof], !, {C = finished}.
compile_sentence(_,C) --> syntax_rule(C), !.
compile_sentence(_,C) --> lex_entry(C), !.
compile_sentence(_,C) --> template(C), !.
compile_sentence(P,_,_,_):-
    P = (N,M), nl,
    write(' error in sentence between lines '),
    write(N), write(' and '), write(M), nl, fail.
Syntax Rules
syntax_rule(C) --> rs('Rule'), !, syntax_rule_cont(C).
syntax_rule_cont((Expansion :: Descr)) -->
    rule_name,
    sr_expansion(Expansion,Sugar),
    rs(:), !,
    sr_path_equations(Equations,Sugar),
    {sr_sugar_cats(Sugar,Equations,Descr)}.
Reserved Symbols
rs(=) --> [o($=$)], !.
rs(:) --> [o($:$)], !.
rs(<) --> [o($<$)], !.
rs(>) --> [o($>$)], !.
rs('{') --> [o(${$)], !.
rs('}') --> [o($}$)], !.
rs('Rule') --> [u($Rule$)], !.
rs('Word') --> [u($Word$)], !.
rs('Let') --> [u($Let$)], !.
rs('be') --> [l($be$)], !.
rs('-->') --> [o($-$),o($-$),o($>$)], !.
Alternative: define a separate predicate for each reserved symbol, e.g. colon instead of rs(:).
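The alternative can be sketched as follows for SWI-Prolog. Only the name colon comes from the slide; equals, langle, rangle and the toy nonterminal cat_equation//1 are illustrative names, and quoted atoms again stand in for `$...$` strings.

```prolog
% One nonterminal per reserved symbol instead of the generic rs//1.
colon  --> [o(':')].
equals --> [o('=')].
langle --> [o('<')].
rangle --> [o('>')].

% Toy nonterminal: recognises "<cat> = NP" and returns the category atom.
cat_equation(Cat) -->
    langle, [l(cat)], rangle,
    equals,
    [u(Cat)].
```

For example, phrase(cat_equation(C), [o('<'), l(cat), o('>'), o('='), u('NP')]) binds C to 'NP'. Dedicated nonterminals read more naturally in grammar rules, at the cost of one extra definition per symbol.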
Further Terminal Symbols
uatom(A) --> [u(S)], {atom_string(A,S)}.
latom(A) --> [l(S)], {atom_string(A,S)}.
satom(A) --> [s(S)], {atom_string(A,S)}.
int(I) --> [i(I)].
atom(A) --> uatom(A), !.
atom(A) --> latom(A), !.
atom(A) --> satom(A), !.
atomic(A) --> atom(A), !.
atomic(A) --> int(A), !.
Rule Names
rule_name --> rs('{'), !,                          % start of rule name
    curley_braces_terminated_string.
rule_name --> [].                                  % rule names are optional
curley_braces_terminated_string --> rs('}'), !.    % end of rule name
curley_braces_terminated_string --> [_],           % read any symbol
    curley_braces_terminated_string.
Rule names are skipped and are not carried over into the Prolog representation of the rules.
Rule Expansion
sr_expansion((LHS ---> RHS),[LSugar|RSugar]) -->
    sr_lhs(LHS,LSugar),
    rs('-->'),
    sr_rhs(RHS,RSugar).
sr_lhs(LHS,Sugar) --> fsd(LHS,Sugar).
sr_rhs(RHS,Sugar) --> ne_fsd_seq(RHS,Sugar).
ne_fsd_seq((FSD,FSDs),[Sugar|Sugars]) -->
    fsd(FSD,Sugar),
    ne_fsd_seq(FSDs,Sugars).
ne_fsd_seq(FSD,[Sugar]) --> fsd(FSD,Sugar).
fsd(Var,(FSD,Var)) --> uatom(FSD).
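The effect of the expansion rules is easiest to see on a concrete token list. This sketch for SWI-Prolog simplifies the original in two labelled ways: uatom//1 reads the atom straight from the token (no atom_string/2 conversion), and sr_lhs//2, sr_rhs//2 and rs('-->') are inlined.

```prolog
:- op(1100, xfx, '--->').      % syntax rule operator, as in the directives

% Simplified: tokens already carry atoms in this sketch.
uatom(A) --> [u(A)].

sr_expansion((LHS ---> RHS), [LSugar|RSugar]) -->
    fsd(LHS, LSugar),
    [o('-'), o('-'), o('>')],
    ne_fsd_seq(RHS, RSugar).

ne_fsd_seq((FSD, FSDs), [Sugar|Sugars]) -->
    fsd(FSD, Sugar),
    ne_fsd_seq(FSDs, Sugars).
ne_fsd_seq(FSD, [Sugar]) -->
    fsd(FSD, Sugar).

% Each category name is replaced by a fresh variable; the pair
% (Name, Var) is recorded so that paths can find the variable later.
fsd(Var, (FSD, Var)) --> uatom(FSD).
```

For the tokens of S --> NP VP, i.e. [u('S'), o('-'), o('-'), o('>'), u('NP'), u('VP')], phrase/2 yields an expansion of the shape A ---> (B, C) together with the sugar list [('S',A), ('NP',B), ('VP',C)]. The category names themselves no longer appear in the expansion; they survive only in the sugar list.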
Syntax Rules: Path Equations
sr_path_equations((E,Es),Sugar) -->
    sr_path_equation(E,Sugar),
    sr_path_equations(Es,Sugar).
sr_path_equations(E,Sugar) --> sr_path_equation(E,Sugar).
sr_path_equation((LHS === RHS),Sugar) -->
    sr_path(LHS,Sugar),
    rs(=),
    sr_val(RHS,Sugar).
sr_val(V,Sugar) --> sr_path(V,Sugar).
sr_val(V,_) --> atomic(V).
Syntax Rules: Paths
sr_path(Var,Sugar) -->
    rs(<), fsd(FSD), rs(>),
    {member((FSD,Var),Sugar)}, !.
sr_path(Var:P,Sugar) -->
    rs(<), fsd(FSD), ne_feature_seq(P), rs(>),
    {member((FSD,Var),Sugar)}, !.
ne_feature_seq(F) --> feature(F).
ne_feature_seq(F:P) -->
    feature(F), ne_feature_seq(P).
fsd(FSD) --> uatom(FSD).
feature(F) --> atomic(F).
Syntactic Sugar
sr_sugar_cats([(Cat,Var)|Sugar],Equations,((Var:cat === Cat),Descr)):-
    sr_sugar_cats(Sugar,Equations,Descr).
sr_sugar_cats([],Descr,Descr).

Rule {sentence formation}
  S --> NP VP:
 <S head> = <VP head>
 <VP head subject> = <NP head>.

Rule {sentence formation}
  X0 --> X1 X2:
 <X0 cat> = S
 <X1 cat> = NP
 <X2 cat> = VP
 <X0 head> = <X2 head>
 <X2 head subject> = <X1 head>.
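The desugaring step is plain Prolog and can be run on its own. In this SWI-Prolog sketch, === is declared at priority 700 rather than the 600 used in the directives, so that the block does not have to redefine ':' (which is already a built-in xfy operator in SWI-Prolog); desugar_demo/1 is a hypothetical wrapper around the S --> NP VP rule.

```prolog
:- op(700, xfx, ===).     % path equation; priority differs from the slides (see lead-in)

% Prefix one "Var:cat === Cat" equation per category name to the
% explicit path equations of the rule.
sr_sugar_cats([(Cat, Var)|Sugar], Equations, ((Var:cat === Cat), Descr)) :-
    sr_sugar_cats(Sugar, Equations, Descr).
sr_sugar_cats([], Descr, Descr).

% desugar_demo(-Descr): hypothetical wrapper for the sentence formation rule.
desugar_demo(Descr) :-
    Sugar = [('S', X0), ('NP', X1), ('VP', X2)],
    Equations = ((X0:head === X2:head), (X2:head:subject === X1:head)),
    sr_sugar_cats(Sugar, Equations, Descr).
```

The result of desugar_demo(D) is a conjunction of five equations: three cat equations generated from the sugar list, followed by the two path equations written in the rule.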
Lexical Entries
lex_entry(C) --> rs('Word'), !, lex_entry_cont(C).
lex_entry_cont((FS ---> L :: Descr)) -->
    lexeme(L),
    rs(:), !,
    lex_definition(FS, Descr).
lexeme(L) --> atom(L).
Lexicon: Feature Structures
lex_definition(FS,(LDef,LDefs)) -->
    lexdef(FS,LDef),
    lex_definition(FS,LDefs).
lex_definition(FS,LDef) --> lexdef(FS,LDef).
lexdef(FS,LDef) --> template_name(FS,LDef), !.
lexdef(FS,LDef) --> lex_path_equation(FS,LDef), !.
Lexicon: Path Equations
lex_path_equation(FS, (LHS === RHS)) -->
    lex_path(FS, LHS),
    rs(=), !,
    lex_val(FS, RHS).
lex_path(FS,FS:P) --> rs(<), ne_feature_seq(P), rs(>), !.
lex_val(FS,V) --> lex_path(FS,V).
lex_val(_,V) --> atomic(V).
Templates
template(C) --> rs('Let'), !, template_cont(C).
template_cont((N :- TDef)) -->
    template_name(FS,N),
    rs('be'),
    template_definition(FS,TDef),
    {assert(template(N))}.
Templates: Head & Body
template_name(FS,N) -->
    atom(A),
    {N =.. [A,FS]}.
template_definition(FS,TDef) -->
    lex_definition(FS,TDef).
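How template_name//2 builds a clause head with =.. can be checked with a short sketch for SWI-Prolog. The nonterminal atom_token//1 is a hypothetical, simplified stand-in for atom//1 (tokens already carry atoms here), and the template name third_sing is only an example.

```prolog
% Simplified stand-in for atom//1: accept a lower- or upper-case atom token.
atom_token(A) --> [l(A)].
atom_token(A) --> [u(A)].

% The template name becomes a unary term whose argument is the
% feature structure variable shared with the template body.
template_name(FS, N) -->
    atom_token(A),
    { N =.. [A, FS] }.
```

With the token list [l(third_sing)], phrase(template_name(FS, N), ...) binds N to third_sing(FS). Because the same FS is passed to the body, a compiled template behaves like an ordinary unary predicate over a feature structure.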
Clearing a Grammar
clear_templates:-
    template(T),
    T =.. [F,_],
    abolish(F/1),
    fail.
clear_templates:- abolish(template/1).
clear_grammar:-
    abolish('::'/2),
    abolish(line/2),
    abolish(sentence/3),
    clear_templates.
Compiler Output
A ---> B , C ::
    A : cat === 'S',
    B : cat === 'NP',
    C : cat === 'VP',
    A : head === C : head,
    C : head : subject === B : head.

A ---> uther ::
    A : cat === 'NP',
    A : head : agreement : gender === masculine,
    A : head : agreement : person === third,
    A : head : agreement : number === singular.
Resources
• Grammars PATR II / Prolog
  – shieb1.ptr / shieb1.ari
  – shieb2.ptr / shieb2.ari
  – shieb3.ptr / shieb3.ari
  – shieb4.ptr / shieb4.ari
• Tokens
  – shieb1.tok (tokenizer)
  – shieb1.snt (preprocessor)
• PATR II interpreter
  – patrlcl.ari: left-corner with linking
  – patrlclc.ari: left-corner with linking and syntax trees
  – patr-ii.ari: DCG
• PATR II compiler
  – patrcomp.ari
  – patr-ii.ari: DCG
Open Problems and Extensions
• Syntactic sugar of the form VP_1 VP_2 X
• Lexical rules
• Templates in syntax rules
• Negation and disjunction
• Default inheritance (priority union)
• ...
References
• Shieber, Stuart (1986): An Introduction to Unification-based Approaches to Grammar. CSLI Lecture Notes.
• Gazdar, Gerald & Chris Mellish (1989): Natural Language Processing in Prolog. Addison Wesley.
• Covington, Michael A. (1994): Natural Language Processing for Prolog Programmers. Chap. 6: Parsing Algorithms. Prentice-Hall.